
RE: [GT] Q re max-loaded-items



Unless you really know what you are doing, limiting the amount of data loaded will most likely lead to problems. If you load less data than your analysis needs, the only indication you will get is that either (i) there is no output, or (ii) the output is wrong.
 
There are no error checks whatsoever. For example, if you limit your analysis to a date range, max-loaded-items must be large enough to include that range. If you have an indicator with long dependencies, max-loaded-items must cover both the intended date range and the dependencies. None of this is checked in the code.
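
To illustrate the kind of check that is missing, here is a minimal sketch in Python. It is not GT code: the function names and the index-based model are hypothetical, and it assumes that items are numbered oldest first, that a limit of n keeps only the newest n items, and that the longest indicator dependency is a fixed number of bars known in advance.

```python
def min_items_to_load(total_items: int, start_index: int, lookback: int) -> int:
    """Smallest number of newest items that covers the analysis start plus lookback.

    Hypothetical model: items are indexed 0..total_items-1, oldest first, and
    `lookback` is the number of bars before `start_index` that the indicators need.
    """
    if not 0 <= start_index < total_items:
        raise ValueError("start_index is outside the available data")
    if lookback < 0:
        raise ValueError("lookback must not be negative")
    # Oldest bar that must be present; clamp at the beginning of the series.
    first_needed = max(0, start_index - lookback)
    return total_items - first_needed


def check_limit(max_loaded_items: int, total_items: int,
                start_index: int, lookback: int) -> int:
    """Raise instead of silently producing empty or wrong output."""
    needed = min_items_to_load(total_items, start_index, lookback)
    if max_loaded_items < needed:
        raise ValueError(
            f"limit of {max_loaded_items} items is too small; "
            f"at least {needed} are needed"
        )
    return needed
```

With 1000 items available, an analysis starting at index 900 and a 200-bar dependency, this model asks for the newest 300 items, so a limit of 250 would be rejected rather than quietly accepted.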
 
I never use max-loaded-items to restrict the loading of data.
 
I guess one could do the analysis you suggest if we have only static dependencies, but not if we have dynamic dependencies. Even in the static case, it would be a lot of work.
 
So I guess the best strategy is to keep supporting max-loaded-items, but with a "use at your own risk" warning attached to it.
 
The --nb-item option should be divorced from loading and restrict analysis together with --start and --end (and --latest-record where that makes sense).
 
Th.
 
P.S. The reason for this email is that a long time ago I made these options consistent across all scripts in my own installation, and I would like to do the same for GT in general. But I am hoping to understand any hidden requirements that might be reflected in the current inconsistency, so as not to break anything. Please see also my other email regarding an analysis of the current state.

________________________________

From: Robert A. Schmied [mailto:ras AT acm.org]
Sent: Tue 3/25/2008 1:40 AM
To: devel AT geniustrader.org
Subject: Re: [GT] Q re max-loaded-items





limiting the tuples retrieved from the db is a two-edged sword: if
your limit is smaller than the minimum needed for the worst-case
ta study, the results will be invalid or simply empty, with naturally
no explanation as to why. usually the limit is a large number, say
750-1000 (about 3-5 years of daily data).

rather than make this a user controlled option couldn't a method be
created that would determine the maximum data required to satisfy the
worst case technical analysis study in the analysis and set that limit?

based on the recent indicator dependency work i think determining that
number before you get the data is not possible.

another thing that you need to understand (or at least to take into
account) -- the current db interface methods do not have the means to
deal with dates -- meaning they do a fairly brute force query. the
only means to throttle the query size is via the limit variable.

so when a user tries to look at, say, the offspring of the 1984 att
breakup from 1984 to 1994, your limit needs to be large enough to
get all the data from now back to 1984, even though most of it will
not be used.
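
A rough sketch of that arithmetic in Python follows. It is an illustration only, not GT code: the function name is hypothetical, and it approximates daily records by counting Monday-to-Friday calendar days, ignoring market holidays, so the counts are upper bounds rather than real record counts.

```python
from datetime import date, timedelta


def weekdays_between(first: date, last: date) -> int:
    """Count Monday-Friday days from first to last, both inclusive."""
    if last < first:
        return 0
    days = (last - first).days + 1
    full_weeks, extra = divmod(days, 7)
    count = full_weeks * 5  # every complete week has five weekdays
    # The leftover days begin on the same weekday as `first`.
    for offset in range(extra):
        if (first + timedelta(days=offset)).weekday() < 5:
            count += 1
    return count


# Hypothetical dates for the 1984-1994 example, with "now" taken as
# the date of this message.
range_start = date(1984, 1, 1)
range_end = date(1994, 12, 31)
latest = date(2008, 3, 25)

# A limit-only query has to reach from the newest record back to range_start.
limit_needed = weekdays_between(range_start, latest)
# Only this many of the retrieved items fall inside the analysed range.
items_used = weekdays_between(range_start, range_end)

print(f"limit needed: {limit_needed}, actually used: {items_used}")
```

Under these assumptions fewer than half of the retrieved items fall inside the analysed range, which is the cost of throttling the query by count rather than by date.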

at least these are my understandings (general as they are) of how
the underlying code works.