Tim O'Brien wrote:

On Mon, 16 Jun 2003, Mark R. Diggory wrote:


(1) Bean etiquette suggests "getters" are for bean properties; it's usually recommended that they do nothing more than return the value of a property. This is beneficial in our Univariate case when calling a getter many times without adding a new value (let's say you use "getKurtosis" a lot in a calculation before adding another value): then it's more logical to have the kurtosis calculated only once, and to put the code for calculating it in the addValue method.



These objects are not JavaBeans, but using getXXX naming standards does
provide some benefits (say create a Univariate instance and reference it
from EL, Velocity, etc.). I don't see any problem with violating the standard for bean properties, as these are not really "properties".


Yes, just as long as we all agree that these are not really JavaBeans, then I'm OK with it too.

(2) However, if calling addValue many times (more likely the case) with only the interest of getting the "getMean" back, it's wasted computational time to calculate all the other stats (like kurtosis) in addValue when you just want the result of "getMean" back after each "addValue".



It is important to remember that in some of the stored univariate instances the storage medium is external to the Univariate instance. In those cases, I don't see us being able to consolidate any of our calculations in addValue(). In other words, ListUnivariateImpl is simply attached to an external List - a user can go ahead and add 100 values to that list without ListUnivariateImpl's involvement.


I'm talking strictly about UnivariateImpl at this time; I'm not quite ready to delve into the Storage Implementations. I understand and value the benefit of what you're pointing out. Storage-based Univariate Implementations have different requirements than "UnivariateImpl" from this standpoint. But I do think some aspects of what Andreou is pointing out could optimize those implementations in the future too. It could be possible to establish a sort of "concurrentModification"-style approach such that if the underlying List or Array was modified, it could be detected by the Univariate Implementation and such a "caching" mechanism could be updated (I'm not sure though, this may not be something to explore before reaching release).
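
A minimal sketch of what such a staleness check might look like for a List-backed implementation. This is an illustration only: SizeCheckedMean is a hypothetical class, not part of the library, and it compares the list size as a cheap stand-in for a real modification counter, since java.util.List does not expose one. It would therefore miss an in-place set() that leaves the size unchanged.

```java
import java.util.List;

// Hypothetical illustration: a mean that is recomputed only when the
// externally owned list appears to have changed (detected via its size).
class SizeCheckedMean {
    private final List<Double> values;   // owned and mutated by the caller
    private int cachedSize = -1;         // size seen at the last computation
    private double cachedMean = Double.NaN;
    private int recomputations = 0;      // exposed only to show the effect

    SizeCheckedMean(List<Double> values) {
        this.values = values;
    }

    double getMean() {
        if (values.size() != cachedSize) {
            // The list grew or shrank behind our back: refresh the cache.
            double sum = 0.0;
            for (double v : values) {
                sum += v;
            }
            cachedMean = values.isEmpty() ? Double.NaN : sum / values.size();
            cachedSize = values.size();
            recomputations++;
        }
        return cachedMean;
    }

    int getRecomputations() {
        return recomputations;
    }
}
```

A caller can keep adding to the list directly; the next getMean() call notices the size change and recomputes, while repeated calls in between reuse the cached value.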

Andreou Andreas wrote:

Mark, I would go for the latter approach (the one in the P.S.) because it doesn't seem that complex to me...
Why not add a CachableUnivariateImpl class
that extends UnivariateImpl
and also keeps the results of the getters (getMean, getKurtosis, etc.) in a cache?
In this way, whenever a new value is added, the cache will be cleared, and on calling the getters, each corresponding statistic will be
recalculated.
If no new values have been added, this new subclass will just return the cached results...
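
The proposal above could be sketched roughly as follows. To keep the example self-contained it does not extend the real UnivariateImpl; SimpleUnivariate and CachingUnivariate are hypothetical stand-ins, and only the mean and variance are cached (getKurtosis and the others would follow the same pattern).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: every getter recomputes from the stored values.
class SimpleUnivariate {
    protected final List<Double> values = new ArrayList<>();
    protected int meanComputations = 0;  // counts full passes, for illustration

    public void addValue(double v) {
        values.add(v);
    }

    public double getMean() {
        meanComputations++;
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Sample variance (divides by n - 1); NaN for fewer than two values.
    public double getVariance() {
        int n = values.size();
        if (n < 2) {
            return Double.NaN;
        }
        double mean = getMean();
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - mean) * (v - mean);
        }
        return sumSq / (n - 1);
    }
}

// Hypothetical subclass: results are cached until the next addValue call.
class CachingUnivariate extends SimpleUnivariate {
    private Double cachedMean;       // null means "not computed yet"
    private Double cachedVariance;

    @Override
    public void addValue(double v) {
        super.addValue(v);
        cachedMean = null;           // a new value invalidates every cached result
        cachedVariance = null;
    }

    @Override
    public double getMean() {
        if (cachedMean == null) {
            cachedMean = super.getMean();
        }
        return cachedMean;
    }

    @Override
    public double getVariance() {
        if (cachedVariance == null) {
            cachedVariance = super.getVariance();
        }
        return cachedVariance;
    }
}
```

With this shape, addValue stays cheap, and a statistic is only ever computed if someone actually asks for it after the data changed.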


Yes, I think this is a novel idea to explore in the future. It's difficult to draw the line on what to store in it because, at this time, we are calculating the mean/variance in addValue with Al's new 2-pass algorithm, while the more complex kurt and skew calculations are in the getter methods. But I like the idea of it. I'm working on 2-pass style algorithms for skew and kurt now, which may unfortunately require more calculation to occur in addValue than I want to see happening.
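
For illustration, here is one way mean and variance can be kept up to date inside addValue so that the getters only return stored state. This sketch uses Welford's online update, which is an assumption on my part and not necessarily the algorithm referred to above; RunningStats is a hypothetical class name.

```java
// Hypothetical illustration: mean and variance updated incrementally in
// addValue (Welford's online update), so no values need to be stored.
class RunningStats {
    private long n = 0;        // number of values seen so far
    private double mean = 0.0; // running mean
    private double m2 = 0.0;   // running sum of squared deviations from the mean

    public void addValue(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        // Uses the deviation from the old mean times the deviation from the new mean.
        m2 += delta * (x - mean);
    }

    public double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    // Sample variance (divides by n - 1); NaN for fewer than two values.
    public double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }
}
```

The trade-off discussed in the thread is visible here: each addValue does a constant amount of extra arithmetic, and extending the same idea to skew and kurtosis would add more work per call.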

-Mark

