Mark R. Diggory wrote:

I've got a design decision to make that I'd like to get others' opinions on. Currently, the strategy in UnivariateImpl is to calculate the rudimentary building blocks of the statistics and then calculate the statistics in the "getters" (getVariance, getSkewness, getKurtosis, etc.). In some cases the calculation is done in the getter; in others it's done in the addValue method itself. Often the placement is based on the implementor's opinion of where to put it, not on any hard logic.

This presents a debate with the following arguments:

(1) Bean etiquette suggests "getters" are for bean properties; it's usually recommended that they do nothing more than return the value of a property. This is beneficial in our Univariate case when calling a getter many times without adding a new value (let's say you use "getKurtosis" a lot in a calculation before adding another value): it's then more logical to have the kurtosis calculated only once, by putting the code for calculating it in the addValue method.

(2) However, if you are calling addValue many times (more likely the case) and are only interested in "getMean", it's wasted computational time to calculate all the other stats (like kurtosis) in addValue when you just want the result of "getMean" back after each "addValue".

I suspect this debate leads to a compromise similar to what I've done in skew and kurt where all the rudimentary building blocks for all the stats are built in addValue, and the detailed calculation specific to that stat is done in the getter.
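
To make the compromise concrete, here is a minimal sketch of the split: addValue only updates shared running sums, and each getter derives its statistic on demand. The class name RunningStats and its fields are hypothetical and are not taken from the UnivariateImpl source; the NaN return for fewer than two values is also just an assumption of this sketch.

```java
/** Hypothetical sketch; not the actual UnivariateImpl source. */
class RunningStats {
    private long n = 0;          // number of values added so far
    private double sum = 0.0;    // running sum of x
    private double sumSq = 0.0;  // running sum of x^2

    /** Cheap: only updates the shared building blocks. */
    public void addValue(double x) {
        n++;
        sum += x;
        sumSq += x * x;
    }

    /** Derived on demand from the building blocks. */
    public double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }

    /** Sample variance (n - 1 denominator), derived on demand. */
    public double getVariance() {
        if (n < 2) {
            return Double.NaN; // assumption: undefined for fewer than two values
        }
        double mean = sum / n;
        return (sumSq - n * mean * mean) / (n - 1);
    }
}
```

With this layout a caller who only wants getMean after each addValue never pays for the variance arithmetic, but a caller who calls getVariance repeatedly recomputes it every time. The sum-of-squares formula is used here only for brevity; it can lose precision when the mean is large relative to the spread.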

thoughts?
Mark

p.s. In a more complex approach the user might be able to tune the calculations given their specific need. But this would require the creation of a delegation framework and boolean switching to control the behavior of the implementation: a lot of added complexity that would need to be maintained. It could create more work than it's worth.

Mark, I would go for the latter approach (the one in the p.s.) because it doesn't seem that complex to me.
Why not add a CachableUnivariateImpl class that extends UnivariateImpl and also caches the results of the getters (getMean, getKurtosis, etc.)?
That way, whenever a new value is added, the cache is cleared, and on calling a getter the corresponding statistic is recalculated.
If no new values have been added, the subclass just returns the cached results.
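
A rough sketch of that caching subclass, for one statistic only. BaseStats is a hypothetical stand-in for UnivariateImpl so the example compiles on its own, CachedStats plays the role of the proposed CachableUnivariateImpl, and the varianceComputations counter exists purely to show when a recalculation happens.

```java
/** Hypothetical stand-in for UnivariateImpl: recomputes on every getter call. */
class BaseStats {
    protected long n = 0;
    protected double sum = 0.0;
    protected double sumSq = 0.0;
    int varianceComputations = 0; // instrumentation for this demo only

    public void addValue(double x) {
        n++;
        sum += x;
        sumSq += x * x;
    }

    public double getVariance() {
        varianceComputations++;
        if (n < 2) {
            return Double.NaN; // assumption: undefined for fewer than two values
        }
        double mean = sum / n;
        return (sumSq - n * mean * mean) / (n - 1);
    }
}

/** Sketch of the proposed caching subclass. */
class CachedStats extends BaseStats {
    private boolean varianceValid = false;
    private double cachedVariance;

    @Override
    public void addValue(double x) {
        super.addValue(x);
        varianceValid = false; // any new value invalidates the cache
    }

    @Override
    public double getVariance() {
        if (!varianceValid) {
            cachedVariance = super.getVariance(); // recalculate once
            varianceValid = true;
        }
        return cachedVariance; // repeated calls reuse the stored result
    }
}
```

Calling getVariance several times in a row triggers one calculation; the next addValue clears the flag, so the following getter call recalculates. A full version would need one flag (or one cache slot) per statistic.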




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




