I have a couple of problems with the recent commits to stat and util. First, the testAddElementRolling test case in FixedDoubleArrayTest will not compile, since it is trying to access what is now a private field in FixedDoubleArray (internalArray). The changes to FixedDoubleArray should be rolled back or the tests should be modified so that they compile and succeed.
Second, I do not see the value in all of the additional classes and overhead introduced into stat. The goal of Univariate was to provide basic univariate statistics via a simple interface and a lightweight, numerically sound implementation, consistent with the vision of commons-math and Jakarta Commons in general. I fear that we may be straying into building a statistical computation framework, which I don't think belongs in commons-math (or really in Jakarta Commons). More importantly, I don't think we need to add this complexity to deliver the functionality that we are providing.

The only problem that I see with the structure prior to the recent commits is the confusion between the collections' and the univariates' addValue methods. I would favor eliminating the List and BeanList univariates altogether and replacing their functionality with methods added to StatUtils that take Lists or Collections and property names as input and compute statistics from them. Similarly, the Univariate interface could be modified to include addValues(double[]), addValues(List) (assuming the contents are Numbers), and addValues(Collection, propertyName).

The checkin comment says that the new univariate framework is independent of the existing implementations, but StatUtils has been modified to include numerous static data members and to delegate computation to these. The cost of the additional stack operations and object creations is significant. I ran tests comparing the previous version, which computes directly on the double[] arrays, to the modified version and found an average slowdown of more than 6x with the new implementation. Repeated tests computing the mean of 1000 doubles 100000 times averaged 1.5 seconds with the old implementation and 10.2 seconds with the new one. I did not profile memory utilization, but that is also a concern. I do not see the need for all of this additional overhead.
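To make the suggestion concrete, here is a rough sketch of the kind of direct, static computation I have in mind, together with a crude timing loop of the sort described above. The class name StatSketch and its method names are hypothetical, chosen only for illustration; this is not the actual StatUtils code, and the timing loop is not the exact test I ran.

```java
import java.util.Collection;

// Hypothetical sketch: StatSketch and its methods are illustrative names,
// not the real StatUtils API.
final class StatSketch {
    private StatSketch() {}

    /** Mean computed directly over the array: no delegation, no object creation. */
    static double mean(double[] values) {
        if (values == null || values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Same statistic over a Collection whose contents are assumed to be Numbers. */
    static double mean(Collection<? extends Number> values) {
        if (values == null || values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (Number n : values) {
            sum += n.doubleValue();
        }
        return sum / values.size();
    }

    /**
     * Crude timing loop: computes the mean of an array of the given size,
     * repeated reps times, and returns the elapsed time in nanoseconds.
     */
    static long timeMean(int size, int reps) {
        double[] data = new double[size];
        for (int i = 0; i < size; i++) {
            data[i] = i;
        }
        double sink = 0.0; // accumulate results so the loop is not optimized away
        long start = System.nanoTime();
        for (int r = 0; r < reps; r++) {
            sink += mean(data);
        }
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            throw new IllegalStateException("unexpected NaN");
        }
        return elapsed;
    }
}
```

Calling StatSketch.timeMean(1000, 100000) mirrors the shape of the test described above (1000 doubles, 100000 repetitions). The absolute numbers will depend on the JVM and hardware, so only a side-by-side comparison against the delegating version is meaningful.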
I suggest that we postpone introduction of a statistical computation framework until after the initial release, if it is needed at all. In any case, I would like to keep StatUtils and the core UnivariateImpl small, fast and lightweight, so I would like to request that the changes to these classes be rolled back. It is quite possible that I am thinking too narrowly in terms of current scope and missing some looming structural problems. If others feel that this additional infrastructure is essential, I am open to being educated. I just need to see a) exactly why we need to add more complexity at this time and b) why it is necessary to break univariate statistics into four packages and 17 classes when all we are computing is basic statistics.

Phil