Al Chou wrote:

--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:

I have several modifications I'm planning to make, but in the spirit of consensus I want to propose them and attempt to get some agreement. So math developer opinions on the subject would be good.

1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions

Gives this package a more "generic" position to hold more than just "stat" distributions.


What other kinds of distributions did you have in mind?  I'm asking out of
complete ignorance.



2.) As in my last emails concerning "Univariate", I would like to make the following class changes (I have already done so successfully in my checkout):

interface o.a.c.m.stat.StoreUnivariate -->
           abstract class o.a.c.m.stat.DescriptiveStatistics

This actually becomes a factory class and uses Discovery to instantiate the following implementations:

*default implementation*
o.a.c.m.stat.StoreUnivariateImpl -->
          o.a.c.m.stat.univariate.StatisticsImpl


Forgive me for not refamiliarizing myself with the code first, but should the
storeless version perhaps be the default implementation instead?  What do we
lose by going that way?  I'm thinking it would be nice to keep memory usage
lower if possible.


The storeless version (UnivariateImpl) doesn't support rank statistics because of its storeless nature. The more fully featured implementation is StoreUnivariateImpl: it does everything, but has the limitation of requiring storage of the values. These are two different implementations with different internal storage configurations. I chose StoreUnivariateImpl because I think the default should have full capabilities.


The storeless version is more of an optimized solution. It is probably wise to suggest using it only when one needs that functionality (i.e., computing moments across huge datasets or real-time value streams).
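
To make the trade-off concrete, here is a minimal sketch in Java. The class names (StorageTradeoffSketch, RunningMean, StoredValues) are hypothetical and are not the actual commons-math classes; they only illustrate why a storeless accumulator can report moments such as the mean in constant memory, while a rank statistic such as the median needs the stored values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class StorageTradeoffSketch {

    /** Storeless (hypothetical): keeps only a count and a sum, so memory use is constant. */
    public static class RunningMean {
        private long n = 0;
        private double sum = 0.0;

        public void addValue(double v) {
            n++;
            sum += v;
        }

        public double getMean() {
            return n == 0 ? Double.NaN : sum / n;
        }
        // No getMedian(): the individual values are gone once they have been added.
    }

    /** Stored (hypothetical): keeps every value, so rank statistics are available. */
    public static class StoredValues {
        private final List<Double> values = new ArrayList<>();

        public void addValue(double v) {
            values.add(v);
        }

        public double getMean() {
            if (values.isEmpty()) {
                return Double.NaN;
            }
            double sum = 0.0;
            for (double v : values) {
                sum += v;
            }
            return sum / values.size();
        }

        public double getMedian() {
            if (values.isEmpty()) {
                return Double.NaN;
            }
            // Sort a copy so the insertion order of the stored values is preserved.
            List<Double> sorted = new ArrayList<>(values);
            Collections.sort(sorted);
            int mid = sorted.size() / 2;
            return sorted.size() % 2 == 1
                    ? sorted.get(mid)
                    : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
        }
    }

    public static void main(String[] args) {
        RunningMean storeless = new RunningMean();
        StoredValues stored = new StoredValues();
        for (double v : new double[] {1.0, 2.0, 6.0}) {
            storeless.addValue(v);
            stored.addValue(v);
        }
        System.out.println("storeless mean = " + storeless.getMean());
        System.out.println("stored mean    = " + stored.getMean());
        System.out.println("stored median  = " + stored.getMedian());
    }
}
```

Both classes agree on the mean; only the stored variant can answer the median query, at the cost of memory that grows with the number of values.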



*alternate implementations*
o.a.c.m.stat.UnivariateImpl -->
          o.a.c.m.stat.univariate.StorelessStatisticsImpl

o.a.c.m.stat.ListUnivariateImpl -->
          o.a.c.m.stat.univariate.ListStatisticsImpl

o.a.c.m.stat.BeanListUnivariateImpl -->
          o.a.c.m.stat.univariate.BeanListStatisticsImpl

The benefit of this is that the alternate implementations can all be instantiated from the o.a.c.m.stat.DescriptiveStatistics factory's newInstance(...) methods. Thus alternate implementations of DescriptiveStatistics can be written as Service Providers and set in the environment/JVM configuration. We can now write SPs for other tools like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list goes on and on...
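
As a rough illustration of the factory idea, the sketch below shows an abstract class whose static newInstance(...) methods pick an implementation. It is not the proposed commons-math code: all names (FactorySketch, Stats, StoredStats, StorelessStats, the "sketch.stats.impl" property) are hypothetical, and a plain system-property lookup stands in for Discovery.

```java
import java.util.ArrayList;
import java.util.List;

public class FactorySketch {

    /** Hypothetical stand-in for the proposed abstract factory class. */
    public abstract static class Stats {
        /** Hypothetical property name; the real lookup would go through Discovery. */
        public static final String IMPL_PROPERTY = "sketch.stats.impl";

        public abstract void addValue(double v);

        public abstract double getMean();

        /** Returns the configured implementation, or the stored default if none is set. */
        public static Stats newInstance() {
            String name = System.getProperty(IMPL_PROPERTY);
            return name == null ? new StoredStats() : newInstance(name);
        }

        /** Instantiates a specific implementation from its Class object. */
        public static Stats newInstance(Class<? extends Stats> cls) {
            try {
                return cls.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + cls.getName(), e);
            }
        }

        /** Instantiates a specific implementation from its fully qualified class name. */
        public static Stats newInstance(String className) {
            try {
                return newInstance(Class.forName(className).asSubclass(Stats.class));
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("Unknown implementation " + className, e);
            }
        }
    }

    /** Default implementation in this sketch: stores every value. */
    public static class StoredStats extends Stats {
        private final List<Double> values = new ArrayList<>();

        @Override
        public void addValue(double v) {
            values.add(v);
        }

        @Override
        public double getMean() {
            if (values.isEmpty()) {
                return Double.NaN;
            }
            double sum = 0.0;
            for (double v : values) {
                sum += v;
            }
            return sum / values.size();
        }
    }

    /** Alternate implementation in this sketch: keeps only a count and a sum. */
    public static class StorelessStats extends Stats {
        private long n = 0;
        private double sum = 0.0;

        @Override
        public void addValue(double v) {
            n++;
            sum += v;
        }

        @Override
        public double getMean() {
            return n == 0 ? Double.NaN : sum / n;
        }
    }

    public static void main(String[] args) {
        Stats byDefault = Stats.newInstance();
        Stats byClass = Stats.newInstance(StorelessStats.class);
        Stats byName = Stats.newInstance(StorelessStats.class.getName());
        for (Stats s : new Stats[] {byDefault, byClass, byName}) {
            s.addValue(5.0);
            s.addValue(7.0);
            System.out.println(s.getClass().getSimpleName() + " mean = " + s.getMean());
        }
    }
}
```

Because callers only ever hold the abstract type, swapping the implementation is a configuration change (here, setting the hypothetical system property) and needs no change to calling code.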

Someday, I'd like to see this design extended to bivariate statistics and regression classes, and eventually to random number generation as well.


Before we go overboard, can you give a quick example of instantiating one of
the implementations?  Or perhaps, both the default and one alternative
implementation?  Is it:

import org.apache.commons.math.stat.*;


...

StoreUnivariateImpl defaultImplementation = DescriptiveStatistics.newInstance() ;
StoreUnivariateImpl storagelessImplementation =
    DescriptiveStatistics.newInstance( StorelessStatisticsImpl ) ;


Yes, like that.

For the default, Discovery-configured implementation:

DescriptiveStatistics stats = DescriptiveStatistics.newInstance();

stats.addValue(5.0);
...

double mean = stats.getMean();


For any alternate implementation:


DescriptiveStatistics stats = DescriptiveStatistics.newInstance(StorelessDescriptiveStatisticsImpl.class);

stats.addValue(5.0);
...

double mean = stats.getMean();

and/or

DescriptiveStatistics stats = DescriptiveStatistics.newInstance("o.a.c.math.stat.impl.StorelessDescriptiveStatisticsImpl");

stats.addValue(5.0);
...

double mean = stats.getMean();

depending on which people prefer.


--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu
