Re: [math] Recent commits to stat, util packages

Phil Steitz Sun, 06 Jul 2003 20:32:10 -0700

--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
 
> > 
> > Well, I for one would prefer to have the simple computational methods in one
> > place.  I would support making the class require instantiation, however, i.e.
> > making the methods non-static.
> > 
> 
> Yes, but again it is a question of having big flat monolithic classes vs 
> having extensible implementations that can easily be expanded on. I'm 
> not particularly thrilled at the idea of being totally locked into such 
> an interface like Univariate or StatUtils. It is just totally inflexible 
> and there is always too much restriction and argument about what we want 
> to put in it vs. not put in it.


I think that it is a good idea to have these discussions and I don't understand
what you mean by "inflexible".  In retrospect, we probably should have named
StoreUnivariate "ExtendedUnivariate", since it really represents a statistical
object supporting more statistics.  Univariate can always be extended --
statistics can be added to the base interface as well as to the abstract and
concrete classes that implement the base interface.  Some of these statistics
can be based on computational methods in StatUtils.  If we eliminate the static
methods in StatUtils, then we can make the computational strategies pluggable.
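
To make "pluggable" concrete, here is a minimal sketch of what I have in mind.
All of the names below (VarianceStrategy, TwoPassVariance, OnePassVariance,
InstanceStatUtils) are made up for illustration and are not existing
commons-math classes; the point is only that an instance-based utility can be
handed its computational strategy at construction time.

```java
// Hypothetical sketch: none of these names are the actual commons-math API.

/** A pluggable strategy for computing the sample variance. */
interface VarianceStrategy {
    double variance(double[] values);
}

/** Two-pass strategy: compute the mean first, then sum squared deviations. */
final class TwoPassVariance implements VarianceStrategy {
    @Override
    public double variance(double[] values) {
        int n = values.length;
        if (n < 2) {
            return Double.NaN; // sample variance is undefined for fewer than 2 values
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double sumSq = 0.0;
        for (double v : values) {
            double d = v - mean;
            sumSq += d * d;
        }
        return sumSq / (n - 1);
    }
}

/** One-pass strategy using Welford's updating formulas. */
final class OnePassVariance implements VarianceStrategy {
    @Override
    public double variance(double[] values) {
        int n = 0;
        double mean = 0.0;
        double m2 = 0.0; // running sum of squared deviations from the mean
        for (double v : values) {
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
        }
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }
}

/** Instance-based stand-in for StatUtils: the strategy is chosen at construction. */
final class InstanceStatUtils {
    private final VarianceStrategy varianceStrategy;

    InstanceStatUtils(VarianceStrategy varianceStrategy) {
        this.varianceStrategy = varianceStrategy;
    }

    double variance(double[] values) {
        return varianceStrategy.variance(values);
    }
}
```

A caller would write something like new InstanceStatUtils(new
OnePassVariance()).variance(data); swapping the strategy changes the algorithm
without touching the caller.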

One more sort of philosophical point that makes me want to keep Univariates as
objects with statistics as properties:  to me a Univariate is in fact a Java
bean.  Its state is the data that it is characterizing and its properties are
the statistics describing these data.  Univariates that support only a limited
set of statistics don't have to hold all of the individual data values
comprising their state internally.  Extended Univariates require more overhead.
It is natural, therefore, to define the extended statistics in an extended
interface.
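
A rough sketch of the distinction, with hypothetical names (BasicUnivariate,
FullUnivariate, RunningUnivariate, StoredUnivariate) that are stand-ins rather
than the real interfaces: the basic implementation keeps only running sums,
while the extended one has to store the values in order to answer something
like the median.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the base interface: statistics exposed as bean properties.
interface BasicUnivariate {
    void addValue(double value);
    int getN();
    double getMean();
}

// Hypothetical stand-in for the extended interface: needs the stored data.
interface FullUnivariate extends BasicUnivariate {
    double getMedian();
}

/** Keeps only a count and a running sum; no individual values are retained. */
class RunningUnivariate implements BasicUnivariate {
    private int n;
    private double sum;

    @Override
    public void addValue(double value) {
        n++;
        sum += value;
    }

    @Override
    public int getN() {
        return n;
    }

    @Override
    public double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}

/** Stores every value, which is the extra overhead the extended statistics require. */
class StoredUnivariate extends RunningUnivariate implements FullUnivariate {
    private final List<Double> values = new ArrayList<>();

    @Override
    public void addValue(double value) {
        super.addValue(value);
        values.add(value);
    }

    @Override
    public double getMedian() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        // Odd count: middle element; even count: average of the two middle elements.
        return sorted.size() % 2 == 1
                ? sorted.get(mid)
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }
}
```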
> 
> > 
> Yes, simple, but not very organized, and not as extensible as a 
> framework like "solvers" is. You can implement any new "solver" we could 
> desire right now without much complaint, but try to implement a new 
> statistic and blam, all this argument starts up as to whether it's 
> appropriate or not in the Univariate interface.

You are confusing strategies with implementations. The rootfinding framework
exists to support multiple strategies to do rootfinding, not to support
arbitrary numerical methods. A better analogy would be to the distribution
framework which supports creation of different probability distributions.  You
could argue that a "statistic" is as natural an abstraction as a probability
distribution.  I disagree with that.  There is lots of structure in a
probability distribution, very little in a statistic from an abstract
standpoint.

> There's not room for 
> growth here! If I decide to go down the road and try to implement things 
> like auto-correlation coefficients (which would be a logical addition 
> someday) then I end up having to get permission just to "add" the 
> implementation, whereas if there's a logical framework, there's more room 
> for growth without stepping on each other's toes so much. This is very 
> logical to me.

I disagree. Extending a class or adding a method to an interface is no harder
than adding a new class (actually easier).  It seems ridiculous to me to add a
new class for each univariate statistic that we want to support. If the stats
are going to be meaningfully integrated, they will have to be used/defined by
the core univariate classes anyway, unless your idea is to eliminate these and
force users to think about statistics one at a time instead of as part of a
univariate statistical summary. This may be the crux of our disagreement.  I
see the statistics as natural properties of a set of data, not meaningful
objects in their own right.

We are always going to have to discuss what goes in to commons-math and what
does not go in, regardless of how packages are organized.  For example, I would
be opposed (as I suspect J, Al and Brent would be too) to adding a Newton's
method solver now, since it would provide no value beyond what we already have.
This has nothing to do with how the package is organized. 

I would like to propose the following compromise solution that allows the kind
of flexibility that you want without breaking things apart as much.

1. Rename StoreUnivariate to ExtendedUnivariate and change all other "Store"
names to "Extended".   

2. Make the methods in StatUtils non-static. Continue to use these for basic
computational methods shared by Univariate and ExtendedUnivariate
implementations and for direct use by applications and elsewhere in
commons-math. These methods do not have to be used by all Univariate
implementation strategies.  

3. Add addValues methods to Univariate that accept double[], List and
Collection with property name and eliminate ListUnivariate and
BeanListUnivariate.

4. Rename UnivariateImpl to SimpleUnivariate and add a UnivariateFactory with
factory methods to create Simple, Extended  and whatever other sorts of
Univariates we may define.
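
Items 3 and 4 could look roughly like the sketch below. Every name in it
(AddableUnivariate, AbstractAddableUnivariate, SimpleUnivariateSketch,
ExtendedUnivariateSketch, UnivariateFactorySketch) is hypothetical, and the
bean-property variant of addValues (Collection plus property name) is left out
to keep it short.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a Univariate interface with bulk-add methods.
interface AddableUnivariate {
    void addValue(double value);
    void addValues(double[] values);
    void addValues(Collection<? extends Number> values);
    int getN();
    double getMean();
}

/** Implements the bulk-add methods once, in terms of addValue. */
abstract class AbstractAddableUnivariate implements AddableUnivariate {
    @Override
    public void addValues(double[] values) {
        for (double v : values) {
            addValue(v);
        }
    }

    @Override
    public void addValues(Collection<? extends Number> values) {
        for (Number v : values) {
            addValue(v.doubleValue());
        }
    }
}

/** "Simple" flavor: running sums only, no stored data. */
final class SimpleUnivariateSketch extends AbstractAddableUnivariate {
    private int n;
    private double sum;

    @Override
    public void addValue(double value) {
        n++;
        sum += value;
    }

    @Override
    public int getN() {
        return n;
    }

    @Override
    public double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}

/** "Extended" flavor: stores the values so further statistics can be computed. */
final class ExtendedUnivariateSketch extends AbstractAddableUnivariate {
    private final List<Double> values = new ArrayList<>();

    @Override
    public void addValue(double value) {
        values.add(value);
    }

    @Override
    public int getN() {
        return values.size();
    }

    @Override
    public double getMean() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    /** Returns a copy of the stored values. */
    double[] getValues() {
        double[] copy = new double[values.size()];
        for (int i = 0; i < copy.length; i++) {
            copy[i] = values.get(i);
        }
        return copy;
    }
}

/** Factory methods hide the concrete classes from callers. */
final class UnivariateFactorySketch {
    private UnivariateFactorySketch() {
    }

    static AddableUnivariate newSimple() {
        return new SimpleUnivariateSketch();
    }

    static ExtendedUnivariateSketch newExtended() {
        return new ExtendedUnivariateSketch();
    }
}
```

With addValues on the interface, a caller holding a double[] or a List of
numbers no longer needs a separate List-backed Univariate class just to load
data.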

To add new statistics or computational strategies in this environment, we can
a) add to the Univariate interface if we think that they are really basic (I
think that t-based confidence interval half-width for the mean is a basic stat
that is now missing, for example), b) add to the ExtendedUnivariate interface,
c) extend an existing Univariate implementation to add the new statistic, or d)
create a new Univariate including the new statistic or computational strategy.
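
For reference, the t-based half-width mentioned under (a) is the standard one.
The symbols are my own notation: n observations, sample mean \bar{x}, sample
standard deviation s, and confidence level 1 - \alpha.

```latex
h = t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}},
\qquad
\text{confidence interval: } \bar{x} \pm h
```

Here t_{n-1, 1-\alpha/2} is the 1 - \alpha/2 quantile of Student's t
distribution with n - 1 degrees of freedom.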
 
Phil
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 




