Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Mark R. Diggory
Al Chou wrote:
 Would you move the existing ones into
 org.apache.commons.math.distributions.statistical or something so that the
 probability distributions could be organized together under *.probability?
 Also, I noticed that the current package uses the singular distribution
 rather than distributions.

I suspect it's unclear where this boundary would be drawn. I think all
the distributions would be beneficial for both random number
generation and statistical usage. I guess if it became clear that
there was a strong separation between the two, then separate packages
would be warranted, but I'm not convinced of a difference. You and
others may have more informed opinions.
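
A minimal sketch of why one distribution class can serve both uses. This is not the Commons Math API: ContinuousDistributionSketch, ExponentialSketch and InversionSampler are names invented for this example, and the syntax is modern Java rather than what was available in 2003. The same object answers a statistical question through its CDF and yields random deviates by pushing uniform numbers through its inverse CDF (inverse transform sampling).

```java
import java.util.Random;

/** Hypothetical interface for this sketch; not the Commons Math API. */
interface ContinuousDistributionSketch {
    double cumulativeProbability(double x);          // P(X <= x)
    double inverseCumulativeProbability(double p);   // x such that P(X <= x) = p
}

/** Exponential distribution parameterised by its mean. */
class ExponentialSketch implements ContinuousDistributionSketch {
    private final double mean;

    ExponentialSketch(double mean) {
        if (!(mean > 0)) {
            throw new IllegalArgumentException("mean must be positive");
        }
        this.mean = mean;
    }

    @Override
    public double cumulativeProbability(double x) {
        return x <= 0 ? 0.0 : 1.0 - Math.exp(-x / mean);
    }

    @Override
    public double inverseCumulativeProbability(double p) {
        if (p < 0 || p > 1) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        // Inverse of 1 - exp(-x / mean); log1p keeps precision for small p.
        return -mean * Math.log1p(-p);
    }
}

/** Random number generation layered on top of any distribution. */
class InversionSampler {
    static double sample(ContinuousDistributionSketch d, Random rng) {
        // rng.nextDouble() is uniform on [0, 1), so the result is finite.
        return d.inverseCumulativeProbability(rng.nextDouble());
    }
}
```

Statistical code would call cumulativeProbability directly, while a random number generator would only need InversionSampler.sample. Under that assumption there is no obvious line along which to split the package.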

-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Mark R. Diggory
Al Chou wrote:
 OK, I see.  The one thing I notice is that the names are getting awfully long,
 especially for the non-default case.  I guess that's a price we pay for having
 descriptive (no play on words intended) names like DescriptiveStatistics

Maybe the Implementations could be abbreviated somewhat

o.a.c.math.stat.DescriptiveStatistics

o.a.c.math.stat.StorelessDscrStatsImpl
o.a.c.math.stat.DscrStatsImpl

We could also consider pushing the actual implementation off into its
own packages

o.a.c.math.stat.impl.StorelessDscrStatsImpl
o.a.c.math.stat.impl.DscrStatsImpl

This would even push all the univariate stat providers off into this
hierarchy as well

o.a.c.math.stat.impl.univar.StorelessUnivariateStatistic
o.a.c.math.stat.impl.univar.UnivariateStatistic

-M.
--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu


Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Al Chou
--- Mark R. Diggory [EMAIL PROTECTED] wrote:
 Al Chou wrote:
  
  OK, I see.  The one thing I notice is that the names are getting awfully
 long,
  especially for the non-default case.  I guess that's a price we pay for
 having
  descriptive (no play on words intended) names like
 DescriptiveStatistics
 
 Maybe the Implementations could be abbreviated somewhat
 
 o.a.c.math.stat.DescriptiveStatistics
 
 o.a.c.math.stat.StorelessDscrStatsImpl
 o.a.c.math.stat.DscrStatsImpl
 
 We could also consider pushing the actual implementation off into its 
 own packages
 
 o.a.c.math.stat.impl.StorelessDscrStatsImpl
 o.a.c.math.stat.impl.DscrStatsImpl
 
 This would even push all the univariate stat providers off into this 
 hierarchy as well
 
 o.a.c.math.stat.impl.univar.StorelessUnivariateStatistic
 o.a.c.math.stat.impl.univar.UnivariateStatistic


Too much renaming and reorganization.  I didn't mean to complain too loudly,
and if the result is to use abbreviations, I retract my comments.  I probably
should have given more than half a second's thought to what alternative names
might be shorter, but in the absence of well-thought-out shorter names, I much
prefer the current proposal of DescriptiveStatistics.  Never use abbreviations
unless everyone already knows them (e.g., sin for sine), I say.


Al

__
Do you Yahoo!?
Protect your identity with Yahoo! Mail AddressGuard
http://antispam.yahoo.com/whatsnewfree




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Al Chou
--- Mark R. Diggory [EMAIL PROTECTED] wrote:
 Al Chou wrote:
  
  Would you move the existing ones into
  org.apache.commons.math.distributions.statistical or something so that the
  probability distributions could be organized together under *.probability? 
  Also, I noticed that the current package uses the singular distribution
  rather than distributions.
 
 I suspect it's unclear where this boundary would be drawn. I think all
 the distributions would be beneficial for both random number
 generation and statistical usage. I guess if it became clear that
 there was a strong separation between the two, then separate packages
 would be warranted, but I'm not convinced of a difference. You and
 others may have more informed opinions.
 
 -Mark

I don't have an informed opinion, so I'll fall back to the default opinion of
lump everything together until/unless it's clear how to split it up.


Al




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-08 Thread Al Chou
--- Mark R. Diggory [EMAIL PROTECTED] wrote:
 Al Chou wrote:
  --- Mark R. Diggory [EMAIL PROTECTED] wrote:
  
 I have several modifications I'm planning to make, but in the spirit of 
 consensus I want to propose them and attempt to get some agreement. So 
 math developer opinions on the subject would be good.
 
 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
 
 Gives this package a more generic position to hold more than just 
 stat distributions.
  
  
  What other kinds of distributions did you have in mind?  I'm asking out of
  complete ignorance.
  
 
 Probability Distributions (Gamma, Beta, Poisson, Exponential,
 Logarithmic, Hyperbolic ...). Great examples of these are in Colt's
 cern.jet.stat and cern.jet.random packages.
 
 ... but are bound up as implementations of RandomNumberGeneration
 classes... not that that's a bad thing.
 
 Eventually ours could be used in random number generation; I think they
 should be a more dominant package.
 -Mark

Would you move the existing ones into
org.apache.commons.math.distributions.statistical or something so that the
probability distributions could be organized together under *.probability? 
Also, I noticed that the current package uses the singular distribution
rather than distributions.


Al

=
Albert Davidson Chou

Get answers to Mac questions at http://www.Mac-Mgrs.org/ .




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-08 Thread Al Chou
--- Mark R. Diggory [EMAIL PROTECTED] wrote:
 Al Chou wrote:
  --- Mark R. Diggory [EMAIL PROTECTED] wrote:
...
 2.) Like in my last emails concerning Univariate I would like to, (and 
 have done so in my checkout successfully) Make the following Class changes:
 
 interface o.a.c.m.stat.StoreUnivariate -->
 abstract class o.a.c.m.stat.DescriptiveStatistics
 
 this actually becomes a factory class and uses Discovery to instantiate
 new instances of the following implementations
 
 *default implementation*
 o.a.c.m.stat.StoreUnivariateImpl -->
 o.a.c.m.stat.univariate.StatisticsImpl
  
  
  Forgive me for not refamiliarizing myself with the code first, but should
 the
  storeless version perhaps be the default implementation instead?  What do
 we
  lose by going that way?  I'm thinking it would be nice to keep memory usage
  lower if possible.
 
 The Storeless version (UnivariateImpl) doesn't support rank statistics
 because of its storeless nature. The more fully featured implementation
 is StoreUnivariateImpl: it does everything, but has the limitation of
 requiring storage of the values. These are two different implementations
 with different internal storage configurations. I chose
 StoreUnivariateImpl because I think the default should have full
 capabilities.
 
 The storeless version is more of an optimized solution. It is probably wise
 to suggest that one use it only if one needs that functionality (i.e.
 trying to get moments across huge datasets or realtime value streams of
 sorts).

That sounds reasonable.  Thanks for the refresher (I looked at the current code
based on your remarks, too).


  Before we go overboard, can you give a quick example of instantiating one
 of
  the implementations?  Or perhaps, both the default and one alternative
...
 Yes, like that
 
 For the default Discovery configured implementation:
 
 DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
 
 stats.addValue(5.0);
 ...
 
 double mean = stats.getMean();
 
 
 For any alternate Implementations:
 
 DescriptiveStatistics stats = 
 DescriptiveStatistics.newInstance(StorelessDescriptiveStatisticsImpl.class);
 
 stats.addValue(5.0);
 ...
 
 double mean = stats.getMean();
 
 and/or
 
 DescriptiveStatistics stats = 
 DescriptiveStatistics.newInstance(o.a.c.math.stat.impl.StorelessDescriptiveStatisticsImpl);
 
 stats.addValue(5.0);
 ...
 
 double mean = stats.getMean();
 
 depending on which people like more

OK, I see.  The one thing I notice is that the names are getting awfully long,
especially for the non-default case.  I guess that's a price we pay for having
descriptive (no play on words intended) names like DescriptiveStatistics



Al




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Matt Cliff
I agree

On Fri, 7 Nov 2003, Mark R. Diggory wrote:

 I have several modifications I'm planning to make, but in the spirit of 
 consensus I want to propose them and attempt to get some agreement. So 
 math developer opinions on the subject would be good.
 
 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
 
 Gives this package a more generic position to hold more than just 
 stat distributions.
 
 2.) Like in my last emails concerning Univariate I would like to, (and 
 have done so in my checkout successfully) Make the following Class changes:
 
 interface o.a.c.m.stat.StoreUnivariate -->
 abstract class o.a.c.m.stat.DescriptiveStatistics
 
 this actually becomes a factory class and uses Discovery to instantiate
 new instances of the following implementations
 
 *default implementation*
 o.a.c.m.stat.StoreUnivariateImpl -->
 o.a.c.m.stat.univariate.StatisticsImpl
 
 *alternate implementations*
 o.a.c.m.stat.UnivariateImpl -->
 o.a.c.m.stat.univariate.StorelessStatisticsImpl
 
 o.a.c.m.stat.ListUnivariateImpl -->
 o.a.c.m.stat.univariate.ListStatisticsImpl
 
 o.a.c.m.stat.BeanListUnivariateImpl -->
 o.a.c.m.stat.univariate.BeanListStatisticsImpl
 
 The benefit of this is that the Alternate Implementations can all be 
 instantiated from the o.a.c.m.stat.DescriptiveStatistics factories 
 newInstance(...) methods. Thus alternate implementations of 
 DescriptiveStatistics can be written as Service Providers and set in the 
 environment/JVM configuration. We can now write SP's for other tools 
 like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list 
 goes on and on...
 
 Someday, I'd like to see this design extended for Bivariate Statistics 
 and Regression Classes. Eventually for Random Number generation as well.
 
 -Mark
 
 

-- 
  Matt Cliff
  Cliff Consulting
  303.757.4912
  720.280.6324 (c)


  The label said install Windows 98 or better so I installed Linux.





Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Al Chou
--- Mark R. Diggory [EMAIL PROTECTED] wrote:
 I have several modifications I'm planning to make, but in the spirit of 
 consensus I want to propose them and attempt to get some agreement. So 
 math developer opinions on the subject would be good.
 
 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
 
 Gives this package a more generic position to hold more than just 
 stat distributions.

What other kinds of distributions did you have in mind?  I'm asking out of
complete ignorance.


 2.) Like in my last emails concerning Univariate I would like to, (and 
 have done so in my checkout successfully) Make the following Class changes:
 
 interface o.a.c.m.stat.StoreUnivariate -->
 abstract class o.a.c.m.stat.DescriptiveStatistics
 
 this actually becomes a factory class and uses Discovery to instantiate
 new instances of the following implementations
 
 *default implementation*
 o.a.c.m.stat.StoreUnivariateImpl -->
 o.a.c.m.stat.univariate.StatisticsImpl

Forgive me for not refamiliarizing myself with the code first, but should the
storeless version perhaps be the default implementation instead?  What do we
lose by going that way?  I'm thinking it would be nice to keep memory usage
lower if possible.


 *alternate implementations*
 o.a.c.m.stat.UnivariateImpl -->
 o.a.c.m.stat.univariate.StorelessStatisticsImpl
 
 o.a.c.m.stat.ListUnivariateImpl -->
 o.a.c.m.stat.univariate.ListStatisticsImpl
 
 o.a.c.m.stat.BeanListUnivariateImpl -->
 o.a.c.m.stat.univariate.BeanListStatisticsImpl
 
 The benefit of this is that the Alternate Implementations can all be 
 instantiated from the o.a.c.m.stat.DescriptiveStatistics factories 
 newInstance(...) methods. Thus alternate implementations of 
 DescriptiveStatistics can be written as Service Providers and set in the 
 environment/JVM configuration. We can now write SP's for other tools 
 like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list 
 goes on and on...
 
 Someday, I'd like to see this design extended for Bivariate Statistics 
 and Regression Classes. Eventually for Random Number generation as well.

Before we go overboard, can you give a quick example of instantiating one of
the implementations?  Or perhaps, both the default and one alternative
implementation?  Is it:

import org.apache.commons.math.stat.*;

...

StoreUnivariateImpl defaultImplementation = DescriptiveStatistics.newInstance();
StoreUnivariateImpl storagelessImplementation =
DescriptiveStatistics.newInstance( StorelessStatisticsImpl ) ;



Al

=
Albert Davidson Chou

Get answers to Mac questions at http://www.Mac-Mgrs.org/ .




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Mark R. Diggory


Al Chou wrote:
 --- Mark R. Diggory [EMAIL PROTECTED] wrote:
  I have several modifications I'm planning to make, but in the spirit of
  consensus I want to propose them and attempt to get some agreement. So
  math developer opinions on the subject would be good.

  1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions

  Gives this package a more generic position to hold more than just
  stat distributions.

 What other kinds of distributions did you have in mind?  I'm asking out of
 complete ignorance.

  2.) Like in my last emails concerning Univariate I would like to, (and
  have done so in my checkout successfully) Make the following Class changes:

  interface o.a.c.m.stat.StoreUnivariate -->
  abstract class o.a.c.m.stat.DescriptiveStatistics

  this actually becomes a factory class and uses Discovery to instantiate
  new instances of the following implementations

  *default implementation*
  o.a.c.m.stat.StoreUnivariateImpl -->
  o.a.c.m.stat.univariate.StatisticsImpl

 Forgive me for not refamiliarizing myself with the code first, but should the
 storeless version perhaps be the default implementation instead?  What do we
 lose by going that way?  I'm thinking it would be nice to keep memory usage
 lower if possible.

The Storeless version (UnivariateImpl) doesn't support rank statistics
because of its storeless nature. The more fully featured implementation
is StoreUnivariateImpl: it does everything, but has the limitation of
requiring storage of the values. These are two different implementations
with different internal storage configurations. I chose
StoreUnivariateImpl because I think the default should have full
capabilities.

The storeless version is more of an optimized solution. It is probably wise
to suggest that one use it only if one needs that functionality (i.e.
trying to get moments across huge datasets or realtime value streams of
sorts).
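
To make the trade-off concrete, here is a rough sketch of the two storage strategies. It is not the UnivariateImpl or StoreUnivariateImpl source: StorelessSketch and StoredSketch are hypothetical names, the running update is Welford's method (an assumption about how a storeless class might be written), and the syntax is modern Java.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Storeless: keeps only running summaries, so memory use is constant. */
class StorelessSketch {
    private long n;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    void addValue(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1); // sample variance
    }
    // No getMedian(): the individual values are gone, so exact rank
    // statistics cannot be computed from this state.
}

/** Stored: keeps every value, so rank statistics are available. */
class StoredSketch {
    private final List<Double> values = new ArrayList<>();

    void addValue(double x) {
        values.add(x);
    }

    double getMean() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    double getMedian() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        return sorted.size() % 2 == 1
                ? sorted.get(mid)
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }
}
```

The moments come out the same either way. Only the stored variant can answer a median or percentile query exactly, at the cost of memory that grows with the number of values.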



  *alternate implementations*
  o.a.c.m.stat.UnivariateImpl -->
  o.a.c.m.stat.univariate.StorelessStatisticsImpl

  o.a.c.m.stat.ListUnivariateImpl -->
  o.a.c.m.stat.univariate.ListStatisticsImpl

  o.a.c.m.stat.BeanListUnivariateImpl -->
  o.a.c.m.stat.univariate.BeanListStatisticsImpl

  The benefit of this is that the Alternate Implementations can all be
  instantiated from the o.a.c.m.stat.DescriptiveStatistics factories
  newInstance(...) methods. Thus alternate implementations of
  DescriptiveStatistics can be written as Service Providers and set in the
  environment/JVM configuration. We can now write SP's for other tools
  like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list
  goes on and on...

  Someday, I'd like to see this design extended for Bivariate Statistics
  and Regression Classes. Eventually for Random Number generation as well.

 Before we go overboard, can you give a quick example of instantiating one of
 the implementations?  Or perhaps, both the default and one alternative
 implementation?  Is it:

 import org.apache.commons.math.stat.*;

 ...

 StoreUnivariateImpl defaultImplementation = DescriptiveStatistics.newInstance();
 StoreUnivariateImpl storagelessImplementation =
 DescriptiveStatistics.newInstance( StorelessStatisticsImpl ) ;


Yes, like that

For the default Discovery configured implementation:

DescriptiveStatistics stats = DescriptiveStatistics.newInstance();

stats.addValue(5.0);
...
double mean = stats.getMean();

For any alternate Implementations:

DescriptiveStatistics stats = 
DescriptiveStatistics.newInstance(StorelessDescriptiveStatisticsImpl.class);

stats.addValue(5.0);
...
double mean = stats.getMean();

and/or

DescriptiveStatistics stats = 
DescriptiveStatistics.newInstance(o.a.c.math.stat.impl.StorelessDescriptiveStatisticsImpl);

stats.addValue(5.0);
...
double mean = stats.getMean();

depending on which people like more
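
For readers who want to see how such a factory could hang together, here is a minimal sketch. It is an illustration only, not the code in the checkout: it swaps Commons Discovery for a plain system-property lookup, the property name "sketch.DescriptiveStatistics" is invented, StatisticsImpl and StorelessDescriptiveStatisticsImpl are toy stand-ins for the classes named in the thread, and it uses generics, which were not available to Java code in 2003.

```java
import java.util.ArrayList;
import java.util.List;

/** Abstract factory plus API; mirrors the names in the thread, not the real source. */
abstract class DescriptiveStatistics {

    // Assumed property name, made up for this sketch.
    static final String IMPL_PROPERTY = "sketch.DescriptiveStatistics";

    /** Default: use the configured class if the property is set, else StatisticsImpl. */
    static DescriptiveStatistics newInstance() {
        String name = System.getProperty(IMPL_PROPERTY);
        if (name == null) {
            return new StatisticsImpl();
        }
        try {
            return newInstance(
                    Class.forName(name).asSubclass(DescriptiveStatistics.class));
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("No such implementation: " + name, e);
        }
    }

    /** Alternate: instantiate the requested class through its no-arg constructor. */
    static DescriptiveStatistics newInstance(Class<? extends DescriptiveStatistics> cls) {
        try {
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + cls.getName(), e);
        }
    }

    abstract void addValue(double v);

    abstract double getMean();
}

/** Toy default implementation: stores every value. */
class StatisticsImpl extends DescriptiveStatistics {
    private final List<Double> values = new ArrayList<>();

    @Override
    void addValue(double v) {
        values.add(v);
    }

    @Override
    double getMean() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }
}

/** Toy alternate implementation: keeps only a count and a running sum. */
class StorelessDescriptiveStatisticsImpl extends DescriptiveStatistics {
    private long n;
    private double sum;

    @Override
    void addValue(double v) {
        n++;
        sum += v;
    }

    @Override
    double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}
```

Callers only ever hold a DescriptiveStatistics reference, so switching implementations means changing either the argument to newInstance(...) or the JVM property, never the calling code.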

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu


Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Mark R. Diggory


Al Chou wrote:
 --- Mark R. Diggory [EMAIL PROTECTED] wrote:
  I have several modifications I'm planning to make, but in the spirit of
  consensus I want to propose them and attempt to get some agreement. So
  math developer opinions on the subject would be good.

  1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions

  Gives this package a more generic position to hold more than just
  stat distributions.

 What other kinds of distributions did you have in mind?  I'm asking out of
 complete ignorance.

Probability Distributions (Gamma, Beta, Poisson, Exponential,
Logarithmic, Hyperbolic ...). Great examples of these are in Colt's
cern.jet.stat and cern.jet.random packages.

... but are bound up as implementations of RandomNumberGeneration
classes... not that that's a bad thing.

Eventually ours could be used in random number generation; I think they
should be a more dominant package.

-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu