Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Al Chou
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> > 
> > Would you move the existing ones into
> > org.apache.commons.math.distributions.statistical or something so that the
> > probability distributions could be organized together under *.probability? 
> > Also, I noticed that the current package uses the singular "distribution"
> > rather than "distributions".
> 
> I suspect it's unclear where this boundary would be drawn. I think all 
> the distributions would be beneficial for both random number 
> distributions and statistical usage. I guess if it became clear that 
> there was a strong separation between the two then separate packages 
> would be warranted, but I'm not convinced of a difference. You and 
> others may have more informed opinions.
> 
> -Mark

I don't have an informed opinion, so I'll fall back to the default opinion of
"lump everything together until/unless it's clear how to split it up".


Al

__
Do you Yahoo!?
Protect your identity with Yahoo! Mail AddressGuard
http://antispam.yahoo.com/whatsnewfree

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Al Chou
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> > 
> > OK, I see.  The one thing I notice is that the names are getting awfully long,
> > especially for the non-default case.  I guess that's a price we pay for having
> > descriptive (no play on words intended) names like DescriptiveStatistics
> 
> Maybe the Implementations could be abbreviated somewhat
> 
> o.a.c.math.stat.DescriptiveStatistics
> 
> o.a.c.math.stat.StorelessDscrStatsImpl
> o.a.c.math.stat.DscrStatsImpl
> 
> We could also consider pushing the actual implementation off into its 
> own packages
> 
> o.a.c.math.stat.impl.StorelessDscrStatsImpl
> o.a.c.math.stat.impl.DscrStatsImpl
> 
> This would even push all the univariate stat providers off into this 
> hierarchy as well
> 
> o.a.c.math.stat.impl.univar.StorelessUnivariateStatistic
> o.a.c.math.stat.impl.univar.UnivariateStatistic


Too much renaming and reorganization.  I didn't mean to complain too loudly,
and if the result is to use abbreviations, I retract my comments.  I probably
should have given more than half a second's thought to what alternative names
might be shorter, but in the absence of well-thought-out shorter names, I much
prefer the current proposal of DescriptiveStatistics.  Never use abbreviations
unless everyone already knows them (e.g., sin for sine), I say.


Al




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Mark R. Diggory
Al Chou wrote:
> OK, I see.  The one thing I notice is that the names are getting awfully long,
> especially for the non-default case.  I guess that's a price we pay for having
> descriptive (no play on words intended) names like DescriptiveStatistics

Maybe the Implementations could be abbreviated somewhat

o.a.c.math.stat.DescriptiveStatistics

o.a.c.math.stat.StorelessDscrStatsImpl
o.a.c.math.stat.DscrStatsImpl

We could also consider pushing the actual implementation off into its 
own packages

o.a.c.math.stat.impl.StorelessDscrStatsImpl
o.a.c.math.stat.impl.DscrStatsImpl

This would even push all the univariate stat providers off into this 
hierarchy as well

o.a.c.math.stat.impl.univar.StorelessUnivariateStatistic
o.a.c.math.stat.impl.univar.UnivariateStatistic

-M.
--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu


Re: [math] Proposal for Package restructuring and Class renaming

2003-11-09 Thread Mark R. Diggory
Al Chou wrote:
> Would you move the existing ones into
> org.apache.commons.math.distributions.statistical or something so that the
> probability distributions could be organized together under *.probability? 
> Also, I noticed that the current package uses the singular "distribution"
> rather than "distributions".

I suspect it's unclear where this boundary would be drawn. I think all 
the distributions would be beneficial for both random number 
distributions and statistical usage. I guess if it became clear that 
there was a strong separation between the two then separate packages 
would be warranted, but I'm not convinced of a difference. You and 
others may have more informed opinions.

-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu


Re: [math] Proposal for Package restructuring and Class renaming

2003-11-08 Thread Al Chou
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> > --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
...
> >>2.) Like in my last emails concerning "Univariate" I would like to, (and 
> >>have done so in my checkout successfully) Make the following Class changes:
> >>
> >>interface o.a.c.m.stat.StoreUnivariate -->
> >>abstract class o.a.c.m.stat.DescriptiveStatistics
> >>
> >>this actually becomes a factory class and uses Discovery to instantiate 
> >>new instances of the following implementations
> >>
> >>*default implementation*
> >>o.a.c.m.stat.StoreUnivariateImpl -->
> >>   o.a.c.m.stat.univariate.StatisticsImpl
> > 
> > 
> > Forgive me for not refamiliarizing myself with the code first, but should the
> > storeless version perhaps be the default implementation instead?  What do we
> > lose by going that way?  I'm thinking it would be nice to keep memory usage
> > lower if possible.
> 
> The storeless version (UnivariateImpl) doesn't support rank statistics 
> because of its storeless nature. The more fully featured implementation 
> is StoreUnivariateImpl: it does everything, but has the limitation of 
> requiring storage of the values. These are two different implementations 
> with different internal storage configurations. I chose 
> StoreUnivariateImpl because I think the default should have full 
> capabilities.
> 
> The storeless version is more of an optimized solution. It is probably wise 
> to suggest that one use it only if one needs that functionality (i.e., 
> computing moments across huge datasets or real-time value streams of 
> sorts).

That sounds reasonable.  Thanks for the refresher (I looked at the current code
based on your remarks, too).
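
To make the trade-off concrete, here is a minimal sketch (not the Commons Math
code) of why a storeless accumulator can report moments but not rank
statistics. The class names RunningMoments and StoredValues are hypothetical,
and the running update shown is Welford's method, used here only for
illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical storeless accumulator: constant memory, moments only. */
class RunningMoments {
    private long n;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    void addValue(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    /** Sample variance (divides by n - 1). */
    double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }
}

/** Hypothetical stored variant: keeps every value, so ranks are available. */
class StoredValues {
    private final List<Double> values = new ArrayList<>();

    void addValue(double x) {
        values.add(x);
    }

    /** A rank statistic: needs the full, sorted data set. */
    double getMedian() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        return sorted.size() % 2 == 1
                ? sorted.get(mid)
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }
}
```

RunningMoments never holds more than three numbers, which is why it suits huge
datasets or streams; a median cannot be computed exactly that way, so the
stored variant pays for the extra capability with memory.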


> > Before we go overboard, can you give a quick example of instantiating one of
> > the implementations?  Or perhaps, both the default and one alternative
...
> Yes, like that
> 
> For the default Discovery configured implementation:
> 
> DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
> 
> stats.addValue(5.0);
> ...
> 
> double mean = stats.getMean();
> 
> 
> For any alternate Implementations:
> 
> DescriptiveStatistics stats = 
> DescriptiveStatistics.newInstance(StorelessDescriptiveStatisticsImpl.class);
> 
> stats.addValue(5.0);
> ...
> 
> double mean = stats.getMean();
> 
> and/or
> 
> DescriptiveStatistics stats = 
> DescriptiveStatistics.newInstance("o.a.c.math.stat.impl.StorelessDescriptiveStatisticsImpl");
> 
> stats.addValue(5.0);
> ...
> 
> double mean = stats.getMean();
> 
> depending on which people like more

OK, I see.  The one thing I notice is that the names are getting awfully long,
especially for the non-default case.  I guess that's a price we pay for having
descriptive (no play on words intended) names like DescriptiveStatistics



Al




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-08 Thread Al Chou
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> > --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> > 
> >>I have several modifications I'm planning to make, but in the spirit of 
> >>consensus I want to propose them and attempt to get some agreement. So 
> >>math developer opinions on the subject would be good.
> >>
> >>1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
> >>
> >>Gives this package a more "generic" position to hold more than just 
> >>"stat" distributions.
> > 
> > 
> > What other kinds of distributions did you have in mind?  I'm asking out of
> > complete ignorance.
> > 
> 
> Probability distributions (Gamma, Beta, Poisson, Exponential, 
> Logarithmic, Hyperbolic ...). Great examples of these are in Colt's
> 
> cern.jet.stat and cern.jet.random packages.
> 
> ... but they are bound up as implementations of RandomNumberGeneration 
> classes... not that that's a bad thing.
> 
> Eventually ours could be used in random number generation. I think they 
> should be a more dominant package.
> -Mark

Would you move the existing ones into
org.apache.commons.math.distributions.statistical or something so that the
probability distributions could be organized together under *.probability? 
Also, I noticed that the current package uses the singular "distribution"
rather than "distributions".


Al

=
Albert Davidson Chou

Get answers to Mac questions at http://www.Mac-Mgrs.org/ .




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Mark R. Diggory


Al Chou wrote:
> --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> > I have several modifications I'm planning to make, but in the spirit of 
> > consensus I want to propose them and attempt to get some agreement. So 
> > math developer opinions on the subject would be good.
> > 
> > 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
> > 
> > Gives this package a more "generic" position to hold more than just 
> > "stat" distributions.
> 
> What other kinds of distributions did you have in mind?  I'm asking out of
> complete ignorance.

Probability distributions (Gamma, Beta, Poisson, Exponential, 
Logarithmic, Hyperbolic ...). Great examples of these are in Colt's

cern.jet.stat and cern.jet.random packages.

... but they are bound up as implementations of RandomNumberGeneration 
classes... not that that's a bad thing.

Eventually ours could be used in random number generation. I think they 
should be a more dominant package.
-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu


Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Mark R. Diggory


Al Chou wrote:
> --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> > I have several modifications I'm planning to make, but in the spirit of 
> > consensus I want to propose them and attempt to get some agreement. So 
> > math developer opinions on the subject would be good.
> > 
> > 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
> > 
> > Gives this package a more "generic" position to hold more than just 
> > "stat" distributions.
> 
> What other kinds of distributions did you have in mind?  I'm asking out of
> complete ignorance.
> 
> > 2.) Like in my last emails concerning "Univariate" I would like to, (and 
> > have done so in my checkout successfully) Make the following Class changes:
> > 
> > interface o.a.c.m.stat.StoreUnivariate -->
> >    abstract class o.a.c.m.stat.DescriptiveStatistics
> > 
> > this actually becomes a factory class and uses Discovery to instantiate 
> > new instances of the following implementations
> > 
> > *default implementation*
> > o.a.c.m.stat.StoreUnivariateImpl -->
> >   o.a.c.m.stat.univariate.StatisticsImpl
> 
> Forgive me for not refamiliarizing myself with the code first, but should the
> storeless version perhaps be the default implementation instead?  What do we
> lose by going that way?  I'm thinking it would be nice to keep memory usage
> lower if possible.

The storeless version (UnivariateImpl) doesn't support rank statistics 
because of its storeless nature. The more fully featured implementation 
is StoreUnivariateImpl: it does everything, but has the limitation of 
requiring storage of the values. These are two different implementations 
with different internal storage configurations. I chose 
StoreUnivariateImpl because I think the default should have full 
capabilities.

The storeless version is more of an optimized solution. It is probably wise 
to suggest that one use it only if one needs that functionality (i.e., 
computing moments across huge datasets or real-time value streams of 
sorts).

> > *alternate implementations*
> > o.a.c.m.stat.UnivariateImpl -->
> >   o.a.c.m.stat.univariate.StorelessStatisticsImpl
> > o.a.c.m.stat.ListUnivariateImpl -->
> >   o.a.c.m.stat.univariate.ListStatisticsImpl
> > o.a.c.m.stat.BeanListUnivariateImpl -->
> >   o.a.c.m.stat.univariate.BeanListStatisticsImpl
> > 
> > The benefit of this is that the Alternate Implementations can all be 
> > instantiated from the o.a.c.m.stat.DescriptiveStatistics factories 
> > newInstance(...) methods. Thus alternate implementations of 
> > DescriptiveStatistics can be written as Service Providers and set in the 
> > environment/JVM configuration. We can now write SP's for other tools 
> > like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list 
> > goes on and on...
> > 
> > Someday, I'd like to see this design extended for Bivariate Statistics 
> > and Regression Classes. Eventually for Random Number generation as well.
> 
> Before we go overboard, can you give a quick example of instantiating one of
> the implementations?  Or perhaps, both the default and one alternative
> implementation?  Is it:
> 
> import org.apache.commons.math.stat.*;
> 
> ...
> 
> StoreUnivariateImpl defaultImplementation =
>     DescriptiveStatistics.newInstance() ;
> StoreUnivariateImpl storagelessImplementation =
>     DescriptiveStatistics.newInstance( StorelessStatisticsImpl ) ;

Yes, like that

For the default Discovery configured implementation:

DescriptiveStatistics stats = DescriptiveStatistics.newInstance();

stats.addValue(5.0);
...
double mean = stats.getMean();

For any alternate Implementations:

DescriptiveStatistics stats = 
DescriptiveStatistics.newInstance(StorelessDescriptiveStatisticsImpl.class);

stats.addValue(5.0);
...
double mean = stats.getMean();

and/or

DescriptiveStatistics stats = 
DescriptiveStatistics.newInstance("o.a.c.math.stat.impl.StorelessDescriptiveStatisticsImpl");

stats.addValue(5.0);
...
double mean = stats.getMean();

depending on which people like more
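
The three newInstance(...) forms shown above could be backed by something like
the following sketch. It is an assumption, not the proposed implementation:
DescriptiveStats, StoredStats, and StorelessStats are hypothetical stand-ins
for the classes named in the thread, and plain reflection is used in place of
Commons Discovery.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical abstract factory class with three newInstance overloads. */
abstract class DescriptiveStats {
    abstract void addValue(double v);

    abstract double getMean();

    /** Default: the fully featured, value-storing implementation. */
    static DescriptiveStats newInstance() {
        return new StoredStats();
    }

    /** Alternate implementation selected by class. */
    static DescriptiveStats newInstance(Class<? extends DescriptiveStats> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + type.getName(), e);
        }
    }

    /** Alternate implementation selected by fully qualified class name. */
    static DescriptiveStats newInstance(String className) {
        try {
            return newInstance(Class.forName(className).asSubclass(DescriptiveStats.class));
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown implementation " + className, e);
        }
    }
}

/** Keeps every value (hypothetical default). */
class StoredStats extends DescriptiveStats {
    private final List<Double> values = new ArrayList<>();

    void addValue(double v) {
        values.add(v);
    }

    double getMean() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }
}

/** Keeps only a count and a sum (hypothetical alternate). */
class StorelessStats extends DescriptiveStats {
    private long n;
    private double sum;

    void addValue(double v) {
        n++;
        sum += v;
    }

    double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}
```

The Class form fails at compile time if the type is not a DescriptiveStats
subclass, while the String form defers that check to run time; that difference
may be worth weighing when choosing which overload to keep.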

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu


Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Al Chou
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> I have several modifications I'm planning to make, but in the spirit of 
> consensus I want to propose them and attempt to get some agreement. So 
> math developer opinions on the subject would be good.
> 
> 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
> 
> Gives this package a more "generic" position to hold more than just 
> "stat" distributions.

What other kinds of distributions did you have in mind?  I'm asking out of
complete ignorance.


> 2.) Like in my last emails concerning "Univariate" I would like to, (and 
> have done so in my checkout successfully) Make the following Class changes:
> 
> interface o.a.c.m.stat.StoreUnivariate -->
> abstract class o.a.c.m.stat.DescriptiveStatistics
> 
> this actually becomes a factory class and uses Discovery to instantiate 
> new instances of the following implementations
> 
> *default implementation*
> o.a.c.m.stat.StoreUnivariateImpl -->
>    o.a.c.m.stat.univariate.StatisticsImpl

Forgive me for not refamiliarizing myself with the code first, but should the
storeless version perhaps be the default implementation instead?  What do we
lose by going that way?  I'm thinking it would be nice to keep memory usage
lower if possible.


> *alternate implementations*
> o.a.c.m.stat.UnivariateImpl -->
>    o.a.c.m.stat.univariate.StorelessStatisticsImpl
> 
> o.a.c.m.stat.ListUnivariateImpl -->
>    o.a.c.m.stat.univariate.ListStatisticsImpl
> 
> o.a.c.m.stat.BeanListUnivariateImpl -->
>    o.a.c.m.stat.univariate.BeanListStatisticsImpl
> 
> The benefit of this is that the Alternate Implementations can all be 
> instantiated from the o.a.c.m.stat.DescriptiveStatistics factories 
> newInstance(...) methods. Thus alternate implementations of 
> DescriptiveStatistics can be written as Service Providers and set in the 
> environment/JVM configuration. We can now write SP's for other tools 
> like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list 
> goes on and on...
> 
> Someday, I'd like to see this design extended for Bivariate Statistics 
> and Regression Classes. Eventually for Random Number generation as well.

Before we go overboard, can you give a quick example of instantiating one of
the implementations?  Or perhaps, both the default and one alternative
implementation?  Is it:

import org.apache.commons.math.stat.*;

...

StoreUnivariateImpl defaultImplementation =
    DescriptiveStatistics.newInstance() ;
StoreUnivariateImpl storagelessImplementation =
    DescriptiveStatistics.newInstance( StorelessStatisticsImpl ) ;



Al

=
Albert Davidson Chou

Get answers to Mac questions at http://www.Mac-Mgrs.org/ .




Re: [math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Matt Cliff
I agree

On Fri, 7 Nov 2003, Mark R. Diggory wrote:

> I have several modifications I'm planning to make, but in the spirit of 
> consensus I want to propose them and attempt to get some agreement. So 
> math developer opinions on the subject would be good.
> 
> 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
> 
> Gives this package a more "generic" position to hold more than just 
> "stat" distributions.
> 
> 2.) Like in my last emails concerning "Univariate" I would like to, (and 
> have done so in my checkout successfully) Make the following Class changes:
> 
> interface o.a.c.m.stat.StoreUnivariate -->
> abstract class o.a.c.m.stat.DescriptiveStatistics
> 
> this actually becomes a factory class and uses Discovery to instantiate 
> new instances of the following implementations
> 
> *default implementation*
> o.a.c.m.stat.StoreUnivariateImpl -->
>    o.a.c.m.stat.univariate.StatisticsImpl
> 
> *alternate implementations*
> o.a.c.m.stat.UnivariateImpl -->
>    o.a.c.m.stat.univariate.StorelessStatisticsImpl
> 
> o.a.c.m.stat.ListUnivariateImpl -->
>    o.a.c.m.stat.univariate.ListStatisticsImpl
> 
> o.a.c.m.stat.BeanListUnivariateImpl -->
>    o.a.c.m.stat.univariate.BeanListStatisticsImpl
> 
> The benefit of this is that the Alternate Implementations can all be 
> instantiated from the o.a.c.m.stat.DescriptiveStatistics factories 
> newInstance(...) methods. Thus alternate implementations of 
> DescriptiveStatistics can be written as Service Providers and set in the 
> environment/JVM configuration. We can now write SP's for other tools 
> like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list 
> goes on and on...
> 
> Someday, I'd like to see this design extended for Bivariate Statistics 
> and Regression Classes. Eventually for Random Number generation as well.
> 
> -Mark
> 
> 

-- 
  Matt Cliff
  Cliff Consulting
  303.757.4912
  720.280.6324 (c)


  The label said install Windows 98 or better so I installed Linux.





[math] Proposal for Package restructuring and Class renaming

2003-11-07 Thread Mark R. Diggory
I have several modifications I'm planning to make, but in the spirit of 
consensus I want to propose them and attempt to get some agreement. So 
math developer opinions on the subject would be good.

1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions

Gives this package a more "generic" position to hold more than just 
"stat" distributions.

2.) Like in my last emails concerning "Univariate" I would like to, (and 
have done so in my checkout successfully) Make the following Class changes:

interface o.a.c.m.stat.StoreUnivariate -->
   abstract class o.a.c.m.stat.DescriptiveStatistics

this actually becomes a factory class and uses Discovery to instantiate 
new instances of the following implementations

*default implementation*
o.a.c.m.stat.StoreUnivariateImpl -->
   o.a.c.m.stat.univariate.StatisticsImpl

*alternate implementations*
o.a.c.m.stat.UnivariateImpl -->
   o.a.c.m.stat.univariate.StorelessStatisticsImpl

o.a.c.m.stat.ListUnivariateImpl -->
   o.a.c.m.stat.univariate.ListStatisticsImpl

o.a.c.m.stat.BeanListUnivariateImpl -->
   o.a.c.m.stat.univariate.BeanListStatisticsImpl

The benefit of this is that the Alternate Implementations can all be 
instantiated from the o.a.c.m.stat.DescriptiveStatistics factories 
newInstance(...) methods. Thus alternate implementations of 
DescriptiveStatistics can be written as Service Providers and set in the 
environment/JVM configuration. We can now write SP's for other tools 
like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list 
goes on and on...

Someday, I'd like to see this design extended for Bivariate Statistics 
and Regression Classes. Eventually for Random Number generation as well.
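
As a rough illustration of "set in the environment/JVM configuration", the
sketch below picks an implementation from a system property and falls back to
a default. Everything in it is an assumption made for illustration: the
property name example.summary.impl, the Summary interface, and both
implementation classes are hypothetical, and a plain System.getProperty lookup
stands in for whatever Commons Discovery would actually do.

```java
/** Hypothetical service interface. */
interface Summary {
    void addValue(double v);

    double getMean();
}

/** Hypothetical built-in default provider. */
class DefaultSummary implements Summary {
    private long n;
    private double sum;

    public void addValue(double v) {
        n++;
        sum += v;
    }

    public double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}

/** Hypothetical alternate provider, e.g. a bridge to an external tool. */
class AlternateSummary extends DefaultSummary {
}

/** Chooses a provider from JVM configuration, else the default. */
final class SummaryFactory {
    /** Hypothetical property name; not defined by any real library. */
    static final String PROPERTY = "example.summary.impl";

    private SummaryFactory() {
    }

    static Summary newInstance() {
        String name = System.getProperty(PROPERTY);
        if (name == null) {
            return new DefaultSummary();
        }
        try {
            return Class.forName(name)
                    .asSubclass(Summary.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load provider " + name, e);
        }
    }
}
```

With this shape, calling code only ever sees Summary, and a different provider
could be selected at launch with a -D option naming its class, without
recompiling the caller.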

-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu