Re: [math] Proposal for Package restructuring and Class renaming
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> >
> > Would you move the existing ones into
> > org.apache.commons.math.distributions.statistical or something so that
> > the probability distributions could be organized together under
> > *.probability? Also, I noticed that the current package uses the
> > singular "distribution" rather than "distributions".
>
> I suspect it's unclear where this boundary would be drawn. I think all
> the distributions would be beneficial for both random number
> distributions and statistical usage. I guess if it became clear that
> there was a strong separation between the two then separate packages
> would be warranted, but I'm not convinced of a difference. You and
> others may have more informed opinions.
>
> -Mark

I don't have an informed opinion, so I'll fall back to the default opinion
of "lump everything together until/unless it's clear how to split it up".

Al

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: [math] Proposal for Package restructuring and Class renaming
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> >
> > OK, I see. The one thing I notice is that the names are getting awfully
> > long, especially for the non-default case. I guess that's a price we
> > pay for having descriptive (no play on words intended) names like
> > DescriptiveStatistics
>
> Maybe the Implementations could be abbreviated somewhat
>
> o.a.c.math.stat.DescriptiveStatistics
>
> o.a.c.math.stat.StorelessDscrStatsImpl
> o.a.c.math.stat.DscrStatsImpl
>
> We could also consider pushing the actual implementation off into its
> own packages
>
> o.a.c.math.stat.impl.StorelessDscrStatsImpl
> o.a.c.math.stat.impl.DscrStatsImpl
>
> This would even push all the univariate stat providers off into this
> hierarchy as well
>
> o.a.c.math.stat.impl.univar.StorelessUnivariateStatistic
> o.a.c.math.stat.impl.univar.UnivariateStatistic

Too much renaming and reorganization. I didn't mean to complain too loudly,
and if the result is to use abbreviations, I retract my comments. I probably
should have given more than half a second's thought to what alternative
names might be shorter, but in the absence of well-thought-out shorter
names, I much prefer the current proposal of DescriptiveStatistics. Never
use abbreviations unless everyone already knows them (e.g., sin for sine),
I say.

Al
Re: [math] Proposal for Package restructuring and Class renaming
Al Chou wrote:
> OK, I see. The one thing I notice is that the names are getting awfully
> long, especially for the non-default case. I guess that's a price we pay
> for having descriptive (no play on words intended) names like
> DescriptiveStatistics

Maybe the Implementations could be abbreviated somewhat

o.a.c.math.stat.DescriptiveStatistics

o.a.c.math.stat.StorelessDscrStatsImpl
o.a.c.math.stat.DscrStatsImpl

We could also consider pushing the actual implementation off into its own
packages

o.a.c.math.stat.impl.StorelessDscrStatsImpl
o.a.c.math.stat.impl.DscrStatsImpl

This would even push all the univariate stat providers off into this
hierarchy as well

o.a.c.math.stat.impl.univar.StorelessUnivariateStatistic
o.a.c.math.stat.impl.univar.UnivariateStatistic

-M.

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu
Re: [math] Proposal for Package restructuring and Class renaming
Al Chou wrote:
> Would you move the existing ones into
> org.apache.commons.math.distributions.statistical or something so that
> the probability distributions could be organized together under
> *.probability? Also, I noticed that the current package uses the singular
> "distribution" rather than "distributions".

I suspect it's unclear where this boundary would be drawn. I think all the
distributions would be beneficial for both random number distributions and
statistical usage. I guess if it became clear that there was a strong
separation between the two then separate packages would be warranted, but
I'm not convinced of a difference. You and others may have more informed
opinions.

-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://www.hmdc.harvard.edu
Re: [math] Proposal for Package restructuring and Class renaming
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> > --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
...
> >>2.) Like in my last emails concerning "Univariate" I would like to, (and
> >>have done so in my checkout successfully) Make the following Class
> >>changes:
> >>
> >>interface o.a.c.m.stat.StoreUnivariate -->
> >>abstract class o.a.c.m.stat.DescriptiveStatistics
> >>
> >>this actually becomes a factory class and uses Discovery to instantiate
> >>new instances of the following implementations
> >>
> >>*default implementation*
> >>o.a.c.m.stat.StoreUnivariateImpl -->
> >>    o.a.c.m.stat.univariate.StatisticsImpl
> >
> > Forgive me for not refamiliarizing myself with the code first, but
> > should the storeless version perhaps be the default implementation
> > instead? What do we lose by going that way? I'm thinking it would be
> > nice to keep memory usage lower if possible.
>
> The Storeless version (UnivariateImpl) doesn't support rank statistics
> because of its storeless nature. The more fully featured implementation
> is StoreUnivariateImpl: it does everything, but has the limitation of
> requiring storage of the values. These are two different implementations
> with different internal storage configurations. I chose
> StoreUnivariateImpl because I think the default should have full
> capabilities.
>
> The storeless version is more of an optimized solution. It is probably
> wise to suggest that one use it only if one needs that functionality (i.e.
> trying to get moments across huge datasets or realtime value streams of
> sorts)

That sounds reasonable. Thanks for the refresher (I looked at the current
code based on your remarks, too).

> > Before we go overboard, can you give a quick example of instantiating
> > one of the implementations? Or perhaps, both the default and one
> > alternative
...
> Yes, like that
>
> For the default Discovery configured implementation:
>
> DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
>
> stats.addValue(5.0);
> ...
>
> double mean = stats.getMean();
>
>
> For any alternate Implementations:
>
> DescriptiveStatistics stats =
>     DescriptiveStatistics.newInstance(StorelessDescriptiveStatisticsImpl.class);
>
> stats.addValue(5.0);
> ...
>
> double mean = stats.getMean();
>
> and/or
>
> DescriptiveStatistics stats =
>     DescriptiveStatistics.newInstance("o.a.c.math.stat.impl.StorelessDescriptiveStatisticsImpl");
>
> stats.addValue(5.0);
> ...
>
> double mean = stats.getMean();
>
> depending on which people like more

OK, I see. The one thing I notice is that the names are getting awfully
long, especially for the non-default case. I guess that's a price we pay for
having descriptive (no play on words intended) names like
DescriptiveStatistics

Al
Re: [math] Proposal for Package restructuring and Class renaming
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> Al Chou wrote:
> > --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> >
> >>I have several modifications I'm planning to make, but in the spirit of
> >>consensus I want to propose them and attempt to get some agreement. So
> >>math developer opinions on the subject would be good.
> >>
> >>1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
> >>
> >>Gives this package a more "generic" position to hold more than just
> >>"stat" distributions.
> >
> > What other kinds of distributions did you have in mind? I'm asking out
> > of complete ignorance.
>
> Probability Distributions (Gamma, Beta, Poisson, Exponential,
> Logarithmic, Hyperbolic ...) great examples of these are in Colt's
>
> cern.jet.stat and cern.jet.random packages.
>
> ... but are bound up as implementations of RandomNumberGeneration
> classes...not that that's a bad thing.
>
> Eventually ours could be used in random number generation, I think they
> should be a more dominant package.
> -Mark

Would you move the existing ones into
org.apache.commons.math.distributions.statistical or something so that the
probability distributions could be organized together under *.probability?
Also, I noticed that the current package uses the singular "distribution"
rather than "distributions".

Al

=
Albert Davidson Chou
Get answers to Mac questions at http://www.Mac-Mgrs.org/ .
Re: [math] Proposal for Package restructuring and Class renaming
Al Chou wrote:
> --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
>
>>I have several modifications I'm planning to make, but in the spirit of
>>consensus I want to propose them and attempt to get some agreement. So
>>math developer opinions on the subject would be good.
>>
>>1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
>>
>>Gives this package a more "generic" position to hold more than just
>>"stat" distributions.
>
> What other kinds of distributions did you have in mind? I'm asking out of
> complete ignorance.

Probability Distributions (Gamma, Beta, Poisson, Exponential, Logarithmic,
Hyperbolic ...) great examples of these are in Colt's

cern.jet.stat and cern.jet.random packages.

... but are bound up as implementations of RandomNumberGeneration
classes...not that that's a bad thing.

Eventually ours could be used in random number generation, I think they
should be a more dominant package.

-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu
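Editorial sketch: the point that one distribution class can serve both statistical queries and random number generation could look roughly like the following. The class and method names (ExponentialDist, cumulativeProbability, inverseCumulativeProbability, sample) are hypothetical and are not taken from Commons Math or Colt. The sampling technique shown is plain inverse transform sampling, chosen only for illustration.

```java
import java.util.Random;

// Hypothetical illustration only; not the Commons Math or Colt API.
class ExponentialDist {
    private final double mean;

    ExponentialDist(double mean) {
        if (!(mean > 0.0)) {
            throw new IllegalArgumentException("mean must be positive");
        }
        this.mean = mean;
    }

    // Statistical use: P(X <= x).
    double cumulativeProbability(double x) {
        return x <= 0.0 ? 0.0 : 1.0 - Math.exp(-x / mean);
    }

    // Inverse of the CDF, defined here for p in [0, 1).
    double inverseCumulativeProbability(double p) {
        if (p < 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1)");
        }
        return -mean * Math.log(1.0 - p);
    }

    // Random-number use: inverse transform sampling from a uniform draw.
    double sample(Random rng) {
        return inverseCumulativeProbability(rng.nextDouble());
    }
}
```

Because the generator only needs the inverse CDF, the same class answers statistical questions and produces random deviates, which is one reason a single shared package may be enough.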
Re: [math] Proposal for Package restructuring and Class renaming
Al Chou wrote:
> --- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
>
>>I have several modifications I'm planning to make, but in the spirit of
>>consensus I want to propose them and attempt to get some agreement. So
>>math developer opinions on the subject would be good.
>>
>>1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
>>
>>Gives this package a more "generic" position to hold more than just
>>"stat" distributions.
>
> What other kinds of distributions did you have in mind? I'm asking out of
> complete ignorance.
>
>>2.) Like in my last emails concerning "Univariate" I would like to, (and
>>have done so in my checkout successfully) Make the following Class changes:
>>
>>interface o.a.c.m.stat.StoreUnivariate -->
>>abstract class o.a.c.m.stat.DescriptiveStatistics
>>
>>this actually becomes a factory class and uses Discovery to instantiate
>>new instances of the following implementations
>>
>>*default implementation*
>>o.a.c.m.stat.StoreUnivariateImpl -->
>>    o.a.c.m.stat.univariate.StatisticsImpl
>
> Forgive me for not refamiliarizing myself with the code first, but should
> the storeless version perhaps be the default implementation instead? What
> do we lose by going that way? I'm thinking it would be nice to keep memory
> usage lower if possible.

The Storeless version (UnivariateImpl) doesn't support rank statistics
because of its storeless nature. The more fully featured implementation is
StoreUnivariateImpl: it does everything, but has the limitation of requiring
storage of the values. These are two different implementations with
different internal storage configurations. I chose StoreUnivariateImpl
because I think the default should have full capabilities.

The storeless version is more of an optimized solution. It is probably wise
to suggest that one use it only if one needs that functionality (i.e. trying
to get moments across huge datasets or realtime value streams of sorts)

>>*alternate implementations*
>>o.a.c.m.stat.UnivariateImpl -->
>>    o.a.c.m.stat.univariate.StorelessStatisticsImpl
>>
>>o.a.c.m.stat.ListUnivariateImpl -->
>>    o.a.c.m.stat.univariate.ListStatisticsImpl
>>
>>o.a.c.m.stat.BeanListUnivariateImpl -->
>>    o.a.c.m.stat.univariate.BeanListStatisticsImpl
>>
>>The benefit of this is that the Alternate Implementations can all be
>>instantiated from the o.a.c.m.stat.DescriptiveStatistics factories
>>newInstance(...) methods. Thus alternate implementations of
>>DescriptiveStatistics can be written as Service Providers and set in the
>>environment/JVM configuration. We can now write SP's for other tools
>>like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list
>>goes on and on...
>>
>>Someday, I'd like to see this design extended for Bivariate Statistics
>>and Regression Classes. Eventually for Random Number generation as well.
>
> Before we go overboard, can you give a quick example of instantiating one
> of the implementations? Or perhaps, both the default and one alternative
> implementation? Is it:
>
> import org.apache.commons.math.stat.*;
> ...
>
> StoreUnivariateImpl defaultImplementation =
>     DescriptiveStatistics.newInstance() ;
> StoreUnivariateImpl storagelessImplementation =
>     DescriptiveStatistics.newInstance( StorelessStatisticsImpl ) ;

Yes, like that

For the default Discovery configured implementation:

DescriptiveStatistics stats = DescriptiveStatistics.newInstance();

stats.addValue(5.0);
...

double mean = stats.getMean();


For any alternate Implementations:

DescriptiveStatistics stats =
    DescriptiveStatistics.newInstance(StorelessDescriptiveStatisticsImpl.class);

stats.addValue(5.0);
...

double mean = stats.getMean();

and/or

DescriptiveStatistics stats =
    DescriptiveStatistics.newInstance("o.a.c.math.stat.impl.StorelessDescriptiveStatisticsImpl");

stats.addValue(5.0);
...

double mean = stats.getMean();

depending on which people like more

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu
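Editorial sketch: the trade-off described above (a storeless implementation can report moments but not rank statistics, while a storing implementation can do both at the cost of memory) can be illustrated as follows. RunningMoments and StoredValues are hypothetical names, not the UnivariateImpl or StoreUnivariateImpl classes from the thread, and the running update shown is Welford's method, which is an assumption about how such a class might be written.

```java
import java.util.Arrays;

// Storeless: keeps only running aggregates, so memory use is constant.
// It cannot answer rank questions (median, percentiles) because the
// individual values are discarded as they arrive.
class RunningMoments {
    private long n;
    private double mean;
    private double m2; // sum of squared deviations from the running mean

    void addValue(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }
}

// Stored: keeps every value, which is what makes rank statistics possible.
class StoredValues {
    private double[] values = new double[16];
    private int size;

    void addValue(double x) {
        if (size == values.length) {
            values = Arrays.copyOf(values, size * 2);
        }
        values[size++] = x;
    }

    double getMedian() {
        if (size == 0) {
            return Double.NaN;
        }
        double[] sorted = Arrays.copyOf(values, size);
        Arrays.sort(sorted);
        int mid = size / 2;
        return size % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

The median needs a sorted copy of all the data, so it has no constant-memory equivalent here. That is the capability lost if the storeless version were made the default.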
Re: [math] Proposal for Package restructuring and Class renaming
--- "Mark R. Diggory" <[EMAIL PROTECTED]> wrote:
> I have several modifications I'm planning to make, but in the spirit of
> consensus I want to propose them and attempt to get some agreement. So
> math developer opinions on the subject would be good.
>
> 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
>
> Gives this package a more "generic" position to hold more than just
> "stat" distributions.

What other kinds of distributions did you have in mind? I'm asking out of
complete ignorance.

> 2.) Like in my last emails concerning "Univariate" I would like to, (and
> have done so in my checkout successfully) Make the following Class changes:
>
> interface o.a.c.m.stat.StoreUnivariate -->
> abstract class o.a.c.m.stat.DescriptiveStatistics
>
> this actually becomes a factory class and uses Discovery to instantiate
> new instances of the following implementations
>
> *default implementation*
> o.a.c.m.stat.StoreUnivariateImpl -->
>    o.a.c.m.stat.univariate.StatisticsImpl

Forgive me for not refamiliarizing myself with the code first, but should
the storeless version perhaps be the default implementation instead? What do
we lose by going that way? I'm thinking it would be nice to keep memory
usage lower if possible.

> *alternate implementations*
> o.a.c.m.stat.UnivariateImpl -->
>    o.a.c.m.stat.univariate.StorelessStatisticsImpl
>
> o.a.c.m.stat.ListUnivariateImpl -->
>    o.a.c.m.stat.univariate.ListStatisticsImpl
>
> o.a.c.m.stat.BeanListUnivariateImpl -->
>    o.a.c.m.stat.univariate.BeanListStatisticsImpl
>
> The benefit of this is that the Alternate Implementations can all be
> instantiated from the o.a.c.m.stat.DescriptiveStatistics factories
> newInstance(...) methods. Thus alternate implementations of
> DescriptiveStatistics can be written as Service Providers and set in the
> environment/JVM configuration. We can now write SP's for other tools
> like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list
> goes on and on...
>
> Someday, I'd like to see this design extended for Bivariate Statistics
> and Regression Classes. Eventually for Random Number generation as well.

Before we go overboard, can you give a quick example of instantiating one of
the implementations? Or perhaps, both the default and one alternative
implementation? Is it:

import org.apache.commons.math.stat.*;
...

StoreUnivariateImpl defaultImplementation =
    DescriptiveStatistics.newInstance() ;
StoreUnivariateImpl storagelessImplementation =
    DescriptiveStatistics.newInstance( StorelessStatisticsImpl ) ;

Al

=
Albert Davidson Chou
Get answers to Mac questions at http://www.Mac-Mgrs.org/ .
Re: [math] Proposal for Package restructuring and Class renaming
I agree

On Fri, 7 Nov 2003, Mark R. Diggory wrote:

> I have several modifications I'm planning to make, but in the spirit of
> consensus I want to propose them and attempt to get some agreement. So
> math developer opinions on the subject would be good.
>
> 1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions
>
> Gives this package a more "generic" position to hold more than just
> "stat" distributions.
>
> 2.) Like in my last emails concerning "Univariate" I would like to, (and
> have done so in my checkout successfully) Make the following Class changes:
>
> interface o.a.c.m.stat.StoreUnivariate -->
> abstract class o.a.c.m.stat.DescriptiveStatistics
>
> this actually becomes a factory class and uses Discovery to instantiate
> new instances of the following implementations
>
> *default implementation*
> o.a.c.m.stat.StoreUnivariateImpl -->
>    o.a.c.m.stat.univariate.StatisticsImpl
>
> *alternate implementations*
> o.a.c.m.stat.UnivariateImpl -->
>    o.a.c.m.stat.univariate.StorelessStatisticsImpl
>
> o.a.c.m.stat.ListUnivariateImpl -->
>    o.a.c.m.stat.univariate.ListStatisticsImpl
>
> o.a.c.m.stat.BeanListUnivariateImpl -->
>    o.a.c.m.stat.univariate.BeanListStatisticsImpl
>
> The benefit of this is that the Alternate Implementations can all be
> instantiated from the o.a.c.m.stat.DescriptiveStatistics factories
> newInstance(...) methods. Thus alternate implementations of
> DescriptiveStatistics can be written as Service Providers and set in the
> environment/JVM configuration. We can now write SP's for other tools
> like Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list
> goes on and on...
>
> Someday, I'd like to see this design extended for Bivariate Statistics
> and Regression Classes. Eventually for Random Number generation as well.
>
> -Mark
>
>

--
Matt Cliff
Cliff Consulting
303.757.4912
720.280.6324 (c)

The label said install Windows 98 or better so I installed Linux.
[math] Proposal for Package restructuring and Class renaming
I have several modifications I'm planning to make, but in the spirit of
consensus I want to propose them and attempt to get some agreement. So math
developer opinions on the subject would be good.

1.) o.a.c.math.stat.distributions --> o.a.c.math.distributions

Gives this package a more "generic" position to hold more than just "stat"
distributions.

2.) Like in my last emails concerning "Univariate" I would like to, (and
have done so in my checkout successfully) Make the following Class changes:

interface o.a.c.m.stat.StoreUnivariate -->
abstract class o.a.c.m.stat.DescriptiveStatistics

this actually becomes a factory class and uses Discovery to instantiate new
instances of the following implementations

*default implementation*
o.a.c.m.stat.StoreUnivariateImpl -->
    o.a.c.m.stat.univariate.StatisticsImpl

*alternate implementations*
o.a.c.m.stat.UnivariateImpl -->
    o.a.c.m.stat.univariate.StorelessStatisticsImpl

o.a.c.m.stat.ListUnivariateImpl -->
    o.a.c.m.stat.univariate.ListStatisticsImpl

o.a.c.m.stat.BeanListUnivariateImpl -->
    o.a.c.m.stat.univariate.BeanListStatisticsImpl

The benefit of this is that the Alternate Implementations can all be
instantiated from the o.a.c.m.stat.DescriptiveStatistics factories
newInstance(...) methods. Thus alternate implementations of
DescriptiveStatistics can be written as Service Providers and set in the
environment/JVM configuration. We can now write SP's for other tools like
Matlab, Mathematica, JLink, C++ libraries, R, Omegahat ... the list goes on
and on...

Someday, I'd like to see this design extended for Bivariate Statistics and
Regression Classes. Eventually for Random Number generation as well.

-Mark

--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu
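Editorial sketch: the factory arrangement proposed in item 2 might look roughly like this. It is a minimal illustration under stated assumptions, not the actual Commons Math code. The names DescriptiveStats, StoredStats and StorelessStats are hypothetical stand-ins, and plain reflection is used where the proposal would use Commons Discovery to look up the configured default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed abstract factory class.
abstract class DescriptiveStats {
    abstract void addValue(double v);
    abstract double getMean();

    // Default implementation. The proposal would resolve this through
    // Discovery; here it is hard-coded (an assumption for the sketch).
    static DescriptiveStats newInstance() {
        return newInstance(StoredStats.class);
    }

    // Alternate implementation selected by class.
    static DescriptiveStats newInstance(Class<? extends DescriptiveStats> impl) {
        try {
            return impl.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + impl.getName(), e);
        }
    }

    // Alternate implementation selected by fully qualified class name.
    static DescriptiveStats newInstance(String className) {
        try {
            return newInstance(Class.forName(className).asSubclass(DescriptiveStats.class));
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown implementation " + className, e);
        }
    }
}

// Stores every value (the "full capabilities" default in the proposal).
class StoredStats extends DescriptiveStats {
    private final List<Double> values = new ArrayList<>();

    void addValue(double v) {
        values.add(v);
    }

    double getMean() {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return values.isEmpty() ? Double.NaN : sum / values.size();
    }
}

// Keeps only running totals.
class StorelessStats extends DescriptiveStats {
    private long n;
    private double sum;

    void addValue(double v) {
        n++;
        sum += v;
    }

    double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}
```

Callers depend only on the abstract type, so swapping implementations is a matter of which newInstance(...) overload is called or, with a discovery mechanism, of configuration alone.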