> On 28 May 2019, at 18:09, Eric Barnhill <ericbarnh...@gmail.com> wrote:
> 
> The previous commons-math interface for descriptive statistics used a
> paradigm of constructing classes for various statistical functions and
> calling evaluate(). Example:
> 
> Mean mean = new Mean();
> double mn = mean.evaluate(double[])
> 
> I wrote this type of code all through grad school and always found it
> unnecessarily bulky.  To me these summary statistics are classic use cases
> for static methods:
> 
> double mean = Mean.evaluate(double[])
> 
> I don't have any particular problem with the evaluate() syntax.
> 
> I looked over the old Math 4 API to see if there were any benefits to the
> previous class-oriented approach that we might not want to lose. But I
> don't think there were; the functionality outside of evaluate() is minimal.

A quick check shows that evaluate() comes from UnivariateStatistic. Its other
methods add little that requires an instance:

double evaluate(double[] values) throws MathIllegalArgumentException;
double evaluate(double[] values, int begin, int length)
    throws MathIllegalArgumentException;
UnivariateStatistic copy();

However, it is extended by StorelessUnivariateStatistic, which adds methods to
update the statistic:

void increment(double d);
void incrementAll(double[] values) throws MathIllegalArgumentException;
void incrementAll(double[] values, int start, int length)
    throws MathIllegalArgumentException;
double getResult();
long getN();
void clear();
StorelessUnivariateStatistic copy();

This type of functionality would be lost by static methods.
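
To make the difference concrete, here is a rough sketch of the two styles for
the mean. The names StaticStats and RunningMean are hypothetical and exist
only for this illustration; neither is part of Commons Math, and the real
storeless classes carry more than this. The method names on RunningMean are
borrowed from the StorelessUnivariateStatistic list above.

```java
/** Hypothetical static-style helper; not a Commons Math class. */
final class StaticStats {
    private StaticStats() {}

    /** One-shot mean: every value must be available up front. */
    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return values.length == 0 ? Double.NaN : sum / values.length;
    }
}

/** Hypothetical instance-style mean that keeps state between calls. */
final class RunningMean {
    private long n;
    private double mean;

    /** Fold one more value into the running mean without storing it. */
    void increment(double d) {
        n++;
        mean += (d - mean) / n;
    }

    double getResult() {
        return n == 0 ? Double.NaN : mean;
    }

    long getN() {
        return n;
    }

    void clear() {
        n = 0;
        mean = 0;
    }
}
```

With RunningMean a caller can ask for the result, feed in more values later
and ask again. With StaticStats.mean the only option is to rebuild the full
array and recompute.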

If you are moving to a functional interface type pattern for each statistic 
then you will lose the other functionality possible with an instance state, 
namely updating with more values or combining instances.

So this is a question of whether updating a statistic is required after the 
first computation.

Will there be an alternative in the library for a map-reduce type operation
using instances that can be combined via DoubleStream.collect?

    <R> R collect(Supplier<R> supplier,
                  ObjDoubleConsumer<R> accumulator,
                  BiConsumer<R, R> combiner);

Here <R> would be Mean:

double mean = Arrays.stream(new double[1000])
                    .collect(Mean::new, Mean::add, Mean::add)
                    .getMean();

with:

void add(double);
void add(Mean);
double getMean();

(Untested code)
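
For what it is worth, a minimal sketch of such a combinable class could look
like the following. CombinableMean is a hypothetical name chosen to avoid
confusion with the existing Mean class, and the weighted update in
add(CombinableMean) is just one reasonable way to merge two partial means. I
am not proposing it as the final design.

```java
import java.util.Arrays;

/** Hypothetical combinable mean for use with DoubleStream.collect. */
final class CombinableMean {
    private long n;
    private double mean;

    /** Accumulator: fold a single value into this instance. */
    void add(double value) {
        n++;
        mean += (value - mean) / n;
    }

    /** Combiner: merge another partial result, weighted by its count. */
    void add(CombinableMean other) {
        if (other.n == 0) {
            return;
        }
        long total = n + other.n;
        mean += (other.mean - mean) * other.n / total;
        n = total;
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    long getN() {
        return n;
    }

    /** Convenience wrapper showing the collect call from the text above. */
    static double of(double[] values) {
        return Arrays.stream(values)
                     .collect(CombinableMean::new,
                              CombinableMean::add,
                              CombinableMean::add)
                     .getMean();
    }
}
```

The combiner is only invoked when the stream is parallel, so a sequential
stream exercises add(double) alone. Merging two instances by hand is the
easiest way to check add(CombinableMean).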

> 
> Finally we should consider whether we really need a separate class for each
> statistic at all. Do we want to call:
> 
> Mean.evaluate()
> 
> or
> 
> SummaryStats.mean()
> 
> or maybe
> 
> Stats.mean() ?
> 
> The last being nice and compact.
> 
> Let's make a decision so our esteemed mentee Virendra knows in what
> direction to take his work this summer. :)

