Hi Luc,

Thank you for your attention. I'll open a ticket for it in the JIRA
tracking system, and the details can be discussed there.

André

On Tue, Sep 9, 2008 at 8:53 PM, Luc Maisonobe <[EMAIL PROTECTED]> wrote:
> André Panisson wrote:
>> Hello,
>
> Hi,
>
> First of all, please add the name of the component in square brackets at
> the beginning of the subject when you post a message to this list. I
> have done it here with [math], as your question concerns commons-math.
> This list is shared by all commons sub-projects; this policy helps
> people filter their mail.
>
>>
>> I'm writing a complex validation algorithm that performs a K-fold
>> cross-validation on a data set. The data set is partitioned into K
>> subsamples, and of the K subsamples, a single subsample is retained
>> as the validation data for testing, and the remaining K − 1
>> subsamples are used as training data. The process is then repeated K
>> times, and at the end the K results are aggregated into a single
>> result. The problem is that all K results return Statistics objects
>> (org.apache.commons.math.stat.descriptive.SummaryStatistics), and I
>> need to aggregate all K objects into a single Statistics.
>> I think it is a common problem in the statistics field. Has anyone
>> already implemented a utility method to do this?
>
> There is no such feature currently in commons-math. The
> SummaryStatistics class wraps a bunch of specialized statistics classes
> (Sum, Mean, Max, SumOfSquares ...) which can be overridden by
> user-provided StorelessUnivariateStatistic implementations.
>
> So this feature should be added to the StorelessUnivariateStatistic
> interface and all its implementations, with a signature like this:
>   public void aggregate(StorelessUnivariateStatistic otherStatistic);
>
> The implementation of this method should only use the
> StorelessUnivariateStatistic methods, i.e. getResult() and getN(). This
> seems feasible for the statistics used by SummaryStatistics, but has not
> been done yet.
>
> One should be aware that SummaryStatistics does not enforce strong
> typing, so one could call aggregate on a Sum instance and provide it a
> Min instance, which would of course produce meaningless results.
>
>> Or maybe it would be interesting to request it as an Improvement to
>> the Commons Math developers, adding an "aggregator" to all Statistics
>> implementations?
>
> If you want to request this improvement, please open a ticket for it
> using our JIRA tracking system:
> http://issues.apache.org/jira/browse/MATH. You'll have to register to be
> able to add your feature request. You can also provide a patch if you
> want to contribute it yourself.
>
> Luc
>
>>
>> Thanks in advance,
>>
>> Andre Panisson
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
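
As a rough illustration of the aggregation discussed in this thread, here
is a minimal, self-contained Java sketch. It does not use commons-math and
is not the library's API: the FoldStatsAggregator class, its Summary type,
and the merge and aggregate methods are hypothetical names introduced only
for this example. Each fold is reduced to a count, a mean, a sum of squared
deviations, a min, and a max, and two such summaries are combined with the
standard pairwise update for mean and variance, so the raw values are not
needed again.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of commons-math.
class FoldStatsAggregator {

    /** Immutable summary of one fold (illustrative type, assumed for this sketch). */
    static final class Summary {
        static final Summary EMPTY =
            new Summary(0, 0.0, 0.0, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY);

        final long n;      // number of values
        final double mean; // arithmetic mean
        final double m2;   // sum of squared deviations from the mean
        final double min;
        final double max;

        Summary(long n, double mean, double m2, double min, double max) {
            this.n = n;
            this.mean = mean;
            this.m2 = m2;
            this.min = min;
            this.max = max;
        }

        /** Builds a summary from raw values using Welford's one-pass update. */
        static Summary of(double... values) {
            long n = 0;
            double mean = 0.0;
            double m2 = 0.0;
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double x : values) {
                n++;
                double delta = x - mean;
                mean += delta / n;
                m2 += delta * (x - mean);
                min = Math.min(min, x);
                max = Math.max(max, x);
            }
            return new Summary(n, mean, m2, min, max);
        }

        /** Sample variance (divides by n - 1); NaN when empty, 0 for one value. */
        double variance() {
            if (n == 0) {
                return Double.NaN;
            }
            return n == 1 ? 0.0 : m2 / (n - 1);
        }

        /** Combines two summaries without revisiting the underlying values. */
        Summary merge(Summary other) {
            if (other.n == 0) {
                return this;
            }
            if (n == 0) {
                return other;
            }
            long total = n + other.n;
            double delta = other.mean - mean;
            double newMean = mean + delta * other.n / total;
            double newM2 = m2 + other.m2 + delta * delta * n * other.n / total;
            return new Summary(total, newMean, newM2,
                               Math.min(min, other.min), Math.max(max, other.max));
        }
    }

    /** Folds a list of per-fold summaries into a single summary. */
    static Summary aggregate(List<Summary> folds) {
        Summary result = Summary.EMPTY;
        for (Summary fold : folds) {
            result = result.merge(fold);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three folds of made-up example data.
        Summary all = aggregate(Arrays.asList(
            Summary.of(1.0, 2.0, 3.0),
            Summary.of(4.0, 5.0),
            Summary.of(6.0)));
        System.out.println("n=" + all.n + " mean=" + all.mean
            + " variance=" + all.variance()
            + " min=" + all.min + " max=" + all.max);
    }
}
```

Note that this sketch keeps the sum of squared deviations internally. An
aggregate method restricted to getResult() and getN(), as suggested above,
would have to recover that quantity from the variance and the count, and
it would still be subject to the typing caveat Luc mentions: merging a
summary of one kind with a statistic of another kind gives meaningless
results.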
