Hi Gilles,

On Wed, May 29, 2019 at 11:18 PM Gilles Sadowski <[email protected]> wrote:

> Hello.
>
> On Wed, May 29, 2019 at 12:24, Marco Neumann <[email protected]> wrote:
> >
> > I am evaluating the use of Apache Commons Math Median for querying
> > large data sets in another Apache project called Apache Jena.
> >
> > In my preliminary performance tests I was surprised to find that a simple
> > implementation of a median function with Arrays.sort() and a programmatic
> > selection of the median value yields much faster results
> > than Median().evaluate() or DescriptiveStatistics.getPercentile(50).
>
> :-(

No worries, I still consider Apache Commons Math a very valuable effort.

> > Since we only use this function for Arrays of confirmed numbers
>
> What is a "confirmed number"?

That should probably have read "programmatically confirmed numbers" rather
than "confirmed number". I am not dealing with NaN or infinite values in the
sort at the moment.

> > is there a
> > particular benefit in using Apache Commons Math for this task or are we
> > better advised to use our own implementation here?
>
> There is ongoing work to refactor the "o.a.c.m.stat.descriptive" package
> of "Commons Math". The new code will be in "Commons Statistics".[1]
> Your observation is an interesting data point for this task; could you
> please file a report in JIRA[2] and/or mention it on the "dev" ML?

I can certainly file a report and will do so tomorrow. I am looking forward
to the results of the work on the new stats package!

Best,
Marco

> Thanks,
> Gilles
>
> [1] http://commons.apache.org/proper/commons-statistics/
> [2] https://issues.apache.org/jira/projects/STATISTICS/issues/STATISTICS-15?filter=allopenissues
>
> > Thank You
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

--

---
Marco Neumann
KONA
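
For reference, a minimal sketch of the kind of sort-based median discussed in this thread. It is not the code that was benchmarked. The class name SortMedian and the method median are hypothetical, and the sketch assumes a non-empty double[] with no NaN values, as described above.

```java
import java.util.Arrays;

public class SortMedian {

    // Hypothetical helper: median by sorting a copy and picking the middle.
    // Assumes the input is non-empty and contains no NaN values.
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];           // odd length: the single middle element
        }
        // even length: mean of the two middle elements
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new double[] {5.0, 1.0, 3.0}));      // 3.0
        System.out.println(median(new double[] {4.0, 1.0, 3.0, 2.0})); // 2.5
    }
}
```

The Commons Math counterpart named in the thread would be new Median().evaluate(values). Any timing comparison between the two depends on the data size and the benchmark setup, so the sketch makes no performance claim.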
