Otis,

What statistics do you need? What guarantees?

On Wed, Aug 7, 2013 at 1:26 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> Hi Ted,
>
> I'm actually trying to find an alternative to QDigest (the stream-lib impl
> specifically) because even though it seems good, we have to deal with crazy
> volumes of data in SPM (performance monitoring service, see signature)...
> I'm hoping we can find something that has both a lower memory footprint
> than QDigest AND that is mergeable a la QDigest. Utopia?
>
> Thanks,
> Otis
> ----
> Performance Monitoring for Solr / ElasticSearch / Hadoop / HBase -
> http://sematext.com/spm
>
> >________________________________
> > From: Ted Dunning <ted.dunn...@gmail.com>
> >To: "user@mahout.apache.org" <user@mahout.apache.org>
> >Sent: Wednesday, August 7, 2013 4:51 PM
> >Subject: Re: Is OnlineSummarizer mergeable?
> >
> >It isn't as mergeable as I would like. If you have randomized record
> >selection, it should be possible, but perverse ordering can cause serious
> >errors.
> >
> >It would be better to use something like a Q-digest.
> >
> >http://www.cs.virginia.edu/~son/cs851/papers/ucsb.sensys04.pdf
> >
> >On Wed, Aug 7, 2013 at 4:21 AM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Is OnlineSummarizer algo "mergeable"?
> >>
> >> Say that we compute a percentile for some metric for time 12:00-12:01
> >> and store that somewhere, then we compute it for 12:01-12:02 and store
> >> that separately, and so on.
> >>
> >> Can we then later merge these computed and previously stored
> >> percentile "instances" and get an accurate value?
> >>
> >> Thanks,
> >> Otis
> >> --
> >> Performance Monitoring -- http://sematext.com/spm
> >> Solr & ElasticSearch Support -- http://sematext.com/
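
To make the merge question in this thread concrete, here is a minimal Python sketch. It is neither Mahout's OnlineSummarizer nor stream-lib's QDigest; it swaps in a plain fixed-width bucket histogram, which is about the simplest summary that merges exactly. The names BUCKET_WIDTH, summarize, merge and percentile, and the sample latencies, are all hypothetical and chosen only for illustration.

```python
from collections import Counter

BUCKET_WIDTH = 10  # hypothetical resolution; the estimate is off by at most one bucket


def summarize(values):
    """Build a fixed-width bucket histogram: bucket index -> count."""
    return Counter(int(v // BUCKET_WIDTH) for v in values)


def merge(a, b):
    """Merging adds counts per bucket, so grouping and order do not matter."""
    return a + b


def percentile(hist, p):
    """Return the upper edge of the bucket that contains the p-th percentile."""
    total = sum(hist.values())
    if total == 0:
        raise ValueError("empty histogram")
    target = p / 100.0 * total
    seen = 0
    for bucket in sorted(hist):
        seen += hist[bucket]
        if seen >= target:
            return (bucket + 1) * BUCKET_WIDTH


# Hypothetical per-minute samples: a quiet minute followed by a slow one.
minute_1 = [12, 15, 18, 22, 25, 31, 38, 44, 47, 52]
minute_2 = [210, 230, 250, 410, 520, 640, 700, 810, 905, 990]

# Merging the stored summaries gives the same histogram as one pass over all data.
merged = merge(summarize(minute_1), summarize(minute_2))
assert merged == summarize(minute_1 + minute_2)

# Averaging the stored per-minute medians is a different (and misleading) number.
naive = (percentile(summarize(minute_1), 50) + percentile(summarize(minute_2), 50)) / 2

print("median from merged summaries:", percentile(merged, 50))
print("average of per-minute medians:", naive)
```

The point of the sketch is that stored percentile values cannot be combined after the fact, while stored summaries can, provided the summary's merge is insensitive to how the data was split. The memory cost here grows with the number of occupied buckets, which is the trade-off a Q-digest style structure tries to bound.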