Hi Tomas,

A less sophisticated approach is to just bin the values using some
regular grid and count them, and then use the midpoint of the median bin
as an approximation.

Then

abs((median bin midpoint) - (true median)) <= (median bin width)/2

which may be good enough for your application.

Best,

Tamas

On Tue, Jan 06 2015, Tomas Mikoviny <tomas.mikov...@gmail.com> wrote:

> Tamas, thanks for the quantile reference - I'm just immersed in that paper.
>
> As I've mentioned in reply to Kevin, I have rather bulky mass spec data and 
> performance is rather crucial. I've read about the approach with limited 
> distinct values but unfortunately it is not very applicable in my case. 
> Let's see what I'll learn from Greenwald paper
>
> Tomas
>
>
> On Tuesday, January 6, 2015 6:35:06 PM UTC+1, Tamas Papp wrote:
>>
>> Hi Tomas, 
>>
>> I don't know any packages, but a running mean is trivial to 
>> implement. See eg 
>>
>> http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
>>  
>> <http://www.google.com/url?q=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FAlgorithms_for_calculating_variance%23Online_algorithm&sa=D&sntz=1&usg=AFQjCNGY1sv8symyRT_JBeOg4AwZb6Nt-g>
>>  
>> , it doesn't get faster than this, and it is reasonably accurate. 
>>
>> Medians (and quantiles in general) are trickier, because you have to 
>> keep track of the whole distribution or sacrifice accuracy. See 
>>
>> @inproceedings{greenwald2001space, 
>>   title={Space-efficient online computation of quantile summaries}, 
>>   author={Greenwald, Michael and Khanna, Sanjeev}, 
>>   booktitle={ACM SIGMOD Record}, 
>>   volume={30}, 
>>   number={2}, 
>>   pages={58--66}, 
>>   year={2001}, 
>>   organization={ACM} 
>> } 
>>
>> You might want to tell us more about your problem and then you could get 
>> more specific advice. Eg if number of observations >> number of distinct 
>> values, you could simply count them in a Dict. 
>>
>> Best, 
>>
>> Tamas 
>>
>> On Tue, Jan 06 2015, Tomas Mikoviny <tomas.m...@gmail.com <javascript:>> 
>> wrote: 
>>
>> > Hi, 
>> > I was just wondering if anyone knows if there is a package that 
>> implements 
>> > *fast* running median/mean algorithms? 
>> > 
>> > Thanks a lot... 
>> >   
>>
>>

Reply via email to