Hi everyone,

I implemented a version of distributed streaming quantiles for PySpark.  It
uses a count-min sketch approach.  You can find the code here:

https://github.com/laserson/dsq
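
For anyone unfamiliar with the idea, here is a minimal sketch of how a
count-min sketch can give approximate quantiles. It is not the dsq code
itself. It assumes integer-valued data in a known range, and every name in
it (CountMinSketch, approx_quantile, width, depth) is hypothetical and
chosen only for illustration.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch; not taken from dsq."""

    def __init__(self, width=2048, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash per row, derived by salting the key with the row number.
        data = f"{row}:{key}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a cell, so the minimum across rows
        # is an upper bound on the true count of the key.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.depth))

    def merge(self, other):
        # Sketches with the same shape and hashes combine by cell-wise
        # addition, which is what makes the structure easy to distribute.
        assert (self.width, self.depth) == (other.width, other.depth)
        for row in range(self.depth):
            for col in range(self.width):
                self.table[row][col] += other.table[row][col]
        return self


def approx_quantile(sketch, q, total, lo, hi):
    """Smallest integer v in [lo, hi] whose estimated cumulative count
    reaches q * total. Assumes integer values in a known range."""
    target = q * total
    running = 0
    for v in range(lo, hi + 1):
        running += sketch.estimate(v)
        if running >= target:
            return v
    return hi


# Two "partitions" are sketched independently and then merged.
left, right = CountMinSketch(), CountMinSketch()
for x in range(0, 50):
    left.add(x)
for x in range(50, 100):
    right.add(x)
merged = left.merge(right)
median = approx_quantile(merged, 0.5, total=100, lo=0, hi=99)
```

Because the estimates never undercount, the quantile this returns can come
out slightly low but not high. In PySpark, per-partition sketches like
these could presumably be combined with something like RDD.aggregate, using
add as the per-element step and merge as the combine step.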

Thought it might be of interest...

Uri

-- 
Uri Laserson, PhD
Data Scientist, Cloudera
Twitter/GitHub: @laserson
+1 617 910 0447
laser...@cloudera.com
