Hey Guys,

I saw this floating around Twitter recently:

  https://github.com/tdunning/t-digest

Seems like it might be a good way to compute quantiles from a Samza task. Just 
throwing it out there in case anyone's interested.

One other thought would be to adapt this to a state store, so you could have 
predictable quantile computation (even in the face of failure). Keep in mind, 
though, that the algorithm is approximate, so after a failure you'd get exactly 
the same approximate answer (hah!), not an exact one. A store-backed version 
would, however, take advantage of local disk.
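
To make the state store idea a bit more concrete, here's a rough sketch in 
Python. Big caveats: the sketch class is a toy fixed-bucket histogram that I 
made up for illustration, NOT t-digest, and the "store" is just a dict standing 
in for a Samza key-value store. All names here (HistogramSketch, run, the 
"latency" key) are hypothetical. The only point is that if the sketch is 
deterministic and gets serialized into the store, a task that restores from 
the store ends up with the same approximate answer as one that never failed.

```python
import json


class HistogramSketch:
    """Toy approximate-quantile sketch (NOT t-digest): fixed-width buckets."""

    def __init__(self, lo=0.0, hi=1000.0, buckets=100, counts=None):
        self.lo, self.hi, self.buckets = lo, hi, buckets
        self.counts = list(counts) if counts is not None else [0] * buckets

    def _width(self):
        return (self.hi - self.lo) / self.buckets

    def add(self, x):
        # Out-of-range values are clamped into the first or last bucket.
        i = int((x - self.lo) / self._width())
        self.counts[min(max(i, 0), self.buckets - 1)] += 1

    def quantile(self, q):
        # Upper edge of the first bucket whose cumulative count reaches q.
        total = sum(self.counts)
        if total == 0:
            raise ValueError("empty sketch")
        seen = 0
        for i, count in enumerate(self.counts):
            seen += count
            if seen >= q * total:
                return self.lo + (i + 1) * self._width()
        return self.hi

    def to_bytes(self):
        state = {"lo": self.lo, "hi": self.hi,
                 "buckets": self.buckets, "counts": self.counts}
        return json.dumps(state).encode("utf-8")

    @classmethod
    def from_bytes(cls, data):
        state = json.loads(data.decode("utf-8"))
        return cls(state["lo"], state["hi"], state["buckets"], state["counts"])


def run(values, crash_at=None):
    """Feed values into a sketch; optionally simulate a failure and restore."""
    store = {}  # stand-in for a durable key-value state store
    sketch = HistogramSketch()
    for n, value in enumerate(values):
        if n == crash_at:
            # Persist the sketch, "lose" the task, rebuild it from the store.
            store["latency"] = sketch.to_bytes()
            sketch = HistogramSketch.from_bytes(store["latency"])
        sketch.add(value)
    return sketch.quantile(0.99)


values = [(i * 37) % 1000 for i in range(10000)]
no_failure = run(values)
with_failure = run(values, crash_at=5000)
print(no_failure, with_failure, no_failure == with_failure)
```

The two runs agree with each other, but the answer is still only accurate to 
the bucket width. A real version would swap the toy class for t-digest and 
would also have to deal with replaying messages since the last checkpoint, 
which this glosses over.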

Cheers,
Chris
