Hello,

Last year, at the AofA'16 <http://www.aofa2016.meetings.pl/> conference, Robert Sedgewick <https://www.cs.princeton.edu/~rs/> proposed a new algorithm for cardinality estimation called HyperBitBit. Robert Sedgewick is a professor at Princeton with a long track record of publications on combinatorial and randomized algorithms. He was a good friend of Philippe Flajolet (creator of HyperLogLog), and HyperBitBit is based on the same ideas. However, it uses less memory than HyperLogLog and is reported to give comparable results. On practical data, for N < 2^64, HyperBitBit estimates cardinality to within 10% using only 128 + 6 bits.
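
To give a feel for how small the state is, here is a minimal sketch in Java, assuming my reading of the slides is right. The class name HyperBitBitSketch, the method names add and estimate, and the mix64 hash are placeholders I made up for illustration and are not part of stream-lib or of any existing implementation. The constants (starting level 5, threshold 31, offset 5.4) are as I read them in the slides, so please treat them as assumptions too.

```java
// Illustrative sketch only: names and hash function are hypothetical.
final class HyperBitBitSketch {
    private int lgN = 5;        // current level; fits in 6 bits
    private long sketch = 0L;   // 64-bit sketch for the current level
    private long sketch2 = 0L;  // 64-bit sketch for the next level

    // Assumed hash: a standard 64-bit bit mixer, chosen for the example.
    static long mix64(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    public void add(long item) {
        long h = mix64(item);
        int k = (int) (h >>> 58);                // top 6 bits pick one of 64 positions
        int r = Long.numberOfTrailingZeros(~h);  // number of trailing 1 bits in h
        if (r > lgN) {
            sketch |= 1L << k;
        }
        if (r > lgN + 1) {
            sketch2 |= 1L << k;
        }
        // Once more than 31 bits are set, move up one level.
        if (Long.bitCount(sketch) > 31) {
            sketch = sketch2;
            sketch2 = 0L;
            lgN++;
        }
    }

    public double estimate() {
        return Math.pow(2.0, lgN + 5.4 + Long.bitCount(sketch) / 32.0);
    }
}
```

The whole state is the two 64-bit words plus the small counter, which is where the 128 + 6 bits come from. Note that with this formula an empty sketch already reports roughly 1,350, so the sketch above is only meaningful for large streams.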

There are some open-source implementations of HyperBitBit on GitHub (mainly in Go <https://github.com/seiflotfy/hyperbitbit>), but it would be great to have it in Java, in one of the most widely used libraries for streaming. I would like to implement it; if you think it would be useful, let me know. If you want to know more about HyperBitBit, here are some notes <https://www.cs.princeton.edu/~rs/talks/AC11-Cardinality.pdf> about it.

Thank you,
Jordi

P.S. I am familiar with the library internals because I once implemented another algorithm for it, Recordinality (it has been waiting to be merged for a year now), so integrating HyperBitBit would not take me long.
