I believe that when zetasketch was added, it was also noticeably more efficient than other sketch implementations. However, that was a number of years ago, and I don't know whether it still has an advantage.
On Wed, Jan 18, 2023 at 10:41 AM Byron Ellis via dev <dev@beam.apache.org> wrote:
> Hi everyone,
>
> I was looking at adding at least a couple of the sketches from the Apache
> Datasketches library to the Beam Java SDK and I was wondering if folks had
> a preference for adding to the existing "sketching" extension vs splitting
> it out into its own extension?
>
> The reason I ask is that there's some overlap (which already exists in
> zetasketch) between the sketches available in Datasketches vs Beam today,
> particularly HyperLogLog which would have 3 implementations if we were to
> add all of them.
>
> I don't really have a strong opinion, though personally I'd probably lean
> towards a single sketching extension (zetasketch being something of a
> special case as it exists for format compatibility as far as I can tell).
> But I could see how that could be confusing if you had the Apache
> Datasketch implementation and the existing implementation derived from the
> clearspring implementations.
>
> Any thoughts?
>
> Best,
> B