Thanks, Charles, for your reply; it helped me better understand "finalize" and "post-aggregate".
Back to my question: I understand that the HyperUnique type stores the unique entities, even though it is Base64-encoded. To take a specific example, suppose I have one HyperUnique that represents {'a', 'b', 'c'} and another for {'b', 'c', 'd'}. I would like to get their intersection, i.e., a logical AND operation between them. Is that supported by Druid?

Thanks,
Lei Wang

________________________________
From: Charles Allen <charles.al...@snap.com.INVALID>
Sent: Monday, July 30, 2018 22:47
To: dev@druid.apache.org
Subject: Re: Two questions about HLL

Thank you for bringing this up.

The query process has multiple stages. The final stage is a "finalize" stage. During this stage the query tries to turn the binary form (e.g., long, float, stats, hyperunique) into something that can be sent over the wire as JSON for consumption by the end user in a meaningful way. As such, you should be able to use a HyperUnique aggregation just fine, as its finalization should yield a human-readable number.

The reason the HyperUniqueCardinality post-aggregator exists is that the post-aggregator computations occur before finalization. So at that point the HyperUnique is still a HyperUnique aggregator and not a finalized number. It is hard to make a HyperUnique aggregator work dynamically in every post-aggregator and still yield the expected behavior. Explicitly declaring a HyperUniqueCardinality post-aggregator makes it very clear how you want the results of the HLL calculation handled for the purposes of post-aggregation.

Long story short, you should be able to use HyperUnique if you want the sketch estimate directly in the query result body.

Cheers,
Charles Allen

On Sun, Jul 29, 2018 at 11:15 PM ? ? <biolearn...@hotmail.com> wrote:

> Hi,
>
> I am a newbie to Druid, and I would like to aggregate the hyperUnique of
> daily users to get the distinct count over a period of days. Around this:
>
> 1. I am surprised that I could not even find a Druid page listing all of
>    the column types supported by Druid. So far the ones I have met are
>    String, HyperUnique, LongSum, time, etc.
> 2. Is there a post-aggregation function that aggregates hyperUnique
>    further, to count the distinct values on top of that? I did not find
>    one. It seems there is only HyperUniqueCardinality, which gives the
>    count from a HyperUnique and can be used for arithmetic calculations
>    only.
>
> Thanks in advance for your clarification.
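
To make the distinction between the two stages concrete, here is a minimal sketch of a native timeseries query, built as a Python dict. The datasource name `daily_users`, the column name `user_hll`, the interval, and the output names are all hypothetical, chosen only for illustration. The `hyperUnique` aggregator on its own is finalized into a number in the result, and with granularity `all` it merges the daily sketches across the whole interval. The `hyperUniqueCardinality` post-aggregator is only needed because the arithmetic runs before finalization.

```python
import json


def build_distinct_users_query(datasource, interval, hll_column):
    """Build a Druid native timeseries query as a plain dict.

    All argument values are supplied by the caller; nothing here talks to
    a Druid cluster, it only assembles the JSON body.
    """
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        # "all" collapses the whole interval into one bucket, so the daily
        # sketches are merged into a single distinct-count estimate.
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [
            # Finalized into a number in the query result.
            {"type": "hyperUnique", "name": "unique_users", "fieldName": hll_column},
            {"type": "count", "name": "rows"},
        ],
        "postAggregations": [
            {
                "type": "arithmetic",
                "name": "rows_per_user",
                "fn": "/",
                "fields": [
                    {"type": "fieldAccess", "name": "rows", "fieldName": "rows"},
                    # Post-aggregation runs before finalization, so the sketch
                    # is converted to a number explicitly here.
                    {
                        "type": "hyperUniqueCardinality",
                        "name": "unique_users_card",
                        "fieldName": "unique_users",
                    },
                ],
            }
        ],
    }


# Hypothetical names: "daily_users" and "user_hll" are assumptions.
query = build_distinct_users_query("daily_users", "2018-07-01/2018-07-31", "user_hll")
print(json.dumps(query, indent=2))
```

The printed JSON is what would be posted to the broker as the query body. If only the distinct count is wanted, the `postAggregations` section can be dropped entirely and `unique_users` read straight from the result.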