Thanks Gopal!

The hive-hll-udf README seems to mention RRD. Is that something
supported by Hive?

Will go over the Data Sketches as well, thanks for the pointer :)

On Wed, Dec 30, 2015 at 4:29 PM, Gopal Vijayaraghavan <go...@hortonworks.com
> wrote:

>
> > I'm trying to explore the HLL UDF option to compute # of uniq users for
> > each time range (week, month, yr, etc.) and wanted to know if it's
> > possible to just maintain HLL struct for each day and then use those to
> > compute the uniqs for various time ranges using these per day structs
> > instead of running the queries across all the data?
>
> Yes, unions of raw HLL can be done (though not intersects).
>
> https://github.com/t3rmin4t0r/hive-hll-udf
>
>
> Or better yet, use the Yahoo sketches which work better than raw HLL.
>
> http://yahooeng.tumblr.com/post/135390948446/data-sketches
>
> +
> http://datasketches.github.io/
>
> +
> https://github.com/DataSketches/sketches-hive
>
>
> Cheers,
> Gopal
>
>
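
For reference, the per-day union idea from the quoted exchange can be sketched in plain Python. This is only an illustration of why raw HLL structs merge cleanly (a register-wise max), not the code of hive-hll-udf or the DataSketches library. The `HLL` class, the precision parameter `p`, the SHA-1 hashing, and the synthetic user ids are all assumptions made for the example.

```python
import hashlib
import math


class HLL:
    """Toy HyperLogLog (hypothetical class, for illustration only)."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Take a 64-bit hash from the first 8 bytes of SHA-1 (an assumption).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Union of two sketches: take the max of each register pair.
        assert self.p == other.p, "sketches must use the same precision"
        out = HLL(self.p)
        out.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return out

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)            # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)          # small-range correction
        return raw


# One sketch per day, built from synthetic, overlapping user ids.
days = [range(0, 10000), range(5000, 15000), range(10000, 20000)]
daily = []
for users in days:
    sketch = HLL()
    for user in users:
        sketch.add(user)
    daily.append(sketch)

# "Weekly" uniques come from merging the stored daily sketches,
# without rescanning the underlying rows.
weekly = daily[0]
for sketch in daily[1:]:
    weekly = weekly.merge(sketch)

print(round(weekly.estimate()))
```

Because the merge is a register-wise max, the merged sketch is identical to one built over all the rows directly, so only the small daily structs need to be kept. Intersections have no equivalent register operation, which matches the caveat in the quoted reply.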
