[ 
https://issues.apache.org/jira/browse/ARROW-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527844#comment-17527844
 ] 

ZMZ91 commented on ARROW-14158:
-------------------------------

Sure. We'd like to have a hash_count_distinct_hll kernel, since an approximate 
result is good enough in many real-world cases.

> [C++][Compute] Implement count distinct kernel using HyperLogLog
> ----------------------------------------------------------------
>
>                 Key: ARROW-14158
>                 URL: https://issues.apache.org/jira/browse/ARROW-14158
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>    Affects Versions: 7.0.0
>            Reporter: Percy Camilo Triveño Aucahuasi
>            Priority: Major
>              Labels: Kernels, kernel
>
> Having a version of the aggregation kernel count distinct using HyperLogLog 
> may be useful.
> Note: The implementation should support the merge operator.
> cc [~icook] [~lidavidm]
> Some resources/links:
> [http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf]
> [https://engineering.fb.com/2018/12/13/data-infrastructure/hyperloglog/]
> [https://github.com/facebookincubator/velox/tree/main/velox/aggregates/hyperloglog]
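
For readers unfamiliar with why the issue asks for a merge operator, the following is a minimal standalone sketch of a HyperLogLog counter whose partial states can be combined. It is not Arrow's API and not a proposed kernel: the names HllSketch, AddHash, Merge, Estimate and Mix64 are hypothetical, and the sketch assumes inputs are already hashed to 64 bits and that the precision is between 7 and 16. The estimator follows the raw estimate plus small-range correction described in the Flajolet et al. paper linked above; the large-range correction is omitted on the assumption that 64-bit hashes make it unnecessary.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: splitmix64-style finalizer used as a stand-in hash.
inline uint64_t Mix64(uint64_t x) {
  x += 0x9e3779b97f4a7c15ULL;
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

// Hypothetical illustration of a mergeable HyperLogLog state (not Arrow code).
class HllSketch {
 public:
  // Assumes 7 <= precision <= 16; uses 2^precision one-byte registers.
  explicit HllSketch(int precision = 12)
      : precision_(precision),
        registers_(std::size_t{1} << precision, 0) {}

  // Consume one 64-bit hash value.
  void AddHash(uint64_t hash) {
    // The top `precision_` bits select a register.
    const std::size_t index =
        static_cast<std::size_t>(hash >> (64 - precision_));
    // The remaining bits, left-aligned, determine the rank: the 1-based
    // position of the leftmost set bit, capped when all bits are zero.
    const uint64_t rest = hash << precision_;
    const int max_rank = 64 - precision_ + 1;
    int rank = 1;
    for (uint64_t mask = uint64_t{1} << 63;
         rank < max_rank && (rest & mask) == 0; mask >>= 1) {
      ++rank;
    }
    registers_[index] =
        std::max(registers_[index], static_cast<uint8_t>(rank));
  }

  // Merge another sketch into this one by taking the register-wise maximum.
  // Returns false (and leaves this sketch unchanged) if precisions differ.
  bool Merge(const HllSketch& other) {
    if (other.precision_ != precision_) return false;
    for (std::size_t i = 0; i < registers_.size(); ++i) {
      registers_[i] = std::max(registers_[i], other.registers_[i]);
    }
    return true;
  }

  // Approximate number of distinct hashes seen so far.
  double Estimate() const {
    const double m = static_cast<double>(registers_.size());
    double sum = 0.0;
    std::size_t zeros = 0;
    for (uint8_t r : registers_) {
      sum += std::ldexp(1.0, -static_cast<int>(r));  // 2^(-r)
      if (r == 0) ++zeros;
    }
    const double alpha = 0.7213 / (1.0 + 1.079 / m);  // valid for m >= 128
    const double raw = alpha * m * m / sum;
    // Small-range correction: fall back to linear counting.
    if (raw <= 2.5 * m && zeros > 0) {
      return m * std::log(m / static_cast<double>(zeros));
    }
    return raw;
  }

 private:
  int precision_;
  std::vector<uint8_t> registers_;
};
```

Because Merge is a register-wise maximum, it is associative, commutative and idempotent, so partial states built per batch, per thread or per group can be combined in any order. That property is what a grouped or distributed aggregation would rely on. With the default precision of 12 the state is 4096 bytes and the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%.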



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
