[ 
https://issues.apache.org/jira/browse/IMPALA-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-2658:
-------------------------------------

    Assignee:     (was: Peter Ebert)

> Extend the NDV function to accept a precision
> ---------------------------------------------
>
>                 Key: IMPALA-2658
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2658
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.2.4
>            Reporter: Peter Ebert
>            Priority: Minor
>              Labels: ramp-up
>         Attachments: Comparison of HLL Memory usage, Query Duration and Accuracy.jpg
>
>
> The HyperLogLog algorithm used by NDV defaults to a precision of 10.  Being 
> able to set this precision would have two benefits:
> # Lower precision can speed up queries, as a precision of 9 uses half as 
> many registers as 10 (the register count is 2^precision) and may be just as 
> accurate, depending on the expected cardinality.
> # Higher precision can help with very large cardinalities (100 million to 
> billion range) and will typically provide more accurate estimates.  Those 
> who are presenting estimates to end users will likely be willing to trade 
> some performance for more accuracy, while still outperforming the naive 
> approach by a large margin.
> Proposal: add the overloaded function NDV(expression, int precision),
> with an accepted precision range of 4 to 18 inclusive.
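
To make the trade-off in the description concrete, here is a minimal Python sketch of how register count and expected error vary with precision. It assumes the textbook HyperLogLog relations (2^precision registers, asymptotic relative standard error of roughly 1.04 / sqrt(registers)); Impala's actual NDV implementation may behave differently. The helper names hll_registers and hll_relative_error are hypothetical and exist only for this illustration; the 4 to 18 bounds are the range proposed in the issue.

```python
import math

# Bounds proposed in the issue for the precision argument (inclusive).
MIN_PRECISION = 4
MAX_PRECISION = 18


def hll_registers(precision: int) -> int:
    """Number of HyperLogLog registers for a given precision (2^precision)."""
    if not MIN_PRECISION <= precision <= MAX_PRECISION:
        raise ValueError(
            f"precision must be between {MIN_PRECISION} and {MAX_PRECISION}"
        )
    return 1 << precision


def hll_relative_error(precision: int) -> float:
    """Approximate relative standard error from the textbook HLL analysis.

    Assumption: the asymptotic estimate 1.04 / sqrt(m), where m is the
    register count. Real implementations can deviate from this.
    """
    return 1.04 / math.sqrt(hll_registers(precision))


if __name__ == "__main__":
    # Compare the current default (10) with one step down and the proposed max.
    for p in (9, 10, 18):
        print(
            f"precision={p:2d} registers={hll_registers(p):6d} "
            f"approx. error={hll_relative_error(p):.4%}"
        )
```

Under these assumptions, dropping from precision 10 to 9 halves the register count while increasing the expected error by a factor of sqrt(2), which is the trade-off the first benefit describes.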



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
