[ https://issues.apache.org/jira/browse/HIVE-25498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Zhang reassigned HIVE-25498:
-----------------------------------

    Assignee: Robbie Zhang

> Query with more than 32 count distinct functions returns wrong result
> ---------------------------------------------------------------------
>
>                 Key: HIVE-25498
>                 URL: https://issues.apache.org/jira/browse/HIVE-25498
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Robbie Zhang
>            Assignee: Robbie Zhang
>            Priority: Major
>
> If a query contains more than 32 "COUNT(DISTINCT col)" functions, every 
> COUNT function in that query returns 0 instead of the correct value.
> The following statements reproduce the issue:
> {code:java}
> set hive.cbo.enable=true;
> create table test_count (c0 string, c1 string, c2 string, c3 string, c4 
> string, c5 string, c6 string, c7 string, c8 string, c9 string, c10 string, 
> c11 string, c12 string, c13 string, c14 string, c15 string, c16 string, c17 
> string, c18 string, c19 string, c20 string, c21 string, c22 string, c23 
> string, c24 string, c25 string, c26 string, c27 string, c28 string, c29 
> string, c30 string, c31 string, c32 string);
> INSERT INTO test_count values ('c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 
> 'c7', 'c8', 'c9', 'c10', 'c11', 'c12', 'c13', 'c14', 'c15', 'c16', 'c17', 
> 'c18', 'c19', 'c20', 'c21', 'c22', 'c23', 'c24', 'c25', 'c26', 'c27', 'c28', 
> 'c29', 'c30', 'c31', 'c32'); 
> select count (distinct c0), count(distinct c1), count(distinct c2), 
> count(distinct c3), count(distinct c4), count(distinct c5), count(distinct 
> c6), count(distinct c7), count(distinct c8), count(distinct c9), 
> count(distinct c10), count(distinct c11), count(distinct c12), count(distinct 
> c13), count(distinct c14), count(distinct c15), count(distinct c16), 
> count(distinct c17), count(distinct c18), count(distinct c19), count(distinct 
> c20), count(distinct c21), count(distinct c22), count(distinct c23), 
> count(distinct c24), count(distinct c25), count(distinct c26), count(distinct 
> c27), count(distinct c28), count(distinct c29), count(distinct c30), 
> count(distinct c31), count(distinct c32) from test_count;
> {code}
> The bug is caused by HiveExpandDistinctAggregatesRule.getGroupingIdValue(), 
> which uses the int type. With more than 32 groupings, the value overflows.
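
The overflow described above can be illustrated with a small, self-contained sketch. It is not the Hive code: the body of getGroupingIdValue() is not shown in this issue, so the class GroupingIdSketch and the methods maskInt and maskLong are hypothetical names, and the sketch assumes the grouping id is built as a bitmask with one bit per column position. Under that assumption, a 32-bit int has no room for position 32, and Java's shift semantics make that position collide with position 0.

```java
public class GroupingIdSketch {

    // Hypothetical int-based mask: sets bit p for every grouped column position p.
    static int maskInt(int... positions) {
        int mask = 0;
        for (int p : positions) {
            // For int operands Java uses only the low 5 bits of the shift
            // distance, so 1 << 32 evaluates to 1 << 0.
            mask |= 1 << p;
        }
        return mask;
    }

    // Same idea with a 64-bit mask: positions 0..63 each get a distinct bit.
    static long maskLong(int... positions) {
        long mask = 0L;
        for (int p : positions) {
            mask |= 1L << p;
        }
        return mask;
    }

    public static void main(String[] args) {
        System.out.println(maskInt(0));    // 1
        System.out.println(maskInt(32));   // 1 (collides with position 0)
        System.out.println(maskInt(31));   // -2147483648 (sign bit set)
        System.out.println(maskLong(32));  // 4294967296
    }
}
```

Widening the mask to long only moves the limit to 64 positions; whether that is an adequate fix for Hive depends on how the rule consumes the value, which this issue does not describe.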



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
