[ https://issues.apache.org/jira/browse/HIVE-25498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HIVE-25498:
----------------------------------
    Labels: pull-request-available  (was: )

> Query with more than 32 count distinct functions returns wrong result
> ---------------------------------------------------------------------
>
>                 Key: HIVE-25498
>                 URL: https://issues.apache.org/jira/browse/HIVE-25498
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Robbie Zhang
>            Assignee: Robbie Zhang
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If there are more than 32 "COUNT(DISTINCT COL)" functions in a query, all
> these COUNT functions in this query return 0 instead of the proper values.
> Here are the queries to reproduce this issue:
> {code:java}
> set hive.cbo.enable=true;
> create table test_count (
>   c0 string, c1 string, c2 string, c3 string, c4 string, c5 string,
>   c6 string, c7 string, c8 string, c9 string, c10 string, c11 string,
>   c12 string, c13 string, c14 string, c15 string, c16 string, c17 string,
>   c18 string, c19 string, c20 string, c21 string, c22 string, c23 string,
>   c24 string, c25 string, c26 string, c27 string, c28 string, c29 string,
>   c30 string, c31 string, c32 string);
> INSERT INTO test_count values (
>   'c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9', 'c10',
>   'c11', 'c12', 'c13', 'c14', 'c15', 'c16', 'c17', 'c18', 'c19', 'c20',
>   'c21', 'c22', 'c23', 'c24', 'c25', 'c26', 'c27', 'c28', 'c29', 'c30',
>   'c31', 'c32');
> select
>   count(distinct c0), count(distinct c1), count(distinct c2),
>   count(distinct c3), count(distinct c4), count(distinct c5),
>   count(distinct c6), count(distinct c7), count(distinct c8),
>   count(distinct c9), count(distinct c10), count(distinct c11),
>   count(distinct c12), count(distinct c13), count(distinct c14),
>   count(distinct c15), count(distinct c16), count(distinct c17),
>   count(distinct c18), count(distinct c19), count(distinct c20),
>   count(distinct c21), count(distinct c22), count(distinct c23),
>   count(distinct c24), count(distinct c25), count(distinct c26),
>   count(distinct c27), count(distinct c28), count(distinct c29),
>   count(distinct c30), count(distinct c31), count(distinct c32)
> from test_count;
> {code}
> This bug is caused by HiveExpandDistinctAggregatesRule.getGroupingIdValue()
> which uses int type. When there are more than 32 groupings the values
> overflow.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
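
The int overflow named as the root cause can be illustrated with a small, self-contained Java sketch. This is not the code of HiveExpandDistinctAggregatesRule.getGroupingIdValue(); the class GroupingIdOverflow and its helper methods are hypothetical and only assume that each grouping is assigned one bit of a bitmask. With int arithmetic, the 33rd grouping (position 32) wraps around and collides with position 0, whereas long arithmetic keeps all 33 bits distinct.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not taken from the Hive code base.
class GroupingIdOverflow {

    // Bit assigned to a grouping position using int arithmetic.
    // For int operands Java uses only the low 5 bits of the shift distance,
    // so position 32 behaves like position 0.
    static int bitAsInt(int position) {
        return 1 << position;
    }

    // Same idea with long arithmetic: the low 6 bits of the shift distance
    // are used, so positions 0..63 each get their own bit.
    static long bitAsLong(int position) {
        return 1L << position;
    }

    // Number of distinct int bit values produced for groupCount groupings.
    static int distinctIntBits(int groupCount) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < groupCount; i++) {
            seen.add(bitAsInt(i));
        }
        return seen.size();
    }

    // Number of distinct long bit values produced for groupCount groupings.
    static int distinctLongBits(int groupCount) {
        Set<Long> seen = new HashSet<>();
        for (int i = 0; i < groupCount; i++) {
            seen.add(bitAsLong(i));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(bitAsInt(32));         // 1, same as bitAsInt(0)
        System.out.println(bitAsLong(32));        // 4294967296
        System.out.println(distinctIntBits(33));  // 32
        System.out.println(distinctLongBits(33)); // 33
    }
}
```

Under this assumption, widening the value to long would only raise the limit to 64 groupings; whether the actual fix takes that route is not stated in this message.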