Shohei Okumiya created HIVE-28254:
-------------------------------------

             Summary: CBO (Calcite Return Path): Multiple DISTINCT leads to wrong results
                 Key: HIVE-28254
                 URL: https://issues.apache.org/jira/browse/HIVE-28254
             Project: Hive
          Issue Type: Bug
          Components: CBO
    Affects Versions: 4.0.0
            Reporter: Shohei Okumiya
            Assignee: Shohei Okumiya


The CBO return path can build an incorrect GroupByOperator when a query contains multiple aggregations with DISTINCT. The following example reproduces the problem.
{code:java}
CREATE TABLE test (col1 INT, col2 INT);
INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300);
set hive.cbo.returnpath.hiveop=true;
set hive.map.aggr=false;

SELECT
  SUM(DISTINCT col1),
  COUNT(DISTINCT col1),
  SUM(DISTINCT col2),
  SUM(col2)
FROM test;
{code}
The last column should be 800, the sum of col2 over all four rows. Instead, the SUM refers to col1, so the actual result is 8 (1 + 2 + 2 + 3).
{code:java}
+------+------+------+------+
| _c0  | _c1  | _c2  | _c3  |
+------+------+------+------+
| 6    | 3    | 600  | 8    |
+------+------+------+------+
{code}
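
The expected values can be cross-checked by running the same data and query through an independent engine. The sketch below uses Python's built-in sqlite3 module purely as a stand-in; this is an assumption made for illustration and says nothing about how Hive plans the query. The helper name expected_aggregates is hypothetical. It only confirms the arithmetic: the fourth aggregate is a plain SUM over col2.

```python
import sqlite3

# Same rows as the Hive reproduction above.
ROWS = [(1, 100), (2, 200), (2, 200), (3, 300)]

QUERY = """
SELECT
  SUM(DISTINCT col1),
  COUNT(DISTINCT col1),
  SUM(DISTINCT col2),
  SUM(col2)
FROM test
"""


def expected_aggregates():
    """Run the repro query on an in-memory SQLite table (a stand-in, not Hive)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE test (col1 INT, col2 INT)")
        conn.executemany("INSERT INTO test VALUES (?, ?)", ROWS)
        # The query has no GROUP BY, so it yields exactly one row.
        return conn.execute(QUERY).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(expected_aggregates())
```

The last element of the returned tuple is the value that _c3 should hold; the reported result instead equals the plain sum of col1.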