Shohei Okumiya created HIVE-28254:
-------------------------------------

             Summary: CBO (Calcite Return Path): Multiple DISTINCT leads to 
wrong results
                 Key: HIVE-28254
                 URL: https://issues.apache.org/jira/browse/HIVE-28254
             Project: Hive
          Issue Type: Bug
          Components: CBO
    Affects Versions: 4.0.0
            Reporter: Shohei Okumiya
            Assignee: Shohei Okumiya


The CBO return path can build an incorrect GroupByOperator when a query contains 
multiple aggregations with DISTINCT.

The following query reproduces the problem.
{code:sql}
CREATE TABLE test (col1 INT, col2 INT);
INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300);

set hive.cbo.returnpath.hiveop=true;
set hive.map.aggr=false;

SELECT
  SUM(DISTINCT col1),
  COUNT(DISTINCT col1),
  SUM(DISTINCT col2),
  SUM(col2)
FROM test;{code}
The last column, SUM(col2), should be 800. Instead, the SUM is evaluated against 
col1, so the actual result is 8 (1 + 2 + 2 + 3).
{code:java}
+------+------+------+------+
| _c0  | _c1  | _c2  | _c3  |
+------+------+------+------+
| 6    | 3    | 600  | 8    |
+------+------+------+------+ {code}
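
For reference, the expected values can be checked by hand. The following is a minimal Python sketch, not Hive code, that recomputes each aggregate from the four sample rows. The variable names are illustrative only, and the sample data contains no NULLs, so plain set and sum semantics are assumed to match SQL here.

```python
# Sample rows from the reproduction above: (col1, col2)
rows = [(1, 100), (2, 200), (2, 200), (3, 300)]

col1 = [r[0] for r in rows]
col2 = [r[1] for r in rows]

# Expected result row, in the same column order as the query
expected = (
    sum(set(col1)),  # SUM(DISTINCT col1)
    len(set(col1)),  # COUNT(DISTINCT col1)
    sum(set(col2)),  # SUM(DISTINCT col2)
    sum(col2),       # SUM(col2)
)

# Value reported for the last column when SUM is applied to col1 instead
observed_last = sum(col1)

print(expected)       # (6, 3, 600, 800)
print(observed_last)  # 8
```

Only the last column differs from the reported output, which is consistent with the non-DISTINCT SUM being bound to col1 rather than col2.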
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
