[ 
https://issues.apache.org/jira/browse/CALCITE-5578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699933#comment-17699933
 ] 

Pinhan Zhao commented on CALCITE-5578:
--------------------------------------

I believe this would be the same issue (with the testAggregateCaseToFilter 
rule) as in 
[https://issues.apache.org/jira/projects/CALCITE/issues/CALCITE-5516?filter=allopenissues|https://issues.apache.org/jira/projects/CALCITE/issues/CALCITE-5516?filter=allopenissues.]

I've marked the case documented in CALCITE-5516 as a duplicate of this.  Please 
check.  Thanks so much.

> RelOptRulesTest testAggregateCaseToFilter optimized plan not semantically 
> equivalent to the original one after conversion
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5578
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5578
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Pinhan Zhao
>            Priority: Blocker
>
> After converting the "plan before" and "plan after" of 
> testAggregateCaseToFilter 
> ([https://github.com/apache/calcite/blob/2dba40e7a0a5651eac5a30d9e0a72f178bd9bff2/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java#L4140])
>  to PostgreSQL queries, we obtained two queries:
> {code:java}
> SELECT SUM(SAL) AS SUM_SAL, COUNT(DISTINCT CASE WHEN JOB = 'CLERK' THEN 
> DEPTNO ELSE NULL END) AS COUNT_DISTINCT_CLERK, SUM(CASE WHEN DEPTNO = 10 THEN 
> SAL ELSE NULL END) AS SUM_SAL_D10, SUM(CASE WHEN DEPTNO = 20 THEN SAL ELSE 0 
> END) AS SUM_SAL_D20, SUM(CASE WHEN DEPTNO = 30 THEN 1 ELSE 0 END) AS 
> COUNT_D30, COUNT(CASE WHEN DEPTNO = 40 THEN 'x' ELSE NULL END) AS COUNT_D40, 
> SUM(CASE WHEN DEPTNO = 45 THEN 1 ELSE NULL END) AS COUNT_D45, SUM(CASE WHEN 
> DEPTNO = 50 THEN 1 ELSE NULL END) AS COUNT_D50, SUM(CAST(NULL AS DECIMAL(19, 
> 9))) AS SUM_NULL_D60, SUM(CASE WHEN DEPTNO = 70 THEN NULL ELSE 1 END) AS 
> SUM_NULL_D70, COUNT(CASE WHEN DEPTNO = 20 THEN 1 ELSE NULL END) AS COUNT_D20 
> FROM EMP{code}
> and
> {code:java}
> SELECT SUM(SAL) AS SUM_SAL, COUNT(DISTINCT DEPTNO) FILTER (WHERE JOB = 
> 'CLERK') AS COUNT_DISTINCT_CLERK, SUM(SAL) FILTER (WHERE DEPTNO = 10) AS 
> SUM_SAL_D10, SUM(SAL) FILTER (WHERE DEPTNO = 20) AS SUM_SAL_D20, 
> CAST(COUNT(*) FILTER (WHERE DEPTNO = 30) AS INTEGER) AS COUNT_D30, COUNT(*) 
> FILTER (WHERE DEPTNO = 40) AS COUNT_D40, SUM(1) FILTER (WHERE DEPTNO = 45) AS 
> COUNT_D45, SUM(1) FILTER (WHERE DEPTNO = 50) AS COUNT_D50, SUM(CAST(NULL AS 
> DECIMAL(19, 9))) AS SUM_NULL_D60, SUM(1) FILTER (WHERE DEPTNO <> 70) AS 
> SUM_NULL_D70, COUNT(*) FILTER (WHERE DEPTNO = 20) AS COUNT_D20 FROM EMP{code}
>  
> The code we used to convert the plans to PostgreSQL queries is:
> {code:java}
> converter.visitRoot(rel).asStatement().toSqlString(PostgresqlSqlDialect.DEFAULT).getSql().replace('\n',
>  '');{code}
>  
> However, these queries are not semantically equivalent.  With a 
> counterexample below (which I believe is also satisfying the integrity 
> constraints):
> {code:java}
> CREATE TABLE EMP (
>         EMPNO INTEGER,
>         DEPTNO INTEGER,
>         ENAME VARCHAR(20),
>         JOB VARCHAR(20),
>         MGR INTEGER,
>         HIREDATE DATE,
>         SAL INTEGER,
>         COMM INTEGER,
>         SLACKER INTEGER
> );
> INSERT INTO EMP VALUES (0, 70, '-2147483649', '-6721455509335307966', 0, 
> '1970-01-01', 0, 0, 1);
> CREATE TABLE DEPT (
>         DEPTNO INTEGER,
>         NAME VARCHAR(20)
> );
> INSERT INTO DEPT VALUES (70, '-2147483649');
> CREATE TABLE BONUS (
>         ENAME VARCHAR(20),
>         JOB VARCHAR(20),
>         SAL INTEGER,
>         COMM INTEGER
> );
> INSERT INTO BONUS VALUES ('2147483648', '2147483648', 0, 0);
> CREATE TABLE EMPNULLABLES (
>         EMPNO INTEGER,
>         DEPTNO INTEGER,
>         ENAME VARCHAR(20),
>         JOB VARCHAR(20),
>         MGR INTEGER,
>         HIREDATE DATE,
>         SAL INTEGER,
>         COMM INTEGER,
>         SLACKER INTEGER
> );
> INSERT INTO EMPNULLABLES VALUES (0, 70, '2147483648', '2147483648', 0, 
> '1970-01-01', 0, 0, 1);
> CREATE TABLE EMPNULLABLES_20 (
>         EMPNO INTEGER,
>         DEPTNO INTEGER,
>         ENAME VARCHAR(20),
>         JOB VARCHAR(20),
>         MGR INTEGER,
>         HIREDATE DATE,
>         SAL INTEGER,
>         COMM INTEGER,
>         SLACKER INTEGER
> );
> INSERT INTO EMPNULLABLES_20 VALUES (0, 70, '2147483648', '2147483648', 0, 
> '1970-01-01', 0, 0, 1);
> CREATE TABLE EMP_B (
>         EMPNO INTEGER,
>         DEPTNO INTEGER,
>         ENAME VARCHAR(20),
>         JOB VARCHAR(20),
>         MGR INTEGER,
>         HIREDATE DATE,
>         SAL INTEGER,
>         COMM INTEGER,
>         SLACKER INTEGER,
>         BIRTHDATE DATE
> );
> INSERT INTO EMP_B VALUES (0, 70, '2147483648', '2147483648', 0, '1970-01-01', 
> 0, 0, 1, '1970-01-01');{code}
>  
> we can have two queries produce different output:
> {code:java}
>  sum_sal | count_distinct_clerk | sum_sal_d10 | sum_sal_d20 | count_d30 | 
> count_d40 | count_d45 | count_d50 | sum_null_d60 | sum_null_d70 | count_d20 
> ---------+----------------------+-------------+-------------+-----------+-----------+-----------+-----------+--------------+--------------+-----------
>        0 |                    0 |             |           0 |         0 |     
>     0 |           |           |              |              |         0
> (1 row){code}
> vs
> {code:java}
>  sum_sal | count_distinct_clerk | sum_sal_d10 | sum_sal_d20 | count_d30 | 
> count_d40 | count_d45 | count_d50 | sum_null_d60 | sum_null_d70 | count_d20 
> ---------+----------------------+-------------+-------------+-----------+-----------+-----------+-----------+--------------+--------------+-----------
>        0 |                    0 |             |             |         0 |     
>     0 |           |           |              |              |         0
> (1 row){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to