Vineet Garg created CALCITE-1493:
------------------------------------

             Summary: Wrong plan for NOT IN correlated queries
                 Key: CALCITE-1493
                 URL: https://issues.apache.org/jira/browse/CALCITE-1493
             Project: Calcite
          Issue Type: Bug
          Components: core
            Reporter: Vineet Garg
            Assignee: Julian Hyde


Plan generated by calcite with SubqueryRemoveRule followed by de-correlation 
for the following query:
{noformat} select sal from emp where empno NOT IN (select deptno from dept 
where emp.job = dept.name) {noformat}

is

{noformat}
LogicalProject(SAL=[$5])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
    LogicalFilter(condition=[IS NULL($11)])
      LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[CAST($9):INTEGER], 
JOB0=[CAST($10):VARCHAR(10) CHARACTER SET "ISO-8859-1" COLLATE 
"ISO-8859-1$en_US$primary"], $f2=[CAST($11):BOOLEAN])
        LogicalJoin(condition=[AND(=($2, $10), =($0, $9))], joinType=[inner])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
          LogicalProject(DEPTNO=[$0], JOB=[$1], $f2=[true])
            LogicalAggregate(group=[{0, 1}])
              LogicalProject(DEPTNO=[$0], JOB=[$2], i=[$1])
                LogicalProject(DEPTNO=[$0], i=[true], JOB=[$1])
                  LogicalProject(DEPTNO=[$0], JOB=[$2])
                    LogicalJoin(condition=[=($2, $1)], joinType=[inner])
                      LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
                      LogicalAggregate(group=[{0}])
                        LogicalProject(JOB=[$2])
                          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{noformat}

As you can notice instead of doing {noformat} Left Outer Join {noformat} 
Calcite is doing {noformat} Inner Join {noformat}. This will produce wrong 
results.

Plan for same query just before SubqueryRemove Rule is:
{noformat}
LogicalProject(SAL=[$5])
  LogicalFilter(condition=[NOT(IN($0, {
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[=($cor0.JOB, $1)])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
}))], variablesSet=[[$cor0]])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{noformat}

Plan just after SubqueryRemove Rule:

{noformat}
LogicalProject(SAL=[$5])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
    LogicalFilter(condition=[IS NULL($10)])
      LogicalFilter(condition=[=($0, $9)])
        LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], 
requiredColumns=[{2}])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
          LogicalAggregate(group=[{0, 1}])
            LogicalProject(DEPTNO=[$0], i=[true])
              LogicalProject(DEPTNO=[$0])
                LogicalFilter(condition=[=($cor0.JOB, $1)])
                  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{noformat}

Looking at above it seems RelDecorrelator have some issue where it is coming up 
with Inner Join.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to