[ https://issues.apache.org/jira/browse/CALCITE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713845#comment-17713845 ]
Benchao Li edited comment on CALCITE-5655 at 4/19/23 2:10 AM: -------------------------------------------------------------- CC [~FrankZou] since you are the assignee of CALCITE-5390, could you help to verify whether CALCITE-5390 still exists after we merged this one? was (Author: libenchao): CC [~FrankZou] since you are the assignee of CALCITER-5390, could you help to verify whether CALCITER-5390 still exists after we merged this one? > Wrong plan for multiple IN/SOME sub-queries with OR predicate > ------------------------------------------------------------- > > Key: CALCITE-5655 > URL: https://issues.apache.org/jira/browse/CALCITE-5655 > Project: Calcite > Issue Type: Bug > Components: core > Affects Versions: 1.34.0 > Reporter: Runkang He > Assignee: Runkang He > Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > When the query contains multiple IN/SOME sub-queries connected with OR > predicate in WHERE clause, the result is wrong. The minimal reproducer is > below: > SQL: > {code:sql} > select empno from sales.empnullables > where deptno in ( > select deptno from sales.deptnullables where name = 'dept1') > or deptno in ( > select deptno from sales.deptnullables where name = 'dept2') > {code} > The Plan generated by calcite master branch: (Notice the bold part of *<>($2, > 0)* in the downstream LogicalFilter) > {code:sql} > LogicalProject(EMPNO=[$0]) > LogicalProject(EMPNO=[$0], DEPTNO=[$1]) > LogicalFilter(condition=[OR(AND(<>($2, 0), IS NOT NULL($5), IS NOT > NULL($1)), AND(***<>($2, 0)***, IS NOT NULL($9), IS NOT NULL($1)))]) > LogicalJoin(condition=[=($1, $8)], joinType=[left]) > LogicalJoin(condition=[true], joinType=[inner]) > LogicalJoin(condition=[=($1, $4)], joinType=[left]) > LogicalJoin(condition=[true], joinType=[inner]) > LogicalProject(EMPNO=[$0], DEPTNO=[$7]) > LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) > LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)]) > LogicalProject(DEPTNO=[$0]) > LogicalFilter(condition=[=($1, 'dept1')]) > LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) > LogicalProject(DEPTNO=[$0], i=[true]) > LogicalFilter(condition=[=($1, 'dept1')]) > LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) > LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)]) > LogicalProject(DEPTNO=[$0]) > LogicalFilter(condition=[=($1, 'dept2')]) > LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) > LogicalProject(DEPTNO=[$0], i=[true]) > LogicalFilter(condition=[=($1, 'dept2')]) > LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]]) > {code} > The wrong part is that when build the downstream LogicalFilter for the two > sub-queries, the filter for the second sub-query is AND(<>($2, 0), IS NOT > NULL($9), IS NOT NULL($1)), notice that *$2 should be the second sub-query's > intermediate table field ct.c(which field index is $6), but now the actual > reference is the first sub-query's*, this leads to wrong plan, and wrong > result. > The root cause is that intermediate table alias is the same as the previous > sub-query's, but when lookup intermediate table field, it always returns the > previous one which is not belong to the current subquery. -- This message was sent by Atlassian Jira (v8.20.10#820010)