[ 
https://issues.apache.org/jira/browse/CALCITE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5789:
-------------------------------
    Fix Version/s: 1.35.0

> Query with two nested subqueries where the inner-most references the 
> outer-most table returns wrong result
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5789
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5789
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>             Fix For: 1.35.0
>
>         Attachments: debugger.png
>
>
> Problem can be reproduced with the following query (to be added e.g. in 
> sub-query.iq):
> {code:sql}
> select deptno from dept d1 where exists (
>  select 1 from dept d2 where d2.deptno = d1.deptno and exists (
>   select 1 from dept d3 where d3.deptno = d2.deptno and d3.dname = d1.dname));
> {code}
> The problem appears with at least two nested subqueries, when the inner most 
> references the outermost table, in our case {{and d3.dname = d1.dname}} (if 
> we remove this expression, the problem does not occur).
> When the above query is processed, the following plan is generated (notice 
> how the top-most projection contains two fields instead of one):
> {noformat}
> // Plan2 (Decorrelator output)
> LogicalProject(DEPTNO=[$0], DEPTNO0=[$4])
>   LogicalJoin(condition=[AND(=($0, $2), =($1, $3))], joinType=[inner])
>     LogicalProject(DEPTNO=[$0], DNAME=[$1])
>       LogicalTableScan(table=[[scott, DEPT]])
>     LogicalProject(DEPTNO3=[$0], DNAME0=[$3], DEPTNO0=[$4], $f3=[true])
>       LogicalJoin(condition=[true], joinType=[inner])
>         LogicalTableScan(table=[[scott, DEPT]])
>         LogicalProject(DNAME=[$1], DEPTNO=[$0], $f2=[true])
>           LogicalFilter(condition=[IS NOT NULL($1)])
>             LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
> Even if this minimalist test does not fail, it leads to a situation where the 
> {{RelRoot#validatedRowType}} and the {{RelRoot#rel#rowType}} don't match  
> (see attached screenshot) , which can lead to unforeseeable consequences (for 
> the record, I have seen more complex queries which do fail at execution time 
> because of this issue).
> The culprit that generates the unexpected extra field in the final projection 
> is RelDecorrelator, however I think the decorrelator is not to blame here, 
> because the input that reaches it is already wrong:
> {noformat}
> // Plan1 (SubQueryRemoveRule output, Decorrelator input)
> LogicalProject(DEPTNO=[$0])
>   LogicalProject(DEPTNO=[$0], DNAME=[$1])
>     LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0, 1}])
>       LogicalProject(DEPTNO=[$0], DNAME=[$1])
>         LogicalTableScan(table=[[scott, DEPT]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(i=[true])
>           LogicalProject(DEPTNO=[$0], DNAME=[$1], LOC=[$2])
>             LogicalFilter(condition=[=($0, $cor0.DEPTNO)])
> ***           LogicalJoin(condition=[true], joinType=[inner], 
> variablesSet=[[$cor1, $cor0]])
>                 LogicalTableScan(table=[[scott, DEPT]])
>                 LogicalAggregate(group=[{0}])
>                   LogicalProject(i=[true])
> ***                 LogicalFilter(condition=[AND(=($0, $cor1.DEPTNO), =($1, 
> $cor0.DNAME))])
>                       LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
> Notice how there is a correlated variable $cor1 used by the LogicalJoin, 
> which is not generated by any LogicalCorrelate.
> Who generates this wrong plan? SubQueryRemoveRule, which takes the original 
> plan and removes the nested subqueries with the questionable above result:
> {noformat}
> // Plan0 (Original, SubQueryRemoveRule input)
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[EXISTS({
>     LogicalFilter(condition=[AND(=($0, $cor0.DEPTNO), EXISTS({
>       LogicalFilter(condition=[AND(=($0, $cor1.DEPTNO), =($1, $cor0.DNAME))])
>         LogicalTableScan(table=[[scott, DEPT]])
>     }))], variablesSet=[[$cor1]])
>       LogicalTableScan(table=[[scott, DEPT]])
>   })], variablesSet=[[$cor0]])
>     LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
> When processing the original plan, 
> {{CoreRules.FILTER_SUB_QUERY_TO_CORRELATE}} correctly converts the first 
> subquery into a correlation with $cor0. However, when processing the second 
> (inner-most) subquery, we reach this code in SubQueryRemoveRule#matchFilter 
> method:
> {code:java}
> final Set<CorrelationId>  variablesSet =
>           RelOptUtil.getVariablesUsed(e.rel);
> {code}
> Which brings a set with TWO variables ($cor0 and $cor1), because the subquery 
> RelNode references both: {{AND(=($0, $cor1.DEPTNO), =($1, $cor0.DNAME))}}. 
> However $cor0 does not belong to the "current context", since it is the 
> variable of the first subquery.
> *My hypothesis is that the root cause is precisely the presence of the first 
> subquery variable ($cor0) in the variableSet of the second subquery; if at 
> this point the variable set would be simply $cor1, the problem would not 
> occur.*
> When reaching the {{SubQueryRemoveRule#rewriteExists}} with this 
> {{variablesSet}} of two elements:
> {code:java}
> builder.join(JoinRelType.INNER, builder.literal(true), variablesSet);
> {code}
> The correlation is not introduced because of this check in {{RelBuilder}}:
> {code:java}
>   public RelBuilder join(JoinRelType joinType, RexNode condition,
>       Set<CorrelationId> variablesSet) {
>     ...
>     final boolean correlate = checkIfCorrelated(variablesSet, joinType, 
> left.rel, right.rel);
>     ...
>   }
>   private boolean checkIfCorrelated(Set<CorrelationId> variablesSet,
>       JoinRelType joinType, RelNode leftNode, RelNode rightRel) {
>     if (variablesSet.size() != 1) {
>       return false;
>     }
>     ...
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to