[ https://issues.apache.org/jira/browse/CALCITE-6554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17877496#comment-17877496 ]
Ian Bertolacci edited comment on CALCITE-6554 at 8/28/24 6:46 PM: ------------------------------------------------------------------ [~libenchao] Would it make sense for the project constructor, or the RelBuilder.project calls which do not take correlation variables, to collect the correlation variables on construction? That way we can avoid this issue in any other place where a projection which uses a correlated subquery is created as a projected expression. My concern is that there are other places where this is an issue, but we just haven't hit it. was (Author: ian.bertolacci): [~libenchao] Would it make sense for the project constructor, or the RelBuilder.project calls which do not take correlation variables, to collect the correlation variables on construction? That way we can avoid this issue in any other place where a projection which uses a correlated subquery is created as a projected expression. > nested correlated sub-query in aggregation does not have inner correlation > variable bound to inner projection > ------------------------------------------------------------------------------------------------------------- > > Key: CALCITE-6554 > URL: https://issues.apache.org/jira/browse/CALCITE-6554 > Project: Calcite > Issue Type: Bug > Affects Versions: 1.37.0 > Reporter: Ian Bertolacci > Assignee: Ian Bertolacci > Priority: Major > > It appears that nested correlated subqueries can (at least in aggregates) > result in project nodes not having their variablesSet populated with the > correct (or any) correlation variables. > For example: > {code:java} > @Test void testCorrelationInProjectionWith1xNestedCorrelatedProjection() { > final String sql = > "select e1.empno,\n" > + " (select sum(e2.sal +\n" > + " (select sum(e3.sal) from emp e3 where e3.mgr = e2.empno)\n" > + " ) from emp e2 where e2.mgr = e1.empno)\n" > + "from emp e1"; > sql(sql).withExpand(false).withDecorrelate(false).ok(); > } > {code} > currently gives the following plan: > {code:java} > LogicalProject(variablesSet=[[$cor0]], EMPNO=[$0], EXPR$1=[$SCALAR_QUERY({ > LogicalAggregate(group=[{}], EXPR$0=[SUM($0)]) > LogicalProject($f0=[+($5, $SCALAR_QUERY({ > LogicalAggregate(group=[{}], EXPR$0=[SUM($0)]) > LogicalProject(SAL=[$5]) > LogicalFilter(condition=[=($3, $cor1.EMPNO)]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > }))]) > LogicalFilter(condition=[=($3, $cor0.EMPNO)]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > })]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > Notice that cor1 is not bound to the inner query's projection nodes > variablesSet as cor0 is to the outer query's projection node. > This results in improper subquery removal (which was originally believed to > be an issue with the rule itself, discussed in CALCITE-5213) > This is because in [SqlToRelConverter.createAggImpl, it does not check for > correlation variables after creating an intermediate > project|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L3589-L3595] > in the same way that > [SqlToRelConverter.convertSelectList|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L4754-L4766] > does. > This patch fixes addresses this issue: > {code:none} > diff --git > a/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java > b/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java > index d88daa631..5668bc825 100644 > --- a/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java > +++ b/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java > @@ -3587,11 +3587,26 @@ private void createAggImpl(Blackboard bb, > final RelNode inputRel = bb.root(); > // Project the expressions required by agg and having. > - bb.setRoot( > - relBuilder.push(inputRel) > - .projectNamed(preExprs.leftList(), preExprs.rightList(), false) > - .build(), > - false); > + { > + RelNode intermediateProject = relBuilder.push(inputRel) > + .projectNamed(preExprs.leftList(), preExprs.rightList(), false) > + .build(); > + final RelNode r2; > + // deal with correlation > + final CorrelationUse p = getCorrelationUse(bb, intermediateProject); > + if (p != null) { > + assert p.r instanceof Project; > + // correlation variables have been normalized in p.r, we should > use expressions > + // in p.r instead of the original exprs > + Project project1 = (Project) p.r; > + r2 = relBuilder.push(bb.root()) > + .projectNamed(project1.getProjects(), > project1.getRowType().getFieldNames(), true, ImmutableSet.of(p.id)) > + .build(); > + } else { > + r2 = intermediateProject; > + } > + bb.setRoot(r2, false); > + } > bb.mapRootRelToFieldProjection.put(bb.root(), r.groupExprProjection); > // REVIEW jvs 31-Oct-2007: doesn't the declaration of > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)