Conceptually, it makes sense to have a unique variable for each
Correlate node but as you observed there are cases where this does not
hold. In some cases it causes problems and in some others it doesn't
which makes a holistic fix somewhat tricky. I think it's worth trying
to generate unique ids for each correlate although I suspect that some
things may be broken along the way and it may take some time before
merging this proposal.

Best,
Stamatis

On Wed, May 28, 2025 at 8:20 PM Mihai Budiu <[email protected]> wrote:
>
> Things are hard enough handling subqueries, I think that having scopes for 
> correlateId is necessary. Each Correlate node (or subquery) should use a 
> fresh correlateId. These identifiers are generated by the compiler, and never 
> visible outside.
>
> Mihai
> ________________________________
> From: suibianwanwan33 <[email protected]>
> Sent: Wednesday, May 28, 2025 10:27 AM
> To: dev <[email protected]>
> Subject: [DISCUSS] Some questions related to subquery issues
>
> Hello,
>
> Recently, there have been many issues about SubQuery in the Jira. I believe 
> subqueries have always been a complex area in databases, and many related 
> issues exist across other database systems. I've recently been trying to fix 
> some of these, but encountered some questions during the process.
>
>
> 1. Is it valid to have multiple Correlates with the same correlateId in 
> RelNode?&nbsp;&nbsp;
>
>
> In SubQueryRemoveRule, there are many cases where we use the same variableSet 
> to build joins. This results in multiple Correlates with the same corId in 
> the RelNode.&nbsp;&nbsp;
> For example, take the case of FILTER_SUB_QUERY_TO_CORRELATE:
>
> SELECT *
> FROM emp e1
> WHERE e1.deptno IN (
> &nbsp; &nbsp; SELECT e3.deptno
> &nbsp; &nbsp; FROM emp e3
> &nbsp; &nbsp; WHERE e3.empno = e1.empno and e3.JOB = 'CLERK'
> )
> AND e1.deptno IN (
> &nbsp; &nbsp; SELECT e4.deptno
> &nbsp; &nbsp; FROM emp e4
> &nbsp; &nbsp; WHERE e4.empno = e1.empno and e4.ename = 'SMITH'
> );
>
>
>
> Execution plan before unnesting:
>
>
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7])
> &nbsp; LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7])
> &nbsp; &nbsp; LogicalFilter(condition=[=($7, $9)])
> &nbsp; &nbsp; &nbsp; LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
> &nbsp; &nbsp; &nbsp; &nbsp; LogicalFilter(condition=[=($7, $8)])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalCorrelate(correlation=[$cor0], 
> joinType=[inner], requiredColumns=[{0}])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalTableScan(table=[[scott, 
> EMP]])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalAggregate(group=[{0}])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalProject(DEPTNO=[$7])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
> LogicalFilter(condition=[AND(=($0, $cor0.EMPNO), =($2, 'CLERK'))])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
> LogicalTableScan(table=[[scott, EMP]])
> &nbsp; &nbsp; &nbsp; &nbsp; LogicalProject(DEPTNO=[$7])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalFilter(condition=[AND(=($0, 
> $cor0.EMPNO), =($1, 'SMITH'))])
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalTableScan(table=[[scott, 
> EMP]])
>
>
>
> I suspect a similar case might have caused an error during decorrelate.
>
>
> 2. Are there any relevant documents or papers on the design of Blackboard in 
> SqlToRel?&nbsp;&nbsp;
>
>
> Another part of the issue failed during SqlToRel, but I'm not very familiar 
> with this part of the code, and the logic here is quite complex for me. It 
> would be great if there were some references (I believe this would also be 
> useful for other developers working with Calcite).
>
>
>
> Thank you for the suggestion.
>
>
> Best regards,
>
>
> suibianwanwan

Reply via email to