Things are hard enough handling subqueries, I think that having scopes for 
correlateId is necessary. Each Correlate node (or subquery) should use a fresh 
correlateId. These identifiers are generated by the compiler, and never visible 
outside.

Mihai
________________________________
From: suibianwanwan33 <[email protected]>
Sent: Wednesday, May 28, 2025 10:27 AM
To: dev <[email protected]>
Subject: [DISCUSS] Some questions related to subquery issues

Hello,

Recently, there have been many issues about SubQuery in the Jira. I believe 
subqueries have always been a complex area in databases, and many related 
issues exist across other database systems. I've recently been trying to fix 
some of these, but encountered some questions during the process.


1. Is it valid to have multiple Correlates with the same correlateId in 
RelNode?&nbsp;&nbsp;


In SubQueryRemoveRule, there are many cases where we use the same variableSet 
to build joins. This results in multiple Correlates with the same corId in the 
RelNode.&nbsp;&nbsp;
For example, take the case of FILTER_SUB_QUERY_TO_CORRELATE:

SELECT *
FROM emp e1
WHERE e1.deptno IN (
&nbsp; &nbsp; SELECT e3.deptno
&nbsp; &nbsp; FROM emp e3
&nbsp; &nbsp; WHERE e3.empno = e1.empno and e3.JOB = 'CLERK'
)
AND e1.deptno IN (
&nbsp; &nbsp; SELECT e4.deptno
&nbsp; &nbsp; FROM emp e4
&nbsp; &nbsp; WHERE e4.empno = e1.empno and e4.ename = 'SMITH'
);



Execution plan before unnesting:


LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7])
&nbsp; LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7])
&nbsp; &nbsp; LogicalFilter(condition=[=($7, $9)])
&nbsp; &nbsp; &nbsp; LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
requiredColumns=[{0}])
&nbsp; &nbsp; &nbsp; &nbsp; LogicalFilter(condition=[=($7, $8)])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalCorrelate(correlation=[$cor0], 
joinType=[inner], requiredColumns=[{0}])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalTableScan(table=[[scott, EMP]])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalAggregate(group=[{0}])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalProject(DEPTNO=[$7])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
LogicalFilter(condition=[AND(=($0, $cor0.EMPNO), =($2, 'CLERK'))])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
LogicalTableScan(table=[[scott, EMP]])
&nbsp; &nbsp; &nbsp; &nbsp; LogicalProject(DEPTNO=[$7])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalFilter(condition=[AND(=($0, 
$cor0.EMPNO), =($1, 'SMITH'))])
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalTableScan(table=[[scott, EMP]])



I suspect a similar case might have caused an error during decorrelate.


2. Are there any relevant documents or papers on the design of Blackboard in 
SqlToRel?&nbsp;&nbsp;


Another part of the issue failed during SqlToRel, but I'm not very familiar 
with this part of the code, and the logic here is quite complex for me. It 
would be great if there were some references (I believe this would also be 
useful for other developers working with Calcite).



Thank you for the suggestion.


Best regards,


suibianwanwan

Reply via email to