Re: [DISCUSS] Some questions related to subquery issues

Julian Hyde Sat, 31 May 2025 08:20:17 -0700

I added some remarks to the jira. 

I am worried that UUIDs will make planning non-repeatable. Currently if you 
make equivalent plans via different sequences of rule-firings they will be 
identical. That is really important to Calcite’s ability to “reason locally”.


I would spend more effort fixing the frying pan, rather than jumping into the 
fire.   

> On May 29, 2025, at 12:11 AM, Stamatis Zampetakis <zabe...@gmail.com> wrote:
> 
> Conceptually, it makes sense to have a unique variable for each
> Correlate node but as you observed there are cases where this does not
> hold. In some cases it causes problems and in some others it doesn't
> which makes a holistic fix somewhat tricky. I think it's worth trying
> to generate unique ids for each correlate although I suspect that some
> things may be broken along the way and it may take some time before
> merging this proposal.
> 
> Best,
> Stamatis
> 
>> On Wed, May 28, 2025 at 8:20 PM Mihai Budiu <mbu...@gmail.com> wrote:
>> 
>> Things are hard enough handling subqueries, I think that having scopes for 
>> correlateId is necessary. Each Correlate node (or subquery) should use a 
>> fresh correlateId. These identifiers are generated by the compiler, and 
>> never visible outside.
>> 
>> Mihai
>> ________________________________
>> From: suibianwanwan33 <suibianwanwa...@foxmail.com>
>> Sent: Wednesday, May 28, 2025 10:27 AM
>> To: dev <dev@calcite.apache.org>
>> Subject: [DISCUSS] Some questions related to subquery issues
>> 
>> Hello,
>> 
>> Recently, there have been many issues about SubQuery in the Jira. I believe 
>> subqueries have always been a complex area in databases, and many related 
>> issues exist across other database systems. I've recently been trying to fix 
>> some of these, but encountered some questions during the process.
>> 
>> 
>> 1. Is it valid to have multiple Correlates with the same correlateId in 
>> RelNode?&nbsp;&nbsp;
>> 
>> 
>> In SubQueryRemoveRule, there are many cases where we use the same 
>> variableSet to build joins. This results in multiple Correlates with the 
>> same corId in the RelNode.&nbsp;&nbsp;
>> For example, take the case of FILTER_SUB_QUERY_TO_CORRELATE:
>> 
>> SELECT *
>> FROM emp e1
>> WHERE e1.deptno IN (
>> &nbsp; &nbsp; SELECT e3.deptno
>> &nbsp; &nbsp; FROM emp e3
>> &nbsp; &nbsp; WHERE e3.empno = e1.empno and e3.JOB = 'CLERK'
>> )
>> AND e1.deptno IN (
>> &nbsp; &nbsp; SELECT e4.deptno
>> &nbsp; &nbsp; FROM emp e4
>> &nbsp; &nbsp; WHERE e4.empno = e1.empno and e4.ename = 'SMITH'
>> );
>> 
>> 
>> 
>> Execution plan before unnesting:
>> 
>> 
>> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
>> SAL=[$5], COMM=[$6], DEPTNO=[$7])
>> &nbsp; LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
>> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7])
>> &nbsp; &nbsp; LogicalFilter(condition=[=($7, $9)])
>> &nbsp; &nbsp; &nbsp; LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
>> requiredColumns=[{0}])
>> &nbsp; &nbsp; &nbsp; &nbsp; LogicalFilter(condition=[=($7, $8)])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalCorrelate(correlation=[$cor0], 
>> joinType=[inner], requiredColumns=[{0}])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalTableScan(table=[[scott, 
>> EMP]])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalAggregate(group=[{0}])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalProject(DEPTNO=[$7])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
>> LogicalFilter(condition=[AND(=($0, $cor0.EMPNO), =($2, 'CLERK'))])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
>> LogicalTableScan(table=[[scott, EMP]])
>> &nbsp; &nbsp; &nbsp; &nbsp; LogicalProject(DEPTNO=[$7])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalFilter(condition=[AND(=($0, 
>> $cor0.EMPNO), =($1, 'SMITH'))])
>> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; LogicalTableScan(table=[[scott, 
>> EMP]])
>> 
>> 
>> 
>> I suspect a similar case might have caused an error during decorrelate.
>> 
>> 
>> 2. Are there any relevant documents or papers on the design of Blackboard in 
>> SqlToRel?&nbsp;&nbsp;
>> 
>> 
>> Another part of the issue failed during SqlToRel, but I'm not very familiar 
>> with this part of the code, and the logic here is quite complex for me. It 
>> would be great if there were some references (I believe this would also be 
>> useful for other developers working with Calcite).
>> 
>> 
>> 
>> Thank you for the suggestion.
>> 
>> 
>> Best regards,
>> 
>> 
>> suibianwanwan

Re: [DISCUSS] Some questions related to subquery issues

Reply via email to