I added some remarks to the Jira case. I am worried that UUIDs will make planning non-repeatable. Currently, if you make equivalent plans via different sequences of rule firings, they will be identical. That is really important to Calcite’s ability to “reason locally”.
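To make the concern concrete, here is a toy sketch in Python. It is not Calcite's API: the digest format and the function names are made up for illustration, and the only assumption is that equivalent plans are recognized by comparing a textual digest. Two builds of the same plan collapse to one entry when the correlation id is deterministic, but stay distinct when each build draws a random UUID.

```python
import uuid


def correlate_digest(cor_id):
    """Toy digest of a Correlate whose right side refers to cor_id (hypothetical format)."""
    return (
        f"Correlate(correlation=[{cor_id}], "
        f"Scan(EMP), Filter(=($0, {cor_id}.EMPNO)))"
    )


def plan_with_fixed_id():
    # Deterministic id: every build of this plan yields the same digest.
    return correlate_digest("$cor0")


def plan_with_uuid():
    # Random id: every build of this plan yields a different digest.
    return correlate_digest(f"$cor_{uuid.uuid4().hex}")


# A set keyed by digest stands in for a planner's registry of known plans.
fixed_entries = len({plan_with_fixed_id(), plan_with_fixed_id()})
uuid_entries = len({plan_with_uuid(), plan_with_uuid()})

print(fixed_entries, uuid_entries)
```

In this toy model the two UUID-based plans are structurally the same, yet nothing keyed on the digest can tell.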
I would spend more effort fixing the frying pan, rather than jumping into the fire.

> On May 29, 2025, at 12:11 AM, Stamatis Zampetakis <zabe...@gmail.com> wrote:
>
> Conceptually, it makes sense to have a unique variable for each
> Correlate node but as you observed there are cases where this does not
> hold. In some cases it causes problems and in some others it doesn't
> which makes a holistic fix somewhat tricky. I think it's worth trying
> to generate unique ids for each correlate although I suspect that some
> things may be broken along the way and it may take some time before
> merging this proposal.
>
> Best,
> Stamatis
>
>> On Wed, May 28, 2025 at 8:20 PM Mihai Budiu <mbu...@gmail.com> wrote:
>>
>> Things are hard enough handling subqueries, I think that having scopes for
>> correlateId is necessary. Each Correlate node (or subquery) should use a
>> fresh correlateId. These identifiers are generated by the compiler, and
>> never visible outside.
>>
>> Mihai
>> ________________________________
>> From: suibianwanwan33 <suibianwanwa...@foxmail.com>
>> Sent: Wednesday, May 28, 2025 10:27 AM
>> To: dev <dev@calcite.apache.org>
>> Subject: [DISCUSS] Some questions related to subquery issues
>>
>> Hello,
>>
>> Recently, there have been many issues about SubQuery in the Jira. I believe
>> subqueries have always been a complex area in databases, and many related
>> issues exist across other database systems. I've recently been trying to fix
>> some of these, but encountered some questions during the process.
>>
>> 1. Is it valid to have multiple Correlates with the same correlateId in
>> RelNode?
>>
>> In SubQueryRemoveRule, there are many cases where we use the same
>> variableSet to build joins. This results in multiple Correlates with the
>> same corId in the RelNode.
>> For example, take the case of FILTER_SUB_QUERY_TO_CORRELATE:
>>
>> SELECT *
>> FROM emp e1
>> WHERE e1.deptno IN (
>>     SELECT e3.deptno
>>     FROM emp e3
>>     WHERE e3.empno = e1.empno and e3.JOB = 'CLERK'
>> )
>> AND e1.deptno IN (
>>     SELECT e4.deptno
>>     FROM emp e4
>>     WHERE e4.empno = e1.empno and e4.ename = 'SMITH'
>> );
>>
>> Execution plan before unnesting:
>>
>> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7])
>>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7])
>>     LogicalFilter(condition=[=($7, $9)])
>>       LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
>>         LogicalFilter(condition=[=($7, $8)])
>>           LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
>>             LogicalTableScan(table=[[scott, EMP]])
>>             LogicalAggregate(group=[{0}])
>>               LogicalProject(DEPTNO=[$7])
>>                 LogicalFilter(condition=[AND(=($0, $cor0.EMPNO), =($2, 'CLERK'))])
>>                   LogicalTableScan(table=[[scott, EMP]])
>>         LogicalProject(DEPTNO=[$7])
>>           LogicalFilter(condition=[AND(=($0, $cor0.EMPNO), =($1, 'SMITH'))])
>>             LogicalTableScan(table=[[scott, EMP]])
>>
>> I suspect a similar case might have caused an error during decorrelate.
>>
>> 2. Are there any relevant documents or papers on the design of Blackboard in
>> SqlToRel?
>>
>> Another part of the issue failed during SqlToRel, but I'm not very familiar
>> with this part of the code, and the logic here is quite complex for me. It
>> would be great if there were some references (I believe this would also be
>> useful for other developers working with Calcite).
>>
>> Thank you for the suggestion.
>>
>> Best regards,
>>
>> suibianwanwan