In our implementation we always set "isExpand" to true in the SqlToRelConverter, so the nodes that introduce correlate variables are always explicit as Correlate Rel nodes. I don't have experience with the non-expanded representation.
Mihai ________________________________ From: Zikai Wang <[email protected]> Sent: Thursday, May 29, 2025 8:24 PM To: [email protected] <[email protected]> Subject: Question about correlated variable handling in scalar subqueries Hello, I'm working with Apache Calcite and using variable sets in each relational node to track correlations between outer attributes in correlated subqueries and their corresponding outer nodes. However, I'm encountering an issue where I cannot find the expected correlated variables in the variable sets when dealing with scalar subqueries. Example Query: SELECT (SELECT name FROM t1 WHERE a = (SELECT a FROM t2 WHERE b = t.b)) AS name FROM t3 t WHERE t.b = 'xxxxx'; Generated Logical Plan: rel#22:LogicalProject.NONE.[](input=LogicalFilter#15,exprs=[$SCALAR_QUERY({ LogicalProject(NAME=[$0]) LogicalFilter(condition=[=($1, $SCALAR_QUERY({ LogicalProject(a=[$0]) LogicalFilter(condition=[=($1, $cor0.contno)]) LogicalTableScan(table=[[t2]]) }))]) LogicalTableScan(table=[[t1]]) })]) Expected vs. Actual Behavior: Based on my understanding, the correlation variable $cor0 should be stored in the variable set of the outermost LogicalProject node, indicating that cor0 correlates to the outer table t. However, when I examine the variable sets, they are empty for all nodes in this plan. Question: Could you help clarify whether this is: 1. A potential bug in how correlated variables are populated in variable sets for scalar subqueries, or 2. Expected behavior where Calcite uses special logic for handling correlations in scalar subqueries that differs from the standard variable set mechanism? Any guidance on the correct way to track these correlations in scalar subquery scenarios would be greatly appreciated. Best regards, Zikai Wang
