Hi folks,

I have a question about org.apache.calcite.rel.RelNode#getVariablesSet. The Javadoc says it returns the variables that are set by the current node:

/**
 * Returns the variables that are set in this relational
 * expression but also used and therefore not available to parents of this
 * relational expression.
 *
 * <p>Note: only {@link org.apache.calcite.rel.core.Correlate} should set
 * variables.
 *
 * @return Names of variables which are set in this relational
 * expression
 */
Set<CorrelationId> getVariablesSet();

But I've got a plan where a node returns all variables used by its child nodes, regardless of whether a variable is set by the current node or by a parent node.

The original query is:

SELECT *
FROM t1 as "outer"
WHERE a > (
  SELECT COUNT(*)
  FROM t1 as "inner"
  WHERE "inner".a IN (
    SELECT *
    FROM table(system_range("inner".a, "inner".b + "outer".b))
  )
)

After SQL-to-Rel translation I get the following plan:

LogicalProject(A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
  LogicalFilter(condition=[>($2, $SCALAR_QUERY({
LogicalAggregate(group=[{}], COUNT(*)=[COUNT()])
  LogicalFilter(condition=[IN($2, {
LogicalProject(X=[$0])
  LogicalTableFunctionScan(invocation=[SYSTEM_RANGE($cor0.A, +($cor0.B, $cor2.B))], rowType=[RecordType(BIGINT X)])
})], variablesSet=[[$cor0]])
    LogicalTableScan(table=[[PUBLIC, T1]])
}))], variablesSet=[[$cor2]])
    LogicalTableScan(table=[[PUBLIC, T1]])

Every LogicalFilter introduces its own correlation variable, and everything is OK so far.

But then I apply SubQueryRemoveRule, and the new plan looks like this:

LogicalProject(A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
  LogicalProject(_KEY=[$0], _VAL=[$1], A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
    LogicalFilter(condition=[>($2, $7)])
      LogicalCorrelate(correlation=[$cor2], joinType=[left], requiredColumns=[{3}])
        LogicalTableScan(table=[[PUBLIC, T1]])
        LogicalAggregate(group=[{}], COUNT(*)=[COUNT()])
          LogicalProject(_KEY=[$0], _VAL=[$1], A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
            LogicalJoin(condition=[=($2, $7)], joinType=[inner])
              LogicalTableScan(table=[[PUBLIC, T1]])
              LogicalAggregate(group=[{0}])
                LogicalProject(X=[$0])
                  LogicalTableFunctionScan(invocation=[SYSTEM_RANGE($cor0.A, +($cor0.B, $cor2.B))], rowType=[RecordType(BIGINT X)])

At this point LogicalJoin.getVariablesSet() returns both "cor0" and "cor2", which doesn't seem right.

Is this behaviour expected, or is it a bug?

-- 
Regards,
Konstantin Orlov