[ https://issues.apache.org/jira/browse/CALCITE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis resolved CALCITE-4574.
------------------------------------------
    Resolution: Fixed

Fixed in [8c2228eaf8ccc05ae58778276e760092557f78cc|https://github.com/apache/calcite/commit/8c2228eaf8ccc05ae58778276e760092557f78cc]. Thanks for the review [~jamesstarr], [~rubenql], [~julianhyde]!

> Wrong/Invalid plans when using RelBuilder#join with correlations
> ----------------------------------------------------------------
>
>                 Key: CALCITE-4574
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4574
>             Project: Calcite
>          Issue Type: Bug
>   Affects Versions: 1.26.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.27.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> RelBuilder#join produces wrong/invalid relational expressions when correlated
> variables are passed as a parameter along with different join types and
> non-trivial (not always true) conditions.
> *Wrong plans* already exist in the code base, where the {{requiredColumns}}
> attribute in {{LogicalCorrelate}} is missing some columns.
> Consider for instance the middle plan in
> {{RelOptRulesTest#testSelectNotInCorrelated}}:
> {noformat}
> LogicalProject(SAL=[$5], EXPR$1=[IS NULL($10)])
>   LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{2}]) <-- PROBLEM
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>     LogicalFilter(condition=[=($cor0.EMPNO, $0)]) <-- $cor0.EMPNO refers to column 0 in EMP relation
>       LogicalProject(DEPTNO=[$0], i=[true])
>         LogicalFilter(condition=[=($cor0.JOB, $1)]) <-- $cor0.JOB refers to column 2 in EMP relation
>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> The {{EMPNO}} column (index 0), which is referenced through the correlation in
> the right input, is not present in the {{requiredColumns}} attribute.
> *Invalid plans* are created when the join type is SEMI or ANTI and the join
> condition uses columns from the right side.
> Currently, the join condition is added after the {{Correlate}} and columns
> from right side no longer exist thus the filter does not reference valid inputs.
> If we are lucky we will get an {{AssertionError}} when constructing the
> {{Filter}} operator:
> {noformat}
> RexInputRef index 8 out of range 0..7
> java.lang.AssertionError: RexInputRef index 8 out of range 0..7
> 	at org.apache.calcite.util.Litmus$1.fail(Litmus.java:32)
> 	at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125)
> 	at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:61)
> 	at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:114)
> 	at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:144)
> 	at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:61)
> 	at org.apache.calcite.rex.RexCall.accept(RexCall.java:189)
> 	at org.apache.calcite.rel.core.Filter.isValid(Filter.java:127)
> 	at org.apache.calcite.rel.logical.LogicalFilter.<init>(LogicalFilter.java:72)
> 	at org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:116)
> 	at org.apache.calcite.rel.core.RelFactories$FilterFactoryImpl.createFilter(RelFactories.java:345)
> 	at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1349)
> 	at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1307)
> 	at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:2407)
> {noformat}
> otherwise (assertions disabled) we will end up with an invalid plan.
> {code:java}
> RelNode root = builder
>     .scan("EMP")
>     .variable(v)
>     .scan("DEPT")
>     .join(type,
>         builder.equals(
>             builder.field(2, 0, "DEPTNO"),
>             builder.field(2, 1, "DEPTNO")),
>         ImmutableSet.of(v.get().id))
>     .build();
> {code}
> +Actual plan+
> {noformat}
> LogicalFilter(condition=[=($7, $8)]) <- PROBLEM I
>   LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}]) <- PROBLEM II
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
> +Expected plan+
> {noformat}
> LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{7}])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalFilter(condition=[=($cor0.DEPTNO, $0)])
>     LogicalTableScan(table=[[scott, DEPT]])
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
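
As a rough illustration of the {{requiredColumns}} problem described in the issue, the sketch below collects every left-input column that the right side reaches through a correlation variable. It deliberately does not use Calcite: the class RequiredColumnsSketch and its Expr, InputRef, CorrelField and Call types are hypothetical stand-ins invented for this example, and the code is not the fix that was committed. It only shows the invariant the description relies on, namely that every column accessed via $cor0 on the right must appear in requiredColumns.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class RequiredColumnsSketch {
  /** Hypothetical stand-in for a row expression; this is not Calcite's RexNode. */
  public interface Expr {}

  /** Reference to a field of the current input, such as $0. */
  public static final class InputRef implements Expr {
    final int index;
    public InputRef(int index) { this.index = index; }
  }

  /** Access to a left-input field through a correlation variable, such as $cor0.EMPNO. */
  public static final class CorrelField implements Expr {
    final String corId;
    final int leftIndex;
    public CorrelField(String corId, int leftIndex) {
      this.corId = corId;
      this.leftIndex = leftIndex;
    }
  }

  /** Operator call with operands, such as =(a, b). */
  public static final class Call implements Expr {
    final String op;
    final List<Expr> operands;
    public Call(String op, Expr... operands) {
      this.op = op;
      this.operands = Arrays.asList(operands);
    }
  }

  /** Returns the left-input columns that the given right-side expressions use via corId. */
  public static BitSet requiredColumns(String corId, List<Expr> rightSideExprs) {
    BitSet required = new BitSet();
    for (Expr e : rightSideExprs) {
      collect(corId, e, required);
    }
    return required;
  }

  // Walks the expression tree and records each correlated field access for corId.
  private static void collect(String corId, Expr e, BitSet out) {
    if (e instanceof CorrelField) {
      CorrelField f = (CorrelField) e;
      if (f.corId.equals(corId)) {
        out.set(f.leftIndex);
      }
    } else if (e instanceof Call) {
      for (Expr operand : ((Call) e).operands) {
        collect(corId, operand, out);
      }
    }
    // A plain InputRef refers to the right input itself and adds nothing.
  }

  public static void main(String[] args) {
    // The two right-side filters of the testSelectNotInCorrelated plan:
    // =($cor0.EMPNO, $0) with EMPNO at index 0, and =($cor0.JOB, $1) with JOB at index 2.
    List<Expr> conditions = Arrays.<Expr>asList(
        new Call("=", new CorrelField("$cor0", 0), new InputRef(0)),
        new Call("=", new CorrelField("$cor0", 2), new InputRef(1)));
    System.out.println(requiredColumns("$cor0", conditions)); // prints {0, 2}
  }
}
```

For the first plan quoted in the description this yields {0, 2}, whereas the plan in the code base carried only {2}.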