[ https://issues.apache.org/jira/browse/CALCITE-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236176#comment-17236176 ]
Ruben Q L edited comment on CALCITE-4414 at 11/20/20, 3:26 PM: --------------------------------------------------------------- Test: {code} @Test void test() { final RelBuilder builder = RelBuilder.create(RelBuilderTest.config().build()); final RelNode relNode = builder .scan("EMP") .project( builder.field("DEPTNO")) .scan("EMP") .project( builder.field("DEPTNO")) .union(true) .projectPlus(builder.field("DEPTNO")) .filter( builder.equals( builder.field(0), builder.literal(0))) .build(); // Program to convert Project + Filter into a single Calc final HepProgram program = new HepProgramBuilder() .addRuleInstance(CoreRules.FILTER_TO_CALC) .addRuleInstance(CoreRules.PROJECT_TO_CALC) .addRuleInstance(CoreRules.CALC_MERGE) .build(); final HepPlanner hepPlanner = new HepPlanner(program); hepPlanner.setRoot(relNode); RelNode output = hepPlanner.findBestExp(); // Add filter on the extra field generated by projectPlus (now a Calc after hepPlanner) output = builder .push(output) .filter( builder.equals( builder.field(1), builder.literal(0))) .build(); // Should not fail output.estimateRowCount(output.getCluster().getMetadataQuery()); } {code} Fails with: {noformat} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4452) at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4372) at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:158) at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:110) at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:33) at org.apache.calcite.rex.RexCall.accept(RexCall.java:175) at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:80) at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426) at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:131) at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426) at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:767) at org.apache.calcite.rel.core.Filter.estimateRowCount(Filter.java:139) at org.apache.calcite.test.RelMdSelectivityTest.test(RelMdSelectivityTest.java:72) {noformat} The plan that we are dealing with at the end of the test: {noformat} LogicalFilter(condition=[=($1, 0)]) LogicalCalc(expr#0=[{inputs}], expr#1=[0], expr#2=[=($t0, $t1)], DEPTNO=[$t0], DEPTNO0=[$t0], $condition=[$t2]) LogicalUnion(all=[true]) LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7]) LogicalTableScan(table=[[scott, EMP]]) LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7]) LogicalTableScan(table=[[scott, EMP]]) {noformat} was (Author: rubenql): Test: {code} @Test void test() { final RelBuilder builder = RelBuilder.create(RelBuilderTest.config().build()); final RelNode relNode = builder .scan("EMP") .project( builder.field("DEPTNO")) .scan("EMP") .project( builder.field("DEPTNO")) .union(true) .projectPlus(builder.field("DEPTNO")) .filter( builder.equals( builder.field(0), builder.literal(0))) .build(); // Program to convert Project + Filter into a single Calc final HepProgram program = new HepProgramBuilder() .addRuleInstance(CoreRules.FILTER_TO_CALC) .addRuleInstance(CoreRules.PROJECT_TO_CALC) .addRuleInstance(CoreRules.CALC_MERGE) .build(); final HepPlanner hepPlanner = new HepPlanner(program); hepPlanner.setRoot(relNode); RelNode output = hepPlanner.findBestExp(); // Add filter on the extra field generated by projectPlus (now a Calc after hepPlanner) output = builder .push(output) .filter( builder.equals( builder.field(1), builder.literal(0))) .build(); output.estimateRowCount(output.getCluster().getMetadataQuery()); } {code} Fails with: {noformat} java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4452) at org.apache.calcite.plan.RelOptUtil$RexInputConverter.visitInputRef(RelOptUtil.java:4372) at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:158) at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:110) at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:33) at org.apache.calcite.rex.RexCall.accept(RexCall.java:175) at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:80) at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426) at org.apache.calcite.rel.metadata.RelMdSelectivity.getSelectivity(RelMdSelectivity.java:131) at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426) at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:767) at org.apache.calcite.rel.core.Filter.estimateRowCount(Filter.java:139) at org.apache.calcite.test.RelMdSelectivityTest.test(RelMdSelectivityTest.java:72) {noformat} The plan that we are dealing with at the end of the test: {noformat} LogicalFilter(condition=[=($1, 0)]) LogicalCalc(expr#0=[{inputs}], expr#1=[0], expr#2=[=($t0, $t1)], DEPTNO=[$t0], DEPTNO0=[$t0], $condition=[$t2]) LogicalUnion(all=[true]) LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7]) LogicalTableScan(table=[[scott, EMP]]) LogicalCalc(expr#0..7=[{inputs}], DEPTNO=[$t7]) LogicalTableScan(table=[[scott, EMP]]) {noformat} > RelMdSelectivity#getSelectivity for Calc can propagate a predicate with wrong > references > ---------------------------------------------------------------------------------------- > > Key: CALCITE-4414 > URL: https://issues.apache.org/jira/browse/CALCITE-4414 > Project: Calcite > Issue Type: Bug > Components: core > Reporter: Ruben Q L > Assignee: Ruben Q L > Priority: Major > Labels: pull-request-available > Fix For: 1.27.0 > > Time Spent: 10m > Remaining Estimate: 0h > > {{RelMdSelectivity#getSelectivity(Calc rel, RelMetadataQuery mq, RexNode > predicate)}} method: > {code} > public Double getSelectivity(Calc rel, RelMetadataQuery mq, RexNode > predicate) { > final RexProgram rexProgram = rel.getProgram(); > final RexLocalRef programCondition = rexProgram.getCondition(); > if (programCondition == null) { > return getSelectivity(rel.getInput(), mq, predicate); // [2] > } else { > // [1] > return mq.getSelectivity(rel.getInput(), > RelMdUtil.minusPreds( > rel.getCluster().getRexBuilder(), > predicate, > rexProgram.expandLocalRef(programCondition))); > } > } > {code} > currently passes down the predicate to its input [1] without considering any > possible translation, since the predicate might include expressions generated > by the Calc's projections; hence when the Calc's input analyzes the > predicate, it can end up trying to access fields that do not exist on its > rowType. > This can lead to unforeseeable consequences, like the test attached to the > first comment, where after {{RelMdSelectivity#getSelectivity(Calc)}} we reach > {{RelMdSelectivity#getSelectivity(Union)}} and this method ends up in an > {{ArrayIndexOutOfBoundsException}} because it tries to access a field ($1) > that does not exists on its rowType (which only has $0). This $1 is actually > projected by the Calc which is on top of the Union. > Note I: in a similar situation, RelM uses{{ RelOptUtil.pushPastProject}} > (which "Converts an expression that is based on the output fields of a > Project to an equivalent expression on the Project's input fields.") to > convert the predicate before passing it to the Project's input. > Note II: in the code snipped above that in our test example the issue only > happens in line [1], and not in [2] because the "if" block calls > {{getSelectivity}} instead of {{mq.getSelectivity}}, although I find this a > bit questionable and maybe {{mq.getSelectivity}} should be called here as > well. -- This message was sent by Atlassian Jira (v8.3.4#803005)