[jira] [Updated] (CALCITE-5952) SemiJoinJoinTransposeRule should check if JoinType supports pushing predicates into its inputs
[ https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruben Q L updated CALCITE-5952: --- Fix Version/s: 1.36.0 > SemiJoinJoinTransposeRule should check if JoinType supports pushing > predicates into its inputs > -- > > Key: CALCITE-5952 > URL: https://issues.apache.org/jira/browse/CALCITE-5952 > Project: Calcite > Issue Type: Bug >Affects Versions: 1.35.0 >Reporter: Leonid Chistov >Assignee: Leonid Chistov >Priority: Major > Labels: pull-request-available > Fix For: 1.36.0 > > > The following test will fail if added to RelOptRulesTest.java > {code:java} > @Test void testCanNotPushSemiJoinToRightJoinBranch() { > final Function relFn = b -> b > .scan("EMP") > .scan("DEPT") > .join(JoinRelType.LEFT, > b.equals( > b.field(2, 0, "DEPTNO"), > b.field(2, 1, "DEPTNO")) > ) > .scan("BONUS") > // semi join only relates to RHS fields of left join > .semiJoin(b.equals( > b.field(2, 0, "DNAME"), > b.field(2, 1, "JOB"))) > .build(); > relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged(); > } {code} > Produced plan will look like: > {code:java} > LogicalJoin(condition=[=($7, $8)], joinType=[left]) > LogicalTableScan(table=[[scott, EMP]]) > LogicalJoin(condition=[=($1, $4)], joinType=[semi]) > LogicalTableScan(table=[[scott, DEPT]]) > LogicalTableScan(table=[[scott, BONUS]]) {code} > Which is different from the original plan: > {code:java} > LogicalJoin(condition=[=($9, $12)], joinType=[semi]) > LogicalJoin(condition=[=($7, $8)], joinType=[left]) > LogicalTableScan(table=[[scott, EMP]]) > LogicalTableScan(table=[[scott, DEPT]]) > LogicalTableScan(table=[[scott, BONUS]]) {code} > This is not correct - in general case it is not correct to push semi-join to > right side of left-join. > The reason is the following: > Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows > will have *nulls* for *DEPT* columns in the result of left-join and they will > be rejected by the top semi-join. > But if we push semi-join to RHS of left-join, we are going to see rows from > *EMP* with *nulls* on the *DEPT* side in the final result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5952) SemiJoinJoinTransposeRule should check if JoinType supports pushing predicates into its inputs
[ https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonid Chistov updated CALCITE-5952: Summary: SemiJoinJoinTransposeRule should check if JoinType supports pushing predicates into its inputs (was: Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule) > SemiJoinJoinTransposeRule should check if JoinType supports pushing > predicates into its inputs > -- > > Key: CALCITE-5952 > URL: https://issues.apache.org/jira/browse/CALCITE-5952 > Project: Calcite > Issue Type: Bug >Affects Versions: 1.35.0 >Reporter: Leonid Chistov >Assignee: Leonid Chistov >Priority: Major > Labels: pull-request-available > > The following test will fail if added to RelOptRulesTest.java > {code:java} > @Test void testCanNotPushSemiJoinToRightJoinBranch() { > final Function relFn = b -> b > .scan("EMP") > .scan("DEPT") > .join(JoinRelType.LEFT, > b.equals( > b.field(2, 0, "DEPTNO"), > b.field(2, 1, "DEPTNO")) > ) > .scan("BONUS") > // semi join only relates to RHS fields of left join > .semiJoin(b.equals( > b.field(2, 0, "DNAME"), > b.field(2, 1, "JOB"))) > .build(); > relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged(); > } {code} > Produced plan will look like: > {code:java} > LogicalJoin(condition=[=($7, $8)], joinType=[left]) > LogicalTableScan(table=[[scott, EMP]]) > LogicalJoin(condition=[=($1, $4)], joinType=[semi]) > LogicalTableScan(table=[[scott, DEPT]]) > LogicalTableScan(table=[[scott, BONUS]]) {code} > Which is different from the original plan: > {code:java} > LogicalJoin(condition=[=($9, $12)], joinType=[semi]) > LogicalJoin(condition=[=($7, $8)], joinType=[left]) > LogicalTableScan(table=[[scott, EMP]]) > LogicalTableScan(table=[[scott, DEPT]]) > LogicalTableScan(table=[[scott, BONUS]]) {code} > This is not correct - in general case it is not correct to push semi-join to > right side of left-join. > The reason is the following: > Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows > will have *nulls* for *DEPT* columns in the result of left-join and they will > be rejected by the top semi-join. > But if we push semi-join to RHS of left-join, we are going to see rows from > *EMP* with *nulls* on the *DEPT* side in the final result. -- This message was sent by Atlassian Jira (v8.20.10#820010)