[
https://issues.apache.org/jira/browse/CALCITE-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated CALCITE-7634:
------------------------------------
Labels: pull-request-available (was: )
> JoinExpandOrToUnionRule incorrectly expands OR branches with non-equi
> predicates referencing both join inputs
> -------------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-7634
> URL: https://issues.apache.org/jira/browse/CALCITE-7634
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.42.0
> Reporter: Zhen Chen
> Assignee: Zhen Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.43.0
>
>
> {{JoinExpandOrToUnionRule}} may incorrectly split an OR branch that contains
> both an equi-join predicate and a non-equi predicate referencing both join
> inputs.
> SQL:
> {code:java}
> select *
> from EMP as p1
> inner join EMP as p2
> on (p1.empno = p2.empno and p1.sal < p2.sal)
> or (p1.mgr = p2.mgr and p1.comm < p2.comm)
> or p1.deptno = p2.deptno {code}
> Before the fix, the rule expands the condition into 3 LogicalJoin branches:
> {code:java}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EMPNO0=[$9], ENAME0=[$10],
> JOB0=[$11], MGR0=[$12], HIREDATE0=[$13], SAL0=[$14], COMM0=[$15],
> DEPTNO0=[$16], SLACKER0=[$17])
> LogicalUnion(all=[true])
> LogicalJoin(condition=[AND(=($0, $9), <($5, $14))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalJoin(condition=[AND(=($3, $12), <($6, $15), OR(<>($0, $9), >=($5,
> $14)))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalJoin(condition=[AND(=($7, $16), OR(<>($0, $9), >=($5, $14)),
> OR(<>($3, $12), >=($6, $15)))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
> This is unsafe because predicates such as p1.sal < p2.sal and p1.comm <
> p2.comm reference columns from both join inputs and can evaluate to UNKNOWN
> under SQL three-valued logic. The generated NOT conditions are not equivalent
> to excluding previously matched branches in nullable cases.After the fix, the
> unsafe branches remain grouped, and only the safe equi-join branch is split
> out:
> {code:java}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EMPNO0=[$9], ENAME0=[$10],
> JOB0=[$11], MGR0=[$12], HIREDATE0=[$13], SAL0=[$14], COMM0=[$15],
> DEPTNO0=[$16], SLACKER0=[$17])
> LogicalUnion(all=[true])
> LogicalJoin(condition=[OR(AND(=($0, $9), <($5, $14)), AND(=($3, $12),
> <($6, $15)))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalJoin(condition=[AND(=($7, $16), OR(<>($0, $9), >=($5, $14)),
> OR(<>($3, $12), >=($6, $15)))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
> Root cause
> `RexInputRefCounter.visitInputRef` increments `leftFieldCount` instead of
> `leftInputRefCount`, so predicates referencing the left input may be
> miscounted.
>
>
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)