[ https://issues.apache.org/jira/browse/CALCITE-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mihai Budiu resolved CALCITE-7514.
----------------------------------
Fix Version/s: 1.42.0
Resolution: Fixed
Fixed in
https://github.com/apache/calcite/commit/0d4c0a2b2122cb6b19ebae4ea32f2236c3191d3e
Thank you for your contribution, [~sbroeder].
Thank you for the reviews, [~xuzifu666] and [~tamas.mate].
> MultiJoinOptimizeBushyRule throws AssertionError when a join condition
> references 3 or more factors
> ---------------------------------------------------------------------------------------------------
>
> Key: CALCITE-7514
> URL: https://issues.apache.org/jira/browse/CALCITE-7514
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Sean Broeder
> Assignee: Sean Broeder
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.42.0
>
>
> MultiJoinOptimizeBushyRule.onMatch builds a list of Edge objects from
> multiJoin.getJoinFilters() and assumes every edge connects exactly two
> factors. When a join condition references three or more factors — for
> example, a CASE expression that spans three tables —
> LoptMultiJoin.createEdge produces an edge whose factors.cardinality() is 3,
> causing an AssertionError in the edge comparator:
> java.lang.AssertionError: {0, 1, 2}
>   at MultiJoinOptimizeBushyRule$1.rowCountDiff(MultiJoinOptimizeBushyRule.java:147)
>   at MultiJoinOptimizeBushyRule$1.compare(MultiJoinOptimizeBushyRule.java:143)
>   at MultiJoinOptimizeBushyRule.minPos(MultiJoinOptimizeBushyRule.java:347)
>   at MultiJoinOptimizeBushyRule.chooseBestEdge(MultiJoinOptimizeBushyRule.java:329)
>   at MultiJoinOptimizeBushyRule.onMatch(MultiJoinOptimizeBushyRule.java:157)
> The class Javadoc already acknowledges this as a known gap:
> * TODO:
> * <li>Join conditions that touch 1 factor.
> * <li>Join conditions that touch 3 factors.
> This can be reproduced with the following query:
> SELECT e1.ename
> FROM emp e1, dept d, emp e2
> WHERE e1.deptno = d.deptno
> AND e2.deptno = d.deptno
> AND d.deptno = CASE WHEN e1.sal > 1000 THEN e2.empno ELSE e1.empno END
> With a pre-program that applies FILTER_INTO_JOIN (bottom-up) and
> JOIN_TO_MULTI_JOIN, the CASE condition spans all three factors and ends up
> in the MultiJoin's joinFilter. When MULTI_JOIN_OPTIMIZE_BUSHY then fires,
> the assertion in rowCountDiff fails.
> Expected behavior: The rule handles conditions that touch a number of
> factors other than two gracefully, producing a valid join tree.
> Actual behavior: AssertionError at rowCountDiff (line 147), chooseBestEdge
> (line 329), or the edge-rewrite loop (line 244), depending on which assertion
> is reached first.
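
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the code from the fixing commit and does not use Calcite classes. The names EdgePartitionSketch, Edge, binaryEdges and deferredEdges are hypothetical, and java.util.BitSet stands in for the bit set that holds an edge's factors. The factor ordinals assume the FROM-clause order of the reproduction query (e1 = 0, d = 1, e2 = 2). The sketch shows one possible way to separate edges that connect exactly two factors from those that do not, so that a comparator which assumes two factors never sees the others.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration only; not Calcite code and not the actual fix. */
public class EdgePartitionSketch {

  /** Simplified edge: the set of factor ordinals one join condition references. */
  public static final class Edge {
    public final String condition;
    public final BitSet factors;

    public Edge(String condition, int... factorOrdinals) {
      this.condition = condition;
      this.factors = new BitSet();
      for (int f : factorOrdinals) {
        this.factors.set(f);
      }
    }
  }

  /** Returns the edges that connect exactly two factors. */
  public static List<Edge> binaryEdges(List<Edge> edges) {
    List<Edge> result = new ArrayList<>();
    for (Edge e : edges) {
      if (e.factors.cardinality() == 2) {
        result.add(e);
      }
    }
    return result;
  }

  /** Returns the edges that touch one factor, or three or more factors. */
  public static List<Edge> deferredEdges(List<Edge> edges) {
    List<Edge> result = new ArrayList<>();
    for (Edge e : edges) {
      if (e.factors.cardinality() != 2) {
        result.add(e);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Edge> edges = new ArrayList<>();
    edges.add(new Edge("e1.deptno = d.deptno", 0, 1));
    edges.add(new Edge("e2.deptno = d.deptno", 1, 2));
    // The CASE condition references e1, d and e2, hence factors {0, 1, 2}.
    edges.add(new Edge(
        "d.deptno = CASE WHEN e1.sal > 1000 THEN e2.empno ELSE e1.empno END",
        0, 1, 2));

    System.out.println(binaryEdges(edges).size() + " binary, "
        + deferredEdges(edges).size() + " deferred");
  }
}
```

In a sketch like this, the deferred conditions would still have to be applied somewhere, for example as a filter once all the factors they reference have been joined. How the actual fix handles them is best read from the commit linked above.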
--
This message was sent by Atlassian Jira
(v8.20.10#820010)