Zhen Chen created CALCITE-6930:
----------------------------------

             Summary: Implementing JoinConditionOrExpansionRule
                 Key: CALCITE-6930
                 URL: https://issues.apache.org/jira/browse/CALCITE-6930
             Project: Calcite
          Issue Type: New Feature
            Reporter: Zhen Chen
            Assignee: Zhen Chen


JoinConditionOrExpansionRule transforms a join with OR conditions into a UNION 
ALL of multiple joins. 

For example, the SQL:
SELECT * FROM emp JOIN dept 
ON emp.deptno = dept.deptno OR emp.deptno = dept.mgr
Will be transformed into:
SELECT * FROM emp JOIN dept ON emp.deptno = dept.deptno
UNION ALL 
SELECT * FROM emp JOIN dept ON emp.deptno = dept.mgr
The original plan:
LogicalJoin(condition=[OR(=(deptno, deptno), =(deptno, mgr))])
  LogicalTableScan(table=emp)
  LogicalTableScan(table=dept)
Is transformed to:
LogicalUnion(all=[true])
  LogicalJoin(condition=[=(deptno, deptno)])
    LogicalTableScan(table=emp)
    LogicalTableScan(table=dept)
  LogicalJoin(condition=[=(deptno, mgr)])
    LogicalTableScan(table=emp)
    LogicalTableScan(table=dept)
Benefits:
 # Each individual join can use hash-join algorithm, which is more efficient 
than nested-loop joins required for OR conditions.

 # Each join branch can independently choose its optimal join implementation.

 # The joins can be executed in parallel.

 # Individual joins may be able to use indexes that would not be usable with OR 
conditions.

The rule only fires when:
 # The OR condition contains only simple equi-join conditions (comparing 
columns between left and right inputs).

Limitations and considerations:
 # May not improve performance if individual joins produce large results that 
need to be union-ed.

 # Could be inefficient if the OR condition is very selective but individual 
conditions are not.

 # Multiple hash tables may cause memory pressure.

 # The optimizer should consider statistics and cost estimates when applying 
this rule.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to