Runkang He created CALCITE-5661:
-----------------------------------

             Summary: Introduce another way to convert IN predicate to RelNode 
when IN list is large
                 Key: CALCITE-5661
                 URL: https://issues.apache.org/jira/browse/CALCITE-5661
             Project: Calcite
          Issue Type: Improvement
    Affects Versions: 1.34.0
            Reporter: Runkang He


When IN list is large, the plan generation is time-consuming, after benchmark, 
when the IN value list size was 3w, it took 2 minutes to generate the final 
plan.
{code:java}
select empno from emp where deptno in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
..., 30000){code}
We find that in sql-to-rel phase, there are two methods to convert IN predicate 
to RelNode:
1.IN list size is below InSubQueryThreshold, convert IN to OR;
2.IN list size is over InSubQueryThreshold, convert IN to VALUES + JOIN.
The first one will be very time-consuming in the expression simplification 
stage for the large OR predicate. As mentioned before, when the IN value list 
size was 3w, it took 2 minutes, which is not acceptable in OLAP scenarios.
The second one will not be able to apply IN predicate pushdown, which it is 
very important in OLAP scenarios.
SO maybe we need to support converting IN to RexCall directly to avoid the 
disadvantages of the above two methods.
After POC, when convert IN to RexCall directly, it takes less than 1 second to 
generate the final plan.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to