[ 
https://issues.apache.org/jira/browse/CALCITE-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714230#comment-17714230
 ] 

Julian Hyde commented on CALCITE-5661:
--------------------------------------

If that doesn't work, also consider using Sarg. It is an efficient 
implementation of IN lists (in terms of the number of heap objects and heap 
bytes to represent an IN-list of a given size) and may be able to convert dense 
integer lists to ranges, e.g. 'x in (1, 2, 3, 4, 5, 10)' becomes 'x between 1 
and 5 or x = 10'.

> Introduce another way to convert IN predicate to RelNode when IN list is large
> ------------------------------------------------------------------------------
>
>                 Key: CALCITE-5661
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5661
>             Project: Calcite
>          Issue Type: Improvement
>    Affects Versions: 1.34.0
>            Reporter: Runkang He
>            Assignee: Runkang He
>            Priority: Major
>
> When IN list is large, the plan generation is time-consuming, after 
> benchmark, when the IN value list size was 3w, it took 2 minutes to generate 
> the final plan.
> {code:sql}
> select empno from emp where deptno in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
> ..., 30000){code}
> We find that in sql-to-rel phase, there are two methods to convert IN 
> predicate to RelNode:
> 1.IN list size is below InSubQueryThreshold, convert IN to OR;
> 2.IN list size is over InSubQueryThreshold, convert IN to VALUES + JOIN.
> The first one will be very time-consuming in the expression simplification 
> stage for the large OR predicate. As mentioned before, when the IN value list 
> size was 3w, it took 2 minutes, which is not acceptable in OLAP scenarios.
> The second one will not be able to apply predicate pushdown, which it is very 
> important in OLAP scenarios.
> So maybe we need to support converting IN to RexCall directly to avoid the 
> disadvantages of the above two methods.
> After POC, when convert IN to RexCall directly, it takes less than 1 second 
> to generate the final plan.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to