[ https://issues.apache.org/jira/browse/CALCITE-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714230#comment-17714230 ]
Julian Hyde commented on CALCITE-5661:
--------------------------------------

If that doesn't work, also consider using Sarg. It is an efficient implementation of IN lists (in terms of the number of heap objects and heap bytes needed to represent an IN-list of a given size) and may be able to convert dense integer lists to ranges, e.g. 'x in (1, 2, 3, 4, 5, 10)' becomes 'x between 1 and 5 or x = 10'.

> Introduce another way to convert IN predicate to RelNode when IN list is large
> ------------------------------------------------------------------------------
>
>                 Key: CALCITE-5661
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5661
>             Project: Calcite
>          Issue Type: Improvement
>    Affects Versions: 1.34.0
>            Reporter: Runkang He
>            Assignee: Runkang He
>            Priority: Major
>
> When the IN list is large, plan generation is time-consuming. In a
> benchmark with an IN list of 30,000 values, it took 2 minutes to generate
> the final plan.
> {code:sql}
> select empno from emp where deptno in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
> ..., 30000)
> {code}
> We find that in the sql-to-rel phase, there are two methods to convert an IN
> predicate to a RelNode:
> 1. If the IN list size is below InSubQueryThreshold, convert IN to OR;
> 2. If the IN list size is over InSubQueryThreshold, convert IN to VALUES + JOIN.
> The first one is very time-consuming in the expression simplification
> stage because of the large OR predicate. As mentioned above, when the IN list
> had 30,000 values, it took 2 minutes, which is not acceptable in OLAP
> scenarios.
> The second one cannot apply predicate pushdown, which is very important in
> OLAP scenarios.
> So maybe we need to support converting IN to a RexCall directly, to avoid the
> disadvantages of the above two methods.
> In a POC, converting IN to a RexCall directly takes less than 1 second to
> generate the final plan.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
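
To illustrate the range-collapsing idea from the comment above ('x in (1, 2, 3, 4, 5, 10)' becomes 'x between 1 and 5 or x = 10'), here is a minimal standalone sketch in Python. It is not Calcite's Sarg implementation and does not use any Calcite API; the function names coalesce and to_predicate are hypothetical, chosen only for this example, and it assumes the IN list contains integers only.

```python
def coalesce(values):
    """Collapse integers into sorted, inclusive (lo, hi) ranges.

    Hypothetical helper for illustration; not part of Calcite.
    """
    ranges = []
    for v in sorted(set(values)):
        if ranges and v == ranges[-1][1] + 1:
            # v directly follows the last range, so extend that range.
            ranges[-1] = (ranges[-1][0], v)
        else:
            # Gap before v (or first value): start a new single-point range.
            ranges.append((v, v))
    return ranges


def to_predicate(column, values):
    """Render the coalesced ranges as a SQL-like predicate string."""
    parts = []
    for lo, hi in coalesce(values):
        if lo == hi:
            parts.append(f"{column} = {lo}")
        else:
            parts.append(f"{column} between {lo} and {hi}")
    return " or ".join(parts)


if __name__ == "__main__":
    print(to_predicate("x", [1, 2, 3, 4, 5, 10]))
    # A dense list like the one in the issue collapses to a single range.
    print(coalesce(range(1, 30001)))
```

Under these assumptions, a dense list such as 1 through 30000 collapses to one range, so the predicate size depends on the number of gaps in the list, not on the number of values.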