[ 
https://issues.apache.org/jira/browse/CALCITE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532204#comment-17532204
 ] 

xiong duan commented on CALCITE-5133:
-------------------------------------

The title "The IN-list values converted to SQL will fail when it is converted 
to LogicalValues".

Why need to gt 20, Because when the list size is less than 20, the value list 
will be converted to OR condition. The relative configuration is 
"DEFAULT_IN_SUB_QUERY_THRESHOLD" and the default value is 20.

And welcome the contribution.

> JdbcValues lost some values when the value of in was more than 20
> -----------------------------------------------------------------
>
>                 Key: CALCITE-5133
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5133
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.30.0
>            Reporter: itxiangkui
>            Priority: Major
>
>  
> I have a piece of data, the example is as follows:
> ||channel||gettime||
> |c1|2022-04-27 00:00:00|
> |c2|2022-04-27 00:00:00|
> |c3|2022-04-27 00:00:00|
> |c4|2022-04-27 00:00:00|
> |c5|2022-04-27 00:00:00|
> |c6|2022-04-27 00:00:00|
>  
> {color:#00875a}it will be ok to query the dataset...{color}
> {code:java}
> select channel
> from`tidb_noah_mars`.`noah_global`.`mpays`
> where gettime>='2022-04-27 00:00:00'
> and gettime<='2022-04-27 14:00:00'
> and channel in ('c1','c2'...'c19')
> | JdbcToEnumerableConverter
>   JdbcProject(channel=[$11])
>     JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27 
> 00:00:00..2022-04-27 14:00:00]]), SEARCH($11, 
> Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET "UTF-8" |{code}
>  
> {color:#ff0000}it can not query the dataset...{color}
> {code:java}
> select channel
> from xxx
> where gettime>='2022-04-27 00:00:00'
> and gettime<='2022-04-27 14:00:00'
> and channel in ('c1','c2'....'c20') 
> | JdbcToEnumerableConverter
>   JdbcProject(channel=[$0])
>     JdbcJoin(condition=[=($0, $2)], joinType=[inner])
>       JdbcProject(channel=[$11], gettime=[$38])
>         JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code}
>  
> The difference is that the in sub-collection of the latter will have more 
> than 20 elements, while the former has only 19
> This leads to a huge difference in execution plans
>  
> When the latter sql was executed, I found out when I debugged, 
> org.apache.calcite.rel.rel2sql.visit(Values e)
> in the method:
> {code:java}
>        } else {
>          query = SqlStdOperatorTable.UNION_ALL.createCall(
>              new SqlNodeList(list, POS));
>        } {code}
>  
> Here list.size = 20, but the result of the *query* is
> {code:java}
> SELECT 'wsss' AS `ROW_VALUE`
> UNION ALL
> SELECT 'xxx' AS `ROW_VALUE`
> {code}
> *There are two lines,not 20 lines...*
> It sounds weird and I can't explain more details...
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to