[ https://issues.apache.org/jira/browse/CALCITE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
itxiangkui updated CALCITE-5133: -------------------------------- Description: I have a piece of data, the example is as follows: ||channel||gettime|| |c1|2022-04-27 00:00:00| |c2|2022-04-27 00:00:00| |c3|2022-04-27 00:00:00| |c4|2022-04-27 00:00:00| |c5|2022-04-27 00:00:00| |c6|2022-04-27 00:00:00| {color:#00875a}it will be ok to query the dataset...{color} {code:java} select channel from`tidb_noah_mars`.`noah_global`.`mpays` where gettime>='2022-04-27 00:00:00' and gettime<='2022-04-27 14:00:00' and channel in ('c1','c2'...'c19') | JdbcToEnumerableConverter JdbcProject(channel=[$11]) JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27 00:00:00..2022-04-27 14:00:00]]), SEARCH($11, Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET "UTF-8" |{code} {color:#ff0000}it can not query the dataset...{color} {code:java} select channel from xxx where gettime>='2022-04-27 00:00:00' and gettime<='2022-04-27 14:00:00' and channel in ('c1','c2'....'c20') | JdbcToEnumerableConverter JdbcProject(channel=[$0]) JdbcJoin(condition=[=($0, $2)], joinType=[inner]) JdbcProject(channel=[$11], gettime=[$38]) JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code} The difference is that the in sub-collection of the latter will have more than 20 elements, while the former has only 19 This leads to a huge difference in execution plans When the latter sql was executed, I found out when I debugged, org.apache.calcite.rel.rel2sql.visit(Values e) in the method: {code:java} } else { query = SqlStdOperatorTable.UNION_ALL.createCall( new SqlNodeList(list, POS)); } {code} Here list.size = 20, but the result of the *query* is {code:java} SELECT 'wsss' AS `ROW_VALUE` UNION ALL SELECT 'xxx' AS `ROW_VALUE` {code} *There are two lines,not 20 lines...* It sounds weird and I can't explain more details... was: I have a piece of data, the example is as follows: ||channel||gettime|| |c1|2022-04-27 00:00:00| |c2|2022-04-27 00:00:00| |c3|2022-04-27 00:00:00| |c4|2022-04-27 00:00:00| |c5|2022-04-27 00:00:00| |c6|2022-04-27 00:00:00| {color:#00875a}it will be ok to query the dataset...{color} {code:java} select channel from`tidb_noah_mars`.`noah_global`.`mpays` where gettime>='2022-04-27 00:00:00' and gettime<='2022-04-27 14:00:00' and channel in ('c1','c2'...'c19') | JdbcToEnumerableConverter JdbcProject(channel=[$11]) JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27 00:00:00..2022-04-27 14:00:00]]), SEARCH($11, Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET "UTF-8" |{code} {color:#ff0000}it can not query the dataset...{color} {code:java} select channel from xxx where gettime>='2022-04-27 00:00:00' and gettime<='2022-04-27 14:00:00' and channel in ('c1','c2'....'c20') | JdbcToEnumerableConverter JdbcProject(channel=[$0]) JdbcJoin(condition=[=($0, $2)], joinType=[inner]) JdbcProject(channel=[$11], gettime=[$38]) JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code} The difference is that the in sub-collection of the latter will have more than 20 elements, while the former has only 19 This leads to a huge difference in execution plans When the latter sql was executed, I found out when I debugged, org.apache.calcite.rel.rel2sql.visit(Values e) in the method: {code:java} } else { query = SqlStdOperatorTable.UNION_ALL.createCall( new SqlNodeList(list, POS)); } {code} Here list.size = 20, but the result of the *query* is {code:java} SELECT 'wsss' AS `ROW_VALUE` UNION ALL SELECT 'xxx' AS `ROW_VALUE` {code} *There are two lines* It sounds weird and I can't explain more details... > JdbcValues lost some values when the value of in was more than 20 > ----------------------------------------------------------------- > > Key: CALCITE-5133 > URL: https://issues.apache.org/jira/browse/CALCITE-5133 > Project: Calcite > Issue Type: Bug > Reporter: itxiangkui > Priority: Major > > > I have a piece of data, the example is as follows: > ||channel||gettime|| > |c1|2022-04-27 00:00:00| > |c2|2022-04-27 00:00:00| > |c3|2022-04-27 00:00:00| > |c4|2022-04-27 00:00:00| > |c5|2022-04-27 00:00:00| > |c6|2022-04-27 00:00:00| > > {color:#00875a}it will be ok to query the dataset...{color} > {code:java} > select channel > from`tidb_noah_mars`.`noah_global`.`mpays` > where gettime>='2022-04-27 00:00:00' > and gettime<='2022-04-27 14:00:00' > and channel in ('c1','c2'...'c19') > | JdbcToEnumerableConverter > JdbcProject(channel=[$11]) > JdbcFilter(condition=[AND(SEARCH($38, Sarg[[2022-04-27 > 00:00:00..2022-04-27 14:00:00]]), SEARCH($11, > Sarg[_UTF-8'360_assistant':VARCHAR(14) CHARACTER SET "UTF-8" |{code} > > {color:#ff0000}it can not query the dataset...{color} > {code:java} > select channel > from xxx > where gettime>='2022-04-27 00:00:00' > and gettime<='2022-04-27 14:00:00' > and channel in ('c1','c2'....'c20') > | JdbcToEnumerableConverter > JdbcProject(channel=[$0]) > JdbcJoin(condition=[=($0, $2)], joinType=[inner]) > JdbcProject(channel=[$11], gettime=[$38]) > JdbcFilter(condition=[SEARCH($38, Sarg[[2022-04-27 00:00 |{code} > > The difference is that the in sub-collection of the latter will have more > than 20 elements, while the former has only 19 > This leads to a huge difference in execution plans > > When the latter sql was executed, I found out when I debugged, > org.apache.calcite.rel.rel2sql.visit(Values e) > in the method: > {code:java} > } else { > query = SqlStdOperatorTable.UNION_ALL.createCall( > new SqlNodeList(list, POS)); > } {code} > > Here list.size = 20, but the result of the *query* is > {code:java} > SELECT 'wsss' AS `ROW_VALUE` > UNION ALL > SELECT 'xxx' AS `ROW_VALUE` > {code} > *There are two lines,not 20 lines...* > It sounds weird and I can't explain more details... > -- This message was sent by Atlassian Jira (v8.20.7#820007)