[ https://issues.apache.org/jira/browse/SPARK-47286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824795#comment-17824795 ]
Aleksandar Tomic commented on SPARK-47286:
------------------------------------------

[~gpgp] There is already a PR that should handle `IN`, among other cases:
https://github.com/apache/spark/pull/45383

But if you are interested in contributing to the collation track, I can propose two tracks:
1) Creation of benchmarking suites. The current implementation of collators is not very performance-efficient. Once we have a stable implementation, performance optimization will become a track of its own.
2) String expression support. You can talk to [~uros-db] about this.

> IN operator support
> -------------------
>
>                 Key: SPARK-47286
>                 URL: https://issues.apache.org/jira/browse/SPARK-47286
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Aleksandar Tomic
>            Priority: Major
>
> At this point the following query works fine:
> ```
> sql("select * from t1 where ucs_basic_lcase in ('aaa' collate 'ucs_basic_lcase', 'bbb' collate 'ucs_basic_lcase')").show()
> ```
> But if we omit an explicit collate, or even mix collations:
> ```
> sql("select * from t1 where ucs_basic_lcase in ('aaa' collate 'ucs_basic_lcase', 'bbb')").show()
> ```
> the query still runs and returns invalid results.