[ 
https://issues.apache.org/jira/browse/SPARK-47286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824795#comment-17824795
 ] 

Aleksandar Tomic commented on SPARK-47286:
------------------------------------------

[~gpgp] There is already a PR that should handle `IN` among other cases:
https://github.com/apache/spark/pull/45383

However, if you are interested in contributing to the collation effort, I can 
propose two tracks:
1) Creating benchmarking suites. The current implementation of collators is not 
very performance-efficient. Once we reach a stable implementation, performance 
optimization will become one track.
2) String expression support - you can talk to [~uros-db] about this.

> IN operator support
> -------------------
>
>                 Key: SPARK-47286
>                 URL: https://issues.apache.org/jira/browse/SPARK-47286
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 4.0.0
>            Reporter: Aleksandar Tomic
>            Priority: Major
>
> At this point, the following query works fine:
> ```
>  sql("select * from t1 where ucs_basic_lcase in ('aaa' collate 'ucs_basic_lcase', 'bbb' collate 'ucs_basic_lcase')").show()
> ```
> But if we omit the explicit collate on a literal, or even mix collations:
> ```
>  sql("select * from t1 where ucs_basic_lcase in ('aaa' collate 'ucs_basic_lcase', 'bbb')").show()
> ```
> The query still runs, but returns invalid results.
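
To make the mixed-collation case from the description concrete, here is a minimal sketch in plain Scala (no Spark dependency). It is only a model, not Spark's implementation: it assumes that `ucs_basic_lcase` compares strings by their lowercased form and that an uncollated literal falls back to binary comparison. The names `CollatedInSketch`, `Collation`, `Lit`, `inMixed` and `inUniform` are hypothetical and exist only for this illustration.

```scala
// Hypothetical model of collation-aware IN; not Spark's actual implementation.
object CollatedInSketch {
  // A collation is modeled as a function that maps a string to its comparison key.
  sealed trait Collation { def key(s: String): String }

  // Assumed default: binary comparison, so the key is the string itself.
  case object UcsBasic extends Collation {
    def key(s: String): String = s
  }

  // Assumed behavior of ucs_basic_lcase: compare lowercased strings.
  case object UcsBasicLcase extends Collation {
    def key(s: String): String = s.toLowerCase(java.util.Locale.ROOT)
  }

  // A literal in the IN list, tagged with the collation it carries.
  final case class Lit(value: String, collation: Collation)

  // IN where each literal is compared under its own collation (the mixed case).
  def inMixed(col: String, list: Seq[Lit]): Boolean =
    list.exists(l => l.collation.key(col) == l.collation.key(l.value))

  // IN where one collation is resolved for the whole list before comparing.
  def inUniform(col: String, list: Seq[String], c: Collation): Boolean =
    list.exists(v => c.key(col) == c.key(v))

  def main(args: Array[String]): Unit = {
    val rows = Seq("AAA", "bbb", "BBB", "ccc")

    // 'aaa' collate 'ucs_basic_lcase', 'bbb' (no explicit collation)
    val mixed = Seq(Lit("aaa", UcsBasicLcase), Lit("bbb", UcsBasic))
    println(rows.filter(inMixed(_, mixed)))

    // Both literals under ucs_basic_lcase
    println(rows.filter(inUniform(_, Seq("aaa", "bbb"), UcsBasicLcase)))
  }
}
```

Under these assumptions, the row `BBB` matches when every literal uses the lowercase collation but is silently dropped when `'bbb'` is left uncollated, which is one way a query can run without error and still return results the user did not expect.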


