[ 
https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592216#comment-16592216
 ] 

Nihar Sheth commented on SPARK-25230:
-------------------------------------

This seems to be JVM behavior rather than anything Spark-specific:
[https://docs.oracle.com/javase/6/docs/api/java/lang/String.html#toUpperCase%28java.util.Locale%29]
In Java/Scala, String.toUpperCase converts "ß" to "SS" regardless of locale.
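
A quick way to check this outside Spark is a small standalone Java program.
This is only a sketch: the class name SharpSUpper and the helper upper() are
made up for illustration, and "ß" is written as the escape \u00df so the
result does not depend on the source file encoding.

```java
import java.util.Locale;

public class SharpSUpper {
    // Thin wrapper around String.toUpperCase(Locale), the method linked above.
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        String name = "Ha\u00dfler"; // the string from the issue, with U+00DF
        // Try a few locales; each one expands the sharp s to "SS".
        for (Locale locale : new Locale[] {Locale.ROOT, Locale.GERMAN, Locale.ENGLISH}) {
            System.out.println(locale.toLanguageTag() + " -> " + upper(name, locale));
        }
    }
}
```

Each line printed should end in HASSLER, which matches what spark-sql returns.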

From what I've quickly checked, MySQL, PostgreSQL, and SQLite all leave the
character unchanged, while spark-sql and websql change it to SS. If it's
essential to fix, it might just come down to replacing it with a placeholder
value, performing the uppercasing, then substituting it back in.
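
The placeholder idea could look roughly like the following in plain Java.
This is a sketch and not Spark code: the class and method names are
hypothetical, and the choice of the private-use code point U+E000 as the
placeholder is an assumption (it has no case mapping, so uppercasing leaves
it alone). Input that already contains the placeholder is rejected rather
than silently corrupted.

```java
import java.util.Locale;

public class UpperKeepSharpS {
    private static final char SHARP_S = '\u00df';     // U+00DF, the sharp s
    private static final char PLACEHOLDER = '\uE000'; // assumed unused private-use char

    // Uppercases s while leaving every sharp s as it is.
    static String upperKeepingSharpS(String s) {
        if (s.indexOf(PLACEHOLDER) >= 0) {
            throw new IllegalArgumentException("input already contains the placeholder");
        }
        String masked = s.replace(SHARP_S, PLACEHOLDER);  // 1. hide the sharp s
        String upper = masked.toUpperCase(Locale.ROOT);   // 2. uppercase the rest
        return upper.replace(PLACEHOLDER, SHARP_S);       // 3. put the sharp s back
    }

    public static void main(String[] args) {
        System.out.println(upperKeepingSharpS("Ha\u00dfler"));
        System.out.println(upperKeepingSharpS("Hassler"));
    }
}
```

With this, the two names from the issue no longer collapse into the same
uppercased value. The cost is an extra pass over the string on each side of
the uppercasing, plus the need to pick a placeholder that cannot occur in
real data.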

> Upper behavior incorrect for string contains "ß"
> ------------------------------------------------
>
>                 Key: SPARK-25230
>                 URL: https://issues.apache.org/jira/browse/SPARK-25230
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Yuming Wang
>            Priority: Major
>         Attachments: MySQL.png, Oracle.png, Teradata.jpeg
>
>
> How to reproduce:
> {code:sql}
> spark-sql> SELECT upper('Haßler');
> HASSLER
> {code}
> Mainstream databases return {{HAßLER}}.
>  !MySQL.png!
>  
> This behavior may lead to data inconsistency:
> {code:sql}
> create temporary view SPARK_25230 as select * from values
>   ("Hassler"),
>   ("Haßler")
> as EMPLOYEE(name);
> select UPPER(name) from SPARK_25230 group by 1;
> -- result
> HASSLER{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
