[ https://issues.apache.org/jira/browse/SPARK-25230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592216#comment-16592216 ]
Nihar Sheth commented on SPARK-25230:
-------------------------------------

This seems to be a JVM thing: [https://docs.oracle.com/javase/6/docs/api/java/lang/String.html#toUpperCase%28java.util.Locale%29]. In Java/Scala, uppercasing turns "ß" into "SS" in every locale.

From what I've quickly checked, MySQL, PostgreSQL, and SQLite all leave the character unchanged, but spark-sql and WebSQL change it to "SS". If it's essential to fix, it might just come down to replacing it with a placeholder value, performing the uppercasing, then substituting it back in.

> Upper behavior incorrect for string contains "ß"
> ------------------------------------------------
>
>                 Key: SPARK-25230
>                 URL: https://issues.apache.org/jira/browse/SPARK-25230
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Yuming Wang
>            Priority: Major
>         Attachments: MySQL.png, Oracle.png, Teradata.jpeg
>
>
> How to reproduce:
> {code:sql}
> spark-sql> SELECT upper('Haßler');
> HASSLER
> {code}
> Mainstream databases return {{HAßLER}}.
> !MySQL.png!
>
> This behavior may lead to data inconsistency:
> {code:sql}
> create temporary view SPARK_25230 as select * from values
> ("Hassler"),
> ("Haßler")
> as EMPLOYEE(name);
> select UPPER(name) from SPARK_25230 group by 1;
> -- result
> HASSLER{code}
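
The placeholder idea from the comment (swap "ß" out, uppercase, swap it back) could be sketched roughly as follows. This is only an illustration in plain Java, not Spark's code or a proposed patch: the class name `UpperKeepSharpS`, the method `upperKeepingSharpS`, and the choice of the private-use code point U+E000 as the placeholder are all assumptions made for the example.

```java
import java.util.Locale;

class UpperKeepSharpS {
    // Hypothetical placeholder: a private-use code point that has no case mapping,
    // so toUpperCase leaves it untouched.
    private static final char PLACEHOLDER = '\uE000';
    private static final char SHARP_S = '\u00DF'; // "ß"

    // Uppercases s while leaving every "ß" as it is.
    static String upperKeepingSharpS(String s) {
        if (s.indexOf(PLACEHOLDER) >= 0) {
            // The swap below would be ambiguous if the input already used the placeholder.
            throw new IllegalArgumentException("input already contains the placeholder");
        }
        return s.replace(SHARP_S, PLACEHOLDER)   // 1. hide "ß" behind the placeholder
                .toUpperCase(Locale.ROOT)        // 2. uppercase everything else
                .replace(PLACEHOLDER, SHARP_S);  // 3. put "ß" back
    }

    public static void main(String[] args) {
        // Plain JVM uppercasing expands "ß" to "SS": prints HASSLER
        System.out.println("Ha\u00DFler".toUpperCase(Locale.ROOT));
        // With the placeholder round trip the "ß" survives.
        System.out.println(upperKeepingSharpS("Ha\u00DFler"));
    }
}
```

A sketch like this changes only the handling of "ß"; the JVM's other special-case expansions are left as they are, and input that already contains the placeholder is rejected rather than handled.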