uros-db commented on code in PR #46511:
URL: https://github.com/apache/spark/pull/46511#discussion_r1612004770


##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java:
##########
@@ -430,7 +430,7 @@ public static int execBinary(final UTF8String string, final UTF8String substring
     }
     public static int execLowercase(final UTF8String string, final UTF8String substring,
         final int start) {
-      return string.toLowerCase().indexOf(substring.toLowerCase(), start);

Review Comment:
   No, unfortunately it's not. While it works fine for ASCII, it gives wrong 
results in some special cases involving conditional case mapping: when a 
character's lowercase equivalent consists of multiple characters, or when the 
mapping depends on where the character appears in the string 
(context-awareness).
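
   To make the two cases concrete, here is a minimal sketch. It uses plain 
`java.lang.String` as a stand-in for `UTF8String`, so it assumes that 
`UTF8String.toLowerCase()` follows the same Unicode case mapping for these 
inputs. The class name `LowercaseIndexOfDemo` and the helper `naiveIndexOf` are 
hypothetical and exist only for illustration.

```java
import java.util.Locale;

public class LowercaseIndexOfDemo {
    // Same shape as the removed line, but on java.lang.String (an assumption;
    // the real code operates on UTF8String).
    public static int naiveIndexOf(String string, String substring, int start) {
        return string.toLowerCase(Locale.ROOT)
                     .indexOf(substring.toLowerCase(Locale.ROOT), start);
    }

    public static void main(String[] args) {
        // Case 1: one character lowercases to several.
        // U+0130 (capital I with dot above) becomes "i" + U+0307, so every
        // index after it shifts by one in the lowercased string.
        String dotted = "\u0130x";
        System.out.println(dotted.toLowerCase(Locale.ROOT).length()); // 3, not 2
        System.out.println(naiveIndexOf(dotted, "x", 0)); // 2, but "x" is at 1 in the input

        // Case 2: context-aware mapping.
        // A word-final capital sigma (U+03A3) lowercases to U+03C2, while a
        // standalone one lowercases to U+03C3, so the search misses the match.
        String sigma = "\u03A3\u0391\u03A3";
        System.out.println(naiveIndexOf(sigma, "\u03A3", 1)); // -1, but the input has a sigma at 2
    }
}
```

   In the first case the returned index refers to the lowercased string rather 
than the original one; in the second, the match is lost entirely because the 
string and the substring are lowercased in different contexts.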



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

