uros-db commented on code in PR #46511:
URL: https://github.com/apache/spark/pull/46511#discussion_r1604485097

##########
common/unsafe/src/test/java/org/apache/spark/unsafe/types/CollationSupportSuite.java:
##########
@@ -709,12 +774,24 @@ public void testLocate() throws SparkException {
     assertLocate("大千", "test大千世界大千世界", 9, "UNICODE_CI", 9);
     assertLocate("大千", "大千世界大千世界", 1, "UNICODE_CI", 1);
     // Case-variable character length
+    assertLocate("\u0307", "i̇", 1, "UTF8_BINARY", 2);
+    assertLocate("\u0307", "İ", 1, "UTF8_BINARY_LCASE", 0); // != UTF8_BINARY

Review Comment:
   Not wrong, but the intent here was a bit different from the other cases: instead of comparing UNICODE_CI with UTF8_BINARY_LCASE, I wanted to compare UTF8_BINARY with UTF8_BINARY_LCASE. Essentially, this ensures that the new UTF8_BINARY_LCASE (character-wise) searching logic does `not` behave the same as applying UTF8_BINARY (byte-wise) searching logic to lowercased strings.
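
   To make the distinction concrete, here is a minimal, self-contained Java sketch. It is not Spark's implementation: the class and method names (`LcaseLocateSketch`, `locateOnLowercased`, `locateCharacterWise`) are hypothetical, the start-position argument of `assertLocate` is omitted, and `String.toLowerCase(Locale.ROOT)` stands in for whatever lowercasing the collation actually uses. It only illustrates why lowercasing first and then searching can find U+0307 inside "İ", while a search that treats each original character as one unit cannot.

```java
import java.util.Locale;

final class LcaseLocateSketch {

    // "Lowercase, then search": lowercase both strings in full and do a plain
    // substring search. Returns a 1-based position, or 0 if there is no match.
    public static int locateOnLowercased(String target, String pattern) {
        String t = target.toLowerCase(Locale.ROOT);
        String p = pattern.toLowerCase(Locale.ROOT);
        return t.indexOf(p) + 1;
    }

    // Character-wise search (assumed behavior, for illustration only): every code
    // point of the target is lowercased on its own and kept as one unit, so a
    // match can only start and end on code point boundaries of the original
    // target. Returns a 1-based code point position, or 0 if there is no match.
    public static int locateCharacterWise(String target, String pattern) {
        String p = pattern.toLowerCase(Locale.ROOT);
        if (p.isEmpty()) {
            return 1; // same result String.indexOf gives for an empty pattern
        }
        int[] cps = target.codePoints().toArray();
        for (int start = 0; start < cps.length; start++) {
            StringBuilder sb = new StringBuilder();
            for (int end = start; end < cps.length && sb.length() < p.length(); end++) {
                String unit = new String(Character.toChars(cps[end]));
                sb.append(unit.toLowerCase(Locale.ROOT));
                if (sb.toString().equals(p)) {
                    return start + 1;
                }
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String dot = "\u0307";          // COMBINING DOT ABOVE
        String iWithDot = "i\u0307";    // 'i' followed by U+0307
        String capitalIDot = "\u0130";  // LATIN CAPITAL LETTER I WITH DOT ABOVE

        // Both approaches agree when the dot is its own character in the target.
        System.out.println(locateOnLowercased(iWithDot, dot));
        System.out.println(locateCharacterWise(iWithDot, dot));

        // They disagree for U+0130, whose lowercase form is the two-character
        // sequence 'i' + U+0307.
        System.out.println(locateOnLowercased(capitalIDot, dot));
        System.out.println(locateCharacterWise(capitalIDot, dot));
    }
}
```

   In this sketch the second pair of calls is the interesting one: lowercasing "İ" up front exposes a standalone U+0307 that the pattern can match, whereas the character-wise variant only ever compares the pattern against the whole lowercased unit "i̇", so it reports no match. That mirrors the `0` expected for UTF8_BINARY_LCASE in the test above.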