Re: [PR] [SPARK-47296][SQL][COLLATION] Fail unsupported functions for non-binary collations [spark]

via GitHub Fri, 15 Mar 2024 02:29:28 -0700


dbatomic commented on code in PR #45422:
URL: https://github.com/apache/spark/pull/45422#discussion_r1525990277



##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##########
@@ -69,6 +69,7 @@ public static class Collation {
      * byte for byte equal. All accent or case-insensitive collations are considered non-binary.
      */
     public final boolean isBinaryCollation;
+    public final boolean isLowercaseCollation;

Review Comment:
   I think the name is misleading. It also doesn't make much sense to keep a 
flag that is false for every collation except one. You only use it in 
`StringType`, so can you just define it there:
   
   ```
   def isLcaseCollation: Boolean = collationId == LOWERCASE_COLLATION_ID
   ```
   That being said, maybe we don't even need this, and you can just do the 
check in `StringTypeBinaryLcase`.
   
   I think that `isLowercaseCollation` is a very special edge case that 
shouldn't be baked deeply into the `Collation` object. `isBinaryCollation` is 
an actual property of a collation, so it makes sense to have it here, but 
`isLowercaseCollation` is not.
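   
   A minimal Java sketch of the shape suggested above, for illustration only. 
The classes `CollationSketch`, `Collation`, and `StringTypeSketch` are 
hypothetical stand-ins rather than Spark's real classes, and the value given to 
`LOWERCASE_COLLATION_ID` is an assumption. The point is that the collation 
object keeps only properties that apply to every collation, while the 
lowercase special case is derived from the id where it is needed:
   
   ```java
   // Hypothetical sketch; these are stand-ins, not Spark's real classes.
   final class CollationSketch {
       // Assumed id of the lowercase collation (illustrative value only).
       static final int LOWERCASE_COLLATION_ID = 1;

       // The collation object stores only properties meaningful for every collation.
       static final class Collation {
           final String name;
           final boolean isBinaryCollation;

           Collation(String name, boolean isBinaryCollation) {
               this.name = name;
               this.isBinaryCollation = isBinaryCollation;
           }
       }

       // Stand-in for StringType: the edge case is computed from the id on demand,
       // so no extra field is stored on every Collation instance.
       static final class StringTypeSketch {
           final int collationId;

           StringTypeSketch(int collationId) {
               this.collationId = collationId;
           }

           boolean isLowercaseCollation() {
               return collationId == LOWERCASE_COLLATION_ID;
           }
       }
   }
   ```
   
   With this layout, adding a new collation never requires deciding a value for 
a flag that only one collation cares about.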
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


