dbatomic commented on code in PR #45421:
URL: https://github.com/apache/spark/pull/45421#discussion_r1519451854


##########
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java:
##########
@@ -396,7 +389,18 @@ public boolean startsWith(final UTF8String prefix, int collationId) {
     if (collationId == CollationFactory.LOWERCASE_COLLATION_ID) {
       return this.toLowerCase().startsWith(prefix.toLowerCase());
     }
-    return matchAt(prefix, 0, collationId);
+    return collatedStartsWith(prefix, collationId);
+  }
+
+  private boolean collatedStartsWith(final UTF8String prefix, int collationId) {

Review Comment:
   I think that @MaxGekk has a point.
   `CollationSuite` is becoming cluttered, and it should be reserved for E2E testing. Can we either add unit tests to `UTF8StringSuite` or create a `UTF8StringWithCollationSuite` suite?
   
   Another option is to use `CollationExpressionSuite`.
   
   I would propose the following:
   1) E2E collation tests go to `CollationSuite`.
   2) String-level tests go to either `UTF8StringSuite` or `UTF8StringWithCollationSuite`.
   3) Expression-level tests go to `CollationExpressionSuite`.
   4) Collation management tests go to `CollationFactorySuite`.
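   
   To illustrate what I mean by a string-level test in (2): it can stay independent of the SQL layer. A minimal sketch, assuming plain `java.lang.String` as a stand-in for `UTF8String` and covering only the lowercase branch from the diff above. The class and method names are hypothetical and not part of this PR, and `Locale.ROOT` is my assumption for the case mapping:

   ```java
   import java.util.Locale;

   // Hypothetical stand-in: uses java.lang.String instead of UTF8String,
   // so it runs without Spark on the classpath.
   final class StringLevelCollationSketch {
     private StringLevelCollationSketch() {}

     // Mirrors the shape of the lowercase-collation branch in the diff:
     // lowercase both sides, then do a plain prefix check.
     // Locale.ROOT is an assumption made for this sketch.
     static boolean startsWithLowercase(String value, String prefix) {
       return value.toLowerCase(Locale.ROOT)
           .startsWith(prefix.toLowerCase(Locale.ROOT));
     }

     // Tiny assertion helper so the sketch needs no test framework.
     static void check(boolean condition, String message) {
       if (!condition) {
         throw new AssertionError(message);
       }
     }

     // The kind of cases a string-level suite would hold.
     static void runChecks() {
       check(startsWithLowercase("Spark SQL", "SPARK"), "prefix differing only in case");
       check(startsWithLowercase("spark", ""), "empty prefix always matches");
       check(!startsWithLowercase("Spark", "park"), "substring that is not a prefix");
       check(!startsWithLowercase("Sp", "Spark"), "prefix longer than the value");
     }
   }
   ```

   Cases like these need no session or query plan, which is why they fit better next to `UTF8StringSuite` than in `CollationSuite`.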
   
   @MaxGekk and @cloud-fan - what are your thoughts on this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org