amogh-jahagirdar commented on code in PR #14500:
URL: https://github.com/apache/iceberg/pull/14500#discussion_r2771543092


##########
api/src/main/java/org/apache/iceberg/expressions/Literals.java:
##########
@@ -608,9 +608,24 @@ public String toString() {
     }
   }
 
-  static class UUIDLiteral extends ComparableLiteral<UUID> {
+  static class UUIDLiteral extends BaseLiteral<UUID> {
+    private static final Comparator<UUID> RFC_CMP =
+        Comparators.<UUID>nullsFirst().thenComparing(Comparators.uuids());
+    private static final Comparator<UUID> SIGNED_CMP =
+        
Comparators.<UUID>nullsFirst().thenComparing(Comparators.signedUUIDs());
+
+    // Flag to indicate which comparator to use (serializable)
+    private final boolean useSignedComparator;
+    // Transient cached comparator (reconstructed on deserialization)
+    private transient volatile Comparator<UUID> cmp;
+
     UUIDLiteral(UUID value) {
+      this(value, false);

Review Comment:
   I think I ultimately look at this as while it's a breaking change in 
behavior it's probably unimportant, particularly for downstream consumers 
comparing literal. It already was kind of  a gap for the regular stats case for 
a long time, a probably bigger use case which is being fixed in this PR and 
handled in a compatible manner, and then for the literal case it's basically a 
potential issue for any consumer who is comparing on their own but that seems 
even more rare. So all that to say I agree with @bodduv rationale here; maybe 
there's some conf/property based approach but seems overkill and I'd prefer to 
just "rip the bandaid" off here for this case.



##########
api/src/main/java/org/apache/iceberg/expressions/ExpressionUtil.java:
##########
@@ -744,4 +746,125 @@ private static PartitionSpec identitySpec(Schema schema, 
int... ids) {
 
     return specBuilder.build();
   }
+
+  /**
+   * Transform UUID literals in an unbound expression to use signed 
comparators, if the expression
+   * contains UUID bounds predicates. This maintains backward compatibility 
with files written
+   * before RFC 4122/9562 compliant comparison was implemented.
+   *
+   * <p>The transformed expression contains literals with signed comparators 
for lt/gt/eq
+   * predicates. For IN predicates, the comparator information is lost when 
binding converts
+   * literals to a Set of raw values, so the evaluator must handle this 
separately.
+   *
+   * @param expr an unbound expression
+   * @return the expression with signed UUID comparators, or null if no UUID 
predicates are present
+   */
+  @Nullable
+  public static Expression toSignedUUIDLiteral(Expression expr) {
+    SignedUUIDLiteralVisitor visitor = new SignedUUIDLiteralVisitor();
+    Expression transformed = ExpressionVisitors.visit(expr, visitor);
+    return visitor.foundUUIDBoundsPredicate() ? transformed : null;
+  }
+
+  /**
+   * Visitor that transforms an expression to use the signed UUID comparator 
in all UUID literals,
+   * while also tracking whether any UUID bounds predicates were found.
+   */
+  private static class SignedUUIDLiteralVisitor
+      extends ExpressionVisitors.ExpressionVisitor<Expression> {
+
+    private boolean foundUUIDBoundsPredicate = false;
+
+    boolean foundUUIDBoundsPredicate() {
+      return foundUUIDBoundsPredicate;
+    }
+
+    @Override
+    public Expression alwaysTrue() {
+      return Expressions.alwaysTrue();
+    }
+
+    @Override
+    public Expression alwaysFalse() {
+      return Expressions.alwaysFalse();
+    }
+
+    @Override
+    public Expression not(Expression result) {
+      return Expressions.not(result);
+    }
+
+    @Override
+    public Expression and(Expression leftResult, Expression rightResult) {
+      return Expressions.and(leftResult, rightResult);
+    }
+
+    @Override
+    public Expression or(Expression leftResult, Expression rightResult) {
+      return Expressions.or(leftResult, rightResult);
+    }
+
+    @Override
+    public <T> Expression predicate(BoundPredicate<T> pred) {
+      // Bound predicates should not be transformed - this is for unbound 
expressions
+      throw new UnsupportedOperationException(
+          "Cannot transform bound predicate; use unbound expressions");
+    }
+
+    @Override
+    @SuppressWarnings("unchecked")
+    public <T> Expression predicate(UnboundPredicate<T> pred) {

Review Comment:
   Minor: I believe this method could be simplified with (felt like more code 
than needed for falling through for the non-UUID cases in this context):
   
   ```
       public <T> Expression predicate(UnboundPredicate<T> pred) {
         switch (pred.op()) {
           case LT:
           case LT_EQ:
           case GT:
           case GT_EQ:
           case EQ:
           case NOT_EQ:
             if (pred.literal().value() instanceof UUID) {
               foundUUIDBoundsPredicate = true;
               Literals.UUIDLiteral uuidLit = (Literals.UUIDLiteral) 
pred.literal();
               return new UnboundPredicate<>(pred.op(), pred.term(), (T) 
uuidLit.withSignedComparator());
             }
   
             return pred;
   
           case IN:
           case NOT_IN:
             List<Literal<T>> literals = pred.literals();
             if (!literals.isEmpty() && literals.get(0).value() instanceof 
UUID) {
               foundUUIDBoundsPredicate = true;
               List<T> transformedValues =
                   literals.stream()
                       .map(l -> (T) ((Literals.UUIDLiteral) 
l).withSignedComparator())
                       .collect(Collectors.toList());
               return new UnboundPredicate<>(pred.op(), pred.term(), 
transformedValues);
             }
   
             return pred;
   
           default:
             return pred;
         }
       }
     }
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to