rdblue commented on a change in pull request #1747:
URL: https://github.com/apache/iceberg/pull/1747#discussion_r529096291



##########
File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java
##########
@@ -123,6 +123,22 @@ public static Expression not(Expression child) {
     return new UnboundPredicate<>(Expression.Operation.NOT_NULL, expr);
   }
 
+  public static <T> UnboundPredicate<T> isNaN(String name) {
+    return new UnboundPredicate<>(Expression.Operation.IS_NAN, ref(name));
+  }
+
+  public static <T> UnboundPredicate<T> isNaN(UnboundTerm<T> expr) {
+    return new UnboundPredicate<>(Expression.Operation.IS_NAN, expr);
+  }
+
+  public static <T> UnboundPredicate<T> notNaN(String name) {
+    return new UnboundPredicate<>(Expression.Operation.NOT_NAN, ref(name));
+  }
+
+  public static <T> UnboundPredicate<T> notNaN(UnboundTerm<T> expr) {
+    return new UnboundPredicate<>(Expression.Operation.NOT_NAN, expr);
+  }

Review comment:
       Yeah, it isn't much earlier in that case. Maybe that actually exposes a 
problem with rewriting, too.
   
   `Expressions.equal("c", Double.NaN)` if `c` is not a floating point column 
would result in `isNaN`, which should be rejected while binding expressions. 
You could argue that it should rewrite to `alwaysFalse` instead following the 
same logic as `Expressions.equal("intCol", Long.MAX_VALUE)` -- it can't be true.
   
   I think that it would be better to be strict and reject binding in that case 
because something is clearly wrong. I think a lot of the time, that kind of 
error would happen when columns are misaligned or predicates are incorrectly 
converted.
   
   If the result of those errors is just to fail in expression binding, then 
why rewrite at all? Maybe we should just reject NaN in any predicate and force 
people to explicitly use `isNaN` and `notNaN`. That way we do throw an 
exception much earlier in all cases. Plus, we wouldn't have to worry about 
confusion over whether `NaN` is equal to itself: in Java, a `Double` that holds 
NaN is equal to itself, but a primitive is not. :confused:




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to