rdblue commented on a change in pull request #1326:
URL: https://github.com/apache/iceberg/pull/1326#discussion_r474804542
##########
File path:
mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -140,18 +137,34 @@ private static Object leafToLiteral(PredicateLeaf leaf) {
case STRING:
return leaf.getLiteralList();
case DATE:
- return leaf.getLiteralList().stream().map(value -> ((Date)
value).toLocalDate().toEpochDay())
+ return leaf.getLiteralList().stream().map(value -> dateToString((Date)
value))
.collect(Collectors.toList());
case DECIMAL:
return leaf.getLiteralList().stream()
- .map(value -> BigDecimal.valueOf(((HiveDecimalWritable)
value).doubleValue()))
+ .map(value -> hiveDecimalToBigDecimal((HiveDecimalWritable)
value))
.collect(Collectors.toList());
case TIMESTAMP:
return leaf.getLiteralList().stream()
- .map(value -> ((Timestamp) value).toInstant().getEpochSecond()
* MICROS_PER_SECOND +
- ((Timestamp) value).getNanos() /
NANOS_PER_MICROSEC).collect(Collectors.toList());
+ .map(value -> timestampToTimestampString((Timestamp) value))
Review comment:
This shouldn't convert to a string. Instead, it should convert the
`Timestamp` value directly to microseconds from the unix epoch. String
conversion in expressions is only for convenience in tests and for people using
the API directly with generics. If an engine passes a predicate, we don't want
to needlessly convert to string and back because it is much, much more likely
to corrupt the value.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]