rdblue commented on a change in pull request #1981:
URL: https://github.com/apache/iceberg/pull/1981#discussion_r553008256
##########
File path: api/src/main/java/org/apache/iceberg/transforms/ProjectionUtil.java
##########
@@ -254,4 +257,145 @@ private ProjectionUtil() {
return predicate(predicate.op(), fieldName,
Iterables.transform(predicate.asSetPredicate().literalSet(),
transform::apply));
}
+
+ /**
+ * Fixes an inclusive projection to account for incorrectly transformed
values.
+ * <p>
+ * A bug in 0.10.0 and earlier caused negative values to be incorrectly
transformed by date and timestamp transforms
+ * to 1 larger than the correct value. For example, day(1969-12-31 10:00:00)
produced 0 instead of -1. To read data
+ * written by versions with this bug, this method adjusts the inclusive
projection. The current inclusive projection
+ * is correct, so this modifies the "correct" projection when needed. For
example, < day(1969-12-31 10:00:00) will
+ * produce <= -1 (= 1969-12-31) and is adjusted to <= 0 (= 1969-01-01)
because the incorrect transformed value was 0.
+ */
+ static UnboundPredicate<Integer>
fixInclusiveTimeProjection(UnboundPredicate<Integer> projected) {
+ if (projected == null) {
+ return projected;
+ }
+
+ // adjust the predicate for values that were 1 larger than the correct
transformed value
+ switch (projected.op()) {
+ case LT:
+ if (projected.literal().value() < 0) {
+ return Expressions.lessThan(projected.term(),
projected.literal().value() + 1);
+ }
+
+ return projected;
+
+ case LT_EQ:
+ if (projected.literal().value() < 0) {
+ return Expressions.lessThanOrEqual(projected.term(),
projected.literal().value() + 1);
+ }
+
+ return projected;
+
+ case GT:
+ case GT_EQ:
+ // incorrect projected values are already greater than the bound for
GT, GT_EQ
+ return projected;
+
+ case EQ:
+ if (projected.literal().value() < 0) {
+ // match either the incorrect value (projectedValue + 1) or the
correct value (projectedValue)
+ return Expressions.in(projected.term(), projected.literal().value(),
projected.literal().value() + 1);
+ }
+
+ return projected;
+
+ case IN:
+ Set<Integer> fixedSet = Sets.newHashSet();
+ boolean hasNegativeValue = false;
+ for (Literal<Integer> lit : projected.literals()) {
+ Integer value = lit.value();
+ fixedSet.add(value);
+ if (value < 0) {
+ hasNegativeValue = true;
+ fixedSet.add(value + 1);
+ }
+ }
+
+ if (hasNegativeValue) {
+ return Expressions.in(projected.term(), fixedSet);
Review comment:
If there is no negative value, then there is no need to fixup the
expression and we can return the original. That avoids some object allocation?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]