nastra commented on code in PR #13695:
URL: https://github.com/apache/iceberg/pull/13695#discussion_r2247624930
##########
core/src/main/java/org/apache/iceberg/DataFiles.java:
##########
@@ -151,6 +152,7 @@ public static class Builder {
private Map<Integer, Long> nanValueCounts = null;
private Map<Integer, ByteBuffer> lowerBounds = null;
private Map<Integer, ByteBuffer> upperBounds = null;
+ private Map<Integer, Type> originalTypes = null;
Review Comment:
> @nastra: Let me rephrase what I understand from your comment:
>
> * Currently we have Metrics - here in some cases we have some binary data
> for min and max which is not typed

Currently lower/upper bounds are **all** binary and we don't know what their
original type was.

> * In the future we will have only Stats - where the stats will be typed,
> and this type will help interpret the min/max values

Yes, the new stats structure will store upper/lower bounds with their actual
type.

> In between, for a while will have Metrics converted to Stats, and for this
> we need the type info

Correct. We want to convert from metrics to stats, and we can only do so if
we know the original type of each upper/lower bound.

One of the reasons I'm doing the metrics -> stats conversion is that our
appender/writer APIs currently all return `Metrics` after data has been
written. Changing those APIs to return stats instead would be quite a big
change, which I want to avoid until we figure out what we want the new API to
look like.
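
To illustrate why the original type is needed, here is a minimal sketch that uses only the JDK. It is not the code in this PR: `BoundDecodingSketch`, `Kind`, `decode`, and `encodeLong` are hypothetical names made up for the example, `Kind` stands in for Iceberg's `Type`, and the sketch assumes numeric bounds are serialized little-endian. The same eight bytes decode to different values depending on the type they are read as, so an untyped `ByteBuffer` bound cannot be turned into a typed stat on its own.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Iceberg.
final class BoundDecodingSketch {

  // Stand-in for the column's original type (covers only a few primitives).
  enum Kind { INT, LONG, DOUBLE, STRING }

  // Decodes a serialized bound using the supplied type.
  // Assumption: numeric values are stored little-endian.
  static Object decode(Kind kind, ByteBuffer bound) {
    // duplicate() keeps the caller's position/limit untouched.
    ByteBuffer buf = bound.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    switch (kind) {
      case INT:
        return buf.getInt();
      case LONG:
        return buf.getLong();
      case DOUBLE:
        return buf.getDouble();
      case STRING:
        return StandardCharsets.UTF_8.decode(buf).toString();
      default:
        throw new IllegalArgumentException("Unsupported kind: " + kind);
    }
  }

  // Helper that serializes a long the way the sketch expects to read it.
  static ByteBuffer encodeLong(long value) {
    ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
    buf.putLong(0, value); // absolute put, position stays at 0
    return buf;
  }

  public static void main(String[] args) {
    ByteBuffer upperBound = encodeLong(42L);
    // With the right type the bound round-trips.
    System.out.println("as LONG:   " + decode(Kind.LONG, upperBound));
    // With the wrong type the same bytes yield an unrelated value.
    System.out.println("as DOUBLE: " + decode(Kind.DOUBLE, upperBound));
  }
}
```

Carrying a per-field type map next to the bounds, like the `originalTypes` field in the diff above, is what lets a conversion step pick the right branch for each column.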
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]