rdblue commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1800214702
##########
src/main/thrift/parquet.thrift:
##########
@@ -286,6 +313,9 @@ struct Statistics {
7: optional bool is_max_value_exact;
/** If true, min_value is the actual minimum value for a column */
8: optional bool is_min_value_exact;
+
+ /** statistics specific to geometry logical type */
+ 9: optional GeometryStatistics geometry_stats;
Review Comment:
I don't think this is the right place to include `GeometryStatistics`. There
are a couple reasons:
1. This is unnecessary nesting to get to the geo stats, making them harder
to find
2. Nesting within `Statistics` includes geo stats in places where simple
`Statistics` make sense, but geo stats do not. For example, this would be
included in page headers in addition to `ColumnMetaData` (and the page index
already removed these)
3. This doesn't match the approach used for `SizeStatistics`, which was
included only in `ColumnMetaData`
I think that this should match the addition of `SizeStatistics` and should
be included as a field in `ColumnMetaData`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]