rdblue commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1800214702


##########
src/main/thrift/parquet.thrift:
##########
@@ -286,6 +313,9 @@ struct Statistics {
    7: optional bool is_max_value_exact;
    /** If true, min_value is the actual minimum value for a column */
    8: optional bool is_min_value_exact;
+
+   /** statistics specific to geometry logical type */
+   9: optional GeometryStatistics geometry_stats;

Review Comment:
   I don't think this is the right place to include `GeometryStatistics`. There 
are a couple reasons:
   1. This is unnecessary nesting to get to the geo stats, making them harder 
to find
   2. Nesting withing `Statistics` includes geo stats in places where simple 
`Statistics` make sense, but geo stats do not. For example, this would be 
included in page headers in addition to `ColumnMetaData`
   3. This doesn't match the approach used for `SizeStatistics`, which was 
included only in `ColumnMetaData`
   
   I think that this should match the addition of `SizeStatistics` and should 
be included as a field in `ColumnMetaData`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to