emkornfield commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1793009835


##########
src/main/thrift/parquet.thrift:
##########
@@ -1084,6 +1290,9 @@ struct ColumnIndex {
     * Same as repetition_level_histograms except for definitions levels.
     **/
    7: optional list<i64> definition_level_histograms;
+
+   /** A list containing statistics of GEOMETRY logical type for each page */
+   8: optional list<GeometryStatistics> geometry_stats;

Review Comment:
   Just to unify the conversation: a page typically holds tens of thousands of
records, or anywhere from multiple KBs to MBs of data. It sounds like in at
least some cases the per-page statistics might be worth it. To @rdblue's
point, we might want to measure how much additional file size this data adds
for some geography data, to make sure the estimates are accurate. If it is not
worth writing by default, we should consider removing it.
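
One rough way to frame that evaluation is as a ratio of per-page statistics size to page size. The sketch below is only a back-of-envelope illustration: the function name `stats_overhead_fraction`, the 64-byte statistics size, and the page sizes are all hypothetical placeholders, not measurements from this PR or from any real file.

```python
def stats_overhead_fraction(page_bytes: int, stats_bytes_per_page: int) -> float:
    """Return the extra file size, as a fraction of page size, added by
    writing one statistics entry per page."""
    if page_bytes <= 0:
        raise ValueError("page_bytes must be positive")
    return stats_bytes_per_page / page_bytes


if __name__ == "__main__":
    # Hypothetical values: 64 bytes of statistics per page, and page sizes
    # spanning the "multiple KBs to MBs" range mentioned in the comment.
    assumed_stats_bytes = 64
    for page_bytes in (8 * 1024, 64 * 1024, 1024 * 1024):
        frac = stats_overhead_fraction(page_bytes, assumed_stats_bytes)
        print(f"{page_bytes:>8} B page: {frac:.4%} overhead")
```

A real evaluation would replace the assumed sizes with numbers measured on files written with and without the statistics.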



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

