wgtmac commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1797524864


##########
src/main/thrift/parquet.thrift:
##########
@@ -1084,6 +1290,9 @@ struct ColumnIndex {
     * Same as repetition_level_histograms except for definitions levels.
     **/
    7: optional list<i64> definition_level_histograms;
+
+   /** A list containing statistics of GEOMETRY logical type for each page */
+   8: optional list<GeometryStatistics> geometry_stats;

Review Comment:
   > we might want to evaluate additional file size this data adds to make sure 
estimates are accurate for some geography data and if it is not worth writing 
by default we should consider removing it.
   
   @jiayuasu @Kontinuation Do you have any good benchmark data we could use to 
create some test Parquet files, so we can determine whether the column index is 
worth adding?
   
   Alternatively, we can remove the column index for the geometry type for now 
and add it back in a follow-up, to expedite the initial version.
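   
   To frame the size question before real benchmark files exist, here is a rough 
back-of-envelope sketch. It assumes each page's entry carries only a 2D bounding 
box of four doubles; the function name and the page size used are hypothetical, 
and the real `GeometryStatistics` struct may hold more fields, so treat the 
result as a lower bound rather than a measurement.

```python
def column_index_overhead(num_pages: int, page_bytes: int,
                          doubles_per_page: int = 4) -> float:
    """Fraction of column data size added by per-page bounding boxes.

    Assumption: each page stores `doubles_per_page` doubles (for example
    xmin, xmax, ymin, ymax) and nothing else. Thrift field headers and any
    other statistics fields are ignored, so this underestimates real cost.
    """
    if num_pages <= 0 or page_bytes <= 0:
        raise ValueError("num_pages and page_bytes must be positive")
    stats_bytes = num_pages * doubles_per_page * 8  # a double is 8 bytes
    data_bytes = num_pages * page_bytes
    return stats_bytes / data_bytes


# Hypothetical input: 1000 pages of 1 MiB each.
overhead = column_index_overhead(num_pages=1000, page_bytes=1024 * 1024)
print(f"{overhead:.6%}")
```

   Under these assumptions the ratio depends only on the page size, so small 
pages are where the extra statistics would matter most. Measuring real files 
is still needed to confirm it.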



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

