wgtmac commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1829437140
##########
src/main/thrift/parquet.thrift:
##########
@@ -237,6 +237,37 @@ struct SizeStatistics {
3: optional list<i64> definition_level_histogram;
}
+/**
+ * Bounding box of geometries in the representation of min/max value pair of
+ * coordinates from each axis.
+ */
+struct BoundingBox {
+ /** Min X value when edges = PLANAR, westmost value if edges = SPHERICAL */
+ 1: required double xmin;
+ /** Max X value when edges = PLANAR, eastmost value if edges = SPHERICAL */
+ 2: required double xmax;
+ /** Min Y value when edges = PLANAR, southmost value if edges = SPHERICAL */
+ 3: required double ymin;
+ /** Max Y value when edges = PLANAR, northmost value if edges = SPHERICAL */
+ 4: required double ymax;
+ /** Min Z value if the axis exists */
+ 5: optional double zmin;
+ /** Max Z value if the axis exists */
+ 6: optional double zmax;
+ /** Min M value if the axis exists */
+ 7: optional double mmin;
+ /** Max M value if the axis exists */
+ 8: optional double mmax;
+}
+
+/** Statistics specific to GEOMETRY logical type */
+struct GeometryStatistics {
Review Comment:
Yes, we can; I think the Iceberg proposal does this. However, we expect to
support more geometry statistics in the future, and the statistics are much
easier to use if the bbox is defined explicitly. Also, I don't think min/max
values are suitable here: geometry values should not be compared with each
other, and min/max stats require a SortOrder for the logical type.
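
To illustrate why an explicit bbox is easy to consume, here is a minimal Python sketch. It is not part of this PR: `bbox_of_points` and `may_intersect` are hypothetical helper names, the class only mirrors the field names of the proposed `BoundingBox` struct, and it assumes edges = PLANAR (wraparound for SPHERICAL edges is not handled). A reader can prune with per-axis range checks, with no ordering defined on the geometry values themselves.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Mirrors the fields of the proposed Thrift struct (illustration only)."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float
    zmin: Optional[float] = None
    zmax: Optional[float] = None
    mmin: Optional[float] = None
    mmax: Optional[float] = None


def bbox_of_points(points: Iterable[Tuple[float, float]]) -> BoundingBox:
    """Build an XY bounding box from (x, y) coordinates (hypothetical helper)."""
    pts = list(points)
    if not pts:
        raise ValueError("cannot compute a bounding box of zero points")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return BoundingBox(min(xs), max(xs), min(ys), max(ys))


def may_intersect(stats: BoundingBox, query: BoundingBox) -> bool:
    """Return False only when the two boxes cannot overlap on the X or Y axis.

    Assumes planar edges; each axis is checked independently, so no sort
    order over geometry values is needed.
    """
    return (
        stats.xmin <= query.xmax
        and query.xmin <= stats.xmax
        and stats.ymin <= query.ymax
        and query.ymin <= stats.ymax
    )


# Example: statistics for one column chunk versus two query windows.
chunk_stats = bbox_of_points([(0.0, 0.0), (2.0, 3.0), (1.0, -1.0)])
overlapping_query = BoundingBox(1.0, 4.0, 0.0, 1.0)
disjoint_query = BoundingBox(5.0, 6.0, 5.0, 6.0)
```

With these inputs, `may_intersect(chunk_stats, overlapping_query)` is true and `may_intersect(chunk_stats, disjoint_query)` is false, so a reader could skip the chunk for the second query.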