jorisvandenbossche commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1809141466
##########
src/main/thrift/parquet.thrift:
##########
@@ -380,6 +410,38 @@ struct JsonType {
struct BsonType {
}
+/** Physical type and encoding for the geometry type */
+enum GeometryEncoding {
+ /**
+ * Allowed for physical type: BYTE_ARRAY.
+ *
+ * Well-known binary (WKB) representations of geometries.
+ */
+ WKB = 0;
+}
+
+/** Interpretation for edges of elements of a GEOMETRY type */
+enum Edges {
+ PLANAR = 0;
+ SPHERICAL = 1;
+}
+
+/**
+ * GEOMETRY logical type annotation (added in 2.11.0)
+ *
+ * GeometryEncoding and Edges are required. CRS is optional.
+ *
+ * Once CRS is set, it MUST be a key to an entry in the `key_value_metadata`
+ * field of `FileMetaData`.
Review Comment:
> Why is it required that the CRS is embedded in file metadata?
Note that it is not actually _required_ to embed the CRS in the file metadata.
This is an optional field, so as a producer of Parquet files with
geospatial data, you are not required to fill it. For example, Iceberg, assuming
it already tracks the CRS elsewhere in Iceberg-specific metadata or a
manifest file, could just leave this field blank in the Parquet files themselves.
Of course, that makes those files less interoperable (but my understanding is
that Parquet files contained in an Iceberg table are generally not meant to be
read by another non-Iceberg-aware tool?).
But putting something like `iceberg.xxx` as the CRS value would also not be
great for interoperability.
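
To make the "optional, but a key when set" rule concrete, here is a minimal reader-side sketch of how I read the doc comment in this diff. It is not an API of any Parquet library: `resolve_crs`, the `geo.crs` key, and the `EPSG:4326` value are all hypothetical, and `key_value_metadata` is modeled as a plain dict.

```python
from typing import Mapping, Optional


def resolve_crs(crs_key: Optional[str],
                key_value_metadata: Mapping[str, str]) -> Optional[str]:
    """Resolve the CRS of a GEOMETRY column (hypothetical helper).

    crs_key models the optional CRS field of the logical type;
    key_value_metadata models FileMetaData.key_value_metadata as a dict.
    """
    if crs_key is None:
        # Producer left the field unset (e.g. the CRS is tracked elsewhere).
        return None
    if crs_key not in key_value_metadata:
        # Once set, the CRS must be a key into key_value_metadata.
        raise KeyError(f"CRS key {crs_key!r} not found in key_value_metadata")
    return key_value_metadata[crs_key]


# Assumed example values, for illustration only.
metadata = {"geo.crs": "EPSG:4326"}
print(resolve_crs("geo.crs", metadata))  # CRS stored in the file
print(resolve_crs(None, metadata))       # CRS left unset by the producer
```

A value like `iceberg.xxx` would hit the `KeyError` branch in any reader that only sees the Parquet file, which is the interoperability concern.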