paleolimbot commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1772566650


##########
src/main/thrift/parquet.thrift:
##########
@@ -373,6 +505,78 @@ struct JsonType {
 struct BsonType {
 }
 
+/**
+ * Geometry logical type annotation (added in 2.11.0)
+ */
+struct GeometryType {
+  /**
+   * Physical type and encoding for the geometry type.
+   * Please refer to the definition of GeometryEncoding for more detail.
+   */
+  1: required GeometryEncoding encoding;
+  /**
+   * Interpretation for edges of elements of a GEOMETRY logical type, i.e. whether
+   * the interpolation between points along an edge represents a straight cartesian
+   * line or the shortest line on the sphere.
+   * Please refer to the definition of Edges for more detail.
+   */
+  2: required EdgeInterpolation edges;
+  /**
+   * Coordinate Reference System, i.e. mapping of how coordinates refer to
+   * precise locations on earth. Writers are not required to set this field.
+   * Once crs is set, crs_encoding field below MUST be set together.
+   * For example, "OGC:CRS84" can be set in the form of PROJJSON as below:
+   * {
+   *     "$schema": "https://proj.org/schemas/v0.5/projjson.schema.json",
+   *     "type": "GeographicCRS",
+   *     "name": "WGS 84 longitude-latitude",
+   *     "datum": {
+   *         "type": "GeodeticReferenceFrame",
+   *         "name": "World Geodetic System 1984",
+   *         "ellipsoid": {
+   *             "name": "WGS 84",
+   *             "semi_major_axis": 6378137,
+   *             "inverse_flattening": 298.257223563
+   *         }
+   *     },
+   *     "coordinate_system": {
+   *         "subtype": "ellipsoidal",
+   *         "axis": [
+   *         {
+   *             "name": "Geodetic longitude",
+   *             "abbreviation": "Lon",
+   *             "direction": "east",
+   *             "unit": "degree"
+   *         },
+   *         {
+   *             "name": "Geodetic latitude",
+   *             "abbreviation": "Lat",
+   *             "direction": "north",
+   *             "unit": "degree"
+   *         }
+   *         ]
+   *     },
+   *     "id": {
+   *         "authority": "OGC",
+   *         "code": "CRS84"
+   *     }
+   * }
+   */
+  3: optional string crs;
+  /**
+   * Encoding used in the above crs field. It MUST be set if crs field is set.
+   * Currently the only allowed value is "PROJJSON".
+   */
+  4: optional string crs_encoding;

Review Comment:
   The ability to include a parameterized CRS is absolutely essential for the 
GEOMETRY type in Parquet to be useful: not all CRSes have been catalogued, and 
many can't be because they're too specific (e.g., a CRS optimized for a small 
locality or specific project, or the view of a satellite orbiting a planet) or 
too old (e.g., one of my projects with the Canadian government digitizing 
several decades of sea ice coverage where the first four decades were in a CRS 
that had never been catalogued but could be expressed in PROJJSON).
   
   The `crs_encoding` piece is to make the `crs` string unambiguous. I happen 
to think this is an improvement over many existing systems that just provide a 
string and force the reader to guess the intent; however, it is not strictly 
necessary (e.g., we could just define the CRS as a string).
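
To illustrate what an explicit `crs_encoding` buys a reader, here is a minimal Python sketch of the rule stated in the diff above: if `crs` is set, `crs_encoding` must be set too, and "PROJJSON" is currently the only allowed value. The names `parse_crs` and `ALLOWED_CRS_ENCODINGS` are hypothetical and not part of this PR or any Parquet library; the sketch only checks that the string is valid JSON, not that it is valid PROJJSON.

```python
import json
from typing import Optional

# Assumption: mirrors the comment on field 4, which says the only allowed
# value is currently "PROJJSON". This set is illustrative, not a real API.
ALLOWED_CRS_ENCODINGS = {"PROJJSON"}


def parse_crs(crs: Optional[str], crs_encoding: Optional[str]) -> Optional[dict]:
    """Return the parsed CRS document, or None when no CRS was written."""
    if crs is None:
        # Writers are not required to set crs.
        return None
    if crs_encoding is None:
        # Without the encoding the reader would have to guess the intent.
        raise ValueError("crs is set but crs_encoding is missing")
    if crs_encoding not in ALLOWED_CRS_ENCODINGS:
        raise ValueError(f"unsupported crs_encoding: {crs_encoding!r}")
    # PROJJSON is JSON, so no heuristics are needed to interpret the string.
    return json.loads(crs)
```

With a bare string and no encoding, the same reader would have to sniff whether the value is PROJJSON, WKT, or an identifier such as "OGC:CRS84".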
   
   Iceberg has a different set of use cases than Parquet. Parquet is useful to 
geospatial practitioners operating at a smaller scale who need to deal with 
these issues and want to use Parquet to do so. An identifier-based format may 
well fit the Iceberg use case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

