rdblue commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1800222339


##########
src/main/thrift/parquet.thrift:
##########
@@ -373,6 +505,78 @@ struct JsonType {
 struct BsonType {
 }
 
+/**
+ * Geometry logical type annotation (added in 2.11.0)
+ */
+struct GeometryType {
+  /**
+   * Physical type and encoding for the geometry type.
+   * Please refer to the definition of GeometryEncoding for more detail.
+   */
+  1: required GeometryEncoding encoding;
+  /**
+   * Interpretation for edges of elements of a GEOMETRY logical type, i.e. whether
+   * the interpolation between points along an edge represents a straight cartesian
+   * line or the shortest line on the sphere.
+   * Please refer to the definition of Edges for more detail.
+   */
+  2: required EdgeInterpolation edges;
+  /**
+   * Coordinate Reference System, i.e. mapping of how coordinates refer to
+   * precise locations on earth. Writers are not required to set this field.
+   * Once crs is set, crs_encoding field below MUST be set together.
+   * For example, "OGC:CRS84" can be set in the form of PROJJSON as below:
+   * {
+   *     "$schema": "https://proj.org/schemas/v0.5/projjson.schema.json",
+   *     "type": "GeographicCRS",
+   *     "name": "WGS 84 longitude-latitude",
+   *     "datum": {
+   *         "type": "GeodeticReferenceFrame",
+   *         "name": "World Geodetic System 1984",
+   *         "ellipsoid": {
+   *             "name": "WGS 84",
+   *             "semi_major_axis": 6378137,
+   *             "inverse_flattening": 298.257223563
+   *         }
+   *     },
+   *     "coordinate_system": {
+   *         "subtype": "ellipsoidal",
+   *         "axis": [
+   *         {
+   *             "name": "Geodetic longitude",
+   *             "abbreviation": "Lon",
+   *             "direction": "east",
+   *             "unit": "degree"
+   *         },
+   *         {
+   *             "name": "Geodetic latitude",
+   *             "abbreviation": "Lat",
+   *             "direction": "north",
+   *             "unit": "degree"
+   *         }
+   *         ]
+   *     },
+   *     "id": {
+   *         "authority": "OGC",
+   *         "code": "CRS84"
+   *     }
+   * }
+   */
+  3: optional string crs;
+  /**
+   * Encoding used in the above crs field. It MUST be set if crs field is set.
+   * Currently the only allowed value is "PROJJSON".
+   */
+  4: optional string crs_encoding;

Review Comment:
   I agree that the CRS is essential, but we don't need a tight coupling. I'm 
glad to see that this is now a pseudo-identifier rather than requiring each geo 
column to have a JSON blob. But I don't think that it necessarily makes sense 
to require that it is put in some specific place because there can be multiple 
options that make sense.
   
   Requiring the CRS in file metadata seems like an unnecessary cost when it 
could be stored in table metadata, for example. I would be more lenient and 
state that the CRS should be made accessible for the dataset and give examples, 
but not make hard requirements.
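
   To illustrate what "accessible for the dataset" could look like for a reader, here is a minimal sketch, not part of the proposed spec. It assumes the column's `crs` string is either a self-describing identifier such as "OGC:CRS84" or a key that may be defined in file-level or table-level metadata. The function name `resolve_crs` and the metadata mappings are hypothetical.

```python
from typing import Mapping, Optional


def resolve_crs(
    column_crs: Optional[str],
    file_metadata: Mapping[str, str],
    table_metadata: Mapping[str, str],
) -> Optional[str]:
    """Return a CRS definition for a geometry column, or None if no CRS is set.

    Hypothetical lookup order: file metadata first, then table metadata.
    If neither defines the identifier, it is returned unchanged and treated
    as self-describing (for example "OGC:CRS84").
    """
    if column_crs is None:
        # Writers are not required to set the crs field.
        return None
    for source in (file_metadata, table_metadata):
        if column_crs in source:
            return source[column_crs]
    return column_crs
```

   For example, `resolve_crs("site_crs", {}, {"site_crs": "<PROJJSON text>"})` finds the definition in table metadata, so the file itself never has to carry it.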



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

