desruisseaux commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1778425283


##########
src/main/thrift/parquet.thrift:
##########
@@ -237,6 +237,135 @@ struct SizeStatistics {
    3: optional list<i64> definition_level_histogram;
 }
 
+/**
+ * Physical type and encoding for the geometry type.
+ */
+enum GeometryEncoding {
+  /**
+   * Allowed for physical type: BYTE_ARRAY.
+   *
+   * Well-known binary (WKB) representations of geometries.
+   *
+   * To be clear, we follow the same rule of WKB and coordinate axis order from
+   * GeoParquet [1][2]. Geometries SHOULD be encoded as ISO WKB [3][4]
+   * supporting XY, XYZ, XYM, XYZM and the standard geometry types
+   * Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon,
+   * and GeometryCollection). Coordinate order is always (x, y) where x is
+   * easting or longitude and y is northing or latitude. This ordering 
explicitly
+   * overrides the axis order as specified in the CRS following the GeoPackage
+   * specification [5].
+   *
+   * This is the preferred encoding for maximum portability. It also supports
+   * GeometryStatistics to be set in the column chunk and page index.
+   *
+   * [1] 
https://github.com/opengeospatial/geoparquet/blob/v1.1.0/format-specs/geoparquet.md?plain=1#L92
+   * [2] 
https://github.com/opengeospatial/geoparquet/blob/v1.1.0/format-specs/geoparquet.md?plain=1#L155
+   * [3] https://portal.ogc.org/files/?artifact_id=18241
+   * [4] https://www.iso.org/standard/60343.html
+   * [5] https://www.geopackage.org/spec130/#gpb_spec
+   */
+  WKB = 0;
+}
+
+/**
+ * Interpretation for edges of elements of a GEOMETRY logical type. In other
+ * words, whether a point between two vertices should be interpolated in
+ * its XY dimensions as if it were a Cartesian line connecting the two
+ * vertices (planar) or the shortest spherical arc between the longitude
+ * and latitude represented by the two vertices (spherical). This value
+ * applies to all non-point geometry objects and is independent of the
+ * coordinate reference system.
+ *
+ * Because most systems currently assume planar edges and do not support
+ * spherical edges, planar should be used as the default value.
+ */
+enum EdgeInterpolation {
+  PLANAR = 0;
+  SPHERICAL = 1;

Review Comment:
   I think that the confusion may come from the name of the enumeration values. 
If I'm understanding right, the intend of `PLANAR` is "shorted path between two 
points on a plane". Problems:
   
   * It is not the only way to connect two points on a plane. We may also want 
Bézier curves, splines, etc.
   * `PLANE` is a bit restrictive as it seems to exclude the 3D space.
   
   What about renaming `PLANE` as `STRAIGHT_LINE`? It leaves room for other 
interpolation methods on the plane in the future if desired. It also adds real 
value compared to just saying that the coordinate system is Cartesian, which 
can already be inferred from the CRS.
   
   Likewise, `SPHERICAL`  and `SPHEROIDAL` could be renamed as `GEODESIC_LINE`. 
It leaves room for different interpolation methods in the future, e.g. with 
smooth changes of headings instead of using the shortest paths. This is 
equivalent, in the plane, to Spline curves featuring smooth changes of 
derivatives instead of sudden changes of direction at every points.
   
   I saw the discussion about `SPHERICAL` meaning the use of spherical formulas 
even if the CRS is ellipsoidal. But I'm not sure how it would work in practice. 
Which sphere radius should the computation use? The authalic sphere? Something 
else?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to