desruisseaux commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1829055493
##########
LogicalTypes.md:
##########
@@ -767,6 +767,188 @@ optional group my_map (MAP_KEY_VALUE) {
}
```
+## Geospatial Types
+
+### GEOMETRY
+
+`GEOMETRY` is used for geometry features from [OGC – Simple feature
access][simple-feature-access].
+See [Geospatial Notes](#geospatial-notes).
+
+The type has three type parameters:
+- `encoding`: A required enum value for annotated physical type and encoding
+ for the `GEOMETRY` type. See [Geometry
Encoding](#geometry-encoding).
+- `edges`: A required enum value for interpretation for edges of elements of
the
+ `GEOMETRY` type, i.e. whether the interpolation between points along
+ an edge represents a straight cartesian line or the shortest line on
+ the sphere. See [Edges](#edges).
+- `crs`: An optional string value for CRS (coordinate reference system), which
+ is a mapping of how coordinates refer to precise locations on earth.
+ See [Coordinate Reference System](#coordinate-reference-system).
+
+The sort order used for `GEOMETRY` is undefined. When writing data, no min/max
+statistics should be saved for this type and if such non-compliant statistics
+are found during reading, they must be ignored. Instead,
[GeometryStatistics](#geometry-statistics)
+is introduced for `GEOMETRY` type.
+
+#### Geometry Encoding
+
+Physical type and encoding for the `GEOMETRY` type. Supported values:
+- `WKB`: `GEOMETRY` type with `WKB` encoding can only be used to annotate the
+ `BYTE_ARRAY` primitive type. See [WKB](#well-known-binary-wkb).
+
+Note that geometry encoding is required for `GEOMETRY` type. In order to
correctly
+interpret geometry data, writer implementations SHOULD always set this field,
and
+reader implementations SHOULD fail for an unknown geometry encoding value.
+
+##### Well-known binary (WKB)
+
+Well-known binary (WKB) representations of geometries, see [Geospatial
Notes](#geospatial-notes).
+
+To be clear, we follow the same definitions of GeoParquet for
[WKB][geoparquet-wkb]
+and [coordinate axis order][coordinate-axis-order]:
+- Geometries SHOULD be encoded as ISO WKB supporting XY, XYZ, XYM, XYZM.
Supported
+standard geometry types: Point, LineString, Polygon, MultiPoint,
MultiLineString,
+MultiPolygon, and GeometryCollection.
+- Coordinate axis order is always (x, y) where x is easting or longitude, and
+y is northing or latitude. This ordering explicitly overrides the axis order
+as specified in the CRS following the [GeoPackage
specification][geopackage-spec].
+
+This is the preferred encoding for maximum portability.
+
+[geoparquet-wkb]:
https://github.com/opengeospatial/geoparquet/blob/v1.1.0/format-specs/geoparquet.md?plain=1#L92
+[coordinate-axis-order]:
https://github.com/opengeospatial/geoparquet/blob/v1.1.0/format-specs/geoparquet.md?plain=1#L155
+[geopackage-spec]: https://www.geopackage.org/spec130/#gpb_spec
+
+#### Edges
+
+Interpretation for edges of elements of `GEOMETRY` type. In other words, it
+specifies how a point between two vertices should be interpolated in its XY
+dimensions. Supported values and corresponding interpolation approaches are:
+- `PLANAR`: a Cartesian line connecting the two vertices.
+- `SPHERICAL`: a shortest spherical arc between the longitude and latitude
+ represented by the two vertices.
+
+This value applies to all non-point geometry objects and is independent of the
+[Coordinate Reference System](#coordinate-reference-system).
Review Comment:
> Can the edges be planar while the CRS is based on elliptic geometry?
In principle, no. First, talking about "planar edges" or "spherical edges"
makes no sense and was a confusion of terms in the initial draft of this
specification (the group agreed in recent talks to fix that; I hope it will
be done before release). An edge can be a straight line, a curve, a geodesic,
etc., but it cannot be a plane or a sphere (because of the wrong number of
dimensions).
What the initial draft intended to say with "planar edges" (sic) is _"edges
computed as if they were in a planar (two-dimensional Cartesian) coordinate
system"_ (the thing that is planar is the coordinate system, not the edges).
This is not really correct for a geographic CRS, so you are right to say that
they are not really independent. However, while it would be more exact to say
that lines on a geographic CRS are geodesics, loxodromes, etc., software often
ignores that physical reality and just performs linear interpolation of
latitude and longitude values. The line on the ellipsoid surface obtained that
way has no interesting properties; it is just easy to compute. We do not
recommend doing that, but the use of the word "planar" in this context was an
acknowledgement that it happens in practice and an attempt to describe it.
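
To make the difference concrete, here is a minimal Python sketch comparing the two interpretations for the midpoint of a single edge. It is only an illustration, not part of the specification: the function names are hypothetical, and the "spherical" case assumes a perfect unit sphere rather than an ellipsoid, so it approximates a true geodesic on a geographic CRS.

```python
import math


def planar_midpoint(lon1, lat1, lon2, lat2):
    """Midpoint by linear interpolation of longitude/latitude values.

    This treats (lon, lat) as if they were planar Cartesian coordinates,
    which is what the draft calls "planar" edges.
    """
    return ((lon1 + lon2) / 2.0, (lat1 + lat2) / 2.0)


def _to_unit_vector(lon_deg, lat_deg):
    # Convert degrees on a unit sphere to a 3D Cartesian unit vector.
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def spherical_midpoint(lon1, lat1, lon2, lat2):
    """Midpoint of the shortest great-circle arc on a sphere (assumption:
    spherical model, not an ellipsoid)."""
    a = _to_unit_vector(lon1, lat1)
    b = _to_unit_vector(lon2, lat2)
    # The normalized sum of two unit vectors bisects the arc between them.
    m = tuple(x + y for x, y in zip(a, b))
    norm = math.sqrt(sum(c * c for c in m))
    if norm < 1e-12:
        raise ValueError("antipodal points: the midpoint is not unique")
    x, y, z = (c / norm for c in m)
    return (math.degrees(math.atan2(y, x)), math.degrees(math.asin(z)))
```

For an edge from (0°, 60°N) to (90°E, 60°N), `planar_midpoint` stays on the 60°N parallel, whereas `spherical_midpoint` bulges toward the pole (a latitude of roughly 67.8°N). On the equator the two functions agree, which is one reason the discrepancy is easy to overlook.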