Hello

As we implement Geometry/Geography type support in the engines, we noticed
one gap that we failed to close when adopting these types in the V3 spec.

First, the use case:

   1. It is much easier to calculate/interpret lower and upper bounds of
   geospatial objects when using linear/Cartesian edges, rather than spherical
   edges.
   2. To properly model the earth we need wraparound bounds (allowing xmin >
   xmax to represent an object that crosses the anti-meridian).
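
To make the wraparound convention concrete, here is a minimal sketch in
Python. It assumes longitudes in degrees within [-180, 180] and treats
xmin > xmax as "the interval crosses the anti-meridian". The function names
are hypothetical and are not part of the Iceberg API or the spec.

```python
def x_contains(xmin: float, xmax: float, x: float) -> bool:
    """Return True if longitude x lies inside the x-interval.

    Assumption: xmin > xmax means the interval wraps around the
    anti-meridian, i.e. it covers [xmin, 180] and [-180, xmax].
    """
    if xmin <= xmax:
        # Ordinary interval, no wraparound.
        return xmin <= x <= xmax
    # Wraparound interval: x is either east of xmin or west of xmax.
    return x >= xmin or x <= xmax


def x_overlaps(a_min: float, a_max: float, b_min: float, b_max: float) -> bool:
    """Return True if two x-intervals share at least one longitude."""
    a_wraps = a_min > a_max
    b_wraps = b_min > b_max
    if a_wraps and b_wraps:
        # Both intervals contain the anti-meridian itself.
        return True
    if a_wraps:
        return b_max >= a_min or b_min <= a_max
    if b_wraps:
        return a_max >= b_min or a_min <= b_max
    return a_min <= b_max and b_min <= a_max
```

For example, under these assumptions the bounds (xmin=170, xmax=-170)
overlap a query window of 175..178 but not one of 0..10, which is the
behavior predicate pushdown would rely on.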


However, the spec does not allow for this use case:

   1. Wraparound bounds are allowed only for the Geography type, not for the
   Geometry type.
   2. No 'linear' edge is defined for the Geography type.

There was a long offline debate on how to support this case. The options
included:

   1. Allowing wraparound for the Geometry type for certain CRSs. The
   Iceberg library would then need to understand CRSs, and whether they
   support wraparound, when writing/interpreting bounds for predicate
   pushdown, rather than treating the CRS as just type metadata.
   2. Defining a linear edge for the Geography type. However, this is not
   common and may be confusing to users.

A compromise would be to update the format to allow "Geometry with
Wraparound" by adding a boolean that simply indicates whether the bounds
wrap around (that is, whether the objects cross the anti-meridian), so that
readers do not have to inspect the CRS.  No exact format has been proposed
yet.

In any case, all options seem to require a format version bump to V4 in the
strictest sense.  If we take this interpretation, we unfortunately cannot
support this use case until then, and we will add guards against it as we
proceed with the Geometry/Geography work in the Iceberg reference
implementation.
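
As a rough illustration of what such a guard could look like, here is a
minimal sketch in Python. The function name, the type-name strings, and the
error message are all hypothetical; the actual check in the reference
implementation may look quite different.

```python
def check_x_bounds(type_name: str, xmin: float, xmax: float) -> None:
    """Reject wraparound x-bounds for types that do not allow them.

    Assumption: only "geography" may carry bounds with xmin > xmax,
    mirroring the restriction described above. Names are illustrative.
    """
    if type_name == "geometry" and xmin > xmax:
        raise ValueError(
            f"wraparound bounds (xmin={xmin} > xmax={xmax}) "
            "are not allowed for geometry"
        )
```

With this sketch, check_x_bounds("geography", 170, -170) passes silently,
while check_x_bounds("geometry", 170, -170) raises a ValueError.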

This was discussed in https://github.com/apache/iceberg/pull/13227 and
https://github.com/apache/iceberg/pull/12667, where it was suggested to
start a DISCUSS thread on the dev list to raise awareness of this
discussion.  I apologize for my lack of deep geo knowledge, as I may
misspeak about something.  But I am curious whether this path makes sense,
or whether we should take another approach.  I'm also open to supporting
this earlier than V4 if there is consensus on the way forward and if there
is no conflicting implementation out there.

Thanks!
Szehon
