cholmes commented on issue #10260:
URL: https://github.com/apache/iceberg/issues/10260#issuecomment-2260700912

   > I agree that reducing the amount of options is desired. However, the 
argument in favour of PROJJSON is biased. It assumes that there is only two 
options: having PROJ, or having no referencing library at all.  
   
   PROJJSON is an open standard, not a referencing library. It follows the rich 
tradition of GeoJSON, GeoRSS, vector tiles, STAC, MBTiles, PMTiles, FlatGeobuf, 
Zarr, COPC and many others in that it started in the open source community 
and in real usage, and most of these have evolved to some form of formal 
standardization. Yes, PROJJSON right now only has a single implementation, but 
it is written as a JSON encoding of WKT2:2019, and the goal is for it to become 
a standard.
   
   > The third option, having a referencing library other than PROJ (e.g. ESRI, 
GeoTools, Apache SIS, Proj4J, PyCRS and more that I don't know) seems 
completely ignored. 
   
   No, that's not completely ignored - those just don't yet implement PROJJSON. 
To me the next step is to push for them to implement it, and to try to find 
funding to enable that. The twist seems to be that many don't fully implement 
WKT2:2019. If they have a WKT2 implementation, the parsing from JSON to WKT 
seems to be fairly easy - it took a day or two to do it for JavaScript. If OGC 
insists on a CRSJSON that differs too much from PROJJSON then libraries should 
be able to parse both and put them into the same WKT2:2019 data model.
   
   > A standard CRS JSON is very likely to happen, just not now. It may be a 
matter of about 2 years. This delay is the price to pay for better consistency 
with ISO 19111 and ISO 19115-4.
   
   PROJJSON is not 1.0, and can easily evolve to be completely consistent with 
how the CRS spec evolves. But we need something that works today, not two years 
from now. Like I said above, my hope is that PROJJSON can evolve to be 
consistent with CRSJSON, or even that the two merge. But if they don't manage 
to align 100% then libraries should be able to easily parse both. 
   
   >  But we are going in circles: JSON is easier to parse for non-geospatial 
libraries, but WKT is better supported by all geospatial libraries other than 
PROJ. It is not obvious to said which side is more important.
   
   If we want geospatial to have a bigger impact on the world than the size of 
the existing geospatial market, it is clear to me that being easier to parse for 
non-geospatial libraries is more important. We can't expect every 
implementation of Iceberg to include geospatial libraries, so we need a smooth 
'on-ramp' for implementors to support geospatial without understanding the 
depths of coordinate reference systems. We have a great start, with just 
focusing on OGC:CRS84. Having a next step be to just understand a few common 
CRSs by parsing JSON seems like a good way to meet people 'more than half 
way'. And then geospatial libraries can evolve to support JSON encoding of 
CRSs (PROJJSON and/or CRS JSON) - and ideally we in the geospatial community 
work out that set of recommendations. 
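
As a rough illustration of that on-ramp, here is a minimal Python sketch of how an implementation with no geospatial library might recognize a few common CRSs from a PROJJSON document, using only its `id` member (`authority` and `code`). The `KNOWN_CRS` table and the `identify_crs` helper are hypothetical names made up for this example, and the sketch assumes the document carries a top-level `id`, which PROJJSON does not require.

```python
import json

# Hypothetical allow-list of the CRSs this implementation understands.
KNOWN_CRS = {
    ("OGC", "CRS84"): "WGS 84, longitude/latitude",
    ("EPSG", "4326"): "WGS 84, latitude/longitude",
}


def identify_crs(projjson_text):
    """Return (authority, code) if the CRS is in KNOWN_CRS, else None."""
    doc = json.loads(projjson_text)
    ident = doc.get("id") if isinstance(doc, dict) else None
    if not isinstance(ident, dict):
        # No identifier: a real geospatial library would be needed here.
        return None
    # PROJJSON codes may be integers (4326) or strings ("CRS84").
    key = (str(ident.get("authority")), str(ident.get("code")))
    return key if key in KNOWN_CRS else None


example = (
    '{"type": "GeographicCRS", "name": "WGS 84 (CRS84)", '
    '"id": {"authority": "OGC", "code": "CRS84"}}'
)
print(identify_crs(example))
```

Anything that is not on the allow-list returns `None`, which is the point at which an implementation would either reject the CRS or hand it off to a full referencing library.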
   
   For now I think that bit is more important for GeoParquet, where the clear 
'native' format to use for Parquet metadata is JSON. And I think we should all 
work together to get to a path from where we are today to the two year goal - 
we are loath to do a 2.0 for GeoParquet, but we could consider it if there is 
clear consensus between the various geospatial communities on the need for a 
breaking change from PROJJSON.
   
   For Iceberg I do think the best answer is the SPATIAL_REF_SYS table. Here is 
the text from the [core spec](https://portal.ogc.org/files/?artifact_id=25354):
   
   ```
   6.1.3 Identification of Spatial Reference Systems
   
   Every Geometry Column and every geometric entity is associated with exactly 
one Spatial Reference System.
   The Spatial Reference System identifies the coordinate system for all 
geometric objects stored in the column, and
   gives meaning to the numeric coordinate values for any geometric object 
stored in the column. Examples of
   commonly used Spatial Reference Systems include “Latitude Longitude” and 
“UTM Zone 10”.
   
   The SPATIAL_REF_SYS table stores information on each Spatial Reference 
System in the database. The
   columns of this table are the Spatial Reference System Identifier (SRID), 
the Spatial Reference System Authority
   Name (AUTH_NAME), the Authority Specific Spatial Reference System Identifier 
(AUTH_SRID) and the Well-known Text description of the Spatial Reference System 
(SRTEXT). The Spatial Reference System Identifier
   (SRID) constitutes a unique integer key for a Spatial Reference System 
within a database.
   
   Interoperability between clients is achieved via the SRTEXT column which 
stores the Well-known Text
   representation for a Spatial Reference System.
   ```
   
   And there are additional details in the [PostGIS 
docs](https://postgis.net/docs/manual-1.4/ch04.html#spatial_ref_sys) and the 
[GeoPackage spec](https://www.geopackage.org/spec/#spatial_ref_sys).
   
   This allows SRIDs to be used, while including a table that maps each of 
those SRIDs to its core WKT value, and it lets users define their own. 
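
To make the SRID-to-WKT mapping concrete, here is a small sketch using Python's built-in `sqlite3` module. The column names follow the spec text quoted above; the `srtext_for` helper, the abbreviated WGS 84 WKT string, and the user-defined SRID `900001` with its placeholder text are assumptions made for illustration, not anything Iceberg defines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE spatial_ref_sys (
        srid      INTEGER NOT NULL PRIMARY KEY,
        auth_name TEXT,
        auth_srid INTEGER,
        srtext    TEXT
    )
    """
)

# Abbreviated WKT for EPSG:4326, for illustration only.
WGS84_WKT = (
    'GEOGCS["WGS 84",DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433],'
    'AUTHORITY["EPSG","4326"]]'
)

rows = [
    (4326, "EPSG", 4326, WGS84_WKT),
    # Hypothetical user-defined entry; the SRTEXT is a placeholder, not valid WKT.
    (900001, "LOCAL", 1, "<user-supplied WKT goes here>"),
]
conn.executemany("INSERT INTO spatial_ref_sys VALUES (?, ?, ?, ?)", rows)


def srtext_for(connection, srid):
    """Look up the WKT text for an SRID, or None if it is not registered."""
    row = connection.execute(
        "SELECT srtext FROM spatial_ref_sys WHERE srid = ?", (srid,)
    ).fetchone()
    return row[0] if row else None


print(srtext_for(conn, 4326))
```

A geometry column then only needs to carry the integer SRID, and any client that wants the full definition resolves it through this table.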
   
   I think this means that core Iceberg should not need to know PROJJSON. I do 
still believe PROJJSON is the best choice for GeoParquet and Parquet, and we 
can continue to work together to figure out the best approach there so the 
entire ecosystem works well.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

