paleolimbot opened a new issue, #7240: URL: https://github.com/apache/arrow-rs/issues/7240
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

I'd like to be able to read and/or write Parquet files with the new GEOMETRY and GEOGRAPHY types!

- Spec references: https://github.com/apache/parquet-format/blob/master/Geospatial.md + https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L240-L261
- C++ implementation PR: https://github.com/apache/arrow/pull/45459
- Java implementation PR: https://github.com/apache/parquet-java/pull/2971
- Test files: https://github.com/apache/parquet-testing/pull/70 (and a few bigger ones at https://github.com/geoarrow/geoarrow-data )

**Describe the solution you'd like**

Support for read and/or write (perhaps read first and then write).

**Describe alternatives you've considered**

**Additional context**

I think the main issue is what Arrow type to read into. The Parquet types have type-level metadata (a coordinate reference system and edge interpolation for geography) which can be propagated via the `geoarrow.wkb` extension type ( https://github.com/geoarrow/geoarrow/blob/main/extension-types.md#extension-metadata ). The most complicated mapping scenario looks something like:

Parquet: `GEOGRAPHY(crs=projjson:some_file_metadata_field, algorithm=spherical)` -> Arrow: `geoarrow.wkb` + `{"crs": {<the actual projjson>}, "edges": "spherical"}`

(The fact that the Parquet spec "recommends" putting the actual PROJJSON into the file metadata is something I tried to discourage when negotiating the spec change but was not ultimately successful).

I haven't looked at the existing type mapping code but I think I remember reading the recent `ExtensionType` change was followed up with the ability for field metadata to be inspected/generated on the way in/out of Parquet to ensure that metadata is propagated wherever possible. Right now GeoArrow extension types are listed as "community extension types", which I believe was a category made up just for us.
It may be that moving/voting `geoarrow.wkb` to the "canonical extension type" category is a precursor to finalizing this implementation, which is definitely fair 🙂. I'm happy to attempt this when I get a chance (unless @kylebarron is chomping at the bit to do it or has already done it!).
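
For illustration, the "most complicated mapping scenario" described above could be sketched roughly as follows. This is a std-only sketch, not the `parquet` or `arrow` crate API: `GeographyType`, `resolve_crs`, `json_string`, and `arrow_field_metadata` are hypothetical names invented here, and lowercasing the algorithm name to get the `edges` value is an assumption. The only pieces taken from existing conventions are the `projjson:<key>` reference into the file key/value metadata and the `ARROW:extension:name` / `ARROW:extension:metadata` field metadata keys.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a Parquet GEOGRAPHY logical type.
struct GeographyType {
    crs: Option<String>,       // e.g. "projjson:some_file_metadata_field"
    algorithm: Option<String>, // e.g. "spherical"
}

/// Encode a Rust string as a JSON string literal (minimal escaping).
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Resolve a `projjson:<key>` reference against the file key/value metadata.
/// Returns a JSON value ready to embed: the stored PROJJSON verbatim when the
/// key is found, otherwise the original CRS string as a JSON string.
fn resolve_crs(crs: &str, file_metadata: &HashMap<String, String>) -> String {
    if let Some(key) = crs.strip_prefix("projjson:") {
        if let Some(projjson) = file_metadata.get(key) {
            return projjson.clone();
        }
    }
    json_string(crs)
}

/// Build the Arrow field metadata that marks a binary column as `geoarrow.wkb`
/// and carries the CRS and edge interpolation along with it.
fn arrow_field_metadata(
    ty: &GeographyType,
    file_metadata: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut parts = Vec::new();
    if let Some(crs) = &ty.crs {
        parts.push(format!("\"crs\": {}", resolve_crs(crs, file_metadata)));
    }
    if let Some(algorithm) = &ty.algorithm {
        // Assumption: the GeoArrow "edges" value is the lowercased algorithm name.
        parts.push(format!("\"edges\": {}", json_string(&algorithm.to_lowercase())));
    }
    let extension_metadata = format!("{{{}}}", parts.join(", "));
    HashMap::from([
        ("ARROW:extension:name".to_string(), "geoarrow.wkb".to_string()),
        ("ARROW:extension:metadata".to_string(), extension_metadata),
    ])
}
```

The point of the sketch is that the reader cannot build the Arrow field from the column's logical type alone: it also needs access to the file-level key/value metadata to inline the PROJJSON, which is what makes this scenario the awkward one.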
