BlakeOrth commented on code in PR #8801:
URL: https://github.com/apache/arrow-rs/pull/8801#discussion_r2528484583


##########
parquet-geospatial/src/types.rs:
##########
@@ -0,0 +1,210 @@
+use std::sync::Arc;
+
+use arrow::array::{Array, ArrayRef, BinaryArray, BinaryViewArray, 
LargeBinaryArray, make_array};
+use arrow::buffer::NullBuffer;
+use arrow::error::Result;
+use arrow_schema::Field;
+use arrow_schema::{ArrowError, DataType, extension::ExtensionType};
+use serde::{Deserialize, Serialize};
+
+#[derive(Copy, Clone, Debug, Serialize, Deserialize)]
+pub enum Hint {
+    Geometry,
+    Geography,
+}
+
+#[derive(Clone, Debug, Default, Serialize, Deserialize)]
+pub struct Metadata {
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub crs: Option<String>, // TODO: explore when this is valid JSON to avoid 
double escaping
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub algorithm: Option<String>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    type_hint: Option<Hint>,
+}

Review Comment:
   Right, the `type_hint` (regardless of whether or not this would be tolerated 
by existing parsers) more or less makes this metadata neither Parquet nor 
GeoArrow. I think the more philosophical question of "should this metadata be 
Parquet or GeoArrow" is important here. I'll preface this by saying I don't 
really feel too strongly about this and my primary goal is to make sure this is 
usable for the community at large, however, I'll at least detail my thought 
process behind it.
   
   I think even without the `type_hint` the metadata we present here is already 
not really GeoArrow metadata, in spite of it generally looking compatible. 
Since cases like `{"crs": "projjson:my_crs_metadata_key"}` don't actually 
communicate the CRS without user (or other library, likely `geoarrow-rs`) 
intervention there's already inherently some additional inspection of this 
metdata necessary to convert it to GeoArrow. If users of the metadata are going 
to need to inspect and potentially convert it anyway, we may as well allow the 
metadata to support all the cases the underlying parquet spec supports.
   
   Now, all that being said, I think we're in agreement that the case this 
supports isn't a huge concern. If it makes most sense from an ecosystem 
compatibility perspective to discard the `type_hint` and accept that we 
technically have an unsupported case I think that's likely the strongest 
justification for how to move forward here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to