alamb commented on PR #8225:
URL: https://github.com/apache/arrow-rs/pull/8225#issuecomment-3304190280

   > Thank you all for taking a look!
   > 
   > > I think it would be really nice if users who didn't need to read geometry types could avoid paying the cost of that support (e.g. keep their binary and code size down).
   > 
   > I think that this is a great idea for the Arrow reader and writer (e.g., conversion to/from GeoArrow and writing statistics); however, can we at least provide access to the type annotation and statistics here? It seemed like that's where this PR was headed and I don't think the overhead of that is particularly onerous (I know I'm new here though!).
   > 
   > As a concrete target, I want to use this PR to prune row groups here:
   > 
   > https://github.com/apache/sedona-db/blob/653ab44bdd2923b5c395828f93de7fc3085ff6c2/rust/sedona-geoparquet/src/file_opener.rs#L186-L195
   > 
   > This is a place where I already have access to all the things I need (e.g., ParquetMetadata, file key/value metadata) and I don't really want or need that to be done for me. All I need is for `row_group_metadata.column(j).statistics()` to let me look at GeoStatistics.
   
   For sure. If there are types that make sense to add to the main parquet crate (and always compile), that sounds good to me. Reasonable Rust structures for bounding box statistics sound like they could fit this model.
   
   What I am trying to avoid is adding more code and binary size for users of the parquet crate who aren't going to use geometry types, unless they opt in.
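
To make the split concrete, here is a rough sketch of what "always compiled" versus "opt in" might look like. Everything in it is an assumption for illustration: `BoundingBox`, `prune_row_groups`, and the `geospatial` feature name are hypothetical and are not the parquet crate's actual API. It also ignores details such as longitude wraparound.

```rust
/// Hypothetical, always-compiled bounding box for a geometry column chunk.
/// The name and fields are illustrative, not the parquet crate's API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BoundingBox {
    pub xmin: f64,
    pub xmax: f64,
    pub ymin: f64,
    pub ymax: f64,
}

impl BoundingBox {
    /// True when the two boxes share at least one point (edges count).
    /// Assumes xmin <= xmax and ymin <= ymax, i.e. no wraparound boxes.
    pub fn intersects(&self, other: &BoundingBox) -> bool {
        self.xmin <= other.xmax
            && other.xmin <= self.xmax
            && self.ymin <= other.ymax
            && other.ymin <= self.ymax
    }
}

/// Returns the indices of row groups that may contain matching rows.
/// `None` means no statistics were written, so that group must be kept.
pub fn prune_row_groups(stats: &[Option<BoundingBox>], query: &BoundingBox) -> Vec<usize> {
    stats
        .iter()
        .enumerate()
        .filter(|(_, bbox)| bbox.as_ref().map_or(true, |b| b.intersects(query)))
        .map(|(i, _)| i)
        .collect()
}

/// Heavier functionality (e.g. GeoArrow conversion) would sit behind an
/// opt-in Cargo feature; "geospatial" is an assumed feature name.
#[cfg(feature = "geospatial")]
pub mod geo_convert {
    // Conversion code would be compiled only when the feature is enabled.
}
```

With something like this, a caller that already holds the metadata could prune row groups using only the small plain struct, while the conversion code stays out of the binary unless the feature is enabled.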


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
