alamb opened a new issue, #8479: URL: https://github.com/apache/arrow-rs/issues/8479
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

The Parquet type system includes LogicalTypes without a direct Arrow equivalent, such as JSON, Variant, and UUID.

However, Arrow includes the idea of "Extension" types that add extra semantics to an existing Arrow physical type, and the arrow-rs parquet reader will automatically map the relevant Parquet types to a canonical Arrow extension type if the `arrow_canonical_extension_types` feature is set.

Right now that mapping of Parquet LogicalType --> Arrow (Canonical) ExtensionType is hard coded, which is unfortunate as it means:

1. Users cannot override the mapping (if they want to write their own implementation of Parquet LogicalTypes, for example)
2. The code has a bunch of `#[cfg(...)]` sprinkled in it -- see https://github.com/apache/arrow-rs/pull/8409 for an example

**Describe the solution you'd like**

@paleolimbot suggested on https://github.com/apache/arrow-rs/pull/8409/files#r2371071848 that we could maintain some sort of registry that was more ergonomic to configure and would allow user-defined extension types.

**Describe alternatives you've considered**

Quoting @paleolimbot on https://github.com/apache/arrow-rs/pull/8409/files#r2371071848:

> you could also consider an injection approach like:
>
> ```rust
> pub trait ParquetArrowExtension {
>     fn try_from_logical_type(&self, arrow_field: Field, logical_type: &LogicalType) -> Result<Option<Field>>;
>     fn try_to_logical_type(&self, field: &Field) -> Result<Option<LogicalType>>;
> }
> ```
>
> ...and maintain a registry of those in the reader/writer options. Then you don't need compile time flags to support the extensions (something like DataFusion or a derivative could wire it all together at runtime).

**Additional context**

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
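
For illustration only, here is a minimal sketch of how the registry idea quoted in the issue might look. It is not arrow-rs code. `Field` and `LogicalType` are simplified stand-ins for the real `arrow_schema::Field` and `parquet::basic::LogicalType`, and `ExtensionRegistry`, `UuidExtension`, `apply`, and `logical_type_for` are hypothetical names invented for this example. The sketch assumes extensions are tried in registration order and the first match wins, which the issue does not specify.

```rust
use std::collections::HashMap;

/// Simplified stand-in for `parquet::basic::LogicalType` (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum LogicalType {
    Json,
    Uuid,
}

/// Simplified stand-in for `arrow_schema::Field` (assumption): a name plus metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub metadata: HashMap<String, String>,
}

pub type Result<T> = std::result::Result<T, String>;

/// Metadata key Arrow uses to record an extension type name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// The trait from the quoted suggestion, with parameter names filled in.
pub trait ParquetArrowExtension {
    /// Return `Some(field)` if this extension handles `logical_type`, else `None`.
    fn try_from_logical_type(&self, arrow_field: Field, logical_type: &LogicalType) -> Result<Option<Field>>;
    /// Return `Some(logical_type)` if this extension recognizes `field`, else `None`.
    fn try_to_logical_type(&self, field: &Field) -> Result<Option<LogicalType>>;
}

/// Hypothetical extension mapping Parquet UUID <-> the canonical `arrow.uuid` name.
pub struct UuidExtension;

impl ParquetArrowExtension for UuidExtension {
    fn try_from_logical_type(&self, mut arrow_field: Field, logical_type: &LogicalType) -> Result<Option<Field>> {
        if *logical_type != LogicalType::Uuid {
            return Ok(None);
        }
        arrow_field
            .metadata
            .insert(EXTENSION_NAME_KEY.to_string(), "arrow.uuid".to_string());
        Ok(Some(arrow_field))
    }

    fn try_to_logical_type(&self, field: &Field) -> Result<Option<LogicalType>> {
        match field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str) {
            Some("arrow.uuid") => Ok(Some(LogicalType::Uuid)),
            _ => Ok(None),
        }
    }
}

/// Hypothetical runtime registry that reader/writer options could hold.
#[derive(Default)]
pub struct ExtensionRegistry {
    extensions: Vec<Box<dyn ParquetArrowExtension>>,
}

impl ExtensionRegistry {
    pub fn register(&mut self, ext: Box<dyn ParquetArrowExtension>) {
        self.extensions.push(ext);
    }

    /// Reader side: the first extension that claims the logical type wins;
    /// if none does, the field is returned unchanged.
    pub fn apply(&self, field: Field, logical_type: &LogicalType) -> Result<Field> {
        for ext in &self.extensions {
            // The trait takes the field by value, so each attempt gets its own copy.
            if let Some(mapped) = ext.try_from_logical_type(field.clone(), logical_type)? {
                return Ok(mapped);
            }
        }
        Ok(field)
    }

    /// Writer side: ask each extension in turn for a matching logical type.
    pub fn logical_type_for(&self, field: &Field) -> Result<Option<LogicalType>> {
        for ext in &self.extensions {
            if let Some(logical_type) = ext.try_to_logical_type(field)? {
                return Ok(Some(logical_type));
            }
        }
        Ok(None)
    }
}
```

With something along these lines, a downstream project could call `registry.register(Box::new(UuidExtension))` at runtime instead of enabling a compile-time feature, and could register its own implementation ahead of the defaults to override a mapping. Whether overrides should be order-based or keyed by logical type is a design question the issue leaves open.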
