paleolimbot opened a new issue, #7982: URL: https://github.com/apache/arrow-rs/issues/7982
**Describe the bug**

The representation of Dictionary in the data types enum seems to exclude field metadata, so extension types are dropped when they go through arrow-rs structures:

https://github.com/apache/arrow-rs/blob/a7f3ba8f3a748243af1575bce8d50dfc6a81ab73/arrow-schema/src/datatype.rs#L359

The definitions of RunEndEncoded and others seem to use a `FieldRef`, and I'm wondering whether it was a deliberate choice not to do this for Dictionary or whether it has just never come up.

**To Reproduce**

I used arro3 to reproduce:

```python
import arro3.core as a3
import geoarrow.pyarrow as ga
import nanoarrow as na
import pyarrow as pa

c_schema = na.c_schema(pa.dictionary(pa.int32(), ga.wkb()))
c_schema.metadata is None
#> True
c_schema.dictionary.metadata
#> <nanoarrow._schema.SchemaMetadata>
#> - b'ARROW:extension:name': b'geoarrow.wkb'
#> - b'ARROW:extension:metadata': b'{}'

c_schema2 = na.c_schema(a3.DataType.dictionary(pa.int32(), ga.wkb()))
c_schema2.metadata is None
#> True
c_schema2.dictionary.metadata is None
#> True
```

**Expected behavior**

I would have expected the metadata to roundtrip through the arrow-rs data type representation.

**Additional context**

Occasionally, Parquet readers return dictionary-encoded arrays whose representation is not entirely under the user's control.
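
The structural difference described in this report can be illustrated with a small stand-alone model. This is only a sketch in plain Python, not the arrow-rs or pyarrow API: `Field`, `DictionaryType`, `RunEndEncodedType`, `dictionary_from_field` and `value_metadata` are hypothetical names, and the sketch assumes that a dictionary type stores only a bare value type while a run-end-encoded type stores whole fields.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


# Hypothetical stand-ins for illustration only; not arrow-rs or pyarrow types.
@dataclass
class Field:
    name: str
    data_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class DictionaryType:
    key_type: str
    value_type: str  # bare type: no slot for field metadata


@dataclass
class RunEndEncodedType:
    run_ends: Field
    values: Field  # whole field: metadata travels with it


def dictionary_from_field(key_type: str, values: Field) -> DictionaryType:
    # Only the data type of the values fits into this model,
    # so any extension metadata on the field is dropped here.
    return DictionaryType(key_type, values.data_type)


def value_metadata(data_type) -> Optional[Dict[str, str]]:
    """Return the metadata that survives on the values side, if any."""
    if isinstance(data_type, RunEndEncodedType):
        return data_type.values.metadata
    return None  # DictionaryType has nowhere to keep it


wkb = Field(
    "values",
    "binary",
    {"ARROW:extension:name": "geoarrow.wkb", "ARROW:extension:metadata": "{}"},
)

dict_type = dictionary_from_field("int32", wkb)
ree_type = RunEndEncodedType(Field("run_ends", "int32"), wkb)

print(value_metadata(dict_type))  # None: the extension name is lost
print(value_metadata(ree_type))   # the extension metadata is still attached
```

In this toy model, the only way to keep the extension name on a dictionary's values would be to store a field instead of a bare type, which mirrors the `FieldRef` question raised above.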