Are you able to share your code, particularly how you build your ArrowWriterProperties?
The Arrow Schema, and therefore the field-level metadata, is stored in the Parquet file as an opaque blob. It is opaque in the sense that the standard Parquet tools cannot interpret it, so you'll have to read it with an Arrow-aware tool such as Arrow C++ or PyArrow. parquet-tools can at least tell you whether the Arrow Schema has been stored: look in the key_value_metadata list for an entry with the special name "ARROW:schema".

However, I believe the default behavior of the Arrow C++ Parquet implementation is to not store the Arrow Schema, so you'll have to opt in by enabling store_schema [1] to get what you want.

[1] https://arrow.apache.org/docs/cpp/parquet.html#writetable

On Mon, Jan 6, 2025 at 12:31 PM Andrew Bell <[email protected]> wrote:
>
> Hi,
>
> I'm creating a Parquet file with a writer (a FileWriter based on a
> ParquetFileWriter). The writer is created using a Schema and the
> Schema itself was created from a list of Fields. Each of the fields
> contains metadata and the schema itself also contains metadata. When I
> examine the output of the file with `parquet-tools inspect --detail`
> it shows the Schema metadata, but no field metadata.
>
> I'm trying to figure out if the field metadata is being written or if
> this is just an issue with seeing the data using the `parquet-tools`
> program. Do I have to do something special to get metadata associated
> with schema fields written to a parquet file? Or do I need to use some
> other command to see field-level metadata?
>
> Thanks,
>
> --
> Andrew Bell
> [email protected]
