etseidl opened a new issue, #9863:
URL: https://github.com/apache/arrow-rs/issues/9863

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   The `Compression` enum in the parquet crate currently serves two purposes. 
First, it is part of the Parquet Thrift schema and is used for informational 
purposes in the metadata. In this role, it is a stand-in for the actual Thrift 
`CompressionCodec`.
   
   Its second purpose is as a configuration parameter that controls the writing 
of Parquet pages. In this role, it is desirable for it to carry extra 
information to fine-tune the compression codecs in use. But as more information 
is added, the memory burden on the metadata increases.
   
   **Describe the solution you'd like**
   I propose splitting this enum into two. `Compression` can move to 
`file::properties` for use in configuration. A new `CompressionCodec` that can 
make use of the thrift macros would be created in `basic`.
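
A minimal sketch of what the split might look like, for discussion only. The variant names and discriminants follow the Thrift `CompressionCodec` definition; everything else (the plain integer level fields, the `From` conversion, the derives) is an assumption for illustration and not the crate's actual API. The real writer-side enum would presumably keep its existing level types.

```rust
// Hypothetical sketch: both enums live in one file here, whereas the proposal
// places `CompressionCodec` in `basic` and `Compression` in `file::properties`.

/// Metadata-side codec: a fieldless enum mirroring the Thrift `CompressionCodec`.
#[allow(non_camel_case_types)]
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompressionCodec {
    UNCOMPRESSED = 0,
    SNAPPY = 1,
    GZIP = 2,
    LZO = 3,
    BROTLI = 4,
    LZ4 = 5,
    ZSTD = 6,
    LZ4_RAW = 7,
}

/// Writer-side configuration: free to carry tuning parameters.
/// The integer level fields are placeholders (assumption).
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    UNCOMPRESSED,
    SNAPPY,
    GZIP { level: u32 },
    LZO,
    BROTLI { level: u32 },
    LZ4,
    ZSTD { level: i32 },
    LZ4_RAW,
}

/// Dropping the tuning parameters yields the value recorded in the metadata.
impl From<Compression> for CompressionCodec {
    fn from(c: Compression) -> Self {
        match c {
            Compression::UNCOMPRESSED => CompressionCodec::UNCOMPRESSED,
            Compression::SNAPPY => CompressionCodec::SNAPPY,
            Compression::GZIP { .. } => CompressionCodec::GZIP,
            Compression::LZO => CompressionCodec::LZO,
            Compression::BROTLI { .. } => CompressionCodec::BROTLI,
            Compression::LZ4 => CompressionCodec::LZ4,
            Compression::ZSTD { .. } => CompressionCodec::ZSTD,
            Compression::LZ4_RAW => CompressionCodec::LZ4_RAW,
        }
    }
}
```

With this shape, the metadata stores a one-byte codec tag regardless of how many tuning options the writer configuration grows, and the writer converts its setting with `CompressionCodec::from(...)` when it records a column chunk.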
   
   **Describe alternatives you've considered**
   
   **Additional context**
   This would be a breaking API change, but I think it would make adding new 
features to the writer easier while simplifying the parsing and representation 
of the Parquet metadata.
   
   I thought of this while evaluating #9807 and #9367.
   

