lilianm opened a new issue, #8526:
URL: https://github.com/apache/arrow-rs/issues/8526

   **Describe the bug**
   High memory use when writing compressed pages.
   
   **To Reproduce**
   Write compressed pages with the Parquet v1 data page format.
   
   **Expected behavior**
   Lower memory use.
   
   **Additional context**
   
   The buffer is not shrunk after compression, and converting it to `Bytes` does not change the memory layout, so the unused capacity stays allocated.
   
   
https://github.com/apache/arrow-rs/blob/b9c2bf73e792e7cb849f0bd453059ceef45b0b74/parquet/src/column/writer/mod.rs#L1074-L1076
   
   In my case an uncompressed page is ~1M and ~20k after compression, so a lot of
memory is wasted.
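
   The effect can be reproduced with the standard library alone. The following is a minimal sketch, not the arrow-rs code: `fake_compress` and `compress_page` are hypothetical stand-ins, and the sizes only mirror the ~1M / ~20k figures above. It assumes the output buffer is allocated with the uncompressed size as its capacity, and shows that the capacity stays at that size unless `Vec::shrink_to_fit` is called.

   ```rust
   /// Hypothetical stand-in for a real codec: copies the first
   /// `compressed_len` bytes of `input` into `out`.
   fn fake_compress(input: &[u8], out: &mut Vec<u8>, compressed_len: usize) {
       out.extend(input.iter().take(compressed_len));
   }

   /// Hypothetical page-compression helper. The output buffer is sized for
   /// the uncompressed page (an assumption about the writer), then
   /// optionally shrunk to the compressed length.
   fn compress_page(input: &[u8], compressed_len: usize, shrink: bool) -> Vec<u8> {
       let mut out = Vec::with_capacity(input.len());
       fake_compress(input, &mut out, compressed_len);
       if shrink {
           // Release the unused tail of the allocation (the allocator may
           // still keep a little slack).
           out.shrink_to_fit();
       }
       out
   }

   /// Bytes allocated but not used by the buffer.
   fn wasted_bytes(buf: &Vec<u8>) -> usize {
       buf.capacity() - buf.len()
   }
   ```

   Calling `compress_page(&page, 20 * 1024, false)` on a 1 MiB `page` returns a buffer whose length is 20 KiB but whose capacity is still at least 1 MiB; with `shrink` set to `true` the capacity drops to roughly the compressed length.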
   
   Before the change:
   ```
   (3,639,172,424B) 0x41C901F: parquet::column::writer::GenericColumnWriter<E>::add_data_page (mod.rs:1070)
   (876,240,384B) 0x41D2F36: parquet::column::writer::GenericColumnWriter<E>::add_data_page (mod.rs:1070)
   ```
   This is for an output file of ~550M.
   
   
   

