adamreeve commented on PR #46972:
URL: https://github.com/apache/arrow/pull/46972#issuecomment-3043538730

   Right OK that makes sense. I'm testing with plain encoded float columns so 
hadn't noticed that but yes it might be a good idea to also change the 
decompression buffers then.
   
   I did some rough benchmarks with `/usr/bin/time -v` and get the following 
results for my test case (what's described in #46971 but reading all row 
groups), taking the best out of three runs:
   
   |  | System allocator | mimalloc | jemalloc |
   | --- | --- | --- | --- |
   | Baseline time (s) | 6.92 |  6.89 | 6.68 |
   | Time with temp decrypt buffers (s) | 8.23 | 6.00 | 6.42 |
   | Time with temp decrypt and decompress buffers (s) | 6.50 |  6.08 | 6.59 |
   
   
   | | System allocator | mimalloc | jemalloc |
   | --- | --- | --- | --- |
   | Baseline max RSS (MB) | 1,556 | 1,550 | 1,128 |
   | Max RSS with temp decrypt buffers (MB) | 894 | 891 | 627 |
   | Max RSS with temp decrypt and decompress buffers (MB) | 1,823 |890 | 629 |
   
   The behaviour with mimalloc and jemalloc looks good, but the results with 
the system allocator are quite concerning. The max RSS decreases if using 
temporary decryption buffers, but actually increases quite significantly when 
also using temporary decompression buffers. I'm not sure why that would be, 
maybe this causes more memory fragmentation? (C++ heap memory management is not 
something I know a lot about...). There is also a noticeable slow-down in the 
temporary decryption buffer case with the system allocator.
   
   Maybe this is acceptable, given most users will be using mimalloc?
   
   I also tested with unencrypted data:
   
   |  | System allocator | mimalloc | jemalloc |
   | --- | --- | --- | --- |
   | Baseline time (s) | 4.07 | 4.09  | 3.93 |
   | Time with temp decompress buffers (s) | 4.08 | 4.25  | 3.98 |
   
   | | System allocator | mimalloc | jemalloc |
   | --- | --- | --- | --- |
   | Baseline max RSS (MB) | 884 | 895 | 627 |
   | Max RSS with temp decompress buffers (MB) | 954 | 913 | 660 |
   
   Based on these benchmarks alone, maybe only the decryption buffers should be 
temporary. But I've only tested with plain float data. I'll look into testing 
with more data types and encodings.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to