Ext3h commented on code in PR #48192:
URL: https://github.com/apache/arrow/pull/48192#discussion_r2559342235


##########
cpp/src/arrow/util/compression_zstd.cc:
##########
@@ -187,9 +202,14 @@ class ZSTDCodec : public Codec {
       DCHECK_EQ(output_buffer_len, 0);
       output_buffer = &empty_buffer;
     }
-
-    size_t ret = ZSTD_decompress(output_buffer, static_cast<size_t>(output_buffer_len),
-                                 input, static_cast<size_t>(input_len));
+    // Decompression context for ZSTD contains several large heap allocations.

Review Comment:
   IIRC 5-10MB in total. That is enough to hurt performance with small blocks 
(e.g. Parquet with 8kB row groups), both through memory management overhead and 
cache thrashing, but not enough to matter for the total memory footprint.
   
   I would have liked to route those allocations through the Arrow default 
memory pool for proper tracing, but ZSTD only exposes custom allocators in its 
static-linking-only API.
   
   I deliberately avoided managing a pool per instance, on the assumption that 
there may be many instances of this class, more than there are threads in the 
thread pools.
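
For illustration only, here is a minimal sketch of the reuse pattern the comment describes, assuming the context is held once per thread and shared by every codec instance running on that thread. It is not the PR's code: `FakeDCtx`, `ThreadLocalContext`, `FakeCodec` and `g_contexts_created` are hypothetical stand-ins so the example builds without zstd. In real code the context would be a `ZSTD_DCtx` created with `ZSTD_createDCtx()`, used via `ZSTD_decompressDCtx()`, and released with `ZSTD_freeDCtx()`.

```cpp
#include <atomic>
#include <memory>

// Counts how many contexts have been constructed, purely for demonstration.
std::atomic<int> g_contexts_created{0};

// Hypothetical stand-in for an expensive decompression context
// (the role ZSTD_DCtx plays in zstd).
struct FakeDCtx {
  FakeDCtx() { ++g_contexts_created; }
};

// Returns the calling thread's context, creating it on first use.
// The context lives until the thread exits, so its allocations are paid
// once per thread rather than once per call or once per codec instance.
FakeDCtx& ThreadLocalContext() {
  thread_local std::unique_ptr<FakeDCtx> ctx;
  if (!ctx) {
    ctx = std::make_unique<FakeDCtx>();
  }
  return *ctx;
}

// Hypothetical codec: it holds no context of its own, so any number of
// instances can exist without multiplying the per-context allocations.
class FakeCodec {
 public:
  // Real code would pass the context to ZSTD_decompressDCtx() here;
  // this sketch just reports which context was used.
  const FakeDCtx* Decompress() const { return &ThreadLocalContext(); }
};
```

With this layout the number of live contexts is bounded by the number of threads that have decompressed at least once, regardless of how many codec objects exist.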


