Ext3h commented on code in PR #48192:
URL: https://github.com/apache/arrow/pull/48192#discussion_r2559488106


##########
cpp/src/arrow/util/compression_zstd.cc:
##########
@@ -187,9 +202,14 @@ class ZSTDCodec : public Codec {
       DCHECK_EQ(output_buffer_len, 0);
       output_buffer = &empty_buffer;
     }
-
-    size_t ret = ZSTD_decompress(output_buffer, 
static_cast<size_t>(output_buffer_len),
-                                 input, static_cast<size_t>(input_len));
+    // Decompression context for ZSTD contains several large heap allocations.

Review Comment:
   Unsure about that - the problematic free-threaded case there is the use of 
thread pools within feather/ipc. They'd need a `thread_local` like patterns in 
any case. Which means instead of one central `thread_local` there would simply 
be one at 3+ code locations instead.
   
   Exposing the context for use with Parquet would require exposing it all the 
way out to `parquet::WriterProperties::Builder` - and then you'd possibly still 
end up with multiple writer instances wrongly sharing a context, rendering 
threading of those writers suddenly impossible. If anything you'd need to 
export a threading aware "context pool" rather than a context, but that would 
be equal to reinventing `thread_local` except worse in terms of cache locality 
and undesirable extra synchronization primitives.
   
   The Rust implementation did not encounter those issues as there is no 
sharing of the context permitted in the first place due to being constrained by 
language features. And correspondingly also no aggressive threading using a 
(potentially) shared state.
   
   Ultimately, having exactly one cached context per thread for the single shot 
compression/decompression API is the recommended usage pattern from the ZSTD 
maintainers, and aligns best with the available API:
   ```c++
   /*= Decompression context
    *  When decompressing many times,
    *  it is recommended to allocate a context only once,
    *  and reuse it for each successive compression operation.
    *  This will make workload friendlier for system's memory.
    *  Use one context per thread for parallel execution. */
   typedef struct ZSTD_DCtx_s ZSTD_DCtx;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to