[ https://issues.apache.org/jira/browse/KAFKA-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217623#comment-17217623 ]
Robert Wagner commented on KAFKA-10470:
---------------------------------------

[zstd-jni version 1.4.5-7|https://github.com/luben/zstd-jni/releases/tag/v1.4.5-7] has been released. It internally implements a BufferPool to re-use the decompression buffer without changing its API. This fixes the GC pressure issue but doesn't address [~yuzawa-san]'s concern about JNI boundary crossings.

My initial test with this updated dependency shows memory allocations and GC pressure comparable to gzip. Overall throughput with small message batches still seems to lag behind all other codecs.

> zstd decompression with small batches is slow and causes excessive GC
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-10470
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10470
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Robert Wagner
>            Priority: Major
>
> Similar to KAFKA-5150 but for zstd instead of LZ4, it appears that a large
> decompression buffer (128 KB) created by zstd-jni per batch is causing a
> significant performance bottleneck.
> The next upcoming version of zstd-jni (1.4.5-7) will have a new constructor
> for ZstdInputStream that allows the client to pass its own buffer. A fix
> similar to [PR #2967|https://github.com/apache/kafka/pull/2967] could be used to
> have the ZstdConstructor use a BufferSupplier to re-use the decompression
> buffer.
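
For readers unfamiliar with the buffer re-use idea discussed in this issue, here is a minimal sketch of it in plain Java. The class name RecyclingBufferSketch and its methods are hypothetical and made up for illustration; this is not Kafka's BufferSupplier or zstd-jni's BufferPool, and it is not thread-safe. It only shows why handing a buffer back after each batch avoids allocating a fresh 128 KB array per batch.

```java
import java.util.ArrayDeque;

/**
 * Hypothetical illustration only: a tiny single-threaded buffer recycler.
 * It is not the Kafka or zstd-jni implementation.
 */
public class RecyclingBufferSketch {
    private final int bufferSize;
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private int allocations = 0;

    public RecyclingBufferSketch(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Returns a previously released buffer if one is cached, else allocates a new one. */
    public byte[] get() {
        byte[] cached = free.pollFirst();
        if (cached != null) {
            return cached;
        }
        allocations++;
        return new byte[bufferSize];
    }

    /** Hands a buffer back so that the next get() can re-use it. */
    public void release(byte[] buffer) {
        // Only keep buffers of the expected size; anything else is left to the GC.
        if (buffer.length == bufferSize) {
            free.addFirst(buffer);
        }
    }

    /** Number of buffers actually allocated so far. */
    public int allocations() {
        return allocations;
    }

    public static void main(String[] args) {
        RecyclingBufferSketch pool = new RecyclingBufferSketch(128 * 1024);
        for (int batch = 0; batch < 1000; batch++) {
            byte[] buf = pool.get();
            // ... a real client would decompress one small batch into buf here ...
            pool.release(buf);
        }
        System.out.println("buffers allocated: " + pool.allocations());
    }
}
```

With this sketch, the loop over 1000 batches allocates a single 128 KB buffer instead of 1000. It addresses allocation and GC pressure only; the JNI boundary crossings mentioned in the comment are a separate cost that buffer re-use does not change.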