[ https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031486#comment-16031486 ]

ASF GitHub Bot commented on KAFKA-5150:
---------------------------------------

GitHub user xvrl opened a pull request:

    https://github.com/apache/kafka/pull/3180

    MINOR: reuse decompression buffers in log cleaner

    Follow-up to KAFKA-5150: reuse decompression buffers in the log cleaner thread.
    
    @ijuma @hachikuji 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xvrl/kafka logcleaner-decompression-buffers

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3180
    
----
commit d37ef9dfe118ec62b4091a4db72558ccbe888fb8
Author: Xavier Léauté <xav...@confluent.io>
Date:   2017-05-31T16:24:15Z

    MINOR: reuse decompression buffers in log cleaner

----


> LZ4 decompression is 4-5x slower than Snappy on small batches / messages
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-5150
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5150
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.8.2.2, 0.9.0.1, 0.11.0.0, 0.10.2.1
>            Reporter: Xavier Léauté
>            Assignee: Xavier Léauté
>             Fix For: 0.11.0.0
>
>
> I benchmarked RecordsIterator.DeepRecordsIterator instantiation on small batch
> sizes with small messages after observing some performance bottlenecks in the
> consumer.
> For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms
> compared to Snappy (see benchmark below). Most of our time is currently spent
> allocating memory blocks in KafkaLZ4BlockInputStream, because we default to
> larger 64 kB block sizes. Some quick testing shows we could improve performance
> by almost an order of magnitude for small batches and messages if we reused
> buffers between instantiations of the input stream.
> [Benchmark Code|https://github.com/xvrl/kafka/blob/small-batch-lz4-benchmark/clients/src/test/java/org/apache/kafka/common/record/DeepRecordsIteratorBenchmark.java#L86]
> {code}
> Benchmark                                              (compressionType)  (messageSize)   Mode  Cnt       Score       Error  Units
> DeepRecordsIteratorBenchmark.measureSingleMessage                    LZ4            100  thrpt   20   84802.279 ±  1983.847  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage                 SNAPPY            100  thrpt   20  407585.747 ±  9877.073  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage                   NONE            100  thrpt   20  579141.634 ± 18482.093  ops/s
> {code}
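
The numbers above point at buffer allocation, not decompression itself, as the dominant cost. As a rough illustration of the reuse idea described in the issue (not the actual change in pull request #3180, and using hypothetical class and method names), a per-thread cache can hand the previously allocated block back to the next input stream instead of allocating a fresh 64 kB buffer for every batch:

{code}
// Hypothetical sketch of the buffer-reuse idea behind KAFKA-5150.
// A per-thread cache keeps the last decompression buffer alive so the
// consumer / log cleaner thread does not allocate a new block per batch.
import java.nio.ByteBuffer;

public class CachedBufferSupplier {
    // One cached buffer per thread; reused across input stream instantiations.
    private static final ThreadLocal<ByteBuffer> CACHE = new ThreadLocal<>();

    public static ByteBuffer get(int size) {
        ByteBuffer cached = CACHE.get();
        if (cached != null && cached.capacity() >= size) {
            cached.clear();                   // reuse the existing allocation
            return cached;
        }
        return ByteBuffer.allocate(size);     // first use, or cached buffer too small
    }

    public static void release(ByteBuffer buffer) {
        CACHE.set(buffer);                    // keep the block around for the next batch
    }
}
{code}

The caller would obtain its decompression block via get(64 * 1024) when opening the stream and call release(buffer) when closing it, so repeated single-message batches touch the allocator only once per thread.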



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
