[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-06-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035737#comment-16035737
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3180


> LZ4 decompression is 4-5x slower than Snappy on small batches / messages
> 
>
>                 Key: KAFKA-5150
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5150
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.8.2.2, 0.9.0.1, 0.11.0.0, 0.10.2.1
>            Reporter: Xavier Léauté
>            Assignee: Xavier Léauté
>             Fix For: 0.11.0.0, 0.10.2.2
>
>
> I benchmarked RecordsIteratorDeepRecordsIterator instantiation on small batch 
> sizes with small messages after observing some performance bottlenecks in the 
> consumer. 
> For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms 
> compared to Snappy (see benchmark below). Most of our time is currently spent 
> allocating memory blocks in KafkaLZ4BlockInputStream, due to the fact that we 
> default to larger 64kB block sizes. Some quick testing shows we could improve 
> performance by almost an order of magnitude for small batches and messages if 
> we re-used buffers between instantiations of the input stream.
> [Benchmark Code|https://github.com/xvrl/kafka/blob/small-batch-lz4-benchmark/clients/src/test/java/org/apache/kafka/common/record/DeepRecordsIteratorBenchmark.java#L86]
> {code}
> Benchmark                                          (compressionType)  (messageSize)   Mode  Cnt       Score       Error  Units
> DeepRecordsIteratorBenchmark.measureSingleMessage                LZ4            100  thrpt   20   84802.279 ±  1983.847  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage             SNAPPY            100  thrpt   20  407585.747 ±  9877.073  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage               NONE            100  thrpt   20  579141.634 ± 18482.093  ops/s
> {code}
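
For reference, the "4-5x" in the issue title follows directly from the throughput scores quoted above. A small sketch of that arithmetic, with the scores copied from the benchmark output (the class and method names are only illustrative and are not part of the benchmark):

```java
// Illustrative only: recomputes the throughput ratios from the quoted scores.
final class SlowdownRatio {
    // Throughput in ops/s for a single 100-byte message, from the benchmark output.
    static final double LZ4 = 84802.279;
    static final double SNAPPY = 407585.747;
    static final double NONE = 579141.634;

    private SlowdownRatio() { }

    // How many times higher the first throughput is than the second.
    static double ratio(double faster, double slower) {
        return faster / slower;
    }

    public static void main(String[] args) {
        System.out.println("Snappy vs LZ4: " + ratio(SNAPPY, LZ4));
        System.out.println("None vs LZ4:   " + ratio(NONE, LZ4));
    }
}
```

Snappy comes out at roughly 4.8 times the LZ4 throughput, and uncompressed at roughly 6.8 times.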



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-06-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033516#comment-16033516
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

Github user xvrl closed the pull request at:

https://github.com/apache/kafka/pull/3090




[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031486#comment-16031486
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

GitHub user xvrl opened a pull request:

https://github.com/apache/kafka/pull/3180

MINOR: reuse decompression buffers in log cleaner

Follow-up to KAFKA-5150: reuse decompression buffers in the log cleaner 
thread.

@ijuma @hachikuji 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xvrl/kafka logcleaner-decompression-buffers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3180


commit d37ef9dfe118ec62b4091a4db72558ccbe888fb8
Author: Xavier Léauté 
Date:   2017-05-31T16:24:15Z

MINOR: reuse decompression buffers in log cleaner






[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030468#comment-16030468
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2967




[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030291#comment-16030291
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

Github user ijuma closed the pull request at:

https://github.com/apache/kafka/pull/3164




[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16028559#comment-16028559
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3164

KAFKA-5150: Reduce lz4 decompression overhead (without thread local buffers)

Temporary PR that has additional changes over 
https://github.com/apache/kafka/pull/2967 for comparison.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-5150-reduce-lz4-decompression-overhead

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3164.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3164


commit 950858e7fae838aecbf31c1ea201c3dbcd67a91d
Author: Xavier Léauté 
Date:   2017-05-03T17:01:07Z

small batch decompression benchmark

commit 0177665f3321e101ccbb2e95ec724125bf784e1c
Author: Xavier Léauté 
Date:   2017-05-03T20:40:45Z

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks

commit 7b553afdd7a6a7d39b122c503cd643b915b9f556
Author: Xavier Léauté 
Date:   2017-05-04T17:26:52Z

remove unnecessary synchronized on reset/mark

commit b4c46ac15aa25e2c8ea6bb9392d700b02b61fccd
Author: Xavier Léauté 
Date:   2017-05-05T16:00:49Z

avoid exception when reaching end of batch

commit 77e1a1d47f9060430257821704045353ec77a8d0
Author: Xavier Léauté 
Date:   2017-05-18T16:47:01Z

remove reflection for LZ4 and add comments

commit e3b68668b6b2e0057bcd9a3de24ab8fce774d8d5
Author: Ismael Juma 
Date:   2017-05-26T15:45:13Z

Simplify DataLogInputStream.nextBatch

commit 213bb77b8a3862a325118492d658a4e58ffd3c29
Author: Ismael Juma 
Date:   2017-05-26T15:56:16Z

Minor comment improvement

commit 9bd10361d70de837ccb58a82b356b402d28bb94f
Author: Ismael Juma 
Date:   2017-05-29T14:18:13Z

Minor tweaks in `DefaultRecord.readFrom`

commit 178d4900a6c848a4f1b0aa0ae68aaa24885f36bc
Author: Ismael Juma 
Date:   2017-05-29T15:22:01Z

Cache decompression buffers in Fetcher instead of thread-locals

This means that this only benefits the consumer for now, which
is the most important case. For the server, we should consider
how this fits with KIP-72.

commit c10b310cc13f5ec110cbaed8fb72f24774c2a2cd
Author: Ismael Juma 
Date:   2017-05-29T15:23:19Z

Tweaks to `KafkaLZ4*Stream` classes and `RecordBatchIterationBenchmark`

commit d93444c147430a62f5e9d16492ad14d2c6a0dd38
Author: Ismael Juma 
Date:   2017-05-29T18:18:23Z

Trivial style tweaks to KafkaLZ4Test

commit 419500e848b943f20d9bce1790fe40e64080ae29
Author: Ismael Juma 
Date:   2017-05-29T18:38:55Z

Provide a `NO_CACHING` BufferSupplier
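
The last few commits above move buffer caching out of thread-locals and into an object owned by the caller, with a non-caching variant for callers that should not retain buffers. A minimal sketch of that idea follows. The interface and class names here are assumptions made for illustration and do not reproduce the code in the pull request:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical interface: hands out decompression buffers and takes them back.
interface DecompressionBuffers {
    ByteBuffer get(int capacity);

    void release(ByteBuffer buffer);

    // Variant that never retains anything: every get() allocates.
    DecompressionBuffers NO_CACHING = new DecompressionBuffers() {
        @Override
        public ByteBuffer get(int capacity) {
            return ByteBuffer.allocate(capacity);
        }

        @Override
        public void release(ByteBuffer buffer) {
            // Nothing to do; the buffer is left to the garbage collector.
        }
    };
}

// Caching variant. Not thread-safe: it is meant to be owned by a single
// object (for example a consumer-side fetcher) that is used from one thread.
final class CachingDecompressionBuffers implements DecompressionBuffers {
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    @Override
    public ByteBuffer get(int capacity) {
        ByteBuffer cached = free.pollFirst();
        if (cached != null && cached.capacity() >= capacity) {
            cached.clear(); // reset position and limit before reuse
            return cached;
        }
        // No cached buffer, or the cached one was too small and is dropped.
        return ByteBuffer.allocate(capacity);
    }

    @Override
    public void release(ByteBuffer buffer) {
        free.addFirst(buffer);
    }
}
```

Tying the cache to its owner rather than to the thread means the buffers go away with the owner, at the cost of having to pass the supplier down to wherever streams are created.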





[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016146#comment-16016146
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

GitHub user xvrl opened a pull request:

https://github.com/apache/kafka/pull/3090

KAFKA-5150 reduce lz4 decompression overhead - Backport to 0.10.2.x



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xvrl/kafka kafka-5150-0.10

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3090.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3090


commit 265ee89ec76b223eb97c6b364260443346e6dac1
Author: Xavier Léauté 
Date:   2017-05-03T17:03:44Z

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks

commit cef091d0353a8a1f45ac913750f1e0dba04d7ab1
Author: Xavier Léauté 
Date:   2017-05-05T22:18:55Z

avoid exception when reaching end of batch






[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999059#comment-15999059
 ] 

Xavier Léauté commented on KAFKA-5150:
--

I also have a version of the same patch for 0.10.2.x in case someone is 
interested, or if we decide to do another 0.10.2.x release.

https://github.com/apache/kafka/compare/0.10.2...xvrl:kafka-5150-0.10?expand=1



[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998617#comment-15998617
 ] 

Xavier Léauté commented on KAFKA-5150:
--

I expanded the patch for this issue to also address KAFKA-4293 and to improve 
performance by up to 3x for legacy single-message batches with other 
compression formats as well.



[jira] [Commented] (KAFKA-5150) LZ4 decompression is 4-5x slower than Snappy on small batches / messages

2017-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995673#comment-15995673
 ] 

ASF GitHub Bot commented on KAFKA-5150:
---

GitHub user xvrl opened a pull request:

https://github.com/apache/kafka/pull/2967

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks

Overall, this improves LZ4 decompression performance by up to 23x for small 
batches.
The largest gains are for batches of size 1 with messages on the order of 
100 B.
Batches of fewer than 10 messages, with messages smaller than 10 kB, improve 
by at least 10x.

See benchmark code and results here
https://gist.github.com/xvrl/05132e0643513df4adf842288be86efd
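
As a rough illustration of the first bullet above (one reusable decompression buffer per thread), a minimal sketch could look like the following. The class name and the fixed 64 kB size are assumptions for illustration, and this is not the code from the pull request:

```java
import java.nio.ByteBuffer;

// Hypothetical helper: keeps one 64 kB decompression buffer per thread so
// that creating a new input stream does not allocate a fresh block each time.
final class ThreadLocalDecompressionBuffer {
    static final int BLOCK_SIZE = 64 * 1024;

    private static final ThreadLocal<ByteBuffer> BUFFER =
        ThreadLocal.withInitial(() -> ByteBuffer.allocate(BLOCK_SIZE));

    private ThreadLocalDecompressionBuffer() { }

    // Returns the calling thread's buffer, cleared and ready to be filled.
    static ByteBuffer acquire() {
        ByteBuffer buffer = BUFFER.get();
        buffer.clear();
        return buffer;
    }
}
```

This sketch assumes that at most one stream per thread uses the buffer at any given time; two streams interleaved on the same thread would overwrite each other's data.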

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xvrl/kafka kafka-5150

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2967.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2967


commit 0efc6e7f15b6994a6665da5975e69c77426cf904
Author: Xavier Léauté 
Date:   2017-05-03T20:40:45Z

KAFKA-5150 reduce lz4 decompression overhead

- reuse decompression buffers, keeping one per thread
- switch lz4 input stream to operate directly on ByteBuffers
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause EOFException instead of invalid block size
  for invalid incompressible blocks
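
To illustrate the last bullet: in the LZ4 frame format each block starts with a 4-byte little-endian size whose high bit marks an uncompressed block. One plausible way to report an invalid size, rather than running off the end of the input, is to validate the size before reading the payload. The sketch below only shows that ordering; the class name, method name, and error messages are assumptions and are not taken from the patch:

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper that validates an LZ4 frame block header.
final class Lz4BlockHeader {
    // High bit of the block size field: set when the block is stored uncompressed.
    static final int UNCOMPRESSED_FLAG = 0x80000000;

    private Lz4BlockHeader() { }

    // Reads the 4-byte little-endian block size at the buffer's position and
    // returns the payload length. The length is checked against maxBlockSize
    // first, so an oversized block is reported as such whether or not the
    // uncompressed flag is set.
    static int readBlockLength(ByteBuffer in, int maxBlockSize) throws IOException {
        if (in.remaining() < 4) {
            throw new IOException("Truncated block header");
        }
        int raw = (in.get() & 0xFF)
            | (in.get() & 0xFF) << 8
            | (in.get() & 0xFF) << 16
            | (in.get() & 0xFF) << 24;
        int length = raw & ~UNCOMPRESSED_FLAG;
        if (length > maxBlockSize) {
            throw new IOException(
                String.format("Block size %d exceeded max: %d", length, maxBlockSize));
        }
        if (in.remaining() < length) {
            throw new IOException(
                String.format("Block size %d exceeds remaining input: %d", length, in.remaining()));
        }
        return length;
    }

    // True when the raw size field marks the block as uncompressed.
    static boolean isUncompressed(int rawSizeField) {
        return (rawSizeField & UNCOMPRESSED_FLAG) != 0;
    }
}
```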



