Ismael Juma created KAFKA-3565:
----------------------------------

             Summary: Producer's throughput lower with compressed data after 
KIP-31/32
                 Key: KAFKA-3565
                 URL: https://issues.apache.org/jira/browse/KAFKA-3565
             Project: Kafka
          Issue Type: Bug
            Reporter: Ismael Juma
            Priority: Critical
             Fix For: 0.10.0.0


Relative offsets were introduced by KIP-31 so that the broker does not have to 
recompress data (this was previously required after offsets were assigned). The 
implicit assumption is that reducing CPU usage required by recompression would 
mean that producer throughput for compressed data would increase.

However, this doesn't seem to be the case:

{code}
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id:    
2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status:     PASS
run time:   59.030 seconds
{"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
{code}
Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292

{code}
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id:    
2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status:     PASS
run time:   1 minute 0.243 seconds
{"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
{code}
Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d

The difference for the uncompressed case is smaller (and within what one would 
expect given the additional size overhead caused by the timestamp field):

{code}
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id:    
2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status:     PASS
run time:   1 minute 4.176 seconds
{"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
{code}
Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f

{code}
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id:    
2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status:     PASS
run time:   1 minute 5.079 seconds
{"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
{code}
Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to