Re: More partitions => less throughput?

Peter Bukowinski Sat, 30 Nov 2019 09:26:12 -0800

Testing multiple broker VMs on a single host won’t give you accurate 
performance numbers unless that is how you will be deploying Kafka in 
production. (Don’t do this.) All of your Kafka network traffic is being handled 
by a single host, so instead of being spread across machines to increase total 
possible throughput, the brokers are competing with each other.


Given that this is the test environment you settled on, you should tune the 
number of partitions to account for the number of producers and consumers as 
well as the average message size. If you have only one producer, then a single 
consumer should be sufficient to read the data in real time. If you have 
multiple producers, you may need to scale up the consumer count and use 
consumer groups.
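
As a rough illustration of that sizing exercise, here is a minimal sketch of a 
common rule of thumb: take enough partitions to cover the target throughput on 
both the producer and the consumer side, and at least as many as there are 
consumers in the group (a partition is read by at most one consumer per group). 
The function name and all the numbers are hypothetical; the per-partition 
throughput figures are assumptions that you would have to measure on your own 
hardware with your own average message size.

```python
import math


def estimate_partitions(target_mb_s, producer_mb_s_per_partition,
                        consumer_mb_s_per_partition, consumers_in_group=1):
    """Estimate a partition count from measured per-partition throughput.

    All inputs are assumptions to be measured on the real deployment;
    this is a starting point for tuning, not a definitive answer.
    """
    if producer_mb_s_per_partition <= 0 or consumer_mb_s_per_partition <= 0:
        raise ValueError("per-partition throughput must be positive")

    # Partitions needed so producers can reach the target throughput.
    for_producers = target_mb_s / producer_mb_s_per_partition
    # Partitions needed so consumers can keep up with the same target.
    for_consumers = target_mb_s / consumer_mb_s_per_partition

    # A consumer with no partition assigned sits idle, so never go below
    # the number of consumers in the group.
    return max(consumers_in_group, math.ceil(max(for_producers, for_consumers)))


# Hypothetical figures: 50 MB/s target, 10 MB/s per partition when producing,
# 20 MB/s per partition when consuming.
print(estimate_partitions(50, 10, 20))                        # 5
print(estimate_partitions(50, 10, 20, consumers_in_group=8))  # 8
```

On a single host the measured per-partition numbers will already include the 
contention between brokers, so the result only applies to that environment.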

-- Peter

> On Nov 30, 2019, at 8:57 AM, Tom Brown <tombrow...@gmail.com> wrote:
> 
> I think the number of partitions needs to be tuned to the size of the
> cluster; 64 partitions on what is essentially a single box seems high. Do
> you know what hardware you will be deploying on in production? Can you run
> your benchmark on that instead of a VM?
> 
> —Tom
> 
>> On Thursday, November 28, 2019, Craig Pastro <siyo...@gmail.com> wrote:
>> 
>> Hello there,
>> 
>> I was wondering if anyone here could help me with some insight into a
>> conundrum that I am facing.
>> 
>> Basically, the story is that I am running three Kafka brokers via Docker on
>> a single VM with log.flush.interval.messages = 1 and min.insync.replicas =
>> 2. Then I create two topics: both with replication factor = 3, but one with
>> one partition and the other with 64.
>> 
>> Then I try to run a benchmark using these topics and what I find is as
>> follows:
>> 
>> 1 partition, 1381.02 records/sec,  685.87 ms average latency
>> 64 partitions, 601.00 records/sec, 1298.18 ms average latency
>> 
>> This is the opposite of what I expected. In neither case am I even close to
>> the IOPS that the disk can handle. So what I would like to know is whether
>> there is any obvious reason that I am missing for the slowdown with more
>> partitions?
>> 
>> If it is helpful the docker-compose file and the code to do the
>> benchmarking can be found at https://github.com/siyopao/kafka-benchmark.
>> (Any comments or advice on how to make the code better are greatly
>> appreciated!) The benchmarking code is inspired by and very similar to what
>> the bin/kafka-producer-perf-test.sh script does.
>> 
>> Thank you!
>> 
>> Best wishes,
>> Craig
>>
