Hi, Marisa.

Kafka is well-designed to make full use of system resources, so I think
calculating based on the machine's spec is a good starting point.

Let's say we have servers with a 10Gbps full-duplex NIC.
Also, let's say we set the topic's replication factor to 3 (so the cluster
will have at least 3 servers), and the average produced message size is 500
bytes.

Then, a single machine's spec-wise throughput bound can be calculated as
follows:
- Max messages/sec that a single machine can transmit = 10Gbps / 8 (convert
to bytes) / 3 (the NIC's egress is shared by replication to 2 followers and
fetching by 1 consumer group) / 500 bytes ≈ 833K.
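The arithmetic above can be sketched as a quick back-of-envelope script.
The constants are just this example's assumptions (10Gbps NIC, replication
factor 3, 500-byte messages), not Kafka defaults:

```python
# Spec-based throughput bound for a single Kafka broker (example numbers).
NIC_BITS_PER_SEC = 10_000_000_000  # 10 Gbps full-duplex NIC
EGRESS_STREAMS = 3                 # 2 follower replicas + 1 consumer group
AVG_MSG_BYTES = 500                # average produced message size

nic_bytes_per_sec = NIC_BITS_PER_SEC / 8              # 1.25 GB/s
bytes_per_stream = nic_bytes_per_sec / EGRESS_STREAMS # bandwidth per reader
max_msgs_per_sec = bytes_per_stream / AVG_MSG_BYTES

print(f"{max_msgs_per_sec:,.0f} msgs/sec")  # ~833,333 msgs/sec
```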

Note that this is of course just an example, so you should also take other
factors into account (e.g. disk throughput).
Also, producing/consuming to a single partition at a rate of 833K msg/sec
is hard to achieve in practice due to client-side bottlenecks, so we may
need to adjust the partition count as well.
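As a rough way to size that partition count: measure what a single client
can actually sustain per partition (e.g. with kafka-producer-perf-test /
kafka-consumer-perf-test), then divide the target rate by it. The 100K
msg/sec per-partition figure below is a purely hypothetical placeholder,
not a Kafka constant:

```python
import math

# Hypothetical per-partition client throughput; replace with a number
# measured in your own environment (kafka-producer-perf-test etc.).
PER_PARTITION_MSGS_PER_SEC = 100_000

TARGET_MSGS_PER_SEC = 833_000  # the spec-wise bound from above

partitions = math.ceil(TARGET_MSGS_PER_SEC / PER_PARTITION_MSGS_PER_SEC)
print(partitions)  # 9 partitions needed under these assumptions
```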

However, 833K msg/sec for 500-byte messages on the above spec is at least
not far from my experience of running Kafka in production.
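For what it's worth, applying that bound to the 10-million-message scenario
in the quoted question gives a rough lower limit on transfer time. This
ignores client-side bottlenecks, batching warm-up, and end-to-end latency,
so treat it as a floor, not a prediction:

```python
# Back-of-envelope time for the quoted 10-million-message scenario,
# using the ~833K msg/sec spec-based bound derived above.
TOTAL_MSGS = 10_000_000
BOUND_MSGS_PER_SEC = 833_000

seconds = TOTAL_MSGS / BOUND_MSGS_PER_SEC
print(f"~{seconds:.0f} seconds")  # ~12 seconds at best
```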

On Fri, Jan 7, 2022 at 0:01 Marisa Queen <marisa.queen...@gmail.com> wrote:

> Cheers from NYC!
>
> I'm trying to give a performance number to a potential client (from the
> financial market) who asked me the following question:
>
> *"If I have a Kafka system setup in the best way possible for performance,
> what is an approximate number that I can have in mind for the throughput of
> this system?"*
>
> The client proceeded to say:
>
> *"What I want to know specifically, is how many messages per second can I
> send from one side of my distributed system to the other side with Apache
> Kafka."*
>
> And he concluded with:
>
> *"To give you an example, let's say I have 10 million messages that I need
> to send from producers to consumers. Let's assume I have 1 topic, 1
> producer for this topic, 4 partitions for this topic and 4 consumers, one
> for each partition. What I would like to know is: How long is it going to
> take for these 10 million messages to travel all the way from the producer
> to the consumers? That's the throughput performance number I'm interested
> in."*
>
> I read in a reddit post yesterday (for some reason I can't find the post
> anymore) that Kafka is able to handle 7 trillion messages per day. The
> LinkedIn article about it, says:
>
>
> *"We maintain over 100 Kafka clusters with more than 4,000 brokers, which
> serve more than 100,000 topics and 7 million partitions. The total number
> of messages handled by LinkedIn’s Kafka deployments recently surpassed 7
> trillion per day."*
>
> The OP of the reddit post went on to say that WhatsApp is handling around
> 64 billion messages per day (740,000 msgs per sec x 24 x 60 x 60) and that
> 7 trillion for LinkedIn is a huge number, giving a whopping 81 million
> messages per second for LinkedIn. But that doesn't matter for my question.
>
> 7 trillion messages divided by 7 million partitions gives us 1 million
> messages per day per partition. So to calculate the throughput we do:
>
>     1 million divided by 60 divided by 60 divided by 24 => *~12 messages
> per second per partition*
>
> We'll all agree that ~12 messages per second per partition for throughput
> performance is very low, so I can't give this number to my potential
> client.
>
> So my question is: *What number should I give to my potential client?* Note
> that he is a stubborn and strict bank CTO, so he won't take any talk from
> me. He wants a mathematical answer using the scientific method.
>
> Has anyone been in my shoes and can shed some light on this Kafka
> throughput performance topic?
>
> Cheers,
>
> M. Queen
>


-- 
========================
Okada Haruki
ocadar...@gmail.com
========================
