Hi, Marisa. Kafka is well-designed to make full use of system resources, so I think calculating from the machine's specs is a good starting point.
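To make the arithmetic easy to replay, here is a quick back-of-envelope sketch in Python. The numbers (10Gbps full-duplex NIC, replication factor 3, one consumer group, 500-byte average messages) are just the example assumptions I use below, not universal constants:

```python
# Back-of-envelope per-broker throughput bound.
# Example assumptions only: 10Gbps full-duplex NIC, replication factor 3
# (leader sends each byte to 2 followers), 1 consumer group, 500-byte messages.

nic_bps = 10_000_000_000          # 10 Gbps NIC
nic_bytes_per_sec = nic_bps / 8   # bits -> bytes: 1.25 GB/s

# Each produced byte leaves the broker 3 times:
# 2 follower replicas + 1 consumer group fetch.
outbound_fanout = 3
usable_bytes_per_sec = nic_bytes_per_sec / outbound_fanout

avg_msg_size = 500                # bytes
max_msgs_per_sec = usable_bytes_per_sec / avg_msg_size

print(f"{max_msgs_per_sec:,.0f} msgs/sec")  # -> 833,333 msgs/sec
```

Swap in your own NIC speed, replication factor, fan-out, and message size to get a first-order bound for your cluster.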
Let's say we have servers with a 10Gbps full-duplex NIC, the topic's replication factor is set to 3 (so the cluster has at least 3 servers), and the average produced message size is 500 bytes. Then a single machine's spec-wise throughput bound can be calculated as follows:

- Max messages/sec a single machine can transmit = 10Gbps / 8 (convert to bytes) / 3 (replicated to 2 followers & fetched by 1 consumer group) / 500 bytes ≈ 833K

Note that this is of course just an example, so you should take other factors into account as well (e.g. HDD throughput). Also, producing/consuming to a single partition at a rate of 833K msg/sec is a bit hard due to client-side bottlenecks, so we may need to adjust the partition count as well. Still, 833K msg/sec for 500-byte messages with the above spec is not far from my experience of running Kafka in production.

On Fri, Jan 7, 2022 at 0:01, Marisa Queen <marisa.queen...@gmail.com> wrote:

> Cheers from NYC!
>
> I'm trying to give a performance number to a potential client (from the
> financial market) who asked me the following question:
>
> *"If I have a Kafka system set up in the best way possible for
> performance, what is an approximate number that I can have in mind for
> the throughput of this system?"*
>
> The client proceeded to say:
>
> *"What I want to know specifically is how many messages per second I can
> send from one side of my distributed system to the other side with Apache
> Kafka."*
>
> And he concluded with:
>
> *"To give you an example, let's say I have 10 million messages that I need
> to send from producers to consumers. Let's assume I have 1 topic, 1
> producer for this topic, 4 partitions for this topic and 4 consumers, one
> for each partition. What I would like to know is: how long is it going to
> take for these 10 million messages to travel all the way from the producer
> to the consumers? That's the throughput performance number I'm interested
> in."*
>
> I read in a reddit post yesterday (for some reason I can't find the post
> anymore) that Kafka is able to handle 7 trillion messages per day. The
> LinkedIn article about it says:
>
> *"We maintain over 100 Kafka clusters with more than 4,000 brokers, which
> serve more than 100,000 topics and 7 million partitions. The total number
> of messages handled by LinkedIn's Kafka deployments recently surpassed 7
> trillion per day."*
>
> The OP of the reddit post went on to say that WhatsApp is handling around
> 64 billion messages per day (740,000 msgs per sec x 24 x 60 x 60), and
> that 7 trillion per day for LinkedIn is a huge number, giving a whopping
> 81 million messages per second. But that doesn't matter for my question.
>
> 7 trillion messages divided by 7 million partitions gives us 1 million
> messages per day per partition. So to calculate the throughput we do:
>
> 1 million divided by 60 divided by 60 divided by 24 => *roughly 12
> messages per second per partition*
>
> We'll all agree that roughly 12 messages per second per partition for
> throughput performance is very low, so I can't give this number to my
> potential client.
>
> So my question is: *What number should I give to my potential client?*
> Note that he is a stubborn and strict bank CTO, so he won't take any talk
> from me. He wants a mathematical answer using the scientific method.
>
> Has anyone been in my shoes who can shed some light on this Kafka
> throughput performance topic?
>
> Cheers,
>
> M. Queen

--
========================
Okada Haruki
ocadar...@gmail.com
========================