The interesting numbers are the recovery times in two scenarios: 1) the Kafka broker currently acting as the "active" controller (or the sole controller in a ZooKeeper-based deployment) goes away; 2) the Kafka broker currently acting as the group coordinator for a consumer group with many partitions and a high commit rate goes away. Here "goes away" means as ugly a loss mode as you can realistically simulate in your test environment; I suggest forcing the to-be-impaired broker into heavy paging by running it inside a memory cgroup and progressively shrinking its limit. It's also fun to inject high packet loss using iptables.
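A rough sketch of that impairment loop, assuming cgroup v2 and the default broker port 9092 (paths, memory steps, and loss probability are placeholders — tune them for your environment). It dry-runs by default and only executes with --apply:

```shell
#!/bin/sh
# Hedged illustration, not a hardened tool. Requires root when applied.

impair_broker() {
  pid="$1"
  mode="${2:-dry-run}"

  run() {
    if [ "$mode" = "--apply" ]; then "$@"; else echo "would run: $*"; fi
  }

  cg=/sys/fs/cgroup/kafka-victim
  run mkdir -p "$cg"
  run sh -c "echo $pid > $cg/cgroup.procs"

  # Progressively shrink the memory limit to push the broker into heavy paging.
  for limit in 4G 2G 1G 512M 256M; do
    run sh -c "echo $limit > $cg/memory.max"
    run sleep 60
  done

  # And/or inject ~20% inbound packet loss on the broker port.
  run iptables -A INPUT -p tcp --dport 9092 \
    -m statistic --mode random --probability 0.2 -j DROP
}

# Example: impair_broker "$(pgrep -f kafka.Kafka)" --apply
```

Start your recovery-time clock from whenever the broker stops answering heartbeats, not from when you begin shrinking the limit.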
If you're serious about testing KRaft's survivability under load, then I suggest you compare against a ZooKeeper deployment that's relatively non-broken. That means setting up a ZooKeeper observer https://zookeeper.apache.org/doc/current/zookeeperObservers.html local to each broker. Personally I'd want to test with a large number of partitions (840 or 2520 per topic, tens of thousands overall), especially in the coordinator-failure scenario.

I haven't been following the horizontal scaling work closely, but I suspect that still means porting forward the Dropwizard-based metrics patch I wrote years ago. If I were doing that, I'd bring the shared dependencies of ZooKeeper and Kafka up to current versions and do a custom ZooKeeper build off of the 3.9.x branch (compare https://github.com/mkedwards/zookeeper/commit/e608be61a3851c128088d9c9c54871f56aa05012 and consider backporting https://github.com/apache/zookeeper/commit/5894dc88cce1f4675809fb347cc60d3e0ebf08d4). Then I'd do https://github.com/mkedwards/kafka/tree/bitpusher-2.3 all over again, starting from the Kafka 3.6.x branch and synchronizing the shared dependencies. If you'd like to outsource that work, I'm available on a consulting basis :D

Seriously, ZooKeeper itself has in my opinion never been the problem, at least since it got revived after the sad 3.4.1x / 3.5.x-alpha days. Inadequately resourced and improperly deployed ZooKeeper clusters have been a problem, as has the use of JMX to do the job of a modern metrics library. The KRaft ship has sailed as far as upstream development is concerned; but if you're not committed to that in your production environment, there are other ways to scale up and out while retaining ZooKeeper as your reliable configuration/metadata store. (It's also cost-effective and latency-feasible to run a cross-AZ ZooKeeper cluster, which I would not attempt with Kafka brokers in any kind of large-scale production setting.)
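For concreteness, a broker-local observer config looks roughly like this (hostnames and ports are placeholders; the linked doc has the details). Observers get the :observer suffix in every node's server list, and the observer node itself additionally sets peerType:

```
# zoo.cfg on the broker-local observer node only:
peerType=observer

# server list, identical on every node (voters unchanged, observer tagged):
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=localhost:2888:3888:observer
```

Point the co-located broker's zookeeper.connect at the local observer; reads stay node-local and the observer never votes, so it doesn't slow down the quorum.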
Cheers,
- Michael

On Thu, Feb 1, 2024 at 7:02 AM Doğuşcan Namal <namal.dogus...@gmail.com> wrote:
> Hi Paul,
>
> I did some benchmarking as well and couldn't find a marginal difference
> between KRaft and Zookeeper on end to end latency from producers to
> consumers. I tested it on Kafka version 3.5.1 and used openmessaging's
> benchmarking framework https://openmessaging.cloud/docs/benchmarks/ .
>
> What I noticed was if you run the tests long enough(60 mins) the throughput
> converges to the same value eventually. I also noticed some difference on
> p99+ latencies between Zookeeper and KRaft clusters but the results were
> not consistent on repetitive runs.
>
> Which version did you make the tests on and what are your findings?
>
> On Wed, 31 Jan 2024 at 22:57, Brebner, Paul <paul.breb...@netapp.com
> .invalid> wrote:
>
> > Hi all,
> >
> > We’ve previously done some benchmarking of Kafka ZooKeeper vs KRaft and
> > found no difference in throughput (which we believed is also what theory
> > predicted, as ZK/Kraft are only involved in Kafka meta-data operations, not
> > data workloads).
> >
> > BUT – latest tests reveal improved producer and consumer latency for Kraft
> > c.f. ZooKeeper. So just wanted to check if Kraft is actually involved in
> > any aspect of write/read workloads? For example, some documentation
> > (possibly old) suggests that consumer offsets are stored in meta-data? In
> > which case this could explain better Kraft latencies. But if not, then I’m
> > curious to understand the difference (and if it’s documented anywhere?)
> >
> > Also if anyone has noticed the same regarding latency in benchmarks.
> >
> > Regards, Paul Brebner
> >