Hey Michael, thanks for your comments. The first improvement you mentioned, faster controller failover, is one I was already aware of. But the second one you suggest, faster consumer group failover, could you expand on that a bit? Why do you think it will be better on KRaft?
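On my side I'd like to measure the group failover time directly in the next round of tests. Below is a rough sketch of how I'm thinking of doing it, just an illustration assuming the confluent-kafka Python client; the broker addresses, topic and group names are placeholders, not our actual setup:

    #!/usr/bin/env python3
    # Rough sketch: report how long a rebalance takes once the group
    # coordinator broker goes away. Assumes the confluent-kafka client;
    # all addresses, topic and group names below are placeholders.
    import time
    from confluent_kafka import Consumer

    last_revoke = None

    def on_revoke(consumer, partitions):
        global last_revoke
        last_revoke = time.monotonic()
        print(f"{len(partitions)} partitions revoked")

    def on_assign(consumer, partitions):
        if last_revoke is not None:
            print(f"rebalance took {time.monotonic() - last_revoke:.2f}s")
        print(f"{len(partitions)} partitions assigned")

    consumer = Consumer({
        "bootstrap.servers": "broker-1:9092,broker-2:9092",  # placeholder
        "group.id": "failover-test",                          # placeholder
        "enable.auto.commit": False,
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["bench-topic"], on_assign=on_assign, on_revoke=on_revoke)

    # Kill the coordinator broker externally while this loop runs.
    while True:
        msg = consumer.poll(1.0)
        if msg is not None and msg.error() is None:
            consumer.commit(message=msg, asynchronous=True)

The idea is to kill the coordinator broker externally while this runs and take the revoke-to-assign window as a rough lower bound on the group failover time; the wall-clock time from the kill itself would need an external timestamp.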
As you mentioned, these are improvements in recovery times, so from your mail I understand that you wouldn't expect an improvement in latencies either.

On Thu, 1 Feb 2024 at 22:53, Michael K. Edwards <m.k.edwa...@gmail.com> wrote:

> The interesting numbers are the recovery times after 1) the Kafka broker
> currently acting as the "active" controller (or the sole controller in a
> ZooKeeper-based deployment) goes away; 2) the Kafka broker currently
> acting as the consumer group coordinator for a consumer group with many
> partitions and a high commit rate goes away. Here "goes away" means as
> ugly a loss mode as can realistically be simulated in your test
> environment; I suggest forcing the to-be-impaired broker into heavy
> paging by running it inside a cgroups container and progressively
> shrinking the memory cgroup. It's also fun to force high packet loss
> using iptables.
>
> If you're serious about testing KRaft's survivability under load, then I
> suggest you compare against a ZooKeeper deployment that's relatively
> non-broken. That means setting up a ZooKeeper observer
> https://zookeeper.apache.org/doc/current/zookeeperObservers.html local to
> each broker. Personally I'd want to test with a large number of
> partitions (840 or 2520 per topic, tens of thousands overall), especially
> in the coordinator-failure scenario. I haven't been following the
> horizontal scaling work closely, but I suspect that still means porting
> forward the Dropwizard-based metrics patch I wrote years ago. If I were
> doing that, I'd bring the shared dependencies of zookeeper and kafka up
> to current and do a custom zookeeper build off of the 3.9.x branch
> (compare
> https://github.com/mkedwards/zookeeper/commit/e608be61a3851c128088d9c9c54871f56aa05012
> and consider backporting
> https://github.com/apache/zookeeper/commit/5894dc88cce1f4675809fb347cc60d3e0ebf08d4
> ).
> Then I'd do https://github.com/mkedwards/kafka/tree/bitpusher-2.3 all
> over again, starting from the kafka 3.6.x branch and synchronizing the
> shared dependencies.
>
> If you'd like to outsource that work, I'm available on a consulting basis
> :D Seriously, ZooKeeper itself has in my opinion never been the problem,
> at least since it got revived after the sad 3.14.1x / 3.5.x-alpha days.
> Inadequately resourced and improperly deployed ZooKeeper clusters have
> been a problem, as has the use of JMX to do the job of a modern metrics
> library. The KRaft ship has sailed as far as upstream development is
> concerned; but if you're not committed to that in your production
> environment, there are other ways to scale up and out while retaining
> ZooKeeper as your reliable configuration/metadata store. (It's also
> cost-effective and latency-feasible to run a cross-AZ ZooKeeper cluster,
> which I would not attempt with Kafka brokers in any kind of large-scale
> production setting.)
>
> Cheers,
> - Michael
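(On the impairment you describe above: this is roughly how I'd script the cgroup shrinking and packet loss. Just a sketch, assuming cgroup v2, root on the broker host, and that the broker runs in its own cgroup; every path, port and number below is a placeholder rather than anything from your setup.)

    #!/usr/bin/env python3
    # Rough sketch of the broker impairment described above: progressively
    # shrink the broker's memory cgroup to force paging, then add random
    # inbound packet loss with iptables. Assumes cgroup v2 and root; all
    # paths, ports and sizes are placeholders.
    import subprocess
    import time

    BROKER_CGROUP = "/sys/fs/cgroup/kafka-broker-1"  # placeholder cgroup path
    START_LIMIT_MB = 8192                            # placeholder starting limit
    STEP_MB = 512                                    # shrink step per interval
    INTERVAL_S = 60                                  # seconds between steps
    FLOOR_MB = 1024                                  # stop shrinking here

    def set_memory_limit(mb):
        # cgroup v2 memory limit is a plain byte count in memory.max.
        with open(f"{BROKER_CGROUP}/memory.max", "w") as f:
            f.write(str(mb * 1024 * 1024))

    def add_packet_loss(probability=0.2):
        # Drop ~20% of inbound packets on the broker port (placeholder port).
        subprocess.run([
            "iptables", "-A", "INPUT", "-p", "tcp", "--dport", "9092",
            "-m", "statistic", "--mode", "random",
            "--probability", str(probability), "-j", "DROP",
        ], check=True)

    limit = START_LIMIT_MB
    while limit > FLOOR_MB:
        set_memory_limit(limit)
        print(f"memory.max set to {limit} MiB")
        time.sleep(INTERVAL_S)
        limit -= STEP_MB

    add_packet_loss()
    print("packet loss rule installed; start the recovery measurement now")

Shrinking in steps rather than killing the process keeps the broker alive but thrashing, which seems closer to the ugly loss mode you describe than a clean kill; the iptables rule can be removed afterwards with the matching -D rule.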
> On Thu, Feb 1, 2024 at 7:02 AM Doğuşcan Namal <namal.dogus...@gmail.com>
> wrote:
>
> > Hi Paul,
> >
> > I did some benchmarking as well and couldn't find a marginal difference
> > between KRaft and Zookeeper on end to end latency from producers to
> > consumers. I tested it on Kafka version 3.5.1 and used openmessaging's
> > benchmarking framework https://openmessaging.cloud/docs/benchmarks/ .
> >
> > What I noticed was if you run the tests long enough (60 mins) the
> > throughput converges to the same value eventually. I also noticed some
> > difference on p99+ latencies between Zookeeper and KRaft clusters but
> > the results were not consistent on repetitive runs.
> >
> > Which version did you make the tests on and what are your findings?
> >
> > On Wed, 31 Jan 2024 at 22:57, Brebner, Paul <paul.breb...@netapp.com.invalid>
> > wrote:
> >
> > > Hi all,
> > >
> > > We’ve previously done some benchmarking of Kafka ZooKeeper vs KRaft
> > > and found no difference in throughput (which we believed is also what
> > > theory predicted, as ZK/Kraft are only involved in Kafka meta-data
> > > operations, not data workloads).
> > >
> > > BUT – latest tests reveal improved producer and consumer latency for
> > > Kraft c.f. ZooKeeper. So just wanted to check if Kraft is actually
> > > involved in any aspect of write/read workloads? For example, some
> > > documentation (possibly old) suggests that consumer offsets are stored
> > > in meta-data? In which case this could explain better Kraft latencies.
> > > But if not, then I’m curious to understand the difference (and if it’s
> > > documented anywhere?)
> > >
> > > Also if anyone has noticed the same regarding latency in benchmarks.
> > >
> > > Regards, Paul Brebner