Hi Akash, thanks for the quick reply. By "leader" I mean the partition leader.
I agree that upgrading to a KRaft-based cluster would help reduce the partition leader election time, but since we already run a ZooKeeper-based Kafka cluster, that upgrade will be a bit challenging. Are there any Kafka-side changes (KIPs) that we can apply to ZK-based clusters, similar to what KIP-951 does for leader discovery in KRaft-based clusters?

Additionally, by optimising ZooKeeper for better throughput/latency, do you mean tuning configs such as session timeout, tick time, and connection limits? Which config parameters in Kafka and ZooKeeper can help reduce the election time?

On Fri, Aug 23, 2024, 2:36 PM Akash Jain <akashjain0...@gmail.com> wrote:

> For Kafka with Zookeeper the recovery time is proportional to the number of
> partitions in the cluster. So theoretically speaking the behaviour is
> consistent - it will take time. Kraft based Kafka clusters (since Kafka
> v3.3) are much much better with clusters with a large number of partitions
> such as yours. This is one thing you should consider - upgrade to a newer
> Kraft based cluster.
> On the current setup you can try to optimize zookeeper for better
> throughput/latency.
>
> On Fri, Aug 23, 2024 at 12:40 PM Akash Jain <akashjain0...@gmail.com>
> wrote:
>
> > HI Atul you use the word 'leader'. You mean the 'controller'? Or you
> > referring to the leader for each of the partitions?
> >
> > On Fri, Aug 23, 2024 at 7:44 AM Atul Sharma
> > <atul.sharma.ma...@itbhu.ac.in.invalid> wrote:
> >
> >> Hi,
> >> We are currently facing a prolonged leader election time, approx 2 mins,
> >> in a Kafka cluster (version 2.8.2) that is configured with Zookeeper. This
> >> cluster has large number of topic partitions.
> >>
> >> The issue arises during the rolling restarts of the servers in the Kafka
> >> cluster.
> >>
> >> This extended leader election time is causing communication issues and
> >> unavailability for producers and consumers as they are unable to connect
> >> to Kafka within this timeframe. Any recommendations on reducing the leader
> >> election time?
> >>
> >> Issue is occurring on Kafka 2.8.2 with Zookeeper
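
P.S. To make the config question above concrete, here is a sketch of the kind of settings I am asking about. The property names are the standard Kafka broker and ZooKeeper ones as far as I know, but the values are only illustrative placeholders. They are not our actual configuration and not a recommendation.

```properties
# Kafka broker (server.properties); value is a placeholder
# Session timeout the broker requests from ZooKeeper, in milliseconds.
zookeeper.session.timeout.ms=18000

# ZooKeeper (zoo.cfg); values are placeholders
# Base time unit in milliseconds. By default ZooKeeper only grants session
# timeouts between 2 x tickTime and 20 x tickTime.
tickTime=2000
# Maximum number of concurrent connections from a single client IP.
maxClientCnxns=60
```

If there are other Kafka or ZooKeeper properties that affect leader election time during a rolling restart, those are the ones I would like to know about.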