Hey Atul,

Leadership information is propagated to the rest of the brokers by the
controller. In ZooKeeper mode, the controller may take a long time to start
up because it needs to fetch all state from ZooKeeper. A rolling restart
makes this worse because you may end up with multiple controller
re-elections.
The optimization for this is to ensure that the rolling restart restarts the
controller at the very end. This ensures that controller re-election and the
metadata sync from ZooKeeper happen exactly once per cluster instead of at
least once. Could you try that and report back whether it improves your
situation?

--
Divij Vaidya

On Fri, Aug 23, 2024 at 12:01 PM Atul Sharma
<atul.sharma.ma...@itbhu.ac.in.invalid> wrote:

> Hi Akash, thanks for the quick reply.
>
> By the term leader I mean partition leader.
>
> I agree with you that upgrading to a KRaft-based cluster will help reduce
> partition leader election time, but as we already have a ZooKeeper-based
> Kafka cluster, upgrading to a KRaft-based cluster will be a bit
> challenging.
>
> Do we have any implementation changes from the Kafka side (KIPs), like
> what KIP-951 does for KRaft-based Kafka clusters for leader discovery,
> that we can incorporate with ZK-based Kafka clusters?
>
> Additionally, for optimising ZooKeeper for better throughput/latency, do
> you mean configuring suitable settings like session timeout, tick time,
> and connection limits? Which config parameters in Kafka and ZooKeeper can
> help reduce the time?
>
> On Fri, Aug 23, 2024, 2:36 PM Akash Jain <akashjain0...@gmail.com> wrote:
>
> > For Kafka with ZooKeeper, the recovery time is proportional to the
> > number of partitions in the cluster. So, theoretically speaking, the
> > behaviour is consistent - it will take time. KRaft-based Kafka clusters
> > (since Kafka v3.3) are much better for clusters with a large number of
> > partitions such as yours. This is one thing you should consider -
> > upgrading to a newer KRaft-based cluster.
> > On the current setup you can try to optimize ZooKeeper for better
> > throughput/latency.
> >
> > On Fri, Aug 23, 2024 at 12:40 PM Akash Jain <akashjain0...@gmail.com>
> > wrote:
> >
> > > Hi Atul, you use the word 'leader'. Do you mean the 'controller'? Or
> > > are you referring to the leader of each of the partitions?
> > >
> > > On Fri, Aug 23, 2024 at 7:44 AM Atul Sharma
> > > <atul.sharma.ma...@itbhu.ac.in.invalid> wrote:
> > >
> > > > Hi,
> > > > We are currently facing a prolonged leader election time,
> > > > approximately 2 minutes, in a Kafka cluster (version 2.8.2) that is
> > > > configured with ZooKeeper. This cluster has a large number of topic
> > > > partitions.
> > > >
> > > > The issue arises during rolling restarts of the servers in the
> > > > Kafka cluster.
> > > >
> > > > This extended leader election time is causing communication issues
> > > > and unavailability for producers and consumers, as they are unable
> > > > to connect to Kafka within this timeframe. Any recommendations on
> > > > reducing the leader election time?
> > > >
> > > > The issue is occurring on Kafka 2.8.2 with ZooKeeper.
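PS: here is a minimal sketch of the controller-last restart ordering
suggested above. It assumes the standard JSON layout of the /controller
znode; the broker ids, the sample znode value, and the restart command are
hypothetical placeholders, and the live zookeeper-shell.sh fetch is shown
only as a comment:

```shell
#!/bin/sh
# Find the current controller before a rolling restart, so it can be
# restarted last. The /controller znode holds JSON such as:
#   {"version":1,"brokerid":3,"timestamp":"1724407260000"}
#
# In a live cluster you would fetch it with something like (assumes the
# Kafka distribution tools are on PATH and $ZK is your connect string):
#   CONTROLLER_JSON=$(zookeeper-shell.sh "$ZK" get /controller 2>/dev/null | tail -1)
# Here we use a hypothetical sample value to show the parsing step:
CONTROLLER_JSON='{"version":1,"brokerid":3,"timestamp":"1724407260000"}'

# Extract the broker id of the current controller from the JSON.
CONTROLLER_ID=$(printf '%s' "$CONTROLLER_JSON" \
  | sed -n 's/.*"brokerid":\([0-9]*\).*/\1/p')
echo "controller is broker $CONTROLLER_ID"

# Restart every non-controller broker first, then the controller last,
# so controller re-election and the ZooKeeper metadata sync happen once.
for BROKER in 1 2 3; do               # hypothetical broker ids
  [ "$BROKER" = "$CONTROLLER_ID" ] && continue
  echo "restart broker $BROKER"       # substitute your service manager command
done
echo "restart broker $CONTROLLER_ID"  # controller goes last
```

The parsing step avoids extra dependencies; with jq available, the same id
could be read via `jq .brokerid`.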