Hi, Rijo.

These slides might help you create a procedure to migrate the ZK ensemble without downtime:
https://speakerdeck.com/line_developers/split-brain-free-online-zookeeper-migration
The slides are based on ZooKeeper 3.4, so in your environment (3.5) the procedure might be simpler thanks to dynamic reconfiguration.

Thanks,

On Thu, Oct 21, 2021 at 4:46, Ran Lupovich <ranlupov...@gmail.com> wrote:

> One thing that comes to mind after reading your explanation: a ZK quorum
> should be an odd number, and you stated you have six ZooKeepers... I would
> suggest checking this matter: 3, 5, 7, etc.
>
> On Wed, Oct 20, 2021 at 22:00, Rijo Roy <rjo_...@yahoo.com.invalid> wrote:
>
> > Hi,
> >
> > Hope you are safe and well!
> >
> > Let me give a brief overview of my environment:
> >
> > OS: Ubuntu 18.04
> > Kafka Version: Confluent Kafka v5.5.1
> > ZooKeeper Version: 3.5.8
> > No. of Kafka Brokers: 3
> > No. of ZooKeeper nodes: 3
> >
> > I am working on a project where we aim to move from our existing
> > infrastructure, let's call it A, where the Kafka and ZooKeeper clusters
> > are hosted, to a better infrastructure, let's call it B, with no or
> > minimal downtime. Once the cutover is done, we would like to terminate
> > the old infrastructure A.
> >
> > I was able to use kafka-reassign-partitions.sh, following the steps in
> > https://kafka.apache.org/documentation/#basic_ops_cluster_expansion
> > to move the topic partitions to the Kafka brokers I created in B. Please
> > note that I had added the 3 ZooKeeper nodes running in B to the ZooKeeper
> > cluster in A, and hence they were following the ZK leader in A.
> >
> > I was under the impression that since I had 6 nodes in the ZooKeeper
> > ensemble, stopping the A-side ZooKeeper nodes would not cause an issue,
> > but I was wrong. As soon as I stopped the ZK process on the A nodes, the
> > B ZK nodes failed to accept any connections from Kafka. I assume this is
> > because ZK leadership did not transfer to the B nodes and the ensemble
> > lost quorum, resulting in this failure.
> >
> > I had to remove the version-2 folder inside the B ZK nodes. Starting them
> > one by one after removing the details of the A ZK nodes from
> > zookeeper.properties helped me resolve the failure and run the cluster on
> > infrastructure B. I know I failed miserably, but this was a sandbox where
> > I could afford the downtime; I cannot in a production setup. I request
> > your help and guidance to make it right. Please help!
> >
> > Thanks in advance.
> >
> > Regards,
> > Rijo S Roy

--
========================
Okada Haruki
ocadar...@gmail.com
========================
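
For reference, the quorum loss described in the quoted message is consistent with simple majority arithmetic. The sketch below is only an illustration, assuming all six nodes were voting participants (no observers) and that the ensemble used ZooKeeper's default majority quorum; the function names are made up for this example and are not part of any ZooKeeper API.

```python
def quorum_size(voters: int) -> int:
    """Smallest strict majority of `voters` voting members."""
    return voters // 2 + 1


def has_quorum(voters: int, alive: int) -> bool:
    """True if `alive` running voters still form a majority of `voters`."""
    return alive >= quorum_size(voters)


def tolerated_failures(voters: int) -> int:
    """How many voters can stop before the ensemble loses quorum."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    # Assumption: 3 nodes in A + 3 nodes in B, all of them voters.
    print(has_quorum(voters=6, alive=3))  # stopping all of A leaves 3 of 6
    for n in (3, 5, 6, 7):
        print(n, quorum_size(n), tolerated_failures(n))
```

Under these assumptions a 6-node ensemble needs 4 live voters, so stopping the 3 A nodes at once leaves it below quorum. A 6-node ensemble also tolerates the same number of failures as a 5-node one, which is the usual reason for choosing an odd ensemble size.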