Hi, Rijo.

These slides might help you create a procedure to migrate the ZK ensemble
without downtime:
https://speakerdeck.com/line_developers/split-brain-free-online-zookeeper-migration

The slides are based on ZooKeeper 3.4, so in your environment (3.5) the
procedure might be simpler thanks to dynamic reconfiguration.
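
For reference, here is a rough sketch of the quorum arithmetic behind the
failure described in the quoted message. It assumes the default majority
quorum with every configured server voting (no observers, no hierarchical
quorums); the function names are made up for illustration and are not part
of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured voting members."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, alive: int) -> bool:
    """True if the servers still running can form a majority."""
    return alive >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can stop before the ensemble loses quorum."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # 3 nodes in A plus 3 nodes in B, then all A nodes are stopped.
    print(quorum_size(6))         # majority needed for a 6-node ensemble
    print(has_quorum(6, 3))       # only the 3 B nodes remain
    # An even-sized ensemble tolerates no more failures than the odd size below it.
    print(tolerated_failures(5), tolerated_failures(6))
```

Under these assumptions a 6-node ensemble needs 4 live servers, so the 3
B nodes alone cannot serve requests while the A nodes are still listed in
the configuration.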


Thanks,

On Thu, Oct 21, 2021 at 4:46, Ran Lupovich <ranlupov...@gmail.com> wrote:

> One thing that comes to mind after reading your explanation: a ZK ensemble
> should have an odd number of nodes, and you stated you have six ZooKeepers.
> I would suggest checking this: 3, 5, 7, etc.
>
> On Wed, Oct 20, 2021 at 22:00, Rijo Roy
> <rjo_...@yahoo.com.invalid> wrote:
>
> > Hi,
> >
> > Hope you are safe and well!
> >
> > Let me give a brief about my environment:
> >
> > OS: Ubuntu 18.04
> > Kafka version: Confluent Kafka v5.5.1
> > ZooKeeper version: 3.5.8
> > No. of Kafka brokers: 3
> > No. of ZooKeeper nodes: 3
> >
> > I am working on a project where we aim to move from our existing
> > infrastructure, call it A, where the Kafka and ZooKeeper clusters are
> > hosted, to a better infrastructure, call it B, with no or minimal
> > downtime. Once the cutover is done, we would like to terminate the old
> > infrastructure A.
> >
> > I was able to use kafka-reassign-partitions.sh, following the steps in
> > https://kafka.apache.org/documentation/#basic_ops_cluster_expansion
> > to move the topic partitions to the Kafka brokers I created in B. Please
> > note that I had added the 3 ZooKeeper nodes running in B to the ZooKeeper
> > cluster in A, so they were following the ZK leader in A.
> >
> > I was under the impression that, since I had 6 nodes in the ZooKeeper
> > ensemble, stopping the A-side ZooKeeper nodes would not cause an issue,
> > but I was wrong. As soon as I stopped the ZK process on the A nodes, the
> > B ZK nodes stopped accepting connections from Kafka. I assume this is
> > because ZK leadership did not transfer to the B nodes and the quorum was
> > lost. To recover, I had to remove the version-2 folder on the B ZK nodes,
> > remove the A nodes' details from zookeeper.properties, and start the B
> > nodes one by one; after that the cluster ran on infrastructure B.
> >
> > I know I failed miserably, but this was a sandbox where I could afford
> > the downtime; I cannot in a production setup. I request your help and
> > guidance to make it right. Please help!
> >
> > Thanks in advance.
> >
> > Regards,
> > Rijo S Roy
> >
> >
> >
>


-- 
========================
Okada Haruki
ocadar...@gmail.com
========================
