I am attaching the log file. Could you take a look at why the new instance cannot join the quorum?
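For reference, one way to see what each server reports about its own role is ZooKeeper's `srvr` four-letter command on the client port. Below is a minimal Python sketch, assuming the default client port 2181 and that four-letter commands are reachable from where the script runs. The host names passed to `report` are hypothetical placeholders, and the helper names `parse_mode`, `query_mode`, and `report` are introduced here for illustration only.

```python
import socket
from typing import Iterable, Optional


def parse_mode(srvr_output: str) -> Optional[str]:
    """Return the value of the 'Mode:' line in `srvr` output, or None if absent."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_mode(host: str, port: int = 2181, timeout: float = 5.0) -> Optional[str]:
    """Send the `srvr` four-letter command and return the reported mode.

    Port 2181 is an assumption (the usual default client port).
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", errors="replace"))


def report(hosts: Iterable[str]) -> None:
    """Print the mode reported by each host, or the connection error."""
    for host in hosts:
        try:
            mode = query_mode(host)
        except OSError as exc:
            print(f"{host}: connection failed ({exc})")
            continue
        # No 'Mode:' line usually means the server is not serving requests yet.
        print(f"{host}: {mode if mode else 'no Mode line in reply'}")
```

Calling `report(["zk1.example.com", "zk2.example.com", "zk3.example.com"])` with the real ensemble addresses should show one `leader` and the rest `follower` once the replacement has joined. A reply with no `Mode:` line from the new instance would suggest it is running but not yet part of the quorum.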

On Tue, Nov 5, 2013 at 9:52 AM, Bae, Jae Hyeon <[email protected]> wrote:

> Thanks a lot Ben
>
> We are also using zookeeper in AWS with elastic IP. Why I asked this
> question is, when the bad Zookeeper EC2 instance is terminated and new
> instance is launched with the previous elastic IP, it cannot join quorum
> without any specific error messages. But when I did rolling restart, the
> new instance started normally, synchronized and joined quorum.
>
> As I understand German's response, the new instance should start,
> synchronize, and join quorum successfully without any impact on existing
> instances but it didn't. I will investigate further.
>
> Thank you
> Best, Jae
>
>
> On Tue, Nov 5, 2013 at 8:24 AM, Ben Hall <[email protected]> wrote:
>
>> Hi Jae,
>>
>> I wrote that article several years ago. (tbh - I hope it is not totally
>> out of date by now). I agree with German's points.
>>
>> The issue it was solving was to replace a bad server without having to
>> shutdown the ensemble and without having to update the config files on
>> each server. I would also add that this only works as long as the server
>> names and ports are the same - iirc at the time the article was written we
>> were using servers in AWS and referencing them either by assigned
>> hostnames such as zookeeper-[01|11] or by elastic IP's that could be moved
>> from server to server.
>>
>> If I understand your question correctly, if you are "adding a new server"
>> such as going from 7 to 9 servers, then this approach won't benefit you as
>> you.
>>
>> We also used this approach when we would upgrade the servers, but like
>> German said we did it one server at a time so that the Leader election
>> could be natural. This allowed us to upgrade a pool of 11 servers who
>> were responsible for many thousands of client connections without any down
>> time.
>>
>> Thanks
>> Ben
>>
>>
>> On 11/5/13 6:51 AM, "German Blanco" <[email protected]>
>> wrote:
>>
>> >... and make sure that there is no rubbish in the data dir of the new
>> >server.
>> >
>> >
>> >On Tue, Nov 5, 2013 at 3:49 PM, German Blanco <
>> >[email protected]> wrote:
>> >
>> >> Hello Jae,
>> >>
>> >> I think that the answer to your question is "no, there is no benefit
>> >> in a rolling restart in that case".
>> >> If you remove a machine that was hosting a zookeeper server that was
>> >> part of a cluster, and replace it with a new machine, with a zookeeper
>> >> server running the same software version and listening on the same IP
>> >> and ports, then this new server will join the cluster, synchronize and
>> >> start working normally.
>> >> I wouldn't recommend to replace more than one server at a time, and I
>> >> think that it is better if the new server joins while the existing
>> >> quorum is stable (avoid leader elections while the new server joins,
>> >> i.e. avoid restarts or disconnections of the existing servers).
>> >>
>> >> Best regards,
>> >>
>> >> Germán.
>> >>
>> >>
>> >> On Tue, Nov 5, 2013 at 6:42 AM, Bae, Jae Hyeon <[email protected]>
>> >> wrote:
>> >>
>> >>> Hi
>> >>>
>> >>> I read an article
>> >>>
>> >>> http://www.benhallbenhall.com/2011/07/rolling-restart-in-apache-zookeeper-to-dynamically-add-servers-to-the-ensemble/
>> >>>
>> >>> My question is, even though failed hardware is replaced with the same
>> >>> IP address, do I need to do rolling restart for adding replaced
>> >>> hardware to the quorum?
>> >>>
>> >>> I am using zookeeper ver3.4.5.
>> >>>
>> >>> Thank you
>> >>> Best, Jae
