On Mon, Apr 1, 2019 at 4:48 PM Alexander Shraer <shra...@gmail.com> wrote:
> Hi,
>
> I think that one of the problems with the proposed method is that you may
> end up having a majority of servers that don't have the latest state
> (imagine that there is a minority failure while your replaced node hasn't
> been brought up to date yet).
>
> Have you considered using dynamic reconfiguration? Removing the nodes
> logically first, then replacing them and adding back in? You can do
> multiple servers at a time this way.

Does dynamic reconfiguration as you suggest here buy me anything in a
3-node cluster? No matter what, I'm going to be at N+0 during the
transition, so doesn't it just add more steps for the same result?

> Or, you can give new servers higher ids, add them using reconfig, and
> later remove the old servers. Reconfiguration ensures that a quorum
> always has the data.

My admittedly terrible motivation for avoiding that is that I want to
preserve hostnames, to avoid reconfiguring clients. This is in a cloud
environment where DNS is tied to instance name, so I can't play tricks at
the network layer - at some point I have to delete the old instances and
set up new ones with the same name.

I suppose I could do a careful dance where I grow to 5 nodes, then do a
rolling removal/readd of the first 3, so that I can stay at N+1 during the
replacement, and just trust that clients can reach at least one of the
first 3 replicas to discover the entire cluster. A rough command sketch of
that sequence follows below.

- Dave

> Alex
>
> On Mon, Apr 1, 2019 at 2:51 PM David Anderson <d...@tockhq.com> wrote:
>
> > Hi,
> >
> > I have a running ZooKeeper (3.5) cluster where the machines need to be
> > replaced. I was thinking of just setting the same ID on each new
> > machine, and then doing a rolling replacement: take down old ID 1,
> > start new ID 1, let it rejoin the cluster and replicate the state,
> > then continue with the other replicas.
> >
> > I'm finding conflicting information on the internet about the safety
> > of this. The Apache Kafka FAQ says to do exactly this when replacing a
> > failed ZooKeeper replica, and the new machine will just replicate the
> > state before participating in the quorum. Other places on the internet
> > say that reusing the ID without also copying over the state directory
> > will break assumptions that ZAB makes about replicas, with bad (but
> > nondescript) consequences.
> >
> > So, is it safe to reuse IDs in the way I described? If not, what's the
> > suggested procedure for a rolling replacement of all cluster replicas?
> >
> > Thanks,
> > - Dave
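
For concreteness, here is a minimal sketch of the grow-then-replace dance
described above, using the reconfig command in ZooKeeper 3.5's zkCli.sh.
The hostnames zk1 through zk5, the server ids, and the ports are
illustrative assumptions, not values from the thread; before each -add,
the new server is assumed to have been started against the current
ensemble so it can sync state first, per the dynamic reconfiguration
manual.

    # Connect to any current ensemble member (hypothetical host/port).
    zkCli.sh -server zk1:2181

    # Grow from 3 to 5: start zk4 and zk5 (new, unused ids), let them
    # sync, then commit them as participants.
    reconfig -add server.4=zk4:2888:3888;2181
    reconfig -add server.5=zk5:2888:3888;2181

    # Roll through the original three, one at a time, staying at N+1.
    # Logically remove server 1, delete the instance, recreate it with
    # the same hostname (zk1) but a fresh data dir and a NEW id:
    reconfig -remove 1
    reconfig -add server.6=zk1:2888:3888;2181

    # Repeat for old servers 2 and 3, e.g. as ids 7 and 8:
    reconfig -remove 2
    reconfig -add server.7=zk2:2888:3888;2181
    reconfig -remove 3
    reconfig -add server.8=zk3:2888:3888;2181

    # Optionally shrink back to 3 by removing zk4 and zk5:
    reconfig -remove 4,5

Because the replacements keep the old hostnames, clients configured with
zk1,zk2,zk3 keep working as long as at least one of those names is
reachable - exactly the trust assumption mentioned above. Giving each
replacement a fresh id also sidesteps the id-reuse question from the
original post, since no server ever rejoins under an id whose on-disk
history it lacks.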