Well, why not set up the ZK ensemble across 3 AZs? In that case, a 5-node
ensemble can have 2, 2, and 1 ZK servers per AZ, so if any one AZ goes
down, the remaining servers still form a majority and the ensemble stays
up. The same works for a 7-node ensemble with 3, 2, and 2 ZK servers per AZ.
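
The arithmetic behind those layouts can be checked with a rough Python
sketch. It assumes plain majority quorum over voting participants only
(no observers, no weighted or hierarchical quorums), and the function
names are illustrative, not part of any ZooKeeper API:

```python
def majority(total: int) -> int:
    """Smallest number of voting servers that forms a strict majority."""
    return total // 2 + 1


def survives_any_single_az_loss(layout) -> bool:
    """layout holds the number of voting participants in each AZ.

    Returns True if losing any one AZ still leaves a majority of the
    whole ensemble running.
    """
    total = sum(layout)
    return all(total - lost >= majority(total) for lost in layout)


if __name__ == "__main__":
    # The first two layouts are the 3-AZ ones suggested above; the last
    # two are 2-AZ layouts for comparison.
    for layout in [(2, 2, 1), (3, 2, 2), (3, 2), (2, 2)]:
        print(layout, survives_any_single_az_loss(layout))
```

Under these assumptions both 3-AZ layouts pass, while no 2-AZ layout
does: one of the two AZs always holds at least half of the participants.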

HTH,
Zhewei

On Thu, Aug 5, 2021 at 07:38 Ivan Kelly <[email protected]> wrote:

> *typo, the exception is when there's an _equal_ number of participants
>
> On Thu, Aug 5, 2021 at 3:37 PM Ivan Kelly <[email protected]> wrote:
> >
> > > Promoting the 2 observers to participants will be a manual step (as
> > > part of disaster recovery) to get the cluster up. During this manual
> > > step, if needed, we can shut down/terminate the old AZ instances.
> > > We also have Puppet managing configuration. The Puppet module will be
> > > updated to reflect the new cluster instances. So if and when the AZ
> > > comes back up, Puppet will see that these instances are no longer part
> > > of the ZooKeeper cluster and the module will stop the ZooKeeper service.
> > What guarantee do you have that all clients will have switched over to
> > the new cluster? Even if Puppet shuts down the old cluster, it will
> > take time to see that it needs to be shut down, which creates an
> > opportunity for clients to connect and do stuff.
> >
> > > A side question: will observers always be in sync with the entire
> > > cluster? In other words, when will observers be in sync with the
> > > quorum participants?
> > By in sync, I take it that you mean that any write that was
> > acknowledged by the initial cluster exists in the failover cluster.
> > No, they may not be in-sync. The observer will always have a prefix of
> > the log of the participants. This prefix may be the entire log, or it
> > may be missing the latest writes. This is true even if you have a
> > participant in the failover AZ. For a write to be acknowledged, it has
> > to hit a majority of the quorum.
> >
> > With 2 AZs, 1 AZ will always have a majority, so if it goes down,
> > writes will be missing from the other AZ. The exception to this is
> > where there's an even number of participants in each AZ. In this case,
> > if one AZ goes down, you can no longer form a majority, but all
> > writes will exist on both AZs. Maybe this could be a path forward,
> > since you accept that you will have manual failover. I'm not sure how
> > well this scenario is supported in the tooling though.
> >
> > -Ivan
>
