*Typo: the exception is when there's an _equal_ number of participants in each AZ.
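
To make the arithmetic concrete, here is a rough sketch of the majority calculation for a two-AZ ensemble. The function names are made up for illustration and are not part of any ZooKeeper API; the counts are voting participants only, on the assumption that observers do not vote.

```python
def majority(n_participants: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return n_participants // 2 + 1


def two_az_summary(az_a: int, az_b: int) -> dict:
    """Summarise what each AZ can do alone in a two-AZ ensemble.

    az_a and az_b are participant (voter) counts per AZ.
    Hypothetical helper, for illustration only.
    """
    need = majority(az_a + az_b)
    return {
        "majority": need,
        "a_alone_has_quorum": az_a >= need,
        "b_alone_has_quorum": az_b >= need,
        # An acknowledged write reached `need` participants. It is
        # guaranteed to exist in both AZs only if neither AZ can
        # supply `need` participants on its own.
        "acked_writes_in_both_azs": az_a < need and az_b < need,
    }


if __name__ == "__main__":
    print(two_az_summary(3, 2))  # unequal split
    print(two_az_summary(3, 3))  # equal split
```

With a 3/2 split the larger AZ can acknowledge writes alone, so the smaller AZ may lag. With a 3/3 split neither AZ has quorum alone, so every acknowledged write touched both, at the cost of losing quorum when either AZ goes down.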

On Thu, Aug 5, 2021 at 3:37 PM Ivan Kelly <[email protected]> wrote:
>
> > Promoting the 2 observers to participants will be a manual step (as part of
> > disaster recovery) to get the cluster up. During this manual step, if
> > needed, we can shut down/terminate the old AZ instances.
> > We also have puppet managing configuration. The puppet module will be updated
> > to reflect the new cluster instances. So if the old AZ comes back up, puppet will
> > see that these instances are no longer part of the zookeeper cluster and
> > the module will stop the zookeeper service.
> What guarantee do you have that all clients will have switched over to
> the new cluster? Even if puppet shuts down the old cluster, it will
> take time to see that it needs to be shut down, which creates an
> opportunity for clients to connect and do stuff.
>
> > A side question: Will observers always be in sync with the entire cluster? In
> > other words, when will observers be in sync with the quorum participants?
> By in sync, I take it that you mean that any write that was
> acknowledged by the initial cluster exists in the failover cluster.
> No, they may not be in sync. The observer will always have a prefix of
> the log of the participants. This prefix may be the entire log, or it
> may be missing the latest writes. This is true even if you have a
> participant in the failover AZ. For a write to be acknowledged, it has
> to hit a majority of the quorum.
>
> With 2 AZs, 1 AZ will always have a majority, so if it goes down,
> the latest writes may be missing from the other AZ. The exception to this is
> where there's an even number of participants in each AZ. In this case,
> if one AZ goes down, you can no longer form a majority, but all
> writes will exist on both AZs. Maybe this could be a path forward,
> since you accept that you will have manual failover. I'm not sure how
> well this scenario is supported in the tooling though.
>
> -Ivan
