@李珣 The scenario you describe seems to rest on a couple of misconceptions about how the consensus protocol works:

---> Since the data of the follower when the follower uses the DIFF method to synchronize with the leader is still in the memory, it has not had time to persist

1. The write path is the other way around: a transaction is first written to the transaction log (WAL), and only after consensus is reached is it applied to memory.

---> but at this time, the latest zxid_n of the leader has not been supported by the quorum of the follower. At this time, if a client connects to the leader and sees zxid_n,

2. If a write has not been acknowledged by a quorum, it is not safe to apply it to the state machine, so clients cannot see that write.

I guess your real question may be: how does the system handle uncommitted log entries when the leader changes?
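
To make the ordering in points 1 and 2 concrete, here is a minimal toy model in Python. It is only a sketch and not ZooKeeper code: `ToyServer`, `propose`, and the `reachable` set are names I made up for illustration, and the real implementation exchanges separate PROPOSAL/ACK/COMMIT messages and syncs the log asynchronously. The sketch only shows that a transaction is logged before it is acknowledged, and applied to the in-memory state only after a quorum has logged it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical stand-in for one ensemble member; not a ZooKeeper class."""
    name: str
    wal: list = field(default_factory=list)    # durable log of (zxid, key, value)
    state: dict = field(default_factory=dict)  # in-memory state that clients read

    def log(self, zxid, key, value):
        """Persist the proposal to the log first; the return value is the ACK."""
        self.wal.append((zxid, key, value))
        return True

    def apply_committed(self, zxid):
        """Apply an already-logged transaction once it is known to be committed."""
        for logged_zxid, key, value in self.wal:
            if logged_zxid == zxid:
                self.state[key] = value


def propose(leader, followers, reachable, zxid, key, value):
    """Return True if the write committed, i.e. a quorum logged it."""
    ensemble_size = 1 + len(followers)
    acks = 1 if leader.log(zxid, key, value) else 0  # the leader logs it too
    for f in followers:
        if f.name in reachable and f.log(zxid, key, value):
            acks += 1
    if acks <= ensemble_size // 2:
        return False  # no quorum: the entry stays in the log, unapplied
    leader.apply_committed(zxid)
    for f in followers:
        if f.name in reachable:
            f.apply_committed(zxid)
    return True


leader = ToyServer("A")
followers = [ToyServer("B"), ToyServer("C")]

committed = propose(leader, followers, {"B"}, 1, "x", "v1")    # A and B log it
uncommitted = propose(leader, followers, set(), 2, "x", "v2")  # only A logs it
print(committed, uncommitted, leader.state, [z for z, _, _ in leader.wal])
```

In this toy run zxid 2 ends up in the leader's log but never in its `state`, so a client reading from the leader still sees `v1`. What happens to such a logged-but-uncommitted entry after a leader change is the separate question raised above.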
----- Original Message -----
From: Ted Dunning <ted.dunn...@gmail.com>
To: dev@zookeeper.apache.org
Subject: Re: May violate the ZAB agreement -- version 3.6.1
Date: 2020-08-28 01:25

How is it that participant A would have a later zxid than the leader?

In particular, it seems to me that it should be impossible to have these two facts be true:

1) a transaction has been committed with zxid = z_0. This implies that a quorum of the cluster has accepted this transaction and it has been committed.

2) a new leader election nominates a leader with latest zxid < z_0.

My reasoning is that any new leader election has to involve a quorum and at least a sufficient number of that quorum must have accepted zxid >= z_0 and therefore would refuse to be part of the quorum (this is a contradiction). Thus, no leader could be elected with zxid < z_0 if fact (1) is true.

What you are describing seems to require both of these facts.

Perhaps I am missing something about your suggested scenario. Could you describe what you are thinking in more detail?

On Thu, Aug 27, 2020 at 2:08 AM 李珣 <274952...@qq.com> wrote:

> version 3.6.1
> org.apache.zookeeper.server.quorum.Learner.java line:605
> Suppose there is a situation
> zxid_n is the largest zxid of Participant A (the leader has just resumed
> from downtime). Zxid_n has not been recognized by the quorum. Assuming
> Participant A is elected as the Leader, then if a follower appears to use
> DIFF to synchronize data with the Leader, Leader After sending the
> UPTODATE, the leader can already provide external access, but at this time,
> the latest zxid_n of the leader has not been supported by the quorum of the
> follower. At this time, if a client connects to the leader and sees zxid_n,
> then at this time both the leader and the follower are down. For some
> reason, the leader cannot be started, and the follower can start normally.
> At this time, a new leader can only be elected from the follower. Since the
> data of the follower when the follower uses the DIFF method to synchronize
> with the leader is still in the memory, it has not had time to persist,
> then this The newly elected leader does not have the data of zxid_n, but
> before zxid_n has been seen by the client on the old leader, there will be
> inconsistencies in the data view.
> Is the above situation possible?