[ https://issues.apache.org/jira/browse/ZOOKEEPER-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975961#comment-16975961 ]
maoling edited comment on ZOOKEEPER-335 at 11/17/19 10:05 AM:
--------------------------------------------------------------

Users may be confused by the two variables introduced by this ticket: *acceptedEpoch* and *currentEpoch*.

The implementation up to version 3.3.3 did not include the epoch variables *acceptedEpoch* and *currentEpoch*. This omission caused problems in a production version and was noticed by many ZooKeeper clients.

- *acceptedEpoch*: the epoch number of the last *NEWEPOCH* message accepted;
- *currentEpoch*: the epoch number of the last *NEWLEADER* message accepted.

The origin of this problem is at the beginning of the *Recovery* Phase, when the leader increments its epoch (contained in *lastZxid*) even before acquiring a quorum of successfully connected followers (such a leader is called a *false leader*). Since a follower goes back to *FLE* (Fast Leader Election) if its epoch is larger than the leader’s epoch, when a *false leader* drops leadership and becomes a follower of a leader from a previous epoch, it finds a smaller epoch and goes back to *FLE*. This behavior can loop, switching from the *Recovery* Phase to *FLE* repeatedly.

Consequently, when *lastZxid* alone is used to store the epoch number, the implementation cannot distinguish between a *tried* epoch and a *joined* epoch. Those are the respective purposes of *acceptedEpoch* and *currentEpoch*, so omitting them gives rise to such problems.

More details can be found in this report: _*ZooKeeper’s atomic broadcast protocol: Theory and practice. André Medeiros, March 20, 2012*_

> zookeeper servers should commit the new leader txn to their logs.
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-335
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-335
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.0
>            Reporter: Mahadev Konar
>            Assignee: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-335.patch, ZOOKEEPER-335_2.patch, ZOOKEEPER-335_3.patch, ZOOKEEPER-335_4.patch, ZOOKEEPER-335_5.patch, ZOOKEEPER-790.travis.log.bz2, faultynode-vishal.txt, zk.log.gz, zklogs.tar.gz
>
>
> Currently the ZooKeeper followers do not commit the new leader election. This will cause problems in failure scenarios where a follower acks the same leader txn id twice, possibly for two different intermittent leaders, which allows them to propose two different txns with the same zxid.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
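
The distinction the comment draws between a *tried* epoch (*acceptedEpoch*) and a *joined* epoch (*currentEpoch*) can be sketched in a few lines of Python. This is only an illustrative model under simplifying assumptions, not ZooKeeper's actual code: the `Peer` class and its `on_new_epoch` / `on_new_leader` methods are hypothetical names, and message handling is reduced to an epoch comparison.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical, simplified peer state; not ZooKeeper's real classes."""

    accepted_epoch: int = 0  # last NEWEPOCH accepted: an epoch that was *tried*
    current_epoch: int = 0   # last NEWLEADER accepted: an epoch that was *joined*

    def on_new_epoch(self, epoch: int) -> bool:
        """Accept a NEWEPOCH only if it is newer than any epoch tried so far."""
        if epoch > self.accepted_epoch:
            self.accepted_epoch = epoch
            return True
        return False

    def on_new_leader(self, epoch: int) -> bool:
        """Join an epoch only if it is the one this peer last accepted."""
        if epoch == self.accepted_epoch:
            self.current_epoch = epoch
            return True
        return False


peer = Peer(accepted_epoch=3, current_epoch=3)

# A would-be leader proposes epoch 4 but never gathers a quorum,
# so no NEWLEADER for epoch 4 ever reaches this peer.
peer.on_new_epoch(4)
print(peer)  # Peer(accepted_epoch=4, current_epoch=3)
```

In this model the failed attempt only advances `accepted_epoch`; `current_epoch` still records the last epoch the peer actually joined. With a single epoch stored in *lastZxid*, those two facts collapse into one number, which is the ambiguity the comment describes.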