[
https://issues.apache.org/jira/browse/ZOOKEEPER-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715115#comment-13715115
]
Flavio Junqueira commented on ZOOKEEPER-1732:
---------------------------------------------
I understand your concern, and it is a valid one. Our goal, however, shouldn't
be to have every follower hold the most recent notification from the leader.
Our goal is to have a leader with enough supporters that we can make
progress. If all servers are either following or leading, then it doesn't
matter if a follower has stale leader election (LE) information. It does
become an issue, however, in the
case that you uncovered with your logs. In this scenario, the stale follower
will receive a more recent notification from either the leader, when it is
trying to be re-elected, or from the follower that is stuck. In either case, it
will be able to determine that it is stale and stop following.
The bottom line is that the follower stops following only if it realizes that
it is stale. If it doesn't hear anything, then it just keeps going. Would that
work?
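
To make the rule above concrete, here is a minimal sketch of "stop following
only on evidence of staleness". It is not the ZooKeeper election code: the
class names, the fields, and the use of a single epoch number to decide which
notification is "more recent" are all assumptions made for illustration, and
the real election logic compares more than one field.

```java
// Hypothetical sketch, not ZooKeeper's implementation.
final class StaleFollowerSketch {

    enum State { FOLLOWING, LOOKING }

    /** Assumed notification shape: the sender's election epoch and its proposed leader. */
    static final class Notification {
        final long electionEpoch;
        final long proposedLeader;

        Notification(long electionEpoch, long proposedLeader) {
            this.electionEpoch = electionEpoch;
            this.proposedLeader = proposedLeader;
        }
    }

    /** A follower that remembers the epoch in which it chose its leader. */
    static final class Follower {
        private final long leaderId;
        private final long electionEpoch;
        private State state = State.FOLLOWING;

        Follower(long leaderId, long electionEpoch) {
            this.leaderId = leaderId;
            this.electionEpoch = electionEpoch;
        }

        /**
         * Called for every notification that arrives, whether it comes from
         * the leader or from another server. Only a strictly newer epoch is
         * treated as proof of staleness; anything else is ignored.
         */
        void onNotification(Notification n) {
            if (state == State.FOLLOWING && n.electionEpoch > electionEpoch) {
                state = State.LOOKING; // stale: stop following, rejoin the election
            }
        }

        State state() {
            return state;
        }

        long leaderId() {
            return leaderId;
        }
    }

    public static void main(String[] args) {
        Follower follower = new Follower(1L, 5L);

        // No notification at all: the follower just keeps going.
        System.out.println(follower.state());

        // A notification from the same epoch carries no new information.
        follower.onNotification(new Notification(5L, 1L));
        System.out.println(follower.state());

        // A notification from a later epoch shows the follower is stale.
        follower.onNotification(new Notification(6L, 2L));
        System.out.println(follower.state());
    }
}
```

Note that nothing in the sketch changes the message format: the follower only
reacts differently to information that, by assumption, is already carried in
the notifications it receives.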
As for changing the protocol: in my experience, changing messages is a pain
because there are many subtle cases and it is quite easy to get them wrong. It
is best not to touch the messages. I think we can take care of this case
without really changing the messages we are sending.
> ZooKeeper server unable to join established ensemble
> ----------------------------------------------------
>
> Key: ZOOKEEPER-1732
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1732
> Project: ZooKeeper
> Issue Type: Bug
> Components: leaderElection
> Affects Versions: 3.4.5
> Environment: Windows 7, Java 1.7
> Reporter: Germán Blanco
> Priority: Blocker
> Fix For: 3.5.0, 3.4.6
>
> Attachments: zklog.tar.gz
>
>
> I have a test in which I do a rolling restart of three ZooKeeper servers and
> it was failing from time to time.
> I ran the tests in a loop until the failure came out and it seems that at
> some point one of the servers is unable to join the ensemble formed by the
> other two.