[
https://issues.apache.org/jira/browse/ZOOKEEPER-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719402#comment-13719402
]
Flavio Junqueira commented on ZOOKEEPER-1732:
---------------------------------------------
By "agree to vote", don't you need a different message pattern, even if the
message content is the same? You're still changing the protocol here. Also, we
don't need agreement, since different processes can have a different opinion
about who the leader should be. They need to agree before they start a new
epoch, but that's precisely what the recovery phase of Zab does. It actually
does a bit more, but the state synchronization is not relevant to this
discussion.
bq. it actually doesn't take part in the leader election logic
This is not entirely true: the LE step exposes a leader that has the highest
zxid among a quorum of servers. Also, I think that you're using LE to mean the
recovery phase of Zab, not the initial protocol that finds a prospective
leader.
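To make the "highest zxid among a quorum" point concrete, here is a minimal sketch of how a prospective leader could be picked from a set of votes. The class and method names (VoteOrderSketch, supersedes, best) are hypothetical and are not ZooKeeper's actual classes, and breaking ties by the higher server id is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not ZooKeeper's
// actual leader election classes.
final class VoteOrderSketch {
    static final class Vote {
        final long serverId;
        final long zxid;

        Vote(long serverId, long zxid) {
            this.serverId = serverId;
            this.zxid = zxid;
        }
    }

    // True if candidate should replace current: the higher zxid wins,
    // and (assumption) the higher server id breaks ties.
    static boolean supersedes(Vote candidate, Vote current) {
        if (candidate.zxid != current.zxid) {
            return candidate.zxid > current.zxid;
        }
        return candidate.serverId > current.serverId;
    }

    // Picks the preferred vote among those received from a quorum.
    static Vote best(List<Vote> quorumVotes) {
        Vote best = quorumVotes.get(0);
        for (Vote v : quorumVotes) {
            if (supersedes(v, best)) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Vote> votes = Arrays.asList(
                new Vote(1, 0x100000003L),
                new Vote(2, 0x100000007L),
                new Vote(3, 0x100000007L));
        System.out.println("prospective leader: " + best(votes).serverId);
    }
}
```

Note that this only finds a prospective leader; agreement on the new epoch still happens in the recovery phase.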
bq. The new server just checks if the ensemble has a quorum and the leader is
alive (sends a notification voting for itself)
I believe we have discussed this point in this JIRA. As you have observed, the
ensemble is still able to make progress in the situation you originally
described, so the inconsistent LE information doesn't prevent ZooKeeper from
doing work. The problem is getting a server stuck, which we fix by making sure
that a follower is able to send notifications with state that reflects the
latest leader election.
One option I was actually considering is to loosen the constraint that all
FOLLOWING/LEADING notifications need to come from the same LE round. This is
possibly too conservative, so it might be ok to change it, but I need to think
a bit more about it.
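As a rough illustration of what loosening that constraint would mean, the sketch below compares a strict check (a quorum of FOLLOWING/LEADING notifications for the same leader, all from one LE round) with a relaxed check that ignores the round. All names here (RoundConstraintSketch, Notification, hasQuorumSameRound, hasQuorumAnyRound) are hypothetical, and the code is not taken from ZooKeeper; it only models the counting rule.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names are hypothetical, not ZooKeeper code.
final class RoundConstraintSketch {
    // A notification from a server in FOLLOWING or LEADING state.
    static final class Notification {
        final long senderId;
        final long leaderId;
        final long round; // LE round the sender reports

        Notification(long senderId, long leaderId, long round) {
            this.senderId = senderId;
            this.leaderId = leaderId;
            this.round = round;
        }
    }

    // Strict rule: a majority must support leaderId within a single round.
    static boolean hasQuorumSameRound(List<Notification> ns, long leaderId, int ensembleSize) {
        Map<Long, Set<Long>> sendersByRound = new HashMap<>();
        for (Notification n : ns) {
            if (n.leaderId == leaderId) {
                sendersByRound.computeIfAbsent(n.round, r -> new HashSet<>()).add(n.senderId);
            }
        }
        for (Set<Long> senders : sendersByRound.values()) {
            if (senders.size() > ensembleSize / 2) {
                return true;
            }
        }
        return false;
    }

    // Relaxed rule: a majority must support leaderId, regardless of round.
    static boolean hasQuorumAnyRound(List<Notification> ns, long leaderId, int ensembleSize) {
        Set<Long> senders = new HashSet<>();
        for (Notification n : ns) {
            if (n.leaderId == leaderId) {
                senders.add(n.senderId);
            }
        }
        return senders.size() > ensembleSize / 2;
    }

    public static void main(String[] args) {
        // Three-server ensemble: servers 2 and 3 both follow leader 3,
        // but report different LE rounds.
        List<Notification> ns = Arrays.asList(
                new Notification(2, 3, 5),
                new Notification(3, 3, 6));
        System.out.println("strict:  " + hasQuorumSameRound(ns, 3, 3));
        System.out.println("relaxed: " + hasQuorumAnyRound(ns, 3, 3));
    }
}
```

In this model a joining server that receives notifications with mismatched rounds never sees a quorum under the strict rule, while the relaxed rule lets it accept the leader. Whether relaxing the rule is safe is exactly the open question above.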
> ZooKeeper server unable to join established ensemble
> ----------------------------------------------------
>
> Key: ZOOKEEPER-1732
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1732
> Project: ZooKeeper
> Issue Type: Bug
> Components: leaderElection
> Affects Versions: 3.4.5
> Environment: Windows 7, Java 1.7
> Reporter: Germán Blanco
> Priority: Blocker
> Fix For: 3.5.0, 3.4.6
>
> Attachments: zklog.tar.gz
>
>
> I have a test in which I do a rolling restart of three ZooKeeper servers and
> it was failing from time to time.
> I ran the tests in a loop until the failure came out and it seems that at
> some point one of the servers is unable to join the ensemble formed by the
> other two.