[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714965#comment-13714965
 ] 

Germán Blanco commented on ZOOKEEPER-1732:
------------------------------------------

I tried to refresh the proposal simply by calling "updateProposal(getInitId(), 
getInitLastLoggedZxid(), getPeerEpoch());" before sending a notification when 
the sending server is part of an established ensemble. The test didn't run 
for long enough, for unrelated reasons, but I now think this approach can't 
work anyway. Having read your alternatives and the way Votes are compared, I 
see that zxid and epoch need to be the same in all members of the ensemble, 
and in this race the follower hasn't received the zxid that the leader used 
to finish the election.
My personal preference would be "3", because it is faster (the follower 
doesn't go back to LOOKING; it can just update the proposal with the info in 
LeaderInfo) and it doesn't depend on any further races that could lead to the 
follower not processing the notification from the leader.
If the protocol backward compatibility issues are just a matter of more work, 
then I am very willing to help as much as I can.
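
To illustrate the vote-matching problem described above, here is a minimal 
sketch. It assumes a simplified vote made of only a leader id, a zxid and a 
peer epoch, compared field by field. The class VoteSketch and the values used 
are hypothetical and are not ZooKeeper's actual Vote class.

```java
import java.util.Objects;

/** Simplified stand-in for an election vote; NOT ZooKeeper's actual Vote class. */
final class VoteSketch {
    final long id;        // server id of the proposed leader
    final long zxid;      // last logged zxid claimed for the proposed leader
    final long peerEpoch; // epoch claimed for the proposed leader

    VoteSketch(long id, long zxid, long peerEpoch) {
        this.id = id;
        this.zxid = zxid;
        this.peerEpoch = peerEpoch;
    }

    /** Assumption: two votes support the same leader only if every field matches. */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof VoteSketch)) {
            return false;
        }
        VoteSketch other = (VoteSketch) o;
        return id == other.id && zxid == other.zxid && peerEpoch == other.peerEpoch;
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, zxid, peerEpoch);
    }

    public static void main(String[] args) {
        // Vote the leader used to finish the election (made-up values).
        VoteSketch leaderVote = new VoteSketch(3, 0x200000005L, 2);
        // Follower that has not yet received the leader's last zxid.
        VoteSketch staleFollowerVote = new VoteSketch(3, 0x200000004L, 2);
        // Follower that copied the leader's values into its own proposal.
        VoteSketch refreshedVote =
                new VoteSketch(leaderVote.id, leaderVote.zxid, leaderVote.peerEpoch);

        System.out.println(leaderVote.equals(staleFollowerVote)); // false
        System.out.println(leaderVote.equals(refreshedVote));     // true
    }
}
```

In this sketch the stale follower names the same leader but its vote still 
does not match, which is the race described above. The third vote models a 
follower that takes the values from the leader, in the spirit of alternative 
"3".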
                
> ZooKeeper server unable to join established ensemble
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-1732
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1732
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.4.5
>         Environment: Windows 7, Java 1.7
>            Reporter: Germán Blanco
>            Priority: Blocker
>             Fix For: 3.5.0, 3.4.6
>
>         Attachments: zklog.tar.gz
>
>
> I have a test in which I do a rolling restart of three ZooKeeper servers and 
> it was failing from time to time.
> I ran the tests in a loop until the failure came out and it seems that at 
> some point one of the servers is unable to join the ensemble formed by the 
> other two.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
