[ https://issues.apache.org/jira/browse/ZOOKEEPER-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843300#action_12843300 ]
Flavio Paiva Junqueira commented on ZOOKEEPER-684:
--------------------------------------------------

I agree with your observation:

"Thread 1 has received 2 votes for server 2 as the leader. It then exits, and this is the problem, I think. As a result, Thread 0 can never get a quorum."

My interpretation is that this happens because server 1 times out before receiving the vote of server 2 in round 1. In the second round, server 1 then receives the vote of mock server 2 and the vote of server 0 (also supporting 2), which cause server 1 to leave prematurely.

I also don't think your patch works, because "peer.getElectionAlg().lookForLeader()" won't return until the election is over. That method is not called once per round.

> Race in LENonTerminateTest
> --------------------------
>
>                 Key: ZOOKEEPER-684
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-684
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: leaderElection, server
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Henry Robinson
>            Priority: Critical
>             Fix For: 3.3.0
>
>         Attachments: zookeeper-684-test-failure.rtf, ZOOKEEPER-684.patch
>
>
> testNonTermination failed during a Hudson run for ZOOKEEPER-59. After
> inspecting the output, it looks like server 1 is electing 2 as a leader and
> leaving. Given that 2 is just a mock server, server 0 remains alone in leader
> election.
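
For readers following the quorum arithmetic in the comment above, here is a minimal sketch of why server 1 can leave while server 0 stays stuck. It is not ZooKeeper's election code: the class QuorumSketch, the method hasQuorum, and the vote maps are hypothetical names made up for illustration. It also assumes that, once server 1 has exited, server 0 holds only its own vote for server 2.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not ZooKeeper's election code.
class QuorumSketch {
    // True if strictly more than half of the ensemble backs the candidate.
    static boolean hasQuorum(Map<Integer, Integer> votesBySender, int candidate, int ensembleSize) {
        int count = 0;
        for (int vote : votesBySender.values()) {
            if (vote == candidate) {
                count++;
            }
        }
        return count > ensembleSize / 2;
    }

    public static void main(String[] args) {
        int ensembleSize = 3; // servers 0 and 1 plus mock server 2

        // Votes received by server 1 in the second round, as described in the comment.
        Map<Integer, Integer> seenByServer1 = new HashMap<>();
        seenByServer1.put(2, 2); // vote from mock server 2, for server 2
        seenByServer1.put(0, 2); // vote from server 0, also for server 2

        // Assumption: after server 1 exits, server 0 holds only its own vote.
        Map<Integer, Integer> seenByServer0 = new HashMap<>();
        seenByServer0.put(0, 2);

        System.out.println("server 1 has quorum: " + hasQuorum(seenByServer1, 2, ensembleSize));
        System.out.println("server 0 has quorum: " + hasQuorum(seenByServer0, 2, ensembleSize));
    }
}
```

Under these assumptions, two votes out of three are enough for server 1 to conclude the election, while server 0 is left with a single vote and cannot reach a majority on its own.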