[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843241#action_12843241
 ] 

Henry Robinson commented on ZOOKEEPER-684:
------------------------------------------

According to the logs, the problem appears to be that thread 1 receives two 
votes for server 2. This happens because thread 0, having already received a 
full complement of votes, votes for 2 rather than for itself.

The race is therefore that thread 0 collects all its votes and updates its own 
vote before thread 1 receives a response.

To fix this, I propose adding a CountDownLatch between lookForLeader and 
setCurrentVote to ensure that both thread 0 and thread 1 complete one round of 
voting before updating their choices. I can't reproduce the error here, so I'll 
post a patch that Ben can hopefully try.
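
As a rough illustration of the latch idea, and not the actual patch, here is a 
minimal standalone sketch. It assumes two peer threads; the class name 
LatchSketch, the method runOnce and the counters are hypothetical, and the 
comments mark where lookForLeader and setCurrentVote would sit in the real 
test. Each thread counts down after its round and then waits on the latch, so 
neither can update its vote until both rounds are done.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LatchSketch {
    // Returns true if neither thread reached its "update vote" step
    // before both threads had finished one round of voting.
    static boolean runOnce() {
        final int peers = 2;
        final CountDownLatch roundDone = new CountDownLatch(peers);
        final AtomicInteger roundsFinished = new AtomicInteger();
        final AtomicInteger violations = new AtomicInteger();

        Runnable peer = () -> {
            // Hypothetical stand-in for one lookForLeader() round.
            roundsFinished.incrementAndGet();

            // Signal that this thread's round is done, then wait for the other.
            roundDone.countDown();
            try {
                if (!roundDone.await(5, TimeUnit.SECONDS)) {
                    violations.incrementAndGet(); // timed out waiting for the peer
                    return;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                violations.incrementAndGet();
                return;
            }

            // Hypothetical stand-in for setCurrentVote(): by now both
            // rounds must have completed.
            if (roundsFinished.get() < peers) {
                violations.incrementAndGet();
            }
        };

        Thread t0 = new Thread(peer, "thread-0");
        Thread t1 = new Thread(peer, "thread-1");
        t0.start();
        t1.start();
        try {
            t0.join();
            t1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return violations.get() == 0;
    }

    public static void main(String[] args) {
        System.out.println(runOnce() ? "ok" : "vote updated too early");
    }
}
```

The timeout on await is only there so that a stuck peer fails the run instead 
of hanging it; the real test may want a different bound.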

> Race in LENonTerminateTest
> --------------------------
>
>                 Key: ZOOKEEPER-684
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-684
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: leaderElection, server
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Henry Robinson
>            Priority: Critical
>             Fix For: 3.3.0
>
>         Attachments: zookeeper-684-test-failure.rtf
>
>
> testNonTermination failed during a Hudson run for ZOOKEEPER-59. After 
> inspecting the output, it looks like a server is electing 2 as leader and 
> leaving. Given that 2 is just a mock server, server 0 remains alone in leader 
> election.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
