[ https://issues.apache.org/jira/browse/ZOOKEEPER-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083303#comment-13083303 ]
Hudson commented on ZOOKEEPER-1144:
-----------------------------------

Integrated in ZooKeeper-trunk #1261 (See [https://builds.apache.org/job/ZooKeeper-trunk/1261/])
    ZOOKEEPER-1144: ZooKeeperServer not starting on leader due to a race condition (Vishal K via camille)

camille : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1156649
Files :
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java

> ZooKeeperServer not starting on leader due to a race condition
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1144
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1144
>             Project: ZooKeeper
>          Issue Type: Bug
>   Affects Versions: 3.4.0
>            Reporter: Vishal Kher
>            Assignee: Vishal Kher
>            Priority: Blocker
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-1144.patch
>
>
> I have found one problem that is causing QuorumPeerMainTest:testQuorum to
> fail. This test uses 2 ZK servers.
> The test is failing because the leader is not starting ZooKeeperServer after
> leader election, so everything halts.
> With the new changes, the server is now started in Leader.processAck(), which
> is called from LearnerHandler. processAck() starts ZooKeeperServer if a
> majority have acked NEWLEADER. The leader puts its ack in the ackSet in
> Leader.lead(). Since processAck() is called from LearnerHandler, it can happen
> that the learner's ack is processed before the leader is able to put its ack
> in the ackSet. When LearnerHandler invokes processAck(), the ackSet for
> newLeaderProposal will not have quorum (in this case 2). As a result, the
> ZooKeeperServer is never started on the Leader.
> The leader needs to ensure that its ack is put in ackSet before starting
> LearnerCnxAcceptor, or invoke processAck() itself after adding to ackSet. I
> haven't had time to go through the ZAB2 changes, so I am not too familiar with
> the code. Can Ben/Flavio fix this?
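
The ordering problem described above can be reproduced in a small, single-threaded model. This is only a sketch under stated assumptions, not the code in Leader.java and not the committed patch: the class AckQuorumSketch, its methods, and the server ids 1 (leader) and 2 (learner) are hypothetical names chosen for illustration. It assumes a 2-server ensemble (quorum of 2) and shows that if the learner's ack is counted first and the leader later adds its own ack without re-checking, the server never starts, whereas either remedy suggested in the report does start it.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of NEWLEADER ack bookkeeping.
// It is NOT the real org.apache.zookeeper.server.quorum.Leader class.
class AckQuorumSketch {
    private final Set<Long> ackSet = new HashSet<>();
    private final int quorumSize;
    private boolean serverStarted = false;

    AckQuorumSketch(int quorumSize) {
        this.quorumSize = quorumSize;
    }

    // Stand-in for processAck(): record the ack, then start the server
    // once a quorum of acks has been collected.
    synchronized void processAck(long sid) {
        ackSet.add(sid);
        if (!serverStarted && ackSet.size() >= quorumSize) {
            serverStarted = true; // stand-in for starting ZooKeeperServer
        }
    }

    // Models the leader putting its own ack into ackSet directly,
    // without re-checking whether a quorum now exists.
    synchronized void addLeaderAckWithoutCheck(long leaderSid) {
        ackSet.add(leaderSid);
    }

    synchronized boolean isServerStarted() {
        return serverStarted;
    }

    // Problem ordering: learner ack is processed first (1 of 2), then the
    // leader adds its ack (2 of 2), but nothing re-evaluates the quorum.
    static boolean runBuggyOrdering() {
        AckQuorumSketch leader = new AckQuorumSketch(2);
        leader.processAck(2L);
        leader.addLeaderAckWithoutCheck(1L);
        return leader.isServerStarted();
    }

    // Remedy A: the leader's ack is in ackSet before any learner ack arrives.
    static boolean runLeaderAckFirst() {
        AckQuorumSketch leader = new AckQuorumSketch(2);
        leader.addLeaderAckWithoutCheck(1L);
        leader.processAck(2L);
        return leader.isServerStarted();
    }

    // Remedy B: the leader routes its own ack through processAck(),
    // so the quorum check runs regardless of arrival order.
    static boolean runLeaderCallsProcessAck() {
        AckQuorumSketch leader = new AckQuorumSketch(2);
        leader.processAck(2L);
        leader.processAck(1L);
        return leader.isServerStarted();
    }

    public static void main(String[] args) {
        System.out.println("buggy ordering started server: " + runBuggyOrdering());
        System.out.println("leader ack first started server: " + runLeaderAckFirst());
        System.out.println("leader calls processAck started server: " + runLeaderCallsProcessAck());
    }
}
```

In this model the first case reports false and the other two report true. Remedy B has the advantage of not depending on thread start order, since every ack, including the leader's, triggers the quorum check.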