Flavio, take a look at 1264... I'm not sure this is the cause, but I'm not at a computer to look more right now.
From my phone

On Nov 4, 2011 2:26 PM, "Flavio Junqueira (Commented) (JIRA)" <j...@apache.org> wrote:
>
> [ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144237#comment-13144237 ]
>
> Flavio Junqueira commented on ZOOKEEPER-1270:
> ---------------------------------------------
>
> Here is some progress. I was actually looking at the wrong snippet. The
> correct one was the NEWLEADER handler:
>
> {noformat}
>             case Leader.NEWLEADER: // it will be NEWLEADER in v1.0
>                 zk.takeSnapshot();
>                 snapshotTaken = true;
>                 writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), true);
>                 break;
>             }
> {noformat}
>
> We also take a snapshot here and by looking at the stack trace that Pat
> posted, we see that the learner handlers are stuck in the loop right after
> receiving the ack, which essentially waits for the leader to start. By the
> same stack trace, the leader is not starting because it is waiting for the
> followers to acknowledge the NEWLEADER message... but the followers have
> acknowledged the NEWLEADER message, otherwise the learner handlers wouldn't
> be executing that loop (Line 450). Unless I'm missing anything, the problem
> must be in Leader.processAck.
>
> > testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
> > -----------------------------------------------------------------------------
> >
> >                 Key: ZOOKEEPER-1270
> >                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270
> >             Project: ZooKeeper
> >          Issue Type: Bug
> >          Components: server
> >            Reporter: Patrick Hunt
> >            Priority: Blocker
> >             Fix For: 3.4.0, 3.5.0
> >
> >         Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch,
> >                      testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz,
> >                      testEarlyLeaderAbandonment3.txt.gz
> >
> >
> > Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily.
> > This test was introduced in the following commit (all three jira commit at once):
> > ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs.
> > ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader
> > ZOOKEEPER-1082. modify leader election to correctly take into account current
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators:
> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
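
To make the hang described in Flavio's comment concrete, here is a toy Java model of the handshake. It is not ZooKeeper's actual code: the leader waits for a quorum of NEWLEADER acks before it starts, and each learner handler passes its follower's ack along and then waits for the leader to start. ToyLeader, waitForQuorumAndStart, waitForStart and simulate are made-up names for illustration, and this processAck is a stand-in, not the real Leader.processAck. The "lost ack" case simply assumes the ack never gets counted; whether anything like that happens in the real code is exactly the open question.

```java
import java.util.HashSet;
import java.util.Set;

public class NewLeaderAckSketch {

    /** Toy leader with hypothetical names; not ZooKeeper's Leader class. */
    static final class ToyLeader {
        private final int ensembleSize;
        private final Set<Long> acks = new HashSet<>();
        private boolean started = false;

        ToyLeader(int ensembleSize, long leaderId) {
            this.ensembleSize = ensembleSize;
            acks.add(leaderId); // the leader counts its own ack
        }

        private boolean hasQuorum() {
            return acks.size() > ensembleSize / 2;
        }

        /** Called by a handler thread when its follower acks NEWLEADER. */
        synchronized void processAck(long followerId) {
            acks.add(followerId);
            notifyAll(); // wake the leader so it re-checks the quorum
        }

        /** Leader side: block until a quorum has acked, then mark started. */
        synchronized boolean waitForQuorumAndStart(long timeoutMs) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (!hasQuorum()) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0) {
                    return false; // gave up: never saw a quorum of acks
                }
                wait(left);
            }
            started = true;
            notifyAll(); // release handlers waiting for the leader to start
            return true;
        }

        /** Handler side: the loop that waits for the leader to start. */
        synchronized boolean waitForStart(long timeoutMs) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (!started) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0) {
                    return false; // gave up: leader never started
                }
                wait(left);
            }
            return true;
        }
    }

    /**
     * Runs one leader and one handler in a 3-server ensemble.
     * Returns true only if the leader started and the handler got past its wait.
     */
    static boolean simulate(boolean ackReachesLeader, long timeoutMs) throws InterruptedException {
        ToyLeader leader = new ToyLeader(3, 1L);
        boolean[] handlerProceeded = {false};
        Thread handler = new Thread(() -> {
            try {
                if (ackReachesLeader) {
                    leader.processAck(2L); // follower 2 acked NEWLEADER
                }
                handlerProceeded[0] = leader.waitForStart(timeoutMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        handler.start();
        boolean leaderStarted = leader.waitForQuorumAndStart(timeoutMs);
        handler.join(); // join makes the handler's write to the flag visible here
        return leaderStarted && handlerProceeded[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("ack counted: serving=" + simulate(true, 2000));
        System.out.println("ack lost:    serving=" + simulate(false, 200));
    }
}
```

In the second run both sides wait on each other until the timeouts fire, which matches the shape of the stack trace: handlers parked waiting for the leader, leader parked waiting for acks. The real code has no such timeouts in this sketch's sense, so there the wait would simply not end.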