I might be wrong here, but let me chip in my two cents. I think the problem is in LearnerHandler.java at the leader of this Follower.

/* see what other packets from the proposal
 * and tobeapplied queues need to be sent
 * and then decide if we can just send a DIFF
 * or we actually need to send the whole snapshot
 */
long leaderLastZxid = leader.startForwarding(this, updates);
---> this leaderLastZxid returned is probably incorrect.

// a special case when both the ids are the same
if (peerLastZxid == leaderLastZxid) {
    packetToSend = Leader.DIFF;
    zxidToSend = leaderLastZxid;
}
QuorumPacket newLeaderQP = new QuorumPacket(Leader.NEWLEADER,
        leaderLastZxid, null, null);
oa.writeRecord(newLeaderQP, "packet");
bufferedOutput.flush();

On Fri, Jun 18, 2010 at 4:49 PM, Flavio Paiva Junqueira (JIRA) <j...@apache.org> wrote:

>
> [https://issues.apache.org/jira/browse/ZOOKEEPER-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880320#action_12880320]
>
> Flavio Paiva Junqueira commented on ZOOKEEPER-335:
> --------------------------------------------------
>
> Guys, I don't see enough information in these logs to determine what's
> going on. Let me tell you what I'm seeing so that perhaps other folks can
> help me out here.
>
> One part of the log that is suspicious is this one:
>
> {noformat}
> =6693 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x300000001 expected 0x1
> =6693 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x300000001 expected 0x1
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor30]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor27]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor22]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor23]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor18]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor20]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor19]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor31]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor21]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor26]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor25]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor33]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor29]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor28]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor24]
> [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor32]
>
> ************* NODE RESTARTED HERE **********************
> {noformat}
>
> Before being restarted, the bad node receives a proposal with zxid <3,1>
> and it expects <0,1>. Next in the logs after being restarted, I can see that
> it is complaining that it has epoch 4 and the leader 3. Something strange
> apparently happened during the restart.
> It also seems to be the case that the node was being able to talk to the
> others (first entries in the log before the excerpt above).
>
> Do you guys see anything I'm overlooking?
>
> > zookeeper servers should commit the new leader txn to their logs.
> > -----------------------------------------------------------------
> >
> >                 Key: ZOOKEEPER-335
> >                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-335
> >             Project: Zookeeper
> >          Issue Type: Bug
> >          Components: server
> >    Affects Versions: 3.1.0
> >            Reporter: Mahadev konar
> >            Assignee: Mahadev konar
> >            Priority: Blocker
> >             Fix For: 3.4.0
> >
> >         Attachments: zk.log.gz, zklogs.tar.gz
> >
> >
> > currently the zookeeper followers do not commit the new leader election.
> > This will cause problems in a failure scenarios with a follower acking to
> > the same leader txn id twice, which might be two different intermittent
> > leaders and allowing them to propose two different txn's of the same zxid.
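
In case it helps anyone reading the quoted log excerpt: the <3,1> and <0,1> notation is just the raw zxid split into epoch and counter. Here is a minimal sketch of that split, assuming the usual layout where the high 32 bits of the zxid hold the epoch and the low 32 bits hold the counter. The class and method names (ZxidDecode, epochOf, counterOf, pretty) are made up for illustration and are not taken from the ZooKeeper source.

```java
// Illustrative helper only; names are hypothetical, not ZooKeeper classes.
public class ZxidDecode {

    // Assumption: the high 32 bits of a zxid carry the leader epoch.
    public static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Assumption: the low 32 bits carry the per-epoch transaction counter.
    public static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // Render a zxid in the <epoch,counter> form used in the quoted comment.
    public static String pretty(long zxid) {
        return "<" + epochOf(zxid) + "," + counterOf(zxid) + ">";
    }

    public static void main(String[] args) {
        long got = 0x300000001L;   // "Got zxid 0x300000001" from the WARN line
        long expected = 0x1L;      // "expected 0x1" from the same line
        System.out.println("got      " + pretty(got));
        System.out.println("expected " + pretty(expected));
    }
}
```

Read this way, the WARN line says the follower was handed a proposal from epoch 3 while it was still expecting the first txn of epoch 0, which is why the zxid the leader hands back in the snippet at the top looks like the place to check.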