[ https://issues.apache.org/jira/browse/ZOOKEEPER-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051688#comment-14051688 ]
Hongchao Deng commented on ZOOKEEPER-1549:
------------------------------------------

I think the root of the problem is in this function:
```
public long restore(DataTree dt, Map<Long, Integer> sessions,
        PlayBackListener listener) throws IOException {
    ...
    while (true) {
        // iterator points to
        // the first valid txn when initialized
        hdr = itr.getHeader();
        ...
        processTransaction(hdr,dt,sessions, itr.getTxn());
        ...
        listener.onTxnLoaded(hdr, itr.getTxn());
        if (!itr.next())
            break;
    }
    ...
```
Instead of processing every transaction in the log, it would make more sense to keep a committedUpTo index and process transactions only up to that point. Am I correct?

> Data inconsistency when follower is receiving a DIFF with a dirty snapshot
> --------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1549
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1549
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.3
>            Reporter: Jacky007
>            Assignee: Thawan Kooburat
>            Priority: Blocker
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1549-3.4.patch, ZOOKEEPER-1549-learner.patch,
> case.patch
>
>
> The trunc code (from ZOOKEEPER-1154?) cannot work correctly if the snapshot is
> not correct.
> Here is the scenario (similar to 1154):
> Initial Condition
> 1. Let's say there are three nodes in the ensemble A,B,C with A being the
> leader
> 2. The current epoch is 7.
> 3. For simplicity of the example, let's say zxid is a two digit number,
> with epoch being the first digit.
> 4. The zxid is 73
> 5. All the nodes have seen the change 73 and have persistently logged it.
> Step 1
> Request with zxid 74 is issued. The leader A writes it to the log but there
> is a crash of the entire ensemble and B,C never write the change 74 to their
> log.
> Step 2
> A,B restart, A is elected as the new leader, and A will load data and take a
> clean snapshot (change 74 is in it), then send diff to B, but B died before
> syncing with A. A died later.
> Step 3
> B,C restart, A is still down
> B,C form the quorum
> B is the new leader. Let's say B minCommitLog is 71 and maxCommitLog is 73
> epoch is now 8, zxid is 80
> Request with zxid 81 is successful. On B, minCommitLog is now 71,
> maxCommitLog is 81
> Step 4
> A starts up. It applies the change in request with zxid 74 to its in-memory
> data tree
> A contacts B to registerAsFollower and provides 74 as its ZxId
> Since 71<=74<=81, B decides to send A the diff.
> Problem:
> The problem with the above sequence is that after truncating the log, A will
> load the snapshot again which is not correct.
> In the 3.3 branch, FileTxnSnapLog.restore does not call the listener
> (ZOOKEEPER-874), so the leader will send a snapshot to the follower and it
> will not be a problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
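
For illustration only, the idea raised in the comment (replay the log only up to a committedUpTo index rather than to the end) might look roughly like the sketch below. This is not ZooKeeper code: the Txn class, the replay method and the way committedUpTo is obtained are all hypothetical stand-ins, and the thread does not say where such an index would be persisted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BoundedReplaySketch {
    /** Hypothetical stand-in for a logged transaction; only the zxid matters here. */
    static final class Txn {
        final long zxid;
        Txn(long zxid) { this.zxid = zxid; }
    }

    /**
     * Applies transactions in log order and stops before the first one whose
     * zxid is greater than committedUpTo (an assumed, externally supplied bound).
     * Returns the highest zxid applied, or -1 if nothing was applied.
     */
    static long replay(List<Txn> log, long committedUpTo, List<Long> applied) {
        long highest = -1;
        for (Txn txn : log) {
            if (txn.zxid > committedUpTo) {
                break; // logged but not known to be committed: do not apply
            }
            applied.add(txn.zxid); // stand-in for applying the txn to the data tree
            highest = txn.zxid;
        }
        return highest;
    }

    public static void main(String[] args) {
        // Log on A from the scenario: 71..74, where 74 was logged but never committed.
        List<Txn> log = Arrays.asList(new Txn(71), new Txn(72), new Txn(73), new Txn(74));
        List<Long> applied = new ArrayList<>();
        long last = replay(log, 73, applied);
        System.out.println("applied=" + applied + " last=" + last);
    }
}
```

With the scenario's numbers, a bound of 73 would leave change 74 out of A's in-memory data tree, so A would report 73 rather than 74 when it registers with B in Step 4. Whether that alone resolves the issue is not established in this thread.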