[ https://issues.apache.org/jira/browse/ZOOKEEPER-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708523#comment-14708523 ]
Raul Gutierrez Segales commented on ZOOKEEPER-2033:
---------------------------------------------------

[~fpj]: any comments on the patch?

> zookeeper follower fails to start after a restart immediately following a new epoch
> -----------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2033
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2033
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.6
>            Reporter: Asad Saeed
>            Assignee: Asad Saeed
>             Fix For: 3.4.7
>
>         Attachments: ZOOKEEPER-2033-3.4.patch, ZOOKEEPER-2033.patch
>
>
> The following issue was seen when adding a new node to a zookeeper cluster.
> Reproduction steps
> 1. Create a 2 node ensemble. Write some keys.
> 2. Add another node to the ensemble by modifying the config, then restart the entire cluster.
> 3. Restart the new node before writing any new keys.
> What occurs is that the new node gets a SNAP from the newly elected leader, since it is too far behind. The zxid for this snapshot is from the new epoch, but that zxid is not in the committed log cache.
> On restart of this new node, the follower sends the new epoch zxid. The leader looks at its maxCommitted logs, sees that they are not from the newest epoch, and therefore sends a TRUNC.
> The follower sees the TRUNC, but it only has a snapshot, so it cannot truncate!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
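
The failure described in the issue can be illustrated with a small model of the leader's sync decision. This is only a sketch under stated assumptions, not ZooKeeper's actual code: the names `make_zxid`, `choose_sync`, and `SyncOp` are hypothetical, and the decision rule is a simplification of what the report describes. The one fact it relies on is that a zxid carries the epoch in its high 32 bits and a counter in its low 32 bits.

```python
from enum import Enum


class SyncOp(Enum):
    """Sync messages a leader can send to a follower (simplified)."""
    DIFF = "DIFF"
    TRUNC = "TRUNC"
    SNAP = "SNAP"


def make_zxid(epoch: int, counter: int) -> int:
    """Build a zxid: epoch in the high 32 bits, counter in the low 32 bits."""
    return (epoch << 32) | counter


def choose_sync(peer_last_zxid: int, min_committed: int, max_committed: int) -> SyncOp:
    """Hypothetical, simplified version of the leader's choice.

    Assumption: the leader compares the follower's last zxid only against
    the bounds of its committed log cache.
    """
    if min_committed <= peer_last_zxid <= max_committed:
        return SyncOp.DIFF   # follower is within the cached log range
    if peer_last_zxid > max_committed:
        return SyncOp.TRUNC  # follower appears to be ahead of the leader
    return SyncOp.SNAP       # follower is too far behind


# Scenario from the report: the cache still holds only epoch-1 entries,
# while the new node's snapshot carries a zxid from epoch 2.
cache_min = make_zxid(1, 1)
cache_max = make_zxid(1, 5)
follower_zxid = make_zxid(2, 0)

decision = choose_sync(follower_zxid, cache_min, cache_max)
print(decision.value)
```

In this model the follower's epoch-2 zxid compares greater than every cached epoch-1 zxid, so the leader picks TRUNC even though the follower holds only a snapshot and has no transaction log to truncate.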