[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708523#comment-14708523
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2033:
---------------------------------------------------

[~fpj]: any comments on the patch?

> zookeeper follower fails to start after a restart immediately following a new 
> epoch
> -----------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2033
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2033
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.6
>            Reporter: Asad Saeed
>            Assignee: Asad Saeed
>             Fix For: 3.4.7
>
>         Attachments: ZOOKEEPER-2033-3.4.patch, ZOOKEEPER-2033.patch
>
>
> The following issue was seen when adding a new node to a zookeeper cluster.
> Reproduction steps
> 1. Create a 2 node ensemble. Write some keys.
> 2. Add another node to the ensemble by modifying the config, then restart 
> the entire cluster.
> 3. Restart the new node before writing any new keys.
> What occurs is that the new node gets a SNAP from the newly elected leader, 
> since it is too far behind. The zxid for this snapshot is from the new epoch 
> but that is not in the committed log cache.
> On restart of this new node, the follower sends the new epoch zxid. The 
> leader looks at its maxCommitted logs, sees that they are not from the 
> newest epoch, and therefore sends a TRUNC.
> The follower sees the TRUNC but it only has a snapshot, so it cannot truncate!
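
The sequence in the quoted description can be sketched as a small model. This is only an illustration of the reasoning, not the actual ZooKeeper code: the class `SyncDecisionSketch`, the method `chooseSync`, and the concrete epoch and counter values are assumptions made up for the example. The one fact it relies on is that a zxid carries the epoch in its high 32 bits and a counter in its low 32 bits.

```java
// Hypothetical, simplified model of how a leader might pick a sync mode.
// It is NOT the real LearnerHandler logic; names and values are illustrative.
public class SyncDecisionSketch {
    enum SyncMode { DIFF, TRUNC, SNAP }

    // A zxid packs the epoch into the high 32 bits and a counter into the low 32 bits.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    // Assumed decision rule, based only on the committed log cache bounds.
    static SyncMode chooseSync(long peerLastZxid, long minCommittedLog, long maxCommittedLog) {
        if (peerLastZxid > maxCommittedLog) {
            return SyncMode.TRUNC;   // follower looks "ahead" of the cached log
        }
        if (peerLastZxid >= minCommittedLog) {
            return SyncMode.DIFF;    // follower is within the cached range
        }
        return SyncMode.SNAP;        // follower is too far behind
    }

    public static void main(String[] args) {
        // The committed log cache still ends in the old epoch (assumed epoch 1).
        long minCommittedLog = makeZxid(1, 1);
        long maxCommittedLog = makeZxid(1, 5);

        // The restarted follower reports the zxid of its snapshot, taken in the
        // new epoch (assumed epoch 2), with no writes since the election.
        long peerLastZxid = makeZxid(2, 0);

        // Prints TRUNC: the follower is told to truncate a log it does not have.
        System.out.println(chooseSync(peerLastZxid, minCommittedLog, maxCommittedLog));
    }
}
```

Under these assumptions, any new-epoch zxid compares greater than every old-epoch zxid in the cache, so the follower is asked to truncate even though it holds only a snapshot.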



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
