[ https://issues.apache.org/jira/browse/ZOOKEEPER-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791169#comment-13791169 ]
Thawan Kooburat commented on ZOOKEEPER-1558:
--------------------------------------------

Again, my concern is that the current solution would cause the leader to block while taking the snapshot, before it starts sending pings to quorum members. If taking the snapshot takes longer than syncLimit, the quorum will tear down. You can simulate this situation by adding a sleep that is longer than syncLimit to the new code where takeSnapshot() is called.

If we agree that this is an issue, a simple fix is to create a method that takes the snapshot asynchronously (spinning up a thread on demand, similar to SyncRequestProcessor). Some refactoring and additional locking may be needed as well in order to do this cleanly.

> Leader should not snapshot uncommitted state
> --------------------------------------------
>
>                 Key: ZOOKEEPER-1558
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1558
>             Project: ZooKeeper
>          Issue Type: Sub-task
>          Components: quorum
>    Affects Versions: 3.4.6
>            Reporter: Flavio Junqueira
>            Assignee: Flavio Junqueira
>            Priority: Blocker
>             Fix For: 3.4.6
>
>         Attachments: ZOOKEEPER-1558.patch, ZOOKEEPER-1558.patch,
>                      ZOOKEEPER-1558.patch, ZOOKEEPER-1558.patch
>
>
> Leader currently takes a snapshot when it calls loadData in the beginning of
> the lead() method. The loaded data, however, may contain uncommitted state.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
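
For illustration only, the asynchronous snapshot suggested in the comment could look roughly like the Java sketch below. It is not the attached patch and not ZooKeeper code: the class AsyncSnapshotter and its methods snapshotAsync() and awaitSnapshot() are hypothetical names, and the Runnable passed in stands in for whatever call actually writes the snapshot. The sketch only shows the "spin up a thread on demand, skip if one is already running" pattern, with the extra locking reduced to a synchronized method.

```java
/** Hypothetical helper for illustration; not part of ZooKeeper. */
class AsyncSnapshotter {
    // Stand-in for the real snapshot call (an assumption of this sketch).
    private final Runnable snapshotTask;
    // Thread of the snapshot currently in progress, if any. Guarded by "this".
    private Thread snapInProcess;

    AsyncSnapshotter(Runnable snapshotTask) {
        this.snapshotTask = snapshotTask;
    }

    /**
     * Starts a snapshot on a new thread unless one is still running.
     * Returns immediately, so the caller is never blocked by the snapshot.
     *
     * @return true if a new snapshot thread was started, false if skipped
     */
    synchronized boolean snapshotAsync() {
        if (snapInProcess != null && snapInProcess.isAlive()) {
            return false; // a snapshot is already in progress
        }
        snapInProcess = new Thread(snapshotTask, "Snapshot Thread");
        snapInProcess.setDaemon(true);
        snapInProcess.start();
        return true;
    }

    /** Waits up to timeoutMs milliseconds for the in-flight snapshot, if any. */
    void awaitSnapshot(long timeoutMs) throws InterruptedException {
        Thread t;
        synchronized (this) {
            t = snapInProcess;
        }
        if (t != null) {
            t.join(timeoutMs);
        }
    }
}
```

With something of this shape, the leader would call snapshotAsync() and move on to pinging quorum members right away, so a snapshot that takes longer than syncLimit would no longer delay the pings. Whether the snapshot may safely run concurrently with the rest of lead() is exactly the refactoring and locking question raised in the comment, and this sketch does not address it.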