[ https://issues.apache.org/jira/browse/ZOOKEEPER-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011935#comment-13011935 ]

Flavio Junqueira commented on ZOOKEEPER-1032:
---------------------------------------------

Hi Jiangwen,

Some thoughts and questions on your proposal:

# I'm not entirely clear on what you're suggesting in (1). As I understand it, you're saying that once the follower starts following a leader, if it already has all updates according to the leader, then it doesn't have to take a snapshot at that point. If this is correct, then the follower would simply keep appending to the last log it wrote to and take a snapshot eventually. Is this right?
# I'm not sure what you mean by "DataTree should not be cleared, only restored from logs/snap" in (2). In what sense should it not be cleared?
# The idea in (3) is interesting, but I have one concern. Given that ZooKeeper keeps its state in memory, adding a cache of logs might put memory pressure on the ZooKeeper state itself. How much would we have to cache to see a significant improvement? I understand that this can be configurable, though.

> speed up recovery from leader failure
> -------------------------------------
>
>                 Key: ZOOKEEPER-1032
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1032
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>            Reporter: jiangwen wei
>             Fix For: 3.4.0
>
>
> when the number of nodes is large, it may take a long time to recover from
> leader failure
> there are some points to improve:
> 1. Follower should take snapshot asynchronously when follower up to date
> 2. Currently Leader/Follower will clear the DataTree on leader failures, and
> then restore it from a snapshot and transaction logs. DataTree should not be
> cleared, only restore it from transaction logs.
> 3. FileTxnLog should store recently transaction logs in memory, so when
> DataTree is not behind the transaction logs a lot, the transaction logs in
> memory can be used to restore DataTree.

-- 
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
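
To make point 3 of the proposal and the memory-pressure concern in the comment more concrete, here is a minimal sketch of a byte-bounded in-memory cache of recent transactions. It is only an illustration under stated assumptions: `RecentTxnCache`, `Txn`, `append`, `since`, and `maxBytes` are hypothetical names and are not part of ZooKeeper's `FileTxnLog` API. The byte budget stands in for the "configurable" limit mentioned above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch; not ZooKeeper's actual FileTxnLog. */
public class RecentTxnCache {
    /** Minimal stand-in for a logged transaction: zxid plus serialized payload. */
    public static final class Txn {
        public final long zxid;
        public final byte[] payload;

        public Txn(long zxid, byte[] payload) {
            this.zxid = zxid;
            this.payload = payload;
        }
    }

    private final Deque<Txn> txns = new ArrayDeque<>();
    private final long maxBytes;   // assumed configurable memory budget
    private long usedBytes = 0;

    public RecentTxnCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Records a transaction, evicting the oldest ones once the budget is exceeded. */
    public synchronized void append(long zxid, byte[] payload) {
        txns.addLast(new Txn(zxid, payload));
        usedBytes += payload.length;
        while (usedBytes > maxBytes && !txns.isEmpty()) {
            usedBytes -= txns.removeFirst().payload.length;
        }
    }

    /**
     * Returns the cached transactions with zxid greater than lastZxid, or null
     * when the cache does not reach back to lastZxid, in which case the caller
     * would have to fall back to the on-disk logs or a snapshot.
     */
    public synchronized List<Txn> since(long lastZxid) {
        Txn oldest = txns.peekFirst();
        if (oldest == null || oldest.zxid > lastZxid) {
            return null; // cannot prove the cache covers everything after lastZxid
        }
        List<Txn> result = new ArrayList<>();
        for (Txn t : txns) {
            if (t.zxid > lastZxid) {
                result.add(t);
            }
        }
        return result;
    }
}
```

The coverage check is deliberately conservative: it only answers from memory when the peer's last zxid is itself still in the cache, since zxids are not guaranteed to be contiguous. How large `maxBytes` must be to help in practice is exactly the open question raised in the comment.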