GitHub user revans2 opened a pull request:

    https://github.com/apache/zookeeper/pull/157

    ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DB

    This patch reduces recovery time when a leader is lost on a large DB.
    
    It does so in two ways: it no longer clears the DB before leader election
    begins, and it avoids taking a snapshot during the SYNC phase when the sync
    is a DIFF. The snapshot is avoided by buffering the proposals and commits,
    just as the code currently does for proposals/commits sent after the
    NEWLEADER message and before the UPTODATE message.
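
To make the buffering idea concrete, here is a minimal sketch of how proposals and commits might be queued during a DIFF sync and only applied once the sync completes. It is an illustration under stated assumptions, not the code in this patch: the class `DiffSyncBuffer`, its method names, and the use of a list of strings as a stand-in for the in-memory DB are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration only; not ZooKeeper's actual learner code. */
class DiffSyncBuffer {
    /** A proposed transaction: its zxid plus an opaque payload. */
    static final class Proposal {
        final long zxid;
        final String payload;

        Proposal(long zxid, String payload) {
            this.zxid = zxid;
            this.payload = payload;
        }
    }

    // Proposals received but not yet committed, in arrival order.
    private final Deque<Proposal> pending = new ArrayDeque<>();
    // Proposals that were committed but are held back until the sync ends.
    private final List<Proposal> committed = new ArrayList<>();
    // Stand-in for the in-memory DB (assumption made for this sketch).
    private final List<String> applied = new ArrayList<>();

    /** Buffer a proposal instead of applying it immediately. */
    void onProposal(long zxid, String payload) {
        pending.addLast(new Proposal(zxid, payload));
    }

    /** Move the oldest pending proposal to the committed buffer. */
    void onCommit(long zxid) {
        Proposal head = pending.peekFirst();
        if (head == null || head.zxid != zxid) {
            throw new IllegalStateException("commit for unexpected zxid " + zxid);
        }
        committed.add(pending.removeFirst());
    }

    /** Sync finished: apply everything that was committed, in order. */
    void onUpToDate() {
        for (Proposal p : committed) {
            applied.add(p.payload);
        }
        committed.clear();
    }

    /** Returns a copy of the payloads applied so far. */
    List<String> applied() {
        return new ArrayList<>(applied);
    }

    /** Number of proposals still waiting for a commit. */
    int pendingCount() {
        return pending.size();
    }
}
```

In this sketch nothing touches the stand-in DB until `onUpToDate()` is called, which is the property that lets the sync proceed without writing a snapshot first.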
    
    If a SNAP is sent, we cannot avoid writing out the full snapshot, because
    there is no other way to make sure the on-disk DB is in sync with what is in
    memory. Otherwise, any edits written to the edit log before a background
    snapshot was taken could be applied on top of an incorrect snapshot.
    
    The same optimization should work for TRUNC too, but I opted not to apply
    it there because TRUNC is rare and, by its very nature, already forces the
    DB to be reread after the edit logs are modified, so it would still not be
    fast.
    
    In practice, recovering from the loss of a leader now takes about
    3 seconds for the cluster instead of 5+ minutes.
    
    I am happy to port this to 3.5 if it looks good.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/revans2/zookeeper ZOOKEEPER-2678

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zookeeper/pull/157.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #157
    
----
commit 5aa25620e0189b28d7040305272be2fda28126fb
Author: Robert (Bobby) Evans <ev...@yahoo-inc.com>
Date:   2017-01-19T19:50:32Z

    ZOOKEEPER-2678: Discovery and Sync can take a very long time on large DBs

----


