Fangmin Lv created ZOOKEEPER-3658:
-------------------------------------

             Summary: Potential data inconsistency due to txns gap in 
committedLog when ZkDB not fully shutdown
                 Key: ZOOKEEPER-3658
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3658
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.5.6, 3.6.0
            Reporter: Fangmin Lv
            Assignee: Fangmin Lv


During DIFF sync, txns are applied to the learner's DataTree but are not added 
to the in-memory committed txns cache (committedLog) in ZkDatabase. If this 
server later becomes the new leader and other servers try to sync with it, the 
missing txns may cause data inconsistency.
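
A minimal sketch of how such a gap can arise, using a toy model rather than 
ZooKeeper's real classes. ToyDb, commit, applyFromDiffSync and 
gapsInCommittedLog are hypothetical names made up for illustration; only the 
idea that DIFF-synced txns reach the data tree without reaching the in-memory 
cache is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class CommittedLogGapSketch {
    /** Toy stand-in for a server's in-memory state; not ZooKeeper's real classes. */
    static class ToyDb {
        // Applied txns keyed by zxid (stands in for the DataTree).
        final TreeMap<Long, String> dataTree = new TreeMap<>();
        // In-memory cache of committed zxids (stands in for committedLog).
        final List<Long> committedLog = new ArrayList<>();

        // Normal path: the txn is applied and also recorded in the cache.
        void commit(long zxid, String value) {
            dataTree.put(zxid, value);
            committedLog.add(zxid);
        }

        // DIFF sync path as described in the report: applied but never cached.
        void applyFromDiffSync(long zxid, String value) {
            dataTree.put(zxid, value);
        }

        // Zxids that were applied and fall inside the cache's range,
        // but are absent from the cache itself.
        List<Long> gapsInCommittedLog() {
            List<Long> gaps = new ArrayList<>();
            if (committedLog.isEmpty()) {
                return gaps;
            }
            long min = committedLog.get(0);
            long max = committedLog.get(committedLog.size() - 1);
            for (long zxid : dataTree.subMap(min, true, max, true).keySet()) {
                if (!committedLog.contains(zxid)) {
                    gaps.add(zxid);
                }
            }
            return gaps;
        }
    }

    // Two txns committed normally, two received via DIFF sync, one more committed.
    static List<Long> demo() {
        ToyDb db = new ToyDb();
        db.commit(1, "a");
        db.commit(2, "b");
        db.applyFromDiffSync(3, "c");
        db.applyFromDiffSync(4, "d");
        db.commit(5, "e");
        return db.gapsInCommittedLog();
    }

    public static void main(String[] args) {
        System.out.println("zxids missing from committedLog: " + demo());
    }
}
```

In this toy model, a server that serves a DIFF from the cache alone would hand 
out txns 1, 2 and 5 and silently skip 3 and 4, which is the kind of 
inconsistency described above.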

This is not a problem if we fully shut down the ZkDB and reload it from disk, 
but the current behavior in 3.5 and 3.6 does not fully shut down the DB, which 
is a nice optimization that reduces unavailability with large snapshots.

Internally, we have another version of the 'Retain DB' implementation. We 
caught this issue with the digest feature we just upstreamed and have fixed it 
internally, but just realized we haven't upstreamed the fix. This is the Jira 
for that issue; I will send a PR for it soon.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
