Fangmin Lv created ZOOKEEPER-3658:
-------------------------------------
Summary: Potential data inconsistency due to txns gap in
committedLog when ZkDB not fully shutdown
Key: ZOOKEEPER-3658
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3658
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.5.6, 3.6.0
Reporter: Fangmin Lv
Assignee: Fangmin Lv

During DIFF sync, txns are applied to the learner's DataTree but are not added
to the in-memory committed txns cache (committedLog) in ZkDatabase. If this
server later becomes the new leader and other servers try to sync with it, the
txns missing from that cache can cause data inconsistency.
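
To make the failure mode concrete, here is a minimal toy model of the gap. It
is only a sketch under stated assumptions: ToyDb, commit, applyDuringDiffSync,
diffFor and hasGap are hypothetical names invented for illustration, not
ZooKeeper APIs, and zxids are assumed to be consecutive integers.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names are ZooKeeper APIs.
class CommittedLogGapSketch {

    // Hypothetical stand-in for a DataTree plus its in-memory committed txn cache.
    static final class ToyDb {
        long lastProcessedZxid = 0;                          // state of the "DataTree"
        final List<Long> committedLog = new ArrayList<>();   // zxids cached for DIFF sync

        // Normal commit path: apply to the tree and record in the cache.
        void commit(long zxid) {
            lastProcessedZxid = zxid;
            committedLog.add(zxid);
        }

        // Behavior described in the report: the txn reaches the tree
        // but is never recorded in the cache.
        void applyDuringDiffSync(long zxid) {
            lastProcessedZxid = zxid;
        }

        // Txns this server would send from its cache to a peer at peerZxid.
        List<Long> diffFor(long peerZxid) {
            List<Long> out = new ArrayList<>();
            for (long zxid : committedLog) {
                if (zxid > peerZxid) {
                    out.add(zxid);
                }
            }
            return out;
        }

        // True if the cache skips a zxid (toy assumption: zxids are consecutive).
        boolean hasGap() {
            for (int i = 1; i < committedLog.size(); i++) {
                long prev = committedLog.get(i - 1);
                long curr = committedLog.get(i);
                if (curr != prev + 1) {
                    return true;
                }
            }
            return false;
        }
    }

    // Follower commits 1 and 2, catches up on 3 and 4 via DIFF sync,
    // then becomes leader and commits 5.
    static ToyDb scenario() {
        ToyDb db = new ToyDb();
        db.commit(1);
        db.commit(2);
        db.applyDuringDiffSync(3);
        db.applyDuringDiffSync(4);
        db.commit(5);
        return db;
    }

    public static void main(String[] args) {
        ToyDb db = scenario();
        System.out.println("tree is at zxid " + db.lastProcessedZxid);
        System.out.println("cache holds " + db.committedLog);
        System.out.println("diff for a peer at zxid 2: " + db.diffFor(2));
        System.out.println("gap in cache: " + db.hasGap());
    }
}
```

In this model the tree has seen every txn up to zxid 5, but a peer at zxid 2
that syncs from the cache would be sent only zxid 5, so txns 3 and 4 never
reach it.
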

This is not a problem if we fully shut down the ZkDB and reload it from disk,
but the current behavior in 3.5 and 3.6 does not fully shut down the DB, which
is a nice optimization that reduces unavailable time with large snapshots.

Internally, we have another version of the 'Retain DB' implementation. We
caught this issue with the digest feature we just upstreamed and fixed it
internally, but only just realized that the fix had not been upstreamed. This
is the Jira for that issue; I will send a PR for it soon.