Hi! We recently discovered two issues in the latest versions of ZooKeeper that may cause data inconsistency or loss of committed data. Details and analysis of the issues are presented on JIRA:
* ZOOKEEPER-4643<https://issues.apache.org/jira/browse/ZOOKEEPER-4643> : Committed txns may be improperly truncated if a follower crashes right after updating currentEpoch but before persisting txns to disk.
* ZOOKEEPER-4646<https://issues.apache.org/jira/browse/ZOOKEEPER-4646> : Committed txns may still be lost if followers crash after replying ACK-LD but before writing txns to disk. (This issue is related to the fix for ZOOKEEPER-3911<https://issues.apache.org/jira/browse/ZOOKEEPER-3911>.)

The issues seem critical since they lead to data loss or inconsistency, which violates the properties that ZAB is supposed to satisfy.

I wonder whether these bugs should get a fix, since data consistency is of prime importance to ZooKeeper. If so, I will try to fix the code, together with further testing and verification.

Thanks!

Attached are example traces of these two issues, which have been generated in multiple versions, including 3.8.0 and 3.7.1. (The traces are also provided on JIRA.)
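
P.S. To make the ordering problem behind ZOOKEEPER-4643 easier to picture, here is a toy sketch. It is not ZooKeeper code: the Server class, its fields, and the beats() comparison are hypothetical stand-ins, and the sketch assumes that leader election prefers the higher epoch first, then the higher last zxid, then the higher server id. Under that assumption, a follower that persists its epoch and then crashes before its txns reach disk can out-vote a server that still holds a committed txn. The real traces on JIRA involve more servers and steps.

```java
import java.util.ArrayList;
import java.util.List;

public class EpochBeforeLogSketch {

    // Hypothetical toy server: durableEpoch and durableLog survive a crash,
    // pending does not.
    static final class Server {
        final long id;
        long durableEpoch;                                  // stands in for the persisted currentEpoch
        final List<Long> durableLog = new ArrayList<>();    // zxids already on disk
        final List<Long> pending = new ArrayList<>();       // zxids received but not yet written

        Server(long id, long epoch) {
            this.id = id;
            this.durableEpoch = epoch;
        }

        long lastZxid() {
            return durableLog.isEmpty() ? 0L : durableLog.get(durableLog.size() - 1);
        }

        // A crash drops everything that was only held in memory.
        void crash() {
            pending.clear();
        }
    }

    // Assumed vote ordering: epoch first, then last zxid, then server id.
    static boolean beats(Server a, Server b) {
        if (a.durableEpoch != b.durableEpoch) {
            return a.durableEpoch > b.durableEpoch;
        }
        if (a.lastZxid() != b.lastZxid()) {
            return a.lastZxid() > b.lastZxid();
        }
        return a.id > b.id;
    }

    // Returns true if the server that lost its txns still wins the vote.
    public static boolean staleServerWins() {
        Server s1 = new Server(1, 1);
        s1.durableLog.add(0x100000001L);   // txn committed in epoch 1, safely on disk

        Server s2 = new Server(2, 1);
        s2.pending.add(0x100000001L);      // same txn received during sync, not yet on disk
        s2.durableEpoch = 2;               // epoch is persisted first ...
        s2.crash();                        // ... then the crash happens before the log write

        return beats(s2, s1);
    }

    public static void main(String[] args) {
        System.out.println("stale server wins election: " + staleServerWins());
    }
}
```

In this toy model s2 wins purely on its epoch even though its log is empty, so s1 would later be asked to truncate a txn that was already committed. Persisting the txns before the epoch would close the window in this sketch; whether that is the right fix for the real code is something to discuss on JIRA.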