Hi!

We recently discovered two issues in the latest versions of ZooKeeper that might 
cause data inconsistency or loss of committed data. Details and analysis of 
both issues are on JIRA:


  *   ZOOKEEPER-4643<https://issues.apache.org/jira/browse/ZOOKEEPER-4643>: 
      Committed txns may be improperly truncated if a follower crashes right 
      after updating currentEpoch but before persisting txns to disk.
  *   ZOOKEEPER-4646<https://issues.apache.org/jira/browse/ZOOKEEPER-4646>: 
      Committed txns may still be lost if followers crash after replying 
      ACK-LD but before writing txns to disk. (This issue is related to the 
      fix for 
      ZOOKEEPER-3911<https://issues.apache.org/jira/browse/ZOOKEEPER-3911>.)
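
To make the crash window described in ZOOKEEPER-4643 concrete, here is a toy Python sketch. It is not ZooKeeper code: Disk, sync_epoch_first, sync_txns_first and run are hypothetical names, txn ids are plain (epoch, counter) tuples, and the crash is simulated with an exception. It only shows what is left on disk when a follower stops between the two writes. How that state later leads to truncation is analysed on JIRA and is not modelled here.

```python
from dataclasses import dataclass, field


class Crash(Exception):
    """Simulated process crash: memory is lost, the Disk object survives."""


@dataclass
class Disk:
    # Hypothetical stand-in for a follower's durable state.
    current_epoch: int = 0
    log: list = field(default_factory=list)  # persisted txn ids: (epoch, counter)


def sync_epoch_first(disk, new_epoch, txns, crash_in_window=False):
    # Ordering described in the report: the epoch is made durable first.
    disk.current_epoch = new_epoch
    if crash_in_window:
        raise Crash()
    disk.log.extend(txns)


def sync_txns_first(disk, new_epoch, txns, crash_in_window=False):
    # Alternative ordering, shown for contrast only (an assumption, not the fix).
    disk.log.extend(txns)
    if crash_in_window:
        raise Crash()
    disk.current_epoch = new_epoch


def run(sync, crash):
    # Start from epoch 1 with one persisted txn, then sync two more for epoch 2.
    disk = Disk(current_epoch=1, log=[(1, 1)])
    try:
        sync(disk, 2, [(1, 2), (1, 3)], crash_in_window=crash)
    except Crash:
        pass  # the process died; only the contents of `disk` remain
    return disk


if __name__ == "__main__":
    print(run(sync_epoch_first, crash=True))
    print(run(sync_txns_first, crash=True))
```

With the epoch-first ordering, a crash in the window leaves the new epoch on disk next to the old log. With the contrast ordering, the same crash leaves the old epoch next to the complete log.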

The issues seem critical, since they lead to data loss or inconsistency, 
which violates the properties that ZAB is supposed to guarantee. I wonder whether 
these bugs should be fixed, since data consistency is of prime importance to 
ZooKeeper. If so, I will try to fix the code, together with further testing and 
verification.

Thanks!

Attached are example traces of these two issues, generated on multiple 
versions, including 3.8.0 and 3.7.1. (The traces are also provided on JIRA.)
