Kezhu Wang created ZOOKEEPER-4925:
-------------------------------------
Summary: Diff sync introduces a hole in a stale follower's committedLog,
which causes data loss once that follower is leading
Key: ZOOKEEPER-4925
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4925
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.9.3
Reporter: Kezhu Wang
There are two variants of {{ZooKeeperServer::processTxn}}, and their behavior
has diverged since ZOOKEEPER-3484. In addition to what
{{processTxn(TxnHeader hdr, Record txn)}} does, {{processTxn(Request request)}}
pops the outstanding change from {{outstandingChanges}} and adds the txn to
{{committedLog}} so that followers can sync from it. Since ZOOKEEPER-4394, the
{{Learner}} uses {{processTxn(TxnHeader hdr, Record txn)}} to commit txns to
memory, which means it leaves {{committedLog}} untouched during the
{{SYNCHRONIZATION}} phase.
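In the above case, a stale follower will have a hole in its {{committedLog}}
after joining the cluster: the txns it applied during sync are absent from
{{committedLog}}, while txns committed afterwards are appended as usual. After
becoming leader, that follower will propagate the in-memory hole to other
stale nodes. This causes data loss.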
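
The scenario can be illustrated with a toy model. This is only a sketch, not
ZooKeeper code: {{ToyServer}}, {{applyOnly}}, {{applyAndLog}}, {{diffFor}} and
the zxid values are hypothetical names and numbers chosen for illustration.
{{applyOnly}} stands in for {{processTxn(TxnHeader hdr, Record txn)}} and
{{applyAndLog}} for {{processTxn(Request request)}}. The model assumes that a
diff sync sends every {{committedLog}} entry newer than the peer's last zxid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model only: names are hypothetical and do not mirror ZooKeeper's classes.
class CommittedLogHoleSketch {
    static final class ToyServer {
        final TreeSet<Long> dataTree = new TreeSet<>();   // zxids applied to memory
        final List<Long> committedLog = new ArrayList<>(); // zxids offered for diff sync

        // Stand-in for processTxn(TxnHeader, Record): applies to memory only.
        void applyOnly(long zxid) {
            dataTree.add(zxid);
        }

        // Stand-in for processTxn(Request): applies and records in committedLog.
        void applyAndLog(long zxid) {
            dataTree.add(zxid);
            committedLog.add(zxid);
        }

        // Assumed diff sync: send every committedLog entry newer than the peer's last zxid.
        List<Long> diffFor(long peerLastZxid) {
            List<Long> out = new ArrayList<>();
            for (long zxid : committedLog) {
                if (zxid > peerLastZxid) {
                    out.add(zxid);
                }
            }
            return out;
        }
    }

    // A follower that saw zxids 1..3, went stale, synced 4..6, then followed 7..8.
    static ToyServer staleFollowerAfterRejoin() {
        ToyServer a = new ToyServer();
        for (long z = 1; z <= 3; z++) a.applyAndLog(z); // normal broadcast before going stale
        for (long z = 4; z <= 6; z++) a.applyOnly(z);   // SYNCHRONIZATION phase: committedLog untouched
        for (long z = 7; z <= 8; z++) a.applyAndLog(z); // normal broadcast after rejoining
        return a;
    }

    // Zxids the leader has applied but a peer at peerLastZxid never receives via diff sync.
    static List<Long> lostOnPeer(ToyServer leader, long peerLastZxid) {
        TreeSet<Long> peer = new TreeSet<>(leader.dataTree.headSet(peerLastZxid, true));
        peer.addAll(leader.diffFor(peerLastZxid));
        List<Long> lost = new ArrayList<>(leader.dataTree);
        lost.removeAll(peer);
        return lost;
    }

    public static void main(String[] args) {
        ToyServer a = staleFollowerAfterRejoin();
        System.out.println("committedLog = " + a.committedLog); // [1, 2, 3, 7, 8]
        System.out.println("lost on peer = " + lostOnPeer(a, 3)); // [4, 5, 6]
    }
}
```

In this model, once the formerly stale follower leads, a peer whose last zxid
is 3 receives only 7 and 8 and silently misses 4 through 6.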
--
This message was sent by Atlassian Jira
(v8.20.10#820010)