Wenhan Feng created ZOOKEEPER-4903:
--------------------------------------
Summary: An Inconsistency of `lastProcessed` variable in Spec and
Impl
Key: ZOOKEEPER-4903
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4903
Project: ZooKeeper
Issue Type: Bug
Components: other
Affects Versions: 3.10.0
Reporter: Wenhan Feng
I have noticed that in the specification of ZKV3_7_0, it appears that the
`SyncFollower` action uses an incorrect `lastProcessed`.
In `SyncFollower`, the following code is used to determine which sync mode
should be employed:
```
lastProcessedZxid == lastProcessed[i].zxid
minCommittedIdx == lastSnapshot[i].index + 1
maxCommittedIdx == IF zabState[i] = BROADCAST THEN lastCommitted[i].index
ELSE Len(history[i])
committedLogEmpty == minCommittedIdx > maxCommittedIdx
minCommittedLog == IF committedLogEmpty THEN lastProcessedZxid
ELSE history[i][minCommittedIdx].zxid
maxCommittedLog == IF committedLogEmpty THEN lastProcessedZxid
ELSE IF maxCommittedIdx = 0 THEN << 0, 0>>
ELSE history[i][maxCommittedIdx].zxid
```
We can observe that `lastProcessedZxid` is assigned the value of
`lastProcessed[i].zxid`, where `i` is the leader node ID and `lastProcessed` is
a global variable in this specification.
According to the comment in the specification, the `lastProcessed` variable
means "The index and zxid of the last processed transaction in history". And we
can see that it will be updated in the `StartZkServer(i)` Action with the
following code:
```
LastProposed(i) == IF Len(history[i]) = 0 THEN [ index |-> 0, zxid |-> <<0,
0>> ]
ELSE LET lastIndex == Len(history[i])
entry == history[i][lastIndex]
IN [ index |-> lastIndex,
zxid |-> entry.zxid ]
StartZkServer(i) ==
LET latest == LastProposed(i)
IN /\ lastCommitted' = [lastCommitted EXCEPT ![i] = latest]
/\ lastProcessed' = [lastProcessed EXCEPT ![i] = latest]
/\ lastSnapshot' = [lastSnapshot EXCEPT ![i] = latest]
/\ UpdateElectionVote(i, acceptedEpoch[i])
```
We know that when a cluster finishes an election but before receiving any user
requests, according to the specification, it will contain no history. And in
`StartZKServer(i)`, `lastSnapshot[i]` will be assigned `[ index |-> 0, zxid
|-> <<0, 0>> ]`.
However, in the implementation code, I found that the `lastProcessed` used in
the `LearnHanler.syncFollower()` function is actually the
`dataTree.lastProcessedZxid` variable. And this variable is updated with the
following code when `ZkServer` is starting:
```
Leader.java : Line 657: zk.setZxid(ZxidUtils.makeZxid(epoch, 0));
Leader.java : Line 1609: zk.getZKDatabase().setlastProcessedZxid(zk.getZxid());
```
Therefore, when a cluster finishes an election but before receiving any user
requests, its value is indeed `<<1, 0>>`. This causes the following
`LeaderSyncFollower` action to choose an incorrect sync model as per the
specification.
I found that the specification already seems to use a variable, namely
`tempMaxEpoch`, to record `zk.hzxid` (which `zk.getZxid()` actually retrieves).
It seems that we should use `tempMaxEpoch[i]` in the `StartZkServer` action.
And I think the correct code should be:
```
LastProposed(i) == IF Len(history[i]) = 0 THEN [ index |-> 0, zxid |->
<<Maximum({tempMaxEpoch[i], 0}), 0>> ]
......
StartZkServer(i) ==
LET latest == LastProposed(i)
IN ...... /\ lastProcessed' = [lastProcessed EXCEPT ![i] = latest]
......
```
--
This message was sent by Atlassian Jira
(v8.20.10#820010)