AlphaCanisMajoris opened a new pull request, #1993: URL: https://github.com/apache/zookeeper/pull/1993
In brief, committed logs might be lost because the follower logs transactions **asynchronously** and may reply ACK of NEWLEADER during the SYNC phase before that logging has finished. See [ZOOKEEPER-4646](https://issues.apache.org/jira/browse/ZOOKEEPER-4646) for details on the symptom, an example trace, the diagnosis, and a possible fix. This problem was first raised in [ZOOKEEPER-3911](https://issues.apache.org/jira/browse/ZOOKEEPER-3911). However, the patch for [ZOOKEEPER-3911](https://issues.apache.org/jira/browse/ZOOKEEPER-3911) does not solve the problem at its root. [ZOOKEEPER-4685](https://issues.apache.org/jira/browse/ZOOKEEPER-4685) has a similar root cause, i.e., non-deterministic multi-threaded execution.

The solution in this patch is simple and effective: it guarantees the partial order that uncommitted transactions are logged on the follower node before the follower replies ACK of NEWLEADER, which is exactly what the ZAB protocol requires. Specifically, a CountDownLatch named newleaderLatch records the number of uncommitted transactions that must be logged before the follower replies ACK of NEWLEADER. Only when the count of newleaderLatch reaches zero can the follower reply ACK of NEWLEADER.

In addition, the ACKs of proposals that the follower generates before it replies ACK of NEWLEADER are queued first and are sent only after the ACK of NEWLEADER has been sent. This is because the corresponding learner handler on the leader is not able to process an ACK of PROPOSAL before it receives the ACK of NEWLEADER. Worse, this disorder might cause the leader to shut down and start a new election, adding unnecessary unavailability (see [ZOOKEEPER-4685](https://issues.apache.org/jira/browse/ZOOKEEPER-4685) for further details). This solution preserves the safety property that no committed log is lost, at little performance cost.
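The ordering described above can be illustrated with a minimal, standalone Java sketch. It is not the code in this patch: only the name `newleaderLatch` comes from the description, while the class `FollowerSyncSketch`, its methods, and the string messages are hypothetical stand-ins for the follower's logging thread and its connection to the leader.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;

// Hypothetical model of a follower during SYNC; not ZooKeeper's actual classes.
class FollowerSyncSketch {
    // Counts uncommitted transactions that must be logged before ACK of NEWLEADER.
    private final CountDownLatch newleaderLatch;
    private final Object lock = new Object();
    // ACKs of PROPOSAL produced before ACK of NEWLEADER has been sent.
    private final Queue<Long> pendingProposalAcks = new ArrayDeque<>();
    // Messages "sent" to the leader, in order (stand-in for the network).
    private final List<String> sent = new ArrayList<>();
    private boolean newLeaderAcked = false;

    FollowerSyncSketch(int uncommittedTxnCount) {
        this.newleaderLatch = new CountDownLatch(uncommittedTxnCount);
    }

    // Called by the (asynchronous) logging thread once a transaction is persisted.
    void onTxnLogged(long zxid) {
        synchronized (lock) {
            if (newLeaderAcked) {
                sent.add("ACK-PROPOSAL " + zxid);
            } else {
                pendingProposalAcks.add(zxid); // hold back until NEWLEADER is ACKed
            }
        }
        // Counting down after queuing ensures the queued ACK is visible
        // to the thread that wakes up from await().
        newleaderLatch.countDown();
    }

    // Called by the follower's main thread after receiving NEWLEADER.
    void ackNewLeader() throws InterruptedException {
        newleaderLatch.await(); // block until every uncommitted txn is logged
        synchronized (lock) {
            sent.add("ACK-NEWLEADER");
            newLeaderAcked = true;
            Long zxid;
            while ((zxid = pendingProposalAcks.poll()) != null) {
                sent.add("ACK-PROPOSAL " + zxid); // flush queued ACKs in order
            }
        }
    }

    List<String> sentMessages() {
        synchronized (lock) {
            return new ArrayList<>(sent);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FollowerSyncSketch follower = new FollowerSyncSketch(2);
        Thread logger = new Thread(() -> {
            follower.onTxnLogged(1L);
            follower.onTxnLogged(2L);
        });
        logger.start();
        follower.ackNewLeader();
        logger.join();
        System.out.println(follower.sentMessages());
    }
}
```

In this sketch the ACK of NEWLEADER can never precede the logging of the two uncommitted transactions, and their proposal ACKs always follow it, regardless of how the two threads interleave.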
Besides [ZOOKEEPER-4646](https://issues.apache.org/jira/browse/ZOOKEEPER-4646) and [ZOOKEEPER-3911](https://issues.apache.org/jira/browse/ZOOKEEPER-3911), the solution is also able to avoid the issue of [ZOOKEEPER-4685](https://issues.apache.org/jira/browse/ZOOKEEPER-4685) at the same time.
