jonmv commented on code in PR #1925:
URL: https://github.com/apache/zookeeper/pull/1925#discussion_r998888168


##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java:
##########
@@ -756,13 +760,21 @@ protected void syncWithLeader(long newLeaderZxid) throws Exception {
                     zk.startupWithoutServing();
                     if (zk instanceof FollowerZooKeeperServer) {
                         FollowerZooKeeperServer fzk = (FollowerZooKeeperServer) zk;
-                        for (PacketInFlight p : packetsNotCommitted) {
+                        fzk.syncProcessor.setDelayForwarding(true);
+                        for (PacketInFlight p : packetsNotLogged) {
                             fzk.logRequest(p.hdr, p.rec, p.digest);
                         }
-                        packetsNotCommitted.clear();
+                        packetsNotLogged.clear();

Review Comment:
   This was the bug that caused the learner to crash during sync: on receiving 
NEWLEADER it "forgot" a previous PROPOSAL, and then failed to match that 
PROPOSAL with a later COMMIT if one was sent during the sync. 
   In turn, the learner had to re-sync, which could trigger the same crash 
again under heavy concurrent write traffic. It also produced duplicate series 
of transactions in the transaction logs, resulting in a transaction digest 
mismatch on that server (but an otherwise consistent data view). 
   
   So we need to separate what has not yet been written to the log from what 
has not yet been matched with a COMMIT, which is what these two queues are for. 
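
   A minimal sketch of the two-queue idea, not the actual `Learner` code: the 
class, its fields, and its methods are hypothetical, and each transaction is 
reduced to its zxid. It only illustrates that flushing the "not logged" queue 
on NEWLEADER leaves the "not committed" queue intact, so a later COMMIT can 
still be matched. 

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Simplified, hypothetical model of a learner tracking proposals during sync. */
class TwoQueueSketch {
    // Proposals received but not yet written to the transaction log.
    final Deque<Long> notLogged = new ArrayDeque<>();
    // Proposals received but not yet matched with a COMMIT.
    final Deque<Long> notCommitted = new ArrayDeque<>();
    // Stand-ins for the transaction log and the applied state.
    final List<Long> log = new ArrayList<>();
    final List<Long> applied = new ArrayList<>();

    /** A PROPOSAL is pending on both counts: it needs logging and a COMMIT. */
    void onProposal(long zxid) {
        notLogged.add(zxid);
        notCommitted.add(zxid);
    }

    /** On NEWLEADER, flush only the logging queue; commit tracking is untouched. */
    void onNewLeader() {
        log.addAll(notLogged);
        notLogged.clear();
    }

    /** A COMMIT must match the oldest uncommitted proposal; returns false otherwise. */
    boolean onCommit(long zxid) {
        Long head = notCommitted.peekFirst();
        if (head == null || head != zxid) {
            return false;
        }
        notCommitted.removeFirst();
        applied.add(zxid);
        return true;
    }
}
```

   With a single shared queue, clearing it in `onNewLeader()` would make the 
subsequent `onCommit(...)` find no matching proposal, which is the failure 
described above. 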



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.