Brokenice0415 commented on code in PR #939:
URL: https://github.com/apache/ratis/pull/939#discussion_r1361478687


##########
ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java:
##########
@@ -143,7 +143,12 @@ private void resetClient(AppendEntriesRequest request, boolean onError) {
       if (request != null && request.isHeartbeat()) {
         return;
       }
-      getFollower().decreaseNextIndex(nextIndex);
+      // decrease next index
+      final long oldNextIndex = getFollower().getNextIndex();
+      final long matchIndex = getFollower().getMatchIndex();
+      getFollower().updateNextIndex(
+              Math.max(matchIndex + 1
+                      , oldNextIndex <= 0L? oldNextIndex: Math.min(oldNextIndex - 1, nextIndex)));

Review Comment:
   @szetszwo, I agree that we should not update the next index blindly.
   
   I wrote the patch by following how `FollowerInfoImpl` operates on the next index. Its aim is to add a lower bound of `match index + 1` to the decreased next index.
   
   I also think I should call `setNextIndex` instead of `updateNextIndex`, so that the decrease actually takes effect: `updateNextIndex` goes through `updateToMax`.
   ```java
   @Override
   public void decreaseNextIndex(long newNextIndex) {
     nextIndex.updateUnconditionally(old -> old <= 0L? old: Math.min(old - 1, newNextIndex),
         message -> info("decreaseNextIndex", message));
   }
   
   @Override
   public void setNextIndex(long newNextIndex) {
     nextIndex.updateUnconditionally(old -> newNextIndex >= 0 ? newNextIndex : old,
         message -> info("setNextIndex", message));
   }
   
   @Override
   public void updateNextIndex(long newNextIndex) {
     nextIndex.updateToMax(newNextIndex,
         message -> debug("updateNextIndex", message));
   }
   ```
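   
   To make the difference between the three methods concrete, here is a minimal standalone sketch. `NextIndexModel` is a hypothetical stand-in, not the Ratis class: it replaces `RaftLogIndex` with a plain `AtomicLong`, drops the logging callbacks, and assumes that `updateToMax` keeps the larger of the old and new values.
   
   ```java
   import java.util.concurrent.atomic.AtomicLong;
   
   /** Hypothetical model of the next-index updates quoted above; not Ratis code. */
   final class NextIndexModel {
     private final AtomicLong nextIndex;
   
     NextIndexModel(long initial) {
       this.nextIndex = new AtomicLong(initial);
     }
   
     long get() {
       return nextIndex.get();
     }
   
     /** Mirrors decreaseNextIndex: step back by at least one, unless already <= 0. */
     void decreaseNextIndex(long newNextIndex) {
       nextIndex.updateAndGet(old -> old <= 0L ? old : Math.min(old - 1, newNextIndex));
     }
   
     /** Mirrors setNextIndex: accept any non-negative value, higher or lower. */
     void setNextIndex(long newNextIndex) {
       nextIndex.updateAndGet(old -> newNextIndex >= 0 ? newNextIndex : old);
     }
   
     /** Mirrors updateNextIndex, assuming updateToMax means max(old, new). */
     void updateNextIndex(long newNextIndex) {
       nextIndex.updateAndGet(old -> Math.max(old, newNextIndex));
     }
   }
   ```
   
   Under that assumption, starting from 5, `updateNextIndex(3)` leaves the value at 5, while `setNextIndex(3)` lowers it to 3. That is why a clamped decrease has to go through `setNextIndex`.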
   
   The bug I described happens in the following case:
   
   1. Leader 0 appends the no-op log entry [0] and the cluster reaches consensus on it.
   2. Follower 1's next index is now 1 and its match index is 0.
   3. Leader 0 sends a heartbeat to follower 1.
   4. Follower 1 crashes before replying to the heartbeat.
   5. Leader 0 decreases follower 1's next index to 0, which equals the match index, because the new next index is computed as `min(old - 1, newNextIndex)`:
     ```
     2023-10-17 10:59:26 WARN  GrpcLogAppender:122 - 1@group-6F7570313233->0-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason
     2023-10-17 10:59:26 INFO  FollowerInfo:51 - 1@group-6F7570313233->0: nextIndex: updateUnconditionally 1 -> 0
     ```
   6. The next index is now no longer larger than the match index.
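   
   The steps above can be replayed with plain arithmetic. This is only a sketch: `ClampedDecrease` and its method names are hypothetical, and the requested next index of 0 is an assumption chosen to match the `1 -> 0` transition in the log.
   
   ```java
   /** Hypothetical helper comparing the current decrease with the proposed lower bound. */
   final class ClampedDecrease {
     private ClampedDecrease() {
     }
   
     /** Current rule: min(old - 1, requested), leaving values <= 0 untouched. */
     static long unclamped(long oldNextIndex, long requested) {
       return oldNextIndex <= 0L ? oldNextIndex : Math.min(oldNextIndex - 1, requested);
     }
   
     /** Proposed rule: the same decrease, but never below matchIndex + 1. */
     static long clamped(long matchIndex, long oldNextIndex, long requested) {
       return Math.max(matchIndex + 1, unclamped(oldNextIndex, requested));
     }
   }
   ```
   
   With match index 0 and old next index 1, `unclamped(1, 0)` returns 0, which reproduces step 5, whereas `clamped(0, 1, 0)` returns 1 and keeps the next index above the match index. When the decreased value is already above the bound, for example `clamped(0, 5, 3)`, the result is the same as the unclamped one.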
   
   > since it got an error, it should try resending the same (previous log 
index + 1) even if (previous log index + 1 <= match index). 
   
   Resending the same request is always correct and only costs some redundant requests, but it would still be better to optimize it.
   
   > One reason could be due to inconsistency; see getNextIndexForInconsistency.
   
   In this case, the patch behaves the same as before, provided it uses `setNextIndex` instead of `updateNextIndex`.


