[ https://issues.apache.org/jira/browse/RATIS-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
guangbao zhao reassigned RATIS-2145: ------------------------------------ Assignee: guangbao zhao > Follower hangs until the next trigger to take a snapshot > --------------------------------------------------------- > > Key: RATIS-2145 > URL: https://issues.apache.org/jira/browse/RATIS-2145 > Project: Ratis > Issue Type: Bug > Components: gRPC > Affects Versions: 3.0.1 > Reporter: guangbao zhao > Assignee: guangbao zhao > Priority: Major > > We discovered a problem when writing tests with high concurrency. It often > happens that a follower is running well and then triggers takeSnalshot. > The following is the relevant log. > follower: (as the follower log says, between 2024/08/22 20:18:14,044 and > 2024/08/22 20:21:57,058, no other logs appeared in the follower, but the > follower election was not triggered, indicating that the leader gave The > heartbeat sent by the follower is successful) > {code:java} > 2024/08/22 20:18:13,987 [node1@group-4F53D3317400-StateMachineUpdater] INFO > org.apache.ratis.server.raftlog.RaftLog: > node1@group-4F53D3317400-SegmentedRaftLog: snapshotIndex: updateIncreasingly > 22436696498 -> 22441096501 > 2024/08/22 20:18:13,999 [node1@group-4F53D3317400-SegmentedRaftLogWorker] > INFO org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker: > node1@group-4F53D3317400-SegmentedRaftLogWorker: created new log segment > /home/work/ssd1/lavafs/aktst-private/metaserver/metadata/ratis/23d5405d-0e30-3d56-9a77-4f53d3317400/current/log_inprogress_22441098615 > 2024/08/22 20:18:14,044 [node1@group-4F53D3317400-SegmentedRaftLogWorker] > INFO org.apache.ratis.server.raftlog.RaftLog: > node1@group-4F53D3317400-SegmentedRaftLog: purgeIndex: updateToMax > old=22432683959, new=22437078979, updated? true > 2024/08/22 20:21:57,058 [grpc-default-executor-23] INFO > com.xxx.RaftJournalManager: Received install snapshot notification from > MetaStore leader: node3 with term index: (t:192, i:22441477801) > 2024/08/22 20:21:57,059 [InstallSnapshotThread] INFO > com.xxx.MetaStoreRatisSnapshotProvider: Downloading latest checkpoint from > Leader MetaStore node3. Checkpoint address: leader:8170 > 2024/08/22 20:21:57,064 [grpc-default-executor-23] INFO > org.apache.ratis.grpc.server.GrpcServerProtocolService: node1: Completed > INSTALL_SNAPSHOT, lastRequest: node3->node1#0-t192,notify:(t:192, > i:22441477801) > 2024/08/22 20:21:57,065 [grpc-default-executor-23] INFO > org.apache.ratis.grpc.server.GrpcServerProtocolService: node1: Completed > INSTALL_SNAPSHOT, lastReply: null > 2024/08/22 20:21:57,067 [node1-server-thread55] INFO > org.apache.ratis.server.RaftServer$Division: node1@group-4F53D3317400: Failed > appendEntries as snapshot (22441477801) installation is in progress > 2024/08/22 20:21:57,068 [node1-server-thread55] INFO > org.apache.ratis.server.RaftServer$Division: node1@group-4F53D3317400: > inconsistency entries. > Reply:node3<-node1#19406445:FAIL-t192,INCONSISTENCY,nextIndex=22441098642,followerCommit=22441098595,matchIndex=-1{code} > leader: > {code:java} > 2024/08/22 20:18:16,958 [timer5] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867241,entriesCount=25,entries=(t:192, > i:22441098598)...(t:192, i:22441098622) > 2024/08/22 20:18:16,964 [timer3] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867246,entriesCount=1,entry=(t:192, > i:22441098624) > 2024/08/22 20:18:16,964 [timer6] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867247,entriesCount=1,entry=(t:192, > i:22441098625) > 2024/08/22 20:18:16,964 [timer7] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867245,entriesCount=1,entry=(t:192, > i:22441098623) > 2024/08/22 20:18:16,965 [timer3] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867255,entriesCount=1,entry=(t:192, > i:22441098627) > 2024/08/22 20:18:16,965 [timer7] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867253,entriesCount=1,entry=(t:192, > i:22441098626) > 2024/08/22 20:18:16,979 [timer0] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867406,entriesCount=13,entries=(t:192, > i:22441098628)...(t:192, i:22441098640) > 2024/08/22 20:18:16,981 [timer5] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867411,entriesCount=1,entry=(t:192, > i:22441098641) > 2024/08/22 20:21:52,849 [node3@group-4F53D3317400-StateMachineUpdater] INFO > com.xxx.RaftJournalManager: Current LastApplied Index (t:192, i:22445496444) > 2024/08/22 20:21:57,021 [node3@group-4F53D3317400-StateMachineUpdater] INFO > org.apache.ratis.server.impl.StateMachineUpdater: > node3@group-4F53D3317400-StateMachineUpdater: Took a snapshot at index > 22445496444 > 2024/08/22 20:21:57,021 [node3@group-4F53D3317400-StateMachineUpdater] INFO > org.apache.ratis.server.impl.StateMachineUpdater: > node3@group-4F53D3317400-StateMachineUpdater: snapshotIndex: > updateIncreasingly 22441096438 -> 22445496444 > 2024/08/22 20:21:57,021 [node3@group-4F53D3317400-StateMachineUpdater] INFO > org.apache.ratis.server.raftlog.RaftLog: > node3@group-4F53D3317400-SegmentedRaftLog: purge 22441496502 > 2024/08/22 20:21:57,021 [node3@group-4F53D3317400-StateMachineUpdater] INFO > org.apache.ratis.server.raftlog.RaftLog: > node3@group-4F53D3317400-SegmentedRaftLog: snapshotIndex: updateIncreasingly > 22441096438 -> 22445496444 > 2024/08/22 20:21:57,021 > [node3@group-4F53D3317400->node1-GrpcLogAppender-LogAppenderDaemon] INFO > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: notifyInstallSnapshot with > firstAvailable=(t:192, i:22441477801), followerNextIndex=22441098642 {code} > Because the leader did not receive the onNext callback within the > requestTimeoutDuration(3s) time, the above warn log was > generated.(raft.grpc.server.leader.outstanding.appends.max is set to 8) > {code:java} > scheduler.onTimeout(requestTimeoutDuration, > () -> timeoutAppendRequest(request.getCallId(), request.isHeartbeat()), > LOG, () -> "Timeout check failed for append entry request: " + request); > {code} > and see through code: > {code:java} > private boolean haveTooManyPendingRequests() { > final int size = pendingRequests.logRequestsSize(); > if (size == 0) { > return false; > } else if (size >= maxPendingRequestsNum) { > return true; > } else { > // queue is non-empty and non-full > return !replyState.isFirstReplyReceived(); > } > } {code} > When the size of pendingRequests.logRequestsSize() is >= 8, log entries will > not be sent, only heartbeats will be sent.(This also verifies that there is a > log gap in the follower log and no election is taking place) > {quote}leader: > 2024/08/22 20:18:16,981 [timer5] WARN > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: Timed out appendEntries, > errorCount=1, > request=AppendEntriesRequest:cid=16867411,entriesCount=1,entry=(t:192, > i:22441098641) > ... > 2024/08/22 20:21:57,021 > [node3@group-4F53D3317400->node1-GrpcLogAppender-LogAppenderDaemon] INFO > org.apache.ratis.grpc.server.GrpcLogAppender: > node3@group-4F53D3317400->node1-GrpcLogAppender: notifyInstallSnapshot with > firstAvailable=(t:192, i:22441477801), > followerNextIndex={color:#ff0000}22441098642{color} > follower: > 2024/08/22 20:21:57,068 [node1-server-thread55] INFO > org.apache.ratis.server.RaftServer$Division: node1@group-4F53D3317400: > inconsistency entries. > Reply:node3<-node1#19406445:FAIL-t192,INCONSISTENCY,nextIndex={color:#ff0000}22441098642{color},followerCommit=22441098595,matchIndex=-1 > {quote} > And when the leader meets the purge log conditions, the leader will trigger > notifyInstallSnapshot to the follower. At this time, the follower's nextIndex > is 22441098642, and the last warn log is 22441098641 (exactly 22441098642 - > 1), indicating that the follower has received the above warning log and > updated the follower's nextIndex, but the logRequests of these 8 warns are > still in logRequests , if it is the next 8 logRequests, then the > corresponding warn log will appear in the following log, but I did not find > it in the log. > As mentioned above, pendingRequests.logRequestsSize() >= > maxPendingRequestsNum is always satisfied during the period of 2024/08/22 > 20:18:14,044 and 2024/08/22 20:21:57,058, until the leader's firstAvailable > > followerNextIndex > {quote} > private void timeoutAppendRequest(long cid, boolean heartbeat) { > final AppendEntriesRequest pending = pendingRequests.handleTimeout(cid, > heartbeat); > if (pending != null) { > final int errorCount = replyState.process(Event.TIMEOUT); > LOG.warn("{}: Timed out {}appendEntries, errorCount={}, request={}", > this, heartbeat ? "HEARTBEAT " : "", errorCount, pending); > grpcServerMetrics.onRequestTimeout(getFollowerId().toString(), heartbeat); > pending.stopRequestTimer(); > } > } > > {quote} > [~szetszwo] Is it possible to replace handleTimeout with the remove method > here, so that it will not cause pendingRequests.logRequestsSize() >= > maxPendingRequestsNum, causing the leader to keep sending heartbeats to the > follower until notifyInstallSnapshot is triggered? leave the rest to the > follower's checkInconsistentAppendEntries to ensure, do you think this is ok? > When this phenomenon occurs, the last log of the follower always stays at: > {quote}2024/08/22 20:18:14,044 > [node1@group-4F53D3317400-SegmentedRaftLogWorker] INFO > org.apache.ratis.server.raftlog.RaftLog: > node1@group-4F53D3317400-SegmentedRaftLog: purgeIndex: updateToMax > old=22432683959, new=22437078979, updated? true > {quote} > Or sometimes there are hidden bugs in the purge call logic that have not been > discovered. -- This message was sent by Atlassian Jira (v8.20.10#820010)