EgorKuts commented on code in PR #6843:
URL: https://github.com/apache/ignite-3/pull/6843#discussion_r2480257001
##########
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/Replicator.java:
##########
@@ -1239,6 +1239,15 @@ void onHeartbeatReturned(final ThreadId id, final Status status, final AppendEnt
r.startHeartbeatTimer(startTimeMs);
return;
}
+ if (!response.success()) {
Review Comment:
1. There are 3 nodes in the group. The 3rd becomes the leader, term=2.
2. The 3rd creates a Replicator for each follower in becomeLeader().
3. The 1st node deletes its state and rejoins the group.
4. The leader sees no configuration difference here: newConf.diff(oldConf,
adding, removing). No new Replicator is created, because the 1st rejoins the
group with the same consistentId (see
[equals](https://github.com/EgorKuts/ignite-3/blob/52ba3bfe8c21e4785474477a0ff3a40b4a3247f4/modules/raft/src/main/java/org/apache/ignite/raft/jraft/entity/PeerId.java#L245)).
5. The old Replicator on the 3rd node keeps thinking that node 1 has
prevLogIndex=52 from step 1, though the actual state of node 1 is
prevLogIndex=0, term=0. The leader keeps sending heartbeats to node 1 with
prevLogIndex=52. They don't trigger probeCheck. Everything is stuck until a new
election or a real AppendEntries request from the leader.
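
To make the scenario above easier to follow, here is a rough toy model of it. This is only a sketch under simplifying assumptions, not the jraft code: `ToyFollower`, `ToyReplicator`, `probeOnRejectedHeartbeat` and `nextIndexAfterHeartbeats` are hypothetical names invented for illustration, terms are ignored, and the "probe" is reduced to resetting `nextIndex` from the follower's last log index.

```java
/** Toy model of the stale-Replicator scenario. Not the jraft classes. */
public class StaleReplicatorSketch {

    /** Hypothetical follower: rejects any request whose prevLogIndex is beyond its log. */
    static final class ToyFollower {
        private long lastLogIndex;

        ToyFollower(long lastLogIndex) {
            this.lastLogIndex = lastLogIndex;
        }

        /** Step 3: the node deletes its state and comes back with an empty log. */
        void wipeState() {
            lastLogIndex = 0;
        }

        long lastLogIndex() {
            return lastLogIndex;
        }

        boolean accepts(long prevLogIndex) {
            return prevLogIndex <= lastLogIndex;
        }
    }

    /** Hypothetical leader-side view of one follower. */
    static final class ToyReplicator {
        private final boolean probeOnRejectedHeartbeat;
        private long nextIndex;

        ToyReplicator(long nextIndex, boolean probeOnRejectedHeartbeat) {
            this.nextIndex = nextIndex;
            this.probeOnRejectedHeartbeat = probeOnRejectedHeartbeat;
        }

        long nextIndex() {
            return nextIndex;
        }

        /** Sends a heartbeat carrying prevLogIndex = nextIndex - 1. */
        void sendHeartbeat(ToyFollower follower) {
            boolean success = follower.accepts(nextIndex - 1);
            if (!success && probeOnRejectedHeartbeat) {
                // Assumed simplification of a probe: realign with the follower's log.
                nextIndex = follower.lastLogIndex() + 1;
            }
            // Without the flag, a rejected heartbeat changes nothing (step 5).
        }
    }

    /** Replays steps 1-5 and returns the leader's nextIndex for node 1 afterwards. */
    static long nextIndexAfterHeartbeats(boolean probeOnRejectedHeartbeat, int heartbeats) {
        ToyFollower node1 = new ToyFollower(52);
        // The Replicator was created in becomeLeader(), so it expects prevLogIndex=52.
        ToyReplicator replicator = new ToyReplicator(53, probeOnRejectedHeartbeat);
        node1.wipeState();
        for (int i = 0; i < heartbeats; i++) {
            replicator.sendHeartbeat(node1);
        }
        return replicator.nextIndex();
    }

    public static void main(String[] args) {
        System.out.println("no probe on rejected heartbeat: nextIndex="
                + nextIndexAfterHeartbeats(false, 3));
        System.out.println("probe on rejected heartbeat: nextIndex="
                + nextIndexAfterHeartbeats(true, 3));
    }
}
```

In this model the replicator without the check stays at nextIndex=53 no matter how many heartbeats it sends, while the one that reacts to a rejected heartbeat drops to nextIndex=1 after the first one.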