[
https://issues.apache.org/jira/browse/RATIS-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696736#comment-17696736
]
Kaijie Chen commented on RATIS-1796:
------------------------------------
{quote}The old leader should have stepped down first. No?{quote}
The old leader will not step down first; its step-down is triggered by the
RequestVote from the transferee. Quoting the Raft dissertation:
{quote}Once the target server receives the TimeoutNow request, it is highly
likely to start an election before any other server and become leader in the
next term. Its next message to the prior leader will include its new term
number, causing the prior leader to step down. At this point, leadership
transfer is complete.{quote}
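To make the quoted rule concrete, here is a minimal sketch of a server that steps down when it sees a RequestVote with a newer term. It is a simplified model, not Ratis code: {{StepDownSketch}}, {{Server}} and {{onRequestVote}} are hypothetical names, and the log up-to-date check and votedFor bookkeeping are left out.

```java
/** Minimal model of the step-down rule quoted above; not Ratis code. */
final class StepDownSketch {
  enum Role { FOLLOWER, CANDIDATE, LEADER }

  /** Hypothetical, simplified server: only a role and a current term. */
  static final class Server {
    private Role role;
    private long currentTerm;

    Server(Role role, long currentTerm) {
      this.role = role;
      this.currentTerm = currentTerm;
    }

    synchronized Role role() { return role; }
    synchronized long currentTerm() { return currentTerm; }

    /**
     * Handles a RequestVote carrying the candidate's term and returns true
     * if the vote is granted. The log up-to-date check and votedFor
     * bookkeeping are omitted, so this is a simplification of real Raft.
     */
    synchronized boolean onRequestVote(long candidateTerm) {
      if (candidateTerm <= currentTerm) {
        // Not a newer term: this sketch rejects the request, role unchanged.
        return false;
      }
      // Newer term observed: adopt it and step down to follower.
      currentTerm = candidateTerm;
      role = Role.FOLLOWER;
      return true;
    }
  }

  public static void main(String[] args) {
    Server oldLeader = new Server(Role.LEADER, 1);
    // The transferee starts an election in term 2 after TimeoutNow.
    boolean granted = oldLeader.onRequestVote(2);
    System.out.println("granted=" + granted + ", role=" + oldLeader.role()
        + ", term=" + oldLeader.currentTerm());
  }
}
```

In this model the old leader in term 1 stays leader until the request with term 2 arrives, which matches the ordering described in the dissertation.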
In the Ratis implementation, {{stepDownLeaderAsync}} is only called when
{{newLeader}} is {{null}}.
{code:java}
CompletableFuture<RaftClientReply> transferLeadershipAsync(TransferLeadershipRequest request)
    throws IOException {
  if (request.getNewLeader() == null) {
    // No target peer was specified: the leader simply steps down.
    return stepDownLeaderAsync(request);
  }
  // ... (rest of the method omitted)
{code}
This is a special case of the TransferLeadership RPC.
Alternatively, a {{null}} {{newLeader}} could be interpreted as a request to
transfer leadership to any other peer, which would reduce the downtime.
----
For the problem in this Jira,
{quote}I think I have found the problem:
1. Transferee received startLeaderElection
(RaftServerImpl#startLeaderElection:1700 ->
RaftServerImpl#changeToCandidate:649 -> RoleInfo#startLeaderElection:121 ->
start new thread LeaderElection)
2. Transferee received appendEntries (stack trace in the log above), and become
follower.
3. LeaderElection thread in step 1 is running, found the CandidateState is already
CLOSED by step 2.

The term of transferee is expected to be increased in step 3
(LeaderElection#run:238 -> LeaderElection#askForVotes:304 ->
ServerState#initElection:221 -> currentTerm.incrementAndGet).
But in this case, step 2 is executed before step 3 when the term hasn't been
increased.{quote}
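The ordering problem described above can be replayed with a small deterministic model. This is only a sketch under simplifying assumptions, not Ratis code: {{TransferRaceSketch}}, {{Transferee}} and {{replay}} are hypothetical names, the LeaderElection thread is modelled as a plain method call, and an appendEntries is accepted whenever its term is not older than the current term.

```java
/** Deterministic replay of the two orderings of steps 2 and 3; not Ratis code. */
final class TransferRaceSketch {
  enum Role { FOLLOWER, CANDIDATE }

  /** Hypothetical, simplified transferee: only a role and a current term. */
  static final class Transferee {
    private Role role = Role.FOLLOWER;
    private long currentTerm;

    Transferee(long currentTerm) { this.currentTerm = currentTerm; }

    /** Step 1: become candidate; the election runs later and the term is unchanged. */
    void onStartLeaderElection() { role = Role.CANDIDATE; }

    /** Step 2: appendEntries from the old leader; returns true if accepted. */
    boolean onAppendEntries(long leaderTerm) {
      if (leaderTerm < currentTerm) {
        return false; // stale leader: rejected, role unchanged
      }
      role = Role.FOLLOWER; // this closes the candidate state
      return true;
    }

    /** Step 3: the election body; the term is incremented only here. */
    boolean runElection() {
      if (role != Role.CANDIDATE) {
        return false; // candidate state already closed, election aborted
      }
      currentTerm++;
      return true;
    }

    @Override
    public String toString() { return "role=" + role + ", term=" + currentTerm; }
  }

  /** Replays steps 1 to 3 with the old leader in term 1, in either order. */
  static String replay(boolean appendEntriesFirst) {
    Transferee s1 = new Transferee(1);
    s1.onStartLeaderElection();
    if (appendEntriesFirst) {
      s1.onAppendEntries(1); // step 2 before step 3: the case in this Jira
      s1.runElection();
    } else {
      s1.runElection();      // step 3 before step 2: the expected case
      s1.onAppendEntries(1);
    }
    return s1.toString();
  }

  public static void main(String[] args) {
    System.out.println(replay(true));  // role=FOLLOWER, term=1
    System.out.println(replay(false)); // role=CANDIDATE, term=2
  }
}
```

When appendEntries wins the race, the transferee is still in term 1, so it accepts the old leader and the transfer is stopped. When the election body runs first, the same appendEntries is stale and is rejected.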
Maybe we can introduce a Pre-Candidate state alongside the Candidate state,
and increase the term when a peer becomes a Candidate rather than inside the
LeaderElection thread.
Reference: https://github.com/etcd-io/raft/blob/main/raft.go#L839-L866
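A rough sketch of the idea, assuming the term is incremented in the same step as the role change. It is not the Ratis or etcd implementation: {{CandidateTermSketch}}, {{Peer}}, {{becomePreCandidate}} and {{becomeCandidate}} are hypothetical names, and vote handling is left out.

```java
/** Sketch: the term is increased when the peer becomes Candidate; not Ratis code. */
final class CandidateTermSketch {
  enum Role { FOLLOWER, PRE_CANDIDATE, CANDIDATE }

  /** Hypothetical, simplified peer: only a role and a current term. */
  static final class Peer {
    private Role role = Role.FOLLOWER;
    private long currentTerm;

    Peer(long currentTerm) { this.currentTerm = currentTerm; }

    synchronized Role role() { return role; }
    synchronized long currentTerm() { return currentTerm; }

    /** Pre-Candidate: only the role changes, the term stays the same. */
    synchronized void becomePreCandidate() { role = Role.PRE_CANDIDATE; }

    /** Candidate: the term is increased atomically with the role change. */
    synchronized void becomeCandidate() {
      currentTerm++;
      role = Role.CANDIDATE;
    }

    /** appendEntries from a leader; returns true if accepted. */
    synchronized boolean onAppendEntries(long leaderTerm) {
      if (leaderTerm < currentTerm) {
        return false; // stale leader: rejected, role unchanged
      }
      currentTerm = leaderTerm;
      role = Role.FOLLOWER;
      return true;
    }
  }

  public static void main(String[] args) {
    Peer transferee = new Peer(1);
    transferee.becomeCandidate(); // term is already 2 before any election thread runs
    boolean accepted = transferee.onAppendEntries(1); // old leader is still in term 1
    System.out.println("accepted=" + accepted + ", role=" + transferee.role()
        + ", term=" + transferee.currentTerm());
  }
}
```

With this ordering, an appendEntries from the old leader in term 1 can no longer close the Candidate state, while a Pre-Candidate still falls back to follower when it hears from the current leader.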
[~szetszwo] what do you think?
> TransferLeadership stopped by appendEntries from old leader
> -----------------------------------------------------------
>
> Key: RATIS-1796
> URL: https://issues.apache.org/jira/browse/RATIS-1796
> Project: Ratis
> Issue Type: Sub-task
> Reporter: Kaijie Chen
> Assignee: Kaijie Chen
> Priority: Major
>
> The Candidate state of the transferee may be stopped by an appendEntries
> from the old leader; see the log below:
> {code:java}
> 2023-02-28 04:52:45,026 [s0-server-thread1] INFO impl.TransferLeadership
> (TransferLeadership.java:tryTransferLeadership(107)) - s0@group-43918D205BB2:
> start transferring leadership to s1
> 2023-02-28 04:52:45,029 [s0-server-thread1] INFO impl.TransferLeadership
> (TransferLeadership.java:tryTransferLeadership(116)) - s0@group-43918D205BB2:
> sent StartLeaderElection to transferee s1 immediately as it already has
> up-to-date log
> 2023-02-28 04:52:45,031 [grpc-default-executor-6] INFO impl.RoleInfo
> (RoleInfo.java:shutdownFollowerState(111)) - s1: shutdown
> s1@group-43918D205BB2-FollowerState
> 2023-02-28 04:52:45,032 [s1@group-43918D205BB2-FollowerState] INFO
> impl.FollowerState (FollowerState.java:run(152)) -
> s1@group-43918D205BB2-FollowerState was interrupted
> 2023-02-28 04:52:45,032 [grpc-default-executor-6] INFO impl.RoleInfo
> (RoleInfo.java:updateAndGet(140)) - s1: start
> s1@group-43918D205BB2-LeaderElection4
> 2023-02-28 04:52:45,054 [s1-server-thread1] INFO impl.RoleInfo
> (RoleInfo.java:shutdownLeaderElection(131)) - s1: shutdown
> s1@group-43918D205BB2-LeaderElection4
> 2023-02-28 04:52:45,054 [s1-server-thread1] INFO impl.RoleInfo
> (RoleInfo.java:startFollowerState(104)) - s1: startFollowerState
> reason:appendEntries from s0 term 1,
> trace: java.base/java.lang.Thread.getStackTrace(Thread.java:1602),
>
> org.apache.ratis.server.impl.RoleInfo.startFollowerState(RoleInfo.java:104),
>
> org.apache.ratis.server.impl.RaftServerImpl.changeToFollower(RaftServerImpl.java:547),
>
> org.apache.ratis.server.impl.RaftServerImpl.changeToFollowerAndPersistMetadata(RaftServerImpl.java:556),
>
> org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1498),
>
> org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1396),
>
> org.apache.ratis.server.impl.RaftServerProxy.lambda$appendEntriesAsync$26(RaftServerProxy.java:639),
> org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117),
>
> org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitServerRequestAsync$11(RaftServerImpl.java:818),
>
> java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700),
>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128),
>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628),
> java.base/java.lang.Thread.run(Thread.java:829)
> 2023-02-28 04:52:45,055 [s1-server-thread1] INFO impl.RoleInfo
> (RoleInfo.java:updateAndGet(140)) - s1: start
> s1@group-43918D205BB2-FollowerState
> {code}