[
https://issues.apache.org/jira/browse/HDDS-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782195#comment-17782195
]
Aryan Gupta commented on HDDS-9552:
-----------------------------------
Hi [~pratyush.bhatt],
When we bootstrap a new SCM node, it joins the Raft group and fetches the Raft
log from the leader, which can take a while. Ratis always elects the peer with
the most up-to-date log index as leader, so a newly added peer needs some time
to catch up; if we try to transfer leadership immediately after the bootstrap,
the transfer-leader command can therefore fail. In practice there is no need to
transfer leadership manually, since leader election is autonomous. In any case,
I have tested the transfer-leadership command several times and it works fine
on my cluster.
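As a rough illustration of the point above (the transfer only succeeds once the new peer has caught up on the Raft log), here is a minimal retry sketch. The node ID, attempt count, and back-off are illustrative values, and the actual `ozone admin scm transfer -n <id>` call is stubbed out:

```shell
#!/bin/sh
# Sketch: retry leadership transfer until the freshly bootstrapped SCM
# has caught up on the Raft log. Values below are illustrative only.

try_transfer() {
  # On a real cluster this would run:
  #   ozone admin scm transfer -n "$1"
  # and return its exit status; stubbed here for illustration.
  echo "transferring leadership to $1"
}

attempt=1
while [ "$attempt" -le 5 ]; do
  if try_transfer 8f80727a-2fc4-401c-8b5a-bb9f110a511e; then
    echo "transfer succeeded on attempt $attempt"
    break
  fi
  sleep 30  # give the new peer time to replicate the leader's log
  attempt=$((attempt + 1))
done
```

Since leader election in Ratis is autonomous, a loop like this is only needed when a specific node must become leader, e.g. in automated tests such as this one.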
> [MasterNode decommissioning] TIMED_OUT transferring SCM Leadership
> ------------------------------------------------------------------
>
> Key: HDDS-9552
> URL: https://issues.apache.org/jira/browse/HDDS-9552
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Pratyush Bhatt
> Assignee: Aryan Gupta
> Priority: Major
>
> *Scenario:* Transfer leadership to a newly recommissioned SCM.
> *Steps:*
> 1. Decommission a leader SCM node.
> 2. Recommission the same SCM back. (It won't be the leader at this point.)
> 3. Transfer the leadership to this recommissioned SCM.
> *Observed behavior:*
> Current SCM roles(after recommissioning)
> {code:java}
> 2023-10-26 17:14:01,191|INFO|MainThread|machine.py:190 -
> run()||GUID=c4d24ae9-4381-4032-bebe-2ada9d440a42|RUNNING: ozone admin scm
> roles
> 2023-10-26 17:14:04,616|INFO|MainThread|machine.py:205 -
> run()||GUID=c4d24ae9-4381-4032-bebe-2ada9d440a42|ozn-decom77-1.ozn-decom77.xyz:1111:LEADER:13c4a2c0-0e9e-4a18-ba22-a14b35e05a94:172.27.27.19
> 2023-10-26 17:14:04,617|INFO|MainThread|machine.py:205 -
> run()||GUID=c4d24ae9-4381-4032-bebe-2ada9d440a42|ozn-decom77-4.ozn-decom77.xyz:1111:FOLLOWER:e1d197a7-402e-4d5e-a5e8-06be360fafe3:172.27.187.133
> 2023-10-26 17:14:04,617|INFO|MainThread|machine.py:205 -
> run()||GUID=c4d24ae9-4381-4032-bebe-2ada9d440a42|ozn-decom77-5.ozn-decom77.xyz:1111:FOLLOWER:8f80727a-2fc4-401c-8b5a-bb9f110a511e:172.27.40.70
> {code}
> Running the leader-transfer command fails with
> org.apache.ratis.protocol.exceptions.TransferLeadershipException:
> {code:java}
> 2023-10-26 17:14:04,662|INFO|MainThread|machine.py:190 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|RUNNING: ozone admin scm
> transfer -n 8f80727a-2fc4-401c-8b5a-bb9f110a511e
> 2023-10-26 17:16:21,031|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|com.google.protobuf.ServiceException:
>
> org.apache.hadoop.ipc.RemoteException(org.apache.ratis.protocol.exceptions.TransferLeadershipException):
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB: Failed to transfer
> leadership to 8f80727a-2fc4-401c-8b5a-bb9f110a511e (the current leader is
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94): TIMED_OUT(60s)
> 2023-10-26 17:16:21,032|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.server.impl.TransferLeadership$PendingRequest.complete(TransferLeadership.java:152)
> 2023-10-26 17:16:21,032|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.server.impl.TransferLeadership.lambda$complete$11(TransferLeadership.java:336)
> 2023-10-26 17:16:21,032|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> java.util.Optional.ifPresent(Optional.java:159)
> 2023-10-26 17:16:21,032|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.server.impl.TransferLeadership.complete(TransferLeadership.java:336)
> 2023-10-26 17:16:21,032|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.server.impl.TransferLeadership.lambda$start$8(TransferLeadership.java:307)
> 2023-10-26 17:16:21,032|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.util.TimeoutTimer.lambda$onTimeout$2(TimeoutTimer.java:101)
> 2023-10-26 17:16:21,033|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:38)
> 2023-10-26 17:16:21,033|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.util.LogUtils$1.run(LogUtils.java:79)
> 2023-10-26 17:16:21,033|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> org.apache.ratis.util.TimeoutTimer$Task.run(TimeoutTimer.java:55)
> 2023-10-26 17:16:21,033|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> java.util.TimerThread.mainLoop(Timer.java:555)
> 2023-10-26 17:16:21,033|INFO|MainThread|machine.py:205 -
> run()||GUID=2a68b512-985f-467b-919f-1194d6b8f7a9|at
> java.util.TimerThread.run(Timer.java:505) {code}
> I checked the scm logs of ozn-decom77-5.ozn-decom77.xyz at around that point:
> 1. Bootstrap of this SCM was successful as per the logs:
> {code:java}
> 2023-10-26 17:13:49,535 INFO
> [main]-org.apache.hadoop.hdds.scm.ha.HASecurityUtils: Successfully stored SCM
> signed certificate.
> 2023-10-26 17:13:49,544 INFO
> [main]-org.apache.hadoop.hdds.scm.server.StorageContainerManager: SCM
> BootStrap is successful for ClusterID
> CID-0e38bb44-5930-4665-aa8e-f683ba8433bb, SCMID
> 8f80727a-2fc4-401c-8b5a-bb9f110a511e
> 2023-10-26 17:13:49,545 INFO
> [main]-org.apache.hadoop.hdds.scm.server.StorageContainerManager: Primary SCM
> Node ID 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94 {code}
> 2. Warnings "Failed to allocate a batch for containerId" appear after the
> bootstrap:
> {code:java}
> 2023-10-26 17:14:01,495 WARN
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator:
> Failed to allocate a batch for containerId, expected lastId is 0, actual
> lastId is 16000.
> 2023-10-26 17:14:01,495 WARN
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator:
> Failed to allocate a batch for localId, expected lastId is 0, actual lastId
> is 111677748019597000.
> 2023-10-26 17:14:01,499 WARN
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator:
> Failed to allocate a batch for localId, expected lastId is 0, actual lastId
> is 111677748019599000.
> 2023-10-26 17:14:01,500 INFO
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.security.symmetric.SecretKeyStateImpl:
> Updating keys with [SecretKey(id = bddc1c74-1915-4c1f-ab13-d366372ce818,
> creation at: 2023-10-24T08:36:13.085Z, expire at: 2023-10-31T08:36:13.085Z)]
> 2023-10-26 17:14:01,500 INFO
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.security.symmetric.SecretKeyStateImpl:
> Current key updated SecretKey(id = bddc1c74-1915-4c1f-ab13-d366372ce818,
> creation at: 2023-10-24T08:36:13.085Z, expire at: 2023-10-31T08:36:13.085Z)
> 2023-10-26 17:14:01,501 INFO
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.security.symmetric.LocalSecretKeyStore:
> Saved [SecretKey(id = bddc1c74-1915-4c1f-ab13-d366372ce818, creation at:
> 2023-10-24T08:36:13.085Z, expire at: 2023-10-31T08:36:13.085Z)] to file
> /var/lib/hadoop-ozone/scm/ozone-metadata/scm/keys/secret_keys.json
> 2023-10-26 17:14:01,503 INFO
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.server.SCMCertStore:
> Scm certificate 74992207739889802 for
> [email protected],OU=ff681d8f-295f-4bb1-ae9b-3b9c3d3ef62f,O=CID-0e38bb44-5930-4665-aa8e-f683ba8433bb
> is stored
> 2023-10-26 17:14:01,503 WARN
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator:
> Failed to allocate a batch for localId, expected lastId is 0, actual lastId
> is 111677748019600000.
> 2023-10-26 17:14:01,507 WARN
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator:
> Failed to allocate a batch for localId, expected lastId is 0, actual lastId
> is 111677748019601000.
> 2023-10-26 17:14:01,525 WARN
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator:
> Failed to allocate a batch for containerId, expected lastId is 0, actual
> lastId is 17000.
> 2023-10-26 17:14:01,531 WARN
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator:
> Failed to allocate a batch for delTxnId, expected lastId is 0, actual lastId
> is 11000. {code}
> 3. Eventually the SCM goes down with
> org.apache.hadoop.ozone.common.statemachine.InvalidStateTransitionException:
> {code:java}
> 2023-10-26 17:14:01,860 ERROR
> [8f80727a-2fc4-401c-8b5a-bb9f110a511e@group-F683BA8433BB-StateMachineUpdater]-org.apache.ratis.statemachine.StateMachine:
> Terminating with exit status 1: Invalid event: CLOSE at OPEN state.
> org.apache.hadoop.ozone.common.statemachine.InvalidStateTransitionException:
> Invalid event: CLOSE at OPEN state.
> at
> org.apache.hadoop.ozone.common.statemachine.StateMachine.getNextState(StateMachine.java:60)
> at
> org.apache.hadoop.hdds.scm.container.ContainerStateManagerImpl.updateContainerState(ContainerStateManagerImpl.java:356)
> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.apache.hadoop.hdds.scm.ha.SCMStateMachine.process(SCMStateMachine.java:188)
> at
> org.apache.hadoop.hdds.scm.ha.SCMStateMachine.applyTransaction(SCMStateMachine.java:148)
> at
> org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1777)
> at
> org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:242)
> at
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:184)
> at java.lang.Thread.run(Thread.java:748)
> 2023-10-26 17:14:01,864 INFO
> [shutdown-hook-0]-org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter:
> SHUTDOWN_MSG:
> /************************************************************ {code}
> 4. The current leader's (ozn-decom77-1.ozn-decom77.xyz) logs at that point
> showed this for the newly recommissioned SCM node:
> {code:java}
> 2023-10-26 17:14:07,145 WARN
> [grpc-default-executor-3]-org.apache.ratis.grpc.server.GrpcLogAppender:
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB->8f80727a-2fc4-401c-8b5a-bb9f110a511e-AppendLogResponseHandler:
> Failed appendEntries:
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io
> exception
> 2023-10-26 17:14:07,145 WARN
> [grpc-default-executor-10]-org.apache.ratis.grpc.server.GrpcLogAppender:
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB->8f80727a-2fc4-401c-8b5a-bb9f110a511e-AppendLogResponseHandler:
> Failed appendEntries:
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io
> exception
> 2023-10-26 17:14:07,148 INFO
> [grpc-default-executor-3]-org.apache.ratis.server.leader.FollowerInfo:
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB->8f80727a-2fc4-401c-8b5a-bb9f110a511e:
> decreaseNextIndex nextIndex: updateUnconditionally 8821 -> 8820
> 2023-10-26 17:14:07,148 INFO
> [grpc-default-executor-10]-org.apache.ratis.server.leader.FollowerInfo:
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB->8f80727a-2fc4-401c-8b5a-bb9f110a511e:
> decreaseNextIndex nextIndex: updateUnconditionally 8820 -> 8819
> 2023-10-26 17:14:07,990 WARN
> [grpc-default-executor-10]-org.apache.ratis.grpc.server.GrpcLogAppender:
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB->8f80727a-2fc4-401c-8b5a-bb9f110a511e-AppendLogResponseHandler:
> Failed appendEntries:
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io
> exception
> 2023-10-26 17:14:07,990 WARN
> [grpc-default-executor-3]-org.apache.ratis.grpc.server.GrpcLogAppender:
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB->8f80727a-2fc4-401c-8b5a-bb9f110a511e-AppendLogResponseHandler:
> Failed appendEntries:
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io
> exception {code}
> {code:java}
> 2023-10-26 17:14:08,286 INFO [IPC Server handler 64 on
> 9860]-org.apache.hadoop.hdds.ratis.RatisHelper: Chosen the targetLeaderId
> 8f80727a-2fc4-401c-8b5a-bb9f110a511e to transfer leadership
> {code}
> {code:java}
> org.apache.ratis.protocol.exceptions.TransferLeadershipException:
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94@group-F683BA8433BB: Failed to transfer
> leadership to 8f80727a-2fc4-401c-8b5a-bb9f110a511e (the current leader is
> 13c4a2c0-0e9e-4a18-ba22-a14b35e05a94): TIMED_OUT(60s)
> at
> org.apache.ratis.server.impl.TransferLeadership$PendingRequest.complete(TransferLeadership.java:152)
> at
> org.apache.ratis.server.impl.TransferLeadership.lambda$complete$11(TransferLeadership.java:336)
> at java.util.Optional.ifPresent(Optional.java:159)
> at
> org.apache.ratis.server.impl.TransferLeadership.complete(TransferLeadership.java:336)
> at
> org.apache.ratis.server.impl.TransferLeadership.lambda$start$8(TransferLeadership.java:307)
> at
> org.apache.ratis.util.TimeoutTimer.lambda$onTimeout$2(TimeoutTimer.java:101)
> at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:38)
> at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:79)
> at org.apache.ratis.util.TimeoutTimer$Task.run(TimeoutTimer.java:55)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505) {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)