[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow
[ https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426631#comment-15426631 ] tangshangwen commented on YARN-5535: I'm sorry, it is after recovery , and i found even queue size very large {noformat} [2016-08-12T19:43:25.986+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher event handler] : Size of event-queue is 643000 [2016-08-12T19:43:25.986+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher event handler] : Size of event-queue is 644000 [2016-08-12T19:43:25.986+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher event handler] : Size of event-queue is 645000 [2016-08-12T19:43:25.986+08:00] [INFO] yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher event handler] : Size of event-queue is 646000 {noformat} > Remove RMDelegationToken make resourcemanager recovery very slow > > > Key: YARN-5535 > URL: https://issues.apache.org/jira/browse/YARN-5535 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > > In our cluster, I found that when restart RM, RM recovery is very slow, this > is my log > {noformat} > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737879 > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737878 > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737877 > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737876 > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737875 > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.533+08:00] [INFO] > security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148) > [Socket Reader #1 for port 8031] : Authorization successful for yarn > (auth:SIMPLE) for protocol=interface > org.apache.hadoop.yarn.server.api.ResourceTrackerPB > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737874 > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737873 > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.568+08:00] [INFO] > yarn.util.RackRes
[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow
[ https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426330#comment-15426330 ] Naganarasimha G R commented on YARN-5535: - Hi [~tangshangwen], Is this removal of RM delegation token happening synchronously during recovery ? IIUC, Recovery of application doesnt involve removal of delegation token (atleast synchronously), or correct me if i am wrong or missing something ! > Remove RMDelegationToken make resourcemanager recovery very slow > > > Key: YARN-5535 > URL: https://issues.apache.org/jira/browse/YARN-5535 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > > In our cluster, I found that when restart RM, RM recovery is very slow, this > is my log > {noformat} > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737879 > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737878 > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737877 > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737876 > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737875 > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.533+08:00] [INFO] > security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148) > [Socket Reader #1 for port 8031] : Authorization successful for yarn > (auth:SIMPLE) for protocol=interface > org.apache.hadoop.yarn.server.api.ResourceTrackerPB > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737874 > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737873 > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.568+08:00] [INFO] > yarn.util.RackResolver.coreResolve(RackResolver.java:109) [IPC Server handler > 0 on 8031] : Resolved -7056.hadoop.xxx.local to /rack/rack5118 > [2016-08-12T19:43:21.569+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737872 > [2016-08-12T19:43:21.569+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:3
[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow
[ https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426107#comment-15426107 ] tangshangwen commented on YARN-5535: Thanks [~sunilg] for the comments. I think Removing RMDelegationToken and SequenceNumber may take a long time,lead to can't handle other events {code:title=ZKRMStateStore.java|borderStyle=solid} @Override protected synchronized void removeRMDelegationTokenState( RMDelegationTokenIdentifier rmDTIdentifier) throws Exception { String nodeRemovePath = getNodePath(delegationTokensRootPath, DELEGATION_TOKEN_PREFIX + rmDTIdentifier.getSequenceNumber()); if (LOG.isDebugEnabled()) { LOG.debug("Removing RMDelegationToken_" + rmDTIdentifier.getSequenceNumber()); } if (existsWithRetries(nodeRemovePath, false) != null) { ArrayList opList = new ArrayList(); opList.add(Op.delete(nodeRemovePath, -1)); doDeleteMultiWithRetries(opList); } else { LOG.debug("Attempted to delete a non-existing znode " + nodeRemovePath); } } {code} > Remove RMDelegationToken make resourcemanager recovery very slow > > > Key: YARN-5535 > URL: https://issues.apache.org/jira/browse/YARN-5535 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > > In our cluster, I found that when restart RM, RM recovery is very slow, this > is my log > {noformat} > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737879 > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737878 > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737877 > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737876 > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737875 > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.533+08:00] [INFO] > security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148) > [Socket Reader #1 for port 8031] : Authorization successful for yarn > (auth:SIMPLE) for protocol=interface > org.apache.hadoop.yarn.server.api.ResourceTrackerPB > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737874 > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737873 > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition
[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow
[ https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426084#comment-15426084 ] Sunil G commented on YARN-5535: --- YARN-4041 has made token recovery asynchronous during recovery. Its been merged into 2.7 branch, but i think not present in 2.7.1 With this optimization, recovery seems more faster when token renewer was slow. > Remove RMDelegationToken make resourcemanager recovery very slow > > > Key: YARN-5535 > URL: https://issues.apache.org/jira/browse/YARN-5535 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > > In our cluster, I found that when restart RM, RM recovery is very slow, this > is my log > {noformat} > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737879 > [2016-08-12T19:43:21.478+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737878 > [2016-08-12T19:43:21.486+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737877 > [2016-08-12T19:43:21.494+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737876 > [2016-08-12T19:43:21.503+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737875 > [2016-08-12T19:43:21.519+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.533+08:00] [INFO] > security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148) > [Socket Reader #1 for port 8031] : Authorization successful for yarn > (auth:SIMPLE) for protocol=interface > org.apache.hadoop.yarn.server.api.ResourceTrackerPB > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737874 > [2016-08-12T19:43:21.536+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737873 > [2016-08-12T19:43:21.553+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber > [2016-08-12T19:43:21.568+08:00] [INFO] > yarn.util.RackResolver.coreResolve(RackResolver.java:109) [IPC Server handler > 0 on 8031] : Resolved -7056.hadoop.xxx.local to /rack/rack5118 > [2016-08-12T19:43:21.569+08:00] [INFO] > resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136) > [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence > number: 737872 > [2016-08-12T19:43:21.569+08:00] [INFO] > resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) > [Thread[Thread-26,5,main]] : Removing RMDelegationToke