[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow

2016-08-18 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426084#comment-15426084
 ] 

Sunil G commented on YARN-5535:
---

YARN-4041 has made token recovery asynchronous during recovery. 
Its been merged into 2.7 branch, but i think not present in 2.7.1

With this optimization, recovery seems more faster when token renewer was slow.

> Remove RMDelegationToken make resourcemanager recovery very slow
> 
>
> Key: YARN-5535
> URL: https://issues.apache.org/jira/browse/YARN-5535
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: tangshangwen
>Assignee: tangshangwen
>
> In our cluster, I found that when restart RM, RM recovery is very slow, this 
> is my log
> {noformat}
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737879
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737878
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737877
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737876
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737875
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.533+08:00] [INFO] 
> security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148)
>  [Socket Reader #1 for port 8031] : Authorization successful for yarn 
> (auth:SIMPLE) for protocol=interface 
> org.apache.hadoop.yarn.server.api.ResourceTrackerPB
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737874
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737873
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.568+08:00] [INFO] 
> yarn.util.RackResolver.coreResolve(RackResolver.java:109) [IPC Server handler 
> 0 on 8031] : Resolved -7056.hadoop.xxx.local to /rack/rack5118
> [2016-08-12T19:43:21.569+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737872
> [2016-08-12T19:43:21.569+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToke

[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow

2016-08-18 Thread tangshangwen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426107#comment-15426107
 ] 

tangshangwen commented on YARN-5535:


Thanks [~sunilg] for the comments. 
I think Removing RMDelegationToken and SequenceNumber may take a long time,lead 
to can't handle other events
{code:title=ZKRMStateStore.java|borderStyle=solid}
  @Override
  protected synchronized void removeRMDelegationTokenState(
  RMDelegationTokenIdentifier rmDTIdentifier) throws Exception {
String nodeRemovePath =
getNodePath(delegationTokensRootPath, DELEGATION_TOKEN_PREFIX
+ rmDTIdentifier.getSequenceNumber());
if (LOG.isDebugEnabled()) {
  LOG.debug("Removing RMDelegationToken_"
  + rmDTIdentifier.getSequenceNumber());
}
if (existsWithRetries(nodeRemovePath, false) != null) {
  ArrayList opList = new ArrayList();
  opList.add(Op.delete(nodeRemovePath, -1));
  doDeleteMultiWithRetries(opList);
} else {
  LOG.debug("Attempted to delete a non-existing znode " + nodeRemovePath);
}
  }
{code}

> Remove RMDelegationToken make resourcemanager recovery very slow
> 
>
> Key: YARN-5535
> URL: https://issues.apache.org/jira/browse/YARN-5535
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: tangshangwen
>Assignee: tangshangwen
>
> In our cluster, I found that when restart RM, RM recovery is very slow, this 
> is my log
> {noformat}
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737879
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737878
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737877
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737876
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737875
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.533+08:00] [INFO] 
> security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148)
>  [Socket Reader #1 for port 8031] : Authorization successful for yarn 
> (auth:SIMPLE) for protocol=interface 
> org.apache.hadoop.yarn.server.api.ResourceTrackerPB
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737874
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737873
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition

[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow

2016-08-18 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426330#comment-15426330
 ] 

Naganarasimha G R commented on YARN-5535:
-

Hi [~tangshangwen], Is this removal of RM delegation token happening 
synchronously during recovery ?
IIUC, Recovery of application doesnt involve removal of delegation token 
(atleast synchronously), or correct me if i am wrong or missing something !


> Remove RMDelegationToken make resourcemanager recovery very slow
> 
>
> Key: YARN-5535
> URL: https://issues.apache.org/jira/browse/YARN-5535
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: tangshangwen
>Assignee: tangshangwen
>
> In our cluster, I found that when restart RM, RM recovery is very slow, this 
> is my log
> {noformat}
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737879
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737878
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737877
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737876
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737875
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.533+08:00] [INFO] 
> security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148)
>  [Socket Reader #1 for port 8031] : Authorization successful for yarn 
> (auth:SIMPLE) for protocol=interface 
> org.apache.hadoop.yarn.server.api.ResourceTrackerPB
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737874
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737873
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.568+08:00] [INFO] 
> yarn.util.RackResolver.coreResolve(RackResolver.java:109) [IPC Server handler 
> 0 on 8031] : Resolved -7056.hadoop.xxx.local to /rack/rack5118
> [2016-08-12T19:43:21.569+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737872
> [2016-08-12T19:43:21.569+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:3

[jira] [Commented] (YARN-5535) Remove RMDelegationToken make resourcemanager recovery very slow

2016-08-18 Thread tangshangwen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426631#comment-15426631
 ] 

tangshangwen commented on YARN-5535:


I'm sorry, it is after recovery ,  and i found even queue size very large
{noformat}
[2016-08-12T19:43:25.986+08:00] [INFO] 
yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher 
event handler] : Size of event-queue is 643000
[2016-08-12T19:43:25.986+08:00] [INFO] 
yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher 
event handler] : Size of event-queue is 644000
[2016-08-12T19:43:25.986+08:00] [INFO] 
yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher 
event handler] : Size of event-queue is 645000
[2016-08-12T19:43:25.986+08:00] [INFO] 
yarn.event.AsyncDispatcher.handle(AsyncDispatcher.java:235) [AsyncDispatcher 
event handler] : Size of event-queue is 646000
{noformat} 

> Remove RMDelegationToken make resourcemanager recovery very slow
> 
>
> Key: YARN-5535
> URL: https://issues.apache.org/jira/browse/YARN-5535
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: tangshangwen
>Assignee: tangshangwen
>
> In our cluster, I found that when restart RM, RM recovery is very slow, this 
> is my log
> {noformat}
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737879
> [2016-08-12T19:43:21.478+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737878
> [2016-08-12T19:43:21.486+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737877
> [2016-08-12T19:43:21.494+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737876
> [2016-08-12T19:43:21.503+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737875
> [2016-08-12T19:43:21.519+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.533+08:00] [INFO] 
> security.authorize.ServiceAuthorizationManager.authorize(ServiceAuthorizationManager.java:148)
>  [Socket Reader #1 for port 8031] : Authorization successful for yarn 
> (auth:SIMPLE) for protocol=interface 
> org.apache.hadoop.yarn.server.api.ResourceTrackerPB
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737874
> [2016-08-12T19:43:21.536+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.security.RMDelegationTokenSecretManager.removeStoredToken(RMDelegationTokenSecretManager.java:136)
>  [Thread[Thread-26,5,main]] : removing RMDelegation token with sequence 
> number: 737873
> [2016-08-12T19:43:21.553+08:00] [INFO] 
> resourcemanager.recovery.RMStateStore.transition(RMStateStore.java:320) 
> [Thread[Thread-26,5,main]] : Removing RMDelegationToken and SequenceNumber
> [2016-08-12T19:43:21.568+08:00] [INFO] 
> yarn.util.RackRes