[jira] [Commented] (YARN-7701) Both RM are in standby in secure cluster

2018-01-05 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313523#comment-16313523
 ] 

genericqa commented on YARN-7701:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
58s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
8s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m  
2s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.rmLoginUGI; 
locked 50% of time. Unsynchronized access at ResourceManager.java:[line 1243] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7701 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12904812/YARN-7701.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e8dccb092a6a 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0c75d06 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 

[jira] [Commented] (YARN-7701) Both RM are in standby in secure cluster

2018-01-05 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313300#comment-16313300
 ] 

Rohith Sharma K S commented on YARN-7701:
-

Got the complete RM logs. The cluster's code base roughly matches 2.8. 
# My suspicion is that *ClientRMService#getDelegationToken* makes a synchronous 
call to RMStateStore to store passwords. If the RMStateStore is fenced, the RM 
is moved to standby within this synchronous call. In a secure cluster, that 
transition to standby therefore runs in the context of callerUgi. When the RM 
transitions to standby, service initialization and the elector reset also run 
in the context of the callerUgi that invoked _getDelegationToken_. As a result, 
any subsequent call from the elector to become active or standby carries the 
callerUgi context and fails the ACL check. 
# The log trace below hints that the transition to standby happens inside the 
ClientRMService#getDelegationToken call, which runs in the context of 
callerUgi.
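
Before the trace, here is a minimal sketch of the suspected mechanism, with no Hadoop dependency. Everything in it is a hypothetical stand-in: {{CURRENT_USER}} plays the role of the per-thread UGI, {{doAs}} mimics the RPC handler running a request as the caller, and {{Elector}} records whichever identity was current when it was re-created. The {{transitionToStandbyAsLoginUser}} variant is only one possible mitigation, assumed for illustration; it is not taken from the attached patch.

```java
import java.util.function.Supplier;

public class CallerContextSketch {
    // Hypothetical stand-in for the per-thread security context (the UGI in Hadoop).
    private static final ThreadLocal<String> CURRENT_USER =
        ThreadLocal.withInitial(() -> "rm-login");

    // Runs the action as the given user, then restores the previous identity.
    static <T> T doAs(String user, Supplier<T> action) {
        String previous = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.get();
        } finally {
            CURRENT_USER.set(previous);
        }
    }

    // Hypothetical elector: it captures the identity that was current at creation time.
    static final class Elector {
        final String owner = CURRENT_USER.get();

        // Stand-in for the ACL check: only the RM login identity may transition.
        boolean mayTransition() {
            return owner.equals("rm-login");
        }
    }

    // Re-creates the elector in whatever context the calling thread has.
    static Elector transitionToStandby() {
        return new Elector();
    }

    // Assumed mitigation: switch back to the RM login identity before the transition.
    static Elector transitionToStandbyAsLoginUser() {
        return doAs("rm-login", CallerContextSketch::transitionToStandby);
    }

    public static void main(String[] args) {
        // A client RPC ("alice") triggers the transition from inside its own doAs.
        Elector leaked = doAs("alice", CallerContextSketch::transitionToStandby);
        Elector safe = doAs("alice", CallerContextSketch::transitionToStandbyAsLoginUser);
        System.out.println("leaked owner=" + leaked.owner
            + " mayTransition=" + leaked.mayTransition());
        System.out.println("safe owner=" + safe.owner
            + " mayTransition=" + safe.mayTransition());
    }
}
```

In this sketch the elector created inside the caller's {{doAs}} keeps the caller identity and fails the stand-in ACL check, which is the same shape as both RMs remaining in standby.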
{noformat}
2017-12-20 11:55:01,302 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: State 
store operation failed 
org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFencedException: 
RMStateStore has been fenced
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1213)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:995)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeRMDelegationTokenState(ZKRMStateStore.java:752)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTTransition.transition(RMStateStore.java:345)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTTransition.transition(RMStateStore.java:330)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:960)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeRMDelegationToken(RMStateStore.java:775)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewToken(RMDelegationTokenSecretManager.java:110)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewToken(RMDelegationTokenSecretManager.java:47)
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.storeToken(AbstractDelegationTokenSecretManager.java:272)
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:391)
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:47)
at org.apache.hadoop.security.token.Token.<init>(Token.java:62)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:968)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:296)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:433)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
2017-12-20 11:55:01,303 WARN 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: 
State-store fenced ! Transitioning RM to standby
2017-12-20 11:55:01,398 INFO 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: 
RMStateStore state change from ACTIVE to FENCED
2017-12-20 11:55:01,398 INFO 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: 
RMStateStore has been fenced
2017-12-20 11:55:01,404 INFO