[ https://issues.apache.org/jira/browse/SLIDER-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manoj Samel updated SLIDER-1158: -------------------------------- Description: In certain cases, when a RM fails over from RM1 to RM2, the Slider AM starts getting error "Client cannot authenticate via:[TOKEN]" when trying to talk to RM2. Then slider AM goes from (initial) RUNNING State to ACCEPTED state. Attached are RM log, slider AM log and detailed info and analysis in file README_INFO_ANALYSIS. Please make sure to read README_INFO_ANALYSIS for further information on the issue and log info {CODE} 2016-07-26 01:04:28,337 [AMRM Heartbeater thread] INFO client.ConfiguredRMFailoverProxyProvider - Failing over to rm2 2016-07-26 01:04:28,339 [AMRM Heartbeater thread] WARN security.UserGroupInformation - PriviledgedActionException as:abc@xyz (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] 2016-07-26 01:04:28,340 [AMRM Heartbeater thread] WARN ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] 2016-07-26 01:04:28,341 [AMRM Heartbeater thread] WARN security.UserGroupInformation - PriviledgedActionException as:abc@xyz (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] 2016-07-26 01:04:28,341 [AMRM Heartbeater thread] INFO retry.RetryInvocationHandler - Exception while invoking allocate of class ApplicationMasterProtocolPBClientImpl over rm2 after 3 fail over attempts. Trying to fail over immediately. java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]; Host Details : local host is: "foo/xx.xx.xx.xx"; destination host is: "bar":23130; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) at org.apache.hadoop.ipc.Client.call(Client.java:1476) at org.apache.hadoop.ipc.Client.call(Client.java:1403) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy23.allocate(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77) at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy24.allocate(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224) Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525) at org.apache.hadoop.ipc.Client.call(Client.java:1442) ... 12 more Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172) at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396) at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555) at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720) ... 15 more {CODE} was: In certain cases, when a RM fails over from RM1 RM2, the Slider AM starts getting error "Client cannot authenticate via:[TOKEN]" when trying to talk to RM2. Then slider AM goes from (initial) RUNNING State to ACCEPTED state. Attached are RM log, slider AM log and detailed info and analysis in file README_INFO_ANALYSIS. Please make sure to read README_INFO_ANALYSIS for further information on the issue and log info {CODE} 2016-07-26 01:04:28,337 [AMRM Heartbeater thread] INFO client.ConfiguredRMFailoverProxyProvider - Failing over to rm2 2016-07-26 01:04:28,339 [AMRM Heartbeater thread] WARN security.UserGroupInformation - PriviledgedActionException as:abc@xyz (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] 2016-07-26 01:04:28,340 [AMRM Heartbeater thread] WARN ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] 2016-07-26 01:04:28,341 [AMRM Heartbeater thread] WARN security.UserGroupInformation - PriviledgedActionException as:abc@xyz (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] 2016-07-26 01:04:28,341 [AMRM Heartbeater thread] INFO retry.RetryInvocationHandler - Exception while invoking allocate of class ApplicationMasterProtocolPBClientImpl over rm2 after 3 fail over attempts. Trying to fail over immediately. java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]; Host Details : local host is: "foo/xx.xx.xx.xx"; destination host is: "bar":23130; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) at org.apache.hadoop.ipc.Client.call(Client.java:1476) at org.apache.hadoop.ipc.Client.call(Client.java:1403) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy23.allocate(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77) at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy24.allocate(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224) Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525) at org.apache.hadoop.ipc.Client.call(Client.java:1442) ... 12 more Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN] at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172) at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396) at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555) at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720) ... 15 more {CODE} > Slider AM hits error org.apache.hadoop.security.AccessControlException: > Client cannot authenticate via:[TOKEN] when RM failover happens in RM HA > ------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: SLIDER-1158 > URL: https://issues.apache.org/jira/browse/SLIDER-1158 > Project: Slider > Issue Type: Bug > Components: appmaster > Affects Versions: Slider 0.80 > Environment: centos, Hadoop 2.6 (CDH 5.5.1), Kerberos > Reporter: Manoj Samel > Attachments: README_INFO_ANALYSIS, rm2.log, slider.log > > > In certain cases, when a RM fails over from RM1 to RM2, the Slider AM starts > getting error "Client cannot authenticate via:[TOKEN]" when trying to talk to > RM2. Then slider AM goes from (initial) RUNNING State to ACCEPTED state. > Attached are RM log, slider AM log and detailed info and analysis in file > README_INFO_ANALYSIS. Please make sure to read README_INFO_ANALYSIS for > further information on the issue and log info > {CODE} > 2016-07-26 01:04:28,337 [AMRM Heartbeater thread] INFO > client.ConfiguredRMFailoverProxyProvider - Failing over to rm2 > 2016-07-26 01:04:28,339 [AMRM Heartbeater thread] WARN > security.UserGroupInformation - PriviledgedActionException as:abc@xyz > (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: > Client cannot authenticate via:[TOKEN] > 2016-07-26 01:04:28,340 [AMRM Heartbeater thread] WARN ipc.Client - > Exception encountered while connecting to the server : > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN] > 2016-07-26 01:04:28,341 [AMRM Heartbeater thread] WARN > security.UserGroupInformation - PriviledgedActionException as:abc@xyz > (auth:KERBEROS) cause:java.io.IOException: > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN] > 2016-07-26 01:04:28,341 [AMRM Heartbeater thread] INFO > retry.RetryInvocationHandler - Exception while invoking allocate of class > ApplicationMasterProtocolPBClientImpl over rm2 after 3 fail over attempts. > Trying to fail over immediately. > java.io.IOException: Failed on local exception: java.io.IOException: > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN]; Host Details : local host is: "foo/xx.xx.xx.xx"; destination > host is: "bar":23130; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > at org.apache.hadoop.ipc.Client.call(Client.java:1476) > at org.apache.hadoop.ipc.Client.call(Client.java:1403) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy23.allocate(Unknown Source) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77) > at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy24.allocate(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278) > at > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224) > Caused by: java.io.IOException: > org.apache.hadoop.security.AccessControlException: Client cannot authenticate > via:[TOKEN] > at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at > org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733) > at > org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) > at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525) > at org.apache.hadoop.ipc.Client.call(Client.java:1442) > ... 12 more > Caused by: org.apache.hadoop.security.AccessControlException: Client cannot > authenticate via:[TOKEN] > at > org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172) > at > org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396) > at > org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555) > at > org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370) > at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725) > at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720) > ... 15 more > {CODE} -- This message was sent by Atlassian JIRA (v6.3.4#6332)