zhengchenyu created YARN-11148:
----------------------------------

             Summary: In federation and security mode, nm recover may fail.
                 Key: YARN-11148
                 URL: https://issues.apache.org/jira/browse/YARN-11148
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager
    Affects Versions: 3.2.1
            Reporter: zhengchenyu
            Assignee: zhengchenyu

Exception stack

{code:java}
2022-05-08 00:44:11,536 WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2022-05-08 00:44:11,540 ERROR org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService: Exception when recovering appattempt_1650635484875_0036_000002, removing it from NMStateStore and move on
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: DestHost:destPort host:8032 , LocalHost:localPort node/10.x.x.x:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
	at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.recover(FederationInterceptor.java:441)
	at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.initializePipeline(AMRMProxyService.java:466)
	at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.recover(AMRMProxyService.java:270)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:389)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:324)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:516)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:974)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1054)
Caused by: java.io.IOException: DestHost:destPort host:8032 , LocalHost:localPort host/10.x.x.x:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
	at sun.reflect.GeneratedConstructorAccessor30.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:808)
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
	at org.apache.hadoop.ipc.Client.call(Client.java:1492)
	at org.apache.hadoop.ipc.Client.call(Client.java:1389)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
	at com.sun.proxy.$Proxy30.getContainers(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getContainers(ApplicationClientProtocolPBClientImpl.java:479)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy31.getContainers(Unknown Source)
	at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.recover(FederationInterceptor.java:418)
	... 10 more
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:771)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:734)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:828)
	at org.apache.hadoop.ipc.Client$Connection.access$3900(Client.java:422)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1615)
	at org.apache.hadoop.ipc.Client.call(Client.java:1436)
	... 25 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
	at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173)
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:628)
	at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:422)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:815)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:811)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:811)
	... 28 more
{code}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org