[ https://issues.apache.org/jira/browse/YARN-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363730#comment-14363730 ]
zhihai xu commented on YARN-3190:
---------------------------------

Sorry, there was a mistake in the changelog for the CDH 5.3.1 release: YARN-2964 is not actually in CDH 5.3.1, but it is in CDH 5.3.2.

> NM can't aggregate logs: token can't be found in cache
> -------------------------------------------------------
>
> Key: YARN-3190
> URL: https://issues.apache.org/jira/browse/YARN-3190
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.5.0
> Environment: CDH 5.3.1
> HA HDFS
> Kerberos
> Reporter: Andrejs Dubovskis
> Priority: Minor
>
> In rare cases the node manager cannot aggregate logs and logs the following exception:
> {code}
> 2015-02-12 13:04:03,703 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Starting aggregate log-file for app application_1423661043235_2150 at /tmp/logs/catalyst/logs/application_1423661043235_2150/catdn001.intrum.net_8041.tmp
> 2015-02-12 13:04:03,707 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /data5/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150/container_1423661043235_2150_01_000442
> 2015-02-12 13:04:03,707 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /data6/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150/container_1423661043235_2150_01_000442
> 2015-02-12 13:04:03,707 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /data7/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150/container_1423661043235_2150_01_000442
> 2015-02-12 13:04:03,709 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /data1/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150
> 2015-02-12 13:04:03,709 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:catalyst (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in cache
> 2015-02-12 13:04:03,709 WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in cache
> 2015-02-12 13:04:03,709 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:catalyst (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in cache
> 2015-02-12 13:04:03,712 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:catalyst (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in cache
> 2015-02-12 13:04:03,712 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Cannot create writer for app application_1423661043235_2150. Disabling log-aggregation for this app.
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in cache
>     at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>     at com.sun.proxy.$Proxy19.getServerDefaults(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:259)
>     at sun.reflect.GeneratedMethodAccessor114.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy20.getServerDefaults(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:966)
>     at org.apache.hadoop.fs.Hdfs.getServerDefaults(Hdfs.java:159)
>     at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:543)
>     at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:680)
>     at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:676)
>     at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>     at org.apache.hadoop.fs.FileContext.create(FileContext.java:676)
>     at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter$1.run(AggregatedLogFormat.java:272)
>     at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter$1.run(AggregatedLogFormat.java:267)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>     at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.<init>(AggregatedLogFormat.java:266)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:134)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:196)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:166)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:372)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)