[ https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362137#comment-15362137 ]
Xianyin Xin commented on YARN-5302:
-----------------------------------

Thanks [~Naganarasimha]. *we simply always manage our own tokens for localization and log-aggregation for long-running applications / services* is OK, but it is not enough if we want to support recovery of long-running jobs. Of course, we could limit our token management to those two scopes because of security concerns, but the *cost* is that we lose the ability to recover long-running jobs (YARN-5302, YARN-5310) and may also see log aggregation failures for such jobs (YARN-5305). However, all of this depends on how we *define* what YARN does on behalf of users: what YARN can and cannot do when a token expires, and whether we can accept failures caused by token expiration.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS
> delegation token II
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-5302
>                 URL: https://issues.apache.org/jira/browse/YARN-5302
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Xianyin Xin
>            Assignee: Xianyin Xin
>         Attachments: YARN-5032.001.patch, YARN-5032.002.patch,
> YARN-5302.003.patch, YARN-5302.004.patch
>
>
> Unlike YARN-5098, this happens on the NM side. When the NM recovers,
> credentials are read from NMStateStore. When the app aggregators are
> initialized, an exception is thrown because the tokens have expired. The app
> is a long-running service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>       Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
>       Map<ApplicationAccessType, String> appAcls,
>       LogAggregationContext logAggregationContext) {
>     // Get user's FileSystem credentials
>     final UserGroupInformation userUgi =
>         UserGroupInformation.createRemoteUser(user);
>     if (credentials != null) {
>       userUgi.addCredentials(credentials);
>     }
>    ...
>     try {
>       // Create the app dir
>       createAppDir(user, appId, userUgi);
>     } catch (Exception e) {
>       appLogAggregator.disableLogAggregation();
>       if (!(e instanceof YarnRuntimeException)) {
>         appDirException = new YarnRuntimeException(e);
>       } else {
>         appDirException = (YarnRuntimeException)e;
>       }
>       appLogAggregators.remove(appId);
>       closeFileSystems(userUgi);
>       throw appDirException;
>     }
> {code}
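
For illustration only: the failure described in the quoted issue (credentials recovered from NMStateStore whose tokens have already expired) could in principle be detected before the app directory is created, so that the error names the stale tokens instead of surfacing as a generic exception. The sketch below is plain Java and makes several assumptions: `RecoveredToken`, `TokenExpiryCheck`, and the `expiryMillis` field are hypothetical stand-ins, not Hadoop classes, and this is not what the attached patches implement.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token restored from the NM state store. */
final class RecoveredToken {
    final String service;     // e.g. the HDFS service the token is valid for
    final long expiryMillis;  // assumed absolute expiry time, epoch millis

    RecoveredToken(String service, long expiryMillis) {
        this.service = service;
        this.expiryMillis = expiryMillis;
    }

    /** A token counts as expired once the current time reaches its expiry. */
    boolean isExpiredAt(long nowMillis) {
        return nowMillis >= expiryMillis;
    }
}

/** Hypothetical helper: find recovered tokens that can no longer be used. */
final class TokenExpiryCheck {
    static List<RecoveredToken> expiredTokens(List<RecoveredToken> recovered,
                                              long nowMillis) {
        List<RecoveredToken> expired = new ArrayList<>();
        for (RecoveredToken token : recovered) {
            if (token.isExpiredAt(nowMillis)) {
                expired.add(token);
            }
        }
        return expired;
    }
}
```

A caller could run such a check on the recovered credentials and, if the returned list is non-empty, either request fresh tokens or fail with a message that identifies the expired ones. Which of those YARN should do is exactly the behavior question raised in the comment above.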