[ https://issues.apache.org/jira/browse/YARN-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180522#comment-15180522 ]
Steve Loughran commented on YARN-4721:
--------------------------------------

HADOOP-12889 covers the changes to KDiag & UGI for this

> RM to try to auth with HDFS on startup, retry with max diagnostics on failure
> -----------------------------------------------------------------------------
>
>                 Key: YARN-4721
>                 URL: https://issues.apache.org/jira/browse/YARN-4721
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>         Attachments: HADOOP-12889-001.patch
>
>
> If the RM can't auth with HDFS, this can first surface during job submission,
> which can cause confusion about what's wrong and whose credentials are
> playing up.
> Instead, the RM could try to talk to HDFS on launch, {{ls /}} should suffice.
> If it can't auth, it can then tell UGI to log more and retry.
> I don't know what the policy should be if the RM can't auth to HDFS at this
> point. Certainly it can't currently accept work. But should it fail fast or
> keep going in the hope that the problem is in the KDC or NN and will fix
> itself without an RM restart?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
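
The probe-then-retry idea in the description could look roughly like the sketch below. It is not taken from the attached patch. The class name StartupAuthProbe, the method probeWithRetry and the maxAttempts parameter are all hypothetical, and the sketch deliberately leaves open the fail-fast versus keep-going question by throwing once its attempts run out. In a real RM the probe would presumably list the HDFS root and the diagnostics hook would raise UGI/Kerberos logging; both are passed in here as plain callbacks so the example needs only the JDK.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper: not part of YARN or of HADOOP-12889-001.patch. */
final class StartupAuthProbe {

    private StartupAuthProbe() {
    }

    /**
     * Runs the probe up to maxAttempts times. After the first failure it
     * invokes enableDiagnostics once, so that any later attempt runs with
     * more verbose logging.
     *
     * @return the 1-based number of the attempt that succeeded
     * @throws IOException if every attempt failed; the last failure is the cause
     */
    static int probeWithRetry(Callable<Void> probe,
                              Runnable enableDiagnostics,
                              int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Assumption: in the RM this would be something like "ls /" on HDFS.
                probe.call();
                return attempt;
            } catch (Exception e) {
                last = e;
                if (attempt == 1) {
                    // Assumption: this is where UGI would be told to log more.
                    enableDiagnostics.run();
                }
            }
        }
        throw new IOException(
            "HDFS auth probe failed after " + maxAttempts + " attempt(s)", last);
    }
}
```

A caller that wanted the "keep going" policy would catch the final IOException and schedule another round rather than exiting; one that wanted fail-fast would let it propagate out of RM startup.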