[ https://issues.apache.org/jira/browse/YARN-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262130#comment-15262130 ]

Junping Du commented on YARN-4721:
----------------------------------

Thanks [~ste...@apache.org] for updating the patch. The 003 patch looks pretty good. Just one question: KDiagBinding seems to use the builder pattern for all optional construction parameters. The only exception is withKeytabConfKey, which returns void:
{noformat}
+  public void withKeytabConfKey(String key) {
+    this.keytabConfKey = key;
+  }
{noformat}
Shall we follow the same builder pattern here? If so, in ResourceManager.doSecureLogin(),
{noformat}
+    String keytabFilename = conf.get(YarnConfiguration.RM_KEYTAB);
+    if (keytabFilename != null) {
+      binding.withKeytab(new File(keytabFilename));
+    }
{noformat}
can we simply call binding.withKeytabConfKey(YarnConfiguration.RM_KEYTAB) and handle the null case inside withKeytabConfKey()?

Everything else looks good to me.

> RM to try to auth with HDFS on startup, retry with max diagnostics on failure
> -----------------------------------------------------------------------------
>
>                 Key: YARN-4721
>                 URL: https://issues.apache.org/jira/browse/YARN-4721
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-12289-002.patch, HADOOP-12289-003.patch,
> HADOOP-12889-001.patch
>
>
> If the RM can't auth with HDFS, this can first surface during job submission,
> which can cause confusion about what's wrong and whose credentials are
> playing up.
> Instead, the RM could try to talk to HDFS on launch, {{ls /}} should suffice.
> If it can't auth, it can then tell UGI to log more and retry.
> I don't know what the policy should be if the RM can't auth to HDFS at this
> point. Certainly it can't currently accept work. But should it fail fast or
> keep going in the hope that the problem is in the KDC or NN and will fix
> itself without an RM restart?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
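
For reference, the builder-style change suggested in the review comment could look roughly like the sketch below. It is only an illustration under stated assumptions: BindingSketch is a hypothetical stand-in and not the real KDiagBinding, a plain Map stands in for the Hadoop Configuration, and "example.rm.keytab" is a made-up key used in place of YarnConfiguration.RM_KEYTAB. The point is that withKeytabConfKey() returns the binding for chaining and absorbs the null check that the caller currently performs.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the binding under review; not the real KDiagBinding. */
class BindingSketch {
    // Assumption: a Map stands in for the Hadoop Configuration object.
    private final Map<String, String> conf;
    private File keytab;
    private String keytabConfKey;

    BindingSketch(Map<String, String> conf) {
        this.conf = conf;
    }

    /** Builder-style setter: returns this so calls can be chained. */
    BindingSketch withKeytab(File file) {
        this.keytab = file;
        return this;
    }

    /**
     * Records the configuration key and, if that key is set, resolves the
     * keytab file from it. An unset key leaves the keytab untouched, so the
     * caller no longer needs its own null check.
     */
    BindingSketch withKeytabConfKey(String key) {
        this.keytabConfKey = key;
        String filename = conf.get(key);
        if (filename != null) {
            this.keytab = new File(filename);
        }
        return this;
    }

    File getKeytab() {
        return keytab;
    }

    String getKeytabConfKey() {
        return keytabConfKey;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical key and path, for illustration only.
        conf.put("example.rm.keytab", "/etc/security/rm.keytab");
        BindingSketch binding = new BindingSketch(conf)
            .withKeytabConfKey("example.rm.keytab");
        System.out.println(binding.getKeytab());
    }
}
```

With this shape, the caller in doSecureLogin() would reduce to a single chained call, and an unset key would simply leave the binding without a keytab.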