[ 
https://issues.apache.org/jira/browse/YARN-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262130#comment-15262130
 ] 

Junping Du commented on YARN-4721:
----------------------------------

Thanks [~ste...@apache.org] for updating the patch. The 003 patch looks 
pretty good. 
Just one question: it seems KDiagBinding uses the builder pattern for all 
optional parameters at construction. The only exception is withKeytabConfKey, 
which returns void:
{noformat}
+    public void withKeytabConfKey(String key) {
+      this.keytabConfKey = key;
+    }
{noformat}
Should it follow the same builder pattern as the other setters?
If so, in ResourceManager.doSecureLogin(), 
{noformat}
+        String keytabFilename = conf.get(YarnConfiguration.RM_KEYTAB);
+        if (keytabFilename != null) {
+          binding.withKeytab(new File(keytabFilename));
+        }
{noformat}
Can we simply call binding.withKeytabConfKey(YarnConfiguration.RM_KEYTAB) 
and handle the null case inside withKeytabConfKey()?
Everything else looks good to me.
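
To illustrate the suggestion, here is a minimal sketch. It is not code from 
the patch: BindingSketch is a hypothetical stand-in for KDiagBinding, and a 
plain Map stands in for the Hadoop Configuration, so the real field names and 
lookup calls may differ. It only shows withKeytabConfKey() returning the 
binding and absorbing the null check itself.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for KDiagBinding; not the class from the patch. */
class BindingSketch {
  // Assumption: a plain map replaces the Hadoop Configuration lookup.
  private final Map<String, String> conf;
  private String keytabConfKey;
  private File keytab;

  BindingSketch(Map<String, String> conf) {
    this.conf = conf;
  }

  /** Builder-style setter: returns this so calls can be chained. */
  BindingSketch withKeytab(File keytab) {
    this.keytab = keytab;
    return this;
  }

  /**
   * Records the configuration key and, if it resolves to a value, sets the
   * keytab from it. An unset key leaves the keytab untouched, so the caller
   * needs no null check of its own.
   */
  BindingSketch withKeytabConfKey(String key) {
    this.keytabConfKey = key;
    String filename = (key == null) ? null : conf.get(key);
    if (filename != null) {
      withKeytab(new File(filename));
    }
    return this;
  }

  String getKeytabConfKey() {
    return keytabConfKey;
  }

  File getKeytab() {
    return keytab;
  }
}
```

With something along these lines, the four added lines in doSecureLogin() 
would reduce to the single withKeytabConfKey(YarnConfiguration.RM_KEYTAB) 
call.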

> RM to try to auth with HDFS on startup, retry with max diagnostics on failure
> -----------------------------------------------------------------------------
>
>                 Key: YARN-4721
>                 URL: https://issues.apache.org/jira/browse/YARN-4721
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-12289-002.patch, HADOOP-12289-003.patch, 
> HADOOP-12889-001.patch
>
>
> If the RM can't auth with HDFS, this can first surface during job submission, 
> which can cause confusion about what's wrong and whose credentials are 
> playing up.
> Instead, the RM could try to talk to HDFS on launch, {{ls /}} should suffice. 
> If it can't auth, it can then tell UGI to log more and retry.
> I don't know what the policy should be if the RM can't auth to HDFS at this 
> point. Certainly it can't currently accept work. But should it fail fast or 
> keep going in the hope that the problem is in the KDC or NN and will fix 
> itself without an RM restart?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
