[
https://issues.apache.org/jira/browse/MAPREDUCE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870396#comment-13870396
]
Siddharth Seth commented on MAPREDUCE-5663:
-------------------------------------------
That's two sets of tokens that are obtained - for the working directory, and
for any additional HDFS servers which the user may have configured.
In addition to this, tokens may be obtained by Input/OutputFormats.
From FileInputFormat:
{code}
Path[] dirs = getInputPaths(job);
if (dirs.length == 0) {
  throw new IOException("No input paths specified in job");
}
// get tokens for all the required FileSystems..
TokenCache.obtainTokensForNamenodes(job.getCredentials(), dirs,
    job.getConfiguration());
{code}
getInputPaths reads the property "mapreduce.input.fileinputformat.inputdir",
which is specific to FileInputFormat (FIF). If the input paths reside on a
different Namenode than the one hosting the staging directory, I don't think
users need to set MRJobConfig.JOB_NAMENODES. The tokens would just be picked up
as part of client-side split generation.
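To make that concrete, here is a rough plain-Java sketch (not Hadoop code) of
the idea: every input path already names the file system it lives on, so the
set of Namenodes that need tokens can be derived from the paths themselves. The
class NamenodeSketch, the method distinctFileSystems, and the host names are
hypothetical and only for illustration; in Hadoop the actual work is done by
TokenCache.obtainTokensForNamenodes, whose internals may differ from this.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NamenodeSketch {

    /**
     * Returns the distinct "scheme://authority" values for the given paths,
     * in first-seen order. Paths without a scheme or authority are assumed
     * to live on the default file system (an assumption of this sketch).
     */
    public static Set<String> distinctFileSystems(List<String> paths,
                                                  String defaultFs) {
        Set<String> result = new LinkedHashSet<>();
        for (String p : paths) {
            URI uri = URI.create(p);
            if (uri.getScheme() == null || uri.getAuthority() == null) {
                result.add(defaultFs);
            } else {
                result.add(uri.getScheme() + "://" + uri.getAuthority());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical input paths: two on nn1 (one implicit), one on nn2.
        List<String> inputPaths = List.of(
                "hdfs://nn1:8020/data/in",
                "hdfs://nn2:8020/logs",
                "/user/alice/extra");
        // Prints [hdfs://nn1:8020, hdfs://nn2:8020]
        System.out.println(distinctFileSystems(inputPaths, "hdfs://nn1:8020"));
    }
}
```

In this sketch nn2 shows up purely because an input path points at it, which
is why no separate JOB_NAMENODES entry should be needed for FIF inputs.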
In terms of Oozie, from what I understand, the JobSubmitter does not get
invoked on a box with Kerberos credentials - not for the main job anyway (maybe
for the launcher) - so this code to obtain tokens doesn't kick in. If that's
the case, my guess is that Oozie has additional configuration and explicitly
goes out and fetches tokens before submitting the launcher.
> Add an interface to Input/Output Formats to obtain delegation tokens
> --------------------------------------------------------------------
>
> Key: MAPREDUCE-5663
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5663
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Michael Weng
> Attachments: MAPREDUCE-5663.4.txt, MAPREDUCE-5663.5.txt,
> MAPREDUCE-5663.6.txt, MAPREDUCE-5663.patch.txt, MAPREDUCE-5663.patch.txt2,
> MAPREDUCE-5663.patch.txt3
>
>
> Currently, delegation tokens are obtained as part of the getSplits /
> checkOutputSpecs calls to the InputFormat / OutputFormat respectively.
> This works as long as the splits are generated on a node with kerberos
> credentials. For split generation elsewhere (AM for example), an explicit
> interface is required.