Jens Rabe created MAPREDUCE-6320:
------------------------------------

             Summary: Configuration of retrieved Job via Cluster is not 
properly set-up
                 Key: MAPREDUCE-6320
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6320
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.4.1
            Reporter: Jens Rabe


A Job retrieved via the Cluster API does not have its configuration populated.

To reproduce this:

# Submit an MR job and set some arbitrary parameter in its configuration
{code:java}
job.getConfiguration().set("foo", "bar");
job.setJobName("foo-bug-demo");
{code}
# Get the job in a client:
{code:java}
final Cluster c = new Cluster(conf);
final JobStatus[] statuses = c.getAllJobStatuses();
final JobStatus status = ... // get the status for the job named foo-bug-demo
final Job j = c.getJob(status.getJobID());
// Read the configuration of the retrieved job (j), not of the submitted one
final Configuration jobConf = j.getConfiguration();
{code}
# Get its "foo" entry
{code:java}
final String value = jobConf.get("foo");
{code}
# Expected: value is "bar". Actual: value is null.

The reason is that the job's configuration is stored on HDFS (the Configuration 
has a resource with an *hdfs://* URL), and *loadResource* reduces that URL to 
a path on the local file system (hdfs://host.domain:port/tmp/hadoop-yarn/... 
becomes /tmp/hadoop-yarn/...). That local path does not exist, so the 
configuration is never populated.
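
As a rough illustration of that URL-to-path reduction, here is a minimal sketch that uses only java.net.URI rather than the actual Hadoop code. It assumes the conversion amounts to keeping the path component of the resource URL. The class name JobFilePathDemo, the helper localPathOf, the port 8020 and the example-staging directory are all made up for the example.

```java
import java.io.File;
import java.net.URI;

final class JobFilePathDemo {
    // Keeps only the path component of the URL, so the scheme and
    // authority (hdfs://host:port) are silently dropped.
    static String localPathOf(String resourceUrl) {
        return URI.create(resourceUrl).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical job file location; port and directory names are invented.
        String jobFile =
            "hdfs://host.domain:8020/tmp/hadoop-yarn/example-staging/job.xml";
        String localPath = localPathOf(jobFile);

        // The remaining string looks like a local path ...
        System.out.println(localPath);
        // ... but on a client machine no such file is expected to exist,
        // so a loader that checks File.exists() would skip the resource.
        System.out.println(new File(localPath).exists());
    }
}
```

Once the scheme and authority are gone, nothing in the remaining string says that the file lives on HDFS, which would explain why the resource is skipped without an error.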

The bug is in the *Cluster* class, where *JobConfs* are created from 
*status.getJobFile()*. A quick fix would be to copy this job file to a 
temporary file on the local file system and populate the JobConf from that copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
