Jens Rabe created MAPREDUCE-6320:
------------------------------------

Summary: Configuration of retrieved Job via Cluster is not properly set-up
Key: MAPREDUCE-6320
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6320
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.4.1
Reporter: Jens Rabe

When a Job is retrieved via the Cluster API, its configuration is not populated correctly. To reproduce this:

# Submit an MR job and set some arbitrary parameter in its configuration:
{code:java}
job.getConfiguration().set("foo", "bar");
job.setJobName("foo-bug-demo");
{code}
# Retrieve the job in a client:
{code:java}
final Cluster c = new Cluster(conf);
final JobStatus[] statuses = c.getAllJobStatuses();
final JobStatus s = ... // get the status for the job named foo-bug-demo
final Job j = c.getJob(s.getJobID());
final Configuration jobConf = j.getConfiguration();
{code}
# Read its "foo" entry:
{code:java}
final String foo = jobConf.get("foo");
{code}
# Expected: foo is "bar". Actual: foo is null.

The reason is that the job's configuration is stored on HDFS (the Configuration has a resource with an *hdfs://* URL), and in *loadResource* this URL is reduced to a path on the local file system (hdfs://host.domain:port/tmp/hadoop-yarn/... becomes /tmp/hadoop-yarn/...). That local path does not exist, so the configuration is not populated.

The bug occurs in the *Cluster* class, where *JobConf* instances are created from *status.getJobFile()*. A quick fix would be to copy the job file to a temporary file on the local file system and populate the JobConf from that file.
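
The path rewriting described above can be illustrated without Hadoop. The following is a minimal sketch using only {{java.net.URI}} and {{java.io.File}}; it is not the actual *loadResource* code. The class name, the helper methods, and the example job file location (host, port 8020, and the path below /tmp/hadoop-yarn) are hypothetical and chosen only for illustration.

```java
import java.io.File;
import java.net.URI;

public class JobFilePathDemo {

    // Keeps only the path component of a URI, dropping scheme and authority.
    // This mimics the effect described in the report; it is not Hadoop code.
    public static String localPathOf(String jobFile) {
        return URI.create(jobFile).getPath();
    }

    // True if the location names a file system other than the local one.
    public static boolean isRemote(String jobFile) {
        String scheme = URI.create(jobFile).getScheme();
        return scheme != null && !scheme.equals("file");
    }

    public static void main(String[] args) {
        // Hypothetical job file location; host, port, and path are made up.
        String jobFile = "hdfs://host.domain:8020/tmp/hadoop-yarn/example/job.xml";
        String local = localPathOf(jobFile);

        System.out.println("remote:     " + isRemote(jobFile));
        System.out.println("local path: " + local);
        // On a client machine this file normally does not exist, so a loader
        // that checks the local path silently ends up with no properties.
        System.out.println("exists:     " + new File(local).exists());
    }
}
```

A check like {{isRemote}} is one way a client could decide that the job file has to be fetched from its own file system first (for example by copying it to a local temporary file, as suggested above) instead of being opened as a local path.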