[ https://issues.apache.org/jira/browse/MAPREDUCE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710153#comment-13710153 ]

Jason Lowe commented on MAPREDUCE-5001:
---------------------------------------

[~sandyr], do you have an ETA on a patch? Some of our Hive devs would love to see this fixed.

> LocalJobRunner has race condition resulting in job failures
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-5001
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5001
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha
>            Reporter: Brock Noland
>            Assignee: Sandy Ryza
>
> Hive is hitting a race condition with LocalJobRunner and the Cluster class.
> The JobClient uses the Cluster class to obtain Job objects. The Cluster class
> uses the job.xml file to populate the JobConf object
> (https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java#L184).
> However, this file is deleted by the LocalJobRunner at the end of its job
> (https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java#L484).
> This results in the following exception:
> {noformat}
> 2013-02-11 14:45:17,755 (main) [FATAL - org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2001)] error parsing conf file:/tmp/hadoop-brock/mapred/staging/brock1916441210/.staging/job_local_0432/job.xml
> java.io.FileNotFoundException: /tmp/hadoop-brock/mapred/staging/brock1916441210/.staging/job_local_0432/job.xml (No such file or directory)
> 	at java.io.FileInputStream.open(Native Method)
> 	at java.io.FileInputStream.<init>(FileInputStream.java:120)
> 	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1917)
> 	at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1870)
> 	at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1777)
> 	at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
> 	at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1951)
> 	at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:398)
> 	at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:388)
> 	at org.apache.hadoop.mapred.JobClient$NetworkedJob.<init>(JobClient.java:174)
> 	at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:655)
> 	at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:668)
> 	at org.apache.hadoop.mapreduce.TestMR2LocalMode.test(TestMR2LocalMode.java:40)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
> 	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
> 	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> {noformat}
> Here is code which exposes this race fairly quickly:
> {noformat}
>     Configuration conf = new Configuration();
>     conf.set("mapreduce.framework.name", "local");
>     conf.set("mapreduce.jobtracker.address", "local");
>     File inputDir = new File("/tmp", "input-" + System.currentTimeMillis());
>     File outputDir = new File("/tmp", "output-" + System.currentTimeMillis());
>     while(true) {
>       Assert.assertTrue(inputDir.mkdirs());
>       File inputFile = new File(inputDir, "file");
>       FileUtils.copyFile(new File("/etc/passwd"), inputFile);
>       Path input = new Path(inputDir.getAbsolutePath());
>       Path output = new Path(outputDir.getAbsolutePath());
>       JobConf jobConf = new JobConf(conf, TestMR2LocalMode.class);
>       FileInputFormat.addInputPath(jobConf, input);
>       FileOutputFormat.setOutputPath(jobConf, output);
>       JobClient jobClient = new JobClient(conf);
>       RunningJob runningJob = jobClient.submitJob(jobConf);
>       while(!runningJob.isComplete()) {
>         runningJob = jobClient.getJob(runningJob.getJobID());
>       }
>       FileUtils.deleteQuietly(inputDir);
>       FileUtils.deleteQuietly(outputDir);
>     }
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
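
The failure reported in this issue comes down to a reader opening job.xml after the runner has already deleted it. The following is a minimal sketch of that pattern in plain Java, with no Hadoop classes involved. The class name JobXmlRace, the helper readIfPresent, and the temporary "staging" directory are all hypothetical stand-ins chosen for illustration. It shows one possible way a polling client could treat a vanished file as "the job has already finished" rather than as a fatal error; it is not a proposed patch for MAPREDUCE-5001.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;

public class JobXmlRace {

    /**
     * Reads the file in one step and reports a missing file as an empty
     * Optional. There is no separate exists() check, so there is no window
     * between "check" and "read" for a concurrent delete to slip into.
     */
    public static Optional<String> readIfPresent(Path jobXml) throws IOException {
        try {
            byte[] bytes = Files.readAllBytes(jobXml);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8));
        } catch (NoSuchFileException e) {
            // The file was removed before we could read it.
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the job staging directory.
        Path stagingDir = Files.createTempDirectory("staging");
        Path jobXml = stagingDir.resolve("job.xml");
        Files.write(jobXml, "<configuration/>".getBytes(StandardCharsets.UTF_8));

        // While the "job" is running, the file is readable.
        System.out.println("before delete: " + readIfPresent(jobXml).isPresent());

        // Simulates the runner cleaning up at the end of the job.
        Files.delete(jobXml);

        // A late poll now sees an empty result instead of an exception.
        System.out.println("after delete: " + readIfPresent(jobXml).isPresent());

        Files.delete(stagingDir);
    }
}
```

In the repro loop from the report, the equivalent decision point would be the jobClient.getJob(...) call inside the polling loop; whether a missing job.xml should be handled there or inside the framework is a design question this sketch does not settle.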