[ https://issues.apache.org/jira/browse/HADOOP-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-2228:
--------------------------------

    Fix Version/s:     (was: 0.15.2)
           Status: Open  (was: Patch Available)

I don't think this patch is going to solve the problem. The reason is that a {{job}} is not removed immediately from the {{jobs}} list even though it may have succeeded (or failed). The JobTracker works on a specific directory per job, and the directory name is based on the jobId. The jobIds assigned by the JobTracker should never clash, since each jobId is unique (to clarify, the jobId is a combination of the JobTracker startup timestamp and a counter that keeps track of submitted jobs). Of course, the possibility of a bug here cannot be ruled out, but it requires investigation...

I feel that this issue can be addressed in 0.16 and am marking it as such.

> Jobs fail because job.xml exists
> --------------------------------
>
>                 Key: HADOOP-2228
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2228
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.14.3
>         Environment: 35 node cluster, linux
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>             Fix For: 0.16.0
>
>         Attachments: HADOOP-2228-v1.patch
>
>
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Target /var/storage/4/mapred/local/jobTracker/job_200711081903_3976.xml already exists
>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:271)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117)
>         at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:803)
>         at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:784)
>         at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:134)
>         at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1479)
>         at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
>         at org.apache.hadoop.ipc.Client.call(Client.java:470)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:165)
>         at $Proxy1.submitJob(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at $Proxy1.submitJob(Unknown Source)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:397)
>         at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:345)
>         at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:250)
>         at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:282)
>         at java.lang.Thread.run(Thread.java:619)
>
> Perhaps related to HADOOP-1057, HADOOP-891 or to the rpc retry. It seems my job was submitted and actually finished despite the exception. Could it be that the job went in and the rpc retry decided to submit it again anyway?
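
For reference, the jobId scheme described in the comment (JobTracker startup timestamp plus a counter of submitted jobs) can be sketched roughly as follows. This is only an illustration and not the actual JobTracker code: the class {{JobIdSketch}}, its methods, and the four-digit zero padding are assumptions made here; only the {{job_200711081903_3976}} shape and the {{.xml}} target path are taken from the stack trace.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the real JobTracker implementation.
public class JobIdSketch {
    // Identifier derived from the tracker's startup time, e.g. "200711081903"
    // (the middle part of the id seen in the stack trace).
    private final String trackerStart;

    // Counts jobs submitted since this tracker instance started.
    private final AtomicInteger counter = new AtomicInteger(0);

    public JobIdSketch(String trackerStart) {
        this.trackerStart = trackerStart;
    }

    // Each call increments the counter, so two calls on the same tracker
    // instance never return the same id. The %04d padding is an assumption.
    public String nextJobId() {
        return String.format("job_%s_%04d", trackerStart, counter.incrementAndGet());
    }

    // Local copy of the job file, named after the jobId as in the exception
    // message ("<localDir>/<jobId>.xml").
    public static String localJobFile(String localDir, String jobId) {
        return localDir + "/" + jobId + ".xml";
    }
}
```

Under this scheme a second submission normally gets a fresh counter value and therefore a fresh file name, so an "already exists" error on the same {{.xml}} target would suggest that the same jobId reached the copy step twice, which is what still needs investigation.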