[jira] [Updated] (MAPREDUCE-4850) Job recovery may fail if staging directory has been deleted

2012-12-13 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-4850:
---------------------------------

Attachment: MAPREDUCE-4850.patch

New patch with a unit test. This depends on the fixes I made for MAPREDUCE-4859, 
which are not committed yet.

 Job recovery may fail if staging directory has been deleted
 ---

 Key: MAPREDUCE-4850
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4850
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-4850.patch, MAPREDUCE-4850.patch


 The job staging directory is deleted in the job cleanup task, which happens 
 before the job-info file is deleted from the system directory (by the 
 JobInProgress garbageCollect() method). If the JT shuts down between these 
 two operations, then when the JT restarts and tries to recover the job, it 
 fails since the job.xml and splits are no longer available.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-4850) Job recovery may fail if staging directory has been deleted

2012-12-05 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-4850:
---------------------------------

Attachment: MAPREDUCE-4850.patch

A patch that deletes the staging directory after the system directory.

In manual testing with this patch, I could no longer reproduce the recovery 
failure in the scenario from the description. It would be nice to add a unit 
test, but I'm still trying to figure out how to write one for this.
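
The ordering problem can be illustrated with a small toy model. This is only a sketch and is not taken from the patch: the class name, the directory layout under a temporary root, and the recover() logic are all assumptions made for illustration, and none of it uses Hadoop APIs. It assumes that a restarted JT attempts recovery whenever the job-info file still exists, and that recovery needs job.xml from the staging directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CleanupOrderSketch {
    enum Outcome { NOTHING_TO_RECOVER, RECOVERED, RECOVERY_FAILED }

    // Hypothetical layout: <root>/system/job-info marks the job as recoverable;
    // <root>/staging/job.xml stands in for job.xml and the splits.
    static Path setUp() throws IOException {
        Path root = Files.createTempDirectory("jt-sketch");
        Files.createDirectories(root.resolve("system"));
        Files.createDirectories(root.resolve("staging"));
        Files.createFile(root.resolve("system/job-info"));
        Files.createFile(root.resolve("staging/job.xml"));
        return root;
    }

    static void deleteStaging(Path root) throws IOException {
        Files.deleteIfExists(root.resolve("staging/job.xml"));
        Files.deleteIfExists(root.resolve("staging"));
    }

    static void deleteJobInfo(Path root) throws IOException {
        Files.deleteIfExists(root.resolve("system/job-info"));
    }

    // What a restarted JT would see in this toy model.
    static Outcome recover(Path root) {
        if (!Files.exists(root.resolve("system/job-info"))) {
            return Outcome.NOTHING_TO_RECOVER;
        }
        return Files.exists(root.resolve("staging/job.xml"))
                ? Outcome.RECOVERED
                : Outcome.RECOVERY_FAILED;
    }

    // Runs only the first cleanup step, then simulates a shutdown and restart.
    static Outcome crashAfterFirstStep(boolean stagingFirst) {
        try {
            Path root = setUp();
            if (stagingFirst) {
                deleteStaging(root);   // order described in the issue
            } else {
                deleteJobInfo(root);   // order proposed in the comment above
            }
            return recover(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("staging first: " + crashAfterFirstStep(true));
        System.out.println("system first:  " + crashAfterFirstStep(false));
    }
}
```

In this model, deleting the staging directory first leaves a job-info file pointing at files that no longer exist, so recovery is attempted and fails. Deleting the job-info file first means a restart in the same window finds nothing to recover; the cost, at least in this sketch, is a leftover staging directory.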


 Job recovery may fail if staging directory has been deleted
 ---

 Key: MAPREDUCE-4850
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4850
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.1.1
Reporter: Tom White
Assignee: Tom White
 Attachments: MAPREDUCE-4850.patch

