JobTracker's TaskCommitQueue is vulnerable to non-IOExceptions
--------------------------------------------------------------
Key: HADOOP-2051
URL: https://issues.apache.org/jira/browse/HADOOP-2051
Project: Hadoop
Issue Type: Bug
Components: mapred
Affects Versions: 0.15.0
Reporter: Arun C Murthy
Assignee: Arun C Murthy
Priority: Blocker
Fix For: 0.15.0
The {{JobTracker#TaskCommitQueue#run}} method only handles {{IOException}}s,
so any other exception escapes the loop and terminates the thread.
Christian Kunz ran into a scenario where a job was stuck with all tasks in
{{COMMIT_PENDING}} state and the stack traces showed that the "Task Commit
Thread" wasn't even around.
The work-around is to model {{TaskCommitQueue#run}} along the lines of other
long-running threads in the {{JobTracker}} ({{ExpireLaunchingTasks}},
{{ExpireTrackers}} etc.), which catch, log and ignore any {{Exception}} inside
their loop so that a single failure cannot kill the thread.
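A minimal sketch of that pattern follows. It is not the actual {{JobTracker}}
code: {{CommitQueueSketch}}, {{add}} and {{failureCount}} are hypothetical names
used only for illustration, and a plain {{Runnable}} stands in for a task commit.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a commit-queue thread; not the real TaskCommitQueue.
public class CommitQueueSketch implements Runnable {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
    private final AtomicInteger failures = new AtomicInteger();

    // Enqueue one unit of commit work (assumed here to be a Runnable).
    public void add(Runnable commit) {
        queue.add(commit);
    }

    // Number of commits that threw an exception and were logged and skipped.
    public int failureCount() {
        return failures.get();
    }

    @Override
    public void run() {
        while (true) {
            try {
                Runnable commit = queue.take(); // blocks until work arrives
                commit.run();
            } catch (InterruptedException ie) {
                // Interruption is the shutdown signal: restore the flag and exit.
                Thread.currentThread().interrupt();
                return;
            } catch (Exception e) {
                // Catch, log and ignore, so one bad commit cannot kill the thread.
                failures.incrementAndGet();
                System.err.println("Task Commit Thread got exception: " + e);
            }
        }
    }
}
```

The {{try}} block sits inside the {{while}} loop, so the thread moves on to the
next queued commit after a failure. This sketch deliberately does not catch
{{Error}}s; whether the real thread should is a separate decision.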