[ https://issues.apache.org/jira/browse/HADOOP-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463087 ]
Arun C Murthy commented on HADOOP-600:
--------------------------------------
Attached a straightforward fix: lock the JobTracker before locking
'taskTrackers' and 'trackerExpiryQueue'.
I didn't bother building a list of dead task trackers and then locking the
JobTracker, since the inner loop only checks timestamps and hence shouldn't be a
big deal... :-)
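For readers without the patch at hand, the lock ordering described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the class JobTrackerSketch, its methods, the task bookkeeping and the EXPIRY_MS value are all hypothetical; only the names 'taskTrackers' and 'trackerExpiryQueue' come from the comment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the JobTracker; not the real Hadoop class.
class JobTrackerSketch {
    // Illustrative expiry interval, not the real Hadoop setting.
    static final long EXPIRY_MS = 10_000L;

    // tracker name -> time of its last heartbeat
    private final Map<String, Long> taskTrackers = new HashMap<>();
    // tracker names still to be checked for expiry
    private final List<String> trackerExpiryQueue = new ArrayList<>();
    // tracker name -> tasks assigned to it (assumed bookkeeping)
    private final Map<String, List<String>> tasks = new HashMap<>();
    private final List<String> lostTasks = new ArrayList<>();

    // Heartbeat path: JobTracker lock, then taskTrackers, then the queue.
    synchronized void heartbeat(String tracker, long now) {
        synchronized (taskTrackers) {
            synchronized (trackerExpiryQueue) {
                if (taskTrackers.put(tracker, now) == null) {
                    trackerExpiryQueue.add(tracker);
                }
            }
        }
    }

    synchronized void assignTask(String tracker, String task) {
        tasks.computeIfAbsent(tracker, k -> new ArrayList<>()).add(task);
    }

    // Expiry path: takes the locks in the SAME order as heartbeat(), so a
    // status update cannot interleave with declaring the tracker lost.
    void expireTrackers(long now) {
        synchronized (this) {
            synchronized (taskTrackers) {
                synchronized (trackerExpiryQueue) {
                    Iterator<String> it = trackerExpiryQueue.iterator();
                    while (it.hasNext()) {
                        String tracker = it.next();
                        // The inner loop only compares timestamps.
                        if (now - taskTrackers.get(tracker) > EXPIRY_MS) {
                            it.remove();
                            taskTrackers.remove(tracker);
                            List<String> orphaned = tasks.remove(tracker);
                            if (orphaned != null) {
                                lostTasks.addAll(orphaned);
                            }
                        }
                    }
                }
            }
        }
    }

    synchronized boolean isTracked(String tracker) {
        synchronized (taskTrackers) {
            return taskTrackers.containsKey(tracker);
        }
    }

    synchronized List<String> getLostTasks() {
        return new ArrayList<>(lostTasks);
    }
}
```

The trade-off is that the JobTracker lock is held for the whole scan, which seems acceptable as long as the loop body stays as cheap as a timestamp comparison.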
> Race condition in JobTracker updating the task tracker's status while
> declaring it lost
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-600
> URL: https://issues.apache.org/jira/browse/HADOOP-600
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.7.1
> Reporter: Owen O'Malley
> Assigned To: Arun C Murthy
> Fix For: 0.10.1
>
> Attachments: HADOOP-600_20070108_1.patch
>
>
> There was a case where the JobTracker lost track of a set of tasks that were
> on a task tracker. It appears to be a race condition because the
> ExpireTrackers thread doesn't lock the JobTracker while updating the state.
> The fix would be to build a list of dead task trackers and then lock the job
> tracker while updating their status.
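For comparison, the alternative suggested in the description (build a list of dead task trackers first, then lock the job tracker while updating their status) might look roughly like the sketch below. It is only an illustration under assumptions: the class ExpireTrackersSketch, the lastSeen map, the EXPIRY_MS value and the re-check in the second phase are hypothetical and not taken from the Hadoop source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the JobTracker; not the real Hadoop class.
class ExpireTrackersSketch {
    // Illustrative expiry interval, not the real Hadoop setting.
    static final long EXPIRY_MS = 10_000L;

    // tracker name -> time of its last heartbeat
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final List<String> lostTrackers = new ArrayList<>();

    // Heartbeat path: job tracker lock first, then the map lock.
    synchronized void heartbeat(String tracker, long now) {
        synchronized (lastSeen) {
            lastSeen.put(tracker, now);
        }
    }

    void expireTrackers(long now) {
        // Phase 1: collect candidates while holding only the map lock.
        List<String> candidates = new ArrayList<>();
        synchronized (lastSeen) {
            for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
                if (now - e.getValue() > EXPIRY_MS) {
                    candidates.add(e.getKey());
                }
            }
        }
        // Phase 2: lock the job tracker, then update state. The timestamp is
        // re-checked (an assumption of this sketch) because a heartbeat may
        // have arrived between the two phases.
        synchronized (this) {
            synchronized (lastSeen) {
                for (String tracker : candidates) {
                    Long seen = lastSeen.get(tracker);
                    if (seen != null && now - seen > EXPIRY_MS) {
                        lastSeen.remove(tracker);
                        lostTrackers.add(tracker);
                    }
                }
            }
        }
    }

    synchronized List<String> getLostTrackers() {
        return new ArrayList<>(lostTrackers);
    }
}
```

This variant holds the job tracker lock only while updating the candidates, at the cost of a second pass and a re-check.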