[ http://issues.apache.org/jira/browse/HADOOP-181?page=comments#action_12427115 ]

Doug Cutting commented on HADOOP-181:
-------------------------------------

> improved detection of tasktracker death is a separate issue

It's certainly related. This issue deals with fixing things when tasktracker deaths are mis-detected. If we didn't misdetect so much, this would not be an issue.

> Multiple instances of a task should be handled by the speculative execution
> code.

What if speculative execution is disabled in the job? Then we'd get multiple instances of the task running at once, when the client explicitly requested that not happen.

I'm +0 on this patch. If others feel strongly that it's the best approach, I won't veto it. But I would prefer we address the root problem first, and then see if this is still an issue, before adding this new mechanism. Does that make sense, or am I missing something?

> task trackers should not restart for having a late heartbeat
> ------------------------------------------------------------
>
>                 Key: HADOOP-181
>                 URL: http://issues.apache.org/jira/browse/HADOOP-181
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>         Assigned To: Devaraj Das
>             Fix For: 0.6.0
>
>         Attachments: lost-heartbeat.patch
>
>
> TaskTrackers should not close and restart themselves for having a late
> heartbeat. The JobTracker should just accept their current status.
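
The scenario raised in the comment can be illustrated with a small Python sketch. This is not Hadoop code: the class `ToyJobTracker`, its methods, and the parameters `expiry_seconds` and `spare_tracker` are all hypothetical names invented for illustration. It assumes a tracker is presumed dead once its heartbeat is overdue, that its tasks are then rerun elsewhere, and that a late heartbeat is simply accepted, as the issue proposes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJobTracker:
    """Hypothetical stand-in for the JobTracker; not Hadoop's actual class."""
    expiry_seconds: float
    speculative_execution: bool
    last_heartbeat: dict = field(default_factory=dict)  # tracker -> last heartbeat time
    running: dict = field(default_factory=dict)         # task -> set of trackers running it

    def heartbeat(self, tracker, now, tasks):
        """Accept the tracker's reported status, even if the heartbeat is late."""
        self.last_heartbeat[tracker] = now
        for task in tasks:
            self.running.setdefault(task, set()).add(tracker)

    def expire_and_reschedule(self, now, spare_tracker):
        """Presume silent trackers dead and rerun their tasks on spare_tracker."""
        for tracker, seen in list(self.last_heartbeat.items()):
            if now - seen > self.expiry_seconds:
                del self.last_heartbeat[tracker]
                for trackers in self.running.values():
                    if tracker in trackers:
                        trackers.discard(tracker)
                        trackers.add(spare_tracker)

    def unwanted_duplicates(self):
        """Tasks running more than once although speculation is disabled."""
        if self.speculative_execution:
            return []
        return sorted(t for t, trackers in self.running.items() if len(trackers) > 1)


jt = ToyJobTracker(expiry_seconds=10, speculative_execution=False)
jt.heartbeat("tt1", now=0, tasks=["map_0"])
jt.expire_and_reschedule(now=30, spare_tracker="tt2")  # tt1 is mis-detected as dead
jt.heartbeat("tt1", now=31, tasks=["map_0"])           # late heartbeat is accepted
print(jt.unwanted_duplicates())
```

In this toy model the task ends up on both trackers even though the job disabled speculative execution. The sketch only demonstrates the concern; it does not model how the attached patch or the speculative execution code would resolve the duplicate.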