[ http://issues.apache.org/jira/browse/HADOOP-181?page=comments#action_12427581 ]
            
Doug Cutting commented on HADOOP-181:
-------------------------------------

> Why does one turn off speculative execution?

In the case of the Nutch crawler, speculative execution is disabled to observe 
politeness.  We do not want two tasks to attempt to fetch pages from a site at 
the same time.

This patch adds a fair amount of complexity, introducing a new state for tasks 
(presumed dead, but reanimatable).  A new state is likely to add new failure 
modes.
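For illustration only, the extra state in question might look like the sketch below. The names are hypothetical, not the actual patch's code; the point is that every added state multiplies the transitions that must be handled correctly.

```java
// Hypothetical sketch of the tracker state machine the patch implies.
// These identifiers are illustrative, not Hadoop's actual code.
enum TrackerState {
  RUNNING,        // heartbeats arriving on time
  PRESUMED_DEAD,  // heartbeat overdue, but the tracker may yet reanimate
  DEAD            // declared lost; its tasks are rescheduled
}
```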

Does anyone deny that this primarily addresses an issue that would go away if 
we could more reliably detect tasktracker death?  Shouldn't we attempt to fix 
that first?  Sameer raises the issue of "transient network problems".  Are we 
actually seeing these?  Even if these were to occur, the system would operate 
correctly as-is: this is an optimization.  Is this a common-enough case that we 
can afford to optimize it?


> task trackers should not restart for having a late heartbeat
> ------------------------------------------------------------
>
>                 Key: HADOOP-181
>                 URL: http://issues.apache.org/jira/browse/HADOOP-181
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>         Assigned To: Devaraj Das
>             Fix For: 0.6.0
>
>         Attachments: lost-heartbeat.patch
>
>
> TaskTrackers should not close and restart themselves for having a late 
> heartbeat. The JobTracker should just accept their current status.
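A minimal sketch of what the issue asks for: the JobTracker simply accepts a late heartbeat and refreshes its bookkeeping, rather than directing the tracker to close and restart. All names and the timeout value below are illustrative assumptions, not the actual Hadoop implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical JobTracker-side sketch: record each tracker's last-seen
// time and accept late heartbeats instead of forcing a restart.
class HeartbeatSketch {
  static final long EXPIRY_MS = 10 * 60 * 1000;  // assumed expiry window

  private final Map<String, Long> lastSeen = new HashMap<>();

  /**
   * Record a heartbeat. Returns true if the tracker had been overdue
   * (i.e. it would previously have been presumed lost); either way its
   * current status is accepted and its timestamp refreshed.
   */
  boolean heartbeat(String trackerId, long now) {
    Long prev = lastSeen.put(trackerId, now);
    return prev != null && now - prev > EXPIRY_MS;
  }
}
```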

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
