[ http://issues.apache.org/jira/browse/HADOOP-133?page=comments#action_12374428 ]

Doug Cutting commented on HADOOP-133:
-------------------------------------

Sure, that would be safer, but recall that this communication is all on the 
same host.  A tasktracker shouldn't have more than a handful of children, so 
per-second pings should not be a great burden.  And communication problems to 
localhost seem unlikely.  I've seen nodes with loads over 100, timing out all 
sorts of requests from other hosts, and I've never seen "Parent died" logged 
when a tasktracker was really still alive.  But, still, it shouldn't hurt to 
try a few times.
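
A rough sketch of what "try a few times" could look like, written in 
present-day Java for brevity. The class PingRetry, the method parentAlive and 
its parameters are hypothetical names for illustration, and the 
BooleanSupplier stands in for the real ping call to the tasktracker; none of 
this is taken from the Hadoop source.

```java
import java.util.function.BooleanSupplier;

public class PingRetry {
    /**
     * Returns true if the parent answered at least one of maxAttempts pings.
     * The ping argument is a stand-in (assumption) for the real call to the
     * tasktracker; it returns true when the parent responded.
     */
    public static boolean parentAlive(BooleanSupplier ping, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (ping.getAsBoolean()) {
                return true;
            }
            // Pause between attempts, but not after the last one.
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis);
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Simulated parent that misses two pings, then answers the third.
        boolean alive = parentAlive(() -> ++calls[0] >= 3, 3, 10);
        System.out.println("alive=" + alive + " after " + calls[0] + " pings");
    }
}
```

With a one-second delay and three attempts, a child would only conclude that 
its parent died after about three seconds without a response, which should 
ride out a single dropped ping.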

> the TaskTracker.Child.ping thread calls exit
> --------------------------------------------
>
>          Key: HADOOP-133
>          URL: http://issues.apache.org/jira/browse/HADOOP-133
>      Project: Hadoop
>         Type: Bug

>   Components: mapred
>     Versions: 0.1.1
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley

>
> The TaskTracker.Child.startPinging thread calls exit if the TaskTracker 
> doesn't respond. Calling exit in a multi-threaded program is really 
> problematic. In particular, it prevents cleanup/finally clauses from running. 
> We need to move to a model where it uses Thread.interrupt(), which means we 
> need to check the interrupt flag in the map loop and reduce loop and 
> stop masking the InterruptedExceptions.
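
The interrupt-based model described in the issue might look roughly like the 
sketch below, again in present-day Java. InterruptibleTask and runAndInterrupt 
are hypothetical names, and the sleeping loop is only a stand-in for the map 
or reduce loop; this is not the actual Hadoop code or the eventual fix.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptibleTask {
    /** Starts a stand-in work loop, interrupts it, and reports whether cleanup ran. */
    public static boolean runAndInterrupt() throws InterruptedException {
        AtomicBoolean cleanedUp = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                started.countDown();
                // Check the interrupt flag once per iteration of the loop.
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(5); // stands in for processing one record
                }
            } catch (InterruptedException e) {
                // Do not mask the exception: restore the flag for callers.
                Thread.currentThread().interrupt();
            } finally {
                // Runs when the thread is interrupted; exit would skip it.
                cleanedUp.set(true);
            }
        });

        worker.start();
        started.await();
        worker.interrupt(); // what the ping thread would do instead of calling exit
        worker.join();
        return cleanedUp.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cleanup ran: " + runAndInterrupt());
    }
}
```

The loop stops either because it sees the flag or because a blocking call 
throws InterruptedException; in both cases the finally clause runs.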
