[ http://issues.apache.org/jira/browse/HADOOP-547?page=all ]

Sanjay Dahiya reassigned HADOOP-547:
------------------------------------

    Assignee: Sanjay Dahiya

> ReduceTaskRunner can miss sending heartbeats if no map output copy finishes within "mapred.task.timeout"
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-547
>                 URL: http://issues.apache.org/jira/browse/HADOOP-547
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.6.2
>            Reporter: Sanjay Dahiya
>         Assigned To: Sanjay Dahiya
>
> In ReduceTaskRunner, the main loop that sends heartbeats waits on copyResults, which is
> notified only when a copy thread finishes copying. This can cause healthy reduce tasks
> that are still copying data to fail if no map output copy finishes within
> "mapred.task.timeout".
>
> ReduceTaskRunner.java:490
>
>     try {
>       copyResults.wait();    // <=========== unconditional wait
>     } catch (InterruptedException e) { }
>
> wait() should be called with a timeout, possibly taskTimeout/2, after which the loop
> should send a heartbeat and go back to waiting.
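
The proposed fix could look roughly like the sketch below. It is not the actual ReduceTaskRunner code: the class HeartbeatWaitSketch, the method awaitCopyResult, and the sendHeartbeat callback are hypothetical names used only for illustration, and waitMillis stands in for something like taskTimeout/2. Only the copyResults monitor and the idea of a timed wait come from the report.

```java
import java.util.List;

public class HeartbeatWaitSketch {

    /**
     * Blocks until copyResults holds at least one entry. Whenever waitMillis
     * passes without a result, the (hypothetical) sendHeartbeat callback runs,
     * so a slow copy no longer silences heartbeats.
     *
     * @return how many heartbeats were sent while waiting
     */
    public static int awaitCopyResult(List<String> copyResults,
                                      long waitMillis,
                                      Runnable sendHeartbeat) {
        if (waitMillis <= 0) {
            // wait(0) means "wait forever", which is exactly the reported bug.
            throw new IllegalArgumentException("waitMillis must be positive");
        }
        int heartbeatsSent = 0;
        synchronized (copyResults) {
            while (copyResults.isEmpty()) {
                try {
                    // Timed wait: returns on notify, on timeout, or spuriously.
                    copyResults.wait(waitMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt for the caller and stop waiting.
                    Thread.currentThread().interrupt();
                    return heartbeatsSent;
                }
                if (copyResults.isEmpty()) {
                    // Still nothing copied: report liveness, then wait again.
                    // A spurious wakeup only causes one extra, harmless heartbeat.
                    sendHeartbeat.run();
                    heartbeatsSent++;
                }
            }
        }
        return heartbeatsSent;
    }
}
```

A copy thread would add its result to copyResults and call notifyAll() while holding the same monitor. Note that this sketch sends the heartbeat while still holding the lock; if sending can block for long, moving it outside the synchronized block may be preferable.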