[ http://issues.apache.org/jira/browse/HADOOP-547?page=all ]

Sanjay Dahiya reassigned HADOOP-547:
------------------------------------

    Assignee: Sanjay Dahiya

> ReduceTaskRunner can miss sending heartbeats if no map output copy finishes 
> within "mapred.task.timeout"
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-547
>                 URL: http://issues.apache.org/jira/browse/HADOOP-547
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.6.2
>            Reporter: Sanjay Dahiya
>         Assigned To: Sanjay Dahiya
>
> In ReduceTaskRunner, the main loop that sends heartbeats waits on copyResults, 
> which is notified only when a copy thread finishes copying. As a result, a 
> healthy reduce task that is still copying data can fail if no map output copy 
> completes within "mapred.task.timeout". 
> ReduceTaskRunner.java:490
>         try {
>           copyResults.wait();   // <=== unconditional wait
>         } catch (InterruptedException e) { }
> The wait() call should take a timeout, possibly taskTimeout/2, after which the 
> loop should send a heartbeat and go back to waiting. 
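
The suggested fix can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual ReduceTaskRunner code: the class TimedWaitSketch, the methods sendHeartbeat(), addResult() and waitForCopy(), and the String result type are all hypothetical names chosen for the example. Only copyResults, the taskTimeout/2 interval, and the idea of a timed wait come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a timed wait on copyResults; not the real Hadoop class.
class TimedWaitSketch {
    // Stand-in for the shared list that copy threads append to (assumed shape).
    private final List<String> copyResults = new ArrayList<>();
    private int heartbeats = 0;

    // Hypothetical placeholder for reporting progress to the task tracker.
    private void sendHeartbeat() {
        heartbeats++;
    }

    int getHeartbeats() {
        return heartbeats;
    }

    // Called by a copy thread when one map output has been copied.
    void addResult(String result) {
        synchronized (copyResults) {
            copyResults.add(result);
            copyResults.notifyAll();
        }
    }

    // Waits for the next copy result, waking at least every taskTimeout/2
    // milliseconds so a heartbeat is sent even when no copy has finished.
    String waitForCopy(long taskTimeoutMs) throws InterruptedException {
        long waitMs = Math.max(1, taskTimeoutMs / 2);
        synchronized (copyResults) {
            while (copyResults.isEmpty()) {
                copyResults.wait(waitMs);   // timed wait instead of wait()
                sendHeartbeat();            // sent after every wake-up
            }
            return copyResults.remove(0);
        }
    }
}
```

The loop re-checks copyResults.isEmpty() after every wake-up, so a timeout or a spurious wake-up only costs one extra heartbeat before the thread goes back to waiting.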

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
