[ http://issues.apache.org/jira/browse/HADOOP-180?page=all ]

Owen O'Malley updated HADOOP-180:
---------------------------------

    Attachment: task-cleanup-thread.patch

This patch fixes the timeouts by creating a synchronized queue (we really 
should go to java 1.5 soon) of tasks that need to be cleaned up and a daemon 
thread that does it in the background.
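For readers without the patch at hand, here is a minimal sketch of the idea: a queue of pending cleanups drained by a background daemon thread. It is not the code in task-cleanup-thread.patch. The names TaskCleanupSketch, Cleanable and addToCleanup are hypothetical, and it assumes Java 1.5's java.util.concurrent, which the patch itself could not rely on.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: hypothetical names, not the classes touched by the patch.
class TaskCleanupSketch {
    /** Hypothetical stand-in for a finished task whose state must be removed. */
    interface Cleanable {
        void cleanup() throws Exception;
    }

    // Thread-safe, unbounded FIFO queue of tasks waiting to be cleaned up.
    private final BlockingQueue<Cleanable> queue =
        new LinkedBlockingQueue<Cleanable>();

    private final Thread worker;

    TaskCleanupSketch() {
        worker = new Thread(new Runnable() {
            public void run() {
                while (true) {
                    try {
                        // Blocks until a task is queued, then cleans it up.
                        queue.take().cleanup();
                    } catch (InterruptedException e) {
                        return; // interrupted: stop the worker
                    } catch (Exception e) {
                        // One failed cleanup should not kill the worker.
                        System.err.println("cleanup failed: " + e);
                    }
                }
            }
        }, "taskCleanup");
        // Daemon thread: it does not keep the JVM alive at shutdown.
        worker.setDaemon(true);
        worker.start();
    }

    /** Called from the heartbeat path; returns immediately. */
    void addToCleanup(Cleanable task) {
        queue.add(task);
    }
}
```

The point of the design is that the thread sending heartbeats only enqueues work, so a slow cleanup can no longer delay a heartbeat.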

It also fixes some race conditions in the TaskTracker on the tasks variable. 
(Some references were locking the TaskTracker and others were locking the 
TaskTracker.TaskInProgress.)
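As a rough illustration of that kind of fix (again not the patch itself; TrackerLockingSketch and its methods are hypothetical), the sketch below guards a shared map with a single monitor. Two threads that synchronize on different objects do not exclude each other, so every access to the shared variable has to go through the same lock.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not the real TaskTracker.
class TrackerLockingSketch {
    // HashMap is not thread-safe; every access below holds the monitor
    // of this TrackerLockingSketch instance, and no other lock.
    private final Map<String, String> tasks = new HashMap<String, String>();

    synchronized void addTask(String id, String state) {
        tasks.put(id, state);
    }

    synchronized String removeTask(String id) {
        return tasks.remove(id);
    }

    synchronized int taskCount() {
        return tasks.size();
    }
}
```

If one of these methods instead synchronized on some other object, such as an inner per-task object, it could run concurrently with the others and corrupt the map.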

I also changed the RPC logging a little to include both client and server time 
measurements.

> task tracker times out cleaning big job
> ---------------------------------------
>
>          Key: HADOOP-180
>          URL: http://issues.apache.org/jira/browse/HADOOP-180
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Versions: 0.1.1
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.3
>  Attachments: task-cleanup-thread.patch
>
> After completing a big job (63,920 maps, 1880 reduces, 188 nodes), lots of 
> the TaskTrackers timed out because the task cleanup is handled by the same 
> thread as the heartbeats.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
