[jira] Updated: (MAPREDUCE-1213) TaskTrackers restart is very slow because it deletes distributed cache directory synchronously

Zheng Shao (JIRA) Thu, 14 Jan 2010 14:28:19 -0800

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Zheng Shao updated MAPREDUCE-1213:
----------------------------------

    Attachment: MAPREDUCE-1213.branch-0.20.2.patch

Removed unnecessary changes for Hadoop 0.20.

> TaskTrackers restart is very slow because it deletes distributed cache 
> directory synchronously
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1213
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1213
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: dhruba borthakur
>            Assignee: Zheng Shao
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1213.1.patch, MAPREDUCE-1213.2.patch, 
> MAPREDUCE-1213.3.patch, MAPREDUCE-1213.4.patch, 
> MAPREDUCE-1213.branch-0.20.2.patch, MAPREDUCE-1213.branch-0.20.patch
>
>
> We are seeing that when we restart a TaskTracker, it tries to recursively
> delete all the files in the distributed cache. It invokes
> FileUtil.fullyDelete(), which is very slow. This means that the
> TaskTracker cannot join the cluster for an extended period of time (up to 2
> hours for us). The problem is acute when the distributed cache holds a few
> thousand files.
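
For illustration only, here is a minimal sketch of one common way to keep a slow recursive delete off the startup path: rename the directory aside (cheap on a single filesystem), recreate it empty, and delete the old tree on a background thread. This is not the code in the attached patches. The names AsyncCacheCleaner, clearAsync and trashRoot are hypothetical, it uses plain java.nio.file instead of Hadoop's FileUtil, and it assumes trashRoot lives on the same filesystem as the cache directory so that the move is a rename.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Hadoop.
final class AsyncCacheCleaner {
    // Single daemon thread so pending deletes never block JVM shutdown.
    private final ExecutorService deleter = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "cache-deleter");
        t.setDaemon(true);
        return t;
    });

    /**
     * Renames cacheDir into trashRoot, recreates cacheDir empty, and
     * schedules the old tree for deletion. Returns as soon as the rename
     * is done; the Future completes when the background delete finishes.
     * Assumption: trashRoot is on the same filesystem as cacheDir.
     */
    Future<?> clearAsync(Path cacheDir, Path trashRoot) throws IOException {
        Files.createDirectories(trashRoot);
        // Unique name so repeated restarts do not collide in trashRoot.
        Path doomed = trashRoot.resolve(cacheDir.getFileName() + "." + System.nanoTime());
        Files.move(cacheDir, doomed, StandardCopyOption.ATOMIC_MOVE);
        Files.createDirectories(cacheDir);
        return deleter.submit(() -> {
            deleteTree(doomed);
            return null;
        });
    }

    // Recursive delete: files first, then each directory once it is empty.
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

With this shape, the caller can continue its startup as soon as clearAsync returns. The cost of the delete is unchanged, it is only moved to a background thread.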

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
