[ http://issues.apache.org/jira/browse/HADOOP-725?page=all ]

Milind Bhandarkar updated HADOOP-725:
-------------------------------------

    Status: Patch Available  (was: Open)

> chooseTargets method in FSNamesystem is very inefficient
> --------------------------------------------------------
>
>                 Key: HADOOP-725
>                 URL: http://issues.apache.org/jira/browse/HADOOP-725
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.9.0
>
>         Attachments: chooseTargets.patch
>
>
> Currently the chooseTargets method (which selects datanodes for 
> block placement) takes in excess of 20% of CPU on a namenode. According to 
> the profiler, this is the most time-consuming namenode method. This 
> inefficiency has already contributed to a cascading crash in DFS. As 
> datanodes went down, new locations had to be found for the blocks on the 
> dead datanodes. Since this was done inside a synchronized method, it locked 
> the whole namesystem for several minutes. That caused more datanode 
> failures, because the namenode marked datanodes dead when no heartbeat could 
> be processed during that interval. This has been detailed in HADOOP-572.
> The patch I am about to upload reduces the time taken in the chooseTarget 
> method to be proportional to nReplicas per block, instead of the current 
> implementation, which is proportional to (nDataNodes * nReplicas). Also, when 
> a number of datanodes crash, their blocks are put on the pendingReplications 
> list one datanode at a time in a synchronized section. (Currently, the 
> synchronized section processes ALL the dead datanodes, thus locking the 
> namesystem for a considerable amount of time.) This patch will also add a 
> unit test to check replication.
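
The two ideas in the description above can be illustrated with a minimal sketch. This is not the code in chooseTargets.patch, and it does not use the real FSNamesystem API. The class names TargetChooser and DeadNodeQueue, their methods, and the use of plain strings for datanodes and blocks are all hypothetical. The first class picks replica targets with a partial Fisher-Yates shuffle, so the work is proportional to nReplicas rather than to (nDataNodes * nReplicas). The second class takes the lock once per dead datanode instead of once for the whole batch, so other work (such as heartbeat processing) could acquire the lock in between.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not part of Hadoop.
final class TargetChooser {
    // Picks up to nReplicas distinct nodes using a partial Fisher-Yates
    // shuffle: one random swap per replica, no scan over all nodes.
    // Assumes a random-access list; the list is reordered in place.
    static <T> List<T> chooseTargets(List<T> nodes, int nReplicas, Random rng) {
        int k = Math.min(nReplicas, nodes.size());
        List<T> chosen = new ArrayList<>(k);
        for (int i = 0; i < k; i++) {
            // Pick a random index from the not-yet-chosen suffix [i, size).
            int j = i + rng.nextInt(nodes.size() - i);
            Collections.swap(nodes, i, j);
            chosen.add(nodes.get(i));
        }
        return chosen;
    }
}

// Hypothetical helper, not part of Hadoop.
final class DeadNodeQueue {
    private final Object lock = new Object();
    private final Deque<String> pendingReplications = new ArrayDeque<>();

    // Queues the blocks of each dead datanode under its own short
    // synchronized section, instead of holding the lock for all nodes.
    void processDeadNodes(Map<String, List<String>> blocksByDeadNode) {
        for (List<String> blocks : blocksByDeadNode.values()) {
            synchronized (lock) {
                pendingReplications.addAll(blocks);
            }
            // Lock is released here; other threads can take it between nodes.
        }
    }

    int pendingCount() {
        synchronized (lock) {
            return pendingReplications.size();
        }
    }
}
```

A real block-placement policy would also have to skip unsuitable nodes (for example, full or decommissioned ones), which this sketch ignores.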

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
