[ https://issues.apache.org/jira/browse/HADOOP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508684 ]

dhruba borthakur commented on HADOOP-1448:
------------------------------------------

This problem becomes worse when there are map-reduce nodes that do not run 
"dfs". *All* accesses from these nodes go to the first replica of every block. 

One proposal is that if the client does not belong to a dfs cluster, then the 
getBlockLocations call returns all replicas in somewhat *random* order. 
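A minimal sketch of that idea (hypothetical helper, not actual HDFS code), assuming the namenode can tell whether the calling client belongs to the dfs cluster:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the proposal above: for clients outside the dfs cluster,
// return the replica list in random order instead of distance-sorted,
// so reads spread across all replicas rather than hammering the first.
public class ReplicaOrdering {
    public static List<String> orderReplicas(List<String> replicas,
                                             boolean clientInCluster) {
        List<String> result = new ArrayList<>(replicas);
        if (!clientInCluster) {
            // off-cluster client: no meaningful distance, so shuffle
            Collections.shuffle(result);
        }
        // (for in-cluster clients the namenode would still sort by
        // network distance; that path is omitted from this sketch)
        return result;
    }
}
```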

> Setting the replication factor of a file too high causes namenode cpu overload
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-1448
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1448
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>
> The replication factor of a file is set to 300 (on an 800-node cluster). Then 
> all mappers try to open this file. For every open call that the namenode 
> receives from each of these 800 clients, it sorts all the replicas of the 
> block(s) based on the distance from the client. This causes CPU usage 
> overload on the namenode.
> One proposal is to make the namenode return a non-sorted list of datanodes to 
> the client. Information about each replica also contains the rack on which 
> that replica resides. The client can look at the replicas to determine if 
> there is a copy on the local node. If not, then it can find out if there is a 
> replica on the local rack. If not then it can choose a replica at random.
> This proposal is scalable because the sorting and selection of replicas is 
> done by the client rather than the Namenode.
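The client-side selection described in the quoted proposal can be sketched as follows (hypothetical types and names, not actual HDFS code), assuming each replica record carries its host and rack:

```java
import java.util.List;
import java.util.Random;

// Sketch of the proposed client-side replica choice: the namenode
// returns an unsorted replica list with rack info, and the client
// prefers a node-local copy, then a rack-local copy, else random.
public class ReplicaSelector {
    // Minimal stand-in for the replica info the namenode would return.
    public static class Replica {
        public final String host;
        public final String rack;
        public Replica(String host, String rack) {
            this.host = host;
            this.rack = rack;
        }
    }

    public static Replica choose(List<Replica> replicas,
                                 String localHost, String localRack,
                                 Random rng) {
        for (Replica r : replicas) {
            if (r.host.equals(localHost)) return r;  // copy on local node
        }
        for (Replica r : replicas) {
            if (r.rack.equals(localRack)) return r;  // copy on local rack
        }
        // no local or rack-local copy: pick one at random
        return replicas.get(rng.nextInt(replicas.size()));
    }
}
```

The point of the design is that this loop runs once per client, so the per-open sorting cost moves off the namenode.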

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
