wangzhixiang created HDFS-15560:
-----------------------------------

             Summary: The getMaxNodesPerRack May Cause "Failed to place enough 
replicas"
                 Key: HDFS-15560
                 URL: https://issues.apache.org/jira/browse/HDFS-15560
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: wangzhixiang
            Assignee: wangzhixiang


In our HDFS cluster, the number of nodes per rack is extremely uneven.

E.g. rack1=[1 node], rack2=[1 node], rack3=[3 nodes], rack4=[5 nodes],
rack5=[4 nodes], rack6=[4 nodes], for 18 nodes in total.

When the getMaxNodesPerRack method is invoked with totalNumOfReplicas = 18 and 
numOfRacks = 6, we get MaxNodesPerRack = 4 from 
MaxNodesPerRack = (totalNumOfReplicas-1)/numOfRacks + 2 
(integer division: 17/6 + 2 = 2 + 2 = 4).

The replication factor of some files in our cluster is set to 50, so they are 
allocated 18 replicas, which requires every node in the cluster. However, rack4 
can contribute only 4 of its 5 nodes because MaxNodesPerRack = 4. As a result, 
only 17 (1+1+3+4+4+4) replicas are chosen, and the warning "Failed to place 
enough replicas, still in need of 1 to reach 18" is logged.
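
The arithmetic above can be reproduced with a small standalone sketch. This is 
not the HDFS placement code: the class name MaxNodesPerRackSketch and the helper 
placeable() are hypothetical, and the only thing taken from the report is the 
formula and the rack sizes. Capping the requested replication (50) at the 
cluster size (18) is assumed here because the report says 18 replicas are 
allocated.

```java
import java.util.Arrays;

// Hypothetical illustration of the numbers in this report; not HDFS code.
public class MaxNodesPerRackSketch {

    // Formula as quoted in the report (Java integer division).
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 2;
    }

    // Upper bound on replicas that can be placed when every rack may
    // contribute at most `cap` nodes (one replica per node).
    static int placeable(int[] rackSizes, int cap) {
        int sum = 0;
        for (int size : rackSizes) {
            sum += Math.min(size, cap);
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] rackSizes = {1, 1, 3, 5, 4, 4};              // rack1..rack6
        int clusterSize = Arrays.stream(rackSizes).sum();  // 18 nodes
        int requested = 50;                                // file replication
        // Assumption: the request is capped at the number of nodes.
        int total = Math.min(requested, clusterSize);      // 18

        int cap = maxNodesPerRack(total, rackSizes.length);
        int placed = placeable(rackSizes, cap);

        System.out.println("maxNodesPerRack = " + cap);
        System.out.println("placeable = " + placed + " of " + total);
        System.out.println("still in need of " + (total - placed));
    }
}
```

With the rack sizes from this report, the cap is 4 and at most 17 of the 18 
replicas fit, leaving a shortfall of 1, which matches the warning above.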

In addition, ReplicationMonitor adds the file as ReplicationWork to retry, and 
the retry keeps failing in a loop.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
