[ https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981732#comment-15981732 ]
Ruslan Dautkhanov commented on HDFS-8131:
-----------------------------------------

Thanks for this great improvement!

When using AvailableSpaceBlockPlacementPolicy, does the default logic below no longer apply?
{quote}
1. Place the first replica somewhere: either on a random rack and node (if the HDFS client is outside the Hadoop cluster) or on the local node (if the HDFS client is running on a node inside the cluster).
2. The second replica is written to a different rack from the first, chosen at random.
3. The third replica is written to the same rack as the second replica, but on a different node.
4. If there are more replicas, spread them across the rest of the racks.
{quote}
What is the placement logic now with respect to rack awareness? Is it based purely on available space, so that the rack awareness logic doesn't kick in?

> Implement a space balanced block placement policy
> -------------------------------------------------
>
>                 Key: HDFS-8131
>                 URL: https://issues.apache.org/jira/browse/HDFS-8131
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>              Labels: BlockPlacementPolicy
>             Fix For: 2.8.0, 3.0.0-alpha1
>
>         Attachments: balanced.png, HDFS-8131.004.patch, HDFS-8131.005.patch,
> HDFS-8131.006.patch, HDFS-8131-v1.diff, HDFS-8131-v2.diff, HDFS-8131-v3.diff
>
>
> The default block placement policy chooses datanodes for new blocks
> randomly, which results in an unbalanced used-space percentage among datanodes
> after a cluster expansion. The old datanodes always have a high used-space
> percentage, while newly added ones have a low percentage.
> Though we can use the external balancer tool to even out the used-space percentage,
> it costs extra network I/O, and it is not easy to control the balancing speed.
> An easy solution is to implement a balanced block placement policy that
> chooses datanodes with a low used-space percentage for new blocks with a
> slightly higher probability. Before long, the used-space percentages of the
> datanodes will tend to become balanced.
> Suggestions and discussions are welcome. Thanks

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
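
The idea in the issue description, choosing datanodes with a low used-space percentage "with a slightly higher probability", can be illustrated with a small sketch. This is not the code from the attached patches. The function name `pick_less_used`, the two-candidate comparison, and the `preference` parameter are all assumptions made for illustration. Rack awareness is ignored here.

```python
import random


def pick_less_used(candidates, preference=0.6, rng=random):
    """Pick one node from `candidates`, a dict of node name -> used percent.

    Hypothetical helper: draws two random candidates and favors the one
    with the lower used percent with probability `preference`.
    """
    # Draw two distinct candidates uniformly at random.
    a, b = rng.sample(sorted(candidates), 2)
    if candidates[a] == candidates[b]:
        # Equal usage: nothing to prefer, keep the first draw.
        return a
    low, high = (a, b) if candidates[a] < candidates[b] else (b, a)
    # Bias the choice toward the emptier node, but keep it probabilistic
    # so that fuller nodes still receive some new blocks.
    return low if rng.random() < preference else high
```

With `preference` at 0.5 this degenerates to a uniform random choice between the two candidates. Values only slightly above 0.5 shift new blocks toward emptier nodes gradually, which matches the "slightly higher probability" wording in the description.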