[ https://issues.apache.org/jira/browse/HDFS-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770120#comment-13770120 ]
Arpit Agarwal commented on HDFS-4990:
-------------------------------------

Nice patch! A couple of points which we can handle in a separate Jira, so +1 from me.

In {{BlockPlacementPolicyDefault#isGoodTarget}} and {{#addIfIsGoodTarget}} there seems to be a bug. They do not check the usage on the Storage, so they could keep picking a full storage. The DataNode will have to send per-storage usage information to the NN.

Also, we should probably try to round-robin the storages. It looks like we will just try to put replicas on the first storage returned by {{DatanodeDescriptor#getStorageInfos}}.

> Block placement for storage types
> ---------------------------------
>
>                 Key: HDFS-4990
>                 URL: https://issues.apache.org/jira/browse/HDFS-4990
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Suresh Srinivas
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: h4990_20130909.patch, h4990_20130916.patch, h4990_20130917b.patch, h4990_20130917c.patch, h4990_20130917.patch
>
>
> Currently, block placement decisions for writes are made based on:
> # Datanode load (number of transceivers)
> # Space left on datanode
> # Topology
> With storage abstraction, the namenode must choose a storage instead of a datanode for block placement. It also needs to consider storage type, load on the storage, etc.
> As an additional benefit, HDFS currently supports heterogeneous nodes (nodes with different numbers of spindles, etc.) poorly. This work should help solve that issue as well.
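The two follow-up points in the comment above (skip a storage that lacks space, and rotate across a datanode's storages instead of always taking the first) could look roughly like the Java sketch below. This is only an illustration under stated assumptions: {{StorageSketch}}, {{StoragePickerSketch}}, and {{remainingBytes}} are hypothetical names, not HDFS classes, and a real check in {{BlockPlacementPolicyDefault}} would weigh more factors (load, topology, storage type).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the per-storage state the NN would track
// once DataNodes report usage per storage. Not an HDFS class.
final class StorageSketch {
    final String id;
    final long remainingBytes;

    StorageSketch(String id, long remainingBytes) {
        this.id = id;
        this.remainingBytes = remainingBytes;
    }
}

// Hypothetical picker: round-robins over a datanode's storages and
// skips any storage that cannot hold one more block.
final class StoragePickerSketch {
    private int next = 0; // round-robin cursor into the storage list

    // Returns the first storage at or after the cursor with enough
    // remaining space, or null if no storage qualifies.
    StorageSketch pick(List<StorageSketch> storages, long blockSize) {
        int n = storages.size();
        for (int i = 0; i < n; i++) {
            int idx = (next + i) % n;
            StorageSketch s = storages.get(idx);
            if (s.remainingBytes >= blockSize) {
                next = (idx + 1) % n; // start after this one next time
                return s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<StorageSketch> storages = Arrays.asList(
            new StorageSketch("s0", 0L),     // full storage
            new StorageSketch("s1", 200L),
            new StorageSketch("s2", 200L));
        StoragePickerSketch picker = new StoragePickerSketch();
        for (int i = 0; i < 3; i++) {
            System.out.println(picker.pick(storages, 100L).id);
        }
    }
}
```

In this sketch the full storage {{s0}} is never chosen, and successive picks alternate between {{s1}} and {{s2}} rather than always landing on the first entry in the list.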