[ https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589252#comment-13589252 ]
Devaraj Das commented on HDFS-2576:
-----------------------------------

Good discussion points. However, do note that this mechanism is best effort, and if possible I'd like to avoid major surgery in HDFS for this to work. What is currently in the patch seems to be a simple approach that is not too intrusive in the HDFS layer (it should improve the average failure cases, and the worst cases may be unaffected). The case in point here is HBase, which may be the only major user of the API introduced in the patch (and maybe we can mark the API private/evolving with HBase as the audience).

Even after implementing an ideal solution, locality would still be best effort (imagine cases where the HDFS balancer ends up doing a not-so-good job because it has to deal with practical physical/disk constraints). Offline discussions with [~sureshms] indicated that the balancer would have to be changed significantly if we want to do this in HDFS, since the balancer works with blocks and doesn't know which directory owns which blocks. For now, if the balancer (which is run manually for HDFS) is too problematic, we can change the balancer implementation so that it doesn't move blocks belonging to certain directories (the block-to-path information seems to be there).

BTW, another point is that applications like HBase may sometimes want to control the placement of blocks for multi-tenancy reasons, load-balancing reasons, etc. Maybe at some point both mechanisms for favored nodes in HDFS (client driven and namenode policy) will be required (the policy, if configured, always wins in case of conflicts).

> Namenode should have a favored nodes hint to enable clients to have control over block placement.
> -------------------------------------------------------------------------------------------------
>
> Key: HDFS-2576
> URL: https://issues.apache.org/jira/browse/HDFS-2576
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Pritam Damania
> Attachments: hdfs-2576-1.txt
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on which they wish to place the blocks of a file, for a higher level of locality. For this purpose there needs to be a way to give the Namenode a hint, in the form of a favoredNodes parameter, about the locations where the client wants to put each block. The proposed solution is a favored nodes parameter in the addBlock() method and in the create() file method, so that clients can give the NameNode hints about the locations of each replica of the block. Note that this would be just a hint; in the end the NameNode would look at disk usage, datanode load, etc. and decide whether it can respect the hints or not.
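
To make the "just a hint" semantics concrete, here is a minimal, self-contained sketch of best-effort target selection. It is not the code in the attached patch and does not use any HDFS class: FavoredNodesSketch, NodeState, chooseTargets, and both thresholds are hypothetical names and values assumed only for illustration. Favored nodes are taken first when they pass the disk-usage and load checks, and the remaining replicas fall back to any other usable datanode.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not the HDFS-2576 patch or any HDFS API. */
public class FavoredNodesSketch {

    /** Hypothetical per-datanode state a namenode might consult. */
    public static final class NodeState {
        final double diskUsedFraction; // 0.0 to 1.0
        final int activeTransfers;     // rough load indicator

        public NodeState(double diskUsedFraction, int activeTransfers) {
            this.diskUsedFraction = diskUsedFraction;
            this.activeTransfers = activeTransfers;
        }
    }

    // Assumed thresholds, chosen only for this example.
    static final double MAX_DISK_USED = 0.90;
    static final int MAX_TRANSFERS = 8;

    /** A node is usable if it is known, not too full, and not too busy. */
    static boolean isUsable(NodeState state) {
        return state != null
                && state.diskUsedFraction < MAX_DISK_USED
                && state.activeTransfers < MAX_TRANSFERS;
    }

    /**
     * Picks up to {@code replication} targets. Favored nodes are only a hint:
     * unknown, full, or overloaded favored nodes are silently skipped.
     */
    public static List<String> chooseTargets(List<String> favoredNodes,
                                             Map<String, NodeState> cluster,
                                             int replication) {
        Set<String> chosen = new LinkedHashSet<>();

        // Pass 1: honor the client's hints where the node checks allow it.
        for (String node : favoredNodes) {
            if (chosen.size() >= replication) {
                break;
            }
            if (isUsable(cluster.get(node))) {
                chosen.add(node);
            }
        }

        // Pass 2: fill the remaining replicas from any other usable node,
        // in the iteration order of the cluster map.
        for (Map.Entry<String, NodeState> entry : cluster.entrySet()) {
            if (chosen.size() >= replication) {
                break;
            }
            if (isUsable(entry.getValue())) {
                chosen.add(entry.getKey());
            }
        }
        return new ArrayList<>(chosen);
    }
}
```

A caller would pass its computed favored nodes together with the desired replication factor. The result may contain none of the favored nodes if they all fail the checks, which is the best-effort behaviour described in the comment and in the issue description. A configured namenode policy that overrides client hints, as suggested in the comment, would slot in as an extra filter before the first pass.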