[ 
https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589207#comment-13589207
 ] 

stack commented on HDFS-2576:
-----------------------------

Nice writeup Jon.

bq. I'm not clear what group-member-count part of the group-member-count vs 
replication count concern is.

A "block affinity group" is made up of members.  If the count of members is 
greater than the replication setting, then we could assign a 'region' to a 
'block affinity group' member that doesn't actually have the blocks local 
(because this member is beyond the replication count; e.g. if replication is 3 
and the group has 10 members, then 7 times out of 10 we will land on a node 
where the block is not local).
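
To make that arithmetic concrete, here is a minimal sketch. It assumes each 
block has exactly `replication` replicas, all on distinct group members, and 
that a region is equally likely to be assigned to any member. The function 
name is made up for illustration and is not part of HDFS or HBase.

```python
from fractions import Fraction


def non_local_fraction(group_size: int, replication: int) -> Fraction:
    """Fraction of group members that hold no replica of a given block.

    Assumes exactly `replication` replicas per block, each on a distinct
    member of the group, and uniform assignment of regions to members.
    """
    if group_size <= 0 or replication <= 0:
        raise ValueError("group_size and replication must be positive")
    # Members beyond the replication count cannot hold a local replica.
    without_replica = max(group_size - replication, 0)
    return Fraction(without_replica, group_size)


print(non_local_fraction(10, 3))  # 7/10
```

With a group no larger than the replication setting the fraction drops to 
zero, which is the case where every member can serve the block locally.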

Outline #1 is how it should work.  Outline #2 would arrive much faster methinks 
and could do as a stopgap until Outline #1 shows up (Can we ask the hdfs block 
balancer to bypass directories?).  Rack awareness wouldn't be too bad.  Skew we 
could live with (probably), and ditto for datanodes coming and going (the NN 
would not let us select them as favored nodes, and if the application is paying 
attention, it'll stop suggesting dead DNs).
                
> Namenode should have a favored nodes hint to enable clients to have control 
> over block placement.
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2576
>                 URL: https://issues.apache.org/jira/browse/HDFS-2576
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Pritam Damania
>         Attachments: hdfs-2576-1.txt
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on 
> which they wish to place the blocks of a file, for a higher level of 
> locality. For this purpose, there needs to be a way to give the NameNode a 
> hint, in the form of a favoredNodes parameter, about the locations where the 
> client wants to put each block. The proposed solution is a favored nodes 
> parameter in the addBlock() method and in the create() file method, so that 
> clients can give hints to the NameNode about the locations of each 
> replica of the block. Note that this would be just a hint; ultimately the 
> NameNode would look at disk usage, datanode load, etc. and decide whether it 
> can respect the hints or not.
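
The "just a hint" behaviour in the description can be sketched roughly as 
below. This is a toy model, not the NameNode's actual placement policy: the 
`DataNode` class, the `choose_targets` function and the `max_disk_used` 
threshold are all hypothetical names chosen for illustration, and rack 
awareness and load are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataNode:
    # Hypothetical, simplified view of a datanode.
    name: str
    alive: bool = True
    disk_used: float = 0.0  # fraction of disk in use, 0.0 to 1.0


def choose_targets(favored, all_nodes, replication, max_disk_used=0.9):
    """Pick up to `replication` targets, preferring favored nodes.

    A favored node is only honoured if it is alive and not too full;
    any remaining slots are filled from the other candidates in order.
    """
    def usable(node):
        return node.alive and node.disk_used <= max_disk_used

    targets = []
    # First pass over the hints, second pass over everything else.
    for candidates in (favored, all_nodes):
        for node in candidates:
            if len(targets) == replication:
                return targets
            if usable(node) and node not in targets:
                targets.append(node)
    return targets


dn1 = DataNode("dn1")
dn2 = DataNode("dn2", alive=False)  # dead favored node, hint is ignored
dn3 = DataNode("dn3")
dn4 = DataNode("dn4")
picked = choose_targets([dn1, dn2], [dn1, dn2, dn3, dn4], replication=3)
print([n.name for n in picked])  # ['dn1', 'dn3', 'dn4']
```

The point of the two passes is that a bad hint never blocks the write: a dead 
or full favored node simply falls through to the ordinary candidates.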

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
