[ https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469168#comment-16469168 ]
Rushabh S Shah commented on HDFS-13448: --------------------------------------- [~belugabehr]: Could you please add some test cases for this enhancement ? > HDFS Block Placement - Ignore Locality for First Block Replica > -------------------------------------------------------------- > > Key: HDFS-13448 > URL: https://issues.apache.org/jira/browse/HDFS-13448 > Project: Hadoop HDFS > Issue Type: New Feature > Components: block placement, hdfs-client > Affects Versions: 2.9.0, 3.0.1 > Reporter: BELUGA BEHR > Assignee: BELUGA BEHR > Priority: Minor > Attachments: HDFS-13448.1.patch, HDFS-13448.2.patch, > HDFS-13448.3.patch, HDFS-13448.4.patch, HDFS-13448.5.patch > > > According to the HDFS Block Place Rules: > {quote} > /** > * The replica placement strategy is that if the writer is on a datanode, > * the 1st replica is placed on the local machine, > * otherwise a random datanode. The 2nd replica is placed on a datanode > * that is on a different rack. The 3rd replica is placed on a datanode > * which is on a different node of the rack as the second replica. > */ > {quote} > However, there is a hint for the hdfs-client that allows the block placement > request to not put a block replica on the local datanode _where 'local' means > the same host as the client is being run on._ > {quote} > /** > * Advise that a block replica NOT be written to the local DataNode where > * 'local' means the same host as the client is being run on. > * > * @see CreateFlag#NO_LOCAL_WRITE > */ > {quote} > I propose that we add a new flag that allows the hdfs-client to request that > the first block replica be placed on a random DataNode in the cluster. The > subsequent block replicas should follow the normal block placement rules. > The issue is that when the {{NO_LOCAL_WRITE}} is enabled, the first block > replica is not placed on the local node, but it is still placed on the local > rack. Where this comes into play is where you have, for example, a flume > agent that is loading data into HDFS. > If the Flume agent is running on a DataNode, then by default, the DataNode > local to the Flume agent will always get the first block replica and this > leads to un-even block placements, with the local node always filling up > faster than any other node in the cluster. > Modifying this example, if the DataNode is removed from the host where the > Flume agent is running, or this {{NO_LOCAL_WRITE}} is enabled by Flume, then > the default block placement policy will still prefer the local rack. This > remedies the situation only so far as now the first block replica will always > be distributed to a DataNode on the local rack. > This new flag would allow a single Flume agent to distribute the blocks > randomly, evenly, over the entire cluster instead of hot-spotting the local > node or the local rack. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org