[jira] [Commented] (HDFS-13448) HDFS Block Placement - Ignore Locality for First Block Replica

BELUGA BEHR (JIRA) Mon, 16 Apr 2018 09:01:25 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439637#comment-16439637
 ]


BELUGA BEHR commented on HDFS-13448:
------------------------------------

Team, some of the failed unit tests were related to timeouts.  I will 
re-run the tests to see if it's simply a matter of flaky tests.  However, one 
failure was related to a new configuration in the HDFS service.  Also, a couple 
of Checkstyle errors were generated regarding long lines.  However, given the 
current format for naming configurations, the limit is hard to work around, and 
squeezing the names in provides little value.  Few, if any, of the current 
configurations adhere to this line-length restriction.

Regarding the XML configuration, I will add the missing entry to 
hdfs-default.xml if you agree with the approach that I have proposed here.

{code}
java.lang.AssertionError: interface org.apache.hadoop.hdfs.client.HdfsClientConfigKeys interface org.apache.hadoop.hdfs.client.HdfsClientConfigKeys$StripedRead class org.apache.hadoop.hdfs.DFSConfigKeys interface org.apache.hadoop.hdfs.client.HdfsClientConfigKeys$BlockWrite$ReplaceDatanodeOnFailure has 1 variables missing in hdfs-default.xml Entries:   dfs.namenode.block-placement-policy.default.allow-ignore-locality expected:<0> but was:<1>
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.failNotEquals(Assert.java:743)
        at org.junit.Assert.assertEquals(Assert.java:118)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.apache.hadoop.conf.TestConfigurationFieldsBase.testCompareConfigurationClassAgainstXml(TestConfigurationFieldsBase.java:493)
{code}
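
For anyone unfamiliar with this failure, the check amounts to a set difference between the keys declared in the configuration classes and the keys present in hdfs-default.xml. The following is only a minimal sketch of that idea, not the actual TestConfigurationFieldsBase code; the class and method names are hypothetical, and the key sets are hand-written for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class ConfigKeyCoverage {
    /**
     * Returns the keys declared in code that have no entry in the XML defaults.
     * Hypothetical helper; the real test discovers both key sets by reflection
     * and by parsing the XML file.
     */
    static Set<String> missingInXml(Set<String> declaredKeys, Set<String> xmlKeys) {
        Set<String> missing = new TreeSet<>(declaredKeys);
        missing.removeAll(xmlKeys);
        return missing;
    }

    public static void main(String[] args) {
        // Keys as they might be declared in the config classes (illustrative subset).
        Set<String> declared = new TreeSet<>(Arrays.asList(
            "dfs.replication",
            "dfs.namenode.block-placement-policy.default.allow-ignore-locality"));
        // Keys as they might appear in hdfs-default.xml (illustrative subset).
        Set<String> xml = new TreeSet<>(Arrays.asList("dfs.replication"));

        // The test expects this set to be empty; here it holds the one new key.
        System.out.println(missingInXml(declared, xml));
    }
}
```

Adding the new key to hdfs-default.xml would make the difference empty again, which is why the assertion reports "expected:<0> but was:<1>".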

> HDFS Block Placement - Ignore Locality for First Block Replica
> --------------------------------------------------------------
>
>                 Key: HDFS-13448
>                 URL: https://issues.apache.org/jira/browse/HDFS-13448
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: block placement, hdfs-client
>    Affects Versions: 2.9.0, 3.0.1
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>         Attachments: HDFS-13448.1.patch, HDFS-13448.2.patch
>
>
> According to the HDFS Block Placement Rules:
> {quote}
> /**
>  * The replica placement strategy is that if the writer is on a datanode,
>  * the 1st replica is placed on the local machine, 
>  * otherwise a random datanode. The 2nd replica is placed on a datanode
>  * that is on a different rack. The 3rd replica is placed on a datanode
>  * which is on a different node of the rack as the second replica.
>  */
> {quote}
> However, there is a hint for the hdfs-client that allows the block placement 
> request to not put a block replica on the local datanode _where 'local' means 
> the same host as the client is being run on._
> {quote}
>   /**
>    * Advise that a block replica NOT be written to the local DataNode where
>    * 'local' means the same host as the client is being run on.
>    *
>    * @see CreateFlag#NO_LOCAL_WRITE
>    */
> {quote}
> I propose that we add a new flag that allows the hdfs-client to request that 
> the first block replica be placed on a random DataNode in the cluster.  The 
> subsequent block replicas should follow the normal block placement rules.
> The issue is that when {{NO_LOCAL_WRITE}} is enabled, the first block 
> replica is not placed on the local node, but it is still placed on the local 
> rack.  This matters when you have, for example, a Flume agent that is 
> loading data into HDFS.
> If the Flume agent is running on a DataNode, then by default, the DataNode 
> local to the Flume agent will always get the first block replica.  This 
> leads to uneven block placement, with the local node always filling up 
> faster than any other node in the cluster.
> Modifying this example, if the DataNode is removed from the host where the 
> Flume agent is running, or {{NO_LOCAL_WRITE}} is enabled by Flume, then 
> the default block placement policy will still prefer the local rack.  This 
> remedies the situation only partially: the first block replica is now always 
> placed on a DataNode on the local rack.
> This new flag would allow a single Flume agent to distribute the blocks 
> randomly and evenly across the entire cluster instead of hot-spotting the 
> local node or the local rack.
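
The hot-spotting described in the issue can be illustrated with a small, self-contained simulation. This is a deliberately simplified sketch and not the logic of the real default block placement policy: it models a single writer on node 0 of rack 0, ignores load, capacity, and fallback behaviour, and all class, enum, and method names are hypothetical. `LOCAL_NODE` stands for the default behaviour when the writer runs on a DataNode, `LOCAL_RACK` for the behaviour with {{NO_LOCAL_WRITE}}, and `RANDOM_CLUSTER` for the proposed flag.

```java
import java.util.Random;

class FirstReplicaPlacementSketch {
    /** Simplified first-replica strategies (hypothetical names). */
    enum Mode { LOCAL_NODE, LOCAL_RACK, RANDOM_CLUSTER }

    /**
     * Counts first-replica placements per node for one writer on node 0.
     * Node i lives on rack i / nodesPerRack, so the writer's rack is rack 0.
     */
    static int[] simulate(Mode mode, int racks, int nodesPerRack, int blocks, long seed) {
        if (racks < 1 || nodesPerRack < 2) {
            throw new IllegalArgumentException("need >= 1 rack and >= 2 nodes per rack");
        }
        Random rnd = new Random(seed);
        int[] counts = new int[racks * nodesPerRack];
        for (int b = 0; b < blocks; b++) {
            int target;
            switch (mode) {
                case LOCAL_NODE:
                    // Writer's own DataNode always takes the first replica.
                    target = 0;
                    break;
                case LOCAL_RACK:
                    // Any node on rack 0 except the writer's own node.
                    target = 1 + rnd.nextInt(nodesPerRack - 1);
                    break;
                default:
                    // Proposed behaviour: any node in the cluster.
                    target = rnd.nextInt(counts.length);
                    break;
            }
            counts[target]++;
        }
        return counts;
    }

    /** Number of nodes that received at least one first replica. */
    static int nodesUsed(int[] counts) {
        int used = 0;
        for (int c : counts) {
            if (c > 0) {
                used++;
            }
        }
        return used;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            int[] counts = simulate(mode, 3, 4, 10_000, 42L);
            System.out.println(mode + ": " + nodesUsed(counts)
                + " of " + counts.length + " nodes used");
        }
    }
}
```

In this model, `LOCAL_NODE` concentrates every first replica on one node and `LOCAL_RACK` confines them to the writer's rack, while `RANDOM_CLUSTER` can reach every node in the cluster.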



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
