[jira] [Commented] (HDFS-14882) Consider DataNode load when #getBlockLocation

Istvan Fajth (Jira) Mon, 28 Oct 2019 06:03:46 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961023#comment-16961023
 ]


Istvan Fajth commented on HDFS-14882:
-------------------------------------

Hello [~hexiaoqiao],

I have been looking into the patch and the proposal. The changes look good 
and, as far as I can see, do what they promise. Still, I have one 
question/suggestion to consider as an alternative to the current approach for 
the case when dfs.namenode.read.considerLoad is set to true:
In NetworkTopology#sortByDistance, we already sort the nodes by network 
distance, and there is a shuffle for the nodes on the same level that strives 
to ensure some distribution of load. That shuffle can be regarded as a 
secondary sorting strategy, which we could inject at that point from outside. 
If we inject the secondary sorting from the DatanodeManager, then when 
read.considerLoad is turned on, we can inject a sort by transceiver count 
instead of the shuffle.

With this, we avoid calculating the network distance twice, and we also avoid 
shuffling and then sorting by transceiver count. I am posting a proposal just 
to demonstrate exactly what I have in mind. The JUnit test in patch-008 
passes with it; I have not tried other tests locally.
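
To make the idea concrete, here is a minimal, self-contained sketch. It is not 
the attached proposal and not the real NetworkTopology code: the Node class, 
the sortByDistance signature, and chooseSecondarySort are hypothetical 
stand-ins, and the network distance is assumed to be precomputed per node. It 
only shows how a secondary sorter can be handed in from outside and applied 
to each group of nodes at the same distance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

class SecondarySortSketch {

  /** Hypothetical stand-in for a DataNode descriptor. */
  static final class Node {
    final String name;
    final int distance;      // network distance from the reader (assumed precomputed)
    final int xceiverCount;  // active transceivers, used here as the load metric

    Node(String name, int distance, int xceiverCount) {
      this.name = name;
      this.distance = distance;
      this.xceiverCount = xceiverCount;
    }
  }

  /**
   * Sorts nodes in place by distance, then applies the injected secondary
   * sorter to every group of nodes that share the same distance.
   */
  static void sortByDistance(List<Node> nodes,
                             Consumer<List<Node>> secondarySort) {
    // TreeMap iterates its keys in ascending order, i.e. nearest group first.
    Map<Integer, List<Node>> byDistance = new TreeMap<>();
    for (Node n : nodes) {
      byDistance.computeIfAbsent(n.distance, d -> new ArrayList<>()).add(n);
    }
    nodes.clear();
    for (List<Node> sameLevel : byDistance.values()) {
      secondarySort.accept(sameLevel);  // shuffle or sort by load
      nodes.addAll(sameLevel);
    }
  }

  /** Picks the secondary sorter the way the caller side could do it. */
  static Consumer<List<Node>> chooseSecondarySort(boolean readConsiderLoad) {
    if (readConsiderLoad) {
      // Least loaded node first within the same distance level.
      return list ->
          list.sort(Comparator.comparingInt((Node n) -> n.xceiverCount));
    }
    // Default behaviour: random order within the same distance level.
    return Collections::shuffle;
  }
}
```

In this sketch the distance grouping happens once, and the caller only decides 
what happens inside each group, so there is no second pass over the distances 
and no shuffle that is later overwritten by a sort.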

Please share what you think about this approach. I would also be happy to get 
some feedback from you, [~ayushtkn] and [~elgoiri].

> Consider DataNode load when #getBlockLocation
> ---------------------------------------------
>
>                 Key: HDFS-14882
>                 URL: https://issues.apache.org/jira/browse/HDFS-14882
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Xiaoqiao He
>            Assignee: Xiaoqiao He
>            Priority: Major
>         Attachments: HDFS-14882.001.patch, HDFS-14882.002.patch, 
> HDFS-14882.003.patch, HDFS-14882.004.patch, HDFS-14882.005.patch, 
> HDFS-14882.006.patch, HDFS-14882.007.patch, HDFS-14882.008.patch
>
>
> Currently, we consider the load of a datanode in #chooseTarget for writers, 
> but not for readers. Thus, the process slots of a datanode can be occupied 
> by #BlockSender for readers, the disk/network becomes busy, and reads then 
> hit slow-node exceptions. IIRC the same case has been reported several 
> times. Based on this, I propose to consider load for readers in the same way 
> as #chooseTarget does for writers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
