[ https://issues.apache.org/jira/browse/HDFS-17913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18078572#comment-18078572 ]
ASF GitHub Bot commented on HDFS-17913:
---------------------------------------
ZanderXu commented on PR #8462:
URL: https://github.com/apache/hadoop/pull/8462#issuecomment-4384661133
Merged. Thanks @CapMoon for your contribution. Thanks @Hexiaoqiao for your
review.
> Dead DataNode in Host2NodesMap can break block location sorting
> ---------------------------------------------------------------
>
> Key: HDFS-17913
> URL: https://issues.apache.org/jira/browse/HDFS-17913
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.4.3
> Reporter: Yue Wang
> Assignee: Yue Wang
> Priority: Major
> Labels: pull-request-available
>
> When HeartbeatManager#heartbeatCheck removes a dead DataNode via
> DatanodeManager#removeDeadDatanode, the node is removed from NetworkTopology,
> but it may still be returned by host2DatanodeMap.
> If an HDFS client is co-located on the same host/IP as that dead DataNode,
> DatanodeManager#sortLocatedBlock may treat the client as a DataNode reader.
> Since the descriptor has already been removed from NetworkTopology, its
> parent is null, and NetworkTopology#sortByDistance can compute incorrect
> weights for replicas. This may cause rack locality to be lost, especially
> when dfs.namenode.read.considerLoad=true.
> Expected behavior:
> A DataNode descriptor detached from NetworkTopology should not be treated as
> a DataNode reader.
> Proposed fix:
> In DatanodeManager#sortLocatedBlock, ignore a host-map hit whose topology
> parent is null.
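
For readers who want to see the shape of the proposed check, here is a minimal
sketch. It is not the code merged in the PR and does not use Hadoop classes:
DetachedReaderSketch, Node, register, markDetached and resolveReader are
hypothetical stand-ins for host2DatanodeMap, the DataNode descriptor and the
reader lookup in DatanodeManager#sortLocatedBlock. The only idea it takes from
the description above is that a host-map hit with a null topology parent is not
used as the reader.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the lookup described in the report; not Hadoop code. */
final class DetachedReaderSketch {

  /** Hypothetical minimal topology node: a name plus a parent link. */
  static final class Node {
    final String name;
    Node parent; // null once the node is detached from the topology tree

    Node(String name, Node parent) {
      this.name = name;
      this.parent = parent;
    }
  }

  /** Stand-in for host2DatanodeMap: host/IP -> descriptor. */
  private final Map<String, Node> hostMap = new HashMap<>();

  void register(Node dn) {
    hostMap.put(dn.name, dn);
  }

  /** Models the reported state: detached from topology, still in the host map. */
  void markDetached(Node dn) {
    dn.parent = null;
  }

  /**
   * Returns the descriptor to use as the reader, or null when the client
   * should not be treated as a DataNode reader.
   */
  Node resolveReader(String clientHost) {
    Node hit = hostMap.get(clientHost);
    if (hit != null && hit.parent == null) {
      // Stale hit: the descriptor is no longer attached to the topology.
      return null;
    }
    return hit;
  }

  public static void main(String[] args) {
    DetachedReaderSketch sketch = new DetachedReaderSketch();
    Node rack = new Node("/rack1", null);
    Node dn = new Node("10.0.0.5", rack);
    sketch.register(dn);

    System.out.println(sketch.resolveReader("10.0.0.5") == dn);   // attached node
    sketch.markDetached(dn);
    System.out.println(sketch.resolveReader("10.0.0.5") == null); // stale hit ignored
  }
}
```

In this sketch a null result means "fall back to whatever handling a
non-DataNode client receives"; how that fallback computes the client's network
location is outside the scope of the example.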