[ 
https://issues.apache.org/jira/browse/HDFS-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443276#comment-17443276
 ] 

Janus Chow edited comment on HDFS-16320 at 11/14/21, 9:41 AM:
--------------------------------------------------------------

[~hexiaoqiao]  Thank you for the review.

The issue we hit is that clients writing to a slow node took a very long time 
to finish writing even a normal file.

After checking the metrics, we found that setting 
"dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled" to true keeps 
new pipelines from being created on slow nodes. That works fine for new 
clients, but clients whose pipeline already includes the slow node still have 
to put up with it. (The slow node may even have been reported by this 
pipeline.)

While a client is writing data, only the client and the datanodes communicate, 
so even if the NameNode knows that a datanode in the pipeline is slow, the 
client cannot do much to avoid it. Our proposal is to let datanodes get this 
information through heartbeats; during a write, a datanode can then report it 
to the client, and the client can choose to rebuild the pipeline to improve 
write performance.
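
To make the proposed flow concrete, here is a rough sketch in plain Java. It 
is only an illustration under assumptions: the class and method names 
(DataNodeState, onHeartbeatResponse, ackSlowFlag, nodesToReplace) are 
hypothetical and are not existing HDFS APIs, and it assumes the slow-node list 
reaches the datanode in the heartbeat reply and is passed to the client as a 
flag on the write ack.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: none of these types or methods exist in HDFS.
final class SlowNodePipelineSketch {

    /** Hypothetical datanode-side state: the slow-node set learned from the NameNode. */
    static final class DataNodeState {
        private final String id;
        private Set<String> slowNodes = new HashSet<>();

        DataNodeState(String id) {
            this.id = id;
        }

        /** Assumed hook: invoked with the slow-node ids carried in a heartbeat reply. */
        void onHeartbeatResponse(Set<String> slowNodesFromNameNode) {
            slowNodes = new HashSet<>(slowNodesFromNameNode);
        }

        /** Assumed ack extension: true if this datanode is currently flagged as slow. */
        boolean ackSlowFlag() {
            return slowNodes.contains(id);
        }
    }

    /** Client side: collect the pipeline members whose acks carry the slow flag. */
    static List<String> nodesToReplace(List<DataNodeState> pipeline) {
        List<String> flagged = new ArrayList<>();
        for (DataNodeState dn : pipeline) {
            if (dn.ackSlowFlag()) {
                flagged.add(dn.id);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<DataNodeState> pipeline = List.of(
                new DataNodeState("dn1"),
                new DataNodeState("dn2"),
                new DataNodeState("dn3"));

        // The NameNode already considers dn2 slow; every datanode hears it via heartbeat.
        Set<String> slowAccordingToNameNode = Set.of("dn2");
        for (DataNodeState dn : pipeline) {
            dn.onHeartbeatResponse(slowAccordingToNameNode);
        }

        // A non-empty result is the client's cue to rebuild the pipeline.
        System.out.println(nodesToReplace(pipeline)); // prints [dn2]
    }
}
```

Whether the client actually rebuilds the pipeline once a node is flagged would 
stay the client's choice, as described above.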



> Datanode retrieve slownode information from NameNode
> ----------------------------------------------------
>
>                 Key: HDFS-16320
>                 URL: https://issues.apache.org/jira/browse/HDFS-16320
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Janus Chow
>            Assignee: Janus Chow
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, slow-node information is reported by the reportingNode and stored 
> in the NameNode.
> This ticket lets the slow node retrieve that information from the NameNode, 
> so that it can take other performance-improving actions based on it.


