[ 
https://issues.apache.org/jira/browse/HDFS-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396991#comment-17396991
 ] 

tomscut commented on HDFS-16158:
--------------------------------

We can sort directly to find the nodes on the Web where the volumes usages are 
unbalanced.

!image-2021-08-11-10-14-58-430.png|width=602,height=213!

> Discover datanodes with unbalanced volume usage by the standard deviation 
> --------------------------------------------------------------------------
>
>                 Key: HDFS-16158
>                 URL: https://issues.apache.org/jira/browse/HDFS-16158
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2021-08-11-10-14-58-430.png, namenode-web.jpg
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Discover datanodes with unbalanced volume usage by the standard deviation
> In some scenarios, we may cause unbalanced datanode disk usage:
> 1. Repair the damaged disk and make it online again.
> 2. Add disks to some Datanodes.
> 3. Some disks are damaged, resulting in slow data writing.
> 4. Use some custom volume choosing policies.
> In the case of unbalanced disk usage, a sudden increase in datanode write 
> traffic may result in busy disk I/O with low volume usage, resulting in 
> decreased throughput across datanodes.
> In this case, we need to find these nodes in time to do diskBalance, or other 
> processing. Based on the volume usage of each datanode, we can calculate the 
> standard deviation of the volume usage. The more unbalanced the volume, the 
> higher the standard deviation.
> To prevent the namenode from being too busy, we can calculate the standard 
> variance on the datanode side, transmit it to the namenode through heartbeat, 
> and display the result on the Web of namenode. We can then sort directly to 
> find the nodes on the Web where the volumes usages are unbalanced.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to