[jira] [Work logged] (HDFS-16158) Discover datanodes with unbalanced volume usage by the standard deviation

ASF GitHub Bot (Jira) Tue, 31 Aug 2021 17:38:09 -0700


     [ 
https://issues.apache.org/jira/browse/HDFS-16158?focusedWorklogId=644690&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-644690
 ]


ASF GitHub Bot logged work on HDFS-16158:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Sep/21 00:37
            Start Date: 01/Sep/21 00:37
    Worklog Time Spent: 10m 
      Work Description: tomscut commented on pull request #3288:
URL: https://github.com/apache/hadoop/pull/3288#issuecomment-909761516


   Hi @jojochuang , this PR has a lot of changes, which can make rolling 
updates difficult. I re-implemented this feature and will submit another PR 
later. Thank you for your review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 644690)
    Time Spent: 1h 40m  (was: 1.5h)

> Discover datanodes with unbalanced volume usage by the standard deviation 
> --------------------------------------------------------------------------
>
>                 Key: HDFS-16158
>                 URL: https://issues.apache.org/jira/browse/HDFS-16158
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2021-08-11-10-14-58-430.png
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Discover datanodes with unbalanced volume usage by the standard deviation
> Several scenarios can leave a datanode with unbalanced disk usage:
> 1. A damaged disk is repaired and brought back online.
> 2. Disks are added to some datanodes.
> 3. Some disks are damaged, which slows down writes to them.
> 4. A custom volume choosing policy is in use.
> When disk usage is unbalanced, a sudden increase in datanode write 
> traffic may keep the disks with low usage busy with I/O, which reduces 
> throughput across datanodes.
> In this case, we need to find these nodes promptly so that we can run 
> diskBalance or take other action. From the usage of each volume on a 
> datanode, we can calculate the standard deviation of the volume usage. The 
> more unbalanced the volumes, the higher the standard deviation.
> To avoid adding load to the namenode, we can calculate the standard 
> deviation on the datanode side, send it to the namenode in the heartbeat, 
> and display the result on the namenode web UI. We can then sort by this 
> value on the web UI to find the nodes whose volume usage is unbalanced.
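
The calculation described above can be sketched as follows. This is a minimal illustration, not the code from the pull request: the class VolumeUsageStdDev, the Volume holder, and the method usageStdDev are hypothetical names, and two choices are assumptions the issue does not specify, namely that "volume usage" means used bytes divided by capacity and that the population (not sample) standard deviation is used.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and the usage definition are assumptions.
class VolumeUsageStdDev {

  /** Hypothetical holder for one volume's figures, in bytes. */
  static final class Volume {
    final long capacity;
    final long dfsUsed;

    Volume(long capacity, long dfsUsed) {
      this.capacity = capacity;
      this.dfsUsed = dfsUsed;
    }

    /** Fraction of the volume in use; 0 for a zero-capacity volume. */
    double usageRatio() {
      return capacity == 0 ? 0.0 : (double) dfsUsed / capacity;
    }
  }

  /** Population standard deviation of the per-volume usage ratios. */
  static double usageStdDev(List<Volume> volumes) {
    if (volumes.isEmpty()) {
      return 0.0;
    }
    double mean = 0.0;
    for (Volume v : volumes) {
      mean += v.usageRatio();
    }
    mean /= volumes.size();

    double sumSquares = 0.0;
    for (Volume v : volumes) {
      double diff = v.usageRatio() - mean;
      sumSquares += diff * diff;
    }
    return Math.sqrt(sumSquares / volumes.size());
  }

  public static void main(String[] args) {
    // Three volumes at the same usage: deviation is zero.
    List<Volume> balanced = Arrays.asList(
        new Volume(100, 50), new Volume(100, 50), new Volume(100, 50));
    // Two nearly full volumes plus one recently added, mostly empty volume.
    List<Volume> unbalanced = Arrays.asList(
        new Volume(100, 90), new Volume(100, 90), new Volume(100, 10));

    System.out.println("balanced:   " + usageStdDev(balanced));
    System.out.println("unbalanced: " + usageStdDev(unbalanced));
  }
}
```

A datanode with evenly filled volumes yields a value of zero, while the node with a recently added empty disk yields a clearly larger value, so sorting datanodes by this number in descending order brings the unbalanced ones to the top.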



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
