[ https://issues.apache.org/jira/browse/HDFS-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
tomscut resolved HDFS-16158.
----------------------------
    Resolution: Abandoned

> Discover datanodes with unbalanced volume usage by the standard deviation
> --------------------------------------------------------------------------
>
>                 Key: HDFS-16158
>                 URL: https://issues.apache.org/jira/browse/HDFS-16158
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2021-08-11-10-14-58-430.png
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Discover datanodes with unbalanced volume usage by the standard deviation
> In some scenarios, datanode disk usage can become unbalanced:
> 1. A damaged disk is repaired and brought back online.
> 2. Disks are added to some datanodes.
> 3. Some disks are damaged, resulting in slow data writing.
> 4. A custom volume choosing policy is in use.
> When disk usage is unbalanced, a sudden increase in datanode write
> traffic may result in busy disk I/O on the volumes with low usage,
> decreasing throughput across datanodes.
> In this case, we need to find these nodes in time to do diskBalance, or other
> processing. Based on the usage of each volume on a datanode, we can calculate
> the standard deviation of the volume usage. The more unbalanced the volumes,
> the higher the standard deviation.
> To prevent the namenode from being too busy, we can calculate the standard
> deviation on the datanode side, transmit it to the namenode through heartbeat,
> and display the result on the namenode web UI. We can then sort by this value
> in the web UI to find the nodes whose volume usage is unbalanced.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
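
The per-datanode calculation described in the issue can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the class VolumeUsageStdDev, the Volume holder, and usageStdDev are hypothetical names, usage is assumed to be used bytes divided by capacity, and the population form of the standard deviation (dividing by the number of volumes) is an assumption, since the issue does not say which form is meant.

```java
import java.util.Arrays;
import java.util.List;

public class VolumeUsageStdDev {

    /** Hypothetical holder for one volume's numbers; not an HDFS class. */
    public static final class Volume {
        final long usedBytes;
        final long capacityBytes;

        public Volume(long usedBytes, long capacityBytes) {
            this.usedBytes = usedBytes;
            this.capacityBytes = capacityBytes;
        }

        /** Fraction of the volume in use, in [0, 1]; 0 for an empty-capacity volume. */
        double usage() {
            return capacityBytes == 0 ? 0.0 : (double) usedBytes / capacityBytes;
        }
    }

    /**
     * Population standard deviation of per-volume usage on one datanode.
     * A value near 0 means the volumes are evenly filled; larger values
     * mean usage is more unbalanced.
     */
    public static double usageStdDev(List<Volume> volumes) {
        if (volumes.isEmpty()) {
            return 0.0;
        }
        double mean = 0.0;
        for (Volume v : volumes) {
            mean += v.usage();
        }
        mean /= volumes.size();

        double sumOfSquares = 0.0;
        for (Volume v : volumes) {
            double diff = v.usage() - mean;
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / volumes.size());
    }

    public static void main(String[] args) {
        // Two evenly filled volumes versus one nearly full and one nearly empty.
        List<Volume> balanced = Arrays.asList(new Volume(50, 100), new Volume(50, 100));
        List<Volume> unbalanced = Arrays.asList(new Volume(90, 100), new Volume(10, 100));
        System.out.println("balanced:   " + usageStdDev(balanced));
        System.out.println("unbalanced: " + usageStdDev(unbalanced));
    }
}
```

Under the proposal, each datanode would compute one such number locally and report it with its heartbeat, so the namenode only has to store and sort a single value per node.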