[ https://issues.apache.org/jira/browse/HDFS-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17797380#comment-17797380 ]
ASF GitHub Bot commented on HDFS-17292:
---------------------------------------

huangzhaobo99 opened a new pull request, #6364:
URL: https://github.com/apache/hadoop/pull/6364

### Description of PR
1. Record how many times the DatanodeManager#slowPeerCollectorDaemon thread has collected each SlowNode, and display the counts in a map structure.
2. The same SlowNode may appear repeatedly in the production environment, so when slowPeerCollectorDaemon is enabled, record the number of times each node has been collected by the thread. If a node is collected too frequently, SRE or DEV need to repair the machine.
3. The following figure shows the SlowNodes collected by the slowPeerCollectorDaemon thread at different time periods. (If "DataNodeWriteXceiversCount" is 0, the node has no write requests, which indicates a SlowNode.)

<img width="1213" alt="image" src="https://github.com/apache/hadoop/assets/63718681/f383874b-3f05-4d04-b963-c6c9430d2836">

### How was this patch tested?
Added a unit test.

> Show the number of times the slowPeerCollectorDaemon thread has collected SlowNodes.
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-17292
>                 URL: https://issues.apache.org/jira/browse/HDFS-17292
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: huangzhaobo99
>            Assignee: huangzhaobo99
>            Priority: Major
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
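
For readers unfamiliar with the idea, the per-node collection counter described in the PR description can be sketched roughly as follows. This is a minimal illustration and not the code from the pull request: the class `SlowNodeCollectCounter`, its methods `recordRound` and `snapshot`, and the use of plain `host:port` strings as node identifiers are all assumptions made for the example.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical helper (not taken from the PR): counts how many collection
 * rounds have reported each node as slow.
 */
class SlowNodeCollectCounter {
  // Key: node identifier (assumed "host:port"); value: rounds in which it was reported.
  private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

  /** Call once per collection round with the nodes reported as slow in that round. */
  void recordRound(Collection<String> slowNodes) {
    // De-duplicate so a node counts at most once per round.
    for (String node : new HashSet<>(slowNodes)) {
      counts.computeIfAbsent(node, k -> new LongAdder()).increment();
    }
  }

  /** Returns a sorted point-in-time copy, suitable for display as a map. */
  Map<String, Long> snapshot() {
    Map<String, Long> copy = new TreeMap<>();
    counts.forEach((node, n) -> copy.put(node, n.sum()));
    return copy;
  }
}
```

In this sketch a collector thread would call `recordRound` after each pass, and a metrics or JMX endpoint would expose `snapshot()`. A node whose count keeps growing across rounds is the kind of machine the description suggests SRE or DEV should inspect.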