[ https://issues.apache.org/jira/browse/HDFS-17305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
huangzhaobo99 reassigned HDFS-17305:
------------------------------------

    Assignee: huangzhaobo99

> Add avoid datanode reason count related metrics to namenode.
> ------------------------------------------------------------
>
>                 Key: HDFS-17305
>                 URL: https://issues.apache.org/jira/browse/HDFS-17305
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: huangzhaobo99
>            Assignee: huangzhaobo99
>            Priority: Minor
>
> The NameNode currently supports slow-node avoidance and load-based avoidance
> when choosing DataNodes, implemented mainly in the
> BlockPlacementPolicyDefault class.
> 1. When an exclusion condition is triggered, the NameNode logs the reason,
> and these logs can be used to troubleshoot anomalies. The code is as follows:
> {code:java}
> ...
> if (!node.isInService()) {
>   logNodeIsNotChosen(node, NodeNotChosenReason.NOT_IN_SERVICE);
>   return false;
> }
> if (avoidStaleNodes) {
>   if (node.isStale(this.staleInterval)) {
>     logNodeIsNotChosen(node, NodeNotChosenReason.NODE_STALE);
>     return false;
>   }
> }
> ...{code}
> 2. When an exclusion condition is triggered, can we also record it in a
> metric and count the total number of exclusions?
> 3. These metrics could then be viewed through Prometheus + Grafana to observe
> how the cluster is behaving when selecting DataNodes.
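
To make point 2 concrete, here is a minimal sketch of per-reason counting. It is not the Hadoop implementation or a proposed patch: the class `AvoidReasonCounters` and its `record`/`count` methods are hypothetical, and the `Reason` enum is a stand-in for `NodeNotChosenReason` that lists only the two values shown in the snippet above. It assumes a counter would be incremented at the same place where `logNodeIsNotChosen` is called today.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper (not part of Hadoop): counts how often each exclusion reason fires. */
class AvoidReasonCounters {

  /** Stand-in for NodeNotChosenReason; only the two reasons quoted in the issue. */
  enum Reason { NOT_IN_SERVICE, NODE_STALE }

  // The map is fully populated in the constructor and never modified afterwards,
  // so concurrent callers only read it; the LongAdder values handle concurrent increments.
  private final Map<Reason, LongAdder> counters = new EnumMap<>(Reason.class);

  AvoidReasonCounters() {
    for (Reason r : Reason.values()) {
      counters.put(r, new LongAdder());
    }
  }

  /** Would be called wherever a node is rejected, next to the existing log call. */
  void record(Reason reason) {
    counters.get(reason).increment();
  }

  /** Total number of exclusions recorded for the given reason. */
  long count(Reason reason) {
    return counters.get(reason).sum();
  }

  public static void main(String[] args) {
    AvoidReasonCounters c = new AvoidReasonCounters();
    c.record(Reason.NODE_STALE);
    c.record(Reason.NODE_STALE);
    c.record(Reason.NOT_IN_SERVICE);
    for (Reason r : Reason.values()) {
      System.out.println(r + "=" + c.count(r));
    }
  }
}
```

A real change would more likely expose these totals through the NameNode's existing metrics system so that they reach Prometheus as described in point 3; the plain `LongAdder` map here only illustrates the one-counter-per-reason idea.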