neils-dev commented on code in PR #3781:
URL: https://github.com/apache/ozone/pull/3781#discussion_r1004992451
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/DatanodeAdminMonitorImpl.java:
##########
@@ -168,6 +225,43 @@ public int getTrackedNodeCount() {
return trackedNodes.size();
}
+ synchronized void setMetricsToGauge() {
+ metrics.setTrackedContainersUnhealthyTotal(unhealthyContainers);
+ metrics.setTrackedRecommissionNodesTotal(trackedRecommission);
+ metrics.setTrackedDecommissioningMaintenanceNodesTotal(
+ trackedDecomMaintenance);
+ metrics.setTrackedContainersUnderReplicatedTotal(
+ underReplicatedContainers);
+ metrics.setTrackedContainersSufficientlyReplicatedTotal(
+ sufficientlyReplicatedContainers);
+ metrics.setTrackedPipelinesWaitingToCloseTotal(pipelinesWaitingToClose);
+ for (Map.Entry<String, Long> e :
+ pipelinesWaitingToCloseByHost.entrySet()) {
+ metrics.metricRecordPipelineWaitingToCloseByHost(e.getKey(),
+ e.getValue());
+ }
+ for (Map.Entry<String, ContainerStateInWorkflow> e :
Review Comment:
Thanks. Let's go forward with this implementation, which works for JMX
metrics, to complete this PR (_expose decommission / maintenance metrics via
JMX_), and open a new Jira to look into supporting the Prometheus endpoint.
This PR adds metrics that track the decommission and maintenance workflow,
both as aggregated counts and as per-datanode-host counts. The new Jira will
track the prom endpoint behavior for these metrics. What do you think?
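
For reference, here is a minimal sketch of the two metric shapes discussed above: one aggregated gauge plus a gauge per datanode host. It is plain Java, not the Hadoop metrics2 API used in this PR. The class name `DecomMetricsSketch` and its methods are hypothetical, and deriving the total from the per-host map is an assumption made only to keep the example short.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a metrics holder; not the class used in the PR.
class DecomMetricsSketch {
  // Aggregated gauge: pipelines waiting to close across all tracked nodes.
  private volatile long pipelinesWaitingToCloseTotal;
  // Per-host gauges, keyed by datanode host name.
  private final Map<String, Long> pipelinesWaitingToCloseByHost =
      new ConcurrentHashMap<>();

  // Replace the published values with the latest snapshot from a monitor run.
  // Assumption: the total is the sum of the per-host values.
  synchronized void publish(Map<String, Long> snapshotByHost) {
    long total = 0;
    pipelinesWaitingToCloseByHost.clear();
    for (Map.Entry<String, Long> e : snapshotByHost.entrySet()) {
      pipelinesWaitingToCloseByHost.put(e.getKey(), e.getValue());
      total += e.getValue();
    }
    pipelinesWaitingToCloseTotal = total;
  }

  long getPipelinesWaitingToCloseTotal() {
    return pipelinesWaitingToCloseTotal;
  }

  // Hosts that are not tracked report zero.
  long getPipelinesWaitingToCloseForHost(String host) {
    return pipelinesWaitingToCloseByHost.getOrDefault(host, 0L);
  }
}
```

Clearing the map on each publish means a host that leaves the workflow stops reporting a stale value, which may or may not match how the real gauges should behave.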
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]