sodonnel commented on PR #3781:
URL: https://github.com/apache/ozone/pull/3781#issuecomment-1263753876

   The changes look good, but I think it would be much more useful if we could 
track metrics at the decommissioning node level too, i.e.:
   
   ```
   TotalTrackedContainersUnderReplicatedForHostname = xyz
   ```
   
   I had a look at the ReplicationManagerMetric class, and in there is an 
example of how to form a metric "on the fly" using:
   
   ```
   private static final MetricsInfo INFLIGHT_REPLICATION = Interns.info(
         "InflightReplication",
         "Tracked inflight container replication requests.");
   ```
   
   I think it should be possible to store the counts per hostname in a map or 
list, and then, when the metrics are snapshotted, form dynamic metric names for 
the host-level under / over / unhealthy container counts.
   
   We should also keep the aggregate metrics. These host-level metrics would let 
people see whether one host is stuck or all are making progress.
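
   As a rough sketch of the idea (not the PR's code): the class name, method 
names, and every metric name other than the `TotalTrackedContainersUnderReplicatedFor` 
prefix are hypothetical, and I am reading "over" as over-replicated. It keeps 
the counts per hostname in a map and builds the metric names at snapshot time. 
In a real metrics source, each entry would presumably be emitted with something 
like `builder.addGauge(Interns.info(name, description), value)` rather than 
returned as a map.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical holder for per-host decommission container counts. */
class HostLevelDecommissionMetrics {

  /** Immutable counts for one decommissioning host. */
  static final class HostCounts {
    final long underReplicated;
    final long overReplicated;
    final long unhealthy;

    HostCounts(long underReplicated, long overReplicated, long unhealthy) {
      this.underReplicated = underReplicated;
      this.overReplicated = overReplicated;
      this.unhealthy = unhealthy;
    }
  }

  // Assumed metric name prefixes; the per-host name is prefix + "For" + hostname.
  private static final String UNDER = "TotalTrackedContainersUnderReplicated";
  private static final String OVER = "TotalTrackedContainersOverReplicated";
  private static final String UNHEALTHY = "TotalTrackedContainersUnhealthy";

  private final Map<String, HostCounts> countsByHost = new ConcurrentHashMap<>();

  /** Replace the counts for a host, e.g. once per monitor run. */
  void update(String hostname, long under, long over, long unhealthy) {
    countsByHost.put(hostname, new HostCounts(under, over, unhealthy));
  }

  /** Stop reporting a host, e.g. when its decommission completes. */
  void remove(String hostname) {
    countsByHost.remove(hostname);
  }

  /**
   * Build metric name to value pairs: three dynamic names per host,
   * plus the three aggregate totals across all tracked hosts.
   */
  Map<String, Long> snapshot() {
    Map<String, Long> out = new TreeMap<>();
    long totalUnder = 0;
    long totalOver = 0;
    long totalUnhealthy = 0;
    for (Map.Entry<String, HostCounts> e : countsByHost.entrySet()) {
      String host = e.getKey();
      HostCounts c = e.getValue();
      out.put(UNDER + "For" + host, c.underReplicated);
      out.put(OVER + "For" + host, c.overReplicated);
      out.put(UNHEALTHY + "For" + host, c.unhealthy);
      totalUnder += c.underReplicated;
      totalOver += c.overReplicated;
      totalUnhealthy += c.unhealthy;
    }
    // The aggregate metrics are kept alongside the host-level ones.
    out.put(UNDER, totalUnder);
    out.put(OVER, totalOver);
    out.put(UNHEALTHY, totalUnhealthy);
    return out;
  }
}
```

   Removing a host from the map when it leaves the decommission workflow keeps 
stale host-level names out of later snapshots.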


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

