jojochuang commented on PR #10382:
URL: https://github.com/apache/ozone/pull/10382#issuecomment-4724799229
After comparing it against the existing dashboards in the dashboards
directory, here is the duplication and overlap analysis:
### 1. No Pre-existing "SCM Overview" Dashboard
Previously, unlike the Ozone Manager (which has Ozone - OM Overview.json),
the Storage Container Manager (SCM) did not have a dedicated, service-level
overview dashboard. The only SCM-specific dashboard was
Ozone - SCM Safemode.json, which focuses exclusively on safe mode metrics.
### 2. Minor Overlap: Ozone - Datanode Decommission and Maintenance.json
• Overlap: The new SCM dashboard includes a Replication manager workload
(cmds / s) panel under the
Container replication/deletion/ec-reconstruction/ec-deletion row. This panel
queries SCM Replication Manager command counts (such as
replication_manager_metrics_replication_cmds_sent_total and
replication_manager_metrics_deletion_cmds_sent_total).
• Distinction: The decommission dashboard contains much more granular
panels mapping replication manager queues, inflight replication/deletion tasks,
supervisor status, and measured replicator transfer/failure rates on
datanodes. The decommission dashboard remains highly specialized, while the
overview dashboard provides a consolidated SCM-wide workload view.
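
To illustrate how a "cmds / s" value can be derived from a cumulative counter such as replication_manager_metrics_replication_cmds_sent_total, here is a minimal Python sketch. It is an assumption that the panel computes a per-second rate over a time window; the actual panel query is not quoted in this comment. The function name and the sample values are hypothetical.

```python
from typing import List, Tuple


def per_second_rate(samples: List[Tuple[float, float]]) -> float:
    """Average per-second increase of a cumulative counter over the window.

    samples: (timestamp_seconds, counter_value) pairs, oldest first.
    A drop in value is treated as a counter reset (for example after a
    process restart), so the post-reset value counts as new increase.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be strictly increasing overall")
    return increase / elapsed


# Hypothetical scrape values, 30 seconds apart (not real measurements).
samples = [(0.0, 100.0), (30.0, 160.0), (60.0, 220.0)]
rate = per_second_rate(samples)

# Same idea with a counter reset between the first and second sample.
reset_samples = [(0.0, 100.0), (30.0, 10.0), (60.0, 40.0)]
reset_rate = per_second_rate(reset_samples)
```

This is a simplification: it averages over the whole window and does not extrapolate to window boundaries the way a monitoring backend might.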
### 3. Functional Overlap: Ozone - JVM Metrics.json & Ozone - Memory Consumption Metrics.json
• Overlap: The new dashboard contains a JVM row monitoring SCM's CPU
load, heap/non-heap usage, GC metrics, thread counts, netty direct memory, and
jetty server threads specifically for StorageContainerManager.
• Distinction: This directly replicates SCM-specific panels found in the
global JVM and Memory dashboards. This redundancy is standard design practice
to keep the overview dashboard self-contained (matching the pattern used
in the OM Overview dashboard).
### 4. No Overlap: Ozone - SCM Safemode.json
• The new dashboard does not contain safe mode state-timeline, rule
durations, or thresholds. Thus, it does not duplicate the Safemode dashboard.
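
An overlap check of this kind can be approximated with a short script. The sketch below is an illustration, not the method used for the analysis above. It assumes the usual Grafana dashboard JSON layout (a top-level "panels" list, row panels that may nest further "panels", and query strings under "targets" as "expr"). The regular expression is a heuristic that only finds metric names followed by a label selector or a range selector, and the second metric name and the label values in the example are hypothetical.

```python
import json
import re
from typing import Dict, Iterator, Set

# Heuristic: an identifier immediately followed by "{" or "[" is taken to be
# a metric name. Bare metric names without a selector are missed.
METRIC_RE = re.compile(r"([a-zA-Z_:][a-zA-Z0-9_:]*)\s*[{\[]")


def load_dashboard(path: str) -> Dict:
    """Read a dashboard JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def iter_exprs(dashboard: Dict) -> Iterator[str]:
    """Yield every query expression, descending into nested row panels."""
    stack = list(dashboard.get("panels", []))
    while stack:
        panel = stack.pop()
        stack.extend(panel.get("panels", []))
        for target in panel.get("targets", []):
            expr = target.get("expr")
            if expr:
                yield expr


def metric_names(dashboard: Dict) -> Set[str]:
    """Collect the metric names referenced by a dashboard's queries."""
    names: Set[str] = set()
    for expr in iter_exprs(dashboard):
        names.update(METRIC_RE.findall(expr))
    return names


def shared_metrics(first: Dict, second: Dict) -> Set[str]:
    """Metric names queried by both dashboards."""
    return metric_names(first) & metric_names(second)


# Tiny in-memory stand-ins for two dashboards (hypothetical content).
overview = {"panels": [{"type": "row", "panels": [{"targets": [
    {"expr": 'rate(replication_manager_metrics_replication_cmds_sent_total{job="scm"}[1m])'},
]}]}]}
decommission = {"panels": [{"targets": [
    {"expr": 'replication_manager_metrics_replication_cmds_sent_total{job="scm"}'},
    {"expr": 'hypothetical_inflight_replication{job="scm"}'},
]}]}

overlap = shared_metrics(overview, decommission)
```

With real files, the two dictionaries would come from load_dashboard, for example on the new SCM dashboard and on Ozone - SCM Safemode.json. An empty result would be consistent with the "No Overlap" finding, within the limits of the heuristic.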
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]