[
https://issues.apache.org/jira/browse/HDDS-15324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrey Yarovoy reassigned HDDS-15324:
-------------------------------------
Assignee: Andrey Yarovoy
> [Ozone Dashboard] Create a dashboard that shows DataNode performance metrics
> ----------------------------------------------------------------------------
>
> Key: HDDS-15324
> URL: https://issues.apache.org/jira/browse/HDDS-15324
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Andrey Yarovoy
> Assignee: Andrey Yarovoy
> Priority: Major
>
> Create a dashboard that shows DataNode performance metrics:
> h3. JVM (HddsDatanode)
> * CPU: JVM process load vs. system load on the selected DataNode hosts.
> * Heap: used, committed, and max heap memory.
> * Garbage collection: how much CPU time GC consumes and how often collections
> happen.
> * Netty: direct (off-heap) buffer usage vs. the configured maximum.
> * Threads: number of JVM threads by state.
> h3. Ratis
> * Log append throughput, flushes, and RPC-style client read/write rates.
> * Backlog (pending queue) and rough timing snapshots for appends, follower
> appends, and log sync; rate of failed writes.
> * All of the above is aggregated across Raft groups per DataNode (one series
> per selected node, i.e. per scrape target).
> h3. Container I/O
> For common Xceiver operations (WriteChunk, ReadChunk, PutBlock, GetBlock,
> DeleteChunk/Block, CreateContainer, CloseContainer):
> * Operations per second, bytes per second, and average latency
> (CloseContainer has no byte metric, only ops and latency).
> h3. Storage volume I/O
> Per selected DataNode, summed across disks: read/write throughput, read/write
> IOPS, read/write latency, and volume space used vs. capacity (excluding
> total-capacity rollup metrics).
> h3. SCM commands and background work
> * Command handlers: for each SCM command type, panels for incoming command
> rate, handler invocation rate, run time, queue depth, and (optionally)
> thread-pool size, so that SCM-driven work can be viewed per command.
> * Block deleting service: the background delete pipeline, covering
> transactions, blocks/bytes succeeded or failed, pending/chosen/marked counts,
> retries, and outliers (e.g. lock timeouts, out-of-order transactions).
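
As an illustration of the Container I/O panels described above, the sketch below shows one way ops per second, bytes per second, and average latency could be derived from two scrapes of cumulative counters. The OpSample shape and its field names are hypothetical, chosen only for this example; they are not actual Ozone metric names, and the real dashboard would presumably express the same arithmetic in its query language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpSample:
    """One scrape of cumulative counters for one operation (hypothetical shape)."""
    timestamp_s: float      # scrape time, in seconds
    num_ops: int            # cumulative number of operations
    total_bytes: int        # cumulative bytes transferred
    total_latency_ns: int   # cumulative time spent in the operation, in ns


def panel_values(prev: OpSample, curr: OpSample) -> dict:
    """Derive ops/s, bytes/s and average latency (ms) between two scrapes."""
    dt = curr.timestamp_s - prev.timestamp_s
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")

    d_ops = curr.num_ops - prev.num_ops
    d_bytes = curr.total_bytes - prev.total_bytes
    d_latency_ns = curr.total_latency_ns - prev.total_latency_ns
    if min(d_ops, d_bytes, d_latency_ns) < 0:
        # A cumulative counter went backwards, e.g. after a process restart.
        raise ValueError("counter reset between samples")

    # Average latency is undefined when no operation ran in the interval.
    avg_latency_ms: Optional[float] = None
    if d_ops > 0:
        avg_latency_ms = (d_latency_ns / d_ops) / 1e6

    return {
        "ops_per_s": d_ops / dt,
        "bytes_per_s": d_bytes / dt,
        "avg_latency_ms": avg_latency_ms,
    }


if __name__ == "__main__":
    before = OpSample(0.0, 100, 1_000, 5_000_000)
    after = OpSample(10.0, 150, 6_000, 30_000_000)
    print(panel_values(before, after))
```

For an operation without a byte counter (CloseContainer in the list above), the same calculation applies with the bytes-per-second value simply left out of the panel.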