kfaraz commented on code in PR #14443:
URL: https://github.com/apache/druid/pull/14443#discussion_r1237973409


##########
docs/operations/metrics.md:
##########
@@ -326,6 +326,12 @@ If `emitBalancingStats` is set to `true` in the Coordinator [dynamic configurati
 
 ## General Health
 
+### Service Health
+
+|Metric|Description|Dimensions|Normal Value|
+|------|-----------|----------|------------|
+|`druid/heartbeat`| Report service health. For Overlord/Coordinator, the dimension is leader count. `ServiceStatusMonitor` must be enabled. |`heartbeatType`|1|

Review Comment:
   No, I don't have a concrete example in mind either. I was mostly thinking of cluster-level information, inter-service communication, and the like.
   
   I agree with you that `server/` can be misleading when running in containers. I avoided using it in a PR for similar reasons.
   
   I am working on a PR where I plan to use `cluster/` for server view syncs, e.g. `cluster/serverview/synced`, which denotes the sync status between the Coordinator/Broker inventory and the various Historical/Peon processes.
   
   `druid/` is pretty much a catch-all, and any metric that goes under `cluster/` could potentially go under `druid/` as well. But I generally try to use prefixes that make the metrics a little more user-friendly and somewhat self-explanatory.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

