One of the features we intend to add to release 0.9.4.0 is a monitoring
system.
The general idea is that the Master will periodically issue something like a
"dump stats" command to the RangeServers and collect the responses.
The collected data will be used to make load balancing/range redistribution
decisions, and will also be exposed via the monitoring interface so that
admins can monitor the health of the Hypertable cluster.
The Master can roll up stats to a per-RangeServer and per-table level, since
that will probably make the most sense from a health-check point of view.

I propose the following list of *per RangeServer stats:*
-#ranges
-cpu load
-memory usage
-disk stats
-network stats
-block cache size
-query cache size
-#open file handles
-block cache hit rate
-query cache hit rate

as well as a set of *per Range stats*:
-name/range id
-queries/s, bytes read/s
-writes/s, bytes written/s
-#open scanners
-(min, max, avg) scanner lifetimes
-#open mutators
-(min, max, avg) mutator lifetimes
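
As a rough sketch of how the Master might roll the per-range numbers up to
per-RangeServer and per-table totals: every type, field and function name
below is hypothetical and only illustrates the idea, it is not existing
Hypertable code. It also assumes each per-range record carries the name of
the table it belongs to, and it leaves out the lifetime stats for brevity.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-range sample, mirroring part of the per Range list above.
struct RangeStats {
  std::string table;     // assumption: each range reports its table name
  std::string range_id;
  double queries_per_sec = 0.0;
  double bytes_read_per_sec = 0.0;
  double writes_per_sec = 0.0;
  double bytes_written_per_sec = 0.0;
  uint32_t open_scanners = 0;
  uint32_t open_mutators = 0;
};

// Hypothetical response of one RangeServer to a "dump stats" request.
struct RangeServerReport {
  std::string server;
  std::vector<RangeStats> ranges;
};

// Totals accumulated over a group of ranges (one server or one table).
struct RollupStats {
  uint32_t ranges = 0;
  double queries_per_sec = 0.0;
  double bytes_read_per_sec = 0.0;
  double writes_per_sec = 0.0;
  double bytes_written_per_sec = 0.0;
  uint32_t open_scanners = 0;
  uint32_t open_mutators = 0;

  // Fold one range's sample into the running totals.
  void add(const RangeStats &r) {
    ++ranges;
    queries_per_sec += r.queries_per_sec;
    bytes_read_per_sec += r.bytes_read_per_sec;
    writes_per_sec += r.writes_per_sec;
    bytes_written_per_sec += r.bytes_written_per_sec;
    open_scanners += r.open_scanners;
    open_mutators += r.open_mutators;
  }
};

// Sum per-range stats into one entry per RangeServer.
std::map<std::string, RollupStats>
rollup_by_server(const std::vector<RangeServerReport> &reports) {
  std::map<std::string, RollupStats> totals;
  for (const auto &report : reports)
    for (const auto &range : report.ranges)
      totals[report.server].add(range);
  return totals;
}

// Sum per-range stats into one entry per table, across all RangeServers.
std::map<std::string, RollupStats>
rollup_by_table(const std::vector<RangeServerReport> &reports) {
  std::map<std::string, RollupStats> totals;
  for (const auto &report : reports)
    for (const auto &range : report.ranges)
      totals[range.table].add(range);
  return totals;
}
```

The Master would call both functions on the latest batch of responses each
time it polls. Rates and counts sum cleanly; the (min, max, avg) lifetimes
would need a count alongside each average to be merged correctly.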

Any thoughts on other useful stats for the monitoring console, range
balancing or on the monitoring framework in general would be very welcome.

-Sanjit
