One of the features we intend to add in release 0.9.4.0 is a monitoring system. The general idea is that the Master will periodically issue something like a "dump stats" command to the RangeServers and collect the responses. The collected data will be used to make load-balancing and range-redistribution decisions, and will also be exposed via the monitoring interface so that admins can monitor the health of the Hypertable cluster. The Master can roll up stats to a per-RangeServer and per-table level, since that will probably make the most sense from a health-check point of view.
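To make the rollup idea concrete, here is a minimal sketch of how stats collected from the RangeServers could be aggregated per RangeServer and per table. This is only an illustration, not Hypertable code: the `RangeStats` record, its field names, the `rollup` helper and the sample values are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RangeStats:
    """Hypothetical per-range record returned by a RangeServer stats dump."""
    server: str            # RangeServer hosting the range
    table: str             # table the range belongs to
    range_id: str          # name/id of the range
    reads_per_sec: float
    writes_per_sec: float


def rollup(stats, key):
    """Count ranges and sum read/write rates, grouped by the attribute `key`."""
    totals = defaultdict(
        lambda: {"ranges": 0, "reads_per_sec": 0.0, "writes_per_sec": 0.0}
    )
    for s in stats:
        group = totals[getattr(s, key)]
        group["ranges"] += 1
        group["reads_per_sec"] += s.reads_per_sec
        group["writes_per_sec"] += s.writes_per_sec
    return dict(totals)


# Made-up sample data standing in for the responses of one polling round.
collected = [
    RangeStats("rs1", "users", "users[a..m]", 120.0, 30.0),
    RangeStats("rs1", "events", "events[0..5]", 40.0, 200.0),
    RangeStats("rs2", "users", "users[m..z]", 80.0, 10.0),
]

per_server = rollup(collected, "server")  # health view per RangeServer
per_table = rollup(collected, "table")    # health view per table
```

The same per-range records feed both views, so the RangeServers would only need to report per-range numbers and the Master could derive the rest.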
I propose the following list of *per RangeServer stats:*

- #ranges
- CPU load
- memory usage
- disk stats
- network stats
- block cache size
- query cache size
- #open file handles
- block cache hit rate
- query cache hit rate

as well as a set of *per Range stats:*

- name/range id
- qps, bytes read per sec
- writes/s, bytes written/s
- #open scanners
- (min, max, avg) scanner lifetimes
- #open mutators
- (min, max, avg) mutator lifetimes

Any thoughts on other useful stats for the monitoring console, range balancing or on the monitoring framework in general would be very welcome.

-Sanjit
