short version:

if o.a.c.concurrent.{ROW-READ-STAGE,ROW-MUTATION-STAGE} and
o.a.c.db.CompactionManager have

 - completed task count increasing
 - pending tasks stable (for RRS and RMS, stable in the low hundreds or
   less; for CM, stable in single digits or less)
 - the log isn't spitting out ERROR lines

then the node is completing requests and keeping up with demand reasonably well.
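
A minimal sketch of the check above, assuming you have already pulled
the completed and pending counts for each of the three MBeans twice, a
short interval apart (over JMX or however you prefer; that part is not
shown). The names Sample, PENDING_LIMITS and node_looks_healthy are
made up for illustration, and the limits 300 and 9 are only my reading
of "low hundreds" and "single digits", so tune them for your cluster.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of a stage's counters (hypothetical container)."""
    completed: int
    pending: int


# Assumed ceilings: "low hundreds" for the read/mutation stages and
# "single digits" for the compaction manager. Not official values.
PENDING_LIMITS = {
    "ROW-READ-STAGE": 300,
    "ROW-MUTATION-STAGE": 300,
    "CompactionManager": 9,
}


def node_looks_healthy(earlier, later, error_lines_seen):
    """Compare two readings taken some interval apart.

    earlier / later: dicts mapping stage name -> Sample.
    error_lines_seen: True if the log produced ERROR lines in between.
    Returns (healthy, reasons), where reasons lists every failed check.
    """
    reasons = []
    for name, limit in PENDING_LIMITS.items():
        before, after = earlier[name], later[name]
        # The completed count must have moved forward between readings.
        if after.completed <= before.completed:
            reasons.append(f"{name}: completed count not increasing")
        # "Stable" is approximated here as staying under a fixed ceiling.
        if after.pending > limit:
            reasons.append(f"{name}: {after.pending} pending (limit {limit})")
    if error_lines_seen:
        reasons.append("log contains ERROR lines")
    return (not reasons, reasons)
```

Note that on a node receiving no traffic the completed counts will not
move, so a strict "increasing" test like this one would flag an idle
node; whether that is what you want depends on your deployment.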

On Tue, Jun 22, 2010 at 3:41 PM, Andrew Psaltis
<andrew.psal...@webtrends.com> wrote:
> All,
> We have been working through some operations scenarios, so that we are ready 
> to deploy our first Cassandra cluster into production in the coming months.
> During this process our operations folks have asked us to provide a Health 
> Check service. I am using the word service here very liberally - really we 
> just need to provide a way for the folks in our NOC to know that not only is
> the Cassandra process running (which they will get with their monitoring
> tools), but that it is actually alive and well. We do not have the intent of
> verifying that the data is valid, just that every node in the cluster that is 
> known to be running is actually alive and healthy. My questions are - What 
> does it mean for a Cassandra node to be healthy?  What is the minimum set of
> checks (in terms of impact on a node's performance) that we can run to make
> sure that a node is not a zombie?
>
> Any and all input is greatly appreciated.
>
> Thanks,
> Andrew
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
