Hey,
I'm looking for some suggestions for implementing a service check on a
redundant host pair that accesses a shared resource.
Here's our setup:
We have N hosts that process (via delayed_job) a shared job queue
(MySQL/Redis). We have several checks that are host-specific (e.g. # of
workers on that host), but we also have several checks that examine the
shared job queue (e.g. # of unprocessed jobs).
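To make the shared check concrete, here is a minimal sketch of what a
"# of unprocessed jobs" check might look like as a Nagios plugin. It is not
our actual plugin: the function names, the thresholds (100/500) and the
placeholder job count are all made up for illustration. The only real
convention it relies on is the standard plugin exit codes (0 OK, 1 WARNING,
2 CRITICAL, 3 UNKNOWN).

```python
"""Sketch of a Nagios-style check on the number of unprocessed jobs."""
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_queue(unprocessed, warn=100, crit=500):
    """Map a queue depth to (exit_code, status_line).

    The default thresholds are illustrative assumptions, not recommendations.
    """
    if unprocessed is None or unprocessed < 0:
        return UNKNOWN, "QUEUE UNKNOWN - could not read queue depth"
    if unprocessed >= crit:
        code = CRITICAL
    elif unprocessed >= warn:
        code = WARNING
    else:
        code = OK
    # Performance data follows the pipe: label=value;warn;crit
    line = "QUEUE %s - %d unprocessed jobs | unprocessed=%d;%d;%d" % (
        LABELS[code], unprocessed, unprocessed, warn, crit)
    return code, line


def count_unprocessed_jobs():
    """Hypothetical placeholder: replace with a real query against the queue."""
    return 42


def main():
    """Print the status line and return the exit code for Nagios."""
    code, line = evaluate_queue(count_unprocessed_jobs())
    print(line)
    return code

# When run as a plugin, end the script with: sys.exit(main())
```

The point of the sketch is that the check depends only on the queue, not on
the host it runs from, which is why where to attach it is an open question.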
I have several possible implementations:
============
1. Shared Job Queue check on single processing host (current setup)
Pros:
* We only get notified once when the shared queue is high
Cons:
* If the single host goes down, we lose the shared queue check
============
2. Shared Job Queue check on all processing hosts
Pros:
* If a single processing host goes down, the shared queue check still
functions
Cons:
* Multiple emails from hosts when the shared check fails
============
3. Shared Job Queue check on job queue host (i.e. the DB box)
Pros:
* If the DB goes down, you can't reach the queue anyway
* Single email on failure
Cons:
* The check requires app knowledge, which requires having the app deployed
on the job queue host
How are others adding a check like this? Do you go with #2 and just bite
the bullet on the multiple emails?
Thanks
_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting
any issue.
::: Messages without supporting info will risk being sent to /dev/null