On 28 Jan 2010, at 10:46, Andy Ashley wrote:
Recently I have had an issue where the 2 slave nodes were reporting as OK on their respective web interfaces (all hosts and services green). However, the master reports "UNKNOWN: Service results are stale" and I am unable to get the master to update the status of the slave nodes.
That sounds like the data from the slaves is not being received by the master.
I managed to get the master to update the slave node statuses, but now all of the slave services are reported as: "Active checks of this service have been disabled - Only passive checks are being accepted", with the little passive checks icon.
In a slave node cluster, each slave node has definitions for all hosts and services in that cluster, but only the ones that slave node is responsible for monitoring will be marked as active checks.
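To illustrate, here is a rough sketch of how the same service might look on two slaves in the same cluster (the host and service names are made up, and the real definitions are generated by Opsview with more directives, so treat this as illustrative only):

    # On the slave node that currently owns host web01: actively checked
    define service {
        host_name               web01        ; hypothetical host
        service_description     HTTP
        check_command           check_http
        active_checks_enabled   1
        passive_checks_enabled  1
        ; other required directives omitted
    }

    # On another slave node in the same cluster: the same definition
    # exists, but it only accepts results passed to it, so that node
    # can take over if needed
    define service {
        host_name               web01
        service_description     HTTP
        check_command           check_http
        active_checks_enabled   0
        passive_checks_enabled  1
        ; other required directives omitted
    }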
I don't think I understand how this relationship between master and slave works, or the check types used, but I can't seem to find much info in the documentation (perhaps I am looking in the wrong places).
See http://docs.opsview.org/doku.php?id=opsview-community:slavesetup, though I can see that it lacks a "how does it all work" explanation. I'll make a note to update the documentation here.
Is slave node host and service data always monitored passively by the master? If not, how do I force active checks (always) and remove the passive checks that have been set, so that these stale/UNKNOWN service or host states do not occur again? It is a lot of work to go through each service on the slave nodes individually and set it to be actively checked.
We made a change to Nagios (implemented in Nagios 3.0) where, after a reload, the active_checks_enabled value from the configuration is always used in preference to the value in retention.dat. So a reload should force active checks to be re-enabled appropriately.
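As a made-up example of what that precedence means (the values below are illustrative and abridged, not taken from your system): if retention.dat still carries a disabled flag from the previous run, but the generated configuration says the check should be active, the configuration value wins after the reload:

    # retention.dat - state carried over from the previous run
    service {
        host_name=web01
        service_description=HTTP
        active_checks_enabled=0
    }

    # generated object configuration
    define service {
        host_name               web01
        service_description     HTTP
        active_checks_enabled   1
        ; other directives omitted
    }

After the reload the service is actively checked again, so there should be no need to re-enable each service by hand.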
Ton
