Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend
I'm continuing to iron out the wrinkles with 3.5.1 and distributed monitoring. I'm using mod_gearman to submit and receive events from two distributed pollers. Every now and again, I'll get something similar in the log on the centralized collecting machine: CRITICAL: Return code of

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread Justin Pryzby
Do you get many of those error messages in the logs at once, or just one at a time? Only one thought: what are the permissions on your $USER$ variables? Nagios on my systems setuid() to nonroot after startup, and if it gets SIGHUP to reload config, but can't read the file defining $USER*$, will

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend
Do you get many of those error messages in the logs at once, or just one at a time? Only one thought: what are the permissions on your $USER$ variables? Nagios on my systems setuid() to nonroot after startup, and if it gets SIGHUP to reload config, but can't read the file defining $USER*$,

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread Sven Nierlein
On 8/22/13 13:51, C. Bensend wrote: CRITICAL: Return code of 127 is out of bounds. Make sure the plugin you're trying to run actually exists. (worker: collector.domain.org) Hi, if this is the collector host, why does it have a mod-gearman worker installed? If Nagios would have run the check by
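
For readers unfamiliar with the error quoted above: Nagios plugins are expected to exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN), and a POSIX shell exits with 127 when it cannot find the command it was asked to run. The following is only a minimal sketch of that behaviour, not how Nagios or mod_gearman implement it; the function name `run_check` and the plugin path are hypothetical and chosen for illustration.

```python
import subprocess

# Return codes a Nagios plugin is expected to use.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(command_line):
    """Run a check command through the shell and classify its return code.

    Hypothetical helper for illustration only; it is not part of Nagios
    or mod_gearman.
    """
    proc = subprocess.run(
        command_line,
        shell=True,                 # let /bin/sh resolve the command
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the shell's "not found" message
        text=True,
    )
    # Anything outside 0..3 is what the log message calls "out of bounds".
    state = NAGIOS_STATES.get(proc.returncode, "out of bounds")
    return proc.returncode, state


if __name__ == "__main__":
    # Assumed, non-existent plugin path: the shell cannot find it.
    print(run_check("/hypothetical/plugins/check_that_does_not_exist"))
    # A command that exits with a valid plugin return code.
    print(run_check("exit 2"))
```

On a worker where the plugin path does not exist, the first call should report return code 127, which matches the symptom in the log line: the host named in `(worker: ...)` tried to run a plugin it does not have.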

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend
On 8/22/13 13:51, C. Bensend wrote: CRITICAL: Return code of 127 is out of bounds. Make sure the plugin you're trying to run actually exists. (worker: collector.domain.org) Hi, if this is the collector host, why does it have a mod-gearman worker installed? If Nagios would have run the

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread Sven Nierlein
On 8/28/13 14:43, C. Bensend wrote: Are you saying I just need gearmand running on the collector? Well, I assumed so. You are the only one who can really tell. You will need a worker on each host which should run checks. If your collector should not run any checks, then no worker is

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend
On 8/28/13 14:43, C. Bensend wrote: Are you saying I just need gearmand running on the collector? Well, I assumed so. You are the only one who can really tell. You will need a worker on each host which should run checks. If your collector should not run any checks, then no worker is

[Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-22 Thread C. Bensend
Hey folks, I'm continuing to iron out the wrinkles with 3.5.1 and distributed monitoring. I'm using mod_gearman to submit and receive events from two distributed pollers. Every now and again, I'll get something similar in the log on the centralized collecting machine: CRITICAL: Return