On Jun 19, 2009, at 4:35 PM, Kustner, Tom wrote:

> I am a Nagios user, not the administrator. We are running Nagios 2.9 on
> RHEL 4 or 5. Overall, 200+ hosts with 3000 services being monitored. I
> have access for monitoring a smaller number of hosts.
>
> In another posting, I alluded to an issue where a host had gone down but
> no alert was sent out. The issue surfaced again today and as was done
> the other time, Nagios was restarted to "fix" the problem. I am
> naturally concerned about the unreliability.
>
> Any thoughts on this problem? Specifically:

I am not aware of any problems such as you describe that a restart would 'fix'.

> * What are best practices for making sure Nagios does not fall down on
> the job? Is there something not set right?

See my previous e-mail.

> * Are other Nagios administrators restarting Nagios on a weekly or
> nightly basis to keep it on the job?

I reload hourly to pick up new config changes but rarely restart.

> * Is this an issue specific to Nagios 2.9? Was 2.9 a spotty version?

I'm not sure you even have an issue outside of a possible configuration problem. No, 2.9 was not spotty or bad.

> * For a given host, why would "active checks" be enabled, yet "N/A"
> appears in the "Next Active Check" field?

This is normal. See my previous e-mail.

--
Marc

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users