I recently added two new slaves to a distributed Nagios system. The central server now passively processes 17,000+ service checks on 3,000+ servers.
It has been over an hour and a half since I brought the new slaves online, and about 150 hosts and about 1,300 services are still stuck in 'Pending'. In addition, service check results from the other slaves, which had been working normally, now seem to be disappearing arbitrarily. For example, on one host, three of the service checks were updated relatively recently (5 to 30 minutes ago), but three others haven't been updated for almost an hour. The slaves all appear operational and the hosts are being checked on time.

Is it possible that I've overwhelmed Nagios' ability to process data from the NSCA daemon, or hit some internal Nagios bottleneck?

Any suggestions would be appreciated.

Jonathan

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users