I'm just curious as to how many systems/services everyone here is monitoring with mon...
 
I'm currently monitoring 1600 services across 300 systems and have a constant load average of about 7-9 (that's load average, not CPU % used).  We have plans to push all 800+ systems onto our little 'mon'itor, but are having problems...
 
I'm running mon on a dual Hyper-Threaded 3.0 GHz Xeon system with 4 GB of RAM, and there are also some custom PHP daemons running in the background to handle queued database insertions and incoming snmptrap packets from Windows hosts (containing event logs).
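
For a rough sense of scale, here is a back-of-the-envelope sketch in Python of the figures above. Two things in it are assumptions rather than measurements: that "dual HT Xeon" means 4 logical CPUs (2 sockets x 2 hardware threads), and that the remaining systems will carry about the same number of services each as the current ones.

```python
# Figures taken from the post.
services_now = 1600
systems_now = 300
systems_planned = 800

# Assumption: new systems have roughly the same services-per-system ratio.
services_per_system = services_now / systems_now
projected_services = round(services_per_system * systems_planned)

# Assumption: dual Xeon with Hyper-Threading = 2 sockets x 2 threads.
logical_cpus = 4
load_low, load_high = 7, 9
load_per_cpu = (load_low / logical_cpus, load_high / logical_cpus)

print(f"{services_per_system:.1f} services per system")
print(f"about {projected_services} services at {systems_planned} systems")
print(f"load per logical CPU: {load_per_cpu[0]:.2f} to {load_per_cpu[1]:.2f}")
```

Under those assumptions, the run queue is already around twice the number of logical CPUs, and the planned rollout would be roughly 2.7 times the current service count.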
 
I'm afraid I might be running into resource issues, as mon has been crashing randomly (a couple of times a day) ever since I added the last batch of systems.  The new systems don't have any services different from any other system (i.e., no new or different monitoring scripts that might be going haywire).  There is nothing useful in the logfile.  Is there any sort of internal limit to the number of services or hosts that can be monitored?  I tried turning on debugging so that I could narrow down the point of the crash, but debug logging on this many services literally ground my system to a halt.  I checked the memory/CPU/paging history (via sar) after a crash and didn't see anything out of the ordinary.
 
Am I pushing this piece of software to its limits on this hardware?  Should I start using multiple mon-hosts, or should I be looking elsewhere for the root of the problem?  I know this depends on the specific monitoring scripts I'm using, but does anyone have any general ideas about how many services I should be able to monitor from a single mon-host of the type I described?
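
To make the "multiple mon-hosts" option concrete, here is a minimal sketch of one way the systems could be split up. It is not something mon provides: the mon-host names, the placeholder system names, and the `assign_mon_host` helper are all hypothetical, and the idea is only that a stable hash of the hostname decides which mon-host would carry that system's config.

```python
import zlib


def assign_mon_host(hostname: str, mon_hosts: list[str]) -> str:
    """Pick a mon-host for a monitored system using a stable CRC32 hash."""
    index = zlib.crc32(hostname.encode("utf-8")) % len(mon_hosts)
    return mon_hosts[index]


# Hypothetical mon-host names and placeholder names for the 800 systems.
mon_hosts = ["mon-a", "mon-b", "mon-c"]
systems = [f"host{n:03d}" for n in range(800)]

# Group systems by the mon-host they hash to.
assignment: dict[str, list[str]] = {}
for name in systems:
    assignment.setdefault(assign_mon_host(name, mon_hosts), []).append(name)

for mon_host, members in sorted(assignment.items()):
    print(mon_host, len(members))
```

Because the hash depends only on the hostname, a system always lands on the same mon-host as long as the list of mon-hosts doesn't change; each group would then be written out as that mon-host's own config.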
 
Any advice/suggestions/flames/taunts?
 
Thanks in advance.
_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
