I'm just curious as to how many systems/services everyone here is
monitoring with mon...

I'm currently monitoring 1600 services across 300 systems and have a
constant load average of about 7-9 (that's load average, not CPU % used).
We have plans to move all 800+ systems onto our little 'mon'itor, but
we are running into problems...

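
For a rough sense of the scale involved, here is a back-of-the-envelope
sketch. It assumes the services-per-system ratio stays the same on the
new hosts and that load average grows linearly with the number of
systems; I have not verified either assumption, and the function name is
just illustrative.

```python
def project(services, systems, load_range, target_systems):
    """Extrapolate service count and load average to a larger host count.

    Assumption (unverified): services per system stay constant and load
    average scales linearly with the number of monitored systems.
    """
    per_system = services / systems
    scale = target_systems / systems
    return {
        "services_per_system": round(per_system, 2),
        "projected_services": round(per_system * target_systems),
        "projected_load": tuple(round(load * scale, 1) for load in load_range),
    }


# Current figures from this post: 1600 services, 300 systems, load 7-9.
estimate = project(1600, 300, (7, 9), 800)
print(estimate)
```

If those assumptions hold, 800 systems would mean roughly 4267 services
and a load average somewhere around 19-24 on the same box.
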
I'm running mon on a dual HT 3.0 GHz Xeon system with 4 GB RAM. There are
also some custom PHP daemons running in the background to handle queued
database insertions and incoming snmptrap packets from Windows hosts
(containing event logs).

I'm afraid I might be running into resource issues, as mon has been
crashing randomly (a couple of times a day) ever since I added the last
batch of systems. The new systems don't have any services different from
any other system (i.e., no new/different monitoring scripts that might be
going haywire). There is nothing useful in the logfile. Is there any sort
of internal limit to the number of services or hosts that can be
monitored? I tried turning on debugging so that I could narrow down the
point of the crash, but debug logging on this many services literally
ground my system to a halt. I checked the memory/CPU/paging history (via
sar) after a crash and didn't see anything out of the ordinary.

Am I pushing this piece of software to its limits on this hardware?
Should I start using multiple mon-hosts, or should I be looking elsewhere
for the root of the problem? I know this depends on the specific
monitoring scripts I'm using, but does anyone have any general ideas
about how many services I should be able to monitor from a single
mon-host of the type I described?

Any advice/suggestions/flames/taunts?

Thanks in advance.

_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon