For Dell systems I normally run OMSA and check it either with check_omsa in Nagios or by having it send traps, alongside the SNMP agent, to Zenoss with the OMSA ZenPack installed. It's been so long since I ran anything else for bare-metal systems that I assume my knowledge there is now useless (I hope).
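For anyone who hasn't written one: the plugin contract that Nagios-style checks follow is small. Print one line of status text and exit 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Below is a rough sketch of that contract in Python. It is not how check_omsa itself works; the "component: status" input format, the status words in STATE_MAP and the function names are all made up for illustration, and ranking CRITICAL above UNKNOWN is just a choice made here.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Hypothetical vendor status words; a real tool will use its own vocabulary.
STATE_MAP = {"ok": OK, "non-critical": WARNING, "critical": CRITICAL}

# Least to most severe. Ranking CRITICAL above UNKNOWN is a choice for this sketch.
SEVERITY = [OK, UNKNOWN, WARNING, CRITICAL]


def evaluate(report):
    """Return (exit_code, message) for text made of 'component: status' lines."""
    worst, problems, seen = OK, [], 0
    for line in report.splitlines():
        if ":" not in line:
            continue  # skip blank lines and banners
        component, status = (part.strip() for part in line.split(":", 1))
        seen += 1
        # Unrecognised status words are reported as UNKNOWN rather than ignored.
        state = STATE_MAP.get(status.lower(), UNKNOWN)
        if state != OK:
            problems.append(f"{component}={status}")
        if SEVERITY.index(state) > SEVERITY.index(worst):
            worst = state
    if seen == 0:
        return UNKNOWN, "UNKNOWN: no component lines found"
    if not problems:
        return OK, f"OK: {seen} components healthy"
    return worst, f"{LABELS[worst]}: " + ", ".join(problems)


def main():
    """Read a report on stdin, print the status line, exit with the Nagios code."""
    code, message = evaluate(sys.stdin.read())
    print(message)
    sys.exit(code)
```

To use it as an actual check you would call main() at the bottom of the script and pipe the vendor tool's output into it.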
Frankly, check_omsa is mostly dreamy; if your monitoring system can use Nagios scripts, this should be how you do it.

-n

On Mon, Jan 13, 2014 at 5:47 PM, Kelvin Ku <[email protected]> wrote:
> On Mon, Jan 13, 2014 at 5:56 PM, David Lang <[email protected]> wrote:
>>
>> On Mon, 13 Jan 2014, Matthew Barr wrote:
>>
>>> So, I’ve recently been reading up on the #monitoringsucks tags, their
>>> responses, and some of the various things that have come out of it.
>>> I’m in a new shop, AWS based, so many of the old standbys aren’t quite as
>>> much of an obvious call anymore.
>>>
>>> What I’m now trying to figure out is what I’m missing, or would lose, by
>>> going with a newer paradigm for monitoring.
>>>
>>>
>>> Anyone using Riemann yet? Do you still use nagios / sensu / etc?
>>>
>>> Basically, Riemann operates on a stream of metrics, vs relying on a
>>> check every X min.
>>>
>>> I’m trying to determine what I’ve lost by not implementing a nagios style
>>> system, to basically cron checks. (the alerting & state stuff I’m pretty
>>> confident I’m not losing.)
>>>
>>>
>>> For example: I had initially thought I’d lose a check of the web site
>>> every X min, but the load balancer does that anyways, and that triggers log
>>> and metrics about page speed return.
>>>
>>> I think that as you scale, you start getting even more data & metrics,
>>> and the need for manual injection of jobs becomes smaller.
>>>
>>>
>>> I’m curious about people’s thoughts on this…
>>
>>
>> You can eliminate a lot of active checks if you watch the logs for normal
>> activity (you can even set up your alerts so instead of just calling a
>> person, it first does a monitoring probe in case the traffic had just
>> dropped off)
>>
>> One thing to remember: your load balancer's test is not testing to see if
>> the product works, just that the webserver works. You need other tests to
>> make sure that all the web hits you are getting aren't just generating a
>> 'database error, try again later' response ;-)
>>
>> David Lang
>> _______________________________________________
>> Discuss mailing list
>> [email protected]
>> https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
>> This list provided by the League of Professional System Administrators
>> http://lopsa.org/
>>
>
> What does everyone here use for (host) hardware monitoring? At $work we use
> a combination of host-side scripts that periodically run and parse the
> output of vendor-specific binaries and send alerts to our monitoring servers,
> and we also run the vendor hardware agents which send snmp traps. There are
> shortcomings in both approaches and I'm currently splitting my time trying
> to improve both of them.
>

-- 
-------------------------------------------
nathan hruby <[email protected]>
metaphysically wrinkle-free
-------------------------------------------
