For Dell systems I normally run OMSA and check it either with
check_omsa in Nagios, or point its SNMP traps and agent at Zenoss with
the OMSA ZenPack installed.  It's been so long since I ran anything
else for bare-metal systems that I assume my knowledge there is now
useless (I hope).

Frankly, check_omsa is mostly dreamy. If your monitoring system can
use Nagios scripts, this should be how you do it.
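
For anyone rolling their own host-side check instead, here is a minimal
sketch of the Nagios plugin convention (one status line plus exit code
0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN). The "component: status" input
format, the status words in STATUS_MAP, and the function names are all
made up for illustration; this is not the real OMSA output or how
check_omsa works internally.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Hypothetical mapping from a vendor tool's status words to Nagios states.
STATUS_MAP = {"ok": OK, "non-critical": WARNING, "critical": CRITICAL}


def parse_report(text):
    """Parse hypothetical 'component: status' lines into (component, state) pairs."""
    results = []
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not a status line
        component, status = (part.strip() for part in line.split(":", 1))
        # Unrecognized status words are reported as UNKNOWN rather than ignored.
        results.append((component, STATUS_MAP.get(status.lower(), UNKNOWN)))
    return results


def summarize(results):
    """Return (exit_code, one-line message) for a list of (component, state)."""
    if not results:
        return UNKNOWN, "UNKNOWN - no components reported"
    states = {state for _, state in results}
    # Design choice for this sketch: CRITICAL outranks WARNING, which outranks UNKNOWN.
    worst = next(s for s in (CRITICAL, WARNING, UNKNOWN, OK) if s in states)
    bad = [name for name, state in results if state != OK]
    detail = ", ".join(bad) if bad else "all %d components healthy" % len(results)
    return worst, "%s - %s" % (LABELS[worst], detail)
```

To use it as a plugin you would feed it the vendor tool's output, print
the message, and exit with the code, e.g.
code, msg = summarize(parse_report(sys.stdin.read())); print(msg); sys.exit(code).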

-n

On Mon, Jan 13, 2014 at 5:47 PM, Kelvin Ku <[email protected]> wrote:
> On Mon, Jan 13, 2014 at 5:56 PM, David Lang <[email protected]> wrote:
>>
>> On Mon, 13 Jan 2014, Matthew Barr wrote:
>>
>>> So, I’ve recently been reading up on the #monitoringsucks tags, their
>>> responses, and some of the various things that have come out of it.
>>> I’m in a new shop, AWS based, so many of the old standbys aren’t quite as
>>> much of an obvious call anymore.
>>>
>>> What I’m now trying to figure out is what I’m missing, or would lose, by
>>> going with a newer paradigm for monitoring.
>>>
>>>
>>> Anyone using Riemann yet?   Do you still use nagios / sensu / etc?
>>>
>>> Basically, Riemann operates on a stream of metrics, vs. relying on a
>>> check every X min.
>>>
>>> I’m trying to determine what I’ve lost by not implementing a nagios style
>>> system, to basically cron checks.   (the alerting & state stuff I’m pretty
>>> confident I’m not losing.)
>>>
>>>
>>> For example: I had initially thought I’d lose a check of the web site
>>> every X min, but the load balancer does that anyways, and that triggers log
>>> and metrics about page speed return.
>>>
>>> I think that as you scale, you start getting even more data & metrics,
>>> and the need for manual injection of jobs becomes smaller.
>>>
>>>
>>> I’m curious about people’s thoughts on this…
>>
>>
>> You can eliminate a lot of active checks if you watch the logs for normal
>> activity (you can even set up your alerts so that, instead of just calling a
>> person, it first does a monitoring probe in case the traffic had just
>> dropped off).
>>
>> One thing to remember: your load balancer's test is not testing to see if
>> the product works, just that the webserver works. You need other tests to
>> make sure that all the web hits you are getting aren't just generating a
>> 'database error, try again later' response ;-)
>>
>> David Lang
>> _______________________________________________
>> Discuss mailing list
>> [email protected]
>> https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
>> This list provided by the League of Professional System Administrators
>>  http://lopsa.org/
>>
>
> What does everyone here use for (host) hardware monitoring? At $work we use
> a combination of host-side scripts that periodically run, parse the
> output of vendor-specific binaries, and send alerts to our monitoring
> servers. We also run the vendor hardware agents, which send SNMP traps.
> There are shortcomings in both approaches and I'm currently splitting my
> time trying to improve both of them.
>



-- 
-------------------------------------------
nathan hruby <[email protected]>
metaphysically wrinkle-free
-------------------------------------------
