[Nagios-users] Transient errors

2012-03-01 Thread David Dyer-Bennet

I see a lot of transient errors on the services and hosts I'm monitoring.
Hence, finding ways to keep notifications from going out for situations
that will resolve themselves is kind of an issue.

I've played with how many failures in a row are needed to cause a
notification, and have that set differently for things I'm monitoring
across long links (Beijing, say) compared to things I'm monitoring locally
or in New York.  Of course, one problem with that is that it makes it take
longer before a real problem causes a notification.  Right now it takes
over 15 minutes for the total failure of our link to Beijing to cause a
notification.
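The trade-off described above can be put into numbers. The sketch below assumes the usual service-check scheme: a failure is first seen at a regularly scheduled check, then re-checked at the retry interval until the configured number of attempts is reached. The function name and the interval values are illustrative only and are not taken from the configuration discussed in this post; check execution time and timeouts are ignored.

```python
def worst_case_minutes_to_notify(check_interval, retry_interval, max_check_attempts):
    """Upper bound on minutes between a failure and the first notification.

    Simplified model: the failure happens just after a scheduled check, so it
    is first seen one check_interval later (attempt 1); each further attempt
    comes one retry_interval after the previous one, and the notification goes
    out when attempt number max_check_attempts also fails.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return check_interval + (max_check_attempts - 1) * retry_interval


# Hypothetical settings for a long-haul link versus a local one.
long_link = worst_case_minutes_to_notify(check_interval=5, retry_interval=2, max_check_attempts=6)
local = worst_case_minutes_to_notify(check_interval=5, retry_interval=1, max_check_attempts=3)
print(long_link, local)  # 15 7
```

Under these assumptions, every extra attempt added to ride out transient errors costs one more retry interval before a real outage is reported, so shortening the retry interval is one way to keep the attempt count high without stretching the delay.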

For things that are numeric values, I can play with the critical and
warning ranges to potentially reduce false positives.  That, at least,
doesn't slow down recognition of total failures.   Some things just don't
seem to fit the Nagios model -- for example it's quite normal for the SQL
server to pull 100% of the cpu for periods now and then, but if it goes on
too long, *that's* unusual.  Hmm; I suppose I could override the number of
failures needed to cause a notification in the service definition for
those, couldn't I?  There may be some things I should just stop monitoring
(there aren't clear-cut "okay" and "bad" behaviors that I can quantify).

I guess I'm wondering if there are useful basic approaches to handling
this problem that I'm missing, or if I just need to work through the
details more carefully.   I'm startled at how often I get isolated
failures for no apparent reason.  Is that normal for most people
monitoring services?  I think I'm finding my connections time out now and
then due simply to load, without the load actually being at all high.
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info


--
Virtualization & Cloud Management Using Capacity Planning
Cloud computing makes use of virtualization - but cloud computing 
also focuses on allowing computing to be delivered as a service.
http://www.accelacomm.com/jaw/sfnl/114/51521223/
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Where can arguments go?

2012-02-22 Thread David Dyer-Bennet

On Wed, February 22, 2012 12:36, Jonathan Nilsson wrote:
>>
>> I'm thinking of using the router as a parent; if the failure isn't found
>> by the check_command but only by one service failing, will that still
>> cause the whole router host to be considered down for parent purposes?
>>
>
> Ah, I had never thought to setup dependency relationships like that for
> the sake of notifications. But that makes sense - getting one notification
> about the switch is better than tons of false notifications about the
> services behind the switch!

It's also probably fiddling around with small-percentage improvements, but
I'm somewhat theoretically-minded, so the fact that it *can* go wrong and
there might be a better way was bothering me.

Anyway, that's what I was thinking about.  Thanks for confirming that the
check_command is what defines "host down", I thought so but wasn't
confident.

(Nagios documentation is pretty good as such things go, mostly when I get
confused I eventually work out that what I had wrong was right in a
not-too-weird place in the docs.)


Re: [Nagios-users] Where can arguments go?

2012-02-22 Thread David Dyer-Bennet

On Tue, February 21, 2012 18:06, Jonathan Nilsson wrote:
> It looks like you are using variables in the wrong location. Those should
> go in the command definition. See below for a sample, and hopefully you
> can adapt it to your specific needs.

Ah, got it.  I hadn't thought of moving the check_command from the
template into the host definition; I was instead trying to pass arguments
through the template invocation, which apparently isn't supported.

(And I had syntax wrong on passing args; I know better than that, I've
written and used many other commands that need args, and passed them
properly before.)

Thank you, and thank you other posters with suggestions and variants as well.

[snip]

>> Is this a possible / sane thing to do?  Is this the right way to
>> approach it, or am I missing a way that actually makes sense?
>>
>
> Yes, I would say that this is appropriate for a switch/router. Personally,
> I usually don't overwrite the default host check_command since check_ping
> is fine, and instead add additional services as needed, such as SNMP
> checks to get more info.

I'm thinking of using the router as a parent; if the failure isn't found
by the check_command but only by one service failing, will that still
cause the whole router host to be considered down for parent purposes? 
One goal here is to eliminate reports of multiple services being down from
beyond the router, if the router itself (or the link) is what's actually
down.  Of course in many cases, the router is entirely cut off, so ping
will also fail.  The other thing ping doesn't catch is administrative
screw-ups that mess up a port without shutting down the router (unless it
happens to be a port that the Nagios monitoring has to go through).




[Nagios-users] Where can arguments go?

2012-02-21 Thread David Dyer-Bennet
I'm looking to use a special check command to verify routers are in
operation by checking the main link port of the router (instead of the
default ping).  I'm running into confusion, because I need to specify a
port number in the host definition, and I can't really see how to do it.

I use something like this for a template:

define host {
    name            snmp-switch     ; The name of this host template
    use             generic-switch
    #check_command  check-host-alive ; Default command to check if routers are "alive"
    check_command   check-snmp-switch-alive $HOSTADDRESS$ $ARG1$ $ARG2$
    register        0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
}

I'm not sure the args on the check_command line are legal.  And I'm not
sure that a "use snmp-switch" line referencing this template could have
arguments on it.

Is this a possible / sane thing to do?  Is this the right way to approach
it, or am I missing a way that actually makes sense?
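For readers following the thread, here is a minimal sketch of the argument convention that the replies point to: arguments are attached to the command name with "!" separators in check_command, and the $ARGn$ macros live in the command definition's command_line. This is a simplified Python model, not Nagios's own code; the function name, the community string and the command_line text are assumptions made up for illustration.

```python
def expand_check_command(check_command, command_line, host_address):
    """Expand a command_line using the '!' argument convention.

    check_command looks like 'name!arg1!arg2'; the pieces after the name fill
    $ARG1$, $ARG2$, ... in the command definition's command_line.
    Simplified model only: no escaping, no macros besides $HOSTADDRESS$.
    """
    name, *args = check_command.split("!")
    expanded = command_line.replace("$HOSTADDRESS$", host_address)
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    return name, expanded


# Hypothetical command definition text and per-host check_command value.
command_line = "check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o 1.3.6.1.2.1.2.2.1.8.$ARG2$"
name, line = expand_check_command(
    "check-snmp-switch-alive!public!10125", command_line, "192.168.1.254"
)
print(name)  # check-snmp-switch-alive
print(line)  # check_snmp -H 192.168.1.254 -C public -o 1.3.6.1.2.1.2.2.1.8.10125
```

In this model the per-host value (the port index) travels with the host's check_command, while the template would only carry settings that are the same for every switch.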




[Nagios-users] check_snmp -s (and -r) not working

2012-02-20 Thread David Dyer-Bennet
Using check_snmp v1.4.15 (nagios-plugins 1.4.15)

[ddb@prc-mn-lnx01 dev]$ /usr/lib64/nagios/plugins/check_snmp  -H
192.168.1.254 -C XX -o 1.3.6.1.2.1.2.2.1.8.10125 -l ifOperStatus  -s
"loser" -v
/usr/bin/snmpget -t 1 -r 5 -m '' -v 1 [authpriv] 192.168.1.254:161
1.3.6.1.2.1.2.2.1.8.10125
iso.3.6.1.2.1.2.2.1.8.10125 = INTEGER: 2
SNMP OK - ifOperStatus 2 | ifOperStatus=2

I'm trying to check an OID value that isn't range-related,
ifOperStatus.  As you see, I'm getting back the enum value that
means "down".  This does NOT in any way match the -s string I supplied. 
I'm still getting an "OK" return value.

(Obviously it's easy to wrap this with a script that finds the
"ifOperStatus=" and returns what I need.  I'll do that if I have to.)
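A minimal sketch of the wrapper idea, written in Python. It only covers the parsing step: it takes the text that check_snmp printed and turns it into a Nagios exit code. Running the plugin itself (for example with subprocess) is left out, and the function name is hypothetical. The only facts relied on are the output line shown above and the IF-MIB meaning of ifOperStatus (1 is up, 2 is down).

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

IF_OPER_UP = 1  # ifOperStatus value for "up" in IF-MIB; 2 means "down"


def status_from_check_snmp(output, label="ifOperStatus"):
    """Map check_snmp output text to an (exit_code, message) pair.

    Looks for 'label=<integer>' in the text (it appears in the performance
    data after the '|') and treats anything other than "up" as CRITICAL.
    """
    match = re.search(rf"\b{re.escape(label)}=(\d+)", output)
    if match is None:
        return UNKNOWN, f"UNKNOWN - no {label} value in plugin output"
    value = int(match.group(1))
    if value == IF_OPER_UP:
        return OK, f"OK - {label} is up ({value})"
    return CRITICAL, f"CRITICAL - {label} is {value}, expected {IF_OPER_UP}"


sample = "SNMP OK - ifOperStatus 2 | ifOperStatus=2"
code, message = status_from_check_snmp(sample)
print(code, message)  # 2 CRITICAL - ifOperStatus is 2, expected 1
```

A complete wrapper would print the message and exit with the returned code, so Nagios sees CRITICAL for the "down" value regardless of what the -s or -r switches do.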

But...am I misunderstanding what the -s switch does?  Or is the -s switch
completely non-functional in this version?  (I've read about a change
rolled back to restore 1.4.14 behavior, but I didn't read the description
as saying the string matching was completely non-functional before that.)

I've also tried -r for regexps, same results; they seem to be completely
ignored.


[Nagios-users] init.d/nagios

2010-03-19 Thread David Dyer-Bennet

The init file built and installed by "make install-init" doesn't seem to
work right.  I apparently have nagios running after saying "service nagios
start" (it reported success), based on receiving some notifications and on
ps showing /usr/local/nagios/bin/nagios running as a daemon, but "service
nagios status" reports "nagios is not running".

(This is on a Centos 4.8 system, with Nagios 3.2.1 built from source.)

I configured this version with
"--with-lockfile=/usr/local/nagios/var/nagios.pid" after the previous
version, without that, logged a permissions error when it tried to start
up nagios.  I suppose that could be the wrong fix for that problem, and
hence causing this one.



Re: [Nagios-users] Host check timing

2010-03-19 Thread David Dyer-Bennet

On Thu, March 18, 2010 10:42, Marc Powell wrote:

> If you need to have more control over that then I'd suggest upgrading to
> nagios-3. Host check logic was greatly improved and more in line with how
> service checks are done.

Upgrading to 3 is proving problematic.

I built 3.2.1 with default options except for specifying user nagios. 
I've updated my config to pass check-config.  I've put in the init.d
script.  And when I start it, I get

[1268949181] Nagios 3.2.1 starting... (PID=5575)
[1268949181] Local time is Thu Mar 18 16:53:01 CDT 2010
[1268949181] LOG VERSION: 2.0
[1268949181] Failed to obtain lock on file /var/run/nagios.pid: Permission
denied
[1268949181] Bailing out due to errors encountered while attempting to
daemonize...

I can't see why it's trying to access /var/run/nagios.pid; everything else
it's doing is in /usr/local/nagios.  I stepped through the startup steps
in the init.d script, and that error is being logged when it tries to
start the main nagios executable; it's not in the script itself.  I'm sure
it's my new executable; I removed the RPM install, and in stepping through
I typed the path by hand, definitely the new nagios executable.

I would guess that was "localstatedir" in config; but if so, the default
prefix is /usr/local, so it shouldn't end up accessing /var/run.  But I
don't see the actual config stuff that produces /var/run at all.  I
haven't looked into the source yet -- but I don't really think I should
have to to get it to build and run with default locations, either!



[Nagios-users] Host check timing

2010-03-18 Thread David Dyer-Bennet
I'm monitoring some far-away remote hosts that we connect to via the
public internet (well, there's an encrypted VPN involved).  I'm trying not
to send notifications until an outage persists for a while.

In an example I looked at this morning, I see that it was repeating the
host check every 10 seconds until it hit the retry count.

Where does that 10 seconds time come from?  The manual is remarkably vague
about host check scheduling; about all it says is that it does them on
demand, and "If the first host check returns a non-OK state, Nagios will
keep pounding out checks of the host until either (a) the maximum number
of host checks (specified by the max_attempts option in the host
definition) is reached or (b) a host check results in an OK state."

Does this mean I have no control over the timing?  Can I treat the 10
second observed delay as real (and then control total time delay by
setting max_attempts high)?

(Running Nagios 2.12 on Centos 5).


[Nagios-users] check_snmp disk space monitoring

2010-02-17 Thread David Dyer-Bennet

I'm playing with using check_snmp to look at disk space, with commands
(working from the command line so far) like:

/usr/lib/nagios/plugins/check_snmp -C public -P 2c -H localhost -o
dskErrorMsg.2 -r '^\s*$'

([[:space:]] doesn't work any better than \s either)

I'm trying to monitor the error message rather than the simple flag so
that the data returned will include the error when one is found.  I'm
trying to use the regex capability to match an empty error message; so
that anything non-empty will be reported as an error.

And I'm not getting anywhere.  I'm mostly pretty good with regexps, but
despite claiming in --help to support "extended regular expressions", it
doesn't seem to.  In particular the "^" for beginning of text and "$" for
end of text don't seem to be working.

Clues!  Clues for the poor!

Is this a basically stupid approach, by the way?

Oh, and how does -r work with multiple OIDs in -o?  What's the syntax for
providing multiple -r values, and what happens if you only provide one?




Re: [Nagios-users] Monitoring a router

2009-09-02 Thread David Dyer-Bennet

On Tue, September 1, 2009 16:54, Marc Powell wrote:
>
> On Sep 1, 2009, at 4:41 PM, David Dyer-Bennet wrote:
>
>> Right now I'm just pinging it.  That works, in the sense that I get
>> the
>> ping back. But I am suspicious that, if the link went down, the router
>> would still respond to pings.
>>
>> Is this "best practice" in the opinion of the community?
>
> We monitor about 3000 routers with ping only. We ping an RFC 1918 IP
> assigned to the loopback interface on the router. That way, as long as
> any serial interface on the router is up, the router is still 'up'
> without us having to configure a ping for every serial
> interface. This is sufficiently 'up' for our SLA purposes... For the
> purposes of parenting, the ping check is likely sufficient.

Thanks for reminding me about the collection of IP interfaces sitting in
that router.  That makes ping more precise.

> We also monitor the individual interfaces via SNMP to know when any
> one goes down. We have a process such that we don't need to add each
> specific serial interface into nagios but just figure what should be
> up in real time.

Sounds like a more complicated network than ours.  And the network isn't
my responsibility, I'm just looking to use it to filter out failure
reports from services beyond failed network links.



Re: [Nagios-users] Monitoring a router

2009-09-02 Thread David Dyer-Bennet

On Tue, September 1, 2009 17:41, Jim Avery wrote:
> 2009/9/1 David Dyer-Bennet :
>> Is this "best practice" in the opinion of the community?  Or is using
>> SNMP
>> to monitor something inside the router better somehow?  And if so, WHAT?
>
>
> Good question.  One person's 'best practice' is another person's
> over-kill or under-kill.

Ain't it the truth!

And they're probably right for their actual situation, even.

> My golden rule is "only monitor something if you're going to be
> interested in it".

This seems to be getting back to Steinbach's Guideline for Systems
Programming: never test for an error condition you don't know how to
handle.

> For routers this usually means I simply ping them,
> but often I'm interested in bandwidth of specific interfaces too so I
> make sure specific WAN links are monitored for bandwidth, errors,
> discards and so on using the plugins from http://www.manubulon.com

Another group is responsible for that level of network maintenance.  I
hope they're monitoring that sort of detail.

> It's often a good idea to have the router send you SNMP traps - you'll
> need to configure snmptt to handle them though, maybe using NagTrap.
> I often find I get more traps than I'm interested in though - which
> breaks the golden rule (see above) - so I then need either to filter
> the traps out in the snmptt config or prevent the router from sending
> them in the first place.

I probably can't get traps sent to me.  But, really, I only care about
routers in terms of what parts of the topology I can and can't reach, and
ping will tell me that.

> I used to monitor each router interface using ping, but now I think
> that's usually overkill.  It just depends ...

Right.  Thanks.



[Nagios-users] Monitoring a router

2009-09-01 Thread David Dyer-Bennet
Not complex stuff; I'm not really monitoring the router primarily (that's
another group's job), but I want some kind of check whether the router,
and the connection it serves, are working or not.  I'd use it as a parent
for other checks, so that if the link goes down I just get notified that
the router is down, rather than having every host and server beyond it
reported down.
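The parent idea can be sketched as a small model. This is a simplified Python rendering of the documented parent logic (a host that fails its check is UNREACHABLE rather than DOWN when none of its parents is UP); it is not Nagios's own code, and the host names are made up. Whether UNREACHABLE hosts generate notifications then depends on each host's notification options.

```python
def classify(host, parents, is_up):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for host.

    parents maps a host name to the list of its parent host names; is_up maps
    a host name to the result of its own host check. A failing host with at
    least one parent, none of which is UP, is UNREACHABLE; otherwise DOWN.
    """
    if is_up[host]:
        return "UP"
    host_parents = parents.get(host, [])
    if host_parents and all(classify(p, parents, is_up) != "UP" for p in host_parents):
        return "UNREACHABLE"
    return "DOWN"


# Hypothetical topology: two hosts sit behind one remote router.
parents = {"far-db": ["far-router"], "far-web": ["far-router"]}
is_up = {"far-router": False, "far-db": False, "far-web": False}
for name in is_up:
    print(name, classify(name, parents, is_up))
# far-router DOWN
# far-db UNREACHABLE
# far-web UNREACHABLE
```

Note that the model only helps if the router's own check actually fails when the link is down, which is exactly the doubt raised in the next paragraph.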

Right now I'm just pinging it.  That works, in the sense that I get the
ping back. But I am suspicious that, if the link went down, the router
would still respond to pings.

Is this "best practice" in the opinion of the community?  Or is using SNMP
to monitor something inside the router better somehow?  And if so, WHAT? 
One thing that comes to mind is IP-MIB::ipForwarding, which appears to be
a boolean.  I don't know if that indicates administrative state or actual
condition, though.  Should I be looking for some sort of interface state
field instead?

(It's a  Cisco IOS Software, Catalyst 4000 L3 Switch Software
(cat4000-I9S-M), Version 12.2(25)EWA11, RELEASE SOFTWARE (fc1)).


Re: [Nagios-users] Email service notifications not working

2009-09-01 Thread David Dyer-Bennet

On Tue, September 1, 2009 14:44, Marc Powell wrote:

> Reading between the lines, this seems to be a new install for you. Is
> there any reason you're starting with an old version? 3.0 has been out
> since 3/2008 and offers many improvements over 2.x. You're also
> starting out way behind the curve and will find fewer and fewer people
> willing or able to help support you.

It's a new round of additions to an existing install, so much of what I'm
working with is new.

I'm using 2.10 because that's what Centos 4.8 includes.  That's old, too,
of course; but we just upgraded from 4.7 to 4.8 on this box earlier this
week.


Re: [Nagios-users] Email service notifications not working

2009-09-01 Thread David Dyer-Bennet

On Tue, September 1, 2009 12:25, David Dyer-Bennet wrote:

> How do I approach this debugging problem?  I'm assuming my config is at
> fault of course; but I'm running out of places to look.

Following up my own query -- I'm not sure it's resolved yet, but I've
found the "view config" option on the web page, which has been VERY useful
in figuring out what's going on.  For anybody who doesn't know, it shows
all the detailed parameters set on various objects.

So for example I could see that I was not in fact getting
notification options set.  That part I've figured out -- you have to put
them in the service definition, you can't JUST have them in the contact
definition.

Still testing to see if there's anything else wrong, but this is turning
out to be a tremendously useful tool to see how Nagios has understood my
configuration.



[Nagios-users] Email service notifications not working

2009-09-01 Thread David Dyer-Bennet
Or SOME of them.  I did get ONE email service notification -- when I
"acknowledged" a problem, I got a notification of that.  But I got no
notification of the change to critical status, and no notification of the
return to normal.

Service notification options are w,u,c,r.  service notification period is
24x7.  Service notification command is notify-by-email.  Nagios 2.10 on
Centos 4.8.  I believe sendmail is configured adequately, because I can
send a test email by hand, and because I've received that one service
notification.  Host notifications have generally worked on this install,
though I haven't tested one lately.

The web display is showing the service going into state critical, and it
shows notifications enabled for that service.  It shows "last service
notification" as N/A, making me think that the problem is that Nagios
isn't sending the notification, rather than any sort of delivery issue
(and I've tested email from the monitoring host, works fine).

This is a small setup, 13 hosts and 28 services, on a system with no load
problems, so I'm not worried about performance issues being the cause. 
Everything is active tests from the monitoring system.

How related are host and service notification? Should I test host
notification at this point, or should I go ahead and work on service
notification on its own?

How do I approach this debugging problem?  I'm assuming my config is at
fault of course; but I'm running out of places to look.


Re: [Nagios-users] check_http confusion / problem

2009-08-29 Thread David Dyer-Bennet
Jon Angliss wrote:
> On Fri, 28 Aug 2009 14:01:44 -0500, "David Dyer-Bennet" wrote:
>
>> [check_http questions]
>
> --help usually gives a whole bunch of extra information...
>

Yes, that's where I got the information I had; it's what got me confused 
in the first place.  In particular, help shows a long-form as well as a 
short-form option for -e, and it was while discovering that the 
long-form option given in help doesn't appear to be recognized that I 
ended up with the "-e=400" syntax which is wrong (but which IS accepted).
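Why "-e=400" is accepted yet never matches can be illustrated with Python's getopt module, which follows the same short-option convention as the C getopt family that command-line tools commonly use. That check_http 1.4.6 parses its arguments exactly this way is an assumption; the sketch only shows the convention, and the helper name is made up.

```python
import getopt


def expected_string(argv):
    """Return the value the -e/--expect option receives for this argv."""
    options, _ = getopt.getopt(argv, "e:", ["expect="])
    for name, value in options:
        if name in ("-e", "--expect"):
            return value
    return None


print(repr(expected_string(["-e=400"])))        # '=400'
print(repr(expected_string(["-e", "400"])))     # '400'
print(repr(expected_string(["--expect=400"])))  # '400'
```

With a short option, everything after the letter is the value, so the "=" becomes part of the expected string. A status line such as "HTTP/1.1 400 Bad Request" contains neither "=400" nor "=Bad Request", which would explain an invalid-response result like the one quoted below.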


>> [...@prcapp00 dev]$ /usr/lib/nagios/plugins/check_http
>> --IP-address=192.168.5.3 -p 8075 --no-body -f critical -v -v -v -e="Bad
>> Request"
>> GET / HTTP/1.0
>> User-Agent: check_http/1.99 (nagios-plugins 1.4.6)
>>
>>
>> http://192.168.5.3:8075/ is 168 characters
>> STATUS: HTTP/1.1 400 Bad Request
>>  HEADER 
>> Content-Type: text/html
>> Date: Fri, 28 Aug 2009 18:33:44 GMT
>> Connection: close
>> Content-Length: 39
>>  CONTENT 
>>  [[ skipped ]]
>> Invalid HTTP response received from host on port 8075
>> [...@prcapp00 dev]$ echo $?
>> 2
>> 
>
> Command syntax is incorrect.
>
> # ./check_http -I 192.168.5.3 -p 8075 --no-body -f critical -vvv -e \
>  "Bad Request"
>   

That was the base problem in some sense, thanks.


> I tried against one of my servers without any issues.  Albeit I got a
> critical failure because my server didn't return bad request.
>
> I do notice you're using an old version of the plugins package. 1.4.13
> is the current version, can you download and compile in a different
> directory and see if you still end up with the same issue?
>   

This is a Centos 4.7 install, and I'm trying to stick to the packaged 
distributions that match each other, rather than going around upgrading 
things at random and hoping they work together.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info


___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] check_http confusion / problem

2009-08-28 Thread David Dyer-Bennet

On Fri, August 28, 2009 14:25, jmose...@corp.xanadoo.com wrote:
> No, it works - you have an '=' character after the '-e' argument.  Leave
> that out or use "expect="
>
> For more documentation:
>
> check_http --help

That's where I found --expect= in the first place.  All my tests showed it
not working as I "expected", and I started thrashing around as usual, and
eventually ended up with -e= which is of course wrong (incomplete
editing).

What is the argument?  A regexp?  The other match parameters are, but this
one doesn't say so.  A full-line match?  An initial segment match?  Will
it match anywhere within the line?
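
A sketch of how the corrected invocation might look as a reusable command definition. The check_http source appears to treat the -e value as a plain string (not a regexp) that is searched for anywhere in the first (status) line of the response, but that is worth confirming against the installed 1.4.6 plugin. The names check_http_expect, generic-service and ws-host are hypothetical.

```
# Hypothetical command: port in $ARG1$, expected status text in $ARG2$.
# Note that "-e" is followed by a space, not "=".
define command {
    command_name    check_http_expect
    command_line    $USER1$/check_http -I $HOSTADDRESS$ -p $ARG1$ --no-body -e "$ARG2$"
}

define service {
    use                     generic-service     ; hypothetical template
    host_name               ws-host             ; hypothetical host
    service_description     Web service director
    check_command           check_http_expect!8075!400 Bad Request
}
```

Arguments are split on "!", so the space inside "400 Bad Request" should reach the plugin intact as long as the command line quotes $ARG2$.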
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




[Nagios-users] check_http confusion / problem

2009-08-28 Thread David Dyer-Bennet
Thanks to various list members for pointing me at various bits of
documentation that I hadn't been able to find, which explain that commands
can in fact take arguments, and that those and other useful things are
called "macros".  (I've been using macros since 1401 Autocoder, and mostly
think of them as compile time code-generating tools.)

So, what's with the check_http plugin?  The parameters it accepts don't
match the parameters its help says it accepts.  But beyond that, the -e
switch doesn't seem to do what it says it should.

I'm not getting invalid results because I'm running from the command line,
am I?  This seems much the easiest way to test things, and it rather
sounds like this is an intended use.  But thought I'd ask just to be sure.

In this case, the expected result of the test is a 400 Bad Request error
(because I'm hitting a web services port and requesting root; this test is
intended to do a minimal check and see that the service director is up,
but not test the individual services yet).  So the "400 Bad Request"
response is correct and valid.

Now, the -e switch seems to be intended to check just this status line,
and to nicely short-circuit later processing, and seems in all ways
optimized for exactly what I'm doing.  Except for the minor fact that it
doesn't seem to work.  See below, run in verbose mode.  I've tried a bunch
of variants on the value I pass to -e, including the whole line given, and
they all give the same result, an exit code of 2 and the "invalid HTTP
response code" message.   So what's up?  And is there more documentation
on the plugins hidden somewhere, particularly this one?

[...@prcapp00 dev]$ /usr/lib/nagios/plugins/check_http --IP-address=192.168.5.3 -p 8075 --no-body -f critical -v -v -v -e="Bad Request"
GET / HTTP/1.0
User-Agent: check_http/1.99 (nagios-plugins 1.4.6)


http://192.168.5.3:8075/ is 168 characters
STATUS: HTTP/1.1 400 Bad Request
 HEADER 
Content-Type: text/html
Date: Fri, 28 Aug 2009 18:33:44 GMT
Connection: close
Content-Length: 39
 CONTENT 
  [[ skipped ]]
Invalid HTTP response received from host on port 8075
[...@prcapp00 dev]$ echo $?
2



-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




Re: [Nagios-users] Why are there "commands"?

2009-08-28 Thread David Dyer-Bennet

On Fri, August 28, 2009 13:26, Marc Powell wrote:
>
> On Aug 28, 2009, at 12:18 PM, David Dyer-Bennet wrote:
>
>> I don't really understand the purpose / utility of the "command" level of
>> abstraction in Nagios configuration.  (2.10; we're still on Centos 4.7).
>>
>> To define a new service to check particular Windows web services we've
>> written, I define a service, and then it has to refer to a command, and
>> over in the command I have to hard-code the parameters needed to test this
>> specific service -- so in fact I need a separate command for each service.
>
> Can you give an example? I think you just don't know the flexibility
> that is available.

That seems to be the case.

> You shouldn't need to hard code much except those
> things that are constant. Nagios has extensive macro capabilities and
> allows you to pass much data from service definitions and other parts
> of nagios to the commands being run. This allows you to re-use generic
> command definitions between many different services that check similar
> things. Have you read the Macro documentation, particularly passing
> arguments to commands?

I had no idea that "macros" described something having to do with command
arguments, and the examples I saw looked very limited and were all
involved with user names and passwords.

>> As a broader question, are there documents that give more of a logical
>> overview of Nagios, explaining how and why things are broken up and
>> how they work together?
>
> The published Documentation? Beyond that, ask specifics but be sure
> you've read the documentation first.

I tried to find anything relevant on commands in the published
documentation, and didn't find anything suggesting the possibility of
command-line arguments, so I thought I'd checked and found it wasn't
possible.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




Re: [Nagios-users] Why are there "commands"?

2009-08-28 Thread David Dyer-Bennet


On Fri, August 28, 2009 12:57, andr...@one.net wrote:
> Why not create more generic command definitions and pass the specific
> arguments along to the commands in your service definitions?
>
> http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#command

Aha!  That's precisely what I would prefer to do, but I had been unable to
find any indication that it was possible; hence my puzzlement about what
the intermediate level was good for.
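
A minimal sketch of that pattern, assuming the stock check_tcp plugin and hypothetical host, template and service names: one generic command takes the port as $ARG1$, and each service supplies its own value after the "!". Port 8076 is invented for illustration.

```
# One generic command; the port is supplied by each service.
define command {
    command_name    check_tcp_port
    command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
}

define service {
    use                     generic-service     ; hypothetical template
    host_name               winweb01            ; hypothetical host
    service_description     Orders service port
    check_command           check_tcp_port!8075
}

define service {
    use                     generic-service
    host_name               winweb01
    service_description     Billing service port
    check_command           check_tcp_port!8076
}
```

The command holds what is constant (which plugin, which flags), while each service holds only what differs.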

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info





[Nagios-users] Why are there "commands"?

2009-08-28 Thread David Dyer-Bennet
I don't really understand the purpose / utility of the "command" level of
abstraction in Nagios configuration.  (2.10; we're still on Centos 4.7).

To define a new service to check particular Windows web services we've
written, I define a service, and then it has to refer to a command, and
over in the command I have to hard-code the parameters needed to test this
specific service -- so in fact I need a separate command for each service.
This seems, to me, to just introduce confusion, and separate bits of
information that belong together.

Is this just a historical artifact that in fact doesn't make much sense,
or are there lots of cases where it's useful and makes it easier or
clearer to do what you want?

(I'm fine with "that's the way it works, but it doesn't really make much
sense as it turns out", I've got plenty of that in my own code; I'm just
looking for more understanding, in case it makes more sense than I've so
far figured out.)

As a broader question, are there documents that give more of a logical
overview of Nagios, explaining how and why things are broken up and how
they work together?
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




Re: [Nagios-users] I don't understand the check_by_ssh plugin

2008-12-05 Thread David Dyer-Bennet

On Fri, December 5, 2008 13:26, Patrick Morris wrote:

> In addition to the remote-plugin-execution approach via something like
> NSCA or NRPE, you can probably use SNMP to pull what you're looking for
> using the check_snmp plugin without having to install additional software
> on your monitored hosts.  It involves a bit more research up front to find
> the OIDs you want to watch, but it's pretty flexible and often doesn't
> require any special configuration on the monitored machines.

I have a bad history of failure to accomplish much of anything in at least
three runs at using SNMP to get data from various devices, so I tend to
shy away from the concept.  It seems to be a horrendous learning curve,
and none of the sites I've found so far make any *sense*.

Having said that -- can you recommend a site that talks about SNMP and
gives examples at the level of getting information out of a Linux box, and
perhaps some sort of household router/WAP?  It'd be good for me to learn,
if I had some reason to hope I'd have a better outcome than the last few
times.

It does seem very likely that the simple things I need from Linux may
already be in the MIB, meaning I just need to access stuff there rather
than add anything new.

-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




Re: [Nagios-users] command line parameters in service definitions

2008-12-05 Thread David Dyer-Bennet

On Fri, December 5, 2008 14:11, Marc Powell wrote:
>
> On Dec 5, 2008, at 1:57 PM, Alan McKay wrote:
>
>> OK folks, I've been working on some of the suggested improvements for
>> my check_iflocal plugin, and I cannot seem to get command line
>> parameters going into the script from my service definition.
>>
>> When I run the below from the BASH command line it works as I would
>> expect.   But as my service definition it is not working.   It does
>> not seem as though the WANTDUPLEX and WANTAUTONEG parameters are
>> getting into my script.  Yes, I restarted nagios after updating the
>> service definition.
>>
>> Any thoughts here?
>>
>> define service{
>>        use                 local-service
>>        host_name           localhost
>>        service_groups      Interfaces
>>        contact_groups      admins
>>        service_description eth0
>>        check_command       check_iflocal!eth0!WANTDUPLEX=half!WANTAUTONEG=on
>>        }
>
> You'll need to post the check_iflocal command{} definition. That's
> what determines how the command line is created. It should use the
> $ARG1$, $ARG2$ and $ARG3$ macros in the appropriate places for the
> substitution you're expecting.
>
> Better yet, read Example 2 of
> http://nagios.sourceforge.net/docs/3_0/macros.html

I will say that I got a command using one parameter working without any
trouble (and have since abandoned it for an approach making better use of
templates) yesterday.  That example (well, the 2.10 version) was useful to
me.
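
To make the mapping concrete, here is a sketch of a command definition that would receive the three values from the quoted check_command line. How check_iflocal actually parses its arguments is not shown in the thread, so passing them as three positional arguments is purely an assumption.

```
# check_command  check_iflocal!eth0!WANTDUPLEX=half!WANTAUTONEG=on
#   $ARG1$ = eth0
#   $ARG2$ = WANTDUPLEX=half
#   $ARG3$ = WANTAUTONEG=on
# Assumed calling convention: the plugin takes three positional arguments.
define command {
    command_name    check_iflocal
    command_line    $USER1$/check_iflocal $ARG1$ $ARG2$ $ARG3$
}
```

If the command_line mentions only $ARG1$, the second and third values are silently dropped, which would match the symptom described in the quoted message.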
-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




Re: [Nagios-users] I don't understand the check_by_ssh plugin

2008-12-05 Thread David Dyer-Bennet

On Fri, December 5, 2008 14:08, Marc Powell wrote:
>
> On Dec 5, 2008, at 11:55 AM, David Dyer-Bennet wrote:
>
>> When I run it from the command line simply, I just get the output of the
>> remote command (as if the plugin was just passing it through).  I see that
>> I could cause it to produce errors based on how long it takes, but nothing
>> about how to actually use the output.
>
> This is what check_by_ssh does. It's simply a transport mechanism to
> allow nagios to run a plugin on a remote host and receive that
> plugin's output. It does not perform any testing, just transport. The
> plugin that is being executed by check_by_ssh would be doing the
> testing that you're looking for. For example --
>
> define command {
>         command_name   check_disk_remote
>         command_line   $USER1$/check_by_ssh -t 120 -l username -H $HOSTADDRESS$ -C "/home/monitor/libexec/check_disk -t 40 -w 10% -c 5%"
>         }
>
> When nagios executes the check_disk_remote command, it will tell
> check_by_ssh to connect to $HOSTADDRESS$ as username and execute the
> command '/home/monitor/libexec/check_disk  -t 40 -w 10% -c 5%'.

Okay, I can work with that.

The --help output from the plugin, and I believe the website (but it's
timing out right now for me), gave an example using the ordinary "uptime"
command, and not a nagios plugin; and the text made no mention of the
thing run needing to be a nagios plugin. If it needs to be a plugin (or if
the vast majority of effective uses use a plugin) those docs should
probably be updated to reflect that!

> In order for the ssh connection to work, you do need to configure
> SSH's authorized_keys functionality. You can limit the host allowed to
> connect without password as well as the specific command that is run
> based on the key.

Sure, I'm familiar with that.  For command-line testing I'm relying on my
normal key and ssh-agent authentication, but I am planning to set up a
special nagios key that I can install, with limited command access, on
systems I need to monitor this way.
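
A sketch of how such a dedicated key could be wired into the command definition, assuming check_by_ssh's -i (identity file) option is available in the installed plugin version. The key path, login name and command name are hypothetical; the remote plugin path is taken from the quoted example.

```
# Hypothetical: /etc/nagios/ssh/id_nagios is a key generated for monitoring only,
# and "monitor" is the login it is authorized for on the remote hosts.
define command {
    command_name    check_disk_by_key
    command_line    $USER1$/check_by_ssh -t 120 -i /etc/nagios/ssh/id_nagios -l monitor -H $HOSTADDRESS$ -C "/home/monitor/libexec/check_disk -t 40 -w 10% -c 5%"
}
```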

-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




[Nagios-users] I don't understand the check_by_ssh plugin

2008-12-05 Thread David Dyer-Bennet
I'm running Nagios 2.10 (the Centos 5.2 packaged version).

I want to do some small local checks on each of a bunch of real and
virtual servers, and I really don't want to have to set up Nagios (even a
minimal install) on each of them just to check uptime, load average, and
disk space.  (Mostly I'm testing externally visible services on them.)

The documentation on this plugin doesn't seem to tell me anything about
what it does with the part of the command output it processes (I see that
the -S and -E options let me prune what command output it looks at).

When I run it from the command line simply, I just get the output of the
remote command (as if the plugin was just passing it through).  I see that
I could cause it to produce errors based on how long it takes, but nothing
about how to actually use the output.

Then I see some tantalizing hints about "passive" mode, where it writes a
file that seems to show it doing some parsing and making decisions based
on the data (in the example).  But I can't get the example to produce a
non-empty file from the command line.

So I'm pretty sure I'm missing something about how to use this plugin in
the first place, or what it's supposed to let me do, or some such.

Or, if the answer to what I'm trying to do is some other approach
entirely, I'd settle for enlightenment about that!

Help?
-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




Re: [Nagios-users] Where in the docs...

2008-12-04 Thread David Dyer-Bennet

On Thu, December 4, 2008 17:10, Hugo van der Kooij wrote:
> David Dyer-Bennet wrote:

>> And now of course I see how I should have found it myself.  I was getting
>> lost enough in the documentation navigation that I just didn't look
>> closely enough at the TOC (should have used text search, much more
>> reliable than the Mark I eyeball).

> Well. The Mk I eyeball combined with the heuristic algorithms installed
> in the Mk I brain can still deliver stunning results.

On a good day, anyway, sure.  :-)

On the other hand -- the Mk I brain should have suggested using text
search earlier, and it *didn't*.  Anybody know where I can find a Mk II
brain in working condition, cheap?
-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




[Nagios-users] Confusing error message

2008-12-04 Thread David Dyer-Bennet
Nagios 2.10 is giving me an actively misleading error message.  In this
segment of log, it says I possibly have an invalid "retry_interval"; then,
when I add one (there was none), it says that "retry_interval" is an
invalid directive.  I'm assuming that reading the config docs will tell me
what the real name is; this is just a grouse about the incorrect error.

Error: Invalid max_attempts, check_interval, retry_interval, or notification_interval value for service '$HOSTNAME$ ssh' on host 'prcapp02'
Error: Could not register service (config file '/etc/nagios/modcl.cfg', starting on line 42)

***> One or more problems was encountered while processing the config files...

 Check your configuration file(s) to ensure that they contain valid
 directives and data defintions.  If you are upgrading from a previous
 version of Nagios, you should be aware that some variables/definitions
 may have been removed or modified in this version.  Make sure to read
 the HTML documentation regarding the config files, as well as the
 'Whats New' section to find out what has changed.

[EMAIL PROTECTED] dev]$ ./install.sh
[EMAIL PROTECTED] dev]$ nagios -v /etc/nagios/nagios.cfg

Nagios 2.10
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 10-21-2007
License: GPL

Reading configuration data...

Error: Cannot open resource file '/etc/nagios/private/resource.cfg' for reading!
Error: Invalid host object directive 'retry_interval'.
Error: Could not add object property in file '/etc/nagios/linux.cfg' on line 21.
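
For reference, a sketch of what appear to be the Nagios 2.x spellings of these timing directives in a service definition, going by the 2.x object documentation; the shorter check_interval/retry_interval names for services seem to have arrived with 3.x. The values, the generic-service template and the check_ssh command name are hypothetical. The second error above came from a host definition, which in 2.x does not appear to have a retry-interval directive at all.

```
# Nagios 2.x directive names (assumed from the 2.x object docs; values are illustrative).
define service {
    use                     generic-service     ; hypothetical template
    host_name               prcapp02
    service_description     ssh
    check_command           check_ssh           ; hypothetical command name
    max_check_attempts      3
    normal_check_interval   5                   ; check_interval in 3.x
    retry_check_interval    1                   ; retry_interval in 3.x
    notification_interval   60
}
```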


-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




[Nagios-users] Testing the configuration

2008-12-04 Thread David Dyer-Bennet
Since the top configuration file, at least in the examples I have, gives
absolute paths to the other config files it specifies, it makes it hard to
test a configuration before putting it into production (short of a
separate test system, perhaps a virtual one).  Are those absolute paths
needed?  The doc gives examples showing absolute paths, and doesn't talk
about what a relative path would mean (but doesn't say it's forbidden,
either).  Are relative paths allowed?  Can we predict what they'd be
relative to?

Running Nagios 2.10 (due to Centos 5 packaging).
-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




Re: [Nagios-users] Where in the docs...

2008-12-04 Thread David Dyer-Bennet

On Thu, December 4, 2008 14:48, Andy Shellam wrote:

> This explains how macros work:
> http://nagios.sourceforge.net/docs/3_0/macros.html

Thanks very much, that's just what I need.

And now of course I see how I should have found it myself.  I was getting
lost enough in the documentation navigation that I just didn't look
closely enough at the TOC (should have used text search, much more
reliable than the Mark I eyeball).
-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info




[Nagios-users] Where in the docs...

2008-12-04 Thread David Dyer-Bennet
...could I find information on macro definition and substitution? 
Comments in various places describe "$HOSTADDRESS$" as some kind of macro
substitution, and other examples show multiple sets of parameters in
somewhat strange syntax in service invocations (and refer to $ARG1$ and
such), but I haven't found where these things are actually documented or
explained at all.

I've been able to write a number of new service, command, and a couple of
host definitions, even some that use parameters, and they work, which is
nice (and even useful, already), but I need to understand macros more; are
they pre-coded, which ones exist, and so forth.  I'm sure the way I'm
doing things now is repeating things a lot more than is necessary (yes,
I've looked at the tips for multiple hosts with the same services, etc.,
and I do understand at least some of them).
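
A short sketch of where the three kinds of macro in a typical command line get their values, assuming the packaged plugin directory /usr/lib/nagios/plugins. The host name, alias, address and template are hypothetical, and the thresholds follow the usual check_ping form.

```
# resource.cfg: user-defined macros ($USER1$ through $USER32$)
$USER1$=/usr/lib/nagios/plugins

# Object config: $HOSTADDRESS$ is a built-in macro taken from the host's "address".
define host {
    use         generic-host        ; hypothetical template
    host_name   appserver01         ; hypothetical host
    alias       Application server
    address     192.0.2.10          ; documentation-range address
}

# $ARG1$ and $ARG2$ are the "!"-separated values in a service's check_command,
# e.g. check_command  check_ping!100.0,20%!500.0,60%
define command {
    command_name    check_ping
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$
}
```

So $USERn$ macros are site-defined, $HOSTADDRESS$ and its relatives are built in and filled from the object being checked, and $ARGn$ macros are positional values from the check_command line.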

-- 
David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

