from:"Frost, Mark \{PBC\}"

Re: [Nagios-users] Hostgroup tricks?

2011-11-08 Thread Frost, Mark {PBC}

From: Tim AtLee [mailto:t.at...@cfertech.com]
Sent: Tuesday, November 08, 2011 9:46 AM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Hostgroup tricks?

Hello

I have a hostgroup defined as:
define hostgroup {
hostgroup_name  ping-servers
alias   Pingable hosts
members *
}

I have recently added a host outside our firewall that has ping disabled.  I 
have changed the host's check_command to be 'check_tcp!80' so that the host 
won't be offline permanently, but I am wondering if there is a way to exclude 
this host from the 'ping-servers' hostgroup in the host definition?

Ideally, something like:
define host {
host_name   outsidefirewallhost
alias   Host outside firewall
address some.ip.address
use generic-host

hostgroup!ping-servers
}

This generates an error when I test the configuration.  The only way I have 
been able to achieve this is to change the ping-servers hostgroup definition to 
exclude this individual host (*,!outsidefirewallhost), but I'd rather keep the 
exclusion defined in the host, not in the "blanket rule".

Maybe it's just me being OCD...  but is this possible?

Thanks,

Tim

Tim,

I'm a little unclear about your question.  Are you trying to alter the "Host 
Check Command" for a single host definition?  That is, the method used by 
Nagios to determine if a host is up or not?  If that's the case, you can just 
override the definition for that one host:

define host {
host_name   outsidefirewallhost
alias   Host outside firewall
address some.ip.address
check_command   check-tcp-port-80
use generic-host

}

Check the docs for the details on precedence, but in short: the "generic-host" 
template you inherit from will specify a check_command (usually ping or, better 
yet, fping), and defining a different value in the host definition itself 
overrides it for that one host.

If that's not what you mean, and you want to change a specific service to check 
everything in that hostgroup except that one host, that would look something 
like

define service {
hostgroup_name  ping-servers
host_name   !outsidefirewallhost
service_description   My Service
check_command   run-a-ping
use generic-service

}
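
As a rough illustration of the exclusion in the service definition above, here is a minimal Python sketch of how "every member of the hostgroup except the '!' entries" resolves to a list of hosts.  This is only a simplified model for reasoning about the config, not how Nagios implements it; the function name and the sample host names 'web01' and 'db01' are made up for the example.

```python
def expand_targets(hostgroup_members, host_name_field):
    """Return the hosts a service applies to.

    Simplified model: start with the hostgroup members, then apply the
    comma-separated host_name field, where a leading '!' removes a host.
    """
    included = set(hostgroup_members)
    for entry in (e.strip() for e in host_name_field.split(",")):
        if not entry:
            continue
        if entry.startswith("!"):
            included.discard(entry[1:])   # exclusion
        else:
            included.add(entry)           # explicit extra host
    return sorted(included)


# Hypothetical members of the 'ping-servers' hostgroup.
members = ["web01", "db01", "outsidefirewallhost"]
print(expand_targets(members, "!outsidefirewallhost"))
```

With the sample data, the excluded host drops out and the other two remain.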

Hopefully I've understood your question...

Mark
--
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Suggestions for event correlation managers?

2011-08-09 Thread Frost, Mark {PBC}

Splunk perhaps?

Mark

From: Furnish, Trever G [tgfurn...@herffjones.com]
Sent: Tuesday, August 09, 2011 12:30 AM
To: Nagios Users List
Subject: Re: [Nagios-users] Suggestions for event correlation managers?

Anyone?  C'mon, don't be shy! :-)

--
Trever

From: Furnish, Trever G [tgfurn...@herffjones.com]
Sent: Friday, August 05, 2011 4:45 PM
To: nagios-users@lists.sourceforge.net
Cc: Boeglin, Adam R
Subject: [Nagios-users] Suggestions for event correlation managers?

Hello,

I'm looking for suggestions for applying Nagios' style of event handling 
(escalations, recoveries, acknowledgements), hopefully with some improvements 
(aggregation), to events coming from many different (non-Nagios) sources.  I 
know of a few Nagios-specific notification aggregators, but can anyone 
recommend a good (preferably inexpensive / OSS) way of expanding that to 
include many other tools?  I know about SNARE and RiverMuse, but they're 
relatively expensive.

We make heavy use of Nagios as well as several other tools (MSFT SCOM, HP SIM, 
Oracle Grid Control, AlertSite.net, etc).  They're all sending alerts in 
various forms to a small group of admins and engineers, so many of us get 
alerts from all of the tools, sometimes from more than one tool regarding a 
single event.

Nagios does a great job of flexibly managing alerts from its own events, but I 
don't see how I'd hook in the other tools.  Several of the tools (e.g. SCOM and 
SIM) don't even have any concept of event correlation -- breakage and recovery 
are two separate events.

I see tools like SNARE, RiverMuse ECM, and a few others filling this gap, at 
least partially, but I don't yet have experience with them and they're 
relatively expensive.  Anyone doing this effectively with OSS tools or low-cost 
tools or a good home-grown approach you wouldn't mind sharing (and possibly 
collaborating on)?

--
Trever Furnish, tgfurn...@herffjones.com
Herff Jones, Inc. Solutions Architect
Phone: 317.612.3519

Re: [Nagios-users] Services are dependent on the host they run on?

2011-05-26 Thread Frost, Mark {PBC}

Maybe I'm missing something but I thought that suppressing notifications for 
services on the same host when the host goes down is the default behavior.  
It's only when you have to suppress notifications from different hosts that you 
need host/service dependencies.


Mark

-----Original Message-----
From: Assaf Flatto [mailto:nag...@flatto.net] 
Sent: Thursday, May 26, 2011 1:39 PM
To: Nagios Users List
Subject: Re: [Nagios-users] Services are dependent on the host they run on?

Martin Hugo wrote:
> Hi Robi,
>
> I have never done it but I know you can make hosts/services children that 
> will not report if the parent is down.
>
> Hope this puts you on the right track.
>
> Marty
>
> -----Original Message-----
> From: Roberto Nunnari [mailto:roberto.nunn...@supsi.ch] 
> Sent: Thursday, May 26, 2011 12:47 PM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] Services are dependent on the host they run on?
>
> Hi all.
>
> Some time ago, I've installed and configured nagios to monitor our IT 
> infrastructure.
>
> It works very well and we're happy with it.
>
> There's still one problem though:
> When a host goes down, nagios sends notifications not only for host 
> down, but also for all services running on that host. When a host goes 
> down, I would like nagios to only send notifications about the host 
> down, and not for all the services running on that host.
>
> How can I achieve that?
> Might it be a configuration error on my side? I thought that to nagios, 
> services would be dependent on the host running them..
>
> Any hint/advice/guidance is very welcome.
>
> Thank you and best regards.
>
> Robi
>   
check out service dependencies

http://nagios.sourceforge.net/docs/3_0/dependencies.html




Re: [Nagios-users] Nagios Core & Remedy Ticketing Integration

2011-02-23 Thread Frost, Mark {PBC}

We don't use Remedy, but another ticketing system.  In our case, the app (or 
someone who worked with the app) provided a command-line script that you can 
use to create the actual ticket.  I then created an event handler for a failing 
service that calls that command-line utility to create the ticket.  You could 
also do this via a notification, depending on what you want, as long as you 
rewrite the notification command to call your "make me a ticket with these 
parameters" program instead of mail.

Either way, I believe the key piece is whatever means Remedy gives you for 
opening a ticket from a command line or, worst case, via some web interface 
that you have Nagios log in to in some automated fashion.
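
To make the "call a ticket program instead of mail" idea concrete, here is a minimal Python sketch of a wrapper that turns the values Nagios passes in into a ticket command line.  The CLI path '/usr/local/bin/create-ticket' and its '--summary', '--details' and '--source' flags are purely hypothetical; substitute whatever your ticketing system actually provides.

```python
import subprocess

# Hypothetical ticket CLI; not a real Remedy tool.
TICKET_CLI = "/usr/local/bin/create-ticket"


def build_ticket_command(host, service, state, output):
    """Build the argument list for the (hypothetical) ticket CLI."""
    summary = "%s/%s is %s" % (host, service, state)
    return [TICKET_CLI, "--summary", summary,
            "--details", output, "--source", "nagios"]


def open_ticket(host, service, state, output):
    """Run the ticket CLI and return its exit code.

    An argument list (no shell) is used so that plugin output containing
    quotes or semicolons is never interpreted by a shell.
    """
    cmd = build_ticket_command(host, service, state, output)
    return subprocess.run(cmd, check=False).returncode


# Show the command that would be run for a sample failure.
print(build_ticket_command("web01", "HTTP", "CRITICAL", "Connection refused"))
```

In the Nagios command definition you would pass the standard macros $HOSTNAME$, $SERVICEDESC$, $SERVICESTATE$ and $SERVICEOUTPUT$ as the four arguments.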

Mark

From: steve f [mailto:a31mod...@hotmail.com]
Sent: Wednesday, February 23, 2011 1:23 PM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Nagios Core & Remedy Ticketing Integration

Hello All,

Has anyone integrated Nagios with BMC Remedy for ticket creation?

I am looking at ARCPerl for this since our Remedy infrastructure is not set up 
to receive e-mails.

Has anyone tried this?  Any info / horror stories would be appreciated.

Thanks,

Steve

Re: [Nagios-users] check if SAP login is possible

2011-02-23 Thread Frost, Mark {PBC}

Werner,

I can't say that I'm an expert at any of these methods, but there are a
few possibilities you might explore.

- WebInject.  It allows you to write these kinds of request/response
scripts that walk through interaction with a website, including a
login.  There's even some stuff about using it directly with Nagios
on their site.

- Perl and the WWW::Mechanize module.  This allows you to do something
similar to WebInject by writing your own Perl script that interacts with
a website including "pressing" buttons, etc.  I would also recommend
Ton Voon's spiffy Nagios::Plugin module to handle the Nagios plugin
duties.

In either case, you would probably want to create some dummy/test
user to attempt the login with.
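
For illustration only, here is a minimal sketch of the same request/response idea in Python, using just the standard library rather than WebInject or WWW::Mechanize.  The form field names, the URL and the "success marker" text are all assumptions; a real SAP login page will need its own values and possibly extra steps.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

OK, CRITICAL = 0, 2  # Nagios plugin exit codes


def evaluate_login(status, body, success_marker):
    """Decide OK/CRITICAL from the HTTP status and page body after login."""
    if status == 200 and success_marker in body:
        return OK, "LOGIN OK - found %r after login" % success_marker
    return CRITICAL, "LOGIN CRITICAL - HTTP %s, %r not found" % (
        status, success_marker)


def attempt_login(url, fields, success_marker, timeout=10):
    """POST the login form (with cookie handling) and evaluate the result."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    data = urllib.parse.urlencode(fields).encode("ascii")
    try:
        with opener.open(url, data=data, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate_login(resp.status, body, success_marker)
    except OSError as exc:  # covers URLError, HTTPError and timeouts
        return CRITICAL, "LOGIN CRITICAL - %s" % exc


# Offline demonstration of the decision logic only.
print(evaluate_login(200, "<h1>Welcome, nagiostest</h1>", "Welcome")[1])
```

A plugin built on this would call attempt_login() with the dummy/test user's credentials, print the message and exit with the returned code.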

Mark

-----Original Message-----
From: Werner Flamme [mailto:werner.fla...@ufz.de] 
Sent: Wednesday, February 23, 2011 8:15 AM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] check if SAP login is possible


Hi everyone,

last week I had a new problem - all Nagios checks of the SAP systems
succeeded, but no one was able to login or to work inside SAP. The users
got a timeout message, but remained logged in. The usual checks via
check_sap_cons still delivered their standard output.

How can I check if a SAP login is possible or not? As a first step, I
check the https:// login screen (with check_http), but how can I check
that a user may log in after seeing that screen?

BTW, the reason was a shared filesystem full to the brim...

Regards,
Werner


Re: [Nagios-users] CPU monitor for a single Linux user space process ?

2010-12-15 Thread Frost, Mark {PBC}


> -----Original Message-----
> From: Bruce Edge [mailto:bruce.e...@gmail.com] 
> Sent: Wednesday, December 15, 2010 8:06 PM
> 
> Rookie question here. Trying to determine nagios suitability for an
> embedded app.
> 
> Can I monitor the CPU utilization for a single user space process on a
> Linux box with nagios?
> And, can I define an action if it exceeds a threshold?
> 
> Thanks
> 
> -Bruce

Bruce,

I'm not sure whether there's an existing check plugin that would do this
(there might be).

I can say that "yes", you can do this; it's just a question of what you're
willing to do.  If I were to do this for our environment, I'd write a perl
script that used the 'ps' command to look at the process and pull the 'pcpu'
field (% cpu -- see the 'ps' man page) for that process.  I'd also use the
Nagios::Plugin perl module to make the Nagios side easier and probably report
the actual pcpu value as performance data suitable for graphing.
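
As a rough sketch of that plugin (in Python rather than perl, and without Nagios::Plugin), the following reads 'pcpu' for one PID via 'ps' and turns it into a Nagios state plus performance data.  The function names and the thresholds are illustrative assumptions.  Also note that on Linux 'pcpu' is CPU time over the life of the process, not an instantaneous reading, so check the 'ps' man page to see whether that is what you want.

```python
import os
import subprocess

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes


def read_pcpu(pid):
    """Return the 'pcpu' value ps reports for pid, or None if not found."""
    result = subprocess.run(
        ["ps", "-o", "pcpu=", "-p", str(pid)],
        capture_output=True, text=True, check=False,
        env={**os.environ, "LC_ALL": "C"},  # force '.' as decimal separator
    )
    out = result.stdout.strip()
    return float(out) if out else None


def evaluate(pcpu, warn, crit):
    """Map a pcpu value to (exit_code, plugin output with perfdata)."""
    if pcpu is None:
        return UNKNOWN, "CPU UNKNOWN - process not found"
    if pcpu >= crit:
        state, label = CRITICAL, "CRITICAL"
    elif pcpu >= warn:
        state, label = WARNING, "WARNING"
    else:
        state, label = OK, "OK"
    # Perfdata after the '|' follows the label=value[UOM];warn;crit layout.
    return state, "CPU %s - %.1f%% | pcpu=%.1f%%;%g;%g" % (
        label, pcpu, pcpu, warn, crit)


# Demonstrate the evaluation with a fixed sample value.
print(evaluate(12.5, 50, 90)[1])
```

A real plugin would call read_pcpu() with the PID of interest, print the message and exit with the returned code.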

You could then configure an event handler on that service check.  That's
essentially another script that gets called when the state changes on the
check.  This means it gets called any time the state changes, including when it
goes to an "OK" state, so the script needs to look at the state it's called with
and potentially exit if the check hasn't gone into a hard critical state
(depending on what you want, actually).  You can read up on event handlers in
the Nagios documentation.
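
Here is a minimal Python sketch of that "exit unless it's a hard critical" guard.  It assumes the handler is passed the values of the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros as arguments; the function names are made up, and the action itself is left as a placeholder.

```python
def should_act(state, state_type):
    """Act only once the service is in a HARD CRITICAL state."""
    return state == "CRITICAL" and state_type == "HARD"


def handle(state, state_type, attempt):
    """Return a description of what the handler would do for this call."""
    if not should_act(state, state_type):
        return "ignored %s/%s (attempt %s)" % (state, state_type, attempt)
    # Placeholder: the real action (restart, ticket, etc.) would go here.
    return "acting on hard CRITICAL (attempt %s)" % attempt


# The handler is invoked on every state change, so most calls are ignored.
for args in [("OK", "HARD", "1"),
             ("CRITICAL", "SOFT", "2"),
             ("CRITICAL", "HARD", "3")]:
    print(handle(*args))
```

Whether to act on soft states or on recoveries as well is a design choice; the guard is the one place to change it.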

Mark


[Nagios-users] converting distributed Nagios setup to Nagios+Merlin

2010-12-15 Thread Frost, Mark {PBC}

Our site currently uses a somewhat traditional distributed Nagios setup.  I'm 
setting up merlin on some new Nagios servers and am looking at what 
configurations I'm going to want to change.  As part of that, I realize that 
there are some Nagios config directives that I wanted some clarification on 
before I started changing things.  I haven't seen these documented elsewhere 
(at least not that I could find).

I was looking for clarification on the following:

1) Obsessive (ocsp/ochp) configuration directives get turned off.  Merlin does 
all that.  Plus ocsp/ochp is deemed detrimental to performance making that 
another reason to turn it off.

2) Freshness checking.  Nagios would probably still try to do this if I left it 
in, but there's no point since Merlin will also do this.

3) Passive/Active checks.  If I understand things correctly under Merlin 
everything is an active check.  Or rather, anything that Nagios is supposed to 
run on some host or another is an active check.  Things that are truly sent via 
NSCA from some monitored host out there would still be passive, but otherwise 
everything's configured to run actively and Merlin takes care of where it runs.

4) In a load balanced/redundant configuration (such as 'yoda' and 'obi1' in the 
HOWTO doc), which of 'yoda' or 'obi1' sends out notifications?  Or do they both 
send them out but Merlin somehow only has one of them send it?  I'm guessing 
that this is handled in the more traditional way where notifications are 
enabled on say, 'yoda' but disabled on 'obi1'.  If 'yoda' crashes, you manually 
enable the alerts via the command file on 'obi1'?  It would of course be 
super-cool if Merlin handled all that :-).

5) Other parameters such as
process_perf_data - still probably only on the master(s), but that's really up 
to how crazy we'd want to be.
event handler settings - unchanged by this configuration
retain status information - unchanged by this configuration

Thanks

Mark




Re: [Nagios-users] distributed nagios ?

2010-12-15 Thread Frost, Mark {PBC}

> -----Original Message-----
> From: Andreas Ericsson [mailto:a...@op5.se] 
> Sent: Wednesday, December 15, 2010 4:46 AM
> 
> On 12/14/2010 08:39 PM, Frost, Mark {PBC} wrote:
>> 
>> Hooray!
>> 
>> Actually, I wanted to point out a few things I found when building the
>> most recent version of merlin.  At the heart of my issues
>> is that our team is not allowed root access on these servers (long boring
>> corporate story...) so I'm installing everything in an alternate tree.
>> 
>> 1) There are a couple of hard-coded paths in ipc.c and node.c for
>> the socket and the binlogs.  I'm assuming that's intentional, but it
>> does mean one has to manually edit the source files to point to different
>> paths rather than specifying anything like that during the build process.
>> 
>
> The socket location can be configured. Binlogs cannot. I'll amend that in
> the next release though. The core functionality is there, but there's no
> option to set it in the config files, which is kinda stupid.

"Binlogs cannot" meaning it can't be moved without modifying the code
directly, right?  Because that's what I did :-).

>> 2) Because we're trying to put all the files into an alternate tree, the
>> installation of 'mon' from install-merlin.sh didn't really work right.

> Yes. The install-merlin.sh script is designed to be usable from the
> rpm spec file, and it's meant to aid people who want to install
> everything in its default location. Would $root_path/$bindir/mon
> work for you? Since you can set $root_path to whatever you want,
> I suppose it should.

Yes, I believe that would work for me.  I'm not setting $root_path at all.

Thanks, Andreas.

Mark



Re: [Nagios-users] distributed nagios ?

2010-12-14 Thread Frost, Mark {PBC}


> -----Original Message-----
> From: Andreas Ericsson [mailto:a...@op5.se] 
> Sent: Tuesday, December 14, 2010 4:49 AM
> To: nagios List; doc...@yahoo.com
> Subject: Re: [Nagios-users] distributed nagios ?
> 
>> Any pointers to docs on how to set it up?
>> 
> 
> http://git.op5.org/git/?p=nagios/merlin.git;a=blob;f=HOWTO;hb=master
> http://git.op5.org/git/?p=nagios/merlin.git;a=blob;f=README;hb=master
> https://wiki.op5.org/merlin:start#guides
> 
> If I were you, I'd wait til tomorrow with installing it though, when 1.0.0
> is released as stable. Reading up on the docs and whatnot beforehand is
> still a good idea though.
> 
> -- 
> Andreas Ericsson   andreas.erics...@op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225  Fax: +46 8-230231

Hooray!

Actually, I wanted to point out a few things I found when building the
most recent version of merlin.  At the heart of my issues
is that our team is not allowed root access on these servers (long boring
corporate story...) so I'm installing everything in an alternate tree.

1) There are a couple of hard-coded paths in ipc.c and node.c for
the socket and the binlogs.  I'm assuming that's intentional, but it
does mean one has to manually edit the source files to point to different
paths rather than specifying anything like that during the build process.

2) Because we're trying to put all the files into an alternate tree, the
installation of 'mon' from install-merlin.sh didn't really work right.  In
our case, it made a lot more sense to change

cp apps/mon.py $root_path/usr/bin/mon

to

cp apps/mon.py $bindir/mon

otherwise it would put 'mon' in a really weird spot.

I'm guessing these are design decisions on your part, but in case they're
not, I thought I'd point them out.

Thanks

Mark


Re: [Nagios-users] high latency

2010-12-11 Thread Frost, Mark {PBC}

> -----Original Message-----
> From: Andreas Ericsson [mailto:a...@op5.se] 
> Sent: Tuesday, December 07, 2010 5:57 PM
> To: Frost, Mark {PBC}
> Cc: Nagios Users List
> Subject: Re: [Nagios-users] high latency
> 
> > 
> > Any chance that the OP5 site will eventually be
> > configured to allow git through a proxy?  It's of course less convenient to
> > use snapshot tarballs, but still workable, of course.
> > 
> 
> You mean through http? Doesn't it already? I think it's supposed to. I can
> check up on that later. The gitweb page has links for grabbing latest master
> as a tarball though. That might work as an interim solution.
>
> -- 
> Andreas Ericsson   andreas.erics...@op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225  Fax: +46 8-230231

Andreas,

It's just never worked for me and I thought you'd mentioned some time ago that
OP5's git site just didn't support it.

I've validated that my version of git (1.7.1) will grab code from a public site
via our corporate proxy using other public code (the proxy is set up via the
$http_proxy environment variable):

$ git clone http://github.com/schacon/grack.git
Initialized empty Git repository in /home/mfrost0/src/grack/.git/
remote: Counting objects: 85, done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 85 (delta 32), reused 80 (delta 31)
Unpacking objects: 100% (85/85), done.

but...

$ git clone http://git.op5.org/nagios/merlin.git merlin-src
Initialized empty Git repository in /home/mfrost0/src/merlin-src/.git/
fatal: http://git.op5.org/nagios/merlin.git/info/refs not found: did 
you run git update-server-info on the server?
$ git clone http://git.op5.org/nagios.git nagios-src
Initialized empty Git repository in /home/mfrost0/src/nagios-src/.git/
fatal: http://git.op5.org/nagios.git/info/refs not found: did you run 
git update-server-info on the server?

so, you know :-(

Thanks

Mark


Re: [Nagios-users] high latency

2010-12-07 Thread Frost, Mark {PBC}


> -----Original Message-----
> From: Andreas Ericsson [mailto:a...@op5.se] 
> Sent: Tuesday, December 07, 2010 9:44 AM
> 
> > Hmm.  So then I'd be so curious why the 2 distservers which are both using
> > oc[sh]p commands the same way have such radically different latencies.
> > 
>
> Agreed. There must be other differences too. Perhaps there's trouble resolving
> from one of the nodes? That usually makes checks run a helluva lot longer than
> they normally have to.

I had another look.  While I found a test host that I'd made that was
deliberately unreachable, I found that when I removed it it made no
difference.  Execution times are significantly lower (min/max/avg) on
the host with the high latencies than for the one with low latencies.
I don't see any unresolvable hosts or, now, any unreachable hosts.
Puzzling.

I've always wished there was an easy way to see which processes had
high latencies from the web interface without having to view the status.dat
file...
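
Short of a web view, a small script can at least pull the worst offenders out of status.dat.  Here is a minimal Python sketch, assuming the usual Nagios 3 layout of 'servicestatus { ... }' blocks containing key=value lines such as host_name, service_description and check_latency.  The function names and the sample data are made up.

```python
import re

BLOCK_START = re.compile(r"^(\w+)\s*\{$")


def parse_status(text):
    """Parse status.dat-style text into a list of dicts, one per block."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        match = BLOCK_START.match(line)
        if match:
            current = {"_type": match.group(1)}
        elif line == "}" and current is not None:
            blocks.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


def top_latencies(text, limit=10):
    """Return (latency, host, service) tuples, highest latency first."""
    rows = [
        (float(b["check_latency"]), b.get("host_name", "?"),
         b.get("service_description", "?"))
        for b in parse_status(text)
        if b["_type"] == "servicestatus" and "check_latency" in b
    ]
    return sorted(rows, reverse=True)[:limit]


# Made-up sample in the status.dat layout.
SAMPLE = """
servicestatus {
\thost_name=hosta
\tservice_description=PING
\tcheck_latency=0.382
\t}
servicestatus {
\thost_name=hostb
\tservice_description=NRPE
\tcheck_latency=81.300
\t}
"""

for latency, host, service in top_latencies(SAMPLE):
    print("%8.3f  %s / %s" % (latency, host, service))
```

To use it for real, read the file named by the status_file setting in nagios.cfg and pass its contents to top_latencies().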

> > Either way, you're suggesting that having a NEB module handle the
> > post-check work will eliminate the serialization.

> Yes. Sneaking a peek at what's needed in order for an event to get sent to
> master via an eventbroker compared to running an oc[sh]p command renders
> this, more or less:

> [ good stuff snipped...]

Wow.

> In terms of effort, the difference is sort of like either hopping on one
> leg along the entire Great Wall of China or walking to the kitchen and
> grabbing a beer.

> > 
> > parallelize_check is set to 1 everywhere.
> 
> Does one server have a lot of random service failures? On-demand hostchecks
> are still run in parallel.

I don't think so.  Intermittent you mean?  Not as far as I know or can see.

> > > What version of Nagios are you running?
> > 
> > 3.2.1
> 
> I take it upgrading makes no difference?

To 3.2.3?   I'll probably try that on the new servers, but if things work out I
may just move to Merlin + 3.2.4.  I wasn't sure I saw anything in the 3.2.3
release that I found compelling for us at the time.  As I say, this system now
has fairly high visibility so just trying something like that would involve a
rather painful internal change process.  It's like piloting the QE2 -- I can't
change course very quickly :-)

> > Thanks, Andreas.  I'm hoping to allocate sufficient resources on the new
> > servers to be able to play with Merlin more there.
> 
> It's quite resource-friendly actually. Well, compared to what you're running
> now it's positively feather-light.

I meant more like installing MySQL everywhere, building filesystems to hold the
MySQL data, etc.  Not so much like I need more memory or more CPUs.  I don't
remember seeing anything in the Merlin docs (maybe I missed it), but how
large would the MySQL database need to be?  Pretty small on each box, right?
Like 500MB or less?

> >  Will I be able to have the performance
> > data from a poller be sent up to a NOC for digestion by pnp4nagios?
>
> Yes, but you'll need the threadsafe version of Nagios you can obtain from
> either CVS or git://git.op5.org/nagios.git for performance-data to work.
> Actually, you need that for Merlin to work.

That's part of the plan.  Any chance that the OP5 site will eventually be
configured to allow git through a proxy?  It's of course less convenient to
use snapshot tarballs, but still workable, of course.

> >  It may have been a long time ago, but I thought I remember seeing that
> > performance data was not yet implemented.
> > 
> 
> That was then. This is now :)

Spifftacular!

> > No we'd be using some flavor of SLES.
> > 
> 
> Should work marvellously then.

Thanks as always for your help, Andreas.

Mark


Re: [Nagios-users] high latency

2010-12-06 Thread Frost, Mark {PBC}


> -----Original Message-----
> From: Andreas Ericsson [mailto:a...@op5.se] 
> Sent: Monday, December 06, 2010 6:06 AM
> To: Nagios Users List
> Cc: Frost, Mark {PBC}
> Subject: Re: [Nagios-users] high latency
> 
> On 12/03/2010 08:14 PM, Frost, Mark {PBC} wrote:
> > 
> > I too struggle with them and I'm running on lightly-loaded physical
> > hardware.  We have 2 servers doing the checks sending back to a central
> > server.  Both distributed nodes use ocsp/ochp, but they do nothing more than
> > append results to a file (i.e. it exits quickly).  Results are handled
> > outside of Nagios.
> > 
>
> Try getting rid of the oc[sh]p commands and use Merlin or google for "pnsca"
> or "persistent nsca". There's one available from op5's repositories that may
> or may not work, and there's one from somewhere else that they're apparently
> using to great effect.
> 
> Even if it exits quickly, it's still executed serially, so checking halts a
> small period of time for each and every check that runs.

Hmm.  So then I'd be so curious why the 2 distservers which are both using
oc[sh]p commands the same way have such radically different latencies.

Either way, you're suggesting that having a NEB module handle the
post-check work will eliminate the serialization.

> > What's odd is that distserver 1 and distserver 2 are configured the same
> > 
> > distserver1:
> > Hosts Checked:  675
> > Services Checked:  4179
> > Active Service Latency: 0.000 / 3.155 / 0.382 sec
> > Active Service Execution Time:  0.000 / 60.038 / 0.145 sec
> > 
> > distserver2:
> > Hosts Checked:  261
> > Services Checked:  4289
> > Active Service Latency: 0.000 / 169.977 / 81.300 sec
> > Active Service Execution Time:  0.000 / 15.270 / 0.211 sec
> > 
> > yet as you can see, distserver2's latency is much higher and always has
> > been.  I tried turning off EPN yesterday on distserver2 and it had no
> > discernable effect.  We added 400 new service checks yesterday on
> > distserver2 (just more of the same checks we already do but on 26 new
> > hosts) and the latency went from 35 to over 80.
> > 
> 
> What kind of checks are you running? Some plugins draw a lot of cpu.
> Are any of the checks set to run in serial (grep for parallelize_check in your
> objects.cache file).

parallelize_check is set to 1 everywhere.

Most things are NRPE checks (also NRPE to NSClient++).  Some are locally
running perl scripts and others are locally running things like check_http.


> What version of Nagios are you running?
> 

3.2.1

> > The checks we do are very different (Windows, Linux, Unix, many are
> > app-centric) so it's difficult to compare exactly what runs on distserver1
> > and distserver2, but given the jump that was taken yesterday, I'm wondering
> > if the fact that the type of checks on these new hosts are all built on
> > dependencies make me wonder if that doesn't have something to do with it.
> > These hosts (Windows) have a basic check for NRPE and all other checks on
> > the host are dependent on the NRPE check succeeding.
> > 
> > I have to move to all new Nagios servers very soon.  I'm interested in
> > Merlin, but given its non-production nature just yet, I'm hesitant to
> > commit and I'm not sure if it will help me here.
> > 
> It's been running at our 400+ customers with very few problems for the past
> month.  0.9.1, released just yesterday, solves the known issues our customers
> have encountered. You might want to take a look at it again. There are some
> issues on FreeBSD though (was that you reporting them?). I just recently got
> a new laptop with better support for running virtual systems, so I'm
> downloading a FreeBSD 8.1 install dvd as we speak. Hopefully I'll have those
> issues sorted out before the end of the week.
> 
> -- 
> Andreas Ericsson   andreas.erics...@op5.se

Thanks, Andreas.  I'm hoping to allocate sufficient resources on the new servers
to be able to play with Merlin more there.  Will I be able to have the
performance data from a poller sent up to a NOC for digestion by pnp4nagios?
It may have been a long time ago, but I thought I remembered seeing that
performance data was not yet implemented.

No, we'd be using some flavor of SLES.

Thanks

Mark


Re: [Nagios-users] high latency

2010-12-03 Thread Frost, Mark {PBC}


Can the use of dependencies also be the cause of increased latencies?

I too struggle with them and I'm running on lightly-loaded physical hardware.
We have 2 servers doing the checks sending back to a central server.  Both
distributed nodes use ocsp/ochp, but they do nothing more than append results
to a file (i.e. it exits quickly).  Results are handled outside of Nagios.

What's odd is that distserver1 and distserver2 are configured the same:

distserver1:
Hosts Checked:  675
Services Checked:  4179
Active Service Latency: 0.000 / 3.155 / 0.382 sec
Active Service Execution Time:  0.000 / 60.038 / 0.145 sec

distserver2:
Hosts Checked:  261
Services Checked:  4289
Active Service Latency: 0.000 / 169.977 / 81.300 sec
Active Service Execution Time:  0.000 / 15.270 / 0.211 sec

Yet, as you can see, distserver2's latency is much higher and always has been.
I tried turning off EPN yesterday on distserver2 and it had no discernible
effect.  We added 400 new service checks yesterday on distserver2 (just more
of the same checks we already do but on 26 new hosts) and the average latency
went from 35 to over 80 seconds.

The checks we do are very different (Windows, Linux, Unix, many are
app-centric), so it's difficult to compare exactly what runs on distserver1 and
distserver2.  But given yesterday's jump, I wonder whether the fact that the
checks on these new hosts are all built on dependencies has something to do
with it.  These hosts (Windows) have a basic check for NRPE, and all other
checks on the host are dependent on the NRPE check succeeding.

I have to move to all new Nagios servers very soon.  I'm interested in Merlin,
but given its non-production nature just yet, I'm hesitant to commit and I'm
not sure if it will help me here.

Thanks

Mark


Re: [Nagios-users] different notification_intervals by contact

2010-11-12 Thread Frost, Mark {PBC}

From: Duncan Berriman [mailto:dun...@dcl.co.uk]
Sent: Wednesday, November 10, 2010 1:00 PM
To: 'Nagios Users List'
Subject: Re: [Nagios-users] different notification_intervals by contact

Escalations are a little pesky to get working correctly.

Here is an example.

...

Thanks, Duncan.

I've decided to take a somewhat different approach.  Ultimately, what they want 
is for the pager alerts to occur at 4x the frequency of the e-mail (every 15 
minutes versus every hour).

So this doesn't wind up being all that hard if I make a contact that calls a 
simple shell script.  That shell script then looks at the NOTIFICATIONNUMBER to 
(in this case) determine if it's a multiple of 4 and, if so, sends the alert.  
In fact, I'm going to make the script take an argument that determines what 
number to perform 'modulo' on.  So in theory this could be reused if someone 
wanted to have something run on every other notification, every 6th, and so on.
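
The gateway logic described above can be sketched as follows.  The post talks
about a shell script; this is a Python rendering of the same modulo test, and
the script name, argument order and exit-code convention are all hypothetical.
It assumes the notification command passes the notification number in as an
argument.

```python
import sys


def should_send(notification_number, every):
    """Return True when this notification number is a multiple of `every`."""
    if every < 1:
        raise ValueError("every must be a positive integer")
    return notification_number > 0 and notification_number % every == 0


def main(argv):
    # Hypothetical calling convention: gateway.py <every> <notification_number>
    every, number = int(argv[1]), int(argv[2])
    # Exit status 0 means "send"; the caller would then run the real mail command.
    return 0 if should_send(number, every) else 1


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

With a 15-minute notification_interval and `every` set to 4, the pager contact
would fire on every notification and the e-mail contact on notifications 4, 8,
12 and so on.  Note that a strict multiple-of-4 test skips the very first
notification; testing `(number - 1) % every == 0` instead would send on the
first one.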

The downside as I see it is that Nagios won't quite have an accurate 
representation of who got what notifications.  From Nagios' perspective, it 
sent an alert to the mailing list, but really, the script acts as a gateway to 
determine if a message was actually sent.  So the "Notifications" for the 
host/service as shown in the UI will not be quite correct.  But I think they 
can live with that.

Mark

[Nagios-users] different notification_intervals by contact

2010-11-10 Thread Frost, Mark {PBC}

So we're setting up some Nagios checks for a new team and they're asking for 
something new that I'm not really sure we can do with Nagios.   For any 
production alerts they want to receive pager alerts every 15 minutes and e-mail 
alerts every 60 minutes.  Since each host/service definition has only a single 
notification_interval setting and contact definitions don't allow a 
notification_interval setting, I don't see how this can be done within that 
context.

We don't currently use escalations for anything, but I've been staring at them 
and trying to figure out how that might work for us.  In terms of using 
escalations to solve this problem I'm struck by several issues:

- I'd be trying to use escalations to set up an indefinite pattern, not a system 
where there's a last_notification where everyone gets the notifications.
- I have to do this for a lot of hosts/services and it doesn't look like I can 
wildcard service_descriptions (tried it and it failed).

My other thought is to just have 2 checks for the same service where check A 
has the 15-minute notification_interval and goes to pagers and check B has a 
1-hour notification_interval and goes to e-mail.  And that's for a lot of 
services.  I can't really do the duplicate checks on hosts.  But either way, 
you know, "yuck".

I keep thinking there's some easier, more obvious solution to this that's 
eluding me.  Is this something that anyone else has solved?  I'm inclined to 
tell them that we can't do this and get them to unify on one 
notification_interval like everyone else, but before I do, I thought I'd ask.


Thanks

Mark

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-27 Thread Frost, Mark {PBC}

Benny,

OK, well, I hope I'm not embarrassing myself with this.  It's a Perl script and 
uses Ton Voon's nifty Nagios::Plugin module.  I run checks against things I 
want to know about.  Thinking about it, I guess it would be nice to have the 
failed hosts/services check alert on percentage of failures.  Maybe someday.

Mark

-----Original Message-----
From: C. Bensend [mailto:be...@bennyvision.com] 
Sent: Saturday, October 23, 2010 8:44 PM
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] Scheduled checks falling far behind


> You can also run, if memory serves, the "nagiostats" command located in
> your Nagios "bin" directory to see this information as well.  I actually
> use that nagiostats data in a custom check and graph a lot of those
> latencies and other Nagios performance related info.

Boy, would I *love* to see your method for that!

I personally hacked the source of nagiostats to create a custom
plugin, but it's a horrible, horrible hack and I'd like to see
a cleaner, more scalable method.

Can you share?

Benny


-- 
"No matter how many shorts we have in the system, my guards will
be instructed to treat every surveillance camera malfunction as a
full-scale emergency."
   -- Peter Anspach's Evil Overlord List, #67



#!/usr/bin/perl -w

# Nagios Plugin script to check several Nagios statistics

use strict;
use warnings;
use Nagios::Plugin;

use vars qw( $VERSION $PROGNAME $NAGIOS_BASE $NAGIOSSTATS
$warning $critical
%nagios_stats_data );
$VERSION = '0.32';

# the location of the Nagios installation root
$NAGIOS_BASE = '/usr/local/nagios';

# the command to run to get 'nagiostats'
$NAGIOSSTATS = $NAGIOS_BASE . '/bin/nagiostats';

# get the base name of this script for use in the examples
use File::Basename;
$PROGNAME = basename($0);

sub load_nagiostats;

sub do_cached_host_checks;
sub do_cached_service_checks;
sub do_command_buffers;
sub do_execution_time;
sub do_failed_hosts;
sub do_failed_services;
sub do_host_latency;
sub do_hosts_checked;
sub do_service_latency;
sub do_services_checked;

# Instantiate Nagios::Plugin object (the 'usage' parameter is mandatory)
my $p = Nagios::Plugin->new(
    usage => "Usage: %s [ --verbose | -v ] [ --debug | -d ]
    [ --cached-host-checks    ] |
    [ --cached-service-checks ] |
    [ --command-buffers       ] |
    [ --execution-time        ] |
    [ --failed-hosts          ] |
    [ --failed-services       ] |
    [ --hosts-checked         ] |
    [ --host-latency          ] |
    [ --services-checked      ] |
    [ --services-latency      ]",
    version => $VERSION,
    blurb   => 'Nagios plugin to check various Nagios statistics.',
    extra   => "

THRESHOLDs are specified 'min:max' or 'min:' or ':max'
(or 'max'). If specified '\@min:max', a warning status will be generated
if the count *is* inside the specified range."

);

# Define and document the valid command line options
# usage, help, version, timeout and verbose are defined by default.


$p->add_arg(
    spec => 'cached-host-checks',
    help => 'Check that number of cached host checks matches the threshold.'
);

$p->add_arg(
    spec => 'cached-service-checks',
    help => 'Check that number of cached service checks matches the threshold.'
);

$p->add_arg(
    spec => 'command-buffers',
    help => 'Check that number of used command buffers matches the threshold.'
);

$p->add_arg(
    spec => 'execution-time',
    help => 'Check the host and service execution times.'
);

$p->add_arg(
    spec => 'failed-hosts',
    help => 'Check that number of failed hosts matches the threshold.'
);

$p->add_arg(
    spec => 'failed-services',
    help => 'Check that number of failed services matches the threshold.'
);

$p->add_arg(
    spec => 'host-latency',
    help => 'Check that average host latency is within the threshold.'
);

$p->add_arg(
    spec => 'hosts-checked',
    help => 'Check that number of hosts checked matches the threshold.'
);

$p->add_arg(
    spec => 'services-checked',
    help => 'Check that number of servi

Re: [Nagios-users] Scheduled checks falling far behind

2010-10-22 Thread Frost, Mark {PBC}

Matthew,

You don't say, but my guess would be that you have high latencies.  That is, for 
one of several reasons, Nagios is not able to run checks when it thinks it 
should.  You can see this information and other stats by looking at the 
Performance item near the bottom of the Nav pane in the Nagios web interface.

You can also run, if memory serves, the "nagiostats" command located in your 
Nagios "bin" directory to see this information as well.  I actually use that 
nagiostats data in a custom check and graph a lot of those latencies and other 
Nagios performance related info.
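
As a rough illustration of turning that data into a check, here is a minimal
sketch that parses one latency line and maps the average onto a plugin status.
It assumes lines of the form "Active Service Latency: 0.000 / 3.155 / 0.382 sec"
(as pasted elsewhere in this archive), with the three numbers read as
min / max / average.  The function names and thresholds are hypothetical, and
this is Python rather than the Perl used by the actual check.

```python
import re

# Matches lines such as "Active Service Latency: 0.000 / 3.155 / 0.382 sec".
STAT_LINE = re.compile(
    r"^(?P<name>[^:]+):\s*(?P<min>[\d.]+)\s*/\s*(?P<max>[\d.]+)\s*/\s*(?P<avg>[\d.]+)\s*sec"
)


def parse_stat_line(line):
    """Return (name, min, max, avg) for a 'min / max / avg sec' line, else None."""
    match = STAT_LINE.match(line.strip())
    if match is None:
        return None
    return (
        match.group("name").strip(),
        float(match.group("min")),
        float(match.group("max")),
        float(match.group("avg")),
    )


def latency_status(avg, warning, critical):
    """Map an average latency in seconds onto a Nagios plugin status word."""
    if avg >= critical:
        return "CRITICAL"
    if avg >= warning:
        return "WARNING"
    return "OK"
```

Feeding each parsed average to both a threshold test and a grapher gives an
alert and a trend line from the same number.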

From my own experience, I found that I did not pay attention to this 
information when I started using Nagios, then read about it, made a few tweaks 
to make it better, then forgot about it.  Then as our installation grew and 
grew, I found that some things got worse again and I had to consider different 
tuning options.

I would recommend that you first read the "Tuning Nagios For Maximum 
Performance" section of the docs:

http://nagios.sourceforge.net/docs/3_0/tuning.html

If nothing else, this will give you an idea of some things that can affect 
latencies.

Additionally, you may find that your average latencies look reasonable, but then 
see something with a whopping huge max latency.  It can be hard to track down 
what that is in the UI.  I've just looked up that max latency and then quickly 
looked in the status.dat file to find the service that had that same matching 
latency and dug into that.  You could, for example, have a few checks that 
aren't really timing out, so the check may take 10 minutes or more to complete, 
which would really screw up your overall latencies: those checks wouldn't 
have finished before the next time they were supposed to be run.
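
The status.dat lookup can be scripted rather than done by eye.  The following
is a minimal sketch, assuming the file consists of `servicestatus { ... }`
blocks of `key=value` lines that include `host_name`, `service_description`
and `check_latency`; the function name is made up for this example.

```python
def worst_latencies(status_text, top=5):
    """Return up to `top` (latency, host, service) tuples, highest latency first."""
    rows = []
    fields = None  # None while outside a servicestatus block
    for raw in status_text.splitlines():
        line = raw.strip()
        if fields is None:
            if line.startswith("servicestatus") and line.endswith("{"):
                fields = {}
        elif line == "}":
            try:
                rows.append((
                    float(fields["check_latency"]),
                    fields.get("host_name", "?"),
                    fields.get("service_description", "?"),
                ))
            except (KeyError, ValueError):
                pass  # block without a usable latency value
            fields = None
        else:
            key, sep, value = line.partition("=")
            if sep:
                fields[key] = value
    rows.sort(reverse=True)
    return rows[:top]
```

Reading the file with `open(path).read()` and printing the result lists the
services to dig into first.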

Mark


From: Litwin, Matthew [mlit...@stubhub.com]
Sent: Friday, October 22, 2010 8:29 PM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Scheduled checks falling far behind

I have been chasing my tail trying to figure out why my RRD files were very 
sparsely populated, and I am realizing that my checks are falling behind 
their scheduled times by up to 3 times their set check interval. Take, for 
example, a service that should be checked every 5 minutes. In the example 
below, the time is 00:19:02, the last check was 00:10:30 and the next scheduled 
check time is 00:13:28. This means it is almost 6 minutes behind schedule and 
almost 9 minutes since the last check!
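
The arithmetic behind those two figures, worked through as a small sketch (the
helper name is made up for illustration; the three timestamps are the ones
quoted above):

```python
from datetime import datetime

FMT = "%H:%M:%S"


def minutes_between(earlier, later):
    """Minutes from `earlier` to `later`, both 'HH:MM:SS' on the same day."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return delta.total_seconds() / 60


now, last_check, next_check = "00:19:02", "00:10:30", "00:13:28"
behind = minutes_between(next_check, now)       # time past the scheduled slot
since_last = minutes_between(last_check, now)   # time since the last result
print(f"behind schedule: {behind:.1f} min, since last check: {since_last:.1f} min")
```

That is 5 min 34 s behind schedule and 8 min 32 s since the last check.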

I find that even if I shorten the check interval to, say, 3 minutes it still 
behaves about the same. The server has very low load and nagios is hardly 
working at all (usually below 4% cpu). I haven't touched any of the tuning on 
this and from what I have read the default settings appear unthrottled. Is 
there any way to make it "work harder"?

--Service information--
Last Updated: Sat Oct 23 00:19:02 UTC 2010

--Service State Information--
Current Status:             OK (for 7d 16h 14m 46s)
Status Information:         CPU STATISTICS OK : user=0.12% system=0.00% iowait=0.00% idle=99.88%
Performance Data:           0.12;0.00;0.00;99.88;80;90
Current Attempt:            1/3 (HARD state)
>>> Last Check Time:        10-23-2010 00:10:30
Check Type:                 ACTIVE
Check Latency / Duration:   612.633 / 2.052 seconds
>>> Next Scheduled Check:   10-23-2010 00:13:28 <<<
Last State Change:          10-15-2010 08:04:16
Last Notification:          N/A (notification 0)
Is This Service Flapping?   NO (0.00% state change)
In Scheduled Downtime?      NO
Last Update:                10-23-2010 00:18:33 (0d 0h 0m 29s ago)




[Nagios-users] question about slow startup and retained data

2010-10-16 Thread Frost, Mark {PBC}

After adding a fair number of hosts/services based on templates -- all with a 
number of dependent services -- we're seeing Nagios taking a fair amount of 
time to start up now.  We're using Nagios 3.2.1.  Startup times seemed to be in 
the vicinity of 4 minutes.  During that time Nagios chews up 100% of one CPU 
core and eventually 2 CPU cores, then settles down.  I assumed it was time for 
me to investigate the fast-startup options and deal with at least the 
dependency checking.

Note that this host in question is the "central" node in a distributed setup so 
virtually everything it gets is a passive check result.

When I tried starting with '-s', I found that the first block (Object Config 
Processing Times) went very quickly, and then it hung on the second block 
(Retention Data Times), which ran for a while as indicated.  To my surprise, 
everything else after that seemed to go fairly quickly.  So apparently, my 
problem is with retained data.

My relevant nagios.cfg entries are as follows:

retain_state_information=1
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0

So we definitely want to make use of historical data.   I see in the config 
file comment that using retained state may come at the cost of increased 
startup times.   None of the speedup options I see seem to say that they try to 
help startup time with retained status.   Am I stuck?  Do I either need to live 
with the 194-second processing time (and that will go up as we add more 
hosts/services over time) or do without retained data?

nagios -s output:

Object Config Source: Config files (uncached)

OBJECT CONFIG PROCESSING TIMES  (* = Potential for precache savings with -u option)
--
Read:                 0.042305 sec
Resolve:              0.008222 sec  *
Recomb Contactgroups: 0.002196 sec  *
Recomb Hostgroups:    0.006549 sec  *
Dup Services:         0.026616 sec  *
Recomb Servicegroups: 0.289033 sec  *
Duplicate:            0.012538 sec  *
Inherit:              0.005349 sec  *
Recomb Contacts:      0.00 sec  *
Sort:                 0.01 sec  *
Register:             0.076975 sec
Free:                 0.008261 sec

TOTAL:                0.478046 sec  * = 0.350505 sec (73.32%) estimated savings


RETENTION DATA TIMES
--
Read and Process:     194.016362 sec

TOTAL:                194.016362 sec


Timing information on configuration verification is listed below.

CONFIG VERIFICATION TIMES  (* = Potential for speedup with -x option)
--
Object Relationships: 0.054524 sec
Circular Paths:       0.822133 sec  *
Misc:                 0.005159 sec

TOTAL:                0.881816 sec  * = 0.822133 sec (93.2%) estimated savings


EVENT SCHEDULING TIMES
-
Get service info:        0.014405 sec
Get host info info:      0.001732 sec
Get service params:      0.13 sec
Schedule service times:  0.000801 sec
Schedule service events: 0.000461 sec
Get host params:         0.01 sec
Schedule host times:     0.000144 sec
Schedule host events:    0.38 sec

TOTAL:                   0.017595 sec


Projected scheduling information for host and service checks
is listed below.  This information assumes that you are going
to start running Nagios with your current config files.

HOST SCHEDULING INFORMATION
---
Total hosts:                     870
Total scheduled hosts:           19
Host inter-check delay method:   SMART
Average host check interval:     300.00 sec
Host inter-check delay:          15.79 sec
Max host check spread:           30 min
First scheduled check:           Sat Oct 16 23:26:24 2010
Last scheduled check:            Sat Oct 16 23:28:46 2010


SERVICE SCHEDULING INFORMATION
---
Total services:                     7569
Total scheduled services:           34
Service inter-check delay method:   SMART
Average service check interval:     292.94 sec
Inter-check delay:                  8.62 sec
Interleave factor method:           SMART
Average services per host:          8.70
Service interleave factor:          1
Max service check spread:           30 min
First scheduled check:              Sat Oct 16 23:31:16 2010
Last scheduled check:               Sat Oct 16 23:33:26 2010


CHECK PROCESSING INFORMATION

Check result reaper interval:   2 sec
Max concurrent service checks:  Unlimited


PERFORMANCE SUGGESTIONS
---
I have no suggestions - things look okay.
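
To see at a glance which phase dominates a report like the one above, the
TOTAL lines can be pulled out per section.  This is a minimal sketch: it
assumes section headings are upper-case lines (optionally followed by a
parenthetical note) and that each timing section ends with a `TOTAL:` line.
Both function names are hypothetical.

```python
import re

TOTAL = re.compile(r"^TOTAL:\s*([\d.]+)\s*sec")


def section_totals(report):
    """Map each upper-case section heading to the value on its TOTAL line."""
    totals = {}
    heading = None
    for raw in report.splitlines():
        line = raw.strip()
        match = TOTAL.match(line)
        if match:
            if heading:
                totals[heading] = float(match.group(1))
            continue
        # Drop a trailing "(* = ...)" note before testing for a heading.
        candidate = line.split("(")[0].strip()
        if candidate and candidate == candidate.upper() and any(c.isalpha() for c in candidate):
            heading = candidate
    return totals


def dominant_section(totals):
    """Return (heading, share_of_all_totals) for the slowest section."""
    name = max(totals, key=totals.get)
    return name, totals[name] / sum(totals.values())
```

Applied to the four TOTAL lines above, retention data accounts for about 99%
of the measured time (194.0 of roughly 195.4 seconds).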



Thanks

Mark

Re: [Nagios-users] Alleviating Nagios i/o contention problem

2010-09-27 Thread Frost, Mark {PBC}



> -----Original Message-----
> From: Marc Powell [mailto:li...@xodus.org]
> Sent: Sunday, September 26, 2010 11:27 AM
> To: Nagios Users List
> Subject: Re: [Nagios-users] Alleviating Nagios i/o contention problem
> 
> 
> On Sep 25, 2010, at 10:53 AM, Max wrote:
> 
> > I like the suggestions Matthias makes; those suggestions have worked
> > well for us.
> >
> > RRD updates are very expensive - I am pretty sure without knowing
> > anything more about your system that the RRD writes are causing most
> > of the I/O load.
> 
> I no longer have access to this system but my experience has been
> otherwise. We were running a nagios install with nearly 10,000 services
> received by external pollers every 5 minutes, and a cricket install on
> the same machine polling/updating 100,000+ rrd files during the same
> interval. This was on a Poweredge 6850, 5 disk RAID-5.
> RRDtool itself writes very little data to disk. I think it's 8 Bytes
> per DS per RRA updated. Linux, though, wants to write 4KB chunks at a
> time so it performs a read-modify-write of 4KB just to update those 8
> Bytes.
> 
> The OP can reduce his IO load particularly for RRD updates and help
> Linux better organize it's writes to disk by ensuring that he has
> enough RAM to keep key information for each RRD file in the filesystem
> cache. The OP will need at least 8K * number of rrd files available to
> be used as filesystem buffer cache.
> 
> --
> Marc



Thanks very much to all who replied (Breandan, Marc, Max and Matthias, this 
means you! :-) ).

- I can't say exactly how many checks create perfdata (we have a very 
heterogeneous set of check types).  I can see 9K files in the graph data 
filesystem, so at two files per check that would be about 4,500 checks.

- I'm not running updates through syslog.  I don't have root on these machines 
so that would not be helpful to me.  I will have to double-check, but I don't 
believe that I have pnp4nagios logging turned on, except maybe at the lowest 
level.  I don't recall it logging much of anything at that level, but as I 
say, I'll check.

- According to our performance analysis team, these servers have way more RAM 
than they're actually using, so I wouldn't think I'm limited by the Linux disk 
cache here.  Perhaps it's just the hardware we have (the i/o rates on a 
3-year-old Dell 2950 with a single RAID 5 set) that makes this particularly bad 
for us.  Perhaps on faster hardware we'd not even notice.

- I would assume that rrdcached was built for a reason (i.e. this i/o issue 
was observed at least somewhere), so it's definitely an avenue I want to try out.

- The ramdisk idea is also interesting.  I'm curious, though, about why one 
would want to rsync it back to the local disk periodically.  It's just a 
run-time status file, right?  Unless I misread the docs, it goes away when 
Nagios is shut down.  How would having a local disk copy of status.dat benefit 
me?  Also, nagios.log isn't written to that often in our case (we don't log 
passive check results, for example).  I'm not sure I'd see the benefit for us 
in putting that on ramdisk.  Although... we do have Splunk watch that file so 
that would be some additional read overhead I guess.
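
For what it's worth, the cache-sizing rule of thumb quoted above (8K per RRD
file) is easy to work through for the file counts mentioned here.  This is a
sketch only: it reads "8K" as 8 KiB, and it brackets the estimate between the
roughly 4,500 checks and the 9K files seen in the graph data filesystem.

```python
def rrd_cache_mib(rrd_files, kib_per_file=8):
    """Filesystem cache needed, in MiB, at `kib_per_file` KiB per RRD file."""
    return rrd_files * kib_per_file / 1024


for count in (4500, 9000):
    print(f"{count} RRD files -> about {rrd_cache_mib(count):.0f} MiB of cache")
```

Either way the figure lands between about 35 and 70 MiB, which supports the
view that disk cache is unlikely to be the limiting factor on these servers.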


Thanks!

Mark


[Nagios-users] Alleviating Nagios i/o contention problem

2010-09-25 Thread Frost, Mark {PBC}

Greetings, listers,

We've got an on-going issue with i/o contention.  There's the obvious problem 
that we've got a whole lot of things all writing to the same partition.  In 
this case, there's just one big chunk of RAID 5 disk on a single controller so 
I don't believe that making more partitions is going to help.

On this same partition we have:

1) Nagios 3.2.1 running as the central/reporting server for a couple of other 
Nagios nodes that are sending check results via NSCA.  Approximately 6-7K 
checks.

2) pnp4nagios 0.6.2 (with rrd 1.4.2) writing graph data.

There's a 2nd server configured identically to the first that's acting as a 
"hot spare" so it also receives check data from the 2 distributed nodes and 
writes its own copy of the graph data locally as well.

At the moment I'm concerned about the graphdata, but because I can only see i/o 
utilization as an aggregate, I can't tell what is the worst component on that 
filesystem -- status.dat updates?  graph data?  writes to the var/spool 
directory?  We also look at continued growth so this is only going to get worse.

These systems are quite lightly loaded from a CPU (2 dual-core CPUs) and memory 
(4GB) perspective, but the i/o to the nagios filesystem is queuing now.

We're about to order new hardware for these servers and I want to make a 
reasonable choice.  I'd like to make some reasonable changes without requiring 
too exotic of a setup.  I believe these servers are currently Dell 2950s and 
they're all running Suse Linux 10.3 SP2.

My first thought was to potentially move the graphs to a NAS share which would 
shift that i/o to the network.  I don't know how that would work though and it 
would ultimately be an experiment.

What experiences do people out there have handling this kind of i/o and what 
have you done to ease it?


Thanks very much!

Mark


Re: [Nagios-users] Running Nagios on Vmware

2010-07-08 Thread Frost, Mark {PBC}


In my experience, there are weird things that happen with timing.  That is, the
time on a VM should be sync'd with a time source so no time is lost.  However,
the VM has what I like to think of as "seconds of variable length".

So when we tested with a VM a few years ago, the latency and execution timings
and calculations were really screwy.  There were checks that Nagios thought ran
in "-0.15" seconds, for example.  Considering that this was information that we
cared about, we chose to stick with a physical box.
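
A negative run time like that is what one would expect if a duration is taken
as the difference of two wall-clock readings and the clock is stepped
backwards in between.  How Nagios computes these figures internally is not
stated in this thread, so the sketch below only illustrates the general effect;
`stepping_clock` is a made-up stand-in for a VM clock being corrected
mid-check.

```python
import time


def timed(func, clock):
    """Run func() and return its duration as measured by `clock`."""
    start = clock()
    func()
    return clock() - start


def stepping_clock(readings):
    """Hypothetical clock that replays fixed readings, to simulate a clock step."""
    it = iter(readings)
    return lambda: next(it)


# A wall clock that steps backwards mid-check reports a negative duration.
print(timed(lambda: None, stepping_clock([100.00, 99.85])))
# A monotonic clock cannot go backwards, so the result is never negative.
print(timed(lambda: None, time.monotonic) >= 0)
```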

And yes, I/O is now an increasing concern for us so a VM would be even less
likely.

That said, I know another team who has much lighter requirements (they just want
alerts, don't care about latencies (yet)) and they've been on a VM for years
now with Nagios.

Mark


Re: [Nagios-users] Merlin/Ninja perfdata status?

2010-06-11 Thread Frost, Mark {PBC}

> -----Original Message-----
> From: Andreas Ericsson [mailto:a...@op5.se] 
> Sent: Friday, June 11, 2010 9:29 AM
> To: Nagios Users List
> Subject: Re: [Nagios-users] Large Installation
> 
> 
> Unless you desperately need performance data from satellite systems
> handled properly, I'd invite you to give Merlin and Ninja a try.

Andreas,

We're planning on a Nagios refresh/rearchitecture near the end of this year
and I'm really hopeful that we might be able to move to Ninja/Merlin as they
do a lot of things we'd really like to have.  They also solve some issues we
have with our current distributed system.

I've been trying to pay attention to the latest developments in this area, but
I may have missed something as changes are happening quickly.

We do, however, rely pretty heavily on performance data.  I think I saw someone 
had
a hack to do it with Merlin, but it's not really part of Merlin right now which 
makes
me not want to adopt it for a production Nagios installation.

I recall a sort of Merlin roadmap for the rest of the year indicating that
upcoming work was to better support distributed setups, if I remember correctly.
Is there also work afoot to get perfdata into Merlin, perhaps with the next
release?

I'm trying to build some test systems to try the current version of Merlin/Ninja
to assess how "production ready" it might be for us by the end of the year, when
we need to make a decision.

Thanks very much for all the hard work you and others at Op5 have put into
these tools.

Mark


Re: [Nagios-users] Overly persistent contact group

2010-05-26 Thread Frost, Mark {PBC}

Mordur,

Two thoughts on this.  First, I find that I've been burned many times by
contact/contactgroup inheritance.  That is, where you define a contactgroup
for a host and that gets inherited by the service (when I don't want it to).
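
As a minimal sketch of the behaviour described above: in Nagios 3, a service
that defines no contacts of its own takes them from its host, so naming the
group on the service itself is one way to avoid the surprise.  The host name,
address, and the 'customer-contacts' group are hypothetical, and the
generic-host/generic-service templates and the check_ping command are assumed
to be defined elsewhere.

```cfg
define host {
    use             generic-host
    host_name       host.domain.tld
    address         192.0.2.10              ; placeholder address
    contact_groups  admins                  ; host-level contacts
}

define service {
    use                  generic-service
    host_name            host.domain.tld
    service_description  SYSTEM STATUS
    check_command        check_ping!100.0,20%!500.0,60%
    contact_groups       customer-contacts  ; set explicitly, so nothing is taken from the host
}
```

A template pulled in with "use" can also supply contact_groups, so the
templates are worth checking as well.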

Second, I rely a lot on looking at the "Configuration" link at the bottom of the
Nagios web interface.  That lets you look and see what's really defined for
all the objects (hosts, services, contacts, contactgroups, timeperiods, etc).
Essentially it allows me to go in and compare what I intended to say in the
configuration with what Nagios really is using.

Mark

-Original Message-
From: mli...@1984.is [mailto:mli...@1984.is] 
Sent: Wednesday, May 26, 2010 2:50 PM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Overly persistent contact group

Dear list,

I have nagios3 on Debian Lenny.  I created a service template and host
template for a customer as well as a couple of contacts and a contact
group.  I specified the contact group in the host and service template
and created some host and service definitions based on the
aforementioned templates.

So I hoped that notifications would be sent to these new contacts as per
the setup described above. This hope failed, and notifications were only
sent to an 'admins' contactgroup, which is not specified anywhere in the
setup of those hosts, services, contacts, group or template.

When I remove the 'admins' contact group from the config files and run a
test of the config, I get this:

Error: Contact group 'admins' specified in service 'SYSTEM STATUS' for
host 'host.domain.tld' is not defined anywhere!

Even though this contact group is mentioned nowhere in connection with
these hosts or services.

It seems that all contact groups except the one named 'admins' fail to
register with the Nagios system and that the 'admins' contact group is
somehow automatically associated with all host definitions, regardless of
which contact group is actually specified in configuration.

Mordur Ingolfsson
 


Re: [Nagios-users] trying to fix problem with excessive latency

2010-05-19 Thread Frost, Mark {PBC}


> -Original Message-
> From: Corey Hickey [mailto:bugfood...@fatooh.org] 
> Sent: Tuesday, May 18, 2010 9:30 PM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] trying to fix problem with excessive latency
> 
> Hello,
> 
> I have inherited maintenance of a medium-sized Nagios installation. We 
> currently have 649 hosts and 5415 services. Our setup works nicely, with 
> one exception: Nagios falls behind on host/service checks. Our usual 
> latency once Nagios has been running for a while is about 190-200 
> seconds. Our Nagios host is reasonably powerful and isn't struggling; it 
> seems that Nagios itself is limited somehow.
> 



> Active Service Execution Time:  0.020 / 120.007 / 0.847 sec
> Active Host Execution Time: 0.020 / 11.019 / 0.069 sec
> 



> I have a feeling I'm missing something; I would appreciate any 
> suggestions.
> 
> Thanks,
> Corey

Corey,

I'm not an expert, but I'll relay some of my own experiences here.  I did
find that switching on use_large_installation_tweaks did indeed make a big
difference with our latencies.

We also were doing the pre-Nagios 3.2 practice of not doing active host checks.
As the tuning guide recommends, it's actually more efficient to do active checks
and then enable the cached check results.  When we did that, we found that the
host that we were seeing latency issues on leveled out on latencies.  (It's good
to graph those values, by the way).  They were still high-ish, but the active
host checks caused them to stop increasing over time.
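
A rough sketch of the main-config settings this touches on, assuming a Nagios
3.x nagios.cfg.  The horizon values are illustrative, not the ones used on the
system described here; comments sit on their own lines because nagios.cfg does
not take trailing comments.

```cfg
# nagios.cfg (main configuration file)

# Turn on the large-installation shortcuts.
use_large_installation_tweaks=1

# Let Nagios run active host checks.
execute_host_checks=1

# Reuse a recent check result instead of re-running the check
# if the last result is younger than this many seconds.
cached_host_check_horizon=15
cached_service_check_horizon=15
```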

But additionally, we found that long-running checks were also messing up
latencies.  As I understand it, if Nagios schedules a check and it then takes a
lot longer to return than Nagios expects, that can mess up scheduling the other
checks.  I see you've got some check(s) that ran at a max of 120 seconds.  When
I started seeing some latency problems I also saw that I had a service check or
two that was running for several minutes.  I tracked that down and changed the
check so that it completed (or timed out, really) more quickly, returning status
back to Nagios in a matter of seconds rather than minutes.  The latency
plummeted after that.  In general, our policy is that most checks should
complete in under 30 seconds, preferably under 10.
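
One way to enforce a policy like that, sketched under assumptions: cap check
run time globally in nagios.cfg and give individual plugins a shorter timeout
of their own.  The numbers and the command name check_http_quick are
hypothetical; -t is the standard plugin timeout option.

```cfg
# nagios.cfg: Nagios kills any check that runs longer than these limits (seconds).
service_check_timeout=30
host_check_timeout=10

# Object configuration: a command whose plugin gives up by itself after 10 seconds.
define command {
    command_name  check_http_quick
    command_line  $USER1$/check_http -H $HOSTADDRESS$ -t 10
}
```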

In the same vein, I'm not quite sure how you could have any host checks that
would take 11 seconds to execute.  Are you doing multiple pings/fpings to check
that a host is up?  Typically you can get away with just a single fping rather
than a series of 10 to tell you that a host is not reachable.
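
For illustration, a host check command that sends a single packet might look
like the sketch below.  It uses check_ping rather than fping, since check_ping's
-p (packet count) option is the one I can vouch for; the thresholds are
assumptions and should be adjusted to the network in question.

```cfg
define command {
    command_name  check-host-alive
    ; -p 1 sends one ICMP packet instead of a series
    command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
}
```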

Hope that helps.

Mark


[Nagios-users] turning off service inheritance of host settings?

2010-05-11 Thread Frost, Mark {PBC}


I don't suppose there's any way (short of changing the source and recompiling) 
to turn off the "feature" of inheriting host settings to services?  This is one 
thing I've found *really* annoying about 3.2.0 and would like to have a way to 
turn it off.  I didn't see anything in the docs or in the nagios.cfg file that 
let me turn this behavior on or off or something I could put in a host or 
service setting that would let me disable it.

Thanks

Mark



Re: [Nagios-users] Does anyone have event log monitors that work?

2010-03-19 Thread Frost, Mark {PBC}



>-Original Message-
>From: C. Bensend [mailto:be...@bennyvision.com]
>Sent: Friday, March 19, 2010 10:32 AM
>To: nagios-users@lists.sourceforge.net
>Subject: [Nagios-users] Does anyone have event log monitors that *work*?
>
>Hey folks,
>
>   I have been beating my head against various and sundry walls,
>tables, and desks for quite some time now, and my brain is starting
>to get very, VERY mushy.
>
>   I need to monitor Windows event logs.  You'd think this would
>be easy, but either the tools available out there don't work (which
>I doubt, I KNOW you monitor event logs), or I'm man enough to admit
>that I'm a hopeless idiot.
>
>   I've tried to get help on the 3rd-party sites (Steve
>Shipway's site for Nagios EventLog Service and NSClient++), but
>they're either away from their desks for an extended period of
>time or I've just plain worn them out and they're no longer answering
>my questions.
>
>   I beg of you; if you use either of these tools and *successfully*
>monitor Windows event logs, please give me a hand.  I apologize for
>the length of this email, but this is my last stand - if I cannot
>get event log monitoring working, this entire project may get
>scrapped.



Benny,

This is probably overkill for your situation but you could use Splunk
to watch event logs (and other logs) via saved searches and then
have it notify Nagios when it spots something.  We do this here as
Splunk just has more smarts about dealing with events/logs/matches
within certain time windows.  But as I say, it IS more overhead than
the other solutions you cite.
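
The message doesn't say how Splunk hands its findings to Nagios.  One common
route for any external tool is a passive service check submitted through the
external command file; the sketch below assumes that route.  The host name,
service description, and generic-service template are hypothetical, and
check_dummy is the stock plugin used here only as a placeholder check_command.

```cfg
# nagios.cfg must allow this (both are standard main-config directives):
#   check_external_commands=1
#   accept_passive_service_checks=1

define command {
    command_name  check_dummy
    command_line  $USER1$/check_dummy $ARG1$
}

define service {
    use                     generic-service
    host_name               winhost01           ; hypothetical host
    service_description     Event Log Alerts    ; hypothetical name
    active_checks_enabled   0                   ; Nagios never polls this service
    passive_checks_enabled  1                   ; results arrive from outside
    is_volatile             1                   ; treat every non-OK result as a new event
    max_check_attempts      1
    check_command           check_dummy!0
}

# Line the external tool would append to the Nagios command file:
# [<unix_timestamp>] PROCESS_SERVICE_CHECK_RESULT;winhost01;Event Log Alerts;2;<plugin output>
```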



Mark

Re: [Nagios-users] check_http and proxy

2010-03-17 Thread Frost, Mark {PBC}


I had this same issue (trying to check sites with SSL through a proxy).
Unfortunately, it appears that the issue is that check_http does not support
the 'CONNECT' tunneling method that our proxy servers require for that
service.

I'm not really sure what other options exist to do this.  I don't know, for
instance, whether WWW::Mechanize would allow it either.  I wish check_http did,
though.

Mark


-Original Message-
From: Leo Stolk [mailto:leo.st...@enovation.nl] 
Sent: Wednesday, March 17, 2010 10:34 AM
To: Nagios Users List
Subject: Re: [Nagios-users] check_http and proxy

Hi,

You could try to use --ssl in the check.

check_http --ssl -H my.proxy -p my_proxy_port -u http://my.website

Greetings,

Leo


-Original Message-
From: Marc-André Doll [mailto:m...@b-care.net] 
Sent: Wednesday, March 17, 2010 2:17 PM
To: Nagios-Users
Subject: [Nagios-users] check_http and proxy

Hi list,

I'm trying to check some web applications through a proxy with
check_http (version 1.4.13).

I googled it and found that, with version 1.4.8, it might be possible to
try this
( http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg11186.html ):

check_http -H my.proxy -p my_proxy_port -u http://my.website

Unfortunately, I have to access my application through HTTPS. So I tried
with

check_http -H my.proxy -p my_proxy_port -u https://my.website -vvv

And I obtained this message : 
> GET https://my.website HTTP/1.0
> User-Agent: check_http/v2053 (nagios-plugins 1.4.13)
> Connection: close
> Host: my.proxy:my_proxy_port
>
>
> http://my.proxy:my_proxy_porthttps://my.website
> STATUS: HTTP/1.1 400 Bad Request
> []

Does someone know how I should deal with this?

Thank you,

Marc-André



Re: [Nagios-users] Nagios DST bug and upcoming DST-time switch in Europe

2010-03-15 Thread Frost, Mark {PBC}



>-Original Message-
>From: Ton Voon [mailto:ton.v...@opsera.com] 
>Sent: Tuesday, March 09, 2010 3:33 AM
>To: Nagios Users Mailinglist
>Subject: Re: [Nagios-users] Nagios DST bug and upcoming DST-time switch in 
>Europe
>
>
>On 9 Mar 2010, at 08:19, Mark Elsen wrote:
>
>> Nagios 3.2.0
>> 
>>
>> - By the end of the month, Europe will switch to DST.
>>   Will I be affected by the Nagios DST bug, which results in
>>   Nagios becoming dysfunctional?
>>
>> Which countermeasures can I take to prevent being struck by this  
>> problem ?
>>
>
>The bug, which stops all Nagios monitoring for 24 hours, occurs when
>"time moves backwards". This does not happen when "time moves
>forward". However, any other timeperiods (such as 09:00-17:00) may be
>incorrect by an hour, which is obviously not as serious. This is true
>for Nagios 3.2.0.
>
>Ton

Unless I'm mistaken, we had this timeperiod problem occur here.  Some
alerts were sent during a timeperiod for which notifications
are not enabled.  I did restart Nagios after DST went into effect
just for fun, but that was before these alerts went out.

I know that the fix for this when this bug was evidenced last fall
was to run a script against one of the .dat files (retention.dat?).
However, that was for the monitoring problem.  Is there something
similar that we need to do to correct timeperiods?

Thanks

Mark


