date:20100218

[Nagios-users] Fwd: Problem with recovery notification

2010-02-18 Thread samuel . mutel

Hello,

Anybody?
I don't understand the hard/soft state logic in the Service Alert log of server 1:

CRITICAL;SOFT;1
CRITICAL;SOFT;2
CRITICAL;HARD;3
CRITICAL;SOFT;1

=> Why is there no OK;HARD;3 entry after CRITICAL;HARD;3 and before
CRITICAL;SOFT;1?

Questions:
  1) Can flap detection explain this behaviour?
  2) If the node is down, is the service state type (hard or soft) reset to soft?

Regards,
Thanks.

- Forwarded message -
From: Samuel Mutel samuel.mu...@free.fr
To: nagios-users@lists.sourceforge.net
Sent: Tuesday 16 February 2010 21:05:54 GMT +01:00 Amsterdam / Berlin / Bern / 
Rome / Stockholm / Vienna
Subject: Problem with recovery notification

Hello,

I have two Nagios servers that monitor the same equipment. Both Nagios 
servers forward their check results as notifications to another 
monitoring system (OpenNMS). I use Nagios 3.2.
I received the recovery notification from server 2 but not from 
server 1. Why? I think the SOFT and HARD states are the problem, but I 
am not sure. On server 2 the service status is HARD - OK, so the 
notification is sent, but on server 1 the service is SOFT - OK!

Here is the Nagios log:

Service Alert of server 1:

[1266299592] SERVICE ALERT: 
test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error: Server 
version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266299885] SERVICE ALERT: 
test-server;CPU;CRITICAL;SOFT;2;CHECK_ESX3.PL CRITICAL - Error 
connecting to server at 'https://ip_address/sdk/webService': Perhaps 
host is not a Virtual Center or ESX server
[1266299925] SERVICE ALERT: 
test-server;CPU;CRITICAL;HARD;3;CHECK_ESX3.PL CRITICAL - Error: Server 
version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266303080] SERVICE ALERT: 
test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error 
connecting to server at 'https://ip_address/sdk/webService': Perhaps 
host is not a Virtual Center or ESX server
[1266303380] SERVICE ALERT: 
test-server;CPU;CRITICAL;SOFT;2;CHECK_ESX3.PL CRITICAL - Error 
connecting to server at 'https://ip_address/sdk/webService': Perhaps 
host is not a Virtual Center or ESX server
[1266308485] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;(Service 
Check Timed Out)
[1266308500] SERVICE ALERT: test-server;CPU;OK;SOFT;2;CHECK_ESX3.PL OK - 
test-server cpu usage=2.29 %

Service Notification of server 1 :

[1266299925] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error: Server version unavailable at 
https://ip_address/sdk/vimService.wsdl
[1266300645] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error: Server version unavailable at 
https://ip_address/sdk/vimService.wsdl
[1266301385] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error: Server version unavailable at 
https://ip_address/sdk/vimService.wsdl
[1266301720] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error: Server version unavailable at 
https://ip_address/sdk/vimService.wsdl
[1266303575] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error connecting to server at 
https://ip_address/sdk/webService: Perhaps host is not a Virtual Center 
or ESX server
[1266304175] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error connecting to server at 
https://ip_address/sdk/webService: Perhaps host is not a Virtual Center 
or ESX server
[1266304810] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error: Server version unavailable at 
https://ip_address/sdk/vimService.wsdl
[1266305270] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error: Server version unavailable at 
https://ip_address/sdk/vimService.wsdl
[1266305975] SERVICE NOTIFICATION: 
onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL 
CRITICAL - Error connecting to server at 
https://ip_address/sdk/webService: Perhaps host is not a Virtual Center 
or ESX server

Service Alert of server 2 :

[1266299856] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;(Service 
Check Timed Out)
[1266300161] SERVICE ALERT: test-server;CPU;CRITICAL;HARD;1;(Service 
Check Timed Out)
[1266300516] SERVICE ALERT: 
test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error: Server 
version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266301481] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;(Service 
Check Timed Out)
[1266301512] SERVICE ALERT: 
test-server;CPU;CRITICAL;SOFT;2;CHECK_ESX3.PL CRITICAL - Error: Server 
version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266304201]

Re: [Nagios-users] Fwd: Problem with recovery notification

2010-02-18 Thread Morris, Patrick

samuel.mu...@free.fr wrote:
 Hello,

 Anybody ?
 I don't understand the Hard and soft logic in Service Alert of server 1 :

 CRITICAL;SOFT;1
 CRITICAL;SOFT;2
 CRITICAL;HARD;3
 CRITICAL;SOFT;1

 = Why I don't have after CRITICAL;HARD;3 and before CRITICAL;SOFT;1 : 
 OK;HARD;3.

 Questions :
   1) The flapping mode can explain this behaviour ?
   2) If the node is down the service state (hard or soft) is set to soft ?
   

Flap detection only inhibits notifications.  It would not affect 
hard/soft states.

Several things could cause this, but it appears you've stripped all 
context out of the logs.  Was Nagios restarted between the 
CRITICAL;HARD;3 and the CRITICAL;SOFT;1, maybe? I'm not 100% sure, but the 
service state count may also be reset (I'd be a bit surprised if it isn't) if 
the host is determined to be down.
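
One way to recover some of the missing context is to look at the time between consecutive alerts. The following is a minimal Python sketch, not part of Nagios: `parse_alert` and `alert_gaps` are hypothetical helper names, and `SAMPLE` holds the first four server 1 alerts from the original post with the wrapped lines rejoined.

```python
import re
from datetime import datetime, timezone

# "[epoch] SERVICE ALERT: host;service;state;state_type;attempt;plugin output"
ALERT_RE = re.compile(
    r"\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);([^;]+);(\d+);(.*)"
)


def parse_alert(line):
    """Parse one SERVICE ALERT line into a dict, or return None if it does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    epoch, host, service, state, state_type, attempt, output = match.groups()
    return {
        "epoch": int(epoch),
        "time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }


def alert_gaps(lines):
    """Return (seconds_since_previous_alert, state, state_type, attempt) per alert."""
    rows = []
    previous = None
    for line in lines:
        alert = parse_alert(line)
        if alert is None:
            continue  # skip anything that is not a SERVICE ALERT line
        gap = None if previous is None else alert["epoch"] - previous
        rows.append((gap, alert["state"], alert["state_type"], alert["attempt"]))
        previous = alert["epoch"]
    return rows


# Server 1 alerts copied from the original post (wrapped lines rejoined).
SAMPLE = [
    "[1266299592] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at 'https://ip_address/sdk/vimService.wsdl'",
    "[1266299885] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;2;CHECK_ESX3.PL CRITICAL - Error connecting to server at 'https://ip_address/sdk/webService': Perhaps host is not a Virtual Center or ESX server",
    "[1266299925] SERVICE ALERT: test-server;CPU;CRITICAL;HARD;3;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at 'https://ip_address/sdk/vimService.wsdl'",
    "[1266303080] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error connecting to server at 'https://ip_address/sdk/webService': Perhaps host is not a Virtual Center or ESX server",
]
```

Calling `alert_gaps(SAMPLE)` shows that 3155 seconds (about 53 minutes) pass between CRITICAL;HARD;3 and the following CRITICAL;SOFT;1, so the log entries from that window (restarts, host alerts) are the ones worth posting.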


--
Download Intel(R) Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs 
proactively, and fine-tune applications for parallel performance. 
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] service notification when host is down

2010-02-18 Thread Samuel Bancal

Thanks for your answer,

In fact, that is normal behavior to me as well.
What does not seem normal to me is that, between two checks, Nagios
jumps from SOFT 1 to HARD 1 without going through the steps SOFT 1 -> SOFT 2
-> SOFT 3 and finally HARD 4.

Regards,
Samuel Bancal

2010/2/17 Morris, Patrick patrick.mor...@hp.com

 Samuel Bancal wrote:

 Nagios Core 3.2.0
 nagios-plugins-1.4.14
 Ubuntu server 8.04.3 LTS

 Hi,

 I'm having trouble configuring notifications for the case where a server
 no longer responds to PING (ICMP).
 I don't understand why Nagios skips steps when it runs the ICMP
 service check.
 Here is the config:

 define host{
   use                    generic-server
   host_name              server1
   alias                  server1
   address                the.ip.the.ip
   hostgroups             prod-servers
   contact_groups         group1
   check_command          check-host-alive
   check_period           24x7
   check_interval         5
   retry_interval         1
   max_check_attempts     4
   notification_period    24x7
   notification_interval  60
   notification_options   d,u,r
 }

 define service{
   use                    generic-service
   host_name              server1
   service_description    ICMP
   check_command          check_icmp!100.0,20%!500.0,60%
   max_check_attempts     4
   normal_check_interval  5
   retry_check_interval   1
   notification_options   w,u,c,r
   notification_interval  60
   notification_period    24x7
 }
 [...]
 define command{
   command_name  check-host-alive
   command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
 }
 define command{
   command_name  check_icmp
   command_line  $USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
 }
 [...]
 [...]

 Here is an example of the history that I get:
 [2010-02-16 11:33:13] SERVICE ALERT: server1;ICMP;CRITICAL;SOFT;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
 [2010-02-16 11:33:43] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
 [2010-02-16 11:34:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
 [2010-02-16 11:34:43] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
 [2010-02-16 11:35:23] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
 [2010-02-16 11:36:33] HOST ALERT: server1;DOWN;HARD;4;(Host Check Timed Out)
 [2010-02-16 11:37:43] HOST ALERT: server1;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.67 ms
 [2010-02-16 11:39:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.943ms, lost 0%

 Or later:
 [2010-02-16 11:42:03] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
 [2010-02-16 11:43:13] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
 [2010-02-16 11:44:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
 [2010-02-16 11:44:43] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
 [2010-02-16 11:45:53] HOST ALERT: server1;UP;SOFT;4;PING OK - Packet loss = 0%, RTA = 0.64 ms
 [2010-02-16 11:49:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.948ms, lost 0%


 If you're asking why Nagios runs a host check when it sees the service fail
 a check, that's normal behavior.

 When a service check fails, the first thing Nagios will do is look to see
 if the service failed because the host is down.




-- 
Samuel Bancal - CH

Re: [Nagios-users] Fwd: Problem with recovery notification

2010-02-18 Thread samuel . mutel

I found this in the source code:

/* ADDED IF STATEMENT 01-17-05 EG */
/* 01-17-05: Services in hard problem states before hosts went down would 
sometimes come back as soft problem states after */
/* the hosts recovered.  This caused problems, so hopefully this will fix it */
if(temp_service->state_type==SOFT_STATE)
  temp_service->current_attempt=1;
}

"so hopefully this will fix it" => Perhaps this patch does not work ...

Samuel Mutel.



[Nagios-users] CHECK_HTTP odd behaviour

2010-02-18 Thread Paul WILLIS PSE 55499

Well, I tried writing a wrapper script to see what check_http was actually 
receiving. The answer would appear to be absolutely nothing; in fact, 
check_http is never even getting called. Something in the parameters appears 
to cause Nagios to throw an exception when trying to make the call, which is 
caught and treated as a critical error with a null reply.
When I went through the -A parameter and escaped every non-standard character, 
everything burst into life: the wrapper reported the correct string and 
check_http reported the site as up. 
Clearly, whereas bash only needs $ and ` escaped within inverted commas, 
Nagios must have a larger list, including, I would guess, either the ; or the :
Thanks for the help
Paul Willis


[Nagios-users] Processing External Commands

2010-02-18 Thread Glynne Jones

Hi,

I'm running a distributed setup with two servers carrying out monitoring and 
sending their results back to a central server via NSCA. Most of the time this 
works well, but from time to time I get a substantial delay both between NSCA 
receiving the check result and the EXTERNAL COMMAND being logged, and between 
the EXTERNAL COMMAND being logged and the PASSIVE SERVICE CHECK result being 
logged.

This delay can be several minutes and has sometimes been over 10 minutes.

I am running ndoutils as well, and some of the tables are quite big. Could this 
affect things?

Any help appreciated.

Thanks,

Glynne




Re: [Nagios-users] CHECK_HTTP odd behaviour

2010-02-18 Thread Marc Powell


On Feb 18, 2010, at 5:04 AM, Paul WILLIS PSE 55499 wrote:

 Clearly that whereas bash only needs $ and ` escaping within inverted commas 
 nagios must have a larger list, including I would guess either the ; or the :

Nope, not really. \, ! and $ are the only characters that may need escaping, 
depending on where they are used. With the exception of $MACRO$ substitutions, 
nagios just takes your raw command_line and passes it to the shell for 
execution. You never posted your command definition but I'd guess that you 
didn't have proper quoting or something like that.
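
To make the escaping rules above concrete, here is a minimal Python sketch for generating a check_command value. It is not part of Nagios: `escape_check_command_arg` and `build_check_command` are hypothetical helpers, and the sketch assumes the usual conventions of `\!` for a literal `!` inside an argument, `\\` for a literal backslash, and `$$` for a literal `$`. Shell quoting of the final command_line is a separate concern that this does not handle.

```python
def escape_check_command_arg(arg):
    """Escape one argument that will follow a '!' in a check_command value."""
    # Backslashes first, so the escapes added below are not doubled again.
    arg = arg.replace("\\", "\\\\")
    # '!' separates arguments, so a literal one needs a backslash.
    arg = arg.replace("!", "\\!")
    # '$' delimits macros, so a literal one is written as '$$'.
    arg = arg.replace("$", "$$")
    return arg


def build_check_command(command_name, args):
    """Join a command name and its escaped arguments with '!'."""
    return "!".join([command_name] + [escape_check_command_arg(a) for a in args])
```

For example, `build_check_command("check_http", ["a!b", "x"])` yields `check_http!a\!b!x`, where the first argument reaches the command as `a!b`.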

--
Marc

Re: [Nagios-users] Can I run both Nagios V2 and V3 in parallel while I migrate?

2010-02-18 Thread Marc Powell


On Feb 17, 2010, at 8:29 PM, Lylex Ryan wrote:

 In upgrading from nagios (v2) to nagios3, I'd like to do a fresh install of 
 nagios3 and start with a clean sheet of  (config) paper.  But can I do this 
 while V2 is running production?

Yes, I've run instances of all three versions on a single box at once.

 Since the packages have different names, I thought it might work.  But they 
 probably would both have /etc/nagios and other default directories in common. 
  

Clearly if the packages install components to common directories then that 
isn't going to work. Compiling and installing from tarball is not difficult at 
all and you have control over where things get put (by default everything is 
under /usr/local/nagios). You'll need to set up a second http vhost with a 
different name for the second instance and either modify the nagios init script 
to start the second instance or add the startup to rc.local.

Once you're confident in the success of your transition, you could uninstall 
the v2 package, install the v3 package and copy over your etc and var 
directories from your transition install...

--
Marc



Re: [Nagios-users] Processing External Commands

2010-02-18 Thread Marc Powell


On Feb 18, 2010, at 5:17 AM, Glynne Jones wrote:

 This delay can be several minutes and has sometimes been over 10 minutes.
 
 I am running ndoutils as well, and some of the tables are quite big. Could 
 this affect things?

Yes, certainly. If the database is busy, either through action of your own or 
through one of the regular table maintenance tasks then processing of check 
results may be delayed waiting on the database. Should be pretty easy to see 
through top if mysql is busy during those times.
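
To put numbers on the second delay described in the question, the main log can be scanned for matching pairs of entries. The sketch below is Python and makes several assumptions: that external commands and passive checks are both being logged, that the lines have the form `[epoch] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;host;service;...` and `[epoch] PASSIVE SERVICE CHECK: host;service;...`, and that results for one host/service pair are processed in order. `passive_delays` is a hypothetical helper name.

```python
import re
from collections import defaultdict, deque

CMD_RE = re.compile(
    r"\[(\d+)\] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;([^;]+);([^;]+);"
)
RESULT_RE = re.compile(r"\[(\d+)\] PASSIVE SERVICE CHECK: ([^;]+);([^;]+);")


def passive_delays(lines):
    """Return (host, service, seconds) between command logging and result logging."""
    pending = defaultdict(deque)  # (host, service) -> timestamps awaiting a result
    delays = []
    for raw in lines:
        line = raw.strip()
        match = CMD_RE.match(line)
        if match:
            pending[(match.group(2), match.group(3))].append(int(match.group(1)))
            continue
        match = RESULT_RE.match(line)
        if match:
            key = (match.group(2), match.group(3))
            if pending[key]:
                # Pair the result with the oldest unmatched command for this service.
                delays.append((key[0], key[1], int(match.group(1)) - pending[key].popleft()))
    return delays
```

Feeding it the lines of the Nagios log and sorting the result by the third field would show whether the long delays cluster at particular times, which can then be compared with database activity.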

--
Marc



Re: [Nagios-users] Can I run both Nagios V2 and V3 in parallel while I migrate?

2010-02-18 Thread Jim Avery

On 18 February 2010 02:29, Lylex Ryan lylexr...@yahoo.com wrote:

 In upgrading from nagios (v2) to nagios3, I'd like to do a fresh install of 
 nagios3 and start with a clean sheet of  (config) paper.  But can I do this 
 while V2 is running production?

 Since the packages have different names, I thought it might work.  But they 
 probably would both have /etc/nagios and other default directories in 
 common.  Maybe if I installed from the tar-ball, I could specify new 
 directories for V3, but I'm also trying to avoid that learning-process and 
 use a pre-packaged rpm.

 Maybe installing V3 on a different server all-together, then moving it to the 
 production machine would be a way.

I think the standard advice is no you can't run more than one instance
on a single operating-system (of course you probably can if you put
enough effort in to it).

I would recommend against installing your new Nagios 3 install with
non-standard install paths - it could make installing add-ons in the
future (for example PNP graphing, NagVis dashboards etc,) difficult if
everything is in the wrong place.

Personally, when I upgraded from 2 to 3, I put the 3 install on a new
server and 'migrated' hosts and services across from old to new
gradually over a period of a couple of months.


Re: [Nagios-users] Editing the nagios Side bar

2010-02-18 Thread Marc Powell


On Feb 17, 2010, at 5:14 PM, Ron Wilson wrote:

 I have 6 groups set up holding servers being patched on each day. I would 
 like an entry  in the Nagios sidebar that says patching which would then give 
 a web page view of the six patching groups on one page. This makes it easier 
 for admins to disable notifications for a large number of servers with one 
 click.
 Because we have so many groups it would be easier to have the Patching days 
 on one page.
 However, while I can create a URL for one day's patching on the new page, I 
 cannot get all six.
 This is my PHP code:
 <li><a href="<?php echo $cfg["cgi_base_url"];?>/status.cgi?hostgroup=Patch_Day1&amp;style=overview" target="<?php echo $link_target;?>">Patch Day1</a></li>
  
 This works fine, but how can I get the other 5 patch groups into that line? I 
 need something like Patch_Day*, but such a wildcard does not work with PHP.
 Anyone got some ideas?

It's not a PHP thing... Nagios does not have functionality to limit (or expand, 
depending on how you look at it), the display of multiple hostgroups that are a 
subset of all hostgroups. The only exception to this is limitation through 
authentication, which wouldn't appear to fit your goals.

--
Marc



Re: [Nagios-users] service notification when host is down

2010-02-18 Thread Marc Powell


On Feb 18, 2010, at 3:47 AM, Samuel Bancal wrote:

 Thanks for your answer,
 
 In fact it is normal behavior to me also.
 Thing that is not normal behavior to me is that between two checks, Nagios 
 jumps from SOFT 1 to HARD 1 without doing the steps SOFT 1  SOFT 2  
 SOFT 3 and finally HARD 4.

If the host is down, why should nagios go through all that? There's no 
possibility for the service to be up when the host is not.
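
The behaviour discussed here can be pictured with a small model. The sketch below is Python, not Nagios code, and it is a deliberate simplification: `next_service_state` is a hypothetical function, attempt numbering on recovery is not modelled (it is simply reset to 1), and the host-down shortcut is inferred from the logs Samuel posted (SOFT;1 followed directly by HARD;1) rather than taken from the Nagios source.

```python
def next_service_state(previous, check_ok, host_up, max_check_attempts):
    """Return the next (state, state_type, attempt) tuple for a service.

    previous is the (state, state_type, attempt) tuple before this check.
    """
    prev_state, prev_type, prev_attempt = previous

    if check_ok:
        # A recovery from a soft problem is a soft recovery; otherwise it is hard.
        if prev_state != "OK" and prev_type == "SOFT":
            return ("OK", "SOFT", 1)
        return ("OK", "HARD", 1)

    if prev_state == "OK":
        attempt = 1  # first failed check after an OK state
    elif prev_type == "HARD":
        return ("CRITICAL", "HARD", prev_attempt)  # already hard, stays hard
    elif host_up:
        attempt = prev_attempt + 1  # host is up, so retry and count the attempt
    else:
        attempt = prev_attempt  # host is down, no further retries are counted

    # With the host down there is no point retrying: go straight to HARD.
    if not host_up or attempt >= max_check_attempts:
        return ("CRITICAL", "HARD", attempt)
    return ("CRITICAL", "SOFT", attempt)
```

With `max_check_attempts` set to 4 and the host up, four failed checks in a row give SOFT 1, SOFT 2, SOFT 3, HARD 4. If the host is found to be down after the first failure, the next call returns HARD with the attempt still at 1, which is the pattern in the posted history.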

--
Marc



[Nagios-users] unsubscribe

2010-02-18 Thread Don McCallum

unsubscribe


Re: [Nagios-users] Processing External Commands

2010-02-18 Thread Glynne Jones

On Thu, Feb 18, 2010 at 07:19:05AM -0600, Marc Powell wrote:
 
 On Feb 18, 2010, at 5:17 AM, Glynne Jones wrote:
 
  This delay can be several minutes and has sometimes been over 10 minutes.
  
  I am running ndoutils as well, and some of the tables are quite big. Could 
  this affect things?
 
 Yes, certainly. If the database is busy, either through action of your own or 
 through one of the regular table maintenance tasks then processing of check 
 results may be delayed waiting on the database. Should be pretty easy to see 
 through top if mysql is busy during those times.
 

Thought that might be the case. mysql is always busy (I've got 3370 checks over 
362 hosts).

You mention regular table maintenance tasks - is this something that comes out 
of the box or something separate?

Thanks,

Glynne




Re: [Nagios-users] Help required

2010-02-18 Thread Digital Edge

From: reachta...@hotmail.com
To: j...@jimavery.me.uk
Date: Thu, 18 Feb 2010 12:36:17 +0530
CC: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] Help required

 Date: Wed, 17 Feb 2010 20:27:34 +
 Subject: Re: [Nagios-users] Help required
 From: j...@jimavery.me.uk
 To: reachta...@hotmail.com
 CC: nagios-users@lists.sourceforge.net

 On 17 February 2010 17:36, Digital Edge reachta...@hotmail.com wrote:
  Hi List,

  It would be really helpful if I could get a response to the query
  below.

  I have a URL, say

  http://www.example.com/sigin.jsf. After logging in, it redirects to
  https://www.example1.com/ddo/get_sec_pwd.php; there a second authentication
  takes place, after which it goes to
  https://www.example1.com/home/home.jsf. Inside that page I have several
  other tabs:

  Home|Home1|Home2

  All the tabs can be navigated and viewed after the second successful login,
  and are accessible within that session only.

  Can we monitor the response time of those URLs one by one in Nagios,
  without losing the session?

 Not in Nagios itself, no, but I expect you could use WebInject
 http://www.webinject.org/ to do the web querying and timing and feed
 the results back to Nagios.

 hth,

 Jim

Hi ,

Yes; even I have tried also. The issue what i'm facing is after successful 
authentication checking , I'm unable to navigate through those links.

testcases repeat=1
case
id=1
description1=Connecting to portfolio_signup
method=get
url=http://www.example.com/portfolio_signup.php;
verifypositive=Sign in
errormessage=Unable to connect to the login page of portfolio_signup
/
case
id=2
description1=Authentication on portfolio_signup
method=post
parseresponse='mykey=|'
url=http://www.example.com/portfolio_signup.php;
postbody=user=abcdpassword=1234mykey={PARSEDRESULT}
verifypositive=Sign in
errormessage=Unable to authenticate user abcd in portfolio_signup
/
case
id=3
description1=Authentication on MM
method=post
parseresponse='mykey=|'
url=https://www.example1.com/sso/get_sec_pwd.php;
postbody=user=abcdpassword=12345rmykey={PARSEDRESULT}
verifypositive=Secure Password
errormessage=Unable to authenticate user abcd in MM
/

case
id=4
description1=Navigate through www.example1.com while authenticated
method=get
url=https://www.example1.com/quickenweb/main/home.jsf;
verifypositive=How can I ?
errormessage=Unable to navigate through www.example1.com even though 
correctly authenticated
/
/testcases

All the tests pass except case 4, and I can't work out why. 
Can anyone help me with this?

/\
Ricky

Dear List,

Can anyone help me with this? Sorry for the double post.



[Nagios-users] CHECK_HTTP odd behaviour

2010-02-18 Thread Paul WILLIS PSE 55499

Marc

It wasn't the command definition file. That was the same as the command I was
using for running directly, i.e.

# 'check_eRhttp' command definition
define command{
        command_name    check_eRhttp
        command_line    $USER1$/check_http -p 8000 -H some.host.co.uk -u
/sap/bc/webdynpro/sap/hrrcf_a_unreg_job_search?sap-wd-configId=ZUNREG_JOB_SEARCHsap-ep-themeroot=/sap/public/bc/ur/customerthemes/sap_kp
-A Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.0.9) Gecko/20061206
Firefox/1.5.0.9 -R fs_QE2_00
        }

I did away with passing parameters from the service as I originally thought
that was the problem.
I have since had a quick play and can confirm it is indeed the semicolons.
Leave them in and it goes critical / null status without calling the plugin.
Escape them and it behaves.
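
For anyone who generates command definitions from a script, here is a minimal Python sketch of the escaping step described above. It assumes the usual Nagios object-file behaviour, where an unescaped ";" starts a comment and "\;" yields a literal semicolon. The helper name escape_semicolons is made up for this illustration and is not part of Nagios or its plugins.

```python
import re


def escape_semicolons(command_line: str) -> str:
    """Backslash-escape every ';' that is not already escaped.

    Hypothetical helper: prepares a command_line value for a Nagios
    object config file, where a bare ';' would start a comment.
    """
    # Negative lookbehind skips semicolons already preceded by a backslash.
    return re.sub(r"(?<!\\);", r"\\;", command_line)


if __name__ == "__main__":
    agent = "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.0.9)"
    print(escape_semicolons(agent))
```

Running the helper twice gives the same result as running it once, so it should be safe to apply to a value that is already partly escaped.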

Paul Willis

--
This email and any accompanying document(s) contain information from Kent
Police, which is confidential or privileged.
The information is intended to be for the exclusive use of the individual(s) or
bodies to whom it is addressed.
If you are not the intended recipient, be aware that any disclosure, copying,
distribution or use of the contents of this information is prohibited.
If you have received this email in error, please notify us immediately by
contacting the sender or telephoning 01622 690690.
The copyright in the contents of this email and any enclosure is the property
of Kent Police and any unauthorised reproduction or disclosure is contrary to
the provisions of the Copyright Designs and Patents Act 1998.
--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting
any issue.
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] MACRO PROBLEM

2010-02-18 Thread Khurram Malik


Hi Patrick

 

Thanks for your reply. I know that, according to the matrix, CONTACTEMAIL and 
CONTACTPAGER are disabled for Host/Service Event Handlers, but I am talking 
about CONTACTGROUPMEMBERS. Or can you suggest any other way of getting 
CONTACTEMAIL and CONTACTPAGER? I hope you understand the problem: I need 
all this information to send to Netcool along with other Host or Service 
related information.

Regards

Khurram Malik




 
 


 
 Date: Wed, 17 Feb 2010 23:38:12 -0800
 From: patrick.mor...@hp.com
 To: malik_khur...@hotmail.com
 CC: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] MACRO PROBLEM
 
 Khurram Malik wrote:
  Hi
  
  I am using Nagios 3.0.6 and in an integration project i want Nagios to 
  send alerts to Netcool. I am using Host/Service Global Event Handlers. 
  I am able to get the maximum information via the following macros
  
  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $SERVICEDESC$
  
  But i also want some other info via macros and i am using the 
  following link to see if macro is enabled or disabled
  http://nagios.sourceforge.net/docs/3_0/macrolist.html#hostoutput
  
  I want to get the CONTACTEMAIL and CONTACTPAGER contents, but these macros 
  are disabled with Global Host/Service handlers. What is the easiest way 
  to get info for the contact macros with Global Event Handlers? I can 
  see $CONTACTGROUPMEMBERS$ is enabled with Global Event Handlers, but 
  I am unable to get any value; seems like a bug.
 
 This is not a bug. These macros are not available with event handlers, 
 since event handlers do not have contacts associated with them. If you 
 look at the matrix on the page you linked, you'll see that CONTACTEMAIL 
 and CONTACTPAGER work only with host and service notifications.

Re: [Nagios-users] MACRO PROBLEM

2010-02-18 Thread Khurram Malik


Thanks Patrick

 

But how can I provide the name of the contact group when it depends on the 
service or host that is triggering the event? Is there a way to 
provide the contact group dynamically to CONTACTGROUPMEMBERS?

 

e.g $CONTACTEMAIL:CONTACTGROUPMEMBERS:,

 

which can give me a comma-separated list of emails associated with that 
particular service or host?

 

or if there is any ready made script for Nagios and Netcool integration?

Regards

Khurram Malik




 
 


 
 Date: Wed, 17 Feb 2010 23:48:08 -0800
 From: patrick.mor...@hp.com
 To: malik_khur...@hotmail.com
 CC: nagios-users@lists.sourceforge.net
 Subject: Re: [Nagios-users] MACRO PROBLEM
 
 Morris, Patrick wrote:
  Khurram Malik wrote:
  
  Hi
  
  I am using Nagios 3.0.6 and in an integration project i want Nagios to 
  send alerts to Netcool. I am using Host/Service Global Event Handlers. 
  I am able to get the maximum information via the following macros
  
  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $SERVICEDESC$
  
  But i also want some other info via macros and i am using the 
  following link to see if macro is enabled or disabled
  http://nagios.sourceforge.net/docs/3_0/macrolist.html#hostoutput
  
  I want to get the CONTACTEMAIL and CONTACTPAGER contents, but these macros 
  are disabled with Global Host/Service handlers. What is the easiest way 
  to get info for the contact macros with Global Event Handlers? I can 
  see $CONTACTGROUPMEMBERS$ is enabled with Global Event Handlers, but 
  I am unable to get any value; seems like a bug.
  
 
  This is not a bug. These macros are not available with event handlers, 
  since event handlers do not have contacts associated with them. If you 
  look at the matrix on the page you linked, you'll see that CONTACTEMAIL 
  and CONTACTPAGER work only with host and service notifications.
  
 
 After re-reading your original question, I may have misunderstood, and 
 you're wondering why $CONTACTGROUPMEMBERS$ doesn't work.
 
 See notes 5 and 7 on the page you linked. These macros work as 
 on-demand macros in event handlers, since event handlers have no contacts 
 associated with them. To obtain a list of contact group members in that 
 context, you would also need to provide the name of the group.
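
 As a rough illustration of the on-demand form mentioned above: the Nagios 3 macro documentation gives on-demand group macros the shape $MACRONAME:object_name$, and describes CONTACTGROUPMEMBERS as expanding to a comma-separated list of contact names. The Python sketch below only builds such macro strings (for example when generating command definitions) and splits an already-expanded member list. The group name "admins", the contact names, and both helper names are hypothetical.

```python
from typing import List


def on_demand_macro(macro_name: str, object_name: str) -> str:
    """Build an on-demand macro reference such as $CONTACTGROUPMEMBERS:admins$."""
    return f"${macro_name}:{object_name}$"


def split_members(expanded: str) -> List[str]:
    """Split the comma-separated value Nagios substitutes for the macro."""
    return [name.strip() for name in expanded.split(",") if name.strip()]


if __name__ == "__main__":
    # "admins" is a made-up contact group name.
    print(on_demand_macro("CONTACTGROUPMEMBERS", "admins"))
    # Per-contact on-demand macros can then be built from each member name.
    for contact in split_members("alice, bob,carol"):
        print(on_demand_macro("CONTACTEMAIL", contact))
```

 Nagios itself expands macros when it runs the command, so the group name still has to be fixed in the command definition; this sketch does not make the lookup dynamic.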
 

Re: [Nagios-users] Processing External Commands

2010-02-18 Thread Marc Powell


On Feb 18, 2010, at 7:44 AM, Glynne Jones wrote:

 Thought that might be the case. mysql is always busy (I've got 3370 checks 
 over 362 hosts).
 
 You mention regular table maintenance tasks - is this something that comes 
 out of the box or something separate?

ndo2db.cfg --

## TABLE TRIMMING OPTIONS
# Several database tables containing Nagios event data can become quite large
# over time.  Most admins will want to trim these tables and keep only a
# certain amount of data in them.  The options below are used to specify the
# age (in MINUTES) that data should be allowd to remain in various tables
# before it is deleted.  Using a value of zero (0) for any value means that
# that particular table should NOT be automatically trimmed.

# Keep timed events for 24 hours
max_timedevents_age=1440

# Keep system commands for 1 week
max_systemcommands_age=10080

# Keep service checks for 1 week
max_servicechecks_age=10080

# Keep host checks for 1 week
max_hostchecks_age=10080

# Keep event handlers for 31 days
max_eventhandlers_age=44640



I've set all of these to 1 hour for my install based on my needs. If you have 
database backup scripts, those could be causing delays as well.
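
Since the trimming options are expressed in minutes, a small Python sketch may help when picking values. It reproduces the defaults quoted above (24 hours, 1 week, 31 days) and the 1-hour setting mentioned here. The function names are made up for illustration; only the option names come from ndo2db.cfg as quoted.

```python
MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * MINUTES_PER_HOUR

# Option names as they appear in the ndo2db.cfg excerpt above.
TRIM_OPTIONS = (
    "max_timedevents_age",
    "max_systemcommands_age",
    "max_servicechecks_age",
    "max_hostchecks_age",
    "max_eventhandlers_age",
)


def age_in_minutes(hours: int = 0, days: int = 0) -> int:
    """Convert a retention period to the MINUTES value ndo2db expects."""
    return hours * MINUTES_PER_HOUR + days * MINUTES_PER_DAY


def render_trim_options(minutes: int) -> str:
    """Render every trimming option with the same retention age."""
    return "\n".join(f"{name}={minutes}" for name in TRIM_OPTIONS)


if __name__ == "__main__":
    # One hour for every table, as described in the message above.
    print(render_trim_options(age_in_minutes(hours=1)))
```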

--
Marc



[Nagios-users] Delayed Notification for Primary Secondary Nagios Servers

2010-02-18 Thread Chris Pepper

We have 2 Nagios servers, with the same /usr/local/nagios/etc (kept 
current with rsync). I'd like to have the second server skip the initial 
notifications, since under normal circumstances we'll receive the primary 
server's notification email and react to that.

Does Nagios provide a way to do this? I don't see anything.

Currently, I'm considering going with email notifications from primary 
every 30min, and using a $USER$ macro to effectively set 
notifications_enabled=0 on secondary.

Then I should be able to escalate to pages with first_notification=3 
from secondary, and first_notification=5 from primary. This way, if primary 
goes down, we should get pages at 90min; if secondary goes down, we should get 
email about it at 30min and pages at 150min. I can also explicitly enable 
notifications regarding primary on the secondary server.

Is there a better way to do this?

Thanks,

Chris Pepper
-- 
Chris Pepper:http://cbio.mskcc.org/
 http://www.extrapepperoni.com/


Re: [Nagios-users] Processing External Commands

2010-02-18 Thread Glynne Jones

On Thu, Feb 18, 2010 at 10:02:51AM -0600, Marc Powell wrote:
 
 On Feb 18, 2010, at 7:44 AM, Glynne Jones wrote:
 
  Thought that might be the case. mysql is always busy (I've got 3370 checks 
  over 362 hosts).
  
  You mention regular table maintenance tasks - is this something that comes 
  out of the box or something separate?
 
 ndo2db.cfg --
 
 ## TABLE TRIMMING OPTIONS

[snip]

 I've set all of these to 1 hour for my install based on my needs. If you have 
 database backup scripts, those could be causing delays as well.
 

Doh! Cheers, Marc. Forgot those were there. I don't think those are getting in 
the way; it's more likely that some of the other tables are getting large. Have 
you changed the table engine from MyISAM to InnoDB?

I'll have a play with the debug settings to see if I can find where the delay 
is coming in.

Thanks,

Glynne
 


Re: [Nagios-users] Processing External Commands

2010-02-18 Thread Marc Powell


On Feb 18, 2010, at 10:29 AM, Glynne Jones wrote:

 Doh! Cheers, Marc. Forgot those were there. I don't think those are getting in 
 the way; it's more likely that some of the other tables are getting large. 
 Have you changed the table engine from MyISAM to InnoDB?

Still all myISAM.

--
Marc



[Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-18 Thread Michael Medin

Hello

Since I am pondering a replacement for the NSCA and NRPE protocols, I 
thought I would ask the community for some thoughts.
This is pretty much an open-floor kind of thing, to get some sense 
of what people actually need and would want (if anything at all).
To give a general idea, here are a few questions to start it off:

Is a new protocol a good idea?

Should a new protocol be flat-text based or structured?

Would web services be the best way?

Should the protocol be extensible?

What features would a new protocol need to support?
  - message, performance data, configuration, multiple queries, control 
logic transfer, inventory, etc.

What platforms would it need to support?

Which polling scheme(s): active, passive, active/passive, proxy, etc.?

Master/slave scenarios?
In both NRPE and NSCA, Nagios is the master; should the client be 
allowed to act as master?

What kind of security mechanisms do you need (host, password, 
encryption, certificates, etc.)?

Client-side checks, or client-side data gathering with server-side checks?
(i.e. check_nrpe gets OK back; another option would be to get the 
value and let the server decide if it is good or bad.)

Multiple streams?
i.e. send to both Nagios and potentially other collectors (like rrd)

Feel free to add more thoughts and ideas here

// Michael Medin



[Nagios-users] unsubscribe

2010-02-18 Thread Rick Garland

unsubscribe

 

--
The information contained in this transmission may be confidential. Any 
disclosure, copying, or further distribution of confidential information is not 
permitted unless such privilege is explicitly granted in writing by Quantum. 
Quantum reserves the right to have electronic communications, including email 
and attachments, sent across its networks filtered through anti virus and spam 
software programs and retain such messages in order to comply with applicable 
data security and retention requirements. Quantum is not responsible for the 
proper and complete transmission of the substance of this communication or for 
any delay in its receipt.

Re: [Nagios-users] unsubscribe

2010-02-18 Thread Marc Powell

You mean to send this to nagios-users-requ...@lists.sourceforge.net

From this e-mail's headers --

List-Id:Nagios Users List nagios-users.lists.sourceforge.net
List-Unsubscribe:   
https://lists.sourceforge.net/lists/listinfo/nagios-users,  
mailto:nagios-users-requ...@lists.sourceforge.net?subject=unsubscribe
List-Archive:   
http://sourceforge.net/mailarchive/forum.php?forum_name=nagios-users
List-Post:  mailto:nagios-users@lists.sourceforge.net
List-Help:  
mailto:nagios-users-requ...@lists.sourceforge.net?subject=help
List-Subscribe: 
https://lists.sourceforge.net/lists/listinfo/nagios-users, 
mailto:nagios-users-requ...@lists.sourceforge.net?subject=subscribe

--
Marc

On Feb 18, 2010, at 12:42 PM, Rick Garland wrote:

 unsubscribe
  



[Nagios-users] Set Host Status from Distributed Monitoring Server

2010-02-18 Thread tktucker

Hello! Pardon me if this is already covered somewhere in the documentation;
I can't seem to find what I'm looking for.

I have a working Nagios 3.2.0 environment in a distributed configuration.
One of my environments lives behind a firewall that blocks ICMP traffic
from the central server. In that same environment I have a remote Nagios
node that is successfully sending service check results back to the central
server. All of the nodes behind this firewall are reported as DOWN on the
central server's Nagios page, but their service checks report OK.

Is there a configuration setting that makes the central server report these
nodes as UP when it receives a successful check from the remote monitoring
node? The translate_passive_host_checks option sounds like it should work,
but it doesn't. I understand I can remove the check_command from host.cfg,
but then the host status will be Pending.

Any suggestions? Thank you in advance for your time and assistance.

CS Configs

log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
cfg_dir=/usr/local/nagios/etc/hosts
object_cache_file=/usr/local/nagios/var/objects.cache
precached_object_file=/usr/local/nagios/var/objects.precache
resource_file=/usr/local/nagios/etc/resource.cfg
status_file=/usr/local/nagios/var/status.dat
status_update_interval=10
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=-1
command_file=/usr/local/nagios/var/rw/nagios.cmd
external_command_buffer_slots=4096
lock_file=/usr/local/nagios/var/nagios.lock
temp_file=/usr/local/nagios/var/nagios.tmp
temp_path=/tmp
event_broker_options=-1
log_rotation_method=d
log_archive_path=/usr/local/nagios/var/archives
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=30
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
check_result_reaper_frequency=10
max_check_result_reaper_time=30
check_result_path=/usr/local/nagios/var/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/usr/local/nagios/var/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
check_for_updates=1
bare_update_check=0
use_aggressive_host_checking=0
execute_service_checks=0
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=0
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=1
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=1
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
p1_file=/usr/local/nagios/bin/p1.pl
enable_embedded_perl=1
use_embedded_perl_implicitly=1
illegal_object_name_chars=`~!$%^*|'?,()=
illegal_macro_output_chars=`~$|'
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=nag...@localhost
admin_pager=pagenag...@localhost
daemon_dumps_core=0
use_large_installation_tweaks=0
enable_environment_macros=1
debug_level=16
debug_verbosity=2
debug_file=/usr/local/nagios/var/nagios.debug
max_debug_file_size=100



Remote Monitoring Node

log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
cfg_dir=/usr/local/nagios/etc/hosts
object_cache_file=/usr/local/nagios/var/objects.cache

[Nagios-users] E-mailing separate group for subset of hosts(and their services)

2010-02-18 Thread Ryan Rawdon


Hi,

I've had a smallish deployment of Nagios for a while now, but now I need
to add some more functionality to it.  I need to have Nagios notify certain
people when there is an issue with a host or any service on it.  I see that
adding their contactgroup to the host definition only notifies them when
the host itself is down or up, however adding their contactgroup to the
service definition would notify them whenever said service has an issue on
any host - not just the ones I want them to be notified about.

Where is the happy medium here?  Do I need to create a duplicate copy of
all services on these hosts just so that I can list them as the contacts?


Thanks,
Ryan


Re: [Nagios-users] Set Host Status from Distributed Monitoring Server

2010-02-18 Thread Morris, Patrick

tktuc...@gmail.com wrote:
 Hello! Pardon me if this is already covered somewhere in the 
 documentation; I can't seem to find what I'm looking for.

 I have a working Nagios 3.2.0 environment in a distributed configuration.
 One of my environments lives behind a firewall that blocks ICMP 
 traffic from the central server. In that same environment I have a 
 remote Nagios node that is successfully sending service check results 
 back to the central server. All of the nodes behind this firewall are 
 reported as DOWN on the central server's Nagios page, but their service 
 checks report OK.

 Is there a configuration setting that makes the central server report 
 these nodes as UP when it receives a successful check from the remote 
 monitoring node? The translate_passive_host_checks option sounds like 
 it should work, but it doesn't. I understand I can remove the 
 check_command from host.cfg, but then the host status will be Pending.

 Any suggestions? Thank you in advance for your time and assistance.

translate_passive_host_checks only works if you send passive host check 
results.  Are you?

I suspect whatever method you're using to send service check results 
upstream is only being used for service checks, and you may need to 
modify it to also send host check results.
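
As a sketch of what a passive host check result looks like when written to the central server's external command file, the Python below formats a PROCESS_HOST_CHECK_RESULT line (timestamp, host name, state code, plugin output, with 0 = UP, 1 = DOWN, 2 = UNREACHABLE). The host name "web01" and the output text are placeholders, and the helper name is made up; how the line travels from the remote node (NSCA or otherwise) depends on the setup and is not shown.

```python
import time
from typing import Optional

# State codes Nagios uses for passive host check results.
HOST_STATES = {"UP": 0, "DOWN": 1, "UNREACHABLE": 2}


def host_check_result(host: str, state: str, output: str,
                      now: Optional[float] = None) -> str:
    """Format a PROCESS_HOST_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    return (f"[{timestamp}] PROCESS_HOST_CHECK_RESULT;"
            f"{host};{HOST_STATES[state]};{output}")


if __name__ == "__main__":
    # "web01" is a placeholder host name; it must match a host defined
    # on the central server for the result to be accepted.
    print(host_check_result("web01", "UP", "Reported UP by remote node"))
```

The central configuration posted earlier already has accept_passive_host_checks=1 and check_external_commands=1, which a line like this relies on.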


Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-18 Thread Morris, Patrick

Michael Medin wrote:
 Hello

 Since I am pondering a replacement for the NSCA and NRPE protocols, I 
 thought I would ask the community for some thoughts.
 This is pretty much an open-floor kind of thing, to get some sense 
 of what people actually need and would want (if anything at all).
 To give a general idea, here are a few questions to start it off:

 Is a new protocol a good idea?

 Should a new protocol be flat-text based or structured?

 Would web services be the best way?

 Should the protocol be extensible?

 What features would a new protocol need to support?
   - message, performance data, configuration, multiple queries, control 
 logic transfer, inventory, etc.

 What platforms would it need to support?

 Which polling scheme(s): active, passive, active/passive, proxy, etc.?

 Master/slave scenarios?
 In both NRPE and NSCA, Nagios is the master; should the client be 
 allowed to act as master?

 What kind of security mechanisms do you need (host, password, 
 encryption, certificates, etc.)?

 Client-side checks, or client-side data gathering with server-side checks?
 (i.e. check_nrpe gets OK back; another option would be to get the 
 value and let the server decide if it is good or bad.)

 Multiple streams?
 i.e. send to both Nagios and potentially other collectors (like rrd)

   

For what it's worth, I'm pretty happy with NSCA and NRPE as-is, though 
I'd be interested to hear your motivation for replacing them (especially 
the reasons for replacing them outright instead of extending the existing 
apps).


Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-18 Thread Michael Medin

On 2010-02-19 05:22, Morris, Patrick wrote:
 Michael Medin wrote:
 Hello


 Multiple streams?
 ie  send to both Nagios and potentially other collectors (like rrd)


 For what it's worth, I'm pretty happy with NSCA and NRPE as-is, though 
 I'd be interested to hear your motivation for replacing them 
 (especially the reasons for replacing them outright instead of 
 extending the existing apps).


Well, the main reason is that they have a number of limitations which I 
need to resolve and after speaking with Ethan about it I got the 
impression he would not be updating NRPE/NSCA any more (for instance Ton 
Voon has some patches to handle payload size which have not been 
applied). He would (or so I gathered) rather have a new replacement 
client(s).
Also I tend to write programs in C++ and not C which sort of means it is 
simpler for me to re-write them.

// Michael Medin


Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-18 Thread Kevin Keane

 -Original Message-
 From: Morris, Patrick [mailto:patrick.mor...@hp.com]
 Sent: Thursday, February 18, 2010 8:23 PM
 To: Michael Medin
 Cc: nagios-users
 Subject: Re: [Nagios-users] NRPE/NSCA replacement thoughts?

 Michael Medin wrote:
  Hello

  Since I am pondering a replacement for the NSCA and NRPE protocol I
  thought I would get some thoughts from the community?
  So this is pretty much an open floor kind of thing to get some sense
  of what people actually need and would want (if anything at all).

I have actually written a minimal replacement to resolve a shortcoming in my 
own situation. Actually, it is the standard NSCA protocol wrapped in SSL or 
(optionally) SSH.

  But to get some general idea I'll give you a few questions to start
 it off:

  Is a new protocol a good idea?

Maybe the answer to that question should come at the end instead of the 
beginning of the process?

Generally, I believe that extending an existing protocol is usually a better 
idea than wholesale replacement, but sometimes one does have to clear-cut some 
junk.

  Should a new protocol be flat text based or structured?

What is the design goal? I would advocate structured because of the 
flexibility, but if it means more bandwidth or using more processing power to 
parse the protocol, it may not be a great idea?
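
To make the trade-off concrete, here is a minimal Python sketch that encodes the same check result two ways: as a tab-separated line (similar in spirit to the input send_nsca reads) and as a JSON object. The function names and the perfdata field are hypothetical, chosen only for illustration; neither encoding is a proposal from this thread.

```python
import json

def encode_flat(host, service, code, output):
    """Flat text: one tab-separated line per result."""
    return f"{host}\t{service}\t{code}\t{output}\n".encode("utf-8")

def encode_structured(host, service, code, output, perfdata=None):
    """Structured: a JSON object; optional fields can be added later."""
    msg = {"host": host, "service": service, "code": code, "output": output}
    if perfdata:
        msg["perfdata"] = perfdata  # hypothetical optional field
    return json.dumps(msg, separators=(",", ":")).encode("utf-8")

flat = encode_flat("test-server", "CPU", 2, "CRITICAL - load 9.5")
structured = encode_structured(
    "test-server", "CPU", 2, "CRITICAL - load 9.5", {"load1": 9.5}
)
# The structured form is larger on the wire but carries named, optional fields.
print(len(flat), len(structured))
```

The flat form is smaller and trivial to parse, but every new field changes the line format; the structured form costs more bytes and parsing work in exchange for room to grow.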

  Would webservices be the best way?

Separate the structure of the protocol from the underlying transport mechanism. 
In my mind, a lack of that separation is actually the greatest weakness of the 
current protocol.
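
One way to picture that separation, as a minimal Python sketch: the message encoding knows nothing about how bytes are delivered, and the transport knows nothing about what the bytes mean. The Transport interface, InMemoryTransport and the message layout are all hypothetical stand-ins, not part of NRPE or NSCA.

```python
from typing import Protocol

class Transport(Protocol):
    """Anything that can deliver bytes: HTTPS, SSH, a plain socket, ..."""
    def send(self, payload: bytes) -> None: ...

class InMemoryTransport:
    """Stand-in transport for this sketch; it just records what was sent."""
    def __init__(self) -> None:
        self.sent = []

    def send(self, payload: bytes) -> None:
        self.sent.append(payload)

def encode_result(host: str, service: str, code: int, output: str) -> bytes:
    """Protocol layer: defines the message, independent of delivery."""
    return f"{host}\t{service}\t{code}\t{output}\n".encode("utf-8")

def submit(transport: Transport, host: str, service: str, code: int, output: str) -> None:
    """Glue: encode once, hand the bytes to whichever transport is configured."""
    transport.send(encode_result(host, service, code, output))

t = InMemoryTransport()
submit(t, "test-server", "CPU", 0, "OK - load 0.3")
```

Swapping the transport then means writing another class with a send method; the encoding and its parsers stay untouched.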

Web services are an excellent choice of transport for many situations, and 
quite likely will be the dominant one for the foreseeable future. Another 
potential transport is SSH, yet another could be RFC 1149/RFC 2549 avian 
carriers or whatever else somebody comes up with.

Some advantages and drawbacks of Web services:

Advantages:
- built-in firewall compatibility
- built-in encryption and authentication (via SSL)

Drawbacks:
- needs to be integrated with Web servers, potentially adding to complexity and 
performance issues.

  Should the protocol be extensible?

Yes, within reason. One of the beauties of Nagios is in its simplicity, so if 
you add too much extensibility you might actually lose more than you gain.

But on the other hand, some extensibility is important - otherwise, people will 
come up with their own extensions that don't really fit with the model. For 
instance, today's performance data is such an enhancement.

  What platforms would it need to support?

ASCII and Unicode. Other than that, it must be platform neutral, or you lose 
too much.

  Which polling scheme(s): active, passive, active/passive, proxy, etc.?

Both have their place. Most people seem to love active polling, but firewalls may 
sometimes require passive polling.

  Master/slave scenarios?
  In both NRPE and NSCA, Nagios is the master; should the client be
  allowed to act as master?

Define master and slave in this context! If you are talking about the 
current model of multiple Nagios servers, it seems to me that this is more of a 
redesign of Nagios, rather than a protocol issue.

One thing I would definitely like to see in this context is automatically 
adding services to Nagios when passive check results arrive. Keeping the list 
of services in sync between master and slave servers is one of the things that 
makes such a setup complicated.
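
As a toy model of that wish (not how Nagios behaves, and not an existing API), the following Python sketch registers a service the first time a passive result for it arrives. ServiceRegistry and its method names are hypothetical.

```python
class ServiceRegistry:
    """Hypothetical in-memory stand-in for the server's service list."""

    def __init__(self):
        self.services = {}    # (host, service) -> latest (code, output)
        self.auto_added = []  # keys created on first sight, in arrival order

    def submit_passive(self, host, service, code, output):
        key = (host, service)
        if key not in self.services:
            # Unknown service: register it instead of discarding the result.
            self.auto_added.append(key)
        self.services[key] = (code, output)

reg = ServiceRegistry()
reg.submit_passive("web1", "HTTP", 0, "OK")
reg.submit_passive("web1", "HTTP", 2, "CRITICAL - timeout")
reg.submit_passive("web1", "Disk", 1, "WARNING - 85% used")
```

A real implementation would also have to decide which template, contacts and thresholds an auto-added service inherits, which this sketch leaves out.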

  What kind of security mechanisms do you need (host, password,
  encryption, certificates, etc)?

That should be left to the underlying transport. Why reinvent the wheel and 
have to keep chasing security holes if there are already plenty of good 
solutions available?

  Client side checks or client side data gathering with server side checks?
  (i.e. check_nrpe gets OK back; another option would be to get the
  value and let the server decide if it is good or bad.)

Definitely client-side checks. Otherwise, you'd be looking at effectively 
re-inventing RPC. What if the value being checked is some huge binary blob, 
or the result of multiple interdependent system calls?
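
The two models in the question can be contrasted in a few lines of Python. This is a minimal sketch with hypothetical names and thresholds: in the first model the agent applies the thresholds and returns a state, in the second it returns the raw value and the server decides.

```python
OK, WARNING, CRITICAL = 0, 1, 2

def evaluate(value, warn, crit):
    """Map a numeric value to a Nagios-style state code."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK

def client_side_check(read_value, warn, crit):
    """Agent applies thresholds; only state and a short message cross the wire."""
    value = read_value()
    return {"state": evaluate(value, warn, crit), "output": f"load={value}"}

def client_side_gather(read_value):
    """Agent returns the raw value; the server must know how to judge it."""
    return {"value": read_value()}

def server_side_decide(reply, warn, crit):
    """Server applies the thresholds to the value the agent sent."""
    return evaluate(reply["value"], warn, crit)

def read_load():
    return 7.2  # hypothetical measurement

a = client_side_check(read_load, warn=5.0, crit=10.0)
b = server_side_decide(client_side_gather(read_load), warn=5.0, crit=10.0)
```

Both paths reach the same state for a simple number; the difference shows up when the measured thing is large or composite, since the second model has to ship all of it to the server.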

  Multiple streams?
  i.e. send to both Nagios and potentially other collectors (like rrd)

No. Keep it simple, not a protocol to solve all the problems in the world. 
Nagios itself can forward to other collectors.


Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-18 Thread Kevin Keane

 -Original Message-
 From: Michael Medin [mailto:mich...@medin.name]
 Sent: Thursday, February 18, 2010 10:05 PM
 To: Morris, Patrick; nagios-users
 Subject: Re: [Nagios-users] NRPE/NSCA replacement thoughts?

 On 2010-02-19 05:22, Morris, Patrick wrote:

  For what it's worth, I'm pretty happy with NSCA and NRPE as-is,
 though
  I'd be interested to hear your motivation for replacing them
  (especially the reasons for replacing them outright instead of
  extending the existing apps).

Don't get me wrong - I like the idea of improvements to NRPE/NSCA, but I see a 
few issues with the motivation.

 Well, the main reason is that they have a number of limitations which I
 need to resolve and after speaking with Ethan about it I got the
 impression he would not be updating NRPE/NSCA any more (for instance
 Ton Voon has some patches to handle payload size which have not been
 applied). He would (or so I gathered) rather have a new replacement
 client(s).

Client? Or protocol?

 Also I tend to write programs in C++ and not C which sort of means it
 is simpler for me to re-write them.

That really isn't a good reason to throw out the investment thousands of people 
have made in a working NRPE/NSCA infrastructure! When the next developer comes 
into the project and likes Java, are we going to get yet another protocol? What 
if somebody wants to write a client for a new platform - does it have to be in 
C++?

Now don't get me wrong: I actually agree that there are good reasons to update 
or even replace the protocol. But I'm quite concerned about the motivation, and 
the end result that would come from it.


Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-18 Thread Michael Medin

On 2010-02-19 07:35, Kevin Keane wrote:

 Well, the main reason is that they have a number of limitations which I
 need to resolve and after speaking with Ethan about it I got the
 impression he would not be updating NRPE/NSCA any more (for instance
 Ton Voon has some patches to handle payload size which have not been
 applied). He would (or so I gathered) rather have a new replacement
 client(s).
  
 Client? Or protocol?

Protocol. NRPE and NSCA have fixed limits on data length; Ton extended 
the protocol with an additional packet type that carries more data.
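
To illustrate what a fixed limit on data length means in practice, here is a minimal Python sketch of a fixed-size packet. The field layout (version, type, CRC32, result code, 1024-byte buffer, two padding bytes) is modeled on the NRPE v2 packet as best recalled and should be treated as an assumption, not a reference; build_packet and read_text are hypothetical helpers.

```python
import struct
import zlib

BUFFER_LEN = 1024  # assumed fixed payload size
# Network byte order: version, type, crc32, result code, buffer, 2 pad bytes.
PACKET = struct.Struct(f"!hhIh{BUFFER_LEN}s2x")

def build_packet(packet_type, result_code, text):
    # Anything beyond the buffer (minus a trailing NUL) is silently cut off.
    data = text.encode("utf-8")[:BUFFER_LEN - 1]
    # CRC is computed over the packet with the CRC field set to zero.
    without_crc = PACKET.pack(2, packet_type, 0, result_code, data)
    crc = zlib.crc32(without_crc) & 0xFFFFFFFF
    return PACKET.pack(2, packet_type, crc, result_code, data)

def read_text(packet):
    _version, _ptype, _crc, _code, buf = PACKET.unpack(packet)
    return buf.split(b"\0", 1)[0].decode("utf-8", errors="replace")

long_output = "x" * 3000
pkt = build_packet(2, 0, long_output)
# Every packet has the same size, and the long plugin output is truncated.
print(PACKET.size, len(read_text(pkt)))
```

With a layout like this, the only ways to carry more data are to change the buffer size (breaking old peers) or to add a new packet type, which matches the kind of extension described above.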


 Also I tend to write programs in C++ and not C which sort of means it
 is simpler for me to re-write them.
  
 That really isn't a good reason to throw out the investment thousands of 
 people have made in a working NRPE/NSCA infrastructure! When the next 
 developer comes into the project and likes Java, are we going to get yet 
 another protocol? What if somebody wants to write a client for a new platform 
 - does it have to be in C++?

Uhmm... I am talking about a protocol here, so feel free to implement a 
client in brainf*ck if you like...
 Now don't get me wrong: I actually agree that there are good reasons to 
 update or even replace the protocol. But I'm quite concerned about the 
 motivation, and the end result that would come from it.

Well, in this case my motivation is pure and simple self interest... I 
have no noble ideas about helping the nagios community.
I need a new protocol, I will write one... pure and simple...
I just figured I would get some insight into what to think about whilst 
doing it...

// Michael Medin


Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-18 Thread Michael Medin

On 2010-02-19 07:25, Kevin Keane wrote:
 Is a new protocol a good idea?
  
 Maybe the answer to that question should come at the end instead of the 
 beginning of the process?

Well, if everyone thinks this is a doomed, sinking ship, there is no point 
in venturing forth, so for me this is actually the most important question :)
 Generally, I believe that extending an existing protocol is usually a better 
 idea than wholesale replacement, but sometimes one does have to clear-cut 
 some junk.

The protocol part of NRPE and NSCA has far, far too many flaws to merit 
extending them.
 Should a new protocol be flat text based or structured?

 What is the design goal? I would advocate structured because of the 
 flexibility, but if it means more bandwidth or using more processing power to 
 parse the protocol, it may not be a great idea?

That's exactly the question: which matters more, speed, simplicity or 
flexibility?
Nagios has survived on its simplicity, but lately it has tried to grow 
into something more advanced.

 Should the protocol be extensible?

 Yes, within reason. One of the beauties of Nagios is in its simplicity, so if 
 you add too much extensibility you might actually lose more than you gain.

 But on the other hand, some extensibility is important - otherwise, people 
 will come up with their own extensions that don't really fit with the model. 
 For instance, today's performance data is such an enhancement.


 Master/slave scenarios?
 In both NRPE and NSCA, Nagios is the master; should the client be
 allowed to act as master?

 Define master and slave in this context! If you are talking about the 
 current model of multiple Nagios servers, it seems to me that this is more of 
 a redesign of Nagios, rather than a protocol issue.

One pretty interesting idea I saw at the Nordic Nagios Meet last spring 
was a client (I don't recall the name now) that allowed you to define 
the checks and such on the client. This was then uploaded and 
incorporated into Nagios. This means Nagios is no longer the master for 
configuration data; instead, the clients have become masters.
 What kind of security mechanisms do you need (host, password,
 encryption, certificates, etc)?

 That should be left to the underlying transport. Why reinvent the wheel and 
 have to keep chasing security holes if there are already plenty of good 
 solutions available?


 Client side checks or client side data gathering with server side checks?
 (i.e. check_nrpe gets OK back; another option would be to get the
 value and let the server decide if it is good or bad.)

 Definitely client-side checks. Otherwise, you'd be looking at effectively 
 re-inventing RPC. What if the value being checked is some huge binary blob, 
 or the result of multiple interdependent system calls?


 Multiple streams?
 i.e. send to both Nagios and potentially other collectors (like rrd)

 No. Keep it simple, not a protocol to solve all the problems in the world. 
 Nagios itself can forward to other collectors.


From what I have gathered, this is pretty time- and CPU-consuming, so 
another option would be to split off the data outside Nagios.

// Michael Medin

