Re: [Nagios-users] check_disk plugin
Davide Blasi wrote:
> with or without quotes give me the same result :(

Try using single quotes, e.g. -I '/my/first/.*' -I '/second/.*'

--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] About host check retry interval for nagios v 3.x
Yu Watanabe wrote:
> Hello all.
>
> I would like to ask a question regarding the Host Definition in the official Nagios 3.x documentation. In Object Definitions - Host Definition, the host retry interval is set as #. What interval length does Nagios actually use with this value? Would it be the default time unit, 60 sec?
>
> Thank you
> Yu Watanabe

This is not a real value; the # indicates that the directive requires a number. In the case of retry_interval, this is the number of time units (minutes, with the default interval_length of 60 seconds) between check attempts after the host goes into a SOFT non-OK state. Normally you put a 1 here so that it retries every minute until it reaches max_check_attempts.

regards,
Aidan
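As a rough sketch of how retry_interval and max_check_attempts sit together in a Nagios 3.x host definition: the host name, address, contact group and check command below are made up for illustration, and interval_length is assumed to be left at its default of 60 seconds.

```
define host{
    host_name               example-host        ; hypothetical host
    alias                   Example Host
    address                 192.0.2.10          ; documentation-range address, not a real host
    check_command           check-host-alive    ; assumes the stock sample command is defined
    max_check_attempts      5                   ; failed attempts needed before the state goes HARD
    check_interval          5                   ; time units between checks while the host is UP
    retry_interval          1                   ; time units between retries in a SOFT non-OK state
    check_period            24x7                ; assumes a 24x7 timeperiod exists
    contact_groups          admins              ; assumes a contact group called admins
    notification_interval   30
    notification_period     24x7
    }
```

With values like these, a host that stops answering should be retried about once a minute until the fifth failed attempt, at which point it goes into a HARD state.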
Re: [Nagios-users] Host Dependency Object Inheritance Issue
Aidan Anderson wrote:
> Hi,
>
> Using Nagios v3.2.1, I am having problems defining host dependency object inheritance (chaining) using templates. It appears that if you use 2 levels of inheritance, Nagios doesn't like it and aborts with the following error:
>
> Error: Could not expand dependent hostgroups and/or hosts specified in host dependency (config file '/usr/local/nagios/etc/manual/templates-hosts.cfg', starting on line 123)
>
> Here is my config. I created the following host dependency templates in '/usr/local/nagios/etc/manual/templates-hosts.cfg'. This is where the error is found, so I've highlighted line 123:
>
>     define hostdependency{
>         name                            dc-ping-proxy
>         execution_failure_criteria      d,u,p
>         notification_failure_criteria   d,u,p
>         register                        0
>         }
>
>     define hostdependency{
>         use                             dc-ping-proxy
>         name                            cam-ping-proxy
>         host_name                       rp1b
>         register                        0
>         }
>
>     define hostdependency{        <--- Line 123
>         use                             dc-ping-proxy
>         name                            tcl-ping-proxy
>         host_name                       rp1a
>         register                        0
>         }
>
> I then created the following 2 host dependency definitions, which use the bottom 2 templates:
>
>     define hostdependency{
>         use                             cam-ping-proxy
>         dependent_host_name             cam-int
>         }
>
>     define hostdependency{
>         use                             tcl-ping-proxy
>         dependent_host_name             tcl-int
>         }
>
> This should expand as follows:
>
>     define hostdependency{
>         host_name                       rp1b
>         dependent_host_name             cam-int
>         execution_failure_criteria      d,u,p
>         notification_failure_criteria   d,u,p
>         }
>
>     define hostdependency{
>         host_name                       rp1a
>         dependent_host_name             tcl-int
>         execution_failure_criteria      d,u,p
>         notification_failure_criteria   d,u,p
>         }
>
> but I get the error. I then changed the configs to remove 1 level of inheritance. My templates and definitions now look like this:
>
> Template:
>
>     define hostdependency{
>         name                            dc-ping-proxy
>         execution_failure_criteria      d,u,p
>         notification_failure_criteria   d,u,p
>         register                        0
>         }
>
> Definitions:
>
>     define hostdependency{
>         use                             dc-ping-proxy
>         host_name                       rp1b
>         dependent_host_name             cam-int
>         }
>
>     define hostdependency{
>         use                             dc-ping-proxy
>         host_name                       rp1a
>         dependent_host_name             tcl-int
>         }
>
> This should expand to the same configuration as when there were 2 levels of inheritance. However, the second configuration works fine but the first one doesn't. Also, I have created a similar service dependency setup with 2 levels of inheritance and that works fine.
>
> Can someone cast their eye over the configs listed above to see if there is anything obvious that I have done wrong with the inheritance?
>
> regards,
> Aidan

I've changed the way I work out the host_name of the host being depended upon to make it more dynamic, so this is no longer an issue for me. If someone could double-check my syntax to make sure I have not made an error, I will post to nagios-dev as a possible bug.

cheers,
Aidan
[Nagios-users] Persistent Comment in Acknowledgement
Hi,

When acknowledging a host or service problem, I've noticed that the Persistent Comment check box is not ticked by default in v3, whereas it was in v2. Is there any way of changing this behaviour so that it is ticked by default? I can't find any options in cgi.cfg or nagios.cfg to change this behaviour.

If there is no official way to change it, does anyone know of a hack to do this?

regards,
Aidan
Re: [Nagios-users] Persistent Comment in Acknowledgement
Assaf Flatto wrote:
> Aidan Anderson wrote:
>> Hi,
>>
>> When acknowledging a host or service problem, I've noticed that the Persistent Comment check box is not ticked by default in v3, whereas it was in v2. Is there any way of changing this behaviour so that it is ticked by default? I can't find any options in cgi.cfg or nagios.cfg to change this behaviour.
>>
>> If there is no official way to change it, does anyone know of a hack to do this?
>>
>> regards,
>> Aidan
>
> You will need to make changes to the cmd.c file and recompile the CGI, AFAIK.
>
> Good luck.

Hi Assaf,

Thanks for the reply. I must admit, I've never messed about with C source code before, but I'll give it a try :)

I assume that if I make any changes, I would need to repeat the changes following any upgrades?

regards,
Aidan
Re: [Nagios-users] Persistent Comment in Acknowledgement
Assaf Flatto wrote:
> Aidan Anderson wrote:
>> Assaf Flatto wrote:
>>> Aidan Anderson wrote:
>>>> Hi,
>>>>
>>>> When acknowledging a host or service problem, I've noticed that the Persistent Comment check box is not ticked by default in v3, whereas it was in v2. Is there any way of changing this behaviour so that it is ticked by default? I can't find any options in cgi.cfg or nagios.cfg to change this behaviour.
>>>>
>>>> If there is no official way to change it, does anyone know of a hack to do this?
>>>>
>>>> regards,
>>>> Aidan
>>>
>>> You will need to make changes to the cmd.c file and recompile the CGI, AFAIK.
>>>
>>> Good luck.
>>
>> Hi Assaf,
>>
>> Thanks for the reply. I must admit, I've never messed about with C source code before, but I'll give it a try :)
>>
>> I assume that if I make any changes, I would need to repeat the changes following any upgrades?
>>
>> Aidan
>
> If you've never delved into the C code, then I'd advise not making any changes without the help of a C programmer, and having a backup before any attempts begin.
>
> As for the upgrade issue: of course! Since this is a local change, unless you plan to do the upgrade for the core without the CGIs, any local change will be overwritten when you upgrade. But once you do it and get it right, doing it again on the new version will be much easier.
>
> Good luck
> Assaf

Hi Assaf,

I had to have a go and (surprising myself) have managed to do it. I will remember to do this again each time I upgrade.

Below is the output of a 'diff' following my changes to cmd.c, in case anyone else is interested in making this modification. 2 changes are required to cover host and service acknowledgements.

    958c958
    < printf("<INPUT TYPE='checkbox' NAME='persistent' %s>",(cmd==CMD_ACKNOWLEDGE_HOST_PROBLEM)?"":"CHECKED");
    ---
    > printf("<INPUT TYPE='checkbox' NAME='persistent' CHECKED>");
    984c984
    < printf("<INPUT TYPE='checkbox' NAME='persistent' %s>",(cmd==CMD_ACKNOWLEDGE_SVC_PROBLEM)?"":"CHECKED");
    ---
    > printf("<INPUT TYPE='checkbox' NAME='persistent' CHECKED>");

Thanks again for your help, Assaf.

regards,
Aidan
[Nagios-users] Child host becomes UNREACHABLE when parent changes from UP to a SOFT DOWN state
Hi List!

I am in the process of upgrading from v2.12 to v3.2.1. As well as upgrading, I am taking the opportunity to move to a new server at the same time. This has allowed me to run both versions in tandem and compare how they operate.

One difference I noticed straight away was the downtime duration on certain hosts. For example, v2 would show a host down for over 2 days, yet v3 would show the same host as being down for only a few hours. On investigation, it turned out that the parent of the host on v3 went into a soft down state. This changed the host in question to an unreachable state. The parent host recovered within a minute or so and changed the host back to a down state, effectively resetting the down duration back to zero. I would have expected that the child host should only change state if the parent goes into a hard down state, not a soft down state.

I googled for the issue and found one related post from just over a year ago:

http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg25543.html

The poster was given various suggestions to circumvent the problem, i.e. tweaking flap detection, increasing the time-out on the plugin etc., but nothing that seemed to resolve his issue. The poster's main problem with this behaviour was that the state changes were causing down e-mail alerts for hosts that were already down.

My issue is not with repeated alerts but with the accuracy of the down duration of the host. When our support department look to resolve host problems, they will try to resolve the oldest problems first, for obvious reasons of fairness to our customers. This scenario breaks that. In v3, to get an accurate downtime for a host, you would now have to trawl through the alert history or run a trend report for the host to find out when the host really went down.

Version 2 does not exhibit this problem. I don't think this is by design; it is purely down to the way serial host checks work in v2. When a host goes into a soft down state in v2, Nagios cannot do anything else until it has completed all the retries or the host recovers, so Nagios never gets the chance to mark the child host unreachable unless it reaches max_check_attempts and determines that the parent host really is down.

The original poster of this problem made a good point: Nagios has all the tolerance built in to avoid false alarms on host checks, but unfortunately this logic doesn't carry through to child hosts. I can't see the current way v3 deals with parent/child problems as being desirable for most people, although it seems to have only bothered 2 of us!

Thoughts?

regards,
Aidan
Re: [Nagios-users] Accessing Nagios for the first time
Tim Tompson wrote:
> My nagios.conf:
>
>     ## BEGIN APACHE CONFIG SNIPPET - NAGIOS.CONF
>     ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
>     <Directory /usr/local/nagios/sbin>
>         Options ExecCGI
>         AllowOverride None
>         Order allow,deny
>         Allow from all
>         AuthType Digest
>         AuthName "Nagios Access"
>         AuthUserFile /usr/local/nagios/etc/.digest_pw
>         Require valid-user
>     </Directory>
>
>     Alias /nagios /usr/local/nagios/share
>     <Directory /usr/local/nagios/share>
>         Options None
>         AllowOverride None
>         Order allow,deny
>         Allow from all
>         AuthType Digest
>         AuthName "Nagios Access"
>         AuthUserFile /usr/local/nagios/etc/.digest_pw
>         Require valid-user
>     </Directory>
>     ## END APACHE CONFIG SNIPPETS
>
> I followed the instructions at http://nagios.sourceforge.net/docs/3_0/cgisecurity.html to secure my install, and that's where I got the above .conf file. It's set to Allow from all, shouldn't that work?

It should, and so should connecting to serveripaddress/nagios. It looks like Apache is your issue. Is it running? Are there other web sites running on the same box? Are they working?

Aidan
Re: [Nagios-users] problem creating hostgroup
Gezina Dekker wrote:
> Hi all,
>
> When I restart after adding this hostgroup using split.cfg I get the following:
>
>     Running configuration check.
>     CONFIG ERROR! Restart aborted. Check your Nagios configuration
>
> I have a server definition for it. If I comment the lines out, the restart is successful. Am I just missing something?
>
> This is what my hostgroup definition looks like:
>
>     define hostgroup{
>         hostgroup_name  Linux_group
>         alias           No_Call-Out
>         members         svrlinux01
>         }
>
> Any ideas that can help me out here?
>
> Regards and thanks for all the help so far, learned a lot,
> Gezina

Looks like a typo: did you mean to add the member as svrlinux01 or srvlinux01?
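To illustrate the point about the member name, here is a minimal sketch of a matching pair of definitions. The generic-host template, the alias and the address are assumptions made for the example; what matters is that the value in members is spelled exactly like the host_name of an existing host.

```
define host{
    use             generic-host        ; assumes a template of this name, as in the sample configs
    host_name       svrlinux01          ; the name the hostgroup must refer to
    alias           Linux server 01     ; hypothetical alias
    address         192.0.2.21          ; documentation-range address, not a real host
    }

define hostgroup{
    hostgroup_name  Linux_group
    alias           No_Call-Out
    members         svrlinux01          ; must match host_name above character for character
    }
```

Running the verification pass by hand, e.g. nagios -v followed by the path to your main nagios.cfg, prints the specific error rather than the generic "CONFIG ERROR" shown by the init script.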
Re: [Nagios-users] host status in nagios/var/status.dat
Colin McKinnon wrote:
> Hi all,
>
> Having looked at what was available (NLG, centreon...) I decided to write my own front end for Nagios. This proved to be quite straightforward (except for sorting out the locking semantics in PHP, but that's another story).
>
> The only problem I'm having is that while the status reported in status.dat for services matches the output from the probe (0=OK, 1=warn, 2=crit, 3=unknown), for hosts it seems to record a status of 0 for OK but 1 for critical (down).
>
> Is this the way it's supposed to work? Or am I missing something? (Nagios 2.10)
>
> TIA
> C.

AFAIK this is correct. With services Nagios needs to know the actual state, e.g. OK, warn, crit, unknown, but with hosts all it needs to know is if the host is UP or DOWN, hence 0 or 1.

regards,
Aidan
Re: [Nagios-users] NSClient issue (Unknown alerts)
Ronaldo A. Bueno Filho wrote:
> Hi, guys and ladies :)
>
> Now, I'm experiencing a problem regarding NSClient++. I'm monitoring a Windows workstation on my LAN. I configured NSClient++ following its documentation. Now, that workstation shows unknown alerts for CPU load, Memory usage and Uptime with the message:
>
>     NSClient - ERROR: PDH Collection thread not running.
>
> Looking on google.com, I found that it happens when you are not using the English language on Windows. Also, I did not find any resolution for that issue. I'm not sure if there is an issue related to the Windows language.
>
> Does somebody know how to solve this issue?

The installation section of the readme.html file that comes with the NSClient download tells you how to resolve this issue.
Re: [Nagios-users] String errors
Jerad Riggin wrote:
> Ok, so I'm monitoring about 100 websites with string checks via check_http. We are mirroring what our datacenter actually checks, so we have notifications turned off so that when a site goes down we aren't being spammed by the datacenter and our Nagios installation. The issue is that every once in a while a string changes on the site, so it goes critical in our Nagios. We perhaps won't notice it for a day, which messes up our availability reports. Is there a way to retroactively mark the time that it was critical as scheduled downtime?

I'm not aware of any way to retrospectively schedule downtime, but you could probably solve your problem by adjusting your checking procedure.

Assuming you or a colleague has access to change the HTML on your websites, you could have a standard string of text that you add to all your websites so that Nagios is checking the same text on each site. Whenever a new site is added, just make sure that your standard text string is added and you will avoid this problem in the first place.

hth
Aidan
Re: [Nagios-users] Service notifications for a down host?
Doug Tabb wrote:
> I’m looking for a little behavior confirmation here, please. It’s my understanding that a failed service check is one way a host check is initiated. If Nagios determines the host is down, further service problem notifications are suppressed. However, I still get one or more notifications for the initial service problems. Wouldn’t Nagios suppress those initial service checks until at least one host check has been made?
>
> For illustration, I have a remote site with host parent/child relationships configured. If the site goes down, I get about 2 dozen service notifications from various child hosts before it realizes the top parent host is down and suppresses notifications for that site. I then receive the one host recovery along with the 2 dozen or so service recovery messages. I had hoped to not receive any service notifications in this scenario. Is this expected behavior?
>
> Thank you very much!
> Doug Tabb

You shouldn't be seeing this behavior. The only time you should see this is if your services enter a hard state before the hosts. How do your host retry attempts compare to your service retry attempts?

Aidan
Re: [Nagios-users] Monitor packet loss with check_ping command
Alex Dehaini wrote:
> But in this case, if there is a 20% packet loss out of 10 pings sent to a host, will I be notified?

That all depends on what you set your max_check_attempts to. If you want to be notified of any packet loss, set this to 1 (one). Increase this value if you prefer more tolerance.

> On 10/22/07, Giles Coochey [EMAIL PROTECTED] wrote:
>> check_ping uses the ping command. Packet loss is considered a reply not within the timeout; this can typically be around 3000ms. So something like:
>>
>>     ./check_ping -H $HOSTNAME$ -w 3000,20% -c 3000,50%
>>
>> will do what you want.
>>
>>> From: [EMAIL PROTECTED] On Behalf Of Alex Dehaini
>>> Sent: 22 October 2007 11:29
>>> To: nagios-users@lists.sourceforge.net
>>> Subject: [Nagios-users] Monitor packet loss with check_ping command
>>>
>>> Hi Guys,
>>>
>>> Can someone give me an example on how I can monitor only packet loss but not latency
>>>
>>> Alex Dehaini
>>> Developer Site - www.alexdehaini.com
>>> Email - [EMAIL PROTECTED]
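Putting the two replies above together, a sketch of a packet-loss-only check might look like the following. The command name, host name, service description and generic-service template are all made up for the example; the thresholds are the ones Giles suggested, with the round-trip limit set so high that in practice only the loss percentage should trip the check.

```
define command{
    command_name            check_ping_loss     ; hypothetical command name
    ; 3000 ms RTA thresholds are effectively "ignore latency"; -p 10 sends 10 packets
    command_line            $USER1$/check_ping -H $HOSTADDRESS$ -w 3000,20% -c 3000,50% -p 10
    }

define service{
    use                     generic-service     ; assumes a template of this name exists
    host_name               example-host        ; hypothetical host
    service_description     Packet loss
    check_command           check_ping_loss
    max_check_attempts      1                   ; go HARD (and notify) on the first failed check
    }
```

Raising max_check_attempts to 2 or 3 would require the loss to persist across that many consecutive checks before a notification goes out.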
Re: [Nagios-users] notify contact only once
Terry wrote:
> I have a contact that I only want to receive one notification. How can I set this up?

A good place to start looking would be here:

http://nagios.sourceforge.net/docs/2_0/notifications.html

;)
Re: [Nagios-users] notify contact only once
Aidan Anderson wrote:
> Terry wrote:
>> I have a contact that I only want to receive one notification. How can I set this up?
>
> A good place to start looking would be here:
>
> http://nagios.sourceforge.net/docs/2_0/notifications.html
>
> ;)

Apologies, here is where you want to start:

http://nagios.sourceforge.net/docs/2_0/escalations.html

You would specify the contact you only want to receive one notification in the first escalation, and all other contacts in the first and subsequent escalations.

Aidan
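A minimal sketch of that escalation approach, assuming a service whose normal contact group is called admins and a second, hypothetical group notify-once holding the contact who should hear about the problem only once. All host, service and group names here are illustrative.

```
define service{
    use                     generic-service     ; assumes a template of this name exists
    host_name               example-host        ; hypothetical host
    service_description     HTTP
    check_command           check_http
    contact_groups          admins              ; the regular contacts
    notification_interval   30                  ; re-notify every 30 time units
    }

define serviceescalation{
    host_name               example-host
    service_description     HTTP
    first_notification      1                   ; applies to the first notification...
    last_notification       1                   ; ...and only the first
    notification_interval   30
    contact_groups          admins,notify-once  ; everyone, including the one-time contact
    }
```

Because an escalation that matches the current notification number supplies the contact groups for that notification, the first problem notification should go to both groups, while the second and later ones fall back to the service's own contact_groups.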
Re: [Nagios-users] notify contact only once
Terry wrote:
> Thanks for the reply. Let me be more specific:
>
> version: 2.9
> OS: centos 5
>
> I have regular contacts set up, me for example. I want to get notified every 30 minutes indefinitely if a service is in a hard state of warning or critical. However, I want another contact to only get notified one time when that hard state is achieved. That's it. From what I can tell, I can only achieve this through the notification_interval, which is only set at the host/service level, not the contact level. If this is true, I will need to create 2 services, each with a different notification_interval, and of course apply the different contact groups to each service.
>
> Am I correct or is there another way around this?
>
> Thanks!
>
> On 10/5/07, Aidan Anderson [EMAIL PROTECTED] wrote:
>> Terry wrote:
>>> I have a contact that I only want to receive one notification. How can I set this up?

Hi Terry,

I've just posted you another message before seeing this one. You want to use host or service escalations to achieve this. I've briefly explained in the previous post, but if you need more help, just shout.

Aidan
Re: [Nagios-users] Newbie Notifications Problem
Ray Wadkins wrote:
> Thanks for the reply. I didn't include notify-host-by-email because it didn't seem relevant, but it's in commands.cfg (pasted below). The host isn't failing, just the service. When you say service notifications are suppressed, what do you mean? Is there a configuration I can't see that's suppressing service notifications?

It's something Nagios does by default. If a service check fails, it will check the host. If the host check fails, it will send out a host notification but suppress the service notification. From what you've said, I don't think that's your problem.

I've noticed that you have used a lot of templates (inheritance) in your configs. You could try simplifying it by just setting up a contact, a contact group, a host, a service and a time period, without using templates. If that basic test works, the problem may lie with one of your templates.

Aidan
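A template-free test along those lines could be sketched as below, using Nagios 3.x directive names. Every object name, the e-mail address and the IP address are placeholders; the notify-*-by-email, check-host-alive and check_ping commands are assumed to be already defined in commands.cfg, as in the sample configuration.

```
define timeperiod{
    timeperiod_name                 test-24x7
    alias                           Test: all day, every day
    sunday                          00:00-24:00
    monday                          00:00-24:00
    tuesday                         00:00-24:00
    wednesday                       00:00-24:00
    thursday                        00:00-24:00
    friday                          00:00-24:00
    saturday                        00:00-24:00
    }

define contact{
    contact_name                    test-contact
    alias                           Test Contact
    email                           someone@example.com     ; placeholder address
    host_notification_period        test-24x7
    service_notification_period     test-24x7
    host_notification_options       d,u,r
    service_notification_options    w,u,c,r
    host_notification_commands      notify-host-by-email
    service_notification_commands   notify-service-by-email
    }

define contactgroup{
    contactgroup_name               test-group
    alias                           Test Group
    members                         test-contact
    }

define host{
    host_name                       test-host
    alias                           Test Host
    address                         192.0.2.50              ; placeholder address
    check_command                   check-host-alive
    max_check_attempts              3
    check_period                    test-24x7
    contact_groups                  test-group
    notification_interval           30
    notification_period             test-24x7
    notification_options            d,u,r
    }

define service{
    host_name                       test-host
    service_description             PING
    check_command                   check_ping!100.0,20%!500.0,60%
    max_check_attempts              3
    check_interval                  5
    retry_interval                  1
    check_period                    test-24x7
    notification_interval           30
    notification_period             test-24x7
    notification_options            w,u,c,r
    contact_groups                  test-group
    }
```

If notifications arrive for this stripped-down set but not for the real objects, comparing the two directive by directive should narrow down which template value is getting in the way.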
Re: [Nagios-users] Newbie Notifications Problem
Ray Wadkins wrote:
> define contact{
>         contact_name                    rwadkins_e              ; Short name of user
>         use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
>         alias                           Ray Wadkins             ; Full name of user
>         email                           X                       ; * CHANGE THIS TO YOUR EMAIL ADDRESS **
>         host_notifications_enabled      1
>         service_notifications_enabled   1
>         host_notification_period        24x7
>         service_notification_period     24x7
>         host_notification_options       d,u,r,f,s
>         service_notification_options    w,u,c,r,f,s
>         host_notification_commands      notify-host-by-email
>         service_notification_commands   notify-service-by-email

You've specified the command notify-host-by-email in your contact definition

> From commands.cfg:
>
> define command{
>         command_name    notify-service-by-email
>         command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
>         }

but you don't seem to have defined it in the commands.cfg file. When a host goes down, only host notifications are sent out (service notifications are suppressed). As you don't seem to have defined a host notification command, you will never receive any notifications. Try adding the following to commands.cfg:

# 'notify-host-by-email' command definition
define command{
        command_name    notify-host-by-email
        command_line    /usr/bin/printf "%b" "Notification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nDetails: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $HOSTSTATE$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n\n$HOSTACKAUTHOR$\n$HOSTACKCOMMENT$\n" | /bin/mail -s "Host $HOSTSTATE$ alert for $HOSTNAME$ - $HOSTALIAS$" $CONTACTEMAIL$
        }

HTH
Aidan
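The mismatch described above (a contact naming a notification command that no "define command" block provides) can be spotted mechanically. The following is only a minimal sketch, assuming all relevant definitions sit in one piece of text and that no templates ("use") need resolving; the function names and the SAMPLE text are made up for illustration and are not part of Nagios.

```python
import re

def parse_objects(text):
    """Parse 'define <type>{ ... }' blocks into (type, {directive: value}) pairs.

    Simplified: ignores template inheritance and assumes the closing brace
    sits on a line of its own, as in the stock sample configs.
    """
    objects, current = [], None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # ';' starts a comment in object configs
        if not line or line.startswith("#"):
            continue
        if current is None:
            match = re.match(r"define\s+(\w+)\s*\{", line)
            if match:
                current = (match.group(1), {})
        elif line == "}":
            objects.append(current)
            current = None
        else:
            parts = line.split(None, 1)
            current[1][parts[0]] = parts[1] if len(parts) == 2 else ""
    return objects

def undefined_notification_commands(text):
    """Return notification commands that contacts reference but nothing defines."""
    objects = parse_objects(text)
    defined = {d["command_name"] for t, d in objects
               if t == "command" and "command_name" in d}
    missing = set()
    for obj_type, directives in objects:
        if obj_type != "contact":
            continue
        for key in ("host_notification_commands", "service_notification_commands"):
            for name in directives.get(key, "").split(","):
                name = name.strip().split("!", 1)[0]  # drop any !arguments
                if name and name not in defined:
                    missing.add(name)
    return sorted(missing)

# Hypothetical, cut-down version of the configuration discussed in this thread.
SAMPLE = """
define contact{
        contact_name                    rwadkins_e
        host_notification_commands      notify-host-by-email
        service_notification_commands   notify-service-by-email
        }

define command{
        command_name    notify-service-by-email
        command_line    /bin/true
        }
"""

print(undefined_notification_commands(SAMPLE))
```

In practice Nagios' own pre-flight check (nagios -v followed by the path to the main config file) should report the same kind of problem and also understands templates, so a script like this is mainly useful for seeing what is being compared.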
Re: [Nagios-users] Configure smtp in Nagios
Rodrigo Tavares wrote:
> Hello,
> How do I configure SMTP in Nagios?
> best regards,
> Rodrigo Faria

You don't. Whatever mail server you are running on the Nagios box takes care of SMTP. Nagios simply pipes the notification through the /bin/mail command, or whatever command suits your mail server. Most distros come with Sendmail or Postfix by default; just make sure you have one running and configured to route mail.

Aidan
Re: [Nagios-users] Nagios Log Management Tips?
Rogelio Bastardo wrote:
> Anyone have any tips for dealing with Nagios logs? Things are getting a little crazy, and I haven't even been logging very much! e.g.
>
> [EMAIL PROTECTED] run]# find / *nagios* -type f -size +100k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
> /var/log/nagios/archives/nagios-08-14-2007-00.log: 2.3G
> /var/log/nagios/archives/nagios-08-13-2007-00.log: 3.4G
> /var/log/nagios/archives/nagios-08-12-2007-00.log: 2.6G
> /var/log/messages.4: 3.5G
> [EMAIL PROTECTED] run]#

Good grief, what on earth are you logging? I'm monitoring over 1000 hosts and 1600 services and my daily logs range between 600KB and 1.5MB. Can you post a snippet of your log (say a 15-minute span) so we can get an idea of what is being logged? I'd love to see how your browser copes with viewing the daily log.

Aidan
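Before posting a snippet, it can help to see which kinds of entries dominate a 15-minute span. A rough sketch, assuming the usual "[epoch-timestamp] TYPE: details" layout of Nagios log lines; the sample lines, host names and function name below are invented for illustration.

```python
import re
from collections import Counter

LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")

def count_entry_types(lines, start=None, end=None):
    """Count log entries per type, optionally restricted to start <= timestamp < end."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a timestamped Nagios log line
        timestamp, rest = int(match.group(1)), match.group(2)
        if start is not None and timestamp < start:
            continue
        if end is not None and timestamp >= end:
            continue
        # Entries such as "SERVICE ALERT: ..." carry their type before the first colon.
        entry_type = rest.split(":", 1)[0].strip() if ":" in rest else "OTHER"
        counts[entry_type] += 1
    return counts

# Hypothetical sample lines; real logs will differ.
SAMPLE_LOG = [
    "[1187049600] SERVICE ALERT: rp1a;PING;CRITICAL;SOFT;1;CRITICAL - Packet loss = 100%",
    "[1187049660] SERVICE ALERT: rp1a;PING;CRITICAL;HARD;2;CRITICAL - Packet loss = 100%",
    "[1187049660] SERVICE NOTIFICATION: admin;rp1a;PING;CRITICAL;notify-service-by-email;CRITICAL - Packet loss = 100%",
    "[1187053200] Auto-save of retention data completed successfully.",
]

window_start = 1187049600
for entry_type, count in count_entry_types(
        SAMPLE_LOG, start=window_start, end=window_start + 15 * 60).most_common():
    print(entry_type, count)
```

For a real archive, the list could be replaced by an open file object, since the function only iterates over lines.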
Re: [Nagios-users] NRPE (No output returned from plugin)
shacky wrote:
> Hi.
> I'm using NRPE to monitor a remote server. Most of the plugins work without problems, but check_backuppc returns the error "(No output returned from plugin)" in the Nagios web interface. The check_backuppc stanza in the Nagios configuration is the following:
>
> define service{
>         use                     remote-service
>         host_name               myremoteserver
>         service_description     BackupPC
>         check_command           check_nrpe!check_backuppc
>         }
>
> If I execute from the shell
>
> check_nrpe -H bakserver.blupixel.local -c check_backuppc
>
> I correctly get the plugin's answer (BACKUPPC WARNING - (5/7) failures). Where is the problem?

Have you set up the command definition correctly in commands.cfg, or wherever you store your commands on your Nagios server? Also check that Nagios has permission to execute the plugin on the remote machine. Test by re-trying your check_nrpe command while logged on as the nagios user.

Aidan
Re: [Nagios-users] Cancel Downtime?
On Jun 7, 2007, at 11:18 PM, Anthony Mendoza wrote:
> Click Downtime and then the Trash can icon to the right of the service/host you want to cancel.
>
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Wil Schultz
> Sent: Thursday, June 07, 2007 11:11 PM
> To: nagios-users
> Subject: [Nagios-users] Cancel Downtime?
>
> IIRC, there used to be a Cancel Downtime link, am I blind or did this go away? How do you cancel scheduled downtime?

I need to cancel scheduled downtime on a host and took the advice of Anthony Mendoza in this thread. Clicking on the Trash can icon certainly removes the Nagios-generated comment, but the period of scheduled downtime remains. Any ideas, anyone?

regards,
Aidan
Re: [Nagios-users] Cancel Downtime?
Aidan Anderson wrote:
> On Jun 7, 2007, at 11:18 PM, Anthony Mendoza wrote:
>> Click Downtime and then the Trash can icon to the right of the service/host you want to cancel.
>>
>> -Original Message-
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Wil Schultz
>> Sent: Thursday, June 07, 2007 11:11 PM
>> To: nagios-users
>> Subject: [Nagios-users] Cancel Downtime?
>>
>> IIRC, there used to be a Cancel Downtime link, am I blind or did this go away? How do you cancel scheduled downtime?
>
> I need to cancel scheduled downtime on a host and took the advice of Anthony Mendoza in this thread. Clicking on the Trash can icon certainly removes the Nagios-generated comment, but the period of scheduled downtime remains. Any ideas, anyone?
>
> regards,
> Aidan

Ignore my last e-mail, I found it. You do it from the downtime link on the sidebar. :)

cheers,
Aidan
Re: [Nagios-users] how to unsubscribe????
Go to https://lists.sourceforge.net/lists/listinfo/nagios-users, scroll to the bottom of the page to the section headed "Nagios-users Subscribers" and follow the instructions for unsubscribing. You'll need your password.

Aidan

Arief Iqbal wrote:
> hi,
> how can i unsubscribe from this goddamned mailing list???
> thx
[Nagios-users] Severe peformance issue during major network outage
Hi,

I have recently set up Nagios 2.8 and am monitoring 1623 hosts and 1946 services. Performance under normal circumstances is fine. Typical check and latency times are as follows:

Monitoring Performance (min / max / avg)
Service Check Execution Time:     0.03 / 11.04 / 3.418 sec
Service Check Latency:            0.00 / 1.87 / 0.479 sec
Host Check Execution Time:        0.03 / 10.04 / 0.843 sec
Host Check Latency:               0.00 / 0.00 / 0.000 sec
# Active Host / Service Checks:   1623 / 1946
# Passive Host / Service Checks:  0 / 0

The vast majority of these hosts are spread over 320 geographic locations throughout the UK. These locations are connected to our data centre via a hardware VPN device, with the majority (about 270) using a private ADSL circuit to facilitate the VPN connection.

Yesterday, we had a major outage caused by the failure of one of the ADSL central routers at our ISP. This took out a third of our ADSL sites (roughly 90) for 16 minutes. Each of these sites has about 4 devices monitored by Nagios, so in effect about 360 devices (hosts) went down in an instant. As you can imagine, we were aware of the problem almost immediately due to the barrage of phone calls from our clients, but unfortunately Nagios didn't even remotely reflect the situation.

I have used parent/child relationships to the full, so I was expecting a good portion of the VPN devices to show as down, with all other devices behind each VPN device showing as unreachable. This was not the case. It actually took half an hour for Nagios to find only 20 of these VPN devices down, and another half an hour to notice that they were back up again, having only noticed 20 of the 90 in the first place.
During the outage, the service check latency was increasing exponentially, and the performance stats half an hour after the start of the problem were as follows:

Monitoring Performance (min / max / avg)
Service Check Execution Time:     0.03 / 11.04 / 3.646 sec
Service Check Latency:            947.84 / 2080.05 / 1467.274 sec
Host Check Execution Time:        0.03 / 10.04 / 0.968 sec
Host Check Latency:               0.00 / 0.00 / 0.000 sec
# Active Host / Service Checks:   1623 / 1946
# Passive Host / Service Checks:  0 / 0

As you can see, the average service check latency has jumped to 1467 seconds (24 mins). On all of these hosts there is only one service, which is a ping:

check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5

The host check is also a ping, but much faster, with only 1 ping being sent out:

check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1

The normal_check_interval on services is 5 mins, with a max_check_attempts of 2 and a retry_interval of 1. The host also has a max_check_attempts of 2.

A lot of people have mentioned using fping to speed things up, but if my average service latency is only 0.479 seconds in normal circumstances, I can't see how tweaking this will help in a major outage situation. I have also read through the section on tweaking performance, which seems to be geared toward protecting the machine Nagios is running on. I want to do the opposite and give Nagios a lot more work to do. The machine is dedicated to Nagios and is quite high spec: it's an IBM xSeries 336 with 2 dual-core processors and 4GB of RAM, so it should be able to take a much bigger hit.

I have been monitoring CPU performance with MRTG and the CPU never goes lower than 90% idle. Ironically, during the problem the machine's idle time jumped to 95%, when I would have expected it to drop rather than increase. The only performance tweak I could see that would affect performance in this situation is max_concurrent_checks, but this is already set to 0.
I am fairly new to Nagios (2 months), so I apologise if I have missed something obvious, but any pointers to a solution to this problem would be greatly appreciated. I have run a nagios -s (attached below), which seems to indicate that everything is set up OK. Let me know if you require any more information from my config that would help diagnose the problem.

regards,
Aidan

Nagios 2.8
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 03-08-2007
License: GPL

Projected scheduling information for host and service checks is listed below. This information assumes that you are going to start running Nagios with your current config files.

HOST SCHEDULING INFORMATION
---
Total hosts:                       1624
Total scheduled hosts:             0
Host inter-check delay method:     SMART
Average host check interval:       0.00 sec
Host inter-check delay:            0.00 sec
Max host check spread:             30 min
First scheduled check:             N/A
Last scheduled check:              N/A

SERVICE SCHEDULING INFORMATION
---
Total services:                    1947
Total scheduled services:          1947
Service inter-check delay method:  SMART
Average
Re: [Nagios-users] nrpe command line test question
Maxwell,Brady wrote:
> My nrpe.cfg on the remote host contains these commands:
>
> command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
> command[check_disk1]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/vga/root
>
> Running check_nrpe from the command line has the following results:
>
> [EMAIL PROTECTED] ~]# /usr/local/nagios/libexec/check_nrpe -H hostname -c check_disk -a 10 5 /dev/vga/root
> check_disk: Warning threshold must be integer or percentage!
>
> [EMAIL PROTECTED] ~]# /usr/local/nagios/libexec/check_nrpe -H hostname -c check_disk1
> DISK OK - free space: / 801 MB (12% inode=81%);| /=5625MB;6405;6415;80;6425
>
> I would like to be able to pass arguments to the remote system, allowing me to set threshold values at the service level. Can anyone tell me why I get the error "Warning threshold must be integer or percentage!"? Or suggest another method of passing the args to the remote nrpe process?
>
> Thanks
> Brady

Make sure that you set dont_blame_nrpe to 1 in nrpe.cfg to allow nrpe to accept client arguments. This is set to 0 by default, as accepting arguments from clients is deemed a security risk.

Aidan
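To make the role of dont_blame_nrpe more concrete, here is a simplified model of what happens to the $ARGn$ placeholders in a command[...] line. It is only a sketch of the idea, not NRPE's actual code: the function name is invented, and treating missing arguments as empty strings is an assumption of this model.

```python
import re

def expand_nrpe_command(template, args, allow_arguments):
    """Model of $ARGn$ substitution in an nrpe.cfg command line.

    allow_arguments stands in for dont_blame_nrpe=1. In this model, a request
    carrying arguments is refused when the option is off, and placeholders
    without a matching argument expand to an empty string.
    """
    if args and not allow_arguments:
        raise PermissionError("client arguments refused (dont_blame_nrpe=0)")

    def substitute(match):
        index = int(match.group(1)) - 1  # $ARG1$ is the first argument
        return args[index] if 0 <= index < len(args) else ""

    return re.sub(r"\$ARG(\d+)\$", substitute, template)

# The command line from the nrpe.cfg quoted above.
TEMPLATE = "/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$"

# With arguments allowed, the thresholds reach check_disk as intended.
print(expand_nrpe_command(TEMPLATE, ["10", "5", "/dev/vga/root"], allow_arguments=True))

# If no arguments arrive, -w is left without a usable value, which is the
# kind of input check_disk rejects as a threshold.
print(repr(expand_nrpe_command(TEMPLATE, [], allow_arguments=False)))
```

As far as I know, the NRPE daemon also has to be built with command-argument support (the --enable-command-args configure option) before dont_blame_nrpe=1 has any effect, so that is worth checking as well.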
Re: [Nagios-users] Severe peformance issue during major network outage
Ton Voon wrote:
> On 11 May 2007, at 19:03, Jim Avery wrote:
>> On 11/05/07, Aidan Anderson [EMAIL PROTECTED] wrote:
>>> A lot of people have mentioned using fping to speed things up but if my average service latency is only 0.479 seconds in normal circumstances, I can't see how tweaking this will help in a major outage situation.
>>
>> check_ping won't finish until it's done all the pings, and the pings are (if I recall) always at one second intervals. This means that if you've configured check_ping to do (let's say) 5 pings, the check_ping plugin will always take at least 5 seconds to complete. If the check_ping is being run as a host check rather than a service check, my understanding is that this is the only thing Nagios will be doing; it doesn't do anything else concurrently (correct me if I'm wrong people).
>
> Correct. We noticed this some time ago too: http://altinity.blogs.com/dotorg/2006/05/immediate_perfo.html
>
> If you do stick to using check_ping, use -p 1, which gives a sub-second response time.

First of all, thank you for the replies!

The majority of devices that I monitor are routers/VPN devices, and I have (on the documentation's advice) not set active checks on the hosts. Instead, I've added check_ping as a service on each of these hosts to do 5 pings, as follows:

check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5

For the host check I already use, as you suggested, a check_ping that only does one ping:

check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1

My understanding was that if the service check failed, Nagios would abandon the service check altogether and move on to the host check, which is only 1 ping. Since the service checks are parallelised, it shouldn't matter that they send 5 pings, and keeping the host check to 1 ping should resolve the bottleneck of serialised host checks. I'm at a loss as to why performance has been impacted so severely.
Maybe I need to abandon the service checks altogether and just have a host check. I'm reluctant to do this because I get very useful information from 5 pings, i.e. packet loss and high RTA, which is particularly handy for checking volatile links such as ADSL. Maybe that is the trade-off: fast host checking with no useful stats, or slow host checking with useful stats.

regards,
Aidan
Re: [Nagios-users] Severe peformance issue during major network outage
Ton Voon wrote:
> On 11 May 2007, at 20:25, Aidan Anderson wrote:
>> First of all, thank-you for the replies! The majority of devices that I monitor are routers/vpn devices and I have (on the documentation's advice) not set active checks on the hosts and instead I've added check_ping as a service on each of these hosts to do 5 pings as follows:
>>
>> check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
>>
>> For the host check I already use as you suggested a check_ping that only does one ping as follows:
>>
>> check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
>>
>> My understanding was that if the service check failed it would then abandon the service check altogether and move onto the host check which is only 1 ping. The fact that the service checks are parallelised should mean that it shouldn't matter that there are 5 pings and the host check is only 1 ping which should resolve the bottleneck of serialised host checks. I'm at a loss as to why performance has been impacted so severely.
>>
>> Maybe I need to abandon the service checks altogether and just have a host check. I'm reluctant to do this because I get very useful information from 5 pings, ie packet loss and high rta which is particularly handy for checking volatile links such as ADSL. Maybe that is the trade-off, fast host checking with no useful stats or slow host checking with useful stats.
>
> Just noticed this in your original email:
>
> Host Check Execution Time: 0.03 / 10.04 / 0.843 sec
>
> This means that some of your host checks are taking 10 seconds, which is, funnily enough, the timeout period for check_ping. So the -p 1 will still take 10 seconds if the routers are not responding. You can use a timeout flag for check_ping (but it is only supported on some OSes). I guess check_icmp is a better bet here.
>
> Ton

Hi Ton,

Well spotted, thank you. check_icmp here we come :)

thanks
Aidan
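The effect of that 10-second timeout can be put in rough numbers. This is a back-of-the-envelope sketch only, assuming host checks run strictly one after another (as described earlier in this thread for this Nagios version) and that every check of an unreachable host runs to the full plugin timeout. The function name and the 1-second timeout are illustrative assumptions, not measurements.

```python
def worst_case_serial_seconds(hosts_down, timeout_s, attempts=1):
    """Upper bound on the time spent checking unreachable hosts one at a time."""
    return hosts_down * timeout_s * attempts

# Figures taken from the thread: about 360 hosts behind roughly 90 VPN devices,
# a 10 s check_ping timeout and max_check_attempts of 2.
all_hosts = worst_case_serial_seconds(360, 10, attempts=2)
vpn_only = worst_case_serial_seconds(90, 10, attempts=2)

# Hypothetical comparison: the same outage with a 1 s timeout per attempt.
faster = worst_case_serial_seconds(360, 1, attempts=2)

for label, seconds in [("all 360 hosts, 10 s timeout", all_hosts),
                       ("90 VPN devices, 10 s timeout", vpn_only),
                       ("all 360 hosts, 1 s timeout", faster)]:
    print(f"{label}: {seconds} s ({seconds / 60:.0f} min)")
```

Even this crude model lands in the same range as the delays reported during the outage, which is why shortening the time a failed host check takes matters far more than the number of pings in the service check.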
Re: [Nagios-users] Disable service_notification_commands
Hi Kareem,

I am using 2.8 and the docs have 'service_notification_commands' in red (required), so I don't know whether that is an error in the docs or not. If you want to disable service notifications, put the directive back and simply specify the 'n' (none) option in service_notification_options. This will disable service notifications for that contact.

regards,
Aidan

Kareem Mahgoub wrote:
> Dear All
> I am using Nagios 2.5 and I want to disable the service notification command. In the documentation, under the section on Contact Definition, I can see that the directive service_notification_commands is in black, which means it is optional. When I commented it out and ran a config check, it gave:
>
> Error: Contact 'kareem' has no service notification commands defined!
>
> Am I missing something here?
>
> Best Regards,
> Kareem Mahgoub
Re: [Nagios-users] Service Group Summary Changing Numbers
Hi Elijah,

Fantastic, glad I could help. The situation you mentioned, where not all processes stop after a restart, is quite common and has been mentioned a few times on the list. I had similar problems and one post suggested doing a "reload" rather than a "restart". I now religiously use "reload" and have not had a problem since.

regards,
Aidan

Elijah Savage wrote:
> Aidan,
> Not sure how I missed that, but you are right, there were multiple processes running. I think my situation came from doing a restart on the services with the init script, and they did not all stop for some reason. I have since stopped all services, killed off any additional processes, and now things seem to be back to exactly what I have grown to expect: a nice stable platform in Nagios. Thank you
>
> - Original Message -
> From: Aidan Anderson [EMAIL PROTECTED]
> To: Nagios Users Mailinglist nagios-users@lists.sourceforge.net
> Sent: Tuesday, April 10, 2007 6:27:21 AM GMT-0500 Auto-Detected
> Subject: Re: [Nagios-users] Service Group Summary Changing Numbers
>
>> Hi Elijah,
>> This sounds similar to a problem that I had, where refreshing the browser was giving me different results. It turned out that the problem was caused by 2 Nagios processes running. When I was refreshing the browser, it was randomly picking one of the processes and reporting back the state of that particular instance, hence the different results on each refresh. To rectify it, I stopped Nagios, manually removed the remaining process and then started Nagios again. I caused the problem during a Nagios upgrade: I didn't stop Nagios before starting the upgrade, so it ended up being started twice.
>>
>> Regards,
>> Aidan
>>
>> Elijah Savage wrote:
>>> All,
>>> I have something going on that I consider very weird. Under the service group summary my numbers change on refresh of the browser when there are no devices down. I have 4 different host groups on that page, but in one group I have 70 devices. You log in and it shows 70 devices up; then you do a refresh and it shows 60 devices up and none down, when you know you have 70; on the next refresh it may show 68 devices up and none down. I know it all sounds like baby talk, but it is somewhat difficult for me to explain. It does this under the hostgroup summary as well.
>>>
>>> I have been on this list for a long time and have never had to post, because through reading the emails and searching the archives I have been able to achieve what I needed for my environment, but I could not find anything close to what I am seeing now.
>>>
>>> Nagios is version 2.7, updated this past weekend. Had I known and been paying attention, I would have waited for the 2.8 release from this weekend :) It is running on Solaris on a Sun V880 platform with 4 CPUs and 8 GB of memory. The server is nowhere close to being overloaded. The thing is, I do not know if this was happening on the previous version. Of course, when you announce a major change or upgrade, people really start to pay close attention to the tools they use.
>>>
>>> Oh yeah, one last thing: the devices being monitored are Cisco devices with the check_command check-router-alive. Any help would be greatly appreciated.