[Nagios-users] Service latency suddenly through the roof
Hi,

I'm running Nagios 2.0b4 on SUSE 9.3. The plan is to upgrade to Nagios 3.0.6 and SLES10 SP1.

Recently, while checking the data that I collect for SLA reporting, I noticed that the service latency has gone through the roof!

The only recent change I've made is to run the ssh checks as host checks. This was to filter ssh checks out of the service problems web interface: we do have to shut nodes down from time to time, but we don't want that to be reported as a service problem.

I then had to introduce check_dummy as the service check (otherwise the pre-flight check outputs hundreds of lines stating that no services were associated with the host). The check_dummy service check is only run once per host, on a Sunday, as it's a dummy check. I thought this would put less load on Nagios (please correct me if I'm wrong). We regard only the checks against our databases as service problems.

Does anyone know what could cause the service latency to suddenly increase like this?

regards,
deborah

*** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. Any unauthorised distribution or copying is strictly prohibited. Whilst Kognitio Limited takes steps to prevent the transmission of viruses via e-mail, we can not guarantee that any email or attachment is free from computer viruses and you are strongly advised to undertake your own anti-virus precautions. Kognitio grants no warranties regarding performance, use or quality of any e-mail or attachment and undertakes no liability for loss or damage, howsoever caused. Kognitio Limited, a company registered in England and Wales. Registered number 0212 7833. Registered Office: 3a Waterside Park, Cookham Road, Bracknell, Berks, RG12 1RB. VAT number 864 4378 92.
Kognitio Inc, a company incorporated in Delaware, principal office 180 North Stetson, Suite 3500, Chicago, IL 60601, USA ***

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
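As a rough illustration of the setup described in the message above, the object definitions might look something like the sketch below. It is not the poster's actual configuration: the host name, address, command names, the `sunday-only` timeperiod, and all intervals are assumptions made for illustration, and the `24x7` timeperiod and `admins` contact group are assumed to be defined elsewhere. Directive names follow the Nagios 2.x object syntax.

```
# Hypothetical sketch only; every name and value here is an assumption.

define command{
        command_name    check-host-ssh                  ; assumed command name
        command_line    $USER1$/check_ssh $HOSTADDRESS$
        }

define command{
        command_name    check-dummy-ok                  ; assumed command name
        command_line    $USER1$/check_dummy 0 "Placeholder service"
        }

define timeperiod{
        timeperiod_name sunday-only                     ; assumed timeperiod
        alias           Sundays only
        sunday          00:00-24:00
        }

define host{
        host_name               node01                  ; hypothetical host
        alias                   Example node
        address                 192.0.2.10
        check_command           check-host-ssh          ; ssh used as the host check
        max_check_attempts      3
        notification_interval   60
        notification_period     24x7
        notification_options    d,u,r
        contact_groups          admins
        }

define service{
        host_name               node01
        service_description     Placeholder
        check_command           check-dummy-ok          ; always returns OK
        check_period            sunday-only             ; only checked on Sundays
        normal_check_interval   10080                   ; one week, assuming interval_length=60
        retry_check_interval    1
        max_check_attempts      1
        notification_interval   0
        notification_period     24x7
        notification_options    c
        contact_groups          admins
        }
```

The placeholder service exists only so that the host has at least one service attached, which is what silences the pre-flight warning mentioned in the message.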
Re: [Nagios-users] Service latency suddenly through the roof
Deborah Martin wrote:
> Hi, I'm running nagios 2.0b4, on suse 9.3. The plan is to upgrade to 3.0.6 (nagios) and SLES10 sp1. Recently, during some checking of the data that I collect for SLA reporting, I noticed the service latency has gone through the roof! The only recent change I've made is to put ssh checks as a host check. This was to filter ssh checks away from the service problem web interface as we do have to shut nodes down from time to time but we don't want to be alerted as a service problem.

Did you also enable scheduled host checks? In 2.x, Nagios stops everything as soon as it runs a host check, so latency will naturally go through the roof.

> I then had to introduce check_dummy as the service check (otherwise the pre-flight check outputs hundreds of lines stating that no services were associated with the host). The check_dummy service check is only done on a Sunday once per host as it's a dummy check. I thought this would put less load on Nagios (please correct me if I'm wrong) We regard service problems to only be checks against our databases. Does anyone know of what could set this service latency to suddenly increase like this?

Since you're running a beta release that's more than 2 years old, I'm not even remotely interested in guessing at or investigating this matter. At least upgrade to the latest 2.x stable (2.12, I think?). If the problem still persists, then it's worth investigating.

-- 
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
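For context on the question about scheduled host checks: in the Nagios 2.x object syntax, regular host checks are, to the best of my understanding, enabled by the optional `check_interval` directive in the host definition. Without it, a host is only checked on demand, for example when one of its services changes state. The fragment below is a minimal sketch with a hypothetical host and made-up values; the command name, the `24x7` timeperiod, and the `admins` contact group are assumptions and are expected to be defined elsewhere.

```
# Hypothetical sketch only; names and values are assumptions.
define host{
        host_name               node01                  ; hypothetical host
        alias                   Example node
        address                 192.0.2.10
        check_command           check-host-ssh          ; assumed command name
        max_check_attempts      3
        check_interval          5                       ; schedules a regular host check every 5 time units
        notification_interval   60
        notification_period     24x7
        notification_options    d,u,r
        contact_groups          admins
        }
```

Removing the `check_interval` line would return the host to on-demand checking only, which is the situation the reply suggests comparing against.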
Re: [Nagios-users] Service latency suddenly through the roof
Andreas,

Thanks for your reply. I completely agree with the suggestion of upgrading. I'm currently migrating the configs across to Nagios 3.0.6, with the hope of switching to the new box in about a week. I was hoping to keep the configs on the Nagios 2.0b4 box fairly consistent with the new box so that we can run them in parallel for about a month before trashing the old one.

Should I expect latency to be a lot lower with the new version of Nagios? I'm currently looking at the logs produced so far by the new version to see what the latency levels are like.

regards,
deborah

-----Original Message-----
From: Andreas Ericsson [mailto:a...@op5.se]
Sent: 23 March 2009 14:02
To: Deborah Martin
Cc: 'nagios-users@lists.sourceforge.net'
Subject: Re: [Nagios-users] Service latency suddenly through the roof