Hi list, I'm seeing some latency problems I can't explain. Here's the story.
I'm running Nagios 3.0.3 on Ubuntu 8.04. The hardware is an Intel quad core with 10 GB of RAM and fast disks. There are 24524 services on 1654 hosts to check. Services are mostly active-passive with a check interval of 6 hours. check_interval for hosts is 0, so Nagios checks them only on demand. No broker module is activated. ocsp and ochp are activated because this server is part of a distributed system. Nagios debug is activated.

Some of the configuration options I have:

    service_inter_check_delay_method=s
    service_interleave_factor=s
    host_inter_check_delay_method=s
    max_concurrent_checks=0
    max_service_check_spread=240
    check_result_reaper_frequency=2
    max_check_result_reaper_time=30

I tried to "play" with those options without success. Latency keeps growing whatever I try.

What is strange is that the performance screen shows:

    Metric                  Min.       Max.           Average
    Check Execution Time:   0.00 sec   15.01 sec      0.888 sec
    Check Latency:          0.00 sec   10191.24 sec   4924.060 sec
    Percent State Change:   0.00%      18.36%         0.24%

while in the scheduling queue, at 9:45, I see:

    SERVER  PRINT_ERROR  25-09-2008 03:45:32  25-09-2008 09:45:32  Normal  ENABLED

So it seems that services are correctly scheduled despite the latency I see in the performance screen.

/usr/local/nagios/bin/nagios -s /usr/local/nagios/etc/nagios.cfg tells me that everything is fine and has no suggestions for me:
HOST SCHEDULING INFORMATION
---------------------------
Total hosts:                      1654
Total scheduled hosts:            0
Host inter-check delay method:    SMART
Average host check interval:      0.00 sec
Host inter-check delay:           0.00 sec
Max host check spread:            360 min
First scheduled check:            N/A
Last scheduled check:             N/A

SERVICE SCHEDULING INFORMATION
-------------------------------
Total services:                   24524
Total scheduled services:         23739
Service inter-check delay method: SMART
Average service check interval:   17331.08 sec
Inter-check delay:                0.61 sec
Interleave factor method:         SMART
Average services per host:        14.83
Service interleave factor:        15
Max service check spread:         240 min
First scheduled check:            Thu Sep 25 10:07:13 2008
Last scheduled check:             Thu Sep 25 14:07:15 2008

CHECK PROCESSING INFORMATION
----------------------------
Check result reaper interval:     2 sec
Max concurrent service checks:    Unlimited

PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.

So I'm a bit lost. Which screen is right: the performance one, which reports an average latency of 4924 sec, or the scheduling one, which tells me the checks are made on time? What do you think? How can the latency be so high when Nagios only needs to make 1 or 2 checks per second? Is there anything wrong in my setup?

Thanks in advance for any advice or info you could give.

Olivier Jan
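P.S. For what it's worth, here is the back-of-the-envelope arithmetic behind the "1 or 2 checks per second" remark, as a small Python sketch. The numbers are copied from the nagios -s output and the performance screen in this message. The function names are made up for illustration, and the delay formula (max check spread divided by the number of scheduled services) is only an assumption about how the reported 0.61 sec arises; it is not taken from the Nagios source.

```python
# Figures copied from the `nagios -s` output and the performance screen.
SCHEDULED_SERVICES = 23739
AVG_CHECK_INTERVAL_S = 17331.08
MAX_CHECK_SPREAD_MIN = 240
AVG_LATENCY_S = 4924.060


def checks_per_second(services, avg_interval_s):
    """Steady-state check rate if every service runs once per interval."""
    return services / avg_interval_s


def inter_check_delay(services, spread_min):
    """Delay between checks if they are spread evenly over the max spread.

    Assumption: this is a guess at how the reported figure is derived,
    not the actual Nagios algorithm.
    """
    return spread_min * 60 / services


def latency_fraction(avg_latency_s, avg_interval_s):
    """Average latency expressed as a fraction of the check interval."""
    return avg_latency_s / avg_interval_s


rate = checks_per_second(SCHEDULED_SERVICES, AVG_CHECK_INTERVAL_S)
delay = inter_check_delay(SCHEDULED_SERVICES, MAX_CHECK_SPREAD_MIN)
fraction = latency_fraction(AVG_LATENCY_S, AVG_CHECK_INTERVAL_S)

print(f"expected check rate:    {rate:.2f} checks/sec")
print(f"even inter-check delay: {delay:.2f} sec")
print(f"latency / interval:     {fraction:.0%}")
```

The rate comes out between 1 and 2 checks per second and the delay rounds to the 0.61 sec that Nagios reports, which is why an average latency of more than a quarter of the check interval looks so surprising to me.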