Re: [Nagios-users] how to fix excessive latency
On 06/29/2010 03:57 AM, wwanghongrui wrote:
> Thanks for your reply.
> We are writing to a MySQL database through NDOUtils. We don't use NSCA.
> We have not set external_command_buffer_slots.
> status_update_interval=15
> I use vmstate to capture system performance, as shown below. Maybe the
> bottleneck is not in the system.

Endeavour to not run Nagios on a virtual server. If you must use a virtual server, make very sure that your check result spool directory and status data files are on a ramdisk, or you will certainly run into trouble.

--
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.

--
This SF.net email is sponsored by Sprint
What will you do first with EVO, the first 4G phone?
Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
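Whether the check result spool directory and the status file already sit on a ramdisk can be checked against the mount table. The Python sketch below is only an illustration and not something posted in the thread. It assumes the Linux /proc/mounts layout, treats tmpfs and ramfs as ramdisk filesystems, and uses a made-up mount table plus a hypothetical /dev/shm/nagios/checkresults path. Only the status file path comes from the configuration posted in this thread, and the helper names filesystem_for and on_ramdisk are invented for the example.

```python
import posixpath

# Filesystem types treated as ramdisks in this sketch (an assumption).
RAM_FS_TYPES = {"tmpfs", "ramfs"}

# Made-up mount table in /proc/mounts format, for illustration only.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
"""


def filesystem_for(path, mounts_text):
    """Return (mount_point, fs_type) for the deepest mount containing path."""
    path = posixpath.normpath(path)
    best = ("", "")
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a mount entry
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties the later entry wins.
            if len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def on_ramdisk(path, mounts_text):
    """True if path sits on a filesystem listed in RAM_FS_TYPES."""
    return filesystem_for(path, mounts_text)[1] in RAM_FS_TYPES


if __name__ == "__main__":
    for path in ("/usr/local/nagios/var/status.log",   # status_file from the thread
                 "/dev/shm/nagios/checkresults"):      # hypothetical spool path
        print(path, on_ramdisk(path, SAMPLE_MOUNTS))
```

On a live system the contents of /proc/mounts would be passed in place of SAMPLE_MOUNTS.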
Re: [Nagios-users] how to fix excessive latency
I agree, it is better not to run Nagios on a virtual machine. The I/O layer of VMs performs poorly.

Ciao,
Giorgio

On 29 Jun 2010, at 14:23, Andreas Ericsson a...@op5.se wrote:
> Endeavour to not run Nagios on a virtual server. If you must use a
> virtual server, make very sure that your check result spool directory
> and status data files are on a ramdisk, or you will certainly run into
> trouble.
Re: [Nagios-users] how to fix excessive latency
Clock skew can be an issue as well, depending on the virtualization platform.

On 6/29/10, Giorgio Zarrelli zarre...@linux.it wrote:
> I agree, it is better not to run Nagios on a virtual machine. The I/O
> layer of VMs performs poorly.
Re: [Nagios-users] how to fix excessive latency
I am sorry for my bad English.

My Nagios server is not running on a virtual server. It is Nagios 3.2.0 on SUSE 10 SP2 x86_64 with 8 GB of memory and 4 x ( Xeon(R) CPU E7420 @ 2.13GHz ). I think this hardware is sufficient.

I use vmstat to capture system performance. It is a command on SUSE 10 and has nothing to do with a virtual server.

My configuration is below. I don't know which parameters I should optimize. Could you give me some suggestions? Thanks.

cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/services.cfg
cfg_file=/usr/local/nagios/etc/misccommands.cfg
cfg_file=/usr/local/nagios/etc/checkcommands.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/servicegroups.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/escalations.cfg
cfg_file=/usr/local/nagios/etc/dependencies.cfg
cfg_file=/usr/local/nagios/etc/hostextinfo.cfg
cfg_file=/usr/local/nagios/etc/serviceextinfo.cfg
cfg_file=/usr/local/nagios/etc/meta_commands.cfg
cfg_file=/usr/local/nagios/etc/meta_contactgroup.cfg
cfg_file=/usr/local/nagios/etc/meta_contact.cfg
cfg_file=/usr/local/nagios/etc/meta_dependencies.cfg
cfg_file=/usr/local/nagios/etc/meta_escalations.cfg
cfg_file=/usr/local/nagios/etc/meta_hostgroup.cfg
cfg_file=/usr/local/nagios/etc/meta_host.cfg
cfg_file=/usr/local/nagios/etc/meta_services.cfg
cfg_file=/usr/local/nagios/etc/meta_timeperiod.cfg
resource_file=/usr/local/nagios/etc//resource.cfg
log_file=/usr/local/nagios/var/nagios.log
temp_file=/usr/local/nagios/var/nagios.tmp
status_file=/usr/local/nagios/var/status.log
p1_file=/usr/local/nagios/bin/p1.pl
status_update_interval=15
nagios_user=nagios
nagios_group=nagios
enable_notifications=1
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_event_handlers=1
log_rotation_method=d
log_archive_path=/usr/local/nagios/var/archives/
check_external_commands=1
command_check_interval=1s
command_file=/usr/local/nagios/var/rw/nagios.cmd
lock_file=/usr/local/nagios/var/nagios.lock
retain_state_information=1
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=1
log_external_commands=1
sleep_time=1
service_inter_check_delay_method=s
service_interleave_factor=s
max_concurrent_checks=2000
service_reaper_frequency=5
interval_length=60
use_agressive_host_checking=1
enable_flap_detection=0
low_service_flap_threshold=25.0
high_service_flap_threshold=50.0
low_host_flap_threshold=25.0
high_host_flap_threshold=50.0
service_check_timeout=60
host_check_timeout=10
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
ochp_timeout=5
perfdata_timeout=5
process_performance_data=1
host_perfdata_command=107
service_perfdata_command=process-service-perfdata
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA TIMET::$TIMET$ HOSTNAME::$HOSTNAME$HOSTPERFDATA::$HOSTPERFDATA$ HOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$HOSTSTATE::$HOSTSTATE$ HOSTSTATETYPE::$HOSTSTATETYPE$
service_perfdata_file_template=DATATYPE::SERVICEPERFDATATIMET::$TIMET$ HOSTNAME::$HOSTNAME$ SERVICEDESC::$SERVICEDESC$SERVICEPERFDATA::$SERVICEPERFDATA$ SERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$ HOSTSTATE::$HOSTSTATE$ HOSTSTATETYPE::$HOSTSTATETYPE$ SERVICESTATE::$SERVICESTATE$ SERVICESTATETYPE::$SERVICESTATETYPE$
host_perfdata_file_mode=a
service_perfdata_file_mode=a
host_perfdata_file_processing_interval=30
service_perfdata_file_processing_interval=30
host_perfdata_file_processing_command=process-host-perfdata-file
service_perfdata_file_processing_command=process-service-perfdata-file
check_service_freshness=1
date_format=euro
illegal_object_name_chars=~!$%^*|'?,()=
illegal_macro_output_chars=`~$^|'
admin_email=admin
admin_pager=ad...@localhost
broker_module=/usr/local/nagios/bin/ndomod-3x.o config_file=/usr/local/nagios/etc/ndomod.cfg
event_broker_options=-1
use_large_installation_tweaks=1
child_processes_fork_twice=0
enable_environment_macros=0
debug_file=/usr/local/centreon/log/Debug-Graphs.log
debug_level=-1
max_debug_file_size=6
check_result_reaper_frequency=10
max_check_result_reaper_time=20

Regards,
HongRui Wang
Mail: wwanghong...@cebbank.com
2010-06-30

From: Andreas Ericsson
Sent: 2010-06-29 20:24:12
To: wwanghongrui; Nagios Users List
Cc: shadih rahman
Subject: Re: [Nagios-users] how to fix excessive latency

On 06/29/2010 03:57 AM, wwanghongrui wrote:
> Thanks for your reply.
> We are writing to a MySQL database through NDOUtils. We don't use NSCA.
> We have not set external_command_buffer_slots.
> status_update_interval=15
> I use vmstate to capture system performance
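With a main configuration file this long, it can help to pull out only the directives that came up in the thread. The following Python sketch is an illustration under stated assumptions: it treats nagios.cfg as plain name=value lines with # comments, and the names parse_main_config, tuning_report, and TUNING_DIRECTIVES are hypothetical helpers, not part of Nagios. The sample text is an excerpt of the configuration posted above.

```python
# Directives discussed in this thread; the selection is an editorial choice.
TUNING_DIRECTIVES = [
    "status_update_interval",
    "external_command_buffer_slots",
    "use_large_installation_tweaks",
    "child_processes_fork_twice",
    "enable_environment_macros",
    "check_result_reaper_frequency",
    "max_check_result_reaper_time",
    "max_concurrent_checks",
    "debug_level",
]

# Excerpt of the configuration posted in the thread.
SAMPLE_CONFIG = """\
# main configuration excerpt
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/services.cfg
status_update_interval=15
max_concurrent_checks=2000
use_large_installation_tweaks=1
child_processes_fork_twice=0
enable_environment_macros=0
debug_level=-1
check_result_reaper_frequency=10
max_check_result_reaper_time=20
"""


def parse_main_config(text):
    """Parse 'name=value' lines into {name: [values]}, keeping repeats."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, since values may contain '='.
        name, value = line.split("=", 1)
        directives.setdefault(name.strip(), []).append(value.strip())
    return directives


def tuning_report(directives):
    """Return (name, value) pairs for the directives of interest."""
    return [(name, directives.get(name, ["(not set)"])[-1])
            for name in TUNING_DIRECTIVES]


if __name__ == "__main__":
    for name, value in tuning_report(parse_main_config(SAMPLE_CONFIG)):
        print(f"{name:32} {value}")
```

Directives that are absent from the file, such as external_command_buffer_slots here, are reported as "(not set)" rather than guessed.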
Re: [Nagios-users] how to fix excessive latency
There is definitely something not right here. We have about 1 checks and the performance is a lot better. Anyhow, we are using the following values:

check_result_reaper_frequency=10
max_check_result_reaper_time=20

You should enable debug mode and check the debug logs.

Are you writing to any backend database?
Are you using NSCA to transfer service information to a remote location?
What is the value of your status_update_interval?
What is your external_command_buffer_slots?

2010/6/28 wwanghongrui wwanghong...@cebbank.com
> Hi guys,
>
> Our Nagios server environment:
> Nagios 3.2.0 + SUSE 10 SP2 x86_64 + 8 GB mem + 4 x ( Xeon(R) CPU E7420 @ 2.13GHz )
>
> We have 500+ actively checked hosts and 3k+ actively checked services.
>
> I have adjusted some performance parameters in nagios.cfg, as shown below:
>
> use_large_installation_tweaks=1
> child_processes_fork_twice=0
> enable_environment_macros=0
> check_result_reaper_frequency=5
> max_check_result_reaper_time=30
>
> But Nagios performance is still bad, as shown below:
>
> Services Actively Checked:
>
> Time Frame            Services Checked
> <= 1 minute:          271 (9.4%)
> <= 5 minutes:         1749 (60.4%)
> <= 15 minutes:        2824 (97.4%)
> <= 1 hour:            2898 (100.0%)
> Since program start:  2869 (99.0%)
>
> Metric                Min.       Max.        Average
> Check Execution Time: 0.09 sec   32.23 sec   1.113 sec
> Check Latency:        1.12 sec   212.59 sec  116.329 sec
> Percent State Change: 0.00%      23.88%      0.05%
>
> Hosts Actively Checked:
>
> Time Frame            Hosts Checked
> <= 1 minute:          32 (5.5%)
> <= 5 minutes:         419 (71.5%)
> <= 15 minutes:        586 (100.0%)
> <= 1 hour:            586 (100.0%)
> Since program start:  586 (100.0%)
>
> Metric                Min.       Max.        Average
> Check Execution Time: 0.08 sec   4.29 sec    3.035 sec
> Check Latency:        0.00 sec   135.25 sec  116.420 sec
> Percent State Change: 0.00%      11.32%      0.09%
>
> How can I find which service checks or host checks are causing this
> severe check latency?
>
> Regards
> HongRui Wang
> mail: wwanghong...@cebbank.com
> 2010-06-28

--
Cordially,
Shadhin Rahman
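One way to approach the question of which checks carry the most latency is to rank the entries in the status file by their check_latency field. The Python sketch below is an illustration only. It assumes the Nagios 3 status data layout of "hoststatus {" and "servicestatus {" blocks containing name=value lines. The host and service names in the sample are made up, the figures are borrowed from the statistics quoted above, and parse_status_blocks and slowest_checks are hypothetical helper names.

```python
# Made-up status data in the assumed Nagios 3 block layout.
SAMPLE_STATUS = """\
info {
    version=3.2.0
    }

hoststatus {
    host_name=web01
    check_execution_time=3.035
    check_latency=116.420
    }

servicestatus {
    host_name=web01
    service_description=HTTP
    check_execution_time=32.230
    check_latency=1.120
    }

servicestatus {
    host_name=db01
    service_description=MySQL
    check_execution_time=0.090
    check_latency=212.590
    }
"""


def parse_status_blocks(text):
    """Return one dict per 'type { ... }' block, with the type in '_type'."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            if line.endswith("{"):
                current = {"_type": line[:-1].strip()}
        elif line == "}":
            blocks.append(current)
            current = None
        elif "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return blocks


def slowest_checks(blocks, field="check_latency", top=10):
    """Return up to `top` (value, name) pairs, largest value first."""
    rows = []
    for block in blocks:
        if block["_type"] not in ("hoststatus", "servicestatus"):
            continue
        if field not in block:
            continue
        name = block.get("host_name", "?")
        if "service_description" in block:
            name += " / " + block["service_description"]
        rows.append((float(block[field]), name))
    return sorted(rows, reverse=True)[:top]


if __name__ == "__main__":
    for value, name in slowest_checks(parse_status_blocks(SAMPLE_STATUS)):
        print(f"{value:10.3f}  {name}")
```

Passing field="check_execution_time" ranks the same entries by how long the plugin itself ran, which helps separate slow plugins from scheduling delay.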
Re: [Nagios-users] how to fix excessive latency
 r  b swpd   free   buff   cache si so   bi   bo   in   cs us sy id wa st
[...]                                         0  853 2947  7 22 70  0  0
 4  0  160 187860 289248 6040148  0  0    0    0  564 2748 12 25 64  0  0
 4  0  160 202880 289248 6040148  0  0    0    0  432 2336  5 22 73  0  0
 5  0  160 189956 289248 6040148  0  0    0  416  824 2762  7 24 69  1  0
 2  0  160 195912 289248 6041176  0  0   52 1224  789 2332  5 15 78  2  0
 1  0  160 205060 289248 6041176  0  0    0    8  343 1718  2  8 90  0  0
 1  0  160 205076 289248 6041176  0  0    0    0  320 1177  0  6 93  0  0
 1  0  160 213844 289248 6041176  0  0    0    0  315 1100  0  7 92  0  0
 1  0  160 226900 289248 6041176  0  0    0    0  305 1210  0  8 92  0  0
 2  0  160 227188 289248 6041176  0  0    0  956  556  901  0  4 92  3  0
 1  0  160 228924 289248 6041176  0  0    0    0  294 1034  1  6 93  0  0
 1  0  160 229740 289248 6041176  0  0    0    0  292 1235  1  6 93  0  0
 1  0  160 230228 289248 6041176  0  0    0    0  287 1696  1  6 93  0  0
 3  1  160 230456 289248 6041176  0  0    0  128  288 1307  1  6 93  0  0
 1  1  160 228756 289248 6042204  0  0 3052 4944  921 1673  5  7 84  4  0
 1  1  160 229004 289248 6042204  0  0    0 1676 1061 1122  1  6 87  6  0
 1  1  160 229004 289248 6042204  0  0    0 1672 1081 1093  0  6 87  6  0
 1  1  160 230788 289248 6042204  0  0    0 1856 1171 1198  1  6 87  6  0

Regards
HongRui Wang
Mail: wwanghong...@cebbank.com
2010-06-29

From: shadih rahman
Sent: 2010-06-29 00:57:24
To: wwanghongrui; Nagios Users List
Cc:
Subject: Re: [Nagios-users] how to fix excessive latency

> There is something definitely not right here. We have about 1 checks
> and the performance is lot better. Anyhow we are using the following
> values
>
> check_result_reaper_frequency=10
> max_check_result_reaper_time=20
>
> You should enabled debug mode and check the debug logs.
> Are you writing to any backend database?
> Are you using nsca to transfer service information to remote location.
> what is the value of your status_update_interval?
> what is your external_command_buffer_slots?
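To judge from vmstat figures whether the system itself is the bottleneck, the idle and I/O wait columns can be averaged over the samples. The Python sketch below is an illustration under assumptions: it expects the 17-column Linux vmstat layout (r b swpd free buff cache si so bi bo in cs us sy id wa st), and the sample rows are four of the rows posted in this message, re-split on the assumption that the fused figures follow that layout. The names parse_vmstat and summarize are hypothetical.

```python
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

# Four rows taken from the thread, re-split into the assumed 17 columns.
SAMPLE_VMSTAT = """\
 r  b swpd   free   buff   cache si so   bi   bo   in   cs us sy id wa st
 4  0  160 187860 289248 6040148  0  0    0    0  564 2748 12 25 64  0  0
 2  0  160 195912 289248 6041176  0  0   52 1224  789 2332  5 15 78  2  0
 1  1  160 228756 289248 6042204  0  0 3052 4944  921 1673  5  7 84  4  0
 1  1  160 229004 289248 6042204  0  0    0 1676 1061 1122  1  6 87  6  0
"""


def parse_vmstat(text):
    """Return one dict per complete numeric row; other lines are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(VMSTAT_COLUMNS):
            continue  # header banner or truncated row
        if not all(field.isdigit() for field in fields):
            continue  # column-name line
        samples.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return samples


def summarize(samples):
    """Average idle and I/O wait percentages and the peak blocks written."""
    if not samples:
        raise ValueError("no complete vmstat rows found")
    count = len(samples)
    return {
        "samples": count,
        "avg_idle": sum(s["id"] for s in samples) / count,
        "avg_iowait": sum(s["wa"] for s in samples) / count,
        "max_blocks_out": max(s["bo"] for s in samples),
    }


if __name__ == "__main__":
    print(summarize(parse_vmstat(SAMPLE_VMSTAT)))
```

A mostly idle CPU with low I/O wait, as in these rows, is consistent with the poster's impression that the latency does not come from an overloaded system, although a short sample cannot rule it out.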