[Nagios-users] Help with failover
I have two master servers attached to a shared NAS that holds the Nagios configs and status data (basically /etc and /var). When the primary goes down, an event handler mounts the NAS on the backup and starts Nagios. When I implemented this setup, I expected the backup to retain all the status data from the primary instance.

That is not the case. When I initiate a failover, the backup system, which mounts the same data the primary had before it was brought down, doesn't see any saved status data. All checks are in a pending state until the distributed servers send new check data via NSCA. The other confusing thing is that if I fail back over to the primary, it sees the old saved status data.

I'm not sure what I'm missing here. Can anyone enlighten me?

Thanks in advance,
Michael

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
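One thing that may be worth comparing on both nodes is the state retention block in nagios.cfg. This is only a sketch and not a diagnosis of the problem above. As far as I understand, Nagios restores saved state at startup from state_retention_file (when retain_state_information=1) and recreates the status file from scratch, and with retention_update_interval=0 the retention file is written only on a clean shutdown or restart. The path below is a hypothetical placeholder for the NAS mount, and the 5-minute interval is an assumed value.

```cfg
# nagios.cfg (excerpt); the path is a hypothetical placeholder for the NAS mount
retain_state_information=1
state_retention_file=/path/on/nas/var/retention.dat
use_retained_program_state=1
# Assumed value: save retention data every 5 minutes,
# not only when Nagios shuts down or restarts
retention_update_interval=5
```

If the backup reads a different state_retention_file than the primary writes, or the file is stale at failover time, the backup would start with every check pending.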
[Nagios-users] Issues with Nagios 2.4?
Nagios seems to have been very unstable the past couple of weeks. The only change I've made is upgrading from 2.2 to 2.4. It could just be that I have some bad configuration options, but I'm not sure.

I just had a server go down for an hour, and Nagios never caught it. In general, since upgrading to 2.0, Nagios seems very slow at catching broken services/hosts, but it usually checks them (not always). I look at my status overview right now, and Nagios says Last Check for almost every service is two days old.

Any ideas on what I'm doing wrong?

Here's my Nagios log without comments or blank lines

log_file=/nagios/services/nagios/var/nagios.log
cfg_file=/nagios/services/nagios/etc/checkcommands.cfg
cfg_file=/nagios/services/nagios/etc/contact-templates.cfg
cfg_file=/nagios/services/nagios/etc/contactgroups.cfg
cfg_file=/nagios/services/nagios/etc/contacts.cfg
cfg_file=/nagios/services/nagios/etc/escalations.cfg
cfg_file=/nagios/services/nagios/etc/host-templates.cfg
cfg_file=/nagios/services/nagios/etc/service-templates.cfg
cfg_file=/nagios/services/nagios/etc/misccommands.cfg
cfg_file=/nagios/services/nagios/etc/time_periods.cfg
cfg_file=/nagios/services/nagios/etc/nagios-commands.cfg
cfg_file=/nagios/services/nagios/etc/nagios-hostgroups.cfg
cfg_file=/nagios/services/nagios/etc/nagios-hosts.cfg
cfg_file=/nagios/services/nagios/etc/nagios-service-templates.cfg
cfg_file=/nagios/services/nagios/etc/nagios-services.cfg
object_cache_file=/nagios/services/nagios/var/objects.cache
resource_file=/nagios/services/nagios/etc/resource.cfg
temp_file=/nagios/services/nagios/var/nagios.tmp
status_file=/nagios/services/nagios/var/status.dat
aggregate_status_updates=1
status_update_interval=15
nagios_user=nagios
nagios_group=nagios
enable_notifications=1
execute_service_checks=1
accept_passive_service_checks=0
execute_host_checks=1
accept_passive_host_checks=0
enable_event_handlers=1
log_rotation_method=d
log_archive_path=/nagios/services/nagios/var/archives
check_external_commands=1
command_check_interval=60s
command_file=/nagios/services/nagios/var/rw/nagios.cmd
downtime_file=/nagios/services/nagios/var/downtime.dat
comment_file=/nagios/services/nagios/var/comments.dat
lock_file=/nagios/services/nagios/var/nagios.lock
retain_state_information=1
state_retention_file=/nagios/services/nagios/var/retention.dat
use_retained_scheduling_info=1
retention_update_interval=0
use_retained_program_state=1
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=0
sleep_time=0.25
service_inter_check_delay_method=n
max_service_check_spread=5
service_interleave_factor=s
max_concurrent_checks=300
service_reaper_frequency=40
host_inter_check_delay_method=n
max_host_check_spread=5
interval_length=60
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=30
use_agressive_host_checking=0
enable_flap_detection=0
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
soft_state_dependencies=0
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
obsess_over_services=0
process_performance_data=0
check_for_orphaned_services=0
check_service_freshness=0
freshness_check_interval=60
check_host_freshness=0
host_freshness_check_interval=60
date_format=us
illegal_object_name_chars=`~!$%^*|'?,()'=
illegal_macro_output_chars=`~$|'
use_regexp_matching=0
use_true_regexp_matching=0
[EMAIL PROTECTED]
[EMAIL PROTECTED]
Re: [Nagios-users] Issues with Nagios 2.4?
Hugo,

My apologies. I meant "Here's my Nagios configuration without comments or blank lines." There is nothing useful in the Nagios logs.

As to your other questions: yes, Nagios is running, and there is only one Nagios daemon; that is the only Nagios process I see.

> On Mon, 3 Jul 2006, Michael T. Halligan wrote:
>
> > Nagios seems to have been very unstable the past couple of weeks. The only change I've made is upgrading from 2.2 to 2.4. It could just be that I have some bad configuration options, but I'm not sure.
> >
> > I just had a server go down for an hour, and Nagios never caught it. In general, since upgrading to 2.0, Nagios seems very slow at catching broken services/hosts, but it usually checks them (not always). I look at my status overview right now, and Nagios says Last Check for almost every service is two days old.
> >
> > Any ideas on what I'm doing wrong?
> >
> > Here's my Nagios log without comments or blank lines
>
> Unfortunately there is no log. That happened to be a config file. Go over the logs and start looking for odd things.
>
> First things first: is Nagios actually running? Then check that there is just one Nagios daemon.
> <voiceover>There can be only one!</voiceover>
>
> Hugo.
>
> --
> I hate duplicates. Just reply to the relevant mailinglist.
> [EMAIL PROTECTED] http://hvdkooij.xs4all.nl/
> Don't meddle in the affairs of magicians, for they are subtle and quick to anger.

---
BitPusher, LLC
http://www.bitpusher.com/
1.888.9PUSHER
(415) 724.7998 - Mobile
Re: [Nagios-users] Issues with Nagios 2.4?
> > Here's my Nagios log without comments or blank lines
> > <snip>
> > service_inter_check_delay_method=n
> > <snip>
> > host_inter_check_delay_method=n
>
> Try using a setting of 's' for these two directives.
>
> "Using no delay is generally not recommended." -- Nagios Documentation

These are some parameters I've been fiddling with. There is no difference when I use =s versus =n; it's consistently unreliable no matter what I try.

> Next, look at your scheduling queue for clues (note that host checks will not be 'scheduled'; only service checks get scheduled).

Well, I do notice that most of my scheduled checks are in the past, and I have to assume that they're just not getting run.

> Finally, if there are a very large number of hosts in a down/unreachable state, they can affect your service check performance severely.

---
BitPusher, LLC
http://www.bitpusher.com/
1.888.9PUSHER
(415) 724.7998 - Mobile
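For reference, the change suggested in the quoted advice would look roughly like this in nagios.cfg. It is a minimal sketch of just the two directives; the poster reports that switching between =n and =s made no difference on its own.

```cfg
# nagios.cfg (excerpt): "smart" delay calculation instead of no delay
service_inter_check_delay_method=s
host_inter_check_delay_method=s
```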
Re: [Nagios-users] Issues with Nagios 2.4?
A combination of tweaks seems to have fixed this. Lowering service_reaper_frequency combined with turning on smart interleaving seems to make Nagios quite a bit better at catching problems.

---
BitPusher, LLC
http://www.bitpusher.com/
1.888.9PUSHER
(415) 724.7998 - Mobile
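A sketch of what that combination might look like in nagios.cfg follows. The thread does not say which value service_reaper_frequency was lowered to; the 10 seconds below is an assumed value for illustration, down from the 40 in the configuration posted earlier.

```cfg
# nagios.cfg (excerpt)
# "s" selects smart interleaving of service checks
service_interleave_factor=s
# Assumed value (seconds between reaper runs); the posted config had 40
service_reaper_frequency=10
```

A shorter reaper interval means finished check results are processed sooner, which fits the symptom of checks that appeared to run late or not at all.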
[Nagios-users] Alternative web interfaces for limiting web access?
We monitor N different customer infrastructures with Nagios. Some of our customers are starting to request access to the web interface. Short of writing our own custom interface, are there any good projects out there that let users with specified access see only certain objects?

Michael T. Halligan
BitPusher, LLC
http://www.bitpusher.com/
Re: [Nagios-users] false Host UP notifications
Hugo,

Thanks, this did the trick!

Michael

Michael T. Halligan
BitPusher, LLC
http://www.bitpusher.com/

On Dec 11, 2005, at 7:17 AM, Hugo van der Kooij wrote:

> On Sat, 10 Dec 2005, Michael T. Halligan wrote:
>
> > Nothing out of the ordinary in the logs really. Just a bunch of host up hard messages, but without any corresponding host down messages. Does this have something to do with freshness testing, maybe?
>
> Disable it if you run active checks on your hosts.
>
> Hugo.
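Applied to the host template posted elsewhere in this thread, the fix would presumably amount to turning off check_freshness while leaving active checks on. The excerpt below is a sketch under that assumption and shows only the relevant directives; the other directives in the template stay as they were.

```cfg
# Host template (excerpt)
define host {
    name                    HOSTTEMPLATE
    active_checks_enabled   1
    check_freshness         0    ; was 1 in the posted template
    register                0
}
```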
Re: [Nagios-users] false Host UP notifications
Hugo,

# Host Definition
define host {
    host_name                       HOST
    alias                           HOST
    address                         IPADDR
    use                             HOSTTEMPLATE
    contact_groups                  noc
}

# Host Template
# host_templates HOSTTEMPLATE
define host {
    name                            HOSTTEMPLATE
    process_perf_data               1
    retain_status_information       1
    flap_detection_enabled          0
    retain_nonstatus_information    0
    active_checks_enabled           1
    passive_checks_enabled          0
    check_period                    24x7
    obsess_over_host                1
    check_freshness                 1
    check_command                   check-host-alive
    max_check_attempts              3
    event_handler_enabled           0
    notifications_enabled           1
    notification_interval           120
    notification_period             24x7
    notification_options            d,u,r
    contact_groups                  noc
    register                        0
}

On Dec 9, 2005, at 9:54 PM, Hugo van der Kooij wrote:

> On Fri, 9 Dec 2005, Michael T. Halligan wrote:
>
> > This problem seemed to go away when I switched from 2.0b4 to 2.0b6, but it's rearing its head again. I'm wondering if this is some type of flapping issue. I've tried turning aggressive host service checking on and off. As far as I can recall, it's only happening on host checks, not service checks.
>
> How is your host defined exactly? What do you see reported in the log file?
>
> Hugo.
Re: [Nagios-users] false Host UP notifications
Hugo,

Nothing out of the ordinary in the logs, really. Just a bunch of host up hard messages, but without any corresponding host down messages. Does this have something to do with freshness testing, maybe?

Hmm, I just realized that my subject line is a little misleading. By calling these false host-up notifications, I don't mean that the host is down. The host has been up for a couple of weeks, yet Nagios keeps reminding me that it is up.

On Dec 10, 2005, at 2:37 PM, Hugo van der Kooij wrote:

> On Sat, 10 Dec 2005, Michael T. Halligan wrote:
>
> > # Host Definition
> > # host_templates HOSTTEMPLATE
>
> Looks ok. But what do the logs tell you? And what if you move options from the template to the host?
>
> Hugo.