Hi,

Marc Powell wrote:
> On Jan 26, 2009, at 4:46 PM, Mathieu Gagné wrote:
>
>> Unfortunately, we can't afford this kind of downtime while
>> Nagios/NDOutils is busy exporting to MySQL. Also, host/service status
>> are not available while the reload is occurring.
>
> While I can't really speak to the more general problem (I experience
> it too), I did find the 'Retain status file over a reload' patch at
> http://altinity.blogs.com/dotorg/2007/09/nagios-patch-da.html
> to be extremely helpful for this particular symptom. Reloads are
> completely transparent to GUI users now.

I never had this issue when sending the external command
"RESTART_PROGRAM" to Nagios. Host/service status is still available to
me. My understanding is that "RESTART_PROGRAM" is the equivalent of
issuing a "reload" or sending a HUP signal.

Unfortunately, during that time, checks are still not performed, as
Nagios is busy brokering events to NDOutils and on to MySQL
synchronously.

I found this patch on the same page:

Do not resend retained status to NDO
http://svn.opsview.org/opsview/tags/nagios-patch-day/opsview-base/patches/nagios_stop_logging_retained_states_to_ndo.patch

The following questions come to mind:

- What could be the consequences of using such a patch?
- When will old entries in the table "nagios_hoststatus" be deleted?
  (Same question for "nagios_servicestatus".)
- What does aggregated_dump mean? (The patch changes it from FALSE to
  TRUE.)
  -> int update_host_status(host *hst,int aggregated_dump){
  -> int update_service_status(service *svc,int aggregated_dump){
- The comment says: "a check cycle needs to complete before NDO has all
  the status information". What does that mean? Which "check cycle"
  does it refer to?
- Does it have any link to "aggregate_status_updates" and
  "status_update_interval"? Does it mean status will be dumped later
  and not at startup? "status_update_interval" seconds after startup?

>> Is there a way to speed things up? Any help would be appreciated.
>> Thanks.
>
> I'm interested in tips as well.

I had to create several SQL indexes. At startup, the broker deletes old
entries and limits the deletion by "start_time" and/or
"scheduled_time". Unfortunately, there are no indexes on the tables
with those columns, which brings MySQL to its knees as soon as Nagios
restarts.

Disclaimer: I'm not a DBA and can't guarantee the efficiency of each of
them. I might also have missed some.

ALTER TABLE `nagios_servicechecks`
  ADD INDEX `start_time` (`instance_id`, `start_time`);
ALTER TABLE `nagios_timedevents`
  ADD INDEX `scheduled_time` (`instance_id`, `scheduled_time`);
ALTER TABLE `nagios_systemcommands`
  ADD INDEX `start_time` (`instance_id`, `start_time`);
ALTER TABLE `nagios_hostchecks`
  ADD INDEX `start_time` (`instance_id`, `start_time`);
ALTER TABLE `nagios_eventhandlers`
  ADD INDEX `start_time` (`instance_id`, `start_time`);
ALTER TABLE `nagios_instances`
  ADD INDEX `instance_name` (`instance_name`, `instance_id`);
ALTER TABLE `nagios_hosts`
  ADD INDEX `host_object_id` (`host_object_id`);

-- 
Mathieu
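
A rough way to sanity-check the idea behind the (instance_id, start_time)
indexes above is to compare query plans with and without such an index.
The sketch below is only an illustration and rests on several
assumptions: it uses Python's built-in sqlite3 module as a stand-in (not
MySQL, whose planner behaves differently), the table is a reduced,
hypothetical version of nagios_servicechecks, and the DELETE shape
(filter on instance_id plus a start_time cutoff) is guessed from the
description of the cleanup, not taken from the NDOutils source. On a
real server, running EXPLAIN against an equivalent SELECT would be the
more faithful check.

```python
import sqlite3


def plan_for_cleanup(with_index):
    """Return SQLite's query plan text for a cleanup-style DELETE."""
    conn = sqlite3.connect(":memory:")
    # Reduced, hypothetical stand-in for nagios_servicechecks.
    conn.execute(
        "CREATE TABLE nagios_servicechecks ("
        " servicecheck_id INTEGER PRIMARY KEY,"
        " instance_id INTEGER NOT NULL,"
        " start_time TEXT NOT NULL)"
    )
    if with_index:
        # Same column order as the ALTER TABLE statements above.
        conn.execute(
            "CREATE INDEX start_time ON nagios_servicechecks"
            " (instance_id, start_time)"
        )
    # Assumed shape of the startup cleanup: one instance, older than a cutoff.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "DELETE FROM nagios_servicechecks "
        "WHERE instance_id = ? AND start_time < ?",
        (1, "2009-01-26 00:00:00"),
    ).fetchall()
    conn.close()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


if __name__ == "__main__":
    print("without index:", plan_for_cleanup(False))
    print("with index:   ", plan_for_cleanup(True))
```

Without the index the plan should show a full table scan; with it, the
plan should name the start_time index. Whether MySQL makes the same
choice on the real tables depends on its own optimizer and on the data.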