>-----Original Message-----
>From: Steve Shipway [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, January 22, 2008 8:45 PM
>To: Frost, Mark {PBG}; Nagios Users
>Subject: RE: [Nagios-users] Problem with high latencies after
>going distributed
>
>> As I'd mentioned in a previous message, I'm in the process of converting
>> from a centralized Nagios 2.10 setup all running on a single host to a
>> distributed setup running on at least 3 hosts (3 to start anyway). The
>> centralized setup has 572 hosts and 2900 services 99.9% of which are
>> active checks.
>...
>> Active Service Latency: 0.000 / 7267.198 / 4241.019 sec
>
>This isn't much help, but...
>
>We've just done exactly the same (Nagios 2.9), and we have a comparable
>size of system (actually a bit larger - 713 hosts, 5834 services).
>After going distributed, we too have this insanely high latency on the
>satellites.
>
>The only possible cause is the OCSP command slowing things down somehow.
>This is using the supplied send_nsca call to send the status off to the
>central server...
>
>define command {
>    command_name relay
>    command_line $USER1$/submit_check_result "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATEID$" "$SERVICEOUTPUT$"
>}
>
>So it should work. I guess things would be better if it packaged the
>updates up into batches, although it cant do that normally.
>
>I think it might be better to make the OCSP command just dump the status
>to a file, and then have a cronjob every 60 seconds that reads the file
>and sends the statuses off as a batch. I will try this here, when I get
>the chance.
>
>Steve
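
The spool-and-batch idea in the quoted message could look roughly like the sketch below. It is untested against a real Nagios install and rests on several assumptions: the spool path and the send_nsca path are hypothetical, the function names are made up for illustration, and it relies on send_nsca reading tab-separated "host, service, return code, output" lines on stdin (its default input format). The OCSP command would call spool_result() and the cron job would call drain_spool() followed by send_batch().

```python
import os
import subprocess

# Hypothetical locations; adjust to the local install.
SPOOL_PATH = "/var/spool/nagios/ocsp.spool"
SEND_NSCA = "/usr/local/nagios/bin/send_nsca"


def spool_result(spool_path, host, service, state_id, output):
    """Append one service result to the spool file as a tab-separated line."""
    # Tabs or newlines inside the plugin output would break the record format.
    clean = output.replace("\t", " ").replace("\n", " ")
    line = f"{host}\t{service}\t{state_id}\t{clean}\n"
    # O_APPEND so that concurrent writers each add to the end of the file.
    fd = os.open(spool_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o640)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)


def drain_spool(spool_path):
    """Move the spool aside and return everything in it as one batch (bytes).

    Limitation: a writer that opened the file just before the rename can
    still write after we read, and that result is lost. The batch is also
    deleted before it is sent, so a failed send loses the batch.
    """
    batch_path = spool_path + ".batch"
    try:
        os.rename(spool_path, batch_path)  # new results go to a fresh file
    except FileNotFoundError:
        return b""  # nothing was spooled since the last run
    with open(batch_path, "rb") as fh:
        data = fh.read()
    os.remove(batch_path)
    return data


def send_batch(data, central_host, nsca_cfg, send_nsca=SEND_NSCA):
    """Pipe the whole batch into a single send_nsca invocation."""
    if not data:
        return
    subprocess.run(
        [send_nsca, "-H", central_host, "-c", nsca_cfg],
        input=data,
        check=True,
    )
```

The point of the design is that the OCSP command only does a short local file append, while the network round trip to the central server happens once per cron run instead of once per check result.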
But if submit_check_result is running slowly, wouldn't that only affect the service execution time? My understanding of check latency is that it's the difference between the time Nagios schedules a check to run and the time the check actually starts to execute. But maybe I'm misunderstanding something here. When it comes to working with Nagios, I tend to learn the most when I have the biggest problems :-).

Do you do the same thing I mentioned, where you define all the checks on both distributed nodes but disable complementary halves of those checks on each node?

Thanks

Mark
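
To make the distinction above concrete, here is a small sketch of the two quantities as described in this message. The class and field names are illustrative only and are not Nagios internals.

```python
from dataclasses import dataclass


@dataclass
class CheckRun:
    scheduled: float  # when the check was due to run (epoch seconds)
    started: float    # when the check actually began executing
    finished: float   # when the check returned its result

    @property
    def latency(self) -> float:
        """Delay between the scheduled time and the actual start."""
        return self.started - self.scheduled

    @property
    def execution_time(self) -> float:
        """How long the check itself took to run."""
        return self.finished - self.started


# A check that starts almost on time but takes 30 seconds to run:
# high execution time, low latency.
slow_check = CheckRun(scheduled=100.0, started=100.5, finished=130.5)
print(slow_check.latency, slow_check.execution_time)
```

Under this definition a slow command only raises latency if it delays the start of other checks, which is the open question in this thread.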