[Nagios-users] scheduled downtime Retention problem
Hello,

I'm having a problem with state retention on this setup:

  nagios-3.4.1-3
  Debian Wheezy

Problem: I schedule some downtime for either a host or a service, but it is gone after a reload (or a restart) of the nagios3 service.

Steps:
- schedule some downtime (for one month, if this helps)
- wait for the icon to appear in the web interface
- restart (or reload) the nagios3 service
- the scheduled downtime is gone

Some more information: I checked the retention.dat file. It contains, at the end, this block, which seems completely correct:

  servicedowntime {
  host_name=HOSTNAME
  service_description=SERVICE
  downtime_id=1
  entry_time=1369832618
  start_time=1369832609
  end_time=1372518209
  triggered_by=0
  fixed=1
  duration=2685600
  is_in_effect=1
  author=USER
  comment=MY COMMENT
  }

There's also a servicecomment block containing a Nagios comment; THIS block is shown and taken into account. I tried to see whether Nagios encounters an error while reading its retention file, but nothing showed up…

  servicecomment {
  host_name=HOSTNAME
  service_description=SERVICE
  entry_type=2
  comment_id=4
  source=0
  persistent=0
  entry_time=1369833542
  expires=0
  expire_time=0
  author=(Nagios Process)
  comment_data=This service has been scheduled for fixed downtime from 2013-05-29 15:03:29 to 2013-06-29 17:03:29. Notifications for the service will not be sent out during that time period.
  }

Retention configuration:

  grep retain nagios.cfg | grep -v '#'
  retain_state_information=1
  use_retained_program_state=1
  use_retained_scheduling_info=1
  retained_host_attribute_mask=0
  retained_service_attribute_mask=0
  retained_process_host_attribute_mask=0
  retained_process_service_attribute_mask=0
  retained_contact_host_attribute_mask=0
  retained_contact_service_attribute_mask=0

One more detail: acknowledgements work as expected…

Any idea? Thanks a lot for your help!

Cheers,
C.

--
Introducing AppDynamics Lite, a free troubleshooting tool for Java/.NET
Get 100% visibility into your production application - at no cost.
Code-level diagnostics for performance bottlenecks with 2% overhead
Download for free and get started troubleshooting in minutes.
http://p.sf.net/sfu/appdyn_d2d_ap1
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
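One way to double-check what Nagios wrote to the retention file is to list every downtime block and confirm that end_time minus start_time matches the stored duration. What follows is only a minimal sketch and not part of the original report: the function name list_downtimes is hypothetical, and it assumes the usual retention.dat layout of one key=value entry per line between "servicedowntime {" (or "hostdowntime {") and "}".

```shell
#!/bin/sh
# Hypothetical helper: for each hostdowntime/servicedowntime block in a
# retention file, print "host|service|start_time|end_time|check", where check
# is "ok" when end_time - start_time equals the stored duration.
list_downtimes() {
    awk '
        /^[ \t]*(hostdowntime|servicedowntime)[ \t]*\{/ {
            inblock = 1
            split("", f)            # reset the per-block field table
            next
        }
        inblock && /^[ \t]*\}/ {
            check = ((f["end_time"] - f["start_time"]) == (f["duration"] + 0)) ? "ok" : "mismatch"
            printf "%s|%s|%s|%s|%s\n", f["host_name"], f["service_description"], f["start_time"], f["end_time"], check
            inblock = 0
            next
        }
        inblock {
            line = $0
            sub(/^[ \t]+/, "", line)
            eq = index(line, "=")   # split on the first "=" only
            if (eq > 0) f[substr(line, 1, eq - 1)] = substr(line, eq + 1)
        }
    ' "$1"
}
```

It could be called as list_downtimes /var/lib/nagios3/retention.dat (that path is an assumption for a Debian install). For the block quoted above, 1372518209 - 1369832609 = 2685600, which matches the stored duration, so the entry in the file itself looks consistent.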
Re: [Nagios-users] scheduled downtime Retention problem
Me again,

Of course, *after* having searched for a while and then sent the mail, I stumbled on the official Nagios bug:
http://tracker.nagios.org/view.php?id=338

I will check on the Debian tracker whether they can integrate this patch…

Cheers,
C.

On Wed, 29 May 2013 15:23:18 +0200 Cedric Jeanneret nagios...@tengu.ch wrote:
[...]
[Nagios-users] nagios.cmd does not seem to work properly
Hello,

We'd like to use the nagios.cmd pipe to send some commands. According to the examples in the documentation, we can do something as simple as this:

  /bin/printf "[%lu] ACKNOWLEDGE_HOST_PROBLEM;host1;1;1;1;Some One;Some Acknowledgement Comment\n" $now > $commandfile

The problem is that it does not seem to be taken into account: nothing is logged in nagios.log, no message is printed in the shell, and moreover Nagios doesn't seem to do anything. I tested with an error inside the command, and it does seem to be parsed:

  [1273156151] Warning: Unrecognized external command - SCHEDULE_HOST_SVC_CHECKs;fqdn;1273156151

but nothing else happens. Is there a way to be sure that Nagios does what we ask?

The main points are:
- we're using the NSCA plugin, so we have about 80 servers, each with its own Nagios, sending results and status to a single host
- we tried to run this command only on the remote hosts
- maybe we have to run it on the NSCA server?

Any help is welcome. Thank you in advance!

Best regards,
C.

--
Cédric Jeanneret | System Administrator
021 619 10 32 | Camptocamp SA
cedric.jeanne...@camptocamp.com | PSE-A / EPFL
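To make it easier to see exactly what is sent to the pipe, the command line can be built separately from the write itself. This is a minimal sketch under stated assumptions, not the documented Nagios example: the function names format_ext_cmd and submit_ext_cmd are hypothetical, and the path to nagios.cmd is site-specific and has to be supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helper: build one external command line in the
# "[timestamp] COMMAND;arg;arg..." form used in the message above.
# $1 = epoch timestamp, $2 = command name with its semicolon-separated arguments
format_ext_cmd() {
    printf '[%s] %s\n' "$1" "$2"
}

# Hypothetical helper: write the line to the command pipe, refusing to write
# when the target is not a named pipe (for example a wrong path).
# $1 = path to nagios.cmd (an assumption, site-specific), $2 = command string
submit_ext_cmd() {
    if [ ! -p "$1" ]; then
        echo "not a named pipe: $1" >&2
        return 1
    fi
    format_ext_cmd "$(date +%s)" "$2" > "$1"
}
```

Printing the result of format_ext_cmd first shows whether the quoting survived the shell; a misspelled command name such as SCHEDULE_HOST_SVC_CHECKs is still only detected by Nagios itself, as the warning above shows.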
Re: [Nagios-users] NSCA strange behaviour
Hello,

Problem solved: it was indeed a version problem (minor number). Red Hat released a new nsca client/server package that removed a patch, which made it incompatible with the previous version.

See:
http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg19402.html
and the related patch:
http://cvs.fedoraproject.org/viewvc/EL-5/nsca/nsca-increase_max_plugin_output_length.patch?revision=1.1&view=markup&sortby=log

Best regards,
C.

On Tue, 8 Dec 2009 10:52:50 -0600 Marc Powell m...@ena.com wrote:
[...]
Re: [Nagios-users] NSCA strange behaviour
Hello again,

I've made some other tests:
- I changed the encryption algorithm on server01 and client22, restarted nagios and nsca on server01, and restarted nagios on client22.
- I let it run like that, checking the logs. As I expected, nsca began to output a lot of errors regarding client version and/or encryption method and/or password. Nice.
- I put back the right encryption method on server01 but NOT on client22. I should have seen a message like "Received invalid packet", but nothing appeared. I forced some status updates from client22:

  for i in $(seq 1000); do echo -n 'host '; /usr/local/bin/submit_ochp $(hostname -f) UP 'Host is up'; sleep 2; done

  Nothing. tcpdump shows me traffic, in and out, for both hosts...
- Last test: I stopped nagios and nsca on server01, removed the nsca.dump, objects.cache and retention.dat files, and started nsca and nagios again. Leaving aside the fact that all my hosts are now pending, client22 still doesn't push any status... and there is nothing in the nsca logs.

Any other idea? It's as if the nsca daemon silently ignores client22's submissions, and client22 doesn't know about it.

Best regards,
C.
[Nagios-users] NSCA strange behaviour
Hello,

I'm having trouble with NSCA.

What we have:
- about 47 passive hosts
- about 220 passive services

Versions: all are Red Hat servers, with:
- NSCA 2.7.2 (latest one)
- Nagios 3.1.2

We have a single Nagios aggregator, which collects all NSCA status from the other hosts.

What's happening: a host was reinstalled yesterday (say client22), and now the NSCA daemon on the aggregator (say server01) doesn't seem to collect its data.

What I've done:
- tcpdump on both client22 and server01; both show traffic between them on the NSCA default port (5667)
- checked the iptables rules; all is OK (tcpdump showing traffic confirms it)
- tried to push status by hand from client22 to server01; ALL packets are sent successfully ("1 data packet(s) sent to host successfully."). I did this with a loop like this:

  for i in $(seq 1000); do /usr/local/bin/submit_ochp $(hostname -f) UP 'Host is up'; sleep 2; done

- enabling debug for nsca on server01 doesn't show me anything interesting. I just don't see where nsca picks up the client22 status, and it keeps on saying:

  Warning: The results of host 'client22.domain.lt' are stale by 0d 0h 2m 0s (threshold=0d 0h 6m 0s). I'm forcing an immediate check of the host.

  On the other hand, it shows me:

  [1260267958.216051] [016.1] [pid=23191] Check results for service 'Cron service' on host 'client22.domain.lt' are fresh.

I really don't know where to look for a solution, nor where the real problem is. We have another network with about 200 passive hosts and over 350 passive services, and it works fine. The only differences are:
- the working network is Debian-only
- the working network's NSCA server does nothing other than act as the central Nagios server. server01 does some other things, like syslog server and collectd server... maybe there's a bottleneck there, but I can't be sure about that.

Does anyone have an idea? Thank you in advance.

Best regards,
C.
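The submit_ochp script used above is a local wrapper whose contents are not shown in the thread. As a rough sketch of what such a wrapper typically feeds to send_nsca, the snippet below builds the tab-separated passive host check line (host name, return code, plugin output). The function names host_state_code and format_host_result are hypothetical; the state codes 0/1/2 for UP/DOWN/UNREACHABLE are the standard Nagios host states.

```shell
#!/bin/sh
# Hypothetical helper: map a host state name to its numeric Nagios code.
host_state_code() {
    case "$1" in
        UP)          echo 0 ;;
        DOWN)        echo 1 ;;
        UNREACHABLE) echo 2 ;;
        *)           echo "unknown host state: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: build the line "<host><TAB><code><TAB><output>"
# that send_nsca reads on standard input for a passive host check.
# $1 = host name, $2 = state name, $3 = plugin output
format_host_result() {
    code=$(host_state_code "$2") || return 1
    printf '%s\t%s\t%s\n' "$1" "$code" "$3"
}
```

A possible use would be format_host_result "$(hostname -f)" UP 'Host is up' | send_nsca -H server01 -c /etc/nagios/send_nsca.cfg, where the config file path is an assumption. Note that send_nsca reporting "1 data packet(s) sent" only confirms the packet left the client, not that the daemon accepted it.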
Re: [Nagios-users] NSCA strange behaviour
Hello,

As far as I can see, only new ones, even if they were just reinstalled (client22 was in the Nagios config before it was reinstalled).

Best regards,
C.

On Tue, 8 Dec 2009 08:08:03 -0600 Greg Pangrazio pangr...@gmail.com wrote:
Do all of your clients fail, or just the new one?

On Tue, Dec 8, 2009 at 4:40 AM, Cedric Jeanneret cedric.jeanne...@camptocamp.com wrote:
[...]
Re: [Nagios-users] NSCA strange behaviour
Hello again,

In fact, configuration files are deployed via Puppet (http://reductivelabs.com/trac/puppet/wiki), so all files are (or should be) the same. I'll check, but as Puppet runs on every host, they should all have the same files. IP addresses are the same (fixed IP, fixed ports). I'll check the files once again.

If anyone has another idea... Thank you.

Best regards,
C.

On Tue, 8 Dec 2009 08:31:59 -0600 Greg Pangrazio pangr...@gmail.com wrote:
It sounds like there is something that changed with the re-install. Is the IP address of the system the same? Did you pick the same encryption type in the nsca config? Can you diff the nsca config with a working host?
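For the config comparison suggested here, a plain diff can be noisy when comments or line order differ between hosts. The following is only a sketch of one way to compare the effective settings of two copies of a config file; the function names effective_settings and same_settings are hypothetical, and nothing in it is specific to NSCA beyond the key=value style of its config files.

```shell
#!/bin/sh
# Hypothetical helper: print the effective lines of a config file, sorted,
# with comment lines and blank lines removed.
effective_settings() {
    grep -v '^[[:space:]]*#' "$1" | grep -v '^[[:space:]]*$' | sort
}

# Hypothetical helper: succeed only if both files carry the same effective settings.
same_settings() {
    a=$(effective_settings "$1")
    b=$(effective_settings "$2")
    [ "$a" = "$b" ]
}
```

For example, same_settings client22-send_nsca.cfg working-send_nsca.cfg || echo "configs differ" (both file names are placeholders for copies fetched from the two hosts). Identical settings do not rule out a mismatch between the installed nsca packages themselves, which is what this thread eventually traced the problem to.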
Re: [Nagios-users] NSCA strange behaviour
I just rsync-ed a complete config from a working host, then:

  for file in $(grep -lr working-client *); do sed -i 's/working-client/client22/g' "$file"; done

... and it doesn't work any better. Moreover, Puppet doesn't want to change anything... I'm stuck... :(

On Tue, 8 Dec 2009 08:31:59 -0600 Greg Pangrazio pangr...@gmail.com wrote:
[...]
Re: [Nagios-users] NSCA strange behaviour
Hello Marcel,

Well, the OpenSSL version is the same on server01 and client22... working-server has an earlier one, but it works with that. NSCA doesn't show me any encryption error (encryption method and passphrase are correct on both ends) :/

Regards,
C.

On Tue, 8 Dec 2009 14:34:43 -0200 Marcel mits...@gmail.com wrote:
check openssl versions and compatibility.
Re: [Nagios-users] NSCA strange behaviour
Hello Marc,

Indeed, the "are fresh" line comes from nagios.log. OK, done. NSCA is running in daemon mode, iptables is open for the nsca port, and connections can go through it (tcpdump shows them to me, in both directions).

Setting debug=1 in nsca.cfg doesn't seem to add anything to /var/log/messages (Red Hat server). I just see down hosts passing through (results for ... are stale - forcing immediate check...). I set up debug for Nagios itself, but it really seems to be a problem at the NSCA level.

Regards,
C.

On Tue, 8 Dec 2009 10:52:50 -0600 Marc Powell m...@ena.com wrote:
On Dec 8, 2009, at 4:40 AM, Cedric Jeanneret wrote:
- Enabling debug for nsca on server01 doesn't show me anything interesting. [...]

What do they show? What you've quoted here doesn't come from NSCA (that I can tell from a simple grep).

  nsca-2.7.2]# grep -r 'are fresh' *
  nsca-2.7.2]#

I'm sure that comes from nagios via nagios.log, not NSCA. Are you sure you're looking in the right place? It's typically in /var/log/messages. How are you running nsca? daemon mode or via inetd? If inetd, is inetd rejecting the connection?

--
Marc