Re: [Puppet Users] Nagios service not restarting when removing a host from the database
On Tue, Aug 06, 2013 at 06:20:09AM -0700, jcbollinger wrote:
>
> Because there is a difference between unmanaged resources and resources
> that are managed 'absent'. Only resources that are actually declared for
> the target node are managed, whether declared directly or via collection of
> virtual or exported resources. Declaring that unmanaged resources of a
> given type should be purged does not make the affected resources managed.

Your explanation just solved my problem. For the record using notify in
the resource purge doesn't work as attested in several google searches
(and confirmed here) with errors like:

    warning: /Nagios_service[check_ssh_test1]: Service[nagios] still depends on me -- not purging

However, provided you do the following:

1) name your services like "servicename_${::hostname}"
2) name your hosts as "$::fqdn"
3) Make sure every @@nagios_* resource has "ensure=>'present'"

then the following does work to both remove the resources from the
nagios_*.cfg files __and__ restart nagios:

Get resource ids:

    select id,title from resources where title like '%_' or title='yourfqdn';

Update present to absent for all service checks and the host check.

    update param_values set value='absent' where value='present' and (resource_id=1 or resource_id=2);

Since there may be a use case to turn off a single service instead of
blowing away all the service and host checks for a given node, I'll
probably populate a checklist form. Granted, that will require that the
host doesn't redefine it again (different problem).

Anyway, thanks so much for the unmanaged vs managed comment. That was
totally slipping by me while I pounded my head into my desk trying to
figure out why the files were changing but the service wasn't being
notified.

-- 
You received this message because you are subscribed to the Google Groups "Puppet Users" group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-users+unsubscr...@googlegroups.com.
To post to this group, send email to puppet-users@googlegroups.com. Visit this group at http://groups.google.com/group/puppet-users. For more options, visit https://groups.google.com/groups/opt_out.
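
For readers following along, the three conditions in the message above can be sketched as the exported resources below. This is only an illustration assembled from the manifests in the original post, with an explicit ensure added to the service check; it is not a tested configuration.

```puppet
# Sketch only: exported resources that satisfy conditions 1-3 above.
# Attribute values are copied from the original post in this thread.

# Condition 2: the host is titled with the node's fqdn.
@@nagios_host { $::fqdn:
  ensure  => 'present',            # condition 3: explicit ensure
  alias   => $::hostname,
  address => $::ipaddress,
  use     => 'linux-server',
}

# Condition 1: the service title ends in "_<hostname>".
@@nagios_service { "check_ssh_${::hostname}":
  ensure              => 'present',  # condition 3: explicit ensure
  check_command       => 'check_ssh',
  use                 => 'generic-service',
  host_name           => $::fqdn,
  service_description => 'SSH',
}
```

With an explicit ensure stored for every exported resource, the database update described above has a 'present' value to flip to 'absent', so the nagios server still collects (and therefore manages) the resources when it removes them.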
Re: [Puppet Users] Nagios service not restarting when removing a host from the database
On Monday, August 5, 2013 3:33:11 PM UTC-5, John Santana wrote:
>
> On Mon, Aug 05, 2013 at 04:22:41PM -0400, Gabriel Filion wrote:
> >
> > you need to export the resource with ensure => absent and run puppet on
> > the host, then on the nagios server so that everything runs fine.
>
> Dozens of VMs are routinely destroyed on a weekly basis and in an
> automated fashion based on load. The nagios_*.cfg files are
> automatically changed, why is the notify not triggering?
>

Because there is a difference between unmanaged resources and resources
that are managed 'absent'. Only resources that are actually declared for
the target node are managed, whether declared directly or via collection
of virtual or exported resources. Declaring that unmanaged resources of a
given type should be purged does not make the affected resources managed.

You declare that Service['nagios'] must be notified when a *managed*
Nagios_host or Nagios_service is modified (including from being absent to
being present, or vice versa), but that's not what happens when you remove
the [exported] resource declaration altogether.

> > however in your example, you seem not to be redefining the "target" when
> > collecting, so you might consider using purge => true. to achieve what
> > you want with the workflow you mentioned above (e.g. without the need to
> > export with ensure => absent)
>
> I am purging unless you are referring to a different resource stanza.
> From my OP:
>
> resources { [ "nagios_host", "nagios_service" ]:
>   purge => true,
> }
>

By definition, that causes *un*managed resources of the specified types to
be removed from the system. Because they are unmanaged, their removal does
not cause nagios to be notified. You could try having that Resources
resource notify Service['nagios'] itself:

    resources { [ "nagios_host", "nagios_service" ]:
      purge  => true,
      notify => Service['nagios']
    }

I don't actually know whether that would work, but it seems right. Be sure
to check both that it causes nagios to be notified when hosts or services
are purged, and that it doesn't do so otherwise. And please let us know
what happens. I'm very curious.

John
Re: [Puppet Users] Nagios service not restarting when removing a host from the database
On 05/08/13 04:33 PM, puppetl...@downhomelinux.com wrote:
> On Mon, Aug 05, 2013 at 04:22:41PM -0400, Gabriel Filion wrote:
>>
>> you need to export the resource with ensure => absent and run puppet on
>> the host, then on the nagios server so that everything runs fine.
>
> Dozens of VMs are routinely destroyed on a weekly basis and in an
> automated fashion based on load. The nagios_*.cfg files are
> automatically changed, why is the notify not triggering?
>
>> however in your example, you seem not to be redefining the "target" when
>> collecting, so you might consider using purge => true. to achieve what
>> you want with the workflow you mentioned above (e.g. without the need to
>> export with ensure => absent)
>
> I am purging unless you are referring to a different resource stanza.
> From my OP:

ah, so that's why the config is updated automatically then..

I can't use purging in my environment because of the annoying limitation
with the "target" argument, so the best I can do now is to pull one
suggestion out of my hat: if you try and add the notify here, maybe it'll
catch removals too?

> resources { [ "nagios_host", "nagios_service" ]:
>   purge => true, notify => Service['nagios'],
> }

it might complain about the notify lines in the collection.. not entirely
sure. if so, try to remove the line at the collection point.

-- 
Gabriel Filion
Re: [Puppet Users] Nagios service not restarting when removing a host from the database
On Mon, Aug 05, 2013 at 04:22:41PM -0400, Gabriel Filion wrote:
>
> you need to export the resource with ensure => absent and run puppet on
> the host, then on the nagios server so that everything runs fine.

Dozens of VMs are routinely destroyed on a weekly basis and in an
automated fashion based on load. The nagios_*.cfg files are
automatically changed, why is the notify not triggering?

> however in your example, you seem not to be redefining the "target" when
> collecting, so you might consider using purge => true. to achieve what
> you want with the workflow you mentioned above (e.g. without the need to
> export with ensure => absent)

I am purging unless you are referring to a different resource stanza.
From my OP:

    resources { [ "nagios_host", "nagios_service" ]:
      purge => true,
    }
Re: [Puppet Users] Nagios service not restarting when removing a host from the database
Hi there,

On 05/08/13 10:51 AM, John Santana wrote:
> When I remove the host from the database via
>
> delete from fact_values where host_id='N';
> delete from resources where host_id='N';
> delete from hosts where id='N';

if you remove the host exported resource in the manifests and the DB, then
the nagios server is not collecting anything about it anymore: that's why
the service doesn't get notified.

you need to export the resource with ensure => absent and run puppet on
the host, then on the nagios server so that everything runs fine.

however in your example, you seem not to be redefining the "target" when
collecting, so you might consider using purge => true. to achieve what
you want with the workflow you mentioned above (e.g. without the need to
export with ensure => absent)

-- 
Gabriel Filion
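
A rough sketch of the "export with ensure => absent" workflow described above, for illustration only. The variable $nagios_ensure is hypothetical (it does not appear anywhere in this thread); the resource attributes and the collectors are taken from the original post.

```puppet
# --- On the client, before the node is retired ---
# Hypothetical toggle, not from the thread: flip to 'absent' to decommission.
$nagios_ensure = 'absent'

@@nagios_host { $::fqdn:
  ensure  => $nagios_ensure,
  alias   => $::hostname,
  address => $::ipaddress,
  use     => 'linux-server',
}

@@nagios_service { "check_ssh_${::hostname}":
  ensure              => $nagios_ensure,
  check_command       => 'check_ssh',
  use                 => 'generic-service',
  host_name           => $::fqdn,
  service_description => 'SSH',
}

# --- On the nagios server (collectors as in the original post) ---
# The collected resources are still managed, now as 'absent', so removing
# them is a change to a managed resource and the notify applies.
Nagios_host    <<||>> { notify => Service['nagios'] }
Nagios_service <<||>> { notify => Service['nagios'] }
```

The order matters: the client has to run once with the 'absent' value so the updated export is stored, and only then does a run on the nagios server pick it up.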
[Puppet Users] Nagios service not restarting when removing a host from the database
Nagios is restarted every time a host or service is added, but never when
removing hosts or services.

The client resources:

    @@nagios_host { "$::fqdn":
      ensure  => 'present',
      alias   => "$::hostname",
      address => "$::ipaddress",
      use     => 'linux-server',
    }

    @@nagios_service { "check_ssh_${::hostname}":
      check_command       => 'check_ssh',
      use                 => 'generic-service',
      host_name           => "$::fqdn",
      service_description => 'SSH',
    }

The nagios server resources:

    service { 'nagios':
      ensure    => 'running',
      hasstatus => true,
      enable    => true,
    }

    resources { [ "nagios_host", "nagios_service" ]:
      purge => true,
    }

    Nagios_host <<||>> { notify => Service['nagios'] }
    Nagios_service <<||>> { notify => Service['nagios'] }

Based on variations I have seen out there I have also tried the following:

- Have the service subscribe to /etc/nagios with checksum=>mtime
- Added before => File['/etc/nagios'], to Nagios_host and Nagios_service
- Tried checksum=>mtime on /etc/nagios/nagios_*.cfg resources

When I add a host and then run puppet agent --test on the nagios server I
see this:

    notice: /Stage[main]/Nagios::Monitor/Nagios_host[test1.tld]/ensure: created
    notice: /Stage[main]/Nagios::Monitor/Nagios_service[check_ssh_test1]/ensure: created
    notice: /Stage[main]/Nagios::Monitor/Service[nagios]: Triggered 'refresh' from 2 events

When I remove the host from the database via

    delete from fact_values where host_id='N';
    delete from resources where host_id='N';
    delete from hosts where id='N';

The next run of puppet on the nagios server produces:

    notice: /Nagios_service[check_ssh_test1]/ensure: removed
    notice: /Nagios_host[test1.tld]/ensure: removed

nagios_host.cfg and nagios_service.cfg are properly updated, but the
service will not restart.

This is centos6.3 with epel puppet-2.6.17 (for client and master).

Any ideas?