Re: [Puppet Users] Nagios service not restarting when removing a host from the database

2013-08-06 Thread puppetlist
On Tue, Aug 06, 2013 at 06:20:09AM -0700, jcbollinger wrote:
> 
> Because there is a difference between unmanaged resources and resources 
> that are managed 'absent'.  Only resources that are actually declared for 
> the target node are managed, whether declared directly or via collection of 
> virtual or exported resources.  Declaring that unmanaged resources of a 
> given type should be purged does not make the affected resources managed.

Your explanation just solved my problem. For the record, adding notify to
the resources purge stanza doesn't work, as several Google searches attest
(and as I confirmed here). It fails with warnings like:

warning: /Nagios_service[check_ssh_test1]: Service[nagios] still depends on me -- not purging

However, provided you do the following:

1) Name your services like "servicename_${::hostname}"
2) Name your hosts as "$::fqdn"
3) Make sure every @@nagios_* resource has "ensure => 'present'"

then the following does work to both remove the resources from the
nagios_*.cfg files __and__ restart nagios:

Get resource ids:
select id,title from resources where title like '%_' or title='yourfqdn';

Update present to absent for all service checks and the host check.
update param_values set value='absent' where value='present' and (resource_id=1 or resource_id=2);

Since there may be a use case to turn off a single service instead of
blowing away all the service and host checks for a given node, I'll
probably populate a checklist form. Granted, that will require that the
host doesn't redefine it again (different problem).

Anyway, thanks so much for the unmanaged vs managed comment. That was
totally slipping by me while I pounded my head into my desk trying to
figure out why the files were changing but the service wasn't being
notified.

-- 
You received this message because you are subscribed to the Google Groups 
"Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-users+unsubscr...@googlegroups.com.
To post to this group, send email to puppet-users@googlegroups.com.
Visit this group at http://groups.google.com/group/puppet-users.
For more options, visit https://groups.google.com/groups/opt_out.




Re: [Puppet Users] Nagios service not restarting when removing a host from the database

2013-08-06 Thread jcbollinger


On Monday, August 5, 2013 3:33:11 PM UTC-5, John Santana wrote:
>
> On Mon, Aug 05, 2013 at 04:22:41PM -0400, Gabriel Filion wrote: 
> > 
> > you need to export the resource with ensure => absent and run puppet on 
> > the host, then on the nagios server so that everything runs fine. 
>
> Dozens of VMs are routinely destroyed on a weekly basis and in an 
> automated fashion based on load. The nagios_*.cfg files are 
> automatically changed, why is the notify not triggering? 
>
>

Because there is a difference between unmanaged resources and resources 
that are managed 'absent'.  Only resources that are actually declared for 
the target node are managed, whether declared directly or via collection of 
virtual or exported resources.  Declaring that unmanaged resources of a 
given type should be purged does not make the affected resources managed.

You declare that Service['nagios'] must be notified when a *managed* 
Nagios_host or Nagios_service is modified (including from being absent to 
being present, or vice versa), but that's not what happens when you remove 
the [exported] resource declaration altogether.
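
As a minimal sketch of the distinction (the title test1.tld is borrowed from
the log output in the original post, and declaring the resource directly on
the Nagios server is an assumption made only for illustration), a resource
that is managed 'absent' stays in the catalog, so its removal is an event
that can refresh the service:

```puppet
# Managed 'absent': the resource is still declared for this node, so
# removing its entry from nagios_host.cfg counts as a change and the
# notify relationship fires.
nagios_host { 'test1.tld':
  ensure => absent,
  notify => Service['nagios'],
}
```

By contrast, a host entry that is merely purged was never declared in the
catalog, so there is no relationship through which Service['nagios'] could
be notified.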

 

> > however in your example, you seem not to be redefining the "target" when 
> > collecting, so you might consider using purge => true to achieve what 
> > you want with the workflow you mentioned above (e.g. without the need to 
> > export with ensure => absent) 
>
> I am purging unless you are referring to a different resource stanza. 
> From my OP: 
>
> resources { [ "nagios_host", "nagios_service" ]: 
>   purge => true, 
> } 
>
>
>

By definition, that causes *un*managed resources of the specified types to 
be removed from the system.  Because they are unmanaged, their removal does 
not cause nagios to be notified.

You could try having that Resources resource notify Service['nagios'] 
itself:

resources { [ "nagios_host", "nagios_service" ]:
  purge => true,
  notify => Service['nagios']
}

I don't actually know whether that would work, but it seems right.  Be sure 
to check both that it causes nagios to be notified when hosts or services 
are purged, and that it doesn't do so otherwise.  And please let us know 
what happens. I'm very curious.


John





Re: [Puppet Users] Nagios service not restarting when removing a host from the database

2013-08-05 Thread Gabriel Filion
On 05/08/13 04:33 PM, puppetl...@downhomelinux.com wrote:
> On Mon, Aug 05, 2013 at 04:22:41PM -0400, Gabriel Filion wrote:
>>
>> you need to export the resource with ensure => absent and run puppet on
>> the host, then on the nagios server so that everything runs fine.
> 
> Dozens of VMs are routinely destroyed on a weekly basis and in an
> automated fashion based on load. The nagios_*.cfg files are
> automatically changed, why is the notify not triggering?
> 
>> however in your example, you seem not to be redefining the "target" when
>> collecting, so you might consider using purge => true to achieve what
>> you want with the workflow you mentioned above (e.g. without the need to
>> export with ensure => absent)
> 
> I am purging unless you are referring to a different resource stanza.
> From my OP:

ah, so that's why the config is updated automatically then..

I can't use purging in my environment because of the annoying limitation
with the "target" argument, so the best I can do now is to pull one
suggestion out of my hat:

if you try and add the notify here, maybe it'll catch removals too?

> resources { [ "nagios_host", "nagios_service" ]:
>   purge => true,
notify => Service['nagios'],
> }

it might complain about the notify lines in the collection.. not
entirely sure. if so, try to remove the line at the collection point.

-- 
Gabriel Filion





Re: [Puppet Users] Nagios service not restarting when removing a host from the database

2013-08-05 Thread puppetlist
On Mon, Aug 05, 2013 at 04:22:41PM -0400, Gabriel Filion wrote:
> 
> you need to export the resource with ensure => absent and run puppet on
> the host, then on the nagios server so that everything runs fine.

Dozens of VMs are routinely destroyed on a weekly basis and in an
automated fashion based on load. The nagios_*.cfg files are
automatically changed, why is the notify not triggering?

> however in your example, you seem not to be redefining the "target" when
> collecting, so you might consider using purge => true to achieve what
> you want with the workflow you mentioned above (e.g. without the need to
> export with ensure => absent)

I am purging unless you are referring to a different resource stanza.
From my OP:

resources { [ "nagios_host", "nagios_service" ]:
  purge => true,
}






Re: [Puppet Users] Nagios service not restarting when removing a host from the database

2013-08-05 Thread Gabriel Filion
Hi there,

On 05/08/13 10:51 AM, John Santana wrote:
> When I remove the host from the database via
> 
> delete from fact_values where host_id='N';
> delete from resources where host_id='N';
> delete from hosts where id='N';

if you remove the host exported resource in the manifests and the DB,
then the nagios server is not collecting anything about it anymore:
that's why the service doesn't get notified.

you need to export the resource with ensure => absent and run puppet on
the host, then on the nagios server so that everything runs fine.
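
A minimal sketch of that workflow, assuming the exports from the original
post and a hypothetical $nagios_ensure variable (not part of the original
manifests) that is flipped to 'absent' before the node is destroyed:

```puppet
# On the node being decommissioned. $nagios_ensure is a hypothetical
# variable for illustration; set it to 'present' for normal operation.
$nagios_ensure = 'absent'

@@nagios_host { $::fqdn:
  ensure  => $nagios_ensure,
  alias   => $::hostname,
  address => $::ipaddress,
  use     => 'linux-server',
}

@@nagios_service { "check_ssh_${::hostname}":
  ensure              => $nagios_ensure,
  check_command       => 'check_ssh',
  use                 => 'generic-service',
  host_name           => $::fqdn,
  service_description => 'SSH',
}
```

After one agent run on the node, the next run on the nagios server should
collect the resources as managed 'absent', and the existing collectors with
notify => Service['nagios'] can then trigger the restart.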


however in your example, you seem not to be redefining the "target" when
collecting, so you might consider using purge => true to achieve what
you want with the workflow you mentioned above (e.g. without the need to
export with ensure => absent)

-- 
Gabriel Filion





[Puppet Users] Nagios service not restarting when removing a host from the database

2013-08-05 Thread John Santana
Nagios is restarted every time a host or service is added, but never when 
removing hosts or services.

The client resources:

@@nagios_host { "$::fqdn":
  ensure  => 'present',
  alias   => "$::hostname",
  address => "$::ipaddress",
  use     => 'linux-server',
}

@@nagios_service { "check_ssh_${::hostname}":
  check_command       => 'check_ssh',
  use                 => 'generic-service',
  host_name           => "$::fqdn",
  service_description => 'SSH',
}

The nagios server resources:

service { 'nagios':
  ensure    => 'running',
  hasstatus => true,
  enable    => true,
}

resources { [ "nagios_host", "nagios_service" ]:
  purge => true,
}

Nagios_host <<||>> { notify  => Service['nagios'] }
Nagios_service  <<||>> { notify  => Service['nagios'] }

Based on variations I have seen out there I have also tried the following:

- Had the service subscribe to /etc/nagios with checksum => mtime
- Added before => File['/etc/nagios'] to Nagios_host and Nagios_service
- Tried checksum => mtime on the /etc/nagios/nagios_*.cfg resources

When I add a host and then run puppet agent --test on the nagios server I 
see this:
notice: /Stage[main]/Nagios::Monitor/Nagios_host[test1.tld]/ensure: created
notice: /Stage[main]/Nagios::Monitor/Nagios_service[check_ssh_test1]/ensure: created
notice: /Stage[main]/Nagios::Monitor/Service[nagios]: Triggered 'refresh' from 2 events

When I remove the host from the database via 

delete from fact_values where host_id='N';
delete from resources where host_id='N';
delete from hosts where id='N';

The next run of puppet on the nagios server produces:
notice: /Nagios_service[check_ssh_test1]/ensure: removed
notice: /Nagios_host[test1.tld]/ensure: removed

nagios_host.cfg and nagios_service.cfg are properly updated, but the 
service will not restart.

This is CentOS 6.3 with puppet-2.6.17 from EPEL (for client and master). Any 
ideas?
