Re: [Puppet Users] Why is it so hard to make a sane nagios server config?

2011-03-12 Thread Douglas Garstang
I gave up and stopped trying to obfuscate the nagios configuration in
puppet. I simply have puppet pushing out anything I drop into
/etc/nagios/conf.d, and then doing a nagios restart. This makes it very easy
to add hosts and services because a script can read the node manifest and
drop the necessary files into modules/nagios/etc/nagios/conf.d.
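
As a rough illustration of that arrangement, such a manifest might look like the sketch below. The class name, the source URL and the service name are assumptions made for the example, not taken from a real setup.

```puppet
# Sketch only: class name, source URL and service name are assumptions.
class nagios::confd {
  # Recursively copy whatever the script dropped into the module
  # onto the Nagios server.
  file { '/etc/nagios/conf.d':
    ensure  => directory,
    source  => 'puppet:///modules/nagios/conf.d',
    recurse => true,
    # Any change under conf.d triggers a restart of the service.
    notify  => Service['nagios'],
  }

  service { 'nagios':
    ensure => running,
    enable => true,
  }
}
```

Puppet never has to understand the Nagios objects here; it only ships files and restarts the daemon when they change.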

Doug.

On Fri, Mar 11, 2011 at 7:22 AM, Brian Gallew g...@gallew.org wrote:

 My setup also has a worst-case propagation delay of 90 minutes.  I have a
 custom fact that collects all of the information in classes.txt on the
 client.  That, in turn, is used (for Nagios) by a custom parser function
 that produces the hostgroup list for when the nagios_host resource is
 exported.  It's not optimal, but it's fully automatic.  In the rare case
 where I want/need the info to propagate immediately, I can run puppet agent
 --test twice on the client and then once more on the Nagios server.

 All of the services, as in your case, are generated as members of
 hostgroups.


 On Fri, Mar 11, 2011 at 6:14 AM, Martijn Grendelman mart...@iphion.nl wrote:

 On 11-03-11 12:46, Martijn Grendelman wrote:
 [snip]

  I did exactly what you did: use exported concat-fragments to collect the
  hostgroups on the puppetmaster and then use generate() to provision the
  hostgroups parameter of the nagios_host.

 [snip]

  The biggest downside of my system so far, is the fact that, in a
  worst-case scenario, it takes 90 minutes for a configuration change to
  propagate to Nagios.

 I have now replaced the exported concat fragments with local concat fragments
 and a custom fact that reads the result of the concat and joins it into
 a comma-separated string.

 Dependencies make sure that the concats are done before the nagios_hosts
 are exported.

 This removes the need to collect anything on the puppetmaster and reduces
 the time needed to propagate a configuration change to Nagios in the worst
 case to 30 minutes.

 Best regards,
 Martijn.

 --
 You received this message because you are subscribed to the Google Groups
 Puppet Users group.
 To post to this group, send email to puppet-users@googlegroups.com.
 To unsubscribe from this group, send email to
 puppet-users+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/puppet-users?hl=en.



-- 
Regards,

Douglas Garstang
http://www.linkedin.com/in/garstang
Email: doug.garst...@gmail.com
Cell: +1-805-340-5627



[Puppet Users] Why is it so hard to make a sane nagios server config?

2011-03-11 Thread Nick Moffitt
I've tried to achieve my overall goals with several different features
of Puppet, but I've hit a bit of a wall here.  I think it's time for me
to explain what I'm trying to accomplish:

I want the enabling of a service in my manifests to configure
the monitoring of that service by a nagios server, without
needless repetition.

Let me explain how my non-automated nagios3 server is configured:

Each service is declared once per type (or type+role, but there
are very few types distinguished by role, so we'll ignore that
for now).  Services are members of hostgroups and servicegroups,
and hosts join hostgroups in order to activate monitoring.
Servicegroups are mostly used for sorting in the Web UI.
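
Written with Puppet's nagios_* types instead of hand-edited config files, that layout would look something like the sketch below. The hostgroup name, the check command and the generic-* templates are assumptions for illustration.

```puppet
# Sketch only: names and templates are assumptions.

# On the Nagios server: one hostgroup and one service per check type,
# declared once no matter how many hosts there are.
nagios_hostgroup { 'ssh-servers': }

nagios_service { 'check_ssh':
  hostgroup_name      => 'ssh-servers',
  service_description => 'SSH',
  check_command       => 'check_ssh',
  use                 => 'generic-service',
}

# On each monitored node: export a single host entry that lists the
# hostgroups it joins.
@@nagios_host { $::fqdn:
  address    => $::ipaddress,
  hostgroups => 'ssh-servers',
  use        => 'generic-host',
}

# Back on the Nagios server: collect the exported hosts.
Nagios_host <<| |>>
```

The awkward part is the hostgroups parameter: it has to contain every group the host belongs to, and that list is only known once every class on the node has been evaluated.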

Now let's look at how Puppet wants to configure Nagios:

Each host exports a unique service for each of the checks it
should be subject to.

Holy cow!  So if I have 25 checks per machine, and a thousand nodes,
that's 25,000 entries!  We're talking about a nagios_services.cfg
measured in the tens of *megabytes*!  I'm a little stunned by this
pattern.
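
For reference, the per-host pattern amounts to roughly the following sketch. The check shown and the 'generic-service' template are assumptions for illustration.

```puppet
# Sketch only: the check and the template name are assumptions.

# On every monitored node, once per check: export a service that is
# tied to this host by name.
@@nagios_service { "check_ssh_${::fqdn}":
  host_name           => $::fqdn,
  service_description => 'SSH',
  check_command       => 'check_ssh',
  use                 => 'generic-service',
}

# On the Nagios server: collect every service exported by every node.
Nagios_service <<| |>>
```

Each node contributes one nagios_service resource per check, so the server's catalog grows with checks times nodes: 25 checks on 1,000 nodes means 25,000 collected resources.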

I had created a system whereby nodes exported concat::fragments
expressing desired membership in a hostgroup, and defines that created
services, hostgroups, and servicegroups only if they hadn't already been
made.  All this ground to a halt: with only 50 test nodes, the
compilation of the nagios server's catalog took 3 minutes on a
reasonably spec'd machine.  I have to assume that the AST's mass of
defines-of-defines was partly to blame.

I'd like to make it as simple as possible for my module writers: Just
use the nagios::nrpe_check define.  That'll make the nrpe config locally
and export everything needed to ensure this host is checked for your
module's behaviors! Forcing them to make a service check *here* and a
hostgroup *there*, and then to remember to add the hostgroup to the node
that used the class *there*… well, it just opens up too many
opportunities for human fallibility.
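
To make the intended interface concrete, here is a minimal sketch of such a define and how a module would call it. The parameter name, the nrpe.d path, the plugin path and the NRPE service name are all assumptions, and the exported half is deliberately left open.

```puppet
# Sketch only: parameter, paths and service name are assumptions.
define nagios::nrpe_check($command) {
  # Local half: tell the NRPE daemon on this host how to run the check.
  # Assumes Service['nagios-nrpe-server'] is declared elsewhere.
  file { "/etc/nagios/nrpe.d/${name}.cfg":
    ensure  => file,
    content => "command[${name}]=${command}\n",
    notify  => Service['nagios-nrpe-server'],
  }

  # Exported half: whatever is needed for the Nagios server to run
  # this check against this host would go here.
}

# What a module writer would add to their own class:
nagios::nrpe_check { 'check_ntp_time':
  command => '/usr/lib/nagios/plugins/check_ntp_time -H localhost',
}
```

The module writer touches exactly one resource; the hostgroup and service plumbing stays inside the nagios module.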

So what are my options here?  I don't want to probe the hosts to
determine what should be monitored, because the whole *point* of
monitoring is to alert you when your live system state isn't what you
asked for.  I also worry that this Cartesian product of everything by
everyone will not scale acceptably into the future.

Fingers crossed that I'm just being dense and missed something big!

-- 
Ill-informed qmail-bashing is better than no
qmail-bashing at all.
--Don Marti



Re: [Puppet Users] Why is it so hard to make a sane nagios server config?

2011-03-11 Thread Martijn Grendelman
Hi Nick,

 [snip]

 Fingers crossed that I'm just being dense and missed something big!

I'm afraid not.

Most of this past week I spent building my Puppet-induced Nagios
configuration, and after exploring my options, I decided that the
hostgroup-based configuration you describe is the only sensible way.

I did exactly what you did: use exported concat-fragments to collect the
hostgroups on the puppetmaster and then use generate() to provision the
hostgroups parameter of the nagios_host.

Catalog runs on the Nagios servers typically take less than a minute,
but I only have about 50 hosts defined so far. I hope that this
scheme will be sustainable.

The biggest downside of my system so far is that, in a worst-case
scenario, it takes 90 minutes for a configuration change to propagate
to Nagios.

Regards,
Martijn.
