Hi Nick,

> I've tried to achieve my overall goals with several different features
> of Puppet, but I've hit a bit of a wall here. I think it's time for me
> to explain what I'm trying to accomplish:
>
>     I want the enabling of a service in my manifests to configure
>     the monitoring of that service by a nagios server, without
>     needless repetition.
>
> Let me explain how my non-automated nagios3 server is configured:
>
>     Each service is declared once per type (or type+role, but there
>     are very few types distinguished by role, so we'll ignore that
>     for now). Services are members of hostgroups and servicegroups,
>     and hosts join hostgroups in order to activate monitoring.
>     Servicegroups are mostly used for sorting in the Web UI.
>
> Now let's look at how Puppet "wants" to configure Nagios:
>
>     Each host exports a unique service for each of the checks it
>     should be subject to.
>
> Holy cow! So if I have 25 checks per machine, and a thousand nodes,
> that's 25,000 entries! We're talking about a nagios_services.cfg
> measured in the tens of *megabytes*! I'm a little stunned by this
> pattern.
>
> I had created a system whereby nodes exported concat::fragments
> expressing desired membership in a hostgroup, and defines that created
> services, hostgroups, and servicegroups only if they hadn't already been
> made. All this ground to a halt and with only 50 test nodes the
> compilation of the nagios server's catalog took 3 minutes on a
> reasonably spec'd machine. I have to assume that the AST's mass of
> defines-of-defines was partly to blame.
>
> I'd like to make it as simple as possible for my module writers: "Just
> use the nagios::nrpe_check define. That'll make the nrpe config locally
> and export everything needed to ensure this host is checked for your
> module's behaviors!" Forcing them to make a service check *here* and a
> hostgroup *there* and don't forget to add the hostgroup to the node that
> used the class *there*… well it just opens up too many opportunities for
> human fallibility.
>
> So what are my options here? I don't want to probe the hosts to
> determine what should be monitored, because the whole *point* of
> monitoring is to alert you when your live system state isn't what you
> asked for. I also worry that this Cartesian product of everything by
> everyone will not scale acceptably into the future.
>
> Fingers crossed that I'm just being dense and missed something big!
I'm afraid not. I spent most of this past week building my Puppet-driven Nagios configuration, and after exploring my options I concluded that the hostgroup-based configuration you describe is the only sensible approach.

I did exactly what you did: I use exported concat fragments to collect the hostgroups on the puppetmaster, and then use generate() to fill in the hostgroups parameter of the nagios_host resource.

Catalog runs on the Nagios servers typically take less than a minute, but I only have about 50 hosts defined so far. I hope this scheme will prove sustainable.

The biggest downside of my system so far is that, in the worst case, it takes 90 minutes for a configuration change to propagate to Nagios.

Regards,
Martijn.
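
To put rough numbers on the scaling concern quoted above, here is a minimal Python sketch that renders Nagios service objects under both models and counts them. It is an illustration only, not the manifests from this thread. The host names, check names, the convention of naming each hostgroup after its check, and the check_nrpe!<check> command form are all assumptions. A real definition would also need a template or the other required service directives.

```python
def service_block(target_directive, target, check):
    """Render one Nagios service object bound to a host or a hostgroup."""
    return (
        "define service {\n"
        f"    {target_directive:<24}{target}\n"
        f"    service_description     {check}\n"
        # Assumed convention: a check_nrpe command taking the check name.
        f"    check_command           check_nrpe!{check}\n"
        "}\n"
    )


def per_host_model(hosts, checks):
    """One service object per (host, check) pair: the Cartesian product."""
    return [service_block("host_name", h, c) for h in hosts for c in checks]


def hostgroup_model(checks):
    """One service object per check, attached to a hostgroup.

    Assumption: each hostgroup is named after the check it carries.
    """
    return [service_block("hostgroup_name", c, c) for c in checks]


# Hypothetical inventory matching the figures in the quoted message.
hosts = [f"node{i:04d}" for i in range(1000)]
checks = [f"check_{i:02d}" for i in range(25)]

per_host = per_host_model(hosts, checks)
grouped = hostgroup_model(checks)

print(len(per_host), len(grouped))  # prints "25000 25"
```

In the hostgroup model, the per-node information moves out of the service file and into each host's hostgroups list, so the number of service objects grows with the number of checks rather than with hosts times checks.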