On 6/16/2014 12:33 PM, Stephen Morton wrote:
I've got some newbie puppet questions.
My team has a tremendous amount of linux/computer knowledge, but we're
new to Puppet.
We recently started using puppet to manage some 100 servers. Their
configs are all pretty similar with some small changes.

----
History

Prior to Puppet, we already had a management system that involved having
config files under revision control and the config file repo checked out
on every server and the repo config files symlinked into the appropriate
place in the filesystem. Updating the repo would update these files. This
was mostly just great, with the following limitations:

  * If the symlink got broken, it didn't work.
  * Some files require very specific ownership, or must not be
    symlinks (e.g. /etc/sudoers, and /etc/vsftpd/ files I think)
  * Updating a daemon's config file does not mean that the daemon is
    restarted. e.g. updating /etc/httpd/conf/httpd.conf does not do a
    "service httpd reload"
  * You can't add a new symlink.
  * All files must be in revision control to link to. Some
    security-sensitive files we want to only be available to some
    servers and something like puppet that can send files over the
    network is a good solution to this.

----

Puppet to the rescue?

So we've tried a very conservative Puppet implementation. We've left our
existing infrastructure and we just add new rules in Puppet. So far, we
have a single site.pp file and only a dozen or so rules. But already
we're seeing problems.

 1. Puppet is good for configuring dynamic stuff that changes. But it
    seems silly to have rules for stuff that will be configured just one
    time and then will not change. If we set up some files, we don't
    expect them to disappear. In fact if they do disappear we might not
    want them silently fixed up; we probably want to know what's going
    on.  Doing everything in puppet results in ever-growing manifests. I
    don't know of a way to specify different manifests, e.g. every 30
    minutes I want Puppet to run and request the lean and mean regular
    manifest and then once a week I want it to run the "make sure
    everything is in the right place" manifest.
 2. Puppet seems very sensitive to network glitches. We run puppet from
    a cron job and errors were so frequent that we just started sending
    all output to /dev/null.
 3. Endless certificate issues. It's crazy. So sometimes hosts would get
    "dropped"... for unknown reasons their certificates were no longer
    accepted. Because we'd already stopped output (see previous bullet
    point) we would not know this and the server would be quietly not
    updated. And when you get a certificate problem, often simply
    deleting the cert on the agent and master won't fix it. Sometimes a
    restart of the master service (or more?) is required.
      * The solution to this to me is not "you should run puppet
        dashboard, then you'd know". This shouldn't be failing in the
        first place. If something is that flaky, I don't want to run it.

(We're running version 3.4.2 on CentOS 6.5, 64-bit.)

---

Questions.

So my questions for the above three issues are, I guess, as follows:

 1. Is there a common Puppet pattern to address this? Or am I thinking
    about things all wrong?
 2. Is there a way to get puppet to be more fault-tolerant, or at least
    complain less?
 3. Are endless certificate woes the norm? Once an agent has
    successfully got its certificates working with the server, is it a
    known issue that it should sometimes start to subsequently fail?

Thanks,
Steve

1. I don't think of it as manifests growing in size, but as whether I can accurately recreate a server at any time or, more importantly, provision 12 more of any server quickly. In my experience, active/passive sites usually drift into active/not-updated sites. I believe the same would apply to a Puppet install that used one methodology for installs and another for updates.

That said, we do have servers that are usually short-lived enough that we run Puppet on install and then run specifically targeted updates when needed, using Puppet's --tags feature.

http://docs.puppetlabs.com/puppet/latest/reference/lang_tags.html#the-tag-metaparameter
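
As a rough illustration of the two-speed schedule Stephen asked about, a cron wrapper could pass --tags on most runs and leave it off once a week so the full catalog still gets enforced. This is only a sketch: the tag names "sudo" and "httpd" and the choice of Sunday are made up for the example, while --onetime, --no-daemonize and --tags are standard puppet agent options.

```python
import datetime

# Hypothetical tag names; Puppet tags each class with its own name automatically.
ROUTINE_TAGS = ["sudo", "httpd"]
FULL_RUN_WEEKDAY = 6  # Sunday (Monday is 0), an arbitrary choice


def agent_command(today, tags=ROUTINE_TAGS):
    """Return the puppet agent command line to run on the given date."""
    cmd = ["puppet", "agent", "--onetime", "--no-daemonize"]
    # On every day except the full-run day, restrict the run to the routine tags.
    if today.weekday() != FULL_RUN_WEEKDAY and tags:
        cmd += ["--tags", ",".join(tags)]
    return cmd


if __name__ == "__main__":
    print(" ".join(agent_command(datetime.date.today())))
```

The weekly untagged run is what keeps the "lean" runs from letting the rest of the configuration drift.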

2. I run Puppet masters in one US site and have agent machines in five others, including three sites outside of the US. We average roughly one network-related problem a month on the 50-100 nodes that aren't in the main site. Without more information (logs, etc.), it would appear that your network's stability is the problem.

The symptoms you describe might also be the result of an overloaded master. If that sounds possible, I'd look at the number of Puppet master processes you've configured in Apache/Passenger (or similar) and the number of concurrent requests to the master during the day. Agents, when left to their own devices, like to clump up over time. Additionally, if you're still using the puppetmasterd startup script, your master won't be able to handle more than one concurrent request.
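
One way to keep cron-driven agents from clumping is to give each host a fixed offset derived from its hostname, so that runs spread across the interval instead of all landing on the same minute. The sketch below uses an MD5 hash for that. It illustrates the idea only and is not Puppet's own algorithm; the hostnames are placeholders. (Puppet's splay setting and the fqdn_rand function serve the same purpose.)

```python
import hashlib


def run_minutes(hostname, interval=30):
    """Minutes of the hour at which this host should start its agent run."""
    if interval <= 0 or 60 % interval != 0:
        raise ValueError("interval must divide 60 evenly")
    digest = hashlib.md5(hostname.encode("utf-8")).hexdigest()
    offset = int(digest, 16) % interval  # stable per host, 0 <= offset < interval
    return [offset + k * interval for k in range(60 // interval)]


if __name__ == "__main__":
    # Placeholder hostnames; each prints the minute field for its cron entry.
    for host in ("web01.example.com", "web02.example.com", "db01.example.com"):
        minutes = ",".join(str(m) for m in run_minutes(host))
        print("%s: cron minute field %s" % (host, minutes))
```

Because the offset depends only on the hostname, a host keeps the same slot across runs, which makes load on the master easier to reason about.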

3. I've been running Puppet for over four years and have never had the sort of cert problems you've described. IIRC the cert expiry time is five years, so expired certs seem unlikely as well.

My best guess is time drift, though I would expect transactions to remain broken until NTP was updated.
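
To show why clock drift can look like a certificate failure: a certificate is only valid between its notBefore and notAfter timestamps, so a host whose clock sits outside that window will reject a certificate that is otherwise fine. A minimal sketch with made-up dates follows; the function name and all values are illustrative only.

```python
from datetime import datetime, timedelta


def cert_usable(clock, not_before, not_after):
    """True if the host's clock falls inside the certificate validity window."""
    return not_before <= clock <= not_after


if __name__ == "__main__":
    issued = datetime(2014, 6, 16, 12, 0)        # made-up issue time
    expires = issued + timedelta(days=5 * 365)   # roughly five years
    good_clock = issued + timedelta(hours=1)
    drifted_clock = issued - timedelta(hours=3)  # agent clock running behind
    print(cert_usable(good_clock, issued, expires))
    print(cert_usable(drifted_clock, issued, expires))
```

A host in the second situation would keep failing until its clock was corrected, which is why intermittent failures point at least partly elsewhere.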

Ramin
