On Fri, Mar 18, 2011 at 9:04 AM, seth vidal <[email protected]> wrote:
> Hi folks,
>  some thoughts have been slowly coalescing in my head about how we're
> managing our boxes/services and I have some suggestions I've passed by
> various folks but I wanted to check them out with everyone:
>
>
> 1. puppetd sucks..... memory. Right now we have puppetd running on every
> box and it wakes up every half hour and runs itself. This is fine but in
> the time where it is not doing anything it just eats memory for no good
> reason. I'd like to suggest we move to a cron-driven model instead of
> puppetd. I'd write a simple cron job that runs every half hour to run
> puppetd, if a lock file is not found. Pretty straightforward, of
> course.

I'd be happy to help get this going.  I've set up puppet a few times
in this fashion now and it's pretty easy to do.
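
For what it's worth, here is a minimal sketch of the kind of cron wrapper I have in mind. The lock file path is a placeholder I made up, and the flags assume a puppetd that supports --onetime and --no-daemonize, so treat it as a starting point rather than the final script:

```python
import os
import subprocess

# Hypothetical lock path; nothing in this thread fixes the real location.
LOCK_PATH = "/var/run/puppet-cron.lock"
# Assumed invocation: one foreground run instead of the resident daemon.
PUPPET_CMD = ["puppetd", "--onetime", "--no-daemonize"]


def run_once(cmd=PUPPET_CMD, lock_path=LOCK_PATH):
    """Run cmd unless lock_path already exists.

    Returns the command's exit code, or None if the run was skipped
    because a lock file was found.
    """
    try:
        # O_EXCL makes the open fail if the file exists, so checking for
        # the lock and taking it happen in one atomic step.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return None
    try:
        # Record our PID so a stale lock can be traced to a process.
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write(str(os.getpid()))
        return subprocess.call(cmd)
    finally:
        # Always release the lock, even if the command could not start.
        os.unlink(lock_path)
```

Cron would call run_once() every half hour. A non-zero return value means the run failed, and None means a previous run still holds the lock (or left a stale one behind), which is probably worth reporting as well.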

> 2. monitoring if puppetd has run properly:
>   two things we want to know about puppet runs:
>   a. when they last happened per-box
>   b. if they fell over in a horrible way.

It might be overkill, but Puppet Dashboard is pretty nice.  It's a web
interface, kind of like a nagios for puppet, that tells you exactly the
two things you list above.  Plus, it has some pretty graphs :)  It
runs on cron jobs too.  I set it up once about a year ago and it worked
well; I'm sure it has improved since then.  Again, I'd be happy to
help set this up.

>    (a) can be known by looking at the $nodename.yaml file which lives
> on the puppetmaster. I've written a script to check if that file is
> older than 1 hour and report the nodename if it is.
>    (b) can be done via the cron job - ie: taking error output from the
> puppet run and mailing to people until we fix it! :)
>
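
If we stick with the script approach for (a), the check could look roughly like the sketch below. I haven't seen the actual script, so this is only an illustration; the directory is my guess at where the puppetmaster keeps the per-node yaml files, and the one-hour threshold comes from the description above:

```python
import os
import time

# Assumed location of the $nodename.yaml files on the puppetmaster.
YAML_DIR = "/var/lib/puppet/yaml/node"
# One hour, per the description above.
MAX_AGE_SECONDS = 3600


def stale_nodes(yaml_dir=YAML_DIR, max_age=MAX_AGE_SECONDS, now=None):
    """Return node names whose yaml file is older than max_age seconds."""
    now = time.time() if now is None else now
    stale = []
    for filename in sorted(os.listdir(yaml_dir)):
        # Ignore anything that is not a per-node yaml file.
        if not filename.endswith(".yaml"):
            continue
        path = os.path.join(yaml_dir, filename)
        # The file's modification time stands in for the last check-in.
        if now - os.path.getmtime(path) > max_age:
            stale.append(filename[:-len(".yaml")])
    return stale
```

Printing the returned names from a cron job would be enough to get them mailed, since cron mails whatever output a job produces.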
> 3. sign** boxes. problems here:
>   a. These boxes are falling out of date, repeatedly, b/c they aren't
> in our normal updating path.
>   b. these boxes don't email out to the same locations as the other
> boxes
>   c. these boxes don't get faspassword updates properly
>   d. these boxes don't get config changes normally via puppet
>
>   (a) I'd like to suggest that they be put into a normal updating path
> and/or we set up a nag mail to tell us about them
>   (b) obviously, fix their mail configs
>   (c) fasclient is failing b/c of a missing token b/c, most likely, of
> (d)
>
>  I'm open to suggestions on those but it is a bit annoying b/c while I
> understand their 'sensitivity' I think our way of treating them is
> making the problem WORSE not better.
>
> -sv
>