Re: [Puppet Users] Initial data population

2010-11-24 Thread Daniel Pittman
Ashley Penney  writes:
> On Tue, Nov 23, 2010 at 6:41 PM, Daniel Pittman  wrote:
>> Ashley Penney  writes:
>>
>> > As an example of the kind of thing we're talking about we use a product
>> > called Sonatype Nexus that relies on a bunch of on disk data in
>> > /srv/sonatype-nexus/.  When installing the system for the first time (for
>> > example, when the file{} containing the .war triggers) we would like it to
>> > automatically put down a copy of /srv/sonatype-nexus/.  We obviously don't
>> > want this drifting out of sync with the production data which is where the
>> > issue is.  How do other people handle this?
>>
>> Package those data files yourself, if necessary including logic in the
>> package to ensure that you don't overwrite valuable local changes.  Then use
>> puppet to ensure that package is either 'installed' or 'latest'.
>
> I suppose this is possible, but awkward.  An example of another application
> is this horrible Java CMS that we use that writes numerous XML files of
> random names all over the place during operation.

Well, I agree that by the time you got as far as Java you had already lost. ;)

More seriously, I can understand the problem, and it is a royal PITA.

> There's cache directories, it constantly rewrites various bits of
> configuration xml files, it spews logs all over.  Packaging something like
> that up in a way that is functional is almost impossible.  When we want to
> reinstall/clone that server we just copy the entire directory and then run
> Puppet to change a few key XML files.  Something like that is difficult to
> package, and the files that you would package change frequently due to
> patches and internal development on top of the CMS.  

I would approach that, personally, by holding my nose and using something like
Capistrano or another "deploy from a version control system" tool to do
literally that: copy from a golden source into the target system, by hand.

Then use puppet to manage the handful of configuration files that need
customization, and have the "deployment" tool trigger a puppet run with
'--test' on the target machine after installation.

Which is a bit nasty, and it would be nice if puppet could do it, but it sucks
less than the alternatives, I think.

(In other words, I think you identified the body of alternative processes well
 in your earlier post, although other wrappers around them might be nicer.)
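Concretely, the Puppet half of that split might look something like this; the
module and file names are hypothetical, just to show the shape of "deploy tool
owns the data tree, puppet owns the config layered on top":

```puppet
# Sketch only: assumes a (hypothetical) 'cms' module with a site.xml template.
# The deploy tool copies the whole /srv/cms tree into place; puppet then
# manages only the few files that need per-host customization.
class cms::config {
  file { '/srv/cms/conf/site.xml':
    ensure  => file,
    owner   => 'cms',
    content => template('cms/site.xml.erb'),
  }
}
```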

Daniel

-- 
✣ Daniel Pittman   ✉ dan...@rimspace.net   ☎ +61 401 155 707
   ♽ made with 100 percent post-consumer electrons

-- 
You received this message because you are subscribed to the Google Groups 
"Puppet Users" group.
To post to this group, send email to puppet-us...@googlegroups.com.
To unsubscribe from this group, send email to 
puppet-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/puppet-users?hl=en.



Re: [Puppet Users] Initial data population

2010-11-24 Thread Ashley Penney
On Tue, Nov 23, 2010 at 2:30 PM, Patrick  wrote:

>
> 1) So are the Puppet clients (Nexus servers) supposed to be modifying the
> data in /srv/sonatype-nexus like a database or is it used read-only like a
> file server?
>

They modify data in that directory.  I explained further in another email
but we have several Java apps that do similar things, constantly
changing/adding xml files and all kinds of logs and other stuff for running.


> 2) If you change the "master copy" of the data, can you wipe the data and
> recopy on each client or do you need to merge in changes?
>

I think in most cases it would require a re-merge.  Generally speaking I'm
only dealing with a single client using this data at a time, so my concerns
are more 'if the data is not there completely, automatically reprovision it
with a copy as up to date as possible' rather than changing things.  If
there are specific configuration files I need to change within the data,
then I handle that within Puppet like any other application.


> 3) How big is the biggest file in the data?  What's the total size?


Hmm, I'm not logged in to check at the moment but generally we're talking a
maximum of 4G of data for one of these applications, and some closer to 1G.




Re: [Puppet Users] Initial data population

2010-11-24 Thread Ashley Penney
On Tue, Nov 23, 2010 at 6:41 PM, Daniel Pittman  wrote:

> Ashley Penney  writes:
>
> > As an example of the kind of thing we're talking about we use a product
> > called Sonatype Nexus that relies on a bunch of on disk data in
> > /srv/sonatype-nexus/.  When installing the system for the first time (for
> > example, when the file{} containing the .war triggers) we would like it to
> > automatically put down a copy of /srv/sonatype-nexus/.  We obviously don't
> > want this drifting out of sync with the production data which is where the
> > issue is.  How do other people handle this?
>
> Package those data files yourself, if necessary including logic in the
> package to ensure that you don't overwrite valuable local changes.  Then
> use puppet to ensure that package is either 'installed' or 'latest'.
>

I suppose this is possible, but awkward.  An example of another application
is this horrible Java CMS that we use that writes numerous XML files of
random names all over the place during operation.  There's cache
directories, it constantly rewrites various bits of configuration xml files,
it spews logs all over.  Packaging something like that up in a way that is
functional is almost impossible.  When we want to reinstall/clone that
server we just copy the entire directory and then run Puppet to change a few
key XML files.  Something like that is difficult to package, and the files
that you would package change frequently due to patches and internal
development on top of the CMS.


>
> > Our options seem to be:
> >
> > * Nightly/hourly backups of production data to some location where Puppet
> >   can rsync/wget/shovel it out when needed.
> > * Some kind of process that real-time syncs directories to nfs storage.
> > * Erroring if the data is missing in some fashion when Puppet runs and
> >   relying on sysadmins to put it in place.
>
> ...or making it available as a puppet file server, and using puppet to put
> it in place.
>

In our experience that is almost unusable, speedwise.


>
> > We've talked through the options but they all have fairly significant
> > drawbacks.  My personal favorite solution would be some kind of daemon
> > that syncs data constantly and is capable of intelligently syncing the
> > data back to the node if it goes missing.  It could be potentially error
> > prone but it represents the least bad choice.
>
> You could potentially just use:
>
>file { "/example":
>  source => 'puppet:///module/example', replace => false
>}
>
> That will only put the file in place if it doesn't already exist.
>

Hmm, I always forget about replace => false.  I wonder if it has the same
awful speed penalties.  I think my issue with this is still the hassle of
constantly syncing the changing files back into Puppet.  That's why I was
looking for some kind of semi- or fully automated syncing mechanism for
something like this.  It's mostly Java apps that are especially bad for
this.  Most open source software sticks data into a database, or at least a
single, easily managed directory.  Java explodes all over the place like
some kind of evil virus.




Re: [Puppet Users] Initial data population

2010-11-23 Thread Daniel Pittman
Ashley Penney  writes:

> We've been having some internal discussions about the best way to handle
> certain cases and I thought I'd turn to the list to solicit opinions on how
> other people have solved this issue (or don't, as the case may be).  The
> issue is that we would like our modules to, where possible, check for the
> existence of certain on disk data when installing a service for the first
> time and retrieve it from somewhere if it's not available.

Patrick asked some really good questions about this, but generally:

> As an example of the kind of thing we're talking about we use a product
> called Sonatype Nexus that relies on a bunch of on disk data in
> /srv/sonatype-nexus/.  When installing the system for the first time (for
> example, when the file{} containing the .war triggers) we would like it to
> automatically put down a copy of /srv/sonatype-nexus/.  We obviously don't
> want this drifting out of sync with the production data which is where the
> issue is.  How do other people handle this?

Package those data files yourself, if necessary including logic in the package
to ensure that you don't overwrite valuable local changes.  Then use puppet to
ensure that package is either 'installed' or 'latest'.
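As a sketch of that approach (package name and paths hypothetical), the puppet
side then collapses to a single resource, with the "don't clobber local
changes" logic living in the package's own maintainer scripts:

```puppet
# Assumes you have built a (hypothetical) 'nexus-seed-data' rpm/deb whose
# post-install script only unpacks the seed copy when /srv/sonatype-nexus/
# is absent, so existing production data is never overwritten.
package { 'nexus-seed-data':
  ensure => installed,
}
```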

> Our options seem to be:
>
> * Nightly/hourly backups of production data to some location where Puppet
>   can rsync/wget/shovel it out when needed.
> * Some kind of process that real-time syncs directories to nfs storage.
> * Erroring if the data is missing in some fashion when Puppet runs and
>   relying on sysadmins to put it in place.

...or making it available as a puppet file server, and using puppet to put it
in place.

> We've talked through the options but they all have fairly significant
> drawbacks.  My personal favorite solution would be some kind of daemon that
> syncs data constantly and is capable of intelligently syncing the data back
> to the node if it goes missing.  It could be potentially error prone but it
> represents the least bad choice.

You could potentially just use:

file { "/example": 
  source => 'puppet:///module/example', replace => false
}

That will only put the file in place if it doesn't already exist.
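For a whole directory tree rather than a single file, the same trick should
extend with recurse; untested sketch, module path hypothetical:

```puppet
file { '/srv/sonatype-nexus':
  ensure  => directory,
  source  => 'puppet:///modules/nexus/sonatype-nexus',
  recurse => true,
  replace => false,   # leave existing files alone; only fill in missing ones
}
```

Though, as noted elsewhere in the thread, recursive file serving can be
painfully slow for large trees.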

Regards,
Daniel
-- 
✣ Daniel Pittman   ✉ dan...@rimspace.net   ☎ +61 401 155 707
   ♽ made with 100 percent post-consumer electrons




Re: [Puppet Users] Initial data population

2010-11-23 Thread Patrick

On Nov 23, 2010, at 6:45 AM, Ashley Penney wrote:

> Hi,
> 
> We've been having some internal discussions about the best way to handle 
> certain cases and I thought I'd turn to the list to solicit opinions on how 
> other people have solved this issue (or don't, as the case may be).  The 
> issue is that we would like our modules to, where possible, check for the 
> existence of certain on disk data when installing a service for the first 
> time and retrieve it from somewhere if it's not available.
> 
> As an example of the kind of thing we're talking about we use a product 
> called Sonatype Nexus that relies on a bunch of on disk data in 
> /srv/sonatype-nexus/.  When installing the system for the first time (for 
> example, when the file{} containing the .war triggers) we would like it to 
> automatically put down a copy of /srv/sonatype-nexus/.  We obviously don't 
> want this drifting out of sync with the production data which is where the 
> issue is.  How do other people handle this?
> 
> Our options seem to be:
> 
> * Nightly/hourly backups of production data to some location where Puppet
>   can rsync/wget/shovel it out when needed.
> * Some kind of process that real-time syncs directories to nfs storage.
> * Erroring if the data is missing in some fashion when Puppet runs and
>   relying on sysadmins to put it in place.
> 
> We've talked through the options but they all have fairly significant 
> drawbacks.  My personal favorite solution would be some kind of daemon that 
> syncs data constantly and is capable of intelligently syncing the data back 
> to the node if it goes missing.  It could be potentially error prone but it 
> represents the least bad choice.  That combined with regular backups would 
> seem ideal but I can't find anything out there that does this without 
> significant work/investment.  I debated just rsyncing every 15 minutes but 
> that's not great either.

1) So are the Puppet clients (Nexus servers) supposed to be modifying the data 
in /srv/sonatype-nexus like a database or is it used read-only like a file 
server?
2) If you change the "master copy" of the data, can you wipe the data and 
recopy on each client or do you need to merge in changes?
3) How big is the biggest file in the data?  What's the total size?




[Puppet Users] Initial data population

2010-11-23 Thread Ashley Penney
Hi,

We've been having some internal discussions about the best way to handle
certain cases and I thought I'd turn to the list to solicit opinions on how
other people have solved this issue (or don't, as the case may be).  The
issue is that we would like our modules to, where possible, check for the
existence of certain on disk data when installing a service for the first
time and retrieve it from somewhere if it's not available.

As an example of the kind of thing we're talking about we use a product
called Sonatype Nexus that relies on a bunch of on disk data in
/srv/sonatype-nexus/.  When installing the system for the first time (for
example, when the file{} containing the .war triggers) we would like it to
automatically put down a copy of /srv/sonatype-nexus/.  We obviously don't
want this drifting out of sync with the production data which is where the
issue is.  How do other people handle this?

Our options seem to be:

* Nightly/hourly backups of production data to some location where Puppet
  can rsync/wget/shovel it out when needed.
* Some kind of process that real-time syncs directories to nfs storage.
* Erroring if the data is missing in some fashion when Puppet runs and
  relying on sysadmins to put it in place.

We've talked through the options but they all have fairly significant
drawbacks.  My personal favorite solution would be some kind of daemon that
syncs data constantly and is capable of intelligently syncing the data back
to the node if it goes missing.  It could be potentially error prone but it
represents the least bad choice.  That combined with regular backups would
seem ideal but I can't find anything out there that does this without
significant work/investment.  I debated just rsyncing every 15 minutes but
that's not great either.
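For what it's worth, if we did go the 15-minute rsync route, it could at
least be driven from Puppet itself; hostname and paths here are made up:

```puppet
# Sketch: push a copy of the data tree to a (hypothetical) backup host every
# 15 minutes, so a fresh node can be reseeded from a recent copy.
cron { 'sync-nexus-data':
  command => '/usr/bin/rsync -a /srv/sonatype-nexus/ backuphost:/backups/sonatype-nexus/',
  user    => 'root',
  minute  => '*/15',
}
```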

Thanks,
