On Monday, August 17, 2015 at 9:23:30 AM UTC-5, Sergiu Cornea wrote:
>
> Hello guys,
>
> So the scenario is as follows:
>
> On a virtual host I've got 10 Linux containers, and each container has a 
> website associated with it. Therefore, for that website to work I have to 
> copy around 100+ files using Puppet.
>
> At the moment I am using rsync to do so, as I have tried using Puppet to 
> copy the files over, but the Puppet agent run was considerably slow, 
> and that's just for 10 websites, which is bound to grow.
>


To be precise, you are using Puppet to *manage* ~100 files per node.  Each 
one should be copied only when its contents on the target node differ from 
its contents on the master (including when the file is missing altogether).  
Puppet will not copy a file whose contents are unchanged, which it verifies 
via a configurable comparison function.
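
For reference, managing such a tree as a single recursive File resource 
typically looks something like this (the module name and paths here are 
hypothetical, just to illustrate the shape):

    file { '/var/www/mysite':
      ensure  => directory,
      recurse => true,
      source  => 'puppet:///modules/mysite/htdocs',
      owner   => 'www-data',
      group   => 'www-data',
    }

On each run the agent checksums every file under that directory and fetches 
only the ones whose checksums differ from the master's copies.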

 

>
> However, while using rsync saves me some time, the run is still quite 
> slow.
>
>

If rsync is slow for this task, then it must be that the aggregate size of 
your ~100 files is enormous, that your available processing power per 
container is meager, or both.

 

> What are you guys suggesting that I should do?  Or how are you going about 
> copying a large number of files using Puppet?
>


You're not really into the realm that I'd characterize as "a large number" 
of files yet, but yes, Puppet is not well suited to be used as a 
content-management system.  You have many options both inside Puppet and 
out, however, to improve the observed performance.

One alternative is to use a less expensive mechanism for checking file 
contents by setting a different 'checksum' parameter 
(https://docs.puppetlabs.com/references/3.4.stable/type.html#file-attribute-checksum) 
on the File resources.  This is a tradeoff between speed and 
reliability, with the default, 'md5', providing maximum reliability.  You 
could get a moderate performance improvement without giving up too much 
reliability by changing to 'md5lite'.  You could get a great performance 
improvement at the cost of a significant reduction in reliability by 
choosing the 'mtime' option (thereby relying on file modification 
timestamps).
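
As a sketch (the path and module name are hypothetical), that might look 
like:

    file { '/var/www/mysite':
      ensure   => directory,
      recurse  => true,
      source   => 'puppet:///modules/mysite/htdocs',
      checksum => 'mtime',   # or 'md5lite' for a middle ground
    }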

If the files in question rarely change, then another alternative would be 
to make a package out of them for your target machines' native packaging 
system (RPM, Apt, ...), and manage the package instead of individual 
files.  This is pretty effective for installation, but not necessarily so 
good for catching and reverting changes to the installed files.
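
Sketched out, assuming you have built and published a package named 
'mysite-content' (the name is made up) to a repository your nodes can 
reach, the Puppet side reduces to:

    package { 'mysite-content':
      ensure => latest,
    }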

Another alternative would be to enroll the files in a version-control 
system such as Git, and have Puppet use that to sync files with their 
master copies instead of managing the files as resources.  Honestly, I 
think this is a pretty good fit to the usage you describe.
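
A rough sketch, assuming the puppetlabs/vcsrepo module is installed and the 
site content lives in a Git repository you control (the URL is 
hypothetical):

    vcsrepo { '/var/www/mysite':
      ensure   => latest,
      provider => git,
      source   => 'https://git.example.com/mysite.git',
      revision => 'master',
    }

With 'ensure => latest' the agent only transfers new commits on each run, 
which keeps the per-run cost low once the initial clone is done.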


John
