Hi there,

I'm seeing slow catalog runs because a large number of files are managed. 
I've been thinking about some optimizations for this functionality.

1. On puppetmaster:
For files with "source => 'puppet:///modules...'", the puppetmaster should 
calculate the md5 up front and send it with the catalog.

2. On managed node:
Since the md5s for the files are already available once the catalog is 
received, there is no need for x separate https calls to look them up (x 
being the number of files managed with the source => parameter).
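
To illustrate point 2, here is a rough Python sketch of the node-side check. 
It assumes the catalog delivers a plain mapping of local path to md5; that 
mapping and the name files_needing_fetch are hypothetical and are not 
Puppet's actual catalog format or API. Only files whose local md5 differs 
would need any further call to the master.

```python
import hashlib


def files_needing_fetch(catalog_md5s):
    """Return the paths whose local content does not match the catalog md5.

    catalog_md5s is a hypothetical dict of {local_path: expected_md5_hex}.
    """
    stale = []
    for path, expected in catalog_md5s.items():
        try:
            with open(path, "rb") as fh:
                actual = hashlib.md5(fh.read()).hexdigest()
        except FileNotFoundError:
            # A missing file always has to be fetched.
            actual = None
        if actual != expected:
            stale.append(path)
    return stale
```

Files that already match cost one local read and no network round trip.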

3. Puppetmaster md5 cache
This would of course put some strain on the puppetmaster, which would then 
benefit from some sort of file md5 cache:
- when an md5 is calculated, put it into the cache, keyed by filename. Also 
store the file mtime and the time of the cache insert.
- on each catalog request, for each file in the catalog, check whether the 
mtime has changed; if so, recalculate the md5 hash, otherwise just retrieve 
the md5 hash from the cache
- some sort of stale cache entry removal, based on cache insert time, 
maybe at the end of each puppet catalog compilation, maybe triggered with 
probability 1:100 or something
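
The cache described in point 3 could look roughly like the Python sketch 
below. This is only an illustration of the idea, not puppetmaster code: the 
class name Md5Cache, the max_age default of one hour, and the method names 
are all assumptions made for the example.

```python
import hashlib
import os
import random
import time


class Md5Cache:
    """Hypothetical cache: filename -> (md5, mtime, insert time)."""

    def __init__(self, max_age=3600.0, purge_probability=0.01):
        self.entries = {}
        self.max_age = max_age                      # seconds; assumed value
        self.purge_probability = purge_probability  # the "1:100" idea

    def md5_for(self, path):
        """Return the md5 of path, recalculating only if mtime changed."""
        mtime = os.stat(path).st_mtime
        entry = self.entries.get(path)
        if entry is not None and entry[1] == mtime:
            return entry[0]  # cache hit: mtime unchanged
        with open(path, "rb") as fh:
            digest = hashlib.md5(fh.read()).hexdigest()
        self.entries[path] = (digest, mtime, time.time())
        return digest

    def purge_stale(self, now=None):
        """Drop entries inserted more than max_age seconds ago."""
        now = time.time() if now is None else now
        stale = [p for p, e in self.entries.items()
                 if now - e[2] > self.max_age]
        for p in stale:
            del self.entries[p]
        return len(stale)

    def maybe_purge(self):
        """Call at the end of a compilation; purges only occasionally."""
        if random.random() < self.purge_probability:
            return self.purge_stale()
        return 0
```

One caveat of keying on mtime alone: a file rewritten within the same mtime 
tick would keep its old md5 until the entry is purged.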

Do you have any comments about these optimizations? They will be greatly 
appreciated... really :)

b.
