On Thu, Feb 6, 2014 at 1:50 PM, Brice Figureau < [email protected]> wrote:
> Hi,
>
> During (the awesome) cfgmgmtcamp Puppet Contributor Summit earlier this
> week, Peter Meier and I were chasing (in fact he did all the hard work)
> a couple of strange bugs, and we found while stracing the agent what
> might be a performance issue.
>

That was an absolutely incredible write-up of the issue that you put in the ticket! Many thanks to both of you for tracking that down. Erik Dalen had investigated something that looks like the same problem and posted the info to PUP-751.

> Everything is explained in detail in PUP-1592 [1], but the tl;dr version
> is we're stat(2)ing a lot of inexistent files during the transaction for
> every instance of a given defined types (that might well get in the
> order of 100s of stat(2) per instance of defined types).
>

I think your idea of changing the order that puppet searches makes sense. It is much more likely that the requested name is a defined type than a custom type, so looking for defined types first short-circuits the whole process more quickly.

There is maybe another issue at play, though. As Erik points out, it seems like puppet will look up the same things again and again. We've done a lot of hacking around to try to put in caches and all sorts of other things, but those all end up being very fragile and we regress on the fix. The problem is that known_resource_types is nearly impossible for us to manage in the system in a sane manner. I think the full fix for this is going to need to address the problem of keeping track of what is loaded, what isn't, and where it was loaded from, and define a much cleaner step for when the cache is invalidated.

I recently created the start of a benchmarking system in puppet (https://github.com/puppetlabs/puppet/commit/e48157e42360aaf0c67e2ef3866b091f0a113ce8) and the profile of the one scenario added already shows an absurd number of calls to determine if any files have changed.
Thankfully in this case the caching of the stat call is working, but it is still spending a huge amount of time just deciding if the filetimeout has expired.

> I'd like to get the input of all the devs here on the proper way of
> fixing the bug before sending a PR :)
>

I think that just swapping the search order is a good fix that should be less likely for us to regress. If that shows a performance improvement, then that should definitely go in. I think the more proper fix will require a larger redesign of how puppet loads things and tracks what it has loaded.

> I'll also try to do some performance tests during the week-end to see
> the impact (it's not yet fully known if it really matters or not
> compared to the I/O load an agent already see).
>
> Thanks!
>
> [1]: https://tickets.puppetlabs.com/browse/PUP-1592
> --
> Brice Figureau
> My Blog: http://www.masterzen.fr/
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/puppet-dev/52F403BA.7090709%40daysofwonder.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>

--
Andrew Parker
[email protected]
Freenode: zaphod42
Twitter: @aparker42

Software Developer

*Join us at PuppetConf 2014, September 23-24 in San Francisco - * http://bit.ly/pupconf14
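
To make the search-order point above concrete, here is a minimal sketch in Python. It is not Puppet code: `resolve`, `defined_types`, `search_dirs`, and the directory layout are all hypothetical names chosen for illustration. It models a lookup that can consult an in-memory table of defined types and probe a list of directories for a custom type file, and it counts how many filesystem probes each ordering costs when the name turns out to be a defined type.

```python
import os


def resolve(name, defined_types, search_dirs, defined_first):
    """Return (kind, probes), where probes counts filesystem checks made.

    All names here are illustrative assumptions, not Puppet's real API.
    """
    probes = 0

    def find_custom():
        nonlocal probes
        for directory in search_dirs:
            probes += 1  # one stat(2)-like probe per candidate path
            if os.path.exists(os.path.join(directory, name + ".rb")):
                return True
        return False

    if defined_first:
        # In-memory check first: no filesystem access on a hit.
        if name in defined_types:
            return "defined", probes
        if find_custom():
            return "custom", probes
    else:
        # Filesystem first: every miss costs one probe per directory.
        if find_custom():
            return "custom", probes
        if name in defined_types:
            return "defined", probes
    return None, probes


if __name__ == "__main__":
    # Hypothetical search path; none of these directories exist.
    dirs = [f"/nonexistent/modules/mod{i}/lib/puppet/type" for i in range(100)]
    defined = {"apache_vhost"}
    print(resolve("apache_vhost", defined, dirs, defined_first=False))
    print(resolve("apache_vhost", defined, dirs, defined_first=True))
```

With 100 candidate directories, the filesystem-first ordering makes 100 probes before falling back to the defined type, while the defined-first ordering makes none. One caveat with this simplified model: if a custom type and a defined type share a name, swapping the order changes which one wins, so the real change would need to account for that.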
