On 16/03/10 19:55, Brice Figureau wrote: > On 15/03/10 21:44, Luke Kanies wrote: >> On Mar 15, 2010, at 2:06 AM, Brice Figureau wrote: >> >>> On Sun, 2010-03-14 at 15:49 -0700, Luke Kanies wrote: >>>> On Mar 14, 2010, at 1:42 AM, Brice Figureau wrote: > [snipped] >>> >>> Frankly I'd prefer we fix the issue in 0.25.x with this patch (granted >>> someone else can test it with success). >>> >>> I don't really have lot of time until June for puppet dev (except those >>> occasional fixes), but I can at least try to start something. If you >>> want to do it, then go ahead, that will certainly be much more effective >>> than my procrastination :-) >> >> Ok. I'll get this patch tested, and if I have some airplane time or >> something see what a full checksum refactor would look like. The >> 'source' refactor ended up being relatively easy in the end and it had a >> huge impact on maintainability. > > Peter tested my patch today and found that it was working as intended > (there is no more md5 checksumming of all the files). > > But he also discovered something that we didnt't even thought about: the > first run of: > > file { > "/deep/hierarchy/full/of/files": > recurse => true, checksum => none, mode => 0600 > } > > produces many changes in the transaction (assuming the files are not > 0600 of course). This in turn generates a large consumption of memory > and 100% of cpu use for a looong time (ie more than 30 min for 7.5k > files said Peter, several minutes for 2.5k on my fast computer). > > Why? > > After a quick search, we found that most of the time was taken around > Puppet::SimpleGraph#matching_edges or around line 218 of > Puppet::Transaction (ie I'm not sure it's the call itselft or iteration > on the result). > > I didn't do a deep analysis of the problem, but to me we are generating > tons of events that we are propagating up to the root file (that spawned > all those sub-file-resources). I don't think in this case that it is > necessary, unless there is a notify/subscribe (which is not the case). > > One of the problem I think (no evidence to back this up, it's only my > understanding) is that we dup the current edge in a fast paced loop. The > poor ruby GC has really no time to collect the previous edge instance so > we pile up new edge instances at a really fast pace, spending most of > it's time in the allocator (the ruby one or the libgc one, which is > costly operation). > > I'll do more tests later this week on this code (I first have to > understand it of course :-)). > > Anyway commenting line 218 in transaction.rb, and the aforementioned > manifest becomes really fast (if not instantaneous). > > Any comments?
OK, I identified the issue (patch coming soon): Puppet::SimpleGraph#matching_edge is simply completely inefficient in front of lots of events for a given resource (which is our case). The problem is that matching_edge returns the list of edges for which we should trigger an event. In the aforementioned manifest, since there is no subscribe/notify, the list is empty. Unfortunately with the current code, it takes about 22s to find this result for about 1100 events. With the (soon to be published patch), I was able to reduce this time to 4s. The whole issue was that to find this list of edges the process was abusing memory allocations (ie appending to arrays again and again), producing a large list of empty arrays which finally ends up as being flattened and then compacted. I rewrote this part (sorry the code is not as beautiful as it was before I touched it :-)) to never create those transient arrays, limiting as much as we can array appending. -- Brice Figureau My Blog: http://www.masterzen.fr/ -- You received this message because you are subscribed to the Google Groups "Puppet Developers" group. To post to this group, send email to puppet-...@googlegroups.com. To unsubscribe from this group, send email to puppet-dev+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.