On Mar 17, 2010, at 12:27 PM, Brice Figureau wrote:

On 17/03/10 20:09, Brice Figureau wrote:
On 16/03/10 19:55, Brice Figureau wrote:
On 15/03/10 21:44, Luke Kanies wrote:
On Mar 15, 2010, at 2:06 AM, Brice Figureau wrote:

On Sun, 2010-03-14 at 15:49 -0700, Luke Kanies wrote:
On Mar 14, 2010, at 1:42 AM, Brice Figureau wrote:
[snipped]

Frankly I'd prefer we fix the issue in 0.25.x with this patch (granted
someone else can test it with success).

I don't really have lot of time until June for puppet dev (except those occasional fixes), but I can at least try to start something. If you want to do it, then go ahead, that will certainly be much more effective
than my procrastination :-)

Ok. I'll get this patch tested, and if I have some airplane time or
something see what a full checksum refactor would look like.  The
'source' refactor ended up being relatively easy in the end and it had a
huge impact on maintainability.

Peter tested my patch today and found that it was working as intended
(there is no more md5 checksumming of all the files).

But he also discovered something that we didnt't even thought about: the
first run of:

file {
"/deep/hierarchy/full/of/files":
  recurse => true, checksum => none, mode => 0600
}

produces many changes in the transaction (assuming the files are not
0600 of course). This in turn generates a large consumption of memory
and 100% of cpu use for a looong time (ie more than 30 min for 7.5k
files said Peter, several minutes for 2.5k on my fast computer).

Why?

After a quick search, we found that most of the time was taken around
Puppet::SimpleGraph#matching_edges or around line 218 of
Puppet::Transaction (ie I'm not sure it's the call itselft or iteration
on the result).

I didn't do a deep analysis of the problem, but to me we are generating tons of events that we are propagating up to the root file (that spawned
all those sub-file-resources). I don't think in this case that it is
necessary, unless there is a notify/subscribe (which is not the case).

One of the problem I think (no evidence to back this up, it's only my understanding) is that we dup the current edge in a fast paced loop. The poor ruby GC has really no time to collect the previous edge instance so we pile up new edge instances at a really fast pace, spending most of
it's time in the allocator (the ruby one or the libgc one, which is
costly operation).

I'll do more tests later this week on this code (I first have to
understand it of course :-)).

Anyway commenting line 218 in transaction.rb, and the aforementioned
manifest becomes really fast (if not instantaneous).

Any comments?

OK, I identified the issue (patch coming soon):

Puppet::SimpleGraph#matching_edge is simply completely inefficient in
front of lots of events for a given resource (which is our case).
The problem is that matching_edge returns the list of edges for which we should trigger an event. In the aforementioned manifest, since there is
no subscribe/notify, the list is empty.

Unfortunately with the current code, it takes about 22s to find this
result for about 1100 events. With the (soon to be published patch), I
was able to reduce this time to 4s.

Note: 22s doesn't seem that many, but I also tested with about 5000
events and got about 2min (new code), and stopped it after more than
15min (old code).

Still... 2minutes just for the graph operations?  That's pretty gross.

I suppose this wouldn't be mitigated at all if we switched to not creating individual file resources per file, since you'd still need the same number of events.

It seems the previous code was at least quadratic (damn ruby allocator,
not the code itself).

Peter, if you have some spare time, I suggest you try this new patch on
top of the previous one (although they're not related).

Warning: I didn't really test the new code, and it isn't covered with
tests (it shouldn't modify any behavior).

I think that code is pretty well covered, so it should be a straight refactor.

--
'Tis better to be silent and be thought a fool, than to speak and
remove all doubt. --Abraham Lincoln
---------------------------------------------------------------------
Luke Kanies  -|-   http://reductivelabs.com   -|-   +1(615)594-8199

--
You received this message because you are subscribed to the Google Groups "Puppet 
Developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/puppet-dev?hl=en.

Reply via email to