On 2013-08-26 19:49, Andy Parker wrote:
Adrien put a lot of effort into tracking down what was happening in
#15106 (Missing site.pp can cause error 'Cannot find definition Class').
That exact issue, as described in that bug, has been fixed, but in the
investigation Adrien figured out that there are a lot of other problems
that can crop up (https://projects.puppetlabs.com/issues/15106#note-13).
Basically it comes down to the way puppet tracks what is loaded, what
can be loaded, and when things need to be reloaded. When compiling a
catalog from manifests, the autoloader (for puppet types, not for ruby
code) will be invoked at various times to parse the .pp files that it
thinks should contain the types that are needed. At the same time it
caches what it has already parsed in a Puppet::Resource::TypeCollection,
which throughout the code is known as known_resource_types. There are
also a few cases where the TypeCollection will be cleared, even part way
through a compile, which causes it to start reloading things.
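To make the shape of the problem concrete, here is a minimal sketch of the lazy path (illustrative Ruby only; these are not the actual Puppet classes, just the pattern):

    # Sketch of lazy type loading backed by a clearable known-types cache.
    class TypeCollection
      def initialize
        @types = {}
      end

      def add(name, type)
        @types[name] = type
      end

      def find(name)
        @types[name]
      end

      # Clearing mid-compile forces every type to be found and parsed again.
      def clear!
        @types.clear
      end
    end

    class LazyLoader
      def initialize(manifest_dir, known_types)
        @manifest_dir = manifest_dir
        @known_types  = known_types
      end

      # Cache hit: return what was already parsed. Cache miss: guess which
      # file should define the type, "parse" it, and remember the result.
      def load_type(name)
        cached = @known_types.find(name)
        return cached if cached

        path = File.join(@manifest_dir, "#{name}.pp")
        return nil unless File.exist?(path)

        parsed = { name: name, source: path } # stand-in for a parsed type
        @known_types.add(name, parsed)
        parsed
      end
    end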
Charlie Sharpsteen, Adrien, and I talked about this around a week ago,
before PuppetConf, and came to the conclusion that the current method of
autoloading puppet manifests and tracking known types is just untenable.
There are multiple points in the code where it loses track of the
environment that it is working with, and trying to pass that information
through (I tried it a few days ago) ends up uncovering more issues.
The conclusion that we came to was that the current lazy-loading of
puppet manifests needs to go away. Lazy loading makes all of the
information needed to correctly load types at the right time and from the
right place very difficult to keep track of (not intrinsically so, but in
our current state).
I think the system needs to change to eager loading of manifests (not
applying them all, but at least loading them all). For the development
case, this makes things maybe a little more expensive, but it should
make the stable, production case for manifests much faster, because it
will rarely, if ever, need to look at the filesystem to find a type.
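A hedged sketch of the eager alternative (again illustrative only; EagerLoader and its layout are made up here):

    require 'find'

    # Walk the manifest directory once at startup; later lookups are plain
    # hash reads and never touch the filesystem.
    class EagerLoader
      def initialize(manifest_dir)
        @known_types = {}
        Find.find(manifest_dir) do |path|
          next unless path.end_with?('.pp')
          name = File.basename(path, '.pp') # stand-in: a real file may define several types
          @known_types[name] = { name: name, source: path }
        end
      end

      def load_type(name)
        @known_types[name] # no filesystem access at lookup time
      end
    end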
Now the problem is that if we start going down this path, it becomes a
large change to the underlying architecture of the compiler. It will be
unnoticeable to most users from a manifest standpoint (unless somehow
they were able to rely on certain manifests never being loaded), but
we may need to make changes that will break code at the ruby level
(maybe the testing wrappers, maybe types and providers, probably some
functions).
I think something this large should be an ARM, but I wanted to put this
out here to get some feedback before working up an ARM. Maybe we are
missing something and we can salvage this without a larger change, but
at the moment I'm skeptical.
--
Andrew Parker
This is a very interesting discussion.
In Geppetto the approach is to parse everything up-front. This is not
the same as what is typically referred to as parsing in the puppet
community, which seems to also involve evaluation and linking. We often
talk about "parse order" when we actually mean "evaluation order".
Parsing is quite straightforward: simply turn the DSL source into an
AST and remember it. It is the linking (and evaluation) that is tricky,
especially when changes can take place to files mid-transaction.
In Geppetto it is not practical to keep all information about all files
in memory all the time, so a technique is used to populate an index with
references to the source positions where referable elements are located;
this index is used during linking. A graph of all dependencies is created
in a way that makes it possible to compute whether a change to a
"name => element" mapping will have an effect on other resolutions (it
was missing and now exists; it existed and was resolved, but should now
resolve differently).
The build system that kicks in on any change makes use of the dependency
information to link the resulting AST in the correct order.
Sometimes the builder is required to make an extra pass due to lack of
information (the dependencies were not yet computed), or when there is
circularity.
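As a very rough illustration (this is not Geppetto's code, just an assumed shape of the index plus a reverse-dependency lookup used to decide what needs relinking):

    # Maps referable names to the source positions that define them, and keeps
    # a reverse map so a change to a name invalidates only the files using it.
    class LinkIndex
      def initialize
        @definitions = Hash.new { |h, k| h[k] = [] } # name => [[file, line], ...]
        @referrers   = Hash.new { |h, k| h[k] = [] } # name => [files that reference it]
      end

      def define(name, file, line)
        @definitions[name] << [file, line]
      end

      def reference(name, file)
        @referrers[name] << file
      end

      # A change to `name` (added, removed, or now resolving differently)
      # queues only the files that referred to it for relinking.
      def files_to_relink(name)
        @referrers[name].uniq
      end
    end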
Puppet is a very tricky language to link since many of the links can be
dynamic (not known until evaluation takes place). You can see this in
Geppetto; sometimes you have to do a "build clean" as the interactive
state does not know that issues were resolved. Also, in many cases it
is not possible to validate/link these constructs at all.
In Geppetto, the majority of the computing time is spent validating the links.
With all of that said: in a Puppet master we could parse all files to
AST, but when we start evaluation we would start over from the beginning
every time (check if a file is stale, and if so reparse it). Holding more
state than that in memory is very problematic (especially when
environments are involved).
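A minimal sketch of that per-file staleness check, assuming mtime is a good enough change signal (illustrative only):

    # Keep parsed ASTs keyed by path; reuse an entry unless the file changed
    # on disk since it was last parsed.
    class AstCache
      Entry = Struct.new(:ast, :mtime)

      def initialize
        @entries = {}
      end

      def fetch(path)
        mtime = File.mtime(path)
        entry = @entries[path]
        if entry.nil? || entry.mtime != mtime
          ast = File.read(path) # stand-in for real parsing into an AST
          entry = Entry.new(ast, mtime)
          @entries[path] = entry
        end
        entry.ast
      end
    end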
I was talking with Eric Dalén, and he said they tried running the master
in a way where each catalog request got a fresh process. If I understood
him correctly, this did not have much (if any) negative impact on
performance! I can think of many reasons why: Ruby is not the best at
garbage collection; when things are done in a virgin state there are
fewer things to check (a poor cache implementation is sometimes worse
than no cache); and memory is better organized. I can also imagine that
speedups measured in the past may not be as relevant today, when disks
are much faster (even SSDs) and the Ruby runtime has improved. It is high
time to measure again where the bottlenecks are.
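Just to illustrate the fresh-process idea (not how the master is actually wired up; compile_catalog below is a made-up stand-in), the shape in Ruby would be roughly:

    def compile_catalog(request)
      # made-up stand-in for the real compilation work
      puts "compiling catalog for #{request}"
    end

    # Handle each catalog request in a forked child so every compile starts
    # from a clean Ruby heap and nothing cached can leak between requests.
    def handle_in_fresh_process(request)
      pid = fork do
        compile_catalog(request)
        exit!(0) # skip at_exit handlers in the child
      end
      Process.wait(pid)
    end

    handle_in_fresh_process('agent1.example.com')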
So, Andy - I am very much looking forward to hearing about the
measurements you are doing on parse time. Are you running both the old
and new parsers, and on both 1.8.7 and 1.9.3?
Regards
- henrik