On 30/03/16 23:12, Jeremiah Powell wrote:
    ASTs don't need to be built on a node-by-node basis (unless you
    meant manifest-by-manifest basis)


Well, manifest-by-manifest where the manifest will vary depending on the
node the compile job is depending upon.  Reviewing the code I get the
impression I just don't understand the existing parser[1] enough today
to hold a valid opinion on this.  But honestly, I'm not trying to troll.


In puppet it is the "compilation of a catalog" that is specific to a node, not the parsing of the individual .pp files. The word "parsing" does not imply "evaluation" or "catalog compilation".

The puppet parser does the following:

* reads the source text
* builds a representation of this in memory (the Abstract Syntax Tree (AST))
* performs static validation of the produced AST (to report semantic problems that are not covered by syntax alone)

The compiler does this:

* Given a node, with its facts, settings from a node classifier etc., it determines where it should start evaluating puppet code.

* When it needs to evaluate something, e.g. "site.pp", it first needs to parse this file into AST (the steps above).

* When it has the AST, it starts evaluating the AST (expressions like 1 + 1, function calls, resource declarations etc).

* The result of the evaluation is that it has built up a catalog of resources.

* The catalog is typically sent to an agent for application, but can be written to disk etc.
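
To make the split between "parsing" and "catalog compilation" concrete, here is a toy sketch in Python. It is not Puppet's code: the one-line-per-resource mini language, the function names and the "os" fact are all made up for illustration. The point is only that parse() and validate() never see a node, while compile_catalog() does.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy catalog: just a node name and a list of resource references."""
    node: str
    resources: list = field(default_factory=list)


def parse(source: str) -> list:
    """Node-independent: turn 'type title [os]' lines into a tiny 'AST'."""
    ast = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue
        if len(parts) not in (2, 3):
            raise SyntaxError(f"cannot parse: {line!r}")
        ast.append({
            "type": parts[0],
            "title": parts[1],
            "only_on": parts[2] if len(parts) == 3 else None,
        })
    return ast


def validate(ast: list) -> list:
    """Static check that needs no node information; returns found issues."""
    return [
        f"type name must be lower case: {decl['type']}"
        for decl in ast
        if not decl["type"].islower()
    ]


def compile_catalog(node: str, facts: dict, ast: list) -> Catalog:
    """Node-specific: evaluate the shared AST against one node's facts."""
    catalog = Catalog(node)
    for decl in ast:
        if decl["only_on"] in (None, facts.get("os")):
            catalog.resources.append(f"{decl['type']}[{decl['title']}]")
    return catalog


SOURCE = """\
package ntp
service ntpd linux
service w32time windows
"""

ast = parse(SOURCE)        # done once, the same for every node
issues = validate(ast)     # done once, the same for every node
linux = compile_catalog("web01", {"os": "linux"}, ast)
windows = compile_catalog("win01", {"os": "windows"}, ast)
```

The same AST yields two different catalogs because only the last step looks at the node.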

The XPP PRFC is only for the three parsing steps. There will be no difference whatsoever in what happens once the AST is loaded into memory. The only difference between the "Ruby parser parsing a .pp file into AST" and the "reading an XPP containing an AST" is that we do not have to use the very slow Ruby runtime to do all the processing.


    The goal of this particular initiative is to enable the C++ parser
    (i.e. the frontend) to interop with the Ruby evaluation
    implementation (i.e. the backend).  The Puppet code is not being
    pre-compiled, but pre-parsed/pre-validated; the C++ implementation
    will not (yet) evaluate any Puppet code or load custom types or
    functions defined in Ruby.


How will this work with create_resources[2]?

They are completely unrelated. The call to "create_resources" will take place in exactly the same way. The AST that is evaluated looks exactly the same whether it came from an XPP file or was read in source form from a .pp file and parsed by the Ruby parser.


    In compiler terminology, there's a "frontend" and a "backend".


In compiler terminology the frontend is a scanner composed of a parser
and a lexer.  The front-end validates the parse of the code as a
side-effect.  This is beyond the scope of the discussion of the PRFC and
into a sizing competition about who's read Aho, Lam, Sethi and Ullman.

The only point from this is that this is not compiling but a partial
parsing. Some of my concerns cannot be raised until there is actual
output to examine.


Puppet does not have a compiler in the typical computer science sense. The puppet term "compiler" uses the word in an English/generic sense since what it is doing is "compiling a catalog" (putting a catalog together out of the pieces that are supposed to be in it).

Puppet is an interpreter that interprets the AST that is produced by the puppet parser. The Puppet compiler uses that interpreter when it is compiling the catalog.

    The above seems to be confusing, understandably so, pre-compiling a
    resource catalog with pre-parsing a manifest.  In terms of a
    language like Python, the "pre-compiled" pyc files are simply a
    representation of the source that is more efficient to load and
    execute than having to parse the Python source again


That is because the Java code and CPython code is completely compiled
and ready to link in at runtime.    In this the XPP proposal does not
appear similar to .pyc files or Java bytecode.


Java is compiled to byte code. This byte code is then either interpreted or "just in time" compiled into machine code.

The Puppet AST is to Puppet what the Java byte code is to a JVM.
(Although Puppet is not byte code based; we simply use the AST).

Byte code is typically for a virtual machine that is stack based.
As a simple example if you have the source:

  a = 2 + 3

Byte code may be something like

 Push Literal 2
 Push Literal 3
 Add
 Store a

Whereas the AST is a tree of nodes (which is hard to draw, so here using
a list notation). The same in puppet would be:

(AssignmentExpression a
  (ArithmeticExpression +
    (Literal 2)
    (Literal 3)))
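
As a rough illustration of the difference, here is a minimal Python sketch that runs both forms of "a = 2 + 3". It is not how Puppet or any real VM is implemented; the instruction names and the tuple encoding of the tree are assumptions made only for this example. The stack machine walks a flat list of instructions, while the AST interpreter recurses over the tree.

```python
def run_bytecode(program):
    """Execute a flat instruction list on a tiny stack machine."""
    stack, variables = [], {}
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "store":
            variables[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    return variables


def eval_ast(node, variables):
    """Evaluate a tree of (kind, ...) tuples by recursive descent."""
    kind = node[0]
    if kind == "Literal":
        return node[1]
    if kind == "ArithmeticExpression":
        _, operator, left, right = node
        if operator != "+":
            raise ValueError(f"unsupported operator: {operator}")
        return eval_ast(left, variables) + eval_ast(right, variables)
    if kind == "AssignmentExpression":
        _, name, value = node
        variables[name] = eval_ast(value, variables)
        return variables[name]
    raise ValueError(f"unknown node kind: {kind}")


# Push Literal 2 / Push Literal 3 / Add / Store a
BYTECODE = [("push", 2), ("push", 3), ("add",), ("store", "a")]

# (AssignmentExpression a (ArithmeticExpression + (Literal 2) (Literal 3)))
AST = ("AssignmentExpression", "a",
       ("ArithmeticExpression", "+", ("Literal", 2), ("Literal", 3)))

from_bytecode = run_bytecode(BYTECODE)
from_ast = {}
eval_ast(AST, from_ast)
```

Both end up with the variable a bound to 5; only the shape of what is interpreted differs.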


It does appear to me to be very similar to the Ecore technology[3] from
Eclipse, and thus Geppetto as mentioned in the references on RGen in the
prior art section.   It does appear to be similar in how you write a
Coffeescript parser for grammars in Atom or languages in Sublime Text.
It is just that you plan to serialize the result to disk instead of
displaying to the user.

All parsers do build a representation in memory using some form of tree. Puppet is no exception. This form is useful for further processing (validation, compilation/transformation into another form). What we get from RGen/Ecore is simply a convenient way to define what the building blocks of the tree are, together with tooling that helps us process them. It would look very similar if done by hand (it would only take more time to do so).

And yes! XPP simply gives us the ability to avoid going through all of the steps from puppet source to AST, to evaluation, to catalog output. The parsing step is compute intensive, and it is the worst kind of task possible for Ruby to perform.

XPP is simply the AST in serialized form (+ validation result).
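
As a sketch of what "AST in serialized form (+ validation result)" means, here is a small Python round trip. The JSON layout and the names dump_xpp_like / load_xpp_like are assumptions for illustration only; this is not the XPP file format, which the PRFC defines.

```python
import json


def dump_xpp_like(ast, diagnostics):
    """Serialize an already parsed and validated AST plus its diagnostics."""
    return json.dumps({"ast": ast, "diagnostics": diagnostics})


def load_xpp_like(text):
    """Read the AST and diagnostics back without parsing any source text."""
    document = json.loads(text)
    return document["ast"], document["diagnostics"]


# The tree for "a = 2 + 3", written as nested lists.
AST = ["AssignmentExpression", "a",
       ["ArithmeticExpression", "+", ["Literal", 2], ["Literal", 3]]]
DIAGNOSTICS = []  # static validation found nothing to report

serialized = dump_xpp_like(AST, DIAGNOSTICS)
loaded_ast, loaded_diagnostics = load_xpp_like(serialized)
```

What comes back is the same tree the parser produced, so everything downstream of the parser is unaffected.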

I suggest you read more about the CPython implementation of .pyc files
in PEP 3147[4]. The PEP proposal is very well written, IMHO.  It covers
a lot of the questions that are being raised in comments on the PRFC.
Like the discussion of not using a shadow filesystem for the files.


Thanks for the reference, will read more in that PEP.

An example from the PEP: will there be features like the ability to
detect if pre-parsing is available or in use?  Can I turn it off in code
as a developer or must I always use the --no-xpp command line as a
user?  Would that even be a good idea?


The --xpp / --no-xpp is a setting so it can be set in a configuration file to avoid having to give it on the command line every time.

You cannot turn this on/off in individual puppet manifests. It is a setting for the puppet "catalog compiler" that controls whether it should load AST from XPP files instead of using the much slower Ruby parsing route. It will always be possible to fall back to the Ruby route if there is no XPP available.
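
The fallback can be pictured with the following Python sketch. Everything in it is hypothetical: the use_xpp parameter stands in for the --xpp / --no-xpp setting, the sibling ".xpp" file location and the JSON content are assumptions, and slow_parse() is only a stand-in for the Ruby parser.

```python
import json
import tempfile
from pathlib import Path


def slow_parse(source: str) -> list:
    """Stand-in for the Ruby parser: one 'AST' entry per non-blank line."""
    return [line.strip() for line in source.splitlines() if line.strip()]


def load_ast(pp_path: Path, use_xpp: bool = True):
    """Return (ast, origin), preferring a pre-parsed file when allowed."""
    xpp_path = pp_path.with_suffix(".xpp")  # hypothetical location
    if use_xpp and xpp_path.is_file():
        return json.loads(xpp_path.read_text()), "xpp"
    # No XPP available, or XPP turned off: take the slow route.
    return slow_parse(pp_path.read_text()), "parsed"


with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "site.pp"
    manifest.write_text("notify { 'hi': }\n")

    first = load_ast(manifest)                  # no XPP yet: parsed
    manifest.with_suffix(".xpp").write_text(json.dumps(first[0]))
    second = load_ast(manifest)                 # XPP present: loaded
    third = load_ast(manifest, use_xpp=False)   # XPP turned off: parsed
```

In all three cases the caller gets the same AST; only where it came from changes.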

From my understanding of the PRFC, an XPP is a half-parsed file with
compiler warnings mixed in.    This brings to mind the use of this for
creating a blocking step in a code deployment process.  I've already
commented on that use in the document, though.


It is not "half parsed" - it is completely parsed and validated (as far as we can do static analysis). It does the same as the command "puppet parser validate" only that it produces a result that can be used later.

- henrik


--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/
