Re: [Puppet-dev] RFC2 - Resource Defaults

David Schmitt Fri, 11 Jul 2014 05:47:34 -0700

Hi *,

On 2014-07-07 03:26, Henrik Lindberg wrote:

The egrammar (as well as the current grammar in 3x) tries to be helpful
by recognizing certain combinations as illegal (an override that is
virtual or exported; a resource default or override that specifies a
title). This unfortunately means that the grammar has to recognize
sequences of tokens that make the grammar ambiguous, and the ambiguity
has to be solved via operator precedence tricks (which make the problem
show up as other corner cases of the grammar). (This is a classic
mistake of trying to implement too much semantics in the grammar /
parser.)


So...

What if we simply made the three resource expressions (create resource,
set resource defaults, and resource override) have exactly the same
grammar, and pushed validation off to the static validation that takes
place, and to the runtime?

Basically the grammar would be (I am cheating just a little here to
avoid irrelevant details):

     ResourceExpression
       : At? left_expr = Expression '{' ResourceBodies ';'? '}'
       ;

     ResourceBodies
       : ResourceBody (';' ResourceBody)*
       ;

     ResourceBody
       : title = Expression ':' AttributeOperations ','?
       ;

     AttributeOperations
       : AttributeOperation (',' AttributeOperation)*
       ;

     AttributeOperation
       : AttributeName ('=>' | '+>') Expression
       ;

     AttributeName
       : NAME |  KeywordsAcceptableAsAttributeName
       ;

     # Details here irrelevant, meaning is: virtual or exported resource
     # AT is the '@' token
     At
       : AT
       | AT AT
       | ATAT
       ;

So, how are the three kinds expressed? Notice that a title is required
for each ResourceBody. So we are basically going to handle different
combinations of left_expr and titles. We simply evaluate the left_expr
at runtime and treat the combinations of the *resulting* type and type
of title:

[...]

Since what I propose simply evaluates the left expression there is no
reason to deny certain expressions in this place, and it is possible to
use, say, a variable as indirection to the actual type.

   $a = Notify
   $a { hi: message => 'hello there' }

(Which is *very* useful to alias types). Strings can also be used - e.g.

   'notify' { hi: message => 'hello there'}

which also makes the grammar more symmetric (a bare word like notify is
just a string value i.e. 'notify'). (We still would not allow types
to have funny characters, spaces etc. but it is at least symmetrical).

I really dig this idea. Reading it sparked a crazy idea in the language-designer part of my brain: What about going even further and making the RHS also an Expression?

In the grammar basically everything would become a function call or just a sequence of expressions. For the expressiveness of the language it might do wonders:


  $ref = File[id]
  $select = File<|title == id|>
  $ref == $select # true
  $type = File

  $values = { id => { mode => 0664, owner => root } }
  # equivalent hash shortcut notation for backwards compat and
  # keystroke reduction
  $values = { id: mode => 0664, owner => root }
  $defaults = { owner => def, group => def }
  $overrides = { mode => 0 }

  $final = hash_merge($values, { default: $defaults })

  # old style
  create_resources($type, $values, $defaults)
  # basic resource statement
  $type $final
  # interpreted as function call
  $type($final)
  # override chaining
  $ref $overrides
  $select $overrides

  # if create_resources would return the created resources:
  $created = create_resources($type, $values, $defaults)
  $created $overrides

  # replace create_resources
  File hiera('some_files')

  # different nesting
  file { "/tmp/foo": $value_hash }

  # extreme override chaining
  File['/tmp/bar']
  { mode => 0644 }
  { owner => root }
  { group => root }

  # inverse defaulting
  file { [ '/tmp/1', '/tmp/2' ]: } { mode => 0664, owner => root }

  # define defined()
  defined(File['/tmp/bar']) == !empty(File<|title == '/tmp/bar'|>)

This would require unifying the attribute overriding semantics as almost everything would become an override.

It would also lift set-of-resources as currently used in simple collect-and-override statements to an important language element as almost everything touching resources would "return" such a set.


Formalizing this a little bit:

  * 'type' is a type reference.
  * 'Type' is the list of resources of type 'type' in the current
    catalog (compilation).
  * 'Type[expr]' is the resource of type 'type' and the title equal
    to the result of evaluating 'expr'
  * 'Type<| expr |>' is the list of local resources of type 'type' in
    the current compilation where 'expr' evaluates true. As a
    side-effect, it realizes all matched virtual resources.[1]
  * 'Type<<| expr |>>' is the list of local and exported resources of
    type 'type' where 'expr' evaluates true. As a side-effect,
    it realizes all matched exported resources.[2]
  * '{ key => value, }' is a simple hash ('hash')
  * '{ title: key => value, }' is a hash-of-hashes. Let's call this an
    untyped resource ('ur') due to its special syntax[3].
  * 'type ur' now syntactically matches what puppet3 has and evaluates
    to the set of resources ('resset') created by
    create_resources('type', 'ur').
  * '[Type1[expr1], Type2[expr2]]' is the resset containing
    'Type1[expr1]' and 'Type2[expr2]'.
  * 'resset hash' (e.g. 'File { mode => 0 }') is an override expression.
    It sets all values from 'hash' on all resources in 'resset'.
  * 'resset -> resset' (and friends) define resource relationships
    between sets of resources.
    'Yumrepo -> Package' would be a nice example, also avoiding
    premature realization.
  * 'create_resources(type, ur)' returns a resset containing resources
    of type 'type' with the values from 'ur'. Written differently,
    'create_resources' becomes a cast-and-realize operator.[4]
    - This allows things like 'create_resources(...) -> resset' and
      'create_resources(...) hash'
  * 'include someclass' returns the resset of all resources included in
    'someclass'. Note that 'included' is a very weakly defined concept
    in puppet, see Anchor Pattern.
  * Instances of user-defined types might also be seen as heterogeneous
    ressets.
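
To make the resset idea a bit more concrete, here is a minimal Python sketch of three of the operations listed above: 'create_resources(type, ur)', 'resset hash' (override) and 'resset -> resset'. It is only an illustration under assumptions. The names Resource, override and before are made up for this sketch and are not Puppet APIs, and a resset is modelled as a plain list.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    # Hypothetical in-memory model of one catalog resource.
    type: str
    title: str
    attrs: dict = field(default_factory=dict)


def create_resources(rtype, ur):
    """Turn an untyped resource (title -> attribute hash) into a resset."""
    return [Resource(rtype, title, dict(attrs)) for title, attrs in ur.items()]


def override(resset, attrs):
    """'resset hash': set every value from attrs on every resource."""
    for resource in resset:
        resource.attrs.update(attrs)
    return resset


def before(left, right):
    """'left -> right': one ordering edge per pair of resources."""
    return [((a.type, a.title), (b.type, b.title)) for a in left for b in right]


# "inverse defaulting": declare first, then set values on the whole set
files = override(
    create_resources("file", {"/tmp/1": {}, "/tmp/2": {}}),
    {"mode": "0664", "owner": "root"},
)
repos = create_resources("yumrepo", {"epel": {}})
edges = before(repos, files)
```

Because every operation takes and returns a resset, the calls chain in the same way the 'extreme override chaining' example does.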

[1] It might be worthwhile to start requiring to always write 'realize(Type<| expr |>)' for this side-effect. This looks annoying.

[2] Unintentionally realized exported resources seem a much less frequent problem than the same side-effect on virtual resources causes. It might make sense to avoid [1] and instead introduce something like 'Type[|expr|]' and 'Type[[|expr|]]' to select without realizing.

[3] Note that this is really only syntactic. { title => { key => value } } would evaluate to the equivalent untyped resource.

[4] I'm beginning to get an uncanny XPath/PuppetSQL vibe here.

Up until now, this is MOSTLY syntactic sugar to massively improve the flexibility of the language. To avoid the most egregious abuses and traps of this flexibility we have to take a good look at the underlying data model, how evaluating puppet manifests changes this model and what the result should be.

The result is very simple: the compiled catalog is a heterogeneous set of resources. In an ideal world, the contents of this resset are independent of the evaluation order of the source files (and also of the order of the statements within them).

Unifying all kinds of overrides, defaults and "normal" parameter setting into a single basic operation opens the way to discuss this on a different level: for an evaluation-order-independent result, it's not important how or when a value is set, only that it is set at most once. That is a condition that is easily checked and enforced if we accept that the evaluator may reject some complex manifests that could be evaluated theoretically but not with a given implementation.

The alert reader rightly complains that defaults and overrides have different precedences. To make a strict evaluation possible I'd suggest creating multiple "value slots" on a property: a default, a normal and an override slot. The property's value is the highest-priority value available.

To avoid write/write conflicts in the evaluation, each slot may be changed only once. This follows directly from the eval-order independence requirement: when there are two places trying to set the same property to different values with the same precedence it cannot work. The argument is the same as for disallowing duplicate resources currently.
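
A rough Python sketch of the slot idea, to show how little machinery it needs. Slot, Property and SlotConflict are invented names for this illustration, not anything from the Puppet code base. One assumption beyond the text: writing the same value to a slot a second time is tolerated, since it does not change the result.

```python
from enum import IntEnum


class Slot(IntEnum):
    """Precedence of a value slot; the higher number wins."""
    DEFAULT = 0
    NORMAL = 1
    OVERRIDE = 2


class SlotConflict(Exception):
    """Two different values were written to the same slot."""


class Property:
    def __init__(self, name):
        self.name = name
        self._slots = {}  # Slot -> value, each slot filled at most once

    def set(self, slot, value):
        # Assumption: re-writing an identical value is harmless.
        if slot in self._slots and self._slots[slot] != value:
            raise SlotConflict(
                f"{self.name}: {slot.name.lower()} slot already holds "
                f"{self._slots[slot]!r}, cannot set {value!r}"
            )
        self._slots[slot] = value

    @property
    def value(self):
        """Value of the highest-priority filled slot, or None if unset."""
        if not self._slots:
            return None
        return self._slots[max(self._slots)]


mode = Property("mode")
mode.set(Slot.NORMAL, "0600")    # file { '/tmp/foo': mode => 0600 }
mode.set(Slot.DEFAULT, "0644")   # File { mode => 0644 }, evaluated later
assert mode.value == "0600"      # statement order does not matter
```

A second, different write to Slot.NORMAL raises SlotConflict, which is the same kind of failure as a duplicate resource declaration.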

To avoid read/write conflicts in the evaluation, each property may be sealed to the currently available value(s) when reading from it. This allows detecting write-after-read situations. At this point the evaluator has enough information to decide whether the write is safe (the value doesn't change) or not (the eval-order independence is violated). In a future version, the evaluator could be changed to return promises instead of values and to evaluate promises lazily. That way it would be possible to evaluate all manifests that have an eval-order independent result (that is, all that are reference-loop-free).
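
Sealing on read could look roughly like the following Python sketch. It models a single value cell only. SealedCell and WriteAfterRead are hypothetical names, and write/write checking is deliberately left out to keep the example short.

```python
_UNSET = object()  # marker for "no value written yet"


class WriteAfterRead(Exception):
    """A write would change a value that somebody has already read."""


class SealedCell:
    def __init__(self):
        self._value = _UNSET
        self._sealed = False

    def read(self):
        # Reading seals the cell to whatever is visible right now,
        # including "undefined".
        self._sealed = True
        return None if self._value is _UNSET else self._value

    def write(self, value):
        if self._sealed and (self._value is _UNSET or self._value != value):
            raise WriteAfterRead(f"cannot change sealed value to {value!r}")
        # Either nobody has read yet, or the write leaves the value as is.
        self._value = value


cell = SealedCell()
cell.write("0644")
seen = cell.read()      # seals the cell at "0644"
cell.write("0644")      # safe: the observed value does not change
```

After the read, cell.write("0600") raises WriteAfterRead, and so does any write to a cell that was read while still undefined. That second case is the one that would catch use of not-yet-defined values.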

The case of +>: the write/write conflict is irrelevant up to the order of the resulting list. The read/write conflict can be checked like any other case.
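
For +>, one way to get rid of the remaining order dependence is to keep the contributions unordered and only produce a canonical order when the value is read. The Python sketch below does that by sorting. The sorting is my assumption for illustration, not something the current implementation does, and AppendSlot is a made-up name.

```python
class AppendSlot:
    """Collects '+>' contributions; any number of writers is fine."""

    def __init__(self):
        self._items = []

    def append(self, values):
        self._items.extend(values)

    def value(self):
        # Assumption: a sorted list is an acceptable canonical order.
        return sorted(self._items)


first = AppendSlot()
first.append(["Package[a]"])
first.append(["Package[b]", "File[c]"])

second = AppendSlot()           # same contributions, other order
second.append(["Package[b]", "File[c]"])
second.append(["Package[a]"])

assert first.value() == second.value()
```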

A more subtle problem with this approach is resset-based assignments. Some examples:


  File { mode => 0644 } # wrong precedence
  file { '/tmp/foo': mode => 0600 }

  File['/tmp/foo'] { mode => 0644 }
  file { '/tmp/foo': mode => 0600 }

  File<| title == '/tmp/foo' |> { mode => 0644 }
  file { '/tmp/foo': mode => 0600 }

  File <| owner == root |> { mode => 0644 }
  file { '/tmp/foo': mode => 0600 }

The solution to this lies in deferring evaluation of all dynamic (Type and Type<||>) ressets to the end of the compilation. While that would not influence write/write conflicts, it would force most read/write conflicts to happen always.
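
Here is a small Python sketch of the deferral, under assumptions: Catalog, declare, override_where and finalize are invented for this illustration, and the selector is evaluated against the normally declared values only.

```python
class Catalog:
    def __init__(self):
        # (type, title) -> {"normal": {...}, "override": {...}}
        self.resources = {}
        self._deferred = []  # queued (type, predicate, attrs)

    def declare(self, rtype, title, **attrs):
        key = (rtype, title)
        if key in self.resources:
            raise ValueError(f"duplicate resource {rtype}[{title!r}]")
        self.resources[key] = {"normal": dict(attrs), "override": {}}

    def override_where(self, rtype, predicate, **attrs):
        """Type<| expr |> { ... }: only queued, nothing is selected yet."""
        self._deferred.append((rtype, predicate, attrs))

    def finalize(self):
        """Resolve all queued selectors once every resource is known."""
        for rtype, predicate, attrs in self._deferred:
            for (t, title), res in self.resources.items():
                if t != rtype or not predicate(title, res["normal"]):
                    continue
                for name, value in attrs.items():
                    if res["override"].get(name, value) != value:
                        raise ValueError(f"conflicting overrides for {name}")
                    res["override"][name] = value
        self._deferred.clear()
        return {
            key: {**res["normal"], **res["override"]}
            for key, res in self.resources.items()
        }


catalog = Catalog()
# File <| owner == root |> { mode => 0644 }
catalog.override_where(
    "file", lambda title, attrs: attrs.get("owner") == "root", mode="0644"
)
# file { '/tmp/foo': owner => root, mode => 0600 }, declared afterwards
catalog.declare("file", "/tmp/foo", owner="root", mode="0600")
result = catalog.finalize()
assert result[("file", "/tmp/foo")]["mode"] == "0644"
```

The override reaches '/tmp/foo' although the selector was written before the resource existed, so swapping the two statements gives the same catalog.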


Another ugly thing would be detecting this nonsense:

  File <| mode == 0600 |> { mode => 0644 }

The same read/write conflict detection logic could be re-used for variables, finally being able to detect use of not-yet-defined variables.

My own main issue with the idea is that it makes code backwards
incompatible; you cannot write a manifest that uses defaults and
overrides in a way that works both in 3.x and 4.x. (Or, I have not
figured out a way yet at least).

Even if you skip the resources-as-hashes idea, I think most of the defaults and overrides precedence and eval-order confusion can be mitigated by a multi-slot implementation for properties as described above.

And finally, an alternative regarding overrides: if we want to keep the
left side resource-instance specific (i.e. no title), we could
simply change it to an assignment of a hash. I.e. instead of

Notify[hi] { message => 'overridden message' }

you write:

Notify[hi] = { message => 'overridden message' }

And now, the right hand side is simply a hash. The evaluator gets a
specific reference to a resource instance, and knows what to do.
(We could also allow both; the type + title in body way, and the
assignment way).

This is what actually triggered my first idea. Also because I really dislike the assignment there.

Now I have typed too much already...


Me too ;-)


Regards, David

