On 2014-07-17 14:56, David Schmitt wrote:
On 2014-07-12 04:50, Henrik Lindberg wrote:
On 2014-07-11 10:55, David Schmitt wrote:
[...snip...]

   # old style
   create_resources($type, $values, $defaults)
   # basic resource statement
   $type $final
This is problematic: we cannot make any sequence of expressions a
function call without requiring that every expression be terminated with
punctuation (e.g. ';') - and that requirement would then have to apply
everywhere.


You are right, having that in the grammar makes no sense. I still think
it is a neat detail to keep in mind when thinking about the underlying
structure of what we're building.

yes, basically an expression that is a kind of join between a Puppet Type[Resource] and the application of one or more sets of data.

[...snip...]

If functions can return sets of resources that can be manipulated, the
special query operator syntax can be abolished - or at least
de-emphasised. See Eric's puppetdbquery for an example.

yes, that is the direction this is going.

   * 'Type<<| expr |>>' is the list of local and exported resources of
     type 'type' where 'expr' evaluates true. As a side-effect,
     it realizes all matched exported resources.[2]
Same comment as above, but using a different container.

   * '{ key => value, }' is a simple hash ('hash')
   * '{ title: key => value, }' is a hash-of-hashes. Let's call this an
     untyped resource ('ur') due to its special syntax[3].
   * 'type ur' now syntactically matches what puppet3 has and evaluates
     to the set of resources ('resset') created by
     create_resources('type', 'ur').
   * '[Type1[expr1], Type2[expr2]]' is the resset containing
     'Type1[expr1]' and 'Type2[expr2]'.
That is what you get now (or rather, you get a set of references to the
resource instances, not the instances themselves).

Is there a distinguishable difference for the language user?

No, not really. The type is a reference; the operations on it look it up or create new instances. In the current implementation one and the same class is used for both references and the real thing (this causes great pain and confusion in the code).
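For concreteness, a minimal sketch of the equivalence being discussed, using the standard Puppet 3.x create_resources function (the package names are just illustrative):

```puppet
# The hash-of-hashes ("untyped resource") form:
$packages = {
  'httpd' => { ensure => installed },
  'vim'   => { ensure => latest },
}
create_resources('package', $packages)

# ...produces the same catalog content as writing the
# resource statements directly:
#   package { 'httpd': ensure => installed }
#   package { 'vim':   ensure => latest }
```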

   * 'resset hash' (e.g. 'File { mode => 0 }') is an override expression.
     It sets all values from 'hash' on all resources in 'resset'.
   * 'resset -> resset' (and friends) define resource relationships
     between sets of resources.
     'Yumrepo -> Package' would be a nice example, also avoiding
     premature realization.
The relations are recorded as being between references.
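A sketch contrasting the proposed set-level relationship with what today's syntax requires (the first form is hypothetical; the second is current collector chaining, which realizes as a side-effect):

```puppet
# Proposed (hypothetical): relate the sets without realizing anything
Yumrepo -> Package

# Closest current equivalent: collector chaining, which also
# realizes all matching virtual resources as a side-effect
Yumrepo <| |> -> Package <| |>
```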

   * 'create_resource(type, ur)' returns a resset containing resources
     of type 'type' with the values from 'ur'. Written differently,
     'create_resource' becomes a cast-and-realize operator.[4]
     - This allows things like 'create_resource(...) -> resset' and
       'create_resource(...) hash'
   * 'include someclass' returns the resset of all resources included in
     'someclass'. Note that 'included' is a very weakly defined concept
     in puppet, see Anchor Pattern.
Hm, intriguing idea.
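A hypothetical sketch of what an expression-valued 'include' could enable (the first form is not valid Puppet today, where 'include' is a statement returning nothing; the class names are made up for illustration):

```puppet
# Hypothetical: if 'include' returned the resset of the class,
# ordering could be expressed directly on the results:
include yumrepos -> include packages

# Current equivalent, using class references after inclusion:
include yumrepos
include packages
Class['yumrepos'] -> Class['packages']
```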

   * Instances of user-defined types might also be seen as heterogeneous
     ressets.

Yes.


[1] It might be worthwhile to start requiring that one always write
'realize(Type<| expr |>)' for this side-effect. This looks annoying.

it could be

   Type <| expr |>.realize

Ugh.

A positive Ugh, or an Ugh of agonizing pain? :-)

[2] Unintentionally realized exported resources seem to be a much less
frequent problem than the same side-effect causes on virtual resources.
It might make sense to avoid [1] and instead introduce something like
'Type[|expr|]' and 'Type[[|expr|]]' to select without realizing.

I would like to go in the other direction, with fewer special operators.

As said above, functions returning resource sets might be the way to go
then. It's not like we need to design the next APL ;-)

yes agree, functions are good - operators are ok too when they are unambiguous and not too exotic :-)

[...snip ...]
I think this may just move the problem to dueling defaults, dueling
values, and dueling overrides. (This problem occurs in the binder, where
it is solved by the following rules, expressed in the terms we use here,
except for the term 'layer', which I will come back to):
- if two defaults are in conflict, a set value wins
- if two values are in conflict, an override wins
- if two overrides are in conflict, then the one made in the highest
layer wins
- a layer must be conflict free

Don't you mean "the highest layer with a value must be conflict free" ?

yes, that is true, since some higher layer must resolve the issue. It is not meaningful to enforce resolution in every layer.

Highest (most important) layer is "the environment", secondly "all
modules" - this means that conflicts bubble to the top, where a user
must resolve the conflict by making the final decision.

The environment level can be thought of as what is expressed in
"site.pp", globally or for a "node" (if we forget for a while
about all the crazy things puppet allows you to do with global scope;
opening and redefining code, etc).

If you mean what I think you mean, I think I like it.

Another example to try to understand this:

   class somemodule { package { "git": ensure => installed } }

   class othermodule { package { "git": ensure => '2.0' } }

   node 'developer-workstation' {
     # force conflict on Package[git]#ensure here: installed != '2.0'
     include somemodule
     include othermodule

     # conflict resolved: higher layer saves the day
     Package[git] { ensure => '2.1' }
   }

How would the parser/grammar/evaluator understand which manifests are
part of what layer?


We will have a new catalog model (the result built by the new catalog builder slated to replace the current compiler). There we have a richer model for defining the information about resources. We also have a new loader system that knows where code came from. Thus, anything loaded at the environment level (i.e. node definitions) knows it is in a higher layer.

To avoid read/write conflicts in the evaluation, each property may be
sealed to the currently available value(s) when reading from it. This
allows detecting write-after-read situations. At this point the
evaluator has enough information to decide whether the write is safe
(the value doesn't change) or not (eval-order independence is
violated). In a future version, the evaluator could be changed to return
promises instead of values and to evaluate those promises lazily. That
way it would be possible to evaluate all manifests that have an
eval-order independent result (that is, all that are reference-loop-free).

yes, and now, basically, the catalog is produced using a production
system that was populated by the puppet logic.

The case of +>: the write/write conflict is irrelevant up to the order
of the resulting list. The read/write conflict can be checked like any
other case.

A more subtle problem with this approach is resset-based assignments.
Some examples:

   File { mode => 0644 } # wrong precedence
   file { '/tmp/foo': mode => 0600 }

   File['/tmp/foo'] { mode => 0644 }
   file { '/tmp/foo': mode => 0600 }

   File<| title == '/tmp/foo' |> { mode => 0644 }
   file { '/tmp/foo': mode => 0600 }

   File <| owner == root |> { mode => 0644 }
   file { '/tmp/foo': mode => 0600 }

The solution to this lies in deferring evaluation of all dynamic (Type
and Type<||>) ressets to the end of the compilation. While that would
not influence write/write conflicts, it would force most read/write
conflicts to always occur.

Another ugly thing would be detecting this nonsense:

   File <| mode == 0600 |> { mode => 0644 }


The same read/write conflict detection logic could be re-used for
variables, finally being able to detect use of not-yet-defined
variables.

Here we have another problem: variables defined in classes are very
different from those defined elsewhere - they are really
attributes/parameters of the class. All other variables follow the
imperative flow. That has always bothered me and causes leakage from
classes (all the temporary variables, those used for internal purposes,
etc). This is also the source of "immutable variables"; they really do
not have to be immutable (except in this case).

Yeah, not being able to recalculate and reset values in parameters (or
class vars) is a pain, leading to all sorts of $managed_ and $real_
variables for little gain. Having proper futures or at least r/w
conflict detection might fix that instead of making everything immutable.

If we make variables be part of the lazy logic you would be able to
write:

   $a = $b + 2
   $b = 2

I think this will confuse people greatly.

Hehe, I can imagine that. When accessing variables across files/classes
I do not see that as a big problem, though. Within a single file/scope
it can be forbidden, or at least warned/linted.

We will see what we end up with - this is currently not a primary concern. One neat thing you can do with the future parser (and in 4.0) is to evaluate code in a local block using the function 'with' and then pass it arguments; the variables in the lambda given to 'with' are all local! Thus you can assign the final value from what that lambda returns.
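A small example of the 'with' idiom described above (future parser / 4.0 syntax):

```puppet
# $x and $y exist only inside the lambda; only the returned
# value escapes, so no temporary variables leak into this scope.
$sum = with(1, 2) |$x, $y| {
  $x + $y
}
# $sum is 3; $x and $y are not visible here
```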


The crux here is that just having one expression being followed by
another - e.g:

   1 1

is this a call to 1 with 1 as an argument, or the production of one
value 1, followed by another?

Are the parser and the evaluator so intertwined that this cannot be
interpreted in context? "1" is not a callable, therefore it cannot be a
function call.

They are not intertwined, except that the parser builds a model that the evaluator evaluates. The evaluator does exactly what you say: it evaluates the LHS; if that is not a NAME it fails, and if the NAME is not a reference to a function it fails. But the evaluator cannot do that in general - there may be a sequence of, say, if statements that all produce a value; should the second if be an argument in an attempt to call whatever the first produced?

  if true { teapot }
  if true { 'yes I am' }

The evaluator would have no way of knowing, except doing static analysis (which limits the expressiveness).

Currently the parser rewrites NAME expr, or NAME expr ',' expr ... into a function call. I do not want to add additional magic of the same kind if it can be avoided.

This general problem is solved by stating that for this to be a call,
the first expression must be special: a NAME token that is followed by a
general expression (or a list of expressions, e.g. NAME e,e,e).

We cannot turn a hash into an operator since that would make it close to
impossible to write a literal hash.

Hence, for the resource expressions we need an operator that operates
on three things - type, id, and named arguments - plus (via the
operator, or through other means) the extra information about whether
each value is a default, a value, or an override, and whether it is an
addition or a subtraction.

We can solve this by making the data structure special (the {: }), by
using an operator, or by using a more complex but generic data structure
(a hash with particular keys). If we use : in hashes to mean hash of
hash, then we make it easier to encode things like defaults, values and
overrides, but we lack type and id.

You could read the
    notify { hi: message => hello }
as:
    Notify.new(hi, {message=>hello})

As I see it, the main grammar problem is that there is no "new
operator". Hence my attempt:

   Notify[hi] = {message => hello}


Now I have typed too much already...

Me too ;-)

Dueling ramblers?

I think "Ideating" is the proper jargon here ;-)

To Summarize

I think it will be hard to change the core expression that creates
resources - i.e.

    notify { hi : ...}

and then we are back to where I started:
- we can play tricks with the title (using a literal default there)
- we can generalize the LHS, since {: is an operator - i.e.
differentiate between the LHS being a name or a type (notify vs Notify),
being a resource-set (say from a query like Notify <| |>), or indeed any
expression such as a variable reference. The main problem here is being
able to infer the correct type (when that is not possible we end up with
late evaluation errors if there are mistakes, and they are hard to deal
with), so we may want to restrict the type of expression to those where
the type is easily inferred.

:-/


Regards, David


I am hacking on ideas to fix the problematic constructs in the grammar. I have had some success, but it is too early to write about. I will come back when I know more.

- henrik

--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/

--
You received this message because you are subscribed to the Google Groups "Puppet 
Developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/puppet-dev/lq9fu5%24e1s%241%40ger.gmane.org.