Hi Henrik,

On 2014-06-28 16:54, Henrik Lindberg wrote:
Hi,
We (the server-team) have started to dig into the area of Resource
Defaults and Collection in our work on 3.7's future parser/evaluator
(and what will become Puppet 4.0).

Thanks for taking up this topic. As I'm a heavy user of export/collect, Defaults were always taboo for my modules, for many of the reasons you list below.

We believe that there is an opportunity to improve on how Puppet now
works, and make it less confusing. We are not yet completely sure about
edge use-cases and the implications of what we would like to do - so we
would very much like to get your feedback.

Puppet currently allows setting defaults for resources - e.g.

     File { mode => '0666' }

* These can also be set for user defined types.

* It is possible to make multiple such statements for the same resource
type, but additional statements may not set the same attributes

* When a default statement is evaluated, it is registered in its closure
scope (i.e. where it is located in source). e.g. a resource created in
Class a, gets defaults that are visible in the scope of Class a.

* All resources defined in that scope will potentially be affected by
those defaults (i.e. if they are found to have missing values at the end
of the compilation).

* At the end of the compilation, all resources will be processed and any
missing values are set from the registered defaults (from the
perspective of the resource's containment scope, i.e. where it was
defined).
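
A minimal sketch of the scoping described in the list above. The class names and paths are made up for illustration, and the comments state the expected outcome under the behaviour described here, not a verified result for a specific Puppet version.

```puppet
class a {
  # Default registered in the scope of class a.
  File { mode => '0644' }

  # No explicit mode: expected to pick up '0644' from the default
  # when defaults are applied at the end of compilation.
  file { '/tmp/in_a': ensure => file }

  # An explicit value takes precedence over the default.
  file { '/tmp/in_a_explicit':
    ensure => file,
    mode   => '0600',
  }
}

class b {
  # Declared from top scope below, so the default registered in
  # class a is not expected to apply to this resource.
  file { '/tmp/in_b': ensure => file }
}

include a
include b
```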

What are the problems with this?
---

* Defaults are applied after collection, thus if you try to use +> to
add to a virtual resource's attribute, the resource value(s) for the
attribute you are trying to append to will not include the defaults. As
you are also setting/overriding the values with the collection, the
default values are not included in the values, and you have to repeat
them (if they should be included).

* You cannot query (or so we suspect - we have to investigate to be
sure) for values that will be set via the defaults, so the collection
will miss the resources that later (after collection is completed) will
be given a value that would have made it included in the collection.

* It is possible to reopen classes, and introduce additional default
expressions after resources have been instantiated (but not evaluated).
This affects the attribute values the resources will end up having in
the catalog. (At some point later, we will make it an error to reopen a
class, but we don't think we will be able to fix this in time for 4.0).

* Evaluation of Collection Expressions is intertwined with resource
evaluation (same priority in the lazy evaluation queue), and it is
possible to play tricks with virtual resources, defaults and collections
that will (even if there is an attempt at considering potential defaults
when querying) never be able to consider such information that arrived
after the fact.

* Collection can be an operand in a dependency, and thus be evaluated
even later than the application of defaults.

Or, in other words: it is really confusing.
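
To make the first problem concrete, here is a small sketch of the `+>` case. The user name and group names are hypothetical, and the outcome in the comments follows from the description above rather than from a test run.

```puppet
# Default intended to apply to every user resource in this scope.
User { groups => ['users'] }

# Virtual resource that does not set groups itself.
@user { 'alice': ensure => present }

# Realize the resource and try to append to groups. At this point the
# default has not been applied yet, so there is nothing to append to.
User <| title == 'alice' |> { groups +> ['admin'] }

# Expected under the current (lazy) behaviour: groups ends up as ['admin'],
# because the attribute is already set when defaults are finally applied.
# To keep 'users', it has to be repeated in the collector override.
```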

What we want to do:
---

* Make application of defaults eager so that when a resource is
instantiated, it will immediately get the registered and visible
defaults (for missing attributes), at *that* point of time in the
evaluation. This means that defaults become imperative (like variable
assignment).

Do I understand this correctly, that after

  File { mode => 0664 }
  file { "/tmp/foo":; }
  File { owner => bin }
  file { "/tmp/bar": mode => 0755; }

the actual resources will look like

  file {
    # receives mode from first, but not owner from second
    "/tmp/foo": mode => 0664, owner => root;
    # locally defines mode, but gets owner from second
    "/tmp/bar": mode => 0755, owner => bin;
  }

and later trying to override any of those attributes (e.g. in an inheriting class) will fail?

* When resource defaults are applied eagerly they are also available
when doing collection (instantiation is naturally done before collection
- otherwise there is nothing to collect).

There should be no difference between

  File { mode => 0644 }
  File <| title == 'example' |>

and

  File <| title == 'example' |> { mode => 0644 }

except, perhaps, that the latter could raise an error when mode is already set on the @file?

* There are edge cases where collection can be made at a point where all
resources that the query matches (will match) have not yet been created
(i.e. if the virtual resources are instantiated after the query is
evaluated). There seems to be logic that tries to find everything
(lazily at the end), but it is uncertain that it actually works
correctly in all cases. What we propose for the defaults has no effect
on this, it only changes what the resource attributes are when collecting.

Seems obvious when applying defaults immediately.


* We want to change the function 'defined' to return the state of the
resource instead of just a boolean. Currently the concept of being
"defined" is vague, the function returns true for all kinds of resources
(virtual and exported) as well as those that are included in the
catalog. We want to return some kind of status (that is "truthy" to make
it backwards compatible), but that also tells if the resource is
actually realized (in the catalog). (We agree that using the defined
function is bad practice, but it is required to support certain
use-cases that would otherwise be very painful/impossible to handle).

I think the most common use for "defined" is "is this resource already part of the catalog up until now". I'm not totally sure why you want to burden the Defaults story with *this* can of worms, but I think it might be clearer to deprecate defined in its current form and replace it with specific functions that inspect the current compilation state.

* The File[foo][mode] syntax for lookup of a resource attribute will be
more sane, as it will correctly produce the value at that point in time
in the evaluation. It is either the value (explicit or default) that
will end up in the catalog for a realized resource, or the explicit or
default value of an unrealized virtual resource that *may* get a
different value when it is later realized (if realized at all). Thus,
with a combination of being able to know if a resource is realized or
not, it is possible to also be certain that the value is the value that
will end up in the catalog.
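
A short sketch of the attribute lookup discussed above. It assumes the future parser is enabled; the path and variable name are made up for illustration.

```puppet
file { '/tmp/foo':
  ensure => file,
  mode   => '0644',
}

# Look up the attribute value as it stands at this point in the evaluation.
# With eager defaults, a default-supplied value would be visible here too.
$foo_mode = File['/tmp/foo']['mode']

notice("mode of /tmp/foo is ${foo_mode}")
```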

Even if it's not realized yet, the value can change only if the current value is undef, yes? In that case, I'd start by making accesses to undef values on not-yet-realized resources invalid. Perhaps even detect when an evaluation-order-dependent value is set *after* it has already been accessed.

* In case you wonder, the ability to lookup an attribute is available
server side, when the catalog is produced. Later default values set by
the agent cannot (for obvious reasons) be computed when the catalog
is being compiled.

When/how does the agent apply defaults to resources? Did you mean "the state the agent reads from the running system"?

We believe that the construct with lazy default application was required
when dynamic scoping was available, but we are honestly not sure about
the rationale behind the current design. There are plenty of
logged issues in Redmine regarding defaults, and collection, but that
did not help us much as they are for a variety of versions, and for when
dynamic scoping was available (plus a number of other issues that were not
fixed), so it is very hard to tell which of the old issues are
still relevant.

If you are doing advanced things with virtual resources, or advanced
composition of defaults where you explicitly depend on the evaluation
order, etc. to get the default values you want we would like to hear
from you, or naturally if you have input on any of the above. Also if
you have logged / or know details of relevant issues that can still be
replicated, we also want to hear from you.

Another thing I'm avoiding, because it didn't work when I started with export/collect, which you might want to look into:

  define hell($content = $::hostname) {
    file { "/tmp/${name}": content => $content; }
  }

  node a {
    @@hell { $::hostname: }
  }

  node b {
    Hell<<||>>
  }

will create /tmp/a with the content "b", although I would have expected the content to be "a" too.

I apologize if the explanations above are short and that there are no
examples. I am happy to write such to illustrate, but I felt that would
only make a long post even longer, and it would be better to show
examples to answer questions.

If there is interest in participating in a hangout to discuss these
matters, please ping, and we will set one up.

I don't see the immediate need for one, but would try to attend if invited.

Thanks for your time and work!

Regards, David
