[Puppet-dev] Re: The 4x scope

Henrik Lindberg Fri, 14 Mar 2014 20:49:26 -0700

I have a long and rambling response written in several installments between meetings - so I apologize if it is not completely consistent...

To summarize the proposal for 4x (starting out with the cleanest, most strict to see what this means in practice):

1. unqualified variable references are "what can be seen in this evaluation scope, outer evaluation scopes, and then global"

2. qualified variable references are absolute

3. definitions of classes and defines with relative names are named relative to the namespace they are in

4. resolution of qualified names used as references (for include etc) is absolute


The rationale for 2, 3, 4 is that this is faster, and most users seem to not trust relative name resolution and throw in :: everywhere (for both sanity and speed). For those that actually understand how it currently works and actually want relative name resolution it does mean a few more characters to type with a small loss in ease of refactoring (moving a class means it can resolve against other classes than before - both a blessing and a curse).


We are contemplating having an alias feature.

I then ramble on about scope and try to answer questions...

Regards
- henrik

On 2014-14-03 23:07, John Bollinger wrote:



On Thursday, March 13, 2014 7:19:22 PM UTC-5, henrik lindberg wrote:


    We also have to decide if any of the relative name-space functionality
    should remain (i.e. reference to x::y is relative to potentially a
    series of
    other name spaces ("dynamic scoping"), or if it is always a global
    reference when it is qualified.



Can you choose a different term than "dynamic scoping" for what you're
describing there?  It's not consistent with other uses of that term with
which I am familiar, and historically "dynamic scoping" has meant
something different in Puppet.

Not quite sure what the correct term is in current Puppet - I mean "the various ways current puppet resolves a name such as x::y".

    The implementation idea we have in mind is that there is one global
    scope where all "qualified variables" are found/can be resolved, and
    that all other variables are in local scopes that nest. (Local scopes
    include ephemeral scopes for match variables).

    Given the numbers from measuring the read ratio, we (sort of already
    know, but still need to measure) need a fast route from any scope to
    the
    global - we know that a qualified variable is never resolved by any
    local scope so we can go straight to the global scope. (This way
    we do not have to traverse the chain up to the "parent most" scope (the
    global one).



I think that's fine, as long as it's consistent, but it has the
potential to present oddities.  For example, in the body of class m, can
one declare class ::m::a via its unqualified name (e.g. "include 'a'")?
If so, then should one not also from the same scope be able to refer to
the variables of ::m::a via relative names ($a::foo)?

There are several concepts at play:
* the name given to a class or user defined resource type
* the loading of it
* referencing a variable

Class naming

Currently a class or define gets a name in the name space where it is defined - its (possibly qualified) name is appended to the name of the namespace where it is defined. Thus:


class a {
  class b {
  }
  class x::y {
  }
}

Creates the three classes ::a, ::a::b, and ::a::x::y

This construct is exactly the same as if they were defined like this:

class a {
}
class a::b {
}
class a::x::y {
}

The fact that a class is defined inside of another does not give it any special privileges (reading private content inside the class it is defined in, etc.). This is a naming operation only.
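
To make the naming rule concrete, here is a minimal Ruby sketch. It is not Puppet's actual implementation, and the helper name qualified_class_name is made up for illustration. It only shows that the (possibly qualified) name of a class is appended to the namespace it is defined in:

```ruby
# Hypothetical helper, for illustration only: computes the absolute name a
# class gets when it is defined with the (possibly qualified) `name` inside
# `namespace`. An empty namespace means the definition is at top level.
def qualified_class_name(namespace, name)
  segments = namespace.split('::') + name.split('::')
  '::' + segments.reject(&:empty?).join('::')
end

puts qualified_class_name('', 'a')      # class a at top level
puts qualified_class_name('a', 'b')     # class b inside class a
puts qualified_class_name('a', 'x::y')  # class x::y inside class a
```

This prints ::a, ::a::b and ::a::x::y, the same three names as in the example above.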

Likewise, when a class is included, this inclusion is in some arbitrary namespace and it currently searches for the relative name. The suggestion is to not do this, and instead require a fully qualified (absolute) name.


include a
include a::b
include a::x::y


Sidebar:

| (I use the term "define" to mean "defining what a named entity is" as
| opposed to the term "declare" which is a term that denotes a
| definition of the existence of something (typically having some given
| type) - e.g. "extern int c" in the C language, which declares c, but
| does not define it.)
|
| (Just saying since the use of define / declare may confuse someone)

I see two main reasonable alternatives:

  * class names are always treated as absolute.  Class ::m can declare
    class ::m::a only via its qualified name, and $a::foo is always
    equivalent to $::a::foo.

If you mean that class ::m::a can be defined inside of class ::m, we do that either by:


a) as today, class gets its (relative) name concatenated on to the
   containing class' name, otherwise its absolute name.
b) the name is always absolute, a class a {} inside a class m {} gets
   the fully qualified name ::a
c) enforce that they are always named with a starting :: to be able to
   flag down all relative names.
d) forbid that a nested class is given an absolute name

Of these I prefer a) since it causes the least breakage and surprise.

(Side note, the idea that nesting classes should not be allowed has been raised as well - to further break the illusion that they have some privileged relation to each other - they are not "inner classes" as in Java or anything like that, they are not protected/private in any way - they are just named after where they are defined).

  * class names can be expressed absolutely or relative to the innermost
    enclosing class scope (~ the current namespace), only, both for
    class declaration and for variable lookup.  Class ::m can declare
    class ::m::a via its unqualified name, and can refer to the
    variables of ::m::a via relative names ($a::var).

This is like a) for naming the class, but keeps relative resolution of references. Maybe it is a really bad idea to remove this ability - but it is what opens up the can of worms... is it also relative to the name space of the super class? is it relative to any outer name space? to any outer name space of an inherited class?

Either approach provides consistency in that any way it is permissible
to refer to a class itself, it is also permissible to refer to that
class's variables by appending '::varname'.  Note that the latter does
not require traversing the chain of enclosing scopes, nor looking up
names directly in any local scope; rather, it could be implemented as
maximum two lookups against the global scope.


Well, it is not really possible to refer to a class with a variable - it does not evaluate to an instance of class, it may evaluate to a variable in another namespace though... (this is also confusing)


What the proposed (strict) rules mean:

class a {
  class b {
    $x = 1
  }
  class c inherits b {
    $y = $x + 10
  }
}

The resolution of $x will look up $x in the local scope representing the class a::c, fail, and then look in its parent scope representing a::b, and there find x.


If instead

  class c inherits b {
    $y = $b::x + 10
  }

was used, it would immediately go to the global scope and resolve b::x and find the value 10.

Now, if we instead treat b::x as a relative reference to the name space it is used in - then it may be a reference to:

* ::a::c::x
* ::a::b::x
* ::b::x

What if b also inherits? What if the namespaces are more deeply nested?

$x = 0
class aa {
  $x = 1
  class a {
    $x = 2
    class b {
    }
    class c inherits b {
      $y = $x
      $z = $b::x
    }
  }
}
class b {
    $x = 4
}

What is $aa::a::c::y and $aa::a::c::z ?

In the proposal, the $y evaluates to 0 (there is no x in c, nor in b, they do not see into aa, and can not see into aa::b). And $z evaluates to :undef, since there is no x in b.

In 3.x the value of $aa::a::c::z becomes 0, since when it reaches class aa::a::b and it does not have an x, then x resolves to the global x. (Jīng! <- Chinese Surprise).


With relative naming the search is done in this order

* aa::a::c::b::x
* aa::a::b::x
* aa::b::x
* b::x

If we remove the surprising 3x behavior of resolving to the global x when there is no x in b (by setting $x in b), and move the b class around between the various namespaces, it is possible to verify that it searches in the order above.

We can do that in 4x as well (sans the Jīng!) if we come to the conclusion that that would be the best (i.e. worst case 4 hash lookups for a 3 level nesting of names). We cannot really optimize this - the names have to be tried in that given order.
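
The search order above can be sketched in Ruby as follows. This is only an illustration of the relative naming rule as described here, not code from Puppet; the helper names relative_candidates and resolve_relative, and the flat hash standing in for the global index, are assumptions made for the example:

```ruby
# Hypothetical helper: lists the absolute names to try, in order, when the
# relative reference `ref` is used inside `namespace`.
def relative_candidates(namespace, ref)
  segments = namespace.split('::').reject(&:empty?)
  # Try the innermost namespace first, then strip one segment at a time,
  # ending with the reference taken as a top-level name.
  segments.length.downto(0).map do |n|
    (segments.first(n) + [ref]).join('::')
  end
end

# Hypothetical helper: returns the value of the first candidate present in
# `index` (a plain hash of absolute names to values), or nil if none match.
def resolve_relative(index, namespace, ref)
  name = relative_candidates(namespace, ref).find { |candidate| index.key?(candidate) }
  name && index[name]
end

puts relative_candidates('aa::a::c', 'b::x')
```

For a reference to b::x made in aa::a::c this lists the four names given above, in the same order, which is where the worst case of 4 hash lookups for a 3 level nesting comes from.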

Making it strict means that there is only one lookup, but the c class would have to be written like this:


    class c inherits b {
      $y = $x
      $z = $aa::a::b::x
    }

if we insist on making a qualified reference to the x in b (a $x gets the same result).

We could make the inherited class have special status - and thus resolve against it - but I am not sure if it is worth doing this.

I expect that we will retain the ability to refer to variables via their
unqualified names within some nest of scopes related to where they are
declared (e.g. up to the innermost named (class or resource) scope).
Given, then, that that form of relative name lookup will be supported, I
think generalizing that to classes and resources as well (second
alternative) bears serious consideration.

There is also the ability to reference a class and access its attributes via the Class type. This way, it is totally clear what the resolution is, and what the names are relative to. e.g.


 $b = Class[some::class::somewhere]
 $b[x]

and if this is done in a class, and you don't want the $b to be visible

 private $b = Class[...]

This way there is no guessing what a relative name may mean. (In essence relative names are only (optionally) used when defining classes and user defined resource types.)

On the other hand, those who have commented in the past seem to agree
that Puppet's historic behavior of traversing the full chain of nested
scopes, trying to resolve relative names with respect to each, is more
surprising than useful.  I'm on board with that; I'm just suggesting
that there may be both room and use for a more limited form of relative
naming.

I am struggling with the balance of being useful, not having to type too much, and ease of refactoring with sanity and performance... this discussion is very valuable.

I like the simplicity of "an unqualified variable = what I see here", and "a qualified variable = an absolute reference".


    Local scopes are always local, there is no way to address
    the local variables from some other non-nested scope - essentially how
    the regular CPU stack works, or how variables in a language like C
    work).

    i.e. we have something like this in Scope

    class Scope
        attr_reader :global_scope
        attr_reader :parent_scope
        # ...
    end


    The global scope keeps an index designed to be as fast as possible to
    resolve a qualified name to a value. The design of this index
    depends on
    the frequency of different types of lookup. If all qualified lookups
    are
    absolute it would simply be a hash of all absolute names to values (it
    really cannot be faster than that).

    The logic for lookup then becomes:
    - for un-qualified name, search up the parent chain (this chain does
    not
    reach the global scope), if still unresolved, look in global scope.



 From the description alone, I'm not sure how it can be asserted that
the chain of local scopes does not reach global scope, unless by the
trivial fact that the global scope is not itself a local scope.  What I
would hope to see, and perhaps what is meant, is that the lookup stops
at local scopes that correspond to classes and resources.  In
particular, I think it is essential that unqualified class name lookups
not be resolved against parent namespaces.

Nested ("local") scopes only contain unqualified names, and an inner scope shadows an outer scope (there are a few additional rules for restricted names such as $trusted, and $facts which may not be shadowed in any scope). Qualified names (for variables) can only be created in classes and these are only the public attributes of those classes. No local (shadowing) scope places this "global scope" as an outer scope.


$x = 10
class a {
  $x = 20
  $y = $x
  $z = $::x
}

Here the variables $a::x == 20, $a::y == 20, and $a::z == 10

The $::x is not found in an outer scope of the scope used to evaluate the logic inside of class a.

The local scopes die when evaluation using that scope - eh. goes out of scope. The persisted values are kept in the global-scope index (and in the instantiated classes and created resources).
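
As a rough illustration of the lookup rules, here is a stripped-down Ruby model. It is a sketch under stated assumptions and not Puppet's real Scope class: the class names GlobalScope and LocalScope, the set method, and the use of a plain hash as the global index are all made up for the example.

```ruby
# Illustrative only: the global scope is modeled as one hash from absolute
# names (stored without the leading ::) to values.
class GlobalScope
  def initialize
    @index = {}
  end

  def set(name, value)
    @index[name] = value
  end

  # A single hash lookup; the leading :: is optional since qualified
  # names are always absolute in this model.
  def lookup(name)
    @index[name.sub(/\A::/, '')]
  end
end

# Illustrative only: a local scope holds unqualified names and may nest.
class LocalScope
  attr_reader :global_scope, :parent_scope

  def initialize(global_scope, parent_scope = nil)
    @global_scope = global_scope
    @parent_scope = parent_scope
    @vars = {}
  end

  def set(name, value)
    @vars[name] = value
  end

  def lookup(name)
    # Qualified names never live in a local scope: go straight to global.
    return global_scope.lookup(name) if name.include?('::')

    # Unqualified names: walk the chain of local scopes, then try global.
    scope = self
    while scope
      return scope.vars[name] if scope.vars.key?(name)
      scope = scope.parent_scope
    end
    global_scope.lookup(name)
  end

  protected

  attr_reader :vars
end

global = GlobalScope.new
global.set('x', 10)                      # $x = 10 at top level

class_a = LocalScope.new(global)         # scope used to evaluate class a
class_a.set('x', 20)                     # $x = 20
class_a.set('y', class_a.lookup('x'))    # $y = $x
class_a.set('z', class_a.lookup('::x'))  # $z = $::x

# When the local scope dies, the class attributes persist in the global index.
%w[x y z].each { |name| global.set("a::#{name}", class_a.lookup(name)) }

puts global.lookup('a::x')  # 20
puts global.lookup('a::y')  # 20
puts global.lookup('a::z')  # 10
```

Note that the chain of local scopes never contains the global scope itself; it is only consulted as a last step, or directly for qualified names.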

That is, in class ::m::a::b, "include 'foo'" must not refer to
::m::a::foo, and certainly not to ::m::foo, but I'd be ok if it could
refer to ::m::a::b::foo.  As a special (but important) case, in
::m::a::b, "include 'b'" must not refer to ::m::a::b itself, and
"include 'a'" should not refer to ::m::a.

I think (but am not 100% sure) that it would be best to have to qualify the name - i.e.


  include foo    # is include ::foo
  include x::y   # is include ::x::y

Other languages have solved the same issue in different ways:

* Ruby is obviously very flexible in how it searches (it also makes it
  slow), and sometimes (just like in Puppet) it is mysterious why it
  works or not in some cases.
* Java uses an import to import a name which can then be used in short
  form, nested classes can be relatively referenced.
* Some Java like (new) languages use an import/alias mechanism

If we go down that path, these name imports would appear at the start of the file and apply to the content of that file - i.e. it is a help to the *parser* to construct the correct code (there is no searching at runtime). Now sadly, import is a function that was just deprecated in the Puppet Programming Language and reintroducing it with a different meaning would just be a cruel joke... if instead we want to be able to alias names maybe we could use "alias"


alias apache = mystuff::better_apache::apache

To support an alias like that, the only reasonable thing a parser could do is to replace every "apache" in every qualified name with the alias - i.e. apache::foo becomes mystuff::better_apache::apache::foo

A powerful mechanism to reduce typing - but one that is also tricky if we support more than a first pass of alias replacements, multi segment aliases etc. (A sane impl. could perhaps only perform the replacement of the first segment, and require that an alias cannot be qualified itself.)

(An alias could also be set to ::)
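
A minimal Ruby sketch of the restricted ("sane") variant, where only the first segment of a name is replaced and the alias itself is an unqualified name. The helper name expand_alias and the hash of aliases are hypothetical; nothing like this exists in Puppet as far as this thread goes, and the case of an alias set to :: is not handled here:

```ruby
# Hypothetical sketch: expands an alias in `name` by replacing only its
# first segment. `aliases` maps an unqualified alias to its absolute target.
def expand_alias(aliases, name)
  first, rest = name.split('::', 2)
  target = aliases[first]
  return name unless target

  rest ? "#{target}::#{rest}" : target
end

aliases = { 'apache' => 'mystuff::better_apache::apache' }

puts expand_alias(aliases, 'apache::foo')    # mystuff::better_apache::apache::foo
puts expand_alias(aliases, 'apache')         # mystuff::better_apache::apache
puts expand_alias(aliases, 'other::apache')  # other::apache (not the first segment)
```

Since this is a single pass done by the parser, there is no searching at runtime, and a name that already starts with :: is left untouched.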

I am not sure I want to see aliases like these in the language. Sometimes a bit more typing is good for (esp. the future) you.

We have a problem with referencing a class directly with a variable since we can do this


  class a {
    $b = { x => jing }
    class b {
      $x = 10
    }
    notice $b::x
    notice $b
  }

$b is not a reference to the class, but in $b::x it is (this is kind of confusing).


Super

Yet another way of handling resolutions is to add a super (reserved) namespace word, that resolves the superclass. It would function as an (absolute) reference to the superclass and mean "give me a variable as the superclass sees it" (given the class is allowed to see it). e.g.


  class c inherits b {
    $z = $super::x
  }

But I am not sure that throwing yet another object oriented term into the non object oriented puppet casserole makes it any sweeter...

I'm going to try to digest some more of this over the weekend.  Perhaps
I'll have more to say on Monday.


I can imagine having a hangout on this topic as well...

Such as about scoping function names
so that different environments can bind different implementations to the
same name, maybe.


There will be support for scoping function names. i.e. you can call

mymodule::foo(x)

All such references are currently (albeit still at the idea stage) absolute names - no shadowing, and no "local functions" are planned.

Aliasing is being contemplated, which means it is possible to alias certain functions.


alias foo = mymodule::foo

Which would make all calls to foo() go to mymodule::foo() in the
.pp file having that alias at the top.


