I have a long and rambling response written in several installments between meetings - so I apologize if it is not completely consistent...

To summarize the proposal for 4x (starting out with the cleanest most strict to see what this means in practice):

1. unqualified variable references are "what can be seen in this evaluation scope, outer evaluation scopes, and then global"
2. qualified variable references are absolute
3. definitions of classes and defines with relative names are named relative to the namespace they are in
4. resolution of qualified names used as references (for include etc) is absolute

The rationale for 2, 3, 4 is that this is faster, and most users seem not to trust relative name resolution and throw in :: everywhere (for both sanity and speed). For those who actually understand how it currently works and actually want relative name resolution, it does mean a few more characters to type with a small loss in ease of refactoring (moving a class means it can resolve against other classes than before - both a blessing and a curse).

We are contemplating having an alias feature.

I then ramble on about scope and try to answer questions...

Regards
- henrik

On 2014-14-03 23:07, John Bollinger wrote:


On Thursday, March 13, 2014 7:19:22 PM UTC-5, henrik lindberg wrote:


    We also have to decide if any of the relative name-space functionality
    should remain (i.e. reference to x::y is relative to potentially a
    series of
    other name spaces ("dynamic scoping"), or if it is always a global
    reference when it is qualified.



Can you choose a different term than "dynamic scoping" for what you're
describing there?  It's not consistent with other uses of that term with
which I am familiar, and historically "dynamic scoping" has meant
something different in Puppet.

Not quite sure what the correct term is in current Puppet - I mean "the various ways current puppet resolves a name such as x::y".

    The implementation idea we have in mind is that there is one global
    scope where all "qualified variables" are found/can be resolved, and
    that all other variables are in local scopes that nest. (Local scopes
    include ephemeral scopes for match variables).

    Given the numbers from measuring the read ratio, we (sort of already
    know, but still need to measure) need a fast route from any scope to
    the
    global - we know that a qualified variable is never resolved by any
    local scope so we can go straight to the global scope. (This way
    we do not have to traverse the chain up to the "parent most" scope (the
    global one).



I think that's fine, as long as it's consistent, but it has the
potential to present oddities.  For example, in the body of class m, can
one declare class ::m::a via its unqualified name (e.g. "include 'a'")?
If so, then should one not also from the same scope be able to refer to
the variables of ::m::a via relative names ($a::foo)?

There are several concepts at play:
* the name given to a class or user defined resource type
* the loading of it
* referencing a variable

Class naming

Currently a class or define gets a name in the name space where it is defined - its (possibly qualified) name is appended to the name of the name space where it is defined. Thus:

class a {
  class b {
  }
  class x::y {
  }
}

Creates the three classes ::a, ::a::b, and ::a::x::y

This construct is exactly the same as if they were defined like this:

class a {
}
class a::b {
}
class a::x::y {
}

The fact that a class is defined inside of another does not give it any special privileges (reading private content inside the class it is defined in, etc.). This is a naming operation only.
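
As a rough illustration of the naming rule above (a sketch only; the helper name defined_name is made up for this example and is not Puppet's implementation, and treating a leading :: as "already absolute" is an assumption):

```ruby
# Hypothetical helper: returns the absolute name a class or define gets
# when it is defined inside the namespace `enclosing`.
# An empty string stands for the top level.
def defined_name(enclosing, name)
  # Assumption: a name that already starts with :: is kept as-is.
  return name if name.start_with?('::')
  enclosing.empty? ? "::#{name}" : "#{enclosing}::#{name}"
end

puts defined_name('', 'a')        # ::a
puts defined_name('::a', 'b')     # ::a::b
puts defined_name('::a', 'x::y')  # ::a::x::y
```

Nothing but string concatenation happens here, which is the point: nesting only affects the name.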

Likewise, when a class is included, this inclusion is in some arbitrary namespace and it currently searches for the relative name. The suggestion is to not do this, and instead require a fully qualified (absolute) name.

include a
include b::a
include a::x::y


Sidebar:
| (I use the term "define" to mean "defining what a named entity is" as
| opposed to the term "declare" which is a term that denotes a
| definition of the existence of something (typically having some given
| type) - e.g. "int c" in the C language, which declares c, but does
| not define it.)
|
| (Just saying since the use of define / declare may confuse someone)


I see two main reasonable alternatives:

  * class names are always treated as absolute.  Class ::m can declare
    class ::m::a only via its qualified name, and $a::foo is always
    equivalent to $::a::foo.
If you mean that class ::m::a can be defined inside of class ::m, we do that either by:

a) as today, class gets its (relative) name concatenated on to the
   containing class' name, otherwise its absolute name.
b) the name is always absolute, a class a {} inside a class m {} gets
   the fully qualified name ::a
c) enforce that they are always named with a starting :: to be able to
   flag down all relative names.
d) forbid that a nested class is given an absolute name

Of these I prefer a) since it causes the least breakage and surprise.

(Side note, the idea that nesting classes should not be allowed has
been raised as well - to further break the illusion that they have
some privileged relation to each other - they are not "inner classes"
as in Java or anything like that, they are not protected/private in
any way - they are just named after where they are defined).

  * class names can be expressed absolutely or relative to the innermost
    enclosing class scope (~ the current namespace), only, both for
    class declaration and for variable lookup.  Class ::m can declare
    class ::m::a via its unqualified name, and can refer to the
    variables of ::m::a via relative names ($a::var).

This is like a) for naming the class, but keeps relative resolution of references. Maybe it is a really bad idea to remove this ability - but it is what opens up the can of worms... Is it also relative to the name space of the super class? Is it relative to any outer name space? To any outer name space of an inherited class?

Either approach provides consistency in that any way it is permissible
to refer to a class itself, it is also permissible to refer to that
class's variables by appending '::varname'.  Note that the latter does
not require traversing the chain of enclosing scopes, nor looking up
names directly in any local scope; rather, it could be implemented as
maximum two lookups against the global scope.


Well, it is not really possible to refer to a class with a variable - it does not evaluate to an instance of a class; it may evaluate to a variable in another namespace though... (this is also confusing)

What the proposed (strict) rules mean:

class a {
  class b {
    $x = 1
  }
  class c inherits b {
    $y = $x + 10
  }
}

The resolution of $x will look up $x in the local scope representing the class a::c, fail, and then look in its parent scope representing a::b, and find x there.

If instead

  class c inherits b {
    $y = $b::x + 10
  }

was used, it would immediately go to the global scope and resolve b::x and find the value 10.

Now, if we instead treat b::x as a relative reference to the name space it is used in - then it may be a reference to:
* ::a::c::x
* ::a::b::x
* ::b::x

What if b also inherits? What if the namespaces are more deeply nested?

$x = 0
class aa {
  $x = 1
  class a {
    $x = 2
    class b {
    }
    class c inherits b {
      $y = $x
      $z = $b::x
    }
  }
}
class b {
    $x = 4
}

What is $aa::a::c::y and $aa::a::c::z ?

In the proposal, the $y evaluates to 0 (there is no x in c, nor in b, they do not see into aa, and can not see into aa::b). And $z evaluates to :undef, since there is no x in b.

In 3.x the value of $aa::a::c::z becomes 0, since when it reaches class aa::a::b and it does not have an x, then x resolves the global x. (Jīng! <- Chinese Surprise).

With relative naming the search is done in this order

* aa::a::c::b::x
* aa::a::b::x
* aa::b::x
* b::x

If we remove the 3x surprising behavior of resolving to the global x when there is no x in b (by setting $x in b), and move the b class around between the various namespaces, it is possible to verify that it searches in the order above.

We can do that in 4x as well (sans the Jīng!) if we come to the conclusion that that would be the best. (i.e. worst case 4 hash lookups for a 3 level nesting of names). We cannot really optimize this - the names have to be tried in that given order.
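
To make the search order concrete, here is a small sketch (the helper names relative_candidates and resolve_relative and the plain-hash index are assumptions made for illustration, not how 3x is actually implemented). It generates the candidate names innermost first and tries them in that order:

```ruby
# Hypothetical helper: candidate absolute names for a relative reference
# `name` used inside `namespace`, innermost namespace first.
def relative_candidates(namespace, name)
  segments = namespace.split('::')
  candidates = []
  segments.length.downto(0) do |n|
    candidates << (segments.first(n) + [name]).join('::')
  end
  candidates
end

# Tries each candidate in order against a hash of absolute names to values.
def resolve_relative(index, namespace, name)
  relative_candidates(namespace, name).each do |candidate|
    return index[candidate] if index.key?(candidate)
  end
  nil
end

puts relative_candidates('aa::a::c', 'b::x')
# aa::a::c::b::x
# aa::a::b::x
# aa::b::x
# b::x
```

For a reference made three levels deep this is the worst case of four hash lookups mentioned above.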

Making it strict means that there is only one lookup, but the c class would have to be written like this:

    class c inherits b {
      $y = $x
      $z = $aa::a::b::x
    }

if we insist on making a qualified reference to the x in b (a plain $x gets the same result).

We could make the inherited class have special status - and thus resolve against it - but not sure if it is worth doing this.

I expect that we will retain the ability to refer to variables via their
unqualified names within some nest of scopes related to where they are
declared (e.g. up to the innermost named (class or resource) scope).
Given, then, that that form of relative name lookup will be supported, I
think generalizing that to classes and resources as well (second
alternative) bears serious consideration.

There is also the ability to reference a class and access its attributes via the Class type. This way, it is totally clear what the resolution is, and what the names are relative to. e.g.

 $b = Class[some::class::somewhere]
 $b[x]

and if this is done in a class, and you don't want the $b to be visible

 private $b = Class[...]

This way there is no guessing what a relative name may mean. (In essence, relative names are only (optionally) used when defining classes and user defined resource types.)

On the other hand, those who have commented in the past seem to agree
that Puppet's historic behavior of traversing the full chain of nested
scopes, trying to resolve relative names with respect to each, is more
surprising than useful.  I'm on board with that; I'm just suggesting
that there may be both room and use for a more limited form of relative
naming.

I am struggling with the balance of being useful, not having to type too much, and ease of refactoring with sanity and performance... this discussion is very valuable.

I like the simplicity of "an unqualified variable = what I see here", and "a qualified variable = an absolute reference".


    Local scopes are always local, there is no way to address
    the local variables from some other non-nested scope - essentially how
    the regular CPU stack works, or how variables in a language like C
    work).

    i.e. we have something like this in Scope

    class Scope
        attr_reader :global_scope
        attr_reader :parent_scope
        # ...
    end


    The global scope keeps an index designed to be as fast as possible to
    resolve a qualified name to a value. The design of this index
    depends on
    the frequency of different types of lookup. If all qualified lookups
    are
    absolute it would simply be a hash of all absolute names to values (it
    really cannot be faster than that).

    The logic for lookup then becomes:
    - for un-qualified name, search up the parent chain (this chain does
    not
    reach the global scope), if still unresolved, look in global scope.



From the description alone, I'm not sure how it can be asserted that
the chain of local scopes does not reach global scope, unless by the
trivial fact that the global scope is not itself a local scope.  What I
would hope to see, and perhaps what is meant, is that the lookup stops
at local scopes that correspond to classes and resources.  In
particular, I think it is essential that unqualified class name lookups
not be resolved against parent namespaces.


Nested ("local") scopes only contain unqualified names, and an inner scope shadows an outer scope (there are a few additional rules for restricted names such as $trusted and $facts, which may not be shadowed in any scope). Qualified names (for variables) can only be created in classes and these are only the public attributes of those classes. No local (shadowing) scope places this "global scope" as an outer scope.

$x = 10
class a {
  $x = 20
  $y = $x
  $z = $::x
}

Here the variables $a::x == 20, $a::y == 20, and $a::z == 10

The $::x is not found in an outer scope of the scope used to evaluate the logic inside of class a.

A local scope dies when the evaluation using that scope - eh - goes out of scope. The persisted values are kept in the global-scope index (and in the instantiated classes and created resources).
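
A minimal sketch of the lookup logic described above, assuming a global index that is just a hash (the class names GlobalScope and LocalScope and the methods set/lookup are made up for this example; the real Scope API may look quite different):

```ruby
# Hypothetical global scope: a flat index of absolute names to values.
class GlobalScope
  def initialize
    @index = {} # absolute name (stored without the leading ::) => value
  end

  def set(name, value)
    @index[name.sub(/\A::/, '')] = value
  end

  def lookup(name)
    @index[name.sub(/\A::/, '')]
  end
end

# Hypothetical local scope: holds unqualified names only, nests via
# parent_scope, and has a direct route to the global scope.
class LocalScope
  attr_reader :global_scope, :parent_scope, :vars

  def initialize(global_scope, parent_scope = nil)
    @global_scope = global_scope
    @parent_scope = parent_scope
    @vars = {}
  end

  def set(name, value)
    @vars[name] = value
  end

  def lookup(name)
    # Qualified names never resolve locally; go straight to global.
    return global_scope.lookup(name) if name.include?('::')

    # Unqualified names: walk the local chain, then fall back to global.
    scope = self
    while scope
      return scope.vars[name] if scope.vars.key?(name)
      scope = scope.parent_scope
    end
    global_scope.lookup(name)
  end
end

# The class a example: $x = 10 at top level, $x = 20 inside class a.
global = GlobalScope.new
global.set('x', 10)
class_a = LocalScope.new(global)
class_a.set('x', 20)
class_a.set('y', class_a.lookup('x'))
class_a.set('z', class_a.lookup('::x'))
puts class_a.lookup('y') # 20
puts class_a.lookup('z') # 10
```

The local chain never contains the global scope; it is only consulted as a last step, or immediately for a qualified name.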

That is, in class ::m::a::b, "include 'foo'" must not refer to
::m::a::foo, and certainly not to ::m::foo, but I'd be ok if it could
refer to ::m::a::b::foo.  As a special (but important) case, in
::m::a::b, "include 'b'" must not refer to ::m::a::b itself, and
"include 'a'" should not refer to ::m::a.

I think (but am not 100% sure) that it would be best to have to qualify the name - i.e.

  include foo    # is include ::foo
  include x::y   # is include ::x::y

Other languages have solved the same issue in different ways:

* Ruby is obviously very flexible in how it searches (it also makes it
  slow), and sometimes (just like in Puppet) it is mysterious why it
  works or not in some cases.
* Java uses an import to import a name which can then be used in short
  form, nested classes can be relatively referenced.
* Some Java like (new) languages use an import/alias mechanism

If we go down that path, these name imports would appear at the start of the file and apply to the content of that file - i.e. it is a help to the *parser* to construct the correct code (there is no searching at runtime). Now sadly, import is a function that has just been deprecated in the Puppet Programming Language and reintroducing it with a different meaning would just be a cruel joke... If instead we want to be able to alias names, maybe we could use "alias"

alias apache = mystuff::better_apache::apache

To support an alias like that, the only reasonable thing a parser could do is to replace every "apache" in every qualified name with the alias - i.e. apache::foo becomes mystuff::better_apache::apache::foo

A powerful mechanism to reduce typing - but one that is also tricky if we support more than a first pass of alias replacements, multi-segment aliases etc. (A sane impl. could perhaps only perform the replacement of the first segment, and require that an alias cannot be qualified itself.)
(An alias could also be set to ::)
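
A sketch of what the "sane" variant could look like at parse time, assuming a single pass that only replaces the first segment of a relative name (expand_alias and the aliases hash are hypothetical; no such feature exists in Puppet today):

```ruby
# Hypothetical parse-time alias expansion: replaces only the first
# segment of a name, once, and leaves everything else untouched.
def expand_alias(name, aliases)
  first, *rest = name.split('::')
  target = aliases[first]
  return name unless target
  ([target] + rest).join('::')
end

aliases = { 'apache' => 'mystuff::better_apache::apache' }
puts expand_alias('apache::foo', aliases)   # mystuff::better_apache::apache::foo
puts expand_alias('nginx::foo', aliases)    # nginx::foo
puts expand_alias('::apache::foo', aliases) # ::apache::foo
```

A name that starts with :: has an empty first segment, so in this sketch absolute names are never rewritten.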

I am not sure I want to see aliases like these in the language. Sometimes a bit more typing is good for (esp. the future) you.

We have a problem with referencing a class directly with a variable since we can do this

  class a {
    $b = { x => jing }
    class b {
      $x = 10
    }
    notice $b::x
    notice $b
  }

$b is not a reference to the class, but in $b::x it is (this is kind of confusing).

Super

Yet another way of handling resolutions is to add a super (reserved) namespace word that resolves to the superclass. It would function as an (absolute) reference to the superclass and mean "give me a variable as the superclass sees it" (given that the class is allowed to see it). e.g.

  class c inherits b {
    $z = $super::x
  }

But I am not sure that throwing yet another object oriented term into the non object oriented puppet casserole makes it any sweeter...

I'm going to try to digest some more of this over the weekend.  Perhaps
I'll have more to say on Monday.

I can imagine having a hangout on this topic as well...

Such as about scoping function names
so that different environments can bind different implementations to the
same name, maybe.


There will be support for scoping function names. i.e. you can call

mymodule::foo(x)

All such references are currently (albeit still at the idea state) absolute names - no shadowing, and no "local functions" are planned.

Aliasing is being contemplated, which means it is possible to alias certain functions.

alias foo = mymodule::foo

Which would make all calls to foo() go to mymodule::foo() in the
.pp file having that alias at the top.


