[Puppet-dev] Re: The 4x scope

John Bollinger Mon, 17 Mar 2014 09:22:24 -0700


On Friday, March 14, 2014 10:48:31 PM UTC-5, henrik lindberg wrote:
>
> I have a long and rambling response written in several installments 
> between meetings - so I apologize if it is not completely consistent... 
>
> To summarize the proposal for 4x (starting out with the cleanest most 
> strict to see what this means in practice): 
>
> 1. unqualified variable references are "what can be seen in this 
> evaluation scope, outer evaluation scopes, and then global" 
> 2. qualified variable references are absolute 
> 3. definition of classes and defines with relative names are named 
> relative to the namespace it is in 
> 4. resolution of qualified names used as references (for include etc) 
> are absolute 
>
>

As far as I can tell, (1) and (3) are the same behavior as 3x.  I don't 
think there is any argument about those.  Certainly I'm focused on (2) and 
(4).

> The rationale for 2, 3, 4 is that this is faster, most users seem to not 
> trust relative name resolution and throw in :: everywhere (for both 
> sanity and speed). For those that actually understand how it currently 
> works and actually want relative name resolution it does mean a bit 
> more characters to type with a small loss in ease of refactoring (moving 
> a class means it can resolve against other classes than before - both a 
> blessing and a curse). 
>
> We are contemplating having an alias feature. 
>
> I then ramble on about scope and try to answer questions... 
>
> Regards 
> - henrik 
>
> On 2014-14-03 23:07, John Bollinger wrote: 
> > 
> > 
> > On Thursday, March 13, 2014 7:19:22 PM UTC-5, henrik lindberg wrote: 
> > 
> > 
> >     We also have to decide if any of the relative name-space 
> functionality 
> >     should remain (i.e. reference to x::y is relative to potentially a 
> >     series of 
> >     other name spaces ("dynamic scoping"), or if it is always a global 
> >     reference when it is qualified. 
> > 
> > 
> > 
> > Can you choose a different term than "dynamic scoping" for what you're 
> > describing there?  It's not consistent with other uses of that term with 
> > which I am familiar, and historically "dynamic scoping" has meant 
> > something different in Puppet. 
> > 
> Not quite sure what the correct term is in current Puppet - I mean "the 
> various ways current puppet resolves a name such as x::y". 
>
>

I'm not aware of an existing term that applies, so maybe we can just call 
it "classic" scoping or "3x" scoping?

> > I think that's fine, as long as it's consistent, but it has the 
> > potential to present oddities.  For example, in the body of class m, can 
> > one declare class ::m::a via its unqualified name (e.g. "include 'a'")? 
> > If so, then should one not also from the same scope be able to refer to 
> > the variables of ::m::a via relative names ($a::foo)? 
> > 
> There are several concepts at play: 
> * the name given to a class or user defined resource type 
> * the loading of it 
> * referencing a variable 
>
> Class naming 
>
> Currently a class or define gets a name in the name space where it is 
> defined - its (possibly qualified) name is appended to the name where it 
> is defined. Thus: 
>
> class a { 
>    class b { 
>    } 
>    class x::y { 
>    } 
> } 
>
> Creates the three classes ::a, ::a::b, and ::a::x::y 
>
> This construct is exactly the same as if they were defined like this: 
>
> class a { 
> } 
> class a::b { 
> } 
> class a::x::y { 
> } 
>
> The fact that a class is defined inside of another does not give it any 
> special privileges (reading private content inside the class it is 
> defined in, etc.). This is a naming operation only. 
>
>

Yes, I've got that already.

> Likewise, when a class is included, this inclusion is in some arbitrary 
> namespace and it currently searches for the relative name. The 
> suggestion is to not do this, and instead require a fully qualified 
> (absolute) name. 
>
> include a 
> include b::a 
> include a::x::y 
>
>

Yes, I'm tracking with you there.  I would argue for more precise wording, 
however: for the purpose of declaring classes, relative class names (*all* of 
the above being examples) are resolved only relative to the global 
namespace.

>
> Sidebar: 
> | (I use the term "define" to mean "defining what a named entity is" as 
> | opposed to the term "declare" which is a term that denotes a 
> | definition of the existence of something (typically having some given 
> | type) - e.g. "int c" in the C language, which declares c, but does 
> | not define it. 
> | 
> | (Just saying since the use of define / declare may confuse someone) 
>
>

We are speaking the same language.

>
> > I see two main reasonable alternatives: 
> > 
> >   * class names are always treated as absolute.  Class ::m can declare 
> >     class ::m::a only via its qualified name, and $a::foo is always 
> >     equivalent to $::a::foo. 
> If you mean that class ::m::a can be defined inside of class ::m

No, I don't.  I mean the same thing by "declare" that you do.

> (Side note, the idea that nesting classes should not be allowed has 
> been raised as well - to further break the illusion that they have 
> some privileged relation to each other - they are not "inner classes" as 
> in Java or anything like that, they are not protected/private in any way 
> - they are just named after where they are defined). 
>
>

I would be fine with that.  If scoping will change, then that's an especial 
reason to also remove nested class and defined type definition, as those 
muddy the waters not only on OO-confusion grounds but also simply on 
lexical scoping grounds.

> >   * class names can be expressed absolutely or relative to the innermost 
> >     enclosing class scope (~ the current namespace), only, both for 
> >     class declaration and for variable lookup.  Class ::m can declare 
> >     class ::m::a via its unqualified name, and can refer to the 
> >     variables of ::m::a via relative names ($a::var). 
> > 
> This is like a) from naming the class, but keeps relative resolution of 
> references. Maybe it is a really bad idea to remove this ability - but 
> it is what opens up the can of worms... is it also relative to the name 
> space of the super class? is it relative to any outer name space? to any 
> outer name space of an inherited class? 
>
>

I return here to my thesis: if in some evaluation scope I can declare a 
given class via name N -- whether relative or absolute -- then in that same 
scope I should be able to refer to that class's public variables via the 
form $N::varname.

If relative names are resolved only vs the global namespace, then there is 
no problem.  I think that there is also no problem if relative names are 
resolved against the current namespace if possible and otherwise against 
the global namespace, but not against any intermediate namespaces.

It may also be that different rules need to apply to variable names than to 
class names.  In fact, it might be easiest to maintain the consistency I am 
requesting by characterizing variable lookups as composite operations 
consisting of (1) looking up the class name, and then (2) looking up the 
variable relative to that class.  Whether the variable lookup part of that 
should be able to pick up superclass variables is an open question, which 
might most easily be answered "no".
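
To make that two-step idea concrete, here is a minimal sketch in Python.  It 
is only a model of the proposal, not Puppet's implementation: the names 
CLASS_VARS, resolve_class and lookup_variable, and the example classes, are 
all hypothetical.  It handles qualified references only, and it assumes the 
"current namespace, then global" rule for the class-name step.

```python
# Hypothetical model: class name -> variables that the class itself declares.
CLASS_VARS = {
    "foo::bar": {"v": "from foo::bar"},
    "baz": {"v": "from baz"},
    "site::web::baz": {"v": "from site::web::baz"},
}


def resolve_class(name, current_ns):
    """Step 1: resolve a class name against the current namespace, then global."""
    if name.startswith("::"):
        candidates = [name[2:]]  # absolute name: exactly one candidate
    elif current_ns:
        candidates = [f"{current_ns}::{name}", name]
    else:
        candidates = [name]
    for candidate in candidates:
        if candidate in CLASS_VARS:
            return candidate
    return None


def lookup_variable(reference, current_ns):
    """Composite lookup for a qualified reference such as 'foo::bar::v'."""
    class_part, _, var_name = reference.rpartition("::")
    resolved = resolve_class(class_part, current_ns)
    if resolved is None:
        return None
    # Step 2: consult only the variables of the resolved class (no superclasses).
    return CLASS_VARS[resolved].get(var_name)


print(lookup_variable("foo::bar::v", "site::web"))
print(lookup_variable("baz::v", "site::web"))
print(lookup_variable("::baz::v", "site::web"))
```

Because the class-name step is shared, any name that would work for 
declaring a class from a given namespace also works as the prefix of a 
variable reference there.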

> > Either approach provides consistency in that any way it is permissible 
> > to refer to a class itself, it is also permissible to refer to that 
> > class's variables by appending '::varname'.  Note that the latter does 
> > not require traversing the chain of enclosing scopes, nor looking up 
> > names directly in any local scope; rather, it could be implemented as 
> > maximum two lookups against the global scope. 
> > 
>
> Well, it is not really possible to refer to a class with a variable 
>

I'm not talking about storing class references in variables, I'm talking 
about literal references in DSL code:

  include 'foo::bar'
  class { 'baz': }
  $v_bar = $foo::bar::v
  $v_baz = $baz::v

> What the proposed (strict) rules mean: 
>
> class a { 
>    class b { 
>      $x = 1 
>    } 
>    class c inherits b { 
>      $y = $x + 10 
>    } 
> } 
>
> The resolution of $x will lookup the $x in local scope representing the 
> class a::c, fail, and then in its parent scope representing a::b, and 
> there find x. 
>
> If instead 
>
>    class c inherits b { 
>      $y = $b::x + 10 
>    } 
>
> was used, it would immediately go to the global scope and resolve b::x 
> and find the value 10. 
>
>

I was about to assert that both of those are how it works in classic 
scoping, too, but I see that that might not be exactly the case.  I do 
assert that classic scoping yields the same result, but I'm not sure 
whether it would do if there were a $a::x.  Under my proposal, the 
existence of an $a::x would not be relevant.

> now, if we instead treat b::x as a relative reference to the name space 
> it is used in - then it may be a reference to: 
> * ::a::c::x 
> * ::a::b::x 
> * ::b::x 
>
>

No, I don't quite see that.  If $b::x appears in the definition of class 
::a::c (inherits ::a::b), then it can refer to $::b::x or it can refer to 
$::a::c::b::x, only.  Or at least that's my suggestion.

> What if b also inherits? What if the namespaces are more deeply nested? 
>

Namespace nesting depth is not relevant because my proposal is to consider 
only the current namespace and the global one.

Inheritance is a complicating factor, but a separate one, I think.  The 
question there is whether "$::b::x" means "the variable 'x' declared by 
class ::b" or whether it means "the variable to which '$x' refers in the 
evaluation scope of class ::b".  I think either one could work, but at the 
moment I'm favoring the former.  That question arises regardless of the 
question of support for any version of relative class names.

>
> $x = 0 
> class aa { 
>    $x = 1 
>    class a { 
>      $x = 2 
>      class b { 
>      } 
>      class c inherits b { 
>        $y = $x 
>        $z = $b::x 
>      } 
>    } 
> } 
> class b { 
>      $x = 4 
> } 
>
> What is $aa::a::c::y and $aa::a::c::z ? 
>
> In the proposal, the $y evaluates to 0 (there is no x in c, nor in b, 
> they do not see into aa, and can not see into aa::b).

Unless my eyes deceive me, you did not write a class ::aa::b, but rather a 
class ::b.  I will take it as your comments indicate you intended, rather 
than as written.  Since we agree that nesting class definitions is and 
should be significant to name resolution only inasmuch as it affects the 
qualified names of declared classes, it would probably be easier on 
everyone to avoid nesting in future examples.

In my proposal, $aa::a::c::y and $aa::a::c::z take the same values, for 
essentially the same reasons.  It is central to my proposal that 
intermediate namespaces between the global one and the current one *not* be 
consulted for relative names, unless as a result of class inheritance.  The 
difference would arise if there were also this class:

class aa::a::c::b {
  $x = 42
}

In that case, $aa::a::c::y would still take the value 0, but $aa::a::c::z 
would take the value 42 because within the definition of class aa::a::c 
variable name 'b::x' is looked up first as $::aa::a::c::b::x.  If that 
first lookup fails, then the only other one considered is $::b::x.

> In 3.x the value of $aa::a::c::z becomes 0, since when it reaches class 
> aa::a::b and it does not have an x, then x resolves the global x. (Jīng! 
> <- Chinese Surprise). 
>
> With relative naming the search is done in this order 
>
> * aa::a::c::b::x 
> * aa::a::b::x 
> * aa::b::x 
> * b::x 
>
> (if we remove the 3x surprising behavior to resolve to global x when 
> there is no x in b by setting $x in b) and move the b class around 
> between the various namespaces it is possible to verify that it searches 
> in the order above. 
>

Yes, I understand the 3x way, and I agree that it has surprising and 
undesirable results.  I am not advocating for it.

>
> We can do that in 4x as well (sans the Jīng!) if we come to the 
> conclusion that that would be the best. (i.e. worst case 4 hash lookups 
> for a 3 level nesting of names). We cannot really optimize this - the 
> names have to be tried in that given order. 
>
>

That's not quite what I am suggesting.  Mine is less surprising and more 
performant -- maximum 2 + (# of superclasses) lookups regardless of degree 
of namespace nesting.
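
For comparison, the two candidate orders can be written down as a short 
Python sketch.  This is a simplified model with hypothetical function names, 
and it leaves out the extra superclass lookups: the first function walks 
every enclosing namespace as in the search order listed above, the second 
tries only the current namespace and the global one.

```python
def classic_candidates(name, current_ns):
    """Walk-up order: every enclosing namespace, innermost first, then global."""
    parts = current_ns.split("::") if current_ns else []
    return ["::".join(parts[:i] + [name]) for i in range(len(parts), -1, -1)]


def narrow_candidates(name, current_ns):
    """Narrow order: the current namespace, then the global namespace only."""
    return [f"{current_ns}::{name}", name] if current_ns else [name]


# The reference b::x as seen from inside class aa::a::c.
for candidate in classic_candidates("b::x", "aa::a::c"):
    print("classic:", candidate)
for candidate in narrow_candidates("b::x", "aa::a::c"):
    print("narrow: ", candidate)
```

The walk-up list grows by one entry per level of nesting, whereas the narrow 
list never has more than two.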

> Making it strict means that there is only one lookup, but the c class 
> would have to be written like this: 
>
>      class c inherits b { 
>        $y = $x 
>        $z = $aa::a::b::x 
>      } 
>
> if we insist on making a qualified reference to the x in b (a $x gets 
> the same result). 
>
> We could make the inherited class have special status - and thus resolve 
> against it - but not sure if it is worth doing this. 
>
>

That's a significant question of what class inheritance means (or should 
mean).  I think a fine approach that wouldn't break too hard with 3x is 
that class inheritance adds ancestor classes' namespaces to those that are 
searched for *unqualified* *variable* names, between the current namespace 
and the global one.  Referring back to my partitioning of variable lookups 
into class lookups and local variable lookups, that amounts to using 
ancestor classes only for the variable lookup part, not for the class 
lookup part.
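
A rough Python model of the unqualified-variable half of that rule follows. 
The data mirrors the nested example quoted earlier (global $x = 0, $x = 1 in 
aa, $x = 2 in aa::a, class aa::a::c inheriting aa::a::b); the dictionary 
layout and the function name are assumptions made for illustration only.

```python
# Hypothetical model: each class records its parent class and its own variables.
CLASSES = {
    "aa": {"parent": None, "vars": {"x": 1}},
    "aa::a": {"parent": None, "vars": {"x": 2}},
    "aa::a::b": {"parent": None, "vars": {}},
    "aa::a::c": {"parent": "aa::a::b", "vars": {}},
}
GLOBALS = {"x": 0}


def lookup_unqualified(var_name, class_name):
    """Search the class, then its ancestor classes, then the global scope.

    Enclosing namespaces (aa and aa::a here) are deliberately never consulted.
    """
    current = class_name
    while current is not None:
        scope = CLASSES[current]
        if var_name in scope["vars"]:
            return scope["vars"][var_name]
        current = scope["parent"]
    return GLOBALS.get(var_name)


print(lookup_unqualified("x", "aa::a::c"))
```

In this model $x inside aa::a::c comes from the global scope, since neither 
the class nor its ancestor sets it; giving aa::a::b its own $x would change 
the result, while the values in aa and aa::a never matter.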

> > I expect that we will retain the ability to refer to variables via their 
> > unqualified names within some nest of scopes related to where they are 
> > declared (e.g. up to the innermost named (class or resource) scope). 
> > Given, then, that that form of relative name lookup will be supported, I 
> > think generalizing that to classes and resources as well (second 
> > alternative) bears serious consideration. 
> > 
> There is also the ability to reference a class and access its attributes 
> via the Class type. This way, it is totally clear what the resolution 
> is, and what the names are relative to. e.g. 
>
>   $b = Class[some::class::somewhere] 
>   $b[x] 
>
> and if this is done in a class, and you don't want the $b to be visible 
>
>   private $b = Class[...] 
>
> This way there is no guessing what a relative name may mean. (In essence 
> relative names are only (optionally) used when defining classes and user 
> defined resource types. 
>
>

Whereas that's pretty cool, it doesn't really address the problem of 
avoiding breaking modules as much as possible.  Relative naming is 
pervasive -- especially so if you consider that even most qualified names 
appearing in current Puppet manifests are still relative ("a::b" rather 
than "::a::b").

> > On the other hand, those who have commented in the past seem to agree 
> > that Puppet's historic behavior of traversing the full chain of nested 
> > scopes, trying to resolve relative names with respect to each, is more 
> > surprising than useful.  I'm on board with that; I'm just suggesting 
> > that there may be both room and use for a more limited form of relative 
> > naming. 
> > 
> I am struggling with the balance of being useful, not having to type too 
> much, and ease of refactoring with sanity and performance... this 
> discussion is very valuable. 
>
> I like the simplicity of "an unqualified variable = what I see here", 
> and "a qualified variable = an absolute reference". 
>
>

That does have a profound simplicity, but it's not very friendly to 
refactoring.  I am suggesting starting from a different axiom: the class 
name I can use here for one purpose, I can also use here for any other 
purpose.

Our two simplicities do not *inherently* conflict, but they clash if any 
form of relative class naming is recognized, whether my proposed narrow 
form or the broader 3x form.  Yet my sense is that relative class naming is 
fairly common in current code, mostly in forms that would be served by my 
proposal, and it is a great facilitator of refactoring (at least for people 
who undertake that task by hand, instead of via an automated tool).

> > From the description alone, I'm not sure how it can be asserted that 
> > the chain of local scopes does not reach global scope, unless by the 
> > trivial fact that the global scope is not itself a local scope.  What I 
> > would hope to see, and perhaps what is meant, is that the lookup stops 
> > at local scopes that correspond to classes and resources.  In 
> > particular, I think it is essential that unqualified class name lookups 
> > not be resolved against parent namespaces. 
> > 
>
> Nested ("local") scopes only contain unqualified names, and an inner 
> scope shadows an outer scope (there are a few additional rules for 
> restricted names such as $trusted, and $facts which may not be shadowed 
> in any scope). Qualified names (for variables) can only be created in 
> classes and these are only the public attributes of those classes. No 
> local (shadowing) scope places this "global scope" as an outer scope. 
>
>

Ok, got it.  Thanks for clearing that up.

>
> > That is, in class ::m::a::b, "include 'foo'" must not refer to 
> > ::m::a::foo, and certainly not to ::m::foo, but I'd be ok if it could 
> > refer to ::m::a::b::foo.  As a special (but important) case, in 
> > ::m::a::b, "include 'b'" must not refer to ::m::a::b itself, and 
> > "include 'a'" should not refer to ::m::a. 
> > 
> I think (but am not 100% sure) that it would be best to have to qualify 
> the name - i.e. 
>
>    include foo    # is include ::foo 
>    include x::y   # is include ::x::y 
>
>

Having to qualify the name is the current situation, but it is a relatively 
common error for people to fail to do so, writing, for example,

class site::ntp {
  # intended to refer to class ::ntp
  include 'ntp'
}

That error is resolved equally well, however, either by resolving all class 
name lookups only against the global namespace or by resolving them first 
against the current namespace and then against the global one (skipping 
intermediate ones, and possibly inherited ones).  In my estimation, the 
latter would break less code and would allow coding style that is more 
conducive to refactoring.
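
The site::ntp case can be played through with a small Python sketch.  It is 
a simplified model, not Puppet code: the policy names and the resolve 
function are hypothetical, and "classic" stands for the walk through every 
enclosing namespace discussed earlier in this thread.

```python
# Classes assumed to exist for the example: the real ::ntp and the wrapper.
DEFINED = {"ntp", "site::ntp"}


def resolve(name, current_ns, policy):
    """Return the first defined class among the candidates a policy allows."""
    parts = current_ns.split("::") if current_ns else []
    if policy == "classic":
        # Every enclosing namespace, innermost first, then global.
        candidates = ["::".join(parts[:i] + [name])
                      for i in range(len(parts), -1, -1)]
    elif policy == "current_then_global":
        candidates = ["::".join(parts + [name]), name]
    elif policy == "global_only":
        candidates = [name]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return next((c for c in candidates if c in DEFINED), None)


# What does "include 'ntp'" inside class site::ntp refer to?
for policy in ("classic", "current_then_global", "global_only"):
    print(policy, "->", resolve("ntp", "site::ntp", policy))
```

In this model only the classic walk picks site::ntp itself; both of the 
other policies reach the top-level ntp class.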

> Other languages have solved the same issue in different ways: 
>
> * Ruby is obviously very flexible in how it searches (it also makes it 
>    slow), and sometimes (just like in Puppet) it is mysterious why it 
>    works or not in some cases. 
> * Java uses an import to import a name which can then be used in short 
>    form, nested classes can be relatively referenced. 
> * Some Java like (new) languages use an import/alias mechanism 
>

I'm not seeing that the problem addressed by those mechanisms is 
necessarily one that has been raised here.  At least, I think those 
approaches have much broader scope than we have heretofore been discussing, 
and it is not immediately evident to me that Puppet would benefit much from 
such a mechanism relative to the simpler, implicit (but much narrower) 
mechanism I proposed.

> If we go down that path, these name imports would appear at the start of 
> the file and apply to the content of that file - i.e. it is a help to 
> the *parser* to construct the correct code (there is no searching at 
> runtime). Now sadly, import is a function that is just deprecated in the 
> Puppet Programming Language and reintroducing it with a different 
> meaning would just be a cruel joke... if instead we want to be able to 
> alias names maybe we could use "alias" 
>
> alias apache = mystuff::better_apache::apache 
>
> To support an alias like that, the only reasonable thing a parser could 
> do is to replace every "apache" in every qualified name with the alias - 
> i.e. apache::foo becomes mystuff::better_apache::apache::foo 
>
>

Well, I'd say replace every initial segment exactly matching "apache", but 
that's my fetish for precision talking.  Or is that in fact different from 
what you meant?
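
To pin down what I mean by "initial segment exactly matching", here is a 
short Python sketch.  The function name and the dictionary of aliases are 
hypothetical, and I am assuming a single replacement pass and that names 
starting with "::" are left alone.

```python
def apply_alias(name, aliases):
    """Replace the first segment of a name if it exactly matches an alias."""
    first, sep, rest = name.partition("::")
    if first in aliases:
        return aliases[first] + sep + rest
    return name


ALIASES = {"apache": "mystuff::better_apache::apache"}

for name in ("apache::foo", "apache", "apache2::foo",
             "x::apache::foo", "::apache::foo"):
    print(name, "->", apply_alias(name, ALIASES))
```

Only the first two names are rewritten; a segment that merely begins with 
"apache", or an "apache" segment that is not the first one, is left 
untouched.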

> A powerful mechanism to reduce typing - but one that is also tricky if we 
> support more than a first pass of alias replacements, multi-segment 
> aliases etc. (A sane impl. could perhaps only perform the replacement of 
> the first segment, and require that an alias cannot be qualified itself.) 
> (An alias could also be set to ::) 
>
> I am not sure I want to see aliases like these in the language. 
> Sometimes a bit more typing is good for (esp. the future) you. 
>
>

I am not sure I want to see such a thing, either.  Let's shelve that idea 
for now.

> We have a problem with referencing a class directly with a variable 
> since we can do this 
>
>    class a { 
>      $b = { x => jing } 
>      class b { 
>        $x = 10 
>      } 
>      notice $b::x 
>      notice $b 
>    } 
>
> $b is not a reference to the class, but in $b::x it is (this is kind of 
> confusing). 
>
>

You are right, that can be a bit confusing, but it's not related to 
scoping: the same confusion arises if there is a class ::b.  The rule for 
users to learn is simply that you look at the last segment of a variable 
reference for the actual variable name, and I don't think that's either 
hard or unintuitive.

> Super 
>
> Yet another way of handling resolutions is to add a super (reserved) 
> namespace word, that resolves the superclass. It would function as an 
> (absolute) reference to the superclass and mean give me a variable as 
> the superclass sees it (given class is allowed to see it). e.g. 
>
>    class c inherits b { 
>      $z = $super::x 
>    } 
>
> But I am not sure that throwing yet another object oriented term into 
> the non object oriented puppet casserole makes it any sweeter... 
>
>

Nor am I, especially since I'm having trouble thinking of many appropriate 
uses for such a construct in Puppet code.

John
