Hi,
we have just started to get more concrete about how to implement things for
4x and to break the work down into actionable items. If you have looked in
Jira, there are currently 5 big issues in the epic "Biff the Catalog
Builder" [1], which is the goal: a new, better performing catalog
builder (what is currently known as the "compiler") where we can fix
many known issues that today are just too hard to address.
This time, I want to talk about the implementation of Scope, which is
part of "(PUP-1832) Implement the Puppet 4.0 Runtime" [2].
Currently scope has many responsibilities (too many):
* it is classic computer language scope (what is visible "here")
* for a class it also represents one aspect of "an instance of a class"
(the attributes of the class are variables in that scope).
* Inheritance is achieved by looking up and continuing the search for a
variable in another "scope".
Coming up with a new implementation is important to make scope perform
well. Thus it is important to know:
- write vs read ratio
- unqualified vs. qualified lookup (i.e. reading $a::b::x from within
$a::b vs from other scopes)
- typical nesting levels of named scopes
We also have to decide if any of the relative name-space functionality
should remain (i.e. a reference to x::y is relative to potentially a series
of other name spaces, "dynamic scoping"), or if a qualified reference is
always a global reference.
The implementation idea we have in mind is that there is one global
scope where all "qualified variables" are found/can be resolved, and
that all other variables are in local scopes that nest. (Local scopes
include ephemeral scopes for match variables).
The read ratio still needs to be measured, but we are already fairly sure
that we need a fast route from any scope to the global scope. We know that
a qualified variable is never resolved by any local scope, so we can go
straight to the global scope; this way we do not have to traverse the
chain up to the "parent most" scope (the global one). Local scopes are
always local: there is no way to address the local variables from some
other non-nested scope. This is essentially how the regular CPU stack
works, or how variables in a language like C work.
I.e. we have something like this in Scope:
  class Scope
    attr_reader :global_scope
    attr_reader :parent_scope
    # ...
  end
The global scope keeps an index designed to be as fast as possible to
resolve a qualified name to a value. The design of this index depends on
the frequency of different types of lookup. If all qualified lookups are
absolute it would simply be a hash of all absolute names to values (it
really cannot be faster than that).
The logic for lookup then becomes:
- for un-qualified name, search up the parent chain (this chain does not
reach the global scope), if still unresolved, look in global scope.
- for qualified name, look up in global scope directly
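The two lookup rules above could be sketched roughly as follows. This is only an illustration under assumptions: the class names GlobalScope and LocalScope, and the methods set, lookup, local? and fetch_local are made up for this example and are not the actual Puppet implementation. A name is treated as qualified when it contains "::".

```ruby
# Hypothetical sketch of the proposed lookup rules (names are illustrative).
class GlobalScope
  def initialize
    @values = {} # absolute name => value
  end

  def set(name, value)
    @values[name] = value
  end

  def lookup(name)
    @values[name]
  end
end

class LocalScope
  attr_reader :global_scope, :parent_scope

  def initialize(global_scope, parent_scope = nil)
    @global_scope = global_scope
    @parent_scope = parent_scope # nil for the outermost local scope
    @variables = {}
  end

  def set(name, value)
    @variables[name] = value
  end

  def local?(name)
    @variables.key?(name)
  end

  def fetch_local(name)
    @variables[name]
  end

  def lookup(name)
    # Qualified names are never resolved locally: go straight to global.
    return global_scope.lookup(name) if name.include?('::')

    # Unqualified names: walk the chain of local scopes first.
    scope = self
    while scope
      return scope.fetch_local(name) if scope.local?(name)
      scope = scope.parent_scope
    end

    # Still unresolved: fall back to the global scope.
    global_scope.lookup(name)
  end
end

global = GlobalScope.new
global.set('a::b::x', 1)
outer = LocalScope.new(global)
outer.set('x', 10)
inner = LocalScope.new(global, outer)
inner.lookup('x')       # found in the parent local scope
inner.lookup('a::b::x') # resolved directly in the global scope
```

Note that the parent chain never includes the global scope itself, so the qualified path costs a single hash lookup regardless of nesting depth.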
If we also need to consider relative namespaces (i.e. x::y could mean
z::x::y, or a::b::c::x::y, etc.), we can then either probe in turn with
each name (which is fine if the number of things to probe is low), or
provide a reverse index where y is first looked up to get the next level
of names, etc. (the idea being that this requires fewer operations to
find the right one).
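To make the "probe in turn" option concrete, here is a small sketch. It assumes that candidates are tried from the innermost enclosing namespace outwards and that the absolute name is tried last; that ordering is an assumption for the example, not a statement of current Puppet behavior. The helper names candidate_names and resolve_relative are hypothetical, and a plain Hash stands in for the global index.

```ruby
# Hypothetical sketch: resolve a relative qualified name by probing.
# Builds the list of absolute names to try, innermost namespace first.
def candidate_names(current_namespace, relative_name)
  segments = current_namespace.split('::')
  candidates = []
  segments.length.downto(1) do |n|
    candidates << (segments.first(n) + [relative_name]).join('::')
  end
  candidates << relative_name # finally, try the name as an absolute name
  candidates
end

# Probes the index (absolute name => value) with each candidate in turn.
def resolve_relative(index, current_namespace, relative_name)
  candidate_names(current_namespace, relative_name).each do |name|
    return index[name] if index.key?(name)
  end
  nil
end

index = { 'a::x::y' => 'from a', 'x::y' => 'from top' }
resolve_relative(index, 'a::b', 'x::y') # probes a::b::x::y, then a::x::y
```

The number of probes grows with the nesting depth of the current namespace, which is why typical nesting levels matter for choosing between probing and a reverse index.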
IF we can completely remove the notion of relative namespacing we gain
performance!
The global scope, in addition to holding the qualified names, also needs
to separate the names by "kind", since we can have the same name for
different "kinds". We can now keep all named things in the global
scope - functions, types, variables, etc. Global scope and loading are
associated (more about loading in a later post), but it is worth noting
that it may be of value to be able to record that there has already been
an attempt to load a particular name, and that there was nothing
there to load...
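A rough sketch of such a global scope follows, with names separated by kind and with failed load attempts remembered so that the loader is not asked twice. Everything here is an assumption for illustration: the class name KindedGlobalScope, the NOT_FOUND marker, and the loader block (which takes a kind and a name and returns nil when there is nothing to load) are hypothetical, since loading has not been described yet.

```ruby
# Hypothetical sketch: global scope keyed by kind, then by qualified name.
class KindedGlobalScope
  # Marker recorded when a load attempt found nothing.
  NOT_FOUND = Object.new.freeze

  # The optional block is a stand-in loader: called as loader.(kind, name).
  def initialize(&loader)
    @entries = Hash.new { |hash, kind| hash[kind] = {} }
    @loader = loader
  end

  def set(kind, name, value)
    @entries[kind][name] = value
  end

  def lookup(kind, name)
    table = @entries[kind]
    unless table.key?(name)
      # First time this name is seen for this kind: try to load it once
      # and remember the outcome, including "nothing there to load".
      loaded = @loader ? @loader.call(kind, name) : nil
      table[name] = loaded.nil? ? NOT_FOUND : loaded
    end
    value = table[name]
    value.equal?(NOT_FOUND) ? nil : value
  end
end

scope = KindedGlobalScope.new { |kind, name| nil } # loader that finds nothing
scope.set(:variable, 'apache::port', 80)
scope.set(:function, 'apache::port', :some_function) # same name, other kind
scope.lookup(:variable, 'apache::port')
```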
We are going to need the following kinds of scopes:
* Global Scope - holding map from kind, to fully qualified name to value
* Local Scope - holding variables that shadow parent scope
* Ephemeral / Match Scope (read only) - when a match is made
* Class Scope - the topmost scope for a class - needed because variable
lookup in it, and in its nested scopes, needs to look up all class
attributes (and define them) via reading/setting variables.
* Resource Scope - the topmost scope for a user defined resource type -
needed because its parameters are available as read only variables.
The resource scope simply makes the resource parameters available. It
behaves as a local scope otherwise.
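As an illustration only, a resource scope along these lines could look like the sketch below. The class name ResourceScope, the error raised on an attempt to reassign a parameter, and the assumption that the parent scope responds to lookup are all hypothetical choices made for this example.

```ruby
# Hypothetical sketch: resource parameters exposed as read only variables.
class ResourceScope
  attr_reader :parent_scope

  def initialize(parameters, parent_scope = nil)
    @parameters = parameters.dup.freeze # read only view of the parameters
    @variables = {}                     # ordinary local variables
    @parent_scope = parent_scope
  end

  def set(name, value)
    if @parameters.key?(name)
      raise ArgumentError, "parameter '#{name}' is read only"
    end
    @variables[name] = value
  end

  def lookup(name)
    return @parameters[name] if @parameters.key?(name)
    return @variables[name] if @variables.key?(name)
    parent_scope ? parent_scope.lookup(name) : nil
  end
end

scope = ResourceScope.new('port' => 8080)
scope.set('tmp', 'scratch') # behaves as a local scope
scope.lookup('port')        # parameter visible as a variable
```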
The class scope looks up unqualified variables in the class itself, if
not found there, it continues up the parent chain of scopes. If the
class inherits from another, then, the parent scope is one that
represents its super class.
In class scope, setting a variable also means that it is set in global
scope with the fully qualified name. This is where the logic around
class private variables comes in. If it is private, it cannot be
accessed from the outside (i.e. with a qualified name), and thus it
is only set in the class / class-scope. This in turn brings up the issue
of also supporting "protected" variables; only visible from within the
class logic, and the logic in sub classes, and if subclasses should see
private inherited variables or not (probably not).
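The class scope behavior described above might be sketched like this. It is a sketch under several assumptions: a plain Hash stands in for the global scope, the parent scope is taken to be the scope of the super class, private variables are hidden from subclasses (the "probably not" option), and "protected" is not modeled at all. The names ClassScope, is_private and from_subclass are invented for the example.

```ruby
# Hypothetical sketch: class scope that publishes non-private variables
# to the global index under their fully qualified names.
class ClassScope
  attr_reader :class_name, :parent_scope

  def initialize(class_name, global_index, parent_scope = nil)
    @class_name = class_name
    @global_index = global_index # Hash standing in for the global scope
    @parent_scope = parent_scope # scope of the super class, if any
    @attributes = {}
    @private_names = {}
  end

  def set(name, value, is_private: false)
    @attributes[name] = value
    if is_private
      @private_names[name] = true # never reachable via a qualified name
    else
      @global_index["#{class_name}::#{name}"] = value
    end
  end

  def lookup(name, from_subclass: false)
    return @global_index[name] if name.include?('::')

    hidden = from_subclass && @private_names.key?(name)
    return @attributes[name] if @attributes.key?(name) && !hidden
    return parent_scope.lookup(name, from_subclass: true) if parent_scope

    @global_index[name]
  end
end

index = {}
base = ClassScope.new('base', index)
base.set('port', 80)
base.set('secret', 'hidden', is_private: true)
child = ClassScope.new('base::child', index, base)
child.lookup('port')       # inherited through the parent scope
child.lookup('base::port') # also visible globally by qualified name
```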
The above could probably do with some picture :-)
Now, some questions...
- Are there any particular performance concerns you think we need to be
aware of?
- Do you have concerns about things we missed? Something important scope
needs to do?
- Do you have metrics from your environment? (number of lookups of
various kinds, etc)
- What is your reaction to getting rid of dynamic/relative name
resolution? (Breakage vs. sanity...)
Regards
- henrik
Links
---
[1]: https://tickets.puppetlabs.com/browse/PUP-1789
[2]: https://tickets.puppetlabs.com/browse/PUP-1832