[Puppet-dev] Re: A more ideal language (was Re: Classes vs. definitions (#1645))

Florian Grandel Tue, 04 Nov 2008 20:10:14 -0800

Hi Luke,

Oops, I just noticed that whenever you reply to a mail with the previous 
subject, the discussion subject changes. That was not my intent...


I'll change it back to your new subject with this post.

You didn't like the term "resource-bundle". I think "class" is a good 
candidate for future naming of "resource-bundles". Just want to avoid 
"class" for now to avoid ambiguity.

 > Note that
 > I'm entirely self-trained, so a lot of what you're talking about goes
 > right over my head.

I am really sorry about that!! I am also entirely self-trained. I think 
I just have a strong OO and design pattern background while you may have 
more of an expert sysadmin background. I need these abstract concepts 
mostly for myself, to get to grips with complexity. Our communication 
gaps will go away as soon as we communicate with examples about the 
concepts we are talking about.

To answer some of your questions I'll give the following example in 
"resource-bundle" syntax:

# Define everything that all nodes should have in common.
#
# We inherit from a built-in resource-bundle called "node"
#
# We use inheritance here but that's not necessary, we could
# also use a "mix-in" bundle that is simply included or
# "required" by other bundles.
bundle base-node inherits node {
    sudo {...}
    syslog-ng {...}
    ...
}

# Define a resource-bundle that represents a
# web-server node
bundle super-duper-webserver-node inherits base-node {
    # Let's include some other bundles specific to this
    # node type.
    #
    # self::ip is a read-only automatic attribute of
    # the built-in resource-bundle "node" that we inherit
    # (see explanation below), replaces facter
    #
    # You can see that we don't have any "name" attribute.
    # The name is implicitly set to "$name". Therefore
    # apache{"$self::name": ip => $self::ip, par2 => ...} would
    # be exactly the same as:
    apache {ip => $self::ip, par2 => ...}
}

# Now let's instantiate our nodes
super-duper-webserver-node {
   "myhost1": ;
   "myhost2": ;
   ...
}

 >> I bet I could reformulate
 >> everything I can do in puppet today with a simplistic language like
 >> that.
 >
 > I can't disagree with that, but my question would be, would you have a
 > "better" language if you did this?  And it must be said, "better" here
 > is defined as "better for its audience".

You see in my example that "node" is no longer something special. It 
probably is a built-in "resource-bundle" like "package", "file" or 
whatever that provides read-only instance attributes rather than using 
global facter variables that miraculously appear from "nowhere". (This 
shows the point about read-only attributes that I made in my response to 
the other thread.) Internally facter may provide such attributes, but 
does the user have to be concerned with such an implementation detail?

The only difference is that now "classes", "definitions", "types" and 
"nodes" can all be inherited, parameterized, expanded and whatever, all 
the same without any "artificial" distinctions that have to be learned. 
I think that this one syntax for all is really much easier to understand 
and learn for everybody. You hide away a lot of complexity.

I personally don't believe that a sysadmin "intuitively" thinks about 
classes and definitions. On the contrary: the fact that you have to 
dedicate a separate chapter in the documentation and answer many IRC 
questions about when to use classes and when to use definitions seems to 
show that the distinction is not really intuitive. I personally think 
it's more intuitive to do away with it. Maybe you should start some kind 
of poll for first-time users?

I think it's obvious, however, that the "bundle-syntax" is as readable as 
the current language but at the same time easier to refactor and expand.

 > What if you have multiple classes that want to declare that they
 > require a given class?
 > classes can be cross-
 > cutting, in that many classes might care that a given class is
 > instantiated, so it's important that they can declare this.

You are right in that I wouldn't allow include statements any more. IMO 
the include statement can always be replaced by inheritance, inclusion 
of other bundles or "require/before"-relationships.

You can use a "require" relationship at as many points as you like. 
Remember that we now can use "require/before" for everything and not 
only for "definitions" or "types". I think "require/before" for class 
instances makes the "include"-syntax completely redundant.

If you get something like "above/below" (see 
http://groups.google.com/group/puppet-dev/browse_thread/thread/1cc72c2de9d1ced7#)
 
this would further improve the ability to describe real bundle 
relationships. I am strongly in favour of such a relationship type.

Give me some examples of use-cases where you think you cannot do without 
multiple includes or where the "resource-bundle" syntax would be 
counter-intuitive and I'll see whether I have an error in my logic or 
whether I can formulate your use-cases in a satisfactory way in "bundle 
syntax".

 > nodes have
 > automatic entry points (i.e., the parser accepts a request from a host
 > and uses the host's name to look up a node instance).  Neither classes
 > nor definitions have automatic entry.

The built-in resource-bundle "node" continues to be an automatic entry 
point for puppetd. But IMO that's an unnecessary implementation detail 
that can be hidden away from the user.

 > To me, the term 'class' is a convenient way to describe the
 > configuration associated with a given intent.  "Web server" means one
 > list of resources, "dns client" means another, etc.

There are many places where the difference between an "intent" and a 
"resource" is blurred:
- Is a webserver a resource or an intent?
- Is a subversion server a resource or an intent?
- If I have to add a second database instance to my database server, is 
my "database-server class" now suddenly mutating from an intent towards 
a resource?

How you respond to these questions is a matter of standpoint. When you 
think in terms of an application server, a database instance may be just 
a resource. But if you think in terms of a database server it may be 
/the/ "intent".

If you use bundle syntax this blurred distinction simply doesn't matter. 
You can show intent where you need to, but you don't have to bother if 
the distinction becomes a little blurred or needs to be refactored.

What matters is that "bundle syntax" helps me to express some abstract 
concept (e.g. a load-balanced cluster of firewall-protected 
web-servers). But it also helps me to iteratively "divide and conquer" 
abstract concepts without any conceptual break until I arrive at the OS 
level (files, packages, services, etc.). And I can also cluster concepts 
into higher-level concepts at any time even if that means that I have to 
transform a singleton into a prototype or vice versa. (Currently 
refactoring a class into a definition or vice versa is /much/ more work 
than that.)

IMO the "bundle-grammar" is simpler while still being able to express 
intent. It doesn't reduce expressivity or readability but adds 
encapsulation, maintainability and flexibility.

 > Puppet can't really
 > talk about anything that's cross-host, and it's a significant
 > problem.  I think of this as the 'me/it' problem -- Puppet can only
 > talk about 'me' (what services should I be running?) rather than
 > 'it' (what services should this host or that host be running?).  Been
 > a while since I thought about this aspect, but it's definitely there
 > and definitely a problem.
 >
 > How does your 'class of nodes' idea change this?
 >
 >> But what if you
 >> wanted to cluster nodes one day as one "thing" to avoid the current
 >> duplication of node-specific code in puppet? Oups! You'd have to
 >> introduce ... named puppet class instances.
 >
 > Can you elaborate on this?

Now comes my point about "clusters" or "class of nodes". In the 
generalized "bundle-syntax" you can say:

bundle full-system-cluster($needs_firewall = true) {
   if $self::needs_firewall {
     firewall-and-load-balancer-node { "${self::name}-fw": }
   } else {
     load-balancer-node { "${self::name}-lb": }
   }

   webserver-node {
     "${self::name}-web1": ;
     "${self::name}-web2": ;
   }

   application-server-node {
     "${self::name}-app1": ;
     "${self::name}-app2": ;
   }

   db-server-node { "${self::name}-db": }
}

And now you can instantiate whole clusters with the same infrastructure 
of containing nodes:

full-system-cluster {
   "production-cluster": ;
   "staging-cluster": ;
   "development-cluster": needs_firewall => false;
}

You'd effectively configure 18 hosts (three groups of six) in just three 
lines. Isn't that super readable and intuitive, and clearly showing 
intent? Try to express the same thing with disparate node declarations 
(in a .pp file or as external nodes) and you'll have duplicated code 
and/or data en masse.
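
To make the count concrete, here is a rough Python sketch of how such a 
cluster bundle could expand into host names. It is not Puppet code and 
not an existing API: the function name expand_cluster and the list of 
suffixes are assumptions made purely for illustration. Only the host name 
pattern and the needs_firewall parameter come from the example above.

```python
def expand_cluster(name, needs_firewall=True):
    """Return the host names one full-system-cluster instance would declare."""
    # Mirrors the if/else in the bundle: either a combined firewall and
    # load balancer node, or a plain load balancer node.
    front = "fw" if needs_firewall else "lb"
    suffixes = [front, "web1", "web2", "app1", "app2", "db"]
    return [f"{name}-{suffix}" for suffix in suffixes]


# The three instantiation lines from the example, as name -> parameters.
clusters = {
    "production-cluster": {},
    "staging-cluster": {},
    "development-cluster": {"needs_firewall": False},
}

hosts = [
    host
    for name, params in clusters.items()
    for host in expand_cluster(name, **params)
]

print(len(hosts))  # 18
for host in hosts:
    print(host)
```

Every cluster contributes six hosts whether or not it has a firewall, 
because the firewall node simply replaces the load balancer node.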

That's what I meant by "node instances". Sure, puppetd will still 
"enter" at the node level. But why not go down /and/ up in the 
hierarchy to find out how to configure our specific node? You 
could still derive one single catalog per node.

 >> IMO a resource can only be a singleton in the context of a bundle.
 >> Not more not less.
 >
 > I agree, and that bundle is currently named a 'catalog' in Puppet.
 > 0.25 finally makes this Catalog class the arbiter of singleton-hood.
 > a given resource must be unique within a
 > host's catalog, but not usually unique within a network.

I think that my cluster example shows that it may make sense to have 
something "unique" at a higher level than node/catalog level to avoid 
code duplication in node definitions.

Sure, anything that is unique above node level will be unique at node 
level as well. But not everything that is unique at node level must 
automatically be a "duplicate" at a higher level! My cluster example 
shows that "singleton-hood" on a higher than node level can make sense!

> the lack of meaningful distinction [between node and class] is one  
> thing that led me to want to push nodes outside of Puppet.

You can continue to do so. I believe that the more generic "bundle 
syntax" will help you to simplify and extend your "external definition 
provider" implementations. You could provide external parameters not 
only on type and node level but also in between or above wherever you 
like in the bundle hierarchy. All instantiation can be done externally. 
You could keep only bundle definitions in your *.pp files and get all 
the rest from an external source.

> When a client produces a log message, that message  
> requires context to be useful.  If it fails to install a package, we  
> need to know why (again, the intent) that package was needed, so we  
> can then determine what services would be affected.  We can also  
> relate that failure back to code changes, maybe, so it's easier to  
> resolve the problem network-wide.

Hm. I don't get your point. Why can you not trace back an error in the 
log with "bundles"? The bundles are completely hierarchical. So I don't 
see problems here. I think that the error messages won't change at all. 
You get the highest level concepts (intents) at the beginning of the log 
message and the lowest level resources (OS resources) at the end of the 
log message.
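
As a rough illustration of what I mean, here is a small Python sketch 
that builds such a log prefix by walking up the bundle hierarchy. The 
class name BundleInstance, the "kind[name]" format and the package name 
"apache2" are assumptions for this example only, not existing Puppet 
behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BundleInstance:
    """Hypothetical stand-in for one instantiated bundle or resource."""
    kind: str
    name: str
    parent: Optional["BundleInstance"] = None

    def path(self) -> str:
        # Walk up to the root, then reverse so that the highest-level
        # concept (the intent) comes first and the OS resource comes last.
        parts = []
        current = self
        while current is not None:
            parts.append(f"{current.kind}[{current.name}]")
            current = current.parent
        return "/".join(reversed(parts))


cluster = BundleInstance("full-system-cluster", "production-cluster")
node = BundleInstance("webserver-node", "production-cluster-web1", cluster)
package = BundleInstance("package", "apache2", node)

print(f"{package.path()}: failed to install")
```

The failing package can be traced back through its node to the cluster 
it belongs to just by reading the prefix from right to left.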

>> My hypothesis is: Any non-procedural language can
>> be considered class- and aspect-oriented at the same time.
> 
> I don't really agree that classes and aspects are essentially the  
> same.  In fact, if an aspect is a cross-cutting concern, then Puppet's  
> classes are actually explicitly not aspects -- they're completely  
> exclusive.  No two classes can overlap at all.

I think you are right with what you are saying about aspects. I'll have 
to drop my "aspect=class" idea for declarative languages. You 
demonstrated a case (the dns example) where the definition of an aspect 
makes a difference in practice. Maybe this could be solved quite easily 
with parameters as well. But I could think of other situations now where 
it makes sense to dynamically mix resources into bundles based on some 
dynamic runtime attribute value (e.g. to make a difference between 
virtual servers and hardware servers across all definitions).

Although it was me who introduced the term "aspect" here I'd rather drop 
the topic from this thread:
- In "bundle syntax" aspects can be worked around in most cases with 
parameters, at least for the time being.
- I think aspects are neither incompatible with "resource-bundles" nor 
with current syntax.
- Aspects are a feature completely new to puppet. This thread concerns 
the re-formulation of existing features. We should therefore discuss 
aspects elsewhere; otherwise this thread will explode further. ;-)

> I don't like the idea of declaring class membership via the same  
> syntax for specifying resources, partially because I think most people  
> will find that the attributes that are consumed by a class are often  
> specified in a different location than the code that specifies class  
> membership.

Can you give an example? I don't understand that. IMO separating 
parameters from class instances is a design error. This is exactly what 
makes code error-prone and difficult to maintain, as it increases the 
amount of code you have to consider when refactoring classes.

If you have to nest bundles at several levels then I consider it good 
practice to re-declare required parameters at every level:

bundle shorewall::rule($ip, $port, ...) {
   ...
   file{...template(...using $self::ip and $self::port...)...}
   ...
}

bundle apache::vhost($ip, $port, ...) {
   ...
   shorewall::rule { ip => $self::ip, port => $self::port, ... }
   # This is a good example for a "local singleton" by the way... :-)
   # The rule implicitly inherits the name of the enclosing vhost
   # If we later need two rules we simply add an explicit name.
   ...
}

bundle web-server-node inherits node {
   ...
   # As we inherit from the built-in "node" bundle, we automatically
   # get facter variables as instance variables and may use them,
   # see $self::ip ...
   apache::vhost { "${name}-vhost1": ip => $self::ip, port => 80 }
   apache::vhost { "${name}-vhost2": ip => $self::ip, port => 443 }
   ...
}

and so on.
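
The "local singleton" idea from the vhost example can be sketched in a 
few lines of Python. This is only a model of the naming rule, under the 
assumption that uniqueness is checked per enclosing bundle. The class 
Bundle, its declare method and the host and rule names are hypothetical.

```python
class Bundle:
    """Hypothetical model of a bundle instance that contains other bundles."""

    def __init__(self, kind, name):
        self.kind = kind
        self.name = name
        self.children = {}

    def declare(self, kind, name=None):
        # Without an explicit name the child inherits the enclosing name.
        child_name = self.name if name is None else name
        key = (kind, child_name)
        # Uniqueness is enforced only inside this enclosing bundle.
        if key in self.children:
            raise ValueError(
                f"duplicate {kind} {child_name!r} in {self.kind} {self.name!r}"
            )
        child = Bundle(kind, child_name)
        self.children[key] = child
        return child


vhost1 = Bundle("apache::vhost", "myhost1-vhost1")
vhost2 = Bundle("apache::vhost", "myhost1-vhost2")

# One unnamed rule per vhost: each is a singleton within its own vhost.
rule1 = vhost1.declare("shorewall::rule")
rule2 = vhost2.declare("shorewall::rule")
print(rule1.name, rule2.name)

# A second rule in the same vhost needs an explicit name.
vhost1.declare("shorewall::rule", "myhost1-vhost1-admin")
try:
    vhost1.declare("shorewall::rule")
except ValueError as err:
    print(err)
```

The two unnamed rules do not collide because each lives in a different 
vhost, while a second unnamed rule in the same vhost is rejected.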

And one final idea concerning templates:

If you made templates a built-in "resource bundle" rather than a 
function it would be very easy to explicitly declare parameters.

You could simply say:

template { "/my/template.erb": var1 => ..., var2 => ... }

All variables not explicitly declared will simply not be present in the
template.

This would remove the last occurrence of undeclared dependencies. I 
don't see why you couldn't use ruby's template engine internally anyway.
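
A minimal sketch of the "explicitly declared variables only" rule, using 
Python's string.Template purely as a stand-in for ERB. The render helper 
and the template text are assumptions for illustration. Nothing here is 
Puppet's actual template function.

```python
from string import Template


def render(template_text, **declared):
    """Render a template using only the explicitly declared variables."""
    # substitute() raises KeyError for any placeholder that was not
    # declared, so an undeclared dependency fails loudly instead of being
    # picked up silently from some outer scope.
    return Template(template_text).substitute(declared)


vhost_template = "Listen $ip:$port"

print(render(vhost_template, ip="10.0.0.1", port=80))  # Listen 10.0.0.1:80

try:
    render(vhost_template, ip="10.0.0.1")
except KeyError as missing:
    print("undeclared variable:", missing)
```

The template can only see what was passed in, which is the property I 
would like the built-in "template" bundle to have.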

Florian
