[lssconf-discuss] Another tool (quite high level)

esmith4 Sun, 22 Oct 2006 22:37:44 -0700

Hail my quiet friends

Perhaps I should stop spamming the list with these announcements, butnevertheless, until someone mentions it as a problem I'm going tocontinue to tell you about new config tool chain software I write. Allof this is proof of concept. It runs, as far as I know it doesn't leakor crash, it does some useful task. Even if I were trying to create aproduction tool chain, I'd still release it like this. These toolsaren't tied together, except by the file format standard which at onepoint I thought I had convinced people was sufficient. Perhaps theseexamples show that there are some interesting things that can be donewith it. There is a crash bug in mcfg that I fixed while I was writingthis new tool, so if you have the old archive I can send you an updateif you wish. mcfg and cfgas compatibility is not quite 100% yet,because mcfg doesn't report if an overriding scalar list constraintremoved an unnamed value from a map. I will fix this for the sake ofcompleteness, but haven't got around to it yet.

Today's tool is cfgw, config want! It is a primitive site validatorbased on predicate logic, that is, logic with quantifiers (for all,there exists). It allows you to make assertions (things you want to betrue, hence the name) about your site, as it is defined in a set ofconfiguration files. It will then go through your file, assertion at atime, and attempt to show that you're right and have nothing to worryabout, or wrong and should worry. It is still a little primitive inplaces (the actual logic seems fine, but the output is a little crypticat times). It can provide a full justification of its decisions (bythis I mean basically a proof: a tree labelled with truth values).

Anyway, the man page discusses this in more detail (it is again poorlyhtmlified - does anyone have a tool that surpasses man2html; I refuseto maintain docs in 2 formats, and man seemed more appropriate, but itreally is annoying that man2html can't substitute \(lq for “...it wouldn't seem like a hard piece of functionality).

This tool essentially allows you to express assertions like "all mynodes that are requesting services from a server, are pointing at aserver that exists and provides that service." Of course, it doesn't goout and find that out. You could write a query tool that would readthis stuff back in, but I suspect that would be hard. Nevertheless itsstill useful to know that what you have asked for is reasonable.Performance is of course questionable, and probably goes like O(n^q),where n is the number of machines and q is the number of quantifiersinvolving node domain variables. Basically, it gets very expensive veryfast; searching for relations between more than 2 machines is probablyvery slow. Performance is also hampered by primitive internals: Iwanted to show that a tool like this could exist, and could beuseful.If you were doing this for money, you'd use hashing rather than linkedlists for critical lookups, and try to be cleverer about evaluations.


I hope this is of interest. Again, mail me if you want source code.

Ed

Title: Manpage of CFGW

Content-type: text/html

CFGW

Section: (1)
Updated: October 2006
Index Return to Main Contents

NAME

cfgw - config want site validator

SYNOPSIS

cfgw: [-b] [-h] [-j] [-J] [-n] [-v] [-V]
[-p policy_file ]

DESCRIPTION

cfgw is an experimental site validator for XML node configurations, each of which describes a node. It is an example of a high level system configuration tool, in that it does not specify any way to instantiatea given configuration on a node. A low-level tool is required to take the node configuration files that cfgw checks and configure a machine with them.

cfgw accepts a line structured language based on first order predicate logic. Each line defines a predicate whose value is to be determined. It is called config want because these are taken to be assertions about the desired state of the site (ie each line is a statement of something that you want to be true about your site, as defined by the configuration files that you supplied to cfgw).

cfgw defines a very limited set of operators for working with scalar values (the ~ operator, meaning lqexistsrq, and the = operator meaning case sensitive equality). Several standard logical operators are also provided, (specifically, =>, lqimpliesrq, =, boolean equality, &, lqandrq, |, lqorrq and ! lqnotrq).

Scalar literals may be supplied, as in other members of this family, as either alphanumeric (plus non-initial underscore) values or quoted strings (so for example you can say foo, foo12, and foo_b, but you have to quote lq1foorq, lqfoo brq, and lqfoo.barrq). For completeness, the boolean literals T, true, and F, false, are also provided.

cfgw is a solver for first order predicate logic, so it provides two more powerful kinds of operator: quantifiers. There are two standard quantifiers, each of which has a variable and a body. These operators try all possible values of the variable to determine the truth of their body. The universal quantifier, A, lqfor allrq, is true when for all possible values of its variable, its body is true. Similarly, the existential quantifier, E, lqthere existsrq is true there is any possible value of its variable that is true.

: E $r [ body ];
A $r [ body ];

Variable names are constrained in the same way as unquoted scalar literals (they don't need to be a single letter, although depending on your background, you may find a single letter more comfortable!). There are two domains from which variables can be drawn, files and resources. A variable marked with an @ in its declaration will range over the files provided, whereas a variable with a $ in its declaration will range over the union of all the resources provided in all files. The two types are not interchangeable.

For resource domain variables, there is an optional extra section to the declarations:

: E $r resource [ body ];
A $r resource [ body ];

By default, these variables range over top level resources in a file only (for example, foo, bar), but using the form above, you can specify a parent for them to range under, so in the following example, $r could range over foo.a, foo.b, any and all children of foo, including unnamed ones.

: E $r foo [ body ];

A resource domain variable can be used anywhere where a scalar literal could be used, within the body of its quantifier, simply by writing it prefixed with a dollar. It evaluates to the name of the resource it is currently bound to. It can also be used as $_resource, in which case it evaluates to its most local name. An example might make this clearer: if we have a resource variable, $S, which is currently bound to foo.a.b, then within the body of its quantifier, it evaluates as:

: $S = foo.a.b
$_S = b

Note that for unnamed resources (scalars or maps) the only way to test assertions about them is by a search through their parents, as the only way in which they can be represented in the input language is when they are bound to variables.

The power of cfgw comes from its ability to substitute a value with the value of the resource of that name in a given file, using the dereferencing operator. It is this functionality which allows the checking of cross-constraints between files. For example, you can say

: foo.bar <lqfoo.configrq>

This will be substituted with the value of the resource foo.bar in one of the files you provided (foo.config). cfgw doesn't do any more looking up after the command line is parsed, so even if there's a foo.config in your current directory this will still return not defined if it wasn't specified on the command line.

There is an alternative syntax which makes use of a current node domain variable to choose which value to apply. In the following example, when @n is bound to e.g. machine1.config, the value of foo.bar in the file machine1.config will be used; when @n is bound to machine2.config, the value in that file will be used instead.

: foo.bar @n

There are also a few miscellaneous items of syntax. The keyword lqchildrq is an infix operator, which returns true if both its operands are resources which exist, and the left operand is the child (ie directly contained in) of the right one. The keyword lqdescendsrq is similar, except that it checks whether the left operand is contained (at any depth) in the right operand. So:

: a.b.c child a.b
a.b.c descends a

Finally, parentheses may be used freely to determine operator precedence, the operation of which may be surprising in places. It is always possible to find out why cfgw came to a particular conclusion by specifying one of the lqjustifyrq options on the command line. These spit out a (somewhat intimidating) glob of data which amounts to a proof of (a justification of) cfgw's decision.

Because policy files may be large, and clauses may become obscure, cfgw offers an annotation feature. Any clause may be annotated with an arbitrary string enclosed in curly braces. These are similar to comments, but unlike the line comments (# followed by anything is ignored until a new line), their values are stored and printed as explanations in the justifications that cfgw provides. As a special case, an annotation for the whole line will be printed when the line's status is reported, even when justifications are switched off. See the examples section for more.

OPTIONS

-b: Brief. Print nothing to stdout. Fatal error messages will go to stderr.
-h: Help. Display usage information and exit.
-v: Verbose. Print a summary for each rule processed, and a summary at the end.
-V: Version. Display version information and exit.
-n: Justify none. Never print justifications
-j: Justify rejections. For any line which evaluates to false, print a justification of the decision to reject it. This is the default.
-J: Justify all. For every line, print a full justification of the decision reached.
-p policy file: Policy file. The file whose truth is to be determined(!)

RETURN VALUES

0: on success.
-1: on an invocation error (command-line).
1: on a lexing error (invalid characters).
2: on a parsing error (syntactic error).
3: on a compile error (semantic error).
4: on a rejection (policy error). This exists so that you can run cfgw fully gagged and still find out whether it succeeded or failed.

EXAMPLES

Example 1
The following example uses the data from test_data/man_ex3.mcf, which you should compile using mcfg. It is an example small network, with 4 machines: bigserver, smallserver, client1 and client2. Each machine's profile has a map lqrequiresrq, listing the services it needs to consume, and a map lqprovidesrq listing the services it can offer. In addition, each machine has a named entry in a map lqserverrq which lists a hostname for the server providing that functionality (so server.ftp is its ftp server). These choices are for the sake of argument only.

We will use the following policy file (man_ex1.cw) to validate this potential network configuration.

{Check that every client's servers exist}
A @n [ A $r server [ ~ $r @n => E @m [ hostname @m = $r @n ]]];

{Check that every client has a server for everything it requires}
A @n [ A $r requires [ ~($r @n) => 
       E $s server   [ ~($s @n) => $_s = $r @n ]]];

{Check that every client's servers serve the requested function}
A @n [ A $r server [ (~ ($r @n)) => 
     {does there exist a machine with that name}
     E @m [ hostname @m = $r @n & 
        {does it provide that service}
        E $s provides [ ($s @m) = $_r ]]]];

The rules are ordered in terms of complexity. Note that they aren't easy to write (without practice anyway) and the debugging information you get isn't always ideal, however they are short, and very widely applicable (if time consuming to run). Note the way that annotations can be used within blocks to give an idea of what you were thinking for each clause. They appear within the justifications of a decision too. Run this through cfgw with (assuming you have all the paths set up, and you built the config files in the current directory).

: cfgw -nvp man_ex1.cw *.config

Note the use of verbose with annotations to give human readable information as to what failed. This is useful if you expect others (less skilled in cfgw perhaps?) to use files you produce.

An easy way to break each of the different rules is as follows:

1: Add a requires entry for another service, (eg ntp) to any of the machines, or to one of the aspects, without adding a corresponding server.ntp entry.
2: Set a client's server for some service to a random hostname (foo.example.com perhaps).
2: Change the definition for client2 server.ftp (line 42) to be some other service (so e.g. server.http = smallserver). smallserver only provides ftp so this will fail.

ERRORS

Lex errors

Unterminated string constant: Values cannot span lines, even within quotes.
Unterminated annotation: The end of file was reached without encountering a closing curly brace.
String constant truncated: cfgw ran out of memory trying to buffer a quoted value. It isn't expected that you will ever see this.
Annotation truncated: cfgw ran out of memory trying to buffer an annotation. It isn't expected that you will ever see this.
Unknown escape sequence: cfgw only supports two escape sequences, \\ and \lq. In particular, no new lines or octal escape sequences are allowed.
Unexpected characters: This is returned when characters in the file do not form any possible token of the mcfg language (You may have forgotten quotes around a value or inadvertantly corrupted the file).

STANDARDS

cfgw analyses configurations compatible with the standard format proposed in [1]. cfgw can handle all possible constructs.

[1]: lqConfiguration Tools: Working Togetherrq, Paul Anderson and Edmund Smith, Proceedings of LISA XIX, 2005.

BUGS

This implementation probably leaks memory. It is also probably horribly slow on very large sizes of input, although this has not yet been tested.

AUTHOR

This manual page was written by Edmund Smith <[EMAIL PROTECTED]>.

This document was created by man2html, using the manual pages.
Time: 05:34:48 GMT, October 23, 2006

_______________________________________________
lssconf-discuss mailing list
lssconf-discuss@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/lssconf-discuss