Hello,

This is a pre-announcement of Apache::ConfigParser.  I wrote it to
allow programs separate from Apache to completely understand,
parse and manipulate Apache configuration files.

The interface is not simple, but it supports more complicated
queries against a configuration, such as finding the ServerName
associated with a log file.

There are two separate modules described here.  The first manages
a single directive and the second assembles these into an object
that represents a complete configuration file.

Comments welcome, including the name of the module.

It's available now at

http://www.orcaware.com/perl/Apache-ConfigParser-0.01.tar.gz

and will be up on CPAN if there are no serious comments.

Regards,
Blair




NAME
         Apache::ConfigParser::Directive - An Apache directive or start context


SYNOPSIS
         use Apache::ConfigParser::Directive;

         # Create a new empty directive.
         my $d = Apache::ConfigParser::Directive->new;

         # Make it a ServerRoot directive.
         # ServerRoot /etc/httpd
         $d->name('ServerRoot');
         $d->value('/etc/httpd');

         # A more complicated directive.  Value automatically splits the
         # argument into separate elements.  It treats elements in "'s
         # as a single element.
         # LogFormat "%h %l %u %t \"%r\" %>s %b" common
         $d->name('LogFormat');
         $d->value('"%h %l %u %t \"%r\" %>s %b" common');

         # Get a string form of the name.
         # Prints `logformat'.
         print $d->name, "\n";

         # Get a string form of the value.
         # Prints `"%h %l %u %t \"%r\" %>s %b" common'.
         print $d->value, "\n";

         # Get the values separated into individual elements.
         # Whitespace separated elements that are enclosed in "'s are
         # treated as a single element.  Protected quotes, \", are
         # honored to not begin or end a value element.  In this form
         # protected "'s, \", are no longer protected.
         my @value = $d->get_value_array;
         scalar @value == 2;           # There are two elements in this array.
         $value[0] eq '%h %l %u %t "%r" %>s %b';
         $value[1] eq 'common';

         # The array form can also be set.  Change the style of LogFormat
         # from a common to a referer style log.
         $d->set_value_array('%{Referer}i -> %U', 'referer');

         # This is equivalent.
         $d->value('"%{Referer}i -> %U" referer');

         # There is also an equivalent pair of values that are called
         # `original' that can be accessed via orig_value,
         # get_orig_value_array and set_orig_value_array.
         $d->orig_value('"%{User-agent}i" agent');
         $d->set_orig_value_array('%{User-agent}i', 'agent');
         @value = $d->get_orig_value_array;
         scalar @value == 2;           # There are two elements in this array.
         $value[0] eq '%{User-agent}i';
         $value[1] eq 'agent';

         # You can set undef values for the strings.
         $d->value(undef);


DESCRIPTION
       The "Apache::ConfigParser::Directive" module is a subclass
       of "Tree::DAG_Node", which provides methods to represent
       nodes in a tree.  Each node is a single Apache configuration
       directive or root node for a context, such as <Directory>
       or <VirtualHost>.  All of the methods in that module
       are available here.  This module adds some additional
       methods that make it easier to represent Apache directives
       and contexts.

       This module holds a directive or context:

         name
         value in string form
         value in array form
         a separate value termed `original' in string form
         a separate value termed `original' in array form
         the filename where the directive was set
         the line number in the filename where the directive was set

       The `original' value is separate from the non-`original'
       value and the methods to operate on the two sets of values
       have distinct names.  The `original' value can be used to
       store the original value of a directive while the
       non-`original' value can be a modified form, such as
       changing the CustomLog filename to make it absolute.  The
       actual use of these two distinct values is up to the
       caller as this module does not link the two in any way.

METHODS
       The following methods are available:

       $d = Apache::ConfigParser::Directive->new;
           This creates a brand new
           "Apache::ConfigParser::Directive" object.

           Passing arguments to "new" to set the internal state
           is not recommended; use the following methods instead.

           There actually is no "new" method in the
           "Apache::ConfigParser::Directive" module.  Instead, because
           "Apache::ConfigParser::Directive" is a subclass of
           "Tree::DAG_Node", "Tree::DAG_Node::new" is used.

       $d->name
       $d->name($name)
           In the first form get the directive or context's name.
           In the second form set the new name of the directive
           or context to the lowercase version of $name and
           return the previous name.

       $d->value
       $d->value($value)
           In the first form get the directive's value in string
           form.  In the second form, return the previous directive
           value in string form and set the new directive
           value to $value.  $value can be set to undef.

           If the value is being set, then $value is saved so
           another call to "value" will return $value.  If $value
           is defined, then $value is also parsed into an array
           of elements that can be retrieved with the
           "value_array_ref" or "get_value_array" methods.  The
           parser separates elements by whitespace, unless
           whitespace separated elements are enclosed by "'s.
           Protected quotes, \", are honored to not begin or end
           a value element.
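
           The splitting rule above can be sketched in a few lines of
           core Perl.  This is only an illustration, not the module's
           actual parser: the name split_value_sketch is hypothetical,
           and unprotecting \\ as well as \" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::ConfigParser::Directive.
# Split a directive value on whitespace, treat "..." as one element,
# and turn a protected \" into a plain " inside the element.
sub split_value_sketch {
    my ($value) = @_;
    return unless defined $value;

    my @elements;
    while ($value =~ /\G\s*(?:"((?:[^"\\]|\\.)*)"|(\S+))/gc) {
        my $element = defined $1 ? $1 : $2;
        $element =~ s/\\(["\\])/$1/g;    # assumption: unprotect \" and \\
        push @elements, $element;
    }
    return @elements;
}

my @parts = split_value_sketch('"%h %l %u %t \"%r\" %>s %b" common');
print scalar(@parts), " elements\n";
print "  [$_]\n" foreach @parts;
```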

       $d->orig_value
       $d->orig_value($value)
           Identical behavior to "value", except that this
           applies to the `original' value.  Use
           "orig_value_array_ref" or "get_orig_value_array" to get
           the value elements.

       $d->value_array_ref
       $d->value_array_ref(\@array)
           In the first form get a reference to the value array.
           This can return an undefined value if an undefined
           value was passed to "value" or an undefined reference
           was passed to "value_array_ref".  In the second form
           "value_array_ref" sets the value array and value
           string.  Both forms of "value_array_ref" return the
           original array reference.

           If you modify the value array reference after getting
           it and do not use "value_array_ref" or "set_value_array"
           to set the value, then the string returned from
           "value" will not be consistent with the array.

       $d->orig_value_array_ref
       $d->orig_value_array_ref(\@array)
           Identical behavior to "value_array_ref", except that
           this applies to the `original' value.

       $d->get_value_array
           Get the value array elements.  If the value was set to
           an undefined value using "value", then
           "get_value_array" will return an empty list in a list
           context, an undefined value in a scalar context, or
           nothing in a void context.
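
           The context-dependent return described above is what a
           bare "return" gives in Perl.  A minimal sketch, using a
           hypothetical get_array_sketch function rather than the
           module's own code:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a getter whose stored array may be undefined.
sub get_array_sketch {
    my ($array_ref) = @_;
    # A bare return yields () in list context and undef in scalar context.
    return unless defined $array_ref;
    return @$array_ref;
}

my @list   = get_array_sketch(undef);    # list context
my $scalar = get_array_sketch(undef);    # scalar context
printf "list has %d elements, scalar is %s\n",
    scalar(@list), defined $scalar ? 'defined' : 'undefined';
```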

       $d->get_orig_value_array
           This has the same behavior as "get_value_array" except
           that it operates on the `original' value.

       $d->set_value_array(@values)
           Set the value array elements.  If no elements are
           passed in, then the value will be defined but empty
           and a following call to "get_value_array" will return
           an empty array.

           After setting the value elements with this method, the
           string returned from calling "value" is a concatenation
           of the elements, formatted so that the output could
           be used in an Apache configuration file.  If an element
           contains whitespace, then "'s are placed around it as
           it is concatenated into the value string.  If an element
           contains a " or a \, then a copy of the element is made
           and the character is protected, i.e. \" or \\, before
           the copy is placed into the value string.
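
           As a rough illustration of that joining rule, and not
           the module's real implementation, the following sketch
           uses a hypothetical join_value_sketch function:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::ConfigParser::Directive.
# Build a configuration-file style string from a list of elements.
sub join_value_sketch {
    my @elements = @_;
    my @quoted;
    foreach my $element (@elements) {
        my $copy = $element;                    # leave the caller's data alone
        $copy =~ s/(["\\])/\\$1/g;              # protect " and \
        $copy = qq{"$copy"} if $copy =~ /\s/;   # quote if it has whitespace
        push @quoted, $copy;
    }
    return join ' ', @quoted;
}

print join_value_sketch('%{Referer}i -> %U', 'referer'), "\n";
```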

       $d->set_orig_value_array(@values)
           This has the same behavior as "set_value_array" except
           that it operates on the `original' value, so to get a
           string version, use "orig_value".

       $d->filename
       $d->filename($filename)
           In the first form get the filename where this particular
           directive or context appears.  In the second form
           set the new filename of the directive or context and
           return the previous filename.

       $d->line_number
       $d->line_number($line_number)
           In the first form get the line number where the directive
           or context appears in a filename.  In the second
           form set the new line number of the directive or context
           and return the previous line number.

SEE ALSO
       the Apache::ConfigParser manpage and the
       Tree::DAG_Node manpage.





NAME
       Apache::ConfigParser - Load Apache configuration files

SYNOPSIS
         use Apache::ConfigParser;

         # Create a new empty parser.
         my $c1 = Apache::ConfigParser->new;

         # Create a new parser and load a specific configuration file.
         my $c2 = Apache::ConfigParser->new('/etc/httpd/conf/httpd.conf');

         # Load a configuration file explicitly.
         $c1->parse_file('/etc/httpd/conf/httpd.conf');

         # Get the root of a tree that represents the configuration file.
         # This is an Apache::ConfigParser::Directive object.
         my $root = $c1->root;

         # Get all of the directives and context starts.
         my @directives = $root->daughters;

         # Get the first directive's name.
         my $d_name = $directives[0]->name;

         # This directive appeared in this file, which may be an
         # Include'd file.
         my $d_filename = $directives[0]->filename;

         # And it begins on this line number.
         my $d_line_number = $directives[0]->line_number;

         # Find all the CustomLog entries, regardless of context.
         my @custom_logs = $c1->find_at_and_down_option_names('CustomLog');

         # Get the first CustomLog.
         my $custom_log = $custom_logs[0];

         # Get the value in string form.
         my $custom_log_value = $custom_log->value;

         # Get the value in array form already split.
         my @custom_log_args = $custom_log->get_value_array;

         # Get the same array but a reference to it.
         my $custom_log_args_ref = $custom_log->value_array_ref;

         # The first value in a CustomLog is the filename of the log.
         my $custom_log_file = $custom_log_args_ref->[0];

         # Get the original value before the path has been made absolute.
         @custom_log_args = $custom_log->get_orig_value_array;
         $custom_log_file = $custom_log_args[0];


DESCRIPTION
       The "Apache::ConfigParser" module is used to load an
       Apache configuration file to allow programs to determine
       Apache's configuration options.  The resulting object
       contains a tree based structure using the
       "Apache::ConfigParser::Directive" class, which is a subclass
       of "Tree::DAG_Node", so all of the methods that enable tree
       based searches and modifications are available.  The tree
       structure is used to represent the ability to nest sections,
       such as <VirtualHost>, <Directory>, etc.

       Apache does a great job of checking Apache configuration
       files for errors and this module leaves most of that to
       Apache.  This module does minimal configuration file
       checking.  The module currently checks for:

       Start and end context names match
           The module checks if the start and end context names
           match.  If the end context name does not match the
           start context name, then it is ignored.  The module
           does not even check if the configuration options
           have valid names.

PARSING
       Notes regarding parsing of configuration files.

       Line continuation is treated exactly as in Apache 1.3.20.
       Line continuation occurs only when the line ends in
       [^\\]\\\r?\n.  If the line ends in two \'s, then it will
       replace the two \'s with one \ and not continue the line.
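
       A minimal sketch of that rule follows.  It is not the module's
       code: the function name join_continued_lines is hypothetical,
       and removing the backslash and line ending when a line is
       continued is an assumption about the intended result.

```perl
use strict;
use warnings;

# Hypothetical helper: turn physical lines into logical lines.
sub join_continued_lines {
    my @physical = @_;
    my @logical;
    my $pending = '';
    foreach my $line (@physical) {
        if ($line =~ s/\\\\(\r?\n)\z/\\$1/) {
            # Two trailing \'s: keep one \ and do not continue the line.
            push @logical, $pending . $line;
            $pending = '';
        }
        elsif ($line =~ /[^\\]\\\r?\n\z/) {
            # One trailing \: drop it and the line ending, then continue.
            $line =~ s/\\\r?\n\z//;
            $pending .= $line;
        }
        else {
            push @logical, $pending . $line;
            $pending = '';
        }
    }
    push @logical, $pending if length $pending;
    return @logical;
}

my @lines = join_continued_lines(
    "ServerName \\\n", "www.example.com\n", "Foo bar\\\\\n", "Next\n",
);
print foreach @lines;
```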

EXPORTED VARIABLES
       The following variables are exported via @EXPORT_OK.

       %directive_takes_rel_path
           This hash is keyed by the lowercase version of a
           directive name.  The hash value is a subroutine
           reference.  If a hash entry exists for a particular
           directive, then the directive can take a relative path
           that may need to be made absolute.  The subroutine takes
           a single argument, which should be the potential file
           path entry, and it returns 1 if the potential filename
           is a valid filename that can be made absolute, 0
           otherwise.

           For example, ErrorLog can take a filename, a piped
           command or a syslog:* entry.  The particular subroutine
           for ErrorLog checks if the value is a filename.

           On Windows, these subroutines return 0 if the value is
           'nul'.

           These subroutines do not remove any "'s before
           checking on the type of value.

           This is a list of directives and any special values to
           check for as of Apache 1.3.20.

             AccessConfig
             AuthGroupFile
             AuthUserFile
             CookieLog
             CustomLog             check for "| command"
             ErrorLog              check for "| command" or syslog:
             Include
             LoadFile
             LoadModule
             LockFile
             MimeMagicFile
             PidFile
             RefererLog            check for "| command"
             ResourceConf
             ScoreBoardFile
             ScriptLog
             TransferLog           check for "| command"
             TypesConfig


METHODS
       The following methods are available:

       $c = Apache::ConfigParser->new
       $c = Apache::ConfigParser->new({options})
       $c = Apache::ConfigParser->new($filename)
       $c = Apache::ConfigParser->new({options}, $filename)
           Create a new "Apache::ConfigParser" object that stores
           the content of an Apache configuration file.  The
           first optional argument is a reference to a hash that
           contains options to new.

           If $filename is given, then the contents of $filename
           will be loaded.  If $filename cannot be opened then
           $! will contain the error message for the failed
           open() and new will return an empty list in a list
           context, an undefined value in a scalar context, or
           nothing in a void context.

           The currently recognized options are:

           pre_transform_path_sub => sub { }
               This allows the file or directory name for any
               directive that is a filename or directory name to
               be transformed by this subroutine before it is
               made absolute with ServerRoot.  This transformation
               is applied to any of the directives that
               appear in %directive_takes_rel_path.

               The subroutine is passed the following arguments:

                 Apache::ConfigParser object
                 lowercase string of the configuration directive
                 the file or directory name to transform


           post_transform_path_sub => sub { }
               This allows the file or directory name for any
               directive that is a filename or directory name to
               be transformed by this subroutine after it is made
               absolute with ServerRoot.  This transformation is
               applied to the same directives as
               pre_transform_path_sub.

               The subroutine is passed the following arguments:

                 Apache::ConfigParser object
                 lowercase version of the configuration directive
                 the file or directory name to transform


           One example of where the transformations are useful is
           when the Apache configuration directory on one host is
           NFS exported to another host.  The remote host parses
           the configuration file using "Apache::ConfigParser",
           and the paths to the access logs must be transformed
           so that the remote host can properly find them.
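
           A transformation subroutine for the NFS case might look
           like the sketch below.  The prefix map and its paths are
           made up for illustration, and the sketch assumes that the
           subroutine returns the transformed path, which the text
           above does not state.  It would be passed as
           { post_transform_path_sub => $post_transform } in the
           options hash given to new.

```perl
use strict;
use warnings;

# Hypothetical mapping from paths on the web host to the NFS mount
# points seen by the host that parses the configuration.
my @prefix_map = (
    [ '/var/log/httpd' => '/net/web1/var/log/httpd' ],
);

# Arguments follow the documented order: parser object, lowercase
# directive name, file or directory name.
# Assumption: the return value is the transformed path.
my $post_transform = sub {
    my ($parser, $directive, $path) = @_;
    foreach my $pair (@prefix_map) {
        my ($from, $to) = @$pair;
        return $to . substr($path, length $from)
            if index($path, $from) == 0;
    }
    return $path;    # leave unmatched paths alone
};

print $post_transform->(undef, 'customlog', '/var/log/httpd/access_log'), "\n";
```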

       $c->DESTROY
           There is an explicit DESTROY method for this class to
           destroy the tree, since it has cyclical references.

       $c->parse_file($filename)
           This method takes a filename and adds it to the
           already loaded configuration file inside the object.
           If a previous Apache configuration file was loaded
           either with new or parse_file and the configuration
           file did not close all of its contexts, such as
           <VirtualHost>, then the new configuration options in
           $filename will be added to the existing context.  If
           $filename could not be opened, then $! will contain
           the reason for open's failure.

       $c->root
           Returns the root of the tree that represents the
           Apache configuration file.  Each object here is an
           "Apache::ConfigParser::Directive".

       $c->find_at_and_down_option_names('option', ...)
       $c->find_at_and_down_option_names($node, 'option', ...)
           In list context, returns the list of all of $c's options
           that match the option names listed at the level of
           $node and below.  In scalar context, returns the number
           of such options.  The level here is in a tree
           sense, not in the sense of where options appear relative
           to $node in the configuration file.  If $node is given,
           then the search is started at $node, includes $node and
           searches $node's children.  If $node is not passed,
           then it starts at the top of the tree and searches the
           whole configuration file.

           All of the option names are made lowercase.

           This is useful if you want to find all of the
           CustomLog's in the configuration file:

             my @logs = $c->find_at_and_down_option_names('CustomLog');


       $c->find_in_siblings_option_names('option', ...)
       $c->find_in_siblings_option_names($node, 'option', ...)
           In list context, returns the list of all of $c's options
           that match the option names at the same level as
           $node, that is, siblings of $node.  In scalar context,
           returns the number of such options.  The level here is
           in a tree sense, not in the sense of where options
           appear relative to $node in the configuration file.  If
           $node is not given, or $node is passed and it is
           "$c->tree", then it will search through root's children.

           All of the option names are made lowercase.

       $c->find_in_siblings_and_up_option_names($node, 'option', ...)
           In list context, returns the list of all of $c's options
           that match the option names at the same level as
           $node, that is, siblings of $node, and above $node.  In
           scalar context, returns the number of such options.
           The level here is in a tree sense, not in the sense
           of where options appear relative to $node in the
           configuration file.  In this method $node is a required
           argument, because it does not make sense to check the
           root node.

           All of the option names are made lowercase.

           This is useful when you find an option and you want to
           find an associated option.  For example, find all of
           the CustomLog's and find the associated ServerName.

             foreach my $log_node ($c->find_at_and_down_option_names('CustomLog')) {
               # The first element of a CustomLog value is the log filename.
               my $log_filename = ($log_node->get_value_array)[0];
               my @server_names =
                 $c->find_in_siblings_and_up_option_names($log_node, 'ServerName');
               next unless @server_names;
               my $server_name  = $server_names[0]->value;
               print "ServerName for $log_filename is $server_name\n";
             }
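
           To show what "siblings and up" means in a tree sense,
           here is a self-contained sketch that uses plain hashes
           in place of "Tree::DAG_Node" objects.  The node layout
           and the function names new_node, find_at_and_down and
           find_in_siblings_and_up are all hypothetical; the order
           of results in the real module may differ.

```perl
use strict;
use warnings;

# Hypothetical minimal node: name, value, mother and daughters.
sub new_node {
    my ($mother, $name, $value) = @_;
    my $node = { name => lc $name, value => $value,
                 mother => $mother, daughters => [] };
    push @{ $mother->{daughters} }, $node if $mother;
    return $node;
}

# Search $node and everything below it, in document order.
sub find_at_and_down {
    my ($node, @names) = @_;
    my %want = map { (lc($_) => 1) } @names;
    my (@found, @stack);
    @stack = ($node);
    while (my $n = shift @stack) {
        push @found, $n if $want{ $n->{name} };
        unshift @stack, @{ $n->{daughters} };
    }
    return @found;
}

# Search the siblings of $node, then the siblings of each ancestor.
sub find_in_siblings_and_up {
    my ($node, @names) = @_;
    my %want = map { (lc($_) => 1) } @names;
    my @found;
    for (my $m = $node->{mother}; $m; $m = $m->{mother}) {
        push @found, grep { $want{ $_->{name} } } @{ $m->{daughters} };
    }
    return @found;
}

my $root  = new_node(undef, 'root', undef);
new_node($root, 'ServerName', 'www.example.com');
my $vhost = new_node($root, 'VirtualHost', '*:80');
new_node($vhost, 'ServerName', 'vhost.example.com');
new_node($vhost, 'CustomLog', 'logs/vhost_access_log common');

foreach my $log (find_at_and_down($root, 'CustomLog')) {
    my @names = find_in_siblings_and_up($log, 'ServerName');
    print "ServerName for $log->{value} is $names[0]{value}\n";
}
```

           The nearest match comes first, so the ServerName inside
           the enclosing context is preferred over the one at the
           top level.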


       $c->dump
           Return an array of lines that represents the internal
           state of the tree.

SEE ALSO
       the Apache::ConfigParser::Directive manpage and the
       Tree::DAG_Node manpage.
