Hi Valentin. Thank you very much for your patch implementing this much-desired automake feature. Here is a preliminary review...
On Friday 03 September 2010, Valentin David wrote: > Dear Automake list, > > I work with programming languages. It is very common to write small > compilers, parser generators, or other. When using my tools, it is > hard to convince my co-workers they should use the autotools when I > cannot do the proper extension of Automake. Other people end up > writing builders for Eclipse. I will never go to the dark side. And I > wish to provide the right developing tools for my users, e.g. an > extension to Automake. It is very easy to extend Autoconf for > instance. But not for Automake. > > I realize, it is not often that people want to extend Automake. Most > of the time, fancy libraries are coming with their Autoconf files, and > the user is using the same compilers. Libraries do not need extension > in Automake. Compilers do. > > To keep a patch set for Automake is not an option. It already > irritates the system administrators when I tell them that the versions > of the autotools are too old. Moreover, I cannot ask my co-workers to > install it by hand in another prefix. > > You would surely ask me why I do not just write generic rules in a > makefile to be included anywhere I need to use my tool. This has > several problems. > > The first problem I encounter, is the dependency files. My tools might > generate some. It is too much to ask a user to include by hand a > dependency file. If there are several objects to compile, then they > will just probably forget to include some. Also, you need to use > "-include" (which I wonder whether it is portable), or trick Automake > by using "@am__include@". The latter might be caught by the config > command from depout.m4, so you do not have to make one. Tell the users > why they cannot use just "include" instead... > > Second problem, you cannot use the Automake variables to generate the > right files. The user needs to trick. Here is an example. > > #INCDEP is subst to "include ./$(DEPDIR)/" > @inc...@foo.pmyhh > > EXTRA_DIST=foo.myhh > foo_SOURCES=foo.cc > nodist_foo_SOURCES=foo.hh > > # This does not work, guess why. > foo-foo.o: foo.hh > > # This one will work, but it is like killing a > # fly with the assault rifle, specially if you > # have several objects. Hope you have ccache. > $(foo_OBJECTS): foo.hh > > So now, it is not obvious for the user why foo.myhh is not a source, > but an "extra", why foo.hh has to be set manually in the sources, and > you are happy when your user understands what "nodist_" means. > > What I wanted the user to write was only: > foo_SOURCES=foo.cc foo.myhh > > After all you can do it with Yacc or Lex, why not with my tools? > > I propose a patch as attached. > > * The lang_*_rewrite are added to the Language structure. The default > is lang_sub_obj. They do not return anymore the object extension > because the field 'output_extensions' already computes it. Hmm... looking at the git diffs, this seems like a useful refactoring by itself. Could this be done in separate, "preparatory" patch preceding the implementation of the "user extensions support" proper? > * Compilers that generate source files might still generate dependency files. > > * --libdir= can be called several times, the arguments can also have a > list of paths separated by a colon. We should really use the system PATH separator here, for better portability (such PATH separator is already computed and AC_SUBST'd by configure as `...@path_separator@'). > Empty paths correspond to the original libdir. I guess this is done so that the user can override the previously-passed custom libdirs with the default one, without having to know its complete path -- right? I think this is useful and smart, but couldn't we find a better speacial value than "empty" to indicate the default libdir? > * For all libdir, all libdir/lang/*.pm are loaded. This happens right > after parsing the options. Clean and simple. I agree that this is the best way to go ATM. > The test tests/user_def_lang.test should show how the feature works. We'll need much deeper test than the one provided by your patch (I can help writing them if you want). > I can understand that loading perl files is not really nice because > there easily can be problems of backward compatibility. But it's surely better than the present situation IMHO. > I am waiting for your comments. > > Best regards, > -*-*-*- I don't know the automake internals very well, so I cannot judge the overall, complete effect of your patch (which anyway *seems* quite limited -- which is good, and a point in favor of your patch). That said, I think I can provide some (hopefully) useful, albeit limited, criticism. > diff --git a/automake.in b/automake.in > index c0c5289..25c3467 100644 > --- a/automake.in > +++ b/automake.in > @@ -107,7 +107,14 @@ struct (# Short name of the language (c, f77...). > > # If TRUE, nodist_ sources will be compiled using specific rules > # (i.e. not inference rules). The default is FALSE. > > - 'nodist_specific' => "\$"); > + 'nodist_specific' => "\$", > + > + # This subroutine follows a simple formula: Return value is > + # LANG_SUBDIR if the resulting object file should be in a > + # subdir if the source file is, LANG_PROCESS if file is to > + # be dealt with, LANG_IGNORE otherwise. > + # Arguments are ($DIRECTORY, $BASE, $EXT) > + 'rewrite' => "\$"); > > sub finish ($) > > @@ -298,6 +305,9 @@ use constant QUEUE_STRING => "string"; > > ## Variables related to the options. ## > ## ---------------------------------- ## > > +# Contains directories where library files are. > +my @userlibdir = (); > + > > # TRUE if we should always generate Makefile.in. > my $force_generation = 1; > > @@ -744,7 +754,8 @@ register_language ('name' => 'c', > > 'compile_flag' => '-c', > 'libtool_tag' => 'CC', > 'extensions' => ['.c'], > - '_finish' => \&lang_c_finish); > + '_finish' => \&lang_c_finish, > + 'rewrite' => \&lang_c_rewrite); > > # C++. > register_language ('name' => 'cxx', > > @@ -809,7 +820,8 @@ register_language ('name' => 'header', > > # No output. > 'output_extensions' => sub { return () }, > # Nothing to do. > - '_finish' => sub { }); > + '_finish' => sub { }, > + 'rewrite' => sub { return LANG_IGNORE; }); > > # Vala > register_language ('name' => 'vala', > > @@ -1045,7 +1057,8 @@ register_language ('name' => 'java', > > 'lder' => 'GCJLD', > 'ld' => '$(GCJ)', > 'pure' => 1, > - 'extensions' => ['.java', '.class', '.zip', '.jar']); > + 'extensions' => ['.java', '.class', '.zip', '.jar'], > + 'rewrite' => sub { return LANG_SUBDIR; }); > > ################################################################ > > @@ -1858,17 +1871,15 @@ sub handle_single_transform ($$$$$%) > > } > > } > > - # Note: computed subr call. The language rewrite function > - # should return one of the LANG_* constants. It could > - # also return a list whose first value is such a constant > - # and whose second value is a new source extension which > - # should be applied. This means this particular language > - # generates another source file which we must then process > - # further. > - my $subr = \&{'lang_' . $lang->name . '_rewrite'}; > - my ($r, $source_extension) > - = &$subr ($directory, $base, $extension, > + # The language rewrite function should return one of the > + # LANG_* constants. > + my $r > + = &{$lang->rewrite} ($directory, $base, $extension, > $nonansi_obj, $have_per_exec_flags, $var); What about a saner indentation here? As in: my $r = &{$lang->rewrite} ($directory, $base, $extension, $nonansi_obj, $have_per_exec_flags, $var); > + > + my $output_extension > + = (&{$lang->output_extensions} ($extension))[0]; > + > > # Skip this entry if we were asked not to process it. > next if $r == LANG_IGNORE; > > @@ -1876,9 +1887,9 @@ sub handle_single_transform ($$$$$%) > > $linker = $lang->linker; > > my $this_obj_ext; > > - if (defined $source_extension) > + if ($output_extension ne '.$(OBJEXT)')> > { > - $this_obj_ext = $source_extension; > + $this_obj_ext = $output_extension; > $derived_source = 1; > > } > elsif ($lang->ansi) > > @@ -2056,10 +2067,19 @@ sub handle_single_transform ($$$$$%) > > $lang->target_hook ($aggregate, $object, $full, %transform); > } > > + # Transform .o or $o file into .P file (for automatic > + # dependency code). > + if ($lang && $lang->autodep ne 'no') > + { > + my $depfile = $object; > + $depfile =~ s/\.([^.]*)$/.P$1/; > + $depfile =~ s/\$\(OBJEXT\)$/o/; > + $dep_files{dirname ($depfile) . '/$(DEPDIR)/' > + . basename ($depfile)} = 1; > + } > + > > if ($derived_source) > { > - prog_error ($lang->name . " has automatic dependency tracking") > - if $lang->autodep ne 'no'; > > # Make sure this new source file is handled next. That will > # make it appear to be at the right place in the list. > unshift (@files, $object); > > @@ -2120,16 +2140,6 @@ sub handle_single_transform ($$$$$%) > > if scalar @dep_list > 0; > } > > - # Transform .o or $o file into .P file (for automatic > - # dependency code). > - if ($lang && $lang->autodep ne 'no') > - { > - my $depfile = $object; > - $depfile =~ s/\.([^.]*)$/.P$1/; > - $depfile =~ s/\$\(OBJEXT\)$/o/; > - $dep_files{dirname ($depfile) . '/$(DEPDIR)/' > - . basename ($depfile)} = 1; > - } > } > > return @result; > > @@ -2555,7 +2565,7 @@ sub handle_compile () > > } > > my ($coms, $vars, $rules) = > > - &file_contents_internal (1, "$libdir/am/compile.am", > + &file_contents_internal (1, libfile ("am/compile.am"), > > new Automake::Location, > ('DEFAULT_INCLUDES' => $default_includes, > > 'MOSTLYRMS' => join ("\n", @mostly_rms), > > @@ -5726,16 +5736,9 @@ sub check_gnits_standards > > # > # Functions to handle files of each language. > > -# Each `lang_X_rewrite($DIRECTORY, $BASE, $EXT)' function follows a > -# simple formula: Return value is LANG_SUBDIR if the resulting object > -# file should be in a subdir if the source file is, LANG_PROCESS if > -# file is to be dealt with, LANG_IGNORE otherwise. > - > # Much of the actual processing is handled in > # handle_single_transform. These functions exist so that > # auxiliary information can be recorded for a later cleanup pass. > > -# Note that the calls to these functions are computed, so don't bother > -# searching for their precise names in the source. > > # This is just a convenience function that can be used to determine > # when a subdir object should be used. > > @@ -5799,128 +5802,6 @@ sub lang_c_rewrite > > return $r; > > } > > -# Rewrite a single C++ source file. > -sub lang_cxx_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single header file. > -sub lang_header_rewrite > -{ > - # Header files are simply ignored. > - return LANG_IGNORE; > -} > - > -# Rewrite a single Vala source file. > -sub lang_vala_rewrite > -{ > - my ($directory, $base, $ext) = @_; > - > - (my $newext = $ext) =~ s/vala$/c/; > - return (LANG_SUBDIR, $newext); > -} > - > -# Rewrite a single yacc file. > -sub lang_yacc_rewrite > -{ > - my ($directory, $base, $ext) = @_; > - > - my $r = &lang_sub_obj; > - (my $newext = $ext) =~ tr/y/c/; > - return ($r, $newext); > -} > - > -# Rewrite a single yacc++ file. > -sub lang_yaccxx_rewrite > -{ > - my ($directory, $base, $ext) = @_; > - > - my $r = &lang_sub_obj; > - (my $newext = $ext) =~ tr/y/c/; > - return ($r, $newext); > -} > - > -# Rewrite a single lex file. > -sub lang_lex_rewrite > -{ > - my ($directory, $base, $ext) = @_; > - > - my $r = &lang_sub_obj; > - (my $newext = $ext) =~ tr/l/c/; > - return ($r, $newext); > -} > - > -# Rewrite a single lex++ file. > -sub lang_lexxx_rewrite > -{ > - my ($directory, $base, $ext) = @_; > - > - my $r = &lang_sub_obj; > - (my $newext = $ext) =~ tr/l/c/; > - return ($r, $newext); > -} > - > -# Rewrite a single assembly file. > -sub lang_asm_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single preprocessed assembly file. > -sub lang_cppasm_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single Fortran 77 file. > -sub lang_f77_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single Fortran file. > -sub lang_fc_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single preprocessed Fortran file. > -sub lang_ppfc_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single preprocessed Fortran 77 file. > -sub lang_ppf77_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single ratfor file. > -sub lang_ratfor_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single Objective C file. > -sub lang_objc_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single Unified Parallel C file. > -sub lang_upc_rewrite > -{ > - return &lang_sub_obj; > -} > - > -# Rewrite a single Java file. > -sub lang_java_rewrite > -{ > - return LANG_SUBDIR; > -} > - What a nice trim! > # The lang_X_finish functions are called after all source file > # processing is done. Each should handle defining rules for the > # language, etc. A finish function is only called if a source > file of > > @@ -6288,6 +6169,8 @@ sub register_language (%) > unless defined $option{'output_extensions'}; > > $option{'nodist_specific'} = 0 > unless defined $option{'nodist_specific'}; > > + $option{'rewrite'} = \&lang_sub_obj > + unless defined $option{'rewrite'}; > > my $lang = new Language (%option); > > @@ -6969,7 +6852,7 @@ sub define_standard_variables > > { > my $saved_output_vars = $output_vars; > my ($comments, undef, $rules) = > > - file_contents_internal (1, "$libdir/am/header-vars.am", > + file_contents_internal (1, libfile ("am/header-vars.am"), > new Automake::Location); > > foreach my $var (sort keys %configure_vars) > > @@ -7187,7 +7070,20 @@ sub make_paragraphs ($%) > > return @res; > } > > - > +# libfile ($FILE) > +# --------------- > +# Look for $FILE in the library directories. Returns the first path > +# found. If not found, returns a path from the library directory with > +# highest priority. The function implementation below doesn't respect the GCS (GNU coding standards) w.r.t. use of white spaces, indentation and brackets placement. And while I dislike what the GCS mandate on these matters, we should strive to be consistent anyway. > +sub libfile ($) > +{ > + my ($f) = @_; > + foreach (@userlibdir) { > + return "$_/$f" > + if -r "$_/$f"; Wouldn't a `-f "$_/$f"' be more correct here? > + } > + return $userlibdir[0] . "/" . $f; Shouldn't this be `$libdir' instead of `$userlibdir[0]'? Otherwise, Automake even fails to bootstrap itself, giving the error message: Use of uninitialized value $userlibdir[0] in concatenation (.) or string at ./automake.tmp line 7088. automake.tmp: error: cannot open < /am/header-vars.am: No such file or directory With the change I'm proposing, Automake bootstrap itself *and* passes the whole testsuite. > +} > > # ($COMMENT, $VARIABLES, $RULES) > # &file_contents_internal ($IS_AM, $FILE, $WHERE, [%TRANSFORM]) > > @@ -7246,7 +7142,7 @@ sub file_contents_internal ($$$%) > { > if ($cond != FALSE) > { > - my $file = ($is_am ? "$libdir/am/" : '') . $1; > + my $file = ($is_am ? libfile ("am/".$1) : $1); > > $where->push_context ("`$file' included from here"); > # N-ary `.=' fails. > my ($com, $vars, $rules) > > @@ -7391,7 +7287,7 @@ sub file_contents ($$%) > > { > > my ($basename, $where, %transform) = @_; > my ($comments, $variables, $rules) = > > - file_contents_internal (1, "$libdir/am/$basename.am", $where, > + file_contents_internal (1, libfile ("am/$basename.am"), $where, > %transform); > > return "$comments$variables$rules"; > > } > > @@ -7864,7 +7760,7 @@ sub require_file_internal ($$$@) > > my $message = "required file `$fullfile' not found"; > if ($add_missing) > { > - if (-f "$libdir/$file") > + if (-f libfile ("$file") ) > { > $suppress = 1; > > @@ -7888,14 +7784,14 @@ sub require_file_internal ($$$@) > > unlink ($fullfile) if -f $fullfile; > if ($symlink_exists && ! $copy_missing) > > { > > - if (! symlink ("$libdir/$file", $fullfile) > + if (! symlink (libfile ("$file"), $fullfile) > > || ! -e $fullfile) > > { > > $suppress = 0; > $trailer = "; error while making link: $!"; > > } > > } > > - elsif (system ('cp', "$libdir/$file", $fullfile)) > + elsif (system ('cp', libfile ("$file"), $fullfile)) > > { > > $suppress = 0; > $trailer = "\n error while copying"; > > @@ -7923,7 +7819,7 @@ sub require_file_internal ($$$@) > > else > > { > > $trailer = "\n `automake --add-missing' can install `$file'" > > - if -f "$libdir/$file"; > + if -f libfile ("$file"); > > } > > # If --force-missing was specified, and we have > > @@ -8458,7 +8354,9 @@ sub parse_arguments () > > my $cli_where = new Automake::Location; > my %cli_options = > > ( > > - 'libdir=s' => \$libdir, > + 'libdir=s' => sub { push @userlibdir, > + (map { ($_ eq '') ? "$libdir" : $_ } > + split (':', $_[1])); }, We should use $PATH_SEPARATOR here, not ':'. > > 'gnu' => sub { set_strictness ('gnu'); }, > 'gnits' => sub { set_strictness ('gnits'); }, > 'cygnus' => sub { set_global_option ('cygnus', > $cli_where); }, > > @@ -8724,6 +8622,25 @@ sub handle_makefiles_threaded ($) > > } > > } > > +# load_languages () > +# ----------------- > +# Load Perl files for each .pm files in all libdir/lang. > +sub load_languages > +{ > + foreach (@userlibdir) { > + my $dir = $_; > + if (-d "$dir/lang") { > + opendir(LANG_DIR, "$dir/lang") || next; > + my @files=readdir(LANG_DIR); > + foreach (@files) { > + do "$dir/lang/$_" > + if (-r "$dir/lang/$_" && $_ =~ /\.pm$/); Wouldn't a `-f "$dir/lang/$_"' be more correct here? > + } > + closedir(LANG_DIR); > + } > + } > +} > + > > ################################################################ > > # Parse the WARNINGS environment variable. > > @@ -8732,6 +8649,8 @@ parse_WARNINGS; > > # Parse command line. > parse_arguments; > > +load_languages; > + > > $configure_ac = require_configure_ac; > > # Do configure.ac scan only once. > > diff --git a/tests/Makefile.am b/tests/Makefile.am > index d86b03b..23903a0 100644 > --- a/tests/Makefile.am > +++ b/tests/Makefile.am > @@ -774,6 +774,7 @@ unused.test \ > > upc.test \ > upc2.test \ > upc3.test \ > +user_def_language.test \ Typo; should be `user_def_lang.test'. > vala.test \ > vala1.test \ > vala2.test \ > > diff --git a/tests/Makefile.in b/tests/Makefile.in > index 77ec888..bf22206 100644 > --- a/tests/Makefile.in > +++ b/tests/Makefile.in > @@ -985,6 +985,7 @@ unused.test \ > > upc.test \ > upc2.test \ > upc3.test \ > +user_def_lang.test \ > vala.test \ > vala1.test \ > vala2.test \ > > diff --git a/tests/user_def_lang.test b/tests/user_def_lang.test > new file mode 100755 > index 0000000..9b015da > --- /dev/null > +++ b/tests/user_def_lang.test This test is missing the "#! /bin/sh" shebang line, the copyright header, and a brief comment explaining what is being tested. > @@ -0,0 +1,80 @@ > +. ./defs || Exit 1 > + > +set -e > + > +cat > configure.in << 'END' > +AC_INIT > +AM_INIT_AUTOMAKE(nonesuch, nonesuch) > +AC_PROG_CC > +AC_SUBST([FOO], "foo") > +AC_OUTPUT(Makefile) > +END Why not use the configure.in setup already provided by ./defs? I.e. just do: cat >> configure.in << 'END' AC_PROG_CC AC_SUBST([FOO], [foo]) AC_OUTPUT END > + > +cat > Makefile.am <<'END' > +bin_PROGRAMS = bar > +bar_SOURCES = bar.foo > +END > + > +mkdir -p lib/am > +cat > lib/am/foo.am <<'END' > +?GENERIC?%EXT%%DERIVED-EXT%: > +?!GENERIC?%OBJ%: %SOURCE% > +?GENERIC? %VERBOSE%%COMPILE% <%SOURCE% >%OBJ% > +?GENERIC? %SILENT%echo %OBJ%: somefile >%DEPBASE%.Po > +?!GENERIC? %VERBOSE%%COMPILE% <`test -f '%SOURCE%' || echo > '$(srcdir)/'`%SOURCE% >%OBJ% > +?!GENERIC??SUBDIROBJ? %SILENT%depbase=`echo %OBJ% | sed > 's|[^/]*$$|$(DEPDIR)/&|;s|\.o$$||'`;\ > +?!GENERIC??SUBDIROBJ? echo %OBJ%: somefile >%DEPBASE%.Po > +?!GENERIC??!SUBDIROBJ? %SILENT%echo %OBJ%: somefile >%DEPBASE%.Po > +END > + > +mkdir -p lib/lang > +cat > lib/lang/foo.pm <<'END' > +register_language (# Short name of the language (c, f77...). > + 'name' => "foo", > + # Nice name of the language (C, Fortran 77...). > + 'Name' => "Foo", > + # List of configure variables which must be defined. > + 'config_vars' => ['FOO'], > + 'autodep' => 'FOO', > + > + # Name of the compiling variable (COMPILE). > + 'compiler' => "FOOCOMPILE", > + # Content of the compiling variable. > + 'compile' => '$(FOO) $(FOOFLAGS)', > + # Flag to require compilation without linking (-c). > + 'extensions' => ['.foo'], > + # A subroutine to compute a list of possible extensions of > + # the product given the input extensions. > + # (defaults to a subroutine which returns ('.$(OBJEXT)', > '.lo')) > + 'output_extensions' => sub { return ('.c',); }, > + # A list of flag variables used in 'compile'. > + # (defaults to []) > + 'flags' => ["FOOFLAGS"], > + > + # The file to use when generating rules for this language. > + # The default is 'depend2'. > + 'rule_file' => "foo", > + > + # Name of the compiler variable (CC). > + 'ccer' => "FOO", > + > + '_finish' => sub {}, > + > + # This is a subroutine which is called whenever we finally > + # determine the context in which a source file will be > + # compiled. > + > + '_target_hook' => sub { > + my ($self, $aggregate, $output, $input, %transform) = @_; > + $clean_files{$output} = MAINTAINER_CLEAN; > + }, > + > + # If TRUE, nodist_ sources will be compiled using specific > rules > + # (i.e. not inference rules). The default is FALSE. > + 'nodist_specific' => 1); > +END > + > +$ACLOCAL > +$AUTOMAKE -a --libdir=:`pwd`/lib > +grep 'bar.c' Makefile.in A tiny wee bit weak test for such an important new feature, isn't it? ;-) -*-*-*- Thanks again for working on this. Regards, Stefano