[gradle-dev] C++ concept: compiler support

Jay Berkenbilt Wed, 26 Dec 2012 06:10:59 -0800

In this post, I will discuss what abuild does to support compiler
toolchains.  One issue with abuild, and I believe this is also an issue
with gradle, is that the compiler support is built into the build tool
or added as configuration rather than being figured out at runtime using
something like autoconf.  This is somewhat unfortunate but probably not
a huge deal, and it could probably be mitigated by having an
autoconf-generated script that could help with the configuration of
additional compilers.  This is probably a bigger problem for abuild than
gradle.  The main problem here is that tools like autoconf and libtool
contain a lot of knowledge about different compilers and are kept up to
date as these things evolve.  Gradle either has to duplicate this
knowledge or use those other tools.  Maybe it's okay for gradle to
duplicate that knowledge since gradle is under active development. 
Also, the autoconf/libtool packages aren't so great with non-UNIX
compilers, though some support is there and I have used them
successfully with Microsoft compilers.  As abuild's sole developer, I
could never have kept this stuff up to date on my own.  Most
of the world these days uses only a handful of compilers, so if it's
easy to add support for a new compiler in gradle and if specification of
the compiler can either be automatic or easily specified without
touching any build configuration files, then it's probably good enough.


I will not discuss incremental builds and fine-grained dependencies,
shared libraries, or autoconf support in this post.  Those will be
topics of additional posts.

I don't know any way to do compiler specification other than to have a
long list of things you have to provide about a compiler, which is
pretty much how driver support always works.  The list of things abuild
needs to know to support a compiler is pretty complete...I'm sure
there's more that could be there, but what's there has worked well for
me over many years and many projects.  Most of this piece of abuild
predates the tool itself and comes from earlier work I did.

Abuild has a list of things you specify about a compiler that could work
with just about any C/C++ compiler.  It also pre-answers many of
the questions for typical UNIX-based compilers, which usually lets you
answer a simpler set of questions for those compilers.

The general list:

 * Patterns to match against for -l<lib> so that abuild can check
link-time library dependencies based on -llib without having to have the
full path of the library (e.g. lib<lib>.a, lib<lib>.so, <lib>.lib)
 * The suffix for non-library object files (like .o or .obj)
 * The suffix for library object files (usually the same but sometimes
different, like .lo for libtool's object file wrappers)
 * The command to invoke the C preprocessor
 * The command to invoke the C++ preprocessor
 * The command to invoke the C compiler
 * The command to invoke the C++ compiler
 * The command to invoke the linker for C-only programs (with no C++)
 * The command to invoke the linker for a program that requires C++
libraries
 * default debugging flags
 * default optimization flags
 * default warning flags
 * the name of an external program, if any, used to compute fine-grained
dependencies (more on this later)
 * a list of targets to ignore when detecting dangling object files (for
example, some compilers generate a separate .o file with no corresponding
source file for stuff like static initializers or template instantiation)
 * a function that translates the base part of a library's name to the
full filename of a static library (e.g. zlib to libzlib.a on Linux or
zlib.lib on Windows)
 * a function to map the base name of a library along with its version
information to the full name of the shared library (like mapping
zlib,1,2,3 to libzlib.so.1.2.3 on Linux and to zlib.dll or zlib1.dll on
Windows)
 * a function to map the base name of an executable to the full filename
(e.g. exec -> exec on Linux, exec -> exec.exe on Windows)
 * a function to convert a list of include directories to appropriate
compiler/preprocessor include path flags
 * a function to map {compiler, pic, flags, src, obj } to a command that
generates an object file from a source file.  compiler is either the C
or C++ compiler as specified earlier.  pic is whether
position-independent code is required (which it is for dynamically
loadable code)
 * a function to map a static library archive name and list of object
files to a command that creates the library
 * a function that maps { linker, compiler-flags, link-flags,
object-files, lib-directories, library-names, executable base name } to
a command that links an executable.  The linker argument is one of the
linkers specified earlier
 * a function that maps { linker, compiler-flags, link-flags,
object-files, lib-directories, library-names, shared library base name,
shared library version information } to a command that creates a shared
library
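
To make the naming functions in the list above concrete, here is a rough
sketch in Python.  This is not abuild's code (abuild implements these in
GNU make); the function names and the "platform" argument are
hypothetical, and choosing zlib1.dll over zlib.dll for a versioned
Windows library is an assumption made only for illustration.

```python
from typing import Sequence


def static_library_name(base: str, platform: str) -> str:
    """Map a library base name to its static library filename."""
    return f"{base}.lib" if platform == "windows" else f"lib{base}.a"


def shared_library_name(base: str, version: Sequence[int], platform: str) -> str:
    """Map a base name plus version information to a shared library filename."""
    if platform == "windows":
        # Assumption: encode only the major version in the DLL name.
        return f"{base}{version[0]}.dll" if version else f"{base}.dll"
    suffix = "".join(f".{part}" for part in version)
    return f"lib{base}.so{suffix}"


def executable_name(base: str, platform: str) -> str:
    """Map an executable base name to its full filename."""
    return f"{base}.exe" if platform == "windows" else base
```

With these definitions, shared_library_name("zlib", (1, 2, 3), "linux")
gives libzlib.so.1.2.3, matching the example in the list.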

For UNIX compilers, abuild provides a specification that provides all
the functions and commands based on a few parameters that are usually
simple command names.  Some of these exactly match what's above, and
others are used to build the above functions.

Direct correspondence:
 * invocation of C compiler
 * invocation of C++ compiler
 * invocation of C preprocessor
 * invocation of C++ preprocessor
 * default { debug, optimization, warning } flags

Parameters to derive remaining options:
 * command to invoke the library archiver (e.g. ar cru)
 * command to run over a newly created static library (e.g. ranlib)
 * flags to pass to the compiler for position-independent code
 * flags to pass to the compiler to generate a shared library
 * function to map shared library version information to soname that is
stored in the shared library header
 * Flag to precede a directory to add it to the include path
 * Flag to precede a "system include directory" to add it to the include
path.
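
As an illustration of how a few simple parameters can generate the
general functions, here is a minimal Python sketch.  The class and
method names are made up for this example, and the exact command
layout ("-c src -o obj") is an assumption typical of UNIX compilers
rather than something taken from abuild.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnixToolchain:
    cc: str         # invocation of the C compiler
    cxx: str        # invocation of the C++ compiler
    archiver: str   # command to invoke the library archiver, e.g. "ar cru"
    ranlib: str     # command to run over a new static library; may be empty
    pic_flags: str  # flags for position-independent code, e.g. "-fPIC"

    def compile_command(self, compiler: str, pic: bool, flags: List[str],
                        src: str, obj: str) -> str:
        """Build the command that turns one source file into one object file."""
        parts = [compiler]
        if pic:
            parts.append(self.pic_flags)
        parts += flags + ["-c", src, "-o", obj]
        return " ".join(p for p in parts if p)

    def archive_commands(self, lib: str, objs: List[str]) -> List[str]:
        """Build the commands that create a static library from object files."""
        cmds = [f"{self.archiver} {lib} {' '.join(objs)}"]
        if self.ranlib:
            cmds.append(f"{self.ranlib} {lib}")
        return cmds
```

A gcc-like toolchain would then be described by values alone, for
example UnixToolchain("gcc", "g++", "ar cru", "ranlib", "-fPIC").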

Some compilers have the concept of system include directories.  They are
searched just like regular include directories but the compiler does not
issue warnings for problems in those headers.  Abuild lets you do this
for third party packages, for example.  Maybe your build turns on tons
of warning options (generally a good idea), but this pollutes your build
with tons of warnings about system header files or header files that
belong to third party libraries that you are using.  Specifying their
directories as system includes prevents generation of these warnings so
you only see warnings about your own sources.
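
A sketch of the include-path function with system include support might
look like the following Python.  The function name and the fallback
behavior are assumptions; the default flags shown are the ones GCC
uses (-I and -isystem).

```python
from typing import List, Optional


def include_flags(dirs: List[str], system_dirs: List[str],
                  include_flag: str = "-I",
                  system_include_flag: Optional[str] = "-isystem") -> List[str]:
    """Convert include directories to compiler include-path arguments."""
    flags = [f"{include_flag}{d}" for d in dirs]
    for d in system_dirs:
        if system_include_flag is None:
            # Assumption: a compiler with no system include concept just
            # treats these as regular include directories.
            flags.append(f"{include_flag}{d}")
        else:
            flags += [system_include_flag, d]
    return flags
```

Third-party headers listed in system_dirs are still found, but a
compiler that supports the concept stops warning about them.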

Additionally, abuild encourages the compiler support author to prefix
all linker commands with the variable $(LINKWRAPPER) which the end user
can define if they need to wrap the link command.  This is very useful
for running a build with a static analyzer and can be used for other
purposes as well.  For example, sometime in the mid 1990s, I was doing a
build on a system where the linker performed extremely badly over
NFS because it did lots of seeks or something...linking over NFS could
take 20 or 30 times longer than linking to local disk.  I wrote a
trivial link wrapper that did the link in a local temporary directory
and then copied the result over NFS.  That's pretty pathological, but it
demonstrates that there are sometimes surprise uses for wrapper hooks.
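
The wrapper hook amounts to prefixing the link command with whatever
the user supplies.  Here is a minimal Python sketch of the idea; the
function name and the "local-link" wrapper in the usage note are
hypothetical, and abuild itself does this with a make variable rather
than a function.

```python
import shlex
from typing import List


def link_command(linker: str, objects: List[str], output: str,
                 link_wrapper: str = "") -> List[str]:
    """Build a link command, prefixed by an optional user-defined wrapper."""
    # An empty wrapper contributes nothing, so the default build is unchanged.
    return (shlex.split(link_wrapper) + shlex.split(linker)
            + list(objects) + ["-o", output])
```

For example, link_command("g++", ["a.o"], "prog", "local-link") would
run the hypothetical local-link program with the real link command as
its arguments.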

If anyone wants to see the actual compiler support files that abuild
uses, they can be found in the abuild distribution in the
make/toolchains directory.  There's unix_compiler.mk, gcc.mk, mingw.mk,
and msvc.mk.  The driver that calls all these methods is in
rules/object-code/ccxx.mk which is also included in an appendix of the
abuild documentation.  I'm not going to bother excerpting from them
here.  They contain very hairy but reasonably well-commented GNU make
code.  I've written plugins for multiple other compilers using these
parameters and have never run into any problems.

There are a number of features that abuild's C/C++ support handles that
are not obvious from the above list:
 * Ability to override debug, warn, optimization, or general other
compiler flags at the file level
 * Detection of dangling object files no longer generated by any
existing source file (so if you remove a source file, the object file
disappears and you don't accidentally continue linking with code that
you can no longer create -- a potential source of really nasty build
problems)
 * Automatic creation of companion static libraries when building
Windows DLLs and .so links when building UNIX shared libraries; abuild
also has its own naming system for DLLs that encodes the major version
much like how ELF shared libraries encode soname, while retaining full
compatibility with standard Windows environments (basically the
companion static library is versionless but is associated with a DLL
that contains a version number in its name; I'll discuss that in a
separate post)
 * a build target that doesn't build anything but prints the compiler
flags that would be used; very useful for debugging the build code like
discovering problems of missing include directories, conflicting
preprocessor symbols, or other evils
 * creation of multiple targets per directory (e.g. both libraries and
executables) with automatic prevention of circular link dependencies;
i.e., if you build shared libraries and executables in the same file,
you can list all your libraries on your link statement and abuild will
be smart enough not to link a library with itself but to link the
executable with the library
 * When supported by the compiler, "whole library" support; more below
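
To illustrate the dangling object file check from the list above, here
is a simplified Python sketch.  It assumes one object file per source
file, named after the source file's base name, and the names used are
hypothetical; abuild's real logic lives in its make rules.

```python
import os
from typing import Iterable, List


def dangling_objects(object_files: Iterable[str], source_files: Iterable[str],
                     obj_suffix: str = ".o",
                     ignore: Iterable[str] = ()) -> List[str]:
    """Return object files that no existing source file would generate."""
    expected = {
        os.path.splitext(os.path.basename(src))[0] + obj_suffix
        for src in source_files
    }
    ignored = set(ignore)  # compiler-generated objects with no source file
    return sorted(
        obj for obj in object_files
        if obj.endswith(obj_suffix) and obj not in expected and obj not in ignored
    )
```

If a.cc and b.c are the only sources, an old.o left over from a deleted
source file is reported so that the build can remove it instead of
linking it.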

Whole library support requires its own explanation.  Supporting this
mode is a little bit evil since it allows people to use an anti-pattern
and also since it is not supported by all compilers (notably Microsoft
Visual C++) but for existing builds or some types of compilers, it can
be needed.  Ordinarily, when an executable or shared library links with
a static library, it only sucks in object files that contain functions
or symbols that are actually referenced.  While this is almost always
correct, there are times when it doesn't do what is needed.  For
example, in C++ code, it is possible to declare a static instance of a
class whose constructor has side effects and to not have anything in
that file be referenced anywhere else.  For example, say you have some
kind of factory method that knows how to create instances of some child
classes of some base class and that child classes announce themselves to
the base class by using some kind of registration method that is invoked
by using a static initializer.  It may be that the invocation of methods
from the object file that contains the base class is only done by
indirection at runtime.  If you put these classes in a static library,
the linker won't know they're there and won't link them in.  In Java,
this pattern would be implemented through something like dependency
injection, and there isn't any automated thing that decides for you that
something you coded isn't really needed.  In C++, you could use shared
libraries or dynamically loaded code to achieve the same effect, but
sometimes you may need to use static libraries for some reason.  The
specific case I just described can be avoided by using better coding
practices or varying the design pattern, but there's another case I've
seen where whole library support is harder to ignore: compilers that
generate extra object files that only contain static initializers or
template instantiation.  Some of the VxWorks compilers do this because
of the way things are loaded in that environment.  The details are out
of scope.  Anyway, if you reserve use of whole-library linking for cases
that are there only to hide compiler-specific implementation details,
then you don't have to care about compilers that don't support that
feature since those compilers obviously won't require use of that
feature.  In that case, abuild's whole library support makes it possible
to implement support for things like the VxWorks compilers without
making the developer worry about this special extra step of
handling static initializers.  (I've seen this trip people up multiple
times.)
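
The following Python sketch is a toy model of how a linker picks object
files out of a static library, to show why a self-registering class
gets dropped.  It is a simplification and not any real linker's
algorithm, and the object and symbol names are invented.  With GNU ld,
the whole-library behavior modeled here is what the --whole-archive
option requests.

```python
from typing import Dict, List, Set, Tuple

# Each archive member maps to (symbols it defines, symbols it references).
Archive = Dict[str, Tuple[Set[str], Set[str]]]


def select_members(undefined: Set[str], archive: Archive,
                   whole_archive: bool = False) -> List[str]:
    """Return the archive members a link would pull in."""
    if whole_archive:
        return sorted(archive)  # take everything, referenced or not
    selected: Set[str] = set()
    defined: Set[str] = set()
    pending = set(undefined)
    changed = True
    while changed:
        changed = False
        for name, (defines, refs) in archive.items():
            if name not in selected and defines & pending:
                # This member satisfies a needed symbol, so it is linked,
                # and whatever it references becomes needed in turn.
                selected.add(name)
                defined |= defines
                pending = (pending | refs) - defined
                changed = True
    return sorted(selected)


archive: Archive = {
    "base.o": ({"Base::create", "Base::register_child"}, set()),
    # child.o registers itself from a static initializer; nothing refers to it.
    "child.o": ({"child_registrar"}, {"Base::register_child"}),
}
```

Here select_members({"Base::create"}, archive) returns only base.o, so
the child class never registers itself, while passing
whole_archive=True returns both members.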

Finally, abuild includes a "compiler verification" tool, which is
basically a self-contained build that has one of everything that
abuild's compiler support allows.  This allows you to conveniently do a
test build using your compiler specification, and if the test build and
its embedded test suite pass, it means your compiler specification
correctly instructed abuild how to do each of the things that it needed
to do.  This is an essential component since, without it, people will
invariably miss some corner case that their code didn't do when they
wrote the compiler support.  Like if your code doesn't use shared
libraries, you might not get that right.  Okay, I'll admit that compiler
verification doesn't test everything, but it tests almost everything. 
In particular, it doesn't test straight C without C++ or whole library
support, and it doesn't exercise things that shouldn't be related to
your compiler support file like detection and removal of dangling object
files, but it exercises the cases of interaction between static
libraries, shared libraries, and executables, and it also supports both
native compiler plugins and cross compiler plugins.  For native compiler
plugins, it is fully automated and basically gives you a yes or no
answer.  For cross compiler plugins, it gives you some things you have
to go run on your target and report back about.  Either way, it greatly
simplifies testing new compiler plugins.

--Jay
