[gradle-dev] C++ concept: shared libraries

Jay Berkenbilt Wed, 26 Dec 2012 06:59:30 -0800

In this post, I will describe how abuild deals with shared library
creation, and I will contrast it to libtool, which is used by many open
source packages and just about all open source library packages that use
autoconf and automake.  Both systems have their strengths and weaknesses.


Abuild supports two flavors of shared libraries: ELF shared libraries
(which is what you have in Linux and Solaris) and Windows DLLs.  libtool
supports other types of shared libraries, but since I've never actually
encountered these in practice, I have had no way to build or test
support for them in abuild.

Linux (and Solaris and most if not all other modern UNIXes, but I'll
just say Linux here and let it be understood that it applies more
broadly) has both static and shared libraries.  In the open source
world, static libraries are rarely used anymore except in special
environments where dynamic loading doesn't make sense.  Even so, you
would typically build a single library both as a static library and
as a shared library.  The static "moo" library would
be called libmoo.a, and the shared library would typically be called
libmoo.so.x.y.z where x.y.z is the shared library version number (though
this is just a convention and some people don't follow it) and would
also come with a file called libmoo.so that would be a symbolic link to
the library.  Typically you would also have libmoo.so.x that would be a
symbolic link to it as well.  An ELF shared library typically has an
"soname" field embedded in it (strictly speaking, a DT_SONAME entry in
its dynamic section, though I'll loosely say "ELF header" here).  This
names the version of the shared library that must be found at runtime
by anything that links against that library.
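
As a rough sketch of the naming convention just described, the
following derives the conventional file names from a library name and
an x.y.z version.  The function name and the returned keys are made up
for illustration; they are not part of abuild or libtool, and the
soname is assumed to carry only the major version number.

```python
def elf_library_names(name, version):
    """Return the conventional file names for an ELF library.

    `version` is an "x.y.z" string.  Only the major number (x) is
    assumed to go into the soname, as in the usual convention.
    """
    major = version.split(".")[0]
    return {
        "static": f"lib{name}.a",               # static library
        "real_file": f"lib{name}.so.{version}",  # the actual shared library
        "soname": f"lib{name}.so.{major}",       # runtime name (symlink)
        "link_name": f"lib{name}.so",            # link-time name (symlink)
    }


names = elf_library_names("moo", "2.5.3")
# Prints: libmoo.so -> libmoo.so.2 -> libmoo.so.2.5.3
print(names["link_name"], "->", names["soname"], "->", names["real_file"])
```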

At link time, if you link against -lmoo, the linker will search through
your library search path for a file called libmoo.so and, if it finds
it, it will link against that shared library.  It links against shared
libraries by making sure that any symbols it needs can be found in the
collection of shared libraries it links against, and for every shared
library that it needed, it embeds in its own ELF header a record of
needing a specific version of that library based on the library's
soname.  I'm oversimplifying a little here...there are lots of wrinkles
about how downstream dependencies are handled, mixing static and shared
libraries, etc., but they are not important to this discussion.  This
can be clarified with an example.  Suppose you are linking your animal
sounds program against the moo and quack libraries and you have the
following:

libmoo.so -> libmoo.so.2 -> libmoo.so.2.5.3
libquack.so -> libquack.so.3 -> libquack.so.3.0.0

libmoo.so's ELF header would have "libmoo.so.2" as the soname, and
libquack.so would have "libquack.so.3".  When you linked your program
with -lmoo -lquack, assuming symbols from those libraries were needed,
the linker would find libmoo.so and libquack.so and would embed in your
program's ELF header that it needed to find libmoo.so.2 and
libquack.so.3 at runtime.  You can see this on a Linux system with the
ldd and readelf (or eu-readelf) commands.  For example, on my debian
system, I can run ldd /usr/bin/jpegtran and see, among other things, the
line:

libjpeg.so.8 => /usr/lib/x86_64-linux-gnu/libjpeg.so.8 (0x00007fb4511f1000)

If you look at /usr/lib/x86_64-linux-gnu/libjpeg.so.8, you can see that
it is a symbolic link to libjpeg.so.8.4.0.  Also,
/usr/lib/x86_64-linux-gnu/libjpeg.so is a symlink to the same file.  The
libjpeg.so.8 file is the one that is needed at runtime, and libjpeg.so
is needed at link time.  If you run "readelf -d
/usr/lib/x86_64-linux-gnu/libjpeg.so | grep SONAME", you will see the output:

 0x000000000000000e (SONAME)             Library soname: [libjpeg.so.8]

which says that anything that links with the -ljpeg shared library
requires libjpeg.so.8 at runtime.
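
If you want to pull the soname out of that output in a script, a
minimal sketch might look like the following.  It only parses text in
the format shown above; the function name is hypothetical, and the
sample line is the one from my system quoted earlier.

```python
import re


def soname_from_readelf(output):
    """Return the soname found in `readelf -d` output, or None."""
    match = re.search(r"\(SONAME\)\s+Library soname: \[([^\]]+)\]", output)
    return match.group(1) if match else None


line = " 0x000000000000000e (SONAME)             Library soname: [libjpeg.so.8]"
# Prints: libjpeg.so.8
print(soname_from_readelf(line))
```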

Abuild handles all this automatically, as does libtool.  Once in a while
you see people try to build shared libraries without understanding
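this.  If you don't explicitly specify the soname of the library when
you create it, no soname is recorded, and anything that links against
the library ends up depending on the name of the file the linker found,
which is typically just the bare .so file.  This creates havoc if your
binary interface is not stable, because an incompatible new build then
replaces the old one under the same name.  Going into shared libraries
beyond this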
is probably out of scope here, and the information is available from
plenty of other sources.
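
To make the soname point concrete, here is a sketch of how a build
tool might assemble a link command that sets the soname explicitly.
It assumes a GCC-style compiler driver in front of GNU ld (the
-shared and -Wl,-soname options) and object files already compiled as
position independent code.  It is not the command abuild or libtool
actually generates, and the function name is invented for the example.

```python
def shared_link_command(name, version, objects, compiler="g++"):
    """Build a link command that embeds an explicit soname.

    Assumes a GCC-style driver and GNU ld; `objects` must already be
    compiled as position independent code (for example with -fPIC).
    """
    major = version.split(".")[0]
    soname = f"lib{name}.so.{major}"      # what dependents record
    output = f"lib{name}.so.{version}"    # the file actually written
    return [compiler, "-shared", f"-Wl,-soname,{soname}",
            "-o", output, *objects]


# Prints: g++ -shared -Wl,-soname,libmoo.so.2 -o libmoo.so.2.5.3 moo.o
print(" ".join(shared_link_command("moo", "2.5.3", ["moo.o"])))
```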

At link time in Linux, if libmoo.so is not found, then the linker will
try to find libmoo.a (the static library) and will link against that
instead.  It is normal for there to be both a shared and static version
of the same library installed at the same time.  The linker will favor
the shared library over the static library unless explicitly instructed
to do otherwise.
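
The lookup can be modeled roughly as follows.  This is a simplified
sketch that assumes GNU ld's documented behavior of checking each
search directory for the .so before the .a in that same directory;
other linkers may differ.  The directory layout is hypothetical, and
prefer_static stands in for being "explicitly instructed to do
otherwise".

```python
def resolve_library(name, search_dirs, files, prefer_static=False):
    """Model how -l<name> is resolved at link time.

    `files` maps a directory to the set of file names it contains.
    Within each directory the shared library is tried first, unless
    prefer_static is set, in which case only the .a is considered.
    """
    candidates = [f"lib{name}.a"] if prefer_static else [
        f"lib{name}.so", f"lib{name}.a"]
    for directory in search_dirs:
        present = files.get(directory, set())
        for candidate in candidates:
            if candidate in present:
                return f"{directory}/{candidate}"
    return None  # the link would fail: library not found


layout = {  # hypothetical installation
    "/usr/local/lib": {"libquack.a"},
    "/usr/lib": {"libmoo.so", "libmoo.a", "libquack.so"},
}
dirs = ["/usr/local/lib", "/usr/lib"]
print(resolve_library("moo", dirs, layout))
print(resolve_library("moo", dirs, layout, prefer_static=True))
```

Note that under this model a static library in an earlier directory
wins over a shared library in a later one.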

In Windows, things are a little different.  You wouldn't ordinarily have
both a static library and a shared library at the same time.  Also,
while in Linux you actually link directly with a shared library and the
linker knows what to do with it, in Windows every DLL has a companion
static library that literally contains code to dynamically load the DLL
and call its functions.  The companion static library contains
information about the name of the DLL, much like soname for ELF shared
libraries, but as far as I know, there's no way to make it different
from the name of the companion library at creation time.  What abuild
does instead is to create moo2.lib and moo2.dll and then to rename
moo2.lib to moo.lib.  Then, when you link with -lmoo, the linker
finds moo.lib, which is the companion library to moo2.dll.  That way you
can have multiple versions of the DLL coexisting without running into
incompatible symbols.
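
The renaming scheme amounts to something like the sketch below.  The
function and key names are made up for illustration and do not
reflect abuild's actual implementation; the only thing taken from
abuild is the naming pattern described above.

```python
def windows_library_names(name, major):
    """Return the file names used in the versioned-DLL scheme."""
    versioned = f"{name}{major}"
    return {
        "dll": f"{versioned}.dll",                  # loaded at runtime
        "import_lib_as_built": f"{versioned}.lib",  # what the linker emits
        "import_lib_installed": f"{name}.lib",      # after the rename
    }


names = windows_library_names("moo", 2)
# Prints: moo2.lib is renamed to moo.lib and refers to moo2.dll
print(names["import_lib_as_built"], "is renamed to",
      names["import_lib_installed"], "and refers to", names["dll"])
```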

In contrast to libtool, abuild generally does not give you a way to
build both a static and a shared version of the same library.
Abuild's view of the world is that static libraries are a
convenient mechanism for modularizing your build while shared libraries
are installable entities for published APIs.  This is not necessarily
the only valid world view and may not even be the best one, but I've
found it works extremely well for large systems with lots of components
where the APIs are not necessarily all that stable.  I would think
gradle might want to do things more like the libtool way, but I'm not
really sure which way would be best.  With libtool, when you build a
library, the default behavior is to compile each file twice: once with
position independent code and once without.  The resulting object files
are used to build shared and static libraries respectively, and a .lo
file is created as a wrapper around the object file so libtool can do
this all transparently.  This increases build times significantly, and
while it may be right for open source products being installed in a
Linux distribution (or similar), it doesn't seem right in a large
enterprise development project most of the time.  At least not in my
experience; others may well differ.

The other problem with shared libraries is how you find them at
runtime.  In Linux, shared libraries are searched for in certain
specific locations (like /usr/lib) and also in locations specified by
/etc/ld.so.conf or files in /etc/ld.so.conf.d.  For non-setuid/setgid
executables, the LD_LIBRARY_PATH environment variable can also influence
this.  It is also possible to encode a runtime path in the executable,
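but this is often a bad idea, and abuild doesn't directly support it.
If a runtime path is encoded in an executable as a DT_RPATH entry, then
the runtime loader will always look there first, and there's no way to
override that with LD_LIBRARY_PATH.  (A DT_RUNPATH entry, by contrast,
is searched after LD_LIBRARY_PATH.)  This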
can be useful for creating self-contained binary distributions that
include all their shared libraries in a fixed location, but I think even
then it's a bad idea because it makes distributions non-relocatable.  If
you must have private shared libraries with a binary distribution, it's
better to include wrapper scripts that set LD_LIBRARY_PATH.  This is no
different from having to ship wrapper scripts around java code that
invokes java with the appropriate classpath.  libtool has what I
consider to be a bug in that if you indicate that you are installing
something in a non-standard location, libtool includes a run path.

One trap that a build tool should not fall into is to put runpaths in
link statements to the location of a shared library in the source tree. 
This is a terrible idea because it means that an installed version of an
executable that happens to be running on a build host that has a built
copy of the code will resolve its symbols against the version of the
shared library in the build tree rather than the version installed on
the system.  This can cause all sorts of hard-to-detect problems. 
libtool handles this by automatically generating wrapper scripts so that
you can run your code from its build directory without having to use run
paths.  Abuild doesn't handle this at all, and leaves it up to the
developer to handle, which is probably bad.  For Windows, the situation
is similar except PATH is searched for DLLs.  Windows also looks in the
executable's directory for DLLs, and there are other standard places
where it looks.
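
The wrapper approach mentioned above can be sketched like this for
the ELF case.  It is only an illustration of the idea, not what
libtool's generated wrappers actually do, and the function name and
paths are hypothetical.  It prepends the build tree's library
directories to LD_LIBRARY_PATH without touching the executable, so no
run path is needed.

```python
import os


def wrapper_environment(lib_dirs, base_env=None):
    """Return a copy of the environment with lib_dirs prepended to
    LD_LIBRARY_PATH, preserving any existing value."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = list(lib_dirs) + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


env = wrapper_environment(["/build/moo/lib"],
                          {"LD_LIBRARY_PATH": "/opt/lib"})
# Prints: /build/moo/lib:/opt/lib
print(env["LD_LIBRARY_PATH"])
```

A wrapper would then launch the real program with this environment,
for example with subprocess.run([program, *args], env=env).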

Anyway, the minimum a build system has to do for ELF is to make sure
that soname is properly embedded, that appropriate symbolic links are
made, and that runpaths are not used inappropriately.  For DLLs, I
recommend some kind of version numbering scheme for the DLL vs. the
companion library, but one could live without this.  Also with versions
of MSVC greater than or equal to 2005 (I think, or maybe it's greater
than or equal to 2008), you have to create a manifest file with mt.exe. 
All this can be found in Microsoft's documentation or gleaned from
abuild's support for msvc.

--Jay
