Hi All,

Great discussion on list that is very enlightening. Thought this would be helpful re RPM.

Cheers

Jason

-------- Original Message --------
Subject: Re: [Cooker] Document review request: RPM devel package dependency problem
Date: Sun, 6 Apr 2003 18:44:24 -0800
From: Andi Payn <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
References: <[EMAIL PROTECTED]>


On Sunday 06 April 2003 13:26, Stefan van der Eijk wrote:
> Hello,
>
> I've written up on an issue with rpm dependencies in -devel packages.
> I'm not sure if the story is 100% accurate (I'm not a programmer), so if
> you've got a moment to spare, feel free to review it.

This is pretty much completely wrong. The history is wrong, the description of
the problem is wrong, and the proposed solution won't work. The proposed
solution might have some benefits anyway; I'll get to that.


Rather than explain what's wrong, let me try to give a more accurate history,
with some rationales along the way, and then explain the actual problem, and
why it can't be solved automatically. This is probably going to be a very
long email, as there's a lot to go over.


In the old days, Mandrake worked the same way Redhat did (and, along with most
of the RPM-based world, still does): The typical source package
mything-1.0.srpm, if it contained libraries or other development files, would
build two packages, mything-1.0-i586.rpm and mything-devel-1.0-i586.rpm.


The mything package would contain the application binaries, shared libraries
(libmything.so.1.0.0), shared library symlinks needed for normal use
(libmything.so.1), and user documentation. The mything-devel package would
contain the static libraries (libmything.a), shared library symlinks needed
for building other code (libmything.so), header files, and developer
documentation.
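
As a rough illustration of that split, here is a minimal Python sketch. The
function name and the file list are made up for this example; a real specfile
does this with %files sections, and documentation can't be sorted by file name
alone, so it is left out.

```python
def subpackage(path):
    """Return which binary package a file from the mything build lands in."""
    if path.startswith("/usr/include/"):
        return "mything-devel"   # header files
    if path.endswith(".a"):
        return "mything-devel"   # static libraries
    if path.endswith(".so"):
        return "mything-devel"   # unversioned link, only needed when building
    return "mything"             # binaries, versioned libraries and their links


# Hypothetical file list for the fictional mything package.
FILES = [
    "/usr/bin/myapp",
    "/usr/lib/libmything.so.1.0.0",
    "/usr/lib/libmything.so.1",
    "/usr/lib/libmything.so",
    "/usr/lib/libmything.a",
    "/usr/include/mylib.h",
]

for path in FILES:
    print(f"{subpackage(path):14} {path}")
```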


This split long predates the Mandrake policy; it came long before the
numbering of libraries. Also, this split has no effect whatsoever on the
ability of multiple versions to coexist.

For example, mything-1.0 can't coexist with mything-1.2, because they both try
to provide the same files, such as /usr/bin/myapp. Likewise,
mything-devel-1.0 can't coexist with mything-devel-1.2, because they both try
to provide the same files, such as /usr/include/mylib.h. Naming them
differently wouldn't solve this problem; it would just make it harder for RPM
to catch the problem immediately (it has to go through the "preparing"
phase--which, for urpmi/rpmdrake, means it has to download the packages).


The reason for splitting off the -devel packages was that most people don't
need them. Why waste download time, space on the CD, space on the user's hard
disk, and/or other resources for header files if most users will never
compile anything that requires those header files?


In a few cases, packages were further split: -static-devel, -doc, -devel-doc,
-utils, -tools, etc. may be split off. This was pretty rare in the early
days, and on most distros (including Mandrake and Redhat), it's still pretty
rare, but a few distros went overboard with this (PLD's policy is to create
separate -static, -static-devel, and, where appropriate, -docs and
-docs-devel, for example, and Conectiva goes about half-way there).


Sometimes this is because the static libraries end up being 80% of -devel and
even most developers will never need them--so again, it saves
space/bandwidth/etc. to separate them out. Sometimes it's because the
original program came in multiple separate tarballs and it's easier (both
initially and for maintenance) to organize the RPMs the same way. Sometimes
it's because the developer puts a specfile in each tarball, which makes this
even more compelling (especially when the specfile is designed for your
distro). Often it's because whoever had the package first split off
-static-devel and everyone else just followed suit (this is especially true
when developers make Mandrake packages and Redhat redhatizes them).


Now, on to the multiple-version issue. This was a problem from the beginning
of the shared library days, before RPM. Let's say that lots and lots of
packages link to mything's shared libraries. Now, 1.0 comes out, and it's
incompatible with 0.2.


Let's look at the library version numbering system used by linux/glibc. When a
user upgrades from mything-0.2.1 to mything-0.3.0,
/usr/lib/libmything.so.0.2.1 goes away, /usr/lib/libmything.so.0.3.0 gets
installed, and the existing /usr/lib/libmything.so.0 link now points to the
new version. Since programs link against libmything.so.0, all existing
programs still work, and programs that require new 0.3 features also work.
(This assumes that minor-version upgrades are backwards compatible, which
they're supposed to be, but some developers disagree, or just aren't
perfect.)


When the user later upgrades to mything-1.0.0, which may be incompatible with
0.3.0 (major-version upgrades can be incompatible), libmything.so.0.3.0 stays
in place, and libmything.so.0 continues to point at it, while
libmything.so.1.0.0 is added, and libmything.so.1 points at the new version.
Old programs still work because they still have the old library; new programs
work because they have the new library.
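
The link behaviour described above can be modelled in a few lines of Python.
This is only a sketch of the bookkeeping, not what the dynamic linker or
ldconfig actually runs; the function names are invented for the example.

```python
def soname_for(filename):
    """Map a real library file to its major-version link name.

    Example: libmything.so.0.3.0 -> libmything.so.0
    """
    base, _, version = filename.partition(".so.")
    if not version:
        raise ValueError(f"not a versioned shared library: {filename!r}")
    major = version.split(".")[0]
    return f"{base}.so.{major}"


def install(links, filename):
    """Return the link table (link name -> real file) after installing a file."""
    updated = dict(links)
    updated[soname_for(filename)] = filename
    return updated


links = {}
links = install(links, "libmything.so.0.2.1")
links = install(links, "libmything.so.0.3.0")  # minor upgrade: same link, new target
links = install(links, "libmything.so.1.0.0")  # major upgrade: a second link appears

for link, target in sorted(links.items()):
    print(f"{link} -> {target}")
```

Programs linked against libmything.so.0 keep resolving to the 0.3.0 file, and
programs linked against libmything.so.1 get the 1.0.0 file.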


Other operating systems with shared libraries have similar problems, and
handle them in similar ways (except classic MacOS and a few other OS's that
went for something more complicated). Even Windows had a completely ad-hoc
version of the glibc solution. When VC 4.1 came out, MFC 4.1 was compatible
with MFC 4.0, so its shared library was still called MFC40.DLL. When VC 4.2
came out, MFC 4.2 was not compatible, so its shared library was called
MFC42.DLL. (Of course later versions of MFC were incompatible with 4.2, but
Microsoft still called the library MFC42.DLL, causing all kinds of problems,
but that's another story.)


Unfortunately, the RPM system doesn't understand this version numbering
scheme. When you upgrade mything-0.2.1 to mything-0.3.0, it replaces all of
the files, just as you'd want it to. But when you upgrade mything-0.3.0 to
mything-1.0.0, it also replaces all of the files--which means everything that
linked to libmything.so.0 stops working.


RPM's automatic requirements handling helped a little. Now, RPM won't let you
upgrade 0.3.0 to 1.0.0 if anything depends on libmything.so.0; you have to
remove or upgrade all of your old packages to upgrade mything. But that was
still a disadvantage to RPM-based distros: there's nothing about linux that
prevents you from having 0.3.0 and 1.0.0 simultaneously, but there is
something about RPM that prevents it. (Note that this feature was added for
RPM 3.0/RedHat 6.0, IIRC; most packages that have the -devel split on a
current Mandrake or Redhat system had the same split in RedHat 5 and
vice-versa, so the idea that "the dependency system has not evolved with
these changes" is silly.)


More than once, this turned into a nightmare for users and distributors. Let's
say mything is something really important (like ImageMagick), and the change
is so dramatic that many projects won't convert for quite some time (like the
ImageMagick 4.x to 5.x transition). For months, users and distributors are
stuck with a choice: keep mything-0.3.0, and give up on hundreds of new and
updated packages, or go to mything-1.0.0, and lose hundreds of old packages
for which there may be no good replacement.


When the situation gets this bad, a distributor will usually create a special
mything-compatlibs-0.3.0 package. This package can coexist with mything-1.0.0
(it has a different name, after all), allowing users to have both .so.0 and
.so.1 versions at the same time.


Unfortunately, this problem comes up all the time on less-important packages,
and distributors only deal with the really major problems. They don't provide
a mything-compatlibs for every single package that undergoes a major version
upgrade, so many users are forced to build their own "-compatlibs" packages,
or, more likely, circumvent the RPM system and do the same thing manually. Of
course many users--especially the novices that Redhat and Mandrake are trying
to attract--won't do either; they'll just assume something's "broken" and
their system can't do what they want.


Mandrake's innovation now seems obvious: Why not provide compatibility
libraries for everything that users might possibly need? Just as glibc can
handle libmything.so.0 and libmything.so.1, so RPM can handle libmything0 and
libmything1. Split off libmything from mything, and stick the major version
number right on the name, and you're done.

So now, instead of mything-0.3.0, you have mything-0.3.0 and
libmything0-0.3.0. When mything-1.0.0 comes out, the mything-1.0.0 package
will replace mything-0.3.0 (since they're both named "mything"), but
libmything1-1.0.0 will live alongside libmything0-0.3.0 (since they have
different names).
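
A toy model of why the rename works, assuming (as a simplification) that RPM
keeps exactly one installed version per package name. The helper names are
hypothetical and this is not how the RPM database is implemented.

```python
def library_package(base, major):
    """Package name under the library split: lib + name + major version."""
    return f"lib{base}{major}"


def install_or_upgrade(installed, name, version):
    """A package with the same name replaces the old one; a new name is added."""
    updated = dict(installed)
    updated[name] = version
    return updated


# Traditional layout: one package, so the old libraries vanish on upgrade.
monolithic = install_or_upgrade({"mything": "0.3.0"}, "mything", "1.0.0")

# Split layout: the application is replaced, the libraries sit side by side.
split = {"mything": "0.3.0", library_package("mything", 0): "0.3.0"}
split = install_or_upgrade(split, "mything", "1.0.0")
split = install_or_upgrade(split, library_package("mything", 1), "1.0.0")

print(monolithic)
print(split)
```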

What to do with the -devel packages? It's most consistent, and simplest for
the users to figure out, if you rename mything-devel to libmything1-devel. If
a user wants to build a package that requires libmything1, she does a "urpmi
libmything1-devel." If she wants to build a package that requires
libmything0, she does a "urpmi libmything0-devel." Easy. The same logic
applies to -static-devel, -devel-docs, or whatever else might exist; if they
go with the libraries, rename them along with the libraries; if they go with
the applications (-docs, -utils, etc.), keep the "classic" names.


Note that this doesn't turn one package into four, as your history claims. It
may turn three packages into four (if you had mything, mything-devel, and
mything-static-devel, you now get mything, libmything0, libmything0-devel,
and libmything0-static-devel), or it may turn one into two (mything and
libmything0), or it may not change anything (if mything doesn't have any
shared libraries that any other app might need).


In some cases (as with Qt, where the libraries and headers are all together in
one directory, so you can have /usr/lib/qt3 and /usr/lib/qt2 side by side), a
user can even have both -devel versions installed at once. In most cases
(where everything just goes in /usr/lib and /usr/include), she can't. In some
of these cases, the new version cleanly obsoletes the old one (it's backward
compatible); in others it doesn't.


RPM can't automatically figure out that libmything1-devel and
libmything0-devel can't coexist; because they have different names, it
assumes they're unrelated. If you try to install libmything1-devel when you
already have libmything0-devel, you'll probably get an error message saying
something like, "file /usr/include/myheader.h from package libmything1-devel
conflicts with file /usr/include/myheader.h from package libmything0-devel,"
but this won't happen until the preparation stage. This stage is after
urpmi/rpmdrake has downloaded the package, and too late to do its dependency
analysis (to tell you that "in order to install libmything1-devel, you must
remove these 3 packages and upgrade these 9 others"), which is only based on
information in the headers.


Fortunately, RPM does provide a way for the packager to handle each of the
three cases, by use of the conflicts tag, the obsoletes tag, or neither.
However, the RPM system can't guess this automatically.
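
To make the three cases concrete, here is a small Python sketch. The file
lists are invented, and the overlap check is simplified (it treats any shared
path as a conflict, ignoring file contents). "Conflicts:" and "Obsoletes:" are
the real specfile tags; the helper that picks one is hypothetical, since the
decision is the packager's to make.

```python
def file_conflicts(files_a, files_b):
    """Paths that both packages try to own."""
    return sorted(set(files_a) & set(files_b))


def relation_tag(old_name, can_coexist, new_replaces_old):
    """Return the specfile line the packager would add, or None for neither."""
    if can_coexist:
        return None                        # e.g. separate per-version directories
    if new_replaces_old:
        return f"Obsoletes: {old_name}"    # new version is backward compatible
    return f"Conflicts: {old_name}"        # same paths, incompatible contents


# Hypothetical file lists for the two -devel packages.
devel0 = ["/usr/include/myheader.h", "/usr/lib/libmything.so"]
devel1 = ["/usr/include/myheader.h", "/usr/lib/libmything.so"]

print(file_conflicts(devel0, devel1))
print(relation_tag("libmything0-devel", can_coexist=False, new_replaces_old=False))
```

The point of declaring the tag is that it lives in the package header, so
urpmi can see it before anything is downloaded.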

Your problem is similar to this: While RPM can figure out dependencies between
two shared libraries (because this happens to be really easy to do in linux
via ldd), it can't figure out most other kinds of dependencies. So it can't
figure out that a -devel package depends on another -devel package.


However, this problem was not caused by splitting the libraries. In fact,
splitting the libraries is a partial solution! I'll explain how, but first I
want to go back to my fictional mything package, because both of your
examples are pretty bad for illustrating the issue. (Why? The zlib package is
one of the few that doesn't follow the Mandrake policy--it should be libz1
and libz1-devel; it doesn't have a separate apps package z--that'd be gzip,
zip, and unzip; and it doesn't require anything besides glibc--which you
can't have a modern linux system without. The libpng package is one of the
few that was prefixed with lib even before the Mandrake policy [the internal
name is libpng]; it doesn't have a separate apps package "png;" it's one of
the few examples I remember of any package having -static-devel split off and
later merged back into -devel on any distro; and it's one of the rare cases
where most other distros have ended up following an ad-hoc version of the
Mandrake policy [Redhat's "libpng10" and "libpng12" are equivalent to
Mandrake's "libpng2" and "libpng3," while Conectiva just uses Mandrake's
names].)


RPM can automatically infer all of the requirements for libmything1, because
it's nothing but shared libraries. RPM may not be able to automatically infer
all of the requirements for libmything1-devel, however.


This means that if the packager isn't sufficiently diligent, a user may find
herself inexplicably unable to install libmything1-devel--but every user will
always be able to install libmything1, or know the reason why (in fact, with
urpmi/rpmdrake, it'll usually fix it for her: "to install the selected
packages, you also need to install libmyotherthing," and all she has to do is
click OK). As a side benefit, this will sometimes, but not always, give
"power users" a clue toward solving the problems with the other packages
("Since libmything1 required libmyotherthing3, maybe libmyotherthing3-devel
is required for libmything1-devel to work.").


If the package hadn't been split, those "hidden dependencies" in the
development files that cause problems with libmything1-devel would instead
cause problems with the monolithic mything package. And this is a much worse
problem.


Why? Let's say that something in KDE requires /usr/lib/libmything.so.1. Now,
something like 90% of Mandrake's users run KDE. Maybe 1% of Mandrake's users
need to compile something that relies on libmything (not only that, that 1%
are guaranteed to be among the most able to figure out the problem--after
all, they're at least compiling packages they pulled off the net, if not
actively developing code).


In other words, splitting off the -devel packages doesn't create a problem;
instead, it insulates 90% of the users, including 100% of the novice users,
from an existing problem.

And Mandrake's additional separation of the libraries from the application
makes this even better, not worse. Let's say mything contains a python script
/usr/bin/myapp, with hidden dependencies that RPM can't detect, but
libmything0 is the shared libraries used by /usr/bin/myapp and by something
in kdebase.


Under the traditional (Redhat) policy, this hidden dependency could block the
user from installing mything, meaning that she couldn't install KDE either.
Under the Mandrake policy, even if the user couldn't install mything, she
could still get libmything0 installed, so she could still have KDE. Again,
this means that most users, including all novices, are insulated from
potential problems (and again it provides a potential clue for mid-level
"power users" to resolve problems).


If you think that my example is contrived and implausible, look at the OGG
libraries. Try "rpm -q --whatrequires libogg.so.0" on your system; you should
see kdebase. If some developers or power-users can't install libogg0-devel
because of some hidden dependencies, sure, that's a problem--but if some
novices can't install libogg0 because of hidden dependencies, their system is
unusable. Fortunately, because libogg0 contains nothing but the shared
libraries, this cannot happen on a Mandrake system--but it can happen on
another RPM-based system. (In fact, the same is true of your example, zlib1!)


As for your proposed solution, while it's true that the header files contain
information that can be used to find dependencies, even a perfect job at this
will miss most dependencies for -devel packages, and will be even worse for
the application packages.


First of all, not everything is written in C(++); in particular, there are
thousands of packages written (at least in part) in perl, sh, python, scheme,
and a few other scripting languages, so you need to parse these. And to get
the last 5%, you'll need to parse uncommon scripting languages, plus Modula,
Eiffel, or any other language that anyone has ever built a package with.


Second, even if you can determine that a header file includes <myheader.h>,
that just tells you that it requires myheader.h. More often than not, you
specifically need the version from libmything1-devel, so knowing that you
need anything that provides myheader.h doesn't help. For example, ImageMagick
5.0 provided most of the same header files as 4.x, but their contents were
quite different. A more thorough analysis could tell you that
ImageMagick-devel-4.x wasn't compatible, but that would require a huge amount
of work, and so much information that your RPM headers would be bigger than
their contents. And even then, your code may not run if built against the
wrong version, so you really need a good regression test suite for every
package to test this automatically. (Going back to your example, if you can
tell that zlib1-devel requires glibc, but you can't tell that it requires
glibc-2.3, that's useless; only really ancient systems don't have glibc, but
many comparatively recent systems don't have 2.3.)


Third, nearly every language has a way to run a program, load a library, or
read a file by name. For example, the following code probably calls convert,
creating a dependency on ImageMagick that a script couldn't catch:

    sprintf(cmd, "%s \"%s\" \"%s\"", config->convert_path, gifname, pngname);
    system(cmd);
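
Here is a minimal Python sketch of the kind of header scan being proposed, run
over a source file containing that call. The scanner and the surrounding C
function are invented for illustration. It reports the included headers, but
nothing in its output points at ImageMagick.

```python
import re

# Matches  #include <name>  and  #include "name"  at the start of a line.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def included_headers(source):
    """Return the header names a C source file includes, in order."""
    return INCLUDE_RE.findall(source)


# Hypothetical C source wrapped around the two lines quoted above.
SOURCE = r'''#include <stdio.h>
#include <stdlib.h>
#include "config.h"

void to_png(struct config *config, const char *gifname, const char *pngname)
{
    char cmd[1024];
    sprintf(cmd, "%s \"%s\" \"%s\"", config->convert_path, gifname, pngname);
    system(cmd);
}
'''

headers = included_headers(SOURCE)
print(headers)
```

The program that actually gets run is only known at run time, from whatever
convert_path holds, so no amount of static parsing will find it.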


Fourth, there are cases where there's no strict dependency, but the package is
still completely useless without another. For example, xcivstart will run
without freeciv, but all it can do is tell you that it can't find civclient
and civserver, which surely isn't very useful to a user. Sometimes this is a
judgement call. For example, civworld, the freeciv editor, isn't all that
useful without freeciv--but you might want to edit scenarios on a machine you
never intend to play the game on, so maybe it is.


A human packager will gather much of this information in a variety of
ways--from the package's website, from her own experience, from trial and
error, from bug reports ("mything-1.0.0 installed fine on my system, but when
I tried to run it, it told me there was 'no convert found in the path'"),
from knowing what the package is intended to do ("xcivstart is a front-end to
FreeCiv" probably means that xcivstart requires freeciv), etc.


That's not to say that it would be a bad idea to parse header files. But it's
not going to solve most problems.


As for -static-devel packages (where they exist), they may actually be easier.
Static libraries are relatively easy to parse, so finding dependencies on
other packages is easy. However, there's no real reason to do this. Static
libraries are, in almost every case, useless without the accompanying header
files. This means that libmything0-static-devel has to require
libmything0-devel. And since this in turn requires libmything0, and the
static libraries will probably have (a subset of) the same requirements as
the shared libraries, you don't need anything else.


By the way, while the Mandrake library policy doesn't cause any of the
problems you mention, it isn't perfect. Here's a list of problems:

First, because the names of all of these packages have changed from what the
rest of the world uses, taking specfiles designed for another distro (usually
Redhat) and Mandrakizing them isn't always easy. If I download a Redhat SRPM
(or a tarball from the developer's website that has--or builds on the fly--a
specfile designed for Redhat), I have to go through each of the manual
requires, buildrequires, obsoletes, conflicts, etc. tags and figure out the
Mandrake names for each of those libraries.


On top of that, because developers don't always follow the rule that minor
version upgrades are compatible (especially in the 0.x stage), the .so often
ends up with a different number than the overall package. So, you may have
libmyotherthing.so.2 for 1.1.0 through 1.2.2, and libmyotherthing.so.3 for 1.2.3
through 1.2.9. This means that the user is now forced to deal with two
different version number schemes. But it's even worse for the packager. If
the stock (Redhat) specfile says "Requires: myotherthing >= 1.2.3" I need to
know that this means "Requires: libmyotherthing3 >= 1.2.3." To know this, I
have to know that myotherthing-1.2.3.srpm builds libmyotherthing3. If I have
libmyotherthing3-1.2.7 installed, there's no easy way to figure out when the
change came--I have to read the changelogs, browse the website, or even
download and inspect old packages.


To make things still worse, let's say that the package I'm trying to use can
build with any version of myotherthing-devel from 1.2.3 through 1.5.1 (but
not 1.2.2, or 2.0.0). Those libraries span libmyotherthing3-1.2.3 through
libmyotherthing5-1.5.1. What do I put in the buildrequires? If I specify
libmyotherthing5-devel, other users may have to upgrade for no good reason
just to build it. And then, do I put libmyotherthing5 in the requires, making
end users upgrade unnecessarily too?


On another note, if the devel files had kept their old names (in other words,
you have mything, libmything0, and mything-devel instead of mything,
libmything0, and libmything0-devel), RPM would automatically prevent two
versions of the -devel packages from being installed together. While the
flexibility in choosing between conflicts, obsoletes, and nothing is good,
the renamed packages are allowed to coexist by default, which is usually
wrong, and the mistake can't be detected until the package has been
downloaded (or, more typically, until 20 packages have been downloaded and
the whole batch installation fails).


The virtual package names provide a partial solution to most of these
problems. If libmyotherthing3 provides a virtual package named libmyotherthing,
and so do 0 and 1 and 2 and 4 and 5, and likewise for the devel packages, I
can just put "buildrequires: libmyotherthing-devel >= 1.2.3" and "requires:
libmyotherthing >= 1.2.3". And of course my script can just replace every
"requires: myotherthing" with "requires: libmyotherthing" (and likewise for
-devel and buildrequires) in the Redhat package and I'm ready to go.
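
A sketch of how such a virtual name resolves, in Python. The package table is
made up, the version comparison is a plain tuple-of-integers compare rather
than RPM's real algorithm, and each provide is assumed to carry the package's
own version.

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3). Simplified: digits and dots only."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical packages, each providing the same virtual name.
PACKAGES = [
    {"name": "libmyotherthing2-devel", "version": "1.2.2",
     "provides": ["libmyotherthing-devel"]},
    {"name": "libmyotherthing3-devel", "version": "1.2.7",
     "provides": ["libmyotherthing-devel"]},
    {"name": "libmyotherthing5-devel", "version": "1.5.1",
     "provides": ["libmyotherthing-devel"]},
]


def candidates(required, min_version, packages):
    """Names of packages that satisfy 'required >= min_version'."""
    wanted = parse_version(min_version)
    return [
        pkg["name"]
        for pkg in packages
        if (pkg["name"] == required or required in pkg["provides"])
        and parse_version(pkg["version"]) >= wanted
    ]


print(candidates("libmyotherthing-devel", "1.2.3", PACKAGES))
```

With the virtual name, a single buildrequires line accepts both the .so.3 and
the .so.5 packages without my having to know where the soname changed.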

However, this only works if you, and every other packager, are always careful
about getting the virtual package names right--and if there's a good policy
on how to do so.


My only suggestion for this first problem is to provide a clearer policy for
virtual package names that requires, wherever possible, providing the typical
(Redhat) name (or, if this is not feasible, something derived from this name
by a simple algorithm--even if it's not one that the computer has enough
information to run for you). This problem is especially bad in complicated
suites of libraries like Gtk/GNOME, and I seriously doubt that this is
because fcrozat is lazy or stupid (in fact, anyone who could get the GNOME
mess integrated into a distro that doesn't follow the exact same policies as
the GNOME team deserves a commendation); it's because there's no consistent
policy to guide him.


Also, when both developers and Mandrake are sticking version numbers on the
end of a package, the need for a good numbering policy becomes even clearer.
The current policy is to append the "internal" version number (from the
SONAME), followed by an underscore, followed by the so version number.


This policy makes sense, but it's often not followed. Sometimes the numbering
is even changed from one version to the next, with neither name following the
policy. Here are some examples I was able to gather in a few seconds (there
are probably dozens more):
libadplug-1.3.so.0 is libadplug1.3_0, not libadplug-1.3_0.
libadplug-1.4.so.0 is libadplug1.4, not libadplug-1.4_0.
libart_lgpl_2.so.2 is libart_lgpl2, not libart_lgpl_2_2.
libfaad.so.0 is libfaad2_0, not libfaad0.
libfame-0.8.so.8 is libfame0.8, not libfame-0.8_8.
libfame-0.9.so.0 is libfame0.9, not libfame-0.9_0.
libgtk-1.2.so.0 is libgtk+1.2_0, not libgtk-1.2_0.


This policy also means that the package names that result are often confusing.
Some of the GNOME 2.2 packages have no number on the soname, some have 2,
some have -2, some have -2.0, etc. None have 2.2 or -2.2. Since the soversion
is usually 0, this means that you have two extra sets of version numbers,
neither of which is 2.2, which often together look like 2.0, and which mean
nothing to the user.


And of course this makes the virtual names problem even more complicated,
since other vendors and the GNOME team themselves are providing packages with
names that often don't follow any policy at all, and other people are
building packages against those.


There should be a better way. Unfortunately, I can't think of one; maybe it's
just wishful thinking.


If not, the policy should always be followed, rigorously. Maybe rpmlint could
check that the right-most part of the name made up of digits, dots, hyphens,
and underscores follows the rule exactly, and for the rare special cases
(like liba52dec0 not being named liba52dec52_0, for obvious reasons) you'd
have to explicitly make a judgement call that rpmlint was wrong.
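
For what it's worth, the check itself would be small. Here is a Python sketch
of the rule as I read it from the examples above: keep the SONAME stem as-is,
then append the so version, with an underscore in between only when the stem
already ends in an internal version number. This is not an existing rpmlint
test, and the function names are invented.

```python
import re

SONAME_RE = re.compile(r"^(?P<stem>.+)\.so\.(?P<major>\d+)$")


def expected_package_name(soname):
    """Derive the policy-conforming package name from a SONAME."""
    match = SONAME_RE.match(soname)
    if match is None:
        raise ValueError(f"unrecognised SONAME: {soname!r}")
    stem, major = match.group("stem"), match.group("major")
    # An underscore separates an internal version number from the so version.
    separator = "_" if stem[-1].isdigit() else ""
    return f"{stem}{separator}{major}"


def violations(pairs):
    """Return (soname, actual, expected) for every name that breaks the rule."""
    return [
        (soname, actual, expected_package_name(soname))
        for soname, actual in pairs
        if actual != expected_package_name(soname)
    ]


# Actual package names taken from the list above.
SEEN = [
    ("libadplug-1.3.so.0", "libadplug1.3_0"),
    ("libart_lgpl_2.so.2", "libart_lgpl2"),
    ("libfaad.so.0", "libfaad2_0"),
    ("libgtk-1.2.so.0", "libgtk+1.2_0"),
]

for soname, actual, expected in violations(SEEN):
    print(f"{soname}: packaged as {actual}, policy says {expected}")
```

The deliberate exceptions would need an explicit override list, which is
exactly the judgement call mentioned above.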







