On Sunday 06 April 2003 13:26, Stefan van der Eijk wrote:
> Hello,
>
> I've written up on an issue with rpm dependencies in -devel packages.
> I'm not sure if the story is 100% accurate (I'm not a programmer), so if
> you've got a moment to spare, feel free to review it.
This is pretty much completely wrong. The history is wrong, the description of the problem is wrong, and the proposed solution won't work. The proposed solution might have some benefits anyway; I'll get to that.

Rather than explain what's wrong, let me try to give a more accurate history, with some rationales along the way, and then explain the actual problem, and why it can't be solved automatically. This is probably going to be a very long email, as there's a lot to go over.

In the old days, Mandrake worked the same way Redhat did (and, along with most of the RPM-based world, still does): The typical source package mything-1.0.srpm, if it contained libraries or other development files, would build two packages, mything-1.0-i586.rpm and mything-devel-1.0-i586.rpm. The mything package would contain the application binaries, shared libraries (libmything.so.1.0.0), shared library symlinks needed for normal use (libmything.so.1), and user documentation. The mything-devel package would contain the static libraries (libmything.a), shared library symlinks needed for building other code (libmything.so), header files, and developer documentation.

This split long predates the Mandrake policy; it came long before the numbering of libraries. Also, this split has no effect whatsoever on the ability of multiple versions to coexist. For example, mything-1.0 can't coexist with mything-1.2, because they both try to provide the same files, such as /usr/bin/myapp. Likewise, mything-devel-1.0 can't coexist with mything-devel-1.2, because they both try to provide the same files, such as /usr/include/mylib.h. Naming them differently wouldn't solve this problem; it would just make it harder for RPM to catch the problem immediately (it has to go through the "preparing" phase--which, for urpmi/rpmdrake, means it has to download the packages).

The reason for splitting off the -devel packages was that most people don't need them.
Why waste download time, space on the CD, space on the user's hard disk, and/or other resources for header files if most users will never compile anything that requires those header files?

In a few cases, packages were further split: -static-devel, -doc, -devel-doc, -utils, -tools, etc. may be split off. This was pretty rare in the early days, and on most distros (including Mandrake and Redhat), it's still pretty rare, but a few distros went overboard with this (PLD's policy is to create separate -static, -static-devel, and, where appropriate, -docs and -docs-devel, for example, and Conectiva goes about half-way there). Sometimes this is because the static libraries end up being 80% of -devel and even most developers will never need them--so again, it saves space/bandwidth/etc. to separate them out. Sometimes it's because the original program came in multiple separate tarballs and it's easier (both initially and for maintenance) to organize the RPMs the same way. Sometimes it's because the developer puts a specfile in each tarball, which makes this even more compelling (especially when the specfile is designed for your distro). Often it's because whoever had the package first split off -static-devel and everyone else just followed suit (this is especially true when developers make Mandrake packages and Redhat redhatizes them).

Now, on to the multiple-version issue. This was a problem from the beginning of the shared library days, before RPM. Let's say that lots and lots of packages link to mything's shared libraries. Now, 1.0 comes out, and it's incompatible with 0.2.

Let's look at the library version numbering system used by linux/glibc. When a user upgrades from mything-0.2.1 to mything-0.3.0, /usr/lib/libmything.so.0.2.1 goes away, /usr/lib/libmything.so.0.3.0 gets installed, and the existing /usr/lib/libmything.so.0 link now points to the new version.
Since programs link against libmything.so.0, all existing programs still work, and programs that require new 0.3 features also work. (This assumes that minor-version upgrades are backwards compatible, which they're supposed to be, but some developers disagree, or just aren't perfect.)

When the user later upgrades to mything-1.0.0, which may be incompatible with 0.3.0 (major-version upgrades can be incompatible), libmything.so.0.3.0 stays in place, and libmything.so.0 continues to point at it, while libmything.so.1.0.0 is added, and libmything.so.1 points at the new version. Old programs still work because they still have the old library; new programs work because they have the new library.

Other operating systems with shared libraries have similar problems, and handle them in similar ways (except classic MacOS and a few other OS's that went for something more complicated). Even Windows had a completely ad-hoc version of the glibc solution. When VC 4.1 came out, MFC 4.1 was compatible with MFC 4.0, so its shared library was still called MFC40.DLL. When VC 4.2 came out, MFC 4.2 was not compatible, so its shared library was called MFC42.DLL. (Of course later versions of MFC were incompatible with 4.2, but Microsoft still called the library MFC42.DLL, causing all kinds of problems, but that's another story.)

Unfortunately, the RPM system doesn't understand this version numbering scheme. When you upgrade mything-0.2.1 to mything-0.3.0, it replaces all of the files, just as you'd want it to. But when you upgrade mything-0.3.0 to mything-1.0.0, it also replaces all of the files--which means everything that linked to libmything.so.0 stops working.

RPM's automatic requirements handling helped a little. Now, RPM won't let you upgrade 0.3.0 to 1.0.0 if anything depends on libmything.so.0; you have to remove or upgrade all of your old packages to upgrade mything.
But that was still a disadvantage to RPM-based distros: there's nothing about linux that prevents you from having 0.3.0 and 1.0.0 simultaneously, but there is something about RPM that prevents it. (Note that this feature was added for RPM 3.0/RedHat 6.0, IIRC; most packages that have the -devel split on a current Mandrake or Redhat system had the same split in RedHat 5 and vice-versa, so the idea that "the dependency system has not evolved with these changes" is silly.)

More than once, this turned into a nightmare for users and distributors. Let's say mything is something really important (like ImageMagick), but the change is so dramatic that many projects won't convert for quite some time (like the ImageMagick 4.x to 5.x transition). For months, users and distributors are stuck with a choice: keep mything-0.3.0, and give up on hundreds of new and updated packages, or go to mything-1.0.0, and lose hundreds of old packages for which there may be no good replacement.

When the situation gets this bad, a distributor will usually create a special mything-compatlibs-0.3.0 package. This package can coexist with mything-1.0.0 (it has a different name, after all), allowing users to have both .so.0 and .so.1 versions at the same time.

Unfortunately, this problem comes up all the time on less-important packages, and distributors only deal with the really major problems. They don't provide a mything-compatlibs for every single package that undergoes a major version upgrade, so many users are forced to build their own "-compatlibs" packages, or, more likely, circumvent the RPM system and do the same thing manually. Of course many users--especially the novices that Redhat and Mandrake are trying to attract--won't do either; they'll just assume something's "broken" and their system can't do what they want.

Mandrake's innovation now seems obvious: Why not provide compatibility libraries for everything that users might possibly need?
Just as glibc can handle libmything.so.0 and libmything.so.1, so RPM can handle libmything0 and libmything1. Split off libmything from mything, and stick the major version number right on the name, and you're done. So now, instead of mything-0.3.0, you have mything-0.3.0 and libmything0-0.3.0. When mything-1.0.0 comes out, the mything-1.0.0 package will replace mything-0.3.0 (since they're both named "mything"), but libmything1-1.0.0 will live alongside libmything0-0.3.0 (since they have different names).

What to do with the -devel packages? It's most consistent, and simplest for the users to figure out, if you rename mything-devel to libmything1-devel. If a user wants to build a package that requires libmything1, she does a "urpmi libmything1-devel." If she wants to build a package that requires libmything0, she does a "urpmi libmything0-devel." Easy. The same logic applies to -static-devel, -devel-docs, or whatever else might exist; if they go with the libraries, rename them along with the libraries; if they go with the applications (-docs, -utils, etc.), keep the "classic" names.

Note that this doesn't turn one package into four, as your history claims. It may turn three packages into four (if you had mything, mything-devel, and mything-static-devel, you now get mything, libmything0, libmything0-devel, and libmything0-static-devel), or it may turn one into two (mything and libmything0), or it may not change anything (if mything doesn't have any shared libraries that any other app might need).

In some cases (as with Qt, where the libraries and headers are all together in one directory, so you can have /usr/lib/qt3 and /usr/lib/qt2 side by side), a user can even have both -devel versions installed at once. In most cases (where everything just goes in /usr/lib and /usr/include), she can't. In some of these cases, the new version cleanly obsoletes the old one (it's backward compatible); in others it doesn't.
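
To make the renaming trick concrete, here is a rough toy model in Python. It is only a sketch under a big simplifying assumption: the "database" holds exactly one version per package name, which is the only RPM behaviour it imitates. The upgrade() helper is made up for illustration and is not part of rpm or urpmi.

```python
# Toy model only: an "RPM database" that keeps one version per package NAME.
# upgrade() is a hypothetical helper, not a real rpm/urpmi interface.

def upgrade(installed, name, version):
    """Return a new database where `name` is at `version`.

    Any package with the same name is replaced; packages with other
    names are left alone.
    """
    result = dict(installed)
    result[name] = version
    return result

# Traditional layout: the shared library lives inside "mything", so the
# upgrade to 1.0.0 replaces the package that carried libmything.so.0.
old_style = {"mything": "0.3.0"}
old_style = upgrade(old_style, "mything", "1.0.0")

# Mandrake layout: the library package carries the major number in its
# name, so libmything1 is a different package and libmything0 survives.
new_style = {"mything": "0.3.0", "libmything0": "0.3.0"}
new_style = upgrade(new_style, "mything", "1.0.0")
new_style = upgrade(new_style, "libmything1", "1.0.0")

print(sorted(old_style.items()))
print(sorted(new_style.items()))
```

The point of the sketch is that nothing clever happens: coexistence falls out of the names being different, which is also why the same mechanism says nothing about whether two differently named packages own the same files.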
RPM can't automatically figure out that libmything1-devel and libmything0-devel can't coexist; because they have different names, it assumes they're unrelated. If you try to install libmything1-devel when you already have libmything0-devel, you'll probably get an error message saying something like, "file /usr/include/myheader.h from package libmything1-devel conflicts with file /usr/include/myheader.h from package libmything0-devel," but this won't happen until the preparation stage. This stage is after urpmi/rpmdrake has downloaded the package, and too late to do its dependency analysis (to tell you that "in order to install libmything1-devel, you must remove these 3 packages and upgrade these 9 others"), which is only based on information in the headers.

Fortunately, RPM does provide a way for the packager to handle each of the three cases, by use of the conflicts tag, the obsoletes tag, or neither. However, the RPM system can't guess this automatically.

Your problem is similar to this: While RPM can figure out dependencies between two shared libraries (because this happens to be really easy to do in linux via ldd), it can't figure out most other kinds of dependencies. So it can't figure out that a -devel package depends on another -devel package. However, this problem was not caused by splitting the libraries. In fact, splitting the libraries is a partial solution!

I'll explain how, but first I want to go back to my fictional mything package, because both of your examples are pretty bad for illustrating the issue. (Why? The zlib package is one of the few that doesn't follow the Mandrake policy--it should be libz1 and libz1-devel; it doesn't have a separate apps package z--that'd be gzip, zip, and unzip; and it doesn't require anything besides glibc--which you can't have a modern linux system without.
The libpng package is one of the few that was prefixed with lib even before the Mandrake policy [the internal name is libpng]; it doesn't have a separate apps package "png;" it's one of the few examples I remember of any package having -static-devel split off and later merged back into -devel on any distro; and it's one of the rare cases where most other distros have ended up following an ad-hoc version of the Mandrake policy [Redhat's "libpng10" and "libpng12" are equivalent to Mandrake's "libpng2" and "libpng3," while Conectiva just uses Mandrake's names].)

RPM can automatically infer all of the requirements for libmything1, because it's nothing but shared libraries. RPM may not be able to automatically infer all of the requirements for libmything1-devel, however. This means that if the packager isn't sufficiently diligent, a user may find herself inexplicably unable to install libmything1-devel--but every user will always be able to install libmything1, or know the reason why (in fact, with urpmi/rpmdrake, it'll usually fix it for her: "to install the selected packages, you also need to install libmyotherthing," and all she has to do is click OK). As a side benefit, this will sometimes, but not always, give "power users" a clue toward solving the problems with the other packages ("Since libmything1 required libmyotherthing3, maybe libmyotherthing3-devel is required for libmything1-devel to work.").

If the package hadn't been split, those "hidden dependencies" in the development files that cause problems with libmything1-devel would instead cause problems with the monolithic mything package. And this is a much worse problem. Why? Let's say that something in KDE requires /usr/lib/libmything.so.1. Now, something like 90% of Mandrake's users run KDE.
Maybe 1% of Mandrake's users need to compile something that relies on libmything (not only that, that 1% are guaranteed to be among the most able to figure out the problem--after all, they're at least compiling packages they pulled off the net, if not actively developing code).

In other words, splitting off the -devel packages doesn't create a problem; instead, it insulates 90% of the users, including 100% of the novice users, from an existing problem.

And Mandrake's additional separation of the libraries from the application makes this even better, not worse. Let's say mything contains a python script /usr/bin/myapp, with hidden dependencies that RPM can't detect, but libmything0 is the shared libraries used by /usr/bin/myapp and by something in kdebase. Under the traditional (Redhat) policy, this hidden dependency could block the user from installing mything, meaning that she couldn't install KDE either. Under the Mandrake policy, even if the user couldn't install mything, she could still get libmything0 installed, so she could still have KDE. Again, this means that most users, including all novices, are insulated from potential problems (and again it provides a potential clue for mid-level "power users" to resolve problems).

If you think that my example is contrived and implausible, look at the OGG libraries. Try "rpm -q --whatrequires libogg.so.0" on your system; you should see kdebase. If some developers or power-users can't install libogg0-devel because of some hidden dependencies, sure, that's a problem--but if some novices can't install libogg0 because of hidden dependencies, their system is unusable. Fortunately, because libogg0 contains nothing but the shared libraries, this cannot happen on a Mandrake system--but it can happen on another RPM-based system. (In fact, the same is true of your example, zlib1!)
As for your proposed solution, while it's true that the header files contain information that can be used to find dependencies, even a perfect job at this will miss most dependencies for -devel packages, and will be even worse for the application packages.

First of all, not everything is written in C(++); in particular, there are thousands of packages written (at least in part) in perl, sh, python, scheme, and a few other scripting languages, so you need to parse these. And to get the last 5%, you'll need to parse uncommon scripting languages, plus Modula, Eiffel, or any other language that anyone has ever built a package with.

Second, even if you can determine that a header file includes <myheader.h>, that just tells you that it requires myheader.h. More often than not, you specifically need the version from libmything1-devel, so knowing that you need anything that provides myheader.h doesn't help. For example, ImageMagick 5.0 provided most of the same header files as 4.x, but their contents were quite different. A more thorough analysis could tell you that ImageMagick-devel-4.x wasn't compatible, but that would require a huge amount of work, and so much information that your RPM headers would be bigger than their contents. And even then, your code may not run if built against the wrong version, so you really need a good regression test suite for every package to test this automatically. (Going back to your example, if you can tell that zlib1-devel requires glibc, but you can't tell that it requires glibc-2.3, that's useless; only really ancient systems don't have glibc, but many comparatively recent systems don't have 2.3.)

Third, nearly every language has a way to run a program, load a library, or read a file by name.
For example, the following code probably calls convert, creating a dependency on ImageMagick that a script couldn't catch:

    sprintf(cmd, "%s \"%s\" \"%s\"", config->convert_path, gifname, pngname);
    system(cmd);

Fourth, there are cases where there's no strict dependency, but the package is still completely useless without another. For example, xcivstart will run without freeciv, but all it can do is tell you that it can't find civclient and civserver, which surely isn't very useful to a user. Sometimes this is a judgement call. For example, civworld, the freeciv editor, isn't all that useful without freeciv--but you might want to edit scenarios on a machine you never intend to play the game on, so maybe it is.

A human packager will gather much of this information in a variety of ways--from the package's website, from her own experience, from trial and error, from bug reports ("mything-1.0.0 installed fine on my system, but when I tried to run it, it told me there was 'no convert found in the path'"), from knowing what the package is intended to do ("xcivstart is a front-end to FreeCiv" probably means that xcivstart requires freeciv), etc.

That's not to say that it would be a bad idea to parse header files. But it's not going to solve most problems.

As for -static-devel packages (where they exist), they may actually be easier. Static libraries are relatively easy to parse, so finding dependencies on other packages is easy. However, there's no real reason to do this. Static libraries are, in almost every case, useless without the accompanying header files. This means that libmything0-static-devel has to require libmything0-devel. And since this in turn requires libmything0, and the static libraries will probably have (a subset of) the same requirements as the shared libraries, you don't need anything else.

By the way, while the Mandrake library policy doesn't cause any of the problems you mention, it isn't perfect.
Here's a list of problems:

First, because the names of all of these packages have changed from what the rest of the world uses, taking specfiles designed for another distro (usually Redhat) and Mandrakizing them isn't always easy. If I download a Redhat SRPM (or a tarball from the developer's website that has--or builds on the fly--a specfile designed for Redhat), I have to go through each of the manual requires, buildrequires, obsoletes, conflicts, etc. tags and figure out the Mandrake names for each of those libraries.

On top of that, because developers don't always follow the rule that minor version upgrades are compatible (especially in the 0.x stage), the .so often ends up with a different number than the overall package. So, you may have libmyotherthing.so.2 for 1.1.0 through 1.2.2, and libmyotherthing.so.3 for 1.2.3 through 1.2.9. This means that the user is now forced to deal with two different version number schemes.

But it's even worse for the packager. If the stock (Redhat) specfile says "Requires: myotherthing >= 1.2.3" I need to know that this means "Requires: libmyotherthing3 >= 1.2.3." To know this, I have to know that myotherthing-1.2.3.srpm builds libmyotherthing3. If I have libmyotherthing3-1.2.7 installed, there's no easy way to figure out when the change came--I have to read the changelogs, browse the website, or even download and inspect old packages.

To make things still worse, let's say that the package I'm trying to use can build with any version of myotherthing-devel from 1.2.3 through 1.5.1 (but not 1.2.2, or 2.0.0). Those libraries span libmyotherthing3-1.2.3 through libmyotherthing5-1.5.1. What do I put in the buildrequires? If I specify libmyotherthing5-devel, other users may have to upgrade for no good reason just to build it. And then, do I put libmyotherthing5 in the requires, making end users upgrade unnecessarily too?
On another note, if the devel files had kept their old names (in other words, you have mything, libmything0, and mything-devel instead of mything, libmything0, and libmything0-devel), RPM would automatically prevent two versions of the -devel packages from being installed together. While the flexibility in choosing between conflicts, obsoletes, and nothing is good, this means that the default is to allow the two to coexist, which is usually wrong, which can't be detected until the package has been downloaded (or, usually, 20 packages have been downloaded and the whole batch installation fails).

The virtual package names provide a partial solution to most of these problems. If libmyotherthing3 provides a virtual package named libmyotherthing, and so do 0 and 1 and 2 and 4 and 5, and likewise for the devel packages, I can just put "buildrequires: libmyotherthing-devel >= 1.2.3" and "requires: libmyotherthing >= 1.2.3". And of course my script can just replace every "requires: myotherthing" with "requires: libmyotherthing" (and likewise for -devel and buildrequires) in the Redhat package and I'm ready to go. However, this only works if you, and every other packager, are always careful about getting the virtual package names right--and if there's a good policy on how to do so.

My only suggestion for this first problem is to provide a clearer policy for virtual package names that requires, wherever possible, providing the typical (Redhat) name (or, if this is not feasible, something derived from this name by a simple algorithm--even if it's not one that the computer has enough information to run for you). This problem is especially bad in complicated suites of libraries like Gtk/GNOME, and I seriously doubt that this is because fcrozat is lazy or stupid (in fact, anyone who could get the GNOME mess integrated into a distro that doesn't follow the exact same policies as the GNOME team deserves a commendation); it's because there's no consistent policy to guide him.
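
To show why the virtual names help, here is a small Python sketch of the lookup I have in mind. Everything in it is an assumption for illustration: the PACKAGES table is made up from the version numbers in my example, satisfying() is a hypothetical helper, and plain tuple comparison stands in for RPM's real version comparison, which is more involved.

```python
# Hypothetical package records: (real name, version, virtual names provided).
# The versions are the ones from the libmyotherthing example; the table
# itself is invented for illustration.
PACKAGES = [
    ("libmyotherthing2", (1, 2, 2), {"libmyotherthing"}),
    ("libmyotherthing3", (1, 2, 3), {"libmyotherthing"}),
    ("libmyotherthing5", (1, 5, 1), {"libmyotherthing"}),
]

def satisfying(packages, virtual_name, min_version):
    """List the real packages that provide `virtual_name` at or above
    `min_version`.

    Tuple comparison is a stand-in for RPM's own version ordering.
    """
    return [
        name
        for name, version, provides in packages
        if virtual_name in provides and version >= min_version
    ]

# "requires: libmyotherthing >= 1.2.3" without naming a soname major
candidates = satisfying(PACKAGES, "libmyotherthing", (1, 2, 3))
print(candidates)
```

With a consistent virtual name, the packager writes one requirement and never has to know which major number goes with which release. Without it, the lookup has nothing to match on, which is the whole argument for a firm policy.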
Also, when both developers and Mandrake are sticking version numbers on the end of a package, the need for a good numbering policy becomes even clearer. The current policy is to stick the "internal" version number (from the SONAME), followed by an underscore, followed by the so version number. This policy makes sense, but it's often not followed. Sometimes the numbering is even changed from one version to the next, neither in accordance with the policy. Here are some examples I was able to gather in a few seconds (there are probably dozens more):

  libadplug-1.3.so.0 is libadplug1.3_0, not libadplug-1.3_0.
  libadplug-1.4.so.0 is libadplug1.4, not libadplug-1.4_0.
  libart_lgpl_2.so.2 is libart_lgpl2, not libart_lgpl_2_2.
  libfaad.so.0 is libfaad2_0, not libfaad0.
  libfame-0.8.so.8 is libfame0.8, not libfame-0.8_8.
  libfame-0.9.so.0 is libfame0.9, not libfame-0.9_0.
  libgtk-1.2.so.0 is libgtk+1.2_0, not libgtk-1.2_0.

This policy also means that the package names that result are often confusing. Some of the GNOME 2.2 packages have no number on the soname, some have 2, some have -2, some have -2.0, etc. None have 2.2 or -2.2. Since the soversion is usually 0, this means that you have two extra sets of version numbers, neither of which is 2.2, which often together look like 2.0, and which mean nothing to the user. And of course this makes the virtual names problem even more complicated, since other vendors and the GNOME team themselves are providing packages with names that often don't follow any policy at all, and other people are building packages against those.

There should be a better way. Unfortunately, I can't think of one; maybe it's just wishful thinking. If not, the policy should always be followed, rigorously.
Maybe rpmlint could check that the right-most part of the name made up of digits, dots, hyphens, and underscores follows the rule exactly, and for the rare special cases (like liba52dec0 not being named liba52dec52_0, for obvious reasons) you'd have to explicitly make a judgement call that rpmlint was wrong.
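
For what it's worth, the naming rule itself is easy to write down. The Python sketch below is my reading of the policy, inferred from the "not ..." names in the examples above; expected_package_name() is a hypothetical function, not an existing rpmlint check, and it assumes the SONAME ends in a single so version number.

```python
import re

def expected_package_name(soname):
    """Derive the policy-conforming package name from a SONAME.

    Assumed rule, inferred from the examples in this mail: if the part
    before ".so." already ends in a digit (an "internal" version), append
    an underscore and the so version; otherwise append the so version
    directly.
    """
    match = re.fullmatch(r"(.+)\.so\.(\d+)", soname)
    if match is None:
        raise ValueError("not a SONAME of the form name.so.N: " + soname)
    stem, so_version = match.groups()
    if stem[-1].isdigit():
        return stem + "_" + so_version
    return stem + so_version

for soname in ("libadplug-1.3.so.0", "libart_lgpl_2.so.2", "libfaad.so.0"):
    print(soname, "->", expected_package_name(soname))
```

A real check would compare this result against the actual package name and let the packager override it for the special cases, since the rule can only see the SONAME and has no idea what the upstream project calls itself.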