On Mon, 2022-04-11 at 10:13:52 +0200, Ansgar wrote:
> Package: dpkg
> Version: 1.21.7
> Severity: wishlist
> Someone wondered on IRC why we ship symbols files in shared library
> packages instead of the associated -dev packages

In theory there is no technical limitation preventing these from being
shipped in -dev packages, as the tools only care about explicitly
linked shared libraries (so if a transitive shared library does not
have shlibs/symbols files, their absence from the system should not be
a problem).

The problem is that the tooling needs to be able to find these files:
it goes from the object SONAME declaration to the shared object on
disk, then looks for the package owning that file, and then looks for
the shlibs or symbols files present in that package (either from the
system or from the package build directories).

(This should probably be documented somewhere explicitly, as I did not
see anything obvious in either the dpkg docs or the Debian policy
manual.)

> and noted that they
> take 1% of disk space for a fresh debian:sid docker container (and
> probably more on the -slim variant of the container).

I just checked (OOC) and f.ex. on the current sid-slim variant it seems
to be around 1.77%. The actual size of these files there is 1.6 MiB
(according to du -sch).

After a quick look I see either trivial targets that would more than
offset that, f.ex.:

  $ dpkg -P e2fsprogs libext2fs2 mount gcc-9-base
  $ rm -f /var/cache/debconf/*.dat-old
  $ rm -f /var/cache/debconf/templates.dat

Or other more localized/focused things like there being both
libpcre2-8-0 and libpcre3, or (libcrypto + libssl) + (libhogweed +
libnettle + libp11-kit + libtasn + libidn2 + libunistring + libgnutls),
that would give way more significant gains (f.ex. getting rid of the
GnuTLS stack would amount to something like an additional 8 MiB,
including the reduction from the then no longer present symbols files).
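For anyone wanting to reproduce the size figure quoted above on their
own image, a rough sketch follows. It assumes the default dpkg database
location (/var/lib/dpkg/info) with the symbols files stored there as
*.symbols. The function name symbols_size is made up for illustration,
and it sums file contents in bytes rather than disk blocks, so the
result will differ slightly from what du reports.

```shell
# Hypothetical helper: print the total size in bytes of all symbols
# files found directly inside a dpkg info directory.
# $1 is optional and defaults to the usual dpkg database location
# (an assumption; adjust it when inspecting an unpacked image).
symbols_size() {
    dir=${1:-/var/lib/dpkg/info}
    # Concatenate every *.symbols file and count the bytes.
    # If nothing matches, cat is never run and wc reports 0.
    find "$dir" -maxdepth 1 -type f -name '*.symbols' -exec cat {} + | wc -c
}
```

Calling symbols_size with no argument inspects the running system;
passing a directory inspects that directory instead.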
Matthias Klose seems to have implied (in a bug report) that he does not
find symbols files for C++ libraries very helpful, so if he decided to
stop shipping them for libstdc++6 (the biggest there), that would be an
additional 400 KiB reduction.

Otherwise, making dpkg transparently compress such files in the db
would reduce their size by 1.1 MiB (with just gzip), which is something
that I had already considered for the old changelog in the dpkg db
proposal.

> It would be nice if it was possible to exclude symbol files from such
> environments. This could mean:
>
> - Ship them in the -dev package instead.

While this could potentially be done, it seems to me the amount of
global effort and the resulting properties might not be a very good
trade-off for the gains of currently less than 2 MiB (or potentially
around 500 KiB) of space there.

Conceptually, storing them in either <lib> or <lib>-dev packages can be
argued to make sense, and each option has good and bad properties.

Shipping them in <lib>:

- They are guaranteed to be kept in sync with the shared object they
  describe (no requirement for guaranteeing exact version dependencies
  between <lib> and <lib>-dev, even though this tends to be current
  practice).
- They do not require adding some way to back-reference the <lib>-dev
  package corresponding to its <lib> (a new control field f.ex.).
- They do not depend on the <lib>-dev package being arch:any (which
  Multi-Arch would require, but that's an optional feature from dpkg
  PoV).
- (There could be external functional reliance on these files being
  shipped in <lib> packages to extract specific symbol version
  information, as this can be considered part of the interface.)

Shipping them in <lib>-dev:

- They are shipped in the package whose presence indicates the files
  might get used, and don't "waste" space in case no building is going
  to be happening.
- The Build-Depends-Package field in symbols files could be somehow
  simplified into some boolean variant (but not its
  Build-Depends-Package_s_ counterpart, although both of these are
  optional, unlike the required new back-reference field in the control
  file).

So it seems to me that making this change would imply that:

- Maintainers (not just debhelper) would need to modify the packaging
  to move those files to the new package (which has global impact), for
  a potentially very long-winded transition.
- Switching all libraries seems like a rather large undertaking for a
  potentially ~500 KiB gain TBH. Switching only packages in the minbase
  set would create a weird packaging oddness and non-uniformity. :/
- Regardless of a full or a partial transition, both locations would
  need to be supported anyway, which would also make packaging a bit
  more complicated/confusing.

This seems in contrast to other proposals to reduce the essential set,
which imply global efforts but also imply complexity reduction, f.ex.
by making the bootstrapping requirements smaller, or by making
dependencies explicit to get rid of implied assumptions. This case
instead seems to end up adding new complexity.

> - Ship them in a well-known location in /usr (they are not variable
>   state data after all); this would allow the regular exclusion
>   mechanism already used by -slim images to be used here as well.

They are varying packaging state metadata, like all other stuff stored
in the dpkg database. The excludes currently used all seem to be for
files whose removal does not alter functionality anyway, so excluding
these files would render these images unusable as bases for build
containers.

(Not to mention the additional complication of having to encode these
pathnames in a way compatible with the dpkg db so that f.ex. Multi-Arch
can be handled correctly, or whatever new requirements might be coming
along, w/o needing to encode the location format somewhere else.)

Guillem