Working on an old piece of code like rpm is much like city infrastructure
renewals: you try to expect the unexpected and plan accordingly, but every now
and then you'll still get surprised when you break the asphalt: "what are all
these pipes, they don't exist in any drawing?". And consequently, work gets
delayed to sort it all out. Several times.
Perhaps the main headline feature in [rpm
4.20](https://rpm.org/wiki/Releases/4.20.0) is the [declarative
buildsystem](https://github.com/rpm-software-management/rpm/issues/1087)
support in the spec files. This was a feature I first dreamed up around 2012,
which alone suggests there was quite a bit of plumbing to sort out before it
could happen. One of the more critical support features for that was the ability
to [append and prepend](https://github.com/rpm-software-management/rpm/pull/2728)
to existing spec sections. In order to support *that*, the previously very
special `%prep` section with its built-in `%setup` and `%patch` pseudo-macros
needed to be [turned
into](https://github.com/rpm-software-management/rpm/pull/2730) a normal
scriptlet, and in order to do that, the pseudo-macros needed to be turned into
real macros, and in order to do *that*, the macro engine needed a rather
thorough rework, also over several years and countless changes like #1406 and
#1434. The first concrete step towards declarative builds was [introduction of
%autosetup](6c5214950e5885c33c498969ca256c9550f5936b) in 2012, complemented
with [%patchlist](https://github.com/rpm-software-management/rpm/pull/679) in
2019. And so on. It was a lot of work spread over more than a decade, but these
factors were reasonably well known ahead of time. But this is all just backdrop
to the thing that *did* catch us by surprise, right at the finish line. If you do
things with rpm, you have probably encountered it a few times: debuginfo
packages.
That story begins somewhere around 2002. I wasn't deeply involved with rpm at
that time, so the early parts are based on info gathered and deduced from commits to
rpm and redhat-rpm-config and various fragments I've heard/read over the years
and so may contain inaccuracies. But AIUI, the toolchain people at Red Hat were
tasked with making debugging released binary builds meaningful. I don't know
whether the task was specifically to achieve this without major changes to rpm
itself, but that's how it took place: it was practically all implemented with
macro voodoo + a helper script and a binary, none of which needed to be inside
rpm. Much of it ended up in the rpm repository sooner or later, but it didn't
*need* to be there. It's no mean feat, really, but it also did require some
quite, uh, creative solutions.
Of course in the intervening 22 years *a lot* happened. For a long time, the
underlying tooling was in the rpm repository. In particular, around 2017 Mark
Wielaard practically rewrote the underlying debugedit tool and introduced some
in-rpm code for better integration, and Michael Schroeder and Richard Biener
added support for debuginfo sub-packages, which also needed in-rpm code. And
then in 2021 debugedit and the helper script were split out to an external
project because people outside the rpm ecosystem got interested in them. To the
great relief of us rpm maintainers: debugedit deals with deep ELF format
internals, and we
never really knew what to do with it anyhow. In all that flux, the one thing
that didn't change is the one thing almost certainly intended as a temporary
hack only: the way debuginfo packages are actually enabled. It also never
entered the rpm codebase at all. And that's what we ran into head-on, 22 years
later, basically on the eve of the 4.20 alpha release.
In broad strokes, debuginfo packages live as template macros which are used to
generate the spec preamble for them, plus a script invoked from the `%install`
post template to edit and collect the files. This can all be done quite neatly in
the generic spec scriptlet template infrastructure, except for one thing: how
do you inject something into nearly every single spec preamble, without
actually modifying them? I believe the brilliant-awful macro hack was
originally by Elliot Lee in redhat-rpm-config, for accomplishing something
else. The people adding debuginfo support saw the trick and ran with it.
Since 2002, redhat-rpm-config has contained this macro definition:
```
%install %{?_enable_debug_packages:%{?buildsubdir:%{debug_package}}}\
%%install\
%{nil}
```
This is the entry to our little Rube Goldberg machine. There's an incredible
amount of powerful magic embedded in those three lines.
You need to be familiar with the rpm spec syntax and macros to properly follow
this, but `%install` marks the beginning of the shell scriptlet where the
packager tells rpm which content to put in the resulting binary package. So,
`%install` is just a section opener string, hard-coded inside the spec parser
for that purpose, and doesn't "do" anything by itself. However, the above macro
turns this innocent section marker into quite something else. In English: if
the `%_enable_debug_packages` macro is defined, then if `%setup` was used in the
spec (`%buildsubdir` is a side-effect of that), expand the contents of the
`%{debug_package}` macro here, and then add back the `%install` section marker
as if nothing happened.
`%debug_package` is defined something like this:
```
%debug_package \
%ifnarch noarch\
%global __debug_package 1\
%_debuginfo_template\
%{?_debugsource_packages:%_debugsource_template}\
%endif\
%{nil}
```
`%_debuginfo_template` is the spec preamble definition of a debuginfo package,
something like this (trimmed for brevity):
```
%_debuginfo_template \
%package debuginfo\
Summary: Debug information for package %{name}\
%description debuginfo\
This package provides debug information for package %{name}.\
%files debuginfo -f debugfiles.list\
%{nil}
```
So the `%install` macro override emits all that into the spec preamble section
inside a `%ifnarch noarch` conditional to prevent it from firing on arch
independent packages (which aren't expected to contain ELF files), and then
emits that original `%install` to let you proceed with whatever it was your
package does in there. It's really quite clever, but at the same time, awful.
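
A side effect of all this being plain macros is that a spec can switch the whole thing off by redefining one of them. A minimal sketch of the opt-out idiom commonly seen in distro specs; it comes from general packaging practice rather than from the definitions above, so treat it as an illustration:

```
# Make the debug_package macro expand to nothing, so the install override
# shown above has no debuginfo preamble to inject for this package.
%global debug_package %{nil}
```

Because the override expands `%{debug_package}` at the point where the `%install` line is read, the definition has to appear earlier in the spec than `%install` to have any effect.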
But it gets weirder from there. Notice how there's a `%global __debug_package
1` inside the `%ifnarch` block? You'd think that it doesn't get defined on
noarch packages, but it does. The macro engine doesn't know anything about
`%if` and the like: to it, `%ifnarch` is just something that looks like a macro
but is undefined, so it falls through untouched. The multiline result then gets
passed to the spec parser, which processes the `%if`s and the other content.
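
To make the two stages concrete, here is roughly the text the spec parser receives in place of the `%install` line once macro expansion is done. This is a sketch for a hypothetical package named `hello`, assuming `%_debugsource_packages` is not defined, using the trimmed templates shown above and with whitespace tidied up:

```
%ifnarch noarch
%package debuginfo
Summary: Debug information for package hello
%description debuginfo
This package provides debug information for package hello.
%files debuginfo -f debugfiles.list
%endif
%install
```

The `%global __debug_package 1` line is nowhere to be seen: the macro engine already executed it during expansion, regardless of architecture. Only the `%ifnarch`/`%endif` pair is left for the spec parser to evaluate, so on a noarch build the sub-package is skipped while `%__debug_package` stays defined.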
At the other end of `%install`, the rest of the magic is embedded inside the
`%__spec_install_post` template macro, something like the following, which gets
appended to the end of `%install` behind the scenes during the actual build of a
package:
```
%__spec_install_post\
%{?__debug_package:%{__debug_install_post}}\
%{__arch_install_post}\
%{__os_install_post}\
%{nil}
```
Note how it tests for the `%__debug_package` definition to avoid triggering on
noarch packages. But we just concluded above that it always gets defined! Yet,
somehow debuginfo packages are not generated for noarch packages, so it must
work somehow? Well, it doesn't. The debuginfo part of `%__spec_install_post`
actually fires for noarch packages too, but it silently falls through as
there's nothing for it to do on a normal noarch package. It can, however, leave
behind tell-tale `debugfiles.list` etc files in the build directory if you go
looking. So how does it not fail with errors then? The catch is that the
`%ifnarch noarch` block in the `%debug_package` macro works for the spec
preamble part, so the debuginfo *package* is never created, and so rpm doesn't
go looking for it, and the `*.list` files end up just being some junk in the
directory that rpm doesn't care about.
That's why I call it a Rube Goldberg machine: it may not be intentionally
complicated, but it sure is complicated and precarious.
Now, what does this all have to do with our declarative buildsystems? Well, the
related append and prepend options mean `%install` can occur multiple times in
the spec with `-a` or `-p` options, and you can probably see how that wouldn't go
too well with this. But, couldn't you just turn the `%install` macro override
into a parametric macro which only emits the debug stuff when no arguments are
passed and look away for another twenty years? Well, maybe, but the madness has
to stop somewhere.
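
For context, here is a sketch of the kind of spec this is about, assuming the `BuildSystem` tag and the `-a` append option as they exist in 4.20. The package name, source file, file list and the appended command are all invented for illustration:

```
Name: hello
Version: 1.0
Release: 1
Summary: Example package
License: MIT
Source0: hello-1.0.tar.gz
BuildSystem: autotools

%description
Example package.

%install -a
# Appended to the install script generated by the build system.
rm -f %{buildroot}%{_infodir}/dir

%files
%{_bindir}/hello
```

With the old distro-side override in place, the `%install -a` line would go through the `%install` macro as well, and that macro was presumably written on the assumption that the section marker appears exactly once and takes no arguments.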
The real rub was that since this `%install` override only exists in distros and
not rpm upstream, we only really ran into it when it was far too late in the
release process to start reworking something like that. Technically, I knew it
existed but had blissfully forgotten, and certainly didn't realize the
implications when adding append/prepend modes. In any case, this blocked the
use of our headline feature, so we scrambled for a few weeks to get debuginfo
enablement logic properly and fully upstreamed. The existing frail machinery,
together with tens of thousands of packages built on top and sometimes around
it, each in their own sometimes peculiar ways, was always a terrifying thing to
modify, and doubly so when under time pressure.
The end result in 4.20 utilizes some of the new dynamic spec generation
features Florian Festi has been working on. It was no walk in the park though:
it took us several weeks of experimenting over multiple pull-requests to get it
to the point where it currently is. Among other fun, there was a bug which
caused `%_target_cpu` and various other macros + variables to disagree with the
rest of the spec on specs with BuildArch, when an explicit `--target` is not
passed, causing surprises (like, getting debuginfo packages when you don't
expect them) with dynamically generated content. And when I finally made rpm
automatically reload the platform configuration if an explicit `--target` is not
specified to address that, we discovered that `mock` always passes something
like `--target $(uname -m)` to rpmbuild, even for noarch packages. Except
inside `koji` where it always passed `--target noarch`. And so on. We also
intended to enable debuginfo packages for packages without `%setup`, but that
turned out to be too much breakage.
What is present in 4.20 resembles the old madness way too much for my liking,
but various details of the old implementation have leaked to thousands of specs
in ways that make changing them impossible or nearly so. At least all the
machinery is now upstream under our eyes where we can hopefully simplify and
streamline it gradually over time.
To those who made it this far: this is hopefully the start of on-going blogs
about rpm development, "tales from the trenches" and whatnot.