On Fri, Jan 12, 2024 at 01:31:18PM +0100, Helmut Grohne wrote:
> The relevant situation is not entirely trivial to construct:
> 
>  * Package $first contains an aliased file $file and this is moved to
>    package $second in an update.
>    OR
>    Package $first diverts aliased location $file normally owned by
>    package $second.
> 
>  * An update to package $second moves $file to its physical location
>    below /usr.
> 
>  * Package $second declares a versioned conflict for package $first with
>    any version that contains or diverts the aliased $file.
> 
> Then we can construct a file loss scenario:
> 
>  * Install package $first.
>  * Schedule $first for removal:
>    echo "$first remove" | dpkg --set-selections
>  * Install the updated $second:
>    dpkg --unpack $second.deb

I somehow missed that Ben's libnfsidmap bug #1058937 works in a slightly
simpler way. Given that $second has a conflict with the installed version
of $first, one can skip the --set-selections step and instead install
$second directly with dpkg -i. So no, this weird selections stuff is not
technically necessary.

In general, when doing these dances there are two outcomes: either
$first is unpacked (or removed) before $second is unpacked, or the other
way round. The latter case always shows a message like this:

    dpkg: considering removing $first in favour of $second ...
    dpkg: yes, will remove $first in favour of $second

It is these cases that exhibit the buggy behaviour.
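
To check a set of saved upgrade logs for this message, something like the
following could be used. This is only a sketch: find_conflict_removals is
a made-up helper name, and it assumes the logs contain dpkg's English
output verbatim.

```shell
# Hypothetical helper: print the names of log files that contain dpkg's
# "yes, will remove X in favour of Y" decision line.
find_conflict_removals() {
    # -l prints only the names of matching files, -E enables extended regex.
    grep -lE 'dpkg: yes, will remove .+ in favour of .+' "$@" 2>/dev/null
}
```

It would be called with the log files as arguments, for example
find_conflict_removals *.log, and prints nothing when no log matches.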

> In most upgrade scenarios, apt will remove/upgrade package $first before
> performing the unpack of $second. In these cases, no loss happens.

I tried to get an idea of what "most" means precisely. For one thing, I
constructed various bullseye->bookworm and bookworm->unstable upgrades
followed by dpkg --verify. This included large installations such as
task-gnome-desktop with recommends and targeted cases where I hoped to
find problems, such as upgrading molly-guard, nfs-ganesha and a few
more. In none of these cases (all doing a plain apt-get dist-upgrade)
was I able to make dpkg --verify unhappy.
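
The check after each upgrade can be reduced to looking for lost files in
the dpkg --verify output. A minimal sketch, assuming the default
rpm-style output format in which a lost file is reported on a line
starting with "missing"; missing_paths is a made-up helper name:

```shell
# Hypothetical helper: read `dpkg --verify` output on stdin and print the
# paths that are reported as missing.
missing_paths() {
    # On a "missing" line, print the first field that looks like an
    # absolute path. Paths containing whitespace are not handled.
    awk '$1 == "missing" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^\//) { print $i; break }
    }'
}
```

After an upgrade one would run dpkg --verify | missing_paths and treat
any output as a sign of file loss.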

Another route was searching for existing evidence. piuparts has lots of
logs of upgrading packages and in >= bookworm, I found exactly one
having that "yes, will remove" content. It was
https://piuparts.debian.org/testing2sid/pass/rubber_1.6.0-2.log and
that's due to texlive-base declaring a versioned Conflict with
texlive-latex-base and texlive-latex-base declaring versioned Breaks
with texlive-base. This is the mutual Conflicts case that also broke
stuff in a draft patch for molly-guard. This data point confirms what
David Kalnischkies said about this earlier: In the absence of mutual
conflicts, apt removes $first before unpacking $second.

I also tried a web search for "yes, will remove", and in most of the
logs I found, dpkg was used directly (though without --set-selections).
That texlive mutual conflict was an exception. Evidently, this rarely
happens on real installations.

> Therefore, I hope that the loss cannot be experienced when upgrading
> with apt or frontends using apt such as aptitude, but there is no proof
> of this.

So all the evidence I found confirms the guess that the problem cannot
be observed with apt unless mutual conflicts exist. On the flip side,
simply installing a package that declares Conflicts occasionally
triggers this and if you happen to do this to a package that replaces
aliased files, then your files vanish.

In particular, this raises the question whether we consider the upgrade
that Ben describes in #1058937 as supported or whether we can close the
bug. In effect, that's the question to ask here.

I note that netplan.io/0.107.1-2/#1060661 just opted for not doing
Conflicts and instead employed protective diversions (M8). In principle,
we could generally prefer M8 (for P1, but not for P7) and thereby reduce
the problem at the cost of making the mitigation more complex. At least
for the essential set, there is not much choice as employing Conflicts
is known to lead to bad things.

> One takeaway from the CTTE meeting was that this loss should be
> mitigated when it may make a system unbootable. That is a property that
> is difficult to capture and would likely require mitigating half of the
> conflicts.

While this may seem like an obvious rule of thumb, it very much is
not. For many of the packages, one can plausibly construct crazy boot
schemes involving them, though cryptsetup quite clearly falls within
scope.

> The way of mitigation also is non-trivial. In the window between unpack
> of files that will be lost and actual loss, no maintainer script is run
> reliably. Hence, copies of affected files have to be installed
> elsewhere:
>  * systemd-sysv looses only symlinks whose target is specified in
>    postinst.
>  * gzip looses shell scripts that will be embedded in its postinst.
>  * For larger files such as dhclient, we consider moving the file to an
>    unaffected location and then pointing a symlink at it such that only
>    the symlink needs to be restored.

This is the complex case where Conflicts are used to mitigate
ineffective diversions. Mitigating ineffective Replaces is a little
simpler. For instance, see
https://salsa.debian.org/debian/netplan.io/-/commit/6e3a0b9a32f23291d32e0f61f1828be1aab52bf2
Doing the same thing for libnfsidmap is at least plausible.
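
The general shape of such a protective diversion can be sketched as
follows. This does not reproduce the netplan.io commit; the helper name,
the package name and the ".usr-is-merged" suffix are assumptions chosen
for illustration, and the function only prints the dpkg-divert call a
preinst might issue rather than running it.

```shell
# Hypothetical helper: print (do not run) a dpkg-divert call that diverts
# an aliased path on behalf of the package that ships the file at its
# physical location below /usr.
protective_diversion_cmd() {
    pkg=$1      # package that registers the diversion (illustrative)
    aliased=$2  # aliased path, e.g. a location below /lib
    # --no-rename leaves the existing file in place; the diversion target
    # name is an assumption for this sketch.
    printf 'dpkg-divert --package %s --no-rename --divert %s.usr-is-merged --add %s\n' \
        "$pkg" "$aliased" "$aliased"
}
```

Printing the command keeps the sketch harmless to run; in a real
maintainer script the diversion would also have to be removed again once
it is no longer needed.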

And since we're likely going to mitigate cryptsetup and isc-dhcp-client,
that doesn't leave many diversion cases, and most of the remaining
conflicts could be augmented with these simpler protective diversions.
Striking a good balance here is hard.

Helmut
