Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 17:59 -0400, Zack Weinberg wrote:
> On 2022-09-28 5:30 PM, Ansgar wrote:
> > On Wed, 2022-09-28 at 16:40 -0400, Zack Weinberg wrote:
> > > "Available and usable at all times" is orthogonal to "maintainer
> > > scripts do not render the system unbootable". As I read things,
> > > *all* packages bear the responsibility of not rendering the
> > > system unbootable.
> >
> > No, it's a significantly weaker requirement than what you want to
> > impose. If it is not available and usable at all times, it can
> > clearly render the system unbootable (by not being available or
> > usable at boot).
>
> The vast majority of Debian packages provide programs, libraries, etc.
> that are not used at all during the boot process. Therefore, *even if*
> those packages are currently unusable, due to a crash in the middle of
> an upgrade that left them unpacked-but-not-configured, or whatever,
> they can't prevent the system from coming up at least as far as the
> point where it's possible to get a root shell and run
> `dpkg -a --configure`.

Yes, those packages are irrelevant and I wasn't talking about them
anywhere. So I don't know why you mention them now.

> The small subset of packages that *are* used at boot time, do need to
> take extra care to keep working even if they are unpacked but not
> configured, and that subset and that extra requirement is codified as
> the rules for (transitively) Essential packages.

No, that is not correct. The set of packages required for boot is
significantly larger than essential and not well defined. Common
examples include: kernel, boot loaders, init system, ...; they are
often required for boot, but not essential.

I also don't think the essential requirements are sufficient for
"works after system crash".

> But *all* packages must take particular care *in their maintainer
> scripts* to not render the system unbootable, because maintainer
> scripts are all run with full root privileges, at a time when the
> system is in a partially ill-defined state (since it is in the
> process of being upgraded --

No, they usually aren't run in an ill-defined state.

> this is why we have the "postinst scripts can't assume any
> non-Essential functionality works" rule),

There is no such rule.

Seriously, this is getting nowhere. If you want to tell maintainers
they are wrong and everything they do must be reverted, please at
least inform yourself a bit more before filing bugs with tech-ctte or
vague release-critical bugs. Especially if you do so for issues that
people are already tired of talking about.

> and yet it could still be in active use (there has never been a
> requirement to take the system to single-user mode before running
> 'apt-get upgrade').

That would be a different problem from "must work after arbitrary
system crash". I would prefer if we would not switch between different
problems.

> > I tried searching for that justification and a major internet
> > search provider just says 'Your search - "potentially renders the
> > system unbootable" - did not match any documents.'
>
> https://www.debian.org/Bugs/Developer#severities
>
> The official wording appears to be "makes unrelated software on the
> system (or the entire system) break". I hope you will agree that a
> system that doesn't boot is entirely broken.
>
> https://salsa.debian.org/reportbug-team/reportbug/-/blob/master/reportbug/debbugs.py#L79
>
> is where I got the "unbootable" phrasing.

No, there is a significant difference between "renders the entire
system unusable (e.g., unbootable [...]" and "potentially renders the
system unbootable".

Anyway, please take it to the ctte bug or just close this bug.

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On 2022-09-28 5:30 PM, Ansgar wrote:
> On Wed, 2022-09-28 at 16:40 -0400, Zack Weinberg wrote:
> > "Available and usable at all times" is orthogonal to "maintainer
> > scripts do not render the system unbootable". As I read things,
> > *all* packages bear the responsibility of not rendering the system
> > unbootable.
>
> No, it's a significantly weaker requirement than what you want to
> impose. If it is not available and usable at all time, it can clearly
> render the system unbootable (by not being available or usable at
> boot).

The vast majority of Debian packages provide programs, libraries, etc.
that are not used at all during the boot process. Therefore, *even if*
those packages are currently unusable, due to a crash in the middle of
an upgrade that left them unpacked-but-not-configured, or whatever,
they can't prevent the system from coming up at least as far as the
point where it's possible to get a root shell and run
`dpkg -a --configure`.

The small subset of packages that *are* used at boot time, do need to
take extra care to keep working even if they are unpacked but not
configured, and that subset and that extra requirement is codified as
the rules for (transitively) Essential packages.

But *all* packages must take particular care *in their maintainer
scripts* to not render the system unbootable, because maintainer
scripts are all run with full root privileges, at a time when the
system is in a partially ill-defined state (since it is in the process
of being upgraded -- this is why we have the "postinst scripts can't
assume any non-Essential functionality works" rule), and yet it could
still be in active use (there has never been a requirement to take the
system to single-user mode before running 'apt-get upgrade').

But most packages don't *do* anything in their maintainer scripts that
has any serious *risk* of rendering the system unbootable, and
therefore we don't have to worry about them.

The subset of packages that do dangerous things in their maintainer
scripts *overlaps* the set of Essential packages, but there are members
of each set that are not members of the other.

There is also a set of packages where it's the *installed software*
that might have bugs that render the system unbootable, such as
implementations of fsck for particular filesystems.

Do you understand the distinctions I am making? If you don't, please
explain what doesn't make sense about what I just said, because I
don't think we're going to get any further with this discussion until
you do.

> > One of the several documented justifications for that severity is
> > "potentially renders the system unbootable". I see nothing anywhere
> > that limits the scope of that justification to essential packages,
> > or to any other subset of the archive.
>
> I tried searching for that justification and a major internet search
> provider just says 'Your search - "potentially renders the system
> unbootable" - did not match any documents.'

https://www.debian.org/Bugs/Developer#severities

The official wording appears to be "makes unrelated software on the
system (or the entire system) break". I hope you will agree that a
system that doesn't boot is entirely broken.

https://salsa.debian.org/reportbug-team/reportbug/-/blob/master/reportbug/debbugs.py#L79

is where I got the "unbootable" phrasing.

zw
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 16:40 -0400, Zack Weinberg wrote:
> On 2022-09-28 3:45 PM, Ansgar wrote:
> > On Wed, 2022-09-28 at 15:39 -0400, Zack Weinberg wrote:
> > > On 2022-09-28 3:29 PM, Ansgar wrote:
> > > > On Wed, 2022-09-28 at 15:22 -0400, Zack Weinberg wrote:
> > > > > On 2022-09-28 3:06 PM, Ansgar wrote:
> > > > > > Your requirement is that a system must *never* become
> > > > > > unbootable in *all* of these states.
> > > > >
> > > > > Yes, and furthermore I think Debian has required this for
> > > > > many, many years.
> > > >
> > > > No, it never did.
> > >
> > > I told you why I think it does. Unless you can provide _evidence_
> > > that it doesn't, you're not going to change my mind.
> >
> > Policy makes a special guarantee about essential packages:
> >
> > +---
> > | Essential is defined as the minimal set of functionality that must
> > | be available and usable on the system at all times, even when
> > | packages are in the “Unpacked” state.
> > +---
>
> "Available and usable at all times" is orthogonal to "maintainer
> scripts do not render the system unbootable". As I read things, *all*
> packages bear the responsibility of not rendering the system
> unbootable.

No, it's a significantly weaker requirement than what you want to
impose. If it is not available and usable at all time, it can clearly
render the system unbootable (by not being available or usable at
boot).

> Naturally, most packages don't need to take particular care to avoid
> rendering the system unbootable, since they don't do anything in their
> maintainer scripts that would risk that. But some do -- like bash,
> like libc6, and like usrmerge -- and so they do need to take extra
> care, and have always been expected to do so.

Maintainer scripts are only one part; not fully installed packages can
make the system unbootable for other reasons as mentioned earlier. As
you now only talk about maintainer scripts, are these no longer
relevant?

> > Please provide evidence that the even harder guarantees you demand
> > are made somewhere for a much larger set of packages that are
> > critical for boot. And are actually fulfilled in practice.
>
> I already told you the answer to that question: it's inherent in the
> definition of a severity:critical bug. One of the several documented
> justifications for that severity is "potentially renders the system
> unbootable". I see nothing anywhere that limits the scope of that
> justification to essential packages, or to any other subset of the
> archive.

I tried searching for that justification and a major internet search
provider just says 'Your search - "potentially renders the system
unbootable" - did not match any documents.'

Anyway, please send follow-ups not just to me, but the bug tracker and
ideally the tech-ctte bug.

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 15:39 -0400, Zack Weinberg wrote:
> On 2022-09-28 3:29 PM, Ansgar wrote:
> > On Wed, 2022-09-28 at 15:22 -0400, Zack Weinberg wrote:
> > > On 2022-09-28 3:06 PM, Ansgar wrote:
> > > > Your requirement is that a system must *never* become
> > > > unbootable in *all* of these states.
> > >
> > > Yes, and furthermore I think Debian has required this for many,
> > > many years.
> >
> > No, it never did.
>
> I told you why I think it does. Unless you can provide _evidence_
> that it doesn't, you're not going to change my mind.

Policy makes a special guarantee about essential packages:

+---
| Essential is defined as the minimal set of functionality that must
| be available and usable on the system at all times, even when
| packages are in the “Unpacked” state.
+---

Please provide evidence that the even harder guarantees you demand are
made somewhere for a much larger set of packages that are critical for
boot. And are actually fulfilled in practice.

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 15:22 -0400, Zack Weinberg wrote:
> On 2022-09-28 3:06 PM, Ansgar wrote:
> > On Wed, 2022-09-28 at 14:54 -0400, Zack Weinberg wrote:
> > > On 2022-09-28 2:40 PM, Ansgar wrote:
> > > > > If I thought there was a bug in some other package that posed
> > > > > a significant risk of rendering Debian systems unbootable on
> > > > > upgrade, I would have filed a report against THAT PACKAGE.
> > > >
> > > > Okay, so I understand this is an arbitrary requirement for
> > > > *just* usrmerge. Any other package may still break the system
> > > > (as there are enough critical packages).
> > >
> > > I don't understand how you got from what I said to "this is an
> > > arbitrary requirement just for usrmerge".
> > >
> > > It is, in fact, a *non*-arbitrary requirement, spelled out in
> > > Policy as such, that applies to *all* packages. "Potentially
> > > breaks the entire system (e.g. by rendering it unbootable)" =
> > > critical-severity bug.
> >
> > During upgrades, package dependencies might not be satisfied, there
> > is no guarantee that non-essential (as in the Policy meaning of
> > essential) packages work at all, partly-unpacked essential packages
> > are likely also interesting.
> >
> > The system can crash while any of this is the case, not even
> > involving more complex parts like maintainer scripts.
> >
> > This obviously also includes boot loaders and similar.
> >
> > Your requirement is that a system must *never* become unbootable in
> > *all* of these states.
>
> Yes, and furthermore I think Debian has required this for many, many
> years.

No, it never did.

> > So again: please show that other packages don't have such issues in
> > general.
>
> I do not think it is reasonable for you to ask that I investigate the
> possibility of bugs existing in other packages before I file a bug on
> your package.

If you want to impose requirements on this package that are not
imposed elsewhere...

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On 2022-09-28 3:06 PM, Ansgar wrote:
> On Wed, 2022-09-28 at 14:54 -0400, Zack Weinberg wrote:
> > On 2022-09-28 2:40 PM, Ansgar wrote:
> > > > If I thought there was a bug in some other package that posed a
> > > > significant risk of rendering Debian systems unbootable on
> > > > upgrade, I would have filed a report against THAT PACKAGE.
> > >
> > > Okay, so I understand this is an arbitrary requirement for *just*
> > > usrmerge. Any other package may still break the system (as there
> > > are enough critical packages).
> >
> > I don't understand how you got from what I said to "this is an
> > arbitrary requirement just for usrmerge".
> >
> > It is, in fact, a *non*-arbitrary requirement, spelled out in Policy
> > as such, that applies to *all* packages. "Potentially breaks the
> > entire system (e.g. by rendering it unbootable)" =
> > critical-severity bug.
>
> During upgrades, package dependencies might not be satisfied, there is
> no guarantee that non-essential (as in the Policy meaning of
> essential) packages work at all, partly-unpacked essential packages
> are likely also interesting.
>
> The system can crash while any of this is the case, not even involving
> more complex parts like maintainer scripts.
>
> This obviously also includes boot loaders and similar.
>
> Your requirement is that a system must *never* become unbootable in
> *all* of these states.

Yes, and furthermore I think Debian has required this for many, many
years.

> So again: please show that other packages don't have such issues in
> general.

I do not think it is reasonable for you to ask that I investigate the
possibility of bugs existing in other packages before I file a bug on
your package.

zw
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 14:54 -0400, Zack Weinberg wrote:
> On 2022-09-28 2:40 PM, Ansgar wrote:
> > > If I thought there was a bug in some other package that posed a
> > > significant risk of rendering Debian systems unbootable on
> > > upgrade, I would have filed a report against THAT PACKAGE.
> >
> > Okay, so I understand this is an arbitrary requirement for *just*
> > usrmerge. Any other package may still break the system (as there
> > are enough critical packages).
>
> I don't understand how you got from what I said to "this is an
> arbitrary requirement just for usrmerge".
>
> It is, in fact, a *non*-arbitrary requirement, spelled out in Policy
> as such, that applies to *all* packages. "Potentially breaks the
> entire system (e.g. by rendering it unbootable)" = critical-severity
> bug.

During upgrades, package dependencies might not be satisfied, there is
no guarantee that non-essential (as in the Policy meaning of essential)
packages work at all, partly-unpacked essential packages are likely
also interesting.

The system can crash while any of this is the case, not even involving
more complex parts like maintainer scripts.

This obviously also includes boot loaders and similar.

Your requirement is that a system must *never* become unbootable in
*all* of these states.

Unless of course, it is just usrmerge that is required to provide
guarantees that no other package is. (Or change the entire system to
mandatory A/B updates or similar things.)

So again: please show that other packages don't have such issues in
general. I very much don't think so and do not think it is
particularly useful to demand this from one specific package.

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On 2022-09-28 2:16 PM, Marco d'Itri wrote:
> > it appears to be possible for the next boot to find the root
> > filesystem in a state where /lib or /bin doesn’t exist at all.
> > Recovery from this state will require booting from installation
> > media.
>
> This is technically correct. But after 8 years of development in
> Debian, and after Ubuntu converting all their user base on upgrades,
> no such event has been reported.

I don't think you can draw conclusions from Ubuntu in this context
since their upgrade process is radically different. If I remember
correctly, they invoke convert-usrmerge at a point when the system is
effectively in single-user mode, and thus other processes are much
less likely to interfere.

I also don't think you can draw conclusions from 8 years of past
development within Debian because the vast majority of Debian
installations that were originally installed unmerged (pre-bullseye or
opt out) *have not yet been converted*. Most people who maintain
Debian installations, after all, aren't paying any attention to this
process. They'll get the conversion *only* when they upgrade to
bookworm.

As such I think we haven't yet seen *most* of the truly weird
conditions under which convert-usrmerge will be invoked, and I think
you should reconsider.

zw
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 14:32 -0400, Zack Weinberg wrote:
> On 2022-09-28 2:04 PM, Ansgar wrote:
> > On Wed, 2022-09-28 at 13:53 -0400, Zack Weinberg wrote:
> > > On Wed, Sep 28, 2022, at 1:47 PM, Ansgar wrote:
> > > > No, you would need to atomically replace the *entire* system,
> > > > not just individual directories.
> > >
> > > ??? Atomic replacement of each affected directory is, as far as I
> > > can see, both necessary and sufficient to prevent the system
> > > being rendered unbootable.
> >
> > No. It is not sufficient. Upgrading packages can affect multiple
> > directories and half-upgraded packages can easily render systems
> > unbootable.
>
> Do I really have to spell this out for you?
>
> Atomic replacement of each directory replaced with a symlink by
> convert-usrmerge should be sufficient [unless I missed something
> while reading through convert-usrmerge's code] to prevent the system
> being unbootable AS A CONSEQUENCE OF ACTIONS PERFORMED BY
> convert-usrmerge.
>
> If I thought there was a bug in some other package that posed a
> significant risk of rendering Debian systems unbootable on upgrade, I
> would have filed a report against THAT PACKAGE.

Okay, so I understand this is an arbitrary requirement for *just*
usrmerge. Any other package may still break the system (as there are
enough critical packages).

Of course, if not correct, you can demonstrate there are no other
packages that will make the system unbootable if half-upgraded
(including combinations involving multiple packages).

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On 2022-09-28 2:04 PM, Ansgar wrote:
> On Wed, 2022-09-28 at 13:53 -0400, Zack Weinberg wrote:
> > On Wed, Sep 28, 2022, at 1:47 PM, Ansgar wrote:
> > > No, you would need to atomically replace the *entire* system, not
> > > just individual directories.
> >
> > ??? Atomic replacement of each affected directory is, as far as I
> > can see, both necessary and sufficient to prevent the system being
> > rendered unbootable.
>
> No. It is not sufficient. Upgrading packages can affect multiple
> directories and half-upgraded packages can easily render systems
> unbootable.

Do I really have to spell this out for you?

Atomic replacement of each directory replaced with a symlink by
convert-usrmerge should be sufficient [unless I missed something while
reading through convert-usrmerge's code] to prevent the system being
unbootable AS A CONSEQUENCE OF ACTIONS PERFORMED BY convert-usrmerge.

If I thought there was a bug in some other package that posed a
significant risk of rendering Debian systems unbootable on upgrade, I
would have filed a report against THAT PACKAGE.

zw
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
Control: severity -1 wishlist

On Sep 28, Zack Weinberg wrote:
> convert-usrmerge is nominally idempotent and restartable, but (as it
> says in the script’s own documentation) if “the system crashes at a
> really bad time” during the conversion process it might not be
> possible to recover without manual intervention. Unfortunately, it’s
> worse than that: if the system crashes at _just_ the wrong time
> (specifically, in the middle of a convert_directory operation, in
> between the rename() and symlink() calls) it appears to be possible
> for the next boot to find the root filesystem in a state where /lib or
> /bin doesn’t exist at all. Recovery from this state will require
> booting from installation media.

This is technically correct. But after 8 years of development in
Debian, and after Ubuntu converting all their user base on upgrades,
no such event has been reported.

Hence I believe that adding significant complexity to the package is
not justified at this point because the risk of introducing more bugs
would be higher than the (actually measured) risk of what you
described actually happening.

> To fix this, I think some technique for replacing directories with
> symlinks _atomically_ needs to be found.

Such a technique is described in the TODO file in the source package.

--
ciao,
Marco
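For illustration only: one Linux-specific way to swap a directory for a symlink in a single step is renameat2(2) with RENAME_EXCHANGE, which the man page documents as working on two existing paths of different types. The Python sketch below calls it through ctypes. It assumes a glibc that exports `renameat2` (2.28 or later) and a filesystem that supports the flag. The helper names and the `.usrmerge-tmp` suffix are hypothetical, and this is not necessarily the technique the TODO file describes.

```python
import ctypes
import os

AT_FDCWD = -100            # "relative to the current directory" (linux/fcntl.h)
RENAME_EXCHANGE = 1 << 1   # flag value from linux/fs.h

# Assumption: the C library exports renameat2 (glibc >= 2.28).
_libc = ctypes.CDLL(None, use_errno=True)


def exchange(path_a, path_b):
    """Atomically swap two existing paths; raise OSError on failure."""
    ret = _libc.renameat2(AT_FDCWD, os.fsencode(path_a),
                          AT_FDCWD, os.fsencode(path_b),
                          RENAME_EXCHANGE)
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path_a)


def swap_dir_for_symlink(directory, link_target):
    """Replace `directory` with a symlink to `link_target` in one step.

    Returns the temporary path that now holds the old directory, so the
    caller can move its contents away and remove it afterwards.
    """
    tmp = directory + ".usrmerge-tmp"  # hypothetical temporary name
    os.symlink(link_target, tmp)       # build the symlink off to the side
    try:
        exchange(tmp, directory)       # the path itself is never absent
    except BaseException:
        os.unlink(tmp)                 # swap failed: drop the stray symlink
        raise
    return tmp
```

With this ordering, an interruption before the exchange leaves only a stray symlink next to the intact directory, and an interruption after it leaves the old directory under the temporary name. Whether the exchange is also durable as a single unit across a power loss depends on the filesystem.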
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 13:53 -0400, Zack Weinberg wrote:
> On Wed, Sep 28, 2022, at 1:47 PM, Ansgar wrote:
> > No, you would need to atomically replace the *entire* system, not
> > just individual directories.
>
> ??? Atomic replacement of each affected directory is, as far as I can
> see, both necessary and sufficient to prevent the system being
> rendered unbootable.

No. It is not sufficient. Upgrading packages can affect multiple
directories and half-upgraded packages can easily render systems
unbootable.

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
Control: severity -1 wishlist

On Wed, 2022-09-28 at 13:53 -0400, Zack Weinberg wrote:
> On Wed, Sep 28, 2022, at 1:47 PM, Ansgar wrote:
> > No, you would need to atomically replace the *entire* system, not
> > just individual directories.
>
> ??? Atomic replacement of each affected directory is, as far as I can
> see, both necessary and sufficient to prevent the system being
> rendered unbootable.
>
> > But please explain how this is specific to usrmerge and not many
> > other packages.
>
> As I already said, this code needs to be extra robust because it is
> being run from a postinst script, at some unpredictable moment in the
> middle of an upgrade to bookworm (in most cases).

You _said_ it, but you really didn't explain it. Let's ask again: why
should this be any different than any other package that can bork a
system if it crashes just at the right time, of which there are many?

Given there's no rationale nor explanation, let's downgrade for now.

--
Kind regards,
Luca Boccassi
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, Sep 28, 2022, at 1:47 PM, Ansgar wrote:
> No, you would need to atomically replace the *entire* system, not just
> individual directories.

??? Atomic replacement of each affected directory is, as far as I can
see, both necessary and sufficient to prevent the system being
rendered unbootable.

> But please explain how this is specific to usrmerge and not many other
> packages.

As I already said, this code needs to be extra robust because it is
being run from a postinst script, at some unpredictable moment in the
middle of an upgrade to bookworm (in most cases).

zw
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
On Wed, 2022-09-28 at 13:39 -0400, Zack Weinberg wrote:
> convert-usrmerge is nominally idempotent and restartable, but (as it
> says in the script’s own documentation) if “the system crashes at a
> really bad time” during the conversion process it might not be
> possible to recover without manual intervention. Unfortunately, it’s
> worse than that: if the system crashes at _just_ the wrong time
> (specifically, in the middle of a convert_directory operation, in
> between the rename() and symlink() calls) it appears to be possible
> for the next boot to find the root filesystem in a state where /lib
> or /bin doesn’t exist at all. Recovery from this state will require
> booting from installation media.

There are many packages that will render a system unbootable if the
system crashes at just the wrong time... You need a very, very large
change to how Debian works to change that.

> To fix this, I think some technique for replacing directories with
> symlinks _atomically_ needs to be found.

No, you would need to atomically replace the *entire* system, not just
individual directories.

But please explain how this is specific to usrmerge and not many other
packages.

Ansgar
Bug#1020920: usrmerge: if "the system crashes at a really bad time" during conversion it may be left unbootable
Package: usrmerge
Version: 31
Severity: critical
Justification: breaks the whole system
X-Debbugs-Cc: z...@owlfolio.org

convert-usrmerge is nominally idempotent and restartable, but (as it
says in the script’s own documentation) if “the system crashes at a
really bad time” during the conversion process it might not be
possible to recover without manual intervention. Unfortunately, it’s
worse than that: if the system crashes at _just_ the wrong time
(specifically, in the middle of a convert_directory operation, in
between the rename() and symlink() calls) it appears to be possible
for the next boot to find the root filesystem in a state where /lib or
/bin doesn’t exist at all. Recovery from this state will require
booting from installation media.

Since the current plan for the usrmerge transition is to run
convert-usrmerge from usrmerge’s postinst, during (for most
installations where a conversion is required) a bullseye->bookworm
upgrade, which system administrators may choose to do *without*
dropping to single user mode, a crash at exactly that point is
plausible due to interactions with other concurrent processes. Imagine
that a watchdog process picks exactly the moment where /bin is being
replaced to check whether it can exec /bin/true, or (perhaps more
plausible) a server picks exactly the moment where /lib is being
replaced to try to load an NSS module, fails, crashes, and then a
watchdog notices the server crash and triggers a reboot.

To fix this, I think some technique for replacing directories with
symlinks _atomically_ needs to be found.

-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable')
merged-usr: no
Architecture: amd64 (x86_64)

Kernel: Linux 5.19.0-1-amd64 (SMP w/32 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages usrmerge depends on:
ii  libfile-find-rule-perl  0.34-2
ii  perl                    5.34.0-5

usrmerge recommends no packages.

usrmerge suggests no packages.
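The window described in the report can be modelled in a scratch directory. What follows is a simplified Python model, not convert-usrmerge's actual Perl code: `naive_convert`, `SimulatedCrash` and `path_exists_after_crash` are hypothetical names, and the "crash" is only an exception raised between the two calls.

```python
import os
import tempfile


class SimulatedCrash(Exception):
    """Stands in for a system crash between the two filesystem calls."""


def naive_convert(root, name, crash_between=False):
    """Move root/<name> to root/usr/<name>, then symlink the old path.

    Simplified model: assumes root/usr exists and root/usr/<name> does not.
    """
    old = os.path.join(root, name)
    new = os.path.join(root, "usr", name)
    os.rename(old, new)                          # step 1: directory moves away
    if crash_between:
        raise SimulatedCrash(name)               # nothing exists at `old` now
    os.symlink(os.path.join("usr", name), old)   # step 2: symlink appears


def path_exists_after_crash():
    """Report whether <root>/bin still exists after an interrupted run."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "usr"))
        os.mkdir(os.path.join(root, "bin"))
        try:
            naive_convert(root, "bin", crash_between=True)
        except SimulatedCrash:
            pass
        return os.path.lexists(os.path.join(root, "bin"))
```

In this model the interrupted run leaves no entry at all under the old name, which corresponds to the "/lib or /bin doesn’t exist at all" state in the report. Any process that resolves the path during that interval sees the same thing, crash or not.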