Introduction
------------

I would like to recount a situation. I'm not sure where, if anywhere, the root bug(s) lie, but I am inclined to say that a big part of the problem was a change to the contents of jessie-backports.

I would be interested to hear what the backports team and ftpmaster have to say; in particular, if anyone knows the answers to my questions below.

My tentative conclusions are that:

1. Packages should not be removed from foo-backports just because a similar package is in foo-security, because there are situations where a host may have been relying on the package being in foo-backports and a similar (even, newer) package being in foo-security is not sufficient.

2. Cruft removal in stable releases, including in -backports, should perhaps be done with care/caution/announcement or something.

Background
----------

The upstream Xen project CI system does baremetal testing of Xen hypervisors etc. and therefore needs to reinstall hosts quite often. This is done by running a debian-installer netinst image with a preseed file. For Reasons we are still mostly on jessie.

We have some arm64 boxes. They don't work with the kernel from jessie. So we arrange to use the kernel from jessie-backports.

Using the jessie-backports kernel with the jessie installer involves using the preseed hook mechanism to add jessie-backports to the target's apt sources, and an in-target apt-get install rune to install the kernel package. (Using the jessie-backports kernel also involves editing the installer image to have the jessie-backports kernel and modules, but that is not relevant to this tale.)

The arm64 kernel in jessie-backports is this package
   linux-image-4.9.0-0.bpo.2-arm64 (4.9.18-1~bpo8+1)
It Depends on `linux-base (>= 4.3~)'. So it is necessary to have a newer linux-base.

According to my git commit logs, in January 2017 I added the equivalent of
   apt-get install -t jessie-backports linux-base
to the commands run via the preseed mechanism: at that time a newer linux-base was available in backports.

Breakage
--------

According to snapshot.d.o, until the 6th of February, linux-base 4.3~bpo8+1 was available in jessie-backports. So things worked fine.

Around 16:00 UTC on the 7th of February, linux-base was removed from jessie-backports, presumably because it was considered cruft.
After all, linux-base 4.5~deb8u1 is now in jessie-security.

However, after that change to the archive, the dependency resolver from jessie's apt, in our CI, is no longer willing to update to linux-base from jessie-security. (I have not yet investigated in detail but I suspect that the apt-get -t jessie-backports rune above is part of that causal chain.)

The result is that linux-image-4.9.*'s versioned dependency on linux-base could not be satisfied. In our CI this resulted in a mysterious failure where, despite us not having changed anything, the host would fail to boot when it wanted to reboot into the installed system, because it would try to use the original jessie 3.16 kernel (which does not run on our hardware).

Logs
----

For the very curious, and for my reference, complete logs of an example failure are preserved here:
   http://logs.test-lab.xenproject.org/~iwj/132973.test-arm64-arm64-xl/info.html
Mostly you want to look at the `Logfiles etc.'. You can also click on the entries in the `status' column to see the output from the CI system perl scripts.

The installer syslog is here:
   http://logs.test-lab.xenproject.org/~iwj/132973.test-arm64-arm64-xl/3.ts-syslog-server.log

When looking at the serial log:
   http://logs.test-lab.xenproject.org/~iwj/132973.test-arm64-arm64-xl/serial-laxton0.log
it is important to realise that that logfile contains a fair amount of previous output. Look at the timestamps: you want the part of the log starting at 2019-02-07 15:13:32 Z.

Analysis and questions
----------------------

I'm almost certain that the proximate cause of the breakage was the removal of linux-base from jessie-backports.

I think, but I am not sure, that the apt-get rune to request linux-base from backports was previously necessary. The reason I say that I am not sure is that the CI commit which added that rune had, according to its commit message, an additional effect of putting backports in the apt sources; perhaps the latter would have been sufficient.

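One way to look into this on an affected host might be to ask apt which linux-base it would choose with and without the -t option. What follows is only a sketch under assumptions: it presumes a jessie system with jessie-backports and jessie-security already in its apt sources, and the function name show_candidates and the RUN variable are made up for illustration. By default it only prints the commands; with RUN=1 it also runs them (apt-get -s merely simulates, so nothing is installed).

```shell
#!/bin/sh
# Hypothetical helper (not part of the CI system): print, and with RUN=1
# also execute, commands showing which version of a package apt would
# pick with and without a target release.
show_candidates() {
    pkg=${1:-linux-base}
    for cmd in \
        "apt-cache policy $pkg" \
        "apt-get -s install $pkg" \
        "apt-get -s -t jessie-backports install $pkg"
    do
        echo "+ $cmd"
        # Only touch apt when explicitly asked to.
        if [ "${RUN:-0}" = 1 ]; then
            $cmd
        fi
    done
}

show_candidates linux-base
```

Comparing the output of the last two commands should show whether the target-release setting changes what the resolver is prepared to do.
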
(After I have sent this mail I am going to mess about with the system to find a way to get it working properly again.)

Q: Was `apt-get install -t backports linux-base' unnecessary (and wrong) ?

It is unfortunate that something which worked for a period of over 2 years was broken by an archive change.

I don't know for sure that this removal was cruft removal but it seems like the most plausible explanation. I haven't so far found any explanation anywhere, but perhaps I looked in the wrong places.

Q: Why was linux-base removed from jessie-backports ?

Opinions and suggestions welcome.

Thanks,
Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is a private address which bypasses my fierce spamfilter.