On Mon, 22 Aug 2022 at 21:07:57 +0200, Paul Gevers wrote:
> paul@mulciber ~ $ sudo lxc-start test && sudo lxc-attach test -- sh -ec "if
> [ -d /run/systemd/system ]; then echo systemd ; exit 0 ; else echo unknown ;
> exit 0 ; fi" ; echo $? && sudo lxc-stop test
> [sudo] password for paul:
> unknown
> 0
> paul@mulciber ~ $ sudo lxc-start test && sudo lxc-attach test -- sh -ec "if
> [ -d /run/systemd/system ]; then echo systemd ; exit 0 ; else echo unknown ;
> exit 0 ; fi" ; echo $? && sudo lxc-stop test
> systemd
> 0

A theory for what might be going on here: as systemd starts up, it
creates /run/systemd/system *almost* immediately; but if lxc-attach
happens fast enough, then it can finish running the shell command before
systemd has had a chance to create /run/systemd/system. So wait_booted()
would think we're running sysvinit or some other non-systemd init, and
run lib/await-sysv-boot instead of systemctl.

However, that seems unlikely to be the root cause for the original bug
you reported. Even if we mis-detect systemd as sysvinit,
lib/await-sysv-boot is basically doing the same as the old
implementation, polling until runlevel(8) indicates a suitable runlevel,
but with the polling loop happening in the container instead of on the
host. And I'd expect that to mostly work, either on systemd or sysvinit?
The old implementation always just did the equivalent of this anyway,
and systemd does have a working runlevel(8) (its use is discouraged, but
it's there).

The one part of lib/await-sysv-boot that is not systemd-friendly is that
it unconditionally waits for /etc/init.d/rc to finish, and does not wait
for network-online.target if systemd was detected.

On podman with systemd, we'd ideally be waiting for boot by using
`NOTIFY_SOCKET=... podman run --sdnotify=container ...` - but that
requires knowing in advance that the container is going to boot with
systemd (or in principle some other init system that implements the
sd_notify() protocol), and in general we don't know that in advance.
Also, lxc doesn't implement that protocol, as far as I'm aware.

From the original bug report:
> autopkgtest-virt-lxc [21:45:43]: ERROR: Waiting for boot to finish failed
> autopkgtest-virt-lxc [21:45:43]: ERROR: Failed to connect to bus: No such
> file or directory

I think that error message is systemctl failing to connect to the D-Bus
system bus when we run the systemctl command to wait for boot to finish.
This is a different race: we have successfully identified the container
as running systemd as its init system, but there's no system bus socket
yet, so systemctl fails.

In the follow-up, we're instead hitting a timeout:
> File "/usr/share/autopkgtest/lib/VirtSubproc.py", line 262, in wait_booted
> (rc, err, _) = execute_timeout(

This is a 60-second timeout waiting for lib/await-sysv-boot to finish,
and it looks as though lib/await-sysv-boot has not produced any output
on its stdout/stderr (I should probably make it more verbose, since its
output is only shown if it fails).

    smcv
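
To illustrate the first race (probing for /run/systemd/system before
systemd has created it), here is a minimal sketch of a probe that keeps
looking for a short grace period before concluding "not systemd". This is
not what wait_booted() currently does: detect_init(), its parameters and
the lxc_has_systemd_dir() helper are hypothetical names, and the
container name "test" is just the one from the transcript above.

```python
import subprocess
import time
from typing import Callable


def detect_init(
    has_systemd_dir: Callable[[], bool],
    grace: float = 5.0,
    interval: float = 0.1,
) -> str:
    """Return "systemd" if the marker directory shows up within `grace`
    seconds, otherwise "unknown".

    has_systemd_dir is any callable that reports whether
    /run/systemd/system currently exists inside the container.
    """
    deadline = time.monotonic() + grace
    while True:
        if has_systemd_dir():
            return "systemd"
        if time.monotonic() >= deadline:
            # Grace period is over: assume a non-systemd init.
            return "unknown"
        time.sleep(interval)


def lxc_has_systemd_dir(container: str = "test") -> bool:
    """Hypothetical probe: run test(1) in the container via lxc-attach,
    using the same invocation style as the transcript above."""
    cmd = ["lxc-attach", container, "--", "test", "-d", "/run/systemd/system"]
    return subprocess.call(cmd) == 0
```

It would be called as detect_init(lxc_has_systemd_dir). The obvious
cost is that a container that really is running sysvinit only gets
identified after the whole grace period, so the value would need to stay
small. It also does nothing for the second race, where systemd is
detected correctly but the system bus socket does not exist yet.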