On Sat, May 27, 2023 at 11:09:32PM +0200, Helmut Grohne wrote:
> Hi,
>
> I sat down with Jochen in Hamburg to try and fix this.
>
> On Sun, May 14, 2023 at 03:21:24PM -0400, Theodore Ts'o wrote:
> > Can someone send the instructions on how to fix this?
>
> We wish we could give you. Instead, we document our findings, so maybe the
> next one looking into this bug has a better idea, but for now we give up
> as it is too late for bookworm anyway.

Helmut, Jochen, thanks so much for trying to look into this. Here's
some additional context from my research.

First of all, the change from WantedBy=default.target to
WantedBy=multi-user.target, as described in Message #19 of this bug,
was in response to a bug report from Ansgar, bug report #991349:

>I noticed that e2scrub_reap.service uses
>
>  WantedBy=default.target
>
>instead of the more usual
>
>  WantedBy=multi-user.target.
>
>As default.target is usually just an alias for multi-user.target or
>graphical.target, this means it will even be pulled in if someone uses
>some other custom target. This feels rather unexpected.
>
>Is there any reason not to use WantedBy=multi-user.target?

At the time, I thought to myself, sure, makes sense, and made the
change in commit b42c9788c75d ("e2scrub: use
WantedBy=multi-user.target in e2scrub_reap.service"), and in the
commit I noted "Addresses-Debian-Bug: #991349".

As near as I can tell, on a system that started with the Bullseye
version of e2fsprogs, and which has then been updated to the Bookworm
version of e2fsprogs via periodic updates to testing (Bookworm), the
default.target link still exists:

% ls -l /etc/systemd/system/default.target.wants/e2scrub_reap.service
0 lrwxrwxrwx 1 root root 40 Dec 19  2020 /etc/systemd/system/default.target.wants/e2scrub_reap.service -> /lib/systemd/system/e2scrub_reap.service

... and this is enough for systemctl status to seem to think that
e2scrub_reap is still enabled:

% systemctl status e2scrub_reap
○ e2scrub_reap.service - Remove Stale Online ext4 Metadata Check Snapshots
     Loaded: loaded (/lib/systemd/system/e2scrub_reap.service; enabled; preset: enable>
     Active: inactive (dead) since Sat 2023-05-27 17:53:22 EDT; 1h 34min ago
       Docs: man:e2scrub_all(8)
    Process: 1309 ExecStart=/sbin/e2scrub_all -A -r (code=exited, status=0/SUCCESS)
   Main PID: 1309 (code=exited, status=0/SUCCESS)
        CPU: 12ms
...

So sure,
/etc/systemd/system/multi-user.target.wants/e2scrub_reap.service
doesn't exist.
*But* it still exists in .../default.target.wants/... which seems to
be enough to keep the e2scrub_reap service enabled. Right? What am I
missing?

In any case, I am still unclear (a) what is actually broken in this
particular setup, since according to systemctl status the systemd unit
is apparently still appropriately enabled, even if it isn't via the
expected WantedBy=multi-user.target, and (b) what e2fsprogs's control
scripts are supposed to have done differently. That is, if this is
indeed a bug in e2fsprogs --- what did I do wrong, and how do I fix
it?

And if the answer is that you should never, ever, try to change a
WantedBy line in a systemd unit file, because Debian's systemd unit
file infrastructure is too fragile to handle this correctly, then
given that bookworm is about to ship with
"WantedBy=multi-user.target", what's the best path forward at this
point?

I'll note that e2scrub_reap.service is just a helper unit file which
is only needed to clean up after a system crash while e2scrub is
running --- and that will only happen if the user has edited and
appropriately configured e2scrub in /etc/e2scrub.conf. So from my
maintainer's perspective, what I am going for is that
e2scrub_reap.service and e2scrub_all.timer should *always* be enabled,
since the real control point (as far as I am concerned) is
/etc/e2scrub.conf. I really don't actually *care* whether it is
enabled via default.target.wants or multi-user.target.wants.

If I need to be sent to some systemd re-education camp to understand
the finer points of default.target vs multi-user.target, and whether
it actually makes any difference that the systemd unit file says
"WantedBy=multi-user.target" while in the upgraded bullseye->bookworm
installation the symlink is still in */default.target.wants/* ---
please point me at the documentation.

Otherwise, I'm beginning to think that nothing is actually broken in
practice, and the bullseye2bookworm piuparts test is just being overly
picky. And perhaps I should just close this bug as "Working as
Intended".

Again, what am I missing?

					- Ted

P.S. I really *am* trying to get with the systemd program, but all of
this complexity is just hopelessly confusing. :-( :-( :-(

P.P.S. And if there is actually a case where this will break a real
user, can someone give me clear reproduction instructions, which start
with "install bookworm in a VM", then do X, then do Y, and then
observe breakage? Please don't just point me at piuparts output,
because unfortunately, it's too complicated for my tiny brain to
understand.
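
For reference, the symlink check described above (which *.wants
directory actually holds the e2scrub_reap.service link) can be
sketched as a small shell helper. This is only a minimal sketch: the
function name find_wants_links is made up for illustration, and it
assumes the enablement links live as symlinks under a single root such
as /etc/systemd/system. It does not consult systemd itself, so it says
nothing about what systemctl considers enabled.

```shell
#!/bin/sh
# find_wants_links ROOT UNIT
# (hypothetical helper) Print every symlink named UNIT that sits in a
# "*.wants" directory below ROOT, one path per line, sorted.
find_wants_links() {
    root="$1"
    unit="$2"
    # -path matches against the whole path, so "*.wants/$unit" picks up
    # default.target.wants/, multi-user.target.wants/, and so on.
    # -type l restricts the result to symlinks, dangling or not.
    find "$root" -path "*.wants/$unit" -type l 2>/dev/null | sort
}
```

Running "find_wants_links /etc/systemd/system e2scrub_reap.service" on
an upgraded system would show whether the link is still under
default.target.wants, under multi-user.target.wants, or both.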