Bug#948193: e2scrub services delay boot by 2 seconds
On Mon, 6 Jan 2020 16:14:24 -0500 "Theodore Y. Ts'o" wrote: > On Mon, Jan 06, 2020 at 12:39:52AM -0800, Josh Triplett wrote: > > That's an *additional* delay, on top of the sleeps above. The two-second > > sleep in the "exitcode" function seems like the primary culprit. Note > > that I don't even have lvm2-tools installed. > > Ah, yes, sorry, I had missed the sleep in the exitcode function. > Actually it's not needed in e2scrub_all at all; it was there due > copy/paste oversight. > > The commit below should address your concern. That addresses the two-second sleep in a default install, thank you. Please also move the entirely commented-out e2scrub.conf to an example in /usr/share/doc (deleting /etc/e2scrub.conf in postinst if it matches the current example), and making the timer/service use ConditionPathExists=/etc/e2scrub.conf, which will avoid running the service at all. - Josh Triplett
Bug#948193: e2scrub services delay boot by 2 seconds
On Mon, Jan 06, 2020 at 12:39:52AM -0800, Josh Triplett wrote: > That's an *additional* delay, on top of the sleeps above. The two-second > sleep in the "exitcode" function seems like the primary culprit. Note > that I don't even have lvm2-tools installed. Ah, yes, sorry, I had missed the sleep in the exitcode function. Actually it's not needed in e2scrub_all at all; it was there due copy/paste oversight. The commit below should address your concern. Cheers, - Ted commit 0b3208958eb63df6cd8b38ee63f3bc4266a683e7 Author: Theodore Ts'o Date: Mon Jan 6 16:01:23 2020 -0500 e2scrub, e2scrub_all: don't sleep unnecessarily in exitcode The two second sleep is only needed in e2scrub, and when there is a failure, so that systemd has a chance to gather the log output before e2scrub exits. It's not needed if the script is exiting successfully, and it's never needed for e2scrub_all ever. Addresses-Debian-Bug: #948193 Signed-off-by: Theodore Ts'o diff --git a/scrub/e2scrub.in b/scrub/e2scrub.in index f21499b6..30ab7cbd 100644 --- a/scrub/e2scrub.in +++ b/scrub/e2scrub.in @@ -66,7 +66,7 @@ exitcode() { # for capturing all the log messages if the scrub fails, because the # fail service uses the service name to gather log messages for the # error report. - if [ -n "${SERVICE_MODE}" ]; then + if [ -n "${SERVICE_MODE}" -a "${ret}" -ne 0 ]; then test "${ret}" -ne 0 && ret=1 sleep 2 fi diff --git a/scrub/e2scrub_all.in b/scrub/e2scrub_all.in index f0336711..4288b969 100644 --- a/scrub/e2scrub_all.in +++ b/scrub/e2scrub_all.in @@ -56,14 +56,8 @@ exitcode() { # section 22.2) and hope the admin will scan the log for what # actually happened. - # We have to sleep 2 seconds here because journald uses the pid to - # connect our log messages to the systemd service. This is critical - # for capturing all the log messages if the scrub fails, because the - # fail service uses the service name to gather log messages for the - # error report. - if [ -n "${SERVICE_MODE}" ]; then + if [ -n "${SERVICE_MODE}" -a "${ret}" -ne 0 ]; then test "${ret}" -ne 0 && ret=1 - sleep 2 fi exit "${ret}"
Bug#948193: e2scrub services delay boot by 2 seconds
On Sun, 5 Jan 2020 15:41:57 -0500 "Theodore Y. Ts'o" wrote: > On Sat, Jan 04, 2020 at 07:57:16PM -0800, Josh Triplett wrote: > > Package: e2fsprogs > > Version: 1.45.4-1 > > Severity: important > > > > The e2fsprogs package installs a service and timer to run e2scrub. That > > service sleeps for 2 seconds before exiting, delaying the boot by 2 > > seconds. > > It's not necessarily 2 seconds, and it's not directly sleeping. $ grep sleep /sbin/e2scrub* /sbin/e2scrub: # We have to sleep 2 seconds here because journald uses the pid to /sbin/e2scrub: sleep 2 /sbin/e2scrub: sleep 0.5 /sbin/e2scrub: sleep 0.5 /sbin/e2scrub_all: # We have to sleep 2 seconds here because journald uses the pid to /sbin/e2scrub_all: sleep 2 $ systemd-analyze blame | grep e2scrub 2.012s e2scrub_all.service > It's > however long it takes to spin up any storage devices, caused by > running lvs. The bulk of the time of running "e2scrub_all -A -r" is > the time to run 'lsblk' and 'lvs'. That's an *additional* delay, on top of the sleeps above. The two-second sleep in the "exitcode" function seems like the primary culprit. Note that I don't even have lvm2-tools installed. > > Second, please use ConditionPathExists or similar to check for the tools > > e2scrub needs (lsblk and lvcreate), rather than running a script that > > checks for them and then exits. > > That's not the cause of most of the time needed to run e2scrub_all. > We also need to run these sanity checks when e2scrub_all is run by > hand, or run out of cron. It is when I'm trying to boot a system in ~100ms. I want to avoid launching the service and its shell script *at all*. > > And third, please consider *not* enabling this by default. > > It wasn't enabled by default. I just installed a fresh system, and it definitely *is* enabled by default, both via a systemd timer and via cron. I noticed that while debugging the boot process on that fresh system, which also led me to find it enabled on my laptop, where it similarly delays the boot process by two seconds. It's possible that we're talking past each other here. There's an enabled systemd timer unit and an enabled cron job that run a shell script. That shell script then sources its configuration file, just to find out that it isn't actually configured, and then sleeps before exiting. That's still enabled. (And even if it didn't have the sleep in it, it's still enabled.) This is the equivalent of an /etc/default/foo file with an ENABLED=0 flag or similar, which has similar problems and shouldn't ever happen either. The right way to have a disabled-by-default service is to disable it, and tell people how to enable it. One way to do that would be to disable the service/timer and tell people how to enable it via systemctl or similar; another way would be to remove the sample /etc/e2scrub.conf and then use ConditionPathExists=/etc/e2scrub.conf . Personally, I would suggest the latter. (Along the same lines, e2scrub_fail should have a ConditionPathExists on sendmail, because it can't do anything useful if sendmail doesn't exist.) - Josh Triplett
Bug#948193: e2scrub services delay boot by 2 seconds
On Sat, Jan 04, 2020 at 07:57:16PM -0800, Josh Triplett wrote: > Package: e2fsprogs > Version: 1.45.4-1 > Severity: important > > The e2fsprogs package installs a service and timer to run e2scrub. That > service sleeps for 2 seconds before exiting, delaying the boot by 2 > seconds. It's not necessarily 2 seconds, and it's not directly sleeping. It's however long it takes to spin up any storage devices, caused by running lvs. The bulk of the time of running "e2scrub_all -A -r" is the time to run 'lsblk' and 'lvs'. I've already queued up a change (see below) so that we won't attempt to clean up any left-over LVM snapshot volumes if e2scrub is not enabled via /etc/e2scrub.conf, and even if scrubbing is enabled, we check for snapshots via scanning /dev/mapper instead of using lvs. This commit will be part of e2fsprogs 1.45.5, to be released in the next few days. > Second, please use ConditionPathExists or similar to check for the tools > e2scrub needs (lsblk and lvcreate), rather than running a script that > checks for them and then exits. That's not the cause of most of the time needed to run e2scrub_all. We also need to run these sanity checks when e2scrub_all is run by hand, or run out of cron. > And third, please consider *not* enabling this by default. It wasn't enabled by default. And the issue of lvs being slow is fixed by: commit 333268d65d26fbb2d22f7a8b6ac797babcc69543 Author: Darrick J. Wong Date: Mon Nov 4 17:54:14 2019 -0800 e2scrub_all: don't even reap if the config file doesn't allow it Dave Chinner complains that the automated on-boot e2scrub reaping takes a long time (because the lvs command can take a while to run) even though the automated e2scrub is disabled via e2scrub.conf on his systems. We still need the reaping service to kill off stale e2scrub snapshots after a crash, but it's unnecessary to annoy everyone with slow bootup. Because we can look for the e2scrub snapshots in /dev/mapper, let's skip reaping if periodic e2scrub is disabled unless we find evidence of e2scrub snapshots in /dev. Reported-by: Dave Chinner Signed-off-by: Darrick J. Wong Signed-off-by: Theodore Ts'o - Ted
Bug#948193: e2scrub services delay boot by 2 seconds
Package: e2fsprogs Version: 1.45.4-1 Severity: important The e2fsprogs package installs a service and timer to run e2scrub. That service sleeps for 2 seconds before exiting, delaying the boot by 2 seconds. First of all, sleeping for 2 seconds is not OK. Second, please use ConditionPathExists or similar to check for the tools e2scrub needs (lsblk and lvcreate), rather than running a script that checks for them and then exits. And third, please consider *not* enabling this by default. -- System Information: Debian Release: bullseye/sid APT prefers unstable APT policy: (500, 'unstable'), (1, 'experimental') Architecture: amd64 (x86_64) Kernel: Linux 5.4.0-1-amd64 (SMP w/4 CPU cores) Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages e2fsprogs depends on: ii libblkid12.34-0.1 ii libc62.29-7 ii libcom-err2 1.45.4-1 ii libext2fs2 1.45.4-1 ii libss2 1.45.4-1 ii libuuid1 2.34-0.1 ii logsave 1.45.4-1 Versions of packages e2fsprogs recommends: pn e2fsprogs-l10n Versions of packages e2fsprogs suggests: pn e2fsck-static pn fuse2fs pn gpart ii parted 3.3-1 -- no debconf information