Bug#408954: checkroot.sh: should not skip running fsck with JFS root
[2019-03-02 00:01] Pierre Ynard
> Do we want a blacklist, or a whitelist?
> Do we want to delegate conditionality to particular implementations? If
> so, which factors? Running on battery was suggested. Shipping a dummy
> fsck.$type is a way to delegate the possibility or impossibility to fsck
> to the implementation - but that doesn't seem like the best or most
> flexible technical solution to me.

If you ask me, requiring every FS to provide /usr/bin/fsck.$FS and standardizing the command-line options is a good thing. This would eliminate both the whitelist and the blacklist. More importantly, it would remove the assumption that maintainers of initscripts have an in-depth understanding of every file system in existence.

> What is the difference between checking the root filesystem, and
> checking other filesystems? Why would the logic to skip btrfs and nfs
> apply only to checkroot.sh, why would checkroot.sh ignore FSCKTYPES?

No idea, sorry.

> Can we have a switch in fsck similar to -A to let it parse the pass
> field of /etc/fstab for us, except when checking only one device
> passed in argument, or only the root fs? That way we wouldn't have
> to parse it ourselves too just for the root fs and pass it from
> /lib/init/mount-functions.sh back to /etc/init.d/checkroot.sh.

We can write such a universal front-end, can't we?

--
Note, that I send and fetch email in batch, once every 24 hours.
If matter is urgent, try https://t.me/kaction
--
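A rough sketch of what such a front-end might look like, as a dry run only. The function names fstab_entry and fsck_one are made up for illustration and exist neither in fsck nor in initscripts; the sketch also assumes every checker is installed as fsck.<type> and accepts -a, which is the convention proposed above, not something all filesystems guarantee today.

```shell
#!/bin/sh
# Hypothetical helper: print "<device> <fstype> <pass>" for one mount point,
# read from an fstab-format file (default: /etc/fstab).
fstab_entry() {
    awk -v mp="$1" '
        $1 ~ /^#/ { next }                 # skip comment lines
        NF >= 3 && $2 == mp {              # fields: device, mount point, type, ...
            pass = (NF >= 6) ? $6 : 0      # a missing sixth field counts as 0
            print $1, $3, pass
            exit
        }
    ' "${2:-/etc/fstab}"
}

# Hypothetical front-end: decide what to do for a single mount point,
# honouring its pass field. It only prints the command it would run.
fsck_one() {
    entry=$(fstab_entry "$1" "${2:-/etc/fstab}")
    if [ -z "$entry" ]; then
        echo "no fstab entry for $1" >&2
        return 1
    fi
    set -- $entry                          # split into device, type, pass
    device=$1 fstype=$2 pass=$3
    if [ "$pass" = 0 ]; then
        echo "skip $device: pass field is 0"
        return 0
    fi
    if ! command -v "fsck.$fstype" >/dev/null 2>&1; then
        echo "skip $device: no fsck.$fstype installed"
        return 0
    fi
    echo "would run: fsck.$fstype -a $device"
}
```

With something like this, checkroot.sh could call `fsck_one /` instead of receiving the pass field from /lib/init/mount-functions.sh.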
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
Hello,

I invite you guys to look inside /etc/init.d/checkroot.sh to see that the check for AC power or battery is actually completely commented out. So as far as battery is concerned, the question is moot, and the filesystem is always checked.

The battery check was introduced in #326647 in 2005:

> While reviewing the patches from Ubuntu, I came across a nice
> improvement. Make sure to not run fsck if the machine is running on
> battery. This would make the boot process a lot better for laptops.

I haven't researched the original Ubuntu patch to check what the rationale was when it was introduced there.

Then in 2009, it was pointed out in #526398 that this "can cause serious data corruption if booting on battery power" and that skipping fsck wasn't necessarily a nice idea. Some discussion already ensued about the necessity and conditionality of running fsck at boot, similar to this thread. The battery check was commented out, and #326647 was reopened.

So I believe that this 2007 bug was already fixed at the same time as #526398, when the battery check was reverted. There could be other mishaps causing the reporter's JFS filesystem to be skipped, but I don't see any, and running on batteries was specifically mentioned. So as for this bug report proper, I suspect it could be closed.

Now as for the discussion:

/etc/init.d/checkroot.sh contains a check to skip btrfs.

/lib/init/mount-functions.sh contains checks specific to the root fs to skip nfs and nfs4, and also /etc/fstab configurations with the pass field set to 0 - which, by the way, was reported in #571241 to fail to skip trying to check a ubifs root (for which there is no fsck utility).

/etc/init.d/checkfs.sh contains checks to limit fsck operations to the undocumented FSCKTYPES variable (which is ignored by /etc/init.d/checkroot.sh) and calls fsck with -A to let fsck itself handle the pass column of /etc/fstab.

And finally, initscripts ships a dummy fsck.nfs to cope with attempts that should already be prevented by the above check for it.

So the current handling of this matter is not very consistent. Ted says that checking filesystems at boot is very much not a dead concept. Yet there are some filesystems where it's undesirable or impossible.

Do we want a blacklist, or a whitelist?

Do we want to delegate conditionality to particular implementations? If so, which factors? Running on battery was suggested. Shipping a dummy fsck.$type is a way to delegate the possibility or impossibility to fsck to the implementation - but that doesn't seem like the best or most flexible technical solution to me.

What is the difference between checking the root filesystem, and checking other filesystems? Why would the logic to skip btrfs and nfs apply only to checkroot.sh, why would checkroot.sh ignore FSCKTYPES?

Does the installer set up /etc/fstab configurations requesting to fsck the wrong filesystems to begin with, so that we need checks to disable them later on? Shall we ask the installer maintainers, and open a bug if there isn't one?

Can we have a switch in fsck similar to -A to let it parse the pass field of /etc/fstab for us, except when checking only one device passed in argument, or only the root fs? That way we wouldn't have to parse it ourselves too just for the root fs and pass it from /lib/init/mount-functions.sh back to /etc/init.d/checkroot.sh.

Can we gather the filesystem blacklist/whitelist, FSCKTYPES handling, is_fastboot_active check, and possibly battery check, all together and factored into a single place in /lib/init/mount-functions.sh? That's what I would suggest.

Regards,

--
Pierre Ynard
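To make the last suggestion concrete, here is a minimal sketch of one shared decision function. The name should_fsck and its arguments are hypothetical and do not exist in /lib/init/mount-functions.sh. It also assumes that FSCKTYPES is a plain comma-separated whitelist, which is simpler than what fsck -t accepts, and it stands in a yes/no argument for the is_fastboot_active check. The blacklist below merely collects the types named in this message.

```shell
#!/bin/sh
# Hypothetical single decision point: should this filesystem be checked at boot?
# Returns 0 for "run fsck" and 1 for "skip".
should_fsck() {
    fstype=$1             # filesystem type from fstab, e.g. ext4
    pass=$2               # sixth fstab field
    fastboot=${3:-no}     # stand-in for the is_fastboot_active check

    # A requested fast boot skips every check.
    [ "$fastboot" = yes ] && return 1

    # Pass 0 means the administrator asked for no check.
    [ "$pass" = 0 ] && return 1

    # Blacklist: types that are checked elsewhere or have no usable checker.
    case "$fstype" in
        nfs|nfs4|btrfs|ubifs) return 1 ;;
    esac

    # Assumption: a non-empty FSCKTYPES is a comma-separated whitelist.
    if [ -n "${FSCKTYPES:-}" ]; then
        case ",$FSCKTYPES," in
            *",$fstype,"*) ;;
            *) return 1 ;;
        esac
    fi
    return 0
}
```

Both checkroot.sh and checkfs.sh could then ask the same question, for example `should_fsck "$roottype" "$rootpass" || exit 0`, where the two variable names are again only placeholders.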
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
On Tue, Nov 13, 2018 at 11:46:31PM +0100, Adam Borowski wrote:
> what would you say about getting rid of fsck at boot for most filesystems?

The reason it's important to run fsck at boot is that, for many file systems, a file system consistency problem may be detected at run time (this might be caused by a kernel bug, or a hardware problem, or a cosmic ray). If that happens, a flag is set in the superblock indicating that the file system really needs to be checked.

For ext4, what happens after the flag is set in the superblock depends on how the file system is configured (via mount options or by flags set via tune2fs -e). We can either ignore the fact that there was an error (the "don't worry, be happy" mode), we can remount the file system read-only --- or we can immediately force a reboot. At which point, when the system reboots, the file system checker will run, and in preen mode, will automatically force a full check.

So the assertion in the bug report, "running fsck at boot is harmful for any modern file system", falls into the same trap as ZFS did when they asserted, "we're a modern file system, we don't need a fsck program at all!" They very quickly learned that in the real world, there are cosmic rays hitting DRAM; there are hardware bugs; there are kernel bugs. And sending angry customers to ZFS developers to manually fix corrupted file systems (because ZFS didn't have an fsck) didn't scale. :-)

So running fsck at boot is absolutely required.

> For the few that actually need it, being on battery shouldn't skip it.

It was never a good idea for checkroot.sh to be checking whether or not it was on battery. That check needs to be done in the file system checker. So for ext4, if you do want to enable time-based or mount count-based checks, e2fsck will check whether or not the system is on battery, and skip the check if the only reason for it was the last check time or the mount count.

HOWEVER, if the file system is marked as having some corruption found by the kernel, e2fsck will always try to fix the problem, on the assumption that most users care about the data not getting lost more than they care about battery life. :-)

Regards,

- Ted
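The policy described in the last two paragraphs can be modelled as a small decision function. This is only an illustration of the rules as stated in this message, not e2fsck's actual code (which is written in C); the function name preen_decision and its yes/no arguments are invented for the example.

```shell
#!/bin/sh
# Hypothetical model of the checker-side policy described above.
#   $1  error_flag    yes if the kernel marked the file system as having errors
#   $2  periodic_due  yes if a time-based or mount-count-based check is due
#   $3  on_battery    yes if the machine is running on battery
preen_decision() {
    error_flag=$1
    periodic_due=$2
    on_battery=$3
    if [ "$error_flag" = yes ]; then
        # Known corruption is always repaired, battery or not.
        echo full-check
    elif [ "$periodic_due" = yes ] && [ "$on_battery" = yes ]; then
        # A purely periodic check can wait until AC power is back.
        echo skip
    elif [ "$periodic_due" = yes ]; then
        echo full-check
    else
        # Clean file system with no check due: nothing to do.
        echo clean
    fi
}
```

The point of the model is that the battery state only matters in the second branch, which is why the decision belongs in the checker, where the error flag is known, and not in checkroot.sh.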
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
[JFS's regular fsck does nothing but, and is needed to, trigger journal replay. If the journal replay is skipped, a dirty fs will fail the mount, resulting in a boot failure if it is /. Being on battery skips fsck.]

On Tue, Nov 13, 2018 at 08:22:09PM +, Dmitry Bogatov wrote:
>
> control: tags -1 confirmed
>
> [2018-11-12 02:03] Adam Borowski
> > On Mon, Nov 12, 2018 at 01:20:25AM +0100, Adam Borowski wrote:
> > > And JFS doesn't do the full check either, it merely apparently needs (or
> > > more likely needed -- this report is 11¾ years old) only a trigger for the
> > > journal replay.
> > >
> > > So I propose:
> > > 1. checking if JFS still needs this
> >
> > Yes, it does.
> >
> > So we still need fsck to boot, no matter if the computer is on battery or
> > not. So this bug is still valid.
>
> Am I correct, that fsck (checkroot.sh) is actually needed only in case
> of ext2 or jfs root?

No idea about some weird filesystems -- but at least for ext*, there's a guy who knows it a wee bit better than any of us. He might offer advice wrt other filesystems as well.

Ted: what would you say about getting rid of fsck at boot for most filesystems? For the few that actually need it, being on battery shouldn't skip it.

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
control: tags -1 confirmed

[2018-11-12 02:03] Adam Borowski
> On Mon, Nov 12, 2018 at 01:20:25AM +0100, Adam Borowski wrote:
> > And JFS doesn't do the full check either, it merely apparently needs (or
> > more likely needed -- this report is 11¾ years old) only a trigger for the
> > journal replay.
> >
> > So I propose:
> > 1. checking if JFS still needs this
>
> Yes, it does.
>
> So we still need fsck to boot, no matter if the computer is on battery or
> not. So this bug is still valid.

Am I correct that fsck (checkroot.sh) is actually needed only for an ext2 or JFS root?
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
Adam Borowski writes ("Bug#408954: checkroot.sh: should not skip running fsck with JFS root"):
> Running fsck at boot is useless and harmful for any modern filesystem.
> Sure, for ext2 it was needed to at least somewhat reduce data loss you just
> suffered, but anything newer is crash safe.

I still make ext2 filesystems in new installs...

If ext3+ checks are harmful at boot then this should be done in the filesystem-specific check utility by making the preen function into a no-op, surely?

Ian.
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
On Mon, Nov 12, 2018 at 01:20:25AM +0100, Adam Borowski wrote:
> And JFS doesn't do the full check either, it merely apparently needs (or
> more likely needed -- this report is 11¾ years old) only a trigger for the
> journal replay.
>
> So I propose:
> 1. checking if JFS still needs this

Yes, it does.

So we still need fsck to boot, no matter if the computer is on battery or not. So this bug is still valid.

--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Imagine there are bandits in your house, your kid is bleeding out,
⢿⡄⠘⠷⠚⠋⠀ the house is on fire, and seven big-ass trumpets are playing in the
⠈⠳⣄ sky. Your cat demands food. The priority should be obvious...
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
On Sun, Nov 11, 2018 at 09:08:55PM +, Dmitry Bogatov wrote:
> [2007-01-29 17:37] Ari Sovijärvi
> > The JFS filesystem replays its journal when FSCK is run. Without
> > journal replay the filesystem will not allow mount as RW. When running
> > on batteries and the root filesystem is JFS, after a crash the system is
> > left in unusable state since FSCK is skipped and as a result the remount
> > to read/write attempt fails.
> >
> > Maybe we need a check if we're using JFS and then unconditionally run
> > FSCK.
>
> While adding yet another special check into `checkroot.sh' would
> definitely solve problem at hand, probably there is something better,
> less ad-hoc? Any ideas?

Running fsck at boot is useless and harmful for any modern filesystem. Sure, for ext2 it was needed to at least somewhat reduce the data loss you just suffered, but anything newer is crash safe.

Thus, what about removing fsck in the general case? Filesystems like btrfs or XFS have had stubs since forever; for ext4 all it gives you is an annoying check if the machine's clock is unpowered or missing.

And JFS doesn't do the full check either, it merely apparently needs (or more likely needed -- this report is 11¾ years old) only a trigger for the journal replay.

So I propose:
1. checking if JFS still needs this
2. asking filesystem maintainers such as Ted Ts'o whether fsck-on-boot can be dropped for their filesystem
3. making fsck no longer conditional (for that poor ext2 user)

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ I was born a dumb, ugly and work-loving kid, then I got swapped on
⠈⠳⣄ the maternity ward.
Bug#408954: checkroot.sh: should not skip running fsck with JFS root
Package: initscripts
Version: 2.86.ds1-36
Severity: normal

The JFS filesystem replays its journal when FSCK is run. Without journal replay the filesystem will not allow mount as RW. When running on batteries and the root filesystem is JFS, after a crash the system is left in unusable state since FSCK is skipped and as a result the remount to read/write attempt fails.

Maybe we need a check if we're using JFS and then unconditionally run FSCK.

-- System Information:
Debian Release: 4.0
APT prefers testing
APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.19-laiskat-ja-tyhmat1
Locale: LANG=C, [EMAIL PROTECTED] (charmap=ISO-8859-15)

Versions of packages initscripts depends on:
ii  debianut  2.17                             Miscellaneous utilities specific t
ii  e2fsprog  1.39+1.40-WIP-2006.11.14+dfsg-1  ext2 file system utilities and lib
ii  libc6     2.3.6.ds1-8                      GNU C Library: Shared libraries
ii  lsb-base  3.1-22                           Linux Standard Base 3.1 init scrip
ii  mount     2.12r-15                         Tools for mounting and manipulatin
ii  sysvinit  2.86.ds1-36                      System-V-like utilities

Versions of packages initscripts recommends:
ii  psmisc    22.3-1                           Utilities that use the proc filesy

-- no debconf information