On Tue, Nov 13, 2018 at 11:46:31PM +0100, Adam Borowski wrote:
> what would you say about getting rid of fsck at boot for most filesystems?

Running fsck at boot is important because, for many file systems, a
consistency problem can be detected at run time (this might be caused
by a kernel bug, a hardware problem, or a cosmic ray).  When that
happens, a flag is set in the superblock indicating that the file
system really needs to be checked.

For ext4, what happens after the flag is set in the superblock depends
on how the file system is configured (via mount options or by flags
set via tune2fs -e).  We can either ignore the fact that there was an
error (the "don't worry, be happy" mode), we can remount the file
system read-only, or we can immediately force a reboot.  Then, when
the system next reboots, the file system checker will run and, in
preen mode, will automatically force a full check.
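
As a rough illustration of the flag and the configured error behavior,
here is a minimal Python sketch that decodes both from a raw ext2/3/4
superblock.  The field offsets and values follow the documented on-disk
layout (s_magic at 0x38, s_state at 0x3A, s_errors at 0x3C); the
function names are made up for this example, and this is not how
e2fsck itself is written.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1024 bytes into the device
EXT_MAGIC = 0xEF53        # s_magic for ext2/3/4
ERROR_FS = 0x0002         # s_state bit: the kernel detected errors
ERRORS_POLICY = {1: "continue", 2: "remount-ro", 3: "panic"}  # s_errors values


def describe_superblock(sb: bytes) -> dict:
    """Decode the error flag and error policy from raw superblock bytes.

    Hypothetical helper for illustration; sb must hold at least 62 bytes.
    """
    # Three consecutive little-endian 16-bit fields: s_magic, s_state, s_errors.
    magic, state, errors = struct.unpack_from("<HHH", sb, 0x38)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    return {
        "needs_check": bool(state & ERROR_FS),
        "on_error": ERRORS_POLICY.get(errors, "unknown"),
    }


def read_superblock(path: str) -> bytes:
    """Read the primary superblock from a device or image file."""
    with open(path, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return f.read(1024)
```

For example, describe_superblock(read_superblock(path)) on a file system
image would report whether the error flag is set and which of the three
behaviors is configured; reading a real block device this way normally
requires root.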

So the assertion in the bug report, "running fsck at boot is harmful
for any modern file system," falls into the same trap as ZFS did when
they asserted, "we're a modern file system, we don't need a fsck
program at all!"  They very quickly learned that in the real world,
there are cosmic rays hitting DRAM; there are hardware bugs; there are
kernel bugs.  And sending angry customers to ZFS developers to
manually fix corrupted file systems (because ZFS didn't have an fsck)
didn't scale.  :-)

So running fsck at boot is absolutely required.

> For the few that actually need it, being on battery shouldn't skip it.

It was never a good idea for checkroot.sh to be checking whether or
not it was on battery.  That check needs to be done in the file system
checker.  So for ext4, if you do want to enable time-based or mount
count-based checks, e2fsck will check whether or not the system is on
battery, and skip the check if the only reason for it was the last
check time or the mount count.  HOWEVER, if the
file system is marked as having some corruption found by the kernel,
e2fsck will always try to fix the problem, on the assumption that most
users care about the data not getting lost more than they care about
battery life.  :-)
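
The policy described above can be summarized in a few lines of Python.
This is only a model of the decision, under the assumption that the
three inputs are already known; the function and parameter names are
hypothetical and do not come from the e2fsck source.

```python
def should_run_full_check(error_flag_set: bool,
                          periodic_check_due: bool,
                          on_battery: bool) -> bool:
    """Model of the policy: corruption always wins over battery life."""
    if error_flag_set:
        # The kernel marked the file system as having errors: always check.
        return True
    if periodic_check_due and not on_battery:
        # Time-based or mount-count-based check, and we are on AC power.
        return True
    # Either nothing is due, or only a periodic check is due while on battery.
    return False
```

So a laptop on battery with an overdue mount count skips the check,
while the same laptop with the error flag set in the superblock does
not.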

Regards,

                                                - Ted
