On Fri, Oct 03, 2025 at 06:02:33PM +0000, H. Hartzer wrote:

> Even with a dedicated partition, a system may not come back after
> power loss or a crash. If automatic FSCK fails, manual intervention
> has to be done via the console.

There is no obligation to automatically mount it and invoke fsck via
/etc/fstab.

> On a remote system, this may be a
> pretty big deal. Of course I would love simple serial console and
> power access to every system I have, but this is not the reality.

But you don't need that.

If the partition is listed in /etc/fstab with fs_passno set to 0, then it will
not be fsck'ed at boot.  If fs_mntops includes noauto, then the system will not
try to mount it at boot time either.
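For example, an /etc/fstab line along these lines (the device name, mount
point and options here are just placeholders) keeps the partition out of both
the automatic mount and the boot-time fsck:

  /dev/sd0d /data ffs rw,noauto,nodev,nosuid 0 0

The last two fields are fs_freq and fs_passno; setting fs_passno to 0 is what
skips the fsck pass, and noauto in fs_mntops is what stops the mount.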

When the system is suddenly restarted, if there are no other problems during
boot then it will boot multi-user and you can perform whatever actions are
required on the dedicated partition via a regular ssh session, or possibly via
scripts invoked automatically after the standard boot scripts.
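As a rough sketch, assuming the partition is /dev/sd0d and its mount point is
/data (both made up here), something like this appended to /etc/rc.local would
try to check and mount it once the normal boot has finished, and flag it for
manual attention if that fails:

  # appended to /etc/rc.local - runs after the standard rc scripts
  if fsck -p /dev/sd0d; then
          mount /dev/sd0d /data
  else
          echo "fsck of /dev/sd0d failed, /data left unmounted" |
              mail -s "manual fsck needed" root
  fi

Either way the machine finishes booting multi-user and stays reachable over
ssh.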

> And even if it were, there's still the chance that the corruption
> will be so much that manual fsck can't take care of it and the
> kernel will panic trying to read data from the partition.

See above - as long as you don't try to mount it, the kernel should not crash.

It's true that fsck might not be able to repair it, or might fix the metadata
while leaving bad user data in place.  That needs to be detected and fixed
with other tools.

> So then
> I have to reformat the partition and start over.
> 
> Starting over could mean syncing a 200GB+ blockchain, or transfering
> that from another machine. This just isn't a fast operation by any
> means. If I can avoid this, which it appears I can, I will be much
> better off.

Another option is copying the 200GB+ of data from another partition on the
same disk.  Whilst this is not an instant operation, it's probably not too
much of a penalty if you're using a decent disk.

That just requires a simple script to replicate the data from one partition to
another at regular time intervals.  Then you have a quick recovery method if
you're ever in the situation where the filesystem couldn't be recovered by
fsck and you decide to format it.
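A minimal version of that script, assuming both filesystems are mounted at
/data and /backup (names made up here) and that rsync - or the openrsync in
base - is available, could be as small as:

  #!/bin/sh
  # replicate the live data partition onto the backup partition
  rsync -a --delete /data/ /backup/

run from root's crontab at whatever interval fits how much recent data you can
afford to lose, e.g. (the script path is a placeholder):

  0 */6 * * * /root/bin/replicate-data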

But it's even better than that, because you could just mount the alternative
backup partition as the live one and continue working almost immediately
after rebooting.  Then just re-format the first partition which got corrupted,
and use that as the new backup.
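In terms of commands, the swap is roughly this (device names are made up, and
newfs wipes whatever partition it's pointed at):

  umount /data               # if the damaged filesystem was mounted at all
  mount /dev/sd0e /data      # bring the backup copy up as the live data
  newfs /dev/rsd0d           # re-create the filesystem on the damaged partition
  mount /dev/sd0d /backup    # and use it as the new backup target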

Unless you had multiple unrecoverable-by-fsck failures in a short timespan,
this should avoid copying the large dataset from another machine.

> Again, this behavior is out-of-the-box defaults with or without
> RAID 1.  I did not see this tested or documented anywhere.

To trigger the problem you describe, the machine has to be halted or restarted
abruptly at a moment of heavy write I/O.

Maybe this is less common than you expect.

The restart will typically be due to either a power failure or a kernel panic.

OpenBSD machines deployed as production servers will usually have a reliable
power source.  Those run by private hobbyists might not, but then there is a
good chance that the users running them have enough experience to get the
system back up and running without too much concern or difficulty.

A lot of non-server installs are on laptops, which have a battery, or on
machines connected to a UPS.

Kernel panics shouldn't be happening to the vast majority of users who are
running the generic kernel on reliable hardware.  If there were a reliable way
to trigger a kernel panic by running unprivileged userland code on a generic
kernel, then it would certainly be fixed.

Sure, anybody hacking the kernel code can introduce bugs that cause a panic,
but if you test your own buggy code on a system that matters, then that's a
risk you take.

> reliable filesystems that can handle crashes
> and power loss seem quite important.

Honestly, it depends on the application.

Crashes and power loss shouldn't be happening in the first place, and ideally
the cause should be addressed rather than trying to fix it with additional
complication at the filesystem level.

If data integrity is critical, then the system needs to check each bit of each
file after any crash, regardless of whether the filesystem says it's OK or
not.  If you're going to do that anyway, there is not much hope of an
instant restart, and not much to be gained over restoring a backup.
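One way to make that check possible at all is to keep a checksum manifest
alongside the backup, e.g. with sha256(1) (paths are placeholders):

  # generate the manifest while the data is known to be good
  find /data -type f -exec sha256 {} + > /backup/data.sha256
  # after a crash, verify every file against it
  sha256 -c /backup/data.sha256

but generating and verifying that over 200GB+ is itself far from instant.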

It's quite easy to _promise_ a filesystem or RAID array that will hardly ever
cause data loss, but if that comes at a cost that penalizes regular operation
on working hardware, then there will obviously be resistance or lack of
interest from the users whose systems fall within that scope.
