Re: btrfs as / filesystem in RAID1

Chris Murphy Fri, 08 Feb 2019 12:17:46 -0800

On Fri, Feb 8, 2019 at 11:10 AM waxhead <waxh...@dirtcellar.net> wrote:


> So what you are saying here is that distro's that use btrfs by default
> should be responsible enough to make some monitoring solution if they
> allow non-technical users to create a "raid"1 like btrfs filesystem in
> the first place.

None do this by default. I'm only aware of one that makes it possible
in custom partitioning, which is widely regarded as "you're on your
own" land.

I am of the opinion that GUI installers have a high burden to protect
users from themselves, but it's just an opinion; I see plenty of
fail-danger GUI software.


> So why do BTRFS hurry to mount itself even if devices are missing?

It isn't and it doesn't. You have to specify the 'degraded' mount
option, which is not the default. With the present design, specifying
it means you intend an immediate successful mount when a device is
missing and it's still possible to mount anyway.

> and
> if BTRFS still can mount , why whould it blindly accept a non-existing
> disk to take part of the pool?!

I can't parse this question. I think the answer is, it doesn't do that.


> > Realistically, we can only safely recover from divergence correctly if
> > we can prove that all devices are true prior states of the current
> > highest generation, which is not currently possible to do reliably
> > because of how BTRFS operates.
> >
> So what you are saying is that the generation number does not represent
> a true frozen state of the filesystem at that point?

You have a two device raid1, and their generation is 100. You mount
one device by itself with degraded mount option. And you start adding
and deleting files, no snapshots, and those changes are all under
generation 101. You now unmount it, and you degraded mount the other
device, and you add and delete some different files, and those changes
are all under generation 101 too.

How do you merge them? I personally think that scenario is user
sabotage and they're just screwed. Start over. They had to
intentionally, manually, mount those two drives *separately* with a
non-default 'degraded' flag. It's crazy to expect Btrfs to sort this
out - but it's entirely reasonable for it to faceplant read only the
instant it becomes confused; and reasonable to expect and design it to
quickly become confused in such a case, to keep damage from making
both separated mirrors so corrupted they can't be mounted even read
only.
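
To make the generation 100/101 scenario above concrete, here is a toy
Python sketch. It is not Btrfs code: the Mirror class, its fields, and
the file names are all made up for illustration, and it assumes every
change in one degraded session lands in a single new generation. It
only shows that a generation number by itself cannot tell you which
mirror is the older state of the other.

```python
from dataclasses import dataclass, field


@dataclass
class Mirror:
    """Hypothetical stand-in for one raid1 device: a generation plus file names."""
    generation: int
    files: set = field(default_factory=set)

    def degraded_write(self, add=(), delete=()):
        # Assumption: all changes in this degraded session land in one new generation.
        self.files = (self.files | set(add)) - set(delete)
        self.generation += 1


def is_prior_state_by_generation(a: Mirror, b: Mirror) -> bool:
    """Naive check: treat 'a' as an older copy of 'b' only if its generation is lower."""
    return a.generation < b.generation


# Both devices start in sync at generation 100.
common = {"a.txt", "b.txt"}
dev1 = Mirror(100, set(common))
dev2 = Mirror(100, set(common))

# Each device is mounted degraded on its own and modified differently.
dev1.degraded_write(add={"c.txt"}, delete={"a.txt"})
dev2.degraded_write(add={"d.txt"}, delete={"b.txt"})

same_generation = dev1.generation == dev2.generation == 101
contents_differ = dev1.files != dev2.files
print(same_generation, contents_differ)  # True True
```

Both mirrors claim generation 101 with different contents, and the
naive check returns False in both directions, so neither one can be
picked as the stale copy to overwrite.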


> Why does systemd concern itself about what devices btrfs consist of.
> Please educate me, I am curious.

I'm not sure of the history of:
/usr/lib/udev/rules.d/64-btrfs-dm.rules
/usr/lib/udev/rules.d/64-btrfs.rules

But I think they were submitted to udev by Btrfs developers long ago,
which was then later subsumed into systemd. It would be ideal if this
rule had some sort of timeout; I think instead it will wait
indefinitely for all devices to appear. Anyway, without that rule, if a device
is merely delayed, and systemd tries to mount, mount immediately fails
and thus boot fails. There is no such thing in systemd as reattempting
to mount after a mount failure, and if sysroot fails to mount, it's a
fatal startup error.
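
As a rough illustration of the "wait, but with a timeout" idea, here is
a small Python sketch. It is not how udev or systemd work: the
wait_for_devices function, the probe callback, and the device paths are
hypothetical, and what should happen on "give-up" is deliberately left
to the caller.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_devices(
    expected: Iterable[str],
    present: Callable[[], Set[str]],
    timeout_s: float,
    poll_s: float = 0.1,
) -> str:
    """Return 'normal' if every expected device appears before the timeout,
    otherwise 'give-up'. Illustrative only."""
    wanted = set(expected)
    deadline = time.monotonic() + timeout_s
    while True:
        if wanted <= present():
            return "normal"
        if time.monotonic() >= deadline:
            return "give-up"
        time.sleep(poll_s)


def make_probe(appear_after_polls: int) -> Callable[[], Set[str]]:
    """Fake device scan: the second device shows up only after N polls."""
    calls = {"n": 0}

    def probe() -> Set[str]:
        calls["n"] += 1
        found = {"/dev/sda2"}
        if calls["n"] > appear_after_polls:
            found.add("/dev/sdb2")
        return found

    return probe


devices = ["/dev/sda2", "/dev/sdb2"]
# A merely delayed device: it appears on the fourth poll, well inside the timeout.
delayed = wait_for_devices(devices, make_probe(3), timeout_s=2.0, poll_s=0.01)
# A truly missing device: the wait ends instead of blocking forever.
missing = wait_for_devices(devices, make_probe(10**9), timeout_s=0.05, poll_s=0.01)
print(delayed, missing)  # normal give-up
```

The delayed device case still gets a normal mount, while the missing
device case stops waiting after a bounded time.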

> I would also like for BTRFS to be over-aggressively safe, but I also
> want it to be over-aggressively always running or even limping if that
> is what it needs to do.

While I understand that's a metaphor, someone limping along is not a
stable situation. They are more likely to trip and fall.


-- 
Chris Murphy
