On Fri, Feb 8, 2019 at 11:10 AM waxhead <waxh...@dirtcellar.net> wrote:
> So what you are saying here is that distro's that use btrfs by default
> should be responsible enough to make some monitoring solution if they
> allow non-technical users to create a "raid"1 like btrfs filesystem in
> the first place.

None do this by default. I'm only aware of one that makes it possible
in custom partitioning, which is widely regarded as "you're on your
own" land. I am of the opinion that GUI installers have a high burden
to protect users from themselves but it's just an opinion; I see
plenty of fail danger GUI software.

> So why do BTRFS hurry to mount itself even if devices are missing?

It isn't and it doesn't. You have to specify the 'degraded' mount
option, which is not the default. With the present design, that option
means you intend an immediate successful mount if a device is missing
and it's still possible to mount anyway.

>and
> if BTRFS still can mount , why whould it blindly accept a non-existing
> disk to take part of the pool?!

I can't parse this question. I think the answer is, it doesn't do that.

> > Realistically, we can only safely recover from divergence correctly if
> > we can prove that all devices are true prior states of the current
> > highest generation, which is not currently possible to do reliably
> > because of how BTRFS operates.
> >
> So what you are saying is that the generation number does not represent
> a true frozen state of the filesystem at that point?

You have a two device raid1, and their generation is 100. You mount
one device by itself with the degraded mount option. And you start
adding and deleting files, no snapshots, and those changes are all
under generation 101. You now unmount it, and you degraded mount the
other device, and you add and delete some different files, and those
changes are all under generation 101 too. How do you merge them?

I personally think that scenario is user sabotage and they're just
screwed. Start over.
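
A toy sketch of that scenario, purely for illustration: `Mirror`,
`degraded_commit` and `is_prior_state` are hypothetical names, and the
sets of file names stand in for real filesystem trees. It is not how
Btrfs stores or compares generations; it only shows why a generation
number alone cannot say which mirror is the older state.

```python
from dataclasses import dataclass, field


@dataclass
class Mirror:
    """Toy stand-in for one device of a two-device raid1 (not a Btrfs structure)."""
    generation: int = 100
    files: set = field(default_factory=set)

    def degraded_commit(self, added=(), deleted=()):
        # Changes made while this device is mounted alone land in the next generation.
        self.files = (self.files | set(added)) - set(deleted)
        self.generation += 1


def is_prior_state(old, new):
    # Naive check that looks only at the generation number.
    return old.generation < new.generation


# Both mirrors start identical at generation 100.
dev1 = Mirror(files={"a", "b"})
dev2 = Mirror(files={"a", "b"})

# Each one is mounted degraded, separately, and modified differently.
dev1.degraded_commit(added={"c"}, deleted={"a"})
dev2.degraded_commit(added={"d"}, deleted={"b"})

same_generation = dev1.generation == dev2.generation
same_contents = dev1.files == dev2.files
print(same_generation, same_contents)
```

Both devices end up claiming the same generation while holding
different contents, so neither passes as a prior state of the other
and there is nothing to pick a merge direction from.
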
They had to intentionally, manually, mount those two drives
*separately* with a non-default 'degraded' flag. It's crazy to expect
Btrfs to sort this out - but it's entirely reasonable for it to
faceplant read only the instant it becomes confused; and reasonable to
expect and design it to quickly become confused in such a case, to
keep damage from making both separated mirrors so corrupted they can't
be mounted even read only.

> Why does systemd concern itself about what devices btrfs consist of.
> Please educate me, I am curious.

I'm not sure of the history of:

/usr/lib/udev/rules.d/64-btrfs-dm.rules
/usr/lib/udev/rules.d/64-btrfs.rules

But I think they were submitted to udev by Btrfs developers long ago,
and udev was then later subsumed into systemd. It would be ideal if
this rule had some sort of timeout; I think instead it will wait
indefinitely for all devices to appear. Anyway, without that rule, if
a device is merely delayed and systemd tries to mount, the mount
immediately fails and thus boot fails. There is no such thing in
systemd as reattempting to mount after a mount failure, and if sysroot
fails to mount, it's a fatal startup error.

> I would also like for BTRFS to be over-aggressively safe, but I also
> want it to be over-aggressively always running or even limping if that
> is what it needs to do.

While I understand that's a metaphor, someone limping along is not in
a stable situation. They are more likely to trip and fall.

--
Chris Murphy