Austin S. Hemmelgarn wrote:
On 2019-02-07 06:04, Stefan K wrote:
Thanks, with degraded as a kernel parameter and also in the fstab it
works as expected.
That should be the normal behaviour, because a server must be up and
running, and I don't care about a device loss; that's why I use
RAID1. I can fix the device-loss problem later, but it's important that
the server is up and running. I get informed at boot time and also in
the log files that a device is missing, and I also see it if I use a
monitoring program.
No, it shouldn't be the default, because:
* Normal desktop users _never_ look at the log files or boot info, and
rarely run monitoring programs, so they as a general rule won't notice
until it's already too late. BTRFS isn't just a server filesystem, so
it needs to be safe for regular users too.
I am willing to argue that whoever you refer to as normal users doesn't
have a clue how to make a raid1 filesystem, nor do they care about what
underlying filesystem their computer runs. I can't quite see how a
limping system would be worse than a failing system in this case.
Besides, "normal" desktop users use Windows anyway; people who run on
penguin-powered stuff generally have at least some technical knowledge.
* It's easily possible to end up mounting degraded by accident if one of
the constituent devices is slow to enumerate, and this can easily result
in a split-brain scenario where all devices have diverged and the volume
can only be repaired by recreating it from scratch.
Am I wrong, or would the remaining disk not have its generation number
bumped on every commit? Would it not make sense to ignore (previously)
stale disks and require a manual "re-add" of the failed disks? From a
user's perspective with some C coding knowledge this sounds to me (in
principle) like something quite simple.
E.g. if the superblock UUIDs match for all devices and one (or more)
devices have a lower generation number than the other(s), then the disk(s)
with the newest generation number should be considered good and the
other disks, with a lower generation number, should be marked as failed.
A rough sketch of that comparison follows below.
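To make the idea concrete, here is a minimal user-space sketch of that
comparison. This is not actual btrfs code: the struct, field and function
names are made up for illustration, and a real implementation would of
course group devices by fsid and live in the kernel's device-scanning path.

/* Sketch only: mark every device whose superblock generation is
 * behind the newest one seen in the pool as stale/failed. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct dev_info {
    char     fsid[40];      /* filesystem UUID (assumed equal for all devices here) */
    uint64_t generation;    /* superblock generation of this device */
    bool     stale;         /* set when the device lags behind the pool */
};

static void mark_stale_devices(struct dev_info *devs, size_t n)
{
    uint64_t newest = 0;

    for (size_t i = 0; i < n; i++)
        if (devs[i].generation > newest)
            newest = devs[i].generation;

    for (size_t i = 0; i < n; i++)
        devs[i].stale = devs[i].generation < newest;
}

int main(void)
{
    struct dev_info devs[] = {
        { "example-fs-uuid", 1042, false }, /* kept committing while degraded */
        { "example-fs-uuid",  987, false }, /* dropped out earlier, now behind */
    };
    size_t n = sizeof(devs) / sizeof(devs[0]);

    mark_stale_devices(devs, n);
    for (size_t i = 0; i < n; i++)
        printf("device %zu: generation %llu -> %s\n", i,
               (unsigned long long)devs[i].generation,
               devs[i].stale ? "STALE, needs manual re-add" : "good");
    return 0;
}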
* We have _ZERO_ automatic recovery from this situation. This makes
both of the above mentioned issues far more dangerous.
See above: would this not be as simple as auto-deleting disks from the
pool that have a matching UUID but a mismatched superblock generation
number? Not exactly a recovery (the manual repair path that exists today
is sketched below), but the system should be able to limp along.
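For reference, the manual repair path that exists today looks roughly
like this for a two-disk raid1 where one device died (device names and
the mount point are placeholders, and a single 'btrfs replace start' can
be used instead of the add/remove pair):

  mount -o degraded /dev/sda1 /mnt
  btrfs device add /dev/sdc1 /mnt      # add the replacement disk
  btrfs device remove missing /mnt     # drop the dead one from the pool
  # re-mirror any chunks written while running degraded
  btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt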
* It just plain does not work with most systemd setups, because systemd
will hang waiting on all the devices to appear due to the fact that they
refuse to acknowledge that the only way to correctly know if a BTRFS
volume will mount is to just try and mount it.
As far as I have understood it, BTRFS refuses to mount without the
degraded flag even in redundant setups. Why?! This is just plain
useless. If anything, the degraded mount option should be replaced with
something like failif=X, where X could be anything from 'never', which
should get a two-disk system with exclusively raid1 profiles up even if
only one device is working, to 'always', which would refuse the mount if
any device has failed, or even 'atrisk', which would refuse the mount
when the loss of one more device would break the guarantee of some raid
chunk profile. (This admittedly gets complex in a multi-disk raid1
setup, or if subvolumes can perhaps be mounted with different "raid"
profiles....) A couple of hypothetical fstab lines are sketched below.
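Purely to illustrate the idea (failif= does not exist today; the option
name and its values are hypothetical, and <fs-uuid> is a placeholder),
the fstab entries could look something like:

  # two-disk raid1 root: keep booting even if only one device is left
  UUID=<fs-uuid>  /      btrfs  defaults,failif=never   0  0
  # data volume: refuse to mount once one more device loss would break raid1
  UUID=<fs-uuid>  /data  btrfs  defaults,failif=atrisk  0  0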
* Given that new kernels still don't properly generate half-raid1 chunks
when a device is missing in a two-device raid1 setup, there's a very
real possibility that users will have trouble recovering filesystems
with old recovery media (IOW, any recovery environment running a kernel
before 4.14 will not mount the volume correctly).
Sometimes you have to break a few eggs to make an omelette, right? If
people want to recover their data they should have backups, and if they
are really interested in recovering their data (and don't have backups),
then they will probably find this on the web by searching anyway...
* You shouldn't be mounting writable and degraded for any reason other
than fixing the volume (or converting it to a single profile until you
can fix it), even aside from the other issues.
Well, in my opinion the degraded mount option is counter-intuitive.
Unless otherwise asked, the system should mount and work as long as
it can guarantee the data can be read and written somehow (regardless of
whether any redundancy guarantee is met). If the user is willing to
accept more or less risk, they should configure that themselves!