Re: degraded permanent mount option

Austin S. Hemmelgarn Mon, 29 Jan 2018 11:02:05 -0800

On 2018-01-29 12:58, Andrei Borzenkov wrote:

29.01.2018 14:24, Adam Borowski wrote:
...


So any event (the user's request) has already happened.  A rc system, of
which systemd is one, knows whether we reached the "want root filesystem" or
"want secondary filesystems" stage.  Once you're there, you can issue the
mount() call and let the kernel do the work.

It is a btrfs choice to not expose compound device as separate one (like
every other device manager does)


Btrfs is not a device manager, it's a filesystem.

it is a btrfs drawback that it doesn't provide anything else except for
this IOCTL with its logic


How can it provide you with something it doesn't yet have?  If you want the
information, call mount().  And as others in this thread have mentioned,
what, pray tell, would you want to know "would a mount succeed?" for if you
don't want to mount?

it is a btrfs drawback that there is nothing to push assembling into "OK,
going degraded" state


The way to do so is to timeout, then retry with -o degraded.


That's a possible way to solve it. This likely requires support from
mount.btrfs (or btrfs.ko) to return a proper indication that the
filesystem is incomplete, so the caller can decide whether to retry or
to try a degraded mount.

We already do so in the accepted standard manner.  If the mount fails
because of a missing device, you get a very specific message in the
kernel log about it, as is the case for most other common errors (for
uncommon ones you usually just get a generic open_ctree error).  This is
really the only option too, as the mount() syscall (which the mount
command calls) returns only 0 on success or -1 and an appropriate errno
value on failure, and we can't exactly go about creating a half dozen
new error numbers just for this (well, technically we could, but I very
much doubt that they would be accepted upstream, which defeats the
purpose).
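
To make the errno point concrete, here is a minimal Python sketch of
what any caller of mount() gets back.  The wrapper name mount_errno and
its injectable libc_mount parameter are made up for illustration (the
injection only exists so the sketch can be exercised without root); the
only real interface used is the mount(2) signature itself.

```python
import ctypes
import os


def mount_errno(source, target, fstype, options="", libc_mount=None):
    """Attempt mount(2); return 0 on success or the errno value on failure.

    libc_mount is a hypothetical injection point for testing. By default
    the C library's mount() is loaded through ctypes (Linux only).
    """
    if libc_mount is None:
        libc = ctypes.CDLL(None, use_errno=True)
        libc_mount = libc.mount
        # int mount(const char *source, const char *target,
        #           const char *filesystemtype, unsigned long mountflags,
        #           const void *data);
        libc_mount.argtypes = (
            ctypes.c_char_p,
            ctypes.c_char_p,
            ctypes.c_char_p,
            ctypes.c_ulong,
            ctypes.c_char_p,
        )
        libc_mount.restype = ctypes.c_int

    rc = libc_mount(
        os.fsencode(source),
        os.fsencode(target),
        fstype.encode(),
        0,                 # no mount flags in this sketch
        options.encode(),  # filesystem-specific option string
    )
    if rc == 0:
        return 0
    # The errno value is all the syscall reports back to the caller.
    return ctypes.get_errno()
```

Whatever comes back is a single integer.  The reason behind it (a
missing device, for instance) still has to be read from the kernel log.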


Or maybe mount.btrfs should implement this logic internally. This would
really be the simplest way to make it acceptable to the other side by
not needing to accept anything :)

And would also be another layering violation which would require a
proliferation of extra mount options to control the mount command itself
and adjust the timeout handling.

This has been done before with mount.nfs, but for slightly different
reasons (primarily to allow nested NFS mounts, since the local directory
that the filesystem is being mounted on not being present is treated
like a mount timeout), and it had near zero control.  It works there
because they push the complicated policy decisions to userspace (namely,
there is no support for retrying with different options or trying a
different server).

With what you're proposing for BTRFS however, _everything_ is a
complicated decision, namely:

1. Do you retry at all?  During boot, the answer should usually be yes,
   but during normal system operation it should normally be no (because
   we should be letting the user handle issues at that point).
2. How long should you wait before you retry?  There is no right answer
   here that will work in all cases (I've seen systems which take
   multiple minutes for devices to become available on boot), especially
   considering those of us who would rather have things fail early.
3. If the retry fails, do you retry again?  How many times before it
   just outright fails?  This is going to be system specific policy.  On
   systems where devices may take a while to come online, the answer is
   probably yes and some reasonably large number, while on systems where
   devices are known to reliably be online immediately, it makes no
   sense to retry more than once or twice.
4. If you are going to retry, should you try a degraded mount?  Again,
   this is going to be system specific policy (regular users would
   probably want this to be a yes, while people who care about data
   integrity over availability would likely want it to be a no).
5. Assuming you do retry with the degraded mount, how many times should
   a normal mount fail before things go degraded?  This ties in with 3
   and has the same arguments about variability I gave there.
6. How many times do you try a degraded mount before just giving up?
   Again, similar variability to 3.
7. Should each attempt try first a regular mount and then a degraded
   one, or do you try just normal a couple times and then switch to
   degraded, or even start out trying normal and then start alternating?
   Any of those patterns has valid arguments both for and against it, so
   this again needs to be user configurable policy.
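
To show how much policy that is, here is a rough Python sketch that
turns the seven questions above into knobs.  Nothing here exists in
btrfs-progs or the kernel: RetryPolicy, attempt_schedule,
mount_with_policy, and every default value are hypothetical, and
try_mount stands in for whatever actually performs the mount.

```python
import time
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    """Hypothetical knobs, one per question in the list above."""
    retry: bool = True            # 1. retry at all?
    wait_seconds: float = 5.0     # 2. delay between attempts
    normal_attempts: int = 3      # 3 and 5. normal tries before giving up or degrading
    allow_degraded: bool = False  # 4. fall back to a degraded mount?
    degraded_attempts: int = 1    # 6. degraded tries before giving up
    alternate: bool = False       # 7. interleave normal and degraded tries


def attempt_schedule(policy):
    """Return the ordered list of modes ("normal" or "degraded") to try."""
    if not policy.retry:
        return ["normal"]
    if not policy.allow_degraded:
        return ["normal"] * policy.normal_attempts
    if policy.alternate:
        schedule = []
        for i in range(max(policy.normal_attempts, policy.degraded_attempts)):
            if i < policy.normal_attempts:
                schedule.append("normal")
            if i < policy.degraded_attempts:
                schedule.append("degraded")
        return schedule
    return ["normal"] * policy.normal_attempts + ["degraded"] * policy.degraded_attempts


def mount_with_policy(policy, try_mount, sleep=time.sleep):
    """Run the schedule; return the mode that succeeded, or None.

    try_mount(mode) is a caller-supplied callable returning True on success.
    """
    schedule = attempt_schedule(policy)
    for index, mode in enumerate(schedule):
        if try_mount(mode):
            return mode
        if index < len(schedule) - 1:
            sleep(policy.wait_seconds)  # wait only between attempts
    return None
```

Even this toy version needs six fields to cover the seven questions, and
it still hard-codes choices (a fixed delay, only two interleaving
patterns) that some systems would want to change.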

Altogether, that's a total of 7 policy decisions that should be user
configurable.  Having a config file other than /etc/fstab for the mount
command should probably be avoided for sanity reasons (again, BTRFS is a
filesystem, not a volume manager), so they would all have to be handled
through mount options.  The kernel will additionally have to understand
that those options need to be ignored (things do try to mount
filesystems without calling a mount helper, most notably the kernel when
it mounts the root filesystem on boot if you're not using an initramfs).
All in all, this type of thing gets out of hand _very_ fast.
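
As a sketch of the option-handling problem, this is roughly what a mount
helper would have to do with an fstab option string before calling
mount().  The option names in HELPER_ONLY are invented purely for
illustration; no such options exist today.

```python
# Hypothetical helper-only option names; none of these are real.
HELPER_ONLY = {"retry_count", "retry_delay", "retry_degraded"}


def split_options(option_string):
    """Split a comma-separated option string into helper-only options
    (returned as a dict) and the string to pass on to the kernel."""
    helper = {}
    kernel = []
    for opt in filter(None, option_string.split(",")):
        key, _, value = opt.partition("=")
        if key in HELPER_ONLY:
            helper[key] = value
        else:
            kernel.append(opt)
    return helper, ",".join(kernel)
```

Anything that mounts without going through the helper skips this step
and hands the full string to the kernel, which is why the kernel would
have to know to ignore those options as well.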
