On Mon, Jun 04, 2018 at 01:12:40AM +0200, Alexander Bluhm wrote:
> dounmount() does the vfs busy in the forward direction of the mnt_list.
> while ((mp = TAILQ_NEXT(mp, mnt_list)) != NULL) {
> error = vfs_busy(mp, VB_WRITE|VB_WAIT);
> Then it unmounts all nested mount points in the reverse direction.
>
> So I think we should remove the _REVERSE in vfs_stall().
Let me explain the code a bit. The requirements in dounmount() are:
- If a mount point gets unmounted, all sub mounts in the tree must also
be unmounted. Otherwise we would have dangling unmounts.
- If an USB stick gets pulled, we must unmount successfully.
We cannot fail and report an error to the user.
- Unmout may sleep during filesystem sync. No other process may try
to mount or unmount anything in the subtree.
The solution natano@ and I implemented in dounmount() does this:
- Lock all mount points that are below the one we unmount.
We walk the list of all mount points mnt_list and check if it is
a sub mount point of something we already found. If so, vfs_busy()
it and put it to the list of mount points to be unmounted.
- If locking fails and we must not fail, restart from a safe point.
- The new list contains all mount points, deep nested first.
Umnount in forward direction.
As the mnt_list is traversed in forward direction, this is the
existing lock order. The algorithm is complex and I don't want to
touch it.
The function vfs_stall() traverses the mnt_list in reverser order
and calls vfs_busy() for all mount points. There we have a lock
order conflict. I don't know why it is done this way, but it does
not seem to matter. So I suggest to change the direction.
ok?
bluhm
Index: kern/vfs_subr.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.274
diff -u -p -r1.274 vfs_subr.c
--- kern/vfs_subr.c 4 Jun 2018 04:57:09 -0000 1.274
+++ kern/vfs_subr.c 4 Jun 2018 13:29:20 -0000
@@ -1605,7 +1605,7 @@ vfs_stall(struct proc *p, int stall)
* The loop variable mp is protected by vfs_busy() so that it cannot
* be unmounted while VFS_SYNC() sleeps.
*/
- TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) {
+ TAILQ_FOREACH(mp, &mountlist, mnt_list) {
if (stall) {
error = vfs_busy(mp, VB_WRITE|VB_WAIT|VB_DUPOK);
if (error) {