On Mon, Jun 04, 2018 at 01:12:40AM +0200, Alexander Bluhm wrote:
> dounmount() does the vfs busy in the forward direction of the mnt_list.
>         while ((mp = TAILQ_NEXT(mp, mnt_list)) != NULL) {
>                         error = vfs_busy(mp, VB_WRITE|VB_WAIT);
> Then it unmounts all nested mount points in the reverse direction.
> 
> So I think we should remove the _REVERSE in vfs_stall().

Let me explain the code a bit.  The requirements in dounmount() are:
- If a mount point gets unmounted, all sub mounts in the tree must also
  be unmounted.  Otherwise we would have dangling unmounts.
- If an USB stick gets pulled, we must unmount successfully.
  We cannot fail and report an error to the user.
- Unmout may sleep during filesystem sync.  No other process may try
  to mount or unmount anything in the subtree.

The solution natano@ and I implemented in dounmount() does this:
- Lock all mount points that are below the one we unmount.
  We walk the list of all mount points mnt_list and check if it is
  a sub mount point of something we already found.  If so, vfs_busy()
  it and put it to the list of mount points to be unmounted.
- If locking fails and we must not fail, restart from a safe point.
- The new list contains all mount points, deep nested first.
  Umnount in forward direction.

As the mnt_list is traversed in forward direction, this is the
existing lock order.  The algorithm is complex and I don't want to
touch it.

The function vfs_stall() traverses the mnt_list in reverser order
and calls vfs_busy() for all mount points.  There we have a lock
order conflict.  I don't know why it is done this way, but it does
not seem to matter.  So I suggest to change the direction.

ok?

bluhm

Index: kern/vfs_subr.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.274
diff -u -p -r1.274 vfs_subr.c
--- kern/vfs_subr.c     4 Jun 2018 04:57:09 -0000       1.274
+++ kern/vfs_subr.c     4 Jun 2018 13:29:20 -0000
@@ -1605,7 +1605,7 @@ vfs_stall(struct proc *p, int stall)
         * The loop variable mp is protected by vfs_busy() so that it cannot
         * be unmounted while VFS_SYNC() sleeps.
         */
-       TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) {
+       TAILQ_FOREACH(mp, &mountlist, mnt_list) {
                if (stall) {
                        error = vfs_busy(mp, VB_WRITE|VB_WAIT|VB_DUPOK);
                        if (error) {

Reply via email to