On Mon, Jun 04, 2018 at 01:12:40AM +0200, Alexander Bluhm wrote: > dounmount() does the vfs busy in the forward direction of the mnt_list. > while ((mp = TAILQ_NEXT(mp, mnt_list)) != NULL) { > error = vfs_busy(mp, VB_WRITE|VB_WAIT); > Then it unmounts all nested mount points in the reverse direction. > > So I think we should remove the _REVERSE in vfs_stall().
Let me explain the code a bit. The requirements in dounmount() are: - If a mount point gets unmounted, all sub mounts in the tree must also be unmounted. Otherwise we would have dangling unmounts. - If an USB stick gets pulled, we must unmount successfully. We cannot fail and report an error to the user. - Unmout may sleep during filesystem sync. No other process may try to mount or unmount anything in the subtree. The solution natano@ and I implemented in dounmount() does this: - Lock all mount points that are below the one we unmount. We walk the list of all mount points mnt_list and check if it is a sub mount point of something we already found. If so, vfs_busy() it and put it to the list of mount points to be unmounted. - If locking fails and we must not fail, restart from a safe point. - The new list contains all mount points, deep nested first. Umnount in forward direction. As the mnt_list is traversed in forward direction, this is the existing lock order. The algorithm is complex and I don't want to touch it. The function vfs_stall() traverses the mnt_list in reverser order and calls vfs_busy() for all mount points. There we have a lock order conflict. I don't know why it is done this way, but it does not seem to matter. So I suggest to change the direction. ok? bluhm Index: kern/vfs_subr.c =================================================================== RCS file: /data/mirror/openbsd/cvs/src/sys/kern/vfs_subr.c,v retrieving revision 1.274 diff -u -p -r1.274 vfs_subr.c --- kern/vfs_subr.c 4 Jun 2018 04:57:09 -0000 1.274 +++ kern/vfs_subr.c 4 Jun 2018 13:29:20 -0000 @@ -1605,7 +1605,7 @@ vfs_stall(struct proc *p, int stall) * The loop variable mp is protected by vfs_busy() so that it cannot * be unmounted while VFS_SYNC() sleeps. */ - TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) { + TAILQ_FOREACH(mp, &mountlist, mnt_list) { if (stall) { error = vfs_busy(mp, VB_WRITE|VB_WAIT|VB_DUPOK); if (error) {