It has been brought to my attention that the logic in mount_checkdirs() both (a) races with fork and (b) is probably compromised by the *at() syscalls.
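For reference on (b): the *at() calls resolve a relative path starting from an arbitrary open directory fd rather than from the process's cwd. Here is a minimal userland sketch of just that mechanism, assuming a POSIX.1-2008 system with a writable /tmp. The helper name demo_lookup_from_dirfd is hypothetical, and no mount is involved; it only shows that the starting point of the lookup is the fd, not the cwd.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not taken from any kernel source).
 * Returns 0 if a file inside a scratch directory can be opened with
 * openat() relative to a directory fd after the process has changed
 * its cwd to somewhere else; returns -1 on any failure.
 */
int
demo_lookup_from_dirfd(void)
{
	char dir[] = "/tmp/atdemo.XXXXXX";	/* assumes /tmp is writable */
	int dfd, fd, rv = -1;

	if (mkdtemp(dir) == NULL)
		return -1;

	/* Hold the directory open; this fd is the lookup starting point. */
	dfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dfd == -1)
		goto out_rmdir;

	/* Create a file by a name relative to dfd, not to the cwd. */
	fd = openat(dfd, "probe", O_CREAT | O_WRONLY | O_EXCL, 0600);
	if (fd == -1)
		goto out_close;
	close(fd);

	/* Move the cwd elsewhere; dfd still refers to the same directory. */
	if (chdir("/") == -1)
		goto out_unlink;

	/* The relative lookup starts at dfd, so the cwd is irrelevant here. */
	fd = openat(dfd, "probe", O_RDONLY);
	if (fd != -1) {
		close(fd);
		rv = 0;
	}

out_unlink:
	unlinkat(dfd, "probe", 0);
out_close:
	close(dfd);
out_rmdir:
	rmdir(dir);
	return rv;
}
```

If dfd had been opened on a directory that was subsequently mounted over, nothing in a cwd/root-dir fixup pass would touch it, which is the concern.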
The purpose of the code is to update the current dir and root dir of every process that was sitting on a directory that has just been mounted over, so nobody ends up sitting on an intermediate vnode in the middle of a mount stack.

However, in fork we first copy the parent's cwd structure and then assign the copy. In between, the copy is invisible to mount_checkdirs; so in theory some process's current dir (or root dir) could be skipped.

Also, since you can't chdir (or even fchdir) to the middle of a mount stack, this logic used to be sufficient to avoid using the middle of a mount stack as the starting point for a path lookup, even if someone had an fd open. But the *at() system calls break this invariant wide open, since they accept any open directory fd as the starting point for a lookup.

So the question is: does it matter? Do we actually care?

It seems to me that no great harm arises (other than perhaps some confusion) if one is sitting in the middle of a mount stack, and in fact it might even be desirable if the mount stack is a union mount. Also, it's occasionally useful to mount over things and leave a process underneath, which this logic seems to complicate.

The logic was added to 4.4 by Kirk McKusick but without much in the way of rationale:

   https://svnweb.freebsd.org/csrg/sys/kern/vfs_syscalls.c?revision=67974&view=markup

My inclination is that it is wrong: if we care about not starting lookups in the middle of mount stacks, the logic should be moved to namei; and if we don't, it should just be removed.

Thoughts?

-- 
David A. Holland
dholl...@netbsd.org