On Sun, Jun 19 2022, Visa Hankala <v...@hankala.org> wrote:
> On Sun, Jun 19, 2022 at 11:05:38AM +0200, Jeremie Courreges-Anglas wrote:
>> On Fri, Jun 17 2022, Jeremie Courreges-Anglas <j...@wxcvbn.org> wrote:
>> > On Thu, Jun 16 2022, Visa Hankala <v...@hankala.org> wrote:
>> >> nfs_inactive() has a lock order reversal. When it removes the silly
>> >> file, it locks the directory vnode while it already holds the lock
>> >> of the argument file vnode. This clashes for example with name lookups
>> >> where directory vnodes are locked before file vnodes.
>> >>
>> >> The reversal can cause a deadlock when an NFS client has multiple
>> >> processes that create, modify and remove files in the same
>> >> NFS directory.
>> >>
>> >> The following patch makes the silly file removal happen after
>> >> nfs_inactive() has released the file vnode lock. This should be safe
>> >> because the silly file removal is independent of nfs_inactive()'s
>> >> argument vnode.
>>
>> The diff makes sense to me.  Did you spot it reviewing the code, or
>> using WITNESS?
>
> I noticed it by code review.
>
> WITNESS is somewhat helpless with vnode locks because they can involve
> multiple levels of lock nesting. In fact, the order checking between
> vnodes has been disabled by initializing the locks with RWL_IS_VNODE.
> To fix this, the kernel would have to pass nesting information around
> the filesystem code.
I see, thanks for confirming.

> This particular deadlock can be triggered for example by quickly
> writing and removing temporary files in an NFS directory using one
> process while another process lists the directory contents repeatedly.
>
>> >> Could some NFS users test this?
>> >
>> > I'm running this diff on the riscv64 build cluster, since 1h25mn with no
>> > hang.  Let's see how it goes.
>>
>> This run did finish properly yesterday.
>>
>> > This cluster doesn't use NFS as much as it could (build logs stored
>> > locally) but I can try that in the next build.
>>
>> So I have restarted a build with this diff and dpb(1) logging on an
>> NFS-mounted /usr/ports/logs.  I get a dpb(1) hang after 1400+ packages
>> built.  Any other attempt to access the NFS-mounted filesystem results
>> in a hang.  Let me know if I can extract more data from the system.
>
> No need this time. Those wait messages give some pointers.

Another hang, slightly different excerpt below.

>> shannon ~$ grep nfs riscv64/nfs-hang.txt
>>  97293   72036  49335      0  3        0x91  nfs_fsync     perl
>>  69045   83700  64026     55  3        0x82  nfs_fsync     c++
>>  80365   37354  15104     55  3    0x100082  nfs_fsync     make
>>  28876  139812  59322     55  3    0x100082  nfs_fsync     make
>>   6160  193238  61541   1000  3    0x100003  nfsnode       ksh
>>   7535  421732      0      0  3     0x14280  nfsrcvlk      nfsio
>>  70437  237308      0      0  3     0x14280  nfsrcvlk      nfsio
>>  97073  406345      0      0  3     0x14200  nfsrcvlk      nfsio
>>  88487  390804      0      0  3     0x14200  nfsrcvlk      nfsio
>>  58945   91139  92962      0  3        0x80  nfsd          nfsd
>>  75619  357314  92962      0  3        0x80  nfsd          nfsd
>>  39027  137228  92962      0  3        0x80  nfsd          nfsd
>>  22028  406380  92962      0  3        0x80  nfsd          nfsd
>>  92962   11420      1      0  3        0x80  netcon        nfsd
>>  90467  310188      0      0  3     0x14280  nfsrcvlk      update

shannon ~$ grep -e nfs riscv64/nfs-hang-2.txt
 54340   24206  19639   1000  3    0x100003  nfsnode       ksh
 43709   63222  55473     55  3        0x82  nfs_fsync     cmake
 27658  186996  47176     55  3    0x100082  nfs_fsync     make
 38397  435956  77676     55  3        0x82  nfsrcvlk      c++
 37625  117156  96553      0  3    0x100003  nfsnode       ssh
 96553  293049   2676      0  3        0x93  nfsrcvlk      perl
 72080  505789      0      0  3     0x14280  nfsrcvlk      nfsio
 68474  378836      0      0  3     0x14200  nfsrcvlk      nfsio
 98412   13475      0      0  3     0x14280  nfsrcvlk      nfsio
 41393  345501      0      0  3     0x14200  netio         nfsio
 75129  466227  20392      0  3        0x80  nfsd          nfsd
  7762  364728  20392      0  3        0x80  nfsd          nfsd
 65425  254516  20392      0  3        0x80  nfsd          nfsd
  8062  263249  20392      0  3        0x80  nfsd          nfsd
 20392  248868      1      0  3        0x80  netcon        nfsd
 96499  295545      0      0  3     0x14280  nfsrcvlk      update
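For the archives, the shape of the reordering visa@ describes would be
roughly as follows. This is only a sketch pieced together from the
description above and my reading of sys/nfs/nfs_node.c, not the actual
diff; the identifiers (nfs_vinvalbuf(), nfs_removeit(), M_NFSREQ, the
n_flag bits) may not match the committed change line for line.

/*
 * Sketch: detach the sillyrename state while the file vnode is still
 * locked, but defer the removal itself, which locks the directory
 * vnode, until after VOP_UNLOCK().  That way nfs_inactive() never
 * waits for a directory vnode lock while holding a file vnode lock,
 * which is the reversal that could deadlock against name lookups.
 */
int
nfs_inactive(void *v)
{
	struct vop_inactive_args *ap = v;
	struct nfsnode *np = VTONFS(ap->a_vp);
	struct sillyrename *sp = NULL;

	if (ap->a_vp->v_type != VDIR) {
		sp = np->n_sillyrename;
		np->n_sillyrename = NULL;
	}
	if (sp != NULL)
		nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, curproc);
	np->n_flag &= ~(NWRITEERR | NACC | NUPD | NCHG);

	/* Drop the file vnode lock first... */
	VOP_UNLOCK(ap->a_vp);

	/* ...then remove the silly file; this locks the directory vnode. */
	if (sp != NULL) {
		nfs_removeit(sp);
		crfree(sp->s_cred);
		vrele(sp->s_dvp);
		free(sp, M_NFSREQ, sizeof(*sp));
	}
	return (0);
}

With that ordering the silly file removal can still block on the
directory vnode, but nfs_inactive() no longer holds the file vnode lock
at that point, so it cannot deadlock against lookups that lock the
directory before the file.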