Pawel, I'm not quite sure I understand why thread #1 below is stalled. Is there only a single thread available for IO completion?
-Mark

Pawel Jakub Dawidek wrote:
> Hi.
>
> I'm observing the following deadlock.
>
> One thread holds the zfsvfs->z_hold_mtx[i] lock and waits for I/O:
>
> Tracing pid 1188 tid 100114 td 0xc41b1660
> sched_switch(c41b1660,0,1,180,9bbb0333,...) at sched_switch+0x1b6
> mi_switch(1,0,c0666ae9,1bf,c41b1660,...) at mi_switch+0x1ee
> sleepq_switch(c41b1660,0,c0666ae9,21d,c41b1660,...) at sleepq_switch+0xf0
> sleepq_wait(de465f94,de465f7c,c43fc60a,1,0,...) at sleepq_wait+0x60
> _cv_wait(de465f94,de465f7c,c43fc4cd,34f,0,...) at _cv_wait+0x1fc
> zio_wait(de465d68,c43ea3c5,240,4000,0,...) at zio_wait+0x99
> dbuf_read(d3a39594,de465d68,2,c43ee6df,d2e09900,...) at dbuf_read+0x2e7
> dnode_hold_impl(c4bbf000,a075,0,1,c43ec318,...) at dnode_hold_impl+0x15a
> dnode_hold(c4bbf000,a075,0,c43ec318,f899a758,...) at dnode_hold+0x35
> dmu_bonus_hold(c480e540,a075,0,0,f899a7a0,...) at dmu_bonus_hold+0x31
> zfs_zget(c3f4f000,a075,0,f899a89c,1,...) at zfs_zget+0x7a
> zfs_dirent_lock(f899a8a0,c4d18c64,f899a928,f899a89c,6,...) at zfs_dirent_lock+0x619
> zfs_dirlook(c4d18c64,f899a928,f899ac20,0,0,...) at zfs_dirlook+0x272
> zfs_lookup(cbef4414,f899a928,f899ac20,f899ac34,2,...) at zfs_lookup+0x2df
> zfs_freebsd_lookup(f899aa70,c0685c24,cbef4414,cbef4414,f899ac34,...) at zfs_freebsd_lookup+0xdd
> VOP_CACHEDLOOKUP_APV(c4405800,f899aa70,f899ac34,f899ac20,c3f65200,...) at VOP_CACHEDLOOKUP_APV+0xc5
> vfs_cache_lookup(f899aaf4,c066e847,cbef4414,1,cbef4414,...) at vfs_cache_lookup+0xd3
> VOP_LOOKUP_APV(c4405800,f899aaf4,c41b1660,c066d7b8,19e,...) at VOP_LOOKUP_APV+0xe5
> lookup(f899ac0c,c066d7b8,c9,bf,ca5b032c,...) at lookup+0x53e
> namei(f899ac0c,0,c0679402,8d6,c065f266,...) at namei+0x2fe
> kern_rename(c41b1660,2820d040,2820c040,0,f899ad2c,...) at kern_rename+0x4a
> rename(c41b1660,f899acfc,8,c06758bf,c06924e0,...) at rename+0x29
> syscall(f899ad38) at syscall+0x283
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (128, FreeBSD ELF32, rename), eip = 0x280d1cdb, esp = 0xbfbfe7fc, ebp = 0xbfbfea48 ---
>
> Another thread tries to finish the I/O, but can't, because it is trying to
> acquire the zfsvfs->z_hold_mtx[i] lock:
>
> Tracing command spa_zio_intr_2 pid 1117 tid 100020 td 0xc3b38440
> sched_switch(c3b38440,0,1,180,9d2b0e9f,...) at sched_switch+0x1b6
> mi_switch(1,0,c0666ae9,1bf,c3b38440,...) at mi_switch+0x1ee
> sleepq_switch(c3b38440,0,c0666ae9,21d,c3f4f6a4,...) at sleepq_switch+0xf0
> sleepq_wait(c3f4f6a4,0,c43fba6f,3,0,...) at sleepq_wait+0x60
> _sx_xlock_hard(c3f4f6a4,c3b38440,0,c43fb80c,31e,...) at _sx_xlock_hard+0x286
> _sx_xlock(c3f4f6a4,0,c43fb80c,31e,c7af015c,...) at _sx_xlock+0xb8
> zfs_zinactive(d0d00988,0,c4401326,e27,d0d00988,...) at zfs_zinactive+0xa2
> zfs_inactive(c7af015c,c3aa9600,0,c7af015c,f8aa2888,...) at zfs_inactive+0x307
> zfs_freebsd_inactive(f8aa28a0,c06856cc,c7af0230,c7af0230,c7af015c,...) at zfs_freebsd_inactive+0x32
> VOP_INACTIVE_APV(c4405800,f8aa28a0,c066e0c7,8e9,c06b5120,...) at VOP_INACTIVE_APV+0xc5
> vinactive(c7af0230,0,c066e0c7,859,858,...) at vinactive+0xb1
> vrele(c7af015c,c71317d0,f8aa2924,c04d16b0,c71317d0,...) at vrele+0x18f
> zfs_get_done(c5832d8c,c71317d0,292,28d,d2387b70,...) at zfs_get_done+0xad
> dmu_sync_done(d11e56b4,ccc323c0,c6940930,0,0,...) at dmu_sync_done+0x1f0
> arc_write_done(d11e56b4,d4863b2c,e38a2208,4d3,d11d68f0,...) at arc_write_done+0x44a
> zio_done(d11e56b4,c44033e0,f8aa2a38,c434906a,d13c9e00,...) at zio_done+0xb2
> zio_next_stage(d11e56b4,d11e56f8,80,0,f8aa2a68,...) at zio_next_stage+0x236
> zio_assess(d11e56b4,c3b384d8,c069938c,c43fc4cd,d11e58c8,...) at zio_assess+0x843
> zio_next_stage(d11e56b4,c43fc4cd,36d,36a,f8aa2b44,...) at zio_next_stage+0x236
> zio_wait_for_children(d11e56b4,12,d11e58bc,f8aa2b8c,c436cc76,...) at zio_wait_for_children+0x99
> zio_wait_children_done(d11e56b4,c0a575a0,c3b384d8,c0697e04,c0679402,...) at zio_wait_children_done+0x25
> zio_next_stage(d11e56b4,c1074808,0,c0679402,8d6,...) at zio_next_stage+0x236
> zio_vdev_io_assess(d11e56b4,c44033e0,40,cef4def8,c3,...) at zio_vdev_io_assess+0x2a7
> zio_next_stage(d11e56b4,c3b38440,f8aa2c3c,246,c0a116b4,...) at zio_next_stage+0x236
> vdev_mirror_io_done(d11e56b4,f8aa2cf8,c42c94de,d11e56b4,0,...) at vdev_mirror_io_done+0x113
> zio_vdev_io_done(d11e56b4,0,c43e6f2d,33d,c442ec14,...) at zio_vdev_io_done+0x21
> taskq_thread(c442ebf4,f8aa2d38,c065fcb9,31e,c3b37aa0,...) at taskq_thread+0x29e
> fork_exit(c42c9240,c442ebf4,f8aa2d38) at fork_exit+0xb8
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xf8aa2d70, ebp = 0 ---
>
> As you can see, the SPA ZIO INTR thread wants to finish I/O requests and
> calls arc_write_done(). In the end we call vn_rele(), which may end up
> calling zfs_zinactive() if this was the last vnode reference.
> zfs_zinactive() tries to acquire the zfsvfs->z_hold_mtx[i] lock, but
> can't, because it is already held by the thread waiting for the I/O.
> Neither can the I/O be finished, nor can the lock be released.
>
> I started seeing this deadlock after the last integration, so it was
> introduced more or less during the last month. Any ideas how to solve it?
>
> _______________________________________________
> zfs-code mailing list
> zfs-code at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-code