Robert Milkowski wrote: > Hi, > > I'm replicating some zfs file systems between some servers. > Recently from time to time a server which is doing zfs recv hangs all > operations while writing to zfs pool. zfs recv is "stuck" so is an attempt to > zfs create xxx or sync. > Creating a new file system or files in other pool on the same server works > fine. > > All vdevs and pools are healthy and there are no other issues. > It looks like some kind of deadlock. IIRC there was some kind of a 3-way > deadlock but it was fixed in b111b (or a new code which introduced it was > backed out) so it is probably something else. > > I can provide a forced crashdump if someone is interested to help... :) > >
Sounds like 6826836, introduced in 111, fixed in 113. -tim > x4500 with Open Solaris snv_111b > > SolarisCAT(live/11X)> ps -ef | egrep "zfs|sync" > root 24820 24819 0 13:40:45 ? 0:00 bash -c > /usr/local/bin/mbuffer -s 65536 -m 1024M | pfexec /usr/sbin/zfs recv > root 25356 25326 0 13:50:56 pts/8 0:00 zfs create archive-2/tmp > root 24822 24820 0 13:40:45 ? 0:13 /usr/sbin/zfs recv -dF -v > archive-2/repl > root 25576 25567 0 13:54:04 pts/9 0:00 sync > SolarisCAT(live/11X)> tlist -l proc 24822 > ==== user (LWP_SYS) thread: 0xffffff04e6604080 PID: 24822 ==== > cmd: /usr/sbin/zfs recv -dF -v archive-2/repl > fmri: svc:/network/shell:default > t_wchan: 0xffffff04de46055e sobj: condition var (from > zfs:txg_wait_open+0x7a) > t_procp: 0xffffff05a26f16e8 > p_as: 0xffffff04e51c7548 size: 5472256 RSS: 2707456 > hat: 0xffffff04dbe9c6e8 > cpuset: > zone: global > t_stk: 0xffffff001f484f10 sp: 0xffffff001f484620 t_stkbase: > 0xffffff001f480000 > t_pri: 59(TS) pctcpu: 0.000000 > t_lwp: 0xffffff04e5f216c0 lwp_regs: 0xffffff001f484f10 > mstate: LMS_SLEEP ms_prev: LMS_SYSTEM > ms_state_start: 11 minutes 9.816633850 seconds earlier > ms_start: 12 minutes 9.081459673 seconds earlier > psrset: 0 last CPU: 2 > idle: 79631 ticks (13 minutes 16.31 seconds) > start: Thu Jun 11 13:40:45 2009 > age: 855 seconds (14 minutes 15 seconds) > syscall: #54 ioctl(, 0x0) () > tstate: TS_SLEEP - awaiting an event > tflg: T_DFLTSTK - stack is default size > tpflg: TP_TWAIT - wait to be freed by lwp_wait > TP_MSACCT - collect micro-state accounting information > tsched: TS_LOAD - thread is in memory > TS_DONT_SWAP - thread/LWP should not be swapped > pflag: SMSACCT - process is keeping micro-state accounting > SMSFORK - child inherits micro-state accounting > > pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp > > unix:_resume_from_idle+0xf1 resume_return() > unix:swtch+0x147() > genunix:cv_wait+0x61() > zfs:txg_wait_open+0x7a() > zfs:dmu_tx_wait+0xb3() > zfs:dmu_tx_assign+0x4b() > zfs:dmu_free_long_range_impl+0x12a() > zfs:dmu_free_long_range+0x5b() > zfs:dmu_object_reclaim+0x112() > zfs:restore_object+0xff() > zfs:dmu_recv_stream+0x48d() > zfs:zfs_ioc_recv+0x2c0() > zfs:zfsdev_ioctl+0x10b() > genunix:cdev_ioctl+0x45() > specfs:spec_ioctl+0x83() > genunix:fop_ioctl+0x7b() > genunix:ioctl+0x18e() > unix:_syscall32_save+0xbf() > -- switch to user thread's user stack -- > > > 1 thread for that process found. > > SolarisCAT(live/11X)> tlist -l proc 25356 > ==== user (LWP_SYS) thread: 0xffffff05a36bae00 PID: 25356 ==== > cmd: zfs create archive-2/tmp > fmri: svc:/network/ssh:default > t_wchan: 0xffffff04de46055a sobj: condition var (from > zfs:txg_wait_synced+0x7f) > t_procp: 0xffffff051e5996c8 > p_as: 0xffffff051a234e38 size: 7806976 RSS: 2531328 > hat: 0xffffff052f664448 > cpuset: > zone: global > t_stk: 0xffffff001eccbf10 sp: 0xffffff001eccb990 t_stkbase: > 0xffffff001ecc7000 > t_pri: 59(TS) pctcpu: 0.000000 > t_lwp: 0xffffff05d9201220 lwp_regs: 0xffffff001eccbf10 > mstate: LMS_SLEEP ms_prev: LMS_SYSTEM > ms_state_start: 2 minutes 2.018084726 seconds earlier > ms_start: 2 minutes 2.064748896 seconds earlier > psrset: 0 last CPU: 2 > idle: 24851 ticks (4 minutes 8.51 seconds) > start: Thu Jun 11 13:50:56 2009 > age: 248 seconds (4 minutes 8 seconds) > syscall: #54 ioctl(, 0x0) () > tstate: TS_SLEEP - awaiting an event > tflg: T_DFLTSTK - stack is default size > tpflg: TP_TWAIT - wait to be freed by lwp_wait > TP_MSACCT - collect micro-state accounting information > tsched: TS_LOAD - thread is in memory > TS_DONT_SWAP - thread/LWP should not be swapped > pflag: SMSACCT - process is keeping micro-state accounting > SMSFORK - child inherits micro-state accounting > > pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp > > unix:_resume_from_idle+0xf1 resume_return() > unix:swtch+0x147() > genunix:cv_wait+0x61() > zfs:txg_wait_synced+0x7f() > zfs:dsl_sync_task_group_wait+0xee() > zfs:dsl_sync_task_do+0x65() > zfs:dmu_objset_create+0x142() > zfs:zfs_ioc_create+0x1e7() > zfs:zfsdev_ioctl+0x10b() > genunix:cdev_ioctl+0x45() > specfs:spec_ioctl+0x83() > genunix:fop_ioctl+0x7b() > genunix:ioctl+0x18e() > unix:_syscall32_save+0xbf() > -- switch to user thread's user stack -- > > > 1 thread for that process found. > > SolarisCAT(live/11X)> tlist -l proc 25576 > ==== user (LWP_SYS) thread: 0xffffff05a3246ac0 PID: 25576 ==== > cmd: sync > fmri: svc:/network/ssh:default > t_wchan: 0xffffff04de46055a sobj: condition var (from > zfs:txg_wait_synced+0x7f) > t_procp: 0xffffff05a2657a90 > p_as: 0xffffff051e9f1c60 size: 4165632 RSS: 1081344 > hat: 0xffffff051e8977a8 > cpuset: > zone: global > t_stk: 0xffffff001f96bf10 sp: 0xffffff001f96bd50 t_stkbase: > 0xffffff001f967000 > t_pri: 59(TS) pctcpu: 0.000000 > t_lwp: 0xffffff05d778a280 lwp_regs: 0xffffff001f96bf10 > mstate: LMS_SLEEP ms_prev: LMS_SYSTEM > ms_state_start: 56.644445127 seconds later > ms_start: 56.641730136 seconds later > psrset: 0 last CPU: 2 > idle: 6985 ticks (1 minutes 9.85 seconds) > start: Thu Jun 11 13:54:03 2009 > age: 70 seconds (1 minutes 10 seconds) > syscall: #36 sync(, 0x0) () > tstate: TS_SLEEP - awaiting an event > tflg: T_DFLTSTK - stack is default size > tpflg: TP_TWAIT - wait to be freed by lwp_wait > TP_MSACCT - collect micro-state accounting information > tsched: TS_LOAD - thread is in memory > TS_DONT_SWAP - thread/LWP should not be swapped > pflag: SMSACCT - process is keeping micro-state accounting > SMSFORK - child inherits micro-state accounting > > pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp > > unix:_resume_from_idle+0xf1 resume_return() > unix:swtch+0x147() > genunix:cv_wait+0x61() > zfs:txg_wait_synced+0x7f() > zfs:spa_sync_allpools+0x76() > zfs:zfs_sync+0xce() > genunix:vfs_sync+0x9c() > genunix:syssync+0xb() > unix:_syscall32_save+0xbf() > -- switch to user thread's user stack -- > > > 1 thread for that process found. > > SolarisCAT(live/11X)>