Hi, I'm replicating some zfs file systems between some servers. Recently from time to time a server which is doing zfs recv hangs all operations while writing to zfs pool. zfs recv is "stuck" so is an attempt to zfs create xxx or sync. Creating a new file system or files in other pool on the same server works fine.
All vdevs and pools are healthy and there are no other issues. It looks like some kind of deadlock. IIRC there was some kind of a 3-way deadlock but it was fixed in b111b (or a new code which introduced it was backed out) so it is probably something else. I can provide a forced crashdump if someone is interested to help... :) x4500 with Open Solaris snv_111b SolarisCAT(live/11X)> ps -ef | egrep "zfs|sync" root 24820 24819 0 13:40:45 ? 0:00 bash -c /usr/local/bin/mbuffer -s 65536 -m 1024M | pfexec /usr/sbin/zfs recv root 25356 25326 0 13:50:56 pts/8 0:00 zfs create archive-2/tmp root 24822 24820 0 13:40:45 ? 0:13 /usr/sbin/zfs recv -dF -v archive-2/repl root 25576 25567 0 13:54:04 pts/9 0:00 sync SolarisCAT(live/11X)> tlist -l proc 24822 ==== user (LWP_SYS) thread: 0xffffff04e6604080 PID: 24822 ==== cmd: /usr/sbin/zfs recv -dF -v archive-2/repl fmri: svc:/network/shell:default t_wchan: 0xffffff04de46055e sobj: condition var (from zfs:txg_wait_open+0x7a) t_procp: 0xffffff05a26f16e8 p_as: 0xffffff04e51c7548 size: 5472256 RSS: 2707456 hat: 0xffffff04dbe9c6e8 cpuset: zone: global t_stk: 0xffffff001f484f10 sp: 0xffffff001f484620 t_stkbase: 0xffffff001f480000 t_pri: 59(TS) pctcpu: 0.000000 t_lwp: 0xffffff04e5f216c0 lwp_regs: 0xffffff001f484f10 mstate: LMS_SLEEP ms_prev: LMS_SYSTEM ms_state_start: 11 minutes 9.816633850 seconds earlier ms_start: 12 minutes 9.081459673 seconds earlier psrset: 0 last CPU: 2 idle: 79631 ticks (13 minutes 16.31 seconds) start: Thu Jun 11 13:40:45 2009 age: 855 seconds (14 minutes 15 seconds) syscall: #54 ioctl(, 0x0) () tstate: TS_SLEEP - awaiting an event tflg: T_DFLTSTK - stack is default size tpflg: TP_TWAIT - wait to be freed by lwp_wait TP_MSACCT - collect micro-state accounting information tsched: TS_LOAD - thread is in memory TS_DONT_SWAP - thread/LWP should not be swapped pflag: SMSACCT - process is keeping micro-state accounting SMSFORK - child inherits micro-state accounting pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp unix:_resume_from_idle+0xf1 resume_return() unix:swtch+0x147() genunix:cv_wait+0x61() zfs:txg_wait_open+0x7a() zfs:dmu_tx_wait+0xb3() zfs:dmu_tx_assign+0x4b() zfs:dmu_free_long_range_impl+0x12a() zfs:dmu_free_long_range+0x5b() zfs:dmu_object_reclaim+0x112() zfs:restore_object+0xff() zfs:dmu_recv_stream+0x48d() zfs:zfs_ioc_recv+0x2c0() zfs:zfsdev_ioctl+0x10b() genunix:cdev_ioctl+0x45() specfs:spec_ioctl+0x83() genunix:fop_ioctl+0x7b() genunix:ioctl+0x18e() unix:_syscall32_save+0xbf() -- switch to user thread's user stack -- 1 thread for that process found. SolarisCAT(live/11X)> tlist -l proc 25356 ==== user (LWP_SYS) thread: 0xffffff05a36bae00 PID: 25356 ==== cmd: zfs create archive-2/tmp fmri: svc:/network/ssh:default t_wchan: 0xffffff04de46055a sobj: condition var (from zfs:txg_wait_synced+0x7f) t_procp: 0xffffff051e5996c8 p_as: 0xffffff051a234e38 size: 7806976 RSS: 2531328 hat: 0xffffff052f664448 cpuset: zone: global t_stk: 0xffffff001eccbf10 sp: 0xffffff001eccb990 t_stkbase: 0xffffff001ecc7000 t_pri: 59(TS) pctcpu: 0.000000 t_lwp: 0xffffff05d9201220 lwp_regs: 0xffffff001eccbf10 mstate: LMS_SLEEP ms_prev: LMS_SYSTEM ms_state_start: 2 minutes 2.018084726 seconds earlier ms_start: 2 minutes 2.064748896 seconds earlier psrset: 0 last CPU: 2 idle: 24851 ticks (4 minutes 8.51 seconds) start: Thu Jun 11 13:50:56 2009 age: 248 seconds (4 minutes 8 seconds) syscall: #54 ioctl(, 0x0) () tstate: TS_SLEEP - awaiting an event tflg: T_DFLTSTK - stack is default size tpflg: TP_TWAIT - wait to be freed by lwp_wait TP_MSACCT - collect micro-state accounting information tsched: TS_LOAD - thread is in memory TS_DONT_SWAP - thread/LWP should not be swapped pflag: SMSACCT - process is keeping micro-state accounting SMSFORK - child inherits micro-state accounting pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp unix:_resume_from_idle+0xf1 resume_return() unix:swtch+0x147() genunix:cv_wait+0x61() zfs:txg_wait_synced+0x7f() zfs:dsl_sync_task_group_wait+0xee() zfs:dsl_sync_task_do+0x65() zfs:dmu_objset_create+0x142() zfs:zfs_ioc_create+0x1e7() zfs:zfsdev_ioctl+0x10b() genunix:cdev_ioctl+0x45() specfs:spec_ioctl+0x83() genunix:fop_ioctl+0x7b() genunix:ioctl+0x18e() unix:_syscall32_save+0xbf() -- switch to user thread's user stack -- 1 thread for that process found. SolarisCAT(live/11X)> tlist -l proc 25576 ==== user (LWP_SYS) thread: 0xffffff05a3246ac0 PID: 25576 ==== cmd: sync fmri: svc:/network/ssh:default t_wchan: 0xffffff04de46055a sobj: condition var (from zfs:txg_wait_synced+0x7f) t_procp: 0xffffff05a2657a90 p_as: 0xffffff051e9f1c60 size: 4165632 RSS: 1081344 hat: 0xffffff051e8977a8 cpuset: zone: global t_stk: 0xffffff001f96bf10 sp: 0xffffff001f96bd50 t_stkbase: 0xffffff001f967000 t_pri: 59(TS) pctcpu: 0.000000 t_lwp: 0xffffff05d778a280 lwp_regs: 0xffffff001f96bf10 mstate: LMS_SLEEP ms_prev: LMS_SYSTEM ms_state_start: 56.644445127 seconds later ms_start: 56.641730136 seconds later psrset: 0 last CPU: 2 idle: 6985 ticks (1 minutes 9.85 seconds) start: Thu Jun 11 13:54:03 2009 age: 70 seconds (1 minutes 10 seconds) syscall: #36 sync(, 0x0) () tstate: TS_SLEEP - awaiting an event tflg: T_DFLTSTK - stack is default size tpflg: TP_TWAIT - wait to be freed by lwp_wait TP_MSACCT - collect micro-state accounting information tsched: TS_LOAD - thread is in memory TS_DONT_SWAP - thread/LWP should not be swapped pflag: SMSACCT - process is keeping micro-state accounting SMSFORK - child inherits micro-state accounting pc: unix:_resume_from_idle+0xf1 resume_return: addq $0x8,%rsp unix:_resume_from_idle+0xf1 resume_return() unix:swtch+0x147() genunix:cv_wait+0x61() zfs:txg_wait_synced+0x7f() zfs:spa_sync_allpools+0x76() zfs:zfs_sync+0xce() genunix:vfs_sync+0x9c() genunix:syssync+0xb() unix:_syscall32_save+0xbf() -- switch to user thread's user stack -- 1 thread for that process found. SolarisCAT(live/11X)> -- This message posted from opensolaris.org