-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi,

i have an X4500 running Solaris 10 Update 5 (with all current patches).  it has
a stripe-mirror ZFS pool over 44 disks with 2 hot spare.  the system is
entirely idle, except that every 60 seconds, a 'zfs recv' is run.  a couple of
days ago, while the recv was running, i tried to destroy the pool.  the 'zfs
destroy' command hung; the 'zfs receive' also seemed to be hung.  neither would
reply to 'kill -9', and i could not do anything with the pool.  i had to reboot
the system to recover it.

i saved the core dump and have gathered some information on the state at the
time it hung.  this is the 'zfs receive' thread:

SolarisCAT(vmcore.0/10X)> thread 0xffffffff9d1c1500
==== user (LWP_SYS) thread: 0xffffffff9d1c1500  PID: 10090 ====
cmd: /sbin/zfs receive export/[EMAIL PROTECTED]
t_wchan: 0xffffffff9d1c16d6  sobj: condition var (from genunix:delay+0xa0)  
t_procp: 0xffffffffaefbc710
  p_as: 0xffffffffae4087e8  size: 4661248  RSS: 1970176
  hat: 0xffffffff9d2583a0  cpuset: 
  zone: global
t_stk: 0xfffffe8000185f10  sp: 0xfffffe80001859f0  t_stkbase: 0xfffffe8000181000
t_pri: 59(TS)  t_epri: 60  pctcpu: 0.019277
t_lwp: 0xfffffe84cc7ed8a0  lwp_regs: 0xfffffe8000185f10
  mstate: LMS_SLEEP  ms_prev: LMS_SYSTEM
  ms_state_start: 0.212252471 seconds earlier
  ms_start: 10 minutes 45.391859933 seconds earlier
psrset: 0  last CPU: 0  
idle: 0 ticks (0 seconds)
start: Fri Sep 26 21:46:06 2008
age: 645 seconds (10 minutes 45 seconds)
syscall: #54 ioctl(, 0x0) (sysent: genunix:ioctl+0x0)
tstate: TS_SLEEP - awaiting an event
tflg:   T_DFLTSTK - stack is default size
tpflg:  TP_TWAIT - wait to be freed by lwp_wait
        TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
pflag:  SKILLED - SIGKILL has been posted to the process
        SMSACCT - process is keeping micro-state accounting
        SMSFORK - child inherits micro-state accounting

pc:      unix:_resume_from_idle+0xf8 resume_return:  addq   $0x8,%rsp

unix:_resume_from_idle+0xf8 resume_return()
unix:swtch+0x12a()
genunix:cv_wait+0x68()
genunix:delay+0xa0()
zfs:dnode_special_close+0x20()
zfs:dmu_objset_evict+0x37()
zfs:dsl_dataset_evict+0x9f()
zfs:dbuf_evict_user+0x44()
zfs:dmu_buf_rele+0x179()
zfs:dsl_dataset_close+0x56()
zfs:dmu_recvbackup+0x5bc()
zfs:zfs_ioc_recvbackup+0xa3()
zfs:zfsdev_ioctl+0x13e()
genunix:cdev_ioctl+0x1d()
specfs:spec_ioctl+0x50()
genunix:fop_ioctl+0x25()
genunix:ioctl+0xac()
unix:_syscall32_save+0xbf()
- -- switch to user thread's user stack --

this is the 'zfs destroy' thread:

SolarisCAT(vmcore.0/10X)> thread 0xffffffff98c3e400
==== user (LWP_SYS) thread: 0xffffffff98c3e400  PID: 10110 ====
cmd: zfs destroy export/filebench
t_wchan: 0xffffffffb8c381d8  sobj: mutex  owner: 0xffffffff9d1c1500     
top owner is waiting for a cv: 0xffffffff9d1c1500 
t_procp: 0xffffffffaf551e50
  p_as: 0xffffffffae408628  size: 4579328  RSS: 1826816
  hat: 0xffffffff9d258270  cpuset: 
  zone: global
t_stk: 0xfffffe80001d3f10  sp: 0xfffffe80001d3a40  t_stkbase: 0xfffffe80001cf000
t_pri: 60(TS)  pctcpu: 0.000006
t_lwp: 0xfffffe84cc6d7760  lwp_regs: 0xfffffe80001d3f10
  mstate: LMS_SLEEP  ms_prev: LMS_SYSTEM
  ms_state_start: 5 minutes 6.360093741 seconds earlier
  ms_start: 5 minutes 6.366816867 seconds earlier
psrset: 0  last CPU: 2  
idle: 30615 ticks (5 minutes 6.15 seconds)
start: Fri Sep 26 21:51:45 2008
age: 306 seconds (5 minutes 6 seconds)
syscall: #54 ioctl(, 0x0) (sysent: genunix:ioctl+0x0)
tstate: TS_SLEEP - awaiting an event
tflg:   T_DFLTSTK - stack is default size
tpflg:  TP_TWAIT - wait to be freed by lwp_wait
        TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
pflag:  SKILLED - SIGKILL has been posted to the process
        SMSACCT - process is keeping micro-state accounting
        SMSFORK - child inherits micro-state accounting

pc:      unix:_resume_from_idle+0xf8 resume_return:  addq   $0x8,%rsp

unix:_resume_from_idle+0xf8 resume_return()
unix:swtch+0x12a()
genunix:turnstile_block+0x1f9()
unix:mutex_vector_enter+0x239()
zfs:dbuf_read+0x6c()
zfs:dmu_bonus_hold+0x8f()
zfs:dsl_dataset_open_obj+0x55()
zfs:dsl_dataset_open_spa+0x142()
zfs:dsl_dataset_open+0x16()
zfs:dmu_objset_open+0x49()
zfs:zfs_ioc_objset_stats+0x3d()
zfs:zfsdev_ioctl+0x13e()
genunix:cdev_ioctl+0x1d()
specfs:spec_ioctl+0x50()
genunix:fop_ioctl+0x25()
genunix:ioctl+0xac()
unix:_syscall32_save+0xbf()
- -- switch to user thread's user stack --

and this is a 'zfs list' command which was also hung:

SolarisCAT(vmcore.0/10X)> thread 0xffffffff98d78060
==== user (LWP_SYS) thread: 0xffffffff98d78060  PID: 10100 ====
cmd: zfs list
t_wchan: 0xffffffffb8c381d8  sobj: mutex  owner: 0xffffffff9d1c1500     
top owner is waiting for a cv: 0xffffffff9d1c1500 
t_procp: 0xffffffffaf553720
  p_as: 0xffffffffaf490ee0  size: 4841472  RSS: 2166784
  hat: 0xffffffffaf4f0740  cpuset: 
  zone: global
t_stk: 0xfffffe800015bf10  sp: 0xfffffe800015ba00  t_stkbase: 0xfffffe8000157000
t_pri: 60(TS)  pctcpu: 0.000001
t_lwp: 0xfffffe84cc77e850  lwp_regs: 0xfffffe800015bf10
  mstate: LMS_SLEEP  ms_prev: LMS_SYSTEM
  ms_state_start: 5 minutes 55.321220812 seconds earlier
  ms_start: 5 minutes 55.331209204 seconds earlier
psrset: 0  last CPU: 2  
idle: 35511 ticks (5 minutes 55.11 seconds)
start: Fri Sep 26 21:50:56 2008
age: 355 seconds (5 minutes 55 seconds)
syscall: #54 ioctl(, 0x0) (sysent: genunix:ioctl+0x0)
tstate: TS_SLEEP - awaiting an event
tflg:   T_DFLTSTK - stack is default size
tpflg:  TP_TWAIT - wait to be freed by lwp_wait
        TP_MSACCT - collect micro-state accounting information
tsched: TS_LOAD - thread is in memory
        TS_DONT_SWAP - thread/LWP should not be swapped
pflag:  SMSACCT - process is keeping micro-state accounting
        SMSFORK - child inherits micro-state accounting

pc:      unix:_resume_from_idle+0xf8 resume_return:  addq   $0x8,%rsp

unix:_resume_from_idle+0xf8 resume_return()
unix:swtch+0x12a()
genunix:turnstile_block+0x1f9()
unix:mutex_vector_enter+0x239()
zfs:dbuf_read+0x6c()
zfs:dmu_bonus_hold+0x8f()
zfs:dsl_dataset_open_obj+0x55()
zfs:dsl_dataset_open_spa+0x142()
zfs:dsl_dataset_open+0x16()
zfs:dmu_objset_open+0x49()
zfs:zfs_ioc_objset_stats+0x3d()
zfs:zfs_ioc_dataset_list_next+0x13e()
zfs:zfsdev_ioctl+0x13e()
genunix:cdev_ioctl+0x1d()
specfs:spec_ioctl+0x50()
genunix:fop_ioctl+0x25()
genunix:ioctl+0xac()
unix:_syscall32_save+0xbf()
- -- switch to user thread's user stack --

the list and destroy are both waiting on the same mtx:

SolarisCAT(vmcore.0/10X)> mutex -l 0xffffffffb8c381d8
adaptive mutex:  owner: 0xffffffff9d1c1500  waiters: true
turnstile @ 0xffffffff9884f240
ts_next: 0x0  ts_free: 0xffffffff9879bec8  ts_sobj: 0xffffffffb8c381d8
ts_waiters: 2  ts_epri: 60  ts_inheritor: 0xffffffff9d1c1500
ts_prioinv: 0x0
writer sleepq:
  thread             pri pctcpu           idle   PID              wchan command
  0xffffffff98d78060  60  0.000      -5m55.11s 10100 0xffffffffb8c381d8 zfs list
  0xffffffff98c3e400  60  0.000       -5m6.15s 10110 0xffffffffb8c381d8 zfs 
destroy export/filebench
reader sleepq: NULL

while the recv waits on a cv:

SolarisCAT(vmcore.0/10X)> cv 0xffffffff9d1c16d6
cv_waiters: 1
sleepq_head @ 0xfffffffffbcd1ac0, sq_lock: 0x0
  thread             pri pctcpu           idle   PID              wchan command
  0xffffffff9d1c1500  60  0.019             0s 10090 0xffffffff9d1c16d6 
/sbin/zfs receive export/[EMAIL PROTECTED]

now, i can't be entirely sure that the system was really hung rather than
blocking on the 'zfs recv' (which often takes a very long time, even for a
smallish stream); however, i waited at least 5 minutes, which is usually enough
for the recv to complete.  

i realise this is an OpenSolaris forum, but i assume people won't mind reports
of possibly serious bugs...

        - river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (SunOS)

iEYEARECAAYFAkjhmJgACgkQIXd7fCuc5vIvqwCgiTSBW0RBzvju+Jc1DXeonvgh
goQAnjCZrZd2dEGE5kxKeFflO0Dazira
=kHGP
-----END PGP SIGNATURE-----
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Reply via email to