John Baldwin wrote:
On Thursday 26 February 2009 4:22:15 pm Guy Helmer wrote:
db> show sleepchain 23110
thread 100181 (pid 23110, vmstat) blocked on sx "user map" XLOCK
thread 100208 (pid 23092, kvoop) is on a run queue
db> show sleepchain 23092
thread 100208 (pid 23092, kvoop) is on a run queue
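
For anyone with many pids to check, the chain above can be walked mechanically from a saved ddb log. This is only a rough sketch: it assumes the output has exactly the line shape shown above, and the names parse_sleepchain, chain_tail and sleepchain_sample are made up for illustration, not part of ddb or any FreeBSD tool.

```python
import re

# Matches one line of ddb "show sleepchain" output as it appears above:
#   thread <tid> (pid <pid>, <name>) <state text>
LINE_RE = re.compile(r"^thread (\d+) \(pid (\d+), ([^)]+)\) (.+)$")


def parse_sleepchain(text):
    """Return a list of (tid, pid, name, state) tuples, one per chain link."""
    links = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # prompt lines such as "db> show sleepchain ..." are skipped
            tid, pid, name, state = m.groups()
            links.append((int(tid), int(pid), name, state))
    return links


def chain_tail(text):
    """Return the last link, i.e. the thread the rest of the chain waits on."""
    links = parse_sleepchain(text)
    return links[-1] if links else None


# Sample input copied from the ddb session above.
sleepchain_sample = '''db> show sleepchain 23110
thread 100181 (pid 23110, vmstat) blocked on sx "user map" XLOCK
thread 100208 (pid 23092, kvoop) is on a run queue
'''

print(chain_tail(sleepchain_sample))
```

The tail of the chain is the thread worth inspecting further in kgdb; here that is kvoop, which is runnable rather than blocked.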

Ah, so this is normal (well, mostly) in that kvoop is simply on the run queue waiting for a CPU. Can you find the thread pointer for kvoop and check whether it is pinned and, if so, to which CPU? (td_pinned will tell you the first, and with ULE td_sched->ts_cpu will tell you the second.)
(kgdb) print td->td_pinned
$2 = 0
(kgdb) print td->td_sched->ts_cpu
$3 = 3 '\003'
(kgdb) print td->td_state
$4 = TDS_RUNQ

From my captured ddb run:
cpuid        = 3
curthread    = 0xc5e2f000: pid 23090 "filter"
curpcb       = 0xe6f90d90
fpcurthread  = none
idlethread   = 0xc442daf0: pid 11 "idle: cpu3"
APIC ID      = 7
currentldt   = 0x50
spin locks held:
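
Matching the ts_cpu value from kgdb against the per-CPU dump can be scripted the same way. Again, this is just a sketch assuming the "key = value" layout shown above; parse_pcpu, running_pid and pcpu_sample are hypothetical names used only for this example.

```python
import re


def parse_pcpu(text):
    """Parse "key = value" lines from a ddb per-CPU dump into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # lines without "=" (e.g. "spin locks held:") are ignored
            fields[key.strip()] = value.strip()
    return fields


def running_pid(fields):
    """Extract the pid from the curthread field, or None if it is absent."""
    m = re.search(r"pid (\d+)", fields.get("curthread", ""))
    return int(m.group(1)) if m else None


# Sample input copied from the captured ddb run above.
pcpu_sample = '''cpuid        = 3
curthread    = 0xc5e2f000: pid 23090 "filter"
idlethread   = 0xc442daf0: pid 11 "idle: cpu3"
spin locks held:
'''

fields = parse_pcpu(pcpu_sample)
print(fields["cpuid"], running_pid(fields))
```

With the dump above this pairs CPU 3 with pid 23090 (filter), which is the process examined next in kgdb.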

Back to kgdb:
(kgdb) proc 23090
[Switching to thread 131 (Thread 100199)]#0 cpustop_handler () at atomic.h:253
253     ATOMIC_ASM(set,      int,   "orl %1,%0",   "ir",  v);
(kgdb) where
#0  cpustop_handler () at atomic.h:253
#1  0xc07eedef in ipi_nmi_handler () at ../../../i386/i386/mp_machdep.c:1300
#2  0xc07f85e0 in trap (frame=0xe6f90b64) at ../../../i386/i386/trap.c:216
#3  0xc07e02db in calltrap () at ../../../i386/i386/exception.s:159
#4  0xc068a066 in witness_unlock (lock=0xc5e4cb2c, flags=0,
   file=0xc08529f0 "../../../kern/kern_descrip.c", line=2083)
   at ../../../kern/subr_witness.c:1266
#5  0xc065c95e in _sx_sunlock (sx=0xc5e4cb2c,
   file=0xc08529f0 "../../../kern/kern_descrip.c", line=2083)
   at ../../../kern/kern_sx.c:294
#6  0xc062e1ab in fget_read (td=0xc5e2f000, fd=15, fpp=0xe6f90c34)
   at ../../../kern/kern_descrip.c:2083
#7  0xc068c2e8 in kern_readv (td=0xc5e2f000, fd=15, auio=0xe6f90c60)
   at ../../../kern/sys_generic.c:189
#8  0xc068c3ff in read (td=0xc5e2f000, uap=0xe6f90cfc)
   at ../../../kern/sys_generic.c:108
#9  0xc07f8343 in syscall (frame=0xe6f90d38) at ../../../i386/i386/trap.c:1090
#10 0xc07e0340 in Xint0x80_syscall () at ../../../i386/i386/exception.s:255
#11 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) frame 4
#4  0xc068a066 in witness_unlock (lock=0xc5e4cb2c, flags=0,
   file=0xc08529f0 "../../../kern/kern_descrip.c", line=2083)
   at ../../../kern/subr_witness.c:1266
1266 if (witness_cold || witness_watch == 0 || lock->lo_witness == NULL ||
(kgdb) print *lock
$5 = {lo_name = 0xc0852b03 "filedesc structure",
 lo_type = 0xc0852b03 "filedesc structure", lo_flags = 37421056,
 lo_witness_data = {lod_list = {stqe_next = 0xc091c228},
   lod_witness = 0xc091c228}}

Then you will want to see what is running on that CPU. You might want to check your other coredump and find the td_state member of the thread for kvoop there as well.

On the amd64 machine, kvoop was on the run queue for CPU 1, and another process, filter, appears to have been running in user mode when I broke into ddb:
(kgdb) proc 33201
[Switching to thread 127 (Thread 100113)]#0 cpustop_handler () at atomic.h:264
264     ATOMIC_ASM(set,      int,   "orl %1,%0",   "ir",  v);
(kgdb) where
#0  cpustop_handler () at atomic.h:264
#1  0xffffffff8050c560 in ipi_nmi_handler ()
   at ../../../amd64/amd64/mp_machdep.c:1119
#2  0xffffffff8051aba7 in trap (frame=0xffffffff9fe18f40)
   at ../../../amd64/amd64/trap.c:198
#3  0xffffffff805013eb in nmi_calltrap ()
   at ../../../amd64/amd64/exception.S:427
#4  0x00000000280af9d4 in ?? ()
Previous frame inner to this frame (corrupt stack?)



I sure wish I could find the root cause of the hangs. On a hunch, I tried setting "machdep.cpu_idle_hlt=0" on the amd64 machine, and it has run 32 hours without a hang. It could just be coincidence, though...

Thanks for your help,
Guy
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
