John Baldwin wrote:
On Thursday 26 February 2009 4:22:15 pm Guy Helmer wrote:
db> show sleepchain 23110
thread 100181 (pid 23110, vmstat) blocked on sx "user map" XLOCK
thread 100208 (pid 23092, kvoop) is on a run queue
db> show sleepchain 23092
thread 100208 (pid 23092, kvoop) is on a run queue
Ah, so this is normal (well, mostly) in that kvoop is simply on the run queue
waiting for a CPU. Can you find the thread pointer for kvoop and check
whether it is pinned and, if so, to which CPU? (td_pinned will tell you the
first, and td_sched->ts_cpu will tell you the second with ULE.)
(kgdb) print td->td_pinned
$2 = 0
(kgdb) print td->td_sched->ts_cpu
$3 = 3 '\003'
(kgdb) print td->td_state
$4 = TDS_RUNQ
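For anyone repeating this on a saved ddb capture, the "show sleepchain" output quoted above can be read mechanically: each line names a thread and what it is waiting on, and the last line is the thread the rest of the chain depends on. What follows is only a rough sketch, assuming the capture is plain text in exactly the format shown above; parse_sleepchain and chain_end are hypothetical helper names, not part of any FreeBSD tool.

```python
import re

# Matches lines such as:
#   thread 100181 (pid 23110, vmstat) blocked on sx "user map" XLOCK
#   thread 100208 (pid 23092, kvoop) is on a run queue
# Prompt lines ("db> show sleepchain ...") do not match and are skipped.
LINE_RE = re.compile(r'^thread (\d+) \(pid (\d+), ([^)]+)\) (.+)$')


def parse_sleepchain(text):
    """Return a list of (tid, pid, name, status) tuples in chain order."""
    chain = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            tid, pid, name, status = m.groups()
            chain.append((int(tid), int(pid), name, status))
    return chain


def chain_end(chain):
    """Return the last entry, i.e. the thread the others are waiting on."""
    return chain[-1] if chain else None


SAMPLE = '''db> show sleepchain 23110
thread 100181 (pid 23110, vmstat) blocked on sx "user map" XLOCK
thread 100208 (pid 23092, kvoop) is on a run queue'''

chain = parse_sleepchain(SAMPLE)
tid, pid, name, status = chain_end(chain)
print(f"end of chain: tid {tid} pid {pid} ({name}): {status}")
```

With the sample above, the end of the chain is kvoop (pid 23092), which is runnable rather than blocked, matching the td_state of TDS_RUNQ seen in kgdb.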
From my captured ddb run:
cpuid = 3
curthread = 0xc5e2f000: pid 23090 "filter"
curpcb = 0xe6f90d90
fpcurthread = none
idlethread = 0xc442daf0: pid 11 "idle: cpu3"
APIC ID = 7
currentldt = 0x50
spin locks held:
Back to kgdb:
(kgdb) proc 23090
[Switching to thread 131 (Thread 100199)]#0 cpustop_handler () at atomic.h:253
253 ATOMIC_ASM(set, int, "orl %1,%0", "ir", v);
(kgdb) where
#0 cpustop_handler () at atomic.h:253
#1 0xc07eedef in ipi_nmi_handler () at ../../../i386/i386/mp_machdep.c:1300
#2 0xc07f85e0 in trap (frame=0xe6f90b64) at ../../../i386/i386/trap.c:216
#3 0xc07e02db in calltrap () at ../../../i386/i386/exception.s:159
#4 0xc068a066 in witness_unlock (lock=0xc5e4cb2c, flags=0,
file=0xc08529f0 "../../../kern/kern_descrip.c", line=2083)
at ../../../kern/subr_witness.c:1266
#5 0xc065c95e in _sx_sunlock (sx=0xc5e4cb2c,
file=0xc08529f0 "../../../kern/kern_descrip.c", line=2083)
at ../../../kern/kern_sx.c:294
#6 0xc062e1ab in fget_read (td=0xc5e2f000, fd=15, fpp=0xe6f90c34)
at ../../../kern/kern_descrip.c:2083
#7 0xc068c2e8 in kern_readv (td=0xc5e2f000, fd=15, auio=0xe6f90c60)
at ../../../kern/sys_generic.c:189
#8 0xc068c3ff in read (td=0xc5e2f000, uap=0xe6f90cfc)
at ../../../kern/sys_generic.c:108
#9 0xc07f8343 in syscall (frame=0xe6f90d38)
at ../../../i386/i386/trap.c:1090
#10 0xc07e0340 in Xint0x80_syscall () at ../../../i386/i386/exception.s:255
#11 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) frame 4
#4 0xc068a066 in witness_unlock (lock=0xc5e4cb2c, flags=0,
file=0xc08529f0 "../../../kern/kern_descrip.c", line=2083)
at ../../../kern/subr_witness.c:1266
1266 if (witness_cold || witness_watch == 0 ||
lock->lo_witness == NULL ||
(kgdb) print *lock
$5 = {lo_name = 0xc0852b03 "filedesc structure",
lo_type = 0xc0852b03 "filedesc structure", lo_flags = 37421056,
lo_witness_data = {lod_list = {stqe_next = 0xc091c228},
lod_witness = 0xc091c228}}
Then you will want to see what is running on that CPU. You might want to
check your other coredump and find the td_state member of the thread for
kvoop there as well.
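The cross-check described here, going from the ts_cpu of the runnable thread to whatever that CPU is currently running, can also be done on a text capture. This is a minimal sketch, assuming the per-CPU block is in the "key = value" form shown in the captured ddb run (it looks like "show pcpu" output); parse_pcpu and current_thread are hypothetical helper names introduced only for illustration.

```python
import re


def parse_pcpu(text):
    """Collect 'key = value' lines from a captured per-CPU ddb block."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # lines without '=' (e.g. "spin locks held:") are ignored
            fields[key.strip()] = value.strip()
    return fields


def current_thread(fields):
    """Return (pid, name) parsed from the curthread field, or None."""
    m = re.search(r'pid (\d+) "([^"]*)"', fields.get("curthread", ""))
    return (int(m.group(1)), m.group(2)) if m else None


SAMPLE = '''cpuid = 3
curthread = 0xc5e2f000: pid 23090 "filter"
idlethread = 0xc442daf0: pid 11 "idle: cpu3"
spin locks held:'''

ts_cpu = 3  # value printed from td->td_sched->ts_cpu for kvoop
fields = parse_pcpu(SAMPLE)
if int(fields["cpuid"]) == ts_cpu:
    pid, name = current_thread(fields)
    print(f"CPU {ts_cpu} is running pid {pid} ({name})")
```

The pid found this way is the one to hand to "proc" in kgdb to get a backtrace of what that CPU was doing.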
On the amd64 machine, kvoop was on the run queue for CPU 1, and another
process, filter, appears to have been running in user mode when I broke
into ddb:
(kgdb) proc 33201
[Switching to thread 127 (Thread 100113)]#0 cpustop_handler () at atomic.h:264
264 ATOMIC_ASM(set, int, "orl %1,%0", "ir", v);
(kgdb) where
#0 cpustop_handler () at atomic.h:264
#1 0xffffffff8050c560 in ipi_nmi_handler ()
at ../../../amd64/amd64/mp_machdep.c:1119
#2 0xffffffff8051aba7 in trap (frame=0xffffffff9fe18f40)
at ../../../amd64/amd64/trap.c:198
#3 0xffffffff805013eb in nmi_calltrap ()
at ../../../amd64/amd64/exception.S:427
#4 0x00000000280af9d4 in ?? ()
Previous frame inner to this frame (corrupt stack?)
I sure wish I could find the root cause of the hangs. On a hunch, I
tried setting "machdep.cpu_idle_hlt=0" on the amd64 machine, and it has
run 32 hours without a hang. It could just be coincidence, though...
Thanks for your help,
Guy