Re: Crash when PREEMPTION enabled

2005-03-02 Thread Kris Kennaway
On Tue, Mar 01, 2005 at 07:13:04PM +1100, Peter Jeremy wrote:
 I have tried enabling PREEMPTION on 5.3-RELEASE-p5.  Whilst it ran OK
 for a few hours, it crashed overnight whilst doing a make index in
 /usr/ports.  I've tried repeating make index and it worked.  I have
 a coredump of the crash and have been doing some poking around.

I've just turned on PREEMPTION on the 12-CPU sparc64 machine running
RELENG_5 that I've been playing with, and it quickly resets and
reboots under load (buildworld -j12).  This happens with either ULE or
4BSD, although ULE without PREEMPTION seems to work fine (and perform
much better than 4BSD, as expected).

It looks like PREEMPTION is indeed still broken in RELENG_5 - I guess
my workloads aren't enough to trigger this on other machines :-(

Kris




Crash when PREEMPTION enabled

2005-03-01 Thread Peter Jeremy
I have tried enabling PREEMPTION on 5.3-RELEASE-p5.  Whilst it ran OK
for a few hours, it crashed overnight whilst doing a make index in
/usr/ports.  I've tried repeating make index and it worked.  I have
a coredump of the crash and have been doing some poking around.

A backtrace (with the nonsense frames deleted) is:
#8  0xc06775e2 in trap (frame=
      {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 4, tf_esi = -1011675136,
       tf_ebp = -663303504, tf_isp = -663303576, tf_ebx = 65538,
       tf_edx = -1040411904, tf_ecx = -1011675136, tf_eax = 25, tf_trapno = 12,
       tf_err = 0, tf_eip = 8, tf_cs = 8, tf_eflags = 66050,
       tf_esp = -1068017362, tf_ss = -663303532})
    at /usr/src/sys/i386/i386/trap.c:417
#9  0xc0664ada in calltrap () at /usr/src/sys/i386/i386/exception.s:140
#26 0xc057592e in vn_lock (vp=0xc3b31000, flags=65538, td=0xc1cc0640)
    at vnode_if.h:1013
#27 0xc0567636 in vget (vp=0xc3b31000, flags=2, td=0x0)
    at /usr/src/sys/kern/vfs_subr.c:2028
#28 0xc055a4aa in vfs_cache_lookup (ap=0x0)
    at /usr/src/sys/kern/vfs_cache.c:723
#29 0xc061a2a8 in ufs_vnoperate (ap=0x0)
    at /usr/src/sys/ufs/ufs/ufs_vnops.c:2816
#30 0xc055fdce in lookup (ndp=0xd876cc24) at vnode_if.h:52
#31 0xc055f7bb in namei (ndp=0xd876cc24) at /usr/src/sys/kern/vfs_lookup.c:181
#32 0xc056f042 in stat (td=0xc1cc0640, uap=0xd876cd14)
    at /usr/src/sys/kern/vfs_syscalls.c:2034
#33 0xc06780d0 in syscall (frame=
      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = 134964334,
       tf_ebp = -1077946296, tf_isp = -663302796, tf_ebx = 134964320,
       tf_edx = 134964320, tf_ecx = 134964320, tf_eax = 188, tf_trapno = 12,
       tf_err = 2, tf_eip = 134643347, tf_cs = 31, tf_eflags = 582,
       tf_esp = -1077946452, tf_ss = 47})
    at /usr/src/sys/i386/i386/trap.c:1001
#34 0xc0664b2f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:201
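
kgdb prints the trap frame registers as signed 32-bit integers, which
hides the fact that several of them are kernel addresses.  Reinterpreted
as unsigned values, tf_esi and tf_ecx in frame #8 (-1011675136) are
0xc3b31000, the vp passed to vn_lock(); tf_edx (-1040411904) is
0xc1fc9300, the v_op value in the vnode dump below; and tf_ebx (65538)
is the flags argument.  A minimal C sketch of that conversion follows;
the helper names are hypothetical and not part of any kernel or kgdb
interface:

```c
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a signed 32-bit trap frame field as an unsigned address. */
static uint32_t tf_to_u32(int32_t value)
{
    return (uint32_t)value;
}

/* Print one trap frame field in the hex form used for pointers. */
static void tf_print(const char *name, int32_t value)
{
    printf("%-7s = 0x%08x\n", name, (unsigned)tf_to_u32(value));
}
```

Calling tf_print("tf_edx", -1040411904) makes the value directly
comparable with the pointers shown elsewhere in the backtrace.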

It appears that vn_lock() has called VOP_LOCK() but v_op is pointing to
an array of garbage.  Apart from the memory at v_op, the vnode looks
reasonably sane:

(kgdb) p *(struct vnode *)0xc3b31000
$1 = {v_interlock = {mtx_object = {lo_class = 0xc06e583c, 
  lo_name = 0xc06b89b9 "vnode interlock", 
  lo_type = 0xc06b89b9 "vnode interlock", lo_flags = 196608, lo_list = {
tqe_next = 0x0, tqe_prev = 0x0}, lo_witness = 0x0}, 
mtx_lock = 3251373632, mtx_recurse = 0}, v_iflag = 0, v_usecount = 1, 
  v_numoutput = 0, v_vxthread = 0x0, v_holdcnt = 1, v_cleanblkhd = {
tqh_first = 0xcbf84bc4, tqh_last = 0xcbf84c64}, 
  v_cleanblkroot = 0xcbf84bc4, v_cleanbufcnt = 1, v_dirtyblkhd = {
tqh_first = 0x0, tqh_last = 0xc3b31048}, v_dirtyblkroot = 0x0, 
  v_dirtybufcnt = 0, v_vflag = 8, v_writecount = 0, v_object = 0xc36d4ad4, 
  v_lastw = 0, v_cstart = 0, v_lasta = 0, v_clen = 0, v_un = {
vu_mountedhere = 0x0, vu_socket = 0x0, vu_spec = {vu_cdev = 0x0, 
  vu_specnext = {sle_next = 0x0}}, vu_fifoinfo = 0x0}, v_freelist = {
tqe_next = 0xc2719210, tqe_prev = 0xc39e5ef8}, v_nmntvnodes = {
tqe_next = 0xc39cfd68, tqe_prev = 0xc24d9df8}, v_synclist = {
le_next = 0xc3ae1840, le_prev = 0xc3b151a0}, v_type = VREG, 
  v_tag = 0xc06b7244 "ufs", v_data = 0xc270c690, v_lock = {
lk_interlock = 0xc070cc94, lk_flags = 16777280, lk_sharecount = 0, 
lk_waitcount = 0, lk_exclusivecount = 0, lk_prio = 80, 
lk_wmesg = 0xc06b7244 "ufs", lk_timo = 6, lk_lockholder = 0x, 
lk_newlock = 0x0}, v_vnlock = 0xc3b310ac, v_op = 0xc1fc9300, 
  v_mount = 0xc1bb4000, v_cache_src = {lh_first = 0x310}, v_cache_dst = {
tqh_first = 0xc358e3b8, tqh_last = 0xc358e3c8}, v_id = 362047, 
  v_dd = 0xc3b31000, v_ddid = 0, v_pollinfo = 0x0, v_label = 0x0, 
  v_cachedfs = 1057, v_cachedid = 6630696, v_bsize = 16384}
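
For readers less familiar with this code path: VOP_LOCK() reaches the
filesystem through an indirect call via the vnode's v_op vector, so if
the vector is garbage the CPU jumps to whatever value sits in the slot.
That would be consistent with tf_eip = 8 in the frame #8 trap frame,
although the dump alone does not prove it.  The following is a minimal
userland model in C, not the kernel code.  All toy_* names are
hypothetical, and unlike the kernel it checks the slot before calling
through it:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a vnode operation (hypothetical, userland only). */
typedef int toy_vop_t(void *ap);

enum { TOY_VOP_LOCK, TOY_VOP_UNLOCK, TOY_VOP_COUNT };

struct toy_vnode {
    toy_vop_t **v_op;   /* vector of operation pointers */
};

static int toy_lock(void *ap)   { (void)ap; return 1; }
static int toy_unlock(void *ap) { (void)ap; return 2; }

/* A valid operations vector for the toy filesystem. */
static toy_vop_t *toy_good_ops[TOY_VOP_COUNT] = { toy_lock, toy_unlock };

/* Returns 1 if fn is one of the operations this model knows about. */
static int toy_known_op(toy_vop_t *fn)
{
    return fn == toy_lock || fn == toy_unlock;
}

/*
 * Dispatch operation `off` on vp.  Returns the operation's result, or
 * -1 if the slot does not hold a known function.  The model refuses to
 * jump through an unknown slot; a plain indirect call would not.
 */
static int toy_vcall(struct toy_vnode *vp, int off, void *ap)
{
    toy_vop_t *fn;

    if (vp == NULL || vp->v_op == NULL || off < 0 || off >= TOY_VOP_COUNT)
        return -1;
    fn = vp->v_op[off];
    if (!toy_known_op(fn))
        return -1;
    return fn(ap);
}
```

With v_op pointing at toy_good_ops, toy_vcall(&vn, TOY_VOP_LOCK, NULL)
returns 1.  With v_op pointing at a vector whose slots hold a value such
as 8, it returns -1 instead of jumping to that address.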

Anyone have any ideas?
-- 
Peter Jeremy