On Wed, 2003-02-19 at 16:44, Lars Eggert wrote:
> #11 0xc0302ff8 in calltrap () at {standard input}:97
> #12 0xc02098a4 in namei (ndp=0x9e) at /usr/src/sys/kern/vfs_lookup.c:158
> #13 0xc021bcfc in vn_open_cred (ndp=0xeb3b1a44, flagp=0xeb3b1a0c, cmode=0,
>      cred=0xc2195e80) at /usr/src/sys/kern/vfs_vnops.c:185
> #14 0xc6acffb4 in ?? ()
> #15 0xc01a06b3 in closef (fp=0x2, td=0x0) at vnode_if.h:1225
> #16 0xc01a0054 in fdfree (td=0xc662d1e0)
>      at /usr/src/sys/kern/kern_descrip.c:1433
> #17 0xc01a5da2 in exit1 (td=0xc662d1e0) at /usr/src/sys/kern/kern_exit.c:254

Well, I haven't had much luck tracking down the exact cause.  For some
reason I haven't been able to figure out, all of my crash dumps jump
directly from vn_open_cred (line 185 of vfs_vnops.c) to calltrap().  The
namei call doesn't show up in the stack at all, almost like the function
is being inlined.  I'm only using -O, which shouldn't inline anything
not explicitly declared as such.

Anyway, using a cvsup binary search I've managed to narrow it down
some.  The problem did not exist before midnight UTC on 2003-04-15.  It
does exist as of midnight UTC on 2003-04-16.  I've been digging through the
commit logs for that day, but it seems it was a busy day for the VFS
code with lots of commits.  Since it always happens after an fdfree(),
I'm leaning toward a large (number of files) commit by alfred@ having to
do with a lock order reversal and adding a mutex associated with freeing
filedesc structures.  Just a guess, though.

Reproducing the problem seems to be as simple as killing any process
that has an open, locked file on an NFS volume.  A simple

gconfd-1 &
sleep 5; killall -9 gconfd-1

does it every time for me.  I assume this would also happen if a process
calls exit() without closing all of its fds first; probably why
starting GNOME or booting diskless is enough to tickle it.
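
For anyone without gconfd-1 handy, here is a minimal sketch of a
standalone reproducer.  It assumes an fcntl(2) advisory lock is the
relevant kind of lock (I have not checked which locking call gconfd-1
actually uses), and the names open_and_lock and REPRO_MAIN are made up
for this example.  It opens a file, locks it, and then sits there with
the fd open until it is killed.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Open (creating if needed) `path` and take an exclusive fcntl(2)
 * advisory lock on the whole file.  Returns the open fd, or -1 on error.
 * Hypothetical helper for illustration only.
 */
int
open_and_lock(const char *path)
{
	struct flock fl = { 0 };
	int fd;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return (-1);

	fl.l_type = F_WRLCK;	/* exclusive (write) lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "through end of file" */

	if (fcntl(fd, F_SETLK, &fl) < 0) {
		close(fd);
		return (-1);
	}
	return (fd);
}

#ifdef REPRO_MAIN	/* build with -DREPRO_MAIN for the standalone tool */
int
main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s /path/on/nfs/volume/file\n", argv[0]);
		return (EXIT_FAILURE);
	}
	if (open_and_lock(argv[1]) < 0) {
		perror("open_and_lock");
		return (EXIT_FAILURE);
	}
	/* Deliberately never close the fd; just wait to be killed. */
	for (;;)
		pause();
}
#endif
```

Built with cc -O -DREPRO_MAIN, the idea would be to start it in the
background against a file on the NFS mount, wait a few seconds, and then
kill -9 it, the same way as the gconfd-1 case above.  I have not
confirmed that this triggers the same panic.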

Craig

