On Wed, 2003-02-19 at 16:44, Lars Eggert wrote:
> #11 0xc0302ff8 in calltrap () at {standard input}:97
> #12 0xc02098a4 in namei (ndp=0x9e) at /usr/src/sys/kern/vfs_lookup.c:158
> #13 0xc021bcfc in vn_open_cred (ndp=0xeb3b1a44, flagp=0xeb3b1a0c, cmode=0,
>     cred=0xc2195e80) at /usr/src/sys/kern/vfs_vnops.c:185
> #14 0xc6acffb4 in ?? ()
> #15 0xc01a06b3 in closef (fp=0x2, td=0x0) at vnode_if.h:1225
> #16 0xc01a0054 in fdfree (td=0xc662d1e0)
>     at /usr/src/sys/kern/kern_descrip.c:1433
> #17 0xc01a5da2 in exit1 (td=0xc662d1e0) at /usr/src/sys/kern/kern_exit.c:254

Well, I haven't had much luck tracking down the exact cause. For some
reason I haven't been able to figure out, all of my crash dumps jump
directly from vn_open_cred (line 185 of vfs_vnops.c) to calltrap(). The
namei call doesn't show up in the stack at all, almost as if the function
were being inlined. I'm only using -O, which shouldn't inline anything
not explicitly declared as such.

Anyway, using a cvsup binary search I've managed to narrow it down some.
The problem did not exist before midnight UTC on 2003-04-15. It does
exist as of midnight UTC on 2003-04-16. I've been digging through the
commit logs for that day, but it was a busy day for the VFS code, with
lots of commits. Since the panic always happens after an fdfree(), I'm
leaning toward a commit by alfred@ that touches a large number of files;
it deals with a lock order reversal and adds a mutex associated with
freeing filedesc structures. Just a guess, though.

Reproducing the problem seems to be as simple as killing any process
that has an open, locked file on an NFS volume. A simple

    gconfd-1 & sleep 5; killall -9 gconfd-1

does it every time for me. I assume this would also happen if a process
calls exit() without closing all of its fds first, which is probably why
starting GNOME or booting diskless is enough to tickle it.

Craig
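
P.S. For anyone without gconfd-1 handy, here is a minimal sketch of the same reproduction idea: a child process opens a file, takes an advisory lock on it, and is then killed with SIGKILL while the descriptor is still open and locked. This is an untested assumption about what triggers the panic, not a confirmed reproducer. The function name kill_lock_holder and the default path are made up for illustration; pass a path on the NFS mount as the first argument.

```python
import fcntl
import os
import signal
import sys
import tempfile


def kill_lock_holder(path):
    """Fork a child that locks `path`, SIGKILL it, and return the
    signal number that terminated it (None if it exited on its own)."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: open the file, take an exclusive POSIX advisory lock,
        # report readiness, then block until killed.
        try:
            os.close(ready_r)
            fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
            fcntl.lockf(fd, fcntl.LOCK_EX)
            os.write(ready_w, b"1")
            while True:
                signal.pause()
        finally:
            # Only reached if something above failed; never return
            # into the parent's code path from the child.
            os._exit(1)
    # Parent: wait until the child holds the lock (or has died and
    # closed the pipe), then kill it without giving it a chance to
    # close the descriptor.
    os.close(ready_w)
    os.read(ready_r, 1)
    os.close(ready_r)
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


if __name__ == "__main__":
    # The temp-dir default only exercises the script itself; it will
    # not touch NFS unless the path given lives on an NFS mount.
    default = os.path.join(tempfile.gettempdir(), "lock-repro")
    target = sys.argv[1] if len(sys.argv) > 1 else default
    print(kill_lock_holder(target))
```

On an unaffected kernel this should just print the number of the signal that ended the child. On an affected one I would expect the panic during the child's exit, if my guess about the trigger is right.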