Re: panic starting gnome

2003-02-19 Thread Lars Eggert
Terry Lambert wrote:
> Debug:

[excellent kernel-debugging recipe snipped]

Here's a backtrace of a crashdump that should be more helpful:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic.id = 
fault virtual address   = 0x34
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc01b28c6
stack pointer   = 0x10:0xeb3b17c0
frame pointer   = 0x10:0xeb3b17e0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 2104 (gconf-sanity-check-)
panic: from debugger
cpuid = 0; lapic.id = 


Fatal trap 3: breakpoint instruction fault while in kernel mode
cpuid = 0; lapic.id = 
instruction pointer = 0x8:0xc03019ea
stack pointer   = 0x10:0xeb3b1534
frame pointer   = 0x10:0xeb3b1540
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= IOPL = 0
current process = 2104 (gconf-sanity-check-)
panic: from debugger
cpuid = 0; lapic.id = 
boot() called on cpu#0
Uptime: 4m49s
Dumping 1023 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 
320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 
608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 
896 912 928 944 960 976 992 1008
---
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumpsys(dumper);

(kgdb) bt
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1  0xc01bc00e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xc01bc627 in panic (fmt=0xc0349b8d "from debugger")
at /usr/src/sys/kern/kern_shutdown.c:542
#3  0xc0148192 in db_panic () at /usr/src/sys/ddb/db_command.c:448
#4  0xc0147fcc in db_command (last_cmdp=0xc037f9a0, cmd_table=0x0,
aux_cmd_tablep=0xc0376fb8, aux_cmd_tablep_end=0xc0376fbc)
at /usr/src/sys/ddb/db_command.c:346
#5  0xc014820a in db_command_loop () at /usr/src/sys/ddb/db_command.c:470
#6  0xc014af96 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:72
#7  0xc0301697 in kdb_trap (type=12, code=0, regs=0xeb3b1780)
at /usr/src/sys/i386/i386/db_interface.c:166
#8  0xc031a590 in trap_fatal (frame=0xeb3b1780, eva=0)
at /usr/src/sys/i386/i386/trap.c:839
#9  0xc031a2da in trap_pfault (frame=0xeb3b1780, usermode=0, eva=52)
at /usr/src/sys/i386/i386/trap.c:758
#10 0xc0319e95 in trap (frame=
  {tf_fs = -1038483432, tf_es = 16, tf_ds = -1070202864, tf_edi = 
158, tf_esi = 52, tf_ebp = -348448800, tf_isp = -348448852, tf_ebx = 0, 
tf_edx = -966573056, tf_ecx = -966602272, tf_eax = -966602272, tf_trapno 
= 12, tf_err = 0, tf_eip
= -1071961914, tf_cs = 8, tf_eflags = 66178, tf_esp = 0, tf_ss = 
-1070141731})
at /usr/src/sys/i386/i386/trap.c:445
#11 0xc0302ff8 in calltrap () at {standard input}:97
#12 0xc02098a4 in namei (ndp=0x9e) at /usr/src/sys/kern/vfs_lookup.c:158
#13 0xc021bcfc in vn_open_cred (ndp=0xeb3b1a44, flagp=0xeb3b1a0c, cmode=0,
cred=0xc2195e80) at /usr/src/sys/kern/vfs_vnops.c:185
#14 0xc6acffb4 in ?? ()
#15 0xc01a06b3 in closef (fp=0x2, td=0x0) at vnode_if.h:1225
#16 0xc01a0054 in fdfree (td=0xc662d1e0)
at /usr/src/sys/kern/kern_descrip.c:1433
#17 0xc01a5da2 in exit1 (td=0xc662d1e0) at /usr/src/sys/kern/kern_exit.c:254
#18 0xc01a5b11 in sys_exit () at /usr/src/sys/kern/kern_exit.c:116
#19 0xc031ab56 in syscall (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = 11095, 
tf_ebp =
-1077937128, tf_isp = -348447372, tf_ebx = 679838148, tf_edx = 
679837268, tf_ecx = 19, tf_eax = 1, tf_trapno = 12, tf_err = 2, tf_eip = 
680166719, tf_cs = 31, tf_eflags = 582, tf_esp = -1077937172, tf_ss = 47})
at /usr/src/sys/i386/i386/trap.c:1033
#20 0xc030304d in Xint0x80_syscall () at {standard input}:139
---Can't read userspace from dump, or kernel process---

(kgdb) up 12
#12 0xc02098a4 in namei (ndp=0x9e) at /usr/src/sys/kern/vfs_lookup.c:158
158 FILEDESC_LOCK(fdp);
(kgdb) list
153 #endif
154
155 /*
156  * Get starting point for the translation.
157  */
158 FILEDESC_LOCK(fdp);
159 ndp->ni_rootdir = fdp->fd_rdir;
160 ndp->ni_topdir = fdp->fd_jdir;
161
162 dp = fdp->fd_cdir;

(kgdb) print ndp
$2 = (struct nameidata *) 0x9e

(kgdb) print fdp
$1 = (struct filedesc *) 0x34
(kgdb)

(kgdb) print p
$3 = (struct proc *) 0x0

(kgdb) print td
$5 = (struct thread *) 0xc662d1e0

(kgdb) print *td
$7 = {td_proc = 0xc66307f0,
[...]

Very strange. namei() does essentially the following:

	p = td->td_proc;
	fdp = p->p_fd;

td->td_proc seems reasonable, but p is 0. No idea how this could happen,
any guesses?

Thanks,
Lars
--
Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute


smime.p7s
Description: S/MIME Cryptographic Signature


Re: panic starting gnome

2003-02-19 Thread Craig Boston
On Wed, 2003-02-19 at 16:44, Lars Eggert wrote:
> #11 0xc0302ff8 in calltrap () at {standard input}:97
> #12 0xc02098a4 in namei (ndp=0x9e) at /usr/src/sys/kern/vfs_lookup.c:158
> #13 0xc021bcfc in vn_open_cred (ndp=0xeb3b1a44, flagp=0xeb3b1a0c, cmode=0,
>     cred=0xc2195e80) at /usr/src/sys/kern/vfs_vnops.c:185
> #14 0xc6acffb4 in ?? ()
> #15 0xc01a06b3 in closef (fp=0x2, td=0x0) at vnode_if.h:1225
> #16 0xc01a0054 in fdfree (td=0xc662d1e0)
>     at /usr/src/sys/kern/kern_descrip.c:1433
> #17 0xc01a5da2 in exit1 (td=0xc662d1e0) at /usr/src/sys/kern/kern_exit.c:254

Well, I haven't had much luck tracking down the exact cause.  For some
reason I haven't been able to figure out, all of my crash dumps jump
directly from vn_open_cred (line 185 of vfs_vnops.c) to calltrap().  The
namei call doesn't show up in the stack at all, almost like the function
is being inlined.  I'm only using -O, which shouldn't inline anything
not explicitly declared as such.

Anyway, using a cvsup binary search I've managed to narrow it down
some.  The problem did not exist before midnight UTC on 2003-02-15.  It
does exist as of midnight UTC on 2003-02-16.  I've been digging through the
commit logs for that day, but it seems it was a busy day for the VFS
code with lots of commits.  Since it always happens after an fdfree(),
I'm leaning toward a large (number of files) commit by alfred@ having to
do with a lock order reversal and adding a mutex associated with freeing
filedesc structures.  Just a guess, though.

Reproducing the problem seems to be as simple as killing any process
that has an open, locked file on an NFS volume.  A simple

gconfd-1 &
sleep 5; killall -9 gconfd-1

does it every time for me.  I assume this would also happen if a process
calls exit() without closing all of its fds first; probably why
starting GNOME or booting diskless is enough to tickle it.
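
A minimal C sketch of that second scenario, for anyone who wants a
reproducer that does not depend on GNOME.  It assumes the lock is an
fcntl(2) advisory lock (the thread does not say which locking call
gconfd-1 uses), and the names open_and_lock() and exit_holding_lock()
are made up for illustration.  To exercise the path discussed here, the
file would have to live on an NFS mount; on an affected kernel the
expected result is the panic, so only run it on a scratch machine.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Open `path`, take a whole-file write lock with fcntl(2), and return the
 * descriptor with the lock still held.  Returns -1 on failure.
 */
int
open_and_lock(const char *path)
{
    struct flock fl;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;        /* exclusive advisory lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */

    if (fcntl(fd, F_SETLK, &fl) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Take the lock and leave without calling close(2), so the kernel has to
 * release the lock itself while it tears down the descriptor table at exit.
 */
void
exit_holding_lock(const char *path)
{
    if (open_and_lock(path) == -1)
        _exit(1);
    _exit(0);
}
```

Calling exit_holding_lock() from main() with a path on the NFS volume
should be roughly equivalent to the killall recipe above, if the
assumption about the lock type holds.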

Craig


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: panic starting gnome

2003-02-19 Thread Terry Lambert
Lars Eggert wrote:
> Terry Lambert wrote:
>> Debug:
>
> [excellent kernel-debugging recipe snipped]
>
> Here's a backtrace of a crashdump that should be more helpful:

[ ... ]

> (kgdb) up 12
> #12 0xc02098a4 in namei (ndp=0x9e) at /usr/src/sys/kern/vfs_lookup.c:158
> 158 FILEDESC_LOCK(fdp);
> (kgdb) list
> 153 #endif
> 154
> 155 /*
> 156  * Get starting point for the translation.
> 157  */
> 158 FILEDESC_LOCK(fdp);
> 159 ndp->ni_rootdir = fdp->fd_rdir;
> 160 ndp->ni_topdir = fdp->fd_jdir;
> 161
> 162 dp = fdp->fd_cdir;
>
> (kgdb) print ndp
> $2 = (struct nameidata *) 0x9e
>
> (kgdb) print fdp
> $1 = (struct filedesc *) 0x34
> (kgdb)
>
> (kgdb) print p
> $3 = (struct proc *) 0x0
>
> (kgdb) print td
> $5 = (struct thread *) 0xc662d1e0
>
> (kgdb) print *td
> $7 = {td_proc = 0xc66307f0,
> [...]
>
> Very strange. namei() does essentially the following:
>
> p = td->td_proc;
> fdp = p->p_fd;
>
> td->td_proc seems reasonable, but p is 0. No idea how this could happen,
> any guesses?

Cool.

This is not where I was guessing it was at, at all.  8-) 8-).

There's a commit that Alfred made last Friday night that might
have something to do with it.  It was an attempt to fix a lock
order reversal between PROC/filedesc, according to the commit,
and it introduced fdesc_mtx.

If you grep for that everywhere, and then annotate the involved
files, it should be pretty obvious which changes to revert to see
if this is the case (1.50-1.49 of /sys/sys/filedesc.h, etc.).

It may also be an issue with some of the recent KSE commits
over the last weekend missing an assignment on a context switch.

Probably the easiest thing to do, if you can repeat the problem
reliably, is to bsearch, starting 8 days ago, for the commit
that broke the camel's back.

It's really tempting to make a script that's capable of carrying
out a /usr/src/sys bsearch semi-automatically, because people are
really hesitant to use this approach for solving problems, even
though it only requires O(log2(N)) reboots to find it...
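
As a rough illustration of why the reboot count is logarithmic, here is
a minimal C sketch of the search itself.  Everything in it is
hypothetical: snapshots are plain integers (say, days counted forward
from the last known-good kernel), and the panics callback stands in for
"check out the tree at that date, build, boot, and try to reproduce".
It assumes every snapshot after the first bad one is also bad.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if the kernel built at `snapshot` panics. */
typedef bool (*panics_fn)(int snapshot, void *ctx);

/*
 * Return the first panicking snapshot in the range (good, bad].
 * `good` must be known not to panic and `bad` must be known to panic.
 * If `boots` is not NULL, the number of test boots is stored there.
 */
int
first_bad_snapshot(int good, int bad, panics_fn panics, void *ctx, int *boots)
{
    int n = 0;

    assert(good < bad);

    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;

        if (panics(mid, ctx))
            bad = mid;      /* breakage is at or before mid */
        else
            good = mid;     /* breakage is after mid */
        n++;
    }
    if (boots != NULL)
        *boots = n;
    return bad;
}

/* Stand-in predicate for trying the sketch: panics from snapshot *ctx onwards. */
bool
panics_from(int snapshot, void *ctx)
{
    return snapshot >= *(const int *)ctx;
}
```

With day 0 as the working kernel and day 8 as the panicking one, the
loop needs three test boots to isolate a single day, whichever day the
breakage landed on.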


-- Terry




Re: panic starting gnome

2003-02-19 Thread Terry Lambert
Craig Boston wrote:
> Well, I haven't had much luck tracking down the exact cause.  For some
> reason I haven't been able to figure out, all of my crash dumps jump
> directly from vn_open_cred (line 185 of vfs_vnops.c) to calltrap().  The
> namei call doesn't show up in the stack at all, almost like the function
> is being inlined.  I'm only using -O, which shouldn't inline anything
> not explicitly declared as such.

Nope.  The problem is a NULL pointer dereference, apparently of a member
of the proc structure, through a proc pointer that is NULL.

> Anyway, using a cvsup binary search I've managed to narrow it down
> some.  The problem did not exist before midnight UTC on 2003-02-15.  It
> does exist as of midnight UTC on 2003-02-16.  I've been digging through the
> commit logs for that day, but it seems it was a busy day for the VFS
> code with lots of commits.  Since it always happens after an fdfree(),
> I'm leaning toward a large (number of files) commit by alfred@ having to
> do with a lock order reversal and adding a mutex associated with freeing
> filedesc structures.  Just a guess, though.

FWIW, I arrived at the same place, given Lars' debugging information,
though it was only my most likely suspect.  There are some changes
that went in for KSE, as well, but I'm pretty sure they were after
last Wednesday.


> Reproducing the problem seems to be as simple as killing any process
> that has an open, locked file on an NFS volume.  A simple
>
> gconfd-1 &
> sleep 5; killall -9 gconfd-1
>
> does it every time for me.  I assume this would also happen if a process
> calls exit() without closing all of its fds first; probably why
> starting GNOME or booting diskless is enough to tickle it.

Yes, this is most likely.

-- Terry




panic starting gnome

2003-02-18 Thread Lars Eggert
Hi,

on today's -current, I get the following panic when starting gnome from 
xdm; a kernel from 2/10 works with today's world, so it must be 
something in the kernel that changed over the last week:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic.id = 
fault virtual address   = 0x34
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc01b28a6
stack pointer   = 0x10:0xe91a57c0
frame pointer   = 0x10:0xe91a57e0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 2444 (gconf-sanity-check-)
kernel: type 12 trap, code=0
Stopped at  _mtx_lock_flags+0x26:   cmpl    $0xc03884a0,0(%esi)
db mi_switch(c21b4980,0,c0354624,185,3f8) at mi_switch+0x240
ithread_schedule(c6283b80,1,c0305e16,c658e780,e91a55c0) at ithread_schedule+0x11c
sched_ithd(d) at sched_ithd+0x41
Xintr13() at Xintr13+0xd3
--- interrupt, eip = 0xc02efea2, esp = 0xe91a55a4, ebp = 0xe91a55c0 ---
siocnopen(e91a55d4,3f8,1c200,301,c01f040b) at siocnopen+0x12
siocncheckc(c03b4c80,78,e91a5608,c01f0358,e91a5624) at siocncheckc+0x40
cncheckc(e91a5624,c0149625,e91a57c8,c03a9ac8,e91a5634) at cncheckc+0x2c
cngetc(e91a57c8,c03a9ac8,e91a5634,0,e91a57c8) at cngetc+0x18
db_readline(c03b1b80,78,e91a5658,c01481e6,c03499fb) at db_readline+0x65
db_read_line(c03499fb,c03a9ac8,e91a5658,c0148a28,0) at db_read_line+0x1a
db_command_loop(c01b28a6,a0,0,e91a5680,0) at db_command_loop+0x46
db_trap(c,0,0,e91a56c0,5) at db_trap+0x66
kdb_trap(c,0,e91a5780,1,1) at kdb_trap+0x107
trap_fatal(e91a5780,34,c0372ee0,2e4,c658e780) at trap_fatal+0x250
trap_pfault(e91a5780,0,34,c03e0758,34) at trap_pfault+0x17a
trap(c21a0018,10,c0360010,9e,34) at trap+0x3e5
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc01b28a6, esp = 0xe91a57c0, ebp = 0xe91a57e0 ---
_mtx_lock_flags(34,0,c035cf5f,9e,c658e780) at _mtx_lock_flags+0x26
namei(e91a5a44,c0207d5a,c749458c,0,c658e780) at namei+0x134
vn_open_cred(e91a5a44,e91a5a0c,0,c2195e80,0) at vn_open_cred+0x53c
nfs_dolock(e91a5c0c,c658e780,1b3,c03e0748,6001) at nfs_dolock+0x294
closef(c6673834,c658e780,c0353f03,595,c7375934) at closef+0x123
fdfree(c658e780,0,c03543ab,f2,73) at fdfree+0x1d4
exit1(c658e780,0,c03543ab,73,e91a5d40) at exit1+0x282
sys_exit(c658e780,e91a5d10,c0372ee0,407,c658de4c) at sys_exit+0x41
syscall(2f,2f,2f,0,2b57) at syscall+0x3d6
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (1), eip = 0x288a853f, esp = 0xbfbffbec, ebp = 0xbfbffc18 ---

Lars
--
Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute




Re: panic starting gnome

2003-02-18 Thread Craig Boston
FWIW, this looks nearly identical to the panic I reported last night in
the thread "VFS panic (possibly NFS locking related?)".  I didn't manage
to catch the ddb trace and had to work postmortem with a crash dump and
gdb.  But it looked just like here.

Lars: Do you by any chance have your home directory on an NFS mount?

I think the reason that my gdb trace showed ?? instead of nfs_dolock
is that I have nfsclient loaded as a module...

Craig

On Tue, 2003-02-18 at 17:00, Lars Eggert wrote:
> Hi,
>
> on today's -current, I get the following panic when starting gnome from
> xdm; a kernel from 2/10 works with today's world, so it must be
> something in the kernel that changed over the last week:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; lapic.id = 
> fault virtual address   = 0x34
> fault code  = supervisor read, page not present
> instruction pointer = 0x8:0xc01b28a6
> stack pointer   = 0x10:0xe91a57c0
> frame pointer   = 0x10:0xe91a57e0
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 2444 (gconf-sanity-check-)
> kernel: type 12 trap, code=0
> Stopped at  _mtx_lock_flags+0x26:   cmpl    $0xc03884a0,0(%esi)
> db mi_switch(c21b4980,0,c0354624,185,3f8) at mi_switch+0x240
> ithread_schedule(c6283b80,1,c0305e16,c658e780,e91a55c0) at ithread_schedule+0x11c
> sched_ithd(d) at sched_ithd+0x41
> Xintr13() at Xintr13+0xd3
> --- interrupt, eip = 0xc02efea2, esp = 0xe91a55a4, ebp = 0xe91a55c0 ---
> siocnopen(e91a55d4,3f8,1c200,301,c01f040b) at siocnopen+0x12
> siocncheckc(c03b4c80,78,e91a5608,c01f0358,e91a5624) at siocncheckc+0x40
> cncheckc(e91a5624,c0149625,e91a57c8,c03a9ac8,e91a5634) at cncheckc+0x2c
> cngetc(e91a57c8,c03a9ac8,e91a5634,0,e91a57c8) at cngetc+0x18
> db_readline(c03b1b80,78,e91a5658,c01481e6,c03499fb) at db_readline+0x65
> db_read_line(c03499fb,c03a9ac8,e91a5658,c0148a28,0) at db_read_line+0x1a
> db_command_loop(c01b28a6,a0,0,e91a5680,0) at db_command_loop+0x46
> db_trap(c,0,0,e91a56c0,5) at db_trap+0x66
> kdb_trap(c,0,e91a5780,1,1) at kdb_trap+0x107
> trap_fatal(e91a5780,34,c0372ee0,2e4,c658e780) at trap_fatal+0x250
> trap_pfault(e91a5780,0,34,c03e0758,34) at trap_pfault+0x17a
> trap(c21a0018,10,c0360010,9e,34) at trap+0x3e5
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc01b28a6, esp = 0xe91a57c0, ebp = 0xe91a57e0 ---
> _mtx_lock_flags(34,0,c035cf5f,9e,c658e780) at _mtx_lock_flags+0x26
> namei(e91a5a44,c0207d5a,c749458c,0,c658e780) at namei+0x134
> vn_open_cred(e91a5a44,e91a5a0c,0,c2195e80,0) at vn_open_cred+0x53c
> nfs_dolock(e91a5c0c,c658e780,1b3,c03e0748,6001) at nfs_dolock+0x294
> closef(c6673834,c658e780,c0353f03,595,c7375934) at closef+0x123
> fdfree(c658e780,0,c03543ab,f2,73) at fdfree+0x1d4
> exit1(c658e780,0,c03543ab,73,e91a5d40) at exit1+0x282
> sys_exit(c658e780,e91a5d10,c0372ee0,407,c658de4c) at sys_exit+0x41
> syscall(2f,2f,2f,0,2b57) at syscall+0x3d6
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (1), eip = 0x288a853f, esp = 0xbfbffbec, ebp = 0xbfbffc18 ---
>
> Lars





Re: panic starting gnome

2003-02-18 Thread Lars Eggert
Craig Boston wrote:
> FWIW, this looks nearly identical to the panic I reported last night in
> the thread "VFS panic (possibly NFS locking related?)".

I missed your message, just read it: yes, that sounds similar.

> I didn't manage
> to catch the ddb trace and had to work postmortem with a crash dump and
> gdb.  But it looked just like here.
>
> Lars: Do you by any chance have your home directory on an NFS mount?

Yes, I do.

> I think the reason that my gdb trace showed ?? instead of nfs_dolock
> is that I have nfsclient loaded as a module...

Mine's loaded as a module, too.

Lars
--
Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute





Re: panic starting gnome

2003-02-18 Thread Terry Lambert
Lars Eggert wrote:
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; lapic.id = 
> fault virtual address   = 0x34

> fault code  = supervisor read, page not present
> instruction pointer = 0x8:0xc01b28a6

[ ... ]

> kernel: type 12 trap, code=0
> Stopped at  _mtx_lock_flags+0x26:   cmpl    $0xc03884a0,0(%esi)

[ ... ]

> trap_fatal(e91a5780,34,c0372ee0,2e4,c658e780) at trap_fatal+0x250
> trap_pfault(e91a5780,0,34,c03e0758,34) at trap_pfault+0x17a
> trap(c21a0018,10,c0360010,9e,34) at trap+0x3e5
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc01b28a6, esp = 0xe91a57c0, ebp = 0xe91a57e0 ---
> _mtx_lock_flags(34,0,c035cf5f,9e,c658e780) at _mtx_lock_flags+0x26
                  **

Attempt to dereference the value 0x34 as if it were a pointer.

> namei(e91a5a44,c0207d5a,c749458c,0,c658e780) at namei+0x134

Called from here.

Debug:

1)  Make sure that the kernel that has the fault was
    created with config -g, so that there is a debug
    version of it lying around in the build directory.

2)  Make sure that the kernel you installed is the
    stripped version of the debug kernel (there are two
    kernels created as a result of config -g; one is
    kernel.debug (the debug version) and the other is
    kernel (the stripped version)).

3)  If #1 and #2 are not true, then make them true, and
    repeat the problem.

4)  Boot a kernel that doesn't crash instead, so that you
    can run the debugger.

5)  Go to the build directory, and look at the faulting
    code to see where it gets the value 0x34 to pass in
    to the _mtx_lock_flags(); this is the bogus value.  For
    example, if you had a debug kernel for the kernel that
    has the problem, and it was config'ed from i386 GENERIC,
    you would use the following sequence of commands:

        cd /sys/i386/compile/GENERIC
        gdb -k kernel.debug
        list *(namei+0x134)

6)  Change the code so the bogus value is no longer being
    passed.

7)  Live happily ever after.


Note that, to me, this looks like a problem with a dereference of a
current process which is not really current, as a result of a
wakeup occurring in an interrupt handler for an outstanding request
which was satisfied by the interrupt handler.

Note:   Under no circumstances should a page 0 address be passed
        around to anyone, since page zero is typically unmapped in
        order to trigger NULL pointer dereference faults and/or
        structure member reference faults for structure elements
        (at least in the initial 4K: range 0x0-0x1000)
        when a structure pointer itself is NULL.

IMO, the most likely cause is that you have a null structure
pointer, and the element at offset 0x34 into the structure is
being referenced out of it, without checking that the pointer
is not NULL, and the most likely culprit is a proc/kse/thread
type structure that's not guaranteed to be valid in interrupt
context.
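
To make the arithmetic concrete, here is a small C sketch.  The
structures are invented (they are not the real struct filedesc or
struct proc, whose layouts vary between kernel versions); the padding
is chosen only so that a member lands at offset 0x34, which is the
fault address in the trace.  It shows that reading a member through a
NULL structure pointer touches address 0 + offset, which is why the
fault address is small and nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mutex type, just something to point at. */
struct demo_mtx {
    uint32_t mtx_lock;
};

/* Hypothetical layout: padding stands in for whatever members come first. */
struct demo_filedesc {
    uint8_t         fd_pad[0x34];
    struct demo_mtx fd_mtx;         /* the member a lock macro would touch */
};

/*
 * Address the CPU touches when a member at `offset` is accessed through a
 * structure pointer whose value is `base`.
 */
uintptr_t
member_address(uintptr_t base, size_t offset)
{
    return base + offset;
}

/* Print the member address for a NULL base and for a plausible kernel address. */
void
show_fault_addresses(void)
{
    size_t off = offsetof(struct demo_filedesc, fd_mtx);

    assert(member_address(0, off) == off);

    printf("offset of fd_mtx    = 0x%zx\n", off);
    printf("NULL base touches   = 0x%lx\n",
        (unsigned long)member_address(0, off));
    printf("valid base touches  = 0x%lx\n",
        (unsigned long)member_address(0xc6630000u, off));
}
```

Calling show_fault_addresses() from main() prints 0x34 for the NULL
case; since page zero is unmapped, that access is the one that traps.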

Probably, the scheduler is switching directly from interrupt
of a process context Q to a wakeup of the same process Q,
without restoring a register value that should normally be
restored following an interrupt.  I have no idea which of the
schedulers you are using, so I have no idea if this should be
an expected omission; my best guess is you are using the new
one, though, because this is an unlikely problem with the old
one, if it's really a scheduler wakeup problem.

> namei(e91a5a44,c0207d5a,c749458c,0,c658e780) at namei+0x134
        ^
        |
> vn_open_cred(e91a5a44,e91a5a0c,0,c2195e80,0) at vn_open_cred+0x53c
               ^        ^
               |        |
  ...all three of these are also incredibly suspicious, at first sight...


Until you are willing to list out the code where the bogus value is
being passed to the function call, there's no way any of us are
going to be able to correlate your stack traceback to our own source
trees, in order to be able to help you, unless you are running a
tagged version (e.g. 5.0-RELEASE) with no modifications.

Just saying "the most recent current" or "I CVS'up'ed on xxx date" is
really useless to us, because CVS mirrors don't contain well known
information relative to a CVS'up date.  In many cases, we will need
you to check out (at least!) a fresh /sys source tree from the CVS
repository, using a date tag, if you are not running a -RELEASE
version.  Yes, this is a long-standing problem with the FreeBSD
project itself.

If you can do this, and repeat the