I had reported this earlier, but the similarities are striking:

I too have seen strange AMD panics where stack variables inexplicably
go to zero.  My systems are K6/2-400's, and I have often witnessed the 
following fault (it only happens on a *really* busy web server):

#0  boot (howto=256) at ../../kern/kern_shutdown.c:285
#1  0xc014aad1 in panic (fmt=0xc023878a "page fault")
    at ../../kern/kern_shutdown.c:446
#2  0xc02098ce in trap_fatal (frame=0xcc74eecc, eva=134812896)
    at ../../i386/i386/trap.c:942
#3  0xc0209587 in trap_pfault (frame=0xcc74eecc, usermode=0, eva=134812896)
    at ../../i386/i386/trap.c:835
#4  0xc02091ba in trap (frame={tf_es = -887750640, tf_ds = -1036058608, 
      tf_edi = -1050208512, tf_esi = -1043943040, tf_ebp = -864751828, 
      tf_isp = -864751884, tf_ebx = 2287, tf_edx = -1036043576, tf_ecx = 0, 
      tf_eax = 134812884, tf_trapno = 12, tf_err = 2, tf_eip = -1072417321, 
      tf_cs = 8, tf_eflags = 66054, tf_esp = -1041509376, tf_ss = -1036024832})
    at ../../i386/i386/trap.c:437
#5  0xc01435d7 in fdcopy (p=0xcc5796e0) at ../../kern/kern_descrip.c:954
#6  0xc014587b in fork1 (p1=0xcc5796e0, flags=-2147483596)
    at ../../kern/kern_fork.c:379
#7  0xc014533b in vfork (p=0xcc5796e0, uap=0xcc74ef94)
    at ../../kern/kern_fork.c:109
#8  0xc0209b17 in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 236237520, 
      tf_esi = 236231856, tf_ebp = -1077952324, tf_isp = -864751644, 
      tf_ebx = 673171048, tf_edx = 163766316, tf_ecx = 672877149, tf_eax = 66, 
      tf_trapno = 7, tf_err = 2, tf_eip = 672936705, tf_cs = 31, 
      tf_eflags = 514, tf_esp = -1077952368, tf_ss = 39})
    at ../../i386/i386/trap.c:1100
#9  0xc01feedc in Xint0x80_syscall ()

Now the interesting code here is at stack frame #5:

(kgdb) list
948             fpp = newfdp->fd_ofiles;
949             for (i = newfdp->fd_lastfile; i-- >= 0; fpp++)
950                     if (*fpp != NULL)
951                             (*fpp)->f_count++;

(kgdb) p newfdp->fd_ofiles
$1 = (struct file **) 0xc23f2000
(kgdb) p fpp
$2 = (struct file **) 0x0

Now... the only operation on fpp is fpp++.  It should take a _long_
time for fpp to get around to 0 and you'd think that *fpp would be
zero long before that (or cause a page fault at some other
nonexistent location).
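
To make that argument concrete, here is a minimal userland sketch of the
same loop shape.  The struct definitions are pared-down stand-ins and
bump_refcounts() is a hypothetical name, not the kernel's; only the loop
itself is taken from the kgdb listing above.

```c
#include <stddef.h>

/* Hypothetical, pared-down stand-ins for the kernel structures. */
struct file {
	int f_count;			/* reference count */
};

struct filedesc {
	struct file **fd_ofiles;	/* open-file table */
	int fd_lastfile;		/* highest index in use */
};

/*
 * Same loop shape as the kgdb listing: bump the reference count of
 * every open file.  Returns the final value of fpp so a caller can
 * check how far the pointer advanced.
 */
struct file **
bump_refcounts(struct filedesc *newfdp)
{
	struct file **fpp;
	int i;

	fpp = newfdp->fd_ofiles;
	for (i = newfdp->fd_lastfile; i-- >= 0; fpp++)
		if (*fpp != NULL)
			(*fpp)->f_count++;
	return fpp;
}
```

In this sketch the loop body runs fd_lastfile + 1 times, so fpp should
end up exactly fd_lastfile + 1 slots past fd_ofiles.  Nothing in the
loop can move it backwards or reset it.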

So... the similarity here is that deep in the kernel, we have an
automatic (possibly register) local variable that's getting zeroed.

I have half-a-dozen crash dumps of this nature.  For me, it always
happens in fdcopy().  This may be due to the fact that the machine is
running a large apache config --- so fork() is something it's doing
often.

