Re: Fatal trap 12: page fault while in kernel mode on 7.1/amd64, but not 7.0

2009-03-07 Thread Kostik Belousov
On Fri, Mar 06, 2009 at 05:01:12PM -0500, Boris Kochergin wrote:
 Gavin Atkinson wrote:
 On Thu, 2009-03-05 at 19:55 -0500, Boris Kochergin wrote:
   
 Ahoy. I recently upgraded an amd64 machine to 7.1-RELEASE, and started 
 getting a bunch of these at a pretty high frequency (a few hours to a 
 day apart):
 
 http://acm.poly.edu/~spawk/IMG00033.jpg
 
 The current process is always httpd. They're particularly annoying 
 because the machine doesn't actually ever reboot, requiring manual 
 intervention. Reverting the kernel back to 7.0 makes the panic go away, 
 and the machine had been happily running 7.0 for about a year 
 beforehand. I realize that the photo hardly contains any useful 
 debugging information, but I was hoping it might look familiar to 
 someone. If not, I guess I'll come back with a backtrace.
 
 
 A backtrace will almost certainly be necessary to figure out what this
 issue is, although there is a possibility that the output of
 addr2line -e /boot/kernel/kernel.symbols 0x8:0x802d7010
 might help, assuming you've not recompiled your kernel yet.  (That
 number should be the same as the instruction pointer shown by the
 panic, but as the photo is quite blurred there's a chance I've got it
 wrong. If you have a better picture of it or wrote it down, then use
 that.)
 
 Gavin
 ___
 freebsd-stable@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
   
 Here it is, with some additional information afterward:
 
 Unread portion of the kernel message buffer:
 kernel trap 12 with interrupts disabled
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 1; apic id = 01
 fault virtual address   = 0x30
 fault code              = supervisor read data, page not present
 instruction pointer     = 0x8:0x80293faf
 stack pointer           = 0x10:0x9cbaea70
 frame pointer           = 0x10:0xff000fc14000
 code segment            = base 0x0, limit 0xf, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 881 (httpd)
 trap number             = 12
 panic: page fault
 cpuid = 1
 Uptime: 1m51s
 Physical memory: 8185 MB
 Dumping 328 MB: 313 297 281 265 249 233 217 201 185 169 153 137 121 105 89 73 57 41 25 9
 
 #0  doadump () at pcpu.h:195
 195 pcpu.h: No such file or directory.
   in pcpu.h
 (kgdb) where
 #0  doadump () at pcpu.h:195
 #1  0xff000fc14000 in ?? ()
 #2  0x8025eba9 in boot (howto=260) at /usr/src-7.1/sys/kern/kern_shutdown.c:418
 #3  0x8025efb2 in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src-7.1/sys/kern/kern_shutdown.c:574
 #4  0x803df5c3 in trap_fatal (frame=0xff000fc14000, eva=Variable eva is not available.
 ) at /usr/src-7.1/sys/amd64/amd64/trap.c:764
 #5  0x803e018f in trap (frame=0x9cbae9c0) at /usr/src-7.1/sys/amd64/amd64/trap.c:290
 #6  0x803c5c4e in calltrap () at /usr/src-7.1/sys/amd64/amd64/exception.S:209
 #7  0x80293faf in turnstile_broadcast (ts=0x0, queue=0) at /usr/src-7.1/sys/kern/subr_turnstile.c:836
 #8  0x8025256a in _mtx_unlock_sleep (m=0x80593538, opts=Variable opts is not available.
 ) at /usr/src-7.1/sys/kern/kern_mutex.c:619
 #9  0x80275ed3 in __umtx_op_cv_wait (td=0x1ee, uap=Variable uap is not available.
 ) at /usr/src-7.1/sys/kern/kern_umtx.c:312
 #10 0x803dfb78 in syscall (frame=0x9cbaec80) at /usr/src-7.1/sys/amd64/amd64/trap.c:907
 #11 0x803c5e5b in Xfast_syscall () at /usr/src-7.1/sys/amd64/amd64/exception.S:330
 #12 0x000800f5354c in ?? ()
 Previous frame inner to this frame (corrupt stack?)
 (kgdb)
 
 The dump was difficult to acquire--the system would often lock up after 
 dumping only a portion of the memory it wanted to save. I can also now 
 trigger the panic pretty reliably using this bit of script:
 
 #!/usr/local/bin/bash

 for i in {1..900}
 do
     wget --quiet -O /dev/null http://acm.poly.edu/wiki/Hosting
 done
 
 ...where the URL is a MediaWiki installation on the afflicted machine.

Could you please recompile the kernel with debugging options and
reproduce the panic on it?

We need at least options INVARIANTS, INVARIANT_SUPPORT and WITNESS.
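
For reference, a minimal sketch of a kernel configuration that enables these three options. The config name PANICDEBUG is hypothetical, and the sketch assumes the file is placed in sys/amd64/conf/ of the 7.1 source tree and that the stock GENERIC config is the starting point; adjust it if the machine runs a custom kernel.

```
# sys/amd64/conf/PANICDEBUG  (hypothetical name)
# Start from the stock GENERIC kernel and add the requested checks.
include GENERIC
ident   PANICDEBUG

options INVARIANTS          # run-time consistency checks in the kernel
options INVARIANT_SUPPORT   # support code required by INVARIANTS
options WITNESS             # lock order verification
```

Assuming that layout, the kernel would then be built and installed from the top of the source tree with `make buildkernel KERNCONF=PANICDEBUG` followed by `make installkernel KERNCONF=PANICDEBUG`. Expect the machine to run noticeably slower with WITNESS enabled.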





Re: Fatal trap 12: page fault while in kernel mode on 7.1/amd64, but not 7.0

2009-03-06 Thread Kostik Belousov
On Thu, Mar 05, 2009 at 07:55:30PM -0500, Boris Kochergin wrote:
 Ahoy. I recently upgraded an amd64 machine to 7.1-RELEASE, and started 
 getting a bunch of these at a pretty high frequency (a few hours to a 
 day apart):
 
 http://acm.poly.edu/~spawk/IMG00033.jpg
 
 The current process is always httpd. They're particularly annoying 
 because the machine doesn't actually ever reboot, requiring manual 
 intervention. Reverting the kernel back to 7.0 makes the panic go away, 
 and the machine had been happily running 7.0 for about a year 
 beforehand. I realize that the photo hardly contains any useful 
 debugging information, but I was hoping it might look familiar to 
 someone. If not, I guess I'll come back with a backtrace.

You need to provide the backtrace from kgdb.




Re: Fatal trap 12: page fault while in kernel mode on 7.1/amd64, but not 7.0

2009-03-06 Thread Gavin Atkinson
On Thu, 2009-03-05 at 19:55 -0500, Boris Kochergin wrote:
 Ahoy. I recently upgraded an amd64 machine to 7.1-RELEASE, and started 
 getting a bunch of these at a pretty high frequency (a few hours to a 
 day apart):
 
 http://acm.poly.edu/~spawk/IMG00033.jpg
 
 The current process is always httpd. They're particularly annoying 
 because the machine doesn't actually ever reboot, requiring manual 
 intervention. Reverting the kernel back to 7.0 makes the panic go away, 
 and the machine had been happily running 7.0 for about a year 
 beforehand. I realize that the photo hardly contains any useful 
 debugging information, but I was hoping it might look familiar to 
 someone. If not, I guess I'll come back with a backtrace.

A backtrace will almost certainly be necessary to figure out what this
issue is, although there is a possibility that the output of
addr2line -e /boot/kernel/kernel.symbols 0x8:0x802d7010
might help, assuming you've not recompiled your kernel yet.  (That
number should be the same as the instruction pointer shown by the
panic, but as the photo is quite blurred there's a chance I've got it
wrong. If you have a better picture of it or wrote it down, then use
that.)
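
A small sketch of that lookup, for illustration only. It assumes that addr2line wants a plain hexadecimal address, so the `0x8:` prefix (the code segment selector as printed in the trap output) is dropped first. The helper name `strip_selector` is made up for this example, and the address is the one read from the photo, so it may be wrong.

```shell
#!/bin/sh
# Strip an optional "selector:" prefix from a trap-style instruction
# pointer such as 0x8:0x802d7010. A value without a colon is returned
# unchanged. (strip_selector is a hypothetical helper name.)
strip_selector() {
    printf '%s\n' "${1#*:}"
}

ip='0x8:0x802d7010'   # as read from the photo; possibly misread
addr=$(strip_selector "$ip")

symbols=/boot/kernel/kernel.symbols
if [ -r "$symbols" ] && command -v addr2line >/dev/null 2>&1; then
    # Only meaningful if the kernel has not been rebuilt since the panic.
    addr2line -e "$symbols" "$addr"
else
    echo "would run: addr2line -e $symbols $addr"
fi
```

On a machine without the symbols file, the script only prints the command it would have run.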

Gavin


Re: Fatal trap 12: page fault while in kernel mode on 7.1/amd64, but not 7.0

2009-03-06 Thread Boris Kochergin

Gavin Atkinson wrote:

On Thu, 2009-03-05 at 19:55 -0500, Boris Kochergin wrote:
  
Ahoy. I recently upgraded an amd64 machine to 7.1-RELEASE, and started 
getting a bunch of these at a pretty high frequency (a few hours to a 
day apart):


http://acm.poly.edu/~spawk/IMG00033.jpg

The current process is always httpd. They're particularly annoying 
because the machine doesn't actually ever reboot, requiring manual 
intervention. Reverting the kernel back to 7.0 makes the panic go away, 
and the machine had been happily running 7.0 for about a year 
beforehand. I realize that the photo hardly contains any useful 
debugging information, but I was hoping it might look familiar to 
someone. If not, I guess I'll come back with a backtrace.



A backtrace will almost certainly be necessary to figure out what this
issue is, although there is a possibility that the output of
addr2line -e /boot/kernel/kernel.symbols 0x8:0x802d7010
might help, assuming you've not recompiled your kernel yet.  (That
number should be the same as the instruction pointer shown by the
panic, but as the photo is quite blurred there's a chance I've got it
wrong. If you have a better picture of it or wrote it down, then use
that.)

Gavin
  

Here it is, with some additional information afterward:

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x30
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0x80293faf
stack pointer           = 0x10:0x9cbaea70
frame pointer           = 0x10:0xff000fc14000
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 881 (httpd)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 1m51s
Physical memory: 8185 MB
Dumping 328 MB: 313 297 281 265 249 233 217 201 185 169 153 137 121 105 89 73 57 41 25 9


#0  doadump () at pcpu.h:195
195 pcpu.h: No such file or directory.
  in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:195
#1  0xff000fc14000 in ?? ()
#2  0x8025eba9 in boot (howto=260) at /usr/src-7.1/sys/kern/kern_shutdown.c:418
#3  0x8025efb2 in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src-7.1/sys/kern/kern_shutdown.c:574
#4  0x803df5c3 in trap_fatal (frame=0xff000fc14000, eva=Variable eva is not available.
) at /usr/src-7.1/sys/amd64/amd64/trap.c:764
#5  0x803e018f in trap (frame=0x9cbae9c0) at /usr/src-7.1/sys/amd64/amd64/trap.c:290
#6  0x803c5c4e in calltrap () at /usr/src-7.1/sys/amd64/amd64/exception.S:209
#7  0x80293faf in turnstile_broadcast (ts=0x0, queue=0) at /usr/src-7.1/sys/kern/subr_turnstile.c:836
#8  0x8025256a in _mtx_unlock_sleep (m=0x80593538, opts=Variable opts is not available.
) at /usr/src-7.1/sys/kern/kern_mutex.c:619
#9  0x80275ed3 in __umtx_op_cv_wait (td=0x1ee, uap=Variable uap is not available.
) at /usr/src-7.1/sys/kern/kern_umtx.c:312
#10 0x803dfb78 in syscall (frame=0x9cbaec80) at /usr/src-7.1/sys/amd64/amd64/trap.c:907
#11 0x803c5e5b in Xfast_syscall () at /usr/src-7.1/sys/amd64/amd64/exception.S:330
#12 0x000800f5354c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

The dump was difficult to acquire--the system would often lock up after 
dumping only a portion of the memory it wanted to save. I can also now 
trigger the panic pretty reliably using this bit of script:


#!/usr/local/bin/bash

for i in {1..900}
do
    wget --quiet -O /dev/null http://acm.poly.edu/wiki/Hosting
done

...where the URL is a MediaWiki installation on the afflicted machine.

-Boris


Fatal trap 12: page fault while in kernel mode on 7.1/amd64, but not 7.0

2009-03-05 Thread Boris Kochergin
Ahoy. I recently upgraded an amd64 machine to 7.1-RELEASE, and started 
getting a bunch of these at a pretty high frequency (a few hours to a 
day apart):


http://acm.poly.edu/~spawk/IMG00033.jpg

The current process is always httpd. They're particularly annoying 
because the machine doesn't actually ever reboot, requiring manual 
intervention. Reverting the kernel back to 7.0 makes the panic go away, 
and the machine had been happily running 7.0 for about a year 
beforehand. I realize that the photo hardly contains any useful 
debugging information, but I was hoping it might look familiar to 
someone. If not, I guess I'll come back with a backtrace.


-Boris