Re: fatal trap 12 in pagedaemon on dual-core opteron machine

2005-07-01 Thread Rob Watt
On Thu, 30 Jun 2005, Kris Kennaway wrote:

> On Thu, Jun 30, 2005 at 04:00:47PM -0400, Rob Watt wrote:
>
> > #7  0x80400c0b in calltrap () at
> > /usr/src/sys/amd64/amd64/exception.S:171
> > #8  0xff007c3b00f0 in ?? ()
> > #9  0xff007b78c500 in ?? ()
> > #10 0x0001840f in ?? ()
> > #11 0x in ?? ()
> > #12 0x in ?? ()
>
> [..]
>
> All these bogus stack frames can be caused by having compiled the
> kernel with -O2 instead of -O.  Is this the case?

It seems the default for amd64 is to compile with:
COPTFLAGS="-O2 -frename-registers -pipe"
I changed the -O2 to -O, and there are still a large number of bogus stack
frames (although there are more readable frames than before):

#0  doadump () at pcpu.h:167
#1  0x in ?? ()
#2  0x802aca23 in boot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:410
#3  0x802ace8b in panic (fmt=0xff007b78c500 "\u\022y{") at
/usr/src/sys/kern/kern_shutdown.c:566
#4  0x804275bc in trap_fatal (frame=0xff007b78c500,
eva=18446742976269456104)
at /usr/src/sys/amd64/amd64/trap.c:639
#5  0x80427220 in trap_pfault (frame=0xb1c129c0,
usermode=0) at /usr/src/sys/amd64/amd64/trap.c:562
#6  0x80426e99 in trap (frame=
  {tf_rdi = -1097427386128, tf_rsi = -1097440115456, tf_rdx = 100956,
tf_rcx = 0, tf_r8 = 0, tf_r9 = 0, tf_rax = 100956, tf_rbx = 0, tf_rbp =
-1098510893056, tf_r10 = 30, tf_r11 = 29, tf_r12 = -1097364252160, tf_r13
= -2143265920, tf_r14 = 0, tf_r15 = -2141262160, tf_trapno = 12, tf_addr =
136, tf_flags = 0, tf_err = 0, tf_rip = -2144628916, tf_cs = 8, tf_rflags
= 66050, tf_rsp = -1312740736, tf_ss = 16}) at
/usr/src/sys/amd64/amd64/trap.c:341
#7  0x80413c5b in calltrap () at
/usr/src/sys/amd64/amd64/exception.S:171
#8  0xff007c3b00f0 in ?? ()
#9  0xff007b78c500 in ?? ()
#10 0x00018a5c in ?? ()
#11 0x in ?? ()
#12 0x in ?? ()
#13 0x in ?? ()
#14 0x00018a5c in ?? ()
#15 0x in ?? ()
#16 0xff003ba6 in ?? ()
#17 0x001e in ?? ()
#18 0x001d in ?? ()
#19 0xff007ffe5a00 in ?? ()
#20 0x80405b80 in vm_pageout_page_stats () at
/usr/src/sys/vm/vm_pageout.c:1350
#21 0x in ?? ()
#22 0x805eeeb0 in sysctl___kern_sched_runq_fuzz ()
#23 0x000c in ?? ()
#24 0x0088 in ?? ()
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x802b8f4c in thread_fini (mem=0x0, size=0) at
/usr/src/sys/kern/kern_thread.c:271
#28 0x0010 in ?? ()
#29 0xff007ffe4620 in ?? ()
#30 0x in ?? ()
#31 0xff003ba60f98 in ?? ()
#32 0x80407a41 in zone_drain (zone=0x10202) at
/usr/src/sys/vm/uma_core.c:749
#33 0x80408ed6 in zone_foreach (zfunc=0x80407810
) at /usr/src/sys/vm/uma_core.c:1494
#34 0x8040acb5 in uma_reclaim () at
/usr/src/sys/vm/uma_core.c:2623
#35 0x80404836 in vm_pageout_scan (pass=0) at
/usr/src/sys/vm/vm_pageout.c:674
#36 0x80405f1e in vm_pageout () at
/usr/src/sys/vm/vm_pageout.c:1476
#37 0x80292e4b in fork_exit (callout=0x80405b80
, arg=0x0, frame=0xb1c12c50)
at /usr/src/sys/kern/kern_fork.c:791
#38 0x80413e5e in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:296
#39 0x in ?? ()
#40 0x in ?? ()
#41 0x0001 in ?? ()
#42 0x in ?? ()
#43 0x in ?? ()
#44 0x in ?? ()
#45 0x in ?? ()
#46 0x in ?? ()
#47 0x in ?? ()
#48 0x in ?? ()
#49 0x in ?? ()
#50 0x in ?? ()
#51 0x in ?? ()
#52 0x in ?? ()
#53 0x in ?? ()
#54 0x in ?? ()
#55 0x in ?? ()
#56 0x in ?? ()
#57 0x in ?? ()
#58 0x in ?? ()
#59 0x in ?? ()
#60 0x in ?? ()
#61 0x in ?? ()
#62 0x in ?? ()
#63 0x in ?? ()
#64 0x in ?? ()
#65 0x in ?? ()
#66 0x in ?? ()
#67 0x in ?? ()
#68 0x in ?? ()
#69 0x in ?? ()
#70 0x in ?? ()
#71 0x0081e000 in ?? ()
#72 0x806457f4 in vm_page_max_wired ()
#73 0x in ?? ()
#74 0x0001 in ?? ()
#75 0xff007b7912e8 in ?? ()
#76 0xff007b7f5000 in ?? ()
#77 0xb1c12ae8 in ?? ()
#78 0xff007b78c500 in ?? ()
#79 0x802c0c84 in sched_switch (td=0x0, newtd=0x0, flags=1) at
/usr/src/sys/kern/sched_4bsd.c:881
...
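
One way to cross-check the trace: kgdb prints the trap frame registers as signed decimal numbers, so they are hard to compare with the hexadecimal frame addresses. The sketch below (Python, not part of the original session; the helper name `to_unsigned64` is made up for illustration) reinterprets the `tf_rip` and `tf_addr` values from frame #6 as unsigned 64-bit values. The low 32 bits of the converted `tf_rip` can then be compared with the address shown for `thread_fini` in frame #27.

```python
MASK64 = (1 << 64) - 1  # all 64 bits set


def to_unsigned64(value: int) -> int:
    """Reinterpret a signed 64-bit integer as an unsigned one."""
    return value & MASK64


# Values copied from the trap frame printed in frame #6 above.
tf_rip = -2144628916
tf_addr = 136

rip = to_unsigned64(tf_rip)
print(f"tf_rip  = {rip:#018x}")               # full 64-bit address
print(f"low 32  = {rip & 0xffffffff:#010x}")  # compare with frame #27
print(f"tf_addr = {tf_addr:#x}")              # faulting virtual address
```

The same conversion works for any other register in the frame, for example to check whether a value looks like a kernel pointer.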

-
Rob Watt
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable

Re: fatal trap 12 in pagedaemon on dual-core opteron machine

2005-06-30 Thread Kris Kennaway
On Thu, Jun 30, 2005 at 04:00:47PM -0400, Rob Watt wrote:

> #7  0x80400c0b in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:171
> #8  0xff007c3b00f0 in ?? ()
> #9  0xff007b78c500 in ?? ()
> #10 0x0001840f in ?? ()
> #11 0x in ?? ()
> #12 0x in ?? ()

[..]

All these bogus stack frames can be caused by having compiled the
kernel with -O2 instead of -O.  Is this the case?

Kris




fatal trap 12 in pagedaemon on dual-core opteron machine

2005-06-30 Thread Rob Watt
I've been having stability problems in a new dual-core opteron machine.
When it first crashed we were running a number of multicast
applications that were listening to, recording, and rebroadcasting data. I
have been able to replay the data up to the crash and can reproduce the
panic in roughly 1 out of every 5 or 6 replays.

The hardware is:

- 2 x Opteron 275 (2.2 GHz) dual-core processors
- Tyan 2881 K8SR motherboard (BIOS 2.05v)
- 2 GB DDR 400 MHz PC3200 registered ECC memory
- Seagate ST3146807LC SCSI drive
- Intel dual-port gigabit Ethernet card

I have experienced the problem in both 5.4-RELEASE and STABLE (stable
as of today). I am running a custom SMP kernel.

The most recent panic message was:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x88
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0x802af0da
stack pointer           = 0x10:0xb1c16a60
frame pointer           = 0x10:0xff005b8cd000
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 63 (pagedaemon)
trap number             = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 1h3m10s
Dumping 2047 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320
336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608
624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 896
912 928 944 960 976 992 1008 1024 1040 1056 1072 1088 1104 1120 1136 1152
1168 1184 1200 1216 1232 1248 1264 1280 1296 1312 1328 1344 1360 1376 1392
1408 1424 1440 1456 1472 1488 1504 1520 1536 1552 1568 1584 1600 1616 1632
1648 1664 1680 1696 1712 1728 1744 1760 1776 1792 1808 1824 1840 1856 1872
1888 1904 1920 1936 1952 1968 1984 2000 2016 2032

Every time it crashes the current process is pagedaemon, and the
instruction pointer points to 'thread_fini'.
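
For reference, the "fault code" line in the panic message is a textual decoding of the x86 page-fault error code, which shows up as `tf_err` in the trap frame. The following is a minimal sketch in Python of that decoding, assuming only the three low architectural bits (present, write, user). The function name `decode_pf_error` is made up for illustration and the wording of the result simply mirrors the panic message; this is not the kernel's own code.

```python
# Low bits of the x86 page-fault error code.
PF_PRESENT = 0x1  # 0: page not present, 1: protection violation
PF_WRITE = 0x2    # 0: read access,      1: write access
PF_USER = 0x4     # 0: supervisor mode,  1: user mode


def decode_pf_error(err: int) -> str:
    """Describe a page-fault error code in the style of the panic message."""
    mode = "user" if err & PF_USER else "supervisor"
    access = "write" if err & PF_WRITE else "read"
    cause = "protection violation" if err & PF_PRESENT else "page not present"
    return f"{mode} {access}, {cause}"


# tf_err is 0 in the trap frame reported by kgdb.
print(decode_pf_error(0))  # supervisor read, page not present
```

An error code of 0 together with a fault address as small as 0x88 is what a read through a NULL structure pointer at offset 0x88 would produce, although the dump alone does not prove that this is what happened here.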

kgdb shows:

host1# kgdb /usr/obj/usr/src/sys/LOCAL/kernel.debug /usr/tmp/crash/vmcore.6
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
This GDB was configured as "amd64-marcel-freebsd".
#0  doadump () at pcpu.h:167
167 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:167
#1  0x in ?? ()
#2  0x802a2bd7 in boot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:410
#3  0x802a340f in panic (fmt=0xff007b78c500 "\u\022y{") at
/usr/src/sys/kern/kern_shutdown.c:566
#4  0x80412f8a in trap_fatal (frame=0xff007b78c500,
eva=18446742976269456104)
at /usr/src/sys/amd64/amd64/trap.c:639
#5  0x804132af in trap_pfault (frame=0xb1c169b0,
usermode=0) at /usr/src/sys/amd64/amd64/trap.c:562
#6  0x80413553 in trap (frame=
  {tf_rdi = -1097427386128, tf_rsi = -1097440115456, tf_rdx = 99343,
tf_rcx = 0, tf_r8 = 0, tf_r9 = 0, tf_rax = 99343, tf_rbx = 0, tf_rbp =
-1097975672832, tf_r10 = 4503599627366400, tf_r11 = 3424, tf_r12 = 4,
tf_r13 = 4, tf_r14 = -1098264600680, tf_r15 = -1097364252144, tf_trapno =
12, tf_addr = 136, tf_flags = -1098264600680, tf_err = 0, tf_rip =
-2144669478, tf_cs = 8, tf_rflags = 66050, tf_rsp = -1312724368, tf_ss =
16})
at /usr/src/sys/amd64/amd64/trap.c:341
#7  0x80400c0b in calltrap () at
/usr/src/sys/amd64/amd64/exception.S:171
#8  0xff007c3b00f0 in ?? ()
#9  0xff007b78c500 in ?? ()
#10 0x0001840f in ?? ()
#11 0x in ?? ()
#12 0x in ?? ()
#13 0x in ?? ()
#14 0x0001840f in ?? ()
#15 0x in ?? ()
#16 0xff005b8cd000 in ?? ()
#17 0x000ff000 in ?? ()
#18 0x0d60 in ?? ()
#19 0x0004 in ?? ()
#20 0x0004 in ?? ()
#21 0xff004a541f98 in ?? ()
#22 0xff007ffe5a10 in ?? ()
#23 0x000c in ?? ()
#24 0x0088 in ?? ()
#25 0xff004a541f98 in ?? ()
#26 0x in ?? ()
#27 0x802af0da in thread_fini (mem=0x0, size=0) at
/usr/src/sys/kern/kern_thread.c:271
#28 0x0010 in ?? ()
#29 0x0001 in ?? ()
#30 0xff007ffe5a00 in ?? ()
#31 0xff005b8cdf98 in ?? ()
#32 0x803f67ff in zone_drain (zone=0x8) at
/usr/src/sys/vm/uma_core.c:749
#33 0x803f43b6 in zone_foreach (zfunc=0x803f6630
) at /usr/src/sys/vm/uma_core.c:1494
#34 0x803f7fc9 in uma_reclaim () at
/usr/src/sys/vm/uma_core.c:2623
#35 0x803f1dac in vm_pageout () at
/usr/src/sys/vm/vm_pageout.c:674
#36 0x802898cc in fork_exit (callout=0x803f17b0
, arg=0x0, frame=0xb1c16c50)
at /usr/src/sys/kern/kern_fork.c:791
#37 0x80400e0e in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:296
#38 0x in ?? ()
#39 0x in ?? ()
#40 0x00