Re: 6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Tom Judge

Michael Proto wrote:

Kris Kennaway wrote:

On Mon, Apr 23, 2007 at 01:24:52PM +0100, Tom Judge wrote:

Hi,

Recently I have noticed that one of our Dell PE1950's has been crashing 
a lot with the following reason "supervisor read, page not present".


The system runs 6.2 Release under i386.

I have attached 2 back traces, and I still have both cores if any more 
information is required.  Any light that can be shed on this problem 
would be greatly appreciated.




<<>>


You might be hitting a bug in an obscure code path because of the
above errors.  I'm CC'ing someone who might be able to help.

Kris



Bear in mind that a recent "urgent" firmware update was released by Dell
last week for 1950, 1955, and 2950 systems that is supposed to fix some
data-corruption issues related to dual-core processors. I don't know if
this problem is a symptom of that, but it is strongly suggested to apply
the firmware update regardless.





I have just been to Dell's site and there are firmware updates for almost 
every component in the system, released about 2 weeks ago (10-11/4).  I 
have around 17 [12]950's waiting to go into pre-production testing at 
the moment, so I think that I will spend some time upgrading the firmware 
on them now rather than later.


Thanks for the heads up.

Tom

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: 6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Kris Kennaway
On Mon, Apr 23, 2007 at 01:12:30PM -0400, Michael Proto wrote:
> Kris Kennaway wrote:
> > On Mon, Apr 23, 2007 at 01:24:52PM +0100, Tom Judge wrote:
> >> Hi,
> >>
> >> Recently I have noticed that one of our Dell PE1950's has been crashing 
> >> a lot with the following reason "supervisor read, page not present".
> >>
> >> The system runs 6.2 Release under i386.
> >>
> >> I have attached 2 back traces, and I still have both cores if any more 
> >> information is required.  Any light that can be shed on this problem 
> >> would be greatly appreciated.
> >>
> >> Tom
> >>
> >> ===
> >>
> >> uname -a
> >> FreeBSD narthex.mintel.co.uk 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Mon 
> >> Apr  2 20:13:11 BST 2007 
> >> [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PE1950  i386
> >>
> >>
> >> ## Core 1
> >>
> >> [EMAIL PROTECTED] '13:14:47' '/home/london/tj'
> >>> $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.1
> >> [GDB will not be able to debug user-mode threads: 
> >> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> >> GNU gdb 6.1.1 [FreeBSD]
> >> Copyright 2004 Free Software Foundation, Inc.
> >> GDB is free software, covered by the GNU General Public License, and you 
> >> are
> >> welcome to change it and/or distribute copies of it under certain 
> >> conditions.
> >> Type "show copying" to see the conditions.
> >> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> >> This GDB was configured as "i386-marcel-freebsd".
> >>
> >> Unread portion of the kernel message buffer:
> >>
> >>
> >> Fatal trap 12: page fault while in kernel mode
> >> cpuid = 0; apic id = 00
> >> fault virtual address   = 0x15c
> >> fault code              = supervisor read, page not present
> >> instruction pointer     = 0x20:0xc05df61f
> >> stack pointer           = 0x28:0xe4f63c30
> >> frame pointer           = 0x28:0xe4f63c90
> >> code segment            = base 0x0, limit 0xf, type 0x1b
> >>                         = DPL 0, pres 1, def32 1, gran 1
> >> processor eflags        = interrupt enabled, resume, IOPL = 0
> >> current process         = 12 (swi1: net)
> >> trap number             = 12
> >> panic: page fault
> >> cpuid = 0
> >> Uptime: 1h25m33s
> >> Dumping 2047 MB (2 chunks)
> >>  chunk 0: 1MB (159 pages) ... ok
> >>  chunk 1: 2047MB (523944 pages) 2031 2015 1999 1983 1967 1951 1935 1919 
> >> 1903 1887
> >> <7>arp_rtrequest: bad gateway 172.31.1.1 (!AF_LINK)
> >> <7>arp_rtrequest: bad gateway 172.31.0.1 (!AF_LINK)
> > 
> > You might be hitting a bug in an obscure code path because of the
> > above errors.  I'm CC'ing someone who might be able to help.
> > 
> > Kris
> > 
> 
> Bear in mind that a recent "urgent" firmware update was released by Dell
> last week for 1950, 1955, and 2950 systems that is supposed to fix some
> data-corruption issues related to dual-core processors. I don't know if
> this problem is a symptom of that, but it is strongly suggested to apply
> the firmware update regardless.

Thanks, this could be very relevant.

Kris


Re: 6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Michael Proto
Kris Kennaway wrote:
> On Mon, Apr 23, 2007 at 01:24:52PM +0100, Tom Judge wrote:
>> Hi,
>>
>> Recently I have noticed that one of our Dell PE1950's has been crashing 
>> a lot with the following reason "supervisor read, page not present".
>>
>> The system runs 6.2 Release under i386.
>>
>> I have attached 2 back traces, and I still have both cores if any more 
>> information is required.  Any light that can be shed on this problem 
>> would be greatly appreciated.
>>
>> Tom
>>
>> ===
>>
>> uname -a
>> FreeBSD narthex.mintel.co.uk 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Mon 
>> Apr  2 20:13:11 BST 2007 
>> [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PE1950  i386
>>
>>
>> ## Core 1
>>
>> [EMAIL PROTECTED] '13:14:47' '/home/london/tj'
>>> $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.1
>> [GDB will not be able to debug user-mode threads: 
>> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain 
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd".
>>
>> Unread portion of the kernel message buffer:
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address   = 0x15c
>> fault code              = supervisor read, page not present
>> instruction pointer     = 0x20:0xc05df61f
>> stack pointer           = 0x28:0xe4f63c30
>> frame pointer           = 0x28:0xe4f63c90
>> code segment            = base 0x0, limit 0xf, type 0x1b
>>                         = DPL 0, pres 1, def32 1, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 12 (swi1: net)
>> trap number             = 12
>> panic: page fault
>> cpuid = 0
>> Uptime: 1h25m33s
>> Dumping 2047 MB (2 chunks)
>>  chunk 0: 1MB (159 pages) ... ok
>>  chunk 1: 2047MB (523944 pages) 2031 2015 1999 1983 1967 1951 1935 1919 
>> 1903 1887
>> <7>arp_rtrequest: bad gateway 172.31.1.1 (!AF_LINK)
>> <7>arp_rtrequest: bad gateway 172.31.0.1 (!AF_LINK)
> 
> You might be hitting a bug in an obscure code path because of the
> above errors.  I'm CC'ing someone who might be able to help.
> 
> Kris
> 

Bear in mind that a recent "urgent" firmware update was released by Dell
last week for 1950, 1955, and 2950 systems that is supposed to fix some
data-corruption issues related to dual-core processors. I don't know if
this problem is a symptom of that, but it is strongly suggested to apply
the firmware update regardless.



-Proto


Re: 6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Kris Kennaway
On Mon, Apr 23, 2007 at 01:24:52PM +0100, Tom Judge wrote:
> Hi,
> 
> Recently I have noticed that one of our Dell PE1950's has been crashing 
> a lot with the following reason "supervisor read, page not present".
> 
> The system runs 6.2 Release under i386.
> 
> I have attached 2 back traces, and I still have both cores if any more 
> information is required.  Any light that can be shed on this problem 
> would be greatly appreciated.
> 
> Tom
> 
> ===
> 
> uname -a
> FreeBSD narthex.mintel.co.uk 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Mon 
> Apr  2 20:13:11 BST 2007 
> [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PE1950  i386
> 
> 
> ## Core 1
> 
> [EMAIL PROTECTED] '13:14:47' '/home/london/tj'
> > $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.1
> [GDB will not be able to debug user-mode threads: 
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain 
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x15c
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc05df61f
> stack pointer           = 0x28:0xe4f63c30
> frame pointer           = 0x28:0xe4f63c90
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 12 (swi1: net)
> trap number             = 12
> panic: page fault
> cpuid = 0
> Uptime: 1h25m33s
> Dumping 2047 MB (2 chunks)
>  chunk 0: 1MB (159 pages) ... ok
>  chunk 1: 2047MB (523944 pages) 2031 2015 1999 1983 1967 1951 1935 1919 
> 1903 1887
> <7>arp_rtrequest: bad gateway 172.31.1.1 (!AF_LINK)
> <7>arp_rtrequest: bad gateway 172.31.0.1 (!AF_LINK)

You might be hitting a bug in an obscure code path because of the
above errors.  I'm CC'ing someone who might be able to help.

Kris

> 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 
> 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 
> 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215 
> 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007 991 975 
> 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 
> 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 
> 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 
> 95 79 63 47 31 15
> 
> #0  doadump () at pcpu.h:165
> 165 pcpu.h: No such file or directory.
>in pcpu.h
> (kgdb) bt
> #0  doadump () at pcpu.h:165
> #1  0xc05622ba in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xc05625e1 in panic (fmt=0xc06e2578 "%s") at 
> /usr/src/sys/kern/kern_shutdown.c:565
> #3  0xc06b4580 in trap_fatal (frame=0xe4f63bf0, eva=16777308) at 
> /usr/src/sys/i386/i386/trap.c:837
> #4  0xc06b42bf in trap_pfault (frame=0xe4f63bf0, usermode=0, 
> eva=16777308) at /usr/src/sys/i386/i386/trap.c:745
> #5  0xc06b3f19 in trap (frame=
>  {tf_fs = -1067581432, tf_es = -965803992, tf_ds = -964624344, 
> tf_edi = -957112288, tf_esi = -965676032, tf_ebp = -453624688, tf_isp = 
> -453624804, tf_ebx = 16777216, tf_edx = -968955648, tf_ecx = 4, tf_eax = 
> 0, tf_trapno = 12, tf_err = 0, tf_eip = -1067583969, tf_cs = 32, 
> tf_eflags = 66118, tf_esp = 3, tf_ss = 0}) at 
> /usr/src/sys/i386/i386/trap.c:435
> #6  0xc06a095a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #7  0xc05df61f in in_arpinput (m=0xc68ba200) at 
> /usr/src/sys/netinet/if_ether.c:636
> #8  0xc05df4ea in arpintr (m=0xc68ba200) at 
> /usr/src/sys/netinet/if_ether.c:551
> #9  0xc05d861b in netisr_processqueue (ni=0xc076b078) at 
> /usr/src/sys/net/netisr.c:236
> #10 0xc05d881a in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
> #11 0xc054cc49 in ithread_execute_handlers (p=0xc63ed860, ie=0xc643bb80) 
> at /usr/src/sys/kern/kern_intr.c:682
> #12 0xc054cd59 in ithread_loop (arg=0xc63bb870) at 
> /usr/src/sys/kern/kern_intr.c:765
> #13 0xc054b9fd in fork_exit (callout=0xc054cd04 , 
> arg=0xc63bb870, frame=0xe4f63d38) at /usr/src/sys/kern/kern_fork.c:821
> #14 0xc06a09bc in fork_trampoline () at 
> /usr/src/sys/i386/i386/exception.s:208
> (kgdb) exit
> Undefined command: "exit".  Try "help".
> (kgdb) quit
> 
> 
> ## Core 2
> [EMAIL PROTECTED] '13:15:32' '/home/london/tj'
> > $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.0
> [GDB will not be able to debug user-mode threads: 
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglo

Re: 6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Brian A. Seklecki
Are you running i386 on this platform to keep your network platform
homogeneous?  We run 6.2p2 on amd64 on the same hardware w/o issue.

That's an interesting ARP issue; are you using bce(4)?  We prefer to
stack the machine with PCIe4x em(4) cards.

~BAS

On Mon, 2007-04-23 at 13:24 +0100, Tom Judge wrote:
> I have attached 2 back traces, and I still have both cores if any
> more 
-- 
Brian A. Seklecki <[EMAIL PROTECTED]>
Collaborative Fusion, Inc.






6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Tom Judge

Hi,

Recently I have noticed that one of our Dell PE1950s has been crashing 
a lot with the fault code "supervisor read, page not present".


The system runs 6.2-RELEASE on i386.

I have attached 2 back traces, and I still have both cores if any more 
information is required.  Any light that can be shed on this problem 
would be greatly appreciated.


Tom

===

uname -a
FreeBSD narthex.mintel.co.uk 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Mon 
Apr  2 20:13:11 BST 2007 
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/PE1950  i386



## Core 1

[EMAIL PROTECTED] '13:14:47' '/home/london/tj'
> $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.1
[GDB will not be able to debug user-mode threads: 
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x15c
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc05df61f
stack pointer           = 0x28:0xe4f63c30
frame pointer           = 0x28:0xe4f63c90
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi1: net)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 1h25m33s
Dumping 2047 MB (2 chunks)
 chunk 0: 1MB (159 pages) ... ok
 chunk 1: 2047MB (523944 pages) 2031 2015 1999 1983 1967 1951 1935 1919 
1903 1887

<7>arp_rtrequest: bad gateway 172.31.1.1 (!AF_LINK)
<7>arp_rtrequest: bad gateway 172.31.0.1 (!AF_LINK)
1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 
1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 
1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215 
1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007 991 975 
959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 
671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 
383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 
95 79 63 47 31 15


#0  doadump () at pcpu.h:165
165 pcpu.h: No such file or directory.
   in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc05622ba in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc05625e1 in panic (fmt=0xc06e2578 "%s") at 
/usr/src/sys/kern/kern_shutdown.c:565
#3  0xc06b4580 in trap_fatal (frame=0xe4f63bf0, eva=16777308) at 
/usr/src/sys/i386/i386/trap.c:837
#4  0xc06b42bf in trap_pfault (frame=0xe4f63bf0, usermode=0, 
eva=16777308) at /usr/src/sys/i386/i386/trap.c:745

#5  0xc06b3f19 in trap (frame=
 {tf_fs = -1067581432, tf_es = -965803992, tf_ds = -964624344, 
tf_edi = -957112288, tf_esi = -965676032, tf_ebp = -453624688, tf_isp = 
-453624804, tf_ebx = 16777216, tf_edx = -968955648, tf_ecx = 4, tf_eax = 
0, tf_trapno = 12, tf_err = 0, tf_eip = -1067583969, tf_cs = 32, 
tf_eflags = 66118, tf_esp = 3, tf_ss = 0}) at 
/usr/src/sys/i386/i386/trap.c:435

#6  0xc06a095a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc05df61f in in_arpinput (m=0xc68ba200) at 
/usr/src/sys/netinet/if_ether.c:636
#8  0xc05df4ea in arpintr (m=0xc68ba200) at 
/usr/src/sys/netinet/if_ether.c:551
#9  0xc05d861b in netisr_processqueue (ni=0xc076b078) at 
/usr/src/sys/net/netisr.c:236

#10 0xc05d881a in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
#11 0xc054cc49 in ithread_execute_handlers (p=0xc63ed860, ie=0xc643bb80) 
at /usr/src/sys/kern/kern_intr.c:682
#12 0xc054cd59 in ithread_loop (arg=0xc63bb870) at 
/usr/src/sys/kern/kern_intr.c:765
#13 0xc054b9fd in fork_exit (callout=0xc054cd04 , 
arg=0xc63bb870, frame=0xe4f63d38) at /usr/src/sys/kern/kern_fork.c:821
#14 0xc06a09bc in fork_trampoline () at 
/usr/src/sys/i386/i386/exception.s:208

(kgdb) exit
Undefined command: "exit".  Try "help".
(kgdb) quit
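
A note for anyone reading frame #5 above: kgdb prints the trapframe
registers as signed 32-bit decimals, so they do not look like the hex
addresses in the trap message or the other frames.  A minimal sketch in
Python for converting them follows; the helper name to_hex32 is made up
for illustration, and the values are copied from the backtrace above.

```python
def to_hex32(value: int) -> str:
    """Render a signed 32-bit decimal (as kgdb prints it) as unsigned hex."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return hex(value & 0xFFFFFFFF)


# Values copied from frame #5 (trapframe) and the eva argument in frames #3/#4.
values = {
    "tf_eip": -1067583969,
    "tf_ebp": -453624688,
    "tf_ebx": 16777216,
    "eva": 16777308,
}

for name, value in values.items():
    print(f"{name:8s} {to_hex32(value)}")
```

With these inputs tf_eip comes out as 0xc05df61f, the same address as the
"instruction pointer" line and frame #7 (in_arpinput), and tf_ebp comes
out as 0xe4f63c90, matching the "frame pointer" line.  The eva argument
converts to 0x100005c, which is tf_ebx (0x1000000) plus 0x5c.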


## Core 2
[EMAIL PROTECTED] '13:15:32' '/home/london/tj'
> $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: 
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".