kernel panic 6.2-RELEASE SMP dual quad core

2007-12-29 Thread Iain Dooley

hi all,

uname -a
FreeBSD HOSTNAME 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 11:05:30 UTC 
2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386


running on dual quad core intel xeons with 4gb ram.

my server has been rebooting quite a bit and stopped responding today. i found 
this on the console:


Fatal trap: 12 page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address = 0x0
fault code = supervisor write, page not present
instruction pointer = 0x20:0xc0880472
stack pointer = 0x28:0xe6ea9c8c
code segment = base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 12 (idle: cpu5)
trap number = 12
panic: page fault
cpuid = 5
uptime 24m41s
cannot dump. no dump device specified

i've configured the dump device and will follow the kernel debugging details in 
the handbook if it happens again but i thought i'd write in now in case the 
cause of the problem jumped out at anyone.
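
for reference, the dump device is normally set in /etc/rc.conf. the lines below are a minimal sketch, not the exact configuration on this machine: dumpdev="AUTO" assumes the release's rc scripts accept that value (if not, name the swap partition explicitly; the device name in the comment is hypothetical), and dumpdir is shown at its usual default.

```shell
# /etc/rc.conf fragment (sketch; values are assumptions, not copied from this server)
dumpdev="AUTO"           # use a configured swap device for crash dumps
# dumpdev="/dev/ad0s1b"  # hypothetical explicit swap partition, if AUTO is not accepted
dumpdir="/var/crash"     # where savecore(8) writes vmcore.N on the next boot
```

the swap device has to be large enough to hold the dump. after the next panic and reboot, the saved core can be opened with kgdb (a kernel with symbols, then the vmcore file, as arguments), and "bt" at the prompt prints the backtrace.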


i've run mprime for 24 hours, and memtest for 3 passes, and a script i 
wrote which just exhausts ram and CPU, then backs off and does it again 
which have been running for 24 hours.


i've also just started running this:

http://www.holm.cc/stress/

i noticed that the same "fatal trap 12" appeared during stress tests as 
listed here:


http://people.freebsd.org/~pho/stress/log/cons224.html

any help, guidance or information would be much appreciated.

cheers

iain
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: kernel panic 6.2-RELEASE SMP dual quad core

2007-12-29 Thread Kris Kennaway

Iain Dooley wrote:

> hi all,
>
> uname -a
> FreeBSD HOSTNAME 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 11:05:30
> UTC 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386
>
> running on dual quad core intel xeons with 4gb ram.
>
> my server has been rebooting quite a bit and stopped responding today. i
> found this on the console:
>
> Fatal trap: 12 page fault while in kernel mode
> cpuid = 5; apic id = 05
> fault virtual address = 0x0
> fault code = supervisor write, page not present
> instruction pointer = 0x20:0xc0880472
> stack pointer = 0x28:0xe6ea9c8c
> code segment = base 0x0, limit 0xf, type 0x1b
>  = DPL 0, pres 1, def32 1, gran 1
> processor eflags = resume, IOPL = 0
> current process = 12 (idle: cpu5)
> trap number = 12
> panic: page fault
> cpuid = 5
> uptime 24m41s
> cannot dump. no dump device specified
>
> i've configured the dump device and will follow the kernel debugging
> details in the handbook if it happens again but i thought i'd write in
> now in case the cause of the problem jumped out at anyone.
>
> i've run mprime for 24 hours, and memtest for 3 passes, and a script i
> wrote which just exhausts ram and CPU, then backs off and does it again
> which have been running for 24 hours.
>
> i've also just started running this:
>
> http://www.holm.cc/stress/
>
> i noticed that the same "fatal trap 12" appeared during stress tests as
> listed here:
>
> http://people.freebsd.org/~pho/stress/log/cons224.html
>
> any help, guidance or information would be much appreciated.


What you have so far is close to meaningless.  The "fault virtual 
address = 0x0" means little more than "somewhere in the kernel there was 
a null pointer dereference".  The fact that it was the idle process is 
suspicious though, it suggests that hardware failure is a high 
probability.  Please follow up with the backtrace if you want to pursue 
this further.
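
Until a backtrace is available, the instruction pointer from the panic message can at least be matched to a kernel function by looking it up in the symbol table of the kernel that panicked. A rough sketch, assuming address-sorted "nm -n" output is piped in and that the shell does 64-bit arithmetic; the function name lookup_symbol is made up for illustration:

```shell
# Print the last symbol whose address is <= the given hex address.
# Expects address-sorted "addr type name" lines (as from `nm -n`) on stdin.
lookup_symbol() {
    target=$((0x${1#0x}))            # accept the address with or without "0x"
    best=""
    while read -r addr type name; do
        [ -n "$name" ] || continue   # skip undefined symbols (no address column)
        if [ "$((0x$addr))" -le "$target" ]; then
            best=$name               # still at or below the target; keep going
        else
            break                    # sorted input: nothing later can match
        fi
    done
    printf '%s\n' "$best"
}

# Usage (kernel path is an assumption; use the kernel that actually panicked):
#   nm -n /boot/kernel/kernel | lookup_symbol 0xc0880472
```

This only names the function containing the faulting instruction. It does not show how the kernel got there, so the backtrace is still needed.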


Kris



Re: kernel panic 6.2-RELEASE SMP dual quad core

2007-12-29 Thread Iain Dooley



>> uname -a
>> FreeBSD HOSTNAME 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 11:05:30
>> UTC 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386
>>
>> running on dual quad core intel xeons with 4gb ram.
>>
>> my server has been rebooting quite a bit and stopped responding today. i
>> found this on the console:
>>
>> Fatal trap: 12 page fault while in kernel mode
>> cpuid = 5; apic id = 05
>> fault virtual address = 0x0
>> fault code = supervisor write, page not present
>> instruction pointer = 0x20:0xc0880472
>> stack pointer = 0x28:0xe6ea9c8c
>> code segment = base 0x0, limit 0xf, type 0x1b
>>  = DPL 0, pres 1, def32 1, gran 1
>> processor eflags = resume, IOPL = 0
>> current process = 12 (idle: cpu5)
>> trap number = 12
>> panic: page fault
>> cpuid = 5
>> uptime 24m41s
>> cannot dump. no dump device specified
>>
>> i've configured the dump device and will follow the kernel debugging
>> details in the handbook if it happens again but i thought i'd write in now
>> in case the cause of the problem jumped out at anyone.
>>
>> i've run mprime for 24 hours, and memtest for 3 passes, and a script i
>> wrote which just exhausts ram and CPU, then backs off and does it again
>> which have been running for 24 hours.
>>
>> i've also just started running this:
>>
>> http://www.holm.cc/stress/
>>
>> i noticed that the same "fatal trap 12" appeared during stress tests as
>> listed here:
>>
>> http://people.freebsd.org/~pho/stress/log/cons224.html
>>
>> any help, guidance or information would be much appreciated.
>
> What you have so far is close to meaningless.  The "fault virtual address =
> 0x0" means little more than "somewhere in the kernel there was a null pointer
> dereference".  The fact that it was the idle process is suspicious though, it
> suggests that hardware failure is a high probability.  Please follow up with
> the backtrace if you want to pursue this further.


I thought that dodgy ram was the culprit. I've run memtest86 for three
passes with no errors reported, although I've also read numerous reports
on the net of memtest86 not being very effective.


can you suggest a good method of stress testing ram? this is a new machine 
and still under warranty so if it's a ram issue I can easily just get it 
replaced.


Cheers,

Iain


Re: kernel panic 6.2-RELEASE SMP dual quad core

2007-12-29 Thread Ivan Voras
Iain Dooley wrote:
> hi all,
> 
> uname -a
> FreeBSD HOSTNAME 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 11:05:30
> UTC 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386
> 
> running on dual quad core intel xeons with 4gb ram.

Are you using PAE? (probably not if this is the generic SMP configuration).

> Fatal trap: 12 page fault while in kernel mode

This only means an equivalent of "segmentation fault" for user-mode
programs. The actual problem can be anything.

> current process = 12 (idle: cpu5)

This is important. The "idle" process does literally nothing and is
highly unlikely to contain a bug.

Does this mean that the problem only appears when the system is idle?

> i've run mprime for 24 hours, and memtest for 3 passes, and a script i
> wrote which just exhausts ram and CPU, then backs off and does it again
> which have been running for 24 hours.

Do your tests stress multiple CPUs? If not, this may be something to
try. Also try disabling CPUs to see if it makes any difference.
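
One way to make sure every core is exercised is to start one CPU-bound worker per processor and compare their results. The following is only a sketch: burn and stress_all_cpus are illustrative names, the arithmetic is arbitrary, and it is no substitute for a tool such as mprime that verifies its own output.

```shell
# CPU-bound loop: a fixed, deterministic amount of integer arithmetic.
burn() {
    i=0
    sum=0
    while [ "$i" -lt "$1" ]; do
        sum=$(( (sum * 31 + i) % 1000003 ))
        i=$((i + 1))
    done
    echo "$sum"
}

# Start N identical workers in parallel and wait for all of them.
stress_all_cpus() {
    workers=$1
    iterations=$2
    n=0
    while [ "$n" -lt "$workers" ]; do
        burn "$iterations" &
        n=$((n + 1))
    done
    wait
}

# Usage: one worker per CPU as reported by the kernel.
#   stress_all_cpus "$(sysctl -n hw.ncpu)" 5000000 | sort | uniq -c
```

Every worker performs the same computation, so uniq -c should report a single value with a count equal to the number of workers. Differing values would be a hint of a hardware problem. Note that the scheduler does not guarantee exactly one worker per core.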

By the way, you are likely not to get any performance benefits (and some
performance regressions are likely) with this number of CPUs on FreeBSD
6.2, except if you intend to do CPU-intensive tasks (like scientific
calculations). If you can, try the latest release candidate of 7.0.







Re: kernel panic 6.2-RELEASE SMP dual quad core

2007-12-29 Thread Iain Dooley

hi ivan,


>> uname -a
>> FreeBSD HOSTNAME 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 11:05:30
>> UTC 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386
>>
>> running on dual quad core intel xeons with 4gb ram.
>
> Are you using PAE? (probably not if this is the generic SMP configuration).


no, i checked and i'm not using PAE.


>> current process = 12 (idle: cpu5)
>
> This is important. The "idle" process does literally nothing and is
> highly unlikely to contain a bug.
>
> Does this mean that the problem only appears when the system is idle?


hmm maybe i should leave the system idle and see if it crashes again :) 
i've been running an application which just loads up RAM with lots of 
processes keeping at least 32 processes running at all times. each script 
is killed when it uses too much space. the machine hasn't crashed again 
for like 2 days.


i'm going away so i might leave it idle for that time and see if it 
crashes again. i've got a dump device setup this time which will give me 
more information to send into the list.



> By the way, you are likely not to get any performance benefits (and some
> performance regressions are likely) with this number of CPUs on FreeBSD
> 6.2, except if you intend to do CPU-intensive tasks (like scientific
> calculations). If you can, try the latest release candidate of 7.0.


i'm just running a web application on it. it's a total pain though. i 
should have just bought a late model second hand p4 or something rather 
than a state of the art machine. i'm too out of touch with modern hardware 
to have known what i was getting myself into. live and learn i guess.


cheers

iain


Re: kernel panic 6.2-RELEASE SMP dual quad core

2007-12-29 Thread Ivan Voras
Iain Dooley wrote:

> i'm just running a web application on it. it's a total pain though. i
> should have just bought a late model second hand p4 or something rather
> than a state of the art machine. i'm too out of touch with modern
> hardware to have known what i was getting myself into. live and learn i
> guess.

Quad core Xeons are not exactly unusual or significantly different
nowadays. I and many other people have been running them with FreeBSD
without problems almost since they were made.


