Re: [CentOS] Kernel panic - where to go from here?

2007-11-30 Thread Bart Schaefer
On Nov 30, 2007 10:07 AM, Mike <[EMAIL PROTECTED]> wrote:
> On Wed, 28 Nov 2007, Bart Schaefer wrote:
> > Reboot with memtest86 (should be on the centos install media) and look
> > for test failures.
>
> That was it!  Replaced the failing memory, now OpenVPN has been up for ~16
> hours.

Glad to hear it.  I sometimes wonder, given how often I've seen memory
failures occur only under load, whether unused RAM is more likely to
go bad than RAM that's kept busy all the time.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Kernel panic - where to go from here?

2007-11-30 Thread Mike

On Wed, 28 Nov 2007, Bart Schaefer wrote:
> On Nov 28, 2007 11:27 AM, Mike <[EMAIL PROTECTED]> wrote:
> > I googled "unable to handle kernel paging request" and didn't really find
> > anything useful (to me).
>
> In my experience this probably means that you have some RAM going bad
> and you only manage to tickle the problem when the machine becomes
> loaded enough to need that part of the address space.
>
> Reboot with memtest86 (should be on the centos install media) and look
> for test failures.

That was it!  Replaced the failing memory, now OpenVPN has been up for ~16
hours.


-- Thanks, Mike



Re: [CentOS] Kernel panic - where to go from here?

2007-11-29 Thread William L. Maltby
On Wed, 2007-11-28 at 14:27 -0800, Bart Schaefer wrote: 
> On Nov 28, 2007 11:27 AM, Mike <[EMAIL PROTECTED]> wrote:
> > I googled "unable to handle kernel paging request" and didn't really find
> > anything useful (to me).
> 
> In my experience this probably means that you have some RAM going bad
> and you only manage to tickle the problem when the machine becomes
> loaded enough to need that part of the address space.
> 
> Reboot with memtest86 (should be on the centos install media) and look
> for test failures.

JFTR: I chased "random" panics for some time on my Acer AK77-400.
Thought bad memory, ran memtest86 and it was confirmed... NOT!

Turns out that although the board supports DDR.../333/400
(PCwhatchamacallit/2700/...) and has three slots, there is not enough
bandwidth to run @ 400 with all three slots populated. At 333, all
memory tested good.
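
For reference, the "PC" ratings mentioned above follow from simple arithmetic: peak theoretical bandwidth is the transfer rate multiplied by the module width. The sketch below assumes a standard 64-bit (8-byte) DDR module on a single channel; the function name is hypothetical and used only for illustration. It covers only the naming and does not model why a fully populated board might fail at the higher speed.

```python
# Peak theoretical bandwidth of one DDR module: transfer rate times bus width.
# Assumption: a standard 64-bit (8-byte) wide DDR DIMM, single channel.

BUS_WIDTH_BYTES = 8


def peak_bandwidth_mb_s(mega_transfers_per_s: float) -> float:
    """Return peak theoretical bandwidth in MB/s for a transfer rate in MT/s."""
    return mega_transfers_per_s * BUS_WIDTH_BYTES


# DDR-333 actually runs at 333.33 MT/s (a 166.67 MHz clock, two transfers per cycle).
for name, rate in (("DDR-333", 1000 / 3), ("DDR-400", 400.0)):
    print(f"{name}: about {peak_bandwidth_mb_s(rate):.0f} MB/s")
```

The results, roughly 2667 MB/s and 3200 MB/s, are where the marketing names PC2700 and PC3200 come from.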

*After* I was made aware of this niggling little inconvenience, I had to
make the tough choice between faster or more memory. *sigh*.

I had found a post about it somewhere, but I can't locate it now. I hope
this is not your problem... or maybe it is better than bad memory?

--
Bill



Re: [CentOS] Kernel panic - where to go from here?

2007-11-28 Thread Mike

On Wed, 28 Nov 2007, Bart Schaefer wrote:
> On Nov 28, 2007 11:27 AM, Mike <[EMAIL PROTECTED]> wrote:
> > I googled "unable to handle kernel paging request" and didn't really find
> > anything useful (to me).
>
> In my experience this probably means that you have some RAM going bad
> and you only manage to tickle the problem when the machine becomes
> loaded enough to need that part of the address space.
>
> Reboot with memtest86 (should be on the centos install media) and look
> for test failures.

Thanks Bart - That makes perfect sense.  I've installed memtest and will
let it cook overnight.


-- Mike



Re: [CentOS] Kernel panic - where to go from here?

2007-11-28 Thread Bart Schaefer
On Nov 28, 2007 11:27 AM, Mike <[EMAIL PROTECTED]> wrote:
> I googled "unable to handle kernel paging request" and didn't really find
> anything useful (to me).

In my experience this probably means that you have some RAM going bad
and you only manage to tickle the problem when the machine becomes
loaded enough to need that part of the address space.

Reboot with memtest86 (should be on the centos install media) and look
for test failures.


[CentOS] Kernel panic - where to go from here?

2007-11-28 Thread Mike

CentOS 5 has been running continuously since 9/21 on my "do everything" home
server (with the exception of a kernel update).  It's a fairly old Athlon
machine that serves as a firewall and runs various servers (dovecot, samba,
NFS, dhcp, OpenVPN, etc).

I connected via OpenVPN about a week ago and discovered that I get a kernel
panic.  I've since found that this is very repeatable and happens only after
being connected via OpenVPN for about 4 hours or so.

I was able to manually copy the stuff on the console after the panic (see
below).  I googled "unable to handle kernel paging request" and didn't really
find anything useful (to me).

I've tried both kernel version 2.6.18-8.1.14.el5 and 2.6.18-8.1.15.el5 as well
as OpenVPN versions 2.1_rc4-1 and 2.0.9 all with the same results.

Not sure where to go with this(?).  Should I post this on a kernel mailing
list?  Or somewhere else?



Call Trace:
  [] dump_trace+0x8c/0x96
  [] show_trace_log_lvl+0x10/0x20
  [] show_stack_log_lvl+0x8c/0x94
  [] show_registers+0x125/0x191
  [] kernel_thread_helper+0x7/0x10
  [] die+0x196/0x296
  [] do_page_fault+0x3ea/0x4b8
  [] kthread+0x0/0xeb
  [] do_page_fault+0x0/0x4b8
  [] error_code+0x39/0x40
  [] kthread+0x0/0xeb
  [] kernel_thread_helper+0x7/0x10
BUG: unable to handle kernel paging request at virtual address c0613dbf
Printing eip:
  c0404c44
  *pde = 2f9b5163
Recursive die() failure, output suppressed
  <0>Kernel panic - not syncing: Fatal exception


-- Thanks, Mike
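
When console output like the above has to be copied by hand, the two values most worth recording are the faulting virtual address and the EIP. A minimal sketch of pulling them out of the text follows. The function name and the KERNEL_BASE constant are hypothetical; the constant assumes the common 3G/1G address split on 32-bit x86, which may not match every kernel build.

```python
import re

# Excerpt of the console text from the report above.
CONSOLE_TEXT = """\
BUG: unable to handle kernel paging request at virtual address c0613dbf
Printing eip:
  c0404c44
  *pde = 2f9b5163
"""

# Assumption: default 3G/1G split on 32-bit x86, kernel mapped from 0xC0000000.
KERNEL_BASE = 0xC0000000


def parse_oops(text: str) -> dict:
    """Extract the faulting virtual address and EIP from oops text in this layout."""
    fields = {}
    match = re.search(r"at virtual address ([0-9a-fA-F]+)", text)
    if match:
        fields["fault_address"] = int(match.group(1), 16)
    match = re.search(r"Printing eip:\s*([0-9a-fA-F]+)", text)
    if match:
        fields["eip"] = int(match.group(1), 16)
    return fields


info = parse_oops(CONSOLE_TEXT)
for key, value in info.items():
    location = "kernel" if value >= KERNEL_BASE else "user"
    print(f"{key}: 0x{value:08x} ({location} address range)")
```

Under that assumption both values fall in the kernel's address range, so the fault happened in kernel code touching a kernel address. That alone does not separate a software bug from failing hardware.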
