Before you throw away the defective memory:

There's a project on the web for Linux that masks out the addresses of the
faulty memory, so you can keep using the 256 MB of RAM and only have to give up
a small part of it. You could also send the memory to the author of the patch to
help him develop this software!
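
To illustrate the idea, here is a minimal sketch in Python. It is not the patch
itself (which works inside the kernel); it only shows why excluding whole pages
around a few bad addresses costs very little memory. The 4 KB page size and the
fault addresses are assumptions made up for the example.

```python
PAGE_SIZE = 4096  # assumed x86 page size for this sketch


def bad_pages(fault_addresses, page_size=PAGE_SIZE):
    """Return the sorted page frame numbers that contain a faulty address."""
    return sorted({addr // page_size for addr in fault_addresses})


def usable_bytes(total_bytes, fault_addresses, page_size=PAGE_SIZE):
    """Memory left once every page holding a faulty address is excluded."""
    total_pages = total_bytes // page_size
    excluded = [p for p in bad_pages(fault_addresses, page_size)
                if p < total_pages]
    return (total_pages - len(excluded)) * page_size


# Hypothetical failing physical addresses, as a memory tester might report them.
faults = [0x01234010, 0x01234F00, 0x0ABCD123]
total = 256 * 1024 * 1024  # the 256 MB mentioned above

print("pages to exclude:", bad_pages(faults))
print("bytes lost:", total - usable_bytes(total, faults))
```

Two of the three made-up addresses fall on the same page, so only two pages
(8 KB out of 256 MB) would have to be given up in this example.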

I don't have the URL at hand, but I think it was mentioned on the kernel
mailing list not long ago (see kt.linuxcare.com, one of the last 4-5 issues).

Thomas Kotzian

----- Original Message -----
From: "Greg Baker" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; "bert hubert" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Friday, April 21, 2000 11:17 PM
Subject: Re: VM: killing... (and Oops!)


>
> I'd like to thank everyone for their comments and offer current status:
>
> It seems according to memtest
> (http://reality.sgi.com/cbrady_denver/memtest86/) I do have some bad
> memory.  I had originally
> thought of this, and used the VA Linux burn-in software
> (ftp://ftp.varesearch.com/pub/software/Cerberus) to test (which
> returned no error results).  I guess don't trust www.crucial.com for
> memory.  Should have stuck to mushkin.
>
> Unfortunately I'll be in Washington DC for a week starting today, so
> won't have a chance to get my hands-on for further debugging until
> then.
>
> Thanks for pointing out memtest.  Hopefully I can isolate at least 1
> good chip so I can do my benchmarks/testing.  If anybody else has done
> large (~ process > 150MB, 4 hour+ run-time) Mentor Calibre jobs on
> Linux, please let me know about success, failures, and caveats.
>
> Thanks,
>
> --Greg
>
> FYI, during the heavy load I introduced today I got 4 Oops too!
>
> OOPS 1:
> Apr 21 12:26:48 case kernel: Unable to handle kernel paging request at
> virtual address 00ddff28
> Apr 21 12:26:48 case kernel: current->tss.cr3 = 2bddb000, %cr3 =
> 2bddb000
> Apr 21 12:26:48 case kernel: *pde = 00000000
> Apr 21 12:26:48 case kernel: Oops: 0000
> Apr 21 12:26:48 case kernel: CPU:    0
> Apr 21 12:26:48 case kernel: EIP:    0010:[del_timer+10/59]
> Apr 21 12:26:48 case kernel: EFLAGS: 00010046
> Apr 21 12:26:48 case kernel: eax: 2bddb000   ebx: 00000246
> ecx: 00ddff24   edx: ecd249c0
> Apr 21 12:26:48 case kernel: esi: 000250bf   edi: 00000007
> ebp: ebddff0c   esp: ebddff10
> Apr 21 12:26:48 case kernel: ds: 0018   es: 0018   ss: 0018
> Apr 21 12:26:48 case kernel: Process sshd (pid: 977, process nr: 165,
> stackpage=ebddf000)
> Apr 21 12:26:48 case kernel: Stack: c0111294 00ddff24 ebddff24
> 00000000 00000040 00000000 00000000 000250bf
> Apr 21 12:26:48 case kernel:        ebdde000 c0110f04 00000000
> c012e480 00000004 00000026 00000007 ed4fefa8
> Apr 21 12:26:48 case kernel:        00000104 00000007 ebdde000
> 00000001 00000000 d5b73000 c012e927 00000007
> Apr 21 12:26:48 case kernel: Call Trace: [schedule_timeout+108/134]
> [process_timeout+0/15] [do_select+154/529] [sys_select+816/1134]
> [system_call+52/56]
> Apr 21 12:26:48 case kernel: Code: 8b 51 04 85 d2 74 12 8b 01 89 02 85
> c0 74 03 89 50 04 b8 01
> Apr 21 12:49:10 case kernel: md: md1: sync done.
>
> OOPS 2:
>
> Apr 21 14:57:32 case kernel: Unable to handle kernel NULL pointer
> dereference at virtual address 00000350
> Apr 21 14:57:32 case kernel: current->tss.cr3 = 00101000, %cr3 =
> 00101000
> Apr 21 14:57:32 case kernel: *pde = 00000000
> Apr 21 14:57:32 case kernel: Oops: 0002
> Apr 21 14:57:32 case kernel: CPU:    0
> Apr 21 14:57:32 case kernel: EIP:    0010:[kmem_cache_free+205/368]
> Apr 21 14:57:32 case kernel: EFLAGS: 00010046
> Apr 21 14:57:32 case kernel: eax: 00000340   ebx: e4983fd0
> ecx: dc008fe0   edx: 00000340
> Apr 21 14:57:32 case kernel: esi: efeff740   edi: 00000286
> ebp: 00000031   esp: efed9f74
> Apr 21 14:57:32 case kernel: ds: 0018   es: 0018   ss: 0018
> Apr 21 14:57:32 case kernel: Process kswapd (pid: 5, process nr: 5,
> stackpage=efed9000)
> Apr 21 14:57:32 case kernel: Stack: dc008f90 c06d1f98 dc008fdc
> efed9fac c0129069 efeff740 dc008f90 dc008f90
> Apr 21 14:57:32 case kernel:        dc008f90 c0129dab dc008f90
> dc008f90 c06d1f98 00000bfb 00000030 00000008
> Apr 21 14:57:32 case kernel:        c011e2b2 c06d1f98 00000010
> 00000006 c012365a 00000006 00000030 efed8000
> Apr 21 14:57:32 case kernel: Call
> Trace: [put_unused_buffer_head+33/76] [try_to_free_buffers+71/128]
> [shrink_mmap+218/300] [do_try_to_free_pages+42/124] [tvecs+7278/13856]
> [kswapd+107/164] [get_options+0/112]
> Apr 21 14:57:32 case kernel:        [kernel_thread+35/48]
> Apr 21 14:57:32 case kernel: Code: 89 48 10 89 0e eb 9c 8d 74 26 00 57
> 9d 56 53 68 67 5f 1e c0
>
> OOPS 3 & 4 crashed the system and I didn't copy the screen
> dump down.
>
> On Fri, 21 Apr 2000, bert hubert wrote:
>
> |I took the liberty to forward your message to linux-raid:
> |
> |----- Forwarded message from "Georg P. Israel" <[EMAIL PROTECTED]> -----
> |
> |Date: Fri, 21 Apr 2000 22:00:35 +0200
> |From: "Georg P. Israel" <[EMAIL PROTECTED]>
> |To: [EMAIL PROTECTED], [EMAIL PROTECTED]
> |Subject: Re: [[EMAIL PROTECTED]: Re: VM: killing...]
> |
> |Vince,
> |
> |I'm pretty sure you have some bad memory modules in your machine.
> |Run a memory test, e.g. memtest86,
> |to be sure that your memory is OK.
> |
> |
> |Georg
> |<[EMAIL PROTECTED]>
>
