Re: [CentOS] Strange Kernel Warning.

2011-08-17 Thread Keith Roberts
On Wed, 17 Aug 2011, John Doe wrote:

> To: CentOS mailing list 
> From: John Doe 
> Subject: Re: [CentOS] Strange Kernel Warning.
> 
> From: Lisandro Grullon 
>> Can someone give me clues as to whether my memory is 
>> going bad or I am having a problem with the actual board. 
>> Thank you in advance.
>
>
> Is any LED lit on the motherboard (even better if next to a RAM slot)?
> Usually the best approach, if you can, is to swap RAM modules.
> If the error follows the RAM module, it is a module problem.
> If the error stays in the same slot, it is the motherboard.

Another option is to take all the memory out, then put 
one module at a time back onto the motherboard and test 
it with memtest86+, which should be on the install CD/DVD.

Once you have identified a memory module that tests without 
errors, try that in each of the other slots if possible.

If there are no errors in any of the other slots, then you 
will need to test the other memory modules in a slot you 
know works OK, to see if it's one of the other memory 
modules that is faulty.
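
The order of tests described above can be sketched in Python. This is only an illustration of the procedure, not a tool: `passes(module, slot)` is a hypothetical callback that stands in for one manual memtest86+ run with a single module installed in a single slot, and the module and slot names are made up.

```python
def isolate_faults(modules, slots, passes):
    """Return (bad_modules, bad_slots), or None if no module/slot pair passes.

    passes(module, slot) is a hypothetical stand-in for one memtest86+ run
    with only `module` installed, in `slot`. It returns True on a clean run.
    """
    # Step 1: test one module at a time until one tests without errors.
    good_module = good_slot = None
    for slot in slots:
        for module in modules:
            if passes(module, slot):
                good_module, good_slot = module, slot
                break
        if good_module is not None:
            break
    if good_module is None:
        # Nothing tested clean, so modules and slots cannot be told apart.
        return None

    # Step 2: try the known-good module in each of the other slots.
    bad_slots = [s for s in slots
                 if s != good_slot and not passes(good_module, s)]

    # Step 3: test the remaining modules in the slot known to work.
    bad_modules = [m for m in modules
                   if m != good_module and not passes(m, good_slot)]

    return bad_modules, bad_slots
```

For example, with three modules where "B" is faulty and slot 2 is faulty, `isolate_faults(["A", "B", "C"], [0, 1, 2], lambda m, s: m != "B" and s != 2)` points at module "B" and slot 2.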

It's also possible to have high-density and low-density 
memory modules that each work and test OK individually on a 
motherboard. However, when high-density and low-density 
modules are installed together in the same system, you 
might find errors occurring.

I had this on a CentOS 5.5 32-bit system, and it was very 
frustrating to locate the cause of the error. Testing one 
memory module at a time takes the guesswork out of which 
module may be the faulty one. But NEVER mix HD (high-density) 
and LD (low-density) modules, as they don't always work well 
together.

HTH

Keith Roberts

-
Websites:
http://www.karsites.net
http://www.php-debuggers.net
http://www.raised-from-the-dead.org.uk

All email addresses are challenge-response protected with
TMDA [http://tmda.net]
-
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Strange Kernel Warning.

2011-08-17 Thread Lisandro Grullon
Sure, Morten,
 
lspci reports the following:
 
00:00.0 Host bridge: ATI Technologies Inc RD890 Northbridge only dual slot (2x16) PCI-e GFX Hydra part (rev 02)
00:04.0 PCI bridge: ATI Technologies Inc RD890 PCI to PCI bridge (PCI express gpp port D)
00:09.0 PCI bridge: ATI Technologies Inc RD890 PCI to PCI bridge (PCI express gpp port H)
00:0b.0 PCI bridge: ATI Technologies Inc RD890 PCI to PCI bridge (NB-SB link)
00:11.0 SATA controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode]
00:12.0 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB7x0 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB7x0 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3d)
00:14.1 IDE interface: ATI Technologies Inc SB7x0/SB8x0/SB9x0 IDE Controller
00:14.3 ISA bridge: ATI Technologies Inc SB7x0/SB8x0/SB9x0 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:19.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
00:1a.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:1a.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:1a.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:1a.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:1a.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
00:1b.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:1b.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:1b.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:1b.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:1b.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
01:09.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 10)
02:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

>>> Morten Stevens  8/17/2011 9:11 AM >>>
On Wed, 17 Aug 2011 08:17:58 -0400, Lisandro Grullon wrote:

> Dear CentOS community,
> Can someone give me clues as to whether my memory is going bad or I
> am having a problem with the actual board. Thank you in advance.
>
> I am getting the following error via stdout and also in
> /var/log/messages

Hi,

Please tell us more about your system (the output of lspci, dmesg 
and cat /proc/mtrr).

Best regards,

Morten


Re: [CentOS] Strange Kernel Warning.

2011-08-17 Thread Morten Stevens
On Wed, 17 Aug 2011 08:17:58 -0400, Lisandro Grullon wrote:

> Dear CentOS community,
> Can someone give me clues as to whether my memory is going bad or I
> am having a problem with the actual board. Thank you in advance.
>
> I am getting the following error via stdout and also in
> /var/log/messages

Hi,

Please tell us more about your system (the output of lspci, dmesg 
and cat /proc/mtrr).

Best regards,

Morten


Re: [CentOS] Strange Kernel Warning.

2011-08-17 Thread Lisandro Grullon
Thank you, John.
I certainly hope that shifting the RAM around fixes the issue. This board is 
extremely expensive to replace: about 2K for the board alone.

>>> John Doe  8/17/2011 8:54 AM >>>
From: Lisandro Grullon 
>Can someone give me clues as to whether my memory is going bad or I am having 
>a problem with the actual board. Thank you in advance.


Is any LED lit on the motherboard (even better if next to a RAM slot)?
Usually the best approach, if you can, is to swap RAM modules.
If the error follows the RAM module, it is a module problem.
If the error stays in the same slot, it is the motherboard.


JD



Re: [CentOS] Strange Kernel Warning.

2011-08-17 Thread John Doe
From: Lisandro Grullon 
>Can someone give me clues as to whether my memory is going bad or I am having 
>a problem with the actual board. Thank you in advance.


Is any LED lit on the motherboard (even better if next to a RAM slot)?
Usually the best approach, if you can, is to swap RAM modules.
If the error follows the RAM module, it is a module problem.
If the error stays in the same slot, it is the motherboard.
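
The swap rule above amounts to a small decision table, sketched here in Python. The function name and the slot labels are hypothetical and only serve the illustration. The sketch assumes that a single suspect module was moved and that you can tell which slot reports the error before and after the move.

```python
def diagnose_after_swap(slot_before, slot_after, module_moved_to):
    """Classify a memory error after moving the suspect module.

    slot_before:     slot that reported errors before the swap
    slot_after:      slot that reports errors after the swap (None if none)
    module_moved_to: slot the suspect module now sits in
    """
    if slot_after is None:
        # The error disappeared; reseating alone may have changed something.
        return "inconclusive"
    if module_moved_to == slot_before:
        # The module was not actually moved, so nothing can be concluded.
        return "inconclusive"
    if slot_after == module_moved_to:
        # The error followed the module to its new slot.
        return "module"
    if slot_after == slot_before:
        # The error stayed in the original slot.
        return "motherboard"
    return "inconclusive"
```

So `diagnose_after_swap("DIMM1", "DIMM3", "DIMM3")` gives "module", while `diagnose_after_swap("DIMM1", "DIMM1", "DIMM3")` gives "motherboard".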


JD



[CentOS] Strange Kernel Warning.

2011-08-17 Thread Lisandro Grullon
Dear CentOS community,
Can someone give me clues as to whether my memory is going bad or I am having 
a problem with the actual board. Thank you in advance.
 
I am getting the following errors via stdout and also in /var/log/messages:
 
Aug 15 20:37:10 saturn kernel: Northbridge Error, node 0
Aug 15 20:37:10 saturn kernel: ECC/ChipKill ECC error.
Aug 15 20:37:10 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x1b9e740
Aug 15 20:37:10 saturn kernel: EDAC MC0: CE page 0x1b9e, offset 0x740, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac
Aug 15 20:37:10 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Aug 15 23:33:41 saturn kernel: Northbridge Error, node 0
Aug 15 23:33:41 saturn kernel: ECC/ChipKill ECC error.
Aug 15 23:33:41 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x1098d00
Aug 15 23:33:41 saturn kernel: EDAC MC0: CE page 0x1098, offset 0xd00, grain 0, syndrome 0x976f, row 2, channel 0, label "": amd64_edac
Aug 15 23:33:41 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Aug 16 02:56:30 saturn kernel: Northbridge Error, node 1
Aug 16 02:56:30 saturn kernel: ECC/ChipKill ECC error.
Aug 16 02:56:30 saturn kernel: EDAC amd64 MC1: CE ERROR_ADDRESS= 0x80bd9cc00
Aug 16 02:56:30 saturn kernel: EDAC MC1: CE page 0x80bd9c, offset 0xc00, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac
Aug 16 02:56:30 saturn kernel: EDAC MC1: CE - no information available: amd64_edacError Overflow
Aug 17 02:17:02 saturn kernel: Northbridge Error, node 0
Aug 17 02:17:02 saturn kernel: ECC/ChipKill ECC error.
Aug 17 02:17:02 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x1e25fd0
Aug 17 02:17:02 saturn kernel: EDAC MC0: CE page 0x1e25, offset 0xfd0, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac
Aug 17 02:17:02 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Aug 17 02:41:22 saturn kernel: Northbridge Error, node 1
Aug 17 02:41:22 saturn kernel: ECC/ChipKill ECC error.
Aug 17 02:41:22 saturn kernel: EDAC amd64 MC1: CE ERROR_ADDRESS= 0x80d2ce600
Aug 17 02:41:22 saturn kernel: EDAC MC1: CE page 0x80d2ce, offset 0x600, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac
Aug 17 02:41:22 saturn kernel: EDAC MC1: CE - no information available: amd64_edacError Overflow
Aug 17 04:07:16 saturn kernel: Northbridge Error, node 0
Aug 17 04:07:16 saturn kernel: ECC/ChipKill ECC error.
Aug 17 04:07:16 saturn kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x41fe79200
Aug 17 04:07:16 saturn kernel: EDAC MC0: CE page 0x41fe79, offset 0x200, grain 0, syndrome 0xa612, row 3, channel 0, label "": amd64_edac
Aug 17 04:07:16 saturn kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
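
To see whether the corrected errors (CE) cluster on one memory controller, row and channel, the detail lines in the log can be tallied with a short Python sketch. The regular expression is an assumption fitted to the lines quoted in this message, with each log record on a single line as in /var/log/messages. It is not a general parser for every EDAC driver or kernel version.

```python
import re
from collections import Counter

# Fitted to the "EDAC MCn: CE page ..." detail lines quoted in this thread.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE page 0x[0-9a-f]+, offset 0x[0-9a-f]+, "
    r"grain \d+, syndrome 0x[0-9a-f]+, "
    r"row (?P<row>\d+), channel (?P<channel>\d+)"
)

def tally_corrected_errors(lines):
    """Count corrected errors per (memory controller, row, channel)."""
    counts = Counter()
    for line in lines:
        match = CE_LINE.search(line)
        if match:
            key = (int(match["mc"]), int(match["row"]), int(match["channel"]))
            counts[key] += 1
    return counts

# The six detail lines from the log above, one record per line.
SAMPLE = [
    'Aug 15 20:37:10 saturn kernel: EDAC MC0: CE page 0x1b9e, offset 0x740, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac',
    'Aug 15 23:33:41 saturn kernel: EDAC MC0: CE page 0x1098, offset 0xd00, grain 0, syndrome 0x976f, row 2, channel 0, label "": amd64_edac',
    'Aug 16 02:56:30 saturn kernel: EDAC MC1: CE page 0x80bd9c, offset 0xc00, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac',
    'Aug 17 02:17:02 saturn kernel: EDAC MC0: CE page 0x1e25, offset 0xfd0, grain 0, syndrome 0x1cc8, row 2, channel 0, label "": amd64_edac',
    'Aug 17 02:41:22 saturn kernel: EDAC MC1: CE page 0x80d2ce, offset 0x600, grain 0, syndrome 0xe08f, row 3, channel 0, label "": amd64_edac',
    'Aug 17 04:07:16 saturn kernel: EDAC MC0: CE page 0x41fe79, offset 0x200, grain 0, syndrome 0xa612, row 3, channel 0, label "": amd64_edac',
]

if __name__ == "__main__":
    for (mc, row, channel), n in tally_corrected_errors(SAMPLE).most_common():
        print(f"MC{mc} row {row} channel {channel}: {n} corrected error(s)")
```

For the six errors above this counts three on MC0 row 2 channel 0, two on MC1 row 3 channel 0 and one on MC0 row 3 channel 0. To run it against the real log, pass an open file instead of `SAMPLE`, for example `tally_corrected_errors(open("/var/log/messages"))`.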