On Tuesday 21 Jan 2014 23:11:24 Skippy wrote:

> "Have you ever found a program in Linux that allows you to locate bad
> DIMMs if you have faults?  I’ve tried memtest86, memconf, memtester and
> none of them can point out what slot on the motherboard has the bad RAM.

memtest86+ is what I use, but I have not found an application that will
identify and report, on its own, which module or controller is faulty out of
a whole bank of them.  Press F2 when it starts to force multi-threading (SMP)
mode and get the tests done a bit faster.


>  I know usually you just plug in one at a time.  But memtest86 takes
> hours and I have…wait for it….16 slots to test DIMMs on for this specific
> server with memory failure."

You don't have to test 16 modules one at a time, although you will have to run
the test more than once:

Remove half (8) of the memory modules.  Ensure what is left is installed in
the slot combination recommended by the MoBo manufacturer.  Test these.  If no
fault is found, swap them for the other half.  As soon as a fault is reported,
remove half of the faulty batch (4) and install the other 4 as recommended by
the MoBo manufacturer.  Rinse and repeat.  This way you will eventually
isolate the dodgy DIMM by running the test far fewer than 16 times (about 4 to
8 runs if only one module is bad).  Usually errors show in the first pass, but
sometimes you may need to wait for more than 8 passes.
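
If it helps to see the halving written down, here is a rough sketch in Python.
It is only a simulation to illustrate the idea, not a tool that talks to the
hardware.  It assumes exactly one module is bad and that the fault follows the
module rather than a slot or the controller.  The names isolate_faulty and
batch_has_errors are made up for this example; batch_has_errors stands in for
"install this batch, run memtest86+ and see whether it reports errors".

```python
def isolate_faulty(modules, batch_has_errors):
    """Halve the list of suspect modules until only one remains.

    modules          -- labels of all installed modules
    batch_has_errors -- hypothetical callback standing in for one manual
                        memtest86+ run on the given batch; returns True
                        if that batch showed errors
    Returns (suspect, number_of_test_runs).
    """
    suspects = list(modules)
    runs = 0
    while len(suspects) > 1:
        half = len(suspects) // 2
        first, second = suspects[:half], suspects[half:]
        runs += 1
        if batch_has_errors(first):
            suspects = first
        else:
            # Assumption: exactly one bad module, so a clean first half
            # means the fault must be in the untested second half.
            suspects = second
    return suspects[0], runs


if __name__ == "__main__":
    dimms = [f"DIMM{n:02d}" for n in range(1, 17)]
    bad = "DIMM11"  # pretend this is the faulty one
    suspect, runs = isolate_faulty(dimms, lambda batch: bad in batch)
    print(suspect, runs)
```

With 16 modules this narrows things down in 4 runs.  The sketch skips the
confirmation run on the other half; if you also test that half whenever the
first one comes up clean, as described above, expect up to 8 runs.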

Before you start any of this it is a good idea to just reseat the modules one
at a time in case you have some dirt or oxidisation on any of the contacts.
That could save a lot of hours ...

Make sure you have clearly marked which batches have shown no errors - if you
mix them up you will have to start from the beginning.  I know I am stating
the obvious, but I have been there with colleagues who like to tidy up other
people's workspace <sigh>.

-- 
Regards,
Mick
