On Saturday 20 September 2003 03:15, Fred Clausen wrote:
> Fred Clausen wrote:
> Ok, this is turning out to be one major problem. :-/ I ran memtest and
> sure enough, one of my sticks of RAM was bad.  Ok, no problem, I only
> lose 512 MB (only, *tear*), and I am down to 768.  Oh well.  Start the
> machine, run memtest again, no errors.  But when I try to compile the
> kernel, I get the *same* exact errors as before.  Now I am clueless,
> and I think that memtest is lying to me.

No, it simply shows you that one or more bits in memory have flipped,
but it does NOT tell you that the memory itself is faulty. I would check
the PSU, because if it is not quite up to the load, then compiling and
memtesting can draw more power than it can comfortably supply. This
leads to overheating, which in turn leads to minor or massive voltage
oscillation. The result shows up as either memory or CPU failures,
depending on which power rails the system stresses most under heavy
load.
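
To make the voltage point concrete, here is a minimal Python sketch of
how logged rail readings could be checked. It assumes you already have
voltage samples from somewhere (BIOS hardware monitor, a multimeter, a
sensor tool); the sample numbers, the rail_report() helper and the 5%
tolerance band are illustrative assumptions, not measurements from this
machine.

```python
# Nominal voltages of the main positive rails.
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}

# Assumed tolerance band (5% either side of nominal); adjust to your PSU's spec.
TOLERANCE = 0.05


def rail_report(rail, samples, tolerance=TOLERANCE):
    """Summarise samples for one rail: extremes, swing, and whether all
    samples stay inside nominal +/- tolerance."""
    nominal = NOMINAL_RAILS[rail]
    low_limit = nominal * (1 - tolerance)
    high_limit = nominal * (1 + tolerance)
    lowest = min(samples)
    highest = max(samples)
    return {
        "rail": rail,
        "min": lowest,
        "max": highest,
        "swing": highest - lowest,
        "in_spec": low_limit <= lowest and highest <= high_limit,
    }


if __name__ == "__main__":
    # Hypothetical readings: steady at idle, sagging during a kernel compile.
    idle = [12.05, 12.02, 12.04]
    loaded = [11.90, 11.35, 11.70]
    print(rail_report("+12V", idle))
    print(rail_report("+12V", loaded))
```

A rail that stays in range at idle but sags or swings widely under load
would fit the PSU explanation better than a bad memory stick.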

> One of the odd things about the RAM error is that it was more than
> 768 MB down the line, and never once have I seen Linux eat up 512 MB
> of RAM. Just 'one of those things' I guess.

Yeah, it most likely is not a memory error. Memory testing is just the
easiest way to find out that something is wrong with the system.

> Can you fellows think of anything else I can do?

As a check, try running memtest with the case cover off and a table fan
blowing into the case.
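
Since the kernel compile is what fails for you, the same comparison can
be made with the compile itself. The following is only a rough Python
sketch: count_failures() is a made-up helper, and the default workload
is a harmless placeholder that you would replace with your own build
command.

```python
import subprocess
import sys


def count_failures(command, runs):
    """Run `command` (a list of arguments) `runs` times and return how
    many of those runs exited with a non-zero status."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Placeholder workload; pass your real build command as arguments instead.
    workload = sys.argv[1:] or [sys.executable, "-c", "pass"]
    runs = 3
    print(count_failures(workload, runs), "failures out of", runs, "runs")
```

Run it once with the case closed and once with the cover off and the fan
blowing. If the failure count drops with the extra cooling, heat or
power is the more likely culprit.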

If it still fails, check which parts of the system are hot, if any. If
the PSU gets really hot even in that condition, then you most likely
have an underspecified PSU, which is the one part on which most people
and COMPANIES try to save some money. An underspecified PSU is really
annoying, because in normal operation it works all right even though
the voltages can still oscillate heavily. As long as the voltages don't
drop too low the system keeps functioning, but this stresses the HW and
reduces its lifetime.

If the PSU is not the problem, but some other part of the machine is
too hot to touch, then your system has too little cooling.

PS. This certainly doesn't mean that there are no faulty memory modules,
but in reality, given the quantities in which they are made, only a very
small number of faulty ones get past QA. People simply assume that the
memory is faulty, when in fact the module might be just close to the QA
limit, and a slightly underpowered or overheated environment raises the
resistance of that module too high, which then shows up as a memory
failure.


