Mauro Maroni <[EMAIL PROTECTED]> posted
[EMAIL PROTECTED], excerpted below, on  Wed, 08 Nov
2006 20:25:25 -0300:

> Well, then I got segfaults compiling other packages, and a couple of
> times the machine freezed doing trivial things like browsing the web.
> Could this be a hardware issue? RAM seems to be OK as I ran memtest
> during the night and did not show any error after 9 hours.

That's a classic hardware issue, yes.  The cause can be one of several
things.  Note that there are at least two ways RAM can be bad and memtest
checks only one -- memory actually corrupting in storage.  From hard
experience, I know the other one all too well -- AND know that memtest
doesn't catch it AT ALL.  That one is memory timing issues, and as
memory speeds increase, it's becoming more and more common.  Taking my
case as an example, the RAM was rated PC3200, but simply wasn't stable at
that.  Unfortunately, my mobo was new enough at the time, and using the
then new AMD64 memory-controller-on-CPU technology, that the BIOS didn't
have the usual memory speed tweaking options.  After fighting with it for
some time, a BIOS upgrade was eventually made available that added these
options, and a very simple (with the right BIOS option) tweak to reduce
memory clocking from the rated PC3200 (200 MHz DDRed to 400, times the
8-byte bus width, equals 3200) to ~PC3000 (183 MHz DDRed to 366, times 8, rounds
to 3000) eliminated the issue entirely.  The system was then rock-stable,
even after tweaking some of the detailed individual wait-state settings
back up to increase the performance a bit from the defaults.
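
For anyone who wants to check the PC-rating arithmetic for their own modules, here is a minimal sketch in Python.  It assumes the standard 64-bit (8-byte) DIMM data bus; the function name and its parameters are made up for illustration and are not from any tool.

```python
def ddr_module_rating(base_clock_mhz, bus_width_bits=64):
    """Return the peak transfer rate in MB/s, i.e. the number in the PC rating.

    Assumes DDR memory (two transfers per clock) on a 64-bit data bus
    unless told otherwise.
    """
    transfers_millions_per_sec = base_clock_mhz * 2   # "DDRed": 200 MHz -> 400
    bytes_per_transfer = bus_width_bits // 8          # 64 bits -> 8 bytes
    return transfers_millions_per_sec * bytes_per_transfer


if __name__ == "__main__":
    print(ddr_module_rating(200))  # 3200, sold as PC3200
    print(ddr_module_rating(183))  # 2928, which rounds to ~PC3000
```

So one notch down in the BIOS costs a bit under 10% of peak memory bandwidth, which in my case was a small price for a stable system.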

So, before you eliminate memory as a possibility, check your BIOS and try
declocking it a notch or two.

Actually, all the hardware possibilities trace to the same root, what
should be a binary one becoming at times a binary zero, very often due to
undervolting.  This can be due to speed issues, as with the above or if
you overclock your memory or CPU, or power issues, which may occur
anywhere in your "power train", from the stuff coming to you from your
electricity supplier, to an underpowered computer power supply, to an
underpowered single voltage rail on that supply, to an underpowered UPS,
to a faulty power regulator on your mobo, to a bad connection somewhere,
to simply having too many things connected to the computer at once.  Or it
can be both power and speed issues, since higher speeds commonly require
more power in order to remain stable.  (This makes perfect sense given
that higher speeds mean there's less time to actually bring the transistor
to the high voltage "1" state before actually seeing if it is a 1 or a 0,
and boosting the supply voltage -- to a point -- can often make it reach
that state faster.)

So, it should go without saying, but cut the overclocking if you were
doing it (and note that overclocking can cause permanent damage even after
returning to normal clocking).  Next, check your power supply, both at the
wall plug and that you are using a good PSU in the computer, sufficiently
highly rated and UL Listed (if in the US; substitute the appropriate
authority if elsewhere), since it's common knowledge that the ratings on
many power supplies lacking this listing aren't worth the cost of the ink
used to print them.  If you are using a UPS, check that too.

Finally, check for overheating.
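
One rough way to watch temperatures while a big compile is running is sketched below in Python.  It assumes a Linux kernel that exposes the thermal sysfs interface (/sys/class/thermal/thermal_zone*/temp, in millidegrees Celsius); many boards expose nothing there, in which case lm_sensors or the BIOS hardware monitor is the place to look.  The function name is made up for illustration.

```python
import glob
import os


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone.

    Assumption: the kernel exposes thermal_zone*/temp files holding an
    integer in millidegrees Celsius.  Unreadable zones are skipped.
    """
    temps = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "temp")) as f:
                millidegrees = int(f.read().strip())
        except (OSError, ValueError):
            continue  # zone missing, unreadable, or not a number
        temps[os.path.basename(zone)] = millidegrees / 1000.0
    return temps


if __name__ == "__main__":
    for name, celsius in read_thermal_zones().items():
        print(f"{name}: {celsius:.1f} C")
```

Run it idle and again under load; what counts as too hot depends on the CPU, so compare against the manufacturer's spec rather than any number from me.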

Those are the most common hardware causes of instability.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

-- 
gentoo-amd64@gentoo.org mailing list