Re: Machine rebooting frequently

2002-03-04 Thread mikea

On Mon, Mar 04, 2002 at 02:24:13PM +0100, Marcel Prisi wrote:
> Help !!
> 
> We have a postgresql server (bi-PIII 833, 1Gb RAM, Adaptec SCSI RAID)
> rebooting by itself more and more frequently.
> 
> I can not find any message anywhere, as if someone just presses the "reboot"
> button every now and then. It used to reboot every week or so, but it
> rebooted just six times today !
> 
> What can I do to find the cause ? The machine runs 4.5-RELEASE, but used to
> run 4.3-PRERELEASE with the same trouble.
> 
> I partially checked RAM through memtest86 (www.memtest86.com) but did not
> find any trouble. What might it be ? faulty processor ? Faulty power-supply
> ? any clues ? how to test ?

The program-driven memory testers are at best rather bad at 
finding memory problems. 

Candidates, not in any particular order:
o   Overheated processor(s)
o   Overheated or faulty memory
o   Faulty or overloaded power supply
o   Faulty line power from the wall
o   Flaky motherboard (_Highly_ unlikely)

and, inevitably, 

o   "Other"

Problem Determination and Isolation techniques:

o Processor: run healthd or another voltage/temperature monitor
to watch voltages and temperatures. Check all fans for good
speed. Remove lint, cat-hair, etc., from heat sinks.[1] 

o Faulty or overheated memory: Check fans, etc., as above. Take
memory to a store that has a hardware memory tester. Remove one
SIMM at a time, reboot, wait.

o Power Supply: Check DC power with voltmeter and oscilloscope.
Replace PS if you _think_ there might be a problem with it or if
it is anywhere near capacity. They're cheap here: about $100 will
get you a very-high-capacity PS.

o Faulty line power: Put the server on a known-good UPS. 

o Flaky motherboard: Replace motherboard (absolute last resort).
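
If the monitor's readings are logged, they can be screened
mechanically. What follows is only a rough sketch in Python, not
anything healthd itself provides: the Reading structure, the 70 C
temperature limit, and the rail tolerances (+/-5% on the positive
rails, +/-10% on -12 V, the figures commonly quoted for ATX supplies)
are all assumptions to be replaced with the values for the actual
hardware.

```python
from dataclasses import dataclass, field

# Assumed tolerances per nominal rail voltage (fraction of nominal).
# These are commonly quoted ATX figures; verify against the PSU's own spec.
RAIL_TOLERANCE = {12.0: 0.05, 5.0: 0.05, 3.3: 0.05, -12.0: 0.10}

# Hypothetical alarm threshold; the real limit depends on the CPU model.
MAX_CPU_TEMP_C = 70.0


@dataclass
class Reading:
    """One sample from a voltage/temperature monitor (hypothetical layout)."""
    timestamp: str
    cpu_temp_c: float
    rails: dict = field(default_factory=dict)  # nominal volts -> measured volts


def rail_ok(nominal: float, measured: float) -> bool:
    """True if the measured voltage is within the assumed tolerance."""
    allowed = abs(nominal) * RAIL_TOLERANCE[nominal]
    return abs(measured - nominal) <= allowed


def problems(reading: Reading) -> list:
    """Return a human-readable list of out-of-range values in one sample."""
    found = []
    if reading.cpu_temp_c > MAX_CPU_TEMP_C:
        found.append(
            f"{reading.timestamp}: CPU at {reading.cpu_temp_c:.1f} C "
            f"(limit {MAX_CPU_TEMP_C:.1f} C)"
        )
    for nominal, measured in reading.rails.items():
        if not rail_ok(nominal, measured):
            found.append(
                f"{reading.timestamp}: {nominal:+.1f} V rail reads {measured:+.2f} V"
            )
    return found


if __name__ == "__main__":
    sample = Reading("sample-1", 75.0, {12.0: 11.5, 5.0: 4.7, 3.3: 3.3})
    for line in problems(sample):
        print(line)
```

Readings that drift out of range shortly before each reboot would
point at cooling or the power supply rather than memory.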


[1] Cat hair in my house appears to be attracted preferentially
to CPU heat sinks. 

-- 
Mike Andrews
[EMAIL PROTECTED]
Tired old sysadmin since 1964

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-stable" in the body of the message



Machine rebooting frequently

2002-03-04 Thread Marcel Prisi

Help !!

We have a postgresql server (bi-PIII 833, 1Gb RAM, Adaptec SCSI RAID)
rebooting by itself more and more frequently.

I can not find any message anywhere, as if someone just presses the "reboot"
button every now and then. It used to reboot every week or so, but it
rebooted just six times today !

What can I do to find the cause ? The machine runs 4.5-RELEASE, but used to
run 4.3-PRERELEASE with the same trouble.

I partially checked RAM through memtest86 (www.memtest86.com) but did not
find any trouble. What might it be ? faulty processor ? Faulty power-supply
? any clues ? how to test ?

THANKS !!

