On Mon, Mar 04, 2002 at 02:24:13PM +0100, Marcel Prisi wrote:
> Help !!
>
> We have a postgresql server (bi-PIII 833, 1Gb RAM, Adaptec SCSI RAID)
> rebooting by itself more and more frequently.
>
> I can not find any message anywhere, as if someone just presses the "reboot"
> button every now and then. It used to reboot every week or so, but it
> rebooted just six times today !
>
> What can I do to find the cause ? The machine runs 4.5-RELEASE, but used to
> run 4.3-PRERELEASE with the same trouble.
>
> I partially checked RAM through memtest86 (www.memtest86.com) but did not
> find any trouble. What might it be ? faulty processor ? Faulty power-supply
> ? any clues ? how to test ?

Program-driven memory testers are, at best, rather bad at finding
memory problems.

Candidates, not in any particular order:

  o Overheated processor(s)
  o Overheated or faulty memory
  o Faulty or overloaded power supply
  o Faulty line power from the wall
  o Flaky motherboard (_Highly_ unlikely)
    and, inevitably,
  o "Other"

Problem Determination and Isolation techniques:

  o Processor:
    run healthd or another voltage/temperature monitor
    to watch voltages and temperatures. Check all fans for
    good speed. Remove lint, cat-hair, etc., from heat sinks.[1]

  o Faulty or overheated memory:
    . Check fans, etc., as above
    . Take memory to a store that has a hardware memory tester
    . Remove one SIMM at a time, reboot, and wait to see whether
      the reboots stop

  o Power Supply:
    Check DC power with voltmeter and oscilloscope. Replace PS
    if you _think_ there might be a problem with it or if it is
    anywhere near capacity. They're cheap here: about $100
    will get you a very-high-capacity PS.

  o Faulty line power:
    Put the server on a known-good UPS.

  o Flaky motherboard:
    Replace motherboard (absolute last resort).

[1] Cat hair in my house appears to be attracted preferentially
    to CPU heat sinks.

-- 
Mike Andrews
[EMAIL PROTECTED]
Tired old sysadmin since 1964
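
As a rough illustration of the power-supply check above: once you have
voltmeter readings for each DC rail, comparing them against the nominal
values is simple arithmetic. What follows is only a minimal Python sketch.
It assumes the commonly quoted ATX tolerances (+/-5% on +3.3V, +5V and +12V,
+/-10% on -12V); the function name is hypothetical and the sample readings
are made up, so check the figures against your own power supply's
specification.

```python
# Allowed deviation per DC rail, as a fraction of the nominal voltage.
# Assumption: +/-5% on the positive rails and +/-10% on -12V, the commonly
# quoted ATX limits. Verify against your power supply's documentation.
RAIL_TOLERANCE = {
    3.3: 0.05,
    5.0: 0.05,
    12.0: 0.05,
    -12.0: 0.10,
}


def out_of_tolerance(readings):
    """Return (nominal, measured) pairs whose deviation exceeds tolerance.

    `readings` maps a nominal rail voltage to the measured voltage.
    """
    bad = []
    for nominal, measured in readings.items():
        limit = abs(nominal) * RAIL_TOLERANCE[nominal]
        if abs(measured - nominal) > limit:
            bad.append((nominal, measured))
    return bad


# Hypothetical voltmeter readings, keyed by nominal rail voltage.
sample = {3.3: 3.31, 5.0: 4.70, 12.0: 12.20, -12.0: -11.40}
print(out_of_tolerance(sample))
```

A rail that reads within tolerance on a voltmeter can still carry ripple or
brief dips, which is why the oscilloscope is listed alongside it; a static
check like this only catches a supply that is steadily out of range.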