On Tue, Oct 31, 2006 at 05:29:29PM +0000, Dougie Nisbet wrote:
> A server which has been running steadily for years is beginning to
> reboot. To the best of my knowledge, nothing has changed. It is a
> dual-processor PIII. It runs stable.
>
> It is tucked away in the loft and usually has no monitor attached so
> tracking this down is difficult. However even if I brought it into a
> more convenient area, short of sitting staring at the screen waiting for
> a crash or reboot, I'm not sure it would help much.
>
> I've tried rebuilding a newer kernel from backports.org. And trimmed it
> right down as much as possible. There is nothing useful in syslog. A
> typical series of reboots looks like:
>
> dougie   pts/0        tbird2xp:0.0  Tue Oct 31 17:15   still logged in
> runlevel (to lvl 2)   2.6.17        Tue Oct 31 17:12 - 17:21  (00:08)
> reboot   system boot  2.6.17        Tue Oct 31 17:12          (00:08)
> dougie   pts/0        tbird2xp:0.0  Tue Oct 31 17:09 - crash  (00:02)
> runlevel (to lvl 2)   2.6.17        Tue Oct 31 16:59 - 17:12  (00:12)
> reboot   system boot  2.6.17        Tue Oct 31 16:59          (00:21)
> dougie   pts/0        tbird2xp:0.0  Tue Oct 31 16:05 - crash  (00:54)
> runlevel (to lvl 2)   2.6.17        Tue Oct 31 15:16 - 16:59  (01:43)
> reboot   system boot  2.6.17        Tue Oct 31 15:16          (02:04)
> date     new time                   Sun Oct 29 07:11
> date     old time                   Sun Oct 29 07:12
> root     pts/3        kitchens      Sun Oct 29 07:11 - crash (2+08:04)
> dougie   pts/2        kitchens      Sat Oct 28 20:29 - crash (2+19:46)
> dougie   pts/1        kitchens      Sat Oct 28 11:37 - 16:04 (1+05:27)
> dougie   pts/0        tbird2xp:0.0  Fri Oct 27 13:16 - crash (4+03:00)
>
> And the syslog shows nothing notable around the time. Usually just lines
> from postfix as it processes the mail queue, then:
>
> Oct 31 17:12:22 nick syslogd 1.4.1#17: restart (remote reception).
> Oct 31 17:12:22 nick kernel: klogd 1.4.1#17, log source = /proc/kmsg started.
> Oct 31 17:12:23 nick kernel: Inspecting /boot/System.map-2.6.17
> Oct 31 17:12:23 nick kernel: Loaded 21314 symbols from /boot/System.map-2.6.17.
>
> I'm not sure how to go about tracking this down. My searching of the
> archives shows that these symptoms could describe a faulty physical
> component, such as memory or PSU. So my next step is probably going to
> be trying to swap the PSU and doing a memtest. One thing about the
> reboots is that they often appear to be in clusters. For example, around
> 7AM to 9AM on Oct 24 it looks like it was bouncing for about two hours
> off and on:
>
> # last reboot
> reboot   system boot  2.6.8   Wed Oct 25 05:03   (06:50)
> reboot   system boot  2.6.8   Wed Oct 25 04:31   (07:22)
> reboot   system boot  2.6.8   Tue Oct 24 11:09 (1+00:44)
> reboot   system boot  2.6.8   Tue Oct 24 10:59   (00:06)
> reboot   system boot  2.6.8   Tue Oct 24 09:52   (01:01)
> reboot   system boot  2.6.8   Tue Oct 24 09:50   (01:03)
> reboot   system boot  2.6.8   Tue Oct 24 09:49   (01:05)
> reboot   system boot  2.6.8   Tue Oct 24 09:37   (01:17)
> reboot   system boot  2.6.8   Tue Oct 24 09:05   (01:49)
> reboot   system boot  2.6.8   Tue Oct 24 08:53   (02:00)
> reboot   system boot  2.6.8   Tue Oct 24 08:51   (02:03)
> reboot   system boot  2.6.8   Tue Oct 24 07:28   (03:26)
> reboot   system boot  2.6.8   Tue Oct 24 07:26   (03:27)
> reboot   system boot  2.6.8   Tue Oct 24 07:24   (03:29)
> reboot   system boot  2.6.8   Tue Oct 24 07:01   (03:52)
> reboot   system boot  2.6.8   Tue Oct 24 06:18   (04:36)
>
> I'm a bit stumped on how to solve this and would appreciate any thoughts
> on strategy.

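Since the reboots seem to come in clusters, it might help to make the clustering explicit before swapping hardware. The following is only a rough sketch, not a diagnostic tool: it parses `last reboot`-style lines like the ones quoted above and groups boots that are close together in time. The function names, the two-hour gap, and the year (`last` does not print one, so 2006 is assumed from the date of this thread) are all my own assumptions.

```python
from datetime import datetime, timedelta

# The "last reboot" listing quoted above, pasted in as sample input.
SAMPLE = """\
reboot   system boot  2.6.8   Wed Oct 25 05:03   (06:50)
reboot   system boot  2.6.8   Wed Oct 25 04:31   (07:22)
reboot   system boot  2.6.8   Tue Oct 24 11:09 (1+00:44)
reboot   system boot  2.6.8   Tue Oct 24 10:59   (00:06)
reboot   system boot  2.6.8   Tue Oct 24 09:52   (01:01)
reboot   system boot  2.6.8   Tue Oct 24 09:50   (01:03)
reboot   system boot  2.6.8   Tue Oct 24 09:49   (01:05)
reboot   system boot  2.6.8   Tue Oct 24 09:37   (01:17)
reboot   system boot  2.6.8   Tue Oct 24 09:05   (01:49)
reboot   system boot  2.6.8   Tue Oct 24 08:53   (02:00)
reboot   system boot  2.6.8   Tue Oct 24 08:51   (02:03)
reboot   system boot  2.6.8   Tue Oct 24 07:28   (03:26)
reboot   system boot  2.6.8   Tue Oct 24 07:26   (03:27)
reboot   system boot  2.6.8   Tue Oct 24 07:24   (03:29)
reboot   system boot  2.6.8   Tue Oct 24 07:01   (03:52)
reboot   system boot  2.6.8   Tue Oct 24 06:18   (04:36)
"""


def parse_boot_times(text, year=2006):
    """Return boot timestamps, oldest first, from `last reboot`-style lines.

    The year is an assumption: `last` prints only month, day and time.
    """
    times = []
    for line in text.splitlines():
        fields = line.split()
        # Expected: reboot system boot <kernel> <dow> <mon> <day> <hh:mm> ...
        if len(fields) < 8 or fields[0] != "reboot":
            continue
        stamp = " ".join(fields[5:8])  # e.g. "Oct 25 05:03"
        times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M"))
    return sorted(times)


def cluster_boots(times, max_gap=timedelta(hours=2)):
    """Group sorted boot times; a gap larger than max_gap starts a new cluster."""
    clusters = []
    for t in times:
        if clusters and t - clusters[-1][-1] <= max_gap:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


if __name__ == "__main__":
    for group in cluster_boots(parse_boot_times(SAMPLE)):
        print(f"{group[0]:%b %d %H:%M} to {group[-1]:%b %d %H:%M}: "
              f"{len(group)} boot(s)")
```

Feeding it the real output (for example by piping `last reboot` into a file and reading that instead of `SAMPLE`) would show whether the clusters line up with anything regular, such as a time of day or a cron job. That would point away from a random hardware fault, or towards one if there is no pattern at all.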
"Tucked away in the loft", you say. Is dust building up somewhere along
your power supply line? In a multiple-socket extension, perhaps. A long
shot, but I once had this problem. I think the dust caused momentary
short circuits, not long enough to blow a fuse but long enough to cut
the power to the computer, while the dust burnt away - but I'm no
electrician.

Cheers,
David

--
David Jardine

"Running Debian GNU/Linux and loving every minute of it."
  -L. von Sacher-M.(1835-1895)

