My system has become increasingly flaky in the last 36 hours, and I need whatever advice anyone might have to offer quickly!
I wouldn't bring this to the list, except that I believe that whatever problem I'm having may well affect others.

History: On Saturday I got carried away with housecleaning: I started with the study, but then decided to clean the computers, and then I decided to clean the _insides_ of the computers. That now looks like a big mistake.

(The system has an Intel Plato motherboard with an NCR SCSI controller, two 1 GByte SCSI disks, one SCSI CD-ROM, and (externally) a Micropolis 9 GByte disk drive and an Exabyte tape drive. This particular Micropolis has been well behaved, but I had another Micropolis 9 GByte which died quickly, so now I think of them as delicate disk drives. The system has 48 MBytes of RAM, though it is rare that I ever see more than 2 MBytes of RAM free when I run vmstat -- an issue I'd been meaning to investigate for months. This seems odd to me since I am almost always the only user.)

It's been a while since I've backed up the system.

After I cleaned it, the system seemed to boot up normally. But the next morning, I saw a message: KERNEL PANIC, and something about a SCSI I/O error. I tried to log out so I could reboot normally, but couldn't do that. So I shut down the system, and when I tried to reboot, I was of course greeted by a message that said I'd have to run fsck manually. When I did that, I was asked to agree to a bunch of actions which I didn't understand, but agreed to anyway. Then the report came up that a file system (/dev/sdc1) had been changed.

I was unable to boot after the fsck process. Using emergency boot disks (made with Bruce's boot-floppies package), I was able to boot enough to remove /dev/sdc1 from /etc/fstab, and then ran mke2fs on /dev/sdc1, which allowed me to boot normally again.

Having come this close to a crisis, I decided to back up the system. (!) This has resulted in more and worse failures.

1. I used tob, with the command:

       bash /sbin/tob -rc /etc/tob/tob.rc.afioz -full all

   I left the room, came back after a couple of hours, and saw that the process seemed to have made some progress, but then just stopped. The afio process was marked 'D' in the ps -ax output. Again, I tried to shut down, couldn't, had to turn off the machine, went through another fsck exercise, and finally rebooted.

2. I tried tob again with the command:

       bash /sbin/tob -rc /etc/tob/tob.rc.afioz.TEST -full all

   The process seemed to start up normally, so I let it go for a few hours. Same result. I took notes this time:

       PID TTY STAT TIME COMMAND
       375  p1 S    0:00 -bash
       484  p1 S    0:00 bash /sbin/tob -rc /etc/tob/tob.rc.afioz.TEST -full all
       515  p1 D    2:50 afio -Zvo /tmp/tob.out

   I also noted that the root file system (on which /tmp was mounted) was full. So I removed some files that I thought might alleviate the problem, but I was never able to reawaken tob.

3. In desperation, I tried plain old tar, but now that seems to have done exactly the same thing: i.e., it started out well, the disks and tape whirred a lot, but now the process seems to have hung. Here's the ps -ax output:

       PID   TTY STAT TIME COMMAND
       22671 p2  D    0:24 tar -cvf /dev/st0 /home

   Now one of my xterms seems to have some very weird settings -- the characters have all changed from a-z to some graphics set, and 'stty sane' doesn't help. The root file system is not full, and vmstat reports about 1.8 MBytes free (which is maddening, but typical).

I do not believe that I can reboot the system now without serious loss of data.

I'm sorry it's taken so long to relay my story. I'd very much appreciate any advice on how to get rid of all the sick processes now on this system (see process table below), and how I might safely back up some files and reboot.

Meantime, as I mentioned above, I think this is all a sign of some peculiar memory problems. I suspect tob is not the source of the problem.

Regards,
Susan Kleinmann