Hi,

We have an unusual problem where machines lock up with a kernel panic when reading from or writing to a SCSI hard drive. This doesn't happen very often, but as the servers are production machines which need close to 100% uptime, it is of significant concern.

So far, it has happened on three separate machines. All were running various versions of 2.4.x kernels. All of them have Adaptec SCSI controllers (7899P or 7892A controller chips) and Fujitsu 10K rpm drives (various models and sizes). The cabling and termination are OK as far as I can determine.

Looking at the messages log, we get:

Feb 19 23:07:28 deptserv kernel: Info fld=0x1f31b70, Current sd08:01: sense key Medium Error
Feb 19 23:07:28 deptserv kernel: Additional sense indicates Read retries exhausted
Feb 19 23:07:28 deptserv kernel: I/O error: dev 08:01, sector 32709424
Feb 19 23:12:38 deptserv kernel: scsi0: ERROR on channel 0, id 1, lun 0, CDB: Read (10) 00 00 55 90 87 00 00 f8 00
Feb 19 23:12:38 deptserv kernel: Info fld=0x559105, Current sd08:01: sense key Medium Error
Feb 19 23:12:38 deptserv kernel: Additional sense indicates Read retries exhausted
Feb 19 23:12:38 deptserv kernel: I/O error: dev 08:01, sector 5607616
Feb 19 23:12:44 deptserv kernel: scsi0: ERROR on channel 0, id 1, lun 0, CDB: Read (10) 00 00 55 91 07 00 00 78 00
Feb 19 23:12:44 deptserv kernel: Info fld=0x559107, Current sd08:01: sense key Medium Error
Feb 19 23:12:44 deptserv kernel: Additional sense indicates Read retries exhausted
Feb 19 23:12:44 deptserv kernel: I/O error: dev 08:01, sector 5607624

before it crashes, with the following in the logs and on the console:

Segment 0xc3be3920, blocks 4, addr 0x319f7ff
Segment 0xc3be3aa0, blocks 4, addr 0x36a7fff
Kernel panic: Ththththaats all folks. Too dangerous to continue.

(The segment numbers are different on our machines. I copied this from another post since I couldn't capture it from the console.)

I have run fscks as well as surface scans on the disk using the Adaptec BIOS, and these turn up no issues or errors. It has only happened on about 6 occasions, but we can't afford downtime.

We use the same model of drives on the same machines in a RAID (Mylex controllers) but have had no issues there. The affected drives are only being used as backup drives (i.e., copying data from the RAID to the backup drive). Which is one of the mysteries: why would an error on a non-system drive crash the machine with a kernel panic?

I am not really sure who to mail regarding this error. Can anyone make suggestions as to what the cause might be, or ways to remedy it? Any help/pointers greatly appreciated.
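
In case it helps anyone reading the log, here is a rough sketch of how the two "CDB: Read (10)" lines above can be decoded. It assumes the nine bytes the kernel prints after "Read (10)" are bytes 1 to 9 of a standard READ(10) command (flags, 4-byte big-endian LBA, one reserved/group byte, 2-byte transfer length, control), and that "Info fld" is the absolute block address the drive reported as failing. The function names are my own, purely for illustration.

```python
def decode_read10(tail_hex):
    """Decode the nine bytes printed after 'Read (10)' in the log.

    Assumed layout (standard READ(10) minus the opcode byte):
    flags, LBA (4 bytes, big-endian), group/reserved, length (2 bytes), control.
    """
    raw = bytes.fromhex(tail_hex)
    if len(raw) != 9:
        raise ValueError("expected 9 bytes after the opcode")
    lba = int.from_bytes(raw[1:5], "big")      # first block of the request
    blocks = int.from_bytes(raw[6:8], "big")   # number of blocks requested
    return lba, blocks


def info_in_request(tail_hex, info_fld):
    """Check whether the reported failing block lies inside the request."""
    lba, blocks = decode_read10(tail_hex)
    return lba <= info_fld < lba + blocks


# Values copied from the 23:12:38 and 23:12:44 log entries.
first = decode_read10("00 00 55 90 87 00 00 f8 00")
second = decode_read10("00 00 55 91 07 00 00 78 00")
print(first, info_in_request("00 00 55 90 87 00 00 f8 00", 0x559105))
print(second, info_in_request("00 00 55 91 07 00 00 78 00", 0x559107))
```

Under those assumptions, both requests end at the same block (0x55917f), and the second one starts at the block it then reports as failing, which would be consistent with the kernel retrying the remainder of the first read.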

Regards,
Campbell