Hi Guys,

I have a machine here running a RAID-1 configuration on kernel 2.2.6
with the alpha RAID patch. It has hot-plug hardware, which I decided to
test by removing one half of the mirror. I pulled the disk and the
machine hung until I put the disk back. Then the RAID subsystem began to
flag the various mirrored partitions as failed.

These messages appear in the log file over and over until the disk is
put back:

Sep 23 12:00:49 mailtest kernel: sym53c876-0: restart (scsi reset).
Sep 23 12:00:49 mailtest kernel: sym53c876-0: Downloading SCSI SCRIPTS.

Then, when the disk is put back in the cabinet, the SCSI controller
starts up (with the same messages as at boot time):

Sep 23 12:01:31 mailtest kernel: sym53c876-0-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)
Sep 23 12:01:31 mailtest kernel: sym53c876-0-<1,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)

And then I get "md" messages mentioning various write failures, and
finally the disk partitions are disabled.

If I do not put the disk back, the system hangs forever (well, for the
five minutes I waited). How do I solve this problem satisfactorily, so
that the sym53c876 driver gives up retrying (perhaps after 5 or so
attempts) and the computer continues running on the other disk instead
of hanging on the failed half of the mirror?

I am sure this was handled properly (from a RAID point of view) in the
2.0.36 kernel + RAID patch.

regards,
Brian Murphy
