I'm using kernel 2.2.0 with the appropriate RAID tools and patches. I have
six Quantum 4.3GB U2W (80 MB/s) drives on an Adaptec 2940U2W controller,
with raidtools running a RAID5 across all six drives. Whenever I do any kind
of prolonged work on the drives (mke2fs on a 3 GB partition, or copying
4 GB over a 100baseT network), I get the following error messages:

scsi: aborting command due to timeout: pid 97536, scsi 0, channel 0, id 8
scsi: aborting command due to timeout: pid 97537, scsi 0, channel 0, id 9
scsi: aborting command due to timeout: pid 97538, scsi 0, channel 0, id 10

and so it goes through all six of my SCSI drive IDs, with the pid going up
by one for each of them. Then it resets the bus, tries again, tries again
"harder", and then prints the same messages all over again with a new
(higher) set of pids. The system locks up soon thereafter and never
recovers; it just keeps cycling through these errors (still with increasing
pids) for the rest of the day.

A while ago (less than six months) I read that the SCSI driver for the U2W
controller was still somewhat beta. Another post on this group a while ago
also mentioned that the only stable RAID drivers are the ones that come with
RedHat 5.2, with which I have not been able to get a RAID5 to work reliably
either (I believe the same problem occurred there already, although that
configuration was funky anyhow). I have run stress tests on all the drives
concurrently (i.e.,
dd if=/dev/zero of=/mnt/whatever count=some_really_large_number)
to see whether the controller/driver gets overloaded, and they all worked
fine without crashing.
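
A concurrent write test of that kind could be sketched roughly as follows.
This is only an illustration: the function name stress_write, the file name
stress.tmp, the 1 MB block size and the mount points in the example are
placeholders, not the exact values used in the tests above.

```shell
#!/bin/sh
# Hypothetical helper: write COUNT blocks of zeros (1 MB each) into every
# directory given, all at the same time, then wait for each writer.
# Usage: stress_write COUNT DIR [DIR...]
stress_write() {
    count=$1
    shift
    pids=""
    for dir in "$@"; do
        # One background dd per directory, so all writers run concurrently.
        dd if=/dev/zero of="$dir/stress.tmp" bs=1024k count="$count" 2>/dev/null &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        # wait returns the exit status of that particular dd.
        wait "$pid" || failed=1
    done
    return $failed
}

# Example (placeholder mount points, one per drive):
# stress_write 1000 /mnt/disk1 /mnt/disk2 /mnt/disk3
```

The function returns non-zero if any of the writers failed, so it can be
looped until the first error shows up.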

I was wondering whether you had any hints as to where the problem might be,
and how to fix it.

Thanks a lot!

-Robin Giese
