Hi,
I am seeing stack overflows when doing rigorous error testing with
the sym53c8xx driver. This would not happen if the driver used the
newer scsi error handling code. The newer code uses a queue with a
bottom half handler, which would prevent the error cases that I'm
seeing.
I have tried to switch the driver over to the newer scsi error handling
code, but I then get data corruption problems. I have modified the
SYM53C8XX #define (in sym53c8xx.h) to include use_new_eh_code:1, and
modified sym53c8xx_queue_command() so it always returns 0. Things appear
to work fine for a while, but within 1-2 minutes of testing my test code
detects data corruption in one of its files. The cache gets corrupted.
Every corruption I have looked at seems to be file based... that is, the
information that should go at the beginning of one file is seen at the
beginning of a different file.
So, my real question is...
Does anybody know the history of the scsi error handling code well
enough to say what is needed to convert older scsi drivers to use the
new code (without suffering data corruption problems)?
<>< Lance.
------------------------
Notes on what I am doing to generate the stack overflow errors...
I have modified the driver slightly to handle missing devices, as will
happen if someone yanks a drive from the bus (while in an appropriate
scsi backplane). The driver was changed to set a flag (bad_select) when
a select timeout happens. While the flag is set, all new commands to
that drive other than TEST UNIT READY are rejected. When a TEST UNIT
READY command is seen, the bad_select bit is cleared.
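
To make the flag logic concrete, here is a minimal user-space sketch of
it. This is not the actual driver patch: only bad_select comes from my
change, while target_state, note_select_timeout() and accept_command()
are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OPCODE_TEST_UNIT_READY 0x00   /* SCSI TEST UNIT READY opcode */
#define MAX_TARGETS            16     /* assumed bus width, illustration only */

/* Hypothetical per-target state; only bad_select is named in the text. */
struct target_state {
    bool bad_select;
};

static struct target_state targets[MAX_TARGETS];

/* Called when selecting the target times out (e.g. the drive was pulled). */
static void note_select_timeout(unsigned target)
{
    assert(target < MAX_TARGETS);
    targets[target].bad_select = true;
}

/*
 * Decide whether a new command may be issued to the target.
 * TEST UNIT READY always goes through and clears the flag;
 * anything else is rejected while the flag is set.
 */
static bool accept_command(unsigned target, uint8_t opcode)
{
    assert(target < MAX_TARGETS);
    if (opcode == OPCODE_TEST_UNIT_READY) {
        targets[target].bad_select = false;
        return true;
    }
    return !targets[target].bad_select;
}
```

Note that in this sketch the flag is cleared as soon as TEST UNIT READY
is seen, not when it completes successfully, which matches the
description above.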
This all works for our situation, except when there is a backlog of
commands queued in the sd layer. When a command is rejected in the
sym53c8xx driver because of a previous bad_select, that command is sent
to the scsi done code, which works its way up to rw_intr, then
requeue_sd_request, then do_sd_request, back to requeue_sd_request, to
scsi_do_cmd, and back to sym53c8xx_queue_command. If this next command
is also rejected, the cycle repeats, all on the same stack. After about
12-16 iterations the stack overflows and the system freezes.
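
The cycle above can be modelled in user space. The sketch below is not
kernel code and every name in it is hypothetical; it only compares the
nesting depth when a rejected command is completed directly from the
queue routine with the depth when the completion is parked and drained
later from a loop, which is roughly what I understand a queue plus
bottom half to buy.

```c
#include <assert.h>

#define BACKLOG 16          /* assumed number of backlogged commands */

static int depth, max_depth; /* current and deepest call nesting */
static int pending;          /* commands still queued above the driver */
static int deferred;         /* rejected commands awaiting completion */
static int use_deferred;     /* 0 = complete inline, 1 = defer */

static void queue_command(void);

/* Models done -> rw_intr -> requeue_sd_request -> ... -> queue_command. */
static void complete_command(void)
{
    if (pending > 0) {
        pending--;
        queue_command();     /* upper layer immediately issues the next one */
    }
}

/* Models the driver's queue routine; the device is bad, so it rejects. */
static void queue_command(void)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    if (use_deferred)
        deferred++;          /* park the completion for later */
    else
        complete_command();  /* re-enter the upper layer on this stack */
    depth--;
}

/* Models the bottom half: drain parked completions from a flat loop. */
static void run_bottom_half(void)
{
    while (deferred > 0) {
        deferred--;
        complete_command();
    }
}

/* Push BACKLOG rejected commands through; return deepest nesting seen. */
static int simulate(int deferred_mode)
{
    depth = max_depth = 0;
    deferred = 0;
    use_deferred = deferred_mode;
    pending = BACKLOG - 1;   /* the first command is issued just below */
    queue_command();
    run_bottom_half();
    assert(pending == 0 && deferred == 0);
    return max_depth;
}
```

With these assumptions, simulate(0) nests once per backlogged command
(16 deep here), while simulate(1) never nests more than one level,
however long the backlog is.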
-----------------------