From [EMAIL PROTECTED] Wed Dec 30 15:23:06 1998

    On Wed, 30 Dec 1998 [EMAIL PROTECTED] wrote:

    > On ftp.win.tue.nl:/pub/linux/drivers/aha1542.c
    > there is a version of aha1542.c that survives
    > the bad tapes and CDROMs that I have here, while
    > 2.0.36 and 2.1.132 do not.
    > 
    > It was tested with 2.1.132.
    > 
    > Note that Adaptec 1542 users have two quite independent
    > problems: (1) The driver didn't handle aborts and resets,
    > and (2) The driver uses the new scsi_error model.
    > 
    > The present source seems to solve (1) on my hardware.
    > [To be precise, aborts function well, bus device resets
    > seem to work but had very light testing, bus resets and
    > host resets have not been tested at all.]
    > 
    Problem 2 has been addressed in 2.2.0-pre1 (changes to scsi_error.c):  
    handling of the sense data received from the adapter is changed so that no
    more elaborate error recovery is done by the middle-level driver (when
    sense data is received, the adapter and the bus are operating properly).
    This does solve most of the problems related to tapes (good and bad) and
    bad CDROMs and this is the way the "old" error code operates.

Very good.

However, 2.2.0-pre1 does not address one of my concerns:
As soon as the machine enters error recovery, it is dead.

Look at the scenario. One busy machine, lots of disks, all active.
It works fine, but someone with a bad CDROM comes along.
He tries to read it, but sr.c gets a timeout.

Now host->in_recovery is set, and do_sd_request stops
queuing commands. The error handler does an abort and succeeds.
Then it does a retry. Again a timeout.

Now what is the effect on the system? For a time of
SR_TIMEOUT = 30*HZ no disk requests were queued.
In other words, the machine is dead for thirty seconds.

(And not only that - this badly burned CD has lots of
substandard spots, and a fraction of a second later
another problem spot is encountered, and the machine
is dead for another 30 seconds.)
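
To put rough numbers on the scenario above, here is a toy model of it.
It is only a sketch, not kernel code: the function stall_ticks and its
bad_spots parameter are made up for illustration, HZ = 100 is an
assumption (the i386 value), and the model assumes that each abort
succeeds immediately and that each retry waits out one full SR_TIMEOUT.

```python
HZ = 100                 # assumed tick rate (i386); not taken from the message
SR_TIMEOUT = 30 * HZ     # timeout quoted above, in ticks


def stall_ticks(bad_spots, timeout=SR_TIMEOUT):
    """Ticks during which no disk request is queued (hypothetical helper).

    Toy model: each bad spot puts the host in recovery, the abort
    succeeds at once, and the retry runs into one full timeout.
    """
    stalled = 0
    for _ in range(bad_spots):
        in_recovery = True          # stands in for host->in_recovery
        for _ in range(timeout):    # the retry waits out its timeout
            if in_recovery:         # the request function queues nothing
                stalled += 1
        in_recovery = False         # recovery finished, disks resume
    return stalled


if __name__ == "__main__":
    for spots in (1, 2, 10):
        seconds = stall_ticks(spots) // HZ
        print(spots, "bad spot(s):", seconds, "seconds without disk I/O")
```

In this model the stall grows linearly with the number of bad spots,
and the result in seconds does not depend on the value chosen for HZ.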

This may be reasonable on a single-user machine,
but in a production environment it is unacceptable.
Thus, no-one should use the new scsi_error paradigm
for the time being.

Andries

