Rogier Wolff <[EMAIL PROTECTED]> wrote:
> Mar 29 18:57:30 ozon kernel: scsi : aborting command due to timeout : pid 27, scsi0, 
>channel 0, id 0, lun 0 Read (10) 00 00 00 db c6 00 00 02 00 

Well, this is not a read error: this is the disk gone out to lunch
(it stopped answering, so the command timed out).

NB: some disks can be configured (see the read recovery mode page
with scsiconfig), and often are by default, to re-read a failing
sector a number of times on read errors, then maybe even to
auto-replace it (remapping of bad blocks). I had a few very old drives
(e.g. IBM DCHS 9 GB from 1995) which would take up to 10 or 20 seconds
of recovery time, the drive making all sorts of clicking noises.

What you want is either to increase the Linux sd timeout (but, well ...),
or to reduce the number of read retries/auto-replace. After doing that,
also reduce the number of verify retries below the number of read retries,
and do a complete verify of your disk. Also, you should check that the
drive is properly cooled (fan, etc.), and that the power supply is
sufficient even for the worst cases (also, in some rare cases the power
supply can be too lightly loaded!).
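
For the timeout option, here is a minimal sketch in Python. It assumes
a kernel that exposes the per-device command timeout, in seconds, as
the sysfs file /sys/block/<dev>/device/timeout; it does not apply to
kernels without that file. The function names and the sysfs_root
parameter are mine, for illustration only, and writing the file needs
root.

```python
from pathlib import Path


def sd_timeout_path(dev: str, sysfs_root: str = "/sys") -> Path:
    # Assumed location of the per-device timeout, e.g. dev = "sda".
    return Path(sysfs_root) / "block" / dev / "device" / "timeout"


def get_sd_timeout(dev: str, sysfs_root: str = "/sys") -> int:
    """Return the current command timeout in seconds."""
    return int(sd_timeout_path(dev, sysfs_root).read_text().strip())


def set_sd_timeout(dev: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Write a new command timeout in seconds (needs root on a real system)."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    sd_timeout_path(dev, sysfs_root).write_text(f"{seconds}\n")
```

Something like set_sd_timeout("sda", 60) would then give a slow drive
more time to finish its own recovery, but it only hides the symptom;
reducing the drive's retries is the better fix.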

Now, one can argue about whether the SCSI layer should really retry
that many times, and do that many bus resets, but the initial problem
is your hard drive.

PS: a command timeout can also be due to bad cabling, bad termination, etc.,
    but if it only happens with a specific block on disk, one can
    safely assume it has something to do with the drive's error recovery.
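
To check whether it is always the same block, one can decode the
command bytes from the log line. A small sketch in Python, assuming
the kernel printed the opcode as its name ("Read (10)") and then the
remaining nine bytes of the 10-byte command: flags, a 4-byte
big-endian block address, one reserved/group byte, a 2-byte transfer
length, and the control byte. The function name is made up for this
example.

```python
def decode_read10(logged_bytes: str) -> dict:
    """Decode the nine hex bytes logged after 'Read (10)'."""
    raw = bytes.fromhex(logged_bytes.replace(" ", ""))
    if len(raw) != 9:
        raise ValueError("expected 9 bytes after the opcode")
    return {
        "flags": raw[0],
        "lba": int.from_bytes(raw[1:5], "big"),     # logical block address
        "blocks": int.from_bytes(raw[6:8], "big"),  # transfer length in blocks
        "control": raw[8],
    }


cmd = decode_read10("00 00 00 db c6 00 00 02 00")
print(cmd["lba"], cmd["blocks"])  # prints: 56262 2
```

So the command quoted above was a read of 2 blocks starting at block
56262 (0xdbc6). If later timeouts show the same address, suspect the
drive's recovery on that block rather than the cabling.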

