On Fri, Feb 02, 2007 at 05:58:04PM -0500, Mark Lord wrote:
> Matt Mackall wrote:
> > Also worth considering is that spending minutes trying to reread
> > damaged sectors is likely to accelerate your death spiral. More data
> > may be recoverable if you give up quickly in a first pass, then go
> > back and manually retry damaged bits with smaller I/Os.
>
> All good input.
James Bottomley wrote:
On Fri, 2007-02-02 at 14:42 +, Alan wrote:
> The interesting point of this question is about the typical pattern of
> IO errors. On a read, it is safe to assume that you will have issues
> with a bounded number of adjacent sectors.
Which in theory you can get by asking the drive for the real sector size
from the ATA7 info. (We ough
On Fri, Feb 02, 2007 at 11:06:19AM -0500, Mark Lord wrote:
> Alan wrote:
> >
> > If this is the right strategy for disk recovery for a given type of
> > device then this ought to be an automatic strategy. Most end users will
> > not have the knowledge to frob about in sysfs, and if the bad sector hits
> > at the wrong moment a sensible automatic recovery strategy is going to
> your system requirements are, what the system is trying to do (i.e.,
> when trying to recover a failing but not dead yet disk, IO errors should
> be as quick as possible and we should choose an IO scheduler that does
> not combine IO's).
If this is the right strategy for disk recovery for a g
James Bottomley wrote:
On Thu, 2007-02-01 at 15:02 -0500, Mark Lord wrote:
> I believe you made the first change in response to my prodding at the time,
> when libata was not returning valid sense data (no LBA) for media errors.
> The SCSI EH handling of that was rather poor at the time,
> and so having it not retry the
James Bottomley wrote:
On Thu, 2007-02-01 at 15:02 -0500, Mark Lord wrote:
> One thing that could be even better than the patch below,
> would be to have it perhaps skip the entire bio that includes
> the failed sector, rather than only the bad sector itself.
Er ... define "skip over the bio". A
James Bottomley wrote:
On Tue, 2007-01-30 at 19:47 -0500, Mark Lord wrote:
> Kernels since about 2.6.16 or so have been broken in this regard.
> They "complete" the good sectors before the error,
> and then fail the entire remaining portions of the request.
What was the commit that introduced the ch
James Bottomley wrote:
On Wed, 2007-01-31 at 12:57 -0500, Mark Lord wrote:
> Alan wrote:
> >> When libata reports a MEDIUM_ERROR to us, we *know* it's non-recoverable,
> >> as the drive itself has already done internal retries (libata uses the
> >> "with retry" ATA opcodes for this).
> >
> > This depends on the firmware. Some of the "raid firmware" drives don't
> > appear to do retries in firmware.
Douglas Gilbert wrote:
Ric,
Both ATA (ATA8-ACS) and SCSI (SBC-3) have recently added
command support to flag a block as "uncorrectable". There
is no need to send bad "long" data to it and suppress the
disk's automatic re-allocation logic.
That'll be useful in a couple of years, once drives tha
Ric Wheeler wrote:
> Jeff Garzik wrote:
> > Mark Lord wrote:
> > > Eric D. Mudama wrote:
> > > > Actually, it's possibly worse, since each failure in libata will
> > > > generate 3-4 retries. With existing ATA error recovery in the
> > > > drives, that's about 3 seconds per retry on average, or 12 seconds
> > > > per failure. Multiply that by the number of blocks past the error to comple
On Wed, 2007-01-31 at 10:13 -0500, Mark Lord wrote:
> James Bottomley wrote:
> >
> > For the MD case, this is what REQ_FAILFAST is for.
> I cannot find where SCSI honours that flag. James?
Er, it's in scsi_error.c:scsi_decide_disposition():
maybe_retry:
/* we requeue for retry be
Mark Lord wrote:
> James Bottomley wrote:
> > For the MD case, this is what REQ_FAILFAST is for.
> I cannot find where SCSI honours that flag. James?
Scratch that thought.. SCSI honours it in scsi_end_request().
But I'm not certain that the block layer handles it correctly,
at least not in the 2.
James Bottomley wrote:
> For the MD case, this is what REQ_FAILFAST is for.
I cannot find where SCSI honours that flag. James?
And for that matter, even when I patch SCSI so that it *does* honour it,
I don't actually see the flag making it into the SCSI layer from above.
And I don't see where
Ric Wheeler wrote:
> Mark Lord wrote:
> > Eric D. Mudama wrote:
> > > Actually, it's possibly worse, since each failure in libata will
> > > generate 3-4 retries.
> > (note: libata does *not* generate retries for medium errors;
> > the looping is driven by the SCSI mid-layer code).
It really beats the alternative o
On Tue, 2007-01-30 at 22:20 -0500, Ric Wheeler wrote:
> Mark Lord wrote:
> > The number of retries is an entirely separate issue.
> > If we really care about it, then we should fix SD_MAX_RETRIES.
> >
> > The current value of 5 is *way* too high. It should be zero or one.
> >
> > Cheers
> >
> I th
First off, please send SCSI patches to the SCSI list:
On Tue, 2007-01-30 at 19:47 -0500, Mark Lord wrote:
> In ancient kernels, the SCSI disk code used to continue after
> encountering a MEDIUM_ERROR. It would "complete" the good
> sectors before the error, fail the bad sector/block, and then
>
James Bottomley wrote:
First off, please send SCSI patches to the SCSI list:
Fixed already, thanks!
This patch fixes the behaviour to be similar to what we had originally.
When a bad sector is encountered, SCSI will now work around it again,
failing *only* the bad sector itself.
Erm, but th
In ancient kernels, the SCSI disk code used to continue after
encountering a MEDIUM_ERROR. It would "complete" the good
sectors before the error, fail the bad sector/block, and then
continue with the rest of the request.
Kernels since about 2.6.16 or so have been broken in this regard.
They "complete" the good sectors before the error,
and then fail the entire remaining portions of the request.