Re: [PATCH 0/7] retry write on error

David Sterba Tue, 28 Nov 2017 11:24:47 -0800

On Tue, Nov 21, 2017 at 05:35:51PM -0700, Liu Bo wrote:
> If the underlying protocol doesn't support retry and there are some
> transient errors happening somewhere in our IO stack, we'd like to
> give an extra chance for IO.  Or sometimes you see btrfs reporting
> 'wrr 1 flush 0 read 0 blabla' but the disk drive is 100% good, this
> retry may help a bit.


A limited number of retries may make sense, though I saw some long
stalls after retries on bad disks. Tracking the retries would be a good
addition to the dev stats, i.e. a soft error but still worth reporting.
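
A userspace sketch of that idea, not btrfs code: the struct, the
write_retries field, the submit callback and the fail_n_times helper are
all hypothetical names made up for illustration. It only shows a bounded
retry loop that counts retries separately from hard write errors.

```c
#include <assert.h>

/* Hypothetical per-device counters; write_retries is the proposed soft-error stat. */
struct dev_stats {
	unsigned int write_errs;    /* hard errors: write still failed after all retries */
	unsigned int write_retries; /* soft errors: number of retries attempted */
};

/* Hypothetical submit callback: returns 0 on success, nonzero on error. */
typedef int (*submit_fn)(void *ctx);

/* Submit once, then retry at most max_retries times, counting every retry. */
static int write_with_retry(submit_fn submit, void *ctx, int max_retries,
			    struct dev_stats *stats)
{
	int ret;
	int i;

	assert(max_retries >= 0);

	ret = submit(ctx);
	for (i = 0; ret != 0 && i < max_retries; i++) {
		stats->write_retries++;
		ret = submit(ctx);
	}
	if (ret != 0)
		stats->write_errs++; /* retry budget used up, report a hard error */
	return ret;
}

/* Illustration helper: fail the first *ctx calls, then succeed. */
static int fail_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -5; /* stands in for -EIO */
	}
	return 0;
}
```

Capping the loop bounds the extra submissions per write, though it does
not by itself bound how long each attempt takes on a bad disk.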

> In btrfs, read retry is handled in bio_readpage_error() with the retry
> unit being page size, for write retry however, we're going to do it in
> a different way, as a write may consist of several writes onto
> different stripes, retry write needs to be done right after the IO on
> each stripe completes and arrives at endio.
> 
> Patch 1-3 are the implementation of retry write on error for
> non-raid56 profile.  Patch 4-6 are for raid56 profile.  Both raid56
> and non-raid56 share one retry helper function.
> 
> Patch 3 does retry sector by sector, but since this patch set doesn't
> include badblocks support, patch 7 changes it back to retry the whole
> bio.  (I didn't fold patch 7 to patch 3 in the hope of just reverting
> patch 7 once badblocks support is done, but I'm open to it.)

What does 'badblocks' refer to? I know about the badblocks utility that
finds and reports bad blocks; possibly ext2 understands that and avoids
allocating them. Btrfs does not have such support.