On 2/19/19 5:31 AM, Dave Chinner wrote:
> On Wed, Feb 13, 2019 at 05:50:35PM +0800, Bob Liu wrote:
>> Motivation:
>> When fs data/metadata checksum mismatch, lower block devices may have other
>> correct copies. e.g. If XFS successfully reads a metadata buffer off a raid1
>> but
>> decides that the metadata is garbage, today it will shut down the entire
>> filesystem without trying any of the other mirrors. This is a severe
>> loss of service, and we propose these patches to have XFS try harder to
>> avoid failure.
>>
>> This patch prototype this mirror retry idea by:
>> * Adding @nr_mirrors to struct request_queue which is similar as
>> blk_queue_nonrot(), filesystem can grab device request queue and check max
>> mirrors this block device has.
>> Helper functions were also added to get/set the nr_mirrors.
>>
>> * Introducing bi_rd_hint just like bi_write_hint, but bi_rd_hint is a long
>> bitmap
>> in order to support stacked layer case.
>>
>> * Modify md/raid1 to support this retry feature.
>>
>> * Adapter xfs to use this feature.
>> If the read verify fails, we loop over the available mirrors and retry the
>> read.
>
> Why does the filesystem have to iterate every single posible
> combination of devices that are underneath it?
>
> Wouldn't it be much simpler to be able to attach a verifier
> function to the bio, and have each layer that gets called iterate
> over all it's copies internally until the verfier function passes
> or all copies are exhausted?
>
> This works for stacked mirrors - it can pass the higher layer
> verifier down as far as necessary. It can work for RAID5/6, too, by
> having that layer supply it's own verifier for reads that verifies
> parity and can reconstruct of failure, then when it's reconstructed
> a valid stripe it can run the verifier that was supplied to it from
> above, etc.
>
> i.e. I dont see why only filesystems should drive retries or have to
> be aware of the underlying storage stacking. ISTM that each
> layer of the storage stack should be able to verify what has been
> returned to it is valid independently of the higher layer
> requirements. The only difference from a caller point of view should
> be submit_bio(bio); vs submit_bio_verify(bio, verifier_cb_func);
>
We already have bio->bi_end_io(), how about do the verification inside
bi_end_io()?
Then the whole sequence would like:
bio_endio()
> 1.bio->bi_end_io()
> xfs_buf_bio_end_io()
> verify, set bio->bi_status = "please retry" if verify fail
> 2.if found bio->bi_status = retry
> 3.resubmit bio
Is it fine to resubmit a bio inside bio_endio()?
- Thanks, Bob.