On Wed, Jul 11, 2018 at 01:41:21PM +0800, Qu Wenruo wrote:
> In commit ac0b4145d662 ("btrfs: scrub: Don't use inode pages for device
> replace") we removed the branch of copy_nocow_pages() to avoid
> corruption for compressed nodatasum extents.
> 
> However, the above commit only solves the problem in scrub_extent(). If
> we fail to read some pages during scrub_pages(),
> sctx->no_io_error_seen will be non-zero and we go to the fixup function
> scrub_handle_errored_block().
> 
> In scrub_handle_errored_block(), for sctx without csum (no matter if
> we're doing replace or scrub) we go to the scrub_fixup_nodatasum() routine,
> which does much the same thing as copy_nocow_pages(), but without the
> extra check done in copy_nocow_pages().
> 
> So for test cases like btrfs/100, where we emulate read errors during
> replace/scrub, we could corrupt compressed extent data again.
> 
> This patch fixes it by avoiding any "optimization" for nodatasum and
> just falling back to the normal fixup routine, which tries to read from
> any good copy.
> 
> This also fixes the WARN_ON() or deadlock caused by the lame backref
> iteration in the scrub_fixup_nodatasum() routine.
> 
> The deadlock or WARN_ON() couldn't be triggered before commit ac0b4145d662
> ("btrfs: scrub: Don't use inode pages for device replace"), since
> copy_nocow_pages() has better locking and an extra check for data extents,
> and it already does the fixup work by trying to read data from any good
> copy, so we wouldn't go to scrub_fixup_nodatasum() anyway.
> 
> Fixes: ac0b4145d662 ("btrfs: scrub: Don't use inode pages for device replace")
> Signed-off-by: Qu Wenruo <w...@suse.com>

Thanks, I'll forward this to 4.18; there are a few more regression fixes
queued.