On Wed, Jul 11, 2018 at 01:41:21PM +0800, Qu Wenruo wrote:
> In commit ac0b4145d662 ("btrfs: scrub: Don't use inode pages for device
> replace") we removed the branch of copy_nocow_pages() to avoid
> corruption for compressed nodatasum extents.
>
> However above commit only solves the problem in scrub_extent(), if
> during scrub_pages() we failed to read some pages,
> sctx->no_io_error_seen will be non-zero and we go to fixup function
> scrub_handle_errored_block().
>
> In scrub_handle_errored_block(), for sctx without csum (no matter if
> we're doing replace or scrub) we go to scrub_fixup_nodatasum() routine,
> which does the similar thing with copy_nocow_pages(), but does it
> without the extra check in copy_nocow_pages() routine.
>
> So for test cases like btrfs/100, where we emulate read errors during
> replace/scrub, we could corrupt compressed extent data again.
>
> This patch will fix it just by avoiding any "optimization" for
> nodatasum, just falls back to the normal fixup routine by try read from
> any good copy.
>
> This also solves WARN_ON() or dead lock caused by lame backref iteration
> in scrub_fixup_nodatasum() routine.
>
> The deadlock or WARN_ON() won't be triggered before commit ac0b4145d662
> ("btrfs: scrub: Don't use inode pages for device replace") since
> copy_nocow_pages() have better locking and extra check for data extent,
> and it's already doing the fixup work by try to read data from any good
> copy, so it won't go scrub_fixup_nodatasum() anyway.
>
> Fixes: ac0b4145d662 ("btrfs: scrub: Don't use inode pages for device replace")
> Signed-off-by: Qu Wenruo <w...@suse.com>

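For anyone following along, the "read from any good copy" fallback described above can be pictured with a small userspace sketch. This is not the kernel code and does not reflect the actual scrub data structures; struct mirror_copy, find_good_copy() and repair_from_good_copy() are hypothetical names made up for illustration. The only assumption is that, without a checksum, a copy counts as good when it can be read without an I/O error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 8  /* toy size, illustration only */

/* Hypothetical stand-in for one mirror's copy of a sector. */
struct mirror_copy {
    bool io_error;                    /* true if reading this copy fails */
    unsigned char data[SECTOR_SIZE];  /* contents when readable */
};

/*
 * Return the index of the first mirror other than bad_idx that can be
 * read without an I/O error, or -1 if no such mirror exists.
 */
int find_good_copy(const struct mirror_copy *copies, size_t n, size_t bad_idx)
{
    for (size_t i = 0; i < n; i++) {
        if (i == bad_idx)
            continue;
        if (!copies[i].io_error)
            return (int)i;
    }
    return -1;
}

/*
 * Repair copies[bad_idx] by overwriting it with any good copy.
 * Returns 0 on success, -1 if every other mirror also fails to read.
 */
int repair_from_good_copy(struct mirror_copy *copies, size_t n, size_t bad_idx)
{
    assert(bad_idx < n);

    int good = find_good_copy(copies, n, bad_idx);
    if (good < 0)
        return -1;

    memcpy(copies[bad_idx].data, copies[good].data, SECTOR_SIZE);
    copies[bad_idx].io_error = false;
    return 0;
}
```

The sketch works purely on the on-disk copies and never consults inode pages, which is the property the patch description relies on for compressed nodatasum extents.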
Thanks, I'll forward this to 4.18, there are a few more regression fixes queued.