On 2019/2/18 下午1:27, Qu Wenruo wrote:
> Patchset can be fetched from github:
> https://github.com/adam900710/linux/tree/write_time_tree_checker
> Which is based on v5.0-rc1 tag.
> Also there is no conflict rebasing the patchset to misc-next.

Now the github branch rebased to v5.0-rc7 tag.

And ran tests of btrfs auto group, no new regressions found.

Git is clever enough in this rebase, so I bother mail bombing the mail
list for another minor update.

Thanks,
Qu

> 
> This patchset has the following 3 features:
> - Tree block validation output enhancement
>   * Output validation failure timing (write time or read time)
>   * Always output tree block level/key mismatch error message
>     This part is already submitted and reviewed.
> 
> - Write time tree block validation check
>   To catch memory corruption either from hardware or kernel.
>   Example output would be:
> 
>     BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 
> slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 
> 169 0)
>     BTRFS error (device dm-3): block=1350630375424 write time tree block 
> corruption detected
>     BTRFS: error (device dm-3) in btrfs_commit_transaction:2220: errno=-5 IO 
> failure (Error while writing out transaction)
>     BTRFS info (device dm-3): forced readonly
>     BTRFS warning (device dm-3): Skipping commit of aborted transaction.
>     BTRFS: error (device dm-3) in cleanup_transaction:1839: errno=-5 IO 
> failure
>     BTRFS info (device dm-3): delayed_refs has NO entry
> 
> - Better error handling before calling flush_write_bio()
>   One hidden reason of calling flush_write_bio() under all cases is,
>   flush_write_bio() will trigger endio function and endio function of
>   epd->bio will free the bio under all cases.
>   So we're in fact abusing flush_write_bio() as cleanup.
> 
>   Since now flush_write_bio() has its own return value, we shouldn't call
>   flush_write_bio() no-brain, here we introduce proper cleanup helper,
>   end_write_bio(). Now we call flush_write_bio() like:
>               New                 |           Old
>   --------------------------------------------------------------
>   ret = do_some_evil(&epd);       | ret = do_some_evil(&epd);
>   if (ret < 0) {                  | flush_write_bio(&epd);
>       end_write_bio(&epd, ret); | ^^^ submitting half-backed epd->bio?
>       return ret;               | return ret;
>   }                               |
>   ret = flush_write_bio(&epd);    |
>   return ret;                     |
> 
>   Above code should be more streamline for the error handling part.
> 
> Changelog:
> v2:
> - Unlock locked pages in lock_extent_buffer_for_io() for error handling.
> - Added Reviewed-by tags.
> 
> v3:
> - Remove duplicated error message.
> - Use IS_ENABLED() macro to replace #ifdef.
> - Added Reviewed-by tags.
> 
> v4:
> - Re-organized patch split
>   Now each BUG_ON() cleanup has its own patch
> - Dig much further into the call sites to eliminate unexpected >0 return
>   May be a little paranoid and abuse some ASSERT(), but it should be
>   much safer against further code change.
> - Fix the false alert caused by balance and memory pressure
>   The fix it skip owner checker for non-essential tree at write time.
>   Since owner root can't always be reliable, either due to commit root
>   created in current transaction or balance + memory pressure.
> 
> v5:
> - Do proper error-out handling other than relying on flush_write_bio()
>   to clean up.
>   This has a side effect that no Reviewed-by tags for modified patches.
> - New comment for why we don't need to do anything about ebp->bio when
>   submit_one_bio() fails.
> - Add some Reviewed-by tag.
> 
> v5.1:
> - Add "block=%llu " output for write/read time error line.
> - Also output read time error message for fsid/start/level check.
> 
> Qu Wenruo (12):
>   btrfs: Always output error message when key/level verification fails
>   btrfs: extent_io: Kill the forward declaration of flush_write_bio()
>   btrfs: disk-io: Show the timing of corrupted tree block explicitly
>   btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up
>   btrfs: extent_io: Handle error better in extent_write_full_page()
>   btrfs: extent_io: Handle error better in btree_write_cache_pages()
>   btrfs: extent_io: Kill the dead branch in extent_write_cache_pages()
>   btrfs: extent_io: Handle error better in extent_write_locked_range()
>   btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io()
>   btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages()
>   btrfs: extent_io: Handle error better in extent_writepages()
>   btrfs: Do mandatory tree block check before submitting bio
> 
>  fs/btrfs/disk-io.c      |  23 ++++--
>  fs/btrfs/extent_io.c    | 168 ++++++++++++++++++++++++++++------------
>  fs/btrfs/tree-checker.c |  24 +++++-
>  fs/btrfs/tree-checker.h |   8 ++
>  4 files changed, 164 insertions(+), 59 deletions(-)
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to