On 2019/3/21 10:34 PM, David Sterba wrote:
> On Wed, Mar 20, 2019 at 02:27:38PM +0800, Qu Wenruo wrote:
>> Patchset can be fetched from github:
>> https://github.com/adam900710/linux/tree/write_time_tree_checker
>> Which is based on v5.1-rc1 tag.
>>
>> This patchset has the following 3 features:
>> - Tree block validation output enhancement
>>   * Output validation failure timing (write time or read time)
>>   * Always output tree block level/key mismatch error message
>>     This part is already submitted and reviewed.
>>
>> - Write time tree block validation check
>>   To catch memory corruption either from hardware or kernel.
>>   Example output would be:
>>
>>     BTRFS critical (device dm-3): corrupt leaf: root=2 block=1350630375424 slot=68, bad key order, prev (10510212874240 169 0) current (1714119868416 169 0)
>>     BTRFS error (device dm-3): block=1350630375424 write time tree block corruption detected
>>     BTRFS: error (device dm-3) in btrfs_commit_transaction:2220: errno=-5 IO failure (Error while writing out transaction)
>>     BTRFS info (device dm-3): forced readonly
>>     BTRFS warning (device dm-3): Skipping commit of aborted transaction.
>>     BTRFS: error (device dm-3) in cleanup_transaction:1839: errno=-5 IO failure
>>     BTRFS info (device dm-3): delayed_refs has NO entry
>>
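As a rough illustration of the kind of check behind the "bad key order" line in the log quoted above, here is a minimal userspace sketch in C. It is not the kernel's tree checker: struct mock_key, mock_comp_keys() and mock_find_bad_key_order() are hypothetical names made up for this example, and it only assumes that keys are ordered by (objectid, type, offset) and must be strictly increasing within a leaf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified key: the three fields printed in the log. */
struct mock_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Compare two keys field by field: objectid first, then type, then offset. */
static int mock_comp_keys(const struct mock_key *a, const struct mock_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Return the first slot whose key is not strictly greater than the key
 * in the previous slot, or -1 if the whole array is correctly ordered.
 */
static int mock_find_bad_key_order(const struct mock_key *keys, size_t nr)
{
	assert(keys != NULL || nr == 0);
	for (size_t slot = 1; slot < nr; slot++) {
		if (mock_comp_keys(&keys[slot - 1], &keys[slot]) >= 0)
			return (int)slot;
	}
	return -1;
}
```

Fed the two keys from the log, (10510212874240 169 0) followed by (1714119868416 169 0), the helper reports the second slot as bad because its objectid is smaller than the previous one. Running such a check just before the block is written is what lets the corruption be caught while the on-disk copy is still intact.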
>> - Better error handling before calling flush_write_bio()
>>   One hidden reason for calling flush_write_bio() in all cases is that
>>   flush_write_bio() triggers the endio function, and the endio function
>>   of epd->bio always frees the bio.
>>   So we are in fact abusing flush_write_bio() as a cleanup helper.
>>
>>   Since flush_write_bio() now has its own return value, we shouldn't
>>   call it blindly. Here we introduce a proper cleanup helper,
>>   end_write_bio(). Now we call flush_write_bio() like:
>>               New                 |           Old
>>   --------------------------------------------------------------
>>   ret = do_some_evil(&epd);       | ret = do_some_evil(&epd);
>>   if (ret < 0) {                  | flush_write_bio(&epd);
>>      end_write_bio(&epd, ret);    | ^^^ submitting half-baked epd->bio?
>>      return ret;                  | return ret;
>>   }                               |
>>   ret = flush_write_bio(&epd);    |
>>   return ret;                     |
>>
>>   The new code should make the error handling more streamlined.
>>
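To make the New/Old comparison above concrete, here is a small userspace mock in C. It is only a sketch under stated assumptions: struct mock_epd and every mock_* function are hypothetical stand-ins invented for this example, not the kernel's extent_page_data, flush_write_bio() or end_write_bio(). The mock just counts how often a pending bio gets "submitted" so the difference between the two patterns is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the writeback state; not the real kernel struct. */
struct mock_epd {
	bool has_bio;     /* a bio has been built up but not yet submitted */
	int submitted;    /* how many times a bio was sent out for I/O */
	int ended_error;  /* error recorded when a bio was ended without I/O */
};

/* Mock of the flush step: submit any pending bio; returns 0 or a negative error. */
static int mock_flush_write_bio(struct mock_epd *epd)
{
	assert(epd != NULL);
	if (epd->has_bio) {
		epd->submitted++;
		epd->has_bio = false;
	}
	return 0;
}

/* Mock of the cleanup helper: drop the pending bio with an error, no I/O. */
static void mock_end_write_bio(struct mock_epd *epd, int ret)
{
	assert(epd != NULL);
	if (epd->has_bio) {
		epd->ended_error = ret;
		epd->has_bio = false;
	}
}

/* Mock of do_some_evil(): always builds a bio, then returns fail_with. */
static int mock_prepare(struct mock_epd *epd, int fail_with)
{
	epd->has_bio = true;
	return fail_with;
}

/* New pattern: on error, end the bio without submitting it. */
static int write_new_style(struct mock_epd *epd, int fail_with)
{
	int ret = mock_prepare(epd, fail_with);

	if (ret < 0) {
		mock_end_write_bio(epd, ret);
		return ret;
	}
	return mock_flush_write_bio(epd);
}

/* Old pattern: flush unconditionally, even if preparation failed. */
static int write_old_style(struct mock_epd *epd, int fail_with)
{
	int ret = mock_prepare(epd, fail_with);

	mock_flush_write_bio(epd);
	return ret;
}
```

With a failing preparation step (for example fail_with = -5), write_old_style() still bumps the submitted counter, while write_new_style() leaves it at zero and records the error instead. Both return the same error to the caller, so the difference is only in what happens to the half-built bio.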
>> Changelog:
>> v2:
>> - Unlock locked pages in lock_extent_buffer_for_io() for error handling.
>> - Added Reviewed-by tags.
>>
>> v3:
>> - Remove duplicated error message.
>> - Use IS_ENABLED() macro to replace #ifdef.
>> - Added Reviewed-by tags.
>>
>> v4:
>> - Re-organized patch split
>>   Now each BUG_ON() cleanup has its own patch
>> - Dig much further into the call sites to eliminate unexpected >0 return
>>   May be a little paranoid and abuse some ASSERT(), but it should be
>>   much safer against further code change.
>> - Fix the false alert caused by balance and memory pressure
>>   The fix is to skip the owner check for non-essential trees at write
>>   time, since the owner root can't always be relied on, either due to a
>>   commit root created in the current transaction or balance + memory
>>   pressure.
>>
>> v5:
>> - Do proper error-out handling rather than relying on flush_write_bio()
>>   to clean up.
>>   As a side effect, the modified patches lose their Reviewed-by tags.
>> - New comment for why we don't need to do anything about epd->bio when
>>   submit_one_bio() fails.
>> - Add some Reviewed-by tags.
>>
>> v5.1:
>> - Add "block=%llu " output for write/read time error line.
>> - Also output read time error message for fsid/start/level check.
>>
>> v5.2:
>> - Fix a missing page_unlock() in error handling
>>
>> v5.3:
>> - Rebase to v5.1-rc1 tag
>>
>> Qu Wenruo (11):
>>   btrfs: Always output error message when key/level verification fails
>>   btrfs: disk-io: Show the timing of corrupted tree block explicitly
>>   btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up
>>   btrfs: extent_io: Handle error better in extent_write_full_page()
>>   btrfs: extent_io: Handle error better in btree_write_cache_pages()
>>   btrfs: extent_io: Kill the dead branch in extent_write_cache_pages()
>>   btrfs: extent_io: Handle error better in extent_write_locked_range()
>>   btrfs: extent_io: Kill the BUG_ON() in lock_extent_buffer_for_io()
>>   btrfs: extent_io: Kill the BUG_ON() in extent_write_cache_pages()
>>   btrfs: extent_io: Handle error better in extent_writepages()
>>   btrfs: Do mandatory tree block check before submitting bio
> 
> All except 9 and 11 are going to misc-next after a final test. The two
> patches are only postponed, one for review and the other one due to
> conflict with another patch in misc-next.

Do I need to resend the last patch with the conflict resolved myself?

Thanks,
Qu
