On 2019/8/7 1:47 AM, David Sterba wrote:
> On Tue, Aug 06, 2019 at 10:04:51PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2019/8/6 9:58 PM, David Sterba wrote:
>>> On Thu, Jul 25, 2019 at 02:12:20PM +0800, Qu Wenruo wrote:
>>>>
>>>>  	if (!first_key)
>>>>  		return 0;
>>>> +	/* We have @first_key, so this @eb must have at least one item */
>>>> +	if (btrfs_header_nritems(eb) == 0) {
>>>> +		btrfs_err(fs_info,
>>>> +		"invalid tree nritems, bytenr=%llu nritems=0 expect >0",
>>>> +			  eb->start);
>>>> +		WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>>>> +		return -EUCLEAN;
>>>> +	}
>>>
>>> generic/015 complains:
>>>
>>> generic/015 [13:51:40][ 5949.416657] run fstests generic/015 at
>>> 2019-08-06 13:51:40
>>
>> I hit this once, but not this test case.
>> The same backtrace for csum tree.
>>
>> Have you ever hit it again?
>
> Yes I found a few more occurrences, the last one seems to be interesting so
> it's pasted as-is.
>
> generic/449
>
> [21423.875017]  read_block_for_search+0x144/0x380 [btrfs]
> [21423.876433]  btrfs_search_slot+0x297/0xfc0 [btrfs]
> [21423.877830]  ? btrfs_update_delayed_refs_rsv+0x59/0x70 [btrfs]
> [21423.880038]  btrfs_lookup_csum+0xa9/0x210 [btrfs]
> [21423.881304]  btrfs_csum_file_blocks+0x205/0x800 [btrfs]
> [21423.882674]  ? unpin_extent_cache+0x27/0xc0 [btrfs]
> [21423.884050]  add_pending_csums+0x50/0x70 [btrfs]
> [21423.885285]  btrfs_finish_ordered_io+0x403/0x7b0 [btrfs]
> [21423.886781]  ? _raw_spin_unlock_bh+0x30/0x40
> [21423.888164]  normal_work_helper+0xe2/0x520 [btrfs]
> [21423.889521]  process_one_work+0x22f/0x5b0
> [21423.890332]  worker_thread+0x50/0x3b0
> [21423.891001]  ? process_one_work+0x5b0/0x5b0
> [21423.892025]  kthread+0x11a/0x130

Haven't triggered it again yet, but it indeed looks like a race.

I have only triggered it once, on my old host. Since migrating to a new
system, it looks like my new setup is sometimes too fast to hit the race
window, sometimes even too fast to allow btrfs replace cancel to be
executed before the replace finishes.

Would you please try the following diff, since you can trigger it more
reliably?
(It moves the nritems check after the generation check.)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a843c21f3060..787ebe4af55d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -414,6 +414,15 @@ int btrfs_verify_level_key(struct extent_buffer *eb, int level,
 	if (!first_key)
 		return 0;
 
+	/*
+	 * For live tree block (new tree blocks in current transaction),
+	 * we need proper lock context to avoid race, which is impossible here.
+	 * So we only checks tree blocks which is read from disk, whose
+	 * generation <= fs_info->last_trans_committed.
+	 */
+	if (btrfs_header_generation(eb) > fs_info->last_trans_committed)
+		return 0;
+
 	/* We have @first_key, so this @eb must have at least one item */
 	if (btrfs_header_nritems(eb) == 0) {
 		btrfs_err(fs_info,
@@ -423,14 +432,6 @@ int btrfs_verify_level_key(struct extent_buffer *eb, int level,
 		return -EUCLEAN;
 	}
 
-	/*
-	 * For live tree block (new tree blocks in current transaction),
-	 * we need proper lock context to avoid race, which is impossible here.
-	 * So we only checks tree blocks which is read from disk, whose
-	 * generation <= fs_info->last_trans_committed.
-	 */
-	if (btrfs_header_generation(eb) > fs_info->last_trans_committed)
-		return 0;
 	if (found_level)
 		btrfs_node_key_to_cpu(eb, &found_key, 0);
 	else
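
To make the intended ordering easier to reason about outside the kernel, here is a minimal userspace sketch of it. It is only a model under stated assumptions, not the kernel function: `struct mock_eb` and `mock_verify_nritems()` are hypothetical names, and the two fields stand in for what `btrfs_header_generation()` and `btrfs_header_nritems()` would return.

```c
#include <errno.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value; fallback for platforms lacking it */
#endif

/* Hypothetical stand-in for the header fields of an extent buffer. */
struct mock_eb {
	uint64_t generation;	/* transaction that last wrote this block */
	uint32_t nritems;	/* number of items in this block */
};

/*
 * Model of the proposed check order (assumption: mirrors the diff above).
 * Blocks newer than the last committed transaction are skipped first,
 * because they may still be changing; only then is an empty block
 * treated as corruption.
 */
static int mock_verify_nritems(const struct mock_eb *eb,
			       uint64_t last_trans_committed)
{
	if (eb->generation > last_trans_committed)
		return 0;		/* live block: not checked here */
	if (eb->nritems == 0)
		return -EUCLEAN;	/* committed block must not be empty */
	return 0;
}
```

With this ordering, an empty block from the running transaction is passed through, while an empty block read back from a committed transaction is still rejected with -EUCLEAN.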