On 2019/8/27 下午5:58, Anand Jain wrote: > On 27/8/19 4:12 PM, Qu Wenruo wrote: >> >> >> On 2019/8/27 下午3:40, Anand Jain wrote: >>> In a corrupted tree if search for next devid finds the device with >>> devid = -1, then report the error -EUCLEAN back to the parent >>> function to fail gracefully. >>> >>> Signed-off-by: Anand Jain <anand.j...@oracle.com> >>> --- >>> fs/btrfs/volumes.c | 7 ++++++- >>> 1 file changed, 6 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 4db4a100c05b..36aa5f79fb6c 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -1849,7 +1849,12 @@ static noinline int find_next_devid(struct >>> btrfs_fs_info *fs_info, >>> if (ret < 0) >>> goto error; >>> - BUG_ON(ret == 0); /* Corruption */ >>> + if (ret == 0) { >>> + /* Corruption */ >>> + btrfs_err(fs_info, "corrupted chunk tree devid -1 matched"); >> >> It will never hit this branch. >> >> As in tree checker, we have checked if the devid is so large that a >> chunk item or system chunk array can't contain one. > > That check is buggy. It assumes devid represents the num_devices, > it does not account for the possible devid hole as created in the > below script. > > $ cat t > > umount /btrfs > dev1=/dev/sdb > dev2=/dev/sdc > mkfs.btrfs -fq -dsingle -msingle $dev1 > mount $dev1 /btrfs > > _fail() > { > echo $1 > exit 1 > } > > while true; do > btrfs dev add -f $dev2 /btrfs || _fail "add failed" > btrfs dev del $dev1 /btrfs || _fail "del failed" > dev_tmp=$dev1 > dev1=$dev2 > dev2=$dev_tmp > done > > ----------------------- > [ 185.446441] BTRFS critical (device sdb): corrupt leaf: root=3 > block=313739198464 slot=1 devid=1 invalid devid: has=507 expect=[0, 506] > [ 185.446446] BTRFS error (device sdb): block=313739198464 write time > tree block corruption detected > [ 185.446556] BTRFS: error (device sdb) in > btrfs_commit_transaction:2268: errno=-5 IO failure (Error while writing > out transaction) > [ 185.446559] BTRFS warning (device sdb): Skipping commit of aborted > transaction. > [ 185.446561] BTRFS: error (device sdb) in cleanup_transaction:1827: > errno=-5 IO failure > -----------------------
Oh, that's a case I haven't considered. Great we can find a bug in a seemingly unrelated patch. So the patch itself is OK. Reviewed-by: Qu Wenruo <w...@suse.com> Thanks, Qu > > > Thanks, Anand > > >> That limit is way smaller than (u64)-1. >> Thus if we really have a key (DEV_ITEMS DEV_ITEM -1), it will be >> rejected by tree-checker in the first place, thus you will get a ret == >> -EUCLEAN from previous btrfs_search_slot() call. >> >> Thanks, >> Qu >>> + ret = -EUCLEAN; >>> + goto error; >>> + } >>> ret = btrfs_previous_item(fs_info->chunk_root, path, >>> BTRFS_DEV_ITEMS_OBJECTID, >>> >> >
signature.asc
Description: OpenPGP digital signature