On 2018年04月23日 16:04, Dmitrii Tcvetkov wrote: >>>>> TL;DR It seems as regression in 4.17, but I managed to find a >>>>> workaround to make filesystem rw mountable again. >>>>> >>>>> Kernel built from tag v4.17-rc1 >>>>> btrfs-progs 4.16 >>>>> >>>>> Tonight two my machines (PC (ECC RAM) and laptop(non-ECC RAM)) were >>>>> doing usual weekly balance with this command via cron: >>>>> btrfs balance start -musage=50 -dusage=50 <mountpoint> >>>>> Both machines run same kernel version. >>>>> >>>>> On PC that caused root and "data" filesystems to go readonly. Root >>>>> is on an SSD with data single and metadata DUP, "data" filesystem >>>>> is on 2 HDDs with RAID1 for data and metadata. >>>>> >>>>> On laptop only /home went ro, it's on NVMe SSD with data single and >>>>> metadata DUP. >>>>> >>>>> Btrfs check of PC rootfs was without any errors in both modes, I did >>>>> them once each before reboot on readonly filesystem with --force >>>>> flag and then from live usb. Same output without any errors. >>>>> >>>>> After reboot kernel refused rw mount rootfs with the same error as >>>>> during cron balance, ro mount was accepted, error during rw mount: >>>>> BTRFS: error (device dm-17) in merge_reloc_roots:2465: errno=-117 >>> >>>> 117 means EUCLEAN, which could be caused by the newly introduced >>>> first_key and level check. >>> >>>> Please apply this hotfix to fix it. >>>> btrfs: Only check first key for committed tree blocks >>>> (Which is included in latest pull request) >>> >>>> Also, please consider enable CONFIG_BTRFS_DEBUG to provide extra >>>> debug info. >>> >>>> Thanks, >>>> Qu >>> >>> I tried 4.17-rc2 (as the pull request was pulled) with >>> CONFIG_BTRFS_DEBUG on LVM snapshot of laptop home partition (/dev/vdb) >>> in a VM (VM kernel sees only snapshot so no UUID collisions). Dmesg >>> attached. >> >> Thanks for the info and your previous btrfs-image. >> >> The image itself shows nothing wrong, so it should be runtime problem. >> Would you please apply these two debug patches? >> https://patchwork.kernel.org/patch/10335133/ >> https://patchwork.kernel.org/patch/10335135/ >> >> And the attached diff file? >> >> My guess is the parent node is not initialized correctly in this case. >> >> Thanks, >> Qu > > Dmesg from kernel with all three patches applied attached. > Thanks for the debug info, it really helps a lot!
It turns out that I'm just a super idiot, a typo in replace_path() caused this, and it could not be trigger unless we enter it from relocation recovery. Please try the attached patch to see if it solves the problem. Thanks, Qu
From 4b70eb864192ec5cf54a7e67e2957ddf0e5c0f6f Mon Sep 17 00:00:00 2001 From: Qu Wenruo <w...@suse.com> Date: Mon, 23 Apr 2018 16:13:55 +0800 Subject: [PATCH] btrfs: Fix wrong first_key parameter in replace_path Commit 581c1760415c ("btrfs: Validate child tree block's level and first key") introduced new @first_key parameter for read_tree_block(), however caller in replace_path() is parasing wrong key to read_tree_block(). It should use parameter @first_key other than @key. Normally it won't expose problem as @key is normally initialzied to the same value of @first_key we expect. However in relocation recovery case, @key can be set to (0, 0, 0), and since no valid key in relocation tree can be (0, 0, 0), it will cause read_tree_block() to return -EUCLEAN and interrupt relocation recovery. Fix it by setting @first_key correctly. Signed-off-by: Qu Wenruo <w...@suse.com> --- fs/btrfs/relocation.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 00b7d3231821..b041b945a7ae 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1841,7 +1841,7 @@ int replace_path(struct btrfs_trans_handle *trans, old_bytenr = btrfs_node_blockptr(parent, slot); blocksize = fs_info->nodesize; old_ptr_gen = btrfs_node_ptr_generation(parent, slot); - btrfs_node_key_to_cpu(parent, &key, slot); + btrfs_node_key_to_cpu(parent, &first_key, slot); if (level <= max_level) { eb = path->nodes[level]; -- 2.17.0
signature.asc
Description: OpenPGP digital signature