Re: [PATCH 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode
2014-05-28 21:56 GMT+08:00 Eric Sandeen sand...@redhat.com: The subject and the comment say what this change does, but that's obvious from reading the code. Nothing says *why* the change has been made. What does this fix, and how does it fix it? Yup, the reason that we allow partial opening is that sometimes, we may have a corrupted extent tree(for example), but for fsck repair case, the broken tree maybe rebuilt. So if users only want to do check but not repaired, this patch will make fsck return failure as soon as possible and tell users that some critial roots have been corrupted... Let's come to your comments, Eric, you are absolutely right, i was a little lazy sometimes... I would add necessay changelog to describe why we need this patch, thanks for your comments. Regards, Wang Can you add/update the commit log so that some reader in the future (or for that matter, a reviewer in the present) will have an idea about the reason for this change? What was the failure case, what was the failure mode, why does this change fix it, etc. Thanks, -Eric On 5/28/14, 6:20 AM, Wang Shilong wrote: Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- cmds-check.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/cmds-check.c b/cmds-check.c index db7df80..0e4e042 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -6810,8 +6810,7 @@ int cmd_check(int argc, char **argv) int option_index = 0; int init_csum_tree = 0; int qgroup_report = 0; - enum btrfs_open_ctree_flags ctree_flags = - OPEN_CTREE_PARTIAL | OPEN_CTREE_EXCLUSIVE; + enum btrfs_open_ctree_flags ctree_flags = OPEN_CTREE_EXCLUSIVE; while(1) { int c; @@ -6877,6 +6876,10 @@ int cmd_check(int argc, char **argv) goto err_out; } + /* only allow partial opening under repair mode */ + if (repair) + ctree_flags |= OPEN_CTREE_PARTIAL; + info = open_ctree_fs_info(argv[optind], bytenr, 0, ctree_flags); if (!info) { fprintf(stderr, Couldn't open file system\n); -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: set right total device count for seeding support
2014-05-16 22:14 GMT+08:00 Anand Jain anand.j...@oracle.com: Wang, There seems to be a problem - after we delete the seed disk, the total_devices didn't decrement back to 1. reproducer as in the below test case. (I used btrfs-devlist (posted) to check fs_devices). There should be other problems flighting or i missed something , i'll take a look at this. Thanks Anand for finding this. # mkfs.btrfs -f /dev/sdb # btrfstune -S 1 /dev/sdb # mount /dev/sdb /mnt # btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1 mount -o rw,remount /mnt btrfs dev del /dev/sdb /mnt -- fs_devices-total_devices is still 2 Thanks, Anand On 13/05/14 17:05, Wang Shilong wrote: Seeding device support allows us to create a new filesystem based on existed filesystem. However newly created filesystem's @total_devices should include seed devices. This patch fix the following problem: # mkfs.btrfs -f /dev/sdb # btrfstune -S 1 /dev/sdb # mount /dev/sdb /mnt # btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1 # umount /mnt # mount /dev/sdc /mnt ---fs_devices-total_devices = 2 This is because we record right @total_devices in superblock, but @fs_devices-total_devices is reset to be 0 in btrfs_prepare_sprout(). Fix this problem by not resetting @fs_devices-total_devices. Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- fs/btrfs/volumes.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 6fd7fe6..19b2d32 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1883,7 +1883,6 @@ static int btrfs_prepare_sprout(struct btrfs_root *root) fs_devices-seeding = 0; fs_devices-num_devices = 0; fs_devices-open_devices = 0; - fs_devices-total_devices = 0; fs_devices-seed = seed_devices; generate_random_uuid(fs_devices-fsid); -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: set right total device count for seeding support
2014-05-16 22:44 GMT+08:00 Anand Jain anand.j...@oracle.com: Wang, when we unmount mount (instead of remount) and followed with device del seed it ends up with null pointer deref at btrfs_shrink_dev. Thats because the btrfs_root is not set for seed disk as we mounted the writable sprout disk. I am writing a separate fix for that as its a new bug. You means this one which was addressed by Liu few days ago? https://patchwork.kernel.org/patch/4150471/ Thanks, Anand On 16/05/14 22:14, Anand Jain wrote: Wang, There seems to be a problem - after we delete the seed disk, the total_devices didn't decrement back to 1. reproducer as in the below test case. (I used btrfs-devlist (posted) to check fs_devices). # mkfs.btrfs -f /dev/sdb # btrfstune -S 1 /dev/sdb # mount /dev/sdb /mnt # btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1 mount -o rw,remount /mnt btrfs dev del /dev/sdb /mnt -- fs_devices-total_devices is still 2 Thanks, Anand On 13/05/14 17:05, Wang Shilong wrote: Seeding device support allows us to create a new filesystem based on existed filesystem. However newly created filesystem's @total_devices should include seed devices. This patch fix the following problem: # mkfs.btrfs -f /dev/sdb # btrfstune -S 1 /dev/sdb # mount /dev/sdb /mnt # btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1 # umount /mnt # mount /dev/sdc /mnt ---fs_devices-total_devices = 2 This is because we record right @total_devices in superblock, but @fs_devices-total_devices is reset to be 0 in btrfs_prepare_sprout(). Fix this problem by not resetting @fs_devices-total_devices. Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- fs/btrfs/volumes.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 6fd7fe6..19b2d32 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1883,7 +1883,6 @@ static int btrfs_prepare_sprout(struct btrfs_root *root) fs_devices-seeding = 0; fs_devices-num_devices = 0; fs_devices-open_devices = 0; -fs_devices-total_devices = 0; fs_devices-seed = seed_devices; generate_random_uuid(fs_devices-fsid); -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs: ioctl BTRFS_IOC_FS_INFO and BTRFS_IOC_DEV_INFO miss-matched with slots
2014-05-16 23:13 GMT+08:00 Anand Jain anand.j...@oracle.com: On 16/05/14 22:40, Shilong Wang wrote: 2014-05-16 22:06 GMT+08:00 Anand Jain anand.j...@oracle.com: BTRFS_IOC_FS_INFO return num_devices which does not include seed disks, BTRFS_IOC_DEV_INFO fetches seed disk when probed. So in this case hits the btrfs-progs bug: get_fs_info() :: BUG_ON(ndevs = fi_args-num_devices); which is very easy to hit by using btrfs filesystem show. This patch will make BTRFS_IOC_DEV_INFO ioctl to provide disks only of the FSID being probed (seed disks are under different FSID). which means when seed is still not deleted from the sprout the btrfs filesystem show command will show disks them under their respective FSIDs Just a quick comment: Newly created filesystem is based on existed seeding filesystem. Though they have different fsid, newly created filesystem devices should include seed devices, from seed device filesytem, they can not see newly created filesystem devices! No? Will the follow patch change the above behavior? Yes, as the dev info list does not contain the seed disk so is the btrfs fi show output. Do you think its better to show seed and sprout disks together ? other way is indicate the seed fsid against the added disk until seed is deleted. IMO, i think dev info list should contain seed disks, since new filesystem can access their data. Maybe a better way 'file show' will tell users which devices are seed devices.. Thanks. Anand Thanks, Wang Signed-off-by: Anand Jain anand.j...@oracle.com --- fs/btrfs/ioctl.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index ff27c08..902d279 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2584,7 +2584,8 @@ static long btrfs_ioctl_dev_info(struct btrfs_root *root, void __user *arg) s_uuid = di_args-uuid; mutex_lock(fs_devices-device_list_mutex); - dev = btrfs_find_device(root-fs_info, di_args-devid, s_uuid, NULL); + dev = btrfs_find_device(root-fs_info, di_args-devid, s_uuid, + fs_devices-fsid); if (!dev) { ret = -ENODEV; -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Back from leave
2014-05-05 22:28 GMT+08:00 Josef Bacik jba...@fb.com: Hello, I had way too much email so I just deleted it all, if there was something you wanted my specific attention on then bounce it back at me and I'll look at it. Thanks, Welcome back, Josef! Wang Josef -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: csum failed that was not detected by scrub
Hello, 2014-05-02 17:42 GMT+08:00 Jaap Pieroen j...@pieroen.nl: Hi all, I completed a full scrub: root@nasbak:/home/jpieroen# btrfs scrub status /home/ scrub status for 7ca5f38e-308f-43ab-b3ea-31b3bcd11a0d scrub started at Wed Apr 30 08:30:19 2014 and finished after 144131 seconds total bytes scrubbed: 4.76TiB with 0 errors Then tried to remove a device: root@nasbak:/home/jpieroen# btrfs device delete /dev/sdb /home This triggered bug_on, with the following error in dmesg: csum failed ino 258 off 1395560448 csum 2284440321 expected csum 319628859 How can there still be csum failures directly after a scrub? If I rerun the scrub it still won't find any errors. I know this, because I've had the same issue 3 times in a row. Each time running a scrub and still being unable to remove the device. There is a known RAID5/6 bug, i sent a patch to address this problem. Could you please double check if your kernel source includes the following commit: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3b080b2564287be91605bfd1d5ee985696e61d3c RAID5/6 should detect checksum mismatch, it can not fix errors now. Thanks, Wang Kind Regards, Jaap -- Details: root@nasbak:/home/jpieroen# uname -a Linux nasbak 3.14.1-031401-generic #201404141220 SMP Mon Apr 14 16:21:48 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux root@nasbak:/home/jpieroen# btrfs --version Btrfs v3.14.1 root@nasbak:/home/jpieroen# btrfs fi df /home Data, RAID5: total=4.57TiB, used=4.55TiB System, RAID1: total=32.00MiB, used=352.00KiB Metadata, RAID1: total=7.00GiB, used=5.59GiB root@nasbak:/home/jpieroen# btrfs fi show Label: 'btrfs_storage' uuid: 7ca5f38e-308f-43ab-b3ea-31b3bcd11a0d Total devices 6 FS bytes used 4.56TiB devid1 size 1.82TiB used 1.31TiB path /dev/sde devid2 size 1.82TiB used 1.31TiB path /dev/sdf devid3 size 1.82TiB used 1.31TiB path /dev/sdg devid4 size 931.51GiB used 25.00GiB path /dev/sdb devid6 size 2.73TiB used 994.03GiB path /dev/sdh devid7 size 2.73TiB used 994.03GiB path /dev/sdi Btrfs v3.14.1 jpieroen@nasbak:~$ dmesg [227248.656438] BTRFS info (device sdi): relocating block group 9735225016320 flags 129 [227261.713860] BTRFS info (device sdi): found 9 extents [227264.531019] BTRFS info (device sdi): found 9 extents [227265.011826] BTRFS info (device sdi): relocating block group 76265029632 flags 129 [227274.052249] BTRFS info (device sdi): csum failed ino 258 off 1395560448 csum 2284440321 expected csum 319628859 [227274.052354] BTRFS info (device sdi): csum failed ino 258 off 1395564544 csum 3646299263 expected csum 319628859 [227274.052402] BTRFS info (device sdi): csum failed ino 258 off 1395568640 csum 281259278 expected csum 319628859 [227274.052449] BTRFS info (device sdi): csum failed ino 258 off 1395572736 csum 2594807184 expected csum 319628859 [227274.052492] BTRFS info (device sdi): csum failed ino 258 off 1395576832 csum 4288971971 expected csum 319628859 [227274.052537] BTRFS info (device sdi): csum failed ino 258 off 1395580928 csum 752615894 expected csum 319628859 [227274.052581] BTRFS info (device sdi): csum failed ino 258 off 1395585024 csum 3828951500 expected csum 319628859 [227274.061279] [ cut here ] [227274.061354] kernel BUG at /home/apw/COD/linux/fs/btrfs/extent_io.c:2116! [227274.061445] invalid opcode: [#1] SMP [227274.061509] Modules linked in: cuse deflate [227274.061573] BTRFS info (device sdi): csum failed ino 258 off 1395560448 csum 2284440321 expected csum 319628859 [227274.061707] ctr twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic camellia_x86_64 serpent_sse2_x86_64 xts serpent_generic lrw gf128mul glue_helper blowfish_generic blowfish_x86_64 blowfish_common cast5_generic cast_common ablk_helper cryptd des_generic cmac xcbc rmd160 crypto_null af_key xfrm_algo nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache dm_crypt ip6t_REJECT ppdev xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_comment xt_LOG kvm xt_recent microcode xt_multiport xt_limit xt_tcpudp psmouse serio_raw xt_addrtype k10temp edac_core ipt_MASQUERADE edac_mce_amd iptable_nat nf_nat_ipv4 sp5100_tco nf_conntrack_ipv4 nf_defrag_ipv4 ftdi_sio i2c_piix4 usbserial xt_conntrack ip6table_filter ip6_tables joydev nf_conntrack_netbios_ns nf_conntrack_broadcast snd_hda_codec_via nf_nat_ftp snd_hda_codec_hdmi nf_nat snd_hda_codec_generic nf_conntrack_ftp nf_conntrack snd_hda_intel iptable_filter ir_lirc_codec(OF) lirc_dev(OF) ip_tables snd_hda_codec ir_mce_kbd_decoder(OF) x_tables snd_hwdep ir_sony_decoder(OF) rc_tbs_nec(OF) ir_jvc_decoder(OF) snd_pcm ir_rc6_decoder(OF) ir_rc5_decoder(OF) saa716x_tbs_dvb(OF) tbs6982fe(POF) tbs6680fe(POF) ir_nec_decoder(OF) tbs6923fe(POF) tbs6985se(POF) tbs6928se(POF) tbs6982se(POF) tbs6991fe(POF) tbs6618fe(POF) saa716x_core(OF)
Re: [PATCH] Btrfs: do not reset last_snapshot after relocation
Hi Josef, 2014-03-28 2:56 GMT+08:00 Josef Bacik jba...@fb.com: This was done to allow NO_COW to continue to be NO_COW after relocation but it is not right. When relocating we will convert blocks to FULL_BACKREF that we relocate. We can leave some of these full backref blocks behind if they are not cow'ed out during the relocation, like if we fail the relocation with ENOSPC and then just drop the reloc tree. Then when we go to cow the block again we won't lookup the extent flags because we won't think there has been a snapshot recently which means we will do our normal ref drop thing instead of adding back a tree ref and dropping the shared ref. This will cause btrfs_free_extent to blow up because it can't find the ref we are trying to free. This was found with my ref verifying tool. Thanks, Could we pass error into merge_reloc_roots() something like this: merge_reloc_roots(struct reloc_control *rc, int err) and we only skip reset root's last snapshot if @err is not 0. I think it is meaningful to allow NO_COW to continue after relocation. Thanks, Wang Signed-off-by: Josef Bacik jba...@fb.com --- fs/btrfs/relocation.c | 21 - 1 file changed, 21 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index ec00777..f026a82 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2318,7 +2318,6 @@ void free_reloc_roots(struct list_head *list) static noinline_for_stack int merge_reloc_roots(struct reloc_control *rc) { - struct btrfs_trans_handle *trans; struct btrfs_root *root; struct btrfs_root *reloc_root; u64 last_snap; @@ -2376,26 +2375,6 @@ again: list_add_tail(reloc_root-root_list, reloc_roots); goto out; - } else if (!ret) { - /* -* recover the last snapshot tranid to avoid -* the space balance break NOCOW. -*/ - root = read_fs_root(rc-extent_root-fs_info, - objectid); - if (IS_ERR(root)) - continue; - - trans = btrfs_join_transaction(root); - BUG_ON(IS_ERR(trans)); - - /* Check if the fs/file tree was snapshoted or not. */ - if (btrfs_root_last_snapshot(root-root_item) == - otransid - 1) - btrfs_set_root_last_snapshot(root-root_item, -last_snap); - - btrfs_end_transaction(trans, root); } } -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Btrfs: don't compress for a small write
2014-03-31 20:31 GMT+08:00 Chris Mason c...@fb.com: On 03/24/2014 05:58 AM, Wang Shilong wrote: To compress a small write(=blocksize) dosen't save us disk space at all, skip it can save us some compression time. This patch can also fix wrong setting nocompression flag for inode, say a case when @total_in is 4096, and then we get @total_compressed 52,because we do aligment to page cache size firstly, and then we get into conclusion @total_in=@total_compressed thus we will clear this inode's compression flag. An exception comes from inserting inline extent failure but we still have @total_compressed @total_in,so we will still reset inode's flag, this is ok, because we don't have good compression effect. So your check for start 0 || end + 1 disk_i_size means we're only skipping compression for a small file range that isn't in an inline extent. Could you please update the patch description and comments in the code to reflect this? No problem, i will update this patch.~_~ Thanks! -chris -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] Btrfs: scrub raid56 stripes in the right way
2014-03-31 18:34 GMT+08:00 Wang Shilong wangsl.f...@cn.fujitsu.com: Steps to reproduce: # mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5 # mount /dev/sda8 /mnt # btrfs scrub start -BR /mnt # echo $? --unverified errors make return value be 3 This is because we don't setup right mapping between physical and logical address for raid56, which makes checksum mismatch. But we will find everthing is fine later when rechecking using btrfs_map_block(). This patch fixed the problem by settuping right mappings and we only verify data stripes' checksums. Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- v1-v2: on the right way to fix the problem. --- fs/btrfs/scrub.c | 83 1 file changed, 71 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index db21a13..ebe4c5e 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -2235,6 +2235,43 @@ behind_scrub_pages: return 0; } +/* + * Given a physical address, this will calculate it's + * logical offset. if this is a parity stripe, it will return + * the most left data stripe's logical offset. + * + * return 0 if it is a data stripe, 1 means parity stripe. + */ +static int get_raid56_logic_offset(u64 physical, int num, +struct map_lookup *map, u64 *offset) +{ + int i; + int j = 0; + u64 stripe_nr; + u64 tmp; + u64 last_offset; + int stripe_index; + + last_offset = (physical - map-stripes[num].physical) * + nr_data_stripes(map); + *offset = last_offset; + for (i = 0; i nr_data_stripes(map); i++) { + *offset += i * map-stripe_len; oops, this is obviously wrong, it should be: + stripe_nr = *offset; + do_div(stripe_nr, map-stripe_len); + + stripe_index = do_div(stripe_nr, nr_data_stripes(map)); + tmp = stripe_nr + stripe_index; + stripe_index = do_div(tmp, map-num_stripes); + if (stripe_index == num) + return 0; + if (stripe_index num) + j++; + } + *offset = last_offset + j * map-stripe_len; + return 1; +} + static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, struct map_lookup *map, struct btrfs_device *scrub_dev, @@ -2269,16 +2306,10 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, u64 extent_len; struct btrfs_device *extent_dev; int extent_mirror_num; - int stop_loop; - - if (map-type (BTRFS_BLOCK_GROUP_RAID5 | -BTRFS_BLOCK_GROUP_RAID6)) { - if (num = nr_data_stripes(map)) { - return 0; - } - } + int stop_loop = 0; nstripes = length; + physical = map-stripes[num].physical; offset = 0; do_div(nstripes, map-stripe_len); if (map-type BTRFS_BLOCK_GROUP_RAID0) { @@ -2296,6 +2327,11 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, } else if (map-type BTRFS_BLOCK_GROUP_DUP) { increment = map-stripe_len; mirror_num = num % map-num_stripes + 1; + } else if (map-type (BTRFS_BLOCK_GROUP_RAID5 | + BTRFS_BLOCK_GROUP_RAID6)) { + get_raid56_logic_offset(physical, num, map, offset); + increment = map-stripe_len * nr_data_stripes(map); + mirror_num = 1; } else { increment = map-stripe_len; mirror_num = 1; @@ -2357,10 +2393,18 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, * now find all extents for each stripe and scrub them */ logical = base + offset; - physical = map-stripes[num].physical; logic_end = logical + increment * nstripes; ret = 0; while (logical logic_end) { + /* for raid56, we skip parity stripe */ + if (map-type (BTRFS_BLOCK_GROUP_RAID5 | + BTRFS_BLOCK_GROUP_RAID6)) { + ret = get_raid56_logic_offset(physical, num, + map, logical); + logical += base; + if (ret) + goto skip; + } /* * canceled? */ @@ -2504,9 +2548,23 @@ again: scrub_free_csums(sctx); if (extent_logical + extent_len key.objectid + bytes) { - logical += increment; -
Re: [PATCH v2 2/2] Btrfs: scrub raid56 stripes in the right way
2014-03-31 20:54 GMT+08:00 Shilong Wang wangshilong1...@gmail.com: 2014-03-31 18:34 GMT+08:00 Wang Shilong wangsl.f...@cn.fujitsu.com: Steps to reproduce: # mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5 # mount /dev/sda8 /mnt # btrfs scrub start -BR /mnt # echo $? --unverified errors make return value be 3 This is because we don't setup right mapping between physical and logical address for raid56, which makes checksum mismatch. But we will find everthing is fine later when rechecking using btrfs_map_block(). This patch fixed the problem by settuping right mappings and we only verify data stripes' checksums. Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- v1-v2: on the right way to fix the problem. --- fs/btrfs/scrub.c | 83 1 file changed, 71 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index db21a13..ebe4c5e 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -2235,6 +2235,43 @@ behind_scrub_pages: return 0; } +/* + * Given a physical address, this will calculate it's + * logical offset. if this is a parity stripe, it will return + * the most left data stripe's logical offset. + * + * return 0 if it is a data stripe, 1 means parity stripe. + */ +static int get_raid56_logic_offset(u64 physical, int num, +struct map_lookup *map, u64 *offset) +{ + int i; + int j = 0; + u64 stripe_nr; + u64 tmp; + u64 last_offset; + int stripe_index; + + last_offset = (physical - map-stripes[num].physical) * + nr_data_stripes(map); + *offset = last_offset; + for (i = 0; i nr_data_stripes(map); i++) { + *offset += i * map-stripe_len; oops, this is obviously wrong, it should be: It shoule be: *offset = last_offset + i * map-stripe_len; Will fix this in v3. + stripe_nr = *offset; + do_div(stripe_nr, map-stripe_len); + + stripe_index = do_div(stripe_nr, nr_data_stripes(map)); + tmp = stripe_nr + stripe_index; + stripe_index = do_div(tmp, map-num_stripes); + if (stripe_index == num) + return 0; + if (stripe_index num) + j++; + } + *offset = last_offset + j * map-stripe_len; + return 1; +} + static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, struct map_lookup *map, struct btrfs_device *scrub_dev, @@ -2269,16 +2306,10 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, u64 extent_len; struct btrfs_device *extent_dev; int extent_mirror_num; - int stop_loop; - - if (map-type (BTRFS_BLOCK_GROUP_RAID5 | -BTRFS_BLOCK_GROUP_RAID6)) { - if (num = nr_data_stripes(map)) { - return 0; - } - } + int stop_loop = 0; nstripes = length; + physical = map-stripes[num].physical; offset = 0; do_div(nstripes, map-stripe_len); if (map-type BTRFS_BLOCK_GROUP_RAID0) { @@ -2296,6 +2327,11 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, } else if (map-type BTRFS_BLOCK_GROUP_DUP) { increment = map-stripe_len; mirror_num = num % map-num_stripes + 1; + } else if (map-type (BTRFS_BLOCK_GROUP_RAID5 | + BTRFS_BLOCK_GROUP_RAID6)) { + get_raid56_logic_offset(physical, num, map, offset); + increment = map-stripe_len * nr_data_stripes(map); + mirror_num = 1; } else { increment = map-stripe_len; mirror_num = 1; @@ -2357,10 +2393,18 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, * now find all extents for each stripe and scrub them */ logical = base + offset; - physical = map-stripes[num].physical; logic_end = logical + increment * nstripes; ret = 0; while (logical logic_end) { + /* for raid56, we skip parity stripe */ + if (map-type (BTRFS_BLOCK_GROUP_RAID5 | + BTRFS_BLOCK_GROUP_RAID6)) { + ret = get_raid56_logic_offset(physical, num, + map, logical); + logical += base; + if (ret) + goto skip; + } /* * canceled? */ @@ -2504,9 +2548,23 @@ again: scrub_free_csums(sctx
Re: [PATCH] Btrfs-progs: fsck: disable --init-extent-tree option when using snapshots
Hi Josef, As i haven't thought any better ideas to rebuild extent tree which contains extent that owns 'FULL BACKREF' flag. Considering an extent's refs can be equal or more than 1 if this extent has *FULL BACKREF* flag, so we could not make sure an extent's flag by only searching fs/file tree any more. So until now, i just disable this option if snapshots exists, please correct me if i miss something here. Or you have any better ideas to solve this problem.~_~ Thanks, Wang 2014-03-10 18:39 GMT+08:00 Wang Shilong wangsl.f...@cn.fujitsu.com: We haven't supported to rebuild extent tree if there are any *FULL BACKREF* with broken filesystem, disable this option when detecting snapshots. Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- cmds-check.c | 62 1 file changed, 62 insertions(+) diff --git a/cmds-check.c b/cmds-check.c index d1cafe1..ddee897 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -6143,6 +6143,56 @@ static int reset_block_groups(struct btrfs_fs_info *fs_info) return 0; } +static int is_snapshot_exist(struct btrfs_fs_info *fs_info) +{ + struct btrfs_root *root = fs_info-tree_root; + struct btrfs_path *path; + struct extent_buffer *leaf; + struct btrfs_key key; + int ret; + int found = 0; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + key.objectid = BTRFS_FIRST_FREE_OBJECTID; + key.type = BTRFS_ROOT_ITEM_KEY; + key.offset = 0; + + ret = btrfs_search_slot(NULL, root, key, path, 0, 0); + if (ret 0) + goto out; + + while (1) { + if (path-slots[0] = btrfs_header_nritems(path-nodes[0])) { + ret = btrfs_next_leaf(root, path); + if (ret) + goto out; + } + leaf = path-nodes[0]; + btrfs_item_key_to_cpu(leaf, key, path-slots[0]); + + if (key.type != BTRFS_ROOT_ITEM_KEY || + key.objectid BTRFS_FIRST_FREE_OBJECTID) { + path-slots[0]++; + continue; + } + if (key.offset 0) { + found = 1; + break; + } + path-slots[0]++; + } +out: + btrfs_free_path(path); + if (found) + return 1; + else if (ret = 0) + return 0; + return ret; +} + static int reset_balance(struct btrfs_trans_handle *trans, struct btrfs_fs_info *fs_info) { @@ -6537,6 +6587,18 @@ int cmd_check(int argc, char **argv) ret = -EIO; goto close_out; } + if (init_extent_tree) { + ret = is_snapshot_exist(info); + if (ret 0) { + fprintf(stderr, ERROR: fail to check if there are snapshots in btrfs filesystem\n); + ret = 1; + goto close_out; + } else if (ret) { + fprintf(stderr, Snapshots detected, unable to rebuild extent tree for such case.\n); + ret = 1; + goto close_out; + } + } if (init_extent_tree || init_csum_tree) { struct btrfs_trans_handle *trans; -- 1.9.0 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: scrub crashed?
Hello Johan, This should be a known problem. The problem seemed that scrub log file is corrupt, so i added an option -f something like: btrfs scrub start -f .. You can update latest btrfs-progs from david's latest integration branch and try it. if you don't want to do that, just rm /var/lib/btrfs/scrub* should fix your problem. Hopely it can help you.^_^ Thanks, Wang 2014-02-10 22:02 GMT+08:00 Johan Kröckel johan.kroec...@gmail.com: root@fortknox:~# uname -a Linux fortknox 3.12-0.bpo.1-amd64 #1 SMP Debian 3.12.9-1~bpo70+1 (2014-02-07) x86_64 GNU/Linux root@fortknox:~# btrfs version Btrfs v3.12 root@fortknox:~# btrfs scrub status -d /bunker scrub status for 11312131-3372-4637-b526-35a4ef0c31eb scrub device /dev/mapper/bunkerA (id 1) status scrub started at Sun Feb 9 19:12:22 2014, running for 11317 seconds total bytes scrubbed: 325.32GiB with 0 errors scrub device /dev/dm-1 (id 2) status scrub started at Sun Feb 9 19:12:22 2014, running for 11317 seconds total bytes scrubbed: 321.76GiB with 0 errors root@fortknox:~# btrfs scrub cancel /bunker ERROR: scrub cancel failed on /bunker: not running root@fortknox:~# btrfs scrub start /bunker ERROR: scrub is already running. To cancel use 'btrfs scrub cancel /bunker'. To see the status use 'btrfs scrub status [-d] /bunker'. root@fortknox:~# ps -A|grep btrfs 3704 ?00:00:00 btrfs-genwork-1 3705 ?00:00:00 btrfs-submit-1 3706 ?00:00:00 btrfs-delalloc- 3707 ?00:00:00 btrfs-fixup-1 3708 ?00:00:02 btrfs-endio-1 3709 ?00:00:00 btrfs-endio-met 3710 ?00:00:00 btrfs-rmw-1 3711 ?00:00:00 btrfs-endio-rai 3712 ?00:00:00 btrfs-endio-met 3714 ?00:00:05 btrfs-freespace 3715 ?00:00:00 btrfs-delayed-m 3716 ?00:00:00 btrfs-cache-1 3717 ?00:00:00 btrfs-readahead 3718 ?00:00:00 btrfs-flush_del 3719 ?00:00:00 btrfs-qgroup-re 3720 ?00:00:00 btrfs-cleaner 3721 ?00:02:00 btrfs-transacti 8380 ?00:00:00 btrfs-endio-wri 8836 ?00:00:00 btrfs-worker-4 8936 ?00:00:00 btrfs-worker-3 8961 ?00:00:00 btrfs-worker-3 9138 ?00:00:00 btrfs-worker-4 What can/should I do now? -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 2/2] Revert Btrfs: remove transaction from btrfs send
2014-02-09 21:52 GMT+08:00 Filipe David Manana fdman...@gmail.com: On Sun, Feb 9, 2014 at 2:39 AM, Shilong Wang wangshilong1...@gmail.com wrote: 2014-02-08 23:46 GMT+08:00 Wang Shilong wangshilong1...@gmail.com: From: Wang Shilong wangsl.f...@cn.fujitsu.com This reverts commit 41ce9970a8a6a362ae8df145f7a03d789e9ef9d2. Previously i was thinking we can use readonly root's commit root safely while it is not true, readonly root may be cowed with the following cases. 1.snapshot send root will cow source root. 2.balance,device operations will also cow readonly send root to relocate. So i have two ideas to make us safe to use commit root. --approach 1: make it protected by transaction and end transaction properly and we research next item from root node(see btrfs_search_slot_for_read()). --approach 2: add another counter to local root structure to sync snapshot with send. and add a global counter to sync send with exclusive device operations. So with approach 2, send can use commit root safely, because we make sure send root can not be cowed during send. Unfortunately, it make codes *ugly* and more complex to maintain. To make snapshot and send exclusively, device operations and send operation exclusively with each other is a little confusing for common users. So why not drop into previous way. Cc: Josef Bacik jba...@fb.com Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- Josef, if we reach agreement to adopt this approach, please revert Filipe's patch(Btrfs: make some tree searches in send.c more efficient) from btrfs-next. Oops, this patch guarantee searching commit roots are all protected by transaction, Filipe's patch is ok, we need update Josef's previous patch. Hi Shilong, I am confused. Can you explain why that optimization patch is a problem, either with or without your patch or any other patch currently flying around? Either before or after the optimization, we search through the commit root and after a key search we process a key while holding the leaf's extent buffer. Both approaches call btrfs_next_leaf too (either directly or via btrfs_search_slot_for_read). Sorry my miss, your patch did not have problem, you did not notice my following thread comments for this patch, we need update josef's previous patch not yours. ^_^ Thanks, Wang Thanks Wang -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Filipe David Manana, Reasonable men adapt themselves to the world. Unreasonable men adapt the world to themselves. That's why all progress depends on unreasonable men. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 2/2] Revert Btrfs: remove transaction from btrfs send
2014-02-08 23:46 GMT+08:00 Wang Shilong wangshilong1...@gmail.com: From: Wang Shilong wangsl.f...@cn.fujitsu.com This reverts commit 41ce9970a8a6a362ae8df145f7a03d789e9ef9d2. Previously i was thinking we can use readonly root's commit root safely while it is not true, readonly root may be cowed with the following cases. 1.snapshot send root will cow source root. 2.balance,device operations will also cow readonly send root to relocate. So i have two ideas to make us safe to use commit root. --approach 1: make it protected by transaction and end transaction properly and we research next item from root node(see btrfs_search_slot_for_read()). --approach 2: add another counter to local root structure to sync snapshot with send. and add a global counter to sync send with exclusive device operations. So with approach 2, send can use commit root safely, because we make sure send root can not be cowed during send. Unfortunately, it make codes *ugly* and more complex to maintain. To make snapshot and send exclusively, device operations and send operation exclusively with each other is a little confusing for common users. So why not drop into previous way. Cc: Josef Bacik jba...@fb.com Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- Josef, if we reach agreement to adopt this approach, please revert Filipe's patch(Btrfs: make some tree searches in send.c more efficient) from btrfs-next. Oops, this patch guarantee searching commit roots are all protected by transaction, Filipe's patch is ok, we need update Josef's previous patch. Wang -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: WARNING at fs/btrfs/backref.c:936 find_parent_nodes+0x378/0x5d2
Hello Tomasz, This seems a new bug, did you run balance and defragment concurrently for triggering the following warning? Thanks, Wang 2014/1/19 Tomasz Chmielewski t...@virtall.com: Just had a few of these with 3.13.0-rc8: [262162.560701] [ cut here ] [262162.560764] WARNING: CPU: 0 PID: 22705 at fs/btrfs/backref.c:936 find_parent_nodes+0x378/0x5d2 [btrfs]() [262162.560872] Modules linked in: ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr ehci_pci ehci_hcd button lpc_ich mfd_core i2c_i801 i2c_core video acpi_cpufreq ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata r8169 mii scsi_mod [262162.562685] CPU: 0 PID: 22705 Comm: btrfs-endio-wri Not tainted 3.13.0-rc8 #2 [262162.562796] Hardware name: System manufacturer System Product Name/P8H77-M PRO, BIOS 1101 02/04/2013 [262162.562911] 0009 88009e911a78 813898b7 0006 [262162.563028] 88009e911ab8 810370a9 01000846 [262162.563143] a02b7a6f 8807e7248090 8807f2d5e000 88009e911b40 [262162.563259] Call Trace: [262162.563319] [813898b7] dump_stack+0x46/0x58 [262162.563382] [810370a9] warn_slowpath_common+0x77/0x91 [262162.563458] [a02b7a6f] ? find_parent_nodes+0x378/0x5d2 [btrfs] [262162.563522] [810370d8] warn_slowpath_null+0x15/0x17 [262162.563595] [a02b7a6f] find_parent_nodes+0x378/0x5d2 [btrfs] [262162.563671] [a02b84dc] iterate_extent_inodes+0xc9/0x1d6 [btrfs] [262162.563748] [a0270981] ? record_extent_backrefs+0xc3/0xc3 [btrfs] [262162.563824] [a0270981] ? record_extent_backrefs+0xc3/0xc3 [btrfs] [262162.563899] [a02b8668] iterate_inodes_from_logical+0x7f/0x95 [btrfs] [262162.564022] [a0270919] record_extent_backrefs+0x5b/0xc3 [btrfs] [262162.564098] [a0278f67] btrfs_finish_ordered_io+0x77a/0x877 [btrfs] [262162.564221] [a0279074] finish_ordered_fn+0x10/0x12 [btrfs] [262162.564296] [a02940d9] worker_loop+0x15e/0x495 [btrfs] [262162.564371] [a0293f7b] ? btrfs_queue_worker+0x269/0x269 [btrfs] [262162.564437] [8104ee9a] kthread+0xcd/0xd5 [262162.564499] [8104edcd] ? kthread_freezable_should_stop+0x43/0x43 [262162.564563] [8138e6fc] ret_from_fork+0x7c/0xb0 [262162.564626] [8104edcd] ? kthread_freezable_should_stop+0x43/0x43 [262162.564689] ---[ end trace 50ae39e01ebd7394 ]--- [262162.564753] [ cut here ] [262162.564825] WARNING: CPU: 0 PID: 22705 at fs/btrfs/backref.c:936 find_parent_nodes+0x378/0x5d2 [btrfs]() [262162.564939] Modules linked in: ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr ehci_pci ehci_hcd button lpc_ich mfd_core i2c_i801 i2c_core video acpi_cpufreq ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata r8169 mii scsi_mod [262162.565348] CPU: 0 PID: 22705 Comm: btrfs-endio-wri Tainted: GW 3.13.0-rc8 #2 [262162.565461] Hardware name: System manufacturer System Product Name/P8H77-M PRO, BIOS 1101 02/04/2013 [262162.565575] 0009 88009e911a78 813898b7 0006 [262162.565689] 88009e911ab8 810370a9 8807f33be200 [262162.565804] a02b7a6f 8807e7248090 8807f2d5e000 88009e911b40 [262162.565920] Call Trace: [262162.565975] [813898b7] dump_stack+0x46/0x58 [262162.566037] [810370a9] warn_slowpath_common+0x77/0x91 [262162.566109] [a02b7a6f] ? find_parent_nodes+0x378/0x5d2 [btrfs] [262162.566174] [810370d8] warn_slowpath_null+0x15/0x17 [262162.566247] [a02b7a6f] find_parent_nodes+0x378/0x5d2 [btrfs] [262162.566321] [a02b84dc] iterate_extent_inodes+0xc9/0x1d6 [btrfs] [262162.566396] [a0270981] ? record_extent_backrefs+0xc3/0xc3 [btrfs] [262162.566469] [a0270981] ? record_extent_backrefs+0xc3/0xc3 [btrfs] [262162.566544] [a02b8668] iterate_inodes_from_logical+0x7f/0x95 [btrfs] [262162.57] [a0270919] record_extent_backrefs+0x5b/0xc3 [btrfs] [262162.566742] [a0278f67] btrfs_finish_ordered_io+0x77a/0x877 [btrfs] [262162.566866] [a0279074] finish_ordered_fn+0x10/0x12 [btrfs] [262162.566941] [a02940d9] worker_loop+0x15e/0x495 [btrfs] [262162.567015] [a0293f7b] ? btrfs_queue_worker+0x269/0x269 [btrfs] [262162.567080] [8104ee9a] kthread+0xcd/0xd5 [262162.567141] [8104edcd] ?
Re: kernel BUG at fs/btrfs/relocation.c:1062
Hello Tomasz, There should be some bugs flying on, unitil now, you can try to remount: something like: # mount /dev/sda /mnt -o skip_balance It will skip unfinshed balance, hopley it can help you before we give a bug fix. Thanks, Wang 2013/12/20 Tomasz Chmielewski t...@virtall.com: On Thu, 19 Dec 2013 22:07:37 +0900 Tomasz Chmielewski t...@virtall.com wrote: If it matters, I had to hard reboot after that bug; the balance continued after the system booted again and I got this a while later (filesystem was remounted read only): Actually, looks like the fs is quite hosed now :( If I want to do an operation like removing a snapshot (btrfs sub del ...), the command returns, there is IO for a few minutes, but then, the kernel complains and fs is remounted readonly: [ 5111.773900] BTRFS debug (device sdb5): run_one_delayed_ref returned -17 [ 5111.773902] [ cut here ] [ 5111.773957] WARNING: CPU: 0 PID: 15042 at fs/btrfs/super.c:254 __btrfs_abort_transaction+0x4d/0xff [btrfs]() [ 5111.774047] btrfs: Transaction aborted (error -17) [ 5111.774048] Modules linked in: ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr button video ehci_pci ehci_hcd acpi_cpufreq i2c_i801 i2c_core lpc_ich mfd_core ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata scsi_mod r8169 mii [ 5111.774355] CPU: 0 PID: 15042 Comm: btrfs-transacti Tainted: GW 3.13.0-rc4 #1 [ 5111.774443] Hardware name: System manufacturer System Product Name/P8H77-M PRO, BIOS 1101 02/04/2013 [ 5111.774533] 0009 8800374ddc48 8138a37d 0006 [ 5111.774622] 8800374ddc98 8800374ddc88 810370a9 8800374ddd80 [ 5111.774711] a020d524 ffef 8807ead7d800 8807ff0cc8c0 [ 5111.774800] Call Trace: [ 5111.774846] [8138a37d] dump_stack+0x46/0x58 [ 5111.774894] [810370a9] warn_slowpath_common+0x77/0x91 [ 5111.774944] [a020d524] ? __btrfs_abort_transaction+0x4d/0xff [btrfs] [ 5111.775032] [81037157] warn_slowpath_fmt+0x41/0x43 [ 5111.775081] [a020d524] __btrfs_abort_transaction+0x4d/0xff [btrfs] [ 5111.775135] [a02226ed] btrfs_run_delayed_refs+0x253/0x46f [btrfs] [ 5111.775189] [a022fdec] btrfs_commit_transaction+0x36d/0x7df [btrfs] [ 5111.775281] [a022e345] transaction_kthread+0xef/0x1c2 [btrfs] [ 5111.775333] [a022e256] ? open_ctree+0x1ac7/0x1ac7 [btrfs] [ 5111.775382] [8104ee9a] kthread+0xcd/0xd5 [ 5111.775428] [8104edcd] ? kthread_freezable_should_stop+0x43/0x43 [ 5111.775477] [8138f17c] ret_from_fork+0x7c/0xb0 [ 5111.775524] [8104edcd] ? kthread_freezable_should_stop+0x43/0x43 [ 5111.775572] ---[ end trace b552aca9a0cff3cb ]--- [ 5111.775618] BTRFS error (device sdb5) in btrfs_run_delayed_refs:2730: errno=-17 Object already exists [ 5111.775707] BTRFS info (device sdb5): forced readonly [ 5111.775754] BTRFS warning (device sdb5): Skipping commit of aborted transaction. [ 5111.775841] BTRFS error (device sdb5) in cleanup_transaction:1553: errno=-17 Object already exists If I run balance (after unmounting and mounting the filesystem), it ends in a similar way: [ 5927.338989] btrfs: relocating block group 4647284637696 flags 17 [ 6184.333629] btrfs: found 983 extents [ 6269.512577] BTRFS debug (device sdb5): run_one_delayed_ref returned -17 [ 6269.512579] [ cut here ] [ 6269.512636] WARNING: CPU: 7 PID: 17836 at fs/btrfs/super.c:254 __btrfs_abort_transaction+0x4d/0xff [btrfs]() [ 6269.512727] btrfs: Transaction aborted (error -17) [ 6269.512727] Modules linked in: ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr button video ehci_pci ehci_hcd acpi_cpufreq i2c_i801 i2c_core lpc_ich mfd_core ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata scsi_mod r8169 mii [ 6269.513036] CPU: 7 PID: 17836 Comm: btrfs-transacti Tainted: GW 3.13.0-rc4 #1 [ 6269.513124] Hardware name: System manufacturer System Product Name/P8H77-M PRO, BIOS 1101 02/04/2013 [ 6269.513214] 0009 8807c5b07c48 8138a37d 0006 [ 6269.513304] 8807c5b07c98 8807c5b07c88 810370a9 8807c5b07d80 [ 6269.513393] a020d524 ffef 8807ead7e800 8800371eedc0 [ 6269.513483] Call Trace: [ 6269.513529] [8138a37d] dump_stack+0x46/0x58 [ 6269.513577] [810370a9] warn_slowpath_common+0x77/0x91 [ 6269.513627] [a020d524] ?
Re: kernel BUG at fs/btrfs/relocation.c:1062
2013/12/20 Tomasz Chmielewski t...@virtall.com: -o skip_balance - didn't know this. Actually, I was able to skip the balance, sort of, with this: mount /mnt/btrfs ; btrfs fi balance cancel /mnt/btrfs From your previous email, i suspend your filesystem is nearly fully. mount -o skip_balance will avoid balance continuing while use btrfs balance cancel still can not avoid balance totally, so i recommend you use skip_balance option when remounting. However, the fs is extremely unstable (will remount read only quite fast, i.e. if I start removing snapshot). I'm trying to run btrfsck without the --repair option to see if it shows anything interesting. I'd like you not to use btrfsck --repair, it is a little dangerous anyway. If you still want to try, just use btrfsck is ok to check if your filesystem is consistent. Or you can remount with option degraded, recovery(maybe help you to make filesystem stable at least). Thanks, Wang -- Tomasz Chmielewski http://wpkg.org On Fri, 20 Dec 2013 23:52:08 +0800 Shilong Wang wangshilong1...@gmail.com wrote: Hello Tomasz, There should be some bugs flying on, unitil now, you can try to remount: something like: # mount /dev/sda /mnt -o skip_balance It will skip unfinshed balance, hopley it can help you before we give a bug fix. Thanks, Wang 2013/12/20 Tomasz Chmielewski t...@virtall.com: On Thu, 19 Dec 2013 22:07:37 +0900 Tomasz Chmielewski t...@virtall.com wrote: If it matters, I had to hard reboot after that bug; the balance continued after the system booted again and I got this a while later (filesystem was remounted read only): Actually, looks like the fs is quite hosed now :( If I want to do an operation like removing a snapshot (btrfs sub del ...), the command returns, there is IO for a few minutes, but then, the kernel complains and fs is remounted readonly: [ 5111.773900] BTRFS debug (device sdb5): run_one_delayed_ref returned -17 [ 5111.773902] [ cut here ] [ 5111.773957] WARNING: CPU: 0 PID: 15042 at fs/btrfs/super.c:254 __btrfs_abort_transaction+0x4d/0xff [btrfs]() [ 5111.774047] btrfs: Transaction aborted (error -17) [ 5111.774048] Modules linked in: ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr button video ehci_pci ehci_hcd acpi_cpufreq i2c_i801 i2c_core lpc_ich mfd_core ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata scsi_mod r8169 mii [ 5111.774355] CPU: 0 PID: 15042 Comm: btrfs-transacti Tainted: GW3.13.0-rc4 #1 [ 5111.774443] Hardware name: System manufacturer System Product Name/P8H77-M PRO, BIOS 1101 02/04/2013 [ 5111.774533] 0009 8800374ddc48 8138a37d 0006 [ 5111.774622] 8800374ddc98 8800374ddc88 810370a9 8800374ddd80 [ 5111.774711] a020d524 ffef 8807ead7d800 8807ff0cc8c0 [ 5111.774800] Call Trace: [ 5111.774846] [8138a37d] dump_stack+0x46/0x58 [ 5111.774894] [810370a9] warn_slowpath_common+0x77/0x91 [ 5111.774944] [a020d524] ? __btrfs_abort_transaction+0x4d/0xff [btrfs] [ 5111.775032] [81037157] warn_slowpath_fmt+0x41/0x43 [ 5111.775081] [a020d524] __btrfs_abort_transaction+0x4d/0xff [btrfs] [ 5111.775135] [a02226ed] btrfs_run_delayed_refs+0x253/0x46f [btrfs] [ 5111.775189] [a022fdec] btrfs_commit_transaction+0x36d/0x7df [btrfs] [ 5111.775281] [a022e345] transaction_kthread+0xef/0x1c2 [btrfs] [ 5111.775333] [a022e256] ? open_ctree+0x1ac7/0x1ac7 [btrfs] [ 5111.775382] [8104ee9a] kthread+0xcd/0xd5 [ 5111.775428] [8104edcd] ? kthread_freezable_should_stop+0x43/0x43 [ 5111.775477] [8138f17c] ret_from_fork+0x7c/0xb0 [ 5111.775524] [8104edcd] ? kthread_freezable_should_stop+0x43/0x43 [ 5111.775572] ---[ end trace b552aca9a0cff3cb ]--- [ 5111.775618] BTRFS error (device sdb5) in btrfs_run_delayed_refs:2730: errno=-17 Object already exists [ 5111.775707] BTRFS info (device sdb5): forced readonly [ 5111.775754] BTRFS warning (device sdb5): Skipping commit of aborted transaction. [ 5111.775841] BTRFS error (device sdb5) in cleanup_transaction:1553: errno=-17 Object already exists If I run balance (after unmounting and mounting the filesystem), it ends in a similar way: [ 5927.338989] btrfs: relocating block group 4647284637696 flags 17 [ 6184.333629] btrfs: found 983 extents [ 6269.512577] BTRFS debug (device sdb5): run_one_delayed_ref returned -17 [ 6269.512579] [ cut here ] [ 6269.512636] WARNING: CPU: 7 PID: 17836 at fs/btrfs/super.c:254 __btrfs_abort_transaction+0x4d/0xff [btrfs]() [ 6269.512727] btrfs: Transaction aborted
Re: kernel BUG at fs/btrfs/relocation.c:1062
2013/12/21 Tomasz Chmielewski t...@virtall.com: On Sat, 21 Dec 2013 00:07:19 +0800 Shilong Wang wangshilong1...@gmail.com wrote: 2013/12/20 Tomasz Chmielewski t...@virtall.com: -o skip_balance - didn't know this. Actually, I was able to skip the balance, sort of, with this: mount /mnt/btrfs ; btrfs fi balance cancel /mnt/btrfs From your previous email, i suspend your filesystem is nearly fully. mount -o skip_balance will avoid balance continuing while use btrfs balance cancel still can not avoid balance totally, so i recommend you use skip_balance option when remounting. My other thread a few days ago (no space left, metadata usage almost full?), if that's what you're referring to, was about a different filesystem. This one should have enough free space left: # btrfs fi show /mnt/lxc1 Label: lxc1 uuid: 8d08ad6d-4543-4fe5-8b1b-640dc1423d41 Total devices 2 FS bytes used 2.02TiB devid1 size 2.62TiB used 2.02TiB path /dev/sda5 devid2 size 2.62TiB used 2.02TiB path /dev/sdb5 Btrfs v3.12 # btrfs fi df /mnt/lxc1 Data, RAID1: total=1.97TiB, used=1.97TiB System, RAID1: total=32.00MiB, used=300.00KiB Metadata, RAID1: total=50.00GiB, used=49.21GiB I've tried using skip_balance - it mounted, but breaks soon after I try to remove any snapshot. ok, It seems that your filesystem is not in consistency anymore. To confirm your filesystem status, you can try to run btrfsck to check if there is something wrong. To get a stable fs now, you can try to mount btrfs with option skip_balance and recovery, recovery options try to get a previous good tree root which might help you get a stable fs. -- Tomasz Chmielewski http://wpkg.org -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG] open_ctree() failes after mounting BTRFS with 3.2 after 3.12
Hello Philipp, In order to reproduce your problems. I compile btrfs in kernel 3.2 and 3.12. # mkfs.btrfs -f /dev/sda2 # mount /dev/sda2 /mnt # dd if=/dev/zero of=/mnt/data bs=1M count=1024 # umount /mnt The above operations is in 3.2 and then i switch my system into 3.12, and try to mount /dev/sda2 it will output something like the following: btrfs: mismatching generation and generation_v2 found in root item. This root was probably mounted with an older kernel. Resetting all new fields. But it mounted successfully, then i do some similar operations in 3.12...and umount it. and switch into kernel version 3.2... but i failt to mount it in 3.2 and it output the following message. BTRFS: couldn't mount because of unsupported optional features (20) It seems that here the problem is because incompat features in newer kernel and make older kernel unmountable any more.. But in your problem is that you can still mount on older kernel? but not new kernel? Thanks, Wang 2013/12/14 Philipp Matthias Hahn pmh...@pmhahn.de: Hello, I'm not subscribed, so please cc: me. I've an extermal USB disk using BTRFS used for backups and data sharing between multiple systems using different kernels (3.2.xx, 3.12.5). The FS fails to mount with 3.12.x, but still works on 3.2.x with the following dmesg: btrfs: mismatching generation and generation_v2 found in root item. This root was probably mounted with an older kernel. Resetting all new fields. btrfs: mismatching generation and generation_v2 found in root item. This root was probably mounted with an older kernel. Resetting all new fields. btrfs: open_ctree failed Following https://btrfs.wiki.kernel.org/index.php/Debugging_Btrfs_with_GDB#Building_UML_Kernel here some more debug info. To me it looks like the call to btrfs_read_fs_root_no_name() fails. $ gdb --args ./linux ubd0=/dev/sdc2 ... (gdb) handle SIGSEGV pass nostop noprint (gdb) handle SIGUSR1 pass nostop noprint (gdb) break open_ctree (gdb) r ... [0.00] Linux version 3.12.5 (pmhahn@scout) (gcc version 4.8.2 (Debian 4.8.2-10) ) #9 Sat Dec 14 09:40:10 CET 2013 ... [1.12] Btrfs loaded, debug=on ... [1.12] ubda: unknown partition table [1.12] btrfs: device label backup devid 1 transid 5478 /dev/root Breakpoint 1, open_ctree (sb=sb@entry=0x61d0e400, fs_devices=0x61d15c80, options=options@entry=0x0) at fs/btrfs/disk-io.c:2096 2096{ 2093int open_ctree(struct super_block *sb, 2123tree_root = fs_info-tree_root = btrfs_alloc_root(fs_info); 2096{ 2123tree_root = fs_info-tree_root = btrfs_alloc_root(fs_info); 2096{ 2118int num_backups_tried = 0; 2119int backup_index = 0; 2123tree_root = fs_info-tree_root = btrfs_alloc_root(fs_info); 2124chunk_root = fs_info-chunk_root = btrfs_alloc_root(fs_info); 2125if (!tree_root || !chunk_root) { 2124chunk_root = fs_info-chunk_root = btrfs_alloc_root(fs_info); 2125if (!tree_root || !chunk_root) { 2130ret = init_srcu_struct(fs_info-subvol_srcu); 2131if (ret) { 2130ret = init_srcu_struct(fs_info-subvol_srcu); 2131if (ret) { 2136ret = setup_bdi(fs_info, fs_info-bdi); 2156fs_info-btree_inode = new_inode(sb); 2136ret = setup_bdi(fs_info, fs_info-bdi); 2142ret = percpu_counter_init(fs_info-dirty_metadata_bytes, 0); 2147fs_info-dirty_metadata_batch = PAGE_CACHE_SIZE * 2150ret = percpu_counter_init(fs_info-delalloc_bytes, 0); 2136ret = setup_bdi(fs_info, fs_info-bdi); 2156fs_info-btree_inode = new_inode(sb); 2157if (!fs_info-btree_inode) { 2156fs_info-btree_inode = new_inode(sb); 2157if (!fs_info-btree_inode) { 2162mapping_set_gfp_mask(fs_info-btree_inode-i_mapping, GFP_NOFS); 2179mutex_init(fs_info-reloc_mutex); 2162mapping_set_gfp_mask(fs_info-btree_inode-i_mapping, GFP_NOFS); 2165INIT_LIST_HEAD(fs_info-trans_list); 2179mutex_init(fs_info-reloc_mutex); 2164INIT_RADIX_TREE(fs_info-fs_roots_radix, GFP_ATOMIC); 2165INIT_LIST_HEAD(fs_info-trans_list); 2166INIT_LIST_HEAD(fs_info-dead_roots); 2164INIT_RADIX_TREE(fs_info-fs_roots_radix, GFP_ATOMIC); 2166INIT_LIST_HEAD(fs_info-dead_roots); 2167INIT_LIST_HEAD(fs_info-delayed_iputs); 2168INIT_LIST_HEAD(fs_info-delalloc_roots); 2169INIT_LIST_HEAD(fs_info-caching_block_groups); 2179mutex_init(fs_info-reloc_mutex); 2182init_completion(fs_info-kobj_unregister); 2180seqlock_init(fs_info-profiles_lock); 2182init_completion(fs_info-kobj_unregister); 2183INIT_LIST_HEAD(fs_info-dirty_cowonly_roots); 2186
Fwd: [PATCH] Btrfs: fix a warning when iput a file
CC:linux-btrfs -- Forwarded message -- From: Shilong Wang wangshilong1...@gmail.com Date: 2013/12/14 Subject: Re: [PATCH] Btrfs: fix a warning when iput a file To: bo.li@oracle.com Hello Liu 2013/12/14 Liu Bo bo.li@oracle.com: On Sat, Dec 14, 2013 at 03:27:31PM +0800, Wang Shilong wrote: See the warning below: [ 1209.102076] [a04721b9] remove_extent_mapping+0x69/0x70 [btrfs] [ 1209.102084] [a0466b06] btrfs_evict_inode+0x96/0x4d0 [btrfs] [ 1209.102089] [81073010] ? wake_atomic_t_function+0x40/0x40 [ 1209.102092] [8118ab2e] evict+0x9e/0x190 [ 1209.102094] [8118b313] iput+0xf3/0x180 [ 1209.102101] [a0461fd1] btrfs_run_delayed_iputs+0xb1/0xd0 [btrfs] [ 1209.102107] [a045d358] __btrfs_end_transaction+0x268/0x350 [btrfs] clear extent bit here to avoid triggering WARN_ON() in remove_extent_mapping() Why PINNED and LOGGING isn't unset as usual? Flippe sent an improved patch regarding evict inode: https://patchwork.kernel.org/patch/3204281/ In btrfs_invalidatepages(), if we are evicting an inode, we will skip usual operation that is operated page by page. Rather than we do it more effectively, that we operate extent_map and extent_state structure, this will be faster! So for now code , after calling truncate_inode_pages(), PINNED and LOGGED won't be unset as usual(Maybe a better solution is to move remove_extent_mappin after we clear extent state). Correct me if i am wrong here. Thanks, Wang -liubo Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- fs/btrfs/inode.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e889779..c0f0f9d 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4515,6 +4515,8 @@ static void evict_inode_truncate_pages(struct inode *inode) node = rb_first(map_tree-map); em = rb_entry(node, struct extent_map, rb_node); + clear_bit(EXTENT_FLAG_PINNED, em-flags); + clear_bit(EXTENT_FLAG_LOGGING, em-flags); remove_extent_mapping(map_tree, em); free_extent_map(em); } -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: wait for ordered extents before removing extent maps
Hello Filipe, 2013/12/14 Filipe David Borba Manana fdman...@gmail.com: Wang Shilong got into a case where during inode eviction we were removing an extent map while it was pinned. This triggered a warning in remove_extent_mapping() because the extent map had the pinned flag set: [ 1209.102076] [a04721b9] remove_extent_mapping+0x69/0x70 [btrfs] [ 1209.102084] [a0466b06] btrfs_evict_inode+0x96/0x4d0 [btrfs] [ 1209.102089] [81073010] ? wake_atomic_t_function+0x40/0x40 [ 1209.102092] [8118ab2e] evict+0x9e/0x190 [ 1209.102094] [8118b313] iput+0xf3/0x180 [ 1209.102101] [a0461fd1] btrfs_run_delayed_iputs+0xb1/0xd0 [btrfs] [ 1209.102107] [a045d358] __btrfs_end_transaction+0x268/0x350 [btrfs] Therefore wait for any pending ordered extents, if any, which will trigger calls to unpin_extent_cache(), before removing the extent maps. Wang's solution of simply clearing the pinned bit wasn't enough, as after unpin_extent_cache() will be called and trigger another WARN_ON() because the lookup for the extent map returned NULL. Why not in evict_inode_truncate_pages() move remove_extent_mapping() after clear_extent_bit()? Thanks, Wang Thanks Wang for finding out this. Signed-off-by: Filipe David Borba Manana fdman...@gmail.com --- fs/btrfs/inode.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e889779..c2933fb 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4509,6 +4509,9 @@ static void evict_inode_truncate_pages(struct inode *inode) ASSERT(inode-i_state I_FREEING); truncate_inode_pages(inode-i_data, 0); + /* do we really want it for -i_nlink 0 and zero btrfs_root_refs? */ + btrfs_wait_ordered_range(inode, 0, (u64)-1); + write_lock(map_tree-lock); while (!RB_EMPTY_ROOT(map_tree-map)) { struct extent_map *em; @@ -4566,8 +4569,6 @@ void btrfs_evict_inode(struct inode *inode) btrfs_orphan_del(NULL, inode); goto no_delete; } - /* do we really want it for -i_nlink 0 and zero btrfs_root_refs? */ - btrfs_wait_ordered_range(inode, 0, (u64)-1); if (root-fs_info-log_root_recovering) { BUG_ON(test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM, -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: wait for ordered extents before removing extent maps
2013/12/14 Filipe David Manana fdman...@gmail.com: On Sat, Dec 14, 2013 at 2:56 PM, Shilong Wang wangshilong1...@gmail.com wrote: Hello Filipe, 2013/12/14 Filipe David Borba Manana fdman...@gmail.com: Wang Shilong got into a case where during inode eviction we were removing an extent map while it was pinned. This triggered a warning in remove_extent_mapping() because the extent map had the pinned flag set: [ 1209.102076] [a04721b9] remove_extent_mapping+0x69/0x70 [btrfs] [ 1209.102084] [a0466b06] btrfs_evict_inode+0x96/0x4d0 [btrfs] [ 1209.102089] [81073010] ? wake_atomic_t_function+0x40/0x40 [ 1209.102092] [8118ab2e] evict+0x9e/0x190 [ 1209.102094] [8118b313] iput+0xf3/0x180 [ 1209.102101] [a0461fd1] btrfs_run_delayed_iputs+0xb1/0xd0 [btrfs] [ 1209.102107] [a045d358] __btrfs_end_transaction+0x268/0x350 [btrfs] Therefore wait for any pending ordered extents, if any, which will trigger calls to unpin_extent_cache(), before removing the extent maps. Wang's solution of simply clearing the pinned bit wasn't enough, as after unpin_extent_cache() will be called and trigger another WARN_ON() because the lookup for the extent map returned NULL. Why not in evict_inode_truncate_pages() move remove_extent_mapping() after clear_extent_bit()? So, if the pinned bit is set, it means some task will clear it later, via unpin_extent_cache(). And if you look at that function, it has this: write_lock(tree-lock); em = lookup_extent_mapping(tree, start, len); WARN_ON(!em || em-start != start); And remove_extent_mapping() will remove the em from the rbtree, regardless of its reference count value, therefore triggering that warning above. Here i mean, in evict_inode_truncate_pages() We change it to: Step1: unpin_extent_cache() Step2: remove it from extent_mapping Dose this cause any problems? i am a little confused, correct me if i am wrong some places^_^. Does it makes sense? thanks Thanks, Wang Thanks Wang for finding out this. Signed-off-by: Filipe David Borba Manana fdman...@gmail.com --- fs/btrfs/inode.c |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e889779..c2933fb 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4509,6 +4509,9 @@ static void evict_inode_truncate_pages(struct inode *inode) ASSERT(inode-i_state I_FREEING); truncate_inode_pages(inode-i_data, 0); + /* do we really want it for -i_nlink 0 and zero btrfs_root_refs? */ + btrfs_wait_ordered_range(inode, 0, (u64)-1); + write_lock(map_tree-lock); while (!RB_EMPTY_ROOT(map_tree-map)) { struct extent_map *em; @@ -4566,8 +4569,6 @@ void btrfs_evict_inode(struct inode *inode) btrfs_orphan_del(NULL, inode); goto no_delete; } - /* do we really want it for -i_nlink 0 and zero btrfs_root_refs? */ - btrfs_wait_ordered_range(inode, 0, (u64)-1); if (root-fs_info-log_root_recovering) { BUG_ON(test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM, -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Filipe David Manana, Reasonable men adapt themselves to the world. Unreasonable men adapt the world to themselves. That's why all progress depends on unreasonable men. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: fix deadlock when iterating inode refs and running delayed inodes
Hello Filipe, I mean no hurt, but your changelog is really too long. I think pasting some of dmesg is ok... Thanks, Wang 2013/12/15 Filipe David Borba Manana fdman...@gmail.com: While running btrfs/004 from xfstests, after 503 iterations, dmesg reported a deadlock between tasks iterating inode refs and tasks running delayed inodes (during a transaction commit). It turns out that iterating inode refs implies doing one tree search and release all nodes in the path except the leaf node, and then passing that leaf node to btrfs_ref_to_path(), which in turn does another tree search without releasing the lock on the leaf node it received as parameter. This is a problem when other task wants to write to the btree as well and ends up updating the leaf that is read locked - the writer task locks the parent of the leaf and then blocks waiting for the leaf's lock to be released - at the same time, the task executing btrfs_ref_to_path() does a second tree search, without releasing the lock on the first leaf, and wants to access a leaf (the same or another one) that is a child of the same parent, resulting in a deadlock. The trace reported by lockdep follows. [84314.936373] INFO: task fsstress:11930 blocked for more than 120 seconds. [84314.936381] Tainted: GW O 3.12.0-fdm-btrfs-next-16+ #70 [84314.936383] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [84314.936386] fsstressD 8806e1bf8000 0 11930 11926 0x [84314.936393] 8804d6d89b78 0046 8804d6d89b18 810bd8bd [84314.936399] 8806e1bf8000 8804d6d89fd8 8804d6d89fd8 8804d6d89fd8 [84314.936405] 880806308000 8806e1bf8000 8804d6d89c08 8804deb8f190 [84314.936410] Call Trace: [84314.936421] [810bd8bd] ? trace_hardirqs_on+0xd/0x10 [84314.936428] [81774269] schedule+0x29/0x70 [84314.936451] [a0715bf5] btrfs_tree_lock+0x75/0x270 [btrfs] [84314.936457] [810715c0] ? __init_waitqueue_head+0x60/0x60 [84314.936470] [a06ba231] btrfs_search_slot+0x7f1/0x930 [btrfs] [84314.936489] [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs] [84314.936504] [a06d2e1f] btrfs_lookup_inode+0x2f/0xa0 [btrfs] [84314.936510] [810bd6ef] ? trace_hardirqs_on_caller+0x1f/0x1e0 [84314.936528] [a073173c] __btrfs_update_delayed_inode+0x4c/0x1d0 [btrfs] [84314.936543] [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs] [84314.936558] [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs] [84314.936573] [a0731c82] __btrfs_run_delayed_items+0x192/0x1e0 [btrfs] [84314.936589] [a0731d03] btrfs_run_delayed_items+0x13/0x20 [btrfs] [84314.936604] [a06dbcd4] btrfs_flush_all_pending_stuffs+0x24/0x80 [btrfs] [84314.936620] [a06ddc13] btrfs_commit_transaction+0x223/0xa20 [btrfs] [84314.936630] [a06ae5ae] btrfs_sync_fs+0x6e/0x110 [btrfs] [84314.936635] [811d0b50] ? __sync_filesystem+0x60/0x60 [84314.936639] [811d0b50] ? __sync_filesystem+0x60/0x60 [84314.936643] [811d0b70] sync_fs_one_sb+0x20/0x30 [84314.936648] [811a3541] iterate_supers+0xf1/0x100 [84314.936652] [811d0c45] sys_sync+0x55/0x90 [84314.936658] [8177ef12] system_call_fastpath+0x16/0x1b [84314.936660] INFO: lockdep is turned off. [84314.936663] INFO: task btrfs:11955 blocked for more than 120 seconds. [84314.93] Tainted: GW O 3.12.0-fdm-btrfs-next-16+ #70 [84314.936668] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [84314.936670] btrfs D 880541729a88 0 11955 11608 0x [84314.936674] 880541729a38 0046 8805417299d8 810bd8bd [84314.936680] 88075430c8a0 880541729fd8 880541729fd8 880541729fd8 [84314.936685] 81c104e0 88075430c8a0 8804de8b00b8 8804de8b [84314.936690] Call Trace: [84314.936695] [810bd8bd] ? trace_hardirqs_on+0xd/0x10 [84314.936700] [81774269] schedule+0x29/0x70 [84314.936717] [a0715815] btrfs_tree_read_lock+0xd5/0x140 [btrfs] [84314.936721] [810715c0] ? __init_waitqueue_head+0x60/0x60 [84314.936733] [a06ba201] btrfs_search_slot+0x7c1/0x930 [btrfs] [84314.936746] [a06bd505] btrfs_find_item+0x55/0x160 [btrfs] [84314.936763] [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs] [84314.936780] [a073c9ca] btrfs_ref_to_path+0xba/0x1e0 [btrfs] [84314.936797] [a06f9719] ? release_extent_buffer+0xb9/0xe0 [btrfs] [84314.936813] [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs] [84314.936830] [a073cb50] inode_to_path+0x60/0xd0 [btrfs] [84314.936846] [a073d365] paths_from_inode+0x115/0x3c0 [btrfs] [84314.936851] [8118dd44] ? kmem_cache_alloc_trace+0x114/0x200
Re: [PATCH] Btrfs: fix deadlock when iterating inode refs and running delayed inodes
Hello Filipe, I mean no harm, but this patch's changelog is too long.. I think pasting some of dmesg is ok... Thanks, Wang 2013/12/15 Filipe David Borba Manana fdman...@gmail.com: While running btrfs/004 from xfstests, after 503 iterations, dmesg reported a deadlock between tasks iterating inode refs and tasks running delayed inodes (during a transaction commit). It turns out that iterating inode refs implies doing one tree search and release all nodes in the path except the leaf node, and then passing that leaf node to btrfs_ref_to_path(), which in turn does another tree search without releasing the lock on the leaf node it received as parameter. This is a problem when other task wants to write to the btree as well and ends up updating the leaf that is read locked - the writer task locks the parent of the leaf and then blocks waiting for the leaf's lock to be released - at the same time, the task executing btrfs_ref_to_path() does a second tree search, without releasing the lock on the first leaf, and wants to access a leaf (the same or another one) that is a child of the same parent, resulting in a deadlock. The trace reported by lockdep follows. [84314.936373] INFO: task fsstress:11930 blocked for more than 120 seconds. [84314.936381] Tainted: GW O 3.12.0-fdm-btrfs-next-16+ #70 [84314.936383] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [84314.936386] fsstressD 8806e1bf8000 0 11930 11926 0x [84314.936393] 8804d6d89b78 0046 8804d6d89b18 810bd8bd [84314.936399] 8806e1bf8000 8804d6d89fd8 8804d6d89fd8 8804d6d89fd8 [84314.936405] 880806308000 8806e1bf8000 8804d6d89c08 8804deb8f190 [84314.936410] Call Trace: [84314.936421] [810bd8bd] ? trace_hardirqs_on+0xd/0x10 [84314.936428] [81774269] schedule+0x29/0x70 [84314.936451] [a0715bf5] btrfs_tree_lock+0x75/0x270 [btrfs] [84314.936457] [810715c0] ? __init_waitqueue_head+0x60/0x60 [84314.936470] [a06ba231] btrfs_search_slot+0x7f1/0x930 [btrfs] [84314.936489] [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs] [84314.936504] [a06d2e1f] btrfs_lookup_inode+0x2f/0xa0 [btrfs] [84314.936510] [810bd6ef] ? trace_hardirqs_on_caller+0x1f/0x1e0 [84314.936528] [a073173c] __btrfs_update_delayed_inode+0x4c/0x1d0 [btrfs] [84314.936543] [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs] [84314.936558] [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs] [84314.936573] [a0731c82] __btrfs_run_delayed_items+0x192/0x1e0 [btrfs] [84314.936589] [a0731d03] btrfs_run_delayed_items+0x13/0x20 [btrfs] [84314.936604] [a06dbcd4] btrfs_flush_all_pending_stuffs+0x24/0x80 [btrfs] [84314.936620] [a06ddc13] btrfs_commit_transaction+0x223/0xa20 [btrfs] [84314.936630] [a06ae5ae] btrfs_sync_fs+0x6e/0x110 [btrfs] [84314.936635] [811d0b50] ? __sync_filesystem+0x60/0x60 [84314.936639] [811d0b50] ? __sync_filesystem+0x60/0x60 [84314.936643] [811d0b70] sync_fs_one_sb+0x20/0x30 [84314.936648] [811a3541] iterate_supers+0xf1/0x100 [84314.936652] [811d0c45] sys_sync+0x55/0x90 [84314.936658] [8177ef12] system_call_fastpath+0x16/0x1b [84314.936660] INFO: lockdep is turned off. [84314.936663] INFO: task btrfs:11955 blocked for more than 120 seconds. [84314.93] Tainted: GW O 3.12.0-fdm-btrfs-next-16+ #70 [84314.936668] echo 0 /proc/sys/kernel/hung_task_timeout_secs disables this message. [84314.936670] btrfs D 880541729a88 0 11955 11608 0x [84314.936674] 880541729a38 0046 8805417299d8 810bd8bd [84314.936680] 88075430c8a0 880541729fd8 880541729fd8 880541729fd8 [84314.936685] 81c104e0 88075430c8a0 8804de8b00b8 8804de8b [84314.936690] Call Trace: [84314.936695] [810bd8bd] ? trace_hardirqs_on+0xd/0x10 [84314.936700] [81774269] schedule+0x29/0x70 [84314.936717] [a0715815] btrfs_tree_read_lock+0xd5/0x140 [btrfs] [84314.936721] [810715c0] ? __init_waitqueue_head+0x60/0x60 [84314.936733] [a06ba201] btrfs_search_slot+0x7c1/0x930 [btrfs] [84314.936746] [a06bd505] btrfs_find_item+0x55/0x160 [btrfs] [84314.936763] [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs] [84314.936780] [a073c9ca] btrfs_ref_to_path+0xba/0x1e0 [btrfs] [84314.936797] [a06f9719] ? release_extent_buffer+0xb9/0xe0 [btrfs] [84314.936813] [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs] [84314.936830] [a073cb50] inode_to_path+0x60/0xd0 [btrfs] [84314.936846] [a073d365] paths_from_inode+0x115/0x3c0 [btrfs] [84314.936851] [8118dd44] ? kmem_cache_alloc_trace+0x114/0x200
Re: [PATCH 1/2] Btrfs: fix an oops when doing balance relocation
2013/12/10 Wang Shilong wangsl.f...@cn.fujitsu.com: On 12/10/2013 12:14 AM, Wang Shilong wrote: From: Wang Shilong wangsl.f...@cn.fujitsu.com I hit a BUG_ON() when inserting reloc root into rc-reloc_root_tree.rb_root, the fact is block bytenr has been inserted before, this is really a redicious bug, the reason is that we freed root node after we have allocated root node block,and this block bytenr will be reused and then oops happens. By hacking the code (force cow for relocation tree), BUG_ON() can be triggered easily. The problem is that reloc root bytenr inserted into @reloc_root_tree is not within one transaction. root block node can be cowed while it is still in @reloc_root_tree and previous relocation root block can be reused and this oops happen. We can fix it by updating relocation root node in @reloc_root_tree when cowing block, Patch has been make and hopely it can fix this problem, will send v2 later. ^_^ oops, this is wrong, sorry for noise. Previously, i really made mistakes, free_extent_buffer() even won't free tree block, it just dec refs and free memory if possible. Thanks, Wang Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- fs/btrfs/relocation.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index ce459a7..4dc7f26 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1361,7 +1361,6 @@ static struct btrfs_root *create_reloc_root(struct btrfs_trans_handle *trans, } btrfs_tree_unlock(eb); - free_extent_buffer(eb); ret = btrfs_insert_root(trans, root-fs_info-tree_root, root_key, root_item); -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs: fix generation mismatch when scrubbing supers
Hello Sebastian, Would you please apply this patch in your test box and see if the problem still exist. Thanks, Wang 2013/12/3 Wang Shilong wangshilong1...@gmail.com: From: Wang Shilong wangsl.f...@cn.fujitsu.com We came a race condition when scrubbing superblocks, the story is: In commiting transaction, we will update last_trans_commited after writting superblocks. if a scrub start after writting superblocks and before last_trans_commited, generation mismatch happens! We fix it by protecting writting superblock and updating last_trans_commited with tree_log_mutex. Reported-by: Sebastian Ochmann ochm...@informatik.uni-bonn.de Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- fs/btrfs/scrub.c | 11 +++ fs/btrfs/transaction.c | 13 ++--- 2 files changed, 17 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 561e2f1..afa2f01 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -2932,12 +2932,15 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, atomic_inc(fs_info-scrubs_running); mutex_unlock(fs_info-scrub_lock); + /* +* by holding tree_log_mutex while scrubbing supers +* we can avoid generation mismatch. See comments in commiting +* transaction when updating last_trans_commited. +*/ if (!is_dev_replace) { - /* -* by holding device list mutex, we can -* kick off writing super in log tree sync. -*/ + mutex_lock(fs_info-tree_log_mutex); ret = scrub_supers(sctx, dev); + mutex_unlock(fs_info-tree_log_mutex); } mutex_unlock(fs_info-fs_devices-device_list_mutex); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index c6a872a..052eb22 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, goto cleanup_transaction; } + btrfs_finish_extent_commit(trans, root); + + /* +* we must gurantee last_trans_commited update is protected by +* tree_log_mutex with write_ctree_super together, otherwise, +* scubbing super will come in before updating last_trans_commited +* and we will get generation mismatch when scrubbing superblocks. +*/ + root-fs_info-last_trans_committed = cur_trans-transid; + /* * the super is written, we can safely allow the tree-loggers * to go about their business */ mutex_unlock(root-fs_info-tree_log_mutex); - btrfs_finish_extent_commit(trans, root); - - root-fs_info-last_trans_committed = cur_trans-transid; /* * We needn't acquire the lock here because there is no other task * which can change it. -- 1.8.4 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Fwd: 2 errors when scrubbing - but I don't know what they mean
cc: linux-btrfs -- Forwarded message -- From: Shilong Wang wangshilong1...@gmail.com Date: 2013/12/1 Subject: Re: 2 errors when scrubbing - but I don't know what they mean To: Sebastian Ochmann ochm...@informatik.uni-bonn.de Hello Sebastian, 2013/11/30 Sebastian Ochmann ochm...@informatik.uni-bonn.de: Hello, thank you for your input. I didn't know that btrfs keeps the error counters over mounts/reboots, but that's nice. I'm still trying to figure out how such a generation error may occur in the first place. One thing I noticed looking at the btrfs code is that the generation error counter will only get incremented in the actual scrubbing code (either in scrub_checksum_super or in scrub_handle_errored_block, both in scrub.c - please correct me if I'm wrong, I'm not a btrfs dev). Right, Scrub will read superblock with bio rather than using pagecaches. This mean we will reread superblock from disks, if a checksum mismatch happens, This can be the following reasons: 1.some read errors happen while scrubing, while superblocks are actually good 2.during last transaction, when we are trying to write superblocks to disk, some silent corruption happens. 3.some unexpected operation write data to superblocks directly, for example..'dd if=/dev/zero' of=/dev/ seek=65536 count=4k' something like this. Actually, during boot time, superblock should be fine, because will do checksum check when trying to using superblock. if checksum mismatch, we will refuse to mount, After mounting, these superblocks should be cached in memory until you umouting filesystem. So ideal thing is your disk is fine, and during next transaction, superblocks will be rewritten. and during next umounting, you can mounting filesystem successfully! However, if you find such superblocks checksum mismatch very often during scrub, it maybe there are something wrong with disk! Also, the dmesg errors I saw were not there at boot time, but about 10 minutes after boot which was about the time when I started the scrub so I'm pretty sure that it was the scrub that detected the errors. The question remains what can cause superblock/gen errors. Sure it could be some read error, but I'd really like to make sure that it's not a systematic error. I wasn't able to reproduce it yet though. You can reproduce this by doing 'dd if=/dev/zero of=/dev/sd* seek=65536 count=4k' before btrfs scrubing. Thanks, Wang Best Sebastian -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG] btrfs-progs: btrfs send does not work inside mounted non root subvolume
Hello Thomas, 2013/11/29 Thomas Scheiblauer t...@sharkbay.at: When operating from inside a mounted btrfs subvolume which is not the root subvolume btrfs send exits with the error: ERROR: open subvolume path failed. No such file or directory Steps to reproduce: 1. mkfs.btrfs /dev/sda1 2. mount /dev/sda1 /mnt 3. btrfs subvolume create /mnt/foo 4. umount /mnt 5. mount -o subvol=foo /dev/sda1 /mnt 6. btrfs subvolume snapshot -r /mnt /mnt/snap 7. btrfs send /mnt/snap /dev/null - ERROR: open foo/snap failed. No such file or directory The problem is that we will try to use btrfs internal path(foo/snap) to open the sub-volume, this is not right here, i will send a patch for this, thanks for reporting. Thanks, Wang -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 10/12] Btrfs-progs: add '--block-size' option to control print result
Thanks for finding this, the problem comes to patch [v3 9/12]. When updating max columns len of child_qgroup, i miswrite qgroup-member to qgroup-parent, i have updated this patch and send a v4, it can be appiled without conflicts with later. Thanks, Wang 2013/10/8 David Sterba dste...@suse.cz: On Mon, Oct 07, 2013 at 03:21:46PM +0800, Wang Shilong wrote: You can use it like: btrfs qgroup show --block-size=m mnt Here, block size supports k/K/m/M/g/G/t/T/p/P/e/E. There is no distinction between the 1000 and 1024 based prefixes, also no way to get the raw values in bytes. I don't have a suggestion how to do that, merely letting you know that this could go separately (this and the -h patch, the rest shall be integrated). Also, the numbers in the table should be aligned to the right: $ btrfs qgroup show -h -p /mnt/ qgroupid rfer excl parent -- 0/5 900.00KiB 900.00KiB --- 0/267688.00KiB 12.00KiB 1/5 0/268684.00KiB 8.00KiB 1/5 0/2696.71GiB 4.00KiB 1/1 0/2776.71GiB 4.00KiB 1/1 0/27839.74GiB 39.74GiB 1/2 1/1 6.71GiB 6.71GiB --- 1/2 39.74GiB 39.74GiB --- 1/5 696.00KiB 696.00KiB --- david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 10/12] Btrfs-progs: add '--block-size' option to control print result
Sorry, This should to reply to the bug that you find, not this thread. Anyway, you are smart enough to get this . 2013/10/9 Shilong Wang wangshilong1...@gmail.com: Thanks for finding this, the problem comes to patch [v3 9/12]. When updating max columns len of child_qgroup, i miswrite qgroup-member to qgroup-parent, i have updated this patch and send a v4, it can be appiled without conflicts with later. Thanks, Wang 2013/10/8 David Sterba dste...@suse.cz: On Mon, Oct 07, 2013 at 03:21:46PM +0800, Wang Shilong wrote: You can use it like: btrfs qgroup show --block-size=m mnt Here, block size supports k/K/m/M/g/G/t/T/p/P/e/E. There is no distinction between the 1000 and 1024 based prefixes, also no way to get the raw values in bytes. I don't have a suggestion how to do that, merely letting you know that this could go separately (this and the -h patch, the rest shall be integrated). Also, the numbers in the table should be aligned to the right: $ btrfs qgroup show -h -p /mnt/ qgroupid rfer excl parent -- 0/5 900.00KiB 900.00KiB --- 0/267688.00KiB 12.00KiB 1/5 0/268684.00KiB 8.00KiB 1/5 0/2696.71GiB 4.00KiB 1/1 0/2776.71GiB 4.00KiB 1/1 0/27839.74GiB 39.74GiB 1/2 1/1 6.71GiB 6.71GiB --- 1/2 39.74GiB 39.74GiB --- 1/5 696.00KiB 696.00KiB --- david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 10/12] Btrfs-progs: add '--block-size' option to control print result
Hi David, 2013/10/8 David Sterba dste...@suse.cz: On Mon, Oct 07, 2013 at 03:21:46PM +0800, Wang Shilong wrote: You can use it like: btrfs qgroup show --block-size=m mnt Here, block size supports k/K/m/M/g/G/t/T/p/P/e/E. There is no distinction between the 1000 and 1024 based prefixes, also no way to get the raw values in bytes. I don't have a suggestion how to do that, merely letting you know that this could go separately (this and the -h patch, the rest shall be integrated). I implement this like the command 'du'. In default, we print result in bytes. And block size don't give a byte unit implicitly.Aslo i don't know why we need to distinct 1000 and 1024, i don't have any ideas about this. Also, the numbers in the table should be aligned to the right: Yes, this should be fixed. Thanks, Wang $ btrfs qgroup show -h -p /mnt/ qgroupid rfer excl parent -- 0/5 900.00KiB 900.00KiB --- 0/267688.00KiB 12.00KiB 1/5 0/268684.00KiB 8.00KiB 1/5 0/2696.71GiB 4.00KiB 1/1 0/2776.71GiB 4.00KiB 1/1 0/27839.74GiB 39.74GiB 1/2 1/1 6.71GiB 6.71GiB --- 1/2 39.74GiB 39.74GiB --- 1/5 696.00KiB 696.00KiB --- david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
about btrfs patches
Hello Josef, Could you please pull these patches: https://patchwork.kernel.org/patch/2192901/ https://patchwork.kernel.org/patch/2192911/ https://patchwork.kernel.org/patch/2192921/ The first one is definitely a bug that needs to be fixed, the second one is to remove unnecessary BUG_ON, the last one is a clean up.. Is there any problems with these patches? Please let me know.. Thanks, Wang -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs-progs: fix parse_limit function to return errors when parsing unit
Hello, Eric, David 2013/2/27 Eric Sandeen sand...@redhat.com: On 2/27/13 9:52 AM, David Sterba wrote: On Thu, Feb 21, 2013 at 08:26:15PM +0800, Wang Shilong wrote: From: Wang Shilong wangsl-f...@cn.fujitsu.com Steps to reproduce: btrfs qgroup limit m mnt/subv Here, unit(k/K/g/G/m/M/t/T) all will trigger the problem. For the above command, the original code will parse the limit value as 0 and return successfully.It is wrong,fix it. I don't think we should allow to accept bare units for specifing numbers, so in that case fail with 'unrecognized limit'. I agree with that. Maybe since David didn't quite want to take my patch 01/17, and Zach thought we needed better code for it all, and Goffredo didn't like the exit() from parse_size, and now this problem, we need a single, concerted effort to fix up size parsing. I Agre exit() should not be used in parse_size... But i don't agree with Zach about better code for it... Maybe like this: int parse_size(const char *s, u64 * size).. Many places that use the function parse_size/parse_limit, they have to be corrected after change Thanks, Wang Who knew it was so hard? :) -Eric david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 14/17] btrfs-progs: fix mem leak in resolve_root
Hello, Eric, David 2013/2/27 David Sterba dste...@suse.cz: On Mon, Feb 25, 2013 at 10:36:41PM -0600, Eric Sandeen wrote: On 2/25/13 6:36 PM, Shilong Wang wrote: --- a/btrfs-list.c +++ b/btrfs-list.c @@ -568,8 +568,10 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri, * ref_tree = 0 indicates the subvolumes * has been deleted. */ - if (!found-ref_tree) + if (!found-ref_tree) { + free(full_path); return -ENOENT; + } int add_len = strlen(found-path); /* room for / and for null */ @@ -612,8 +614,10 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri, * subvolume was deleted. */ found = root_tree_search(rl, next); - if (!found) + if (!found) { + free(full_path); return -ENOENT; + } } ri-full_path = full_path; -- 1.7.1 I think the patch is wrong; Here we return ENOENT, it means a subvolume/snapshot deletion happens. We just filter them in the filter_root, But the free work is done by the function all_subvolume_free.. so your modification will cause a double free.. Thanks for the review. I'll admit that when looking at too many of these static checker reports, sometimes things look obvious when they are actually subtle, and I've certainly made mistakes before. :) However, full_path location doesn't seem to be available outside the scope of this function unless we exit normally and do: ri-full_path = full_path; return 0; } If we exit early at the -ENOENT points, it seems that full_path is leaked; there are no other references to it. I looked the code carefully, i was wrong before.. Agree with the patch Thanks, Wang I agree with you, the freed value is local. david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 14/17] btrfs-progs: fix mem leak in resolve_root
Hello, Eric 2013/2/26 Eric Sandeen sand...@redhat.com: If we exit with error we must free the allocated memory to avoid a leak. Signed-off-by: Eric Sandeen sand...@redhat.com --- btrfs-list.c |8 ++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/btrfs-list.c b/btrfs-list.c index 851c059..8c3f84d 100644 --- a/btrfs-list.c +++ b/btrfs-list.c @@ -568,8 +568,10 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri, * ref_tree = 0 indicates the subvolumes * has been deleted. */ - if (!found-ref_tree) + if (!found-ref_tree) { + free(full_path); return -ENOENT; + } int add_len = strlen(found-path); /* room for / and for null */ @@ -612,8 +614,10 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri, * subvolume was deleted. */ found = root_tree_search(rl, next); - if (!found) + if (!found) { + free(full_path); return -ENOENT; + } } ri-full_path = full_path; -- 1.7.1 I think the patch is wrong; Here we return ENOENT, it means a subvolume/snapshot deletion happens. We just filter them in the filter_root, But the free work is done by the function all_subvolume_free.. so your modification will cause a double free.. Thanks, Wang -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Btrfs: create the qgroup that limits root subvolume automatically
2013/2/22 Arne Jansen sensi...@gmx.net: On 02/22/13 13:02, Wang Shilong wrote: From: Wang Shilong wangsl-f...@cn.fujitsu.com Creating the root subvolume qgroup when enabling quota,with Why only create a qgroup for the root subvolume and not for every existing subvolume? Yes,You are right. Creating all the existed subvolume qgroup is necessary when enabling quota since we try to prevent creating group level 0...the subvolume/snapshot group should be operated automatically... Atfer this work. I think it is necessary to delete the subvolume/snapshot qgroup as the deletion of sub volume/snapshot. BTW, there is a thing to think about... During enabling quota...No new subvolume should be created before the enabling quota is done. I will try to implement such functions. this patch,it will be ok to limit the whole filesystem size. This will not limit the whole filesystem, but only the root subvolume. To limit the whole filesystem you'd have to create a level 1 qgroup and add all subvolumes to it. Right, thanks for correcting it... Thanks, Wang -Arne Signed-off-by: Wang Shilong wangsl-f...@cn.fujitsu.com Reviewed-by: Miao Xie mi...@cn.fujitsu.com Cc: Arne Jansen sensi...@gmx.net --- fs/btrfs/qgroup.c | 12 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index a5c8562..c409096 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -777,6 +777,7 @@ int btrfs_quota_enable(struct btrfs_trans_handle *trans, struct extent_buffer *leaf; struct btrfs_key key; int ret = 0; + struct btrfs_qgroup *qgroup = NULL; spin_lock(fs_info-qgroup_lock); if (fs_info-quota_root) { @@ -823,7 +824,18 @@ int btrfs_quota_enable(struct btrfs_trans_handle *trans, btrfs_mark_buffer_dirty(leaf); + btrfs_release_path(path); + ret = add_qgroup_item(trans, quota_root, BTRFS_FS_TREE_OBJECTID); + if (ret) + goto out; + spin_lock(fs_info-qgroup_lock); + qgroup = add_qgroup_rb(fs_info, BTRFS_FS_TREE_OBJECTID); + if (IS_ERR(qgroup)) { + spin_unlock(fs_info-qgroup_lock); + ret = PTR_ERR(qgroup); + goto out; + } fs_info-quota_root = quota_root; fs_info-pending_quota_state = 1; spin_unlock(fs_info-qgroup_lock); -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND RFC PATCH 2/2] Btrfs: disable the qgroup level 0 for userspace use
Hello, 2013/2/22 Arne Jansen sensi...@gmx.net: On 02/22/13 13:09, Wang Shilong wrote: From: Wang Shilong wangsl-f...@cn.fujitsu.com This patch tries to stop users to create/destroy qgroup level 0, users can only create/destroy qgroup level more than 0. See the fact: a subvolume/snapshot qgroup was created automatically when creating subvolume/snapshot, so creating a qgroup level 0 can't be a subvolume/snapshot qgroup, the only way to use it is that assigning subvolume/snapshot qgroup to it, the point is that we don't want to have a parent qgroup whose level is 0. So we want to force users to use qgroup with clear relations which means a parent qgroup's level child qgroup's level.For example: 2/0 /\ / \ /\ 1/0 1/1 / \\ / \\ / \\ 0/256 0/2570/258 This pattern of quota is nature and easy for users to understand, otherwise it will make the quota configuration confusing and difficult to maintain. I agree that a strict hierarchy of the levels should be enforced. Currently the kernel has no idea of 'level', it's just an artificial concept that lives in userspace. This patch would be the first place to add that magic shift '48' to the kernel. In my opinion it would be sufficient to do the enforcement in user space, as it is of no technical nature. ...i have made some patches about these work in btrfs-prog, but it has been not merged... I will pick up thoses patches and do the other necessary work.. Thanks, Wang -Arne Signed-off-by: Wang Shilong wangsl-f...@cn.fujitsu.com Acked-by: Miao Xie mi...@cn.fujitsu.com Cc: Arne Jansen sensi...@gmx.net --- fs/btrfs/ioctl.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index a31cd93..3590c21 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3755,7 +3755,7 @@ static long btrfs_ioctl_qgroup_create(struct file *file, void __user *arg) goto drop_write; } - if (!sa-qgroupid) { + if (!(sa-qgroupid 48)) { ret = -EINVAL; goto out; } -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND RFC PATCH 2/2] Btrfs: disable the qgroup level 0 for userspace use
Hello, David 2013/2/23 David Sterba dste...@suse.cz: On Sat, Feb 23, 2013 at 12:39:24AM +0800, Shilong Wang wrote: Hello, 2013/2/22 Arne Jansen sensi...@gmx.net: On 02/22/13 13:09, Wang Shilong wrote: From: Wang Shilong wangsl-f...@cn.fujitsu.com This patch tries to stop users to create/destroy qgroup level 0, users can only create/destroy qgroup level more than 0. See the fact: a subvolume/snapshot qgroup was created automatically when creating subvolume/snapshot, so creating a qgroup level 0 can't be a subvolume/snapshot qgroup, the only way to use it is that assigning subvolume/snapshot qgroup to it, the point is that we don't want to have a parent qgroup whose level is 0. So we want to force users to use qgroup with clear relations which means a parent qgroup's level child qgroup's level.For example: 2/0 /\ / \ /\ 1/0 1/1 / \\ / \\ / \\ 0/256 0/2570/258 This pattern of quota is nature and easy for users to understand, otherwise it will make the quota configuration confusing and difficult to maintain. I agree that a strict hierarchy of the levels should be enforced. Currently the kernel has no idea of 'level', it's just an artificial concept that lives in userspace. This patch would be the first place to add that magic shift '48' to the kernel. In my opinion it would be sufficient to do the enforcement in user space, as it is of no technical nature. ...i have made some patches about these work in btrfs-prog, but it has been not merged... I will pick up thoses patches and do the other necessary work.. This one? https://patchwork.kernel.org/patch/2008591/ went through integration branch into progs' master. Yes, it is.However, more work needs done to make it work well.. I'd continue my work based on integration-20130126.. Thanks, Wang david -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html