Re: [PATCH 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode

2014-05-28 Thread Shilong Wang
2014-05-28 21:56 GMT+08:00 Eric Sandeen sand...@redhat.com:
 The subject and the comment say what this change does, but
 that's obvious from reading the code.  Nothing says *why*
 the change has been made.  What does this fix, and how does
 it fix it?

Yup, the reason that we allow partial opening is that sometimes,
we may have a corrupted extent tree(for example), but for fsck repair case,
the broken tree maybe rebuilt.

So if users only want to do check but not repaired, this patch will make fsck
return failure as soon as possible and tell users that some critial roots have
been corrupted...

Let's come to your comments, Eric, you are absolutely right, i was a little
lazy sometimes...

I would add necessay changelog to describe why we need this patch,
thanks for your comments.

Regards,
Wang

 Can you add/update the commit log so that some reader in the future
 (or for that matter, a reviewer in the present) will have an idea
 about the reason for this change?  What was the failure case, what
 was the failure mode, why does this change fix it, etc.

 Thanks,
 -Eric

 On 5/28/14, 6:20 AM, Wang Shilong wrote:
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
  cmds-check.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

 diff --git a/cmds-check.c b/cmds-check.c
 index db7df80..0e4e042 100644
 --- a/cmds-check.c
 +++ b/cmds-check.c
 @@ -6810,8 +6810,7 @@ int cmd_check(int argc, char **argv)
   int option_index = 0;
   int init_csum_tree = 0;
   int qgroup_report = 0;
 - enum btrfs_open_ctree_flags ctree_flags =
 - OPEN_CTREE_PARTIAL | OPEN_CTREE_EXCLUSIVE;
 + enum btrfs_open_ctree_flags ctree_flags = OPEN_CTREE_EXCLUSIVE;

   while(1) {
   int c;
 @@ -6877,6 +6876,10 @@ int cmd_check(int argc, char **argv)
   goto err_out;
   }

 + /* only allow partial opening under repair mode */
 + if (repair)
 + ctree_flags |= OPEN_CTREE_PARTIAL;
 +
   info = open_ctree_fs_info(argv[optind], bytenr, 0, ctree_flags);
   if (!info) {
   fprintf(stderr, Couldn't open file system\n);


 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: set right total device count for seeding support

2014-05-16 Thread Shilong Wang
2014-05-16 22:14 GMT+08:00 Anand Jain anand.j...@oracle.com:

 Wang,

  There seems to be a problem - after we delete the seed
  disk, the total_devices didn't decrement back to 1.
  reproducer as in the below test case. (I used btrfs-devlist
  (posted) to check fs_devices).

There should be other problems flighting or i missed something ,
i'll take a look at this. Thanks Anand for finding this.



   # mkfs.btrfs -f /dev/sdb
   # btrfstune -S 1 /dev/sdb
   # mount /dev/sdb /mnt
   # btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1
  mount -o rw,remount /mnt
  btrfs dev del /dev/sdb /mnt -- fs_devices-total_devices is still 2

 Thanks, Anand



 On 13/05/14 17:05, Wang Shilong wrote:

 Seeding device support allows us to create a new filesystem
 based on existed filesystem.

 However newly created filesystem's @total_devices should include seed
 devices. This patch fix the following problem:

   # mkfs.btrfs -f /dev/sdb
   # btrfstune -S 1 /dev/sdb
   # mount /dev/sdb /mnt
   # btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1
   # umount /mnt
   # mount /dev/sdc /mnt   ---fs_devices-total_devices = 2

 This is because we record right @total_devices in superblock, but
 @fs_devices-total_devices is reset to be 0 in btrfs_prepare_sprout().

 Fix this problem by not resetting @fs_devices-total_devices.

 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
   fs/btrfs/volumes.c | 1 -
   1 file changed, 1 deletion(-)

 diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
 index 6fd7fe6..19b2d32 100644
 --- a/fs/btrfs/volumes.c
 +++ b/fs/btrfs/volumes.c
 @@ -1883,7 +1883,6 @@ static int btrfs_prepare_sprout(struct btrfs_root
 *root)
 fs_devices-seeding = 0;
 fs_devices-num_devices = 0;
 fs_devices-open_devices = 0;
 -   fs_devices-total_devices = 0;
 fs_devices-seed = seed_devices;

 generate_random_uuid(fs_devices-fsid);

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: set right total device count for seeding support

2014-05-16 Thread Shilong Wang
2014-05-16 22:44 GMT+08:00 Anand Jain anand.j...@oracle.com:


 Wang,

  when we unmount  mount (instead of remount) and followed
  with device del seed it ends up with null pointer deref at
  btrfs_shrink_dev. Thats because the btrfs_root is not set for
  seed disk as we mounted the writable sprout disk. I am writing
  a separate fix for that as its a new bug.

You means this one which was addressed by Liu few days ago?
https://patchwork.kernel.org/patch/4150471/



 Thanks, Anand



 On 16/05/14 22:14, Anand Jain wrote:


 Wang,

   There seems to be a problem - after we delete the seed
   disk, the total_devices didn't decrement back to 1.
   reproducer as in the below test case. (I used btrfs-devlist
   (posted) to check fs_devices).

# mkfs.btrfs -f /dev/sdb
# btrfstune -S 1 /dev/sdb
# mount /dev/sdb /mnt
# btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1
   mount -o rw,remount /mnt
   btrfs dev del /dev/sdb /mnt -- fs_devices-total_devices is still 2

 Thanks, Anand


 On 13/05/14 17:05, Wang Shilong wrote:

 Seeding device support allows us to create a new filesystem
 based on existed filesystem.

 However newly created filesystem's @total_devices should include seed
 devices. This patch fix the following problem:

   # mkfs.btrfs -f /dev/sdb
   # btrfstune -S 1 /dev/sdb
   # mount /dev/sdb /mnt
   # btrfs device add -f /dev/sdc /mnt ---fs_devices-total_devices = 1
   # umount /mnt
   # mount /dev/sdc /mnt   ---fs_devices-total_devices = 2

 This is because we record right @total_devices in superblock, but
 @fs_devices-total_devices is reset to be 0 in btrfs_prepare_sprout().

 Fix this problem by not resetting @fs_devices-total_devices.

 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
   fs/btrfs/volumes.c | 1 -
   1 file changed, 1 deletion(-)

 diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
 index 6fd7fe6..19b2d32 100644
 --- a/fs/btrfs/volumes.c
 +++ b/fs/btrfs/volumes.c
 @@ -1883,7 +1883,6 @@ static int btrfs_prepare_sprout(struct
 btrfs_root *root)
   fs_devices-seeding = 0;
   fs_devices-num_devices = 0;
   fs_devices-open_devices = 0;
 -fs_devices-total_devices = 0;
   fs_devices-seed = seed_devices;

   generate_random_uuid(fs_devices-fsid);

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: ioctl BTRFS_IOC_FS_INFO and BTRFS_IOC_DEV_INFO miss-matched with slots

2014-05-16 Thread Shilong Wang
2014-05-16 23:13 GMT+08:00 Anand Jain anand.j...@oracle.com:


 On 16/05/14 22:40, Shilong Wang wrote:

 2014-05-16 22:06 GMT+08:00 Anand Jain anand.j...@oracle.com:

 BTRFS_IOC_FS_INFO return num_devices which does not include seed disks,
 BTRFS_IOC_DEV_INFO fetches seed disk when probed. So in this case hits
 the btrfs-progs bug:
  get_fs_info()
  ::
  BUG_ON(ndevs = fi_args-num_devices);
 which is very easy to hit by using btrfs filesystem show.

 This patch will make BTRFS_IOC_DEV_INFO ioctl to provide disks only
 of the FSID being probed (seed disks are under different FSID).

 which means when seed is still not deleted from the sprout the btrfs
 filesystem show command will show disks them under their
 respective FSIDs


 Just a quick comment:

 Newly created filesystem is based on existed seeding filesystem.
 Though they have different fsid, newly created filesystem devices
 should include seed devices, from seed device filesytem, they can not
 see newly created filesystem devices! No?

 Will the follow patch change the above behavior?


  Yes, as the dev info list does not contain the seed disk
  so is the btrfs fi show output. Do you think its better
  to show seed and sprout disks together ? other way is
  indicate the seed fsid against the added disk until
  seed is deleted.

IMO, i think dev info list should contain seed disks, since new filesystem
can access their data.

Maybe a better way 'file show' will tell users which devices are seed devices..


 Thanks. Anand


 Thanks,
 Wang


 Signed-off-by: Anand Jain anand.j...@oracle.com
 ---
   fs/btrfs/ioctl.c |3 ++-
   1 files changed, 2 insertions(+), 1 deletions(-)

 diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
 index ff27c08..902d279 100644
 --- a/fs/btrfs/ioctl.c
 +++ b/fs/btrfs/ioctl.c
 @@ -2584,7 +2584,8 @@ static long btrfs_ioctl_dev_info(struct btrfs_root
 *root, void __user *arg)
  s_uuid = di_args-uuid;

  mutex_lock(fs_devices-device_list_mutex);
 -   dev = btrfs_find_device(root-fs_info, di_args-devid, s_uuid,
 NULL);
 +   dev = btrfs_find_device(root-fs_info, di_args-devid, s_uuid,
 +   fs_devices-fsid);

  if (!dev) {
  ret = -ENODEV;
 --
 1.7.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Back from leave

2014-05-05 Thread Shilong Wang
2014-05-05 22:28 GMT+08:00 Josef Bacik jba...@fb.com:
 Hello,

 I had way too much email so I just deleted it all, if there was something
 you wanted my specific attention on then bounce it back at me and I'll look
 at it.  Thanks,

Welcome back, Josef!

Wang

 Josef
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: csum failed that was not detected by scrub

2014-05-02 Thread Shilong Wang
Hello,


2014-05-02 17:42 GMT+08:00 Jaap Pieroen j...@pieroen.nl:
 Hi all,

 I completed a full scrub:
 root@nasbak:/home/jpieroen# btrfs scrub status /home/
 scrub status for 7ca5f38e-308f-43ab-b3ea-31b3bcd11a0d
 scrub started at Wed Apr 30 08:30:19 2014 and finished after 144131 seconds
 total bytes scrubbed: 4.76TiB with 0 errors

 Then tried to remove a device:
 root@nasbak:/home/jpieroen# btrfs device delete /dev/sdb /home

 This triggered bug_on, with the following error in dmesg: csum failed
 ino 258 off 1395560448 csum 2284440321 expected csum 319628859

 How can there still be csum failures directly after a scrub?
 If I rerun the scrub it still won't find any errors. I know this,
 because I've had the same issue 3 times in a row. Each time running a
 scrub and still being unable to remove the device.

There is a known RAID5/6 bug, i sent a patch to address this problem.
Could you please double check if your kernel source includes the
following commit:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3b080b2564287be91605bfd1d5ee985696e61d3c

RAID5/6 should detect checksum mismatch, it can not fix errors now.

Thanks,
Wang

 Kind Regards,
 Jaap

 --
 Details:

 root@nasbak:/home/jpieroen#   uname -a
 Linux nasbak 3.14.1-031401-generic #201404141220 SMP Mon Apr 14
 16:21:48 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

 root@nasbak:/home/jpieroen#   btrfs --version
 Btrfs v3.14.1

 root@nasbak:/home/jpieroen#   btrfs fi df /home
 Data, RAID5: total=4.57TiB, used=4.55TiB
 System, RAID1: total=32.00MiB, used=352.00KiB
 Metadata, RAID1: total=7.00GiB, used=5.59GiB

 root@nasbak:/home/jpieroen# btrfs fi show
 Label: 'btrfs_storage'  uuid: 7ca5f38e-308f-43ab-b3ea-31b3bcd11a0d
 Total devices 6 FS bytes used 4.56TiB
 devid1 size 1.82TiB used 1.31TiB path /dev/sde
 devid2 size 1.82TiB used 1.31TiB path /dev/sdf
 devid3 size 1.82TiB used 1.31TiB path /dev/sdg
 devid4 size 931.51GiB used 25.00GiB path /dev/sdb
 devid6 size 2.73TiB used 994.03GiB path /dev/sdh
 devid7 size 2.73TiB used 994.03GiB path /dev/sdi

 Btrfs v3.14.1

 jpieroen@nasbak:~$ dmesg
 [227248.656438] BTRFS info (device sdi): relocating block group
 9735225016320 flags 129
 [227261.713860] BTRFS info (device sdi): found 9 extents
 [227264.531019] BTRFS info (device sdi): found 9 extents
 [227265.011826] BTRFS info (device sdi): relocating block group
 76265029632 flags 129
 [227274.052249] BTRFS info (device sdi): csum failed ino 258 off
 1395560448 csum 2284440321 expected csum 319628859
 [227274.052354] BTRFS info (device sdi): csum failed ino 258 off
 1395564544 csum 3646299263 expected csum 319628859
 [227274.052402] BTRFS info (device sdi): csum failed ino 258 off
 1395568640 csum 281259278 expected csum 319628859
 [227274.052449] BTRFS info (device sdi): csum failed ino 258 off
 1395572736 csum 2594807184 expected csum 319628859
 [227274.052492] BTRFS info (device sdi): csum failed ino 258 off
 1395576832 csum 4288971971 expected csum 319628859
 [227274.052537] BTRFS info (device sdi): csum failed ino 258 off
 1395580928 csum 752615894 expected csum 319628859
 [227274.052581] BTRFS info (device sdi): csum failed ino 258 off
 1395585024 csum 3828951500 expected csum 319628859
 [227274.061279] [ cut here ]
 [227274.061354] kernel BUG at /home/apw/COD/linux/fs/btrfs/extent_io.c:2116!
 [227274.061445] invalid opcode:  [#1] SMP
 [227274.061509] Modules linked in: cuse deflate
 [227274.061573] BTRFS info (device sdi): csum failed ino 258 off
 1395560448 csum 2284440321 expected csum 319628859
 [227274.061707]  ctr twofish_generic twofish_x86_64_3way
 twofish_x86_64 twofish_common camellia_generic camellia_x86_64
 serpent_sse2_x86_64 xts serpent_generic lrw gf128mul glue_helper
 blowfish_generic blowfish_x86_64 blowfish_common cast5_generic
 cast_common ablk_helper cryptd des_generic cmac xcbc rmd160
 crypto_null af_key xfrm_algo nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
 fscache dm_crypt ip6t_REJECT ppdev xt_hl ip6t_rt nf_conntrack_ipv6
 nf_defrag_ipv6 ipt_REJECT xt_comment xt_LOG kvm xt_recent microcode
 xt_multiport xt_limit xt_tcpudp psmouse serio_raw xt_addrtype k10temp
 edac_core ipt_MASQUERADE edac_mce_amd iptable_nat nf_nat_ipv4
 sp5100_tco nf_conntrack_ipv4 nf_defrag_ipv4 ftdi_sio i2c_piix4
 usbserial xt_conntrack ip6table_filter ip6_tables joydev
 nf_conntrack_netbios_ns nf_conntrack_broadcast snd_hda_codec_via
 nf_nat_ftp snd_hda_codec_hdmi nf_nat snd_hda_codec_generic
 nf_conntrack_ftp nf_conntrack snd_hda_intel iptable_filter
 ir_lirc_codec(OF) lirc_dev(OF) ip_tables snd_hda_codec
 ir_mce_kbd_decoder(OF) x_tables snd_hwdep ir_sony_decoder(OF)
 rc_tbs_nec(OF) ir_jvc_decoder(OF) snd_pcm ir_rc6_decoder(OF)
 ir_rc5_decoder(OF) saa716x_tbs_dvb(OF) tbs6982fe(POF) tbs6680fe(POF)
 ir_nec_decoder(OF) tbs6923fe(POF) tbs6985se(POF) tbs6928se(POF)
 tbs6982se(POF) tbs6991fe(POF) tbs6618fe(POF) saa716x_core(OF)
 

Re: [PATCH] Btrfs: do not reset last_snapshot after relocation

2014-04-09 Thread Shilong Wang
Hi Josef,

2014-03-28 2:56 GMT+08:00 Josef Bacik jba...@fb.com:
 This was done to allow NO_COW to continue to be NO_COW after relocation but it
 is not right.  When relocating we will convert blocks to FULL_BACKREF that we
 relocate.  We can leave some of these full backref blocks behind if they are 
 not
 cow'ed out during the relocation, like if we fail the relocation with ENOSPC 
 and
 then just drop the reloc tree.  Then when we go to cow the block again we 
 won't
 lookup the extent flags because we won't think there has been a snapshot
 recently which means we will do our normal ref drop thing instead of adding 
 back
 a tree ref and dropping the shared ref.  This will cause btrfs_free_extent to
 blow up because it can't find the ref we are trying to free.  This was found
 with my ref verifying tool.  Thanks,

Could we pass error into merge_reloc_roots() something like this:

merge_reloc_roots(struct reloc_control *rc, int err)

and we only skip reset root's last snapshot if  @err is not 0. I think
it is meaningful
to allow NO_COW to continue after relocation.

Thanks,
Wang

 Signed-off-by: Josef Bacik jba...@fb.com
 ---
  fs/btrfs/relocation.c | 21 -
  1 file changed, 21 deletions(-)

 diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
 index ec00777..f026a82 100644
 --- a/fs/btrfs/relocation.c
 +++ b/fs/btrfs/relocation.c
 @@ -2318,7 +2318,6 @@ void free_reloc_roots(struct list_head *list)
  static noinline_for_stack
  int merge_reloc_roots(struct reloc_control *rc)
  {
 -   struct btrfs_trans_handle *trans;
 struct btrfs_root *root;
 struct btrfs_root *reloc_root;
 u64 last_snap;
 @@ -2376,26 +2375,6 @@ again:
 list_add_tail(reloc_root-root_list,
   reloc_roots);
 goto out;
 -   } else if (!ret) {
 -   /*
 -* recover the last snapshot tranid to avoid
 -* the space balance break NOCOW.
 -*/
 -   root = read_fs_root(rc-extent_root-fs_info,
 -   objectid);
 -   if (IS_ERR(root))
 -   continue;
 -
 -   trans = btrfs_join_transaction(root);
 -   BUG_ON(IS_ERR(trans));
 -
 -   /* Check if the fs/file tree was snapshoted or not. */
 -   if (btrfs_root_last_snapshot(root-root_item) ==
 -   otransid - 1)
 -   btrfs_set_root_last_snapshot(root-root_item,
 -last_snap);
 -
 -   btrfs_end_transaction(trans, root);
 }
 }

 --
 1.8.3.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Btrfs: don't compress for a small write

2014-03-31 Thread Shilong Wang
2014-03-31 20:31 GMT+08:00 Chris Mason c...@fb.com:


 On 03/24/2014 05:58 AM, Wang Shilong wrote:

 To compress a small write(=blocksize) dosen't save us
 disk space at all, skip it can save us some compression time.

 This patch can also fix wrong setting nocompression flag for
 inode, say a case when @total_in is 4096, and then we get
 @total_compressed 52,because we do aligment to page cache size
 firstly, and then we get into conclusion @total_in=@total_compressed
 thus we will clear this inode's compression flag.

 An exception comes from inserting inline extent failure but we
 still have @total_compressed  @total_in,so we will still reset
 inode's flag, this is ok, because we don't have good compression
 effect.


 So your check for start  0 || end + 1  disk_i_size means we're only
 skipping compression for a small file range that isn't in an inline extent.
 Could you please update the patch description and comments in the code to
 reflect this?

No problem, i will update this patch.~_~


 Thanks!

 -chris


 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] Btrfs: scrub raid56 stripes in the right way

2014-03-31 Thread Shilong Wang
2014-03-31 18:34 GMT+08:00 Wang Shilong wangsl.f...@cn.fujitsu.com:
 Steps to reproduce:
  # mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5
  # mount /dev/sda8 /mnt
  # btrfs scrub start -BR /mnt
  # echo $? --unverified errors make return value be 3

 This is because we don't setup right mapping between physical
 and logical address for raid56, which makes checksum mismatch.
 But we will find everthing is fine later when rechecking using
 btrfs_map_block().

 This patch fixed the problem by settuping right mappings and
 we only verify data stripes' checksums.

 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
 v1-v2:
 on the right way to fix the problem.
 ---
  fs/btrfs/scrub.c | 83 
 
  1 file changed, 71 insertions(+), 12 deletions(-)

 diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
 index db21a13..ebe4c5e 100644
 --- a/fs/btrfs/scrub.c
 +++ b/fs/btrfs/scrub.c
 @@ -2235,6 +2235,43 @@ behind_scrub_pages:
 return 0;
  }

 +/*
 + * Given a physical address, this will calculate it's
 + * logical offset. if this is a parity stripe, it will return
 + * the most left data stripe's logical offset.
 + *
 + * return 0 if it is a data stripe, 1 means parity stripe.
 + */
 +static int get_raid56_logic_offset(u64 physical, int num,
 +struct map_lookup *map, u64 *offset)
 +{
 +   int i;
 +   int j = 0;
 +   u64 stripe_nr;
 +   u64 tmp;
 +   u64 last_offset;
 +   int stripe_index;
 +
 +   last_offset = (physical - map-stripes[num].physical) *
 + nr_data_stripes(map);
 +   *offset = last_offset;
 +   for (i = 0; i  nr_data_stripes(map); i++) {
 +   *offset += i * map-stripe_len;
oops, this is obviously wrong, it should be:

 +   stripe_nr = *offset;
 +   do_div(stripe_nr, map-stripe_len);
 +
 +   stripe_index = do_div(stripe_nr, nr_data_stripes(map));
 +   tmp = stripe_nr + stripe_index;
 +   stripe_index = do_div(tmp, map-num_stripes);
 +   if (stripe_index == num)
 +   return 0;
 +   if (stripe_index  num)
 +   j++;
 +   }
 +   *offset = last_offset + j * map-stripe_len;
 +   return 1;
 +}
 +
  static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
struct map_lookup *map,
struct btrfs_device *scrub_dev,
 @@ -2269,16 +2306,10 @@ static noinline_for_stack int scrub_stripe(struct 
 scrub_ctx *sctx,
 u64 extent_len;
 struct btrfs_device *extent_dev;
 int extent_mirror_num;
 -   int stop_loop;
 -
 -   if (map-type  (BTRFS_BLOCK_GROUP_RAID5 |
 -BTRFS_BLOCK_GROUP_RAID6)) {
 -   if (num = nr_data_stripes(map)) {
 -   return 0;
 -   }
 -   }
 +   int stop_loop = 0;

 nstripes = length;
 +   physical = map-stripes[num].physical;
 offset = 0;
 do_div(nstripes, map-stripe_len);
 if (map-type  BTRFS_BLOCK_GROUP_RAID0) {
 @@ -2296,6 +2327,11 @@ static noinline_for_stack int scrub_stripe(struct 
 scrub_ctx *sctx,
 } else if (map-type  BTRFS_BLOCK_GROUP_DUP) {
 increment = map-stripe_len;
 mirror_num = num % map-num_stripes + 1;
 +   } else if (map-type  (BTRFS_BLOCK_GROUP_RAID5 |
 +   BTRFS_BLOCK_GROUP_RAID6)) {
 +   get_raid56_logic_offset(physical, num, map, offset);
 +   increment = map-stripe_len * nr_data_stripes(map);
 +   mirror_num = 1;
 } else {
 increment = map-stripe_len;
 mirror_num = 1;
 @@ -2357,10 +2393,18 @@ static noinline_for_stack int scrub_stripe(struct 
 scrub_ctx *sctx,
  * now find all extents for each stripe and scrub them
  */
 logical = base + offset;
 -   physical = map-stripes[num].physical;
 logic_end = logical + increment * nstripes;
 ret = 0;
 while (logical  logic_end) {
 +   /* for raid56, we skip parity stripe */
 +   if (map-type  (BTRFS_BLOCK_GROUP_RAID5 |
 +   BTRFS_BLOCK_GROUP_RAID6)) {
 +   ret = get_raid56_logic_offset(physical, num,
 +   map, logical);
 +   logical += base;
 +   if (ret)
 +   goto skip;
 +   }
 /*
  * canceled?
  */
 @@ -2504,9 +2548,23 @@ again:
 scrub_free_csums(sctx);
 if (extent_logical + extent_len 
 key.objectid + bytes) {
 -   logical += increment;
 -   

Re: [PATCH v2 2/2] Btrfs: scrub raid56 stripes in the right way

2014-03-31 Thread Shilong Wang
2014-03-31 20:54 GMT+08:00 Shilong Wang wangshilong1...@gmail.com:
 2014-03-31 18:34 GMT+08:00 Wang Shilong wangsl.f...@cn.fujitsu.com:
 Steps to reproduce:
  # mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5
  # mount /dev/sda8 /mnt
  # btrfs scrub start -BR /mnt
  # echo $? --unverified errors make return value be 3

 This is because we don't setup right mapping between physical
 and logical address for raid56, which makes checksum mismatch.
 But we will find everthing is fine later when rechecking using
 btrfs_map_block().

 This patch fixed the problem by settuping right mappings and
 we only verify data stripes' checksums.

 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
 v1-v2:
 on the right way to fix the problem.
 ---
  fs/btrfs/scrub.c | 83 
 
  1 file changed, 71 insertions(+), 12 deletions(-)

 diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
 index db21a13..ebe4c5e 100644
 --- a/fs/btrfs/scrub.c
 +++ b/fs/btrfs/scrub.c
 @@ -2235,6 +2235,43 @@ behind_scrub_pages:
 return 0;
  }

 +/*
 + * Given a physical address, this will calculate it's
 + * logical offset. if this is a parity stripe, it will return
 + * the most left data stripe's logical offset.
 + *
 + * return 0 if it is a data stripe, 1 means parity stripe.
 + */
 +static int get_raid56_logic_offset(u64 physical, int num,
 +struct map_lookup *map, u64 *offset)
 +{
 +   int i;
 +   int j = 0;
 +   u64 stripe_nr;
 +   u64 tmp;
 +   u64 last_offset;
 +   int stripe_index;
 +
 +   last_offset = (physical - map-stripes[num].physical) *
 + nr_data_stripes(map);
 +   *offset = last_offset;
 +   for (i = 0; i  nr_data_stripes(map); i++) {
 +   *offset += i * map-stripe_len;
 oops, this is obviously wrong, it should be:

It shoule be:
   *offset = last_offset + i * map-stripe_len;

Will fix this in v3.

 +   stripe_nr = *offset;
 +   do_div(stripe_nr, map-stripe_len);
 +
 +   stripe_index = do_div(stripe_nr, nr_data_stripes(map));
 +   tmp = stripe_nr + stripe_index;
 +   stripe_index = do_div(tmp, map-num_stripes);
 +   if (stripe_index == num)
 +   return 0;
 +   if (stripe_index  num)
 +   j++;
 +   }
 +   *offset = last_offset + j * map-stripe_len;
 +   return 1;
 +}
 +
  static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
struct map_lookup *map,
struct btrfs_device *scrub_dev,
 @@ -2269,16 +2306,10 @@ static noinline_for_stack int scrub_stripe(struct 
 scrub_ctx *sctx,
 u64 extent_len;
 struct btrfs_device *extent_dev;
 int extent_mirror_num;
 -   int stop_loop;
 -
 -   if (map-type  (BTRFS_BLOCK_GROUP_RAID5 |
 -BTRFS_BLOCK_GROUP_RAID6)) {
 -   if (num = nr_data_stripes(map)) {
 -   return 0;
 -   }
 -   }
 +   int stop_loop = 0;

 nstripes = length;
 +   physical = map-stripes[num].physical;
 offset = 0;
 do_div(nstripes, map-stripe_len);
 if (map-type  BTRFS_BLOCK_GROUP_RAID0) {
 @@ -2296,6 +2327,11 @@ static noinline_for_stack int scrub_stripe(struct 
 scrub_ctx *sctx,
 } else if (map-type  BTRFS_BLOCK_GROUP_DUP) {
 increment = map-stripe_len;
 mirror_num = num % map-num_stripes + 1;
 +   } else if (map-type  (BTRFS_BLOCK_GROUP_RAID5 |
 +   BTRFS_BLOCK_GROUP_RAID6)) {
 +   get_raid56_logic_offset(physical, num, map, offset);
 +   increment = map-stripe_len * nr_data_stripes(map);
 +   mirror_num = 1;
 } else {
 increment = map-stripe_len;
 mirror_num = 1;
 @@ -2357,10 +2393,18 @@ static noinline_for_stack int scrub_stripe(struct 
 scrub_ctx *sctx,
  * now find all extents for each stripe and scrub them
  */
 logical = base + offset;
 -   physical = map-stripes[num].physical;
 logic_end = logical + increment * nstripes;
 ret = 0;
 while (logical  logic_end) {
 +   /* for raid56, we skip parity stripe */
 +   if (map-type  (BTRFS_BLOCK_GROUP_RAID5 |
 +   BTRFS_BLOCK_GROUP_RAID6)) {
 +   ret = get_raid56_logic_offset(physical, num,
 +   map, logical);
 +   logical += base;
 +   if (ret)
 +   goto skip;
 +   }
 /*
  * canceled?
  */
 @@ -2504,9 +2548,23 @@ again:
 scrub_free_csums(sctx

Re: [PATCH] Btrfs-progs: fsck: disable --init-extent-tree option when using snapshots

2014-03-10 Thread Shilong Wang
Hi Josef,

As i haven't thought any better ideas to rebuild extent tree which contains
extent that owns 'FULL BACKREF' flag.

Considering an extent's refs can be equal or more than 1 if this extent has
*FULL BACKREF* flag, so we could not make sure an extent's flag by only
searching fs/file tree any more.

So until now, i just disable this option if snapshots exists, please correct me
if i miss something here. Or you have any better ideas to solve this problem.~_~


Thanks,
Wang
2014-03-10 18:39 GMT+08:00 Wang Shilong wangsl.f...@cn.fujitsu.com:
 We haven't supported to rebuild extent tree if there are any *FULL BACKREF*
 with broken filesystem, disable this option when detecting snapshots.

 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
  cmds-check.c | 62 
 
  1 file changed, 62 insertions(+)

 diff --git a/cmds-check.c b/cmds-check.c
 index d1cafe1..ddee897 100644
 --- a/cmds-check.c
 +++ b/cmds-check.c
 @@ -6143,6 +6143,56 @@ static int reset_block_groups(struct btrfs_fs_info 
 *fs_info)
 return 0;
  }

 +static int is_snapshot_exist(struct btrfs_fs_info *fs_info)
 +{
 +   struct btrfs_root *root = fs_info-tree_root;
 +   struct btrfs_path *path;
 +   struct extent_buffer *leaf;
 +   struct btrfs_key key;
 +   int ret;
 +   int found = 0;
 +
 +   path = btrfs_alloc_path();
 +   if (!path)
 +   return -ENOMEM;
 +
 +   key.objectid = BTRFS_FIRST_FREE_OBJECTID;
 +   key.type = BTRFS_ROOT_ITEM_KEY;
 +   key.offset = 0;
 +
 +   ret = btrfs_search_slot(NULL, root, key, path, 0, 0);
 +   if (ret  0)
 +   goto out;
 +
 +   while (1) {
 +   if (path-slots[0] = btrfs_header_nritems(path-nodes[0])) {
 +   ret = btrfs_next_leaf(root, path);
 +   if (ret)
 +   goto out;
 +   }
 +   leaf = path-nodes[0];
 +   btrfs_item_key_to_cpu(leaf, key, path-slots[0]);
 +
 +   if (key.type != BTRFS_ROOT_ITEM_KEY ||
 +   key.objectid  BTRFS_FIRST_FREE_OBJECTID) {
 +   path-slots[0]++;
 +   continue;
 +   }
 +   if (key.offset  0) {
 +   found = 1;
 +   break;
 +   }
 +   path-slots[0]++;
 +   }
 +out:
 +   btrfs_free_path(path);
 +   if (found)
 +   return 1;
 +   else if (ret = 0)
 +   return 0;
 +   return ret;
 +}
 +
  static int reset_balance(struct btrfs_trans_handle *trans,
  struct btrfs_fs_info *fs_info)
  {
 @@ -6537,6 +6587,18 @@ int cmd_check(int argc, char **argv)
 ret = -EIO;
 goto close_out;
 }
 +   if (init_extent_tree) {
 +   ret = is_snapshot_exist(info);
 +   if (ret  0) {
 +   fprintf(stderr, ERROR: fail to check if there are 
 snapshots in btrfs filesystem\n);
 +   ret = 1;
 +   goto close_out;
 +   } else if (ret) {
 +   fprintf(stderr, Snapshots detected, unable to 
 rebuild extent tree for such case.\n);
 +   ret = 1;
 +   goto close_out;
 +   }
 +   }

 if (init_extent_tree || init_csum_tree) {
 struct btrfs_trans_handle *trans;
 --
 1.9.0

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: scrub crashed?

2014-02-10 Thread Shilong Wang
Hello Johan,

This should be a known problem.

The problem seemed that scrub log file is corrupt, so i added an option
-f something like:

btrfs scrub start -f ..

You can update latest btrfs-progs from david's latest integration
branch and try it. if you don't
want to do that, just rm /var/lib/btrfs/scrub* should fix your problem.

Hopely it can help you.^_^

Thanks,
Wang


2014-02-10 22:02 GMT+08:00 Johan Kröckel johan.kroec...@gmail.com:
 root@fortknox:~# uname -a
 Linux fortknox 3.12-0.bpo.1-amd64 #1 SMP Debian 3.12.9-1~bpo70+1
 (2014-02-07) x86_64 GNU/Linux
 root@fortknox:~# btrfs version
 Btrfs v3.12
 root@fortknox:~# btrfs scrub status -d /bunker
 scrub status for 11312131-3372-4637-b526-35a4ef0c31eb
 scrub device /dev/mapper/bunkerA (id 1) status
 scrub started at Sun Feb  9 19:12:22 2014, running for 11317 seconds
 total bytes scrubbed: 325.32GiB with 0 errors
 scrub device /dev/dm-1 (id 2) status
 scrub started at Sun Feb  9 19:12:22 2014, running for 11317 seconds
 total bytes scrubbed: 321.76GiB with 0 errors
 root@fortknox:~# btrfs scrub cancel /bunker
 ERROR: scrub cancel failed on /bunker: not running
 root@fortknox:~# btrfs scrub start /bunker
 ERROR: scrub is already running.
 To cancel use 'btrfs scrub cancel /bunker'.
 To see the status use 'btrfs scrub status [-d] /bunker'.
 root@fortknox:~# ps -A|grep btrfs
  3704 ?00:00:00 btrfs-genwork-1
  3705 ?00:00:00 btrfs-submit-1
  3706 ?00:00:00 btrfs-delalloc-
  3707 ?00:00:00 btrfs-fixup-1
  3708 ?00:00:02 btrfs-endio-1
  3709 ?00:00:00 btrfs-endio-met
  3710 ?00:00:00 btrfs-rmw-1
  3711 ?00:00:00 btrfs-endio-rai
  3712 ?00:00:00 btrfs-endio-met
  3714 ?00:00:05 btrfs-freespace
  3715 ?00:00:00 btrfs-delayed-m
  3716 ?00:00:00 btrfs-cache-1
  3717 ?00:00:00 btrfs-readahead
  3718 ?00:00:00 btrfs-flush_del
  3719 ?00:00:00 btrfs-qgroup-re
  3720 ?00:00:00 btrfs-cleaner
  3721 ?00:02:00 btrfs-transacti
  8380 ?00:00:00 btrfs-endio-wri
  8836 ?00:00:00 btrfs-worker-4
  8936 ?00:00:00 btrfs-worker-3
  8961 ?00:00:00 btrfs-worker-3
  9138 ?00:00:00 btrfs-worker-4

 What can/should I do now?
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/2] Revert Btrfs: remove transaction from btrfs send

2014-02-09 Thread Shilong Wang
2014-02-09 21:52 GMT+08:00 Filipe David Manana fdman...@gmail.com:
 On Sun, Feb 9, 2014 at 2:39 AM, Shilong Wang wangshilong1...@gmail.com 
 wrote:
 2014-02-08 23:46 GMT+08:00 Wang Shilong wangshilong1...@gmail.com:
 From: Wang Shilong wangsl.f...@cn.fujitsu.com

 This reverts commit 41ce9970a8a6a362ae8df145f7a03d789e9ef9d2.
 Previously i was thinking we can use readonly root's commit root
 safely while it is not true, readonly root may be cowed with the
 following cases.

 1.snapshot send root will cow source root.
 2.balance,device operations will also cow readonly send root
 to relocate.

 So i have two ideas to make us safe to use commit root.

 --approach 1:
 make it protected by transaction and end transaction properly and we 
 research
 next item from root node(see btrfs_search_slot_for_read()).

 --approach 2:
 add another counter to local root structure to sync snapshot with send.
 and add a global counter to sync send with exclusive device operations.

 So with approach 2, send can use commit root safely, because we make sure
 send root can not be cowed during send. Unfortunately, it make codes *ugly*
 and more complex to maintain.

 To make snapshot and send exclusively, device operations and send operation
 exclusively with each other is a little confusing for common users.

 So why not drop into previous way.

 Cc: Josef Bacik jba...@fb.com
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
 Josef, if we reach agreement to adopt this approach, please revert
 Filipe's patch(Btrfs: make some tree searches in send.c more efficient)
 from btrfs-next.

 Oops, this patch guarantee searching commit roots are all protected by
 transaction, Filipe's
 patch is ok, we need update Josef's previous patch.

 Hi Shilong,

 I am confused. Can you explain why that optimization patch is a
 problem, either with or without your patch or any other patch
 currently flying around?

 Either before or after the optimization, we search through the commit
 root and after a key search we process a key while holding the leaf's
 extent buffer. Both approaches call btrfs_next_leaf too (either
 directly or via btrfs_search_slot_for_read).

Sorry my miss, your patch did not have problem, you did not notice
my following thread comments for this patch, we need update josef's previous
patch not yours. ^_^

Thanks,
Wang

 Thanks


 Wang
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



 --
 Filipe David Manana,

 Reasonable men adapt themselves to the world.
  Unreasonable men adapt the world to themselves.
  That's why all progress depends on unreasonable men.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/2] Revert Btrfs: remove transaction from btrfs send

2014-02-08 Thread Shilong Wang
2014-02-08 23:46 GMT+08:00 Wang Shilong wangshilong1...@gmail.com:
 From: Wang Shilong wangsl.f...@cn.fujitsu.com

 This reverts commit 41ce9970a8a6a362ae8df145f7a03d789e9ef9d2.
 Previously i was thinking we can use readonly root's commit root
 safely while it is not true, readonly root may be cowed with the
 following cases.

 1.snapshot send root will cow source root.
 2.balance,device operations will also cow readonly send root
 to relocate.

 So i have two ideas to make us safe to use commit root.

 --approach 1:
 make it protected by transaction and end transaction properly and we research
 next item from root node(see btrfs_search_slot_for_read()).

 --approach 2:
 add another counter to local root structure to sync snapshot with send.
 and add a global counter to sync send with exclusive device operations.

 So with approach 2, send can use commit root safely, because we make sure
 send root can not be cowed during send. Unfortunately, it make codes *ugly*
 and more complex to maintain.

 To make snapshot and send exclusively, device operations and send operation
 exclusively with each other is a little confusing for common users.

 So why not drop into previous way.

 Cc: Josef Bacik jba...@fb.com
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
 Josef, if we reach agreement to adopt this approach, please revert
 Filipe's patch(Btrfs: make some tree searches in send.c more efficient)
 from btrfs-next.

Oops, this patch guarantee searching commit roots are all protected by
transaction, Filipe's
patch is ok, we need update Josef's previous patch.

Wang
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: WARNING at fs/btrfs/backref.c:936 find_parent_nodes+0x378/0x5d2

2014-01-19 Thread Shilong Wang
Hello Tomasz,

This seems a new bug, did you run balance and defragment concurrently for
triggering the following warning?

Thanks,
Wang
2014/1/19 Tomasz Chmielewski t...@virtall.com:
 Just had a few of these with 3.13.0-rc8:

 [262162.560701] [ cut here ]
 [262162.560764] WARNING: CPU: 0 PID: 22705 at fs/btrfs/backref.c:936 
 find_parent_nodes+0x378/0x5d2 [btrfs]()
 [262162.560872] Modules linked in: ipt_MASQUERADE iptable_nat 
 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables 
 x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave 
 cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr 
 ehci_pci ehci_hcd button lpc_ich mfd_core i2c_i801 i2c_core video 
 acpi_cpufreq ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata 
 r8169 mii scsi_mod
 [262162.562685] CPU: 0 PID: 22705 Comm: btrfs-endio-wri Not tainted 
 3.13.0-rc8 #2
 [262162.562796] Hardware name: System manufacturer System Product 
 Name/P8H77-M PRO, BIOS 1101 02/04/2013
 [262162.562911]  0009 88009e911a78 813898b7 
 0006
 [262162.563028]   88009e911ab8 810370a9 
 01000846
 [262162.563143]  a02b7a6f 8807e7248090 8807f2d5e000 
 88009e911b40
 [262162.563259] Call Trace:
 [262162.563319]  [813898b7] dump_stack+0x46/0x58
 [262162.563382]  [810370a9] warn_slowpath_common+0x77/0x91
 [262162.563458]  [a02b7a6f] ? find_parent_nodes+0x378/0x5d2 [btrfs]
 [262162.563522]  [810370d8] warn_slowpath_null+0x15/0x17
 [262162.563595]  [a02b7a6f] find_parent_nodes+0x378/0x5d2 [btrfs]
 [262162.563671]  [a02b84dc] iterate_extent_inodes+0xc9/0x1d6 [btrfs]
 [262162.563748]  [a0270981] ? record_extent_backrefs+0xc3/0xc3 
 [btrfs]
 [262162.563824]  [a0270981] ? record_extent_backrefs+0xc3/0xc3 
 [btrfs]
 [262162.563899]  [a02b8668] iterate_inodes_from_logical+0x7f/0x95 
 [btrfs]
 [262162.564022]  [a0270919] record_extent_backrefs+0x5b/0xc3 [btrfs]
 [262162.564098]  [a0278f67] btrfs_finish_ordered_io+0x77a/0x877 
 [btrfs]
 [262162.564221]  [a0279074] finish_ordered_fn+0x10/0x12 [btrfs]
 [262162.564296]  [a02940d9] worker_loop+0x15e/0x495 [btrfs]
 [262162.564371]  [a0293f7b] ? btrfs_queue_worker+0x269/0x269 [btrfs]
 [262162.564437]  [8104ee9a] kthread+0xcd/0xd5
 [262162.564499]  [8104edcd] ? 
 kthread_freezable_should_stop+0x43/0x43
 [262162.564563]  [8138e6fc] ret_from_fork+0x7c/0xb0
 [262162.564626]  [8104edcd] ? 
 kthread_freezable_should_stop+0x43/0x43
 [262162.564689] ---[ end trace 50ae39e01ebd7394 ]---
 [262162.564753] [ cut here ]
 [262162.564825] WARNING: CPU: 0 PID: 22705 at fs/btrfs/backref.c:936 
 find_parent_nodes+0x378/0x5d2 [btrfs]()
 [262162.564939] Modules linked in: ipt_MASQUERADE iptable_nat 
 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables 
 x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave 
 cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr 
 ehci_pci ehci_hcd button lpc_ich mfd_core i2c_i801 i2c_core video 
 acpi_cpufreq ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata 
 r8169 mii scsi_mod
 [262162.565348] CPU: 0 PID: 22705 Comm: btrfs-endio-wri Tainted: GW   
  3.13.0-rc8 #2
 [262162.565461] Hardware name: System manufacturer System Product 
 Name/P8H77-M PRO, BIOS 1101 02/04/2013
 [262162.565575]  0009 88009e911a78 813898b7 
 0006
 [262162.565689]   88009e911ab8 810370a9 
 8807f33be200
 [262162.565804]  a02b7a6f 8807e7248090 8807f2d5e000 
 88009e911b40
 [262162.565920] Call Trace:
 [262162.565975]  [813898b7] dump_stack+0x46/0x58
 [262162.566037]  [810370a9] warn_slowpath_common+0x77/0x91
 [262162.566109]  [a02b7a6f] ? find_parent_nodes+0x378/0x5d2 [btrfs]
 [262162.566174]  [810370d8] warn_slowpath_null+0x15/0x17
 [262162.566247]  [a02b7a6f] find_parent_nodes+0x378/0x5d2 [btrfs]
 [262162.566321]  [a02b84dc] iterate_extent_inodes+0xc9/0x1d6 [btrfs]
 [262162.566396]  [a0270981] ? record_extent_backrefs+0xc3/0xc3 
 [btrfs]
 [262162.566469]  [a0270981] ? record_extent_backrefs+0xc3/0xc3 
 [btrfs]
 [262162.566544]  [a02b8668] iterate_inodes_from_logical+0x7f/0x95 
 [btrfs]
 [262162.57]  [a0270919] record_extent_backrefs+0x5b/0xc3 [btrfs]
 [262162.566742]  [a0278f67] btrfs_finish_ordered_io+0x77a/0x877 
 [btrfs]
 [262162.566866]  [a0279074] finish_ordered_fn+0x10/0x12 [btrfs]
 [262162.566941]  [a02940d9] worker_loop+0x15e/0x495 [btrfs]
 [262162.567015]  [a0293f7b] ? btrfs_queue_worker+0x269/0x269 [btrfs]
 [262162.567080]  [8104ee9a] kthread+0xcd/0xd5
 [262162.567141]  [8104edcd] ? 
 

Re: kernel BUG at fs/btrfs/relocation.c:1062

2013-12-20 Thread Shilong Wang
Hello Tomasz,

There should be some bugs flying on, unitil now, you can try to remount:
something like:

# mount  /dev/sda /mnt -o  skip_balance

It will skip unfinshed balance, hopley it can help you before we give a bug fix.

Thanks,
Wang
2013/12/20 Tomasz Chmielewski t...@virtall.com:
 On Thu, 19 Dec 2013 22:07:37 +0900
 Tomasz Chmielewski t...@virtall.com wrote:

 If it matters, I had to hard reboot after that bug; the balance
 continued after the system booted again and I got this a while later
 (filesystem was remounted read only):

 Actually, looks like the fs is quite hosed now :(

 If I want to do an operation like removing a snapshot (btrfs sub
 del ...), the command returns, there is IO for a few minutes, but then, the
 kernel complains and fs is remounted readonly:

 [ 5111.773900] BTRFS debug (device sdb5): run_one_delayed_ref returned -17
 [ 5111.773902] [ cut here ]
 [ 5111.773957] WARNING: CPU: 0 PID: 15042 at fs/btrfs/super.c:254 
 __btrfs_abort_transaction+0x4d/0xff [btrfs]()
 [ 5111.774047] btrfs: Transaction aborted (error -17)
 [ 5111.774048] Modules linked in: ipt_MASQUERADE iptable_nat 
 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables 
 x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave 
 cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr 
 button video ehci_pci ehci_hcd acpi_cpufreq i2c_i801 i2c_core lpc_ich 
 mfd_core ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata scsi_mod 
 r8169 mii
 [ 5111.774355] CPU: 0 PID: 15042 Comm: btrfs-transacti Tainted: GW
 3.13.0-rc4 #1
 [ 5111.774443] Hardware name: System manufacturer System Product Name/P8H77-M 
 PRO, BIOS 1101 02/04/2013
 [ 5111.774533]  0009 8800374ddc48 8138a37d 
 0006
 [ 5111.774622]  8800374ddc98 8800374ddc88 810370a9 
 8800374ddd80
 [ 5111.774711]  a020d524 ffef 8807ead7d800 
 8807ff0cc8c0
 [ 5111.774800] Call Trace:
 [ 5111.774846]  [8138a37d] dump_stack+0x46/0x58
 [ 5111.774894]  [810370a9] warn_slowpath_common+0x77/0x91
 [ 5111.774944]  [a020d524] ? __btrfs_abort_transaction+0x4d/0xff 
 [btrfs]
 [ 5111.775032]  [81037157] warn_slowpath_fmt+0x41/0x43
 [ 5111.775081]  [a020d524] __btrfs_abort_transaction+0x4d/0xff 
 [btrfs]
 [ 5111.775135]  [a02226ed] btrfs_run_delayed_refs+0x253/0x46f 
 [btrfs]
 [ 5111.775189]  [a022fdec] btrfs_commit_transaction+0x36d/0x7df 
 [btrfs]
 [ 5111.775281]  [a022e345] transaction_kthread+0xef/0x1c2 [btrfs]
 [ 5111.775333]  [a022e256] ? open_ctree+0x1ac7/0x1ac7 [btrfs]
 [ 5111.775382]  [8104ee9a] kthread+0xcd/0xd5
 [ 5111.775428]  [8104edcd] ? kthread_freezable_should_stop+0x43/0x43
 [ 5111.775477]  [8138f17c] ret_from_fork+0x7c/0xb0
 [ 5111.775524]  [8104edcd] ? kthread_freezable_should_stop+0x43/0x43
 [ 5111.775572] ---[ end trace b552aca9a0cff3cb ]---
 [ 5111.775618] BTRFS error (device sdb5) in btrfs_run_delayed_refs:2730: 
 errno=-17 Object already exists
 [ 5111.775707] BTRFS info (device sdb5): forced readonly
 [ 5111.775754] BTRFS warning (device sdb5): Skipping commit of aborted 
 transaction.
 [ 5111.775841] BTRFS error (device sdb5) in cleanup_transaction:1553: 
 errno=-17 Object already exists



 If I run balance (after unmounting and mounting the filesystem), it ends in a 
 similar way:

 [ 5927.338989] btrfs: relocating block group 4647284637696 flags 17
 [ 6184.333629] btrfs: found 983 extents
 [ 6269.512577] BTRFS debug (device sdb5): run_one_delayed_ref returned -17
 [ 6269.512579] [ cut here ]
 [ 6269.512636] WARNING: CPU: 7 PID: 17836 at fs/btrfs/super.c:254 
 __btrfs_abort_transaction+0x4d/0xff [btrfs]()
 [ 6269.512727] btrfs: Transaction aborted (error -17)
 [ 6269.512727] Modules linked in: ipt_MASQUERADE iptable_nat 
 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables 
 x_tables cpufreq_ondemand cpufreq_conservative cpufreq_powersave 
 cpufreq_stats bridge stp llc ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr 
 button video ehci_pci ehci_hcd acpi_cpufreq i2c_i801 i2c_core lpc_ich 
 mfd_core ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata scsi_mod 
 r8169 mii
 [ 6269.513036] CPU: 7 PID: 17836 Comm: btrfs-transacti Tainted: GW
 3.13.0-rc4 #1
 [ 6269.513124] Hardware name: System manufacturer System Product Name/P8H77-M 
 PRO, BIOS 1101 02/04/2013
 [ 6269.513214]  0009 8807c5b07c48 8138a37d 
 0006
 [ 6269.513304]  8807c5b07c98 8807c5b07c88 810370a9 
 8807c5b07d80
 [ 6269.513393]  a020d524 ffef 8807ead7e800 
 8800371eedc0
 [ 6269.513483] Call Trace:
 [ 6269.513529]  [8138a37d] dump_stack+0x46/0x58
 [ 6269.513577]  [810370a9] warn_slowpath_common+0x77/0x91
 [ 6269.513627]  [a020d524] ? 

Re: kernel BUG at fs/btrfs/relocation.c:1062

2013-12-20 Thread Shilong Wang
2013/12/20 Tomasz Chmielewski t...@virtall.com:
 -o  skip_balance - didn't know this.

 Actually, I was able to skip the balance, sort of, with this:

 mount /mnt/btrfs ; btrfs fi balance cancel /mnt/btrfs

From your previous email, i suspend your filesystem is nearly fully.
mount -o skip_balance will avoid balance continuing while use btrfs
balance cancel still
can not avoid balance totally, so i recommend you use skip_balance
option when remounting.




 However, the fs is extremely unstable (will remount read only quite
 fast, i.e. if I start removing snapshot).


 I'm trying to run btrfsck without the --repair option to see if it
 shows anything interesting.


I'd like you not to use btrfsck --repair, it is a little dangerous anyway.
If you still want to try, just use btrfsck is ok to check if your
filesystem is consistent.

Or you can remount with option degraded, recovery(maybe help you  to
make filesystem stable
at least).

Thanks,
Wang


 --
 Tomasz Chmielewski
 http://wpkg.org


 On Fri, 20 Dec 2013 23:52:08 +0800
 Shilong Wang wangshilong1...@gmail.com wrote:

 Hello Tomasz,

 There should be some bugs flying on, unitil now, you can try to
 remount: something like:

 # mount  /dev/sda /mnt -o  skip_balance

 It will skip unfinshed balance, hopley it can help you before we give
 a bug fix.

 Thanks,
 Wang
 2013/12/20 Tomasz Chmielewski t...@virtall.com:
  On Thu, 19 Dec 2013 22:07:37 +0900
  Tomasz Chmielewski t...@virtall.com wrote:
 
  If it matters, I had to hard reboot after that bug; the balance
  continued after the system booted again and I got this a while
  later (filesystem was remounted read only):
 
  Actually, looks like the fs is quite hosed now :(
 
  If I want to do an operation like removing a snapshot (btrfs sub
  del ...), the command returns, there is IO for a few minutes, but
  then, the kernel complains and fs is remounted readonly:
 
  [ 5111.773900] BTRFS debug (device sdb5): run_one_delayed_ref
  returned -17 [ 5111.773902] [ cut here ]
  [ 5111.773957] WARNING: CPU: 0 PID: 15042 at fs/btrfs/super.c:254
  __btrfs_abort_transaction+0x4d/0xff [btrfs]() [ 5111.774047] btrfs:
  Transaction aborted (error -17) [ 5111.774048] Modules linked in:
  ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
  nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cpufreq_ondemand
  cpufreq_conservative cpufreq_powersave cpufreq_stats bridge stp llc
  ipv6 btrfs xor raid6_pq zlib_deflate loop pcspkr button video
  ehci_pci ehci_hcd acpi_cpufreq i2c_i801 i2c_core lpc_ich mfd_core
  ext4 crc16 jbd2 mbcache raid1 sg sd_mod ahci libahci libata
  scsi_mod r8169 mii [ 5111.774355] CPU: 0 PID: 15042 Comm:
  btrfs-transacti Tainted: GW3.13.0-rc4 #1 [ 5111.774443]
  Hardware name: System manufacturer System Product Name/P8H77-M PRO,
  BIOS 1101 02/04/2013 [ 5111.774533]  0009
  8800374ddc48 8138a37d 0006 [ 5111.774622]
  8800374ddc98 8800374ddc88 810370a9 8800374ddd80
  [ 5111.774711]  a020d524 ffef 8807ead7d800
  8807ff0cc8c0 [ 5111.774800] Call Trace: [ 5111.774846]
  [8138a37d] dump_stack+0x46/0x58 [ 5111.774894]
  [810370a9] warn_slowpath_common+0x77/0x91 [ 5111.774944]
  [a020d524] ? __btrfs_abort_transaction+0x4d/0xff [btrfs]
  [ 5111.775032]  [81037157] warn_slowpath_fmt+0x41/0x43
  [ 5111.775081]  [a020d524]
  __btrfs_abort_transaction+0x4d/0xff [btrfs] [ 5111.775135]
  [a02226ed] btrfs_run_delayed_refs+0x253/0x46f [btrfs]
  [ 5111.775189]  [a022fdec]
  btrfs_commit_transaction+0x36d/0x7df [btrfs] [ 5111.775281]
  [a022e345] transaction_kthread+0xef/0x1c2 [btrfs]
  [ 5111.775333]  [a022e256] ? open_ctree+0x1ac7/0x1ac7
  [btrfs] [ 5111.775382]  [8104ee9a] kthread+0xcd/0xd5
  [ 5111.775428]  [8104edcd] ?
  kthread_freezable_should_stop+0x43/0x43 [ 5111.775477]
  [8138f17c] ret_from_fork+0x7c/0xb0 [ 5111.775524]
  [8104edcd] ? kthread_freezable_should_stop+0x43/0x43
  [ 5111.775572] ---[ end trace b552aca9a0cff3cb ]--- [ 5111.775618]
  BTRFS error (device sdb5) in btrfs_run_delayed_refs:2730: errno=-17
  Object already exists [ 5111.775707] BTRFS info (device sdb5):
  forced readonly [ 5111.775754] BTRFS warning (device sdb5):
  Skipping commit of aborted transaction. [ 5111.775841] BTRFS error
  (device sdb5) in cleanup_transaction:1553: errno=-17 Object already
  exists
 
 
 
  If I run balance (after unmounting and mounting the filesystem), it
  ends in a similar way:
 
  [ 5927.338989] btrfs: relocating block group 4647284637696 flags 17
  [ 6184.333629] btrfs: found 983 extents
  [ 6269.512577] BTRFS debug (device sdb5): run_one_delayed_ref
  returned -17 [ 6269.512579] [ cut here ]
  [ 6269.512636] WARNING: CPU: 7 PID: 17836 at fs/btrfs/super.c:254
  __btrfs_abort_transaction+0x4d/0xff [btrfs]() [ 6269.512727] btrfs:
  Transaction aborted

Re: kernel BUG at fs/btrfs/relocation.c:1062

2013-12-20 Thread Shilong Wang
2013/12/21 Tomasz Chmielewski t...@virtall.com:
 On Sat, 21 Dec 2013 00:07:19 +0800
 Shilong Wang wangshilong1...@gmail.com wrote:

 2013/12/20 Tomasz Chmielewski t...@virtall.com:
  -o  skip_balance - didn't know this.
 
  Actually, I was able to skip the balance, sort of, with this:
 
  mount /mnt/btrfs ; btrfs fi balance cancel /mnt/btrfs

 From your previous email, i suspend your filesystem is nearly fully.
 mount -o skip_balance will avoid balance continuing while use btrfs
 balance cancel still
 can not avoid balance totally, so i recommend you use skip_balance
 option when remounting.

 My other thread a few days ago (no space left, metadata usage almost
 full?), if that's what you're referring to, was about a different
 filesystem.

 This one should have enough free space left:

 # btrfs fi
 show /mnt/lxc1 Label: lxc1  uuid: 8d08ad6d-4543-4fe5-8b1b-640dc1423d41
 Total devices 2 FS bytes used 2.02TiB
 devid1 size 2.62TiB used 2.02TiB path /dev/sda5
 devid2 size 2.62TiB used 2.02TiB path /dev/sdb5

 Btrfs v3.12

 # btrfs fi df /mnt/lxc1
 Data, RAID1: total=1.97TiB, used=1.97TiB
 System, RAID1: total=32.00MiB, used=300.00KiB
 Metadata, RAID1: total=50.00GiB, used=49.21GiB


 I've tried using skip_balance - it mounted, but breaks soon after I try
 to remove any snapshot.


ok, It seems that your filesystem is not in consistency anymore.
To confirm your filesystem status, you can try to run btrfsck to check if there
is something wrong.

To get a stable fs now, you can try to mount btrfs with option skip_balance and
recovery, recovery options try to get a previous good tree root which
might help you
get a stable fs.



 --
 Tomasz Chmielewski
 http://wpkg.org
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] open_ctree() failes after mounting BTRFS with 3.2 after 3.12

2013-12-15 Thread Shilong Wang
Hello Philipp,

In order to reproduce your problems.

I compile btrfs in kernel 3.2 and 3.12.

# mkfs.btrfs  -f /dev/sda2
# mount /dev/sda2  /mnt
# dd if=/dev/zero of=/mnt/data bs=1M count=1024
# umount /mnt

The above operations is in 3.2

and then i switch my system into 3.12, and try to mount /dev/sda2
it will output something like the following:

btrfs: mismatching generation and generation_v2 found in root item.
This root was probably mounted with an older kernel. Resetting all new
fields.

But it mounted successfully, then i do some similar operations in
3.12...and umount it.
and switch into kernel version 3.2...

but i failt to mount it in 3.2 and it output the following message.

BTRFS: couldn't mount because of unsupported optional features (20)

It seems that here the problem is because incompat features in newer
kernel and make older kernel
unmountable any more..

But in your problem is that you can still mount on older kernel? but
not new kernel?


Thanks,
Wang

2013/12/14 Philipp Matthias Hahn pmh...@pmhahn.de:
 Hello,

 I'm not subscribed, so please cc: me.

 I've an extermal USB disk using BTRFS used for backups and data sharing
 between multiple systems using different kernels (3.2.xx, 3.12.5). The
 FS fails to mount with 3.12.x, but still works on 3.2.x with the
 following dmesg:

 btrfs: mismatching generation and generation_v2 found in root item. This root 
 was probably mounted with an older kernel. Resetting all new fields.
 btrfs: mismatching generation and generation_v2 found in root item. This root 
 was probably mounted with an older kernel. Resetting all new fields.
 btrfs: open_ctree failed

 Following
 https://btrfs.wiki.kernel.org/index.php/Debugging_Btrfs_with_GDB#Building_UML_Kernel
 here some more debug info. To me it looks like the call to
 btrfs_read_fs_root_no_name() fails.

 $ gdb --args ./linux ubd0=/dev/sdc2
 ...
 (gdb) handle SIGSEGV pass nostop noprint
 (gdb) handle SIGUSR1 pass nostop noprint
 (gdb) break open_ctree
 (gdb) r
 ...
 [0.00] Linux version 3.12.5 (pmhahn@scout) (gcc version 4.8.2 (Debian 
 4.8.2-10) ) #9 Sat Dec 14 09:40:10 CET 2013
 ...
 [1.12] Btrfs loaded, debug=on
 ...
 [1.12]  ubda: unknown partition table
 [1.12] btrfs: device label backup devid 1 transid 5478 /dev/root

 Breakpoint 1, open_ctree (sb=sb@entry=0x61d0e400, fs_devices=0x61d15c80, 
 options=options@entry=0x0) at fs/btrfs/disk-io.c:2096
 2096{
 2093int open_ctree(struct super_block *sb,
 2123tree_root = fs_info-tree_root = btrfs_alloc_root(fs_info);
 2096{
 2123tree_root = fs_info-tree_root = btrfs_alloc_root(fs_info);
 2096{
 2118int num_backups_tried = 0;
 2119int backup_index = 0;
 2123tree_root = fs_info-tree_root = btrfs_alloc_root(fs_info);
 2124chunk_root = fs_info-chunk_root = btrfs_alloc_root(fs_info);
 2125if (!tree_root || !chunk_root) {
 2124chunk_root = fs_info-chunk_root = btrfs_alloc_root(fs_info);
 2125if (!tree_root || !chunk_root) {
 2130ret = init_srcu_struct(fs_info-subvol_srcu);
 2131if (ret) {
 2130ret = init_srcu_struct(fs_info-subvol_srcu);
 2131if (ret) {
 2136ret = setup_bdi(fs_info, fs_info-bdi);
 2156fs_info-btree_inode = new_inode(sb);
 2136ret = setup_bdi(fs_info, fs_info-bdi);
 2142ret = percpu_counter_init(fs_info-dirty_metadata_bytes, 0);
 2147fs_info-dirty_metadata_batch = PAGE_CACHE_SIZE *
 2150ret = percpu_counter_init(fs_info-delalloc_bytes, 0);
 2136ret = setup_bdi(fs_info, fs_info-bdi);
 2156fs_info-btree_inode = new_inode(sb);
 2157if (!fs_info-btree_inode) {
 2156fs_info-btree_inode = new_inode(sb);
 2157if (!fs_info-btree_inode) {
 2162mapping_set_gfp_mask(fs_info-btree_inode-i_mapping, 
 GFP_NOFS);
 2179mutex_init(fs_info-reloc_mutex);
 2162mapping_set_gfp_mask(fs_info-btree_inode-i_mapping, 
 GFP_NOFS);
 2165INIT_LIST_HEAD(fs_info-trans_list);
 2179mutex_init(fs_info-reloc_mutex);
 2164INIT_RADIX_TREE(fs_info-fs_roots_radix, GFP_ATOMIC);
 2165INIT_LIST_HEAD(fs_info-trans_list);
 2166INIT_LIST_HEAD(fs_info-dead_roots);
 2164INIT_RADIX_TREE(fs_info-fs_roots_radix, GFP_ATOMIC);
 2166INIT_LIST_HEAD(fs_info-dead_roots);
 2167INIT_LIST_HEAD(fs_info-delayed_iputs);
 2168INIT_LIST_HEAD(fs_info-delalloc_roots);
 2169INIT_LIST_HEAD(fs_info-caching_block_groups);
 2179mutex_init(fs_info-reloc_mutex);
 2182init_completion(fs_info-kobj_unregister);
 2180seqlock_init(fs_info-profiles_lock);
 2182init_completion(fs_info-kobj_unregister);
 2183INIT_LIST_HEAD(fs_info-dirty_cowonly_roots);
 2186

Fwd: [PATCH] Btrfs: fix a warning when iput a file

2013-12-14 Thread Shilong Wang
CC:linux-btrfs

-- Forwarded message --
From: Shilong Wang wangshilong1...@gmail.com
Date: 2013/12/14
Subject: Re: [PATCH] Btrfs: fix a warning when iput a file
To: bo.li@oracle.com


Hello Liu

2013/12/14 Liu Bo bo.li@oracle.com:
 On Sat, Dec 14, 2013 at 03:27:31PM +0800, Wang Shilong wrote:
 See the warning below:

 [ 1209.102076]  [a04721b9] remove_extent_mapping+0x69/0x70 [btrfs]
 [ 1209.102084]  [a0466b06] btrfs_evict_inode+0x96/0x4d0 [btrfs]
 [ 1209.102089]  [81073010] ? wake_atomic_t_function+0x40/0x40
 [ 1209.102092]  [8118ab2e] evict+0x9e/0x190
 [ 1209.102094]  [8118b313] iput+0xf3/0x180
 [ 1209.102101]  [a0461fd1] btrfs_run_delayed_iputs+0xb1/0xd0 
 [btrfs]
 [ 1209.102107]  [a045d358] __btrfs_end_transaction+0x268/0x350 
 [btrfs]

 clear extent bit here to avoid triggering WARN_ON() in 
 remove_extent_mapping()

 Why PINNED and LOGGING isn't unset as usual?


Flippe sent an improved patch regarding evict inode:
https://patchwork.kernel.org/patch/3204281/

In btrfs_invalidatepages(), if we are evicting an inode, we will skip
usual operation that is operated page by page.

Rather than we do it more effectively, that we operate extent_map and
extent_state structure, this will be faster!

So for now code , after calling truncate_inode_pages(), PINNED and
LOGGED won't be unset
as usual(Maybe a better solution is to move remove_extent_mappin after
we clear extent state).

Correct me if i am wrong here.

Thanks,
Wang

 -liubo


 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
  fs/btrfs/inode.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index e889779..c0f0f9d 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -4515,6 +4515,8 @@ static void evict_inode_truncate_pages(struct inode 
 *inode)

   node = rb_first(map_tree-map);
   em = rb_entry(node, struct extent_map, rb_node);
 + clear_bit(EXTENT_FLAG_PINNED, em-flags);
 + clear_bit(EXTENT_FLAG_LOGGING, em-flags);
   remove_extent_mapping(map_tree, em);
   free_extent_map(em);
   }
 --
 1.8.3.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: wait for ordered extents before removing extent maps

2013-12-14 Thread Shilong Wang
Hello Filipe,

2013/12/14 Filipe David Borba Manana fdman...@gmail.com:
 Wang Shilong got into a case where during inode eviction we were
 removing an extent map while it was pinned. This triggered a warning
 in remove_extent_mapping() because the extent map had the pinned
 flag set:

 [ 1209.102076]  [a04721b9] remove_extent_mapping+0x69/0x70 [btrfs]
 [ 1209.102084]  [a0466b06] btrfs_evict_inode+0x96/0x4d0 [btrfs]
 [ 1209.102089]  [81073010] ? wake_atomic_t_function+0x40/0x40
 [ 1209.102092]  [8118ab2e] evict+0x9e/0x190
 [ 1209.102094]  [8118b313] iput+0xf3/0x180
 [ 1209.102101]  [a0461fd1] btrfs_run_delayed_iputs+0xb1/0xd0 [btrfs]
 [ 1209.102107]  [a045d358] __btrfs_end_transaction+0x268/0x350 
 [btrfs]

 Therefore wait for any pending ordered extents, if any, which will
 trigger calls to unpin_extent_cache(), before removing the extent maps.

 Wang's solution of simply clearing the pinned bit wasn't enough, as after
 unpin_extent_cache() will be called and trigger another WARN_ON() because
 the lookup for the extent map returned NULL.

Why not in evict_inode_truncate_pages() move remove_extent_mapping() after
clear_extent_bit()?

Thanks,
Wang

 Thanks Wang for finding out this.

 Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
 ---
  fs/btrfs/inode.c |5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index e889779..c2933fb 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -4509,6 +4509,9 @@ static void evict_inode_truncate_pages(struct inode 
 *inode)
 ASSERT(inode-i_state  I_FREEING);
 truncate_inode_pages(inode-i_data, 0);

 +   /* do we really want it for -i_nlink  0 and zero btrfs_root_refs? */
 +   btrfs_wait_ordered_range(inode, 0, (u64)-1);
 +
 write_lock(map_tree-lock);
 while (!RB_EMPTY_ROOT(map_tree-map)) {
 struct extent_map *em;
 @@ -4566,8 +4569,6 @@ void btrfs_evict_inode(struct inode *inode)
 btrfs_orphan_del(NULL, inode);
 goto no_delete;
 }
 -   /* do we really want it for -i_nlink  0 and zero btrfs_root_refs? */
 -   btrfs_wait_ordered_range(inode, 0, (u64)-1);

 if (root-fs_info-log_root_recovering) {
 BUG_ON(test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
 --
 1.7.9.5

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: wait for ordered extents before removing extent maps

2013-12-14 Thread Shilong Wang
2013/12/14 Filipe David Manana fdman...@gmail.com:
 On Sat, Dec 14, 2013 at 2:56 PM, Shilong Wang wangshilong1...@gmail.com 
 wrote:
 Hello Filipe,

 2013/12/14 Filipe David Borba Manana fdman...@gmail.com:
 Wang Shilong got into a case where during inode eviction we were
 removing an extent map while it was pinned. This triggered a warning
 in remove_extent_mapping() because the extent map had the pinned
 flag set:

 [ 1209.102076]  [a04721b9] remove_extent_mapping+0x69/0x70 [btrfs]
 [ 1209.102084]  [a0466b06] btrfs_evict_inode+0x96/0x4d0 [btrfs]
 [ 1209.102089]  [81073010] ? wake_atomic_t_function+0x40/0x40
 [ 1209.102092]  [8118ab2e] evict+0x9e/0x190
 [ 1209.102094]  [8118b313] iput+0xf3/0x180
 [ 1209.102101]  [a0461fd1] btrfs_run_delayed_iputs+0xb1/0xd0 
 [btrfs]
 [ 1209.102107]  [a045d358] __btrfs_end_transaction+0x268/0x350 
 [btrfs]

 Therefore wait for any pending ordered extents, if any, which will
 trigger calls to unpin_extent_cache(), before removing the extent maps.

 Wang's solution of simply clearing the pinned bit wasn't enough, as after
 unpin_extent_cache() will be called and trigger another WARN_ON() because
 the lookup for the extent map returned NULL.

 Why not in evict_inode_truncate_pages() move remove_extent_mapping() after
 clear_extent_bit()?

 So, if the pinned bit is set, it means some task will clear it later,
 via unpin_extent_cache(). And if you look at that function, it has
 this:

 write_lock(tree-lock);
 em = lookup_extent_mapping(tree, start, len);

 WARN_ON(!em || em-start != start);

 And remove_extent_mapping() will remove the em from the rbtree,
 regardless of its reference count value, therefore triggering that
 warning above.

Here i mean,  in evict_inode_truncate_pages()
We change it to:

Step1:  unpin_extent_cache()
Step2:  remove it from extent_mapping

Dose this cause any problems? i am a little confused, correct me if i
am wrong some places^_^.



 Does it makes sense?

 thanks


 Thanks,
 Wang

 Thanks Wang for finding out this.

 Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
 ---
  fs/btrfs/inode.c |5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index e889779..c2933fb 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -4509,6 +4509,9 @@ static void evict_inode_truncate_pages(struct inode 
 *inode)
 ASSERT(inode-i_state  I_FREEING);
 truncate_inode_pages(inode-i_data, 0);

 +   /* do we really want it for -i_nlink  0 and zero btrfs_root_refs? 
 */
 +   btrfs_wait_ordered_range(inode, 0, (u64)-1);
 +
 write_lock(map_tree-lock);
 while (!RB_EMPTY_ROOT(map_tree-map)) {
 struct extent_map *em;
 @@ -4566,8 +4569,6 @@ void btrfs_evict_inode(struct inode *inode)
 btrfs_orphan_del(NULL, inode);
 goto no_delete;
 }
 -   /* do we really want it for -i_nlink  0 and zero btrfs_root_refs? 
 */
 -   btrfs_wait_ordered_range(inode, 0, (u64)-1);

 if (root-fs_info-log_root_recovering) {
 BUG_ON(test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
 --
 1.7.9.5

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



 --
 Filipe David Manana,

 Reasonable men adapt themselves to the world.
  Unreasonable men adapt the world to themselves.
  That's why all progress depends on unreasonable men.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: fix deadlock when iterating inode refs and running delayed inodes

2013-12-14 Thread Shilong Wang
Hello Filipe,

I mean no hurt, but your changelog is really too long.
I think pasting some of dmesg is ok...


Thanks,
Wang
2013/12/15 Filipe David Borba Manana fdman...@gmail.com:
 While running btrfs/004 from xfstests, after 503 iterations, dmesg reported
 a deadlock between tasks iterating inode refs and tasks running delayed inodes
 (during a transaction commit).

 It turns out that iterating inode refs implies doing one tree search and
 release all nodes in the path except the leaf node, and then passing that
 leaf node to btrfs_ref_to_path(), which in turn does another tree search
 without releasing the lock on the leaf node it received as parameter.

 This is a problem when other task wants to write to the btree as well and
 ends up updating the leaf that is read locked - the writer task locks the
 parent of the leaf and then blocks waiting for the leaf's lock to be
 released - at the same time, the task executing btrfs_ref_to_path()
 does a second tree search, without releasing the lock on the first leaf,
 and wants to access a leaf (the same or another one) that is a child of
 the same parent, resulting in a deadlock.

 The trace reported by lockdep follows.

 [84314.936373] INFO: task fsstress:11930 blocked for more than 120 seconds.
 [84314.936381]   Tainted: GW  O 3.12.0-fdm-btrfs-next-16+ #70
 [84314.936383] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [84314.936386] fsstressD 8806e1bf8000 0 11930  11926 
 0x
 [84314.936393]  8804d6d89b78 0046 8804d6d89b18 
 810bd8bd
 [84314.936399]  8806e1bf8000 8804d6d89fd8 8804d6d89fd8 
 8804d6d89fd8
 [84314.936405]  880806308000 8806e1bf8000 8804d6d89c08 
 8804deb8f190
 [84314.936410] Call Trace:
 [84314.936421]  [810bd8bd] ? trace_hardirqs_on+0xd/0x10
 [84314.936428]  [81774269] schedule+0x29/0x70
 [84314.936451]  [a0715bf5] btrfs_tree_lock+0x75/0x270 [btrfs]
 [84314.936457]  [810715c0] ? __init_waitqueue_head+0x60/0x60
 [84314.936470]  [a06ba231] btrfs_search_slot+0x7f1/0x930 [btrfs]
 [84314.936489]  [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 
 [btrfs]
 [84314.936504]  [a06d2e1f] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
 [84314.936510]  [810bd6ef] ? trace_hardirqs_on_caller+0x1f/0x1e0
 [84314.936528]  [a073173c] __btrfs_update_delayed_inode+0x4c/0x1d0 
 [btrfs]
 [84314.936543]  [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 
 [btrfs]
 [84314.936558]  [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 
 [btrfs]
 [84314.936573]  [a0731c82] __btrfs_run_delayed_items+0x192/0x1e0 
 [btrfs]
 [84314.936589]  [a0731d03] btrfs_run_delayed_items+0x13/0x20 [btrfs]
 [84314.936604]  [a06dbcd4] btrfs_flush_all_pending_stuffs+0x24/0x80 
 [btrfs]
 [84314.936620]  [a06ddc13] btrfs_commit_transaction+0x223/0xa20 
 [btrfs]
 [84314.936630]  [a06ae5ae] btrfs_sync_fs+0x6e/0x110 [btrfs]
 [84314.936635]  [811d0b50] ? __sync_filesystem+0x60/0x60
 [84314.936639]  [811d0b50] ? __sync_filesystem+0x60/0x60
 [84314.936643]  [811d0b70] sync_fs_one_sb+0x20/0x30
 [84314.936648]  [811a3541] iterate_supers+0xf1/0x100
 [84314.936652]  [811d0c45] sys_sync+0x55/0x90
 [84314.936658]  [8177ef12] system_call_fastpath+0x16/0x1b
 [84314.936660] INFO: lockdep is turned off.
 [84314.936663] INFO: task btrfs:11955 blocked for more than 120 seconds.
 [84314.93]   Tainted: GW  O 3.12.0-fdm-btrfs-next-16+ #70
 [84314.936668] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [84314.936670] btrfs   D 880541729a88 0 11955  11608 
 0x
 [84314.936674]  880541729a38 0046 8805417299d8 
 810bd8bd
 [84314.936680]  88075430c8a0 880541729fd8 880541729fd8 
 880541729fd8
 [84314.936685]  81c104e0 88075430c8a0 8804de8b00b8 
 8804de8b
 [84314.936690] Call Trace:
 [84314.936695]  [810bd8bd] ? trace_hardirqs_on+0xd/0x10
 [84314.936700]  [81774269] schedule+0x29/0x70
 [84314.936717]  [a0715815] btrfs_tree_read_lock+0xd5/0x140 [btrfs]
 [84314.936721]  [810715c0] ? __init_waitqueue_head+0x60/0x60
 [84314.936733]  [a06ba201] btrfs_search_slot+0x7c1/0x930 [btrfs]
 [84314.936746]  [a06bd505] btrfs_find_item+0x55/0x160 [btrfs]
 [84314.936763]  [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs]
 [84314.936780]  [a073c9ca] btrfs_ref_to_path+0xba/0x1e0 [btrfs]
 [84314.936797]  [a06f9719] ? release_extent_buffer+0xb9/0xe0 [btrfs]
 [84314.936813]  [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs]
 [84314.936830]  [a073cb50] inode_to_path+0x60/0xd0 [btrfs]
 [84314.936846]  [a073d365] paths_from_inode+0x115/0x3c0 [btrfs]
 [84314.936851]  [8118dd44] ? kmem_cache_alloc_trace+0x114/0x200
 

Re: [PATCH] Btrfs: fix deadlock when iterating inode refs and running delayed inodes

2013-12-14 Thread Shilong Wang
Hello Filipe,

I mean no harm, but this patch's changelog is too long..
I think pasting some of dmesg is ok...

Thanks,
Wang

2013/12/15 Filipe David Borba Manana fdman...@gmail.com:
 While running btrfs/004 from xfstests, after 503 iterations, dmesg reported
 a deadlock between tasks iterating inode refs and tasks running delayed inodes
 (during a transaction commit).

 It turns out that iterating inode refs implies doing one tree search and
 release all nodes in the path except the leaf node, and then passing that
 leaf node to btrfs_ref_to_path(), which in turn does another tree search
 without releasing the lock on the leaf node it received as parameter.

 This is a problem when other task wants to write to the btree as well and
 ends up updating the leaf that is read locked - the writer task locks the
 parent of the leaf and then blocks waiting for the leaf's lock to be
 released - at the same time, the task executing btrfs_ref_to_path()
 does a second tree search, without releasing the lock on the first leaf,
 and wants to access a leaf (the same or another one) that is a child of
 the same parent, resulting in a deadlock.

 The trace reported by lockdep follows.

 [84314.936373] INFO: task fsstress:11930 blocked for more than 120 seconds.
 [84314.936381]   Tainted: GW  O 3.12.0-fdm-btrfs-next-16+ #70
 [84314.936383] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [84314.936386] fsstressD 8806e1bf8000 0 11930  11926 
 0x
 [84314.936393]  8804d6d89b78 0046 8804d6d89b18 
 810bd8bd
 [84314.936399]  8806e1bf8000 8804d6d89fd8 8804d6d89fd8 
 8804d6d89fd8
 [84314.936405]  880806308000 8806e1bf8000 8804d6d89c08 
 8804deb8f190
 [84314.936410] Call Trace:
 [84314.936421]  [810bd8bd] ? trace_hardirqs_on+0xd/0x10
 [84314.936428]  [81774269] schedule+0x29/0x70
 [84314.936451]  [a0715bf5] btrfs_tree_lock+0x75/0x270 [btrfs]
 [84314.936457]  [810715c0] ? __init_waitqueue_head+0x60/0x60
 [84314.936470]  [a06ba231] btrfs_search_slot+0x7f1/0x930 [btrfs]
 [84314.936489]  [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 
 [btrfs]
 [84314.936504]  [a06d2e1f] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
 [84314.936510]  [810bd6ef] ? trace_hardirqs_on_caller+0x1f/0x1e0
 [84314.936528]  [a073173c] __btrfs_update_delayed_inode+0x4c/0x1d0 
 [btrfs]
 [84314.936543]  [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 
 [btrfs]
 [84314.936558]  [a0731c2a] ? __btrfs_run_delayed_items+0x13a/0x1e0 
 [btrfs]
 [84314.936573]  [a0731c82] __btrfs_run_delayed_items+0x192/0x1e0 
 [btrfs]
 [84314.936589]  [a0731d03] btrfs_run_delayed_items+0x13/0x20 [btrfs]
 [84314.936604]  [a06dbcd4] btrfs_flush_all_pending_stuffs+0x24/0x80 
 [btrfs]
 [84314.936620]  [a06ddc13] btrfs_commit_transaction+0x223/0xa20 
 [btrfs]
 [84314.936630]  [a06ae5ae] btrfs_sync_fs+0x6e/0x110 [btrfs]
 [84314.936635]  [811d0b50] ? __sync_filesystem+0x60/0x60
 [84314.936639]  [811d0b50] ? __sync_filesystem+0x60/0x60
 [84314.936643]  [811d0b70] sync_fs_one_sb+0x20/0x30
 [84314.936648]  [811a3541] iterate_supers+0xf1/0x100
 [84314.936652]  [811d0c45] sys_sync+0x55/0x90
 [84314.936658]  [8177ef12] system_call_fastpath+0x16/0x1b
 [84314.936660] INFO: lockdep is turned off.
 [84314.936663] INFO: task btrfs:11955 blocked for more than 120 seconds.
 [84314.93]   Tainted: GW  O 3.12.0-fdm-btrfs-next-16+ #70
 [84314.936668] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [84314.936670] btrfs   D 880541729a88 0 11955  11608 
 0x
 [84314.936674]  880541729a38 0046 8805417299d8 
 810bd8bd
 [84314.936680]  88075430c8a0 880541729fd8 880541729fd8 
 880541729fd8
 [84314.936685]  81c104e0 88075430c8a0 8804de8b00b8 
 8804de8b
 [84314.936690] Call Trace:
 [84314.936695]  [810bd8bd] ? trace_hardirqs_on+0xd/0x10
 [84314.936700]  [81774269] schedule+0x29/0x70
 [84314.936717]  [a0715815] btrfs_tree_read_lock+0xd5/0x140 [btrfs]
 [84314.936721]  [810715c0] ? __init_waitqueue_head+0x60/0x60
 [84314.936733]  [a06ba201] btrfs_search_slot+0x7c1/0x930 [btrfs]
 [84314.936746]  [a06bd505] btrfs_find_item+0x55/0x160 [btrfs]
 [84314.936763]  [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs]
 [84314.936780]  [a073c9ca] btrfs_ref_to_path+0xba/0x1e0 [btrfs]
 [84314.936797]  [a06f9719] ? release_extent_buffer+0xb9/0xe0 [btrfs]
 [84314.936813]  [a06ff689] ? free_extent_buffer+0x49/0xc0 [btrfs]
 [84314.936830]  [a073cb50] inode_to_path+0x60/0xd0 [btrfs]
 [84314.936846]  [a073d365] paths_from_inode+0x115/0x3c0 [btrfs]
 [84314.936851]  [8118dd44] ? kmem_cache_alloc_trace+0x114/0x200
 

Re: [PATCH 1/2] Btrfs: fix an oops when doing balance relocation

2013-12-10 Thread Shilong Wang
2013/12/10 Wang Shilong wangsl.f...@cn.fujitsu.com:
 On 12/10/2013 12:14 AM, Wang Shilong wrote:

 From: Wang Shilong wangsl.f...@cn.fujitsu.com

 I hit a BUG_ON() when inserting reloc root into
 rc-reloc_root_tree.rb_root,
 the fact is block bytenr has been inserted before, this is really a
 redicious
 bug, the reason is that we freed root node after we have allocated root
 node block,and this block bytenr will be reused and then oops happens.


By hacking the code (force cow for relocation tree),  BUG_ON() can be triggered
easily.

The problem is that reloc root bytenr inserted into @reloc_root_tree
is not within
one transaction. root block node can be cowed while it is still in
@reloc_root_tree
and  previous relocation root block can be reused and this oops happen.

We can fix it by updating relocation root node in @reloc_root_tree
when cowing block,
Patch has been make and hopely it can fix this problem, will send v2 later. ^_^

 oops, this is wrong, sorry for noise.

Previously, i really made mistakes, free_extent_buffer() even won't
free tree block,
it just dec refs and free memory if possible.

Thanks,
Wang


 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
   fs/btrfs/relocation.c | 1 -
   1 file changed, 1 deletion(-)

 diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
 index ce459a7..4dc7f26 100644
 --- a/fs/btrfs/relocation.c
 +++ b/fs/btrfs/relocation.c
 @@ -1361,7 +1361,6 @@ static struct btrfs_root *create_reloc_root(struct
 btrfs_trans_handle *trans,
 }
 btrfs_tree_unlock(eb);
 -   free_extent_buffer(eb);
 ret = btrfs_insert_root(trans, root-fs_info-tree_root,
 root_key, root_item);


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: fix generation mismatch when scrubbing supers

2013-12-02 Thread Shilong Wang
Hello Sebastian,

Would you please apply this patch in your test box and see if the
problem still exist.

Thanks,
Wang
2013/12/3 Wang Shilong wangshilong1...@gmail.com:
 From: Wang Shilong wangsl.f...@cn.fujitsu.com

 We came a race condition when scrubbing superblocks, the story is:

 In commiting transaction, we will update last_trans_commited after
 writting superblocks. if a scrub start after writting superblocks
 and before last_trans_commited, generation mismatch happens!

 We fix it by protecting writting superblock and updating last_trans_commited
 with tree_log_mutex.

 Reported-by: Sebastian Ochmann ochm...@informatik.uni-bonn.de
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
 ---
  fs/btrfs/scrub.c   | 11 +++
  fs/btrfs/transaction.c | 13 ++---
  2 files changed, 17 insertions(+), 7 deletions(-)

 diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
 index 561e2f1..afa2f01 100644
 --- a/fs/btrfs/scrub.c
 +++ b/fs/btrfs/scrub.c
 @@ -2932,12 +2932,15 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, 
 u64 devid, u64 start,
 atomic_inc(fs_info-scrubs_running);
 mutex_unlock(fs_info-scrub_lock);

 +   /*
 +* by holding tree_log_mutex while scrubbing supers
 +* we can avoid generation mismatch. See comments in commiting
 +* transaction when updating last_trans_commited.
 +*/
 if (!is_dev_replace) {
 -   /*
 -* by holding device list mutex, we can
 -* kick off writing super in log tree sync.
 -*/
 +   mutex_lock(fs_info-tree_log_mutex);
 ret = scrub_supers(sctx, dev);
 +   mutex_unlock(fs_info-tree_log_mutex);
 }
 mutex_unlock(fs_info-fs_devices-device_list_mutex);

 diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
 index c6a872a..052eb22 100644
 --- a/fs/btrfs/transaction.c
 +++ b/fs/btrfs/transaction.c
 @@ -1898,15 +1898,22 @@ int btrfs_commit_transaction(struct 
 btrfs_trans_handle *trans,
 goto cleanup_transaction;
 }

 +   btrfs_finish_extent_commit(trans, root);
 +
 +   /*
 +* we must gurantee last_trans_commited update is protected by
 +* tree_log_mutex with write_ctree_super together, otherwise,
 +* scubbing super will come in before updating last_trans_commited
 +* and we will get generation mismatch when scrubbing superblocks.
 +*/
 +   root-fs_info-last_trans_committed = cur_trans-transid;
 +
 /*
  * the super is written, we can safely allow the tree-loggers
  * to go about their business
  */
 mutex_unlock(root-fs_info-tree_log_mutex);

 -   btrfs_finish_extent_commit(trans, root);
 -
 -   root-fs_info-last_trans_committed = cur_trans-transid;
 /*
  * We needn't acquire the lock here because there is no other task
  * which can change it.
 --
 1.8.4

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: 2 errors when scrubbing - but I don't know what they mean

2013-11-30 Thread Shilong Wang
cc: linux-btrfs

-- Forwarded message --
From: Shilong Wang wangshilong1...@gmail.com
Date: 2013/12/1
Subject: Re: 2 errors when scrubbing - but I don't know what they mean
To: Sebastian Ochmann ochm...@informatik.uni-bonn.de


Hello Sebastian,

2013/11/30 Sebastian Ochmann ochm...@informatik.uni-bonn.de:
 Hello,

 thank you for your input. I didn't know that btrfs keeps the error counters
 over mounts/reboots, but that's nice.

 I'm still trying to figure out how such a generation error may occur in the
 first place. One thing I noticed looking at the btrfs code is that the
 generation error counter will only get incremented in the actual scrubbing
 code (either in scrub_checksum_super or in scrub_handle_errored_block,
 both in scrub.c - please correct me if I'm wrong, I'm not a btrfs dev).

Right, Scrub will read superblock with bio rather than using pagecaches.
This mean we will reread superblock from disks, if a checksum mismatch happens,
This can be the following reasons:

1.some read errors happen while scrubing, while superblocks are actually good
2.during last transaction, when we are trying to write superblocks to
disk, some silent corruption
   happens.
3.some unexpected operation write data to superblocks directly, for
example..'dd if=/dev/zero'
of=/dev/ seek=65536   count=4k' something like this.

Actually, during boot time, superblock should be fine, because will do
checksum check
when trying to using superblock. if checksum mismatch, we will refuse
to mount, After mounting,
these superblocks should be cached in memory until you umouting filesystem.

So ideal thing is your disk is fine, and during next transaction,
superblocks will be rewritten.
and during next umounting, you can mounting filesystem successfully!

However, if you find such superblocks checksum mismatch very often
during scrub, it maybe
there are something wrong with disk!

 Also, the dmesg errors I saw were not there at boot time, but about 10
 minutes after boot which was about the time when I started the scrub so I'm
 pretty sure that it was the scrub that detected the errors.

 The question remains what can cause superblock/gen errors. Sure it could be
 some read error, but I'd really like to make sure that it's not a
 systematic error. I wasn't able to reproduce it yet though.

You can reproduce this by doing 'dd if=/dev/zero of=/dev/sd*
seek=65536 count=4k' before
btrfs scrubing.

Thanks,
Wang

 Best
 Sebastian

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] btrfs-progs: btrfs send does not work inside mounted non root subvolume

2013-11-29 Thread Shilong Wang
Hello Thomas,

2013/11/29 Thomas Scheiblauer t...@sharkbay.at:
 When operating from inside a mounted btrfs subvolume which is not the
 root subvolume btrfs send exits with the error:
 ERROR: open subvolume path failed. No such file or directory

 Steps to reproduce:
  1. mkfs.btrfs /dev/sda1
  2. mount /dev/sda1 /mnt
  3. btrfs subvolume create /mnt/foo
  4. umount /mnt
  5. mount -o subvol=foo /dev/sda1 /mnt
  6. btrfs subvolume snapshot -r /mnt /mnt/snap
  7. btrfs send /mnt/snap  /dev/null
 - ERROR: open foo/snap failed. No such file or directory

The problem is that we will try to use btrfs internal path(foo/snap)
to open the sub-volume, this is not right here, i will send a patch for this,
thanks for reporting.

Thanks,
Wang



 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 10/12] Btrfs-progs: add '--block-size' option to control print result

2013-10-08 Thread Shilong Wang
Thanks for finding this, the problem comes to patch [v3 9/12].
When updating max columns len of child_qgroup, i miswrite qgroup-member to
qgroup-parent, i have updated this patch and send a v4, it can be
appiled without conflicts with later.

Thanks,
Wang
2013/10/8 David Sterba dste...@suse.cz:
 On Mon, Oct 07, 2013 at 03:21:46PM +0800, Wang Shilong wrote:
 You can use it like:
   btrfs qgroup show --block-size=m mnt

 Here, block size supports k/K/m/M/g/G/t/T/p/P/e/E.

 There is no distinction between the 1000 and 1024 based prefixes, also
 no way to get the raw values in bytes. I don't have a suggestion how to
 do that, merely letting you know that this could go separately (this and
 the -h patch, the rest shall be integrated).

 Also, the numbers in the table should be aligned to the right:

 $ btrfs qgroup show -h -p /mnt/
 qgroupid rfer  excl  parent
      --
 0/5  900.00KiB 900.00KiB ---
 0/267688.00KiB 12.00KiB  1/5
 0/268684.00KiB 8.00KiB   1/5
 0/2696.71GiB   4.00KiB   1/1
 0/2776.71GiB   4.00KiB   1/1
 0/27839.74GiB  39.74GiB  1/2
 1/1  6.71GiB   6.71GiB   ---
 1/2  39.74GiB  39.74GiB  ---
 1/5  696.00KiB 696.00KiB ---


 david
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 10/12] Btrfs-progs: add '--block-size' option to control print result

2013-10-08 Thread Shilong Wang
Sorry, This should to reply to the bug that you find, not this thread.
Anyway, you are smart enough to get this .

2013/10/9 Shilong Wang wangshilong1...@gmail.com:
 Thanks for finding this, the problem comes to patch [v3 9/12].
 When updating max columns len of child_qgroup, i miswrite qgroup-member to
 qgroup-parent, i have updated this patch and send a v4, it can be
 appiled without conflicts with later.

 Thanks,
 Wang
 2013/10/8 David Sterba dste...@suse.cz:
 On Mon, Oct 07, 2013 at 03:21:46PM +0800, Wang Shilong wrote:
 You can use it like:
   btrfs qgroup show --block-size=m mnt

 Here, block size supports k/K/m/M/g/G/t/T/p/P/e/E.

 There is no distinction between the 1000 and 1024 based prefixes, also
 no way to get the raw values in bytes. I don't have a suggestion how to
 do that, merely letting you know that this could go separately (this and
 the -h patch, the rest shall be integrated).

 Also, the numbers in the table should be aligned to the right:

 $ btrfs qgroup show -h -p /mnt/
 qgroupid rfer  excl  parent
      --
 0/5  900.00KiB 900.00KiB ---
 0/267688.00KiB 12.00KiB  1/5
 0/268684.00KiB 8.00KiB   1/5
 0/2696.71GiB   4.00KiB   1/1
 0/2776.71GiB   4.00KiB   1/1
 0/27839.74GiB  39.74GiB  1/2
 1/1  6.71GiB   6.71GiB   ---
 1/2  39.74GiB  39.74GiB  ---
 1/5  696.00KiB 696.00KiB ---


 david
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 10/12] Btrfs-progs: add '--block-size' option to control print result

2013-10-08 Thread Shilong Wang
Hi David,

2013/10/8 David Sterba dste...@suse.cz:
 On Mon, Oct 07, 2013 at 03:21:46PM +0800, Wang Shilong wrote:
 You can use it like:
   btrfs qgroup show --block-size=m mnt

 Here, block size supports k/K/m/M/g/G/t/T/p/P/e/E.

 There is no distinction between the 1000 and 1024 based prefixes, also
 no way to get the raw values in bytes. I don't have a suggestion how to
 do that, merely letting you know that this could go separately (this and
 the -h patch, the rest shall be integrated).

I implement this like the command 'du'.

In default, we print result in bytes. And block size don't give a byte
unit implicitly.Aslo i don't know why we need to distinct 1000 and
1024, i don't have any ideas about this.


 Also, the numbers in the table should be aligned to the right:

Yes, this should be fixed.

Thanks,
Wang

 $ btrfs qgroup show -h -p /mnt/
 qgroupid rfer  excl  parent
      --
 0/5  900.00KiB 900.00KiB ---
 0/267688.00KiB 12.00KiB  1/5
 0/268684.00KiB 8.00KiB   1/5
 0/2696.71GiB   4.00KiB   1/1
 0/2776.71GiB   4.00KiB   1/1
 0/27839.74GiB  39.74GiB  1/2
 1/1  6.71GiB   6.71GiB   ---
 1/2  39.74GiB  39.74GiB  ---
 1/5  696.00KiB 696.00KiB ---


 david
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


about btrfs patches

2013-03-01 Thread Shilong Wang
Hello Josef,

 Could you please pull these patches:

https://patchwork.kernel.org/patch/2192901/
https://patchwork.kernel.org/patch/2192911/
https://patchwork.kernel.org/patch/2192921/

The first one is definitely a  bug that needs to be fixed,
the second one is  to remove unnecessary BUG_ON, the last one is a clean up..

Is there any problems with these patches? Please let me know..


Thanks,
Wang
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs-progs: fix parse_limit function to return errors when parsing unit

2013-02-28 Thread Shilong Wang
Hello, Eric, David

2013/2/27 Eric Sandeen sand...@redhat.com:
 On 2/27/13 9:52 AM, David Sterba wrote:
 On Thu, Feb 21, 2013 at 08:26:15PM +0800, Wang Shilong wrote:
 From: Wang Shilong wangsl-f...@cn.fujitsu.com

 Steps to reproduce:
  btrfs qgroup limit m mnt/subv

 Here, unit(k/K/g/G/m/M/t/T) all will trigger the problem.
 For the above command, the original code will parse the limit value as 0
 and return successfully.It is wrong,fix it.

 I don't think we should allow to accept bare units for specifing
 numbers, so in that case fail with 'unrecognized limit'.

 I agree with that.

 Maybe since David didn't quite want to take my patch 01/17, and Zach
 thought we needed better code for it all, and Goffredo didn't like
 the exit() from parse_size, and now this problem, we need a single,
 concerted effort to fix up size parsing.

I Agre exit() should not be used in parse_size...
But i don't agree with Zach about better code for it...

Maybe like this:

int parse_size(const char *s, u64 * size)..

Many places that use the function parse_size/parse_limit,
they have to be corrected after change

Thanks,
Wang


 Who knew it was so hard? :)

 -Eric

 david
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 14/17] btrfs-progs: fix mem leak in resolve_root

2013-02-27 Thread Shilong Wang
Hello, Eric, David

2013/2/27 David Sterba dste...@suse.cz:
 On Mon, Feb 25, 2013 at 10:36:41PM -0600, Eric Sandeen wrote:
 On 2/25/13 6:36 PM, Shilong Wang wrote:
  --- a/btrfs-list.c
  +++ b/btrfs-list.c
  @@ -568,8 +568,10 @@ static int resolve_root(struct root_lookup *rl, 
  struct root_info *ri,
  * ref_tree = 0 indicates the subvolumes
  * has been deleted.
  */
  -   if (!found-ref_tree)
  +   if (!found-ref_tree) {
  +   free(full_path);
  return -ENOENT;
  +   }
  int add_len = strlen(found-path);
 
  /* room for / and for null */
  @@ -612,8 +614,10 @@ static int resolve_root(struct root_lookup *rl, 
  struct root_info *ri,
  * subvolume was deleted.
  */
  found = root_tree_search(rl, next);
  -   if (!found)
  +   if (!found) {
  +   free(full_path);
  return -ENOENT;
  +   }
  }
 
  ri-full_path = full_path;
  --
  1.7.1
 
  I think the patch is wrong;
  Here we return ENOENT, it means a subvolume/snapshot deletion happens.
  We just filter them in the filter_root, But the free work is done by
  the function all_subvolume_free..
  so your modification will cause a double free..

 Thanks for the review.  I'll admit that when looking at too many of
 these static checker reports, sometimes things look obvious when
 they are actually subtle, and I've certainly made mistakes before. :)

 However, full_path location doesn't seem to be available outside the
 scope of this function unless we exit normally and do:

 ri-full_path = full_path;

 return 0;
 }

 If we exit early at the -ENOENT points, it seems that full_path
 is leaked; there are no other references to it.

I looked the code carefully, i was wrong before..
Agree with the patch

Thanks,
Wang

 I agree with you, the freed value is local.

 david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 14/17] btrfs-progs: fix mem leak in resolve_root

2013-02-25 Thread Shilong Wang
Hello, Eric

2013/2/26 Eric Sandeen sand...@redhat.com:
 If we exit with error we must free the allocated memory
 to avoid a leak.

 Signed-off-by: Eric Sandeen sand...@redhat.com
 ---
  btrfs-list.c |8 ++--
  1 files changed, 6 insertions(+), 2 deletions(-)

 diff --git a/btrfs-list.c b/btrfs-list.c
 index 851c059..8c3f84d 100644
 --- a/btrfs-list.c
 +++ b/btrfs-list.c
 @@ -568,8 +568,10 @@ static int resolve_root(struct root_lookup *rl, struct 
 root_info *ri,
 * ref_tree = 0 indicates the subvolumes
 * has been deleted.
 */
 -   if (!found-ref_tree)
 +   if (!found-ref_tree) {
 +   free(full_path);
 return -ENOENT;
 +   }
 int add_len = strlen(found-path);

 /* room for / and for null */
 @@ -612,8 +614,10 @@ static int resolve_root(struct root_lookup *rl, struct 
 root_info *ri,
 * subvolume was deleted.
 */
 found = root_tree_search(rl, next);
 -   if (!found)
 +   if (!found) {
 +   free(full_path);
 return -ENOENT;
 +   }
 }

 ri-full_path = full_path;
 --
 1.7.1

I think the patch is wrong;
Here we return ENOENT, it means a subvolume/snapshot deletion happens.
We just filter them in the filter_root, But the free work is done by
the function all_subvolume_free..
so your modification will cause a double free..

Thanks,
Wang


 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Btrfs: create the qgroup that limits root subvolume automatically

2013-02-22 Thread Shilong Wang
2013/2/22 Arne Jansen sensi...@gmx.net:
 On 02/22/13 13:02, Wang Shilong wrote:
 From: Wang Shilong wangsl-f...@cn.fujitsu.com

 Creating the root subvolume qgroup when enabling quota,with

 Why only create a qgroup for the root subvolume and not for
 every existing subvolume?


Yes,You are right.
Creating all the existed subvolume  qgroup is necessary when enabling
quota since we try to prevent
creating  group level 0...the subvolume/snapshot group should be
operated automatically...

Atfer this work.
I think it is necessary to delete the subvolume/snapshot qgroup  as
the deletion of sub volume/snapshot.

BTW, there is a thing  to think about...
During enabling quota...No new subvolume should be created before the
enabling quota is done.

I will try to implement such functions.

 this patch,it will be ok to limit the whole filesystem size.

 This will not limit the whole filesystem, but only the root
 subvolume. To limit the whole filesystem you'd have to create
 a level 1 qgroup and add all subvolumes to it.


Right, thanks for correcting it...

Thanks,
Wang


 -Arne


 Signed-off-by: Wang Shilong wangsl-f...@cn.fujitsu.com
 Reviewed-by: Miao Xie mi...@cn.fujitsu.com
 Cc: Arne Jansen sensi...@gmx.net
 ---
  fs/btrfs/qgroup.c |   12 
  1 files changed, 12 insertions(+), 0 deletions(-)

 diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
 index a5c8562..c409096 100644
 --- a/fs/btrfs/qgroup.c
 +++ b/fs/btrfs/qgroup.c
 @@ -777,6 +777,7 @@ int btrfs_quota_enable(struct btrfs_trans_handle *trans,
   struct extent_buffer *leaf;
   struct btrfs_key key;
   int ret = 0;
 + struct btrfs_qgroup *qgroup = NULL;

   spin_lock(fs_info-qgroup_lock);
   if (fs_info-quota_root) {
 @@ -823,7 +824,18 @@ int btrfs_quota_enable(struct btrfs_trans_handle *trans,

   btrfs_mark_buffer_dirty(leaf);

 + btrfs_release_path(path);
 + ret = add_qgroup_item(trans, quota_root, BTRFS_FS_TREE_OBJECTID);
 + if (ret)
 + goto out;
 +
   spin_lock(fs_info-qgroup_lock);
 + qgroup = add_qgroup_rb(fs_info, BTRFS_FS_TREE_OBJECTID);
 + if (IS_ERR(qgroup)) {
 + spin_unlock(fs_info-qgroup_lock);
 + ret = PTR_ERR(qgroup);
 + goto out;
 + }
   fs_info-quota_root = quota_root;
   fs_info-pending_quota_state = 1;
   spin_unlock(fs_info-qgroup_lock);


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND RFC PATCH 2/2] Btrfs: disable the qgroup level 0 for userspace use

2013-02-22 Thread Shilong Wang
Hello,

2013/2/22 Arne Jansen sensi...@gmx.net:
 On 02/22/13 13:09, Wang Shilong wrote:
 From: Wang Shilong wangsl-f...@cn.fujitsu.com

 This patch tries to stop users to create/destroy qgroup level 0,
 users can only create/destroy qgroup level more than 0.

 See the fact:
   a subvolume/snapshot qgroup was created automatically
 when creating subvolume/snapshot, so creating a qgroup level 0 can't
 be a subvolume/snapshot qgroup, the only way to use it is that assigning
 subvolume/snapshot qgroup to it, the point is that we don't want to have a
 parent qgroup whose level is 0.

   So we want to force users to use qgroup with clear relations
 which means a parent qgroup's level  child qgroup's level.For example:

   2/0
  /\
 /  \
/\
   1/0   1/1
 / \\
/   \\
   / \\
   0/256 0/2570/258

 This pattern of quota is nature and easy for users to understand, otherwise 
 it will
 make the quota configuration confusing and difficult to maintain.

 I agree that a strict hierarchy of the levels should be enforced.
 Currently the kernel has no idea of 'level', it's just an artificial
 concept that lives in userspace. This patch would be the first place
 to add that magic shift '48' to the kernel.
 In my opinion it would be sufficient to do the enforcement in user
 space, as it is of no technical nature.


...i have made some patches about these work in btrfs-prog, but it has
been not merged...
I will pick up thoses patches and do the other necessary work..

Thanks,
Wang

 -Arne


 Signed-off-by: Wang Shilong wangsl-f...@cn.fujitsu.com
 Acked-by: Miao Xie mi...@cn.fujitsu.com
 Cc: Arne Jansen sensi...@gmx.net
 ---
  fs/btrfs/ioctl.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
 index a31cd93..3590c21 100644
 --- a/fs/btrfs/ioctl.c
 +++ b/fs/btrfs/ioctl.c
 @@ -3755,7 +3755,7 @@ static long btrfs_ioctl_qgroup_create(struct file 
 *file, void __user *arg)
   goto drop_write;
   }

 - if (!sa-qgroupid) {
 + if (!(sa-qgroupid  48)) {
   ret = -EINVAL;
   goto out;
   }


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND RFC PATCH 2/2] Btrfs: disable the qgroup level 0 for userspace use

2013-02-22 Thread Shilong Wang
Hello, David

2013/2/23 David Sterba dste...@suse.cz:
 On Sat, Feb 23, 2013 at 12:39:24AM +0800, Shilong Wang wrote:
 Hello,

 2013/2/22 Arne Jansen sensi...@gmx.net:
  On 02/22/13 13:09, Wang Shilong wrote:
  From: Wang Shilong wangsl-f...@cn.fujitsu.com
 
  This patch tries to stop users to create/destroy qgroup level 0,
  users can only create/destroy qgroup level more than 0.
 
  See the fact:
a subvolume/snapshot qgroup was created automatically
  when creating subvolume/snapshot, so creating a qgroup level 0 can't
  be a subvolume/snapshot qgroup, the only way to use it is that assigning
  subvolume/snapshot qgroup to it, the point is that we don't want to have a
  parent qgroup whose level is 0.
 
So we want to force users to use qgroup with clear relations
  which means a parent qgroup's level  child qgroup's level.For example:
 
2/0
   /\
  /  \
 /\
1/0   1/1
  / \\
 /   \\
/ \\
0/256 0/2570/258
 
  This pattern of quota is nature and easy for users to understand, 
  otherwise it will
  make the quota configuration confusing and difficult to maintain.
 
  I agree that a strict hierarchy of the levels should be enforced.
  Currently the kernel has no idea of 'level', it's just an artificial
  concept that lives in userspace. This patch would be the first place
  to add that magic shift '48' to the kernel.
  In my opinion it would be sufficient to do the enforcement in user
  space, as it is of no technical nature.
 

 ...i have made some patches about these work in btrfs-prog, but it has
 been not merged...
 I will pick up thoses patches and do the other necessary work..

 This one?

 https://patchwork.kernel.org/patch/2008591/

 went through integration branch into progs' master.

Yes, it is.However, more work needs done to make it work well..
I'd continue my work based on integration-20130126..

Thanks,
Wang

 david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html