On 12/26/2014 01:43 PM, Satoru Takeuchi wrote:
Hi Yang,

On 2014/12/26 10:32, Dongsheng Yang wrote:
Hi Satoru,

I saw your mail of "[BUG] "Quota Ignored On write" problem still exist with 3.16-rc5 <http://news.gmane.org/find-root.php?message_id=53C8DEB0.1060404%40jp.fujitsu.com>" in gmane.
I guess this patch will fix the problem you mentioned.
Could you help to give a try?

Although it reached disk quota before writing whole 1.5GB,
the copied size was only about 700 MB. In this case,
expected behavior is to reach disk quota just after
writing 1 GB.

Hi Satoru,
Thanx for your testing a lot. But stranger, I tried it in 3.19-rc1 and 3.18, both works well writing 1GB. Could you give me some more information about your testing machine?
And could you execute "sync && btrfs qgroup show -pecr $MNT" after writing?

*3.19-rc1 with my 2/2 patch applied.
=======================================================================
[root@atest-guest linux_btrfs]# uname -a
Linux atest-guest 3.19.0-rc1+ #66 SMP Fri Dec 26 10:38:12 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@atest-guest linux_btrfs]# sh quota.sh
umount: /mnt: not mounted
Btrfs v3.17-1-gbef9fd4
See http://btrfs.wiki.kernel.org for more information.

Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
fs created label (null) on /dev/vdb
    nodesize 16384 leafsize 16384 sectorsize 4096 size 50.00GiB
Create subvolume '/mnt/quota_test'
dd: error writing \u2018/mnt/quota_test/test\u2019: Disk quota exceeded
999842+0 records in
999841+0 records out
1023837184 bytes (1.0 GB) copied, 8.06996 s, 127 MB/s
[PASS] quota works correctly
[root@atest-guest linux_btrfs]# sync
[root@atest-guest linux_btrfs]# df -h /mnt/
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb         50G  996M   49G   2% /mnt
[root@atest-guest linux_btrfs]# btrfs qgroup show -pecr /mnt/
qgroupid rfer       excl       max_rfer   max_excl parent  child
-------- ----       ----       --------   -------- ------  -----
0/5      16384      16384      0          0        ---     ---
0/256    1023868928 1023868928 1024000000 0        ---     ---
=============================================================

* 3.19-rc1 without your two v2 patches

===============================================================================
+ mkfs.btrfs -f /dev/vdb
Btrfs v3.17
See http://btrfs.wiki.kernel.org for more information.

Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
fs created label (null) on /dev/vdb
        nodesize 16384 leafsize 16384 sectorsize 4096 size 30.00GiB
+ mount /dev/vdb /root/btrfs-auto-test/
+ ret=0
+ btrfs quota enable /root/btrfs-auto-test/
+ btrfs subvolume create /root/btrfs-auto-test//sub
Create subvolume '/root/btrfs-auto-test/sub'
+ btrfs qgroup limit 1G /root/btrfs-auto-test//sub
+ dd if=/dev/zero of=/root/btrfs-auto-test//sub/file bs=1024 count=1500000
1500000+0 records in
1500000+0 records out
1536000000 bytes (1.5 GB) copied, 25.6601 s, 59.9 MB/s
===============================================================================

* 3.19-rc1 with your two v2 patches

===============================================================================
+ mkfs.btrfs -f /dev/vdb
Btrfs v3.17
See http://btrfs.wiki.kernel.org for more information.

Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
fs created label (null) on /dev/vdb
        nodesize 16384 leafsize 16384 sectorsize 4096 size 30.00GiB
+ mount /dev/vdb /root/btrfs-auto-test/
+ ret=0
+ btrfs quota enable /root/btrfs-auto-test/
+ btrfs subvolume create /root/btrfs-auto-test//sub
Create subvolume '/root/btrfs-auto-test/sub'
+ btrfs qgroup limit 1G /root/btrfs-auto-test//sub
+ dd if=/dev/zero of=/root/btrfs-auto-test//sub/file bs=1024 count=1500000
dd: error writing '/root/btrfs-auto-test//sub/file': Disk quota exceeded
681353+0 records in
681352+0 records out
697704448 bytes (698 MB) copied, 8.15563 s, 85.5 MB/s
===============================================================================

Thanks,
Satoru


Thanx
Yang

On 12/12/2014 04:44 PM, Dongsheng Yang wrote:
Currently, for pre_alloc or delay_alloc, the bytes will be accounted
in space_info by the three guys.
space_info->bytes_may_use --- space_info->reserved --- space_info->used. But on the other hand, in qgroup, there are only two counters to account the
bytes, qgroup->reserved and qgroup->excl. And qg->reserved accounts
bytes in space_info->bytes_may_use and qg->excl accounts bytes in
space_info->used. So the bytes in space_info->reserved is not accounted
in qgroup. If so, there is a window we can exceed the quota limit when
bytes is in space_info->reserved.

Example:
    # btrfs quota enable /mnt
    # btrfs qgroup limit -e 10M /mnt
    # for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
    # sync
    # btrfs qgroup show -pcre /mnt
qgroupid rfer     excl     max_rfer max_excl parent  child
-------- ----     ----     -------- -------- ------  -----
0/5      20987904 20987904 0        10485760 ---     ---

qg->excl is 20987904 larger than max_excl 10485760.

This patch introduce a new counter named may_use to qgroup, then
there are three counters in qgroup to account bytes in space_info
as below.
space_info->bytes_may_use --- space_info->reserved --- space_info->used.
qgroup->may_use           --- qgroup->reserved     --- qgroup->excl

With this patch applied:
    # btrfs quota enable /mnt
    # btrfs qgroup limit -e 10M /mnt
    # for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
fallocate: /mnt/data9: fallocate failed: Disk quota exceeded
fallocate: /mnt/data10: fallocate failed: Disk quota exceeded
fallocate: /mnt/data11: fallocate failed: Disk quota exceeded
fallocate: /mnt/data12: fallocate failed: Disk quota exceeded
fallocate: /mnt/data13: fallocate failed: Disk quota exceeded
fallocate: /mnt/data14: fallocate failed: Disk quota exceeded
fallocate: /mnt/data15: fallocate failed: Disk quota exceeded
fallocate: /mnt/data16: fallocate failed: Disk quota exceeded
fallocate: /mnt/data17: fallocate failed: Disk quota exceeded
fallocate: /mnt/data18: fallocate failed: Disk quota exceeded
fallocate: /mnt/data19: fallocate failed: Disk quota exceeded
    # sync
    # btrfs qgroup show -pcre /mnt
qgroupid rfer    excl    max_rfer max_excl parent  child
-------- ----    ----    -------- -------- ------  -----
0/5      9453568 9453568 0        10485760 ---     ---

Reported-by: Cyril SCETBON<cyril.scet...@free.fr>
Signed-off-by: Dongsheng Yang<yangds.f...@cn.fujitsu.com>
---
Changelog:
    v1 -> v2:
        Remove the redundant check for fs_info->quota_enabled.

  fs/btrfs/extent-tree.c | 20 ++++++++++++++-
  fs/btrfs/inode.c       | 18 ++++++++++++-
fs/btrfs/qgroup.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++---
  fs/btrfs/qgroup.h      |  4 +++
  4 files changed, 104 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 014b7f2..f4ad737 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5500,8 +5500,12 @@ static int pin_down_extent(struct btrfs_root *root,

      set_extent_dirty(root->fs_info->pinned_extents, bytenr,
               bytenr + num_bytes - 1, GFP_NOFS | __GFP_NOFAIL);
-    if (reserved)
+    if (reserved) {
+        btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                           root->root_key.objectid,
+                           num_bytes, -1);
          trace_btrfs_reserved_extent_free(root, bytenr, num_bytes);
+    }
      return 0;
  }

@@ -6230,6 +6234,9 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, btrfs_update_reserved_bytes(cache, buf->len, RESERVE_FREE, 0);
          trace_btrfs_reserved_extent_free(root, buf->start, buf->len);
          pin = 0;
+        btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                           root->root_key.objectid,
+                           buf->len, -1);
      }
  out:
      if (pin)
@@ -6964,7 +6971,11 @@ static int __btrfs_free_reserved_extent(struct btrfs_root *root,
      else {
          btrfs_add_free_space(cache, start, len);
btrfs_update_reserved_bytes(cache, len, RESERVE_FREE, delalloc);
+        btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                           root->root_key.objectid,
+                           len, -1);
      }
+
      btrfs_put_block_group(cache);

      trace_btrfs_reserved_extent_free(root, start, len);
@@ -7200,6 +7211,9 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
      BUG_ON(ret); /* logic error */
      ret = alloc_reserved_file_extent(trans, root, 0, root_objectid,
                       0, owner, offset, ins, 1);
+    btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                       root->root_key.objectid,
+                       ins->offset, 1);
      btrfs_put_block_group(block_group);
      return ret;
  }
@@ -7346,6 +7360,10 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
          return ERR_PTR(ret);
      }

+    btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                       root_objectid,
+                       ins.offset, 1);
+
      buf = btrfs_init_new_buffer(trans, root, ins.objectid,
                      blocksize, level);
      BUG_ON(IS_ERR(buf)); /* -ENOMEM */
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d23362f..0c824f3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -59,6 +59,7 @@
  #include "backref.h"
  #include "hash.h"
  #include "props.h"
+#include "qgroup.h"

  struct btrfs_iget_args {
      struct btrfs_key *location;
@@ -739,7 +740,9 @@ retry:
              }
              goto out_free;
          }
-
+        btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                           root->root_key.objectid,
+                           ins.offset, 1);
          /*
           * here we're doing allocation and writeback of the
           * compressed pages
@@ -951,6 +954,10 @@ static noinline int cow_file_range(struct inode *inode,
          if (ret < 0)
              goto out_unlock;

+        btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                           root->root_key.objectid,
+                           ins.offset, 1);
+
          em = alloc_extent_map();
          if (!em) {
              ret = -ENOMEM;
@@ -6745,6 +6752,10 @@ static struct extent_map *btrfs_new_extent_direct(struct inode *inode,
          return ERR_PTR(ret);
      }

+    btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                       root->root_key.objectid,
+                       ins.offset, 1);
+
      return em;
  }

@@ -9266,6 +9277,11 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
                  btrfs_end_transaction(trans, root);
              break;
          }
+
+        btrfs_qgroup_update_reserved_bytes(root->fs_info,
+                           root->root_key.objectid,
+                           ins.offset, 1);
+
          btrfs_drop_extent_cache(inode, cur_offset,
                      cur_offset + ins.offset -1, 0);

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 48b60db..d147082 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -72,6 +72,7 @@ struct btrfs_qgroup {
      /*
       * reservation tracking
       */
+    u64 may_use;
      u64 reserved;

      /*
@@ -1414,6 +1415,8 @@ static int qgroup_excl_accounting(struct btrfs_fs_info *fs_info,
      WARN_ON(sign < 0 && qgroup->excl < oper->num_bytes);
      qgroup->excl += sign * oper->num_bytes;
      qgroup->excl_cmpr += sign * oper->num_bytes;
+    if (sign > 0)
+        qgroup->reserved -= oper->num_bytes;

      qgroup_dirty(fs_info, qgroup);

@@ -1434,6 +1437,8 @@ static int qgroup_excl_accounting(struct btrfs_fs_info *fs_info,
          qgroup->excl += sign * oper->num_bytes;
          if (sign < 0)
              WARN_ON(qgroup->excl < oper->num_bytes);
+        if (sign > 0)
+            qgroup->reserved -= oper->num_bytes;
          qgroup->excl_cmpr += sign * oper->num_bytes;
          qgroup_dirty(fs_info, qgroup);

@@ -2359,6 +2364,61 @@ out:
      return ret;
  }

+int btrfs_qgroup_update_reserved_bytes(struct btrfs_fs_info *fs_info,
+                        u64 ref_root,
+                        u64 num_bytes,
+                        int sign)
+{
+    struct btrfs_root *quota_root;
+    struct btrfs_qgroup *qgroup;
+    int ret = 0;
+    struct ulist_node *unode;
+    struct ulist_iterator uiter;
+
+    if (!is_fstree(ref_root) || !fs_info->quota_enabled)
+        return 0;
+
+    if (num_bytes == 0)
+        return 0;
+
+    spin_lock(&fs_info->qgroup_lock);
+    quota_root = fs_info->quota_root;
+    if (!quota_root)
+        goto out;
+
+    qgroup = find_qgroup_rb(fs_info, ref_root);
+    if (!qgroup)
+        goto out;
+
+    ulist_reinit(fs_info->qgroup_ulist);
+    ret = ulist_add(fs_info->qgroup_ulist, qgroup->qgroupid,
+            (uintptr_t)qgroup, GFP_ATOMIC);
+    if (ret < 0)
+        goto out;
+
+    ULIST_ITER_INIT(&uiter);
+    while ((unode = ulist_next(fs_info->qgroup_ulist, &uiter))) {
+        struct btrfs_qgroup *qg;
+        struct btrfs_qgroup_list *glist;
+
+        qg = u64_to_ptr(unode->aux);
+
+        qg->reserved += sign * num_bytes;
+
+        list_for_each_entry(glist, &qg->groups, next_group) {
+            ret = ulist_add(fs_info->qgroup_ulist,
+                    glist->group->qgroupid,
+                    (uintptr_t)glist->group, GFP_ATOMIC);
+            if (ret < 0)
+                goto out;
+        }
+    }
+
+out:
+    spin_unlock(&fs_info->qgroup_lock);
+    return ret;
+}
+
  /*
* reserve some space for a qgroup and all its parents. The reservation takes
   * place with start_transaction or dealloc_reserve, similar to ENOSPC
@@ -2407,14 +2467,14 @@ int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes)
          qg = u64_to_ptr(unode->aux);

          if ((qg->lim_flags & BTRFS_QGROUP_LIMIT_MAX_RFER) &&
-            qg->reserved + (s64)qg->rfer + num_bytes >
+            qg->reserved + qg->may_use + (s64)qg->rfer + num_bytes >
              qg->max_rfer) {
              ret = -EDQUOT;
              goto out;
          }

          if ((qg->lim_flags & BTRFS_QGROUP_LIMIT_MAX_EXCL) &&
-            qg->reserved + (s64)qg->excl + num_bytes >
+            qg->reserved + qg->may_use + (s64)qg->excl + num_bytes >
              qg->max_excl) {
              ret = -EDQUOT;
              goto out;
@@ -2438,7 +2498,7 @@ int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes)

          qg = u64_to_ptr(unode->aux);

-        qg->reserved += num_bytes;
+        qg->may_use += num_bytes;
      }

  out:
@@ -2484,7 +2544,7 @@ void btrfs_qgroup_free(struct btrfs_root *root, u64 num_bytes)

          qg = u64_to_ptr(unode->aux);

-        qg->reserved -= num_bytes;
+        qg->may_use -= num_bytes;

          list_for_each_entry(glist, &qg->groups, next_group) {
              ret = ulist_add(fs_info->qgroup_ulist,
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 18cc68c..31de88e 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -95,6 +95,10 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans,
  int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans,
               struct btrfs_fs_info *fs_info, u64 srcid, u64 objectid,
               struct btrfs_qgroup_inherit *inherit);
+int btrfs_qgroup_update_reserved_bytes(struct btrfs_fs_info *fs_info,
+                       u64 ref_root,
+                       u64 num_bytes,
+                       int sign);
  int btrfs_qgroup_reserve(struct btrfs_root *root, u64 num_bytes);
  void btrfs_qgroup_free(struct btrfs_root *root, u64 num_bytes);



.


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to