On 2018年02月22日 09:50, Jeff Mahoney wrote:
> On 2/21/18 8:36 PM, Qu Wenruo wrote:
>>
>>
>> On 2018年02月22日 04:19, je...@suse.com wrote:
>>> From: Jeff Mahoney <je...@suse.com>
>>>
>>> There are several places where we call btrfs_qgroup_reserve_meta and
>>> assume that a return value of 0 means that the reservation was successful.
>>>
>>> Later, we use the original bytes value passed to that call to free
>>> bytes during error handling or to pass the number of bytes reserved to
>>> the caller.
>>>
>>> This patch returns -ENODATA when we don't perform a reservation so that
>>> callers can make the distinction.  This also lets call sites not
>>> necessarily care whether qgroups are enabled.
>>
>> IMHO if we don't need to reserve, returning 0 seems good enough.
>> Caller doesn't really need to care if it has reserved some bytes.
>>
>> Or is there any special case where we need to distinguish such case?
> 
> Anywhere where the reservation takes place prior to the transaction
> starting, which is pretty much everywhere.  We wait until transaction
> commit to flip the bit to turn on quotas, which means that if a
> transaction commits that enables quotas lands in between the reservation
> being take and any error handling that involves freeing the reservation,
> we'll end up with an underflow.

So the same case as btrfs_qgroup_reserve_data().

In that case we could use ret > 0 to indicates the real bytes we
reserved, instead of -ENODATA which normally means error.

> 
> This is the first patch of a series I'm working on, but it can stand
> alone.  The rest is the patch set I mentioned when we talked a few
> months ago where the lifetimes of reservations are incorrect.  We can't
> just drop all the reservations at the end of the transaction because 1)
> the lifetime of some reservations can cross transactions and 2) because
> especially in the start_transaction case, we do the reservation prior to
> waiting to join the transaction.  So if the transaction we're waiting on
> commits, our reservation goes away with it but we continue on as if we
> still have it.

Right, the same problem I also addressed in patchset "[PATCH v2 00/10]
Use split qgroup rsv type".

In 6th patch, "[PATCH v2 06/10] btrfs: qgroup: Use
root->qgroup_meta_rsv_* to record qgroup meta reserved space" qgroup
meta reserve will only be increased if we succeeded in reserving
metadata, so later free won't underflow that number.

Thanks,
Qu


> 
> -Jeff
> 
>> Thanks,
>> Qu
>>
>>>
>>> Signed-off-by: Jeff Mahoney <je...@suse.com>
>>> ---
>>>  fs/btrfs/extent-tree.c | 33 ++++++++++++++++-----------------
>>>  fs/btrfs/qgroup.c      |  4 ++--
>>>  fs/btrfs/transaction.c |  5 ++++-
>>>  3 files changed, 22 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>> index c1618ab9fecf..2d5e963fae88 100644
>>> --- a/fs/btrfs/extent-tree.c
>>> +++ b/fs/btrfs/extent-tree.c
>>> @@ -5988,20 +5988,18 @@ int btrfs_subvolume_reserve_metadata(struct 
>>> btrfs_root *root,
>>>                                  u64 *qgroup_reserved,
>>>                                  bool use_global_rsv)
>>>  {
>>> -   u64 num_bytes;
>>>     int ret;
>>>     struct btrfs_fs_info *fs_info = root->fs_info;
>>>     struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
>>> +   /* One for parent inode, two for dir entries */
>>> +   u64 num_bytes = 3 * fs_info->nodesize;
>>>  
>>> -   if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) {
>>> -           /* One for parent inode, two for dir entries */
>>> -           num_bytes = 3 * fs_info->nodesize;
>>> -           ret = btrfs_qgroup_reserve_meta(root, num_bytes, true);
>>> -           if (ret)
>>> -                   return ret;
>>> -   } else {
>>> +   ret = btrfs_qgroup_reserve_meta(root, num_bytes, true);
>>> +   if (ret == -ENODATA) {
>>>             num_bytes = 0;
>>> -   }
>>> +           ret = 0;
>>> +   } else if (ret)
>>> +           return ret;
>>>  
>>>     *qgroup_reserved = num_bytes;
>>>  
>>> @@ -6057,6 +6055,7 @@ int btrfs_delalloc_reserve_metadata(struct 
>>> btrfs_inode *inode, u64 num_bytes)
>>>     enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_FLUSH_ALL;
>>>     int ret = 0;
>>>     bool delalloc_lock = true;
>>> +   u64 qgroup_reserved;
>>>  
>>>     /* If we are a free space inode we need to not flush since we will be in
>>>      * the middle of a transaction commit.  We also don't need the delalloc
>>> @@ -6090,17 +6089,17 @@ int btrfs_delalloc_reserve_metadata(struct 
>>> btrfs_inode *inode, u64 num_bytes)
>>>     btrfs_calculate_inode_block_rsv_size(fs_info, inode);
>>>     spin_unlock(&inode->lock);
>>>  
>>> -   if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) {
>>> -           ret = btrfs_qgroup_reserve_meta(root,
>>> -                           nr_extents * fs_info->nodesize, true);
>>> -           if (ret)
>>> -                   goto out_fail;
>>> -   }
>>> +   qgroup_reserved = nr_extents * fs_info->nodesize;
>>> +   ret = btrfs_qgroup_reserve_meta(root, qgroup_reserved, true);
>>> +   if (ret == -ENODATA) {
>>> +           ret = 0;
>>> +           qgroup_reserved = 0;
>>> +   } if (ret)
>>> +           goto out_fail;
>>>  
>>>     ret = btrfs_inode_rsv_refill(inode, flush);
>>>     if (unlikely(ret)) {
>>> -           btrfs_qgroup_free_meta(root,
>>> -                                  nr_extents * fs_info->nodesize);
>>> +           btrfs_qgroup_free_meta(root, qgroup_reserved);
>>>             goto out_fail;
>>>     }
>>>  
>>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>>> index aa259d6986e1..5d9e011243c6 100644
>>> --- a/fs/btrfs/qgroup.c
>>> +++ b/fs/btrfs/qgroup.c
>>> @@ -3025,7 +3025,7 @@ int btrfs_qgroup_reserve_meta(struct btrfs_root 
>>> *root, int num_bytes,
>>>  
>>>     if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags) ||
>>>         !is_fstree(root->objectid) || num_bytes == 0)
>>> -           return 0;
>>> +           return -ENODATA;
>>>  
>>>     BUG_ON(num_bytes != round_down(num_bytes, fs_info->nodesize));
>>>     trace_qgroup_meta_reserve(root, (s64)num_bytes);
>>> @@ -3057,7 +3057,7 @@ void btrfs_qgroup_free_meta(struct btrfs_root *root, 
>>> int num_bytes)
>>>     struct btrfs_fs_info *fs_info = root->fs_info;
>>>  
>>>     if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags) ||
>>> -       !is_fstree(root->objectid))
>>> +       !is_fstree(root->objectid) || num_bytes == 0)
>>>             return;
>>>  
>>>     BUG_ON(num_bytes != round_down(num_bytes, fs_info->nodesize));
>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>> index 04f07144b45c..ab67b73bd7fa 100644
>>> --- a/fs/btrfs/transaction.c
>>> +++ b/fs/btrfs/transaction.c
>>> @@ -510,7 +510,10 @@ start_transaction(struct btrfs_root *root, unsigned 
>>> int num_items,
>>>             qgroup_reserved = num_items * fs_info->nodesize;
>>>             ret = btrfs_qgroup_reserve_meta(root, qgroup_reserved,
>>>                                             enforce_qgroups);
>>> -           if (ret)
>>> +           if (ret == -ENODATA) {
>>> +                   ret = 0;
>>> +                   qgroup_reserved = 0;
>>> +           } else if (ret)
>>>                     return ERR_PTR(ret);
>>>  
>>>             num_bytes = btrfs_calc_trans_metadata_size(fs_info, num_items);
>>>
>>
> 
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to