[Overall]
The previous rework of the qgroup reservation system put most of its
effort into data reservation, which now works quite well.

Metadata reservation got less attention, which leaves problems such as
metadata reservation underflow and noisy kernel warnings.

This patchset tries to address the remaining problems of metadata
reservation.

The idea of the new qgroup metadata reservation is to use 2 types of
metadata reservation:
1) Per-transaction reservation
   Its life span is inside a transaction. It is freed at transaction
   commit time.

2) Preallocated reservation
   For cases where we reserve space before starting a transaction.
   Operations like delalloc and delayed-inode/item belong to this type.

   This works similarly to block_rsv: its reservation can be
   reserved/released at any time the caller likes.

   The only point to notice is that if a preallocated reservation is
   used and finishes without problem, it should be converted to the
   per-transaction type instead of just being freed.
   This is to cooperate with the qgroup update at commit time.
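
To make the life cycle of the two types concrete, here is a minimal
userspace sketch. It is not code from the patches: the enum, struct and
function names are made up for illustration, and clamping on release is
a choice of this model only, not a statement about the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum model_rsv_type {
	MODEL_RSV_META_PREALLOC,	/* reserved before a transaction starts */
	MODEL_RSV_META_PERTRANS,	/* lives until transaction commit */
	MODEL_RSV_NR_TYPES,
};

struct model_qgroup_rsv {
	uint64_t values[MODEL_RSV_NR_TYPES];
};

/* Reserve @bytes of the given type. */
static void model_rsv_add(struct model_qgroup_rsv *rsv,
			  enum model_rsv_type type, uint64_t bytes)
{
	rsv->values[type] += bytes;
}

/*
 * Release @bytes of the given type. This model clamps to the amount
 * actually reserved so the counter can never underflow.
 */
static void model_rsv_release(struct model_qgroup_rsv *rsv,
			      enum model_rsv_type type, uint64_t bytes)
{
	if (bytes > rsv->values[type])
		bytes = rsv->values[type];
	rsv->values[type] -= bytes;
}

/*
 * A preallocated reservation was used successfully: move it to the
 * per-transaction type instead of freeing it.
 */
static void model_rsv_convert(struct model_qgroup_rsv *rsv, uint64_t bytes)
{
	if (bytes > rsv->values[MODEL_RSV_META_PREALLOC])
		bytes = rsv->values[MODEL_RSV_META_PREALLOC];
	rsv->values[MODEL_RSV_META_PREALLOC] -= bytes;
	rsv->values[MODEL_RSV_META_PERTRANS] += bytes;
}

/* Transaction commit: all per-transaction reservation is freed. */
static void model_rsv_commit(struct model_qgroup_rsv *rsv)
{
	rsv->values[MODEL_RSV_META_PERTRANS] = 0;
}
```

In this model a caller reserves with MODEL_RSV_META_PREALLOC, then
either releases on error or converts on success; only commit drops the
per-transaction counter.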

For the preallocated type, this patchset integrates it into the
inode_rsv mechanism reworked by Josef, and into the delayed-inode/item
reservation.


[Problem: Over-reserve]
Currently the patchset addresses metadata underflow quite well, but
due to the over-reserve nature of btrfs and the tight binding to
inode_rsv, qgroup metadata reservation also tends to over-reserve.

This is especially obvious for small limits.
With a 128M limit, only about 70M can be written before hitting the
quota limit.
With a larger limit, like 5G, it can reach 4.5G or more before hitting
the limit.

Such over-reserving behavior can cause problems with existing test
cases (where the limit is normally less than 20M).

It should be possible to address this by using more accurate space
estimations instead of maximum estimations.

For example, to calculate the metadata needed for delalloc, we use
btrfs_calc_trans_metadata_size(), which always tries to reserve space
for CoWing a full-height tree, and also includes csum size.
Both calculations are overkill for qgroup metadata reservation.
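
As a rough illustration of the gap, the sketch below compares a
worst-case estimate against a tighter one. The worst-case form
(nodesize * max level * 2 * items) is how I read
btrfs_calc_trans_metadata_size() at the time of writing; treat the
constants as assumptions and check the tree. The "tight" helper is
purely hypothetical and is not something this patchset implements.
Csum size is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed maximum tree height, used by the worst-case estimate. */
#define MODEL_MAX_LEVEL	8

/*
 * Worst case: CoW a full-height tree for every item, doubled.
 * Assumed shape of the existing estimation, not copied from the kernel.
 */
static uint64_t model_worst_case_bytes(uint64_t nodesize, unsigned int items)
{
	return nodesize * MODEL_MAX_LEVEL * 2 * items;
}

/*
 * Hypothetical tighter estimate: only CoW one path of the current
 * tree height per item.
 */
static uint64_t model_tight_bytes(uint64_t nodesize, unsigned int height,
				  unsigned int items)
{
	return nodesize * height * items;
}
```

With a 16K nodesize, one item costs 256K in the worst-case model,
against 48K for a tree of height 3 in the tighter one. Under a limit of
a few megabytes, a handful of such reservations already makes a visible
difference.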

[Patch structure]
The patchset consists of 2 main parts:
1) Type based qgroup reservation
   The original patchset was sent several months ago.
   Nothing is modified at all, just rebased, with no conflicts.

   It's patches 1 to 6.

2) Split meta qgroup reservation into per-trans and prealloc sub types
   The real work to address metadata underflow.
   Due to the over-reserve problem, this part is still in RFC state.
   But the framework should mostly be fine; it only needs extra
   fine-tuning to get a more accurate qgroup rsv and avoid hitting the
   limit too early.

   It's patches 7 to 14.

Qu Wenruo (14):
  btrfs: qgroup: Skeleton to support separate qgroup reservation type
  btrfs: qgroup: Introduce helpers to update and access new qgroup rsv
  btrfs: qgroup: Make qgroup_reserve and its callers to use separate
    reservation type
  btrfs: qgroup: Fix wrong qgroup reservation update for relationship
    modification
  btrfs: qgroup: Update trace events to use new separate rsv types
  btrfs: qgroup: Cleanup the remaining old reservation counters
  btrfs: qgroup: Split meta rsv type into meta_prealloc and
    meta_pertrans
  btrfs: qgroup: Don't use root->qgroup_meta_rsv for qgroup
  btrfs: qgroup: Introduce function to convert META_PREALLOC into
    META_PERTRANS
  btrfs: qgroup: Use separate meta reservation type for delalloc
  btrfs: delayed-inode: Use new qgroup meta rsv for delayed inode and
    item
  btrfs: qgroup: Use root->qgroup_meta_rsv_* to record qgroup meta
    reserved space
  btrfs: qgroup: Update trace events for metadata reservation
  Revert "btrfs: qgroups: Retry after commit on getting EDQUOT"

 fs/btrfs/ctree.h             |  15 +-
 fs/btrfs/delayed-inode.c     |  50 +++++--
 fs/btrfs/disk-io.c           |   2 +-
 fs/btrfs/extent-tree.c       |  49 +++---
 fs/btrfs/file.c              |  15 +-
 fs/btrfs/free-space-cache.c  |   2 +-
 fs/btrfs/inode-map.c         |   4 +-
 fs/btrfs/inode.c             |  27 ++--
 fs/btrfs/ioctl.c             |  10 +-
 fs/btrfs/ordered-data.c      |   2 +-
 fs/btrfs/qgroup.c            | 350 ++++++++++++++++++++++++++++++++-----------
 fs/btrfs/qgroup.h            | 102 ++++++++++++-
 fs/btrfs/relocation.c        |   9 +-
 fs/btrfs/transaction.c       |   8 +-
 include/trace/events/btrfs.h |  73 ++++++++-
 15 files changed, 537 insertions(+), 181 deletions(-)

-- 
2.15.1
