When a file gets deleted on a zoned file system, the space freed is not returned to the block group's free space, but is migrated to zone_unusable.
As this zone_unusable space is behind the current write pointer, it cannot be used for new allocations. In the current implementation a zone is reset only once all of the block group's space is accounted as zone unusable.

This behaviour can lead to premature ENOSPC errors on a busy file system.

Instead of only reclaiming the zone once it is completely unusable, kick off a reclaim job once the amount of unusable bytes exceeds a user configurable threshold between 51% and 100%. It can be set per mounted filesystem via the sysfs tunable bg_reclaim_threshold, which is set to 75% by default.

Similar to the handling of unused block groups, these dirty block groups are added to a to_reclaim list. On transaction commit the reclaim process is triggered, but only after unused block groups have been deleted, which frees space for the relocation process.

Zones that are 100% full and zone unusable already get reclaimed automatically on transaction commit.

Another improvement on the garbage collection side of zoned btrfs would be not to reclaim block groups that have used, pinned and reserved = 0 but zone_unusable > 0. This is not yet included as it needs further research and testing.
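The threshold check described above can be sketched in plain userspace C as follows. This is only an illustration of the arithmetic, not the code from this series: struct bg_sketch, the RECLAIM_THRESH_* names and all three helpers are hypothetical, and the sketch assumes that "exceeds" means "reaches or exceeds" so that a threshold of 100% can still trigger.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds and default as stated in the cover letter; the names are hypothetical. */
#define RECLAIM_THRESH_MIN      51
#define RECLAIM_THRESH_MAX      100
#define RECLAIM_THRESH_DEFAULT  75

/* Hypothetical, simplified stand-in for a block group's space accounting. */
struct bg_sketch {
	uint64_t length;        /* total bytes in the block group */
	uint64_t zone_unusable; /* freed bytes behind the write pointer */
};

/* Check that a user-supplied threshold lies in the documented 51..100 range. */
static bool reclaim_thresh_valid(unsigned int thresh)
{
	return thresh >= RECLAIM_THRESH_MIN && thresh <= RECLAIM_THRESH_MAX;
}

/*
 * Percentage of the block group that is zone_unusable, rounded down.
 * The multiplication is fine for realistic zone sizes; it would only
 * overflow for values above UINT64_MAX / 100 bytes.
 */
static unsigned int unusable_percent(const struct bg_sketch *bg)
{
	if (bg->length == 0)
		return 0;
	return (unsigned int)(bg->zone_unusable * 100 / bg->length);
}

/* Assumption: a block group qualifies once it reaches the threshold. */
static bool should_reclaim(const struct bg_sketch *bg, unsigned int thresh)
{
	return unusable_percent(bg) >= thresh;
}
```

As an example under these assumptions, a block group of 256 MiB with the default threshold of 75% would be queued for reclaim once 192 MiB of it is accounted as zone_unusable.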
Changes to v5:
- Prefix define (David)
- Print bg usage percentage in reclaim info (David)

Changes to v4:
- Bail out on unmount (Josef)
- Fix delete extents comment (Filipe)
- Use constant for default threshold (David)
- Document reclaim_bgs_mutex (David)

Changes to v3:
- Special case "discarding" after relocation (Filipe)

Changes to v2:
- Fix locking in multiple ways (Filipe)
- Offload reclaim into workqueue (Josef)
- Add patch discarding/zone-resetting after successful relocation (Anand)

Changes to v1:
- Document sysfs parameter (David)
- Add info print for reclaim (Josef)
- Rename delete_unused_bgs_mutex to reclaim_bgs_lock (Filipe)
- Remove list_is_singular check (Filipe)
- Document use of space_info->groups_sem (Filipe)

Johannes Thumshirn (3):
  btrfs: zoned: reset zones of relocated block groups
  btrfs: rename delete_unused_bgs_mutex
  btrfs: zoned: automatically reclaim zones

 fs/btrfs/block-group.c       | 107 ++++++++++++++++++++++++++++++++++-
 fs/btrfs/block-group.h       |   3 +
 fs/btrfs/ctree.h             |   8 ++-
 fs/btrfs/disk-io.c           |  19 ++++++-
 fs/btrfs/free-space-cache.c  |   9 ++-
 fs/btrfs/sysfs.c             |  35 ++++++++++++
 fs/btrfs/volumes.c           |  63 +++++++++++++--------
 fs/btrfs/volumes.h           |   1 +
 fs/btrfs/zoned.h             |   6 ++
 include/trace/events/btrfs.h |  12 ++++
 10 files changed, 231 insertions(+), 32 deletions(-)

-- 
2.30.0