On 2019/3/27 下午8:24, Nikolay Borisov wrote:
> Up until know trimming the freespace was done irrespective of what the
> arguments of the FITRIM ioctl were.

Since commit 6ba9fc8e628b ("btrfs: Ensure btrfs_trim_fs can trim the
whole filesystem"), btrfs in fact follows the range parameter.

It interpreter the range parameter as *btrfs logical address* range, so
it will find any block group in that range, and trim the free space in
that range.

> For example fstrim's -o/-l arguments
> will be entirely ignored. Fix it by correctly handling those paramter.

The fix is in fact causing several problems:
- Trim out of range
  # mkfs.btrfs -f /dev/test/scratch1 # A 5G LV
  # mount /dev/test/scratch1 /mnt/btrfs
  # fstrim -v -o1152921504338411520 -l10m /mnt/btrfs
  fstrim: /mnt/btrfs: FITRIM ioctl failed: Input/output error
  # dmesg
  [ 1253.984230] attempt to access beyond end of device
  [ 1253.984233] dm-4: rw=2051, want=2251799813181440, limit=10485760
  [ 1253.984240] BTRFS warning (device dm-4): failed to trim 1
device(s), last error -5

  At least we need to verify the range before using it.

- Cross level interpretation of range.
  We have chosen to interpreter the range as btrfs logical address
  already. Then it shouldn't be interpreted as physical address any
  more.
  In the context of multi-device btrfs context, we need devid to locate
  a physical range, and the fstrim range doesn't follow it at all.

  I understand always trimming the unallocated space is not a good idea,
  but it's even worse to interpret the fstrim range into two different
  meanings.

I still prefer the old behavior, or even a step furthermore, even ignore
the range->minlen as it makes no sense at the context of device physical
address.

Thanks,
Qu

> This requires breaking if the found freespace extent is after the end
> of the passed range as well as completing trim after trimming
> fstrim_range::len bytes.
>
> Fixes: 499f377f49f0 ("btrfs: iterate over unused chunk space in FITRIM")
> Signed-off-by: Nikolay Borisov <nbori...@suse.com>
> ---
>  fs/btrfs/extent-tree.c | 25 +++++++++++++++++++------
>  1 file changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index b0c86a817a99..dcda3c4de240 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -11309,9 +11309,9 @@ int btrfs_error_unpin_extent_range(struct 
> btrfs_fs_info *fs_info,
>   * held back allocations.
>   */
>  static int btrfs_trim_free_extents(struct btrfs_device *device,
> -                                u64 minlen, u64 *trimmed)
> +                                struct fstrim_range *range, u64 *trimmed)
>  {
> -     u64 start = 0, len = 0;
> +     u64 start = range->start, len = 0;
>       int ret;
>
>       *trimmed = 0;
> @@ -11354,8 +11354,8 @@ static int btrfs_trim_free_extents(struct 
> btrfs_device *device,
>               if (!trans)
>                       up_read(&fs_info->commit_root_sem);
>
> -             ret = find_free_dev_extent_start(trans, device, minlen, start,
> -                                              &start, &len);
> +             ret = find_free_dev_extent_start(trans, device, range->minlen,
> +                                              start, &start, &len);
>               if (trans) {
>                       up_read(&fs_info->commit_root_sem);
>                       btrfs_put_transaction(trans);
> @@ -11368,6 +11368,16 @@ static int btrfs_trim_free_extents(struct 
> btrfs_device *device,
>                       break;
>               }
>
> +             /* If we are out of the passed range break */
> +             if (start > range->start + range->len - 1) {
> +                     mutex_unlock(&fs_info->chunk_mutex);
> +                     ret = 0;
> +                     break;
> +             }
> +
> +             start = max(range->start, start);
> +             len = min(range->len, len);
> +
>               ret = btrfs_issue_discard(device->bdev, start, len, &bytes);
>               mutex_unlock(&fs_info->chunk_mutex);
>
> @@ -11377,6 +11387,10 @@ static int btrfs_trim_free_extents(struct 
> btrfs_device *device,
>               start += len;
>               *trimmed += bytes;
>
> +             /* We've trimmed enough */
> +             if (*trimmed >= range->len)
> +                     break;
> +
>               if (fatal_signal_pending(current)) {
>                       ret = -ERESTARTSYS;
>                       break;
> @@ -11460,8 +11474,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, 
> struct fstrim_range *range)
>       mutex_lock(&fs_info->fs_devices->device_list_mutex);
>       devices = &fs_info->fs_devices->devices;
>       list_for_each_entry(device, devices, dev_list) {
> -             ret = btrfs_trim_free_extents(device, range->minlen,
> -                                           &group_trimmed);
> +             ret = btrfs_trim_free_extents(device, range, &group_trimmed);
>               if (ret) {
>                       dev_failed++;
>                       dev_ret = ret;
>

Reply via email to