Re: bug? fstrim only trims unallocated space, not unused in bg's
[chris@f27h linux]$ git apply -v ~/RESEND-1-2-btrfs-Enhance-btrfs_trim_fs-function-to-handle-error-better.patch
Checking patch fs/btrfs/extent-tree.c...
Hunk #1 succeeded at 10942 (offset -6 lines).
Hunk #2 succeeded at 10962 (offset -6 lines).
Hunk #3 succeeded at 10974 (offset -6 lines).
Hunk #4 succeeded at 10988 (offset -6 lines).
Hunk #5 succeeded at 11025 (offset -6 lines).
Applied patch fs/btrfs/extent-tree.c cleanly.
[chris@f27h linux]$ git apply -v ~/v2.2-2-2-btrfs-Ensure-btrfs_trim_fs-can-trim-the-whole-fs.patch
Checking patch fs/btrfs/extent-tree.c...
Hunk #1 succeeded at 10961 (offset -6 lines).
Checking patch fs/btrfs/ioctl.c...
Hunk #1 succeeded at 364 (offset -1 lines).
Hunk #2 succeeded at 387 (offset -1 lines).
Applied patch fs/btrfs/extent-tree.c cleanly.
Applied patch fs/btrfs/ioctl.c cleanly.

Compiles fine, and the test appears to trim more than before, and the file system still works and scrubs with no errors (did scrub with 4.15.14.fc28).

[chris@f27h ~]$ uname -r
4.16.0-rc7 (mainline, unpatched)
[chris@f27h ~]$ sudo btrfs fi us /
Overall:
    Device size:           70.00GiB
    Device allocated:      46.06GiB
    Device unallocated:    23.94GiB
    Device missing:        0.00B
    Used:                  43.90GiB
    Free (estimated):      25.86GiB (min: 13.89GiB)
    Data ratio:            1.00
    Metadata ratio:        1.00
    Global reserve:        109.94MiB (used: 0.00B)

Data,single: Size:44.00GiB, Used:42.08GiB
   /dev/nvme0n1p9   44.00GiB

Metadata,single: Size:2.00GiB, Used:1.82GiB
   /dev/nvme0n1p9   2.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/nvme0n1p9   64.00MiB

Unallocated:
   /dev/nvme0n1p9   23.94GiB
[chris@f27h ~]$ sudo fstrim -v /
[sudo] password for chris:
/: 24 GiB (25701646336 bytes) trimmed
[chris@f27h ~]$

==

[chris@f27h ~]$ uname -r
4.16.0-rc7+ (mainline, plus patch)
[chris@f27h ~]$ sudo btrfs fi us /
Overall:
    Device size:           70.00GiB
    Device allocated:      46.06GiB
    Device unallocated:    23.94GiB
    Device missing:        0.00B
    Used:                  43.90GiB
    Free (estimated):      25.86GiB (min: 13.89GiB)
    Data ratio:            1.00
    Metadata ratio:        1.00
    Global reserve:        110.06MiB (used: 0.00B)

Data,single: Size:44.00GiB, Used:42.08GiB
   /dev/nvme0n1p9   44.00GiB

Metadata,single: Size:2.00GiB, Used:1.82GiB
   /dev/nvme0n1p9   2.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/nvme0n1p9   64.00MiB

Unallocated:
   /dev/nvme0n1p9   23.94GiB
[chris@f27h ~]$ sudo fstrim -v /
[sudo] password for chris:
/: 26.5 GiB (28394635264 bytes) trimmed
[chris@f27h ~]$

---
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
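A rough check of the two fstrim results in this message, as a minimal sketch. The byte counts are copied from the output; the helper name gib() is only illustrative. The extra amount trimmed with the patches comes to about 2.5 GiB, which is the same order as the free space inside the allocated block groups according to the rounded `btrfs fi us` figures (Data 44.00 - 42.08, Metadata 2.00 - 1.82). It is not an exact match, and nothing here says what accounts for the remainder.

```python
GIB = 2**30

def gib(nbytes):
    """Convert a byte count to GiB (binary gigabytes)."""
    return nbytes / GIB

# Byte counts reported by `fstrim -v /` in the message above.
trimmed_unpatched = 25701646336   # 4.16.0-rc7, mainline
trimmed_patched = 28394635264     # 4.16.0-rc7+, with both patches

# Extra space trimmed once the patches are applied.
delta = trimmed_patched - trimmed_unpatched

print(f"unpatched: {gib(trimmed_unpatched):.2f} GiB")
print(f"patched:   {gib(trimmed_patched):.2f} GiB")
print(f"delta:     {delta} bytes = {gib(delta):.2f} GiB")
```

The unpatched figure is close to the 23.94GiB reported as unallocated, which fits the subject of the thread: without the patches only unallocated space is trimmed.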
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2018-04-01 11:28, Chris Murphy wrote:
> On Mon, Nov 20, 2017 at 11:10 PM, Qu Wenruo wrote:
>>
>> On 2017-11-21 13:58, Chris Murphy wrote:
>>> On Mon, Nov 20, 2017 at 9:58 PM, Qu Wenruo wrote:
>>>>
>>>> On 2017-11-21 12:49, Chris Murphy wrote:
>>>>> On Mon, Nov 20, 2017 at 9:43 PM, Qu Wenruo wrote:
>>>>>>
>>>>>>> Apply in addition to previous patch? Or apply to clean v4.14?
>>>>>>
>>>>>> On previous patch.
>>>>>
>>>>> Refuses to apply with or without previous patch.
>>>>>
>>>>> $ git apply -v ~/qufstrim3.patch
>>>>> Checking patch fs/btrfs/extent-tree.c...
>>>>> error: while searching for:
>>>>>	int dev_ret = 0;
>>>>>	int ret = 0;
>>>>>
>>>>>	/*
>>>>>	 * try to trim all FS space, our block group may start from non-zero.
>>>>>	 */
>>>>>
>>>>> error: patch failed: fs/btrfs/extent-tree.c:10972
>>>>> error: fs/btrfs/extent-tree.c: patch does not apply
>>>>>
>>>>
>>>> Please try this branch.
>>>>
>>>> It's just previous patch and diff merged together and applied on v4.14
>>>> tag from torvalds.
>>>>
>>>> https://github.com/adam900710/linux/tree/tmp
>>>
>>> # fstrim -v /
>>> /: 38 GiB (40767586304 bytes) trimmed
>>> # dmesg
>>>
>>> ..snip...
>>> [ 46.408792] BTRFS info (device nvme0n1p8): trimming btrfs, start=0 len=75161927680 minlen=512
>>> [ 46.408800] BTRFS info (device nvme0n1p8): bg start=140882477056 len=1073741824
>>> [ 46.433867] BTRFS info (device nvme0n1p8): trimming done
>>
>> Great (for the output, not for the trimming failure).
>>
>> And the problem is very obvious now.
>> 140882477056 << First chunk start
>> 75161927680 << length of fstrim_range passed in
>>
>> Obviously, the fstrim_range passed in uses the filesystem size it
>> assumes.
>>
>> While we stupidly use the range in fstrim_range without considering the
>> fact that we're dealing with the *btrfs logical address space*, where a
>> chunk can start from any bytenr (well, at least aligned with
>> sectorsize).
>> When I read the code I also thought there was nothing wrong with the
>> range check at all.
>>
>> So the truth here is, we should never try to check the range from
>> fstrim_range.
>>
>> And the problem means that, on a normal btrfs with some usage and after
>> several full balances, fstrim will only trim the unallocated space.
>>
>> Now the fix should not be hard to craft.
>>
>> Many thanks for all your help in locating the problem.
>
> I still see this problem in 4.16.0-0.rc7.git1.1.fc29.x86_64. Are there
> any patches to test?
>
> [chris@f27h mnt]$ sudo btrfs fi us /
> Overall:
>     Device size:           70.00GiB
>     Device allocated:      60.06GiB
>     Device unallocated:    9.94GiB
>     Device missing:        0.00B
>     Used:                  36.89GiB
>     Free (estimated):      29.75GiB (min: 24.78GiB)
>     Data ratio:            1.00
>     Metadata ratio:        1.00
>     Global reserve:        107.77MiB (used: 12.38MiB)
>
> Data,single: Size:55.00GiB, Used:35.19GiB
>    /dev/nvme0n1p9   55.00GiB
>
> Metadata,single: Size:5.00GiB, Used:1.70GiB
>    /dev/nvme0n1p9   5.00GiB
>
> System,DUP: Size:32.00MiB, Used:16.00KiB
>    /dev/nvme0n1p9   64.00MiB
>
> Unallocated:
>    /dev/nvme0n1p9   9.94GiB
> [chris@f27h mnt]$ sudo fstrim -v /
> [sudo] password for chris:
> /: 10 GiB (10669260800 bytes) trimmed

The latest version is here:

https://patchwork.kernel.org/patch/10078773/
https://patchwork.kernel.org/patch/10083979/

I didn't see it in misc-next either, so I may need to ping it soon.

Thanks,
Qu
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 11:10 PM, Qu Wenruo wrote:
>
> On 2017-11-21 13:58, Chris Murphy wrote:
>> On Mon, Nov 20, 2017 at 9:58 PM, Qu Wenruo wrote:
>>>
>>> On 2017-11-21 12:49, Chris Murphy wrote:
>>>> On Mon, Nov 20, 2017 at 9:43 PM, Qu Wenruo wrote:
>>>>>
>>>>>> Apply in addition to previous patch? Or apply to clean v4.14?
>>>>>
>>>>> On previous patch.
>>>>
>>>> Refuses to apply with or without previous patch.
>>>>
>>>> $ git apply -v ~/qufstrim3.patch
>>>> Checking patch fs/btrfs/extent-tree.c...
>>>> error: while searching for:
>>>>	int dev_ret = 0;
>>>>	int ret = 0;
>>>>
>>>>	/*
>>>>	 * try to trim all FS space, our block group may start from non-zero.
>>>>	 */
>>>>
>>>> error: patch failed: fs/btrfs/extent-tree.c:10972
>>>> error: fs/btrfs/extent-tree.c: patch does not apply
>>>
>>> Please try this branch.
>>>
>>> It's just previous patch and diff merged together and applied on v4.14
>>> tag from torvalds.
>>>
>>> https://github.com/adam900710/linux/tree/tmp
>>
>> # fstrim -v /
>> /: 38 GiB (40767586304 bytes) trimmed
>> # dmesg
>>
>> ..snip...
>> [ 46.408792] BTRFS info (device nvme0n1p8): trimming btrfs, start=0 len=75161927680 minlen=512
>> [ 46.408800] BTRFS info (device nvme0n1p8): bg start=140882477056 len=1073741824
>> [ 46.433867] BTRFS info (device nvme0n1p8): trimming done
>
> Great (for the output, not for the trimming failure).
>
> And the problem is very obvious now.
> 140882477056 << First chunk start
> 75161927680 << length of fstrim_range passed in
>
> Obviously, the fstrim_range passed in uses the filesystem size it
> assumes.
>
> While we stupidly use the range in fstrim_range without considering the
> fact that we're dealing with the *btrfs logical address space*, where a
> chunk can start from any bytenr (well, at least aligned with
> sectorsize).
> When I read the code I also thought there was nothing wrong with the
> range check at all.
>
> So the truth here is, we should never try to check the range from
> fstrim_range.
>
> And the problem means that, on a normal btrfs with some usage and after
> several full balances, fstrim will only trim the unallocated space.
>
> Now the fix should not be hard to craft.
>
> Many thanks for all your help in locating the problem.

I still see this problem in 4.16.0-0.rc7.git1.1.fc29.x86_64. Are there
any patches to test?

[chris@f27h mnt]$ sudo btrfs fi us /
Overall:
    Device size:           70.00GiB
    Device allocated:      60.06GiB
    Device unallocated:    9.94GiB
    Device missing:        0.00B
    Used:                  36.89GiB
    Free (estimated):      29.75GiB (min: 24.78GiB)
    Data ratio:            1.00
    Metadata ratio:        1.00
    Global reserve:        107.77MiB (used: 12.38MiB)

Data,single: Size:55.00GiB, Used:35.19GiB
   /dev/nvme0n1p9   55.00GiB

Metadata,single: Size:5.00GiB, Used:1.70GiB
   /dev/nvme0n1p9   5.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/nvme0n1p9   64.00MiB

Unallocated:
   /dev/nvme0n1p9   9.94GiB
[chris@f27h mnt]$ sudo fstrim -v /
[sudo] password for chris:
/: 10 GiB (10669260800 bytes) trimmed

--
Chris Murphy
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2017-11-21 13:58, Chris Murphy wrote:
> On Mon, Nov 20, 2017 at 9:58 PM, Qu Wenruo wrote:
>>
>> On 2017-11-21 12:49, Chris Murphy wrote:
>>> On Mon, Nov 20, 2017 at 9:43 PM, Qu Wenruo wrote:
>>>>
>>>>> Apply in addition to previous patch? Or apply to clean v4.14?
>>>>
>>>> On previous patch.
>>>
>>> Refuses to apply with or without previous patch.
>>>
>>> $ git apply -v ~/qufstrim3.patch
>>> Checking patch fs/btrfs/extent-tree.c...
>>> error: while searching for:
>>>	int dev_ret = 0;
>>>	int ret = 0;
>>>
>>>	/*
>>>	 * try to trim all FS space, our block group may start from non-zero.
>>>	 */
>>>
>>> error: patch failed: fs/btrfs/extent-tree.c:10972
>>> error: fs/btrfs/extent-tree.c: patch does not apply
>>>
>>
>> Please try this branch.
>>
>> It's just previous patch and diff merged together and applied on v4.14
>> tag from torvalds.
>>
>> https://github.com/adam900710/linux/tree/tmp
>
> # fstrim -v /
> /: 38 GiB (40767586304 bytes) trimmed
> # dmesg
>
> ..snip...
> [ 46.408792] BTRFS info (device nvme0n1p8): trimming btrfs, start=0 len=75161927680 minlen=512
> [ 46.408800] BTRFS info (device nvme0n1p8): bg start=140882477056 len=1073741824
> [ 46.433867] BTRFS info (device nvme0n1p8): trimming done

Great (for the output, not for the trimming failure).

And the problem is very obvious now.
140882477056 << First chunk start
75161927680 << length of fstrim_range passed in

Obviously, the fstrim_range passed in uses the filesystem size it
assumes.

While we stupidly use the range in fstrim_range without considering the
fact that we're dealing with the *btrfs logical address space*, where a
chunk can start from any bytenr (well, at least aligned with
sectorsize).
When I read the code I also thought there was nothing wrong with the
range check at all.

So the truth here is, we should never try to check the range from
fstrim_range.

And the problem means that, on a normal btrfs with some usage and after
several full balances, fstrim will only trim the unallocated space.

Now the fix should not be hard to craft.

Many thanks for all your help in locating the problem.

Thanks,
Qu

>
> Attaching 'btrfs-debug -b /' to get an idea about the block groups present.
>
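The range problem described in this message can be sketched with the two numbers from the dmesg output. This is a simplified model, not the kernel code: the function name and the list of (start, len) pairs are illustrative, and the only behavior modeled is that the walk over block groups stops at the first one whose logical start is at or beyond start + len of the requested range. Note that 75161927680 bytes is exactly 70 GiB, the device size, while the first block group sits at logical address 140882477056.

```python
def block_groups_visited(block_groups, range_start, range_len):
    """Return the block groups a range-limited trim walk would visit.

    block_groups: iterable of (logical_start, length) pairs.
    The walk stops at the first block group that starts at or beyond
    the end of the requested range.
    """
    range_end = range_start + range_len
    visited = []
    for bg_start, bg_len in sorted(block_groups):
        if bg_start + bg_len <= range_start:
            continue            # entirely before the requested range
        if bg_start >= range_end:
            break               # beyond the range: the walk ends here
        visited.append((bg_start, bg_len))
    return visited

# First two block groups from the btrfs-debugfs listing in this thread.
bgs = [(140882477056, 1073741824), (141956218880, 1073741824)]

# Range as logged by the debug patch: start=0 len=75161927680 (70 GiB).
print(block_groups_visited(bgs, 0, 75161927680))   # no block group is reached

# With a range covering the whole u64 logical space, both are visited.
print(block_groups_visited(bgs, 0, 2**64 - 1))
```

With the logged range the list is empty, so no block group is trimmed and only the unallocated space on the device is discarded.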
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 9:58 PM, Qu Wenruo wrote:
>
> On 2017-11-21 12:49, Chris Murphy wrote:
>> On Mon, Nov 20, 2017 at 9:43 PM, Qu Wenruo wrote:
>>>
>>>> Apply in addition to previous patch? Or apply to clean v4.14?
>>>
>>> On previous patch.
>>
>> Refuses to apply with or without previous patch.
>>
>> $ git apply -v ~/qufstrim3.patch
>> Checking patch fs/btrfs/extent-tree.c...
>> error: while searching for:
>>	int dev_ret = 0;
>>	int ret = 0;
>>
>>	/*
>>	 * try to trim all FS space, our block group may start from non-zero.
>>	 */
>>
>> error: patch failed: fs/btrfs/extent-tree.c:10972
>> error: fs/btrfs/extent-tree.c: patch does not apply
>>
>
> Please try this branch.
>
> It's just previous patch and diff merged together and applied on v4.14
> tag from torvalds.
>
> https://github.com/adam900710/linux/tree/tmp

# fstrim -v /
/: 38 GiB (40767586304 bytes) trimmed
# dmesg

..snip...
[ 46.408792] BTRFS info (device nvme0n1p8): trimming btrfs, start=0 len=75161927680 minlen=512
[ 46.408800] BTRFS info (device nvme0n1p8): bg start=140882477056 len=1073741824
[ 46.433867] BTRFS info (device nvme0n1p8): trimming done

Attaching 'btrfs-debug -b /' to get an idea about the block groups present.

--
Chris Murphy

$ sudo ~/Applications/btrfs-debugfs -b /
block group offset 140882477056 len 1073741824 used 663564288 chunk_objectid 256 flags 1 usage 0.62
block group offset 141956218880 len 1073741824 used 774782976 chunk_objectid 256 flags 1 usage 0.72
block group offset 143029960704 len 1073741824 used 559759360 chunk_objectid 256 flags 1 usage 0.52
block group offset 144103702528 len 1073741824 used 719872000 chunk_objectid 256 flags 1 usage 0.67
block group offset 145177444352 len 1073741824 used 407699456 chunk_objectid 256 flags 1 usage 0.38
block group offset 146251186176 len 1073741824 used 446414848 chunk_objectid 256 flags 1 usage 0.42
block group offset 147324928000 len 1073741824 used 647254016 chunk_objectid 256 flags 1 usage 0.60
block group offset 148398669824 len 1073741824 used 695906304 chunk_objectid 256 flags 1 usage 0.65
block group offset 149472411648 len 1073741824 used 655175680 chunk_objectid 256 flags 1 usage 0.61
block group offset 150546153472 len 1073741824 used 700932096 chunk_objectid 256 flags 1 usage 0.65
block group offset 151619895296 len 1073741824 used 822620160 chunk_objectid 256 flags 1 usage 0.77
block group offset 152693637120 len 1073741824 used 787226624 chunk_objectid 256 flags 1 usage 0.73
block group offset 153767378944 len 1073741824 used 751927296 chunk_objectid 256 flags 1 usage 0.70
block group offset 155948417024 len 1073741824 used 611794944 chunk_objectid 256 flags 1 usage 0.57
block group offset 157022158848 len 1073741824 used 284831744 chunk_objectid 256 flags 1 usage 0.27
block group offset 158095900672 len 1073741824 used 176189440 chunk_objectid 256 flags 1 usage 0.16
block group offset 159169642496 len 1073741824 used 530092032 chunk_objectid 256 flags 1 usage 0.49
block group offset 161317126144 len 1073741824 used 957775872 chunk_objectid 256 flags 1 usage 0.89
block group offset 162390867968 len 1073741824 used 703913984 chunk_objectid 256 flags 1 usage 0.66
block group offset 163464609792 len 1073741824 used 159911936 chunk_objectid 256 flags 1 usage 0.15
block group offset 166685835264 len 1073741824 used 156217344 chunk_objectid 256 flags 1 usage 0.15
block group offset 167759577088 len 1073741824 used 120881152 chunk_objectid 256 flags 1 usage 0.11
block group offset 168833318912 len 1073741824 used 692060160 chunk_objectid 256 flags 1 usage 0.64
block group offset 169907060736 len 1073741824 used 428404736 chunk_objectid 256 flags 1 usage 0.40
block group offset 170980802560 len 1073741824 used 782372864 chunk_objectid 256 flags 1 usage 0.73
block group offset 172054544384 len 1073741824 used 533180416 chunk_objectid 256 flags 1 usage 0.50
block group offset 173128286208 len 1073741824 used 891572224 chunk_objectid 256 flags 1 usage 0.83
block group offset 177423253504 len 1073741824 used 1025331200 chunk_objectid 256 flags 1 usage 0.95
block group offset 179570737152 len 1073741824 used 1058926592 chunk_objectid 256 flags 1 usage 0.99
block group offset 180644478976 len 1073741824 used 291184640 chunk_objectid 256 flags 1 usage 0.27
total_free 14174478336 min_used 120881152 free_of_min_used 952860672 block_group_of_min_used 167759577088
balance block group (167759577088) can reduce the number of data block group
[chris@f27h ~]$
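The summary line of the listing can be re-derived from the per-block-group lines. Below is a minimal sketch that parses "block group offset ... len ... used ..." lines and computes the free space and the least-used block group. The regular expression and the function name summarize() are illustrative, not part of btrfs-debugfs, and the sample holds only three of the thirty lines to keep it short. Summing len - used over all thirty lines gives 14174478336, the total_free value reported above.

```python
import re

# Matches the fields used here; the remaining fields on each line are ignored.
BG_LINE = re.compile(r"block group offset (\d+) len (\d+) used (\d+)")

def summarize(text):
    """Return (total_free, min_used, block_group_of_min_used) for a listing."""
    groups = [tuple(int(x) for x in m.groups()) for m in BG_LINE.finditer(text)]
    total_free = sum(length - used for _, length, used in groups)
    min_offset, _, min_used = min(groups, key=lambda g: g[2])
    return total_free, min_used, min_offset

# Three lines copied from the listing above (first, least used, last).
SAMPLE = """\
block group offset 140882477056 len 1073741824 used 663564288 chunk_objectid 256 flags 1 usage 0.62
block group offset 167759577088 len 1073741824 used 120881152 chunk_objectid 256 flags 1 usage 0.11
block group offset 180644478976 len 1073741824 used 291184640 chunk_objectid 256 flags 1 usage 0.27
"""

total_free, min_used, min_offset = summarize(SAMPLE)
print(total_free, min_used, min_offset)
```

For the least-used block group, len - used is 952860672, which agrees with free_of_min_used in the original output. This free space inside allocated block groups is what fstrim was failing to discard.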
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2017-11-21 12:49, Chris Murphy wrote:
> On Mon, Nov 20, 2017 at 9:43 PM, Qu Wenruo wrote:
>>
>>> Apply in addition to previous patch? Or apply to clean v4.14?
>>
>> On previous patch.
>
> Refuses to apply with or without previous patch.
>
> $ git apply -v ~/qufstrim3.patch
> Checking patch fs/btrfs/extent-tree.c...
> error: while searching for:
>	int dev_ret = 0;
>	int ret = 0;
>
>	/*
>	 * try to trim all FS space, our block group may start from non-zero.
>	 */
>
> error: patch failed: fs/btrfs/extent-tree.c:10972
> error: fs/btrfs/extent-tree.c: patch does not apply
>

Please try this branch.

It's just previous patch and diff merged together and applied on v4.14
tag from torvalds.

https://github.com/adam900710/linux/tree/tmp

Thanks,
Qu
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 9:43 PM, Qu Wenruo wrote:
>
> On 2017-11-21 12:34, Chris Murphy wrote:
>> On Mon, Nov 20, 2017 at 9:29 PM, Qu Wenruo wrote:
>>>
>>> On 2017-11-21 12:06, Chris Murphy wrote:
>>>> On Mon, Nov 20, 2017 at 6:16 PM, Qu Wenruo wrote:
>>>>>
>>>>> On 2017-11-21 06:23, Chris Murphy wrote:
>>>>>> On Sun, Nov 19, 2017 at 7:42 PM, Qu Wenruo wrote:
>>>>>>>
>>>>>>> On 2017-11-20 10:24, Chris Murphy wrote:
>>>>>>>> On Sun, Nov 19, 2017 at 7:13 PM, Qu Wenruo wrote:
>>>>>>>>>
>>>>>>>>> On 2017-11-19 14:17, Chris Murphy wrote:
>>>>>>>>>> fstrim should trim free space, but it only trims unallocated. This is
>>>>>>>>>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>>>>>>>>>> behaved this way with 4.12 also.
>>>>>>>>>
>>>>>>>>> Tested with 4.14-rc7, can't reproduce it.
>>>>>>>>
>>>>>>>> $ sudo btrfs fi us /
>>>>>>>> Overall:
>>>>>>>>     Device size:           70.00GiB
>>>>>>>>     Device allocated:      31.03GiB
>>>>>>>>     Device unallocated:    38.97GiB
>>>>>>>>     Device missing:        0.00B
>>>>>>>>     Used:                  22.12GiB
>>>>>>>>     Free (estimated):      47.62GiB (min: 47.62GiB)
>>>>>>>> ...snip...
>>>>>>>>
>>>>>>>> $ sudo fstrim -v /
>>>>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>>>>
>>>>>>>> Then I run btrfs-debug -b / and find the least used block group, at 8% usage;
>>>>>>>>
>>>>>>>> block group offset 174202028032 len 1073741824 used 89206784 chunk_objectid 256 flags 1 usage 0.08
>>>>>>>>
>>>>>>>> And balance that block group:
>>>>>>>>
>>>>>>>> $ sudo btrfs balance start -dvrange=174202028032..174202028033 -dlimit=1 /
>>>>>>>> Done, had to relocate 1 out of 32 chunks
>>>>>>>>
>>>>>>>> And trim again:
>>>>>>>>
>>>>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>>>>
>>>>>>>>> Any special mount options or setup?
>>>>>>>>> (BTW, I also tried space_cache=v2 and default v1, no obvious
>>>>>>>>> difference)
>>>>>>>>
>>>>>>>> /dev/nvme0n1p8 on / type btrfs (rw,relatime,seclabel,ssd,space_cache,subvolid=333,subvol=/root27)
>>>>>>>
>>>>>>> Nothing special at all.
>>>>>>>
>>>>>>> And unfortunately, no trace point inside btrfs_trim_block_group() at
>>>>>>> all.
>>>>>>>
>>>>>>> But a quick glance shows me that the loop that iterates existing block
>>>>>>> groups to trim free space inside them has a return value overwrite bug.
>>>>>>>
>>>>>>> So only unallocated space gets trimmed.
>>>>>>>
>>>>>>> Would you please try this diff to get the return value?
>>>>>>>
>>>>>>> --
>>>>>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>>>>>> index 309a109069f1..dbec05dc8810 100644
>>>>>>> --- a/fs/btrfs/extent-tree.c
>>>>>>> +++ b/fs/btrfs/extent-tree.c
>>>>>>> @@ -10983,12 +10983,12 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>>>  				ret = cache_block_group(cache, 0);
>>>>>>>  				if (ret) {
>>>>>>>  					btrfs_put_block_group(cache);
>>>>>>> -					break;
>>>>>>> +					goto out;
>>>>>>>  				}
>>>>>>>  				ret = wait_block_group_cache_done(cache);
>>>>>>>  				if (ret) {
>>>>>>>  					btrfs_put_block_group(cache);
>>>>>>> -					break;
>>>>>>> +					goto out;
>>>>>>>  				}
>>>>>>>  			}
>>>>>>>  			ret = btrfs_trim_block_group(cache,
>>>>>>> @@ -11000,7 +11000,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>>>  			trimmed += group_trimmed;
>>>>>>>  			if (ret) {
>>>>>>>  				btrfs_put_block_group(cache);
>>>>>>> -				break;
>>>>>>> +				goto out;
>>>>>>>  			}
>>>>>>>  		}
>>>>>>>
>>>>>>> @@ -11019,6 +11019,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>>>  	}
>>>>>>>  	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>>>>>>
>>>>>>> +out:
>>>>>>>  	range->len = trimmed;
>>>>>>>  	return ret;
>>>>>>>  }
>>>>>>> --
>>>>>>
>>>>>> This won't apply on tag v4.14 for some reason.
>>>>>>
>>>>>> [chris@f27s linux]$ git apply -v ~/qutrim1.patch
>>>>>> Checking patch fs/btrfs/extent-tree.c...
>>>>>> error: while searching for:
>>>>>>	ret = cache_block_group(cache, 0);
>>>>>>	if (ret) {
>>>>>>
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 9:43 PM, Qu Wenruo wrote:
>
> And BTW, according to your sysfs discard output, it seems that you're
> using Intel 600P 256G NVME ssd, which I'm also using.

No. 'nvme list' shows it as:
SAMSUNG MZVLV256HCHP-000H1

And lspci shows it as:
6d:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 [144d:a802] (rev 01) (prog-if 02 [NVM Express])

--
Chris Murphy
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2017-11-21 12:34, Chris Murphy wrote:
> On Mon, Nov 20, 2017 at 9:29 PM, Qu Wenruo wrote:
>>
>> On 2017-11-21 12:06, Chris Murphy wrote:
>>> On Mon, Nov 20, 2017 at 6:16 PM, Qu Wenruo wrote:
>>>>
>>>> On 2017-11-21 06:23, Chris Murphy wrote:
>>>>> On Sun, Nov 19, 2017 at 7:42 PM, Qu Wenruo wrote:
>>>>>>
>>>>>> On 2017-11-20 10:24, Chris Murphy wrote:
>>>>>>> On Sun, Nov 19, 2017 at 7:13 PM, Qu Wenruo wrote:
>>>>>>>>
>>>>>>>> On 2017-11-19 14:17, Chris Murphy wrote:
>>>>>>>>> fstrim should trim free space, but it only trims unallocated. This is
>>>>>>>>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>>>>>>>>> behaved this way with 4.12 also.
>>>>>>>>
>>>>>>>> Tested with 4.14-rc7, can't reproduce it.
>>>>>>>
>>>>>>> $ sudo btrfs fi us /
>>>>>>> Overall:
>>>>>>>     Device size:           70.00GiB
>>>>>>>     Device allocated:      31.03GiB
>>>>>>>     Device unallocated:    38.97GiB
>>>>>>>     Device missing:        0.00B
>>>>>>>     Used:                  22.12GiB
>>>>>>>     Free (estimated):      47.62GiB (min: 47.62GiB)
>>>>>>> ...snip...
>>>>>>>
>>>>>>> $ sudo fstrim -v /
>>>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>>>
>>>>>>> Then I run btrfs-debug -b / and find the least used block group, at 8% usage;
>>>>>>>
>>>>>>> block group offset 174202028032 len 1073741824 used 89206784 chunk_objectid 256 flags 1 usage 0.08
>>>>>>>
>>>>>>> And balance that block group:
>>>>>>>
>>>>>>> $ sudo btrfs balance start -dvrange=174202028032..174202028033 -dlimit=1 /
>>>>>>> Done, had to relocate 1 out of 32 chunks
>>>>>>>
>>>>>>> And trim again:
>>>>>>>
>>>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>>>
>>>>>>>> Any special mount options or setup?
>>>>>>>> (BTW, I also tried space_cache=v2 and default v1, no obvious difference)
>>>>>>>
>>>>>>> /dev/nvme0n1p8 on / type btrfs (rw,relatime,seclabel,ssd,space_cache,subvolid=333,subvol=/root27)
>>>>>>
>>>>>> Nothing special at all.
>>>>>>
>>>>>> And unfortunately, no trace point inside btrfs_trim_block_group() at all.
>>>>>>
>>>>>> But a quick glance shows me that the loop that iterates existing block
>>>>>> groups to trim free space inside them has a return value overwrite bug.
>>>>>>
>>>>>> So only unallocated space gets trimmed.
>>>>>>
>>>>>> Would you please try this diff to get the return value?
>>>>>>
>>>>>> --
>>>>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>>>>> index 309a109069f1..dbec05dc8810 100644
>>>>>> --- a/fs/btrfs/extent-tree.c
>>>>>> +++ b/fs/btrfs/extent-tree.c
>>>>>> @@ -10983,12 +10983,12 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>>  				ret = cache_block_group(cache, 0);
>>>>>>  				if (ret) {
>>>>>>  					btrfs_put_block_group(cache);
>>>>>> -					break;
>>>>>> +					goto out;
>>>>>>  				}
>>>>>>  				ret = wait_block_group_cache_done(cache);
>>>>>>  				if (ret) {
>>>>>>  					btrfs_put_block_group(cache);
>>>>>> -					break;
>>>>>> +					goto out;
>>>>>>  				}
>>>>>>  			}
>>>>>>  			ret = btrfs_trim_block_group(cache,
>>>>>> @@ -11000,7 +11000,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>>  			trimmed += group_trimmed;
>>>>>>  			if (ret) {
>>>>>>  				btrfs_put_block_group(cache);
>>>>>> -				break;
>>>>>> +				goto out;
>>>>>>  			}
>>>>>>  		}
>>>>>>
>>>>>> @@ -11019,6 +11019,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>>  	}
>>>>>>  	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>>>>>
>>>>>> +out:
>>>>>>  	range->len = trimmed;
>>>>>>  	return ret;
>>>>>>  }
>>>>>> --
>>>>>
>>>>> This won't apply on tag v4.14 for some reason.
>>>>>
>>>>> [chris@f27s linux]$ git apply -v ~/qutrim1.patch
>>>>> Checking patch fs/btrfs/extent-tree.c...
>>>>> error: while searching for:
>>>>>	ret = cache_block_group(cache, 0);
>>>>>	if (ret) {
>>>>>		btrfs_put_block_group(cache);
>>>>>		break;
>>>>>	}
>>>>>	ret = wait_block_group_cache_done(cache);
>>>>>
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 9:29 PM, Qu Wenruo wrote:
>
> On 2017-11-21 12:06, Chris Murphy wrote:
>> On Mon, Nov 20, 2017 at 6:16 PM, Qu Wenruo wrote:
>>>
>>> On 2017-11-21 06:23, Chris Murphy wrote:
>>>> On Sun, Nov 19, 2017 at 7:42 PM, Qu Wenruo wrote:
>>>>>
>>>>> On 2017-11-20 10:24, Chris Murphy wrote:
>>>>>> On Sun, Nov 19, 2017 at 7:13 PM, Qu Wenruo wrote:
>>>>>>>
>>>>>>> On 2017-11-19 14:17, Chris Murphy wrote:
>>>>>>>> fstrim should trim free space, but it only trims unallocated. This is
>>>>>>>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>>>>>>>> behaved this way with 4.12 also.
>>>>>>>
>>>>>>> Tested with 4.14-rc7, can't reproduce it.
>>>>>>
>>>>>> $ sudo btrfs fi us /
>>>>>> Overall:
>>>>>>     Device size:           70.00GiB
>>>>>>     Device allocated:      31.03GiB
>>>>>>     Device unallocated:    38.97GiB
>>>>>>     Device missing:        0.00B
>>>>>>     Used:                  22.12GiB
>>>>>>     Free (estimated):      47.62GiB (min: 47.62GiB)
>>>>>> ...snip...
>>>>>>
>>>>>> $ sudo fstrim -v /
>>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>>
>>>>>> Then I run btrfs-debug -b / and find the least used block group, at 8% usage;
>>>>>>
>>>>>> block group offset 174202028032 len 1073741824 used 89206784 chunk_objectid 256 flags 1 usage 0.08
>>>>>>
>>>>>> And balance that block group:
>>>>>>
>>>>>> $ sudo btrfs balance start -dvrange=174202028032..174202028033 -dlimit=1 /
>>>>>> Done, had to relocate 1 out of 32 chunks
>>>>>>
>>>>>> And trim again:
>>>>>>
>>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>>
>>>>>>> Any special mount options or setup?
>>>>>>> (BTW, I also tried space_cache=v2 and default v1, no obvious difference)
>>>>>>
>>>>>> /dev/nvme0n1p8 on / type btrfs (rw,relatime,seclabel,ssd,space_cache,subvolid=333,subvol=/root27)
>>>>>
>>>>> Nothing special at all.
>>>>>
>>>>> And unfortunately, no trace point inside btrfs_trim_block_group() at all.
>>>>>
>>>>> But a quick glance shows me that the loop that iterates existing block
>>>>> groups to trim free space inside them has a return value overwrite bug.
>>>>>
>>>>> So only unallocated space gets trimmed.
>>>>>
>>>>> Would you please try this diff to get the return value?
>>>>>
>>>>> --
>>>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>>>> index 309a109069f1..dbec05dc8810 100644
>>>>> --- a/fs/btrfs/extent-tree.c
>>>>> +++ b/fs/btrfs/extent-tree.c
>>>>> @@ -10983,12 +10983,12 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>  				ret = cache_block_group(cache, 0);
>>>>>  				if (ret) {
>>>>>  					btrfs_put_block_group(cache);
>>>>> -					break;
>>>>> +					goto out;
>>>>>  				}
>>>>>  				ret = wait_block_group_cache_done(cache);
>>>>>  				if (ret) {
>>>>>  					btrfs_put_block_group(cache);
>>>>> -					break;
>>>>> +					goto out;
>>>>>  				}
>>>>>  			}
>>>>>  			ret = btrfs_trim_block_group(cache,
>>>>> @@ -11000,7 +11000,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>  			trimmed += group_trimmed;
>>>>>  			if (ret) {
>>>>>  				btrfs_put_block_group(cache);
>>>>> -				break;
>>>>> +				goto out;
>>>>>  			}
>>>>>  		}
>>>>>
>>>>> @@ -11019,6 +11019,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>>  	}
>>>>>  	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>>>>
>>>>> +out:
>>>>>  	range->len = trimmed;
>>>>>  	return ret;
>>>>>  }
>>>>> --
>>>>
>>>> This won't apply on tag v4.14 for some reason.
>>>>
>>>> [chris@f27s linux]$ git apply -v ~/qutrim1.patch
>>>> Checking patch fs/btrfs/extent-tree.c...
>>>> error: while searching for:
>>>>	ret = cache_block_group(cache, 0);
>>>>	if (ret) {
>>>>		btrfs_put_block_group(cache);
>>>>		break;
>>>>	}
>>>>	ret = wait_block_group_cache_done(cache);
>>>>	if (ret) {
>>>>		btrfs_put_block_group(cache);
>>>>		break;
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2017-11-21 12:06, Chris Murphy wrote:
> On Mon, Nov 20, 2017 at 6:16 PM, Qu Wenruo wrote:
>>
>> On 2017-11-21 06:23, Chris Murphy wrote:
>>> On Sun, Nov 19, 2017 at 7:42 PM, Qu Wenruo wrote:
>>>>
>>>> On 2017-11-20 10:24, Chris Murphy wrote:
>>>>> On Sun, Nov 19, 2017 at 7:13 PM, Qu Wenruo wrote:
>>>>>>
>>>>>> On 2017-11-19 14:17, Chris Murphy wrote:
>>>>>>> fstrim should trim free space, but it only trims unallocated. This is
>>>>>>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>>>>>>> behaved this way with 4.12 also.
>>>>>>
>>>>>> Tested with 4.14-rc7, can't reproduce it.
>>>>>
>>>>> $ sudo btrfs fi us /
>>>>> Overall:
>>>>>     Device size:           70.00GiB
>>>>>     Device allocated:      31.03GiB
>>>>>     Device unallocated:    38.97GiB
>>>>>     Device missing:        0.00B
>>>>>     Used:                  22.12GiB
>>>>>     Free (estimated):      47.62GiB (min: 47.62GiB)
>>>>> ...snip...
>>>>>
>>>>> $ sudo fstrim -v /
>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>
>>>>> Then I run btrfs-debug -b / and find the least used block group, at 8% usage;
>>>>>
>>>>> block group offset 174202028032 len 1073741824 used 89206784 chunk_objectid 256 flags 1 usage 0.08
>>>>>
>>>>> And balance that block group:
>>>>>
>>>>> $ sudo btrfs balance start -dvrange=174202028032..174202028033 -dlimit=1 /
>>>>> Done, had to relocate 1 out of 32 chunks
>>>>>
>>>>> And trim again:
>>>>>
>>>>> /: 39 GiB (41841328128 bytes) trimmed
>>>>>
>>>>>> Any special mount options or setup?
>>>>>> (BTW, I also tried space_cache=v2 and default v1, no obvious difference)
>>>>>
>>>>> /dev/nvme0n1p8 on / type btrfs (rw,relatime,seclabel,ssd,space_cache,subvolid=333,subvol=/root27)
>>>>
>>>> Nothing special at all.
>>>>
>>>> And unfortunately, no trace point inside btrfs_trim_block_group() at all.
>>>>
>>>> But a quick glance shows me that the loop that iterates existing block
>>>> groups to trim free space inside them has a return value overwrite bug.
>>>>
>>>> So only unallocated space gets trimmed.
>>>>
>>>> Would you please try this diff to get the return value?
>>>>
>>>> --
>>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>>> index 309a109069f1..dbec05dc8810 100644
>>>> --- a/fs/btrfs/extent-tree.c
>>>> +++ b/fs/btrfs/extent-tree.c
>>>> @@ -10983,12 +10983,12 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>  				ret = cache_block_group(cache, 0);
>>>>  				if (ret) {
>>>>  					btrfs_put_block_group(cache);
>>>> -					break;
>>>> +					goto out;
>>>>  				}
>>>>  				ret = wait_block_group_cache_done(cache);
>>>>  				if (ret) {
>>>>  					btrfs_put_block_group(cache);
>>>> -					break;
>>>> +					goto out;
>>>>  				}
>>>>  			}
>>>>  			ret = btrfs_trim_block_group(cache,
>>>> @@ -11000,7 +11000,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>  			trimmed += group_trimmed;
>>>>  			if (ret) {
>>>>  				btrfs_put_block_group(cache);
>>>> -				break;
>>>> +				goto out;
>>>>  			}
>>>>  		}
>>>>
>>>> @@ -11019,6 +11019,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>>>  	}
>>>>  	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>>>
>>>> +out:
>>>>  	range->len = trimmed;
>>>>  	return ret;
>>>>  }
>>>> --
>>>
>>> This won't apply on tag v4.14 for some reason.
>>>
>>> [chris@f27s linux]$ git apply -v ~/qutrim1.patch
>>> Checking patch fs/btrfs/extent-tree.c...
>>> error: while searching for:
>>>	ret = cache_block_group(cache, 0);
>>>	if (ret) {
>>>		btrfs_put_block_group(cache);
>>>		break;
>>>	}
>>>	ret = wait_block_group_cache_done(cache);
>>>	if (ret) {
>>>		btrfs_put_block_group(cache);
>>>		break;
>>>	}
>>>	}
>>>	ret = btrfs_trim_block_group(cache,
>>>
>>> error: patch failed: fs/btrfs/extent-tree.c:10983
>>> error: fs/btrfs/extent-tree.c: patch does not apply
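The "return value overwrite" that the quoted diff is meant to expose can be modeled in a few lines. This is a sketch of the control flow only, under the assumption that the block group loop is followed by a second loop over devices that assigns to the same ret variable; the function names, the lists of per-step return codes, and the value -5 (standing in for an error such as -EIO) are all hypothetical. It does not establish the cause of the reported behavior, only why the diff replaces break with goto out.

```python
def trim_fs_with_break(block_group_rets, device_rets):
    """Model of the flow before the diff: break leaves only the first loop."""
    ret = 0
    for r in block_group_rets:
        ret = r
        if ret:
            break          # error seen, but execution continues below
    for r in device_rets:
        ret = r            # overwrites any error from the first loop
        if ret:
            break
    return ret

def trim_fs_with_goto_out(block_group_rets, device_rets):
    """Model of the flow with the diff: an error skips straight to the end."""
    for r in block_group_rets:
        if r:
            return r       # corresponds to "goto out"
    ret = 0
    for r in device_rets:
        ret = r
        if ret:
            break
    return ret

# Hypothetical run: the second block group fails, device trimming succeeds.
print(trim_fs_with_break([0, -5], [0]))      # error is lost
print(trim_fs_with_goto_out([0, -5], [0]))   # error is reported
```

In the first variant a failure inside the block group loop is invisible to the caller, which is why the diff is described as a way "to get the return value".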
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 9:13 PM, Jeff Mahoney wrote:
> On 11/20/17 11:04 PM, Chris Murphy wrote:
>> I've sync'd and I've also rebooted, it's the same.
>> [...]
>
> What's the discard granularity on that device?
>
> grep . /sys/block/nvme0n1/queue/discard_*
> cat /sys/block/nvme0n1/discard*

# grep . /sys/block/nvme0n1/queue/discard_*
/sys/block/nvme0n1/queue/discard_granularity:512
/sys/block/nvme0n1/queue/discard_max_bytes:2199023255040
/sys/block/nvme0n1/queue/discard_max_hw_bytes:2199023255040
/sys/block/nvme0n1/queue/discard_zeroes_data:0
[root@f27h ~]# cat /sys/block/nvme0n1/discard*
512
[root@f27h ~]#

--
Chris Murphy
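As a rough aid for the question above, here is a minimal sketch that prints the discard limits the block layer advertises for a device. The function names and the `SYSFS_ROOT` override are hypothetical, added only so the sketch can be exercised against a fake directory tree; the file names are the ones shown in the output above. A `discard_max_bytes` of 0 would generally mean the device does not advertise discard support.

```shell
#!/bin/sh
# Sketch only: print the discard-related queue limits for one block device.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) used for testing.
show_discard_limits() {
    dev=$1
    qdir="${SYSFS_ROOT:-/sys}/block/$dev/queue"
    for f in discard_granularity discard_max_bytes discard_max_hw_bytes; do
        if [ -r "$qdir/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$qdir/$f")"
        fi
    done
}

# Succeeds when the device advertises a non-zero discard_max_bytes.
discard_supported() {
    max=$(cat "${SYSFS_ROOT:-/sys}/block/$1/queue/discard_max_bytes" 2>/dev/null)
    [ -n "$max" ] && [ "$max" -gt 0 ]
}
```

For the device in this thread the call would be `show_discard_limits nvme0n1`.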
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 11/20/17 11:04 PM, Chris Murphy wrote:
> On Mon, Nov 20, 2017 at 6:46 PM, Jeff Mahoney wrote:
>> What happens if you sync before doing the fstrim again?  The code is
>> there to drop extents within block groups.  It works for me.  The big
>> thing is that the space must be freed entirely before we can trim.
>
> I've sync'd and I've also rebooted, it's the same.
>
> [root@f27h ~]# fstrim -v /
> /: 38 GiB (40767586304 bytes) trimmed
> [root@f27h ~]# btrfs fi us /
> Overall:
>     Device size:                  70.00GiB
>     Device allocated:             32.03GiB
>     Device unallocated:           37.97GiB
> [...]

What's the discard granularity on that device?

grep . /sys/block/nvme0n1/queue/discard_*
cat /sys/block/nvme0n1/discard*

-Jeff

--
Jeff Mahoney
SUSE Labs
Re: bug? fstrim only trims unallocated space, not unused in bg's
Do I need btrfs debug stuff enabled for this patch to work?

$ grep -i btrfs /home/chris/linux/.config
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set
# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set
# CONFIG_BTRFS_DEBUG is not set
# CONFIG_BTRFS_ASSERT is not set
$
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 6:16 PM, Qu Wenruo wrote:
> On 2017-11-21 06:23, Chris Murphy wrote:
>> On Sun, Nov 19, 2017 at 7:42 PM, Qu Wenruo wrote:
>>> Would you please try this diff to get the return value?
>>> [...]
>>
>> This won't apply on tag v4.14 for some reason.
>> [...]
>> If I do it manually (just adding the goto), build it, and reboot, I
>> still get the same result for fstrim and nothing in dmesg.
>
> Sorry,
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 6:46 PM, Jeff Mahoney wrote:
> On 11/20/17 5:59 PM, Chris Murphy wrote:
>> It's the same value. Free space according to fi us is +10 larger than
>> before, and yet nothing additional is trimmed than before. So I don't
>> know what's going on but it's not working for me.
>
> What happens if you sync before doing the fstrim again?  The code is
> there to drop extents within block groups.  It works for me.  The big
> thing is that the space must be freed entirely before we can trim.

I've sync'd and I've also rebooted, it's the same.

[root@f27h ~]# fstrim -v /
/: 38 GiB (40767586304 bytes) trimmed
[root@f27h ~]# btrfs fi us /
Overall:
    Device size:                  70.00GiB
    Device allocated:             32.03GiB
    Device unallocated:           37.97GiB
    Device missing:                  0.00B
    Used:                         15.50GiB
    Free (estimated):             52.93GiB      (min: 52.93GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:               53.97MiB      (used: 192.00KiB)

Data,single: Size:30.00GiB, Used:15.04GiB
   /dev/nvme0n1p8         30.00GiB

Metadata,single: Size:2.00GiB, Used:473.34MiB
   /dev/nvme0n1p8          2.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/nvme0n1p8         32.00MiB

Unallocated:
   /dev/nvme0n1p8         37.97GiB
[root@f27h ~]#

--
Chris Murphy
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 11/20/17 5:59 PM, Chris Murphy wrote:
> Further,
>
> # fstrim -v /
> /: 38 GiB (40767586304 bytes) trimmed
>
> And then delete 10G worth of files, do not balance, and do nothing for
> a minute before:
>
> # fstrim -v /
> /: 38 GiB (40767586304 bytes) trimmed
>
> It's the same value. Free space according to fi us is +10 larger than
> before, and yet nothing additional is trimmed than before. So I don't
> know what's going on but it's not working for me.

What happens if you sync before doing the fstrim again?  The code is
there to drop extents within block groups.  It works for me.  The big
thing is that the space must be freed entirely before we can trim.

-Jeff

--
Jeff Mahoney
SUSE Labs
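The before/after comparison discussed here can be scripted. This is a minimal sketch under stated assumptions: `trimmed_bytes` and `trim_delta` are hypothetical helpers, the parser assumes the `(N bytes) trimmed` output format quoted in this thread, and a real run needs root.

```shell
#!/bin/sh
# Extract N from a line such as "/: 38 GiB (40767586304 bytes) trimmed".
trimmed_bytes() {
    printf '%s\n' "$1" | sed -n 's/.*(\([0-9][0-9]*\) bytes) trimmed.*/\1/p'
}

# Run fstrim, run the given command, sync, run fstrim again, and print
# how many more bytes the second run reported. Needs root for a real run.
trim_delta() {
    mnt=$1
    shift
    before=$(trimmed_bytes "$(fstrim -v "$mnt")")
    "$@"            # the step under test, e.g. deleting some files
    sync            # flush the deletion before trimming again
    after=$(trimmed_bytes "$(fstrim -v "$mnt")")
    [ -n "$before" ] && [ -n "$after" ] || return 1
    echo $(( after - before ))
}
```

A call might look like `trim_delta / rm -rf /path/to/files`, where the path is a placeholder. A result of 0 corresponds to the two identical totals reported in this message.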
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2017-11-21 06:23, Chris Murphy wrote:
> On Sun, Nov 19, 2017 at 7:42 PM, Qu Wenruo wrote:
>> Would you please try this diff to get the return value?
>> [...]
>
> This won't apply on tag v4.14 for some reason.
> [...]
> If I do it manually (just adding the goto), build it, and reboot, I
> still get the same result for fstrim and nothing in dmesg.

Sorry, that diff will not output any extra info.
It only aborts the process and returns the true error code.

I have updated the patch to produce more verbose output.
You can find it in patchwork:
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 1:40 PM, Jeff Mahoney wrote:
> The total byte count should be correct as well.  It's the total number
> of bytes that we submit for discard and that were accepted by the block
> layer.
>
> Do you have a test case that shows it being wrong and can you provide
> the blktrace capture of the device(s) while the fstrim is running?

Further,

# fstrim -v /
/: 38 GiB (40767586304 bytes) trimmed

And then delete 10G worth of files, do not balance, and do nothing for
a minute before:

# fstrim -v /
/: 38 GiB (40767586304 bytes) trimmed

It's the same value. Free space according to fi us is +10 larger than
before, and yet nothing additional is trimmed than before. So I don't
know what's going on but it's not working for me.

--
Chris Murphy
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Sun, Nov 19, 2017 at 7:42 PM, Qu Wenruo wrote:
> On 2017-11-20 10:24, Chris Murphy wrote:
>> /dev/nvme0n1p8 on / type btrfs
>> (rw,relatime,seclabel,ssd,space_cache,subvolid=333,subvol=/root27)
>
> Nothing special at all.
>
> And unfortunately, no trace point inside btrfs_trim_block_group() at all.
>
> But a quick glance shows me that, the loop to iterate existing block
> groups to trim free space inside them has a return value overwrite bug.
>
> So only unallocated space get trimmed.
>
> Would you please try this diff to get the return value?
>
> --
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 309a109069f1..dbec05dc8810 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -10983,12 +10983,12 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>                         ret = cache_block_group(cache, 0);
>                         if (ret) {
>                                 btrfs_put_block_group(cache);
> -                               break;
> +                               goto out;
>                         }
>                         ret = wait_block_group_cache_done(cache);
>                         if (ret) {
>                                 btrfs_put_block_group(cache);
> -                               break;
> +                               goto out;
>                         }
>                 }
>                 ret = btrfs_trim_block_group(cache,
> @@ -11000,7 +11000,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>                 trimmed += group_trimmed;
>                 if (ret) {
>                         btrfs_put_block_group(cache);
> -                       break;
> +                       goto out;
>                 }
>         }
>
> @@ -11019,6 +11019,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>         }
>         mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>
> +out:
>         range->len = trimmed;
>         return ret;
>  }
> --

This won't apply on tag v4.14 for some reason.

[chris@f27s linux]$ git apply -v ~/qutrim1.patch
Checking patch fs/btrfs/extent-tree.c...
error: while searching for:
                        ret = cache_block_group(cache, 0);
                        if (ret) {
                                btrfs_put_block_group(cache);
                                break;
                        }
                        ret = wait_block_group_cache_done(cache);
                        if (ret) {
                                btrfs_put_block_group(cache);
                                break;
                        }
                }
                ret = btrfs_trim_block_group(cache,

error: patch failed: fs/btrfs/extent-tree.c:10983
error: fs/btrfs/extent-tree.c: patch does not apply
[chris@f27s linux]$

If I do it manually (just adding the goto), build it, and reboot, I
still get the same result for fstrim and nothing in dmesg.

--
Chris Murphy
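To illustrate the kind of "return value overwrite" Qu describes, here is an analogy in shell, not the kernel code. A first phase records an error in `ret` and breaks out of its loop, then a second phase assigns `ret` again, so the caller only sees the second phase's status. All function names are hypothetical stand-ins, and whether this pattern explains the behavior reported in this thread is not established at this point.

```shell
#!/bin/sh
# Hypothetical stand-ins: phase one fails on "bad", phase two always succeeds.
trim_block_group() { [ "$1" != "bad" ]; }
trim_unallocated() { return 0; }

# Overwrite shape: the status saved in the loop is replaced by phase two.
trim_fs_overwrite() {
    ret=0
    for bg in "$@"; do
        trim_block_group "$bg" || { ret=$?; break; }
    done
    trim_unallocated
    ret=$?              # clobbers any error recorded above
    return "$ret"
}

# Shape of the proposed diff: leave as soon as a block group fails,
# so the caller sees the real error.
trim_fs_abort() {
    for bg in "$@"; do
        trim_block_group "$bg" || return $?
    done
    trim_unallocated
}
```

With the arguments `a bad c`, the first function reports success even though the second block group failed, while the second function returns the failure.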
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Mon, Nov 20, 2017 at 1:40 PM, Jeff Mahoney wrote:
> One thing that may not be apparent is that the byte count is from the
> device(s)'s perspective.  If you have a file system with duplicate
> chunks or a redundant RAID mode, the numbers will reflect that.

This is a single device volume, single profile for data and metadata bg's.

> The total byte count should be correct as well.  It's the total number
> of bytes that we submit for discard and that were accepted by the block
> layer.

Total byte count from fstrim matches unallocated space, it does not
match reported free space. And when I do a filtered balance to reclaim
free space into unallocated space, subsequent fstrim shows a different
total value even though no files have been deleted. So it seems like
it's really trimming unallocated space not free space, or the trimmed
total wouldn't change just because I've done a partial balance.

> Do you have a test case that shows it being wrong and can you provide
> the blktrace capture of the device(s) while the fstrim is running?

I have a test machine exhibiting this problem and can run whatever you
want on the machine, but it'll have to be rather verbose instructions.
In the meantime I'm about to test Qu's patch.

--
Chris Murphy
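The claim that the trimmed total tracks unallocated space rather than free space can be checked with arithmetic on the numbers quoted in this thread. A small sketch (the helper name `bytes_to_gib` is hypothetical) converts an fstrim byte count to GiB with two decimals, the precision that `btrfs fi us` prints.

```shell
#!/bin/sh
# Convert a byte count to GiB with two decimals (1 GiB = 2^30 bytes).
bytes_to_gib() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1073741824 }'
}

# fstrim totals reported in this thread:
#   bytes_to_gib 40767586304   -> 37.97
#   bytes_to_gib 41841328128   -> 38.97
```

Both results equal the "Device unallocated" figures reported alongside them (37.97GiB and 38.97GiB), not the "Free (estimated)" figures (52.93GiB and 47.62GiB).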
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 11/20/17 3:01 PM, Jeff Mahoney wrote:
> On 11/20/17 3:00 PM, Jeff Mahoney wrote:
>> On 11/19/17 4:38 PM, Chris Murphy wrote:
>>> On Sat, Nov 18, 2017 at 11:27 PM, Andrei Borzenkov wrote:
>>>> 19.11.2017 09:17, Chris Murphy wrote:
>>>>> fstrim should trim free space, but it only trims unallocated. This is
>>>>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>>>>> behaved this way with 4.12 also.
>>>>
>>>> Well, I was told it should also trim free space ...
>>>>
>>>> https://www.spinics.net/lists/linux-btrfs/msg61819.html
>>>
>>> It definitely isn't. If I do a partial balance, then fstrim, I get a
>>> larger trimmed value, corresponding exactly to unallocated space.
>>
>> I've just tested with 4.14 and it definitely trims within block groups.
>
> Derp. This should read 4.12.
>
>> I've attached my test script and the log of the run. I'll build and
>> test a 4.14 kernel and see if I can reproduce there. It may well be
>> that we're just misreporting the bytes trimmed.

I get the same results on v4.14. I wrote up a little script to parse
the btrfs-debug-tree extent tree dump, and the discards that are issued
after the final sync (when the tree is dumped) match.

The script output is also as expected:

/mnt2: 95.1 GiB (102082281472 bytes) trimmed
# remove every other 100MB file, totalling 1.5 GB
+ sync
+ killall blktrace
+ wait
+ echo 'after sync'
+ sleep 1
+ btrace -a discard /dev/loop0
+ fstrim -v /mnt2
/mnt2: 96.6 GiB (103659962368 bytes) trimmed

One thing that may not be apparent is that the byte count is from the
device(s)'s perspective. If you have a file system with duplicate
chunks or a redundant RAID mode, the numbers will reflect that.

The total byte count should be correct as well. It's the total number
of bytes that we submit for discard and that were accepted by the block
layer.

Do you have a test case that shows it being wrong, and can you provide
the blktrace capture of the device(s) while the fstrim is running?

-Jeff

--
Jeff Mahoney
SUSE Labs
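As a rough cross-check of the log in this message, the difference between the two fstrim totals can be compared with the roughly 1.5 GB of files removed between the runs. The following is a minimal Python sketch; the variable names are illustrative, and only the two byte counts come from the log.

```python
# Byte counts copied from the two `fstrim -v /mnt2` runs in the log.
before = 102082281472
after = 103659962368

GIB = 1024 ** 3

# Extra bytes submitted for discard by the second run.
delta = after - before
delta_gib = delta / GIB

print(f"delta: {delta} bytes = {delta_gib:.2f} GiB")
```

The second run reports about 1.47 GiB more than the first, which is in line with the removed files freeing space inside already-allocated block groups. Since the count is taken from the device's perspective, a DUP or redundant RAID profile would presumably show a larger delta than the logical size of the deleted files.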
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 11/20/17 3:00 PM, Jeff Mahoney wrote:
> On 11/19/17 4:38 PM, Chris Murphy wrote:
>> On Sat, Nov 18, 2017 at 11:27 PM, Andrei Borzenkov wrote:
>>> 19.11.2017 09:17, Chris Murphy wrote:
>>>> fstrim should trim free space, but it only trims unallocated. This is
>>>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>>>> behaved this way with 4.12 also.
>>>
>>> Well, I was told it should also trim free space ...
>>>
>>> https://www.spinics.net/lists/linux-btrfs/msg61819.html
>>
>> It definitely isn't. If I do a partial balance, then fstrim, I get a
>> larger trimmed value, corresponding exactly to unallocated space.
>
> I've just tested with 4.14 and it definitely trims within block groups.

Derp. This should read 4.12.

> I've attached my test script and the log of the run. I'll build and
> test a 4.14 kernel and see if I can reproduce there. It may well be
> that we're just misreporting the bytes trimmed.
>
> -Jeff

--
Jeff Mahoney
SUSE Labs
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2017-11-20 10:24, Chris Murphy wrote:
> On Sun, Nov 19, 2017 at 7:13 PM, Qu Wenruo wrote:
>>
>> On 2017-11-19 14:17, Chris Murphy wrote:
>>> fstrim should trim free space, but it only trims unallocated. This is
>>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>>> behaved this way with 4.12 also.
>>
>> Tested with 4.14-rc7, can't reproduce it.
>
> $ sudo btrfs fi us /
> Overall:
>     Device size:                  70.00GiB
>     Device allocated:             31.03GiB
>     Device unallocated:           38.97GiB
>     Device missing:                  0.00B
>     Used:                         22.12GiB
>     Free (estimated):             47.62GiB      (min: 47.62GiB)
> ...snip...
>
> $ sudo fstrim -v /
> /: 39 GiB (41841328128 bytes) trimmed
>
> Then I run btrfs-debug -b / and find the least used block group, at 8% usage;
>
> block group offset 174202028032 len 1073741824 used 89206784
> chunk_objectid 256 flags 1 usage 0.08
>
> And balance that block group:
>
> $ sudo btrfs balance start -dvrange=174202028032..174202028033 -dlimit=1 /
> Done, had to relocate 1 out of 32 chunks
>
> And trim again:
>
> /: 39 GiB (41841328128 bytes) trimmed
>
>> Any special mount options or setup?
>> (BTW, I also tried space_cache=v2 and default v1, no obvious difference)
>
> /dev/nvme0n1p8 on / type btrfs
> (rw,relatime,seclabel,ssd,space_cache,subvolid=333,subvol=/root27)

Nothing special at all.

And unfortunately, there is no trace point inside
btrfs_trim_block_group() at all.

But a quick glance shows that the loop iterating over existing block
groups to trim their free space has a return-value overwrite bug, so
only unallocated space gets trimmed.

Would you please try this diff to get the return value?

--
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 309a109069f1..dbec05dc8810 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10983,12 +10983,12 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 				ret = cache_block_group(cache, 0);
 				if (ret) {
 					btrfs_put_block_group(cache);
-					break;
+					goto out;
 				}
 				ret = wait_block_group_cache_done(cache);
 				if (ret) {
 					btrfs_put_block_group(cache);
-					break;
+					goto out;
 				}
 			}
 			ret = btrfs_trim_block_group(cache,
@@ -11000,7 +11000,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 			trimmed += group_trimmed;
 			if (ret) {
 				btrfs_put_block_group(cache);
-				break;
+				goto out;
 			}
 		}
 
@@ -11019,6 +11019,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	}
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 
+out:
 	range->len = trimmed;
 	return ret;
 }
--

Thanks,
Qu

> Would a strace of fstrim help?
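The return-value overwrite suspected in this message can be illustrated with a toy Python model of the two loops. This is a sketch of the hypothesis only, not kernel code: the function name `trim_fs`, the `goto_out_on_error` flag, and the input lists are all hypothetical stand-ins, and the error code and byte counts are made up for the demonstration.

```python
def trim_fs(block_groups, devices, goto_out_on_error):
    """Toy model of the control flow described for btrfs_trim_fs().

    block_groups / devices are lists of (ret, bytes_trimmed) pairs that
    stand in for the result of trimming one block group or the
    unallocated space of one device. These inputs are hypothetical.
    """
    ret = 0
    trimmed = 0

    # Loop 1: free space inside existing block groups.
    for group_ret, group_trimmed in block_groups:
        ret = group_ret
        trimmed += group_trimmed
        if ret:
            if goto_out_on_error:
                return ret, trimmed  # patched behaviour: 'goto out'
            break                    # unpatched: fall through to loop 2

    # Loop 2: unallocated space on each device. Assigning ret here
    # overwrites any error left over from loop 1.
    for dev_ret, dev_trimmed in devices:
        ret = dev_ret
        if ret:
            break
        trimmed += dev_trimmed

    return ret, trimmed


EIO = 5
failing_groups = [(-EIO, 0)]             # first block group fails, trims nothing
healthy_devices = [(0, 38 * 1024 ** 3)]  # unallocated space trims fine

unpatched = trim_fs(failing_groups, healthy_devices, goto_out_on_error=False)
patched = trim_fs(failing_groups, healthy_devices, goto_out_on_error=True)
print("unpatched:", unpatched)
print("patched:  ", patched)
```

In the unpatched model the caller sees success and a byte count equal to the unallocated space, which would look just like the reported symptom. With the early exit the error code survives, which is what the debugging diff is meant to expose.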
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Sun, Nov 19, 2017 at 7:24 PM, Chris Murphy wrote:

> $ sudo fstrim -v /
> /: 39 GiB (41841328128 bytes) trimmed

> And trim again:
>
> /: 39 GiB (41841328128 bytes) trimmed

Cute. The balance command claimed it balanced a chunk but it didn't.
btrfs-debug -b says that same 8% chunk is present...

block group offset 175275769856 len 1073741824 used 89206784
chunk_objectid 256 flags 1 usage 0.08

Fine. I'll do a -dusage instead.

$ sudo btrfs balance start -dusage=11 /
Done, had to relocate 2 out of 32 chunks
$ sudo fstrim -v /
/: 40 GiB (42915069952 bytes) trimmed
$ sudo btrfs balance start -dusage=21 /
Done, had to relocate 2 out of 31 chunks
$ sudo fstrim -v /
/: 41 GiB (43988811776 bytes) trimmed

OK so a different bug is that it's claiming to balance two chunks but
it's really only balancing one. That same 8% used block group was not
rewritten, it's at the same address, so for whatever reason that tiny
one is pinned.

--
Chris Murphy
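The three fstrim totals in this message can be checked against the block group length from the btrfs-debug line. A minimal Python sketch follows; the variable names are illustrative and the numbers are copied from the transcript.

```python
# 'len' of the block group printed by btrfs-debug -b.
BLOCK_GROUP_LEN = 1073741824

# fstrim totals: initial run, after -dusage=11, after -dusage=21.
trimmed = [41841328128, 42915069952, 43988811776]

# Growth of the trimmed total between consecutive runs.
deltas = [later - earlier for earlier, later in zip(trimmed, trimmed[1:])]

for delta in deltas:
    print(delta, "bytes =", delta // BLOCK_GROUP_LEN, "block group(s)")
```

Each balance grows the trimmed total by exactly one 1 GiB block group, which matches the chunk count dropping from 32 to 31 in the balance output. That is the behaviour one would expect if only unallocated space were being counted.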
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Sun, Nov 19, 2017 at 7:13 PM, Qu Wenruo wrote:
>
> On 2017-11-19 14:17, Chris Murphy wrote:
>> fstrim should trim free space, but it only trims unallocated. This is
>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>> behaved this way with 4.12 also.
>
> Tested with 4.14-rc7, can't reproduce it.

$ sudo btrfs fi us /
Overall:
    Device size:                  70.00GiB
    Device allocated:             31.03GiB
    Device unallocated:           38.97GiB
    Device missing:                  0.00B
    Used:                         22.12GiB
    Free (estimated):             47.62GiB      (min: 47.62GiB)
...snip...

$ sudo fstrim -v /
/: 39 GiB (41841328128 bytes) trimmed

Then I run btrfs-debug -b / and find the least used block group, at 8% usage;

block group offset 174202028032 len 1073741824 used 89206784
chunk_objectid 256 flags 1 usage 0.08

And balance that block group:

$ sudo btrfs balance start -dvrange=174202028032..174202028033 -dlimit=1 /
Done, had to relocate 1 out of 32 chunks

And trim again:

/: 39 GiB (41841328128 bytes) trimmed

> Any special mount options or setup?
> (BTW, I also tried space_cache=v2 and default v1, no obvious difference)

/dev/nvme0n1p8 on / type btrfs
(rw,relatime,seclabel,ssd,space_cache,subvolid=333,subvol=/root27)

Would a strace of fstrim help?

--
Chris Murphy
Re: bug? fstrim only trims unallocated space, not unused in bg's
On 2017-11-19 14:17, Chris Murphy wrote:
> fstrim should trim free space, but it only trims unallocated. This is
> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
> behaved this way with 4.12 also.

Tested with 4.14-rc7, can't reproduce it.

--
# btrfs fi us /mnt/btrfs/
Overall:
    Device size:                   1.00GiB
    Device allocated:            566.38MiB
    Device unallocated:          457.62MiB
    Device missing:                  0.00B
    Used:                        256.81MiB
    Free (estimated):            649.62MiB      (min: 420.81MiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:               16.00MiB      (used: 0.00B)

Data,single: Size:448.00MiB, Used:256.00MiB
   /dev/loop0    448.00MiB

Metadata,DUP: Size:51.19MiB, Used:400.00KiB
   /dev/loop0    102.38MiB

System,DUP: Size:8.00MiB, Used:16.00KiB
   /dev/loop0     16.00MiB

Unallocated:
   /dev/loop0    457.62MiB

# fstrim /mnt/btrfs -v
/mnt/btrfs: 665.3 MiB (697597952 bytes) trimmed
--

Any special mount options or setup?
(BTW, I also tried space_cache=v2 and default v1, no obvious difference)

Thanks,
Qu

> [root@f27h ~]# fstrim -v /
> /: 39 GiB (41841328128 bytes) trimmed
> [root@f27h ~]# btrfs fi us /
> Overall:
>     Device size:                  70.00GiB
>     Device allocated:             31.03GiB
>     Device unallocated:           38.97GiB
>     Device missing:                  0.00B
>     Used:                         22.02GiB
>     Free (estimated):             47.72GiB      (min: 47.72GiB)
>     Data ratio:                       1.00
>     Metadata ratio:                   1.00
>     Global reserve:               65.97MiB      (used: 0.00B)
>
> Data,single: Size:30.00GiB, Used:21.25GiB
>    /dev/nvme0n1p8         30.00GiB
>
> Metadata,single: Size:1.00GiB, Used:791.58MiB
>    /dev/nvme0n1p8          1.00GiB
>
> System,single: Size:32.00MiB, Used:16.00KiB
>    /dev/nvme0n1p8         32.00MiB
>
> Unallocated:
>    /dev/nvme0n1p8         38.97GiB
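The two reports in this message can be compared by subtracting "Device unallocated" from the byte count printed by fstrim. A minimal Python sketch, assuming the unallocated figures are only accurate to the two decimals shown; the helper name and labels are illustrative.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# (label, bytes reported by fstrim -v, "Device unallocated" in bytes)
reports = [
    ("loop0, 4.14-rc7 (Qu)", 697597952, 457.62 * MIB),
    ("nvme0n1p8, 4.14.0 (Chris)", 41841328128, 38.97 * GIB),
]


def excess_over_unallocated(trimmed, unallocated):
    """Bytes trimmed beyond unallocated space, i.e. inside block groups."""
    return trimmed - unallocated


for label, trimmed, unallocated in reports:
    extra = excess_over_unallocated(trimmed, unallocated)
    print(f"{label}: {extra / MIB:.1f} MiB beyond unallocated")
```

On the loop device the trim covers roughly 208 MiB more than the unallocated space, so some free space inside block groups was discarded there. On the NVMe partition the difference is within the rounding of the 38.97GiB figure, which is the discrepancy being reported.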
Re: bug? fstrim only trims unallocated space, not unused in bg's
On Sat, Nov 18, 2017 at 11:27 PM, Andrei Borzenkov wrote:
> 19.11.2017 09:17, Chris Murphy wrote:
>> fstrim should trim free space, but it only trims unallocated. This is
>> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
>> behaved this way with 4.12 also.
>>
>
> Well, I was told it should also trim free space ...
>
> https://www.spinics.net/lists/linux-btrfs/msg61819.html
>

It definitely isn't. If I do a partial balance, then fstrim, I get a
larger trimmed value, corresponding exactly to unallocated space.

--
Chris Murphy
Re: bug? fstrim only trims unallocated space, not unused in bg's
19.11.2017 09:17, Chris Murphy wrote:
> fstrim should trim free space, but it only trims unallocated. This is
> with kernel 4.14.0 and the entire 4.13 series. I'm pretty sure it
> behaved this way with 4.12 also.
>

Well, I was told it should also trim free space ...

https://www.spinics.net/lists/linux-btrfs/msg61819.html

> [root@f27h ~]# fstrim -v /
> /: 39 GiB (41841328128 bytes) trimmed
> [root@f27h ~]# btrfs fi us /
> Overall:
>     Device size:                  70.00GiB
>     Device allocated:             31.03GiB
>     Device unallocated:           38.97GiB
>     Device missing:                  0.00B
>     Used:                         22.02GiB
>     Free (estimated):             47.72GiB      (min: 47.72GiB)
>     Data ratio:                       1.00
>     Metadata ratio:                   1.00
>     Global reserve:               65.97MiB      (used: 0.00B)
>
> Data,single: Size:30.00GiB, Used:21.25GiB
>    /dev/nvme0n1p8         30.00GiB
>
> Metadata,single: Size:1.00GiB, Used:791.58MiB
>    /dev/nvme0n1p8          1.00GiB
>
> System,single: Size:32.00MiB, Used:16.00KiB
>    /dev/nvme0n1p8         32.00MiB
>
> Unallocated:
>    /dev/nvme0n1p8         38.97GiB