On Mon, Feb 8, 2021 at 3:21 PM Zygo Blaxell
<ce3g8...@umail.furryterror.org> wrote:


> defrag will put the file's contents back into delalloc, and it won't be
> allocated until a flush (fsync, sync, or commit interval).  Defrag is
> roughly equivalent to simply copying the data to a new file in btrfs,
> except the logical extents are atomically updated to point to the new
> location.

BTRFS_IOC_DEFRAG results:
https://pastebin.com/1ufErVMs

BTRFS_IOC_DEFRAG_RANGE results:
https://pastebin.com/429fZmNB

They're different.

Questions: is this a bug? is it intentional? does the interleaved
BTRFS_IOC_DEFRAG version improve things over the non-defragmented
file, which had only 3 8MB extents for a 24MB file, plus 1 4KiB block?
Should BTRFS_IOC_DEFRAG be capable of estimating fragmentation and
just do a no op in that case?
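
To make the "estimate fragmentation first" idea concrete, here is a minimal user-space sketch in Python. It is only an illustration of the question, not what the kernel does: the function name worth_defragging, the MIB constant, and the rule itself (defrag is only worthwhile if two neighbouring extents are both below a threshold) are assumptions of mine.

```python
MIB = 1024 * 1024


def worth_defragging(extent_lengths, thresh=8 * MIB):
    """Hypothetical heuristic: True only if two neighbouring extents
    are both smaller than thresh, i.e. there is something to merge."""
    small = [length < thresh for length in extent_lengths]
    return any(a and b for a, b in zip(small, small[1:]))


# The layout described above: three 8 MiB extents plus one 4 KiB block.
layout = [8 * MIB, 8 * MIB, 8 * MIB, 4096]
print(worth_defragging(layout))  # False
```

Under that assumed rule the 24MB file would be left alone, since the lone 4KiB block has no small neighbour to merge with.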


> FIEMAP has an option flag to sync the data before returning a map.
> DEFRAG has an option to start IO immediately so it will presumably be
> done by the time you look at the extents with FIEMAP.

I waited for the defrag result to settle, so the results I've posted are stable.
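
For anyone who wants to reproduce the extent listing without waiting, a rough Python sketch of FIEMAP with the sync flag follows. The constants and struct layouts are taken from linux/fiemap.h and linux/fs.h; the ioctl number assumes the generic ioctl encoding (x86, arm64), and the helper names (fiemap_synced, decode_extents, count_unwritten) and the 512-extent buffer size are my own.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B        # _IOWR('f', 11, struct fiemap), generic encoding assumed
FIEMAP_FLAG_SYNC = 0x1            # sync the file before mapping
FIEMAP_EXTENT_LAST = 0x1
FIEMAP_EXTENT_UNWRITTEN = 0x800   # preallocated, not yet written

# struct fiemap: fm_start, fm_length, fm_flags, fm_mapped_extents,
#                fm_extent_count, fm_reserved
HEADER = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
#                       fe_reserved64[2], fe_flags, fe_reserved[3]
EXTENT = struct.Struct("=QQQ2QI3I")


def decode_extents(buf):
    """Decode a filled struct fiemap buffer into
    (logical, physical, length, flags) tuples."""
    mapped = HEADER.unpack_from(buf, 0)[3]
    extents = []
    for i in range(mapped):
        rec = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((rec[0], rec[1], rec[2], rec[5]))
    return extents


def count_unwritten(extents):
    """Count extents flagged FIEMAP_EXTENT_UNWRITTEN."""
    return sum(1 for e in extents if e[3] & FIEMAP_EXTENT_UNWRITTEN)


def fiemap_synced(path, max_extents=512):
    """Map the whole file, asking the kernel to sync it first.
    Returns at most max_extents extents (an arbitrary limit here)."""
    buf = bytearray(HEADER.size + max_extents * EXTENT.size)
    HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF,
                     FIEMAP_FLAG_SYNC, 0, max_extents, 0)
    with open(path, "rb") as f:
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf, True)
    return decode_extents(buf)
```

Calling count_unwritten(fiemap_synced(path)) on a journal file should then give the number of unwritten extents after a flush, assuming the file has no more than max_extents extents.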


> Be very careful how you set up this test case.  If you use fallocate on
> a file, it has a _permanent_ effect on the inode, and alters a lot of
> normal btrfs behavior downstream.  You won't see these effects if you
> just write some data to a file without using prealloc.

OK. That might answer the idempotent question. Following
BTRFS_IOC_DEFRAG most unwritten extents are no longer present. I can't
figure out the pattern. Some of the archived journals have them,
others have one, but none have the four or more that I see in active
use journals. And then when defragged with BTRFS_IOC_DEFRAG_RANGE none
of those have unwritten extents.

Since the file is changing each time it goes through the ioctl it
makes sense what comes out the back end is different.

While BTRFS_IOC_DEFRAG_RANGE has a no op if an extent is bigger than
the -l (len=) value, I can't tell that BTRFS_IOC_DEFRAG has any sort
of no op unless there's no fragments at all *shrug*.

Maybe they should use BTRFS_IOC_DEFRAG_RANGE and specify an 8MB extent?
Because in the nodatacow case, that's what they already have and it'd
be a no op. And then for datacow case... well I don't like
unconditional write amplification on SSDs just to satisfy the HDD
case. But it'd be avoidable by just using default (nodatacow for the
journals).
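
Roughly what I have in mind, as a Python sketch rather than a patch: call BTRFS_IOC_DEFRAG_RANGE over the whole file with the extent_thresh field of struct btrfs_ioctl_defrag_range_args set to 8MiB. The struct layout and flag value come from linux/btrfs.h; the ioctl number assumes the generic ioctl encoding (x86, arm64), and the helper names defrag_range_args and defrag_range are made up for this example.

```python
import fcntl
import struct

MIB = 1024 * 1024

BTRFS_IOC_DEFRAG_RANGE = 0x40309410   # _IOW(0x94, 16, args struct), generic encoding assumed
BTRFS_DEFRAG_RANGE_START_IO = 2       # start writeback right away

# struct btrfs_ioctl_defrag_range_args:
# start, len, flags, extent_thresh, compress_type, unused[4]
ARGS = struct.Struct("=QQQII4I")


def defrag_range_args(thresh=8 * MIB, start_io=True):
    """Pack the ioctl argument for a whole-file defrag with the
    given extent threshold."""
    flags = BTRFS_DEFRAG_RANGE_START_IO if start_io else 0
    # len of (u64)-1 means "to the end of the file"
    return ARGS.pack(0, 0xFFFFFFFFFFFFFFFF, flags, thresh, 0, 0, 0, 0, 0)


def defrag_range(path, thresh=8 * MIB):
    """Issue BTRFS_IOC_DEFRAG_RANGE on path (must be on btrfs)."""
    with open(path, "r+b") as f:
        fcntl.ioctl(f.fileno(), BTRFS_IOC_DEFRAG_RANGE,
                    defrag_range_args(thresh))
```

With the nodatacow journals already sitting in 8MB extents, I'd expect defrag_range() on them to leave those extents untouched, but I haven't verified that beyond the results above.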

-- 
Chris Murphy
