Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:

On 2019-09-12 15:18, webmas...@zedlx.com wrote:

Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:

On 2019-09-11 17:37, webmas...@zedlx.com wrote:

Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:

On 2019-09-11 13:20, webmas...@zedlx.com wrote:

Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:

On 2019-09-10 19:32, webmas...@zedlx.com wrote:

Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:



* Reflinks can reference partial extents.  This means, ultimately, that you may end up having to split extents in odd ways during defrag if you want to preserve reflinks, and might have to split extents _elsewhere_ that are only tangentially related to the region being defragmented. See the example in my previous email for a case like this: maintaining the shared regions as shared when you defragment either file to a single extent will require splitting extents in the other file (in either case, whichever file you don't defragment to a single extent will end up with 7 extents if you force the defragmented one to be the canonical version).  Once you consider that a given extent can have multiple ranges reflinked from multiple other locations, it gets even more complicated.

I think that this problem can be solved, and that it can be solved perfectly (the result is a perfectly-defragmented file). But if it is so hard to do, just skip those problematic extents in the initial version of defrag.

Ultimately, the super-duper defrag should split up those partially-referenced extents.

* If you sidestep the above point by not letting defrag split extents, you put a hard lower limit on the amount of fragmentation present in a file if you want to preserve reflinks. IOW, you can't defragment files past a certain point.  If we go this way, neither of the two files in the example from my previous email could be defragmented any further than they already are, because doing so would require splitting extents.

Oh, you're reading my thoughts. That's good.

The initial implementation of defrag might be not-so-perfect. It would still be better than the current defrag.

This is not a one-way street. Handling of partially-used extents can be improved in later versions.

* Determining all the reflinks to a given region of a given extent is not a cheap operation, and the information may immediately be stale (because an operation right after you fetch the info might change things).  We could work around this by locking the extent somehow, but doing so would be expensive because you would have to hold the lock for the entire defrag operation.

No. DO NOT LOCK TO RETRIEVE REFLINKS.

Instead, you have to create a hook in every function that updates the reflink structure or extents (for example, the write-to-file operation). So, when a reflink gets changed, the defrag is immediately notified about this. That way the defrag can keep its data about reflinks in sync with the filesystem.
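
To make this concrete, here is a minimal userspace sketch of such a hook. All names are hypothetical (this is not actual btrfs code); the point is only that the extent/reflink update path calls a notifier, and a running defrag registers a handler there:

/* Hypothetical sketch of the hook idea -- none of these names are real
 * btrfs symbols.  The extent/reflink update path calls a notifier so a
 * running defrag can keep its in-memory view in sync. */
#include <stdio.h>

struct extent_change {
    unsigned long long disk_offset;   /* where the affected extent lives  */
    unsigned long long length;        /* length of the affected range     */
    int op;                           /* 0 = ref added, 1 = ref dropped   */
};

/* Callback registered by the defrag while it is running (NULL otherwise). */
static void (*defrag_notifier)(const struct extent_change *chg);

/* Called from every place that modifies extents or reflinks,
 * e.g. at the end of a write-to-file or clone operation. */
static void notify_extent_update(const struct extent_change *chg)
{
    if (defrag_notifier)
        defrag_notifier(chg);
}

/* What the defrag registers: update its in-memory bookkeeping here. */
static void defrag_on_extent_update(const struct extent_change *chg)
{
    printf("defrag: change at %llu, len %llu, op %d\n",
           chg->disk_offset, chg->length, chg->op);
}

int main(void)
{
    struct extent_change chg = { 4096ULL * 1000, 8192, 0 };

    defrag_notifier = defrag_on_extent_update;  /* defrag starts              */
    notify_extent_update(&chg);                 /* an update path fires hook  */
    defrag_notifier = NULL;                     /* defrag exits               */
    return 0;
}

When no defrag is running the notifier is NULL, so the hook costs a single test per update.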

This doesn't get around the fact that it's still an expensive operation to enumerate all the reflinks for a given region of a file or extent.

No, you are wrong.

In order to enumerate all the reflinks in a region, the defrag needs to have another array, which is also kept in memory and in sync with the filesystem. The easiest approach is to divide the disk into regions of equal size, where each region is a few MB large. Let's call this the "regions-to-extents" array. It doesn't need to be associative; it is a plain array. This in-memory array links regions of the disk to the extents that lie in each region. The array is initialized when defrag starts.

This array makes the operation of finding all extents in a region extremely fast.
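
As a rough illustration (made-up sizes and names, plain userspace C, nothing resembling btrfs internals): the array has one slot per fixed-size region, each slot holds the list of extents touching that region, and an extent crossing a region boundary simply appears in every region it overlaps.

/* Minimal sketch of a regions-to-extents array: hypothetical userspace
 * code with made-up sizes, not btrfs internals. */
#include <stdio.h>
#include <stdlib.h>

#define REGION_SIZE  (4ULL << 20)                 /* 4 MiB per region   */
#define DISK_SIZE    (1ULL << 30)                 /* 1 GiB demo "disk"  */
#define NUM_REGIONS  (DISK_SIZE / REGION_SIZE)

struct extent {                      /* one on-disk extent                  */
    unsigned long long offset;       /* start position on disk              */
    unsigned long long length;
};

struct node {                        /* list entry: one extent in a region  */
    const struct extent *ext;
    struct node *next;
};

/* The plain array: one list head per fixed-size region of the disk. */
static struct node *regions[NUM_REGIONS];

/* Add an extent to every region it overlaps.  Called once per extent when
 * the array is built at defrag startup, and again from the update hooks
 * whenever an extent appears or moves.  (Error handling omitted.) */
static void add_extent(const struct extent *e)
{
    unsigned long long first = e->offset / REGION_SIZE;
    unsigned long long last  = (e->offset + e->length - 1) / REGION_SIZE;

    for (unsigned long long r = first; r <= last; r++) {
        struct node *n = malloc(sizeof(*n));
        n->ext  = e;
        n->next = regions[r];
        regions[r] = n;
    }
}

/* "Find all extents in a region" is just walking one short list. */
static void dump_region(unsigned long long r)
{
    printf("region %llu:\n", r);
    for (struct node *n = regions[r]; n; n = n->next)
        printf("  extent at %llu, length %llu\n", n->ext->offset, n->ext->length);
}

int main(void)
{
    static const struct extent a = { 3 * (4ULL << 20) + 1024, 8ULL << 20 };
    static const struct extent b = { 5 * (4ULL << 20), 64 * 1024 };

    add_extent(&a);                  /* spans regions 3, 4 and 5            */
    add_extent(&b);                  /* falls inside region 5               */
    dump_region(5);
    return 0;
}

Removing or moving an extent is the symmetric operation on the same lists, which is what the hooks described above would trigger.
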
That has two issues:

* That's going to be a _lot_ of memory. You still need to be able to defragment big (dozens-of-TB) arrays without needing multiple GB of RAM just for the defrag operation; otherwise it's not realistically useful (remember, it was big arrays that had issues with the old reflink-aware defrag too). See the rough numbers after this list.

* You still have to populate the array in the first place. A sane implementation wouldn't keep it in memory when defrag is not running (no way is anybody going to tolerate even dozens of MB of memory overhead for this), so you're not going to get around the need to enumerate all the reflinks for a file at least once (during startup, or when starting to process that file); you're just moving the overhead around instead of eliminating it.
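
To put rough numbers on the memory point (a back-of-envelope estimate, assuming 4 MiB regions, which is just my guess at "a few MB"): a 64 TiB filesystem has 64 TiB / 4 MiB = about 16.8 million regions, so even a bare 16-byte list head per region is already roughly 256 MiB before a single extent entry is stored. Every extent then needs an entry in each region it touches; on a reasonably full filesystem with an average extent size in the low hundreds of KiB that's a few hundred million entries, which at a couple dozen bytes each lands you in the multi-GB range.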

Nope, I'm not just "moving the overhead around instead of eliminating it"; I am eliminating it.

The only overhead is at defrag startup, when the entire b-tree structure has to be loaded and examined. That happens in a few seconds.

After this point, there is no more "overhead" because the running defrag is always notified of any changes to the b-trees (by hooks in the b-tree update routines). Whenever there is such a change, the regions-to-extents array gets updated. Since this array is in memory, the update is so fast that it can be considered zero overhead.
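
For illustration, the startup step is nothing more than one pass over the extent records (here a hard-coded stand-in for walking the extent tree). This sketch only counts extents per region to stay short; in practice each slot would hold the extent references as in the earlier sketch. Again, hypothetical userspace code, not btrfs internals:

/* Sketch of the startup step only: one pass over the extent records
 * (a hard-coded stand-in for walking the extent tree) fills the
 * per-region index. */
#include <stddef.h>
#include <stdio.h>

#define REGION_SIZE (4ULL << 20)                      /* 4 MiB regions    */

struct extent { unsigned long long offset, length; };

/* Pretend these came from enumerating the extent tree at defrag startup. */
static const struct extent extent_tree[] = {
    {           0,  1ULL << 20 },
    {  8ULL << 20, 12ULL << 20 },
    { 40ULL << 20,  2ULL << 20 },
};

static unsigned long long extents_per_region[64];     /* demo: 64 regions */

int main(void)
{
    /* The only full scan: visit every extent once and account for it in
     * each region it touches.  After this, the update hooks keep the
     * numbers current incrementally. */
    for (size_t i = 0; i < sizeof(extent_tree) / sizeof(extent_tree[0]); i++) {
        unsigned long long first = extent_tree[i].offset / REGION_SIZE;
        unsigned long long last  = (extent_tree[i].offset +
                                    extent_tree[i].length - 1) / REGION_SIZE;
        for (unsigned long long r = first; r <= last; r++)
            extents_per_region[r]++;
    }

    for (unsigned long long r = 0; r < 12; r++)
        printf("region %llu: %llu extent(s)\n", r, extents_per_region[r]);
    return 0;
}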

