Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:
On 2019-09-12 15:18, webmas...@zedlx.com wrote:
Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:
On 2019-09-11 17:37, webmas...@zedlx.com wrote:
Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:
On 2019-09-11 13:20, webmas...@zedlx.com wrote:
Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:
On 2019-09-10 19:32, webmas...@zedlx.com wrote:
Quoting "Austin S. Hemmelgarn" <ahferro...@gmail.com>:
* Reflinks can reference partial extents. This means,
ultimately, that you may end up having to split extents in odd
ways during defrag if you want to preserve reflinks, and might
have to split extents _elsewhere_ that are only tangentially
related to the region being defragmented. See the example in my
previous email for a case like this: maintaining the shared
regions as shared when you defragment either file to a
single extent will require splitting extents in the other file
(in either case, whichever file you don't defragment to a single
extent will end up with 7 extents if you try to force the
defragmented one to be the canonical version). Once you
consider that a given extent can have multiple ranges reflinked
from multiple other locations, it gets even more complicated.
I think that this problem can be solved, and solved
perfectly (the result being a perfectly defragmented file).
But, if it is that hard to do, just skip those problematic extents
in the initial version of defrag.
Ultimately, in the super-duper defrag, those partially-referenced
extents should be split up by defrag.
* If you choose to just not handle the above point by not
letting defrag split extents, you put a hard lower limit on the
amount of fragmentation present in a file if you want to
preserve reflinks. IOW, you can't defragment files past a
certain point. If we go this way, neither of the two files in
the example from my previous email could be defragmented any
further than they already are, because doing so would require
splitting extents.
Oh, you're reading my thoughts. That's good.
The initial implementation of defrag might not be perfect. It
would still be better than the current defrag.
This is not a one-way street. Handling of partially-used extents
can be improved in later versions.
* Determining all the reflinks to a given region of a given
extent is not a cheap operation, and the information may
immediately be stale (because an operation right after you fetch
the info might change things). We could work around this by
locking the extent somehow, but doing so would be expensive
because you would have to hold the lock for the entire defrag
operation.
No. DO NOT LOCK TO RETRIEVE REFLINKS.
Instead, you have to create a hook in every function that updates
the reflink structure or extents (for example, the write-to-file
operation). So, when a reflink gets changed, the defrag is
immediately notified. That way the defrag can keep its
data about reflinks in sync with the filesystem.
This doesn't get around the fact that it's still an expensive
operation to enumerate all the reflinks for a given region of a
file or extent.
No, you are wrong.
In order to enumerate all the reflinks in a region, the defrag
needs to have another array, also kept in memory and in
sync with the filesystem. It is easiest to divide the disk into
regions of equal size, where each region is a few MB large. Let's
call this array the "regions-to-extents" array. This array doesn't
need to be associative; it is a plain array.
This in-memory array links regions of the disk to the extents that
lie in each region. The array is initialized when defrag starts.
This array makes the operation of finding all extents in a region
extremely fast.
That has two issues:
* That's going to be a _lot_ of memory. You still need to be able
to defragment big (dozens-plus TB) arrays without needing multiple
GB of RAM just for the defrag operation, otherwise it's not
realistically useful (remember, it was big arrays that had issues
with the old reflink-aware defrag too).
* You still have to populate the array in the first place. A sane
implementation wouldn't keep it in memory when defrag is
not running (no way is anybody going to tolerate even dozens of MB
of memory overhead for this), so you're not going to get around the
need to enumerate all the reflinks for a file at least once (during
startup, or when starting to process that file), so you're just
moving the overhead around instead of eliminating it.
Nope, I'm not just "moving the overhead around instead of eliminating
it"; I am eliminating it.
The only overhead is at defrag startup, when the entire b-tree
structure has to be loaded and examined. That takes a few seconds.
After this point, there is no more "overhead", because the running
defrag is always notified of any changes to the b-trees (by hooks in
the b-tree update routines). Whenever there is such a change, the
region-extents array gets updated. Since this region-extents array is
in-memory, the update is so fast that it can be considered zero
overhead.