Quoting Zygo Blaxell <ce3g8...@umail.furryterror.org>:

On Thu, Sep 12, 2019 at 08:26:04PM -0400, General Zed wrote:

Quoting Zygo Blaxell <ce3g8...@umail.furryterror.org>:

You mean: all metadata size is 156 GB on one of your systems. However, you
don't typically have to put ALL metadata in RAM.
You need just some parts needed for defrag operation. So, for defrag, what
you really need is just some large metadata cache present in RAM. I would
say that if such a metadata cache is using 128 MB (for 2 TB disk) to 2 GB
(for 156 GB disk), then the defrag will run sufficiently fast.

You're missing something (metadata requirement for delete?) in those
estimates.

Total metadata size does not affect how much metadata cache you need
to defragment one extent quickly.  That number is a product of factors
including the input-to-output extent size ratio, the ratio of various
metadata item sizes to the metadata page size, and the number of trees you
have to update (number of reflinks + 3 for extent, csum, and free space
trees).

It follows from the above that if you're joining just 2 unshared extents
together, the total metadata required is well under a MB.

If you're defragging a 128MB journal file with 32768 4K extents, it can
create several GB of new metadata and spill out of RAM cache (which is
currently capped at 512MB for assorted reasons).  Add reflinks and you
might need more cache, or take a performance hit.  Yes, a GB might be
the total size of all your metadata, but if you run defrag on a 128MB
log file you could rewrite all of your filesystem's metadata in a single
transaction (almost...you probably won't need to update the device or
uuid trees).
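
As a quick check of the figures in the example above, here is a minimal
Python sketch. It assumes "MB" and "K" are binary units (MiB and KiB); the
variable names are illustrative only and come from nowhere in btrfs.

```python
KIB = 1024
MIB = 1024 * KIB

file_size = 128 * MIB    # the journal file from the example
extent_size = 4 * KIB    # every extent is 4K
cache_cap = 512 * MIB    # RAM cache cap quoted above

# Number of extents the defrag has to process.
extent_count = file_size // extent_size

# Average new metadata per extent (in bytes) at which the cap is reached.
spill_threshold_per_extent = cache_cap // extent_count

print(extent_count)                # 32768
print(spill_threshold_per_extent)  # 16384
```

So the extent count matches, and the cache would spill once each 4K extent
accounts for about 16 KiB of new metadata on average.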

I can't see how that can happen. If you are defragmenting a single 128 MB
journal file, the metadata that points to it is certainly a small part of
the entire b-tree (because the tree is ordered). If that part of the b-tree
is to completely change, all the way up to super, the entire update of the
b-tree (written into new extents) can't be more than a tenth of the file
size (128 MB). So, there is no big overhead.
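
To put a rough number on that bound, here is a small Python sketch. The
sizes in it are assumptions, not figures from this thread: a 16 KiB node, a
101-byte node header, a 25-byte item header and a 53-byte file extent item.
It counts only the leaves holding the file's extent items in one tree; it
does not count the extent, csum or free space trees, or any reflinked
copies, so it covers just one of the factors listed earlier.

```python
import math

KIB = 1024
MIB = 1024 * KIB

# Assumed on-disk sizes (not taken from the thread).
NODE_SIZE = 16 * KIB
NODE_HEADER = 101
ITEM_HEADER = 25
FILE_EXTENT_ITEM = 53


def file_extent_leaf_bytes(extent_count):
    """Bytes of leaf nodes needed to hold extent_count file extent items."""
    per_item = ITEM_HEADER + FILE_EXTENT_ITEM
    items_per_leaf = (NODE_SIZE - NODE_HEADER) // per_item
    leaves = math.ceil(extent_count / items_per_leaf)
    return leaves * NODE_SIZE


file_size = 128 * MIB
extents = file_size // (4 * KIB)

leaf_bytes = file_extent_leaf_bytes(extents)
bound = file_size // 10  # "a tenth of the file size"

print(leaf_bytes, bound, leaf_bytes <= bound)
```

Under these assumptions the file's own extent items fit well inside the
one-tenth bound; whether the other trees and reflinks keep the total there
is the open question.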

