On Sun, Nov 29, 2015 at 6:31 AM, Duncan <1i5t5.dun...@cox.net> wrote:

> What are you using to tell you it has 1018391 extents?  If you're using
> filefrag, it's known not to understand btrfs compression, which uses 128
> KiB (pre-compression size, I believe, tho I'm not absolutely positive)
> blocks, and as a result, to report each of those blocks as a separate
> extent.
Indeed filefrag; running filefrag -k -v produces a 72M text file. Almost
all 'extents' show length: 128, but several are somewhat or much bigger.
From quick browsing and some random checks, the file seems mostly
contiguous (at the Linux block layer level, or at least from what
filefrag sees). It looks like it would not be difficult to change
filefrag so that it also reports the number of discontinuities;
something like the sketch below could do that counting from its
existing output. But maybe there are other 'misunderstandings' between
filefrag and btrfs, in which case we would need something dedicated.
A filefrag run takes a long time (minutes), even for just one,
admittedly big, file.
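As a rough sketch of that idea (it assumes the usual filefrag -v table
layout "ext: logical_offset: physical_offset: length: ..."; column
layout differs between e2fsprogs versions, so treat it as illustrative
only), a small post-processing script can count how often consecutive
extents are actually physically discontiguous:

#!/usr/bin/env python3
# Rough sketch: count physical discontinuities in `filefrag -v FILE` output.
# Assumes data rows of the form "N: logical..logical: physical..physical: length: ..."
import re
import sys

extents = []  # list of (physical_start, length), in whatever block unit filefrag used
for line in sys.stdin:
    m = re.match(r'\s*\d+:\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+)', line)
    if m:
        phys_start = int(m.group(3))
        length = int(m.group(5))
        extents.append((phys_start, length))

discontinuities = 0
for (prev_start, prev_len), (cur_start, _) in zip(extents, extents[1:]):
    if prev_start + prev_len != cur_start:
        discontinuities += 1

print(f"{len(extents)} extents, {discontinuities} physical discontinuities")

Run as e.g. `filefrag -k -v file | ./count_discontig.py`; on a
compressed file the first number will still be inflated by the
per-128KiB "extents", but the discontinuity count says more about the
real on-disk layout.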

> Because if you're using compress-force, filefrag will see each 128 KiB
> compression block as an extent, and 1018391 reported "extents" (actually
> compression blocks) should be ~ 125 GiB.
See also above;
the file is 176521162380 bytes ~= 164.4 GiB, and ls -lh reports 165G.
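Just to make the arithmetic behind that ~125 GiB figure explicit
(plain multiplication, nothing btrfs-specific):

# If every reported "extent" were a 128 KiB pre-compression block,
# 1018391 of them would cover about 124 GiB -- noticeably less than the
# file's 164.4 GiB, so some extents must be larger than 128 KiB.
extents = 1018391
block = 128 * 1024                      # bytes per compression block
print(extents * block / 2**30)          # ~124.3 (GiB)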

> AFAIK there's no easy "admin-level" way to check extent usage when btrfs
> compression is used on a file.  There's developer-level btrfs-debug
> output, but nothing admin-level or user-level at all.
I would be interested in tooling that gives more visibility into what
happens at the btrfs disk-block level, primarily for non-compression
use cases. If you know of any, please let us know.
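Not a full answer to that, but filefrag is itself just a FIEMAP
consumer, and calling FIEMAP directly exposes a bit more than filefrag
prints, e.g. the per-extent flags (btrfs sets FIEMAP_EXTENT_ENCODED on
compressed extents). A rough, minimally tested sketch, with struct
layouts as in <linux/fiemap.h> and error handling omitted:

#!/usr/bin/env python3
# Minimal FIEMAP caller: prints logical/physical offset, length and flags per extent.
# Sketch only, not production code.
import fcntl
import struct
import sys

FS_IOC_FIEMAP = 0xC020660B          # _IOWR('f', 11, struct fiemap)
FIEMAP_EXTENT_LAST = 0x00000001
FIEMAP_EXTENT_ENCODED = 0x00000008  # set by btrfs for compressed extents

HDR_FMT = "=QQLLLL"                 # fiemap header: start, length, flags,
                                    # mapped_extents, extent_count, reserved
EXTENT_FMT = "=QQQQQLLLL"           # fiemap_extent: logical, physical, length,
                                    # 2x reserved64, flags, 3x reserved32
HDR_SIZE = struct.calcsize(HDR_FMT)
EXTENT_SIZE = struct.calcsize(EXTENT_FMT)

def fiemap(path, batch=256):
    """Yield (logical, physical, length, flags) for each extent of `path`."""
    with open(path, "rb") as f:
        start = 0
        while True:
            hdr = struct.pack(HDR_FMT, start, 2**64 - 1 - start, 0, 0, batch, 0)
            buf = bytearray(hdr + b"\0" * (batch * EXTENT_SIZE))
            fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)
            mapped = struct.unpack_from(HDR_FMT, buf)[3]
            if mapped == 0:
                return
            last = False
            for i in range(mapped):
                logical, physical, length, _, _, flags, _, _, _ = \
                    struct.unpack_from(EXTENT_FMT, buf, HDR_SIZE + i * EXTENT_SIZE)
                yield logical, physical, length, flags
                last = bool(flags & FIEMAP_EXTENT_LAST)
            if last:
                return
            start = logical + length    # continue after the last extent seen

if __name__ == "__main__":
    for logical, physical, length, flags in fiemap(sys.argv[1]):
        tag = " compressed" if flags & FIEMAP_EXTENT_ENCODED else ""
        print(f"log {logical:>15} phys {physical:>15} len {length:>10} "
              f"flags {flags:#06x}{tag}")

Comparing its output against filefrag on the same file is a quick
sanity check; on a compressed file most extents should show the
"compressed" (encoded) flag.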

> And while compress-force won't change the reported "extents" that are
> actually compression-blocks if the file is actually compressed, just
> compress by itself may or may not actually compress the file (there's an
> algorithm used, from what the devs have said, basically it checks whether
> the first block or two compress well, and assumes the rest of the file
> will be similar, compressing or not based on the result of that attempt),
This method is probably a good one for live data being written, but
for static content (my data files) it turned out not to be good
enough, i.e. the total space gained in the terabyte range was not what
I expected/wanted.
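For static files it is at least easy to estimate compressibility
offline before deciding on compress vs. compress-force. The sketch
below is not the kernel's heuristic, just a stand-in for the same idea
using zlib on a few 128 KiB samples; the sample count, compression
level and the 0.9 threshold are arbitrary choices of mine:

#!/usr/bin/env python3
# Offline compressibility estimate: compress a few 128 KiB chunks with zlib and
# report the overall ratio. This only mimics the *idea* of the heuristic
# described above (judge a file from a small sample); it is not the kernel's code.
import sys
import zlib

BLOCK = 128 * 1024        # btrfs compresses in 128 KiB pre-compression chunks

def sample_ratio(path, samples=4):
    """Compress the first `samples` blocks; return compressed/raw size ratio."""
    raw = comp = 0
    with open(path, "rb") as f:
        # For static data one could just as well seek around and sample blocks
        # spread across the file instead of only the first ones.
        for _ in range(samples):
            chunk = f.read(BLOCK)
            if not chunk:
                break
            raw += len(chunk)
            comp += len(zlib.compress(chunk, 3))
    return comp / raw if raw else 1.0

if __name__ == "__main__":
    for path in sys.argv[1:]:
        r = sample_ratio(path)
        verdict = "worth compressing" if r < 0.9 else "probably not"
        print(f"{path}: sampled ratio {r:.2f} -> {verdict}")

Sampling blocks spread across the file rather than only the first ones
would also avoid the "first blocks look incompressible, the rest would
have compressed fine" failure mode.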

> so it's quite possible you'll get better "extent" numbers if the file
> isn't actually compressed, in which case filefrag actually gets things
> right and reports real extent numbers, vs the number of compression
> blocks if the file is compressed.
I think you mean the case where the file's data blocks are not
re-compressed by btrfs before being stored on disk, but are left as
the unprocessed source data. Indeed, when I first gzip the file
myself, it is then 75G and filefrag reports 8613 extents, most of them
512 KiB.