On Sun, Nov 29, 2015 at 6:31 AM, Duncan <1i5t5.dun...@cox.net> wrote:
> What are you using to tell you it has 1018391 extents? If you're using
> filefrag, it's known not to understand btrfs compression, which uses 128
> KiB (pre-compression size, I believe, tho I'm not absolutely positive)
> blocks, and as a result, to report each of those blocks as a separate
> extent.

Indeed, it is filefrag. A filefrag -k -v run results in a 72M text file;
almost all 'extents' show length:128, but several are a bit bigger or much
bigger. From quick browsing and random checks, the file seems mostly
contiguous (at the Linux block layer level, or at least as far as filefrag
can see), so it looks like it would not be difficult to change filefrag so
that it also reports the number of discontinuities (rough sketch at the end
of this mail). But maybe there are other 'misunderstandings' between
filefrag and btrfs, in which case we would need something dedicated. Also,
a filefrag run takes a long time (minutes), even for just this one,
admittedly big, file.

> Because if you're using compress-force, filefrag will see each 128 KiB
> compression block as an extent, and 1018391 reported "extents" (actually
> compression blocks) should be ~ 125 GiB.

See also above; the file is 176521162380 bytes ~= 164.4 GiB, and ls -lh
reports 165G (quick arithmetic check at the end of this mail).

> AFAIK there's no easy "admin-level" way to check extent usage when btrfs
> compression is used on a file. There's developer-level btrfs-debug
> output, but nothing admin-level or user-level at all.

I would be interested in tooling that gives more visibility into what
happens at the btrfs disk-block level, primarily for non-compression use
cases. If you know of any, please let us know.

> And while compress-force won't change the reported "extents" that are
> actually compression-blocks if the file is actually compressed, just
> compress by itself may or may not actually compress the file (there's an
> algorithm used, from what the devs have said, basically it checks whether
> the first block or two compress well, and assumes the rest of the file
> will be similar, compressing or not based on the result of that attempt),

That heuristic is probably fine for live data being written, but for static
content (my data files) it turned out not to be good enough, i.e. the total
space gained in the terabyte range was not what I expected/wanted. (An
illustrative sketch of this kind of heuristic is at the end of this mail.)

> so it's quite possible you'll get better "extent" numbers if the file
> isn't actually compressed, in which case filefrag actually gets things
> right and reports real extent numbers, vs the number of compression
> blocks if the file is compressed.

I think you mean the case where the destination blocks of the file are not
re-compressed before being stored on disk, but are left as the unprocessed
source data blocks. Indeed, when I first gzip the file myself, it is then
75G and filefrag reports 8613 extents, most of them 512 KiB.
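Here is a rough sketch (untested, and it assumes the current "ext:
logical_offset: physical_offset: length: ..." layout of filefrag -v
output) of the discontinuity counting I mentioned above, done as a small
post-processing script rather than a filefrag patch:

#!/usr/bin/env python3
# frag_gaps.py -- rough sketch: count physical discontinuities in
# 'filefrag -v' output instead of raw extent records, so that a
# btrfs-compressed file (where each ~128 KiB compression block shows up
# as its own "extent") still gets a meaningful fragmentation number.
#
# Usage: filefrag -v /path/to/bigfile | python3 frag_gaps.py
import sys

prev_end = None   # physical block just past the previous extent
extents = 0
gaps = 0

for line in sys.stdin:
    # Data lines look like:
    #  "  1:    256..   511:  200000.. 200255:    256: ..."
    fields = line.split(':')
    if len(fields) < 4 or not fields[0].strip().isdigit():
        continue  # header, summary or blank line
    phys_start = int(fields[2].split('..')[0])
    length = int(fields[3])
    extents += 1
    if prev_end is not None and phys_start != prev_end:
        gaps += 1
    prev_end = phys_start + length

print("%d extent records, %d physical discontinuities" % (extents, gaps))

For a file like mine that looks mostly contiguous, that should report a
low discontinuity count even though filefrag itself reports over a
million "extents".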
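And the quick arithmetic check for the sizes mentioned above (just
restating numbers already in this thread; run with python3 so the
divisions give fractions):

extents = 1018391
print(extents * 128 * 1024 / 2**30)   # ~124.3 GiB if every extent were 128 KiB
print(176521162380 / 2**30)           # ~164.4 GiB, the actual file size

So the reported extent count times 128 KiB falls roughly 40 GiB short of
the real file size, which matches the "several a bit bigger or much
bigger" extents I mentioned.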
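Finally, to make clearer what I meant about the compress (as opposed to
compress-force) behaviour not being good enough for my static data:
below is an illustrative sketch of that kind of "sample the start and
decide for the whole thing" heuristic. This is NOT the actual btrfs
kernel code; the 128 KiB sample size, the 0.9 ratio and the function
name are just assumptions for illustration.

import zlib

SAMPLE = 128 * 1024   # one compression-block worth of data (assumption)

def looks_compressible(path, ratio=0.9):
    # Decide for the whole file based only on how well its first chunk
    # compresses -- the kind of heuristic described in the quoted text.
    with open(path, 'rb') as f:
        chunk = f.read(SAMPLE)
    if not chunk:
        return False
    return len(zlib.compress(chunk)) < ratio * len(chunk)

With plain compress, data whose first chunk happens to be incompressible
would be stored uncompressed even if the rest compresses well, which is
the kind of miss I suspect I was seeing on my static files.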