On 2015-12-02 08:53, Tomasz Chmielewski wrote:
> On 2015-12-02 22:03, Austin S Hemmelgarn wrote:
>>> From these numbers (124 GB used where data size is 153 GB), it appears that we save around 20% with zlib compression enabled. Is 20% a reasonable saving for zlib? Typically text compresses much better with that algorithm, although I understand that we have several limitations when applying that on a filesystem level.
>>
>> This is actually an excellent question. A couple of things to note before I share what I've seen:
>>
>> 1. Text compresses better with any compression algorithm. It is by nature highly patterned and moderately redundant data, which is what benefits the most from compression.
>
> It looks like compress=zlib does not compress very well. Following Duncan's suggestion, I've changed it to compress-force=zlib and re-copied the data to make sure the files are compressed.

For future reference, if you run 'btrfs filesystem defrag -r -czlib' on the top level directory, you can achieve the same effect without having to deal with the copy overhead. This has a side effect of breaking reflinks, but copying the files off and back onto the filesystem does so also, and even then, I doubt that you're using reflinks. There probably wouldn't be much difference in the time it takes, but at least you wouldn't be hitting another disk in the process.
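
As a rough sketch of the in-place approach built around the 'btrfs filesystem defrag -r -czlib' suggestion in this message: the mount point below is the one shown in the df output elsewhere in this thread, the remount step is an assumption about how the switch to compress-force=zlib might be applied, and the `run` helper is a hypothetical dry-run guard that only prints the commands unless RUN=1 is set.

```shell
# Mount point taken from the df output quoted in this thread.
MNT=/var/log/remote

# Hypothetical helper: print each command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Assumption: switch the mounted filesystem to forced zlib compression
# so that newly written data is compressed.
run mount -o remount,compress-force=zlib "$MNT"

# Rewrite existing files in place with zlib compression.
# Note: this breaks reflinks, as mentioned in the message.
run btrfs filesystem defrag -r -czlib "$MNT"
```

Running it without RUN=1 only lists what would be executed, which is a cautious way to check the commands before touching a live filesystem.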

> Compression ratio is much, much better now (on a slightly changed data set):
>
> # df -h
> /dev/xvdb       200G   24G  176G  12% /var/log/remote
>
> # du -sh /var/log/remote/
> 138G    /var/log/remote/
>
> So, 138 GB of files use just 24 GB on disk - nice!
>
> However, I would still expect compress=zlib to have almost the same effect as compress-force=zlib for 100% text files/logs.

That's better than 80% space savings (it works out to about 82.6%), so I doubt that you'd manage to get anything better than that even with only plain text files. It's interesting that there's such a big discrepancy, though; it indicates that BTRFS really needs some work WRT deciding what to compress.
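
For anyone wanting to check the arithmetic, here is a small sketch that computes the space saving from the rounded figures quoted in this thread (124 GB on disk for 153 GB of data with compress=zlib, and 24 GB on disk for 138 GB of data with compress-force=zlib). The function name `savings_pct` is made up for illustration, and because df -h and du -sh round their output, the results are only approximate.

```shell
# Percentage of space saved: 100 * (1 - on_disk / apparent_size).
# Both arguments must be in the same unit (GB here).
savings_pct() {
    LC_ALL=C awk -v used="$1" -v total="$2" \
        'BEGIN { printf "%.1f\n", 100 * (1 - used / total) }'
}

savings_pct 124 153   # compress=zlib figures from the thread
savings_pct 24 138    # compress-force=zlib figures from the thread
```

With these inputs the first call gives roughly 19% and the second roughly 82.6%, which matches the "around 20%" and "better than 80%" estimates in the thread.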