Mitch Harder posted on Fri, 06 Jun 2014 14:06:53 -0500 as excerpted:

> Every time you update your database, btrfs is going to update whichever
> 128 KiB blocks need to be modified.
>
> Even for a tiny modification, the new compressed block may be slightly
> more or slightly less than 128 KiB.

FWIW, I believe that's 128 KiB pre-compression. And at least without
compress-force, btrfs will try the compression and, if the compressed
size is larger than the uncompressed size, it simply won't compress that
block. So 128 KiB is the largest amount of space that 128 KiB of data
could take with compression on, but it can be half that or less if the
compression happens to be good for that 128 KiB block.

> If you have a 1-2 GB database that is being updated with any frequency,
> you can see how you will quickly end up with lots of metadata
> fragmentation as well as inefficient data block utilization.

> I think this will be the case even if you switch to NOCOW due to the
> compression.

That is one reason that, as I said, NOCOW turns off compression.
Compression simply doesn't work well with in-place updates, because as
you point out, the updated data may compress more or less well than the
original, and a larger result won't fit in the original's space. So
NOCOW turns off compression to avoid the problem.

If it's COW (that is, not NOCOW), then the COW-based out-of-place updates
avoid the problem of fitting more data in the same space, because the
new write can take more space in the new location if it has to.

But you are correct that compression and large, frequently updated
databases don't play well together either. Which is why turning off
compression when turning off COW isn't the big problem it would first
appear to be -- as it happens, the very same files where COW doesn't work
well are also the ones where compression doesn't work well.

Similarly for checksumming. When there are enough updates, in addition
to taking more time to calculate and write, checksumming simply invites
race conditions between the last then-valid checksum and the next update
invalidating it.
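
To make the compression points above concrete, here is a rough Python
sketch. It is only a toy model, not btrfs code: zlib stands in for
whichever compressor is in use, the 128 KiB block size is taken from the
discussion above, and the names stored_size() and fits_in_place() are
made up for illustration. It shows the "don't compress if it doesn't
help" fallback, and why an update can fail to fit in the space its
predecessor occupied.

```python
import os
import zlib

# 128 KiB of uncompressed data per block, as discussed in the thread.
BLOCK = 128 * 1024


def stored_size(block: bytes) -> int:
    """Bytes needed to store one block in this toy model.

    Try compressing; if the result is not smaller than the input,
    keep the block uncompressed (the fallback described above).
    """
    compressed = zlib.compress(block)
    if len(compressed) >= len(block):
        return len(block)
    return len(compressed)


def fits_in_place(old: bytes, new: bytes) -> bool:
    """Would the new block fit in the space the old one occupied?"""
    return stored_size(new) <= stored_size(old)


# A highly compressible block, then an update that barely compresses.
original = b"A" * BLOCK
updated = os.urandom(BLOCK)

print("original stored bytes:", stored_size(original))
print("updated stored bytes: ", stored_size(updated))
print("update fits in place: ", fits_in_place(original, updated))
```

With COW the larger result is simply written somewhere else, so the
"fits in place" question never comes up. With in-place updates it has
to be answered on every write, which is the problem described above.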
In addition, in many, perhaps most cases, the sorts of apps that do
constant internal updates have already evolved their own data integrity
verification methods in order to cope with issues on the (after all, far
more common) unverified filesystems, creating even more possible race
conditions and timing issues and making all that extra work that btrfs
normally does for verification unnecessary. Trying to do all that
in-place due to NOCOW is a recipe for failure or insanity, if not both.

So when turning off COW, just turning off checksumming/verification and
compression along with it makes the most sense, and that's what btrfs
does. To do otherwise is just asking for trouble, which is why you very
rarely see in-place-update-by-default filesystems offering either
transparent compression or data verification as features.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman