On 2015-09-16 07:35, Brendan Heading wrote:
Btrfs has two possible solutions to work around the problem. The first one is the autodefrag mount option, which detects file fragmentation during the write and queues up the affected file for a defragmenting rewrite by a lower priority worker thread. This works best on the small end, because as file size increases, so does the time to actually write it out, and at some point, depending on the size of the file and how busy the database/VM is, writes start coming in faster than the file can be rewritten. Typically, there's no problem under a quarter GiB, with people beginning to notice performance issues at half to 3/4 GiB, though on fast disks and not too busy VMs/DBs (which may well include your home system, depending on what you use the VMs for), you might not see problems until size reaches 2 GiB or so. As such, autodefrag tends to be a very good option for firefox sqlite database files, for instance, as they tend to be small enough not to have issues. But it's not going to work so well for multi-GiB VM images.

[unlurking for the first time] This problem has been faced by a certain very large storage vendor whom I won't name, who provide an option similar to the above. Reading between the lines, I think their approach is to try to detect which accesses are read-sequential, and schedule those blocks for rewriting in sequence. They also have a feature that runs as a background job, which can be scheduled for an off-peak period, where they can reorder entire files that are significantly out of sequence. I'd expect the algorithm is intelligent, i.e. there's no need to rewrite entire large files that are mostly sequential with a few out-of-order sections. Has anyone considered these options for btrfs? Not being able to run VMs on it is probably going to be a bit of a killer.
3 things to mention here:

1. It's perfectly possible to run VMs on BTRFS; it just takes some effort to get decent efficiency, and you can't really over-provision storage (the above-mentioned effort is to create the file with NOCOW set, and then use fallocate or dd to pre-allocate space for it).
2. If you are using a file for the disk image, you are already sacrificing performance for portability; it's just a bigger tradeoff with BTRFS than with most other filesystems on Linux.
3. Almost all of the issues that BTRFS has with VM disk images are also present in other filesystems; they are just much worse on BTRFS because it is COW based.
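To make the effort in point 1 concrete, here is a rough sketch of the "create with NOCOW, then pre-allocate" sequence. It is only an illustration under a few assumptions: the helper name create_nocow_image is made up for this example, it shells out to chattr +C (so e2fsprogs has to be installed), and it uses posix_fallocate rather than dd. The flag is set while the file is still empty, since as far as I know NOCOW is only reliable on BTRFS for files that have no data yet.

```python
import os
import subprocess


def create_nocow_image(path: str, size_bytes: int) -> bool:
    """Create an empty file, try to mark it NOCOW, then preallocate it.

    Hypothetical helper for illustration. Returns True if `chattr +C`
    reported success, False otherwise. A True result does not by itself
    prove that the underlying filesystem honors the flag.
    """
    # O_EXCL: refuse to touch an existing file, because the flag should
    # be set before any data is written.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    try:
        try:
            # Set the NOCOW attribute while the file is still empty.
            result = subprocess.run(["chattr", "+C", path], capture_output=True)
            nocow_ok = result.returncode == 0
        except FileNotFoundError:
            # chattr is not installed; carry on without the flag.
            nocow_ok = False
        # Reserve the full size up front, so the image is not sparse
        # (this is why over-provisioning is not really possible).
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)
    return nocow_ok
```

Something like create_nocow_image("/path/to/disk.img", 20 * 1024**3) would then give a 20 GiB image file (the path is a placeholder). Setting +C on the containing directory instead, so that new files inherit it, is another common way to do the same thing.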