On 2015-09-16 07:35, Brendan Heading wrote:
Btrfs has two possible solutions to work around the problem.  The first
one is the autodefrag mount option, which detects file fragmentation
during the write and queues up the affected file for a defragmenting
rewrite by a lower priority worker thread.  This works best for small
files, because as file size increases, so does the time needed to
rewrite it, and at some point, depending on the size of the file and how
busy the database/VM is, writes come in faster than the file can be
rewritten.  Typically, there's no problem under a quarter GiB, with
people beginning to notice performance issues at half to 3/4 GiB,
though on fast disks and not too busy VMs/DBs (which may well include
your home system, depending on what you use the VMs for), you might not
see problems until size reaches 2 GiB or so.  As such, autodefrag tends to be
a very good option for firefox sqlite database files, for instance, as
they tend to be small enough not to have issues.  But it's not going to
work so well for multi-GiB VM images.

[unlurking for the first time]

This problem has been faced by a certain very large storage vendor
whom I won't name, who provide an option similar to the above. Reading
between the lines I think their approach is to try to detect which
accesses are read-sequential, and schedule those blocks for rewriting
in sequence. They also have a feature to run as a background job which
can be scheduled to run during an off peak period where they can
reorder entire files that are significantly out of sequence. I'd
expect the algorithm is intelligent, i.e. there's no need to rewrite
entire large files that are mostly sequential with a few out-of-order
sections.

Has anyone considered these options for btrfs? Not being able to run
VMs on it is probably going to be a bit of a killer.

3 things to mention here:
1. It's perfectly possible to run VMs on BTRFS; it just takes some effort to get decent efficiency, and you can't really over-provision storage. (The effort in question is creating the file with NOCOW set, and then using fallocate or dd to pre-allocate space for it.)
2. If you are using a file for the disk image, you are already sacrificing performance for portability; it's just a bigger tradeoff with BTRFS than with most other filesystems on Linux.
3. Almost all of the issues that BTRFS has with VM disk images are also present in other filesystems; they are just much worse on BTRFS because it is COW based.
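
For anyone who wants to script the NOCOW-plus-preallocation step from point 1, here is a rough Python sketch of one way it might be done. It is not taken from any BTRFS tooling. The function names (create_vm_image, _try_set_nocow, _write_zeros) are made up for illustration, and the ioctl numbers are the 64-bit Linux values of FS_IOC_GETFLAGS/FS_IOC_SETFLAGS as I understand them, so treat them as assumptions and check them against your kernel headers. The flag is set while the file is still empty, since NOCOW is only expected to take effect on new or empty files.

```python
import fcntl
import os
import struct

# Assumed values for 64-bit Linux (from linux/fs.h); verify for your platform.
FS_IOC_GETFLAGS = 0x80086601
FS_IOC_SETFLAGS = 0x40086602
FS_NOCOW_FL = 0x00800000  # the attribute that "chattr +C" sets


def _try_set_nocow(fd):
    """Try to set NOCOW on an open, still-empty file.

    Returns True if the ioctl calls were accepted, False if the
    filesystem rejected them (for example, a non-BTRFS filesystem).
    """
    try:
        raw = fcntl.ioctl(fd, FS_IOC_GETFLAGS, struct.pack("i", 0))
        flags = struct.unpack("i", raw)[0]
        fcntl.ioctl(fd, FS_IOC_SETFLAGS, struct.pack("i", flags | FS_NOCOW_FL))
        return True
    except OSError:
        return False


def _write_zeros(fd, size_bytes, chunk=1 << 20):
    """dd-style fallback: fill the file with zeros from offset 0."""
    os.lseek(fd, 0, os.SEEK_SET)
    block = bytes(chunk)
    remaining = size_bytes
    while remaining > 0:
        written = os.write(fd, block[:min(chunk, remaining)])
        remaining -= written


def create_vm_image(path, size_bytes):
    """Create a new image file, request NOCOW, then pre-allocate it.

    Hypothetical helper: refuses to overwrite an existing file, because
    the flag has to be applied before any data is written.
    """
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)
    try:
        nocow = _try_set_nocow(fd)
        try:
            os.posix_fallocate(fd, 0, size_bytes)
        except OSError:
            # Some filesystems/libcs cannot fallocate; write zeros instead.
            _write_zeros(fd, size_bytes)
    finally:
        os.close(fd)
    return nocow
```

Something like create_vm_image("/path/to/disk.img", 20 * 1024**3) would then give a fully allocated 20 GiB image. A False return means the NOCOW request was refused, which is what I would expect on anything other than BTRFS. Setting the attribute on the containing directory, so that new files inherit it, is the other common approach.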
