On 2018-05-28 13:10, ein wrote:
On 05/23/2018 01:03 PM, Austin S. Hemmelgarn wrote:
On 2018-05-23 06:09, ein wrote:
On 05/23/2018 11:09 AM, Duncan wrote:
ein posted on Wed, 23 May 2018 10:03:52 +0200 as excerpted:

IMHO the best course of action would be to disable checksumming for
your VM files.

Do you mean the '-o nodatasum' mount flag? Is it possible to disable
checksumming for a single file by setting some magical chattr? Google
thinks it's not possible to disable csums for a single file.

You can use nocow (chattr +C), but of course that has other restrictions:
it has to be set on files while they're zero-length (for existing data,
the easiest way is to set it on the containing dir and copy the files in
without reflink), and it comes with the usual nocow effects. Also, nocow
becomes cow1 after a snapshot: the snapshot locks the existing copy in
place, so changes written to a block /must/ be written elsewhere. Thus
cow1, aka cow the first time a block is written after the snapshot, but
retain nocow for repeated writes between snapshots.

But if you're disabling checksumming anyway, nocow's likely the way
to go.

Disabling only checksumming may be the way to go; we live without it
every day. But nocow on VM files defeats the whole purpose of using BTRFS
for me, even with its huge performance penalty. I use it for backups,
meaning a few snapshots (20-30) plus send & receive.

Setting NOCOW on a file doesn't prevent it from being snapshotted, it
just prevents COW operations from happening under most normal
circumstances.  In essence, it prevents COW from happening except for
writing right after the snapshot.  More specifically, the first write to
a given block in a file set for NOCOW after taking a snapshot will
trigger a _single_ COW operation for _only_ that block (unless you have
autodefrag enabled too), after which that block will revert to not doing
COW operations on write.  This way, you still get consistent and working
snapshots, but you also don't take the performance hits from COW except
right after taking a snapshot.
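
To make the "cow1" behaviour above concrete, here is a toy model in Python. It is only a sketch of the rule as described, not how BTRFS tracks extents internally: the class name `NocowFile`, the integer "extent IDs", and the `cow_count` counter are all made up for illustration, and autodefrag is ignored.

```python
class NocowFile:
    """Toy model of a NOCOW file: each logical block maps to an extent ID."""

    def __init__(self, nblocks):
        self.blocks = list(range(nblocks))  # logical block -> extent ID
        self.next_extent = nblocks          # next unused extent ID
        self.shared = set()                 # extents pinned by some snapshot
        self.cow_count = 0                  # how many writes had to COW

    def snapshot(self):
        # A snapshot references every current extent, pinning it in place.
        self.shared.update(self.blocks)
        return list(self.blocks)            # the snapshot's frozen view

    def write(self, block):
        if self.blocks[block] in self.shared:
            # First write after a snapshot: the data must go elsewhere.
            self.blocks[block] = self.next_extent
            self.next_extent += 1
            self.cow_count += 1
        # Otherwise the write lands in place: no new extent, no COW.


f = NocowFile(4)
f.write(0)           # in place
snap = f.snapshot()
f.write(0)           # one COW, block 0 moves to a new extent
f.write(0)           # in place again
print(f.cow_count)   # prints 1
print(snap[0], f.blocks[0])  # prints "0 4": the snapshot keeps the old extent
```

The point of the model is that the COW cost is paid at most once per block per snapshot, while the snapshot's view stays intact.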

Yeah, just after I posted it, I found a post by Duncan from 2015
explaining it. Thank you anyway.

Is there anything we can do better to speed BTRFS up under a
random/write VM workload, and if so, why does it help?

My settings:

<disk type='file' device='disk'>
       <driver name='qemu' type='raw' cache='none' io='native'/>
       <source file='/var/lib/libvirt/images/db.raw'/>
       <target dev='vda' bus='virtio'/>
       [...]
</disk>

/dev/mapper/raid10-images on /var/lib/libvirt type btrfs
(rw,noatime,nodiratime,compress=lzo:3,ssd,space_cache,autodefrag,subvolid=5,subvol=/)

md1 : active raid10 sdc1[2] sdb2[1] sdd1[3] sda2[0]
       468596736 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
       bitmap: 0/4 pages [0KB], 65536KB chunk

CPU: E3-1246 with: VT-x, VT-d, HT, EPT, TSX-NI, AES-NI on debian's
kernel 4.15.0-0.bpo.2-amd64

As far as I understand, compress and autodefrag have a negative impact
on performance (latency), especially autodefrag. I also think that
nodatacow should speed things up and that it's a must when using qemu on
BTRFS. Is it better to use virtio or virtio-scsi with TRIM support?

FWIW, I've been doing just fine without nodatacow, but I also use raw images contained in sparse files, and keep autodefrag off for the dedicated filesystem I put the images on.

Compression shouldn't have much of a negative impact unless you're also using transparent compression (or disk or file encryption) inside the VM. In fact, it may speed things up significantly depending on what filesystem the guest OS uses (the ext4 inode table in particular seems to compress very well). If you are using `nodatacow` though, you can just turn compression off, as it's not going to be used anyway. If you want to keep using compression, then I'd suggest using `compress-force` instead of `compress`, which makes BTRFS a bit more aggressive about trying to compress things, but makes the performance much more deterministic. You may also want to look into using `zstd` instead of `lzo` for the compression; it gets better ratios most of the time, and usually performs better than `lzo` does.

Autodefrag should probably be off. If you have nodatacow set (or just have all the files marked with the NOCOW attribute), then there's not really any point in having autodefrag on. If, like me, you aren't turning off COW for data, it's still a good idea to have it off and just do batch defragmentation at a regularly scheduled time.

For the VM settings, everything looks fine to me (though if you have somewhat slow storage and aren't giving the VMs lots of memory to work with, doing write-through caching might be helpful). I would probably be using virtio-scsi for the TRIM support, as with raw images you will get holes in the file where the TRIM command was issued, which can actually improve performance and does improve storage utilization (though doing batch trims instead of using the `discard` mount option is better for performance if you have Linux guests).
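
A rough way to see whether TRIM is actually punching holes in a raw image is to compare the file's apparent size with the space allocated to it on the host. The sketch below does that in Python; `image_usage` is a made-up helper name, the demo runs on a throwaway sparse file rather than a real image, and it assumes Linux, where `st_blocks` is counted in 512-byte units. With BTRFS compression enabled the allocated figure may not match what the filesystem tools report, so treat it as an indication only.

```python
import os
import tempfile


def image_usage(path):
    """Return (apparent_bytes, allocated_bytes) for a file."""
    st = os.stat(path)
    # On Linux, st_blocks counts 512-byte units regardless of the
    # filesystem's own block size.
    return st.st_size, st.st_blocks * 512


# Demo on a temporary sparse file: extend it without writing any data.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.truncate(64 * 1024 * 1024)
    tmp.flush()
    apparent, allocated = image_usage(tmp.name)
    print(f"apparent={apparent} allocated={allocated}")
```

Pointed at an image such as the db.raw file from the settings above, a growing gap between the two numbers after the guest trims would suggest the holes are being created.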

You're using an MD RAID10 array. This is generally the fastest option in terms of performance, but it also means you can't take advantage of BTRFS' self-repairing ability very well, and you may be wasting space and some performance (because you probably have the 'dup' profile set for metadata). If it's an option, I'd suggest converting this to a BTRFS raid1 volume on top of two MD RAID0 volumes, which should get the same or slightly better performance, will avoid wasting space storing metadata, and will also let you take advantage of the self-repair functionality in BTRFS.
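
The space argument above comes down to simple multiplication: the number of physical copies is the number of copies BTRFS writes times the number of copies MD keeps underneath. A small Python sketch of that arithmetic follows. It assumes the current filesystem uses the 'single' profile for data and 'dup' for metadata (the guess made above, not something verified), and the layout labels are made up for illustration.

```python
# Copies written by BTRFS itself, per profile.
BTRFS_COPIES = {"single": 1, "dup": 2, "raid1": 2}
# Copies kept by the MD layer underneath (labels are illustrative).
MD_COPIES = {"raid0": 1, "raid10-near2": 2}


def physical_copies(btrfs_profile, md_layout):
    """Total on-disk copies of one block through both layers."""
    return BTRFS_COPIES[btrfs_profile] * MD_COPIES[md_layout]


layouts = {
    "current (BTRFS on MD RAID10)": {
        "data": ("single", "raid10-near2"),    # assumed data profile
        "metadata": ("dup", "raid10-near2"),   # assumed metadata profile
    },
    "suggested (BTRFS raid1 on 2x MD RAID0)": {
        "data": ("raid1", "raid0"),
        "metadata": ("raid1", "raid0"),
    },
}

for name, kinds in layouts.items():
    for kind, (profile, md) in kinds.items():
        print(f"{name}: {kind} stored {physical_copies(profile, md)}x")
```

Under those assumptions metadata is stored four times in the current layout and twice in the suggested one, while data is stored twice in both. The difference for self-repair is that in the suggested layout both data copies are visible to BTRFS, so a block that fails its checksum can be rewritten from the other copy.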

You should probably switch the `ssd` mount option to `nossd` (and then run a full recursive defrag on the volume, as this option affects the allocation policy, so the changes only take effect for new allocations). The SSD allocator can actually pretty significantly hurt performance in many cases, and has at best very limited benefits for device lifetimes (you'll maybe get another few months out of a device that will last for ten years without issue). Make a point of testing this though: because you're on a RAID array, the `ssd` option may actually be improving performance slightly.
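
Putting the mount-option suggestions together (assuming you keep COW and compression on), the sketch below rewrites an option string accordingly: `compress` becomes `compress-force=zstd`, `autodefrag` is dropped, and `ssd` becomes `nossd`. The function name `suggested_options` is made up for illustration, and the result is only a starting point to benchmark, not a tested configuration.

```python
def suggested_options(current):
    """Apply the suggestions above to a comma-separated mount option string."""
    result = []
    for opt in current.split(","):
        name = opt.split("=", 1)[0]
        if name in ("compress", "compress-force"):
            # More deterministic behaviour, usually better ratios than lzo.
            result.append("compress-force=zstd")
        elif name == "autodefrag":
            # Off by default, so dropping the option is enough.
            continue
        elif name == "ssd":
            result.append("nossd")
        else:
            result.append(opt)
    return ",".join(result)


current = ("rw,noatime,nodiratime,compress=lzo:3,ssd,space_cache,"
           "autodefrag,subvolid=5,subvol=/")
print(suggested_options(current))
# prints rw,noatime,nodiratime,compress-force=zstd,nossd,space_cache,subvolid=5,subvol=/
```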