On Mon 17 Aug 2020 12:10:19 PM CEST, Kevin Wolf wrote:
>> Since commit c8bb23cbdbe / QEMU 4.1.0 (and if the storage backend
>> allows it) writing to an image created with preallocation=metadata
>> can be slower (20% in my tests) than writing to an image with no
>> preallocation at all.
>
> A while ago we had a case where commit c8bb23cbdbe was actually
> reported as a major performance regression, so it's a big "it
> depends".
>
> XFS people told me that they consider this code a bad idea. Just
> because it's a specialised "write zeroes" operation, it's not
> necessarily fast on filesystems. In particular, on XFS, ZERO_RANGE
> causes a queue drain with O_DIRECT (probably hurts cases with high
> queue depths) and additionally even a page cache flush without
> O_DIRECT.
>
> So in a way this whole thing is a two-edged sword.

I see... on ext4 the improvements are clearly visible. Are we not
detecting this for xfs? We do have an s->is_xfs flag.

>> a) shall we include a warning in the documentation ("note that this
>> preallocation mode can result in worse performance")?
>
> To be honest, I don't really understand this case yet. With metadata
> preallocation, the clusters are already marked as allocated, so why
> would handle_alloc_space() even be called? We're not allocating new
> clusters after all?

It's not called. What happens is what you say below:

> Or are you saying that ZERO_RANGE + pwrite on a sparse file (= cluster
> allocation) is faster for you than just the pwrite alone (= writing to
> already allocated cluster)?

Yes, 20% faster in my tests (4KB random writes), but in the latter case
the cluster is already allocated only at the qcow2 level, not on the
filesystem. preallocation=falloc is faster than preallocation=metadata
(preallocation=off sits in the middle).

>> b) why don't we also initialize preallocated clusters with
>> QCOW_OFLAG_ZERO? (at least when there's no subclusters involved,
>> i.e. no backing file). This would make reading from them (and
>> writing to them, after this patch) faster.
>
> Because the idea with metadata preallocation is that you don't have to
> perform any COW and update any metadata because everything is already
> allocated. If you set the zero flag, you get cluster allocations with
> COW again, defeating the whole purpose of the preallocation.

Fair enough.

Berto
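
P.S. To illustrate the "allocated at the qcow2 level, not on the
filesystem" point, here is a minimal sketch in Python rather than QEMU
code. It assumes that the data area of a preallocation=metadata image
behaves like a file that was only extended to its final length (sparse),
and that preallocation=falloc corresponds to posix_fallocate(). The file
names, the 1 MiB size and the helper functions are made up for the
example, and the reported numbers depend on the filesystem.

```python
import os
import tempfile

SIZE = 1024 * 1024  # 1 MiB, an arbitrary size chosen for this demo


def allocated_bytes(path):
    """Bytes the filesystem has actually reserved for the file."""
    # st_blocks is counted in 512-byte units on Linux.
    return os.stat(path).st_blocks * 512


def make_sparse(path, size):
    """Set the file length only; no data blocks are requested."""
    with open(path, "wb") as f:
        os.ftruncate(f.fileno(), size)


def make_fallocated(path, size):
    """Ask the filesystem to reserve the blocks up front."""
    with open(path, "wb") as f:
        os.posix_fallocate(f.fileno(), 0, size)


def compare(directory):
    """Create both files and return (length, allocated bytes) for each."""
    sparse = os.path.join(directory, "sparse.img")
    falloc = os.path.join(directory, "falloc.img")
    make_sparse(sparse, SIZE)
    make_fallocated(falloc, SIZE)
    return {
        "sparse": (os.path.getsize(sparse), allocated_bytes(sparse)),
        "falloc": (os.path.getsize(falloc), allocated_bytes(falloc)),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, (length, alloc) in compare(d).items():
            print(f"{name}: length={length} allocated={alloc}")
```

Both files report the same length, but only the second one is expected
to have its blocks reserved, so a later pwrite to the first one still
has to allocate space on the filesystem.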