On Fri, Jan 28, 2022 at 02:19:44PM +0100, Kevin Wolf wrote: > Am 28.01.2022 um 13:30 hat Hanna Reitz geschrieben: > > > > I just changed that line of code [2], as shown in [4]. I suppose > > > > the better thing to do would be to have an option for the NBD server > > > > to force-change the announced request alignment, because it can > > > > expect the qemu block layer code to auto-align requests through > > > > RMW. Doing it in the client is wrong, because the NBD server might > > > > want to detect that the client sends unaligned requests and reject > > > > them (though ours doesn’t, it just traces such events[5] – note that > > > > it’s explicitly noted there that qemu will auto-align requests). > > > I know I said I didn't care about performance (in this case), but is > > > there in fact a penalty to sending unaligned requests to the qcow2 > > > layer? Or perhaps it cannot compress them? > > > > In qcow2, only the whole cluster can be compressed, so writing compressed > > data means having to write the whole cluster. qcow2 could implement the > > padding by itself, but we decided to just leave the burden of only writing > > full clusters (with the COMPRESSED write flag) on the callers. > > > > Things like qemu-img convert and blockdev-backup just adhere to that by > > design; and the compress driver makes sure to set its request alignment > > accordingly so that requests to it will always be aligned to the cluster > > size (either by its user, or by the qemu block layer which performs the > > padding automatically). > > I thought the more limiting factor would be that after auto-aligning the > first request by padding with zeros, the second request to the same > cluster would fail because compression doesn't allow using an already > allocated cluster: > > /* Compression can't overwrite anything. Fail if the cluster was already > * allocated. */ > cluster_offset = get_l2_entry(s, l2_slice, l2_index); > if (cluster_offset & L2E_OFFSET_MASK) { > qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice); > return -EIO; > } > > Did you always just test a single request or why don't you run into > this?
I didn't test that one specifically and yes it does fail: $ qemu-img create -f qcow2 output.qcow2 1M Formatting 'output.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 & [1] 2069037 $ nbdsh -u nbd://localhost nbd> h.set_strict_mode(h.get_strict_mode() & ~nbd.STRICT_ALIGN) nbd> buf = b'1' * 1024 nbd> h.pwrite(buf, 0) nbd> h.pwrite(buf, 1024) Traceback (most recent call last): File "/usr/lib64/python3.10/code.py", line 90, in runcode exec(code, self.locals) File "<console>", line 1, in <module> File "/usr/lib64/python3.10/site-packages/nbd.py", line 1631, in pwrite return libnbdmod.pwrite(self._o, buf, offset, flags) nbd.Error: nbd_pwrite: write: command failed: Input/output error (EIO) So what I said in the previous email about about minimum vs preferred is wrong :-( What's more interesting is that nbdcopy still appeared to work. Simulating what that was doing would be something like which also fails when I do it directly: nbd> h.pwrite(buf, 0) nbd> h.zero(1024, 1024) Traceback (most recent call last): File "/usr/lib64/python3.10/code.py", line 90, in runcode exec(code, self.locals) File "<console>", line 1, in <module> File "/usr/lib64/python3.10/site-packages/nbd.py", line 1782, in zero return libnbdmod.zero(self._o, count, offset, flags) nbd.Error: nbd_zero: write-zeroes: command failed: Input/output error (EIO) Anyway back to poking at nbdcopy to make it support block sizes ... > I guess checking L2E_OFFSET_MASK is strictly speaking wrong because it's > invalid for compressed clusters (qcow2_get_cluster_type() feels more > appropriate), but in practice, you will always have non-zero data there, > so it should error out here. > > Kevin Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/