On Wed, Apr 24, 2019 at 09:19:17AM +0200, Kevin Wolf wrote:
On 24.04.2019 at 08:40, Vladimir Sementsov-Ogievskiy wrote:
23.04.2019 18:08, Kevin Wolf wrote:
> On 23.04.2019 at 16:26, Martin Kletzander wrote:
>> On Tue, Apr 23, 2019 at 02:12:18PM +0200, Kevin Wolf wrote:
>>> On 23.04.2019 at 13:30, Martin Kletzander wrote:
>>>> Hi,
>>>>
>>>> I am using qemu-img with nbdkit to transfer a disk image and then update it with
>>>> extra data from newer snapshots. The end image cannot be transferred because
>>>> the snapshots will be created later than the first transfer and we want to save
>>>> some time up front. You might think of it as a continuous synchronisation. It
>>>> looks something like this:
>>>>
>>>> I first transfer the whole image:
>>>>
>>>>     qemu-img convert -p $nbd disk.raw
>>>>
>>>> Where `$nbd` is something along the lines of `nbd+unix:///?socket=nbdkit.sock`
>>>>
>>>> Then, after the next snapshot is created, I can update it thanks to the `-n`
>>>> parameter (the $nbd now points to the newer snapshot with unchanged data looking
>>>> like holes in the file):
>>>>
>>>>     qemu-img convert -p -n $nbd disk.raw
>>>>
>>>> This is fast and efficient as it uses block status nbd extension, so it only
>>>> transfers new data.
>>>
>>> This is an implementation detail. Don't rely on it. What you're doing is
>>> abusing 'qemu-img convert', so problems like what you describe are to be
>>> expected.
>>>
>>>> This can be done over and over again to keep the local
>>>> `disk.raw` image up to date with the latest remote snapshot.
>>>>
>>>> However, when the guest OS zeroes some of the data and it gets written into the
>>>> snapshot, qemu-img scans for those zeros and does not write them to the
>>>> destination image. Checking the output of `qemu-img map --output=json $nbd`
>>>> shows that the zeroed data is properly marked as `data: true`.
>>>>
>>>> Using `-S 0` would write zeros even where the holes are, effectively overwriting
>>>> the data from the last snapshot even though they should not be changed.
>>>>
>>>> Having gone through some workarounds I would like there to be another way. I
>>>> know this is far from the typical usage of qemu-img, but is this really the
>>>> expected behaviour or is this just something nobody really needed before? If it
>>>> is the former, would it be possible to have a parameter that would control this
>>>> behaviour? If the latter is the case, can that behaviour be changed so that it
>>>> properly replicates the data when `-n` parameter is used?
>>>>
>>>> Basically the only thing we need is to either:
>>>>
>>>>  1) write zeros where they actually are or
>>>>
>>>>  2) turn off explicit sparsification without requesting dense image (basically
>>>>     sparsify only the part that is reported as hole on the source) or
>>>>
>>>>  3) ideally, just FALLOC_FL_PUNCH_HOLE in places where source did report data,
>>>>     but qemu-img found they are all zeros (or source reported HOLE+ZERO which, I
>>>>     believe, is effectively the same)
>>>>
>>>> If you want to try this out, I found the easiest reproducible way is using
>>>> nbdkit's data plugin, which can simulate whatever source image you like.
>>>
>>> I think what you _really_ want is a commit block job. The problem is
>>> just that you don't have a proper backing file chain, but just a bunch
>>> of NBD connections.
>>>
>>> Can't you get an NBD connection that already provides the condensed form
>>> of the whole snapshot chain directly at the source? If the NBD server
>>> was QEMU, this would actually be easier than providing each snapshot
>>> individually.
>>>
>>> If this isn't possible, I think you need to replicate the backing chain
>>> on the destination instead of converting into the same image again and
>>> again so that qemu-img knows that it must take existing data of the
>>> backing file into consideration:
>>>
>>>     qemu-img convert -O qcow2 nbd://... base.qcow2
>>>     qemu-img convert -O qcow2 -F qcow2 -B base.qcow2 nbd://... overlay1.qcow2
>>>     qemu-img convert -O qcow2 -F qcow2 -B overlay1.qcow2 nbd://... overlay2.qcow2
>>>     ...

So I spoke too soon. This approach fixed the one thing that I was struggling
with, but broke the rest, because it completely replicates the last image even
when the source provides proper allocation data. Best to show with an
illustration:

  $ rm -f disk.img snap.img
  $ dd if=/dev/urandom of=disk.img bs=2M count=1
  $ dd if=/dev/zero of=snap.img bs=1M count=1
  $ truncate -s 2M snap.img
  $ qemu-img map --output=json snap.img
  [{ "start": 0, "length": 1048576, "depth": 0, "zero": false, "data": true, "offset": 0},
  { "start": 1048576, "length": 1048576, "depth": 0, "zero": true, "data": false, "offset": 1048576}]
  $ qemu-img convert -f raw -O qcow2 disk.img disk.qcow2
  $ qemu-img convert -f raw -O qcow2 -B disk.qcow2 snap.img snap.qcow2
  $ qemu-img convert -f qcow2 -O raw snap.qcow2 output.raw
  $ hexdump -C output.raw
  00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  00200000

And qemu-img convert from qcow2 to raw is not broken.

So it looks like either we add support for this specific feature in qemu-img or
we need to use our own client that does that. Unless someone has other ideas,
that is.

Martin
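
For illustration only, here is a minimal sketch of the copy rule such a custom client would need. It is not qemu-img behaviour and not an existing tool. It assumes the extent list has the shape printed by `qemu-img map --output=json` above, and that `"data": false` means "unchanged since the previous snapshot", as in the nbdkit setup described in this thread. The function name `apply_extents`, the in-memory buffers, and the 0xaa fill pattern (standing in for the random contents of disk.img) are hypothetical.

```python
import json

MiB = 1024 * 1024


def apply_extents(src, dst, extents):
    """Update dst from src using only the source's allocation map.

    Extents reported as data are copied verbatim, even if they contain
    nothing but zeros. Extents reported as holes are skipped, so the
    destination keeps what the previous snapshot wrote there.
    """
    for ext in extents:
        start, length = ext["start"], ext["length"]
        if ext["data"]:
            dst[start:start + length] = src[start:start + length]
    return dst


# Extent list copied from the `qemu-img map` output in the message above.
extents = json.loads("""
[{ "start": 0, "length": 1048576, "depth": 0, "zero": false, "data": true, "offset": 0},
 { "start": 1048576, "length": 1048576, "depth": 0, "zero": true, "data": false, "offset": 1048576}]
""")

previous = bytearray(b"\xaa" * (2 * MiB))  # hypothetical stand-in for disk.img
snapshot = bytes(2 * MiB)                  # snap.img reads as zeros everywhere

result = apply_extents(snapshot, bytearray(previous), extents)
```

With these inputs the first megabyte of `result` becomes zeros (the zeros were reported as data, so they are written) and the second megabyte keeps the previous contents (reported as a hole, so left alone). That is the outcome the transcript above was hoping for, instead of a 2M image of zeros.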