On 05.02.20 20:21, Eric Blake wrote: > On 2/5/20 11:04 AM, Max Reitz wrote: >> OK, I expected users to come in a separate patch. > > I can refactor that better in v2. > >> >>> That's the use case: when copying into a destination file, it's useful >>> to know if the destination already reads as all zeroes, before >>> attempting a fallback to bdrv_make_zero(BDRV_REQ_NO_FALLBACK) or calls >>> to block status checking for holes. >> >> But that was my point on IRC. Is it really more useful if >> bdrv_make_zero() is just as quick? (And the fact that NBD doesn’t have >> an implementation looks more like a problem with NBD to me.) > > That is indeed a thought - why should qemu-img even TRY to call > bdrv_has_init_zero; it should instead call bdrv_make_zero(), which > should be as fast as possible (see my latest reply on 9/17 exploring > that idea more). Under the hood, we can then make bdrv_make_zero() use > whatever tricks it needs, whether keeping the driver's > .bdrv_has_zero_init/_truncate callbacks, adding another callback, making > write_zeroes faster, or whatever, but instead of making qemu-img sort > through multiple ideas, the burden would now be hidden in the block layer.
I didn’t even think of that. More on that at the very bottom of this mail. >> (Considering that at least the code we discussed on IRC didn’t work for >> preallocated images, which was the one point where we actually have a >> problem in practice.) > > And this series DOES improve the case for preallocated qcow2 images, by > virtue of a new qcow2 autoclear bit. But again, that may be something > we want to hide in the driver callback interfaces, while qemu-img just > blindly calls bdrv_make_zero() and gets success (the image now reads as > zeroes, either because it was already that way or we did something > quick) or failure (it is a waste of time to prezero, whether by > write_zeroes or by trim or by truncate, so just manually write zeroes as > part of your image copying). Oh, yes, indeed. Sorry. >>>> (We have a use case with convert -n to freshly created image files, but >>>> my position on this on IRC was that we want the --target-is-zero flag >>>> for that anyway: Auto-detection may always break, our preferred default >>>> behavior may always change, so if you want convert -n not to touch the >>>> target image except to write non-zero data from the source, we need a >>>> --target-is-zero flag and users need to use it. Well, management >>>> layers, because I don’t think users would use convert -n anyway. >>>> >>>> And with --target-is-zero and users effectively having to use it, I >>>> don’t think that’s a good example of a use case.) >>> >>> Yes, there will still be cases where you have to use --target-is-zero >>> because the image itself couldn't report that it already reads as >>> zeroes, but there are also enough cases where the destination is already >>> known to read zeroes and it's a shame to tell the user that 'you have to >>> add --target-is-zero to get faster copying even though we could have >>> inferred it on your behalf'. >> >> How is it a shame? I think only management tools would use convert -n. >> Management tools want reliable behavior. If you want reliable >> behavior, you have to use --target-is-zero anyway. So I don’t see the >> actual benefit for qemu-img convert. > > qemu-img convert to an NBD destination cannot create the destination, so > it ALWAYS has to use -n. I don't know how often users are likely to > wire up a command line for qemu-img convert with NBD as the destination, > or whether you are correct that it will be a management app able to > supply -n (and thus able to supply --target-is-zero). But at the same > time, can a management app learn whether it is safe to supply > --target-is-zero? With my upcoming NBD patches, 'qemu-img --list' will > show whether the NBD target is known to start life all zero, and a > management app could use mechanism to decide when to pass > --target-is-zero (whether a management app would actually fork qemu-img > --list, or just be an NBD client itself to do the same thing qemu-img > would do, is beside the point). > > Similarly, this series includes enhancements to 'qemu-img info' on qcow2 > images known to currently read as zero; again, that sort of information > is necessary somewhere in the chain, whether it be because qemu-img > consumes the information itself, or because the management app consumes > the information in order to pass the --target-is-zero option to > qemu-img, either way, the information needs to be available for > consumption. I simply assumed that management applications will just assume that a newly created image is zero. I’m aware that may be wrong, but then again, that hasn’t stopped them in the past or they would have asked for qemu to deliver this information earlier... (That doesn’t mean that at some point maybe they will start to care and ask for it.) One problem with delivering this information of course is that it’s useless. If qemu knows the image to be zero, then it will do the right thing by itself, and then the management application doesn’t need to pass --target-is-zero anymore. >>>> I suppose there is the point of blockdev-create + blockdev-mirror: This >>>> has exactly the same problem as convert -n. But again, if you really >>>> want blockdev-mirror not just to force-zero the image, you probably >>>> need >>>> to tell it so explicitly (i.e., with a --target-is-zero flag for >>>> blockdev-mirror). >>>> >>>> (Well, I suppose we could save us a target-is-zero for mirror if we >>>> took >>>> this series and had a filter driver that force-reports BDRV_ZERO_OPEN. >>>> But, well, please no.) >>>> >>>> But maybe I’m just an idiot and there is no reason not to take this >>>> series and make blockdev-create + blockdev-mirror do the sensible thing >>>> by default in most cases. *shrug* >>> >>> My argument for taking the series _is_ that the common case can be made >>> more efficient without user effort. >> >> The thing is, I don’t see the user effort. I don’t think users use >> convert -n or backup manually. And for management tools, it isn’t >> really effort to add another switch. > > Maybe, but it is just shifting the burden between who consumes the > information that an image is all zero, and how the consumption of that > information is passed to qemu-img. Sure, but the question is who can take the burden the easiest. The management layer creates the image and then uses it, so it can easily retain this information. When qemu creates an image and then uses it in a separate step, it cannot retain this information. It has to be stored somewhere persistently and we have to fetch it. In the case of qcow2, that works with a flag. In other cases... Well, in any case it isn’t as trivial as in a management application. >>> Yes, we still need the knob for >>> when the common case isn't already smart enough, >> >> But the user can’t know when qemu isn’t smart enough. So users who care >> have to always give the flag. >> >>> but the difference in >>> avoiding a pre-zeroing pass is noticeable when copying images around >> >> I’m sure it is, but the question I ask is whether in practice we >> wouldn’t get --target-is-zero in all of these cases anyway. >> >> >> So I’m not sold on “it works most of the time”, because if it’s just >> most of the time, then we’ll likely see --target-is-zero all of the time. >> >> OTOH, I suppose that with the new qcow2 extension, it would always work >> for the following case: >> (1) Create a qcow2 file, >> (2) Immediately (with the next qemu-img/QMP invocation) use it as a >> target of convert -n or mirror or anything similar. > > Yes, that is one of the immediately obvious fallouts from this series - > you can now create a preallocated qcow2 image in one process, and the > next process using that image can readily tell that it is still > just-created. And it’s a common case with blockdev-create. >> If so, that means it works reliably all of the time for a common case. >> I guess that’d be enough for me. >> >> Max >> >>> (and more than just for qcow2 - my followup series to improve NBD is >>> similarly useful given how much work has already been invested in >>> mapping NBD into storage access over https in the upper layers like >>> ovirt). >>> >> >> > > At any rate, I think you've convinced me to rethink how I present v2 > (maybe not by refactoring bdrv_known_zeroes usage, but instead > refactoring bdrv_make_zero), but that the qcow2 autoclear bit is still a > useful feature to have. Hm. So you mean there isn’t any caller that actually cares about whether an image is zero, only that it is zero. That is, they are all “if (!is_zero()) { make_zero(); }” – which is functionally the same as “make_zero();” alone. make_zero in turn could and should be a no-op when the image is known to be zero already. I actually didn’t think of that. Sounds good. Max
signature.asc
Description: OpenPGP digital signature