growable

Max Reitz Thu, 04 Sep 2014 13:03:31 -0700

On 20.08.2014 13:40, Kevin Wolf wrote:

Am 12.07.2014 um 00:23 hat Max Reitz geschrieben:

Currently, the field "growable" in a BDS is set iff the BDS is opened in
protocol mode (with O_BDRV_PROTOCOL). However, not every protocol block
driver allows growing: NBD, for instance, does not. On the other hand,
a non-protocol block driver may allow growing: The raw driver does.


Fix this by correcting the "growable" field in the driver-specific open
function for the BDS, if necessary.

Signed-off-by: Max Reitz <mre...@redhat.com>

I'm not sure I agree with bs->growable = true for raw. It's certainly
true that the backend can technically provide the functionality that
writes beyond EOF grow the file. That's not the point of bs->growable,
though.

The point of it was to _forbid_ it to grow even when it's technically
possible (non-file protocols weren't really a thing back then, apart
from vvfat, so the assumption was that it's always technically
possible). growable was introduced with bdrv_check_request(), which is
supposed to reject guest requests after the end of the virtual disk (and
this fixed a CVE, see commit 71d0770c). You're now disabling this check
for raw.

I think we need to make sure that bs->growable is only set if it is
opened for an image that has drv->requires_growing_file set and
therefore not directly used by a guest.

Well, except that with node-name a guest will be able to use any image
in the chain... Might this mean that it's really a BlockBackend
property?

Okay, the more I sit at this problem, the harder it seems to get. Theonly thing I currently know for sure is that I disagree with Anthony'sopinion in 71d0770c ("this patch makes the BlockDriver API guaranteethat all requests are within 0..bdrv_getlength() which to me seems likea Good Thing").

The initial point was to range check guest requests. In 2009, it mayhave been enough to statically check the BDS type (protocol or format)to know whether the guest directly accesses it (format) or not(protocol). However, this is no longer sufficient. Now, as far as Iknow, guests can access basically any BDS (as you yourself say).Therefore, it seems to me like it's impossible to determine whether thedevice should be marked growable or not when opening it. Instead, Ithink we have to check for each single request whether it comes from theguest or from within the block layer and do range checking only for theformer case; but this should not be the task of the block layer core,but of the block devices instead. Theoretically, guests may write beyondthe image end and grow it that way all they want, but in practice thisshould be limited by the block devices which have a fixed length.

Under this impression, I wanted to simply return to growable = false forraw. However, this breaks test 071 which attaches blkdebug to a raw BDSafter qemu has been started. blkdebug detects raw is not growable,therefore reports not to be growable as well, and because qcow2 is ontop of all that, the warning introduced by this series is emitted. Okay,so we will need growable = true for raw in some cases.

It's not trivial to determine whether the superordinate BDS of a certainBDS has BlockDriver.requires_growing_file set or not. We could introducea new flag for bdrv_open(), but I'd rather avoid it. In fact, I triedsomething like this, but I quickly got into problems because e.g.blkdebug does not have requires_growing_file_set, but decides whetherits own BDS are growable based on whether the underlying file isgrowable or not. So if a blkdebug BDS is growable, the underlying fileactually needs to be growable as well. Therefore, we'd need a moresophisticated requires_growing_file_set and maybe even propagation ofgrowable requirement through the BDS layers which quickly turns uglywhen one has to consider that a BDS may be used by multiple users.

Also, it's actually irrelevant whether requires_growing_file is set ornot. growable's current sole purpose apparently is enabling range checksfor guest-accessible images. If the BDS is only a subordinate to anotherBDS, it doesn't matter whether that other BDS needs growable set or not.

So I scrapped that. Instead, we can just test whether BDRV_O_PROTOCOL isgiven or not. If it is, the BDS is used from within the block layer; ifit isn't, it probably isn't, and even if it is, the user just has tocope with activated range checks. That's at least the idea.

But this doesn't work either: You can create a protocol BDS and thenpass it to the guest; on second thought, however, this is alreadypossible, so I wouldn't bother about this. But on the other hand, thisbreaks 071 as well because 071 creates a (non-guest accessible, but itcould be) non-protocol BDS and then tries to put blkdebug on top ofthat. I do know that this is not a general use case but it should workanyway.

So, in my honest and very humble opinion, I'd reevaluate the usefulnessof BDS.growable regarding it's original purpose and instead change it tobe what this patch does.

Anthony argumented that the block layer could very well do the rangechecks. It can, but it cannot trivially know (at least in its currentstate) whether a certain request comes from the guest or not. In 2009,this may have been known when the BDS was created; but this isabsolutely not true today.

On the other hand, the block devices very well know that any requestcoming to them has to fit in a certain range. They should do that rangechecking, not the block layer. I understand the intent of having aredundant fail-safe system, but it simply doesn't work anymore. Wecannot sometimes expected raw BDS to grow (when in the middle of a BDSstack) and sometimes not (when directly used by a guest). On the otherhand, the guest can simply be given a protocol BDS and all the rangechecks are disabled.

Putting BDS.growable into BlockBackend may (and probably) will fix this.But I really don't like doing the check in the block layer when it'sreally the block devices who are responsible for it, even if it's just abackup check.

The worst thing is, I can't even introduce a new field like"writes_beyond_eof" to BDS to circumvent all of this. Again, take theblkdebug-on-existing-raw example. With a separate field, the block layerwill not complain about opening that constellation. But if you actuallytry to write something to the qcow2 BDS which would make the image filegrow, the range checks breaks everything because it only cares aboutgrowable. So in the end, the block layer should actually have complainedabout the constellation. But on the other hand, it shouldn't have,because the constellation should work. This really is the heart of theproblem: The raw BDS might be exposed to the guest, so when the guestaccesses it, range checks should be done. But if it is used throughqcow2+blkdebug, range checks should be disabled. Using BlockBackend willfix this, but we don't have that yet.

tl;dr, I see only two ways to go on: Either I wait until BlockBackendexists; or I simply leave this patch as it is because it's actually theblock device driver's fault if an out-of-range request comes in from theguest. Since I remember talking about the former a year ago personallywith you and Markus, I fear it'll still take a considerable amount oftime. Therefore, I'm strongly in favor of the latter. If the blockdevice drivers do their job, it won't break anything. If they don't,they should be fixed (and at least it'll be only raw that's broken).

Max

Re: [Qemu-devel] [PATCH 1/4] block: Correct bs->growable

Reply via email to