Re: [Qemu-devel] [PATCH 13/17] block: Convert bdrv_pread(v) to BdrvChild

Max Reitz Mon, 27 Jun 2016 08:17:05 -0700

On 21.06.2016 11:21, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf <kw...@redhat.com>
> ---
>  block.c                |  7 ++++---
>  block/bochs.c          |  6 +++---
>  block/cloop.c          |  8 ++++----
>  block/crypto.c         |  2 +-
>  block/dmg.c            | 21 +++++++++++----------
>  block/io.c             |  8 ++++----
>  block/parallels.c      |  4 ++--
>  block/qcow.c           | 10 +++++-----
>  block/qcow2-cache.c    |  2 +-
>  block/qcow2-refcount.c | 12 ++++++------
>  block/qcow2-snapshot.c | 12 ++++++------
>  block/qcow2.c          | 16 ++++++++--------
>  block/qed.c            |  6 +++---
>  block/vhdx-log.c       |  8 ++++----
>  block/vhdx.c           | 38 +++++++++++++++++++++++---------------
>  block/vmdk.c           | 36 +++++++++++++++++-------------------
>  block/vpc.c            |  8 ++++----
>  include/block/block.h  |  5 ++---
>  18 files changed, 108 insertions(+), 101 deletions(-)


Reviewed-by: Max Reitz <mre...@redhat.com>

But I do take issue with a comment in this patch, see below.

[...]

> diff --git a/block/vhdx.c b/block/vhdx.c
> index b0f66de..c5ec608 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c

[...]

> @@ -1405,12 +1407,18 @@ static int vhdx_create_new_headers(BlockBackend *blk, 
> uint64_t image_size,
>      vhdx_guid_generate(&hdr->file_write_guid);
>      vhdx_guid_generate(&hdr->data_write_guid);
>  
> -    ret = vhdx_write_header(bs, hdr, VHDX_HEADER1_OFFSET, false);
> +    /* XXX Ugly way to get blk->root, but that's a feature, not a bug. This
> +     * hack makes it obvious that vhdx_write_header() bypasses the 
> BlockBackend
> +     * here, which it really shouldn't be doing. */

Short question: Why not?


Long stuff:

The question I'm asking myself is why do we have a BB here at all? We
don't need it, really. Actually, the whole file creation business is
pretty shady. See bdrv_create(), that thing takes a filename for
whatever reason (OK, historical ones, I know). This is fine on the
protocol level, but strange on the format level.

Ideally we'd want to seperate the protocol creation from the format
creation, and the format level just gets some (indirect?) handle to the
protocol BDS.

So that's where the BB comes from, because the format level has to open
the file (on the protocol level) itself somehow, and we decided to frown
upon naked accesses to a BDS. (But I feel like I'm forgetting something
here, I think there was another reason, too...)

My intuitive and personal distinction between BDS and BB accesses is the
following:
- A BB is somehow related to an external user. External users always
  have to go through a BB to access a BDS tree. It's VM-y (by that I
  mean it's related to what the whole qemu program does).
- A BDS is just a way to access a host image. It's not VM-y.

So in the case of image creation, I personally would be completely fine
with accesses to the naked BDS, because those are not external users and
they just want to access a host image directly.

As long as there's no real difference between how accessing an image
through a BB or without a BB works, I don't particularly care which is used.

But there's a catch: Some format drivers reuse code they use for
existing images for image creation, for instance, writing out the image
header. If image creation works on a BB while code for existing images
does not (it works on a BdrvChild), sharing this code does not work
trivially. Previously, this was worked around by just bypassing the BB,
and I think this is completely fine and very reasonable, because, as I
said above, I don't see why code for new images should be treated
differently than code for existing images.

However, that no longer works after this series, as we can see here.
Bypassing the BB is not trivial and you say it's not intended. However,
that would mean that code for creating a new image has to be completely
separate from code for an existing image, because one takes a BB and the
other takes a BdrvChild, and I don't agree on that.


I think the issue is another: At this point, the current idea of image
creation and the premise of this series clash. The premise of this
series is that you should only be allowed to access a BDS if it's your
child, and this mostly works because any BDS is someone's child, and
actually only that someone wants to perform accesses on this (one
exception being the qcow{2,}_write_compressed() code).

However, this premise breaks here. Since all format drivers have to
create the protocol file and the protocol BDS themselves, these are
suddenly not their children. We work around that by putting an anonymous
BB on top of them, so that they are someone's children; but that someone
is the BB, not the image creator.

I think you should be able to access any BDS if you own it, not just if
it's your child. I'd argue that any BDS you own should be your child,
actually. And the issue with our image creation is that the format
drivers do own the protocol BDS, but it's not their child.


Your answer to this is: Image creation code should just use the BB.
Technically a valid answer, but I think this leads to unnecessary code
duplication, and I also don't understand the reasoning behind this.

I see two other ways:

The first one is to introduce a way to retrieve the BdrvChild from a BB.
Technically very simple. I do agree that this is not how the BB is
supposed to be used, so I would definitely restrict this to anonymous BBs.
The idea behind this is: If you own a BB, you should transitively own
the BDS behind it. Thus, it should be your child, too, and you should be
able to access it as such.
Other argument: We have blk_bs(). I don't see the difference, frankly.

The second one would be to have a way to introduce a very lightweight
BDS parent object. You can just put this on top of an existing BDS to
claim your parenthood, basically, and this one gives you then the
BdrvChild handle.
Technically not so simple. And also, I'd claim that we already have such
a lightweight parent object which is called "anonymous BB".


In my mind, the use of anonymous BBs is really just to claim parenthood
to a some BDS. Thus, I think it should be very much fine to have a
BdrvChild *blk_child(BlockBackend *blk) function which asserts that the
BB is indeed anonymous.


Max

> +    child = QLIST_FIRST(&bs->parents);
> +    assert(!QLIST_NEXT(child, next_parent));
> +
> +    ret = vhdx_write_header(child, hdr, VHDX_HEADER1_OFFSET, false);
>      if (ret < 0) {
>          goto exit;
>      }
>      hdr->sequence_number++;
> -    ret = vhdx_write_header(bs, hdr, VHDX_HEADER2_OFFSET, false);
> +    ret = vhdx_write_header(child, hdr, VHDX_HEADER2_OFFSET, false);
>      if (ret < 0) {
>          goto exit;
>      }

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 13/17] block: Convert bdrv_pread(v) to BdrvChild

Reply via email to