Re: Exporting qcow2 images as raw data from ova file with qemu-nbd

Nir Soffer Fri, 26 Jun 2020 12:43:35 -0700

On Tue, Jun 23, 2020 at 1:21 AM Nir Soffer <nsof...@redhat.com> wrote:
>
> I'm trying to export qcow2 images from ova format using qemu-nbd.
>
> I create 2 compressed qcow2 images, with different data:
>
> $ qemu-img info disk1.qcow2
> image: disk1.qcow2
> file format: qcow2
> virtual size: 200 MiB (209715200 bytes)
> disk size: 384 KiB
> ...
>
> $ qemu-img info disk2.qcow2
> image: disk2.qcow2
> file format: qcow2
> virtual size: 200 MiB (209715200 bytes)
> disk size: 384 KiB
> ...
>
> And packed them in a tar file. This is not a valid ova but good enough
> for this test:
>
> $ tar tvf vm.ova
> -rw-r--r-- nsoffer/nsoffer 454144 2020-06-22 21:34 disk1.qcow2
> -rw-r--r-- nsoffer/nsoffer 454144 2020-06-22 21:34 disk2.qcow2
>
> To get info about the disks in ova file, we can use:
>
> $ python -c 'import tarfile; print(list({"name": m.name, "offset":
> m.offset_data, "size": m.size} for m in tarfile.open("vm.ova")))'
> [{'name': 'disk1.qcow2', 'offset': 512, 'size': 454144}, {'name':
> 'disk2.qcow2', 'offset': 455168, 'size': 454144}]
>
> First I tried the obvious:
>
> $ qemu-nbd --persistent --socket=/tmp/nbd.sock --read-only --offset=512 vm.ova
>
> And it works, but it exposes the qcow2 data. I want the raw data so I
> can upload the guest data to oVirt, where it may be converted to qcow2
> format.
>
> $ qemu-img info --output json "nbd+unix://?socket=/tmp/nbd.sock"
> {
>     "virtual-size": 209715200,
>     "filename": "nbd+unix://?socket=/tmp/nbd.sock",
>     "format": "qcow2",
>  ...
> }
>
> Looking in qemu manual and qapi/block-core.json, I could construct this 
> command:
>
> $ qemu-nbd --persistent --socket=/tmp/nbd.sock --read-only
> 'json:{"driver": "qcow2", "file": {"driver": "raw", "offset": 512,
> "size": 454144, "file": {"driver": "file", "filename": "vm.ova"}}}'
>
> And it works:
>
> $ qemu-img info --output json "nbd+unix://?socket=/tmp/nbd.sock"
> {
>     "virtual-size": 209715200,
>     "filename": "nbd+unix://?socket=/tmp/nbd.sock",
>     "format": "raw"
> }
>
> $ qemu-img map --output json "nbd+unix://?socket=/tmp/nbd.sock"
> [{ "start": 0, "length": 104857600, "depth": 0, "zero": false, "data":
> true, "offset": 0},
> { "start": 104857600, "length": 104857600, "depth": 0, "zero": true,
> "data": false, "offset": 104857600}]
>
> $ qemu-img map --output json disk1.qcow2
> [{ "start": 0, "length": 104857600, "depth": 0, "zero": false, "data": true},
> { "start": 104857600, "length": 104857600, "depth": 0, "zero": true,
> "data": false}]
>
> $ qemu-img convert -f raw -O raw nbd+unix://?socket=/tmp/nbd.sock disk1.raw
>
> $ qemu-img info disk1.raw
> image: disk1.raw
> file format: raw
> virtual size: 200 MiB (209715200 bytes)
> disk size: 100 MiB
>
> $ qemu-img compare disk1.raw disk1.qcow2
> Images are identical.
>
> I wonder if this is the best way to stack a qcow2 driver on top of a
> raw driver exposing a range from a tar file.
>
> I found a similar example for gluster in:
> docs/system/device-url-syntax.rst.inc


Other related challenges with this are:

1. probing image format

With standalone images, we probe image format using:

    qemu-img info image

I know probing is considered dangerous, but I think this is OK when a user
runs this code on their own machine, on an image they want to upload to
oVirt. On a hypervisor we use prlimit to limit the resources used by
qemu-img, so we can use the same solution when the code is run by a user,
if needed.

However, not being able to probe the image format is a usability issue. It
does not make sense that qemu-img cannot probe the image format safely, at
least for the qcow2 format.

I can get image info using:

$ qemu-img info 'json:{"driver": "qcow2", "file": {"driver": "raw",
"offset": 1536, "file": {"driver": "file", "filename":
"fedora-32.ova"}}}'
image: json:{"driver": "qcow2", "file": {"offset": 1536, "driver":
"raw", "file": {"driver": "file", "filename": "fedora-32.ova"}}}
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 645 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

But there is no way to probe the format, unless I first try qcow2 and
treat the image as raw if that fails.

We can parse the qcow2 header manually, as we already do in oVirt
engine UI in javascript:
https://github.com/oVirt/ovirt-engine/blob/9d48ea6274fdd1bef3fc8e309f9161be3b540890/frontend/webadmin/modules/uicommonweb/src/main/java/org/ovirt/engine/ui/uicommonweb/models/storage/ImageInfoModel.java#L103

We have used this code for 5 years and had no issues with it yet.
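
The manual header check can be sketched in Python. This is a minimal
illustration, not the oVirt engine code: probe_format is a hypothetical
helper, and it relies only on the published qcow2 header layout (magic
"QFI\xfb" at byte 0, big-endian version at byte 4, big-endian virtual size
at byte 24). The offset argument lets it inspect a member inside a tar
file. Anything without the magic is reported as raw, mirroring the
fallback described above.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
HEADER_LEN = 32  # enough to cover magic, version and virtual size


def probe_format(path, offset=0):
    """Guess the image format from the bytes at `offset` in `path`.

    Hypothetical helper: returns a dict with "format", and for qcow2
    also "version" and "virtual-size". Non-qcow2 data is assumed raw.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        header = f.read(HEADER_LEN)

    if len(header) < HEADER_LEN or header[:4] != QCOW2_MAGIC:
        return {"format": "raw"}

    # All qcow2 header fields are big-endian.
    (version,) = struct.unpack(">I", header[4:8])
    (virtual_size,) = struct.unpack(">Q", header[24:32])
    return {"format": "qcow2", "version": version, "virtual-size": virtual_size}
```

For the test ova above, probe_format("vm.ova", 512) would inspect the
first disk. This only reads the header; it does not validate the rest of
the image, so a later "qemu-img info" check is still needed.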

In the worst case, if we fail to detect the format, or let the user upload
a qcow2 file that oVirt does not support, the upload will fail at the end,
in the verification step, when we check the uploaded image using
"qemu-img info". This is done using prlimit since we treat this image as
untrusted.

I think it would be useful if the qemu project published libraries in
C/python/javascript supporting format probing for the qcow2 format.

2. getting image virtual size

We can use qemu-img info with a custom json: filename, as shown above, but
this is very complicated and error prone.
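
One way to reduce the error-prone part is to generate the json: filename
from the tar member instead of typing offsets by hand. The following is a
minimal sketch; member_json_filename is a hypothetical helper, and it
assumes an uncompressed tar (mode "r:"), since member offsets are only
meaningful to qemu when the archive is not compressed.

```python
import json
import tarfile


def member_json_filename(ova_path, member_name, fmt="qcow2"):
    """Build a qemu json: filename for one member of an uncompressed tar.

    Hypothetical helper: `fmt` is the format of the member ("qcow2" or
    "raw") and must be known or probed by the caller.
    """
    # "r:" refuses compressed archives, where offsets would be useless.
    with tarfile.open(ova_path, "r:") as tar:
        member = tar.getmember(member_name)

    # A raw node exposing only the member's byte range in the tar file.
    node = {
        "driver": "raw",
        "offset": member.offset_data,
        "size": member.size,
        "file": {"driver": "file", "filename": ova_path},
    }
    if fmt != "raw":
        # Stack the format driver on top of the byte range.
        node = {"driver": fmt, "file": node}
    return "json:" + json.dumps(node)
```

The returned string can be passed to qemu-img info, qemu-img measure, or
qemu-nbd in the same way as the hand-written json: filenames above.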

3. measuring image required size when converting to qcow2 image on block device

This works if we know the image format:

$ qemu-img measure -O qcow2 'json:{"driver": "qcow2", "file":
{"driver": "raw", "offset": 1536, "file": {"driver": "file",
"filename": "fedora-32.ova"}}}'
required size: 1381302272
fully allocated size: 6443696128

But it is complicated.

Can we have better support in qemu-img/qemu-nbd for accessing images
in a tar file?

Maybe something like:

    qemu-img info tar://vm.ova?member=fedora-32.qcow2

This can return information on the file named "fedora-32.qcow2" in the
tar file "vm.ova".

image: tar://vm.ova?member=fedora-32.qcow2
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 645 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

$ qemu-img measure -O qcow2 tar://vm.ova?member=fedora-32.qcow2
required size: 1381302272
fully allocated size: 6443696128

What if we had a tar driver that can be used like this:

{"driver": "qcow2",
 "file": {"driver": "tar",
          "member": "fedora-32.qcow2",
          "file": {"driver": "file",
                    "filename": "vm.ova"}}}

This driver can be implemented using a tar parser and a raw driver
with offset and size.

So maybe we don't need a driver, but code in qemu-img that parses the tar
format and builds the right graph using existing drivers.
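
As a rough illustration of that idea outside qemu, the proposed URL could
be translated into a graph of existing drivers like this. This is only a
sketch: the tar:// scheme is the proposal from this mail and is not
understood by qemu, tar_url_to_graph is a hypothetical name, and the
member format is passed in by the caller rather than probed.

```python
import tarfile
from urllib.parse import parse_qs, urlsplit


def tar_url_to_graph(url, fmt="qcow2"):
    """Translate tar://PATH?member=NAME into a qemu block graph dict.

    Hypothetical helper for the proposed (not existing) tar:// syntax.
    """
    parts = urlsplit(url)
    if parts.scheme != "tar":
        raise ValueError("not a tar:// url: %r" % url)

    # tar://vm.ova puts the name in netloc, tar:///abs/vm.ova in path.
    ova_path = parts.netloc + parts.path
    members = parse_qs(parts.query).get("member", [])
    if len(members) != 1:
        raise ValueError("exactly one member= parameter is required")

    # Uncompressed tar only, so offsets map directly to the file.
    with tarfile.open(ova_path, "r:") as tar:
        info = tar.getmember(members[0])

    return {
        "driver": fmt,
        "file": {
            "driver": "raw",
            "offset": info.offset_data,
            "size": info.size,
            "file": {"driver": "file", "filename": ova_path},
        },
    }
```

Serializing the result with json.dumps and prefixing it with "json:"
gives a filename that current qemu-img and qemu-nbd already accept.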

Regardless of how we implement it, qemu-img would have basic support for
the ova format, which sounds like a good thing, even if the ova format is
horrible and non-standard. Users don't care about the details, only about
compatibility.

Nir
