Re: [Qemu-block] [VDSM] travis tests fail consistently since Apr 14

Nir Soffer Mon, 11 Jun 2018 16:08:24 -0700

On Mon, Jun 11, 2018 at 7:20 PM Kevin Wolf <kw...@redhat.com> wrote:

> Am 11.06.2018 um 12:43 hat Nir Soffer geschrieben:
> > On Mon, Jun 11, 2018 at 1:03 PM Maor Lipchuk <mlipc...@redhat.com>
> wrote:
> >
> > > On Tue, Jun 5, 2018 at 9:40 AM, Dan Kenigsberg <dan...@redhat.com>
> wrote:
> > >
> > >> On Mon, Jun 4, 2018 at 7:14 PM, Nir Soffer <nsof...@redhat.com>
> wrote:
> > >> > On Mon, Jun 4, 2018 at 6:56 PM Dan Kenigsberg <dan...@redhat.com>
> > >> wrote:
> > >> >>
> > >> >> On Tue, May 8, 2018 at 11:59 AM, Nir Soffer <nsof...@redhat.com>
> > >> wrote:
> > >> >> > There are several issues:
> > >> >> >
> > >> >> > 1. coverage fail after this patch:
> > >> >> >
> > >> >> >
> > >>
> https://github.com/oVirt/vdsm/commit/6b905c2c134bcf344961d28eefbd05f2838d2ec8
> > >> >> >
> > >> >> > https://travis-ci.org/oVirt/vdsm/builds/366574414
> > >> >> > ...
> > >> >> > pwd
> > >> >> > /vdsm/tests
> > >> >> > ls .cov*
> > >> >> > ls: cannot access .cov*: No such file or directory
> > >> >> > make[1]: *** [check] Error 2
> > >> >> > make[1]: Leaving directory `/vdsm/tests'
> > >> >>
> > >> >> That was me, sorry. This should solve it:
> > >> >> https://gerrit.ovirt.org/#/c/91925/
> > >> >
> > >> > Thanks!
> > >> >
> > >> >> BTW, on my fc28 I see TestCountClusters.test_multiple_blocks
> failing
> > >> with
> > >> >>
> > >> >> E           Error: Command ['/usr/bin/qemu-img', 'map', '--output',
> > >> >> 'json', '/var/tmp/vdsm/test_multiple_blocks0/test'] failed with
> rc=-6
> > >> >> out='' err="qemu-img:
> > >> >> /builddir/build/BUILD/qemu-2.12.0-rc1/qemu-img.c:2680:
> > >> >> get_block_status: Assertion `bytes' failed.\n"
> > >> >>
> > >> >> Any idea what's that?
> > >> >
> > >> >
> > >> > Looks like qemu-img bug.
> > >> >
> > >> > Can you file a qemu-img bug?
> > >>
> > >> I hope Maor can translate the test to qemu-img speak.
> > >>
> > >
> > > Opened the following bug:
> > >   https://bugzilla.redhat.com/1589738
> > >
> >
> > Adding qemu-block
>
> It's related to the unaligned image size.



Right, I think we forgot the -1 in the test setup:

    f.seek(16 * 1024 - 1)
    f.write(b"x")

But it is good that we made this error :-)


> Correct image files should be
> aligned to 512 byte sectors, so something is wrong with your image to
> start with (hard disks don't have half sectors).
>

Right. We indeed forbid uploads on unaligned images in the UI.

But we found that qemu-img creates unaligned images:

$ qemu-img create -f qcow2 empty.raw 10g
Formatting 'empty.raw', fmt=qcow2 size=10737418240 cluster_size=65536
lazy_refcounts=off refcount_bits=16

$ ls -l empty.raw
-rw-r--r--. 1 nsoffer nsoffer 196768 Jun 12 01:59 empty.raw

$ python -c "print 196768 % 512"
160

$ qemu-img map -f raw --output json test.raw
[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true,
"offset": 0},
{ "start": 4096, "length": 12288, "depth": 0, "zero": true, "data": false,
"offset": 4096},
{ "start": 16384, "length": 4096, "depth": 0, "zero": false, "data": true,
"offset": 16384},
qemu-img: /builddir/build/BUILD/qemu-2.12.0-rc1/qemu-img.c:2680:
get_block_status: Assertion `bytes' failed.
Aborted (core dumped)

The image becomes aligned once you write anything into it, but I still find
it
strange that qemu-img create such files. Shouldn't it create always 3
complete
clusters?


>
> Anyway, git bisects points to this one:
>
> a290f085901b528265787cd27ebda19c970be4ee is the first bad commit
> commit a290f085901b528265787cd27ebda19c970be4ee
> Author: Eric Blake <ebl...@redhat.com>
> Date:   Tue Feb 13 14:26:44 2018 -0600
>
>     file-posix: Switch to .bdrv_co_block_status()
>
>     We are gradually moving away from sector-based interfaces, towards
>     byte-based.  Update the file protocol driver accordingly.
>
>     In want_zero mode, we continue to report fine-grained hole
>     information (the caller wants as much mapping detail as possible);
>     but when not in that mode, the caller prefers larger *pnum and
>     merely cares about what offsets are allocated at this layer, rather
>     than where the holes live.  Since holes still read as zeroes at
>     this layer (rather than deferring to a backing layer), we can take
>     the shortcut of skipping lseek(), and merely state that all bytes
>     are allocated.
>
>     We can also drop redundant bounds checks that are already
>     guaranteed by the block layer.
>
>     Signed-off-by: Eric Blake <ebl...@redhat.com>
>     Reviewed-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
>     Reviewed-by: Fam Zheng <f...@redhat.com>
>     Signed-off-by: Kevin Wolf <kw...@redhat.com>
>
> I think the problem is a bit higher up the call stack, but I'm not
> completely sure yet. It manifests in img_map(), in this code:
>
>         while (curr.start + curr.length < length) {
>             ...
>             n = QEMU_ALIGN_DOWN(MIN(1 << 30, length - offset),
> BDRV_SECTOR_SIZE);
>             ret = get_block_status(bs, offset, n, &next);
>
> The loop condition is still true because a single byte is left to be
> processed, but n is aligned down to 0. I'm not sure why the
> QEMU_ALIGN_DOWN() is even there.
>
> Eric, would just removing the QEMU_ALIGN_DOWN() be correct?
>
> Kevin
>

Re: [Qemu-block] [VDSM] travis tests fail consistently since Apr 14

Reply via email to