On 06/15/2016 09:57 AM, Kevin Wolf wrote: > Am 14.06.2016 um 18:02 hat Cédric Le Goater geschrieben: >> On 06/14/2016 10:38 AM, Kevin Wolf wrote: >>> Am 14.06.2016 um 10:02 hat Cédric Le Goater geschrieben: >>>>>> #4 0x00007fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, >>>>>> req=<optimized out>, offset=30878208, >>>>>> bytes=512, qiov=0x7fa7f47fee60, flags=0) >>>>>> at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243 >>>>>> #5 0x00007fa81c669ecb in bdrv_co_pwritev (bs=0x7fa81d4dd050, offset=8, >>>>>> bytes=512, qiov=0x7fa80d5191c0, >>>>>> flags=(BDRV_REQ_COPY_ON_READ | BDRV_REQ_ZERO_WRITE | >>>>>> BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_SERIALISING | BDRV_REQ_FUA | unknown: >>>>>> 4278124256), flags@entry=(unknown: 0)) >>>>>> at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1492 >>>>> >>>>> That 'flags' value looks bogus... >>>>> >>>>>> #6 0x00007fa81c65e367 in blk_co_pwritev (blk=0x7fa81d4c5b60, >>>>>> offset=30878208, bytes=256, qiov=0x7fa80d5191c0, >>>>>> flags=(unknown: 0)) at >>>>>> /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:788 >>>>>> #7 0x00007fa81c65e49b in blk_aio_write_entry (opaque=0x7fa7e849aca0) >>>>>> at >>>>>> /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:977 >>>>>> #8 0x00007fa81c6c823a in coroutine_trampoline (i0=<optimized out>, >>>>>> i1=<optimized out>) >>>>>> at >>>>>> /home/legoater/work/qemu/qemu-ast2400-mainline.git/util/coroutine-ucontext.c:78 >>>>>> #9 0x00007fa818ea8f00 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 >>>>> >>>>> and we don't get anything further in the backtrace beyond coroutines, to >>>>> see who's sending the bad parameters. I recently debugged a bogus flags >>>>> in bdrv_aio_preadv, by hoisting an assert to occur before coroutines are >>>>> used in blk_aio_prwv(): >>>>> >>>>> https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg02948.html >>>>> >>>>> I've just posted v2 of that patch (now a 2/2 series), but in v2 no >>>>> longer kept the assert at that point. But maybe the correct fix, and/or >>>>> the hack for catching the bug prior to coroutines, will help you debug >>>>> where the bad arguments are coming from. >>>> >>>> That does not fix the assert. >>>> >>>>>> #10 0x00007fa80d5189d0 in ?? () >>>>>> #11 0x0000000000000000 in ?? () >>>>>> (gdb) up 4 >>>>>> #4 0x00007fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, >>>>>> req=<optimized out>, offset=30878208, >>>>>> bytes=512, qiov=0x7fa7f47fee60, flags=0) >>>>>> at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243 >>>>>> 1243 assert(!qiov || bytes == qiov->size); >>>>>> (gdb) p *qiov >>>>>> $1 = {iov = 0x7fa81da671d0, niov = 1, nalloc = 1, size = 256} >>>> >>>> So, it seems that the issue is coming from the fact that bdrv_co_pwritev() >>>> does not handle alignments less than BDRV_SECTOR_SIZE : >>>> >>>> /* TODO Lift BDRV_SECTOR_SIZE restriction in BlockDriver interface */ >>>> uint64_t align = MAX(BDRV_SECTOR_SIZE, bs->request_alignment); >>>> >>>> It calls bdrv_aligned_pwritev() which does the assert : >>>> >>>> assert(!qiov || bytes == qiov->size); >>> >>> Yes, but between these two places, there is code that should actually >>> enforce the right alignment: >>> >>> if ((offset + bytes) & (align - 1)) { >>> ... >>> } >>> >>> You can see in your backtrace that bdrv_aligned_pwritev() gets a >>> different qiov than bdrv_co_pwritev() (which is local_qiov in the latter >>> function). >>> >>> It's just unclear to me why this code extended bytes, but didn't add the >>> tail_buf iovec to local_qiov. >> >> The gdb backtrace is bogus. It does not make sense. May be a gdb issue >> with multithread on jessie. >> >> In the path tracking the tail bytes, we have : >> >> if ((offset + bytes) & (align - 1)) { >> ... > if (!use_local_qiov) { > qemu_iovec_init(&local_qiov, qiov->niov + 1); > qemu_iovec_concat(&local_qiov, qiov, 0, qiov->size); > use_local_qiov = true; > } >> tail_bytes = (offset + bytes) & (align - 1); >> qemu_iovec_add(&local_qiov, tail_buf + tail_bytes, align - >> tail_bytes); >> >> bytes = ROUND_UP(bytes, align); >> } >> >> This is where the issue is I think. The qiov holds 256 and bytes 512. >> >> I have no idea how to fix that though. > > Added some more context above. qiov->size as passed from the device is > already 256 bytes, which are added to local_qiov with > qemu_iovec_concat(). And then we add another 256 from tail_buf in the > lines that you quoted, so in theory we should end up with a properly > aligned 256 + 256 = 512 byte qiov.
yes. It seems that qiov is bogus after the bdrv_aligned_preadv() call. It gets zeroed most of the time, sometime ->size is 1, and then qemu_iovec_concat() constructs an awful local_qiov, which brings the assert in bdrv_aligned_pwritev() How's that possible ? Could it be a serialization issue ? C.