On 18.06.2015 at 10:42, Kevin Wolf wrote:
On 18.06.2015 at 10:30, Peter Lieven wrote:
On 18.06.2015 at 09:45, Kevin Wolf wrote:
On 18.06.2015 at 09:12, Peter Lieven wrote:
Thread 2 (Thread 0x7ffff5550700 (LWP 2636)):
#0  0x00007ffff5d87aa3 in ppoll () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x0000555555955d91 in qemu_poll_ns (fds=0x5555563889c0, nfds=3,
     timeout=4999424576) at qemu-timer.c:326
         ts = {tv_sec = 4, tv_nsec = 999424576}
         tvsec = 4
#2  0x0000555555956feb in aio_poll (ctx=0x5555563528e0, blocking=true)
     at aio-posix.c:231
         node = 0x0
         was_dispatching = false
         ret = 1
         progress = false
#3  0x000055555594aeed in bdrv_prwv_co (bs=0x55555637eae0, offset=4292007936,
     qiov=0x7ffff554f760, is_write=false, flags=0) at block.c:2699
         aio_context = 0x5555563528e0
         co = 0x5555563888a0
         rwco = {bs = 0x55555637eae0, offset = 4292007936,
           qiov = 0x7ffff554f760, is_write = false, ret = 2147483647, flags = 0}
#4  0x000055555594afa9 in bdrv_rw_co (bs=0x55555637eae0, sector_num=8382828,
     buf=0x7ffff44cc800 "(", nb_sectors=4, is_write=false, flags=0)
     at block.c:2722
         qiov = {iov = 0x7ffff554f780, niov = 1, nalloc = -1, size = 2048}
         iov = {iov_base = 0x7ffff44cc800, iov_len = 2048}
#5  0x000055555594b008 in bdrv_read (bs=0x55555637eae0, sector_num=8382828,
     buf=0x7ffff44cc800 "(", nb_sectors=4) at block.c:2730
No locals.
#6  0x000055555599acef in blk_read (blk=0x555556376820, sector_num=8382828,
     buf=0x7ffff44cc800 "(", nb_sectors=4) at block/block-backend.c:404
No locals.
#7  0x0000555555833ed2 in cd_read_sector (s=0x555556408f88, lba=2095707,
     buf=0x7ffff44cc800 "(", sector_size=2048) at hw/ide/atapi.c:116
         ret = 32767
Here is the problem: The ATAPI emulation uses synchronous blk_read()
instead of the AIO or coroutine interfaces. This means that it keeps
polling for request completion while it holds the BQL until the request
is completed.
I will look at this.
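
To illustrate the difference described above, here is a minimal sketch in plain C. It is not QEMU code: FakeRequest, FakeLock and every function below are hypothetical stand-ins, and the "lock" is only a flag that counts the poll rounds during which it was held. The synchronous variant polls until completion while holding the lock; the callback variant returns at once, so the lock can be dropped while the request is in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a backend request that needs several
 * poll rounds before it completes (e.g. a slow read). */
typedef struct FakeRequest {
    int polls_left;             /* poll rounds until completion */
    bool done;
    void (*cb)(void *opaque);   /* completion callback, may be NULL */
    void *opaque;
} FakeRequest;

/* Hypothetical stand-in for the big lock; it only records how many
 * poll rounds happened while it was held. */
typedef struct FakeLock {
    bool held;
    int rounds_held;
} FakeLock;

/* One poll round: the request makes progress, and we note whether
 * the lock was held during that round. */
static void fake_poll(FakeRequest *req, FakeLock *lock)
{
    if (lock->held) {
        lock->rounds_held++;
    }
    if (!req->done && --req->polls_left <= 0) {
        req->done = true;
    }
}

/* Synchronous pattern: poll until done, lock held the whole time. */
static void sync_read(FakeRequest *req, FakeLock *lock)
{
    assert(lock->held);
    while (!req->done) {
        fake_poll(req, lock);
    }
}

/* Callback pattern: only register the callback and return. */
static void async_read(FakeRequest *req, void (*cb)(void *), void *opaque)
{
    req->cb = cb;
    req->opaque = opaque;
}

/* Event loop: waits without the lock, takes it only for the callback. */
static void event_loop_run(FakeRequest *req, FakeLock *lock)
{
    while (!req->done) {
        fake_poll(req, lock);
    }
    lock->held = true;
    if (req->cb) {
        req->cb(req->opaque);
    }
    lock->held = false;
}

static void mark_completed(void *opaque)
{
    *(bool *)opaque = true;
}

/* Poll rounds spent under the lock with the synchronous pattern. */
int rounds_locked_sync(int latency)
{
    FakeRequest req = { latency, false, NULL, NULL };
    FakeLock lock = { true, 0 };
    sync_read(&req, &lock);
    return lock.rounds_held;
}

/* Poll rounds spent under the lock with the callback pattern. */
int rounds_locked_async(int latency)
{
    FakeRequest req = { latency, false, NULL, NULL };
    FakeLock lock = { true, 0 };
    bool completed = false;
    async_read(&req, mark_completed, &completed);
    lock.held = false;          /* caller returns and drops the lock */
    event_loop_run(&req, &lock);
    assert(completed);
    return lock.rounds_held;
}
```

In this toy model, rounds_locked_sync(n) grows with the request latency n, while rounds_locked_async(n) stays at zero, which is the effect a conversion away from the synchronous read would be after.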

We can (and should) fix that; otherwise the VCPUs are blocked while we're
reading from the image, even without a hang. It doesn't fully fix your
problem, though, as bdrv_drain_all() and friends still exist.
Any idea which commands actually call bdrv_drain_all()?
At least 'stop' and all commands changing the BDS graph (block jobs,
snapshots, commit, etc.). For a full list, I would have to inspect each
command in the code.

The guest can even trigger bdrv_drain_all() by stopping a running DMA
operation.

Unfortunately, exactly this is happening...
Is there any way to avoid the bdrv_drain_all in bmdma_cmd_writeb?

Peter
