On 18.06.2015 at 10:42, Kevin Wolf wrote:
On 18.06.2015 at 10:30, Peter Lieven wrote:
On 18.06.2015 at 09:45, Kevin Wolf wrote:
On 18.06.2015 at 09:12, Peter Lieven wrote:
Thread 2 (Thread 0x7ffff5550700 (LWP 2636)):
#0 0x00007ffff5d87aa3 in ppoll () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x0000555555955d91 in qemu_poll_ns (fds=0x5555563889c0, nfds=3,
timeout=4999424576) at qemu-timer.c:326
ts = {tv_sec = 4, tv_nsec = 999424576}
tvsec = 4
#2 0x0000555555956feb in aio_poll (ctx=0x5555563528e0, blocking=true)
at aio-posix.c:231
node = 0x0
was_dispatching = false
ret = 1
progress = false
#3 0x000055555594aeed in bdrv_prwv_co (bs=0x55555637eae0, offset=4292007936,
qiov=0x7ffff554f760, is_write=false, flags=0) at block.c:2699
aio_context = 0x5555563528e0
co = 0x5555563888a0
rwco = {bs = 0x55555637eae0, offset = 4292007936,
qiov = 0x7ffff554f760, is_write = false, ret = 2147483647, flags = 0}
#4 0x000055555594afa9 in bdrv_rw_co (bs=0x55555637eae0, sector_num=8382828,
buf=0x7ffff44cc800 "(", nb_sectors=4, is_write=false, flags=0)
at block.c:2722
qiov = {iov = 0x7ffff554f780, niov = 1, nalloc = -1, size = 2048}
iov = {iov_base = 0x7ffff44cc800, iov_len = 2048}
#5 0x000055555594b008 in bdrv_read (bs=0x55555637eae0, sector_num=8382828,
buf=0x7ffff44cc800 "(", nb_sectors=4) at block.c:2730
No locals.
#6 0x000055555599acef in blk_read (blk=0x555556376820, sector_num=8382828,
buf=0x7ffff44cc800 "(", nb_sectors=4) at block/block-backend.c:404
No locals.
#7 0x0000555555833ed2 in cd_read_sector (s=0x555556408f88, lba=2095707,
buf=0x7ffff44cc800 "(", sector_size=2048) at hw/ide/atapi.c:116
ret = 32767
Here is the problem: The ATAPI emulation uses synchronous blk_read()
instead of the AIO or coroutine interfaces. This means that it keeps
polling for request completion while it holds the BQL until the request
is completed.
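To illustrate the difference between the two patterns, here is a minimal, self-contained sketch. It is not QEMU code: Request, poll_once(), read_sync() and read_async() are hypothetical stand-ins for a block request, one event-loop iteration, the blocking blk_read() style and a callback-based AIO style. The "lock" is only modelled as a counter of loop iterations spent while the caller would be holding the BQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; none of these are QEMU APIs. */
typedef void CompletionFn(void *opaque, int ret);

typedef struct Request {
    int ticks_left;     /* event-loop iterations until the read completes */
    CompletionFn *cb;   /* completion callback, NULL for synchronous use */
    void *opaque;       /* caller data handed back to the callback */
    bool done;
} Request;

/* One iteration of a toy event loop: advances the request and, once it
 * has finished, marks it done and invokes the callback if one is set. */
static void poll_once(Request *req)
{
    if (!req->done && --req->ticks_left <= 0) {
        req->done = true;
        if (req->cb) {
            req->cb(req->opaque, 0);
        }
    }
}

/* Synchronous pattern: the caller keeps polling until the request is
 * done. Returns the number of iterations during which the caller (and
 * therefore the lock it holds) was stuck waiting. */
static int read_sync(Request *req)
{
    int lock_held_ticks = 0;

    while (!req->done) {
        lock_held_ticks++;
        poll_once(req);
    }
    return lock_held_ticks;
}

/* Asynchronous pattern: only register the callback and return at once.
 * The main loop calls poll_once() later, so the submitter does not wait. */
static void read_async(Request *req, CompletionFn *cb, void *opaque)
{
    assert(cb != NULL);
    req->cb = cb;
    req->opaque = opaque;
}

/* Example callback: stores the result code in an int owned by the caller. */
static void on_read_done(void *opaque, int ret)
{
    *(int *)opaque = ret;
}
```

With a request that needs five iterations, read_sync() spends all five while the caller is blocked, whereas read_async() returns immediately and on_read_done() only fires once the main loop has driven the request to completion. A slow or hanging backend therefore stalls everything behind the lock in the first case, but not in the second.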
I will look at this.
We can (and should) fix that, otherwise the VCPUs are blocked while we're
reading from the image, even without a hang. It doesn't fully fix your
problem, though, as bdrv_drain_all() and friends still exist.
Any idea which commands actually call bdrv_drain_all()?
At least 'stop' and all commands changing the BDS graph (block jobs,
snapshots, commit, etc.). For a full list, I would have to inspect each
command in the code.
The guest can even trigger bdrv_drain_all() by stopping a running DMA
operation.
Unfortunately, exactly this is happening...
Is there any way to avoid the bdrv_drain_all() in bmdma_cmd_writeb()?
Peter