On Wed, 03/09 17:17, Paolo Bonzini wrote:
>
>
> On 09/03/2016 15:29, Christian Borntraeger wrote:
> > FWIW, it seems that this patch triggers this error, the
> > "tracked_request_begin"
> > that I reported yesterday and / or some early read issues from the
> > bootloader
> > in a random fashion.
> > Using 2906cddfecff21af20eedab43288b485a679f9ac^ seems to work all the time,
> > moving around vblk->dataplane_started = true also triggers all 3 types
> > of bugs
>
> In all likelihood, the bug is that virtio_blk_handle_output is being
> called in two threads.
>
> It's not clear to me how that's possible, though.

The aio_poll() inside "blk_set_aio_context(s->conf->conf.blk, s->ctx)" looks
suspicious:

          main thread                                       iothread
----------------------------------------------------------------------------
virtio_blk_handle_output()
 virtio_blk_data_plane_start()
  vblk->dataplane_started = true;
  blk_set_aio_context()
   bdrv_set_aio_context()
    bdrv_drain()
     aio_poll()
      <snip...>
       virtio_blk_handle_output()
        /* s->dataplane_started is true */
!!! ->  virtio_blk_handle_request()
  event_notifier_set(ioeventfd)
                                                    aio_poll()
                                                     virtio_blk_handle_request()

Christian, could you try the following patch?  The aio_poll above is replaced
with a "limited aio_poll" that doesn't dispatch ioeventfd.

(Note: perhaps moving "vblk->dataplane_started = true;" after
blk_set_aio_context() also *works around* this.)

---

diff --git a/block.c b/block.c
index ba24b8e..e37e8f7 100644
--- a/block.c
+++ b/block.c
@@ -4093,7 +4093,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 {
-    bdrv_drain(bs); /* ensure there are no in-flight requests */
+    /* ensure there are no in-flight requests */
+    bdrv_drained_begin(bs);
+    bdrv_drained_end(bs);
     bdrv_detach_aio_context(bs);