On 03/10/2017 10:08 PM, Halil Pasic wrote: > > > On 03/10/2017 05:47 PM, Paolo Bonzini wrote: >> >> On 07/03/2017 14:16, Halil Pasic wrote: >>> The commits 03de2f527 "virtio-blk: do not use vring in dataplane" and >>> 9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active" >>> changed how notifications are done for virtio-blk substantially. Due to a >>> race condition, interrupts are lost when irqfd behind the guest notifier >>> is torn down after notify_guest_bh was scheduled but before it actually >>> runs. >>> >>> Let's fix this by forcing guest notifications before cleaning up the >>> irqfd's. Let's also add some explanatory comments. >>> >>> Cc: qemu-sta...@nongnu.org >>> Signed-off-by: Halil Pasic <pa...@linux.vnet.ibm.com> >>> Reported-by: Michael A. Tebolt <mi...@us.ibm.com> >>> Tested-by: Michael A. Tebolt <mi...@us.ibm.com> >>> Suggested-by: Paolo Bonzini <pbonz...@redhat.com> >>> --- >>> >>> This patch withstood the test case which discovered the problem >>> for several days (as reported by Michale Tebolt). >>> >>> v1 --> v2: >>> * Fixed typo pointed out by Connie >>> * Added Tested-by >> Hi Halil, >> >> I found a similar issue in NBD. Can you check if this patch fixes >> the virtio-blk issue too? >> >> Thanks, >> Paolo >> >> ------ 8< ------------ >> >> diff --git a/block.c b/block.c >> index f293ccb..e159251 100644 >> --- a/block.c >> +++ b/block.c >> @@ -4272,8 +4272,15 @@ void bdrv_attach_aio_context(BlockDriverState *bs, >> >> void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context) >> { >> + AioContext *ctx; >> + >> bdrv_drain(bs); /* ensure there are no in-flight requests */ >> >> + ctx = bdrv_get_aio_context(bs); >> + while (aio_poll(ctx, false)) { >> + /* wait for all bottom halves to execute */ >> + } >> + >> bdrv_detach_aio_context(bs); >> >> /* This function executes in the old AioContext so acquire the new one >> in >> >> > > So far so good! I will let it spin over the weekend but I think it's unlikely > something will turn up. > > I was wondering, would it make sense to push this logic into bdrv_drain? > (Along the lines: this looks much like tying up loose ends drain has left. > But I'm not sure about it.) >
I think it's safe to say that this fixes the virtio-blk issue too. Are you going to send a proper patch with this (or an equivalent) change? Halil