On Tue, 03/15 15:08, Paolo Bonzini wrote:
>
>
> On 15/03/2016 14:18, Cornelia Huck wrote:
> > On Tue, 15 Mar 2016 20:45:30 +0800
> > Fam Zheng <f...@redhat.com> wrote:
> >
> >> On Fri, 03/11 11:28, Paolo Bonzini wrote:
> >
> >>> But secondarily, I'm thinking of making the logic simpler to understand
> >>> in two ways:
> >>>
> >>> 1) adding a mutex around virtio_blk_data_plane_start/stop.
> >>>
> >>> 2) moving
> >>>
> >>>     event_notifier_set(virtio_queue_get_host_notifier(s->vq));
> >>>     virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
> >>>
> >>> to a bottom half (created with aio_bh_new in s->ctx).  The bottom half
> >>> takes the mutex, checks again "if (vblk->dataplane_started)" and if it's
> >>> true starts the processing.
> >>
> >> Like this? If it captures your idea, could Bo or Christian help test?
> >>
> >> ---
> >>
> >> From b5b8886693828d498ee184fc7d4e13d8c06cdf39 Mon Sep 17 00:00:00 2001
> >> From: Fam Zheng <f...@redhat.com>
> >> Date: Thu, 10 Mar 2016 10:26:36 +0800
> >> Subject: [PATCH] virtio-blk dataplane start crash fix
> >>
> >> Suggested-by: Paolo Bonzini <pbonz...@redhat.com>
> >> Signed-off-by: Fam Zheng <f...@redhat.com>
> >> ---
> >>  block.c                         |  4 +++-
> >>  hw/block/dataplane/virtio-blk.c | 39
> >> ++++++++++++++++++++++++++++++++-------
> >>  2 files changed, 35 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/block.c b/block.c
> >> index ba24b8e..e37e8f7 100644
> >> --- a/block.c
> >> +++ b/block.c
> >> @@ -4093,7 +4093,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
> >>
> >>  void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
> >>  {
> >> -    bdrv_drain(bs); /* ensure there are no in-flight requests */
> >> +    /* ensure there are no in-flight requests */
> >> +    bdrv_drained_begin(bs);
> >> +    bdrv_drained_end(bs);
>
> I'm not sure that this is necessary.  An empty section should be the
> same as plain old bdrv_drain.

Slightly different. This wraps the aio_poll of bdrv_drain with
aio_disable_external/aio_enable_external, which avoids a nested
virtio_blk_handle_output, as explained in my earlier message.

>
> >>      bdrv_detach_aio_context(bs);
> >>
> >> diff --git a/hw/block/dataplane/virtio-blk.c
> >> b/hw/block/dataplane/virtio-blk.c
> >> index 36f3d2b..6db5c22 100644
> >> --- a/hw/block/dataplane/virtio-blk.c
> >> +++ b/hw/block/dataplane/virtio-blk.c
> >> @@ -49,6 +49,8 @@ struct VirtIOBlockDataPlane {
> >>
> >>      /* Operation blocker on BDS */
> >>      Error *blocker;
> >> +
> >> +    QemuMutex start_lock;
> >>  };
> >>
> >>  /* Raise an interrupt to signal guest, if necessary */
> >> @@ -150,6 +152,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev,
> >> VirtIOBlkConf *conf,
> >>      s = g_new0(VirtIOBlockDataPlane, 1);
> >>      s->vdev = vdev;
> >>      s->conf = conf;
> >> +    qemu_mutex_init(&s->start_lock);
> >>
> >>      if (conf->iothread) {
> >>          s->iothread = conf->iothread;
> >> @@ -184,15 +187,38 @@ void
> >> virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
> >>      g_free(s);
> >>  }
> >>
> >> +typedef struct {
> >> +    VirtIOBlockDataPlane *s;
> >> +    QEMUBH *bh;
> >> +} VirtIOBlockStartData;
> >> +
> >> +static void virtio_blk_data_plane_start_bh_cb(void *opaque)
> >> +{
> >> +    VirtIOBlockStartData *data = opaque;
> >> +    VirtIOBlockDataPlane *s = data->s;
> >
> > Won't you need to check here whether ->started is still set?
>
> Yes.
>
> >> +
> >> +    /* Kick right away to begin processing requests already in vring */
> >> +    event_notifier_set(virtio_queue_get_host_notifier(s->vq));
> >> +
> >> +    /* Get this show started by hooking up our callbacks */
> >> +    virtio_queue_aio_set_host_notifier_handler(s->vq, s->ctx, true, true);
> >> +
> >> +    qemu_bh_delete(data->bh);
> >> +    g_free(data);
> >> +}
> >> +
> >>  /* Context: QEMU global mutex held */
> >>  void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
> >>  {
> >>      BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
> >>      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> >>      VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
> >> +    VirtIOBlockStartData *data;
> >>      int r;
> >>
> >> +    qemu_mutex_lock(&s->start_lock);
> >>      if (vblk->dataplane_started || s->starting) {
> >
> > Do we still need ->starting with the new mutex?
>
> No, but really we shouldn't have needed it before either. :)  So a task
> for another day.
>
> >> +        qemu_mutex_unlock(&s->start_lock);
> >>          return;
> >>      }
> >>
> >>  /* Context: QEMU global mutex held */
> >
> > Do you also need to do something in _stop()?
>
> _stop definitely needs to take the mutex too.

Will fix this and above and send as a top level email.

Fam
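
For anyone following the thread, the locking agreed on above (start and stop both take the mutex, and the bottom half re-checks the started flag before hooking anything up) can be sketched as a standalone toy model. This is only an illustration under stated assumptions, not QEMU code: ToyDataPlane, toy_start, toy_stop, toy_run_bh and toy_scenario are hypothetical names, a pthread mutex stands in for QemuMutex, a bh_pending flag stands in for a scheduled QEMUBH, and the started flag is set directly in toy_start for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VirtIOBlockDataPlane; not the real QEMU struct. */
typedef struct {
    pthread_mutex_t start_lock; /* plays the role of QemuMutex start_lock */
    bool started;               /* plays the role of vblk->dataplane_started */
    bool bh_pending;            /* models a scheduled bottom half */
    int handlers_attached;      /* counts host notifier handler hook-ups */
} ToyDataPlane;

static void toy_init(ToyDataPlane *s)
{
    int rc = pthread_mutex_init(&s->start_lock, NULL);
    assert(rc == 0);
    (void)rc;
    s->started = false;
    s->bh_pending = false;
    s->handlers_attached = 0;
}

/* start: mark as started and defer the hook-up to the bottom half. */
static void toy_start(ToyDataPlane *s)
{
    pthread_mutex_lock(&s->start_lock);
    if (!s->started) {
        s->started = true;
        s->bh_pending = true;
    }
    pthread_mutex_unlock(&s->start_lock);
}

/* stop: takes the same mutex, so it cannot interleave with the bottom half. */
static void toy_stop(ToyDataPlane *s)
{
    pthread_mutex_lock(&s->start_lock);
    s->started = false;
    s->handlers_attached = 0;
    pthread_mutex_unlock(&s->start_lock);
}

/* Bottom half: runs later, so it re-checks the flag under the mutex. */
static void toy_run_bh(ToyDataPlane *s)
{
    pthread_mutex_lock(&s->start_lock);
    if (s->bh_pending) {
        s->bh_pending = false;
        if (s->started) {
            s->handlers_attached++;
        }
    }
    pthread_mutex_unlock(&s->start_lock);
}

/* Runs start, optionally stop, then the bottom half; returns hook-up count. */
static int toy_scenario(bool stop_before_bh)
{
    ToyDataPlane s;
    int attached;

    toy_init(&s);
    toy_start(&s);
    if (stop_before_bh) {
        toy_stop(&s);
    }
    toy_run_bh(&s);
    attached = s.handlers_attached;
    pthread_mutex_destroy(&s.start_lock);
    return attached;
}
```

In this model, toy_scenario(false) hooks up the handler once, while toy_scenario(true) leaves it unattached because the bottom half sees that stop already ran. That is the case Cornelia's "is ->started still set?" question is about.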