Sorry for the long delay, I was out of action for a week. Ryan Harper <ry...@us.ibm.com> writes:
> When removing a drive from the host-side via drive_del we currently have the > following path: > > drive_del > qemu_aio_flush() > bdrv_close() > drive_uninit() > bdrv_delete() > > When we bdrv_delete() we end up qemu_free() the BlockDriverState pointer > however, the block devices retain a copy of this pointer, see > hw/virtio-blk.c:virtio_blk_init() where we s->bs = conf->bs. > > We now have a use-after-free situation. If the guest continues to issue IO > against the device, and we've reallocated the memory that the BlockDriverState > pointed at, then we will fail the bs->drv checks in the various bdrv_ methods. "we will fail the bs->drv checks" is misleading, in my opinion. Here's what happens: 1. bdrv_close(bs) zaps bs->drv, which makes any subsequent I/O get dropped. Works as designed. 2. drive_uninit() frees the bs. Since the device is still connected to bs, any subsequent I/O is a use-after-free. The value of bs->drv becomes unpredictable on free. As long as it remains null, I/O still gets dropped. I/O crashes or worse once that changed. Could be right on free, could be much later. If you respin anyway, please clarify your description. > To resolve this issue as simply as possible, we can chose to not actually > delete the BlockDriverState pointer. Since bdrv_close() handles setting the > drv > pointer to NULL, we just need to remove the BlockDriverState from the QLIST > that is used to enumerate the block devices. This is currently handled within > bdrv_delete, so move this into it's own function, bdrv_remove(). Why do we remove the BlockDriverState from bdrv_states? Because we want drive_del make its *name* go away. Begs the question: is the code prepared for a BlockDriverState object that isn't on bdrv_states? Turns out we're in luck: bdrv_new() already creates such objects when the device_name is empty. This is used for internal BlockDriverStates such as COW backing files. Your code makes device_name empty when taking the object off bdrv_states, so we're good. Begs yet another question: how does the behavior of a BlockDriverState change when it's taken off bdrv_states, and is that the behavior we want? Changes: * bdrv_delete() no longer takes it off bdrv_states. Good. * bdrv_close_all(), bdrv_commit_all() and bdrv_flush_all() no longer cover it. Okay, because bdrv_close(), bdrv_commit() and bdrv_flush() do nothing anyway for closed BlockDriverStates. * "info block" and "info blockstats" no longer show it, because bdrv_info() and bdrv_info_stats() no longer see it. Okay. * bdrv_find(), bdrv_next(), bdrv_iterate() no longer see it. Impact? Please check their uses and report. > The result is that we can now invoke drive_del, this closes the file > descriptors > and sets BlockDriverState->drv to NULL which prevents futher IO to the device, > and since we do not free BlockDriverState, we don't have to worry about the > copy > retained in the block devices. Yep. But there's one more question: is the BlockDriverState freed when the device using it gets destroyed? qdev_free() runs prop->info->free() for all properties. The drive property's free() is free_drive(): static void free_drive(DeviceState *dev, Property *prop) { BlockDriverState **ptr = qdev_get_prop_ptr(dev, prop); if (*ptr) { bdrv_detach(*ptr, dev); blockdev_auto_del(*ptr); } } This should indeed delete the drive. But only if the property still points to it. See below. > Reported-by: Marcus Armbruster <arm...@redhat.com> > Signed-off-by: Ryan Harper <ry...@us.ibm.com> > --- > v1->v2 > - NULL bs->device_name after removing from list to prevent > second removal. > > block.c | 12 +++++++++--- > block.h | 1 + > blockdev.c | 2 +- > 3 files changed, 11 insertions(+), 4 deletions(-) > > diff --git a/block.c b/block.c > index 1544d81..0df9942 100644 > --- a/block.c > +++ b/block.c > @@ -697,14 +697,20 @@ void bdrv_close_all(void) > } > } > > +void bdrv_remove(BlockDriverState *bs) > +{ > + if (bs->device_name[0] != '\0') { > + QTAILQ_REMOVE(&bdrv_states, bs, list); > + } > + bs->device_name[0] = '\0'; > +} > + I don't like this name. What's the difference between "delete" and "remove"? The function zaps the device name. bdrv_make_anon()? > void bdrv_delete(BlockDriverState *bs) > { > assert(!bs->peer); > > /* remove from list, if necessary */ > - if (bs->device_name[0] != '\0') { > - QTAILQ_REMOVE(&bdrv_states, bs, list); > - } > + bdrv_remove(bs); > > bdrv_close(bs); > if (bs->file != NULL) { > diff --git a/block.h b/block.h > index 5d78fc0..8447397 100644 > --- a/block.h > +++ b/block.h > @@ -66,6 +66,7 @@ int bdrv_create(BlockDriver *drv, const char* filename, > QEMUOptionParameter *options); > int bdrv_create_file(const char* filename, QEMUOptionParameter *options); > BlockDriverState *bdrv_new(const char *device_name); > +void bdrv_remove(BlockDriverState *bs); > void bdrv_delete(BlockDriverState *bs); > int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); > int bdrv_open(BlockDriverState *bs, const char *filename, int flags, > diff --git a/blockdev.c b/blockdev.c > index 0690cc8..1f57b50 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -760,7 +760,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, > QObject **ret_data) Let me add a bit more context: bdrv_flush(bs); bdrv_close(bs); /* clean up guest state from pointing to host resource by * finding and removing DeviceState "drive" property */ if (bs->peer) { for (prop = bs->peer->info->props; prop && prop->name; prop++) { if (prop->info->type == PROP_TYPE_DRIVE) { ptr = qdev_get_prop_ptr(bs->peer, prop); if (*ptr == bs) { bdrv_detach(bs, bs->peer); *ptr = NULL; break; } } } > } This zaps the drive property. A subsequent free_drive() will do nothing. We leak the BlockDriverState on device unplug, I'm afraid. Any reason why we still want to zap the drive property? > > /* clean up host side */ > - drive_uninit(drive_get_by_blockdev(bs)); > + bdrv_remove(bs); > > return 0; > }