On Mon, Sep 1, 2014 at 4:17 PM, Gonglei (Arei) <arei.gong...@huawei.com> wrote: > > > > > > > >> -----Original Message----- >> From: peter.crosthwa...@petalogix.com >> [mailto:peter.crosthwa...@petalogix.com] On Behalf Of Peter Crosthwaite >> Sent: Monday, September 01, 2014 2:08 PM >> To: Gonglei (Arei) >> Cc: qemu-devel@nongnu.org Developers; Huangweidong (C); Michael S. Tsirkin; >> Luonengjun; Huangpeng (Peter); Igor Mammedov; Paolo Bonzini; Andreas >> Färber >> Subject: Re: [Qemu-devel] [PATCH v3 2/3] qdev: add cleanup logic in >> device_set_realized() to avoid resource leak >> >> On Mon, Sep 1, 2014 at 3:50 PM, <arei.gong...@huawei.com> wrote: >> > From: Gonglei <arei.gong...@huawei.com> >> > >> > At present, this function doesn't have partial cleanup implemented, >> > which will cause resource leak in some scenarios. >> > >> > Example: >> > >> > 1. Assuming that "dc->realize(dev, &local_err)" execute successful >> > and local_err == NULL; >> > 2. Executing device hotplug in hotplug_handler_plug(), but failed >> > (It is prone to occur). Then local_err != NULL; >> > 3. error_propagate(errp, local_err) and return. But the resources >> > which been allocated in dc->realize() will be leaked. >> > Simple backtrace: >> > dc->realize() >> > |->device_realize >> > |->pci_qdev_init() >> > |->do_pci_register_device() >> > |->etc. >> > >> > Adding fuller cleanup logic which assure that function can >> > goto appropriate error label as local_err population is >> > detected as each relevant point. >> > >> > Signed-off-by: Gonglei <arei.gong...@huawei.com> >> > --- >> > hw/core/qdev.c | 73 >> ++++++++++++++++++++++++++++++++++++++++++---------------- >> > 1 file changed, 53 insertions(+), 20 deletions(-) >> > >> > diff --git a/hw/core/qdev.c b/hw/core/qdev.c >> > index 4a1ac5b..2a82cc1 100644 >> > --- a/hw/core/qdev.c >> > +++ b/hw/core/qdev.c >> > @@ -834,12 +834,14 @@ static void device_set_realized(Object *obj, bool >> value, Error **errp) >> > dc->realize(dev, &local_err); >> > } >> > >> > - if (dev->parent_bus && dev->parent_bus->hotplug_handler && >> > - local_err == NULL) { >> > + if (local_err != NULL) { >> > + goto fail; >> > + } >> > + >> > + if (dev->parent_bus && dev->parent_bus->hotplug_handler) { >> > hotplug_handler_plug(dev->parent_bus->hotplug_handler, >> > dev, &local_err); >> > - } else if (local_err == NULL && >> > - object_dynamic_cast(qdev_get_machine(), >> TYPE_MACHINE)) { >> > + } else if (object_dynamic_cast(qdev_get_machine(), >> TYPE_MACHINE)) { >> > HotplugHandler *hotplug_ctrl; >> > MachineState *machine = MACHINE(qdev_get_machine()); >> > MachineClass *mc = MACHINE_GET_CLASS(machine); >> > @@ -852,21 +854,25 @@ static void device_set_realized(Object *obj, bool >> value, Error **errp) >> > } >> > } >> > >> > - if (qdev_get_vmsd(dev) && local_err == NULL) { >> > + if (local_err != NULL) { >> > + goto post_realize_fail; >> > + } >> > + >> > + if (qdev_get_vmsd(dev)) { >> > vmstate_register_with_alias_id(dev, -1, >> qdev_get_vmsd(dev), dev, >> > >> dev->instance_id_alias, >> > >> dev->alias_required_for_version); >> > } >> > - if (local_err == NULL) { >> > - QLIST_FOREACH(bus, &dev->child_bus, sibling) { >> > - object_property_set_bool(OBJECT(bus), true, "realized", >> > - &local_err); >> > - if (local_err != NULL) { >> > - break; >> > - } >> > + >> > + QLIST_FOREACH(bus, &dev->child_bus, sibling) { >> > + object_property_set_bool(OBJECT(bus), true, "realized", >> > + &local_err); >> > + if (local_err != NULL) { >> > + goto set_realized_fail; >> >> should this label be child_realize_fail for clarity? The object is >> already realized by now so this name unqualified is a little >> misleading. >> >> > } >> > } >> > - if (dev->hotplugged && local_err == NULL) { >> > + >> > + if (dev->hotplugged) { >> > device_reset(dev); >> > } >> > dev->pending_deleted_event = false; >> > @@ -875,24 +881,51 @@ static void device_set_realized(Object *obj, bool >> value, Error **errp) >> > object_property_set_bool(OBJECT(bus), false, "realized", >> > &local_err); >> > if (local_err != NULL) { >> > - break; >> > + goto set_unrealized_fail; >> > } >> > } >> > - if (qdev_get_vmsd(dev) && local_err == NULL) { >> > + if (qdev_get_vmsd(dev)) { >> > vmstate_unregister(dev, qdev_get_vmsd(dev), dev); >> > } >> > - if (dc->unrealize && local_err == NULL) { >> > + if (dc->unrealize) { >> > dc->unrealize(dev, &local_err); >> > } >> > dev->pending_deleted_event = true; >> > - } >> > >> > - if (local_err != NULL) { >> > - error_propagate(errp, local_err); >> > - return; >> > + if (local_err != NULL) { >> > + goto fail; >> > + } >> > } >> > >> > dev->realized = value; >> > + return; >> > + >> > +set_unrealized_fail: >> >> child_unrealize_fail >> >> > + QLIST_FOREACH(bus, &dev->child_bus, sibling) { >> > + object_property_set_bool(OBJECT(bus), true, "realized", >> >> "true" doesn't look right to me. It means if the recursive unrealize >> botches it will re-realize all children? The idea behind my comment >> last spin was to forcefully unrealize all children regardless of >> early-iteration errors. I think set_realized_fail and set >> unrealized_fail should be the same logic. You should keep going with >> your cleanup on error rather than undo the cleanup already done. >> > I knew your mean, but forgot to explain in pervious email, sorry for this. > When one child unrealized failed, the interface will return error for user, > so user think the device existed yet in QEMU, but not hotplugged. > If we keep going with cleanup, the state saved by QEMU is not same with user's > thought.
This is also broken in original code right? So there's an argument to say that this case would be a fatal error, i.e. the hotplug interface should be &error_abort ing on the unrealize path. I don't think Qdev has a chance of being able to guarantee that a botched recursive unrealize has no side-effects so hotplug framework has no ability for sane recovery in these cases. Regards, Peter > > Best regards, > -Gonglei