On Mon, 25 Mar 2019 at 11:02, Damien Hedde <damien.he...@greensocs.com> wrote: > > Hi all, > > This series is a proposal to implement the multi-phase reset we've discussed > here (https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00310.html) and > more recently there > (https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg00081.html). > > To summarize, we need a multi-phase reset to allow for a more complex > initialization of a platform. In particular we need to "propagate" a clock > tree, but we have to ensure that every device is initialized first.
Hi; I finally managed to get my thoughts about reset coherent enough to write down in an email. OVERVIEW I had a read through the patchset and spent a while trying to understand what we currently have. Our current (device) reset model is: * single-phase -- there is only one 'reset' method * implicit -- devices don't need to be explicitly registered anywhere in a "reset hierarchy"; instead they are reset by virtue of being in the bus heirarchy they are already in * bus-based -- sysbus reset is registered in vl.c; free-floating other buses[*] (ie those with a NULL parent) have a reset registered in qbus_realize; buses with parents (ie anything with a non-NULL parent passed to qbus_create() or qbus_create_inplace()) will get traversed by the recursive traversal of whatever bus their parent is on. * not exhaustive -- any devices not put on a bus, or put on a bus whose parent device is not on a bus, will not get reset * not modelling GPIO reset signal lines [*] It turns out we actually don't have any of these any more, so we can remove the code that deals with them. The only parentless bus is the main system bus, which is the root of the "reset hierarchy". This patchset is trying to address: * changing to multi-phase * modelling of GPIO reset signal lines It leaves reset as bus-based: currently we do this via qbus_walk_children/ qdev_walk_children, which traverse the bus->children and dev->child_bus lists, and in the patchset's implementation of Resettable for qdev/qbus the methods iterate through those. I think this is reasonable -- we don't want to try to tackle every reset related problem at once. The two issues the patchset is looking at fit well together, because the GPIO-reset-lines are a motivation for switching to multiphase (you need to handle both entering and leaving reset). ("not exhaustive" is the thing we should really try to fix at some point, but I have no good ideas for how to do this.) API DESIGN On what the right APIs should be: I think we should separate "the API that's nice for devices to implement" from "the API that's nice for callers wanting to reset a device". Here's my suggestion for doing that: Have the Resettable interface be: * init * hold * exit * get_reset_count * increment_reset_count (returns previous count) * decrement_reset_count (returns new count) * some method for "iterate over any Resettable child objects" (probably a "call this callback for each child" type API) Individual devices implement init/hold/exit Device base class implements the reset_count methods Device base class implements a default 'init' method that calls dc->reset and default hold/exit that are no-ops Device base class has a new vmstate subsection which migrates the reset count, subsection's needed function says "only send if reset count is non-zero". Back-compat here should be fine as current machines don't implement any way that a device in the system can be being held in reset at the point where a migration happens. If future changes add features to a machine model that permit that, the migration is forwards-compatible. Device base class implements the iterate-over-children method to iterate over the dev->child_bus buses Bus base class implements reset_count methods Bus base class implements default no-op init/hold/exit Bus base class implements the iterate-over-children method to iterate over the bus->children devices Handling migration of the bus base class reset count is a little awkward -- we'll need to put it in a vmstate subsection which is part of the vmstate of the device which owns the bus. (I talked to David Gilbert on IRC about this and have some possibilities for how to do it, but will postpone writing them out until we've decided whether we like the APIs.) The "API for devices to implement" is then the init/hold/exit methods of the Resettable interface -- they don't need to worry about these methods possibly being called multiple times, and they don't need to handle the reference count or passing on the calling of the phase methods to their children. They just need to implement the correct behaviour for their device for this phase. The "API for callers wanting to reset a device" is a set of helper functions that take a pointer to a Resettable object. It's these that deal with the reset count and children: resettable_assert_reset(Resettable *r) { if (r->increment_reset_count() == 0) { r->init(); r->foreach_child(do_call_init); r->hold(); r->foreach_child(do_call_hold); } } resettable_deassert_reset(Resettable *r) { if (r->decrement_reset_count() == 0) { r->foreach_child(do_call_exit); r->exit(); } } plus a utility function for "call assert then deassert", and maybe one that wraps the get_reset_count method. So callers that want to reset devices (or buses) don't need to care about phases, they just assert and then deassert reset. Do you think that works ? API TRANSITIONS The other issue here is API transitions: the patchset essentially obsoletes the old DeviceClass::reset function, for instance. I think we should be clear about what the old and new APIs here are, and what our plans for transitioning to the new ones are. In some cases there are really very few users of the old API -- for instance the patchset makes qbus_reset_all(bus) a synonym for qbus_reset(bus, false), but there are only a dozen or so users of qbus_reset_all(). I think we should just go ahead and convert them all. (For purposes of structuring the patchset starting with a patch that says "reimplement qbus_reset_all() in terms of qbus_reset()" is OK, but then we should fix up the callers afterwards.) There are of course a lot more implementations of DeviceState::reset so transitioning away from that is a lot trickier, but we could look at a coccinelle script that could automate it. If you could describe in the cover letter of the next version of the patchseries all the old APIs being deprecated and the new ones that replace them, I think that would be very helpful. MISCELLANEOUS MINOR POINTS I know I suggested the idea of a ResetDomain object, but in the series as it stands I'm not sure it's serving very much purpose; perhaps we should drop it for the moment (just leaving the legacy reset handlers and sysbus reset the way they are) ? We can come back to it later as a concept. The "support reset via GPIO line" patch looks generally OK, but you can't just add fields to the DeviceState vmstate struct -- you'll break migration from older QEMU versions. The new fields need to go in a vmstate subsection with an appropriate 'needed' function. We should definitely make sure we have good documentation for what device authors should do to implement reset. thanks -- PMM