On 17/1/24 18:46, Peter Maydell wrote:
On Wed, 17 Jan 2024 at 09:16, <pet...@redhat.com> wrote:

From: Peter Xu <pet...@redhat.com>

QEMU resets do not have a way to order reset hooks.  Add one coarse grained
reset stage so that some devices can be reset later than some others.

Signed-off-by: Peter Xu <pet...@redhat.com>
---
  include/sysemu/reset.h |  5 ++++
  hw/core/reset.c        | 60 +++++++++++++++++++++++++++++++-----------
  2 files changed, 49 insertions(+), 16 deletions(-)

diff --git a/include/sysemu/reset.h b/include/sysemu/reset.h
index 609e4d50c2..0de697ce9f 100644
--- a/include/sysemu/reset.h
+++ b/include/sysemu/reset.h
@@ -5,9 +5,14 @@

  typedef void QEMUResetHandler(void *opaque);

+#define  QEMU_RESET_STAGES_N  2
+

Our reset handling APIs are already pretty complicated, and
raw qemu_register_reset() is kind of a legacy API that I would
prefer that we try to move away from, not add extra complexity to.

Our device reset design already has a multiple-phase system
(see docs/devel/reset.rst), part of the point of which is to
try to give us a way to deal with reset-ordering problems.
I feel like the right way to handle the issue you're trying to
address is to ensure that the thing that needs to happen last is
done in the 'exit' phase rather than the 'hold' phase (which is
where legacy reset methods get called).

I concur. Devices reset is hard, but bus reset is even harder.
Having a quick look, the issues tracked by Alex & Peter might
come from the PCI bridges using the legacy DeviceReset. In
particular functions like:

- hw/pci-bridge/pcie_root_port.c

 46 static void rp_reset_hold(Object *obj)
 47 {
 48     PCIDevice *d = PCI_DEVICE(obj);
 49     DeviceState *qdev = DEVICE(obj);
 50
 51     rp_aer_vector_update(d);
 52     pcie_cap_root_reset(d);
 53     pcie_cap_deverr_reset(d);
 54     pcie_cap_slot_reset(d);
 55     pcie_cap_arifwd_reset(d);
 56     pcie_acs_reset(d);
 57     pcie_aer_root_reset(d);
 58     pci_bridge_reset(qdev);
 59     pci_bridge_disable_base_limit(d);
 60 }

- hw/pci-bridge/pcie_pci_bridge.c

107 static void pcie_pci_bridge_reset(DeviceState *qdev)
108 {
109     PCIDevice *d = PCI_DEVICE(qdev);
110     pci_bridge_reset(qdev);
111     if (msi_present(d)) {
112         msi_reset(d);
113     }
114     shpc_reset(d);
115 }

- hw/pci-bridge/xio3130_downstream.c

 56 static void xio3130_downstream_reset(DeviceState *qdev)
 57 {
 58     PCIDevice *d = PCI_DEVICE(qdev);
 59
 60     pcie_cap_deverr_reset(d);
 61     pcie_cap_slot_reset(d);
 62     pcie_cap_arifwd_reset(d);
 63     pci_bridge_reset(qdev);
 64 }

should really be split and converted as ResettablePhases.

pci_bridge_reset() is likely a ResettableExitPhase one.

There are some annoying wrinkles here, notably that legacy
qemu_register_reset() doesn't support 3-phase reset and so
the phasing only happens for devices reset via the device/bus
tree hierarchy. But I think the way to go is to try to move
forward with that design (i.e. expand 3-phase reset to
qemu_register_reset() and/or move things using qemu_register_reset()
to device reset where that makes sense), not to ignore it and
put a completely different reset-ordering API in a different place.

Unfortunately despite DeviceReset being deprecated since 4 years
in commit c11256aa6f ("hw/core: add Resettable support to BusClass
and DeviceClass"), we keep adding code using this legacy API; for
example in the last 4 months:

- e029bb00a7 ("hw/pci-host: Add Astro system bus adapter found on PA-RISC machines")
- 2880e676c0 ("Add virtio-sound device stub")
- 4a58330343 ("hw/cxl: Add a switch mailbox CCI function")
- 95e1019a4a ("vhost-user-scsi: free the inflight area when reset")
- 6f9c3aaa34 ("fsl-imx: add simple RTC emulation for i.MX6 and i.MX7 boards")

Regards,

Phil.

Reply via email to