On Jul  9 08:55, Klaus Jensen wrote:
On Jul  9 08:16, Hannes Reinecke wrote:
On 7/9/21 8:05 AM, Klaus Jensen wrote:
On Jul  7 17:49, Klaus Jensen wrote:
From: Klaus Jensen <k.jen...@samsung.com>

Back in May, Hannes posted a fix[1] to re-enable NVMe PCI hotplug. We
discussed a bit back and fourth and I mentioned that the core issue was
an artifact of the parent/child relationship stemming from the qdev
setup we have with namespaces attaching to controller through a qdev
bus.

The gist of this series is the fourth patch "hw/nvme: fix controller hot
unplugging" which basically causes namespaces to be reassigned to a bus
owned by the subsystem if the parent controller is linked to one. This
fixes `device_del/add nvme` in such settings.

Note, that in the case that there is no subsystem involved, nvme devices
can be removed from the system with `device_del`, but this *will* cause
the namespaces to be removed as well since there is no place (i.e. no
subsystem) for them to "linger". And since this series does not add
support for hotplugging nvme-ns devices, while an nvme device can be
readded, no namespaces can. Support for hotplugging nvme-ns devices is
present in [1], but I'd rather not add that since I think '-device
nvme-ns' is already a bad design choice.

Now, I do realize that it is not "pretty" to explicitly change the
parent bus, so I do have a an RFC patch in queue that replaces the
subsystem and namespace devices with objects, but keeps -device shims
available for backwards compatibility. This approach will solve the
problems properly and should be a better model. However, I don't believe
it will make it for 6.1 and I'd really like to at least fix the
unplugging for 6.1 and this gets the job done.

 [1]: 20210511073511.32511-1-h...@suse.de

v2:
- added R-b's by Hannes for patches 1 through 3
- simplified "hw/nvme: fix controller hot unplugging"

Klaus Jensen (4):
 hw/nvme: remove NvmeCtrl parameter from ns setup/check functions
 hw/nvme: mark nvme-subsys non-hotpluggable
 hw/nvme: unregister controller with subsystem at exit
 hw/nvme: fix controller hot unplugging

hw/nvme/nvme.h   | 18 +++++++++-------
hw/nvme/ctrl.c   | 14 ++++++------
hw/nvme/ns.c     | 55 +++++++++++++++++++++++++++++++-----------------
hw/nvme/subsys.c |  9 ++++++++
4 files changed, 63 insertions(+), 33 deletions(-)

--
2.32.0


Applied patches 1 through 3 to nvme-next.

So, how do we go about with patch 4?
Without it this whole exercise is a bit pointless, seeing that it doesn't fix anything.


Patch 1-3 are fixes we need anyway, so I thought I might as well apply them :)

Shall we go with that patch as an interim solution?
Will you replace it with your 'object' patch?
What is the plan?


Yes, if acceptable, I would like to use patch 4 as an interim solution. We have a bug we need to fix for 6.1, and I belive this does the job.

I considered changing the existing nvme-bus to be on the main system bus, but then we break the existing behavior that the namespaces attach to the most recently defined controller in the absence of the shared parameter or an explicit bus parameter.

Wrt. "the plan", right now, I see two solutions going forward:

1. Introduce new -object's for nvme-nvm-subsystem and nvme-ns
This is the approach that I am taking right now and it works well. It allows many-to-many relationships and separates the life times of subsystems, namespaces and controllers like you mentioned.

Conceptually, I also really like that the subsystem and namespace are not "devices". One could argue that the namespace is comparable to a SCSI LUN (-device scsi-hd, right?), but where the SCSI LUN actually "shows up" in the host, the nvme namespace does not.

My series handles backwards compatibility by keeping -device "shims" around that just wraps the new objects but behaves like it used to. The plan would be to deprecate these devices.

The downside to this approach is that it moves the subsystem and namespaces out of the "qdev tree (info qtree)" and into the pure QOM "/objects" tree. Instead of qtree, we can have QMP and HMP commands for introspection.

2. Make the subsystem a "system bus device"
This way we add an "nvme-nvm-subsystems" bus as a direct child of the main system bus, and we can possibly get rid of the explicit -device nvme-subsys as well. We change the namespace device to plug into that instead. The nvme controller device still needs to plug into the PCI bus, so it cannot be a child of the subsystems bus, but can keep using a link parameter to hook into the subsystem and attach to any namespaces it would like.

I'm unsure if we can do this without deprecating the existing namespace device, just like option 1.

I have not implemented this, so I need to look more into it. It seems like the main thing that this gives us compared to 1) is `info qtree` support and we still end up just "wiring" namespace attachment with backlinks anyway.

I'm not sure what I would prefer, but I've found that implemting it as -object's is a breath of fresh air and as I mentioned, conceptually, I like option 1 because namespaces are -objects and not -devices.

And, by the way, thanks for chipping in on this Hannes, I had sort of crossed off option 2 before you showed up and threw some ideas in the air ;)

Wow, my mailer screwed up that formatting, sorry about that.

Attachment: signature.asc
Description: PGP signature

Reply via email to