[PATCH v3 1/1] vfio-user: introduce vfio-user protocol specification
From: Thanos Makatos This patch introduces the vfio-user protocol specification (formerly known as VFIO-over-socket), which is designed to allow devices to be emulated outside QEMU, in a separate process. vfio-user reuses the existing VFIO defines, structs and concepts. It has been earlier discussed as an RFC in: "RFC: use VFIO over a UNIX domain socket to implement device offloading" Signed-off-by: John G Johnson Signed-off-by: Thanos Makatos Signed-off-by: John Levon --- MAINTAINERS|4 +- docs/devel/index-internals.rst |1 + docs/devel/vfio-user.rst | 1522 3 files changed, 1526 insertions(+), 1 deletion(-) create mode 100644 docs/devel/vfio-user.rst diff --git a/MAINTAINERS b/MAINTAINERS index aba07722f64f..70499379c7ca 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3791,11 +3791,13 @@ F: include/semihosting/ F: tests/tcg/multiarch/arm-compat-semi/ F: tests/tcg/aarch64/system/semiheap.c -Multi-process QEMU +Multi-process QEMU / vfio-user M: Elena Ufimtseva M: Jagannathan Raman +M: Thanos Makatos S: Maintained F: docs/devel/multi-process.rst +F: docs/devel/vfio-user.rst F: docs/system/multi-process.rst F: hw/pci-host/remote.c F: include/hw/pci-host/remote.h diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst index e1a93df26392..0ecb5c6301d8 100644 --- a/docs/devel/index-internals.rst +++ b/docs/devel/index-internals.rst @@ -17,5 +17,6 @@ Details about QEMU's various subsystems including how to add features to them. s390-dasd-ipl tracing vfio-migration + vfio-user writing-monitor-commands virtio-backends diff --git a/docs/devel/vfio-user.rst b/docs/devel/vfio-user.rst new file mode 100644 index ..0d96477a68b4 --- /dev/null +++ b/docs/devel/vfio-user.rst @@ -0,0 +1,1522 @@ +.. include:: + +vfio-user Protocol Specification + + +-- +Version_ 0.9.1 +-- + +.. contents:: Table of Contents + +Introduction + +vfio-user is a protocol that allows a device to be emulated in a separate +process outside of a Virtual Machine Monitor (VMM). 
vfio-user devices consist +of a generic VFIO device type, living inside the VMM, which we call the client, +and the core device implementation, living outside the VMM, which we call the +server. + +The vfio-user specification is partly based on the +`Linux VFIO ioctl interface <https://www.kernel.org/doc/html/latest/driver-api/vfio.html>`_. + +VFIO is a mature and stable API, backed by an extensively used framework. The +existing VFIO client implementation in QEMU (``qemu/hw/vfio/``) can be largely +re-used, though there is nothing in this specification that requires that +particular implementation. None of the VFIO kernel modules are required for +supporting the protocol, on either the client or server side. Some source +definitions in VFIO are re-used for vfio-user. + +The main idea is to allow a virtual device to function in a separate process in +the same host over a UNIX domain socket. A UNIX domain socket (``AF_UNIX``) is +chosen because file descriptors can be trivially sent over it, which in turn +allows: + +* Sharing of client memory for DMA with the server. +* Sharing of server memory with the client for fast MMIO. +* Efficient sharing of eventfd's for triggering interrupts. + +Other socket types could be used which allow the server to run in a separate +guest in the same host (``AF_VSOCK``) or remotely (``AF_INET``). Theoretically +the underlying transport does not necessarily have to be a socket, however we do +not examine such alternatives. In this protocol version we focus on using a UNIX +domain socket and introduce basic support for the other two types of sockets +without considering performance implications. + +While passing of file descriptors is desirable for performance reasons, support +is not necessary for either the client or the server in order to implement the +protocol. There is always an in-band, message-passing fall back mechanism. 
+ +Overview + + +VFIO is a framework that allows a physical device to be securely passed through +to a user space process; the device-specific kernel driver does not drive the +device at all. Typically, the user space process is a VMM and the device is +passed through to it in order to achieve high performance. VFIO provides an API +and the required functionality in the kernel. QEMU has adopted VFIO to allow a +guest to directly access physical devices, instead of emulating them in +software. + +vfio-user reuses the core VFIO concepts defined in its API, but implements them +as messages to be sent over a socket. It does not change the kernel-based VFIO +in any way, in fact none of the VFIO kernel modules need to be loaded to use +vfio-user. It is also possible for the client to concurrently use the current +kernel-based VFIO for one device,
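An aside on the introduction quoted above: the specification chooses AF_UNIX because file descriptors can be sent over it. The sketch below shows how that is usually done on Linux with an SCM_RIGHTS control message. It is only an illustration under stated assumptions: the helper names send_fd() and recv_fd() are hypothetical and do not come from the specification or from libvfio-user, and real vfio-user messages carry a protocol header rather than the single dummy byte used here.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one int, aligned for struct cmsghdr. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Hypothetical helper: send one fd over a connected AF_UNIX socket. */
static int send_fd(int sock, int fd)
{
    char byte = 0;                      /* one byte of ordinary payload */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    /* The fd travels in the ancillary data, not in the payload. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd; returns the new fd or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }

    /* The kernel installed a new descriptor in this process. */
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open file as the sender's, which is what would let a client share DMA memory or an eventfd with a server without copying data through the socket.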
[PATCH v3 0/1] introduce vfio-user protocol specification
Hi,

This patch is a continuation of the following patch that John Johnson sent out
for review already:

[PATCH v2 01/23] vfio-user: introduce vfio-user protocol specification
Message-Id:

We have separated this patch from the original vfio-user client series. We will
send the other patches in that series in about two weeks.

v2 -> v3:
- MAINTAINERS: Combined the vfio-user and Multi-process QEMU sections and named
  the result "Multi-process QEMU / vfio-user", as that section already refers
  to the vfio-user files.
- We will remove multiprocess support after the vfio-user client gets through
  and rename the section "vfio-user".

Thank you!

Thanos Makatos (1):
  vfio-user: introduce vfio-user protocol specification

 MAINTAINERS                    |    4 +-
 docs/devel/index-internals.rst |    1 +
 docs/devel/vfio-user.rst       | 1522
 3 files changed, 1526 insertions(+), 1 deletion(-)
 create mode 100644 docs/devel/vfio-user.rst

--
2.20.1
[PULL 0/1] maintainers queue
The following changes since commit 45ae97993a75f975f1a01d25564724c7e10a543f:

  Merge tag 'pull-tricore-20230607' of https://github.com/bkoppelmann/qemu into staging (2023-06-07 11:45:22 -0700)

are available in the Git repository at:

  https://gitlab.com/jraman/qemu.git tags/pull-maintainers-20230608

for you to fetch changes up to c45309f7a40083e5034fcb19e27e3c0b1b5ec6cd:

  maintainers: update maintainers list for vfio-user & multi-process QEMU (2023-06-08 14:16:08 -0400)

maintainers: update maintainers list for vfio-user & multi-process QEMU

Signed-off-by: Jagannathan Raman

--------
Jagannathan Raman (1):
      maintainers: update maintainers list for vfio-user & multi-process QEMU

 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)
[PULL 1/1] maintainers: update maintainers list for vfio-user & multi-process QEMU
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
Tested-by: Philippe Mathieu-Daudé
Reviewed-by: Philippe Mathieu-Daudé
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 436b3f0afefd..4a80a385118d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3786,7 +3786,6 @@ F: tests/tcg/aarch64/system/semiheap.c
 Multi-process QEMU
 M: Elena Ufimtseva
 M: Jagannathan Raman
-M: John G Johnson
 S: Maintained
 F: docs/devel/multi-process.rst
 F: docs/system/multi-process.rst
--
2.20.1
[PATCH 1/1] maintainers: update maintainers list for vfio-user & multi-process QEMU
Signed-off-by: Jagannathan Raman
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 436b3f0afefd..4a80a385118d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3786,7 +3786,6 @@ F: tests/tcg/aarch64/system/semiheap.c
 Multi-process QEMU
 M: Elena Ufimtseva
 M: Jagannathan Raman
-M: John G Johnson
 S: Maintained
 F: docs/devel/multi-process.rst
 F: docs/system/multi-process.rst
--
2.20.1
[PATCH 0/1] update maintainers list for vfio-user & multi-process QEMU
John Johnson doesn't work at Oracle anymore. I tried to contact him to get
his updated email address, but I haven't heard anything from him.

Jagannathan Raman (1):
  maintainers: update maintainers list for vfio-user & multi-process QEMU

 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

--
2.20.1
[PULL 1/2] vfio-user: update comments
Clarify the behavior of TYPE_VFU_OBJECT when TYPE_REMOTE_MACHINE enables
the auto-shutdown property. Also, add notes to VFU_OBJECT_ERROR.

Signed-off-by: Jagannathan Raman
Reviewed-by: Markus Armbruster
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/vfio-user-obj.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 88ffafc73e56..8b10c32a3c6e 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -30,6 +30,11 @@
  *
  * notes - x-vfio-user-server could block IO and monitor during the
  *         initialization phase.
+ *
+ *         When x-remote machine has the auto-shutdown property
+ *         enabled (default), x-vfio-user-server terminates after the last
+ *         client disconnects. Otherwise, it will continue running until
+ *         explicitly killed.
  */
 
 #include "qemu/osdep.h"
@@ -61,9 +66,12 @@ OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
 
 /**
- * VFU_OBJECT_ERROR - reports an error message. If auto_shutdown
- * is set, it aborts the machine on error. Otherwise, it logs an
- * error message without aborting.
+ * VFU_OBJECT_ERROR - reports an error message.
+ *
+ * If auto_shutdown is set, it aborts the machine on error. Otherwise,
+ * it logs an error message without aborting. auto_shutdown is disabled
+ * when the server serves clients from multiple VMs; as such, an error
+ * from one VM shouldn't be able to disrupt other VM's services.
  */
 #define VFU_OBJECT_ERROR(o, fmt, ...) \
     { \
--
2.20.1
[PULL 0/2] vfio-user queue
The following changes since commit f5e6786de4815751b0a3d2235c760361f228ea48:

  Merge tag 'pull-target-arm-20230606' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2023-06-06 12:11:34 -0700)

are available in the Git repository at:

  https://gitlab.com/jraman/qemu.git tags/pull-vfio-user-20230607

for you to fetch changes up to 7771e8b86335968ee46538d1afd44246e7a062bc:

  docs: fix multi-process QEMU documentation (2023-06-07 10:21:53 -0400)

vfio-user: Fix the documentation for vfio-user and multi-process QEMU

Signed-off-by: Jagannathan Raman

----
Jagannathan Raman (2):
      vfio-user: update comments
      docs: fix multi-process QEMU documentation

 docs/system/multi-process.rst |  2 +-
 hw/remote/vfio-user-obj.c     | 14 +++++++++++---
 2 files changed, 12 insertions(+), 4 deletions(-)
[PULL 2/2] docs: fix multi-process QEMU documentation
Fix a typo in the system documentation for multi-process QEMU.

Signed-off-by: Jagannathan Raman
Reviewed-by: Markus Armbruster
Reviewed-by: Stefan Hajnoczi
---
 docs/system/multi-process.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/system/multi-process.rst b/docs/system/multi-process.rst
index 16f0352416bc..2008a6780953 100644
--- a/docs/system/multi-process.rst
+++ b/docs/system/multi-process.rst
@@ -4,7 +4,7 @@ Multi-process QEMU
 ==================
 
 This document describes how to configure and use multi-process qemu.
-For the design document refer to docs/devel/qemu-multiprocess.
+For the design document refer to docs/devel/multi-process.rst.
 
 1) Configuration
 ----------------
--
2.20.1
[PATCH v1 0/2] Fix the documentation for vfio-user and multi-process QEMU
This series addresses recent comments from Markus Armbruster in the
"Machine x-remote property auto-shutdown" email thread.

Jagannathan Raman (2):
  vfio-user: update comments
  docs: fix multi-process QEMU documentation

 docs/system/multi-process.rst |  2 +-
 hw/remote/vfio-user-obj.c     | 14 +++++++++++---
 2 files changed, 12 insertions(+), 4 deletions(-)

--
2.20.1
[PATCH v1 2/2] docs: fix multi-process QEMU documentation
Fix a typo in the system documentation for multi-process QEMU.

Signed-off-by: Jagannathan Raman
---
 docs/system/multi-process.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/system/multi-process.rst b/docs/system/multi-process.rst
index 16f0352416..2008a67809 100644
--- a/docs/system/multi-process.rst
+++ b/docs/system/multi-process.rst
@@ -4,7 +4,7 @@ Multi-process QEMU
 ==================
 
 This document describes how to configure and use multi-process qemu.
-For the design document refer to docs/devel/qemu-multiprocess.
+For the design document refer to docs/devel/multi-process.rst.
 
 1) Configuration
 ----------------
--
2.20.1
[PATCH v1 1/2] vfio-user: update comments
Clarify the behavior of TYPE_VFU_OBJECT when TYPE_REMOTE_MACHINE enables
the auto-shutdown property. Also, add notes to VFU_OBJECT_ERROR.

Signed-off-by: Jagannathan Raman
---
 hw/remote/vfio-user-obj.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 88ffafc73e..8b10c32a3c 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -30,6 +30,11 @@
  *
  * notes - x-vfio-user-server could block IO and monitor during the
  *         initialization phase.
+ *
+ *         When x-remote machine has the auto-shutdown property
+ *         enabled (default), x-vfio-user-server terminates after the last
+ *         client disconnects. Otherwise, it will continue running until
+ *         explicitly killed.
  */
 
 #include "qemu/osdep.h"
@@ -61,9 +66,12 @@ OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
 
 /**
- * VFU_OBJECT_ERROR - reports an error message. If auto_shutdown
- * is set, it aborts the machine on error. Otherwise, it logs an
- * error message without aborting.
+ * VFU_OBJECT_ERROR - reports an error message.
+ *
+ * If auto_shutdown is set, it aborts the machine on error. Otherwise,
+ * it logs an error message without aborting. auto_shutdown is disabled
+ * when the server serves clients from multiple VMs; as such, an error
+ * from one VM shouldn't be able to disrupt other VM's services.
  */
 #define VFU_OBJECT_ERROR(o, fmt, ...) \
     { \
--
2.20.1
[PATCH 1/1] vfio-user: update submodule to latest
Update the libvfio-user submodule to the latest commit.

Signed-off-by: Jagannathan Raman
---
 subprojects/libvfio-user | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/subprojects/libvfio-user b/subprojects/libvfio-user
index 0b28d20557..1305f161b7 160000
--- a/subprojects/libvfio-user
+++ b/subprojects/libvfio-user
@@ -1 +1 @@
-Subproject commit 0b28d205572c80b568a1003db2c8f37ca333e4d7
+Subproject commit 1305f161b7e0dd2c2a420c17efcb0bd49b94dad4
--
2.20.1
[PATCH 0/1] Update vfio-user module to the latest
Hi,

This patch updates the libvfio-user submodule to the latest. Passed
'make check' & GitLab CI.

Thank you!
--
Jag

Jagannathan Raman (1):
  vfio-user: update submodule to latest

 subprojects/libvfio-user | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.20.1
[PATCH] msi: fix MSI vector limit check in msi_set_mask()
MSI supports a maximum of PCI_MSI_VECTORS_MAX vectors - from 0 to
PCI_MSI_VECTORS_MAX - 1. msi_set_mask() was previously using
PCI_MSI_VECTORS_MAX as the upper limit for MSI vectors. Fix the upper limit
to PCI_MSI_VECTORS_MAX - 1.

Fixes: Coverity CID 1490141
Fixes: 08cf3dc61199 vfio-user: handle device interrupts
Signed-off-by: Jagannathan Raman
---
 hw/pci/msi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index 5c471b9616..058d1d1ef1 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -322,9 +322,9 @@ void msi_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp)
     bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
     uint32_t irq_state, vector_mask, pending;
 
-    if (vector > PCI_MSI_VECTORS_MAX) {
+    if (vector >= PCI_MSI_VECTORS_MAX) {
         error_setg(errp, "msi: vector %d not allocated. max vector is %d",
-                   vector, PCI_MSI_VECTORS_MAX);
+                   vector, (PCI_MSI_VECTORS_MAX - 1));
         return;
     }
 
--
2.20.1
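The off-by-one fixed by the msi_set_mask() patch above can be shown in isolation. This is a minimal sketch, not QEMU code: MSI_VECTORS_MAX is assumed to be 32 for illustration, and the two helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed limit for illustration: valid vectors are 0 .. MSI_VECTORS_MAX - 1. */
#define MSI_VECTORS_MAX 32

/* Old check: lets vector == MSI_VECTORS_MAX through, one past the last index. */
static bool vector_rejected_old(int vector)
{
    return vector > MSI_VECTORS_MAX;
}

/* Fixed check: rejects every vector at or above the limit. */
static bool vector_rejected_new(int vector)
{
    return vector >= MSI_VECTORS_MAX;
}
```

With the old comparison, vector 32 is accepted even though only vectors 0 to 31 exist; the fixed comparison rejects it while still accepting vector 31.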
[PATCH v12 05/14] vfio-user: define vfio-user-server object
Define vfio-user object which is remote process server for QEMU. Setup object initialization functions and properties necessary to instantiate the object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/qom.json | 20 +++- include/hw/remote/machine.h | 2 + hw/remote/machine.c | 27 + hw/remote/vfio-user-obj.c | 210 MAINTAINERS | 1 + hw/remote/meson.build | 1 + hw/remote/trace-events | 3 + 7 files changed, 262 insertions(+), 2 deletions(-) create mode 100644 hw/remote/vfio-user-obj.c diff --git a/qapi/qom.json b/qapi/qom.json index 6a653c6636..80dd419b39 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -734,6 +734,20 @@ { 'struct': 'RemoteObjectProperties', 'data': { 'fd': 'str', 'devid': 'str' } } +## +# @VfioUserServerProperties: +# +# Properties for x-vfio-user-server objects. +# +# @socket: socket to be used by the libvfio-user library +# +# @device: the ID of the device to be emulated at the server +# +# Since: 7.1 +## +{ 'struct': 'VfioUserServerProperties', + 'data': { 'socket': 'SocketAddress', 'device': 'str' } } + ## # @RngProperties: # @@ -874,7 +888,8 @@ 'tls-creds-psk', 'tls-creds-x509', 'tls-cipher-suites', -{ 'name': 'x-remote-object', 'features': [ 'unstable' ] } +{ 'name': 'x-remote-object', 'features': [ 'unstable' ] }, +{ 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } ] } ## @@ -938,7 +953,8 @@ 'tls-creds-psk': 'TlsCredsPskProperties', 'tls-creds-x509': 'TlsCredsX509Properties', 'tls-cipher-suites': 'TlsCredsProperties', - 'x-remote-object':'RemoteObjectProperties' + 'x-remote-object':'RemoteObjectProperties', + 'x-vfio-user-server': 'VfioUserServerProperties' } } ## diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 8d0fa98d33..ac32fda387 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -24,6 +24,8 @@ struct RemoteMachineState { RemoteIOHubState iohub; bool vfio_user; + +bool auto_shutdown; }; /* 
Used to pass to co-routine device and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index 9f3cdc55c3..4d008ed721 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -77,6 +77,28 @@ static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) s->vfio_user = value; } +static bool remote_machine_get_auto_shutdown(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->auto_shutdown; +} + +static void remote_machine_set_auto_shutdown(Object *obj, bool value, + Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = value; +} + +static void remote_machine_instance_init(Object *obj) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = true; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -90,12 +112,17 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) object_class_property_add_bool(oc, "vfio-user", remote_machine_get_vfio_user, remote_machine_set_vfio_user); + +object_class_property_add_bool(oc, "auto-shutdown", + remote_machine_get_auto_shutdown, + remote_machine_set_auto_shutdown); } static const TypeInfo remote_machine = { .name = TYPE_REMOTE_MACHINE, .parent = TYPE_MACHINE, .instance_size = sizeof(RemoteMachineState), +.instance_init = remote_machine_instance_init, .class_init = remote_machine_class_init, .interfaces = (InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c new file mode 100644 index 00..bc49adcc27 --- /dev/null +++ b/hw/remote/vfio-user-obj.c @@ -0,0 +1,210 @@ +/** + * QEMU vfio-user-server server object + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL-v2, version 2 or later. + * + * See the COPYING file in the top-level directory. + * + */ + +/** + * Usage: add options: + * -machine x-remote,vfi
[PATCH v12 07/14] vfio-user: find and init PCI device
Find the PCI device with specified id. Initialize the device context with the QEMU PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 67 +++ 1 file changed, 67 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 68f8a9dfa9..3ca6aa2b45 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -43,6 +43,8 @@ #include "qemu/notify.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" +#include "hw/qdev-core.h" +#include "hw/pci/pci.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -80,6 +82,10 @@ struct VfuObject { Notifier machine_done; vfu_ctx_t *vfu_ctx; + +PCIDevice *pci_dev; + +Error *unplug_blocker; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -181,6 +187,9 @@ static void vfu_object_machine_done(Notifier *notifier, void *data) static void vfu_object_init_ctx(VfuObject *o, Error **errp) { ERRP_GUARD(); +DeviceState *dev = NULL; +vfu_pci_type_t pci_type = VFU_PCI_TYPE_CONVENTIONAL; +int ret; if (o->vfu_ctx || !o->socket || !o->device || !phase_check(PHASE_MACHINE_READY)) { @@ -199,6 +208,53 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); return; } + +dev = qdev_find_recursive(sysbus_get_default(), o->device); +if (dev == NULL) { +error_setg(errp, "vfu: Device %s not found", o->device); +goto fail; +} + +if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { +error_setg(errp, "vfu: %s not a PCI device", o->device); +goto fail; +} + +o->pci_dev = PCI_DEVICE(dev); + +object_ref(OBJECT(o->pci_dev)); + +if (pci_is_express(o->pci_dev)) { +pci_type = VFU_PCI_TYPE_EXPRESS; +} + +ret = vfu_pci_init(o->vfu_ctx, pci_type, PCI_HEADER_TYPE_NORMAL, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to attach PCI device %s to context - %s", + 
o->device, strerror(errno)); +goto fail; +} + +error_setg(&o->unplug_blocker, + "vfu: %s for %s must be deleted before unplugging", + TYPE_VFU_OBJECT, o->device); +qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); + +return; + +fail: +vfu_destroy_ctx(o->vfu_ctx); +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} +o->vfu_ctx = NULL; } static void vfu_object_init(Object *obj) @@ -241,6 +297,17 @@ static void vfu_object_finalize(Object *obj) o->device = NULL; +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} + +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} + if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } -- 2.20.1
[PATCH v12 14/14] vfio-user: handle reset of remote device
Adds handler to reset a remote device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 20 1 file changed, 20 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 5ecdec06f6..c6cc53acf2 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -676,6 +676,20 @@ void vfu_object_set_bus_irq(PCIBus *pci_bus) max_bdf); } +static int vfu_object_device_reset(vfu_ctx_t *vfu_ctx, vfu_reset_type_t type) +{ +VfuObject *o = vfu_get_private(vfu_ctx); + +/* vfu_object_ctx_run() handles lost connection */ +if (type == VFU_RESET_LOST_CONN) { +return 0; +} + +qdev_reset_all(DEVICE(o->pci_dev)); + +return 0; +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. These @@ -795,6 +809,12 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) goto fail; } +ret = vfu_setup_device_reset_cb(o->vfu_ctx, &vfu_object_device_reset); +if (ret < 0) { +error_setg(errp, "vfu: Failed to setup reset callback"); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", -- 2.20.1
[PATCH v12 09/14] vfio-user: handle PCI config space accesses
Define and register handlers for PCI config space accesses Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 51 +++ hw/remote/trace-events| 2 ++ 2 files changed, 53 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 178bd6f8ed..cef473cb98 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -46,6 +46,7 @@ #include "qapi/qapi-events-misc.h" #include "qemu/notify.h" #include "qemu/thread.h" +#include "qemu/main-loop.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" @@ -244,6 +245,45 @@ retry_attach: qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o); } +static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, + size_t count, loff_t offset, + const bool is_write) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +uint32_t pci_access_width = sizeof(uint32_t); +size_t bytes = count; +uint32_t val = 0; +char *ptr = buf; +int len; + +/* + * Writes to the BAR registers would trigger an update to the + * global Memory and IO AddressSpaces. But the remote device + * never uses the global AddressSpaces, therefore overlapping + * memory regions are not a problem + */ +while (bytes > 0) { +len = (bytes > pci_access_width) ? pci_access_width : bytes; +if (is_write) { +memcpy(&val, ptr, len); +pci_host_config_write_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), + val, len); +trace_vfu_cfg_write(offset, val); +} else { +val = pci_host_config_read_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), len); +memcpy(ptr, &val, len); +trace_vfu_cfg_read(offset, val); +} +offset += len; +ptr += len; +bytes -= len; +} + +return count; +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. 
These @@ -336,6 +376,17 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) TYPE_VFU_OBJECT, o->device); qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +ret = vfu_setup_region(o->vfu_ctx, VFU_PCI_DEV_CFG_REGION_IDX, + pci_config_size(o->pci_dev), &vfu_object_cfg_access, + VFU_REGION_FLAG_RW | VFU_REGION_FLAG_ALWAYS_CB, + NULL, 0, -1, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to setup config space handlers for %s- %s", + o->device, strerror(errno)); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 7da12f0d96..2ef7884346 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -5,3 +5,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, # vfio-user-obj.c vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" +vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x" +vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x" -- 2.20.1
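The loop in vfu_object_cfg_access() above splits an arbitrary-length request into accesses of at most four bytes. The standalone sketch below mirrors that chunking against a plain byte array. It is only an illustration: cfg_space, cfg_read(), cfg_write() and cfg_access_count are hypothetical stand-ins for the device's config space and for pci_host_config_read_common() / pci_host_config_write_common(), and the explicit bounds check exists only to keep the array access safe in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 256      /* assumed size of the stand-in config space */
#define ACCESS_WIDTH 4    /* at most one 32-bit access per step */

static uint8_t cfg_space[CFG_SIZE];   /* hypothetical device state */
static int cfg_access_count;          /* number of individual accesses made */

/* Hypothetical stand-in for pci_host_config_read_common(). */
static uint32_t cfg_read(size_t offset, size_t len)
{
    uint32_t val = 0;

    memcpy(&val, &cfg_space[offset], len);
    cfg_access_count++;
    return val;
}

/* Hypothetical stand-in for pci_host_config_write_common(). */
static void cfg_write(size_t offset, uint32_t val, size_t len)
{
    memcpy(&cfg_space[offset], &val, len);
    cfg_access_count++;
}

/* Returns the number of bytes handled, or -1 if the range is out of bounds. */
static long cfg_access(char *buf, size_t count, size_t offset, int is_write)
{
    size_t bytes = count;
    char *ptr = buf;

    if (offset > CFG_SIZE || count > CFG_SIZE - offset) {
        return -1;
    }

    while (bytes > 0) {
        /* Same chunking rule as the patch: min(remaining, access width). */
        size_t len = bytes > ACCESS_WIDTH ? ACCESS_WIDTH : bytes;
        uint32_t val = 0;

        if (is_write) {
            memcpy(&val, ptr, len);
            cfg_write(offset, val, len);
        } else {
            val = cfg_read(offset, len);
            memcpy(ptr, &val, len);
        }

        offset += len;
        ptr += len;
        bytes -= len;
    }

    return (long)count;
}
```

A 10-byte request is therefore served as three accesses of 4, 4 and 2 bytes, regardless of whether it is a read or a write.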
[PATCH v12 12/14] vfio-user: handle PCI BAR accesses
Determine the BARs used by the PCI device and register handlers to manage the access to the same. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/exec/memory.h | 3 + hw/remote/vfio-user-obj.c | 190 softmmu/physmem.c | 4 +- tests/qtest/fuzz/generic_fuzz.c | 9 +- hw/remote/trace-events | 3 + 5 files changed, 203 insertions(+), 6 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index f1c19451bc..a6a0f4d8ad 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -2810,6 +2810,9 @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr, const void *buf, hwaddr len); +int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr); +bool prepare_mmio_access(MemoryRegion *mr); + static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write) { if (is_write) { diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 7b21f77052..dd760a99e2 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -52,6 +52,7 @@ #include "hw/qdev-core.h" #include "hw/pci/pci.h" #include "qemu/timer.h" +#include "exec/memory.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -332,6 +333,193 @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); } +static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset, +hwaddr size, const bool is_write) +{ +uint8_t *ptr = buf; +bool release_lock = false; +uint8_t *ram_ptr = NULL; +MemTxResult result; +int access_size; +uint64_t val; + +if (memory_access_is_direct(mr, is_write)) { +/** + * Some devices expose a PCI expansion ROM, which could be buffer + * based as compared to other regions which are primarily based on + * MemoryRegionOps. 
memory_region_find() would already check + * for buffer overflow, we don't need to repeat it here. + */ +ram_ptr = memory_region_get_ram_ptr(mr); + +if (is_write) { +memcpy((ram_ptr + offset), buf, size); +} else { +memcpy(buf, (ram_ptr + offset), size); +} + +return 0; +} + +while (size) { +/** + * The read/write logic used below is similar to the ones in + * flatview_read/write_continue() + */ +release_lock = prepare_mmio_access(mr); + +access_size = memory_access_size(mr, size, offset); + +if (is_write) { +val = ldn_he_p(ptr, access_size); + +result = memory_region_dispatch_write(mr, offset, val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); +} else { +result = memory_region_dispatch_read(mr, offset, &val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); + +stn_he_p(ptr, access_size, val); +} + +if (release_lock) { +qemu_mutex_unlock_iothread(); +release_lock = false; +} + +if (result != MEMTX_OK) { +return -1; +} + +size -= access_size; +ptr += access_size; +offset += access_size; +} + +return 0; +} + +static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar, +hwaddr bar_offset, char * const buf, +hwaddr len, const bool is_write) +{ +MemoryRegionSection section = { 0 }; +uint8_t *ptr = (uint8_t *)buf; +MemoryRegion *section_mr = NULL; +uint64_t section_size; +hwaddr section_offset; +hwaddr size = 0; + +while (len) { +section = memory_region_find(pci_dev->io_regions[pci_bar].memory, + bar_offset, len); + +if (!section.mr) { +warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset); +return size; +} + +section_mr = section.mr; +section_offset = section.offset_within_region; +section_size = int128_get64(section.size); + +if (is_write && section_mr->readonly) { +warn_report("vfu: attempting to write to readonly region in " +"bar %d - [0x%"PRIx64" - 0x%"PRIx64"]", +pci_bar, bar_offset, +(bar_off
[PATCH v12 11/14] vfio-user: handle DMA mappings
Define and register callbacks to manage the RAM regions used for device DMA Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/machine.c | 5 hw/remote/vfio-user-obj.c | 55 +++ hw/remote/trace-events| 2 ++ 3 files changed, 62 insertions(+) diff --git a/hw/remote/machine.c b/hw/remote/machine.c index cbb2add291..645b54343d 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -22,6 +22,7 @@ #include "hw/remote/iohub.h" #include "hw/remote/iommu.h" #include "hw/qdev-core.h" +#include "hw/remote/iommu.h" static void remote_machine_init(MachineState *machine) { @@ -51,6 +52,10 @@ static void remote_machine_init(MachineState *machine) pci_host = PCI_HOST_BRIDGE(rem_host); +if (s->vfio_user) { +remote_iommu_setup(pci_host->bus); +} + remote_iohub_init(&s->iohub); pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index cef473cb98..7b21f77052 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -284,6 +284,54 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, return count; } +static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *subregion = NULL; +g_autofree char *name = NULL; +struct iovec *iov = &info->iova; + +if (!info->vaddr) { +return; +} + +name = g_strdup_printf("mem-%s-%"PRIx64"", o->device, + (uint64_t)info->vaddr); + +subregion = g_new0(MemoryRegion, 1); + +memory_region_init_ram_ptr(subregion, NULL, name, + iov->iov_len, info->vaddr); + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion); + +trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len); +} + +static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = 
vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *mr = NULL; +ram_addr_t offset; + +mr = memory_region_from_host(info->vaddr, &offset); +if (!mr) { +return; +} + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_del_subregion(dma_as->root, mr); + +object_unparent((OBJECT(mr))); + +trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. These @@ -387,6 +435,13 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) goto fail; } +ret = vfu_setup_device_dma(o->vfu_ctx, &dma_register, &dma_unregister); +if (ret < 0) { +error_setg(errp, "vfu: Failed to setup DMA handlers for %s", + o->device); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 2ef7884346..f945c7e33b 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -7,3 +7,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x" vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x" +vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes" +vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64"" -- 2.20.1
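For readers following the series, the bookkeeping behind dma_register() and dma_unregister() can be sketched outside QEMU. This is a toy model only, not the patch's implementation: ToyDmaSpace, toy_dma_register(), toy_dma_unregister() and toy_dma_translate() are hypothetical names, and a fixed-size table stands in for the MemoryRegion subregions the patch adds to the device's address space. Like the patch, registration skips regions without a host mapping, and unregistration looks the region up by its host address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_REGIONS 8

/* Hypothetical stand-in for one registered DMA region. */
typedef struct ToyDmaRegion {
    uint64_t iova;   /* device-visible start address */
    size_t len;      /* length in bytes */
    uint8_t *vaddr;  /* backing host memory; NULL marks a free slot */
} ToyDmaRegion;

/* Hypothetical stand-in for the per-device DMA address space. */
typedef struct ToyDmaSpace {
    ToyDmaRegion regions[TOY_MAX_REGIONS];
} ToyDmaSpace;

/* Record a region; regions without a host mapping are ignored. */
static bool toy_dma_register(ToyDmaSpace *as, uint64_t iova, size_t len,
                             void *vaddr)
{
    assert(as != NULL);

    if (vaddr == NULL || len == 0) {
        return false;
    }

    for (size_t i = 0; i < TOY_MAX_REGIONS; i++) {
        if (as->regions[i].vaddr == NULL) {
            as->regions[i].iova = iova;
            as->regions[i].len = len;
            as->regions[i].vaddr = vaddr;
            return true;
        }
    }

    return false; /* table full */
}

/* Drop the region backed by the given host address, if any. */
static bool toy_dma_unregister(ToyDmaSpace *as, const void *vaddr)
{
    assert(as != NULL);

    if (vaddr == NULL) {
        return false;
    }

    for (size_t i = 0; i < TOY_MAX_REGIONS; i++) {
        if (as->regions[i].vaddr == vaddr) {
            as->regions[i] = (ToyDmaRegion){ 0 };
            return true;
        }
    }

    return false;
}

/* Translate a device address to a host pointer, or NULL if unmapped. */
static void *toy_dma_translate(const ToyDmaSpace *as, uint64_t iova)
{
    assert(as != NULL);

    for (size_t i = 0; i < TOY_MAX_REGIONS; i++) {
        const ToyDmaRegion *r = &as->regions[i];

        if (r->vaddr != NULL && iova >= r->iova && iova - r->iova < r->len) {
            return r->vaddr + (iova - r->iova);
        }
    }

    return NULL;
}
```

In this model a device access to an address inside a registered range resolves to the matching offset in the host buffer, and resolves to nothing once the range has been unregistered.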
[PATCH v12 10/14] vfio-user: IOMMU support for remote device
Assign separate address space for each device in the remote processes. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/remote/iommu.h | 40 hw/remote/iommu.c | 131 ++ hw/remote/machine.c | 13 +++- MAINTAINERS | 2 + hw/remote/meson.build | 1 + 5 files changed, 186 insertions(+), 1 deletion(-) create mode 100644 include/hw/remote/iommu.h create mode 100644 hw/remote/iommu.c diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h new file mode 100644 index 00..33b68a8f4b --- /dev/null +++ b/include/hw/remote/iommu.h @@ -0,0 +1,40 @@ +/** + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_IOMMU_H +#define REMOTE_IOMMU_H + +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" + +#ifndef INT2VOIDP +#define INT2VOIDP(i) (void *)(uintptr_t)(i) +#endif + +typedef struct RemoteIommuElem { +MemoryRegion *mr; + +AddressSpace as; +} RemoteIommuElem; + +#define TYPE_REMOTE_IOMMU "x-remote-iommu" +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU) + +struct RemoteIommu { +Object parent; + +GHashTable *elem_by_devfn; + +QemuMutex lock; +}; + +void remote_iommu_setup(PCIBus *pci_bus); + +void remote_iommu_unplug_dev(PCIDevice *pci_dev); + +#endif diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c new file mode 100644 index 00..fd723d91f3 --- /dev/null +++ b/hw/remote/iommu.c @@ -0,0 +1,131 @@ +/** + * IOMMU for remote device + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" + +#include "hw/remote/iommu.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" +#include "exec/memory.h" +#include "exec/address-spaces.h" +#include "trace.h" + +/** + * IOMMU for TYPE_REMOTE_MACHINE - manages DMA address space isolation + * for remote machine. It is used by TYPE_VFIO_USER_SERVER. + * + * - Each TYPE_VFIO_USER_SERVER instance handles one PCIDevice on a PCIBus. + * There is one RemoteIommu per PCIBus, so the RemoteIommu tracks multiple + * PCIDevices by maintaining a ->elem_by_devfn mapping. + * + * - memory_region_init_iommu() is not used because vfio-user MemoryRegions + * will be added to the elem->mr container instead. This is more natural + * than implementing the IOMMUMemoryRegionClass APIs since vfio-user + * provides something that is close to a full-fledged MemoryRegion and + * not like an IOMMU mapping. + * + * - When a device is hot unplugged, the elem->mr reference is dropped so + * all vfio-user MemoryRegions associated with this vfio-user server are + * destroyed. 
+ */ + +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus, + void *opaque, int devfn) +{ +RemoteIommu *iommu = opaque; +RemoteIommuElem *elem = NULL; + +qemu_mutex_lock(&iommu->lock); + +elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn)); + +if (!elem) { +elem = g_malloc0(sizeof(RemoteIommuElem)); +g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem); +} + +if (!elem->mr) { +elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION)); +memory_region_set_size(elem->mr, UINT64_MAX); +address_space_init(&elem->as, elem->mr, NULL); +} + +qemu_mutex_unlock(&iommu->lock); + +return &elem->as; +} + +void remote_iommu_unplug_dev(PCIDevice *pci_dev) +{ +AddressSpace *as = pci_device_iommu_address_space(pci_dev); +RemoteIommuElem *elem = NULL; + +if (as == &address_space_memory) { +return; +} + +elem = container_of(as, RemoteIommuElem, as); + +address_space_destroy(&elem->as); + +object_unref(elem->mr); + +elem->mr = NULL; +} + +static void remote_iommu_init(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free); + +qemu_mutex_init(&iommu->lock); +} + +static void remote_iommu_finalize(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +qemu_mutex_destroy(&iommu->lock); + +g_hash_table_destroy(iommu->elem_by_devfn); + +iommu->elem_by_devfn = NULL; +} + +void remote_iommu_setup(PCIBus *pci_bus) +{ +RemoteIommu *iommu = NULL; + +g_assert(pci_bus); + +iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU)); + +pci_setup_iommu(pci_bus, remote_iommu_find
[PATCH v12 13/14] vfio-user: handle device interrupts
Forward remote device's interrupts to the guest Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/pci/msi.h | 1 + include/hw/pci/msix.h | 1 + include/hw/pci/pci.h | 13 +++ include/hw/remote/vfio-user-obj.h | 6 ++ hw/pci/msi.c | 49 +++-- hw/pci/msix.c | 35 ++- hw/pci/pci.c | 13 +++ hw/remote/machine.c | 14 ++- hw/remote/vfio-user-obj.c | 167 ++ stubs/vfio-user-obj.c | 6 ++ MAINTAINERS | 1 + hw/remote/trace-events| 1 + stubs/meson.build | 1 + 13 files changed, 297 insertions(+), 11 deletions(-) create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 stubs/vfio-user-obj.c diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h index 4087688486..58aa576215 100644 --- a/include/hw/pci/msi.h +++ b/include/hw/pci/msi.h @@ -43,6 +43,7 @@ void msi_notify(PCIDevice *dev, unsigned int vector); void msi_send_message(PCIDevice *dev, MSIMessage msg); void msi_write_config(PCIDevice *dev, uint32_t addr, uint32_t val, int len); unsigned int msi_nr_vectors_allocated(const PCIDevice *dev); +void msi_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp); static inline bool msi_present(const PCIDevice *dev) { diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h index 4c4a60c739..4f1cda0ebe 100644 --- a/include/hw/pci/msix.h +++ b/include/hw/pci/msix.h @@ -36,6 +36,7 @@ void msix_clr_pending(PCIDevice *dev, int vector); int msix_vector_use(PCIDevice *dev, unsigned vector); void msix_vector_unuse(PCIDevice *dev, unsigned vector); void msix_unuse_all_vectors(PCIDevice *dev); +void msix_set_mask(PCIDevice *dev, int vector, bool mask, Error **errp); void msix_notify(PCIDevice *dev, unsigned vector); diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 44dacfa224..b54b6ef88f 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -16,6 +16,7 @@ extern bool pci_available; #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f) #define PCI_FUNC(devfn) ((devfn) & 0x07) #define PCI_BUILD_BDF(bus, 
devfn) ((bus << 8) | (devfn)) +#define PCI_BDF_TO_DEVFN(x) ((x) & 0xff) #define PCI_BUS_MAX 256 #define PCI_DEVFN_MAX 256 #define PCI_SLOT_MAX32 @@ -127,6 +128,10 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type); typedef void PCIUnregisterFunc(PCIDevice *pci_dev); +typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg); +typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector); +typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector); + typedef struct PCIIORegion { pcibus_t addr; /* current PCI mapping address. -1 means not mapped */ #define PCI_BAR_UNMAPPED (~(pcibus_t)0) @@ -329,6 +334,14 @@ struct PCIDevice { /* Space to store MSIX table & pending bit array */ uint8_t *msix_table; uint8_t *msix_pba; + +/* May be used by INTx or MSI during interrupt notification */ +void *irq_opaque; + +MSITriggerFunc *msi_trigger; +MSIPrepareMessageFunc *msi_prepare_message; +MSIxPrepareMessageFunc *msix_prepare_message; + /* MemoryRegion container for msix exclusive BAR setup */ MemoryRegion msix_exclusive_bar; /* Memory Regions for MSIX table and pending bit entries. 
*/ diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h new file mode 100644 index 00..87ab78b875 --- /dev/null +++ b/include/hw/remote/vfio-user-obj.h @@ -0,0 +1,6 @@ +#ifndef VFIO_USER_OBJ_H +#define VFIO_USER_OBJ_H + +void vfu_object_set_bus_irq(PCIBus *pci_bus); + +#endif diff --git a/hw/pci/msi.c b/hw/pci/msi.c index 47d2b0f33c..5c471b9616 100644 --- a/hw/pci/msi.c +++ b/hw/pci/msi.c @@ -134,7 +134,7 @@ void msi_set_message(PCIDevice *dev, MSIMessage msg) pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data); } -MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector) { uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev)); bool msi64bit = flags & PCI_MSI_FLAGS_64BIT; @@ -159,6 +159,11 @@ MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) return msg; } +MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +{ +return dev->msi_prepare_message(dev, vector); +} + bool msi_enabled(const PCIDevice *dev) { return msi_present(dev) && @@ -241,6 +246,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, 0x >> (PCI_MSI_VECTORS_MAX - nr_vectors)); } +dev->msi_prepare
[PATCH v12 08/14] vfio-user: run vfio-user context
Setup a handler to run vfio-user context. The context is driven by messages to the file descriptor associated with it - get the fd for the context and hook up the handler with it Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/misc.json| 31 ++ hw/remote/vfio-user-obj.c | 118 +- 2 files changed, 148 insertions(+), 1 deletion(-) diff --git a/qapi/misc.json b/qapi/misc.json index 45344483cd..27ef5a2b20 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -553,3 +553,34 @@ ## { 'event': 'RTC_CHANGE', 'data': { 'offset': 'int', 'qom-path': 'str' } } + +## +# @VFU_CLIENT_HANGUP: +# +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the +# communication channel +# +# @vfu-id: ID of the TYPE_VFIO_USER_SERVER object. It is the last component +# of @vfu-qom-path referenced below +# +# @vfu-qom-path: path to the TYPE_VFIO_USER_SERVER object in the QOM tree +# +# @dev-id: ID of attached PCI device +# +# @dev-qom-path: path to attached PCI device in the QOM tree +# +# Since: 7.1 +# +# Example: +# +# <- { "event": "VFU_CLIENT_HANGUP", +# "data": { "vfu-id": "vfu1", +#"vfu-qom-path": "/objects/vfu1", +#"dev-id": "sas1", +#"dev-qom-path": "/machine/peripheral/sas1" }, +# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } +# +## +{ 'event': 'VFU_CLIENT_HANGUP', + 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str', +'dev-id': 'str', 'dev-qom-path': 'str' } } diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 3ca6aa2b45..178bd6f8ed 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -27,6 +27,9 @@ * * device - id of a device on the server, a required option. PCI devices * alone are supported presently. + * + * notes - x-vfio-user-server could block IO and monitor during the + * initialization phase. 
*/ #include "qemu/osdep.h" @@ -40,11 +43,14 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qapi/qapi-events-misc.h" #include "qemu/notify.h" +#include "qemu/thread.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" #include "hw/pci/pci.h" +#include "qemu/timer.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -86,6 +92,8 @@ struct VfuObject { PCIDevice *pci_dev; Error *unplug_blocker; + +int vfu_poll_fd; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -164,6 +172,78 @@ static void vfu_object_set_device(Object *obj, const char *str, Error **errp) vfu_object_init_ctx(o, errp); } +static void vfu_object_ctx_run(void *opaque) +{ +VfuObject *o = opaque; +const char *vfu_id; +char *vfu_path, *pci_dev_path; +int ret = -1; + +while (ret != 0) { +ret = vfu_run_ctx(o->vfu_ctx); +if (ret < 0) { +if (errno == EINTR) { +continue; +} else if (errno == ENOTCONN) { +vfu_id = object_get_canonical_path_component(OBJECT(o)); +vfu_path = object_get_canonical_path(OBJECT(o)); +g_assert(o->pci_dev); +pci_dev_path = object_get_canonical_path(OBJECT(o->pci_dev)); + /* o->device is a required property and is non-NULL here */ +g_assert(o->device); +qapi_event_send_vfu_client_hangup(vfu_id, vfu_path, + o->device, pci_dev_path); +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); +o->vfu_poll_fd = -1; +object_unparent(OBJECT(o)); +g_free(vfu_path); +g_free(pci_dev_path); +break; +} else { +VFU_OBJECT_ERROR(o, "vfu: Failed to run device %s - %s", + o->device, strerror(errno)); +break; +} +} +} +} + +static void vfu_object_attach_ctx(void *opaque) +{ +VfuObject *o = opaque; +GPollFD pfds[1]; +int ret; + +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); + +pfds[0].fd = o->vfu_poll_fd; +pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR; + +retry_attach: +ret = vfu_attach_ctx(o-&g
[PATCH v12 04/14] vfio-user: build library
add the libvfio-user library as a submodule. build it as a meson subproject. libvfio-user is distributed with BSD 3-Clause license and json-c with MIT (Expat) license Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- configure | 17 + meson.build | 23 ++- .gitlab-ci.d/buildtest.yml | 1 + .gitmodules | 3 +++ Kconfig.host| 4 MAINTAINERS | 1 + hw/remote/Kconfig | 4 hw/remote/meson.build | 2 ++ meson_options.txt | 2 ++ subprojects/libvfio-user| 1 + tests/docker/dockerfiles/centos8.docker | 2 ++ 11 files changed, 59 insertions(+), 1 deletion(-) create mode 16 subprojects/libvfio-user diff --git a/configure b/configure index e69537c756..39f30c0283 100755 --- a/configure +++ b/configure @@ -315,6 +315,7 @@ meson_args="" ninja="" bindir="bin" skip_meson=no +vfio_user_server="disabled" # The following Meson options are handled manually (still they # are included in the automatically generated help message) @@ -909,6 +910,10 @@ for opt do ;; --disable-blobs) meson_option_parse --disable-install-blobs "" ;; + --enable-vfio-user-server) vfio_user_server="enabled" + ;; + --disable-vfio-user-server) vfio_user_server="disabled" + ;; --enable-tcmalloc) meson_option_parse --enable-malloc=tcmalloc tcmalloc ;; --enable-jemalloc) meson_option_parse --enable-malloc=jemalloc jemalloc @@ -2133,6 +2138,17 @@ write_container_target_makefile() { +## +# check for vfio_user_server + +case "$vfio_user_server" in + enabled ) +if test "$git_submodules_action" != "ignore"; then + git_submodules="${git_submodules} subprojects/libvfio-user" +fi +;; +esac + ## # End of CC checks # After here, no more $cc or $ld runs @@ -2669,6 +2685,7 @@ if test "$skip_meson" = no; then test "$slirp" != auto && meson_option_add "-Dslirp=$slirp" test "$smbd" != '' && meson_option_add "-Dsmbd=$smbd" test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg" + test "$vfio_user_server" != auto && meson_option_add 
"-Dvfio_user_server=$vfio_user_server" run_meson() { NINJA=$ninja $meson setup --prefix "$prefix" "$@" $cross_arg "$PWD" "$source_path" } diff --git a/meson.build b/meson.build index 21cd949082..fac9853254 100644 --- a/meson.build +++ b/meson.build @@ -308,6 +308,10 @@ multiprocess_allowed = get_option('multiprocess') \ .require(targetos == 'linux', error_message: 'Multiprocess QEMU is supported only on Linux') \ .allowed() +vfio_user_server_allowed = get_option('vfio_user_server') \ + .require(targetos == 'linux', error_message: 'vfio-user server is supported only on Linux') \ + .allowed() + have_tpm = get_option('tpm') \ .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \ .allowed() @@ -2373,7 +2377,8 @@ host_kconfig = \ (have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \ ('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \ (have_pvrdma ? ['CONFIG_PVRDMA=y'] : []) + \ - (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + \ + (vfio_user_server_allowed ? ['CONFIG_VFIO_USER_SERVER_ALLOWED=y'] : []) ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ] @@ -2665,6 +2670,21 @@ if have_system endif endif +libvfio_user_dep = not_found +if have_system and vfio_user_server_allowed + have_internal = fs.exists(meson.current_source_dir() / 'subprojects/libvfio-user/meson.build') + + if not have_internal +error('libvfio-user source not found - please pull git submodule') + endif + + libvfio_user_proj = subproject('libvfio-user') + + libvfio_user_lib = libvfio_user_proj.get_variable('libvfio_user_dep') + + libvfio_user_dep = declare_dependency(dependencies: [libvfio_user_lib]) +endif + fdt = not_found if have_system fdt_opt = get_option('fdt') @@ -3783,6 +3803,7 @@ summary_info += {'target list': ' '.join(target_dirs)} if have_system summary_info += {'default devices': get_
[PATCH v12 06/14] vfio-user: instantiate vfio-user context
create a context with the vfio-user library to run a PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 82 +++ 1 file changed, 82 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index bc49adcc27..68f8a9dfa9 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -40,6 +40,9 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qemu/notify.h" +#include "sysemu/sysemu.h" +#include "libvfio-user.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -73,8 +76,14 @@ struct VfuObject { char *device; Error *err; + +Notifier machine_done; + +vfu_ctx_t *vfu_ctx; }; +static void vfu_object_init_ctx(VfuObject *o, Error **errp); + static bool vfu_object_auto_shutdown(void) { bool auto_shutdown = true; @@ -107,6 +116,11 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set socket property - server busy"); +return; +} + qapi_free_SocketAddress(o->socket); o->socket = NULL; @@ -122,17 +136,69 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, } trace_vfu_prop("socket", o->socket->u.q_unix.path); + +vfu_object_init_ctx(o, errp); } static void vfu_object_set_device(Object *obj, const char *str, Error **errp) { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set device property - server busy"); +return; +} + g_free(o->device); o->device = g_strdup(str); trace_vfu_prop("device", str); + +vfu_object_init_ctx(o, errp); +} + +/* + * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' + * properties. It also depends on devices instantiated in QEMU. 
These + * dependencies are not available during the instance_init phase of this + * object's life-cycle. As such, the server is initialized after the + * machine is setup. machine_init_done_notifier notifies TYPE_VFU_OBJECT + * when the machine is setup, and the dependencies are available. + */ +static void vfu_object_machine_done(Notifier *notifier, void *data) +{ +VfuObject *o = container_of(notifier, VfuObject, machine_done); +Error *err = NULL; + +vfu_object_init_ctx(o, &err); + +if (err) { +error_propagate(&error_abort, err); +} +} + +static void vfu_object_init_ctx(VfuObject *o, Error **errp) +{ +ERRP_GUARD(); + +if (o->vfu_ctx || !o->socket || !o->device || +!phase_check(PHASE_MACHINE_READY)) { +return; +} + +if (o->err) { +error_propagate(errp, o->err); +o->err = NULL; +return; +} + +o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0, +o, VFU_DEV_TYPE_PCI); +if (o->vfu_ctx == NULL) { +error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); +return; +} } static void vfu_object_init(Object *obj) @@ -147,6 +213,12 @@ static void vfu_object_init(Object *obj) TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE); return; } + +if (!phase_check(PHASE_MACHINE_READY)) { +o->machine_done.notify = vfu_object_machine_done; +qemu_add_machine_init_done_notifier(&o->machine_done); +} + } static void vfu_object_finalize(Object *obj) @@ -160,6 +232,11 @@ static void vfu_object_finalize(Object *obj) o->socket = NULL; +if (o->vfu_ctx) { +vfu_destroy_ctx(o->vfu_ctx); +o->vfu_ctx = NULL; +} + g_free(o->device); o->device = NULL; @@ -167,6 +244,11 @@ static void vfu_object_finalize(Object *obj) if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } + +if (o->machine_done.notify) { +qemu_remove_machine_init_done_notifier(&o->machine_done); +o->machine_done.notify = NULL; +} } static void vfu_object_class_init(ObjectClass *klass, void *data) -- 2.20.1
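The deferred initialization in vfu_object_init_ctx() may be easier to see in isolation. The sketch below is a toy model under stated assumptions, not QEMU code: ToyServer and the toy_* functions are hypothetical, a boolean stands in for phase_check(PHASE_MACHINE_READY), and a counter stands in for vfu_create_ctx(). It shows only the pattern: every setter and the machine-done notification call the same init function, which does nothing until all prerequisites are present, and the setters refuse changes once the context exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VfuObject. */
typedef struct ToyServer {
    const char *socket;   /* set by the 'socket' property */
    const char *device;   /* set by the 'device' property */
    bool machine_ready;   /* models phase_check(PHASE_MACHINE_READY) */
    bool ctx_created;     /* models o->vfu_ctx != NULL */
    int create_calls;     /* how many times the context was created */
} ToyServer;

/* Create the context only once every dependency is available. */
static void toy_init_ctx(ToyServer *s)
{
    assert(s != NULL);

    if (s->ctx_created || !s->socket || !s->device || !s->machine_ready) {
        return;
    }

    s->ctx_created = true;
    s->create_calls++;
}

/* Returns 0 on success, -1 if the server is already running. */
static int toy_set_socket(ToyServer *s, const char *path)
{
    assert(s != NULL);

    if (s->ctx_created) {
        return -1;
    }

    s->socket = path;
    toy_init_ctx(s);
    return 0;
}

/* Returns 0 on success, -1 if the server is already running. */
static int toy_set_device(ToyServer *s, const char *id)
{
    assert(s != NULL);

    if (s->ctx_created) {
        return -1;
    }

    s->device = id;
    toy_init_ctx(s);
    return 0;
}

/* Models the machine-init-done notifier firing. */
static void toy_machine_done(ToyServer *s)
{
    assert(s != NULL);

    s->machine_ready = true;
    toy_init_ctx(s);
}
```

The order in which the three events arrive does not matter in this model; whichever comes last triggers the single creation.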
[PATCH v12 03/14] remote/machine: add vfio-user property
Add a vfio-user property to the x-remote machine. It is a boolean that indicates whether the machine supports the vfio-user protocol. The machine configures the bus differently for the vfio-user and multiprocess protocols, so this property tells it how to configure the bus. This property should be short-lived: once vfio-user fully replaces multiprocess, it can be removed. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/remote/machine.h | 2 ++ hw/remote/machine.c | 23 +++ 2 files changed, 25 insertions(+) diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 2a2a33c4b2..8d0fa98d33 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -22,6 +22,8 @@ struct RemoteMachineState { RemotePCIHost *host; RemoteIOHubState iohub; + +bool vfio_user; }; /* Used to pass to co-routine device and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index a97e53e250..9f3cdc55c3 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -58,6 +58,25 @@ static void remote_machine_init(MachineState *machine) qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s)); } +static bool remote_machine_get_vfio_user(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->vfio_user; +} + +static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +if (phase_check(PHASE_MACHINE_CREATED)) { +error_setg(errp, "Error enabling vfio-user - machine already created"); +return; +} + +s->vfio_user = value; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -67,6 +86,10 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) mc->desc = "Experimental remote machine"; hc->unplug = qdev_simple_device_unplug_cb; + +object_class_property_add_bool(oc, "vfio-user", + remote_machine_get_vfio_user, 
+ remote_machine_set_vfio_user); } static const TypeInfo remote_machine = { -- 2.20.1
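The setter above rejects changes once the machine has been created. A minimal standalone sketch of that guard follows. It is an illustration under assumptions, not the QEMU API: ToyMachine, ToyPhase and the toy_* functions are hypothetical, the phase is stored per machine rather than globally, and an error string stands in for Error **errp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified machine life-cycle phases. */
typedef enum ToyPhase {
    TOY_PHASE_NO_MACHINE,
    TOY_PHASE_MACHINE_CREATED,
    TOY_PHASE_MACHINE_READY
} ToyPhase;

/* Hypothetical stand-in for RemoteMachineState. */
typedef struct ToyMachine {
    ToyPhase phase;
    bool vfio_user;
} ToyMachine;

/* True once the machine has reached at least the given phase. */
static bool toy_phase_check(const ToyMachine *m, ToyPhase phase)
{
    assert(m != NULL);
    return m->phase >= phase;
}

/* Property getter. */
static bool toy_get_vfio_user(const ToyMachine *m)
{
    assert(m != NULL);
    return m->vfio_user;
}

/*
 * Property setter: returns 0 on success. Once the machine is created
 * it returns -1, stores a message in *err and leaves the value as is.
 */
static int toy_set_vfio_user(ToyMachine *m, bool value, const char **err)
{
    assert(m != NULL);

    if (toy_phase_check(m, TOY_PHASE_MACHINE_CREATED)) {
        if (err) {
            *err = "Error enabling vfio-user - machine already created";
        }
        return -1;
    }

    m->vfio_user = value;
    return 0;
}
```

The point of the guard is that the bus is configured during machine init, so a later change to the property would have no effect on an already configured bus.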
[PATCH v12 02/14] remote/machine: add HotplugHandler for remote machine
Allow hotplugging of PCI(e) devices to remote machine Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/machine.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/remote/machine.c b/hw/remote/machine.c index 92d71d47bb..a97e53e250 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -20,6 +20,7 @@ #include "qapi/error.h" #include "hw/pci/pci_host.h" #include "hw/remote/iohub.h" +#include "hw/qdev-core.h" static void remote_machine_init(MachineState *machine) { @@ -53,14 +54,19 @@ static void remote_machine_init(MachineState *machine) pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq, &s->iohub, REMOTE_IOHUB_NB_PIRQS); + +qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s)); } static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); +HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc); mc->init = remote_machine_init; mc->desc = "Experimental remote machine"; + +hc->unplug = qdev_simple_device_unplug_cb; } static const TypeInfo remote_machine = { @@ -68,6 +74,10 @@ static const TypeInfo remote_machine = { .parent = TYPE_MACHINE, .instance_size = sizeof(RemoteMachineState), .class_init = remote_machine_class_init, +.interfaces = (InterfaceInfo[]) { +{ TYPE_HOTPLUG_HANDLER }, +{ } +} }; static void remote_machine_register_types(void) -- 2.20.1
[PATCH v12 01/14] qdev: unplug blocker for devices
Add a blocker to prevent hot-unplug of devices. TYPE_VFIO_USER_SERVER, which is introduced shortly, attaches itself to a PCIDevice on which it depends. If the attached PCIDevice is removed while the server is in use, the server could crash. To prevent this, TYPE_VFIO_USER_SERVER adds an unplug blocker for the PCIDevice. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/qdev-core.h | 29 + hw/core/qdev.c | 24 softmmu/qdev-monitor.c | 4 3 files changed, 57 insertions(+) diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index 92c3d65208..98774e2835 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -193,6 +193,7 @@ struct DeviceState { int instance_id_alias; int alias_required_for_version; ResettableState reset; +GSList *unplug_blockers; }; struct DeviceListener { @@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev, void qdev_machine_creation_done(void); bool qdev_machine_modified(void); +/** + * qdev_add_unplug_blocker: Add an unplug blocker to a device + * + * @dev: Device to be blocked from unplug + * @reason: Reason for blocking + */ +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason); + +/** + * qdev_del_unplug_blocker: Remove an unplug blocker from a device + * + * @dev: Device to be unblocked + * @reason: Pointer to the Error used with qdev_add_unplug_blocker. + * Used as a handle to look up the blocker for deletion. 
+ */ +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason); + +/** + * qdev_unplug_blocked: Confirm if a device is blocked from unplug + * + * @dev: Device to be tested + * @errp: Returns one of the reasons why the device is blocked, + * if any + * + * Returns: true if device is blocked from unplug, false otherwise + */ +bool qdev_unplug_blocked(DeviceState *dev, Error **errp); + /** * GpioPolarity: Polarity of a GPIO line * diff --git a/hw/core/qdev.c b/hw/core/qdev.c index 84f3019440..0806d8fcaa 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev) return NULL; } +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason) +{ +dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason); +} + +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason) +{ +dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason); +} + +bool qdev_unplug_blocked(DeviceState *dev, Error **errp) +{ +ERRP_GUARD(); + +if (dev->unplug_blockers) { +error_propagate(errp, error_copy(dev->unplug_blockers->data)); +return true; +} + +return false; +} + static bool device_get_realized(Object *obj, Error **errp) { DeviceState *dev = DEVICE(obj); @@ -704,6 +726,8 @@ static void device_finalize(Object *obj) DeviceState *dev = DEVICE(obj); +g_assert(!dev->unplug_blockers); + QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) { QLIST_REMOVE(ngl, node); qemu_free_irqs(ngl->in, ngl->num_in); diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c index bb5897fc76..4b0ef65780 100644 --- a/softmmu/qdev-monitor.c +++ b/softmmu/qdev-monitor.c @@ -899,6 +899,10 @@ void qdev_unplug(DeviceState *dev, Error **errp) HotplugHandlerClass *hdc; Error *local_err = NULL; +if (qdev_unplug_blocked(dev, errp)) { +return; +} + if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) { error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name); return; -- 2.20.1
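The blocker list semantics can be tried out without GLib or QEMU. The following is a rough sketch only: ToyDevice, ToyBlocker and the toy_* functions are hypothetical names, a string pointer stands in for the Error object, and a hand-rolled singly linked list stands in for GSList. As in the patch, the reason pointer doubles as the handle for removal, new blockers are prepended, and the blocked check reports the reason at the head of the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; 'reason' stands in for the Error pointer. */
typedef struct ToyBlocker {
    const char *reason;
    struct ToyBlocker *next;
} ToyBlocker;

/* Hypothetical stand-in for DeviceState. */
typedef struct ToyDevice {
    ToyBlocker *unplug_blockers;
} ToyDevice;

/* Prepend a blocker, in the spirit of g_slist_prepend(). */
static void toy_add_unplug_blocker(ToyDevice *dev, const char *reason)
{
    ToyBlocker *b = malloc(sizeof(*b));

    assert(dev != NULL && b != NULL);
    b->reason = reason;
    b->next = dev->unplug_blockers;
    dev->unplug_blockers = b;
}

/* Remove the first blocker whose reason pointer matches the handle. */
static void toy_del_unplug_blocker(ToyDevice *dev, const char *reason)
{
    ToyBlocker **link;

    assert(dev != NULL);
    link = &dev->unplug_blockers;

    while (*link) {
        if ((*link)->reason == reason) {
            ToyBlocker *dead = *link;

            *link = dead->next;
            free(dead);
            return;
        }
        link = &(*link)->next;
    }
}

/* Returns one blocking reason, or NULL if unplug is allowed. */
static const char *toy_unplug_blocked(const ToyDevice *dev)
{
    assert(dev != NULL);
    return dev->unplug_blockers ? dev->unplug_blockers->reason : NULL;
}
```

A caller modelling qdev_unplug() would check toy_unplug_blocked() first and bail out with the returned reason; every blocker has to be removed before the device can go away, which is what the g_assert() in device_finalize() enforces in the patch.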
[PATCH v12 00/14] vfio-user server in QEMU
This is v12 of the server side changes to enable vfio-user in QEMU. Thanks so much for reviewing this series and sharing your feedback. We made the following changes in this series: [PATCH v12 13/14] vfio-user: handle device interrupts - Renamed msi_set_irq_state() and msix_set_irq_state() as msi_set_mask() and msix_set_mask() respectively - Added missing return statement for error case in msi_set_mask() Thank you very much! Jagannathan Raman (14): qdev: unplug blocker for devices remote/machine: add HotplugHandler for remote machine remote/machine: add vfio-user property vfio-user: build library vfio-user: define vfio-user-server object vfio-user: instantiate vfio-user context vfio-user: find and init PCI device vfio-user: run vfio-user context vfio-user: handle PCI config space accesses vfio-user: IOMMU support for remote device vfio-user: handle DMA mappings vfio-user: handle PCI BAR accesses vfio-user: handle device interrupts vfio-user: handle reset of remote device configure | 17 + meson.build | 23 +- qapi/misc.json | 31 + qapi/qom.json | 20 +- include/exec/memory.h | 3 + include/hw/pci/msi.h| 1 + include/hw/pci/msix.h | 1 + include/hw/pci/pci.h| 13 + include/hw/qdev-core.h | 29 + include/hw/remote/iommu.h | 40 + include/hw/remote/machine.h | 4 + include/hw/remote/vfio-user-obj.h | 6 + hw/core/qdev.c | 24 + hw/pci/msi.c| 49 +- hw/pci/msix.c | 35 +- hw/pci/pci.c| 13 + hw/remote/iommu.c | 131 hw/remote/machine.c | 88 ++- hw/remote/vfio-user-obj.c | 958 softmmu/physmem.c | 4 +- softmmu/qdev-monitor.c | 4 + stubs/vfio-user-obj.c | 6 + tests/qtest/fuzz/generic_fuzz.c | 9 +- .gitlab-ci.d/buildtest.yml | 1 + .gitmodules | 3 + Kconfig.host| 4 + MAINTAINERS | 5 + hw/remote/Kconfig | 4 + hw/remote/meson.build | 4 + hw/remote/trace-events | 11 + meson_options.txt | 2 + stubs/meson.build | 1 + subprojects/libvfio-user| 1 + tests/docker/dockerfiles/centos8.docker | 2 + 34 files changed, 1528 insertions(+), 19 deletions(-) create mode 100644 include/hw/remote/iommu.h 
create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 hw/remote/iommu.c create mode 100644 hw/remote/vfio-user-obj.c create mode 100644 stubs/vfio-user-obj.c create mode 160000 subprojects/libvfio-user -- 2.20.1
[PATCH v11 13/14] vfio-user: handle device interrupts
Forward remote device's interrupts to the guest Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/pci/msi.h | 1 + include/hw/pci/msix.h | 1 + include/hw/pci/pci.h | 13 +++ include/hw/remote/vfio-user-obj.h | 6 ++ hw/pci/msi.c | 48 +++-- hw/pci/msix.c | 35 ++- hw/pci/pci.c | 13 +++ hw/remote/machine.c | 14 ++- hw/remote/vfio-user-obj.c | 167 ++ stubs/vfio-user-obj.c | 6 ++ MAINTAINERS | 1 + hw/remote/trace-events| 1 + stubs/meson.build | 1 + 13 files changed, 296 insertions(+), 11 deletions(-) create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 stubs/vfio-user-obj.c diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h index 4087688486..127f3d5111 100644 --- a/include/hw/pci/msi.h +++ b/include/hw/pci/msi.h @@ -43,6 +43,7 @@ void msi_notify(PCIDevice *dev, unsigned int vector); void msi_send_message(PCIDevice *dev, MSIMessage msg); void msi_write_config(PCIDevice *dev, uint32_t addr, uint32_t val, int len); unsigned int msi_nr_vectors_allocated(const PCIDevice *dev); +void msi_set_irq_state(PCIDevice *dev, int vector, bool mask, Error **errp); static inline bool msi_present(const PCIDevice *dev) { diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h index 4c4a60c739..f6ab96ed93 100644 --- a/include/hw/pci/msix.h +++ b/include/hw/pci/msix.h @@ -36,6 +36,7 @@ void msix_clr_pending(PCIDevice *dev, int vector); int msix_vector_use(PCIDevice *dev, unsigned vector); void msix_vector_unuse(PCIDevice *dev, unsigned vector); void msix_unuse_all_vectors(PCIDevice *dev); +void msix_set_irq_state(PCIDevice *dev, int vector, bool mask, Error **errp); void msix_notify(PCIDevice *dev, unsigned vector); diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 44dacfa224..b54b6ef88f 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -16,6 +16,7 @@ extern bool pci_available; #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f) #define PCI_FUNC(devfn) ((devfn) & 0x07) #define 
PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn)) +#define PCI_BDF_TO_DEVFN(x) ((x) & 0xff) #define PCI_BUS_MAX 256 #define PCI_DEVFN_MAX 256 #define PCI_SLOT_MAX32 @@ -127,6 +128,10 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type); typedef void PCIUnregisterFunc(PCIDevice *pci_dev); +typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg); +typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector); +typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector); + typedef struct PCIIORegion { pcibus_t addr; /* current PCI mapping address. -1 means not mapped */ #define PCI_BAR_UNMAPPED (~(pcibus_t)0) @@ -329,6 +334,14 @@ struct PCIDevice { /* Space to store MSIX table & pending bit array */ uint8_t *msix_table; uint8_t *msix_pba; + +/* May be used by INTx or MSI during interrupt notification */ +void *irq_opaque; + +MSITriggerFunc *msi_trigger; +MSIPrepareMessageFunc *msi_prepare_message; +MSIxPrepareMessageFunc *msix_prepare_message; + /* MemoryRegion container for msix exclusive BAR setup */ MemoryRegion msix_exclusive_bar; /* Memory Regions for MSIX table and pending bit entries. 
*/ diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h new file mode 100644 index 00..87ab78b875 --- /dev/null +++ b/include/hw/remote/vfio-user-obj.h @@ -0,0 +1,6 @@ +#ifndef VFIO_USER_OBJ_H +#define VFIO_USER_OBJ_H + +void vfu_object_set_bus_irq(PCIBus *pci_bus); + +#endif diff --git a/hw/pci/msi.c b/hw/pci/msi.c index 47d2b0f33c..59f34e3568 100644 --- a/hw/pci/msi.c +++ b/hw/pci/msi.c @@ -134,7 +134,7 @@ void msi_set_message(PCIDevice *dev, MSIMessage msg) pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data); } -MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector) { uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev)); bool msi64bit = flags & PCI_MSI_FLAGS_64BIT; @@ -159,6 +159,11 @@ MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) return msg; } +MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +{ +return dev->msi_prepare_message(dev, vector); +} + bool msi_enabled(const PCIDevice *dev) { return msi_present(dev) && @@ -241,6 +246,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, 0x >> (PCI_MSI_VECTORS_MAX - nr_vectors)); } +dev
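Note on the devfn/BDF helpers touched in the pci.h hunk above: they can be exercised on their own. The sketch below keeps local copies of the macro bodies from the hunk (with the bus argument of PCI_BUILD_BDF parenthesized, which the hunk does not do) and adds EXAMPLE_DEVFN(), EXAMPLE_BDF_TO_BUS() and example_split_bdf(). Those three are hypothetical helpers for illustration only and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the helpers shown in the pci.h hunk. */
#define PCI_SLOT(devfn)           (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)           ((devfn) & 0x07)
#define PCI_BUILD_BDF(bus, devfn) (((bus) << 8) | (devfn))
#define PCI_BDF_TO_DEVFN(x)       ((x) & 0xff)

/* Illustration-only helpers, not part of the patch. */
#define EXAMPLE_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define EXAMPLE_BDF_TO_BUS(x)     (((x) >> 8) & 0xff)

struct example_bdf {
    uint8_t bus;
    uint8_t slot;
    uint8_t func;
};

/* Split a 16-bit bus/device/function value into its three fields. */
static struct example_bdf example_split_bdf(uint16_t bdf)
{
    uint8_t devfn = PCI_BDF_TO_DEVFN(bdf);
    struct example_bdf out = {
        .bus = EXAMPLE_BDF_TO_BUS(bdf),
        .slot = PCI_SLOT(devfn),
        .func = PCI_FUNC(devfn),
    };

    /* Rebuilding the BDF from the parts must give back the input. */
    assert(PCI_BUILD_BDF(out.bus, devfn) == bdf);

    return out;
}
```

With this layout the low byte of a BDF is the devfn (5 bits of slot, 3 bits of function) and the high byte is the bus number, which is why PCI_BDF_TO_DEVFN() is a plain mask with 0xff.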
[PATCH v11 12/14] vfio-user: handle PCI BAR accesses
Determine the BARs used by the PCI device and register handlers to manage the access to the same. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/exec/memory.h | 3 + hw/remote/vfio-user-obj.c | 190 softmmu/physmem.c | 4 +- tests/qtest/fuzz/generic_fuzz.c | 9 +- hw/remote/trace-events | 3 + 5 files changed, 203 insertions(+), 6 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index f1c19451bc..a6a0f4d8ad 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -2810,6 +2810,9 @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr, const void *buf, hwaddr len); +int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr); +bool prepare_mmio_access(MemoryRegion *mr); + static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write) { if (is_write) { diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 7b21f77052..dd760a99e2 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -52,6 +52,7 @@ #include "hw/qdev-core.h" #include "hw/pci/pci.h" #include "qemu/timer.h" +#include "exec/memory.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -332,6 +333,193 @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); } +static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset, +hwaddr size, const bool is_write) +{ +uint8_t *ptr = buf; +bool release_lock = false; +uint8_t *ram_ptr = NULL; +MemTxResult result; +int access_size; +uint64_t val; + +if (memory_access_is_direct(mr, is_write)) { +/** + * Some devices expose a PCI expansion ROM, which could be buffer + * based as compared to other regions which are primarily based on + * MemoryRegionOps. 
memory_region_find() would already check + * for buffer overflow, we don't need to repeat it here. + */ +ram_ptr = memory_region_get_ram_ptr(mr); + +if (is_write) { +memcpy((ram_ptr + offset), buf, size); +} else { +memcpy(buf, (ram_ptr + offset), size); +} + +return 0; +} + +while (size) { +/** + * The read/write logic used below is similar to the ones in + * flatview_read/write_continue() + */ +release_lock = prepare_mmio_access(mr); + +access_size = memory_access_size(mr, size, offset); + +if (is_write) { +val = ldn_he_p(ptr, access_size); + +result = memory_region_dispatch_write(mr, offset, val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); +} else { +result = memory_region_dispatch_read(mr, offset, &val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); + +stn_he_p(ptr, access_size, val); +} + +if (release_lock) { +qemu_mutex_unlock_iothread(); +release_lock = false; +} + +if (result != MEMTX_OK) { +return -1; +} + +size -= access_size; +ptr += access_size; +offset += access_size; +} + +return 0; +} + +static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar, +hwaddr bar_offset, char * const buf, +hwaddr len, const bool is_write) +{ +MemoryRegionSection section = { 0 }; +uint8_t *ptr = (uint8_t *)buf; +MemoryRegion *section_mr = NULL; +uint64_t section_size; +hwaddr section_offset; +hwaddr size = 0; + +while (len) { +section = memory_region_find(pci_dev->io_regions[pci_bar].memory, + bar_offset, len); + +if (!section.mr) { +warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset); +return size; +} + +section_mr = section.mr; +section_offset = section.offset_within_region; +section_size = int128_get64(section.size); + +if (is_write && section_mr->readonly) { +warn_report("vfu: attempting to write to readonly region in " +"bar %d - [0x%"PRIx64" - 0x%"PRIx64"]", +pci_bar, bar_offset, +(bar_off
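Note on the non-direct path of vfu_object_mr_rw() above: the while (size) loop splits one request into several accesses whose width comes from memory_access_size(). As a rough standalone illustration of that splitting, the sketch below uses a hypothetical example_access_size() that caps each access by a maximum width and by the natural alignment of the offset, then rounds down to a power of two. It is a simplified stand-in under those assumptions, not QEMU's implementation, which also consults the MemoryRegionOps of the region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for memory_access_size(); not QEMU code. */
static unsigned example_access_size(unsigned remaining, uint64_t offset,
                                    unsigned max_access)
{
    unsigned size = max_access;
    /* Lowest set bit of the offset; 0 when the offset itself is 0. */
    uint64_t align = offset & (~offset + 1);

    assert(remaining > 0 && max_access > 0);

    if (align != 0 && align < size) {
        size = (unsigned)align;
    }
    if (remaining < size) {
        size = remaining;
    }
    /* Round down to a power of two by clearing the low set bits. */
    while (size & (size - 1)) {
        size &= size - 1;
    }

    return size;
}

/* Walk a request the way the while (size) loop does; return access count. */
static unsigned example_count_accesses(uint64_t offset, unsigned len,
                                       unsigned max_access)
{
    unsigned count = 0;

    while (len) {
        unsigned n = example_access_size(len, offset, max_access);

        len -= n;
        offset += n;
        count++;
    }

    return count;
}
```

For instance, a 7-byte request at offset 1 with a 4-byte maximum is issued as accesses of 1, 2 and 4 bytes under these rules.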
[PATCH v11 11/14] vfio-user: handle DMA mappings
Define and register callbacks to manage the RAM regions used for device DMA Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/machine.c | 5 hw/remote/vfio-user-obj.c | 55 +++ hw/remote/trace-events| 2 ++ 3 files changed, 62 insertions(+) diff --git a/hw/remote/machine.c b/hw/remote/machine.c index cbb2add291..645b54343d 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -22,6 +22,7 @@ #include "hw/remote/iohub.h" #include "hw/remote/iommu.h" #include "hw/qdev-core.h" +#include "hw/remote/iommu.h" static void remote_machine_init(MachineState *machine) { @@ -51,6 +52,10 @@ static void remote_machine_init(MachineState *machine) pci_host = PCI_HOST_BRIDGE(rem_host); +if (s->vfio_user) { +remote_iommu_setup(pci_host->bus); +} + remote_iohub_init(&s->iohub); pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index cef473cb98..7b21f77052 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -284,6 +284,54 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, return count; } +static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *subregion = NULL; +g_autofree char *name = NULL; +struct iovec *iov = &info->iova; + +if (!info->vaddr) { +return; +} + +name = g_strdup_printf("mem-%s-%"PRIx64"", o->device, + (uint64_t)info->vaddr); + +subregion = g_new0(MemoryRegion, 1); + +memory_region_init_ram_ptr(subregion, NULL, name, + iov->iov_len, info->vaddr); + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion); + +trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len); +} + +static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = 
vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *mr = NULL; +ram_addr_t offset; + +mr = memory_region_from_host(info->vaddr, &offset); +if (!mr) { +return; +} + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_del_subregion(dma_as->root, mr); + +object_unparent((OBJECT(mr))); + +trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. These @@ -387,6 +435,13 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) goto fail; } +ret = vfu_setup_device_dma(o->vfu_ctx, &dma_register, &dma_unregister); +if (ret < 0) { +error_setg(errp, "vfu: Failed to setup DMA handlers for %s", + o->device); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 2ef7884346..f945c7e33b 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -7,3 +7,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x" vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x" +vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes" +vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64"" -- 2.20.1
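Note on dma_register() and dma_unregister() above: both delegate the bookkeeping to QEMU's memory API, adding a RAM subregion at the IOVA and later finding it again from the host address. To make the underlying idea concrete without QEMU, here is a minimal sketch that tracks IOVA ranges in a fixed table and translates a guest address to a host pointer. All example_* names, the table size and the linear lookup are assumptions for illustration; none of them appear in the patch or in libvfio-user.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_REGIONS 8

struct example_dma_region {
    uint64_t iova;   /* guest-visible base address */
    size_t len;      /* length in bytes; 0 marks an unused slot */
    uint8_t *vaddr;  /* host mapping that backs the range */
};

static struct example_dma_region example_regions[EXAMPLE_MAX_REGIONS];

/* Record a mapping; returns 0 on success, -1 if unmapped or table full. */
static int example_dma_register(uint64_t iova, size_t len, void *vaddr)
{
    if (!vaddr || len == 0) {
        return -1;   /* like the patch, skip regions with no host mapping */
    }
    assert(iova + len >= iova);   /* the range must not wrap around */

    for (int i = 0; i < EXAMPLE_MAX_REGIONS; i++) {
        if (example_regions[i].len == 0) {
            example_regions[i] =
                (struct example_dma_region){ iova, len, vaddr };
            return 0;
        }
    }

    return -1;
}

/* Drop the mapping backed by vaddr; returns 0 if one was removed. */
static int example_dma_unregister(void *vaddr)
{
    for (int i = 0; i < EXAMPLE_MAX_REGIONS; i++) {
        struct example_dma_region *r = &example_regions[i];

        if (r->len != 0 && r->vaddr == (uint8_t *)vaddr) {
            r->len = 0;
            r->vaddr = NULL;
            return 0;
        }
    }

    return -1;
}

/* Translate a guest address to a host pointer, or NULL if unmapped. */
static void *example_dma_translate(uint64_t addr)
{
    for (int i = 0; i < EXAMPLE_MAX_REGIONS; i++) {
        struct example_dma_region *r = &example_regions[i];

        if (r->len != 0 && addr >= r->iova && addr - r->iova < r->len) {
            return r->vaddr + (addr - r->iova);
        }
    }

    return NULL;
}
```

The sketch unregisters by host address because the patch does the same through memory_region_from_host(); a real implementation would also need to reject overlapping ranges, which is left out here.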
[PATCH v11 02/14] remote/machine: add HotplugHandler for remote machine
Allow hotplugging of PCI(e) devices to remote machine

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/machine.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 92d71d47bb..a97e53e250 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -20,6 +20,7 @@
 #include "qapi/error.h"
 #include "hw/pci/pci_host.h"
 #include "hw/remote/iohub.h"
+#include "hw/qdev-core.h"

 static void remote_machine_init(MachineState *machine)
 {
@@ -53,14 +54,19 @@ static void remote_machine_init(MachineState *machine)

     pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
                  &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+
+    qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }

 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);

     mc->init = remote_machine_init;
     mc->desc = "Experimental remote machine";
+
+    hc->unplug = qdev_simple_device_unplug_cb;
 }

 static const TypeInfo remote_machine = {
@@ -68,6 +74,10 @@ static const TypeInfo remote_machine = {
     .parent = TYPE_MACHINE,
     .instance_size = sizeof(RemoteMachineState),
     .class_init = remote_machine_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
 };

 static void remote_machine_register_types(void)
--
2.20.1
[PATCH v11 08/14] vfio-user: run vfio-user context
Setup a handler to run vfio-user context. The context is driven by messages to the file descriptor associated with it - get the fd for the context and hook up the handler with it Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/misc.json| 31 ++ hw/remote/vfio-user-obj.c | 118 +- 2 files changed, 148 insertions(+), 1 deletion(-) diff --git a/qapi/misc.json b/qapi/misc.json index 45344483cd..27ef5a2b20 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -553,3 +553,34 @@ ## { 'event': 'RTC_CHANGE', 'data': { 'offset': 'int', 'qom-path': 'str' } } + +## +# @VFU_CLIENT_HANGUP: +# +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the +# communication channel +# +# @vfu-id: ID of the TYPE_VFIO_USER_SERVER object. It is the last component +# of @vfu-qom-path referenced below +# +# @vfu-qom-path: path to the TYPE_VFIO_USER_SERVER object in the QOM tree +# +# @dev-id: ID of attached PCI device +# +# @dev-qom-path: path to attached PCI device in the QOM tree +# +# Since: 7.1 +# +# Example: +# +# <- { "event": "VFU_CLIENT_HANGUP", +# "data": { "vfu-id": "vfu1", +#"vfu-qom-path": "/objects/vfu1", +#"dev-id": "sas1", +#"dev-qom-path": "/machine/peripheral/sas1" }, +# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } +# +## +{ 'event': 'VFU_CLIENT_HANGUP', + 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str', +'dev-id': 'str', 'dev-qom-path': 'str' } } diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 3ca6aa2b45..178bd6f8ed 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -27,6 +27,9 @@ * * device - id of a device on the server, a required option. PCI devices * alone are supported presently. + * + * notes - x-vfio-user-server could block IO and monitor during the + * initialization phase. 
*/ #include "qemu/osdep.h" @@ -40,11 +43,14 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qapi/qapi-events-misc.h" #include "qemu/notify.h" +#include "qemu/thread.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" #include "hw/pci/pci.h" +#include "qemu/timer.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -86,6 +92,8 @@ struct VfuObject { PCIDevice *pci_dev; Error *unplug_blocker; + +int vfu_poll_fd; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -164,6 +172,78 @@ static void vfu_object_set_device(Object *obj, const char *str, Error **errp) vfu_object_init_ctx(o, errp); } +static void vfu_object_ctx_run(void *opaque) +{ +VfuObject *o = opaque; +const char *vfu_id; +char *vfu_path, *pci_dev_path; +int ret = -1; + +while (ret != 0) { +ret = vfu_run_ctx(o->vfu_ctx); +if (ret < 0) { +if (errno == EINTR) { +continue; +} else if (errno == ENOTCONN) { +vfu_id = object_get_canonical_path_component(OBJECT(o)); +vfu_path = object_get_canonical_path(OBJECT(o)); +g_assert(o->pci_dev); +pci_dev_path = object_get_canonical_path(OBJECT(o->pci_dev)); + /* o->device is a required property and is non-NULL here */ +g_assert(o->device); +qapi_event_send_vfu_client_hangup(vfu_id, vfu_path, + o->device, pci_dev_path); +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); +o->vfu_poll_fd = -1; +object_unparent(OBJECT(o)); +g_free(vfu_path); +g_free(pci_dev_path); +break; +} else { +VFU_OBJECT_ERROR(o, "vfu: Failed to run device %s - %s", + o->device, strerror(errno)); +break; +} +} +} +} + +static void vfu_object_attach_ctx(void *opaque) +{ +VfuObject *o = opaque; +GPollFD pfds[1]; +int ret; + +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); + +pfds[0].fd = o->vfu_poll_fd; +pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR; + +retry_attach: +ret = vfu_attach_ctx(o-&g
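Note on the loop in vfu_object_ctx_run() above: it distinguishes three failure outcomes of vfu_run_ctx(). An interrupted call (EINTR) is retried, a dropped connection (ENOTCONN) emits the hangup event and tears the object down, and any other error is reported. The sketch below reproduces only that control flow against a hypothetical step callback so it can be run without libvfio-user. example_run(), the enum and the scripted callback are illustrative assumptions; the return-value convention (0 ends the loop, a positive value keeps it going, -1 reports errno) simply mirrors how the loop above treats the result.

```c
#include <assert.h>
#include <errno.h>

enum example_outcome { EXAMPLE_DONE, EXAMPLE_HANGUP, EXAMPLE_ERROR };

/* One step of work: 0 = nothing left, > 0 = keep going, -1 = see errno. */
typedef int (*example_step_fn)(void *opaque);

static enum example_outcome example_run(example_step_fn step, void *opaque)
{
    for (;;) {
        int ret = step(opaque);

        if (ret == 0) {
            return EXAMPLE_DONE;
        }
        if (ret > 0 || errno == EINTR) {
            continue;               /* more work, or interrupted: retry */
        }
        if (errno == ENOTCONN) {
            return EXAMPLE_HANGUP;  /* caller would notify and clean up */
        }
        return EXAMPLE_ERROR;       /* caller would report strerror(errno) */
    }
}

/* Test double: replays a fixed list of errno values, 0 meaning "return 0". */
struct example_script {
    const int *errnos;
    int len;
    int calls;
};

static int example_scripted_step(void *opaque)
{
    struct example_script *s = opaque;

    assert(s->calls < s->len);

    int e = s->errnos[s->calls++];

    if (e == 0) {
        return 0;
    }
    errno = e;
    return -1;
}
```

Keeping the classification in one place makes it easy to see that only the ENOTCONN branch is expected to unregister the fd handler and unparent the object.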
[PATCH v11 04/14] vfio-user: build library
add the libvfio-user library as a submodule. build it as a meson subproject. libvfio-user is distributed with BSD 3-Clause license and json-c with MIT (Expat) license Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- configure | 17 + meson.build | 23 ++- .gitlab-ci.d/buildtest.yml | 1 + .gitmodules | 3 +++ Kconfig.host| 4 MAINTAINERS | 1 + hw/remote/Kconfig | 4 hw/remote/meson.build | 2 ++ meson_options.txt | 2 ++ subprojects/libvfio-user| 1 + tests/docker/dockerfiles/centos8.docker | 2 ++ 11 files changed, 59 insertions(+), 1 deletion(-) create mode 16 subprojects/libvfio-user diff --git a/configure b/configure index e69537c756..39f30c0283 100755 --- a/configure +++ b/configure @@ -315,6 +315,7 @@ meson_args="" ninja="" bindir="bin" skip_meson=no +vfio_user_server="disabled" # The following Meson options are handled manually (still they # are included in the automatically generated help message) @@ -909,6 +910,10 @@ for opt do ;; --disable-blobs) meson_option_parse --disable-install-blobs "" ;; + --enable-vfio-user-server) vfio_user_server="enabled" + ;; + --disable-vfio-user-server) vfio_user_server="disabled" + ;; --enable-tcmalloc) meson_option_parse --enable-malloc=tcmalloc tcmalloc ;; --enable-jemalloc) meson_option_parse --enable-malloc=jemalloc jemalloc @@ -2133,6 +2138,17 @@ write_container_target_makefile() { +## +# check for vfio_user_server + +case "$vfio_user_server" in + enabled ) +if test "$git_submodules_action" != "ignore"; then + git_submodules="${git_submodules} subprojects/libvfio-user" +fi +;; +esac + ## # End of CC checks # After here, no more $cc or $ld runs @@ -2669,6 +2685,7 @@ if test "$skip_meson" = no; then test "$slirp" != auto && meson_option_add "-Dslirp=$slirp" test "$smbd" != '' && meson_option_add "-Dsmbd=$smbd" test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg" + test "$vfio_user_server" != auto && meson_option_add 
"-Dvfio_user_server=$vfio_user_server" run_meson() { NINJA=$ninja $meson setup --prefix "$prefix" "$@" $cross_arg "$PWD" "$source_path" } diff --git a/meson.build b/meson.build index 21cd949082..fac9853254 100644 --- a/meson.build +++ b/meson.build @@ -308,6 +308,10 @@ multiprocess_allowed = get_option('multiprocess') \ .require(targetos == 'linux', error_message: 'Multiprocess QEMU is supported only on Linux') \ .allowed() +vfio_user_server_allowed = get_option('vfio_user_server') \ + .require(targetos == 'linux', error_message: 'vfio-user server is supported only on Linux') \ + .allowed() + have_tpm = get_option('tpm') \ .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \ .allowed() @@ -2373,7 +2377,8 @@ host_kconfig = \ (have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \ ('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \ (have_pvrdma ? ['CONFIG_PVRDMA=y'] : []) + \ - (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + \ + (vfio_user_server_allowed ? ['CONFIG_VFIO_USER_SERVER_ALLOWED=y'] : []) ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ] @@ -2665,6 +2670,21 @@ if have_system endif endif +libvfio_user_dep = not_found +if have_system and vfio_user_server_allowed + have_internal = fs.exists(meson.current_source_dir() / 'subprojects/libvfio-user/meson.build') + + if not have_internal +error('libvfio-user source not found - please pull git submodule') + endif + + libvfio_user_proj = subproject('libvfio-user') + + libvfio_user_lib = libvfio_user_proj.get_variable('libvfio_user_dep') + + libvfio_user_dep = declare_dependency(dependencies: [libvfio_user_lib]) +endif + fdt = not_found if have_system fdt_opt = get_option('fdt') @@ -3783,6 +3803,7 @@ summary_info += {'target list': ' '.join(target_dirs)} if have_system summary_info += {'default devices': get_
[PATCH v11 10/14] vfio-user: IOMMU support for remote device
Assign separate address space for each device in the remote processes. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/remote/iommu.h | 40 hw/remote/iommu.c | 131 ++ hw/remote/machine.c | 13 +++- MAINTAINERS | 2 + hw/remote/meson.build | 1 + 5 files changed, 186 insertions(+), 1 deletion(-) create mode 100644 include/hw/remote/iommu.h create mode 100644 hw/remote/iommu.c diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h new file mode 100644 index 00..33b68a8f4b --- /dev/null +++ b/include/hw/remote/iommu.h @@ -0,0 +1,40 @@ +/** + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_IOMMU_H +#define REMOTE_IOMMU_H + +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" + +#ifndef INT2VOIDP +#define INT2VOIDP(i) (void *)(uintptr_t)(i) +#endif + +typedef struct RemoteIommuElem { +MemoryRegion *mr; + +AddressSpace as; +} RemoteIommuElem; + +#define TYPE_REMOTE_IOMMU "x-remote-iommu" +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU) + +struct RemoteIommu { +Object parent; + +GHashTable *elem_by_devfn; + +QemuMutex lock; +}; + +void remote_iommu_setup(PCIBus *pci_bus); + +void remote_iommu_unplug_dev(PCIDevice *pci_dev); + +#endif diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c new file mode 100644 index 00..fd723d91f3 --- /dev/null +++ b/hw/remote/iommu.c @@ -0,0 +1,131 @@ +/** + * IOMMU for remote device + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" + +#include "hw/remote/iommu.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" +#include "exec/memory.h" +#include "exec/address-spaces.h" +#include "trace.h" + +/** + * IOMMU for TYPE_REMOTE_MACHINE - manages DMA address space isolation + * for remote machine. It is used by TYPE_VFIO_USER_SERVER. + * + * - Each TYPE_VFIO_USER_SERVER instance handles one PCIDevice on a PCIBus. + * There is one RemoteIommu per PCIBus, so the RemoteIommu tracks multiple + * PCIDevices by maintaining a ->elem_by_devfn mapping. + * + * - memory_region_init_iommu() is not used because vfio-user MemoryRegions + * will be added to the elem->mr container instead. This is more natural + * than implementing the IOMMUMemoryRegionClass APIs since vfio-user + * provides something that is close to a full-fledged MemoryRegion and + * not like an IOMMU mapping. + * + * - When a device is hot unplugged, the elem->mr reference is dropped so + * all vfio-user MemoryRegions associated with this vfio-user server are + * destroyed. 
+ */ + +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus, + void *opaque, int devfn) +{ +RemoteIommu *iommu = opaque; +RemoteIommuElem *elem = NULL; + +qemu_mutex_lock(&iommu->lock); + +elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn)); + +if (!elem) { +elem = g_malloc0(sizeof(RemoteIommuElem)); +g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem); +} + +if (!elem->mr) { +elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION)); +memory_region_set_size(elem->mr, UINT64_MAX); +address_space_init(&elem->as, elem->mr, NULL); +} + +qemu_mutex_unlock(&iommu->lock); + +return &elem->as; +} + +void remote_iommu_unplug_dev(PCIDevice *pci_dev) +{ +AddressSpace *as = pci_device_iommu_address_space(pci_dev); +RemoteIommuElem *elem = NULL; + +if (as == &address_space_memory) { +return; +} + +elem = container_of(as, RemoteIommuElem, as); + +address_space_destroy(&elem->as); + +object_unref(elem->mr); + +elem->mr = NULL; +} + +static void remote_iommu_init(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free); + +qemu_mutex_init(&iommu->lock); +} + +static void remote_iommu_finalize(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +qemu_mutex_destroy(&iommu->lock); + +g_hash_table_destroy(iommu->elem_by_devfn); + +iommu->elem_by_devfn = NULL; +} + +void remote_iommu_setup(PCIBus *pci_bus) +{ +RemoteIommu *iommu = NULL; + +g_assert(pci_bus); + +iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU)); + +pci_setup_iommu(pci_bus, remote_iommu_find
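Note on remote_iommu_find_add_as() above: it is a lookup-or-create keyed by devfn. The first call for a device allocates its element, and later calls return the same address space. A stripped-down sketch of that pattern, using a plain 256-entry array instead of a GHashTable and leaving out the mutex, might look like the following. The example_* types are placeholders and not QEMU structures, and unlike the patch, which keeps the element and only drops elem->mr on unplug, this sketch frees the whole element.

```c
#include <assert.h>
#include <stdlib.h>

#define EXAMPLE_DEVFN_MAX 256   /* same value as PCI_DEVFN_MAX */

/* Placeholder for the per-device AddressSpace. */
struct example_as {
    int devfn;
};

struct example_iommu {
    struct example_as *elem_by_devfn[EXAMPLE_DEVFN_MAX];
};

/* Return the address space for devfn, creating it on first use. */
static struct example_as *example_find_add_as(struct example_iommu *iommu,
                                              int devfn)
{
    assert(devfn >= 0 && devfn < EXAMPLE_DEVFN_MAX);

    struct example_as *elem = iommu->elem_by_devfn[devfn];

    if (!elem) {
        elem = calloc(1, sizeof(*elem));
        if (!elem) {
            return NULL;
        }
        elem->devfn = devfn;
        iommu->elem_by_devfn[devfn] = elem;
    }

    return elem;
}

/* Forget the address space of an unplugged device. */
static void example_unplug(struct example_iommu *iommu, int devfn)
{
    assert(devfn >= 0 && devfn < EXAMPLE_DEVFN_MAX);

    free(iommu->elem_by_devfn[devfn]);
    iommu->elem_by_devfn[devfn] = NULL;
}
```

Because every devfn gets its own entry, DMA mappings registered for one device are never visible through another device's address space, which is the isolation the commit message describes.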
[PATCH v11 14/14] vfio-user: handle reset of remote device
Adds handler to reset a remote device

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/vfio-user-obj.c | 20
 1 file changed, 20 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 2d716e6391..c8c61494dd 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -676,6 +676,20 @@ void vfu_object_set_bus_irq(PCIBus *pci_bus)
                  max_bdf);
 }

+static int vfu_object_device_reset(vfu_ctx_t *vfu_ctx, vfu_reset_type_t type)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+
+    /* vfu_object_ctx_run() handles lost connection */
+    if (type == VFU_RESET_LOST_CONN) {
+        return 0;
+    }
+
+    qdev_reset_all(DEVICE(o->pci_dev));
+
+    return 0;
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -795,6 +809,12 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         goto fail;
     }

+    ret = vfu_setup_device_reset_cb(o->vfu_ctx, &vfu_object_device_reset);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup reset callback");
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
--
2.20.1
[PATCH v11 05/14] vfio-user: define vfio-user-server object
Define vfio-user object which is remote process server for QEMU. Setup object initialization functions and properties necessary to instantiate the object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/qom.json | 20 +++- include/hw/remote/machine.h | 2 + hw/remote/machine.c | 27 + hw/remote/vfio-user-obj.c | 210 MAINTAINERS | 1 + hw/remote/meson.build | 1 + hw/remote/trace-events | 3 + 7 files changed, 262 insertions(+), 2 deletions(-) create mode 100644 hw/remote/vfio-user-obj.c diff --git a/qapi/qom.json b/qapi/qom.json index 6a653c6636..80dd419b39 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -734,6 +734,20 @@ { 'struct': 'RemoteObjectProperties', 'data': { 'fd': 'str', 'devid': 'str' } } +## +# @VfioUserServerProperties: +# +# Properties for x-vfio-user-server objects. +# +# @socket: socket to be used by the libvfio-user library +# +# @device: the ID of the device to be emulated at the server +# +# Since: 7.1 +## +{ 'struct': 'VfioUserServerProperties', + 'data': { 'socket': 'SocketAddress', 'device': 'str' } } + ## # @RngProperties: # @@ -874,7 +888,8 @@ 'tls-creds-psk', 'tls-creds-x509', 'tls-cipher-suites', -{ 'name': 'x-remote-object', 'features': [ 'unstable' ] } +{ 'name': 'x-remote-object', 'features': [ 'unstable' ] }, +{ 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } ] } ## @@ -938,7 +953,8 @@ 'tls-creds-psk': 'TlsCredsPskProperties', 'tls-creds-x509': 'TlsCredsX509Properties', 'tls-cipher-suites': 'TlsCredsProperties', - 'x-remote-object':'RemoteObjectProperties' + 'x-remote-object':'RemoteObjectProperties', + 'x-vfio-user-server': 'VfioUserServerProperties' } } ## diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 8d0fa98d33..ac32fda387 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -24,6 +24,8 @@ struct RemoteMachineState { RemoteIOHubState iohub; bool vfio_user; + +bool auto_shutdown; }; /* 
Used to pass to co-routine device and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index 9f3cdc55c3..4d008ed721 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -77,6 +77,28 @@ static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) s->vfio_user = value; } +static bool remote_machine_get_auto_shutdown(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->auto_shutdown; +} + +static void remote_machine_set_auto_shutdown(Object *obj, bool value, + Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = value; +} + +static void remote_machine_instance_init(Object *obj) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = true; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -90,12 +112,17 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) object_class_property_add_bool(oc, "vfio-user", remote_machine_get_vfio_user, remote_machine_set_vfio_user); + +object_class_property_add_bool(oc, "auto-shutdown", + remote_machine_get_auto_shutdown, + remote_machine_set_auto_shutdown); } static const TypeInfo remote_machine = { .name = TYPE_REMOTE_MACHINE, .parent = TYPE_MACHINE, .instance_size = sizeof(RemoteMachineState), +.instance_init = remote_machine_instance_init, .class_init = remote_machine_class_init, .interfaces = (InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c new file mode 100644 index 00..bc49adcc27 --- /dev/null +++ b/hw/remote/vfio-user-obj.c @@ -0,0 +1,210 @@ +/** + * QEMU vfio-user-server server object + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL-v2, version 2 or later. + * + * See the COPYING file in the top-level directory. + * + */ + +/** + * Usage: add options: + * -machine x-remote,vfi
[PATCH v11 07/14] vfio-user: find and init PCI device
Find the PCI device with specified id. Initialize the device context with the QEMU PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 67 +++ 1 file changed, 67 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 68f8a9dfa9..3ca6aa2b45 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -43,6 +43,8 @@ #include "qemu/notify.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" +#include "hw/qdev-core.h" +#include "hw/pci/pci.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -80,6 +82,10 @@ struct VfuObject { Notifier machine_done; vfu_ctx_t *vfu_ctx; + +PCIDevice *pci_dev; + +Error *unplug_blocker; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -181,6 +187,9 @@ static void vfu_object_machine_done(Notifier *notifier, void *data) static void vfu_object_init_ctx(VfuObject *o, Error **errp) { ERRP_GUARD(); +DeviceState *dev = NULL; +vfu_pci_type_t pci_type = VFU_PCI_TYPE_CONVENTIONAL; +int ret; if (o->vfu_ctx || !o->socket || !o->device || !phase_check(PHASE_MACHINE_READY)) { @@ -199,6 +208,53 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); return; } + +dev = qdev_find_recursive(sysbus_get_default(), o->device); +if (dev == NULL) { +error_setg(errp, "vfu: Device %s not found", o->device); +goto fail; +} + +if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { +error_setg(errp, "vfu: %s not a PCI device", o->device); +goto fail; +} + +o->pci_dev = PCI_DEVICE(dev); + +object_ref(OBJECT(o->pci_dev)); + +if (pci_is_express(o->pci_dev)) { +pci_type = VFU_PCI_TYPE_EXPRESS; +} + +ret = vfu_pci_init(o->vfu_ctx, pci_type, PCI_HEADER_TYPE_NORMAL, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to attach PCI device %s to context - %s", + 
o->device, strerror(errno)); +goto fail; +} + +error_setg(&o->unplug_blocker, + "vfu: %s for %s must be deleted before unplugging", + TYPE_VFU_OBJECT, o->device); +qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); + +return; + +fail: +vfu_destroy_ctx(o->vfu_ctx); +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} +o->vfu_ctx = NULL; } static void vfu_object_init(Object *obj) @@ -241,6 +297,17 @@ static void vfu_object_finalize(Object *obj) o->device = NULL; +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} + +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} + if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } -- 2.20.1
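Note on the error handling in vfu_object_init_ctx() above: every failure after the context is created jumps to one fail label, which undoes only what has actually been set up (unplug blocker, device reference, context). The sketch below isolates that single-exit cleanup pattern with boolean flags in place of the real resources. The struct, the fail_at parameter and the function name are hypothetical and exist only so the pattern can be run on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct example_obj {
    bool ctx;       /* stands in for o->vfu_ctx */
    bool dev_ref;   /* stands in for the reference held on o->pci_dev */
    bool blocker;   /* stands in for o->unplug_blocker */
};

/* fail_at picks the failing step: 0 = none, 1 = device lookup, 2 = PCI init. */
static int example_init_ctx(struct example_obj *o, int fail_at)
{
    assert(o != NULL);

    if (o->ctx) {
        return 0;   /* already initialized, nothing to do */
    }

    o->ctx = true;              /* context created */

    if (fail_at == 1) {
        goto fail;              /* device not found */
    }
    o->dev_ref = true;          /* reference taken on the device */

    if (fail_at == 2) {
        goto fail;              /* attaching the PCI device failed */
    }
    o->blocker = true;          /* unplug blocker installed */

    return 0;

fail:
    /* Release in reverse order, skipping anything never acquired. */
    if (o->blocker) {
        o->blocker = false;
    }
    if (o->dev_ref) {
        o->dev_ref = false;
    }
    o->ctx = false;

    return -1;
}
```

The same checks appear again in vfu_object_finalize(), so an object that failed halfway through initialization and one that is destroyed normally end up in the same clean state.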
[PATCH v11 03/14] remote/machine: add vfio-user property
Add a vfio-user property to the x-remote machine. It is a boolean that indicates whether the machine supports the vfio-user protocol. The machine configures the bus differently for the vfio-user and multiprocess protocols, so this property tells it how to configure the bus. This property should be short-lived: once vfio-user fully replaces multiprocess, it can be removed. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/remote/machine.h | 2 ++ hw/remote/machine.c | 23 +++ 2 files changed, 25 insertions(+) diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 2a2a33c4b2..8d0fa98d33 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -22,6 +22,8 @@ struct RemoteMachineState { RemotePCIHost *host; RemoteIOHubState iohub; + +bool vfio_user; }; /* Used to pass to co-routine device and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index a97e53e250..9f3cdc55c3 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -58,6 +58,25 @@ static void remote_machine_init(MachineState *machine) qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s)); } +static bool remote_machine_get_vfio_user(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->vfio_user; +} + +static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +if (phase_check(PHASE_MACHINE_CREATED)) { +error_setg(errp, "Error enabling vfio-user - machine already created"); +return; +} + +s->vfio_user = value; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -67,6 +86,10 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) mc->desc = "Experimental remote machine"; hc->unplug = qdev_simple_device_unplug_cb; + +object_class_property_add_bool(oc, "vfio-user", + remote_machine_get_vfio_user, 
+ remote_machine_set_vfio_user); } static const TypeInfo remote_machine = { -- 2.20.1
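The setter in this patch refuses to change the property once the machine has been created. A rough standalone model of that guard follows. It assumes a simple two-value phase enum in place of QEMU's phase tracking, and all names (sketch_phase, sketch_machine, sketch_set_vfio_user) are hypothetical, not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the machine life-cycle phases. */
enum sketch_phase { SKETCH_PHASE_CONFIG, SKETCH_PHASE_CREATED };

struct sketch_machine {
    enum sketch_phase phase;
    bool vfio_user;
};

/*
 * Change the flag only while the machine is still being configured.
 * Returns NULL on success, or a static error string if it is too late.
 */
const char *sketch_set_vfio_user(struct sketch_machine *m, bool value)
{
    if (m->phase >= SKETCH_PHASE_CREATED) {
        return "Error enabling vfio-user - machine already created";
    }
    m->vfio_user = value;
    return NULL;
}
```

Rejecting late writes keeps the bus configuration, which is derived from this flag at machine init time, consistent with the property value for the rest of the run.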
[PATCH v11 09/14] vfio-user: handle PCI config space accesses
Define and register handlers for PCI config space accesses Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 51 +++ hw/remote/trace-events| 2 ++ 2 files changed, 53 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 178bd6f8ed..cef473cb98 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -46,6 +46,7 @@ #include "qapi/qapi-events-misc.h" #include "qemu/notify.h" #include "qemu/thread.h" +#include "qemu/main-loop.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" @@ -244,6 +245,45 @@ retry_attach: qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o); } +static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, + size_t count, loff_t offset, + const bool is_write) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +uint32_t pci_access_width = sizeof(uint32_t); +size_t bytes = count; +uint32_t val = 0; +char *ptr = buf; +int len; + +/* + * Writes to the BAR registers would trigger an update to the + * global Memory and IO AddressSpaces. But the remote device + * never uses the global AddressSpaces, therefore overlapping + * memory regions are not a problem + */ +while (bytes > 0) { +len = (bytes > pci_access_width) ? pci_access_width : bytes; +if (is_write) { +memcpy(&val, ptr, len); +pci_host_config_write_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), + val, len); +trace_vfu_cfg_write(offset, val); +} else { +val = pci_host_config_read_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), len); +memcpy(ptr, &val, len); +trace_vfu_cfg_read(offset, val); +} +offset += len; +ptr += len; +bytes -= len; +} + +return count; +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. 
These @@ -336,6 +376,17 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) TYPE_VFU_OBJECT, o->device); qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +ret = vfu_setup_region(o->vfu_ctx, VFU_PCI_DEV_CFG_REGION_IDX, + pci_config_size(o->pci_dev), &vfu_object_cfg_access, + VFU_REGION_FLAG_RW | VFU_REGION_FLAG_ALWAYS_CB, + NULL, 0, -1, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to setup config space handlers for %s- %s", + o->device, strerror(errno)); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 7da12f0d96..2ef7884346 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -5,3 +5,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, # vfio-user-obj.c vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" +vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x -> 0x%x" +vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x <- 0x%x" -- 2.20.1
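The config space handler in this patch splits an arbitrary-length request into accesses of at most four bytes each. The loop can be sketched on its own, with a plain byte array standing in for the device. The fake device, the helper names and the bounds check are assumptions made for this sketch and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CFG_SIZE 256   /* conventional PCI config space size */
#define SKETCH_WIDTH    4     /* widest single access, in bytes */

/* Hypothetical device: a byte array plus a counter of individual accesses. */
struct sketch_dev {
    uint8_t cfg[SKETCH_CFG_SIZE];
    unsigned accesses;
};

/* Stand-ins for the per-access config read/write helpers. */
static void sketch_cfg_write(struct sketch_dev *d, size_t off,
                             uint32_t val, size_t len)
{
    memcpy(&d->cfg[off], &val, len);
    d->accesses++;
}

static uint32_t sketch_cfg_read(struct sketch_dev *d, size_t off, size_t len)
{
    uint32_t val = 0;

    memcpy(&val, &d->cfg[off], len);
    d->accesses++;
    return val;
}

/* Returns the byte count on success, -1 if the range is out of bounds. */
long sketch_cfg_access(struct sketch_dev *d, char *buf, size_t count,
                       size_t offset, bool is_write)
{
    size_t bytes = count;
    char *ptr = buf;

    /* Bounds check added for the sketch only. */
    if (offset > SKETCH_CFG_SIZE || count > SKETCH_CFG_SIZE - offset) {
        return -1;
    }

    while (bytes > 0) {
        size_t len = bytes > SKETCH_WIDTH ? SKETCH_WIDTH : bytes;
        uint32_t val = 0;

        if (is_write) {
            memcpy(&val, ptr, len);
            sketch_cfg_write(d, offset, val, len);
        } else {
            val = sketch_cfg_read(d, offset, len);
            memcpy(ptr, &val, len);
        }
        offset += len;
        ptr += len;
        bytes -= len;
    }

    return (long)count;
}
```

A 10-byte request, for example, becomes three accesses of 4, 4 and 2 bytes.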
[PATCH v11 06/14] vfio-user: instantiate vfio-user context
create a context with the vfio-user library to run a PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 82 +++ 1 file changed, 82 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index bc49adcc27..68f8a9dfa9 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -40,6 +40,9 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qemu/notify.h" +#include "sysemu/sysemu.h" +#include "libvfio-user.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -73,8 +76,14 @@ struct VfuObject { char *device; Error *err; + +Notifier machine_done; + +vfu_ctx_t *vfu_ctx; }; +static void vfu_object_init_ctx(VfuObject *o, Error **errp); + static bool vfu_object_auto_shutdown(void) { bool auto_shutdown = true; @@ -107,6 +116,11 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set socket property - server busy"); +return; +} + qapi_free_SocketAddress(o->socket); o->socket = NULL; @@ -122,17 +136,69 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, } trace_vfu_prop("socket", o->socket->u.q_unix.path); + +vfu_object_init_ctx(o, errp); } static void vfu_object_set_device(Object *obj, const char *str, Error **errp) { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set device property - server busy"); +return; +} + g_free(o->device); o->device = g_strdup(str); trace_vfu_prop("device", str); + +vfu_object_init_ctx(o, errp); +} + +/* + * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' + * properties. It also depends on devices instantiated in QEMU. 
These + * dependencies are not available during the instance_init phase of this + * object's life-cycle. As such, the server is initialized after the + * machine is setup. machine_init_done_notifier notifies TYPE_VFU_OBJECT + * when the machine is setup, and the dependencies are available. + */ +static void vfu_object_machine_done(Notifier *notifier, void *data) +{ +VfuObject *o = container_of(notifier, VfuObject, machine_done); +Error *err = NULL; + +vfu_object_init_ctx(o, &err); + +if (err) { +error_propagate(&error_abort, err); +} +} + +static void vfu_object_init_ctx(VfuObject *o, Error **errp) +{ +ERRP_GUARD(); + +if (o->vfu_ctx || !o->socket || !o->device || +!phase_check(PHASE_MACHINE_READY)) { +return; +} + +if (o->err) { +error_propagate(errp, o->err); +o->err = NULL; +return; +} + +o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0, +o, VFU_DEV_TYPE_PCI); +if (o->vfu_ctx == NULL) { +error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); +return; +} } static void vfu_object_init(Object *obj) @@ -147,6 +213,12 @@ static void vfu_object_init(Object *obj) TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE); return; } + +if (!phase_check(PHASE_MACHINE_READY)) { +o->machine_done.notify = vfu_object_machine_done; +qemu_add_machine_init_done_notifier(&o->machine_done); +} + } static void vfu_object_finalize(Object *obj) @@ -160,6 +232,11 @@ static void vfu_object_finalize(Object *obj) o->socket = NULL; +if (o->vfu_ctx) { +vfu_destroy_ctx(o->vfu_ctx); +o->vfu_ctx = NULL; +} + g_free(o->device); o->device = NULL; @@ -167,6 +244,11 @@ static void vfu_object_finalize(Object *obj) if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } + +if (o->machine_done.notify) { +qemu_remove_machine_init_done_notifier(&o->machine_done); +o->machine_done.notify = NULL; +} } static void vfu_object_class_init(ObjectClass *klass, void *data) -- 2.20.1
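The comment above describes a deferred-initialization scheme: each property setter and the machine-done notifier all call the same init function, which does nothing until every prerequisite is present. A minimal standalone model of that idea follows. The struct and function names are hypothetical, and a counter stands in for actually creating a context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical server state; a counter replaces real context creation. */
struct sketch_server {
    const char *socket_path;
    const char *device_id;
    bool machine_ready;
    bool ctx_created;
    unsigned create_calls;
};

/* Create the context once all prerequisites exist; otherwise do nothing. */
void sketch_try_init(struct sketch_server *s)
{
    if (s->ctx_created || !s->socket_path || !s->device_id ||
        !s->machine_ready) {
        return;
    }
    s->ctx_created = true;
    s->create_calls++;
}

/* Setters refuse to change a property once the server is running. */
bool sketch_set_socket(struct sketch_server *s, const char *path)
{
    if (s->ctx_created) {
        return false;               /* "server busy" */
    }
    s->socket_path = path;
    sketch_try_init(s);
    return true;
}

bool sketch_set_device(struct sketch_server *s, const char *id)
{
    if (s->ctx_created) {
        return false;               /* "server busy" */
    }
    s->device_id = id;
    sketch_try_init(s);
    return true;
}

/* Models the machine-init-done notifier firing. */
void sketch_machine_done(struct sketch_server *s)
{
    s->machine_ready = true;
    sketch_try_init(s);
}
```

Because every entry point funnels into one idempotent function, the order in which the properties are set and the machine becomes ready does not matter.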
[PATCH v11 01/14] qdev: unplug blocker for devices
Add a blocker to prevent hot-unplug of devices. TYPE_VFIO_USER_SERVER, which is introduced shortly, attaches itself to a PCIDevice on which it depends. If the attached PCIDevice is removed while the server is in use, the server could crash. To prevent this, TYPE_VFIO_USER_SERVER adds an unplug blocker for the PCIDevice. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/qdev-core.h | 29 + hw/core/qdev.c | 24 softmmu/qdev-monitor.c | 4 3 files changed, 57 insertions(+) diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index 92c3d65208..98774e2835 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -193,6 +193,7 @@ struct DeviceState { int instance_id_alias; int alias_required_for_version; ResettableState reset; +GSList *unplug_blockers; }; struct DeviceListener { @@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev, void qdev_machine_creation_done(void); bool qdev_machine_modified(void); +/** + * qdev_add_unplug_blocker: Add an unplug blocker to a device + * + * @dev: Device to be blocked from unplug + * @reason: Reason for blocking + */ +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason); + +/** + * qdev_del_unplug_blocker: Remove an unplug blocker from a device + * + * @dev: Device to be unblocked + * @reason: Pointer to the Error used with qdev_add_unplug_blocker. + * Used as a handle to lookup the blocker for deletion. 
+ */ +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason); + +/** + * qdev_unplug_blocked: Confirm if a device is blocked from unplug + * + * @dev: Device to be tested + * @errp: Set to one of the reasons why the device is blocked, + * if any + * + * Returns: true if device is blocked from unplug, false otherwise + */ +bool qdev_unplug_blocked(DeviceState *dev, Error **errp); + /** * GpioPolarity: Polarity of a GPIO line * diff --git a/hw/core/qdev.c b/hw/core/qdev.c index 84f3019440..0806d8fcaa 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev) return NULL; } +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason) +{ +dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason); +} + +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason) +{ +dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason); +} + +bool qdev_unplug_blocked(DeviceState *dev, Error **errp) +{ +ERRP_GUARD(); + +if (dev->unplug_blockers) { +error_propagate(errp, error_copy(dev->unplug_blockers->data)); +return true; +} + +return false; +} + static bool device_get_realized(Object *obj, Error **errp) { DeviceState *dev = DEVICE(obj); @@ -704,6 +726,8 @@ static void device_finalize(Object *obj) DeviceState *dev = DEVICE(obj); +g_assert(!dev->unplug_blockers); + QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) { QLIST_REMOVE(ngl, node); qemu_free_irqs(ngl->in, ngl->num_in); diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c index bb5897fc76..4b0ef65780 100644 --- a/softmmu/qdev-monitor.c +++ b/softmmu/qdev-monitor.c @@ -899,6 +899,10 @@ void qdev_unplug(DeviceState *dev, Error **errp) HotplugHandlerClass *hdc; Error *local_err = NULL; +if (qdev_unplug_blocked(dev, errp)) { +return; +} + if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) { error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name); return; -- 2.20.1
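The blocker mechanism in this patch is a per-device list of reasons: a non-empty list vetoes unplug, and the reason pointer doubles as the handle for removal. A standalone sketch of the same idea follows, with a hand-rolled singly linked list in place of GSList and plain strings in place of Error objects. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; the patch itself uses a GSList of Error pointers. */
struct sketch_blocker {
    const char *reason;
    struct sketch_blocker *next;
};

struct sketch_device {
    struct sketch_blocker *blockers;
};

/* Prepend a blocker, so the most recently added reason is reported first. */
void sketch_add_blocker(struct sketch_device *d, const char *reason)
{
    struct sketch_blocker *b = malloc(sizeof(*b));

    if (b == NULL) {
        abort();
    }
    b->reason = reason;
    b->next = d->blockers;
    d->blockers = b;
}

/* Remove by pointer identity: the reason acts as the lookup handle. */
void sketch_del_blocker(struct sketch_device *d, const char *reason)
{
    struct sketch_blocker **pp = &d->blockers;

    while (*pp != NULL) {
        if ((*pp)->reason == reason) {
            struct sketch_blocker *dead = *pp;

            *pp = dead->next;
            free(dead);
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Returns true if unplug may proceed; otherwise reports one reason. */
bool sketch_unplug(struct sketch_device *d, const char **why)
{
    if (d->blockers != NULL) {
        if (why != NULL) {
            *why = d->blockers->reason;
        }
        return false;
    }
    return true;
}
```

Several independent users can block the same device, and each removes only its own entry, which is why the handle is a pointer and not a string comparison.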
[PATCH v11 00/14] vfio-user server in QEMU
Hi, This is v11 of the server side changes to enable vfio-user in QEMU. Thank you for reviewing and sharing your feedback for the previous revision. We have addressed your comments in this revision. We made the following changes in this series: [PATCH v11 13/14] vfio-user: handle device interrupts - Added msi_set_irq_state() and msix_set_irq_state() to mask and unmask individual MSI(x) vectors - Implement callbacks to handle the MASK/UNMASK actions initiated by SET_IRQS message - vfu_object_set_bus_irq() sets the maximum number of IRQS to max BDF. This only affects devices using INTx - allows multiple devices to use INTx Thank you very much! Jagannathan Raman (14): qdev: unplug blocker for devices remote/machine: add HotplugHandler for remote machine remote/machine: add vfio-user property vfio-user: build library vfio-user: define vfio-user-server object vfio-user: instantiate vfio-user context vfio-user: find and init PCI device vfio-user: run vfio-user context vfio-user: handle PCI config space accesses vfio-user: IOMMU support for remote device vfio-user: handle DMA mappings vfio-user: handle PCI BAR accesses vfio-user: handle device interrupts vfio-user: handle reset of remote device configure | 17 + meson.build | 23 +- qapi/misc.json | 31 + qapi/qom.json | 20 +- include/exec/memory.h | 3 + include/hw/pci/msi.h| 1 + include/hw/pci/msix.h | 1 + include/hw/pci/pci.h| 13 + include/hw/qdev-core.h | 29 + include/hw/remote/iommu.h | 40 + include/hw/remote/machine.h | 4 + include/hw/remote/vfio-user-obj.h | 6 + hw/core/qdev.c | 24 + hw/pci/msi.c| 48 +- hw/pci/msix.c | 35 +- hw/pci/pci.c| 13 + hw/remote/iommu.c | 131 hw/remote/machine.c | 88 ++- hw/remote/vfio-user-obj.c | 958 softmmu/physmem.c | 4 +- softmmu/qdev-monitor.c | 4 + stubs/vfio-user-obj.c | 6 + tests/qtest/fuzz/generic_fuzz.c | 9 +- .gitlab-ci.d/buildtest.yml | 1 + .gitmodules | 3 + Kconfig.host| 4 + MAINTAINERS | 5 + hw/remote/Kconfig | 4 + hw/remote/meson.build | 4 + hw/remote/trace-events | 11 + 
meson_options.txt | 2 + stubs/meson.build | 1 + subprojects/libvfio-user| 1 + tests/docker/dockerfiles/centos8.docker | 2 + 34 files changed, 1527 insertions(+), 19 deletions(-) create mode 100644 include/hw/remote/iommu.h create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 hw/remote/iommu.c create mode 100644 hw/remote/vfio-user-obj.c create mode 100644 stubs/vfio-user-obj.c create mode 16 subprojects/libvfio-user -- 2.20.1
[PATCH v10 13/14] vfio-user: handle device interrupts
Forward remote device's interrupts to the guest Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/pci/pci.h | 13 include/hw/remote/vfio-user-obj.h | 6 ++ hw/pci/msi.c | 16 ++-- hw/pci/msix.c | 10 ++- hw/pci/pci.c | 13 hw/remote/machine.c | 14 +++- hw/remote/vfio-user-obj.c | 123 ++ stubs/vfio-user-obj.c | 6 ++ MAINTAINERS | 1 + hw/remote/trace-events| 1 + stubs/meson.build | 1 + 11 files changed, 193 insertions(+), 11 deletions(-) create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 stubs/vfio-user-obj.c diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 44dacfa224..b54b6ef88f 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -16,6 +16,7 @@ extern bool pci_available; #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f) #define PCI_FUNC(devfn) ((devfn) & 0x07) #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn)) +#define PCI_BDF_TO_DEVFN(x) ((x) & 0xff) #define PCI_BUS_MAX 256 #define PCI_DEVFN_MAX 256 #define PCI_SLOT_MAX32 @@ -127,6 +128,10 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type); typedef void PCIUnregisterFunc(PCIDevice *pci_dev); +typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg); +typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector); +typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector); + typedef struct PCIIORegion { pcibus_t addr; /* current PCI mapping address. 
-1 means not mapped */ #define PCI_BAR_UNMAPPED (~(pcibus_t)0) @@ -329,6 +334,14 @@ struct PCIDevice { /* Space to store MSIX table & pending bit array */ uint8_t *msix_table; uint8_t *msix_pba; + +/* May be used by INTx or MSI during interrupt notification */ +void *irq_opaque; + +MSITriggerFunc *msi_trigger; +MSIPrepareMessageFunc *msi_prepare_message; +MSIxPrepareMessageFunc *msix_prepare_message; + /* MemoryRegion container for msix exclusive BAR setup */ MemoryRegion msix_exclusive_bar; /* Memory Regions for MSIX table and pending bit entries. */ diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h new file mode 100644 index 00..87ab78b875 --- /dev/null +++ b/include/hw/remote/vfio-user-obj.h @@ -0,0 +1,6 @@ +#ifndef VFIO_USER_OBJ_H +#define VFIO_USER_OBJ_H + +void vfu_object_set_bus_irq(PCIBus *pci_bus); + +#endif diff --git a/hw/pci/msi.c b/hw/pci/msi.c index 47d2b0f33c..d556e17a09 100644 --- a/hw/pci/msi.c +++ b/hw/pci/msi.c @@ -134,7 +134,7 @@ void msi_set_message(PCIDevice *dev, MSIMessage msg) pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data); } -MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector) { uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev)); bool msi64bit = flags & PCI_MSI_FLAGS_64BIT; @@ -159,6 +159,11 @@ MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) return msg; } +MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +{ +return dev->msi_prepare_message(dev, vector); +} + bool msi_enabled(const PCIDevice *dev) { return msi_present(dev) && @@ -241,6 +246,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, 0x >> (PCI_MSI_VECTORS_MAX - nr_vectors)); } +dev->msi_prepare_message = msi_prepare_message; + return 0; } @@ -256,6 +263,7 @@ void msi_uninit(struct PCIDevice *dev) cap_size = msi_cap_sizeof(flags); pci_del_capability(dev, PCI_CAP_ID_MSI, cap_size); dev->cap_present 
&= ~QEMU_PCI_CAP_MSI; +dev->msi_prepare_message = NULL; MSI_DEV_PRINTF(dev, "uninit\n"); } @@ -334,11 +342,7 @@ void msi_notify(PCIDevice *dev, unsigned int vector) void msi_send_message(PCIDevice *dev, MSIMessage msg) { -MemTxAttrs attrs = {}; - -attrs.requester_id = pci_requester_id(dev); -address_space_stl_le(&dev->bus_master_as, msg.address, msg.data, - attrs, NULL); +dev->msi_trigger(dev, msg); } /* Normally called by pci_default_write_config(). */ diff --git a/hw/pci/msix.c b/hw/pci/msix.c index ae9331cd0b..6f85192d6f 100644 --- a/hw/pci/msix.c +++ b/hw/pci/msix.c @@ -31,7 +31,7 @@ #define MSIX_ENABLE_MASK (PCI_MSIX_FLAGS_ENABLE >> 8) #define MSIX_MASKALL_MASK (PCI_MSIX_FLAGS_MASKALL >> 8) -MSIMessage msix_get_message(PCIDevice *dev, unsigned vector) +static MSIMessage msix_prepare_message(PCIDe
[PATCH v10 10/14] vfio-user: IOMMU support for remote device
Assign separate address space for each device in the remote processes. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/remote/iommu.h | 40 hw/remote/iommu.c | 131 ++ hw/remote/machine.c | 13 +++- MAINTAINERS | 2 + hw/remote/meson.build | 1 + 5 files changed, 186 insertions(+), 1 deletion(-) create mode 100644 include/hw/remote/iommu.h create mode 100644 hw/remote/iommu.c diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h new file mode 100644 index 00..33b68a8f4b --- /dev/null +++ b/include/hw/remote/iommu.h @@ -0,0 +1,40 @@ +/** + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_IOMMU_H +#define REMOTE_IOMMU_H + +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" + +#ifndef INT2VOIDP +#define INT2VOIDP(i) (void *)(uintptr_t)(i) +#endif + +typedef struct RemoteIommuElem { +MemoryRegion *mr; + +AddressSpace as; +} RemoteIommuElem; + +#define TYPE_REMOTE_IOMMU "x-remote-iommu" +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU) + +struct RemoteIommu { +Object parent; + +GHashTable *elem_by_devfn; + +QemuMutex lock; +}; + +void remote_iommu_setup(PCIBus *pci_bus); + +void remote_iommu_unplug_dev(PCIDevice *pci_dev); + +#endif diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c new file mode 100644 index 00..fd723d91f3 --- /dev/null +++ b/hw/remote/iommu.c @@ -0,0 +1,131 @@ +/** + * IOMMU for remote device + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" + +#include "hw/remote/iommu.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" +#include "exec/memory.h" +#include "exec/address-spaces.h" +#include "trace.h" + +/** + * IOMMU for TYPE_REMOTE_MACHINE - manages DMA address space isolation + * for remote machine. It is used by TYPE_VFIO_USER_SERVER. + * + * - Each TYPE_VFIO_USER_SERVER instance handles one PCIDevice on a PCIBus. + * There is one RemoteIommu per PCIBus, so the RemoteIommu tracks multiple + * PCIDevices by maintaining a ->elem_by_devfn mapping. + * + * - memory_region_init_iommu() is not used because vfio-user MemoryRegions + * will be added to the elem->mr container instead. This is more natural + * than implementing the IOMMUMemoryRegionClass APIs since vfio-user + * provides something that is close to a full-fledged MemoryRegion and + * not like an IOMMU mapping. + * + * - When a device is hot unplugged, the elem->mr reference is dropped so + * all vfio-user MemoryRegions associated with this vfio-user server are + * destroyed. 
+ */ + +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus, + void *opaque, int devfn) +{ +RemoteIommu *iommu = opaque; +RemoteIommuElem *elem = NULL; + +qemu_mutex_lock(&iommu->lock); + +elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn)); + +if (!elem) { +elem = g_malloc0(sizeof(RemoteIommuElem)); +g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem); +} + +if (!elem->mr) { +elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION)); +memory_region_set_size(elem->mr, UINT64_MAX); +address_space_init(&elem->as, elem->mr, NULL); +} + +qemu_mutex_unlock(&iommu->lock); + +return &elem->as; +} + +void remote_iommu_unplug_dev(PCIDevice *pci_dev) +{ +AddressSpace *as = pci_device_iommu_address_space(pci_dev); +RemoteIommuElem *elem = NULL; + +if (as == &address_space_memory) { +return; +} + +elem = container_of(as, RemoteIommuElem, as); + +address_space_destroy(&elem->as); + +object_unref(elem->mr); + +elem->mr = NULL; +} + +static void remote_iommu_init(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free); + +qemu_mutex_init(&iommu->lock); +} + +static void remote_iommu_finalize(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +qemu_mutex_destroy(&iommu->lock); + +g_hash_table_destroy(iommu->elem_by_devfn); + +iommu->elem_by_devfn = NULL; +} + +void remote_iommu_setup(PCIBus *pci_bus) +{ +RemoteIommu *iommu = NULL; + +g_assert(pci_bus); + +iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU)); + +pci_setup_iommu(pci_bus, remote_iommu_find
[PATCH v10 05/14] vfio-user: define vfio-user-server object
Define vfio-user object which is remote process server for QEMU. Setup object initialization functions and properties necessary to instantiate the object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/qom.json | 20 +++- include/hw/remote/machine.h | 2 + hw/remote/machine.c | 27 + hw/remote/vfio-user-obj.c | 210 MAINTAINERS | 1 + hw/remote/meson.build | 1 + hw/remote/trace-events | 3 + 7 files changed, 262 insertions(+), 2 deletions(-) create mode 100644 hw/remote/vfio-user-obj.c diff --git a/qapi/qom.json b/qapi/qom.json index 6a653c6636..80dd419b39 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -734,6 +734,20 @@ { 'struct': 'RemoteObjectProperties', 'data': { 'fd': 'str', 'devid': 'str' } } +## +# @VfioUserServerProperties: +# +# Properties for x-vfio-user-server objects. +# +# @socket: socket to be used by the libvfio-user library +# +# @device: the ID of the device to be emulated at the server +# +# Since: 7.1 +## +{ 'struct': 'VfioUserServerProperties', + 'data': { 'socket': 'SocketAddress', 'device': 'str' } } + ## # @RngProperties: # @@ -874,7 +888,8 @@ 'tls-creds-psk', 'tls-creds-x509', 'tls-cipher-suites', -{ 'name': 'x-remote-object', 'features': [ 'unstable' ] } +{ 'name': 'x-remote-object', 'features': [ 'unstable' ] }, +{ 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } ] } ## @@ -938,7 +953,8 @@ 'tls-creds-psk': 'TlsCredsPskProperties', 'tls-creds-x509': 'TlsCredsX509Properties', 'tls-cipher-suites': 'TlsCredsProperties', - 'x-remote-object':'RemoteObjectProperties' + 'x-remote-object':'RemoteObjectProperties', + 'x-vfio-user-server': 'VfioUserServerProperties' } } ## diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 8d0fa98d33..ac32fda387 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -24,6 +24,8 @@ struct RemoteMachineState { RemoteIOHubState iohub; bool vfio_user; + +bool auto_shutdown; }; /* 
Used to pass to co-routine device and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index 9f3cdc55c3..4d008ed721 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -77,6 +77,28 @@ static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) s->vfio_user = value; } +static bool remote_machine_get_auto_shutdown(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->auto_shutdown; +} + +static void remote_machine_set_auto_shutdown(Object *obj, bool value, + Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = value; +} + +static void remote_machine_instance_init(Object *obj) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = true; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -90,12 +112,17 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) object_class_property_add_bool(oc, "vfio-user", remote_machine_get_vfio_user, remote_machine_set_vfio_user); + +object_class_property_add_bool(oc, "auto-shutdown", + remote_machine_get_auto_shutdown, + remote_machine_set_auto_shutdown); } static const TypeInfo remote_machine = { .name = TYPE_REMOTE_MACHINE, .parent = TYPE_MACHINE, .instance_size = sizeof(RemoteMachineState), +.instance_init = remote_machine_instance_init, .class_init = remote_machine_class_init, .interfaces = (InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c new file mode 100644 index 00..bc49adcc27 --- /dev/null +++ b/hw/remote/vfio-user-obj.c @@ -0,0 +1,210 @@ +/** + * QEMU vfio-user-server server object + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL-v2, version 2 or later. + * + * See the COPYING file in the top-level directory. + * + */ + +/** + * Usage: add options: + * -machine x-remote,vfi
[PATCH v10 08/14] vfio-user: run vfio-user context
Setup a handler to run vfio-user context. The context is driven by messages to the file descriptor associated with it - get the fd for the context and hook up the handler with it Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/misc.json| 31 hw/remote/vfio-user-obj.c | 104 +- 2 files changed, 134 insertions(+), 1 deletion(-) diff --git a/qapi/misc.json b/qapi/misc.json index 45344483cd..27ef5a2b20 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -553,3 +553,34 @@ ## { 'event': 'RTC_CHANGE', 'data': { 'offset': 'int', 'qom-path': 'str' } } + +## +# @VFU_CLIENT_HANGUP: +# +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the +# communication channel +# +# @vfu-id: ID of the TYPE_VFIO_USER_SERVER object. It is the last component +# of @vfu-qom-path referenced below +# +# @vfu-qom-path: path to the TYPE_VFIO_USER_SERVER object in the QOM tree +# +# @dev-id: ID of attached PCI device +# +# @dev-qom-path: path to attached PCI device in the QOM tree +# +# Since: 7.1 +# +# Example: +# +# <- { "event": "VFU_CLIENT_HANGUP", +# "data": { "vfu-id": "vfu1", +#"vfu-qom-path": "/objects/vfu1", +#"dev-id": "sas1", +#"dev-qom-path": "/machine/peripheral/sas1" }, +# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } +# +## +{ 'event': 'VFU_CLIENT_HANGUP', + 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str', +'dev-id': 'str', 'dev-qom-path': 'str' } } diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index fdee274933..fb5c46331c 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -27,6 +27,9 @@ * * device - id of a device on the server, a required option. PCI devices * alone are supported presently. + * + * notes - x-vfio-user-server could block IO and monitor during the + * initialization phase. 
*/ #include "qemu/osdep.h" @@ -40,11 +43,14 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qapi/qapi-events-misc.h" #include "qemu/notify.h" +#include "qemu/thread.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" #include "hw/pci/pci.h" +#include "qemu/timer.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -86,6 +92,8 @@ struct VfuObject { PCIDevice *pci_dev; Error *unplug_blocker; + +int vfu_poll_fd; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -164,6 +172,78 @@ static void vfu_object_set_device(Object *obj, const char *str, Error **errp) vfu_object_init_ctx(o, errp); } +static void vfu_object_ctx_run(void *opaque) +{ +VfuObject *o = opaque; +const char *vfu_id; +char *vfu_path, *pci_dev_path; +int ret = -1; + +while (ret != 0) { +ret = vfu_run_ctx(o->vfu_ctx); +if (ret < 0) { +if (errno == EINTR) { +continue; +} else if (errno == ENOTCONN) { +vfu_id = object_get_canonical_path_component(OBJECT(o)); +vfu_path = object_get_canonical_path(OBJECT(o)); +g_assert(o->pci_dev); +pci_dev_path = object_get_canonical_path(OBJECT(o->pci_dev)); + /* o->device is a required property and is non-NULL here */ +g_assert(o->device); +qapi_event_send_vfu_client_hangup(vfu_id, vfu_path, + o->device, pci_dev_path); +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); +o->vfu_poll_fd = -1; +object_unparent(OBJECT(o)); +g_free(vfu_path); +g_free(pci_dev_path); +break; +} else { +VFU_OBJECT_ERROR(o, "vfu: Failed to run device %s - %s", + o->device, strerror(errno)); +break; +} +} +} +} + +static void vfu_object_attach_ctx(void *opaque) +{ +VfuObject *o = opaque; +GPollFD pfds[1]; +int ret; + +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); + +pfds[0].fd = o->vfu_poll_fd; +pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR; + +retry_attach: +ret = vfu_attach_ctx(o-&g
[PATCH v10 07/14] vfio-user: find and init PCI device
Find the PCI device with the specified id. Initialize the device context
with the QEMU PCI device.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/vfio-user-obj.c | 67 +++
 1 file changed, 67 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 68aac0c2b9..fdee274933 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -43,6 +43,8 @@
 #include "qemu/notify.h"
 #include "sysemu/sysemu.h"
 #include "libvfio-user.h"
+#include "hw/qdev-core.h"
+#include "hw/pci/pci.h"

 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -80,6 +82,10 @@ struct VfuObject {
     Notifier machine_done;

     vfu_ctx_t *vfu_ctx;
+
+    PCIDevice *pci_dev;
+
+    Error *unplug_blocker;
 };

 static void vfu_object_init_ctx(VfuObject *o, Error **errp);
@@ -195,6 +201,9 @@ static void vfu_object_machine_done(Notifier *notifier, void *data)
 static void vfu_object_init_ctx(VfuObject *o, Error **errp)
 {
     ERRP_GUARD();
+    DeviceState *dev = NULL;
+    vfu_pci_type_t pci_type = VFU_PCI_TYPE_CONVENTIONAL;
+    int ret;

     if (o->vfu_ctx || !o->socket || !o->device ||
             !phase_check(PHASE_MACHINE_READY)) {
@@ -213,6 +222,53 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
         return;
     }
+
+    dev = qdev_find_recursive(sysbus_get_default(), o->device);
+    if (dev == NULL) {
+        error_setg(errp, "vfu: Device %s not found", o->device);
+        goto fail;
+    }
+
+    if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
+        error_setg(errp, "vfu: %s not a PCI device", o->device);
+        goto fail;
+    }
+
+    o->pci_dev = PCI_DEVICE(dev);
+
+    object_ref(OBJECT(o->pci_dev));
+
+    if (pci_is_express(o->pci_dev)) {
+        pci_type = VFU_PCI_TYPE_EXPRESS;
+    }
+
+    ret = vfu_pci_init(o->vfu_ctx, pci_type, PCI_HEADER_TYPE_NORMAL, 0);
+    if (ret < 0) {
+        error_setg(errp,
+                   "vfu: Failed to attach PCI device %s to context - %s",
+                   o->device, strerror(errno));
+        goto fail;
+    }
+
+    error_setg(&o->unplug_blocker,
+               "vfu: %s for %s must be deleted before unplugging",
+               TYPE_VFU_OBJECT, o->device);
+    qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
+
+    return;
+
+fail:
+    vfu_destroy_ctx(o->vfu_ctx);
+    if (o->unplug_blocker && o->pci_dev) {
+        qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
+        error_free(o->unplug_blocker);
+        o->unplug_blocker = NULL;
+    }
+    if (o->pci_dev) {
+        object_unref(OBJECT(o->pci_dev));
+        o->pci_dev = NULL;
+    }
+    o->vfu_ctx = NULL;
 }

 static void vfu_object_init(Object *obj)
@@ -255,6 +311,17 @@ static void vfu_object_finalize(Object *obj)

     o->device = NULL;

+    if (o->unplug_blocker && o->pci_dev) {
+        qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);
+        error_free(o->unplug_blocker);
+        o->unplug_blocker = NULL;
+    }
+
+    if (o->pci_dev) {
+        object_unref(OBJECT(o->pci_dev));
+        o->pci_dev = NULL;
+    }
+
     if (!k->nr_devs && vfu_object_auto_shutdown()) {
         qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
     }
-- 
2.20.1
[PATCH v10 14/14] vfio-user: handle reset of remote device
Add a handler to reset a remote device.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/vfio-user-obj.c | 20
 1 file changed, 20 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index eeb165a805..c0c2277bfc 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -632,6 +632,20 @@ void vfu_object_set_bus_irq(PCIBus *pci_bus)
     pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus, 1);
 }

+static int vfu_object_device_reset(vfu_ctx_t *vfu_ctx, vfu_reset_type_t type)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+
+    /* vfu_object_ctx_run() handles lost connection */
+    if (type == VFU_RESET_LOST_CONN) {
+        return 0;
+    }
+
+    qdev_reset_all(DEVICE(o->pci_dev));
+
+    return 0;
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -751,6 +765,12 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         goto fail;
     }

+    ret = vfu_setup_device_reset_cb(o->vfu_ctx, &vfu_object_device_reset);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup reset callback");
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
-- 
2.20.1
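For readers skimming the series, the shape of the reset callback in the
patch above can be sketched outside QEMU. This is only an illustration:
struct fake_dev, enum reset_type and device_reset_cb() are hypothetical
stand-ins, not libvfio-user or QEMU names. The one point it shows is that
a lost-connection reset is acknowledged without touching the device, while
any other reset type resets it.

```c
#include <assert.h>

/* Hypothetical reset types; the real ones come from libvfio-user. */
enum reset_type { RESET_DEVICE, RESET_LOST_CONN };

/* Hypothetical device that only counts how often it was reset. */
struct fake_dev {
    int resets;
};

/*
 * Reset callback sketch: a lost connection is handled elsewhere (in the
 * patch, by the context run loop), so it is a no-op here.
 */
static int device_reset_cb(struct fake_dev *dev, enum reset_type type)
{
    assert(dev != 0);

    if (type == RESET_LOST_CONN) {
        return 0;
    }

    dev->resets++;

    return 0;
}
```

Returning 0 in both branches mirrors the patch, which never reports a
failure from the reset handler.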
[PATCH v10 04/14] vfio-user: build library
Add the libvfio-user library as a submodule and build it as a meson
subproject. libvfio-user is distributed under the BSD 3-Clause license
and json-c under the MIT (Expat) license.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 configure                               | 17 +
 meson.build                             | 23 ++-
 .gitlab-ci.d/buildtest.yml              |  1 +
 .gitmodules                             |  3 +++
 Kconfig.host                            |  4
 MAINTAINERS                             |  1 +
 hw/remote/Kconfig                       |  4
 hw/remote/meson.build                   |  2 ++
 meson_options.txt                       |  2 ++
 subprojects/libvfio-user                |  1 +
 tests/docker/dockerfiles/centos8.docker |  2 ++
 11 files changed, 59 insertions(+), 1 deletion(-)
 create mode 160000 subprojects/libvfio-user

diff --git a/configure b/configure
index 180ee688dc..d6a36ba8e6 100755
--- a/configure
+++ b/configure
@@ -301,6 +301,7 @@ meson_args=""
 ninja=""
 bindir="bin"
 skip_meson=no
+vfio_user_server="disabled"

 # The following Meson options are handled manually (still they
 # are included in the automatically generated help message)
@@ -891,6 +892,10 @@ for opt do
   ;;
   --disable-blobs) meson_option_parse --disable-install-blobs ""
   ;;
+  --enable-vfio-user-server) vfio_user_server="enabled"
+  ;;
+  --disable-vfio-user-server) vfio_user_server="disabled"
+  ;;
   --enable-tcmalloc) meson_option_parse --enable-malloc=tcmalloc tcmalloc
   ;;
   --enable-jemalloc) meson_option_parse --enable-malloc=jemalloc jemalloc
@@ -1796,6 +1801,17 @@ case "$slirp" in
     ;;
 esac

+##
+# check for vfio_user_server
+
+case "$vfio_user_server" in
+  enabled )
+    if test "$git_submodules_action" != "ignore"; then
+      git_submodules="${git_submodules} subprojects/libvfio-user"
+    fi
+    ;;
+esac
+
 ##
 # End of CC checks
 # After here, no more $cc or $ld runs
@@ -2207,6 +2223,7 @@ if test "$skip_meson" = no; then
   test "$slirp" != auto && meson_option_add "-Dslirp=$slirp"
   test "$smbd" != '' && meson_option_add "-Dsmbd=$smbd"
   test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg"
+  test "$vfio_user_server" != auto && meson_option_add "-Dvfio_user_server=$vfio_user_server"
   run_meson() {
     NINJA=$ninja $meson setup --prefix "$prefix" "$@" $cross_arg "$PWD" "$source_path"
   }
diff --git a/meson.build b/meson.build
index 9ebc00f032..6e66bb5a8c 100644
--- a/meson.build
+++ b/meson.build
@@ -308,6 +308,10 @@ multiprocess_allowed = get_option('multiprocess') \
   .require(targetos == 'linux', error_message: 'Multiprocess QEMU is supported only on Linux') \
   .allowed()

+vfio_user_server_allowed = get_option('vfio_user_server') \
+  .require(targetos == 'linux', error_message: 'vfio-user server is supported only on Linux') \
+  .allowed()
+
 have_tpm = get_option('tpm') \
   .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \
   .allowed()
@@ -2358,7 +2362,8 @@ host_kconfig = \
   (have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \
   ('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \
   (have_pvrdma ? ['CONFIG_PVRDMA=y'] : []) + \
-  (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : [])
+  (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + \
+  (vfio_user_server_allowed ? ['CONFIG_VFIO_USER_SERVER_ALLOWED=y'] : [])

 ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ]

@@ -2650,6 +2655,21 @@ if have_system
   endif
 endif

+libvfio_user_dep = not_found
+if have_system and vfio_user_server_allowed
+  have_internal = fs.exists(meson.current_source_dir() / 'subprojects/libvfio-user/meson.build')
+
+  if not have_internal
+    error('libvfio-user source not found - please pull git submodule')
+  endif
+
+  libvfio_user_proj = subproject('libvfio-user')
+
+  libvfio_user_lib = libvfio_user_proj.get_variable('libvfio_user_dep')
+
+  libvfio_user_dep = declare_dependency(dependencies: [libvfio_user_lib])
+endif
+
 fdt = not_found
 if have_system
   fdt_opt = get_option('fdt')
@@ -3760,6 +3780,7 @@ summary_info += {'target list': ' '.join(target_dirs)}
 if have_system
   summary_info += {'default devices': get_option('default_devices&
[PATCH v10 03/14] remote/machine: add vfio-user property
Add a vfio-user property to the x-remote machine. It is a boolean that
indicates whether the machine supports the vfio-user protocol. The
machine configures the bus differently for the vfio-user and multiprocess
protocols, so this property tells it how to configure the bus.

This property should be short-lived. Once vfio-user fully replaces
multiprocess, it can be removed.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 include/hw/remote/machine.h |  2 ++
 hw/remote/machine.c         | 23 +++
 2 files changed, 25 insertions(+)

diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h
index 2a2a33c4b2..8d0fa98d33 100644
--- a/include/hw/remote/machine.h
+++ b/include/hw/remote/machine.h
@@ -22,6 +22,8 @@ struct RemoteMachineState {

     RemotePCIHost *host;
     RemoteIOHubState iohub;
+
+    bool vfio_user;
 };

 /* Used to pass to co-routine device and ioc. */
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index a97e53e250..9f3cdc55c3 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -58,6 +58,25 @@ static void remote_machine_init(MachineState *machine)
     qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }

+static bool remote_machine_get_vfio_user(Object *obj, Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    return s->vfio_user;
+}
+
+static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    if (phase_check(PHASE_MACHINE_CREATED)) {
+        error_setg(errp, "Error enabling vfio-user - machine already created");
+        return;
+    }
+
+    s->vfio_user = value;
+}
+
 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -67,6 +86,10 @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
     mc->desc = "Experimental remote machine";

     hc->unplug = qdev_simple_device_unplug_cb;
+
+    object_class_property_add_bool(oc, "vfio-user",
+                                   remote_machine_get_vfio_user,
+                                   remote_machine_set_vfio_user);
 }

 static const TypeInfo remote_machine = {
-- 
2.20.1
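The setter in the patch above refuses to change the property once the
machine has been created. A minimal standalone sketch of that guard
follows. The enum machine_phase, struct remote_machine and the
string-returning error convention are assumptions made for illustration;
QEMU itself uses phase_check() and Error **errp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical phases; QEMU tracks more of them. */
enum machine_phase { PHASE_CONFIG, PHASE_CREATED };

/* Hypothetical stand-in for RemoteMachineState. */
struct remote_machine {
    enum machine_phase phase;
    bool vfio_user;
};

/*
 * Set the property only while the machine is still being configured.
 * Returns NULL on success, or a static error string when the machine
 * already exists and the bus layout can no longer change.
 */
static const char *remote_machine_set_vfio_user(struct remote_machine *m,
                                                bool value)
{
    assert(m != NULL);

    if (m->phase >= PHASE_CREATED) {
        return "Error enabling vfio-user - machine already created";
    }

    m->vfio_user = value;

    return NULL;
}
```

The guard matters because the property selects how the bus is configured
at machine init, so a late change would leave the flag and the bus out of
sync.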
[PATCH v10 12/14] vfio-user: handle PCI BAR accesses
Determine the BARs used by the PCI device and register handlers to
manage accesses to them.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 include/exec/memory.h           |   3 +
 hw/remote/vfio-user-obj.c       | 190
 softmmu/physmem.c               |   4 +-
 tests/qtest/fuzz/generic_fuzz.c |   9 +-
 hw/remote/trace-events          |   3 +
 5 files changed, 203 insertions(+), 6 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index f1c19451bc..a6a0f4d8ad 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2810,6 +2810,9 @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache,
                                             hwaddr addr, const void *buf,
                                             hwaddr len);

+int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr);
+bool prepare_mmio_access(MemoryRegion *mr);
+
 static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
 {
     if (is_write) {
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 8d208f1294..ee28a93782 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -52,6 +52,7 @@
 #include "hw/qdev-core.h"
 #include "hw/pci/pci.h"
 #include "qemu/timer.h"
+#include "exec/memory.h"

 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -332,6 +333,193 @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
     trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
 }

+static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset,
+                            hwaddr size, const bool is_write)
+{
+    uint8_t *ptr = buf;
+    bool release_lock = false;
+    uint8_t *ram_ptr = NULL;
+    MemTxResult result;
+    int access_size;
+    uint64_t val;
+
+    if (memory_access_is_direct(mr, is_write)) {
+        /**
+         * Some devices expose a PCI expansion ROM, which could be buffer
+         * based as compared to other regions which are primarily based on
+         * MemoryRegionOps. memory_region_find() would already check
+         * for buffer overflow, we don't need to repeat it here.
+         */
+        ram_ptr = memory_region_get_ram_ptr(mr);
+
+        if (is_write) {
+            memcpy((ram_ptr + offset), buf, size);
+        } else {
+            memcpy(buf, (ram_ptr + offset), size);
+        }
+
+        return 0;
+    }
+
+    while (size) {
+        /**
+         * The read/write logic used below is similar to the ones in
+         * flatview_read/write_continue()
+         */
+        release_lock = prepare_mmio_access(mr);
+
+        access_size = memory_access_size(mr, size, offset);
+
+        if (is_write) {
+            val = ldn_he_p(ptr, access_size);
+
+            result = memory_region_dispatch_write(mr, offset, val,
+                                                  size_memop(access_size),
+                                                  MEMTXATTRS_UNSPECIFIED);
+        } else {
+            result = memory_region_dispatch_read(mr, offset, &val,
+                                                 size_memop(access_size),
+                                                 MEMTXATTRS_UNSPECIFIED);
+
+            stn_he_p(ptr, access_size, val);
+        }
+
+        if (release_lock) {
+            qemu_mutex_unlock_iothread();
+            release_lock = false;
+        }
+
+        if (result != MEMTX_OK) {
+            return -1;
+        }
+
+        size -= access_size;
+        ptr += access_size;
+        offset += access_size;
+    }
+
+    return 0;
+}
+
+static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar,
+                                hwaddr bar_offset, char * const buf,
+                                hwaddr len, const bool is_write)
+{
+    MemoryRegionSection section = { 0 };
+    uint8_t *ptr = (uint8_t *)buf;
+    MemoryRegion *section_mr = NULL;
+    uint64_t section_size;
+    hwaddr section_offset;
+    hwaddr size = 0;
+
+    while (len) {
+        section = memory_region_find(pci_dev->io_regions[pci_bar].memory,
+                                     bar_offset, len);
+
+        if (!section.mr) {
+            warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset);
+            return size;
+        }
+
+        section_mr = section.mr;
+        section_offset = section.offset_within_region;
+        section_size = int128_get64(section.size);
+
+        if (is_write && section_mr->readonly) {
+            warn_report("vfu: attempting to write to readonly region in "
+                        "bar %d - [0x%"PRIx64" - 0x%"PRIx64"]",
+                        pci_bar, bar_offset,
+                        (bar_off
[PATCH v10 11/14] vfio-user: handle DMA mappings
Define and register callbacks to manage the RAM regions used for device
DMA.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/machine.c       |  5
 hw/remote/vfio-user-obj.c | 55 +++
 hw/remote/trace-events    |  2 ++
 3 files changed, 62 insertions(+)

diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index cbb2add291..645b54343d 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -22,6 +22,7 @@
 #include "hw/remote/iohub.h"
 #include "hw/remote/iommu.h"
 #include "hw/qdev-core.h"
+#include "hw/remote/iommu.h"

 static void remote_machine_init(MachineState *machine)
 {
@@ -51,6 +52,10 @@ static void remote_machine_init(MachineState *machine)

     pci_host = PCI_HOST_BRIDGE(rem_host);

+    if (s->vfio_user) {
+        remote_iommu_setup(pci_host->bus);
+    }
+
     remote_iohub_init(&s->iohub);

     pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 575bd47397..8d208f1294 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -284,6 +284,54 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
     return count;
 }

+static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+    AddressSpace *dma_as = NULL;
+    MemoryRegion *subregion = NULL;
+    g_autofree char *name = NULL;
+    struct iovec *iov = &info->iova;
+
+    if (!info->vaddr) {
+        return;
+    }
+
+    name = g_strdup_printf("mem-%s-%"PRIx64"", o->device,
+                           (uint64_t)info->vaddr);
+
+    subregion = g_new0(MemoryRegion, 1);
+
+    memory_region_init_ram_ptr(subregion, NULL, name,
+                               iov->iov_len, info->vaddr);
+
+    dma_as = pci_device_iommu_address_space(o->pci_dev);
+
+    memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion);
+
+    trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len);
+}
+
+static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+    AddressSpace *dma_as = NULL;
+    MemoryRegion *mr = NULL;
+    ram_addr_t offset;
+
+    mr = memory_region_from_host(info->vaddr, &offset);
+    if (!mr) {
+        return;
+    }
+
+    dma_as = pci_device_iommu_address_space(o->pci_dev);
+
+    memory_region_del_subregion(dma_as->root, mr);
+
+    object_unparent((OBJECT(mr)));
+
+    trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -387,6 +435,13 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         goto fail;
     }

+    ret = vfu_setup_device_dma(o->vfu_ctx, &dma_register, &dma_unregister);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup DMA handlers for %s",
+                   o->device);
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 2ef7884346..f945c7e33b 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -7,3 +7,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d,
 vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
 vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
 vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
+vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes"
+vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
-- 
2.20.1
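The register/unregister pair above can be pictured as maintaining a table
of (IOVA, length, host pointer) entries. The sketch below is a deliberate
simplification and not how QEMU does it: the patch adds RAM MemoryRegions
to the device's address space, whereas this uses a fixed array. All names
here (dma_table, dma_table_register() and so on) are hypothetical, and the
translate helper is an addition for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DMA_REGIONS 8   /* hypothetical limit for the sketch */

/* One client-provided DMA region mapped into this process. */
struct dma_region {
    uint64_t iova;   /* address the device uses for DMA */
    size_t len;      /* size of the region in bytes */
    void *vaddr;     /* where the region is mapped locally; NULL = free slot */
};

static struct dma_region dma_table[MAX_DMA_REGIONS];

/* Record a region. Like the patch, ignore regions that are not mapped. */
static int dma_table_register(uint64_t iova, size_t len, void *vaddr)
{
    if (!vaddr) {
        return -1;
    }

    for (size_t i = 0; i < MAX_DMA_REGIONS; i++) {
        if (!dma_table[i].vaddr) {
            dma_table[i].iova = iova;
            dma_table[i].len = len;
            dma_table[i].vaddr = vaddr;
            return 0;
        }
    }

    return -1;   /* table full */
}

/* Drop a region, looked up by host pointer as dma_unregister() does. */
static int dma_table_unregister(void *vaddr)
{
    for (size_t i = 0; i < MAX_DMA_REGIONS; i++) {
        if (vaddr && dma_table[i].vaddr == vaddr) {
            dma_table[i].vaddr = NULL;
            return 0;
        }
    }

    return -1;   /* unknown region */
}

/* Illustration only: resolve an IOVA to a host pointer, or NULL. */
static void *dma_table_translate(uint64_t iova)
{
    for (size_t i = 0; i < MAX_DMA_REGIONS; i++) {
        const struct dma_region *r = &dma_table[i];

        if (r->vaddr && iova >= r->iova && iova - r->iova < r->len) {
            return (char *)r->vaddr + (iova - r->iova);
        }
    }

    return NULL;
}
```

In QEMU the lookup that dma_table_translate() imitates is done by the
memory API once the subregion has been added, so no such helper appears
in the patch.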
[PATCH v10 09/14] vfio-user: handle PCI config space accesses
Define and register handlers for PCI config space accesses.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/vfio-user-obj.c | 51 +++
 hw/remote/trace-events    |  2 ++
 2 files changed, 53 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index fb5c46331c..575bd47397 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -46,6 +46,7 @@
 #include "qapi/qapi-events-misc.h"
 #include "qemu/notify.h"
 #include "qemu/thread.h"
+#include "qemu/main-loop.h"
 #include "sysemu/sysemu.h"
 #include "libvfio-user.h"
 #include "hw/qdev-core.h"
@@ -244,6 +245,45 @@ retry_attach:
     qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o);
 }

+static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
+                                     size_t count, loff_t offset,
+                                     const bool is_write)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+    uint32_t pci_access_width = sizeof(uint32_t);
+    size_t bytes = count;
+    uint32_t val = 0;
+    char *ptr = buf;
+    int len;
+
+    /*
+     * Writes to the BAR registers would trigger an update to the
+     * global Memory and IO AddressSpaces. But the remote device
+     * never uses the global AddressSpaces, therefore overlapping
+     * memory regions are not a problem
+     */
+    while (bytes > 0) {
+        len = (bytes > pci_access_width) ? pci_access_width : bytes;
+        if (is_write) {
+            memcpy(&val, ptr, len);
+            pci_host_config_write_common(o->pci_dev, offset,
+                                         pci_config_size(o->pci_dev),
+                                         val, len);
+            trace_vfu_cfg_write(offset, val);
+        } else {
+            val = pci_host_config_read_common(o->pci_dev, offset,
+                                              pci_config_size(o->pci_dev), len);
+            memcpy(ptr, &val, len);
+            trace_vfu_cfg_read(offset, val);
+        }
+        offset += len;
+        ptr += len;
+        bytes -= len;
+    }
+
+    return count;
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -336,6 +376,17 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
                TYPE_VFU_OBJECT, o->device);
     qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker);

+    ret = vfu_setup_region(o->vfu_ctx, VFU_PCI_DEV_CFG_REGION_IDX,
+                           pci_config_size(o->pci_dev), &vfu_object_cfg_access,
+                           VFU_REGION_FLAG_RW | VFU_REGION_FLAG_ALWAYS_CB,
+                           NULL, 0, -1, 0);
+    if (ret < 0) {
+        error_setg(errp,
+                   "vfu: Failed to setup config space handlers for %s- %s",
+                   o->device, strerror(errno));
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 7da12f0d96..2ef7884346 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -5,3 +5,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d,

 # vfio-user-obj.c
 vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s"
+vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x"
+vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x"
-- 
2.20.1
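The loop in vfu_object_cfg_access() splits one request into accesses of
at most four bytes. A standalone sketch of that splitting follows, with a
plain byte array standing in for the device's config space. CFG_SIZE,
cfg_space, cfg_accesses and cfg_access() are hypothetical names, and the
bounds check is an addition for the sketch; the patch itself passes the
config space size down to the PCI helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 256      /* hypothetical config space size */
#define ACCESS_WIDTH 4    /* at most 4 bytes per access, as in the patch */

static uint8_t cfg_space[CFG_SIZE];   /* stand-in for the real config space */
static unsigned cfg_accesses;         /* number of individual accesses made */

/*
 * Serve a read or write of 'count' bytes at 'offset' by issuing accesses
 * of at most ACCESS_WIDTH bytes. Returns 'count', or -1 if out of range.
 */
static long cfg_access(char *buf, size_t count, size_t offset, int is_write)
{
    size_t bytes = count;
    char *ptr = buf;

    if (offset > CFG_SIZE || count > CFG_SIZE - offset) {
        return -1;   /* bounds check added for this sketch */
    }

    while (bytes > 0) {
        size_t len = bytes > ACCESS_WIDTH ? ACCESS_WIDTH : bytes;

        if (is_write) {
            memcpy(&cfg_space[offset], ptr, len);
        } else {
            memcpy(ptr, &cfg_space[offset], len);
        }
        cfg_accesses++;

        offset += len;
        ptr += len;
        bytes -= len;
    }

    return (long)count;
}
```

A 10-byte request therefore turns into three accesses of 4, 4 and 2
bytes, which is the behaviour the trace events in the patch would show.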
[PATCH v10 01/14] qdev: unplug blocker for devices
Add a blocker to prevent hot-unplug of devices.

TYPE_VFIO_USER_SERVER, which is introduced shortly, attaches itself to a
PCIDevice on which it depends. If the attached PCIDevice is removed while
the server is in use, the server could crash. To prevent this,
TYPE_VFIO_USER_SERVER adds an unplug blocker for the PCIDevice.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 include/hw/qdev-core.h | 29 +
 hw/core/qdev.c         | 24
 softmmu/qdev-monitor.c |  4
 3 files changed, 57 insertions(+)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 92c3d65208..98774e2835 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -193,6 +193,7 @@ struct DeviceState {
     int instance_id_alias;
     int alias_required_for_version;
     ResettableState reset;
+    GSList *unplug_blockers;
 };

 struct DeviceListener {
@@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
 void qdev_machine_creation_done(void);
 bool qdev_machine_modified(void);

+/**
+ * qdev_add_unplug_blocker: Add an unplug blocker to a device
+ *
+ * @dev: Device to be blocked from unplug
+ * @reason: Reason for blocking
+ */
+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
+
+/**
+ * qdev_del_unplug_blocker: Remove an unplug blocker from a device
+ *
+ * @dev: Device to be unblocked
+ * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
+ *          Used as a handle to lookup the blocker for deletion.
+ */
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
+
+/**
+ * qdev_unplug_blocked: Confirm if a device is blocked from unplug
+ *
+ * @dev: Device to be tested
+ * @reason: Returns one of the reasons why the device is blocked,
+ *          if any
+ *
+ * Returns: true if device is blocked from unplug, false otherwise
+ */
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
+
 /**
  * GpioPolarity: Polarity of a GPIO line
  *
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 84f3019440..0806d8fcaa 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev)
     return NULL;
 }

+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
+{
+    dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
+}
+
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
+{
+    dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
+}
+
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
+{
+    ERRP_GUARD();
+
+    if (dev->unplug_blockers) {
+        error_propagate(errp, error_copy(dev->unplug_blockers->data));
+        return true;
+    }
+
+    return false;
+}
+
 static bool device_get_realized(Object *obj, Error **errp)
 {
     DeviceState *dev = DEVICE(obj);
@@ -704,6 +726,8 @@ static void device_finalize(Object *obj)

     DeviceState *dev = DEVICE(obj);

+    g_assert(!dev->unplug_blockers);
+
     QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
         QLIST_REMOVE(ngl, node);
         qemu_free_irqs(ngl->in, ngl->num_in);
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 12fe60c467..9cfd59d17c 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -898,6 +898,10 @@ void qdev_unplug(DeviceState *dev, Error **errp)
     HotplugHandlerClass *hdc;
     Error *local_err = NULL;

+    if (qdev_unplug_blocked(dev, errp)) {
+        return;
+    }
+
     if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) {
         error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name);
         return;
-- 
2.20.1
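The blocker list in the patch above is a GSList of Error pointers, where
the pointer itself is the handle used for removal. The same idea can be
sketched without GLib or QEMU's Error type. In this sketch struct device,
struct blocker and the device_*() functions are hypothetical, and a plain
string stands in for the Error object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One entry in a device's list of unplug blockers. */
struct blocker {
    const char *reason;      /* also the handle used to remove the entry */
    struct blocker *next;
};

/* Hypothetical stand-in for a device; not QEMU's DeviceState. */
struct device {
    struct blocker *unplug_blockers;   /* NULL when unplug is allowed */
};

/* Prepend a blocker, as g_slist_prepend() does in the patch. */
static void device_add_unplug_blocker(struct device *dev, const char *reason)
{
    struct blocker *b = malloc(sizeof(*b));

    assert(b != NULL);
    b->reason = reason;
    b->next = dev->unplug_blockers;
    dev->unplug_blockers = b;
}

/* Remove the first blocker whose reason pointer matches, if any. */
static void device_del_unplug_blocker(struct device *dev, const char *reason)
{
    struct blocker **link = &dev->unplug_blockers;

    while (*link) {
        if ((*link)->reason == reason) {
            struct blocker *victim = *link;

            *link = victim->next;
            free(victim);
            return;
        }
        link = &(*link)->next;
    }
}

/* Report whether unplug is blocked and optionally return one reason. */
static bool device_unplug_blocked(const struct device *dev,
                                  const char **reason)
{
    if (dev->unplug_blockers) {
        if (reason) {
            *reason = dev->unplug_blockers->reason;
        }
        return true;
    }

    return false;
}
```

Matching on the pointer rather than on the message text is what lets two
owners block the same device with identical wording and still remove only
their own entry.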
[PATCH v10 00/14] vfio-user server in QEMU
Hi,

This is v10 of the server side changes to enable vfio-user in QEMU.

Thank you for reviewing and sharing your feedback for the previous
revision. We have addressed your comments in this revision.

We have dropped the following patches in this series:
  - tests/avocado: Specify target VM argument to helper routines
  - configure: require cmake 3.19 or newer
  - vfio-user: avocado tests for vfio-user

We have also made the following changes:

[PATCH v10 1/14] qdev: unplug blocker for devices
  - updated function comments for unplug blockers in hw/qdev-core.h

[PATCH v10 4/14] vfio-user: build library
  - uses meson build system to build libvfio-user library
  - dropped ubuntu CI build

[PATCH v10 5/14] vfio-user: define vfio-user-server object
  - updated comments for VfioUserServerProperties in qapi/qom.json

[PATCH v10 6/14] vfio-user: instantiate vfio-user context
  - added comments to vfu_object_init_ctx() explaining function contract

[PATCH v10 8/14] vfio-user: run vfio-user context
  - vfu_object_ctx_run() asserts that VfuObject->device is not NULL
  - added a comment to vfu_object_ctx_run() explaining why
    VfuObject->device wouldn't be NULL

Thank you very much!

Jagannathan Raman (14):
  qdev: unplug blocker for devices
  remote/machine: add HotplugHandler for remote machine
  remote/machine: add vfio-user property
  vfio-user: build library
  vfio-user: define vfio-user-server object
  vfio-user: instantiate vfio-user context
  vfio-user: find and init PCI device
  vfio-user: run vfio-user context
  vfio-user: handle PCI config space accesses
  vfio-user: IOMMU support for remote device
  vfio-user: handle DMA mappings
  vfio-user: handle PCI BAR accesses
  vfio-user: handle device interrupts
  vfio-user: handle reset of remote device

 configure                               |  17 +
 meson.build                             |  23 +-
 qapi/misc.json                          |  31 +
 qapi/qom.json                           |  20 +-
 include/exec/memory.h                   |   3 +
 include/hw/pci/pci.h                    |  13 +
 include/hw/qdev-core.h                  |  29 +
 include/hw/remote/iommu.h               |  40 ++
 include/hw/remote/machine.h             |   4 +
 include/hw/remote/vfio-user-obj.h       |   6 +
 hw/core/qdev.c                          |  24 +
 hw/pci/msi.c                            |  16 +-
 hw/pci/msix.c                           |  10 +-
 hw/pci/pci.c                            |  13 +
 hw/remote/iommu.c                       | 131
 hw/remote/machine.c                     |  88 ++-
 hw/remote/vfio-user-obj.c               | 914
 softmmu/physmem.c                       |   4 +-
 softmmu/qdev-monitor.c                  |   4 +
 stubs/vfio-user-obj.c                   |   6 +
 tests/qtest/fuzz/generic_fuzz.c         |   9 +-
 .gitlab-ci.d/buildtest.yml              |   1 +
 .gitmodules                             |   3 +
 Kconfig.host                            |   4 +
 MAINTAINERS                             |   5 +
 hw/remote/Kconfig                       |   4 +
 hw/remote/meson.build                   |   4 +
 hw/remote/trace-events                  |  11 +
 meson_options.txt                       |   2 +
 stubs/meson.build                       |   1 +
 subprojects/libvfio-user                |   1 +
 tests/docker/dockerfiles/centos8.docker |   2 +
 32 files changed, 1424 insertions(+), 19 deletions(-)
 create mode 100644 include/hw/remote/iommu.h
 create mode 100644 include/hw/remote/vfio-user-obj.h
 create mode 100644 hw/remote/iommu.c
 create mode 100644 hw/remote/vfio-user-obj.c
 create mode 100644 stubs/vfio-user-obj.c
 create mode 160000 subprojects/libvfio-user

-- 
2.20.1
[PATCH v10 06/14] vfio-user: instantiate vfio-user context
Create a context with the vfio-user library to run a PCI device.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/vfio-user-obj.c | 96 +++
 1 file changed, 96 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index bc49adcc27..68aac0c2b9 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -40,6 +40,9 @@
 #include "hw/remote/machine.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-sockets.h"
+#include "qemu/notify.h"
+#include "sysemu/sysemu.h"
+#include "libvfio-user.h"

 #define TYPE_VFU_OBJECT "x-vfio-user-server"
 OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT)
@@ -73,8 +76,14 @@ struct VfuObject {
     char *device;

     Error *err;
+
+    Notifier machine_done;
+
+    vfu_ctx_t *vfu_ctx;
 };

+static void vfu_object_init_ctx(VfuObject *o, Error **errp);
+
 static bool vfu_object_auto_shutdown(void)
 {
     bool auto_shutdown = true;
@@ -107,6 +116,11 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
 {
     VfuObject *o = VFU_OBJECT(obj);

+    if (o->vfu_ctx) {
+        error_setg(errp, "vfu: Unable to set socket property - server busy");
+        return;
+    }
+
     qapi_free_SocketAddress(o->socket);

     o->socket = NULL;
@@ -122,17 +136,83 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name,
     }

     trace_vfu_prop("socket", o->socket->u.q_unix.path);
+
+    vfu_object_init_ctx(o, errp);
 }

 static void vfu_object_set_device(Object *obj, const char *str, Error **errp)
 {
     VfuObject *o = VFU_OBJECT(obj);

+    if (o->vfu_ctx) {
+        error_setg(errp, "vfu: Unable to set device property - server busy");
+        return;
+    }
+
     g_free(o->device);

     o->device = g_strdup(str);

     trace_vfu_prop("device", str);
+
+    vfu_object_init_ctx(o, errp);
+}
+
+/*
+ * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
+ * properties. It also depends on devices instantiated in QEMU. These
+ * dependencies are not available during the instance_init phase of this
+ * object's life-cycle. As such, the server is initialized after the
+ * machine is setup. machine_init_done_notifier notifies TYPE_VFU_OBJECT
+ * when the machine is setup, and the dependencies are available.
+ */
+static void vfu_object_machine_done(Notifier *notifier, void *data)
+{
+    VfuObject *o = container_of(notifier, VfuObject, machine_done);
+    Error *err = NULL;
+
+    vfu_object_init_ctx(o, &err);
+
+    if (err) {
+        error_propagate(&error_abort, err);
+    }
+}
+
+/**
+ * vfu_object_init_ctx: Create and initialize libvfio-user context. Add
+ *     an unplug blocker for the associated PCI device. Setup a FD handler
+ *     to process incoming messages in the context's socket.
+ *
+ *     The socket and device properties are mandatory, and this function
+ *     will not create the context without them - the setters for these
+ *     properties should call this function when the property is set. The
+ *     machine should also be ready when this function is invoked - it is
+ *     because QEMU objects are initialized before devices, and the
+ *     associated PCI device wouldn't be available at the object
+ *     initialization time. Until these conditions are satisfied, this
+ *     function would return early without performing any task.
+ */
+static void vfu_object_init_ctx(VfuObject *o, Error **errp)
+{
+    ERRP_GUARD();
+
+    if (o->vfu_ctx || !o->socket || !o->device ||
+            !phase_check(PHASE_MACHINE_READY)) {
+        return;
+    }
+
+    if (o->err) {
+        error_propagate(errp, o->err);
+        o->err = NULL;
+        return;
+    }
+
+    o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0,
+                                o, VFU_DEV_TYPE_PCI);
+    if (o->vfu_ctx == NULL) {
+        error_setg(errp, "vfu: Failed to create context - %s", strerror(errno));
+        return;
+    }
 }

 static void vfu_object_init(Object *obj)
@@ -147,6 +227,12 @@ static void vfu_object_init(Object *obj)
                    TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE);
         return;
     }
+
+    if (!phase_check(PHASE_MACHINE_READY)) {
+        o->machine_done.notify = vfu_object_machine_done;
+        qemu_add_machine_init_done_notifier(&o->machine_done);
+    }
+
 }

 static void vfu_object_finalize(Object *obj)
@@ -160,6 +246,11 @@ static void vfu_object_finalize(Object *obj)

     o->socket = NULL;

+    if (o->vfu_ctx) {
+        vfu_destroy_ctx(o->vfu_ctx);
+        o->vfu_ctx = NULL;
+    }
+
     g_free(o->device);

     o->
[PATCH v10 02/14] remote/machine: add HotplugHandler for remote machine
Allow hotplugging of PCI(e) devices to the remote machine.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/machine.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 92d71d47bb..a97e53e250 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -20,6 +20,7 @@
 #include "qapi/error.h"
 #include "hw/pci/pci_host.h"
 #include "hw/remote/iohub.h"
+#include "hw/qdev-core.h"

 static void remote_machine_init(MachineState *machine)
 {
@@ -53,14 +54,19 @@ static void remote_machine_init(MachineState *machine)

     pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
                  &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+
+    qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }

 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);

     mc->init = remote_machine_init;
     mc->desc = "Experimental remote machine";
+
+    hc->unplug = qdev_simple_device_unplug_cb;
 }

 static const TypeInfo remote_machine = {
@@ -68,6 +74,10 @@ static const TypeInfo remote_machine = {
     .parent = TYPE_MACHINE,
     .instance_size = sizeof(RemoteMachineState),
     .class_init = remote_machine_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
 };

 static void remote_machine_register_types(void)
-- 
2.20.1
[PATCH v9 12/17] vfio-user: IOMMU support for remote device
Assign separate address space for each device in the remote processes. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/remote/iommu.h | 40 hw/remote/iommu.c | 131 ++ hw/remote/machine.c | 13 +++- MAINTAINERS | 2 + hw/remote/meson.build | 1 + 5 files changed, 186 insertions(+), 1 deletion(-) create mode 100644 include/hw/remote/iommu.h create mode 100644 hw/remote/iommu.c diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h new file mode 100644 index 00..33b68a8f4b --- /dev/null +++ b/include/hw/remote/iommu.h @@ -0,0 +1,40 @@ +/** + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_IOMMU_H +#define REMOTE_IOMMU_H + +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" + +#ifndef INT2VOIDP +#define INT2VOIDP(i) (void *)(uintptr_t)(i) +#endif + +typedef struct RemoteIommuElem { +MemoryRegion *mr; + +AddressSpace as; +} RemoteIommuElem; + +#define TYPE_REMOTE_IOMMU "x-remote-iommu" +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU) + +struct RemoteIommu { +Object parent; + +GHashTable *elem_by_devfn; + +QemuMutex lock; +}; + +void remote_iommu_setup(PCIBus *pci_bus); + +void remote_iommu_unplug_dev(PCIDevice *pci_dev); + +#endif diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c new file mode 100644 index 00..fd723d91f3 --- /dev/null +++ b/hw/remote/iommu.c @@ -0,0 +1,131 @@ +/** + * IOMMU for remote device + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" + +#include "hw/remote/iommu.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" +#include "exec/memory.h" +#include "exec/address-spaces.h" +#include "trace.h" + +/** + * IOMMU for TYPE_REMOTE_MACHINE - manages DMA address space isolation + * for remote machine. It is used by TYPE_VFIO_USER_SERVER. + * + * - Each TYPE_VFIO_USER_SERVER instance handles one PCIDevice on a PCIBus. + * There is one RemoteIommu per PCIBus, so the RemoteIommu tracks multiple + * PCIDevices by maintaining a ->elem_by_devfn mapping. + * + * - memory_region_init_iommu() is not used because vfio-user MemoryRegions + * will be added to the elem->mr container instead. This is more natural + * than implementing the IOMMUMemoryRegionClass APIs since vfio-user + * provides something that is close to a full-fledged MemoryRegion and + * not like an IOMMU mapping. + * + * - When a device is hot unplugged, the elem->mr reference is dropped so + * all vfio-user MemoryRegions associated with this vfio-user server are + * destroyed. 
+ */ + +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus, + void *opaque, int devfn) +{ +RemoteIommu *iommu = opaque; +RemoteIommuElem *elem = NULL; + +qemu_mutex_lock(&iommu->lock); + +elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn)); + +if (!elem) { +elem = g_malloc0(sizeof(RemoteIommuElem)); +g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem); +} + +if (!elem->mr) { +elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION)); +memory_region_set_size(elem->mr, UINT64_MAX); +address_space_init(&elem->as, elem->mr, NULL); +} + +qemu_mutex_unlock(&iommu->lock); + +return &elem->as; +} + +void remote_iommu_unplug_dev(PCIDevice *pci_dev) +{ +AddressSpace *as = pci_device_iommu_address_space(pci_dev); +RemoteIommuElem *elem = NULL; + +if (as == &address_space_memory) { +return; +} + +elem = container_of(as, RemoteIommuElem, as); + +address_space_destroy(&elem->as); + +object_unref(elem->mr); + +elem->mr = NULL; +} + +static void remote_iommu_init(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free); + +qemu_mutex_init(&iommu->lock); +} + +static void remote_iommu_finalize(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +qemu_mutex_destroy(&iommu->lock); + +g_hash_table_destroy(iommu->elem_by_devfn); + +iommu->elem_by_devfn = NULL; +} + +void remote_iommu_setup(PCIBus *pci_bus) +{ +RemoteIommu *iommu = NULL; + +g_assert(pci_bus); + +iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU)); + +pci_setup_iommu(pci_bus, remote_iommu_find_add_as, iommu); + +object_property_add_chil
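The lookup-or-create pattern in remote_iommu_find_add_as() can be sketched in isolation. This is a minimal illustration, not the patch's code: SketchIommu, SketchElem, sketch_find_add() and sketch_unplug() are hypothetical names, a flat array stands in for the GHashTable, the `initialized` flag stands in for the `elem->mr` check, and the mutex is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEVFN_MAX 256u  /* one slot per devfn on a bus, like PCI_DEVFN_MAX */

/* Hypothetical stand-in for RemoteIommuElem: one private space per device. */
typedef struct SketchElem {
    uint64_t size;      /* size of the device's private address space */
    int initialized;    /* plays the role of the elem->mr != NULL check */
} SketchElem;

/* Hypothetical stand-in for RemoteIommu; an array replaces the hash table. */
typedef struct SketchIommu {
    SketchElem *elem_by_devfn[DEVFN_MAX];
} SketchIommu;

/* Return the element for devfn, creating and initializing it on first use. */
static SketchElem *sketch_find_add(SketchIommu *iommu, unsigned devfn)
{
    assert(devfn < DEVFN_MAX);

    SketchElem *elem = iommu->elem_by_devfn[devfn];
    if (!elem) {
        elem = calloc(1, sizeof(*elem));
        assert(elem);
        iommu->elem_by_devfn[devfn] = elem;
    }
    if (!elem->initialized) {
        elem->size = UINT64_MAX;
        elem->initialized = 1;
    }
    return elem;
}

/* Unplug keeps the table entry but marks it for re-initialization. */
static void sketch_unplug(SketchIommu *iommu, unsigned devfn)
{
    assert(devfn < DEVFN_MAX);
    if (iommu->elem_by_devfn[devfn]) {
        iommu->elem_by_devfn[devfn]->initialized = 0;
    }
}
```

Two lookups for the same devfn return the same element, which is what lets every DMA access from one device land in one address space. After an unplug, the next lookup reuses the slot and initializes it again.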
[PATCH v9 13/17] vfio-user: handle DMA mappings
Define and register callbacks to manage the RAM regions used for device DMA Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/machine.c | 5 hw/remote/vfio-user-obj.c | 55 +++ hw/remote/trace-events| 2 ++ 3 files changed, 62 insertions(+) diff --git a/hw/remote/machine.c b/hw/remote/machine.c index cbb2add291..645b54343d 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -22,6 +22,7 @@ #include "hw/remote/iohub.h" #include "hw/remote/iommu.h" #include "hw/qdev-core.h" +#include "hw/remote/iommu.h" static void remote_machine_init(MachineState *machine) { @@ -51,6 +52,10 @@ static void remote_machine_init(MachineState *machine) pci_host = PCI_HOST_BRIDGE(rem_host); +if (s->vfio_user) { +remote_iommu_setup(pci_host->bus); +} + remote_iohub_init(&s->iohub); pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index c81a76094c..736339c74a 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -282,6 +282,54 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, return count; } +static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *subregion = NULL; +g_autofree char *name = NULL; +struct iovec *iov = &info->iova; + +if (!info->vaddr) { +return; +} + +name = g_strdup_printf("mem-%s-%"PRIx64"", o->device, + (uint64_t)info->vaddr); + +subregion = g_new0(MemoryRegion, 1); + +memory_region_init_ram_ptr(subregion, NULL, name, + iov->iov_len, info->vaddr); + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion); + +trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len); +} + +static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = 
vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *mr = NULL; +ram_addr_t offset; + +mr = memory_region_from_host(info->vaddr, &offset); +if (!mr) { +return; +} + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_del_subregion(dma_as->root, mr); + +object_unparent((OBJECT(mr))); + +trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. These @@ -371,6 +419,13 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) goto fail; } +ret = vfu_setup_device_dma(o->vfu_ctx, &dma_register, &dma_unregister); +if (ret < 0) { +error_setg(errp, "vfu: Failed to setup DMA handlers for %s", + o->device); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 2ef7884346..f945c7e33b 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -7,3 +7,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x" vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x" +vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes" +vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64"" -- 2.20.1
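What the register and unregister callbacks achieve can be shown without the QEMU memory API. The sketch below is an assumption-laden model, not the patch's implementation: SketchDmaMap and the sketch_dma_* functions are hypothetical, a small array replaces the subregions of the device's address space, and an explicit translate step replaces the lookup that QEMU's memory core performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_REGIONS 8u

/* Hypothetical record of one client-registered DMA region. */
typedef struct SketchDmaRegion {
    uint64_t iova;   /* start address as seen by the device */
    size_t len;      /* length in bytes */
    void *vaddr;     /* where the server has the region mapped, or NULL */
} SketchDmaRegion;

typedef struct SketchDmaMap {
    SketchDmaRegion regions[SKETCH_MAX_REGIONS];
    size_t count;
} SketchDmaMap;

/* Record a region; like dma_register(), skip regions with no mapping. */
static int sketch_dma_register(SketchDmaMap *map, uint64_t iova, size_t len,
                               void *vaddr)
{
    assert(len > 0);
    if (!vaddr || map->count == SKETCH_MAX_REGIONS) {
        return -1;
    }
    map->regions[map->count++] = (SketchDmaRegion){ iova, len, vaddr };
    return 0;
}

/* Drop the region identified by its host address, as dma_unregister() does. */
static int sketch_dma_unregister(SketchDmaMap *map, void *vaddr)
{
    for (size_t i = 0; i < map->count; i++) {
        if (map->regions[i].vaddr == vaddr) {
            map->regions[i] = map->regions[--map->count];
            return 0;
        }
    }
    return -1;
}

/* Translate a device address to a host pointer, or NULL if unmapped. */
static void *sketch_dma_translate(const SketchDmaMap *map, uint64_t iova)
{
    for (size_t i = 0; i < map->count; i++) {
        const SketchDmaRegion *r = &map->regions[i];
        if (iova >= r->iova && iova - r->iova < r->len) {
            return (char *)r->vaddr + (iova - r->iova);
        }
    }
    return NULL;
}
```

Keying the unregister path on the host address mirrors the patch, which finds the MemoryRegion with memory_region_from_host(info->vaddr, ...) rather than by guest address.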
[PATCH v9 08/17] vfio-user: instantiate vfio-user context
create a context with the vfio-user library to run a PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 82 +++ 1 file changed, 82 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index bc49adcc27..68f8a9dfa9 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -40,6 +40,9 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qemu/notify.h" +#include "sysemu/sysemu.h" +#include "libvfio-user.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -73,8 +76,14 @@ struct VfuObject { char *device; Error *err; + +Notifier machine_done; + +vfu_ctx_t *vfu_ctx; }; +static void vfu_object_init_ctx(VfuObject *o, Error **errp); + static bool vfu_object_auto_shutdown(void) { bool auto_shutdown = true; @@ -107,6 +116,11 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set socket property - server busy"); +return; +} + qapi_free_SocketAddress(o->socket); o->socket = NULL; @@ -122,17 +136,69 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, } trace_vfu_prop("socket", o->socket->u.q_unix.path); + +vfu_object_init_ctx(o, errp); } static void vfu_object_set_device(Object *obj, const char *str, Error **errp) { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set device property - server busy"); +return; +} + g_free(o->device); o->device = g_strdup(str); trace_vfu_prop("device", str); + +vfu_object_init_ctx(o, errp); +} + +/* + * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' + * properties. It also depends on devices instantiated in QEMU. 
These + * dependencies are not available during the instance_init phase of this + * object's life-cycle. As such, the server is initialized after the + * machine is setup. machine_init_done_notifier notifies TYPE_VFU_OBJECT + * when the machine is setup, and the dependencies are available. + */ +static void vfu_object_machine_done(Notifier *notifier, void *data) +{ +VfuObject *o = container_of(notifier, VfuObject, machine_done); +Error *err = NULL; + +vfu_object_init_ctx(o, &err); + +if (err) { +error_propagate(&error_abort, err); +} +} + +static void vfu_object_init_ctx(VfuObject *o, Error **errp) +{ +ERRP_GUARD(); + +if (o->vfu_ctx || !o->socket || !o->device || +!phase_check(PHASE_MACHINE_READY)) { +return; +} + +if (o->err) { +error_propagate(errp, o->err); +o->err = NULL; +return; +} + +o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0, +o, VFU_DEV_TYPE_PCI); +if (o->vfu_ctx == NULL) { +error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); +return; +} } static void vfu_object_init(Object *obj) @@ -147,6 +213,12 @@ static void vfu_object_init(Object *obj) TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE); return; } + +if (!phase_check(PHASE_MACHINE_READY)) { +o->machine_done.notify = vfu_object_machine_done; +qemu_add_machine_init_done_notifier(&o->machine_done); +} + } static void vfu_object_finalize(Object *obj) @@ -160,6 +232,11 @@ static void vfu_object_finalize(Object *obj) o->socket = NULL; +if (o->vfu_ctx) { +vfu_destroy_ctx(o->vfu_ctx); +o->vfu_ctx = NULL; +} + g_free(o->device); o->device = NULL; @@ -167,6 +244,11 @@ static void vfu_object_finalize(Object *obj) if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } + +if (o->machine_done.notify) { +qemu_remove_machine_init_done_notifier(&o->machine_done); +o->machine_done.notify = NULL; +} } static void vfu_object_class_init(ObjectClass *klass, void *data) -- 2.20.1
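The deferred initialization described in the comment above (create the context only once the socket, the device and the machine are all available) reduces to a small state check. The following is a rough sketch under stated assumptions: SketchVfuObj and its functions are hypothetical, the booleans model phase_check(PHASE_MACHINE_READY) and `o->vfu_ctx != NULL`, and "/tmp/vfu.sock" in the usage is an invented path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VfuObject plus the machine-ready phase. */
typedef struct SketchVfuObj {
    const char *socket_path;  /* set by the 'socket' property */
    const char *device_id;    /* set by the 'device' property */
    bool machine_ready;       /* models phase_check(PHASE_MACHINE_READY) */
    bool ctx_created;         /* models o->vfu_ctx != NULL */
    int create_calls;         /* counts how often creation actually ran */
} SketchVfuObj;

/* Create the context only once every prerequisite is present. */
static void sketch_init_ctx(SketchVfuObj *o)
{
    assert(o != NULL);
    if (o->ctx_created || !o->socket_path || !o->device_id ||
        !o->machine_ready) {
        return;
    }
    o->ctx_created = true;
    o->create_calls++;
}

/* Setters refuse changes once the context exists, then retry the init. */
static bool sketch_set_socket(SketchVfuObj *o, const char *path)
{
    if (o->ctx_created) {
        return false;   /* "server busy" in the patch */
    }
    o->socket_path = path;
    sketch_init_ctx(o);
    return true;
}

static bool sketch_set_device(SketchVfuObj *o, const char *id)
{
    if (o->ctx_created) {
        return false;
    }
    o->device_id = id;
    sketch_init_ctx(o);
    return true;
}

/* Called from the machine-init-done notifier. */
static void sketch_machine_done(SketchVfuObj *o)
{
    o->machine_ready = true;
    sketch_init_ctx(o);
}
```

Because every entry point funnels into the same guarded function, the order in which the properties and the notifier arrive does not matter, and the context is created at most once.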
[PATCH v9 17/17] vfio-user: avocado tests for vfio-user
Avocado tests for libvfio-user in QEMU - tests startup, hotplug and migration of the server object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- MAINTAINERS| 1 + tests/avocado/vfio-user.py | 164 + 2 files changed, 165 insertions(+) create mode 100644 tests/avocado/vfio-user.py diff --git a/MAINTAINERS b/MAINTAINERS index d2e977affb..103fcb472f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3603,6 +3603,7 @@ F: hw/remote/vfio-user-obj.c F: include/hw/remote/vfio-user-obj.h F: hw/remote/iommu.c F: include/hw/remote/iommu.h +F: tests/avocado/vfio-user.py EBPF: M: Jason Wang diff --git a/tests/avocado/vfio-user.py b/tests/avocado/vfio-user.py new file mode 100644 index 00..ced304d770 --- /dev/null +++ b/tests/avocado/vfio-user.py @@ -0,0 +1,164 @@ +# vfio-user protocol sanity test +# +# This work is licensed under the terms of the GNU GPL, version 2 or +# later. See the COPYING file in the top-level directory. + + +import os +import socket +import uuid + +from avocado_qemu import QemuSystemTest +from avocado_qemu import wait_for_console_pattern +from avocado_qemu import exec_command +from avocado_qemu import exec_command_and_wait_for_pattern + +from avocado.utils import network +from avocado.utils import wait + +class VfioUser(QemuSystemTest): +""" +:avocado: tags=vfiouser +""" +KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 ' +timeout = 20 + +def _get_free_port(self): +port = network.find_free_port() +if port is None: +self.cancel('Failed to find a free port') +return port + +def validate_vm_launch(self, vm): +wait_for_console_pattern(self, 'as init process', + 'Kernel panic - not syncing', vm=vm) +exec_command(self, 'mount -t sysfs sysfs /sys', vm=vm) +exec_command_and_wait_for_pattern(self, + 'cat /sys/bus/pci/devices/*/uevent', + 'PCI_ID=1000:0060', vm=vm) + +def launch_server_startup(self, socket, *opts): +server_vm = self.get_vm() +server_vm.add_args('-machine', 'x-remote,vfio-user=on') 
+server_vm.add_args('-nodefaults') +server_vm.add_args('-device', 'megasas,id=sas1') +server_vm.add_args('-object', 'x-vfio-user-server,id=vfioobj1,' + 'type=unix,path='+socket+',device=sas1') +for opt in opts: +server_vm.add_args(opt) +server_vm.launch() +return server_vm + +def launch_server_hotplug(self, socket): +server_vm = self.get_vm() +server_vm.add_args('-machine', 'x-remote,vfio-user=on') +server_vm.add_args('-nodefaults') +server_vm.launch() +server_vm.qmp('device_add', args_dict=None, conv_keys=None, + driver='megasas', id='sas1') +obj_add_opts = {'qom-type': 'x-vfio-user-server', +'id': 'vfioobj', 'device': 'sas1', +'socket': {'type': 'unix', 'path': socket}} +server_vm.qmp('object-add', args_dict=obj_add_opts) +return server_vm + +def launch_client(self, kernel_path, initrd_path, kernel_command_line, + machine_type, socket, *opts): +client_vm = self.get_vm() +client_vm.set_console() +client_vm.add_args('-machine', machine_type) +client_vm.add_args('-accel', 'kvm') +client_vm.add_args('-cpu', 'host') +client_vm.add_args('-object', + 'memory-backend-memfd,id=sysmem-file,size=2G') +client_vm.add_args('--numa', 'node,memdev=sysmem-file') +client_vm.add_args('-m', '2048') +client_vm.add_args('-kernel', kernel_path, + '-initrd', initrd_path, + '-append', kernel_command_line) +client_vm.add_args('-device', + 'vfio-user-pci,socket='+socket) +for opt in opts: +client_vm.add_args(opt) +client_vm.launch() +return client_vm + +def do_test_startup(self, kernel_url, initrd_url, kernel_command_line, +machine_type): +self.require_accelerator('kvm') + +kernel_path = self.fetch_asset(kernel_url) +initrd_path = self.fetch_asset(initrd_url) +socket
[PATCH v9 15/17] vfio-user: handle device interrupts
Forward remote device's interrupts to the guest Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/pci/pci.h | 13 include/hw/remote/vfio-user-obj.h | 6 ++ hw/pci/msi.c | 16 ++-- hw/pci/msix.c | 10 ++- hw/pci/pci.c | 13 hw/remote/machine.c | 14 +++- hw/remote/vfio-user-obj.c | 123 ++ stubs/vfio-user-obj.c | 6 ++ MAINTAINERS | 1 + hw/remote/trace-events| 1 + stubs/meson.build | 1 + 11 files changed, 193 insertions(+), 11 deletions(-) create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 stubs/vfio-user-obj.c diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 3a32b8dd40..7595c05c98 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -16,6 +16,7 @@ extern bool pci_available; #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f) #define PCI_FUNC(devfn) ((devfn) & 0x07) #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn)) +#define PCI_BDF_TO_DEVFN(x) ((x) & 0xff) #define PCI_BUS_MAX 256 #define PCI_DEVFN_MAX 256 #define PCI_SLOT_MAX32 @@ -127,6 +128,10 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type); typedef void PCIUnregisterFunc(PCIDevice *pci_dev); +typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg); +typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector); +typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector); + typedef struct PCIIORegion { pcibus_t addr; /* current PCI mapping address. 
-1 means not mapped */ #define PCI_BAR_UNMAPPED (~(pcibus_t)0) @@ -321,6 +326,14 @@ struct PCIDevice { /* Space to store MSIX table & pending bit array */ uint8_t *msix_table; uint8_t *msix_pba; + +/* May be used by INTx or MSI during interrupt notification */ +void *irq_opaque; + +MSITriggerFunc *msi_trigger; +MSIPrepareMessageFunc *msi_prepare_message; +MSIxPrepareMessageFunc *msix_prepare_message; + /* MemoryRegion container for msix exclusive BAR setup */ MemoryRegion msix_exclusive_bar; /* Memory Regions for MSIX table and pending bit entries. */ diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h new file mode 100644 index 00..87ab78b875 --- /dev/null +++ b/include/hw/remote/vfio-user-obj.h @@ -0,0 +1,6 @@ +#ifndef VFIO_USER_OBJ_H +#define VFIO_USER_OBJ_H + +void vfu_object_set_bus_irq(PCIBus *pci_bus); + +#endif diff --git a/hw/pci/msi.c b/hw/pci/msi.c index 47d2b0f33c..d556e17a09 100644 --- a/hw/pci/msi.c +++ b/hw/pci/msi.c @@ -134,7 +134,7 @@ void msi_set_message(PCIDevice *dev, MSIMessage msg) pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data); } -MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector) { uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev)); bool msi64bit = flags & PCI_MSI_FLAGS_64BIT; @@ -159,6 +159,11 @@ MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) return msg; } +MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +{ +return dev->msi_prepare_message(dev, vector); +} + bool msi_enabled(const PCIDevice *dev) { return msi_present(dev) && @@ -241,6 +246,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, 0x >> (PCI_MSI_VECTORS_MAX - nr_vectors)); } +dev->msi_prepare_message = msi_prepare_message; + return 0; } @@ -256,6 +263,7 @@ void msi_uninit(struct PCIDevice *dev) cap_size = msi_cap_sizeof(flags); pci_del_capability(dev, PCI_CAP_ID_MSI, cap_size); dev->cap_present 
&= ~QEMU_PCI_CAP_MSI; +dev->msi_prepare_message = NULL; MSI_DEV_PRINTF(dev, "uninit\n"); } @@ -334,11 +342,7 @@ void msi_notify(PCIDevice *dev, unsigned int vector) void msi_send_message(PCIDevice *dev, MSIMessage msg) { -MemTxAttrs attrs = {}; - -attrs.requester_id = pci_requester_id(dev); -address_space_stl_le(&dev->bus_master_as, msg.address, msg.data, - attrs, NULL); +dev->msi_trigger(dev, msg); } /* Normally called by pci_default_write_config(). */ diff --git a/hw/pci/msix.c b/hw/pci/msix.c index ae9331cd0b..6f85192d6f 100644 --- a/hw/pci/msix.c +++ b/hw/pci/msix.c @@ -31,7 +31,7 @@ #define MSIX_ENABLE_MASK (PCI_MSIX_FLAGS_ENABLE >> 8) #define MSIX_MASKALL_MASK (PCI_MSIX_FLAGS_MASKALL >> 8) -MSIMessage msix_get_message(PCIDevice *dev, unsigned vector) +static MSIMessage msix_prepare_message(PCIDe
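The visible part of this patch turns msi_send_message() into a call through a per-device function pointer. A standalone sketch of that indirection follows. All Sketch* names are hypothetical. The default hook here only counts calls; in the patch the original memory write presumably moves into a default hook, but that part of the diff is cut off, so treat that as an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef struct SketchMsg {
    uint64_t address;
    uint32_t data;
} SketchMsg;

typedef struct SketchDev SketchDev;
typedef void SketchTriggerFunc(SketchDev *dev, SketchMsg msg);

struct SketchDev {
    SketchTriggerFunc *msi_trigger; /* per-device delivery hook */
    void *irq_opaque;               /* context for the hook, e.g. a server */
    uint32_t local_writes;          /* messages handled by the default path */
};

/* Hypothetical server-side sink that counts forwarded interrupts. */
typedef struct SketchIrqSink {
    uint32_t forwarded;
    uint32_t last_data;
} SketchIrqSink;

/* Default path: stands in for a local memory write of msg.data. */
static void sketch_default_trigger(SketchDev *dev, SketchMsg msg)
{
    (void)msg;
    dev->local_writes++;
}

/* Remote path: hand the message to whatever irq_opaque points at. */
static void sketch_forward_trigger(SketchDev *dev, SketchMsg msg)
{
    SketchIrqSink *sink = dev->irq_opaque;
    sink->forwarded++;
    sink->last_data = msg.data;
}

/* Like the patched msi_send_message(): just call the hook. */
static void sketch_send_message(SketchDev *dev, SketchMsg msg)
{
    assert(dev->msi_trigger);
    dev->msi_trigger(dev, msg);
}
```

Swapping the hook changes where an interrupt goes without touching the device model that raises it, which is the point of adding msi_trigger and irq_opaque to PCIDevice.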
[PATCH v9 05/17] configure: require cmake 3.19 or newer
cmake needs to accept the compiler flags specified with the CMAKE_<LANG>_COMPILER variable. It does so starting with version 3.19. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- configure | 16 1 file changed, 16 insertions(+) diff --git a/configure b/configure index 59c43bea05..7cefab289d 100755 --- a/configure +++ b/configure @@ -249,6 +249,7 @@ stack_protector="" safe_stack="" use_containers="yes" gdb_bin=$(command -v "gdb-multiarch" || command -v "gdb") +cmake_required="no" if test -e "$source_path/.git" then @@ -2503,6 +2504,21 @@ if !(GIT="$git" "$source_path/scripts/git-submodule.sh" "$git_submodules_action" exit 1 fi +# Per cmake spec, CMAKE_<LANG>_COMPILER variable may include "mandatory" compiler +# flags. QEMU needs to specify these flags to correctly configure the build +# environment. cmake 3.19 allows specifying these mandatory compiler flags, +# and as such 3.19 or newer is required to build QEMU. +if test "$cmake_required" = "yes" ; then +cmake_bin=$(command -v "cmake") +if [ -z "$cmake_bin" ]; then +error_exit "cmake not found" +fi +cmake_version=$($cmake_bin --version | head -n 1) +if ! version_ge ${cmake_version##* } 3.19; then +error_exit "QEMU needs cmake 3.19 or newer" +fi +fi + config_host_mak="config-host.mak" echo "# Automatically generated by configure - do not modify" > $config_host_mak -- 2.20.1
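The check above takes the last word of the first line of `cmake --version` and compares it numerically against 3.19. The same logic is sketched here in C to make one detail explicit: components are compared as numbers, so 3.9 is older than 3.19. sketch_last_word() and sketch_version_ge() are hypothetical helpers written for this illustration; they are not the configure script's version_ge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return the text after the last space, like ${cmake_version##* } in sh. */
static const char *sketch_last_word(const char *line)
{
    assert(line != NULL);
    const char *sp = strrchr(line, ' ');
    return sp ? sp + 1 : line;
}

/* True when "major.minor[...]" in version is >= the required pair. */
static bool sketch_version_ge(const char *version, int req_major,
                              int req_minor)
{
    int major = 0, minor = 0;

    if (sscanf(version, "%d.%d", &major, &minor) < 1) {
        return false;   /* not a version string at all */
    }
    if (major != req_major) {
        return major > req_major;
    }
    return minor >= req_minor;
}
```

A plain string comparison would rank "3.9.4" above "3.19.2", which is why the comparison is done per component.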
[PATCH v9 11/17] vfio-user: handle PCI config space accesses
Define and register handlers for PCI config space accesses Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 51 +++ hw/remote/trace-events| 2 ++ 2 files changed, 53 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 3a4c6a9fa0..c81a76094c 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -46,6 +46,7 @@ #include "qapi/qapi-events-misc.h" #include "qemu/notify.h" #include "qemu/thread.h" +#include "qemu/main-loop.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" @@ -242,6 +243,45 @@ retry_attach: qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o); } +static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, + size_t count, loff_t offset, + const bool is_write) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +uint32_t pci_access_width = sizeof(uint32_t); +size_t bytes = count; +uint32_t val = 0; +char *ptr = buf; +int len; + +/* + * Writes to the BAR registers would trigger an update to the + * global Memory and IO AddressSpaces. But the remote device + * never uses the global AddressSpaces, therefore overlapping + * memory regions are not a problem + */ +while (bytes > 0) { +len = (bytes > pci_access_width) ? pci_access_width : bytes; +if (is_write) { +memcpy(&val, ptr, len); +pci_host_config_write_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), + val, len); +trace_vfu_cfg_write(offset, val); +} else { +val = pci_host_config_read_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), len); +memcpy(ptr, &val, len); +trace_vfu_cfg_read(offset, val); +} +offset += len; +ptr += len; +bytes -= len; +} + +return count; +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. 
These @@ -320,6 +360,17 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) TYPE_VFU_OBJECT, o->device); qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +ret = vfu_setup_region(o->vfu_ctx, VFU_PCI_DEV_CFG_REGION_IDX, + pci_config_size(o->pci_dev), &vfu_object_cfg_access, + VFU_REGION_FLAG_RW | VFU_REGION_FLAG_ALWAYS_CB, + NULL, 0, -1, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to setup config space handlers for %s- %s", + o->device, strerror(errno)); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 7da12f0d96..2ef7884346 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -5,3 +5,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, # vfio-user-obj.c vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" +vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x" +vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x" -- 2.20.1
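vfu_object_cfg_access() splits each request into accesses of at most four bytes. The loop can be exercised on its own with a flat byte array in place of the PCI config helpers. This is only a sketch: SketchCfgDev and sketch_cfg_access() are hypothetical, the memcpy calls replace pci_host_config_read_common() and pci_host_config_write_common(), and the bounds check is an addition for the sketch that the patch does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CFG_SIZE 256u
#define SKETCH_ACCESS_WIDTH 4u  /* bytes per access, as in the patch */

/* Hypothetical device with a flat config space and an access counter. */
typedef struct SketchCfgDev {
    uint8_t config[SKETCH_CFG_SIZE];
    unsigned accesses;
} SketchCfgDev;

/* Split one request into accesses of at most SKETCH_ACCESS_WIDTH bytes. */
static long sketch_cfg_access(SketchCfgDev *dev, char *buf, size_t count,
                              size_t offset, bool is_write)
{
    size_t bytes = count;
    char *ptr = buf;

    assert(dev && buf);
    if (offset > SKETCH_CFG_SIZE || count > SKETCH_CFG_SIZE - offset) {
        return -1;  /* bounds check added for the sketch */
    }
    while (bytes > 0) {
        size_t len = bytes > SKETCH_ACCESS_WIDTH ? SKETCH_ACCESS_WIDTH : bytes;
        if (is_write) {
            memcpy(&dev->config[offset], ptr, len);
        } else {
            memcpy(ptr, &dev->config[offset], len);
        }
        dev->accesses++;
        offset += len;
        ptr += len;
        bytes -= len;
    }
    return (long)count;
}
```

A six-byte request therefore becomes one four-byte access followed by one two-byte access, and the function reports the full count on success, as the patch's handler does.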
[PATCH v9 16/17] vfio-user: handle reset of remote device
Adds handler to reset a remote device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 20 1 file changed, 20 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index d351b1daa3..15b06744f9 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -630,6 +630,20 @@ void vfu_object_set_bus_irq(PCIBus *pci_bus) pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus, 1); } +static int vfu_object_device_reset(vfu_ctx_t *vfu_ctx, vfu_reset_type_t type) +{ +VfuObject *o = vfu_get_private(vfu_ctx); + +/* vfu_object_ctx_run() handles lost connection */ +if (type == VFU_RESET_LOST_CONN) { +return 0; +} + +qdev_reset_all(DEVICE(o->pci_dev)); + +return 0; +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. These @@ -735,6 +749,12 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) goto fail; } +ret = vfu_setup_device_reset_cb(o->vfu_ctx, &vfu_object_device_reset); +if (ret < 0) { +error_setg(errp, "vfu: Failed to setup reset callback"); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", -- 2.20.1
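The reset callback's only decision is whether the reset reason is a lost connection. A toy version of that decision is sketched here. SketchResetType and SketchResetDev are hypothetical; the patch itself only names VFU_RESET_LOST_CONN, and the counter stands in for qdev_reset_all().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reset reasons; the patch only names VFU_RESET_LOST_CONN. */
typedef enum SketchResetType {
    SKETCH_RESET_DEVICE,
    SKETCH_RESET_LOST_CONN
} SketchResetType;

typedef struct SketchResetDev {
    unsigned resets;   /* how many times the device state was reset */
} SketchResetDev;

/* Reset the device unless the reason is a lost connection. */
static int sketch_device_reset(SketchResetDev *dev, SketchResetType type)
{
    assert(dev != NULL);
    if (type == SKETCH_RESET_LOST_CONN) {
        return 0;   /* the run loop deals with hangups separately */
    }
    dev->resets++;
    return 0;
}
```

Returning success without touching the device on a lost connection matches the comment in the patch: vfu_object_ctx_run() already tears the object down in that case.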
[PATCH v9 06/17] vfio-user: build library
add the libvfio-user library as a submodule. build it as a cmake subproject. libvfio-user is distributed with BSD 3-Clause license and json-c with MIT (Expat) license Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- configure | 20 +- meson.build| 44 +- .gitlab-ci.d/buildtest.yml | 2 + .gitmodules| 3 ++ Kconfig.host | 4 ++ MAINTAINERS| 1 + hw/remote/Kconfig | 4 ++ hw/remote/meson.build | 2 + meson_options.txt | 3 ++ subprojects/libvfio-user | 1 + tests/docker/dockerfiles/centos8.docker| 2 + tests/docker/dockerfiles/ubuntu2004.docker | 2 + 12 files changed, 86 insertions(+), 2 deletions(-) create mode 16 subprojects/libvfio-user diff --git a/configure b/configure index 7cefab289d..3b096f1b94 100755 --- a/configure +++ b/configure @@ -326,6 +326,7 @@ meson="" meson_args="" ninja="" skip_meson=no +vfio_user_server="disabled" # The following Meson options are handled manually (still they # are included in the automatically generated help message) @@ -1008,6 +1009,11 @@ for opt do ;; --disable-blobs) meson_option_parse --disable-install-blobs "" ;; + --enable-vfio-user-server) vfio_user_server="enabled" + cmake_required="yes" + ;; + --disable-vfio-user-server) vfio_user_server="disabled" + ;; --enable-tcmalloc) meson_option_parse --enable-malloc=tcmalloc tcmalloc ;; --enable-jemalloc) meson_option_parse --enable-malloc=jemalloc jemalloc @@ -1226,6 +1232,7 @@ cat << EOF vhost-kernelvhost kernel backend support vhost-user vhost-user backend support vhost-vdpa vhost-vdpa kernel backend support + vfio-user-servervfio-user server support NOTE: The object files are built at the place where configure is launched EOF @@ -2350,6 +2357,17 @@ case "$slirp" in ;; esac +## +# check for vfio_user_server + +case "$vfio_user_server" in + auto | enabled ) +if test "$git_submodules_action" != "ignore"; then + git_submodules="${git_submodules} subprojects/libvfio-user" +fi +;; +esac + ## # End of CC checks # 
After here, no more $cc or $ld runs @@ -2854,7 +2872,7 @@ if test "$skip_meson" = no; then -Db_pie=$(if test "$pie" = yes; then echo true; else echo false; fi) \ -Db_coverage=$(if test "$gcov" = yes; then echo true; else echo false; fi) \ -Db_lto=$lto -Dcfi=$cfi -Dtcg=$tcg -Dxen=$xen \ --Dcapstone=$capstone -Dfdt=$fdt -Dslirp=$slirp \ +-Dcapstone=$capstone -Dfdt=$fdt -Dslirp=$slirp -Dvfio_user_server=$vfio_user_server \ $(test -n "${LIB_FUZZING_ENGINE+xxx}" && echo "-Dfuzzing_engine=$LIB_FUZZING_ENGINE") \ $(if test "$default_feature" = no; then echo "-Dauto_features=disabled"; fi) \ "$@" $cross_arg "$PWD" "$source_path" diff --git a/meson.build b/meson.build index 1fe7d257ff..55b872b51e 100644 --- a/meson.build +++ b/meson.build @@ -298,6 +298,11 @@ have_tpm = get_option('tpm') \ .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \ .allowed() +if targetos != 'linux' and get_option('vfio_user_server').enabled() + error('vfio-user server is supported only on Linux') +endif +vfio_user_server_allowed = targetos == 'linux' and not get_option('vfio_user_server').disabled() + # Target-specific libraries and flags libm = cc.find_library('m', required: false) threads = dependency('threads') @@ -2204,7 +2209,8 @@ host_kconfig = \ (have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \ ('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \ (have_pvrdma ? ['CONFIG_PVRDMA=y'] : []) + \ - (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + \ + (vfio_user_server_allowed ? ['CONFIG_VFIO_USER_SERVER_ALLOWED=y'] : []) ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ] @@ -2596,6 +2602,41 @@ if get_option('cfi') and slirp_opt == 'system' + ' Please configure with --enable-slirp=git') endif +vfiouser = not_found +if have_system and
[PATCH v9 04/17] remote/machine: add vfio-user property
Add a vfio-user property to the x-remote machine. It is a boolean that indicates whether the machine supports the vfio-user protocol. The machine configures the bus differently for the vfio-user and multiprocess protocols, so this property tells it how to configure the bus. This property should be short-lived: once vfio-user fully replaces multiprocess, it could be removed. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/remote/machine.h | 2 ++ hw/remote/machine.c | 23 +++ 2 files changed, 25 insertions(+) diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 2a2a33c4b2..8d0fa98d33 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -22,6 +22,8 @@ struct RemoteMachineState { RemotePCIHost *host; RemoteIOHubState iohub; + +bool vfio_user; }; /* Used to pass to co-routine device and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index a97e53e250..9f3cdc55c3 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -58,6 +58,25 @@ static void remote_machine_init(MachineState *machine) qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s)); } +static bool remote_machine_get_vfio_user(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->vfio_user; +} + +static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +if (phase_check(PHASE_MACHINE_CREATED)) { +error_setg(errp, "Error enabling vfio-user - machine already created"); +return; +} + +s->vfio_user = value; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -67,6 +86,10 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) mc->desc = "Experimental remote machine"; hc->unplug = qdev_simple_device_unplug_cb; + +object_class_property_add_bool(oc, "vfio-user", + remote_machine_get_vfio_user, 
+ remote_machine_set_vfio_user); } static const TypeInfo remote_machine = { -- 2.20.1
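The setter above rejects changes once the machine has been created, because machine init is where the flag is consumed. That ordering constraint can be modelled in a few lines. The names below (SketchMachine, SketchBusMode and the two functions) are hypothetical, and the `created` flag models phase_check(PHASE_MACHINE_CREATED).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum SketchBusMode {
    SKETCH_BUS_MULTIPROCESS,
    SKETCH_BUS_VFIO_USER
} SketchBusMode;

/* Hypothetical stand-in for RemoteMachineState. */
typedef struct SketchMachine {
    bool created;     /* models phase_check(PHASE_MACHINE_CREATED) */
    bool vfio_user;   /* the property value */
} SketchMachine;

/* Accept the property only before the machine has been created. */
static bool sketch_set_vfio_user(SketchMachine *m, bool value,
                                 const char **err)
{
    assert(m != NULL && err != NULL);
    if (m->created) {
        *err = "machine already created";
        return false;
    }
    m->vfio_user = value;
    return true;
}

/* Machine init consumes the flag once; later changes would have no effect. */
static SketchBusMode sketch_machine_init(SketchMachine *m)
{
    m->created = true;
    return m->vfio_user ? SKETCH_BUS_VFIO_USER : SKETCH_BUS_MULTIPROCESS;
}
```

Failing loudly on a late write is friendlier than silently accepting a value that the already-configured bus would never see.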
[PATCH v9 10/17] vfio-user: run vfio-user context
Setup a handler to run vfio-user context. The context is driven by messages to the file descriptor associated with it - get the fd for the context and hook up the handler with it Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/misc.json| 30 +++ hw/remote/vfio-user-obj.c | 102 +- 2 files changed, 131 insertions(+), 1 deletion(-) diff --git a/qapi/misc.json b/qapi/misc.json index b83cc39029..fa49f2876a 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -553,3 +553,33 @@ ## { 'event': 'RTC_CHANGE', 'data': { 'offset': 'int', 'qom-path': 'str' } } + +## +# @VFU_CLIENT_HANGUP: +# +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the +# communication channel +# +# @vfu-id: ID of the TYPE_VFIO_USER_SERVER object +# +# @vfu-qom-path: path to the TYPE_VFIO_USER_SERVER object in the QOM tree +# +# @dev-id: ID of attached PCI device +# +# @dev-qom-path: path to attached PCI device in the QOM tree +# +# Since: 7.1 +# +# Example: +# +# <- { "event": "VFU_CLIENT_HANGUP", +# "data": { "vfu-id": "vfu1", +#"vfu-qom-path": "/objects/vfu1", +#"dev-id": "sas1", +#"dev-qom-path": "/machine/peripheral/sas1" }, +# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } +# +## +{ 'event': 'VFU_CLIENT_HANGUP', + 'data': { 'vfu-id': 'str', 'vfu-qom-path': 'str', +'dev-id': 'str', 'dev-qom-path': 'str' } } diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 3ca6aa2b45..3a4c6a9fa0 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -27,6 +27,9 @@ * * device - id of a device on the server, a required option. PCI devices * alone are supported presently. + * + * notes - x-vfio-user-server could block IO and monitor during the + * initialization phase. 
*/ #include "qemu/osdep.h" @@ -40,11 +43,14 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qapi/qapi-events-misc.h" #include "qemu/notify.h" +#include "qemu/thread.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" #include "hw/pci/pci.h" +#include "qemu/timer.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -86,6 +92,8 @@ struct VfuObject { PCIDevice *pci_dev; Error *unplug_blocker; + +int vfu_poll_fd; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -164,6 +172,76 @@ static void vfu_object_set_device(Object *obj, const char *str, Error **errp) vfu_object_init_ctx(o, errp); } +static void vfu_object_ctx_run(void *opaque) +{ +VfuObject *o = opaque; +const char *vfu_id; +char *vfu_path, *pci_dev_path; +int ret = -1; + +while (ret != 0) { +ret = vfu_run_ctx(o->vfu_ctx); +if (ret < 0) { +if (errno == EINTR) { +continue; +} else if (errno == ENOTCONN) { +vfu_id = object_get_canonical_path_component(OBJECT(o)); +vfu_path = object_get_canonical_path(OBJECT(o)); +g_assert(o->pci_dev); +pci_dev_path = object_get_canonical_path(OBJECT(o->pci_dev)); +qapi_event_send_vfu_client_hangup(vfu_id, vfu_path, + o->device, pci_dev_path); +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); +o->vfu_poll_fd = -1; +object_unparent(OBJECT(o)); +g_free(vfu_path); +g_free(pci_dev_path); +break; +} else { +VFU_OBJECT_ERROR(o, "vfu: Failed to run device %s - %s", + o->device, strerror(errno)); +break; +} +} +} +} + +static void vfu_object_attach_ctx(void *opaque) +{ +VfuObject *o = opaque; +GPollFD pfds[1]; +int ret; + +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); + +pfds[0].fd = o->vfu_poll_fd; +pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR; + +retry_attach: +ret = vfu_attach_ctx(o->vfu_ctx); +if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) { +/** + * vfu_object_attach_ctx can block QEMU's main loop + 
* during attach - th
[PATCH v9 02/17] qdev: unplug blocker for devices
Add a blocker to prevent hot-unplug of devices.

TYPE_VFIO_USER_SERVER, which is introduced shortly, attaches itself to a
PCIDevice on which it depends. If the attached PCIDevice is removed while
the server is in use, the server could crash. To prevent this,
TYPE_VFIO_USER_SERVER adds an unplug blocker for the PCIDevice.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 include/hw/qdev-core.h | 29 +
 hw/core/qdev.c         | 24
 softmmu/qdev-monitor.c |  4
 3 files changed, 57 insertions(+)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 92c3d65208..1b9fa25e5c 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -193,6 +193,7 @@ struct DeviceState {
     int instance_id_alias;
     int alias_required_for_version;
     ResettableState reset;
+    GSList *unplug_blockers;
 };
 
 struct DeviceListener {
@@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
 void qdev_machine_creation_done(void);
 bool qdev_machine_modified(void);
 
+/*
+ * qdev_add_unplug_blocker: Adds an unplug blocker to a device
+ *
+ * @dev: Device to be blocked from unplug
+ * @reason: Reason for blocking
+ */
+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason);
+
+/*
+ * qdev_del_unplug_blocker: Removes an unplug blocker from a device
+ *
+ * @dev: Device to be unblocked
+ * @reason: Pointer to the Error used with qdev_add_unplug_blocker.
+ *          Used as a handle to lookup the blocker for deletion.
+ */
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason);
+
+/*
+ * qdev_unplug_blocked: Confirms if a device is blocked from unplug
+ *
+ * @dev: Device to be tested
+ * @errp: Returns one of the reasons why the device is blocked,
+ *        if any
+ *
+ * Returns: true if device is blocked from unplug, false otherwise
+ */
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp);
+
 /**
  * GpioPolarity: Polarity of a GPIO line
  *
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 84f3019440..0806d8fcaa 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev)
     return NULL;
 }
 
+void qdev_add_unplug_blocker(DeviceState *dev, Error *reason)
+{
+    dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason);
+}
+
+void qdev_del_unplug_blocker(DeviceState *dev, Error *reason)
+{
+    dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason);
+}
+
+bool qdev_unplug_blocked(DeviceState *dev, Error **errp)
+{
+    ERRP_GUARD();
+
+    if (dev->unplug_blockers) {
+        error_propagate(errp, error_copy(dev->unplug_blockers->data));
+        return true;
+    }
+
+    return false;
+}
+
 static bool device_get_realized(Object *obj, Error **errp)
 {
     DeviceState *dev = DEVICE(obj);
@@ -704,6 +726,8 @@ static void device_finalize(Object *obj)
 
     DeviceState *dev = DEVICE(obj);
 
+    g_assert(!dev->unplug_blockers);
+
     QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
         QLIST_REMOVE(ngl, node);
         qemu_free_irqs(ngl->in, ngl->num_in);
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 12fe60c467..9cfd59d17c 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -898,6 +898,10 @@ void qdev_unplug(DeviceState *dev, Error **errp)
     HotplugHandlerClass *hdc;
     Error *local_err = NULL;
 
+    if (qdev_unplug_blocked(dev, errp)) {
+        return;
+    }
+
     if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) {
         error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name);
         return;
-- 
2.20.1
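To illustrate the blocker semantics outside QEMU, below is a rough
standalone sketch that replaces GSList and Error with a hand-rolled singly
linked list of string reasons. All demo_* names and the DemoDevice and
DemoBlocker types are made up for this example. As in the patch, the reason
pointer doubles as the handle used to find the blocker when deleting it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; the patch uses a GSList of Error pointers. */
typedef struct DemoBlocker {
    const char *reason;
    struct DemoBlocker *next;
} DemoBlocker;

/* Hypothetical stand-in for DeviceState. */
typedef struct {
    DemoBlocker *unplug_blockers;
} DemoDevice;

/* Prepend a blocker, mirroring g_slist_prepend() in the patch. */
void demo_add_unplug_blocker(DemoDevice *dev, const char *reason)
{
    DemoBlocker *b = malloc(sizeof(*b));

    assert(b);
    b->reason = reason;
    b->next = dev->unplug_blockers;
    dev->unplug_blockers = b;
}

/* Remove the first blocker whose reason pointer matches (identity). */
void demo_del_unplug_blocker(DemoDevice *dev, const char *reason)
{
    DemoBlocker **pp = &dev->unplug_blockers;

    while (*pp) {
        if ((*pp)->reason == reason) {
            DemoBlocker *dead = *pp;

            *pp = dead->next;
            free(dead);
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Report whether any blocker exists; hand back the newest reason. */
bool demo_unplug_blocked(const DemoDevice *dev, const char **reason)
{
    if (dev->unplug_blockers) {
        if (reason) {
            *reason = dev->unplug_blockers->reason;
        }
        return true;
    }
    return false;
}

/* Unplug succeeds only when nothing blocks it, as in qdev_unplug(). */
bool demo_unplug(DemoDevice *dev, const char **reason)
{
    return !demo_unplug_blocked(dev, reason);
}
```

A caller adds the blocker when it starts depending on the device and
deletes it with the same pointer before letting go, which is why the list
must be empty by the time the device is finalized.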
[PATCH v9 14/17] vfio-user: handle PCI BAR accesses
Determine the BARs used by the PCI device and register handlers to manage the access to the same. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/exec/memory.h | 3 + hw/remote/vfio-user-obj.c | 190 softmmu/physmem.c | 4 +- tests/qtest/fuzz/generic_fuzz.c | 9 +- hw/remote/trace-events | 3 + 5 files changed, 203 insertions(+), 6 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index f1c19451bc..a6a0f4d8ad 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -2810,6 +2810,9 @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr, const void *buf, hwaddr len); +int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr); +bool prepare_mmio_access(MemoryRegion *mr); + static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write) { if (is_write) { diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 736339c74a..f5ca909e68 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -52,6 +52,7 @@ #include "hw/qdev-core.h" #include "hw/pci/pci.h" #include "qemu/timer.h" +#include "exec/memory.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -330,6 +331,193 @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); } +static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset, +hwaddr size, const bool is_write) +{ +uint8_t *ptr = buf; +bool release_lock = false; +uint8_t *ram_ptr = NULL; +MemTxResult result; +int access_size; +uint64_t val; + +if (memory_access_is_direct(mr, is_write)) { +/** + * Some devices expose a PCI expansion ROM, which could be buffer + * based as compared to other regions which are primarily based on + * MemoryRegionOps. memory_region_find() would already check + * for buffer overflow, we don't need to repeat it here. 
+ */ +ram_ptr = memory_region_get_ram_ptr(mr); + +if (is_write) { +memcpy((ram_ptr + offset), buf, size); +} else { +memcpy(buf, (ram_ptr + offset), size); +} + +return 0; +} + +while (size) { +/** + * The read/write logic used below is similar to the ones in + * flatview_read/write_continue() + */ +release_lock = prepare_mmio_access(mr); + +access_size = memory_access_size(mr, size, offset); + +if (is_write) { +val = ldn_he_p(ptr, access_size); + +result = memory_region_dispatch_write(mr, offset, val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); +} else { +result = memory_region_dispatch_read(mr, offset, &val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); + +stn_he_p(ptr, access_size, val); +} + +if (release_lock) { +qemu_mutex_unlock_iothread(); +release_lock = false; +} + +if (result != MEMTX_OK) { +return -1; +} + +size -= access_size; +ptr += access_size; +offset += access_size; +} + +return 0; +} + +static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar, +hwaddr bar_offset, char * const buf, +hwaddr len, const bool is_write) +{ +MemoryRegionSection section = { 0 }; +uint8_t *ptr = (uint8_t *)buf; +MemoryRegion *section_mr = NULL; +uint64_t section_size; +hwaddr section_offset; +hwaddr size = 0; + +while (len) { +section = memory_region_find(pci_dev->io_regions[pci_bar].memory, + bar_offset, len); + +if (!section.mr) { +warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset); +return size; +} + +section_mr = section.mr; +section_offset = section.offset_within_region; +section_size = int128_get64(section.size); + +if (is_write && section_mr->readonly) { +warn_report("vfu: attempting to write to readonly region in " +"bar %d - [0x%"PRIx64" - 0x%"PRIx64"]", +pci_bar, bar_offset, +(bar_offset + section_size)); +
[PATCH v9 03/17] remote/machine: add HotplugHandler for remote machine
Allow hotplugging of PCI(e) devices to remote machine

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/machine.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 92d71d47bb..a97e53e250 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -20,6 +20,7 @@
 #include "qapi/error.h"
 #include "hw/pci/pci_host.h"
 #include "hw/remote/iohub.h"
+#include "hw/qdev-core.h"
 
 static void remote_machine_init(MachineState *machine)
 {
@@ -53,14 +54,19 @@ static void remote_machine_init(MachineState *machine)
 
     pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
                  &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+
+    qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }
 
 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
     mc->init = remote_machine_init;
     mc->desc = "Experimental remote machine";
+
+    hc->unplug = qdev_simple_device_unplug_cb;
 }
 
 static const TypeInfo remote_machine = {
@@ -68,6 +74,10 @@ static const TypeInfo remote_machine = {
     .parent = TYPE_MACHINE,
     .instance_size = sizeof(RemoteMachineState),
     .class_init = remote_machine_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
 };
 
 static void remote_machine_register_types(void)
-- 
2.20.1
[PATCH v9 01/17] tests/avocado: Specify target VM argument to helper routines
Specify target VM for exec_command and exec_command_and_wait_for_pattern routines Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Beraldo Leal Reviewed-by: Stefan Hajnoczi --- tests/avocado/avocado_qemu/__init__.py | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/tests/avocado/avocado_qemu/__init__.py b/tests/avocado/avocado_qemu/__init__.py index 39f15c1d51..340a345799 100644 --- a/tests/avocado/avocado_qemu/__init__.py +++ b/tests/avocado/avocado_qemu/__init__.py @@ -198,7 +198,7 @@ def wait_for_console_pattern(test, success_message, failure_message=None, """ _console_interaction(test, success_message, failure_message, None, vm=vm) -def exec_command(test, command): +def exec_command(test, command, vm=None): """ Send a command to a console (appending CRLF characters), while logging the content. @@ -207,11 +207,14 @@ def exec_command(test, command): :type test: :class:`avocado_qemu.QemuSystemTest` :param command: the command to send :type command: str +:param vm: target vm +:type vm: :class:`qemu.machine.QEMUMachine` """ -_console_interaction(test, None, None, command + '\r') +_console_interaction(test, None, None, command + '\r', vm=vm) def exec_command_and_wait_for_pattern(test, command, - success_message, failure_message=None): + success_message, failure_message=None, + vm=None): """ Send a command to a console (appending CRLF characters), then wait for success_message to appear on the console, while logging the. 
@@ -223,8 +226,11 @@ def exec_command_and_wait_for_pattern(test, command, :param command: the command to send :param success_message: if this message appears, test succeeds :param failure_message: if this message appears, test fails +:param vm: target vm +:type vm: :class:`qemu.machine.QEMUMachine` """ -_console_interaction(test, success_message, failure_message, command + '\r') +_console_interaction(test, success_message, failure_message, command + '\r', + vm=vm) class QemuBaseTest(avocado.Test): def _get_unique_tag_val(self, tag_name): -- 2.20.1
[PATCH v9 09/17] vfio-user: find and init PCI device
Find the PCI device with specified id. Initialize the device context with the QEMU PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 67 +++ 1 file changed, 67 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 68f8a9dfa9..3ca6aa2b45 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -43,6 +43,8 @@ #include "qemu/notify.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" +#include "hw/qdev-core.h" +#include "hw/pci/pci.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -80,6 +82,10 @@ struct VfuObject { Notifier machine_done; vfu_ctx_t *vfu_ctx; + +PCIDevice *pci_dev; + +Error *unplug_blocker; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -181,6 +187,9 @@ static void vfu_object_machine_done(Notifier *notifier, void *data) static void vfu_object_init_ctx(VfuObject *o, Error **errp) { ERRP_GUARD(); +DeviceState *dev = NULL; +vfu_pci_type_t pci_type = VFU_PCI_TYPE_CONVENTIONAL; +int ret; if (o->vfu_ctx || !o->socket || !o->device || !phase_check(PHASE_MACHINE_READY)) { @@ -199,6 +208,53 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); return; } + +dev = qdev_find_recursive(sysbus_get_default(), o->device); +if (dev == NULL) { +error_setg(errp, "vfu: Device %s not found", o->device); +goto fail; +} + +if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { +error_setg(errp, "vfu: %s not a PCI device", o->device); +goto fail; +} + +o->pci_dev = PCI_DEVICE(dev); + +object_ref(OBJECT(o->pci_dev)); + +if (pci_is_express(o->pci_dev)) { +pci_type = VFU_PCI_TYPE_EXPRESS; +} + +ret = vfu_pci_init(o->vfu_ctx, pci_type, PCI_HEADER_TYPE_NORMAL, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to attach PCI device %s to context - %s", + 
o->device, strerror(errno)); +goto fail; +} + +error_setg(&o->unplug_blocker, + "vfu: %s for %s must be deleted before unplugging", + TYPE_VFU_OBJECT, o->device); +qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); + +return; + +fail: +vfu_destroy_ctx(o->vfu_ctx); +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} +o->vfu_ctx = NULL; } static void vfu_object_init(Object *obj) @@ -241,6 +297,17 @@ static void vfu_object_finalize(Object *obj) o->device = NULL; +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} + +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} + if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } -- 2.20.1
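The error handling in vfu_object_init_ctx() follows the usual single "fail"
label pattern: resources are acquired in stages and the cleanup path undoes
only what was actually acquired. A small standalone sketch of that shape is
shown below. DemoServer, DemoPciDev, demo_init_ctx() and the fail_step
parameter are hypothetical and exist only so the failure stages can be
exercised; they are not part of QEMU or libvfio-user.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCIDevice with a reference count. */
typedef struct {
    int refcount;
    bool blocked;       /* true while an unplug blocker is installed */
} DemoPciDev;

/* Hypothetical stand-in for VfuObject. */
typedef struct {
    bool ctx_live;      /* stands in for a non-NULL vfu_ctx */
    DemoPciDev *pci_dev;
    bool has_blocker;
} DemoServer;

/*
 * fail_step selects the stage that fails:
 * 0 = none, 1 = device lookup, 2 = PCI initialization.
 */
bool demo_init_ctx(DemoServer *o, DemoPciDev *found, int fail_step)
{
    o->ctx_live = true;             /* context created */

    if (fail_step == 1 || !found) { /* device not found */
        goto fail;
    }

    o->pci_dev = found;
    o->pci_dev->refcount++;         /* take a reference on the device */

    if (fail_step == 2) {           /* PCI init failed */
        goto fail;
    }

    o->has_blocker = true;          /* install the unplug blocker last */
    o->pci_dev->blocked = true;
    return true;

fail:
    o->ctx_live = false;            /* destroy the context */
    if (o->has_blocker && o->pci_dev) {
        o->pci_dev->blocked = false;
        o->has_blocker = false;
    }
    if (o->pci_dev) {
        o->pci_dev->refcount--;     /* drop the reference taken above */
        o->pci_dev = NULL;
    }
    return false;
}
```

Because each cleanup step is guarded by the state it undoes, the same block
can serve every failure stage, and the finalize path can reuse the same
checks.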
[PATCH v9 07/17] vfio-user: define vfio-user-server object
Define vfio-user object which is remote process server for QEMU. Setup object initialization functions and properties necessary to instantiate the object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- qapi/qom.json | 20 +++- include/hw/remote/machine.h | 2 + hw/remote/machine.c | 27 + hw/remote/vfio-user-obj.c | 210 MAINTAINERS | 1 + hw/remote/meson.build | 1 + hw/remote/trace-events | 3 + 7 files changed, 262 insertions(+), 2 deletions(-) create mode 100644 hw/remote/vfio-user-obj.c diff --git a/qapi/qom.json b/qapi/qom.json index eeb5395ff3..582def0522 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -703,6 +703,20 @@ { 'struct': 'RemoteObjectProperties', 'data': { 'fd': 'str', 'devid': 'str' } } +## +# @VfioUserServerProperties: +# +# Properties for x-vfio-user-server objects. +# +# @socket: socket to be used by the libvfio-user library +# +# @device: the id of the device to be emulated at the server +# +# Since: 7.1 +## +{ 'struct': 'VfioUserServerProperties', + 'data': { 'socket': 'SocketAddress', 'device': 'str' } } + ## # @RngProperties: # @@ -842,7 +856,8 @@ 'tls-creds-psk', 'tls-creds-x509', 'tls-cipher-suites', -{ 'name': 'x-remote-object', 'features': [ 'unstable' ] } +{ 'name': 'x-remote-object', 'features': [ 'unstable' ] }, +{ 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } ] } ## @@ -905,7 +920,8 @@ 'tls-creds-psk': 'TlsCredsPskProperties', 'tls-creds-x509': 'TlsCredsX509Properties', 'tls-cipher-suites': 'TlsCredsProperties', - 'x-remote-object':'RemoteObjectProperties' + 'x-remote-object':'RemoteObjectProperties', + 'x-vfio-user-server': 'VfioUserServerProperties' } } ## diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 8d0fa98d33..ac32fda387 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -24,6 +24,8 @@ struct RemoteMachineState { RemoteIOHubState iohub; bool vfio_user; + +bool auto_shutdown; }; /* Used to pass to co-routine 
device and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index 9f3cdc55c3..4d008ed721 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -77,6 +77,28 @@ static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) s->vfio_user = value; } +static bool remote_machine_get_auto_shutdown(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->auto_shutdown; +} + +static void remote_machine_set_auto_shutdown(Object *obj, bool value, + Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = value; +} + +static void remote_machine_instance_init(Object *obj) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = true; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -90,12 +112,17 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) object_class_property_add_bool(oc, "vfio-user", remote_machine_get_vfio_user, remote_machine_set_vfio_user); + +object_class_property_add_bool(oc, "auto-shutdown", + remote_machine_get_auto_shutdown, + remote_machine_set_auto_shutdown); } static const TypeInfo remote_machine = { .name = TYPE_REMOTE_MACHINE, .parent = TYPE_MACHINE, .instance_size = sizeof(RemoteMachineState), +.instance_init = remote_machine_instance_init, .class_init = remote_machine_class_init, .interfaces = (InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c new file mode 100644 index 00..bc49adcc27 --- /dev/null +++ b/hw/remote/vfio-user-obj.c @@ -0,0 +1,210 @@ +/** + * QEMU vfio-user-server server object + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL-v2, version 2 or later. + * + * See the COPYING file in the top-level directory. + * + */ + +/** + * Usage: add options: + * -machine x-remote,vfio-user=on,auto-shutdown=on +
[PATCH v9 00/17] vfio-user server in QEMU
Hi,

This is v9 of the server side changes to enable vfio-user in QEMU.

Thank you very much for reviewing the last revision of this series!

We've made the following changes in this revision:

[PATCH v9 02/17] qdev: unplug blocker for devices
  - updated commit message with more details

[PATCH v9 06/17] vfio-user: build library
  - updated commit message with license information

[PATCH v9 07/17] vfio-user: define vfio-user-server object
  - fixed typo in the libvfio-user library name in comments for
    VfioUserServerProperties

[PATCH v9 10/17] vfio-user: run vfio-user context
  - added the QOM paths of the PCI device and server to the
    VFU_CLIENT_HANGUP event

[PATCH v9 12/17] vfio-user: IOMMU support for remote device
  - added comments to describe the design of the remote machine's IOMMU

[PATCH v9 14/17] vfio-user: handle PCI BAR accesses
  - unref memory region during early exit in vfu_object_bar_rw()

Jagannathan Raman (17):
  tests/avocado: Specify target VM argument to helper routines
  qdev: unplug blocker for devices
  remote/machine: add HotplugHandler for remote machine
  remote/machine: add vfio-user property
  configure: require cmake 3.19 or newer
  vfio-user: build library
  vfio-user: define vfio-user-server object
  vfio-user: instantiate vfio-user context
  vfio-user: find and init PCI device
  vfio-user: run vfio-user context
  vfio-user: handle PCI config space accesses
  vfio-user: IOMMU support for remote device
  vfio-user: handle DMA mappings
  vfio-user: handle PCI BAR accesses
  vfio-user: handle device interrupts
  vfio-user: handle reset of remote device
  vfio-user: avocado tests for vfio-user

 configure                                  |  36 +-
 meson.build                                |  44 +-
 qapi/misc.json                             |  30 +
 qapi/qom.json                              |  20 +-
 include/exec/memory.h                      |   3 +
 include/hw/pci/pci.h                       |  13 +
 include/hw/qdev-core.h                     |  29 +
 include/hw/remote/iommu.h                  |  40 +
 include/hw/remote/machine.h                |   4 +
 include/hw/remote/vfio-user-obj.h          |   6 +
 hw/core/qdev.c                             |  24 +
 hw/pci/msi.c                               |  16 +-
 hw/pci/msix.c                              |  10 +-
 hw/pci/pci.c                               |  13 +
 hw/remote/iommu.c                          | 131 +++
 hw/remote/machine.c                        |  88 +-
 hw/remote/vfio-user-obj.c                  | 898 +
 softmmu/physmem.c                          |   4 +-
 softmmu/qdev-monitor.c                     |   4 +
 stubs/vfio-user-obj.c                      |   6 +
 tests/qtest/fuzz/generic_fuzz.c            |   9 +-
 .gitlab-ci.d/buildtest.yml                 |   2 +
 .gitmodules                                |   3 +
 Kconfig.host                               |   4 +
 MAINTAINERS                                |   6 +
 hw/remote/Kconfig                          |   4 +
 hw/remote/meson.build                      |   4 +
 hw/remote/trace-events                     |  11 +
 meson_options.txt                          |   3 +
 stubs/meson.build                          |   1 +
 subprojects/libvfio-user                   |   1 +
 tests/avocado/avocado_qemu/__init__.py     |  14 +-
 tests/avocado/vfio-user.py                 | 164
 tests/docker/dockerfiles/centos8.docker    |   2 +
 tests/docker/dockerfiles/ubuntu2004.docker |   2 +
 35 files changed, 1625 insertions(+), 24 deletions(-)
 create mode 100644 include/hw/remote/iommu.h
 create mode 100644 include/hw/remote/vfio-user-obj.h
 create mode 100644 hw/remote/iommu.c
 create mode 100644 hw/remote/vfio-user-obj.c
 create mode 100644 stubs/vfio-user-obj.c
 create mode 16 subprojects/libvfio-user
 create mode 100644 tests/avocado/vfio-user.py

-- 
2.20.1
[PATCH v8 04/17] remote/machine: add vfio-user property
Add a vfio-user property to the x-remote machine. It is a boolean that
indicates whether the machine supports the vfio-user protocol. The machine
configures the bus differently for the vfio-user and multiprocess
protocols, so this property tells it how to configure the bus.

This property should be short-lived. Once vfio-user fully replaces
multiprocess, this property can be removed.

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 include/hw/remote/machine.h |  2 ++
 hw/remote/machine.c         | 23 +++
 2 files changed, 25 insertions(+)

diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h
index 2a2a33c4b2..8d0fa98d33 100644
--- a/include/hw/remote/machine.h
+++ b/include/hw/remote/machine.h
@@ -22,6 +22,8 @@ struct RemoteMachineState {
 
     RemotePCIHost *host;
     RemoteIOHubState iohub;
+
+    bool vfio_user;
 };
 
 /* Used to pass to co-routine device and ioc. */
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 0c5bd4f923..a9a75e170f 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -59,6 +59,25 @@ static void remote_machine_init(MachineState *machine)
     qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }
 
+static bool remote_machine_get_vfio_user(Object *obj, Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    return s->vfio_user;
+}
+
+static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp)
+{
+    RemoteMachineState *s = REMOTE_MACHINE(obj);
+
+    if (phase_check(PHASE_MACHINE_CREATED)) {
+        error_setg(errp, "Error enabling vfio-user - machine already created");
+        return;
+    }
+
+    s->vfio_user = value;
+}
+
 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -68,6 +87,10 @@ static void remote_machine_class_init(ObjectClass *oc, void *data)
     mc->desc = "Experimental remote machine";
 
     hc->unplug = qdev_simple_device_unplug_cb;
+
+    object_class_property_add_bool(oc, "vfio-user",
+                                   remote_machine_get_vfio_user,
+                                   remote_machine_set_vfio_user);
 }
 
 static const TypeInfo remote_machine = {
-- 
2.20.1
[PATCH v8 12/17] vfio-user: IOMMU support for remote device
Assign separate address space for each device in the remote processes. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/remote/iommu.h | 40 + hw/remote/iommu.c | 114 ++ hw/remote/machine.c | 13 - MAINTAINERS | 2 + hw/remote/meson.build | 1 + 5 files changed, 169 insertions(+), 1 deletion(-) create mode 100644 include/hw/remote/iommu.h create mode 100644 hw/remote/iommu.c diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h new file mode 100644 index 00..33b68a8f4b --- /dev/null +++ b/include/hw/remote/iommu.h @@ -0,0 +1,40 @@ +/** + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_IOMMU_H +#define REMOTE_IOMMU_H + +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" + +#ifndef INT2VOIDP +#define INT2VOIDP(i) (void *)(uintptr_t)(i) +#endif + +typedef struct RemoteIommuElem { +MemoryRegion *mr; + +AddressSpace as; +} RemoteIommuElem; + +#define TYPE_REMOTE_IOMMU "x-remote-iommu" +OBJECT_DECLARE_SIMPLE_TYPE(RemoteIommu, REMOTE_IOMMU) + +struct RemoteIommu { +Object parent; + +GHashTable *elem_by_devfn; + +QemuMutex lock; +}; + +void remote_iommu_setup(PCIBus *pci_bus); + +void remote_iommu_unplug_dev(PCIDevice *pci_dev); + +#endif diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c new file mode 100644 index 00..16c6b0834e --- /dev/null +++ b/hw/remote/iommu.c @@ -0,0 +1,114 @@ +/** + * IOMMU for remote device + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/remote/iommu.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" +#include "exec/memory.h" +#include "exec/address-spaces.h" +#include "trace.h" + +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus, + void *opaque, int devfn) +{ +RemoteIommu *iommu = opaque; +RemoteIommuElem *elem = NULL; + +qemu_mutex_lock(&iommu->lock); + +elem = g_hash_table_lookup(iommu->elem_by_devfn, INT2VOIDP(devfn)); + +if (!elem) { +elem = g_malloc0(sizeof(RemoteIommuElem)); +g_hash_table_insert(iommu->elem_by_devfn, INT2VOIDP(devfn), elem); +} + +if (!elem->mr) { +elem->mr = MEMORY_REGION(object_new(TYPE_MEMORY_REGION)); +memory_region_set_size(elem->mr, UINT64_MAX); +address_space_init(&elem->as, elem->mr, NULL); +} + +qemu_mutex_unlock(&iommu->lock); + +return &elem->as; +} + +void remote_iommu_unplug_dev(PCIDevice *pci_dev) +{ +AddressSpace *as = pci_device_iommu_address_space(pci_dev); +RemoteIommuElem *elem = NULL; + +if (as == &address_space_memory) { +return; +} + +elem = container_of(as, RemoteIommuElem, as); + +address_space_destroy(&elem->as); + +object_unref(elem->mr); + +elem->mr = NULL; +} + +static void remote_iommu_init(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +iommu->elem_by_devfn = g_hash_table_new_full(NULL, NULL, NULL, g_free); + +qemu_mutex_init(&iommu->lock); +} + +static void remote_iommu_finalize(Object *obj) +{ +RemoteIommu *iommu = REMOTE_IOMMU(obj); + +qemu_mutex_destroy(&iommu->lock); + +if (iommu->elem_by_devfn) { +g_hash_table_destroy(iommu->elem_by_devfn); +iommu->elem_by_devfn = NULL; +} +} + +void remote_iommu_setup(PCIBus *pci_bus) +{ +RemoteIommu *iommu = NULL; + +g_assert(pci_bus); + +iommu = REMOTE_IOMMU(object_new(TYPE_REMOTE_IOMMU)); + +pci_setup_iommu(pci_bus, remote_iommu_find_add_as, iommu); + +object_property_add_child(OBJECT(pci_bus), "remote-iommu", OBJECT(iommu)); + +object_unref(OBJECT(iommu)); +} + +static const TypeInfo 
remote_iommu_info = { +.name = TYPE_REMOTE_IOMMU, +.parent = TYPE_OBJECT, +.instance_size = sizeof(RemoteIommu), +.instance_init = remote_iommu_init, +.instance_finalize = remote_iommu_finalize, +}; + +static void remote_iommu_register_types(void) +{ +type_register_static(&remote_iommu_info); +} + +type_init(remote_iommu_register_types) diff --git a/hw/remote/machine.c b/hw/remote/machine.c index ed91659794..cca5d25f50 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -21,6 +21,7 @@ #include "qapi/error.h" #include "hw/pci/pci_host.h" #include "hw/remote/iohub.h" +#include "hw/remo
[PATCH v8 09/17] vfio-user: find and init PCI device
Find the PCI device with specified id. Initialize the device context with the QEMU PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 67 +++ 1 file changed, 67 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index d46acd5b63..15f6fe3a1a 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -44,6 +44,8 @@ #include "qemu/notify.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" +#include "hw/qdev-core.h" +#include "hw/pci/pci.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -81,6 +83,10 @@ struct VfuObject { Notifier machine_done; vfu_ctx_t *vfu_ctx; + +PCIDevice *pci_dev; + +Error *unplug_blocker; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -182,6 +188,9 @@ static void vfu_object_machine_done(Notifier *notifier, void *data) static void vfu_object_init_ctx(VfuObject *o, Error **errp) { ERRP_GUARD(); +DeviceState *dev = NULL; +vfu_pci_type_t pci_type = VFU_PCI_TYPE_CONVENTIONAL; +int ret; if (o->vfu_ctx || !o->socket || !o->device || !phase_check(PHASE_MACHINE_READY)) { @@ -200,6 +209,53 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); return; } + +dev = qdev_find_recursive(sysbus_get_default(), o->device); +if (dev == NULL) { +error_setg(errp, "vfu: Device %s not found", o->device); +goto fail; +} + +if (!object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { +error_setg(errp, "vfu: %s not a PCI device", o->device); +goto fail; +} + +o->pci_dev = PCI_DEVICE(dev); + +object_ref(OBJECT(o->pci_dev)); + +if (pci_is_express(o->pci_dev)) { +pci_type = VFU_PCI_TYPE_EXPRESS; +} + +ret = vfu_pci_init(o->vfu_ctx, pci_type, PCI_HEADER_TYPE_NORMAL, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to attach PCI device %s to context - %s", + 
o->device, strerror(errno)); +goto fail; +} + +error_setg(&o->unplug_blocker, + "vfu: %s for %s must be deleted before unplugging", + TYPE_VFU_OBJECT, o->device); +qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); + +return; + +fail: +vfu_destroy_ctx(o->vfu_ctx); +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} +o->vfu_ctx = NULL; } static void vfu_object_init(Object *obj) @@ -242,6 +298,17 @@ static void vfu_object_finalize(Object *obj) o->device = NULL; +if (o->unplug_blocker && o->pci_dev) { +qdev_del_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +error_free(o->unplug_blocker); +o->unplug_blocker = NULL; +} + +if (o->pci_dev) { +object_unref(OBJECT(o->pci_dev)); +o->pci_dev = NULL; +} + if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } -- 2.20.1
[PATCH v8 01/17] tests/avocado: Specify target VM argument to helper routines
Specify target VM for exec_command and exec_command_and_wait_for_pattern routines Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Beraldo Leal Reviewed-by: Stefan Hajnoczi --- tests/avocado/avocado_qemu/__init__.py | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/tests/avocado/avocado_qemu/__init__.py b/tests/avocado/avocado_qemu/__init__.py index ac85e36a4d..18a34a798c 100644 --- a/tests/avocado/avocado_qemu/__init__.py +++ b/tests/avocado/avocado_qemu/__init__.py @@ -198,7 +198,7 @@ def wait_for_console_pattern(test, success_message, failure_message=None, """ _console_interaction(test, success_message, failure_message, None, vm=vm) -def exec_command(test, command): +def exec_command(test, command, vm=None): """ Send a command to a console (appending CRLF characters), while logging the content. @@ -207,11 +207,14 @@ def exec_command(test, command): :type test: :class:`avocado_qemu.QemuSystemTest` :param command: the command to send :type command: str +:param vm: target vm +:type vm: :class:`qemu.machine.QEMUMachine` """ -_console_interaction(test, None, None, command + '\r') +_console_interaction(test, None, None, command + '\r', vm=vm) def exec_command_and_wait_for_pattern(test, command, - success_message, failure_message=None): + success_message, failure_message=None, + vm=None): """ Send a command to a console (appending CRLF characters), then wait for success_message to appear on the console, while logging the. 
@@ -223,8 +226,11 @@ def exec_command_and_wait_for_pattern(test, command, :param command: the command to send :param success_message: if this message appears, test succeeds :param failure_message: if this message appears, test fails +:param vm: target vm +:type vm: :class:`qemu.machine.QEMUMachine` """ -_console_interaction(test, success_message, failure_message, command + '\r') +_console_interaction(test, success_message, failure_message, command + '\r', + vm=vm) class QemuBaseTest(avocado.Test): def _get_unique_tag_val(self, tag_name): -- 2.20.1
[PATCH v8 03/17] remote/machine: add HotplugHandler for remote machine
Allow hotplugging of PCI(e) devices to remote machine

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/machine.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 952105eab5..0c5bd4f923 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -21,6 +21,7 @@
 #include "qapi/error.h"
 #include "hw/pci/pci_host.h"
 #include "hw/remote/iohub.h"
+#include "hw/qdev-core.h"

 static void remote_machine_init(MachineState *machine)
 {
@@ -54,14 +55,19 @@ static void remote_machine_init(MachineState *machine)

     pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
                  &s->iohub, REMOTE_IOHUB_NB_PIRQS);
+
+    qbus_set_hotplug_handler(BUS(pci_host->bus), OBJECT(s));
 }

 static void remote_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);

     mc->init = remote_machine_init;
     mc->desc = "Experimental remote machine";
+
+    hc->unplug = qdev_simple_device_unplug_cb;
 }

 static const TypeInfo remote_machine = {
@@ -69,6 +75,10 @@ static const TypeInfo remote_machine = {
     .parent = TYPE_MACHINE,
     .instance_size = sizeof(RemoteMachineState),
     .class_init = remote_machine_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_HOTPLUG_HANDLER },
+        { }
+    }
 };

 static void remote_machine_register_types(void)
--
2.20.1
[PATCH v8 00/17] vfio-user server in QEMU
Hi,

This is v8 of the server side changes to enable vfio-user in QEMU.

Thank you very much for reviewing the last revision of this series!

We've made the following changes in this revision:

[PATCH v8 06/17] vfio-user: build library
  - updated libvfio-user to the latest

[PATCH v8 07/17] vfio-user: define vfio-user-server object
  - changed auto_shutdown to a per-instance property rather than a
    per-class property

[PATCH v8 12/17] vfio-user: IOMMU support for remote device
  - lock mutex while looking up hash table
  - removed global hash table
  - added RemoteIommu object to house this variable
  - added unplug handler to remove per-device IOMMU entry when a
    PCIDevice is unplugged

[PATCH v8 14/17] vfio-user: handle PCI BAR accesses
  - refactored vfu_object_bar_rw()
  - vfu_object_bar_rw() handles short sections returned by
    memory_region_find()

[PATCH v8 15/17] vfio-user: handle device interrupts
  - removed callbacks for msi_notify() and msix_notify()
  - added callbacks for msi_send_message() and msi(x)_get_message()
    operations

Thank you!
Jagannathan Raman (17):
  tests/avocado: Specify target VM argument to helper routines
  qdev: unplug blocker for devices
  remote/machine: add HotplugHandler for remote machine
  remote/machine: add vfio-user property
  configure: require cmake 3.19 or newer
  vfio-user: build library
  vfio-user: define vfio-user-server object
  vfio-user: instantiate vfio-user context
  vfio-user: find and init PCI device
  vfio-user: run vfio-user context
  vfio-user: handle PCI config space accesses
  vfio-user: IOMMU support for remote device
  vfio-user: handle DMA mappings
  vfio-user: handle PCI BAR accesses
  vfio-user: handle device interrupts
  vfio-user: handle reset of remote device
  vfio-user: avocado tests for vfio-user

 configure                                  |  36 +-
 meson.build                                |  44 +-
 qapi/misc.json                             |  23 +
 qapi/qom.json                              |  20 +-
 include/exec/memory.h                      |   3 +
 include/hw/pci/pci.h                       |  13 +
 include/hw/qdev-core.h                     |  29 +
 include/hw/remote/iommu.h                  |  40 +
 include/hw/remote/machine.h                |   4 +
 include/hw/remote/vfio-user-obj.h          |   6 +
 hw/core/qdev.c                             |  24 +
 hw/pci/msi.c                               |  16 +-
 hw/pci/msix.c                              |  10 +-
 hw/pci/pci.c                               |  13 +
 hw/remote/iommu.c                          | 114 +++
 hw/remote/machine.c                        |  88 +-
 hw/remote/vfio-user-obj.c                  | 891 +
 softmmu/physmem.c                          |   4 +-
 softmmu/qdev-monitor.c                     |   4 +
 stubs/vfio-user-obj.c                      |   6 +
 tests/qtest/fuzz/generic_fuzz.c            |   9 +-
 .gitlab-ci.d/buildtest.yml                 |   2 +
 .gitmodules                                |   3 +
 Kconfig.host                               |   4 +
 MAINTAINERS                                |   6 +
 hw/remote/Kconfig                          |   4 +
 hw/remote/meson.build                      |   4 +
 hw/remote/trace-events                     |  11 +
 meson_options.txt                          |   3 +
 stubs/meson.build                          |   1 +
 subprojects/libvfio-user                   |   1 +
 tests/avocado/avocado_qemu/__init__.py     |  14 +-
 tests/avocado/vfio-user.py                 | 164
 tests/docker/dockerfiles/centos8.docker    |   2 +
 tests/docker/dockerfiles/ubuntu2004.docker |   2 +
 35 files changed, 1594 insertions(+), 24 deletions(-)
 create mode 100644 include/hw/remote/iommu.h
 create mode 100644 include/hw/remote/vfio-user-obj.h
 create mode 100644 hw/remote/iommu.c
 create mode 100644 hw/remote/vfio-user-obj.c
 create mode 100644 stubs/vfio-user-obj.c
 create mode 16 subprojects/libvfio-user
 create mode 100644 tests/avocado/vfio-user.py

--
2.20.1
[PATCH v8 07/17] vfio-user: define vfio-user-server object
Define vfio-user object which is remote process server for QEMU. Setup object initialization functions and properties necessary to instantiate the object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- qapi/qom.json | 20 +++- include/hw/remote/machine.h | 2 + hw/remote/machine.c | 27 + hw/remote/vfio-user-obj.c | 211 MAINTAINERS | 1 + hw/remote/meson.build | 1 + hw/remote/trace-events | 3 + 7 files changed, 263 insertions(+), 2 deletions(-) create mode 100644 hw/remote/vfio-user-obj.c diff --git a/qapi/qom.json b/qapi/qom.json index eeb5395ff3..e7b1758a11 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -703,6 +703,20 @@ { 'struct': 'RemoteObjectProperties', 'data': { 'fd': 'str', 'devid': 'str' } } +## +# @VfioUserServerProperties: +# +# Properties for x-vfio-user-server objects. +# +# @socket: socket to be used by the libvfiouser library +# +# @device: the id of the device to be emulated at the server +# +# Since: 7.1 +## +{ 'struct': 'VfioUserServerProperties', + 'data': { 'socket': 'SocketAddress', 'device': 'str' } } + ## # @RngProperties: # @@ -842,7 +856,8 @@ 'tls-creds-psk', 'tls-creds-x509', 'tls-cipher-suites', -{ 'name': 'x-remote-object', 'features': [ 'unstable' ] } +{ 'name': 'x-remote-object', 'features': [ 'unstable' ] }, +{ 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } ] } ## @@ -905,7 +920,8 @@ 'tls-creds-psk': 'TlsCredsPskProperties', 'tls-creds-x509': 'TlsCredsX509Properties', 'tls-cipher-suites': 'TlsCredsProperties', - 'x-remote-object':'RemoteObjectProperties' + 'x-remote-object':'RemoteObjectProperties', + 'x-vfio-user-server': 'VfioUserServerProperties' } } ## diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index 8d0fa98d33..ac32fda387 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -24,6 +24,8 @@ struct RemoteMachineState { RemoteIOHubState iohub; bool vfio_user; + +bool auto_shutdown; }; /* Used to pass to co-routine device 
and ioc. */ diff --git a/hw/remote/machine.c b/hw/remote/machine.c index a9a75e170f..ed91659794 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -78,6 +78,28 @@ static void remote_machine_set_vfio_user(Object *obj, bool value, Error **errp) s->vfio_user = value; } +static bool remote_machine_get_auto_shutdown(Object *obj, Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +return s->auto_shutdown; +} + +static void remote_machine_set_auto_shutdown(Object *obj, bool value, + Error **errp) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = value; +} + +static void remote_machine_instance_init(Object *obj) +{ +RemoteMachineState *s = REMOTE_MACHINE(obj); + +s->auto_shutdown = true; +} + static void remote_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -91,12 +113,17 @@ static void remote_machine_class_init(ObjectClass *oc, void *data) object_class_property_add_bool(oc, "vfio-user", remote_machine_get_vfio_user, remote_machine_set_vfio_user); + +object_class_property_add_bool(oc, "auto-shutdown", + remote_machine_get_auto_shutdown, + remote_machine_set_auto_shutdown); } static const TypeInfo remote_machine = { .name = TYPE_REMOTE_MACHINE, .parent = TYPE_MACHINE, .instance_size = sizeof(RemoteMachineState), +.instance_init = remote_machine_instance_init, .class_init = remote_machine_class_init, .interfaces = (InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c new file mode 100644 index 00..c4d59b4d9d --- /dev/null +++ b/hw/remote/vfio-user-obj.c @@ -0,0 +1,211 @@ +/** + * QEMU vfio-user-server server object + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL-v2, version 2 or later. + * + * See the COPYING file in the top-level directory. + * + */ + +/** + * Usage: add options: + * -machine x-remote,vfio-user=on,auto-shutdown=on +
[PATCH v8 02/17] qdev: unplug blocker for devices
Add blocker to prevent hot-unplug of devices Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/qdev-core.h | 29 + hw/core/qdev.c | 24 softmmu/qdev-monitor.c | 4 3 files changed, 57 insertions(+) diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index 92c3d65208..1b9fa25e5c 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -193,6 +193,7 @@ struct DeviceState { int instance_id_alias; int alias_required_for_version; ResettableState reset; +GSList *unplug_blockers; }; struct DeviceListener { @@ -419,6 +420,34 @@ void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev, void qdev_machine_creation_done(void); bool qdev_machine_modified(void); +/* + * qdev_add_unplug_blocker: Adds an unplug blocker to a device + * + * @dev: Device to be blocked from unplug + * @reason: Reason for blocking + */ +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason); + +/* + * qdev_del_unplug_blocker: Removes an unplug blocker from a device + * + * @dev: Device to be unblocked + * @reason: Pointer to the Error used with qdev_add_unplug_blocker. + * Used as a handle to lookup the blocker for deletion. 
+ */ +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason); + +/* + * qdev_unplug_blocked: Confirms if a device is blocked from unplug + * + * @dev: Device to be tested + * @reason: Returns one of the reasons why the device is blocked, + * if any + * + * Returns: true if device is blocked from unplug, false otherwise + */ +bool qdev_unplug_blocked(DeviceState *dev, Error **errp); + /** * GpioPolarity: Polarity of a GPIO line * diff --git a/hw/core/qdev.c b/hw/core/qdev.c index 84f3019440..0806d8fcaa 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -468,6 +468,28 @@ char *qdev_get_dev_path(DeviceState *dev) return NULL; } +void qdev_add_unplug_blocker(DeviceState *dev, Error *reason) +{ +dev->unplug_blockers = g_slist_prepend(dev->unplug_blockers, reason); +} + +void qdev_del_unplug_blocker(DeviceState *dev, Error *reason) +{ +dev->unplug_blockers = g_slist_remove(dev->unplug_blockers, reason); +} + +bool qdev_unplug_blocked(DeviceState *dev, Error **errp) +{ +ERRP_GUARD(); + +if (dev->unplug_blockers) { +error_propagate(errp, error_copy(dev->unplug_blockers->data)); +return true; +} + +return false; +} + static bool device_get_realized(Object *obj, Error **errp) { DeviceState *dev = DEVICE(obj); @@ -704,6 +726,8 @@ static void device_finalize(Object *obj) DeviceState *dev = DEVICE(obj); +g_assert(!dev->unplug_blockers); + QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) { QLIST_REMOVE(ngl, node); qemu_free_irqs(ngl->in, ngl->num_in); diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c index 12fe60c467..9cfd59d17c 100644 --- a/softmmu/qdev-monitor.c +++ b/softmmu/qdev-monitor.c @@ -898,6 +898,10 @@ void qdev_unplug(DeviceState *dev, Error **errp) HotplugHandlerClass *hdc; Error *local_err = NULL; +if (qdev_unplug_blocked(dev, errp)) { +return; +} + if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) { error_setg(errp, QERR_BUS_NO_HOTPLUG, dev->parent_bus->name); return; -- 2.20.1
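The blocker bookkeeping in this patch can be tried out in isolation. The following is only a sketch under stated assumptions: it replaces GLib's GSList and QEMU's Error with a hand-rolled list of string reasons, and blocker_add(), blocker_del() and blocker_check() are hypothetical stand-ins for qdev_add_unplug_blocker(), qdev_del_unplug_blocker() and qdev_unplug_blocked(), not QEMU API. As in the patch, the reason pointer doubles as the handle used for removal, and the most recently added reason is the one reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; reason plays the role of the Error * handle. */
struct blocker {
    const char *reason;
    struct blocker *next;
};

/* Hypothetical stand-in for DeviceState: only the blocker list is modeled. */
struct device {
    struct blocker *unplug_blockers;
};

/* Prepend a reason, mirroring g_slist_prepend() in the patch. */
static void blocker_add(struct device *dev, const char *reason)
{
    struct blocker *b = malloc(sizeof(*b));

    assert(b != NULL);
    b->reason = reason;
    b->next = dev->unplug_blockers;
    dev->unplug_blockers = b;
}

/* Remove the first entry whose handle (pointer value) matches. */
static void blocker_del(struct device *dev, const char *reason)
{
    struct blocker **pp = &dev->unplug_blockers;

    while (*pp) {
        if ((*pp)->reason == reason) {
            struct blocker *dead = *pp;

            *pp = dead->next;
            free(dead);
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Return the newest reason, or NULL when unplug is allowed. */
static const char *blocker_check(const struct device *dev)
{
    return dev->unplug_blockers ? dev->unplug_blockers->reason : NULL;
}
```

A caller would test blocker_check() before starting an unplug, the way qdev_unplug() calls qdev_unplug_blocked() first in the patch, and the list must be empty again before the device is finalized.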
[PATCH v8 06/17] vfio-user: build library
add the libvfio-user library as a submodule. build it as a cmake subproject. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- configure | 20 +- meson.build| 44 +- .gitlab-ci.d/buildtest.yml | 2 + .gitmodules| 3 ++ Kconfig.host | 4 ++ MAINTAINERS| 1 + hw/remote/Kconfig | 4 ++ hw/remote/meson.build | 2 + meson_options.txt | 3 ++ subprojects/libvfio-user | 1 + tests/docker/dockerfiles/centos8.docker| 2 + tests/docker/dockerfiles/ubuntu2004.docker | 2 + 12 files changed, 86 insertions(+), 2 deletions(-) create mode 16 subprojects/libvfio-user diff --git a/configure b/configure index 7a1a98bddf..c4fd7a42d4 100755 --- a/configure +++ b/configure @@ -333,6 +333,7 @@ meson_args="" ninja="" gio="$default_feature" skip_meson=no +vfio_user_server="disabled" # The following Meson options are handled manually (still they # are included in the automatically generated help message) @@ -1044,6 +1045,11 @@ for opt do ;; --disable-blobs) meson_option_parse --disable-install-blobs "" ;; + --enable-vfio-user-server) vfio_user_server="enabled" + cmake_required="yes" + ;; + --disable-vfio-user-server) vfio_user_server="disabled" + ;; --enable-tcmalloc) meson_option_parse --enable-malloc=tcmalloc tcmalloc ;; --enable-jemalloc) meson_option_parse --enable-malloc=jemalloc jemalloc @@ -1267,6 +1273,7 @@ cat << EOF vhost-vdpa vhost-vdpa kernel backend support opengl opengl support gio libgio support + vfio-user-servervfio-user server support NOTE: The object files are built at the place where configure is launched EOF @@ -2622,6 +2629,17 @@ but not implemented on your system" fi fi +## +# check for vfio_user_server + +case "$vfio_user_server" in + auto | enabled ) +if test "$git_submodules_action" != "ignore"; then + git_submodules="${git_submodules} subprojects/libvfio-user" +fi +;; +esac + ## # End of CC checks # After here, no more $cc or $ld runs @@ -3185,7 +3203,7 @@ if test "$skip_meson" = no; then -Db_pie=$(if test "$pie" = yes; then 
echo true; else echo false; fi) \ -Db_coverage=$(if test "$gcov" = yes; then echo true; else echo false; fi) \ -Db_lto=$lto -Dcfi=$cfi -Dtcg=$tcg -Dxen=$xen \ --Dcapstone=$capstone -Dfdt=$fdt -Dslirp=$slirp \ +-Dcapstone=$capstone -Dfdt=$fdt -Dslirp=$slirp -Dvfio_user_server=$vfio_user_server \ $(test -n "${LIB_FUZZING_ENGINE+xxx}" && echo "-Dfuzzing_engine=$LIB_FUZZING_ENGINE") \ $(if test "$default_feature" = no; then echo "-Dauto_features=disabled"; fi) \ "$@" $cross_arg "$PWD" "$source_path" diff --git a/meson.build b/meson.build index 861de93c4f..84bc3a1c4f 100644 --- a/meson.build +++ b/meson.build @@ -298,6 +298,11 @@ have_tpm = get_option('tpm') \ .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \ .allowed() +if targetos != 'linux' and get_option('vfio_user_server').enabled() + error('vfio-user server is supported only on Linux') +endif +vfio_user_server_allowed = targetos == 'linux' and not get_option('vfio_user_server').disabled() + # Target-specific libraries and flags libm = cc.find_library('m', required: false) threads = dependency('threads') @@ -2111,7 +2116,8 @@ host_kconfig = \ (have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \ ('CONFIG_LINUX' in config_host ? ['CONFIG_LINUX=y'] : []) + \ ('CONFIG_PVRDMA' in config_host ? ['CONFIG_PVRDMA=y'] : []) + \ - (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + (multiprocess_allowed ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) + \ + (vfio_user_server_allowed ? ['CONFIG_VFIO_USER_SERVER_ALLOWED=y'] : []) ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ] @@ -2500,6 +2506,41 @@ if get_option('cfi') and slirp_opt == 'system' + ' Please configure with --enable-slirp=git') endif +vfiouser = not_found +if have_system and vfio_user_server_allowed + have_internal = fs.exists(meson.current_source_dir() / '
[PATCH v8 05/17] configure: require cmake 3.19 or newer
cmake needs to accept the compiler flags specified with
CMAKE_<LANG>_COMPILER variable. It does so starting with version 3.19

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 configure | 16
 1 file changed, 16 insertions(+)

diff --git a/configure b/configure
index 7c08c18358..7a1a98bddf 100755
--- a/configure
+++ b/configure
@@ -250,6 +250,7 @@ stack_protector=""
 safe_stack=""
 use_containers="yes"
 gdb_bin=$(command -v "gdb-multiarch" || command -v "gdb")
+cmake_required="no"

 if test -e "$source_path/.git"
 then
@@ -2777,6 +2778,21 @@ if !(GIT="$git" "$source_path/scripts/git-submodule.sh" "$git_submodules_action"
     exit 1
 fi

+# Per cmake spec, CMAKE_<LANG>_COMPILER variable may include "mandatory" compiler
+# flags. QEMU needs to specify these flags to correctly configure the build
+# environment. cmake 3.19 allows specifying these mandatory compiler flags,
+# and as such 3.19 or newer is required to build QEMU.
+if test "$cmake_required" = "yes" ; then
+    cmake_bin=$(command -v "cmake")
+    if [ -z "$cmake_bin" ]; then
+        error_exit "cmake not found"
+    fi
+    cmake_version=$($cmake_bin --version | head -n 1)
+    if ! version_ge ${cmake_version##* } 3.19; then
+        error_exit "QEMU needs cmake 3.19 or newer"
+    fi
+fi
+
 config_host_mak="config-host.mak"

 echo "# Automatically generated by configure - do not modify" > $config_host_mak
--
2.20.1
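The check above strips everything up to the last space of the first line of "cmake --version" output and hands the remainder to configure's version_ge helper. To make the comparison rule concrete, here is a rough C sketch of a component-wise "greater or equal" test on dotted version strings. It is an illustration only and assumes purely numeric components; it is not how configure's shell helper is implemented.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Return 1 if dotted version string a is >= b, else 0.
 * Components are compared numerically, so "3.5" < "3.19".
 * A missing component counts as 0, so "3.19" == "3.19.0".
 */
static int version_ge(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    for (;;) {
        char *ea, *eb;
        long na = strtol(a, &ea, 10);
        long nb = strtol(b, &eb, 10);

        if (na != nb) {
            return na > nb;
        }
        /* Stop once neither string has a further ".N" component. */
        if (*ea != '.' && *eb != '.') {
            return 1;
        }
        a = (*ea == '.') ? ea + 1 : ea;
        b = (*eb == '.') ? eb + 1 : eb;
    }
}
```

With this rule, version_ge("3.20.1", "3.19") holds while version_ge("3.5", "3.19") does not, which is the distinction a plain string comparison would get wrong.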
[PATCH v8 08/17] vfio-user: instantiate vfio-user context
create a context with the vfio-user library to run a PCI device Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/remote/vfio-user-obj.c | 82 +++ 1 file changed, 82 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index c4d59b4d9d..d46acd5b63 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -41,6 +41,9 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qemu/notify.h" +#include "sysemu/sysemu.h" +#include "libvfio-user.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -74,8 +77,14 @@ struct VfuObject { char *device; Error *err; + +Notifier machine_done; + +vfu_ctx_t *vfu_ctx; }; +static void vfu_object_init_ctx(VfuObject *o, Error **errp); + static bool vfu_object_auto_shutdown(void) { bool auto_shutdown = true; @@ -108,6 +117,11 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set socket property - server busy"); +return; +} + qapi_free_SocketAddress(o->socket); o->socket = NULL; @@ -123,17 +137,69 @@ static void vfu_object_set_socket(Object *obj, Visitor *v, const char *name, } trace_vfu_prop("socket", o->socket->u.q_unix.path); + +vfu_object_init_ctx(o, errp); } static void vfu_object_set_device(Object *obj, const char *str, Error **errp) { VfuObject *o = VFU_OBJECT(obj); +if (o->vfu_ctx) { +error_setg(errp, "vfu: Unable to set device property - server busy"); +return; +} + g_free(o->device); o->device = g_strdup(str); trace_vfu_prop("device", str); + +vfu_object_init_ctx(o, errp); +} + +/* + * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' + * properties. It also depends on devices instantiated in QEMU. 
These + * dependencies are not available during the instance_init phase of this + * object's life-cycle. As such, the server is initialized after the + * machine is setup. machine_init_done_notifier notifies TYPE_VFU_OBJECT + * when the machine is setup, and the dependencies are available. + */ +static void vfu_object_machine_done(Notifier *notifier, void *data) +{ +VfuObject *o = container_of(notifier, VfuObject, machine_done); +Error *err = NULL; + +vfu_object_init_ctx(o, &err); + +if (err) { +error_propagate(&error_abort, err); +} +} + +static void vfu_object_init_ctx(VfuObject *o, Error **errp) +{ +ERRP_GUARD(); + +if (o->vfu_ctx || !o->socket || !o->device || +!phase_check(PHASE_MACHINE_READY)) { +return; +} + +if (o->err) { +error_propagate(errp, o->err); +o->err = NULL; +return; +} + +o->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, o->socket->u.q_unix.path, 0, +o, VFU_DEV_TYPE_PCI); +if (o->vfu_ctx == NULL) { +error_setg(errp, "vfu: Failed to create context - %s", strerror(errno)); +return; +} } static void vfu_object_init(Object *obj) @@ -148,6 +214,12 @@ static void vfu_object_init(Object *obj) TYPE_VFU_OBJECT, TYPE_REMOTE_MACHINE); return; } + +if (!phase_check(PHASE_MACHINE_READY)) { +o->machine_done.notify = vfu_object_machine_done; +qemu_add_machine_init_done_notifier(&o->machine_done); +} + } static void vfu_object_finalize(Object *obj) @@ -161,6 +233,11 @@ static void vfu_object_finalize(Object *obj) o->socket = NULL; +if (o->vfu_ctx) { +vfu_destroy_ctx(o->vfu_ctx); +o->vfu_ctx = NULL; +} + g_free(o->device); o->device = NULL; @@ -168,6 +245,11 @@ static void vfu_object_finalize(Object *obj) if (!k->nr_devs && vfu_object_auto_shutdown()) { qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } + +if (o->machine_done.notify) { +qemu_remove_machine_init_done_notifier(&o->machine_done); +o->machine_done.notify = NULL; +} } static void vfu_object_class_init(ObjectClass *klass, void *data) -- 2.20.1
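The ordering problem this patch solves (properties may be set before or after the machine is ready, and the context must be created exactly once) can be sketched without QEMU. In the model below every name is hypothetical: struct server stands in for VfuObject, machine_ready for phase_check(PHASE_MACHINE_READY), and ctx_created for a non-NULL o->vfu_ctx. Each setter and the machine-done hook simply retries the same guarded init function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of VfuObject. */
struct server {
    const char *socket;   /* set through the 'socket' property */
    const char *device;   /* set through the 'device' property */
    bool machine_ready;   /* stands in for phase_check(PHASE_MACHINE_READY) */
    bool ctx_created;     /* stands in for o->vfu_ctx != NULL */
    int init_count;       /* number of times the context was created */
};

/* Create the context once, and only when every prerequisite is present. */
static void server_try_init(struct server *s)
{
    assert(s != NULL);
    if (s->ctx_created || !s->socket || !s->device || !s->machine_ready) {
        return;
    }
    s->ctx_created = true;
    s->init_count++;
}

/* Setters refuse changes once the context exists ("server busy"). */
static bool server_set_socket(struct server *s, const char *path)
{
    if (s->ctx_created) {
        return false;
    }
    s->socket = path;
    server_try_init(s);
    return true;
}

static bool server_set_device(struct server *s, const char *id)
{
    if (s->ctx_created) {
        return false;
    }
    s->device = id;
    server_try_init(s);
    return true;
}

/* Plays the role of the machine-init-done notifier. */
static void server_machine_done(struct server *s)
{
    s->machine_ready = true;
    server_try_init(s);
}
```

Because the guard lives in one place, the three entry points can fire in any order and the context is still created at most once.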
[PATCH v8 17/17] vfio-user: avocado tests for vfio-user
Avocado tests for libvfio-user in QEMU - tests startup, hotplug and migration of the server object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- MAINTAINERS| 1 + tests/avocado/vfio-user.py | 164 + 2 files changed, 165 insertions(+) create mode 100644 tests/avocado/vfio-user.py diff --git a/MAINTAINERS b/MAINTAINERS index ad51ec0dc8..8676f546e9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3602,6 +3602,7 @@ F: hw/remote/vfio-user-obj.c F: include/hw/remote/vfio-user-obj.h F: hw/remote/iommu.c F: include/hw/remote/iommu.h +F: tests/avocado/vfio-user.py EBPF: M: Jason Wang diff --git a/tests/avocado/vfio-user.py b/tests/avocado/vfio-user.py new file mode 100644 index 00..ced304d770 --- /dev/null +++ b/tests/avocado/vfio-user.py @@ -0,0 +1,164 @@ +# vfio-user protocol sanity test +# +# This work is licensed under the terms of the GNU GPL, version 2 or +# later. See the COPYING file in the top-level directory. + + +import os +import socket +import uuid + +from avocado_qemu import QemuSystemTest +from avocado_qemu import wait_for_console_pattern +from avocado_qemu import exec_command +from avocado_qemu import exec_command_and_wait_for_pattern + +from avocado.utils import network +from avocado.utils import wait + +class VfioUser(QemuSystemTest): +""" +:avocado: tags=vfiouser +""" +KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 ' +timeout = 20 + +def _get_free_port(self): +port = network.find_free_port() +if port is None: +self.cancel('Failed to find a free port') +return port + +def validate_vm_launch(self, vm): +wait_for_console_pattern(self, 'as init process', + 'Kernel panic - not syncing', vm=vm) +exec_command(self, 'mount -t sysfs sysfs /sys', vm=vm) +exec_command_and_wait_for_pattern(self, + 'cat /sys/bus/pci/devices/*/uevent', + 'PCI_ID=1000:0060', vm=vm) + +def launch_server_startup(self, socket, *opts): +server_vm = self.get_vm() +server_vm.add_args('-machine', 'x-remote,vfio-user=on') 
+server_vm.add_args('-nodefaults') +server_vm.add_args('-device', 'megasas,id=sas1') +server_vm.add_args('-object', 'x-vfio-user-server,id=vfioobj1,' + 'type=unix,path='+socket+',device=sas1') +for opt in opts: +server_vm.add_args(opt) +server_vm.launch() +return server_vm + +def launch_server_hotplug(self, socket): +server_vm = self.get_vm() +server_vm.add_args('-machine', 'x-remote,vfio-user=on') +server_vm.add_args('-nodefaults') +server_vm.launch() +server_vm.qmp('device_add', args_dict=None, conv_keys=None, + driver='megasas', id='sas1') +obj_add_opts = {'qom-type': 'x-vfio-user-server', +'id': 'vfioobj', 'device': 'sas1', +'socket': {'type': 'unix', 'path': socket}} +server_vm.qmp('object-add', args_dict=obj_add_opts) +return server_vm + +def launch_client(self, kernel_path, initrd_path, kernel_command_line, + machine_type, socket, *opts): +client_vm = self.get_vm() +client_vm.set_console() +client_vm.add_args('-machine', machine_type) +client_vm.add_args('-accel', 'kvm') +client_vm.add_args('-cpu', 'host') +client_vm.add_args('-object', + 'memory-backend-memfd,id=sysmem-file,size=2G') +client_vm.add_args('--numa', 'node,memdev=sysmem-file') +client_vm.add_args('-m', '2048') +client_vm.add_args('-kernel', kernel_path, + '-initrd', initrd_path, + '-append', kernel_command_line) +client_vm.add_args('-device', + 'vfio-user-pci,socket='+socket) +for opt in opts: +client_vm.add_args(opt) +client_vm.launch() +return client_vm + +def do_test_startup(self, kernel_url, initrd_url, kernel_command_line, +machine_type): +self.require_accelerator('kvm') + +kernel_path = self.fetch_asset(kernel_url) +initrd_path = self.fetch_asset(initrd_url) +socket
[PATCH v8 15/17] vfio-user: handle device interrupts
Forward remote device's interrupts to the guest Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/pci/pci.h | 13 include/hw/remote/vfio-user-obj.h | 6 ++ hw/pci/msi.c | 16 ++-- hw/pci/msix.c | 10 ++- hw/pci/pci.c | 13 hw/remote/machine.c | 14 +++- hw/remote/vfio-user-obj.c | 123 ++ stubs/vfio-user-obj.c | 6 ++ MAINTAINERS | 1 + hw/remote/trace-events| 1 + stubs/meson.build | 1 + 11 files changed, 193 insertions(+), 11 deletions(-) create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 stubs/vfio-user-obj.c diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 3a32b8dd40..7595c05c98 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -16,6 +16,7 @@ extern bool pci_available; #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f) #define PCI_FUNC(devfn) ((devfn) & 0x07) #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn)) +#define PCI_BDF_TO_DEVFN(x) ((x) & 0xff) #define PCI_BUS_MAX 256 #define PCI_DEVFN_MAX 256 #define PCI_SLOT_MAX32 @@ -127,6 +128,10 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type); typedef void PCIUnregisterFunc(PCIDevice *pci_dev); +typedef void MSITriggerFunc(PCIDevice *dev, MSIMessage msg); +typedef MSIMessage MSIPrepareMessageFunc(PCIDevice *dev, unsigned vector); +typedef MSIMessage MSIxPrepareMessageFunc(PCIDevice *dev, unsigned vector); + typedef struct PCIIORegion { pcibus_t addr; /* current PCI mapping address. 
-1 means not mapped */ #define PCI_BAR_UNMAPPED (~(pcibus_t)0) @@ -321,6 +326,14 @@ struct PCIDevice { /* Space to store MSIX table & pending bit array */ uint8_t *msix_table; uint8_t *msix_pba; + +/* May be used by INTx or MSI during interrupt notification */ +void *irq_opaque; + +MSITriggerFunc *msi_trigger; +MSIPrepareMessageFunc *msi_prepare_message; +MSIxPrepareMessageFunc *msix_prepare_message; + /* MemoryRegion container for msix exclusive BAR setup */ MemoryRegion msix_exclusive_bar; /* Memory Regions for MSIX table and pending bit entries. */ diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h new file mode 100644 index 00..87ab78b875 --- /dev/null +++ b/include/hw/remote/vfio-user-obj.h @@ -0,0 +1,6 @@ +#ifndef VFIO_USER_OBJ_H +#define VFIO_USER_OBJ_H + +void vfu_object_set_bus_irq(PCIBus *pci_bus); + +#endif diff --git a/hw/pci/msi.c b/hw/pci/msi.c index 47d2b0f33c..d556e17a09 100644 --- a/hw/pci/msi.c +++ b/hw/pci/msi.c @@ -134,7 +134,7 @@ void msi_set_message(PCIDevice *dev, MSIMessage msg) pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data); } -MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +static MSIMessage msi_prepare_message(PCIDevice *dev, unsigned int vector) { uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev)); bool msi64bit = flags & PCI_MSI_FLAGS_64BIT; @@ -159,6 +159,11 @@ MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) return msg; } +MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector) +{ +return dev->msi_prepare_message(dev, vector); +} + bool msi_enabled(const PCIDevice *dev) { return msi_present(dev) && @@ -241,6 +246,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, 0x >> (PCI_MSI_VECTORS_MAX - nr_vectors)); } +dev->msi_prepare_message = msi_prepare_message; + return 0; } @@ -256,6 +263,7 @@ void msi_uninit(struct PCIDevice *dev) cap_size = msi_cap_sizeof(flags); pci_del_capability(dev, PCI_CAP_ID_MSI, cap_size); dev->cap_present 
&= ~QEMU_PCI_CAP_MSI; +dev->msi_prepare_message = NULL; MSI_DEV_PRINTF(dev, "uninit\n"); } @@ -334,11 +342,7 @@ void msi_notify(PCIDevice *dev, unsigned int vector) void msi_send_message(PCIDevice *dev, MSIMessage msg) { -MemTxAttrs attrs = {}; - -attrs.requester_id = pci_requester_id(dev); -address_space_stl_le(&dev->bus_master_as, msg.address, msg.data, - attrs, NULL); +dev->msi_trigger(dev, msg); } /* Normally called by pci_default_write_config(). */ diff --git a/hw/pci/msix.c b/hw/pci/msix.c index ae9331cd0b..6f85192d6f 100644 --- a/hw/pci/msix.c +++ b/hw/pci/msix.c @@ -31,7 +31,7 @@ #define MSIX_ENABLE_MASK (PCI_MSIX_FLAGS_ENABLE >> 8) #define MSIX_MASKALL_MASK (PCI_MSIX_FLAGS_MASKALL >> 8) -MSIMessage msix_get_message(PCIDevice *dev, unsigned vector) +static MSIMessage msix_prepare_message(PCIDe
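The core idea of this patch is an indirection: msi_get_message() and msi_send_message() stop doing the work themselves and dispatch through per-device function pointers, so the vfio-user server can substitute its own delivery path. The sketch below models only that dispatch. All types and names are hypothetical mirrors of the ones in the diff, and the message encoding in default_prepare() is deliberately simplified rather than the real MSI capability layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of MSIMessage. */
typedef struct {
    uint64_t address;
    uint32_t data;
} msi_message;

struct pci_dev;
typedef void msi_trigger_fn(struct pci_dev *dev, msi_message msg);
typedef msi_message msi_prepare_fn(struct pci_dev *dev, unsigned vector);

/* Hypothetical miniature of PCIDevice with the new hooks. */
struct pci_dev {
    msi_trigger_fn *msi_trigger;
    msi_prepare_fn *msi_prepare_message;
    uint64_t base_address;  /* simplified: programmed MSI address */
    uint32_t base_data;     /* simplified: programmed MSI data */
    unsigned sent;          /* messages delivered by the trigger hook */
    uint32_t last_data;     /* data of the last delivered message */
};

/* Simplified default: vector number is OR-ed into the data word. */
static msi_message default_prepare(struct pci_dev *dev, unsigned vector)
{
    msi_message msg = { dev->base_address, dev->base_data | vector };

    return msg;
}

/* A server-side trigger would forward to the client; here it just records. */
static void remote_trigger(struct pci_dev *dev, msi_message msg)
{
    dev->sent++;
    dev->last_data = msg.data;
}

/* Generic entry points only dispatch through the hooks. */
static msi_message msi_get_message(struct pci_dev *dev, unsigned vector)
{
    assert(dev->msi_prepare_message != NULL);
    return dev->msi_prepare_message(dev, vector);
}

static void msi_send_message(struct pci_dev *dev, msi_message msg)
{
    assert(dev->msi_trigger != NULL);
    dev->msi_trigger(dev, msg);
}
```

One consequence of this design is that the hooks must be installed before any interrupt is raised, which is why the diff sets msi_prepare_message in msi_init() and clears it again in msi_uninit().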
[PATCH v8 16/17] vfio-user: handle reset of remote device
Adds handler to reset a remote device

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/remote/vfio-user-obj.c | 20
 1 file changed, 20 insertions(+)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 70b4d8b9ce..8ca823aa01 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -623,6 +623,20 @@ void vfu_object_set_bus_irq(PCIBus *pci_bus)
     pci_bus_irqs(pci_bus, vfu_object_set_irq, vfu_object_map_irq, pci_bus, 1);
 }

+static int vfu_object_device_reset(vfu_ctx_t *vfu_ctx, vfu_reset_type_t type)
+{
+    VfuObject *o = vfu_get_private(vfu_ctx);
+
+    /* vfu_object_ctx_run() handles lost connection */
+    if (type == VFU_RESET_LOST_CONN) {
+        return 0;
+    }
+
+    qdev_reset_all(DEVICE(o->pci_dev));
+
+    return 0;
+}
+
 /*
  * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device'
  * properties. It also depends on devices instantiated in QEMU. These
@@ -728,6 +742,12 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp)
         goto fail;
     }

+    ret = vfu_setup_device_reset_cb(o->vfu_ctx, &vfu_object_device_reset);
+    if (ret < 0) {
+        error_setg(errp, "vfu: Failed to setup reset callback");
+        goto fail;
+    }
+
     ret = vfu_realize_ctx(o->vfu_ctx);
     if (ret < 0) {
         error_setg(errp, "vfu: Failed to realize device %s- %s",
--
2.20.1
[PATCH v8 11/17] vfio-user: handle PCI config space accesses
Define and register handlers for PCI config space accesses Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 51 +++ hw/remote/trace-events| 2 ++ 2 files changed, 53 insertions(+) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 06d99a8698..7b863dec4f 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -47,6 +47,7 @@ #include "qapi/qapi-events-misc.h" #include "qemu/notify.h" #include "qemu/thread.h" +#include "qemu/main-loop.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" @@ -236,6 +237,45 @@ retry_attach: qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o); } +static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, + size_t count, loff_t offset, + const bool is_write) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +uint32_t pci_access_width = sizeof(uint32_t); +size_t bytes = count; +uint32_t val = 0; +char *ptr = buf; +int len; + +/* + * Writes to the BAR registers would trigger an update to the + * global Memory and IO AddressSpaces. But the remote device + * never uses the global AddressSpaces, therefore overlapping + * memory regions are not a problem + */ +while (bytes > 0) { +len = (bytes > pci_access_width) ? pci_access_width : bytes; +if (is_write) { +memcpy(&val, ptr, len); +pci_host_config_write_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), + val, len); +trace_vfu_cfg_write(offset, val); +} else { +val = pci_host_config_read_common(o->pci_dev, offset, + pci_config_size(o->pci_dev), len); +memcpy(ptr, &val, len); +trace_vfu_cfg_read(offset, val); +} +offset += len; +ptr += len; +bytes -= len; +} + +return count; +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. 
These @@ -314,6 +354,17 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) TYPE_VFU_OBJECT, o->device); qdev_add_unplug_blocker(DEVICE(o->pci_dev), o->unplug_blocker); +ret = vfu_setup_region(o->vfu_ctx, VFU_PCI_DEV_CFG_REGION_IDX, + pci_config_size(o->pci_dev), &vfu_object_cfg_access, + VFU_REGION_FLAG_RW | VFU_REGION_FLAG_ALWAYS_CB, + NULL, 0, -1, 0); +if (ret < 0) { +error_setg(errp, + "vfu: Failed to setup config space handlers for %s- %s", + o->device, strerror(errno)); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 7da12f0d96..2ef7884346 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -5,3 +5,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, # vfio-user-obj.c vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" +vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x -> 0x%x" +vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x <- 0x%x" -- 2.20.1
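The chunking loop in vfu_object_cfg_access() can be tried outside QEMU. The following is only a minimal sketch under stated assumptions: cfg_space, cfg_read() and cfg_write() are hypothetical stand-ins for the device's config space and for pci_host_config_read_common() / pci_host_config_write_common(), and the bounds check at the top is an addition for the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE     256   /* hypothetical config space size */
#define ACCESS_WIDTH 4     /* each access moves at most 4 bytes */

/* Hypothetical stand-in for the emulated device's config space. */
static uint8_t cfg_space[CFG_SIZE];

/* Hypothetical stand-in for pci_host_config_read_common(). */
static uint32_t cfg_read(size_t offset, size_t len)
{
    uint32_t val = 0;

    memcpy(&val, &cfg_space[offset], len);
    return val;
}

/* Hypothetical stand-in for pci_host_config_write_common(). */
static void cfg_write(size_t offset, uint32_t val, size_t len)
{
    memcpy(&cfg_space[offset], &val, len);
}

/*
 * Split a request of arbitrary size into accesses of at most
 * ACCESS_WIDTH bytes, as the patch does. Returns the number of bytes
 * transferred, or -1 if the request falls outside the config space
 * (this check is an assumption of the sketch).
 */
static long cfg_access(char *buf, size_t count, size_t offset, int is_write)
{
    size_t bytes = count;
    char *ptr = buf;

    assert(buf != NULL);

    if (offset > CFG_SIZE || count > CFG_SIZE - offset) {
        return -1;
    }

    while (bytes > 0) {
        size_t len = bytes > ACCESS_WIDTH ? ACCESS_WIDTH : bytes;
        uint32_t val = 0;

        if (is_write) {
            memcpy(&val, ptr, len);
            cfg_write(offset, val, len);
        } else {
            val = cfg_read(offset, len);
            memcpy(ptr, &val, len);
        }

        offset += len;
        ptr += len;
        bytes -= len;
    }

    return (long)count;
}
```

A 7-byte write at offset 10 is carried out as one 4-byte access followed by one 3-byte access; reading the same range back returns the bytes that were written.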
[PATCH v8 13/17] vfio-user: handle DMA mappings
Define and register callbacks to manage the RAM regions used for device DMA Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/machine.c | 5 hw/remote/vfio-user-obj.c | 55 +++ hw/remote/trace-events| 2 ++ 3 files changed, 62 insertions(+) diff --git a/hw/remote/machine.c b/hw/remote/machine.c index cca5d25f50..7002d46980 100644 --- a/hw/remote/machine.c +++ b/hw/remote/machine.c @@ -23,6 +23,7 @@ #include "hw/remote/iohub.h" #include "hw/remote/iommu.h" #include "hw/qdev-core.h" +#include "hw/remote/iommu.h" static void remote_machine_init(MachineState *machine) { @@ -52,6 +53,10 @@ static void remote_machine_init(MachineState *machine) pci_host = PCI_HOST_BRIDGE(rem_host); +if (s->vfio_user) { +remote_iommu_setup(pci_host->bus); +} + remote_iohub_init(&s->iohub); pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq, diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 7b863dec4f..425e45e8b2 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -276,6 +276,54 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, return count; } +static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *subregion = NULL; +g_autofree char *name = NULL; +struct iovec *iov = &info->iova; + +if (!info->vaddr) { +return; +} + +name = g_strdup_printf("mem-%s-%"PRIx64"", o->device, + (uint64_t)info->vaddr); + +subregion = g_new0(MemoryRegion, 1); + +memory_region_init_ram_ptr(subregion, NULL, name, + iov->iov_len, info->vaddr); + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion); + +trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len); +} + +static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) +{ +VfuObject *o = 
vfu_get_private(vfu_ctx); +AddressSpace *dma_as = NULL; +MemoryRegion *mr = NULL; +ram_addr_t offset; + +mr = memory_region_from_host(info->vaddr, &offset); +if (!mr) { +return; +} + +dma_as = pci_device_iommu_address_space(o->pci_dev); + +memory_region_del_subregion(dma_as->root, mr); + +object_unparent(OBJECT(mr)); + +trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. These @@ -365,6 +413,13 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) goto fail; } +ret = vfu_setup_device_dma(o->vfu_ctx, &dma_register, &dma_unregister); +if (ret < 0) { +error_setg(errp, "vfu: Failed to setup DMA handlers for %s", + o->device); +goto fail; +} + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 2ef7884346..f945c7e33b 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -7,3 +7,5 @@ mpqemu_recv_io_error(int cmd, int size, int nfds) "failed to receive %d size %d, vfu_prop(const char *prop, const char *val) "vfu: setting %s as %s" vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x -> 0x%x" vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x <- 0x%x" +vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes" +vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64"" -- 2.20.1
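The bookkeeping behind dma_register() / dma_unregister() can be pictured with a small table that maps ranges of the device's DMA address space to host pointers. This is only a sketch under stated assumptions: struct dma_region, dma_region_add(), dma_region_del() and dma_translate() are hypothetical names, and the fixed-size array stands in for the MemoryRegion subregions that the patch adds to the device's address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 8   /* hypothetical fixed capacity for the sketch */

struct dma_region {
    uint64_t iova;   /* start of the range in the device's DMA address space */
    size_t len;      /* length of the range in bytes */
    void *vaddr;     /* where the range is mapped in this process */
    int in_use;
};

static struct dma_region regions[MAX_REGIONS];

/* Record a mapping. Like the patch, ignore ranges that are not mapped. */
static int dma_region_add(uint64_t iova, size_t len, void *vaddr)
{
    if (!vaddr || len == 0) {
        return -1;
    }

    for (size_t i = 0; i < MAX_REGIONS; i++) {
        if (!regions[i].in_use) {
            regions[i] = (struct dma_region){ iova, len, vaddr, 1 };
            return 0;
        }
    }

    return -1;   /* table full */
}

/* Drop a mapping, looked up by host address as in dma_unregister(). */
static int dma_region_del(const void *vaddr)
{
    assert(vaddr != NULL);

    for (size_t i = 0; i < MAX_REGIONS; i++) {
        if (regions[i].in_use && regions[i].vaddr == vaddr) {
            regions[i].in_use = 0;
            return 0;
        }
    }

    return -1;
}

/* Translate [iova, iova + len) to a host pointer, or NULL if unmapped. */
static void *dma_translate(uint64_t iova, size_t len)
{
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        const struct dma_region *r = &regions[i];
        uint64_t off;

        if (!r->in_use || iova < r->iova) {
            continue;
        }

        off = iova - r->iova;
        if (off <= r->len && len <= r->len - off) {
            return (char *)r->vaddr + off;
        }
    }

    return NULL;
}
```

In QEMU the translation step is done by the memory API once the subregion has been added; the explicit dma_translate() here only makes the effect of registering and unregistering visible.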
[PATCH v8 14/17] vfio-user: handle PCI BAR accesses
Determine the BARs used by the PCI device and register handlers to manage the access to the same. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/exec/memory.h | 3 + hw/remote/vfio-user-obj.c | 189 softmmu/physmem.c | 4 +- tests/qtest/fuzz/generic_fuzz.c | 9 +- hw/remote/trace-events | 3 + 5 files changed, 202 insertions(+), 6 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index 4d5997e6bb..4b061e62d5 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -2810,6 +2810,9 @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr, const void *buf, hwaddr len); +int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr); +bool prepare_mmio_access(MemoryRegion *mr); + static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write) { if (is_write) { diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 425e45e8b2..f75197cbe3 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -53,6 +53,7 @@ #include "hw/qdev-core.h" #include "hw/pci/pci.h" #include "qemu/timer.h" +#include "exec/memory.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -324,6 +325,192 @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); } +static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset, +hwaddr size, const bool is_write) +{ +uint8_t *ptr = buf; +bool release_lock = false; +uint8_t *ram_ptr = NULL; +MemTxResult result; +int access_size; +uint64_t val; + +if (memory_access_is_direct(mr, is_write)) { +/** + * Some devices expose a PCI expansion ROM, which could be buffer + * based as compared to other regions which are primarily based on + * MemoryRegionOps. memory_region_find() would already check + * for buffer overflow, we don't need to repeat it here. 
+ */ +ram_ptr = memory_region_get_ram_ptr(mr); + +if (is_write) { +memcpy((ram_ptr + offset), buf, size); +} else { +memcpy(buf, (ram_ptr + offset), size); +} + +return 0; +} + +while (size) { +/** + * The read/write logic used below is similar to the ones in + * flatview_read/write_continue() + */ +release_lock = prepare_mmio_access(mr); + +access_size = memory_access_size(mr, size, offset); + +if (is_write) { +val = ldn_he_p(ptr, access_size); + +result = memory_region_dispatch_write(mr, offset, val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); +} else { +result = memory_region_dispatch_read(mr, offset, &val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); + +stn_he_p(ptr, access_size, val); +} + +if (release_lock) { +qemu_mutex_unlock_iothread(); +release_lock = false; +} + +if (result != MEMTX_OK) { +return -1; +} + +size -= access_size; +ptr += access_size; +offset += access_size; +} + +return 0; +} + +static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar, +hwaddr bar_offset, char * const buf, +hwaddr len, const bool is_write) +{ +MemoryRegionSection section = { 0 }; +uint8_t *ptr = (uint8_t *)buf; +MemoryRegion *section_mr = NULL; +uint64_t section_size; +hwaddr section_offset; +hwaddr size = 0; + +while (len) { +section = memory_region_find(pci_dev->io_regions[pci_bar].memory, + bar_offset, len); + +if (!section.mr) { +warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset); +return size; +} + +section_mr = section.mr; +section_offset = section.offset_within_region; +section_size = int128_get64(section.size); + +if (is_write && section_mr->readonly) { +warn_report("vfu: attempting to write to readonly region in " +"bar %d - [0x%"PRIx64" - 0x%"PRIx64"]", +pci_bar, bar_offset, +(bar_offset + section_size)); +
[PATCH v8 10/17] vfio-user: run vfio-user context
Set up a handler to run the vfio-user context. The context is driven by messages arriving on the file descriptor associated with it: get the fd for the context and hook the handler up to it. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- qapi/misc.json| 23 ++ hw/remote/vfio-user-obj.c | 95 ++- 2 files changed, 117 insertions(+), 1 deletion(-) diff --git a/qapi/misc.json b/qapi/misc.json index b83cc39029..f3cc4a4854 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -553,3 +553,26 @@ ## { 'event': 'RTC_CHANGE', 'data': { 'offset': 'int', 'qom-path': 'str' } } + +## +# @VFU_CLIENT_HANGUP: +# +# Emitted when the client of a TYPE_VFIO_USER_SERVER closes the +# communication channel +# +# @id: ID of the TYPE_VFIO_USER_SERVER object +# +# @device: ID of attached PCI device +# +# Since: 7.1 +# +# Example: +# +# <- { "event": "VFU_CLIENT_HANGUP", +# "data": { "id": "vfu1", +#"device": "lsi1" }, +# "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } +# +## +{ 'event': 'VFU_CLIENT_HANGUP', + 'data': { 'id': 'str', 'device': 'str' } } diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 15f6fe3a1a..06d99a8698 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -27,6 +27,9 @@ * * device - id of a device on the server, a required option. PCI devices * alone are supported presently. + * + * notes - x-vfio-user-server could block IO and monitor during the + * initialization phase.
*/ #include "qemu/osdep.h" @@ -41,11 +44,14 @@ #include "hw/remote/machine.h" #include "qapi/error.h" #include "qapi/qapi-visit-sockets.h" +#include "qapi/qapi-events-misc.h" #include "qemu/notify.h" +#include "qemu/thread.h" #include "sysemu/sysemu.h" #include "libvfio-user.h" #include "hw/qdev-core.h" #include "hw/pci/pci.h" +#include "qemu/timer.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -87,6 +93,8 @@ struct VfuObject { PCIDevice *pci_dev; Error *unplug_blocker; + +int vfu_poll_fd; }; static void vfu_object_init_ctx(VfuObject *o, Error **errp); @@ -165,6 +173,69 @@ static void vfu_object_set_device(Object *obj, const char *str, Error **errp) vfu_object_init_ctx(o, errp); } +static void vfu_object_ctx_run(void *opaque) +{ +VfuObject *o = opaque; +const char *id = NULL; +int ret = -1; + +while (ret != 0) { +ret = vfu_run_ctx(o->vfu_ctx); +if (ret < 0) { +if (errno == EINTR) { +continue; +} else if (errno == ENOTCONN) { +id = object_get_canonical_path_component(OBJECT(o)); +qapi_event_send_vfu_client_hangup(id, o->device); +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); +o->vfu_poll_fd = -1; +object_unparent(OBJECT(o)); +break; +} else { +VFU_OBJECT_ERROR(o, "vfu: Failed to run device %s - %s", + o->device, strerror(errno)); +break; +} +} +} +} + +static void vfu_object_attach_ctx(void *opaque) +{ +VfuObject *o = opaque; +GPollFD pfds[1]; +int ret; + +qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL); + +pfds[0].fd = o->vfu_poll_fd; +pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR; + +retry_attach: +ret = vfu_attach_ctx(o->vfu_ctx); +if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) { +/** + * vfu_object_attach_ctx can block QEMU's main loop + * during attach - the monitor and other IO + * could be unresponsive during this time. 
+ */ +(void)qemu_poll_ns(pfds, 1, 500 * (int64_t)SCALE_MS); +goto retry_attach; +} else if (ret < 0) { +VFU_OBJECT_ERROR(o, "vfu: Failed to attach device %s to context - %s", + o->device, strerror(errno)); +return; +} + +o->vfu_poll_fd = vfu_get_poll_fd(o->vfu_ctx); +if (o->vfu_poll_fd < 0) { +VFU_OBJECT_ERROR(o, "vfu: Failed to get poll fd %s", o->device); +return; +} + +qemu_set_fd_handler(o->vfu_poll_fd, vfu_object_ctx_run, NULL, o); +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It
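The errno handling in vfu_object_ctx_run() can be exercised in isolation. The sketch below is an illustration under stated assumptions, not the patch's code: run_fn is a hypothetical stand-in for vfu_run_ctx(), scripted_run() is a test double that replays canned return values, and the enum replaces the QMP event and object teardown that the real handler performs on hangup.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum run_result { RUN_DONE, RUN_HANGUP, RUN_ERROR };

/* Hypothetical stand-in for vfu_run_ctx(): negative on error, errno set. */
typedef int (*run_fn)(void *opaque);

/*
 * Keep running the context until it reports 0. Retry when interrupted,
 * report a hangup when the client has gone away, and give up on any
 * other error.
 */
static enum run_result run_until_idle(run_fn run, void *opaque)
{
    int ret = -1;

    assert(run != NULL);

    while (ret != 0) {
        ret = run(opaque);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;            /* interrupted: try again */
            } else if (errno == ENOTCONN) {
                return RUN_HANGUP;   /* client closed the channel */
            }
            return RUN_ERROR;        /* anything else is fatal */
        }
    }

    return RUN_DONE;
}

/* Test double: replays a scripted sequence of return values and errnos. */
struct script {
    const int *rets;
    const int *errs;
    size_t pos;
};

static int scripted_run(void *opaque)
{
    struct script *s = opaque;
    int ret = s->rets[s->pos];

    if (ret < 0) {
        errno = s->errs[s->pos];
    }
    s->pos++;

    return ret;
}
```

Returning a result code instead of acting inside the loop is a choice made for the sketch; it keeps the retry policy separate from what the caller does on hangup.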
[PATCH v7 12/17] vfio-user: IOMMU support for remote device
Assign a separate address space to each device in the remote processes. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/remote/iommu.h | 18 hw/remote/iommu.c | 95 +++ MAINTAINERS | 2 + hw/remote/meson.build | 1 + 4 files changed, 116 insertions(+) create mode 100644 include/hw/remote/iommu.h create mode 100644 hw/remote/iommu.c diff --git a/include/hw/remote/iommu.h b/include/hw/remote/iommu.h new file mode 100644 index 00..8f850400f1 --- /dev/null +++ b/include/hw/remote/iommu.h @@ -0,0 +1,18 @@ +/** + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_IOMMU_H +#define REMOTE_IOMMU_H + +#include "hw/pci/pci_bus.h" + +void remote_configure_iommu(PCIBus *pci_bus); + +void remote_iommu_del_device(PCIDevice *pci_dev); + +#endif diff --git a/hw/remote/iommu.c b/hw/remote/iommu.c new file mode 100644 index 00..13f329b45d --- /dev/null +++ b/hw/remote/iommu.c @@ -0,0 +1,95 @@ +/** + * IOMMU for remote device + * + * Copyright © 2022 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory.
+ * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/remote/iommu.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci.h" +#include "exec/memory.h" +#include "exec/address-spaces.h" +#include "trace.h" + +struct RemoteIommuElem { +AddressSpace as; +MemoryRegion mr; +}; + +struct RemoteIommuTable { +QemuMutex lock; +GHashTable *elem_by_bdf; +} remote_iommu_table; + +#define INT2VOIDP(i) (void *)(uintptr_t)(i) + +static AddressSpace *remote_iommu_find_add_as(PCIBus *pci_bus, + void *opaque, int devfn) +{ +struct RemoteIommuTable *iommu_table = opaque; +struct RemoteIommuElem *elem = NULL; +int pci_bdf = PCI_BUILD_BDF(pci_bus_num(pci_bus), devfn); + +elem = g_hash_table_lookup(iommu_table->elem_by_bdf, INT2VOIDP(pci_bdf)); + +if (!elem) { +g_autofree char *mr_name = g_strdup_printf("vfu-ram-%d", pci_bdf); +g_autofree char *as_name = g_strdup_printf("vfu-as-%d", pci_bdf); + +elem = g_malloc0(sizeof(struct RemoteIommuElem)); + +memory_region_init(&elem->mr, NULL, mr_name, UINT64_MAX); +address_space_init(&elem->as, &elem->mr, as_name); + +qemu_mutex_lock(&iommu_table->lock); +g_hash_table_insert(iommu_table->elem_by_bdf, INT2VOIDP(pci_bdf), elem); +qemu_mutex_unlock(&iommu_table->lock); +} + +return &elem->as; +} + +static void remote_iommu_del_elem(gpointer data) +{ +struct RemoteIommuElem *elem = data; + +g_assert(elem); + +memory_region_unref(&elem->mr); +address_space_destroy(&elem->as); + +g_free(elem); +} + +void remote_iommu_del_device(PCIDevice *pci_dev) +{ +int pci_bdf; + +if (!remote_iommu_table.elem_by_bdf || !pci_dev) { +return; +} + +pci_bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)), pci_dev->devfn); + +qemu_mutex_lock(&remote_iommu_table.lock); +g_hash_table_remove(remote_iommu_table.elem_by_bdf, INT2VOIDP(pci_bdf)); +qemu_mutex_unlock(&remote_iommu_table.lock); +} + +void remote_configure_iommu(PCIBus *pci_bus) +{ +if (!remote_iommu_table.elem_by_bdf) { +remote_iommu_table.elem_by_bdf = +g_hash_table_new_full(NULL, NULL, 
NULL, remote_iommu_del_elem); +qemu_mutex_init(&remote_iommu_table.lock); +} + +pci_setup_iommu(pci_bus, remote_iommu_find_add_as, &remote_iommu_table); +} diff --git a/MAINTAINERS b/MAINTAINERS index e7b0297a63..21694a9698 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3599,6 +3599,8 @@ F: hw/remote/iohub.c F: include/hw/remote/iohub.h F: subprojects/libvfio-user F: hw/remote/vfio-user-obj.c +F: hw/remote/iommu.c +F: include/hw/remote/iommu.h EBPF: M: Jason Wang diff --git a/hw/remote/meson.build b/hw/remote/meson.build index 534ac5df79..bcef83c8cc 100644 --- a/hw/remote/meson.build +++ b/hw/remote/meson.build @@ -6,6 +6,7 @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c')) remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c')) remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c')) remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c')) +remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iommu.c')) remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: files('vfio-user-obj.c')) remote_ss.add(when: 'CONFIG_VFIO_USER_SERVER', if_true: vfiouser) -- 2.20.1
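The find-or-create lookup in remote_iommu_find_add_as() keys each per-device address space on the device's bus/devfn (BDF) number. The sketch below shows the key arithmetic and the lookup with a plain array, under stated assumptions: BUILD_BDF, BDF_TO_DEVFN and DEVFN mirror the arithmetic of QEMU's PCI_BUILD_BDF, PCI_BDF_TO_DEVFN and PCI_DEVFN macros, while struct as_elem, find_add_as() and the fixed-size table are hypothetical stand-ins for RemoteIommuElem and the GHashTable, and the locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same arithmetic as QEMU's PCI_BUILD_BDF, PCI_BDF_TO_DEVFN and PCI_DEVFN. */
#define BUILD_BDF(bus, devfn)  ((((bus) & 0xff) << 8) | ((devfn) & 0xff))
#define BDF_TO_DEVFN(bdf)      ((bdf) & 0xff)
#define DEVFN(slot, func)      ((((slot) & 0x1f) << 3) | ((func) & 0x07))

#define MAX_ELEMS 16   /* hypothetical capacity; the patch uses a GHashTable */

/* Hypothetical stand-in for RemoteIommuElem (an AddressSpace plus its root). */
struct as_elem {
    int in_use;
    unsigned bdf;
};

static struct as_elem table[MAX_ELEMS];

/*
 * Return the element for (bus, devfn), creating it on first use.
 * Returns NULL if the table is full.
 */
static struct as_elem *find_add_as(unsigned bus, unsigned devfn)
{
    unsigned bdf;
    struct as_elem *free_slot = NULL;

    assert(bus <= 0xff && devfn <= 0xff);
    bdf = BUILD_BDF(bus, devfn);

    for (size_t i = 0; i < MAX_ELEMS; i++) {
        if (table[i].in_use && table[i].bdf == bdf) {
            return &table[i];          /* already created for this device */
        }
        if (!table[i].in_use && !free_slot) {
            free_slot = &table[i];     /* remember the first free slot */
        }
    }

    if (free_slot) {
        free_slot->in_use = 1;
        free_slot->bdf = bdf;
    }

    return free_slot;
}
```

Two devices on the same bus get distinct elements, and a second lookup for the same device returns the element created by the first. For bus 1, slot 3, function 1 the key works out to 0x119.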
[PATCH v7 17/17] vfio-user: avocado tests for vfio-user
Avocado tests for libvfio-user in QEMU - tests startup, hotplug and migration of the server object Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- MAINTAINERS| 1 + tests/avocado/vfio-user.py | 164 + 2 files changed, 165 insertions(+) create mode 100644 tests/avocado/vfio-user.py diff --git a/MAINTAINERS b/MAINTAINERS index d07f2a0985..f165281796 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3602,6 +3602,7 @@ F: hw/remote/vfio-user-obj.c F: include/hw/remote/vfio-user-obj.h F: hw/remote/iommu.c F: include/hw/remote/iommu.h +F: tests/avocado/vfio-user.py EBPF: M: Jason Wang diff --git a/tests/avocado/vfio-user.py b/tests/avocado/vfio-user.py new file mode 100644 index 00..ced304d770 --- /dev/null +++ b/tests/avocado/vfio-user.py @@ -0,0 +1,164 @@ +# vfio-user protocol sanity test +# +# This work is licensed under the terms of the GNU GPL, version 2 or +# later. See the COPYING file in the top-level directory. + + +import os +import socket +import uuid + +from avocado_qemu import QemuSystemTest +from avocado_qemu import wait_for_console_pattern +from avocado_qemu import exec_command +from avocado_qemu import exec_command_and_wait_for_pattern + +from avocado.utils import network +from avocado.utils import wait + +class VfioUser(QemuSystemTest): +""" +:avocado: tags=vfiouser +""" +KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 ' +timeout = 20 + +def _get_free_port(self): +port = network.find_free_port() +if port is None: +self.cancel('Failed to find a free port') +return port + +def validate_vm_launch(self, vm): +wait_for_console_pattern(self, 'as init process', + 'Kernel panic - not syncing', vm=vm) +exec_command(self, 'mount -t sysfs sysfs /sys', vm=vm) +exec_command_and_wait_for_pattern(self, + 'cat /sys/bus/pci/devices/*/uevent', + 'PCI_ID=1000:0060', vm=vm) + +def launch_server_startup(self, socket, *opts): +server_vm = self.get_vm() +server_vm.add_args('-machine', 'x-remote,vfio-user=on') 
+server_vm.add_args('-nodefaults') +server_vm.add_args('-device', 'megasas,id=sas1') +server_vm.add_args('-object', 'x-vfio-user-server,id=vfioobj1,' + 'type=unix,path='+socket+',device=sas1') +for opt in opts: +server_vm.add_args(opt) +server_vm.launch() +return server_vm + +def launch_server_hotplug(self, socket): +server_vm = self.get_vm() +server_vm.add_args('-machine', 'x-remote,vfio-user=on') +server_vm.add_args('-nodefaults') +server_vm.launch() +server_vm.qmp('device_add', args_dict=None, conv_keys=None, + driver='megasas', id='sas1') +obj_add_opts = {'qom-type': 'x-vfio-user-server', +'id': 'vfioobj', 'device': 'sas1', +'socket': {'type': 'unix', 'path': socket}} +server_vm.qmp('object-add', args_dict=obj_add_opts) +return server_vm + +def launch_client(self, kernel_path, initrd_path, kernel_command_line, + machine_type, socket, *opts): +client_vm = self.get_vm() +client_vm.set_console() +client_vm.add_args('-machine', machine_type) +client_vm.add_args('-accel', 'kvm') +client_vm.add_args('-cpu', 'host') +client_vm.add_args('-object', + 'memory-backend-memfd,id=sysmem-file,size=2G') +client_vm.add_args('--numa', 'node,memdev=sysmem-file') +client_vm.add_args('-m', '2048') +client_vm.add_args('-kernel', kernel_path, + '-initrd', initrd_path, + '-append', kernel_command_line) +client_vm.add_args('-device', + 'vfio-user-pci,socket='+socket) +for opt in opts: +client_vm.add_args(opt) +client_vm.launch() +return client_vm + +def do_test_startup(self, kernel_url, initrd_url, kernel_command_line, +machine_type): +self.require_accelerator('kvm') + +kernel_path = self.fetch_asset(kernel_url) +initrd_path = self.fetch_asset(initrd_url) +socket
[PATCH v7 15/17] vfio-user: handle device interrupts
Forward remote device's interrupts to the guest Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/hw/pci/pci.h | 10 +++ include/hw/remote/vfio-user-obj.h | 6 ++ hw/pci/msi.c | 11 +++- hw/pci/msix.c | 10 ++- hw/remote/machine.c | 14 +++-- hw/remote/vfio-user-obj.c | 101 ++ stubs/vfio-user-obj.c | 6 ++ MAINTAINERS | 1 + hw/remote/trace-events| 1 + stubs/meson.build | 1 + 10 files changed, 155 insertions(+), 6 deletions(-) create mode 100644 include/hw/remote/vfio-user-obj.h create mode 100644 stubs/vfio-user-obj.c diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 3a32b8dd40..fb8a05ae25 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -16,6 +16,7 @@ extern bool pci_available; #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f) #define PCI_FUNC(devfn) ((devfn) & 0x07) #define PCI_BUILD_BDF(bus, devfn) ((bus << 8) | (devfn)) +#define PCI_BDF_TO_DEVFN(x) ((x) & 0xff) #define PCI_BUS_MAX 256 #define PCI_DEVFN_MAX 256 #define PCI_SLOT_MAX32 @@ -126,6 +127,8 @@ typedef uint32_t PCIConfigReadFunc(PCIDevice *pci_dev, typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type); typedef void PCIUnregisterFunc(PCIDevice *pci_dev); +typedef void PCIMSINotify(PCIDevice *pci_dev, unsigned vector); +typedef void PCIMSIxNotify(PCIDevice *pci_dev, unsigned vector); typedef struct PCIIORegion { pcibus_t addr; /* current PCI mapping address. -1 means not mapped */ @@ -321,6 +324,13 @@ struct PCIDevice { /* Space to store MSIX table & pending bit array */ uint8_t *msix_table; uint8_t *msix_pba; + +/* May be used by INTx or MSI during interrupt notification */ +void *irq_opaque; + +PCIMSINotify *msi_notify; +PCIMSIxNotify *msix_notify; + /* MemoryRegion container for msix exclusive BAR setup */ MemoryRegion msix_exclusive_bar; /* Memory Regions for MSIX table and pending bit entries. 
*/ diff --git a/include/hw/remote/vfio-user-obj.h b/include/hw/remote/vfio-user-obj.h new file mode 100644 index 00..87ab78b875 --- /dev/null +++ b/include/hw/remote/vfio-user-obj.h @@ -0,0 +1,6 @@ +#ifndef VFIO_USER_OBJ_H +#define VFIO_USER_OBJ_H + +void vfu_object_set_bus_irq(PCIBus *pci_bus); + +#endif diff --git a/hw/pci/msi.c b/hw/pci/msi.c index 47d2b0f33c..a161a5380b 100644 --- a/hw/pci/msi.c +++ b/hw/pci/msi.c @@ -51,6 +51,8 @@ */ bool msi_nonbroken; +static void pci_msi_notify(PCIDevice *dev, unsigned int vector); + /* If we get rid of cap allocator, we won't need this. */ static inline uint8_t msi_cap_sizeof(uint16_t flags) { @@ -225,6 +227,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, dev->msi_cap = config_offset; dev->cap_present |= QEMU_PCI_CAP_MSI; +dev->msi_notify = pci_msi_notify; + pci_set_word(dev->config + msi_flags_off(dev), flags); pci_set_word(dev->wmask + msi_flags_off(dev), PCI_MSI_FLAGS_QSIZE | PCI_MSI_FLAGS_ENABLE); @@ -307,7 +311,7 @@ bool msi_is_masked(const PCIDevice *dev, unsigned int vector) return mask & (1U << vector); } -void msi_notify(PCIDevice *dev, unsigned int vector) +static void pci_msi_notify(PCIDevice *dev, unsigned int vector) { uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev)); bool msi64bit = flags & PCI_MSI_FLAGS_64BIT; @@ -332,6 +336,11 @@ void msi_notify(PCIDevice *dev, unsigned int vector) msi_send_message(dev, msg); } +void msi_notify(PCIDevice *dev, unsigned int vector) +{ +dev->msi_notify(dev, vector); +} + void msi_send_message(PCIDevice *dev, MSIMessage msg) { MemTxAttrs attrs = {}; diff --git a/hw/pci/msix.c b/hw/pci/msix.c index ae9331cd0b..fbf88654b3 100644 --- a/hw/pci/msix.c +++ b/hw/pci/msix.c @@ -31,6 +31,8 @@ #define MSIX_ENABLE_MASK (PCI_MSIX_FLAGS_ENABLE >> 8) #define MSIX_MASKALL_MASK (PCI_MSIX_FLAGS_MASKALL >> 8) +static void pci_msix_notify(PCIDevice *dev, unsigned vector); + MSIMessage msix_get_message(PCIDevice *dev, unsigned vector) { uint8_t *table_entry = 
dev->msix_table + vector * PCI_MSIX_ENTRY_SIZE; @@ -334,6 +336,7 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries, dev->msix_table = g_malloc0(table_size); dev->msix_pba = g_malloc0(pba_size); dev->msix_entry_used = g_malloc0(nentries * sizeof *dev->msix_entry_used); +dev->msix_notify = pci_msix_notify; msix_mask_all(dev, nentries); @@ -485,7 +488,7 @@ int msix_enabled(PCIDevice *dev) } /* Send an MSI-X message */ -void msix_notify(PCIDevice *dev, unsigned vector) +static vo