Re: [libvirt PATCH v4 25/25] nodedev: fix hang when destroying an mdev in use

2021-02-21 Thread Erik Skultety
On Wed, Feb 03, 2021 at 11:39:09AM -0600, Jonathon Jongsma wrote:
> Calling `mdevctl stop` for a mediated device that is in use by an active
> domain will block until that vm exits (or the vm closes the device).
> Since the nodedev driver cannot query the hypervisor driver to see
> whether any active domains are using the device, we resort to a
> workaround that relies on the fact that a vfio group can only be opened
> by one user at a time. If we get an EBUSY error when attempting to open
> the group file, we assume the device is in use and refuse to try to
> destroy that device.
> 
> Signed-off-by: Jonathon Jongsma 
> ---
>  src/node_device/node_device_driver.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/src/node_device/node_device_driver.c 
> b/src/node_device/node_device_driver.c
> index bf97291041..e6b0213157 100644
> --- a/src/node_device/node_device_driver.c
> +++ b/src/node_device/node_device_driver.c
> @@ -1164,6 +1164,23 @@ nodeDeviceDestroy(virNodeDevicePtr device)
>  
>  ret = 0;
>  } else if (nodeDeviceHasCapability(def, VIR_NODE_DEV_CAP_MDEV)) {
> +/* If this mediated device is in use by a vm, attempting to stop it
> + * will block until the vm closes the device. Since the nodedev 
> driver
> + * cannot query the hypervisor driver to determine whether the device

A nice detailed commentary, but for future reference I'd still add the reason
*why* it is that nodedev driver cannot poke the hypervisor driver.

> + * is in use by any active domains, we need to resort to a 
> workaround.
> + * vfio only allows the group for a device to be opened by one user 
> at
> + * a time. So if we get EBUSY when opening the group, we infer that 
> the
> + * device is in use and shouldn't try to remove the device. */
> +g_autofree char *vfiogroup =
> +virMediatedDeviceGetIOMMUGroupDev(def->caps->data.mdev.uuid);
> +VIR_AUTOCLOSE fd = open(vfiogroup, O_RDONLY);
> +
> +if (fd < 0 && errno == EBUSY) {
> +virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +   _("Unable to destroy a mediated device that is in 
> use"));

I think a slightly better message would look like:
_("Unable to destroy '%s': device in use"), def->name


This was a simple workaround indeed :).

Reviewed-by: Erik Skultety 



Re: [libvirt PATCH v4 24/25] nodedev: add ability to specify UUID for new mdevs

2021-02-21 Thread Erik Skultety
On Wed, Feb 03, 2021 at 11:39:08AM -0600, Jonathon Jongsma wrote:
> Use the new  element in the mdev caps to define and start devices
> with a specific UUID.
> 
> Signed-off-by: Jonathon Jongsma 
> ---
>  src/node_device/node_device_driver.c   | 18 +++---
>  ...019_36ea_4111_8f0a_8c9a70e21366-define.argv |  3 ++-
>  ...d019_36ea_4111_8f0a_8c9a70e21366-start.argv |  3 ++-
>  ...ev_d069d019_36ea_4111_8f0a_8c9a70e21366.xml |  1 +
>  4 files changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/src/node_device/node_device_driver.c 
> b/src/node_device/node_device_driver.c
> index d5cdf2b097..bf97291041 100644
> --- a/src/node_device/node_device_driver.c
> +++ b/src/node_device/node_device_driver.c
> @@ -728,6 +728,10 @@ 
> nodeDeviceGetMdevctlDefineStartCommand(virNodeDeviceDefPtr def,
> NULL);
>  
>  virCommandSetInputBuffer(cmd, json);
> +
> +if (def->caps->data.mdev.uuid)
> +virCommandAddArgPair(cmd, "--uuid", def->caps->data.mdev.uuid);
> +
>  virCommandSetOutputBuffer(cmd, uuid_out);
>  
>  return cmd;
> @@ -806,8 +810,12 @@ nodeDeviceCreateXMLMdev(virConnectPtr conn,
> _("Unable to start mediated device"));
>  return NULL;
>  }
> +if (uuid) {

mdevctl returns an empty string when UUID was provided, so this has to become
'if (uuid && uuid[0])'

> +g_free(def->caps->data.mdev.uuid);
> +def->caps->data.mdev.uuid = g_steal_pointer();
> +}
>  
> -return nodeDeviceFindNewMediatedDevice(conn, uuid);
> +return nodeDeviceFindNewMediatedDevice(conn, def->caps->data.mdev.uuid);
>  }
>  
>  
> @@ -1213,9 +1221,13 @@ nodeDeviceDefineXML(virConnectPtr conn,
>  return NULL;
>  }
>  
> -def->caps->data.mdev.uuid = g_strdup(uuid);
> +if (uuid) {

^Here too...

Erik

> +g_free(def->caps->data.mdev.uuid);
> +def->caps->data.mdev.uuid = g_steal_pointer();
> +}
> +
>  mdevGenerateDeviceName(def);
> -device = nodeDeviceFindNewMediatedDevice(conn, uuid);
> +device = nodeDeviceFindNewMediatedDevice(conn, 
> def->caps->data.mdev.uuid);



Re: [libvirt PATCH v4 23/25] nodedev: add element to mdev caps

2021-02-21 Thread Erik Skultety
On Wed, Feb 03, 2021 at 11:39:07AM -0600, Jonathon Jongsma wrote:
> It will be useful to be able to specify a particular UUID for a mediated
> device when defining the node device. To accomodate that, allow this to
> be specified in the xml schema. This patch also parses and formats that
> value to the xml, but does not yet use it.
> 
> Signed-off-by: Jonathon Jongsma 
> ---
...

>  
> +if ((uuidstr = virXPathString("string(./uuid[1])", ctxt))) {
> +/* make sure that the provided uuid is valid */
> +if (virUUIDParse(uuidstr, uuidbuf) < 0) {
> +virReportError(VIR_ERR_INTERNAL_ERROR,
> +   _("Invalid uuid '%s' for '%s'"), mdev->uuid,

we don't have mdev->uuid at this point yet, so:
s/mdev->uuid/uuidstr

> +   def->name);

we also haven't generated any name yet, so ^this yields "new device". I'd
suggest to hardcode the message with "new mdev device" instead of using
def->name.

Being able to specify UUIDs with mdevs directly is great.

Reviewed-by: Erik Skultety 



Re: [libvirt PATCH v4 21/25] api: add virNodeDeviceCreate()

2021-02-21 Thread Erik Skultety
On Wed, Feb 03, 2021 at 11:39:05AM -0600, Jonathon Jongsma wrote:
> This new API function provides a way to start a persistently-defined
> mediate device that was defined by virNodeDeviceDefineXML() (or one that
> was defined externally via mdevctl)
> 
> Signed-off-by: Jonathon Jongsma 
> ---
...

> +int nodeDeviceCreate(virNodeDevicePtr device)
> +{
> +int ret = -1;
> +virNodeDeviceObjPtr obj = NULL;
> +virNodeDeviceDefPtr def;

...actually, please initialize ^this pointer as well to remain consistent.
Erik

(The RB still stands)



Re: [libvirt PATCH v4 22/25] virsh: add "nodedev-start" command

2021-02-21 Thread Erik Skultety
On Wed, Feb 03, 2021 at 11:39:06AM -0600, Jonathon Jongsma wrote:
> This virsh command maps to virNodeDeviceCreate(), which starts a node
> device that has been previously defined by virNodeDeviceDefineXML().
> This is only supported for mediated devices.
> 
> Signed-off-by: Jonathon Jongsma 
> ---
Reviewed-by: Erik Skultety 



Re: [libvirt PATCH v4 20/25] virsh: add nodedev-undefine command

2021-02-21 Thread Erik Skultety
On Wed, Feb 03, 2021 at 11:39:04AM -0600, Jonathon Jongsma wrote:
> Add a virsh command that maps to virNodeDeviceUndefine().
> 
> Signed-off-by: Jonathon Jongsma 
Reviewed-by: Erik Skultety 



Re: [libvirt PATCH v4 15/25] virsh: Add --inactive, --all to nodedev-list

2021-02-21 Thread Erik Skultety
On Wed, Feb 03, 2021 at 11:38:59AM -0600, Jonathon Jongsma wrote:
> Now that we can filter active and inactive node devices in
> virConnectListAllNodeDevices(), add these switches to the virsh command.
> 
> Eventual output (once everything is hooked up):
> 
> virsh # nodedev-list --cap mdev
> mdev_bd2ea955_3402_4252_8c17_7468083a0f26
> 
> virsh # nodedev-list --inactive --cap mdev
> mdev_07d8b8b0_7e04_4c0f_97ed_9214ce12723c
> mdev_927c040f_ae7d_4a35_966e_286ba6ebbe1c
> 
> virsh # nodedev-list --all --cap mdev
> mdev_07d8b8b0_7e04_4c0f_97ed_9214ce12723c
> mdev_927c040f_ae7d_4a35_966e_286ba6ebbe1c
> mdev_bd2ea955_3402_4252_8c17_7468083a0f26
> 
> Signed-off-by: Jonathon Jongsma 
> ---
>  tools/virsh-nodedev.c | 30 ++
>  1 file changed, 26 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c
> index 428ead7384..a2e83fb676 100644
> --- a/tools/virsh-nodedev.c
> +++ b/tools/virsh-nodedev.c
> @@ -378,6 +378,14 @@ static const vshCmdOptDef opts_node_list_devices[] = {
>   .completer = virshNodeDeviceCapabilityNameCompleter,
>   .help = N_("capability names, separated by comma")
>  },
> +{.name = "inactive",
> + .type = VSH_OT_BOOL,
> + .help = N_("list inactive devices")
> +},
> +{.name = "all",
> + .type = VSH_OT_BOOL,
> + .help = N_("list inactive & active devices")
> +},
>  {.name = NULL}
>  };
>  
> @@ -393,18 +401,27 @@ cmdNodeListDevices(vshControl *ctl, const vshCmd *cmd 
> G_GNUC_UNUSED)
>  int ncaps = 0;
>  virshNodeDeviceListPtr list = NULL;
>  int cap_type = -1;
> +bool inactive, all;

1 declaration per line...

>  
> +inactive = vshCommandOptBool(cmd, "inactive");
> +all = vshCommandOptBool(cmd, "all");

...also ^these 2 can be used to initialize the variables at their definition.

Reviewed-by: Erik Skultety 



Re: [PATCH RESEND 20/20] virhostdev.c: remove missing PCI devs from hostdev manager

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

virHostdevReAttachPCIDevices() is called when we want to re-attach
a list of hostdevs back to the host, either on the shutdown path or
via a 'virsh detach-device' call.  This function always count on the
existence of the device in the host to work, but this can lead to
problems. For example, a SR-IOV device can be removed via an admin
"echo 0 > /sys/bus/pci/devices//sriov_numvfs", making the kernel
fire up and eventfd_signal() to the process, asking for the process to
release the device. The result might vary depending on the device driver
and OS/arch, but two possible outcomes are:

1) the hypervisor driver will detach the device from the VM, issuing a
delete event to Libvirt. This can be observed in QEMU;

2) the 'echo 0 > ...' will hang waiting for the device to be unplugged.
This means that the VM process failed/refused to release the hostdev back
to the host, and the hostdev will be detached during VM shutdown.

Today we don't behave well for both cases. We'll fail to remove the PCI device
reference from mgr->activePCIHostdevs and mgr->inactivePCIHostdevs because
we rely on the existence of the PCI device conf file in the sysfs. Attempting
to re-utilize the same device (assuming it is now present back in the host)
can result in an error like this:

$ ./run tools/virsh start vm1-sriov --console
error: Failed to start domain vm1-sriov
error: Requested operation is not valid: PCI device :01:00.2 is in use by 
driver QEMU, domain vm1-sriov

For (1), a VM destroy/start cycle is needed to re-use the VF in the guest.
For (2), the effect is more nefarious, requiring a Libvirtd daemon restart
to use the VF again in any guest.

We can make it a bit better by checking, during virHostdevReAttachPCIDevices(),
if there is any missing PCI device that will be left behind in activePCIHostdevs
and inactivePCIHostdevs lists. Remove any missing device found from both lists,
unconditionally, matching the current state of the host. This change affects
the code path in (1) (processDeviceDeletedEvent into qemuDomainRemoveDevice, all
the way back to qemuHostdevReAttachPCIDevices) and also in (b) (qemuProcessStop
into qemuHostdevReAttachDomainDevices).

NB: Although this patch enables the possibility of 'outside Libvirt' SR-IOV
hotunplug of PCI devices, if the hypervisor and the PCI driver copes with it,
our goal is to mitigate what it is still considered an user oopsie. For all''


This is one of those odd cases where you use "a" before a word starting 
with a vowel rather than "an". It's that way because the *sound* 
starting the next word is "you" rather than "ooh".



supported purposes, the admin must remove the SR-IOV VFs from all running 
domains
before removing the VFs from the host.

Resolves:  https://gitlab.com/libvirt/libvirt/-/issues/72
Signed-off-by: Daniel Henrique Barboza 
---
  src/hypervisor/virhostdev.c | 38 +
  1 file changed, 38 insertions(+)

diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index c708791eec..ed43733e71 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -1077,6 +1077,40 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr 
mgr,
  }
  
  
+static void

+virHostdevDeleteMissingPCIDevices(virHostdevManagerPtr mgr,
+  virDomainHostdevDefPtr *hostdevs,
+  int nhostdevs)
+{
+size_t i;
+
+if (nhostdevs == 0)
+return;
+
+virObjectLock(mgr->activePCIHostdevs);
+virObjectLock(mgr->inactivePCIHostdevs);
+
+for (i = 0; i < nhostdevs; i++) {
+virDomainHostdevDef *hostdev = hostdevs[i];
+virDomainHostdevSubsysPCI *pcisrc = >source.subsys.u.pci;
+g_autoptr(virPCIDevice) pci = NULL;
+
+if (virHostdevGetPCIHostDevice(hostdev, ) != -2)
+continue;
+
+/* The PCI device from 'hostdev' does not exist in the host
+ * anymore. Delete it from both active and inactive lists to
+ * reflect the current host state.
+ */
+virPCIDeviceListDel(mgr->activePCIHostdevs, >addr);
+virPCIDeviceListDel(mgr->inactivePCIHostdevs, >addr);
+}
+
+virObjectUnlock(mgr->activePCIHostdevs);
+virObjectUnlock(mgr->inactivePCIHostdevs);


It makes me nervous that you're unlocking these lists in the same order 
that you lock them, since that could theoretically lead to a deadlock. I 
haven't even thought beyond this single function to figure out if it's 
even a possibility in this case, but still would feel better if you 
unlocked in the opposite order that you lock.


With that fixed:

Reviewed-by: Laine Stump 



+}
+
+
  /* @oldStateDir:
   * For upgrade purpose: see virHostdevRestoreNetConfig
   */
@@ -1102,6 +1136,10 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr,
  
  virHostdevReAttachPCIDevicesImpl(mgr, drv_name, dom_name, pcidevs,

   hostdevs, 

Re: [PATCH RESEND 17/20] virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListFind()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

Signed-off-by: Daniel Henrique Barboza 
---
  src/hypervisor/virhostdev.c | 12 
  src/util/virpci.c   | 16 
  src/util/virpci.h   |  2 +-
  3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index 402d7be42d..3bfb04c674 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -657,7 +657,8 @@ virHostdevReattachAllPCIDevices(virHostdevManagerPtr mgr,
  
  /* We need to look up the actual device because that's what

   * virPCIDeviceReattach() expects as its argument */
-if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)))
+if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs,
+virPCIDeviceGetAddress(pci
  continue;
  
  if (virPCIDeviceGetManaged(actual)) {

@@ -777,7 +778,8 @@ virHostdevPreparePCIDevicesImpl(virHostdevManagerPtr mgr,
  
  /* Unmanaged devices should already have been marked as

   * inactive: if that's the case, we can simply move on */
-if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)) {
+if (virPCIDeviceListFind(mgr->inactivePCIHostdevs,
+ virPCIDeviceGetAddress(pci))) {


Wondered at first why you were using a access function to get the 
pointer rather than just referencing it directly. I'd forgotten that 
virPCIDevice is one of those rare "private structures" in libvirt - 
defined only in a single .c file.


Anyway,

Reviewed-by: Laine Stump 



  VIR_DEBUG("Not detaching unmanaged PCI device %s",
virPCIDeviceGetName(pci));
  continue;
@@ -860,7 +862,8 @@ virHostdevPreparePCIDevicesImpl(virHostdevManagerPtr mgr,
   * there because 'pci' only contain address information and will
   * be released at the end of the function */
  pci = virPCIDeviceListGet(pcidevs, i);
-actual = virPCIDeviceListFind(mgr->activePCIHostdevs, pci);
+actual = virPCIDeviceListFind(mgr->activePCIHostdevs,
+  virPCIDeviceGetAddress(pci));
  
  VIR_DEBUG("Setting driver and domain information for PCI device %s",

virPCIDeviceGetName(pci));
@@ -992,7 +995,8 @@ virHostdevReAttachPCIDevicesImpl(virHostdevManagerPtr mgr,
   * information such as by which domain and driver it is used. As a
   * side effect, by looking it up we can also tell whether it was
   * really active in the first place */
-actual = virPCIDeviceListFind(mgr->activePCIHostdevs, pci);
+actual = virPCIDeviceListFind(mgr->activePCIHostdevs,
+  virPCIDeviceGetAddress(pci));
  if (actual) {
  const char *actual_drvname;
  const char *actual_domname;
diff --git a/src/util/virpci.c b/src/util/virpci.c
index 1554acffb6..9544275c31 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -705,7 +705,7 @@ virPCIDeviceSharesBusWithActive(virPCIDevicePtr dev, 
virPCIDevicePtr check, void
  return 0;
  
  /* same bus, but inactive, i.e. about to be assigned to guest */

-if (inactiveDevs && virPCIDeviceListFind(inactiveDevs, check))
+if (inactiveDevs && virPCIDeviceListFind(inactiveDevs, >address))
  return 0;
  
  return 1;

@@ -1022,7 +1022,7 @@ virPCIDeviceReset(virPCIDevicePtr dev,
  return -1;
  }
  
-if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) {

+if (activeDevs && virPCIDeviceListFind(activeDevs, >address)) {
  virReportError(VIR_ERR_INTERNAL_ERROR,
 _("Not resetting active device %s"), dev->name);
  return -1;
@@ -1294,7 +1294,7 @@ virPCIDeviceDetach(virPCIDevicePtr dev,
  if (virPCIProbeStubDriver(dev->stubDriver) < 0)
  return -1;
  
-if (activeDevs && virPCIDeviceListFind(activeDevs, dev)) {

+if (activeDevs && virPCIDeviceListFind(activeDevs, >address)) {
  virReportError(VIR_ERR_INTERNAL_ERROR,
 _("Not detaching active device %s"), dev->name);
  return -1;
@@ -1306,7 +1306,7 @@ virPCIDeviceDetach(virPCIDevicePtr dev,
  /* Add *a copy of* the dev into list inactiveDevs, if
   * it's not already there.
   */
-if (inactiveDevs && !virPCIDeviceListFind(inactiveDevs, dev)) {
+if (inactiveDevs && !virPCIDeviceListFind(inactiveDevs, >address)) {
  VIR_DEBUG("Adding PCI device %s to inactive list", dev->name);
  if (virPCIDeviceListAddCopy(inactiveDevs, dev) < 0)
  return -1;
@@ -1324,7 +1324,7 @@ virPCIDeviceReattach(virPCIDevicePtr dev,
   virPCIDeviceListPtr activeDevs,
   virPCIDeviceListPtr inactiveDevs)
  {
-if (activeDevs && 

Re: [PATCH RESEND 18/20] virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListSteal()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Laine Stump 



Re: [PATCH RESEND 19/20] virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListDel()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

This change will allow us to remove PCI devices from a list
without the need of a PCI Device object, which will be need
in the next patch.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Laine Stump 







Re: [PATCH RESEND 16/20] virpci.c: use virPCIDeviceAddressPtr in virPCIDeviceListFindIndex()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

We're going to need a way to remove a PCI Device from a list without having
a valid virPCIDevicePtr, because the device is missing from the host. This
means that virPCIDevicesListDel() must operate with a PCI Device address
instead.

Turns out that virPCIDevicesListDel() and its related functions only use
the virPCIDeviceAddressPtr of the virPCIDevicePtr, so this change is
simple to do and will not cause hassle in all other callers. Let's
start adapting virPCIDeviceListFindIndex() and crawl our way up to
virPCIDevicesListDel().

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Laine Stump 

(also Declared-should-have-been-done-this-way-in-the-first-place-by: 
Laine Stump  :-))



---
  src/util/virpci.c | 15 ---
  src/util/virpci.h |  2 +-
  2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/util/virpci.c b/src/util/virpci.c
index 7143380348..1554acffb6 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -1745,7 +1745,7 @@ virPCIDevicePtr
  virPCIDeviceListSteal(virPCIDeviceListPtr list,
virPCIDevicePtr dev)
  {
-return virPCIDeviceListStealIndex(list, virPCIDeviceListFindIndex(list, 
dev));
+return virPCIDeviceListStealIndex(list, virPCIDeviceListFindIndex(list, 
>address));
  }
  
  void

@@ -1756,16 +1756,17 @@ virPCIDeviceListDel(virPCIDeviceListPtr list,
  }
  
  int

-virPCIDeviceListFindIndex(virPCIDeviceListPtr list, virPCIDevicePtr dev)
+virPCIDeviceListFindIndex(virPCIDeviceListPtr list,
+  virPCIDeviceAddressPtr devAddr)
  {
  size_t i;
  
  for (i = 0; i < list->count; i++) {

  virPCIDevicePtr other = list->devs[i];
-if (other->address.domain   == dev->address.domain &&
-other->address.bus  == dev->address.bus&&
-other->address.slot == dev->address.slot   &&
-other->address.function == dev->address.function)
+if (other->address.domain   == devAddr->domain &&
+other->address.bus  == devAddr->bus&&
+other->address.slot == devAddr->slot   &&
+other->address.function == devAddr->function)
  return i;
  }
  return -1;
@@ -1798,7 +1799,7 @@ virPCIDeviceListFind(virPCIDeviceListPtr list, 
virPCIDevicePtr dev)
  {
  int idx;
  
-if ((idx = virPCIDeviceListFindIndex(list, dev)) >= 0)

+if ((idx = virPCIDeviceListFindIndex(list, >address)) >= 0)
  return list->devs[idx];
  else
  return NULL;
diff --git a/src/util/virpci.h b/src/util/virpci.h
index a9c597a428..8c6776da21 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -175,7 +175,7 @@ virPCIDeviceListFindByIDs(virPCIDeviceListPtr list,
unsigned int slot,
unsigned int function);
  int virPCIDeviceListFindIndex(virPCIDeviceListPtr list,
-  virPCIDevicePtr dev);
+  virPCIDeviceAddressPtr devAddr);
  
  /*

   * Callback that will be invoked once for each file





Re: [PATCH RESEND 15/20] qemu_cgroup.c: skip absent PCI devices in qemuTeardownHostdevCgroup()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

There is no need to bother with cgroup tearing down for absent
PCI devices, given that their entries in the sysfs are already
gone.

Signed-off-by: Daniel Henrique Barboza 
---
  src/qemu/qemu_cgroup.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index f7146a71c9..050df21d87 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -467,6 +467,16 @@ qemuTeardownHostdevCgroup(virDomainObjPtr vm,
  if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_DEVICES))
  return 0;
  
+/* Skip tearing down Cgroup for hostdevs that represents absent

+ * PCI devices, e.g. SR-IOV virtual functions that were removed from
+ * the host while the domain was still running. */
+if (virHostdevIsPCIDevice(dev)) {
+const virDomainHostdevSubsysPCI *pcisrc = >source.subsys.u.pci;
+
+if (!virPCIDeviceExists(>addr))
+return 0;


I would have skipped creating the temprorary variable, since it's only 
used once, but.. eh. Potato, potahtoe.


Reviewed-by: Laine Stump 


+}
+
  if (qemuDomainGetHostdevPath(dev, , NULL) < 0)
  return -1;
  





Re: [PATCH RESEND 14/20] virhostdev.c: add virHostdevIsPCIDevice() helper

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

Add a helper to quickly determine if a hostdev is a PCI device,
instead of doing a tedius 'if' check with hostdev mode and


s/tedius/tedious/


subsys type.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Laine Stump 



Re: [PATCH RESEND 13/20] virsh-domain.c: modernize cmdDetachDevice()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

Use g_auto* pointers to avoid the need of a cleanup label. The
type of the pointer 'virDomainPtr dom' was changed to its alias
'virshDomainPtr' to allow the use of g_autoptr().

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Laine Stump 



Re: [PATCH RESEND 11/20] qemu_driver.c: modernize qemuNodeDeviceReAttach()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

Add virObjectUnref an autoptr cleanup func for virNodeDevice,
then remove all unref and free calls from qemuNodeDeviceReAttach().

Signed-off-by: Daniel Henrique Barboza 
---
  src/datatypes.h|  2 ++
  src/qemu/qemu_driver.c | 32 
  2 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/src/datatypes.h b/src/datatypes.h
index ade3779e43..7a88aba0df 100644
--- a/src/datatypes.h
+++ b/src/datatypes.h
@@ -707,6 +707,8 @@ struct _virNodeDevice {
  char *parentName;   /* parent device name */
  };
  
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(virNodeDevice, virObjectUnref);


It may seem like overkill, but I think this ^^^ should be added in a 
separate patch. That way if some other patch that uses 
g_autoptr(virNodeDevice) needs to be backported to a downstream release 
that doesn't want to take the rest of this patch's refactoring of 
qemuNodeDeviceReAttach(), they can do it.


Reviewed-by: Laine Stump 

but split the above line into a separate patch (which you can also put 
my R-b on)




+
  /**
   * _virSecret:
   *
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7581e3c8cb..f9aa93fa3e 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -12045,17 +12045,16 @@ static int
  qemuNodeDeviceReAttach(virNodeDevicePtr dev)
  {
  virQEMUDriverPtr driver = dev->conn->privateData;
-virPCIDevicePtr pci = NULL;
+g_autoptr(virPCIDevice) pci = NULL;
  virPCIDeviceAddress devAddr;
-int ret = -1;
-virNodeDeviceDefPtr def = NULL;
+g_autoptr(virNodeDeviceDef) def = NULL;
  g_autofree char *xml = NULL;
  virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr;
-virConnectPtr nodeconn = NULL;
-virNodeDevicePtr nodedev = NULL;
+g_autoptr(virConnect) nodeconn = NULL;
+g_autoptr(virNodeDevice) nodedev = NULL;
  
  if (!(nodeconn = virGetConnectNodeDev()))

-goto cleanup;
+return -1;
  
  /* 'dev' is associated with the QEMU virConnectPtr,

   * so for split daemons, we need to get a copy that
@@ -12063,36 +12062,29 @@ qemuNodeDeviceReAttach(virNodeDevicePtr dev)
   */
  if (!(nodedev = virNodeDeviceLookupByName(
nodeconn, virNodeDeviceGetName(dev
-goto cleanup;
+return -1;
  
  xml = virNodeDeviceGetXMLDesc(nodedev, 0);

  if (!xml)
-goto cleanup;
+return -1;
  
  def = virNodeDeviceDefParseString(xml, EXISTING_DEVICE, NULL);

  if (!def)
-goto cleanup;
+return -1;
  
  /* ACL check must happen against original 'dev',

   * not the new 'nodedev' we acquired */
  if (virNodeDeviceReAttachEnsureACL(dev->conn, def) < 0)
-goto cleanup;
+return -1;
  
  if (virDomainDriverNodeDeviceGetPCIInfo(def, ) < 0)

-goto cleanup;
+return -1;
  
  pci = virPCIDeviceNew();

  if (!pci)
-goto cleanup;
-
-ret = virHostdevPCINodeDeviceReAttach(hostdev_mgr, pci);
+return -1;
  
-virPCIDeviceFree(pci);

- cleanup:
-virNodeDeviceDefFree(def);
-virObjectUnref(nodedev);
-virObjectUnref(nodeconn);
-return ret;
+return virHostdevPCINodeDeviceReAttach(hostdev_mgr, pci);
  }
  
  static int






Re: [PATCH RESEND 10/20] libvirt-nodedev.c: remove return value from virNodeDeviceFree()

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

The function returns -1 on error, but no caller is actually checking
the return value. Making it 'void' makes more sense with its
current use.

Signed-off-by: Daniel Henrique Barboza 


NAK - you can't change a public function.


---
  include/libvirt/libvirt-nodedev.h |  2 +-
  src/libvirt-nodedev.c | 15 +++
  2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/libvirt/libvirt-nodedev.h 
b/include/libvirt/libvirt-nodedev.h
index eab8abf6ab..5634980a75 100644
--- a/include/libvirt/libvirt-nodedev.h
+++ b/include/libvirt/libvirt-nodedev.h
@@ -114,7 +114,7 @@ char *  virNodeDeviceGetXMLDesc 
(virNodeDevicePtr dev,
   unsigned int flags);
  
  int virNodeDeviceRef(virNodeDevicePtr dev);

-int virNodeDeviceFree   (virNodeDevicePtr dev);
+voidvirNodeDeviceFree   (virNodeDevicePtr dev);
  
  int virNodeDeviceDettach(virNodeDevicePtr dev);

  int virNodeDeviceDetachFlags(virNodeDevicePtr dev,
diff --git a/src/libvirt-nodedev.c b/src/libvirt-nodedev.c
index eb8c735a8c..fcca40f47b 100644
--- a/src/libvirt-nodedev.c
+++ b/src/libvirt-nodedev.c
@@ -445,19 +445,26 @@ virNodeDeviceListCaps(virNodeDevicePtr dev,
   * Drops a reference to the node device, freeing it if
   * this was the last reference.
   *
- * Returns the 0 for success, -1 for error.
+ * Throws a VIR_ERR_INVALID_NODE_DEVICE error if @dev is
+ * not a valid node device. Does nothing if @dev is
+ * NULL.
   */
-int
+void
  virNodeDeviceFree(virNodeDevicePtr dev)
  {
+if (!dev)
+return;
+
  VIR_DEBUG("dev=%p, conn=%p", dev, dev ? dev->conn : NULL);
  
  virResetLastError();
  
-virCheckNodeDeviceReturn(dev, -1);

+virCheckNodeDeviceGoto(dev, invalid_device);
  
  virObjectUnref(dev);

-return 0;
+
+ invalid_device:
+return;
  }
  
  





Re: [PATCH RESEND 09/20] dac, selinux: skip setting/restoring label for absent PCI devices

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

If the underlying PCI device of a hostdev does not exist in the
host (e.g. a SR-IOV VF that was removed while the domain was
running), skip security label handling for it.

This will avoid errors that happens during qemuProcessStop() time,
where a VF that was being used by the domain is not present anymore.
The restore label functions of both DAC and SELinux drivers will
trigger errors in virPCIDeviceNew().

Signed-off-by: Daniel Henrique Barboza 


This is fine as far as it goes, but what about 
AppArmorSetSecurityHostdevLabel() (and also whatever it is that's going 
on in the get_files() function in virt-aa-helper.c). You likely don't 
have a setup to test it, but it seems pretty straightforward to 
extrapolate that if you should skip setting the selinux and DAC labels 
when a device doesn't exist, you can probably do the same thing for 
AppArmor.


Reviewed-by: Laine Stump 

but add another patch that fixes the problem for AppArmor too.

(Also, all the code repetition here makes me think that there must be a 
better way to do this and reduce all the boilerplate, but I think it 
would be better to just make these changes and then look at eliminating 
the boilerplate afterwards, rather than re-structuring the code (which 
could create regressions) while adding this new concept on top of it.



---
  src/security/security_dac.c | 14 --
  src/security/security_selinux.c | 14 --
  2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/src/security/security_dac.c b/src/security/security_dac.c
index 0085982bb1..a2528aeb2d 100644
--- a/src/security/security_dac.c
+++ b/src/security/security_dac.c
@@ -1266,7 +1266,12 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr,
  }
  
  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: {

-g_autoptr(virPCIDevice) pci = virPCIDeviceNew(>addr);
+g_autoptr(virPCIDevice) pci = NULL;
+
+if (!virPCIDeviceExists(>addr))
+break;
+
+pci = virPCIDeviceNew(>addr);
  
  if (!pci)

  return -1;
@@ -1422,7 +1427,12 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr 
mgr,
  }
  
  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: {

-g_autoptr(virPCIDevice) pci = virPCIDeviceNew(>addr);
+g_autoptr(virPCIDevice) pci = NULL;
+
+if (!virPCIDeviceExists(>addr))
+break;
+
+pci = virPCIDeviceNew(>addr);
  
  if (!pci)

  return -1;
diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c
index 99adf08a15..c018c0708a 100644
--- a/src/security/security_selinux.c
+++ b/src/security/security_selinux.c
@@ -2101,7 +2101,12 @@ 
virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr,
  }
  
  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: {

-g_autoptr(virPCIDevice) pci = virPCIDeviceNew(>addr);
+g_autoptr(virPCIDevice) pci = NULL;
+
+if (!virPCIDeviceExists(>addr))
+break;
+
+pci = virPCIDeviceNew(>addr);
  
  if (!pci)

  return -1;
@@ -2329,7 +2334,12 @@ 
virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr,
  }
  
  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI: {

-g_autoptr(virPCIDevice) pci = virPCIDeviceNew(>addr);
+g_autoptr(virPCIDevice) pci = NULL;
+
+if (!virPCIDeviceExists(>addr))
+break;
+
+pci = virPCIDeviceNew(>addr);
  
  if (!pci)

  return -1;





Re: [PATCH RESEND 06/20] virhostdev.c: virHostdevGetPCIHostDevice() now reports missing device

2021-02-21 Thread Laine Stump

On 1/18/21 2:53 PM, Daniel Henrique Barboza wrote:

Gitlab issue #72 [1] reports that removing SR-IOVs VFs before
removing the devices from the running domains can have strange
consequences. QEMU might be able to hotunplug the device inside the
guest, but Libvirt will not be aware of that, and then the guest is
now inconsistent with the domain definition.

There's also the possibility of the VFs removal not succeeding
while the domain is running but then, as soon as the domain
is shutdown, all the VFs are removed. Libvirt can't handle
the removal of the PCI devices while trying to reattach the
hostdevs, and the Libvirt daemon can be left in an inconsistent
state (see [2]).

This patch starts to address the issue related in Gitlab #72, most
notably the issue described in [2]. When shutting down a domain
with SR-IOV hostdevs that got missing, virHostdevReAttachPCIDevices()
is failing the whole process and failing to reattach all the
PCI devices, including the ones that aren't related to the VFs that
went missing. Let's make it more resilient with host changes by
changing virHostdevGetPCIHostDevice() to return an exclusive error
code '-2' for this case. virHostdevGetPCIHostDeviceList() can then
tell when virHostdevGetPCIHostDevice() failed to find the PCI
device of a hostdev and continue to make the list of PCI devices.

virHostdevReAttachPCIDevices() will now be able to proceed reattaching
all other valid PCI devices, at least. The 'ghost hostdevs' will be
handled later on.

[1] https://gitlab.com/libvirt/libvirt/-/issues/72
[2] https://gitlab.com/libvirt/libvirt/-/issues/72#note_459032148

Signed-off-by: Daniel Henrique Barboza 
---
  src/hypervisor/virhostdev.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index bd35397f2c..dbba36193b 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -220,7 +220,8 @@ virHostdevManagerGetDefault(void)
   * is returned.
   *
   * Returns: 0 on success (@pci might be NULL though),
- * -1 otherwise (with error reported).
+ * -1 otherwise (with error reported),
+ * -2 PCI device not found. @pci will be NULL
   */
  static int
  virHostdevGetPCIHostDevice(const virDomainHostdevDef *hostdev,
@@ -235,6 +236,9 @@ virHostdevGetPCIHostDevice(const virDomainHostdevDef 
*hostdev,
  hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI)
  return 0;
  
+if (!virPCIDeviceExists(>addr))

+return -2;


You've created a special return code to mean "This is a PCI device, but 
it isn't present on the host"...



+
  actual = virPCIDeviceNew(>addr);
  
  if (!actual)

@@ -270,7 +274,7 @@ virHostdevGetPCIHostDeviceList(virDomainHostdevDefPtr 
*hostdevs, int nhostdevs)
  virDomainHostdevDefPtr hostdev = hostdevs[i];
  g_autoptr(virPCIDevice) pci = NULL;
  
-if (virHostdevGetPCIHostDevice(hostdev, ) < 0)

+if (virHostdevGetPCIHostDevice(hostdev, ) == -1)
  return NULL;


...But you haven't actually used it. Will it become necessary later to 
have this special value? right now a missing device is treated exactly 
the same as if it isn't a PCI device.


I guess I can understand the conceptual desire to return an error of 
some type in this case though, and there are other places where 
something similar is done (-2 indicates some type of "odd" error), so 
I'll let it pass :-)



Reviewed-by: Laine Stump 

  
  if (!pci)