This series adds the base support to preserve a VFIO device file across
a Live Update. "Base support" means that this allows userspace to
safely preserve a VFIO device file with LIVEUPDATE_SESSION_PRESERVE_FD
and retrieve a preserved VFIO device file with
LIVEUPDATE_SESSION_RETRIEVE_FD, but the device itself is not preserved
in a fully running state across Live Update.

This series unblocks 2 parallel but related streams of work:

 - iommufd preservation across Live Update. This work spans iommufd,
   the IOMMU subsystem, and IOMMU drivers [1].

 - Preservation of VFIO device state across Live Update (config space,
   BAR addresses, power state, SR-IOV state, etc.). This work spans both
   VFIO and the core PCI subsystem.

While we need all of the above to fully preserve a VFIO device across a
Live Update without disrupting the workload on the device, this series
aims to be functional and safe enough to merge as the first incremental
step toward that goal.

Areas for Discussion
--------------------

BDF Stability across Live Update

  The PCI support for tracking preserved devices across a Live Update to
  prevent auto-probing relies on PCI segment numbers and BDFs remaining
  stable. For now I have disallowed VFs, as the BDFs assigned to VFs can
  vary depending on how the kernel chooses to allocate bus numbers. For
  non-VFs I am wondering if anything more is needed to ensure BDF
  stability across Live Update.

  While we would like to support many different systems and
  configurations in due time (including preserving VFs), I'd like to
  keep this first series constrained to simple use-cases.
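
  To make the stability requirement concrete, here is a minimal
  userspace sketch of the identity that has to match across the two
  kernels: the segment number plus bus/device/function. The struct and
  helper names (pci_addr, pci_addr_parse, pci_addr_equal) are
  hypothetical and are not taken from this series; only the devfn
  encoding (device in bits 7:3, function in bits 2:0) is standard PCI.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace type; not part of this series. */
struct pci_addr {
	uint16_t segment;	/* PCI segment (domain) number */
	uint8_t bus;
	uint8_t devfn;		/* device in bits 7:3, function in bits 2:0 */
};

/* Parse "SSSS:BB:DD.F". Returns 0 on success, -1 on malformed input. */
static int pci_addr_parse(const char *s, struct pci_addr *out)
{
	unsigned int seg, bus, dev, fn;
	char tail;

	assert(s && out);

	/* The trailing %c only converts if junk follows the function number. */
	if (sscanf(s, "%4x:%2x:%2x.%1x%c", &seg, &bus, &dev, &fn, &tail) != 4)
		return -1;
	if (dev > 0x1f || fn > 0x7)
		return -1;

	out->segment = (uint16_t)seg;
	out->bus = (uint8_t)bus;
	out->devfn = (uint8_t)((dev << 3) | fn);
	return 0;
}

/* A device is only "the same" if segment, bus and devfn all match. */
static int pci_addr_equal(const struct pci_addr *a, const struct pci_addr *b)
{
	return a->segment == b->segment && a->bus == b->bus &&
	       a->devfn == b->devfn;
}
```

  Parsing "0000:00:04.0" with this helper yields segment 0, bus 0 and
  devfn 0x20. If the next kernel assigned a different bus number to the
  same physical device, pci_addr_equal() would report a mismatch, which
  is the failure mode that makes VFs hard to support for now.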

FLB Locking

  I don't see a way to properly synchronize pci_flb_finish() with
  pci_liveupdate_incoming_is_preserved() since the incoming FLB mutex is
  dropped by liveupdate_flb_get_incoming() when it returns the pointer
  to the object, and taking pci_flb_incoming_lock in pci_flb_finish()
  could result in a deadlock due to reversing the lock ordering.
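
  The ordering problem can be illustrated with a toy, single-threaded
  lock-order tracker. This is only a model: it assumes that the
  is_preserved path takes pci_flb_incoming_lock before the FLB mutex,
  and that finish() is invoked with the FLB mutex already held, which
  is my reading of the paragraph above rather than a copy of the real
  code. All toy_* names are hypothetical.

```c
#include <assert.h>

enum toy_lock_id { PCI_INCOMING_LOCK, FLB_MUTEX, NR_TOY_LOCKS };

static int held[NR_TOY_LOCKS];
/* seen_before[a][b] != 0: lock b was once taken while lock a was held. */
static int seen_before[NR_TOY_LOCKS][NR_TOY_LOCKS];

/* Returns 0, or -1 if this acquisition inverts an order recorded earlier. */
static int toy_lock(enum toy_lock_id l)
{
	int inverted = 0;

	assert(l < NR_TOY_LOCKS);
	for (int h = 0; h < NR_TOY_LOCKS; h++) {
		if (!held[h] || h == (int)l)
			continue;
		if (seen_before[l][h])
			inverted = 1;	/* earlier: l held, then h taken */
		seen_before[h][l] = 1;
	}
	held[l] = 1;
	return inverted ? -1 : 0;
}

static void toy_unlock(enum toy_lock_id l)
{
	held[l] = 0;
}

/* Assumed order on the lookup path: PCI lock first, then FLB mutex. */
static int toy_is_preserved_path(void)
{
	int ret = toy_lock(PCI_INCOMING_LOCK);

	ret |= toy_lock(FLB_MUTEX);	/* taken inside get_incoming() */
	toy_unlock(FLB_MUTEX);		/* dropped before the pointer is used */
	toy_unlock(PCI_INCOMING_LOCK);
	return ret;
}

/* Assumed order on the finish path: FLB mutex first, then PCI lock. */
static int toy_finish_path(void)
{
	int ret = toy_lock(FLB_MUTEX);

	ret |= toy_lock(PCI_INCOMING_LOCK);
	toy_unlock(PCI_INCOMING_LOCK);
	toy_unlock(FLB_MUTEX);
	return ret;
}
```

  Running toy_is_preserved_path() and then toy_finish_path() makes the
  second call report an inversion, which is the ABBA pattern that
  could deadlock if the two paths ran concurrently.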

FLB Retrieving

  The first patch of this series includes a fix to prevent an FLB from
  being retrieved again after it is finished. I am wondering if this is the
  right approach or if subsystems are expected to stop calling
  liveupdate_flb_get_incoming() after an FLB is finished.
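
  As a rough userspace model of the first option (rejecting retrieval
  once the FLB is finished), the behavior could be sketched as below.
  The struct, the function names and the -ENOENT return value are all
  assumptions made for illustration; they are not the code or the
  error code used by the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model of an incoming FLB; every name here is hypothetical. */
struct toy_flb {
	void *obj;	/* state handed over by the previous kernel */
	int finished;	/* set once finish() has run */
};

/* Returns 0 and sets *objp, or -ENOENT once the FLB has been finished. */
static int toy_flb_get_incoming(struct toy_flb *flb, void **objp)
{
	assert(flb && objp);

	if (flb->finished)
		return -ENOENT;	/* assumed error code */

	*objp = flb->obj;
	return 0;
}

/* After this point callers must no longer be handed the object. */
static void toy_flb_finish(struct toy_flb *flb)
{
	assert(flb);

	flb->obj = NULL;	/* real code would release the object here */
	flb->finished = 1;
}
```

  With this shape, a late caller gets an error instead of a pointer to
  released state. The alternative raised above would drop the check
  and rely on every subsystem to stop calling get_incoming() itself.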

Testing
-------

The patches at the end of this series provide comprehensive selftests
for the new code added by this series. The selftests have been validated
in both a VM environment using a virtio-net PCIe device, and in a
baremetal environment on an Intel EMR server with an Intel DSA device.

Here is an example of how to run the new selftests:

vfio_pci_liveupdate_uapi_test:

  $ tools/testing/selftests/vfio/scripts/setup.sh 0000:00:04.0
  $ tools/testing/selftests/vfio/vfio_pci_liveupdate_uapi_test 0000:00:04.0
  $ tools/testing/selftests/vfio/scripts/cleanup.sh

vfio_pci_liveupdate_kexec_test:

  $ tools/testing/selftests/vfio/scripts/setup.sh 0000:00:04.0
  $ tools/testing/selftests/vfio/vfio_pci_liveupdate_kexec_test --stage 1 0000:00:04.0
  $ kexec [...]  # NOTE: distro-dependent

  $ tools/testing/selftests/vfio/scripts/setup.sh 0000:00:04.0
  $ tools/testing/selftests/vfio/vfio_pci_liveupdate_kexec_test --stage 2 0000:00:04.0
  $ tools/testing/selftests/vfio/scripts/cleanup.sh

Dependencies
------------

This series was constructed on top of several in-flight series and on
top of mm-nonmm-unstable [2].

  +-- This series
  |
  +-- [PATCH v2 00/18] vfio: selftests: Support for multi-device tests
  |   https://lore.kernel.org/kvm/[email protected]/
  |
  +-- [PATCH v3 0/4] vfio: selftests: update DMA mapping tests to use queried IOVA ranges
  |   https://lore.kernel.org/kvm/[email protected]/
  |
  +-- [PATCH v8 0/2] Live Update: File-Lifecycle-Bound (FLB) State
  |   https://lore.kernel.org/linux-mm/[email protected]/
  |
  +-- [PATCH v8 00/18] Live Update Orchestrator
  |   https://lore.kernel.org/linux-mm/[email protected]/
  |

To simplify checking out the code, this series can be found on GitHub:

  https://github.com/dmatlack/linux/tree/liveupdate/vfio/cdev/v1

Changelog
---------

v1:
 - Rebase series on top of LUOv8 and VFIO selftests improvements
 - Drop commits to preserve config space fields across Live Update.
   These changes require changes to the PCI layer. For example,
   preserving rbars could lead to an inconsistent device state until
   device BAR addresses are preserved across Live Update.
 - Drop commits to preserve Bus Master Enable on the device. There's no
   reason to preserve this until iommufd preservation is fully working.
   Furthermore, preserving Bus Master Enable could lead to memory
   corruption if the device is bound to the default identity-map domain
   after Live Update.
 - Drop commits to preserve saved PCI state. This work is not needed
   until we are ready to preserve the device's config space, and
   requires more thought to make the PCI state data layout ABI-friendly.
 - Add support to skip auto-probing devices that are preserved by VFIO
   to avoid them getting bound to a different driver by the next kernel.
 - Restrict device preservation further (no VFs, no intel-graphics).
 - Various refactoring and small edits to improve readability and
   eliminate code duplication.

rfc: https://lore.kernel.org/kvm/[email protected]/

Cc: Saeed Mahameed <[email protected]>
Cc: Adithya Jayachandran <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Parav Pandit <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: William Tu <[email protected]>
Cc: Jacob Pan <[email protected]>
Cc: Lukas Wunner <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Pratyush Yadav <[email protected]>
Cc: Samiullah Khawaja <[email protected]>
Cc: Chris Li <[email protected]>
Cc: Josh Hilke <[email protected]>
Cc: David Rientjes <[email protected]>

[1] https://lore.kernel.org/linux-iommu/[email protected]/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/log/?h=mm-nonmm-unstable

David Matlack (12):
  liveupdate: luo_flb: Prevent retrieve() after finish()
  PCI: Add API to track PCI devices preserved across Live Update
  PCI: Require driver_override for incoming Live Update preserved
    devices
  vfio/pci: Notify PCI subsystem about devices preserved across Live
    Update
  vfio: Enforce preserved devices are retrieved via
    LIVEUPDATE_SESSION_RETRIEVE_FD
  vfio/pci: Store Live Update state in struct vfio_pci_core_device
  vfio: selftests: Add Makefile support for TEST_GEN_PROGS_EXTENDED
  vfio: selftests: Add vfio_pci_liveupdate_uapi_test
  vfio: selftests: Expose iommu_modes to tests
  vfio: selftests: Expose low-level helper routines for setting up
    struct vfio_pci_device
  vfio: selftests: Verify that opening VFIO device fails during Live
    Update
  vfio: selftests: Add continuous DMA to vfio_pci_liveupdate_kexec_test

Vipin Sharma (9):
  vfio/pci: Register a file handler with Live Update Orchestrator
  vfio/pci: Preserve vfio-pci device files across Live Update
  vfio/pci: Retrieve preserved device files after Live Update
  vfio/pci: Skip reset of preserved device after Live Update
  selftests/liveupdate: Move luo_test_utils.* into a reusable library
  selftests/liveupdate: Add helpers to preserve/retrieve FDs
  vfio: selftests: Build liveupdate library in VFIO selftests
  vfio: selftests: Initialize vfio_pci_device using a VFIO cdev FD
  vfio: selftests: Add vfio_pci_liveupdate_kexec_test

 MAINTAINERS                                   |   1 +
 drivers/pci/Makefile                          |   1 +
 drivers/pci/liveupdate.c                      | 248 ++++++++++++++++
 drivers/pci/pci-driver.c                      |  12 +-
 drivers/vfio/device_cdev.c                    |  25 +-
 drivers/vfio/group.c                          |   9 +
 drivers/vfio/pci/Makefile                     |   1 +
 drivers/vfio/pci/vfio_pci.c                   |  11 +-
 drivers/vfio/pci/vfio_pci_core.c              |  23 +-
 drivers/vfio/pci/vfio_pci_liveupdate.c        | 278 ++++++++++++++++++
 drivers/vfio/pci/vfio_pci_priv.h              |  16 +
 drivers/vfio/vfio.h                           |  13 -
 drivers/vfio/vfio_main.c                      |  22 +-
 include/linux/kho/abi/pci.h                   |  53 ++++
 include/linux/kho/abi/vfio_pci.h              |  45 +++
 include/linux/liveupdate.h                    |   3 +
 include/linux/pci.h                           |  38 +++
 include/linux/vfio.h                          |  51 ++++
 include/linux/vfio_pci_core.h                 |   7 +
 kernel/liveupdate/luo_flb.c                   |   4 +
 tools/testing/selftests/liveupdate/.gitignore |   1 +
 tools/testing/selftests/liveupdate/Makefile   |  14 +-
 .../include/libliveupdate.h}                  |  11 +-
 .../selftests/liveupdate/lib/libliveupdate.mk |  20 ++
 .../{luo_test_utils.c => lib/liveupdate.c}    |  43 ++-
 .../selftests/liveupdate/luo_kexec_simple.c   |   2 +-
 .../selftests/liveupdate/luo_multi_session.c  |   2 +-
 tools/testing/selftests/vfio/Makefile         |  23 +-
 .../vfio/lib/include/libvfio/iommu.h          |   2 +
 .../lib/include/libvfio/vfio_pci_device.h     |   8 +
 tools/testing/selftests/vfio/lib/iommu.c      |   4 +-
 .../selftests/vfio/lib/vfio_pci_device.c      |  60 +++-
 .../vfio/vfio_pci_liveupdate_kexec_test.c     | 255 ++++++++++++++++
 .../vfio/vfio_pci_liveupdate_uapi_test.c      |  93 ++++++
 34 files changed, 1313 insertions(+), 86 deletions(-)
 create mode 100644 drivers/pci/liveupdate.c
 create mode 100644 drivers/vfio/pci/vfio_pci_liveupdate.c
 create mode 100644 include/linux/kho/abi/pci.h
 create mode 100644 include/linux/kho/abi/vfio_pci.h
 rename tools/testing/selftests/liveupdate/{luo_test_utils.h => lib/include/libliveupdate.h} (80%)
 create mode 100644 tools/testing/selftests/liveupdate/lib/libliveupdate.mk
 rename tools/testing/selftests/liveupdate/{luo_test_utils.c => lib/liveupdate.c} (89%)
 create mode 100644 tools/testing/selftests/vfio/vfio_pci_liveupdate_kexec_test.c
 create mode 100644 tools/testing/selftests/vfio/vfio_pci_liveupdate_uapi_test.c

-- 
2.52.0.487.g5c8c507ade-goog

