This bug is awaiting verification that the linux-mtk/5.15.0-1030.34
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy-linux-mtk' to
'verification-done-jammy-linux-mtk'. If the problem still exists, change
the tag 'verification-needed-jammy-linux-mtk' to
'verification-failed-jammy-linux-mtk'.
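
The tag change described above can be sketched as a small helper. This is
only an illustration of the renaming rule; the function name and its
parameters are hypothetical and it does not talk to Launchpad.

```python
def verification_tags(tags, series, package, solved):
    """Return a copy of `tags` with the verification-needed tag replaced.

    `solved` is True when the -proposed kernel fixes the problem.
    Tags other than the matching verification-needed tag are kept as-is.
    """
    needed = f"verification-needed-{series}-{package}"
    outcome = "done" if solved else "failed"
    replacement = f"verification-{outcome}-{series}-{package}"
    return [replacement if tag == needed else tag for tag in tags]
```

For example, calling it with series "jammy" and package "linux-mtk" turns
'verification-needed-jammy-linux-mtk' into the done or failed variant.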


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-mtk-v2 
verification-needed-jammy-linux-mtk

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853306

Title:
  [22.04 FEAT] Enhanced Interpretation for PCI Functions on s390x -
  kernel part

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Jammy:
  Fix Released
Status in linux source package in Kinetic:
  Won't Fix
Status in linux source package in Lunar:
  Fix Released
Status in linux source package in Mantic:
  Fix Released

Bug description:
  [ Impact ]

   * Currently the PCI passthrough implementation for s390x is based on
     intercepting PCI I/O instructions, which leads to a reduced I/O performance
     compared to the execution of PCI instructions directly in LPAR.

   * Hence users may face I/O bottlenecks when using PCI devices in passthrough
     mode based on the current implementation.

   * To avoid this and improve performance, interpretive execution of the
     PCI store and PCI load instructions is enabled.

   * A further improvement is achieved by enabling the
     Adapter-Event-Notification Interpretation (AENI).

   * Since LTS releases are the main focus for stable and long-running KVM
     workloads, and the next LTS is still some time away, it is highly
     desirable to get this backported to the jammy kernel.

  [ Test Plan ]

  * Hardware used: z14 or greater LPAR, PCI-attached devices
    (RoCE VFs, ISM devices, NVMe drive)

  * Setup: Both the kernel and QEMU features are needed for the feature
    to function (an upstream QEMU can be used to verify the kernel early),
    and the facility is only available on z14 or newer.
    When any of those pieces is missing,
    the interpretation facility will not be used.
    When both the kernel and QEMU features are included in their respective
    packages, and running in an LPAR on a z14 or newer machine,
    this feature will be enabled automatically.
    Existing supported devices should behave as before with no changes
    required by an end-user (e.g. no changes to libvirt domain definitions)
    -- but will now make use of the interpretation facility.
    Additionally, ISM devices will now be eligible for vfio-pci passthrough
    (where before QEMU would exit on error if attempting to provide an ISM
    device for vfio-pci passthrough, preventing the guest from starting)
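
  The enablement conditions in the setup item above can be summarized as a
  simple predicate. This is a sketch only: the HostSetup type, its field
  names and the integer encoding of the machine generation (14 for z14)
  are hypothetical and are not part of the kernel or QEMU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSetup:
    kernel_has_feature: bool   # kernel package carries the interpretation support
    qemu_has_feature: bool     # QEMU package carries the matching support
    machine_generation: int    # hypothetical encoding: 14 means z14, 15 means z15
    runs_in_lpar: bool         # host runs in an LPAR


def interpretation_expected(setup: HostSetup) -> bool:
    """True only when every piece named in the setup item is present."""
    return (
        setup.kernel_has_feature
        and setup.qemu_has_feature
        and setup.machine_generation >= 14
        and setup.runs_in_lpar
    )
```

  If any single condition is false, the expectation is that passthrough
  still works as before, just without the interpretation facility.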

  * Testing will include the following scenarios, repeated each for RoCE,
    ISM and NVMe:

    1) Testing of basic device passthrough (create a VM with a vfio-pci
       device as part of the libvirt domain definition, passing through
       a RoCE VF, an ISM device, or an NVMe drive. Verify that the device
       is available in the guest and functioning)
    2) Testing of device hotplug/unplug (create a VM with a vfio-pci device,
       virsh detach-device to remove the device from the running guest,
       verify the device is removed from the guest, then virsh attach-device
       to hotplug the device to the guest again, verify the device functions
       in the guest)
    3) Host power off testing: Power off the device from the host, verify
       that the device is unplugged from the guest as part of the poweroff
    4) Guest power off testing: Power off the device from within the guest,
       verify that the device is unusable in the guest,
       power the device back on within the guest and verify that the device
       is once again usable.
    5) Guest reboot testing: (create a VM with a vfio-pci device,
       verify the device is in working condition, reboot the guest,
       verify that the device is still usable after reboot)
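
  The hotplug/unplug scenario (2) could be driven along these lines. This
  is a minimal sketch that only builds the libvirt hostdev XML and the
  virsh command lines; it does not run them. The guest name, XML file path
  and PCI address used in any call are placeholders, and the function
  names are hypothetical.

```python
import xml.etree.ElementTree as ET


def hostdev_xml(domain: int, bus: int, slot: int, function: int) -> str:
    """Build a libvirt <hostdev> element for a host PCI function."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain:04x}",
        bus=f"0x{bus:02x}",
        slot=f"0x{slot:02x}",
        function=f"0x{function:x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


def hotplug_cycle(guest: str, xml_path: str) -> list:
    """Return the virsh commands for one unplug/replug cycle, in order."""
    return [
        ["virsh", "detach-device", guest, xml_path, "--live"],
        ["virsh", "attach-device", guest, xml_path, "--live"],
    ]
```

  The XML would be written to xml_path first, and the device state in the
  guest checked after each of the two commands.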

  Testing will include the following scenarios specifically for ISM
  devices:

  1) Testing of SMC-D v1 fallback: Using 2 ISM devices on the same VCHID
     that share a PNETID, create 2 guests and pass one ISM device
     via vfio-pci device to each guest.
     Establish TCP connectivity between the 2 guests using the libvirt
     default network, and then use smc_run
     (https://manpages.ubuntu.com/manpages/jammy/man8/smc_run.8.html)
     to run an iperf workload between the 2 guests (will include both
     short workloads and longer-running workloads).
     Verify that SMC-D transfer was used between the guests instead
     of TCP via 'smcd stats' 
     (https://manpages.ubuntu.com/manpages/jammy/man8/smcd.8.html)

  2) Testing of SMC-D v2: Same as above,
     but using 2 ISM devices on the same VCHID that have no PNETID specified
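
  The two ISM scenarios above share the same command sequence, which can
  be sketched as follows. The commands are only built, not executed. The
  choice of iperf3 as the iperf binary, the example durations and the
  function names are assumptions; the description only says "iperf" and
  "short" and "longer-running" workloads.

```python
def smc_server_command(tool: str = "iperf3") -> list:
    """Command for the receiving guest: iperf server wrapped in smc_run."""
    return ["smc_run", tool, "-s"]


def smc_client_command(server_ip: str, seconds: int, tool: str = "iperf3") -> list:
    """Command for the sending guest: iperf client wrapped in smc_run."""
    return ["smc_run", tool, "-c", server_ip, "-t", str(seconds)]


def smcd_stats_command() -> list:
    """Command to inspect SMC-D statistics after the workload has run."""
    return ["smcd", "stats"]


# Hypothetical durations for a short and a longer-running workload.
EXAMPLE_DURATIONS_SECONDS = (10, 600)
```

  After each run, the output of 'smcd stats' on both guests is what shows
  whether the traffic went over SMC-D rather than falling back to TCP.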

  Testing will include the following scenarios specifically for RoCE
  devices:

  1) Ping testing: Using 2 RoCE VFs that share a common network,
     create 2 guests and pass one RoCE device to each guest.
     Assign IP addresses within each guest to the associated TCP interface,
     perform a ping between the guests to verify connectivity.

  2) Iperf testing: Similar to the above, but instead establish an iperf
     connection between the 2 guests and verify that the workload
     is successful / no errors.
     Will include both short workloads and longer-running workloads.
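
  The RoCE ping check could be scripted roughly like this. It is a sketch
  under assumptions: the peer address and packet count are placeholders,
  the function names are hypothetical, and success is judged only by the
  exit status of ping.

```python
import subprocess


def ping_command(peer_ip: str, count: int = 4) -> list:
    """Build a ping command that sends `count` echo requests to the peer."""
    return ["ping", "-c", str(count), peer_ip]


def succeeded(command: list, runner=subprocess.run) -> bool:
    """Run the command and report whether it exited with status 0.

    `runner` is injectable so the check can be exercised without a network.
    """
    return runner(command).returncode == 0
```

  Run inside one guest, succeeded(ping_command(<peer address>)) would
  confirm basic connectivity before the iperf workload is started.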

  Testing will include the following scenario specifically for NVMe
  devices:

  1) Fio testing: Using a NVMe drive passed to the guest via vfio-pci,
     run a series of fio tests against the device from within the guest, 
     verifying that the workload is successful / no errors.
     Will include both short workloads and longer-running workloads.
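
  One way to script the fio runs and the "no errors" check is sketched
  below. The job name, block size, I/O pattern and device path passed in
  are assumptions, not values from the test plan. Note that a write
  workload against a raw device destroys the data on it.

```python
import json


def fio_command(device: str, seconds: int, rw: str = "randrw") -> list:
    """Build a time-based fio command with JSON output for one device."""
    return [
        "fio",
        "--name=nvme-passthrough",   # hypothetical job name
        f"--filename={device}",
        f"--rw={rw}",
        "--bs=4k",
        "--ioengine=libaio",
        "--direct=1",
        "--time_based",
        f"--runtime={seconds}",
        "--output-format=json",
    ]


def fio_job_errors(json_text: str) -> dict:
    """Map job name to error code for every job that reported an error."""
    report = json.loads(json_text)
    return {
        job["jobname"]: job["error"]
        for job in report["jobs"]
        if job["error"] != 0
    }
```

  An empty result from fio_job_errors(), together with a zero exit status
  from fio, would count as a successful run in this sketch.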

  [ Where problems could occur ]

   * The modifications do not change the way users or APIs have to make
     use of PCI passthrough, only the internal implementation got modified.

   * The vast majority of the changed or added code is s390x-specific,
     under arch/s390 and drivers/s390.

   * However, some common code is touched as well:

   * 'kvm: use kvfree() in kvm_arch_free_vm()' touches
     arch/arm64/include/asm/kvm_host.h, arch/arm64/kvm/arm.c,
     arch/x86/include/asm/kvm_host.h, arch/x86/kvm/x86.c and
     include/linux/kvm_host.h. It switches kvm_arch_free_vm() from kfree()
     to kvfree(), which allows the common variant to be used. This change
     has been upstream since v5.16 and is therefore well established.

   * 'vfio-pci/zdev: add open/close device hooks' touches
     drivers/vfio/pci/vfio_pci_core.c, drivers/vfio/pci/vfio_pci_zdev.c and
     include/linux/vfio_pci_core.h, adding code that introduces device hooks.
     It has been upstream since kernel 6.0.

   * 'KVM: s390: pci: provide routines for en-/disabling interrupt forwarding'
     expands a single #if statement in include/linux/sched/user.h.

   * 'KVM: s390: add KVM_S390_ZPCI_OP to manage guest zPCI devices'
     adds the s390x-specific KVM_S390_ZPCI_OP and its definition to
     include/uapi/linux/kvm.h.

   * And 'vfio-pci/zdev: different maxstbl for interpreted devices' and
     'vfio-pci/zdev: add function handle to clp base capability' expand
     s390x-specific (aka z-specific aka zdev) device structs in
     include/uapi/linux/vfio_zdev.h.

   * This shows that the vast majority of modifications are s390x specific,
     even in most of the common code files.

   * The remaining modifications in the (generally) common code files are
     related to the newly introduced kernel option 'CONFIG_VFIO_PCI_ZDEV_KVM'
     and documentation.

   * The s390x changes are more significant and could harm not only
     passthrough itself for zPCI devices, but also KVM virtualization in
     general.

   * In addition to these kernel changes, qemu modifications are needed
     as well (addressed in LP#1853307), so this modified kernel
     must be tested in combination with the updated qemu package.
     - The qemu autopkgtest will be a good fit to identify any regressions,
     also in the kernel.
     - In addition, some passthrough-related tests will be done by IBM.

  __________

  The PCI Passthrough implementation is based on intercepting PCI I/O
  instructions, which leads to a reduced I/O performance compared to
  execution of PCI instructions in LPAR.
  For improved performance the interpretive execution of the PCI store and
  PCI load instructions is enabled.
  Further improvement is achieved by enabling the Adapter-Event-Notification

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1853306/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
