Public bug reported:

Description:   s390x/pci: Honor vfio DMA limiting
Symptom:       vfio-pci device on s390 enters error state
Problem:       Kernel commit 492855939bdb added a limit to the number of
               concurrent DMA requests for a vfio container.  However, lazy
               unmapping in s390 can in fact cause quite a large number of
               outstanding DMA requests to build up prior to being purged,
               potentially the entire guest DMA space.  This results in
               unexpected errors seen in qemu such as 'VFIO_MAP_DMA failed:
               No space left on device'
Solution:      The solution requires changes to both the kernel and qemu.  For
               qemu, add functionality to get the number of allowable DMA
               requests via the VFIO_IOMMU_GET_INFO ioctl and then ensure
               that the guest is told to refresh mappings before exceeding
               the vfio limit.
Reproduction:  Put a vfio-pci device on s390 under I/O load
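
For illustration only, the qemu side of the solution can be sketched as
below.  This is not the code from the attached patches.  It assumes the
limit is reported as a capability in the chain returned by
VFIO_IOMMU_GET_INFO (taken here to be VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL,
id 3) and mirrors the UAPI struct layouts locally so that it builds
without a recent <linux/vfio.h>.  The names find_dma_avail, dma_budget
and dma_budget_take are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the <linux/vfio.h> layouts (assumption, see above). */
struct cap_header {
    uint16_t id;
    uint16_t version;
    uint32_t next;          /* offset of next capability, 0 ends the chain */
};

struct iommu_type1_info {
    uint32_t argsz;
    uint32_t flags;
    uint64_t iova_pgsizes;
    uint32_t cap_offset;    /* offset of first capability from start of info */
};

struct dma_avail_cap {
    struct cap_header header;
    uint32_t avail;         /* number of DMA mappings still allowed */
};

#define INFO_CAPS_FLAG   (1u << 1)  /* mirrors VFIO_IOMMU_INFO_CAPS */
#define CAP_ID_DMA_AVAIL 3          /* mirrors VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL */

/*
 * Walk the capability chain in a buffer already filled in by
 * VFIO_IOMMU_GET_INFO and extract the available-DMA count.
 * Returns false if the capability is absent or the chain is malformed.
 */
static bool find_dma_avail(const void *buf, size_t len, uint32_t *avail)
{
    struct iommu_type1_info info;
    uint32_t off;

    if (len < sizeof(info)) {
        return false;
    }
    memcpy(&info, buf, sizeof(info));
    if (!(info.flags & INFO_CAPS_FLAG)) {
        return false;
    }

    for (off = info.cap_offset; off != 0; ) {
        struct cap_header hdr;

        if (off > len || len - off < sizeof(hdr)) {
            return false;
        }
        memcpy(&hdr, (const uint8_t *)buf + off, sizeof(hdr));

        if (hdr.id == CAP_ID_DMA_AVAIL) {
            struct dma_avail_cap cap;

            if (len - off < sizeof(cap)) {
                return false;
            }
            memcpy(&cap, (const uint8_t *)buf + off, sizeof(cap));
            *avail = cap.avail;
            return true;
        }
        /* Guard against loops: offsets must move forward. */
        if (hdr.next != 0 && hdr.next <= off) {
            return false;
        }
        off = hdr.next;
    }
    return false;
}

/* Hypothetical per-container accounting of the remaining mappings. */
struct dma_budget {
    uint32_t avail;
};

/*
 * Take one mapping from the budget.  A false return means the limit
 * would be exceeded, so the guest should be told to refresh mappings.
 */
static bool dma_budget_take(struct dma_budget *b)
{
    if (b->avail == 0) {
        return false;
    }
    b->avail--;
    return true;
}
```

In this sketch the caller would seed dma_budget.avail from
find_dma_avail() once per container, call dma_budget_take() before each
map request, and credit the budget back when mappings are released.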

This QEMU issue is related to the kernel issue in launchpad bug #1907421.
Backports are attached for a subset of the patches required for this fix.
Backports were needed for 3 main reasons:
1) For the header sync, I suspect you only want the minimal set of changes
needed
2) There is a missing upstream commit (408b55db8be3) that re-organizes the
location of 2 s390-pci header files, causing conflicts
3) Adjustments had to be made due to the QEMU build system change (meson)

I initially performed the backport against 4.2/focal-devel; the same
patches and process will also apply cleanly to 5.0/groovy-devel.  There
should be nothing required for hirsute as everything is already in
upstream QEMU 5.2.

In summary:
53ba2eee52bf: Backport as patch 0001.  Rather than doing a full header sync, 
update ONLY the header change needed for the DMA fix.  See attached patch 0001.
3ab7a0b40d4b: cherry-pick works
7486a62845b1: cherry-pick works
cd7498d07fbb: Backport as patch 0004.  This upstream commit added a new part 
using meson, which does not exist in 5.0.
37fa32de7073: Backport as patch 0005.  This was mainly due to conflicts with a 
missing patch that relocated some include files.
77280d33bc9c: Backport as patch 0006.  This was due to different build system + 
CONFIG_DEVICES doesn't exist.

As such, I have attached patches 0001, 0004, 0005 and 0006.  Please
cherry-pick upstream commits 3ab7a0b40d4b and 7486a62845b1 for patches
0002 and 0003.

To verify, I applied the patches provided and cherry-picks against both
focal-devel and groovy-devel.  In each case, for the host system I used
the groovy kernel Frank provided in launchpad bug #1907421 which
includes the kernel portion of this fix -- using these together, I
verified that the DMA limit is being read in and honored appropriately
by QEMU, and I can no longer trigger an overrun of the DMA space when a
guest pushes heavy data transfer via PCI (no errors in log, no transfer
stalls).

Also, as related to the last patch of the set, I further verified that
no build errors are encountered when configured with
--without-default-devices.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Skipper Bug Screeners (skipper-screen-team)
         Status: New


** Tags: architecture-s39064 bugnameltc-190223 severity-high 
targetmilestone-inin---

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1913395

Title:
  [UBUNTU 21.04] qemu s390x/pci: Honor vfio DMA limiting

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1913395/+subscriptions
