Public bug reported:

SRU Justification

[Impact]

A VM on Azure can have 14 GPUs, and each GPU may have a huge MMIO BAR,
e.g. 128 GB. Currently the boot time of such a VM can be 4+ minutes, and
most of that time is spent by the host unmapping/mapping the vBAR from/to
the pBAR when the VM clears and sets the PCI_COMMAND_MEMORY bit. Each
unmap/map operation for a 128 GB BAR takes about 1.8 seconds, and the
pci-hyperv driver and the Linux PCI subsystem flip the PCI_COMMAND_MEMORY
bit eight times (see pci_setup_device() -> pci_read_bases() and
pci_std_update_resource()), increasing the boot time by 1.8 * 8 = 14.4
seconds per GPU, i.e. 14.4 * 14 = 201.6 seconds in total.

Fix the slowness by not turning on the PCI_COMMAND_MEMORY bit in
pci-hyperv.c, so the bit stays off until the PCI device driver calls
pci_enable_device(): while the bit is off, pci_read_bases() and
pci_std_update_resource() don't cause Hyper-V to unmap/map the vBARs.
With this change, the boot time of such a VM is reduced by
1.8 * (8-1) * 14 = 176.4 seconds.
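
The arithmetic above can be reproduced with a short sketch. The constants
are the figures quoted in this report; the function name remap_delay() and
the reading of "(8-1)" as "one remaining unmap/map per GPU" are assumptions
made for illustration, not part of the kernel patch.

```python
# Back-of-the-envelope model of the boot delay described above.
# All figures come from the bug description; nothing here is measured.

SECONDS_PER_REMAP = 1.8  # one unmap/map of a 128 GB BAR on the host
FLIPS_BEFORE = 8         # PCI_COMMAND_MEMORY flips per GPU before the fix
FLIPS_AFTER = 1          # assumption: implied by the "(8-1)" term above
GPUS = 14                # GPUs in the VM


def remap_delay(flips, gpus=GPUS, seconds_per_remap=SECONDS_PER_REMAP):
    """Seconds the host spends remapping BARs (hypothetical helper)."""
    return seconds_per_remap * flips * gpus


before = remap_delay(FLIPS_BEFORE)
after = remap_delay(FLIPS_AFTER)

print(f"before the fix: {before:.1f} s")
print(f"after the fix:  {after:.1f} s")
print(f"saved:          {before - after:.1f} s")
```

Under these assumptions, roughly 25 seconds of remapping would remain after
the fix; the real saving depends on the actual BAR sizes and GPU count.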

[Test Case]

Tested by Microsoft.

[Where things could go wrong]

PCI BAR setup could fail or be incorrect.

[Other Info]

SF: #00336342

** Affects: linux-azure (Ubuntu)
     Importance: Undecided
         Status: Fix Released

** Affects: linux-azure (Ubuntu Focal)
     Importance: Undecided
         Status: New

** Affects: linux-azure (Ubuntu Impish)
     Importance: Medium
     Assignee: Tim Gardner (timg-tpi)
         Status: In Progress

** Affects: linux-azure (Ubuntu Jammy)
     Importance: Medium
     Assignee: Tim Gardner (timg-tpi)
         Status: In Progress

** Package changed: linux (Ubuntu) => linux-azure (Ubuntu)

** Also affects: linux-azure (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Also affects: linux-azure (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Also affects: linux-azure (Ubuntu Impish)
   Importance: Undecided
       Status: New

** Changed in: linux-azure (Ubuntu)
       Status: New => Fix Released

** Changed in: linux-azure (Ubuntu Jammy)
   Importance: Undecided => Medium

** Changed in: linux-azure (Ubuntu Jammy)
       Status: New => In Progress

** Changed in: linux-azure (Ubuntu Jammy)
     Assignee: (unassigned) => Tim Gardner (timg-tpi)

** Changed in: linux-azure (Ubuntu Impish)
   Importance: Undecided => Medium

** Changed in: linux-azure (Ubuntu Impish)
       Status: New => In Progress

** Changed in: linux-azure (Ubuntu Impish)
     Assignee: (unassigned) => Tim Gardner (timg-tpi)

https://bugs.launchpad.net/bugs/1972662

Title:
  [Azure] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
