A test build is available at 'ppa:fheimes/tmp-s390x'. ( For more info see https://launchpad.net/~fheimes/+archive/ubuntu/tmp-s390x ) The kernel and supporting packages for this ticket are under "linux - 5.4.0-31.35ubuntu1"
-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1874056 Title: [UBUNTU 20.04] s390x/pci: enumerate pci functions per physical adapter Status in Ubuntu on IBM z Systems: In Progress Status in linux package in Ubuntu: In Progress Bug description: SRU Justification: ================== [Impact] * Mellanox CX5 port multi-pathing is broken on s390x due to non- standard topology of PCI IDs (phys. and virtual) * The Mellanox Connect-X 5 PCI driver (mlx5) implements multi-path that can be used to combine multiple networking ports to improve performance and reliability. * For that purpose, the mlx5 driver combines PCI functions based on topology information (the function number) as determined by their PCI ID. * Currently the Linux on Z PCI bus does not reflect PCI topology information in the PCI ID. As a result, the mlx5 multi-path function is broken and cannot be activated. [Fix] * Backport 1: https://launchpadlibrarian.net/479699471/0001-s390-pci- Improve-handling-of-unset-UID.patch * Backport 2: https://launchpadlibrarian.net/479699482/0002-s390-pci- embedding-hotplug_slot-in-zdev.patch * Backport 3: https://launchpadlibrarian.net/479699492/0003-s390-pci- Expose-new-port-attribute-for-PCIe-function.patch * Backport 4: https://launchpadlibrarian.net/479699497/0004-s390-pci- adaptation-of-iommu-to-multifunction.patch * Backport 5: https://launchpadlibrarian.net/479700706/0005-s390-pci- define-kernel-parameters-for-PCI-multifunct.patch * Backport 6: https://launchpadlibrarian.net/479700712/0006-s390-pci- define-RID-and-RID-available.patch * Backport 7: https://launchpadlibrarian.net/479700739/0007-s390-pci- create-zPCI-bus.patch * Backport 8: https://launchpadlibrarian.net/479700769/0008-s390-pci- adapt-events-for-zbus.patch * Backport 9: https://launchpadlibrarian.net/479700786/0009-s390-pci- Handling-multifunctions.patch * Backport 10: https://launchpadlibrarian.net/479700794/0010-s390-pci- Do-not-disable-PF-when-VFs-exist.patch * Backport 11: https://launchpadlibrarian.net/479700798/0011-s390-pci- Documentation-for-zPCI.patch * Backport 12: https://launchpadlibrarian.net/479700799/0012-s390-pci- removes-wrong-PCI-multifunction-assignment.patch [Test Case] * Prepare an IBM z13 or LinuxONE III (or newer) system with two or more RoCE Express PCI 2(.1) adapters. * Assign the adapters (and it's virtual functions) to an LPAR. * Verify whether the physical and virtual functions are grouped in arbitrary order or in consecutive order - physical first (for example with lspci -t ...) [Regression Potential] * The regression potential can be considered as moderate, since: * It is purely s390x specific code (arch/s390/* drivers/iommu/s390-iommu.c and drivers/pci/hotplug/s390_pci_hpc.c - and some doc adjustments, too). * It largely affects zPCI, the s390x specific PCI code layer. * PCI cards available for s390x are optional cards (RoCE and zEDC) and not very wide-spread. * The situation described above affects the RoCE adapters only (Mellanox based). * The patches are also upstream accepted and available via linux-next, but to apply them to focal kernel 5.4 the above backports are needed. * However, the code is modified by several patches (12), hence there is a chance to break zPCI with them. * For upfront testing a PPA got created with a focal (master-next) kernel that incl. all the above patches. __________ Today, the enumeration of PCI functions on s390x does not reflect which functions belongs to which physical adapter. Layout of a PCI function address on Linux: 0000:00:00.0 <root complex>:<bus>:<device>.<function> On s390x, each function is presented as individual root complex today, e.g.: PCHID 0100 VF1 0000:00:00.0 PCHID 0100 VF23 0001:00:00.0 PCHID 0200 VF1 0002:00:00.0 OCHID 0100 VF17 0003:00:00.0 On other platforms, the addresses correctly reflect the actual HW configuration. Some device drivers (mlx5 for Mellanox adapters) group functions of one physical adapter by checking which PCI functions have identical values for <root complex>:<bus>:<device>. We need to use the same enumeration scheme to achieve this functionality on s390x. In this case, the two physical functions of a Mellanox adapter need to get function number 0 and 1, and all virtual functions need to use the same <root complex>:<bus> numbers with function/device numbers counting up. Required result (example with 4 VFs per PF): PCHID 0100 PF 0 0000:00:00.0 PCHID 0100 PF 1 0000:00:00.1 PCHID 0100 PF 0 VF 0 0000:00:00.2 PCHID 0100 PF 0 VF 1 0000:00:00.3 PCHID 0100 PF 0 VF 2 0000:00:00.4 PCHID 0100 PF 0 VF 3 0000:00:00.5 PCHID 0100 PF 1 VF 0 0000:00:00.6 PCHID 0100 PF 1 VF 1 0000:00:00.7 PCHID 0100 PF 1 VF 2 0000:00:00.8 PCHID 0100 PF 1 VF 3 0000:00:00.9 PCHID 0200 PF 0 0001:00:00.0 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-z-systems/+bug/1874056/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp