On Mon, 7 Dec 2020 19:19:20 +0530 Vikas Aggarwal <vaggar...@diamanti.com> wrote:
> Hello list vfio-users,
>
> Can someone help me understand reason that why mmap of requested address
> overlaps with MSI-X table during mmap-ing of PCIe resources.
>
> Platform : ARM64 architecture (Marvell OcteonTX2)
>
> Linux kernel: 4.14.76-22.0.0 aarch64, Page Size 64K
>
> Application : Userspace DPDK+SPDK doing mmap-ing of PCIe resources via
> pci_vfio_map_resource_primary( )
>
> http://code.dpdk.org/dpdk/v19.11/source/lib/librte_pci/rte_pci.c#L140
>     mapaddr = mmap(0x202080040000, 0x3000,
>                    PROT_READ | PROT_WRITE, MAP_SHARED, 35, 0x0);
>
> Device 0003:0d:00.0 : Samsung SSD
>
> Failure: mapaddr returned is all 0xffffffffffffffff, errno is set
> to EINVAL
>     EAL: pci_map_resource(): cannot mmap(36, 0x2020801e0000, 0x2000, 0x0): Invalid argument (0xffffffffffffffff)
>     EAL: Failed to map pci BAR0
>     EAL: 0003:0d:00.0 mapping BAR0 failed: Invalid argument
>     EAL: Requested device 0003:0d:00.0 cannot be used
>
> Cause from kernel mmap handler:
> EINVAL is returned by vfio_pci_mmap() in-kernel handler :
> https://elixir.bootlin.com/linux/v4.14.76/source/drivers/vfio/pci/vfio_pci.c#L1142
>     if (index == vdev->msix_bar) {
>         /*
>          * Disallow mmaps overlapping the MSI-X table; users don't
>          * get to touch this directly.  We could find somewhere
>          * else to map the overlap, but page granularity is only
>          * a recommendation, not a requirement, so the user needs
>          * to know which bits are real.  Requiring them to mmap
>          * around the table makes that clear.
>          */
>
>         /* If neither entirely above nor below, then it overlaps */
>         if (!(req_start >= vdev->msix_offset + vdev->msix_size ||
>               req_start + req_len <= vdev->msix_offset))
>             return -EINVAL;   <===================== Hitting this
>     }
>
> From Debug prints:
>     req_start = 0; vdev->msix_offset = 8192;
>     vdev->msix_size=144; req_len=65536, vdev->msix_offset=8192;
>
> Can someone explain me how come this overlap situation is coming and how
> can I fix it.
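To make the numbers in the debug prints concrete, here is a minimal userspace sketch of the same overlap test. The helper names (overlaps_msix, page_align, mmap_allowed) are hypothetical and exist only for this illustration. It also assumes that the length the kernel sees is the requested length rounded up to PAGE_SIZE, which would be consistent with req_len=65536 in the debug prints against the 0x2000 length in the EAL log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same condition as the v4.14 vfio_pci_mmap() check quoted above:
 * the request overlaps unless it lies entirely above or entirely
 * below the MSI-X table. */
static bool overlaps_msix(uint64_t req_start, uint64_t req_len,
                          uint64_t msix_offset, uint64_t msix_size)
{
    return !(req_start >= msix_offset + msix_size ||
             req_start + req_len <= msix_offset);
}

/* Round len up to a multiple of page_size (must be a power of two). */
static uint64_t page_align(uint64_t len, uint64_t page_size)
{
    assert(page_size && (page_size & (page_size - 1)) == 0);
    return (len + page_size - 1) & ~(page_size - 1);
}

/* Hypothetical helper: would a mapping of `want` bytes at BAR offset 0
 * pass the overlap check once its length is rounded to page_size?
 * (Assumption: the kernel only ever sees page-aligned lengths.) */
static bool mmap_allowed(uint64_t want, uint64_t page_size,
                         uint64_t msix_offset, uint64_t msix_size)
{
    uint64_t req_len = page_align(want, page_size);
    return !overlaps_msix(0, req_len, msix_offset, msix_size);
}
```

With the values from the report (msix_offset 8192, msix_size 144), a 0x2000-byte request becomes 65536 bytes on a 64K PAGE_SIZE kernel and swallows the table, so mmap_allowed(0x2000, 65536, 8192, 144) is false. On a 4K PAGE_SIZE kernel the same request stays at 8192 bytes, ends exactly where the table begins, and mmap_allowed(0x2000, 4096, 8192, 144) is true.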
The 'why' is exactly per your $Subject: previous kernels didn't allow
mmaps over the MSI-X table, which means that for a 64K PAGE_SIZE you'd
be precluded from mmap'ing anything in the first 64K of the BAR. This
restriction was removed way back in a32295c612c5 ("vfio-pci: Allow
mapping MSIX BAR"), which appeared in kernel v4.16... unfortunately
still two kernel releases newer than the ancient kernel you're based
on. We decided instead that interrupt remapping needs to protect the
system against the user possibly misprogramming the vector table via
an mmap, specifically for page size restrictions like this.

I'd advise upgrading your kernel or backporting the change; otherwise,
outside of running a 4K PAGE_SIZE kernel, there's nothing that's going
to let you mmap closer to the MSI-X vector table. Clearly userspace
tools could be fixed to use read/write accesses within the page that
contains the vector table (QEMU should already do this), but it comes
at a performance loss that might be unacceptable. Thanks,

Alex

_______________________________________________
vfio-users mailing list
vfio-users@redhat.com
https://www.redhat.com/mailman/listinfo/vfio-users