On 7/17/24 09:15, David Laight wrote: > From: Stewart Hildebrand >> Sent: 16 July 2024 20:33 >> >> This series sets the default minimum resource alignment to 4k for memory >> BARs. In preparation, it makes an optimization and addresses some corner >> cases observed when reallocating BARs. I consider the prepapatory >> patches to be prerequisites to changing the default BAR alignment. > > Should the BARs be page aligned on systems with large pages? > At least as an option for hypervisor pass-through and any than can be mmap()ed > into userspace.
It is sort of an option already using the pci=resource_alignment= option, but you'd need to spell out every device and manually set the alignment value, and you'd still end up with fake BAR sizes. I had actually prepared locally a patch to make this less painful to do and preserve the BAR size (introduce "pci=resource_alignment=all" option), but I'd like Bjorn's opinion before sending since there has been some previous reluctance to making changes to the pci=resource_alignment= option [2]. [2] https://lore.kernel.org/linux-pci/20160929115422.GA31048@localhost/ Anyway, 4k is more defensible because that is what the PCIe 6.1 spec calls out, and that is better than the current situation of no default minimum alignment. I feel PAGE_SIZE is also justified, and that is why the actual patch now says max(SZ_4K, PAGE_SIZE) as you pointed out elsewhere. This is a change from v1 that simply had 4k (sorry, I forgot to update the cover letter). PowerNV has been using PAGE_SIZE since commits 0a701aa63784 and 382746376993, I think with 64k page size. I don't have a strong opinion whether the common default should be max(SZ_4K, PAGE_SIZE) or simply SZ_4K. > Does any hardware actually have 'silly numbers' of small memory BARs? > > I have a vague memory of some ethernet controllers having lots of (?) > virtual devices that might have separate registers than can be mapped > out to a hypervisor. > Expanding those to a large page might be problematic - but needed for > security. This series does not change alignment of SRIOV / VF BARs. See commits 62d9a78f32d9, ea629d873f3e, and PCIe 6.1 spec section 9.2.1.1.1. > For more normal hardware just ensuring that two separate targets don't share > a page while allowing (eg) two 1k BAR to reside in the same 64k page would > give some security. Allow me to understand this better, with an example: PCI Device A BAR 1 (1k) BAR 2 (1k) PCI Device B BAR 1 (1k) BAR 2 (1k) We align all BARs to 4k. Additionally, are you saying it would be ok to let both device A BARs to reside in the same 64k page, while device B BARs would need to reside in a separate 64k page? I.e. having two levels of alignment: PAGE_SIZE on a per-device basis, and 4k on a per-BAR basis? If I understand you correctly, there's currently no logic in the PCI subsystem to easily support this, so that is a rather large ask. I'm also not sure that it's necessary. > Aligning a small MSIX BAR is unlikely to have any effect on the address > space utilisation (for PCIe) since the bridge will assign a power of two > sized block - with a big pad (useful for generating pcie errors!) > > David