On 7/17/24 09:15, David Laight wrote:
> From: Stewart Hildebrand
>> Sent: 16 July 2024 20:33
>>
>> This series sets the default minimum resource alignment to 4k for memory
>> BARs. In preparation, it makes an optimization and addresses some corner
>> cases observed when reallocating BARs. I consider the prepapatory
>> patches to be prerequisites to changing the default BAR alignment.
> 
> Should the BARs be page aligned on systems with large pages?
> At least as an option for hypervisor pass-through and any than can be mmap()ed
> into userspace.

It is sort of an option already using the pci=resource_alignment=
option, but you'd need to spell out every device and manually set the
alignment value, and you'd still end up with fake BAR sizes. I had
actually prepared locally a patch to make this less painful to do and
preserve the BAR size (introduce "pci=resource_alignment=all" option),
but I'd like Bjorn's opinion before sending since there has been some
previous reluctance to making changes to the pci=resource_alignment=
option [2].

[2] https://lore.kernel.org/linux-pci/20160929115422.GA31048@localhost/

Anyway, 4k is more defensible because that is what the PCIe 6.1 spec
calls out, and that is better than the current situation of no default
minimum alignment.

I feel PAGE_SIZE is also justified, and that is why the actual patch now
says max(SZ_4K, PAGE_SIZE) as you pointed out elsewhere. This is a
change from v1 that simply had 4k (sorry, I forgot to update the cover
letter). PowerNV has been using PAGE_SIZE since commits 0a701aa63784 and
382746376993, I think with 64k page size. I don't have a strong opinion
whether the common default should be max(SZ_4K, PAGE_SIZE) or simply
SZ_4K.

> Does any hardware actually have 'silly numbers' of small memory BARs?
> 
> I have a vague memory of some ethernet controllers having lots of (?)
> virtual devices that might have separate registers than can be mapped
> out to a hypervisor.
> Expanding those to a large page might be problematic - but needed for 
> security.

This series does not change alignment of SRIOV / VF BARs. See commits
62d9a78f32d9, ea629d873f3e, and PCIe 6.1 spec section 9.2.1.1.1.

> For more normal hardware just ensuring that two separate targets don't share
> a page while allowing (eg) two 1k BAR to reside in the same 64k page would
> give some security.

Allow me to understand this better, with an example:

PCI Device A
    BAR 1 (1k)
    BAR 2 (1k)

PCI Device B
    BAR 1 (1k)
    BAR 2 (1k)

We align all BARs to 4k. Additionally, are you saying it would be ok to
let both device A BARs to reside in the same 64k page, while device B
BARs would need to reside in a separate 64k page? I.e. having two levels
of alignment: PAGE_SIZE on a per-device basis, and 4k on a per-BAR
basis?

If I understand you correctly, there's currently no logic in the PCI
subsystem to easily support this, so that is a rather large ask. I'm
also not sure that it's necessary.

> Aligning a small MSIX BAR is unlikely to have any effect on the address
> space utilisation (for PCIe) since the bridge will assign a power of two
> sized block - with a big pad (useful for generating pcie errors!)
> 
>       David

Reply via email to