On 24 May 2018 at 15:59, Laszlo Ersek <ler...@redhat.com> wrote:
> On 05/24/18 15:07, Peter Maydell wrote:
>> On 24 May 2018 at 13:59, Laszlo Ersek <ler...@redhat.com> wrote:
>>> On 05/24/18 11:11, Peter Maydell wrote:
>>>> Won't it also break a guest which is just Linux loaded not via
>>>> firmware which is an aarch32 kernel without LPAE support?
>>>
>>> Does such a thing exist? (I honestly have no clue.)
>>
>> Yes, it does; LPAE isn't a mandatory kernel config option.
>> This is why we have the machine 'highmem' option, so that
>> we can run on those kernels by not putting anything above
>> the 4G boundary. Looking back at the history on that, we
>> opted at the time for "default to highmem on, and if you're
>> running an non-lpae kernel you need to turn it off manually".
>
> Ah, OK, I didn't know that.
>
>> So we can handle those kernels by just not putting ECAM
>> above 4G if highmem is false.
>
> The problem is we can have a combination of 32-bit UEFI firmware (which
> certainly lacks LPAE) and a 32-bit kernel which supports LPAE.
> Previously, you wouldn't specify highmem=off, and things would just work
> -- the firmware would simply ignore the >=4GB MMIO aperture, and use the
> 32-bit MMIO aperture only (and use the sole 32-bit ECAM). The kernel
> could then use both low and high MMIO apertures, however (I gather?).
>
> The difference with "high ECAM" is that it is *moved* (not *added*), so
> the 32-bit firmware is left with nothing for config space access. For
> booting the same combination as above, you are suddenly forced to add
> highmem=off, just to keep the ECAM low -- and that, while it keeps the
> firmware happy, prevents the LPAE-capable kernel from using the high
> MMIO aperture.
>
> So I think "highmem_ecam" should be computed like this:
>
>   highmem_ecam = highmem_ecam_machtype_default &&
>                  highmem &&
>                  (!firmware_loaded || aarch64);
>
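For reference, the condition quoted above could be spelled out as a small
helper. This is only a sketch: the struct and the function name are
hypothetical, and the field names simply mirror the identifiers in the
quoted expression rather than any existing QEMU code.

```c
#include <stdbool.h>

/* Hypothetical container; field names follow the quoted expression. */
struct ecam_inputs {
    bool highmem_ecam_machtype_default; /* machine type opts in to high ECAM */
    bool highmem;                       /* machine 'highmem' option */
    bool firmware_loaded;               /* guest is booted via firmware */
    bool aarch64;                       /* guest CPU is 64-bit */
};

/*
 * High ECAM is used only if the machine type allows it, highmem is on,
 * and we are not running 32-bit firmware (which could not reach it).
 */
static bool compute_highmem_ecam(const struct ecam_inputs *in)
{
    return in->highmem_ecam_machtype_default &&
           in->highmem &&
           (!in->firmware_loaded || in->aarch64);
}
```

With this, a 32-bit guest booted via firmware keeps the low ECAM even with
highmem=on, while a directly loaded 32-bit kernel with highmem=on gets the
high one.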
Given that the firmware is tightly coupled to the platform, we may
decide not to care about ECAM for UEFI itself, and invent a secondary
config space access mechanism that does not consume such a huge amount
of address space. For instance, legacy PCI uses a pair of I/O ports for
this, one to set the address and one to perform the actual read or
write, and we could easily implement something similar (such an
interface is problematic in an SMP context, but we don't care about
that anyway).

Just a thought - perhaps we don't care enough about 32-bit to go
through the trouble, but it would be nice if LPAE-capable 32-bit guests
could make use of the expanded PCIe config space as well.
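
To make the address/data pair idea a bit more concrete, here is a rough
sketch of what such a window could look like from the device model side.
Everything in it is an assumption made for illustration: the struct, the
function names and the toy backing store are hypothetical, and the only
borrowed fact is the ECAM offset layout (bus[27:20], device[19:15],
function[14:12], register[11:0]), reused as the encoding for the address
register.

```c
#include <stdint.h>
#include <string.h>

#define CFG_SPACE_SIZE 4096u /* PCIe config space per function */
#define SIM_FUNCTIONS  8u    /* toy backing store: bus 0, device 0 only */

/* Hypothetical two-register window; not an existing interface. */
struct cfg_window {
    uint32_t addr;                                   /* latched target */
    uint8_t  backing[SIM_FUNCTIONS * CFG_SPACE_SIZE];
};

/* Encode a target using the same bit layout as an ECAM offset. */
static uint32_t cfg_encode(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
    return ((uint32_t)bus << 20) | ((uint32_t)(dev & 0x1f) << 15) |
           ((uint32_t)(fn & 0x7) << 12) | (reg & 0xffcu);
}

/* Step 1: write to the address register. */
static void cfg_set_addr(struct cfg_window *w, uint32_t addr)
{
    w->addr = addr;
}

/* Step 2a: read the data register; absent functions read as all-ones. */
static uint32_t cfg_read_data(const struct cfg_window *w)
{
    uint32_t val = 0xffffffffu;

    if (w->addr <= sizeof(w->backing) - sizeof(val)) {
        memcpy(&val, &w->backing[w->addr], sizeof(val));
    }
    return val;
}

/* Step 2b: write the data register; writes to absent functions are dropped. */
static void cfg_write_data(struct cfg_window *w, uint32_t val)
{
    if (w->addr <= sizeof(w->backing) - sizeof(val)) {
        memcpy(&w->backing[w->addr], &val, sizeof(val));
    }
}
```

The window needs only two 32-bit registers, whatever the number of buses.
The two-step set-address-then-access sequence is not atomic, which is
exactly the SMP problem mentioned above.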