On 29/12/2016 10:25, Liang Li wrote:
> x86-64 is currently limited physical address width to 46 bits, which
> can support 64 TiB of memory. Some vendors require to support more for
> some use case. Intel plans to extend the physical address width to
> 52 bits in some of the future products.
>
> The current EPT implementation only supports 4 level page table, which
> can support maximum 48 bits physical address width, so it's needed to
> extend the EPT to 5 level to support 52 bits physical address width.
>
> This patchset has been tested in the SIMICS environment for 5 level
> paging guest, which was patched with Kirill's patchset for enabling
> 5 level page table, with both the EPT and shadow page support. I just
> covered the booting process, the guest can boot successfully.
>
> Some parts of this patchset can be improved. Any comments on the design
> or the patches would be appreciated.

I will review the patches. They seem fairly straightforward.

However, I am worried about the design of the 5-level page table feature
with respect to migration.

Processors that support the new LA57 mode can write
57-canonical/48-noncanonical linear addresses to some registers even
when LA57 mode is inactive. This is true even of unprivileged
instructions, in particular WRFSBASE/WRGSBASE.

This is fairly bad because, if a guest performs such a write (because of
a bug or because of malice), it will not be possible to migrate the
virtual machine to a machine that lacks LA57 mode.

Ordinarily, hypervisors trap CPUID to hide features that are only
present in some processors of a heterogeneous cluster, and the
hypervisor also traps for example CR4 writes to prevent enabling
features that were masked away. In this case, however, the only way for
the hypervisor to prevent the write would be to run the guest with
CR4.FSGSBASE=0 and trap all executions of WRFSBASE/WRGSBASE. This might
have negative effects on performance for workloads that use the
instructions.

Of course, this is a problem even without your patches. However, I think
it should be addressed first. I am seriously thinking of blacklisting
FSGSBASE completely on LA57 machines until the above is fixed in
hardware.

Paolo
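
To make the "57-canonical/48-noncanonical" condition concrete, here is a
minimal C sketch. It is not code from the patchset or from KVM; the helper
names is_canonical() and blocks_migration_to_la48() are hypothetical and
exist only for illustration. It assumes the usual x86-64 rule that an
address is canonical for an N-bit linear-address width when bits 63 down to
N-1 are all equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if addr is canonical for a linear-address
 * width of va_bits, i.e. bits 63..(va_bits - 1) are all 0 or all 1.
 */
static bool is_canonical(uint64_t addr, unsigned va_bits)
{
    assert(va_bits >= 1 && va_bits <= 64);

    /* Keep only bits 63..(va_bits - 1), moved down to bit 0. */
    uint64_t upper = addr >> (va_bits - 1);
    uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

    return upper == 0 || upper == all_ones;
}

/*
 * Hypothetical helper: true for a value that an LA57-capable CPU would
 * accept (57-bit canonical) but a 48-bit-only CPU would reject. A guest
 * FS/GS base holding such a value is the case that breaks migration.
 */
static bool blocks_migration_to_la48(uint64_t base)
{
    return is_canonical(base, 57) && !is_canonical(base, 48);
}
```

For example, 0x0000800000000000 has bit 47 set and bits 63..48 clear, so
blocks_migration_to_la48() returns true for it, while
0x00007fffffffffff and 0xffff800000000000 are canonical under both widths
and are harmless.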