* Kirill A. Shutemov <kirill.shute...@linux.intel.com> wrote:

> This patch addresses a shortcoming in the current boot process on machines
> that support 5-level paging.
> 
> If the bootloader enables 64-bit mode with 4-level paging, we need to
> switch over to 5-level paging. The switch requires disabling paging.
> It works fine if the kernel itself is loaded below 4G.
> 
> If the bootloader puts the kernel above 4G (not sure if anybody does this),
> we would lose control as soon as paging is disabled, as the code becomes
> unreachable.
> 
> This patch implements a trampoline in lower memory to handle this
> situation.
> 
> Apart from the trampoline itself, we also need a place to store the top-level
> page table in lower memory, as we don't have a way to load a 64-bit value into
> CR3 from 32-bit mode. We only really need 8 bytes there, as we only use
> the very first entry of the page table.
> 
> place_trampoline() chooses an address for the trampoline page.
> The implementation is based on reserve_bios_regions(). We take a page
> next to the end of lowmem.
> 
> We only need the page for a very short time, until the main kernel image
> sets up its own page tables.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
> ---
>  arch/x86/boot/compressed/head_64.S | 87 ++++++++++++++++++++++++++------------
>  arch/x86/boot/compressed/misc.c    | 25 +++++++++++
>  2 files changed, 84 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index cefe4958fda9..961c72755986 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -288,8 +288,23 @@ ENTRY(startup_64)
>       leaq    boot_stack_end(%rbx), %rsp
>  
>  #ifdef CONFIG_X86_5LEVEL
> +/*
> + * We need trampoline in lower memory switch from 4- to 5-level paging for
> + * cases when bootloader put kernel above 4G, but didn't enable 5-level paging
> + * for us.
> + *
> + * We also have to have top page table in lower memory as we don't have a way
> + * to load 64-bit value into CR3 from 32-bit mode. We only need 8-bytes there
> + * as we only use the very first entry of the page table.
> + *
> + * The same page can be used to place both trampoline code and top level page
> + * table. place_trampoline() will find suitable place for the trampoline page.
> + * Code will be placed with offset 0x100 from beginning of the page.
> + */
> +#define LVL5_TRAMPOLINE_CODE 0x100
> +
>       /* Preserve RBX across CPUID */
> -     movq    %rbx, %r8
> +     movq    %rbx, %r15
>  
>       /* Check if leaf 7 is supported */
>       xorl    %eax, %eax
> @@ -307,9 +322,6 @@ ENTRY(startup_64)
>       andl    $(1 << 16), %ecx
>       jz      lvl5
>  
> -     /* Restore RBX */
> -     movq    %r8, %rbx
> -
>       /* Check if 5-level paging has already been enabled */
>       movq    %cr4, %rax
>       testl   $X86_CR4_LA57, %eax
> @@ -323,34 +335,53 @@ ENTRY(startup_64)
>        * long mode would trigger #GP. So we need to switch off long mode
>        * first.
>        *
> -      * NOTE: This is not going to work if bootloader put us above 4G
> -      * limit.
> +      * We use trampoline in lower memory to handle situation when
> +      * bootloader put the kernel image above 4G.
>        *
>        * The first step is go into compatibility mode.
>        */
>  
> -     /* Clear additional page table */
> -     leaq    lvl5_pgtable(%rbx), %rdi
> -     xorq    %rax, %rax
> -     movq    $(PAGE_SIZE/8), %rcx
> -     rep     stosq
> +     /*
> +      * Find suitable place for trampoline.
> +      * The address will be stored in RBX.
> +      */
> +     call    place_trampoline
> +     movq    %rax, %rbx
> +
> +     /* Preserve RSI, to be used by movsb below */
> +     movq    %rsi, %r14
> +
> +     /* Copy trampoline code in place */
> +     leaq    lvl5_trampoline_src(%rip), %rsi
> +     leaq    LVL5_TRAMPOLINE_CODE(%rbx), %rdi
> +     movq    $(lvl5_trampoline_end - lvl5_trampoline_src), %rcx
> +     rep     movsb
> +
> +     /* Restore RSI */
> +     movq    %r14, %rsi

Yeah, so first most of this code should be moved from assembly to C. Any reason 
why that cannot be done?

Cleanups like that are a precondition to adding this patch or other 5-level 
paging complications like the dynamic boot time switching.
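
To make the request concrete, here is a minimal sketch of how the page setup
might look in C instead of the rep movsb sequence in head_64.S. It is an
illustration only, not code from the patch: lvl5_prepare_trampoline() and its
parameters are hypothetical names, and it assumes the top-level page table
entry sits at offset 0 of the trampoline page with the code at
LVL5_TRAMPOLINE_CODE (0x100), as the patch comment describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE            4096u
#define LVL5_TRAMPOLINE_CODE 0x100u  /* code offset within the page, as in the patch */

/*
 * Hypothetical helper, not part of the patch.
 *
 * page      - trampoline page, e.g. the address returned by place_trampoline()
 * src, len  - trampoline code (lvl5_trampoline_src .. lvl5_trampoline_end)
 * top_entry - value for the first entry of the top-level page table
 *
 * Returns 0 on success, -1 if the code does not fit behind the 0x100 offset.
 */
static int lvl5_prepare_trampoline(void *page, const void *src, size_t len,
                                   uint64_t top_entry)
{
	unsigned char *p = page;

	if (len > PAGE_SIZE - LVL5_TRAMPOLINE_CODE)
		return -1;

	/* Start from a clean page so unused page table entries are zero. */
	memset(p, 0, PAGE_SIZE);

	/* Only the very first entry (8 bytes) of the page table is used. */
	memcpy(p, &top_entry, sizeof(top_entry));

	/* Copy the trampoline code to its fixed offset in the page. */
	memcpy(p + LVL5_TRAMPOLINE_CODE, src, len);

	return 0;
}
```

With something like this, the assembly would only need to call
place_trampoline(), hand the result to the helper, and jump to the copied
code, and there would be no need to save and restore RSI around the copy.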

Thanks,

        Ingo
