Re: [PATCH v2 1/3] efi/x86: skip efi_arch_mem_reserve() in case of kexec.

2024-03-24 Thread Kalra, Ashish

Hello,

On 3/18/2024 11:00 PM, Dave Young wrote:

> Hi,
>
> Added Ard in cc.
>
> On 03/18/24 at 07:02am, Ashish Kalra wrote:
> > From: Ashish Kalra
> >
> > For the kexec use case, we need to use and stick to the EFI memmap
> > passed from the first kernel via boot-params/setup data; hence,
> > skip efi_arch_mem_reserve() during kexec.
> >
> > Additionally, SNP guest kexec testing showed that the EFI memmap is
> > corrupted during chained kexec. kexec_enter_virtual_mode() during
> > late init will remap the efi_memmap physical pages allocated in
> > efi_arch_mem_reserve() via memblock, which then causes random
> > EFI memmap corruption once memblock is freed/torn down.
> >
> > Signed-off-by: Ashish Kalra
> > ---
> >  arch/x86/platform/efi/quirks.c | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> > index f0cc00032751..d4562d074371 100644
> > --- a/arch/x86/platform/efi/quirks.c
> > +++ b/arch/x86/platform/efi/quirks.c
> > @@ -258,6 +258,16 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> >  	int num_entries;
> >  	void *new;
> >
> > +	/*
> > +	 * For kexec use case, we need to use the EFI memmap passed from the first
> > +	 * kernel via setup data, so we need to skip this.
> > +	 * Additionally kexec_enter_virtual_mode() during late init will remap
> > +	 * the efi_memmap physical pages allocated here via memblock & then
> > +	 * subsequently cause random EFI memmap corruption once memblock is freed.
>
> Can you elaborate a bit on the corruption? Is it reproducible without
> SNP?


This is only reproducible on SNP.

This is the call-stack for the above function:

[    0.313377]  efi_arch_mem_reserve+0x64/0x220
[    0.314060]  ? memblock_add_range+0x2a0/0x2e0
[    0.314763]  efi_mem_reserve+0x36/0x60
[    0.315360]  efi_bgrt_init+0x17d/0x1a0
[    0.315959]  ? __pfx_acpi_parse_bgrt+0x10/0x10
[    0.316711]  acpi_parse_bgrt+0x12/0x20
[    0.317310]  acpi_table_parse+0x77/0xd0
[    0.317922]  acpi_boot_init+0x362/0x630
[    0.318535]  setup_arch+0xa4e/0xf90
[    0.319091]  start_kernel+0x68/0xa70
[    0.319664]  x86_64_start_reservations+0x1c/0x30
[    0.320431]  x86_64_start_kernel+0xbf/0x110
[    0.321099]  secondary_startup_64_no_verify+0x179/0x17b

efi_arch_mem_reserve() calls efi_memmap_alloc(), which in turn calls
__efi_memmap_alloc_early(), which does memblock_phys_alloc(). It later
calls efi_memmap_install(), which does early_memremap() of the EFI
memmap into this memblock-allocated physical memory. So the EFI memmap
now gets remapped into the memblock-allocated memory.

Later, kexec_enter_virtual_mode() calls efi_memmap_init_late(), which
memremap()s the EFI memmap into the above memblock-allocated physical
range.

Obviously, when memblocks are later freed during late init, this
memblock-allocated physical range gets freed and re-allocated, which
eventually overwrites and corrupts the EFI memmap, leading to a crash
on the subsequent kexec boot.



> > +	 */
> > +	if (efi_setup)
> > +		return;
> > +
>
> How about checking the md attribute instead of checking efi_setup?
> Personally I feel it is a bit better, something like below:


I based the above on the following code, which checks for kexec boot:

void __init efi_enter_virtual_mode(void)
{
	...

	if (efi_setup)
		kexec_enter_virtual_mode();
	else
		__efi_enter_virtual_mode();

	...
}

However, I have tested the code you shared below, which checks the md
attribute, and it works, so I can resend my v2 patch based on it.


Thanks, Ashish



> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index f0cc00032751..699332b075bb 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -255,15 +255,24 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
>  	struct efi_memory_map_data data = { 0 };
>  	struct efi_mem_range mr;
>  	efi_memory_desc_t md;
> -	int num_entries;
> +	int num_entries, ret;
>  	void *new;
>
> -	if (efi_mem_desc_lookup(addr, &md) ||
> -	    md.type != EFI_BOOT_SERVICES_DATA) {
> +	ret = efi_mem_desc_lookup(addr, &md);
> +	if (ret) {
>  		pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
>  		return;
>  	}
>
> +	if (md.type != EFI_BOOT_SERVICES_DATA) {
> +		pr_err("Skip reserving non EFI Boot Service Data memory for %pa\n", &addr);
> +		return;
> +	}
> +
> +	/* Kexec copied the efi memmap from the 1st kernel, thus skip the case. */
> +	if (md.attribute & EFI_MEMORY_RUNTIME)
> +		return;
> +
>  	if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
>  		pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
>  		return;




___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 1/3] efi/x86: skip efi_arch_mem_reserve() in case of kexec.

2024-03-18 Thread Dave Young

Hi,

Added Ard in cc.

On 03/18/24 at 07:02am, Ashish Kalra wrote:
> From: Ashish Kalra 
> 
> For the kexec use case, we need to use and stick to the EFI memmap
> passed from the first kernel via boot-params/setup data; hence,
> skip efi_arch_mem_reserve() during kexec.
> 
> Additionally, SNP guest kexec testing showed that the EFI memmap is
> corrupted during chained kexec. kexec_enter_virtual_mode() during
> late init will remap the efi_memmap physical pages allocated in
> efi_arch_mem_reserve() via memblock, which then causes random
> EFI memmap corruption once memblock is freed/torn down.
> 
> Signed-off-by: Ashish Kalra 
> ---
>  arch/x86/platform/efi/quirks.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
> index f0cc00032751..d4562d074371 100644
> --- a/arch/x86/platform/efi/quirks.c
> +++ b/arch/x86/platform/efi/quirks.c
> @@ -258,6 +258,16 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
>  	int num_entries;
>  	void *new;
>  
> +	/*
> +	 * For kexec use case, we need to use the EFI memmap passed from the first
> +	 * kernel via setup data, so we need to skip this.
> +	 * Additionally kexec_enter_virtual_mode() during late init will remap
> +	 * the efi_memmap physical pages allocated here via memblock & then
> +	 * subsequently cause random EFI memmap corruption once memblock is freed.

Can you elaborate a bit on the corruption? Is it reproducible without
SNP?

> +	 */
> +	if (efi_setup)
> +		return;
> +

How about checking the md attribute instead of checking efi_setup?
Personally I feel it is a bit better, something like below:

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f0cc00032751..699332b075bb 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -255,15 +255,24 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
 	struct efi_memory_map_data data = { 0 };
 	struct efi_mem_range mr;
 	efi_memory_desc_t md;
-	int num_entries;
+	int num_entries, ret;
 	void *new;
 
-	if (efi_mem_desc_lookup(addr, &md) ||
-	    md.type != EFI_BOOT_SERVICES_DATA) {
+	ret = efi_mem_desc_lookup(addr, &md);
+	if (ret) {
 		pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
 		return;
 	}
 
+	if (md.type != EFI_BOOT_SERVICES_DATA) {
+		pr_err("Skip reserving non EFI Boot Service Data memory for %pa\n", &addr);
+		return;
+	}
+
+	/* Kexec copied the efi memmap from the 1st kernel, thus skip the case. */
+	if (md.attribute & EFI_MEMORY_RUNTIME)
+		return;
+
 	if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
 		pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
 		return;




[PATCH v2 1/3] efi/x86: skip efi_arch_mem_reserve() in case of kexec.

2024-03-18 Thread Ashish Kalra

From: Ashish Kalra

For the kexec use case, we need to use and stick to the EFI memmap
passed from the first kernel via boot-params/setup data; hence,
skip efi_arch_mem_reserve() during kexec.

Additionally, SNP guest kexec testing showed that the EFI memmap is
corrupted during chained kexec. kexec_enter_virtual_mode() during
late init will remap the efi_memmap physical pages allocated in
efi_arch_mem_reserve() via memblock, which then causes random
EFI memmap corruption once memblock is freed/torn down.

Signed-off-by: Ashish Kalra 
---
 arch/x86/platform/efi/quirks.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index f0cc00032751..d4562d074371 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -258,6 +258,16 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
 	int num_entries;
 	void *new;
 
+	/*
+	 * For kexec use case, we need to use the EFI memmap passed from the first
+	 * kernel via setup data, so we need to skip this.
+	 * Additionally kexec_enter_virtual_mode() during late init will remap
+	 * the efi_memmap physical pages allocated here via memblock & then
+	 * subsequently cause random EFI memmap corruption once memblock is freed.
+	 */
+	if (efi_setup)
+		return;
+
 	if (efi_mem_desc_lookup(addr, &md) ||
 	    md.type != EFI_BOOT_SERVICES_DATA) {
 		pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
-- 
2.34.1

