Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations

2021-06-09 Thread Moritz Fischer
Hi James, Marc

On Fri, Jun 04, 2021 at 05:20:38PM +0100, James Morse wrote:
> Hi Marc,
> 
> On 31/05/2021 10:57, Marc Zyngier wrote:
> > This series is a complete departure from the approach I initially sent
> > almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
> > subsystem to use memblock, I've decided to stick with the iomem
> > resource tree and use that exclusively for arm64.
> 
> > This means that my current approach is (despite what I initially
> > replied to both Dave and Catalin) to provide an arm64-specific
> > implementation of arch_kexec_locate_mem_hole() which walks the
> > resource tree and excludes ranges of RAM that have been registered for
> > any odd purpose. This is exactly what the userspace implementation
> > does, and I don't really see a good reason to diverge from it.
> 
> Because in the ideal world we'd have only 'is it reserved' list to check 
> against.
> Memblock has been extended before. The resource-list is overly stringy, and 
> I'm not sure
> we can shove everything in the resource list.
> 
> Kexec already has problems on arm64 with memory hotplug. Fixing this for 
> regular kexec in
> /proc/iomem was rejected, and memblock's memblock_is_hotpluggable() is broken 
> because
> free_low_memory_core_early() does this:
> | memblock_clear_hotplug(0, -1)
> 
> Once that has been unpicked its clear kexec_file_load() can use
> memblock_is_hotpluggable(). (its on the todo list, well, jira)
> 
> 
> I'd prefer to keep kexec using memblock because it _shouldn't_ change after 
> boot. Having
> an "I want to reserve this and make it persistent over kexec" call that can 
> happen at any
> time can't work if the kexec image has already been loaded.
> Practically, once user-space has started, you can't have new things you want 
> to reserve
> over kexec.
> 
> 
> I don't see how the ACPI tables can escape short of a firmware bug. Could 
> someone with an
> affected platform boot with efi=debug and post the EFI memory map and the 
> 'ACPI: FOO
> 0xphysicaladdress' stuff at the top of the boot log?
> 
> 
> efi_mem_reserve_persistent() has one caller for the GIC ITS stuff.
> 
> For the ITS, the reservations look like they are behind irqchip_init(), which 
> is well
> before the arch_initcall() that updates the resource tree from memblock. Your 
> v1's first
> patch should be sufficient.
> 
> 
> > Again, this allows my Synquacer board to reliably use kexec_file_load
> > with as little as 256M, something that would always fail before as it
> > would overwrite most of the reserved tables.
> > 
> > Although this series still targets 5.14, the initial patch is a
> > -stable candidate, and disables non-kdump uses of kexec_file_load. I
> > have limited it to 5.10, as earlier kernels will require a different,
> > probably more invasive approach.
> > 
> > Catalin, Ard: although this series has changed a bit compared to v1,
> > I've kept your AB/RB tags. Should anything seem odd, please let me
> > know and I'll drop them.
> 
> 
> Thanks,
> 
> James
> 
> 
> [0] I'm pretty sure this is enough. (Not tested)
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 4b7ee3fa9224..3ed45153ce7f 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -893,7 +893,7 @@ static int __init efi_memreserve_map_root(void)
> return 0;
>  }
> 
> -static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
> +static int __efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
>  {
> struct resource *res, *parent;
> 
> @@ -911,6 +911,16 @@ static int efi_mem_reserve_iomem(phys_addr_t addr, u64 
> size)
> return parent ? request_resource(parent, res) : 0;
>  }
> 
> +static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
> +{
> +   int err = __efi_mem_reserve_iomem(addr, size);
> +
> +   if(IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !err)
> +   memblock_reserve(addr, size);
> +
> +   return err;
> +}
> +
>  int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
>  {
> struct linux_efi_memreserve *rsv;

Sorry for the long radio silence. Just got around to testing this.

I can confirm that the above change James proposed does work on the
platform that the issue was first observed on.

Cheers,
Moritz

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations

2021-06-04 Thread James Morse
Hi Marc,

On 31/05/2021 10:57, Marc Zyngier wrote:
> This series is a complete departure from the approach I initially sent
> almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
> subsystem to use memblock, I've decided to stick with the iomem
> resource tree and use that exclusively for arm64.

> This means that my current approach is (despite what I initially
> replied to both Dave and Catalin) to provide an arm64-specific
> implementation of arch_kexec_locate_mem_hole() which walks the
> resource tree and excludes ranges of RAM that have been registered for
> any odd purpose. This is exactly what the userspace implementation
> does, and I don't really see a good reason to diverge from it.

Because in the ideal world we'd have only 'is it reserved' list to check 
against.
Memblock has been extended before. The resource-list is overly stringy, and I'm 
not sure
we can shove everything in the resource list.

Kexec already has problems on arm64 with memory hotplug. Fixing this for 
regular kexec in
/proc/iomem was rejected, and memblock's memblock_is_hotpluggable() is broken 
because
free_low_memory_core_early() does this:
|   memblock_clear_hotplug(0, -1)

Once that has been unpicked its clear kexec_file_load() can use
memblock_is_hotpluggable(). (its on the todo list, well, jira)


I'd prefer to keep kexec using memblock because it _shouldn't_ change after 
boot. Having
an "I want to reserve this and make it persistent over kexec" call that can 
happen at any
time can't work if the kexec image has already been loaded.
Practically, once user-space has started, you can't have new things you want to 
reserve
over kexec.


I don't see how the ACPI tables can escape short of a firmware bug. Could 
someone with an
affected platform boot with efi=debug and post the EFI memory map and the 
'ACPI: FOO
0xphysicaladdress' stuff at the top of the boot log?


efi_mem_reserve_persistent() has one caller for the GIC ITS stuff.

For the ITS, the reservations look like they are behind irqchip_init(), which 
is well
before the arch_initcall() that updates the resource tree from memblock. Your 
v1's first
patch should be sufficient.


> Again, this allows my Synquacer board to reliably use kexec_file_load
> with as little as 256M, something that would always fail before as it
> would overwrite most of the reserved tables.
> 
> Although this series still targets 5.14, the initial patch is a
> -stable candidate, and disables non-kdump uses of kexec_file_load. I
> have limited it to 5.10, as earlier kernels will require a different,
> probably more invasive approach.
> 
> Catalin, Ard: although this series has changed a bit compared to v1,
> I've kept your AB/RB tags. Should anything seem odd, please let me
> know and I'll drop them.


Thanks,

James


[0] I'm pretty sure this is enough. (Not tested)
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 4b7ee3fa9224..3ed45153ce7f 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -893,7 +893,7 @@ static int __init efi_memreserve_map_root(void)
return 0;
 }

-static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
+static int __efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
 {
struct resource *res, *parent;

@@ -911,6 +911,16 @@ static int efi_mem_reserve_iomem(phys_addr_t addr, u64 
size)
return parent ? request_resource(parent, res) : 0;
 }

+static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
+{
+   int err = __efi_mem_reserve_iomem(addr, size);
+
+   if(IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !err)
+   memblock_reserve(addr, size);
+
+   return err;
+}
+
 int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
 {
struct linux_efi_memreserve *rsv;

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations

2021-05-31 Thread Ard Biesheuvel
On Mon, 31 May 2021 at 11:57, Marc Zyngier  wrote:
>
> This series is a complete departure from the approach I initially sent
> almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
> subsystem to use memblock, I've decided to stick with the iomem
> resource tree and use that exclusively for arm64.
>
> This means that my current approach is (despite what I initially
> replied to both Dave and Catalin) to provide an arm64-specific
> implementation of arch_kexec_locate_mem_hole() which walks the
> resource tree and excludes ranges of RAM that have been registered for
> any odd purpose. This is exactly what the userspace implementation
> does, and I don't really see a good reason to diverge from it.
>
> Again, this allows my Synquacer board to reliably use kexec_file_load
> with as little as 256M, something that would always fail before as it
> would overwrite most of the reserved tables.
>
> Although this series still targets 5.14, the initial patch is a
> -stable candidate, and disables non-kdump uses of kexec_file_load. I
> have limited it to 5.10, as earlier kernels will require a different,
> probably more invasive approach.
>
> Catalin, Ard: although this series has changed a bit compared to v1,
> I've kept your AB/RB tags. Should anything seem odd, please let me
> know and I'll drop them.
>

Fine with me.

> Thanks,
>
> M.
>
> * From v1 [1]:
>   - Move the overlap exclusion into find_next_iomem_res()
>   - Handle child resource not overlapping with parent
>   - Provide walk_system_ram_excluding_child_res() as a top level
> walker
>   - Simplify arch-specific code
>   - Add initial patch disabling non-crash kernels
>
> [0] https://lore.kernel.org/r/20210429133533.1750721-1-...@kernel.org
> [1] https://lore.kernel.org/r/20210526190531.62751-1-...@kernel.org
>
> Marc Zyngier (5):
>   arm64: kexec_file: Forbid non-crash kernels
>   kexec_file: Make locate_mem_hole_callback global
>   kernel/resource: Allow find_next_iomem_res() to exclude overlapping
> child resources
>   kernel/resource: Introduce walk_system_ram_excluding_child_res()
>   arm64: kexec_image: Restore full kexec functionnality
>
>  arch/arm64/kernel/kexec_image.c | 39 
>  include/linux/ioport.h  |  3 ++
>  include/linux/kexec.h   |  1 +
>  kernel/kexec_file.c |  6 +--
>  kernel/resource.c   | 82 +
>  5 files changed, 119 insertions(+), 12 deletions(-)
>
> --
> 2.30.2
>

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations

2021-05-31 Thread Marc Zyngier
This series is a complete departure from the approach I initially sent
almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
subsystem to use memblock, I've decided to stick with the iomem
resource tree and use that exclusively for arm64.

This means that my current approach is (despite what I initially
replied to both Dave and Catalin) to provide an arm64-specific
implementation of arch_kexec_locate_mem_hole() which walks the
resource tree and excludes ranges of RAM that have been registered for
any odd purpose. This is exactly what the userspace implementation
does, and I don't really see a good reason to diverge from it.

Again, this allows my Synquacer board to reliably use kexec_file_load
with as little as 256M, something that would always fail before as it
would overwrite most of the reserved tables.

Although this series still targets 5.14, the initial patch is a
-stable candidate, and disables non-kdump uses of kexec_file_load. I
have limited it to 5.10, as earlier kernels will require a different,
probably more invasive approach.

Catalin, Ard: although this series has changed a bit compared to v1,
I've kept your AB/RB tags. Should anything seem odd, please let me
know and I'll drop them.

Thanks,

M.

* From v1 [1]:
  - Move the overlap exclusion into find_next_iomem_res()
  - Handle child resource not overlapping with parent
  - Provide walk_system_ram_excluding_child_res() as a top level
walker
  - Simplify arch-specific code
  - Add initial patch disabling non-crash kernels

[0] https://lore.kernel.org/r/20210429133533.1750721-1-...@kernel.org
[1] https://lore.kernel.org/r/20210526190531.62751-1-...@kernel.org

Marc Zyngier (5):
  arm64: kexec_file: Forbid non-crash kernels
  kexec_file: Make locate_mem_hole_callback global
  kernel/resource: Allow find_next_iomem_res() to exclude overlapping
child resources
  kernel/resource: Introduce walk_system_ram_excluding_child_res()
  arm64: kexec_image: Restore full kexec functionnality

 arch/arm64/kernel/kexec_image.c | 39 
 include/linux/ioport.h  |  3 ++
 include/linux/kexec.h   |  1 +
 kernel/kexec_file.c |  6 +--
 kernel/resource.c   | 82 +
 5 files changed, 119 insertions(+), 12 deletions(-)

-- 
2.30.2


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec