Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
Hi James, Marc On Fri, Jun 04, 2021 at 05:20:38PM +0100, James Morse wrote: > Hi Marc, > > On 31/05/2021 10:57, Marc Zyngier wrote: > > This series is a complete departure from the approach I initially sent > > almost a month ago[0]. Instead of trying to teach EFI, ACPI and other > > subsystem to use memblock, I've decided to stick with the iomem > > resource tree and use that exclusively for arm64. > > > This means that my current approach is (despite what I initially > > replied to both Dave and Catalin) to provide an arm64-specific > > implementation of arch_kexec_locate_mem_hole() which walks the > > resource tree and excludes ranges of RAM that have been registered for > > any odd purpose. This is exactly what the userspace implementation > > does, and I don't really see a good reason to diverge from it. > > Because in the ideal world we'd have only 'is it reserved' list to check > against. > Memblock has been extended before. The resource-list is overly stringy, and > I'm not sure > we can shove everything in the resource list. > > Kexec already has problems on arm64 with memory hotplug. Fixing this for > regular kexec in > /proc/iomem was rejected, and memblock's memblock_is_hotpluggable() is broken > because > free_low_memory_core_early() does this: > | memblock_clear_hotplug(0, -1) > > Once that has been unpicked its clear kexec_file_load() can use > memblock_is_hotpluggable(). (its on the todo list, well, jira) > > > I'd prefer to keep kexec using memblock because it _shouldn't_ change after > boot. Having > an "I want to reserve this and make it persistent over kexec" call that can > happen at any > time can't work if the kexec image has already been loaded. > Practically, once user-space has started, you can't have new things you want > to reserve > over kexec. > > > I don't see how the ACPI tables can escape short of a firmware bug. Could > someone with an > affected platform boot with efi=debug and post the EFI memory map and the > 'ACPI: FOO > 0xphysicaladdress' stuff at the top of the boot log? > > > efi_mem_reserve_persistent() has one caller for the GIC ITS stuff. > > For the ITS, the reservations look like they are behind irqchip_init(), which > is well > before the arch_initcall() that updates the resource tree from memblock. Your > v1's first > patch should be sufficient. > > > > Again, this allows my Synquacer board to reliably use kexec_file_load > > with as little as 256M, something that would always fail before as it > > would overwrite most of the reserved tables. > > > > Although this series still targets 5.14, the initial patch is a > > -stable candidate, and disables non-kdump uses of kexec_file_load. I > > have limited it to 5.10, as earlier kernels will require a different, > > probably more invasive approach. > > > > Catalin, Ard: although this series has changed a bit compared to v1, > > I've kept your AB/RB tags. Should anything seem odd, please let me > > know and I'll drop them. > > > Thanks, > > James > > > [0] I'm pretty sure this is enough. (Not tested) > diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c > index 4b7ee3fa9224..3ed45153ce7f 100644 > --- a/drivers/firmware/efi/efi.c > +++ b/drivers/firmware/efi/efi.c > @@ -893,7 +893,7 @@ static int __init efi_memreserve_map_root(void) > return 0; > } > > -static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size) > +static int __efi_mem_reserve_iomem(phys_addr_t addr, u64 size) > { > struct resource *res, *parent; > > @@ -911,6 +911,16 @@ static int efi_mem_reserve_iomem(phys_addr_t addr, u64 > size) > return parent ? request_resource(parent, res) : 0; > } > > +static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size) > +{ > + int err = __efi_mem_reserve_iomem(addr, size); > + > + if(IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !err) > + memblock_reserve(addr, size); > + > + return err; > +} > + > int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size) > { > struct linux_efi_memreserve *rsv; Sorry for the long radio silence. Just got around to testing this. I can confirm that the above change James proposed does work on the platform that the issue was first observed on. Cheers, Moritz ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
Hi Marc, On 31/05/2021 10:57, Marc Zyngier wrote: > This series is a complete departure from the approach I initially sent > almost a month ago[0]. Instead of trying to teach EFI, ACPI and other > subsystem to use memblock, I've decided to stick with the iomem > resource tree and use that exclusively for arm64. > This means that my current approach is (despite what I initially > replied to both Dave and Catalin) to provide an arm64-specific > implementation of arch_kexec_locate_mem_hole() which walks the > resource tree and excludes ranges of RAM that have been registered for > any odd purpose. This is exactly what the userspace implementation > does, and I don't really see a good reason to diverge from it. Because in the ideal world we'd have only 'is it reserved' list to check against. Memblock has been extended before. The resource-list is overly stringy, and I'm not sure we can shove everything in the resource list. Kexec already has problems on arm64 with memory hotplug. Fixing this for regular kexec in /proc/iomem was rejected, and memblock's memblock_is_hotpluggable() is broken because free_low_memory_core_early() does this: | memblock_clear_hotplug(0, -1) Once that has been unpicked its clear kexec_file_load() can use memblock_is_hotpluggable(). (its on the todo list, well, jira) I'd prefer to keep kexec using memblock because it _shouldn't_ change after boot. Having an "I want to reserve this and make it persistent over kexec" call that can happen at any time can't work if the kexec image has already been loaded. Practically, once user-space has started, you can't have new things you want to reserve over kexec. I don't see how the ACPI tables can escape short of a firmware bug. Could someone with an affected platform boot with efi=debug and post the EFI memory map and the 'ACPI: FOO 0xphysicaladdress' stuff at the top of the boot log? efi_mem_reserve_persistent() has one caller for the GIC ITS stuff. For the ITS, the reservations look like they are behind irqchip_init(), which is well before the arch_initcall() that updates the resource tree from memblock. Your v1's first patch should be sufficient. > Again, this allows my Synquacer board to reliably use kexec_file_load > with as little as 256M, something that would always fail before as it > would overwrite most of the reserved tables. > > Although this series still targets 5.14, the initial patch is a > -stable candidate, and disables non-kdump uses of kexec_file_load. I > have limited it to 5.10, as earlier kernels will require a different, > probably more invasive approach. > > Catalin, Ard: although this series has changed a bit compared to v1, > I've kept your AB/RB tags. Should anything seem odd, please let me > know and I'll drop them. Thanks, James [0] I'm pretty sure this is enough. (Not tested) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 4b7ee3fa9224..3ed45153ce7f 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -893,7 +893,7 @@ static int __init efi_memreserve_map_root(void) return 0; } -static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size) +static int __efi_mem_reserve_iomem(phys_addr_t addr, u64 size) { struct resource *res, *parent; @@ -911,6 +911,16 @@ static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size) return parent ? request_resource(parent, res) : 0; } +static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size) +{ + int err = __efi_mem_reserve_iomem(addr, size); + + if(IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !err) + memblock_reserve(addr, size); + + return err; +} + int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size) { struct linux_efi_memreserve *rsv; ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
On Mon, 31 May 2021 at 11:57, Marc Zyngier wrote: > > This series is a complete departure from the approach I initially sent > almost a month ago[0]. Instead of trying to teach EFI, ACPI and other > subsystem to use memblock, I've decided to stick with the iomem > resource tree and use that exclusively for arm64. > > This means that my current approach is (despite what I initially > replied to both Dave and Catalin) to provide an arm64-specific > implementation of arch_kexec_locate_mem_hole() which walks the > resource tree and excludes ranges of RAM that have been registered for > any odd purpose. This is exactly what the userspace implementation > does, and I don't really see a good reason to diverge from it. > > Again, this allows my Synquacer board to reliably use kexec_file_load > with as little as 256M, something that would always fail before as it > would overwrite most of the reserved tables. > > Although this series still targets 5.14, the initial patch is a > -stable candidate, and disables non-kdump uses of kexec_file_load. I > have limited it to 5.10, as earlier kernels will require a different, > probably more invasive approach. > > Catalin, Ard: although this series has changed a bit compared to v1, > I've kept your AB/RB tags. Should anything seem odd, please let me > know and I'll drop them. > Fine with me. > Thanks, > > M. > > * From v1 [1]: > - Move the overlap exclusion into find_next_iomem_res() > - Handle child resource not overlapping with parent > - Provide walk_system_ram_excluding_child_res() as a top level > walker > - Simplify arch-specific code > - Add initial patch disabling non-crash kernels > > [0] https://lore.kernel.org/r/20210429133533.1750721-1-...@kernel.org > [1] https://lore.kernel.org/r/20210526190531.62751-1-...@kernel.org > > Marc Zyngier (5): > arm64: kexec_file: Forbid non-crash kernels > kexec_file: Make locate_mem_hole_callback global > kernel/resource: Allow find_next_iomem_res() to exclude overlapping > child resources > kernel/resource: Introduce walk_system_ram_excluding_child_res() > arm64: kexec_image: Restore full kexec functionnality > > arch/arm64/kernel/kexec_image.c | 39 > include/linux/ioport.h | 3 ++ > include/linux/kexec.h | 1 + > kernel/kexec_file.c | 6 +-- > kernel/resource.c | 82 + > 5 files changed, 119 insertions(+), 12 deletions(-) > > -- > 2.30.2 > ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
This series is a complete departure from the approach I initially sent almost a month ago[0]. Instead of trying to teach EFI, ACPI and other subsystem to use memblock, I've decided to stick with the iomem resource tree and use that exclusively for arm64. This means that my current approach is (despite what I initially replied to both Dave and Catalin) to provide an arm64-specific implementation of arch_kexec_locate_mem_hole() which walks the resource tree and excludes ranges of RAM that have been registered for any odd purpose. This is exactly what the userspace implementation does, and I don't really see a good reason to diverge from it. Again, this allows my Synquacer board to reliably use kexec_file_load with as little as 256M, something that would always fail before as it would overwrite most of the reserved tables. Although this series still targets 5.14, the initial patch is a -stable candidate, and disables non-kdump uses of kexec_file_load. I have limited it to 5.10, as earlier kernels will require a different, probably more invasive approach. Catalin, Ard: although this series has changed a bit compared to v1, I've kept your AB/RB tags. Should anything seem odd, please let me know and I'll drop them. Thanks, M. * From v1 [1]: - Move the overlap exclusion into find_next_iomem_res() - Handle child resource not overlapping with parent - Provide walk_system_ram_excluding_child_res() as a top level walker - Simplify arch-specific code - Add initial patch disabling non-crash kernels [0] https://lore.kernel.org/r/20210429133533.1750721-1-...@kernel.org [1] https://lore.kernel.org/r/20210526190531.62751-1-...@kernel.org Marc Zyngier (5): arm64: kexec_file: Forbid non-crash kernels kexec_file: Make locate_mem_hole_callback global kernel/resource: Allow find_next_iomem_res() to exclude overlapping child resources kernel/resource: Introduce walk_system_ram_excluding_child_res() arm64: kexec_image: Restore full kexec functionnality arch/arm64/kernel/kexec_image.c | 39 include/linux/ioport.h | 3 ++ include/linux/kexec.h | 1 + kernel/kexec_file.c | 6 +-- kernel/resource.c | 82 + 5 files changed, 119 insertions(+), 12 deletions(-) -- 2.30.2 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec