On Tue, Dec 16, 2025 at 3:49 AM Evangelos Petrongonas <[email protected]> wrote:
>
> When `CONFIG_DEFERRED_STRUCT_PAGE_INIT` is enabled, struct page
> initialization is deferred to parallel kthreads that run later
> in the boot process.
>
> During KHO restoration, `deserialize_bitmap()` writes metadata for
> each preserved memory region. However, if the struct page has not been
> initialized, this write targets uninitialized memory, potentially
> leading to errors like:
> ```
> BUG: unable to handle page fault for address: ...
> ```
>
> Fix this by introducing `kho_get_preserved_page()`, which ensures
> all struct pages in a preserved region are initialized by calling
> `init_deferred_page()`, which is a no-op when deferred init is disabled
> or when the struct page is already initialized.
>
> Fixes: 8b66ed2c3f42 ("kho: mm: don't allow deferred struct page with KHO")

You are adding a new feature. Backporting this to stable is not needed.

> Signed-off-by: Evangelos Petrongonas <[email protected]>
> ---
> ### Notes
> @Jason, this patch should act as a temporary fix to make KHO play nice
> with deferred struct page init until you post your ideas about splitting
> "Physical Reservation" from "Metadata Restoration".
>
> ### Testing
> In order to test the fix, I modified the KHO selftest to allocate more
> memory, and to do so from higher memory, to trigger the incompatibility. The
> branch with those changes can be found in:
> https://git.infradead.org/?p=users/vpetrog/linux.git;a=shortlog;h=refs/heads/kho-deferred-struct-page-init
>
> In future patches, we might want to enhance the selftest to cover
> this case as well. However, properly adapting the test for this
> is much more work than the actual fix, so it can be deferred to a
> follow-up series.
>
> In addition, attempting to run the selftest for arm (without my changes)
> fails with:
> ```
> ERROR:target/arm/internals.h:767:regime_is_user: code should not be reached
> Bail out! ERROR:target/arm/internals.h:767:regime_is_user: code should not be reached
> ./tools/testing/selftests/kho/vmtest.sh: line 113: 61609 Aborted
> ```
> I have not looked into it further, but can also do so as part of a
> selftest follow-up.
>
>  kernel/liveupdate/Kconfig          |  2 --
>  kernel/liveupdate/kexec_handover.c | 19 ++++++++++++++++++-
>  2 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
> index d2aeaf13c3ac..9394a608f939 100644
> --- a/kernel/liveupdate/Kconfig
> +++ b/kernel/liveupdate/Kconfig
> @@ -1,12 +1,10 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>
>  menu "Live Update and Kexec HandOver"
> -	depends on !DEFERRED_STRUCT_PAGE_INIT
>
>  config KEXEC_HANDOVER
>  	bool "kexec handover"
>  	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
> -	depends on !DEFERRED_STRUCT_PAGE_INIT
>  	select MEMBLOCK_KHO_SCRATCH
>  	select KEXEC_FILE
>  	select LIBFDT
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index 9dc51fab604f..78cfe71e6107 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -439,6 +439,23 @@ static int kho_mem_serialize(struct kho_out *kho_out)
>  	return err;
>  }
>
> +/*
> + * With CONFIG_DEFERRED_STRUCT_PAGE_INIT, struct pages in higher memory
> + * regions may not be initialized yet at the time KHO deserializes preserved
> + * memory. This function ensures all struct pages in the region are initialized.
> + */
> +static struct page *__init kho_get_preserved_page(phys_addr_t phys,
> +						   unsigned int order)
> +{
> +	unsigned long pfn = PHYS_PFN(phys);
> +	int nid = early_pfn_to_nid(pfn);
> +
> +	for (int i = 0; i < (1 << order); i++)
> +		init_deferred_page(pfn + i, nid);
> +
> +	return pfn_to_page(pfn);
> +}
> +
>  static void __init deserialize_bitmap(unsigned int order,
>  				       struct khoser_mem_bitmap_ptr *elm)
>  {
> @@ -449,7 +466,7 @@ static void __init deserialize_bitmap(unsigned int order,
>  		int sz = 1 << (order + PAGE_SHIFT);
>  		phys_addr_t phys =
>  			elm->phys_start + (bit << (order + PAGE_SHIFT));
> -		struct page *page = phys_to_page(phys);
> +		struct page *page = kho_get_preserved_page(phys, order);
>  		union kho_page_info info;
>
>  		memblock_reserve(phys, sz);

In deferred_init_memmap_chunk() we initialize deferred struct pages in this
iterator:

	for_each_free_mem_range(i, nid, 0, &start, &end, NULL) {
		init_deferred_page()
	}

However, since memblock_reserve() is called, the memory is not going to be
part of the for_each_free_mem_range() iterator. So, I think the proposed
patch should work.

Pratyush, what happens if we deserialize a HugeTLB with HVO? Since HVO
optimizes out the unique backing struct pages for tail pages, blindly
iterating and initializing them via init_deferred_page() might corrupt the
shared struct page mapping.

Pasha
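
P.S. To make the "reserved memory is skipped by the free-range walker" point
concrete, below is a rough userspace toy model of that behaviour (the helper
names, ranges, and sizes are invented for illustration; this is not the kernel
code). The walker only visits free ranges, so a range that was reserved up
front is never visited, and its pages must be initialized explicitly, which is
what the per-page loop in kho_get_preserved_page() does for a preserved region:

```
#include <stdbool.h>
#include <stdio.h>

struct range { unsigned long start, end; };   /* [start, end) in page frames */

static struct range memory   = { 0, 1024 };   /* all "RAM" in this toy model */
static struct range reserved = { 512, 640 };  /* pretend KHO reserved this range */

static bool page_initialized[1024];

/* Walk only the free parts of 'memory', i.e. everything minus 'reserved'. */
static void for_each_free_range(void (*fn)(unsigned long, unsigned long))
{
	if (memory.start < reserved.start)
		fn(memory.start, reserved.start);
	if (reserved.end < memory.end)
		fn(reserved.end, memory.end);
}

static void init_pages(unsigned long start, unsigned long end)
{
	for (unsigned long pfn = start; pfn < end; pfn++)
		page_initialized[pfn] = true;
}

int main(void)
{
	/* "Deferred init": only free ranges are walked. */
	for_each_free_range(init_pages);

	printf("pfn 100 initialized: %d\n", page_initialized[100]); /* 1 */
	printf("pfn 600 initialized: %d\n", page_initialized[600]); /* 0: reserved range skipped */

	/* The explicit pass that the loop in kho_get_preserved_page() corresponds to. */
	init_pages(reserved.start, reserved.end);
	printf("pfn 600 after explicit init: %d\n", page_initialized[600]); /* 1 */
	return 0;
}
```

Built with any C99 compiler, pfn 600 only reports initialized after the
explicit pass over the reserved range.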
