On Fri, Nov 24, 2017 at 4:23 PM, Maciej Bielski <m.biel...@virtualopensystems.com> wrote: > On Fri, Nov 24, 2017 at 09:42:33AM +0000, Andrea Reale wrote: >> Hi Arun, >> >> >> On Fri 24 Nov 2017, 11:25, Arun KS wrote: >> > On Thu, Nov 23, 2017 at 4:43 PM, Maciej Bielski >> > <m.biel...@virtualopensystems.com> wrote: >> >> [ ...] >> > > Introduces memory hotplug functionality (hot-add) for arm64. >> > > @@ -615,6 +616,44 @@ void __init paging_init(void) >> > > SWAPPER_DIR_SIZE - PAGE_SIZE); >> > > } >> > > >> > > +#ifdef CONFIG_MEMORY_HOTPLUG >> > > + >> > > +/* >> > > + * hotplug_paging() is used by memory hotplug to build new page tables >> > > + * for hot added memory. >> > > + */ >> > > + >> > > +struct mem_range { >> > > + phys_addr_t base; >> > > + phys_addr_t size; >> > > +}; >> > > + >> > > +static int __hotplug_paging(void *data) >> > > +{ >> > > + int flags = 0; >> > > + struct mem_range *section = data; >> > > + >> > > + if (debug_pagealloc_enabled()) >> > > + flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; >> > > + >> > > + __create_pgd_mapping(swapper_pg_dir, section->base, >> > > + __phys_to_virt(section->base), section->size, >> > > + PAGE_KERNEL, pgd_pgtable_alloc, flags); >> > >> > Hello Andrea, >> > >> > __hotplug_paging runs on stop_machine context. >> > cpu stop callbacks must not sleep. >> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/stop_machine.c?h=v4.14#n479 >> > >> > __create_pgd_mapping uses pgd_pgtable_alloc. which does >> > __get_free_page(PGALLOC_GFP) >> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/mm/mmu.c?h=v4.14#n342 >> > >> > PGALLOC_GFP has GFP_KERNEL which inturn has __GFP_RECLAIM >> > >> > #define PGALLOC_GFP (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO) >> > #define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS) >> > >> > Now, prepare_alloc_pages() called by __alloc_pages_nodemask checks for >> > >> > might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM); >> > >> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/page_alloc.c?h=v4.14#n4150 >> > >> > and then BUG() >> >> Well spotted, thanks for reporting the problem. One possible solution >> would be to revert back to building the updated page tables on a copy >> pgdir (as it was done in v1 of this patchset) and then replacing swapper >> atomically with stop_machine. >> >> Actually, I am not sure if stop_machine is strictly needed, >> if we modify the swapper pgdir live: for example, in x86_64 >> kernel_physical_mapping_init, atomicity is ensured by spin-locking on >> init_mm.page_table_lock. >> https://elixir.free-electrons.com/linux/v4.14/source/arch/x86/mm/init_64.c#L684 >> I'll spend some time investigating whoever else could be working >> concurrently on the swapper pgdir. >> >> Any suggestion or pointer is very welcome. > > Hi Andrea, Arun, > > Alternative approach could be implementing pgd_pgtable_alloc_nosleep() and > pointing this to hotplug_paging(). Subsequently, it could use different flags, > eg: > > #define PGALLOC_GFP_NORECLAIM (__GFP_IO | __GFP_FS | __GFP_NOTRACK | > __GFP_ZERO)
This solves the problem with __get_free_page. But pgd_pgtable_alloc() -> pgtable_page_ctor() -> ptlock_alloc() and then kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL) Same BUG again. Regards, Arun > > Is this unefficient approach in any way? > Do we like the fact that the memory-attaching thread can go to sleep? > > BR, > >> >> Thanks, >> Andrea >> >> > I was testing on 4.4 kernel, but cross checked with 4.14 as well. >> > >> > Regards, >> > Arun >> > >> > >> > > + >> > > + return 0; >> > > +} >> > > + >> > > +inline void hotplug_paging(phys_addr_t start, phys_addr_t size) >> > > +{ >> > > + struct mem_range section = { >> > > + .base = start, >> > > + .size = size, >> > > + }; >> > > + >> > > + stop_machine(__hotplug_paging, §ion, NULL); >> > > +} >> > > +#endif /* CONFIG_MEMORY_HOTPLUG */ >> > > + >> > > /* >> > > * Check whether a kernel address is valid (derived from arch/x86/). >> > > */ >> > > -- >> > > 2.7.4 >> > > >> > >> > > -- > Maciej Bielski