On 6/5/25 8:07 AM, Sebastian Andrzej Siewior wrote:
> The per-CPU data section is handled differently than the other sections.
> The memory allocations requires a special __percpu pointer and then the
> section is copied into the view of each CPU. Therefore the SHF_ALLOC
> flag is removed to ensure move_module() skips it.
>
> Later, relocations are applied and apply_relocations() skips sections
> without SHF_ALLOC because they have not been copied. This also skips the
> per-CPU data section.
> The missing relocations result in a NULL pointer on x86-64 and very
> small values on x86-32. This results in a crash because it is not
> skipped like NULL pointer would and can't be dereferenced.
>
> Such an assignment happens during static per-CPU lock initialisation
> with lockdep enabled.
>
> Add the SHF_ALLOC flag back for the per-CPU section (if found) after
> move_module().
>
> Reported-by: kernel test robot <oliver.s...@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202506041623.e45e4f7d-...@intel.com
> Fixes: 8d8022e8aba85 ("module: do percpu allocation after uniqueness check.
> No, really!")

Isn't this broken earlier by "Don't relocate non-allocated regions in
modules." (pre-Git, [1])?

> Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
> ---
> v1…v2: https://lore.kernel.org/all/20250604152707.cied9...@linutronix.de/
>   - Add the flag back only on SMP if the per-CPU section was found.
>
>  kernel/module/main.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 5c6ab20240a6d..4f6554dedf8ea 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -2816,6 +2816,10 @@ static struct module *layout_and_allocate(struct load_info *info, int flags)
>  	if (err)
>  		return ERR_PTR(err);
>
> +	/* Add SHF_ALLOC back so that relocations are applied. */
> +	if (IS_ENABLED(CONFIG_SMP) && info->index.pcpu)
> +		info->sechdrs[info->index.pcpu].sh_flags |= SHF_ALLOC;
> +
>  	/* Module has been copied to its final place now: return it. */
>  	mod = (void *)info->sechdrs[info->index.mod].sh_addr;
>  	kmemleak_load_module(mod, info);

This looks like a valid fix. The info->sechdrs[info->index.pcpu].sh_addr
is set by rewrite_section_headers() to point to the percpu data in the
userspace-passed ELF copy. The section has SHF_ALLOC reset, so it doesn't
move and the sh_addr isn't adjusted by move_module(). The function
apply_relocations() then applies the relocations in the initial ELF copy.
Finally, post_relocation() copies the relocated percpu data to their final
per-CPU destinations.

However, I'm not sure if it is best to manipulate the SHF_ALLOC flag in
this way. It is ok to reset it once, but if we need to set it back again
then I would reconsider this.

An alternative approach could be to teach apply_relocations() that the
percpu section is special and should be relocated even though it doesn't
have SHF_ALLOC set.
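
To make the intended behaviour concrete, here is a rough user-space sketch
of that check. It is not kernel code: the sketch_* types and the helper
name are made up for illustration, and the explicit "index 0 means no
percpu section" guard is my assumption, mirroring the info->index.pcpu
test in the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same value as SHF_ALLOC in the ELF spec; redefined to avoid an <elf.h> dependency. */
#define SKETCH_SHF_ALLOC 0x2u

/* Hypothetical stand-in for the one section header field used here. */
struct sketch_sechdr {
	unsigned long sh_flags;
};

/* Hypothetical stand-in for the parts of struct load_info used here. */
struct sketch_info {
	const struct sketch_sechdr *sechdrs;
	size_t pcpu_index;	/* assumed: 0 means no percpu section was found */
};

/*
 * Decide whether relocations targeting section `infosec` should be applied.
 *
 * Allocated sections are always relocated. The percpu section is the one
 * exception to the SHF_ALLOC rule: it is relocated in the initial ELF copy
 * even though its SHF_ALLOC flag has been cleared.
 */
static bool sketch_should_relocate(const struct sketch_info *info, size_t infosec)
{
	if (info->sechdrs[infosec].sh_flags & SKETCH_SHF_ALLOC)
		return true;

	return info->pcpu_index != 0 && infosec == info->pcpu_index;
}
```

With a section table where only index 1 has the flag set and index 2 is
the percpu section, the helper returns true for both 1 and 2, and false
for index 2 once pcpu_index is 0. The flag itself is never toggled back,
which is the point of the alternative.
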
This would also allow adding a comment explaining that we're relocating
the data in the original ELF copy, which I find useful to mention as it is
different to other relocation processing. For instance:

	/*
	 * Don't bother with non-allocated sections.
	 *
	 * An exception is the percpu section, which has separate allocations
	 * for individual CPUs. We relocate the percpu section in the initial
	 * ELF template and subsequently copy it to the per-CPU destinations.
	 */
	if (!(info->sechdrs[infosec].sh_flags & SHF_ALLOC) &&
	    infosec != info->index.pcpu)
		continue;

[1] https://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux-fullhistory.git/commit/?id=b3b91325f3c77ace041f769ada7039ebc7aab8de

-- 
Thanks,
Petr