From: Borislav Petkov <b...@suse.de>

During boot, identify_secondary_cpu() calls at some point
validate_apic_and_package_id() which calls topology_update_die_map() to
update/verify the physical to logical DIE map of the CPUs on the system.

There's a call down that path to topology_phys_to_logical_die() which
maps a physical die to a logical one.

The check in there looks at cpuinfo_x86.initialized first before
comparing die_ids and proc_ids. And this is where the problem lies: both
->cpu_die_id and ->phys_proc_id have been initialized as part of the
identify_secondary_cpu() dance - just the cpuinfo_x86.initialized thing
hasn't been set yet (it gets set as the last thing in
smp_store_cpu_info()).

So what that means is that the fields to be compared are already
initialized but the initialized flag says they're not, leading to:

  smpboot: topology_phys_to_logical_die: init: 1, cpu 7, cur_cpu: 8, cpu_die_id: 0, die_id: 2, phys_proc_id: 0, proc_id: 0, logical_die_id: 0
  smpboot: topology_phys_to_logical_die: init: 0, cpu 8, cur_cpu: 8, cpu_die_id: 2, die_id: 2, phys_proc_id: 0, proc_id: 0, logical_die_id: 0
  ...
  smpboot: topology_phys_to_logical_die: init: 0, cpu 127, cur_cpu: 8, cpu_die_id: 0, die_id: 2, phys_proc_id: 0, proc_id: 0, logical_die_id: 0
  smpboot: CPU 8 Converting physical 2 to logical die 1

On CPU8 and all the way up to all possible_cpus, cpuinfo_x86.initialized
is not set yet even though cpu_die_id == die_id && phys_proc_id ==
proc_id for that CPU 8.

As a result, topology_update_die_map() increments logical_die which gets
written into cpuinfo_x86.logical_die_id of that CPU.

Later, in the RAPL code, that logical_die_id is outside of the range of
maximum dies present on the system:

  int maxdie = topology_max_packages() * topology_max_die_per_package();

which leads to indexing into the rapl_pmus->pmus[] array out of bounds.
Boom.

Thus, drop the c->initialized check because the values it is supposed to
guard have actually been initialized already.

(Yes, our boot order is fragile. :-\).

Reported-by: Rafael Kitover <rkito...@gmail.com>
Reported-by: Johnathan Smithinovic <johnathan.smithino...@gmx.at>
Signed-off-by: Borislav Petkov <b...@suse.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210939
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8ca66af96a54..56d2ac8c54ab 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -319,7 +319,7 @@ int topology_phys_to_logical_die(unsigned int die_id, unsigned int cur_cpu)
 	for_each_possible_cpu(cpu) {
 		struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-		if (c->initialized && c->cpu_die_id == die_id &&
+		if (c->cpu_die_id == die_id &&
 		    c->phys_proc_id == proc_id)
 			return c->logical_die_id;
 	}
-- 
2.29.2
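
For anyone who wants to replay the debug output from the commit message
outside the kernel, here is a minimal userspace sketch. It only evaluates
the match condition from the hunk, with and without the ->initialized
test, against the three rows quoted in the log. The struct log_row type
and the helper names are made up for illustration and are not kernel
APIs; the field values are copied from the log lines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the values printed in one debug log line. */
struct log_row {
	bool initialized;		/* "init" in the log */
	unsigned int cpu_die_id;	/* die id stored for the scanned CPU */
	unsigned int die_id;		/* die id being looked up */
	unsigned int phys_proc_id;	/* package id stored for the scanned CPU */
	unsigned int proc_id;		/* package id of cur_cpu */
};

/* Match condition as it reads before the patch. */
static bool matches_with_init_check(const struct log_row *r)
{
	return r->initialized && r->cpu_die_id == r->die_id &&
	       r->phys_proc_id == r->proc_id;
}

/* Match condition as it reads after the patch. */
static bool matches_without_init_check(const struct log_row *r)
{
	return r->cpu_die_id == r->die_id &&
	       r->phys_proc_id == r->proc_id;
}

/* Rows copied from the log: scanned cpu 7, 8 and 127, all with cur_cpu 8. */
static const struct log_row cpu7   = { true,  0, 2, 0, 0 };
static const struct log_row cpu8   = { false, 2, 2, 0, 0 };
static const struct log_row cpu127 = { false, 0, 2, 0, 0 };

/* Assert what each variant of the condition does with the logged rows. */
static void check_log_rows(void)
{
	/* CPU 7 sits on die 0, so neither variant matches die 2. */
	assert(!matches_with_init_check(&cpu7));
	assert(!matches_without_init_check(&cpu7));

	/* CPU 8 has matching ids, but only the new condition sees it. */
	assert(!matches_with_init_check(&cpu8));
	assert(matches_without_init_check(&cpu8));

	/* CPU 127 still carries die id 0, so it matches in neither case. */
	assert(!matches_with_init_check(&cpu127));
	assert(!matches_without_init_check(&cpu127));
}
```

Calling check_log_rows() from a test harness passes silently. The sketch
covers the condition only; it makes no attempt to model
topology_update_die_map() or the RAPL side.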