Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
On 01/30/2017 11:56 AM, Thomas Gleixner wrote: On Mon, 30 Jan 2017, Yasuaki Ishimatsu wrote: Hi Thomas, Do you have any idea to fix the issue? If you have the idea, please send me the patch. Yes, I have a patch, but need to do some tests and get changelogs written. Will keep you updated. Great!! I wait for your patch. Thanks, Yasuaki Ishimatsu Thanks, tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
On Mon, 30 Jan 2017, Yasuaki Ishimatsu wrote: > Hi Thomas, > > Do you have any idea to fix the issue? > If you have the idea, please send me the patch. Yes, I have a patch, but need to do some tests and get changelogs written. Will keep you updated. Thanks, tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
Hi Thomas, Do you have any idea to fix the issue? If you have the idea, please send me the patch. Thanks, Yasuaki Ishimatsu On 01/24/2017 02:54 PM, Thomas Gleixner wrote: On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote: rapl_cpu_prepare() must be called after logical package id of CPU is set by topology_update_package_map(). But when onlining hot-added CPU, rapl_cpu_prepare() is called before setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu() in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and rapl_cpu_prepare() initializes the wrong rapl_pmu. After that logical package id of the hot-added CPU is set by topology_update_package_map(). But rapl_cpu_prepare() does not initialize pmu of the logical package id of the hot-added CPU. So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and the following NULL pointer dereference occurs. BUG: unable to handle kernel NULL pointer dereference at 0008 IP: rapl_cpu_online+0x8d/0xb0 Call Trace: ? rapl_cpu_offline+0xc0/0xc0 cpuhp_invoke_callback+0x8d/0x3f0 cpuhp_up_callbacks+0x37/0xb0 cpuhp_thread_fun+0xc9/0xe0 smpboot_thread_fn+0x110/0x160 kthread+0x101/0x140 ? sort_range+0x30/0x30 ? kthread_park+0x90/0x90 ret_from_fork+0x25/0x30 The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes the position of cpuhp_state so that rapl_cpu_starting() is called after topology_update_package_map(). Does not work. You cannot call that callback in the starting context. It does allocations. This needs be fixed in a different way. I'll have a look tomorrow. Thanks, tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
Hi Thomas, Thank you for your review. I'm not familiar with the component. So I need your help to fix the issue. Thanks, Yasuaki Ishimatsu On 01/24/2017 02:54 PM, Thomas Gleixner wrote: On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote: rapl_cpu_prepare() must be called after logical package id of CPU is set by topology_update_package_map(). But when onlining hot-added CPU, rapl_cpu_prepare() is called before setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu() in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and rapl_cpu_prepare() initializes the wrong rapl_pmu. After that logical package id of the hot-added CPU is set by topology_update_package_map(). But rapl_cpu_prepare() does not initialize pmu of the logical package id of the hot-added CPU. So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and the following NULL pointer dereference occurs. BUG: unable to handle kernel NULL pointer dereference at 0008 IP: rapl_cpu_online+0x8d/0xb0 Call Trace: ? rapl_cpu_offline+0xc0/0xc0 cpuhp_invoke_callback+0x8d/0x3f0 cpuhp_up_callbacks+0x37/0xb0 cpuhp_thread_fun+0xc9/0xe0 smpboot_thread_fn+0x110/0x160 kthread+0x101/0x140 ? sort_range+0x30/0x30 ? kthread_park+0x90/0x90 ret_from_fork+0x25/0x30 The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes the position of cpuhp_state so that rapl_cpu_starting() is called after topology_update_package_map(). Does not work. You cannot call that callback in the starting context. It does allocations. This needs be fixed in a different way. I'll have a look tomorrow. Thanks, tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote: > rapl_cpu_prepare() must be called after logical package id of CPU > is set by topology_update_package_map(). > > But when onlining hot-added CPU, rapl_cpu_prepare() is called before > setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu() > in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and > rapl_cpu_prepare() initializes the wrong rapl_pmu. > > After that logical package id of the hot-added CPU is set by > topology_update_package_map(). But rapl_cpu_prepare() does > not initialize pmu of the logical package id of the hot-added CPU. > So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and > the following NULL pointer dereference occurs. > > BUG: unable to handle kernel NULL pointer dereference at 0008 > IP: rapl_cpu_online+0x8d/0xb0 > > Call Trace: >? rapl_cpu_offline+0xc0/0xc0 >cpuhp_invoke_callback+0x8d/0x3f0 >cpuhp_up_callbacks+0x37/0xb0 >cpuhp_thread_fun+0xc9/0xe0 >smpboot_thread_fn+0x110/0x160 >kthread+0x101/0x140 >? sort_range+0x30/0x30 >? kthread_park+0x90/0x90 >ret_from_fork+0x25/0x30 > > The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes > the position of cpuhp_state so that rapl_cpu_starting() is called > after topology_update_package_map(). Does not work. You cannot call that callback in the starting context. It does allocations. This needs be fixed in a different way. I'll have a look tomorrow. Thanks, tglx
[PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
rapl_cpu_prepare() must be called after logical package id of CPU is set by topology_update_package_map(). But when onlining hot-added CPU, rapl_cpu_prepare() is called before setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu() in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and rapl_cpu_prepare() initializes the wrong rapl_pmu. After that logical package id of the hot-added CPU is set by topology_update_package_map(). But rapl_cpu_prepare() does not initialize pmu of the logical package id of the hot-added CPU. So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and the following NULL pointer dereference occurs. BUG: unable to handle kernel NULL pointer dereference at 0008 IP: rapl_cpu_online+0x8d/0xb0 Call Trace: ? rapl_cpu_offline+0xc0/0xc0 cpuhp_invoke_callback+0x8d/0x3f0 cpuhp_up_callbacks+0x37/0xb0 cpuhp_thread_fun+0xc9/0xe0 smpboot_thread_fn+0x110/0x160 kthread+0x101/0x140 ? sort_range+0x30/0x30 ? kthread_park+0x90/0x90 ret_from_fork+0x25/0x30 The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes the position of cpuhp_state so that rapl_cpu_starting() is called after topology_update_package_map(). Fixes: 9de8d686955b ("perf/x86/intel/rapl: Convert it to a per package facility") Signed-off-by: Yasuaki Ishimatsu CC: Thomas Gleixner --- arch/x86/events/intel/rapl.c | 11 ++- include/linux/cpuhotplug.h | 2 +- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c index 17c3564..f762cd6 100644 --- a/arch/x86/events/intel/rapl.c +++ b/arch/x86/events/intel/rapl.c @@ -598,7 +598,7 @@ static int rapl_cpu_online(unsigned int cpu) return 0; } -static int rapl_cpu_prepare(unsigned int cpu) +static int rapl_cpu_starting(unsigned int cpu) { struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu); @@ -804,8 +804,9 @@ static int __init rapl_pmu_init(void) * Install callbacks. Core will call them for each online cpu. */ - ret = cpuhp_setup_state(CPUHP_PERF_X86_RAPL_PREP, "perf/x86/rapl:prepare", - rapl_cpu_prepare, NULL); + ret = cpuhp_setup_state(CPUHP_AP_PERF_X86_RAPL_STARTING, + "perf/x86/rapl:starting", + rapl_cpu_starting, NULL); if (ret) goto out; @@ -825,7 +826,7 @@ static int __init rapl_pmu_init(void) out2: cpuhp_remove_state(CPUHP_AP_PERF_X86_RAPL_ONLINE); out1: - cpuhp_remove_state(CPUHP_PERF_X86_RAPL_PREP); + cpuhp_remove_state(CPUHP_AP_PERF_X86_RAPL_STARTING); out: pr_warn("Initialization failed (%d), disabled\n", ret); cleanup_rapl_pmus(); @@ -836,7 +837,7 @@ static int __init rapl_pmu_init(void) static void __exit intel_rapl_exit(void) { cpuhp_remove_state_nocalls(CPUHP_AP_PERF_X86_RAPL_ONLINE); - cpuhp_remove_state_nocalls(CPUHP_PERF_X86_RAPL_PREP); + cpuhp_remove_state_nocalls(CPUHP_AP_PERF_X86_RAPL_STARTING); perf_pmu_unregister(&rapl_pmus->pmu); cleanup_rapl_pmus(); } diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index d936a00..63da322 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -10,7 +10,6 @@ enum cpuhp_state { CPUHP_PERF_X86_PREPARE, CPUHP_PERF_X86_UNCORE_PREP, CPUHP_PERF_X86_AMD_UNCORE_PREP, - CPUHP_PERF_X86_RAPL_PREP, CPUHP_PERF_BFIN, CPUHP_PERF_POWER, CPUHP_PERF_SUPERH, @@ -90,6 +89,7 @@ enum cpuhp_state { CPUHP_AP_PERF_X86_AMD_UNCORE_STARTING, CPUHP_AP_PERF_X86_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, + CPUHP_AP_PERF_X86_RAPL_STARTING, CPUHP_AP_PERF_X86_CQM_STARTING, CPUHP_AP_PERF_X86_CSTATE_STARTING, CPUHP_AP_PERF_XTENSA_STARTING, -- 1.8.3.1