Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
On 01/30/2017 11:56 AM, Thomas Gleixner wrote:
> On Mon, 30 Jan 2017, Yasuaki Ishimatsu wrote:
>> Hi Thomas,
>>
>> Do you have any idea to fix the issue?
>> If you have the idea, please send me the patch.
>
> Yes, I have a patch, but need to do some tests and get changelogs
> written. Will keep you updated.

Great!! I wait for your patch.

Thanks,
Yasuaki Ishimatsu

> Thanks,
>
> 	tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
On Mon, 30 Jan 2017, Yasuaki Ishimatsu wrote:

> Hi Thomas,
>
> Do you have any idea to fix the issue?
> If you have the idea, please send me the patch.

Yes, I have a patch, but need to do some tests and get changelogs
written. Will keep you updated.

Thanks,

	tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
Hi Thomas,

Do you have any idea to fix the issue?
If you have the idea, please send me the patch.

Thanks,
Yasuaki Ishimatsu

On 01/24/2017 02:54 PM, Thomas Gleixner wrote:
> On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote:
>
>> rapl_cpu_prepare() must be called after logical package id of CPU
>> is set by topology_update_package_map().
>>
>> But when onlining hot-added CPU, rapl_cpu_prepare() is called before
>> setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu()
>> in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and
>> rapl_cpu_prepare() initializes the wrong rapl_pmu.
>>
>> After that logical package id of the hot-added CPU is set by
>> topology_update_package_map(). But rapl_cpu_prepare() does
>> not initialize pmu of the logical package id of the hot-added CPU.
>> So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and
>> the following NULL pointer dereference occurs.
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0008
>> IP: rapl_cpu_online+0x8d/0xb0
>>
>> Call Trace:
>>  ? rapl_cpu_offline+0xc0/0xc0
>>  cpuhp_invoke_callback+0x8d/0x3f0
>>  cpuhp_up_callbacks+0x37/0xb0
>>  cpuhp_thread_fun+0xc9/0xe0
>>  smpboot_thread_fn+0x110/0x160
>>  kthread+0x101/0x140
>>  ? sort_range+0x30/0x30
>>  ? kthread_park+0x90/0x90
>>  ret_from_fork+0x25/0x30
>>
>> The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes
>> the position of cpuhp_state so that rapl_cpu_starting() is called
>> after topology_update_package_map().
>
> Does not work. You cannot call that callback in the starting context. It
> does allocations.
>
> This needs be fixed in a different way. I'll have a look tomorrow.
>
> Thanks,
>
> 	tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
Hi Thomas,

Thank you for your review.
I'm not familiar with the component. So I need your help to fix the issue.

Thanks,
Yasuaki Ishimatsu

On 01/24/2017 02:54 PM, Thomas Gleixner wrote:
> On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote:
>
>> rapl_cpu_prepare() must be called after logical package id of CPU
>> is set by topology_update_package_map().
>>
>> But when onlining hot-added CPU, rapl_cpu_prepare() is called before
>> setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu()
>> in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and
>> rapl_cpu_prepare() initializes the wrong rapl_pmu.
>>
>> After that logical package id of the hot-added CPU is set by
>> topology_update_package_map(). But rapl_cpu_prepare() does
>> not initialize pmu of the logical package id of the hot-added CPU.
>> So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and
>> the following NULL pointer dereference occurs.
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0008
>> IP: rapl_cpu_online+0x8d/0xb0
>>
>> Call Trace:
>>  ? rapl_cpu_offline+0xc0/0xc0
>>  cpuhp_invoke_callback+0x8d/0x3f0
>>  cpuhp_up_callbacks+0x37/0xb0
>>  cpuhp_thread_fun+0xc9/0xe0
>>  smpboot_thread_fn+0x110/0x160
>>  kthread+0x101/0x140
>>  ? sort_range+0x30/0x30
>>  ? kthread_park+0x90/0x90
>>  ret_from_fork+0x25/0x30
>>
>> The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes
>> the position of cpuhp_state so that rapl_cpu_starting() is called
>> after topology_update_package_map().
>
> Does not work. You cannot call that callback in the starting context. It
> does allocations.
>
> This needs be fixed in a different way. I'll have a look tomorrow.
>
> Thanks,
>
> 	tglx
Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()
On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote:

> rapl_cpu_prepare() must be called after logical package id of CPU
> is set by topology_update_package_map().
>
> But when onlining hot-added CPU, rapl_cpu_prepare() is called before
> setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu()
> in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and
> rapl_cpu_prepare() initializes the wrong rapl_pmu.
>
> After that logical package id of the hot-added CPU is set by
> topology_update_package_map(). But rapl_cpu_prepare() does
> not initialize pmu of the logical package id of the hot-added CPU.
> So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and
> the following NULL pointer dereference occurs.
>
> BUG: unable to handle kernel NULL pointer dereference at 0008
> IP: rapl_cpu_online+0x8d/0xb0
>
> Call Trace:
>  ? rapl_cpu_offline+0xc0/0xc0
>  cpuhp_invoke_callback+0x8d/0x3f0
>  cpuhp_up_callbacks+0x37/0xb0
>  cpuhp_thread_fun+0xc9/0xe0
>  smpboot_thread_fn+0x110/0x160
>  kthread+0x101/0x140
>  ? sort_range+0x30/0x30
>  ? kthread_park+0x90/0x90
>  ret_from_fork+0x25/0x30
>
> The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes
> the position of cpuhp_state so that rapl_cpu_starting() is called
> after topology_update_package_map().

Does not work. You cannot call that callback in the starting context. It
does allocations.

This needs be fixed in a different way. I'll have a look tomorrow.

Thanks,

	tglx
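
For readers following the thread, the ordering problem described in the
changelog can be reproduced with a small userspace model. This is only a
sketch, not kernel code: every name below (pmu_model, model_prepare(),
model_update_package_map(), model_hotadd(), model_reset()) is hypothetical,
and the assumption that a hot-added CPU still reports logical package 0
before the map update is made up for illustration. It shows why a lookup
keyed on the current logical package id fills the wrong slot when the
prepare step runs first. It does not model the constraint raised in the
reply, namely that a callback which allocates cannot simply be moved into
the starting context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS     4
#define NR_PACKAGES 2

/* Hypothetical stand-in for the per-package PMU state. */
struct pmu_model {
    int package;
};

/* Logical package id per CPU. 0 doubles as "not assigned yet" (assumption). */
static int logical_pkg[NR_CPUS];
static struct pmu_model *pmus[NR_PACKAGES];

/* Models cpu_to_rapl_pmu(): keyed on the CPU's *current* logical package id. */
static struct pmu_model *cpu_to_pmu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return pmus[logical_pkg[cpu]];
}

/* Models the prepare step: allocate the slot for the current package id. */
static int model_prepare(int cpu)
{
    int pkg = logical_pkg[cpu];

    if (pmus[pkg])
        return 0;               /* slot already populated, nothing to do */

    pmus[pkg] = calloc(1, sizeof(*pmus[pkg]));
    if (!pmus[pkg])
        return -1;
    pmus[pkg]->package = pkg;
    return 0;
}

/* Models topology_update_package_map(): assign the real package id. */
static void model_update_package_map(int cpu, int real_pkg)
{
    assert(real_pkg >= 0 && real_pkg < NR_PACKAGES);
    logical_pkg[cpu] = real_pkg;
}

/* Free all slots and forget all package assignments. */
static void model_reset(void)
{
    for (int i = 0; i < NR_PACKAGES; i++) {
        free(pmus[i]);
        pmus[i] = NULL;
    }
    for (int i = 0; i < NR_CPUS; i++)
        logical_pkg[i] = 0;
}

/*
 * Hot-add one CPU. Returns 1 if the online step would find a PMU,
 * 0 if its lookup would return NULL (the crash reported in the thread).
 */
static int model_hotadd(int cpu, int real_pkg, int prepare_first)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    logical_pkg[cpu] = 0;       /* stale id before the map update */

    if (prepare_first) {
        model_prepare(cpu);     /* fills the slot for the stale id */
        model_update_package_map(cpu, real_pkg);
    } else {
        model_update_package_map(cpu, real_pkg);
        model_prepare(cpu);     /* fills the slot for the real id */
    }
    return cpu_to_pmu(cpu) != NULL;
}
```

Calling model_hotadd(2, 1, 1) after model_reset() models the reported
ordering and leaves the slot for package 1 empty; model_hotadd(2, 1, 0)
models the ordering the patch was aiming for.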