On Wed, Apr 12, 2017 at 10:07:34PM +0200, Thomas Gleixner wrote:
> acpi_processor_get_throttling() requires to invoke the getter function on
> the target CPU. This is achieved by temporarily setting the affinity of the
> calling user space thread to the requested CPU and reset it to the original
> affinity afterwards.
> 
> That's racy vs. CPU hotplug and concurrent affinity settings for that
> thread resulting in code executing on the wrong CPU and overwriting the
> new affinity setting.
> 
> acpi_processor_get_throttling() is invoked in two ways:
> 
> 1) The CPU online callback, which is already running on the target CPU and
>    obviously protected against hotplug and not affected by affinity
>    settings.
> 
> 2) The ACPI driver probe function, which is not protected against hotplug
>    during modprobe.
> 
> Switch it over to work_on_cpu() and protect the probe function against CPU
> hotplug.
> 

> +static int acpi_processor_get_throttling(struct acpi_processor *pr)
> +{
>       if (!pr)
>               return -EINVAL;
>  
>       if (!pr->flags.throttling)
>               return -ENODEV;
>  
> +      * This is either called from the CPU hotplug callback of
> +      * processor_driver or via the ACPI probe function. In the latter
> +      * case the CPU is not guaranteed to be online. Both call sites are
> +      * protected against CPU hotplug.
>        */
> +     if (!cpu_online(pr->id))
>               return -ENODEV;
>  
> +     return work_on_cpu(pr->id, __acpi_processor_get_throttling, pr);
>  }

That makes my machine sad...


[    9.583030] =============================================
[    9.589053] [ INFO: possible recursive locking detected ]
[    9.595079] 4.11.0-rc6-00385-g5aee78a-dirty #678 Not tainted
[    9.601393] ---------------------------------------------
[    9.607418] kworker/0:0/3 is trying to acquire lock:
[    9.612954]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110c172>] 
flush_work+0x12/0x2a0
[    9.621406] 
[    9.621406] but task is already holding lock:
[    9.627915]  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] 
process_one_work+0x1e7/0x670
[    9.637044] 
[    9.637044] other info that might help us debug this:
[    9.644330]  Possible unsafe locking scenario:
[    9.644330] 
[    9.650934]        CPU0
[    9.653660]        ----
[    9.656386]   lock((&wfc.work));
[    9.659987]   lock((&wfc.work));
[    9.663586] 
[    9.663586]  *** DEADLOCK ***
[    9.663586] 
[    9.670189]  May be due to missing lock nesting notation
[    9.670189] 
[    9.677765] 2 locks held by kworker/0:0/3:
[    9.682332]  #0:  ("events"){.+.+.+}, at: [<ffffffff8110df17>] 
process_one_work+0x1e7/0x670
[    9.691654]  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8110df17>] 
process_one_work+0x1e7/0x670
[    9.701267] 
[    9.701267] stack backtrace:
[    9.706127] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 
4.11.0-rc6-00385-g5aee78a-dirty #678
[    9.715545] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS 
SE5C600.86B.02.02.0002.122320131210 12/23/2013
[    9.726999] Workqueue: events work_for_cpu_fn
[    9.731860] Call Trace:
[    9.734591]  dump_stack+0x86/0xcf
[    9.738290]  __lock_acquire+0x790/0x1620
[    9.742667]  ? __lock_acquire+0x4a5/0x1620
[    9.747237]  lock_acquire+0x100/0x210
[    9.751319]  ? lock_acquire+0x100/0x210
[    9.755596]  ? flush_work+0x12/0x2a0
[    9.759583]  flush_work+0x47/0x2a0
[    9.763375]  ? flush_work+0x12/0x2a0
[    9.767362]  ? queue_work_on+0x47/0xa0
[    9.771545]  ? __this_cpu_preempt_check+0x13/0x20
[    9.776792]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.782139]  ? trace_hardirqs_on+0xd/0x10
[    9.786610]  work_on_cpu+0x82/0x90
[    9.790404]  ? __usermodehelper_disable+0x110/0x110
[    9.795846]  ? __acpi_processor_get_throttling+0x20/0x20
[    9.801773]  acpi_processor_set_throttling+0x199/0x220
[    9.807506]  ? trace_hardirqs_on_caller+0xfb/0x1d0
[    9.812851]  acpi_processor_get_throttling_ptc+0xec/0x180
[    9.818876]  __acpi_processor_get_throttling+0xf/0x20
[    9.824511]  work_for_cpu_fn+0x14/0x20
[    9.828692]  process_one_work+0x261/0x670
[    9.833165]  worker_thread+0x21b/0x3f0
[    9.837348]  kthread+0x108/0x140
[    9.840947]  ? process_one_work+0x670/0x670
[    9.845611]  ? kthread_create_on_node+0x40/0x40
[    9.850667]  ret_from_fork+0x31/0x40

Reply via email to