On 12/03/2018 03:31 PM, Qian Cai wrote:
> On Mon, 2018-12-03 at 15:07 -0500, Waiman Long wrote:
>> On 12/03/2018 02:33 PM, Qian Cai wrote:
>>> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
>>> warning.
>>>
>>> [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
>>> [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
>>> [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
>>> [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
>>> [    0.000000]  acpi_table_parse+0x1ac/0x218
>>> [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
>>> [    0.000000]  timer_probe+0x1bc/0x254
>>> [    0.000000]  time_init+0x44/0x98
>>> [    0.000000]  start_kernel+0x4ec/0x7d4
>>>
>>> This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
>>> the hotplug lock for _cpuslocked() operations"). Therefore, it will check
>>> if it is really in the CPU hotplug path or not, and work around this
>>> problem by using cpus_read_trylock(). The chance of not getting the read
>>> lock is very small. If that happens, it will report a lockdep warning at
>>> most.
>>>
>>> Signed-off-by: Qian Cai <[email protected]>
>>> ---
>>>  drivers/clocksource/arm_arch_timer.c | 9 +++++++++
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/clocksource/arm_arch_timer.c
>>> b/drivers/clocksource/arm_arch_timer.c
>>> index 9a7d4dc..5c9acbd 100644
>>> --- a/drivers/clocksource/arm_arch_timer.c
>>> +++ b/drivers/clocksource/arm_arch_timer.c
>>> @@ -497,11 +497,20 @@ void arch_timer_enable_workaround(const struct
>>> arch_timer_erratum_workaround *wa
>>>                     per_cpu(timer_unstable_counter_workaround, i) = wa;
>>>     }
>>>  
>>> +#ifdef CONFIG_HOTPLUG_CPU
>> If HOTPLUG_CPU isn't defined, all the cpus_lock() and related functions
>> are just no-op. You don't need to use conditional compilation directive
>> here.
> Make sense.
>
>>> +   i = 0;
>>> +
>>>     /*
>>>      * Use the locked version, as we're called from the CPU
>>>      * hotplug framework. Otherwise, we end-up in deadlock-land.
>>>      */
>> I think the main problem is the above comment may not be true anymore or
>> is only occasionally true. We need to audit the code to find the root cause.
> This was a commit introduced in Aug. 2017, 450f9689f294
> (clocksource/arm_arch_timer: Use static_branch_enable_cpuslocked()) which
> basically drop the cpus_read_lock(). May I ask what changes made you think the
> above comment incorrect now?
>

If the above comment is true, you won't have the lockdep splat in the
first place. I think the most likely case is that the  function can be
called from both a hotplug path and a non-hotplug path. So the comment
needs to be updated to reflect that.

>>> +   i = cpus_read_trylock();
>>>     static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
>>> +   if (i)
>>> +           cpus_read_unlock();
>> This is not the right way of fixing the lockdep splash.
>>
> I should had said it is a workaround. I am all-ears for a proper way to fix
> this. When the above commit 450f9689f294 was merged, there was no cb538267ea1e
> so no lockdep warning.

As a workaround, you should better document that in the code. Thinking
about it some more, your workaround seems to be valid. Recursive
cpus_read_lock() is allowed. If it is called from a hotplug path,
cpus_write_lock() should have been taken and cpus_read_trylock() will
failed. Of course, we will have to assume that the current CPU is the
one that has taken the write lock in this case.

Cheers,
Longman



Reply via email to