[AMD Public Use]

> [SNIP]
>>> I think it is a limitation of init_rwsem.
>> And exactly that's wrong, this is intentional and perfectly correct.
>>
>> [Dennis Li] I couldn't understand. Why is it perfectly correct?
>> For example, we define two rw_sems: a and b. If we don't check the
>> init_rwsem definition, we may think case #1 and case #2 have the same
>> behavior, but in fact they are different.
>>
>> Case 1:
>> init_rwsem(&a);
>> init_rwsem(&b);
>>
>> Case 2:
>> void init_rwsem_ext(rw_sem *sem)
>> {
>>         init_rwsem(sem);
>> }
>> init_rwsem_ext(&a);
>> init_rwsem_ext(&b);
>>
>> As far as I know it is perfectly possible that the locks in the hive
>> are not always grabbed in the same order. And that's why lockdep is
>> complaining about this.
>> [Dennis Li] No. I take a hive with two devices (a and b) to explain
>> why lockdep complains.
>>
>> // First the driver locks the reset_sem of all devices:
>> down_write(&a->reset_sem);
>> do_suspend(a);
>> down_write(&b->reset_sem); // Because b shares the same lock_class_key
>>                            // with a, lockdep takes a and b as the same
>>                            // rw_sem and complains here.
>> do_suspend(b);
>>
>> // do recovery
>> do_hive_recovery()
>>
>> // unlock the reset_sem of all devices:
>> do_resume(a);
>> up_write(&a->reset_sem);
>> do_resume(b);
>> up_write(&b->reset_sem);
> Ah! Now I understand what you are working around. So the problem is the
> static lock_class_key in the macro?
> [Dennis Li] Yes. The author of init_rwsem might not have considered a
> use case like ours.
>
>> What we should do instead is to make sure we have only a single lock
>> for the complete hive.
>> [Dennis Li] If we use a single lock, users must wait for all devices to
>> resume successfully, even when their tasks are only running on device
>> a. That is not friendly to users.
> Well, that is intentional I would say. We can only restart submissions
> after all devices are resumed successfully, because otherwise it can
> happen that a job on device A depends on memory on device B which isn't
> accessible.
> [Dennis Li] Yes, you are right. The driver must make sure that the
> shared resources (such as the shared memory) are ready before unlocking
> the lock of each adev one by one. But each device still has private
> hardware resources such as video and display.
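To make the class-key behavior concrete: a minimal sketch, not taken from
the thread (init_rwsem_ext and demo are illustrative names), of why
Case 1 and Case 2 differ under lockdep:

#include <linux/rwsem.h>

static struct rw_semaphore a, b;

/*
 * Case 2: a single textual init_rwsem() call site means a single
 * static lock_class_key, so every rwsem initialized through this
 * helper lands in the same lockdep class.
 */
static void init_rwsem_ext(struct rw_semaphore *sem)
{
        init_rwsem(sem);
}

static void demo(void)
{
        init_rwsem_ext(&a);
        init_rwsem_ext(&b);

        down_write(&a);
        /* Same class as a: lockdep reports "possible recursive locking
         * detected" even though a and b are distinct locks. */
        down_write(&b);
        up_write(&b);
        up_write(&a);
}

In Case 1 the macro expands twice, so a and b get two distinct static
keys and the same down_write() sequence produces no splat.
amdgpu_device_init() is exactly such a shared call site for every adev in
the hive.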
Yeah, but those are negligible and we have a team working on display
support for XGMI. So this will sooner or later become a problem as well.

I still think that a single rwsem for the whole hive is the best option
here.

[Dennis Li] For your above concern, we can open a new thread to discuss
it. If we decide to use a single lock after that discussion, we can
create another patch for it.

Best Regards
Dennis Li
>
> Regards,
> Christian.
>
>> Regards,
>> Christian.
>>
>> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> Hi, Christian,
>>>       For this case, it is safe to use separate lock keys. Please see
>>> the definition of init_rwsem below. Every init_rwsem call site uses a
>>> new static key, but the devices of the hive share the same code to do
>>> initialization, so their lock_class_keys are the same. I think it is
>>> a limitation of init_rwsem. In our case, it is correct that the
>>> reset_sem of each adev has a different lock_class_key. BTW, this
>>> change doesn't affect deadlock detection, it just corrects the lock
>>> class annotation.
>>>
>>> #define init_rwsem(sem)                                 \
>>> do {                                                    \
>>>         static struct lock_class_key __key;             \
>>>                                                         \
>>>         __init_rwsem((sem), #sem, &__key);              \
>>> } while (0)
>>>
>>> Best Regards
>>> Dennis Li
>>> -----Original Message-----
>>> From: Koenig, Christian <christian.koe...@amd.com>
>>> Sent: Thursday, August 6, 2020 4:13 PM
>>> To: Li, Dennis <dennis...@amd.com>; amd-gfx@lists.freedesktop.org;
>>> Deucher, Alexander <alexander.deuc...@amd.com>; Kuehling, Felix
>>> <felix.kuehl...@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive
>>> locking
>>>
>>> Preventing locking problems during implementation is obviously a good
>>> approach, but lockdep has proven to be massively useful for finding
>>> and fixing problems.
>>>
>>> Disabling a lockdep splat by annotating locks with separate classes
>>> is usually a no-go and only allowed if there is no other potential
>>> approach.
>>>
>>> In this case here we should really clean things up instead.
>>>
>>> Christian.
>>>
>>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>
>>>> Hi, Christian,
>>>>       I agree with your concern. However we shouldn't rely on the
>>>> system to detect deadlocks, and should consider this when doing the
>>>> code implementation. In fact, deadlock detection isn't enabled by
>>>> default.
>>>>       As for your proposal to move reset_sem into the hive
>>>> structure, we can open a new topic to discuss it. Currently we can't
>>>> be sure which is the best solution. For example, with your proposal,
>>>> we must wait for all devices to resume successfully before
>>>> resubmitting an old task on one device, which will affect
>>>> performance.
>>>>
>>>> Best Regards
>>>> Dennis Li
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of
>>>> Christian König
>>>> Sent: Thursday, August 6, 2020 3:08 PM
>>>> To: Li, Dennis <dennis...@amd.com>; amd-gfx@lists.freedesktop.org;
>>>> Deucher, Alexander <alexander.deuc...@amd.com>; Kuehling, Felix
>>>> <felix.kuehl...@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive
>>>> recursive locking
>>>>
>>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>>> [ 584.110304] ============================================
>>>>> [ 584.110590] WARNING: possible recursive locking detected
>>>>> [ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE
>>>>> [ 584.111164] --------------------------------------------
>>>>> [ 584.111456] kworker/38:1/553 is trying to acquire lock:
>>>>> [ 584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.112112] but task is already holding lock:
>>>>> [ 584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.113068] other info that might help us debug this:
>>>>> [ 584.113689]  Possible unsafe locking scenario:
>>>>>
>>>>> [ 584.114350]        CPU0
>>>>> [ 584.114685]        ----
>>>>> [ 584.115014]   lock(&adev->reset_sem);
>>>>> [ 584.115349]   lock(&adev->reset_sem);
>>>>> [ 584.115678]
>>>>>  *** DEADLOCK ***
>>>>>
>>>>> [ 584.116624] May be due to missing lock nesting notation
>>>>>
>>>>> [ 584.117284] 4 locks held by kworker/38:1/553:
>>>>> [ 584.117616] #0: ffff9ad635c1d348 ((wq_completion)events){+.+.},
>>>>> at: process_one_work+0x21f/0x630
>>>>> [ 584.117967] #1: ffffac708e1c3e58
>>>>> ((work_completion)(&con->recovery_work)){+.+.}, at:
>>>>> process_one_work+0x21f/0x630
>>>>> [ 584.118358] #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at:
>>>>> amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
>>>>> [ 584.118786] #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.119222] stack backtrace:
>>>>> [ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded
>>>>> Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>>> [ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT,
>>>>> BIOS 3.1 05/23/2019
>>>>> [ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
>>>>> [ 584.121638] Call Trace:
>>>>> [ 584.122050]  dump_stack+0x98/0xd5
>>>>> [ 584.122499]  __lock_acquire+0x1139/0x16e0
>>>>> [ 584.122931]  ? trace_hardirqs_on+0x3b/0xf0
>>>>> [ 584.123358]  ? cancel_delayed_work+0xa6/0xc0
>>>>> [ 584.123771]  lock_acquire+0xb8/0x1c0
>>>>> [ 584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.124599]  down_write+0x49/0x120
>>>>> [ 584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
>>>>> [ 584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
>>>>> [ 584.126789]  process_one_work+0x29e/0x630
>>>>> [ 584.127208]  worker_thread+0x3c/0x3f0
>>>>> [ 584.127621]  ? __kthread_parkme+0x61/0x90
>>>>> [ 584.128014]  kthread+0x12f/0x150
>>>>> [ 584.128402]  ? process_one_work+0x630/0x630
>>>>> [ 584.128790]  ? kthread_park+0x90/0x90
>>>>> [ 584.129174]  ret_from_fork+0x3a/0x50
>>>>>
>>>>> Each adev has its own lock_class_key to avoid false positive
>>>>> recursive locking.
>>>> NAK, that is not a false positive but a real problem.
>>>>
>>>> The issue here is that we have multiple reset semaphores, one for
>>>> each device in the hive. If those are not acquired in the correct
>>>> order we deadlock.
>>>>
>>>> The real solution would be to move the reset_sem into the hive
>>>> structure and make sure that we lock it only once.
>>>>
>>>> Christian.
>>>>
>>>>> Signed-off-by: Dennis Li <dennis...@amd.com>
>>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>>  	atomic_t			in_gpu_reset;
>>>>>  	enum pp_mp1_state		mp1_state;
>>>>>  	struct rw_semaphore		reset_sem;
>>>>> +	struct lock_class_key		lock_key;
>>>>>  	struct amdgpu_doorbell_index	doorbell_index;
>>>>>
>>>>>  	struct mutex			notifier_lock;
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 6c572db42d92..d78df9312d34 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>>  	mutex_init(&adev->virt.vf_errors.lock);
>>>>>  	hash_init(adev->mn_hash);
>>>>>  	init_rwsem(&adev->reset_sem);
>>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>>  	atomic_set(&adev->in_gpu_reset, 0);
>>>>>  	mutex_init(&adev->psp.mutex);
>>>>>  	mutex_init(&adev->notifier_lock);
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
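Alternatively, since the splat shows that every reset_sem is acquired
while tmp->hive_lock is already held, lockdep's nest-lock annotation is
an established way to express "several same-class locks serialized by an
outer lock" without giving each instance its own class. A sketch,
assuming the down_write() in amdgpu_device_gpu_recover() is always
reached with the hive lock held:

/* Tell lockdep that hive->hive_lock serializes acquisition of all
 * per-device reset semaphores, so taking several rwsems of the same
 * class back to back is legitimate while that mutex is held. */
down_write_nest_lock(&adev->reset_sem, &hive->hive_lock);

Unlike the per-device key, this keeps all reset semaphores in one class,
so lockdep can still catch a reset_sem taken without the hive lock held.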
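And as a rough sketch of what Christian's proposal (a single reset_sem
for the whole hive) could look like. The placement of reset_sem in
amdgpu_hive_info is an assumption here, and do_suspend(), do_resume()
and do_hive_recovery() are the pseudo-helpers from the discussion above,
not real driver functions:

/* Assumed: reset_sem moved from struct amdgpu_device into the hive. */
struct amdgpu_hive_info {
        /* ... existing members (hive_id, device_list, hive_lock, ...) */
        struct rw_semaphore reset_sem;  /* hypothetical hive-wide lock */
};

static void hive_recover(struct amdgpu_hive_info *hive)
{
        struct amdgpu_device *adev;

        /* Exactly one lock for the whole hive: no acquisition-order
         * problem between devices, and lockdep stays fully armed. */
        down_write(&hive->reset_sem);

        list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
                do_suspend(adev);

        do_hive_recovery(hive);

        list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
                do_resume(adev);

        up_write(&hive->reset_sem);
}

The cost is the one Dennis describes: submissions on a healthy device can
only restart once the whole hive has resumed.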