[AMD Public Use]

> [SNIP]
>>> I think it is a limitation of init_rwsem.
>> And exactly that's wrong, this is intentional and perfectly correct.
>>
>> [Dennis Li] I couldn't understand. Why is it perfectly correct?
>> For example, we define two rw_sems: a and b. If we don't check the
>> init_rwsem definition, we may think case #1 and case #2 have the same
>> behavior, but in fact they are different.
>>
>> Case 1:
>> init_rwsem(&a);
>> init_rwsem(&b);
>>
>> Case 2:
>> void init_rwsem_ext(rw_sem *sem)
>> {
>>         init_rwsem(sem);
>> }
>> init_rwsem_ext(&a);
>> init_rwsem_ext(&b);
>>
>> As far as I know it is perfectly possible that the locks in the hive
>> are not always grabbed in the same order. And that's why lockdep is
>> complaining about this.
>> [Dennis Li] No. I take a hive with two devices (a and b) to explain
>> why lockdep complains.
>>
>> // First the driver locks the reset_sem of all devices:
>> down_write(&a->reset_sem);
>> do_suspend(a);
>> down_write(&b->reset_sem); // Because b shares the same lock_class_key
>>                            // with a, lockdep takes a and b as the same
>>                            // rw_sem and complains here.
>> do_suspend(b);
>>
>> // do recovery
>> do_hive_recovery()
>>
>> // unlock the reset_sem of all devices:
>> do_resume(a);
>> up_write(&a->reset_sem);
>> do_resume(b);
>> up_write(&b->reset_sem);
> Ah! Now I understand what you are working around. So the problem is the
> static lock_class_key in the macro?
> [Dennis Li] Yes. The author of init_rwsem might not have considered a
> use case like ours.
>
>> What we should do instead is to make sure we have only a single lock
>> for the complete hive.
>> [Dennis Li] If we use a single lock, users must wait for all devices to
>> resume successfully, even when their tasks are only running on device
>> a. That is not friendly to users.
> Well, that is intentional I would say. We can only restart submissions
> after all devices are resumed successfully, because otherwise it can
> happen that a job on device A depends on memory on device B which isn't
> accessible.
> [Dennis Li] Yes, you are right. The driver must make sure that the
> shared resources (such as the shared memory) are ready before unlocking
> the lock of each adev one by one. But each device still has private
> hardware resources such as video and display.
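To make the class-key behavior concrete: a minimal sketch, not taken from
the thread (init_rwsem_ext and demo are illustrative names), of why
Case 1 and Case 2 differ under lockdep:

#include <linux/rwsem.h>

static struct rw_semaphore a, b;

/*
 * Case 2: a single textual init_rwsem() call site means a single
 * static lock_class_key, so every rwsem initialized through this
 * helper lands in the same lockdep class.
 */
static void init_rwsem_ext(struct rw_semaphore *sem)
{
        init_rwsem(sem);
}

static void demo(void)
{
        init_rwsem_ext(&a);
        init_rwsem_ext(&b);

        down_write(&a);
        /* Same class as a: lockdep reports "possible recursive locking
         * detected" even though a and b are distinct locks. */
        down_write(&b);
        up_write(&b);
        up_write(&a);
}

In Case 1 the macro expands twice, so a and b get two distinct static
keys and the same down_write() sequence produces no splat.
amdgpu_device_init() is exactly such a shared call site for every adev in
the hive.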
Yeah, but those are negligible and we have a team working on display
support for XGMI. So this will sooner or later become a problem as well.

I still think that a single rwsem for the whole hive is the best option
here.

[Dennis Li] For your above concern, we can open a new thread to discuss
it. If we decide to use a single lock after that discussion, we can
create another patch for it.

Best Regards
Dennis Li
>
> Regards,
> Christian.
>
>> Regards,
>> Christian.
>>
>> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> Hi, Christian,
>>>       For this case, it is safe to use separate lock keys. Please see
>>> the definition of init_rwsem below. Every init_rwsem call site uses a
>>> new static key, but the devices of the hive share the same code to do
>>> initialization, so their lock_class_keys are the same. I think it is
>>> a limitation of init_rwsem. In our case, it is correct that the
>>> reset_sem of each adev has a different lock_class_key. BTW, this
>>> change doesn't affect deadlock detection, it just corrects the lock
>>> class annotation.
>>>
>>> #define init_rwsem(sem)                                 \
>>> do {                                                    \
>>>         static struct lock_class_key __key;             \
>>>                                                         \
>>>         __init_rwsem((sem), #sem, &__key);              \
>>> } while (0)
>>>
>>> Best Regards
>>> Dennis Li
>>> -----Original Message-----
>>> From: Koenig, Christian <christian.koe...@amd.com>
>>> Sent: Thursday, August 6, 2020 4:13 PM
>>> To: Li, Dennis <dennis...@amd.com>; amd-gfx@lists.freedesktop.org;
>>> Deucher, Alexander <alexander.deuc...@amd.com>; Kuehling, Felix
>>> <felix.kuehl...@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive
>>> locking
>>>
>>> Preventing locking problems during implementation is obviously a good
>>> approach, but lockdep has proven to be massively useful for finding
>>> and fixing problems.
>>>
>>> Disabling a lockdep splat by annotating locks with separate classes
>>> is usually a no-go and only allowed if there is no other potential
>>> approach.
>>>
>>> In this case here we should really clean things up instead.
>>>
>>> Christian.
>>>
>>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>
>>>> Hi, Christian,
>>>>       I agree with your concern. However we shouldn't rely on the
>>>> system to detect deadlocks, and should consider this when doing the
>>>> code implementation. In fact, deadlock detection isn't enabled by
>>>> default.
>>>>       As for your proposal to move reset_sem into the hive
>>>> structure, we can open a new topic to discuss it. Currently we can't
>>>> be sure which is the best solution. For example, with your proposal,
>>>> we must wait for all devices to resume successfully before
>>>> resubmitting an old task on one device, which will affect
>>>> performance.
>>>>
>>>> Best Regards
>>>> Dennis Li
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of
>>>> Christian König
>>>> Sent: Thursday, August 6, 2020 3:08 PM
>>>> To: Li, Dennis <dennis...@amd.com>; amd-gfx@lists.freedesktop.org;
>>>> Deucher, Alexander <alexander.deuc...@amd.com>; Kuehling, Felix
>>>> <felix.kuehl...@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive
>>>> recursive locking
>>>>
>>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>>> [ 584.110304] ============================================
>>>>> [ 584.110590] WARNING: possible recursive locking detected
>>>>> [ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE
>>>>> [ 584.111164] --------------------------------------------
>>>>> [ 584.111456] kworker/38:1/553 is trying to acquire lock:
>>>>> [ 584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.112112] but task is already holding lock:
>>>>> [ 584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.113068] other info that might help us debug this:
>>>>> [ 584.113689]  Possible unsafe locking scenario:
>>>>>
>>>>> [ 584.114350]        CPU0
>>>>> [ 584.114685]        ----
>>>>> [ 584.115014]   lock(&adev->reset_sem);
>>>>> [ 584.115349]   lock(&adev->reset_sem);
>>>>> [ 584.115678]
>>>>>  *** DEADLOCK ***
>>>>>
>>>>> [ 584.116624] May be due to missing lock nesting notation
>>>>>
>>>>> [ 584.117284] 4 locks held by kworker/38:1/553:
>>>>> [ 584.117616] #0: ffff9ad635c1d348 ((wq_completion)events){+.+.},
>>>>> at: process_one_work+0x21f/0x630
>>>>> [ 584.117967] #1: ffffac708e1c3e58
>>>>> ((work_completion)(&con->recovery_work)){+.+.}, at:
>>>>> process_one_work+0x21f/0x630
>>>>> [ 584.118358] #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at:
>>>>> amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
>>>>> [ 584.118786] #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.119222] stack backtrace:
>>>>> [ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded
>>>>> Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>>> [ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT,
>>>>> BIOS 3.1 05/23/2019
>>>>> [ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
>>>>> [ 584.121638] Call Trace:
>>>>> [ 584.122050]  dump_stack+0x98/0xd5
>>>>> [ 584.122499]  __lock_acquire+0x1139/0x16e0
>>>>> [ 584.122931]  ? trace_hardirqs_on+0x3b/0xf0
>>>>> [ 584.123358]  ? cancel_delayed_work+0xa6/0xc0
>>>>> [ 584.123771]  lock_acquire+0xb8/0x1c0
>>>>> [ 584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.124599]  down_write+0x49/0x120
>>>>> [ 584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
>>>>> [ 584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
>>>>> [ 584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
>>>>> [ 584.126789]  process_one_work+0x29e/0x630
>>>>> [ 584.127208]  worker_thread+0x3c/0x3f0
>>>>> [ 584.127621]  ? __kthread_parkme+0x61/0x90
>>>>> [ 584.128014]  kthread+0x12f/0x150
>>>>> [ 584.128402]  ? process_one_work+0x630/0x630
>>>>> [ 584.128790]  ? kthread_park+0x90/0x90
>>>>> [ 584.129174]  ret_from_fork+0x3a/0x50
>>>>>
>>>>> Each adev has its own lock_class_key to avoid false positive
>>>>> recursive locking.
>>>> NAK, that is not a false positive but a real problem.
>>>>
>>>> The issue here is that we have multiple reset semaphores, one for
>>>> each device in the hive. If those are not acquired in the correct
>>>> order we deadlock.
>>>>
>>>> The real solution would be to move the reset_sem into the hive
>>>> structure and make sure that we lock it only once.
>>>>
>>>> Christian.
>>>>
>>>>> Signed-off-by: Dennis Li <dennis...@amd.com>
>>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>>  	atomic_t			in_gpu_reset;
>>>>>  	enum pp_mp1_state		mp1_state;
>>>>>  	struct rw_semaphore		reset_sem;
>>>>> +	struct lock_class_key		lock_key;
>>>>>  	struct amdgpu_doorbell_index	doorbell_index;
>>>>>
>>>>>  	struct mutex			notifier_lock;
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 6c572db42d92..d78df9312d34 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>>  	mutex_init(&adev->virt.vf_errors.lock);
>>>>>  	hash_init(adev->mn_hash);
>>>>>  	init_rwsem(&adev->reset_sem);
>>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>>  	atomic_set(&adev->in_gpu_reset, 0);
>>>>>  	mutex_init(&adev->psp.mutex);
>>>>>  	mutex_init(&adev->notifier_lock);
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
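Alternatively, since the splat shows that every reset_sem is acquired
while tmp->hive_lock is already held, lockdep's nest-lock annotation is
an established way to express "several same-class locks serialized by an
outer lock" without giving each instance its own class. A sketch,
assuming the down_write() in amdgpu_device_gpu_recover() is always
reached with the hive lock held:

/* Tell lockdep that hive->hive_lock serializes acquisition of all
 * per-device reset semaphores, so taking several rwsems of the same
 * class back to back is legitimate while that mutex is held. */
down_write_nest_lock(&adev->reset_sem, &hive->hive_lock);

Unlike the per-device key, this keeps all reset semaphores in one class,
so lockdep can still catch a reset_sem taken without the hive lock held.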
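And as a rough sketch of what Christian's proposal (a single reset_sem
for the whole hive) could look like. The placement of reset_sem in
amdgpu_hive_info is an assumption here, and do_suspend(), do_resume()
and do_hive_recovery() are the pseudo-helpers from the discussion above,
not real driver functions:

/* Assumed: reset_sem moved from struct amdgpu_device into the hive. */
struct amdgpu_hive_info {
        /* ... existing members (hive_id, device_list, hive_lock, ...) */
        struct rw_semaphore reset_sem;  /* hypothetical hive-wide lock */
};

static void hive_recover(struct amdgpu_hive_info *hive)
{
        struct amdgpu_device *adev;

        /* Exactly one lock for the whole hive: no acquisition-order
         * problem between devices, and lockdep stays fully armed. */
        down_write(&hive->reset_sem);

        list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
                do_suspend(adev);

        do_hive_recovery(hive);

        list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
                do_resume(adev);

        up_write(&hive->reset_sem);
}

The cost is the one Dennis describes: submissions on a healthy device can
only restart once the whole hive has resumed.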