RE: [PATCH V4 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

2022-04-01 Thread Quan, Evan
[AMD Official Use Only] Thanks for confirming. I probably know the root cause. Let me prepare an official patch for you. BR Evan > -----Original Message----- > From: Arthur Marsh > Sent: Friday, April 1, 2022 8:19 PM > To: Quan, Evan > Cc: Deucher, Alexander; Koenig, Christian; Feng, Ke

[PATCH v2 1/1] drm/amdkfd: Improve concurrency of event handling

2022-04-01 Thread Felix Kuehling
Use rcu_read_lock to read p->event_idr concurrently with other readers and writers. Use p->event_mutex only for creating and destroying events and in kfd_wait_on_events. Protect the contents of the kfd_event structure with a per-event spinlock that can be taken inside the rcu_read_lock critical se

Re: [PATCH v2 1/3] mm: add vm_normal_lru_pages for LRU handled pages only

2022-04-01 Thread Felix Kuehling
On 2022-03-31 04:53, Christoph Hellwig wrote: - page = vm_normal_page(vma, addr, pte); + page = vm_normal_lru_page(vma, addr, pte); Why can't this deal with ZONE_DEVICE pages? It certainly has nothing to do with an LRU, I think. In fact being able to have stats that count say the numb

[PATCH 1/1] drm/amdgpu: Flush TLB after mapping for VG20+XGMI

2022-04-01 Thread Philip Yang
For VG20 + XGMI bridge, all mapping PTEs are cached in the TC, which may leave stale invalid PTEs in the TC because one cache line has 8 pages. We always need to flush_tlb after updating a mapping. Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++ 1 file changed, 6 insertions(+) diff

Re: [PATCH] drm/amdgpu: fix TLB flushing during eviction

2022-04-01 Thread Felix Kuehling
On 2022-04-01 04:24, Christian König wrote: On 31.03.22 at 16:37, Felix Kuehling wrote: On 2022-03-31 at 02:27, Christian König wrote: On 30.03.22 at 22:51, philip yang wrote: On 2022-03-30 05:00, Christian König wrote: Testing the valid bit is not enough to figure out if we need to in

[linux-next:master] BUILD REGRESSION e5071887cd2296a7704dbcd10c1cedf0f11cdbd5

2022-04-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: e5071887cd2296a7704dbcd10c1cedf0f11cdbd5 Add linux-next specific files for 20220401 Error/Warning reports: https://lore.kernel.org/linux-media/202203171537.svhye362-...@intel.com https

Re: [PATCH 1/1] drm: add PSR2 support and capability definition as per eDP 1.5

2022-04-01 Thread Zhang, Dingchen (David)
[AMD Official Use Only] Hi Paul and Harry, Thanks for reviewing the patch and commit msg has been revised as per your comments in the v2. From: Paul Menzel Sent: Friday, April 1, 2022 1:46 AM To: Zhang, Dingchen (David) Cc: amd-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ;

Re: [PATCH] drm/amdgpu/vcn: remove Unneeded semicolon

2022-04-01 Thread 白浩文
Dear Alex Deucher, I just meant where the unneeded semicolon came from when I added the Fixes info. Thanks to your reminder, I understand now, thank you. Original message From: Alex Deucher Time: April 1, 2022 21:26 To: Paul Menzel Cc: 白浩文, David Airlie, "Pan, Xinhui", LKML, Mailing list - DRI developers, amd-gfx

Re: [PATCH V2] drm/amdgpu/vcn: Remove unneeded semicolon

2022-04-01 Thread Alex Deucher
Applied. Thanks! Alex On Fri, Apr 1, 2022 at 3:23 AM Haowen Bai wrote: > > report by coccicheck: > drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1951:2-3: Unneeded semicolon > > Fixes: c543dcbe4237 ("drm/amdgpu/vcn: Add VCN ras error query support") > > Signed-off-by: Haowen Bai > --- > V1->V2: change

Re: [PATCH v2] drm/amd/display: Fix unused-but-set-variable warning

2022-04-01 Thread Alex Deucher
Applied. Thanks! Alex On Thu, Mar 24, 2022 at 9:46 AM Aashish Sharma wrote: > > Fix the kernel test robot warning below: > > drivers/gpu/drm/amd/amdgpu/../display/dmub/inc/dmub_cmd.h:2893:12: > warning: variable 'temp' set but not used [-Wunused-but-set-variable] > > Replaced the assignment to

[PATCH] drm/amdgpu/smu10: fix SoC/fclk units in auto mode

2022-04-01 Thread Alex Deucher
SMU takes clock limits in MHz units. socclk and fclk were using 10 kHz units in some cases. Switch to MHz units. Fixes higher than required SoC clocks. Fixes: 97cf32996c46d9 ("drm/amd/pm: Removed fixed clock in auto mode DPM") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/pm/powerplay/hw

[PATCH AUTOSEL 4.19 16/29] drm/amdkfd: make CRAT table missing message informational only

2022-04-01 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 9dff13f9edf755a15f6507874185a3290c1ae8bb ] The driver has a fallback so make the message informational rather than a warning. The driver has a fallback if the Component Resource Association Table (CRAT) is missing, so make this informational now. Bug: https:

[PATCH AUTOSEL 4.19 03/29] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-04-01 Thread Sasha Levin
From: Xin Xiong [ Upstream commit dfced44f122c54a48ecc8db516bb6a295a1b ] This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence

Re: [PATCH v2] drm/amdgpu: don't use BACO for reset in S3

2022-04-01 Thread Lazar, Lijo
On 4/1/2022 7:32 PM, Alex Deucher wrote: Seems to cause reboots or hangs on some systems. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1924 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1953 Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-of

[PATCH AUTOSEL 5.4 20/37] drm/amdkfd: make CRAT table missing message informational only

2022-04-01 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 9dff13f9edf755a15f6507874185a3290c1ae8bb ] The driver has a fallback so make the message informational rather than a warning. The driver has a fallback if the Component Resource Association Table (CRAT) is missing, so make this informational now. Bug: https:

[PATCH AUTOSEL 5.4 12/37] drm/amdgpu: Fix recursive locking warning

2022-04-01 Thread Sasha Levin
From: Rajneesh Bhardwaj [ Upstream commit 447c7997b62a5115ba4da846dcdee4fc12298a6a ] Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.03] WARNING: possible recursive locking detected [

[PATCH AUTOSEL 5.4 03/37] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-04-01 Thread Sasha Levin
From: Xin Xiong [ Upstream commit dfced44f122c54a48ecc8db516bb6a295a1b ] This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence

[PATCH AUTOSEL 5.10 32/65] drm/amdkfd: make CRAT table missing message informational only

2022-04-01 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 9dff13f9edf755a15f6507874185a3290c1ae8bb ] The driver has a fallback so make the message informational rather than a warning. The driver has a fallback if the Component Resource Association Table (CRAT) is missing, so make this informational now. Bug: https:

[PATCH AUTOSEL 5.10 20/65] drm/amdgpu: Fix recursive locking warning

2022-04-01 Thread Sasha Levin
From: Rajneesh Bhardwaj [ Upstream commit 447c7997b62a5115ba4da846dcdee4fc12298a6a ] Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.03] WARNING: possible recursive locking detected [

[PATCH AUTOSEL 5.10 04/65] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-04-01 Thread Sasha Levin
From: Xin Xiong [ Upstream commit dfced44f122c54a48ecc8db516bb6a295a1b ] This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence

[PATCH AUTOSEL 5.10 03/65] drm/amd/display: Add signal type check when verify stream backends same

2022-04-01 Thread Sasha Levin
From: Dale Zhao [ Upstream commit 047db281c026de5971cedb5bb486aa29bd16a39d ] [Why] To allow the eDP hot-plug feature, the stream signal may change to VIRTUAL on plug-out and back to eDP on plug-in. The OS will still call setPathMode with the same timing for each plug event, but eDP gets no stream update as we

[PATCH AUTOSEL 5.15 51/98] drm/amdkfd: make CRAT table missing message informational only

2022-04-01 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 9dff13f9edf755a15f6507874185a3290c1ae8bb ] The driver has a fallback so make the message informational rather than a warning. The driver has a fallback if the Component Resource Association Table (CRAT) is missing, so make this informational now. Bug: https:

[PATCH AUTOSEL 5.15 30/98] drm/amdgpu: Fix recursive locking warning

2022-04-01 Thread Sasha Levin
From: Rajneesh Bhardwaj [ Upstream commit 447c7997b62a5115ba4da846dcdee4fc12298a6a ] Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.03] WARNING: possible recursive locking detected [

[PATCH AUTOSEL 5.15 11/98] drm/amdkfd: Don't take process mutex for svm ioctls

2022-04-01 Thread Sasha Levin
From: Philip Yang [ Upstream commit ac7c48c0cce00d03b3c95fddcccb0a45257e33e3 ] SVM ioctls take the proper svms->lock to handle race conditions, so there is no need to take the process mutex to serialize ioctls. This also fixes a circular locking warning: WARNING: possible circular locking dependency detected Poss

[PATCH AUTOSEL 5.15 06/98] drm/amd/display: Use PSR version selected during set_psr_caps

2022-04-01 Thread Sasha Levin
From: Nicholas Kazlauskas [ Upstream commit b80ddeb29d9df449f875f0b6f5de08d7537c02b8 ] [Why] If the DPCD caps specify a PSR version newer than PSR_VERSION_1 then we fall back to using PSR_VERSION_1 in amdgpu_dm_set_psr_caps. This gets overridden with the raw DPCD value in amdgpu_dm_link_setup_p

[PATCH AUTOSEL 5.15 05/98] drm/amd/display: Fix memory leak

2022-04-01 Thread Sasha Levin
From: Yongzhi Liu [ Upstream commit 5d5c6dba2b43e28845d7d7ed32a36802329a5f52 ] [why] Resource release is needed on the error handling path to prevent memory leak. [how] Fix this by adding kfree on the error handling path. Reviewed-by: Harry Wentland Signed-off-by: Yongzhi Liu Signed-off-by:

[PATCH AUTOSEL 5.15 04/98] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-04-01 Thread Sasha Levin
From: Xin Xiong [ Upstream commit dfced44f122c54a48ecc8db516bb6a295a1b ] This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence

[PATCH AUTOSEL 5.15 03/98] drm/amd/display: Add signal type check when verify stream backends same

2022-04-01 Thread Sasha Levin
From: Dale Zhao [ Upstream commit 047db281c026de5971cedb5bb486aa29bd16a39d ] [Why] To allow the eDP hot-plug feature, the stream signal may change to VIRTUAL on plug-out and back to eDP on plug-in. The OS will still call setPathMode with the same timing for each plug event, but eDP gets no stream update as we

[PATCH AUTOSEL 5.16 062/109] drm/amdkfd: make CRAT table missing message informational only

2022-04-01 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 9dff13f9edf755a15f6507874185a3290c1ae8bb ] The driver has a fallback so make the message informational rather than a warning. The driver has a fallback if the Component Resource Association Table (CRAT) is missing, so make this informational now. Bug: https:

[PATCH AUTOSEL 5.16 050/109] drm/amd/display: reset lane settings after each PHY repeater LT

2022-04-01 Thread Sasha Levin
From: Sung Joon Kim [ Upstream commit 3b853c316c9321e195414a6fb121d1c2d45b1e87 ] [why] In LTTPR non-transparent mode, we need to reset the cached lane settings before performing link training on the next PHY repeater. Otherwise, the cached lane settings will be used for the next clock recovery e

[PATCH AUTOSEL 5.16 036/109] drm/amdgpu: Fix recursive locking warning

2022-04-01 Thread Sasha Levin
From: Rajneesh Bhardwaj [ Upstream commit 447c7997b62a5115ba4da846dcdee4fc12298a6a ] Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.03] WARNING: possible recursive locking detected [

[PATCH AUTOSEL 5.16 017/109] drm/amdkfd: svm range restore work deadlock when process exit

2022-04-01 Thread Sasha Levin
From: Philip Yang [ Upstream commit 6225bb3a88d22594aacea2485dc28ca12d596721 ] kfd_process_notifier_release flushes svm_range_restore_work, which calls svm_range_list_lock_and_flush_work to flush the deferred_list work, but if the deferred_list work's mmput releases the last user, it will call exit_mmap -> no

[PATCH AUTOSEL 5.16 016/109] drm/amdkfd: Ensure mm remain valid in svm deferred_list work

2022-04-01 Thread Sasha Levin
From: Philip Yang [ Upstream commit 367c9b0f1b8750a704070e7ae85234d591290434 ] svm_deferred_list work should continue to handle the deferred_range_list, which may be split into child ranges, to avoid a child range leak, and remove the ranges' mmu interval notifier to avoid an mm mm_count leak. So taking mm referenc

[PATCH AUTOSEL 5.16 015/109] drm/amdkfd: Don't take process mutex for svm ioctls

2022-04-01 Thread Sasha Levin
From: Philip Yang [ Upstream commit ac7c48c0cce00d03b3c95fddcccb0a45257e33e3 ] SVM ioctls take the proper svms->lock to handle race conditions, so there is no need to take the process mutex to serialize ioctls. This also fixes a circular locking warning: WARNING: possible circular locking dependency detected Poss

[PATCH AUTOSEL 5.16 009/109] drm/amd/display: Use PSR version selected during set_psr_caps

2022-04-01 Thread Sasha Levin
From: Nicholas Kazlauskas [ Upstream commit b80ddeb29d9df449f875f0b6f5de08d7537c02b8 ] [Why] If the DPCD caps specify a PSR version newer than PSR_VERSION_1 then we fall back to using PSR_VERSION_1 in amdgpu_dm_set_psr_caps. This gets overridden with the raw DPCD value in amdgpu_dm_link_setup_p

[PATCH AUTOSEL 5.16 008/109] drm/amd/display: Fix memory leak

2022-04-01 Thread Sasha Levin
From: Yongzhi Liu [ Upstream commit 5d5c6dba2b43e28845d7d7ed32a36802329a5f52 ] [why] Resource release is needed on the error handling path to prevent memory leak. [how] Fix this by adding kfree on the error handling path. Reviewed-by: Harry Wentland Signed-off-by: Yongzhi Liu Signed-off-by:

[PATCH AUTOSEL 5.16 007/109] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-04-01 Thread Sasha Levin
From: Xin Xiong [ Upstream commit dfced44f122c54a48ecc8db516bb6a295a1b ] This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence

[PATCH AUTOSEL 5.16 003/109] drm/amd/display: Add signal type check when verify stream backends same

2022-04-01 Thread Sasha Levin
From: Dale Zhao [ Upstream commit 047db281c026de5971cedb5bb486aa29bd16a39d ] [Why] To allow the eDP hot-plug feature, the stream signal may change to VIRTUAL on plug-out and back to eDP on plug-in. The OS will still call setPathMode with the same timing for each plug event, but eDP gets no stream update as we

[PATCH AUTOSEL 5.17 091/149] drm/amdkfd: make CRAT table missing message informational only

2022-04-01 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 9dff13f9edf755a15f6507874185a3290c1ae8bb ] The driver has a fallback so make the message informational rather than a warning. The driver has a fallback if the Component Resource Association Table (CRAT) is missing, so make this informational now. Bug: https:

[PATCH AUTOSEL 5.17 073/149] drm/amd/display: reset lane settings after each PHY repeater LT

2022-04-01 Thread Sasha Levin
From: Sung Joon Kim [ Upstream commit 3b853c316c9321e195414a6fb121d1c2d45b1e87 ] [why] In LTTPR non-transparent mode, we need to reset the cached lane settings before performing link training on the next PHY repeater. Otherwise, the cached lane settings will be used for the next clock recovery e

[PATCH AUTOSEL 5.17 048/149] drm/amdgpu: Fix recursive locking warning

2022-04-01 Thread Sasha Levin
From: Rajneesh Bhardwaj [ Upstream commit 447c7997b62a5115ba4da846dcdee4fc12298a6a ] Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.03] WARNING: possible recursive locking detected [

[PATCH AUTOSEL 5.17 024/149] drm/amdgpu: Fix an error message in rmmod

2022-04-01 Thread Sasha Levin
From: "Tianci.Yin" [ Upstream commit 7270e8957eb9aacf5914605d04865f3829a14bce ] [why] In the rmmod procedure, kfd sends cp a dequeue request, but the request does not get a response, so the error message "cp queue pipe 4 queue 0 preemption failed" is printed. [how] Performing kfd suspending after disab

[PATCH AUTOSEL 5.17 023/149] drm/amdkfd: svm range restore work deadlock when process exit

2022-04-01 Thread Sasha Levin
From: Philip Yang [ Upstream commit 6225bb3a88d22594aacea2485dc28ca12d596721 ] kfd_process_notifier_release flushes svm_range_restore_work, which calls svm_range_list_lock_and_flush_work to flush the deferred_list work, but if the deferred_list work's mmput releases the last user, it will call exit_mmap -> no

[PATCH AUTOSEL 5.17 022/149] drm/amdkfd: Ensure mm remain valid in svm deferred_list work

2022-04-01 Thread Sasha Levin
From: Philip Yang [ Upstream commit 367c9b0f1b8750a704070e7ae85234d591290434 ] svm_deferred_list work should continue to handle the deferred_range_list, which may be split into child ranges, to avoid a child range leak, and remove the ranges' mmu interval notifier to avoid an mm mm_count leak. So taking mm referenc

[PATCH AUTOSEL 5.17 021/149] drm/amdkfd: Don't take process mutex for svm ioctls

2022-04-01 Thread Sasha Levin
From: Philip Yang [ Upstream commit ac7c48c0cce00d03b3c95fddcccb0a45257e33e3 ] SVM ioctls take the proper svms->lock to handle race conditions, so there is no need to take the process mutex to serialize ioctls. This also fixes a circular locking warning: WARNING: possible circular locking dependency detected Poss

[PATCH AUTOSEL 5.17 013/149] drm/amd/display: Use PSR version selected during set_psr_caps

2022-04-01 Thread Sasha Levin
From: Nicholas Kazlauskas [ Upstream commit b80ddeb29d9df449f875f0b6f5de08d7537c02b8 ] [Why] If the DPCD caps specify a PSR version newer than PSR_VERSION_1 then we fall back to using PSR_VERSION_1 in amdgpu_dm_set_psr_caps. This gets overridden with the raw DPCD value in amdgpu_dm_link_setup_p

[PATCH AUTOSEL 5.17 012/149] drm/amd/display: Fix memory leak

2022-04-01 Thread Sasha Levin
From: Yongzhi Liu [ Upstream commit 5d5c6dba2b43e28845d7d7ed32a36802329a5f52 ] [why] Resource release is needed on the error handling path to prevent memory leak. [how] Fix this by adding kfree on the error handling path. Reviewed-by: Harry Wentland Signed-off-by: Yongzhi Liu Signed-off-by:

[PATCH AUTOSEL 5.17 011/149] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-04-01 Thread Sasha Levin
From: Xin Xiong [ Upstream commit dfced44f122c54a48ecc8db516bb6a295a1b ] This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence

[PATCH AUTOSEL 5.17 005/149] drm/amd/display: Add signal type check when verify stream backends same

2022-04-01 Thread Sasha Levin
From: Dale Zhao [ Upstream commit 047db281c026de5971cedb5bb486aa29bd16a39d ] [Why] To allow the eDP hot-plug feature, the stream signal may change to VIRTUAL on plug-out and back to eDP on plug-in. The OS will still call setPathMode with the same timing for each plug event, but eDP gets no stream update as we

[PATCH AUTOSEL 5.17 006/149] drm/amdkfd: enable heavy-weight TLB flush on Arcturus

2022-04-01 Thread Sasha Levin
From: Eric Huang [ Upstream commit f61c40c0757a79bcf744314df606c2bc8ae6a729 ] SDMA FW fixes the hang issue for adding heavy-weight TLB flush on Arcturus, so we can enable it. Signed-off-by: Eric Huang Acked-by: Alex Deucher Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-

[PATCH v2] drm/amdgpu: don't use BACO for reset in S3

2022-04-01 Thread Alex Deucher
Seems to cause reboots or hangs on some systems. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1924 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1953 Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/p

Re: [PATCH] drm/amdgpu: don't use BACO for reset in S3

2022-04-01 Thread Alex Deucher
On Fri, Apr 1, 2022 at 7:53 AM Lazar, Lijo wrote: > > > > On 3/31/2022 8:26 PM, Alex Deucher wrote: > > Seems to cause reboots or hangs on some systems. > > > > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1924 > > Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1953 > > Fixes: daf8

Re: [PATCH] drm/amdgpu/vcn: remove Unneeded semicolon

2022-04-01 Thread Alex Deucher
On Fri, Apr 1, 2022 at 1:54 AM Paul Menzel wrote: > > Dear Haowen, > > > Thank you for your patch. > > On 31.03.22 at 07:56, Haowen Bai wrote: > > In the commit message summary, please use: > > Remove unneeded semicolon > > > report by coccicheck: > > drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1951:2-

[PATCH v2] drm: add PSR2 support and capability definition as per eDP 1.5

2022-04-01 Thread David Zhang
[why & how] As per eDP 1.5 spec, add the below two DPCD bit fields for PSR-SU support and capability: 1. DP_PSR2_WITH_Y_COORD_ET_SUPPORTED 2. DP_PSR2_SU_AUX_FRAME_SYNC_NOT_NEEDED changes in v2 -- * fixed the typo * explicitly list what DPCD bit fields are added Signed-off-by: Dav

[PATCH v2 1/1] drm/amdkfd: Create file descriptor after client is added to smi_clients list

2022-04-01 Thread Lee Jones
This ensures userspace cannot prematurely clean-up the client before it is fully initialised which has been proven to cause issues in the past. Cc: Felix Kuehling Cc: Alex Deucher Cc: "Christian König" Cc: "Pan, Xinhui" Cc: David Airlie Cc: Daniel Vetter Cc: amd-gfx@lists.freedesktop.org Cc:

[PATCH V4 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

2022-04-01 Thread Arthur Marsh
Hi Evan, this is what was logged (filtering for drm and amdgpu) when I blacklisted amdgpu then manually did: modprobe amdgpu si_support=1 gpu_recovery=1 Apr 1 18:31:14 am64 kernel: [0.00] Command line: BOOT_IMAGE=/vmlinuz-5.17.0+ root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro amdgp

[PATCH V4 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

2022-04-01 Thread Arthur Marsh
Hi, short answer is that with both patches applied, I am successfully running the amdgpu kernel module on radeonsi (plasma desktop on X.org). I confirmed that CONFIG_LOCKDEP_SUPPORT=y is enabled in the kernel. With the first patch applied and remotely connecting to the machine and loading amdgpu

Re: [PATCH v2 1/1] drm/amdkfd: Create file descriptor after client is added to smi_clients list

2022-04-01 Thread Lee Jones
On Fri, 01 Apr 2022, Lee Jones wrote: > This ensures userspace cannot prematurely clean-up the client before > it is fully initialised which has been proven to cause issues in the > past. > > Cc: Felix Kuehling > Cc: Alex Deucher > Cc: "Christian König" > Cc: "Pan, Xinhui" > Cc: David Airlie

Re: [PATCH] drm/amdgpu: don't use BACO for reset in S3

2022-04-01 Thread Lazar, Lijo
On 3/31/2022 8:26 PM, Alex Deucher wrote: Seems to cause reboots or hangs on some systems. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1924 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1953 Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-o

RE: [PATCH V4 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

2022-04-01 Thread Quan, Evan
[AMD Official Use Only] Yes, as Christian mentioned, enabling CONFIG_LOCKDEP_SUPPORT will help debug such deadlock issues. Meanwhile, can you give the following change (dropping the lock protections in amdgpu_dpm_compute_clocks) a try? diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c b/drivers/g

Re: [PATCH V4 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

2022-04-01 Thread Christian König
Hi Arthur, apart from blacklisting amdgpu, I generally advise connecting via SSH from another computer to the affected system if you have a problem like this. In addition to what Evan said, I suggest that you enable CONFIG_LOCKDEP_SUPPORT in your kernel configuration. This will yield warnings in your s

Re: [PATCH] drm/amdgpu: fix TLB flushing during eviction

2022-04-01 Thread Christian König
On 31.03.22 at 16:37, Felix Kuehling wrote: On 2022-03-31 at 02:27, Christian König wrote: On 30.03.22 at 22:51, philip yang wrote: On 2022-03-30 05:00, Christian König wrote: Testing the valid bit is not enough to figure out if we need to invalidate the TLB or not. During eviction it is

Re: [PATCH] drm/amdgpu/vcn: remove Unneeded semicolon

2022-04-01 Thread baihaowen
On 4/1/22 1:54 PM, Paul Menzel wrote: > Dear Haowen, > > > Thank you for your patch. > > On 31.03.22 at 07:56, Haowen Bai wrote: > > In the commit message summary, please use: > > Remove unneeded semicolon > >> report by coccicheck: >> drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1951:2-3: Unneeded semicolon

[PATCH V2] drm/amdgpu/vcn: Remove unneeded semicolon

2022-04-01 Thread Haowen Bai
report by coccicheck: drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c:1951:2-3: Unneeded semicolon Fixes: c543dcbe4237 ("drm/amdgpu/vcn: Add VCN ras error query support") Signed-off-by: Haowen Bai --- V1->V2: change title; change Fixed info; drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 2 +- 1 file changed,

[PATCH V3] drm/amdgpu: expand cg_flags from u32 to u64

2022-04-01 Thread Evan Quan
With this, we can support more CG flags. Signed-off-by: Evan Quan Acked-by: Alex Deucher Reviewed-by: Hawking Zhang Change-Id: Iccf13c2f9c570ca6a4654291fc4876556125c3b8 -- v1->v2: - amdgpu_debugfs_gca_config_read: add a new rev to support CG flag upper 32 bits(Alex) v2->v3: - use '%llx'

RE: [PATCH V4 17/17] drm/amd/pm: unified lock protections in amdgpu_dpm.c

2022-04-01 Thread Quan, Evan
[AMD Official Use Only] Hi Arthur, Can you try to blacklist the amdgpu module first and then load the driver manually? Hopefully that gives you a chance to observe the errors reported by the driver. BR Evan > -----Original Message----- > From: Arthur Marsh > Sent: Thursday, March 31, 2022 12:27 PM >