RE: [PATCH 1/4] drm/amd/pm: drop unnecessary feature->mutex lock protections(V2)

2020-08-31 Thread Quan, Evan
[AMD Official Use Only - Internal Distribution Only] Yes. These two APIs get called only during hw/smu initialization. Although hw initialization is also needed on resume/gpu reset, they share no race conditions (there cannot be another gpu reset/resume until the current gpu reset/resume is done)

[PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Nirmoy Das
UVD dependent jobs should run on the same udv instance. This patch disables gpu scheduler's load balancer for a context which binds jobs from same the context to a udv instance. Signed-off-by: Nirmoy Das --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 4 +++- 1 file changed, 3 insertions(+), 1 del

Re: [PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Christian König
Am 31.08.20 um 12:45 schrieb Nirmoy Das: UVD dependent jobs should run on the same udv instance. This patch disables gpu scheduler's load balancer for a context which binds jobs from same the context to a udv instance. Signed-off-by: Nirmoy Das Reviewed-by: Christian König --- drivers/gp

Re: [PATCH v2 1/7] drm/amdgpu: Implement DPC recovery

2020-08-31 Thread Christian König
Am 28.08.20 um 18:05 schrieb Andrey Grodzovsky: Add DPC handlers with basic recovery functionality. v2: remove pci_save_state to avoid breaking suspend/resume Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 9 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

Re: [PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Alex Deucher
On Mon, Aug 31, 2020 at 6:41 AM Nirmoy Das wrote: > > UVD dependent jobs should run on the same udv instance. > This patch disables gpu scheduler's load balancer for > a context which binds jobs from same the context to a udv > instance. typos: udv -> uvd With that fixed: Reviewed-by: Alex Deuche

Re: [PATCH v2 1/7] drm/amdgpu: Implement DPC recovery

2020-08-31 Thread Andrey Grodzovsky
On 8/28/20 3:23 PM, Alex Deucher wrote: On Fri, Aug 28, 2020 at 12:06 PM Andrey Grodzovsky wrote: Add DPC handlers with basic recovery functionality. v2: remove pci_save_state to avoid breaking suspend/resume Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu.h

Re: [PATCH 1/7] drm/amdgpu: Implement DPC recovery

2020-08-31 Thread Andrey Grodzovsky
On 8/28/20 10:07 PM, Luben Tuikov wrote: On 2020-08-26 10:46, Andrey Grodzovsky wrote: Add DPC handlers with basic recovery functionality. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 9 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 181 +++

Re: [PATCH v2 1/7] drm/amdgpu: Implement DPC recovery

2020-08-31 Thread Alex Deucher
On Mon, Aug 31, 2020 at 10:26 AM Andrey Grodzovsky wrote: > > > On 8/28/20 3:23 PM, Alex Deucher wrote: > > On Fri, Aug 28, 2020 at 12:06 PM Andrey Grodzovsky > > wrote: > >> Add DPC handlers with basic recovery functionality. > >> > >> v2: remove pci_save_state to avoid breaking suspend/resume >

Re: [PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Nirmoy
Hi Alex, On 8/31/20 4:17 PM, Alex Deucher wrote: On Mon, Aug 31, 2020 at 6:41 AM Nirmoy Das wrote: UVD dependent jobs should run on the same udv instance. This patch disables gpu scheduler's load balancer for a context which binds jobs from same the context to a udv instance. typos: udv -> uv

Re: [PATCH 2/7] drm/amdgpu: Avoid accessing HW when suspending SW state

2020-08-31 Thread Andrey Grodzovsky
On 8/28/20 10:03 PM, Luben Tuikov wrote: On 2020-08-26 10:46, Andrey Grodzovsky wrote: At this point the ASIC is already post reset by the HW/PSP so the HW not in proper state to be configured for suspension, some bloks might be even gated and so best is to avoid touching it. "blocks" Signe

Re: [PATCH 3/7] drm/amdgpu: Block all job scheduling activity during DPC recovery

2020-08-31 Thread Andrey Grodzovsky
On 8/28/20 10:16 PM, Luben Tuikov wrote: On 2020-08-26 10:46, Andrey Grodzovsky wrote: DPC recovery involves ASIC reset just as normal GPU recovery so block SW GPU shcedulers and wait on all concurent GPU resets. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.

[PATCH AUTOSEL 5.8 36/42] drm/amd/display: Revert HDCP disable sequence change

2020-08-31 Thread Sasha Levin
From: Jaehyun Chung [ Upstream commit b61f05622ace5b9498ae279cdfd1c9f0c1ce3f75 ] [Why] Revert HDCP disable sequence change that blanks stream before disabling HDCP. PSP and HW teams are currently investigating the root cause of why HDCP cannot be disabled before stream blank, which is expected t

[PATCH AUTOSEL 5.8 34/42] drm/amd/display: Reject overlay plane configurations in multi-display scenarios

2020-08-31 Thread Sasha Levin
From: Nicholas Kazlauskas [ Upstream commit 168f09cdadbd547c2b202246ef9a8183da725f13 ] [Why] These aren't stable on some platform configurations when driving multiple displays, especially at higher resolutions. In particular the delay in asserting p-state and validating from x86 outweighs any p

[PATCH AUTOSEL 5.8 33/42] drm/amd/display: should check error using DC_OK

2020-08-31 Thread Sasha Levin
From: Tong Zhang [ Upstream commit ed9ab229fea24cbcab17f484297dc8344afb7ea9 ] core_link_read_dpcd returns only DC_OK(1) and DC_ERROR_UNEXPECTED(-1), the caller should check error using DC_OK instead of checking against 0 Signed-off-by: Tong Zhang Signed-off-by: Alex Deucher Signed-off-by: Sas
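To illustrate the point of the commit message above, here is a minimal sketch in C. All names in it (`dc_status_sketch`, `read_dpcd_sketch`, and the two check helpers) are hypothetical stand-ins, not the driver's real code; only the two status values, 1 for success and -1 for failure, are taken from the message. Because neither value is 0, a comparison against 0 cannot separate success from failure, while a comparison against the success value can.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two status values named in the commit message. */
enum dc_status_sketch {
    DC_OK_SKETCH = 1,
    DC_ERROR_UNEXPECTED_SKETCH = -1,
};

/* Hypothetical reader: succeeds only when it is given a buffer to fill. */
static enum dc_status_sketch read_dpcd_sketch(unsigned char *buf)
{
    if (buf == NULL)
        return DC_ERROR_UNEXPECTED_SKETCH;
    buf[0] = 0x12; /* arbitrary payload for the sketch */
    return DC_OK_SKETCH;
}

/* Misleading check: both status values are nonzero, so this is true even on success. */
static bool failed_by_zero_check(enum dc_status_sketch status)
{
    return status != 0;
}

/* Check suggested by the commit message: compare against the success value. */
static bool failed_by_ok_check(enum dc_status_sketch status)
{
    return status != DC_OK_SKETCH;
}
```

With these definitions, `failed_by_ok_check()` reports a failure only for the NULL-buffer call, whereas `failed_by_zero_check()` reports a failure for both calls.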

[PATCH AUTOSEL 5.8 39/42] drm/amd/display: Retry AUX write when fail occurs

2020-08-31 Thread Sasha Levin
From: Wayne Lin [ Upstream commit ef67d792a2fc578319399f605fbec2f99ecc06ea ] [Why] Currently, dm_dp_aux_transfer() does not handle AUX_WR failure cases. We assume every write will complete successfully, and hence some AUX commands might not actually be sent out. [How] Check whether AUX_WR succeeded. If not, r

[PATCH AUTOSEL 5.8 37/42] drm/amd/display: Fix passive dongle mistaken as active dongle in EDID emulation

2020-08-31 Thread Sasha Levin
From: Samson Tam [ Upstream commit efbde23a3b0164cef27fd394e7d548f46af5b51d ] [Why] dongle_type is set during dongle connection but for passive dongles, dongle_type is not set. If user starts with an active dongle and then switches to a passive dongle, it will still report as an active dongle. T

[PATCH AUTOSEL 5.8 38/42] drm/amd/display: Keep current gain when ABM disable immediately

2020-08-31 Thread Sasha Levin
From: Brandon Syu [ Upstream commit cba4b52e431e5de3d8012281cfe194f1c39a9052 ] [Why] When system enters s3/s0i3, backlight PWM would set user level. [How] ABM disable function add keep current gain to avoid it. Signed-off-by: Brandon Syu Reviewed-by: Josip Pavic Acked-by: Eryk Brol Signed-o

[PATCH AUTOSEL 5.8 35/42] drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in amdgpu_dm_update_backlight_caps

2020-08-31 Thread Sasha Levin
From: Furquan Shaikh [ Upstream commit 5896585512e5156482335e902f7c7393b940da51 ] In `amdgpu_dm_update_backlight_caps()`, there is a local `amdgpu_dm_backlight_caps` object that is filled in by `amdgpu_acpi_get_backlight_caps()`. However, this object is uninitialized before the call and hence th
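The pattern described in the message above can be sketched as follows. This is an illustration under assumptions, not the actual patch: `caps_sketch`, `fill_caps_sketch`, and `query_caps_sketch` are hypothetical names, and the filler is assumed to leave the object untouched when no data is available. Zeroing the local object before handing it to the filler means every field has a defined value afterwards, whichever path the filler takes.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical capability record, loosely modelled on the description above. */
struct caps_sketch {
    bool caps_valid;
    int min_input_signal;
    int max_input_signal;
};

/* Hypothetical filler: writes the fields only when firmware data is available. */
static void fill_caps_sketch(struct caps_sketch *caps, bool firmware_has_data)
{
    if (!firmware_has_data)
        return; /* leaves *caps exactly as the caller passed it in */
    caps->caps_valid = true;
    caps->min_input_signal = 12;
    caps->max_input_signal = 255;
}

/* Caller: zero the local object first so no field is ever read uninitialized. */
static struct caps_sketch query_caps_sketch(bool firmware_has_data)
{
    struct caps_sketch caps;

    memset(&caps, 0, sizeof(caps));
    fill_caps_sketch(&caps, firmware_has_data);
    return caps;
}
```

Without the `memset()`, the early-return path would hand back a structure whose `caps_valid` field holds an indeterminate value.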

[PATCH AUTOSEL 5.4 21/23] drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init

2020-08-31 Thread Sasha Levin
From: Dinghao Liu [ Upstream commit b67a468a4ccef593cd8df6a02ba3d167b77f0c81 ] When amdgpu_display_modeset_create_props() fails, state and state->context should be freed to prevent memleak. It's the same when amdgpu_dm_audio_init() fails. Signed-off-by: Dinghao Liu Signed-off-by: Alex Deucher
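A rough sketch of the cleanup described above, assuming a two-level allocation (a state object that owns a context) followed by a later step that can fail. Every name here is hypothetical, and the allocation counter exists only so the sketch can be checked for leaks; it is not part of the driver.

```c
#include <stdlib.h>

/* Hypothetical state object that owns a separately allocated context. */
struct state_sketch {
    int *context;
};

/* Counter of live allocations, used only to make leaks observable in the sketch. */
static int live_allocs;

static void *alloc_sketch(size_t size)
{
    void *p = malloc(size);

    if (p)
        live_allocs++;
    return p;
}

static void free_sketch(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Returns 0 on success; on any failure, frees whatever was already allocated. */
static int init_sketch(int fail_late_step, struct state_sketch **out)
{
    struct state_sketch *state = alloc_sketch(sizeof(*state));

    if (!state)
        return -1;

    state->context = alloc_sketch(sizeof(*state->context));
    if (!state->context) {
        free_sketch(state);
        return -1;
    }

    if (fail_late_step) {
        /* The late step failed: release the context and the state before returning. */
        free_sketch(state->context);
        free_sketch(state);
        return -1;
    }

    *out = state;
    return 0;
}
```

Calling `init_sketch(1, &s)` leaves `live_allocs` at 0, which is the property the fix is after: a failure late in initialization must not strand the earlier allocations.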

[PATCH AUTOSEL 5.8 40/42] drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init

2020-08-31 Thread Sasha Levin
From: Dinghao Liu [ Upstream commit b67a468a4ccef593cd8df6a02ba3d167b77f0c81 ] When amdgpu_display_modeset_create_props() fails, state and state->context should be freed to prevent memleak. It's the same when amdgpu_dm_audio_init() fails. Signed-off-by: Dinghao Liu Signed-off-by: Alex Deucher

[PATCH AUTOSEL 5.4 20/23] drm/amd/display: Retry AUX write when fail occurs

2020-08-31 Thread Sasha Levin
From: Wayne Lin [ Upstream commit ef67d792a2fc578319399f605fbec2f99ecc06ea ] [Why] Currently, dm_dp_aux_transfer() does not handle AUX_WR failure cases. We assume every write will complete successfully, and hence some AUX commands might not actually be sent out. [How] Check whether AUX_WR succeeded. If not, r

[PATCH AUTOSEL 5.4 18/23] drm/amd/display: Reject overlay plane configurations in multi-display scenarios

2020-08-31 Thread Sasha Levin
From: Nicholas Kazlauskas [ Upstream commit 168f09cdadbd547c2b202246ef9a8183da725f13 ] [Why] These aren't stable on some platform configurations when driving multiple displays, especially at higher resolutions. In particular the delay in asserting p-state and validating from x86 outweighs any p

[PATCH AUTOSEL 5.4 19/23] drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in amdgpu_dm_update_backlight_caps

2020-08-31 Thread Sasha Levin
From: Furquan Shaikh [ Upstream commit 5896585512e5156482335e902f7c7393b940da51 ] In `amdgpu_dm_update_backlight_caps()`, there is a local `amdgpu_dm_backlight_caps` object that is filled in by `amdgpu_acpi_get_backlight_caps()`. However, this object is uninitialized before the call and hence th

[PATCH v3 4/8] drm/amdgpu: Fix SMU error failure

2020-08-31 Thread Andrey Grodzovsky
Wait for HW/PSP initiated ASIC reset to complete before starting the recovery operations. v2: Remove typo Signed-off-by: Andrey Grodzovsky Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) dif

[PATCH v3 3/8] drm/amdgpu: Block all job scheduling activity during DPC recovery

2020-08-31 Thread Andrey Grodzovsky
DPC recovery involves ASIC reset just as normal GPU recovery so blosk SW GPU schedulers and wait on all concurrent GPU resets. Signed-off-by: Andrey Grodzovsky Acked-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 57 +++--- 1 file changed, 53 insertion

[PATCH v3 7/8] drm/amdgpu: Disable DPC for XGMI for now.

2020-08-31 Thread Andrey Grodzovsky
XGMI support is more complicated then single device support as questions of synchronization between the device recovering from PCI error and other members of the hive is required. Leaving this for next round. Signed-off-by: Andrey Grodzovsky Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amd

[PATCH v3 6/8] drm/amdgpu: Trim amdgpu_pci_slot_reset by reusing code.

2020-08-31 Thread Andrey Grodzovsky
Reuse existing functions from GPU recovery to avoid code duplication. Signed-off-by: Andrey Grodzovsky Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 73 +- 1 file changed, 12 insertions(+), 61 deletions(-) diff --git a/drivers/gpu/drm/

[PATCH v3 5/8] drm/amdgpu: Fix consecutive DPC recovery failures.

2020-08-31 Thread Andrey Grodzovsky
Cache the PCI state on boot and before each case were we might loose it. v2: Add pci_restore_state while caching the PCI state to avoid breaking PCI core logic for stuff like suspend/resume. v3: Extract pci_restore_state from amdgpu_device_cache_pci_state to avoid superflous restores during GPU r

[PATCH v3 0/8] Implement PCI Error Recovery on Navi12

2020-08-31 Thread Andrey Grodzovsky
Many PCI bus controllers are able to detect a variety of hardware PCI errors on the bus, such as parity errors on the data and address buses. A typical action taken is to disconnect the affected device, halting all I/O to it. Typically, a reconnection mechanism is also offered, so that the a

[PATCH v3 8/8] drm/amdgpu: Minor checkpatch fix

2020-08-31 Thread Andrey Grodzovsky
Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index fe720c2..16c7842 100644 --- a/drivers/gpu/drm/am

[PATCH v3 2/8] drm/amdgpu: Avoid accessing HW when suspending SW state

2020-08-31 Thread Andrey Grodzovsky
At this point the ASIC is already post reset by the HW/PSP so the HW not in proper state to be configured for suspension, some blocks might be even gated and so best is to avoid touching it. v2: Rename in_dpc to more meaningful name Signed-off-by: Andrey Grodzovsky Reviewed-by: Alex Deucher ---

[PATCH v3 1/8] drm/amdgpu: Implement DPC recovery

2020-08-31 Thread Andrey Grodzovsky
Add DPC handlers with basic recovery functionality. v2: remove pci_save_state to avoid breaking suspend/resume v3: Fix style comments Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 9 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 162 ++

Re: [PATCH v3 0/8] Implement PCI Error Recovery on Navi12

2020-08-31 Thread Nirmoy
Hi Andrey, I need to understand more about pci saved state. So excluding patch 5 the series is Acked-by: Nirmoy Das . Regards, Nirmoy On 8/31/20 5:50 PM, Andrey Grodzovsky wrote: Many PCI bus controllers are able to detect a variety of hardware PCI errors on the bus, such as parity err

Re: [PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Alex Deucher
On Mon, Aug 31, 2020 at 10:55 AM Nirmoy wrote: > > Hi Alex, > > On 8/31/20 4:17 PM, Alex Deucher wrote: > > On Mon, Aug 31, 2020 at 6:41 AM Nirmoy Das wrote: > >> UVD dependent jobs should run on the same udv instance. > >> This patch disables gpu scheduler's load balancer for > >> a context whic

Re: [PATCH v2 2/7] drm/amdgpu: Avoid accessing HW when suspending SW state

2020-08-31 Thread Luben Tuikov
On 2020-08-28 3:26 p.m., Alex Deucher wrote: > On Fri, Aug 28, 2020 at 12:06 PM Andrey Grodzovsky > wrote: >> >> At this point the ASIC is already post reset by the HW/PSP >> so the HW not in proper state to be configured for suspension, >> some bloks might be even gated and so best is to avoid to

Re: [PATCH v3 1/8] drm/amdgpu: Implement DPC recovery

2020-08-31 Thread Luben Tuikov
On 2020-08-31 11:50 a.m., Andrey Grodzovsky wrote: > Add DPC handlers with basic recovery functionality. > When this goes into the kernel log, it would be quite unclear what "DPC" stands for, as it is used undefined in this patch and this patchset. I'd write, Add PCI Downstream Port Con

Re: [PATCH v3 1/8] drm/amdgpu: Implement DPC recovery

2020-08-31 Thread Luben Tuikov
On 2020-08-31 11:50 a.m., Andrey Grodzovsky wrote: > Add DPC handlers with basic recovery functionality. > > v2: remove pci_save_state to avoid breaking suspend/resume > v3: Fix style comments > > Signed-off-by: Andrey Grodzovsky > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h| 9 ++ > dr

Re: [PATCH v3 3/8] drm/amdgpu: Block all job scheduling activity during DPC recovery

2020-08-31 Thread Luben Tuikov
On 2020-08-31 11:50 a.m., Andrey Grodzovsky wrote: > DPC recovery involves ASIC reset just as normal GPU recovery so blosk Again, typo: "blosk" --> "blocks". > SW GPU schedulers and wait on all concurrent GPU resets. > > Signed-off-by: Andrey Grodzovsky > Acked-by: Alex Deucher > --- > driver

Re: [PATCH v3 5/8] drm/amdgpu: Fix consecutive DPC recovery failures.

2020-08-31 Thread Luben Tuikov
On 2020-08-31 11:50 a.m., Andrey Grodzovsky wrote: > Cache the PCI state on boot and before each case were we might > loose it. Typo: "were" --> "where". > > v2: Add pci_restore_state while caching the PCI state to avoid > breaking PCI core logic for stuff like suspend/resume. > > v3: Extract p

Re: [PATCH v3 7/8] drm/amdgpu: Disable DPC for XGMI for now.

2020-08-31 Thread Luben Tuikov
On 2020-08-31 11:50 a.m., Andrey Grodzovsky wrote: > XGMI support is more complicated then single device support as Typo: "then" --> "than". > questions of synchronization between the device recovering from > PCI error and other members of the hive is required. Typo: "is" --> "are", as the subje

Re: [PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Leo Liu
On 2020-08-31 1:39 p.m., Alex Deucher wrote: On Mon, Aug 31, 2020 at 10:55 AM Nirmoy wrote: Hi Alex, On 8/31/20 4:17 PM, Alex Deucher wrote: On Mon, Aug 31, 2020 at 6:41 AM Nirmoy Das wrote: UVD dependent jobs should run on the same udv instance. This patch disables gpu scheduler's load b

Re: [PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Alex Deucher
On Mon, Aug 31, 2020 at 5:50 PM Leo Liu wrote: > > > On 2020-08-31 1:39 p.m., Alex Deucher wrote: > > On Mon, Aug 31, 2020 at 10:55 AM Nirmoy wrote: > >> Hi Alex, > >> > >> On 8/31/20 4:17 PM, Alex Deucher wrote: > >>> On Mon, Aug 31, 2020 at 6:41 AM Nirmoy Das wrote: > UVD dependent jobs s

Re: [PATCH v3 3/8] drm/amdgpu: Block all job scheduling activity during DPC recovery

2020-08-31 Thread Andrey Grodzovsky
On 8/31/20 5:00 PM, Luben Tuikov wrote: On 2020-08-31 11:50 a.m., Andrey Grodzovsky wrote: DPC recovery involves ASIC reset just as normal GPU recovery so blosk Again, typo: "blosk" --> "blocks". SW GPU schedulers and wait on all concurrent GPU resets. Signed-off-by: Andrey Grodzovsky Ack

Re: [PATCH 1/1] drm/amdgpu: disable gpu-sched load balance for uvd

2020-08-31 Thread Leo Liu
On 2020-08-31 5:53 p.m., Alex Deucher wrote: On Mon, Aug 31, 2020 at 5:50 PM Leo Liu wrote: On 2020-08-31 1:39 p.m., Alex Deucher wrote: On Mon, Aug 31, 2020 at 10:55 AM Nirmoy wrote: Hi Alex, On 8/31/20 4:17 PM, Alex Deucher wrote: On Mon, Aug 31, 2020 at 6:41 AM Nirmoy Das wrote: UVD

[PATCH] drm/amdgpu: block ring buffer access during GPU recovery

2020-08-31 Thread Dennis Li
When the GPU is in reset, its status isn't stable, and the ring buffer also needs to be reset when resuming. Therefore the driver should protect the GPU recovery thread from ring buffer accesses by other threads. Otherwise the GPU will randomly hang during recovery. Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/

[PATCH] drm/kfd: fix a system crash issue during GPU recovery

2020-08-31 Thread Dennis Li
The crash log is as below: [Thu Aug 20 23:18:14 2020] general protection fault: [#1] SMP NOPTI [Thu Aug 20 23:18:14 2020] CPU: 152 PID: 1837 Comm: kworker/152:1 Tainted: G OE 5.4.0-42-generic #46~18.04.1-Ubuntu [Thu Aug 20 23:18:14 2020] Hardware name: GIGABYTE G482-Z53-YF/MZ5

Re: [PATCH] drm/amdgpu: block ring buffer access during GPU recovery

2020-08-31 Thread Andrey Grodzovsky
On 8/31/20 9:17 PM, Dennis Li wrote: When GPU is in reset, its status isn't stable and ring buffer also need be reset when resuming. Therefore driver should protect GPU recovery thread from ring buffer accessed by other threads. Otherwise GPU will randomly hang during recovery. Signed-off-by:

RE: [PATCH] drm/amdgpu: block ring buffer access during GPU recovery

2020-08-31 Thread Li, Dennis
[AMD Official Use Only - Internal Distribution Only] Hi, Andrey, RE- Isn't adev->reset_sem non-recursive ? How this works when you try to access registers from within GPU reset thread while adev->reset_sem is already write locked from amdgpu_device_lock_adev earlier in the same thread ? De

[PATCH 4/5] drm_dp_cec: add plumbing in preparation for MST support

2020-08-31 Thread Sam McNally
From: Hans Verkuil Signed-off-by: Hans Verkuil [sa...@chromium.org: - rebased - removed polling-related changes - moved the calls to drm_dp_cec_(un)set_edid() into the next patch ] Signed-off-by: Sam McNally --- .../display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 +- drivers/gpu/drm/drm_dp_