Re: [PATCH v2] drivers/amd/pm: check the return value of amdgpu_bo_kmap

2022-09-21 Thread Christian König

On 22.09.22 at 06:17, Li Zhong wrote:

amdgpu_bo_kmap() returns an error when it fails to map the buffer object.
Add the error check and propagate the error.

Signed-off-by: Li Zhong 


We usually use "r" for return and error variables, but that's just a nit.

Reviewed-by: Christian König 
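
For readers following the thread, the check-and-propagate pattern under review can be sketched outside the kernel. This is only an illustration: `struct fake_bo`, `fake_bo_kmap()` and `get_buffer_details()` are hypothetical stand-ins, not amdgpu APIs, and the variable is named `r` as suggested in the nit above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not the amdgpu type. */
struct fake_bo {
	void *cpu_addr;	/* NULL simulates a mapping failure */
	size_t size;
};

/* Hypothetical mapper: returns 0 on success or a negative errno. */
static int fake_bo_kmap(struct fake_bo *bo, void **ptr)
{
	if (!bo->cpu_addr)
		return -ENOMEM;
	*ptr = bo->cpu_addr;
	return 0;
}

/* Mirrors the shape of the patched function: check and propagate. */
static int get_buffer_details(struct fake_bo *bo, void **addr, size_t *size)
{
	int r;

	if (!addr || !size)
		return -EINVAL;

	*addr = NULL;
	*size = 0;
	if (bo) {
		r = fake_bo_kmap(bo, addr);
		if (r)
			return r;	/* size stays 0 when the map fails */
		*size = bo->size;
	}
	return 0;
}
```

With the check in place, a caller sees either a valid address and size or an error code, never a zero return paired with a NULL address from a failed map.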


---
  drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c 
b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
index 1eb4e613b27a..ec055858eb95 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
@@ -1485,6 +1485,7 @@ static int pp_get_prv_buffer_details(void *handle, void **addr, size_t *size)
 {
 	struct pp_hwmgr *hwmgr = handle;
 	struct amdgpu_device *adev = hwmgr->adev;
+	int err;
 
 	if (!addr || !size)
 		return -EINVAL;
@@ -1492,7 +1493,9 @@ static int pp_get_prv_buffer_details(void *handle, void **addr, size_t *size)
 	*addr = NULL;
 	*size = 0;
 	if (adev->pm.smu_prv_buffer) {
-		amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
+		err = amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
+		if (err)
+			return err;
 		*size = adev->pm.smu_prv_buffer_size;
 	}
  




RE: [PATCH v2] drivers/amd/pm: check the return value of amdgpu_bo_kmap

2022-09-21 Thread Quan, Evan
[AMD Official Use Only - General]

Reviewed-by: Evan Quan 

> -----Original Message-----
> From: Li Zhong 
> Sent: Thursday, September 22, 2022 12:18 PM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Cc: jiapeng.ch...@linux.alibaba.com; Powell, Darren
> ; Chen, Guchun ;
> Limonciello, Mario ; Quan, Evan
> ; Lazar, Lijo ; dan...@ffwll.ch;
> airl...@linux.ie; Pan, Xinhui ; Koenig, Christian
> ; Deucher, Alexander
> ; Li Zhong 
> Subject: [PATCH v2] drivers/amd/pm: check the return value of
> amdgpu_bo_kmap
> 
> amdgpu_bo_kmap() returns an error when it fails to map the buffer object.
> Add the error check and propagate the error.
> 
> Signed-off-by: Li Zhong 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> index 1eb4e613b27a..ec055858eb95 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> @@ -1485,6 +1485,7 @@ static int pp_get_prv_buffer_details(void *handle, void **addr, size_t *size)
>  {
>  	struct pp_hwmgr *hwmgr = handle;
>  	struct amdgpu_device *adev = hwmgr->adev;
> +	int err;
>
>  	if (!addr || !size)
>  		return -EINVAL;
> @@ -1492,7 +1493,9 @@ static int pp_get_prv_buffer_details(void *handle, void **addr, size_t *size)
>  	*addr = NULL;
>  	*size = 0;
>  	if (adev->pm.smu_prv_buffer) {
> -		amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
> +		err = amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
> +		if (err)
> +			return err;
>  		*size = adev->pm.smu_prv_buffer_size;
>  	}
> 
> --
> 2.25.1


RE: [PATCH v1] drivers:amdgpu: check the return value of amdgpu_bo_kmap

2022-09-21 Thread Chen, Guchun
Perhaps you need to update the patch subject prefix to
'drm/amd/pm: check return value ...'.

With the above addressed, it's: Acked-by: Guchun Chen 

Regards,
Guchun

-----Original Message-----
From: Li Zhong  
Sent: Thursday, September 22, 2022 9:27 AM
To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: jiapeng.ch...@linux.alibaba.com; Powell, Darren ; 
Chen, Guchun ; Limonciello, Mario 
; Quan, Evan ; Lazar, Lijo 
; dan...@ffwll.ch; airl...@linux.ie; Pan, Xinhui 
; Koenig, Christian ; Deucher, 
Alexander ; Li Zhong 
Subject: [PATCH v1] drivers:amdgpu: check the return value of amdgpu_bo_kmap

amdgpu_bo_kmap() returns an error when it fails to map the buffer object.
Add the error check and propagate the error.

Signed-off-by: Li Zhong 
---
 drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c 
b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
index 1eb4e613b27a..ec055858eb95 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
@@ -1485,6 +1485,7 @@ static int pp_get_prv_buffer_details(void *handle, void **addr, size_t *size)
 {
 	struct pp_hwmgr *hwmgr = handle;
 	struct amdgpu_device *adev = hwmgr->adev;
+	int err;
 
 	if (!addr || !size)
 		return -EINVAL;
@@ -1492,7 +1493,9 @@ static int pp_get_prv_buffer_details(void *handle, void **addr, size_t *size)
 	*addr = NULL;
 	*size = 0;
 	if (adev->pm.smu_prv_buffer) {
-		amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
+		err = amdgpu_bo_kmap(adev->pm.smu_prv_buffer, addr);
+		if (err)
+			return err;
 		*size = adev->pm.smu_prv_buffer_size;
 	}
 
--
2.25.1



RE: [PATCH] drm/amdgpu: Fixed ras warning when uninstalling amdgpu

2022-09-21 Thread Zhang, Hawking
[AMD Official Use Only - General]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-----Original Message-----
From: Chai, Thomas  
Sent: Thursday, September 22, 2022 09:37
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Zhou1, Tao ; 
Clements, John ; Yang, Stanley 
Subject: RE: [PATCH] drm/amdgpu: Fixed ras warning when uninstalling amdgpu

[AMD Official Use Only - General]

Ping ...


-
Best Regards,
Thomas

-----Original Message-----
From: Chai, Thomas  
Sent: Tuesday, September 20, 2022 10:07 AM
To: amd-gfx@lists.freedesktop.org
Cc: Chai, Thomas ; Zhang, Hawking ; 
Zhou1, Tao ; Clements, John ; Yang, 
Stanley ; Chai, Thomas 
Subject: [PATCH] drm/amdgpu: Fixed ras warning when uninstalling amdgpu

  For ASICs using SMU v13_0_2, the following warning appears when
uninstalling amdgpu:
  amdgpu: ras disable gfx failed poison:1 ret:-22.

[Why]:
  For ASICs using SMU v13_0_2, psp .suspend and mode1reset are
  called before amdgpu_ras_pre_fini during amdgpu uninstall. They
  have already disabled all RAS features and reset the PSP. Since
  the PSP has been reset, calling amdgpu_ras_disable_all_features
  in amdgpu_ras_pre_fini to disable RAS features fails.

[How]:
  If all RAS features are already disabled, do not call
  amdgpu_ras_disable_all_features to disable them again.

Signed-off-by: YiPeng Chai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e55f106621ef..3deb716710e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2720,7 +2720,8 @@ int amdgpu_ras_pre_fini(struct amdgpu_device *adev)
 
 
 	/* Need disable ras on all IPs here before ip [hw/sw]fini */
-	amdgpu_ras_disable_all_features(adev, 0);
+	if (con->features)
+		amdgpu_ras_disable_all_features(adev, 0);
 	amdgpu_ras_recovery_fini(adev);
 	return 0;
 }
--
2.25.1
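
The guard added by this patch can be illustrated with a small stand-alone sketch. Everything in it is hypothetical (`struct ras_ctx`, `disable_all()`, `pre_fini()`); it only assumes, as the patch appears to, that `features` is a bitmask of the RAS blocks that are still enabled.

```c
#include <stdint.h>

/* Hypothetical context: 'features' is a bitmask of enabled blocks. */
struct ras_ctx {
	uint32_t features;
	int disable_calls;	/* counts how often the disable path ran */
};

/* Hypothetical disable routine; a real one would talk to firmware. */
static void disable_all(struct ras_ctx *con)
{
	con->disable_calls++;
	con->features = 0;
}

/* Skip the disable call when nothing is left enabled. */
static int pre_fini(struct ras_ctx *con)
{
	if (con->features)
		disable_all(con);
	return 0;
}
```

Calling `pre_fini()` a second time, or after an earlier reset has already cleared the mask, then becomes a no-op instead of a failing firmware request.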


RE: [PATCH] drm/amdgpu: Fixed ras warning when uninstalling amdgpu

2022-09-21 Thread Chai, Thomas
[AMD Official Use Only - General]

Ping ...


-
Best Regards,
Thomas

-----Original Message-----
From: Chai, Thomas  
Sent: Tuesday, September 20, 2022 10:07 AM
To: amd-gfx@lists.freedesktop.org
Cc: Chai, Thomas ; Zhang, Hawking ; 
Zhou1, Tao ; Clements, John ; Yang, 
Stanley ; Chai, Thomas 
Subject: [PATCH] drm/amdgpu: Fixed ras warning when uninstalling amdgpu

  For ASICs using SMU v13_0_2, the following warning appears when
uninstalling amdgpu:
  amdgpu: ras disable gfx failed poison:1 ret:-22.

[Why]:
  For ASICs using SMU v13_0_2, psp .suspend and mode1reset are
  called before amdgpu_ras_pre_fini during amdgpu uninstall. They
  have already disabled all RAS features and reset the PSP. Since
  the PSP has been reset, calling amdgpu_ras_disable_all_features
  in amdgpu_ras_pre_fini to disable RAS features fails.

[How]:
  If all RAS features are already disabled, do not call
  amdgpu_ras_disable_all_features to disable them again.

Signed-off-by: YiPeng Chai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e55f106621ef..3deb716710e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2720,7 +2720,8 @@ int amdgpu_ras_pre_fini(struct amdgpu_device *adev)
 
 
 	/* Need disable ras on all IPs here before ip [hw/sw]fini */
-	amdgpu_ras_disable_all_features(adev, 0);
+	if (con->features)
+		amdgpu_ras_disable_all_features(adev, 0);
 	amdgpu_ras_recovery_fini(adev);
 	return 0;
 }
--
2.25.1


[PATCH 31/31] drm/amd/display: remove redundant CalculateRemoteSurfaceFlipDelay's

2022-09-21 Thread Jasdeep Dhillon
From: Tom Rix 

There are several copies of CalculateRemoteSurfaceFlipDelay.
Reduce to one instance.

Signed-off-by: Tom Rix 
Reviewed-by: Maíra Canal 
---
 .../dc/dml/dcn20/display_mode_vba_20.c|  4 +-
 .../dc/dml/dcn20/display_mode_vba_20v2.c  | 40 +--
 .../dc/dml/dcn21/display_mode_vba_21.c| 40 +--
 3 files changed, 4 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
index 4ca080950924..8e5d58336bc5 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
@@ -158,7 +158,7 @@ double CalculateTWait(
double DRAMClockChangeLatency,
double UrgentLatency,
double SREnterPlusExitTime);
-static double CalculateRemoteSurfaceFlipDelay(
+double CalculateRemoteSurfaceFlipDelay(
struct display_mode_lib *mode_lib,
double VRatio,
double SwathWidth,
@@ -2909,7 +2909,7 @@ double CalculateTWait(
}
 }
 
-static double CalculateRemoteSurfaceFlipDelay(
+double CalculateRemoteSurfaceFlipDelay(
struct display_mode_lib *mode_lib,
double VRatio,
double SwathWidth,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c
index 2b4dcae4e432..e9ebc81adc71 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c
@@ -182,7 +182,7 @@ double CalculateTWait(
double DRAMClockChangeLatency,
double UrgentLatency,
double SREnterPlusExitTime);
-static double CalculateRemoteSurfaceFlipDelay(
+double CalculateRemoteSurfaceFlipDelay(
struct display_mode_lib *mode_lib,
double VRatio,
double SwathWidth,
@@ -2967,44 +2967,6 @@ static void dml20v2_DisplayPipeConfiguration(struct 
display_mode_lib *mode_lib)
}
 }
 
-static double CalculateRemoteSurfaceFlipDelay(
-   struct display_mode_lib *mode_lib,
-   double VRatio,
-   double SwathWidth,
-   double Bpp,
-   double LineTime,
-   double XFCTSlvVupdateOffset,
-   double XFCTSlvVupdateWidth,
-   double XFCTSlvVreadyOffset,
-   double XFCXBUFLatencyTolerance,
-   double XFCFillBWOverhead,
-   double XFCSlvChunkSize,
-   double XFCBusTransportTime,
-   double TCalc,
-   double TWait,
-   double *SrcActiveDrainRate,
-   double *TInitXFill,
-   double *TslvChk)
-{
-   double TSlvSetup, AvgfillRate, result;
-
-   *SrcActiveDrainRate = VRatio * SwathWidth * Bpp / LineTime;
-   TSlvSetup = XFCTSlvVupdateOffset + XFCTSlvVupdateWidth + 
XFCTSlvVreadyOffset;
-   *TInitXFill = XFCXBUFLatencyTolerance / (1 + XFCFillBWOverhead / 100);
-   AvgfillRate = *SrcActiveDrainRate * (1 + XFCFillBWOverhead / 100);
-   *TslvChk = XFCSlvChunkSize / AvgfillRate;
-   dml_print(
-   "DML::CalculateRemoteSurfaceFlipDelay: 
SrcActiveDrainRate: %f\n",
-   *SrcActiveDrainRate);
-   dml_print("DML::CalculateRemoteSurfaceFlipDelay: TSlvSetup: %f\n", 
TSlvSetup);
-   dml_print("DML::CalculateRemoteSurfaceFlipDelay: TInitXFill: %f\n", 
*TInitXFill);
-   dml_print("DML::CalculateRemoteSurfaceFlipDelay: AvgfillRate: %f\n", 
AvgfillRate);
-   dml_print("DML::CalculateRemoteSurfaceFlipDelay: TslvChk: %f\n", 
*TslvChk);
-   result = 2 * XFCBusTransportTime + TSlvSetup + TCalc + TWait + *TslvChk 
+ *TInitXFill; // TODO: This doesn't seem to match programming guide
-   dml_print("DML::CalculateRemoteSurfaceFlipDelay: 
RemoteSurfaceFlipDelay: %f\n", result);
-   return result;
-}
-
 static void CalculateActiveRowBandwidth(
bool GPUVMEnable,
enum source_format_class SourcePixelFormat,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
index df4b52b2ed4c..d79b27ba8a9c 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
@@ -210,7 +210,7 @@ double CalculateTWait(
double DRAMClockChangeLatency,
double UrgentLatency,
double SREnterPlusExitTime);
-static double CalculateRemoteSurfaceFlipDelay(
+double CalculateRemoteSurfaceFlipDelay(
struct display_mode_lib *mode_lib,
double VRatio,
double SwathWidth,
@@ -2980,44 +2980,6 @@ static void DisplayPipeConfiguration(s

[PATCH 20/31] drm/amd/display: polling vid stream status in hpo dp blank

2022-09-21 Thread Jasdeep Dhillon
From: Wenjing Liu 

[why]
Video stream control is double buffered. If we don't wait for video
stream enable to be set to 0, we may get temporary image corruption
on the stream when setting PIXEL_TO_SYMBOL_FIFO_ENABLE to 0.

Reviewed-by: Ariel Bernstein 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Wenjing Liu 
---
 .../drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c  | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
index 23621ff08c90..52fb2bf3d578 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
@@ -150,9 +150,9 @@ static void dcn31_hpo_dp_stream_enc_dp_blank(
 * 10us*5000=50ms. This covers 41.7ms of minimum 24 Hz mode +
 * a little more because we may not trust delay accuracy.
 */
-   //REG_WAIT(DP_SYM32_ENC_VID_STREAM_CONTROL,
-   //  VID_STREAM_STATUS, 0,
-   //  10, 5000);
+   REG_WAIT(DP_SYM32_ENC_VID_STREAM_CONTROL,
+   VID_STREAM_STATUS, 0,
+   10, 5000);
 
/* Disable SDP tranmission */
REG_UPDATE(DP_SYM32_ENC_SDP_CONTROL,
-- 
2.25.1
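
The restored wait is a bounded poll: read the status up to 5000 times, 10 us apart, for the 50 ms budget mentioned in the code comment. A rough stand-alone sketch of that idea follows. It is not the REG_WAIT macro itself; `wait_for_value()`, `read_fn` and `countdown_status()` are hypothetical names, and the delay is only accounted for rather than actually slept.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register reader supplied by the caller. */
typedef uint32_t (*read_fn)(void *ctx);

/*
 * Poll until read() returns 'expected' or 'max_tries' reads have been made.
 * A real driver would delay 'delay_us' between reads; this sketch only
 * records the time budget spent so that it stays portable.
 */
static bool wait_for_value(read_fn read, void *ctx, uint32_t expected,
			   unsigned int delay_us, unsigned int max_tries,
			   unsigned int *waited_us)
{
	unsigned int i;

	*waited_us = 0;
	for (i = 0; i < max_tries; i++) {
		if (read(ctx) == expected)
			return true;
		*waited_us += delay_us;	/* stand-in for udelay(delay_us) */
	}
	return false;
}

/* Example reader: status stays 1 for a fixed number of reads, then 0. */
static uint32_t countdown_status(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

Returning a boolean lets the caller decide what to do on timeout; whether the real macro reports or asserts on timeout is not shown in this patch.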



[PATCH 29/31] drm/amd/display: remove redundant CalculateTWait's

2022-09-21 Thread Jasdeep Dhillon
From: Tom Rix 

There are several copies of CalculateTWait.
Reduce to one instance and change local variable name to match common usage.
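
As a quick reference, the selection logic of the shared function can be reproduced in a stand-alone form. This is a sketch under assumptions: `max_d()` stands in for `dml_max()`, and the snake_case names are chosen for the example only.

```c
/* Local helper standing in for dml_max(); an assumption of this sketch. */
static double max_d(double a, double b)
{
	return a > b ? a : b;
}

/* Same three-way selection as the CalculateTWait body in the diff. */
static double calculate_twait(unsigned int prefetch_mode,
			      double dram_clock_change_latency,
			      double urgent_latency,
			      double sr_enter_plus_exit_time)
{
	if (prefetch_mode == 0)
		return max_d(dram_clock_change_latency + urgent_latency,
			     max_d(sr_enter_plus_exit_time, urgent_latency));
	else if (prefetch_mode == 1)
		return max_d(sr_enter_plus_exit_time, urgent_latency);
	else
		return urgent_latency;
}
```

Because the body depends only on its four arguments, a single non-static definition can serve every DML variant that previously carried its own copy.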

Signed-off-by: Tom Rix 
Reviewed-by: Maíra Canal 
---
 .../dc/dml/dcn20/display_mode_vba_20.c| 16 +++---
 .../dc/dml/dcn20/display_mode_vba_20v2.c  | 21 ++-
 .../dc/dml/dcn21/display_mode_vba_21.c| 19 +
 .../dc/dml/dcn30/display_mode_vba_30.c| 18 +---
 .../dc/dml/dcn31/display_mode_vba_31.c| 13 +---
 .../dc/dml/dcn314/display_mode_vba_314.c  | 13 +---
 6 files changed, 14 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
index d3b5b6fedf04..56c9c097823d 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
@@ -153,10 +153,10 @@ static unsigned int CalculateVMAndRowBytes(
bool *PTEBufferSizeNotExceeded,
unsigned int *dpte_row_height,
unsigned int *meta_row_height);
-static double CalculateTWait(
+double CalculateTWait(
unsigned int PrefetchMode,
double DRAMClockChangeLatency,
-   double UrgentLatencyPixelDataOnly,
+   double UrgentLatency,
double SREnterPlusExitTime);
 static double CalculateRemoteSurfaceFlipDelay(
struct display_mode_lib *mode_lib,
@@ -2920,20 +2920,20 @@ static void dml20_DisplayPipeConfiguration(struct 
display_mode_lib *mode_lib)
}
 }
 
-static double CalculateTWait(
+double CalculateTWait(
unsigned int PrefetchMode,
double DRAMClockChangeLatency,
-   double UrgentLatencyPixelDataOnly,
+   double UrgentLatency,
double SREnterPlusExitTime)
 {
if (PrefetchMode == 0) {
return dml_max(
-   DRAMClockChangeLatency + 
UrgentLatencyPixelDataOnly,
-   dml_max(SREnterPlusExitTime, 
UrgentLatencyPixelDataOnly));
+   DRAMClockChangeLatency + UrgentLatency,
+   dml_max(SREnterPlusExitTime, UrgentLatency));
} else if (PrefetchMode == 1) {
-   return dml_max(SREnterPlusExitTime, UrgentLatencyPixelDataOnly);
+   return dml_max(SREnterPlusExitTime, UrgentLatency);
} else {
-   return UrgentLatencyPixelDataOnly;
+   return UrgentLatency;
}
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c
index edd098c7eb92..6b54be569691 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c
@@ -177,10 +177,10 @@ static unsigned int CalculateVMAndRowBytes(
bool *PTEBufferSizeNotExceeded,
unsigned int *dpte_row_height,
unsigned int *meta_row_height);
-static double CalculateTWait(
+double CalculateTWait(
unsigned int PrefetchMode,
double DRAMClockChangeLatency,
-   double UrgentLatencyPixelDataOnly,
+   double UrgentLatency,
double SREnterPlusExitTime);
 static double CalculateRemoteSurfaceFlipDelay(
struct display_mode_lib *mode_lib,
@@ -2993,23 +2993,6 @@ static void dml20v2_DisplayPipeConfiguration(struct 
display_mode_lib *mode_lib)
}
 }
 
-static double CalculateTWait(
-   unsigned int PrefetchMode,
-   double DRAMClockChangeLatency,
-   double UrgentLatencyPixelDataOnly,
-   double SREnterPlusExitTime)
-{
-   if (PrefetchMode == 0) {
-   return dml_max(
-   DRAMClockChangeLatency + 
UrgentLatencyPixelDataOnly,
-   dml_max(SREnterPlusExitTime, 
UrgentLatencyPixelDataOnly));
-   } else if (PrefetchMode == 1) {
-   return dml_max(SREnterPlusExitTime, UrgentLatencyPixelDataOnly);
-   } else {
-   return UrgentLatencyPixelDataOnly;
-   }
-}
-
 static double CalculateRemoteSurfaceFlipDelay(
struct display_mode_lib *mode_lib,
double VRatio,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
index 1d84ae50311d..d2dfa82d52a1 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
@@ -205,7 +205,7 @@ static unsigned int CalculateVMAndRowBytes(
unsigned int *DPDE0BytesFrame,
unsigned int *MetaPTEBytesFrame);
 
-static double Calculate

[PATCH 14/31] drm/amd/display: Fix audio on display after unplugging another

2022-09-21 Thread Jasdeep Dhillon
From: Aric Cyr 

Revert "dc: skip audio setup when audio stream is enabled"

This reverts commit c83f2553273a796b62411e73fb4fe19ec521f8a9

[why]
We have a minimal pipe split transition method to avoid pipe
allocation outage. However, this method invokes audio setup,
which causes audio output to get stuck once pipes are reallocated.

[how]
Skip audio setup for pipelines whose audio stream has already been enabled.

Reviewed-by: Martin Leung 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 2341982ee0a5..d260eaa1509e 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -2178,8 +2178,7 @@ static void dce110_setup_audio_dto(
continue;
if (pipe_ctx->stream->signal != SIGNAL_TYPE_HDMI_TYPE_A)
continue;
-   if (pipe_ctx->stream_res.audio != NULL &&
-   pipe_ctx->stream_res.audio->enabled == false) {
+   if (pipe_ctx->stream_res.audio != NULL) {
struct audio_output audio_output;
 
build_audio_output(context, pipe_ctx, &audio_output);
@@ -2219,8 +2218,7 @@ static void dce110_setup_audio_dto(
if (!dc_is_dp_signal(pipe_ctx->stream->signal))
continue;
 
-   if (pipe_ctx->stream_res.audio != NULL &&
-   pipe_ctx->stream_res.audio->enabled == false) {
+   if (pipe_ctx->stream_res.audio != NULL) {
struct audio_output audio_output;
 
build_audio_output(context, pipe_ctx, 
&audio_output);
-- 
2.25.1



[PATCH 30/31] drm/amd/display: refactor CalculateWriteBackDelay to use vba_vars_st ptr

2022-09-21 Thread Jasdeep Dhillon
From: Tom Rix 

Minimize the function signature by passing a pointer and an index instead
of passing several members of the structure it points to.

The dml2x and dml3x families each use the same algorithm. Remove the duplicates.
Use the dml20_ and dml30_ prefixes to distinguish the two variants.
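
The signature change can be sketched in isolation. The names below (`struct wb_vars`, `wb_cost_wide()`, `wb_cost()`) are hypothetical, and the arithmetic is a placeholder rather than the real write-back delay formula; only the calling convention is the point.

```c
#define WB_MAX_PLANES 4

/* Hypothetical, much reduced stand-in for the per-plane state arrays. */
struct wb_vars {
	double h_ratio[WB_MAX_PLANES];
	double v_ratio[WB_MAX_PLANES];
	unsigned int dest_width[WB_MAX_PLANES];
};

/* Before: every field is passed separately. */
static double wb_cost_wide(double h_ratio, double v_ratio,
			   unsigned int dest_width)
{
	/* Placeholder arithmetic, not the real write-back delay formula. */
	return dest_width * h_ratio * v_ratio;
}

/* After: one pointer plus the plane index. */
static double wb_cost(const struct wb_vars *v, unsigned int k)
{
	return wb_cost_wide(v->h_ratio[k], v->v_ratio[k], v->dest_width[k]);
}
```

Call sites shrink to `wb_cost(v, k)`, at the cost of tying the helper to the structure layout.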

Signed-off-by: Tom Rix 
---
 .../dc/dml/dcn20/display_mode_vba_20.c|  78 +++-
 .../dc/dml/dcn20/display_mode_vba_20v2.c  | 115 ++
 .../dc/dml/dcn21/display_mode_vba_21.c| 114 +
 .../dc/dml/dcn30/display_mode_vba_30.c|  74 +++
 .../dc/dml/dcn31/display_mode_vba_31.c|  76 +---
 .../dc/dml/dcn314/display_mode_vba_314.c  |  76 +---
 .../dc/dml/dcn32/display_mode_vba_32.c|  42 +--
 .../dc/dml/dcn32/display_mode_vba_util_32.c   |  30 -
 .../dc/dml/dcn32/display_mode_vba_util_32.h   |  10 +-
 9 files changed, 63 insertions(+), 552 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
index 56c9c097823d..4ca080950924 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
@@ -217,16 +217,8 @@ static void CalculateFlipSchedule(
double *DestinationLinesToRequestRowInImmediateFlip,
double *final_flip_bw,
bool *ImmediateFlipSupportedForPipe);
-static double CalculateWriteBackDelay(
-   enum source_format_class WritebackPixelFormat,
-   double WritebackHRatio,
-   double WritebackVRatio,
-   unsigned int WritebackLumaHTaps,
-   unsigned int WritebackLumaVTaps,
-   unsigned int WritebackChromaHTaps,
-   unsigned int WritebackChromaVTaps,
-   unsigned int WritebackDestinationWidth);
 
+double dlm20_CalculateWriteBackDelay(struct vba_vars_st *vba, unsigned int i);
 static void dml20_DisplayPipeConfiguration(struct display_mode_lib *mode_lib);
 static void 
dml20_DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation(
struct display_mode_lib *mode_lib);
@@ -1085,6 +1077,7 @@ static unsigned int CalculateVMAndRowBytes(
 static void 
dml20_DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation(
struct display_mode_lib *mode_lib)
 {
+   struct vba_vars_st *v = &mode_lib->vba;
unsigned int j, k;
 
mode_lib->vba.WritebackDISPCLK = 0.0;
@@ -1980,36 +1973,15 @@ static void 
dml20_DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPer
if (mode_lib->vba.BlendingAndTiming[k] == k) {
if (mode_lib->vba.WritebackEnable[k] == true) {

mode_lib->vba.WritebackDelay[mode_lib->vba.VoltageLevel][k] =
-   mode_lib->vba.WritebackLatency
-   + 
CalculateWriteBackDelay(
-   
mode_lib->vba.WritebackPixelFormat[k],
-   
mode_lib->vba.WritebackHRatio[k],
-   
mode_lib->vba.WritebackVRatio[k],
-   
mode_lib->vba.WritebackLumaHTaps[k],
-   
mode_lib->vba.WritebackLumaVTaps[k],
-   
mode_lib->vba.WritebackChromaHTaps[k],
-   
mode_lib->vba.WritebackChromaVTaps[k],
-   
mode_lib->vba.WritebackDestinationWidth[k])
-   
/ mode_lib->vba.DISPCLK;
+   mode_lib->vba.WritebackLatency + 
dlm20_CalculateWriteBackDelay(v, k) / mode_lib->vba.DISPCLK;
} else

mode_lib->vba.WritebackDelay[mode_lib->vba.VoltageLevel][k] = 0;
for (j = 0; j < mode_lib->vba.NumberOfActivePlanes; 
++j) {
if (mode_lib->vba.BlendingAndTiming[j] == k
&& 
mode_lib->vba.WritebackEnable[j] == true) {

mode_lib->vba.WritebackDelay[mode_lib->vba.VoltageLevel][k] =
-   dml_max(
-   
mode_lib->vba.WritebackDelay[mode_lib->vba.VoltageLevel][k],
-   

[PATCH 23/31] Add debug option for exiting idle optimizations on cursor updates

2022-09-21 Thread Jasdeep Dhillon
From: Brandon Syu 

[Description]
- Add an option to exit idle optimizations on cursor updates
for debug and optimization purposes

Reviewed-by: Aric Cyr 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Brandon Syu
---
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c   | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c | 1 +
 drivers/gpu/drm/amd/display/dc/dcn302/dcn302_resource.c | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c | 1 +
 4 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c
index f6f3878c99b8..3a3b2ac791c7 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c
@@ -724,7 +724,8 @@ static const struct dc_debug_options debug_defaults_drv = {
.dwb_fi_phase = -1, // -1 = disable,
.dmub_command_table = true,
.disable_psr = false,
-   .use_max_lb = true
+   .use_max_lb = true,
+   .exit_idle_opt_for_cursor_updates = true
 };
 
 static const struct dc_debug_options debug_defaults_diags = {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c
index 0c2b15a0f3a7..559e563d5bc1 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c
@@ -700,6 +700,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.dwb_fi_phase = -1, // -1 = disable
.dmub_command_table = true,
.use_max_lb = false,
+   .exit_idle_opt_for_cursor_updates = true
 };
 
 static const struct dc_debug_options debug_defaults_diags = {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn302/dcn302_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn302/dcn302_resource.c
index 4fab537e822f..b925b6ddde5a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn302/dcn302_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn302/dcn302_resource.c
@@ -93,7 +93,8 @@ static const struct dc_debug_options debug_defaults_drv = {
.underflow_assert_delay_us = 0x,
.dwb_fi_phase = -1, // -1 = disable,
.dmub_command_table = true,
-   .use_max_lb = true
+   .use_max_lb = true,
+   .exit_idle_opt_for_cursor_updates = true
 };
 
 static const struct dc_debug_options debug_defaults_diags = {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c
index d97076648acb..527d5c902878 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c
@@ -77,6 +77,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.underflow_assert_delay_us = 0x,
.dwb_fi_phase = -1, // -1 = disable,
.dmub_command_table = true,
+   .exit_idle_opt_for_cursor_updates = true,
.disable_idle_power_optimizations = false,
 };
 
-- 
2.25.1



[PATCH 26/31] drm/amd/display: 3.2.205

2022-09-21 Thread Jasdeep Dhillon
From: Aric Cyr 

This version brings along the following fixes:

- LTTPR mode can be dynamically changed
- Fixes divide by zero error
- Features able to use same interface to update cursor info
- Fixes for LLVM compilation issues
- Fixes DIO FIFO underflow and other FIFO errors
- Partially valid EDIDs handled properly
- Phantom pipes are skipped when checking pending flip
- Fixed audio on display after unplugging another

Acked-by: Jasdeep Dhillon 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 7db7929a7e81..2ecf36e6329b 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -47,7 +47,7 @@ struct aux_payload;
 struct set_config_cmd_payload;
 struct dmub_notification;
 
-#define DC_VER "3.2.204"
+#define DC_VER "3.2.205"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.25.1



[PATCH 22/31] drm/amd/display: Avoid unnecessary pixel rate divider programming

2022-09-21 Thread Jasdeep Dhillon
From: Taimur Hassan 

[Why]
Programming pixel rate divider when FIFO is enabled can cause FIFO error.

[How]
Skip divider programming when divider values are the same to prevent FIFO
error.

Reviewed-by: Alvin Lee 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Taimur Hassan 
---
 .../gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c | 53 +++
 1 file changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c
index 26eb04ea472c..e4daed44ef5f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c
@@ -42,6 +42,48 @@
 #define DC_LOGGER \
dccg->ctx->logger
 
+static void dccg32_get_pixel_rate_div(
+   struct dccg *dccg,
+   uint32_t otg_inst,
+   enum pixel_rate_div *k1,
+   enum pixel_rate_div *k2)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+   uint32_t val_k1 = PIXEL_RATE_DIV_NA, val_k2 = PIXEL_RATE_DIV_NA;
+
+   *k1 = PIXEL_RATE_DIV_NA;
+   *k2 = PIXEL_RATE_DIV_NA;
+
+   switch (otg_inst) {
+   case 0:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG0_PIXEL_RATE_DIVK1, &val_k1,
+   OTG0_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 1:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG1_PIXEL_RATE_DIVK1, &val_k1,
+   OTG1_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 2:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG2_PIXEL_RATE_DIVK1, &val_k1,
+   OTG2_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 3:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG3_PIXEL_RATE_DIVK1, &val_k1,
+   OTG3_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   default:
+   BREAK_TO_DEBUGGER();
+   return;
+   }
+
+   *k1 = (enum pixel_rate_div)val_k1;
+   *k2 = (enum pixel_rate_div)val_k2;
+}
+
 static void dccg32_set_pixel_rate_div(
struct dccg *dccg,
uint32_t otg_inst,
@@ -50,6 +92,17 @@ static void dccg32_set_pixel_rate_div(
 {
struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
 
+   enum pixel_rate_div cur_k1 = PIXEL_RATE_DIV_NA, cur_k2 = 
PIXEL_RATE_DIV_NA;
+
+   // Don't program 0xF into the register field. Not valid since
+   // K1 / K2 field is only 1 / 2 bits wide
+   if (k1 == PIXEL_RATE_DIV_NA || k2 == PIXEL_RATE_DIV_NA)
+   return;
+
+   dccg32_get_pixel_rate_div(dccg, otg_inst, &cur_k1, &cur_k2);
+   if (k1 == cur_k1 && k2 == cur_k2)
+   return;
+
switch (otg_inst) {
case 0:
REG_UPDATE_2(OTG_PIXEL_RATE_DIV,
-- 
2.25.1
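
The read-compare-skip idea in this patch can be sketched without the register macros. The names are hypothetical (`struct div_reg`, `set_pixel_rate_div()`), and `DIV_NA` is assumed to be 0xF based on the comment in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIV_NA 0xF	/* assumed sentinel for "no valid divider" */

/* Hypothetical divider state standing in for one OTG's register fields. */
struct div_reg {
	uint32_t k1;
	uint32_t k2;
	unsigned int writes;	/* number of times the register was programmed */
};

/* Returns true only when the register was actually written. */
static bool set_pixel_rate_div(struct div_reg *reg, uint32_t k1, uint32_t k2)
{
	if (k1 == DIV_NA || k2 == DIV_NA)
		return false;	/* never program the sentinel */
	if (reg->k1 == k1 && reg->k2 == k2)
		return false;	/* unchanged: skip the write */

	reg->k1 = k1;
	reg->k2 = k2;
	reg->writes++;
	return true;
}
```

The extra read costs one register access per call but avoids reprogramming the divider while the FIFO is enabled, which is the failure the commit message describes.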



[PATCH 17/31] drm/amd/display: Disable MALL when TMZ surface

2022-09-21 Thread Jasdeep Dhillon
From: Alvin Lee 

[Description]
- Don't use MALL buffering of any kind when the
  surface is TMZ
- Workaround for a HW bug
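
The eligibility check described above amounts to scanning every plane of every stream and bailing out on the first stereo or TMZ surface. A minimal sketch, with hypothetical reduced types (`struct plane`, `struct stream`) in place of the DC state structures:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced plane and stream descriptions. */
struct plane {
	bool stereo;
	bool tmz;
};

struct stream {
	const struct plane *planes;
	size_t plane_count;
};

/* True if any plane on any stream rules out MALL. */
static bool mall_unsupported(const struct stream *streams, size_t stream_count)
{
	size_t i, j;

	for (i = 0; i < stream_count; i++)
		for (j = 0; j < streams[i].plane_count; j++)
			if (streams[i].planes[j].stereo ||
			    streams[i].planes[j].tmz)
				return true;
	return false;
}
```

Returning early replaces the flag-and-double-break structure used in the driver loop; the result is the same.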

Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c   |  8 ++--
 drivers/gpu/drm/amd/display/dc/dc.h|  1 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 18 ++
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c   |  3 ++-
 4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index b5ad0bf4135a..b82d572c55ae 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2331,9 +2331,13 @@ static enum surface_update_type det_surface_update(const 
struct dc *dc,
type = get_scaling_info_update_type(u);
elevate_update_type(&overall_type, type);
 
-   if (u->flip_addr)
+   if (u->flip_addr) {
update_flags->bits.addr_update = 1;
-
+   if (u->flip_addr->address.tmz_surface != 
u->surface->address.tmz_surface) {
+   update_flags->bits.tmz_changed = 1;
+   elevate_update_type(&overall_type, UPDATE_TYPE_FULL);
+   }
+   }
if (u->in_transfer_func)
update_flags->bits.in_transfer_func_change = 1;
 
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index ccb5395a8a90..7db7929a7e81 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -1122,6 +1122,7 @@ union surface_update_flags {
uint32_t clock_change:1;
uint32_t stereo_format_change:1;
uint32_t lut_3d:1;
+   uint32_t tmz_changed:1;
uint32_t full_update:1;
} bits;
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index 133bc4085c78..6497246692cf 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -355,7 +355,7 @@ bool dcn32_apply_idle_power_optimizations(struct dc *dc, 
bool enable)
union dmub_rb_cmd cmd;
uint8_t ways, i;
int j;
-   bool stereo_in_use = false;
+   bool mall_ss_unsupported = false;
struct dc_plane_state *plane = NULL;
 
if (!dc->ctx->dmub_srv)
@@ -386,22 +386,23 @@ bool dcn32_apply_idle_power_optimizations(struct dc *dc, 
bool enable)
 */
ways = dcn32_calculate_cab_allocation(dc, 
dc->current_state);
 
-   /* MALL not supported with Stereo3D. If any plane is 
using stereo,
-* don't try to enter MALL.
+   /* MALL not supported with Stereo3D or TMZ surface. If 
any plane is using stereo,
+* or TMZ surface, don't try to enter MALL.
 */
for (i = 0; i < dc->current_state->stream_count; i++) {
for (j = 0; j < 
dc->current_state->stream_status[i].plane_count; j++) {
plane = 
dc->current_state->stream_status[i].plane_states[j];
 
-   if (plane->address.type == 
PLN_ADDR_TYPE_GRPH_STEREO) {
-   stereo_in_use = true;
+   if (plane->address.type == 
PLN_ADDR_TYPE_GRPH_STEREO ||
+   
plane->address.tmz_surface) {
+   mall_ss_unsupported = true;
break;
}
}
-   if (stereo_in_use)
+   if (mall_ss_unsupported)
break;
}
-   if (ways <= dc->caps.cache_num_ways && !stereo_in_use) {
+   if (ways <= dc->caps.cache_num_ways && 
!mall_ss_unsupported) {
memset(&cmd, 0, sizeof(cmd));
cmd.cab.header.type = DMUB_CMD__CAB_FOR_SS;
cmd.cab.header.sub_type = 
DMUB_CMD__CAB_DCN_SS_FIT_IN_CAB;
@@ -759,7 +760,8 @@ void dcn32_update_mall_sel(struct dc *dc, struct dc_state 
*context)
hubp->funcs->hubp_update_mall_sel(hubp,
num_ways <= dc->caps.cache_num_ways &&

pipe->stream->link->psr_settings.psr_version == DC_PSR_VERSION_UNSUPPORTED &&
-   pipe->plane_state->address.type !=  
PLN_ADDR_TYPE_GRPH_STEREO ? 2 : 0,
+   pipe->plane_state->address.type

[PATCH 19/31] drm/amd/display: fill in clock values when DPM is not enabled

2022-09-21 Thread Jasdeep Dhillon
From: Samson Tam 

[Why]
For individual feature testing, PMFW may not report all clock
values back. The driver defaults them to 0, but this causes
the BB table to be skipped and the driver to fall back to a
single state with max clocks.

[How]
Add helper function to scan through initial clock values and
populate them with default clock limits so that BB table
can be built.
Add dpm_enabled flag to check when DPM is not enabled and
to trigger helper function.

Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Samson Tam 
---
 .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c  | 14 +++
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 39 +++
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.h  |  2 +
 .../amd/display/dc/inc/hw/clk_mgr_internal.h  |  2 +
 4 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
index c6785969eb1a..f0f3f66629cc 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
@@ -156,12 +156,14 @@ void dcn32_init_clocks(struct clk_mgr *clk_mgr_base)
 {
struct clk_mgr_internal *clk_mgr = TO_CLK_MGR_INTERNAL(clk_mgr_base);
unsigned int num_levels;
+   unsigned int num_dcfclk_levels, num_dtbclk_levels, num_dispclk_levels;
 
memset(&(clk_mgr_base->clks), 0, sizeof(struct dc_clocks));
clk_mgr_base->clks.p_state_change_support = true;
clk_mgr_base->clks.prev_p_state_change_support = true;
clk_mgr_base->clks.fclk_prev_p_state_change_support = true;
clk_mgr->smu_present = false;
+   clk_mgr->dpm_present = false;
 
if (!clk_mgr_base->bw_params)
return;
@@ -179,6 +181,7 @@ void dcn32_init_clocks(struct clk_mgr *clk_mgr_base)
dcn32_init_single_clock(clk_mgr, PPCLK_DCFCLK,

&clk_mgr_base->bw_params->clk_table.entries[0].dcfclk_mhz,
&num_levels);
+   num_dcfclk_levels = num_levels;
 
/* SOCCLK */
dcn32_init_single_clock(clk_mgr, PPCLK_SOCCLK,
@@ -189,11 +192,16 @@ void dcn32_init_clocks(struct clk_mgr *clk_mgr_base)
dcn32_init_single_clock(clk_mgr, PPCLK_DTBCLK,

&clk_mgr_base->bw_params->clk_table.entries[0].dtbclk_mhz,
&num_levels);
+   num_dtbclk_levels = num_levels;
 
/* DISPCLK */
dcn32_init_single_clock(clk_mgr, PPCLK_DISPCLK,

&clk_mgr_base->bw_params->clk_table.entries[0].dispclk_mhz,
&num_levels);
+   num_dispclk_levels = num_levels;
+
+   if (num_dcfclk_levels && num_dtbclk_levels && num_dispclk_levels)
+   clk_mgr->dpm_present = true;
 
if (clk_mgr_base->ctx->dc->debug.min_disp_clk_khz) {
unsigned int i;
@@ -658,6 +666,12 @@ static void dcn32_get_memclk_states_from_smu(struct 
clk_mgr *clk_mgr_base)
&num_levels);
clk_mgr_base->bw_params->clk_table.num_entries = num_levels ? 
num_levels : 1;
 
+   if (clk_mgr->dpm_present && !num_levels)
+   clk_mgr->dpm_present = false;
+
+   if (!clk_mgr->dpm_present)
+   dcn32_patch_dpm_table(clk_mgr_base->bw_params);
+
DC_FP_START();
/* Refresh bounding box */
clk_mgr_base->ctx->dc->res_pool->funcs->update_bw_bounding_box(
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index cb97afbee097..4484a7ece4b4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -2044,6 +2044,45 @@ static void remove_entry_from_table_at_index(struct 
_vcs_dpi_voltage_scaling_st
memset(&table[--(*num_entries)], 0, sizeof(struct 
_vcs_dpi_voltage_scaling_st));
 }
 
+void dcn32_patch_dpm_table(struct clk_bw_params *bw_params)
+{
+   int i;
+   unsigned int max_dcfclk_mhz = 0, max_dispclk_mhz = 0, max_dppclk_mhz = 
0,
+   max_phyclk_mhz = 0, max_dtbclk_mhz = 0, max_fclk_mhz = 
0, max_uclk_mhz = 0;
+
+   for (i = 0; i < MAX_NUM_DPM_LVL; i++) {
+   if (bw_params->clk_table.entries[i].dcfclk_mhz > max_dcfclk_mhz)
+   max_dcfclk_mhz = 
bw_params->clk_table.entries[i].dcfclk_mhz;
+   if (bw_params->clk_table.entries[i].fclk_mhz > max_fclk_mhz)
+   max_fclk_mhz = bw_params->clk_table.entries[i].fclk_mhz;
+   if (bw_params->clk_table.entries[i].memclk_mhz > max_uclk_mhz)
+   max_uclk_mhz = 
bw_params->clk_table.entries[i].memclk_mhz;
+   if (bw_params->clk_table.entries[i].dispclk_mhz > 
max_dispclk_mhz)
+   max_dispclk_mhz = 
bw_params->clk_table.entries[i].dispclk_mhz;
+   if (bw_params->clk_table.entries[i].dppclk_mhz > max_dp

[PATCH 24/31] drm/amd/display: Cursor Info Update refactor

2022-09-21 Thread Jasdeep Dhillon
From: Max Tseng 

DC: cursor info update, phase 1.

[Why]
Different features may need to update cursor info, but each does so
with a different approach. To unify them, all features should use
the same interface to update the cursor.

Reviewed-by: Reza Amini 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Max Tseng 
---
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 1 +
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 5 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 0c85ab5933b4..3ca1592ce7ac 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -30,6 +30,7 @@
 #include "resource.h"
 #include "ipp.h"
 #include "timing_generator.h"
+#include "dc_dmub_srv.h"
 
 #define DC_LOGGER dc->ctx->logger
 
diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 6b9a529e9f12..5d1aadade8a5 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -760,6 +760,11 @@ enum dmub_cmd_dpia_type {
DMUB_CMD__DPIA_MST_ALLOC_SLOTS = 2,
 };
 
+enum dmub_cmd_header_sub_type {
+   DMUB_CMD__SUB_TYPE_GENERAL = 0,
+   DMUB_CMD__SUB_TYPE_CURSOR_POSITION = 1
+};
+
 #pragma pack(push, 1)
 
 /**
-- 
2.25.1



[PATCH 12/31] drm/amd/display: Update MALL SS NumWays calculation

2022-09-21 Thread Jasdeep Dhillon
From: Alvin Lee 

[Description]
Update MALL SS NumWays calculation according
to programming guide.

Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/dc.h   |   1 +
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 206 --
 2 files changed, 97 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index bbc352b18bf4..30274e8a6d23 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -832,6 +832,7 @@ struct dc_debug_options {
bool force_subvp_mclk_switch;
bool allow_sw_cursor_fallback;
unsigned int force_subvp_num_ways;
+   unsigned int force_mall_ss_num_ways;
bool alloc_extra_way_for_cursor;
bool force_usr_allow;
/* uses value at boot and disables switch */
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index 6baea56f259c..ab47475c18ae 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -49,6 +49,7 @@
 #include "dcn20/dcn20_optc.h"
 #include "dmub_subvp_state.h"
 #include "dce/dmub_hw_lock_mgr.h"
+#include "dcn32_resource.h"
 #include "dc_link_dp.h"
 #include "dmub/inc/dmub_subvp_state.h"
 
@@ -198,42 +199,6 @@ static bool dcn32_check_no_memory_request_for_cab(struct 
dc *dc)
return false;
 }
 
-/* This function takes in the start address and surface size to be cached in 
CAB
- * and calculates the total number of cache lines required to store the 
surface.
- * The number of cache lines used for each surface is calculated independently 
of
- * one another. For example, if there is a primary surface(1), meta 
surface(2), and
- * cursor(3), this function should be called 3 times to calculate the number 
of cache
- * lines used for each of those surfaces.
- */
-static uint32_t dcn32_cache_lines_for_surface(struct dc *dc, uint32_t 
surface_size, uint64_t start_address)
-{
-   uint32_t lines_used = 1;
-   uint32_t num_cached_bytes = 0;
-   uint32_t remaining_size = 0;
-   uint32_t cache_line_size = dc->caps.cache_line_size;
-   uint32_t remainder = 0;
-
-   /* 1. Calculate surface size minus the number of bytes stored
-* in the first cache line (all bytes in first cache line might
-* not be fully used).
-*/
-   div_u64_rem(start_address, cache_line_size, &remainder);
-   num_cached_bytes = cache_line_size - remainder;
-   remaining_size = surface_size - num_cached_bytes;
-
-   /* 2. Calculate number of cache lines that will be fully used with
-* the remaining number of bytes to be stored.
-*/
-   lines_used += (remaining_size / cache_line_size);
-
-   /* 3. Check if we need an extra line due to the remaining size not being
-* a multiple of CACHE_LINE_SIZE.
-*/
-   if (remaining_size % cache_line_size > 0)
-   lines_used++;
-
-   return lines_used;
-}
 
 /* This function loops through every surface that needs to be cached in CAB 
for SS,
  * and calculates the total number of ways required to store all surfaces 
(primary,
@@ -241,96 +206,115 @@ static uint32_t dcn32_cache_lines_for_surface(struct dc 
*dc, uint32_t surface_si
  */
 static uint32_t dcn32_calculate_cab_allocation(struct dc *dc, struct dc_state 
*ctx)
 {
-   uint8_t i, j;
+   uint8_t i; 
+   int j;
struct dc_stream_state *stream = NULL;
struct dc_plane_state *plane = NULL;
-   uint32_t surface_size = 0;
uint32_t cursor_size = 0;
-   uint32_t cache_lines_used = 0;
uint32_t total_lines = 0;
uint32_t lines_per_way = 0;
-   uint32_t num_ways = 0;
-   uint32_t prev_addr_low = 0;
+   uint8_t num_ways = 0;
+   uint8_t bytes_per_pixel = 0;
+   uint8_t cursor_bpp = 0;
+   uint16_t mblk_width = 0;
+   uint16_t mblk_height = 0;
+   uint16_t mall_alloc_width_blk_aligned = 0;
+   uint16_t mall_alloc_height_blk_aligned = 0;
+   uint16_t num_mblks = 0;
+   uint32_t bytes_in_mall = 0;
+   uint32_t cache_lines_per_plane = 0;
 
-   for (i = 0; i < ctx->stream_count; i++) {
-   stream = ctx->streams[i];
+   for (i = 0; i < dc->res_pool->pipe_count; i++) {
+   struct pipe_ctx *pipe = &dc->current_state->res_ctx.pipe_ctx[i];
 
-   // Don't include PSR surface in the total surface size for CAB 
allocation
-   if (stream->link->psr_settings.psr_version != 
DC_PSR_VERSION_UNSUPPORTED)
+   if (!pipe->stream || !pipe->plane_state ||
+   pipe->stream->link->psr_settings.psr_version != 
DC_PSR_VERSION_UNSUPPORTED ||
+   pipe->stream->mall_stream_config.type == 
SUBVP_PHANTOM)
continue;
 
-   if (ctx->stream_

[PATCH 13/31] drm/amd/display: add missing null check

2022-09-21 Thread Jasdeep Dhillon
From: Wenjing Liu 

[Why]
A null check for the stream pointer is missing when iterating through
pipe_ctx.

Reviewed-by: Martin Leung 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Wenjing Liu 
---
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index ab47475c18ae..133bc4085c78 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -1289,7 +1289,7 @@ static void apply_symclk_on_tx_off_wa(struct dc_link 
*link)
if (link->phy_state.symclk_ref_cnts.otg > 0) {
for (i = 0; i < MAX_PIPES; i++) {
pipe_ctx = &dc->current_state->res_ctx.pipe_ctx[i];
-   if (pipe_ctx->stream->link == link && pipe_ctx->top_pipe == NULL) {
+   if (pipe_ctx->stream && pipe_ctx->stream->link == link && pipe_ctx->top_pipe == NULL) {
pipe_ctx->clock_source->funcs->program_pix_clk(
pipe_ctx->clock_source,

&pipe_ctx->stream_res.pix_clk_params,
-- 
2.25.1
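
The fix relies on `&&` short-circuiting: the stream pointer is tested before it is dereferenced. A minimal sketch of the same pattern follows; the struct layouts and `count_top_pipes_on_link()` are simplified, hypothetical stand-ins rather than the driver's types.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical versions of the driver structures. */
struct link { int id; };
struct stream { struct link *link; };
struct pipe_ctx {
    struct stream *stream;     /* NULL when the pipe is unused */
    struct pipe_ctx *top_pipe; /* NULL for the top pipe of a tree */
};

/* Counts top pipes that drive the given link, skipping unused pipes. */
size_t count_top_pipes_on_link(const struct pipe_ctx *pipes, size_t n,
                               const struct link *link)
{
    size_t i, count = 0;

    assert(pipes != NULL);
    for (i = 0; i < n; i++) {
        const struct pipe_ctx *p = &pipes[i];

        /* p->stream is checked first, so p->stream->link is never
         * evaluated for an unused pipe. */
        if (p->stream && p->stream->link == link && p->top_pipe == NULL)
            count++;
    }
    return count;
}
```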



[PATCH 27/31] drm/amd/display: Reduce number of arguments of dml314's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-09-21 Thread Jasdeep Dhillon
From: Nathan Chancellor 

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml314_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:4020:6: error: stack frame size (2216) exceeds limit (2048) in 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml314_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1710
Reported-by: "kernelci.org bot" 
Signed-off-by: Nathan Chancellor 
Tested-by: Maíra Canal 
---
 .../dc/dml/dcn314/display_mode_vba_314.c  | 248 --
 1 file changed, 52 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
index 3c6fb98944d6..4c1d0c103933 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
@@ -325,64 +325,28 @@ static void CalculateVupdateAndDynamicMetadataParameters(
 static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
-   unsigned int NumberOfActivePlanes,
-   unsigned int MaxLineBufferLines,
-   unsigned int LineBufferSize,
-   unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
-   bool SynchronizedVBlank,
-   unsigned int dpte_group_bytes[],
-   unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
-   double WritebackLatency,
-   double WritebackChunkSize,
double SOCCLK,
-   double DRAMClockChangeLatency,
-   double SRExitTime,
-   double SREnterPlusExitTime,
-   double SRExitZ8Time,
-   double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
-   unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
-   double HRatio[],
-   double HRatioChroma[],
-   unsigned int vtaps[],
-   unsigned int VTAPsChroma[],
-   double VRatio[],
-   double VRatioChroma[],
-   unsigned int HTotal[],
-   double PixelClock[],
-   unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
-   double DSTXAfterScaler[],
-   double DSTYAfterScaler[],
-   bool WritebackEnable[],
-   enum source_format_class WritebackPixelFormat[],
-   double WritebackDestinationWidth[],
-   double WritebackDestinationHeight[],
-   double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
unsigned int CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
-   double *UrgentWatermark,
-   double *WritebackUrgentWatermark,
-   double *DRAMClockChangeWatermark,
-   double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
-   double *Z8StutterEnterPlusExitWatermark,
-   double *MinActiveDRAMClockChangeLatencySupported);
+   double *Z8StutterEnterPlusExitWatermark);
 
 static void CalculateDCFCLKDeepSleep(
struct display_mode_lib *mode_lib,
@@ -3041,64 +3005,28 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
PrefetchMode,
-   v->NumberOfActivePlanes,
-   v->MaxLineBufferLines,
-   v->LineBufferSize,
-   v->WritebackInterfaceBufferSize,
v->DCFCLK,
v->ReturnBW,
-   v->SynchronizedVBlank,
-   v->dpte_group_bytes,
-   

[PATCH 07/31] drm/amd/display: Add explicit FIFO disable for DP blank

2022-09-21 Thread Jasdeep Dhillon
From: Nicholas Kazlauskas 

[Why]
We rely on DMCUB to do this when disabling the link but it should
actually come before we disable the DP VID stream.

If we don't then the FIFO can end up with underflow that persists
the next time it's enabled.

[How]
Add a DCN314 specific blank sequence that will disable the DIG FIFO
first.

Reviewed-by: Syed Hassan 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Nicholas Kazlauskas 
---
 .../display/dc/dcn314/dcn314_dio_stream_encoder.c| 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
index 914c5da737ed..3107bd57 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
@@ -261,6 +261,16 @@ static bool is_two_pixels_per_containter(const struct 
dc_crtc_timing *timing)
return two_pix;
 }
 
+void enc314_stream_encoder_dp_blank(
+   struct dc_link *link,
+   struct stream_encoder *enc)
+{
+   /* New to DCN314 - disable the FIFO before VID stream disable. */
+   enc314_disable_fifo(enc);
+
+   enc1_stream_encoder_dp_blank(link, enc);
+}
+
 static void enc314_stream_encoder_dp_unblank(
struct dc_link *link,
struct stream_encoder *enc,
@@ -408,7 +418,7 @@ static const struct stream_encoder_funcs 
dcn314_str_enc_funcs = {
.stop_dp_info_packets =
enc1_stream_encoder_stop_dp_info_packets,
.dp_blank =
-   enc1_stream_encoder_dp_blank,
+   enc314_stream_encoder_dp_blank,
.dp_unblank =
enc314_stream_encoder_dp_unblank,
.audio_mute_control = enc3_audio_mute_control,
-- 
2.25.1
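
The patch overrides one entry in the encoder function table with a wrapper that runs an extra step before delegating to the shared implementation. The sketch below models only that ordering; every type and function name in it is hypothetical, and the "hardware" is a small log buffer.

```c
#include <assert.h>
#include <stddef.h>

struct encoder;

/* Hypothetical function table with a single hook. */
struct encoder_funcs {
    void (*dp_blank)(struct encoder *enc);
};

/* Hypothetical encoder that records the order of operations. */
struct encoder {
    const struct encoder_funcs *funcs;
    char log[8];
    size_t len;
};

static void record(struct encoder *enc, char step)
{
    assert(enc->len + 1 < sizeof(enc->log));
    enc->log[enc->len++] = step;
    enc->log[enc->len] = '\0';
}

/* 'F' stands for disabling the FIFO. */
void generic_disable_fifo(struct encoder *enc)
{
    record(enc, 'F');
}

/* 'V' stands for the shared blank sequence (VID stream disable). */
void generic_dp_blank(struct encoder *enc)
{
    record(enc, 'V');
}

/* Newer-hardware wrapper: FIFO first, then the shared sequence. */
void newer_dp_blank(struct encoder *enc)
{
    generic_disable_fifo(enc);
    generic_dp_blank(enc);
}

const struct encoder_funcs newer_funcs = {
    .dp_blank = newer_dp_blank,
};
```

Callers keep invoking `enc->funcs->dp_blank(enc)`, so only the table entry changes and the shared sequence stays untouched for other hardware.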



[PATCH 15/31] drm/amd/display: add debug keys for override bios settings.

2022-09-21 Thread Jasdeep Dhillon
From: Charlene Liu 

[Why]
Add debug keys to override BIOS settings; they are used for compliance testing.

Reviewed-by: Chris Park 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Charlene Liu 
---
 .../drm/amd/display/dc/bios/bios_parser2.c| 21 ---
 drivers/gpu/drm/amd/display/dc/dc.h   |  3 +++
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c 
b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index 5d70f9901d13..53b077b40d72 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -24,6 +24,7 @@
  */
 
 #include "dm_services.h"
+#include "core_types.h"
 
 #include "ObjectID.h"
 #include "atomfirmware.h"
@@ -1374,7 +1375,7 @@ static enum bp_result bios_parser_get_lttpr_interop(
default:
break;
}
-
+   DC_LOG_BIOS("DCE_INFO_CAPS_VBIOS_LTTPR_TRANSPARENT_ENABLE: %d 
tbl_revision.major = %d tbl_revision.minor = %d\n", *dce_caps, 
tbl_revision.major, tbl_revision.minor);
return result;
 }
 
@@ -1390,6 +1391,7 @@ static enum bp_result bios_parser_get_lttpr_caps(
if (!DATA_TABLES(dce_info))
return BP_RESULT_UNSUPPORTED;
 
+   *dce_caps  = 0;
header = GET_IMAGE(struct atom_common_table_header,
DATA_TABLES(dce_info));
get_atom_data_table_revision(header, &tbl_revision);
@@ -1423,7 +1425,11 @@ static enum bp_result bios_parser_get_lttpr_caps(
default:
break;
}
-
+   DC_LOG_BIOS("DCE_INFO_CAPS_LTTPR_SUPPORT_ENABLE: %d tbl_revision.major 
= %d tbl_revision.minor = %d\n", *dce_caps, tbl_revision.major, 
tbl_revision.minor);
+   if (dcb->ctx->dc->config.force_bios_enable_lttpr && *dce_caps == 0) {
+   *dce_caps = 1;
+   DC_LOG_BIOS("DCE_INFO_CAPS_VBIOS_LTTPR_TRANSPARENT_ENABLE: 
forced enabled");
+   }
return result;
 }
 
@@ -2994,13 +3000,22 @@ static enum bp_result construct_integrated_info(

info->ext_disp_conn_info.path[i].ext_encoder_obj_id.id,

info->ext_disp_conn_info.path[i].caps
);
+   if (info->ext_disp_conn_info.path[i].caps & 
EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN)
+   DC_LOG_BIOS("BIOS 
EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN on path %d\n", i);
+   else if (bp->base.ctx->dc->config.force_bios_fixed_vs) {
+   info->ext_disp_conn_info.path[i].caps |= 
EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN;
+   DC_LOG_BIOS("driver forced 
EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN on path %d\n", i);
+   }
}
-
// Log the Checksum and Voltage Swing
DC_LOG_BIOS("Integrated info table CHECKSUM: %d\n"
"Integrated info table 
FIX_DP_VOLTAGE_SWING: %d\n",
info->ext_disp_conn_info.checksum,

info->ext_disp_conn_info.fixdpvoltageswing);
+   if (bp->base.ctx->dc->config.force_bios_fixed_vs && 
info->ext_disp_conn_info.fixdpvoltageswing == 0) {
+   info->ext_disp_conn_info.fixdpvoltageswing = 
bp->base.ctx->dc->config.force_bios_fixed_vs & 0xF;
+   DC_LOG_BIOS("driver forced fixdpvoltageswing = %d\n", 
info->ext_disp_conn_info.fixdpvoltageswing);
+   }
}
/* Sort voltage table from low to high*/
for (i = 1; i < NUMBER_OF_DISP_CLK_VOLTAGE; ++i) {
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 30274e8a6d23..ccb5395a8a90 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -406,6 +406,9 @@ struct dc_config {
bool ignore_dpref_ss;
bool enable_mipi_converter_optimization;
bool use_default_clock_table;
+   bool force_bios_enable_lttpr;
+   uint8_t force_bios_fixed_vs;
+
 };
 
 enum visual_confirm {
-- 
2.25.1



[PATCH 16/31] drm/amd/display: Fix typo in get_pixel_rate_div

2022-09-21 Thread Jasdeep Dhillon
From: Taimur Hassan 

[Why & How]
Some FIFO errors still occur because the wrong pixel rate divider is read.
Fix the typo to prevent the FIFO errors.

Reviewed-by: Nicholas Kazlauskas 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Taimur Hassan 
---
 drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
index 171e1580291a..1bd7e0f327d8 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
@@ -76,7 +76,7 @@ static void dccg314_get_pixel_rate_div(
case 3:
REG_GET_2(OTG_PIXEL_RATE_DIV,
OTG3_PIXEL_RATE_DIVK1, &val_k1,
-   OTG3_PIXEL_RATE_DIVK2, &val_k1);
+   OTG3_PIXEL_RATE_DIVK2, &val_k2);
break;
default:
BREAK_TO_DEBUGGER();
-- 
2.25.1



[PATCH 25/31] drm/amd/display: Remove assert for odm transition case

2022-09-21 Thread Jasdeep Dhillon
From: Eric Bernstein 

Remove the assert that is hit during the ODM transition case, since this
is a valid case.

Signed-off-by: Eric Bernstein 
Reviewed-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c
index 0b70247a5d36..f6d3da475835 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c
@@ -98,9 +98,13 @@ void dcn32_program_det_size(struct hubbub *hubbub, int 
hubp_inst, unsigned int d
default:
break;
}
-   /* Should never be hit, if it is we have an erroneous hw config*/
-   ASSERT(hubbub2->det0_size + hubbub2->det1_size + hubbub2->det2_size
-   + hubbub2->det3_size + hubbub2->compbuf_size_segments 
<= hubbub2->crb_size_segs);
+   if (hubbub2->det0_size + hubbub2->det1_size + hubbub2->det2_size
+   + hubbub2->det3_size + hubbub2->compbuf_size_segments > 
hubbub2->crb_size_segs) {
+   /* This may happen during seamless transition from ODM 2:1 to 
ODM4:1 */
+   DC_LOG_WARNING("CRB Config Warning: DET size (%d,%d,%d,%d) + 
Compbuf size (%d) >  CRB segments (%d)\n",
+   hubbub2->det0_size, 
hubbub2->det1_size, hubbub2->det2_size, hubbub2->det3_size,
+   hubbub2->compbuf_size_segments, 
hubbub2->crb_size_segs);
+   }
 }
 
 static void dcn32_program_compbuf_size(struct hubbub *hubbub, unsigned int 
compbuf_size_kb, bool safe_to_increase)
-- 
2.25.1



[PATCH 21/31] Add ABM control to panel_config struct.

2022-09-21 Thread Jasdeep Dhillon
From: Ian Chen 

Reviewed-by: Josip Pavic 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Ian Chen 
---
 drivers/gpu/drm/amd/display/dc/dc_link.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
b/drivers/gpu/drm/amd/display/dc/dc_link.h
index 6e49ec262487..bf5f9e2773bc 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -127,6 +127,12 @@ struct dc_panel_config {
unsigned int extra_t12_ms;
unsigned int extra_post_OUI_ms;
} pps;
+   // ABM
+   struct varib {
+   unsigned int varibright_feature_enable;
+   unsigned int def_varibright_level;
+   unsigned int abm_config_setting;
+   } varib;
// edp DSC
struct dsc {
bool disable_dsc_edp;
-- 
2.25.1



[PATCH 18/31] drm/amd/display: Fix CAB allocation calculation

2022-09-21 Thread Jasdeep Dhillon
From: Alvin Lee 

[Description]
The calculation accidentally added where it should have
subtracted.

Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
index 6497246692cf..85fa17185ccb 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
@@ -242,7 +242,7 @@ static uint32_t dcn32_calculate_cab_allocation(struct dc 
*dc, struct dc_state *c
 * mall_alloc_width_blk_aligned_l/c = 
full_vp_width_blk_aligned_l/c
 */
mall_alloc_width_blk_aligned = 
((pipe->plane_res.scl_data.viewport.x +
-   pipe->plane_res.scl_data.viewport.width + 
mblk_width - 1) / mblk_width * mblk_width) +
+   pipe->plane_res.scl_data.viewport.width + 
mblk_width - 1) / mblk_width * mblk_width) -

(pipe->plane_res.scl_data.viewport.x / mblk_width * mblk_width);
 
/* full_vp_height_blk_aligned = FLOOR(vp_y_start + 
full_vp_height + blk_height - 1, blk_height) -
@@ -251,7 +251,7 @@ static uint32_t dcn32_calculate_cab_allocation(struct dc 
*dc, struct dc_state *c
 * mall_alloc_height_blk_aligned_l/c = 
full_vp_height_blk_aligned_l/c
 */
mall_alloc_height_blk_aligned = 
((pipe->plane_res.scl_data.viewport.y +
-   pipe->plane_res.scl_data.viewport.height + 
mblk_height - 1) / mblk_height * mblk_height) +
+   pipe->plane_res.scl_data.viewport.height + 
mblk_height - 1) / mblk_height * mblk_height) -

(pipe->plane_res.scl_data.viewport.y / mblk_height * mblk_height);
 
num_mblks = ((mall_alloc_width_blk_aligned + mblk_width - 1) / 
mblk_width) *
-- 
2.25.1
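
The corrected expression is the block-aligned end of the viewport minus the block-aligned start. A minimal sketch of that arithmetic, using hypothetical helper names and plain integers in place of the viewport fields:

```c
#include <assert.h>
#include <stdint.h>

/* Round v down to a multiple of blk. */
uint32_t align_down(uint32_t v, uint32_t blk)
{
    assert(blk != 0);
    return v / blk * blk;
}

/* Round v up to a multiple of blk. */
uint32_t align_up(uint32_t v, uint32_t blk)
{
    assert(blk != 0);
    return (v + blk - 1) / blk * blk;
}

/* Size of the block-aligned region covering [start, start + size). */
uint32_t blk_aligned_extent(uint32_t start, uint32_t size, uint32_t blk)
{
    return align_up(start + size, blk) - align_down(start, blk);
}
```

For a start of 100, a size of 200 and a block size of 64, the aligned end is 320 and the aligned start is 64, so the extent is 256. Adding the two terms instead, as the code did before this patch, would give 384.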



[PATCH 28/31] drm/amd/display: Reduce number of arguments of dml314's CalculateFlipSchedule()

2022-09-21 Thread Jasdeep Dhillon
From: Nathan Chancellor 

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml314_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:4020:6: error: stack frame size (2216) exceeds limit (2048) in 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml314_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1710
Reported-by: "kernelci.org bot" 
Signed-off-by: Nathan Chancellor 
---
 .../dc/dml/dcn314/display_mode_vba_314.c  | 172 +-
 1 file changed, 47 insertions(+), 125 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
index 4c1d0c103933..0d12fd079cd6 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
@@ -265,33 +265,13 @@ static void CalculateRowBandwidth(
 
 static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+   unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
-   unsigned int GPUVMMaxPageTableLevels,
-   bool HostVMEnable,
-   unsigned int HostVMMaxNonCachedPageTableLevels,
-   bool GPUVMEnable,
-   double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
-   double DPTEBytesPerRow,
-   double BandwidthAvailableForImmediateFlip,
-   unsigned int TotImmediateFlipBytes,
-   enum source_format_class SourcePixelFormat,
-   double LineTime,
-   double VRatio,
-   double VRatioChroma,
-   double Tno_bw,
-   bool DCCEnable,
-   unsigned int dpte_row_height,
-   unsigned int meta_row_height,
-   unsigned int dpte_row_height_chroma,
-   unsigned int meta_row_height_chroma,
-   double *DestinationLinesToRequestVMInImmediateFlip,
-   double *DestinationLinesToRequestRowInImmediateFlip,
-   double *final_flip_bw,
-   bool *ImmediateFlipSupportedForPipe);
+   double DPTEBytesPerRow);
 static double CalculateWriteBackDelay(
enum source_format_class WritebackPixelFormat,
double WritebackHRatio,
@@ -2892,33 +2872,13 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
CalculateFlipSchedule(
mode_lib,
+   k,
HostVMInefficiencyFactor,
v->UrgentExtraLatency,
v->UrgentLatency,
-   v->GPUVMMaxPageTableLevels,
-   v->HostVMEnable,
-   
v->HostVMMaxNonCachedPageTableLevels,
-   v->GPUVMEnable,
-   v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
-   v->PixelPTEBytesPerRow[k],
-   
v->BandwidthAvailableForImmediateFlip,
-   v->TotImmediateFlipBytes,
-   v->SourcePixelFormat[k],
-   v->HTotal[k] / v->PixelClock[k],
-   v->VRatio[k],
-   v->VRatioChroma[k],
-   v->Tno_bw[k],
-   v->DCCEnable[k],
-   v->dpte_row_height[k],
-   v->meta_row_height[k],
-   v->dpte_row_height_chroma[k],
-   v->meta_row_height_chroma[k],
-   
&v->DestinationLinesToRequestVMInImmediateFlip[k],
-  

[PATCH 09/31] drm/amd/display: Change EDID fallback condition

2022-09-21 Thread Jasdeep Dhillon
From: Ilya Bakoulin 

[Why]
Partially valid EDIDs on MST sinks are treated the same way as broken
EDIDs or read failures and result in a fallback EDID being used instead.

[How]
If edid_status is EDID_PARTIAL_VALID, prefer to use the valid EDID
blocks instead of using a fallback EDID.

Reviewed-by: Martin Leung 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Ilya Bakoulin 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index d93393cc66c0..351888fe9b72 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4061,7 +4061,7 @@ struct dc_sink *dc_link_add_remote_sink(
 * Treat device as no EDID device if EDID
 * parsing fails
 */
-   if (edid_status != EDID_OK) {
+   if (edid_status != EDID_OK && edid_status != EDID_PARTIAL_VALID) {
dc_sink->dc_edid.length = 0;
dm_error("Bad EDID, status%d!\n", edid_status);
}
-- 
2.25.1
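
For illustration only, the new fallback condition can be read as a small predicate. This is a minimal sketch, not the driver code: the enum and function names below are hypothetical stand-ins, and only the OK and PARTIAL_VALID states come from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's EDID status values. */
enum sketch_edid_status {
	SKETCH_EDID_OK,
	SKETCH_EDID_PARTIAL_VALID,
	SKETCH_EDID_BAD_CHECKSUM,
	SKETCH_EDID_NO_RESPONSE,
};

/*
 * Returns true when the sink should be treated as having no usable EDID.
 * A partially valid EDID keeps its valid blocks instead of falling back.
 */
static bool sketch_edid_needs_fallback(enum sketch_edid_status status)
{
	return status != SKETCH_EDID_OK &&
	       status != SKETCH_EDID_PARTIAL_VALID;
}
```

Under this reading, only statuses other than OK and PARTIAL_VALID clear the EDID length and log the error.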



[PATCH 02/31] drm/amd/display: Update DCN32 to use new SR latencies

2022-09-21 Thread Jasdeep Dhillon
From: Alvin Lee 

[Description]
Update to new SR latencies for DCN32

Reviewed-by: Nevenko Stupar 
Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index fe0770038a90..6687cfed2ca9 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -121,8 +121,8 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_2_soc = {
},
},
.num_states = 1,
-   .sr_exit_time_us = 20.16,
-   .sr_enter_plus_exit_time_us = 27.13,
+   .sr_exit_time_us = 42.97,
+   .sr_enter_plus_exit_time_us = 49.94,
.sr_exit_z8_time_us = 285.0,
.sr_enter_plus_exit_z8_time_us = 320,
.writeback_latency_us = 12.0,
-- 
2.25.1



[PATCH 10/31] drm/amd/display: skip phantom pipes when checking for pending flip

2022-09-21 Thread Jasdeep Dhillon
From: Aurabindo Pillai 

[Why&How]
Phantom pipes are not programmed fully to hardware and hence we should
not expect a flip completion.

Reviewed-by: Alvin Lee 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Aurabindo Pillai 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 351888fe9b72..b5ad0bf4135a 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1201,7 +1201,7 @@ static void wait_for_no_pipes_pending(struct dc *dc, 
struct dc_state *context)
int count = 0;
struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i];
 
-   if (!pipe->plane_state)
+   if (!pipe->plane_state || pipe->stream->mall_stream_config.type 
== SUBVP_PHANTOM)
continue;
 
/* Timeout 100 ms */
-- 
2.25.1
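
As a rough sketch of the skip condition above: a pipe is only worth waiting on if it has a plane and is not a phantom pipe. The struct and function names are hypothetical and reduce the real pipe context to two flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a pipe: only the two facts the check uses. */
struct sketch_pipe {
	bool has_plane;   /* stands in for pipe->plane_state != NULL */
	bool is_phantom;  /* stands in for a SUBVP_PHANTOM stream type */
};

/* Counts the pipes for which a flip completion can be expected. */
static size_t sketch_count_pipes_to_wait_on(const struct sketch_pipe *pipes,
					    size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		/* Phantom pipes are not fully programmed, so skip them. */
		if (!pipes[i].has_plane || pipes[i].is_phantom)
			continue;
		count++;
	}
	return count;
}
```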



[PATCH 08/31] drm/amd/display: Do DIO FIFO enable after DP video stream enable

2022-09-21 Thread Jasdeep Dhillon
From: Nicholas Kazlauskas 

[Why]
Avoids a race condition where DIO FIFO can underflow due to no incoming
data available.

[How]
Shift the FIFO enable below stream enable.

Make sure fullness level is written before the DIO reset takes place
and that we're not doing it twice.

Reviewed-by: Syed Hassan 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Nicholas Kazlauskas 
---
 .../display/dc/dcn314/dcn314_dio_stream_encoder.c   | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
index 3107bd57..0d2ffb692957 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
@@ -56,7 +56,8 @@ static void enc314_enable_fifo(struct stream_encoder *enc)
 
/* TODO: Confirm if we need to wait for DIG_SYMCLK_FE_ON */
REG_WAIT(DIG_FE_CNTL, DIG_SYMCLK_FE_ON, 1, 10, 5000);
-   REG_UPDATE_2(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 1, 
DIG_FIFO_READ_START_LEVEL, 0x7);
+   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_READ_START_LEVEL, 0x7);
+   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 1);
REG_WAIT(DIG_FIFO_CTRL0, DIG_FIFO_RESET_DONE, 1, 10, 5000);
REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 0);
REG_WAIT(DIG_FIFO_CTRL0, DIG_FIFO_RESET_DONE, 0, 10, 5000);
@@ -326,15 +327,11 @@ static void enc314_stream_encoder_dp_unblank(
/* switch DP encoder to CRTC data, but reset it the fifo first. It may 
happen
 * that it overflows during mode transition, and sometimes doesn't 
recover.
 */
-   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_READ_START_LEVEL, 0x7);
REG_UPDATE(DP_STEER_FIFO, DP_STEER_FIFO_RESET, 1);
udelay(10);
 
REG_UPDATE(DP_STEER_FIFO, DP_STEER_FIFO_RESET, 0);
 
-   /* DIG Resync FIFO now needs to be explicitly enabled. */
-   enc314_enable_fifo(enc);
-
/* wait 100us for DIG/DP logic to prime
 * (i.e. a few video lines)
 */
@@ -350,6 +347,12 @@ static void enc314_stream_encoder_dp_unblank(
 
REG_UPDATE(DP_VID_STREAM_CNTL, DP_VID_STREAM_ENABLE, true);
 
+   /*
+* DIG Resync FIFO now needs to be explicitly enabled.
+* This should come after DP_VID_STREAM_ENABLE per HW docs.
+*/
+   enc314_enable_fifo(enc);
+
dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_ENABLE_DP_VID_STREAM);
 }
 
-- 
2.25.1



[PATCH 11/31] drm/amd/display: fix a divide by zero error

2022-09-21 Thread Jasdeep Dhillon
From: Aurabindo Pillai 

[Why&How]

Incorrect variable was being checked for zero condition.

Reviewed-by: Alvin Lee 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Aurabindo Pillai 
---
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c
index 46ba6eee69ea..a2a70a1572b7 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c
@@ -278,7 +278,7 @@ void dcn32_determine_det_override(struct dc *dc,
}
}
 
-   if (context->stream_count > 0) {
+   if (stream_count > 0) {
stream_segments = 18 / stream_count;
for (i = 0; i < context->stream_count; i++) {
if (context->streams[i]->mall_stream_config.type == 
SUBVP_PHANTOM)
-- 
2.25.1



[PATCH 06/31] drm/amd/display: Wrap OTG disable workaround with FIFO control

2022-09-21 Thread Jasdeep Dhillon
From: Nicholas Kazlauskas 

[Why]
The DIO FIFO will underflow if we turn off the OTG before we turn
off the FIFO.

Since this happens as part of the OTG workaround and we don't reset
the FIFO afterwards we see the error persist.

[How]
Add disable FIFO before the disable CRTC and enable FIFO after enabling
the CRTC.

Reviewed-by: Syed Hassan 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Nicholas Kazlauskas 
---
 .../amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c| 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
index 193a0f3de18d..1131c6d73f6c 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
@@ -137,11 +137,20 @@ static void dcn314_disable_otg_wa(struct clk_mgr 
*clk_mgr_base, struct dc_state
if (pipe->top_pipe || pipe->prev_odm_pipe)
continue;
if (pipe->stream && (pipe->stream->dpms_off || 
dc_is_virtual_signal(pipe->stream->signal))) {
+   struct stream_encoder *stream_enc = 
pipe->stream_res.stream_enc;
+
if (disable) {
+   if (stream_enc && 
stream_enc->funcs->disable_fifo)
+   
pipe->stream_res.stream_enc->funcs->disable_fifo(stream_enc);
+

pipe->stream_res.tg->funcs->immediate_disable_crtc(pipe->stream_res.tg);
reset_sync_context_for_pipe(dc, context, i);
-   } else
+   } else {

pipe->stream_res.tg->funcs->enable_crtc(pipe->stream_res.tg);
+
+   if (stream_enc && 
stream_enc->funcs->enable_fifo)
+   
pipe->stream_res.stream_enc->funcs->enable_fifo(stream_enc);
+   }
}
}
 }
-- 
2.25.1



[PATCH 03/31] drm/amd/display: Fix various dynamic ODM transitions on DCN32

2022-09-21 Thread Jasdeep Dhillon
From: Dillon Varone 

[Why&How]

Several transitions were fixed that will allow Dynamic ODM and MPO
transitions to be supported on DCN32.

1) Due to resource limitations, in certain scenarios that require an MPO
plane to be split, the features cannot be combined with the current
policy. This is due to unsafe transitions being required (OPP instance
per MPCC being switched on active pipe is not supported by DCN), to
support the split plane with ODM active as it moves across the viewport.
Dynamic ODM will now be disabled when MPO is required.

2) When exiting MPO and re-entering ODM, DC assigns an inactive pipe for
the next ODM pipe, which under previous power gating policy would result
in programming a gated DSC HW block. New policy dynamically
gates/un-gates DSC blocks when Dynamic ODM is active to support
transitions on DCN32 only.

3) Entry and exit from 3 plane MPO and Dynamic ODM requires a minimal
transition so that all pipes which require their MPCC OPP instance to
be changed have a full frame to be disabled before reprogramming. To
solve this, the Dynamic ODM policy now utilizes minimal state
transitions when entering or exiting 3 plane scenarios.

4) Various fixes to DCN32 pipe merge/split algorithm to support Dynamic
ODM and MPO transitions.

In summary, this commit fixes various transitions to support ODM->MPO
and MPO->ODM.

Reviewed-by: Martin Leung 
Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Dillon Varone 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 99 ++-
 drivers/gpu/drm/amd/display/dc/dc.h   |  1 +
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 54 ++
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|  8 ++
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |  2 +
 .../drm/amd/display/dc/dcn32/dcn32_resource.c | 27 ++---
 .../amd/display/dc/dcn321/dcn321_resource.c   |  3 +
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 24 +
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |  1 +
 .../amd/display/dc/inc/hw_sequencer_private.h |  2 +
 10 files changed, 186 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 19eb960d75d8..390adc00cd28 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1750,6 +1750,9 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
context->stream_count == 0)
dc->hwss.prepare_bandwidth(dc, context);
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, false);
+
disable_dangling_plane(dc, context);
/* re-program planes for existing stream, in case we need to
 * free up plane resource for later use
@@ -1840,6 +1843,9 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
dc->hwss.optimize_bandwidth(dc, context);
}
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, true);
+
if (dc->ctx->dce_version >= DCE_VERSION_MAX)
TRACE_DCN_CLOCK_STATE(&context->bw_ctx.bw.dcn.clk);
else
@@ -2003,6 +2009,9 @@ void dc_post_update_surfaces_to_stream(struct dc *dc)
 
dc->hwss.optimize_bandwidth(dc, context);
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, true);
+
dc->optimized_required = false;
dc->wm_optimized_required = false;
 }
@@ -3198,6 +3207,9 @@ static void commit_planes_for_stream(struct dc *dc,
if (get_seamless_boot_stream_count(context) == 0)
dc->hwss.prepare_bandwidth(dc, context);
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, false);
+
context_clock_trace(dc, context);
}
 
@@ -3521,11 +3533,59 @@ static void commit_planes_for_stream(struct dc *dc,
}
 }
 
+/* Determines if the incoming context requires a applying transition state 
with unnecessary
+ * pipe splitting and ODM disabled, due to hardware limitations. In a case 
where
+ * the OPP associated with an MPCC might change due to plane additions, this 
function
+ * returns true.
+ */
+static bool could_mpcc_tree_change_for_active_pipes(struct dc *dc,
+   struct dc_stream_state *stream,
+   int surface_count,
+   bool *is_plane_addition)
+{
+
+   struct dc_stream_status *cur_stream_status = 
stream_get_status(dc->current_state, stream);
+   bool force_minimal_pipe_splitting = false;
+
+   *is_plane_addition = false;
+
+   if (cur_stream_status &&
+   dc->current_state->stream_count > 0 &&
+   dc->debug.pipe_split_policy != MPC_SPLIT_AVOID) {
+   /* determine if minimal transition is required 

[PATCH 05/31] drm/amd/display: Avoid avoid unnecessary pixel rate divider programming

2022-09-21 Thread Jasdeep Dhillon
From: Taimur Hassan 

[Why]
Programming pixel rate divider when FIFO is enabled can cause FIFO error.

[How]
Skip divider programming when divider values are the same to prevent FIFO
error.

Reviewed-by: Nicholas Kazlauskas 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Taimur Hassan 
---
 .../drm/amd/display/dc/dcn314/dcn314_dccg.c   | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
index 36630d532c18..171e1580291a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
@@ -45,6 +45,48 @@
 #define DC_LOGGER \
dccg->ctx->logger
 
+static void dccg314_get_pixel_rate_div(
+   struct dccg *dccg,
+   uint32_t otg_inst,
+   enum pixel_rate_div *k1,
+   enum pixel_rate_div *k2)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+   uint32_t val_k1 = PIXEL_RATE_DIV_NA, val_k2 = PIXEL_RATE_DIV_NA;
+
+   *k1 = PIXEL_RATE_DIV_NA;
+   *k2 = PIXEL_RATE_DIV_NA;
+
+   switch (otg_inst) {
+   case 0:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG0_PIXEL_RATE_DIVK1, &val_k1,
+   OTG0_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 1:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG1_PIXEL_RATE_DIVK1, &val_k1,
+   OTG1_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 2:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG2_PIXEL_RATE_DIVK1, &val_k1,
+   OTG2_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 3:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG3_PIXEL_RATE_DIVK1, &val_k1,
+   OTG3_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   default:
+   BREAK_TO_DEBUGGER();
+   return;
+   }
+
+   *k1 = (enum pixel_rate_div)val_k1;
+   *k2 = (enum pixel_rate_div)val_k2;
+}
+
 static void dccg314_set_pixel_rate_div(
struct dccg *dccg,
uint32_t otg_inst,
@@ -52,6 +94,11 @@ static void dccg314_set_pixel_rate_div(
enum pixel_rate_div k2)
 {
struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+   enum pixel_rate_div cur_k1 = PIXEL_RATE_DIV_NA, cur_k2 = 
PIXEL_RATE_DIV_NA;
+
+   dccg314_get_pixel_rate_div(dccg, otg_inst, &cur_k1, &cur_k2);
+   if (k1 == PIXEL_RATE_DIV_NA || k2 == PIXEL_RATE_DIV_NA || (k1 == cur_k1 
&& k2 == cur_k2))
+   return;
 
switch (otg_inst) {
case 0:
-- 
2.25.1



[PATCH 00/31] DC Patches Sept 26, 2022

2022-09-21 Thread Jasdeep Dhillon
This DC patchset brings improvements in multiple areas. In summary, we have:
 
- LTTPR mode can be dynamically changed
- Features can use the same interface to update cursor info
- Fixes for LLVM compilation issues
- Fixes DIO FIFO underflow and other FIFO errors
- Partially valid EDIDs handled properly
- Phantom pipes are skipped when checking pending flip
- Fixed audio on display after unplugging
 
Cc: Daniel Wheeler 

Alvin Lee (4):
  drm/amd/display: Update DCN32 to use new SR latencies
  drm/amd/display: Update MALL SS NumWays calculation
  drm/amd/display: Disable MALL when TMZ surface
  drm/amd/display: Fix CAB allocation calculation

Aric Cyr (3):
  drm/amd/display: Remove interface for periodic interrupt 1
  drm/amd/display: Fix audio on display after unplugging another
  drm/amd/display: 3.2.205

Aurabindo Pillai (2):
  drm/amd/display: skip phantom pipes when checking for pending flip
  drm/amd/display: fix a divide by zero error

Brandon Syu (1):
  Add debug option for exiting idle optimizations on cursor updates

Charlene Liu (1):
  drm/amd/display: add debug keys for override bios settings.

Dillon Varone (1):
  drm/amd/display: Fix various dynamic ODM transitions on DCN32

Eric Bernstein (1):
  drm/amd/display: Remove assert for odm transition case

Ian Chen (1):
  Add ABM control to panel_config struct.

Ilya Bakoulin (1):
  drm/amd/display: Change EDID fallback condition

Max Tseng (1):
  drm/amd/display: Cursor Info Update refactor

Michael Strauss (1):
  drm/amd/display: Refactor LTTPR mode selection

Nathan Chancellor (2):
  drm/amd/display: Reduce number of arguments of dml314's
CalculateWatermarksAndDRAMSpeedChangeSupport()
  drm/amd/display: Reduce number of arguments of dml314's
CalculateFlipSchedule()

Nicholas Kazlauskas (3):
  drm/amd/display: Wrap OTG disable workaround with FIFO control
  drm/amd/display: Add explicit FIFO disable for DP blank
  drm/amd/display: Do DIO FIFO enable after DP video stream enable

Samson Tam (1):
  drm/amd/display: fill in clock values when DPM is not enabled

Taimur Hassan (3):
  drm/amd/display: Avoid avoid unnecessary pixel rate divider
programming
  drm/amd/display: Fix typo in get_pixel_rate_div
  drm/amd/display: Avoid unnecessary pixel rate divider programming

Tom Rix (3):
  drm/amd/display: remove redundant CalculateTWait's
  drm/amd/display: refactor CalculateWriteBackDelay to use vba_vars_st
ptr
  drm/amd/display: remove redundant CalculateRemoteSurfaceFlipDelay's

Wenjing Liu (2):
  drm/amd/display: add missing null check
  drm/amd/display: polling vid stream status in hpo dp blank

 .../drm/amd/display/dc/bios/bios_parser2.c|  21 +-
 .../dc/clk_mgr/dcn314/dcn314_clk_mgr.c|  11 +-
 .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c  |  14 +
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 127 +++--
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |  19 +
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 301 ++-
 .../drm/amd/display/dc/core/dc_link_dpia.c|  34 +-
 .../gpu/drm/amd/display/dc/core/dc_stream.c   |   1 +
 drivers/gpu/drm/amd/display/dc/dc.h   |   8 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   7 +-
 drivers/gpu/drm/amd/display/dc/dc_stream.h|   6 +-
 .../display/dc/dce110/dce110_hw_sequencer.c   |   6 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |  35 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.h |   3 +-
 .../drm/amd/display/dc/dcn30/dcn30_resource.c |   3 +-
 .../amd/display/dc/dcn301/dcn301_resource.c   |   1 +
 .../amd/display/dc/dcn302/dcn302_resource.c   |   3 +-
 .../amd/display/dc/dcn303/dcn303_resource.c   |   1 +
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c|   6 +-
 .../drm/amd/display/dc/dcn314/dcn314_dccg.c   |  47 ++
 .../dc/dcn314/dcn314_dio_stream_encoder.c |  25 +-
 .../gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c |  53 ++
 .../drm/amd/display/dc/dcn32/dcn32_hubbub.c   |  10 +-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 280 ++
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|   8 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |   2 +
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |  27 +-
 .../display/dc/dcn32/dcn32_resource_helpers.c |   2 +-
 .../amd/display/dc/dcn321/dcn321_resource.c   |   3 +
 .../dc/dml/dcn20/display_mode_vba_20.c|  98 +---
 .../dc/dml/dcn20/display_mode_vba_20v2.c  | 176 +-
 .../dc/dml/dcn21/display_mode_vba_21.c| 173 +-
 .../dc/dml/dcn30/display_mode_vba_30.c|  92 +---
 .../dc/dml/dcn31/display_mode_vba_31.c|  89 +--
 .../dc/dml/dcn314/display_mode_vba_314.c  | 509 --
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  |  70 ++-
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.h  |   2 +
 .../dc/dml/dcn32/display_mode_vba_32.c|  42 +-
 .../dc/dml/dcn32/display_mode_vba_util_32.c   |  30 --
 .../dc/dml/dcn32/display_mode_vba_util_32.h   |  10 +-
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   5 +
 .../amd/display/d

[PATCH 04/31] drm/amd/display: Remove interface for periodic interrupt 1

2022-09-21 Thread Jasdeep Dhillon
From: Aric Cyr 

[why]
Only a single VLINE interrupt is available so interface should not
expose the second one which is used by DMU firmware.

[how]
Remove references to periodic_interrupt1 and VLINE1 from DC interfaces.

Reviewed-by: Jaehyun Chung 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 16 +++--
 drivers/gpu/drm/amd/display/dc/dc_stream.h|  6 ++--
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 35 ++-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.h |  3 +-
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |  8 +
 5 files changed, 18 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 390adc00cd28..d93393cc66c0 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2768,11 +2768,8 @@ static void copy_stream_update_to_stream(struct dc *dc,
if (update->abm_level)
stream->abm_level = *update->abm_level;
 
-   if (update->periodic_interrupt0)
-   stream->periodic_interrupt0 = *update->periodic_interrupt0;
-
-   if (update->periodic_interrupt1)
-   stream->periodic_interrupt1 = *update->periodic_interrupt1;
+   if (update->periodic_interrupt)
+   stream->periodic_interrupt = *update->periodic_interrupt;
 
if (update->gamut_remap)
stream->gamut_remap_matrix = *update->gamut_remap;
@@ -2992,13 +2989,8 @@ static void commit_planes_do_stream_update(struct dc *dc,
 
if (!pipe_ctx->top_pipe &&  !pipe_ctx->prev_odm_pipe && 
pipe_ctx->stream == stream) {
 
-   if (stream_update->periodic_interrupt0 &&
-   dc->hwss.setup_periodic_interrupt)
-   dc->hwss.setup_periodic_interrupt(dc, pipe_ctx, 
VLINE0);
-
-   if (stream_update->periodic_interrupt1 &&
-   dc->hwss.setup_periodic_interrupt)
-   dc->hwss.setup_periodic_interrupt(dc, pipe_ctx, 
VLINE1);
+   if (stream_update->periodic_interrupt && 
dc->hwss.setup_periodic_interrupt)
+   dc->hwss.setup_periodic_interrupt(dc, pipe_ctx);
 
if ((stream_update->hdr_static_metadata && 
!stream->use_dynamic_meta) ||
stream_update->vrr_infopacket ||
diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h 
b/drivers/gpu/drm/amd/display/dc/dc_stream.h
index 9fcf9dc5bce4..9e6025c98db9 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_stream.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h
@@ -212,8 +212,7 @@ struct dc_stream_state {
/* DMCU info */
unsigned int abm_level;
 
-   struct periodic_interrupt_config periodic_interrupt0;
-   struct periodic_interrupt_config periodic_interrupt1;
+   struct periodic_interrupt_config periodic_interrupt;
 
/* from core_stream struct */
struct dc_context *ctx;
@@ -281,8 +280,7 @@ struct dc_stream_update {
struct dc_info_packet *hdr_static_metadata;
unsigned int *abm_level;
 
-   struct periodic_interrupt_config *periodic_interrupt0;
-   struct periodic_interrupt_config *periodic_interrupt1;
+   struct periodic_interrupt_config *periodic_interrupt;
 
struct dc_info_packet *vrr_infopacket;
struct dc_info_packet *vsc_infopacket;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 287fdecc0b10..72521749c01d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -3812,7 +3812,7 @@ void dcn10_calc_vupdate_position(
 {
const struct dc_crtc_timing *dc_crtc_timing = &pipe_ctx->stream->timing;
int vline_int_offset_from_vupdate =
-   pipe_ctx->stream->periodic_interrupt0.lines_offset;
+   pipe_ctx->stream->periodic_interrupt.lines_offset;
int vupdate_offset_from_vsync = 
dc->hwss.get_vupdate_offset_from_vsync(pipe_ctx);
int start_position;
 
@@ -3837,18 +3837,10 @@ void dcn10_calc_vupdate_position(
 static void dcn10_cal_vline_position(
struct dc *dc,
struct pipe_ctx *pipe_ctx,
-   enum vline_select vline,
uint32_t *start_line,
uint32_t *end_line)
 {
-   enum vertical_interrupt_ref_point ref_point = INVALID_POINT;
-
-   if (vline == VLINE0)
-   ref_point = pipe_ctx->stream->periodic_interrupt0.ref_point;
-   else if (vline == VLINE1)
-   ref_point = pipe_ctx->stream->periodic_interrupt1.ref_point;
-
-   switch (ref_point) {
+   switch (pipe_ctx->stream->periodic_interrupt.ref_point) {
case START_V_UPDATE:

[PATCH 01/31] drm/amd/display: Refactor LTTPR mode selection

2022-09-21 Thread Jasdeep Dhillon
From: Michael Strauss 

[WHY]
Previously, LTTPR mode was decided during detection which makes
link training inflexible as mode can't be dynamically changed.

[HOW]
-Remove lttpr_mode from link struct, and move to link training settings
-Defer choosing LTTPR mode until link training

Other DP changes included:
-Only use fixed vs/pe link training sequence for 8b/10b encoding
-Restrict fixed vs aux timeout workaround to Yellow Carp family

Reviewed-by: Wenjing Liu 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Michael Strauss 
---
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |  19 ++
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 301 ++
 .../drm/amd/display/dc/core/dc_link_dpia.c|  34 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   1 -
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   5 +
 .../amd/display/include/link_service_types.h  |   1 +
 scripts/extract-cert  | Bin 0 -> 18320 bytes
 7 files changed, 208 insertions(+), 153 deletions(-)
 create mode 100755 scripts/extract-cert

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index d01d2eeed813..3d01965b533a 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -35,6 +35,8 @@
 #include "dc_link_ddc.h"
 #include "dce/dce_aux.h"
 #include "dmub/inc/dmub_cmd.h"
+#include "link_dpcd.h"
+#include "include/dal_asic_id.h"
 
 #define DC_LOGGER_INIT(logger)
 
@@ -683,6 +685,22 @@ bool dc_link_aux_try_to_configure_timeout(struct 
ddc_service *ddc,
bool result = false;
struct ddc *ddc_pin = ddc->ddc_pin;
 
+   if ((ddc->link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) &&
+   !ddc->link->dc->debug.disable_fixed_vs_aux_timeout_wa &&
+   
ASICREV_IS_YELLOW_CARP(ddc->ctx->asic_id.hw_internal_rev)) {
+   /* Fixed VS workaround for AUX timeout */
+   const uint32_t fixed_vs_address = 0xF004F;
+   const uint8_t fixed_vs_data[4] = {0x1, 0x22, 0x63, 0xc};
+
+   core_link_write_dpcd(
+   ddc->link,
+   fixed_vs_address,
+   fixed_vs_data,
+   sizeof(fixed_vs_data));
+
+   timeout = 3072;
+   }
+
/* Do not try to access nonexistent DDC pin. */
if (ddc->link->ep_type != DISPLAY_ENDPOINT_PHY)
return true;
@@ -691,6 +709,7 @@ bool dc_link_aux_try_to_configure_timeout(struct 
ddc_service *ddc,

ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en]->funcs->configure_timeout(ddc,
 timeout);
result = true;
}
+
return result;
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 7842df9f62de..b3a77a16dd0c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -526,9 +526,9 @@ uint8_t dc_dp_initialize_scrambling_data_symbols(
return disable_scrabled_data_symbols;
 }
 
-static inline bool is_repeater(struct dc_link *link, uint32_t offset)
+static inline bool is_repeater(const struct link_training_settings 
*lt_settings, uint32_t offset)
 {
-   return (link->lttpr_mode == LTTPR_MODE_NON_TRANSPARENT) && (offset != 
0);
+   return (lt_settings->lttpr_mode == LTTPR_MODE_NON_TRANSPARENT) && 
(offset != 0);
 }
 
 static void dpcd_set_lt_pattern_and_lane_settings(
@@ -545,7 +545,7 @@ static void dpcd_set_lt_pattern_and_lane_settings(
bool edp_workaround = false; /* TODO link_prop.INTERNAL */
dpcd_base_lt_offset = DP_TRAINING_PATTERN_SET;
 
-   if (is_repeater(link, offset))
+   if (is_repeater(lt_settings, offset))
dpcd_base_lt_offset = DP_TRAINING_PATTERN_SET_PHY_REPEATER1 +
((DP_REPEATER_CONFIGURATION_AND_STATUS_SIZE) * (offset 
- 1));
 
@@ -561,7 +561,7 @@ static void dpcd_set_lt_pattern_and_lane_settings(
dpcd_lt_buffer[DP_TRAINING_PATTERN_SET - DP_TRAINING_PATTERN_SET]
= dpcd_pattern.raw;
 
-   if (is_repeater(link, offset)) {
+   if (is_repeater(lt_settings, offset)) {
DC_LOG_HW_LINK_TRAINING("%s\n LTTPR Repeater ID: %d\n 0x%X 
pattern = %x\n",
__func__,
offset,
@@ -584,7 +584,7 @@ static void dpcd_set_lt_pattern_and_lane_settings(
lt_settings->dpcd_lane_settings,
size_in_bytes);
 
-   if (is_repeater(link, offset)) {
+   if (is_repeater(lt_settings, offset)) {
if (dp_get_link_encoding_format(&lt_settings->link_settings) ==
DP_128b_132b_ENCODING)
DC_LOG_HW_LINK_TRAINING("%s:\n LTTPR Repeater ID: %d\n"
@@ -873,7 +873,7 @@ enum dc_status dp_get_lane_status_and_lane_adjust(
 

[PATCH 10/31] drm/amd/display: skip phantom pipes when checking for pending flip

2022-09-21 Thread Jasdeep Dhillon
From: Aurabindo Pillai 

[Why&How]
Phantom pipes are not programmed fully to hardware and hence we should
not expect a flip completion.

Reviewed-by: Alvin Lee 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Aurabindo Pillai 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 351888fe9b72..b5ad0bf4135a 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1201,7 +1201,7 @@ static void wait_for_no_pipes_pending(struct dc *dc, 
struct dc_state *context)
int count = 0;
struct pipe_ctx *pipe = &context->res_ctx.pipe_ctx[i];
 
-   if (!pipe->plane_state)
+   if (!pipe->plane_state || pipe->stream->mall_stream_config.type 
== SUBVP_PHANTOM)
continue;
 
/* Timeout 100 ms */
-- 
2.25.1



[PATCH 05/31] drm/amd/display: Avoid avoid unnecessary pixel rate divider programming

2022-09-21 Thread Jasdeep Dhillon
From: Taimur Hassan 

[Why]
Programming pixel rate divider when FIFO is enabled can cause FIFO error.

[How]
Skip divider programming when divider values are the same to prevent FIFO
error.

Reviewed-by: Nicholas Kazlauskas 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Taimur Hassan 
---
 .../drm/amd/display/dc/dcn314/dcn314_dccg.c   | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
index 36630d532c18..171e1580291a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
@@ -45,6 +45,48 @@
 #define DC_LOGGER \
dccg->ctx->logger
 
+static void dccg314_get_pixel_rate_div(
+   struct dccg *dccg,
+   uint32_t otg_inst,
+   enum pixel_rate_div *k1,
+   enum pixel_rate_div *k2)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+   uint32_t val_k1 = PIXEL_RATE_DIV_NA, val_k2 = PIXEL_RATE_DIV_NA;
+
+   *k1 = PIXEL_RATE_DIV_NA;
+   *k2 = PIXEL_RATE_DIV_NA;
+
+   switch (otg_inst) {
+   case 0:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG0_PIXEL_RATE_DIVK1, &val_k1,
+   OTG0_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 1:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG1_PIXEL_RATE_DIVK1, &val_k1,
+   OTG1_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 2:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG2_PIXEL_RATE_DIVK1, &val_k1,
+   OTG2_PIXEL_RATE_DIVK2, &val_k2);
+   break;
+   case 3:
+   REG_GET_2(OTG_PIXEL_RATE_DIV,
+   OTG3_PIXEL_RATE_DIVK1, &val_k1,
+   OTG3_PIXEL_RATE_DIVK2, &val_k1);
+   break;
+   default:
+   BREAK_TO_DEBUGGER();
+   return;
+   }
+
+   *k1 = (enum pixel_rate_div)val_k1;
+   *k2 = (enum pixel_rate_div)val_k2;
+}
+
 static void dccg314_set_pixel_rate_div(
struct dccg *dccg,
uint32_t otg_inst,
@@ -52,6 +94,11 @@ static void dccg314_set_pixel_rate_div(
enum pixel_rate_div k2)
 {
struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+   enum pixel_rate_div cur_k1 = PIXEL_RATE_DIV_NA, cur_k2 = 
PIXEL_RATE_DIV_NA;
+
+   dccg314_get_pixel_rate_div(dccg, otg_inst, &cur_k1, &cur_k2);
+   if (k1 == PIXEL_RATE_DIV_NA || k2 == PIXEL_RATE_DIV_NA || (k1 == cur_k1 
&& k2 == cur_k2))
+   return;
 
switch (otg_inst) {
case 0:
-- 
2.25.1
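
The read-compare-skip pattern this patch adds can be tried outside the driver.
What follows is only a minimal sketch under stated assumptions, not the DCN314
code: fake_reg, write_count, div_get() and div_set() are hypothetical stand-ins
for the OTG_PIXEL_RATE_DIV register and the REG_GET_2/REG_UPDATE_2 helpers, and
the two-bit field layout and enum values are chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical divider encoding; DIV_NA plays the role of PIXEL_RATE_DIV_NA. */
enum pixel_rate_div { DIV_BY_1 = 0, DIV_BY_2 = 1, DIV_BY_4 = 3, DIV_NA = 0xF };

/* Simulated register: k1 in bits 0-1, k2 in bits 2-3 (assumed layout). */
uint32_t fake_reg;
/* Counts how often the simulated register is actually written. */
unsigned int write_count;

/* Read back the currently programmed dividers. */
void div_get(enum pixel_rate_div *k1, enum pixel_rate_div *k2)
{
	*k1 = (enum pixel_rate_div)(fake_reg & 0x3);
	*k2 = (enum pixel_rate_div)((fake_reg >> 2) & 0x3);
}

/* Program the dividers only when the request is valid and changes something. */
void div_set(enum pixel_rate_div k1, enum pixel_rate_div k2)
{
	enum pixel_rate_div cur_k1, cur_k2;

	div_get(&cur_k1, &cur_k2);
	if (k1 == DIV_NA || k2 == DIV_NA || (k1 == cur_k1 && k2 == cur_k2))
		return; /* nothing to do, so the register is left untouched */

	fake_reg = ((uint32_t)k2 << 2) | (uint32_t)k1;
	write_count++;
}
```

Calling div_set() twice with the same pair should bump write_count only once,
which is the property the patch relies on to avoid disturbing an enabled FIFO.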



[PATCH 08/31] drm/amd/display: Do DIO FIFO enable after DP video stream enable

2022-09-21 Thread Jasdeep Dhillon
From: Nicholas Kazlauskas 

[Why]
Avoids a race condition where DIO FIFO can underflow due to no incoming
data available.

[How]
Shift the FIFO enable below stream enable.

Make sure fullness level is written before the DIO reset takes place
and that we're not doing it twice.

Reviewed-by: Syed Hassan 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Nicholas Kazlauskas 
---
 .../display/dc/dcn314/dcn314_dio_stream_encoder.c   | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
index 3107bd57..0d2ffb692957 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
@@ -56,7 +56,8 @@ static void enc314_enable_fifo(struct stream_encoder *enc)
 
/* TODO: Confirm if we need to wait for DIG_SYMCLK_FE_ON */
REG_WAIT(DIG_FE_CNTL, DIG_SYMCLK_FE_ON, 1, 10, 5000);
-   REG_UPDATE_2(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 1, 
DIG_FIFO_READ_START_LEVEL, 0x7);
+   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_READ_START_LEVEL, 0x7);
+   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 1);
REG_WAIT(DIG_FIFO_CTRL0, DIG_FIFO_RESET_DONE, 1, 10, 5000);
REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 0);
REG_WAIT(DIG_FIFO_CTRL0, DIG_FIFO_RESET_DONE, 0, 10, 5000);
@@ -326,15 +327,11 @@ static void enc314_stream_encoder_dp_unblank(
/* switch DP encoder to CRTC data, but reset it the fifo first. It may 
happen
 * that it overflows during mode transition, and sometimes doesn't 
recover.
 */
-   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_READ_START_LEVEL, 0x7);
REG_UPDATE(DP_STEER_FIFO, DP_STEER_FIFO_RESET, 1);
udelay(10);
 
REG_UPDATE(DP_STEER_FIFO, DP_STEER_FIFO_RESET, 0);
 
-   /* DIG Resync FIFO now needs to be explicitly enabled. */
-   enc314_enable_fifo(enc);
-
/* wait 100us for DIG/DP logic to prime
 * (i.e. a few video lines)
 */
@@ -350,6 +347,12 @@ static void enc314_stream_encoder_dp_unblank(
 
REG_UPDATE(DP_VID_STREAM_CNTL, DP_VID_STREAM_ENABLE, true);
 
+   /*
+* DIG Resync FIFO now needs to be explicitly enabled.
+* This should come after DP_VID_STREAM_ENABLE per HW docs.
+*/
+   enc314_enable_fifo(enc);
+
dp_source_sequence_trace(link, 
DPCD_SOURCE_SEQ_AFTER_ENABLE_DP_VID_STREAM);
 }
 
-- 
2.25.1



[PATCH 06/31] drm/amd/display: Wrap OTG disable workaround with FIFO control

2022-09-21 Thread Jasdeep Dhillon
From: Nicholas Kazlauskas 

[Why]
The DIO FIFO will underflow if we turn off the OTG before we turn
off the FIFO.

Since this happens as part of the OTG workaround and we don't reset
the FIFO afterwards we see the error persist.

[How]
Disable the FIFO before disabling the CRTC, and enable the FIFO after
enabling the CRTC.

Reviewed-by: Syed Hassan 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Nicholas Kazlauskas 
---
 .../amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c| 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
index 193a0f3de18d..1131c6d73f6c 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
@@ -137,11 +137,20 @@ static void dcn314_disable_otg_wa(struct clk_mgr 
*clk_mgr_base, struct dc_state
if (pipe->top_pipe || pipe->prev_odm_pipe)
continue;
if (pipe->stream && (pipe->stream->dpms_off || 
dc_is_virtual_signal(pipe->stream->signal))) {
+   struct stream_encoder *stream_enc = 
pipe->stream_res.stream_enc;
+
if (disable) {
+   if (stream_enc && 
stream_enc->funcs->disable_fifo)
+   
pipe->stream_res.stream_enc->funcs->disable_fifo(stream_enc);
+

pipe->stream_res.tg->funcs->immediate_disable_crtc(pipe->stream_res.tg);
reset_sync_context_for_pipe(dc, context, i);
-   } else
+   } else {

pipe->stream_res.tg->funcs->enable_crtc(pipe->stream_res.tg);
+
+   if (stream_enc && 
stream_enc->funcs->enable_fifo)
+   
pipe->stream_res.stream_enc->funcs->enable_fifo(stream_enc);
+   }
}
}
 }
-- 
2.25.1
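
The ordering described in [How] (FIFO off before CRTC off, FIFO on after CRTC
on, with both FIFO hooks optional) can be sketched on its own. This is a rough
illustration under assumptions, not the clk_mgr code: struct enc_funcs,
otg_workaround(), event_log and the fifo_*/crtc_* helpers are hypothetical
names that only record the call order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Records the order of hardware operations as short tags. */
char event_log[64];

void log_event(const char *tag)
{
	strncat(event_log, tag, sizeof(event_log) - strlen(event_log) - 1);
}

/* Optional encoder hooks; either pointer may be NULL. */
struct enc_funcs {
	void (*disable_fifo)(void);
	void (*enable_fifo)(void);
};

void fifo_off(void) { log_event("F-"); }
void fifo_on(void)  { log_event("F+"); }
void crtc_off(void) { log_event("C-"); }
void crtc_on(void)  { log_event("C+"); }

/* Wrap the CRTC toggle with FIFO control, skipping hooks that are absent. */
void otg_workaround(const struct enc_funcs *funcs, int disable)
{
	if (disable) {
		if (funcs && funcs->disable_fifo)
			funcs->disable_fifo();   /* FIFO goes down first */
		crtc_off();
	} else {
		crtc_on();
		if (funcs && funcs->enable_fifo)
			funcs->enable_fifo();    /* FIFO comes back last */
	}
}
```

A disable followed by an enable with both hooks present should log
"F-C-C+F+"; with no hooks only the CRTC tags appear.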



[PATCH 09/31] drm/amd/display: Change EDID fallback condition

2022-09-21 Thread Jasdeep Dhillon
From: Ilya Bakoulin 

[Why]
Partially valid EDIDs on MST sinks are treated the same way as broken
EDIDs or read failures and result in a fallback EDID being used instead.

[How]
If edid_status is EDID_PARTIAL_VALID, prefer to use the valid EDID
blocks instead of using a fallback EDID.

Reviewed-by: Martin Leung 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Ilya Bakoulin 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index d93393cc66c0..351888fe9b72 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -4061,7 +4061,7 @@ struct dc_sink *dc_link_add_remote_sink(
 * Treat device as no EDID device if EDID
 * parsing fails
 */
-   if (edid_status != EDID_OK) {
+   if (edid_status != EDID_OK && edid_status != EDID_PARTIAL_VALID) {
dc_sink->dc_edid.length = 0;
dm_error("Bad EDID, status%d!\n", edid_status);
}
-- 
2.25.1
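
The changed condition is small enough to check in isolation. The snippet below
is a hedged sketch, not dc_link_add_remote_sink() itself: the enum is an
assumed subset of the status values, and struct sink, edid_is_usable() and
apply_edid_status() are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed subset of EDID parse results, for illustration only. */
enum edid_status {
	EDID_OK,
	EDID_BAD_CHECKSUM,
	EDID_NO_RESPONSE,
	EDID_PARTIAL_VALID,
};

/* Hypothetical sink holding only the EDID length. */
struct sink {
	size_t edid_length;
};

/* A partially valid EDID still carries usable blocks, so keep it. */
bool edid_is_usable(enum edid_status status)
{
	return status == EDID_OK || status == EDID_PARTIAL_VALID;
}

/* Treat the device as having no EDID only when parsing really failed. */
void apply_edid_status(struct sink *sink, enum edid_status status)
{
	if (!edid_is_usable(status))
		sink->edid_length = 0;
}
```

With this predicate an EDID_PARTIAL_VALID result leaves the length intact,
while a checksum failure or a missing response still clears it.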



[PATCH 07/31] drm/amd/display: Add explicit FIFO disable for DP blank

2022-09-21 Thread Jasdeep Dhillon
From: Nicholas Kazlauskas 

[Why]
We rely on DMCUB to do this when disabling the link but it should
actually come before we disable the DP VID stream.

If we don't then the FIFO can end up with underflow that persists
the next time it's enabled.

[How]
Add a DCN314 specific blank sequence that will disable the DIG FIFO
first.

Reviewed-by: Syed Hassan 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Nicholas Kazlauskas 
---
 .../display/dc/dcn314/dcn314_dio_stream_encoder.c| 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
index 914c5da737ed..3107bd57 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
@@ -261,6 +261,16 @@ static bool is_two_pixels_per_containter(const struct 
dc_crtc_timing *timing)
return two_pix;
 }
 
+void enc314_stream_encoder_dp_blank(
+   struct dc_link *link,
+   struct stream_encoder *enc)
+{
+   /* New to DCN314 - disable the FIFO before VID stream disable. */
+   enc314_disable_fifo(enc);
+
+   enc1_stream_encoder_dp_blank(link, enc);
+}
+
 static void enc314_stream_encoder_dp_unblank(
struct dc_link *link,
struct stream_encoder *enc,
@@ -408,7 +418,7 @@ static const struct stream_encoder_funcs 
dcn314_str_enc_funcs = {
.stop_dp_info_packets =
enc1_stream_encoder_stop_dp_info_packets,
.dp_blank =
-   enc1_stream_encoder_dp_blank,
+   enc314_stream_encoder_dp_blank,
.dp_unblank =
enc314_stream_encoder_dp_unblank,
.audio_mute_control = enc3_audio_mute_control,
-- 
2.25.1



[PATCH 04/31] drm/amd/display: Remove interface for periodic interrupt 1

2022-09-21 Thread Jasdeep Dhillon
From: Aric Cyr 

[why]
Only a single VLINE interrupt is available so interface should not
expose the second one which is used by DMU firmware.

[how]
Remove references to periodic_interrupt1 and VLINE1 from DC interfaces.

Reviewed-by: Jaehyun Chung 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 16 +++--
 drivers/gpu/drm/amd/display/dc/dc_stream.h|  6 ++--
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 35 ++-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.h |  3 +-
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |  8 +
 5 files changed, 18 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 390adc00cd28..d93393cc66c0 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2768,11 +2768,8 @@ static void copy_stream_update_to_stream(struct dc *dc,
if (update->abm_level)
stream->abm_level = *update->abm_level;
 
-   if (update->periodic_interrupt0)
-   stream->periodic_interrupt0 = *update->periodic_interrupt0;
-
-   if (update->periodic_interrupt1)
-   stream->periodic_interrupt1 = *update->periodic_interrupt1;
+   if (update->periodic_interrupt)
+   stream->periodic_interrupt = *update->periodic_interrupt;
 
if (update->gamut_remap)
stream->gamut_remap_matrix = *update->gamut_remap;
@@ -2992,13 +2989,8 @@ static void commit_planes_do_stream_update(struct dc *dc,
 
if (!pipe_ctx->top_pipe &&  !pipe_ctx->prev_odm_pipe && 
pipe_ctx->stream == stream) {
 
-   if (stream_update->periodic_interrupt0 &&
-   dc->hwss.setup_periodic_interrupt)
-   dc->hwss.setup_periodic_interrupt(dc, pipe_ctx, 
VLINE0);
-
-   if (stream_update->periodic_interrupt1 &&
-   dc->hwss.setup_periodic_interrupt)
-   dc->hwss.setup_periodic_interrupt(dc, pipe_ctx, 
VLINE1);
+   if (stream_update->periodic_interrupt && 
dc->hwss.setup_periodic_interrupt)
+   dc->hwss.setup_periodic_interrupt(dc, pipe_ctx);
 
if ((stream_update->hdr_static_metadata && 
!stream->use_dynamic_meta) ||
stream_update->vrr_infopacket ||
diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h 
b/drivers/gpu/drm/amd/display/dc/dc_stream.h
index 9fcf9dc5bce4..9e6025c98db9 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_stream.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h
@@ -212,8 +212,7 @@ struct dc_stream_state {
/* DMCU info */
unsigned int abm_level;
 
-   struct periodic_interrupt_config periodic_interrupt0;
-   struct periodic_interrupt_config periodic_interrupt1;
+   struct periodic_interrupt_config periodic_interrupt;
 
/* from core_stream struct */
struct dc_context *ctx;
@@ -281,8 +280,7 @@ struct dc_stream_update {
struct dc_info_packet *hdr_static_metadata;
unsigned int *abm_level;
 
-   struct periodic_interrupt_config *periodic_interrupt0;
-   struct periodic_interrupt_config *periodic_interrupt1;
+   struct periodic_interrupt_config *periodic_interrupt;
 
struct dc_info_packet *vrr_infopacket;
struct dc_info_packet *vsc_infopacket;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 287fdecc0b10..72521749c01d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -3812,7 +3812,7 @@ void dcn10_calc_vupdate_position(
 {
const struct dc_crtc_timing *dc_crtc_timing = &pipe_ctx->stream->timing;
int vline_int_offset_from_vupdate =
-   pipe_ctx->stream->periodic_interrupt0.lines_offset;
+   pipe_ctx->stream->periodic_interrupt.lines_offset;
int vupdate_offset_from_vsync = 
dc->hwss.get_vupdate_offset_from_vsync(pipe_ctx);
int start_position;
 
@@ -3837,18 +3837,10 @@ void dcn10_calc_vupdate_position(
 static void dcn10_cal_vline_position(
struct dc *dc,
struct pipe_ctx *pipe_ctx,
-   enum vline_select vline,
uint32_t *start_line,
uint32_t *end_line)
 {
-   enum vertical_interrupt_ref_point ref_point = INVALID_POINT;
-
-   if (vline == VLINE0)
-   ref_point = pipe_ctx->stream->periodic_interrupt0.ref_point;
-   else if (vline == VLINE1)
-   ref_point = pipe_ctx->stream->periodic_interrupt1.ref_point;
-
-   switch (ref_point) {
+   switch (pipe_ctx->stream->periodic_interrupt.ref_point) {
case START_V_UPDATE:

[PATCH 02/31] drm/amd/display: Update DCN32 to use new SR latencies

2022-09-21 Thread Jasdeep Dhillon
From: Alvin Lee 

[Description]
Update to new SR latencies for DCN32

Reviewed-by: Nevenko Stupar 
Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index fe0770038a90..6687cfed2ca9 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -121,8 +121,8 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_2_soc = {
},
},
.num_states = 1,
-   .sr_exit_time_us = 20.16,
-   .sr_enter_plus_exit_time_us = 27.13,
+   .sr_exit_time_us = 42.97,
+   .sr_enter_plus_exit_time_us = 49.94,
.sr_exit_z8_time_us = 285.0,
.sr_enter_plus_exit_z8_time_us = 320,
.writeback_latency_us = 12.0,
-- 
2.25.1



[PATCH 01/31] drm/amd/display: Refactor LTTPR mode selection

2022-09-21 Thread Jasdeep Dhillon
From: Michael Strauss 

[WHY]
Previously, LTTPR mode was decided during detection which makes
link training inflexible as mode can't be dynamically changed.

[HOW]
-Remove lttpr_mode from link struct, and move to link training settings
-Defer choosing LTTPR mode until link training

Other DP changes included:
-Only use fixed vs/pe link training sequence for 8b/10b encoding
-Restrict fixed vs aux timeout workaround to Yellow Carp family

Reviewed-by: Wenjing Liu 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Michael Strauss 
---
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |  19 ++
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 301 ++
 .../drm/amd/display/dc/core/dc_link_dpia.c|  34 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   1 -
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   5 +
 .../amd/display/include/link_service_types.h  |   1 +
 scripts/extract-cert  | Bin 0 -> 18320 bytes
 7 files changed, 208 insertions(+), 153 deletions(-)
 create mode 100755 scripts/extract-cert

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index d01d2eeed813..3d01965b533a 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -35,6 +35,8 @@
 #include "dc_link_ddc.h"
 #include "dce/dce_aux.h"
 #include "dmub/inc/dmub_cmd.h"
+#include "link_dpcd.h"
+#include "include/dal_asic_id.h"
 
 #define DC_LOGGER_INIT(logger)
 
@@ -683,6 +685,22 @@ bool dc_link_aux_try_to_configure_timeout(struct 
ddc_service *ddc,
bool result = false;
struct ddc *ddc_pin = ddc->ddc_pin;
 
+   if ((ddc->link->chip_caps & EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN) &&
+   !ddc->link->dc->debug.disable_fixed_vs_aux_timeout_wa &&
+   
ASICREV_IS_YELLOW_CARP(ddc->ctx->asic_id.hw_internal_rev)) {
+   /* Fixed VS workaround for AUX timeout */
+   const uint32_t fixed_vs_address = 0xF004F;
+   const uint8_t fixed_vs_data[4] = {0x1, 0x22, 0x63, 0xc};
+
+   core_link_write_dpcd(
+   ddc->link,
+   fixed_vs_address,
+   fixed_vs_data,
+   sizeof(fixed_vs_data));
+
+   timeout = 3072;
+   }
+
/* Do not try to access nonexistent DDC pin. */
if (ddc->link->ep_type != DISPLAY_ENDPOINT_PHY)
return true;
@@ -691,6 +709,7 @@ bool dc_link_aux_try_to_configure_timeout(struct 
ddc_service *ddc,

ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en]->funcs->configure_timeout(ddc,
 timeout);
result = true;
}
+
return result;
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 7842df9f62de..b3a77a16dd0c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -526,9 +526,9 @@ uint8_t dc_dp_initialize_scrambling_data_symbols(
return disable_scrabled_data_symbols;
 }
 
-static inline bool is_repeater(struct dc_link *link, uint32_t offset)
+static inline bool is_repeater(const struct link_training_settings 
*lt_settings, uint32_t offset)
 {
-   return (link->lttpr_mode == LTTPR_MODE_NON_TRANSPARENT) && (offset != 
0);
+   return (lt_settings->lttpr_mode == LTTPR_MODE_NON_TRANSPARENT) && 
(offset != 0);
 }
 
 static void dpcd_set_lt_pattern_and_lane_settings(
@@ -545,7 +545,7 @@ static void dpcd_set_lt_pattern_and_lane_settings(
bool edp_workaround = false; /* TODO link_prop.INTERNAL */
dpcd_base_lt_offset = DP_TRAINING_PATTERN_SET;
 
-   if (is_repeater(link, offset))
+   if (is_repeater(lt_settings, offset))
dpcd_base_lt_offset = DP_TRAINING_PATTERN_SET_PHY_REPEATER1 +
((DP_REPEATER_CONFIGURATION_AND_STATUS_SIZE) * (offset 
- 1));
 
@@ -561,7 +561,7 @@ static void dpcd_set_lt_pattern_and_lane_settings(
dpcd_lt_buffer[DP_TRAINING_PATTERN_SET - DP_TRAINING_PATTERN_SET]
= dpcd_pattern.raw;
 
-   if (is_repeater(link, offset)) {
+   if (is_repeater(lt_settings, offset)) {
DC_LOG_HW_LINK_TRAINING("%s\n LTTPR Repeater ID: %d\n 0x%X 
pattern = %x\n",
__func__,
offset,
@@ -584,7 +584,7 @@ static void dpcd_set_lt_pattern_and_lane_settings(
lt_settings->dpcd_lane_settings,
size_in_bytes);
 
-   if (is_repeater(link, offset)) {
+   if (is_repeater(lt_settings, offset)) {
if (dp_get_link_encoding_format(&lt_settings->link_settings) ==
DP_128b_132b_ENCODING)
DC_LOG_HW_LINK_TRAINING("%s:\n LTTPR Repeater ID: %d\n"
@@ -873,7 +873,7 @@ enum dc_status dp_get_lane_status_and_lane_adjust(
 

[PATCH 03/31] drm/amd/display: Fix various dynamic ODM transitions on DCN32

2022-09-21 Thread Jasdeep Dhillon
From: Dillon Varone 

[Why&How]

Several transitions were fixed that will allow Dynamic ODM and MPO
transitions to be supported on DCN32.

1) Due to resource limitations, in certain scenarios that require an MPO
plane to be split, the features cannot be combined with the current
policy. This is due to unsafe transitions being required (OPP instance
per MPCC being switched on active pipe is not supported by DCN), to
support the split plane with ODM active as it moves across the viewport.
Dynamic ODM will now be disabled when MPO is required.

2) When exiting MPO and re-entering ODM, DC assigns an inactive pipe for
the next ODM pipe, which under previous power gating policy would result
in programming a gated DSC HW block. New policy dynamically
gates/un-gates DSC blocks when Dynamic ODM is active to support
transitions on DCN32 only.

3) Entry and exit from 3 plane MPO and Dynamic ODM requires a minimal
transition so that all pipes which require their MPCC OPP instance to
be changed have a full frame to be disabled before reprogramming. To
solve this, the Dynamic ODM policy now utilizes minimal state
transitions when entering or exiting 3 plane scenarios.

4) Various fixes to DCN32 pipe merge/split algorithm to support Dynamic
ODM and MPO transitions.

In summary, this commit fixes various transitions to support ODM->MPO
and MPO->ODM.

Reviewed-by: Martin Leung 
Reviewed-by: Jun Lei 
Acked-by: Jasdeep Dhillon 
Signed-off-by: Dillon Varone 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 99 ++-
 drivers/gpu/drm/amd/display/dc/dc.h   |  1 +
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 54 ++
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|  8 ++
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |  2 +
 .../drm/amd/display/dc/dcn32/dcn32_resource.c | 27 ++---
 .../amd/display/dc/dcn321/dcn321_resource.c   |  3 +
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  | 24 +
 .../gpu/drm/amd/display/dc/inc/hw_sequencer.h |  1 +
 .../amd/display/dc/inc/hw_sequencer_private.h |  2 +
 10 files changed, 186 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 19eb960d75d8..390adc00cd28 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1750,6 +1750,9 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
context->stream_count == 0)
dc->hwss.prepare_bandwidth(dc, context);
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, false);
+
disable_dangling_plane(dc, context);
/* re-program planes for existing stream, in case we need to
 * free up plane resource for later use
@@ -1840,6 +1843,9 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
dc->hwss.optimize_bandwidth(dc, context);
}
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, true);
+
if (dc->ctx->dce_version >= DCE_VERSION_MAX)
TRACE_DCN_CLOCK_STATE(&context->bw_ctx.bw.dcn.clk);
else
@@ -2003,6 +2009,9 @@ void dc_post_update_surfaces_to_stream(struct dc *dc)
 
dc->hwss.optimize_bandwidth(dc, context);
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, true);
+
dc->optimized_required = false;
dc->wm_optimized_required = false;
 }
@@ -3198,6 +3207,9 @@ static void commit_planes_for_stream(struct dc *dc,
if (get_seamless_boot_stream_count(context) == 0)
dc->hwss.prepare_bandwidth(dc, context);
 
+   if (dc->debug.enable_double_buffered_dsc_pg_support)
+   dc->hwss.update_dsc_pg(dc, context, false);
+
context_clock_trace(dc, context);
}
 
@@ -3521,11 +3533,59 @@ static void commit_planes_for_stream(struct dc *dc,
}
 }
 
+/* Determines if the incoming context requires applying a transition state with unnecessary
+ * pipe splitting and ODM disabled, due to hardware limitations. In a case where
+ * the OPP associated with an MPCC might change due to plane additions, this function
+ * returns true.
+ */
+static bool could_mpcc_tree_change_for_active_pipes(struct dc *dc,
+   struct dc_stream_state *stream,
+   int surface_count,
+   bool *is_plane_addition)
+{
+
+   struct dc_stream_status *cur_stream_status = 
stream_get_status(dc->current_state, stream);
+   bool force_minimal_pipe_splitting = false;
+
+   *is_plane_addition = false;
+
+   if (cur_stream_status &&
+   dc->current_state->stream_count > 0 &&
+   dc->debug.pipe_split_policy != MPC_SPLIT_AVOID) {
+   /* determine if minimal transition is required 

[PATCH 00/31] DC Patches Sept 20, 2022

2022-09-21 Thread Jasdeep Dhillon
 
This DC patchset brings improvements in multiple areas. In summary, we have:
 
- LTTPR mode can be dynamically changed
- Features can use the same interface to update cursor info
- Fixes for LLVM compilation issues
- Fixes for DIO FIFO underflow and other FIFO errors
- Partially valid EDIDs are handled properly
- Phantom pipes are skipped when checking for a pending flip
- Fixed audio on display after unplugging
 
Cc: Daniel Wheeler 

Alvin Lee (4):
  drm/amd/display: Update DCN32 to use new SR latencies
  drm/amd/display: Update MALL SS NumWays calculation
  drm/amd/display: Disable MALL when TMZ surface
  drm/amd/display: Fix CAB allocation calculation

Aric Cyr (3):
  drm/amd/display: Remove interface for periodic interrupt 1
  drm/amd/display: Fix audio on display after unplugging another
  drm/amd/display: 3.2.205

Aurabindo Pillai (2):
  drm/amd/display: skip phantom pipes when checking for pending flip
  drm/amd/display: fix a divide by zero error

Brandon Syu (1):
  Add debug option for exiting idle optimizations on cursor updates

Charlene Liu (1):
  drm/amd/display: add debug keys for override bios settings.

Dillon Varone (1):
  drm/amd/display: Fix various dynamic ODM transitions on DCN32

Eric Bernstein (1):
  drm/amd/display: Remove assert for odm transition case

Ian Chen (1):
  Add ABM control to panel_config struct.

Ilya Bakoulin (1):
  drm/amd/display: Change EDID fallback condition

Max Tseng (1):
  drm/amd/display: Cursor Info Update refactor

Michael Strauss (1):
  drm/amd/display: Refactor LTTPR mode selection

Nathan Chancellor (2):
  drm/amd/display: Reduce number of arguments of dml314's
CalculateWatermarksAndDRAMSpeedChangeSupport()
  drm/amd/display: Reduce number of arguments of dml314's
CalculateFlipSchedule()

Nicholas Kazlauskas (3):
  drm/amd/display: Wrap OTG disable workaround with FIFO control
  drm/amd/display: Add explicit FIFO disable for DP blank
  drm/amd/display: Do DIO FIFO enable after DP video stream enable

Samson Tam (1):
  drm/amd/display: fill in clock values when DPM is not enabled

Taimur Hassan (3):
  drm/amd/display: Avoid avoid unnecessary pixel rate divider
programming
  drm/amd/display: Fix typo in get_pixel_rate_div
  drm/amd/display: Avoid unnecessary pixel rate divider programming

Tom Rix (3):
  drm/amd/display: remove redundant CalculateTWait's
  drm/amd/display: refactor CalculateWriteBackDelay to use vba_vars_st
ptr
  drm/amd/display: remove redundant CalculateRemoteSurfaceFlipDelay's

Wenjing Liu (2):
  drm/amd/display: add missing null check
  drm/amd/display: polling vid stream status in hpo dp blank

 .../drm/amd/display/dc/bios/bios_parser2.c|  21 +-
 .../dc/clk_mgr/dcn314/dcn314_clk_mgr.c|  11 +-
 .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c  |  14 +
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 127 +++--
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |  19 +
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 301 ++-
 .../drm/amd/display/dc/core/dc_link_dpia.c|  34 +-
 .../gpu/drm/amd/display/dc/core/dc_stream.c   |   1 +
 drivers/gpu/drm/amd/display/dc/dc.h   |   8 +-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   7 +-
 drivers/gpu/drm/amd/display/dc/dc_stream.h|   6 +-
 .../display/dc/dce110/dce110_hw_sequencer.c   |   6 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |  35 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.h |   3 +-
 .../drm/amd/display/dc/dcn30/dcn30_resource.c |   3 +-
 .../amd/display/dc/dcn301/dcn301_resource.c   |   1 +
 .../amd/display/dc/dcn302/dcn302_resource.c   |   3 +-
 .../amd/display/dc/dcn303/dcn303_resource.c   |   1 +
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c|   6 +-
 .../drm/amd/display/dc/dcn314/dcn314_dccg.c   |  47 ++
 .../dc/dcn314/dcn314_dio_stream_encoder.c |  25 +-
 .../gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c |  53 ++
 .../drm/amd/display/dc/dcn32/dcn32_hubbub.c   |  10 +-
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 280 ++
 .../drm/amd/display/dc/dcn32/dcn32_hwseq.h|   8 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c |   2 +
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |  27 +-
 .../display/dc/dcn32/dcn32_resource_helpers.c |   2 +-
 .../amd/display/dc/dcn321/dcn321_resource.c   |   3 +
 .../dc/dml/dcn20/display_mode_vba_20.c|  98 +---
 .../dc/dml/dcn20/display_mode_vba_20v2.c  | 176 +-
 .../dc/dml/dcn21/display_mode_vba_21.c| 173 +-
 .../dc/dml/dcn30/display_mode_vba_30.c|  92 +---
 .../dc/dml/dcn31/display_mode_vba_31.c|  89 +--
 .../dc/dml/dcn314/display_mode_vba_314.c  | 509 --
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  |  70 ++-
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.h  |   2 +
 .../dc/dml/dcn32/display_mode_vba_32.c|  42 +-
 .../dc/dml/dcn32/display_mode_vba_util_32.c   |  30 --
 .../dc/dml/dcn32/display_mode_vba_util_32.h   |  10 +-
 .../gpu/drm/amd/display/dc/inc/d

[PATCH] drm/amdkfd: Fix UBSAN shift-out-of-bounds warning

2022-09-21 Thread Felix Kuehling
This was fixed in initialize_cpsch before, but not in initialize_nocpsch.
Factor sdma bitmap initialization into a helper function to apply the
correct implementation in both cases without duplicating it.

Reported-by: Ellis Michael 
Signed-off-by: Felix Kuehling 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 41 ---
 1 file changed, 17 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index e83725a28106..f88ec6a11ad2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1240,6 +1240,20 @@ static void init_interrupts(struct device_queue_manager 
*dqm)
dqm->dev->kfd2kgd->init_interrupts(dqm->dev->adev, i);
 }
 
+static void init_sdma_bitmaps(struct device_queue_manager *dqm)
+{
+   uint64_t num_sdma_queues = get_num_sdma_queues(dqm);
+   uint64_t num_xgmi_sdma_queues = get_num_xgmi_sdma_queues(dqm);
+
+   if (num_sdma_queues)
+   dqm->sdma_bitmap = GENMASK_ULL(num_sdma_queues-1, 0);
+   if (num_xgmi_sdma_queues)
+   dqm->xgmi_sdma_bitmap = GENMASK_ULL(num_xgmi_sdma_queues-1, 0);
+
+   dqm->sdma_bitmap &= ~get_reserved_sdma_queues_bitmap(dqm);
+   pr_info("sdma_bitmap: %llx\n", dqm->sdma_bitmap);
+}
+
 static int initialize_nocpsch(struct device_queue_manager *dqm)
 {
int pipe, queue;
@@ -1268,11 +1282,7 @@ static int initialize_nocpsch(struct 
device_queue_manager *dqm)
 
memset(dqm->vmid_pasid, 0, sizeof(dqm->vmid_pasid));
 
-   dqm->sdma_bitmap = ~0ULL >> (64 - get_num_sdma_queues(dqm));
-   dqm->sdma_bitmap &= ~(get_reserved_sdma_queues_bitmap(dqm));
-   pr_info("sdma_bitmap: %llx\n", dqm->sdma_bitmap);
-
-   dqm->xgmi_sdma_bitmap = ~0ULL >> (64 - get_num_xgmi_sdma_queues(dqm));
+   init_sdma_bitmaps(dqm);
 
return 0;
 }
@@ -1450,9 +1460,6 @@ static int set_sched_resources(struct 
device_queue_manager *dqm)
 
 static int initialize_cpsch(struct device_queue_manager *dqm)
 {
-   uint64_t num_sdma_queues;
-   uint64_t num_xgmi_sdma_queues;
-
pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
mutex_init(&dqm->lock_hidden);
@@ -1461,24 +1468,10 @@ static int initialize_cpsch(struct device_queue_manager 
*dqm)
dqm->active_cp_queue_count = 0;
dqm->gws_queue_count = 0;
dqm->active_runlist = false;
-
-   num_sdma_queues = get_num_sdma_queues(dqm);
-   if (num_sdma_queues >= BITS_PER_TYPE(dqm->sdma_bitmap))
-   dqm->sdma_bitmap = ULLONG_MAX;
-   else
-   dqm->sdma_bitmap = (BIT_ULL(num_sdma_queues) - 1);
-
-   dqm->sdma_bitmap &= ~(get_reserved_sdma_queues_bitmap(dqm));
-   pr_info("sdma_bitmap: %llx\n", dqm->sdma_bitmap);
-
-   num_xgmi_sdma_queues = get_num_xgmi_sdma_queues(dqm);
-   if (num_xgmi_sdma_queues >= BITS_PER_TYPE(dqm->xgmi_sdma_bitmap))
-   dqm->xgmi_sdma_bitmap = ULLONG_MAX;
-   else
-   dqm->xgmi_sdma_bitmap = (BIT_ULL(num_xgmi_sdma_queues) - 1);
-
INIT_WORK(&dqm->hw_exception_work, kfd_process_hw_exception);
 
+   init_sdma_bitmaps(dqm);
+
return 0;
 }
 
-- 
2.32.0
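
For readers wondering why the old nocpsch expression trips UBSAN: with zero
queues, "~0ULL >> (64 - n)" shifts a 64-bit value by 64, which is undefined in
C. The helper below is a minimal user-space sketch of a safe low-bits mask,
not the kernel's GENMASK_ULL() macro; low_bits_mask() is a hypothetical name,
and the clamp at 64 is an assumption borrowed from the removed cpsch code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return a mask with the n lowest bits set, for any n.
 * The shift count is always in [0, 63], so no shift is out of bounds.
 */
uint64_t low_bits_mask(unsigned int n)
{
	if (n == 0)
		return 0;               /* avoids the undefined shift by 64 */
	if (n >= 64)
		return UINT64_MAX;      /* clamp: all bits set */
	return UINT64_MAX >> (64 - n);
}
```

For n between 1 and 64 this should produce the same value as
GENMASK_ULL(n - 1, 0); the n == 0 case corresponds to the "if (num_sdma_queues)"
guard in the patch, which leaves the bitmap unmodified.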



[linux-next:master] BUILD REGRESSION 483fed3b5dc8ce3644c83d24240cf5756fb0993e

2022-09-21 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 483fed3b5dc8ce3644c83d24240cf5756fb0993e  Add linux-next specific files for 20220921

Error/Warning reports:

https://lore.kernel.org/linux-mm/202209042337.fqi69rlv-...@intel.com
https://lore.kernel.org/linux-mm/202209060229.dvuyxjbv-...@intel.com
https://lore.kernel.org/linux-mm/202209150141.wgbakqmx-...@intel.com
https://lore.kernel.org/linux-mm/202209160607.se3qvgty-...@intel.com
https://lore.kernel.org/linux-mm/202209200603.hpvoa8ii-...@intel.com
https://lore.kernel.org/linux-mm/202209200949.vl3xruyd-...@intel.com
https://lore.kernel.org/llvm/202209220009.8nypipst-...@intel.com
https://lore.kernel.org/llvm/202209220019.yr2vuxhg-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

ERROR: modpost: "devm_ioremap_resource" [drivers/dma/fsl-edma.ko] undefined!
ERROR: modpost: "devm_ioremap_resource" [drivers/dma/idma64.ko] undefined!
ERROR: modpost: "devm_ioremap_resource" [drivers/dma/qcom/hdma.ko] undefined!
ERROR: modpost: "devm_memremap" [drivers/misc/open-dice.ko] undefined!
ERROR: modpost: "devm_memunmap" [drivers/misc/open-dice.ko] undefined!
ERROR: modpost: "devm_platform_ioremap_resource" [drivers/char/xillybus/xillybus_of.ko] undefined!
ERROR: modpost: "devm_platform_ioremap_resource" [drivers/clk/xilinx/clk-xlnx-clock-wizard.ko] undefined!
ERROR: modpost: "ioremap" [drivers/tty/ipwireless/ipwireless.ko] undefined!
ERROR: modpost: "iounmap" [drivers/net/ethernet/8390/pcnet_cs.ko] undefined!
ERROR: modpost: "iounmap" [drivers/tty/ipwireless/ipwireless.ko] undefined!
arch/arm64/kernel/alternative.c:199:6: warning: no previous prototype for 'apply_alternatives_vdso' [-Wmissing-prototypes]
arch/arm64/kernel/alternative.c:295:14: warning: no previous prototype for 'alt_cb_patch_nops' [-Wmissing-prototypes]
arch/ia64/kernel/sys_ia64.c:188:17: sparse: sparse: typename in expression
arch/ia64/kernel/sys_ia64.c:188:31: sparse: sparse: Trying to use reserved word 'typeof' as identifier
arch/ia64/kernel/sys_ia64.c:188:31: sparse: sparse: Trying to use reserved word 'void' as identifier
arch/ia64/kernel/sys_ia64.c:189:60: sparse: sparse: invalid initializer
arch/ia64/kernel/sys_ia64.c:190:17: sparse: sparse: Trying to use reserved word 'return' as identifier
arch/parisc/lib/iomap.c:363:5: warning: no previous prototype for 'ioread64_lo_hi' [-Wmissing-prototypes]
arch/parisc/lib/iomap.c:373:5: warning: no previous prototype for 'ioread64_hi_lo' [-Wmissing-prototypes]
arch/parisc/lib/iomap.c:448:6: warning: no previous prototype for 'iowrite64_lo_hi' [-Wmissing-prototypes]
arch/parisc/lib/iomap.c:454:6: warning: no previous prototype for 'iowrite64_hi_lo' [-Wmissing-prototypes]
drivers/scsi/qla2xxx/qla_os.c:2854:23: warning: assignment to 'struct trace_array *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
drivers/scsi/qla2xxx/qla_os.c:2854:25: error: implicit declaration of function 'trace_array_get_by_name'; did you mean 'trace_array_set_clr_event'? [-Werror=implicit-function-declaration]
drivers/scsi/qla2xxx/qla_os.c:2869:9: error: implicit declaration of function 'trace_array_put' [-Werror=implicit-function-declaration]
mm/hugetlb.c:5539:14: warning: variable 'reserve_alloc' set but not used [-Wunused-but-set-variable]

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- drivers-scsi-qla2xxx-qla_os.c:error:implicit-declaration-of-function-trace_array_get_by_name
|   |-- drivers-scsi-qla2xxx-qla_os.c:error:implicit-declaration-of-function-trace_array_put
|   `-- drivers-scsi-qla2xxx-qla_os.c:warning:assignment-to-struct-trace_array-from-int-makes-pointer-from-integer-without-a-cast
|-- alpha-randconfig-s033-20220921
|   `-- kernel-exit.c:sparse:sparse:incorrect-type-in-initializer-(different-address-spaces)-expected-struct-sighand_struct-sighand-got-struct-sighand_struct-noderef-__rcu-sighand
|-- arm64-allyesconfig
|   |-- arch-arm64-kernel-alternative.c:warning:no-previous-prototype-for-alt_cb_patch_nops
|   |-- arch-arm64-kernel-alternative.c:warning:no-previous-prototype-for-apply_alternatives_vdso
|   `-- mm-hugetlb.c:warning:variable-reserve_alloc-set-but-not-used
|-- arm64-randconfig-r013-20220921
|   |-- arch-arm64-kernel-alternative.c:warning:no-previous-prototype-for-alt_cb_patch_nops
|   `-- arch-arm64-kernel-alternative.c:warning:no-previous-prototype-for-apply_alternatives_vdso
|-- i386-allyesconfig
|   `-- mm-hugetlb.c:warning:variable-reserve_alloc-set-but-not-used
|-- i386-defconfig
|   `-- mm-hugetlb.c:warning:variable-reserve_alloc-set-but-not-used
|-- i386-randconfig-c021
|   `-- mm-hugetlb.c:warning:variable-reserve_alloc-set-

[pull] amdgpu drm-fixes-6.0

2022-09-21 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 6.0.  Mainly fixes for new IPs.  The big change here is the DML
clean up from Nathan to fix the Clang stack usage warnings on the DCN 3.1.4
code which was recently enabled.

The following changes since commit a8671493d2074950553da3cf07d1be43185ef6c6:

  drm/amdgpu: make sure to init common IP before gmc (2022-09-14 14:21:49 -0400)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.0-2022-09-21

for you to fetch changes up to f525ed19437d376736bed64ee7bc4afee82f2ba9:

  drm/amd/display: Reduce number of arguments of dml314's CalculateFlipSchedule() (2022-09-21 17:36:57 -0400)


amd-drm-fixes-6.0-2022-09-21:

amdgpu:
- SDMA 6.x fix
- GPUVM TF fix
- DCN 3.2.x fixes
- DCN 3.1.x fixes
- SMU 13.x fixes
- Clang stack size fixes for recently enabled DML code
- Fix drm dirty callback change on non-atomic cases
- USB4 display fix


Alex Deucher (1):
  drm/amdgpu: don't register a dirty callback for non-atomic

Alvin Lee (1):
  drm/amd/display: Only consider pixle rate div policy for DCN32+

Charlene Liu (1):
  drm/amd/display: correct num_dsc based on HW cap

Chris Park (1):
  drm/amd/display: Port DCN30 420 logic to DCN32

Cruise Hung (1):
  drm/amd/display: Fix DP MST timeslot issue when fallback happened

Daniel Miess (1):
  drm/amd/display: Add shift and mask for ICH_RESET_AT_END_OF_LINE

Dmytro Laktyushkin (2):
  drm/amd/display: fix dcn315 memory channel count and width read
  drm/amd/display: increase dcn315 pstate change latency

Evan Quan (2):
  drm/amd/pm: add support for 3794 pptable for SMU13.0.0
  drm/amd/pm: drop the pptable related workarounds for SMU 13.0.0

George Shen (1):
  drm/amd/display: Update dummy P-state search to use DCN32 DML

Hugo Hu (1):
  drm/amd/display: update gamut remap if plane has changed

Leo Li (1):
  drm/amd/display: Fix double cursor on non-video RGB MPO

Meenakshikumar Somasundaram (1):
  drm/amd/display: Display distortion after hotplug 5K tiled display

Michael Strauss (1):
  drm/amd/display: Assume an LTTPR is always present on fixed_vs links

Mukul Joshi (1):
  drm/amdgpu: Update PTE flags with TF enabled

Nathan Chancellor (2):
  drm/amd/display: Reduce number of arguments of dml314's CalculateWatermarksAndDRAMSpeedChangeSupport()
  drm/amd/display: Reduce number of arguments of dml314's CalculateFlipSchedule()

Nicholas Kazlauskas (1):
  drm/amd/display: Disable OTG WA for the plane_state NULL case on DCN314

Yifan Zhang (1):
  drm/amdgpu/mes: zero the sdma_hqd_mask of 2nd SDMA engine for SDMA 6.0.1

zhikzhai (1):
  drm/amd/display: skip audio setup when audio stream is enabled

 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c|  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c|   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |   7 +-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |  12 +-
 .../amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c   |  11 +-
 .../amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c |  14 +-
 .../amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c |  36 +-
 .../amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c |  11 +-
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c   |  16 +-
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c  |  17 +
 .../amd/display/dc/dce110/dce110_hw_sequencer.c|   6 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dsc.h   | 220 ---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c |   1 +
 .../gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.c   |  16 +-
 .../gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.h   |   2 -
 .../gpu/drm/amd/display/dc/dcn314/dcn314_init.c|   1 -
 .../drm/amd/display/dc/dcn314/dcn314_resource.c|  11 +-
 .../gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c   |   7 +-
 .../display/dc/dml/dcn314/display_mode_vba_314.c   | 420 +
 .../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c   |  46 ++-
 .../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h   |   6 +
 .../amd/display/dc/dml/dcn32/display_mode_vba_32.c |   2 +
 .../dc/dml/dcn32/display_mode_vba_util_32.c|  26 ++
 .../dc/dml/dcn32/display_mode_vba_util_32.h|   1 +
 drivers/gpu/drm/amd/display/dc/inc/resource.h  |   4 +
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c |  44 +--
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c   |  53 +--
 28 files changed, 309 insertions(+), 698 deletions(-)


Re: PROBLEM: UBSAN error in kfd_device_queue_manager.c

2022-09-21 Thread Felix Kuehling
Thank you for reporting this problem. This only affects a small number 
of GPUs that don't use the HW scheduler in KFD. We fixed the same issue 
in the HWS code path over a year ago but apparently didn't think to 
apply the same fix for the non-HWS code path. This is that patch for 
reference. I'll fix that now.


commit 50e2fc36e72d4ad672032ebf646cecb48656efe0
Author: Anson Jacob 
Date:   Wed Mar 3 12:33:15 2021 -0500

drm/amdkfd: Fix UBSAN shift-out-of-bounds warning

If get_num_sdma_queues or get_num_xgmi_sdma_queues is 0, we end up
doing a shift operation where the number of bits shifted equals
number of bits in the operand. This behaviour is undefined.

Set num_sdma_queues or num_xgmi_sdma_queues to ULLONG_MAX, if the
count is >= number of bits in the operand.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1472

Reported-by: Lyude Paul 
Signed-off-by: Anson Jacob 
Reviewed-by: Alex Deucher 
Reviewed-by: Felix Kuehling 
Tested-by: Lyude Paul 
Signed-off-by: Alex Deucher 
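
For illustration only, a minimal user-space sketch of the kind of guard that
commit message describes. The helper name low_bits_mask() is made up for this
example and is not a KFD function; the real driver code differs. It builds a
mask of the low n bits and avoids shifting a 64-bit value by 64 or more,
which is the undefined operation UBSAN flags.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: return a mask with the low n bits set.
 * Shifting a 64-bit operand by 64 or more is undefined in C, so that case is
 * handled explicitly by returning an all-ones mask.
 */
static uint64_t low_bits_mask(unsigned int n)
{
	if (n >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << n) - 1;
}
```

Called with a queue count of 0 this yields an empty mask, and with 64 it
yields all ones, without ever evaluating an out-of-range shift.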

Regards,
  Felix


On 2022-09-21 02:56, Ellis Michael wrote:

Reporting an undefined behavior issue in the amdgpu driver in the
linux kernel I ran into recently. It appears during boot, fairly early
in the process.


[drm] UVD initialized successfully.
[drm] VCE initialized successfully.
kfd kfd: amdgpu: Allocated 3969056 bytes on gart


UBSAN: shift-out-of-bounds in
/build/linux-kQ6jNR/linux-5.15.0/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:997:32
shift exponent 64 is too large for 64-bit type 'long long unsigned int'
CPU: 10 PID: 483 Comm: systemd-udevd Not tainted 5.15.0-48-generic #54-Ubuntu
Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS
P2.30 02/24/2022
Call Trace:
  
  show_stack+0x52/0x5c
  dump_stack_lvl+0x4a/0x63
  dump_stack+0x10/0x16
  ubsan_epilogue+0x9/0x49
  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
  initialize_nocpsch.cold+0x15/0x59 [amdgpu]
  device_queue_manager_init+0x20b/0x3b0 [amdgpu]
  kgd2kfd_device_init.cold+0x1af/0x483 [amdgpu]
  amdgpu_amdkfd_device_init+0x135/0x170 [amdgpu]
  amdgpu_device_ip_init+0x681/0x6a4 [amdgpu]
loop33: detected capacity change from 0 to 8
  amdgpu_device_init.cold+0x25b/0x7db [amdgpu]
  ? do_pci_enable_device+0xdb/0x110
  amdgpu_driver_load_kms+0x1e/0x270 [amdgpu]
  amdgpu_pci_probe+0x1ce/0x260 [amdgpu]
  local_pci_probe+0x4b/0x90
  pci_device_probe+0x119/0x1f0
  really_probe+0x222/0x420
  __driver_probe_device+0x119/0x190
  driver_probe_device+0x23/0xc0
  __driver_attach+0xbd/0x1e0
  ? __device_attach_driver+0x120/0x120
  bus_for_each_dev+0x7e/0xd0
  driver_attach+0x1e/0x30
  bus_add_driver+0x148/0x220
  driver_register+0x95/0x100
  __pci_register_driver+0x68/0x70
  amdgpu_init+0x7c/0x1000 [amdgpu]
  ? 0xc1a4
  do_one_initcall+0x48/0x1e0
  ? kmem_cache_alloc_trace+0x19e/0x2e0
  do_init_module+0x52/0x260
  load_module+0xacd/0xbc0
  __do_sys_finit_module+0xbf/0x120
  __x64_sys_finit_module+0x18/0x20
  do_syscall_64+0x5c/0xc0
  ? syscall_exit_to_user_mode+0x27/0x50
  ? __x64_sys_newfstatat+0x1c/0x30
  ? do_syscall_64+0x69/0xc0
  ? __x64_sys_mmap+0x33/0x50
  ? do_syscall_64+0x69/0xc0
  ? do_syscall_64+0x69/0xc0
  entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f06f3fb9a3d
Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48
RSP: 002b:7ffc7ce54ae8 EFLAGS: 0246 ORIG_RAX: 0139
RAX: ffda RBX: 556c9ab3e3d0 RCX: 7f06f3fb9a3d
RDX:  RSI: 7f06f4150441 RDI: 001a
RBP: 0002 R08:  R09: 0002
R10: 001a R11: 0246 R12: 7f06f4150441
R13: 556c9aa05fb0 R14: 556c9ab40460 R15: 556c9ab35150
  

amdgpu: SW scheduler is used
amdgpu: SRAT table not found
amdgpu: Virtual CRAT table created for GPU
amdgpu: Topology: Add dGPU node [0x6938:0x1002]
kfd kfd: amdgpu: added device 1002:6938
amdgpu :06:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 8, active_cu_number 32
[drm] fb mappable at 0xD1813000
[drm] vram apper at 0xD000
[drm] size 19906560
[drm] fb depth is 24
[drm]pitch is 13824
fbcon: amdgpudrmfb (fb0) is primary device


This only started recently, possibly after I replaced my motherboard
and CPU (though, not my GPU).

Quick info on my system:
Ubuntu 22.04.1, kernel version 5.15.0-48
Ryzen 5600
ASRock B550m Pro4
R9 380X (STRIX-R9380X-OC4G-GAMING)


This is potentially related to a bug I recently reported to the Ubuntu
bug tracker where my display wouldn't come back from being blank, and
I would see series of messages of the form:
amdgpu:
 last message was failed ret is 0
amdgpu:
 failed to send message 

[PATCH v5] drm/amd/display: Fix vblank refcount in vrr transition

2022-09-21 Thread Yunxiang Li
manage_dm_interrupts disables/enables vblank using drm_crtc_vblank_off/on,
which causes drm_crtc_vblank_get in vrr_transition to fail. Later, when
drm_crtc_vblank_put is called, the refcount on vblank is left unbalanced.
Therefore move the call to after manage_dm_interrupts.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1247
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1380

Signed-off-by: Yunxiang Li 
---
v2: check the return code for calls that might fail and warn on them
v3/v4: make the sequence closer to the original and remove redundant local 
variables
v5: add bug tracking info

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 55 +--
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index ece2003a74cc..97cc8ceaeea0 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7484,15 +7484,15 @@ static void amdgpu_dm_handle_vrr_transition(struct 
dm_crtc_state *old_state,
 * We also need vupdate irq for the actual core vblank handling
 * at end of vblank.
 */
-   dm_set_vupdate_irq(new_state->base.crtc, true);
-   drm_crtc_vblank_get(new_state->base.crtc);
+   WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, true) != 0);
+   WARN_ON(drm_crtc_vblank_get(new_state->base.crtc) != 0);
DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
 __func__, new_state->base.crtc->base.id);
} else if (old_vrr_active && !new_vrr_active) {
/* Transition VRR active -> inactive:
 * Allow vblank irq disable again for fixed refresh rate.
 */
-   dm_set_vupdate_irq(new_state->base.crtc, false);
+   WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, false) != 0);
drm_crtc_vblank_put(new_state->base.crtc);
DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
 __func__, new_state->base.crtc->base.id);
@@ -8257,23 +8257,6 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
mutex_unlock(&dm->dc_lock);
}
 
-   /* Count number of newly disabled CRTCs for dropping PM refs later. */
-   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
- new_crtc_state, i) {
-   if (old_crtc_state->active && !new_crtc_state->active)
-   crtc_disable_count++;
-
-   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
-   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
-
-   /* For freesync config update on crtc state and params for irq 
*/
-   update_stream_irq_parameters(dm, dm_new_crtc_state);
-
-   /* Handle vrr on->off / off->on transitions */
-   amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
-   dm_new_crtc_state);
-   }
-
/**
 * Enable interrupts for CRTCs that are newly enabled or went through
 * a modeset. It was intentionally deferred until after the front end
@@ -8283,16 +8266,29 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
 #ifdef CONFIG_DEBUG_FS
-   bool configure_crc = false;
enum amdgpu_dm_pipe_crc_source cur_crc_src;
 #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
-   struct crc_rd_work *crc_rd_wrk = dm->crc_rd_wrk;
+   struct crc_rd_work *crc_rd_wrk;
+#endif
+#endif
+   /* Count number of newly disabled CRTCs for dropping PM refs 
later. */
+   if (old_crtc_state->active && !new_crtc_state->active)
+   crtc_disable_count++;
+
+   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
+   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+   /* For freesync config update on crtc state and params for irq 
*/
+   update_stream_irq_parameters(dm, dm_new_crtc_state);
+
+#ifdef CONFIG_DEBUG_FS
+#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
+   crc_rd_wrk = dm->crc_rd_wrk;
 #endif
spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags);
cur_crc_src = acrtc->dm_irq_params.crc_src;
spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);
 #endif
-   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
 
if (new_crtc_state->active &&
(!old_crtc_state->active ||
@@ -8300,16 +8296,19 @@ static void amdgpu_dm_atomic_commi

Re: [PATCH v4] drm/amd/display: Fix vblank refcount in vrr transition

2022-09-21 Thread Alex Deucher
On Tue, Aug 23, 2022 at 8:25 PM Yunxiang Li  wrote:
>
> manage_dm_interrupts disable/enable vblank using drm_crtc_vblank_off/on
> which causes drm_crtc_vblank_get in vrr_transition to fail, and later
> when drm_crtc_vblank_put is called the refcount on vblank will be messed
> up. Therefore move the call to after manage_dm_interrupts.
>

+ Rodrigo

You might want to add:
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1380

This looks logical to me, but someone from the display team should take a look.

Alex


> Signed-off-by: Yunxiang Li 
> ---
> v2: check the return code for calls that might fail and warn on them
> v3/v4: make the sequence closer to the original and remove redundant local 
> variables
>
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 55 +--
>  1 file changed, 26 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index bc2493a2a90e..de80b61b8d8e 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -7488,15 +7488,15 @@ static void amdgpu_dm_handle_vrr_transition(struct 
> dm_crtc_state *old_state,
>  * We also need vupdate irq for the actual core vblank 
> handling
>  * at end of vblank.
>  */
> -   dm_set_vupdate_irq(new_state->base.crtc, true);
> -   drm_crtc_vblank_get(new_state->base.crtc);
> +   WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, true) != 0);
> +   WARN_ON(drm_crtc_vblank_get(new_state->base.crtc) != 0);
> DRM_DEBUG_DRIVER("%s: crtc=%u VRR off->on: Get vblank ref\n",
>  __func__, new_state->base.crtc->base.id);
> } else if (old_vrr_active && !new_vrr_active) {
> /* Transition VRR active -> inactive:
>  * Allow vblank irq disable again for fixed refresh rate.
>  */
> -   dm_set_vupdate_irq(new_state->base.crtc, false);
> +   WARN_ON(dm_set_vupdate_irq(new_state->base.crtc, false) != 0);
> drm_crtc_vblank_put(new_state->base.crtc);
> DRM_DEBUG_DRIVER("%s: crtc=%u VRR on->off: Drop vblank ref\n",
>  __func__, new_state->base.crtc->base.id);
> @@ -8261,23 +8261,6 @@ static void amdgpu_dm_atomic_commit_tail(struct 
> drm_atomic_state *state)
> mutex_unlock(&dm->dc_lock);
> }
>
> -   /* Count number of newly disabled CRTCs for dropping PM refs later. */
> -   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
> - new_crtc_state, i) {
> -   if (old_crtc_state->active && !new_crtc_state->active)
> -   crtc_disable_count++;
> -
> -   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
> -   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
> -
> -   /* For freesync config update on crtc state and params for 
> irq */
> -   update_stream_irq_parameters(dm, dm_new_crtc_state);
> -
> -   /* Handle vrr on->off / off->on transitions */
> -   amdgpu_dm_handle_vrr_transition(dm_old_crtc_state,
> -   dm_new_crtc_state);
> -   }
> -
> /**
>  * Enable interrupts for CRTCs that are newly enabled or went through
>  * a modeset. It was intentionally deferred until after the front end
> @@ -8287,16 +8270,29 @@ static void amdgpu_dm_atomic_commit_tail(struct 
> drm_atomic_state *state)
> for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
> new_crtc_state, i) {
> struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
>  #ifdef CONFIG_DEBUG_FS
> -   bool configure_crc = false;
> enum amdgpu_dm_pipe_crc_source cur_crc_src;
>  #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
> -   struct crc_rd_work *crc_rd_wrk = dm->crc_rd_wrk;
> +   struct crc_rd_work *crc_rd_wrk;
> +#endif
> +#endif
> +   /* Count number of newly disabled CRTCs for dropping PM refs 
> later. */
> +   if (old_crtc_state->active && !new_crtc_state->active)
> +   crtc_disable_count++;
> +
> +   dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
> +   dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
> +
> +   /* For freesync config update on crtc state and params for 
> irq */
> +   update_stream_irq_parameters(dm, dm_new_crtc_state);
> +
> +#ifdef CONFIG_DEBUG_FS
> +#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
> +   crc_rd_wrk = dm->crc_rd_wrk;
>  #endif
> spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags);
> cur_crc_src = acrtc->dm_irq_params.crc_src;
> spin_unloc

Re: [PATCH] drm/amdkfd: fix MQD init for GFX11 in init_mqd

2022-09-21 Thread Alex Deucher
On Wed, Sep 21, 2022 at 2:47 PM Graham Sider  wrote:
>
> Set remaining compute_static_thread_mgmt_se* accordingly.
>
> Signed-off-by: Graham Sider 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> index d982c154537e..26b53b6d673e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> @@ -126,6 +126,10 @@ static void init_mqd(struct mqd_manager *mm, void **mqd,
> m->compute_static_thread_mgmt_se1 = 0x;
> m->compute_static_thread_mgmt_se2 = 0x;
> m->compute_static_thread_mgmt_se3 = 0x;
> +   m->compute_static_thread_mgmt_se4 = 0x;
> +   m->compute_static_thread_mgmt_se5 = 0x;
> +   m->compute_static_thread_mgmt_se6 = 0x;
> +   m->compute_static_thread_mgmt_se7 = 0x;
>
> m->cp_hqd_persistent_state = 
> CP_HQD_PERSISTENT_STATE__PRELOAD_REQ_MASK |
> 0x55 << CP_HQD_PERSISTENT_STATE__PRELOAD_SIZE__SHIFT;
> --
> 2.25.1
>


[PATCH] drm/amdkfd: fix MQD init for GFX11 in init_mqd

2022-09-21 Thread Graham Sider
Set remaining compute_static_thread_mgmt_se* accordingly.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
index d982c154537e..26b53b6d673e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
@@ -126,6 +126,10 @@ static void init_mqd(struct mqd_manager *mm, void **mqd,
m->compute_static_thread_mgmt_se1 = 0x;
m->compute_static_thread_mgmt_se2 = 0x;
m->compute_static_thread_mgmt_se3 = 0x;
+   m->compute_static_thread_mgmt_se4 = 0x;
+   m->compute_static_thread_mgmt_se5 = 0x;
+   m->compute_static_thread_mgmt_se6 = 0x;
+   m->compute_static_thread_mgmt_se7 = 0x;
 
m->cp_hqd_persistent_state = CP_HQD_PERSISTENT_STATE__PRELOAD_REQ_MASK |
0x55 << CP_HQD_PERSISTENT_STATE__PRELOAD_SIZE__SHIFT;
-- 
2.25.1



[PATCH v4] drm/sched: Add FIFO sched policy to run queue v3

2022-09-21 Thread Andrey Grodzovsky
When many entities compete for the same run queue on the same
scheduler, unacceptably long wait times are observed for some jobs
that sit stuck in the run queue before being picked up (seen using
GPUVis).
The issue is due to the Round Robin policy used by schedulers
to pick up the next entity's job queue for execution. Under stress
of many entities and long job queues within an entity, some
jobs could be stuck for a very long time in their entity's
queue before being popped from the queue and executed,
while for other entities with smaller job queues a job
might execute earlier even though that job arrived later
than the job in the long queue.
   
Fix:
Add FIFO selection policy to entities in run queue, choose next entity
on run queue in such order that if a job on one entity arrived
earlier than a job on another entity, the first job will start
executing earlier regardless of the length of the entity's job
queue.
   
v2:
Switch to rb tree structure for entities based on TS of
oldest job waiting in the job queue of an entity. Improves next
entity extraction to O(1). Entity TS update is
O(log N), where N is the number of entities in the run queue.
   
Drop default option in module control parameter.

v3:
Various cosmetic fixes and minor refactoring of fifo update function. (Luben)

v4:
Switch drm_sched_rq_select_entity_fifo to in order search (Luben)
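
As a rough illustration of the selection rule described above (not the code
in this patch), here is a toy user-space sketch. struct toy_entity and
toy_select_fifo() are hypothetical names, and a linear scan stands in for the
rb tree purely to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a scheduler entity; only the fields needed here. */
struct toy_entity {
	int has_job;            /* non-zero if the job queue is not empty */
	uint64_t oldest_job_ts; /* submit timestamp of the head job */
};

/*
 * Pick the entity whose head job was submitted first.
 * This is an O(N) scan; the patch keeps entities ordered in an rb tree
 * instead. Returns NULL when no entity has a pending job.
 */
static struct toy_entity *toy_select_fifo(struct toy_entity *e, size_t n)
{
	struct toy_entity *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!e[i].has_job)
			continue;
		if (!best || e[i].oldest_job_ts < best->oldest_job_ts)
			best = &e[i];
	}
	return best;
}
```

With this rule an entity with a long queue is not starved by entities with
short queues: only the timestamp of each head job matters.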
   
Signed-off-by: Andrey Grodzovsky 
Tested-by: Li Yunxiang (Teddy) 
---
 drivers/gpu/drm/scheduler/sched_entity.c |  26 +-
 drivers/gpu/drm/scheduler/sched_main.c   | 107 ++-
 include/drm/gpu_scheduler.h  |  32 +++
 3 files changed, 159 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 6b25b2f4f5a3..f3ffce3c9304 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -73,6 +73,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
entity->priority = priority;
entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
entity->last_scheduled = NULL;
+   RB_CLEAR_NODE(&entity->rb_tree_node);
 
if(num_sched_list)
entity->rq = &sched_list[0]->sched_rq[entity->priority];
@@ -417,14 +418,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
 
sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
if (!sched_job)
-   return NULL;
+   goto skip;
 
while ((entity->dependency =
drm_sched_job_dependency(sched_job, entity))) {
trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
 
-   if (drm_sched_entity_add_dependency_cb(entity))
-   return NULL;
+   if (drm_sched_entity_add_dependency_cb(entity)) {
+   sched_job = NULL;
+   goto skip;
+   }
}
 
/* skip jobs from entity that marked guilty */
@@ -443,6 +446,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
smp_wmb();
 
spsc_queue_pop(&entity->job_queue);
+
+   /*
+* It's when head job is extracted we can access the next job (or empty)
+* queue and update the entity location in the min heap accordingly.
+*/
+skip:
+   if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
+   drm_sched_rq_update_fifo(entity,
+(sched_job ? sched_job->submit_ts : 
ktime_get()));
+
return sched_job;
 }
 
@@ -502,11 +515,13 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
 {
struct drm_sched_entity *entity = sched_job->entity;
bool first;
+   ktime_t ts =  ktime_get();
 
trace_drm_sched_job(sched_job, entity);
atomic_inc(entity->rq->sched->score);
WRITE_ONCE(entity->last_user, current->group_leader);
first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
+   sched_job->submit_ts = ts;
 
/* first job wakes up scheduler */
if (first) {
@@ -518,8 +533,13 @@ void drm_sched_entity_push_job(struct drm_sched_job 
*sched_job)
DRM_ERROR("Trying to push to a killed entity\n");
return;
}
+
drm_sched_rq_add_entity(entity->rq, entity);
spin_unlock(&entity->rq_lock);
+
+   if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
+   drm_sched_rq_update_fifo(entity, ts);
+
drm_sched_wakeup(entity->rq->sched);
}
 }
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 4f2395d1a791..565707a1c5c7 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -62,6 +62,64 @@
 #define to_drm_sched_job(sched_job) 

[PATCH AUTOSEL 4.19 2/3] drm/amd/display: Limit user regamma to a valid value

2022-09-21 Thread Sasha Levin
From: Yao Wang1 

[ Upstream commit 3601d620f22e37740cf73f8278eabf9f2aa19eb7 ]

[Why]
For HDR mode, we get a total of 512 tf_point entries, and after switching
to SDR mode we actually get 400 tf_point entries; the rest of the points
(401~512) still use dirty values from HDR mode. We should limit the rest
of the points to the max value.

[How]
Limit the value when coordinates_x.x > 1, just like what we do in
translate_from_linear_space for other re-gamma build paths.
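
A toy sketch of the clamping idea, for readers without the driver source at
hand. It does not use the dc_fixpt API; TOY_ONE, toy_clamp_to_one() and
toy_lut_index() are hypothetical names, and the 32 fractional bits are an
assumption chosen for the example.

```c
#include <stdint.h>

#define TOY_FRAC_BITS 32
#define TOY_ONE (UINT64_C(1) << TOY_FRAC_BITS) /* 1.0 in toy fixed point */

/* Limit a toy fixed-point x coordinate to at most 1.0. */
static uint64_t toy_clamp_to_one(uint64_t x)
{
	return (x >= TOY_ONE) ? TOY_ONE : x;
}

/*
 * Scale the clamped coordinate onto an integer index in [0, max_index].
 * Without the clamp, x > 1.0 would produce an index past the table end.
 */
static unsigned int toy_lut_index(uint64_t x, unsigned int max_index)
{
	uint64_t clamped = toy_clamp_to_one(x);

	return (unsigned int)((clamped * max_index) >> TOY_FRAC_BITS);
}
```

Any coordinate above 1.0 then maps to the last valid index instead of an
out-of-range one.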

Tested-by: Daniel Wheeler 
Reviewed-by: Krunoslav Kovac 
Reviewed-by: Aric Cyr 
Acked-by: Pavle Kotarac 
Signed-off-by: Yao Wang1 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c 
b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 11ea1a0e629b..4e866317ec25 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -1206,6 +1206,7 @@ static void interpolate_user_regamma(uint32_t 
hw_points_num,
struct fixed31_32 lut2;
struct fixed31_32 delta_lut;
struct fixed31_32 delta_index;
+   const struct fixed31_32 one = dc_fixpt_from_int(1);
 
i = 0;
/* fixed_pt library has problems handling too small values */
@@ -1234,6 +1235,9 @@ static void interpolate_user_regamma(uint32_t 
hw_points_num,
} else
hw_x = coordinates_x[i].x;
 
+   if (dc_fixpt_le(one, hw_x))
+   hw_x = one;
+
norm_x = dc_fixpt_mul(norm_factor, hw_x);
index = dc_fixpt_floor(norm_x);
if (index < 0 || index > 255)
-- 
2.35.1



[PATCH AUTOSEL 5.10 6/7] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage

2022-09-21 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 41012d715d5d7b9751ae84b8fb255e404ac9c5d0 ]

This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.

To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.
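
A small user-space sketch of the general technique, assuming GCC/Clang
attribute syntax. toy_noinline, toy_checksum_scratch() and toy_caller() are
hypothetical names; in the kernel the equivalent annotation is
noinline_for_stack, as used in the diff below.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: GCC/Clang attribute syntax for preventing inlining. */
#define toy_noinline __attribute__((noinline))

/*
 * Helper with a large local buffer. Keeping it out of line means the
 * buffer lives in this function's own frame rather than being merged
 * into the caller's frame by the inliner.
 */
static toy_noinline unsigned int toy_checksum_scratch(const unsigned char *src,
						      size_t len)
{
	unsigned char scratch[1024];
	unsigned int sum = 0;
	size_t i;

	if (len > sizeof(scratch))
		len = sizeof(scratch);
	memcpy(scratch, src, len);
	for (i = 0; i < len; i++)
		sum += scratch[i];
	return sum;
}

/* The caller's frame stays small because the helper is not inlined. */
static unsigned int toy_caller(const unsigned char *src, size_t len)
{
	return toy_checksum_scratch(src, len) + 1;
}
```

The trade-off is the one the commit message names: total stack depth across
the two frames can grow, but each individual frame stays under the
per-function limit.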

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index 2663f1b31842..e427f4ffa080 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -6653,8 +6653,7 @@ static double CalculateUrgentLatency(
return ret;
 }
 
-
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
int MaxInterDCNTileRepeaters,
int MaxPrefetchMode,
-- 
2.35.1



[PATCH AUTOSEL 5.4 3/5] drm/amdgpu: use dirty framebuffer helper

2022-09-21 Thread Sasha Levin
From: Hamza Mahfooz 

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index b588e0e409e7..d8687868407d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -495,6 +496,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector,
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
.destroy = drm_gem_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
+   .dirty = drm_atomic_helper_dirtyfb,
 };
 
 uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
-- 
2.35.1



[PATCH AUTOSEL 5.4 4/5] drm/amd/display: Limit user regamma to a valid value

2022-09-21 Thread Sasha Levin
From: Yao Wang1 

[ Upstream commit 3601d620f22e37740cf73f8278eabf9f2aa19eb7 ]

[Why]
In HDR mode we get 512 tf_point entries in total. After switching to SDR
mode we only get 400, and the remaining points (401~512) still hold
stale values from HDR mode. We should limit those remaining points to
the max value.

[How]
Limit the value when coordinates_x.x > 1, just like what we do in
translate_from_linear_space for other re-gamma build paths.
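
The clamp described above can be sketched in plain userspace C. This is a
minimal model, assuming a signed 64-bit value with 32 fractional bits; the
demo_* names are hypothetical stand-ins for struct fixed31_32 and the
dc_fixpt_* helpers, not the driver's implementation.

```c
#include <stdint.h>

#define DEMO_FRAC_BITS 32

/* Hypothetical userspace model of struct fixed31_32. */
struct demo_fixed31_32 {
	int64_t value;
};

/* Integer to fixed point (multiplication avoids shifting negative values). */
static struct demo_fixed31_32 demo_fixpt_from_int(int v)
{
	struct demo_fixed31_32 r = { (int64_t)v * ((int64_t)1 << DEMO_FRAC_BITS) };
	return r;
}

/* Build num/den as a fixed-point value; den must be non-zero. */
static struct demo_fixed31_32 demo_fixpt_from_fraction(int64_t num, int64_t den)
{
	struct demo_fixed31_32 r = { (num * ((int64_t)1 << DEMO_FRAC_BITS)) / den };
	return r;
}

/* Returns non-zero when a <= b. */
static int demo_fixpt_le(struct demo_fixed31_32 a, struct demo_fixed31_32 b)
{
	return a.value <= b.value;
}

/* Limit an x coordinate to at most 1.0, mirroring the check added for hw_x. */
static struct demo_fixed31_32 demo_clamp_to_one(struct demo_fixed31_32 hw_x)
{
	const struct demo_fixed31_32 one = demo_fixpt_from_int(1);

	if (demo_fixpt_le(one, hw_x))
		hw_x = one;
	return hw_x;
}
```

An input of 3/2 comes back as exactly 1.0, while 1/2 passes through
unchanged, so stale coordinates above the valid range can no longer push the
normalized index past the end of the table.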

Tested-by: Daniel Wheeler 
Reviewed-by: Krunoslav Kovac 
Reviewed-by: Aric Cyr 
Acked-by: Pavle Kotarac 
Signed-off-by: Yao Wang1 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index e042d8ce05b4..22d105635e33 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -1486,6 +1486,7 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
struct fixed31_32 lut2;
struct fixed31_32 delta_lut;
struct fixed31_32 delta_index;
+   const struct fixed31_32 one = dc_fixpt_from_int(1);
 
i = 0;
/* fixed_pt library has problems handling too small values */
@@ -1514,6 +1515,9 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
} else
hw_x = coordinates_x[i].x;
 
+   if (dc_fixpt_le(one, hw_x))
+   hw_x = one;
+
norm_x = dc_fixpt_mul(norm_factor, hw_x);
index = dc_fixpt_floor(norm_x);
if (index < 0 || index > 255)
-- 
2.35.1



[PATCH AUTOSEL 5.10 5/7] drm/amd/display: Limit user regamma to a valid value

2022-09-21 Thread Sasha Levin
From: Yao Wang1 

[ Upstream commit 3601d620f22e37740cf73f8278eabf9f2aa19eb7 ]

[Why]
In HDR mode we get 512 tf_point entries in total. After switching to SDR
mode we only get 400, and the remaining points (401~512) still hold
stale values from HDR mode. We should limit those remaining points to
the max value.

[How]
Limit the value when coordinates_x.x > 1, just like what we do in
translate_from_linear_space for other re-gamma build paths.

Tested-by: Daniel Wheeler 
Reviewed-by: Krunoslav Kovac 
Reviewed-by: Aric Cyr 
Acked-by: Pavle Kotarac 
Signed-off-by: Yao Wang1 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 09bc2c249e1a..3c4390d71a82 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -1524,6 +1524,7 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
struct fixed31_32 lut2;
struct fixed31_32 delta_lut;
struct fixed31_32 delta_index;
+   const struct fixed31_32 one = dc_fixpt_from_int(1);
 
i = 0;
/* fixed_pt library has problems handling too small values */
@@ -1552,6 +1553,9 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
} else
hw_x = coordinates_x[i].x;
 
+   if (dc_fixpt_le(one, hw_x))
+   hw_x = one;
+
norm_x = dc_fixpt_mul(norm_factor, hw_x);
index = dc_fixpt_floor(norm_x);
if (index < 0 || index > 255)
-- 
2.35.1



[PATCH AUTOSEL 5.15 08/10] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()

2022-09-21 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 21485d3da659b66c37d99071623af83ee1c6733d ]

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.
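
The refactoring pattern behind this change can be sketched as follows. This
is only an illustration: struct demo_vba_vars is a hypothetical, heavily
reduced stand-in for 'struct vba_vars_st', and the arithmetic is placeholder
math rather than the real flip-schedule calculation.

```c
#define DEMO_MAX_PLANES 8

/* Hypothetical, heavily reduced stand-in for struct vba_vars_st. */
struct demo_vba_vars {
	double HTotal[DEMO_MAX_PLANES];
	double PixelClock[DEMO_MAX_PLANES];
	double VRatio[DEMO_MAX_PLANES];
	double final_flip_bw[DEMO_MAX_PLANES];
};

/* Before: every input and output is passed as its own argument. */
static void demo_flip_schedule_wide(double LineTime, double VRatio,
				    double *final_flip_bw)
{
	*final_flip_bw = VRatio / LineTime;
}

/* After: the callee reads and writes shared state through v and plane index k. */
static void demo_flip_schedule_narrow(struct demo_vba_vars *v, unsigned int k)
{
	double LineTime = v->HTotal[k] / v->PixelClock[k];

	v->final_flip_bw[k] = v->VRatio[k] / LineTime;
}
```

Both variants produce the same result; the narrow one simply stops the caller
from materializing each value as a separate outgoing argument, which is where
the stack savings quoted above come from.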

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 172 +-
 1 file changed, 47 insertions(+), 125 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index a6ce22d23b26..aa0507e01792 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -259,33 +259,13 @@ static void CalculateRowBandwidth(
 
 static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+   unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
-   unsigned int GPUVMMaxPageTableLevels,
-   bool HostVMEnable,
-   unsigned int HostVMMaxNonCachedPageTableLevels,
-   bool GPUVMEnable,
-   double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
-   double DPTEBytesPerRow,
-   double BandwidthAvailableForImmediateFlip,
-   unsigned int TotImmediateFlipBytes,
-   enum source_format_class SourcePixelFormat,
-   double LineTime,
-   double VRatio,
-   double VRatioChroma,
-   double Tno_bw,
-   bool DCCEnable,
-   unsigned int dpte_row_height,
-   unsigned int meta_row_height,
-   unsigned int dpte_row_height_chroma,
-   unsigned int meta_row_height_chroma,
-   double *DestinationLinesToRequestVMInImmediateFlip,
-   double *DestinationLinesToRequestRowInImmediateFlip,
-   double *final_flip_bw,
-   bool *ImmediateFlipSupportedForPipe);
+   double DPTEBytesPerRow);
 static double CalculateWriteBackDelay(
enum source_format_class WritebackPixelFormat,
double WritebackHRatio,
@@ -2923,33 +2903,13 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
CalculateFlipSchedule(
mode_lib,
+   k,
HostVMInefficiencyFactor,
v->UrgentExtraLatency,
v->UrgentLatency,
-   v->GPUVMMaxPageTableLevels,
-   v->HostVMEnable,
-   v->HostVMMaxNonCachedPageTableLevels,
-   v->GPUVMEnable,
-   v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
-   v->PixelPTEBytesPerRow[k],
-   v->BandwidthAvailableForImmediateFlip,
-   v->TotImmediateFlipBytes,
-   v->SourcePixelFormat[k],
-   v->HTotal[k] / v->PixelClock[k],
-   v->VRatio[k],
-   v->VRatioChroma[k],
-   v->Tno_bw[k],
-   v->DCCEnable[k],
-   v->dpte_row_height[k],
-   v->meta_row_height[k],
-   v->dpte_row_hei

[PATCH AUTOSEL 5.10 4/7] drm/amdgpu: use dirty framebuffer helper

2022-09-21 Thread Sasha Levin
From: Hamza Mahfooz 

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 7cc7af2a6822..947f50e402ba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -498,6 +499,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector,
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
.destroy = drm_gem_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
+   .dirty = drm_atomic_helper_dirtyfb,
 };
 
 uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
-- 
2.35.1



[PATCH AUTOSEL 5.15 06/10] drm/amd/display: Limit user regamma to a valid value

2022-09-21 Thread Sasha Levin
From: Yao Wang1 

[ Upstream commit 3601d620f22e37740cf73f8278eabf9f2aa19eb7 ]

[Why]
In HDR mode we get 512 tf_point entries in total. After switching to SDR
mode we only get 400, and the remaining points (401~512) still hold
stale values from HDR mode. We should limit those remaining points to
the max value.

[How]
Limit the value when coordinates_x.x > 1, just like what we do in
translate_from_linear_space for other re-gamma build paths.

Tested-by: Daniel Wheeler 
Reviewed-by: Krunoslav Kovac 
Reviewed-by: Aric Cyr 
Acked-by: Pavle Kotarac 
Signed-off-by: Yao Wang1 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index ef742d95ef05..c707c9bfed43 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -1597,6 +1597,7 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
struct fixed31_32 lut2;
struct fixed31_32 delta_lut;
struct fixed31_32 delta_index;
+   const struct fixed31_32 one = dc_fixpt_from_int(1);
 
i = 0;
/* fixed_pt library has problems handling too small values */
@@ -1625,6 +1626,9 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
} else
hw_x = coordinates_x[i].x;
 
+   if (dc_fixpt_le(one, hw_x))
+   hw_x = one;
+
norm_x = dc_fixpt_mul(norm_factor, hw_x);
index = dc_fixpt_floor(norm_x);
if (index < 0 || index > 255)
-- 
2.35.1



[PATCH AUTOSEL 5.15 09/10] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage

2022-09-21 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 41012d715d5d7b9751ae84b8fb255e404ac9c5d0 ]

This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.

To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.
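
A small userspace sketch of the idea, under stated assumptions: the macro
below approximates the kernel's noinline_for_stack with the plain GCC/Clang
noinline attribute, and demo_heavy_helper()/demo_caller() are hypothetical
stand-ins for UseMinimumDCFCLK() and its caller.

```c
#include <stddef.h>

/*
 * Assumption: outside the kernel we approximate noinline_for_stack with
 * the plain GCC/Clang noinline attribute.
 */
#ifndef noinline_for_stack
#define noinline_for_stack __attribute__((noinline))
#endif

/* Hypothetical helper with a large local buffer. */
static noinline_for_stack int demo_heavy_helper(int seed)
{
	volatile unsigned char scratch[1024];
	int sum = 0;
	size_t i;

	/* Fill the buffer, then sum it, so the locals really live on the stack. */
	for (i = 0; i < sizeof(scratch); i++)
		scratch[i] = (unsigned char)(seed + (int)i);
	for (i = 0; i < sizeof(scratch); i++)
		sum += scratch[i];
	return sum;
}

/* Hypothetical caller: scratch[] stays out of this function's frame. */
static int demo_caller(int seed)
{
	return demo_heavy_helper(seed) + 1;
}
```

Because the helper cannot be inlined, its buffer is accounted to its own
frame instead of the caller's. The combined usage of the two frames may grow
slightly, but each one is checked against the limit separately, which is the
trade-off the commit message describes (1632 and 1304 bytes, both under
2048).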

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index e3d9f1decdfc..518672a2450f 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -6658,8 +6658,7 @@ static double CalculateUrgentLatency(
return ret;
 }
 
-
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
int MaxInterDCNTileRepeaters,
int MaxPrefetchMode,
-- 
2.35.1



[PATCH AUTOSEL 5.15 07/10] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-09-21 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 37934d4118e22bceb80141804391975078f31734 ]

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 248 --
 1 file changed, 52 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index d58925cff420..a6ce22d23b26 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -319,64 +319,28 @@ static void CalculateVupdateAndDynamicMetadataParameters(
 static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
-   unsigned int NumberOfActivePlanes,
-   unsigned int MaxLineBufferLines,
-   unsigned int LineBufferSize,
-   unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
-   bool SynchronizedVBlank,
-   unsigned int dpte_group_bytes[],
-   unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
-   double WritebackLatency,
-   double WritebackChunkSize,
double SOCCLK,
-   double DRAMClockChangeLatency,
-   double SRExitTime,
-   double SREnterPlusExitTime,
-   double SRExitZ8Time,
-   double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
-   unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
-   double HRatio[],
-   double HRatioChroma[],
-   unsigned int vtaps[],
-   unsigned int VTAPsChroma[],
-   double VRatio[],
-   double VRatioChroma[],
-   unsigned int HTotal[],
-   double PixelClock[],
-   unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
-   double DSTXAfterScaler[],
-   double DSTYAfterScaler[],
-   bool WritebackEnable[],
-   enum source_format_class WritebackPixelFormat[],
-   double WritebackDestinationWidth[],
-   double WritebackDestinationHeight[],
-   double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
int unsigned CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
-   double *UrgentWatermark,
-   double *WritebackUrgentWatermark,
-   double *DRAMClockChangeWatermark,
-   double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
-   double *Z8StutterEnterPlusExitWatermark,
-   double *MinActiveDRAMClockChangeLatencySupported);
+   double *Z8StutterEnterPlusExitWatermark);
 
 static void CalculateDCFCLKDeepSleep(
struct display_mode_lib *mode_lib,
@@ -3072,64 +3036,28 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
PrefetchMode,
-   v->NumberOfActivePlanes,
-   v->MaxLineBufferLines,
-   v->LineBufferSize,
-   v->WritebackInterfaceBufferSize,
v->DCFCLK,
 

[PATCH AUTOSEL 5.19 12/16] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-09-21 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 37934d4118e22bceb80141804391975078f31734 ]

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 248 --
 1 file changed, 52 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index e4b9fd31223c..586825d85d66 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -321,64 +321,28 @@ static void CalculateVupdateAndDynamicMetadataParameters(
 static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
-   unsigned int NumberOfActivePlanes,
-   unsigned int MaxLineBufferLines,
-   unsigned int LineBufferSize,
-   unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
-   bool SynchronizedVBlank,
-   unsigned int dpte_group_bytes[],
-   unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
-   double WritebackLatency,
-   double WritebackChunkSize,
double SOCCLK,
-   double DRAMClockChangeLatency,
-   double SRExitTime,
-   double SREnterPlusExitTime,
-   double SRExitZ8Time,
-   double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
-   unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
-   double HRatio[],
-   double HRatioChroma[],
-   unsigned int vtaps[],
-   unsigned int VTAPsChroma[],
-   double VRatio[],
-   double VRatioChroma[],
-   unsigned int HTotal[],
-   double PixelClock[],
-   unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
-   double DSTXAfterScaler[],
-   double DSTYAfterScaler[],
-   bool WritebackEnable[],
-   enum source_format_class WritebackPixelFormat[],
-   double WritebackDestinationWidth[],
-   double WritebackDestinationHeight[],
-   double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
int unsigned CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
-   double *UrgentWatermark,
-   double *WritebackUrgentWatermark,
-   double *DRAMClockChangeWatermark,
-   double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
-   double *Z8StutterEnterPlusExitWatermark,
-   double *MinActiveDRAMClockChangeLatencySupported);
+   double *Z8StutterEnterPlusExitWatermark);
 
 static void CalculateDCFCLKDeepSleep(
struct display_mode_lib *mode_lib,
@@ -3027,64 +2991,28 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
PrefetchMode,
-   v->NumberOfActivePlanes,
-   v->MaxLineBufferLines,
-   v->LineBufferSize,
-   v->WritebackInterfaceBufferSize,
v->DCFCLK,
 

[PATCH AUTOSEL 5.15 05/10] drm/amdgpu: use dirty framebuffer helper

2022-09-21 Thread Sasha Levin
From: Hamza Mahfooz 

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 5c08047adb59..47fb722ab374 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -490,6 +491,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector,
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
.destroy = drm_gem_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
+   .dirty = drm_atomic_helper_dirtyfb,
 };
 
 uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
-- 
2.35.1



[PATCH AUTOSEL 5.15 04/10] drm/amd/pm: disable BACO entry/exit completely on several sienna cichlid cards

2022-09-21 Thread Sasha Levin
From: Guchun Chen 

[ Upstream commit 7c6fb61a400bf3218c6504cb2d48858f98822c9d ]

Disable BACO entry/exit on these SKUs to avoid intermittent hardware
failures.
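
One way to picture the per-SKU override is a small deny-list lookup. The
sketch below is a simplification, not the driver code: the demo_* names are
hypothetical, and only the two device ID/revision pairs are taken from the
diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (PCI device ID, revision) pair. */
struct demo_sku {
	uint16_t device;
	uint8_t revision;
};

/* The two SKUs named in the diff. */
static const struct demo_sku demo_baco_denylist[] = {
	{ 0x73A1, 0x00 },
	{ 0x73BF, 0xCF },
};

/* Returns the final BACO support flag given the strap-derived value. */
static bool demo_baco_platform_support(bool strap_px_capable,
				       uint16_t device, uint8_t revision)
{
	size_t i;

	for (i = 0; i < sizeof(demo_baco_denylist) / sizeof(demo_baco_denylist[0]); i++) {
		if (demo_baco_denylist[i].device == device &&
		    demo_baco_denylist[i].revision == revision)
			return false; /* listed SKU: never advertise BACO */
	}
	return strap_px_capable;
}
```

A table keeps the override readable if more SKUs ever need the same
treatment; the patch itself uses a direct if condition for the two entries.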

Signed-off-by: Guchun Chen 
Reviewed-by: Lijo Lazar 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 79976921dc46..c71d50e82168 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -358,6 +358,17 @@ static void sienna_cichlid_check_bxco_support(struct smu_context *smu)
smu_baco->platform_support =
(val & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK) ? true :
false;
+
+   /*
+* Disable BACO entry/exit completely on below SKUs to
+* avoid hardware intermittent failures.
+*/
+   if (((adev->pdev->device == 0x73A1) &&
+   (adev->pdev->revision == 0x00)) ||
+   ((adev->pdev->device == 0x73BF) &&
+   (adev->pdev->revision == 0xCF)))
+   smu_baco->platform_support = false;
+
}
 }
 
-- 
2.35.1



[PATCH AUTOSEL 5.19 13/16] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()

2022-09-21 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 21485d3da659b66c37d99071623af83ee1c6733d ]

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 172 +-
 1 file changed, 47 insertions(+), 125 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index 586825d85d66..40a672236198 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -261,33 +261,13 @@ static void CalculateRowBandwidth(
 
 static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+   unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
-   unsigned int GPUVMMaxPageTableLevels,
-   bool HostVMEnable,
-   unsigned int HostVMMaxNonCachedPageTableLevels,
-   bool GPUVMEnable,
-   double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
-   double DPTEBytesPerRow,
-   double BandwidthAvailableForImmediateFlip,
-   unsigned int TotImmediateFlipBytes,
-   enum source_format_class SourcePixelFormat,
-   double LineTime,
-   double VRatio,
-   double VRatioChroma,
-   double Tno_bw,
-   bool DCCEnable,
-   unsigned int dpte_row_height,
-   unsigned int meta_row_height,
-   unsigned int dpte_row_height_chroma,
-   unsigned int meta_row_height_chroma,
-   double *DestinationLinesToRequestVMInImmediateFlip,
-   double *DestinationLinesToRequestRowInImmediateFlip,
-   double *final_flip_bw,
-   bool *ImmediateFlipSupportedForPipe);
+   double DPTEBytesPerRow);
 static double CalculateWriteBackDelay(
enum source_format_class WritebackPixelFormat,
double WritebackHRatio,
@@ -2878,33 +2858,13 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
CalculateFlipSchedule(
mode_lib,
+   k,
HostVMInefficiencyFactor,
v->UrgentExtraLatency,
v->UrgentLatency,
-   v->GPUVMMaxPageTableLevels,
-   v->HostVMEnable,
-   v->HostVMMaxNonCachedPageTableLevels,
-   v->GPUVMEnable,
-   v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
-   v->PixelPTEBytesPerRow[k],
-   v->BandwidthAvailableForImmediateFlip,
-   v->TotImmediateFlipBytes,
-   v->SourcePixelFormat[k],
-   v->HTotal[k] / v->PixelClock[k],
-   v->VRatio[k],
-   v->VRatioChroma[k],
-   v->Tno_bw[k],
-   v->DCCEnable[k],
-   v->dpte_row_height[k],
-   v->meta_row_height[k],
-   v->dpte_row_hei

[PATCH AUTOSEL 5.19 14/16] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage

2022-09-21 Thread Sasha Levin
From: Nathan Chancellor 

[ Upstream commit 41012d715d5d7b9751ae84b8fb255e404ac9c5d0 ]

This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:

  
  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.

To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index f47d82da115c..42a567e71439 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -6651,8 +6651,7 @@ static double CalculateUrgentLatency(
return ret;
 }
 
-
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
int MaxInterDCNTileRepeaters,
int MaxPrefetchMode,
-- 
2.35.1



[PATCH AUTOSEL 5.19 11/16] drm/amd/display: Limit user regamma to a valid value

2022-09-21 Thread Sasha Levin
From: Yao Wang1 

[ Upstream commit 3601d620f22e37740cf73f8278eabf9f2aa19eb7 ]

[Why]
In HDR mode we get 512 tf_point entries in total. After switching to SDR
mode we only get 400, and the remaining points (401~512) still hold
stale values from HDR mode. We should limit those remaining points to
the max value.

[How]
Limit the value when coordinates_x.x > 1, just like what we do in
translate_from_linear_space for other re-gamma build paths.

Tested-by: Daniel Wheeler 
Reviewed-by: Krunoslav Kovac 
Reviewed-by: Aric Cyr 
Acked-by: Pavle Kotarac 
Signed-off-by: Yao Wang1 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 64a38f08f497..5a51be753e87 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -1603,6 +1603,7 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
struct fixed31_32 lut2;
struct fixed31_32 delta_lut;
struct fixed31_32 delta_index;
+   const struct fixed31_32 one = dc_fixpt_from_int(1);
 
i = 0;
/* fixed_pt library has problems handling too small values */
@@ -1631,6 +1632,9 @@ static void interpolate_user_regamma(uint32_t hw_points_num,
} else
hw_x = coordinates_x[i].x;
 
+   if (dc_fixpt_le(one, hw_x))
+   hw_x = one;
+
norm_x = dc_fixpt_mul(norm_factor, hw_x);
index = dc_fixpt_floor(norm_x);
if (index < 0 || index > 255)
-- 
2.35.1



[PATCH AUTOSEL 5.19 10/16] drm/amdgpu: Skip reset error status for psp v13_0_0

2022-09-21 Thread Sasha Levin
From: Candice Li 

[ Upstream commit 86875d558b91cb46f43be112799c06ecce60ec1e ]

There is no need to reset the error status, since only UMC RAS is
supported on PSP v13_0_0.
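
The version check this patch extends can be sketched in isolation. The
DEMO_IP_VERSION() packing below is an assumption modelled on amdgpu's
IP_VERSION() macro (major, minor, revision in separate bit fields), and
demo_should_reset_error_status() is a hypothetical helper, not a driver
function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing: major in bits 16+, minor in bits 8-15, revision in bits 0-7. */
#define DEMO_IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* True when the error status should be reset after logging the counters. */
static bool demo_should_reset_error_status(uint32_t mp0_version)
{
	return mp0_version != DEMO_IP_VERSION(11, 0, 2) &&
	       mp0_version != DEMO_IP_VERSION(11, 0, 4) &&
	       mp0_version != DEMO_IP_VERSION(13, 0, 0);
}
```

With the third comparison in place, a device reporting 13.0.0 skips the
reset, just like the two 11.0.x versions that were already excluded.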

Signed-off-by: Candice Li 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index dac202ae864d..9193ca5d6fe7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1805,7 +1805,8 @@ static void amdgpu_ras_log_on_err_counter(struct amdgpu_device *adev)
amdgpu_ras_query_error_status(adev, &info);
 
if (adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) &&
-   adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) {
+   adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4) &&
+   adev->ip_versions[MP0_HWIP][0] != IP_VERSION(13, 0, 0)) {
if (amdgpu_ras_reset_error_status(adev, 
info.head.block))
dev_warn(adev->dev, "Failed to reset error 
counter and error status");
}
-- 
2.35.1
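
Note on the version check above: a small sketch of the skip condition as a standalone predicate. The bit packing in IP_VERSION below (major, minor, revision in successive bytes) is an assumption for illustration, and needs_error_status_reset is a hypothetical helper name, not a driver function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing: major in bits 16 and up, minor in bits 8..15, revision in bits 0..7. */
#define IP_VERSION(mj, mn, rv) \
    (((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/*
 * Hypothetical helper mirroring the patched condition: the error status
 * is reset only when the MP0 (PSP) version is none of the listed ones.
 */
static bool needs_error_status_reset(uint32_t mp0_version)
{
    return mp0_version != IP_VERSION(11, 0, 2) &&
           mp0_version != IP_VERSION(11, 0, 4) &&
           mp0_version != IP_VERSION(13, 0, 0);
}
```

Adding a further exempt version is then one more comparison in the chain, which is what the hunk does for 13.0.0.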



[PATCH AUTOSEL 5.19 09/16] drm/amdgpu: add HDP remap functionality to nbio 7.7

2022-09-21 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 8c5708d3da37b8c7c3c22c7e945b9a76a7c9539b ]

This was missing before and would have resulted in a write to
a non-existent register. Normally APUs don't use HDP, but
other ASICs could use this code, and APUs do use the HDP
when used in passthrough.

Reviewed-by: Lijo Lazar 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
index cdc0c9779848..6c1fd471a4c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
@@ -28,6 +28,14 @@
 #include "nbio/nbio_7_7_0_sh_mask.h"
 #include 
 
+static void nbio_v7_7_remap_hdp_registers(struct amdgpu_device *adev)
+{
+   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
+   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
+}
+
 static u32 nbio_v7_7_get_rev_id(struct amdgpu_device *adev)
 {
u32 tmp;
@@ -237,4 +245,5 @@ const struct amdgpu_nbio_funcs nbio_v7_7_funcs = {
.ih_doorbell_range = nbio_v7_7_ih_doorbell_range,
.ih_control = nbio_v7_7_ih_control,
.init_registers = nbio_v7_7_init_registers,
+   .remap_hdp_registers = nbio_v7_7_remap_hdp_registers,
 };
-- 
2.35.1



[PATCH AUTOSEL 5.19 08/16] drm/amdgpu: change the alignment size of TMR BO to 1M

2022-09-21 Thread Sasha Levin
From: Yang Wang 

[ Upstream commit 36de13fdb04abef3ee03ade5129ab146de63983b ]

Aligning the TMR BO to the TMR size is not necessary.
Change the alignment to 1M to avoid failures when re-creating
the BO under serious VRAM fragmentation.

v2:
add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size

Signed-off-by: Yang Wang 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 2b00f8fe15a8..7b8d4484c3c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -748,7 +748,7 @@ static int psp_tmr_init(struct psp_context *psp)
}
 
pptr = amdgpu_sriov_vf(psp->adev) ? &tmr_buf : NULL;
-   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, 
PSP_TMR_SIZE(psp->adev),
+   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, PSP_TMR_ALIGNMENT,
  AMDGPU_GEM_DOMAIN_VRAM,
  &psp->tmr_bo, &psp->tmr_mc_addr, pptr);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index e431f4994931..cd366c7f311f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -36,6 +36,7 @@
 #define PSP_CMD_BUFFER_SIZE0x1000
 #define PSP_1_MEG  0x10
 #define PSP_TMR_SIZE(adev) ((adev)->asic_type == CHIP_ALDEBARAN ? 0x80 
: 0x40)
+#define PSP_TMR_ALIGNMENT  0x10
 #define PSP_FW_NAME_LEN0x24
 
 enum psp_shared_mem_size {
-- 
2.35.1
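
Note on the alignment change above: a minimal sketch of why a smaller alignment helps under fragmentation. It uses a toy first-fit search over free ranges, not the real VRAM manager; all names (free_range, align_up, first_fit, demo_fit) and the sizes in the table are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

struct free_range {
    uint64_t start;
    uint64_t size;
};

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* First-fit search: returns the aligned base address, or UINT64_MAX if nothing fits. */
static uint64_t first_fit(const struct free_range *ranges, size_t n,
                          uint64_t size, uint64_t align)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t end = ranges[i].start + ranges[i].size;
        uint64_t base = align_up(ranges[i].start, align);

        if (base <= end && end - base >= size)
            return base;
    }
    return UINT64_MAX;
}

/* Illustrative fragmented layout: try to place a 4 MiB buffer. */
static uint64_t demo_fit(uint64_t align)
{
    static const struct free_range holes[] = {
        { 0x00500000, 0x00400000 }, /* 4 MiB hole starting at 5 MiB */
        { 0x01300000, 0x00300000 }, /* 3 MiB hole starting at 19 MiB */
    };

    return first_fit(holes, sizeof(holes) / sizeof(holes[0]),
                     0x00400000, align);
}
```

In this toy layout the 4 MiB hole is large enough, but only a 1 MiB alignment lets the buffer start at the hole's first byte; with size-sized alignment the aligned base leaves too little room.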



[PATCH AUTOSEL 5.19 07/16] drm/amdgpu: use dirty framebuffer helper

2022-09-21 Thread Sasha Levin
From: Hamza Mahfooz 

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 4dfd6724b3ca..3451147beda3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -493,6 +494,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector 
*amdgpu_connector,
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
.destroy = drm_gem_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
+   .dirty = drm_atomic_helper_dirtyfb,
 };
 
 uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
-- 
2.35.1



[PATCH AUTOSEL 5.19 06/16] drm/amd/pm: disable BACO entry/exit completely on several sienna cichlid cards

2022-09-21 Thread Sasha Levin
From: Guchun Chen 

[ Upstream commit 7c6fb61a400bf3218c6504cb2d48858f98822c9d ]

This avoids intermittent hardware failures.

Signed-off-by: Guchun Chen 
Reviewed-by: Lijo Lazar 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 32bb6b1d9526..d13e455c8827 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -368,6 +368,17 @@ static void sienna_cichlid_check_bxco_support(struct 
smu_context *smu)
smu_baco->platform_support =
(val & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK) ? true :
false;
+
+   /*
+* Disable BACO entry/exit completely on below SKUs to
+* avoid hardware intermittent failures.
+*/
+   if (((adev->pdev->device == 0x73A1) &&
+   (adev->pdev->revision == 0x00)) ||
+   ((adev->pdev->device == 0x73BF) &&
+   (adev->pdev->revision == 0xCF)))
+   smu_baco->platform_support = false;
+
}
 }
 
-- 
2.35.1
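
Note on the SKU check above: the same decision can be sketched as a table lookup, which may be easier to extend if more SKUs need the workaround. This is an illustrative alternative, not what the patch does; baco_quirk, baco_denylist and baco_platform_support are hypothetical names, and only the two device/revision pairs come from the hunk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct baco_quirk {
    uint16_t device;
    uint8_t revision;
};

/* Device/revision pairs taken from the hunk above. */
static const struct baco_quirk baco_denylist[] = {
    { 0x73A1, 0x00 },
    { 0x73BF, 0xCF },
};

/*
 * Hypothetical helper: start from the strap-derived capability and
 * force it off for any SKU on the deny list.
 */
static bool baco_platform_support(bool strap_capable, uint16_t device,
                                  uint8_t revision)
{
    for (size_t i = 0; i < sizeof(baco_denylist) / sizeof(baco_denylist[0]); i++) {
        if (baco_denylist[i].device == device &&
            baco_denylist[i].revision == revision)
            return false;
    }
    return strap_capable;
}
```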



RE: [PATCH 2/2] drm/amdgpu: Use simplified API for p2p dist calc

2022-09-21 Thread Chen, Guchun
Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Lazar, Lijo  
Sent: Wednesday, September 21, 2022 8:30 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian ; 
Chen, Guchun ; Bai, Zoy (zoybai) 
Subject: [PATCH 2/2] drm/amdgpu: Use simplified API for p2p dist calc

Use the simplified API that calculates distance between two devices.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f600f3a3fe50..ec1023e7b0cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5576,9 +5576,9 @@ bool amdgpu_device_is_peer_accessible(struct 
amdgpu_device *adev,
~*peer_adev->dev->dma_mask : ~((1ULL << 32) - 1);
resource_size_t aper_limit =
adev->gmc.aper_base + adev->gmc.aper_size - 1;
-   bool p2p_access = !adev->gmc.xgmi.connected_to_cpu &&
- !(pci_p2pdma_distance_many(adev->pdev,
-   &peer_adev->dev, 1, false) < 0);
+   bool p2p_access =
+   !adev->gmc.xgmi.connected_to_cpu &&
+   !(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
 
return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&
adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 9e2a4c552a4a..7bd8e33b14be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -58,7 +58,7 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
 
-   if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, false) < 0)
+   if (pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0)
attach->peer2peer = false;
 
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
--
2.25.1
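
Note on the call-site change above: a mock sketch of how a single-client helper can wrap a multi-client call, which is why the callers no longer pass an address and a count of 1. Everything here (mock_dev, mock_distance_many, mock_distance, demo_distance) is a hypothetical user-space model; whether the kernel's pci_p2pdma_distance() is implemented exactly this way is an assumption.

```c
#include <stdbool.h>

/* Mock device: hops < 0 means the client cannot reach the provider. */
struct mock_dev {
    int hops;
};

/* Mock multi-client call: summed distance, or -1 if any client is unreachable. */
static int mock_distance_many(const struct mock_dev *provider,
                              struct mock_dev **clients, int num_clients,
                              bool verbose)
{
    int total = 0;

    (void)provider;
    (void)verbose; /* the mock never logs */

    for (int i = 0; i < num_clients; i++) {
        if (clients[i]->hops < 0)
            return -1;
        total += clients[i]->hops;
    }
    return total;
}

/* Single-client convenience wrapper: builds the one-element list itself. */
static int mock_distance(const struct mock_dev *provider,
                         struct mock_dev *client, bool verbose)
{
    return mock_distance_many(provider, &client, 1, verbose);
}

/* Small driver for the mock, with verbose disabled as in the patches. */
static int demo_distance(int hops)
{
    struct mock_dev provider = { 0 };
    struct mock_dev client = { hops };

    return mock_distance(&provider, &client, false);
}
```

The "< 0" test at the call sites keeps working unchanged, since the wrapper passes the result straight through.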



RE: [PATCH 1/2] drm/amdgpu: Disable verbose for p2p dist calc

2022-09-21 Thread Chen, Guchun
This patch is: Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Wednesday, September 21, 2022 8:30 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Bai, Zoy (zoybai) 
; Koenig, Christian ; Chen, Guchun 
; Zhang, Hawking 
Subject: [PATCH 1/2] drm/amdgpu: Disable verbose for p2p dist calc

Disable verbose output when getting the p2p distance. With verbose enabled, a 
warning is shown if ACS redirect is set between the devices, which adds noise 
to the dmesg logs when several GPU devices are on the same platform.

Example log:

amdgpu :34:00.0: ACS redirect is set between the client and provider 
(:31:00.0)
amdgpu :34:00.0: to disable ACS redirect for this path, add the kernel 
parameter:

pci=disable_acs_redir=:30:00.0;:2e:00.0;:33:00.0;:2e:10.0

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c04ea7f1e819..f600f3a3fe50 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5578,7 +5578,7 @@ bool amdgpu_device_is_peer_accessible(struct 
amdgpu_device *adev,
adev->gmc.aper_base + adev->gmc.aper_size - 1;
bool p2p_access = !adev->gmc.xgmi.connected_to_cpu &&
  !(pci_p2pdma_distance_many(adev->pdev,
-   &peer_adev->dev, 1, true) < 0);
+   &peer_adev->dev, 1, false) < 0);
 
return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&
adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 782cbca37538..9e2a4c552a4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -58,7 +58,7 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
 
-   if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, true) < 0)
+   if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, false) < 0)
attach->peer2peer = false;
 
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
--
2.25.1



Re: [PATCH 5/5] drm/amdgpu: Correct the position in patch_cond_exec for gfx9

2022-09-21 Thread Christian König

On 21.09.22 at 11:41, jiadong@amd.com wrote:

From: "Jiadong.Zhu" 

The current position calculated in gfx_v9_0_ring_emit_patch_cond_exec
underflows when the wptr is divisible by ring->buf_mask + 1.


Good catch, looks like a completely independent bug fix to me. So please 
push separately.



Signed-off-by: Jiadong.Zhu 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index c568a4f5b81e..65f8c8d4f4ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -5754,7 +5754,7 @@ static void gfx_v9_0_ring_emit_patch_cond_exec(struct 
amdgpu_ring *ring, unsigne
BUG_ON(offset > ring->buf_mask);
BUG_ON(ring->ring[offset] != 0x55aa55aa);
  
-	cur = (ring->wptr & ring->buf_mask) - 1;

+   cur = (ring->wptr - 1) & ring->buf_mask;
if (likely(cur > offset))
ring->ring[offset] = cur - offset;
else
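
Note on the fix above: a minimal sketch of the two expressions side by side, assuming cur is an unsigned 32-bit value, wptr is 64 bits wide and buf_mask is the ring size in dwords minus one. cur_old and cur_new are hypothetical helper names.

```c
#include <stdint.h>

/* Old form: mask first, then subtract. Wraps to 0xFFFFFFFF when the masked wptr is 0. */
static uint32_t cur_old(uint64_t wptr, uint32_t buf_mask)
{
    return (uint32_t)(wptr & buf_mask) - 1;
}

/* New form: subtract first, then mask. The result always stays inside the ring. */
static uint32_t cur_new(uint64_t wptr, uint32_t buf_mask)
{
    return (uint32_t)((wptr - 1) & buf_mask);
}
```

For a 256-dword ring (buf_mask 0xFF) and wptr 256, the old form gives 0xFFFFFFFF while the new form gives 255, the last slot of the ring. For any wptr that is not a multiple of the ring size the two forms agree.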




Re: [PATCH 4/5] drm/amdgpu: Implement OS triggered MCBP (v5)

2022-09-21 Thread Christian König

On 21.09.22 at 11:41, jiadong@amd.com wrote:

From: "Jiadong.Zhu" 

Trigger Mid-Command Buffer Preemption according to the priority of the software
rings and the hw fence signalling condition.

The muxer saves the locations of the indirect buffer frames from the software
ring together with the fence sequence number in its fifo queue, and pops out
those records when the fences are signalled. The locations are used to resubmit
packages in preemption scenarios by copying the chunks from the software ring.
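
The bookkeeping described above can be sketched roughly as follows. This is a simplified user-space model, not the code from the patch: ib_chunk, chunk_fifo, fifo_push, fifo_retire and demo_pending_after are hypothetical names, the FIFO has a fixed capacity, and fence sequence wraparound is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define MUX_FIFO_CAP 8

/* One recorded IB frame: its location in the software ring plus its fence. */
struct ib_chunk {
    uint64_t start;
    uint64_t end;
    uint32_t fence_seq;
};

struct chunk_fifo {
    struct ib_chunk slots[MUX_FIFO_CAP];
    unsigned int head;
    unsigned int count;
};

/* Record a frame at the tail; returns false when the FIFO is full. */
static bool fifo_push(struct chunk_fifo *f, uint64_t start, uint64_t end,
                      uint32_t fence_seq)
{
    unsigned int tail;

    if (f->count == MUX_FIFO_CAP)
        return false;

    tail = (f->head + f->count) % MUX_FIFO_CAP;
    f->slots[tail].start = start;
    f->slots[tail].end = end;
    f->slots[tail].fence_seq = fence_seq;
    f->count++;
    return true;
}

/* Pop every record whose fence has signalled; returns how many were dropped. */
static unsigned int fifo_retire(struct chunk_fifo *f, uint32_t signalled_seq)
{
    unsigned int dropped = 0;

    while (f->count && f->slots[f->head].fence_seq <= signalled_seq) {
        f->head = (f->head + 1) % MUX_FIFO_CAP;
        f->count--;
        dropped++;
    }
    return dropped;
}

/* Push three frames with fences 1..3, retire, and report what is still pending. */
static unsigned int demo_pending_after(uint32_t signalled_seq)
{
    struct chunk_fifo f = { .head = 0, .count = 0 };

    fifo_push(&f, 0, 16, 1);
    fifo_push(&f, 16, 48, 2);
    fifo_push(&f, 48, 64, 3);
    fifo_retire(&f, signalled_seq);
    return f.count;
}
```

Whatever remains in the FIFO after a preemption is the set of frames that would have to be copied again from the software ring.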


Maybe change the subject a bit. The MCBP is not really triggered by the 
core Linux kernel.


Maybe write instead "MCBP based on DRM scheduler".



v2: Update comment style.
v3: Fix conflict caused by previous modifications.
v4: Remove unnecessary prints.
v5: Fix corner cases for resubmission cases.

Cc: Christian Koenig 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Acked-by: Luben Tuikov 
Signed-off-by: Jiadong.Zhu 
---
  drivers/gpu/drm/amd/amdgpu/Makefile  |   2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c   |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c |  91 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.h |  29 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |  12 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   3 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c | 186 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h |  24 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.c  |  27 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   |   2 +
  10 files changed, 372 insertions(+), 6 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 85224bc81ce5..24c5aa19bbf2 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -59,7 +59,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
amdgpu_fw_attestation.o amdgpu_securedisplay.o \
amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
-   amdgpu_sw_ring.o amdgpu_ring_mux.o
+   amdgpu_sw_ring.o amdgpu_ring_mux.o amdgpu_mcbp.o


This functionality is spread over too many files. Probably better to move 
this into the amdgpu_ring_mux.c as well.


  
  amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c

index 258cffe3c06a..af86d87e2f3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -211,6 +211,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
}
}
  
+	amdgpu_ring_ib_begin(ring);

if (job && ring->funcs->init_cond_exec)
patch_offset = amdgpu_ring_init_cond_exec(ring);
  
@@ -285,6 +286,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,

ring->hw_prio == AMDGPU_GFX_PIPE_PRIO_HIGH)
ring->funcs->emit_wave_limit(ring, false);
  
+	amdgpu_ring_ib_end(ring);

amdgpu_ring_commit(ring);
return 0;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c
new file mode 100644
index ..121b1a4e0f04
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c
@@ -0,0 +1,91 @@
+/*
+ * Copyright 2022 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "amdgpu.h"
+#include "amdgpu_mcbp.h"
+#include "amdgpu_ring.h"
+
+/* Trigger Mid-Command Buffer Preemption (MCBP) and find if we need to 
resubmit. */
+int amdgpu_mcbp_trigger_preempt(struct amdgpu_ring_mux *mux)
+{
+   struct amdgpu_mux_entry *e;
+   struct amdgpu_ring *ring = NULL;
+   int i;

Re: [PATCH v5 15/21] dma-buf: Move dma_buf_vmap() to dynamic locking specification

2022-09-21 Thread Dmitry Osipenko
On 9/20/22 17:13, Sumit Semwal wrote:
> Hi Dmitry,
> 
> Thanks very much for the series.
> 
> On Wed, 14 Sept 2022 at 00:59, Dmitry Osipenko
>  wrote:
>>
>> Move dma_buf_vmap/vunmap_unlocked() functions to the dynamic locking
>> specification by asserting that the reservation lock is held.
> Thanks for the patch; just a minor nit - I think you mean dma_buf_vmap
> / vunmap() here, and not _unlocked?

Yes, the _unlocked should be dropped here. Thank you for the review!

-- 
Best regards,
Dmitry



Re: [PATCH 3/5] drm/amdgpu: Modify unmap_queue format for gfx9 (v3)

2022-09-21 Thread Christian König

On 21.09.22 at 11:41, jiadong@amd.com wrote:

From: "Jiadong.Zhu" 

1. Modify the unmap_queue package on gfx9. Add trailing fence to track the
preemption done.
2. Modify emit_ce_meta emit_de_meta functions for the resumed ibs.

v2: Restyle code not to use ternary operator.
v3: Modify code format.

Signed-off-by: Jiadong.Zhu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   1 +
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 182 +++
  drivers/gpu/drm/amd/amdgpu/soc15d.h  |   2 +
  3 files changed, 156 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 275b885363c3..aeb48cc3666c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -60,6 +60,7 @@ enum amdgpu_ring_priority_level {
  #define AMDGPU_FENCE_FLAG_64BIT (1 << 0)
  #define AMDGPU_FENCE_FLAG_INT   (1 << 1)
  #define AMDGPU_FENCE_FLAG_TC_WB_ONLY(1 << 2)
+#define AMDGPU_FENCE_FLAG_EXEC  (1 << 3)
  
  #define to_amdgpu_ring(s) container_of((s), struct amdgpu_ring, sched)
  
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c

index 4a8be9595459..c568a4f5b81e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -753,7 +753,7 @@ static void gfx_v9_0_set_rlc_funcs(struct amdgpu_device 
*adev);
  static int gfx_v9_0_get_cu_info(struct amdgpu_device *adev,
struct amdgpu_cu_info *cu_info);
  static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device *adev);
-static void gfx_v9_0_ring_emit_de_meta(struct amdgpu_ring *ring);
+static void gfx_v9_0_ring_emit_de_meta(struct amdgpu_ring *ring, bool resume);
  static u64 gfx_v9_0_ring_get_rptr_compute(struct amdgpu_ring *ring);
  static void gfx_v9_0_query_ras_error_count(struct amdgpu_device *adev,
  void *ras_error_status);
@@ -826,9 +826,10 @@ static void gfx_v9_0_kiq_unmap_queues(struct amdgpu_ring 
*kiq_ring,

PACKET3_UNMAP_QUEUES_DOORBELL_OFFSET0(ring->doorbell_index));
  
  	if (action == PREEMPT_QUEUES_NO_UNMAP) {

-   amdgpu_ring_write(kiq_ring, lower_32_bits(gpu_addr));
-   amdgpu_ring_write(kiq_ring, upper_32_bits(gpu_addr));
-   amdgpu_ring_write(kiq_ring, seq);
+   amdgpu_ring_write(kiq_ring, lower_32_bits(ring->wptr & 
ring->buf_mask));
+   amdgpu_ring_write(kiq_ring, 0);
+   amdgpu_ring_write(kiq_ring, 0);
+
} else {
amdgpu_ring_write(kiq_ring, 0);
amdgpu_ring_write(kiq_ring, 0);
@@ -5357,11 +5358,17 @@ static void gfx_v9_0_ring_emit_ib_gfx(struct 
amdgpu_ring *ring,
  
  	control |= ib->length_dw | (vmid << 24);
  
-	if (amdgpu_sriov_vf(ring->adev) && (ib->flags & AMDGPU_IB_FLAG_PREEMPT)) {

+   if ((amdgpu_sriov_vf(ring->adev) || amdgpu_mcbp) && (ib->flags & 
AMDGPU_IB_FLAG_PREEMPT)) {


Why does this now depend on the amdgpu_mcbp parameter?

The goal was to completely remove that parameter and always enable the 
feature.


Regards,
Christian.


control |= INDIRECT_BUFFER_PRE_ENB(1);
  
+		if (flags & AMDGPU_IB_PREEMPTED)

+   control |= INDIRECT_BUFFER_PRE_RESUME(1);
+
if (!(ib->flags & AMDGPU_IB_FLAG_CE) && vmid)
-   gfx_v9_0_ring_emit_de_meta(ring);
+   gfx_v9_0_ring_emit_de_meta(ring,
+  (!amdgpu_sriov_vf(ring->adev) 
&&
+  flags & AMDGPU_IB_PREEMPTED) 
?
+  true : false);
}
  
  	amdgpu_ring_write(ring, header);

@@ -5416,17 +5423,23 @@ static void gfx_v9_0_ring_emit_fence(struct amdgpu_ring 
*ring, u64 addr,
bool write64bit = flags & AMDGPU_FENCE_FLAG_64BIT;
bool int_sel = flags & AMDGPU_FENCE_FLAG_INT;
bool writeback = flags & AMDGPU_FENCE_FLAG_TC_WB_ONLY;
+   bool exec = flags & AMDGPU_FENCE_FLAG_EXEC;
+   uint32_t dw2 = 0;
  
  	/* RELEASE_MEM - flush caches, send int */

amdgpu_ring_write(ring, PACKET3(PACKET3_RELEASE_MEM, 6));
-   amdgpu_ring_write(ring, ((writeback ? (EOP_TC_WB_ACTION_EN |
-  EOP_TC_NC_ACTION_EN) :
- (EOP_TCL1_ACTION_EN |
-  EOP_TC_ACTION_EN |
-  EOP_TC_WB_ACTION_EN |
-  EOP_TC_MD_ACTION_EN)) |
-EVENT_TYPE(CACHE_FLUSH_AND_INV_TS_EVENT) |
-EVENT_INDEX(5)));
+
+   if (writeback) {
+   dw2 = EOP_TC_WB_ACTION_EN | EOP_TC_NC_ACTION_EN;
+   } else {
+   dw2 = EOP_TCL1_ACTION_EN | EOP_TC_ACTION

Re: [PATCH 2/5] drm/amdgpu: Add software ring callbacks for gfx9 (v5)

2022-09-21 Thread Christian König

On 21.09.22 at 11:41, jiadong@amd.com wrote:

From: "Jiadong.Zhu" 

Set ring functions with software ring callbacks on gfx9.

The software ring can be tested with the debugfs_test_ib case.

v2: Set sw_ring 2 to enable software ring by default.
v3: Remove the parameter for software ring enablement.
v4: Use amdgpu_ring_init/fini for software rings.
v5: Update for code format. Fix conflict.

Acked-by: Luben Tuikov 
Cc: Christian Koenig 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Signed-off-by: Jiadong.Zhu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h  |   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |   7 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   3 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 117 +--
  5 files changed, 120 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 96d058c4cd4b..525df0b4d55f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -207,6 +207,7 @@ extern bool amdgpu_ignore_bad_page_threshold;
  extern struct amdgpu_watchdog_timer amdgpu_watchdog_timer;
  extern int amdgpu_async_gfx_ring;
  extern int amdgpu_mcbp;
+extern int amdgpu_sw_ring;
  extern int amdgpu_discovery;
  extern int amdgpu_mes;
  extern int amdgpu_mes_kiq;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 9996dadb39f7..93b25d9a87f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -348,6 +348,8 @@ struct amdgpu_gfx {
  
  	boolis_poweron;
  
+	/* software ring */

+   unsignednum_sw_gfx_rings;


Please completely drop that, just always enable the SW ring for GFX9.


struct amdgpu_ring_mux  muxer;
  };
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c

index 13db99d653bd..4eaf3bd332f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -33,6 +33,7 @@
  
  #include 

  #include "amdgpu.h"
+#include "amdgpu_sw_ring.h"
  #include "atom.h"
  
  /*

@@ -121,6 +122,11 @@ void amdgpu_ring_commit(struct amdgpu_ring *ring)
  {
uint32_t count;
  
+	if (ring->is_sw_ring) {

+   amdgpu_sw_ring_commit(ring);
+   return;
+   }
+


That is a pretty clear NAK since the sw ring should be transparent to 
the upper layers.


Why exactly is that necessary?




/* We pad to match fetch size */
count = ring->funcs->align_mask + 1 -
(ring->wptr & ring->funcs->align_mask);
@@ -343,7 +349,6 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
   */
  void amdgpu_ring_fini(struct amdgpu_ring *ring)
  {
-
/* Not to finish a ring which is not initialized */
if (!(ring->adev) ||
(!ring->is_mes_queue && !(ring->adev->rings[ring->idx])))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 40b1277b4f0c..275b885363c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -38,7 +38,8 @@ struct amdgpu_vm;
  /* max number of rings */
  #define AMDGPU_MAX_RINGS  28
  #define AMDGPU_MAX_HWIP_RINGS 8
-#define AMDGPU_MAX_GFX_RINGS   2
+/*2 software ring and 1 real ring*/
+#define AMDGPU_MAX_GFX_RINGS   3


Please don't change that. Instead add a sw ring separate to the gfx_ring 
into amdgpu_gfx.



  #define AMDGPU_MAX_COMPUTE_RINGS  8
  #define AMDGPU_MAX_VCE_RINGS  3
  #define AMDGPU_MAX_UVD_ENC_RINGS  2
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 5349ca4d19e3..4a8be9595459 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -47,6 +47,7 @@
  
  #include "amdgpu_ras.h"
  
+#include "amdgpu_sw_ring.h"

  #include "gfx_v9_4.h"
  #include "gfx_v9_0.h"
  #include "gfx_v9_4_2.h"
@@ -55,7 +56,8 @@
  #include "asic_reg/pwr/pwr_10_0_sh_mask.h"
  #include "asic_reg/gc/gc_9_0_default.h"
  
-#define GFX9_NUM_GFX_RINGS 1

+#define GFX9_NUM_GFX_RINGS 3
+#define GFX9_NUM_SW_GFX_RINGS  2
  #define GFX9_MEC_HPD_SIZE 4096
  #define RLCG_UCODE_LOADING_START_ADDRESS 0x2000L
  #define RLC_SAVE_RESTORE_ADDR_STARTING_OFFSET 0xL
@@ -2270,6 +2272,7 @@ static int gfx_v9_0_compute_ring_init(struct 
amdgpu_device *adev, int ring_id,
  static int gfx_v9_0_sw_init(void *handle)
  {
int i, j, k, r, ring_id;
+   unsigned int hw_prio;
struct amdgpu_ring *ring;
struct amdgpu_kiq *kiq;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -2356,13 +2359,41 @@ static int gfx_v9_0_sw_init(void *handle)
sprintf(ring->name, "gfx_%d", i);
ring->use_doorbell = true;
   

Re: [PATCH 1/2] drm/amdgpu: Disable verbose for p2p dist calc

2022-09-21 Thread Christian König

On 21.09.22 at 14:30, Lijo Lazar wrote:

Disable verbose output when getting the p2p distance. With verbose enabled,
a warning is shown if ACS redirect is set between the devices, which adds
noise to the dmesg logs when several GPU devices are on the same platform.

Example log:

amdgpu :34:00.0: ACS redirect is set between the client and provider 
(:31:00.0)
amdgpu :34:00.0: to disable ACS redirect for this path, add the kernel 
parameter:

pci=disable_acs_redir=:30:00.0;:2e:00.0;:33:00.0;:2e:10.0

Signed-off-by: Lijo Lazar 


Reviewed-by: Christian König  for both patches.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c04ea7f1e819..f600f3a3fe50 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5578,7 +5578,7 @@ bool amdgpu_device_is_peer_accessible(struct 
amdgpu_device *adev,
adev->gmc.aper_base + adev->gmc.aper_size - 1;
bool p2p_access = !adev->gmc.xgmi.connected_to_cpu &&
  !(pci_p2pdma_distance_many(adev->pdev,
-   &peer_adev->dev, 1, true) < 0);
+   &peer_adev->dev, 1, false) < 0);
  
  	return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&

adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 782cbca37538..9e2a4c552a4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -58,7 +58,7 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
  
-	if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, true) < 0)

+   if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, false) < 0)
attach->peer2peer = false;
  
  	r = pm_runtime_get_sync(adev_to_drm(adev)->dev);




Re: [PATCH 1/5] drm/amdgpu: Introduce gfx software ring (v6)

2022-09-21 Thread Christian König




On 21.09.22 at 11:41, jiadong@amd.com wrote:

From: "Jiadong.Zhu" 

The software ring is created to support priority context while there is only
one hardware queue for gfx.

Every software ring has its fence driver and could be used as an ordinary ring
for the GPU scheduler.
Multiple software rings are bound to a real ring with the ring muxer. The
packages committed on the software ring are copied to the real ring.
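
The copy step described above can be sketched as a dword copy between two ring buffers that may each wrap. This is a simplified model under stated assumptions: both rings are power-of-two sized with mask = size - 1, read and write pointers grow monotonically, and ring_copy and demo_ring_copy are hypothetical names rather than functions from the patch.

```c
#include <stdint.h>

/*
 * Copy the dwords in [start, end) of the source ring to the destination
 * ring, beginning at dst_wptr. Returns the advanced destination pointer.
 */
static uint64_t ring_copy(uint32_t *dst, uint32_t dst_mask, uint64_t dst_wptr,
                          const uint32_t *src, uint32_t src_mask,
                          uint64_t start, uint64_t end)
{
    for (uint64_t p = start; p != end; p++)
        dst[dst_wptr++ & dst_mask] = src[p & src_mask];

    return dst_wptr;
}

/* Copy four dwords that wrap in both an 8-dword source and a 16-dword destination. */
static uint32_t demo_ring_copy(unsigned int dst_index)
{
    uint32_t src[8];
    uint32_t dst[16] = { 0 };

    for (uint32_t i = 0; i < 8; i++)
        src[i] = 100 + i;

    ring_copy(dst, 15, 14, src, 7, 6, 10);
    return dst[dst_index & 15];
}
```

Masking on every access keeps the loop short; a real implementation would more likely split the range into at most two contiguous memcpy calls.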

v2: Use array to store software ring entry.
v3: Remove unnecessary prints.
v4: Remove amdgpu_ring_sw_init/fini functions,
using gtt for sw ring buffer for later dma copy
optimization.
v5: Allocate ring entry dynamically in the muxer.
v6: Update comments for the ring muxer.

Cc: Christian Koenig 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky  
Signed-off-by: Jiadong.Zhu 
---
  drivers/gpu/drm/amd/amdgpu/Makefile  |   3 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |   3 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   4 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c | 185 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h |  66 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.c  |  60 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.h  |  43 +
  7 files changed, 363 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.c
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 3e0e2eb7e235..85224bc81ce5 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -58,7 +58,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \
amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
amdgpu_fw_attestation.o amdgpu_securedisplay.o \
-   amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o
+   amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
+   amdgpu_sw_ring.o amdgpu_ring_mux.o
  
  amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h

index 53526ffb2ce1..9996dadb39f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -33,6 +33,7 @@
  #include "amdgpu_imu.h"
  #include "soc15.h"
  #include "amdgpu_ras.h"
+#include "amdgpu_ring_mux.h"
  
  /* GFX current status */

  #define AMDGPU_GFX_NORMAL_MODE0xL
@@ -346,6 +347,8 @@ struct amdgpu_gfx {
struct amdgpu_gfx_ras   *ras;
  
  	boolis_poweron;

+
+   struct amdgpu_ring_mux  muxer;
  };
  
  #define amdgpu_gfx_get_gpu_clock_counter(adev) (adev)->gfx.funcs->get_gpu_clock_counter((adev))

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7d89a52091c0..40b1277b4f0c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -278,6 +278,10 @@ struct amdgpu_ring {
boolis_mes_queue;
uint32_thw_queue_id;
struct amdgpu_mes_ctx_data *mes_ctx;
+
+   boolis_sw_ring;
+   unsigned intentry_index;
+
  };
  
  #define amdgpu_ring_parse_cs(r, p, job, ib) ((r)->funcs->parse_cs((p), (job), (ib)))

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
new file mode 100644
index ..d6b30db27104
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
@@ -0,0 +1,185 @@
+/*
+ * Copyright 2022 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+#include 
+#include 
+
+#include "amdgpu_ring_m

[PATCH 2/2] drm/amdgpu: Use simplified API for p2p dist calc

2022-09-21 Thread Lijo Lazar
Use the simplified API that calculates the distance between two devices.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f600f3a3fe50..ec1023e7b0cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5576,9 +5576,9 @@ bool amdgpu_device_is_peer_accessible(struct 
amdgpu_device *adev,
~*peer_adev->dev->dma_mask : ~((1ULL << 32) - 1);
resource_size_t aper_limit =
adev->gmc.aper_base + adev->gmc.aper_size - 1;
-   bool p2p_access = !adev->gmc.xgmi.connected_to_cpu &&
- !(pci_p2pdma_distance_many(adev->pdev,
-   &peer_adev->dev, 1, false) < 0);
+   bool p2p_access =
+   !adev->gmc.xgmi.connected_to_cpu &&
+   !(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
 
return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&
adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 9e2a4c552a4a..7bd8e33b14be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -58,7 +58,7 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
 
-   if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, false) < 0)
+   if (pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0)
attach->peer2peer = false;
 
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
-- 
2.25.1



[PATCH 1/2] drm/amdgpu: Disable verbose for p2p dist calc

2022-09-21 Thread Lijo Lazar
Disable verbose mode when getting the p2p distance. With verbose enabled,
a warning is shown if ACS redirect is set between the devices, which adds
noise to the dmesg logs when several GPU devices are on the same platform.

Example log:

amdgpu :34:00.0: ACS redirect is set between the client and provider 
(:31:00.0)
amdgpu :34:00.0: to disable ACS redirect for this path, add the kernel 
parameter:

pci=disable_acs_redir=:30:00.0;:2e:00.0;:33:00.0;:2e:10.0

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c04ea7f1e819..f600f3a3fe50 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5578,7 +5578,7 @@ bool amdgpu_device_is_peer_accessible(struct 
amdgpu_device *adev,
adev->gmc.aper_base + adev->gmc.aper_size - 1;
bool p2p_access = !adev->gmc.xgmi.connected_to_cpu &&
  !(pci_p2pdma_distance_many(adev->pdev,
-   &peer_adev->dev, 1, true) < 0);
+   &peer_adev->dev, 1, false) < 0);
 
return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&
adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 782cbca37538..9e2a4c552a4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -58,7 +58,7 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
int r;
 
-   if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, true) < 0)
+   if (pci_p2pdma_distance_many(adev->pdev, &attach->dev, 1, false) < 0)
attach->peer2peer = false;
 
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
-- 
2.25.1



Re: [PATCH 1/3] drm/amdgpu: avoid gfx register accessing during gfxoff

2022-09-21 Thread Lazar, Lijo




On 9/21/2022 10:26 AM, Evan Quan wrote:

Make sure gfxoff is disabled before accessing gfx registers.

Signed-off-by: Evan Quan 


Series is -
Reviewed-by: Lijo Lazar 

Thanks,
Lijo


Change-Id: Ia032869080f51cdefc6e6bad4f04405193ab0fec
---
  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index ce8c792cef1a..710074682279 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -5245,6 +5245,8 @@ static void gfx_v11_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
  {
u32 reg, data;
  
+   amdgpu_gfx_off_ctrl(adev, false);
+
reg = SOC15_REG_OFFSET(GC, 0, regRLC_SPM_MC_CNTL);
if (amdgpu_sriov_is_pp_one_vf(adev))
data = RREG32_NO_KIQ(reg);
@@ -5258,6 +5260,8 @@ static void gfx_v11_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
else
WREG32_SOC15(GC, 0, regRLC_SPM_MC_CNTL, data);
+
+   amdgpu_gfx_off_ctrl(adev, true);
  }
  
  static const struct amdgpu_rlc_funcs gfx_v11_0_rlc_funcs = {




[PATCH 5/5] drm/amdgpu: Correct the position in patch_cond_exec for gfx9

2022-09-21 Thread jiadong.zhu
From: "Jiadong.Zhu" 

The current position calculated in gfx_v9_0_ring_emit_patch_cond_exec
underflows when the wptr is divisible by ring->buf_mask + 1.

Signed-off-by: Jiadong.Zhu 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index c568a4f5b81e..65f8c8d4f4ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -5754,7 +5754,7 @@ static void gfx_v9_0_ring_emit_patch_cond_exec(struct 
amdgpu_ring *ring, unsigne
BUG_ON(offset > ring->buf_mask);
BUG_ON(ring->ring[offset] != 0x55aa55aa);
 
-   cur = (ring->wptr & ring->buf_mask) - 1;
+   cur = (ring->wptr - 1) & ring->buf_mask;
if (likely(cur > offset))
ring->ring[offset] = cur - offset;
else
-- 
2.25.1
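
The one-line change above can be illustrated outside the kernel. The sketch
below is not driver code: cur_old() and cur_new() are hypothetical helpers,
and it assumes the position is held in an unsigned 32-bit variable. When
wptr is a multiple of buf_mask + 1, masking first yields 0 and the
subtraction wraps around, whereas subtracting first and masking afterwards
keeps the result inside the ring.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: mask first, then subtract. Wraps to 0xffffffff when the
 * masked wptr is 0. buf_mask is assumed to be (power of two) - 1. */
static uint32_t cur_old(uint64_t wptr, uint32_t buf_mask)
{
	assert((buf_mask & (buf_mask + 1)) == 0);
	return (uint32_t)(wptr & buf_mask) - 1;
}

/* New form: subtract first, then mask, so the result is always a valid
 * index in [0, buf_mask]. */
static uint32_t cur_new(uint64_t wptr, uint32_t buf_mask)
{
	assert((buf_mask & (buf_mask + 1)) == 0);
	return (uint32_t)((wptr - 1) & buf_mask);
}
```

With a hypothetical 256-dword ring (buf_mask = 0xff) and wptr = 0x100, the
old form gives 0xffffffff and the new form gives 0xff; for any wptr that is
not a multiple of 0x100 both forms agree.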



[PATCH 4/5] drm/amdgpu: Implement OS triggered MCBP (v5)

2022-09-21 Thread jiadong.zhu
From: "Jiadong.Zhu" 

Trigger Mid-Command Buffer Preemption according to the priority of the software
rings and the hw fence signalling condition.

The muxer saves the locations of the indirect buffer frames from the software
ring together with the fence sequence number in its fifo queue, and pops out
those records when the fences are signalled. The locations are used to resubmit
packages in preemption scenarios by copying the chunks from the software ring.

v2: Update comment style.
v3: Fix conflict caused by previous modifications.
v4: Remove unnecessary prints.
v5: Fix corner cases for resubmission.

Cc: Christian Koenig 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Acked-by: Luben Tuikov 
Signed-off-by: Jiadong.Zhu 
---
 drivers/gpu/drm/amd/amdgpu/Makefile  |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c   |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c |  91 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.h |  29 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |  12 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c | 186 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h |  24 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.c  |  27 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   |   2 +
 10 files changed, 372 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 85224bc81ce5..24c5aa19bbf2 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -59,7 +59,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
amdgpu_fw_attestation.o amdgpu_securedisplay.o \
amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
-   amdgpu_sw_ring.o amdgpu_ring_mux.o
+   amdgpu_sw_ring.o amdgpu_ring_mux.o amdgpu_mcbp.o
 
 amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 258cffe3c06a..af86d87e2f3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -211,6 +211,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
}
}
 
+   amdgpu_ring_ib_begin(ring);
if (job && ring->funcs->init_cond_exec)
patch_offset = amdgpu_ring_init_cond_exec(ring);
 
@@ -285,6 +286,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
ring->hw_prio == AMDGPU_GFX_PIPE_PRIO_HIGH)
ring->funcs->emit_wave_limit(ring, false);
 
+   amdgpu_ring_ib_end(ring);
amdgpu_ring_commit(ring);
return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c
new file mode 100644
index ..121b1a4e0f04
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mcbp.c
@@ -0,0 +1,91 @@
+/*
+ * Copyright 2022 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "amdgpu.h"
+#include "amdgpu_mcbp.h"
+#include "amdgpu_ring.h"
+
+/* Trigger Mid-Command Buffer Preemption (MCBP) and find if we need to 
resubmit. */
+int amdgpu_mcbp_trigger_preempt(struct amdgpu_ring_mux *mux)
+{
+   struct amdgpu_mux_entry *e;
+   struct amdgpu_ring *ring = NULL;
+   int i;
+
+   spin_lock(&mux->lock);
+
+   amdgpu_ring_preempt_ib(mux->real_ring);
+
+   for (i = 0; i < mux->num_ring_entries; i++) {
+   e = &mux->ring_entry[i];
+   if (e->ring->hw_prio <= AMDGPU_RING_PRIO_DEFAULT) {
+   ring = e->ring;
+   break;
+ 

[PATCH 2/5] drm/amdgpu: Add software ring callbacks for gfx9 (v5)

2022-09-21 Thread jiadong.zhu
From: "Jiadong.Zhu" 

Set ring functions with software ring callbacks on gfx9.

The software ring can be tested with the debugfs_test_ib case.

v2: Set sw_ring 2 to enable software ring by default.
v3: Remove the parameter for software ring enablement.
v4: Use amdgpu_ring_init/fini for software rings.
v5: Update for code format. Fix conflict.

Acked-by: Luben Tuikov 
Cc: Christian Koenig 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Signed-off-by: Jiadong.Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h  |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   3 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 117 +--
 5 files changed, 120 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 96d058c4cd4b..525df0b4d55f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -207,6 +207,7 @@ extern bool amdgpu_ignore_bad_page_threshold;
 extern struct amdgpu_watchdog_timer amdgpu_watchdog_timer;
 extern int amdgpu_async_gfx_ring;
 extern int amdgpu_mcbp;
+extern int amdgpu_sw_ring;
 extern int amdgpu_discovery;
 extern int amdgpu_mes;
 extern int amdgpu_mes_kiq;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 9996dadb39f7..93b25d9a87f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -348,6 +348,8 @@ struct amdgpu_gfx {
 
bool is_poweron;
 
+   /* software ring */
+   unsigned num_sw_gfx_rings;
struct amdgpu_ring_mux  muxer;
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 13db99d653bd..4eaf3bd332f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -33,6 +33,7 @@
 
 #include 
 #include "amdgpu.h"
+#include "amdgpu_sw_ring.h"
 #include "atom.h"
 
 /*
@@ -121,6 +122,11 @@ void amdgpu_ring_commit(struct amdgpu_ring *ring)
 {
uint32_t count;
 
+   if (ring->is_sw_ring) {
+   amdgpu_sw_ring_commit(ring);
+   return;
+   }
+
/* We pad to match fetch size */
count = ring->funcs->align_mask + 1 -
(ring->wptr & ring->funcs->align_mask);
@@ -343,7 +349,6 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
  */
 void amdgpu_ring_fini(struct amdgpu_ring *ring)
 {
-
/* Not to finish a ring which is not initialized */
if (!(ring->adev) ||
(!ring->is_mes_queue && !(ring->adev->rings[ring->idx])))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 40b1277b4f0c..275b885363c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -38,7 +38,8 @@ struct amdgpu_vm;
 /* max number of rings */
 #define AMDGPU_MAX_RINGS   28
 #define AMDGPU_MAX_HWIP_RINGS  8
-#define AMDGPU_MAX_GFX_RINGS   2
+/*2 software ring and 1 real ring*/
+#define AMDGPU_MAX_GFX_RINGS   3
 #define AMDGPU_MAX_COMPUTE_RINGS   8
 #define AMDGPU_MAX_VCE_RINGS   3
 #define AMDGPU_MAX_UVD_ENC_RINGS   2
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 5349ca4d19e3..4a8be9595459 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -47,6 +47,7 @@
 
 #include "amdgpu_ras.h"
 
+#include "amdgpu_sw_ring.h"
 #include "gfx_v9_4.h"
 #include "gfx_v9_0.h"
 #include "gfx_v9_4_2.h"
@@ -55,7 +56,8 @@
 #include "asic_reg/pwr/pwr_10_0_sh_mask.h"
 #include "asic_reg/gc/gc_9_0_default.h"
 
-#define GFX9_NUM_GFX_RINGS 1
+#define GFX9_NUM_GFX_RINGS 3
+#define GFX9_NUM_SW_GFX_RINGS  2
 #define GFX9_MEC_HPD_SIZE 4096
 #define RLCG_UCODE_LOADING_START_ADDRESS 0x2000L
 #define RLC_SAVE_RESTORE_ADDR_STARTING_OFFSET 0xL
@@ -2270,6 +2272,7 @@ static int gfx_v9_0_compute_ring_init(struct 
amdgpu_device *adev, int ring_id,
 static int gfx_v9_0_sw_init(void *handle)
 {
int i, j, k, r, ring_id;
+   unsigned int hw_prio;
struct amdgpu_ring *ring;
struct amdgpu_kiq *kiq;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -2356,13 +2359,41 @@ static int gfx_v9_0_sw_init(void *handle)
sprintf(ring->name, "gfx_%d", i);
ring->use_doorbell = true;
ring->doorbell_index = adev->doorbell_index.gfx_ring0 << 1;
+   ring->is_sw_ring = (adev->gfx.num_sw_gfx_rings > 1) && (i > 0);
+
+   if (adev->gfx.num_sw_gfx_rings > 1 && i == 2)
+   hw_prio = AMDGPU_RING_PRIO_2;
+   else
+   hw_prio = AMDGPU_RING_PRIO_DEFAULT;
+   i

[PATCH 3/5] drm/amdgpu: Modify unmap_queue format for gfx9 (v3)

2022-09-21 Thread jiadong.zhu
From: "Jiadong.Zhu" 

1. Modify the unmap_queue package on gfx9. Add a trailing fence to track
   preemption completion.
2. Modify the emit_ce_meta and emit_de_meta functions for the resumed IBs.

v2: Restyle code not to use ternary operator.
v3: Modify code format.

Signed-off-by: Jiadong.Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 182 +++
 drivers/gpu/drm/amd/amdgpu/soc15d.h  |   2 +
 3 files changed, 156 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 275b885363c3..aeb48cc3666c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -60,6 +60,7 @@ enum amdgpu_ring_priority_level {
 #define AMDGPU_FENCE_FLAG_64BIT (1 << 0)
 #define AMDGPU_FENCE_FLAG_INT   (1 << 1)
 #define AMDGPU_FENCE_FLAG_TC_WB_ONLY(1 << 2)
+#define AMDGPU_FENCE_FLAG_EXEC  (1 << 3)
 
 #define to_amdgpu_ring(s) container_of((s), struct amdgpu_ring, sched)
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 4a8be9595459..c568a4f5b81e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -753,7 +753,7 @@ static void gfx_v9_0_set_rlc_funcs(struct amdgpu_device 
*adev);
 static int gfx_v9_0_get_cu_info(struct amdgpu_device *adev,
struct amdgpu_cu_info *cu_info);
 static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device *adev);
-static void gfx_v9_0_ring_emit_de_meta(struct amdgpu_ring *ring);
+static void gfx_v9_0_ring_emit_de_meta(struct amdgpu_ring *ring, bool resume);
 static u64 gfx_v9_0_ring_get_rptr_compute(struct amdgpu_ring *ring);
 static void gfx_v9_0_query_ras_error_count(struct amdgpu_device *adev,
  void *ras_error_status);
@@ -826,9 +826,10 @@ static void gfx_v9_0_kiq_unmap_queues(struct amdgpu_ring 
*kiq_ring,

PACKET3_UNMAP_QUEUES_DOORBELL_OFFSET0(ring->doorbell_index));
 
if (action == PREEMPT_QUEUES_NO_UNMAP) {
-   amdgpu_ring_write(kiq_ring, lower_32_bits(gpu_addr));
-   amdgpu_ring_write(kiq_ring, upper_32_bits(gpu_addr));
-   amdgpu_ring_write(kiq_ring, seq);
+   amdgpu_ring_write(kiq_ring, lower_32_bits(ring->wptr & 
ring->buf_mask));
+   amdgpu_ring_write(kiq_ring, 0);
+   amdgpu_ring_write(kiq_ring, 0);
+
} else {
amdgpu_ring_write(kiq_ring, 0);
amdgpu_ring_write(kiq_ring, 0);
@@ -5357,11 +5358,17 @@ static void gfx_v9_0_ring_emit_ib_gfx(struct 
amdgpu_ring *ring,
 
control |= ib->length_dw | (vmid << 24);
 
-   if (amdgpu_sriov_vf(ring->adev) && (ib->flags & 
AMDGPU_IB_FLAG_PREEMPT)) {
+   if ((amdgpu_sriov_vf(ring->adev) || amdgpu_mcbp) && (ib->flags & 
AMDGPU_IB_FLAG_PREEMPT)) {
control |= INDIRECT_BUFFER_PRE_ENB(1);
 
+   if (flags & AMDGPU_IB_PREEMPTED)
+   control |= INDIRECT_BUFFER_PRE_RESUME(1);
+
if (!(ib->flags & AMDGPU_IB_FLAG_CE) && vmid)
-   gfx_v9_0_ring_emit_de_meta(ring);
+   gfx_v9_0_ring_emit_de_meta(ring,
+  
(!amdgpu_sriov_vf(ring->adev) &&
+  flags & AMDGPU_IB_PREEMPTED) 
?
+  true : false);
}
 
amdgpu_ring_write(ring, header);
@@ -5416,17 +5423,23 @@ static void gfx_v9_0_ring_emit_fence(struct amdgpu_ring 
*ring, u64 addr,
bool write64bit = flags & AMDGPU_FENCE_FLAG_64BIT;
bool int_sel = flags & AMDGPU_FENCE_FLAG_INT;
bool writeback = flags & AMDGPU_FENCE_FLAG_TC_WB_ONLY;
+   bool exec = flags & AMDGPU_FENCE_FLAG_EXEC;
+   uint32_t dw2 = 0;
 
/* RELEASE_MEM - flush caches, send int */
amdgpu_ring_write(ring, PACKET3(PACKET3_RELEASE_MEM, 6));
-   amdgpu_ring_write(ring, ((writeback ? (EOP_TC_WB_ACTION_EN |
-  EOP_TC_NC_ACTION_EN) :
- (EOP_TCL1_ACTION_EN |
-  EOP_TC_ACTION_EN |
-  EOP_TC_WB_ACTION_EN |
-  EOP_TC_MD_ACTION_EN)) |
-EVENT_TYPE(CACHE_FLUSH_AND_INV_TS_EVENT) |
-EVENT_INDEX(5)));
+
+   if (writeback) {
+   dw2 = EOP_TC_WB_ACTION_EN | EOP_TC_NC_ACTION_EN;
+   } else {
+   dw2 = EOP_TCL1_ACTION_EN | EOP_TC_ACTION_EN |
+   EOP_TC_WB_ACTION_EN | EOP_TC_MD_ACTION_EN;
+   }
+   dw2 |= EVENT_TYPE(CACHE_FLUSH_AND_INV_TS_EVENT) | EVENT_INDEX(5);
+   if (exec)
+   dw

[PATCH 1/5] drm/amdgpu: Introduce gfx software ring (v6)

2022-09-21 Thread jiadong.zhu
From: "Jiadong.Zhu" 

The software ring is created to support priority context while there is only
one hardware queue for gfx.

Every software ring has its own fence driver and can be used as an ordinary
ring for the GPU scheduler.
Multiple software rings are bound to a real ring with the ring muxer. The
packages committed on the software ring are copied to the real ring.

v2: Use array to store software ring entry.
v3: Remove unnecessary prints.
v4: Remove amdgpu_ring_sw_init/fini functions,
using gtt for sw ring buffer for later dma copy
optimization.
v5: Allocate ring entry dynamically in the muxer.
v6: Update comments for the ring muxer.

Cc: Christian Koenig 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky  
Signed-off-by: Jiadong.Zhu 
---
 drivers/gpu/drm/amd/amdgpu/Makefile  |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c | 185 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h |  66 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.c  |  60 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.h  |  43 +
 7 files changed, 363 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_sw_ring.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 3e0e2eb7e235..85224bc81ce5 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -58,7 +58,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \
amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
amdgpu_fw_attestation.o amdgpu_securedisplay.o \
-   amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o
+   amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
+   amdgpu_sw_ring.o amdgpu_ring_mux.o
 
 amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 53526ffb2ce1..9996dadb39f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -33,6 +33,7 @@
 #include "amdgpu_imu.h"
 #include "soc15.h"
 #include "amdgpu_ras.h"
+#include "amdgpu_ring_mux.h"
 
 /* GFX current status */
 #define AMDGPU_GFX_NORMAL_MODE 0xL
@@ -346,6 +347,8 @@ struct amdgpu_gfx {
struct amdgpu_gfx_ras   *ras;
 
bool is_poweron;
+
+   struct amdgpu_ring_mux  muxer;
 };
 
 #define amdgpu_gfx_get_gpu_clock_counter(adev) 
(adev)->gfx.funcs->get_gpu_clock_counter((adev))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7d89a52091c0..40b1277b4f0c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -278,6 +278,10 @@ struct amdgpu_ring {
bool is_mes_queue;
uint32_t hw_queue_id;
struct amdgpu_mes_ctx_data *mes_ctx;
+
+   bool is_sw_ring;
+   unsigned int entry_index;
+
 };
 
 #define amdgpu_ring_parse_cs(r, p, job, ib) ((r)->funcs->parse_cs((p), (job), 
(ib)))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
new file mode 100644
index ..d6b30db27104
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
@@ -0,0 +1,185 @@
+/*
+ * Copyright 2022 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+#include 
+#include 
+
+#include "amdgpu_ring_mux.h"
+#include "amdgpu_ring.h"
+
+#define AMDGPU_MUX_RESUBMIT_
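
The commit message for this patch says that packages committed on a software
ring are copied to the real ring. A minimal sketch of that copy step is shown
below, assuming both rings are power-of-two sized dword buffers indexed
through a mask. copy_to_real_ring() and all of its parameters are
hypothetical names for illustration; this is not the muxer code from the
patch, which is truncated here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Copy the dwords in [start, end) from a software ring into a real ring.
 * Indices wrap through the masks, so a chunk that crosses the end of
 * either buffer is handled without special cases.
 * Returns the real ring's new write pointer.
 */
static uint64_t copy_to_real_ring(const uint32_t *sw, uint32_t sw_mask,
				  uint64_t start, uint64_t end,
				  uint32_t *real, uint32_t real_mask,
				  uint64_t real_wptr)
{
	/* Masks are assumed to be (power of two) - 1. */
	assert((sw_mask & (sw_mask + 1)) == 0);
	assert((real_mask & (real_mask + 1)) == 0);

	for (uint64_t p = start; p != end; p++)
		real[real_wptr++ & real_mask] = sw[p & sw_mask];

	return real_wptr;
}
```

A dword-by-dword loop keeps the sketch short; a real implementation might
instead split the range at the wrap point and copy each part in one go.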

Re: [PATCH v3] drm/sched: Add FIFO sched policy to run queue v3

2022-09-21 Thread Luben Tuikov
Inlined:

On 2022-09-20 15:16, Andrey Grodzovsky wrote:
> 
> On 2022-09-19 23:11, Luben Tuikov wrote:
>> Please run this patch through checkpatch.pl, as it shows
>> 12 warnings with it. Use these command line options:
>> "--strict --show-types".
>>
>> Inlined:
>>
>> On 2022-09-13 16:40, Andrey Grodzovsky wrote:
>>> Given many entities competing for same run queue on
>>> the same scheduler and unacceptably long wait time for some
>>> jobs waiting stuck in the run queue before being picked up are
>>> observed (seen using  GPUVis).
>> Since the second part of this sentence is the result of the first,
>> I'd say something like "When many entities ... we see unacceptably long ...".
>>
>>> The issue is due to the Round Robin policy used by schedulers
>>> to pick up the next entity's job queue for execution. Under stress
>>> of many entities and long job queus within entity some
>> Spelling: "queues".
>>
>>> jobs could be stack for very long time in it's entity's
>> "stuck", not "stack".
>>
>>> queue before being popped from the queue and executed
>>> while for other entities with smaller job queues a job
>>> might execute earlier even though that job arrived later
>>> then the job in the long queue.
>> "than".
>>
>>> 
>>> Fix:
>>> Add FIFO selection policy to entities in run queue, chose next entity
>>> on run queue in such order that if job on one entity arrived
>>> earlier then job on another entity the first job will start
>>> executing earlier regardless of the length of the entity's job
>>> queue.
>>> 
>>> v2:
>>> Switch to rb tree structure for entities based on TS of
>>> oldest job waiting in the job queue of an entity. Improves next
>>> entity extraction to O(1). Entity TS update
>>> O(log N) where N is the number of entities in the run-queue
>>> 
>>> Drop default option in module control parameter.
>>>
>>> v3:
>>> Various cosmetical fixes and minor refactoring of fifo update function.
>>> Signed-off-by: Andrey Grodzovsky 
>>> Tested-by: Li Yunxiang (Teddy) 
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_entity.c |  26 -
>>>   drivers/gpu/drm/scheduler/sched_main.c   | 132 ++-
>>>   include/drm/gpu_scheduler.h  |  35 ++
>>>   3 files changed, 187 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 6b25b2f4f5a3..f3ffce3c9304 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -73,6 +73,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>>> entity->priority = priority;
>>> entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
>>> entity->last_scheduled = NULL;
>>> +   RB_CLEAR_NODE(&entity->rb_tree_node);
>>>   
>>> if(num_sched_list)
>>> entity->rq = &sched_list[0]->sched_rq[entity->priority];
>>> @@ -417,14 +418,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
>>> drm_sched_entity *entity)
>>>   
>>> sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
>>> if (!sched_job)
>>> -   return NULL;
>>> +   goto skip;
>>>   
>>> while ((entity->dependency =
>>> drm_sched_job_dependency(sched_job, entity))) {
>>> trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
>>>   
>>> -   if (drm_sched_entity_add_dependency_cb(entity))
>>> -   return NULL;
>>> +   if (drm_sched_entity_add_dependency_cb(entity)) {
>>> +   sched_job = NULL;
>>> +   goto skip;
>>> +   }
>>> }
>>>   
>>> /* skip jobs from entity that marked guilty */
>>> @@ -443,6 +446,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
>>> drm_sched_entity *entity)
>>> smp_wmb();
>>>   
>>> spsc_queue_pop(&entity->job_queue);
>>> +
>>> +   /*
>>> +* It's when head job is extracted we can access the next job (or empty)
>>> +* queue and update the entity location in the min heap accordingly.
>>> +*/
>>> +skip:
>>> +   if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>> +   drm_sched_rq_update_fifo(entity,
>>> +(sched_job ? sched_job->submit_ts : 
>>> ktime_get()));
>>> +
>>> return sched_job;
>>>   }
>>>   
>>> @@ -502,11 +515,13 @@ void drm_sched_entity_push_job(struct drm_sched_job 
>>> *sched_job)
>>>   {
>>> struct drm_sched_entity *entity = sched_job->entity;
>>> bool first;
>>> +   ktime_t ts =  ktime_get();
>>>   
>>> trace_drm_sched_job(sched_job, entity);
>>> atomic_inc(entity->rq->sched->score);
>>> WRITE_ONCE(entity->last_user, current->group_leader);
>>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>>> +   sched_job->submit_ts = ts;
>>>   
>>> /* first job wakes up scheduler */
>>> if (first) {
>>> @@ -518,8 +533,13 @@ void drm_sched_entity_push_job(struct drm_sched_job 
>>> *sched_job)
>>> 
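
The FIFO policy described in the quoted commit message can be sketched in a
few lines: among the entities on a run queue, pick the one whose oldest
waiting job was submitted first, regardless of how long its queue is. The
struct and function below are hypothetical simplifications, and a linear
scan is used only for brevity; the patch description says the entities are
kept in an rb tree keyed on that timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scheduler entity. */
struct entity {
	uint64_t oldest_ts; /* submit time of the job at the head of the queue */
	size_t queued;      /* number of jobs waiting; 0 means nothing to run */
};

/*
 * FIFO pick: return the index of the entity whose head job was submitted
 * earliest, or -1 if no entity has a job waiting.
 */
static int pick_fifo(const struct entity *e, size_t n)
{
	int best = -1;

	assert(e != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!e[i].queued)
			continue;
		if (best < 0 || e[i].oldest_ts < e[(size_t)best].oldest_ts)
			best = (int)i;
	}
	return best;
}
```

An entity with ten queued jobs whose head job has timestamp 50 loses to an
entity with a single job at timestamp 20, which is the ordering the commit
message asks for.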

PROBLEM: UBSAN error in kfd_device_queue_manager.c

2022-09-21 Thread Ellis Michael
Reporting an undefined behavior issue in the amdgpu driver in the
Linux kernel that I ran into recently. It appears during boot, fairly
early in the process.


[drm] UVD initialized successfully.
[drm] VCE initialized successfully.
kfd kfd: amdgpu: Allocated 3969056 bytes on gart


UBSAN: shift-out-of-bounds in
/build/linux-kQ6jNR/linux-5.15.0/drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:997:32
shift exponent 64 is too large for 64-bit type 'long long unsigned int'
CPU: 10 PID: 483 Comm: systemd-udevd Not tainted 5.15.0-48-generic #54-Ubuntu
Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS
P2.30 02/24/2022
Call Trace:
 
 show_stack+0x52/0x5c
 dump_stack_lvl+0x4a/0x63
 dump_stack+0x10/0x16
 ubsan_epilogue+0x9/0x49
 __ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
 initialize_nocpsch.cold+0x15/0x59 [amdgpu]
 device_queue_manager_init+0x20b/0x3b0 [amdgpu]
 kgd2kfd_device_init.cold+0x1af/0x483 [amdgpu]
 amdgpu_amdkfd_device_init+0x135/0x170 [amdgpu]
 amdgpu_device_ip_init+0x681/0x6a4 [amdgpu]
loop33: detected capacity change from 0 to 8
 amdgpu_device_init.cold+0x25b/0x7db [amdgpu]
 ? do_pci_enable_device+0xdb/0x110
 amdgpu_driver_load_kms+0x1e/0x270 [amdgpu]
 amdgpu_pci_probe+0x1ce/0x260 [amdgpu]
 local_pci_probe+0x4b/0x90
 pci_device_probe+0x119/0x1f0
 really_probe+0x222/0x420
 __driver_probe_device+0x119/0x190
 driver_probe_device+0x23/0xc0
 __driver_attach+0xbd/0x1e0
 ? __device_attach_driver+0x120/0x120
 bus_for_each_dev+0x7e/0xd0
 driver_attach+0x1e/0x30
 bus_add_driver+0x148/0x220
 driver_register+0x95/0x100
 __pci_register_driver+0x68/0x70
 amdgpu_init+0x7c/0x1000 [amdgpu]
 ? 0xc1a4
 do_one_initcall+0x48/0x1e0
 ? kmem_cache_alloc_trace+0x19e/0x2e0
 do_init_module+0x52/0x260
 load_module+0xacd/0xbc0
 __do_sys_finit_module+0xbf/0x120
 __x64_sys_finit_module+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_newfstatat+0x1c/0x30
 ? do_syscall_64+0x69/0xc0
 ? __x64_sys_mmap+0x33/0x50
 ? do_syscall_64+0x69/0xc0
 ? do_syscall_64+0x69/0xc0
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f06f3fb9a3d
Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48
RSP: 002b:7ffc7ce54ae8 EFLAGS: 0246 ORIG_RAX: 0139
RAX: ffda RBX: 556c9ab3e3d0 RCX: 7f06f3fb9a3d
RDX:  RSI: 7f06f4150441 RDI: 001a
RBP: 0002 R08:  R09: 0002
R10: 001a R11: 0246 R12: 7f06f4150441
R13: 556c9aa05fb0 R14: 556c9ab40460 R15: 556c9ab35150
 

amdgpu: SW scheduler is used
amdgpu: SRAT table not found
amdgpu: Virtual CRAT table created for GPU
amdgpu: Topology: Add dGPU node [0x6938:0x1002]
kfd kfd: amdgpu: added device 1002:6938
amdgpu :06:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 8, active_cu_number 32
[drm] fb mappable at 0xD1813000
[drm] vram apper at 0xD000
[drm] size 19906560
[drm] fb depth is 24
[drm]pitch is 13824
fbcon: amdgpudrmfb (fb0) is primary device
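
For readers unfamiliar with this class of warning: UBSAN reports
"shift exponent 64 is too large for 64-bit type" when a 64-bit value is
shifted by 64 or more, which C leaves undefined. A common way to hit it
is building a mask of the low n bits as ~0ULL >> (64 - n) and passing
n == 0. The following is only a minimal sketch of that pattern and one
way to guard it. I have not checked which expression sits at line 997 of
kfd_device_queue_manager.c, and low_bits_mask() is a hypothetical helper
name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): return a mask with the
 * low n bits set, for 0 <= n <= 64.
 *
 * The naive form ~0ULL >> (64 - n) shifts by 64 when n == 0, which is
 * undefined behavior for a 64-bit operand, so n == 0 is handled
 * separately instead of being fed to the shift.
 */
uint64_t low_bits_mask(unsigned int n)
{
	assert(n <= 64);	/* larger n would make (64 - n) wrap around */

	if (n == 0)
		return 0;

	return ~UINT64_C(0) >> (64 - n);
}
```

With the guard in place the shift count always stays in the range 0..63,
so the sanitizer has nothing to report for this expression.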


This only started recently, possibly after I replaced my motherboard
and CPU (though not my GPU).

Quick info on my system:
Ubuntu 22.04.1, kernel version 5.15.0-48
Ryzen 5600
ASRock B550m Pro4
R9 380X (STRIX-R9380X-OC4G-GAMING)


This is potentially related to a bug I recently reported to the Ubuntu
bug tracker, where my display wouldn't come back from being blank and
I would see a series of messages of the form:
amdgpu:
last message was failed ret is 0
amdgpu:
failed to send message 145 ret is 0
amdgpu:
last message was failed ret is 0
amdgpu:
failed to send message 146 ret is 0

That bug is here: https://bugs.launchpad.net/ubuntu/+bug/1990323/

I suspect that I should probably report that one to this mailing list
as well, and I'm happy to send a separate email if you want me to.
Please let me know if there's any other information you need from me
or anything else I can do.

Thanks!



More thorough information on my system:
$ cat /proc/version
Linux version 5.15.0-48-generic (buildd@lcy02-amd64-080) (gcc (Ubuntu
11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.38)
#54-Ubuntu SMP Fri Aug 26 13:26:29 UTC 2022

$ lspci -vvv
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD]
Starship/Matisse Root Complex
Subsystem: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
SERR- TAbort-
SERR- TAbort-
SERR- TAbort-
Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: 
Kernel driver in use: pcieport

00:01.2 PCI bridge: Advanced Micr