RE: [PATCH] drm/amdgpu: Fix build warnings

2021-03-23 Thread Zhang, Hawking
[AMD Public Use]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Lazar, Lijo 
Sent: Wednesday, March 24, 2021 13:19
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei 
Subject: [PATCH] drm/amdgpu: Fix build warnings


[AMD Public Use]

Fix header guard and make internal functions static. Fixes the below warnings:

drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_reset.h:24:9: warning: 
'__AMDUGPU_RESET_H__' is used as a header guard here, followed by #define of a 
different macro [-Wheader-guard]
drivers/gpu/drm/amd/amdgpu/aldebaran.c:110:6: warning: no previous prototype 
for function 'aldebaran_async_reset' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.c:1435:5: warning: 
no previous prototype for function 'aldebaran_mode2_reset' 
[-Wmissing-prototypes]

Signed-off-by: Lijo Lazar lijo.la...@amd.com
Reported-by: kernel test robot l...@intel.com
---
drivers/gpu/drm/amd/amdgpu/aldebaran.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 2 +-
drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
index 39604a461bf5..65b1dca4b02e 100644
--- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
@@ -107,7 +107,7 @@ aldebaran_mode2_prepare_hwcontext(struct 
amdgpu_reset_control *reset_ctl,
   return r;
}

-void aldebaran_async_reset(struct work_struct *work)
+static void aldebaran_async_reset(struct work_struct *work)
{
   struct amdgpu_reset_handler *handler;
   struct amdgpu_reset_control *reset_ctl =
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index dc84d871fe72..e00d38d9160a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -21,7 +21,7 @@
  *
  */

-#ifndef __AMDUGPU_RESET_H__
+#ifndef __AMDGPU_RESET_H__
#define __AMDGPU_RESET_H__

 #include "amdgpu.h"
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
index 472829f5ff1b..ddbb9a23a0af 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
@@ -1433,7 +1433,7 @@ static ssize_t aldebaran_get_gpu_metrics(struct 
smu_context *smu,
   return sizeof(struct gpu_metrics_v1_1);
}

-int aldebaran_mode2_reset(struct smu_context *smu)
+static int aldebaran_mode2_reset(struct smu_context *smu)
{
   u32 smu_version;
   int ret = 0, index;
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
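
For readers outside the kernel tree, the two warning classes fixed above can be reproduced in a minimal sketch. Everything named demo_* is hypothetical and is not part of amdgpu; the header and the source file are shown as one translation unit for brevity.

```c
/* --- demo_reset.h (hypothetical) ---
 * The name tested by #ifndef must be the same name that #define sets.
 * If they differ, the guard never takes effect and clang reports
 * -Wheader-guard, as in the first warning of the patch.
 */
#ifndef DEMO_RESET_H
#define DEMO_RESET_H

struct demo_work {
	int pending;
};

/* Exported entry point: declared here, so its definition has a prototype. */
int demo_schedule_reset(struct demo_work *work);

#endif /* DEMO_RESET_H */

/* --- demo_reset.c (hypothetical) --- */
#include <assert.h>
#include <stddef.h>

/* Used only inside this file, so it is static. A non-static definition
 * with no earlier declaration is what triggers -Wmissing-prototypes.
 */
static void demo_async_reset(struct demo_work *work)
{
	work->pending = 0;
}

int demo_schedule_reset(struct demo_work *work)
{
	assert(work != NULL);
	/* Real code would queue this as deferred work; the sketch calls it directly. */
	demo_async_reset(work);
	return 0;
}
```

Marking the helper static also keeps it out of the global symbol namespace, so the compiler can flag it if it ever becomes unused.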


[PATCH] drm/amdgpu: Fix build warnings

2021-03-23 Thread Lazar, Lijo
[AMD Public Use]

Fix header guard and make internal functions static. Fixes the below warnings:

drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_reset.h:24:9: warning: 
'__AMDUGPU_RESET_H__' is used as a header guard here, followed by #define of a 
different macro [-Wheader-guard]
drivers/gpu/drm/amd/amdgpu/aldebaran.c:110:6: warning: no previous prototype 
for function 'aldebaran_async_reset' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.c:1435:5: warning: 
no previous prototype for function 'aldebaran_mode2_reset' 
[-Wmissing-prototypes]

Signed-off-by: Lijo Lazar lijo.la...@amd.com
Reported-by: kernel test robot l...@intel.com
---
drivers/gpu/drm/amd/amdgpu/aldebaran.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 2 +-
drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
index 39604a461bf5..65b1dca4b02e 100644
--- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
@@ -107,7 +107,7 @@ aldebaran_mode2_prepare_hwcontext(struct 
amdgpu_reset_control *reset_ctl,
   return r;
}
-void aldebaran_async_reset(struct work_struct *work)
+static void aldebaran_async_reset(struct work_struct *work)
{
   struct amdgpu_reset_handler *handler;
   struct amdgpu_reset_control *reset_ctl =
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index dc84d871fe72..e00d38d9160a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -21,7 +21,7 @@
  *
  */
-#ifndef __AMDUGPU_RESET_H__
+#ifndef __AMDGPU_RESET_H__
#define __AMDGPU_RESET_H__
 #include "amdgpu.h"
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
index 472829f5ff1b..ddbb9a23a0af 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
@@ -1433,7 +1433,7 @@ static ssize_t aldebaran_get_gpu_metrics(struct 
smu_context *smu,
   return sizeof(struct gpu_metrics_v1_1);
}
-int aldebaran_mode2_reset(struct smu_context *smu)
+static int aldebaran_mode2_reset(struct smu_context *smu)
{
   u32 smu_version;
   int ret = 0, index;
--
2.17.1



[PATCH] drm/amdgpu: Fix check for RAS support

2021-03-23 Thread Luben Tuikov
Use positive logic to check for RAS
support. Rename the function to actually indicate
what it is testing for. Essentially, make the
function a predicate with the correct name.

Cc: Stanley Yang 
Cc: Alexander Deucher 
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 0e16683876aa..17652972fd49 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1933,15 +1933,12 @@ int amdgpu_ras_request_reset_on_boot(struct 
amdgpu_device *adev,
return 0;
 }
 
-static int amdgpu_ras_check_asic_type(struct amdgpu_device *adev)
+static bool amdgpu_ras_asic_supported(struct amdgpu_device *adev)
 {
-   if (adev->asic_type != CHIP_VEGA10 &&
-   adev->asic_type != CHIP_VEGA20 &&
-   adev->asic_type != CHIP_ARCTURUS &&
-   adev->asic_type != CHIP_SIENNA_CICHLID)
-   return 1;
-   else
-   return 0;
+   return adev->asic_type == CHIP_VEGA10 ||
+   adev->asic_type == CHIP_VEGA20 ||
+   adev->asic_type == CHIP_ARCTURUS ||
+   adev->asic_type == CHIP_SIENNA_CICHLID;
 }
 
 /*
@@ -1960,7 +1957,7 @@ static void amdgpu_ras_check_supported(struct 
amdgpu_device *adev,
*supported = 0;
 
if (amdgpu_sriov_vf(adev) || !adev->is_atom_fw ||
-   amdgpu_ras_check_asic_type(adev))
+   !amdgpu_ras_asic_supported(adev))
return;
 
if (amdgpu_atomfirmware_mem_ecc_supported(adev)) {
-- 
2.31.0.97.g1424303384
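
The effect of switching to positive logic can be seen in a small standalone sketch. The chip list and all demo_* names are hypothetical stand-ins, not the amdgpu ASIC enum; the point is only that the predicate reads the same way at the definition and at the call site.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical chip ids, for illustration only. */
enum demo_chip {
	DEMO_CHIP_A,
	DEMO_CHIP_B,
	DEMO_CHIP_C,
	DEMO_CHIP_COUNT
};

/* Old style: the name says "check" and a nonzero result means
 * "NOT supported", so every caller has to remember the inversion.
 */
static int demo_check_chip(enum demo_chip chip)
{
	assert(chip < DEMO_CHIP_COUNT);
	if (chip != DEMO_CHIP_A && chip != DEMO_CHIP_B)
		return 1;
	else
		return 0;
}

/* New style: a bool predicate whose name states what true means. */
static bool demo_chip_supported(enum demo_chip chip)
{
	assert(chip < DEMO_CHIP_COUNT);
	return chip == DEMO_CHIP_A || chip == DEMO_CHIP_B;
}
```

With the second form, a bail-out test is written as `if (!demo_chip_supported(chip)) return;`, which matches the shape of the updated caller in the diff.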



RE: [PATCH] drm/amdgpu: move vram recover into sriov full access

2021-03-23 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Monk Liu
--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Horace Chen  
Sent: Wednesday, March 24, 2021 12:18 PM
To: amd-gfx@lists.freedesktop.org
Cc: Grodzovsky, Andrey ; Quan, Evan 
; Chen, Horace ; Tuikov, Luben 
; Koenig, Christian ; Deucher, 
Alexander ; Xiao, Jack ; Zhang, 
Hawking ; Liu, Monk ; Xu, Feifei 
; Wang, Kevin(Yang) ; Xiaojie Yuan 

Subject: [PATCH] drm/amdgpu: move vram recover into sriov full access

[what]
Currently the driver recovers VRAM after releasing full access. This can hit a
corner case in which another VF triggers a whole-GPU reset in the meantime,
which makes the VRAM recovery fail and, in turn, fails the whole device reset.

[how]
Move the VRAM recovery into the full-access window, so that another misbehaving
VF cannot disturb the recovery sequence for this VF.

Signed-off-by: Horace Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bcb2c66437a2..23d3bb761319 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4103,11 +4103,11 @@ static int amdgpu_device_reset_sriov(struct 
amdgpu_device *adev,
amdgpu_amdkfd_post_reset(adev);
 
 error:
-   amdgpu_virt_release_full_gpu(adev, true);
if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
amdgpu_inc_vram_lost(adev);
r = amdgpu_device_recover_vram(adev);
}
+   amdgpu_virt_release_full_gpu(adev, true);
 
return r;
 }
--
2.17.1
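
The ordering change can be modelled with a toy sketch. This is a deliberate simplification and an assumption on my part: here recovery simply fails once exclusive access has been dropped, whereas in the real driver the failure comes from another VF triggering a reset in that window. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_vf {
	bool full_access;  /* true while this VF holds exclusive GPU access */
	bool vram_valid;   /* true once VRAM contents have been restored */
};

/* In this model, recovery is only safe while exclusive access is held. */
static int demo_recover_vram(struct demo_vf *vf)
{
	assert(vf != NULL);
	if (!vf->full_access)
		return -1;
	vf->vram_valid = true;
	return 0;
}

static void demo_release_full_gpu(struct demo_vf *vf)
{
	assert(vf != NULL);
	vf->full_access = false;
}

/* Ordering after the patch: recover first, release last. */
static int demo_reset_tail(struct demo_vf *vf)
{
	int r = demo_recover_vram(vf);

	demo_release_full_gpu(vf);
	return r;
}
```

Swapping the two calls in demo_reset_tail() reproduces the old ordering, where the recovery step runs without the protection of the full-access window.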


Re: [PATCH] drm/amd/display: Removing unused code from dmub_cmd.h

2021-03-23 Thread Rodrigo Siqueira
Reviewed-by: Rodrigo Siqueira 

On 03/23, Anson Jacob wrote:
> Removing code that is not used at the moment.
> 
> Signed-off-by: Anson Jacob 
> ---
>  .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 37 ---
>  1 file changed, 37 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
> b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
> index 09c62485a1f1..2d23462f4980 100644
> --- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
> +++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
> @@ -202,12 +202,7 @@ struct dmub_feature_caps {
>* Max PSR version supported by FW.
>*/
>   uint8_t psr;
> -#ifndef TRIM_FAMS
> - uint8_t fw_assisted_mclk_switch;
> - uint8_t reserved[6];
> -#else
>   uint8_t reserved[7];
> -#endif
>  };
>  
>  #if defined(__cplusplus)
> @@ -532,10 +527,6 @@ enum dmub_cmd_type {
>* Command type used for OUTBOX1 notification enable
>*/
>   DMUB_CMD__OUTBOX1_ENABLE = 71,
> -#ifndef TRIM_FAMS
> - DMUB_CMD__FW_ASSISTED_MCLK_SWITCH = 76,
> -#endif
> -
>   /**
>* Command type used for all VBIOS interface commands.
>*/
> @@ -1115,13 +1106,6 @@ enum dmub_cmd_psr_type {
>   DMUB_CMD__PSR_FORCE_STATIC  = 5,
>  };
>  
> -#ifndef TRIM_FAMS
> -enum dmub_cmd_fams_type {
> - DMUB_CMD__FAMS_SETUP_FW_CTRL= 0,
> - DMUB_CMD__FAMS_DRR_UPDATE   = 1,
> -};
> -#endif
> -
>  /**
>   * PSR versions.
>   */
> @@ -1791,24 +1775,6 @@ struct dmub_rb_cmd_drr_update {
>   struct dmub_optc_state dmub_optc_state_req;
>  };
>  
> -#ifndef TRIM_FAMS
> -struct dmub_cmd_fw_assisted_mclk_switch_pipe_data {
> - uint32_t pix_clk_100hz;
> - uint32_t min_refresh_in_uhz;
> - uint32_t max_ramp_step;
> -};
> -
> -struct dmub_cmd_fw_assisted_mclk_switch_config {
> - uint32_t fams_enabled;
> - struct dmub_cmd_fw_assisted_mclk_switch_pipe_data 
> pipe_data[DMUB_MAX_STREAMS];
> -};
> -
> -struct dmub_rb_cmd_fw_assisted_mclk_switch {
> - struct dmub_cmd_header header;
> - struct dmub_cmd_fw_assisted_mclk_switch_config config_data;
> -};
> -#endif
> -
>  /**
>   * Data passed from driver to FW in a DMUB_CMD__VBIOS_LVTMA_CONTROL command.
>   */
> @@ -1951,9 +1917,6 @@ union dmub_rb_cmd {
>*/
>   struct dmub_rb_cmd_query_feature_caps query_feature_caps;
>   struct dmub_rb_cmd_drr_update drr_update;
> -#ifndef TRIM_FAMS
> - struct dmub_rb_cmd_fw_assisted_mclk_switch fw_assisted_mclk_switch;
> -#endif
>   /**
>* Definition of a DMUB_CMD__VBIOS_LVTMA_CONTROL command.
>*/
> -- 
> 2.25.1
> 

-- 
Rodrigo Siqueira
https://siqueira.tech




RE: [PATCH] drm/amdgpu/pm: mark pcie link/speed arrays as const

2021-03-23 Thread Quan, Evan
[AMD Public Use]

Reviewed-by: Evan Quan 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Wednesday, March 24, 2021 11:51 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Dave Airlie 

Subject: [PATCH] drm/amdgpu/pm: mark pcie link/speed arrays as const

They are read only.

Noticed-by: Dave Airlie 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/pm/inc/smu_v11_0.h| 4 ++--
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 4 ++--
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 4 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 4 ++--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h 
b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
index ad4db2edf1fb..d5182bbaa598 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
@@ -61,8 +61,8 @@
 #define LINK_WIDTH_MAX 6
 #define LINK_SPEED_MAX 3
 
-static __maybe_unused uint16_t link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static __maybe_unused uint16_t link_speed[] = {25, 50, 80, 160};
+static const __maybe_unused uint16_t link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const __maybe_unused uint16_t link_speed[] = {25, 50, 80, 160};
 
 static const
 struct smu_temperature_range __maybe_unused smu11_thermal_policy[] =
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
index b6d7b7b224a9..1a097e608808 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
@@ -52,8 +52,8 @@
 
 #define LINK_WIDTH_MAX 6
 #define LINK_SPEED_MAX 3
-static int link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static int link_speed[] = {25, 50, 80, 160};
+static const int link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const int link_speed[] = {25, 50, 80, 160};
 
 static int vega12_force_clock_level(struct pp_hwmgr *hwmgr,
enum pp_clock_type type, uint32_t mask);
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
index 213c9c6b4462..d3177a534fdf 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
@@ -57,8 +57,8 @@
 
 #define LINK_WIDTH_MAX 6
 #define LINK_SPEED_MAX 3
-static int link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static int link_speed[] = {25, 50, 80, 160};
+static const int link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const int link_speed[] = {25, 50, 80, 160};
 
 static void vega20_set_default_registry_data(struct pp_hwmgr *hwmgr)
 {
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index bd3a9c89dc44..2e296cb3bb04 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -72,8 +72,8 @@ MODULE_FIRMWARE("amdgpu/aldebaran_smc.bin");
 #define PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE_MASK 0xC000
 #define PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE__SHIFT 0xE
 
-static int link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static int link_speed[] = {25, 50, 80, 160};
+static const int link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const int link_speed[] = {25, 50, 80, 160};
 
 int smu_v13_0_init_microcode(struct smu_context *smu)
 {
-- 
2.30.2
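
As a small illustration of why const fits these tables, here is a standalone sketch of a read-only lookup. The demo_* names and the bounds-checked accessor are hypothetical; only the table values are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LINK_WIDTH_MAX 6

/* const: the table is never modified at run time, so the compiler may
 * place it in read-only data and will reject any accidental store.
 */
static const int demo_link_width[] = {0, 1, 2, 4, 8, 12, 16};

/* Compile-time check that the table and its index limit agree. */
static_assert(sizeof(demo_link_width) / sizeof(demo_link_width[0]) ==
	      DEMO_LINK_WIDTH_MAX + 1,
	      "table size must match DEMO_LINK_WIDTH_MAX");

/* Map an encoded width level to a lane count; 0 for out-of-range levels. */
static int demo_lanes_for_level(size_t level)
{
	if (level > DEMO_LINK_WIDTH_MAX)
		return 0;
	return demo_link_width[level];
}
```

A statement such as `demo_link_width[0] = 1;` no longer compiles once the array is const, which is the practical benefit of the change.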



[pull] amdgpu, amdkfd, radeon drm-next-5.13

2021-03-23 Thread Alex Deucher
Hi Dave, Daniel,

Same as the last one, but with a typo in one of the sign-offs fixed.

The following changes since commit 6e80fb8ab04f6c4f377e2fd422bdd1855beb7371:

  drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2) (2021-02-18 16:43:09 -0500)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-next-5.13-2021-03-23

for you to fetch changes up to 8c44390d8872ebf3a28558b59a0074df39b3da8f:

  drm/amdkfd: Bump KFD API version (2021-03-23 23:40:55 -0400)


amd-drm-next-5.13-2021-03-23:

amdgpu:
- Debugfs cleanup
- Various cleanups and spelling fixes
- Flexible array cleanups
- Initial AMD Freesync HDMI
- Display fixes
- 10bpc dithering improvements
- Display ASSR support
- Clean up and unify powerplay and swsmu interfaces
- Vangogh fixes
- Add SMU gfx busy queues for RV/PCO
- PCIE DPM fixes
- S0ix fixes
- GPU metrics data fixes
- DCN secure display support
- Backlight type override
- Add initial support for Aldebaran
- RAS fixes
- Prime fixes for A+A systems
- Reset fixes
- Initial resource cursor support
- Drop legacy IO BAR requirements
- Various power fixes

amdkfd:
- MMU notifier fixes
- APU fixes

radeon:
- Debugfs cleanups
- Flexible array cleanups

UAPI:
- amdgpu: Add a new INFO ioctl interface to query video capabilities
  rather than hardcoding them in userspace.  This allows us to provide
  fine grained asic capabilities (e.g., if a particular part is
  bandwidth limited, we can limit the capabilities).  Proposed userspace:
  https://gitlab.freedesktop.org/leoliu/drm/-/commits/info_video_caps
  https://gitlab.freedesktop.org/leoliu/mesa/-/commits/info_video_caps
- amdkfd: bump the driver version.  There was a problem with reporting
  some RAS features on older versions of the driver. Proposed userspace:
  
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/7cdd63475c36bb9f49bb960f90f9a8cdb7e80a21


Alex Deucher (21):
  drm/amdgpu: add asic callback for querying video codec info (v3)
  drm/amdgpu: add video decode/encode cap tables and asic callbacks (v3)
  drm/amdgpu: add INFO ioctl support for querying video caps (v4)
  drm/amdgpu: bump driver version for new video codec INFO ioctl query
  drm/amdgpu/codec: drop the internal codec index
  drm/amdgpu/pm: make unsupported power profile messages debug
  drm/amdgpu/swsmu/vangogh: Only use RLCPowerNotify msg for disable
  drm/amdgpu: Only check for S0ix if AMD_PMC is configured
  drm/amdgpu: enable BACO runpm by default on sienna cichlid and navy flounder
  drm/amdgpu: enable TMZ by default on Raven asics
  drm/amdgpu/dc: fill in missing call to atom cmd table for pll adjust v2
  drm/amdgpu/display: simplify backlight setting
  drm/amdgpu/display: don't assert in set backlight function
  drm/amdgpu/display: handle aux backlight in backlight_get_brightness
  drm/amdgpu: add mmhub client ids for aldebaran
  drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m
  drm/amdgpu/powerplay/smu10: add support for gpu busy query (v2)
  drm/amdgpu/smu8: return an error rather than 50% if busy query fails
  drm/amdgpu: drop legacy IO bar support
  drm/amdgpu: drop extraneous hw_status update
  drm/amdgpu/display: properly guard dc_dsc_stream_bandwidth_in_kbps

Alex Sierra (4):
  drm/amdgpu: UTLC1 RB SDMA timeout on Aldebaran
  drm/amdgpu: enable 48-bit IH timestamp counter
  drm/amdgpu: update mmhub client ids for Aldebaran
  drm/amdgpu: use pd addr based on gart level page table

Amber Lin (1):
  drm/amdgpu: Aldebaran doesn't use semaphore

Anson Jacob (5):
  Revert "drm/amd/display: reuse current context instead of recreating one"
  drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
  Revert "drm/amd/display: remove duplicate include in amdgpu_dm.c"
  drm/amd/display: remove duplicate include in amdgpu_dm.c
  drm/amd/display: Fix UBSAN warning for not a valid value for type '_Bool'

Anthony Koo (5):
  drm/amd/display: [FW Promotion] Release 0.0.52
  drm/amd/display: [FW Promotion] Release 0.0.53
  drm/amd/display: [FW Promotion] Release 0.0.54
  drm/amd/display: [FW Promotion] Release 0.0.55
  drm/amd/display: [FW Promotion] Release 0.0.56

Anthony Wang (2):
  drm/amd/display: disable seamless boot for DP MST
  drm/amd/display: enable audio on DP seamless boot

Aric Cyr (10):
  drm/amd/display: 3.2.123
  drm/amd/display: Don't optimize bandwidth before disabling planes
  drm/amd/display: reduce scope for local var
  drm/amd/display: 3.2.124
  drm/amd/display: 3.2.125
  drm/amd/display: 3.2.126
  drm/amd/display: 3.2.126.1
  drm/amd/display: System black screen hangs on driver load
  drm/amd/display: DCHUB underflow counter increasing in some scenarios
  drm/amd/display: 3.2.1

[PATCH] drm/amdgpu/pm: mark pcie link/speed arrays as const

2021-03-23 Thread Alex Deucher
They are read only.

Noticed-by: Dave Airlie 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/pm/inc/smu_v11_0.h| 4 ++--
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 4 ++--
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 4 ++--
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 4 ++--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h 
b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
index ad4db2edf1fb..d5182bbaa598 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h
@@ -61,8 +61,8 @@
 #define LINK_WIDTH_MAX 6
 #define LINK_SPEED_MAX 3
 
-static __maybe_unused uint16_t link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static __maybe_unused uint16_t link_speed[] = {25, 50, 80, 160};
+static const __maybe_unused uint16_t link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const __maybe_unused uint16_t link_speed[] = {25, 50, 80, 160};
 
 static const
 struct smu_temperature_range __maybe_unused smu11_thermal_policy[] =
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
index b6d7b7b224a9..1a097e608808 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
@@ -52,8 +52,8 @@
 
 #define LINK_WIDTH_MAX 6
 #define LINK_SPEED_MAX 3
-static int link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static int link_speed[] = {25, 50, 80, 160};
+static const int link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const int link_speed[] = {25, 50, 80, 160};
 
 static int vega12_force_clock_level(struct pp_hwmgr *hwmgr,
enum pp_clock_type type, uint32_t mask);
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
index 213c9c6b4462..d3177a534fdf 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
@@ -57,8 +57,8 @@
 
 #define LINK_WIDTH_MAX 6
 #define LINK_SPEED_MAX 3
-static int link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static int link_speed[] = {25, 50, 80, 160};
+static const int link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const int link_speed[] = {25, 50, 80, 160};
 
 static void vega20_set_default_registry_data(struct pp_hwmgr *hwmgr)
 {
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index bd3a9c89dc44..2e296cb3bb04 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -72,8 +72,8 @@ MODULE_FIRMWARE("amdgpu/aldebaran_smc.bin");
 #define PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE_MASK 0xC000
 #define PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE__SHIFT 0xE
 
-static int link_width[] = {0, 1, 2, 4, 8, 12, 16};
-static int link_speed[] = {25, 50, 80, 160};
+static const int link_width[] = {0, 1, 2, 4, 8, 12, 16};
+static const int link_speed[] = {25, 50, 80, 160};
 
 int smu_v13_0_init_microcode(struct smu_context *smu)
 {
-- 
2.30.2



RE: [PATCH 1/2] drm/amdgpu: use zero as start for dummy resource walks

2021-03-23 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

I don’t think so. Here, start is an offset: we get the valid physical address
from pages_addr[offset] when we update the mapping.
Btw, what issue are we seeing?

-Original Message-
From: amd-gfx  On Behalf Of Christian König
Sent: March 23, 2021 22:55
To: amd-gfx@lists.freedesktop.org
Cc: Das, Nirmoy ; Chen, Guchun 
Subject: [PATCH 1/2] drm/amdgpu: use zero as start for dummy resource walks

When we don't have a physical backing store, we should use zero instead of the
virtual start address, since that isn't necessarily a valid physical one.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index 40f2adf305bc..e94362ccf9d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -54,7 +54,7 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
 struct drm_mm_node *node;

 if (!res || !res->mm_node) {
-cur->start = start;
+cur->start = 0;
 cur->size = size;
 cur->remaining = size;
 cur->node = NULL;
--
2.25.1



Re: [PATCH] drm/amd/display: Use DRM_DEBUG_DP

2021-03-23 Thread Deucher, Alexander
[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Alex Deucher 

From: Tuikov, Luben 
Sent: Tuesday, March 23, 2021 4:26 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Tuikov, Luben ; Wentland, Harry 
; Deucher, Alexander 
Subject: [PATCH] drm/amd/display: Use DRM_DEBUG_DP

Convert IRQ-based prints from DRM_DEBUG_DRIVER to
DRM_DEBUG_DP, as the latter is not used in drm/amd
prior to this patch and since IRQ-based prints
drown out the rest of the driver's
DRM_DEBUG_DRIVER messages.

Cc: Harry Wentland 
Cc: Alex Deucher 
Signed-off-by: Luben Tuikov 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 57 +--
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f455fc3aa561..9376d44ce3b4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -449,9 +449,9 @@ static void dm_pflip_high_irq(void *interrupt_params)
 amdgpu_crtc->pflip_status = AMDGPU_FLIP_NONE;
 spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);

-   DRM_DEBUG_DRIVER("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",
-amdgpu_crtc->crtc_id, amdgpu_crtc,
-vrr_active, (int) !e);
+   DRM_DEBUG_KMS("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",
+ amdgpu_crtc->crtc_id, amdgpu_crtc,
+ vrr_active, (int) !e);
 }

 static void dm_vupdate_high_irq(void *interrupt_params)
@@ -993,8 +993,7 @@ static void event_mall_stutter(struct work_struct *work)
 dc_allow_idle_optimizations(
 dm->dc, dm->active_vblank_irq_count == 0);

-   DRM_DEBUG_DRIVER("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);
-
+   DRM_DEBUG_KMS("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);

 mutex_unlock(&dm->dc_lock);
 }
@@ -1810,8 +1809,8 @@ static void dm_gpureset_toggle_interrupts(struct 
amdgpu_device *adev,
 if (acrtc && state->stream_status[i].plane_count != 0) {
 irq_source = IRQ_TYPE_PFLIP + acrtc->otg_inst;
 rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) 
? 0 : -EBUSY;
-   DRM_DEBUG("crtc %d - vupdate irq %sabling: r=%d\n",
- acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_VBL("crtc %d - vupdate irq %sabling: r=%d\n",
+ acrtc->crtc_id, enable ? "en" : "dis", 
rc);
 if (rc)
 DRM_WARN("Failed to %s pflip interrupts\n",
  enable ? "enable" : "disable");
@@ -4966,8 +4965,8 @@ static void update_stream_scaling_settings(const struct 
drm_display_mode *mode,
 stream->src = src;
 stream->dst = dst;

-   DRM_DEBUG_DRIVER("Destination Rectangle x:%d  y:%d  width:%d  
height:%d\n",
-   dst.x, dst.y, dst.width, dst.height);
+   DRM_DEBUG_KMS("Destination Rectangle x:%d  y:%d  width:%d  height:%d\n",
+ dst.x, dst.y, dst.width, dst.height);

 }

@@ -5710,8 +5709,8 @@ static inline int dm_set_vupdate_irq(struct drm_crtc 
*crtc, bool enable)

 rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) ? 0 : -EBUSY;

-   DRM_DEBUG_DRIVER("crtc %d - vupdate irq %sabling: r=%d\n",
-acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_VBL("crtc %d - vupdate irq %sabling: r=%d\n",
+ acrtc->crtc_id, enable ? "en" : "dis", rc);
 return rc;
 }

@@ -6664,7 +6663,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
 int r;

 if (!new_state->fb) {
-   DRM_DEBUG_DRIVER("No FB bound\n");
+   DRM_DEBUG_KMS("No FB bound\n");
 return 0;
 }

@@ -7896,11 +7895,11 @@ static void handle_cursor_update(struct drm_plane 
*plane,
 if (!plane->state->fb && !old_plane_state->fb)
 return;

-   DRM_DEBUG_DRIVER("%s: crtc_id=%d with size %d to %d\n",
-__func__,
-amdgpu_crtc->crtc_id,
-plane->state->crtc_w,
-plane->state->crtc_h);
+   DRM_DEBUG_KMS("%s: crtc_id=%d with size %d to %d\n",
+ __func__,
+ amdgpu_crtc->crtc_id,
+ plane->state->crtc_w,
+ plane->state->crtc_h);

 ret = get_cursor_position(plane, crtc, &position);
 if (ret)
@@ -7958,8 +7957,8 @@ static void prepare_flip_isr(struct amdgpu_crtc *acrtc)
 /* Mark this event as consumed */
 acrtc->base.state->event = NULL;

-   DRM_DEBUG_DRIVER("crtc:%d, pflip_

RE: [PATCH] drm/amd/pm: Update aldebaran pmfw interface

2021-03-23 Thread Xu, Feifei


Reviewed-by: Feifei Xu 

From: Lazar, Lijo 
Sent: Tuesday, March 23, 2021 9:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei ; 
Feng, Kenneth ; Wang, Kevin(Yang) 
Subject: [PATCH] drm/amd/pm: Update aldebaran pmfw interface


[AMD Public Use]

Update aldebaran PMFW interfaces to version 0x6

Signed-off-by: Lijo Lazar lijo.la...@amd.com
---
.../gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h| 11 +--
drivers/gpu/drm/amd/pm/inc/smu_v13_0.h|  2 +-
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h 
b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
index df2ead254f37..d23533bda002 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
@@ -435,8 +435,12 @@ typedef struct {
   uint8_t  GpioI2cSda; // Serial Data
   uint16_t spare5;

+  uint16_t XgmiMaxCurrent; // in Amps
+  int8_t   XgmiOffset; // in Amps
+  uint8_t  Padding_TelemetryXgmi;
+
   //reserved
-  uint32_t reserved[16];
+  uint32_t reserved[15];

 } PPTable_t;

@@ -481,7 +485,10 @@ typedef struct {
   uint16_t TemperatureAllHBM[4]  ;
   uint32_t GfxBusyAcc;
   uint32_t DramBusyAcc   ;
-  uint32_t Spare[4];
+  uint32_t EnergyAcc64bitLow ; //15.259uJ resolution
+  uint32_t EnergyAcc64bitHigh;
+  uint32_t TimeStampLow  ; //10ns resolution
+  uint32_t TimeStampHigh ;

   // Padding - ignore
   uint32_t MmHubPadding[8]; // SMU internal use
diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h 
b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
index 6db3464c09d6..8145e1cbf181 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
@@ -26,7 +26,7 @@
#include "amdgpu_smu.h"

 #define SMU13_DRIVER_IF_VERSION_INV 0x
-#define SMU13_DRIVER_IF_VERSION_ALDE 0x5
+#define SMU13_DRIVER_IF_VERSION_ALDE 0x6

 /* MP Apertures */
#define MP0_Public  0x0380
--
2.17.1
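
One detail of the diff worth spelling out: the three new PPTable_t fields occupy four bytes and reserved[] shrinks by exactly one uint32_t, so the table size appears to stay the same. That the size is meant to be preserved is my reading of the diff, not something the commit message states. A sketch with hypothetical trimmed-down structs (only the tail of the table is modelled):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tail of the table at interface version 0x5. */
struct demo_tail_v5 {
	uint32_t reserved[16];
};

/* Same tail at version 0x6: 2 + 1 + 1 bytes of new fields replace
 * one 4-byte reserved word.
 */
struct demo_tail_v6 {
	uint16_t xgmi_max_current;  /* in Amps */
	int8_t   xgmi_offset;       /* in Amps */
	uint8_t  padding;
	uint32_t reserved[15];
};

/* The layout seen by the firmware must not change size, and the
 * remaining reserved words must start right after the new fields.
 */
static_assert(sizeof(struct demo_tail_v5) == sizeof(struct demo_tail_v6),
	      "new fields must be carved out of the reserved space");
static_assert(offsetof(struct demo_tail_v6, reserved) == 4,
	      "no hidden padding before reserved[]");
```

The same pattern shows up in the metrics table, where Spare[4] is replaced by four named uint32_t fields.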



[PATCH] Revert "drm/amd/display: To modify the condition in indicating branch device"

2021-03-23 Thread Alex Deucher
This breaks HDMI audio.

This reverts commit 9413b23fadad3861f5afd626ac44ef83ad8068ab.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1536
Signed-off-by: Alex Deucher 
Cc: Martin Tsai 
Cc: Bindu Ramamurthy 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 484d96f78ade..18adf4ea6044 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -3192,7 +3192,13 @@ static void get_active_converter_info(
}
 
/* DPCD 0x5 bit 0 = 1, it indicate it's branch device */
-   link->dpcd_caps.is_branch_dev = ds_port.fields.PORT_PRESENT;
+   if (ds_port.fields.PORT_TYPE == DOWNSTREAM_DP) {
+   link->dpcd_caps.is_branch_dev = false;
+   }
+
+   else {
+   link->dpcd_caps.is_branch_dev = ds_port.fields.PORT_PRESENT;
+   }
 
switch (ds_port.fields.PORT_TYPE) {
case DOWNSTREAM_VGA:
-- 
2.30.2



[PATCH v2] drm/amd/display: Use DRM_DEBUG_DP

2021-03-23 Thread Luben Tuikov
Convert IRQ-based prints from DRM_DEBUG_DRIVER to
the appropriate DRM log type, since IRQ-based
prints drown out the rest of the driver's
DRM_DEBUG_DRIVER messages.

v2: Update as per feedback to fine-tune for each
type of DRM log level.

Cc: Harry Wentland 
Cc: Alex Deucher 
Signed-off-by: Luben Tuikov 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 57 +--
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index ce615554faed..e923414777e6 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -449,9 +449,9 @@ static void dm_pflip_high_irq(void *interrupt_params)
amdgpu_crtc->pflip_status = AMDGPU_FLIP_NONE;
spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);
 
-   DRM_DEBUG_DRIVER("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",
-amdgpu_crtc->crtc_id, amdgpu_crtc,
-vrr_active, (int) !e);
+   DRM_DEBUG_KMS("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",
+ amdgpu_crtc->crtc_id, amdgpu_crtc,
+ vrr_active, (int) !e);
 }
 
 static void dm_vupdate_high_irq(void *interrupt_params)
@@ -1019,8 +1019,7 @@ static void event_mall_stutter(struct work_struct *work)
dc_allow_idle_optimizations(
dm->dc, dm->active_vblank_irq_count == 0);
 
-   DRM_DEBUG_DRIVER("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);
-
+   DRM_DEBUG_KMS("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);
 
mutex_unlock(&dm->dc_lock);
 }
@@ -1836,8 +1835,8 @@ static void dm_gpureset_toggle_interrupts(struct 
amdgpu_device *adev,
if (acrtc && state->stream_status[i].plane_count != 0) {
irq_source = IRQ_TYPE_PFLIP + acrtc->otg_inst;
rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) 
? 0 : -EBUSY;
-   DRM_DEBUG("crtc %d - vupdate irq %sabling: r=%d\n",
- acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_VBL("crtc %d - vupdate irq %sabling: r=%d\n",
+ acrtc->crtc_id, enable ? "en" : "dis", 
rc);
if (rc)
DRM_WARN("Failed to %s pflip interrupts\n",
 enable ? "enable" : "disable");
@@ -5014,8 +5013,8 @@ static void update_stream_scaling_settings(const struct 
drm_display_mode *mode,
stream->src = src;
stream->dst = dst;
 
-   DRM_DEBUG_DRIVER("Destination Rectangle x:%d  y:%d  width:%d  
height:%d\n",
-   dst.x, dst.y, dst.width, dst.height);
+   DRM_DEBUG_KMS("Destination Rectangle x:%d  y:%d  width:%d  height:%d\n",
+ dst.x, dst.y, dst.width, dst.height);
 
 }
 
@@ -5758,8 +5757,8 @@ static inline int dm_set_vupdate_irq(struct drm_crtc 
*crtc, bool enable)
 
rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) ? 0 : -EBUSY;
 
-   DRM_DEBUG_DRIVER("crtc %d - vupdate irq %sabling: r=%d\n",
-acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_VBL("crtc %d - vupdate irq %sabling: r=%d\n",
+ acrtc->crtc_id, enable ? "en" : "dis", rc);
return rc;
 }
 
@@ -6712,7 +6711,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
int r;
 
if (!new_state->fb) {
-   DRM_DEBUG_DRIVER("No FB bound\n");
+   DRM_DEBUG_KMS("No FB bound\n");
return 0;
}
 
@@ -7944,11 +7943,11 @@ static void handle_cursor_update(struct drm_plane 
*plane,
if (!plane->state->fb && !old_plane_state->fb)
return;
 
-   DRM_DEBUG_DRIVER("%s: crtc_id=%d with size %d to %d\n",
-__func__,
-amdgpu_crtc->crtc_id,
-plane->state->crtc_w,
-plane->state->crtc_h);
+   DRM_DEBUG_KMS("%s: crtc_id=%d with size %d to %d\n",
+ __func__,
+ amdgpu_crtc->crtc_id,
+ plane->state->crtc_w,
+ plane->state->crtc_h);
 
ret = get_cursor_position(plane, crtc, &position);
if (ret)
@@ -8006,8 +8005,8 @@ static void prepare_flip_isr(struct amdgpu_crtc *acrtc)
/* Mark this event as consumed */
acrtc->base.state->event = NULL;
 
-   DRM_DEBUG_DRIVER("crtc:%d, pflip_stat:AMDGPU_FLIP_SUBMITTED\n",
-acrtc->crtc_id);
+   DRM_DEBUG_KMS("crtc:%d, pflip_stat:AMDGPU_FLIP_SUBMITTED\n",
+ acrtc->crtc_id);
 }
 
 static void update_freesync_state_on_stream(
@@ -8313,7 +8312,7 @@ static void amdgpu_dm
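
Editor's sketch, not part of the patch: the commit message relies on debug output being filtered per category, so that IRQ-driven prints no longer drown out driver messages. The following user-space sketch shows that idea with a bitmask gate. The DBG_* names and dbg_print() are hypothetical; the bit values are chosen to mirror the DRIVER, KMS and VBL bits of the drm.debug mask, which is an assumption and not something stated in this thread.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical categories; values assumed to mirror the drm.debug bits. */
enum dbg_category {
	DBG_DRIVER = 0x02,
	DBG_KMS    = 0x04,
	DBG_VBL    = 0x20,
};

/*
 * Print the message only if its category is enabled in the mask.
 * Returns 1 if the message was printed and 0 if it was filtered out.
 */
static int dbg_print(unsigned int enabled_mask, enum dbg_category cat,
		     const char *fmt, ...)
{
	va_list ap;

	if (!(enabled_mask & (unsigned int)cat))
		return 0;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	return 1;
}
```

With only DBG_DRIVER enabled, a vupdate IRQ message tagged DBG_VBL is dropped while ordinary driver messages still appear, which is the effect the patch is after.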

[PATCH] drm/amd/display: Removing unsued code from dmub_cmd.h

2021-03-23 Thread Anson Jacob
Removing code that is not used at the moment.

Signed-off-by: Anson Jacob 
---
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 37 ---
 1 file changed, 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 09c62485a1f1..2d23462f4980 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -202,12 +202,7 @@ struct dmub_feature_caps {
 * Max PSR version supported by FW.
 */
uint8_t psr;
-#ifndef TRIM_FAMS
-   uint8_t fw_assisted_mclk_switch;
-   uint8_t reserved[6];
-#else
uint8_t reserved[7];
-#endif
 };
 
 #if defined(__cplusplus)
@@ -532,10 +527,6 @@ enum dmub_cmd_type {
 * Command type used for OUTBOX1 notification enable
 */
DMUB_CMD__OUTBOX1_ENABLE = 71,
-#ifndef TRIM_FAMS
-   DMUB_CMD__FW_ASSISTED_MCLK_SWITCH = 76,
-#endif
-
/**
 * Command type used for all VBIOS interface commands.
 */
@@ -1115,13 +1106,6 @@ enum dmub_cmd_psr_type {
DMUB_CMD__PSR_FORCE_STATIC  = 5,
 };
 
-#ifndef TRIM_FAMS
-enum dmub_cmd_fams_type {
-   DMUB_CMD__FAMS_SETUP_FW_CTRL= 0,
-   DMUB_CMD__FAMS_DRR_UPDATE   = 1,
-};
-#endif
-
 /**
  * PSR versions.
  */
@@ -1791,24 +1775,6 @@ struct dmub_rb_cmd_drr_update {
struct dmub_optc_state dmub_optc_state_req;
 };
 
-#ifndef TRIM_FAMS
-struct dmub_cmd_fw_assisted_mclk_switch_pipe_data {
-   uint32_t pix_clk_100hz;
-   uint32_t min_refresh_in_uhz;
-   uint32_t max_ramp_step;
-};
-
-struct dmub_cmd_fw_assisted_mclk_switch_config {
-   uint32_t fams_enabled;
-   struct dmub_cmd_fw_assisted_mclk_switch_pipe_data 
pipe_data[DMUB_MAX_STREAMS];
-};
-
-struct dmub_rb_cmd_fw_assisted_mclk_switch {
-   struct dmub_cmd_header header;
-   struct dmub_cmd_fw_assisted_mclk_switch_config config_data;
-};
-#endif
-
 /**
  * Data passed from driver to FW in a DMUB_CMD__VBIOS_LVTMA_CONTROL command.
  */
@@ -1951,9 +1917,6 @@ union dmub_rb_cmd {
 */
struct dmub_rb_cmd_query_feature_caps query_feature_caps;
struct dmub_rb_cmd_drr_update drr_update;
-#ifndef TRIM_FAMS
-   struct dmub_rb_cmd_fw_assisted_mclk_switch fw_assisted_mclk_switch;
-#endif
/**
 * Definition of a DMUB_CMD__VBIOS_LVTMA_CONTROL command.
 */
-- 
2.25.1



[PATCH] drm/amdgpu: Add new PF2VF flags for VF register access method

2021-03-23 Thread Rohit Khaire
Add 3 sub flags to notify guest for indirect access of gc, mmhub and ih

Signed-off-by: Rohit Khaire 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h| 11 +++
 drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h | 17 +++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 8dd624c20f89..0224f352d060 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -104,6 +104,17 @@ enum AMDGIM_FEATURE_FLAG {
AMDGIM_FEATURE_GIM_MM_BW_MGR = 0x8,
/* PP ONE VF MODE in GIM */
AMDGIM_FEATURE_PP_ONE_VF = (1 << 4),
+   /* Indirect Reg Access enabled */
+   AMDGIM_FETURE_INDIRECT_REG_ACCESS = (1 << 5),
+};
+
+enum AMDGIM_REG_ACCESS_FLAG {
+   /* Use PSP to program IH_RB_CNTL */
+   AMDGIM_FEATURE_IH_REG_PSP_EN = (1 << 0),
+   /* Use RLC to program MMHUB regs */
+   AMDGIM_FEATURE_RLC_MMHUB_EN  = (1 << 1),
+   /* Use RLC to program GC regs */
+   AMDGIM_FEATURE_RLC_GC_EN = (1 << 2),
 };
 
 struct amdgim_pf2vf_info_v1 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
index 5355827ed0ae..7fed6377d931 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
@@ -90,11 +90,22 @@ union amd_sriov_msg_feature_flags {
uint32_t  host_flr_vramlost  : 1;
uint32_t  mm_bw_management   : 1;
uint32_t  pp_one_vf_mode : 1;
-   uint32_t  reserved   : 27;
+   uint32_t  reg_indirect_acc   : 1;
+   uint32_t  reserved   : 26;
} flags;
uint32_t  all;
 };
 
+union amd_sriov_reg_access_flags {
+   struct {
+   uint32_t vf_reg_access_ih: 1;
+   uint32_t vf_reg_access_mmhub : 1;
+   uint32_t vf_reg_access_gc: 1;
+   uint32_t reserved: 29;
+   } flags;
+   uint32_t all;
+};
+
 union amd_sriov_msg_os_info {
struct {
uint32_t  windows: 1;
@@ -149,8 +160,10 @@ struct amd_sriov_msg_pf2vf_info {
/* identification in ROCm SMI */
uint64_t uuid;
uint32_t fcn_idx;
+   /* flags to indicate which register access method VF should use */
+   union amd_sriov_reg_access_flags reg_access_flags;
/* reserved */
-   uint32_t reserved[256-26];
+   uint32_t reserved[256-27];
 };
 
 struct amd_sriov_msg_vf2pf_info_header {
-- 
2.17.1
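
Editor's sketch, not part of the patch: a minimal illustration of how a guest might translate the new PF2VF bitfield into the AMDGIM_REG_ACCESS_FLAG mask. The union and the three mask values are copied from the diff above; make_reg_access_flags() and reg_access_caps() are hypothetical helpers, and nothing here is claimed to match what the driver actually does with these flags.

```c
#include <stdint.h>

/* Copied from the patch: one dword, which is why reserved[] loses one entry. */
union amd_sriov_reg_access_flags {
	struct {
		uint32_t vf_reg_access_ih    : 1;
		uint32_t vf_reg_access_mmhub : 1;
		uint32_t vf_reg_access_gc    : 1;
		uint32_t reserved            : 29;
	} flags;
	uint32_t all;
};
_Static_assert(sizeof(union amd_sriov_reg_access_flags) == sizeof(uint32_t),
	       "flags union should occupy exactly one dword");

/* Mask values copied from enum AMDGIM_REG_ACCESS_FLAG in the patch. */
enum amdgim_reg_access_flag {
	AMDGIM_FEATURE_IH_REG_PSP_EN = (1 << 0),
	AMDGIM_FEATURE_RLC_MMHUB_EN  = (1 << 1),
	AMDGIM_FEATURE_RLC_GC_EN     = (1 << 2),
};

/* Hypothetical helper: build the host-side flags from three booleans. */
static union amd_sriov_reg_access_flags
make_reg_access_flags(int ih, int mmhub, int gc)
{
	union amd_sriov_reg_access_flags f;

	f.all = 0;
	f.flags.vf_reg_access_ih = ih ? 1 : 0;
	f.flags.vf_reg_access_mmhub = mmhub ? 1 : 0;
	f.flags.vf_reg_access_gc = gc ? 1 : 0;
	return f;
}

/* Hypothetical helper: translate the bitfield view into the guest mask. */
static uint32_t reg_access_caps(union amd_sriov_reg_access_flags f)
{
	uint32_t caps = 0;

	if (f.flags.vf_reg_access_ih)
		caps |= AMDGIM_FEATURE_IH_REG_PSP_EN;
	if (f.flags.vf_reg_access_mmhub)
		caps |= AMDGIM_FEATURE_RLC_MMHUB_EN;
	if (f.flags.vf_reg_access_gc)
		caps |= AMDGIM_FEATURE_RLC_GC_EN;
	return caps;
}
```

Reading the named bitfields, instead of masking the raw `all` word, keeps the sketch independent of how the compiler orders bitfields within the dword.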



[PATCH] drm/amd/display: Use DRM_DEBUG_DP

2021-03-23 Thread Luben Tuikov
Convert IRQ-based prints from DRM_DEBUG_DRIVER to
DRM_DEBUG_DP, as the latter is not used in drm/amd
prior to this patch and since IRQ-based prints
drown out the rest of the driver's
DRM_DEBUG_DRIVER messages.

Cc: Harry Wentland 
Cc: Alex Deucher 
Signed-off-by: Luben Tuikov 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 57 +--
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f455fc3aa561..9376d44ce3b4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -449,9 +449,9 @@ static void dm_pflip_high_irq(void *interrupt_params)
amdgpu_crtc->pflip_status = AMDGPU_FLIP_NONE;
spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);
 
-   DRM_DEBUG_DRIVER("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",
-amdgpu_crtc->crtc_id, amdgpu_crtc,
-vrr_active, (int) !e);
+   DRM_DEBUG_KMS("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",
+ amdgpu_crtc->crtc_id, amdgpu_crtc,
+ vrr_active, (int) !e);
 }
 
 static void dm_vupdate_high_irq(void *interrupt_params)
@@ -993,8 +993,7 @@ static void event_mall_stutter(struct work_struct *work)
dc_allow_idle_optimizations(
dm->dc, dm->active_vblank_irq_count == 0);
 
-   DRM_DEBUG_DRIVER("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);
-
+   DRM_DEBUG_KMS("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);
 
mutex_unlock(&dm->dc_lock);
 }
@@ -1810,8 +1809,8 @@ static void dm_gpureset_toggle_interrupts(struct 
amdgpu_device *adev,
if (acrtc && state->stream_status[i].plane_count != 0) {
irq_source = IRQ_TYPE_PFLIP + acrtc->otg_inst;
rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) 
? 0 : -EBUSY;
-   DRM_DEBUG("crtc %d - vupdate irq %sabling: r=%d\n",
- acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_VBL("crtc %d - vupdate irq %sabling: r=%d\n",
+ acrtc->crtc_id, enable ? "en" : "dis", 
rc);
if (rc)
DRM_WARN("Failed to %s pflip interrupts\n",
 enable ? "enable" : "disable");
@@ -4966,8 +4965,8 @@ static void update_stream_scaling_settings(const struct 
drm_display_mode *mode,
stream->src = src;
stream->dst = dst;
 
-   DRM_DEBUG_DRIVER("Destination Rectangle x:%d  y:%d  width:%d  
height:%d\n",
-   dst.x, dst.y, dst.width, dst.height);
+   DRM_DEBUG_KMS("Destination Rectangle x:%d  y:%d  width:%d  height:%d\n",
+ dst.x, dst.y, dst.width, dst.height);
 
 }
 
@@ -5710,8 +5709,8 @@ static inline int dm_set_vupdate_irq(struct drm_crtc 
*crtc, bool enable)
 
rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) ? 0 : -EBUSY;
 
-   DRM_DEBUG_DRIVER("crtc %d - vupdate irq %sabling: r=%d\n",
-acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_VBL("crtc %d - vupdate irq %sabling: r=%d\n",
+ acrtc->crtc_id, enable ? "en" : "dis", rc);
return rc;
 }
 
@@ -6664,7 +6663,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
int r;
 
if (!new_state->fb) {
-   DRM_DEBUG_DRIVER("No FB bound\n");
+   DRM_DEBUG_KMS("No FB bound\n");
return 0;
}
 
@@ -7896,11 +7895,11 @@ static void handle_cursor_update(struct drm_plane 
*plane,
if (!plane->state->fb && !old_plane_state->fb)
return;
 
-   DRM_DEBUG_DRIVER("%s: crtc_id=%d with size %d to %d\n",
-__func__,
-amdgpu_crtc->crtc_id,
-plane->state->crtc_w,
-plane->state->crtc_h);
+   DRM_DEBUG_KMS("%s: crtc_id=%d with size %d to %d\n",
+ __func__,
+ amdgpu_crtc->crtc_id,
+ plane->state->crtc_w,
+ plane->state->crtc_h);
 
ret = get_cursor_position(plane, crtc, &position);
if (ret)
@@ -7958,8 +7957,8 @@ static void prepare_flip_isr(struct amdgpu_crtc *acrtc)
/* Mark this event as consumed */
acrtc->base.state->event = NULL;
 
-   DRM_DEBUG_DRIVER("crtc:%d, pflip_stat:AMDGPU_FLIP_SUBMITTED\n",
-acrtc->crtc_id);
+   DRM_DEBUG_KMS("crtc:%d, pflip_stat:AMDGPU_FLIP_SUBMITTED\n",
+ acrtc->crtc_id);
 }
 
 static void update_freesync_state_on_stream(
@@ -8265,7 +8264,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_ato

Re: [PATCH] drm/amdgpu: Ensure that the modifier requested is supported by plane.

2021-03-23 Thread Mark Yacoub
On Tue, Mar 23, 2021 at 11:02 AM Alex Deucher  wrote:
>
> On Wed, Mar 10, 2021 at 11:15 AM Mark Yacoub  wrote:
> >
> > From: Mark Yacoub 
> >
> > On initializing the framebuffer, call drm_any_plane_has_format to do a
> > check if the modifier is supported. drm_any_plane_has_format calls
> > dm_plane_format_mod_supported which is extended to validate that the
> > modifier is on the list of the plane's supported modifiers.
> >
> > The bug was caught using igt-gpu-tools test: 
> > kms_addfb_basic.addfb25-bad-modifier
> >
> > Tested on ChromeOS Zork by turning on the display, running an overlay
> > test, and running a YT video.
> >
> > Cc: Alex Deucher 
> > Cc: Bas Nieuwenhuizen 
> > Signed-off-by: Mark Yacoub 
>
> I'm not an expert with modifiers yet.  Will this break chips which
> don't currently support modifiers?
No, it shouldn't. When you don't support modifiers yet, you will
default to the linear modifier (DRM_FORMAT_MOD_LINEAR),
which is later checked in amdgpu_dm.c::dm_plane_format_mod_supported():
/*
* We always have to allow this modifier, because core DRM still
* checks LINEAR support if userspace does not provide modifiers.
*/
if (modifier == DRM_FORMAT_MOD_LINEAR)
return true;

>
> Alex
>
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 13 +
> >  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  9 +
> >  2 files changed, 22 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > index afa5f8ad0f563..a947b5aa420d2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> > @@ -908,6 +908,19 @@ int amdgpu_display_gem_fb_verify_and_init(
> >  &amdgpu_fb_funcs);
> > if (ret)
> > goto err;
> > +   /* Verify that the modifier is supported. */
> > +   if (!drm_any_plane_has_format(dev, mode_cmd->pixel_format,
> > + mode_cmd->modifier[0])) {
> > +   struct drm_format_name_buf format_name;
> > +   drm_dbg_kms(dev,
> > +   "unsupported pixel format %s / modifier 
> > 0x%llx\n",
> > +   drm_get_format_name(mode_cmd->pixel_format,
> > +   &format_name),
> > +   mode_cmd->modifier[0]);
> > +
> > +   ret = -EINVAL;
> > +   goto err;
> > +   }
> >
> > ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
> > if (ret)
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 961abf1cf040c..21314024a83ce 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -3939,6 +3939,7 @@ static bool dm_plane_format_mod_supported(struct 
> > drm_plane *plane,
> >  {
> > struct amdgpu_device *adev = drm_to_adev(plane->dev);
> > const struct drm_format_info *info = drm_format_info(format);
> > +   int i;
> >
> > enum dm_micro_swizzle microtile = 
> > modifier_gfx9_swizzle_mode(modifier) & 3;
> >
> > @@ -3952,6 +3953,14 @@ static bool dm_plane_format_mod_supported(struct 
> > drm_plane *plane,
> > if (modifier == DRM_FORMAT_MOD_LINEAR)
> > return true;
> >
> > +   /* Check that the modifier is on the list of the plane's supported 
> > modifiers. */
> > +   for (i = 0; i < plane->modifier_count; i++) {
> > +   if (modifier == plane->modifiers[i])
> > +   break;
> > +   }
> > +   if (i == plane->modifier_count)
> > +   return false;
> > +
> > /*
> >  * The arbitrary tiling support for multiplane formats has not been 
> > hooked
> >  * up.
> > --
> > 2.30.1.766.gb4fecdf3b7-goog
> >
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
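
Editor's sketch, not part of the thread: the check discussed above (always accept LINEAR, otherwise require the modifier to be on the plane's list) can be tried in user space as follows. struct plane_sketch, the example modifier values and plane_supports_modifier() are hypothetical stand-ins, and MOD_LINEAR is assumed to be 0 like DRM_FORMAT_MOD_LINEAR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for DRM_FORMAT_MOD_LINEAR, assumed to be 0 here. */
#define MOD_LINEAR UINT64_C(0)

/* Hypothetical reduction of a plane to its modifier list. */
struct plane_sketch {
	const uint64_t *modifiers;
	size_t modifier_count;
};

/* Made-up modifier values, for illustration only. */
static const uint64_t example_mods[] = {
	UINT64_C(0x0200000000000001),
	UINT64_C(0x0200000000000002),
};
static const struct plane_sketch example_plane = { example_mods, 2 };

/* A plane with no modifier list, as on chips without modifier support. */
static const struct plane_sketch legacy_plane = { NULL, 0 };

static bool plane_supports_modifier(const struct plane_sketch *plane,
				    uint64_t modifier)
{
	size_t i;

	/* LINEAR is always accepted, as in the snippet quoted in the reply. */
	if (modifier == MOD_LINEAR)
		return true;

	/* Otherwise the modifier has to be on the plane's list. */
	for (i = 0; i < plane->modifier_count; i++) {
		if (plane->modifiers[i] == modifier)
			return true;
	}
	return false;
}
```

The legacy_plane case corresponds to Alex's question: with an empty list, LINEAR still passes because it is checked before the loop, and every other modifier is rejected.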


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Tue 23-03-21 14:56:54, Christian König wrote:
> Am 23.03.21 um 14:41 schrieb Michal Hocko:
[...]
> > Anyway, I am wondering whether the overall approach is sound. Why don't
> > you simply use shmem as your backing storage from the beginning and pin
> > those pages if they are used by the device?
> 
> Yeah, that is exactly what the Intel guys are doing for their integrated
> GPUs :)
> 
> Problem is for TTM I need to be able to handle dGPUs and those have all
> kinds of funny allocation restrictions. In other words I need to guarantee
> that the allocated memory is coherent accessible to the GPU without using
> SWIOTLB.
> 
> The simple case is that the device can only do DMA32, but you also got
> device which can only do 40bits or 48bits.
> 
> On top of that you also got AGP, CMA and stuff like CPU cache behavior
> changes (write back vs. write through, vs. uncached).

OK, so the underlying problem seems to be that the gfp mask (and thus
mapping_gfp_mask) cannot really reflect your requirements, right? Would
it help if shmem allowed you to provide an allocation callback to
override alloc_page_vma, which is used currently? I am pretty sure there
will be more to handle, but going through shmem for the whole lifetime
is just so much easier to reason about than some tricks to abuse shmem
just for the swapout path.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Tue 23-03-21 14:06:25, Christian König wrote:
> Am 23.03.21 um 13:37 schrieb Michal Hocko:
> > On Tue 23-03-21 13:21:32, Christian König wrote:
[...]
> > > Ideally I would like to be able to trigger swapping out the shmem page I
> > > allocated immediately after doing the copy.
> > So let me try to rephrase to make sure I understand. You would like to
> > swap out the existing content from the shrinker and you use shmem as a
> > way to achieve that. The swapout should happen at the time of copying
> > (shrinker context) or shortly afterwards?
> > 
> > So effectively to call pageout() on the shmem page after the copy?
> 
> Yes, exactly that.

OK, good. I see what you are trying to achieve now. I do not think we
would want to allow pageout from the shrinker's context but what you can
do is to instantiate the shmem page into the tail of the inactive list
so the next reclaim attempt will swap it out (assuming swap is available
of course).

This is not really something that our existing infrastructure gives you
though, I am afraid. There is no way to tell that a newly allocated shmem
page should in fact be cold and the first one to swap out. But there are
people more familiar with shmem and its peculiarities so I might be wrong
here.

Anyway, I am wondering whether the overall approach is sound. Why don't
you simply use shmem as your backing storage from the beginning and pin
those pages if they are used by the device?

-- 
Michal Hocko
SUSE Labs


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Tue 23-03-21 14:15:05, Daniel Vetter wrote:
> On Tue, Mar 23, 2021 at 01:04:03PM +0100, Michal Hocko wrote:
> > On Tue 23-03-21 12:48:58, Christian König wrote:
> > > Am 23.03.21 um 12:28 schrieb Daniel Vetter:
> > > > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > > > > I think this is where I don't get yet what Christian tries to do: We
> > > > > really shouldn't do different tricks and calling contexts between 
> > > > > direct
> > > > > reclaim and kswapd reclaim. Otherwise very hard to track down bugs are
> > > > > pretty much guaranteed. So whether we use explicit gfp flags or the
> > > > > context apis, result is exactly the same.
> > > 
> > > Ok let us recap what TTMs TT shrinker does here:
> > > 
> > > 1. We got memory which is not swapable because it might be accessed by the
> > > GPU at any time.
> > > 2. Make sure the memory is not accessed by the GPU and driver need to 
> > > grab a
> > > lock before they can make it accessible again.
> > > 3. Allocate a shmem file and copy over the not swapable pages.
> > 
> > This is quite tricky because the shrinker operates in the PF_MEMALLOC
> > context so such an allocation would be allowed to completely deplete
> > memory unless you explicitly mark that context as __GFP_NOMEMALLOC. Also
> > note that if the allocation cannot succeed it will not trigger reclaim
> > again because you are already called from the reclaim context.
> 
> [Limiting to that discussion]
> 
> Yes it's not emulating real (direct) reclaim correctly, but ime the
> biggest issue with direct reclaim is when you do mutex_lock instead of
> mutex_trylock or in general block on stuff that you cant. And lockdep +
> fs_reclaim annotations gets us that, so pretty good to make sure our
> shrinker is correct.

I have to confess that I managed to (happily) forget all the nasty
details about fs_reclaim lockdep internals so I am not sure the use by
the proposed patch is actually reasonable. Talk to the lockdep guys about
that and make sure to put a big fat comment explaining what is going on.

In general allocating from the reclaim context is a bad idea and you
should avoid that. As already said a simple allocation request from the
reclaim context is not constrained and it will not recurse back into
the reclaim. Calling into shmem from the shrinker context might be
really tricky as well. I am not even sure this is possible for anything
other than full (GFP_KERNEL) reclaim context.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH] drm/dp_mst: Enhance DP MST topology logging

2021-03-23 Thread Lyude Paul
Sorry for the wait! Review comments below

On Thu, 2021-03-18 at 11:55 -0400, Eryk Brol wrote:
> [why]
> MST topology print was missing fec logging and pdt printed
> as an int wasn't clear. vcpi and payload info were also logged as an
> arbitrary series of ints which require the user to know the ordering
> of the prints, making the logs difficult to use.
> 
> [how]
> -add fec logging
> -add pdt parsing into strings
> -format vcpi and payload info into tables with headings
> -clean up topology prints
> 
> Signed-off-by: Eryk Brol 
> ---
>  drivers/gpu/drm/drm_dp_mst_topology.c | 67 ---
>  1 file changed, 51 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> b/drivers/gpu/drm/drm_dp_mst_topology.c
> index 932c4641ec3e..3afeaa59cbaa 100644
> --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> @@ -4720,6 +4720,24 @@ static void drm_dp_mst_kick_tx(struct
> drm_dp_mst_topology_mgr *mgr)
> queue_work(system_long_wq, &mgr->tx_work);
>  }
>  
> +static char *pdt_to_string(u8 pdt)

Let's make this static const char *
> +{
> +   switch (pdt) {
> +   case DP_PEER_DEVICE_NONE:
> +   return "NONE";
> +   case DP_PEER_DEVICE_SOURCE_OR_SST:
> +   return "SOURCE OR SST";
> +   case DP_PEER_DEVICE_MST_BRANCHING:
> +   return "MST BRANCHING";
> +   case DP_PEER_DEVICE_SST_SINK:
> +   return "SST SINK";
> +   case DP_PEER_DEVICE_DP_LEGACY_CONV:
> +   return "DP LEGACY CONV";
> +   default:
> +   return "ERR";
> +   }
> +}
> +
>  static void drm_dp_mst_dump_mstb(struct seq_file *m,
>  struct drm_dp_mst_branch *mstb)
>  {
> @@ -4732,9 +4750,20 @@ static void drm_dp_mst_dump_mstb(struct seq_file *m,
> prefix[i] = '\t';
> prefix[i] = '\0';
>  
> -   seq_printf(m, "%smst: %p, %d\n", prefix, mstb, mstb->num_ports);
> +   seq_printf(m, "%smstb - [%p]: num_ports: %d\n", prefix, mstb, mstb-
> >num_ports);
> list_for_each_entry(port, &mstb->ports, next) {
> -   seq_printf(m, "%sport: %d: input: %d: pdt: %d, ddps: %d ldps:
> %d, sdp: %d/%d, %p, conn: %p\n", prefix, port->port_num, port->input, 
> port->pdt,
> port->ddps, port->ldps, port->num_sdp_streams, port->num_sdp_stream_sinks, 
> port,
> port->connector);
> +   seq_printf(m, "%sport %d - [%p] (%s - %s): ddps: %d, ldps: %d,
> sdp: %d/%d, fec: %s, conn: %p\n",
> +   prefix,
> +   port->port_num,
> +   port,
> +   port->input ? "input" : "output",
> +   pdt_to_string(port->pdt),
> +   port->ddps,
> +   port->ldps,
> +   port->num_sdp_streams,
> +   port->num_sdp_stream_sinks,
> +   port->fec_capable ? "true" : "false",
> +   port->connector);

The indenting here is wrong, "i," and all the lines up until the end of the
function call should be aligned to one col after the starting parenthesis

So like this:

seq_printf(m, "foo%d",
   bar);

> if (port->mstb)
> drm_dp_mst_dump_mstb(m, port->mstb);
> }
> @@ -4787,33 +4816,39 @@ void drm_dp_mst_dump_topology(struct seq_file *m,
> mutex_unlock(&mgr->lock);
>  
> mutex_lock(&mgr->payload_lock);
> -   seq_printf(m, "vcpi: %lx %lx %d\n", mgr->payload_mask, mgr->vcpi_mask,
> -   mgr->max_payloads);
> +   seq_printf(m, "\n *** VCPI Info ***\npayload_mask: %lx, vcpi_mask: 
> %lx,
> max_payloads: %d\n",

There's an extra space between the \n and the ***
> +   mgr->payload_mask,
> +   mgr->vcpi_mask,
> +   mgr->max_payloads);

IMHO I would do the seq_printf() here like this instead:

seq_printf(m, "\n*** VCPI Info ***\n");
seq_printf(m, "payload_mask: %lx, vcpi_mask: %lx, max_payloads: %d\n", ...);

Just to make things a bit easier to read

>  
> +   seq_printf(m, "\n|   idx   |  port # |  vcp_id | # slots | sink
> name |\n");
> for (i = 0; i < mgr->max_payloads; i++) {
> if (mgr->proposed_vcpis[i]) {
> char name[14];
>  
> port = container_of(mgr->proposed_vcpis[i], struct
> drm_dp_mst_port, vcpi);
> fetch_monitor_name(mgr, port, name, sizeof(name));
> -   seq_printf(m, "vcpi %d: %d %d %d sink name: %s\n", i,
> -  port->port_num, port->vcpi.vcpi,
> -  port->vcpi.num_slots,
> -  (*name != 0) ? name :  "Unknown");
> +   seq_printf(m, "%10d%10d%10d%10d%20s\n",
>
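
Editor's sketch, not part of the review: the first comment above asks for pdt_to_string() to return `static const char *`, since the function only hands out string literals. A user-space version with that signature could look like the following. The PDT_* enum is a local stand-in for the DP_PEER_DEVICE_* constants, and its numeric values are assumed for illustration.

```c
#include <stdint.h>

/* Local stand-ins for DP_PEER_DEVICE_*; the numeric values are assumed. */
enum pdt_sketch {
	PDT_NONE           = 0x0,
	PDT_SOURCE_OR_SST  = 0x1,
	PDT_MST_BRANCHING  = 0x2,
	PDT_SST_SINK       = 0x3,
	PDT_DP_LEGACY_CONV = 0x4,
};

/* Returns a pointer to a string literal, hence the const return type. */
static const char *pdt_to_string(uint8_t pdt)
{
	switch (pdt) {
	case PDT_NONE:
		return "NONE";
	case PDT_SOURCE_OR_SST:
		return "SOURCE OR SST";
	case PDT_MST_BRANCHING:
		return "MST BRANCHING";
	case PDT_SST_SINK:
		return "SST SINK";
	case PDT_DP_LEGACY_CONV:
		return "DP LEGACY CONV";
	default:
		return "ERR";
	}
}
```

With the const qualifier, a caller that tries to write through the returned pointer gets a compiler diagnostic instead of undefined behaviour at run time.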

Re: [PATCH] drm/amdgpu/display: fix clkmgr for SI

2021-03-23 Thread Nirmoy

Found the fixes tag for this:

Fixes: f4a5cbdcb1 ("drm/amd/display: hide VGH asic specific structs")


The series is Acked-by: Nirmoy Das 



On 3/23/21 6:06 PM, Alex Deucher wrote:

It looks like the SI case was missed.  Need to return
the clkmgr struct for SI.

Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
index c81da30faf03..7d6c68c5dea9 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
@@ -136,6 +136,7 @@ struct clk_mgr *dc_clk_mgr_create(struct dc_context *ctx, 
struct pp_smu_funcs *p
}
dce60_clk_mgr_construct(ctx, clk_mgr);
dce_clk_mgr_construct(ctx, clk_mgr);
+   return &clk_mgr->base;
}
  #endif
case FAMILY_CI:



Re: [PATCH][next] drm/amd/display/dc/calcs/dce_calcs: Fix allocation size for dceip and vbios

2021-03-23 Thread Lee Jones
On Tue, 23 Mar 2021, Colin King wrote:

> From: Colin Ian King 
> 
> Currently the allocations for dceip and vbios are based on the size of
> the pointer rather than the size of the data structures, causing heap
> issues. Fix this by using the correct allocation sizes.
> 
> Addresses-Coverity: ("Wrong size of argument")
> Fixes: a2a855772210 ("drm/amd/display/dc/calcs/dce_calcs: Remove some large 
> variables from the stack")
> Signed-off-by: Colin Ian King 
> ---
>  drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Fixed already mate.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog
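
Editor's sketch, not part of the thread: the quoted commit message describes allocations sized by the pointer instead of by the structure. The following user-space sketch shows the difference and the usual sizeof(*ptr) idiom. struct big_params and the helper names are hypothetical and unrelated to the real dceip and vbios structures.

```c
#include <stdlib.h>

/* Hypothetical structure standing in for a large parameter block. */
struct big_params {
	int values[64];
};

/* Buggy pattern: sizeof(p) is the size of the pointer, not the struct. */
static size_t wrong_alloc_size(void)
{
	struct big_params *p = NULL;

	return sizeof(p);
}

/* Fixed pattern: sizeof(*p) follows the pointee type automatically. */
static size_t right_alloc_size(void)
{
	struct big_params *p = NULL;

	return sizeof(*p);
}

/* Allocation using the fixed pattern; the caller frees the result. */
static struct big_params *alloc_params(void)
{
	struct big_params *p = calloc(1, sizeof(*p));

	return p;
}
```

Sizing by sizeof(*p) keeps the allocation correct even if the type of p changes later, which is why it is the commonly preferred form.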


[PATCH] drm/amdgpu/display: fix memory leak for dimgrey cavefish

2021-03-23 Thread Alex Deucher
We need to clean up the dcn3 clk_mgr.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
index 203150dd37f6..c81da30faf03 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
@@ -274,6 +274,9 @@ void dc_destroy_clk_mgr(struct clk_mgr *clk_mgr_base)
if (ASICREV_IS_SIENNA_CICHLID_P(clk_mgr_base->ctx->asic_id.hw_internal_rev)) {
dcn3_clk_mgr_destroy(clk_mgr);
}
+   if (ASICREV_IS_DIMGREY_CAVEFISH_P(clk_mgr_base->ctx->asic_id.hw_internal_rev)) {
+   dcn3_clk_mgr_destroy(clk_mgr);
+   }
break;
 
case FAMILY_VGH:
-- 
2.30.2



[PATCH] drm/amdgpu: remove unused variables

2021-03-23 Thread Alex Deucher
Leftover from the GPU reset rework.  Remove them.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 324b9e6b2965..5d73943797f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4694,7 +4694,6 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
  struct amdgpu_job *job)
 {
struct list_head device_list, *device_list_handle =  NULL;
-   bool need_full_reset = false;
bool job_signaled = false;
struct amdgpu_hive_info *hive = NULL;
struct amdgpu_device *tmp_adev = NULL;
@@ -5234,7 +5233,6 @@ pci_ers_result_t amdgpu_pci_slot_reset(struct pci_dev 
*pdev)
struct amdgpu_device *adev = drm_to_adev(dev);
int r, i;
struct amdgpu_reset_context reset_context;
-   bool need_full_reset = true;
u32 memsize;
struct list_head device_list;
 
-- 
2.30.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu/display: fix clkmgr for SI

2021-03-23 Thread Alex Deucher
It looks like the SI case was missed.  Need to return
the clkmgr struct for SI.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
index c81da30faf03..7d6c68c5dea9 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c
@@ -136,6 +136,7 @@ struct clk_mgr *dc_clk_mgr_create(struct dc_context *ctx, 
struct pp_smu_funcs *p
}
dce60_clk_mgr_construct(ctx, clk_mgr);
dce_clk_mgr_construct(ctx, clk_mgr);
+   return &clk_mgr->base;
}
 #endif
case FAMILY_CI:
-- 
2.30.2
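
Judging from the diff context, the SI case is followed directly by
"case FAMILY_CI:", so without the added return control would fall through
into the next case. A minimal userspace sketch of that failure mode (the
model_* names and the with_fix switch are hypothetical, chosen only for
illustration, and are not part of the driver):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ASIC families handled by the switch. */
enum model_family { MODEL_FAMILY_SI, MODEL_FAMILY_CI, MODEL_FAMILY_OTHER };

struct model_clk_mgr { int constructed_by; };

static struct model_clk_mgr si_mgr, ci_mgr;

/*
 * with_fix != 0 models the patched code (the SI case returns its own
 * manager); with_fix == 0 models the missing return.
 */
static struct model_clk_mgr *model_create(enum model_family f, int with_fix)
{
    switch (f) {
    case MODEL_FAMILY_SI:
        si_mgr.constructed_by = MODEL_FAMILY_SI;
        if (with_fix)
            return &si_mgr;
        /* Without the return, control continues into the next case. */
        /* fall through */
    case MODEL_FAMILY_CI:
        ci_mgr.constructed_by = MODEL_FAMILY_CI;
        return &ci_mgr;
    default:
        return NULL;
    }
}
```

Calling model_create(MODEL_FAMILY_SI, 0) hands back the CI object, while
model_create(MODEL_FAMILY_SI, 1) returns the SI one.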

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Tue 23-03-21 13:21:32, Christian König wrote:
> On 23.03.21 at 13:04, Michal Hocko wrote:
> > On Tue 23-03-21 12:48:58, Christian König wrote:
> > > On 23.03.21 at 12:28, Daniel Vetter wrote:
> > > > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > > > > On Mon 22-03-21 20:34:25, Christian König wrote:
> > [...]
> > > > > > My only concern is that if I could rely on memalloc_no* being used 
> > > > > > we could
> > > > > > optimize this quite a bit further.
> > > > > Yes you can use the scope API and you will be guaranteed that _any_
> > > > > allocation from the enclosed context will inherit GFP_NO* semantic.
> > > The question is if this is also guaranteed the other way around?
> > > 
> > > In other words if somebody calls get_free_page(GFP_NOFS) are the context
> > > flags set as well?
> > gfp mask is always restricted in the page allocator. So say you have
> > noio scope context and call get_free_page/kmalloc(GFP_NOFS) then the
> > scope would restrict the allocation flags to GFP_NOIO (aka drop
> > __GFP_IO). For further details, have a look at current_gfp_context
> > and its callers.
> > 
> > Does this answer your question?
> 
> But what happens if you don't have noio scope and somebody calls
> get_free_page(GFP_NOFS)?

Then this will be a regular NOFS request. Let me repeat: the scope API will
further restrict any requested allocation mode.

> Is then the noio scope added automatically? And is it possible that the
> shrinker gets called without noio scope even we would need it?

Here you have lost me again.

> > > > > I think this is where I don't get yet what Christian tries to do: We
> > > > > really shouldn't do different tricks and calling contexts between 
> > > > > direct
> > > > > reclaim and kswapd reclaim. Otherwise very hard to track down bugs are
> > > > > pretty much guaranteed. So whether we use explicit gfp flags or the
> > > > > context apis, result is exactly the same.
> > > Ok let us recap what TTMs TT shrinker does here:
> > > 
> > > 1. We got memory which is not swapable because it might be accessed by the
> > > GPU at any time.
> > > 2. Make sure the memory is not accessed by the GPU and driver need to 
> > > grab a
> > > lock before they can make it accessible again.
> > > 3. Allocate a shmem file and copy over the not swapable pages.
> > This is quite tricky because the shrinker operates in the PF_MEMALLOC
> > context so such an allocation would be allowed to completely deplete
> > memory unless you explicitly mark that context as __GFP_NOMEMALLOC.
> 
> Thanks, exactly that was one thing I was absolutely not sure about. And yes
> I agree that this is really tricky.
> 
> Ideally I would like to be able to trigger swapping out the shmem page I
> allocated immediately after doing the copy.

So let me try to rephrase to make sure I understand. You would like to
swap out the existing content from the shrinker and you use shmem as a
way to achieve that. The swapout should happen at the time of copying
(shrinker context) or shortly afterwards?

So effectively to call pageout() on the shmem page after the copy?
 
> This way I would only need a single page for the whole shrink operation at
> any given time.

What do you mean by that? You want to share the same shmem page for
other copy+swapout?
-- 
Michal Hocko
SUSE Labs
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Tue 23-03-21 12:51:13, Christian König wrote:
> 
> 
> On 23.03.21 at 12:46, Michal Hocko wrote:
> > On Tue 23-03-21 12:28:20, Daniel Vetter wrote:
> > > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > [...]
> > > > > > fs_reclaim_acquire is there to make sure lockdep understands that 
> > > > > > this
> > > > > > is a shrinker and that it checks all the dependencies for us like if
> > > > > > we'd be in real reclaim. There is some drop caches interfaces in 
> > > > > > proc
> > > > > > iirc, but those drop everything, and they don't have the fs_reclaim
> > > > > > annotations to teach lockdep about what we're doing.
> > > > ... I really do not follow this. You shouldn't really care whether this
> > > > is a reclaim interface or not. Or maybe I just do not understand this...
> > > We're heavily relying on lockdep and fs_reclaim to make sure we get it all
> > > right. So any drop caches interface that isn't wrapped in fs_reclaim
> > > context is kinda useless for testing. Plus ideally we want to only hit our
> > > own paths, and not trash every other cache in the system. Speed matters in
> > > CI.
> > But what is special about this path to hack around and make it pretend
> > it is part of the fs reclaim path?
> 
> That's just to teach lockdep that there is a dependency.
> 
> In other words we pretend in the debugfs file that it is part of the fs
> reclaim path to check for the case when it really becomes part of the fs
> reclaim path.

OK, our emails crossed and I saw your response only after replying
to your other email. OK, this makes more sense now. But as pointed out in
the other email this will likely not do what you think. Let's continue
discussing in the other subthread to reduce further confusion.
-- 
Michal Hocko
SUSE Labs
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Tue 23-03-21 12:28:20, Daniel Vetter wrote:
> On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
[...]
> > > > fs_reclaim_acquire is there to make sure lockdep understands that this
> > > > is a shrinker and that it checks all the dependencies for us like if
> > > > we'd be in real reclaim. There is some drop caches interfaces in proc
> > > > iirc, but those drop everything, and they don't have the fs_reclaim
> > > > annotations to teach lockdep about what we're doing.
> > 
> > ... I really do not follow this. You shouldn't really care whether this
> > is a reclaim interface or not. Or maybe I just do not understand this...
> 
> We're heavily relying on lockdep and fs_reclaim to make sure we get it all
> right. So any drop caches interface that isn't wrapped in fs_reclaim
> context is kinda useless for testing. Plus ideally we want to only hit our
> own paths, and not trash every other cache in the system. Speed matters in
> CI.

But what is special about this path to hack around and make it pretend
it is part of the fs reclaim path?
-- 
Michal Hocko
SUSE Labs
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Tue 23-03-21 12:48:58, Christian König wrote:
> On 23.03.21 at 12:28, Daniel Vetter wrote:
> > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > > On Mon 22-03-21 20:34:25, Christian König wrote:
[...]
> > > > My only concern is that if I could rely on memalloc_no* being used we 
> > > > could
> > > > optimize this quite a bit further.
> > > Yes you can use the scope API and you will be guaranteed that _any_
> > > allocation from the enclosed context will inherit GFP_NO* semantic.
> 
> The question is if this is also guaranteed the other way around?
> 
> In other words if somebody calls get_free_page(GFP_NOFS) are the context
> flags set as well?

gfp mask is always restricted in the page allocator. So say you have
noio scope context and call get_free_page/kmalloc(GFP_NOFS) then the
scope would restrict the allocation flags to GFP_NOIO (aka drop
__GFP_IO). For further details, have a look at current_gfp_context
and its callers.

Does this answer your question?
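
As a minimal userspace sketch of that masking (not kernel code: the MODEL_*
flag values, struct model_task and model_gfp_context() are hypothetical
stand-ins for the real gfp bits, the task's scope flags and
current_gfp_context()), assuming a noio scope drops both the IO and FS bits
and a nofs scope drops only the FS bit:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real kernel values differ. */
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u

#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_NOIO   (MODEL_GFP_RECLAIM)

/* Models the per-task scope state set by a memalloc_no*_save() call. */
struct model_task {
    bool noio_scope;
    bool nofs_scope;
};

/* The scope can only clear bits from the caller's mask, never add any. */
static unsigned int model_gfp_context(const struct model_task *t,
                                      unsigned int flags)
{
    if (t->noio_scope)
        flags &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    else if (t->nofs_scope)
        flags &= ~MODEL_GFP_FS;
    return flags;
}

/* Checks the two cases discussed in this thread. */
static void model_selftest(void)
{
    struct model_task noio = { .noio_scope = true };
    struct model_task none = { .noio_scope = false };

    /* NOFS request inside a noio scope is narrowed to NOIO. */
    assert(model_gfp_context(&noio, MODEL_GFP_NOFS) == MODEL_GFP_NOIO);
    /* NOFS request without any scope stays a regular NOFS request. */
    assert(model_gfp_context(&none, MODEL_GFP_NOFS) == MODEL_GFP_NOFS);
}
```

The point of the model is the direction of the restriction: an explicit
GFP_NOFS argument does not set any scope for nested allocations, while an
active scope narrows every mask passed in underneath it.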

> > > I think this is where I don't get yet what Christian tries to do: We
> > > really shouldn't do different tricks and calling contexts between direct
> > > reclaim and kswapd reclaim. Otherwise very hard to track down bugs are
> > > pretty much guaranteed. So whether we use explicit gfp flags or the
> > > context apis, result is exactly the same.
> 
> Ok let us recap what TTMs TT shrinker does here:
> 
> 1. We got memory which is not swapable because it might be accessed by the
> GPU at any time.
> 2. Make sure the memory is not accessed by the GPU and driver need to grab a
> lock before they can make it accessible again.
> 3. Allocate a shmem file and copy over the not swapable pages.

This is quite tricky because the shrinker operates in the PF_MEMALLOC
context so such an allocation would be allowed to completely deplete
memory unless you explicitly mark that context as __GFP_NOMEMALLOC. Also
note that if the allocation cannot succeed it will not trigger reclaim
again because you are already called from the reclaim context.

> 4. Free the not swapable/reclaimable pages.
> 
> The pages we got from the shmem file are easily swapable to disk after the
> copy is completed. But only if IO is not already blocked because the
> shrinker was called from an allocation restricted by GFP_NOFS or GFP_NOIO.

Sorry for being dense here but I still do not follow the actual problem
(well, except for the above mentioned one). Is the sole point of this to
emulate a GFP_NO* allocation context and see how shrinker behaves? 
-- 
Michal Hocko
SUSE Labs
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: use zero as start for dummy resource walks

2021-03-23 Thread Deucher, Alexander
[AMD Official Use Only - Internal Distribution Only]

Acked-by: Alex Deucher 

From: amd-gfx  on behalf of Christian 
König 
Sent: Tuesday, March 23, 2021 10:54 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Das, Nirmoy ; Chen, Guchun 
Subject: [PATCH 1/2] drm/amdgpu: use zero as start for dummy resource walks

When we don't have a physical backing store we should use zero instead
of the virtual start address, since that isn't necessarily a valid physical
one.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index 40f2adf305bc..e94362ccf9d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -54,7 +54,7 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
 struct drm_mm_node *node;

 if (!res || !res->mm_node) {
-   cur->start = start;
+   cur->start = 0;
 cur->size = size;
 cur->remaining = size;
 cur->node = NULL;
--
2.25.1
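
The effect of the change can be modelled in userspace as follows. This is a
simplified sketch, not the driver code: struct model_node, struct
model_cursor, model_first() and model_next() are hypothetical stand-ins for
drm_mm_node, amdgpu_res_cursor, amdgpu_res_first() and amdgpu_res_next(),
and all sizes are plain bytes rather than pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous chunk of backing store (hypothetical model type). */
struct model_node {
    uint64_t start;   /* physical start of the chunk, in bytes */
    uint64_t size;    /* length of the chunk, in bytes */
};

struct model_cursor {
    uint64_t start;      /* physical address of the current piece */
    uint64_t size;       /* contiguous bytes available at start */
    uint64_t remaining;  /* bytes left in the whole walk */
    const struct model_node *node;
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Position the cursor at byte `offset` of the resource, for `size` bytes. */
static void model_first(const struct model_node *nodes, uint64_t offset,
                        uint64_t size, struct model_cursor *cur)
{
    if (!nodes) {
        /* No backing store: offset is not a physical address, report 0. */
        cur->start = 0;
        cur->size = size;
        cur->remaining = size;
        cur->node = NULL;
        return;
    }
    /* Skip whole chunks that lie before the requested offset. */
    while (offset >= nodes->size) {
        offset -= nodes->size;
        nodes++;
    }
    cur->start = nodes->start + offset;
    cur->size = min_u64(nodes->size - offset, size);
    cur->remaining = size;
    cur->node = nodes;
}

/* Advance by `size` bytes, moving to the next chunk when one is used up. */
static void model_next(struct model_cursor *cur, uint64_t size)
{
    assert(size <= cur->remaining);
    cur->remaining -= size;
    if (!cur->remaining)
        return;
    cur->size -= size;
    if (cur->size) {
        cur->start += size;
        return;
    }
    cur->node++;
    cur->start = cur->node->start;
    cur->size = min_u64(cur->node->size, cur->remaining);
}
```

The model does no bounds checking, so the caller has to keep offset + size
within the total length of the node array. In the dummy case (nodes ==
NULL) the whole range is reported as a single piece starting at 0.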

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] amdgpu: fix gcc -Wrestrict warning

2021-03-23 Thread Rasmus Villemoes
On 23/03/2021 14.04, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> gcc warns about an sprintf() that uses the same buffer as source
> and destination, which is undefined behavior in C99:
> 
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c: In function 
> 'amdgpu_securedisplay_debugfs_write':
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:141:6: error: 'sprintf' 
> argument 3 overlaps destination object 'i2c_output' [-Werror=restrict]
>   141 |  sprintf(i2c_output, "%s 0x%X", i2c_output,
>   |  ^~
>   142 |   
> securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
>   |   
> ~
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:97:7: note: destination 
> object referenced by 'restrict'-qualified argument 1 was declared here
>97 |  char i2c_output[256];
>   |   ^~
> 
> Rewrite it to remember the current offset into the buffer instead.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> index 834440ab9ff7..69d7f6bff5d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> @@ -136,9 +136,10 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
> file *f, const char __u
>   ret = psp_securedisplay_invoke(psp, 
> TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);
>   if (!ret) {
>   if (securedisplay_cmd->status == 
> TA_SECUREDISPLAY_STATUS__SUCCESS) {
> + int pos = 0;
>   memset(i2c_output,  0, sizeof(i2c_output));
>   for (i = 0; i < 
> TA_SECUREDISPLAY_I2C_BUFFER_SIZE; i++)
> - sprintf(i2c_output, "%s 0x%X", 
> i2c_output,
> + pos += sprintf(i2c_output + pos, " 
> 0x%X",
>   
> securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
>   dev_info(adev->dev, "SECUREDISPLAY: I2C buffer 
> out put is :%s\n", i2c_output);

Eh, why not get rid of the 256 byte stack allocation and just replace
all of this by

  dev_info(adev->dev, "SECUREDISPLAY: I2C buffer out put is: %*ph\n",
TA_SECUREDISPLAY_I2C_BUFFER_SIZE,
securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf);

That's much less code (both in #LOC and .text), and avoids adding yet
another place that will be audited over and over for "hm, yeah, that
sprintf() is actually not gonna overflow".

Yeah, it'll lose the 0x prefixes for each byte and use lowercase hex chars.
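
For comparison, the offset-tracking approach from the patch can be sketched
in plain userspace C. This is only an illustration under assumptions:
format_bytes() is a hypothetical helper, and it uses snprintf() with an
explicit remaining-length check where the patch itself uses sprintf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Append " 0x%X" for each input byte, tracking the write offset instead
 * of passing the destination buffer back in as a "%s" source argument.
 * Returns the number of characters written, excluding the trailing NUL.
 */
static size_t format_bytes(char *out, size_t out_len,
                           const unsigned char *buf, size_t n)
{
    size_t pos = 0;
    size_t i;

    assert(out_len > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        int w = snprintf(out + pos, out_len - pos, " 0x%X",
                         (unsigned int)buf[i]);

        if (w < 0 || (size_t)w >= out_len - pos) {
            /* Entry does not fit: drop the partial write and stop. */
            out[pos] = '\0';
            break;
        }
        pos += (size_t)w;
    }
    return pos;
}
```

Source and destination never overlap here, and the length check keeps the
output inside the buffer even if the input grows.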

Rasmus
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/display: Use DRM_DEBUG_DP

2021-03-23 Thread Harry Wentland

Thanks for converting these away from _DRIVER.

On 2021-03-23 10:26 a.m., Alex Deucher wrote:

On Mon, Mar 22, 2021 at 8:31 PM Luben Tuikov  wrote:


Convert IRQ-based prints from DRM_DEBUG_DRIVER to
DRM_DEBUG_DP, as the latter is not used in drm/amd
prior to this patch and since IRQ-based prints
drown out the rest of the driver's
DRM_DEBUG_DRIVER messages.

Cc: Harry Wentland 
Cc: Alex Deucher 
Signed-off-by: Luben Tuikov 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 67 +--
  1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f455fc3aa561..aabaa652f6dc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -449,9 +449,9 @@ static void dm_pflip_high_irq(void *interrupt_params)
 amdgpu_crtc->pflip_status = AMDGPU_FLIP_NONE;
 spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);

-   DRM_DEBUG_DRIVER("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",
-amdgpu_crtc->crtc_id, amdgpu_crtc,
-vrr_active, (int) !e);
+   DRM_DEBUG_DP("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
%d\n",


Should probably be _KMS or _ATOMIC since this is not displayport specific.


It looks like _ATOMIC is strictly for code dealing with atomic. KMS is a 
better bet.


_KMS




+amdgpu_crtc->crtc_id, amdgpu_crtc,
+vrr_active, (int) !e);
  }

  static void dm_vupdate_high_irq(void *interrupt_params)
@@ -993,8 +993,7 @@ static void event_mall_stutter(struct work_struct *work)
 dc_allow_idle_optimizations(
 dm->dc, dm->active_vblank_irq_count == 0);

-   DRM_DEBUG_DRIVER("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);
-
+   DRM_DEBUG_DP("Allow idle optimizations (MALL): %d\n", 
dm->active_vblank_irq_count == 0);


Maybe _VBL or _KMS or _ATOMIC?



_KMS



 mutex_unlock(&dm->dc_lock);
  }
@@ -1810,8 +1809,8 @@ static void dm_gpureset_toggle_interrupts(struct 
amdgpu_device *adev,
 if (acrtc && state->stream_status[i].plane_count != 0) {
 irq_source = IRQ_TYPE_PFLIP + acrtc->otg_inst;
 rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) 
? 0 : -EBUSY;
-   DRM_DEBUG("crtc %d - vupdate irq %sabling: r=%d\n",
- acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_DP("crtc %d - vupdate irq %sabling: r=%d\n",
+acrtc->crtc_id, enable ? "en" : "dis", rc);


I think this should be _VBL.



_VBL


 if (rc)
 DRM_WARN("Failed to %s pflip interrupts\n",
  enable ? "enable" : "disable");
@@ -4966,8 +4965,8 @@ static void update_stream_scaling_settings(const struct 
drm_display_mode *mode,
 stream->src = src;
 stream->dst = dst;

-   DRM_DEBUG_DRIVER("Destination Rectangle x:%d  y:%d  width:%d  
height:%d\n",
-   dst.x, dst.y, dst.width, dst.height);
+   DRM_DEBUG_DP("Destination Rectangle x:%d  y:%d  width:%d  height:%d\n",
+dst.x, dst.y, dst.width, dst.height);


Should probably be _KMS or _ATOMIC since this is not displayport specific.



_KMS



  }

@@ -5710,8 +5709,8 @@ static inline int dm_set_vupdate_irq(struct drm_crtc 
*crtc, bool enable)

 rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) ? 0 : -EBUSY;

-   DRM_DEBUG_DRIVER("crtc %d - vupdate irq %sabling: r=%d\n",
-acrtc->crtc_id, enable ? "en" : "dis", rc);
+   DRM_DEBUG_DP("crtc %d - vupdate irq %sabling: r=%d\n",
+acrtc->crtc_id, enable ? "en" : "dis", rc);


Should probably be _VBL.



_VBL


 return rc;
  }

@@ -6664,7 +6663,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
 int r;

 if (!new_state->fb) {
-   DRM_DEBUG_DRIVER("No FB bound\n");
+   DRM_DEBUG_DP("No FB bound\n");


Should probably be _KMS or _ATOMIC since this is not displayport specific.



_KMS


 return 0;
 }

@@ -7896,11 +7895,11 @@ static void handle_cursor_update(struct drm_plane 
*plane,
 if (!plane->state->fb && !old_plane_state->fb)
 return;

-   DRM_DEBUG_DRIVER("%s: crtc_id=%d with size %d to %d\n",
-__func__,
-amdgpu_crtc->crtc_id,
-plane->state->crtc_w,
-plane->state->crtc_h);
+   DRM_DEBUG_DP("%s: crtc_id=%d with size %d to %d\n",
+__func__,
+amdgpu_crtc->crtc_id,
+plane->state->crtc_w,
+plane->state->

Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Christian König

On 23.03.21 at 16:13, Michal Hocko wrote:
> On Tue 23-03-21 14:56:54, Christian König wrote:
> > On 23.03.21 at 14:41, Michal Hocko wrote:
> [...]
> > > Anyway, I am wondering whether the overall approach is sound. Why don't
> > > you simply use shmem as your backing storage from the beginning and pin
> > > those pages if they are used by the device?
> > Yeah, that is exactly what the Intel guys are doing for their integrated
> > GPUs :)
> >
> > Problem is for TTM I need to be able to handle dGPUs and those have all
> > kinds of funny allocation restrictions. In other words I need to guarantee
> > that the allocated memory is coherent accessible to the GPU without using
> > SWIOTLB.
> >
> > The simple case is that the device can only do DMA32, but you also got
> > device which can only do 40bits or 48bits.
> >
> > On top of that you also got AGP, CMA and stuff like CPU cache behavior
> > changes (write back vs. write through, vs. uncached).
> OK, so the underlying problem seems to be that gfp mask (thus
> mapping_gfp_mask) cannot really reflect your requirements, right?  Would
> it help if shmem would allow to provide an allocation callback to
> override alloc_page_vma which is used currently? I am pretty sure there
> will be more to handle but going through shmem for the whole life time
> is just so much easier to reason about than some tricks to abuse shmem
> just for the swapout path.


Well it's a start, but the pages can have special CPU cache settings. So
direct IO from/to them usually doesn't work as expected.

In addition to that, for AGP and CMA I need to make sure that I give those
pages back to the relevant subsystems instead of just dropping the page
reference.

So I would need to block for the swapio to be completed.

Anyway I probably need to revert those patches for now since this isn't
working as we hoped it would.

Thanks for the explanation of how stuff works here.

Christian.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: re-apply "use the new cursor in the VM code""

2021-03-23 Thread Nirmoy

Tested on Navi1x with "piglit run opengl results/test".

The series is Tested-by: Nirmoy Das from my side.


Curious to know how this holds up against Guchun's Vulkan cts test.


Regards,

Nirmoy



On 3/23/21 3:54 PM, Christian König wrote:

Now that we found the underlying problem we can re-apply this patch.

This reverts commit 867fee7f8821ff42e7308088cf0c3450ac49c17c.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 55 +-
  1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9268db1172bd..bc3951b71079 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -37,6 +37,7 @@
  #include "amdgpu_gmc.h"
  #include "amdgpu_xgmi.h"
  #include "amdgpu_dma_buf.h"
+#include "amdgpu_res_cursor.h"
  
  /**

   * DOC: GPUVM
@@ -1583,7 +1584,7 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_vm_update_params *params,
   * @last: last mapped entry
   * @flags: flags for the entries
   * @offset: offset into nodes and pages_addr
- * @nodes: array of drm_mm_nodes with the MC addresses
+ * @res: ttm_resource to map
   * @pages_addr: DMA addresses to use for mapping
   * @fence: optional resulting fence
   *
@@ -1598,13 +1599,13 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
   bool unlocked, struct dma_resv *resv,
   uint64_t start, uint64_t last,
   uint64_t flags, uint64_t offset,
-  struct drm_mm_node *nodes,
+  struct ttm_resource *res,
   dma_addr_t *pages_addr,
   struct dma_fence **fence)
  {
struct amdgpu_vm_update_params params;
+   struct amdgpu_res_cursor cursor;
enum amdgpu_sync_mode sync_mode;
-   uint64_t pfn;
int r;
  
  	memset(&params, 0, sizeof(params));

@@ -1622,14 +1623,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
else
sync_mode = AMDGPU_SYNC_EXPLICIT;
  
-	pfn = offset >> PAGE_SHIFT;

-   if (nodes) {
-   while (pfn >= nodes->size) {
-   pfn -= nodes->size;
-   ++nodes;
-   }
-   }
-
amdgpu_vm_eviction_lock(vm);
if (vm->evicting) {
r = -EBUSY;
@@ -1648,23 +1641,17 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
if (r)
goto error_unlock;
  
-	do {

+   amdgpu_res_first(res, offset, (last - start + 1) * AMDGPU_GPU_PAGE_SIZE,
+&cursor);
+   while (cursor.remaining) {
uint64_t tmp, num_entries, addr;
  
-

-   num_entries = last - start + 1;
-   if (nodes) {
-   addr = nodes->start << PAGE_SHIFT;
-   num_entries = min((nodes->size - pfn) *
-   AMDGPU_GPU_PAGES_IN_CPU_PAGE, num_entries);
-   } else {
-   addr = 0;
-   }
-
+   num_entries = cursor.size >> AMDGPU_GPU_PAGE_SHIFT;
if (pages_addr) {
bool contiguous = true;
  
  			if (num_entries > AMDGPU_GPU_PAGES_IN_CPU_PAGE) {

+   uint64_t pfn = cursor.start >> PAGE_SHIFT;
uint64_t count;
  
  contiguous = pages_addr[pfn + 1] ==

@@ -1684,16 +1671,18 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
}
  
  			if (!contiguous) {

-   addr = pfn << PAGE_SHIFT;
+   addr = cursor.start;
params.pages_addr = pages_addr;
} else {
-   addr = pages_addr[pfn];
+   addr = pages_addr[cursor.start >> PAGE_SHIFT];
params.pages_addr = NULL;
}
  
  		} else if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) {

-   addr += bo_adev->vm_manager.vram_base_offset;
-   addr += pfn << PAGE_SHIFT;
+   addr = bo_adev->vm_manager.vram_base_offset +
+   cursor.start;
+   } else {
+   addr = 0;
}
  
  		tmp = start + num_entries;

@@ -1701,14 +1690,9 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
if (r)
goto error_unlock;
  
-		pfn += num_entries / AMDGPU_GPU_PAGES_IN_CPU_PAGE;

-   if (nodes && nodes->size == pfn) {
-   pfn = 0;
-   ++nodes;
-   }
+   amdgpu_res

Re: [PATCH] amdgpu: fix gcc -Wrestrict warning

2021-03-23 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Mar 23, 2021 at 9:04 AM Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> gcc warns about an sprintf() that uses the same buffer as source
> and destination, which is undefined behavior in C99:
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c: In function 
> 'amdgpu_securedisplay_debugfs_write':
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:141:6: error: 'sprintf' 
> argument 3 overlaps destination object 'i2c_output' [-Werror=restrict]
>   141 |  sprintf(i2c_output, "%s 0x%X", i2c_output,
>   |  ^~
>   142 |   
> securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
>   |   
> ~
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:97:7: note: destination 
> object referenced by 'restrict'-qualified argument 1 was declared here
>97 |  char i2c_output[256];
>   |   ^~
>
> Rewrite it to remember the current offset into the buffer instead.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> index 834440ab9ff7..69d7f6bff5d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> @@ -136,9 +136,10 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
> file *f, const char __u
> ret = psp_securedisplay_invoke(psp, 
> TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);
> if (!ret) {
> if (securedisplay_cmd->status == 
> TA_SECUREDISPLAY_STATUS__SUCCESS) {
> +   int pos = 0;
> memset(i2c_output,  0, sizeof(i2c_output));
> for (i = 0; i < 
> TA_SECUREDISPLAY_I2C_BUFFER_SIZE; i++)
> -   sprintf(i2c_output, "%s 0x%X", 
> i2c_output,
> +   pos += sprintf(i2c_output + pos, " 
> 0x%X",
> 
> securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
> dev_info(adev->dev, "SECUREDISPLAY: I2C 
> buffer out put is :%s\n", i2c_output);
> } else {
> --
> 2.29.2
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Amdgpu kernel oops and freezing on system suspend and hibernate

2021-03-23 Thread Harvey

Alex,

thanks for the hint, but...

Is this patch intended for kernel 5.11.8?

I applied the patch against 5.11.8 and it is freezing again:


Mär 23 16:18:51 obelix kernel: [drm:amdgpu_dm_atomic_commit_tail 
[amdgpu]] *ERROR* Waiting for fences timed out!
Mär 23 16:18:51 obelix kernel: [drm:amdgpu_dm_atomic_commit_tail 
[amdgpu]] *ERROR* Waiting for fences timed out!
Mär 23 16:18:51 obelix kernel: [drm:amdgpu_job_timedout [amdgpu]] 
*ERROR* ring sdma0 timeout, signaled seq=615, emitted seq=617
Mär 23 16:18:51 obelix kernel: [drm:amdgpu_job_timedout [amdgpu]] 
*ERROR* Process information: process  pid 0 thread  pid 0

Mär 23 16:18:51 obelix kernel: amdgpu :03:00.0: amdgpu: GPU reset begin!
Mär 23 16:18:51 obelix kernel: BUG: kernel NULL pointer dereference, 
address: 0029

Mär 23 16:18:51 obelix kernel: #PF: supervisor read access in kernel mode
Mär 23 16:18:51 obelix kernel: #PF: error_code(0x) - not-present page
Mär 23 16:18:51 obelix kernel: PGD 0 P4D 0
Mär 23 16:18:51 obelix kernel: Oops:  [#1] PREEMPT SMP NOPTI
Mär 23 16:18:51 obelix kernel: CPU: 12 PID: 178 Comm: kworker/12:1 Not 
tainted 5.11.8-arch1-1-custom #1
Mär 23 16:18:51 obelix kernel: Hardware name: Micro-Star International 
Co., Ltd. Bravo 17 A4DDR/MS-17FK, BIOS E17FKAMS.117 10/29/2020
Mär 23 16:18:51 obelix kernel: Workqueue: events drm_sched_job_timedout 
[gpu_sched]
Mär 23 16:18:51 obelix kernel: RIP: 0010:kernel_queue_uninit+0xd/0xf0 
[amdgpu]
Mär 23 16:18:51 obelix kernel: Code: ee 48 89 c7 e8 a4 f9 ff ff 84 c0 0f 
84 e3 d3 1f 00 4c 89 e0 5d 41 5c 41 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 
8b 47 10 48 89 fd <8b> 50 28 83 fa 02 74 78 83 fa 03 0f 84 b1 00 00 00 
48 8b 7f 08 4c

Mär 23 16:18:51 obelix kernel: RSP: 0018:a35d806dfd40 EFLAGS: 00010246
Mär 23 16:18:51 obelix kernel: RAX: 0001 RBX: 
8b044c5ee000 RCX: 0080005b
Mär 23 16:18:51 obelix kernel: RDX: 0080005c RSI: 
0001 RDI: 8b044a877bc0
Mär 23 16:18:51 obelix kernel: RBP: 8b044a877bc0 R08: 
0001 R09: 
Mär 23 16:18:51 obelix kernel: R10:  R11: 
afccba00 R12: 8b044c5ee0d0
Mär 23 16:18:51 obelix kernel: R13: 8b044bf6 R14: 
8b04414a1000 R15: 8b04414a10c8
Mär 23 16:18:51 obelix kernel: FS:  () 
GS:8b075f90() knlGS:
Mär 23 16:18:51 obelix kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Mär 23 16:18:51 obelix kernel: CR2: 0029 CR3: 
0001ab01 CR4: 00350ee0

Mär 23 16:18:51 obelix kernel: Call Trace:
Mär 23 16:18:51 obelix kernel:  stop_cpsch+0xa0/0xc0 [amdgpu]
Mär 23 16:18:51 obelix kernel:  kgd2kfd_suspend.part.0+0x2f/0x40 [amdgpu]
Mär 23 16:18:51 obelix kernel:  kgd2kfd_pre_reset+0x3f/0x50 [amdgpu]
Mär 23 16:18:51 obelix kernel: 
amdgpu_device_gpu_recover.cold+0x36e/0x95d [amdgpu]

Mär 23 16:18:51 obelix kernel:  amdgpu_job_timedout+0x121/0x140 [amdgpu]
Mär 23 16:18:51 obelix kernel:  drm_sched_job_timedout+0x64/0xe0 [gpu_sched]
Mär 23 16:18:51 obelix kernel:  process_one_work+0x214/0x3e0
Mär 23 16:18:51 obelix kernel:  worker_thread+0x4d/0x3d0
Mär 23 16:18:51 obelix kernel:  ? rescuer_thread+0x3c0/0x3c0
Mär 23 16:18:51 obelix kernel:  kthread+0x133/0x150
Mär 23 16:18:51 obelix kernel:  ? __kthread_bind_mask+0x60/0x60
Mär 23 16:18:51 obelix kernel:  ret_from_fork+0x22/0x30
Mär 23 16:18:51 obelix kernel: Modules linked in: rfcomm 
snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio 
snd_hda_codec_hdmi cmac algif_hash snd_hda_intel algif_skcipher 
snd_intel_dspcfg soundwire_intel af_alg soundwire_ge>
Mär 23 16:18:51 obelix kernel:  sr_mod cdrom uas usb_storage dm_crypt 
cbc encrypted_keys dm_mod trusted tpm crct10dif_pclmul crc32_pclmul 
crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd 
glue_helper serio_raw ccp xhc>

Mär 23 16:18:51 obelix kernel: CR2: 0029
Mär 23 16:18:51 obelix kernel: ---[ end trace 8a72c5e07cbe6b63 ]---
Mär 23 16:18:51 obelix kernel: RIP: 0010:kernel_queue_uninit+0xd/0xf0 
[amdgpu]
Mär 23 16:18:51 obelix kernel: Code: ee 48 89 c7 e8 a4 f9 ff ff 84 c0 0f 
84 e3 d3 1f 00 4c 89 e0 5d 41 5c 41 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 
8b 47 10 48 89 fd <8b> 50 28 83 fa 02 74 78 83 fa 03 0f 84 b1 00 00 00 
48 8b 7f 08 4c

Mär 23 16:18:51 obelix kernel: RSP: 0018:a35d806dfd40 EFLAGS: 00010246
Mär 23 16:18:51 obelix kernel: RAX: 0001 RBX: 
8b044c5ee000 RCX: 0080005b
Mär 23 16:18:51 obelix kernel: RDX: 0080005c RSI: 
0001 RDI: 8b044a877bc0
Mär 23 16:18:51 obelix kernel: RBP: 8b044a877bc0 R08: 
0001 R09: 
Mär 23 16:18:51 obelix kernel: R10:  R11: 
afccba00 R12: 8b044c5ee0d0
Mär 23 16:18:51 obelix kernel: R13: 8b044bf6 R14: 
8b04414a1000 R15: 8b04414a10c8
Mär 23 16:18:51 obelix kernel: FS:  () 
GS:8b075f90() knlGS:
Mär 23 16:18:51 obelix kernel: 

Re: [PATCH] gpu: drm: amd: Remove duplicate includes

2021-03-23 Thread Alex Deucher
Same patch was already applied recently.

Thanks,

Alex

On Mon, Mar 22, 2021 at 9:19 PM Wan Jiabing  wrote:
>
> ../hw_ddc.h, ../hw_gpio.h and ../hw_hpd.h have been included
> at line 32, so remove them.
>
> Signed-off-by: Wan Jiabing 
> ---
>  .../gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c| 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c 
> b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
> index 66e4841f41e4..ca335ea60412 100644
> --- a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
> +++ b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
> @@ -48,10 +48,6 @@
>  #define REGI(reg_name, block, id)\
> mm ## block ## id ## _ ## reg_name
>
> -#include "../hw_gpio.h"
> -#include "../hw_ddc.h"
> -#include "../hw_hpd.h"
> -
>  #include "reg_helper.h"
>  #include "../hpd_regs.h"
>
> --
> 2.25.1
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] gpu: drm: amd: Remove duplicate include of dce110_resource.h

2021-03-23 Thread Alex Deucher
The same patch was already applied recently.  Thanks!

Alex

On Mon, Mar 22, 2021 at 9:10 PM Wan Jiabing  wrote:
>
> dce110/dce110_resource.h has been included at line 58, so remove
> the duplicate include at line 64.
>
> Signed-off-by: Wan Jiabing 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
> index 4a3df13c9e49..c4fe21b3b23f 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
> @@ -61,7 +61,6 @@
>  #include "dcn21/dcn21_dccg.h"
>  #include "dcn21_hubbub.h"
>  #include "dcn10/dcn10_resource.h"
> -#include "dce110/dce110_resource.h"
>  #include "dce/dce_panel_cntl.h"
>
>  #include "dcn20/dcn20_dwb.h"
> --
> 2.25.1
>


Re: [PATCH] drm/amdgpu: Ensure that the modifier requested is supported by plane.

2021-03-23 Thread Bas Nieuwenhuizen
On Wed, Mar 10, 2021 at 5:14 PM Mark Yacoub  wrote:

> From: Mark Yacoub 
>
> On initializing the framebuffer, call drm_any_plane_has_format to do a
> check if the modifier is supported. drm_any_plane_has_format calls
> dm_plane_format_mod_supported which is extended to validate that the
> modifier is on the list of the plane's supported modifiers.
>
> The bug was caught using igt-gpu-tools test:
> kms_addfb_basic.addfb25-bad-modifier
>
> Tested on ChromeOS Zork by turning on the display, running an overlay
> test, and running a YT video.
>
> Cc: Alex Deucher 
> Cc: Bas Nieuwenhuizen 
> Signed-off-by: Mark Yacoub 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 13 +
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  9 +
>  2 files changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index afa5f8ad0f563..a947b5aa420d2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -908,6 +908,19 @@ int amdgpu_display_gem_fb_verify_and_init(
>  &amdgpu_fb_funcs);
> if (ret)
> goto err;
> +   /* Verify that the modifier is supported. */
> +   if (!drm_any_plane_has_format(dev, mode_cmd->pixel_format,
> + mode_cmd->modifier[0])) {
> +   struct drm_format_name_buf format_name;
> +   drm_dbg_kms(dev,
> +   "unsupported pixel format %s / modifier 0x%llx\n",
> +   drm_get_format_name(mode_cmd->pixel_format,
> +   &format_name),
> +   mode_cmd->modifier[0]);
> +
> +   ret = -EINVAL;
> +   goto err;
> +   }
>

Why is this needed?


> ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
> if (ret)
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 961abf1cf040c..21314024a83ce 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3939,6 +3939,7 @@ static bool dm_plane_format_mod_supported(struct
> drm_plane *plane,
>  {
> struct amdgpu_device *adev = drm_to_adev(plane->dev);
> const struct drm_format_info *info = drm_format_info(format);
> +   int i;
>
> enum dm_micro_swizzle microtile =
> modifier_gfx9_swizzle_mode(modifier) & 3;
>
> @@ -3952,6 +3953,14 @@ static bool dm_plane_format_mod_supported(struct
> drm_plane *plane,
> if (modifier == DRM_FORMAT_MOD_LINEAR)
> return true;
>
> +   /* Check that the modifier is on the list of the plane's supported
> modifiers. */
> +   for (i = 0; i < plane->modifier_count; i++) {
> +   if (modifier == plane->modifiers[i])
> +   break;
> +   }
> +   if (i == plane->modifier_count)
> +   return false;
> +
>

This part seems fine by me.

> /*
>  * The arbitrary tiling support for multiplane formats has not
> been hooked
>  * up.
> --
> 2.30.1.766.gb4fecdf3b7-goog
>
>
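For readers following the second hunk, the membership test it adds can be
sketched in user-space C. This is only an illustration under stated
assumptions: demo_modifier_in_list and demo_plane_modifiers are hypothetical
names, the sample values are arbitrary, and the real check lives inside
dm_plane_format_mod_supported and reads plane->modifiers directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample list standing in for plane->modifiers. */
static const uint64_t demo_plane_modifiers[] = {
    0x0ULL,                 /* arbitrary sample value */
    0x0200000000000001ULL,  /* arbitrary sample value */
};

/* Hypothetical helper: true if `modifier` appears in list[0..count). */
static bool demo_modifier_in_list(const uint64_t *list, size_t count,
                                  uint64_t modifier)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (list[i] == modifier)
            return true;
    }
    /* Walked the whole list without a match: not supported. */
    return false;
}
```

The patch uses a break followed by an index comparison; the early return
above is the same linear search written differently.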


Re: [PATCH] drm/amdgpu: Ensure that the modifier requested is supported by plane.

2021-03-23 Thread Alex Deucher
On Wed, Mar 10, 2021 at 11:15 AM Mark Yacoub  wrote:
>
> From: Mark Yacoub 
>
> On initializing the framebuffer, call drm_any_plane_has_format to do a
> check if the modifier is supported. drm_any_plane_has_format calls
> dm_plane_format_mod_supported which is extended to validate that the
> modifier is on the list of the plane's supported modifiers.
>
> The bug was caught using igt-gpu-tools test: 
> kms_addfb_basic.addfb25-bad-modifier
>
> Tested on ChromeOS Zork by turning on the display, running an overlay
> test, and running a YT video.
>
> Cc: Alex Deucher 
> Cc: Bas Nieuwenhuizen 
> Signed-off-by: Mark Yacoub 

I'm not an expert with modifiers yet.  Will this break chips which
don't currently support modifiers?

Alex


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 13 +
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  9 +
>  2 files changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index afa5f8ad0f563..a947b5aa420d2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -908,6 +908,19 @@ int amdgpu_display_gem_fb_verify_and_init(
>  &amdgpu_fb_funcs);
> if (ret)
> goto err;
> +   /* Verify that the modifier is supported. */
> +   if (!drm_any_plane_has_format(dev, mode_cmd->pixel_format,
> + mode_cmd->modifier[0])) {
> +   struct drm_format_name_buf format_name;
> +   drm_dbg_kms(dev,
> +   "unsupported pixel format %s / modifier 0x%llx\n",
> +   drm_get_format_name(mode_cmd->pixel_format,
> +   &format_name),
> +   mode_cmd->modifier[0]);
> +
> +   ret = -EINVAL;
> +   goto err;
> +   }
>
> ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
> if (ret)
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 961abf1cf040c..21314024a83ce 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3939,6 +3939,7 @@ static bool dm_plane_format_mod_supported(struct 
> drm_plane *plane,
>  {
> struct amdgpu_device *adev = drm_to_adev(plane->dev);
> const struct drm_format_info *info = drm_format_info(format);
> +   int i;
>
> enum dm_micro_swizzle microtile = 
> modifier_gfx9_swizzle_mode(modifier) & 3;
>
> @@ -3952,6 +3953,14 @@ static bool dm_plane_format_mod_supported(struct 
> drm_plane *plane,
> if (modifier == DRM_FORMAT_MOD_LINEAR)
> return true;
>
> +   /* Check that the modifier is on the list of the plane's supported 
> modifiers. */
> +   for (i = 0; i < plane->modifier_count; i++) {
> +   if (modifier == plane->modifiers[i])
> +   break;
> +   }
> +   if (i == plane->modifier_count)
> +   return false;
> +
> /*
>  * The arbitrary tiling support for multiplane formats has not been 
> hooked
>  * up.
> --
> 2.30.1.766.gb4fecdf3b7-goog
>


Re: [PATCH] amdgpu: avoid incorrect %hu format string

2021-03-23 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, Mar 22, 2021 at 7:55 AM Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> clang points out that the %hu format string does not match the type
> of the variables here:
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type 
> 'unsigned short' but the argument has type 'unsigned int' [-Wformat]
>   version_major, version_minor);
>   ^
> include/drm/drm_print.h:498:19: note: expanded from macro 'DRM_ERROR'
> __drm_err(fmt, ##__VA_ARGS__)
>   ~~~^~~
>
> Change it to a regular %u, the same way a previous patch did for
> another instance of the same warning.
>
> Fixes: 0b437e64e0af ("drm/amdgpu: remove h from printk format specifier")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index e2ed4689118a..c6dbc0801604 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -259,7 +259,7 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
> if ((adev->asic_type == CHIP_POLARIS10 ||
>  adev->asic_type == CHIP_POLARIS11) &&
> (adev->uvd.fw_version < FW_1_66_16))
> -   DRM_ERROR("POLARIS10/11 UVD firmware version %hu.%hu 
> is too old.\n",
> +   DRM_ERROR("POLARIS10/11 UVD firmware version %u.%u is 
> too old.\n",
>   version_major, version_minor);
> } else {
> unsigned int enc_major, enc_minor, dec_minor;
> --
> 2.29.2
>
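A small user-space sketch of the format-string point, assuming nothing
beyond standard C: the arguments are unsigned int, so the matching
conversion is %u, while %hu would declare them as unsigned short.
demo_fw_version_string is a hypothetical helper, not part of the driver,
and it returns a static buffer purely to keep the sketch short.

```c
#include <stdio.h>

/* Hypothetical helper: formats a version held in two unsigned ints. */
static const char *demo_fw_version_string(unsigned int major,
                                          unsigned int minor)
{
    static char buf[64]; /* not reentrant; fine for a sketch */

    /* %u matches unsigned int, which is what -Wformat asks for. */
    snprintf(buf, sizeof(buf), "UVD firmware version %u.%u", major, minor);
    return buf;
}
```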


Re: [PATCH] drivers: gpu: Remove duplicate include of amdgpu_hdp.h

2021-03-23 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, Mar 22, 2021 at 8:10 AM Christian König
 wrote:
>
>
>
> On 22.03.21 at 13:02, Wan Jiabing wrote:
> > amdgpu_hdp.h has been included at line 91, so remove
> > the duplicate include.
> >
> > Signed-off-by: Wan Jiabing 
>
> Acked-by: Christian König 
>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 -
> >   1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 49267eb64302..68836c22ef25 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -107,7 +107,6 @@
> >   #include "amdgpu_gfxhub.h"
> >   #include "amdgpu_df.h"
> >   #include "amdgpu_smuio.h"
> > -#include "amdgpu_hdp.h"
> >
> >   #define MAX_GPU_INSTANCE 16
> >
>


Re: [PATCH] drm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug

2021-03-23 Thread Alex Deucher
Applied.  Thanks!

Alex

On Sun, Mar 21, 2021 at 5:33 AM Qu Huang  wrote:
>
> Here is the system crash log:
> [ 1272.884438] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [ 1272.88] IP: [<  (null)>]   (null)
> [ 1272.884447] PGD 825b09067 PUD 8267c8067 PMD 0
> [ 1272.884452] Oops: 0010 [#1] SMP
> [ 1272.884509] CPU: 13 PID: 3485 Comm: cat Kdump: loaded Tainted: G
> [ 1272.884515] task: 9a38dbd4d140 ti: 9a37cd3b8000 task.ti:
> 9a37cd3b8000
> [ 1272.884517] RIP: 0010:[<>]  [<  (null)>]
> (null)
> [ 1272.884520] RSP: 0018:9a37cd3bbe68  EFLAGS: 00010203
> [ 1272.884522] RAX:  RBX:  RCX:
> 00014d5f
> [ 1272.884524] RDX: fff4 RSI: 0001 RDI:
> 9a38aca4d200
> [ 1272.884526] RBP: 9a37cd3bbed0 R08: 9a38dcd5f1a0 R09:
> 9a31ffc07300
> [ 1272.884527] R10: 9a31ffc07300 R11: addd5e9d R12:
> 9a38b4e0fb00
> [ 1272.884529] R13: 0001 R14: 9a37cd3bbf18 R15:
> 9a38aca4d200
> [ 1272.884532] FS:  7feccaa67740() GS:9a38dcd4()
> knlGS:
> [ 1272.884534] CS:  0010 DS:  ES:  CR0: 80050033
> [ 1272.884536] CR2:  CR3: 0008267c CR4:
> 003407e0
> [ 1272.884537] Call Trace:
> [ 1272.884544]  [] ? seq_read+0x130/0x440
> [ 1272.884548]  [] vfs_read+0x9f/0x170
> [ 1272.884552]  [] SyS_read+0x7f/0xf0
> [ 1272.884557]  [] system_call_fastpath+0x22/0x27
> [ 1272.884558] Code:  Bad RIP value.
> [ 1272.884562] RIP  [<  (null)>]   (null)
> [ 1272.884564]  RSP 
> [ 1272.884566] CR2: 
>
> Signed-off-by: Qu Huang 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
> index 511712c..673d5e3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
> @@ -33,6 +33,11 @@ static int kfd_debugfs_open(struct inode *inode, struct 
> file *file)
>
> return single_open(file, show, NULL);
>  }
> +static int kfd_debugfs_hang_hws_read(struct seq_file *m, void *data)
> +{
> +   seq_printf(m, "echo gpu_id > hang_hws\n");
> +   return 0;
> +}
>
>  static ssize_t kfd_debugfs_hang_hws_write(struct file *file,
> const char __user *user_buf, size_t size, loff_t *ppos)
> @@ -94,7 +99,7 @@ void kfd_debugfs_init(void)
> debugfs_create_file("rls", S_IFREG | 0444, debugfs_root,
> kfd_debugfs_rls_by_device, &kfd_debugfs_fops);
> debugfs_create_file("hang_hws", S_IFREG | 0200, debugfs_root,
> -   NULL, &kfd_debugfs_hang_hws_fops);
> +   kfd_debugfs_hang_hws_read, 
> &kfd_debugfs_hang_hws_fops);
>  }
>
>  void kfd_debugfs_fini(void)
> --
> 1.8.3.1
>
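The crash log in this patch shows a NULL instruction pointer reached from
seq_read, which is consistent with a missing show callback being called.
A rough user-space model of that situation follows. Every name in it is
hypothetical, and the NULL guard in demo_read is an addition for
illustration only; the patch itself fixes the problem by registering a
callback, not by adding a guard.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a seq_file style show callback. */
typedef int (*demo_show_fn)(char *buf, size_t len);

struct demo_entry {
    const char *name;
    demo_show_fn show; /* NULL models the entry before the patch */
};

/* Mirrors the hint text the patch prints for hang_hws. */
static int demo_hang_hws_show(char *buf, size_t len)
{
    snprintf(buf, len, "echo gpu_id > hang_hws\n");
    return 0;
}

/* Illustration only: fail cleanly rather than call through NULL. */
static int demo_read(const struct demo_entry *e, char *buf, size_t len)
{
    if (!e->show)
        return -1;
    return e->show(buf, len);
}

/* Sample entries: one without a callback, one with. */
static const struct demo_entry demo_broken = { "hang_hws", NULL };
static const struct demo_entry demo_fixed = { "hang_hws", demo_hang_hws_show };
static char demo_buf[64];
```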


[PATCH 1/2] drm/amdgpu: use zero as start for dummy resource walks

2021-03-23 Thread Christian König
When we don't have a physical backing store we should use zero instead
of the virtual start address, since that isn't necessarily a valid physical
one.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index 40f2adf305bc..e94362ccf9d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -54,7 +54,7 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
struct drm_mm_node *node;
 
if (!res || !res->mm_node) {
-   cur->start = start;
+   cur->start = 0;
cur->size = size;
cur->remaining = size;
cur->node = NULL;
-- 
2.25.1
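
As an illustration of the behaviour described above, here is a simplified
user-space model of starting a resource walk. It is a sketch under stated
assumptions: demo_node, demo_cursor, demo_nodes and demo_res_first are
hypothetical names, sizes are plain bytes, and the caller is assumed to
pass an offset that lies inside the node array. It is not the code in
amdgpu_res_cursor.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one backing node. */
struct demo_node {
    uint64_t start; /* first byte covered by this node */
    uint64_t size;  /* length of the node in bytes */
};

/* Hypothetical stand-in for the walk cursor. */
struct demo_cursor {
    uint64_t start;
    uint64_t size;
    uint64_t remaining;
    const struct demo_node *node;
};

/* Sample backing store made of two discontiguous nodes. */
static const struct demo_node demo_nodes[] = {
    { 0x1000, 0x2000 },
    { 0x8000, 0x1000 },
};

static struct demo_cursor demo_res_first(const struct demo_node *nodes,
                                         uint64_t offset, uint64_t size)
{
    struct demo_cursor cur;
    uint64_t left;

    if (!nodes) {
        /* No backing store: report 0, not the caller's offset. */
        cur.start = 0;
        cur.size = size;
        cur.remaining = size;
        cur.node = NULL;
        return cur;
    }

    /* Skip whole nodes until the offset falls inside one. */
    while (offset >= nodes->size) {
        offset -= nodes->size;
        ++nodes;
    }

    left = nodes->size - offset;
    cur.start = nodes->start + offset;
    cur.size = left < size ? left : size; /* clamp to this node */
    cur.remaining = size;
    cur.node = nodes;
    return cur;
}
```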



[PATCH 2/2] drm/amdgpu: re-apply "use the new cursor in the VM code""

2021-03-23 Thread Christian König
Now that we found the underlying problem we can re-apply this patch.

This reverts commit 867fee7f8821ff42e7308088cf0c3450ac49c17c.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 55 +-
 1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9268db1172bd..bc3951b71079 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -37,6 +37,7 @@
 #include "amdgpu_gmc.h"
 #include "amdgpu_xgmi.h"
 #include "amdgpu_dma_buf.h"
+#include "amdgpu_res_cursor.h"
 
 /**
  * DOC: GPUVM
@@ -1583,7 +1584,7 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_vm_update_params *params,
  * @last: last mapped entry
  * @flags: flags for the entries
  * @offset: offset into nodes and pages_addr
- * @nodes: array of drm_mm_nodes with the MC addresses
+ * @res: ttm_resource to map
  * @pages_addr: DMA addresses to use for mapping
  * @fence: optional resulting fence
  *
@@ -1598,13 +1599,13 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
   bool unlocked, struct dma_resv *resv,
   uint64_t start, uint64_t last,
   uint64_t flags, uint64_t offset,
-  struct drm_mm_node *nodes,
+  struct ttm_resource *res,
   dma_addr_t *pages_addr,
   struct dma_fence **fence)
 {
struct amdgpu_vm_update_params params;
+   struct amdgpu_res_cursor cursor;
enum amdgpu_sync_mode sync_mode;
-   uint64_t pfn;
int r;
 
memset(&params, 0, sizeof(params));
@@ -1622,14 +1623,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
else
sync_mode = AMDGPU_SYNC_EXPLICIT;
 
-   pfn = offset >> PAGE_SHIFT;
-   if (nodes) {
-   while (pfn >= nodes->size) {
-   pfn -= nodes->size;
-   ++nodes;
-   }
-   }
-
amdgpu_vm_eviction_lock(vm);
if (vm->evicting) {
r = -EBUSY;
@@ -1648,23 +1641,17 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
if (r)
goto error_unlock;
 
-   do {
+   amdgpu_res_first(res, offset, (last - start + 1) * AMDGPU_GPU_PAGE_SIZE,
+&cursor);
+   while (cursor.remaining) {
uint64_t tmp, num_entries, addr;
 
-
-   num_entries = last - start + 1;
-   if (nodes) {
-   addr = nodes->start << PAGE_SHIFT;
-   num_entries = min((nodes->size - pfn) *
-   AMDGPU_GPU_PAGES_IN_CPU_PAGE, num_entries);
-   } else {
-   addr = 0;
-   }
-
+   num_entries = cursor.size >> AMDGPU_GPU_PAGE_SHIFT;
if (pages_addr) {
bool contiguous = true;
 
if (num_entries > AMDGPU_GPU_PAGES_IN_CPU_PAGE) {
+   uint64_t pfn = cursor.start >> PAGE_SHIFT;
uint64_t count;
 
contiguous = pages_addr[pfn + 1] ==
@@ -1684,16 +1671,18 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
}
 
if (!contiguous) {
-   addr = pfn << PAGE_SHIFT;
+   addr = cursor.start;
params.pages_addr = pages_addr;
} else {
-   addr = pages_addr[pfn];
+   addr = pages_addr[cursor.start >> PAGE_SHIFT];
params.pages_addr = NULL;
}
 
} else if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) {
-   addr += bo_adev->vm_manager.vram_base_offset;
-   addr += pfn << PAGE_SHIFT;
+   addr = bo_adev->vm_manager.vram_base_offset +
+   cursor.start;
+   } else {
+   addr = 0;
}
 
tmp = start + num_entries;
@@ -1701,14 +1690,9 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
if (r)
goto error_unlock;
 
-   pfn += num_entries / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
-   if (nodes && nodes->size == pfn) {
-   pfn = 0;
-   ++nodes;
-   }
+   amdgpu_res_next(&cursor, num_entries * AMDGPU_GPU_PAGE_SIZE);
start = tmp;
-
-   } while (unlikely(start != last + 1));
+   };
 
r = vm->upd

Re: [PATCH] drm/amdgpu: remove irq_src->data handling

2021-03-23 Thread Alex Deucher
On Fri, Mar 19, 2021 at 8:25 AM Christian König
 wrote:
>
> That is unused for quite some time now.
>
> Signed-off-by: Christian König 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 5 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 1 -
>  2 files changed, 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> index af026109421a..03412543427a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
> @@ -382,11 +382,6 @@ void amdgpu_irq_fini(struct amdgpu_device *adev)
>
> kfree(src->enabled_types);
> src->enabled_types = NULL;
> -   if (src->data) {
> -   kfree(src->data);
> -   kfree(src);
> -   adev->irq.client[i].sources[j] = NULL;
> -   }
> }
> kfree(adev->irq.client[i].sources);
> adev->irq.client[i].sources = NULL;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> index ac527e5deae6..cf6116648322 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
> @@ -62,7 +62,6 @@ struct amdgpu_irq_src {
> unsigned num_types;
> atomic_t *enabled_types;
> const struct amdgpu_irq_src_funcs   *funcs;
> -   void *data;
>  };
>
>  struct amdgpu_irq_client {
> --
> 2.25.1
>


Re: [PATCH v2] drm/radeon: don't evict if not initialized

2021-03-23 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, Mar 22, 2021 at 3:40 AM Christian König
 wrote:
>
> On 21.03.21 at 16:19, Tong Zhang wrote:
> > TTM_PL_VRAM may not be initialized at all when calling
> > radeon_bo_evict_vram(). We need to check before doing eviction.
> >
> > [2.160837] BUG: kernel NULL pointer dereference, address: 
> > 0020
> > [2.161212] #PF: supervisor read access in kernel mode
> > [2.161490] #PF: error_code(0x) - not-present page
> > [2.161767] PGD 0 P4D 0
> > [2.163088] RIP: 0010:ttm_resource_manager_evict_all+0x70/0x1c0 [ttm]
> > [2.168506] Call Trace:
> > [2.168641]  radeon_bo_evict_vram+0x1c/0x20 [radeon]
> > [2.168936]  radeon_device_fini+0x28/0xf9 [radeon]
> > [2.169224]  radeon_driver_unload_kms+0x44/0xa0 [radeon]
> > [2.169534]  radeon_driver_load_kms+0x174/0x210 [radeon]
> > [2.169843]  drm_dev_register+0xd9/0x1c0 [drm]
> > [2.170104]  radeon_pci_probe+0x117/0x1a0 [radeon]
> >
> > Suggested-by: Christian König 
> > Signed-off-by: Tong Zhang 
>
> Reviewed-by: Christian König 
>
> > ---
> > v2: coding style fix
> >
> >   drivers/gpu/drm/radeon/radeon_object.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
> > b/drivers/gpu/drm/radeon/radeon_object.c
> > index 9b81786782de..499ce55e34cc 100644
> > --- a/drivers/gpu/drm/radeon/radeon_object.c
> > +++ b/drivers/gpu/drm/radeon/radeon_object.c
> > @@ -384,6 +384,8 @@ int radeon_bo_evict_vram(struct radeon_device *rdev)
> >   }
> >   #endif
> >   man = ttm_manager_type(bdev, TTM_PL_VRAM);
> > + if (!man)
> > + return 0;
> >   return ttm_resource_manager_evict_all(bdev, man);
> >   }
> >
>


RE: [PATCH 01/44] drm/amdgpu: replace per_device_list by array

2021-03-23 Thread Kim, Jonathan
[AMD Official Use Only - Internal Distribution Only]

> -Original Message-
> From: amd-gfx  On Behalf Of Felix
> Kuehling
> Sent: Monday, March 22, 2021 6:58 AM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Cc: Sierra Guiza, Alejandro (Alex) 
> Subject: [PATCH 01/44] drm/amdgpu: replace per_device_list by array
>
> [CAUTION: External Email]
>
> From: Alex Sierra 
>
> Remove per_device_list from kfd_process and replace it with a
> kfd_process_device pointers array of MAX_GPU_INSTANCES size. This helps
> to manage the kfd_process_devices bound to a specific kfd_process.
> Also, functions used by kfd_chardev to iterate over the list were removed,
> since they are not valid anymore. Instead, they were replaced by a local loop
> iterating the array.
>
> Signed-off-by: Alex Sierra 
> Signed-off-by: Felix Kuehling 

As discussed, this patch is required to sync internal branches for the KFD and is
Reviewed-by: Jonathan Kim 
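
For readers skimming the quoted diff, the list-to-array change can be
modelled in user space roughly as follows. This is a sketch with
hypothetical names (demo_pdd, demo_process, DEMO_MAX_GPU_INSTANCE and the
helpers); only the idea of a fixed-size pointer array plus a count,
walked with a plain index loop, is taken from the patch.

```c
#include <stddef.h>

#define DEMO_MAX_GPU_INSTANCE 16 /* hypothetical capacity */

/* Hypothetical stand-in for a per-device record. */
struct demo_pdd {
    unsigned int gpu_id;
};

/* Hypothetical stand-in for the owning process. */
struct demo_process {
    struct demo_pdd *pdds[DEMO_MAX_GPU_INSTANCE];
    unsigned int n_pdds; /* number of valid entries in pdds[] */
};

/* Append a device record; fails once the array is full. */
static int demo_process_add(struct demo_process *p, struct demo_pdd *pdd)
{
    if (p->n_pdds >= DEMO_MAX_GPU_INSTANCE)
        return -1;
    p->pdds[p->n_pdds++] = pdd;
    return 0;
}

/* Local index loop replacing first/next list iterators. */
static struct demo_pdd *demo_process_find(const struct demo_process *p,
                                          unsigned int gpu_id)
{
    unsigned int i;

    for (i = 0; i < p->n_pdds; i++) {
        if (p->pdds[i]->gpu_id == gpu_id)
            return p->pdds[i];
    }
    return NULL;
}

/* Sample objects used to exercise the helpers. */
static struct demo_pdd demo_a = { 1 };
static struct demo_pdd demo_b = { 2 };
static struct demo_process demo_proc;
```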

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 116 --
>  drivers/gpu/drm/amd/amdkfd/kfd_iommu.c|   8 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  20 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 108 
>  .../amd/amdkfd/kfd_process_queue_manager.c|   6 +-
>  5 files changed, 111 insertions(+), 147 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 6802c616e10e..43de260b2230 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -870,52 +870,47 @@ static int kfd_ioctl_get_process_apertures(struct
> file *filp,  {
> struct kfd_ioctl_get_process_apertures_args *args = data;
> struct kfd_process_device_apertures *pAperture;
> -   struct kfd_process_device *pdd;
> +   int i;
>
> dev_dbg(kfd_device, "get apertures for PASID 0x%x", p->pasid);
>
> args->num_of_nodes = 0;
>
> mutex_lock(&p->mutex);
> +   /* Run over all pdd of the process */
> +   for (i = 0; i < p->n_pdds; i++) {
> +   struct kfd_process_device *pdd = p->pdds[i];
> +
> +   pAperture =
> +   &args->process_apertures[args->num_of_nodes];
> +   pAperture->gpu_id = pdd->dev->id;
> +   pAperture->lds_base = pdd->lds_base;
> +   pAperture->lds_limit = pdd->lds_limit;
> +   pAperture->gpuvm_base = pdd->gpuvm_base;
> +   pAperture->gpuvm_limit = pdd->gpuvm_limit;
> +   pAperture->scratch_base = pdd->scratch_base;
> +   pAperture->scratch_limit = pdd->scratch_limit;
>
> -   /*if the process-device list isn't empty*/
> -   if (kfd_has_process_device_data(p)) {
> -   /* Run over all pdd of the process */
> -   pdd = kfd_get_first_process_device_data(p);
> -   do {
> -   pAperture =
> -   &args->process_apertures[args->num_of_nodes];
> -   pAperture->gpu_id = pdd->dev->id;
> -   pAperture->lds_base = pdd->lds_base;
> -   pAperture->lds_limit = pdd->lds_limit;
> -   pAperture->gpuvm_base = pdd->gpuvm_base;
> -   pAperture->gpuvm_limit = pdd->gpuvm_limit;
> -   pAperture->scratch_base = pdd->scratch_base;
> -   pAperture->scratch_limit = pdd->scratch_limit;
> -
> -   dev_dbg(kfd_device,
> -   "node id %u\n", args->num_of_nodes);
> -   dev_dbg(kfd_device,
> -   "gpu id %u\n", pdd->dev->id);
> -   dev_dbg(kfd_device,
> -   "lds_base %llX\n", pdd->lds_base);
> -   dev_dbg(kfd_device,
> -   "lds_limit %llX\n", pdd->lds_limit);
> -   dev_dbg(kfd_device,
> -   "gpuvm_base %llX\n", pdd->gpuvm_base);
> -   dev_dbg(kfd_device,
> -   "gpuvm_limit %llX\n", pdd->gpuvm_limit);
> -   dev_dbg(kfd_device,
> -   "scratch_base %llX\n", pdd->scratch_base);
> -   dev_dbg(kfd_device,
> -   "scratch_limit %llX\n", pdd->scratch_limit);
> -
> -   args->num_of_nodes++;
> -
> -   pdd = kfd_get_next_process_device_data(p, pdd);
> -   } while (pdd && (args->num_of_nodes <
> NUM_OF_SUPPORTED_GPUS));
> -   }
> +   dev_dbg(kfd_device,
> +   "node id %u\n", args->num_of_nodes);
> +   dev_dbg(kfd_device,
> +   "gpu id %u\n", pdd->dev->id);
> +   dev_dbg(kfd_device,
> +   "lds_base %llX\n", pdd->lds_base);
> +

Re: [PATCH] drm/amd/display: Use DRM_DEBUG_DP

2021-03-23 Thread Alex Deucher
On Mon, Mar 22, 2021 at 8:31 PM Luben Tuikov  wrote:
>
> Convert IRQ-based prints from DRM_DEBUG_DRIVER to
> DRM_DEBUG_DP, as the latter is not used in drm/amd
> prior to this patch and since IRQ-based prints
> drown out the rest of the driver's
> DRM_DEBUG_DRIVER messages.
>
> Cc: Harry Wentland 
> Cc: Alex Deucher 
> Signed-off-by: Luben Tuikov 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 67 +--
>  1 file changed, 33 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index f455fc3aa561..aabaa652f6dc 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -449,9 +449,9 @@ static void dm_pflip_high_irq(void *interrupt_params)
> amdgpu_crtc->pflip_status = AMDGPU_FLIP_NONE;
> spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);
>
> -   DRM_DEBUG_DRIVER("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, 
> vrr[%d]-fp %d\n",
> -amdgpu_crtc->crtc_id, amdgpu_crtc,
> -vrr_active, (int) !e);
> +   DRM_DEBUG_DP("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_NONE, vrr[%d]-fp 
> %d\n",

Should probably be _KMS or _ATOMIC since this is not displayport specific.

> +amdgpu_crtc->crtc_id, amdgpu_crtc,
> +vrr_active, (int) !e);
>  }
>
>  static void dm_vupdate_high_irq(void *interrupt_params)
> @@ -993,8 +993,7 @@ static void event_mall_stutter(struct work_struct *work)
> dc_allow_idle_optimizations(
> dm->dc, dm->active_vblank_irq_count == 0);
>
> -   DRM_DEBUG_DRIVER("Allow idle optimizations (MALL): %d\n", 
> dm->active_vblank_irq_count == 0);
> -
> +   DRM_DEBUG_DP("Allow idle optimizations (MALL): %d\n", 
> dm->active_vblank_irq_count == 0);

Maybe _VBL or _KMS or _ATOMIC?

>
> mutex_unlock(&dm->dc_lock);
>  }
> @@ -1810,8 +1809,8 @@ static void dm_gpureset_toggle_interrupts(struct 
> amdgpu_device *adev,
> if (acrtc && state->stream_status[i].plane_count != 0) {
> irq_source = IRQ_TYPE_PFLIP + acrtc->otg_inst;
> rc = dc_interrupt_set(adev->dm.dc, irq_source, 
> enable) ? 0 : -EBUSY;
> -   DRM_DEBUG("crtc %d - vupdate irq %sabling: r=%d\n",
> - acrtc->crtc_id, enable ? "en" : "dis", rc);
> +   DRM_DEBUG_DP("crtc %d - vupdate irq %sabling: r=%d\n",
> +acrtc->crtc_id, enable ? "en" : "dis", 
> rc);

I think this should be _VBL.

> if (rc)
> DRM_WARN("Failed to %s pflip interrupts\n",
>  enable ? "enable" : "disable");
> @@ -4966,8 +4965,8 @@ static void update_stream_scaling_settings(const struct 
> drm_display_mode *mode,
> stream->src = src;
> stream->dst = dst;
>
> -   DRM_DEBUG_DRIVER("Destination Rectangle x:%d  y:%d  width:%d  
> height:%d\n",
> -   dst.x, dst.y, dst.width, dst.height);
> +   DRM_DEBUG_DP("Destination Rectangle x:%d  y:%d  width:%d  
> height:%d\n",
> +dst.x, dst.y, dst.width, dst.height);

Should probably be _KMS or _ATOMIC since this is not displayport specific.

>
>  }
>
> @@ -5710,8 +5709,8 @@ static inline int dm_set_vupdate_irq(struct drm_crtc 
> *crtc, bool enable)
>
> rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) ? 0 : -EBUSY;
>
> -   DRM_DEBUG_DRIVER("crtc %d - vupdate irq %sabling: r=%d\n",
> -acrtc->crtc_id, enable ? "en" : "dis", rc);
> +   DRM_DEBUG_DP("crtc %d - vupdate irq %sabling: r=%d\n",
> +acrtc->crtc_id, enable ? "en" : "dis", rc);

Should probably be _VBL.

> return rc;
>  }
>
> @@ -6664,7 +6663,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
> *plane,
> int r;
>
> if (!new_state->fb) {
> -   DRM_DEBUG_DRIVER("No FB bound\n");
> +   DRM_DEBUG_DP("No FB bound\n");

Should probably be _KMS or _ATOMIC since this is not displayport specific.

> return 0;
> }
>
> @@ -7896,11 +7895,11 @@ static void handle_cursor_update(struct drm_plane 
> *plane,
> if (!plane->state->fb && !old_plane_state->fb)
> return;
>
> -   DRM_DEBUG_DRIVER("%s: crtc_id=%d with size %d to %d\n",
> -__func__,
> -amdgpu_crtc->crtc_id,
> -plane->state->crtc_w,
> -plane->state->crtc_h);
> +   DRM_DEBUG_DP("%s: crtc_id=%d with size %d to %d\n",
> +__func__,
> +amdgpu_crtc->crtc_id,
> +plane->state->crtc_w,
> +plane->state->crtc_h);

Should probably be _KMS or _ATOMIC since this is not dis

[PATCH][next] drm/amd/display/dc/calcs/dce_calcs: Fix allocation size for dceip and vbios

2021-03-23 Thread Colin King
From: Colin Ian King 

Currently the allocations for dceip and vbios are based on the size of
the pointer rather than the size of the data structures, causing heap
issues. Fix this by using the correct allocation sizes.

Addresses-Coverity: ("Wrong size of argument")
Fixes: a2a855772210 ("drm/amd/display/dc/calcs/dce_calcs: Remove some large 
variables from the stack")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c 
b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
index 556ecfabc8d2..1244fcb0f446 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
@@ -2051,11 +2051,11 @@ void bw_calcs_init(struct bw_calcs_dceip *bw_dceip,
 
enum bw_calcs_version version = bw_calcs_version_from_asic_id(asic_id);
 
-   dceip = kzalloc(sizeof(dceip), GFP_KERNEL);
+   dceip = kzalloc(sizeof(*dceip), GFP_KERNEL);
if (!dceip)
return;
 
-   vbios = kzalloc(sizeof(vbios), GFP_KERNEL);
+   vbios = kzalloc(sizeof(*vbios), GFP_KERNEL);
if (!vbios) {
kfree(dceip);
return;
-- 
2.30.2
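
The difference between the two sizeof forms in the patch above can be illustrated in userspace. This is only a sketch: struct big_params is a hypothetical stand-in for the real dceip/vbios structures, and calloc() stands in for kzalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large parameter block; not the real struct. */
struct big_params {
	unsigned int values[64];
};

/* Bytes requested by the buggy form: the size of the pointer itself. */
size_t buggy_alloc_size(void)
{
	struct big_params *p = NULL;

	return sizeof(p);
}

/* Bytes requested by the fixed form: the size of the pointed-to structure. */
size_t fixed_alloc_size(void)
{
	struct big_params *p = NULL;

	/* sizeof does not evaluate its operand, so nothing is dereferenced. */
	return sizeof(*p);
}

/* Allocate a zeroed structure; calloc() plays the role of kzalloc() here. */
struct big_params *alloc_params(void)
{
	struct big_params *p = calloc(1, sizeof(*p));

	/* Sanity check for the sketch: the struct is larger than a pointer. */
	assert(sizeof(*p) > sizeof(p));
	return p; /* NULL on allocation failure */
}
```

With the buggy form only a pointer's worth of bytes is allocated, so any write beyond the first few bytes of the structure lands outside the allocation.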

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 2/2] drm/amdgpu: Introduce new SETUP_TMR interface

2021-03-23 Thread Lazar, Lijo
[AMD Public Use]


-Original Message-
From: amd-gfx  On Behalf Of Zeng, Oak
Sent: Monday, March 22, 2021 7:33 PM
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Zhang, Hawking 

Subject: Re: [PATCH 2/2] drm/amdgpu: Introduce new SETUP_TMR interface

[AMD Official Use Only - Internal Distribution Only]


Hello all,

Can someone help review the patches below? We verified them with the firmware 
team and want to check them in together with the PSP firmware.

Regards,
Oak



On 2021-03-12, 4:24 PM, "Zeng, Oak"  wrote:

This new interface passes both virtual and physical address
to PSP. It is backward compatible with the old interface.

Signed-off-by: Oak Zeng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 13 ++---
 drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 11 ++-
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index cd3eda9..99e1a3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -328,8 +328,13 @@ psp_cmd_submit_buf(struct psp_context *psp,

 static void psp_prep_tmr_cmd_buf(struct psp_context *psp,
  struct psp_gfx_cmd_resp *cmd,
- uint64_t tmr_mc, uint32_t size)
+ uint64_t tmr_mc, struct amdgpu_bo *tmr_bo)
 {
+struct amdgpu_device *adev = psp->adev;
+uint32_t size = amdgpu_bo_size(tmr_bo);
+uint64_t tmr_pa = amdgpu_bo_gpu_offset(tmr_bo) +
+adev->vm_manager.vram_base_offset - adev->gmc.vram_start;
+

<> This looks like a candidate for a small inline function in gmc. PSP doesn't 
need to know about the calculation.

Thanks,
Lijo
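
A possible shape for such a helper, as a userspace sketch only. The struct and function names below are hypothetical stand-ins, not the real amdgpu definitions; the arithmetic is the one from the hunk above.

```c
#include <stdint.h>

/* Simplified stand-ins for the fields the patch reads (hypothetical types). */
struct model_gmc {
	uint64_t vram_start;
};

struct model_vm_manager {
	uint64_t vram_base_offset;
};

struct model_device {
	struct model_gmc gmc;
	struct model_vm_manager vm_manager;
};

/*
 * Hypothetical helper: translate a VRAM MC (GPU) address into a system
 * physical address, using the same arithmetic as the hunk above.
 */
uint64_t model_vram_mc_to_pa(const struct model_device *adev, uint64_t mc_addr)
{
	return mc_addr - adev->gmc.vram_start + adev->vm_manager.vram_base_offset;
}

/* Userspace equivalents of the kernel's lower_32_bits()/upper_32_bits(). */
uint32_t model_lower_32_bits(uint64_t v)
{
	return (uint32_t)v;
}

uint32_t model_upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}
```

In the driver this would presumably be a static inline in a gmc header, so that PSP code only calls the helper and does not repeat the offset calculation.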

 if (amdgpu_sriov_vf(psp->adev))
 cmd->cmd_id = GFX_CMD_ID_SETUP_VMR;
 else
@@ -337,6 +342,9 @@ static void psp_prep_tmr_cmd_buf(struct psp_context 
*psp,
 cmd->cmd.cmd_setup_tmr.buf_phy_addr_lo = lower_32_bits(tmr_mc);
 cmd->cmd.cmd_setup_tmr.buf_phy_addr_hi = upper_32_bits(tmr_mc);
 cmd->cmd.cmd_setup_tmr.buf_size = size;
+cmd->cmd.cmd_setup_tmr.bitfield.virt_phy_addr = 1;
+cmd->cmd.cmd_setup_tmr.system_phy_addr_lo = lower_32_bits(tmr_pa);
+cmd->cmd.cmd_setup_tmr.system_phy_addr_hi = upper_32_bits(tmr_pa);
 }

 static void psp_prep_load_toc_cmd_buf(struct psp_gfx_cmd_resp *cmd,
@@ -456,8 +464,7 @@ static int psp_tmr_load(struct psp_context *psp)
 if (!cmd)
 return -ENOMEM;

-psp_prep_tmr_cmd_buf(psp, cmd, psp->tmr_mc_addr,
- amdgpu_bo_size(psp->tmr_bo));
+psp_prep_tmr_cmd_buf(psp, cmd, psp->tmr_mc_addr, psp->tmr_bo);
 DRM_INFO("reserve 0x%lx from 0x%llx for PSP TMR\n",
  amdgpu_bo_size(psp->tmr_bo), psp->tmr_mc_addr);

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h 
b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
index a41b054..604a1c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
+++ b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
@@ -170,10 +170,19 @@ struct psp_gfx_cmd_setup_tmr
 uint32_t buf_phy_addr_lo;   /* bits [31:0] of GPU Virtual 
address of TMR buffer (must be 4 KB aligned) */
 uint32_t buf_phy_addr_hi;   /* bits [63:32] of GPU Virtual 
address of TMR buffer */
 uint32_t buf_size;  /* buffer size in bytes (must 
be multiple of 4 KB) */
+union {
+struct {
+uint32_t sriov_enabled:1; /* whether the device runs under SR-IOV */
+uint32_t virt_phy_addr:1; /* driver passes both virtual and physical 
address to PSP */
+uint32_t reserved:30;
+} bitfield;
+uint32_t tmr_flags;
+};
+uint32_t system_phy_addr_lo;/* bits [31:0] of system 
physical address of TMR buffer (must be 4 KB aligned) */
+uint32_t system_phy_addr_hi;/* bits [63:32] of system 
physical address of TMR buffer */

 };

-
 /* FW types for GFX_CMD_ID_LOAD_IP_FW command. Limit 31. */
 enum psp_gfx_fw_type {
 GFX_FW_TYPE_NONE= 0,/* */
--
2.7.4




Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Christian König

Am 23.03.21 um 14:41 schrieb Michal Hocko:

On Tue 23-03-21 14:06:25, Christian König wrote:

Am 23.03.21 um 13:37 schrieb Michal Hocko:

On Tue 23-03-21 13:21:32, Christian König wrote:

[...]

Ideally I would like to be able to trigger swapping out the shmem page I
allocated immediately after doing the copy.

So let me try to rephrase to make sure I understand. You would like to
swap out the existing content from the shrinker and you use shmem as a
way to achieve that. The swapout should happen at the time of copying
(shrinker context) or shortly afterwards?

So effectively to call pageout() on the shmem page after the copy?

Yes, exactly that.

OK, good. I see what you are trying to achieve now. I do not think we
would want to allow pageout from the shrinker's context but what you can
do is to instantiate the shmem page into the tail of the inactive list
so the next reclaim attempt will swap it out (assuming swap is available
of course).


Yes, that's at least my understanding of how we currently do it.

Problem with that approach is that I first copy over the whole object 
into shmem and then free it.


So instead of temporarily using a single page, I need temporary storage as 
large as the whole buffer object for the shmem object, and that can be a 
couple of hundred MiB.



This is not really something that our existing infrastructure gives you
though, I am afraid. There is no way to tell that a newly allocated shmem
page should in fact be cold and the first one to swap out. But there are
people more familiar with shmem and its peculiarities so I might be wrong
here.

Anyway, I am wondering whether the overall approach is sound. Why don't
you simply use shmem as your backing storage from the beginning and pin
those pages if they are used by the device?


Yeah, that is exactly what the Intel guys are doing for their integrated 
GPUs :)


Problem is that for TTM I need to be able to handle dGPUs and those have all 
kinds of funny allocation restrictions. In other words I need to 
guarantee that the allocated memory is coherently accessible to the GPU 
without using SWIOTLB.


The simple case is that the device can only do DMA32, but you also have 
devices which can only do 40 bits or 48 bits.


On top of that you also got AGP, CMA and stuff like CPU cache behavior 
changes (write back vs. write through, vs. uncached).
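
The addressing limits mentioned above (DMA32, 40 bits, 48 bits) amount to a range check against a bit mask. A userspace sketch, assuming hypothetical helper names; addr_mask() mirrors the arithmetic of the kernel's DMA_BIT_MASK() macro but is not kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; 64 or more bits means "no limit". */
uint64_t addr_mask(unsigned int bits)
{
	return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/*
 * True if every byte of [addr, addr + size) can be reached by a device
 * that only drives `bits` address bits.
 */
bool range_fits(uint64_t addr, uint64_t size, unsigned int bits)
{
	uint64_t last;

	if (size == 0)
		return false;

	last = addr + size - 1;
	if (last < addr) /* the range wrapped around 2^64 */
		return false;

	return last <= addr_mask(bits);
}
```

A page just below 4 GiB passes the 32-bit check, while the same page placed at 4 GiB only passes for the 40-bit and 48-bit devices. Memory that fails the check would have to be bounced, which is what the SWIOTLB remark above is about.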


Regards,
Christian.


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Daniel Vetter
On Tue, Mar 23, 2021 at 01:04:03PM +0100, Michal Hocko wrote:
> On Tue 23-03-21 12:48:58, Christian König wrote:
> > Am 23.03.21 um 12:28 schrieb Daniel Vetter:
> > > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > > > I think this is where I don't get yet what Christian tries to do: We
> > > > really shouldn't do different tricks and calling contexts between direct
> > > > reclaim and kswapd reclaim. Otherwise very hard to track down bugs are
> > > > pretty much guaranteed. So whether we use explicit gfp flags or the
> > > > context apis, result is exactly the same.
> > 
> > Ok let us recap what TTM's TT shrinker does here:
> > 
> > 1. We got memory which is not swappable because it might be accessed by the
> > GPU at any time.
> > 2. Make sure the memory is not accessed by the GPU and drivers need to grab a
> > lock before they can make it accessible again.
> > 3. Allocate a shmem file and copy over the not swappable pages.
> 
> This is quite tricky because the shrinker operates in the PF_MEMALLOC
> context so such an allocation would be allowed to completely deplete
> memory unless you explicitly mark that context as __GFP_NOMEMALLOC. Also
> note that if the allocation cannot succeed it will not trigger reclaim
> again because you are already called from the reclaim context.

[Limiting to that discussion]

Yes it's not emulating real (direct) reclaim correctly, but ime the
biggest issue with direct reclaim is when you do mutex_lock instead of
mutex_trylock or in general block on stuff that you can't. And lockdep +
fs_reclaim annotations get us that, so pretty good to make sure our
shrinker is correct.

Actual tuning of it and making sure it's not doing silly things is ofc a
different thing, and for that we can't test it in isolation. But it's good
to know that before you tune it, you have rather high confidence it's
at least correct. And for that not running with PF_MEMALLOC is actually
good, since it means more allocation failures, so more testing of those
error/backoff paths in the code.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH] drm/amd/pm: Update aldebaran pmfw interface

2021-03-23 Thread Lazar, Lijo
[AMD Public Use]

Update aldebaran PMFW interfaces to version 0x6

Signed-off-by: Lijo Lazar lijo.la...@amd.com
---
.../gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h| 11 +--
drivers/gpu/drm/amd/pm/inc/smu_v13_0.h|  2 +-
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h 
b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
index df2ead254f37..d23533bda002 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
@@ -435,8 +435,12 @@ typedef struct {
   uint8_t  GpioI2cSda; // Serial Data
   uint16_t spare5;
+  uint16_t XgmiMaxCurrent; // in Amps
+  int8_t   XgmiOffset; // in Amps
+  uint8_t  Padding_TelemetryXgmi;
+
   //reserved
-  uint32_t reserved[16];
+  uint32_t reserved[15];
 } PPTable_t;
@@ -481,7 +485,10 @@ typedef struct {
   uint16_t TemperatureAllHBM[4]  ;
   uint32_t GfxBusyAcc;
   uint32_t DramBusyAcc   ;
-  uint32_t Spare[4];
+  uint32_t EnergyAcc64bitLow ; //15.259uJ resolution
+  uint32_t EnergyAcc64bitHigh;
+  uint32_t TimeStampLow  ; //10ns resolution
+  uint32_t TimeStampHigh ;
   // Padding - ignore
   uint32_t MmHubPadding[8]; // SMU internal use
diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h 
b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
index 6db3464c09d6..8145e1cbf181 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
@@ -26,7 +26,7 @@
#include "amdgpu_smu.h"
 #define SMU13_DRIVER_IF_VERSION_INV 0x
-#define SMU13_DRIVER_IF_VERSION_ALDE 0x5
+#define SMU13_DRIVER_IF_VERSION_ALDE 0x6
 /* MP Apertures */
#define MP0_Public  0x0380
--
2.17.1
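
The field changes above appear to be size-neutral: the three new PPTable_t fields occupy 4 bytes and one 4-byte reserved word is dropped, and the four Spare words in the metrics table are replaced by four named words. A sketch of that arithmetic, using hypothetical trimmed structs rather than the real tables:

```c
#include <stdint.h>

/* Hypothetical, trimmed tail of the table as in interface version 0x5. */
struct pptable_tail_v5 {
	uint32_t reserved[16];
};

/* The same tail in version 0x6: three new fields, one reserved word fewer. */
struct pptable_tail_v6 {
	uint16_t XgmiMaxCurrent;        /* in Amps */
	int8_t   XgmiOffset;            /* in Amps */
	uint8_t  Padding_TelemetryXgmi;
	uint32_t reserved[15];
};

/* 2 + 1 + 1 bytes of new fields replace exactly one 4-byte reserved word. */
_Static_assert(sizeof(struct pptable_tail_v5) == sizeof(struct pptable_tail_v6),
	       "sizes are expected to match in this sketch");

/* Combine a low/high pair such as EnergyAcc64bitLow/EnergyAcc64bitHigh. */
uint64_t combine_lo_hi(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

The compile-time check fails to build if a later edit changes the layout, which is one way to guard a structure shared with firmware.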



Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Christian König

Am 23.03.21 um 13:37 schrieb Michal Hocko:

On Tue 23-03-21 13:21:32, Christian König wrote:

Am 23.03.21 um 13:04 schrieb Michal Hocko:

On Tue 23-03-21 12:48:58, Christian König wrote:

Am 23.03.21 um 12:28 schrieb Daniel Vetter:

On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:

On Mon 22-03-21 20:34:25, Christian König wrote:

[...]

My only concern is that if I could rely on memalloc_no* being used we could
optimize this quite a bit further.

Yes you can use the scope API and you will be guaranteed that _any_
allocation from the enclosed context will inherit GFP_NO* semantic.

The question is if this is also guaranteed the other way around?

In other words if somebody calls get_free_page(GFP_NOFS) are the context
flags set as well?

gfp mask is always restricted in the page allocator. So say you have
noio scope context and call get_free_page/kmalloc(GFP_NOFS) then the
scope would restrict the allocation flags to GFP_NOIO (aka drop
__GFP_IO). For further details, have a look at current_gfp_context
and its callers.

Does this answer your question?

But what happens if you don't have noio scope and somebody calls
get_free_page(GFP_NOFS)?

Then this will be a regular NOFS request. Let me repeat scope API will
further restrict any requested allocation mode.
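
The restriction rule described here can be modelled in a few lines of userspace C. This is a sketch only: the MODEL_* bit values and struct task_scope are made up for illustration, and only the masking logic follows what current_gfp_context is described as doing above.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define MODEL_GFP_IO      0x1u
#define MODEL_GFP_FS      0x2u
#define MODEL_GFP_RECLAIM 0x4u

#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_NOIO   (MODEL_GFP_RECLAIM)

/* Hypothetical per-task scope state set by memalloc_no*_save(). */
struct task_scope {
	bool noio;
	bool nofs;
};

/* The scope can only remove bits from the requested mask, never add any. */
unsigned int model_gfp_context(const struct task_scope *scope,
			       unsigned int flags)
{
	if (scope->noio)
		flags &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	else if (scope->nofs)
		flags &= ~MODEL_GFP_FS;

	return flags;
}
```

So a GFP_NOFS request inside a noio scope ends up as GFP_NOIO, while the same request outside any scope stays a plain NOFS request. The reverse does not hold: passing GFP_NOFS explicitly does not set any scope state for nested allocations.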


Ok, got it.




Is then the noio scope added automatically? And is it possible that the
shrinker gets called without noio scope even though we would need it?

Here you have lost me again.


I think this is where I don't get yet what Christian tries to do: We
really shouldn't do different tricks and calling contexts between direct
reclaim and kswapd reclaim. Otherwise very hard to track down bugs are
pretty much guaranteed. So whether we use explicit gfp flags or the
context apis, result is exactly the same.

Ok let us recap what TTM's TT shrinker does here:

1. We got memory which is not swappable because it might be accessed by the
GPU at any time.
2. Make sure the memory is not accessed by the GPU and drivers need to grab a
lock before they can make it accessible again.
3. Allocate a shmem file and copy over the not swappable pages.

This is quite tricky because the shrinker operates in the PF_MEMALLOC
context so such an allocation would be allowed to completely deplete
memory unless you explicitly mark that context as __GFP_NOMEMALLOC.

Thanks, exactly that was one thing I was absolutely not sure about. And yes
I agree that this is really tricky.

Ideally I would like to be able to trigger swapping out the shmem page I
allocated immediately after doing the copy.

So let me try to rephrase to make sure I understand. You would like to
swap out the existing content from the shrinker and you use shmem as a
way to achieve that. The swapout should happen at the time of copying
(shrinker context) or shortly afterwards?

So effectively to call pageout() on the shmem page after the copy?


Yes, exactly that.


This way I would only need a single page for the whole shrink operation at
any given time.

What do you mean by that? You want the share the same shmem page for
other copy+swapout?


Correct, yes.

The idea is that we can swap out the content of a full GPU buffer object 
this way to give the backing store of the object back to the core memory 
management.


Regards,
Christian.


[PATCH] amdgpu: fix gcc -Wrestrict warning

2021-03-23 Thread Arnd Bergmann
From: Arnd Bergmann 

gcc warns about an sprintf() that uses the same buffer as source
and destination, which is undefined behavior in C99:

drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c: In function 
'amdgpu_securedisplay_debugfs_write':
drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:141:6: error: 'sprintf' 
argument 3 overlaps destination object 'i2c_output' [-Werror=restrict]
  141 |  sprintf(i2c_output, "%s 0x%X", i2c_output,
  |  ^~
  142 |   
securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
  |   
~
drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c:97:7: note: destination 
object referenced by 'restrict'-qualified argument 1 was declared here
   97 |  char i2c_output[256];
  |   ^~

Rewrite it to remember the current offset into the buffer instead.

Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
index 834440ab9ff7..69d7f6bff5d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
@@ -136,9 +136,10 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
file *f, const char __u
ret = psp_securedisplay_invoke(psp, 
TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);
if (!ret) {
if (securedisplay_cmd->status == 
TA_SECUREDISPLAY_STATUS__SUCCESS) {
+   int pos = 0;
memset(i2c_output,  0, sizeof(i2c_output));
for (i = 0; i < 
TA_SECUREDISPLAY_I2C_BUFFER_SIZE; i++)
-   sprintf(i2c_output, "%s 0x%X", 
i2c_output,
+   pos += sprintf(i2c_output + pos, " 
0x%X",

securedisplay_cmd->securedisplay_out_message.send_roi_crc.i2c_buf[i]);
dev_info(adev->dev, "SECUREDISPLAY: I2C buffer 
out put is :%s\n", i2c_output);
} else {
-- 
2.29.2
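
The offset-tracking pattern from the patch can be tried out in userspace. The sketch below is a variant, not the patch itself: format_hex_bytes() is a hypothetical helper, and it uses snprintf() with an explicit remaining-size bound where the patch uses plain sprintf().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Append " 0x%X" for each input byte to `out`, remembering the write
 * offset so the buffer is never passed as both destination and source.
 * Returns the number of characters written, not counting the NUL.
 */
size_t format_hex_bytes(char *out, size_t out_size,
			const unsigned char *data, size_t len)
{
	size_t pos = 0;
	size_t i;

	if (out_size == 0)
		return 0;
	out[0] = '\0';

	for (i = 0; i < len; i++) {
		int n = snprintf(out + pos, out_size - pos, " 0x%X", data[i]);

		/* Stop on error or if the entry does not fit completely. */
		if (n < 0 || (size_t)n >= out_size - pos) {
			out[pos] = '\0'; /* drop the partial entry */
			break;
		}
		pos += (size_t)n;
	}

	return pos;
}
```

Besides avoiding the overlapping arguments, tracking the offset removes the repeated rescan of the string that the "%s" form performed on every iteration.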



Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Christian König

Am 23.03.21 um 13:04 schrieb Michal Hocko:

On Tue 23-03-21 12:48:58, Christian König wrote:

Am 23.03.21 um 12:28 schrieb Daniel Vetter:

On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:

On Mon 22-03-21 20:34:25, Christian König wrote:

[...]

My only concern is that if I could rely on memalloc_no* being used we could
optimize this quite a bit further.

Yes you can use the scope API and you will be guaranteed that _any_
allocation from the enclosed context will inherit GFP_NO* semantic.

The question is if this is also guaranteed the other way around?

In other words if somebody calls get_free_page(GFP_NOFS) are the context
flags set as well?

gfp mask is always restricted in the page allocator. So say you have
noio scope context and call get_free_page/kmalloc(GFP_NOFS) then the
scope would restrict the allocation flags to GFP_NOIO (aka drop
__GFP_IO). For further details, have a look at current_gfp_context
and its callers.

Does this answer your question?


But what happens if you don't have noio scope and somebody calls 
get_free_page(GFP_NOFS)?


Is then the noio scope added automatically? And is it possible that the 
shrinker gets called without noio scope even though we would need it?



I think this is where I don't get yet what Christian tries to do: We
really shouldn't do different tricks and calling contexts between direct
reclaim and kswapd reclaim. Otherwise very hard to track down bugs are
pretty much guaranteed. So whether we use explicit gfp flags or the
context apis, result is exactly the same.

Ok let us recap what TTM's TT shrinker does here:

1. We got memory which is not swappable because it might be accessed by the
GPU at any time.
2. Make sure the memory is not accessed by the GPU and drivers need to grab a
lock before they can make it accessible again.
3. Allocate a shmem file and copy over the not swappable pages.

This is quite tricky because the shrinker operates in the PF_MEMALLOC
context so such an allocation would be allowed to completely deplete
memory unless you explicitly mark that context as __GFP_NOMEMALLOC.


Thanks, exactly that was one thing I was absolutely not sure about. And 
yes I agree that this is really tricky.


Ideally I would like to be able to trigger swapping out the shmem page I 
allocated immediately after doing the copy.


This way I would only need a single page for the whole shrink operation 
at any given time.



Also note that if the allocation cannot succeed it will not trigger reclaim
again because you are already called from the reclaim context.


4. Free the not swappable/reclaimable pages.

The pages we got from the shmem file are easily swappable to disk after the
copy is completed. But only if IO is not already blocked because the
shrinker was called from an allocation restricted by GFP_NOFS or GFP_NOIO.

Sorry for being dense here but I still do not follow the actual problem
(well, except for the above mentioned one). Is the sole point of this to
emulate a GFP_NO* allocation context and see how shrinker behaves?


Please be as dense as you need to be :)

I think Daniel and I only have a very rough understanding of the memory 
management details here, but we need exactly that knowledge to get the 
GPU memory management into the shape we want it to be.


Thanks,
Christian.


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Michal Hocko
On Mon 22-03-21 20:34:25, Christian König wrote:
> Am 22.03.21 um 18:02 schrieb Daniel Vetter:
> > On Mon, Mar 22, 2021 at 5:06 PM Michal Hocko  wrote:
> > > On Mon 22-03-21 14:05:48, Matthew Wilcox wrote:
> > > > On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> > > > > On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > > > > > Am 20.03.21 um 14:17 schrieb Daniel Vetter:
> > > > > > > On Sat, Mar 20, 2021 at 10:04 AM Christian König
> > > > > > >  wrote:
> > > > > > > > Am 19.03.21 um 20:06 schrieb Daniel Vetter:
> > > > > > > > > On Fri, Mar 19, 2021 at 07:53:48PM +0100, Christian König 
> > > > > > > > > wrote:
> > > > > > > > > > Am 19.03.21 um 18:52 schrieb Daniel Vetter:
> > > > > > > > > > > On Fri, Mar 19, 2021 at 03:08:57PM +0100, Christian König 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > Don't print a warning when we fail to allocate a page 
> > > > > > > > > > > > for swapping things out.
> > > > > > > > > > > > 
> > > > > > > > > > > > Also rely on memalloc_nofs_save/memalloc_nofs_restore 
> > > > > > > > > > > > instead of GFP_NOFS.
> > > > > > > > > > > Uh this part doesn't make sense. Especially since you 
> > > > > > > > > > > only do it for the
> > > > > > > > > > > debugfs file, not in general. Which means you've just 
> > > > > > > > > > > completely broken
> > > > > > > > > > > the shrinker.
> > > > > > > > > > Are you sure? My impression is that GFP_NOFS should now 
> > > > > > > > > > work much more out
> > > > > > > > > > of the box with the 
> > > > > > > > > > memalloc_nofs_save()/memalloc_nofs_restore().
> > > > > > > > > Yeah, if you'd put it in the right place :-)
> > > > > > > > > 
> > > > > > > > > But also -mm folks are very clear that memalloc_no*() family 
> > > > > > > > > is for dire
> > > > > > > > > situation where there's really no other way out. For anything 
> > > > > > > > > where you
> > > > > > > > > know what you're doing, you really should use explicit gfp 
> > > > > > > > > flags.
> > > > > > > > My impression is just the other way around. You should try to 
> > > > > > > > avoid the
> > > > > > > > NOFS/NOIO flags and use the memalloc_no* approach instead.
> > > > > > > Where did you get that idea?
> > > > > > Well from the kernel comment on GFP_NOFS:
> > > > > > 
> > > > > >   * %GFP_NOFS will use direct reclaim but will not use any 
> > > > > > filesystem
> > > > > > interfaces.
> > > > > >   * Please try to avoid using this flag directly and instead use
> > > > > >   * memalloc_nofs_{save,restore} to mark the whole scope which
> > > > > > cannot/shouldn't
> > > > > >   * recurse into the FS layer with a short explanation why. All 
> > > > > > allocation
> > > > > >   * requests will inherit GFP_NOFS implicitly.
> > > > > Huh that's interesting, since iirc Willy or Dave told me the 
> > > > > opposite, and
> > > > > the memalloc_no* stuff is for e.g. nfs calling into network layer 
> > > > > (needs
> > > > > GFP_NOFS) or swap on top of a filesystems (even needs GFP_NOIO I 
> > > > > think).
> > > > > 
> > > > > Adding them, maybe I got confused.
> > > > My impression is that the scoped API is preferred these days.
> > > > 
> > > > https://www.kernel.org/doc/html/latest/core-api/gfp_mask-from-fs-io.html
> > > > 
> > > > I'd probably need to spend a few months learning the DRM subsystem to
> > > > have a more detailed opinion on whether passing GFP flags around 
> > > > explicitly
> > > > or using the scope API is the better approach for your situation.
> > > yes, in an ideal world we would have a clearly defined scope of the
> > > reclaim recursion wrt FS/IO associated with it. I've got back to
> > > https://lore.kernel.org/amd-gfx/20210319140857.2262-1-christian.koe...@amd.com/
> > > and there are two things standing out. Why does ttm_tt_debugfs_shrink_show
> > > really require NOFS semantic? And why does it play with
> > > fs_reclaim_acquire?
> > It's our shrinker. shrink_show simply triggers that specific shrinker
> > asking it to shrink everything it can, which helps a lot with testing
> > without having to drive the entire system against the OOM wall.

Yes I figured that much. But...

> > fs_reclaim_acquire is there to make sure lockdep understands that this
> > is a shrinker and that it checks all the dependencies for us like if
> > we'd be in real reclaim. There is some drop caches interfaces in proc
> > iirc, but those drop everything, and they don't have the fs_reclaim
> > annotations to teach lockdep about what we're doing.

... I really do not follow this. You shouldn't really care whether this
is a reclaim interface or not. Or maybe I just do not understand this...
 
> To summarize the debugfs code is basically to test if that stuff really
> works with GFP_NOFS.

What do you mean by testing GFP_NOFS? Do you mean to test that the GFP_NOFS
context is sufficiently powerful to reclaim enough objects due to some
internal constraints?

> My only concern is that if I could rely on memalloc_no* being used we could
> 

Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Daniel Vetter
On Tue, Mar 23, 2021 at 12:51:13PM +0100, Christian König wrote:
> 
> 
> Am 23.03.21 um 12:46 schrieb Michal Hocko:
> > On Tue 23-03-21 12:28:20, Daniel Vetter wrote:
> > > On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> > [...]
> > > > > > fs_reclaim_acquire is there to make sure lockdep understands that 
> > > > > > this
> > > > > > is a shrinker and that it checks all the dependencies for us like if
> > > > > > we'd be in real reclaim. There is some drop caches interfaces in 
> > > > > > proc
> > > > > > iirc, but those drop everything, and they don't have the fs_reclaim
> > > > > > annotations to teach lockdep about what we're doing.
> > > > ... I really do not follow this. You shouldn't really care whether this
> > > > is a reclaim interface or not. Or maybe I just do not understand this...
> > > We're heavily relying on lockdep and fs_reclaim to make sure we get it all
> > > right. So any drop caches interface that isn't wrapped in fs_reclaim
> > > context is kinda useless for testing. Plus ideally we want to only hit our
> > > own paths, and not trash every other cache in the system. Speed matters in
> > > CI.
> > But what is special about this path to hack around and make it pretend
> > it is part of the fs reclaim path?
> 
> That's just to teach lockdep that there is a dependency.
> 
> In other words we pretend in the debugfs file that it is part of the fs
> reclaim path to check for the case when it really becomes part of the fs
> reclaim path.

Yeah this is only for testing. There's two ways to test your shrinker:

- drive system against the OOM wall, deal with lots of unrelated hangs and
  issues. Aside from that, this takes positively forever, which is not good if
  you want CI turn-around time measured in "coffee breaks" as time unit.

- have a debugfs file which reconstructs the calling context of direct
  reclaim sufficiently for lockdep to do its thing, and then test just
  your shrinker in isolation, without crashing your CI machines or even
  hurting it much.

Only one of these options is actually practical.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Christian König




Am 23.03.21 um 12:46 schrieb Michal Hocko:

On Tue 23-03-21 12:28:20, Daniel Vetter wrote:

On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:

[...]

fs_reclaim_acquire is there to make sure lockdep understands that this
is a shrinker and that it checks all the dependencies for us like if
we'd be in real reclaim. There is some drop caches interfaces in proc
iirc, but those drop everything, and they don't have the fs_reclaim
annotations to teach lockdep about what we're doing.

... I really do not follow this. You shouldn't really care whether this
is a reclaim interface or not. Or maybe I just do not understand this...

We're heavily relying on lockdep and fs_reclaim to make sure we get it all
right. So any drop caches interface that isn't wrapped in fs_reclaim
context is kinda useless for testing. Plus ideally we want to only hit our
own paths, and not trash every other cache in the system. Speed matters in
CI.

But what is special about this path to hack around and make it pretend
it is part of the fs reclaim path?


That's just to teach lockdep that there is a dependency.

In other words we pretend in the debugfs file that it is part of the fs 
reclaim path to check for the case when it really becomes part of the fs 
reclaim path.


Regards,
Christian.


Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Christian König

Am 23.03.21 um 12:28 schrieb Daniel Vetter:

On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:

On Mon 22-03-21 20:34:25, Christian König wrote:

Am 22.03.21 um 18:02 schrieb Daniel Vetter:

On Mon, Mar 22, 2021 at 5:06 PM Michal Hocko  wrote:

On Mon 22-03-21 14:05:48, Matthew Wilcox wrote:

On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:

On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:

Am 20.03.21 um 14:17 schrieb Daniel Vetter:

On Sat, Mar 20, 2021 at 10:04 AM Christian König
 wrote:

Am 19.03.21 um 20:06 schrieb Daniel Vetter:

On Fri, Mar 19, 2021 at 07:53:48PM +0100, Christian König wrote:

Am 19.03.21 um 18:52 schrieb Daniel Vetter:

On Fri, Mar 19, 2021 at 03:08:57PM +0100, Christian König wrote:

Don't print a warning when we fail to allocate a page for swapping things out.

Also rely on memalloc_nofs_save/memalloc_nofs_restore instead of GFP_NOFS.

Uh this part doesn't make sense. Especially since you only do it for the
debugfs file, not in general. Which means you've just completely broken
the shrinker.

Are you sure? My impression is that GFP_NOFS should now work much more out
of the box with the memalloc_nofs_save()/memalloc_nofs_restore().

Yeah, if you'd put it in the right place :-)

But also -mm folks are very clear that memalloc_no*() family is for dire
situation where there's really no other way out. For anything where you
know what you're doing, you really should use explicit gfp flags.

My impression is just the other way around. You should try to avoid the
NOFS/NOIO flags and use the memalloc_no* approach instead.

Where did you get that idea?

Well from the kernel comment on GFP_NOFS:

   * %GFP_NOFS will use direct reclaim but will not use any filesystem
interfaces.
   * Please try to avoid using this flag directly and instead use
   * memalloc_nofs_{save,restore} to mark the whole scope which
cannot/shouldn't
   * recurse into the FS layer with a short explanation why. All allocation
   * requests will inherit GFP_NOFS implicitly.

Huh that's interesting, since iirc Willy or Dave told me the opposite, and
the memalloc_no* stuff is for e.g. nfs calling into network layer (needs
GFP_NOFS) or swap on top of a filesystems (even needs GFP_NOIO I think).

Adding them, maybe I got confused.

My impression is that the scoped API is preferred these days.

https://www.kernel.org/doc/html/latest/core-api/gfp_mask-from-fs-io.html

I'd probably need to spend a few months learning the DRM subsystem to
have a more detailed opinion on whether passing GFP flags around explicitly
or using the scope API is the better approach for your situation.

yes, in an ideal world we would have a clearly defined scope of the
reclaim recursion wrt FS/IO associated with it. I've got back to
https://lore.kernel.org/amd-gfx/20210319140857.2262-1-christian.koe...@amd.com/
and there are two things standing out. Why does ttm_tt_debugfs_shrink_show
really require NOFS semantic? And why does it play with
fs_reclaim_acquire?

It's our shrinker. shrink_show simply triggers that specific shrinker
asking it to shrink everything it can, which helps a lot with testing
without having to drive the entire system against the OOM wall.

Yes I figured that much. But...


fs_reclaim_acquire is there to make sure lockdep understands that this
is a shrinker and that it checks all the dependencies for us like if
we'd be in real reclaim. There is some drop caches interfaces in proc
iirc, but those drop everything, and they don't have the fs_reclaim
annotations to teach lockdep about what we're doing.

... I really do not follow this. You shouldn't really care whether this
is a reclaim interface or not. Or maybe I just do not understand this...

We're heavily relying on lockdep and fs_reclaim to make sure we get it all
right. So any drop caches interface that isn't wrapped in fs_reclaim
context is kinda useless for testing. Plus ideally we want to only hit our
own paths, and not trash every other cache in the system. Speed matters in
CI.


To summarize the debugfs code is basically to test if that stuff really
works with GFP_NOFS.

What do you mean by testing GFP_NOFS? Do you mean to test that GFP_NOFS
context is sufficiently powerful to reclaim enough objects due to some
internal constraints?


My only concern is that if I could rely on memalloc_no* being used we could
optimize this quite a bit further.

Yes you can use the scope API and you will be guaranteed that _any_
allocation from the enclosed context will inherit GFP_NO* semantic.


The question is if this is also guaranteed the other way around?

In other words if somebody calls get_free_page(GFP_NOFS) are the context 
flags set as well?
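
The question above can be pictured with a small user-space model of the scoped API. This is only a sketch under stated assumptions: every `model_*` and `MODEL_*` name is hypothetical and merely stands in for the kernel's per-task NOFS state; none of it is kernel code. In the model, the scope narrows every allocation mask issued inside it, while an explicit NOFS mask on a single call leaves the scope state untouched. That appears to match the gfp_mask-from-fs-io document linked earlier in the thread, but it should be checked against the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel symbols. */
#define MODEL_GFP_FS      0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_KERNEL  (MODEL_GFP_FS | MODEL_GFP_IO)
#define MODEL_GFP_NOFS    (MODEL_GFP_IO)

/* Stands in for the per-task flag that the scope API toggles. */
static bool model_task_nofs;

/* Enter a NOFS scope; returns the previous state so scopes can nest. */
static bool model_nofs_save(void)
{
	bool old = model_task_nofs;

	model_task_nofs = true;
	return old;
}

/* Leave a NOFS scope by restoring the state returned by model_nofs_save(). */
static void model_nofs_restore(bool old)
{
	model_task_nofs = old;
}

/* Mask the allocator would effectively use: a scope can only remove FS. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	return model_task_nofs ? (gfp & ~MODEL_GFP_FS) : gfp;
}
```

Inside a `model_nofs_save()` / `model_nofs_restore()` pair, `model_effective_gfp(MODEL_GFP_KERNEL)` yields `MODEL_GFP_NOFS`. Passing `MODEL_GFP_NOFS` explicitly affects only that one call and never sets `model_task_nofs`.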



I think this is where I don't get yet what Christian tries to do: We
really shouldn't do different tricks and calling contexts between direct
reclaim and kswapd reclaim. Otherwise very hard to track down bugs are
pretty much guaranteed. So whether we use explicit gfp flags or the
context apis, resu

Re: [PATCH] drm/amd/dispaly: fix deadlock issue in amdgpu reset

2021-03-23 Thread Andrey Grodzovsky

+ Harry and Nick

On 2021-03-22 9:42 p.m., Yu, Lang wrote:

[AMD Official Use Only - Internal Distribution Only]



-Original Message-
From: Grodzovsky, Andrey 
Sent: Monday, March 22, 2021 11:01 PM
To: Yu, Lang ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Huang, Ray 

Subject: Re: [PATCH] drm/amd/dispaly: fix deadlock issue in amdgpu reset



On 2021-03-22 4:11 a.m., Lang Yu wrote:

In amdgpu reset, while dm.dc_lock is held by dm_suspend,
handle_hpd_rx_irq tries to acquire it. Deadlock occurred!

Deadlock log:

[  104.528304] amdgpu :03:00.0: amdgpu: GPU reset begin!

[  104.640084] ==
[  104.640092] WARNING: possible circular locking dependency detected
[  104.640099] 5.11.0-custom #1 Tainted: GW   E
[  104.640107] --
[  104.640114] cat/1158 is trying to acquire lock:
[  104.640120] 88810a09ce00 ((work_completion)(&lh->work)){+.+.}-{0:0}, at: __flush_work+0x2e3/0x450
[  104.640144]
 but task is already holding lock:
[  104.640151] 88810a09cc70 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb2/0x1d0 [amdgpu]
[  104.640581]
 which lock already depends on the new lock.

[  104.640590]
 the existing dependency chain (in reverse order) is:
[  104.640598]
 -> #2 (&adev->dm.dc_lock){+.+.}-{3:3}:
[  104.640611]lock_acquire+0xca/0x390
[  104.640623]__mutex_lock+0x9b/0x930
[  104.640633]mutex_lock_nested+0x1b/0x20
[  104.640640]handle_hpd_rx_irq+0x9b/0x1c0 [amdgpu]
[  104.640959]dm_irq_work_func+0x4e/0x60 [amdgpu]
[  104.641264]process_one_work+0x2a7/0x5b0
[  104.641275]worker_thread+0x4a/0x3d0
[  104.641283]kthread+0x125/0x160
[  104.641290]ret_from_fork+0x22/0x30
[  104.641300]
 -> #1 (&aconnector->hpd_lock){+.+.}-{3:3}:
[  104.641312]lock_acquire+0xca/0x390
[  104.641321]__mutex_lock+0x9b/0x930
[  104.641328]mutex_lock_nested+0x1b/0x20
[  104.641336]handle_hpd_rx_irq+0x67/0x1c0 [amdgpu]
[  104.641635]dm_irq_work_func+0x4e/0x60 [amdgpu]
[  104.641931]process_one_work+0x2a7/0x5b0
[  104.641940]worker_thread+0x4a/0x3d0
[  104.641948]kthread+0x125/0x160
[  104.641954]ret_from_fork+0x22/0x30
[  104.641963]
 -> #0 ((work_completion)(&lh->work)){+.+.}-{0:0}:
[  104.641975]check_prev_add+0x94/0xbf0
[  104.641983]__lock_acquire+0x130d/0x1ce0
[  104.641992]lock_acquire+0xca/0x390
[  104.642000]__flush_work+0x303/0x450
[  104.642008]flush_work+0x10/0x20
[  104.642016]amdgpu_dm_irq_suspend+0x93/0x100 [amdgpu]
[  104.642312]dm_suspend+0x181/0x1d0 [amdgpu]
[  104.642605]amdgpu_device_ip_suspend_phase1+0x8a/0x100 [amdgpu]
[  104.642835]amdgpu_device_ip_suspend+0x21/0x70 [amdgpu]
[  104.643066]amdgpu_device_pre_asic_reset+0x1bd/0x1d2 [amdgpu]
[  104.643403]amdgpu_device_gpu_recover.cold+0x5df/0xa9d [amdgpu]
[  104.643715]gpu_recover_get+0x2e/0x60 [amdgpu]
[  104.643951]simple_attr_read+0x6d/0x110
[  104.643960]debugfs_attr_read+0x49/0x70
[  104.643970]full_proxy_read+0x5f/0x90
[  104.643979]vfs_read+0xa3/0x190
[  104.643986]ksys_read+0x70/0xf0
[  104.643992]__x64_sys_read+0x1a/0x20
[  104.643999]do_syscall_64+0x38/0x90
[  104.644007]entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  104.644017]
 other info that might help us debug this:

[  104.644026] Chain exists of:
   (work_completion)(&lh->work) --> &aconnector->hpd_lock --> &adev->dm.dc_lock

[  104.644043]  Possible unsafe locking scenario:

[  104.644049]        CPU0                    CPU1
[  104.644055]        ----                    ----
[  104.644060]   lock(&adev->dm.dc_lock);
[  104.644066]                                lock(&aconnector->hpd_lock);
[  104.644075]                                lock(&adev->dm.dc_lock);
[  104.644083]   lock((work_completion)(&lh->work));
[  104.644090]
  *** DEADLOCK ***

[  104.644096] 3 locks held by cat/1158:
[  104.644103]  #0: 88810d0e4eb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_read+0x4e/0x110
[  104.644119]  #1: 88810a0a1600 (&adev->reset_sem){}-{3:3}, at: amdgpu_device_lock_adev+0x42/0x94 [amdgpu]
[  104.644489]  #2: 88810a09cc70 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb2/0x1d0 [amdgpu]

Signed-off-by: Lang Yu 
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 --
   1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e176ea84d75b..8727488df769 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2657,13 +2657,15 @
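
The lockdep report in the message above boils down to a cycle between three lock classes: the work item waits on hpd_lock, hpd_lock is held while taking dc_lock, and dm_suspend holds dc_lock while flushing the work item. A minimal sketch of that kind of check follows, assuming a tiny dependency matrix; the enum, `dep`, `reaches` and `add_dependency` are hypothetical names for illustration and are not how lockdep is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three lock classes named in the report. */
enum lock_class { WORK_COMPLETION, HPD_LOCK, DC_LOCK, NR_CLASSES };

/* dep[a][b] means "b was acquired (or waited on) while a was held". */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reaches(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    reaches((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/* Record held -> wanted; return false if the new edge would close a cycle. */
static bool add_dependency(enum lock_class held, enum lock_class wanted)
{
	bool seen[NR_CLASSES] = { false };

	if (reaches(wanted, held, seen))
		return false;
	dep[held][wanted] = true;
	return true;
}
```

Feeding in the two existing edges from the report (work to hpd_lock, hpd_lock to dc_lock) succeeds; adding dc_lock to work, which is what flushing the work under dc_lock amounts to, is rejected as a cycle.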

Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

2021-03-23 Thread Daniel Vetter
On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
> On Mon 22-03-21 20:34:25, Christian König wrote:
> > Am 22.03.21 um 18:02 schrieb Daniel Vetter:
> > > On Mon, Mar 22, 2021 at 5:06 PM Michal Hocko  wrote:
> > > > On Mon 22-03-21 14:05:48, Matthew Wilcox wrote:
> > > > > On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> > > > > > On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > > > > > > Am 20.03.21 um 14:17 schrieb Daniel Vetter:
> > > > > > > > On Sat, Mar 20, 2021 at 10:04 AM Christian König
> > > > > > > >  wrote:
> > > > > > > > > Am 19.03.21 um 20:06 schrieb Daniel Vetter:
> > > > > > > > > > On Fri, Mar 19, 2021 at 07:53:48PM +0100, Christian König 
> > > > > > > > > > wrote:
> > > > > > > > > > > Am 19.03.21 um 18:52 schrieb Daniel Vetter:
> > > > > > > > > > > > On Fri, Mar 19, 2021 at 03:08:57PM +0100, Christian 
> > > > > > > > > > > > König wrote:
> > > > > > > > > > > > > Don't print a warning when we fail to allocate a page 
> > > > > > > > > > > > > for swapping things out.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Also rely on memalloc_nofs_save/memalloc_nofs_restore 
> > > > > > > > > > > > > instead of GFP_NOFS.
> > > > > > > > > > > > Uh this part doesn't make sense. Especially since you 
> > > > > > > > > > > > only do it for the
> > > > > > > > > > > > debugfs file, not in general. Which means you've just 
> > > > > > > > > > > > completely broken
> > > > > > > > > > > > the shrinker.
> > > > > > > > > > > Are you sure? My impression is that GFP_NOFS should now 
> > > > > > > > > > > work much more out
> > > > > > > > > > > of the box with the 
> > > > > > > > > > > memalloc_nofs_save()/memalloc_nofs_restore().
> > > > > > > > > > Yeah, if you'd put it in the right place :-)
> > > > > > > > > > 
> > > > > > > > > > But also -mm folks are very clear that memalloc_no*() 
> > > > > > > > > > family is for dire
> > > > > > > > > > situation where there's really no other way out. For 
> > > > > > > > > > anything where you
> > > > > > > > > > know what you're doing, you really should use explicit gfp 
> > > > > > > > > > flags.
> > > > > > > > > My impression is just the other way around. You should try to 
> > > > > > > > > avoid the
> > > > > > > > > NOFS/NOIO flags and use the memalloc_no* approach instead.
> > > > > > > > Where did you get that idea?
> > > > > > > Well from the kernel comment on GFP_NOFS:
> > > > > > > 
> > > > > > >   * %GFP_NOFS will use direct reclaim but will not use any 
> > > > > > > filesystem
> > > > > > > interfaces.
> > > > > > >   * Please try to avoid using this flag directly and instead use
> > > > > > >   * memalloc_nofs_{save,restore} to mark the whole scope which
> > > > > > > cannot/shouldn't
> > > > > > >   * recurse into the FS layer with a short explanation why. All 
> > > > > > > allocation
> > > > > > >   * requests will inherit GFP_NOFS implicitly.
> > > > > > Huh that's interesting, since iirc Willy or Dave told me the 
> > > > > > opposite, and
> > > > > > the memalloc_no* stuff is for e.g. nfs calling into network layer 
> > > > > > (needs
> > > > > > GFP_NOFS) or swap on top of a filesystems (even needs GFP_NOIO I 
> > > > > > think).
> > > > > > 
> > > > > > Adding them, maybe I got confused.
> > > > > My impression is that the scoped API is preferred these days.
> > > > > 
> > > > > https://www.kernel.org/doc/html/latest/core-api/gfp_mask-from-fs-io.html
> > > > > 
> > > > > I'd probably need to spend a few months learning the DRM subsystem to
> > > > > have a more detailed opinion on whether passing GFP flags around 
> > > > > explicitly
> > > > > or using the scope API is the better approach for your situation.
> > > > yes, in an ideal world we would have a clearly defined scope of the
> > > > reclaim recursion wrt FS/IO associated with it. I've got back to
> > > > https://lore.kernel.org/amd-gfx/20210319140857.2262-1-christian.koe...@amd.com/
> > > > and there are two things standing out. Why does 
> > > > ttm_tt_debugfs_shrink_show
> > > > really require NOFS semantic? And why does it play with
> > > > fs_reclaim_acquire?
> > > It's our shrinker. shrink_show simply triggers that specific shrinker
> > > asking it to shrink everything it can, which helps a lot with testing
> > > without having to drive the entire system against the OOM wall.
> 
> Yes I figured that much. But...
> 
> > > fs_reclaim_acquire is there to make sure lockdep understands that this
> > > is a shrinker and that it checks all the dependencies for us like if
> > > we'd be in real reclaim. There is some drop caches interfaces in proc
> > > iirc, but those drop everything, and they don't have the fs_reclaim
> > > annotations to teach lockdep about what we're doing.
> 
> ... I really do not follow this. You shouldn't really care whether this
> is a reclaim interface or not. Or maybe I just do not understand this...

We're heavily relying on lockdep and fs_reclaim to make sure we get it all
right. So any drop cac

RE: [PATCH] drm/amdgpu: Enable recovery on aldebaran

2021-03-23 Thread Zhang, Hawking
[AMD Public Use]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Lazar, Lijo 
Sent: Tuesday, March 23, 2021 18:55
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei 
Subject: [PATCH] drm/amdgpu: Enable recovery on aldebaran


[AMD Public Use]

Add aldebaran to devices which support recovery

Signed-off-by: Lijo Lazar lijo.la...@amd.com
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b1b83d282090..324b9e6b2965 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4192,6 +4192,7 @@ bool amdgpu_device_should_recover_gpu(struct 
amdgpu_device *adev)
   case CHIP_NAVY_FLOUNDER:
   case CHIP_DIMGREY_CAVEFISH:
   case CHIP_VANGOGH:
+ case CHIP_ALDEBARAN:
   break;
   default:
   goto disabled;
--
2.17.1



[PATCH] drm/amdgpu: Enable recovery on aldebaran

2021-03-23 Thread Lazar, Lijo
[AMD Public Use]

Add aldebaran to devices which support recovery

Signed-off-by: Lijo Lazar lijo.la...@amd.com
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b1b83d282090..324b9e6b2965 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4192,6 +4192,7 @@ bool amdgpu_device_should_recover_gpu(struct 
amdgpu_device *adev)
   case CHIP_NAVY_FLOUNDER:
   case CHIP_DIMGREY_CAVEFISH:
   case CHIP_VANGOGH:
+ case CHIP_ALDEBARAN:
   break;
   default:
   goto disabled;
--
2.17.1



RE: [PATCH] drm/amdgpu: re-apply "use the new cursor in the VM code""

2021-03-23 Thread Chen, Guchun
[AMD Public Use]

Hi Christian,

Thanks for your patience.

Unfortunately, after applying the patch below, the Vulkan CTS test on my side fails.
The same gfxhub page fault and kernel bug, along with the amdgpu_vm_update_ptes
call trace, are observed. I will send you the full log privately soon.

I suggest holding this patch until the root cause is found.

Regards,
Guchun

-Original Message-
From: Das, Nirmoy  
Sent: Tuesday, March 23, 2021 5:09 PM
To: Chen, Guchun ; Christian König 
; amd-gfx@lists.freedesktop.org
Cc: Das, Nirmoy 
Subject: Re: [PATCH] drm/amdgpu: re-apply "use the new cursor in the VM code""

I ran ./piglit run opengl results/test multiple times. Once I got a gfx timeout
error, but without a kernel freeze. I can't reproduce it any more.


Regards,

Nirmoy

On 3/22/21 2:11 PM, Chen, Guchun wrote:
> [AMD Public Use]
>
> Hi Christian,
>
> I will conduct one stress test for this tomorrow. Would you mind waiting for 
> my ack before submitting?
>
> Regards,
> Guchun
>
> -Original Message-
> From: Christian König 
> Sent: Monday, March 22, 2021 8:41 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chen, Guchun ; Das, Nirmoy 
> 
> Subject: [PATCH] drm/amdgpu: re-apply "use the new cursor in the VM code""
>
> Now that we found the underlying problem we can re-apply this patch.
>
> This reverts commit 867fee7f8821ff42e7308088cf0c3450ac49c17c.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 55 +-
>   1 file changed, 18 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 9268db1172bd..bc3951b71079 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -37,6 +37,7 @@
>   #include "amdgpu_gmc.h"
>   #include "amdgpu_xgmi.h"
>   #include "amdgpu_dma_buf.h"
> +#include "amdgpu_res_cursor.h"
>   
>   /**
>* DOC: GPUVM
> @@ -1583,7 +1584,7 @@ static int amdgpu_vm_update_ptes(struct 
> amdgpu_vm_update_params *params,
>* @last: last mapped entry
>* @flags: flags for the entries
>* @offset: offset into nodes and pages_addr
> - * @nodes: array of drm_mm_nodes with the MC addresses
> + * @res: ttm_resource to map
>* @pages_addr: DMA addresses to use for mapping
>* @fence: optional resulting fence
>*
> @@ -1598,13 +1599,13 @@ static int amdgpu_vm_bo_update_mapping(struct 
> amdgpu_device *adev,
>  bool unlocked, struct dma_resv *resv,
>  uint64_t start, uint64_t last,
>  uint64_t flags, uint64_t offset,
> -struct drm_mm_node *nodes,
> +struct ttm_resource *res,
>  dma_addr_t *pages_addr,
>  struct dma_fence **fence)
>   {
>   struct amdgpu_vm_update_params params;
> + struct amdgpu_res_cursor cursor;
>   enum amdgpu_sync_mode sync_mode;
> - uint64_t pfn;
>   int r;
>   
>   memset(&params, 0, sizeof(params)); @@ -1622,14 +1623,6 @@ static 
> int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   else
>   sync_mode = AMDGPU_SYNC_EXPLICIT;
>   
> - pfn = offset >> PAGE_SHIFT;
> - if (nodes) {
> - while (pfn >= nodes->size) {
> - pfn -= nodes->size;
> - ++nodes;
> - }
> - }
> -
>   amdgpu_vm_eviction_lock(vm);
>   if (vm->evicting) {
>   r = -EBUSY;
> @@ -1648,23 +1641,17 @@ static int amdgpu_vm_bo_update_mapping(struct 
> amdgpu_device *adev,
>   if (r)
>   goto error_unlock;
>   
> - do {
> + amdgpu_res_first(res, offset, (last - start + 1) * AMDGPU_GPU_PAGE_SIZE,
> +  &cursor);
> + while (cursor.remaining) {
>   uint64_t tmp, num_entries, addr;
>   
> -
> - num_entries = last - start + 1;
> - if (nodes) {
> - addr = nodes->start << PAGE_SHIFT;
> - num_entries = min((nodes->size - pfn) *
> - AMDGPU_GPU_PAGES_IN_CPU_PAGE, num_entries);
> - } else {
> - addr = 0;
> - }
> -
> + num_entries = cursor.size >> AMDGPU_GPU_PAGE_SHIFT;
>   if (pages_addr) {
>   bool contiguous = true;
>   
>   if (num_entries > AMDGPU_GPU_PAGES_IN_CPU_PAGE) {
> + uint64_t pfn = cursor.start >> PAGE_SHIFT;
>   uint64_t count;
>   
>   contiguous = pages_addr[pfn + 1] == @@ -1684,16 
> +1671,18 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   }
>   
>   if (!contiguous) {
> - addr = pfn << PAGE_SHIFT;

Re: [PATCH] drm/amdgpu: re-apply "use the new cursor in the VM code""

2021-03-23 Thread Nirmoy
I ran ./piglit run opengl results/test multiple times. Once I got a gfx
timeout error, but without a kernel freeze. I can't reproduce it any more.


Regards,

Nirmoy

On 3/22/21 2:11 PM, Chen, Guchun wrote:

[AMD Public Use]

Hi Christian,

I will conduct one stress test for this tomorrow. Would you mind waiting for my 
ack before submitting?

Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Monday, March 22, 2021 8:41 PM
To: amd-gfx@lists.freedesktop.org
Cc: Chen, Guchun ; Das, Nirmoy 
Subject: [PATCH] drm/amdgpu: re-apply "use the new cursor in the VM code""

Now that we found the underlying problem we can re-apply this patch.

This reverts commit 867fee7f8821ff42e7308088cf0c3450ac49c17c.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 55 +-
  1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9268db1172bd..bc3951b71079 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -37,6 +37,7 @@
  #include "amdgpu_gmc.h"
  #include "amdgpu_xgmi.h"
  #include "amdgpu_dma_buf.h"
+#include "amdgpu_res_cursor.h"
  
  /**

   * DOC: GPUVM
@@ -1583,7 +1584,7 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_vm_update_params *params,
   * @last: last mapped entry
   * @flags: flags for the entries
   * @offset: offset into nodes and pages_addr
- * @nodes: array of drm_mm_nodes with the MC addresses
+ * @res: ttm_resource to map
   * @pages_addr: DMA addresses to use for mapping
   * @fence: optional resulting fence
   *
@@ -1598,13 +1599,13 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
   bool unlocked, struct dma_resv *resv,
   uint64_t start, uint64_t last,
   uint64_t flags, uint64_t offset,
-  struct drm_mm_node *nodes,
+  struct ttm_resource *res,
   dma_addr_t *pages_addr,
   struct dma_fence **fence)
  {
struct amdgpu_vm_update_params params;
+   struct amdgpu_res_cursor cursor;
enum amdgpu_sync_mode sync_mode;
-   uint64_t pfn;
int r;
  
  	memset(&params, 0, sizeof(params));

@@ -1622,14 +1623,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
else
sync_mode = AMDGPU_SYNC_EXPLICIT;
  
-	pfn = offset >> PAGE_SHIFT;

-   if (nodes) {
-   while (pfn >= nodes->size) {
-   pfn -= nodes->size;
-   ++nodes;
-   }
-   }
-
amdgpu_vm_eviction_lock(vm);
if (vm->evicting) {
r = -EBUSY;
@@ -1648,23 +1641,17 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
if (r)
goto error_unlock;
  
-	do {

+   amdgpu_res_first(res, offset, (last - start + 1) * AMDGPU_GPU_PAGE_SIZE,
+&cursor);
+   while (cursor.remaining) {
uint64_t tmp, num_entries, addr;
  
-

-   num_entries = last - start + 1;
-   if (nodes) {
-   addr = nodes->start << PAGE_SHIFT;
-   num_entries = min((nodes->size - pfn) *
-   AMDGPU_GPU_PAGES_IN_CPU_PAGE, num_entries);
-   } else {
-   addr = 0;
-   }
-
+   num_entries = cursor.size >> AMDGPU_GPU_PAGE_SHIFT;
if (pages_addr) {
bool contiguous = true;
  
  			if (num_entries > AMDGPU_GPU_PAGES_IN_CPU_PAGE) {

+   uint64_t pfn = cursor.start >> PAGE_SHIFT;
uint64_t count;
  
  contiguous = pages_addr[pfn + 1] == @@ -1684,16 +1671,18 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,

}
  
  			if (!contiguous) {

-   addr = pfn << PAGE_SHIFT;
+   addr = cursor.start;
params.pages_addr = pages_addr;
} else {
-   addr = pages_addr[pfn];
+   addr = pages_addr[cursor.start >> PAGE_SHIFT];
params.pages_addr = NULL;
}
  
  		} else if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) {

-   addr += bo_adev->vm_manager.vram_base_offset;
-   addr += pfn << PAGE_SHIFT;
+   addr = bo_adev->vm_manager.vram_base_offset +
+   cursor.start;
+   } else {
+   addr = 0;
}
  
  		tmp = start + num_entries;

@@ -1701,14 +1690,9 @@ stat
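
The hunks above replace the open-coded walk over `nodes` (the removed `while (pfn >= nodes->size)` loop) with a cursor that tracks `start`, `size` and `remaining`. As a rough illustration of that pattern, here is a user-space sketch. The `model_*` types and functions are hypothetical stand-ins written for this note; they are not the definitions from amdgpu_res_cursor.h, and the model works in bytes throughout and assumes that `offset + size` lies inside the node array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the amdgpu_res_cursor definitions. */
struct model_node { uint64_t start, size; };	/* both in bytes */

struct model_cursor {
	const struct model_node *node;	/* node the cursor points into */
	uint64_t start;			/* address of the current chunk */
	uint64_t size;			/* bytes usable in the current chunk */
	uint64_t remaining;		/* bytes left in the whole walk */
};

/* Position the cursor 'offset' bytes into the node array, for 'size' bytes. */
static void model_first(const struct model_node *nodes, uint64_t offset,
			uint64_t size, struct model_cursor *cur)
{
	while (offset >= nodes->size) {
		offset -= nodes->size;
		nodes++;
	}
	cur->node = nodes;
	cur->start = nodes->start + offset;
	cur->size = nodes->size - offset;
	if (cur->size > size)
		cur->size = size;
	cur->remaining = size;
}

/* Consume 'bytes' (at most cur->size); step into the next node if needed. */
static void model_next(struct model_cursor *cur, uint64_t bytes)
{
	assert(bytes <= cur->size);
	cur->remaining -= bytes;
	if (!cur->remaining)
		return;
	cur->start += bytes;
	cur->size -= bytes;
	if (!cur->size) {
		cur->node++;
		cur->start = cur->node->start;
		cur->size = cur->node->size;
	}
	if (cur->size > cur->remaining)
		cur->size = cur->remaining;
}
```

A caller would loop with `while (cur.remaining)`, process `cur.size` bytes starting at `cur.start`, and then call `model_next()`. That mirrors the shape of the `while (cursor.remaining)` loop in the patch without claiming to reproduce it.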

RE: [PATCH] drm/amd/pm: fix gpu reset failure by MP1 state setting

2021-03-23 Thread Quan, Evan
[AMD Public Use]

Thanks! Reviewed-by: Evan Quan 

-Original Message-
From: Chen, Guchun  
Sent: Tuesday, March 23, 2021 1:50 PM
To: amd-gfx@lists.freedesktop.org; Lazar, Lijo ; Chen, 
Jiansong (Simon) ; Quan, Evan 
Cc: Chen, Guchun 
Subject: [PATCH] drm/amd/pm: fix gpu reset failure by MP1 state setting

Instead of blocking the various unsupported MP1 states at the upper level,
defer to each ASIC and let it skip the MP1 states it does not handle.

Signed-off-by: Lijo Lazar 
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/pm/amdgpu_dpm.c|  3 ---
 .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c| 10 +++---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
index 15e239582a97..0a6bb3311f0f 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
@@ -1027,9 +1027,6 @@ int amdgpu_dpm_set_mp1_state(struct amdgpu_device *adev,
int ret = 0;
const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
 
-   if (mp1_state == PP_MP1_STATE_NONE)
-   return 0;
-
if (pp_funcs && pp_funcs->set_mp1_state) {
ret = pp_funcs->set_mp1_state(
adev->powerplay.pp_handle,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 722fe067ac2c..72d9c1be1835 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -3113,14 +3113,18 @@ static int 
sienna_cichlid_system_features_control(struct smu_context *smu,
 static int sienna_cichlid_set_mp1_state(struct smu_context *smu,
enum pp_mp1_state mp1_state)
 {
+   int ret;
+
switch (mp1_state) {
case PP_MP1_STATE_UNLOAD:
-   return smu_cmn_set_mp1_state(smu, mp1_state);
+   ret = smu_cmn_set_mp1_state(smu, mp1_state);
+   break;
default:
-   return -EINVAL;
+   /* Ignore others */
+   ret = 0;
}
 
-   return 0;
+   return ret;
 }
 
 static const struct pptable_funcs sienna_cichlid_ppt_funcs = {
-- 
2.17.1


[PATCH] drm/amdgpu/display: fix dmub invalid register read

2021-03-23 Thread Thomas Lambertz
DMCUB_SCRATCH_0 sometimes contains 0xdeadbeef during initialization.
If this is detected, return 0 instead. This prevents wrong bit-flags
from being read.

The main impact of this bug is in the status check loop in
dmub_srv_wait_for_auto_load. As it is waiting for the device to become
ready, returning too early leads to a race condition. It is usually won
on first boot, but lost when laptop resumes from sleep, breaking screen
brightness control.

This issue was always present, but previously mitigated by the fact that
the full register was compared to the wanted value. Currently, only the
bottom two bits are tested, which are also set in 0xdeadbeef, thus
reporting readiness too early.

Fixes: 5fe6b98ae00d ("drm/amd/display: Update dmub code")
Signed-off-by: Thomas Lambertz 
---
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.c | 8 +++-
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.h | 2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.c 
b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.c
index 8e8e65fa83c0..d6fcae182f68 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.c
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.c
@@ -323,8 +323,14 @@ uint32_t dmub_dcn20_get_gpint_response(struct dmub_srv 
*dmub)
 union dmub_fw_boot_status dmub_dcn20_get_fw_boot_status(struct dmub_srv *dmub)
 {
union dmub_fw_boot_status status;
+   uint32_t value;
+
+   value = REG_READ(DMCUB_SCRATCH0);
+   if (value == DMCUB_SCRATCH0_INVALID)
+   status.all = 0;
+   else
+   status.all = value;

-   status.all = REG_READ(DMCUB_SCRATCH0);
return status;
 }

diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.h 
b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.h
index a62be9c0652e..9557e76cf5d4 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.h
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn20.h
@@ -154,6 +154,8 @@ struct dmub_srv_common_regs {

 extern const struct dmub_srv_common_regs dmub_srv_dcn20_regs;

+#define DMCUB_SCRATCH0_INVALID 0xdeadbeef
+
 /* Hardware functions. */

 void dmub_dcn20_init(struct dmub_srv *dmub);
--
2.31.0
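
The commit message above notes that the bottom two bits are also set in 0xdeadbeef. The arithmetic is easy to check: the low nibble is 0xf, so `0xdeadbeef & 0x3` is `0x3`. The sketch below contrasts a masked readiness test with and without the filter added by the patch. `READY_MASK`, `ready_unfiltered` and `ready_filtered` are hypothetical names chosen for illustration, and the mask value 0x3 is an assumption taken from the wording "bottom two bits"; the driver's real check goes through the `dmub_fw_boot_status` union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCRATCH0_INVALID 0xdeadbeefu	/* value from the patch */
#define READY_MASK       0x3u		/* assumption: "the bottom two bits" */

/* Readiness check that only looks at the masked bits. */
static bool ready_unfiltered(uint32_t scratch)
{
	return (scratch & READY_MASK) == READY_MASK;
}

/* Same check, but treat the invalid pattern as "no status yet". */
static bool ready_filtered(uint32_t scratch)
{
	uint32_t status = (scratch == SCRATCH0_INVALID) ? 0 : scratch;

	return (status & READY_MASK) == READY_MASK;
}
```

With the unfiltered check, a read of 0xdeadbeef looks ready; with the filter it does not, while a genuine status with both bits set still passes.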


Re: [PATCH] drm/amd: Fix a typo in two different sentences

2021-03-23 Thread Randy Dunlap
On 3/22/21 2:06 PM, Bhaskar Chowdhury wrote:
> 
> s/defintion/definition/ in two different places.
> 
> Signed-off-by: Bhaskar Chowdhury 

Acked-by: Randy Dunlap 

> ---
>  drivers/gpu/drm/amd/include/atombios.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/include/atombios.h 
> b/drivers/gpu/drm/amd/include/atombios.h
> index c1d7b1d0b952..47eb84598b96 100644
> --- a/drivers/gpu/drm/amd/include/atombios.h
> +++ b/drivers/gpu/drm/amd/include/atombios.h
> @@ -1987,9 +1987,9 @@ typedef struct _PIXEL_CLOCK_PARAMETERS_V6
>  #define PIXEL_CLOCK_V6_MISC_HDMI_BPP_MASK   0x0c
>  #define PIXEL_CLOCK_V6_MISC_HDMI_24BPP  0x00
>  #define PIXEL_CLOCK_V6_MISC_HDMI_36BPP  0x04
> -#define PIXEL_CLOCK_V6_MISC_HDMI_36BPP_V6   0x08//for V6, the 
> correct defintion for 36bpp should be 2 for 36bpp(2:1)
> +#define PIXEL_CLOCK_V6_MISC_HDMI_36BPP_V6   0x08//for V6, the 
> correct definition for 36bpp should be 2 for 36bpp(2:1)
>  #define PIXEL_CLOCK_V6_MISC_HDMI_30BPP  0x08
> -#define PIXEL_CLOCK_V6_MISC_HDMI_30BPP_V6   0x04//for V6, the 
> correct defintion for 30bpp should be 1 for 36bpp(5:4)
> +#define PIXEL_CLOCK_V6_MISC_HDMI_30BPP_V6   0x04//for V6, the 
> correct definition for 30bpp should be 1 for 36bpp(5:4)
>  #define PIXEL_CLOCK_V6_MISC_HDMI_48BPP  0x0c
>  #define PIXEL_CLOCK_V6_MISC_REF_DIV_SRC 0x10
>  #define PIXEL_CLOCK_V6_MISC_GEN_DPREFCLK0x40
> --


-- 
~Randy



[PATCH] drm/amd: Fix a typo in two different sentences

2021-03-23 Thread Bhaskar Chowdhury


s/defintion/definition/ in two different places.

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/gpu/drm/amd/include/atombios.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/atombios.h 
b/drivers/gpu/drm/amd/include/atombios.h
index c1d7b1d0b952..47eb84598b96 100644
--- a/drivers/gpu/drm/amd/include/atombios.h
+++ b/drivers/gpu/drm/amd/include/atombios.h
@@ -1987,9 +1987,9 @@ typedef struct _PIXEL_CLOCK_PARAMETERS_V6
 #define PIXEL_CLOCK_V6_MISC_HDMI_BPP_MASK   0x0c
 #define PIXEL_CLOCK_V6_MISC_HDMI_24BPP  0x00
 #define PIXEL_CLOCK_V6_MISC_HDMI_36BPP  0x04
-#define PIXEL_CLOCK_V6_MISC_HDMI_36BPP_V6   0x08//for V6, the 
correct defintion for 36bpp should be 2 for 36bpp(2:1)
+#define PIXEL_CLOCK_V6_MISC_HDMI_36BPP_V6   0x08//for V6, the 
correct definition for 36bpp should be 2 for 36bpp(2:1)
 #define PIXEL_CLOCK_V6_MISC_HDMI_30BPP  0x08
-#define PIXEL_CLOCK_V6_MISC_HDMI_30BPP_V6   0x04//for V6, the 
correct defintion for 30bpp should be 1 for 36bpp(5:4)
+#define PIXEL_CLOCK_V6_MISC_HDMI_30BPP_V6   0x04//for V6, the 
correct definition for 30bpp should be 1 for 36bpp(5:4)
 #define PIXEL_CLOCK_V6_MISC_HDMI_48BPP  0x0c
 #define PIXEL_CLOCK_V6_MISC_REF_DIV_SRC 0x10
 #define PIXEL_CLOCK_V6_MISC_GEN_DPREFCLK0x40
--
2.31.0
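
The comments on the two corrected lines describe the V6 encodings as field values 2 and 1. Given the mask 0x0c shown in the hunk, those values can be recovered by masking and shifting right by two. The shift and the helper below are hypothetical and inferred only from the position of the mask's lowest set bit, not taken from atombios.h.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the hunk above. */
#define PIXEL_CLOCK_V6_MISC_HDMI_BPP_MASK   0x0c
#define PIXEL_CLOCK_V6_MISC_HDMI_36BPP_V6   0x08
#define PIXEL_CLOCK_V6_MISC_HDMI_30BPP_V6   0x04

/* Hypothetical: bit 2 is the lowest set bit of the mask. */
#define MODEL_HDMI_BPP_SHIFT 2

/* Extract the HDMI bpp field from a misc byte. */
static unsigned int model_hdmi_bpp_field(uint8_t misc)
{
	return (misc & PIXEL_CLOCK_V6_MISC_HDMI_BPP_MASK) >> MODEL_HDMI_BPP_SHIFT;
}
```

So `0x08` decodes to field value 2 and `0x04` to field value 1, which is consistent with the wording of the comments in the header.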



[PATCH] gpu: drm: amd: Remove duplicate includes

2021-03-23 Thread Wan Jiabing
../hw_ddc.h, ../hw_gpio.h and ../hw_hpd.h have been included 
at line 32, so remove them.

Signed-off-by: Wan Jiabing 
---
 .../gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c| 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c 
b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
index 66e4841f41e4..ca335ea60412 100644
--- a/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
+++ b/drivers/gpu/drm/amd/display/dc/gpio/dce110/hw_factory_dce110.c
@@ -48,10 +48,6 @@
 #define REGI(reg_name, block, id)\
mm ## block ## id ## _ ## reg_name
 
-#include "../hw_gpio.h"
-#include "../hw_ddc.h"
-#include "../hw_hpd.h"
-
 #include "reg_helper.h"
 #include "../hpd_regs.h"
 
-- 
2.25.1



[PATCH] gpu: drm: amd: Remove duplicate include of dce110_resource.h

2021-03-23 Thread Wan Jiabing
dce110/dce110_resource.h has been included at line 58, so remove
the duplicate include at line 64.

Signed-off-by: Wan Jiabing 
---
 drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
index 4a3df13c9e49..c4fe21b3b23f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
@@ -61,7 +61,6 @@
 #include "dcn21/dcn21_dccg.h"
 #include "dcn21_hubbub.h"
 #include "dcn10/dcn10_resource.h"
-#include "dce110/dce110_resource.h"
 #include "dce/dce_panel_cntl.h"
 
 #include "dcn20/dcn20_dwb.h"
-- 
2.25.1
