[PATCH] drm/amdgpu: enable display for cyan skillfish

2021-10-11 Thread Lang Yu
Display support for cyan skillfish is ready now. Enable it!

Signed-off-by: Lang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 2bebd2ce6474..4228c7964175 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -736,6 +736,7 @@ static int amdgpu_discovery_set_display_ip_blocks(struct amdgpu_device *adev)
case IP_VERSION(1, 0, 1):
case IP_VERSION(2, 0, 2):
case IP_VERSION(2, 0, 0):
+   case IP_VERSION(2, 0, 3):
case IP_VERSION(2, 1, 0):
case IP_VERSION(3, 0, 0):
case IP_VERSION(3, 0, 2):
@@ -745,8 +746,6 @@ static int amdgpu_discovery_set_display_ip_blocks(struct amdgpu_device *adev)
case IP_VERSION(3, 1, 3):
amdgpu_device_ip_block_add(adev, &dm_ip_block);
break;
-   case IP_VERSION(2, 0, 3):
-   break;
default:
return -EINVAL;
}
-- 
2.25.1
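
The change above moves IP_VERSION(2, 0, 3) out of a do-nothing case and into the fall-through group that registers the display block. A minimal standalone sketch of that control flow is given here; the `IP_VERSION` bit packing and the `display_block_status` helper are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed packing for illustration: major, minor, revision in one word. */
#define IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/*
 * Hypothetical helper: returns 1 if a display block would be added for
 * this display IP version, or -EINVAL if the version is not listed.
 * The case list mirrors only the versions visible in the patch context.
 */
static int display_block_status(uint32_t dce_version)
{
	switch (dce_version) {
	case IP_VERSION(1, 0, 1):
	case IP_VERSION(2, 0, 2):
	case IP_VERSION(2, 0, 0):
	case IP_VERSION(2, 0, 3):	/* the case this patch enables */
	case IP_VERSION(2, 1, 0):
	case IP_VERSION(3, 0, 0):
	case IP_VERSION(3, 0, 2):
	case IP_VERSION(3, 1, 3):
		return 1;	/* stands in for amdgpu_device_ip_block_add() */
	default:
		return -EINVAL;
	}
}
```

Before the patch, the 2.0.3 case returned success without adding a block; in this sketch that would correspond to a separate case that returns 0.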



Re: [PATCH 2/2] drm/amdgpu: Fix RAS page retirement with mode2 reset on Aldebaran

2021-10-11 Thread Zhou1, Tao
[AMD Official Use Only]

The patch looks good to me, but it would be better to add a comment in 
amdgpu_register_bad_pages_mca_notifier explaining why we need to keep our own 
GPU info instead of using the mgpu_info list. With this addressed, the patch is:

Reviewed-by: Tao Zhou <tao.zh...@amd.com>


From: Joshi, Mukul 
Sent: Tuesday, October 12, 2021 10:33 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Zhou1, Tao ; Clements, John ; 
Joshi, Mukul 
Subject: [PATCH 2/2] drm/amdgpu: Fix RAS page retirement with mode2 reset on 
Aldebaran

During mode2 reset, the GPU is temporarily removed from the
mgpu_info list. As a result, page retirement fails because it
cannot find the GPU in the GPU list.
To fix this, create our own list of GPUs that support MCE notifier
based page retirement and use that list to check if the UMC error
occurred on a GPU that supports MCE notifier based page retirement.

Signed-off-by: Mukul Joshi 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e8875351967e..e8d88c77eb46 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -112,7 +112,12 @@ static bool amdgpu_ras_check_bad_page_unlock(struct amdgpu_ras *con,
 static bool amdgpu_ras_check_bad_page(struct amdgpu_device *adev,
 uint64_t addr);
 #ifdef CONFIG_X86_MCE_AMD
-static void amdgpu_register_bad_pages_mca_notifier(void);
+static void amdgpu_register_bad_pages_mca_notifier(struct amdgpu_device *adev);
+struct mce_notifier_adev_list {
+   struct amdgpu_device *devs[MAX_GPU_INSTANCE];
+   int num_gpu;
+};
+static struct mce_notifier_adev_list mce_adev_list;
 #endif

 void amdgpu_ras_set_error_query_ready(struct amdgpu_device *adev, bool ready)
@@ -2108,7 +2113,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 #ifdef CONFIG_X86_MCE_AMD
 if ((adev->asic_type == CHIP_ALDEBARAN) &&
 (adev->gmc.xgmi.connected_to_cpu))
-   amdgpu_register_bad_pages_mca_notifier();
+   amdgpu_register_bad_pages_mca_notifier(adev);
 #endif
 return 0;

@@ -2605,24 +2610,18 @@ void amdgpu_release_ras_context(struct amdgpu_device *adev)
 #ifdef CONFIG_X86_MCE_AMD
 static struct amdgpu_device *find_adev(uint32_t node_id)
 {
-   struct amdgpu_gpu_instance *gpu_instance;
 int i;
 struct amdgpu_device *adev = NULL;

-   mutex_lock(&mgpu_info.mutex);
-
-   for (i = 0; i < mgpu_info.num_gpu; i++) {
-   gpu_instance = &(mgpu_info.gpu_ins[i]);
-   adev = gpu_instance->adev;
+   for (i = 0; i < mce_adev_list.num_gpu; i++) {
+   adev = mce_adev_list.devs[i];

-   if (adev->gmc.xgmi.connected_to_cpu &&
+   if (adev && adev->gmc.xgmi.connected_to_cpu &&
 adev->gmc.xgmi.physical_node_id == node_id)
 break;
 adev = NULL;
 }

-   mutex_unlock(&mgpu_info.mutex);
-
 return adev;
 }

@@ -2718,8 +2717,9 @@ static struct notifier_block amdgpu_bad_page_nb = {
 .priority   = MCE_PRIO_UC,
 };

-static void amdgpu_register_bad_pages_mca_notifier(void)
+static void amdgpu_register_bad_pages_mca_notifier(struct amdgpu_device *adev)
 {
+   mce_adev_list.devs[mce_adev_list.num_gpu++] = adev;
 /*
  * Register the x86 notifier only once
  * with MCE subsystem.
--
2.33.0



Re: [PATCH 1/2] drm/amdgpu: Enable RAS error injection after mode2 reset on Aldebaran

2021-10-11 Thread Zhou1, Tao
[AMD Official Use Only]

Reviewed-by: Tao Zhou <tao.zh...@amd.com>

From: Joshi, Mukul 
Sent: Tuesday, October 12, 2021 10:33 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Zhou1, Tao ; Clements, John ; 
Joshi, Mukul 
Subject: [PATCH 1/2] drm/amdgpu: Enable RAS error injection after mode2 reset 
on Aldebaran

Add the missing call to re-enable RAS error injections on the Aldebaran
mode2 reset code path.

Signed-off-by: Mukul Joshi 
---
 drivers/gpu/drm/amd/amdgpu/aldebaran.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
index 148f6c3343ab..bcfdb63b1d42 100644
--- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
@@ -307,6 +307,8 @@ static int aldebaran_mode2_restore_ip(struct amdgpu_device *adev)
 adev->ip_blocks[i].status.late_initialized = true;
 }

+   amdgpu_ras_set_error_query_ready(adev, true);
+
 amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
 amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);

--
2.33.0



[PATCH 2/2] drm/amdgpu: Fix RAS page retirement with mode2 reset on Aldebaran

2021-10-11 Thread Mukul Joshi
During mode2 reset, the GPU is temporarily removed from the
mgpu_info list. As a result, page retirement fails because it
cannot find the GPU in the GPU list.
To fix this, create our own list of GPUs that support MCE notifier
based page retirement and use that list to check if the UMC error
occurred on a GPU that supports MCE notifier based page retirement.

Signed-off-by: Mukul Joshi 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e8875351967e..e8d88c77eb46 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -112,7 +112,12 @@ static bool amdgpu_ras_check_bad_page_unlock(struct amdgpu_ras *con,
 static bool amdgpu_ras_check_bad_page(struct amdgpu_device *adev,
uint64_t addr);
 #ifdef CONFIG_X86_MCE_AMD
-static void amdgpu_register_bad_pages_mca_notifier(void);
+static void amdgpu_register_bad_pages_mca_notifier(struct amdgpu_device *adev);
+struct mce_notifier_adev_list {
+   struct amdgpu_device *devs[MAX_GPU_INSTANCE];
+   int num_gpu;
+};
+static struct mce_notifier_adev_list mce_adev_list;
 #endif
 
 void amdgpu_ras_set_error_query_ready(struct amdgpu_device *adev, bool ready)
@@ -2108,7 +2113,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 #ifdef CONFIG_X86_MCE_AMD
if ((adev->asic_type == CHIP_ALDEBARAN) &&
(adev->gmc.xgmi.connected_to_cpu))
-   amdgpu_register_bad_pages_mca_notifier();
+   amdgpu_register_bad_pages_mca_notifier(adev);
 #endif
return 0;
 
@@ -2605,24 +2610,18 @@ void amdgpu_release_ras_context(struct amdgpu_device *adev)
 #ifdef CONFIG_X86_MCE_AMD
 static struct amdgpu_device *find_adev(uint32_t node_id)
 {
-   struct amdgpu_gpu_instance *gpu_instance;
int i;
struct amdgpu_device *adev = NULL;
 
-   mutex_lock(&mgpu_info.mutex);
-
-   for (i = 0; i < mgpu_info.num_gpu; i++) {
-   gpu_instance = &(mgpu_info.gpu_ins[i]);
-   adev = gpu_instance->adev;
+   for (i = 0; i < mce_adev_list.num_gpu; i++) {
+   adev = mce_adev_list.devs[i];
 
-   if (adev->gmc.xgmi.connected_to_cpu &&
+   if (adev && adev->gmc.xgmi.connected_to_cpu &&
adev->gmc.xgmi.physical_node_id == node_id)
break;
adev = NULL;
}
 
-   mutex_unlock(&mgpu_info.mutex);
-
return adev;
 }
 
@@ -2718,8 +2717,9 @@ static struct notifier_block amdgpu_bad_page_nb = {
.priority   = MCE_PRIO_UC,
 };
 
-static void amdgpu_register_bad_pages_mca_notifier(void)
+static void amdgpu_register_bad_pages_mca_notifier(struct amdgpu_device *adev)
 {
+   mce_adev_list.devs[mce_adev_list.num_gpu++] = adev;
/*
 * Register the x86 notifier only once
 * with MCE subsystem.
-- 
2.33.0
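
The idea in this patch, a private list of devices that the MCE notifier can search even while a device is absent from the global list, can be sketched in standalone C as follows. Everything here is a simplified stand-in: `struct fake_dev`, `MAX_DEVS`, and the `notifier_list_*` helpers are hypothetical names, and the duplicate and capacity checks in `notifier_list_add` are additions for illustration that the posted patch does not contain (it appends unconditionally).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 4	/* stand-in for MAX_GPU_INSTANCE */

/* Simplified stand-in for struct amdgpu_device. */
struct fake_dev {
	int connected_to_cpu;
	uint32_t physical_node_id;
};

static struct {
	struct fake_dev *devs[MAX_DEVS];
	int num;
} notifier_list;

/* Register a device once; returns 0 on success, -1 if the list is full. */
static int notifier_list_add(struct fake_dev *dev)
{
	int i;

	for (i = 0; i < notifier_list.num; i++)
		if (notifier_list.devs[i] == dev)
			return 0;	/* already registered */

	if (notifier_list.num >= MAX_DEVS)
		return -1;

	notifier_list.devs[notifier_list.num++] = dev;
	return 0;
}

/* Look up a registered device by node id; NULL if none matches. */
static struct fake_dev *notifier_list_find(uint32_t node_id)
{
	int i;

	for (i = 0; i < notifier_list.num; i++) {
		struct fake_dev *dev = notifier_list.devs[i];

		if (dev && dev->connected_to_cpu &&
		    dev->physical_node_id == node_id)
			return dev;
	}
	return NULL;
}
```

Whether the real driver needs the extra checks depends on how often the registration path can run per device, which this thread does not settle.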



[PATCH 1/2] drm/amdgpu: Enable RAS error injection after mode2 reset on Aldebaran

2021-10-11 Thread Mukul Joshi
Add the missing call to re-enable RAS error injections on the Aldebaran
mode2 reset code path.

Signed-off-by: Mukul Joshi 
---
 drivers/gpu/drm/amd/amdgpu/aldebaran.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
index 148f6c3343ab..bcfdb63b1d42 100644
--- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
@@ -307,6 +307,8 @@ static int aldebaran_mode2_restore_ip(struct amdgpu_device *adev)
adev->ip_blocks[i].status.late_initialized = true;
}
 
+   amdgpu_ras_set_error_query_ready(adev, true);
+
amdgpu_device_set_cg_state(adev, AMD_CG_STATE_GATE);
amdgpu_device_set_pg_state(adev, AMD_PG_STATE_GATE);
 
-- 
2.33.0



RE: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish

2021-10-11 Thread Yu, Lang
[Public]



>-----Original Message-----
>From: Chen, Guchun 
>Sent: Monday, October 11, 2021 10:27 PM
>To: Lazar, Lijo ; Yu, Lang ; amd-
>g...@lists.freedesktop.org
>Cc: Deucher, Alexander ; Huang, Ray
>
>Subject: RE: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish
>
>[Public]
>
>A global variable to carry the sclk value looks like overkill. Is it possible to
>move everything into cyan_skillfish_od_edit_dpm_table, i.e. query sclk first and
>then set cyan_skillfish_user_settings.sclk?

1. We need to query the default sclk in the SMU init phase and use it in
od_edit_dpm_table, so a global variable is needed.
2. To maintain the "set then commit" command rule of pp_od_clk_voltage,
a global variable is also needed.

Regards,
Lang

We need some global variables to store user settings and 
>
>Regards,
>Guchun
>
>-----Original Message-----
>From: amd-gfx  On Behalf Of Lazar,
>Lijo
>Sent: Monday, October 11, 2021 4:54 PM
>To: Yu, Lang ; amd-gfx@lists.freedesktop.org
>Cc: Deucher, Alexander ; Huang, Ray
>
>Subject: Re: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish
>
>
>
>On 10/11/2021 2:01 PM, Lang Yu wrote:
>> Query the default sclk instead of hard-coding it.
>>
>> Signed-off-by: Lang Yu 
>> ---
>>   .../gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c  | 12 +---
>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> index 3d4c65bc29dc..d98fd06a2574 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> @@ -47,7 +47,6 @@
>>   /* unit: MHz */
>>   #define CYAN_SKILLFISH_SCLK_MIN1000
>>   #define CYAN_SKILLFISH_SCLK_MAX2000
>> -#define CYAN_SKILLFISH_SCLK_DEFAULT 1800
>>
>>   /* unit: mV */
>>   #define CYAN_SKILLFISH_VDDC_MIN700
>> @@ -59,6 +58,8 @@ static struct gfx_user_settings {
>>  uint32_t vddc;
>>   } cyan_skillfish_user_settings;
>>
>> +static uint32_t cyan_skillfish_sclk_default;
>> +
>>   #define FEATURE_MASK(feature) (1ULL << feature)
>>   #define SMC_DPM_FEATURE ( \
>>  FEATURE_MASK(FEATURE_FCLK_DPM_BIT)  |   \
>> @@ -365,13 +366,18 @@ static bool cyan_skillfish_is_dpm_running(struct
>smu_context *smu)
>>  return false;
>>
>>  ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
>> -
>>  if (ret)
>>  return false;
>>
>>  feature_enabled = (uint64_t)feature_mask[0] |
>>  ((uint64_t)feature_mask[1] << 32);
>>
>> +/*
>> + * cyan_skillfish specific, query default sclk instead of hard coding it.
>> + */
>> +cyan_skillfish_get_smu_metrics_data(smu, METRICS_CURR_GFXCLK,
>> +&cyan_skillfish_sclk_default);
>> +
>
>Maybe add if (!cyan_skillfish_sclk_default) so that it's read only once during
>driver load and not on every suspend/resume.
>
>Reviewed-by: Lijo Lazar 
>
>Thanks,
>Lijo
>
>>  return !!(feature_enabled & SMC_DPM_FEATURE);
>>   }
>>
>> @@ -468,7 +474,7 @@ static int cyan_skillfish_od_edit_dpm_table(struct
>smu_context *smu,
>>  return -EINVAL;
>>  }
>>
>> -cyan_skillfish_user_settings.sclk =
>CYAN_SKILLFISH_SCLK_DEFAULT;
>> +cyan_skillfish_user_settings.sclk = cyan_skillfish_sclk_default;
>>  cyan_skillfish_user_settings.vddc =
>CYAN_SKILLFISH_VDDC_MAGIC;
>>
>>  break;
>>
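
Lijo's suggestion in the quoted review, reading the default clock only on first use, amounts to a guard on a zero-initialized cache. A minimal standalone sketch is given here; `fake_query_gfxclk`, `cache_default_sclk`, and the value 1800 are placeholders for illustration, and the real SMU metrics call is not modelled.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t sclk_default;	/* 0 means "not queried yet" */
static int query_count;		/* counts calls to the stand-in query */

/* Stand-in for the SMU metrics query; the value is a placeholder. */
static int fake_query_gfxclk(uint32_t *value)
{
	query_count++;
	*value = 1800;
	return 0;
}

/* Query the default clock only while the cached value is still zero. */
static void cache_default_sclk(void)
{
	if (!sclk_default)
		fake_query_gfxclk(&sclk_default);
}
```

One property of this guard is that a query which leaves the value at zero is simply retried on the next call.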

RE: [PATCH] drm/amdgpu/pm: properly handle sclk for profiling modes on vangogh

2021-10-11 Thread Quan, Evan
[AMD Official Use Only]

Reviewed-by: Evan Quan 

> -----Original Message-----
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Monday, October 11, 2021 11:04 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH] drm/amdgpu/pm: properly handle sclk for profiling modes
> on vangogh
> 
> When selecting between levels in the force performance levels interface sclk
> (gfxclk) was not set correctly for all levels.  Select the proper sclk 
> settings for
> all levels.
> 
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1726
> Signed-off-by: Alex Deucher 
> ---
>  .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  | 89 ++
> -
>  1 file changed, 29 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> index bdd1a01e27b4..8d5f32807821 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> @@ -1386,52 +1386,38 @@ static int vangogh_set_performance_level(struct
> smu_context *smu,
>   uint32_t soc_mask, mclk_mask, fclk_mask;
>   uint32_t vclk_mask = 0, dclk_mask = 0;
> 
> + smu->cpu_actual_soft_min_freq = smu-
> >cpu_default_soft_min_freq;
> + smu->cpu_actual_soft_max_freq = smu-
> >cpu_default_soft_max_freq;
> +
>   switch (level) {
>   case AMD_DPM_FORCED_LEVEL_HIGH:
> - smu->gfx_actual_hard_min_freq = smu-
> >gfx_default_hard_min_freq;
> + smu->gfx_actual_hard_min_freq = smu-
> >gfx_default_soft_max_freq;
>   smu->gfx_actual_soft_max_freq = smu-
> >gfx_default_soft_max_freq;
> 
> - smu->cpu_actual_soft_min_freq = smu-
> >cpu_default_soft_min_freq;
> - smu->cpu_actual_soft_max_freq = smu-
> >cpu_default_soft_max_freq;
> 
>   ret = vangogh_force_dpm_limit_value(smu, true);
> + if (ret)
> + return ret;
>   break;
>   case AMD_DPM_FORCED_LEVEL_LOW:
>   smu->gfx_actual_hard_min_freq = smu-
> >gfx_default_hard_min_freq;
> - smu->gfx_actual_soft_max_freq = smu-
> >gfx_default_soft_max_freq;
> -
> - smu->cpu_actual_soft_min_freq = smu-
> >cpu_default_soft_min_freq;
> - smu->cpu_actual_soft_max_freq = smu-
> >cpu_default_soft_max_freq;
> + smu->gfx_actual_soft_max_freq = smu-
> >gfx_default_hard_min_freq;
> 
>   ret = vangogh_force_dpm_limit_value(smu, false);
> + if (ret)
> + return ret;
>   break;
>   case AMD_DPM_FORCED_LEVEL_AUTO:
>   smu->gfx_actual_hard_min_freq = smu-
> >gfx_default_hard_min_freq;
>   smu->gfx_actual_soft_max_freq = smu-
> >gfx_default_soft_max_freq;
> 
> - smu->cpu_actual_soft_min_freq = smu-
> >cpu_default_soft_min_freq;
> - smu->cpu_actual_soft_max_freq = smu-
> >cpu_default_soft_max_freq;
> -
>   ret = vangogh_unforce_dpm_levels(smu);
> - break;
> - case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
> - smu->gfx_actual_hard_min_freq = smu-
> >gfx_default_hard_min_freq;
> - smu->gfx_actual_soft_max_freq = smu-
> >gfx_default_soft_max_freq;
> -
> - smu->cpu_actual_soft_min_freq = smu-
> >cpu_default_soft_min_freq;
> - smu->cpu_actual_soft_max_freq = smu-
> >cpu_default_soft_max_freq;
> -
> - ret = smu_cmn_send_smc_msg_with_param(smu,
> - SMU_MSG_SetHardMinGfxClk,
> -
>   VANGOGH_UMD_PSTATE_STANDARD_GFXCLK, NULL);
> - if (ret)
> - return ret;
> -
> - ret = smu_cmn_send_smc_msg_with_param(smu,
> - SMU_MSG_SetSoftMaxGfxClk,
> -
>   VANGOGH_UMD_PSTATE_STANDARD_GFXCLK, NULL);
>   if (ret)
>   return ret;
> + break;
> + case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
> + smu->gfx_actual_hard_min_freq =
> VANGOGH_UMD_PSTATE_STANDARD_GFXCLK;
> + smu->gfx_actual_soft_max_freq =
> VANGOGH_UMD_PSTATE_STANDARD_GFXCLK;
> 
>   ret = vangogh_get_profiling_clk_mask(smu, level,
>   &vclk_mask,
> @@ -1446,32 +1432,15 @@ static int vangogh_set_performance_level(struct
> smu_context *smu,
>   vangogh_force_clk_levels(smu, SMU_SOCCLK, 1 <<
> soc_mask);
>   vangogh_force_clk_levels(smu, SMU_VCLK, 1 << vclk_mask);
>   vangogh_force_clk_levels(smu, S
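
The gfx clock handling visible in the quoted hunk can be summarized as a mapping from the forced level to a (hard min, soft max) window. The sketch below restates only what the hunk shows for four levels; the type and function names are hypothetical, and `PSTATE_STANDARD_GFXCLK` uses a placeholder value rather than the driver's constant.

```c
#include <assert.h>
#include <stdint.h>

enum perf_level {
	LEVEL_HIGH,
	LEVEL_LOW,
	LEVEL_AUTO,
	LEVEL_PROFILE_STANDARD
};

#define PSTATE_STANDARD_GFXCLK 1100	/* placeholder value, in MHz */

struct gfx_range {
	uint32_t hard_min;
	uint32_t soft_max;
};

/* Pick the gfx clock window for a forced level from the default window. */
static struct gfx_range gfx_range_for_level(enum perf_level level,
					    struct gfx_range def)
{
	struct gfx_range r = def;	/* AUTO keeps the default window */

	switch (level) {
	case LEVEL_HIGH:
		r.hard_min = def.soft_max;	/* pin to the top */
		break;
	case LEVEL_LOW:
		r.soft_max = def.hard_min;	/* pin to the bottom */
		break;
	case LEVEL_PROFILE_STANDARD:
		r.hard_min = PSTATE_STANDARD_GFXCLK;
		r.soft_max = PSTATE_STANDARD_GFXCLK;
		break;
	case LEVEL_AUTO:
		break;
	}
	return r;
}
```

The remaining profiling levels are cut off in the quoted message, so they are left out here.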

Re: [PATCH 28/64] drm/amdgpu: drive all navi asics from the IP discovery table

2021-10-11 Thread Mike Lothian
I've raised a bug with hopefully everything you need

https://gitlab.freedesktop.org/drm/amd/-/issues/1743

On Mon, 11 Oct 2021 at 18:35, Alex Deucher  wrote:
>
> On Mon, Oct 11, 2021 at 1:20 PM Mike Lothian  wrote:
> >
> > Hi
> >
> > This patch breaks things for me on my Green Sardine & Navy Flounder
> > system (Asus ROG G513QY)
> >
> > It doesn't get past post with amdgpu built in, will try as a module
>
> Can you provide the dmesg output in that case?
>
> Alex
>
>
> >
> > Cheers
> >
> > Mike
> >
> > On Tue, 28 Sept 2021 at 17:44, Alex Deucher  
> > wrote:
> > >
> > > Rather than hardcoding based on asic_type, use the IP
> > > discovery table to configure the driver.
> > >
> > > v2: rebase
> > >
> > > Reviewed-by: Christian König 
> > > Signed-off-by: Alex Deucher 
> > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 --------------------
> > >  1 file changed, 20 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index 5e0956b19d69..9c47cc636429 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -2142,26 +2142,6 @@ static int amdgpu_device_ip_early_init(struct 
> > > amdgpu_device *adev)
> > > if (r)
> > > return r;
> > > break;
> > > -   case  CHIP_NAVI14:
> > > -   case  CHIP_NAVI12:
> > > -   case  CHIP_SIENNA_CICHLID:
> > > -   case  CHIP_NAVY_FLOUNDER:
> > > -   case  CHIP_DIMGREY_CAVEFISH:
> > > -   case  CHIP_BEIGE_GOBY:
> > > -   case CHIP_VANGOGH:
> > > -   case CHIP_YELLOW_CARP:
> > > -   case CHIP_CYAN_SKILLFISH:
> > > -   if (adev->asic_type == CHIP_VANGOGH)
> > > -   adev->family = AMDGPU_FAMILY_VGH;
> > > -   else if (adev->asic_type == CHIP_YELLOW_CARP)
> > > -   adev->family = AMDGPU_FAMILY_YC;
> > > -   else
> > > -   adev->family = AMDGPU_FAMILY_NV;
> > > -
> > > -   r = nv_set_ip_blocks(adev);
> > > -   if (r)
> > > -   return r;
> > > -   break;
> > > default:
> > > r = amdgpu_discovery_set_ip_blocks(adev);
> > > if (r)
> > > --
> > > 2.31.1
> > >


Fwd: [PATCH] Size can be any value and is user controlled resulting in overwriting the 40 byte array wr_buf with an arbitrary length of data from buf.

2021-10-11 Thread T. Williams
---------- Forwarded message ---------
From: docfate111 
Date: Mon, Oct 11, 2021 at 4:22 PM
Subject: [PATCH] Size can be any value and is user controlled resulting in
overwriting the 40 byte array wr_buf with an arbitrary length of data from
buf.
To: 
Cc: , 


Signed-off-by: docfate111 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 87daa78a32b8..17f2756a64dc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -263,7 +263,7 @@ static ssize_t dp_link_settings_write(struct file *f,
const char __user *buf,
if (!wr_buf)
return -ENOSPC;

-   if (parse_write_buffer_into_params(wr_buf, size,
+   if (parse_write_buffer_into_params(wr_buf, wr_buf_size,
   (long *)param, buf,
   max_param_num,
   &param_nums)) {
-- 
2.25.1



-- 
Thank you for your time,
Thelford Williams
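
The class of bug this patch targets, a caller-supplied length used to fill a fixed-size buffer, is usually handled by clamping the length to the destination capacity. A minimal userspace sketch of that pattern follows; `copy_bounded` is a hypothetical helper and does not reproduce the debugfs code or `parse_write_buffer_into_params`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size - 1 bytes from src and NUL-terminate.
 * Returns the number of bytes copied (excluding the terminator).
 */
static size_t copy_bounded(char *dst, size_t dst_size,
			   const char *src, size_t src_len)
{
	size_t n;

	if (dst_size == 0)
		return 0;

	n = src_len < dst_size - 1 ? src_len : dst_size - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

With an 8-byte destination, a 16-byte input is truncated to 7 bytes plus the terminator instead of running past the end of the buffer.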


Re: [PATCH 28/64] drm/amdgpu: drive all navi asics from the IP discovery table

2021-10-11 Thread Alex Deucher
On Mon, Oct 11, 2021 at 1:20 PM Mike Lothian  wrote:
>
> Hi
>
> This patch breaks things for me on my Green Sardine & Navy Flounder
> system (Asus ROG G513QY)
>
> It doesn't get past post with amdgpu built in, will try as a module

Can you provide the dmesg output in that case?

Alex


>
> Cheers
>
> Mike
>
> On Tue, 28 Sept 2021 at 17:44, Alex Deucher  wrote:
> >
> > Rather than hardcoding based on asic_type, use the IP
> > discovery table to configure the driver.
> >
> > v2: rebase
> >
> > Reviewed-by: Christian König 
> > Signed-off-by: Alex Deucher 
> > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 --------------------
> >  1 file changed, 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 5e0956b19d69..9c47cc636429 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2142,26 +2142,6 @@ static int amdgpu_device_ip_early_init(struct 
> > amdgpu_device *adev)
> > if (r)
> > return r;
> > break;
> > -   case  CHIP_NAVI14:
> > -   case  CHIP_NAVI12:
> > -   case  CHIP_SIENNA_CICHLID:
> > -   case  CHIP_NAVY_FLOUNDER:
> > -   case  CHIP_DIMGREY_CAVEFISH:
> > -   case  CHIP_BEIGE_GOBY:
> > -   case CHIP_VANGOGH:
> > -   case CHIP_YELLOW_CARP:
> > -   case CHIP_CYAN_SKILLFISH:
> > -   if (adev->asic_type == CHIP_VANGOGH)
> > -   adev->family = AMDGPU_FAMILY_VGH;
> > -   else if (adev->asic_type == CHIP_YELLOW_CARP)
> > -   adev->family = AMDGPU_FAMILY_YC;
> > -   else
> > -   adev->family = AMDGPU_FAMILY_NV;
> > -
> > -   r = nv_set_ip_blocks(adev);
> > -   if (r)
> > -   return r;
> > -   break;
> > default:
> > r = amdgpu_discovery_set_ip_blocks(adev);
> > if (r)
> > --
> > 2.31.1
> >


Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

2021-10-11 Thread Borislav Petkov
On Mon, Oct 11, 2021 at 08:03:51AM +, Quan, Evan wrote:
> OK... Then forget about previous patches. Let's try to narrow down the
> issue first. Please try the attached patch1 first. If it works,

It does.

> please undo the changes of patch1 and try patch2 to narrow down further.

It does too.

:-)

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH 28/64] drm/amdgpu: drive all navi asics from the IP discovery table

2021-10-11 Thread Mike Lothian
Hi

This patch breaks things for me on my Green Sardine & Navy Flounder
system (Asus ROG G513QY)

It doesn't get past post with amdgpu built in, will try as a module

Cheers

Mike

On Tue, 28 Sept 2021 at 17:44, Alex Deucher  wrote:
>
> Rather than hardcoding based on asic_type, use the IP
> discovery table to configure the driver.
>
> v2: rebase
>
> Reviewed-by: Christian König 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 --------------------
>  1 file changed, 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5e0956b19d69..9c47cc636429 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2142,26 +2142,6 @@ static int amdgpu_device_ip_early_init(struct 
> amdgpu_device *adev)
> if (r)
> return r;
> break;
> -   case  CHIP_NAVI14:
> -   case  CHIP_NAVI12:
> -   case  CHIP_SIENNA_CICHLID:
> -   case  CHIP_NAVY_FLOUNDER:
> -   case  CHIP_DIMGREY_CAVEFISH:
> -   case  CHIP_BEIGE_GOBY:
> -   case CHIP_VANGOGH:
> -   case CHIP_YELLOW_CARP:
> -   case CHIP_CYAN_SKILLFISH:
> -   if (adev->asic_type == CHIP_VANGOGH)
> -   adev->family = AMDGPU_FAMILY_VGH;
> -   else if (adev->asic_type == CHIP_YELLOW_CARP)
> -   adev->family = AMDGPU_FAMILY_YC;
> -   else
> -   adev->family = AMDGPU_FAMILY_NV;
> -
> -   r = nv_set_ip_blocks(adev);
> -   if (r)
> -   return r;
> -   break;
> default:
> r = amdgpu_discovery_set_ip_blocks(adev);
> if (r)
> --
> 2.31.1
>


Re: [PATCH -v2] x86/Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT automatically

2021-10-11 Thread Tom Lendacky

On 10/11/21 11:03 AM, Borislav Petkov wrote:

Ok,

here's v2, I've added "however" number 3 below which should summarize
Christian's note about coherent and concurrent use of memory by the GPU
and CPU, which obviously cannot work with bounce buffers.

I'll send it to Linus next week if there are no more complaints.

Thx.

---
From: Borislav Petkov 

This Kconfig option was added initially so that memory encryption is
enabled by default on machines which support it.

However, devices which have DMA masks that are less than the bit
position of the encryption bit, aka C-bit, require the use of an IOMMU
or the use of SWIOTLB.

If the IOMMU is disabled or in passthrough mode, the kernel would switch
to SWIOTLB bounce-buffering for those transfers.

In order to avoid that,

   2cc13bb4f59f ("iommu: Disable passthrough mode when SME is active")

disables the default IOMMU passthrough mode so that devices for which the
default 256K DMA is insufficient, can use the IOMMU instead.

However 2, there are cases where the IOMMU is disabled in the BIOS, etc.
(think the usual hardware folk "oops, I dropped the ball there" cases) or a
driver doesn't properly use the DMA APIs or a device has a firmware or
hardware bug, e.g.:

   ea68573d408f ("drm/amdgpu: Fail to load on RAVEN if SME is active")

However 3, in the above GPU use case, there are APIs like Vulkan and
some OpenGL/OpenCL extensions which are under the assumption that
user-allocated memory can be passed in to the kernel driver and both the
GPU and CPU can do coherent and concurrent access to the same memory.
That cannot work with SWIOTLB bounce buffers, of course.

So, in order for those devices to function, drop the "default y" for the
SME by default active option so that users who want to have SME enabled,
will need to either enable it in their config or use "mem_encrypt=on" on
the kernel command line.

  [ tlendacky: Generalize commit message. ]

Fixes: 7744ccdbc16f ("x86/mm: Add Secure Memory Encryption (SME) support")
Reported-by: Paul Menzel 
Signed-off-by: Borislav Petkov 


Acked-by: Tom Lendacky 


Cc: 
Link: https://lkml.kernel.org/r/8bbacd0e-4580-3194-19d2-a0ecad7df09c@molgen.mpg.de
---
  arch/x86/Kconfig | 1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bd70e8a39fbf..d9830e7e1060 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1525,7 +1525,6 @@ config AMD_MEM_ENCRYPT
  
  config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT

bool "Activate AMD Secure Memory Encryption (SME) by default"
-   default y
depends on AMD_MEM_ENCRYPT
help
  Say yes to have system memory encrypted by default if running on



Re: [PATCH -v2] x86/Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT automatically

2021-10-11 Thread Alex Deucher
On Mon, Oct 11, 2021 at 12:03 PM Borislav Petkov  wrote:
>
> Ok,
>
> here's v2, I've added "however" number 3 below which should summarize
> Christian's note about coherent and concurrent use of memory by the GPU
> and CPU, which obviously cannot work with bounce buffers.
>
> I'll send it to Linus next week if there are no more complaints.
>
> Thx.
>
> ---
> From: Borislav Petkov 
>
> This Kconfig option was added initially so that memory encryption is
> enabled by default on machines which support it.
>
> However, devices which have DMA masks that are less than the bit
> position of the encryption bit, aka C-bit, require the use of an IOMMU
> or the use of SWIOTLB.
>
> If the IOMMU is disabled or in passthrough mode, the kernel would switch
> to SWIOTLB bounce-buffering for those transfers.
>
> In order to avoid that,
>
>   2cc13bb4f59f ("iommu: Disable passthrough mode when SME is active")
>
> disables the default IOMMU passthrough mode so that devices for which the
> default 256K DMA is insufficient, can use the IOMMU instead.
>
> However 2, there are cases where the IOMMU is disabled in the BIOS, etc.
> (think the usual hardware folk "oops, I dropped the ball there" cases) or a
> driver doesn't properly use the DMA APIs or a device has a firmware or
> hardware bug, e.g.:
>
>   ea68573d408f ("drm/amdgpu: Fail to load on RAVEN if SME is active")
>
> However 3, in the above GPU use case, there are APIs like Vulkan and
> some OpenGL/OpenCL extensions which are under the assumption that
> user-allocated memory can be passed in to the kernel driver and both the
> GPU and CPU can do coherent and concurrent access to the same memory.
> That cannot work with SWIOTLB bounce buffers, of course.
>
> So, in order for those devices to function, drop the "default y" for the
> SME by default active option so that users who want to have SME enabled,
> will need to either enable it in their config or use "mem_encrypt=on" on
> the kernel command line.
>
>  [ tlendacky: Generalize commit message. ]
>
> Fixes: 7744ccdbc16f ("x86/mm: Add Secure Memory Encryption (SME) support")
> Reported-by: Paul Menzel 
> Signed-off-by: Borislav Petkov 
> Cc: 
> Link: https://lkml.kernel.org/r/8bbacd0e-4580-3194-19d2-a0ecad7df09c@molgen.mpg.de

Acked-by: Alex Deucher 

> ---
>  arch/x86/Kconfig | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index bd70e8a39fbf..d9830e7e1060 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1525,7 +1525,6 @@ config AMD_MEM_ENCRYPT
>
>  config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
> bool "Activate AMD Secure Memory Encryption (SME) by default"
> -   default y
> depends on AMD_MEM_ENCRYPT
> help
>   Say yes to have system memory encrypted by default if running on
> --
> 2.29.2
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


[PATCH -v2] x86/Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT automatically

2021-10-11 Thread Borislav Petkov
Ok,

here's v2, I've added "however" number 3 below which should summarize
Christian's note about coherent and concurrent use of memory by the GPU
and CPU, which obviously cannot work with bounce buffers.

I'll send it to Linus next week if there are no more complaints.

Thx.

---
From: Borislav Petkov 

This Kconfig option was added initially so that memory encryption is
enabled by default on machines which support it.

However, devices which have DMA masks that are less than the bit
position of the encryption bit, aka C-bit, require the use of an IOMMU
or the use of SWIOTLB.

If the IOMMU is disabled or in passthrough mode, the kernel would switch
to SWIOTLB bounce-buffering for those transfers.

In order to avoid that,

  2cc13bb4f59f ("iommu: Disable passthrough mode when SME is active")

disables the default IOMMU passthrough mode so that devices for which the
default 256K DMA is insufficient, can use the IOMMU instead.

However 2, there are cases where the IOMMU is disabled in the BIOS, etc.
(think the usual hardware folk "oops, I dropped the ball there" cases) or a
driver doesn't properly use the DMA APIs or a device has a firmware or
hardware bug, e.g.:

  ea68573d408f ("drm/amdgpu: Fail to load on RAVEN if SME is active")

However 3, in the above GPU use case, there are APIs like Vulkan and
some OpenGL/OpenCL extensions which are under the assumption that
user-allocated memory can be passed in to the kernel driver and both the
GPU and CPU can do coherent and concurrent access to the same memory.
That cannot work with SWIOTLB bounce buffers, of course.

So, in order for those devices to function, drop the "default y" for the
SME by default active option so that users who want to have SME enabled,
will need to either enable it in their config or use "mem_encrypt=on" on
the kernel command line.

 [ tlendacky: Generalize commit message. ]

Fixes: 7744ccdbc16f ("x86/mm: Add Secure Memory Encryption (SME) support")
Reported-by: Paul Menzel 
Signed-off-by: Borislav Petkov 
Cc: 
Link: https://lkml.kernel.org/r/8bbacd0e-4580-3194-19d2-a0ecad7df...@molgen.mpg.de
---
 arch/x86/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bd70e8a39fbf..d9830e7e1060 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1525,7 +1525,6 @@ config AMD_MEM_ENCRYPT
 
 config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
bool "Activate AMD Secure Memory Encryption (SME) by default"
-   default y
depends on AMD_MEM_ENCRYPT
help
  Say yes to have system memory encrypted by default if running on
-- 
2.29.2

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
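
Editorial sketch: the DMA-mask condition described in the commit message above can be illustrated in plain user-space C. This is not kernel code; the helper names dma_bit_mask() and dma_mask_covers_c_bit() are hypothetical, and the C-bit position used in the usage note (47) is only an assumed value for illustration, since the real position is reported by the CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a DMA mask covering the low `bits` address bits (1..64).
 * Hypothetical helper for this sketch only. */
static uint64_t dma_bit_mask(unsigned int bits)
{
    return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/*
 * With SME active, the bus address of an encrypted page has the C-bit
 * set. A device can reach such an address directly only if its DMA mask
 * includes the C-bit position; otherwise the transfer needs an IOMMU
 * mapping or a SWIOTLB bounce buffer. `c_bit` must be below 64.
 */
static bool dma_mask_covers_c_bit(uint64_t dma_mask, unsigned int c_bit)
{
    return (dma_mask & ((uint64_t)1 << c_bit)) != 0;
}
```

Under the assumed C-bit position of 47, a device with a 32-bit or 44-bit mask would not pass this check, while a device with a 48-bit mask would, which matches the "doesn't support a 48 bit DMA mask" wording used elsewhere in this thread.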


[PATCH v5] amd/display: only require overlay plane to cover whole CRTC on ChromeOS

2021-10-11 Thread Simon Ser
Commit ddab8bd788f5 ("drm/amd/display: Fix two cursor duplication when
using overlay") changed the atomic validation code to forbid the
overlay plane from being used if it doesn't cover the whole CRTC. The
motivation is that ChromeOS uses the atomic API for everything except
the cursor plane (which uses the legacy API). Thus amdgpu must always
be prepared to enable/disable/move the cursor plane at any time without
failing (or else ChromeOS will trip over).

As discussed in [1], there's no reason why the ChromeOS limitation
should prevent other fully atomic users from taking advantage of the
overlay plane. Let's limit the check to ChromeOS.

v4: fix ChromeOS detection (Harry)

v5: fix conflict with linux-next

[1]: https://lore.kernel.org/amd-gfx/JIQ_93_cHcshiIDsrMU1huBzx9P9LVQxucx8hQArpQu7Wk5DrCl_vTXj_Q20m_L-8C8A5dSpNcSJ8ehfcCrsQpfB5QG_Spn14EYkH9chtg0=@emersion.fr/

Signed-off-by: Simon Ser 
Cc: Alex Deucher 
Cc: Harry Wentland 
Cc: Nicholas Kazlauskas 
Cc: Bas Nieuwenhuizen 
Cc: Rodrigo Siqueira 
Cc: Sean Paul 
Fixes: ddab8bd788f5 ("drm/amd/display: Fix two cursor duplication when using overlay")
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 29 +++
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f35561b5a465..2eeda1fec506 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10594,6 +10594,31 @@ static int add_affected_mst_dsc_crtcs(struct drm_atomic_state *state, struct drm
 }
 #endif
 
+static bool is_chromeos(void)
+{
+   struct mm_struct *mm = current->mm;
+   struct file *exe_file;
+   bool ret;
+
+   /* ChromeOS renames its thread to DrmThread. Also check the executable
+* name. */
+   if (strcmp(current->comm, "DrmThread") != 0 || !mm)
+   return false;
+
+   rcu_read_lock();
+   exe_file = rcu_dereference(mm->exe_file);
+   if (exe_file && !get_file_rcu(exe_file))
+   exe_file = NULL;
+   rcu_read_unlock();
+
+   if (!exe_file)
+   return false;
+   ret = strcmp(exe_file->f_path.dentry->d_name.name, "chrome") == 0;
+   fput(exe_file);
+
+   return ret;
+}
+
 static int validate_overlay(struct drm_atomic_state *state)
 {
int i;
@@ -10601,6 +10626,10 @@ static int validate_overlay(struct drm_atomic_state *state)
struct drm_plane_state *new_plane_state;
struct drm_plane_state *primary_state, *overlay_state = NULL;
 
+   /* This is a workaround for ChromeOS only */
+   if (!is_chromeos())
+   return 0;
+
/* Check if primary plane is contained inside overlay */
for_each_new_plane_in_state_reverse(state, plane, new_plane_state, i) {
if (plane->type == DRM_PLANE_TYPE_OVERLAY) {
-- 
2.33.0
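
Editorial sketch: the two-step heuristic used by is_chromeos() in the patch above (thread name first, then executable name) can be shown as a pure user-space function. This is only an approximation under stated assumptions: looks_like_chromeos() and path_basename() are hypothetical names, and the inputs are passed in as strings instead of being read from the current task as the kernel code does.

```c
#include <stdbool.h>
#include <string.h>

/* Return the final component of `path` (the whole string if no '/'). */
static const char *path_basename(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}

/*
 * Mirror the order of checks in the patch: the thread must be named
 * "DrmThread", and the executable's file name must be "chrome".
 * Missing information is treated as "not ChromeOS".
 */
static bool looks_like_chromeos(const char *thread_name, const char *exe_path)
{
    if (!thread_name || !exe_path)
        return false;
    if (strcmp(thread_name, "DrmThread") != 0)
        return false;

    return strcmp(path_basename(exe_path), "chrome") == 0;
}
```

With a made-up path such as "/opt/google/chrome/chrome" and the thread name "DrmThread", the function returns true; any other thread name or executable name returns false, so other atomic users would skip the overlay restriction.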




[PATCH] drm/amdgpu/pm: properly handle sclk for profiling modes on vangogh

2021-10-11 Thread Alex Deucher
When selecting between levels in the force performance levels interface
sclk (gfxclk) was not set correctly for all levels.  Select the proper
sclk settings for all levels.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1726
Signed-off-by: Alex Deucher 
---
 .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  | 89 ++-
 1 file changed, 29 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index bdd1a01e27b4..8d5f32807821 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -1386,52 +1386,38 @@ static int vangogh_set_performance_level(struct smu_context *smu,
uint32_t soc_mask, mclk_mask, fclk_mask;
uint32_t vclk_mask = 0, dclk_mask = 0;
 
+   smu->cpu_actual_soft_min_freq = smu->cpu_default_soft_min_freq;
+   smu->cpu_actual_soft_max_freq = smu->cpu_default_soft_max_freq;
+
switch (level) {
case AMD_DPM_FORCED_LEVEL_HIGH:
-   smu->gfx_actual_hard_min_freq = smu->gfx_default_hard_min_freq;
+   smu->gfx_actual_hard_min_freq = smu->gfx_default_soft_max_freq;
smu->gfx_actual_soft_max_freq = smu->gfx_default_soft_max_freq;
 
-   smu->cpu_actual_soft_min_freq = smu->cpu_default_soft_min_freq;
-   smu->cpu_actual_soft_max_freq = smu->cpu_default_soft_max_freq;
 
ret = vangogh_force_dpm_limit_value(smu, true);
+   if (ret)
+   return ret;
break;
case AMD_DPM_FORCED_LEVEL_LOW:
smu->gfx_actual_hard_min_freq = smu->gfx_default_hard_min_freq;
-   smu->gfx_actual_soft_max_freq = smu->gfx_default_soft_max_freq;
-
-   smu->cpu_actual_soft_min_freq = smu->cpu_default_soft_min_freq;
-   smu->cpu_actual_soft_max_freq = smu->cpu_default_soft_max_freq;
+   smu->gfx_actual_soft_max_freq = smu->gfx_default_hard_min_freq;
 
ret = vangogh_force_dpm_limit_value(smu, false);
+   if (ret)
+   return ret;
break;
case AMD_DPM_FORCED_LEVEL_AUTO:
smu->gfx_actual_hard_min_freq = smu->gfx_default_hard_min_freq;
smu->gfx_actual_soft_max_freq = smu->gfx_default_soft_max_freq;
 
-   smu->cpu_actual_soft_min_freq = smu->cpu_default_soft_min_freq;
-   smu->cpu_actual_soft_max_freq = smu->cpu_default_soft_max_freq;
-
ret = vangogh_unforce_dpm_levels(smu);
-   break;
-   case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
-   smu->gfx_actual_hard_min_freq = smu->gfx_default_hard_min_freq;
-   smu->gfx_actual_soft_max_freq = smu->gfx_default_soft_max_freq;
-
-   smu->cpu_actual_soft_min_freq = smu->cpu_default_soft_min_freq;
-   smu->cpu_actual_soft_max_freq = smu->cpu_default_soft_max_freq;
-
-   ret = smu_cmn_send_smc_msg_with_param(smu,
-   SMU_MSG_SetHardMinGfxClk,
-   VANGOGH_UMD_PSTATE_STANDARD_GFXCLK, NULL);
-   if (ret)
-   return ret;
-
-   ret = smu_cmn_send_smc_msg_with_param(smu,
-   SMU_MSG_SetSoftMaxGfxClk,
-   VANGOGH_UMD_PSTATE_STANDARD_GFXCLK, NULL);
if (ret)
return ret;
+   break;
+   case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
+   smu->gfx_actual_hard_min_freq = VANGOGH_UMD_PSTATE_STANDARD_GFXCLK;
+   smu->gfx_actual_soft_max_freq = VANGOGH_UMD_PSTATE_STANDARD_GFXCLK;
 
ret = vangogh_get_profiling_clk_mask(smu, level,
&vclk_mask,
@@ -1446,32 +1432,15 @@ static int vangogh_set_performance_level(struct smu_context *smu,
vangogh_force_clk_levels(smu, SMU_SOCCLK, 1 << soc_mask);
vangogh_force_clk_levels(smu, SMU_VCLK, 1 << vclk_mask);
vangogh_force_clk_levels(smu, SMU_DCLK, 1 << dclk_mask);
-
break;
case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK:
smu->gfx_actual_hard_min_freq = smu->gfx_default_hard_min_freq;
-   smu->gfx_actual_soft_max_freq = smu->gfx_default_soft_max_freq;
-
-   smu->cpu_actual_soft_min_freq = smu->cpu_default_soft_min_freq;
-   smu->cpu_actual_soft_max_freq = smu->cpu_default_soft_max_freq;
-
-   ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetHardMinVcn,
-   VANGOGH_UMD_PSTATE_PEAK_DCLK, NULL);
-   if (ret)
-   return ret;
-
-   ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetSoftMaxVcn,
-  
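
Editorial sketch: for the levels visible in the (truncated) hunk above, the patch appears to choose the gfxclk range as follows: HIGH pins both limits to the default maximum, LOW pins both to the default minimum, AUTO restores the default range, and PROFILE_STANDARD pins both to a fixed standard clock. The C below is a simplified illustration of that mapping only; the enum, struct, and function names are hypothetical, and the levels cut off by the truncation are deliberately left out.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the forced performance levels shown above. */
enum forced_level {
    LEVEL_AUTO,
    LEVEL_LOW,
    LEVEL_HIGH,
    LEVEL_PROFILE_STANDARD,
};

struct gfx_range {
    uint32_t hard_min;  /* lower clock limit, MHz */
    uint32_t soft_max;  /* upper clock limit, MHz */
};

/* Pick the gfxclk range for a level, following the hunk above. */
static struct gfx_range pick_gfx_range(enum forced_level level,
                                       uint32_t default_min,
                                       uint32_t default_max,
                                       uint32_t standard_clk)
{
    /* AUTO: the full default range. */
    struct gfx_range r = { default_min, default_max };

    switch (level) {
    case LEVEL_HIGH:
        r.hard_min = default_max;       /* min == max == default max */
        break;
    case LEVEL_LOW:
        r.soft_max = default_min;       /* min == max == default min */
        break;
    case LEVEL_PROFILE_STANDARD:
        r.hard_min = standard_clk;      /* both pinned to one clock */
        r.soft_max = standard_clk;
        break;
    case LEVEL_AUTO:
        break;
    }

    return r;
}
```

The clock values passed in are up to the caller; any numbers used to exercise this function are illustrative and not taken from the driver.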

[PATCH 2/5] drm/amdkfd: protect raven_device_info with KFD_SUPPORT_IOMMU_V2

2021-10-11 Thread Alex Deucher
raven_device_info is not used when KFD_SUPPORT_IOMMU_V2 is not
set.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 31e255ba15ed..c5387036a9c2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -93,7 +93,6 @@ static const struct kfd_device_info carrizo_device_info = {
.num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
-#endif
 
 static const struct kfd_device_info raven_device_info = {
.asic_family = CHIP_RAVEN,
@@ -113,6 +112,7 @@ static const struct kfd_device_info raven_device_info = {
.num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
+#endif
 
 #ifdef CONFIG_DRM_AMDGPU_CIK
 static const struct kfd_device_info hawaii_device_info = {
-- 
2.31.1



[PATCH 5/5] drm/amdgpu: drop navi reg init functions

2021-10-11 Thread Alex Deucher
No longer used since IP enumeration is driven by the IP
discovery table now.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |  6 +-
 .../gpu/drm/amd/amdgpu/beige_goby_reg_init.c  | 54 --
 .../drm/amd/amdgpu/cyan_skillfish_reg_init.c  | 51 -
 drivers/gpu/drm/amd/amdgpu/navi10_reg_init.c  | 55 ---
 drivers/gpu/drm/amd/amdgpu/navi12_reg_init.c  | 52 --
 drivers/gpu/drm/amd/amdgpu/navi14_reg_init.c  | 53 --
 drivers/gpu/drm/amd/amdgpu/nv.h   |  9 ---
 .../drm/amd/amdgpu/sienna_cichlid_reg_init.c  | 54 --
 drivers/gpu/drm/amd/amdgpu/vangogh_reg_init.c | 50 -
 .../gpu/drm/amd/amdgpu/yellow_carp_reg_init.c | 51 -
 10 files changed, 2 insertions(+), 433 deletions(-)
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/beige_goby_reg_init.c
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/cyan_skillfish_reg_init.c
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/navi10_reg_init.c
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/navi12_reg_init.c
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/navi14_reg_init.c
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/sienna_cichlid_reg_init.c
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/vangogh_reg_init.c
 delete mode 100644 drivers/gpu/drm/amd/amdgpu/yellow_carp_reg_init.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 73a2151ee43f..7fedbb725e17 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -73,10 +73,8 @@ amdgpu-$(CONFIG_DRM_AMDGPU_SI)+= si.o gmc_v6_0.o gfx_v6_0.o si_ih.o si_dma.o dce
 
 amdgpu-y += \
vi.o mxgpu_vi.o nbio_v6_1.o soc15.o emu_soc.o mxgpu_ai.o nbio_v7_0.o vega10_reg_init.o \
-   vega20_reg_init.o nbio_v7_4.o nbio_v2_3.o nv.o navi10_reg_init.o navi14_reg_init.o \
-   arct_reg_init.o navi12_reg_init.o mxgpu_nv.o sienna_cichlid_reg_init.o vangogh_reg_init.o \
-   nbio_v7_2.o dimgrey_cavefish_reg_init.o hdp_v4_0.o hdp_v5_0.o aldebaran_reg_init.o aldebaran.o \
-   beige_goby_reg_init.o yellow_carp_reg_init.o cyan_skillfish_reg_init.o
+   vega20_reg_init.o nbio_v7_4.o nbio_v2_3.o nv.o arct_reg_init.o mxgpu_nv.o \
+   nbio_v7_2.o hdp_v4_0.o hdp_v5_0.o aldebaran_reg_init.o aldebaran.o
 
 # add DF block
 amdgpu-y += \
diff --git a/drivers/gpu/drm/amd/amdgpu/beige_goby_reg_init.c b/drivers/gpu/drm/amd/amdgpu/beige_goby_reg_init.c
deleted file mode 100644
index 608a113ce354..
--- a/drivers/gpu/drm/amd/amdgpu/beige_goby_reg_init.c
+++ /dev/null
@@ -1,54 +0,0 @@
-/*
- * Copyright 2020 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- */
-#include "amdgpu.h"
-#include "nv.h"
-
-#include "soc15_common.h"
-#include "soc15_hw_ip.h"
-#include "beige_goby_ip_offset.h"
-
-int beige_goby_reg_base_init(struct amdgpu_device *adev)
-{
-   /* HW has more IP blocks,  only initialize the block needed by driver */
-   uint32_t i;
-   for (i = 0 ; i < MAX_INSTANCE ; ++i) {
-   adev->reg_offset[GC_HWIP][i] = (uint32_t *)(&(GC_BASE.instance[i]));
-   adev->reg_offset[HDP_HWIP][i] = (uint32_t *)(&(HDP_BASE.instance[i]));
-   adev->reg_offset[MMHUB_HWIP][i] = (uint32_t *)(&(MMHUB_BASE.instance[i]));
-   adev->reg_offset[ATHUB_HWIP][i] = (uint32_t *)(&(ATHUB_BASE.instance[i]));
-   adev->reg_offset[NBIO_HWIP][i] = (uint32_t *)(&(NBIO_BASE.instance[i]));
-   adev->reg_offset[MP0_HWIP][i] = (uint32_t *)(&(MP0_BASE.instance[i]));
-   adev->reg_offset[MP1_HWIP][i] = (uint32_t *)(&(MP1_BASE.instance[i]));
-   adev->reg_offset[VCN_HWIP][i] = (uint32_t *)(&(VCN0_BASE.instance[i]));
-   adev->reg_offset[DF_HWIP][i] = (uint32_t *)(&(DF_BASE.instance[i]));
-   adev->reg_offset[DCE_HWIP][i] = (uint32

[PATCH 3/5] drm/amdgpu: drop soc15_set_ip_blocks()

2021-10-11 Thread Alex Deucher
No longer used since IP enumeration is now driven by
amdgpu IP discovery code.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 179 -
 drivers/gpu/drm/amd/amdgpu/soc15.h |   1 -
 2 files changed, 180 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 74310bb4216a..b5d7f21018cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -780,185 +780,6 @@ void soc15_set_virt_ops(struct amdgpu_device *adev)
soc15_reg_base_init(adev);
 }
 
-int soc15_set_ip_blocks(struct amdgpu_device *adev)
-{
-   /* for bare metal case */
-   if (!amdgpu_sriov_vf(adev))
-   soc15_reg_base_init(adev);
-
-   if (adev->flags & AMD_IS_APU) {
-   adev->nbio.funcs = &nbio_v7_0_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v7_0_hdp_flush_reg;
-   } else if (adev->asic_type == CHIP_VEGA20 ||
-  adev->asic_type == CHIP_ARCTURUS ||
-  adev->asic_type == CHIP_ALDEBARAN) {
-   adev->nbio.funcs = &nbio_v7_4_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg;
-   } else {
-   adev->nbio.funcs = &nbio_v6_1_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v6_1_hdp_flush_reg;
-   }
-   adev->hdp.funcs = &hdp_v4_0_funcs;
-
-   if (adev->asic_type == CHIP_VEGA20 ||
-   adev->asic_type == CHIP_ARCTURUS ||
-   adev->asic_type == CHIP_ALDEBARAN)
-   adev->df.funcs = &df_v3_6_funcs;
-   else
-   adev->df.funcs = &df_v1_7_funcs;
-
-   if (adev->asic_type == CHIP_VEGA20 ||
-   adev->asic_type == CHIP_ARCTURUS)
-   adev->smuio.funcs = &smuio_v11_0_funcs;
-   else if (adev->asic_type == CHIP_ALDEBARAN)
-   adev->smuio.funcs = &smuio_v13_0_funcs;
-   else
-   adev->smuio.funcs = &smuio_v9_0_funcs;
-
-   adev->rev_id = soc15_get_rev_id(adev);
-
-   switch (adev->asic_type) {
-   case CHIP_VEGA10:
-   case CHIP_VEGA12:
-   case CHIP_VEGA20:
-   amdgpu_device_ip_block_add(adev, &vega10_common_ip_block);
-   amdgpu_device_ip_block_add(adev, &gmc_v9_0_ip_block);
-
-   /* For Vega10 SR-IOV, PSP need to be initialized before IH */
-   if (amdgpu_sriov_vf(adev)) {
-   if (likely(adev->firmware.load_type == AMDGPU_FW_LOAD_PSP)) {
-   if (adev->asic_type == CHIP_VEGA20)
-   amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
-   else
-   amdgpu_device_ip_block_add(adev, &psp_v3_1_ip_block);
-   }
-   if (adev->asic_type == CHIP_VEGA20)
-   amdgpu_device_ip_block_add(adev, &vega20_ih_ip_block);
-   else
-   amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
-   } else {
-   if (adev->asic_type == CHIP_VEGA20)
-   amdgpu_device_ip_block_add(adev, &vega20_ih_ip_block);
-   else
-   amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
-   if (likely(adev->firmware.load_type == AMDGPU_FW_LOAD_PSP)) {
-   if (adev->asic_type == CHIP_VEGA20)
-   amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
-   else
-   amdgpu_device_ip_block_add(adev, &psp_v3_1_ip_block);
-   }
-   }
-   amdgpu_device_ip_block_add(adev, &gfx_v9_0_ip_block);
-   amdgpu_device_ip_block_add(adev, &sdma_v4_0_ip_block);
-   if (is_support_sw_smu(adev)) {
-   if (!amdgpu_sriov_vf(adev))
-   amdgpu_device_ip_block_add(adev, &smu_v11_0_ip_block);
-   } else {
-   amdgpu_device_ip_block_add(adev, &pp_smu_ip_block);
-   }
-   if (adev->enable_virtual_display || amdgpu_sriov_vf(adev))
-   amdgpu_device_ip_block_add(adev, &amdgpu_vkms_ip_block);
-#if defined(CONFIG_DRM_AMD_DC)
-   else if (amdgpu_device_has_dc_support(adev))
-   amdgpu_device_ip_block_add(adev, &dm_ip_block);
-#endif
-   if (!(adev->asic_type == CHIP_VEGA20 && amdgpu_sriov_vf(adev))) {
-   amdgpu_device_ip_block_add(adev, &uvd_v7_0_ip_block);
-   amdgpu_device_ip_block_add(adev, &vce_v4_0_ip_block);
-   }
-   break;
-   case CHIP_RAVEN:
-   amdgpu_device_ip_block_add(adev, &vega10_common_ip_block);
-   amdgpu_device_ip_block_a

[PATCH 4/5] drm/amdgpu: drop nv_set_ip_blocks()

2021-10-11 Thread Alex Deucher
No longer used since IP enumeration is now driven by
amdgpu IP discovery code.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 293 
 drivers/gpu/drm/amd/amdgpu/nv.h |   1 -
 2 files changed, 294 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 898e688be63c..90ae5d99e94a 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -607,304 +607,11 @@ const struct amdgpu_ip_block_version nv_common_ip_block =
.funcs = &nv_common_ip_funcs,
 };
 
-static int nv_reg_base_init(struct amdgpu_device *adev)
-{
-   int r;
-
-   if (amdgpu_discovery) {
-   r = amdgpu_discovery_reg_base_init(adev);
-   if (r) {
-   DRM_WARN("failed to init reg base from ip discovery table, "
-   "fallback to legacy init method\n");
-   goto legacy_init;
-   }
-
-   amdgpu_discovery_harvest_ip(adev);
-
-   return 0;
-   }
-
-legacy_init:
-   switch (adev->asic_type) {
-   case CHIP_NAVI10:
-   navi10_reg_base_init(adev);
-   break;
-   case CHIP_NAVI14:
-   navi14_reg_base_init(adev);
-   break;
-   case CHIP_NAVI12:
-   navi12_reg_base_init(adev);
-   break;
-   case CHIP_SIENNA_CICHLID:
-   case CHIP_NAVY_FLOUNDER:
-   sienna_cichlid_reg_base_init(adev);
-   break;
-   case CHIP_VANGOGH:
-   vangogh_reg_base_init(adev);
-   break;
-   case CHIP_DIMGREY_CAVEFISH:
-   dimgrey_cavefish_reg_base_init(adev);
-   break;
-   case CHIP_BEIGE_GOBY:
-   beige_goby_reg_base_init(adev);
-   break;
-   case CHIP_YELLOW_CARP:
-   yellow_carp_reg_base_init(adev);
-   break;
-   case CHIP_CYAN_SKILLFISH:
-   cyan_skillfish_reg_base_init(adev);
-   break;
-   default:
-   return -EINVAL;
-   }
-
-   return 0;
-}
-
 void nv_set_virt_ops(struct amdgpu_device *adev)
 {
adev->virt.ops = &xgpu_nv_virt_ops;
 }
 
-int nv_set_ip_blocks(struct amdgpu_device *adev)
-{
-   int r;
-
-   if (adev->asic_type == CHIP_CYAN_SKILLFISH) {
-   adev->nbio.funcs = &nbio_v2_3_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
-   } else if (adev->flags & AMD_IS_APU) {
-   adev->nbio.funcs = &nbio_v7_2_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v7_2_hdp_flush_reg;
-   } else {
-   adev->nbio.funcs = &nbio_v2_3_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
-   }
-   adev->hdp.funcs = &hdp_v5_0_funcs;
-
-   if (adev->asic_type >= CHIP_SIENNA_CICHLID)
-   adev->smuio.funcs = &smuio_v11_0_6_funcs;
-   else
-   adev->smuio.funcs = &smuio_v11_0_funcs;
-
-   if (adev->asic_type == CHIP_SIENNA_CICHLID)
-   adev->gmc.xgmi.supported = true;
-
-   /* Set IP register base before any HW register access */
-   r = nv_reg_base_init(adev);
-   if (r)
-   return r;
-
-   switch (adev->asic_type) {
-   case CHIP_NAVI10:
-   case CHIP_NAVI14:
-   amdgpu_device_ip_block_add(adev, &nv_common_ip_block);
-   amdgpu_device_ip_block_add(adev, &gmc_v10_0_ip_block);
-   amdgpu_device_ip_block_add(adev, &navi10_ih_ip_block);
-   amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP &&
-   !amdgpu_sriov_vf(adev))
-   amdgpu_device_ip_block_add(adev, &smu_v11_0_ip_block);
-   if (adev->enable_virtual_display || amdgpu_sriov_vf(adev))
-   amdgpu_device_ip_block_add(adev, &amdgpu_vkms_ip_block);
-#if defined(CONFIG_DRM_AMD_DC)
-   else if (amdgpu_device_has_dc_support(adev))
-   amdgpu_device_ip_block_add(adev, &dm_ip_block);
-#endif
-   amdgpu_device_ip_block_add(adev, &gfx_v10_0_ip_block);
-   amdgpu_device_ip_block_add(adev, &sdma_v5_0_ip_block);
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT &&
-   !amdgpu_sriov_vf(adev))
-   amdgpu_device_ip_block_add(adev, &smu_v11_0_ip_block);
-   amdgpu_device_ip_block_add(adev, &vcn_v2_0_ip_block);
-   amdgpu_device_ip_block_add(adev, &jpeg_v2_0_ip_block);
-   if (adev->enable_mes)
-   amdgpu_device_ip_block_add(adev, &mes_v10_1_ip_block);
-   break;
-   case CHIP_NAVI12:
-   amdgpu_device_ip_block_add(adev, &nv_common_ip_block);
-   amdgpu_device_ip_block_add(adev, &gmc_v10_0_ip_block);
-  

[PATCH 1/5] drm/amdkfd: protect hawaii_device_info with CONFIG_DRM_AMDGPU_CIK

2021-10-11 Thread Alex Deucher
hawaii_device_info is not used when CONFIG_DRM_AMDGPU_CIK is not
set.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 064d42acd54e..31e255ba15ed 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -114,6 +114,7 @@ static const struct kfd_device_info raven_device_info = {
.num_sdma_queues_per_engine = 2,
 };
 
+#ifdef CONFIG_DRM_AMDGPU_CIK
 static const struct kfd_device_info hawaii_device_info = {
.asic_family = CHIP_HAWAII,
.asic_name = "hawaii",
@@ -133,6 +134,7 @@ static const struct kfd_device_info hawaii_device_info = {
.num_xgmi_sdma_engines = 0,
.num_sdma_queues_per_engine = 2,
 };
+#endif
 
 static const struct kfd_device_info tonga_device_info = {
.asic_family = CHIP_TONGA,
-- 
2.31.1
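
Editorial sketch: the pattern in the patch above is to place a static table under the same preprocessor guard as its only users, so that builds without the option do not carry an unused definition. The stand-alone C below illustrates that idea with made-up names; DEMO_HAVE_CIK, struct device_info, and lookup_device() are hypothetical and are not part of amdkfd.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down device description for this sketch. */
struct device_info {
    const char *name;
    unsigned int sdma_queues_per_engine;
};

/* Defined only when the (made-up) option is enabled, like the table
 * guarded by CONFIG_DRM_AMDGPU_CIK in the patch. */
#ifdef DEMO_HAVE_CIK
static const struct device_info hawaii_info = { "hawaii", 2 };
#endif

static const struct device_info tonga_info = { "tonga", 2 };

/* Every reference to the guarded table sits under the same guard. */
static const struct device_info *lookup_device(const char *name)
{
#ifdef DEMO_HAVE_CIK
    if (strcmp(name, "hawaii") == 0)
        return &hawaii_info;
#endif
    if (strcmp(name, "tonga") == 0)
        return &tonga_info;

    return NULL;
}
```

Built without DEMO_HAVE_CIK, lookup_device("hawaii") returns NULL and the compiler never sees an unused hawaii_info; built with it, both the table and its user are present.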



Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Alex Deucher
On Mon, Oct 11, 2021 at 10:21 AM Paul Menzel  wrote:
>
> Dear Tom,
>
>
> Am 11.10.21 um 15:58 schrieb Tom Lendacky:
> > On 10/11/21 8:52 AM, Paul Menzel wrote:
>
> >> Am 11.10.21 um 15:27 schrieb Tom Lendacky:
> >>> On 10/11/21 8:11 AM, Borislav Petkov wrote:
> >>>> On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:
> >>>>> I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
> >>>>> this looks fine. The help text could also be updated to mention problems
> >>>>> with AMD Raven devices.
> >>>>
> >>>> This is not only about Raven GPUs but, as Alex explained, pretty much
> >>>> about every device which doesn't support a 48 bit DMA mask. I'll expand
> >>>> that aspect in the changelog.
> >>>
> >>> In general, non-GPU devices that don't support a 48-bit DMA mask work
> >>> fine (assuming they have set their DMA mask appropriately). It really
> >>> depends on whether SWIOTLB will be able to satisfy the memory
> >>> requirements of the driver when the IOMMU is not enabled or in
> >>> passthrough mode. Since GPU devices need/use a lot of memory, that
> >>> becomes a problem.
> >>
> >> How can I check that?
> >
> > How can you check what? 32-bit DMA devices? GPUs? I need a bit more
> > information...
>
> How can I check, why MEM_ENCRYPT is not working on my device despite the
> IOMMU being enabled.

I think there are several potential problem cases:

1. Device is in passthrough mode in the IOMMU and the device has a
limited DMA mask.  This could be due to a hardware requirement (e.g.,
IOMMUv2 functionality) or a hardware/platform requirement (e.g., ACPI
IOMMU tables define passthrough for a specific device or memory
region).  This is the case for Raven.

2. Device driver bug (e.g., driver not using the DMA API properly)

Alex

>
>
> Kind regards,
>
> Paul


Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Tom Lendacky

On 10/11/21 9:21 AM, Paul Menzel wrote:
> Dear Tom,
>
>
> Am 11.10.21 um 15:58 schrieb Tom Lendacky:
>> On 10/11/21 8:52 AM, Paul Menzel wrote:
>
>>> Am 11.10.21 um 15:27 schrieb Tom Lendacky:
>>>> On 10/11/21 8:11 AM, Borislav Petkov wrote:
>>>>> On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:
>>>>>> I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
>>>>>> this looks fine. The help text could also be updated to mention problems
>>>>>> with AMD Raven devices.
>>>>>
>>>>> This is not only about Raven GPUs but, as Alex explained, pretty much
>>>>> about every device which doesn't support a 48 bit DMA mask. I'll expand
>>>>> that aspect in the changelog.
>>>>
>>>> In general, non-GPU devices that don't support a 48-bit DMA mask work
>>>> fine (assuming they have set their DMA mask appropriately). It really
>>>> depends on whether SWIOTLB will be able to satisfy the memory
>>>> requirements of the driver when the IOMMU is not enabled or in
>>>> passthrough mode. Since GPU devices need/use a lot of memory, that
>>>> becomes a problem.
>>>
>>> How can I check that?
>>
>> How can you check what? 32-bit DMA devices? GPUs? I need a bit more
>> information...
>
> How can I check, why MEM_ENCRYPT is not working on my device despite the
> IOMMU being enabled.

I believe Alex already explained that. Your original message is from commit:

ea68573d408f ("drm/amdgpu: Fail to load on RAVEN if SME is active")

Thanks,
Tom

>
>
> Kind regards,
>
> Paul


RE: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish

2021-10-11 Thread Chen, Guchun
[Public]

A global variable to carry the sclk value looks like overkill. Is it possible
to move everything into cyan_skillfish_od_edit_dpm_table, i.e. query sclk there
first and then set it to cyan_skillfish_user_settings.sclk?

Regards,
Guchun

-----Original Message-----
From: amd-gfx  On Behalf Of Lazar, Lijo
Sent: Monday, October 11, 2021 4:54 PM
To: Yu, Lang ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Huang, Ray 

Subject: Re: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish



On 10/11/2021 2:01 PM, Lang Yu wrote:
> Query default sclk instead of hard code.
> 
> Signed-off-by: Lang Yu 
> ---
>   .../gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c  | 12 +---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
> index 3d4c65bc29dc..d98fd06a2574 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
> @@ -47,7 +47,6 @@
>   /* unit: MHz */
>   #define CYAN_SKILLFISH_SCLK_MIN 1000
>   #define CYAN_SKILLFISH_SCLK_MAX 2000
> -#define CYAN_SKILLFISH_SCLK_DEFAULT  1800
>   
>   /* unit: mV */
>   #define CYAN_SKILLFISH_VDDC_MIN 700
> @@ -59,6 +58,8 @@ static struct gfx_user_settings {
>   uint32_t vddc;
>   } cyan_skillfish_user_settings;
>   
> +static uint32_t cyan_skillfish_sclk_default;
> +
>   #define FEATURE_MASK(feature) (1ULL << feature)
>   #define SMC_DPM_FEATURE ( \
>   FEATURE_MASK(FEATURE_FCLK_DPM_BIT)  |   \
> @@ -365,13 +366,18 @@ static bool cyan_skillfish_is_dpm_running(struct 
> smu_context *smu)
>   return false;
>   
>   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
> -
>   if (ret)
>   return false;
>   
>   feature_enabled = (uint64_t)feature_mask[0] |
>   ((uint64_t)feature_mask[1] << 32);
>   
> + /*
> +  * cyan_skillfish specific, query default sclk inseted of hard code.
> +  */
> + cyan_skillfish_get_smu_metrics_data(smu, METRICS_CURR_GFXCLK,
> + &cyan_skillfish_sclk_default);
> +

Maybe add if (!cyan_skillfish_sclk_default) so that it's read only once during 
driver load and not on every suspend/resume.

Reviewed-by: Lijo Lazar 

Thanks,
Lijo

>   return !!(feature_enabled & SMC_DPM_FEATURE);
>   }
>   
> @@ -468,7 +474,7 @@ static int cyan_skillfish_od_edit_dpm_table(struct 
> smu_context *smu,
>   return -EINVAL;
>   }
>   
> - cyan_skillfish_user_settings.sclk = CYAN_SKILLFISH_SCLK_DEFAULT;
> + cyan_skillfish_user_settings.sclk = cyan_skillfish_sclk_default;
>   cyan_skillfish_user_settings.vddc = CYAN_SKILLFISH_VDDC_MAGIC;
>   
>   break;
> 
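
Editorial sketch: Lijo's suggestion above (guard the query with "if (!cyan_skillfish_sclk_default)" so the value is read only once) is a plain read-once cache. The user-space C below shows the idea under stated assumptions: fake_query_gfxclk() is a hypothetical stand-in for the SMU metrics query, the returned 1800 simply reuses the old hard-coded default from the patch as a sample value, and query_count exists only to make the behaviour observable.

```c
#include <stdint.h>

static uint32_t sclk_default;       /* 0 means "not queried yet" */
static unsigned int query_count;    /* how often the query really ran */

/* Hypothetical stand-in for the firmware metrics query; 0 on success. */
static int fake_query_gfxclk(uint32_t *mhz)
{
    query_count++;
    *mhz = 1800;
    return 0;
}

/*
 * Return the cached default sclk, querying it only while the cache is
 * still empty. If the query fails, the cache stays at 0 so that a later
 * call tries again.
 */
static uint32_t get_default_sclk(void)
{
    uint32_t mhz;

    if (!sclk_default && fake_query_gfxclk(&mhz) == 0)
        sclk_default = mhz;

    return sclk_default;
}
```

Calling get_default_sclk() repeatedly, for example on every resume, runs the underlying query only the first time, which is the effect the review comment asks for.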


Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Paul Menzel

Dear Tom,


Am 11.10.21 um 15:58 schrieb Tom Lendacky:

> On 10/11/21 8:52 AM, Paul Menzel wrote:

>> Am 11.10.21 um 15:27 schrieb Tom Lendacky:
>>> On 10/11/21 8:11 AM, Borislav Petkov wrote:
>>>> On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:
>>>>> I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
>>>>> this looks fine. The help text could also be updated to mention problems
>>>>> with AMD Raven devices.
>>>>
>>>> This is not only about Raven GPUs but, as Alex explained, pretty much
>>>> about every device which doesn't support a 48 bit DMA mask. I'll expand
>>>> that aspect in the changelog.
>>>
>>> In general, non-GPU devices that don't support a 48-bit DMA mask work
>>> fine (assuming they have set their DMA mask appropriately). It really
>>> depends on whether SWIOTLB will be able to satisfy the memory
>>> requirements of the driver when the IOMMU is not enabled or in
>>> passthrough mode. Since GPU devices need/use a lot of memory, that
>>> becomes a problem.
>>
>> How can I check that?
>
> How can you check what? 32-bit DMA devices? GPUs? I need a bit more
> information...


How can I check, why MEM_ENCRYPT is not working on my device despite the 
IOMMU being enabled.



Kind regards,

Paul


RE: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

2021-10-11 Thread Zhang, Yifan
[Public]

Hi youling,

Would you please try this patch?

BRs,
Yifan

-----Original Message-----
From: youling 257  
Sent: Monday, October 11, 2021 2:18 PM
To: Zhang, Yifan 
Cc: Kuehling, Felix ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

drm/amdgpu: init iommu after amdkfd device init but CONFIG_AMD_IOMMU=y 
CONFIG_AMD_IOMMU_V2=y
[0.203386] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel 
[0.203387] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[7.622052] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[7.622128] kfd kfd: amdgpu: error getting iommu info. is the iommu enabled?
[7.622129] kfd kfd: amdgpu: Error initializing iommuv2
[7.622430] kfd kfd: amdgpu: device 1002:15d8 NOT added due to errors

2021-10-11 14:13 GMT+08:00, youling257 :
> my kernel config CONFIG_AMD_IOMMU=y CONFIG_AMD_IOMMU_V2=y.
> linux kernel 5.15rc2 "drm/amdgpu: move iommu_resume before ip init/resume"
> cause my amd 3400g suspend to disk resume failed, have to press power 
> button to force shutdown.
> linux kernel 5.15rc5 "drm/amdgpu: init iommu after amdkfd device init" 
> cause my amd 3400g blackscreen when boot enter my userspace.
> i need revert "drm/amdgpu: init iommu after amdkfd device init" and
> "drm/amdgpu: move iommu_resume before ip init/resume" for my 
> userspace, running androidx86 with mesa21.3 on amdgpu.
>


0001-drm-amdkfd-fix-boot-resume-error-when-iommuv2-disabl.patch


Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Tom Lendacky

On 10/11/21 8:52 AM, Paul Menzel wrote:

Dear Tom,


Am 11.10.21 um 15:27 schrieb Tom Lendacky:

On 10/11/21 8:11 AM, Borislav Petkov wrote:

On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:

I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
this looks fine. The help text could also be updated to mention problems
with AMD Raven devices.


This is not only about Raven GPUs but, as Alex explained, pretty much
about every device which doesn't support a 48 bit DMA mask. I'll expand
that aspect in the changelog.


In general, non-GPU devices that don't support a 48-bit DMA mask work 
fine (assuming they have set their DMA mask appropriately). It really 
depends on whether SWIOTLB will be able to satisfy the memory 
requirements of the driver when the IOMMU is not enabled or in 
passthrough mode. Since GPU devices need/use a lot of memory, that 
becomes a problem.


How can I check that?


How can you check what? 32-bit DMA devices? GPUs? I need a bit more 
information...


Thanks,
Tom




Kind regards,

Paul


Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Paul Menzel

Dear Tom,


Am 11.10.21 um 15:27 schrieb Tom Lendacky:

On 10/11/21 8:11 AM, Borislav Petkov wrote:

On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:
I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
this looks fine. The help text could also be updated to mention problems
with AMD Raven devices.


This is not only about Raven GPUs but, as Alex explained, pretty much
about every device which doesn't support a 48 bit DMA mask. I'll expand
that aspect in the changelog.


In general, non-GPU devices that don't support a 48-bit DMA mask work 
fine (assuming they have set their DMA mask appropriately). It really 
depends on whether SWIOTLB will be able to satisfy the memory 
requirements of the driver when the IOMMU is not enabled or in 
passthrough mode. Since GPU devices need/use a lot of memory, that 
becomes a problem.


How can I check that?


Kind regards,

Paul


Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Tom Lendacky

On 10/11/21 8:11 AM, Borislav Petkov wrote:

On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:

I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
this looks fine. The help text could also be updated to mention problems
with AMD Raven devices.


This is not only about Raven GPUs but, as Alex explained, pretty much
about every device which doesn't support a 48 bit DMA mask. I'll expand
that aspect in the changelog.


In general, non-GPU devices that don't support a 48-bit DMA mask work fine 
(assuming they have set their DMA mask appropriately). It really depends 
on whether SWIOTLB will be able to satisfy the memory requirements of the 
driver when the IOMMU is not enabled or in passthrough mode. Since GPU 
devices need/use a lot of memory, that becomes a problem.
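
A minimal sketch of the rule described above, written as a standalone user-space model rather than kernel code. All names here (dma_dev_model, needs_swiotlb_bounce, and so on) are hypothetical, and the C-bit position of 47 used in the usage note is an assumption chosen to match the "48 bit DMA mask" figure mentioned in this thread; the real position is reported by the CPU and can differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a DMA-capable device; not a kernel structure. */
struct dma_dev_model {
	unsigned int dma_mask_bits;	/* e.g. 32, 44 or 48 addressable bits */
};

/* Build a mask with the low `bits` bits set. */
static uint64_t dma_mask_from_bits(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * With SME active, the address of an encrypted page carries the C-bit.
 * The device can reach such an address directly only if its DMA mask
 * covers that bit position.
 */
static bool can_dma_encrypted_directly(const struct dma_dev_model *dev,
				       unsigned int cbit_pos)
{
	return (dma_mask_from_bits(dev->dma_mask_bits) >> cbit_pos) & 1;
}

/*
 * Simplified decision: when the IOMMU translates for the device, it can
 * hand out addresses that fit the DMA mask. Otherwise (IOMMU disabled or
 * in passthrough mode) a device that cannot reach the C-bit falls back
 * to SWIOTLB bounce buffers.
 */
static bool needs_swiotlb_bounce(const struct dma_dev_model *dev,
				 unsigned int cbit_pos,
				 bool iommu_translating)
{
	if (iommu_translating)
		return false;
	return !can_dma_encrypted_directly(dev, cbit_pos);
}
```

Under these assumptions, a 32-bit device with the C-bit at position 47 needs bouncing unless the IOMMU translates for it, while a 48-bit device does not. Whether bouncing then works in practice depends on how much memory the driver asks for, which is the point made above about GPUs.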


Thanks,
Tom





Re: [PATCH] MAINTAINERS: Add Siqueira for AMD DC

2021-10-11 Thread Alex Deucher
Acked-by: Alex Deucher 

On Fri, Oct 8, 2021 at 5:21 PM Harry Wentland  wrote:
>
> He's been helping maintain it for quite a while now. Make
> it official.
>
> Signed-off-by: Harry Wentland 
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 24d520c4b157..b107ddb306de 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -876,6 +876,7 @@ F:  include/uapi/linux/psp-sev.h
>  AMD DISPLAY CORE
>  M: Harry Wentland 
>  M: Leo Li 
> +M: Rodrigo Siqueira 
>  L: amd-gfx@lists.freedesktop.org
>  S: Supported
>  T: git https://gitlab.freedesktop.org/agd5f/linux.git
> --
> 2.33.0
>


Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Borislav Petkov
On Mon, Oct 11, 2021 at 03:05:33PM +0200, Paul Menzel wrote:
> I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, yes
> this looks fine. The help text could also be updated to mention problems
> with AMD Raven devices.

This is not only about Raven GPUs but, as Alex explained, pretty much
about every device which doesn't support a 48 bit DMA mask. I'll expand
that aspect in the changelog.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

2021-10-11 Thread Paul Menzel

Dear Borislav,


Am 06.10.21 um 19:48 schrieb Borislav Petkov:

Ok,

so I sat down and wrote something and tried to capture all the stuff we
talked about so that it is clear in the future why we did it.

Thoughts?

---
From: Borislav Petkov 
Date: Wed, 6 Oct 2021 19:34:55 +0200
Subject: [PATCH] x86/Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT 
automatically

This Kconfig option was added initially so that memory encryption is
enabled by default on machines which support it.

However, Raven-class GPUs, a.o., cannot handle DMA masks which are
shorter than the bit position of the encryption, aka C-bit. For that,
those devices need to have the IOMMU present.

If the IOMMU is disabled or in passthrough mode, though, the kernel
would switch to SWIOTLB bounce-buffering for those transfers.

In order to avoid that,

2cc13bb4f59f ("iommu: Disable passthrough mode when SME is active")

disables the default IOMMU passthrough mode so that devices for which
the default 256K DMA is insufficient, can use the IOMMU instead.

However 2, there are cases where the IOMMU is disabled in the BIOS, etc,
think the usual hardware folk "oops, I dropped the ball there" cases.

Which means, it can happen that there are systems out there with devices
which need the IOMMU to function properly with SME enabled but the IOMMU
won't necessarily be enabled.

So in order for those devices to function, drop the "default y" for
the SME by default on option so that users who want to have SME, will
need to either enable it in their config or use "mem_encrypt=on" on the
kernel command line.

Fixes: 7744ccdbc16f ("x86/mm: Add Secure Memory Encryption (SME) support")
Reported-by: Paul Menzel 
Signed-off-by: Borislav Petkov 
Cc: 
Link: 
https://lkml.kernel.org/r/8bbacd0e-4580-3194-19d2-a0ecad7df...@molgen.mpg.de
---
  arch/x86/Kconfig | 1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8055da49f1c0..6a336b1f3f28 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1525,7 +1525,6 @@ config AMD_MEM_ENCRYPT
  
  config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT

bool "Activate AMD Secure Memory Encryption (SME) by default"
-   default y
depends on AMD_MEM_ENCRYPT
help
  Say yes to have system memory encrypted by default if running on



I think, the IOMMU is enabled on the MSI B350M MORTAR, but otherwise, 
yes this looks fine. The help text could also be updated to mention 
problems with AMD Raven devices.



Kind regards,

Paul


Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

2021-10-11 Thread youling 257
I tested this patch; it fixes my boot and suspend problem.

2021-10-11 18:03 GMT+08:00, Zhang, Yifan :
> [Public]
>
> Hi youling,
>
> Would you pls try this patch ?
>
> BRs,
> Yifan
>
> -Original Message-
> From: youling 257 
> Sent: Monday, October 11, 2021 2:18 PM
> To: Zhang, Yifan 
> Cc: Kuehling, Felix ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init
>
> drm/amdgpu: init iommu after amdkfd device init but CONFIG_AMD_IOMMU=y
> CONFIG_AMD_IOMMU_V2=y
> [0.203386] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel 
> [0.203387] AMD-Vi: AMD IOMMUv2 functionality not available on this
> system
> [7.622052] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
> [7.622128] kfd kfd: amdgpu: error getting iommu info. is the iommu
> enabled?
> [7.622129] kfd kfd: amdgpu: Error initializing iommuv2
> [7.622430] kfd kfd: amdgpu: device 1002:15d8 NOT added due to errors
>
> 2021-10-11 14:13 GMT+08:00, youling257 :
>> my kernel config CONFIG_AMD_IOMMU=y CONFIG_AMD_IOMMU_V2=y.
>> linux kernel 5.15rc2 "drm/amdgpu: move iommu_resume before ip
>> init/resume"
>> cause my amd 3400g suspend to disk resume failed, have to press power
>> button to force shutdown.
>> linux kernel 5.15rc5 "drm/amdgpu: init iommu after amdkfd device init"
>> cause my amd 3400g blackscreen when boot enter my userspace.
>> i need revert "drm/amdgpu: init iommu after amdkfd device init" and
>> "drm/amdgpu: move iommu_resume before ip init/resume" for my
>> userspace, running androidx86 with mesa21.3 on amdgpu.
>>
>


[PATCH 2/2] drm/amdkfd: fix resume error when iommu disabled in Picasso

2021-10-11 Thread Yifan Zhang
When the IOMMU is disabled in the SBIOS and KFD is on the IOMMUv2 path,
an IOMMU resume failure blocks system resume. Don't allow KFD to
use IOMMUv2 when the IOMMU is disabled.

Reported-by: youling 
Tested-by: youling 
Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index bb652ee35c25..1fadc9fb168d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -916,6 +916,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
kfd_double_confirm_iommu_support(kfd);
 
if (kfd_iommu_device_init(kfd)) {
+   kfd->use_iommu_v2 = false;
dev_err(kfd_device, "Error initializing iommuv2\n");
goto device_iommu_error;
}
-- 
2.25.1



[PATCH 1/2] drm/amdkfd: fix boot failure when iommu is disabled in Picasso.

2021-10-11 Thread Yifan Zhang
When the IOMMU is disabled in the SBIOS and KFD is on the IOMMUv2 path,
IOMMUv2 init will fail. But this failure should not block amdgpu driver init.

Reported-by: youling 
Tested-by: youling 
Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 
 drivers/gpu/drm/amd/amdkfd/kfd_device.c| 3 +++
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index af9bdf16eefd..9dfcef2015c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2432,10 +2432,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
if (!adev->gmc.xgmi.pending_reset)
amdgpu_amdkfd_device_init(adev);
 
-   r = amdgpu_amdkfd_resume_iommu(adev);
-   if (r)
-   goto init_failed;
-
amdgpu_fru_get_product_info(adev);
 
 init_failed:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 4a416231b24c..bb652ee35c25 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -920,6 +920,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
goto device_iommu_error;
}
 
+   if (kgd2kfd_resume_iommu(kfd))
+   goto device_iommu_error;
+
kfd_cwsr_init(kfd);
 
svm_migrate_init((struct amdgpu_device *)kfd->kgd);
-- 
2.25.1
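
Taken together, the two patches above change the control flow roughly as sketched below. This is a simplified user-space model under stated assumptions, not the driver code: every struct and function name is hypothetical, the IOMMU is reduced to a single "available" flag, and the assumption that a resume is skipped when use_iommu_v2 is false is mine, inferred from the intent stated in the commit messages.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the KFD device state. */
struct kfd_model {
	bool iommu_available;	/* false when the IOMMU is disabled in firmware */
	bool use_iommu_v2;	/* device would like to use the IOMMUv2 path */
	bool init_complete;
};

/* IOMMUv2 setup fails when the IOMMU is not there. */
static int model_iommu_device_init(const struct kfd_model *kfd)
{
	if (!kfd->use_iommu_v2)
		return 0;
	return kfd->iommu_available ? 0 : -1;
}

/* Assumed behaviour: nothing to resume when IOMMUv2 is not in use. */
static int model_resume_iommu(const struct kfd_model *kfd)
{
	if (!kfd->use_iommu_v2)
		return 0;
	return kfd->iommu_available ? 0 : -1;
}

static bool model_kfd_device_init(struct kfd_model *kfd)
{
	if (model_iommu_device_init(kfd)) {
		/* Patch 2/2: stop using IOMMUv2 so later resumes skip it. */
		kfd->use_iommu_v2 = false;
		return false;
	}

	/* Patch 1/2: the IOMMU resume now happens inside KFD init. */
	if (model_resume_iommu(kfd))
		return false;

	kfd->init_complete = true;
	return true;
}

/* Patch 1/2: a KFD init failure no longer fails the whole driver init. */
static int model_amdgpu_init(struct kfd_model *kfd)
{
	(void)model_kfd_device_init(kfd);
	return 0;
}

/* System resume only fails if the IOMMU resume itself fails. */
static int model_amdgpu_resume(const struct kfd_model *kfd)
{
	return model_resume_iommu(kfd);
}
```

With iommu_available set to false, the model's driver init still succeeds, KFD stays uninitialized, and a later resume returns 0 because use_iommu_v2 was cleared. That matches the reported symptoms going away, but the real driver has more state than this sketch captures.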



RE: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

2021-10-11 Thread Zhang, Yifan
[AMD Official Use Only]

Great. Thanks for testing.

-Original Message-
From: youling 257  
Sent: Monday, October 11, 2021 6:20 PM
To: Zhang, Yifan 
Cc: Kuehling, Felix ; amd-gfx@lists.freedesktop.org; 
Zhu, James 
Subject: Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

test this patch can fix my boot and suspend problem.

2021-10-11 18:03 GMT+08:00, Zhang, Yifan :
> [Public]
>
> Hi youling,
>
> Would you pls try this patch ?
>
> BRs,
> Yifan
>
> -Original Message-
> From: youling 257 
> Sent: Monday, October 11, 2021 2:18 PM
> To: Zhang, Yifan 
> Cc: Kuehling, Felix ; 
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device 
> init
>
> drm/amdgpu: init iommu after amdkfd device init but CONFIG_AMD_IOMMU=y 
> CONFIG_AMD_IOMMU_V2=y
> [0.203386] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel 
> [0.203387] AMD-Vi: AMD IOMMUv2 functionality not available on this
> system
> [7.622052] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
> [7.622128] kfd kfd: amdgpu: error getting iommu info. is the iommu
> enabled?
> [7.622129] kfd kfd: amdgpu: Error initializing iommuv2
> [7.622430] kfd kfd: amdgpu: device 1002:15d8 NOT added due to errors
>
> 2021-10-11 14:13 GMT+08:00, youling257 :
>> my kernel config CONFIG_AMD_IOMMU=y CONFIG_AMD_IOMMU_V2=y.
>> linux kernel 5.15rc2 "drm/amdgpu: move iommu_resume before ip 
>> init/resume"
>> cause my amd 3400g suspend to disk resume failed, have to press power 
>> button to force shutdown.
>> linux kernel 5.15rc5 "drm/amdgpu: init iommu after amdkfd device init"
>> cause my amd 3400g blackscreen when boot enter my userspace.
>> i need revert "drm/amdgpu: init iommu after amdkfd device init" and
>> "drm/amdgpu: move iommu_resume before ip init/resume" for my 
>> userspace, running androidx86 with mesa21.3 on amdgpu.
>>
>


RE: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

2021-10-11 Thread Quan, Evan
[AMD Official Use Only]

OK... Then forget about previous patches. Let's try to narrow down the issue 
first.
Please try the attached patch1 first. If it works, please undo the changes of 
patch1 and try patch2 to narrow down further.

BR
Evan
> -Original Message-
> From: Borislav Petkov 
> Sent: Saturday, October 9, 2021 6:07 PM
> To: Quan, Evan 
> Cc: Alex Deucher ; amd-gfx list ; LKML ; Deucher,
> Alexander ; Pan, Xinhui
> ; Chen, Guchun 
> Subject: Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12
> UVD/VCE on suspend")
> 
> On Sat, Oct 09, 2021 at 09:54:13AM +, Quan, Evan wrote:
> > Oops, I just found some necessary changes are missing from the patch of
> the link below.
> >
> > https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html
> >
> > Could you try the patch from the link above + the attached patch?
> 
> Nope, still no joy. ;-\
> 
> --
> Regards/Gruss,
> Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


0001-drm-amdgpu-no-UVD-VCE-dpm-disablment-on-suspend-for-.patch


0002-drm-amd-pm-no-UVD-VCE-power-off-during-early-phase-o.patch


RE: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish

2021-10-11 Thread Yu, Lang
[AMD Official Use Only]



>-Original Message-
>From: Lazar, Lijo 
>Sent: Monday, October 11, 2021 4:54 PM
>To: Yu, Lang ; amd-gfx@lists.freedesktop.org
>Cc: Deucher, Alexander ; Huang, Ray
>
>Subject: Re: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish
>
>
>
>On 10/11/2021 2:01 PM, Lang Yu wrote:
>> Query default sclk instead of hard code.
>>
>> Signed-off-by: Lang Yu 
>> ---
>>   .../gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c  | 12 +---
>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> index 3d4c65bc29dc..d98fd06a2574 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
>> @@ -47,7 +47,6 @@
>>   /* unit: MHz */
>>   #define CYAN_SKILLFISH_SCLK_MIN1000
>>   #define CYAN_SKILLFISH_SCLK_MAX2000
>> -#define CYAN_SKILLFISH_SCLK_DEFAULT 1800
>>
>>   /* unit: mV */
>>   #define CYAN_SKILLFISH_VDDC_MIN700
>> @@ -59,6 +58,8 @@ static struct gfx_user_settings {
>>  uint32_t vddc;
>>   } cyan_skillfish_user_settings;
>>
>> +static uint32_t cyan_skillfish_sclk_default;
>> +
>>   #define FEATURE_MASK(feature) (1ULL << feature)
>>   #define SMC_DPM_FEATURE ( \
>>  FEATURE_MASK(FEATURE_FCLK_DPM_BIT)  |   \
>> @@ -365,13 +366,18 @@ static bool cyan_skillfish_is_dpm_running(struct
>smu_context *smu)
>>  return false;
>>
>>  ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
>> -
>>  if (ret)
>>  return false;
>>
>>  feature_enabled = (uint64_t)feature_mask[0] |
>>  ((uint64_t)feature_mask[1] << 32);
>>
>> +/*
>> + * cyan_skillfish specific, query default sclk instead of hard code.
>> + */
>> +cyan_skillfish_get_smu_metrics_data(smu, METRICS_CURR_GFXCLK,
>> +&cyan_skillfish_sclk_default);
>> +
>
>Maybe add if (!cyan_skillfish_sclk_default) so that it's read only once during 
>driver
>load and not on every suspend/resume.

Good idea! 

Thanks,
Lang

>Reviewed-by: Lijo Lazar 
>
>Thanks,
>Lijo
>
>>  return !!(feature_enabled & SMC_DPM_FEATURE);
>>   }
>>
>> @@ -468,7 +474,7 @@ static int cyan_skillfish_od_edit_dpm_table(struct
>smu_context *smu,
>>  return -EINVAL;
>>  }
>>
>> -cyan_skillfish_user_settings.sclk =
>CYAN_SKILLFISH_SCLK_DEFAULT;
>> +cyan_skillfish_user_settings.sclk = cyan_skillfish_sclk_default;
>>  cyan_skillfish_user_settings.vddc =
>CYAN_SKILLFISH_VDDC_MAGIC;
>>
>>  break;
>>


[PATCH] drm/amdkfd: Separate pinned BOs destruction from general routine

2021-10-11 Thread Lang Yu
Currently, all kfd BOs use the same destruction routine. But pinned
BOs are not unpinned properly. Separate them from the general routine.

Signed-off-by: Lang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   2 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  10 ++
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |   3 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   3 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 125 ++
 5 files changed, 114 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 69de31754907..751557af09bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -279,6 +279,8 @@ int amdgpu_amdkfd_gpuvm_sync_memory(
struct kgd_dev *kgd, struct kgd_mem *mem, bool intr);
 int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct kgd_dev *kgd,
struct kgd_mem *mem, void **kptr, uint64_t *size);
+void amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(struct kgd_dev *kgd, struct 
kgd_mem *mem);
+
 int amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
struct dma_fence **ef);
 int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct kgd_dev *kgd,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 054c1a224def..6acc78b02bdc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1871,6 +1871,16 @@ int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct 
kgd_dev *kgd,
return ret;
 }
 
+void amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(struct kgd_dev *kgd, struct 
kgd_mem *mem)
+{
+   struct amdgpu_bo *bo = mem->bo;
+
+   amdgpu_bo_reserve(bo, true);
+   amdgpu_bo_kunmap(bo);
+   amdgpu_bo_unpin(bo);
+   amdgpu_bo_unreserve(bo);
+}
+
 int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct kgd_dev *kgd,
  struct kfd_vm_fault_info *mem)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index f1e7edeb4e6b..0db48ac10fde 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1051,6 +1051,9 @@ static int kfd_ioctl_create_event(struct file *filp, 
struct kfd_process *p,
pr_err("Failed to set event page\n");
return err;
}
+
+   p->signal_handle = args->event_page_offset;
+
}
 
err = kfd_event_create(filp, p, args->event_type,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 6d8f9bb2d905..30f08f1606bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -608,12 +608,14 @@ struct qcm_process_device {
uint32_t sh_hidden_private_base;
 
/* CWSR memory */
+   struct kgd_mem *cwsr_mem;
void *cwsr_kaddr;
uint64_t cwsr_base;
uint64_t tba_addr;
uint64_t tma_addr;
 
/* IB memory */
+   struct kgd_mem *ib_mem;
uint64_t ib_base;
void *ib_kaddr;
 
@@ -808,6 +810,7 @@ struct kfd_process {
/* Event ID allocator and lookup */
struct idr event_idr;
/* Event page */
+   u64 signal_handle;
struct kfd_signal_page *signal_page;
size_t signal_mapped_size;
size_t signal_event_count;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 21ec8a18cad2..c024f2e2efaa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -72,6 +72,8 @@ static int kfd_process_init_cwsr_apu(struct kfd_process *p, 
struct file *filep);
 static void evict_process_worker(struct work_struct *work);
 static void restore_process_worker(struct work_struct *work);
 
+static void kfd_process_device_destroy_cwsr_dgpu(struct kfd_process_device 
*pdd);
+
 struct kfd_procfs_tree {
struct kobject *kobj;
 };
@@ -685,10 +687,15 @@ void kfd_process_destroy_wq(void)
 }
 
 static void kfd_process_free_gpuvm(struct kgd_mem *mem,
-   struct kfd_process_device *pdd)
+   struct kfd_process_device *pdd, void *kptr)
 {
struct kfd_dev *dev = pdd->dev;
 
+   if (kptr) {
+   amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(dev->kgd, mem);
+   kptr = NULL;
+   }
+
amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu(dev->kgd, mem, pdd->drm_priv);
amdgpu_amdkfd_gpuvm_free_memory_of_gpu(dev->kgd, mem, pdd->drm_priv,
   NULL);
@@ -702,63 +709,46 @@ static void kfd_process_free_gpuvm(struct kgd_mem *mem,
  */
 static int kfd_process_alloc_gpuvm(struct kfd_process_device *pdd,
   uint64_t gpu_va, uin

Re: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish

2021-10-11 Thread Lazar, Lijo




On 10/11/2021 2:01 PM, Lang Yu wrote:

Query default sclk instead of hard code.

Signed-off-by: Lang Yu 
---
  .../gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c  | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
index 3d4c65bc29dc..d98fd06a2574 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
@@ -47,7 +47,6 @@
  /* unit: MHz */
  #define CYAN_SKILLFISH_SCLK_MIN   1000
  #define CYAN_SKILLFISH_SCLK_MAX   2000
-#define CYAN_SKILLFISH_SCLK_DEFAULT1800
  
  /* unit: mV */
  #define CYAN_SKILLFISH_VDDC_MIN   700
@@ -59,6 +58,8 @@ static struct gfx_user_settings {
uint32_t vddc;
  } cyan_skillfish_user_settings;
  
+static uint32_t cyan_skillfish_sclk_default;
+
  #define FEATURE_MASK(feature) (1ULL << feature)
  #define SMC_DPM_FEATURE ( \
FEATURE_MASK(FEATURE_FCLK_DPM_BIT)  |   \
@@ -365,13 +366,18 @@ static bool cyan_skillfish_is_dpm_running(struct 
smu_context *smu)
return false;
  
  	ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
-
if (ret)
return false;
  
  	feature_enabled = (uint64_t)feature_mask[0] |
((uint64_t)feature_mask[1] << 32);
  
+	/*
+* cyan_skillfish specific, query default sclk instead of hard code.
+*/
+   cyan_skillfish_get_smu_metrics_data(smu, METRICS_CURR_GFXCLK,
+   &cyan_skillfish_sclk_default);
+


Maybe add if (!cyan_skillfish_sclk_default) so that it's read only once 
during driver load and not on every suspend/resume.


Reviewed-by: Lijo Lazar 

Thanks,
Lijo


return !!(feature_enabled & SMC_DPM_FEATURE);
  }
  
@@ -468,7 +474,7 @@ static int cyan_skillfish_od_edit_dpm_table(struct smu_context *smu,
return -EINVAL;
}
  
-		cyan_skillfish_user_settings.sclk = CYAN_SKILLFISH_SCLK_DEFAULT;
+   cyan_skillfish_user_settings.sclk = cyan_skillfish_sclk_default;
cyan_skillfish_user_settings.vddc = CYAN_SKILLFISH_VDDC_MAGIC;
  
  		break;
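
The review suggestion above (guard the query with if (!cyan_skillfish_sclk_default) so the value is read only once) could look roughly like this. It is a standalone sketch, not the driver change: fake_query_gfxclk is a hypothetical stand-in for the SMU metrics query, the call counter exists only to make the behaviour observable, and 1800 is a placeholder borrowed from the removed macro rather than a value any firmware is known to report.

```c
#include <stdint.h>

/* Cached default sclk in MHz; 0 means "not queried yet". */
static uint32_t sclk_default_mhz;

/* Counts how often the (fake) firmware query runs. */
static unsigned int smu_query_count;

/* Hypothetical stand-in for the SMU metrics query. */
static int fake_query_gfxclk(uint32_t *value)
{
	smu_query_count++;
	*value = 1800;	/* placeholder value, see the note above */
	return 0;
}

/*
 * Query the default sclk only while the cache is still empty, so later
 * calls (for example on every suspend/resume cycle) reuse the first
 * reading instead of asking the firmware again.
 */
static void cache_default_sclk_once(void)
{
	if (!sclk_default_mhz)
		fake_query_gfxclk(&sclk_default_mhz);
}
```

One design point worth noting: using 0 as the "not queried" marker means a failed or zero reading is retried on the next call, which seems acceptable here since 0 MHz is not a meaningful default clock.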




RE: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish

2021-10-11 Thread Huang, Ray
[AMD Official Use Only]

Acked-by: Huang Rui 

-Original Message-
From: amd-gfx  On Behalf Of Lang Yu
Sent: Monday, October 11, 2021 4:32 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Huang, Ray 
; Yu, Lang 
Subject: [PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish

Query default sclk instead of hard code.

Signed-off-by: Lang Yu 
---
 .../gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c  | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
index 3d4c65bc29dc..d98fd06a2574 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
@@ -47,7 +47,6 @@
 /* unit: MHz */
 #define CYAN_SKILLFISH_SCLK_MIN1000
 #define CYAN_SKILLFISH_SCLK_MAX2000
-#define CYAN_SKILLFISH_SCLK_DEFAULT1800
 
 /* unit: mV */
 #define CYAN_SKILLFISH_VDDC_MIN700
@@ -59,6 +58,8 @@ static struct gfx_user_settings {
uint32_t vddc;
 } cyan_skillfish_user_settings;
 
+static uint32_t cyan_skillfish_sclk_default;
+
 #define FEATURE_MASK(feature) (1ULL << feature)
 #define SMC_DPM_FEATURE ( \
FEATURE_MASK(FEATURE_FCLK_DPM_BIT)  |   \
@@ -365,13 +366,18 @@ static bool cyan_skillfish_is_dpm_running(struct 
smu_context *smu)
return false;
 
ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
-
if (ret)
return false;
 
feature_enabled = (uint64_t)feature_mask[0] |
((uint64_t)feature_mask[1] << 32);
 
+   /*
+* cyan_skillfish specific, query default sclk inseted of hard code.
+*/
+   cyan_skillfish_get_smu_metrics_data(smu, METRICS_CURR_GFXCLK,
+   &cyan_skillfish_sclk_default);
+
return !!(feature_enabled & SMC_DPM_FEATURE);
 }
 
@@ -468,7 +474,7 @@ static int cyan_skillfish_od_edit_dpm_table(struct 
smu_context *smu,
return -EINVAL;
}
 
-   cyan_skillfish_user_settings.sclk = CYAN_SKILLFISH_SCLK_DEFAULT;
+   cyan_skillfish_user_settings.sclk = cyan_skillfish_sclk_default;
cyan_skillfish_user_settings.vddc = CYAN_SKILLFISH_VDDC_MAGIC;
 
break;
--
2.25.1


Re: [PATCH] MAINTAINERS: Add Siqueira for AMD DC

2021-10-11 Thread Christian König

Am 08.10.21 um 23:21 schrieb Harry Wentland:

He's been helping maintain it for quite a while now. Make
it official.

Signed-off-by: Harry Wentland 


Acked-by: Christian König 


---
  MAINTAINERS | 1 +
  1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 24d520c4b157..b107ddb306de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -876,6 +876,7 @@ F:  include/uapi/linux/psp-sev.h
  AMD DISPLAY CORE
  M:Harry Wentland 
  M:Leo Li 
+M: Rodrigo Siqueira 
  L:amd-gfx@lists.freedesktop.org
  S:Supported
  T:git https://gitlab.freedesktop.org/agd5f/linux.git




[PATCH] drm/amdgpu: query default sclk from smu for cyan_skillfish

2021-10-11 Thread Lang Yu
Query default sclk instead of hard code.

Signed-off-by: Lang Yu 
---
 .../gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c  | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
index 3d4c65bc29dc..d98fd06a2574 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
@@ -47,7 +47,6 @@
 /* unit: MHz */
 #define CYAN_SKILLFISH_SCLK_MIN1000
 #define CYAN_SKILLFISH_SCLK_MAX2000
-#define CYAN_SKILLFISH_SCLK_DEFAULT1800
 
 /* unit: mV */
 #define CYAN_SKILLFISH_VDDC_MIN700
@@ -59,6 +58,8 @@ static struct gfx_user_settings {
uint32_t vddc;
 } cyan_skillfish_user_settings;
 
+static uint32_t cyan_skillfish_sclk_default;
+
 #define FEATURE_MASK(feature) (1ULL << feature)
 #define SMC_DPM_FEATURE ( \
FEATURE_MASK(FEATURE_FCLK_DPM_BIT)  |   \
@@ -365,13 +366,18 @@ static bool cyan_skillfish_is_dpm_running(struct 
smu_context *smu)
return false;
 
ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
-
if (ret)
return false;
 
feature_enabled = (uint64_t)feature_mask[0] |
((uint64_t)feature_mask[1] << 32);
 
+   /*
+* cyan_skillfish specific, query default sclk instead of hard code.
+*/
+   cyan_skillfish_get_smu_metrics_data(smu, METRICS_CURR_GFXCLK,
+   &cyan_skillfish_sclk_default);
+
return !!(feature_enabled & SMC_DPM_FEATURE);
 }
 
@@ -468,7 +474,7 @@ static int cyan_skillfish_od_edit_dpm_table(struct 
smu_context *smu,
return -EINVAL;
}
 
-   cyan_skillfish_user_settings.sclk = CYAN_SKILLFISH_SCLK_DEFAULT;
+   cyan_skillfish_user_settings.sclk = cyan_skillfish_sclk_default;
cyan_skillfish_user_settings.vddc = CYAN_SKILLFISH_VDDC_MAGIC;
 
break;
-- 
2.25.1



[PATCH] drm/amdgpu: fix Polaris12 uvd crash on driver unload

2021-10-11 Thread Evan Quan
This is a supplement for the change below:
cdccf1ffe1a3 drm/amdgpu: Fix crash on device remove/driver unload

Signed-off-by: Evan Quan 
Change-Id: Iedc25e2f572f04772511d56781b01b481e22fd00
---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index d5d023a24269..2d558c2f417d 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -534,6 +534,19 @@ static int uvd_v6_0_hw_fini(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   cancel_delayed_work_sync(&adev->uvd.idle_work);
+
+   if (RREG32(mmUVD_STATUS) != 0)
+   uvd_v6_0_stop(adev);
+
+   return 0;
+}
+
+static int uvd_v6_0_suspend(void *handle)
+{
+   int r;
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
/*
 * Proper cleanups before halting the HW engine:
 *   - cancel the delayed idle work
@@ -558,17 +571,6 @@ static int uvd_v6_0_hw_fini(void *handle)
   AMD_CG_STATE_GATE);
}
 
-   if (RREG32(mmUVD_STATUS) != 0)
-   uvd_v6_0_stop(adev);
-
-   return 0;
-}
-
-static int uvd_v6_0_suspend(void *handle)
-{
-   int r;
-   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
r = uvd_v6_0_hw_fini(adev);
if (r)
return r;
-- 
2.29.0



Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

2021-10-11 Thread Borislav Petkov
On Sat, Oct 09, 2021 at 09:54:13AM +, Quan, Evan wrote:
> Oops, I just found some necessary changes are missing from the patch of the 
> link below.
> https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html
> 
> Could you try the patch from the link above + the attached patch?

Nope, still no joy. ;-\

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

2021-10-11 Thread youling 257
drm/amdgpu: init iommu after amdkfd device init
but CONFIG_AMD_IOMMU=y CONFIG_AMD_IOMMU_V2=y
[0.203386] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel 
[0.203387] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[7.622052] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[7.622128] kfd kfd: amdgpu: error getting iommu info. is the iommu enabled?
[7.622129] kfd kfd: amdgpu: Error initializing iommuv2
[7.622430] kfd kfd: amdgpu: device 1002:15d8 NOT added due to errors

2021-10-11 14:13 GMT+08:00, youling257 :
> my kernel config CONFIG_AMD_IOMMU=y CONFIG_AMD_IOMMU_V2=y.
> linux kernel 5.15rc2 "drm/amdgpu: move iommu_resume before ip init/resume"
> cause my amd 3400g suspend to disk resume failed, have to press power button
> to force shutdown.
> linux kernel 5.15rc5 "drm/amdgpu: init iommu after amdkfd device init" cause
> my amd 3400g blackscreen when boot enter my userspace.
> i need revert "drm/amdgpu: init iommu after amdkfd device init" and
> "drm/amdgpu: move iommu_resume before ip init/resume" for my userspace,
> running androidx86 with mesa21.3 on amdgpu.
>


Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

2021-10-11 Thread youling257
My kernel config has CONFIG_AMD_IOMMU=y and CONFIG_AMD_IOMMU_V2=y.
In Linux kernel 5.15-rc2, "drm/amdgpu: move iommu_resume before ip init/resume" 
causes resume from suspend-to-disk to fail on my AMD 3400G; I have to press the 
power button to force a shutdown.
In Linux kernel 5.15-rc5, "drm/amdgpu: init iommu after amdkfd device init" 
causes a black screen on my AMD 3400G when booting into my userspace.
I need to revert "drm/amdgpu: init iommu after amdkfd device init" and 
"drm/amdgpu: move iommu_resume before ip init/resume" for my userspace, which 
runs Android-x86 with Mesa 21.3 on amdgpu.


Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

2021-10-11 Thread Borislav Petkov
On Sat, Oct 09, 2021 at 01:20:39AM +, Quan, Evan wrote:
> Maybe the change below can address your issue.
> https://lists.freedesktop.org/archives/amd-gfx/2021-September/069006.html

Nope, that one doesn't change anything.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette