Re: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Christian König
Sure, go ahead. Just remove both warnings, the one for map as well as the 
one for unmap.


Regards,
Christian.

Am 21.01.22 um 08:31 schrieb Chen, Guchun:

[Public]

Hi Christian,

So far I have not found anything bad except the warning, so I assume it should 
be fine with the change below. Shall I make a patch for this?

-if (WARN_ON(!adev->gart.ptr))
+if (!adev->gart.ptr)
return;

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Friday, January 21, 2022 3:24 PM
To: Chen, Guchun ; Kim, Jonathan ; 
Christian König ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Yeah, you of course still need to check if the table is allocated or not.

Otherwise we will run immediately into a NULL pointer exception.

The question is: does it always work with this change?

Regards,
Christian.

Am 21.01.22 um 02:54 schrieb Chen, Guchun:

[Public]

How about changing the code from "if (WARN_ON(!adev->gart.ptr))" to "if 
(!adev->gart.ptr)"? The purpose is to drop the warning only; other functionality stays the same.

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Thursday, January 20, 2022 11:47 PM
To: Kim, Jonathan ; Chen, Guchun
; Christian König
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

I think we could remove this warning as well; all we need to do is make sure 
that the GART table is always restored from the metadata before it is enabled 
in hardware.

I've seen that we do this anyway for most hardware generations, but we really 
need to double check.

Christian.

Am 20.01.22 um 16:04 schrieb Kim, Jonathan:

[Public]

Switching to a VRAM bounce buffer can drop performance around 4x~6x on Vega20 
for larger accesses, so it's not desired.

Jon


-Original Message-
From: Koenig, Christian 
Sent: January 20, 2022 9:10 AM
To: Chen, Guchun ; Christian König
; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

I actually suggested allocating the bounce buffer in VRAM, but that
adds a bit more latency.

Christian.

Am 20.01.22 um 15:00 schrieb Chen, Guchun:

[Public]

Hi Christian,

Unfortunately, your patch brings another warning, from sdma_access_bo's
creation in amdgpu_ttm_init.

Your patch introduces a new check, WARN_ON(!adev->gart.ptr); however,
sdma_access_bo is meant to be created in the GTT domain, while
adev->gart.ptr is only ready after gmc_v10_0_gart_init.

Hi Jonathan,

Is it mandatory to create this sdma_access_bo in the GTT domain? Can we
change it to VRAM?

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Wednesday, January 19, 2022 10:38 PM
To: Chen, Guchun ; Christian König
; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Hi Guchun,

yes, just haven't found time to do this yet.

Regards,
Christian.

Am 19.01.22 um 15:24 schrieb Chen, Guchun:

[Public]

Hello Christian,

Do you plan to submit your code to drm-next branch?

Regards,
Guchun

-Original Message-
From: Chen, Guchun
Sent: Tuesday, January 18, 2022 10:22 PM
To: 'Christian König' ; Kim,
Jonathan ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag

[Public]

Thanks for the clarification. The patch is:
Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 10:10 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Am 18.01.22 um 14:28 schrieb Chen, Guchun:

[Public]

- if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
- goto skip_pin_bo;
-
- r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
- if (r)
- return r;
-
-skip_pin_bo:

Does deleting the skip_pin_bo path cause a redundant BO pin in the SRIOV case?

Pinning/unpinning the BO was already removed as well.

See Nirmoy's patches in the git log.

Regards,
Christian.


Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 8:02 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: remove gart.ready flag

That's just a leftover from the old radeon days and was preventing CS and
GART bindings before the hardware was initialized. But nowadays that
is perfectly valid.

The only thing we need to warn about is a GART binding before the
table is even allocated.

Signed-off-by: Christian König 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
   drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++-

--

   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
   drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
   drivers/gpu

Re: [PATCH V3 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Lazar, Lijo




On 1/21/2022 12:36 PM, Evan Quan wrote:

All those APIs are already protected by either adev->pm.mutex
or smu->message_lock.

Signed-off-by: Evan Quan 

Reviewed-by: Lijo Lazar 

Thanks,
Lijo


Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
--
v1->v2:
   - optimize the "!smu_table->hardcode_pptable" check(Guchun)
   - add the lock protection(adev->pm.mutex) for i2c transfer(Lijo)
---
  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 316 ++
  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   4 +-
  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   4 +-
  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   4 +-
  .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   4 +-
  6 files changed, 34 insertions(+), 299 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 828cb932f6a9..eaaa5b033d46 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct smu_context *smu,
   uint32_t mask);
  static int smu_handle_task(struct smu_context *smu,
   enum amd_dpm_forced_level level,
-  enum amd_pp_task task_id,
-  bool lock_needed);
+  enum amd_pp_task task_id);
  static int smu_reset(struct smu_context *smu);
  static int smu_set_fan_speed_pwm(void *handle, u32 speed);
  static int smu_set_fan_control_mode(void *handle, u32 value);
@@ -68,36 +67,22 @@ static int smu_sys_get_pp_feature_mask(void *handle,
   char *buf)
  {
struct smu_context *smu = handle;
-   int size = 0;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
  
-	mutex_lock(&smu->mutex);

-
-   size = smu_get_pp_feature_mask(smu, buf);
-
-   mutex_unlock(&smu->mutex);
-
-   return size;
+   return smu_get_pp_feature_mask(smu, buf);
  }
  
  static int smu_sys_set_pp_feature_mask(void *handle,

   uint64_t new_mask)
  {
struct smu_context *smu = handle;
-   int ret = 0;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
  
-	mutex_lock(&smu->mutex);

-
-   ret = smu_set_pp_feature_mask(smu, new_mask);
-
-   mutex_unlock(&smu->mutex);
-
-   return ret;
+   return smu_set_pp_feature_mask(smu, new_mask);
  }
  
  int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value)

@@ -117,16 +102,12 @@ int smu_set_soft_freq_range(struct smu_context *smu,
  {
int ret = 0;
  
-	mutex_lock(&smu->mutex);

-
if (smu->ppt_funcs->set_soft_freq_limited_range)
ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
  clk_type,
  min,
  max);
  
-	mutex_unlock(&smu->mutex);

-
return ret;
  }
  
@@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct smu_context *smu,

if (!min && !max)
return -EINVAL;
  
-	mutex_lock(&smu->mutex);

-
if (smu->ppt_funcs->get_dpm_ultimate_freq)
ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
clk_type,
min,
max);
  
-	mutex_unlock(&smu->mutex);

-
return ret;
  }
  
@@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,

  {
struct smu_context *smu = handle;
struct smu_table_context *smu_table = &smu->smu_table;
-   uint32_t powerplay_table_size;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
@@ -490,18 +466,12 @@ static int smu_sys_get_pp_table(void *handle,
if (!smu_table->power_play_table && !smu_table->hardcode_pptable)
return -EINVAL;
  
-	mutex_lock(&smu->mutex);

-
if (smu_table->hardcode_pptable)
*table = smu_table->hardcode_pptable;
else
*table = smu_table->power_play_table;
  
-	powerplay_table_size = smu_table->power_play_table_size;

-
-   mutex_unlock(&smu->mutex);
-
-   return powerplay_table_size;
+   return smu_table->power_play_table_size;
  }
  
  static int smu_sys_set_pp_table(void *handle,

@@ -521,12 +491,10 @@ static int smu_sys_set_pp_table(void *handle,
return -EIO;
}
  
-	mutex_lock(&smu->mutex);

-   if (!smu_table->hardcode_pptable)
-   smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
if (!smu_table->hardcode_pptable) {
-   ret = -ENOMEM;
- 

[PATCH 7/7] drm/amd/pm: revise the implementation of smu_cmn_disable_all_features_with_exception

2022-01-20 Thread Evan Quan
As there is no internal cache for enabled ppfeatures now, the 2nd
parameter is not needed any more.

Signed-off-by: Evan Quan 
Change-Id: I0c1811f216c55d6ddfabdc9e099dc214c21bdf2e
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 9 ++---
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 -
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 7 ---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 1 -
 drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   | 2 +-
 5 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index aceb6e56bc6a..62c757c79f25 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -1348,9 +1348,7 @@ static int smu_disable_dpms(struct smu_context *smu)
case IP_VERSION(11, 5, 0):
case IP_VERSION(11, 0, 12):
case IP_VERSION(11, 0, 13):
-   return smu_disable_all_features_with_exception(smu,
-  true,
-  
SMU_FEATURE_COUNT);
+   return 0;
default:
break;
}
@@ -1366,9 +1364,7 @@ static int smu_disable_dpms(struct smu_context *smu)
case IP_VERSION(11, 0, 0):
case IP_VERSION(11, 0, 5):
case IP_VERSION(11, 0, 9):
-   return smu_disable_all_features_with_exception(smu,
-  true,
-  
SMU_FEATURE_BACO_BIT);
+   return 0;
default:
break;
}
@@ -1380,7 +1376,6 @@ static int smu_disable_dpms(struct smu_context *smu)
 */
if (use_baco && smu_feature_is_enabled(smu, SMU_FEATURE_BACO_BIT)) {
ret = smu_disable_all_features_with_exception(smu,
- false,
  
SMU_FEATURE_BACO_BIT);
if (ret)
dev_err(adev->dev, "Failed to disable smu features 
except BACO.\n");
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 4ba579cdd203..76849c0d8df6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -989,7 +989,6 @@ struct pptable_funcs {
 *   exception to those in &mask.
 */
int (*disable_all_features_with_exception)(struct smu_context *smu,
-  bool no_hw_disablement,
   enum smu_feature_mask mask);
 
/**
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index e54c59f3e8c2..90fd722cbaef 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -748,9 +748,6 @@ int smu_cmn_set_pp_feature_mask(struct smu_context *smu,
  *   @mask
  *
  * @smu:   smu_context pointer
- * @no_hw_disablement: whether real dpm disablement should be performed
- * true: update the cache(about dpm enablement state) only
- * false: real dpm disablement plus cache update
  * @mask:  the dpm feature which should not be disabled
  * SMU_FEATURE_COUNT: no exception, all dpm features
  * to disable
@@ -759,7 +756,6 @@ int smu_cmn_set_pp_feature_mask(struct smu_context *smu,
  * 0 on success or a negative error code on failure.
  */
 int smu_cmn_disable_all_features_with_exception(struct smu_context *smu,
-   bool no_hw_disablement,
enum smu_feature_mask mask)
 {
uint64_t features_to_disable = U64_MAX;
@@ -775,9 +771,6 @@ int smu_cmn_disable_all_features_with_exception(struct 
smu_context *smu,
features_to_disable &= ~(1ULL << skipped_feature_id);
}
 
-   if (no_hw_disablement)
-   return 0;
-
return smu_cmn_feature_update_enable_state(smu,
   features_to_disable,
   0);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h
index 98190ed5360f..c1423784ab96 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h
@@ -76,7 +76,6 @@ int smu_cmn_set_pp_feature_mask(struct smu_context *smu,
uint64_t new_mask);

[PATCH 6/7] drm/amd/pm: avoid consecutive retrieving for enabled ppfeatures

2022-01-20 Thread Evan Quan
As the enabled ppfeatures are already retrieved just ahead, we can use
them directly instead of retrieving them again and again.

Signed-off-by: Evan Quan 
Change-Id: I08827437fcbbc52084418c8ca6a90cfa503306a9
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 905d6ddd7479..e54c59f3e8c2 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -661,6 +661,7 @@ size_t smu_cmn_get_pp_feature_mask(struct smu_context *smu,
int8_t sort_feature[SMU_FEATURE_COUNT];
size_t size = 0;
int ret = 0, i;
+   int feature_id;
 
ret = smu_cmn_get_enabled_mask(smu,
   &feature_mask);
@@ -689,11 +690,18 @@ size_t smu_cmn_get_pp_feature_mask(struct smu_context 
*smu,
if (sort_feature[i] < 0)
continue;
 
+   /* convert to asic specific feature ID */
+   feature_id = smu_cmn_to_asic_specific_index(smu,
+   
CMN2ASIC_MAPPING_FEATURE,
+   sort_feature[i]);
+   if (feature_id < 0)
+   continue;
+
size += sysfs_emit_at(buf, size, "%02d. %-20s (%2d) : %s\n",
count++,
smu_get_feature_name(smu, sort_feature[i]),
i,
-   !!smu_cmn_feature_is_enabled(smu, 
sort_feature[i]) ?
+   !!test_bit(feature_id, (unsigned long 
*)&feature_mask) ?
"enabled" : "disabled");
}
 
-- 
2.29.0



[PATCH 5/7] drm/amd/pm: drop the cache for enabled ppfeatures

2022-01-20 Thread Evan Quan
The following scenarios make the driver cache for enabled ppfeatures
outdated and invalid:
  - Other tools interact with the PMFW to change the enabled ppfeatures.
  - The PMFW may enable/disable some features behind the driver's back.
    E.g. for sienna_cichlid, on gfxoff entry the PMFW will disable gfx
    related DPM features. All this is performed without the driver's
    notice.
Also, considering the driver does not actually interact with the PMFW
that frequently, the benefit brought by such a cache is very limited.

Signed-off-by: Evan Quan 
Change-Id: I20ed58ab216e930c7a5d223be1eb99146889f2b3
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  1 -
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  1 -
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 23 +-
 .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  | 16 +--
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 23 +-
 .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  | 16 +--
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 46 +--
 7 files changed, 17 insertions(+), 109 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 5aff66103da2..aceb6e56bc6a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -949,7 +949,6 @@ static int smu_sw_init(void *handle)
 
smu->pool_size = adev->pm.smu_prv_buffer_size;
smu->smu_feature.feature_num = SMU_FEATURE_MAX;
-   bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
 
mutex_init(&smu->message_lock);
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 46496bed5f61..4ba579cdd203 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -389,7 +389,6 @@ struct smu_feature
 {
uint32_t feature_num;
DECLARE_BITMAP(allowed, SMU_FEATURE_MAX);
-   DECLARE_BITMAP(enabled, SMU_FEATURE_MAX);
 };
 
 struct smu_clocks {
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
index b30885f4ca8e..1c76ee25c30b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
@@ -798,27 +798,8 @@ int smu_v11_0_set_allowed_mask(struct smu_context *smu)
 int smu_v11_0_system_features_control(struct smu_context *smu,
 bool en)
 {
-   struct smu_feature *feature = &smu->smu_feature;
-   uint64_t feature_mask;
-   int ret = 0;
-
-   ret = smu_cmn_send_smc_msg(smu, (en ? SMU_MSG_EnableAllSmuFeatures :
-SMU_MSG_DisableAllSmuFeatures), NULL);
-   if (ret)
-   return ret;
-
-   bitmap_zero(feature->enabled, feature->feature_num);
-
-   if (en) {
-   ret = smu_cmn_get_enabled_mask(smu, &feature_mask);
-   if (ret)
-   return ret;
-
-   bitmap_copy(feature->enabled, (unsigned long *)&feature_mask,
-   feature->feature_num);
-   }
-
-   return ret;
+   return smu_cmn_send_smc_msg(smu, (en ? SMU_MSG_EnableAllSmuFeatures :
+ SMU_MSG_DisableAllSmuFeatures), NULL);
 }
 
 int smu_v11_0_notify_display_change(struct smu_context *smu)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 478151e72889..96a5b31f708d 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -1947,27 +1947,13 @@ static int vangogh_get_dpm_clock_table(struct 
smu_context *smu, struct dpm_clock
 static int vangogh_system_features_control(struct smu_context *smu, bool en)
 {
struct amdgpu_device *adev = smu->adev;
-   struct smu_feature *feature = &smu->smu_feature;
-   uint64_t feature_mask;
int ret = 0;
 
if (adev->pm.fw_version >= 0x43f1700 && !en)
ret = smu_cmn_send_smc_msg_with_param(smu, 
SMU_MSG_RlcPowerNotify,
  RLC_STATUS_OFF, NULL);
 
-   bitmap_zero(feature->enabled, feature->feature_num);
-
-   if (!en)
-   return ret;
-
-   ret = smu_cmn_get_enabled_mask(smu, &feature_mask);
-   if (ret)
-   return ret;
-
-   bitmap_copy(feature->enabled, (unsigned long *)&feature_mask,
-   feature->feature_num);
-
-   return 0;
+   return ret;
 }
 
 static int vangogh_post_smu_init(struct smu_context *smu)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index c1caf61c2bbc..dda3f6d685d1 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -764,27 +764,8 @@ int 

[PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure

2022-01-20 Thread Evan Quan
It has exactly the same value as the 'enabled' member and does
the same thing.

Signed-off-by: Evan Quan 
Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  1 -
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  1 -
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 10 +-
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 19 ---
 .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  5 +
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  5 +
 .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  3 ---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 17 -
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  3 ---
 10 files changed, 19 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 5ace30434e60..d3237b89f2c5 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -949,7 +949,6 @@ static int smu_sw_init(void *handle)
 
smu->pool_size = adev->pm.smu_prv_buffer_size;
smu->smu_feature.feature_num = SMU_FEATURE_MAX;
-   bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 18f24db7d202..3c0360772822 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -388,7 +388,6 @@ struct smu_power_context {
 struct smu_feature
 {
uint32_t feature_num;
-   DECLARE_BITMAP(supported, SMU_FEATURE_MAX);
DECLARE_BITMAP(allowed, SMU_FEATURE_MAX);
DECLARE_BITMAP(enabled, SMU_FEATURE_MAX);
 };
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 84834c24a7e9..9fb290f9aaeb 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -1625,8 +1625,8 @@ static int navi10_display_config_changed(struct 
smu_context *smu)
int ret = 0;
 
if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
ret = smu_cmn_send_smc_msg_with_param(smu, 
SMU_MSG_NumOfDisplays,
  
smu->display_config->num_display,
  NULL);
@@ -1864,13 +1864,13 @@ static int navi10_notify_smc_display_config(struct 
smu_context *smu)
min_clocks.dcef_clock_in_sr = 
smu->display_config->min_dcef_deep_sleep_set_clk;
min_clocks.memory_clock = smu->display_config->min_mem_set_clock;
 
-   if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
+   if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
clock_req.clock_type = amd_pp_dcef_clock;
clock_req.clock_freq_in_khz = min_clocks.dcef_clock * 10;
 
ret = smu_v11_0_display_clock_voltage_request(smu, &clock_req);
if (!ret) {
-   if (smu_cmn_feature_is_supported(smu, 
SMU_FEATURE_DS_DCEFCLK_BIT)) {
+   if (smu_cmn_feature_is_enabled(smu, 
SMU_FEATURE_DS_DCEFCLK_BIT)) {
ret = smu_cmn_send_smc_msg_with_param(smu,
  
SMU_MSG_SetMinDeepSleepDcefclk,
  
min_clocks.dcef_clock_in_sr/100,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 651fe748e423..d568d6137a00 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -1281,8 +1281,8 @@ static int sienna_cichlid_display_config_changed(struct 
smu_context *smu)
int ret = 0;
 
if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
 #if 0
ret = smu_cmn_send_smc_msg_with_param(smu, 
SMU_MSG_NumOfDisplays,
  
smu->display_config->num_display,
@@ -1521,13 +1521,13 @@ static int 

[PATCH 4/7] drm/amd/pm: update the data type for retrieving enabled ppfeatures

2022-01-20 Thread Evan Quan
Use uint64_t instead of an array of uint32_t. This avoids
some unnecessary intermediate uint32_t -> uint64_t conversions.

Signed-off-by: Evan Quan 
Change-Id: I4e217357203a23440f058d7e25f55eaebd15c5ef
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  2 +-
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  5 ++--
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  5 +---
 .../amd/pm/swsmu/smu11/cyan_skillfish_ppt.c   |  6 +---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  5 +---
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |  5 +---
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c|  4 +--
 .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  | 10 ++-
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|  7 ++---
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  4 +--
 .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  9 ++
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 29 +++
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  3 +-
 drivers/gpu/drm/amd/pm/swsmu/smu_internal.h   |  2 +-
 14 files changed, 32 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index d3237b89f2c5..5aff66103da2 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -2309,7 +2309,7 @@ static int smu_read_sensor(void *handle,
*size = 4;
break;
case AMDGPU_PP_SENSOR_ENABLED_SMC_FEATURES_MASK:
-   ret = smu_feature_get_enabled_mask(smu, (uint32_t *)data, 2);
+   ret = smu_feature_get_enabled_mask(smu, (uint64_t *)data);
*size = 8;
break;
case AMDGPU_PP_SENSOR_UVD_POWER:
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 3c0360772822..46496bed5f61 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -974,10 +974,9 @@ struct pptable_funcs {
/**
 * @get_enabled_mask: Get a mask of features that are currently enabled
 *on the SMU.
-* &feature_mask: Array representing enabled feature mask.
-* &num: Elements in &feature_mask.
+* &feature_mask: Enabled feature mask.
 */
-   int (*get_enabled_mask)(struct smu_context *smu, uint32_t 
*feature_mask, uint32_t num);
+   int (*get_enabled_mask)(struct smu_context *smu, uint64_t 
*feature_mask);
 
/**
 * @feature_is_enabled: Test if a feature is enabled.
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 2c78d04d5611..dda36942cfb6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -2022,15 +2022,12 @@ static void arcturus_dump_pptable(struct smu_context 
*smu)
 static bool arcturus_is_dpm_running(struct smu_context *smu)
 {
int ret = 0;
-   uint32_t feature_mask[2];
uint64_t feature_enabled;
 
-   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, &feature_enabled);
if (ret)
return false;
 
-   feature_enabled = (uint64_t)feature_mask[1] << 32 | feature_mask[0];
-
return !!(feature_enabled & SMC_DPM_FEATURE);
 }
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
index cabea4eb1566..844d931da6f6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
@@ -366,20 +366,16 @@ static bool cyan_skillfish_is_dpm_running(struct 
smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
int ret = 0;
-   uint32_t feature_mask[2];
uint64_t feature_enabled;
 
/* we need to re-init after suspend so return false */
if (adev->in_suspend)
return false;
 
-   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, &feature_enabled);
if (ret)
return false;
 
-   feature_enabled = (uint64_t)feature_mask[0] |
-   ((uint64_t)feature_mask[1] << 32);
-
/*
 * cyan_skillfish specific, query default sclk inseted of hard code.
 */
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 9fb290f9aaeb..4a41cd6c5ea4 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -1640,15 +1640,12 @@ static int navi10_display_config_changed(struct 
smu_context *smu)
 static bool navi10_is_dpm_running(struct smu_context *smu)
 {
int ret = 0;
-   uint32_t feature_mask[2];
uint64_t feature_enabled;
 
-   ret = smu_cmn_get_enabled_mask(smu, f

Re: [PATCH] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-01-20 Thread Christian König

Am 21.01.22 um 06:28 schrieb Xin Xiong:

This issue takes place in an error path in
amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into the
default case, the function simply returns -EINVAL, forgetting to
decrement the reference count of a dma_fence object, which was bumped
earlier by amdgpu_cs_get_fence(). This may result in reference count
leaks.

Fix it by decreasing the refcount of specific object before returning
the error code.

Signed-off-by: Xin Xiong 
Signed-off-by: Xin Tan 


Good catch. Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 0311d799a..894869789 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1510,6 +1510,7 @@ int amdgpu_cs_fence_to_handle_ioctl(struct drm_device 
*dev, void *data,
return 0;
  
  	default:

+   dma_fence_put(fence);
return -EINVAL;
}
  }




[PATCH 2/7] drm/amd/pm: unify the interface for retrieving enabled ppfeatures

2022-01-20 Thread Evan Quan
Instead of having two interfaces which do the same thing.

Signed-off-by: Evan Quan 
Change-Id: I6302c9b5abdb999c4b7c83a0d1852181208b1c1f
---
 .../amd/pm/swsmu/smu11/cyan_skillfish_ppt.c   |  2 +-
 .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  6 +-
 .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  6 +-
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 93 ---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  4 -
 5 files changed, 44 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
index 2acd7470431e..cabea4eb1566 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/cyan_skillfish_ppt.c
@@ -373,7 +373,7 @@ static bool cyan_skillfish_is_dpm_running(struct 
smu_context *smu)
if (adev->in_suspend)
return false;
 
-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
if (ret)
return false;
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 721027917f81..b4a3c9b8b54e 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -507,7 +507,7 @@ static bool vangogh_is_dpm_running(struct smu_context *smu)
if (adev->in_suspend)
return false;
 
-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
 
if (ret)
return false;
@@ -1965,7 +1965,7 @@ static int vangogh_system_features_control(struct 
smu_context *smu, bool en)
if (!en)
return ret;
 
-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
if (ret)
return ret;
 
@@ -2182,7 +2182,7 @@ static const struct pptable_funcs vangogh_ppt_funcs = {
.dpm_set_jpeg_enable = vangogh_dpm_set_jpeg_enable,
.is_dpm_running = vangogh_is_dpm_running,
.read_sensor = vangogh_read_sensor,
-   .get_enabled_mask = smu_cmn_get_enabled_32_bits_mask,
+   .get_enabled_mask = smu_cmn_get_enabled_mask,
.get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
.set_watermarks_table = vangogh_set_watermarks_table,
.set_driver_table_location = smu_v11_0_set_driver_table_location,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
index bd24a2632214..f425827e2361 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
@@ -209,7 +209,7 @@ static int yellow_carp_system_features_control(struct 
smu_context *smu, bool en)
if (!en)
return ret;
 
-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
if (ret)
return ret;
 
@@ -258,7 +258,7 @@ static bool yellow_carp_is_dpm_running(struct smu_context 
*smu)
uint32_t feature_mask[2];
uint64_t feature_enabled;
 
-   ret = smu_cmn_get_enabled_32_bits_mask(smu, feature_mask, 2);
+   ret = smu_cmn_get_enabled_mask(smu, feature_mask, 2);
 
if (ret)
return false;
@@ -1174,7 +1174,7 @@ static const struct pptable_funcs yellow_carp_ppt_funcs = 
{
.is_dpm_running = yellow_carp_is_dpm_running,
.set_watermarks_table = yellow_carp_set_watermarks_table,
.get_gpu_metrics = yellow_carp_get_gpu_metrics,
-   .get_enabled_mask = smu_cmn_get_enabled_32_bits_mask,
+   .get_enabled_mask = smu_cmn_get_enabled_mask,
.get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
.set_driver_table_location = smu_v13_0_set_driver_table_location,
.gfx_off_control = smu_v13_0_gfx_off_control,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index c3c679bf9d9f..50164ebed1cd 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -545,67 +545,57 @@ int smu_cmn_get_enabled_mask(struct smu_context *smu,
 uint32_t *feature_mask,
 uint32_t num)
 {
-   uint32_t feature_mask_high = 0, feature_mask_low = 0;
struct smu_feature *feature = &smu->smu_feature;
+   struct amdgpu_device *adev = smu->adev;
+   uint32_t *feature_mask_high;
+   uint32_t *feature_mask_low;
int ret = 0;
 
if (!feature_mask || num < 2)
return -EINVAL;
 
-   if (bitmap_empty(feature->enabled, feature->feature_num)) {
-   ret = smu_cmn_send_smc_msg(smu, 
SMU_MSG_GetEnabledSmuFeaturesHigh, &feature_mask_high);
-   if (ret)
-

[PATCH 1/7] drm/amd/pm: correct the way for retrieving enabled ppfeatures on Renoir

2022-01-20 Thread Evan Quan
As with other dGPU ASICs, Renoir should use smu_cmn_get_enabled_mask() for
that job.

Signed-off-by: Evan Quan 
Change-Id: I9e845ba84dd45d0826506de44ef4760fa851a516
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index fcead7c6ca7e..c3c679bf9d9f 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -710,7 +710,8 @@ size_t smu_cmn_get_pp_feature_mask(struct smu_context *smu,
size_t size = 0;
int ret = 0, i;
 
-   if (!smu->is_apu) {
+   if (!smu->is_apu ||
+   (smu->adev->asic_type == CHIP_RENOIR)) {
ret = smu_cmn_get_enabled_mask(smu,
feature_mask,
2);
-- 
2.29.0



RE: [PATCH] drm/amd/pm: use dev_*** to print output in multiple GPUs

2022-01-20 Thread Quan, Evan
[Public]

Sure, please go ahead.

BR
Evan
> -Original Message-
> From: Chen, Guchun 
> Sent: Friday, January 21, 2022 3:35 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org;
> Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ; Lazar,
> Lijo 
> Subject: RE: [PATCH] drm/amd/pm: use dev_*** to print output in multiple
> GPUs
> 
> [Public]
> 
> Thank you for the review, Evan. I will submit the patch after modifying
> the commit message a bit to align the indentation. Hope that's fine with
> you.
> 
> Regards,
> Guchun
> 
> -Original Message-
> From: Quan, Evan 
> Sent: Friday, January 21, 2022 11:38 AM
> To: Chen, Guchun ; amd-
> g...@lists.freedesktop.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; Lazar,
> Lijo 
> Subject: RE: [PATCH] drm/amd/pm: use dev_*** to print output in multiple
> GPUs
> 
> [AMD Official Use Only]
> 
> Reviewed-by: Evan Quan 
> 
> > -Original Message-
> > From: Chen, Guchun 
> > Sent: Friday, January 21, 2022 10:47 AM
> > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > ; Koenig, Christian
> > ; Pan, Xinhui ; Quan,
> > Evan ; Lazar, Lijo 
> > Cc: Chen, Guchun 
> > Subject: [PATCH] drm/amd/pm: use dev_*** to print output in multiple
> > GPUs
> >
> > In a multiple-GPU configuration, when sending an SMU message fails, it's
> > hard to figure out which GPU has the problem.
> > So it's inconvenient for the user.
> >
> > [40190.142181] amdgpu: [powerplay]
> > last message was failed ret is 65535 [40190.242420]
> > amdgpu: [powerplay]
> > failed to send message 201 ret is 65535 [40190.392763]
> > amdgpu: [powerplay]
> > last message was failed ret is 65535 [40190.492997]
> > amdgpu: [powerplay]
> > failed to send message 200 ret is 65535 [40190.743575]
> > amdgpu: [powerplay]
> > last message was failed ret is 65535 [40190.843812]
> > amdgpu: [powerplay]
> > failed to send message 282 ret is 65535
> >
> > Signed-off-by: Guchun Chen 
> > ---
> >  drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c   |  4 +++-
> >  .../gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c|  4 ++--
> >  drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c | 11
> > +++
> >  drivers/gpu/drm/amd/pm/powerplay/smumgr/smu9_smumgr.c |  2 +-
> >  .../gpu/drm/amd/pm/powerplay/smumgr/vega20_smumgr.c   |  4 ++--
> >  5 files changed, 15 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> > b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> > index 93a1c7248e26..5ca3c422f7d4 100644
> > --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> > +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> > @@ -208,6 +208,7 @@ static int ci_read_smc_sram_dword(struct
> pp_hwmgr
> > *hwmgr, uint32_t smc_addr,
> >
> >  static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg) {
> > +   struct amdgpu_device *adev = hwmgr->adev;
> > int ret;
> >
> > cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0); @@ -218,7
> +219,8
> > @@ static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t
> msg)
> > ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
> >
> > if (ret != 1)
> > -   pr_info("\n failed to send message %x ret is %d\n",  msg, ret);
> > +   dev_info(adev->dev,
> > +   "failed to send message %x ret is %d\n", msg,ret);
> >
> > return 0;
> >  }
> > diff --git
> a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> > b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> > index 47b34c6ca924..88a5641465dc 100644
> > --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> > +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> > @@ -87,7 +87,7 @@ static int smu10_send_msg_to_smc(struct pp_hwmgr
> > *hwmgr, uint16_t msg)
> > smu10_send_msg_to_smc_without_waiting(hwmgr, msg);
> >
> > if (smu10_wait_for_response(hwmgr) == 0)
> > -   printk("Failed to send Message %x.\n", msg);
> > +   dev_err(adev->dev, "Failed to send Message %x.\n", msg);
> >
> > return 0;
> >  }
> > @@ -108,7 +108,7 @@ static int
> > smu10_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
> >
> >
> > if (smu10_wait_for_response(hwmgr) == 0)
> > -   printk("Failed to send Message %x.\n", msg);
> > +   dev_err(adev->dev, "Failed to send Message %x.\n", msg);
> >
> > return 0;
> >  }
> > diff --git
> a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> > b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> > index aae25243eb10..5a010cd38303 100644
> > --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> > +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> > @@ -165,6 +165,7 @@ bool smu7_is_smc_ram_running(struct pp_hwmgr
> > *hwmgr)
> >
> >  int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)  {
> > +   struct amdgpu_device *adev = hwmgr->adev;
> > int ret;
> >
> > 

RE: [PATCH] drm/amd/pm: use dev_*** to print output in multiple GPUs

2022-01-20 Thread Chen, Guchun
[Public]

Thank you for the review, Evan. I will submit the patch after modifying the
commit message a bit to align the indentation. Hope that's fine with you.

Regards,
Guchun

-Original Message-
From: Quan, Evan  
Sent: Friday, January 21, 2022 11:38 AM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org; Deucher, 
Alexander ; Koenig, Christian 
; Pan, Xinhui ; Lazar, Lijo 

Subject: RE: [PATCH] drm/amd/pm: use dev_*** to print output in multiple GPUs

[AMD Official Use Only]

Reviewed-by: Evan Quan 

> -Original Message-
> From: Chen, Guchun 
> Sent: Friday, January 21, 2022 10:47 AM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
> ; Koenig, Christian 
> ; Pan, Xinhui ; Quan, 
> Evan ; Lazar, Lijo 
> Cc: Chen, Guchun 
> Subject: [PATCH] drm/amd/pm: use dev_*** to print output in multiple 
> GPUs
> 
> In a multiple-GPU configuration, when sending an SMU message fails, it's
> hard to figure out which GPU has the problem.
> So it's inconvenient for the user.
> 
> [40190.142181] amdgpu: [powerplay]
> last message was failed ret is 65535 [40190.242420] 
> amdgpu: [powerplay]
> failed to send message 201 ret is 65535 [40190.392763] 
> amdgpu: [powerplay]
> last message was failed ret is 65535 [40190.492997] 
> amdgpu: [powerplay]
> failed to send message 200 ret is 65535 [40190.743575] 
> amdgpu: [powerplay]
> last message was failed ret is 65535 [40190.843812] 
> amdgpu: [powerplay]
> failed to send message 282 ret is 65535
> 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c   |  4 +++-
>  .../gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c|  4 ++--
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c | 11
> +++
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/smu9_smumgr.c |  2 +-
>  .../gpu/drm/amd/pm/powerplay/smumgr/vega20_smumgr.c   |  4 ++--
>  5 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> index 93a1c7248e26..5ca3c422f7d4 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> @@ -208,6 +208,7 @@ static int ci_read_smc_sram_dword(struct pp_hwmgr 
> *hwmgr, uint32_t smc_addr,
> 
>  static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)  
> {
> + struct amdgpu_device *adev = hwmgr->adev;
>   int ret;
> 
>   cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0); @@ -218,7 +219,8 
> @@ static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
>   ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
> 
>   if (ret != 1)
> - pr_info("\n failed to send message %x ret is %d\n",  msg, ret);
> + dev_info(adev->dev,
> + "failed to send message %x ret is %d\n", msg,ret);
> 
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> index 47b34c6ca924..88a5641465dc 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> @@ -87,7 +87,7 @@ static int smu10_send_msg_to_smc(struct pp_hwmgr 
> *hwmgr, uint16_t msg)
>   smu10_send_msg_to_smc_without_waiting(hwmgr, msg);
> 
>   if (smu10_wait_for_response(hwmgr) == 0)
> - printk("Failed to send Message %x.\n", msg);
> + dev_err(adev->dev, "Failed to send Message %x.\n", msg);
> 
>   return 0;
>  }
> @@ -108,7 +108,7 @@ static int
> smu10_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
> 
> 
>   if (smu10_wait_for_response(hwmgr) == 0)
> - printk("Failed to send Message %x.\n", msg);
> + dev_err(adev->dev, "Failed to send Message %x.\n", msg);
> 
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> index aae25243eb10..5a010cd38303 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> @@ -165,6 +165,7 @@ bool smu7_is_smc_ram_running(struct pp_hwmgr
> *hwmgr)
> 
>  int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)  {
> + struct amdgpu_device *adev = hwmgr->adev;
>   int ret;
> 
>   PHM_WAIT_FIELD_UNEQUAL(hwmgr, SMC_RESP_0, SMC_RESP, 0); @@ -172,9 
> +173,10 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
> msg)
>   ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
> 
>   if (ret == 0xFE)
> - pr_debug("last message was not supported\n");
> + dev_dbg(adev->dev, "last message was not supported\n");
>   else if (ret != 1)
> - pr_info("\n last message was failed ret is %d\n", ret);
> + dev_info(adev->dev,
> +   

RE: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Chen, Guchun
[Public]

Hi Christian,

So far I have not found anything bad except the warning, so I assume it
should be fine with the change below. Shall I make a patch for this?

-if (WARN_ON(!adev->gart.ptr)) 
+if (!adev->gart.ptr)
return;

Regards,
Guchun

-Original Message-
From: Koenig, Christian  
Sent: Friday, January 21, 2022 3:24 PM
To: Chen, Guchun ; Kim, Jonathan ; 
Christian König ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Yeah, you of course still need to check if the table is allocated or not.

Otherwise we will run immediately into a NULL pointer exception.

The question is: does it always work with this change?

Regards,
Christian.

Am 21.01.22 um 02:54 schrieb Chen, Guchun:
> [Public]
>
> How about changing the code from "if (WARN_ON(!adev->gart.ptr))" to "if
> (!adev->gart.ptr)"? The purpose is to drop the warning only; all other
> functionality stays the same.
>
> Regards,
> Guchun
>
> -Original Message-
> From: Koenig, Christian 
> Sent: Thursday, January 20, 2022 11:47 PM
> To: Kim, Jonathan ; Chen, Guchun 
> ; Christian König 
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>
> I think we could remove this warning as well; all we need to do is make
> sure that the GART table is always restored from the metadata before it
> is enabled in hardware.
>
> I've seen that we do this anyway for most hardware generations, but we really 
> need to double check.
>
> Christian.
>
> Am 20.01.22 um 16:04 schrieb Kim, Jonathan:
>> [Public]
>>
>> Switching to a VRAM bounce buffer can drop performance around 4x~6x on
>> Vega20 for larger accesses, so it's not desired.
>>
>> Jon
>>
>>> -Original Message-
>>> From: Koenig, Christian 
>>> Sent: January 20, 2022 9:10 AM
>>> To: Chen, Guchun ; Christian König 
>>> ; Kim, Jonathan 
>>> ; amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>>>
>>> I actually suggested allocating the bounce buffer in VRAM, but that
>>> adds a bit more latency.
>>>
>>> Christian.
>>>
>>> Am 20.01.22 um 15:00 schrieb Chen, Guchun:
 [Public]

 Hi Christian,

 Unfortunately, your patch introduces another warning, from the same
 sdma_access_bo creation in amdgpu_ttm_init.
 In your patch, you introduce a new check, WARN_ON(!adev->gart.ptr);
 however, sdma_access_bo targets creating a BO in the GTT domain, but
 adev->gart.ptr only becomes ready in gmc_v10_0_gart_init.

 Hi Jonathan,

 Is it mandatory to create this sdma_access_bo in the GTT domain? Can we
 change it to VRAM?
 Regards,
 Guchun

 -Original Message-
 From: Koenig, Christian 
 Sent: Wednesday, January 19, 2022 10:38 PM
 To: Chen, Guchun ; Christian König 
 ; Kim, Jonathan 
 ; amd-gfx@lists.freedesktop.org
 Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

 Hi Guchun,

 yes, just haven't found time to do this yet.

 Regards,
 Christian.

 Am 19.01.22 um 15:24 schrieb Chen, Guchun:
> [Public]
>
> Hello Christian,
>
> Do you plan to submit your code to drm-next branch?
>
> Regards,
> Guchun
>
> -Original Message-
> From: Chen, Guchun
> Sent: Tuesday, January 18, 2022 10:22 PM
> To: 'Christian König' ; Kim, 
> Jonathan ; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag
>
> [Public]
>
> Thanks for the clarification. The patch is:
> Reviewed-by: Guchun Chen 
>
> Regards,
> Guchun
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, January 18, 2022 10:10 PM
> To: Chen, Guchun ; Kim, Jonathan 
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>
> Am 18.01.22 um 14:28 schrieb Chen, Guchun:
>> [Public]
>>
>> - if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
>> - goto skip_pin_bo;
>> -
>> - r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
>> - if (r)
>> - return r;
>> -
>> -skip_pin_bo:
>>
>> Does deleting the skip_pin_bo path cause a redundant BO pin in the SRIOV case?
> Pinning/unpinning the BO was already removed as well.
>
> See Nirmoy's patches in the git log.
>
> Regards,
> Christian.
>
>> Regards,
>> Guchun
>>
>> -Original Message-
>> From: Christian König 
>> Sent: Tuesday, January 18, 2022 8:02 PM
>> To: Chen, Guchun ; Kim, Jonathan 
>> ; amd-gfx@lists.freedesktop.org
>> Subject: [PATCH] drm/amdgpu: remove gart.ready flag
>>
>> That's just a leftover from old radeon days and was preventing CS and
>> GART bindings before the hardware was initialized. But nowadays that
>> is perfectly valid.
>> The only thing we need to warn about is a GART binding before the
>> table is even allocated.

Re: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Christian König

Yeah, you of course still need to check if the table is allocated or not.

Otherwise we will run immediately into a NULL pointer exception.

The question is: does it always work with this change?

Regards,
Christian.

Am 21.01.22 um 02:54 schrieb Chen, Guchun:

[Public]

How about changing the code from "if (WARN_ON(!adev->gart.ptr))" to "if
(!adev->gart.ptr)"? The purpose is to drop the warning only; all other
functionality stays the same.

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Thursday, January 20, 2022 11:47 PM
To: Kim, Jonathan ; Chen, Guchun ; 
Christian König ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

I think we could remove this warning as well; all we need to do is make sure
that the GART table is always restored from the metadata before it is enabled
in hardware.

I've seen that we do this anyway for most hardware generations, but we really 
need to double check.

Christian.

Am 20.01.22 um 16:04 schrieb Kim, Jonathan:

[Public]

Switching to a VRAM bounce buffer can drop performance around 4x~6x on Vega20
for larger accesses, so it's not desired.

Jon


-Original Message-
From: Koenig, Christian 
Sent: January 20, 2022 9:10 AM
To: Chen, Guchun ; Christian König
; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

I actually suggested allocating the bounce buffer in VRAM, but that
adds a bit more latency.

Christian.

Am 20.01.22 um 15:00 schrieb Chen, Guchun:

[Public]

Hi Christian,

Unfortunately, your patch introduces another warning, from the same
sdma_access_bo creation in amdgpu_ttm_init.

In your patch, you introduce a new check, WARN_ON(!adev->gart.ptr);
however, sdma_access_bo targets creating a BO in the GTT domain, but
adev->gart.ptr only becomes ready in gmc_v10_0_gart_init.

Hi Jonathan,

Is it mandatory to create this sdma_access_bo in the GTT domain? Can we
change it to VRAM?

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Wednesday, January 19, 2022 10:38 PM
To: Chen, Guchun ; Christian König
; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Hi Guchun,

yes, just haven't found time to do this yet.

Regards,
Christian.

Am 19.01.22 um 15:24 schrieb Chen, Guchun:

[Public]

Hello Christian,

Do you plan to submit your code to drm-next branch?

Regards,
Guchun

-Original Message-
From: Chen, Guchun
Sent: Tuesday, January 18, 2022 10:22 PM
To: 'Christian König' ; Kim,
Jonathan ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag

[Public]

Thanks for the clarification. The patch is:
Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 10:10 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Am 18.01.22 um 14:28 schrieb Chen, Guchun:

[Public]

- if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
- goto skip_pin_bo;
-
- r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
- if (r)
- return r;
-
-skip_pin_bo:

Does deleting the skip_pin_bo path cause a redundant BO pin in the SRIOV case?

Pinning/unpinning the BO was already removed as well.

See Nirmoy's patches in the git log.

Regards,
Christian.


Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 8:02 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: remove gart.ready flag

That's just a leftover from old radeon days and was preventing CS and
GART bindings before the hardware was initialized. But nowadays that
is perfectly valid.

The only thing we need to warn about is a GART binding before the
table is even allocated.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++-

--

  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  8 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  8 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 10 +--
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  5 +-
  11 files changed, 52 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 645950a653a0..53cc844346f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -150,7 +150,7 @@ void amdgpu_gart_table_vram_free(struct

amdgpu_device *adev)

   * replaces them with the dummy page (all asics)

RE: [PATCH V3 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Chen, Guchun
[Public]

Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Quan, Evan  
Sent: Friday, January 21, 2022 3:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 
; Chen, Guchun ; Quan, Evan 

Subject: [PATCH V3 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

All those APIs are already protected, either by adev->pm.mutex or by
smu->message_lock.

Signed-off-by: Evan Quan 
Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
--
v1->v2:
  - optimize the "!smu_table->hardcode_pptable" check(Guchun)
  - add the lock protection(adev->pm.mutex) for i2c transfer(Lijo)
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 316 ++
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   4 +-
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   4 +-
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   4 +-
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   4 +-
 6 files changed, 34 insertions(+), 299 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 828cb932f6a9..eaaa5b033d46 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct smu_context *smu,
   uint32_t mask);
 static int smu_handle_task(struct smu_context *smu,
   enum amd_dpm_forced_level level,
-  enum amd_pp_task task_id,
-  bool lock_needed);
+  enum amd_pp_task task_id);
 static int smu_reset(struct smu_context *smu);  static int 
smu_set_fan_speed_pwm(void *handle, u32 speed);  static int 
smu_set_fan_control_mode(void *handle, u32 value); @@ -68,36 +67,22 @@ static 
int smu_sys_get_pp_feature_mask(void *handle,
   char *buf)
 {
struct smu_context *smu = handle;
-   int size = 0;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
 
-   mutex_lock(&smu->mutex);
-
-   size = smu_get_pp_feature_mask(smu, buf);
-
-   mutex_unlock(&smu->mutex);
-
-   return size;
+   return smu_get_pp_feature_mask(smu, buf);
 }
 
 static int smu_sys_set_pp_feature_mask(void *handle,
   uint64_t new_mask)
 {
struct smu_context *smu = handle;
-   int ret = 0;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
 
-   mutex_lock(&smu->mutex);
-
-   ret = smu_set_pp_feature_mask(smu, new_mask);
-
-   mutex_unlock(&smu->mutex);
-
-   return ret;
+   return smu_set_pp_feature_mask(smu, new_mask);
 }
 
 int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value) @@ -117,16 
+102,12 @@ int smu_set_soft_freq_range(struct smu_context *smu,  {
int ret = 0;
 
-   mutex_lock(&smu->mutex);
-
if (smu->ppt_funcs->set_soft_freq_limited_range)
ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
  clk_type,
  min,
  max);
 
-   mutex_unlock(&smu->mutex);
-
return ret;
 }
 
@@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct smu_context *smu,
if (!min && !max)
return -EINVAL;
 
-   mutex_lock(&smu->mutex);
-
if (smu->ppt_funcs->get_dpm_ultimate_freq)
ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
clk_type,
min,
max);
 
-   mutex_unlock(&smu->mutex);
-
return ret;
 }
 
@@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,  {
struct smu_context *smu = handle;
struct smu_table_context *smu_table = &smu->smu_table;
-   uint32_t powerplay_table_size;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
@@ -490,18 +466,12 @@ static int smu_sys_get_pp_table(void *handle,
if (!smu_table->power_play_table && !smu_table->hardcode_pptable)
return -EINVAL;
 
-   mutex_lock(&smu->mutex);
-
if (smu_table->hardcode_pptable)
*table = smu_table->hardcode_pptable;
else
*table = smu_table->power_play_table;
 
-   powerplay_table_size = smu_table->power_play_table_size;
-
-   mutex_unlock(&smu->mutex);
-
-   return powerplay_table_size;
+   return smu_table->power_play_table_size;
 }
 
 static int smu_sys_set_pp_table(void *handle, @@ -521,12 +491,10 @@ static int 
smu_sys_set_pp_table(void *handle,
re

RE: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Quan, Evan
[Public]



> -Original Message-
> From: Chen, Guchun 
> Sent: Thursday, January 20, 2022 9:38 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Lazar, Lijo
> 
> Subject: RE: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection
> smu->mutex
> 
> [Public]
> 
>   if (!smu_table->hardcode_pptable)
>   smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
> - if (!smu_table->hardcode_pptable) {
> - ret = -ENOMEM;
> - goto failed;
> - }
> + if (!smu_table->hardcode_pptable)
> + return -ENOMEM;
> 
> I guess it's better to put the second check of hardcode_pptable inside the
> first if block, like:
> if (!smu_table->hardcode_pptable) {
>   smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
>   if (!smu_table->hardcode_pptable)
>   return -ENOMEM;
> }
[Quan, Evan] Thanks! Fixed in V3.

BR
Evan
> 
> 
> Regards,
> Guchun
> 
> -Original Message-
> From: Quan, Evan 
> Sent: Monday, January 17, 2022 1:42 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Lazar, Lijo
> ; Chen, Guchun ; Quan,
> Evan 
> Subject: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection smu-
> >mutex
> 
> As all those APIs are already protected either by adev->pm.mutex or
> smu->message_lock.
> 
> Signed-off-by: Evan Quan 
> Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 315 ++
>  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
>  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   2 -
>  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   2 -
>  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   2 -
>  .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   2 -
>  6 files changed, 25 insertions(+), 299 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index 828cb932f6a9..411f03eb4523 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct smu_context
> *smu,
>  uint32_t mask);
>  static int smu_handle_task(struct smu_context *smu,
>  enum amd_dpm_forced_level level,
> -enum amd_pp_task task_id,
> -bool lock_needed);
> +enum amd_pp_task task_id);
>  static int smu_reset(struct smu_context *smu);  static int
> smu_set_fan_speed_pwm(void *handle, u32 speed);  static int
> smu_set_fan_control_mode(void *handle, u32 value); @@ -68,36 +67,22
> @@ static int smu_sys_get_pp_feature_mask(void *handle,
>  char *buf)
>  {
>   struct smu_context *smu = handle;
> - int size = 0;
> 
>   if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
>   return -EOPNOTSUPP;
> 
> - mutex_lock(&smu->mutex);
> -
> - size = smu_get_pp_feature_mask(smu, buf);
> -
> - mutex_unlock(&smu->mutex);
> -
> - return size;
> + return smu_get_pp_feature_mask(smu, buf);
>  }
> 
>  static int smu_sys_set_pp_feature_mask(void *handle,
>  uint64_t new_mask)
>  {
>   struct smu_context *smu = handle;
> - int ret = 0;
> 
>   if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
>   return -EOPNOTSUPP;
> 
> - mutex_lock(&smu->mutex);
> -
> - ret = smu_set_pp_feature_mask(smu, new_mask);
> -
> - mutex_unlock(&smu->mutex);
> -
> - return ret;
> + return smu_set_pp_feature_mask(smu, new_mask);
>  }
> 
>  int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value) @@
> -117,16 +102,12 @@ int smu_set_soft_freq_range(struct smu_context *smu,
> {
>   int ret = 0;
> 
> - mutex_lock(&smu->mutex);
> -
>   if (smu->ppt_funcs->set_soft_freq_limited_range)
>   ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
> clk_type,
> min,
> max);
> 
> - mutex_unlock(&smu->mutex);
> -
>   return ret;
>  }
> 
> @@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct smu_context
> *smu,
>   if (!min && !max)
>   return -EINVAL;
> 
> - mutex_lock(&smu->mutex);
> -
>   if (smu->ppt_funcs->get_dpm_ultimate_freq)
>   ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
>   clk_type,
>   min,
>   max);
> 
> - mutex_unlock(&smu->mutex);
> -
>   return ret;
>  }
> 
> @@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,  {
>   struct smu_context *smu = handle;
>   struct s

RE: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Quan, Evan
[AMD Official Use Only]



> -Original Message-
> From: Lazar, Lijo 
> Sent: Thursday, January 20, 2022 11:23 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Chen, Guchun
> 
> Subject: Re: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection
> smu->mutex
> 
> 
> 
> On 1/17/2022 11:11 AM, Evan Quan wrote:
> > As all those APIs are already protected either by adev->pm.mutex or
> > smu->message_lock.
> >
> > Signed-off-by: Evan Quan 
> > Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
> > ---
> >   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 315 ++---
> -
> >   drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
> >   .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   2 -
> >   .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   2 -
> >   .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   2 -
> >   .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   2 -
> >   6 files changed, 25 insertions(+), 299 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > index 828cb932f6a9..411f03eb4523 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > @@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct
> smu_context *smu,
> >uint32_t mask);
> >   static int smu_handle_task(struct smu_context *smu,
> >enum amd_dpm_forced_level level,
> > -  enum amd_pp_task task_id,
> > -  bool lock_needed);
> > +  enum amd_pp_task task_id);
> >   static int smu_reset(struct smu_context *smu);
> >   static int smu_set_fan_speed_pwm(void *handle, u32 speed);
> >   static int smu_set_fan_control_mode(void *handle, u32 value); @@
> > -68,36 +67,22 @@ static int smu_sys_get_pp_feature_mask(void *handle,
> >char *buf)
> >   {
> > struct smu_context *smu = handle;
> > -   int size = 0;
> >
> > if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
> > return -EOPNOTSUPP;
> >
> > -   mutex_lock(&smu->mutex);
> > -
> > -   size = smu_get_pp_feature_mask(smu, buf);
> > -
> > -   mutex_unlock(&smu->mutex);
> > -
> > -   return size;
> > +   return smu_get_pp_feature_mask(smu, buf);
> >   }
> >
> >   static int smu_sys_set_pp_feature_mask(void *handle,
> >uint64_t new_mask)
> >   {
> > struct smu_context *smu = handle;
> > -   int ret = 0;
> >
> > if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
> > return -EOPNOTSUPP;
> >
> > -   mutex_lock(&smu->mutex);
> > -
> > -   ret = smu_set_pp_feature_mask(smu, new_mask);
> > -
> > -   mutex_unlock(&smu->mutex);
> > -
> > -   return ret;
> > +   return smu_set_pp_feature_mask(smu, new_mask);
> >   }
> >
> >   int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value)
> > @@ -117,16 +102,12 @@ int smu_set_soft_freq_range(struct
> smu_context *smu,
> >   {
> > int ret = 0;
> >
> > -   mutex_lock(&smu->mutex);
> > -
> > if (smu->ppt_funcs->set_soft_freq_limited_range)
> > ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
> >   clk_type,
> >   min,
> >   max);
> >
> > -   mutex_unlock(&smu->mutex);
> > -
> > return ret;
> >   }
> >
> > @@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct
> smu_context *smu,
> > if (!min && !max)
> > return -EINVAL;
> >
> > -   mutex_lock(&smu->mutex);
> > -
> > if (smu->ppt_funcs->get_dpm_ultimate_freq)
> > ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
> > clk_type,
> > min,
> > max);
> >
> > -   mutex_unlock(&smu->mutex);
> > -
> > return ret;
> >   }
> >
> > @@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,
> >   {
> > struct smu_context *smu = handle;
> > struct smu_table_context *smu_table = &smu->smu_table;
> > -   uint32_t powerplay_table_size;
> >
> > if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
> > return -EOPNOTSUPP;
> > @@ -490,18 +466,12 @@ static int smu_sys_get_pp_table(void *handle,
> > if (!smu_table->power_play_table && !smu_table-
> >hardcode_pptable)
> > return -EINVAL;
> >
> > -   mutex_lock(&smu->mutex);
> > -
> > if (smu_table->hardcode_pptable)
> > *table = smu_table->hardcode_pptable;
> > else
> > *table = smu_table->power_play_table;
> >
> > -   powerplay_table_size = smu_table->power_play_table_size;
> > -
> > -   mutex_unlock(&smu->mutex);
> > -
> > -   return powerplay_table_size;
> > +   return smu_table->power_play_ta

[PATCH V3 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Evan Quan
As all those APIs are already protected either by adev->pm.mutex
or smu->message_lock.

Signed-off-by: Evan Quan 
Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
--
v1->v2:
  - optimize the "!smu_table->hardcode_pptable" check(Guchun)
  - add the lock protection(adev->pm.mutex) for i2c transfer(Lijo)
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 316 ++
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   4 +-
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   4 +-
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   4 +-
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   4 +-
 6 files changed, 34 insertions(+), 299 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 828cb932f6a9..eaaa5b033d46 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct smu_context *smu,
   uint32_t mask);
 static int smu_handle_task(struct smu_context *smu,
   enum amd_dpm_forced_level level,
-  enum amd_pp_task task_id,
-  bool lock_needed);
+  enum amd_pp_task task_id);
 static int smu_reset(struct smu_context *smu);
 static int smu_set_fan_speed_pwm(void *handle, u32 speed);
 static int smu_set_fan_control_mode(void *handle, u32 value);
@@ -68,36 +67,22 @@ static int smu_sys_get_pp_feature_mask(void *handle,
   char *buf)
 {
struct smu_context *smu = handle;
-   int size = 0;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
 
-   mutex_lock(&smu->mutex);
-
-   size = smu_get_pp_feature_mask(smu, buf);
-
-   mutex_unlock(&smu->mutex);
-
-   return size;
+   return smu_get_pp_feature_mask(smu, buf);
 }
 
 static int smu_sys_set_pp_feature_mask(void *handle,
   uint64_t new_mask)
 {
struct smu_context *smu = handle;
-   int ret = 0;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
 
-   mutex_lock(&smu->mutex);
-
-   ret = smu_set_pp_feature_mask(smu, new_mask);
-
-   mutex_unlock(&smu->mutex);
-
-   return ret;
+   return smu_set_pp_feature_mask(smu, new_mask);
 }
 
 int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value)
@@ -117,16 +102,12 @@ int smu_set_soft_freq_range(struct smu_context *smu,
 {
int ret = 0;
 
-   mutex_lock(&smu->mutex);
-
if (smu->ppt_funcs->set_soft_freq_limited_range)
ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
  clk_type,
  min,
  max);
 
-   mutex_unlock(&smu->mutex);
-
return ret;
 }
 
@@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct smu_context *smu,
if (!min && !max)
return -EINVAL;
 
-   mutex_lock(&smu->mutex);
-
if (smu->ppt_funcs->get_dpm_ultimate_freq)
ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
clk_type,
min,
max);
 
-   mutex_unlock(&smu->mutex);
-
return ret;
 }
 
@@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,
 {
struct smu_context *smu = handle;
struct smu_table_context *smu_table = &smu->smu_table;
-   uint32_t powerplay_table_size;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
@@ -490,18 +466,12 @@ static int smu_sys_get_pp_table(void *handle,
if (!smu_table->power_play_table && !smu_table->hardcode_pptable)
return -EINVAL;
 
-   mutex_lock(&smu->mutex);
-
if (smu_table->hardcode_pptable)
*table = smu_table->hardcode_pptable;
else
*table = smu_table->power_play_table;
 
-   powerplay_table_size = smu_table->power_play_table_size;
-
-   mutex_unlock(&smu->mutex);
-
-   return powerplay_table_size;
+   return smu_table->power_play_table_size;
 }
 
 static int smu_sys_set_pp_table(void *handle,
@@ -521,12 +491,10 @@ static int smu_sys_set_pp_table(void *handle,
return -EIO;
}
 
-   mutex_lock(&smu->mutex);
-   if (!smu_table->hardcode_pptable)
-   smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
if (!smu_table->hardcode_pptable) {
-   ret = -ENOMEM;
-   goto failed;
+   smu_table->hardcode_pptable = kzal
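The message above is truncated in the archive, but the idea of the series is clear: the per-call smu->mutex is dropped because every entry point is already serialized by adev->pm.mutex or smu->message_lock. A minimal user-space sketch of that pattern (the lock is modeled as a flag; the function names and the 0x2a feature mask are illustrative, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the locking change: instead of each smu_sys_* wrapper taking
 * smu->mutex, the caller-held adev->pm.mutex serializes all entry points. */
static bool pm_mutex_held;

static void pm_mutex_lock(void)   { assert(!pm_mutex_held); pm_mutex_held = true;  }
static void pm_mutex_unlock(void) { assert(pm_mutex_held);  pm_mutex_held = false; }

/* Inner helper: relies on the caller already holding adev->pm.mutex,
 * so it no longer takes (or needs) its own smu->mutex. */
static int smu_get_pp_feature_mask_model(void)
{
	assert(pm_mutex_held);
	return 0x2a;
}

/* Outer API: the single place the lock is taken. */
static int amdgpu_dpm_get_pp_feature_mask_model(void)
{
	int ret;

	pm_mutex_lock();
	ret = smu_get_pp_feature_mask_model();
	pm_mutex_unlock();
	return ret;
}
```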

RE: [PATCH] drm/amd/pm: use dev_*** to print output in multiple GPUs

2022-01-20 Thread Quan, Evan
[AMD Official Use Only]

Reviewed-by: Evan Quan 

> -Original Message-
> From: Chen, Guchun 
> Sent: Friday, January 21, 2022 10:47 AM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; Quan,
> Evan ; Lazar, Lijo 
> Cc: Chen, Guchun 
> Subject: [PATCH] drm/amd/pm: use dev_*** to print output in multiple
> GPUs
> 
> In a multi-GPU configuration, when sending an SMU message fails,
> it's hard to figure out which GPU has the problem, which is
> inconvenient for the user.
> 
> [40190.142181] amdgpu: [powerplay]
> last message was failed ret is 65535
> [40190.242420] amdgpu: [powerplay]
> failed to send message 201 ret is 65535
> [40190.392763] amdgpu: [powerplay]
> last message was failed ret is 65535
> [40190.492997] amdgpu: [powerplay]
> failed to send message 200 ret is 65535
> [40190.743575] amdgpu: [powerplay]
> last message was failed ret is 65535
> [40190.843812] amdgpu: [powerplay]
> failed to send message 282 ret is 65535
> 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c   |  4 +++-
>  .../gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c|  4 ++--
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c | 11
> +++
>  drivers/gpu/drm/amd/pm/powerplay/smumgr/smu9_smumgr.c |  2 +-
>  .../gpu/drm/amd/pm/powerplay/smumgr/vega20_smumgr.c   |  4 ++--
>  5 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> index 93a1c7248e26..5ca3c422f7d4 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
> @@ -208,6 +208,7 @@ static int ci_read_smc_sram_dword(struct pp_hwmgr
> *hwmgr, uint32_t smc_addr,
> 
>  static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
>  {
> + struct amdgpu_device *adev = hwmgr->adev;
>   int ret;
> 
>   cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
> @@ -218,7 +219,8 @@ static int ci_send_msg_to_smc(struct pp_hwmgr
> *hwmgr, uint16_t msg)
>   ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
> 
>   if (ret != 1)
> - pr_info("\n failed to send message %x ret is %d\n",  msg, ret);
> + dev_info(adev->dev,
> + "failed to send message %x ret is %d\n", msg,ret);
> 
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> index 47b34c6ca924..88a5641465dc 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
> @@ -87,7 +87,7 @@ static int smu10_send_msg_to_smc(struct pp_hwmgr
> *hwmgr, uint16_t msg)
>   smu10_send_msg_to_smc_without_waiting(hwmgr, msg);
> 
>   if (smu10_wait_for_response(hwmgr) == 0)
> - printk("Failed to send Message %x.\n", msg);
> + dev_err(adev->dev, "Failed to send Message %x.\n", msg);
> 
>   return 0;
>  }
> @@ -108,7 +108,7 @@ static int
> smu10_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
> 
> 
>   if (smu10_wait_for_response(hwmgr) == 0)
> - printk("Failed to send Message %x.\n", msg);
> + dev_err(adev->dev, "Failed to send Message %x.\n", msg);
> 
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> index aae25243eb10..5a010cd38303 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
> @@ -165,6 +165,7 @@ bool smu7_is_smc_ram_running(struct pp_hwmgr
> *hwmgr)
> 
>  int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
>  {
> + struct amdgpu_device *adev = hwmgr->adev;
>   int ret;
> 
>   PHM_WAIT_FIELD_UNEQUAL(hwmgr, SMC_RESP_0, SMC_RESP, 0);
> @@ -172,9 +173,10 @@ int smu7_send_msg_to_smc(struct pp_hwmgr
> *hwmgr, uint16_t msg)
>   ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
> 
>   if (ret == 0xFE)
> - pr_debug("last message was not supported\n");
> + dev_dbg(adev->dev, "last message was not supported\n");
>   else if (ret != 1)
> - pr_info("\n last message was failed ret is %d\n", ret);
> + dev_info(adev->dev,
> + "\nlast message was failed ret is %d\n", ret);
> 
>   cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
>   cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);
> @@ -184,9 +186,10 @@ int smu7_send_msg_to_smc(struct pp_hwmgr
> *hwmgr, uint16_t msg)
>   ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
> 
>   if (ret == 0xFE)
> - pr_debug("message %x was not supported\n", msg);
> + dev_dbg(ade

RE: [PATCH] drm/amdgpu: Fix kernel compilation; style

2022-01-20 Thread Liu, Shaoyun
[AMD Official Use Only]

Good catch. Thanks.
Reviewed-by: shaoyun.liu 

-Original Message-
From: Tuikov, Luben  
Sent: Thursday, January 20, 2022 6:52 PM
To: amd-gfx@lists.freedesktop.org
Cc: Tuikov, Luben ; Deucher, Alexander 
; Liu, Shaoyun ; Russell, Kent 

Subject: [PATCH] drm/amdgpu: Fix kernel compilation; style

Problem:
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c: In function 
‘is_fru_eeprom_supported’:
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c:47:3: error: expected ‘)’ before 
‘return’
   47 |   return false;
  |   ^~
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c:46:5: note: to match this ‘(’
   46 |  if (amdgpu_sriov_vf(adev)
  | ^

Fix kernel compilation:
if (amdgpu_sriov_vf(adev)
return false;
missing closing right parenthesis for the "if".

Fix style:
/* The i2c access is blocked on VF
 * TODO: Need other way to get the info
 */
Has white space after the closing */.

Cc: Alex Deucher 
Cc: shaoyunl 
Cc: Kent Russell 
Fixes: 824c2051039dfc ("drm/amdgpu: Disable FRU EEPROM access for SRIOV")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 0548e279cc9fc4..60e7e637eaa33d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -42,8 +42,8 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev)
 
/* The i2c access is blocked on VF
 * TODO: Need other way to get the info
-*/  
-   if (amdgpu_sriov_vf(adev)
+*/
+   if (amdgpu_sriov_vf(adev))
return false;
 
/* VBIOS is of the format ###-DXXXYY-##. For SKU identification,

base-commit: 2e8e13b0a6794f3ddae0ddcd13eedb64de94f0fd
-- 
2.34.0


[PATCH] drm/amd/pm: use dev_*** to print output in multiple GPUs

2022-01-20 Thread Guchun Chen
In a multi-GPU configuration, when sending an SMU message fails,
it's hard to figure out which GPU has the problem, which is
inconvenient for the user.

[40190.142181] amdgpu: [powerplay]
last message was failed ret is 65535
[40190.242420] amdgpu: [powerplay]
failed to send message 201 ret is 65535
[40190.392763] amdgpu: [powerplay]
last message was failed ret is 65535
[40190.492997] amdgpu: [powerplay]
failed to send message 200 ret is 65535
[40190.743575] amdgpu: [powerplay]
last message was failed ret is 65535
[40190.843812] amdgpu: [powerplay]
failed to send message 282 ret is 65535

Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c   |  4 +++-
 .../gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c|  4 ++--
 drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c | 11 +++
 drivers/gpu/drm/amd/pm/powerplay/smumgr/smu9_smumgr.c |  2 +-
 .../gpu/drm/amd/pm/powerplay/smumgr/vega20_smumgr.c   |  4 ++--
 5 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
index 93a1c7248e26..5ca3c422f7d4 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/ci_smumgr.c
@@ -208,6 +208,7 @@ static int ci_read_smc_sram_dword(struct pp_hwmgr *hwmgr, 
uint32_t smc_addr,
 
 static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
 {
+   struct amdgpu_device *adev = hwmgr->adev;
int ret;
 
cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
@@ -218,7 +219,8 @@ static int ci_send_msg_to_smc(struct pp_hwmgr *hwmgr, 
uint16_t msg)
ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
 
if (ret != 1)
-   pr_info("\n failed to send message %x ret is %d\n",  msg, ret);
+   dev_info(adev->dev,
+   "failed to send message %x ret is %d\n", msg,ret);
 
return 0;
 }
diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
index 47b34c6ca924..88a5641465dc 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu10_smumgr.c
@@ -87,7 +87,7 @@ static int smu10_send_msg_to_smc(struct pp_hwmgr *hwmgr, 
uint16_t msg)
smu10_send_msg_to_smc_without_waiting(hwmgr, msg);
 
if (smu10_wait_for_response(hwmgr) == 0)
-   printk("Failed to send Message %x.\n", msg);
+   dev_err(adev->dev, "Failed to send Message %x.\n", msg);
 
return 0;
 }
@@ -108,7 +108,7 @@ static int smu10_send_msg_to_smc_with_parameter(struct 
pp_hwmgr *hwmgr,
 
 
if (smu10_wait_for_response(hwmgr) == 0)
-   printk("Failed to send Message %x.\n", msg);
+   dev_err(adev->dev, "Failed to send Message %x.\n", msg);
 
return 0;
 }
diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
index aae25243eb10..5a010cd38303 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
@@ -165,6 +165,7 @@ bool smu7_is_smc_ram_running(struct pp_hwmgr *hwmgr)
 
 int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
 {
+   struct amdgpu_device *adev = hwmgr->adev;
int ret;
 
PHM_WAIT_FIELD_UNEQUAL(hwmgr, SMC_RESP_0, SMC_RESP, 0);
@@ -172,9 +173,10 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
msg)
ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
 
if (ret == 0xFE)
-   pr_debug("last message was not supported\n");
+   dev_dbg(adev->dev, "last message was not supported\n");
else if (ret != 1)
-   pr_info("\n last message was failed ret is %d\n", ret);
+   dev_info(adev->dev,
+   "\nlast message was failed ret is %d\n", ret);
 
cgs_write_register(hwmgr->device, mmSMC_RESP_0, 0);
cgs_write_register(hwmgr->device, mmSMC_MESSAGE_0, msg);
@@ -184,9 +186,10 @@ int smu7_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
msg)
ret = PHM_READ_FIELD(hwmgr->device, SMC_RESP_0, SMC_RESP);
 
if (ret == 0xFE)
-   pr_debug("message %x was not supported\n", msg);
+   dev_dbg(adev->dev, "message %x was not supported\n", msg);
else if (ret != 1)
-   pr_info("\n failed to send message %x ret is %d \n",  msg, ret);
+   dev_dbg(adev->dev,
+   "failed to send message %x ret is %d \n",  msg, ret);
 
return 0;
 }
diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu9_smumgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu9_smumgr.c
index 23e5de3c4ec1..8c9bf4940dc1 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/smum
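The archive truncates the smu9 hunk above, but the conversion pattern is uniform: pr_info()/printk() become dev_info()/dev_err() so the log line carries the device name. A user-space model of why that helps on multi-GPU systems (the struct, the PCI address and the prefix format are illustrative — real dev_info() formatting comes from the driver core):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for struct device: only the name matters for this sketch. */
struct fake_dev {
	const char *name;
};

/* Model of dev_info(): prefix the message with driver and device name,
 * so logs from several GPUs can be told apart. */
static int dev_info_model(const struct fake_dev *dev, char *buf, size_t n,
			  unsigned int msg, int ret)
{
	return snprintf(buf, n,
			"amdgpu %s: failed to send message %x ret is %d",
			dev->name, msg, ret);
}
```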

Re: [PATCH] drm/amdgpu: enable amdgpu_dc module parameter

2022-01-20 Thread Lang Yu
On 01/20/ , Alex Deucher wrote:
> On Thu, Jan 20, 2022 at 1:25 AM Lang Yu  wrote:
> >
> > It doesn't work under IP discovery mode. Make it work!
> >
> > Signed-off-by: Lang Yu 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 10 --
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > index 07965ac6381b..1ad137499e38 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> > @@ -846,8 +846,14 @@ static int 
> > amdgpu_discovery_set_display_ip_blocks(struct amdgpu_device *adev)
> >  {
> > if (adev->enable_virtual_display || amdgpu_sriov_vf(adev)) {
> > amdgpu_device_ip_block_add(adev, &amdgpu_vkms_ip_block);
> > +   return 0;
> > +   }
> > +
> > +   if (!amdgpu_device_has_dc_support(adev))
> > +   return 0;
> > +
> >  #if defined(CONFIG_DRM_AMD_DC)
> > -   } else if (adev->ip_versions[DCE_HWIP][0]) {
> > +   if (adev->ip_versions[DCE_HWIP][0]) {
> > switch (adev->ip_versions[DCE_HWIP][0]) {
> > case IP_VERSION(1, 0, 0):
> > case IP_VERSION(1, 0, 1):
> > @@ -882,9 +888,9 @@ static int 
> > amdgpu_discovery_set_display_ip_blocks(struct amdgpu_device *adev)
> > adev->ip_versions[DCI_HWIP][0]);
> > return -EINVAL;
> > }
> > -#endif
> > }
> > return 0;
> > +#endif
> 
> I think the compiler will complain about this.  If you move the #endif
> before the return, the patch is:
> Reviewed-by: Alex Deucher 

Thanks. I got it.

Regards,
Lang

> >  }
> >
> >  static int amdgpu_discovery_set_gc_ip_blocks(struct amdgpu_device *adev)
> > --
> > 2.25.1
> >


RE: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Chen, Guchun
[Public]

How about changing the code from "if (WARN_ON(!adev->gart.ptr))" to "if 
(!adev->gart.ptr)"? The purpose is to drop the warning only; other 
functionality stays the same.

Regards,
Guchun

-Original Message-
From: Koenig, Christian  
Sent: Thursday, January 20, 2022 11:47 PM
To: Kim, Jonathan ; Chen, Guchun ; 
Christian König ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

I think we could remove this warning as well; all we need to do is make sure 
that the GART table is always restored from the metadata before it is enabled 
in hardware.

I've seen that we do this anyway for most hardware generations, but we really 
need to double check.

Christian.

Am 20.01.22 um 16:04 schrieb Kim, Jonathan:
> [Public]
>
> Switching to a VRAM bounce buffer can drop performance around 4x~6x on Vega20 
> over larger accesses, so it's not desired.
>
> Jon
>
>> -Original Message-
>> From: Koenig, Christian 
>> Sent: January 20, 2022 9:10 AM
>> To: Chen, Guchun ; Christian König 
>> ; Kim, Jonathan 
>> ; amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>>
>> I actually suggested allocating the bounce buffer in VRAM, but that 
>> adds a bit more latency.
>>
>> Christian.
>>
>> Am 20.01.22 um 15:00 schrieb Chen, Guchun:
>>> [Public]
>>>
>>> Hi Christian,
>>>
>>> Unfortunately, your patch brings another warning from the same
>>> sdma_access_bo's creation in amdgpu_ttm_init.
>>> In your patch, you introduce a new check of WARN_ON(!adev->gart.ptr)),
>>> however, sdma_access_bo is created in the GTT domain, but
>>> adev->gart.ptr is only ready after gmc_v10_0_gart_init.
>>>
>>> Hi Jonathan,
>>>
>>> Is it mandatory to create this sdma_access_bo in the GTT domain? Can we
>>> change it to VRAM?
>>> Regards,
>>> Guchun
>>>
>>> -Original Message-
>>> From: Koenig, Christian 
>>> Sent: Wednesday, January 19, 2022 10:38 PM
>>> To: Chen, Guchun ; Christian König 
>>> ; Kim, Jonathan 
>>> ; amd-gfx@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>>>
>>> Hi Guchun,
>>>
>>> yes, just haven't found time to do this yet.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 19.01.22 um 15:24 schrieb Chen, Guchun:
 [Public]

 Hello Christian,

 Do you plan to submit your code to drm-next branch?

 Regards,
 Guchun

 -Original Message-
 From: Chen, Guchun
 Sent: Tuesday, January 18, 2022 10:22 PM
 To: 'Christian König' ; Kim, 
 Jonathan ; amd-gfx@lists.freedesktop.org
 Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag

 [Public]

 Thanks for the clarification. The patch is:
 Reviewed-by: Guchun Chen 

 Regards,
 Guchun

 -Original Message-
 From: Christian König 
 Sent: Tuesday, January 18, 2022 10:10 PM
 To: Chen, Guchun ; Kim, Jonathan 
 ; amd-gfx@lists.freedesktop.org
 Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

 Am 18.01.22 um 14:28 schrieb Chen, Guchun:
> [Public]
>
> - if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
> - goto skip_pin_bo;
> -
> - r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
> - if (r)
> - return r;
> -
> -skip_pin_bo:
>
> Does deleting skip_pin_bo path cause bo redundant pin in SRIOV case?
 Pinning/unpinning the BO was already removed as well.

 See Nirmoy's patches in the git log.

 Regards,
 Christian.

> Regards,
> Guchun
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, January 18, 2022 8:02 PM
> To: Chen, Guchun ; Kim, Jonathan 
> ; amd-gfx@lists.freedesktop.org
> Subject: [PATCH] drm/amdgpu: remove gart.ready flag
>
> That's just a leftover from old radeon days and was preventing CS and
> GART bindings before the hardware was initialized. But nowadays that
> is perfectly valid.
> The only thing we need to warn about are GART bindings before the
> table is even allocated.
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++-
>> --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  8 +--
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  8 +--
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 10 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  5 +-
>  11 files changed, 52 insertions(+), 137 deletions(-)
>
> diff --git a/drivers/gpu/drm/
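The diff above is truncated in the archive, but the discussion pins down the intended behavior: with the gart.ready flag removed, the only case left to guard is a GART binding before the table is allocated, and per the follow-ups that check should skip quietly rather than WARN_ON(), since the GTT-domain sdma_access_bo is created before gmc_v10_0_gart_init. A tiny user-space sketch of that guard (names simplified, not the driver's actual code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the GART state: only the CPU mapping of the table matters. */
struct gart_model {
	void *ptr;	/* NULL until the table is allocated */
};

/* Model of the binding path: before the table exists, return quietly
 * instead of warning -- bindings are replayed from metadata when the
 * table is enabled in hardware. */
static bool gart_bind_model(struct gart_model *g)
{
	if (!g->ptr)		/* table not allocated yet: silently skip */
		return false;
	/* ... would write PTEs through g->ptr here ... */
	return true;
}
```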

RE: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in dp_wait_for_training_aux_rd_interval

2022-01-20 Thread Chen, Guchun
[Public]

If we change the if condition, what about the division "wait_in_micro_secs/1000"? 
As the sleep time is smaller now, shall we adjust it as well?

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Friday, January 21, 2022 2:04 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in 
dp_wait_for_training_aux_rd_interval

Some architectures (e.g., ARM) have relatively low udelay limits.
On most architectures, anything longer than 2000us is not recommended.
Change the check to align with other similar checks in DC.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 1f8831156bc4..aa1c67c3c386 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -202,7 +202,7 @@ void dp_wait_for_training_aux_rd_interval(
uint32_t wait_in_micro_secs)
 {
 #if defined(CONFIG_DRM_AMD_DC_DCN)
-   if (wait_in_micro_secs > 16000)
+   if (wait_in_micro_secs > 1000)
msleep(wait_in_micro_secs/1000);
else
udelay(wait_in_micro_secs);
-- 
2.34.1
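A user-space model of the sleep selection after this patch; it also shows the truncation Guchun asks about in the follow-up — 1500us now falls into the msleep() branch and sleeps only 1ms. The enum and helper are illustrative; the kernel's msleep() additionally rounds up to whole jiffies.

```c
#include <assert.h>

enum delay_kind { USE_UDELAY, USE_MSLEEP };

/* Model of dp_wait_for_training_aux_rd_interval()'s branch after the
 * patch: anything above 1000us goes to msleep(us / 1000), everything
 * else busy-waits with udelay(). */
static enum delay_kind pick_delay(unsigned int us, unsigned int *ms)
{
	if (us > 1000) {
		*ms = us / 1000;	/* integer division truncates */
		return USE_MSLEEP;
	}
	return USE_UDELAY;
}
```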


RE: [PATCH] drm/amdgpu: Fix kernel compilation; style

2022-01-20 Thread Chen, Guchun
[Public]

Reviewed-by: Guchun Chen 


Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Luben Tuikov
Sent: Friday, January 21, 2022 7:52 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Tuikov, Luben 
; Russell, Kent ; Liu, Shaoyun 

Subject: [PATCH] drm/amdgpu: Fix kernel compilation; style

Problem:
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c: In function 
‘is_fru_eeprom_supported’:
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c:47:3: error: expected ‘)’ before 
‘return’
   47 |   return false;
  |   ^~
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c:46:5: note: to match this ‘(’
   46 |  if (amdgpu_sriov_vf(adev)
  | ^

Fix kernel compilation:
if (amdgpu_sriov_vf(adev)
return false;
missing closing right parenthesis for the "if".

Fix style:
/* The i2c access is blocked on VF
 * TODO: Need other way to get the info
 */
Has white space after the closing */.

Cc: Alex Deucher 
Cc: shaoyunl 
Cc: Kent Russell 
Fixes: 824c2051039dfc ("drm/amdgpu: Disable FRU EEPROM access for SRIOV")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 0548e279cc9fc4..60e7e637eaa33d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -42,8 +42,8 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev)
 
/* The i2c access is blocked on VF
 * TODO: Need other way to get the info
-*/  
-   if (amdgpu_sriov_vf(adev)
+*/
+   if (amdgpu_sriov_vf(adev))
return false;
 
/* VBIOS is of the format ###-DXXXYY-##. For SKU identification,

base-commit: 2e8e13b0a6794f3ddae0ddcd13eedb64de94f0fd
-- 
2.34.0


RE: [PATCH] drm/amdgpu: Disable FRU EEPROM access for SRIOV

2022-01-20 Thread Chen, Guchun
[Public]

Acked-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of shaoyunl
Sent: Thursday, January 20, 2022 11:49 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun 
Subject: [PATCH] drm/amdgpu: Disable FRU EEPROM access for SRIOV

VF access to the EEPROM is blocked by security policy; we might need another 
way to get SKU info for the VF.

Signed-off-by: shaoyunl 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 2a786e788627..0548e279cc9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -40,6 +40,12 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev)
 */
struct atom_context *atom_ctx = adev->mode_info.atom_context;
 
+   /* The i2c access is blocked on VF
+* TODO: Need other way to get the info
+*/  
+   if (amdgpu_sriov_vf(adev)
+   return false;
+
/* VBIOS is of the format ###-DXXXYY-##. For SKU identification,
 * we can use just the "DXXX" portion. If there were more models, we
 * could convert the 3 characters to a hex integer and use a switch
--
2.17.1


[PATCH] drm/amdgpu: Fix kernel compilation; style

2022-01-20 Thread Luben Tuikov
Problem:
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c: In function 
‘is_fru_eeprom_supported’:
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c:47:3: error: expected ‘)’ before 
‘return’
   47 |   return false;
  |   ^~
drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c:46:5: note: to match this ‘(’
   46 |  if (amdgpu_sriov_vf(adev)
  | ^

Fix kernel compilation:
if (amdgpu_sriov_vf(adev)
return false;
missing closing right parenthesis for the "if".

Fix style:
/* The i2c access is blocked on VF
 * TODO: Need other way to get the info
 */
Has white space after the closing */.

Cc: Alex Deucher 
Cc: shaoyunl 
Cc: Kent Russell 
Fixes: 824c2051039dfc ("drm/amdgpu: Disable FRU EEPROM access for SRIOV")
Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 0548e279cc9fc4..60e7e637eaa33d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -42,8 +42,8 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev)
 
/* The i2c access is blocked on VF
 * TODO: Need other way to get the info
-*/  
-   if (amdgpu_sriov_vf(adev)
+*/
+   if (amdgpu_sriov_vf(adev))
return false;
 
/* VBIOS is of the format ###-DXXXYY-##. For SKU identification,

base-commit: 2e8e13b0a6794f3ddae0ddcd13eedb64de94f0fd
-- 
2.34.0



[PATCH v2 3/8] drm/amdkfd: Enable per process SMI event

2022-01-20 Thread Philip Yang
A process receives events from the same process by default. Add a flag to
allow receiving events from all processes; this requires superuser
permission.

Events posted with pid 0 are sent to all processes, to keep the default
behavior of the existing SMI events.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 29 -
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index 18ed1b72f0f7..68c93701c5f7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -37,6 +37,8 @@ struct kfd_smi_client {
uint64_t events;
struct kfd_dev *dev;
spinlock_t lock;
+   pid_t pid;
+   bool suser;
 };
 
 #define MAX_KFIFO_SIZE 1024
@@ -150,16 +152,27 @@ static int kfd_smi_ev_release(struct inode *inode, struct 
file *filep)
return 0;
 }
 
-static void add_event_to_kfifo(struct kfd_dev *dev, unsigned int smi_event,
- char *event_msg, int len)
+static bool kfd_smi_ev_enabled(pid_t pid, struct kfd_smi_client *client,
+  unsigned int event)
+{
+   uint64_t all = KFD_SMI_EVENT_MASK_FROM_INDEX(KFD_SMI_EVENT_ALL_PROCESS);
+   uint64_t events = READ_ONCE(client->events);
+
+   if (pid && client->pid != pid && !(client->suser && (events & all)))
+   return false;
+
+   return events & KFD_SMI_EVENT_MASK_FROM_INDEX(event);
+}
+
+static void add_event_to_kfifo(pid_t pid, struct kfd_dev *dev,
+  unsigned int smi_event, char *event_msg, int len)
 {
struct kfd_smi_client *client;
 
rcu_read_lock();
 
list_for_each_entry_rcu(client, &dev->smi_clients, list) {
-   if (!(READ_ONCE(client->events) &
-   KFD_SMI_EVENT_MASK_FROM_INDEX(smi_event)))
+   if (!kfd_smi_ev_enabled(pid, client, smi_event))
continue;
spin_lock(&client->lock);
if (kfifo_avail(&client->fifo) >= len) {
@@ -202,7 +215,7 @@ void kfd_smi_event_update_gpu_reset(struct kfd_dev *dev, 
bool post_reset)
len = snprintf(fifo_in, sizeof(fifo_in), "%x %x\n", event,
dev->reset_seq_num);
 
-   add_event_to_kfifo(dev, event, fifo_in, len);
+   add_event_to_kfifo(0, dev, event, fifo_in, len);
 }
 
 void kfd_smi_event_update_thermal_throttling(struct kfd_dev *dev,
@@ -225,7 +238,7 @@ void kfd_smi_event_update_thermal_throttling(struct kfd_dev 
*dev,
   KFD_SMI_EVENT_THERMAL_THROTTLE, throttle_bitmask,
   amdgpu_dpm_get_thermal_throttling_counter(dev->adev));
 
-   add_event_to_kfifo(dev, KFD_SMI_EVENT_THERMAL_THROTTLE, fifo_in, len);
+   add_event_to_kfifo(0, dev, KFD_SMI_EVENT_THERMAL_THROTTLE, fifo_in, len);
 }
 
 void kfd_smi_event_update_vmfault(struct kfd_dev *dev, uint16_t pasid)
@@ -250,7 +263,7 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, 
uint16_t pasid)
len = snprintf(fifo_in, sizeof(fifo_in), "%x %x:%s\n", KFD_SMI_EVENT_VMFAULT,
task_info.pid, task_info.task_name);
 
-   add_event_to_kfifo(dev, KFD_SMI_EVENT_VMFAULT, fifo_in, len);
+   add_event_to_kfifo(0, dev, KFD_SMI_EVENT_VMFAULT, fifo_in, len);
 }
 
 int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
@@ -282,6 +295,8 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
spin_lock_init(&client->lock);
client->events = 0;
client->dev = dev;
+   client->pid = current->pid;
+   client->suser = capable(CAP_SYS_ADMIN);
 
spin_lock(&dev->smi_lock);
list_add_rcu(&client->list, &dev->smi_clients);
-- 
2.17.1
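The filtering rule this patch adds in kfd_smi_ev_enabled() — deliver an event to a client only if it came from the client's own process, unless the client is privileged and subscribed to KFD_SMI_EVENT_ALL_PROCESS, with pid 0 broadcasting to everyone — can be sketched as a standalone user-space C function (a hypothetical rewrite for illustration, not the kernel code itself):

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

#define KFD_SMI_EVENT_ALL_PROCESS 64
#define KFD_SMI_EVENT_MASK_FROM_INDEX(i) (1ULL << ((i) - 1))

struct smi_client {
	uint64_t events; /* bitmask of subscribed event IDs */
	pid_t pid;       /* pid of the process that opened the event fd */
	bool suser;      /* opened with CAP_SYS_ADMIN */
};

/* Mirrors kfd_smi_ev_enabled(): pid == 0 means broadcast to everyone;
 * otherwise only the originating process receives the event, unless the
 * client is privileged and has set the ALL_PROCESS flag bit. */
static bool smi_ev_enabled(pid_t pid, const struct smi_client *client,
			   unsigned int event)
{
	uint64_t all = KFD_SMI_EVENT_MASK_FROM_INDEX(KFD_SMI_EVENT_ALL_PROCESS);

	if (pid && client->pid != pid &&
	    !(client->suser && (client->events & all)))
		return false;

	return client->events & KFD_SMI_EVENT_MASK_FROM_INDEX(event);
}
```

Note that a privileged client still only widens its scope by explicitly setting the ALL_PROCESS bit; holding CAP_SYS_ADMIN alone changes nothing.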



[PATCH v2 6/8] drm/amdkfd: Add user queue eviction restore SMI event

2022-01-20 Thread Philip Yang
Output user queue eviction and restore events. User queue eviction may
be triggered by migration, an MMU notifier, TTM eviction or device
suspend.

User queue restore may be rescheduled if another eviction happens while
the restore is in progress.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  7 +++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 11 --
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 37 +--
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   | 34 +
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h   |  4 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  | 16 ++--
 8 files changed, 101 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index ac841ae8f5cc..bd3301e2c682 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -309,6 +309,7 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device 
*adev,
  */
 void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo);
 void amdgpu_amdkfd_reserve_system_mem(uint64_t size);
+void kfd_process_smi_event_restore_rescheduled(struct mm_struct *mm);
 #else
 static inline
 void amdgpu_amdkfd_gpuvm_init_mem_limits(void)
@@ -325,9 +326,13 @@ static inline
 void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo)
 {
 }
+
+static inline void kfd_process_smi_event_restore_rescheduled(struct mm_struct *mm)
+{
+}
 #endif
 /* KGD2KFD callbacks */
-int kgd2kfd_quiesce_mm(struct mm_struct *mm);
+int kgd2kfd_quiesce_mm(struct mm_struct *mm, uint32_t trigger);
 int kgd2kfd_resume_mm(struct mm_struct *mm);
 int kgd2kfd_schedule_evict_and_restore_process(struct mm_struct *mm,
struct dma_fence *fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 5df387c4d7fb..c44e8dc0d869 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2066,7 +2066,7 @@ int amdgpu_amdkfd_evict_userptr(struct kgd_mem *mem,
evicted_bos = atomic_inc_return(&process_info->evicted_bos);
if (evicted_bos == 1) {
/* First eviction, stop the queues */
-   r = kgd2kfd_quiesce_mm(mm);
+   r = kgd2kfd_quiesce_mm(mm, USERPTR_EVICTION);
if (r)
pr_err("Failed to quiesce KFD\n");
schedule_delayed_work(&process_info->restore_userptr_work,
@@ -2340,13 +2340,16 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct 
work_struct *work)
 
 unlock_out:
mutex_unlock(&process_info->lock);
-   mmput(mm);
-   put_task_struct(usertask);
 
/* If validation failed, reschedule another attempt */
-   if (evicted_bos)
+   if (evicted_bos) {
schedule_delayed_work(&process_info->restore_userptr_work,
msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
+
+   kfd_process_smi_event_restore_rescheduled(mm);
+   }
+   mmput(mm);
+   put_task_struct(usertask);
 }
 
 /** amdgpu_amdkfd_gpuvm_restore_process_bos - Restore all BOs for the given
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 5a47f437b455..ffaa80447d9c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -783,7 +783,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void 
*ih_ring_entry)
spin_unlock_irqrestore(&kfd->interrupt_lock, flags);
 }
 
-int kgd2kfd_quiesce_mm(struct mm_struct *mm)
+int kgd2kfd_quiesce_mm(struct mm_struct *mm, uint32_t trigger)
 {
struct kfd_process *p;
int r;
@@ -797,7 +797,7 @@ int kgd2kfd_quiesce_mm(struct mm_struct *mm)
return -ESRCH;
 
WARN(debug_evictions, "Evicting pid %d", p->lead_thread->pid);
-   r = kfd_process_evict_queues(p);
+   r = kfd_process_evict_queues(p, trigger);
 
kfd_unref_process(p);
return r;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index ea68f3b3a4e9..39519084df78 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -906,7 +906,7 @@ static inline struct kfd_process_device 
*kfd_process_device_from_gpuidx(
 }
 
 void kfd_unref_process(struct kfd_process *p);
-int kfd_process_evict_queues(struct kfd_process *p);
+int kfd_process_evict_queues(struct kfd_process *p, uint32_t trigger);
 int kfd_process_restore_queues(struct kfd_process *p);
 void kfd_suspend_all_processes(void);
 int kfd_resume_all_processes(void);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 74f162887d3b..e4ba4d537b3c 100644
--- a/drivers/gpu/drm/amd/amd

[PATCH v2 5/8] drm/amdkfd: add migration SMI event

2022-01-20 Thread Philip Yang
After migration finishes, output the timestamp when the migration
started, the duration of the migration, the SVM range address and size,
the GPU IDs of the migration source and destination, and the SVM range
attributes.

The migration trigger may be prefetch, a CPU or GPU page fault, or TTM
eviction.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c| 67 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h|  5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 29 +
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h |  5 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c| 16 +++--
 5 files changed, 91 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 88db82b3d443..06fb888f87aa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -32,6 +32,7 @@
 #include "kfd_priv.h"
 #include "kfd_svm.h"
 #include "kfd_migrate.h"
+#include "kfd_smi_events.h"
 
 #ifdef dev_fmt
 #undef dev_fmt
@@ -402,10 +403,11 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, 
struct svm_range *prange,
 static long
 svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
struct vm_area_struct *vma, uint64_t start,
-   uint64_t end)
+   uint64_t end, uint32_t trigger)
 {
uint64_t npages = (end - start) >> PAGE_SHIFT;
-   struct kfd_process_device *pdd;
+   struct kfd_process_device *pdd = NULL;
+   uint64_t timestamp = ktime_get_boottime_ns();
struct dma_fence *mfence = NULL;
struct migrate_vma migrate;
unsigned long cpages = 0;
@@ -431,6 +433,10 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.dst = migrate.src + npages;
scratch = (dma_addr_t *)(migrate.dst + npages);
 
+   pdd = svm_range_get_pdd_by_adev(prange, adev);
+   if (!pdd)
+   goto out_free;
+
r = migrate_vma_setup(&migrate);
if (r) {
dev_err(adev->dev, "vma setup fail %d range [0x%lx 0x%lx]\n", r,
@@ -459,6 +465,11 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
svm_migrate_copy_done(adev, mfence);
migrate_vma_finalize(&migrate);
 
+   kfd_smi_event_migration(adev->kfd.dev, pdd->process->pasid,
+   start >> PAGE_SHIFT, end >> PAGE_SHIFT,
+   0, adev->kfd.dev->id, prange->prefetch_loc,
+   prange->preferred_loc, trigger, timestamp);
+
svm_range_dma_unmap(adev->dev, scratch, 0, npages);
svm_range_free_dma_mappings(prange);
 
@@ -466,10 +477,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
kvfree(buf);
 out:
if (!r && cpages) {
-   pdd = svm_range_get_pdd_by_adev(prange, adev);
-   if (pdd)
-   WRITE_ONCE(pdd->page_in, pdd->page_in + cpages);
-
+   WRITE_ONCE(pdd->page_in, pdd->page_in + cpages);
return cpages;
}
return r;
@@ -480,6 +488,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
  * @prange: range structure
  * @best_loc: the device to migrate to
  * @mm: the process mm structure
+ * @trigger: reason of migration
  *
  * Context: Process context, caller hold mmap read lock, svms lock, prange lock
  *
@@ -488,7 +497,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
  */
 static int
 svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t best_loc,
-   struct mm_struct *mm)
+   struct mm_struct *mm, uint32_t trigger)
 {
unsigned long addr, start, end;
struct vm_area_struct *vma;
@@ -525,7 +534,7 @@ svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t 
best_loc,
break;
 
next = min(vma->vm_end, end);
-   r = svm_migrate_vma_to_vram(adev, prange, vma, addr, next);
+   r = svm_migrate_vma_to_vram(adev, prange, vma, addr, next, trigger);
if (r < 0) {
pr_debug("failed %ld to migrate\n", r);
break;
@@ -641,12 +650,14 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, 
struct svm_range *prange,
 
 static long
 svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
-  struct vm_area_struct *vma, uint64_t start, uint64_t end)
+  struct vm_area_struct *vma, uint64_t start, uint64_t end,
+  uint32_t trigger)
 {
uint64_t npages = (end - start) >> PAGE_SHIFT;
+   uint64_t timestamp = ktime_get_boottime_ns();
unsigned long upages = npages;
unsigned long cpages = 0;
-   struct kfd_process_device *pdd;
+   struct kfd_process_device *pdd = NULL;
 

[PATCH v2 7/8] drm/amdkfd: Add unmap from GPU SMI event

2022-01-20 Thread Philip Yang
An SVM range is unmapped from the GPUs when the range is unmapped from
the CPU, or, with XNACK on, when the range is evicted or migrated.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 18 ++
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h |  3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c| 14 +-
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index facc8d7627d8..736d8d0c9666 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -377,6 +377,24 @@ void kfd_smi_event_queue_eviction_restore(struct kfd_dev 
*dev, pid_t pid,
   fifo_in, len);
 }
 
+void kfd_smi_event_unmap_from_gpu(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, unsigned long last,
+ uint32_t trigger)
+{
+   char fifo_in[64];
+   int len;
+
+   if (list_empty(&dev->smi_clients))
+   return;
+
+   len = snprintf(fifo_in, sizeof(fifo_in), "%x %lld -%d @%lx(%lx) %x %d\n",
+  KFD_SMI_EVENT_UNMAP_FROM_GPU, ktime_get_boottime_ns(),
+  pid, address, last - address + 1, dev->id, trigger);
+
+   add_event_to_kfifo(pid, dev, KFD_SMI_EVENT_UNMAP_FROM_GPU, fifo_in,
+  len);
+}
+
 int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 {
struct kfd_smi_client *client;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
index d85300b5af23..7d348452d8c3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
@@ -43,4 +43,7 @@ void kfd_smi_event_queue_eviction(struct kfd_dev *dev, pid_t 
pid,
  uint32_t trigger);
 void kfd_smi_event_queue_eviction_restore(struct kfd_dev *dev, pid_t pid,
  bool rescheduled);
+void kfd_smi_event_unmap_from_gpu(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, unsigned long last,
+ uint32_t trigger);
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 30aaa9764067..f8e6c8269743 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1191,7 +1191,7 @@ svm_range_unmap_from_gpu(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
 
 static int
 svm_range_unmap_from_gpus(struct svm_range *prange, unsigned long start,
- unsigned long last)
+ unsigned long last, uint32_t trigger)
 {
DECLARE_BITMAP(bitmap, MAX_GPU_INSTANCE);
struct kfd_process_device *pdd;
@@ -1212,6 +1212,9 @@ svm_range_unmap_from_gpus(struct svm_range *prange, 
unsigned long start,
return -EINVAL;
}
 
+   kfd_smi_event_unmap_from_gpu(pdd->dev, p->lead_thread->pid,
+start, last, trigger);
+
r = svm_range_unmap_from_gpu(pdd->dev->adev,
 drm_priv_to_vm(pdd->drm_priv),
 start, last, &fence);
@@ -1795,13 +1798,13 @@ svm_range_evict(struct svm_range *prange, struct 
mm_struct *mm,
s = max(start, pchild->start);
l = min(last, pchild->last);
if (l >= s)
-   svm_range_unmap_from_gpus(pchild, s, l);
+   svm_range_unmap_from_gpus(pchild, s, l, trigger);
mutex_unlock(&pchild->lock);
}
s = max(start, prange->start);
l = min(last, prange->last);
if (l >= s)
-   svm_range_unmap_from_gpus(prange, s, l);
+   svm_range_unmap_from_gpus(prange, s, l, trigger);
}
 
return r;
@@ -2214,6 +2217,7 @@ static void
 svm_range_unmap_from_cpu(struct mm_struct *mm, struct svm_range *prange,
 unsigned long start, unsigned long last)
 {
+   uint32_t trigger = UNMAP_FROM_CPU;
struct svm_range_list *svms;
struct svm_range *pchild;
struct kfd_process *p;
@@ -2241,14 +2245,14 @@ svm_range_unmap_from_cpu(struct mm_struct *mm, struct 
svm_range *prange,
s = max(start, pchild->start);
l = min(last, pchild->last);
if (l >= s)
-   svm_range_unmap_from_gpus(pchild, s, l);
+   svm_range_unmap_from_gpus(pchild, s, l, trigger);
svm_range_unmap_split(mm, prange, pchild, start, last);
mutex_unlock(&pchild->lock);
}
s = max(start, prange->start);
l = min(last, prange->

[PATCH v2 4/8] drm/amdkfd: Add GPU recoverable fault SMI event

2022-01-20 Thread Philip Yang
Output the timestamp when a GPU recoverable fault starts and ends, the
duration taken to recover the fault, whether migration happened or only
the GPU page table was updated, the fault address, and whether it was a
read or write fault.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 48 +
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h |  7 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c| 17 ++--
 3 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index 68c93701c5f7..080eba0d3be0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -266,6 +266,54 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, 
uint16_t pasid)
add_event_to_kfifo(0, dev, KFD_SMI_EVENT_VMFAULT, fifo_in, len);
 }
 
+static bool kfd_smi_event_duration(struct kfd_dev *dev, uint64_t ts,
+  uint64_t *duration)
+{
+   if (list_empty(&dev->smi_clients))
+   return false;
+
+   *duration = ktime_get_boottime_ns() - ts;
+   return true;
+}
+
+void kfd_smi_event_page_fault_start(struct kfd_dev *dev, pid_t pid,
+   unsigned long address, bool write_fault,
+   uint64_t ts)
+{
+   char fifo_in[64];
+   int len;
+
+   if (list_empty(&dev->smi_clients))
+   return;
+
+   len = snprintf(fifo_in, sizeof(fifo_in), "%x %lld -%d @%lx(%x) %c\n",
+  KFD_SMI_EVENT_PAGE_FAULT_START, ts, pid, address,
+  dev->id, write_fault ? 'W' : 'R');
+
+   add_event_to_kfifo(pid, dev, KFD_SMI_EVENT_PAGE_FAULT_START, fifo_in,
+  len);
+}
+
+void kfd_smi_event_page_fault_end(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, bool migration,
+ uint64_t ts)
+{
+   char fifo_in[64];
+   uint64_t duration;
+   int len;
+
+   if (!kfd_smi_event_duration(dev, ts, &duration))
+   return;
+
+   len = snprintf(fifo_in, sizeof(fifo_in),
+  "%x %lld(%lld) -%d @%lx(%x) %c\n",
+  KFD_SMI_EVENT_PAGE_FAULT_END, ktime_get_boottime_ns(),
+  duration, pid, address, dev->id, migration ? 'M' : 'm');
+
+   add_event_to_kfifo(pid, dev, KFD_SMI_EVENT_PAGE_FAULT_END, fifo_in,
+  len);
+}
+
 int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 {
struct kfd_smi_client *client;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
index bffd0c32b060..7f70db914d2c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h
@@ -28,5 +28,10 @@ void kfd_smi_event_update_vmfault(struct kfd_dev *dev, 
uint16_t pasid);
 void kfd_smi_event_update_thermal_throttling(struct kfd_dev *dev,
 uint64_t throttle_bitmask);
 void kfd_smi_event_update_gpu_reset(struct kfd_dev *dev, bool post_reset);
-
+void kfd_smi_event_page_fault_start(struct kfd_dev *dev, pid_t pid,
+   unsigned long address, bool write_fault,
+   uint64_t ts);
+void kfd_smi_event_page_fault_end(struct kfd_dev *dev, pid_t pid,
+ unsigned long address, bool migration,
+ uint64_t ts);
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 2d2cae05dbea..08b21f9759ea 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -32,6 +32,7 @@
 #include "kfd_priv.h"
 #include "kfd_svm.h"
 #include "kfd_migrate.h"
+#include "kfd_smi_events.h"
 
 #ifdef dev_fmt
 #undef dev_fmt
@@ -1596,7 +1597,7 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
svm_range_unreserve_bos(&ctx);
 
if (!r)
-   prange->validate_timestamp = ktime_to_us(ktime_get());
+   prange->validate_timestamp = ktime_get_boottime_ns();
 
return r;
 }
@@ -2665,11 +2666,12 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
unsigned int pasid,
struct svm_range_list *svms;
struct svm_range *prange;
struct kfd_process *p;
-   uint64_t timestamp;
+   uint64_t timestamp = ktime_get_boottime_ns();
int32_t best_loc;
int32_t gpuidx = MAX_GPU_INSTANCE;
bool write_locked = false;
struct vm_area_struct *vma;
+   bool migration = false;
int r = 0;
 
if (!KFD_IS_SVM_API_SUPPORTED(adev->kfd.dev)) {
@@ -2745,9 +2747,9 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
unsigned int pasid,
goto out_unlock_range;
}
 
-   timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
/* skip
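Each of these events reaches user space as a single text line; the page-fault start event is formatted as "%x %lld -%d @%lx(%x) %c" (event ID, boot-time timestamp, -pid, fault address with GPU id in parentheses, and 'W' or 'R' for a write or read fault). A hypothetical user-space parser for such a line, assuming only the format string from this patch, could look like:

```c
#include <stdio.h>
#include <stdbool.h>

/* Parsed form of a KFD_SMI_EVENT_PAGE_FAULT_START line, which the patch
 * emits as "%x %lld -%d @%lx(%x) %c\n". */
struct page_fault_start {
	unsigned int event;    /* KFD_SMI_EVENT_PAGE_FAULT_START */
	long long ts_ns;       /* ktime_get_boottime_ns() timestamp */
	int pid;               /* written as "-pid" in the event text */
	unsigned long address; /* faulting address */
	unsigned int gpu_id;   /* device id in parentheses */
	char rw;               /* 'W' = write fault, 'R' = read fault */
};

static bool parse_page_fault_start(const char *line,
				   struct page_fault_start *ev)
{
	/* The literal '-' before %d consumes the dash the kernel prints
	 * in front of the pid, so ev->pid comes out positive. */
	return sscanf(line, "%x %lld -%d @%lx(%x) %c",
		      &ev->event, &ev->ts_ns, &ev->pid, &ev->address,
		      &ev->gpu_id, &ev->rw) == 6;
}
```

The matching end event only differs by carrying "(duration)" after the timestamp and 'M'/'m' instead of 'W'/'R', so a real tool would dispatch on the leading event ID first.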

[PATCH v2 8/8] drm/amdkfd: Bump KFD API version for SMI profiling event

2022-01-20 Thread Philip Yang
Indicate that SMI profiling events are available.

Signed-off-by: Philip Yang 
---
 include/uapi/linux/kfd_ioctl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index de0b5bb95db3..1236550d1375 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -32,9 +32,10 @@
  * - 1.4 - Indicate new SRAM EDC bit in device properties
  * - 1.5 - Add SVM API
  * - 1.6 - Query clear flags in SVM get_attr API
+ * - 1.7 - Add SMI profiler event log
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 6
+#define KFD_IOCTL_MINOR_VERSION 7
 
 struct kfd_ioctl_get_version_args {
__u32 major_version;/* from KFD */
-- 
2.17.1



[PATCH v2 1/8] drm/amdkfd: Correct SMI event read size

2022-01-20 Thread Philip Yang
sizeof(buf) is 8 bytes because buf is defined as unsigned char *buf, so
each SMI event read copies at most 8 bytes to the user buffer. Correct
this by using the buf allocation size instead.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index 329a4c89f1e6..18ed1b72f0f7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -81,7 +81,8 @@ static ssize_t kfd_smi_ev_read(struct file *filep, char 
__user *user,
struct kfd_smi_client *client = filep->private_data;
unsigned char *buf;
 
-   buf = kmalloc_array(MAX_KFIFO_SIZE, sizeof(*buf), GFP_KERNEL);
+   size = min_t(size_t, size, MAX_KFIFO_SIZE);
+   buf = kmalloc(size, GFP_KERNEL);
if (!buf)
return -ENOMEM;
 
@@ -95,7 +96,7 @@ static ssize_t kfd_smi_ev_read(struct file *filep, char 
__user *user,
ret = -EAGAIN;
goto ret_err;
}
-   to_copy = min3(size, sizeof(buf), to_copy);
+   to_copy = min(size, to_copy);
ret = kfifo_out(&client->fifo, buf, to_copy);
spin_unlock(&client->lock);
if (ret <= 0) {
-- 
2.17.1



[PATCH v2 2/8] drm/amdkfd: Add KFD SMI event IDs and triggers

2022-01-20 Thread Philip Yang
Define new system management interface event IDs, migration triggers
and user queue eviction triggers; these will be implemented in the
following patches.

Signed-off-by: Philip Yang 
---
 include/uapi/linux/kfd_ioctl.h | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index af96af174dc4..de0b5bb95db3 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -459,10 +459,37 @@ enum kfd_smi_event {
KFD_SMI_EVENT_THERMAL_THROTTLE = 2,
KFD_SMI_EVENT_GPU_PRE_RESET = 3,
KFD_SMI_EVENT_GPU_POST_RESET = 4,
+   KFD_SMI_EVENT_MIGRATION = 5,
+   KFD_SMI_EVENT_PAGE_FAULT_START = 6,
+   KFD_SMI_EVENT_PAGE_FAULT_END = 7,
+   KFD_SMI_EVENT_QUEUE_EVICTION = 8,
+   KFD_SMI_EVENT_QUEUE_EVICTION_RESTORE = 9,
+   KFD_SMI_EVENT_UNMAP_FROM_GPU = 10,
+
+   /*
+* Max event number, used as a flag bit to receive events from all
+* processes. This requires super-user permission; otherwise events
+* from other processes will not be received. Without this flag,
+* only events from the same process are received.
+*/
+   KFD_SMI_EVENT_ALL_PROCESS = 64
 };
 
 #define KFD_SMI_EVENT_MASK_FROM_INDEX(i) (1ULL << ((i) - 1))
 
+enum KFD_MIGRATION_QUEUE_EVICTION_UNMAP_EVENT_TRIGGER {
+   MIGRATION_TRIGGER_PREFETCH = 1,
+   MIGRATION_TRIGGER_PAGEFAULT,
+   MIGRATION_TRIGGER_PAGEFAULT_CPU,
+   MIGRATION_TRIGGER_TTM_EVICTION,
+   SVM_RANGE_EVICTION,
+   SVM_RANGE_MIGRATION,
+   USERPTR_EVICTION,
+   TTM_EVICTION,
+   UNMAP_FROM_CPU,
+   SUSPEND_EVICTION
+};
+
 struct kfd_ioctl_smi_events_args {
__u32 gpuid;/* to KFD */
__u32 anon_fd;  /* from KFD */
-- 
2.17.1
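KFD_SMI_EVENT_MASK_FROM_INDEX(i) maps event ID i to bit i-1 of a 64-bit mask, so KFD_SMI_EVENT_ALL_PROCESS (64) lands on the top bit and combines freely with the ordinary event bits. A user-space sketch of building a subscription mask — the helper function is hypothetical; only the macro and a subset of the IDs come from this header:

```c
#include <stdint.h>

/* Subset of the event IDs added by this patch. */
enum {
	KFD_SMI_EVENT_VMFAULT = 1,
	KFD_SMI_EVENT_MIGRATION = 5,
	KFD_SMI_EVENT_UNMAP_FROM_GPU = 10,
	KFD_SMI_EVENT_ALL_PROCESS = 64,
};

#define KFD_SMI_EVENT_MASK_FROM_INDEX(i) (1ULL << ((i) - 1))

/* Build the 64-bit mask a client would write to the SMI event fd to
 * subscribe; the event list is terminated by a 0 sentinel. */
static uint64_t smi_build_mask(const int *events)
{
	uint64_t mask = 0;

	for (; *events; events++)
		mask |= KFD_SMI_EVENT_MASK_FROM_INDEX(*events);
	return mask;
}
```

Because IDs start at 1, the macro never shifts by 64, and event 64 maps exactly onto bit 63.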



[PATCH v2 0/8] HMM profiler interface

2022-01-20 Thread Philip Yang
The ROCm profiler will expose the data from the KFD profiling APIs to
application developers, so they can tune applications based on how the
address range attributes affect behavior and performance.

The per-process event log uses the existing SMI (system management
interface) event API. Each event log entry is one line of text with the
event-specific information.

v2:
 * Keep existing events behaviour
 * Use ktime_get_boottime_ns() as timestamp to correlate with other APIs
 * Use compact message layout, stick with existing message convention
 * Add unmap from GPU event

Philip Yang (8):
  drm/amdkfd: Correct SMI event read size
  drm/amdkfd: Add KFD SMI event IDs and triggers
  drm/amdkfd: Enable per process SMI event
  drm/amdkfd: Add GPU recoverable fault SMI event
  drm/amdkfd: add migration SMI event
  drm/amdkfd: Add user queue eviction restore SMI event
  drm/amdkfd: Add unmap from GPU SMI event
  drm/amdkfd: Bump KFD API version for SMI profiling event

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   7 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  11 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |   4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c  |  67 ---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h  |   5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  37 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   | 163 +-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h   |  19 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  63 +--
 include/uapi/linux/kfd_ioctl.h|  30 +++-
 11 files changed, 343 insertions(+), 65 deletions(-)

-- 
2.17.1



Re: [PATCH 2/2] drm/amdgpu/display: use msleep rather than udelay for long delays

2022-01-20 Thread Harry Wentland
On 2022-01-20 13:04, Alex Deucher wrote:
> Some architectures (e.g., ARM) throw a compilation error if the
> udelay is too long.  In general udelays of longer than 2000us are
> not recommended on any architecture.  Switch to msleep in these
> cases.
> 
> Signed-off-by: Alex Deucher 

Series is
Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
> b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> index aa1c67c3c386..eb4432dca761 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
> @@ -6941,7 +6941,7 @@ bool dpcd_write_128b_132b_sst_payload_allocation_table(
>   }
>   }
>   retries++;
> - udelay(5000);
> + msleep(5);
>   }
>  
>   if (!result && retries == max_retries) {
> @@ -6993,7 +6993,7 @@ bool dpcd_poll_for_allocation_change_trigger(struct 
> dc_link *link)
>   break;
>   }
>  
> - udelay(5000);
> + msleep(5);
>   }
>  
>   if (result == ACT_FAILED) {



RE: [PATCH] drm/amd/display: Not to call dpcd_set_source_specific_data during resume.

2022-01-20 Thread Wu, Hersen
[AMD Official Use Only]

Hi Rajib,

For resume from S3 or S0i3, the change should work.

Reviewed-by: Hersen Wu 

For boot up, at the location of your change, link->dpcd_sink_ext_caps.bits.oled = 0.
OLED caps are read by dpcd_read_sink_ext_caps, which is called within
detect_edp_sink_caps.

For boot up, we need another change.

Thanks!
Hersen



-Original Message-
From: Mahapatra, Rajib  
Sent: Thursday, January 20, 2022 2:33 PM
To: Wentland, Harry ; Wu, Hersen ; 
Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
Subject: RE: [PATCH] drm/amd/display: Not to call dpcd_set_source_specific_data 
during resume.

Hi Hersen,
I am waiting for your comments here.
I think we can take this change for resume path at this moment.
For bootup, we can have separate patch for resume optimization. 

Thanks
-Rajib

-Original Message-
From: Wentland, Harry  
Sent: Tuesday, January 11, 2022 9:47 PM
To: Mahapatra, Rajib ; Wu, Hersen 
; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
Subject: Re: [PATCH] drm/amd/display: Not to call dpcd_set_source_specific_data 
during resume.



On 2022-01-11 02:52, Mahapatra, Rajib wrote:
> dpcd_set_source_specific_data is not specific to the OLED panel. It is also
> called from the boot-up path.
> Hersen Wu introduced it in the resume path while enabling the OLED panel for
> Linux in the commit below.
> 

If we set it in the boot-up path we'll probably want to set it on resume as 
well. Though I'll let Hersen comment since he knows this part much better than 
me.

Harry

> So here, I guard it by calling the source specific data only for the OLED
> panel, and I gain around 100ms for non-OLED panels during resume. Hersen
> might have an answer about the issue related to regression for other panels;
> waiting for his reply about this change.
> 
> commit 96577cf82a1331732a71199522398120c649f1cf
> Author: Hersen Wu 
> Date:   Tue Jan 14 15:39:07 2020 -0500
> 
> drm/amd/display: linux enable oled panel support dc part
> 
> 
> 
> -Original Message-
> From: Wentland, Harry 
> Sent: Monday, January 10, 2022 10:03 PM
> To: Mahapatra, Rajib ; Wu, Hersen 
> ; Deucher, Alexander 
> Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
> Subject: Re: [PATCH] drm/amd/display: Not to call 
> dpcd_set_source_specific_data during resume.
> 
> On 2022-01-10 04:06, Rajib Mahapatra wrote:
>> [Why]
>> During the resume path, dpcd_set_source_specific_data takes extra
>> time when core_link_write_dpcd fails on DP_SOURCE_OUI+0x03 and
>> DP_SOURCE_MINIMUM_HBLANK_SUPPORTED. Here, aux->transfer fails with
>> multiple retries and consumes a significant amount of time during
>> S0i3 resume.
>>
>> [How]
>> Do not call dpcd_set_source_specific_data during the resume path when
>> no OLED panel is connected, to achieve a faster resume during S0i3.
>>
>> Signed-off-by: Rajib Mahapatra 
>> ---
>>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> index c0bdc23702c8..04086c199dbb 100644
>> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> @@ -892,7 +892,8 @@ static bool dc_link_detect_helper(struct dc_link *link,
>>  (!link->dc->config.allow_edp_hotplug_detection)) &&
>>  link->local_sink) {
>>  // need to re-write OUI and brightness in resume case
>> -if (link->connector_signal == SIGNAL_TYPE_EDP) {
>> +if (link->connector_signal == SIGNAL_TYPE_EDP &&
>> +(link->dpcd_sink_ext_caps.bits.oled == 1)) {
> 
> Is the source specific data only used by OLED panels?
> 
> Do we know that this won't lead to regressions with any features on non-OLED 
> panels?
> 
> Harry
> 
>>  dpcd_set_source_specific_data(link);
>>  msleep(post_oui_delay);
>>  dc_link_set_default_brightness_aux(link);
> 


RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems

2022-01-20 Thread Limonciello, Mario
[Public]

Add back on Lijo and Prike, my mistake they got dropped from CC.

> -Original Message-
> From: Limonciello, Mario 
> Sent: Tuesday, January 18, 2022 21:41
> To: amd-gfx@lists.freedesktop.org
> Cc: Limonciello, Mario ; Bjoren Dasse
> 
> Subject: [PATCH v4] drm/amd: Warn users about potential s0ix problems
> 
> On some OEM setups users can configure the BIOS for S3 or S2idle.
> When configured to S3 users can still choose 's2idle' in the kernel by
> using `/sys/power/mem_sleep`.  Before commit 6dc8265f9803 ("drm/amdgpu:
> always reset the asic in suspend (v2)"), the GPU would crash.  Now when
> configured this way, the system should resume but will use more power.
> 
> As such, adjust the `amdgpu_acpi_is_s0ix_active` function to warn users about
> potential power consumption issues during their first attempt at
> suspending.
> 
> Reported-by: Bjoren Dasse 
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824
> Signed-off-by: Mario Limonciello 
> ---
> v3->v4:
>  * Add back in CONFIG_SUSPEND check
> v2->v3:
>  * Better direct users how to recover in the bad cases
> v1->v2:
>  * Only show messages in s2idle cases
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> index 4811b0faafd9..2531da6cbec3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> @@ -1040,11 +1040,20 @@ void amdgpu_acpi_detect(void)
>   */
>  bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev)
>  {
> -#if IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_SUSPEND)
> - if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
> - if (adev->flags & AMD_IS_APU)
> - return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
> - }
> -#endif
> +#if IS_ENABLED(CONFIG_SUSPEND)
> + if (!(adev->flags & AMD_IS_APU) ||
> + pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
> + return false;
> +#else
>   return false;
> +#endif
> + if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
> + dev_warn_once(adev->dev,
> +   "Power consumption will be higher as BIOS has not been configured for suspend-to-idle.\n"
> +   "To use suspend-to-idle change the sleep mode in BIOS setup.\n");
> +#if !IS_ENABLED(CONFIG_AMD_PMC)
> + dev_warn_once(adev->dev,
> +   "Power consumption will be higher as the kernel has not
> been compiled with CONFIG_AMD_PMC.\n");
> +#endif
> + return true;
>  }
> --
> 2.25.1


RE: [PATCH] drm/amd/display: Not to call dpcd_set_source_specific_data during resume.

2022-01-20 Thread Mahapatra, Rajib
Hi Hersen,
I am waiting for your comments here.
I think we can take this change for the resume path at this moment.
For boot-up, we can have a separate patch for that optimization.

Thanks
-Rajib

-Original Message-
From: Wentland, Harry  
Sent: Tuesday, January 11, 2022 9:47 PM
To: Mahapatra, Rajib ; Wu, Hersen 
; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
Subject: Re: [PATCH] drm/amd/display: Not to call dpcd_set_source_specific_data 
during resume.



On 2022-01-11 02:52, Mahapatra, Rajib wrote:
> dpcd_set_source_specific_data is not specific to OLED panels.  It is also 
> called from the boot-up path.
> Hersen Wu introduced it in the resume path while enabling OLED panel support 
> for Linux in the commit below.
> 

If we set it in the boot-up path we'll probably want to set it on resume as 
well. Though I'll let Hersen comment since he knows this part much better than 
me.

Harry

> So here, I guard it by writing the source-specific data only for OLED panels, 
> which saves around 100 ms for non-OLED panels during resume. Hersen might 
> have an answer about possible regressions on other panels; I am waiting for 
> his reply about this change.
> 
> commit 96577cf82a1331732a71199522398120c649f1cf
> Author: Hersen Wu 
> Date:   Tue Jan 14 15:39:07 2020 -0500
> 
> drm/amd/display: linux enable oled panel support dc part
> 
> 
> 
> -Original Message-
> From: Wentland, Harry 
> Sent: Monday, January 10, 2022 10:03 PM
> To: Mahapatra, Rajib ; Wu, Hersen 
> ; Deucher, Alexander 
> Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
> Subject: Re: [PATCH] drm/amd/display: Not to call 
> dpcd_set_source_specific_data during resume.
> 
> On 2022-01-10 04:06, Rajib Mahapatra wrote:
>> [Why]
>> During the resume path, dpcd_set_source_specific_data takes extra 
>> time when core_link_write_dpcd fails on DP_SOURCE_OUI+0x03 and 
>> DP_SOURCE_MINIMUM_HBLANK_SUPPORTED. Here, aux->transfer fails with 
>> multiple retries and consumes a significant amount of time during
>> S0i3 resume.
>>
>> [How]
>> Do not call dpcd_set_source_specific_data during the resume path when 
>> there is no OLED panel connected, to achieve a faster resume during 
>> S0i3.
>>
>> Signed-off-by: Rajib Mahapatra 
>> ---
>>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> index c0bdc23702c8..04086c199dbb 100644
>> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
>> @@ -892,7 +892,8 @@ static bool dc_link_detect_helper(struct dc_link *link,
>>  (!link->dc->config.allow_edp_hotplug_detection)) &&
>>  link->local_sink) {
>>  // need to re-write OUI and brightness in resume case
>> -if (link->connector_signal == SIGNAL_TYPE_EDP) {
>> +if (link->connector_signal == SIGNAL_TYPE_EDP &&
>> +(link->dpcd_sink_ext_caps.bits.oled == 1)) {
> 
> Is the source specific data only used by OLED panels?
> 
> Do we know that this won't lead to regressions with any features on non-OLED 
> panels?
> 
> Harry
> 
>>  dpcd_set_source_specific_data(link);
>>  msleep(post_oui_delay);
>>  dc_link_set_default_brightness_aux(link);
> 



RE: [PATCH 0/3] lib/string_helpers: Add a few string helpers

2022-01-20 Thread David Laight
...
> Yeah, and I am sorry for bikeshedding. Honestly, I do not know what is
> better. This is why I do not want to block this series when others
> like this.
> 
> My main motivation is to point out that:
> 
> enabledisable(enable)
> 
> might be, for some people, more eye bleeding than
> 
> enable ? "enable" : "disable"

Indeed - you need to look the former up, wasting brain time.

> The problem is not that visible with yesno() and onoff(). But as you said,
> onoff() conflicts with variable names. And enabledisable() sucks.
> As a result, there is a non-trivial risk of two mass changes:
> 
> now:
> 
> - condition ? "yes" : "no"
> + yesno(condition)
> 
> a few months later:
> 
> - yesno(condition)
> + str_yes_no(condition)

Followed by:
- str_yes_no(x)
+ no_yes_str(x)

David




Re: [PATCH v3 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-20 Thread Alistair Popple
On Thursday, 20 January 2022 11:36:21 PM AEDT Joao Martins wrote:
> On 1/10/22 22:31, Alex Sierra wrote:
> > Avoid long term pinning for Coherent device type pages. This could
> > interfere with their own device memory manager. For now, we are just
> > returning error for PIN_LONGTERM Coherent device type pages. Eventually,
> > these types of pages will get migrated to system memory, once device
> > page migration support is added.
> > 
> > Signed-off-by: Alex Sierra 
> > ---
> >  mm/gup.c | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 886d6148d3d0..9c8a075d862d 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -1720,6 +1720,12 @@ static long check_and_migrate_movable_pages(unsigned 
> > long nr_pages,
> >  * If we get a movable page, since we are going to be pinning
> >  * these entries, try to move them out if possible.
> >  */
> > +   if (is_device_page(head)) {
> > +   WARN_ON_ONCE(is_device_private_page(head));
> > +   ret = -EFAULT;
> > +   goto unpin_pages;
> > +   }
> > +
> 
> Wouldn't it be more efficient to fail earlier, instead of after all the 
> pages are pinned?

Rather than failing I think the plan is to migrate the device coherent pages
like we do for ZONE_MOVABLE, so leaving this here is a good placeholder until
that is done. Currently we are missing some functionality required to do that
but I am hoping to post a series fixing that soon.

> Filesystem DAX suffers from a somewhat similar issue[0] -- albeit it's more 
> related to
> blocking FOLL_LONGTERM in gup-fast while gup-slow can still do it. Coherent 
> devmap appears
> to want to block it in all gup.
> 
> On another thread Jason suggested having different pgmap::flags to capture
> these special cases[1] instead of selecting what different pgmap types can do
> in various places.
> 
> [0] 
> https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10f...@oracle.com/
> [1] https://lore.kernel.org/linux-mm/20211019160136.gh3686...@ziepe.ca/
> 






Re: [PATCH v3 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-20 Thread Joao Martins
On 1/10/22 22:31, Alex Sierra wrote:
> Avoid long term pinning for Coherent device type pages. This could
> interfere with their own device memory manager. For now, we are just
> returning error for PIN_LONGTERM Coherent device type pages. Eventually,
> these types of pages will get migrated to system memory, once device
> page migration support is added.
> 
> Signed-off-by: Alex Sierra 
> ---
>  mm/gup.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 886d6148d3d0..9c8a075d862d 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1720,6 +1720,12 @@ static long check_and_migrate_movable_pages(unsigned 
> long nr_pages,
>* If we get a movable page, since we are going to be pinning
>* these entries, try to move them out if possible.
>*/
> + if (is_device_page(head)) {
> + WARN_ON_ONCE(is_device_private_page(head));
> + ret = -EFAULT;
> + goto unpin_pages;
> + }
> +

Wouldn't it be more efficient to fail earlier, instead of after all the 
pages are pinned?

Filesystem DAX suffers from a somewhat similar issue[0] -- albeit it's more 
related to
blocking FOLL_LONGTERM in gup-fast while gup-slow can still do it. Coherent 
devmap appears
to want to block it in all gup.

On another thread Jason suggested having different pgmap::flags to capture
these special cases[1] instead of selecting what different pgmap types can do
in various places.

[0] 
https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10f...@oracle.com/
[1] https://lore.kernel.org/linux-mm/20211019160136.gh3686...@ziepe.ca/


[PATCH] drm/amdgpu: Fix double free in amdgpu_get_xgmi_hive

2022-01-20 Thread Miaoqian Lin
The callback function amdgpu_xgmi_hive_release(), invoked from kobject_put(),
calls kfree(hive), so we don't need to call kfree(hive) again.

Fixes: 7b833d680481 ("drm/amd/amdgpu: fix potential memleak")
Signed-off-by: Miaoqian Lin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index e8b8f28c2f72..35d4b966ef2c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -393,7 +393,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct 
amdgpu_device *adev)
if (ret) {
dev_err(adev->dev, "XGMI: failed initializing kobject for xgmi 
hive\n");
kobject_put(&hive->kobj);
-   kfree(hive);
hive = NULL;
goto pro_end;
}
-- 
2.17.1



Re: [Intel-gfx] [PATCH 0/7] DRM kmap() fixes and kmap_local_page() conversions

2022-01-20 Thread Ira Weiny
On Thu, Jan 20, 2022 at 04:48:50PM +0100, Daniel Vetter wrote:
> On Thu, Jan 20, 2022 at 09:16:35AM +0100, Christian König wrote:
> > Am 20.01.22 um 00:55 schrieb Ira Weiny:
> > > On Wed, Jan 19, 2022 at 06:24:22PM +0100, Daniel Vetter wrote:
> > > > On Wed, Jan 19, 2022 at 08:53:56AM -0800, Ira Weiny wrote:
> > > > > On Fri, Dec 10, 2021 at 03:23:57PM -0800, 'Ira Weiny' wrote:
> > > > > > From: Ira Weiny 
> > > > > > 
> > > > > > This series starts by converting the last easy kmap() uses to
> > > > > > kmap_local_page().
> > > > > > 
> > > > > > There is one more call to kmap() wrapped in ttm_bo_kmap_ttm().  
> > > > > > Unfortunately,
> > > > > > ttm_bo_kmap_ttm() is called in a number of different ways including 
> > > > > > some which
> > > > > > are not thread local.  I have a patch to convert that call.  
> > > > > > However, it is not
> > > > > > straight forward so it is not included in this series.
> > > > > > 
> > > > > > The final 2 patches fix bugs found while working on the 
> > > > > > ttm_bo_kmap_ttm()
> > > > > > conversion.
> > > > > Gentle ping on this series?  Will it make this merge window?
> > > > I think this fell through the cracks and so no. Note that generally we
> > > > feature-freeze drm tree around -rc6 anyway for the upcoming merge 
> > > > window,
> > > > so you were cutting this all a bit close anyway.
> > > Ok, no problem.  I just hadn't heard whether this was picked up or not.
> > > 
> > > > Also looks like the ttm
> > > > kmap caching question didn't get resolved?
> > > I'm sorry, I thought it was resolved for this series.  Christian said the 
> > > patches
> > > in this series were "a good bug fix" even if not strictly necessary.[1]  
> > > Beyond
> > > this series I was discussing where to go from here, and whether it is
> > > possible to go further with more changes.[2]  At the moment I don't think I will.
> > > 
> > > Christian did I misunderstand?  I can drop patch 6 and 7 if they are not 
> > > proper
> > > bug fixes or at least clarifications to the code.
> > 
> > Yeah, it is indeed a correct cleanup. I would just *not* put a CC stable on
> > it because it doesn't really fix anything.
> 
> Ok can you pls get the amd/radeon ones stuffed into alex' tree? Or do we
> want to put all the ttm ones into drm-misc instead?

I just updated to the latest master and there is a minor conflict.  Since this
is not going in this window, let me rebase and resend.

Ira

> -Daniel
> 


Re: amd-staging-drm-next breaks suspend

2022-01-20 Thread Bert Karwatzki
Tested amd-staging-drm-next (HEAD
e5c18a35031963eb22bfabf84cce3545da56a8ee) and suspend/resume works
despite the warnings. So the amdgpu_gart_bind warning did not cause
problems.

Am Donnerstag, dem 20.01.2022 um 01:52 + schrieb Kim, Jonathan:
> [Public]
>
> This should fix the issue by getting rid of the unneeded flag check
> during gart bind:
> https://patchwork.freedesktop.org/patch/469907/
>
> Thanks,
>
> Jon
>
> > -Original Message-
> > From: amd-gfx  On Behalf Of
> > Bert
> > Karwatzki
> > Sent: January 19, 2022 8:12 PM
> > To: Alex Deucher 
> > Cc: Chris Hixon ; Zhuo, Qingqing
> > (Lillian) ; Das, Nirmoy
> > ; amd-gfx@lists.freedesktop.org; Scott Bruce
> > ; Limonciello, Mario
> > ; Kazlauskas, Nicholas
> > 
> > Subject: Re: amd-staging-drm-next breaks suspend
> >
> >
> > Unfortunately this does not work either:
> >
> > [    0.859998] [ cut here ]
> > [    0.859998] trying to bind memory to uninitialized GART !
> > [    0.860003] WARNING: CPU: 13 PID: 235 at
> > drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:254
> > amdgpu_gart_bind+0x29/0x40 [amdgpu]
> > [    0.860099] Modules linked in: amdgpu(+) drm_ttm_helper ttm
> > gpu_sched i2c_algo_bit drm_kms_helper syscopyarea hid_sensor_hub
> > sysfillrect mfd_core sysimgblt hid_generic fb_sys_fops cec xhci_pci
> > xhci_hcd nvme drm r8169 nvme_core psmouse crc32c_intel realtek
> > amd_sfh usbcore i2c_hid_acpi mdio_devres t10_pi crc_t10dif i2c_hid
> > i2c_piix4 crct10dif_generic libphy crct10dif_common hid backlight
> > i2c_designware_platform i2c_designware_core
> > [    0.860113] CPU: 13 PID: 235 Comm: systemd-udevd Not tainted
> > 5.13.0+
> > #15
> > [    0.860115] Hardware name: Micro-Star International Co., Ltd.
> > Alpha
> > 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021
> > [    0.860116] RIP: 0010:amdgpu_gart_bind+0x29/0x40 [amdgpu]
> > [    0.860210] Code: 00 80 bf 34 25 00 00 00 74 14 4c 8b 8f 20 25
> > 00 00
> > 4d 85 c9 74 05 e9 16 ff ff ff 31 c0 c3 48 c7 c7 08 06 7d c0 e8 8e
> > cc 31
> > e2 <0f> 0b b8 ea ff ff ff c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
> > 40
> > [    0.860212] RSP: 0018:bb9e80b6f968 EFLAGS: 00010286
> > [    0.860213] RAX:  RBX: 0067 RCX:
> > a3080968
> > [    0.860214] RDX:  RSI: efff RDI:
> > a3028960
> > [    0.860215] RBP: 947c91e49a80 R08:  R09:
> > bb9e80b6f798
> > [    0.860215] R10: bb9e80b6f790 R11: a30989a8 R12:
> > 
> > [    0.860216] R13: 947c8a74 R14: 947c8a74 R15:
> > 
> > [    0.860216] FS:  7f60a3c918c0()
> > GS:947f5e94()
> > knlGS:
> > [    0.860217] CS:  0010 DS:  ES:  CR0: 80050033
> > [    0.860218] CR2: 7f60a4213480 CR3: 000135ee2000 CR4:
> > 00550ee0
> > [    0.860218] PKRU: 5554
> > [    0.860219] Call Trace:
> > [    0.860221]  amdgpu_ttm_gart_bind+0x74/0xc0 [amdgpu]
> > [    0.860305]  amdgpu_ttm_alloc_gart+0x13e/0x190 [amdgpu]
> > [    0.860385]  amdgpu_bo_create_reserved.part.0+0xf3/0x1b0
> > [amdgpu]
> > [    0.860465]  ? amdgpu_ttm_debugfs_init+0x110/0x110 [amdgpu]
> > [    0.860554]  amdgpu_bo_create_kernel+0x36/0xa0 [amdgpu]
> > [    0.860641]  amdgpu_ttm_init.cold+0x167/0x181 [amdgpu]
> > [    0.860784]  gmc_v10_0_sw_init+0x2d7/0x430 [amdgpu]
> > [    0.860889]  amdgpu_device_init.cold+0x147f/0x1ad7 [amdgpu]
> > [    0.861007]  ? acpi_ns_get_node+0x4a/0x55
> > [    0.861011]  ? acpi_get_handle+0x89/0xb2
> > [    0.861012]  amdgpu_driver_load_kms+0x55/0x290 [amdgpu]
> > [    0.861098]  amdgpu_pci_probe+0x181/0x250 [amdgpu]
> > [    0.861188]  pci_device_probe+0xcd/0x140
> > [    0.861191]  really_probe+0xed/0x460
> > [    0.861193]  driver_probe_device+0xe3/0x150
> > [    0.861195]  device_driver_attach+0x9c/0xb0
> > [    0.861196]  __driver_attach+0x8a/0x150
> > [    0.861197]  ? device_driver_attach+0xb0/0xb0
> > [    0.861198]  ? device_driver_attach+0xb0/0xb0
> > [    0.861198]  bus_for_each_dev+0x73/0xb0
> > [    0.861200]  bus_add_driver+0x121/0x1e0
> > [    0.861201]  driver_register+0x8a/0xe0
> > [    0.861202]  ? 0xc1117000
> > [    0.861203]  do_one_initcall+0x47/0x180
> > [    0.861205]  ? do_init_module+0x19/0x230
> > [    0.861208]  ? kmem_cache_alloc+0x182/0x260
> > [    0.861210]  do_init_module+0x51/0x230
> > [    0.861211]  __do_sys_finit_module+0xb1/0x110
> > [    0.861213]  do_syscall_64+0x40/0xb0
> > [    0.861216]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [    0.861218] RIP: 0033:0x7f60a4149679
> > [    0.861220] Code: 48 8d 3d 9a a1 0c 00 0f 05 eb a5 66 0f 1f 44
> > 00 00
> > 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
> > 08 0f
> > 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 57 0c 00 f7 d8 64 89 01
> > 48
> > [    0.861221] RSP: 002b:7ffe25f17ea8 EFLAGS: 0246
> > ORIG_RAX:
> > 0139
> > [    0.861223] RAX: ffda RBX: 00

Re: [PATCH] drm: Add PSR version 4 macro

2022-01-20 Thread Zhang, Dingchen (David)
[Public]

From: dri-devel  on behalf of 
sunpeng...@amd.com 
Sent: Monday, January 17, 2022 5:33 PM
To: amd-gfx@lists.freedesktop.org ; 
dri-de...@lists.freedesktop.org 
Cc: Li, Sun peng (Leo) 
Subject: [PATCH] drm: Add PSR version 4 macro

From: Leo Li 


eDP 1.5 specification defines PSR version 4.

It defines PSR1 and PSR2 support with selective-update (SU)
capabilities, with additional support for Y-coordinate and Early
Transport of the selective-update region.

This differs from PSR version 3 in that early transport is supported
for version 4, but not for version 3.

Signed-off-by: Leo Li 
Reviewed-by: David Zhang 

Thanks
David Zhang
---
 include/drm/drm_dp_helper.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 3f2715eb965f..05268c51acaa 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -360,6 +360,7 @@ struct drm_dp_aux;
 # define DP_PSR_IS_SUPPORTED1
 # define DP_PSR2_IS_SUPPORTED   2/* eDP 1.4 */
 # define DP_PSR2_WITH_Y_COORD_IS_SUPPORTED  3   /* eDP 1.4a */
+# define DP_PSR2_WITH_ET_IS_SUPPORTED   4  /* eDP 1.5 (eDP 1.4b SCR) */

 #define DP_PSR_CAPS 0x071   /* XXX 1.2? */
 # define DP_PSR_NO_TRAIN_ON_EXIT1
--
2.34.1


Re: [PATCH] drm/amdgpu: filter out radeon secondary ids as well

2022-01-20 Thread Christian König

Am 20.01.22 um 18:48 schrieb Alex Deucher:

Older radeon boards (r2xx-r5xx) had secondary PCI functions
which were solely there for supporting multi-head on OSes with
special requirements.  Add them to the unsupported list
as well so we don't attempt to bind to them.  The driver
would fail to bind to them anyway, but this does so
in a cleaner way that should not confuse the user.

Signed-off-by: Alex Deucher 


Acked-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 81 +
  1 file changed, 81 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1527decd7e30..75ceb43392b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1499,6 +1499,87 @@ static const u16 amdgpu_unsupported_pciidlist[] = {
0x99A0,
0x99A2,
0x99A4,
+   /* radeon secondary ids */
+   0x3171,
+   0x3e70,
+   0x4164,
+   0x4165,
+   0x4166,
+   0x4168,
+   0x4170,
+   0x4171,
+   0x4172,
+   0x4173,
+   0x496e,
+   0x4a69,
+   0x4a6a,
+   0x4a6b,
+   0x4a70,
+   0x4a74,
+   0x4b69,
+   0x4b6b,
+   0x4b6c,
+   0x4c6e,
+   0x4e64,
+   0x4e65,
+   0x4e66,
+   0x4e67,
+   0x4e68,
+   0x4e69,
+   0x4e6a,
+   0x4e71,
+   0x4f73,
+   0x5569,
+   0x556b,
+   0x556d,
+   0x556f,
+   0x5571,
+   0x5854,
+   0x5874,
+   0x5940,
+   0x5941,
+   0x5b72,
+   0x5b73,
+   0x5b74,
+   0x5b75,
+   0x5d44,
+   0x5d45,
+   0x5d6d,
+   0x5d6f,
+   0x5d72,
+   0x5d77,
+   0x5e6b,
+   0x5e6d,
+   0x7120,
+   0x7124,
+   0x7129,
+   0x712e,
+   0x712f,
+   0x7162,
+   0x7163,
+   0x7166,
+   0x7167,
+   0x7172,
+   0x7173,
+   0x71a0,
+   0x71a1,
+   0x71a3,
+   0x71a7,
+   0x71bb,
+   0x71e0,
+   0x71e1,
+   0x71e2,
+   0x71e6,
+   0x71e7,
+   0x71f2,
+   0x7269,
+   0x726b,
+   0x726e,
+   0x72a0,
+   0x72a8,
+   0x72b1,
+   0x72b3,
+   0x793f,
  };
  
  static const struct pci_device_id pciidlist[] = {




[PATCH 2/2] drm/amdgpu/display: use msleep rather than udelay for long delays

2022-01-20 Thread Alex Deucher
Some architectures (e.g., ARM) throw a compilation error if the
udelay is too long.  In general, udelays longer than 2000us are
not recommended on any architecture.  Switch to msleep in these
cases.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index aa1c67c3c386..eb4432dca761 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -6941,7 +6941,7 @@ bool dpcd_write_128b_132b_sst_payload_allocation_table(
}
}
retries++;
-   udelay(5000);
+   msleep(5);
}
 
if (!result && retries == max_retries) {
@@ -6993,7 +6993,7 @@ bool dpcd_poll_for_allocation_change_trigger(struct 
dc_link *link)
break;
}
 
-   udelay(5000);
+   msleep(5);
}
 
if (result == ACT_FAILED) {
-- 
2.34.1



[PATCH 1/2] drm/amdgpu/display: adjust msleep limit in dp_wait_for_training_aux_rd_interval

2022-01-20 Thread Alex Deucher
Some architectures (e.g., ARM) have relatively low udelay limits.
On most architectures, anything longer than 2000us is not recommended.
Change the check to align with other similar checks in DC.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 1f8831156bc4..aa1c67c3c386 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -202,7 +202,7 @@ void dp_wait_for_training_aux_rd_interval(
uint32_t wait_in_micro_secs)
 {
 #if defined(CONFIG_DRM_AMD_DC_DCN)
-   if (wait_in_micro_secs > 16000)
+   if (wait_in_micro_secs > 1000)
msleep(wait_in_micro_secs/1000);
else
udelay(wait_in_micro_secs);
-- 
2.34.1



Re: [PATCH] drm/amdgpu: Fix double free in amdgpu_get_xgmi_hive

2022-01-20 Thread Felix Kuehling
Am 2022-01-20 um 5:17 a.m. schrieb Miaoqian Lin:
> The callback function amdgpu_xgmi_hive_release(), invoked from kobject_put(),
> calls kfree(hive), so we don't need to call kfree(hive) again.
>
> Fixes: 7b833d680481 ("drm/amd/amdgpu: fix potential memleak")
> Signed-off-by: Miaoqian Lin 

The patch is

Reviewed-by: Felix Kuehling 

The kobject_init_and_add error handling semantics are very unintuitive,
and we keep stumbling over them. I wonder if there is a better way to
handle this. Basically, this is what it looks like, when done correctly:

foo = kzalloc(sizeof(*foo), GFP_KERNEL);
if (!foo)
return -ENOMEM;
r = kobject_init_and_add(&foo->kobj, &foo_type, &parent, "foo_name");
if (r) {
/* OK, initialization failed, but I still need to
 * clean up manually as if the call had succeeded.
 */
kobject_put(&foo->kobj);
/* Don't kfree foo, because that's already done by
 * a callback setup by the call that failed above.
 */
return r;
}

Given that unintuitive behaviour, I'd argue that kobject_init_and_add
fails as an abstraction. Code would be clearer, more intuitive and safer
by calling kobject_init and kobject_add separately.
kobject_init_and_add saves you typing exactly one line of code, and it's
just not worth it:

foo = kzalloc(sizeof(*foo), GFP_KERNEL);
if (!foo)
return -ENOMEM;
kobject_init(&foo->kobj, &foo_type); /* never fails */
r = kobject_add(&foo->kobj, &parent, "foo_name");
if (r) {
/* since kobj_init succeeded, it's obvious that kobj_put
 * is the right thing to do to handle all the cleanup.
 */
kobject_put(&foo->kobj);
return r;
}

Regards,
  Felix

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index e8b8f28c2f72..35d4b966ef2c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -393,7 +393,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct 
> amdgpu_device *adev)
>   if (ret) {
>   dev_err(adev->dev, "XGMI: failed initializing kobject for xgmi 
> hive\n");
>   kobject_put(&hive->kobj);
> - kfree(hive);
>   hive = NULL;
>   goto pro_end;
>   }


RE: [PATCH] drm/amdgpu: Disable FRU EEPROM access for SRIOV

2022-01-20 Thread Russell, Kent
[AMD Official Use Only]

Reviewed-by: Kent Russell 



> -Original Message-
> From: amd-gfx  On Behalf Of Alex 
> Deucher
> Sent: Thursday, January 20, 2022 11:51 AM
> To: Liu, Shaoyun 
> Cc: amd-gfx list 
> Subject: Re: [PATCH] drm/amdgpu: Disable FRU EEPROM access for SRIOV
>
> On Thu, Jan 20, 2022 at 10:49 AM shaoyunl  wrote:
> >
> > VF access to the EEPROM is blocked by security policy; we might need another
> > way to get SKU info for the VF.
> >
> > Signed-off-by: shaoyunl 
>
> Acked-by: Alex Deucher 
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> > index 2a786e788627..0548e279cc9f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> > @@ -40,6 +40,12 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
> > *adev)
> >  */
> > struct atom_context *atom_ctx = adev->mode_info.atom_context;
> >
> > +   /* The i2c access is blocked on VF
> > +* TODO: Need other way to get the info
> > +*/
> > +   if (amdgpu_sriov_vf(adev))
> > +   return false;
> > +
> > /* VBIOS is of the format ###-DXXXYY-##. For SKU identification,
> >  * we can use just the "DXXX" portion. If there were more models, we
> >  * could convert the 3 characters to a hex integer and use a switch
> > --
> > 2.17.1
> >


[PATCH] drm/amdgpu: filter out radeon secondary ids as well

2022-01-20 Thread Alex Deucher
Older radeon boards (r2xx-r5xx) had secondary PCI functions
which were solely there for supporting multi-head on OSes with
special requirements.  Add them to the unsupported list
as well so we don't attempt to bind to them.  The driver
would fail to bind to them anyway, but this does so
in a cleaner way that should not confuse the user.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 81 +
 1 file changed, 81 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1527decd7e30..75ceb43392b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1499,6 +1499,87 @@ static const u16 amdgpu_unsupported_pciidlist[] = {
0x99A0,
0x99A2,
0x99A4,
+   /* radeon secondary ids */
+   0x3171,
+   0x3e70,
+   0x4164,
+   0x4165,
+   0x4166,
+   0x4168,
+   0x4170,
+   0x4171,
+   0x4172,
+   0x4173,
+   0x496e,
+   0x4a69,
+   0x4a6a,
+   0x4a6b,
+   0x4a70,
+   0x4a74,
+   0x4b69,
+   0x4b6b,
+   0x4b6c,
+   0x4c6e,
+   0x4e64,
+   0x4e65,
+   0x4e66,
+   0x4e67,
+   0x4e68,
+   0x4e69,
+   0x4e6a,
+   0x4e71,
+   0x4f73,
+   0x5569,
+   0x556b,
+   0x556d,
+   0x556f,
+   0x5571,
+   0x5854,
+   0x5874,
+   0x5940,
+   0x5941,
+   0x5b72,
+   0x5b73,
+   0x5b74,
+   0x5b75,
+   0x5d44,
+   0x5d45,
+   0x5d6d,
+   0x5d6f,
+   0x5d72,
+   0x5d77,
+   0x5e6b,
+   0x5e6d,
+   0x7120,
+   0x7124,
+   0x7129,
+   0x712e,
+   0x712f,
+   0x7162,
+   0x7163,
+   0x7166,
+   0x7167,
+   0x7172,
+   0x7173,
+   0x71a0,
+   0x71a1,
+   0x71a3,
+   0x71a7,
+   0x71bb,
+   0x71e0,
+   0x71e1,
+   0x71e2,
+   0x71e6,
+   0x71e7,
+   0x71f2,
+   0x7269,
+   0x726b,
+   0x726e,
+   0x72a0,
+   0x72a8,
+   0x72b1,
+   0x72b3,
+   0x793f,
 };
 
 static const struct pci_device_id pciidlist[] = {
-- 
2.34.1



Re: [PATCH] drm/amdgpu: enable amdgpu_dc module parameter

2022-01-20 Thread Alex Deucher
On Thu, Jan 20, 2022 at 1:25 AM Lang Yu  wrote:
>
> It doesn't work under IP discovery mode. Make it work!
>
> Signed-off-by: Lang Yu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index 07965ac6381b..1ad137499e38 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -846,8 +846,14 @@ static int amdgpu_discovery_set_display_ip_blocks(struct 
> amdgpu_device *adev)
>  {
> if (adev->enable_virtual_display || amdgpu_sriov_vf(adev)) {
> amdgpu_device_ip_block_add(adev, &amdgpu_vkms_ip_block);
> +   return 0;
> +   }
> +
> +   if (!amdgpu_device_has_dc_support(adev))
> +   return 0;
> +
>  #if defined(CONFIG_DRM_AMD_DC)
> -   } else if (adev->ip_versions[DCE_HWIP][0]) {
> +   if (adev->ip_versions[DCE_HWIP][0]) {
> switch (adev->ip_versions[DCE_HWIP][0]) {
> case IP_VERSION(1, 0, 0):
> case IP_VERSION(1, 0, 1):
> @@ -882,9 +888,9 @@ static int amdgpu_discovery_set_display_ip_blocks(struct 
> amdgpu_device *adev)
> adev->ip_versions[DCI_HWIP][0]);
> return -EINVAL;
> }
> -#endif
> }
> return 0;
> +#endif

I think the compiler will complain about this.  If you move the #endif
before the return, the patch is:
Reviewed-by: Alex Deucher 

>  }
>
>  static int amdgpu_discovery_set_gc_ip_blocks(struct amdgpu_device *adev)
> --
> 2.25.1
>


Re: [PATCH v9 4/6] drm: implement a method to free unused pages

2022-01-20 Thread Matthew Auld

On 19/01/2022 11:37, Arunpravin wrote:

On contiguous allocation, we round up the size
to the *next* power of 2, implement a function
to free the unused pages after the newly allocate block.

v2(Matthew Auld):
   - replace function name 'drm_buddy_free_unused_pages' with
 drm_buddy_block_trim
   - replace input argument name 'actual_size' with 'new_size'
   - add more validation checks for input arguments
   - add overlaps check to avoid needless searching and splitting
   - merged the below patch to see the feature in action
  - add free unused pages support to i915 driver
   - lock drm_buddy_block_trim() function as it calls mark_free/mark_split
 are all globally visible

v3(Matthew Auld):
   - remove trim method error handling as we address the failure case
 at drm_buddy_block_trim() function

v4:
   - in case of trim, at __alloc_range() split_block failure path
 marks the block as free and removes it from the original list,
 potentially also freeing it, to overcome this problem, we turn
 the drm_buddy_block_trim() input node into a temporary node to
 prevent recursively freeing itself, but still retain the
 un-splitting/freeing of the other nodes(Matthew Auld)

   - modify the drm_buddy_block_trim() function return type

v5(Matthew Auld):
   - revert drm_buddy_block_trim() function return type changes in v4
   - modify drm_buddy_block_trim() passing argument n_pages to original_size
 as n_pages has already been rounded up to the next power-of-two and
 passing n_pages results noop

v6:
   - fix warnings reported by kernel test robot 

Signed-off-by: Arunpravin 
---
  drivers/gpu/drm/drm_buddy.c   | 65 +++
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 10 +++
  include/drm/drm_buddy.h   |  4 ++
  3 files changed, 79 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 6aa5c1ce25bf..c5902a81b8c5 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -546,6 +546,71 @@ static int __drm_buddy_alloc_range(struct drm_buddy *mm,
return __alloc_range(mm, &dfs, start, size, blocks);
  }
  
+/**

+ * drm_buddy_block_trim - free unused pages
+ *
+ * @mm: DRM buddy manager
+ * @new_size: original size requested
+ * @blocks: output list head to add allocated blocks


@blocks: Input and output list of allocated blocks. MUST contain single 
block as input to be trimmed. On success will contain the newly 
allocated blocks making up the @new_size. Blocks always appear in 
ascending order.


?


+ *
+ * For contiguous allocation, we round up the size to the nearest
+ * power of two value, drivers consume *actual* size, so remaining
+ * portions are unused and it can be freed.


so remaining portions are unused and can be optionally freed with this 
function.


?


+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_buddy_block_trim(struct drm_buddy *mm,
+u64 new_size,
+struct list_head *blocks)
+{
+   struct drm_buddy_block *parent;
+   struct drm_buddy_block *block;
+   LIST_HEAD(dfs);
+   u64 new_start;
+   int err;
+
+   if (!list_is_singular(blocks))
+   return -EINVAL;
+
+   block = list_first_entry(blocks,
+struct drm_buddy_block,
+link);
+
+   if (!drm_buddy_block_is_allocated(block))


Maybe:

if (WARN_ON(!drm_buddy_block_is_allocated(block)))

AFAIK it should be normally impossible to be handed such non-allocated 
block, and so should be treated as a serious programmer error.


?


+   return -EINVAL;
+
+   if (new_size > drm_buddy_block_size(mm, block))
+   return -EINVAL;
+
+   if (!new_size && !IS_ALIGNED(new_size, mm->chunk_size))
+   return -EINVAL;


I assume that's a typo:

if (!new_size || ...)

Otherwise I think looks good. Some unit tests for this would be nice, 
but not a blocker. And this does at least pass the igt_mock_contiguous 
selftest, and I didn't see anything nasty when running on DG1, which 
does make use of TTM_PL_FLAG_CONTIGUOUS.

Reviewed-by: Matthew Auld 


+
+   if (new_size == drm_buddy_block_size(mm, block))
+   return 0;
+
+   list_del(&block->link);
+   mark_free(mm, block);
+   mm->avail += drm_buddy_block_size(mm, block);
+
+   /* Prevent recursively freeing this node */
+   parent = block->parent;
+   block->parent = NULL;
+
+   new_start = drm_buddy_block_offset(block);
+   list_add(&block->tmp_link, &dfs);
+   err =  __alloc_range(mm, &dfs, new_start, new_size, blocks);
+   if (err) {
+   mark_allocated(block);
+   mm->avail -= drm_buddy_block_size(mm, block);
+   list_add(&block->link, blocks);
+   }
+
+   block->parent = parent;
+   return err;
+}
+EXPORT_SYMBOL(drm_buddy_block_trim);
+
  /**
   *

Re: [PATCH] drm/amdgpu: Disable FRU EEPROM access for SRIOV

2022-01-20 Thread Alex Deucher
On Thu, Jan 20, 2022 at 10:49 AM shaoyunl  wrote:
>
> VF access to the EEPROM is blocked by security policy; we might need another
> way to get SKU info for the VF
>
> Signed-off-by: shaoyunl 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> index 2a786e788627..0548e279cc9f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> @@ -40,6 +40,12 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
> *adev)
>  */
> struct atom_context *atom_ctx = adev->mode_info.atom_context;
>
> +   /* The i2c access is blocked on VF
> +* TODO: Need other way to get the info
> +*/
> +   if (amdgpu_sriov_vf(adev))
> +   return false;
> +
> /* VBIOS is of the format ###-DXXXYY-##. For SKU identification,
>  * we can use just the "DXXX" portion. If there were more models, we
>  * could convert the 3 characters to a hex integer and use a switch
> --
> 2.17.1
>


RE: [PATCH v9 1/6] drm: move the buddy allocator from i915 into common drm

2022-01-20 Thread Paneer Selvam, Arunpravin
[AMD Official Use Only]

Hi Matthew,

Do you have suggestions/issues for the other patches? Shall we push all the
other patches into drm-misc-next?

Thanks,
Arun.

I've just gone ahead and pushed this version here to drm-misc-next.

That should at least reduce the number of mails sent back and forth.

Let me know when there are more rbs on the rest and I will push that as 
well.

Thanks,
Christian.

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Wednesday, January 19, 2022 12:54 PM
To: Paneer Selvam, Arunpravin ; 
dri-de...@lists.freedesktop.org; intel-...@lists.freedesktop.org; 
amd-gfx@lists.freedesktop.org
Cc: dan...@ffwll.ch; jani.nik...@linux.intel.com; matthew.a...@intel.com; 
tzimmerm...@suse.de; Deucher, Alexander ; Koenig, 
Christian 
Subject: Re: [PATCH v9 1/6] drm: move the buddy allocator from i915 into common 
drm

On 18.01.22 at 11:44, Arunpravin wrote:
> Move the base i915 buddy allocator code into drm
> - Move i915_buddy.h to include/drm
> - Move i915_buddy.c to drm root folder
> - Rename "i915" string with "drm" string wherever applicable
> - Rename "I915" string with "DRM" string wherever applicable
> - Fix header file dependencies
> - Fix alignment issues
> - add Makefile support for drm buddy
> - export functions and write kerneldoc description
> - Remove i915 selftest config check condition as buddy selftest
>will be moved to drm selftest folder
>
> cleanup i915 buddy references in i915 driver module
> and replace with drm buddy
>
> v2:
>- include header file in alphabetical order(Thomas)
>- merged changes listed in the body section into a single patch
>  to keep the build intact(Christian, Jani)
>
> v3:
>- make drm buddy a separate module(Thomas, Christian)
>
> v4:
>- Fix build error reported by kernel test robot 
>- removed i915 buddy selftest from i915_mock_selftests.h to
>  avoid build error
>- removed selftests/i915_buddy.c file as we create a new set of
>  buddy test cases in drm/selftests folder
>
> v5:
>- Fix merge conflict issue
>
> v6:
>- replace drm_buddy_mm structure name as drm_buddy(Thomas, Christian)
>- replace drm_buddy_alloc() function name as drm_buddy_alloc_blocks()
>  (Thomas)
>- replace drm_buddy_free() function name as drm_buddy_free_block()
>  (Thomas)
>- export drm_buddy_free_block() function
>- fix multiple instances of KMEM_CACHE() entry
>
> v7:
>- fix warnings reported by kernel test robot 
>- modify the license(Christian)
>
> v8:
>- fix warnings reported by kernel test robot 
>
> Signed-off-by: Arunpravin 

I've just gone ahead and pushed this version here to drm-misc-next.

That should at least reduce the number of mails sent back and forth.

Let me know when there are more rbs on the rest and I will push that as 
well.

Thanks,
Christian.

> ---
>   drivers/gpu/drm/Kconfig   |   6 +
>   drivers/gpu/drm/Makefile  |   2 +
>   drivers/gpu/drm/drm_buddy.c   | 535 
>   drivers/gpu/drm/i915/Kconfig  |   1 +
>   drivers/gpu/drm/i915/Makefile |   1 -
>   drivers/gpu/drm/i915/i915_buddy.c | 466 ---
>   drivers/gpu/drm/i915/i915_buddy.h | 143 
>   drivers/gpu/drm/i915/i915_module.c|   3 -
>   drivers/gpu/drm/i915/i915_scatterlist.c   |  11 +-
>   drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  33 +-
>   drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   4 +-
>   drivers/gpu/drm/i915/selftests/i915_buddy.c   | 787 --
>   .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
>   .../drm/i915/selftests/intel_memory_region.c  |  13 +-
>   include/drm/drm_buddy.h   | 150 
>   15 files changed, 725 insertions(+), 1431 deletions(-)
>   create mode 100644 drivers/gpu/drm/drm_buddy.c
>   delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c
>   delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
>   delete mode 100644 drivers/gpu/drm/i915/selftests/i915_buddy.c
>   create mode 100644 include/drm/drm_buddy.h
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 91f54aeb0b7c..cc3e979c9c9d 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -204,6 +204,12 @@ config DRM_TTM
> GPU memory types. Will be enabled automatically if a device driver
> uses it.
>   
> +config DRM_BUDDY
> + tristate
> + depends on DRM
> + help
> +   A page based buddy allocator
> +
>   config DRM_VRAM_HELPER
>   tristate
>   depends on DRM
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 700abeb4945e..8675c2af7ae1 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -40,6 +40,8 @@ obj-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_cma_helper.o
>   drm_shmem_helper-y := drm_gem_shmem_helper.o
>   obj-$(CONFIG_DRM_GEM_SHMEM_HELPER) += drm_shmem_helper.o
>   
> +obj-$(CONFIG_DRM_B

Re: [PATCH 1/2] drm/amdkfd: svm deferred_list work continue cleanup after mm gone

2022-01-20 Thread Felix Kuehling
Can we instead take a proper reference to the mm in
svm_range_add_list_work? That way the mm would remain valid as long as
the work is scheduled.

So instead of calling get_task_mm in svm_range_deferred_list_work, do it
in svm_range_add_list_work.

Regards,
  Felix


On 2022-01-19 at 11:22 a.m., Philip Yang wrote:
> After mm is removed from task->mm, deferred_list work should continue to
> handle deferred_range_list, which may be split into child ranges, to avoid
> child range leaks, and remove the ranges' mmu interval notifier to avoid
> mm_count leaks, but skip updating the notifier and inserting a new notifier.
>
> Signed-off-by: Philip Yang 
> Reported-by: Ruili Ji 
> Tested-by: Ruili Ji 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 41 
>  1 file changed, 24 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index f2805ba74c80..9ec195e1ef23 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1985,10 +1985,9 @@ svm_range_update_notifier_and_interval_tree(struct 
> mm_struct *mm,
>  }
>  
>  static void
> -svm_range_handle_list_op(struct svm_range_list *svms, struct svm_range 
> *prange)
> +svm_range_handle_list_op(struct svm_range_list *svms, struct svm_range 
> *prange,
> +  struct mm_struct *mm)
>  {
> - struct mm_struct *mm = prange->work_item.mm;
> -
>   switch (prange->work_item.op) {
>   case SVM_OP_NULL:
>   pr_debug("NULL OP 0x%p prange 0x%p [0x%lx 0x%lx]\n",
> @@ -2004,25 +2003,29 @@ svm_range_handle_list_op(struct svm_range_list *svms, 
> struct svm_range *prange)
>   case SVM_OP_UPDATE_RANGE_NOTIFIER:
>   pr_debug("update notifier 0x%p prange 0x%p [0x%lx 0x%lx]\n",
>svms, prange, prange->start, prange->last);
> - svm_range_update_notifier_and_interval_tree(mm, prange);
> + if (mm)
> + svm_range_update_notifier_and_interval_tree(mm, prange);
>   break;
>   case SVM_OP_UPDATE_RANGE_NOTIFIER_AND_MAP:
>   pr_debug("update and map 0x%p prange 0x%p [0x%lx 0x%lx]\n",
>svms, prange, prange->start, prange->last);
> - svm_range_update_notifier_and_interval_tree(mm, prange);
> + if (mm)
> + svm_range_update_notifier_and_interval_tree(mm, prange);
>   /* TODO: implement deferred validation and mapping */
>   break;
>   case SVM_OP_ADD_RANGE:
>   pr_debug("add 0x%p prange 0x%p [0x%lx 0x%lx]\n", svms, prange,
>prange->start, prange->last);
>   svm_range_add_to_svms(prange);
> - svm_range_add_notifier_locked(mm, prange);
> + if (mm)
> + svm_range_add_notifier_locked(mm, prange);
>   break;
>   case SVM_OP_ADD_RANGE_AND_MAP:
>   pr_debug("add and map 0x%p prange 0x%p [0x%lx 0x%lx]\n", svms,
>prange, prange->start, prange->last);
>   svm_range_add_to_svms(prange);
> - svm_range_add_notifier_locked(mm, prange);
> + if (mm)
> + svm_range_add_notifier_locked(mm, prange);
>   /* TODO: implement deferred validation and mapping */
>   break;
>   default:
> @@ -2071,20 +2074,22 @@ static void svm_range_deferred_list_work(struct 
> work_struct *work)
>   pr_debug("enter svms 0x%p\n", svms);
>  
>   p = container_of(svms, struct kfd_process, svms);
> - /* Avoid mm is gone when inserting mmu notifier */
> +
> + /* If mm is gone, continue cleanup the deferred_range_list */
>   mm = get_task_mm(p->lead_thread);
> - if (!mm) {
> + if (!mm)
>   pr_debug("svms 0x%p process mm gone\n", svms);
> - return;
> - }
> +
>  retry:
> - mmap_write_lock(mm);
> + if (mm)
> + mmap_write_lock(mm);
>  
>   /* Checking for the need to drain retry faults must be inside
>* mmap write lock to serialize with munmap notifiers.
>*/
>   if (unlikely(atomic_read(&svms->drain_pagefaults))) {
> - mmap_write_unlock(mm);
> + if (mm)
> + mmap_write_unlock(mm);
>   svm_range_drain_retry_fault(svms);
>   goto retry;
>   }
> @@ -2109,19 +2114,21 @@ static void svm_range_deferred_list_work(struct 
> work_struct *work)
>   pr_debug("child prange 0x%p op %d\n", pchild,
>pchild->work_item.op);
>   list_del_init(&pchild->child_list);
> - svm_range_handle_list_op(svms, pchild);
> + svm_range_handle_list_op(svms, pchild, mm);
>   }
>   mutex_unlock(&prange->migrate_mutex);
>  
> - svm_range_handle_list_op(svms, prange);
> +

Re: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Lazar, Lijo

Apart from patch 1, rest of the series is

Reviewed-by: Lijo Lazar 

Patch 1 needs another look related to i2c transfers.

Thanks,
Lijo

On 1/17/2022 11:11 AM, Evan Quan wrote:

As all those APIs are already protected either by adev->pm.mutex
or smu->message_lock.

Signed-off-by: Evan Quan 
Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
---
  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 315 ++
  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   2 -
  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   2 -
  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   2 -
  .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   2 -
  6 files changed, 25 insertions(+), 299 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 828cb932f6a9..411f03eb4523 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct smu_context *smu,
   uint32_t mask);
  static int smu_handle_task(struct smu_context *smu,
   enum amd_dpm_forced_level level,
-  enum amd_pp_task task_id,
-  bool lock_needed);
+  enum amd_pp_task task_id);
  static int smu_reset(struct smu_context *smu);
  static int smu_set_fan_speed_pwm(void *handle, u32 speed);
  static int smu_set_fan_control_mode(void *handle, u32 value);
@@ -68,36 +67,22 @@ static int smu_sys_get_pp_feature_mask(void *handle,
   char *buf)
  {
struct smu_context *smu = handle;
-   int size = 0;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
  
-	mutex_lock(&smu->mutex);

-
-   size = smu_get_pp_feature_mask(smu, buf);
-
-   mutex_unlock(&smu->mutex);
-
-   return size;
+   return smu_get_pp_feature_mask(smu, buf);
  }
  
  static int smu_sys_set_pp_feature_mask(void *handle,

   uint64_t new_mask)
  {
struct smu_context *smu = handle;
-   int ret = 0;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
  
-	mutex_lock(&smu->mutex);

-
-   ret = smu_set_pp_feature_mask(smu, new_mask);
-
-   mutex_unlock(&smu->mutex);
-
-   return ret;
+   return smu_set_pp_feature_mask(smu, new_mask);
  }
  
  int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value)

@@ -117,16 +102,12 @@ int smu_set_soft_freq_range(struct smu_context *smu,
  {
int ret = 0;
  
-	mutex_lock(&smu->mutex);

-
if (smu->ppt_funcs->set_soft_freq_limited_range)
ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
  clk_type,
  min,
  max);
  
-	mutex_unlock(&smu->mutex);

-
return ret;
  }
  
@@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct smu_context *smu,

if (!min && !max)
return -EINVAL;
  
-	mutex_lock(&smu->mutex);

-
if (smu->ppt_funcs->get_dpm_ultimate_freq)
ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
clk_type,
min,
max);
  
-	mutex_unlock(&smu->mutex);

-
return ret;
  }
  
@@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,

  {
struct smu_context *smu = handle;
struct smu_table_context *smu_table = &smu->smu_table;
-   uint32_t powerplay_table_size;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
@@ -490,18 +466,12 @@ static int smu_sys_get_pp_table(void *handle,
if (!smu_table->power_play_table && !smu_table->hardcode_pptable)
return -EINVAL;
  
-	mutex_lock(&smu->mutex);

-
if (smu_table->hardcode_pptable)
*table = smu_table->hardcode_pptable;
else
*table = smu_table->power_play_table;
  
-	powerplay_table_size = smu_table->power_play_table_size;

-
-   mutex_unlock(&smu->mutex);
-
-   return powerplay_table_size;
+   return smu_table->power_play_table_size;
  }
  
  static int smu_sys_set_pp_table(void *handle,

@@ -521,13 +491,10 @@ static int smu_sys_set_pp_table(void *handle,
return -EIO;
}
  
-	mutex_lock(&smu->mutex);

if (!smu_table->hardcode_pptable)
smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
-   if (!smu_table->hardcode_pptable) {
-   ret = -ENOMEM;
-   goto failed;
-   }
+   if (!smu_

[PATCH] drm/amdgpu: Disable FRU EEPROM access for SRIOV

2022-01-20 Thread shaoyunl
VF access to the EEPROM is blocked by security policy; we might need another
way to get SKU info for the VF

Signed-off-by: shaoyunl 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 2a786e788627..0548e279cc9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -40,6 +40,12 @@ static bool is_fru_eeprom_supported(struct amdgpu_device 
*adev)
 */
struct atom_context *atom_ctx = adev->mode_info.atom_context;
 
+   /* The i2c access is blocked on VF
+* TODO: Need other way to get the info
+*/
+   if (amdgpu_sriov_vf(adev))
+   return false;
+
/* VBIOS is of the format ###-DXXXYY-##. For SKU identification,
 * we can use just the "DXXX" portion. If there were more models, we
 * could convert the 3 characters to a hex integer and use a switch
-- 
2.17.1



Re: [Intel-gfx] [PATCH 0/7] DRM kmap() fixes and kmap_local_page() conversions

2022-01-20 Thread Daniel Vetter
On Thu, Jan 20, 2022 at 09:16:35AM +0100, Christian König wrote:
> On 20.01.22 at 00:55, Ira Weiny wrote:
> > On Wed, Jan 19, 2022 at 06:24:22PM +0100, Daniel Vetter wrote:
> > > On Wed, Jan 19, 2022 at 08:53:56AM -0800, Ira Weiny wrote:
> > > > On Fri, Dec 10, 2021 at 03:23:57PM -0800, 'Ira Weiny' wrote:
> > > > > From: Ira Weiny 
> > > > > 
> > > > > This series starts by converting the last easy kmap() uses to
> > > > > kmap_local_page().
> > > > > 
> > > > > There is one more call to kmap() wrapped in ttm_bo_kmap_ttm().  
> > > > > Unfortunately,
> > > > > ttm_bo_kmap_ttm() is called in a number of different ways including 
> > > > > some which
> > > > > are not thread local.  I have a patch to convert that call.  However, 
> > > > > it is not
> > > > > straight forward so it is not included in this series.
> > > > > 
> > > > > The final 2 patches fix bugs found while working on the 
> > > > > ttm_bo_kmap_ttm()
> > > > > conversion.
> > > > Gentle ping on this series?  Will it make this merge window?
> > > I think this fell through the cracks and so no. Note that generally we
> > > feature-freeze drm tree around -rc6 anyway for the upcoming merge window,
> > > so you were cutting this all a bit close anyway.
> > Ok, No problem.  I just had not heard if this was picked up or not.
> > 
> > > Also looks like the ttm
> > > kmap caching question didn't get resolved?
> > I'm sorry, I thought it was resolved for this series.  Christian said the 
> > patches
> > in this series were "a good bug fix" even if not strictly necessary.[1]  
> > Beyond
> > this series I was discussing where to go from here, and is it possible to go
> > further with more changes.[2]  At the moment I don't think I will.
> > 
> > Christian, did I misunderstand?  I can drop patches 6 and 7 if they are not 
> > proper
> > bug fixes or at least clarifications to the code.
> 
> Yeah, it is indeed a correct cleanup. I would just *not* put a CC stable on
> it because it doesn't really fix anything.

Ok can you pls get the amd/radeon ones stuffed into alex' tree? Or do we
want to put all the ttm ones into drm-misc instead?
-Daniel

> 
> Christian.
> 
> > 
> > Ira
> > 
> > [1] https://lore.kernel.org/lkml/c3b173ea-6509-ebbe-b5f9-eeb29f1ce57e%40amd.com/
> > [2] https://lore.kernel.org/lkml/20211215210949.GW3538886%40iweiny-DESK2.sc.intel.com/
> > 
> > > Anyway if patches are stuck resend with RESEND and if people still don't
> > > pick them up poke me and I'll apply as fallback.
> > > 
> > > Cheers, Daniel
> > > 
> > > > Thanks,
> > > > Ira
> > > > 
> > > > > 
> > > > > Ira Weiny (7):
> > > > > drm/i915: Replace kmap() with kmap_local_page()
> > > > > drm/amd: Replace kmap() with kmap_local_page()
> > > > > drm/gma: Remove calls to kmap()
> > > > > drm/radeon: Replace kmap() with kmap_local_page()
> > > > > drm/msm: Alter comment to use kmap_local_page()
> > > > > drm/amdgpu: Ensure kunmap is called on error
> > > > > drm/radeon: Ensure kunmap is called on error
> > > > > 
> > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 
> > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 1 +
> > > > > drivers/gpu/drm/gma500/gma_display.c | 6 ++
> > > > > drivers/gpu/drm/gma500/mmu.c | 8 
> > > > > drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 4 ++--
> > > > > drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 8 
> > > > > drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 4 ++--
> > > > > drivers/gpu/drm/i915/gt/shmem_utils.c | 4 ++--
> > > > > drivers/gpu/drm/i915/i915_gem.c | 8 
> > > > > drivers/gpu/drm/i915/i915_gpu_error.c | 4 ++--
> > > > > drivers/gpu/drm/msm/msm_gem_submit.c | 4 ++--
> > > > > drivers/gpu/drm/radeon/radeon_ttm.c | 4 ++--
> > > > > drivers/gpu/drm/radeon/radeon_uvd.c | 1 +
> > > > > 13 files changed, 32 insertions(+), 32 deletions(-)
> > > > > 
> > > > > --
> > > > > 2.31.1
> > > > > 
> > > -- 
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch/

Re: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Christian König
I think we could remove this warning as well; all we need to do is make 
sure that the GART table is always restored from the metadata before it 
is enabled in hardware.


I've seen that we do this anyway for most hardware generations, but we 
really need to double check.


Christian.

On 20.01.22 at 16:04, Kim, Jonathan wrote:

[Public]

Switching to a VRAM bounce buffer can drop performance around 4x~6x on Vega20 
over larger accesses, so it's not desired.

Jon


-Original Message-
From: Koenig, Christian 
Sent: January 20, 2022 9:10 AM
To: Chen, Guchun ; Christian König
; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

I actually suggested allocating the bounce buffer in VRAM, but that adds a
bit more latency.

Christian.

On 20.01.22 at 15:00, Chen, Guchun wrote:

[Public]

Hi Christian,

Unfortunately, your patch brings another warning from the same
sdma_access_bo creation in amdgpu_ttm_init.

In your patch, you introduce a new check, WARN_ON(!adev->gart.ptr);
however, sdma_access_bo targets to create a bo in the GTT domain, but
adev->gart.ptr is ready only after gmc_v10_0_gart_init.

Hi Jonathan,

Is it mandatory to create this sdma_access_bo in GTT domain? Can we

change it to VRAM?

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Wednesday, January 19, 2022 10:38 PM
To: Chen, Guchun ; Christian König
; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Hi Guchun,

yes, just haven't found time to do this yet.

Regards,
Christian.

On 19.01.22 at 15:24, Chen, Guchun wrote:

[Public]

Hello Christian,

Do you plan to submit your code to drm-next branch?

Regards,
Guchun

-Original Message-
From: Chen, Guchun
Sent: Tuesday, January 18, 2022 10:22 PM
To: 'Christian König' ; Kim,
Jonathan ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag

[Public]

Thanks for the clarification. The patch is:
Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 10:10 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

On 18.01.22 at 14:28, Chen, Guchun wrote:

[Public]

- if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
- goto skip_pin_bo;
-
- r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
- if (r)
- return r;
-
-skip_pin_bo:

Does deleting the skip_pin_bo path cause a redundant bo pin in the SRIOV case?

Pinning/unpinning the BO was already removed as well.

See Nirmoy's patches in the git log.

Regards,
Christian.


Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 8:02 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: remove gart.ready flag

That's just a leftover from old radeon days and was preventing CS and
GART bindings before the hardware was initialized. But nowadays that is
perfectly valid.

The only thing we need to warn about is a GART binding before the
table is even allocated.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++-

--

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  8 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  8 +--
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 10 +--
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  5 +-
 11 files changed, 52 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 645950a653a0..53cc844346f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -150,7 +150,7 @@ void amdgpu_gart_table_vram_free(struct

amdgpu_device *adev)

  * replaces them with the dummy page (all asics).
  * Returns 0 for success, -EINVAL for failure.
  */
-int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t

offset,

+void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t
+offset,
   int pages)
 {
   unsigned t;
@@ -161,13 +161,11 @@ int amdgpu_gart_unbind(struct

amdgpu_device *adev, uint64_t offset,

   uint64_t flags = 0;
   int idx;

- if (!adev->gart.ready) {
- WARN(1, "trying to unbind memory from uninitialized GART

!\n");

- return -EINVAL;
- }
+ if (WARN_ON(!adev->gart.ptr))
+ return;

   if (!drm_dev_enter(adev_to_drm(adev), &idx))
- return 0;
+ return;

  

Re: [PATCH 1/9] drm/amdgpu: add helper to query rlcg reg access flag

2022-01-20 Thread Lazar, Lijo

Series is
Reviewed-by: Lijo Lazar 

Thanks,
Lijo

On 1/20/2022 4:48 PM, Hawking Zhang wrote:

Query the rlc indirect register access approach specified
by the sriov host driver per ip block

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 35 
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |  8 ++
  2 files changed, 43 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 07bc0f504713..a40e4fcdfa46 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -820,3 +820,38 @@ void amdgpu_virt_update_sriov_video_codec(struct 
amdgpu_device *adev,
}
}
  }
+
+bool amdgpu_virt_get_rlcg_reg_access_flag(struct amdgpu_device *adev, u32 
acc_flags,
+ u32 hwip, bool write, u32 *rlcg_flag)
+{
+   bool ret = false;
+
+   switch (hwip) {
+   case GC_HWIP:
+   if (amdgpu_sriov_reg_indirect_gc(adev)) {
+   *rlcg_flag =
+   write ? AMDGPU_RLCG_GC_WRITE : 
AMDGPU_RLCG_GC_READ;
+   ret = true;
+   /* only in new version, AMDGPU_REGS_NO_KIQ and
+* AMDGPU_REGS_RLC are enabled simultaneously */
+   } else if ((acc_flags & AMDGPU_REGS_RLC) &&
+  !(acc_flags & AMDGPU_REGS_NO_KIQ)) {
+   *rlcg_flag = AMDGPU_RLCG_GC_WRITE_LEGACY;
+   ret = true;
+   }
+   break;
+   case MMHUB_HWIP:
+   if (amdgpu_sriov_reg_indirect_mmhub(adev) &&
+   (acc_flags & AMDGPU_REGS_RLC) && write) {
+   *rlcg_flag = AMDGPU_RLCG_MMHUB_WRITE;
+   ret = true;
+   }
+   break;
+   default:
+   dev_err(adev->dev,
+   "indirect registers access through rlcg is not 
supported\n");
+   ret = false;
+   break;
+   }
+   return ret;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 9adfb8d63280..404a06e57f30 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -32,6 +32,12 @@
  #define AMDGPU_PASSTHROUGH_MODE(1 << 3) /* thw whole GPU is pass 
through for VM */
  #define AMDGPU_SRIOV_CAPS_RUNTIME  (1 << 4) /* is out of full access mode 
*/
  
+/* flags for indirect register access path supported by rlcg for sriov */

+#define AMDGPU_RLCG_GC_WRITE_LEGACY(0x8 << 28)
+#define AMDGPU_RLCG_GC_WRITE   (0x0 << 28)
+#define AMDGPU_RLCG_GC_READ(0x1 << 28)
+#define AMDGPU_RLCG_MMHUB_WRITE(0x2 << 28)
+
  /* all asic after AI use this offset */
  #define mmRCC_IOV_FUNC_IDENTIFIER 0xDE5
  /* tonga/fiji use this offset */
@@ -321,4 +327,6 @@ enum amdgpu_sriov_vf_mode 
amdgpu_virt_get_sriov_vf_mode(struct amdgpu_device *ad
  void amdgpu_virt_update_sriov_video_codec(struct amdgpu_device *adev,
struct amdgpu_video_codec_info *encode, uint32_t 
encode_array_size,
struct amdgpu_video_codec_info *decode, uint32_t 
decode_array_size);
+bool amdgpu_virt_get_rlcg_reg_access_flag(struct amdgpu_device *adev, u32 
acc_flags,
+ u32 hwip, bool write, u32 *rlcg_flag);
  #endif



RE: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Chen, Guchun
[Public]

Then the simplest way to drop the call trace is to defer the creation of
this bo until after xx_gart_init is called?

Regards,
Guchun

-Original Message-
From: Kim, Jonathan  
Sent: Thursday, January 20, 2022 11:05 PM
To: Koenig, Christian ; Chen, Guchun 
; Christian König ; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag

[Public]

Switching to a VRAM bounce buffer can drop performance around 4x~6x on Vega20 
over larger accesses, so it's not desired.

Jon

> -Original Message-
> From: Koenig, Christian 
> Sent: January 20, 2022 9:10 AM
> To: Chen, Guchun ; Christian König 
> ; Kim, Jonathan 
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>
> I actually suggested allocating the bounce buffer in VRAM, but that
> adds a bit more latency.
>
> Christian.
>
> > On 20.01.22 at 15:00, Chen, Guchun wrote:
> > [Public]
> >
> > Hi Christian,
> >
> > Unfortunately, your patch brings another warning from the same
> sdma_access_bo's creation in amdgpu_ttm_init.
> >
> > In your patch, you introduce a new check of 
> > WARN_ON(!adev->gart.ptr)),
> however, sdma_access_bo targets to create a bo in GTT domain, but 
> adev-
> >gart.ptr is ready in gmc_v10_0_gart_init only.
> >
> > Hi Jonathan,
> >
> > Is it mandatory to create this sdma_access_bo in GTT domain? Can we
> change it to VRAM?
> >
> > Regards,
> > Guchun
> >
> > -Original Message-
> > From: Koenig, Christian 
> > Sent: Wednesday, January 19, 2022 10:38 PM
> > To: Chen, Guchun ; Christian König 
> > ; Kim, Jonathan 
> > ; amd-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
> >
> > Hi Guchun,
> >
> > yes, just haven't found time to do this yet.
> >
> > Regards,
> > Christian.
> >
> > Am 19.01.22 um 15:24 schrieb Chen, Guchun:
> >> [Public]
> >>
> >> Hello Christian,
> >>
> >> Do you plan to submit your code to drm-next branch?
> >>
> >> Regards,
> >> Guchun
> >>
> >> -Original Message-
> >> From: Chen, Guchun
> >> Sent: Tuesday, January 18, 2022 10:22 PM
> >> To: 'Christian König' ; Kim, 
> >> Jonathan ; amd-gfx@lists.freedesktop.org
> >> Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag
> >>
> >> [Public]
> >>
> >> Thanks for the clarification. The patch is:
> >> Reviewed-by: Guchun Chen 
> >>
> >> Regards,
> >> Guchun
> >>
> >> -Original Message-
> >> From: Christian König 
> >> Sent: Tuesday, January 18, 2022 10:10 PM
> >> To: Chen, Guchun ; Kim, Jonathan 
> >> ; amd-gfx@lists.freedesktop.org
> >> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
> >>
> >> Am 18.01.22 um 14:28 schrieb Chen, Guchun:
> >>> [Public]
> >>>
> >>> - if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
> >>> - goto skip_pin_bo;
> >>> -
> >>> - r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
> >>> - if (r)
> >>> - return r;
> >>> -
> >>> -skip_pin_bo:
> >>>
> >>> Does deleting the skip_pin_bo path cause a redundant BO pin in the SRIOV case?
> >> Pinning/unpinning the BO was already removed as well.
> >>
> >> See Nirmoy's patches in the git log.
> >>
> >> Regards,
> >> Christian.
> >>
> >>> Regards,
> >>> Guchun
> >>>
> >>> -Original Message-
> >>> From: Christian König 
> >>> Sent: Tuesday, January 18, 2022 8:02 PM
> >>> To: Chen, Guchun ; Kim, Jonathan 
> >>> ; amd-gfx@lists.freedesktop.org
> >>> Subject: [PATCH] drm/amdgpu: remove gart.ready flag
> >>>
> >>> That's just a leftover from old radeon days and was preventing CS and
> >>> GART bindings before the hardware was initialized. But nowadays that is
> >>> perfectly valid.
> >>>
> >>> The only thing we need to warn about is GART bindings before the
> >>> table is even allocated.
> >>>
> >>> Signed-off-by: Christian König 
> >>> ---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  8 +--
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  8 +--
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 10 +--
> >>> drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  5 +-
> >>> 11 files changed, 52 insertions(+), 137 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> index 645950a653a0..53cc844346f0 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> @@ -150,7 +150,7 @@ void amdgpu_gart_table_vram_free(struct
> amdgpu_device *adev)
> >>>  * replaces them with the dummy page (all asics).
> >>>  * Returns 0 for succe

Re: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Lazar, Lijo




On 1/17/2022 11:11 AM, Evan Quan wrote:

All those APIs are already protected either by adev->pm.mutex
or by smu->message_lock.

Signed-off-by: Evan Quan 
Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
---
  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 315 ++
  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
  .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   2 -
  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   2 -
  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   2 -
  .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   2 -
  6 files changed, 25 insertions(+), 299 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 828cb932f6a9..411f03eb4523 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct smu_context *smu,
   uint32_t mask);
  static int smu_handle_task(struct smu_context *smu,
   enum amd_dpm_forced_level level,
-  enum amd_pp_task task_id,
-  bool lock_needed);
+  enum amd_pp_task task_id);
  static int smu_reset(struct smu_context *smu);
  static int smu_set_fan_speed_pwm(void *handle, u32 speed);
  static int smu_set_fan_control_mode(void *handle, u32 value);
@@ -68,36 +67,22 @@ static int smu_sys_get_pp_feature_mask(void *handle,
   char *buf)
  {
struct smu_context *smu = handle;
-   int size = 0;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
  
-	mutex_lock(&smu->mutex);

-
-   size = smu_get_pp_feature_mask(smu, buf);
-
-   mutex_unlock(&smu->mutex);
-
-   return size;
+   return smu_get_pp_feature_mask(smu, buf);
  }
  
  static int smu_sys_set_pp_feature_mask(void *handle,

   uint64_t new_mask)
  {
struct smu_context *smu = handle;
-   int ret = 0;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
  
-	mutex_lock(&smu->mutex);

-
-   ret = smu_set_pp_feature_mask(smu, new_mask);
-
-   mutex_unlock(&smu->mutex);
-
-   return ret;
+   return smu_set_pp_feature_mask(smu, new_mask);
  }
  
  int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value)

@@ -117,16 +102,12 @@ int smu_set_soft_freq_range(struct smu_context *smu,
  {
int ret = 0;
  
-	mutex_lock(&smu->mutex);

-
if (smu->ppt_funcs->set_soft_freq_limited_range)
ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
  clk_type,
  min,
  max);
  
-	mutex_unlock(&smu->mutex);

-
return ret;
  }
  
@@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct smu_context *smu,

if (!min && !max)
return -EINVAL;
  
-	mutex_lock(&smu->mutex);

-
if (smu->ppt_funcs->get_dpm_ultimate_freq)
ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
clk_type,
min,
max);
  
-	mutex_unlock(&smu->mutex);

-
return ret;
  }
  
@@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,

  {
struct smu_context *smu = handle;
struct smu_table_context *smu_table = &smu->smu_table;
-   uint32_t powerplay_table_size;
  
  	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)

return -EOPNOTSUPP;
@@ -490,18 +466,12 @@ static int smu_sys_get_pp_table(void *handle,
if (!smu_table->power_play_table && !smu_table->hardcode_pptable)
return -EINVAL;
  
-	mutex_lock(&smu->mutex);

-
if (smu_table->hardcode_pptable)
*table = smu_table->hardcode_pptable;
else
*table = smu_table->power_play_table;
  
-	powerplay_table_size = smu_table->power_play_table_size;

-
-   mutex_unlock(&smu->mutex);
-
-   return powerplay_table_size;
+   return smu_table->power_play_table_size;
  }
  
  static int smu_sys_set_pp_table(void *handle,

@@ -521,13 +491,10 @@ static int smu_sys_set_pp_table(void *handle,
return -EIO;
}
  
-	mutex_lock(&smu->mutex);

if (!smu_table->hardcode_pptable)
smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
-   if (!smu_table->hardcode_pptable) {
-   ret = -ENOMEM;
-   goto failed;
-   }
+   if (!smu_table->hardcode_pptable)
+   return -ENOMEM;
  
  	memcpy(smu_table->hardcode_pptable, buf, size);

smu_table->pow

Re: [PATCH 2/2] drm/amdgpu: protected amdgpu_dma_buf_move_notify against hotplug

2022-01-20 Thread Alex Deucher
Series is:
Reviewed-by: Alex Deucher 

On Thu, Jan 20, 2022 at 4:03 AM Christian König
 wrote:
>
> Add the proper drm_dev_enter()/drm_dev_exit() calls here.
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 8756f505c87d..eb31ba3da403 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -36,6 +36,7 @@
>  #include "amdgpu_gem.h"
>  #include "amdgpu_dma_buf.h"
>  #include "amdgpu_xgmi.h"
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -447,14 +448,18 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment 
> *attach)
> struct ttm_operation_ctx ctx = { false, false };
> struct ttm_placement placement = {};
> struct amdgpu_vm_bo_base *bo_base;
> -   int r;
> +   int idx, r;
>
> if (bo->tbo.resource->mem_type == TTM_PL_SYSTEM)
> return;
>
> +   if (!drm_dev_enter(adev_to_drm(adev), &idx))
> +   return;
> +
> r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
> if (r) {
> DRM_ERROR("Failed to invalidate DMA-buf import (%d))\n", r);
> +   drm_dev_exit(idx);
> return;
> }
>
> @@ -490,6 +495,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment 
> *attach)
>
> dma_resv_unlock(resv);
> }
> +   drm_dev_exit(idx);
>  }
>
>  static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
> --
> 2.25.1
>
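As an aside, the drm_dev_enter()/drm_dev_exit() guard added in the patch above follows a simple rule: bail out early when the device is already gone, and balance every successful enter with an exit on every return path, including error paths. A minimal userspace sketch of that control flow, with mocked stand-ins for the DRM helpers (mock_dev_enter(), mock_dev_exit(), guarded_work() and the bookkeeping are illustrative, not the real DRM internals):

```c
#include <stdbool.h>

/* Mocked stand-ins for drm_dev_enter()/drm_dev_exit(): the real helpers
 * report whether the device is still alive and open a protected section
 * that must be closed with the returned token. */
static bool device_unplugged;
static int enter_count;

static bool mock_dev_enter(int *idx)
{
	if (device_unplugged)
		return false;
	*idx = ++enter_count;	/* token handed back to mock_dev_exit() */
	return true;
}

static void mock_dev_exit(int idx)
{
	(void)idx;
	enter_count--;
}

/* The pattern from the patch: every return after a successful enter must
 * go through exit, the error path included. Returns the number of pages
 * processed, or 0 if the device is gone or validation fails. */
static int guarded_work(int pages, bool fail_validate)
{
	int idx, done = 0;

	if (!mock_dev_enter(&idx))
		return 0;	/* device unplugged: nothing to do */

	if (fail_validate) {
		mock_dev_exit(idx);	/* error path still balances enter */
		return 0;
	}

	done = pages;
	mock_dev_exit(idx);
	return done;
}
```

In the kernel the enter call also opens an SRCU read-side section, so an unbalanced return path would be a locking bug, not just sloppy bookkeeping.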


RE: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Kim, Jonathan
[Public]

Switching to a VRAM bounce buffer can drop performance around 4x~6x on Vega20
over larger accesses, so it's not desired.

Jon

> -Original Message-
> From: Koenig, Christian 
> Sent: January 20, 2022 9:10 AM
> To: Chen, Guchun ; Christian König
> ; Kim, Jonathan
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>
> I actually suggested allocating the bounce buffer in VRAM, but that adds a
> bit more latency.
>
> Christian.
>
> Am 20.01.22 um 15:00 schrieb Chen, Guchun:
> > [Public]
> >
> > Hi Christian,
> >
> > Unfortunately, your patch brings another warning from the same
> sdma_access_bo's creation in amdgpu_ttm_init.
> >
> > In your patch, you introduce a new check, WARN_ON(!adev->gart.ptr);
> > however, sdma_access_bo is created in the GTT domain, and
> > adev->gart.ptr is only initialized in gmc_v10_0_gart_init.
> >
> > Hi Jonathan,
> >
> > Is it mandatory to create this sdma_access_bo in GTT domain? Can we
> change it to VRAM?
> >
> > Regards,
> > Guchun
> >
> > -Original Message-
> > From: Koenig, Christian 
> > Sent: Wednesday, January 19, 2022 10:38 PM
> > To: Chen, Guchun ; Christian König
> > ; Kim, Jonathan
> > ; amd-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
> >
> > Hi Guchun,
> >
> > yes, just haven't found time to do this yet.
> >
> > Regards,
> > Christian.
> >
> > Am 19.01.22 um 15:24 schrieb Chen, Guchun:
> >> [Public]
> >>
> >> Hello Christian,
> >>
> >> Do you plan to submit your code to drm-next branch?
> >>
> >> Regards,
> >> Guchun
> >>
> >> -Original Message-
> >> From: Chen, Guchun
> >> Sent: Tuesday, January 18, 2022 10:22 PM
> >> To: 'Christian König' ; Kim,
> >> Jonathan ; amd-gfx@lists.freedesktop.org
> >> Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag
> >>
> >> [Public]
> >>
> >> Thanks for the clarification. The patch is:
> >> Reviewed-by: Guchun Chen 
> >>
> >> Regards,
> >> Guchun
> >>
> >> -Original Message-
> >> From: Christian König 
> >> Sent: Tuesday, January 18, 2022 10:10 PM
> >> To: Chen, Guchun ; Kim, Jonathan
> >> ; amd-gfx@lists.freedesktop.org
> >> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
> >>
> >> Am 18.01.22 um 14:28 schrieb Chen, Guchun:
> >>> [Public]
> >>>
> >>> - if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
> >>> - goto skip_pin_bo;
> >>> -
> >>> - r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
> >>> - if (r)
> >>> - return r;
> >>> -
> >>> -skip_pin_bo:
> >>>
> >>> Does deleting the skip_pin_bo path cause a redundant BO pin in the SRIOV case?
> >> Pinning/unpinning the BO was already removed as well.
> >>
> >> See Nirmoy's patches in the git log.
> >>
> >> Regards,
> >> Christian.
> >>
> >>> Regards,
> >>> Guchun
> >>>
> >>> -Original Message-
> >>> From: Christian König 
> >>> Sent: Tuesday, January 18, 2022 8:02 PM
> >>> To: Chen, Guchun ; Kim, Jonathan
> >>> ; amd-gfx@lists.freedesktop.org
> >>> Subject: [PATCH] drm/amdgpu: remove gart.ready flag
> >>>
> >>> That's just a leftover from old radeon days and was preventing CS and
> >>> GART bindings before the hardware was initialized. But nowadays that is
> >>> perfectly valid.
> >>>
> >>> The only thing we need to warn about is GART bindings before the
> >>> table is even allocated.
> >>>
> >>> Signed-off-by: Christian König 
> >>> ---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++---
> >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  8 +--
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  8 +--
> >>> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 10 +--
> >>> drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  5 +-
> >>> 11 files changed, 52 insertions(+), 137 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> index 645950a653a0..53cc844346f0 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> >>> @@ -150,7 +150,7 @@ void amdgpu_gart_table_vram_free(struct
> amdgpu_device *adev)
> >>>  * replaces them with the dummy page (all asics).
> >>>  * Returns 0 for success, -EINVAL for failure.
> >>>  */
> >>> -int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t
> offset,
> >>> +void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t
> >>> +offset,
> >>>   int pages)
> >>> {
> >>>   unsigned t;
> >>> @@ -161,13 +161,11 @@ int amdgpu_gart_unbind(struct
> amdgpu_device *adev, uint64_t offset,
> >>>   uint64_t flags 

Re: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Christian König
I actually suggested allocating the bounce buffer in VRAM, but that adds
a bit more latency.


Christian.

Am 20.01.22 um 15:00 schrieb Chen, Guchun:

[Public]

Hi Christian,

Unfortunately, your patch brings another warning from the same sdma_access_bo's 
creation in amdgpu_ttm_init.

In your patch, you introduce a new check, WARN_ON(!adev->gart.ptr); however,
sdma_access_bo is created in the GTT domain, and adev->gart.ptr is only
initialized in gmc_v10_0_gart_init.

Hi Jonathan,

Is it mandatory to create this sdma_access_bo in GTT domain? Can we change it 
to VRAM?

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Wednesday, January 19, 2022 10:38 PM
To: Chen, Guchun ; Christian König 
; Kim, Jonathan ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Hi Guchun,

yes, just haven't found time to do this yet.

Regards,
Christian.

Am 19.01.22 um 15:24 schrieb Chen, Guchun:

[Public]

Hello Christian,

Do you plan to submit your code to drm-next branch?

Regards,
Guchun

-Original Message-
From: Chen, Guchun
Sent: Tuesday, January 18, 2022 10:22 PM
To: 'Christian König' ; Kim,
Jonathan ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag

[Public]

Thanks for the clarification. The patch is:
Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 10:10 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Am 18.01.22 um 14:28 schrieb Chen, Guchun:

[Public]

-   if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
-   goto skip_pin_bo;
-
-   r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
-   if (r)
-   return r;
-
-skip_pin_bo:

Does deleting the skip_pin_bo path cause a redundant BO pin in the SRIOV case?

Pinning/unpinning the BO was already removed as well.

See Nirmoy's patches in the git log.

Regards,
Christian.


Regards,
Guchun

-Original Message-
From: Christian König 
Sent: Tuesday, January 18, 2022 8:02 PM
To: Chen, Guchun ; Kim, Jonathan
; amd-gfx@lists.freedesktop.org
Subject: [PATCH] drm/amdgpu: remove gart.ready flag

That's just a leftover from old radeon days and was preventing CS and GART
bindings before the hardware was initialized. But nowadays that is perfectly
valid.

The only thing we need to warn about is GART bindings before the table is
even allocated.

Signed-off-by: Christian König 
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  8 +--
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  8 +--
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 10 +--
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  5 +-
11 files changed, 52 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 645950a653a0..53cc844346f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -150,7 +150,7 @@ void amdgpu_gart_table_vram_free(struct amdgpu_device *adev)
 * replaces them with the dummy page (all asics).
 * Returns 0 for success, -EINVAL for failure.
 */
-int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
+void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
int pages)
{
unsigned t;
@@ -161,13 +161,11 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, 
uint64_t offset,
uint64_t flags = 0;
int idx;

-	if (!adev->gart.ready) {

-   WARN(1, "trying to unbind memory from uninitialized GART !\n");
-   return -EINVAL;
-   }
+   if (WARN_ON(!adev->gart.ptr))
+   return;

	if (!drm_dev_enter(adev_to_drm(adev), &idx))

-   return 0;
+   return;

	t = offset / AMDGPU_GPU_PAGE_SIZE;

p = t / AMDGPU_GPU_PAGES_IN_CPU_PAGE; @@ -188,7 +186,6 @@ int
amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
amdgpu_gmc_flush_gpu_tlb(adev, 0, i, 0);

	drm_dev_exit(idx);

-   return 0;
}

/**

@@ -204,7 +201,7 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t 
offset,
 * Map the dma_addresses into GART entries (all asics).
 * Returns 0 for success, -EINVAL for failure.
 */
-int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
+void amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
 

RE: [PATCH] drm/amdgpu: remove gart.ready flag

2022-01-20 Thread Chen, Guchun
[Public]

Hi Christian,

Unfortunately, your patch brings another warning from the same sdma_access_bo's 
creation in amdgpu_ttm_init.

In your patch, you introduce a new check, WARN_ON(!adev->gart.ptr); however,
sdma_access_bo is created in the GTT domain, and adev->gart.ptr is only
initialized in gmc_v10_0_gart_init.

Hi Jonathan,

Is it mandatory to create this sdma_access_bo in GTT domain? Can we change it 
to VRAM?

Regards,
Guchun

-Original Message-
From: Koenig, Christian  
Sent: Wednesday, January 19, 2022 10:38 PM
To: Chen, Guchun ; Christian König 
; Kim, Jonathan ; 
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag

Hi Guchun,

yes, just haven't found time to do this yet.

Regards,
Christian.

Am 19.01.22 um 15:24 schrieb Chen, Guchun:
> [Public]
>
> Hello Christian,
>
> Do you plan to submit your code to drm-next branch?
>
> Regards,
> Guchun
>
> -Original Message-
> From: Chen, Guchun
> Sent: Tuesday, January 18, 2022 10:22 PM
> To: 'Christian König' ; Kim, 
> Jonathan ; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH] drm/amdgpu: remove gart.ready flag
>
> [Public]
>
> Thanks for the clarification. The patch is:
> Reviewed-by: Guchun Chen 
>
> Regards,
> Guchun
>
> -Original Message-
> From: Christian König 
> Sent: Tuesday, January 18, 2022 10:10 PM
> To: Chen, Guchun ; Kim, Jonathan 
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: remove gart.ready flag
>
> Am 18.01.22 um 14:28 schrieb Chen, Guchun:
>> [Public]
>>
>> -if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
>> -goto skip_pin_bo;
>> -
>> -r = amdgpu_gtt_mgr_recover(&adev->mman.gtt_mgr);
>> -if (r)
>> -return r;
>> -
>> -skip_pin_bo:
>>
>> Does deleting the skip_pin_bo path cause a redundant BO pin in the SRIOV case?
> Pinning/unpinning the BO was already removed as well.
>
> See Nirmoy's patches in the git log.
>
> Regards,
> Christian.
>
>> Regards,
>> Guchun
>>
>> -Original Message-
>> From: Christian König 
>> Sent: Tuesday, January 18, 2022 8:02 PM
>> To: Chen, Guchun ; Kim, Jonathan 
>> ; amd-gfx@lists.freedesktop.org
>> Subject: [PATCH] drm/amdgpu: remove gart.ready flag
>>
>> That's just a leftover from old radeon days and was preventing CS and GART
>> bindings before the hardware was initialized. But nowadays that is perfectly
>> valid.
>>
>> The only thing we need to warn about is GART bindings before the table is
>> even allocated.
>>
>> Signed-off-by: Christian König 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 35 +++---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h| 15 ++--
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 +--
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 77 ++---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
>>drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  | 11 +--
>>drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c   |  7 +-
>>drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  8 +--
>>drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  8 +--
>>drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 10 +--
>>drivers/gpu/drm/amd/amdkfd/kfd_migrate.c|  5 +-
>>11 files changed, 52 insertions(+), 137 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> index 645950a653a0..53cc844346f0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> @@ -150,7 +150,7 @@ void amdgpu_gart_table_vram_free(struct amdgpu_device 
>> *adev)
>> * replaces them with the dummy page (all asics).
>> * Returns 0 for success, -EINVAL for failure.
>> */
>> -int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>> +void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>  int pages)
>>{
>>  unsigned t;
>> @@ -161,13 +161,11 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, 
>> uint64_t offset,
>>  uint64_t flags = 0;
>>  int idx;
>>
>> -if (!adev->gart.ready) {
>> -WARN(1, "trying to unbind memory from uninitialized GART !\n");
>> -return -EINVAL;
>> -}
>> +if (WARN_ON(!adev->gart.ptr))
>> +return;
>>
>>  if (!drm_dev_enter(adev_to_drm(adev), &idx))
>> -return 0;
>> +return;
>>
>>  t = offset / AMDGPU_GPU_PAGE_SIZE;
>>  p = t / AMDGPU_GPU_PAGES_IN_CPU_PAGE; @@ -188,7 +186,6 @@ int 
>> amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
>>  amdgpu_gmc_flush_gpu_tlb(adev, 0, i, 0);
>>
>>  drm_dev_exit(idx);
>> -return 0;
>>}
>>
>>/**
>> @@ -204,7 +201,7 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, 
>> uint64_t offset,
>> * Map the dma_addresses into GART entries (all asics).
>> * Returns 0 for success, -EINVAL for failure.
>> */
>> -int amdgpu_gart_map(struct amdgpu_device *adev, 

RE: [PATCH V2 7/7] drm/amd/pm: drop unneeded hwmgr->smu_lock

2022-01-20 Thread Chen, Guchun
[Public]

With the comment in patch 1 addressed, the series is:
Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Quan, Evan  
Sent: Thursday, January 20, 2022 7:52 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 
; Chen, Guchun 
Subject: RE: [PATCH V2 7/7] drm/amd/pm: drop unneeded hwmgr->smu_lock

[AMD Official Use Only]

Ping for the series..

> -Original Message-
> From: Quan, Evan 
> Sent: Monday, January 17, 2022 1:42 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Lazar, Lijo 
> ; Chen, Guchun ; Quan, Evan 
> 
> Subject: [PATCH V2 7/7] drm/amd/pm: drop unneeded hwmgr->smu_lock
> 
> All those related APIs are already well protected by adev->pm.mutex.
> 
> Signed-off-by: Evan Quan 
> Change-Id: I36426791d3bbc9d84a6ae437da26a892682eb0cb
> ---
>  .../gpu/drm/amd/pm/powerplay/amd_powerplay.c  | 278 +++---
>  drivers/gpu/drm/amd/pm/powerplay/inc/hwmgr.h  |   1 -
>  2 files changed, 38 insertions(+), 241 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> index 76c26ae368f9..a2da46bf3985 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> @@ -50,7 +50,6 @@ static int amd_powerplay_create(struct amdgpu_device
> *adev)
>   hwmgr->adev = adev;
>   hwmgr->not_vf = !amdgpu_sriov_vf(adev);
>   hwmgr->device = amdgpu_cgs_create_device(adev);
> - mutex_init(&hwmgr->smu_lock);
>   mutex_init(&hwmgr->msg_lock);
>   hwmgr->chip_family = adev->family;
>   hwmgr->chip_id = adev->asic_type;
> @@ -178,12 +177,9 @@ static int pp_late_init(void *handle)
>   struct amdgpu_device *adev = handle;
>   struct pp_hwmgr *hwmgr = adev->powerplay.pp_handle;
> 
> - if (hwmgr && hwmgr->pm_en) {
> - mutex_lock(&hwmgr->smu_lock);
> + if (hwmgr && hwmgr->pm_en)
>   hwmgr_handle_task(hwmgr,
>   AMD_PP_TASK_COMPLETE_INIT,
> NULL);
> - mutex_unlock(&hwmgr->smu_lock);
> - }
>   if (adev->pm.smu_prv_buffer_size != 0)
>   pp_reserve_vram_for_smu(adev);
> 
> @@ -345,11 +341,9 @@ static int pp_dpm_force_performance_level(void
> *handle,
>   if (level == hwmgr->dpm_level)
>   return 0;
> 
> - mutex_lock(&hwmgr->smu_lock);
>   pp_dpm_en_umd_pstate(hwmgr, &level);
>   hwmgr->request_dpm_level = level;
>   hwmgr_handle_task(hwmgr,
> AMD_PP_TASK_READJUST_POWER_STATE, NULL);
> - mutex_unlock(&hwmgr->smu_lock);
> 
>   return 0;
>  }
> @@ -358,21 +352,16 @@ static enum amd_dpm_forced_level 
> pp_dpm_get_performance_level(
>   void *handle)
>  {
>   struct pp_hwmgr *hwmgr = handle;
> - enum amd_dpm_forced_level level;
> 
>   if (!hwmgr || !hwmgr->pm_en)
>   return -EINVAL;
> 
> - mutex_lock(&hwmgr->smu_lock);
> - level = hwmgr->dpm_level;
> - mutex_unlock(&hwmgr->smu_lock);
> - return level;
> + return hwmgr->dpm_level;
>  }
> 
>  static uint32_t pp_dpm_get_sclk(void *handle, bool low)  {
>   struct pp_hwmgr *hwmgr = handle;
> - uint32_t clk = 0;
> 
>   if (!hwmgr || !hwmgr->pm_en)
>   return 0;
> @@ -381,16 +370,12 @@ static uint32_t pp_dpm_get_sclk(void *handle, 
> bool low)
>   pr_info_ratelimited("%s was not implemented.\n", __func__);
>   return 0;
>   }
> - mutex_lock(&hwmgr->smu_lock);
> - clk = hwmgr->hwmgr_func->get_sclk(hwmgr, low);
> - mutex_unlock(&hwmgr->smu_lock);
> - return clk;
> + return hwmgr->hwmgr_func->get_sclk(hwmgr, low);
>  }
> 
>  static uint32_t pp_dpm_get_mclk(void *handle, bool low)  {
>   struct pp_hwmgr *hwmgr = handle;
> - uint32_t clk = 0;
> 
>   if (!hwmgr || !hwmgr->pm_en)
>   return 0;
> @@ -399,10 +384,7 @@ static uint32_t pp_dpm_get_mclk(void *handle, 
> bool low)
>   pr_info_ratelimited("%s was not implemented.\n", __func__);
>   return 0;
>   }
> - mutex_lock(&hwmgr->smu_lock);
> - clk = hwmgr->hwmgr_func->get_mclk(hwmgr, low);
> - mutex_unlock(&hwmgr->smu_lock);
> - return clk;
> + return hwmgr->hwmgr_func->get_mclk(hwmgr, low);
>  }
> 
>  static void pp_dpm_powergate_vce(void *handle, bool gate) @@ -416,9
> +398,7 @@ static void pp_dpm_powergate_vce(void *handle, bool gate)
>   pr_info_ratelimited("%s was not implemented.\n", __func__);
>   return;
>   }
> - mutex_lock(&hwmgr->smu_lock);
>   hwmgr->hwmgr_func->powergate_vce(hwmgr, gate);
> - mutex_unlock(&hwmgr->smu_lock);
>  }
> 
>  static void pp_dpm_powergate_uvd(void *handle, bool gate) @@ -432,25
> +412,18 @@ static void pp_dpm_powergate_uvd(void *handle, bool gate)
>   pr_info_ratelimited("%s was not implemented.\n", __func__);
>   ret

RE: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-20 Thread Chen, Guchun
[Public]

if (!smu_table->hardcode_pptable)
smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
-   if (!smu_table->hardcode_pptable) {
-   ret = -ENOMEM;
-   goto failed;
-   }
+   if (!smu_table->hardcode_pptable)
+   return -ENOMEM;

I guess it's better to move the second check of hardcode_pptable inside the
first if block, like:
if (!smu_table->hardcode_pptable) {
smu_table->hardcode_pptable = kzalloc(size, GFP_KERNEL);
if (!smu_table->hardcode_pptable)
return -ENOMEM;
}
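(Illustrative aside: the shape suggested here, attempting the allocation only when the pointer is still NULL and re-checking the result inside the same branch, can be sketched standalone. struct table_ctx, set_pptable() and MY_ENOMEM are hypothetical userspace stand-ins, not the real SMU code.)

```c
#include <stdlib.h>
#include <string.h>

#define MY_ENOMEM 12	/* stand-in for the kernel's -ENOMEM magnitude */

struct table_ctx {
	void *hardcode_pptable;	/* cached copy, allocated on first use */
};

/* Allocate the cache only when it is missing, and check the allocation
 * result inside the same branch, so a pre-existing table is never
 * mistaken for a failed allocation. */
static int set_pptable(struct table_ctx *t, const void *buf, size_t size)
{
	if (!t->hardcode_pptable) {
		t->hardcode_pptable = calloc(1, size);
		if (!t->hardcode_pptable)
			return -MY_ENOMEM;
	}
	memcpy(t->hardcode_pptable, buf, size);
	return 0;
}
```

A second call with the table already allocated simply reuses the existing buffer, which is exactly what the flat two-check version obscured.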


Regards,
Guchun

-Original Message-
From: Quan, Evan  
Sent: Monday, January 17, 2022 1:42 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 
; Chen, Guchun ; Quan, Evan 

Subject: [PATCH V2 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

All those APIs are already protected either by adev->pm.mutex or by
smu->message_lock.

Signed-off-by: Evan Quan 
Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 315 ++
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |   1 -
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |   2 -
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |   2 -
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   |   2 -
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|   2 -
 6 files changed, 25 insertions(+), 299 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 828cb932f6a9..411f03eb4523 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -55,8 +55,7 @@ static int smu_force_smuclk_levels(struct smu_context *smu,
   uint32_t mask);
 static int smu_handle_task(struct smu_context *smu,
   enum amd_dpm_forced_level level,
-  enum amd_pp_task task_id,
-  bool lock_needed);
+  enum amd_pp_task task_id);
 static int smu_reset(struct smu_context *smu);
 static int smu_set_fan_speed_pwm(void *handle, u32 speed);
 static int smu_set_fan_control_mode(void *handle, u32 value);
@@ -68,36 +67,22 @@ static int smu_sys_get_pp_feature_mask(void *handle,
   char *buf)
 {
struct smu_context *smu = handle;
-   int size = 0;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
 
-   mutex_lock(&smu->mutex);
-
-   size = smu_get_pp_feature_mask(smu, buf);
-
-   mutex_unlock(&smu->mutex);
-
-   return size;
+   return smu_get_pp_feature_mask(smu, buf);
 }
 
 static int smu_sys_set_pp_feature_mask(void *handle,
   uint64_t new_mask)
 {
struct smu_context *smu = handle;
-   int ret = 0;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
 
-   mutex_lock(&smu->mutex);
-
-   ret = smu_set_pp_feature_mask(smu, new_mask);
-
-   mutex_unlock(&smu->mutex);
-
-   return ret;
+   return smu_set_pp_feature_mask(smu, new_mask);
 }
 
 int smu_get_status_gfxoff(struct smu_context *smu, uint32_t *value)
@@ -117,16 +102,12 @@ int smu_set_soft_freq_range(struct smu_context *smu,
 {
int ret = 0;
 
-   mutex_lock(&smu->mutex);
-
if (smu->ppt_funcs->set_soft_freq_limited_range)
ret = smu->ppt_funcs->set_soft_freq_limited_range(smu,
  clk_type,
  min,
  max);
 
-   mutex_unlock(&smu->mutex);
-
return ret;
 }
 
@@ -140,16 +121,12 @@ int smu_get_dpm_freq_range(struct smu_context *smu,
if (!min && !max)
return -EINVAL;
 
-   mutex_lock(&smu->mutex);
-
if (smu->ppt_funcs->get_dpm_ultimate_freq)
ret = smu->ppt_funcs->get_dpm_ultimate_freq(smu,
clk_type,
min,
max);
 
-   mutex_unlock(&smu->mutex);
-
return ret;
 }
 
@@ -482,7 +459,6 @@ static int smu_sys_get_pp_table(void *handle,
 {
struct smu_context *smu = handle;
struct smu_table_context *smu_table = &smu->smu_table;
-   uint32_t powerplay_table_size;
 
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
@@ -490,18 +466,12 @@ static int smu_sys_get_pp_table(void *handle,
if (!smu_table->power_play_table && !smu_table->hardcode_pptable)
return -EINVAL;
 
-   mutex_lock(&smu->mutex);
-
if (smu_table->hardcode_pptable)
*table = smu_table->h

Re: [PATCH] drm/amdgpu: switch to common helper to read bios from rom

2022-01-20 Thread Wang, Yang(Kevin)
[AMD Official Use Only]

Reviewed-by: Yang Wang 

Best Regards,
Kevin


From: amd-gfx  on behalf of Hawking 
Zhang 
Sent: Thursday, January 20, 2022 7:26 PM
To: amd-gfx@lists.freedesktop.org ; Deucher, 
Alexander 
Cc: Zhang, Hawking 
Subject: [PATCH] drm/amdgpu: switch to common helper to read bios from rom

create a common helper function for soc15 and onwards
to read bios image from rom

Signed-off-by: Hawking Zhang 
Reviewed-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 38 
 drivers/gpu/drm/amd/amdgpu/nv.c  | 34 +
 drivers/gpu/drm/amd/amdgpu/soc15.c   | 37 ++-
 4 files changed, 43 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8a7759147fb2..b2da840f4718 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -378,7 +378,8 @@ int amdgpu_device_ip_block_add(struct amdgpu_device *adev,
  */
 bool amdgpu_get_bios(struct amdgpu_device *adev);
 bool amdgpu_read_bios(struct amdgpu_device *adev);
-
+bool amdgpu_soc15_read_bios_from_rom(struct amdgpu_device *adev,
+u8 *bios, u32 length_bytes);
 /*
  * Clocks
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
index ca0503d56e5c..a819828408fd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@@ -476,3 +476,41 @@ bool amdgpu_get_bios(struct amdgpu_device *adev)
 adev->is_atom_fw = (adev->asic_type >= CHIP_VEGA10) ? true : false;
 return true;
 }
+
+/* helper function for soc15 and onwards to read bios from rom */
+bool amdgpu_soc15_read_bios_from_rom(struct amdgpu_device *adev,
+u8 *bios, u32 length_bytes)
+{
+   u32 *dw_ptr;
+   u32 i, length_dw;
+   u32 rom_index_offset;
+   u32 rom_data_offset;
+
+   if (bios == NULL)
+   return false;
+   if (length_bytes == 0)
+   return false;
+   /* APU vbios image is part of sbios image */
+   if (adev->flags & AMD_IS_APU)
+   return false;
+   if (!adev->smuio.funcs ||
+   !adev->smuio.funcs->get_rom_index_offset ||
+   !adev->smuio.funcs->get_rom_data_offset)
+   return false;
+
+   dw_ptr = (u32 *)bios;
+   length_dw = ALIGN(length_bytes, 4) / 4;
+
+   rom_index_offset =
+   adev->smuio.funcs->get_rom_index_offset(adev);
+   rom_data_offset =
+   adev->smuio.funcs->get_rom_data_offset(adev);
+
+   /* set rom index to 0 */
+   WREG32(rom_index_offset, 0);
+   /* read out the rom data */
+   for (i = 0; i < length_dw; i++)
+   dw_ptr[i] = RREG32(rom_data_offset);
+
+   return true;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 3ccd3b42196a..e52d1114501c 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -358,38 +358,6 @@ static bool nv_read_disabled_bios(struct amdgpu_device 
*adev)
 return false;
 }

-static bool nv_read_bios_from_rom(struct amdgpu_device *adev,
- u8 *bios, u32 length_bytes)
-{
-   u32 *dw_ptr;
-   u32 i, length_dw;
-   u32 rom_index_offset, rom_data_offset;
-
-   if (bios == NULL)
-   return false;
-   if (length_bytes == 0)
-   return false;
-   /* APU vbios image is part of sbios image */
-   if (adev->flags & AMD_IS_APU)
-   return false;
-
-   dw_ptr = (u32 *)bios;
-   length_dw = ALIGN(length_bytes, 4) / 4;
-
-   rom_index_offset =
-   adev->smuio.funcs->get_rom_index_offset(adev);
-   rom_data_offset =
-   adev->smuio.funcs->get_rom_data_offset(adev);
-
-   /* set rom index to 0 */
-   WREG32(rom_index_offset, 0);
-   /* read out the rom data */
-   for (i = 0; i < length_dw; i++)
-   dw_ptr[i] = RREG32(rom_data_offset);
-
-   return true;
-}
-
 static struct soc15_allowed_register_entry nv_allowed_read_registers[] = {
 { SOC15_REG_ENTRY(GC, 0, mmGRBM_STATUS)},
 { SOC15_REG_ENTRY(GC, 0, mmGRBM_STATUS2)},
@@ -707,7 +675,7 @@ static int nv_update_umd_stable_pstate(struct amdgpu_device 
*adev,
 static const struct amdgpu_asic_funcs nv_asic_funcs =
 {
 .read_disabled_bios = &nv_read_disabled_bios,
-   .read_bios_from_rom = &nv_read_bios_from_rom,
+   .read_bios_from_rom = &amdgpu_soc15_read_bios_from_rom,
 .read_register = &nv_read_register,
 .reset = &nv_asic_reset,
 .reset_method = &nv_asic_reset_method,
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 0fc1747e4a70..e5a1950fb862 100644
--- a/drivers/gpu/drm/amd/

RE: [PATCH] drm/amdgpu: fix the page fault caused by uninitialized variables

2022-01-20 Thread Chen, Guchun
[Public]

If you add the page fault info in your commit message, that's better.

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Xiaojian Du
Sent: Thursday, January 20, 2022 4:02 PM
To: amd-gfx@lists.freedesktop.org
Cc: Huang, Ray ; Du, Xiaojian 
Subject: [PATCH] drm/amdgpu: fix the page fault caused by uninitialized 
variables

This patch will fix the page fault caused by uninitialized variables.

Signed-off-by: Xiaojian Du 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index c65d82301bca..09780a0f874a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -828,9 +828,9 @@ void amdgpu_gmc_get_reserved_allocation(struct 
amdgpu_device *adev)
 
 int amdgpu_gmc_vram_checking(struct amdgpu_device *adev)
 {
-   struct amdgpu_bo *vram_bo;
-   uint64_t vram_gpu;
-   void *vram_ptr;
+   struct amdgpu_bo *vram_bo = NULL;
+   uint64_t vram_gpu = 0;
+   void *vram_ptr = NULL;
 
int ret, size = 0x10;
uint8_t cptr[10];
--
2.17.1
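For context on why this one-line initialization fixes a page fault: the locals are passed by address to an allocation helper, and a shared cleanup path consumes them even when that helper fails without writing them. With stack garbage in the pointer, the cleanup faults; with NULL, it is a safe no-op. A minimal sketch of the pattern (hypothetical names, not the driver code):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator mirroring the shape of amdgpu_gmc_vram_checking():
 * on failure the out-parameter is NOT written, so the caller's cleanup path
 * sees whatever was on the stack unless the local starts out as NULL. */
static int alloc_buffer(int fail, void **out_ptr)
{
	if (fail)
		return -1;      /* *out_ptr left untouched on error */
	*out_ptr = malloc(16);
	return *out_ptr ? 0 : -1;
}

static int checking(int fail)
{
	void *ptr = NULL;       /* the fix: initialize so cleanup is safe */
	int ret;

	ret = alloc_buffer(fail, &ptr);
	/* common cleanup path: free() on an uninitialized pointer is
	 * undefined behaviour; free(NULL) is a guaranteed no-op. */
	free(ptr);
	return ret;
}
```

Without the `= NULL`, `checking(1)` would hand an indeterminate pointer to `free()`, which is exactly the class of fault the patch closes.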


RE: [PATCH V2 7/7] drm/amd/pm: drop unneeded hwmgr->smu_lock

2022-01-20 Thread Quan, Evan
[AMD Official Use Only]

Ping for the series..

> -Original Message-
> From: Quan, Evan 
> Sent: Monday, January 17, 2022 1:42 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Lazar, Lijo
> ; Chen, Guchun ; Quan,
> Evan 
> Subject: [PATCH V2 7/7] drm/amd/pm: drop unneeded hwmgr->smu_lock
> 
> As all those related APIs are already well protected by adev->pm.mutex.
> 
> Signed-off-by: Evan Quan 
> Change-Id: I36426791d3bbc9d84a6ae437da26a892682eb0cb
> ---
>  .../gpu/drm/amd/pm/powerplay/amd_powerplay.c  | 278 +++---
>  drivers/gpu/drm/amd/pm/powerplay/inc/hwmgr.h  |   1 -
>  2 files changed, 38 insertions(+), 241 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> index 76c26ae368f9..a2da46bf3985 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> @@ -50,7 +50,6 @@ static int amd_powerplay_create(struct amdgpu_device
> *adev)
>   hwmgr->adev = adev;
>   hwmgr->not_vf = !amdgpu_sriov_vf(adev);
>   hwmgr->device = amdgpu_cgs_create_device(adev);
> - mutex_init(&hwmgr->smu_lock);
>   mutex_init(&hwmgr->msg_lock);
>   hwmgr->chip_family = adev->family;
>   hwmgr->chip_id = adev->asic_type;
> @@ -178,12 +177,9 @@ static int pp_late_init(void *handle)
>   struct amdgpu_device *adev = handle;
>   struct pp_hwmgr *hwmgr = adev->powerplay.pp_handle;
> 
> - if (hwmgr && hwmgr->pm_en) {
> - mutex_lock(&hwmgr->smu_lock);
> + if (hwmgr && hwmgr->pm_en)
>   hwmgr_handle_task(hwmgr,
>   AMD_PP_TASK_COMPLETE_INIT,
> NULL);
> - mutex_unlock(&hwmgr->smu_lock);
> - }
>   if (adev->pm.smu_prv_buffer_size != 0)
>   pp_reserve_vram_for_smu(adev);
> 
> @@ -345,11 +341,9 @@ static int pp_dpm_force_performance_level(void
> *handle,
>   if (level == hwmgr->dpm_level)
>   return 0;
> 
> - mutex_lock(&hwmgr->smu_lock);
>   pp_dpm_en_umd_pstate(hwmgr, &level);
>   hwmgr->request_dpm_level = level;
>   hwmgr_handle_task(hwmgr,
> AMD_PP_TASK_READJUST_POWER_STATE, NULL);
> - mutex_unlock(&hwmgr->smu_lock);
> 
>   return 0;
>  }
> @@ -358,21 +352,16 @@ static enum amd_dpm_forced_level
> pp_dpm_get_performance_level(
>   void *handle)
>  {
>   struct pp_hwmgr *hwmgr = handle;
> - enum amd_dpm_forced_level level;
> 
>   if (!hwmgr || !hwmgr->pm_en)
>   return -EINVAL;
> 
> - mutex_lock(&hwmgr->smu_lock);
> - level = hwmgr->dpm_level;
> - mutex_unlock(&hwmgr->smu_lock);
> - return level;
> + return hwmgr->dpm_level;
>  }
> 
>  static uint32_t pp_dpm_get_sclk(void *handle, bool low)
>  {
>   struct pp_hwmgr *hwmgr = handle;
> - uint32_t clk = 0;
> 
>   if (!hwmgr || !hwmgr->pm_en)
>   return 0;
> @@ -381,16 +370,12 @@ static uint32_t pp_dpm_get_sclk(void *handle,
> bool low)
>   pr_info_ratelimited("%s was not implemented.\n",
> __func__);
>   return 0;
>   }
> - mutex_lock(&hwmgr->smu_lock);
> - clk = hwmgr->hwmgr_func->get_sclk(hwmgr, low);
> - mutex_unlock(&hwmgr->smu_lock);
> - return clk;
> + return hwmgr->hwmgr_func->get_sclk(hwmgr, low);
>  }
> 
>  static uint32_t pp_dpm_get_mclk(void *handle, bool low)
>  {
>   struct pp_hwmgr *hwmgr = handle;
> - uint32_t clk = 0;
> 
>   if (!hwmgr || !hwmgr->pm_en)
>   return 0;
> @@ -399,10 +384,7 @@ static uint32_t pp_dpm_get_mclk(void *handle,
> bool low)
>   pr_info_ratelimited("%s was not implemented.\n",
> __func__);
>   return 0;
>   }
> - mutex_lock(&hwmgr->smu_lock);
> - clk = hwmgr->hwmgr_func->get_mclk(hwmgr, low);
> - mutex_unlock(&hwmgr->smu_lock);
> - return clk;
> + return hwmgr->hwmgr_func->get_mclk(hwmgr, low);
>  }
> 
>  static void pp_dpm_powergate_vce(void *handle, bool gate)
> @@ -416,9 +398,7 @@ static void pp_dpm_powergate_vce(void *handle, bool gate)
>   pr_info_ratelimited("%s was not implemented.\n",
> __func__);
>   return;
>   }
> - mutex_lock(&hwmgr->smu_lock);
>   hwmgr->hwmgr_func->powergate_vce(hwmgr, gate);
> - mutex_unlock(&hwmgr->smu_lock);
>  }
> 
>  static void pp_dpm_powergate_uvd(void *handle, bool gate)
> @@ -432,25 +412,18 @@ static void pp_dpm_powergate_uvd(void *handle, bool gate)
>   pr_info_ratelimited("%s was not implemented.\n",
> __func__);
>   return;
>   }
> - mutex_lock(&hwmgr->smu_lock);
>   hwmgr->hwmgr_func->powergate_uvd(hwmgr, gate);
> - mutex_unlock(&hwmgr->smu_lock);
>  }
> 
>  static int pp_dpm_dispatch_tasks(void *handle, enum amd_pp_task task_id,
>   enum amd_pm_state_type *user_state)
>  {
> - int ret = 0;
>   struct pp_hwmgr *hwmgr = handle;
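The locking pattern this series collapses can be sketched in plain pthreads. This is an illustrative model, not the driver code: the real series relies on `adev->pm.mutex` being taken at the amdgpu_dpm entry points, so the per-accessor `smu_lock` becomes redundant.

```c
#include <pthread.h>

static pthread_mutex_t pm_mutex = PTHREAD_MUTEX_INITIALIZER;
static int dpm_level;

/* Before the series, this accessor took its own inner smu_lock even though
 * every caller already held the device-wide lock.  After the series it just
 * reads: the value cannot change while pm_mutex is held. */
static int get_level_locked(void)
{
	return dpm_level;
}

/* Stand-in for an amdgpu_dpm entry point: the single remaining lock. */
static int get_level(void)
{
	int level;

	pthread_mutex_lock(&pm_mutex);
	level = get_level_locked();
	pthread_mutex_unlock(&pm_mutex);
	return level;
}
```

The safety argument is entirely about lock ordering: dropping the inner lock is correct only because no path reaches the accessor without the outer mutex, which is what the series audits.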

[PATCH] drm/amdgpu: switch to common helper to read bios from rom

2022-01-20 Thread Hawking Zhang
create a common helper function for soc15 and onwards
to read bios image from rom

Signed-off-by: Hawking Zhang 
Reviewed-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 38 
 drivers/gpu/drm/amd/amdgpu/nv.c  | 34 +
 drivers/gpu/drm/amd/amdgpu/soc15.c   | 37 ++-
 4 files changed, 43 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8a7759147fb2..b2da840f4718 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -378,7 +378,8 @@ int amdgpu_device_ip_block_add(struct amdgpu_device *adev,
  */
 bool amdgpu_get_bios(struct amdgpu_device *adev);
 bool amdgpu_read_bios(struct amdgpu_device *adev);
-
+bool amdgpu_soc15_read_bios_from_rom(struct amdgpu_device *adev,
+u8 *bios, u32 length_bytes);
 /*
  * Clocks
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
index ca0503d56e5c..a819828408fd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@@ -476,3 +476,41 @@ bool amdgpu_get_bios(struct amdgpu_device *adev)
adev->is_atom_fw = (adev->asic_type >= CHIP_VEGA10) ? true : false;
return true;
 }
+
+/* helper function for soc15 and onwards to read bios from rom */
+bool amdgpu_soc15_read_bios_from_rom(struct amdgpu_device *adev,
+u8 *bios, u32 length_bytes)
+{
+   u32 *dw_ptr;
+   u32 i, length_dw;
+   u32 rom_index_offset;
+   u32 rom_data_offset;
+
+   if (bios == NULL)
+   return false;
+   if (length_bytes == 0)
+   return false;
+   /* APU vbios image is part of sbios image */
+   if (adev->flags & AMD_IS_APU)
+   return false;
+   if (!adev->smuio.funcs ||
+   !adev->smuio.funcs->get_rom_index_offset ||
+   !adev->smuio.funcs->get_rom_data_offset)
+   return false;
+
+   dw_ptr = (u32 *)bios;
+   length_dw = ALIGN(length_bytes, 4) / 4;
+
+   rom_index_offset =
+   adev->smuio.funcs->get_rom_index_offset(adev);
+   rom_data_offset =
+   adev->smuio.funcs->get_rom_data_offset(adev);
+
+   /* set rom index to 0 */
+   WREG32(rom_index_offset, 0);
+   /* read out the rom data */
+   for (i = 0; i < length_dw; i++)
+   dw_ptr[i] = RREG32(rom_data_offset);
+
+   return true;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 3ccd3b42196a..e52d1114501c 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -358,38 +358,6 @@ static bool nv_read_disabled_bios(struct amdgpu_device 
*adev)
return false;
 }
 
-static bool nv_read_bios_from_rom(struct amdgpu_device *adev,
- u8 *bios, u32 length_bytes)
-{
-   u32 *dw_ptr;
-   u32 i, length_dw;
-   u32 rom_index_offset, rom_data_offset;
-
-   if (bios == NULL)
-   return false;
-   if (length_bytes == 0)
-   return false;
-   /* APU vbios image is part of sbios image */
-   if (adev->flags & AMD_IS_APU)
-   return false;
-
-   dw_ptr = (u32 *)bios;
-   length_dw = ALIGN(length_bytes, 4) / 4;
-
-   rom_index_offset =
-   adev->smuio.funcs->get_rom_index_offset(adev);
-   rom_data_offset =
-   adev->smuio.funcs->get_rom_data_offset(adev);
-
-   /* set rom index to 0 */
-   WREG32(rom_index_offset, 0);
-   /* read out the rom data */
-   for (i = 0; i < length_dw; i++)
-   dw_ptr[i] = RREG32(rom_data_offset);
-
-   return true;
-}
-
 static struct soc15_allowed_register_entry nv_allowed_read_registers[] = {
{ SOC15_REG_ENTRY(GC, 0, mmGRBM_STATUS)},
{ SOC15_REG_ENTRY(GC, 0, mmGRBM_STATUS2)},
@@ -707,7 +675,7 @@ static int nv_update_umd_stable_pstate(struct amdgpu_device 
*adev,
 static const struct amdgpu_asic_funcs nv_asic_funcs =
 {
.read_disabled_bios = &nv_read_disabled_bios,
-   .read_bios_from_rom = &nv_read_bios_from_rom,
+   .read_bios_from_rom = &amdgpu_soc15_read_bios_from_rom,
.read_register = &nv_read_register,
.reset = &nv_asic_reset,
.reset_method = &nv_asic_reset_method,
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 0fc1747e4a70..e5a1950fb862 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -375,39 +375,6 @@ static bool soc15_read_disabled_bios(struct amdgpu_device 
*adev)
return false;
 }
 
-static bool soc15_read_bios_from_rom(struct amdgpu_device *adev,
-u8 *bios, u32 length_bytes)
-{
-   u32 *dw_ptr;
-   u32 i, length_dw;
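The helper above streams the image out of an index/data register pair: write 0 to the index register to reset an auto-incrementing cursor, then read the data register once per dword, with `ALIGN(length_bytes, 4) / 4` rounding the byte count up to whole dwords. A toy model of that access pattern (hypothetical register stubs, not the smuio hardware interface):

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Toy index/data ROM port: writing 0 to the index register resets an
 * auto-incrementing cursor; each data-register read returns the next dword. */
static uint32_t rom[4] = { 0x11, 0x22, 0x33, 0x44 };
static uint32_t rom_cursor;

static void wreg_index(uint32_t v) { rom_cursor = v; }
static uint32_t rreg_data(void) { return rom[rom_cursor++]; }

/* Same shape as amdgpu_soc15_read_bios_from_rom(): round the byte count up
 * to whole dwords and stream them out of the data port. */
static int read_rom(uint8_t *buf, uint32_t length_bytes)
{
	uint32_t *dw_ptr = (uint32_t *)buf;
	uint32_t i, length_dw;

	if (!buf || !length_bytes)
		return 0;
	length_dw = ALIGN(length_bytes, 4) / 4;
	wreg_index(0);                  /* set rom index to 0 */
	for (i = 0; i < length_dw; i++) /* read out the rom data */
		dw_ptr[i] = rreg_data();
	return 1;
}
```

Note the design consequence of the rounding: a caller asking for 13 bytes gets 16 written, so the destination buffer must be sized to the dword-aligned length.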

[PATCH 9/9] drm/amdgpu: retire rlc callbacks sriov_rreg/wreg

2022-01-20 Thread Hawking Zhang
Not needed anymore.

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h  |   2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |   2 -
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 114 ---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 106 -
 5 files changed, 3 insertions(+), 226 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
index 286b2347d063..3f671a62b009 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
@@ -127,8 +127,6 @@ struct amdgpu_rlc_funcs {
void (*reset)(struct amdgpu_device *adev);
void (*start)(struct amdgpu_device *adev);
void (*update_spm_vmid)(struct amdgpu_device *adev, unsigned vmid);
-   void (*sriov_wreg)(struct amdgpu_device *adev, u32 offset, u32 v, u32 
acc_flags, u32 hwip);
-   u32 (*sriov_rreg)(struct amdgpu_device *adev, u32 offset, u32 
acc_flags, u32 hwip);
bool (*is_rlcg_access_range)(struct amdgpu_device *adev, uint32_t reg);
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 8c27d31f3e53..80c25176c993 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -821,8 +821,9 @@ void amdgpu_virt_update_sriov_video_codec(struct 
amdgpu_device *adev,
}
 }
 
-bool amdgpu_virt_get_rlcg_reg_access_flag(struct amdgpu_device *adev, u32 
acc_flags,
- u32 hwip, bool write, u32 *rlcg_flag)
+static bool amdgpu_virt_get_rlcg_reg_access_flag(struct amdgpu_device *adev,
+u32 acc_flags, u32 hwip,
+bool write, u32 *rlcg_flag)
 {
bool ret = false;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index dbfa3ba445c3..c5edd84c1c12 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -334,8 +334,6 @@ enum amdgpu_sriov_vf_mode 
amdgpu_virt_get_sriov_vf_mode(struct amdgpu_device *ad
 void amdgpu_virt_update_sriov_video_codec(struct amdgpu_device *adev,
struct amdgpu_video_codec_info *encode, uint32_t 
encode_array_size,
struct amdgpu_video_codec_info *decode, uint32_t 
decode_array_size);
-bool amdgpu_virt_get_rlcg_reg_access_flag(struct amdgpu_device *adev, u32 
acc_flags,
- u32 hwip, bool write, u32 *rlcg_flag);
 void amdgpu_sriov_wreg(struct amdgpu_device *adev,
   u32 offset, u32 value,
   u32 acc_flags, u32 hwip);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 3fb484214d3a..f54e106e2b86 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -56,10 +56,6 @@
 #define GFX10_NUM_GFX_RINGS_Sienna_Cichlid 1
 #define GFX10_MEC_HPD_SIZE 2048
 
-#define RLCG_VFGATE_DISABLED   0x400
-#define RLCG_WRONG_OPERATION_TYPE  0x200
-#define RLCG_NOT_IN_RANGE  0x100
-
 #define F32_CE_PROGRAM_RAM_SIZE65536
 #define RLCG_UCODE_LOADING_START_ADDRESS   0x2000L
 
@@ -180,9 +176,6 @@
 #define mmRLC_SPARE_INT_0_Sienna_Cichlid   0x4ca5
 #define mmRLC_SPARE_INT_0_Sienna_Cichlid_BASE_IDX  1
 
-#define RLCG_ERROR_REPORT_ENABLED(adev) \
-   (amdgpu_sriov_reg_indirect_mmhub(adev) || 
amdgpu_sriov_reg_indirect_gc(adev))
-
 MODULE_FIRMWARE("amdgpu/navi10_ce.bin");
 MODULE_FIRMWARE("amdgpu/navi10_pfp.bin");
 MODULE_FIRMWARE("amdgpu/navi10_me.bin");
@@ -1458,111 +1451,6 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1_2[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0x, 0x00c0)
 };
 
-static u32 gfx_v10_rlcg_rw(struct amdgpu_device *adev, u32 offset, u32 v, 
uint32_t flag)
-{
-   static void *scratch_reg0;
-   static void *scratch_reg1;
-   static void *scratch_reg2;
-   static void *scratch_reg3;
-   static void *spare_int;
-   static uint32_t grbm_cntl;
-   static uint32_t grbm_idx;
-   uint32_t i = 0;
-   uint32_t retries = 5;
-   u32 ret = 0;
-   u32 tmp;
-
-   scratch_reg0 = adev->rmmio +
-  (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG0_BASE_IDX] + 
mmSCRATCH_REG0) * 4;
-   scratch_reg1 = adev->rmmio +
-  (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + 
mmSCRATCH_REG1) * 4;
-   scratch_reg2 = adev->rmmio +
-  (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG0_BASE_IDX] + 
mmSCRATCH_REG2) * 4;
-   scratch_reg3 = adev->rmmio +
-  (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + 
mmSCRATCH_R

[PATCH 7/9] drm/amdgpu: add helper for rlcg indirect reg access

2022-01-20 Thread Hawking Zhang
The helper will be used to access registers from the sriov
guest during full access time

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 111 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |  14 ++-
 2 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index a40e4fcdfa46..8c27d31f3e53 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -855,3 +855,114 @@ bool amdgpu_virt_get_rlcg_reg_access_flag(struct 
amdgpu_device *adev, u32 acc_fl
}
return ret;
 }
+
+static u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 
v, u32 flag)
+{
+   struct amdgpu_rlcg_reg_access_ctrl *reg_access_ctrl;
+   uint32_t timeout = 5;
+   uint32_t i, tmp;
+   uint32_t ret = 0;
+   static void *scratch_reg0;
+   static void *scratch_reg1;
+   static void *scratch_reg2;
+   static void *scratch_reg3;
+   static void *spare_int;
+
+   if (!adev->gfx.rlc.rlcg_reg_access_supported) {
+   dev_err(adev->dev,
+   "indirect registers access through rlcg is not 
available\n");
+   return 0;
+   }
+
+   scratch_reg0 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg0;
+   scratch_reg1 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg1;
+   scratch_reg2 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg2;
+   scratch_reg3 = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->scratch_reg3;
+   if (reg_access_ctrl->spare_int)
+   spare_int = (void __iomem *)adev->rmmio + 4 * 
reg_access_ctrl->spare_int;
+
+   if (offset == reg_access_ctrl->grbm_cntl) {
+   /* if the target reg offset is grbm_cntl, write to scratch_reg2 
*/
+   writel(v, scratch_reg2);
+   writel(v, ((void __iomem *)adev->rmmio) + (offset * 4));
+   } else if (offset == reg_access_ctrl->grbm_idx) {
+   /* if the target reg offset is grbm_idx, write to scratch_reg3 
*/
+   writel(v, scratch_reg3);
+   writel(v, ((void __iomem *)adev->rmmio) + (offset * 4));
+   } else {
+   /*
+* SCRATCH_REG0 = read/write value
+* SCRATCH_REG1[30:28]  = command
+* SCRATCH_REG1[19:0]   = address in dword
+* SCRATCH_REG1[26:24]  = Error reporting
+*/
+   writel(v, scratch_reg0);
+   writel((offset | flag), scratch_reg1);
+   if (reg_access_ctrl->spare_int)
+   writel(1, spare_int);
+
+   for (i = 0; i < timeout; i++) {
+   tmp = readl(scratch_reg1);
+   if (!(tmp & flag))
+   break;
+   udelay(10);
+   }
+
+   if (i >= timeout) {
+   if (amdgpu_sriov_rlcg_error_report_enabled(adev)) {
+   if (tmp & AMDGPU_RLCG_VFGATE_DISABLED) {
+   dev_err(adev->dev,
+   "vfgate is disabled, rlcg 
failed to program reg: 0x%05x\n", offset);
+   } else if (tmp & 
AMDGPU_RLCG_WRONG_OPERATION_TYPE) {
+   dev_err(adev->dev,
+   "wrong operation type, rlcg 
failed to program reg: 0x%05x\n", offset);
+   } else if (tmp & AMDGPU_RLCG_REG_NOT_IN_RANGE) {
+   dev_err(adev->dev,
+   "register is not in range, rlcg 
failed to program reg: 0x%05x\n", offset);
+   } else {
+   dev_err(adev->dev,
+   "unknown error type, rlcg 
failed to program reg: 0x%05x\n", offset);
+   }
+   } else {
+   dev_err(adev->dev,
+   "timeout: rlcg faled to program reg: 
0x%05x\n", offset);
+   }
+   }
+   }
+
+   ret = readl(scratch_reg0);
+   return ret;
+}
+
+void amdgpu_sriov_wreg(struct amdgpu_device *adev,
+  u32 offset, u32 value,
+  u32 acc_flags, u32 hwip)
+{
+   u32 rlcg_flag;
+
+   if (!amdgpu_sriov_runtime(adev) &&
+   amdgpu_virt_get_rlcg_reg_access_flag(adev, acc_flags, hwip, true, 
&rlcg_flag)) {
+   amdgpu_virt_rlcg_reg_rw(adev, offset, value, rlcg_flag);
+   return;
+   }
+
+   if (acc_flags & AMDGPU_REGS_NO_KIQ)
+   WREG32_NO_KIQ(offset, v
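The scratch-register handshake in `amdgpu_virt_rlcg_reg_rw()` above (value in SCRATCH_REG0, command plus address in SCRATCH_REG1, poll until the command bits clear) can be modeled in miniature. This toy collapses the asynchronous RLC firmware into a synchronous stub and uses a single busy bit; all names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_BUSY (1u << 31)

static uint32_t scratch0, scratch1;

/* Stand-in for the RLC firmware: apply the posted write and ack it by
 * clearing the busy flag.  In hardware this happens asynchronously while
 * the guest polls. */
static void fake_rlc_service(void)
{
	scratch0 += 1;            /* pretend the write took effect */
	scratch1 &= ~FLAG_BUSY;   /* ack the request */
}

/* Mirrors the polling loop in amdgpu_virt_rlcg_reg_rw(): post value and
 * command, then spin until the firmware acks or we time out. */
static int rlcg_write(uint32_t value, uint32_t cmd)
{
	int i, timeout = 5;

	scratch0 = value;
	scratch1 = cmd | FLAG_BUSY;
	for (i = 0; i < timeout; i++) {
		fake_rlc_service();
		if (!(scratch1 & FLAG_BUSY))
			return 0;
	}
	return -1;                /* mirrors the dev_err() timeout path */
}
```

The error decoding in the real helper works the same way: on timeout the leftover bits in SCRATCH_REG1 (vfgate disabled, wrong operation type, register out of range) tell the guest why the host-side RLC refused the access.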

[PATCH 8/9] drm/amdgpu: switch to amdgpu_sriov_rreg/wreg

2022-01-20 Thread Hawking Zhang
Instead of ip specific implementation for rlcg
indirect register access

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/soc15_common.h  | 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2baafef9c786..4867d2231a3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -566,7 +566,7 @@ void amdgpu_mm_wreg_mmio_rlc(struct amdgpu_device *adev,
adev->gfx.rlc.funcs &&
adev->gfx.rlc.funcs->is_rlcg_access_range) {
if (adev->gfx.rlc.funcs->is_rlcg_access_range(adev, reg))
-   return adev->gfx.rlc.funcs->sriov_wreg(adev, reg, v, 0, 
0);
+   return amdgpu_sriov_wreg(adev, reg, v, 0, 0);
} else if ((reg * 4) >= adev->rmmio_size) {
adev->pcie_wreg(adev, reg * 4, v);
} else {
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h 
b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
index 473767e03676..acce8c2e0328 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h
+++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
@@ -28,13 +28,13 @@
 #define SOC15_REG_OFFSET(ip, inst, reg)
(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg)
 
 #define __WREG32_SOC15_RLC__(reg, value, flag, hwip) \
-   ((amdgpu_sriov_vf(adev) && adev->gfx.rlc.funcs && 
adev->gfx.rlc.funcs->sriov_wreg) ? \
-adev->gfx.rlc.funcs->sriov_wreg(adev, reg, value, flag, hwip) : \
+   ((amdgpu_sriov_vf(adev) && adev->gfx.rlc.funcs && 
adev->gfx.rlc.rlcg_reg_access_supported) ? \
+amdgpu_sriov_wreg(adev, reg, value, flag, hwip) : \
 WREG32(reg, value))
 
 #define __RREG32_SOC15_RLC__(reg, flag, hwip) \
-   ((amdgpu_sriov_vf(adev) && adev->gfx.rlc.funcs && 
adev->gfx.rlc.funcs->sriov_rreg) ? \
-adev->gfx.rlc.funcs->sriov_rreg(adev, reg, flag, hwip) : \
+   ((amdgpu_sriov_vf(adev) && adev->gfx.rlc.funcs && 
adev->gfx.rlc.rlcg_reg_access_supported) ? \
+amdgpu_sriov_rreg(adev, reg, flag, hwip) : \
 RREG32(reg))
 
 #define WREG32_FIELD15(ip, idx, reg, field, val)   \
-- 
2.17.1



[PATCH 6/9] drm/amdgpu: init rlcg_reg_access_ctrl for gfx10

2022-01-20 Thread Hawking Zhang
Initialize all the register offsets that will be
used in rlcg indirect reg access path for gfx10
in sw_init phase

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 38 +++---
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 588c922573e9..3fb484214d3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4411,6 +4411,30 @@ static void gfx_v10_0_rlc_fini(struct amdgpu_device 
*adev)
(void **)&adev->gfx.rlc.cp_table_ptr);
 }
 
+static void gfx_v10_0_init_rlcg_reg_access_ctrl(struct amdgpu_device *adev)
+{
+   struct amdgpu_rlcg_reg_access_ctrl *reg_access_ctrl;
+
+   reg_access_ctrl = &adev->gfx.rlc.reg_access_ctrl;
+   reg_access_ctrl->scratch_reg0 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG0);
+   reg_access_ctrl->scratch_reg1 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG1);
+   reg_access_ctrl->scratch_reg2 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG2);
+   reg_access_ctrl->scratch_reg3 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG3);
+   reg_access_ctrl->grbm_cntl = SOC15_REG_OFFSET(GC, 0, mmGRBM_GFX_CNTL);
+   reg_access_ctrl->grbm_idx = SOC15_REG_OFFSET(GC, 0, mmGRBM_GFX_INDEX);
+   switch (adev->ip_versions[GC_HWIP][0]) {
+   case IP_VERSION(10, 3, 0):
+   reg_access_ctrl->spare_int =
+   SOC15_REG_OFFSET(GC, 0, 
mmRLC_SPARE_INT_0_Sienna_Cichlid);
+   break;
+   default:
+   reg_access_ctrl->spare_int =
+   SOC15_REG_OFFSET(GC, 0, mmRLC_SPARE_INT);
+   break;
+   }
+   adev->gfx.rlc.rlcg_reg_access_supported = true;
+}
+
 static int gfx_v10_0_rlc_init(struct amdgpu_device *adev)
 {
const struct cs_section_def *cs_data;
@@ -4431,6 +4455,8 @@ static int gfx_v10_0_rlc_init(struct amdgpu_device *adev)
if (adev->gfx.rlc.funcs->update_spm_vmid)
adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
 
+   gfx_v10_0_init_rlcg_reg_access_ctrl(adev);
+
return 0;
 }
 
@@ -4828,10 +4854,14 @@ static int gfx_v10_0_sw_init(void *handle)
if (r)
return r;
 
-   r = gfx_v10_0_rlc_init(adev);
-   if (r) {
-   DRM_ERROR("Failed to init rlc BOs!\n");
-   return r;
+   if (adev->gfx.rlc.funcs) {
+   if (adev->gfx.rlc.funcs->init) {
+   r = adev->gfx.rlc.funcs->init(adev);
+   if (r) {
+   dev_err(adev->dev, "Failed to init rlc BOs!\n");
+   return r;
+   }
+   }
}
 
r = gfx_v10_0_mec_init(adev);
-- 
2.17.1



[PATCH 5/9] drm/amdgpu: init rlcg_reg_access_ctrl for gfx9

2022-01-20 Thread Hawking Zhang
Initialize all the register offsets that will be
used in rlcg indirect reg access path for gfx9 in
sw_init phase

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 30 +++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 17704cd99aaf..c7bccf1a28b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1983,6 +1983,21 @@ static int gfx_v9_0_cp_jump_table_num(struct 
amdgpu_device *adev)
return 4;
 }
 
+static void gfx_v9_0_init_rlcg_reg_access_ctrl(struct amdgpu_device *adev)
+{
+   struct amdgpu_rlcg_reg_access_ctrl *reg_access_ctrl;
+
+   reg_access_ctrl = &adev->gfx.rlc.reg_access_ctrl;
+   reg_access_ctrl->scratch_reg0 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG0);
+   reg_access_ctrl->scratch_reg1 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG1);
+   reg_access_ctrl->scratch_reg2 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG2);
+   reg_access_ctrl->scratch_reg3 = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG3);
+   reg_access_ctrl->grbm_cntl = SOC15_REG_OFFSET(GC, 0, mmGRBM_GFX_CNTL);
+   reg_access_ctrl->grbm_idx = SOC15_REG_OFFSET(GC, 0, mmGRBM_GFX_INDEX);
+   reg_access_ctrl->spare_int = SOC15_REG_OFFSET(GC, 0, mmRLC_SPARE_INT);
+   adev->gfx.rlc.rlcg_reg_access_supported = true;
+}
+
 static int gfx_v9_0_rlc_init(struct amdgpu_device *adev)
 {
const struct cs_section_def *cs_data;
@@ -2023,6 +2038,9 @@ static int gfx_v9_0_rlc_init(struct amdgpu_device *adev)
if (adev->gfx.rlc.funcs->update_spm_vmid)
adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
 
+   /* init rlcg reg access ctrl */
+   gfx_v9_0_init_rlcg_reg_access_ctrl(adev);
+
return 0;
 }
 
@@ -2432,10 +2450,14 @@ static int gfx_v9_0_sw_init(void *handle)
return r;
}
 
-   r = adev->gfx.rlc.funcs->init(adev);
-   if (r) {
-   DRM_ERROR("Failed to init rlc BOs!\n");
-   return r;
+   if (adev->gfx.rlc.funcs) {
+   if (adev->gfx.rlc.funcs->init) {
+   r = adev->gfx.rlc.funcs->init(adev);
+   if (r) {
+   dev_err(adev->dev, "Failed to init rlc BOs!\n");
+   return r;
+   }
+   }
}
 
r = gfx_v9_0_mec_init(adev);
-- 
2.17.1



[PATCH 4/9] drm/amdgpu: add structures for rlcg indirect reg access

2022-01-20 Thread Hawking Zhang
Add structures that are used to cache register offsets
for rlcg indirect reg access ctrl and to flag the
availability of such an interface

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
index 00afd0dcae86..286b2347d063 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
@@ -132,6 +132,16 @@ struct amdgpu_rlc_funcs {
bool (*is_rlcg_access_range)(struct amdgpu_device *adev, uint32_t reg);
 };
 
+struct amdgpu_rlcg_reg_access_ctrl {
+   uint32_t scratch_reg0;
+   uint32_t scratch_reg1;
+   uint32_t scratch_reg2;
+   uint32_t scratch_reg3;
+   uint32_t grbm_cntl;
+   uint32_t grbm_idx;
+   uint32_t spare_int;
+};
+
 struct amdgpu_rlc {
/* for power gating */
struct amdgpu_bo*save_restore_obj;
@@ -191,6 +201,10 @@ struct amdgpu_rlc {
struct amdgpu_bo*rlc_toc_bo;
uint64_trlc_toc_gpu_addr;
void*rlc_toc_buf;
+
+   bool rlcg_reg_access_supported;
+   /* registers for rlcg indirect reg access */
+   struct amdgpu_rlcg_reg_access_ctrl reg_access_ctrl;
 };
 
 void amdgpu_gfx_rlc_enter_safe_mode(struct amdgpu_device *adev);
-- 
2.17.1



[PATCH 3/9] drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx10

2022-01-20 Thread Hawking Zhang
Switch to common helper to query rlcg access flag
specified by sriov host driver for gfx10

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 41 ++
 1 file changed, 2 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index dbe7442fb25c..588c922573e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -180,11 +180,6 @@
 #define mmRLC_SPARE_INT_0_Sienna_Cichlid   0x4ca5
 #define mmRLC_SPARE_INT_0_Sienna_Cichlid_BASE_IDX  1
 
-#define GFX_RLCG_GC_WRITE_OLD  (0x8 << 28)
-#define GFX_RLCG_GC_WRITE  (0x0 << 28)
-#define GFX_RLCG_GC_READ   (0x1 << 28)
-#define GFX_RLCG_MMHUB_WRITE   (0x2 << 28)
-
 #define RLCG_ERROR_REPORT_ENABLED(adev) \
(amdgpu_sriov_reg_indirect_mmhub(adev) || 
amdgpu_sriov_reg_indirect_gc(adev))
 
@@ -1463,38 +1458,6 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1_2[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0x, 0x00c0)
 };
 
-static bool gfx_v10_get_rlcg_flag(struct amdgpu_device *adev, u32 acc_flags, 
u32 hwip,
-int write, u32 *rlcg_flag)
-{
-   switch (hwip) {
-   case GC_HWIP:
-   if (amdgpu_sriov_reg_indirect_gc(adev)) {
-   *rlcg_flag = write ? GFX_RLCG_GC_WRITE : 
GFX_RLCG_GC_READ;
-
-   return true;
-   /* only in new version, AMDGPU_REGS_NO_KIQ and AMDGPU_REGS_RLC 
enabled simultaneously */
-   } else if ((acc_flags & AMDGPU_REGS_RLC) && !(acc_flags & 
AMDGPU_REGS_NO_KIQ)) {
-   *rlcg_flag = GFX_RLCG_GC_WRITE_OLD;
-
-   return true;
-   }
-
-   break;
-   case MMHUB_HWIP:
-   if (amdgpu_sriov_reg_indirect_mmhub(adev) &&
-   (acc_flags & AMDGPU_REGS_RLC) && write) {
-   *rlcg_flag = GFX_RLCG_MMHUB_WRITE;
-   return true;
-   }
-
-   break;
-   default:
-   DRM_DEBUG("Not program register by RLCG\n");
-   }
-
-   return false;
-}
-
 static u32 gfx_v10_rlcg_rw(struct amdgpu_device *adev, u32 offset, u32 v, 
uint32_t flag)
 {
static void *scratch_reg0;
@@ -1575,7 +1538,7 @@ static void gfx_v10_sriov_wreg(struct amdgpu_device 
*adev, u32 offset, u32 value
u32 rlcg_flag;
 
if (!amdgpu_sriov_runtime(adev) &&
-   gfx_v10_get_rlcg_flag(adev, acc_flags, hwip, 1, &rlcg_flag)) {
+   amdgpu_virt_get_rlcg_reg_access_flag(adev, acc_flags, hwip, true, 
&rlcg_flag)) {
gfx_v10_rlcg_rw(adev, offset, value, rlcg_flag);
return;
}
@@ -1591,7 +1554,7 @@ static u32 gfx_v10_sriov_rreg(struct amdgpu_device *adev, 
u32 offset, u32 acc_fl
u32 rlcg_flag;
 
if (!amdgpu_sriov_runtime(adev) &&
-   gfx_v10_get_rlcg_flag(adev, acc_flags, hwip, 0, &rlcg_flag))
+   amdgpu_virt_get_rlcg_reg_access_flag(adev, acc_flags, hwip, false, 
&rlcg_flag))
return gfx_v10_rlcg_rw(adev, offset, 0, rlcg_flag);
 
if (acc_flags & AMDGPU_REGS_NO_KIQ)
-- 
2.17.1



[PATCH 2/9] drm/amdgpu: switch to get_rlcg_reg_access_flag for gfx9

2022-01-20 Thread Hawking Zhang
Switch to the common helper to query the RLCG access flag
specified by the SRIOV host driver for gfx9.

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 33 ---
 1 file changed, 4 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index e12f9f5c3beb..17704cd99aaf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -63,9 +63,6 @@
 #define mmGCEA_PROBE_MAP0x070c
 #define mmGCEA_PROBE_MAP_BASE_IDX   0
 
-#define GFX9_RLCG_GC_WRITE_OLD (0x8 << 28)
-#define GFX9_RLCG_GC_WRITE (0x0 << 28)
-#define GFX9_RLCG_GC_READ  (0x1 << 28)
 #define GFX9_RLCG_VFGATE_DISABLED  0x400
 #define GFX9_RLCG_WRONG_OPERATION_TYPE 0x200
 #define GFX9_RLCG_NOT_IN_RANGE 0x100
@@ -815,35 +812,12 @@ static u32 gfx_v9_0_rlcg_rw(struct amdgpu_device *adev, 
u32 offset, u32 v, uint3
return ret;
 }
 
-static bool gfx_v9_0_get_rlcg_flag(struct amdgpu_device *adev, u32 acc_flags, 
u32 hwip,
-   int write, u32 *rlcg_flag)
-{
-
-   switch (hwip) {
-   case GC_HWIP:
-   if (amdgpu_sriov_reg_indirect_gc(adev)) {
-   *rlcg_flag = write ? GFX9_RLCG_GC_WRITE : 
GFX9_RLCG_GC_READ;
-
-   return true;
-   /* only in new version, AMDGPU_REGS_NO_KIQ and AMDGPU_REGS_RLC 
enabled simultaneously */
-   } else if ((acc_flags & AMDGPU_REGS_RLC) && !(acc_flags & 
AMDGPU_REGS_NO_KIQ) && write) {
-   *rlcg_flag = GFX9_RLCG_GC_WRITE_OLD;
-   return true;
-   }
-
-   break;
-   default:
-   return false;
-   }
-
-   return false;
-}
-
 static u32 gfx_v9_0_sriov_rreg(struct amdgpu_device *adev, u32 offset, u32 
acc_flags, u32 hwip)
 {
u32 rlcg_flag;
 
-   if (!amdgpu_sriov_runtime(adev) && gfx_v9_0_get_rlcg_flag(adev, 
acc_flags, hwip, 0, &rlcg_flag))
+   if (!amdgpu_sriov_runtime(adev) &&
+   amdgpu_virt_get_rlcg_reg_access_flag(adev, acc_flags, hwip, false, 
&rlcg_flag))
return gfx_v9_0_rlcg_rw(adev, offset, 0, rlcg_flag);
 
if (acc_flags & AMDGPU_REGS_NO_KIQ)
@@ -857,7 +831,8 @@ static void gfx_v9_0_sriov_wreg(struct amdgpu_device *adev, 
u32 offset,
 {
u32 rlcg_flag;
 
-   if (!amdgpu_sriov_runtime(adev) && gfx_v9_0_get_rlcg_flag(adev, 
acc_flags, hwip, 1, &rlcg_flag)) {
+   if (!amdgpu_sriov_runtime(adev) &&
+   amdgpu_virt_get_rlcg_reg_access_flag(adev, acc_flags, hwip, true, 
&rlcg_flag)) {
gfx_v9_0_rlcg_rw(adev, offset, value, rlcg_flag);
return;
}
-- 
2.17.1



[PATCH 1/9] drm/amdgpu: add helper to query rlcg reg access flag

2022-01-20 Thread Hawking Zhang
Query the RLC indirect register access approach specified
by the SRIOV host driver per IP block.

Signed-off-by: Hawking Zhang 
Reviewed-by: Zhou, Peng Ju 
Acked-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 35 
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |  8 ++
 2 files changed, 43 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 07bc0f504713..a40e4fcdfa46 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -820,3 +820,38 @@ void amdgpu_virt_update_sriov_video_codec(struct 
amdgpu_device *adev,
}
}
 }
+
+bool amdgpu_virt_get_rlcg_reg_access_flag(struct amdgpu_device *adev, u32 
acc_flags,
+ u32 hwip, bool write, u32 *rlcg_flag)
+{
+   bool ret = false;
+
+   switch (hwip) {
+   case GC_HWIP:
+   if (amdgpu_sriov_reg_indirect_gc(adev)) {
+   *rlcg_flag =
+   write ? AMDGPU_RLCG_GC_WRITE : 
AMDGPU_RLCG_GC_READ;
+   ret = true;
+   /* only in new version, AMDGPU_REGS_NO_KIQ and
+* AMDGPU_REGS_RLC are enabled simultaneously */
+   } else if ((acc_flags & AMDGPU_REGS_RLC) &&
+  !(acc_flags & AMDGPU_REGS_NO_KIQ)) {
+   *rlcg_flag = AMDGPU_RLCG_GC_WRITE_LEGACY;
+   ret = true;
+   }
+   break;
+   case MMHUB_HWIP:
+   if (amdgpu_sriov_reg_indirect_mmhub(adev) &&
+   (acc_flags & AMDGPU_REGS_RLC) && write) {
+   *rlcg_flag = AMDGPU_RLCG_MMHUB_WRITE;
+   ret = true;
+   }
+   break;
+   default:
+   dev_err(adev->dev,
+   "indirect registers access through rlcg is not 
supported\n");
+   ret = false;
+   break;
+   }
+   return ret;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 9adfb8d63280..404a06e57f30 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -32,6 +32,12 @@
 #define AMDGPU_PASSTHROUGH_MODE(1 << 3) /* thw whole GPU is pass 
through for VM */
 #define AMDGPU_SRIOV_CAPS_RUNTIME  (1 << 4) /* is out of full access mode 
*/
 
+/* flags for indirect register access path supported by rlcg for sriov */
+#define AMDGPU_RLCG_GC_WRITE_LEGACY(0x8 << 28)
+#define AMDGPU_RLCG_GC_WRITE   (0x0 << 28)
+#define AMDGPU_RLCG_GC_READ(0x1 << 28)
+#define AMDGPU_RLCG_MMHUB_WRITE(0x2 << 28)
+
 /* all asic after AI use this offset */
 #define mmRCC_IOV_FUNC_IDENTIFIER 0xDE5
 /* tonga/fiji use this offset */
@@ -321,4 +327,6 @@ enum amdgpu_sriov_vf_mode 
amdgpu_virt_get_sriov_vf_mode(struct amdgpu_device *ad
 void amdgpu_virt_update_sriov_video_codec(struct amdgpu_device *adev,
struct amdgpu_video_codec_info *encode, uint32_t 
encode_array_size,
struct amdgpu_video_codec_info *decode, uint32_t 
decode_array_size);
+bool amdgpu_virt_get_rlcg_reg_access_flag(struct amdgpu_device *adev, u32 
acc_flags,
+ u32 hwip, bool write, u32 *rlcg_flag);
 #endif
-- 
2.17.1



RE: [PATCH 2/2] drm/amdkfd: svm range restore work deadlock when process exit

2022-01-20 Thread Ji, Ruili
[AMD Official Use Only]

sudo ./kfdtest --gtest_filter=KFDSVM*
sudo ./kfdtest
Test results are pass.
Tested-by: Ruili Ji 

-Original Message-
From: Yang, Philip 
Sent: 2022年1月20日 0:23
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Ji, Ruili ; 
Yang, Philip 
Subject: [PATCH 2/2] drm/amdkfd: svm range restore work deadlock when process 
exit

kfd_process_notifier_release flushes svm_range_restore_work, which calls 
svm_range_list_lock_and_flush_work to flush the deferred_list work. But if 
the deferred_list work's mmput releases the last user, it calls exit_mmap -> 
notifier_release, which deadlocks with the backtrace below.

Move the svm_range_restore_work flush to kfd_process_wq_release to avoid the 
deadlock. Then have svm_range_restore_work take a task->mm reference so the mm 
cannot go away while ranges are validated and mapped to the GPU.

Workqueue: events svm_range_deferred_list_work [amdgpu]
Call Trace:
 wait_for_completion+0x94/0x100
 __flush_work+0x12a/0x1e0
 __cancel_work_timer+0x10e/0x190
 cancel_delayed_work_sync+0x13/0x20
 kfd_process_notifier_release+0x98/0x2a0 [amdgpu]
 __mmu_notifier_release+0x74/0x1f0
 exit_mmap+0x170/0x200
 mmput+0x5d/0x130
 svm_range_deferred_list_work+0x104/0x230 [amdgpu]
 process_one_work+0x220/0x3c0

Signed-off-by: Philip Yang 
Reported-by: Ruili Ji 
Tested-by: Ruili Ji 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  1 -
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 15 +--
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d1145da5348f..74f162887d3b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1150,7 +1150,6 @@ static void kfd_process_notifier_release(struct 
mmu_notifier *mn,

cancel_delayed_work_sync(&p->eviction_work);
cancel_delayed_work_sync(&p->restore_work);
-   cancel_delayed_work_sync(&p->svms.restore_work);

mutex_lock(&p->mutex);

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 9ec195e1ef23..2d2cae05dbea 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1643,13 +1643,14 @@ static void svm_range_restore_work(struct work_struct 
*work)

pr_debug("restore svm ranges\n");

-   /* kfd_process_notifier_release destroys this worker thread. So during
-* the lifetime of this thread, kfd_process and mm will be valid.
-*/
p = container_of(svms, struct kfd_process, svms);
-   mm = p->mm;
-   if (!mm)
+
+   /* Keep mm reference when svm_range_validate_and_map ranges */
+   mm = get_task_mm(p->lead_thread);
+   if (!mm) {
+   pr_debug("svms 0x%p process mm gone\n", svms);
return;
+   }

svm_range_list_lock_and_flush_work(svms, mm);
mutex_lock(&svms->lock);
@@ -1703,6 +1704,7 @@ static void svm_range_restore_work(struct work_struct 
*work)
 out_reschedule:
mutex_unlock(&svms->lock);
mmap_write_unlock(mm);
+   mmput(mm);

/* If validation failed, reschedule another attempt */
if (evicted_ranges) {
@@ -2837,6 +2839,8 @@ void svm_range_list_fini(struct kfd_process *p)

pr_debug("pasid 0x%x svms 0x%p\n", p->pasid, &p->svms);

+   cancel_delayed_work_sync(&p->svms.restore_work);
+
/* Ensure list work is finished before process is destroyed */
flush_work(&p->svms.deferred_list_work);

@@ -2847,7 +2851,6 @@ void svm_range_list_fini(struct kfd_process *p)
atomic_inc(&p->svms.drain_pagefaults);
svm_range_drain_retry_fault(&p->svms);

-
list_for_each_entry_safe(prange, next, &p->svms.list, list) {
svm_range_unlink(prange);
svm_range_remove_notifier(prange);
--
2.17.1



RE: [PATCH 1/2] drm/amdkfd: svm deferred_list work continue cleanup after mm gone

2022-01-20 Thread Ji, Ruili
[AMD Official Use Only]

sudo ./kfdtest --gtest_filter=KFDSVM*
sudo ./kfdtest
Test results are pass.
Tested-by: Ruili Ji 

-Original Message-
From: Yang, Philip 
Sent: 2022年1月20日 0:23
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Ji, Ruili ; 
Yang, Philip 
Subject: [PATCH 1/2] drm/amdkfd: svm deferred_list work continue cleanup after 
mm gone

After the mm is removed from task->mm, the deferred_list work should continue 
to handle the deferred_range_list, which may have been split into child ranges, 
to avoid leaking child ranges; it should also remove the ranges' mmu interval 
notifiers to avoid leaking the mm's mm_count, but skip updating notifiers and 
inserting new ones.

Signed-off-by: Philip Yang 
Reported-by: Ruili Ji 
Tested-by: Ruili Ji 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 41 
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index f2805ba74c80..9ec195e1ef23 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1985,10 +1985,9 @@ svm_range_update_notifier_and_interval_tree(struct 
mm_struct *mm,  }

 static void
-svm_range_handle_list_op(struct svm_range_list *svms, struct svm_range *prange)
+svm_range_handle_list_op(struct svm_range_list *svms, struct svm_range *prange,
+struct mm_struct *mm)
 {
-   struct mm_struct *mm = prange->work_item.mm;
-
switch (prange->work_item.op) {
case SVM_OP_NULL:
pr_debug("NULL OP 0x%p prange 0x%p [0x%lx 0x%lx]\n", @@ 
-2004,25 +2003,29 @@ svm_range_handle_list_op(struct svm_range_list *svms, 
struct svm_range *prange)
case SVM_OP_UPDATE_RANGE_NOTIFIER:
pr_debug("update notifier 0x%p prange 0x%p [0x%lx 0x%lx]\n",
 svms, prange, prange->start, prange->last);
-   svm_range_update_notifier_and_interval_tree(mm, prange);
+   if (mm)
+   svm_range_update_notifier_and_interval_tree(mm, prange);
break;
case SVM_OP_UPDATE_RANGE_NOTIFIER_AND_MAP:
pr_debug("update and map 0x%p prange 0x%p [0x%lx 0x%lx]\n",
 svms, prange, prange->start, prange->last);
-   svm_range_update_notifier_and_interval_tree(mm, prange);
+   if (mm)
+   svm_range_update_notifier_and_interval_tree(mm, prange);
/* TODO: implement deferred validation and mapping */
break;
case SVM_OP_ADD_RANGE:
pr_debug("add 0x%p prange 0x%p [0x%lx 0x%lx]\n", svms, prange,
 prange->start, prange->last);
svm_range_add_to_svms(prange);
-   svm_range_add_notifier_locked(mm, prange);
+   if (mm)
+   svm_range_add_notifier_locked(mm, prange);
break;
case SVM_OP_ADD_RANGE_AND_MAP:
pr_debug("add and map 0x%p prange 0x%p [0x%lx 0x%lx]\n", svms,
 prange, prange->start, prange->last);
svm_range_add_to_svms(prange);
-   svm_range_add_notifier_locked(mm, prange);
+   if (mm)
+   svm_range_add_notifier_locked(mm, prange);
/* TODO: implement deferred validation and mapping */
break;
default:
@@ -2071,20 +2074,22 @@ static void svm_range_deferred_list_work(struct 
work_struct *work)
pr_debug("enter svms 0x%p\n", svms);

p = container_of(svms, struct kfd_process, svms);
-   /* Avoid mm is gone when inserting mmu notifier */
+
+   /* If mm is gone, continue cleanup the deferred_range_list */
mm = get_task_mm(p->lead_thread);
-   if (!mm) {
+   if (!mm)
pr_debug("svms 0x%p process mm gone\n", svms);
-   return;
-   }
+
 retry:
-   mmap_write_lock(mm);
+   if (mm)
+   mmap_write_lock(mm);

/* Checking for the need to drain retry faults must be inside
 * mmap write lock to serialize with munmap notifiers.
 */
if (unlikely(atomic_read(&svms->drain_pagefaults))) {
-   mmap_write_unlock(mm);
+   if (mm)
+   mmap_write_unlock(mm);
svm_range_drain_retry_fault(svms);
goto retry;
}
@@ -2109,19 +2114,21 @@ static void svm_range_deferred_list_work(struct 
work_struct *work)
pr_debug("child prange 0x%p op %d\n", pchild,
 pchild->work_item.op);
list_del_init(&pchild->child_list);
-   svm_range_handle_list_op(svms, pchild);
+   svm_range_handle_list_op(svms, pchild, mm);
}
mutex_unlock(&prange->migrate_mutex);

-   svm_range_handle_list_op(svms, prange);
+   svm_range_handle_list_op(svms, p

Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when mmio is blocked

2022-01-20 Thread Wang, Yang(Kevin)
[AMD Official Use Only]

Thanks @Koenig, Christian.

Best Regards,
Kevin

From: Koenig, Christian 
Sent: Thursday, January 20, 2022 5:11 PM
To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org 
; Deucher, Alexander 
; Liu, Monk 
Cc: Min, Frank ; Chen, Horace 
Subject: Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when 
mmio is blocked

Hi Kevin,

well at least the HDP flush function has to work correctly or otherwise the 
driver won't work correctly.

If the registers are not accessible any more we need to find a proper 
workaround for this.

One possibility would be to use the KIQ, another is a dummy write/read to make 
sure the HDP is flushed (check the hardware docs).

The third option would be to question if blocking the HDP registers is really a 
good idea.

The solution is up to you, but a workaround like proposed below doesn't really 
help in any way.

Regards,
Christian.

Am 20.01.22 um 10:07 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

Hi Chris,

Yes, I agree with your point. Another option is to use the KIQ to write the 
HDP registers to work around the HDP R/W issue, but programming registers 
through the KIQ causes some performance drop.

What are your thoughts?

Best Regards,
Kevin

From: Koenig, Christian 

Sent: Thursday, January 20, 2022 4:58 PM
To: Wang, Yang(Kevin) ; 
amd-gfx@lists.freedesktop.org 
; Deucher, 
Alexander ; Liu, 
Monk 
Cc: Min, Frank ; Chen, Horace 

Subject: Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when 
mmio is blocked

Well NAK.

Even when we can't R/W HDP registers we need a way to invalidate the HDP or 
quite a bunch of functions won't work correctly.

Blocking CPU base page table updates only works around the symptoms, but 
doesn't really solve anything.

Regards,
Christian.

Am 20.01.22 um 09:46 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

ping...

add @Liu, Monk @Koenig, 
Christian @Deucher, 
Alexander

Best Regards,
Kevin

From: Wang, Yang(Kevin) 
Sent: Wednesday, January 19, 2022 11:16 AM
To: amd-gfx@lists.freedesktop.org 

Cc: Min, Frank ; Chen, Horace 
; Wang, Yang(Kevin) 

Subject: [PATCH] drm/amdgpu: force using sdma to update vm page table when mmio 
is blocked

When the MMIO protection feature is enabled in the hypervisor, the guest OS
cannot read/write HDP registers, and using the CPU to update page tables
does not work well.

Force using SDMA to update page tables when MMIO is blocked.

Signed-off-by: Yang Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b23cb463b106..0f86f0b2e183 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2959,6 +2959,9 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
 AMDGPU_VM_USE_CPU_FOR_GFX);

+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && 
amdgpu_virt_mmio_blocked(adev))
+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
  vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
@@ -3094,6 +3097,10 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm)
 /* Update VM state */
 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
 AMDGPU_VM_USE_CPU_FOR_COMPUTE);
+
+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && 
amdgpu_virt_mmio_blocked(adev))
+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
  vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
--
2.25.1





Re: [PATCH] drm/amdgpu: modify a pair of functions for the pcie port wreg/rreg

2022-01-20 Thread Huang Rui
On Tue, Jan 18, 2022 at 09:36:56PM +0800, Lazar, Lijo wrote:
> 
> 
> On 1/18/2022 4:56 PM, Xiaojian Du wrote:
> > This patch modifies a pair of functions for PCIe port wreg/rreg.
> > AMD GPUs have had an independent NBIO block since the SOC15 architecture.
> > If the driver wants to read/write the address space of PCIe devices,
> > it has to go through the NBIO block.
> > This patch moves the PCIe port wreg/rreg functions to
> > "amdgpu_device.c" so that they can be used on future GPU ASICs.
> > 
> > Signed-off-by: Xiaojian Du 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  4 +++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 33 +
> >   drivers/gpu/drm/amd/amdgpu/nv.c| 34 ++
> >   3 files changed, 39 insertions(+), 32 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index b2da840f4718..691d7868d64d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -1421,6 +1421,10 @@ void amdgpu_device_invalidate_hdp(struct 
> > amdgpu_device *adev,
> > struct amdgpu_ring *ring);
> >   
> >   void amdgpu_device_halt(struct amdgpu_device *adev);
> > +u32 amdgpu_device_pcie_port_rreg(struct amdgpu_device *adev,
> > +   u32 reg);
> > +void amdgpu_device_pcie_port_wreg(struct amdgpu_device *adev,
> > +   u32 reg, u32 v);
> >   
> >   /* atpx handler */
> >   #if defined(CONFIG_VGA_SWITCHEROO)
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index ff4cf0e2a01f..10f2b7cbb49d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -6023,3 +6023,36 @@ void amdgpu_device_halt(struct amdgpu_device *adev)
> > pci_disable_device(pdev);
> > pci_wait_for_pending_transaction(pdev);
> >   }
> > +
> > +u32 amdgpu_device_pcie_port_rreg(struct amdgpu_device *adev,
> > +   u32 reg)
> > +{
> > +   unsigned long flags, address, data;
> > +   u32 r;
> > +
> > +   address = adev->nbio.funcs->get_pcie_port_index_offset(adev);
> > +   data = adev->nbio.funcs->get_pcie_port_data_offset(adev);
> > +
> > +   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> > +   WREG32(address, reg * 4);
> > +   (void)RREG32(address);
> > +   r = RREG32(data);
> > +   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> > +   return r;
> > +}
> > +
> > +void amdgpu_device_pcie_port_wreg(struct amdgpu_device *adev,
> > +   u32 reg, u32 v)
> > +{
> > +   unsigned long flags, address, data;
> > +
> > +   address = adev->nbio.funcs->get_pcie_port_index_offset(adev);
> > +   data = adev->nbio.funcs->get_pcie_port_data_offset(adev);
> > +
> > +   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> > +   WREG32(address, reg * 4);
> > +   (void)RREG32(address);
> > +   WREG32(data, v);
> > +   (void)RREG32(data);
> > +   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> > +}
> > diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c 
> > b/drivers/gpu/drm/amd/amdgpu/nv.c
> > index e52d1114501c..17480c1eeae8 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> > @@ -256,21 +256,6 @@ static u64 nv_pcie_rreg64(struct amdgpu_device *adev, 
> > u32 reg)
> > return amdgpu_device_indirect_rreg64(adev, address, data, reg);
> >   }
> >   
> > -static u32 nv_pcie_port_rreg(struct amdgpu_device *adev, u32 reg)
> > -{
> > -   unsigned long flags, address, data;
> > -   u32 r;
> > -   address = adev->nbio.funcs->get_pcie_port_index_offset(adev);
> > -   data = adev->nbio.funcs->get_pcie_port_data_offset(adev);
> > -
> > -   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> > -   WREG32(address, reg * 4);
> > -   (void)RREG32(address);
> > -   r = RREG32(data);
> > -   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> > -   return r;
> > -}
> > -
> >   static void nv_pcie_wreg64(struct amdgpu_device *adev, u32 reg, u64 v)
> >   {
> > unsigned long address, data;
> > @@ -281,21 +266,6 @@ static void nv_pcie_wreg64(struct amdgpu_device *adev, 
> > u32 reg, u64 v)
> > amdgpu_device_indirect_wreg64(adev, address, data, reg, v);
> >   }
> >   
> > -static void nv_pcie_port_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
> > -{
> > -   unsigned long flags, address, data;
> > -
> > -   address = adev->nbio.funcs->get_pcie_port_index_offset(adev);
> > -   data = adev->nbio.funcs->get_pcie_port_data_offset(adev);
> > -
> > -   spin_lock_irqsave(&adev->pcie_idx_lock, flags);
> > -   WREG32(address, reg * 4);
> > -   (void)RREG32(address);
> > -   WREG32(data, v);
> > -   (void)RREG32(data);
> > -   spin_unlock_irqrestore(&adev->pcie_idx_lock, flags);
> > -}
> > -
> >   static u32 nv_didt_rreg(struct amdgpu_device *adev, u32 reg)
> >   {
> > unsigned long flags, address, data;
> > @@ -709,8

Re: [PATCH 0/3] lib/string_helpers: Add a few string helpers

2022-01-20 Thread Jani Nikula
On Thu, 20 Jan 2022, Petr Mladek  wrote:
> On Wed 2022-01-19 16:16:12, Jani Nikula wrote:
>> On Wed, 19 Jan 2022, Petr Mladek  wrote:
>> > On Tue 2022-01-18 23:24:47, Lucas De Marchi wrote:
>> >> d. This doesn't bring onoff() helper as there are some places in the
>> >>kernel with onoff as variable - another name is probably needed for
>> >>this function in order not to shadow the variable, or those variables
>> >>could be renamed.  Or if people wanting  
>> >>try to find a short one
>> >
>> > I would call it str_on_off().
>> >
>> > And I would actually suggest to use the same style also for
>> > the other helpers.
>> >
>> > The "str_" prefix would make it clear that it is something with
>> > string. There are other _on_off() that affect some
>> > functionality, e.g. mute_led_on_off(), e1000_vlan_filter_on_off().
>> >
>> > The dash '_' would significantly help to parse the name. yesno() and
>> > onoff() are nicely short and kind of acceptable. But "enabledisable()"
>> > is a puzzle.
>> >
>> > IMHO, str_yes_no(), str_on_off(), str_enable_disable() are a good
>> > compromise.
>> >
>> > The main motivation should be code readability. You write the
>> > code once. But many people will read it many times. Open coding
>> > is sometimes better than misleading macro names.
>> >
>> > That said, I do not want to block this patchset. If others like
>> > it... ;-)
>> 
>> I don't mind the names either way. Adding the prefix and dashes is
>> helpful in that it's possible to add the functions first and convert
>> users at leisure, though with a bunch of churn, while using names that
>> collide with existing ones requires the changes to happen in one go.
>
> It is also possible to support both notations at the beginning.
> And convert the existing users in the 2nd step.
>
>> What I do mind is grinding this series to a halt once again. I sent a
>> handful of versions of this three years ago, with inconclusive
>> bikeshedding back and forth, eventually threw my hands up in disgust,
>> and walked away.
>
> Yeah, and I am sorry for bikeshedding. Honestly, I do not know what is
> better. This is why I do not want to block this series when others
> like this.
>
> My main motivation is to point out that:
>
> enabledisable(enable)
>
> might be, for some people, more eye bleeding than
>
> enable ? "enable" : "disable"
>
>
> The problem is not that visible with yesno() and onoff(). But as you said,
> onoff() conflicts with variable names. And enabledisable() sucks.
> As a result, there is a non-trivial risk of two mass changes:

My point is, in the past three years we could have churned through more
than two mass renames just fine, if needed, *if* we had just managed to
merge something for a start!

BR,
Jani.

>
> now:
>
> - contition ? "yes" : "no"
> + yesno(condition)
>
> a few months later:
>
> - yesno(condition)
> + str_yes_no(condition)
>
>
> Best Regards,
> Petr

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when mmio is blocked

2022-01-20 Thread Christian König

Hi Kevin,

well at least the HDP flush function has to work correctly or otherwise 
the driver won't work correctly.


If the registers are not accessible any more we need to find a proper 
workaround for this.


One possibility would be to use the KIQ, another is a dummy write/read 
to make sure the HDP is flushed (check the hardware docs).


The third option would be to question if blocking the HDP registers is 
really a good idea.


The solution is up to you, but a workaround like proposed below doesn't 
really help in any way.


Regards,
Christian.

Am 20.01.22 um 10:07 schrieb Wang, Yang(Kevin):


[AMD Official Use Only]


Hi Chris,

Yes, I agree with your point. Another option is to use the KIQ to write the 
HDP registers to work around the HDP R/W issue, but programming registers 
through the KIQ causes some performance drop.

What are your thoughts?

Best Regards,
Kevin

From: Koenig, Christian 
Sent: Thursday, January 20, 2022 4:58 PM
To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org 
; Deucher, Alexander ; Liu, Monk 
Cc: Min, Frank ; Chen, Horace 
Subject: Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when 
mmio is blocked

Well NAK.

Even when we can't R/W HDP registers we need a way to invalidate the 
HDP or quite a bunch of functions won't work correctly.


Blocking CPU base page table updates only works around the symptoms, 
but doesn't really solve anything.


Regards,
Christian.

Am 20.01.22 um 09:46 schrieb Wang, Yang(Kevin):


[AMD Official Use Only]


ping...

add @Liu, Monk @Koenig, Christian @Deucher, Alexander



Best Regards,
Kevin

From: Wang, Yang(Kevin) 
Sent: Wednesday, January 19, 2022 11:16 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Min, Frank ; Chen, Horace ; Wang, Yang(Kevin) 
Subject: [PATCH] drm/amdgpu: force using sdma to update vm page table when 
mmio is blocked

When the MMIO protection feature is enabled in the hypervisor, the guest OS
cannot read/write HDP registers, and using the CPU to update page tables
does not work well.

Force using SDMA to update page tables when MMIO is blocked.

Signed-off-by: Yang Wang  


---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index b23cb463b106..0f86f0b2e183 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2959,6 +2959,9 @@ int amdgpu_vm_init(struct amdgpu_device *adev, 
struct amdgpu_vm *vm)

 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
AMDGPU_VM_USE_CPU_FOR_GFX);

+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && 
amdgpu_virt_mmio_blocked(adev))

+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
@@ -3094,6 +3097,10 @@ int amdgpu_vm_make_compute(struct 
amdgpu_device *adev, struct amdgpu_vm *vm)

 /* Update VM state */
 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
AMDGPU_VM_USE_CPU_FOR_COMPUTE);
+
+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && 
amdgpu_virt_mmio_blocked(adev))

+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
--
2.25.1







Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when mmio is blocked

2022-01-20 Thread Wang, Yang(Kevin)
[AMD Official Use Only]

Hi Chris,

Yes, I agree with your point. Another option is to use the KIQ to write the 
HDP registers to work around the HDP R/W issue, but programming registers 
through the KIQ causes some performance drop.

What are your thoughts?

Best Regards,
Kevin

From: Koenig, Christian 
Sent: Thursday, January 20, 2022 4:58 PM
To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org 
; Deucher, Alexander 
; Liu, Monk 
Cc: Min, Frank ; Chen, Horace 
Subject: Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when 
mmio is blocked

Well NAK.

Even when we can't R/W HDP registers we need a way to invalidate the HDP or 
quite a bunch of functions won't work correctly.

Blocking CPU base page table updates only works around the symptoms, but 
doesn't really solve anything.

Regards,
Christian.

Am 20.01.22 um 09:46 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

ping...

add @Liu, Monk @Koenig, 
Christian @Deucher, 
Alexander

Best Regards,
Kevin

From: Wang, Yang(Kevin) 
Sent: Wednesday, January 19, 2022 11:16 AM
To: amd-gfx@lists.freedesktop.org 

Cc: Min, Frank ; Chen, Horace 
; Wang, Yang(Kevin) 

Subject: [PATCH] drm/amdgpu: force using sdma to update vm page table when mmio 
is blocked

when the mmio protection feature is enabled in the hypervisor,
the guest OS can't R/W HDP registers,
and using the cpu to update page tables does not work well.

force using sdma to update page table when mmio is blocked.

Signed-off-by: Yang Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b23cb463b106..0f86f0b2e183 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2959,6 +2959,9 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm)
 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
 AMDGPU_VM_USE_CPU_FOR_GFX);

+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && 
amdgpu_virt_mmio_blocked(adev))
+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
  vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
@@ -3094,6 +3097,10 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, 
struct amdgpu_vm *vm)
 /* Update VM state */
 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
 AMDGPU_VM_USE_CPU_FOR_COMPUTE);
+
+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && 
amdgpu_virt_mmio_blocked(adev))
+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
  vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
--
2.25.1




[PATCH 1/2] drm/amdgpu: grab a PM reference on importing DMA-bufs

2022-01-20 Thread Christian König
We need the device alive and kicking for the move notify callback to work
correctly. Not sure if we should have that here or in the callback itself,
but go with the defensive variant for now.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 37 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  4 +--
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index a9475b207510..8756f505c87d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -512,6 +512,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct 
drm_device *dev,
 {
struct dma_buf_attachment *attach;
struct drm_gem_object *obj;
+   int r;
 
if (dma_buf->ops == &amdgpu_dmabuf_ops) {
obj = dma_buf->priv;
@@ -525,20 +526,48 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct 
drm_device *dev,
}
}
 
+   r = pm_runtime_resume_and_get(dev->dev);
+   if (r)
+   return ERR_PTR(r);
+
obj = amdgpu_dma_buf_create_obj(dev, dma_buf);
-   if (IS_ERR(obj))
-   return obj;
+   if (IS_ERR(obj)) {
+   r = PTR_ERR(obj);
+   goto err_pm;
+   }
 
attach = dma_buf_dynamic_attach(dma_buf, dev->dev,
&amdgpu_dma_buf_attach_ops, obj);
if (IS_ERR(attach)) {
-   drm_gem_object_put(obj);
-   return ERR_CAST(attach);
+   r = PTR_ERR(attach);
+   goto err_put;
}
 
get_dma_buf(dma_buf);
obj->import_attach = attach;
return obj;
+
+err_put:
+   drm_gem_object_put(obj);
+
+err_pm:
+   pm_runtime_put_autosuspend(dev->dev);
+   return ERR_PTR(r);
+}
+
+/**
+ * amdgpu_gem_prime_destroy - destroy an imported BO again
+ * @bo: the imported BO
+ *
+ * Make sure to cleanup the SG table, detach from the DMA-buf and drop the PM
+ * reference we grabbed.
+ */
+void amdgpu_gem_prime_destroy(struct amdgpu_bo *bo)
+{
+   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+
+   drm_prime_gem_destroy(&bo->tbo.base, bo->tbo.sg);
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
index 3e93b9b407a9..14cc6a873444 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h
@@ -29,6 +29,7 @@ struct dma_buf *amdgpu_gem_prime_export(struct drm_gem_object 
*gobj,
int flags);
 struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf);
+void amdgpu_gem_prime_destroy(struct amdgpu_bo *bo);
 bool amdgpu_dmabuf_is_xgmi_accessible(struct amdgpu_device *adev,
  struct amdgpu_bo *bo);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index ff9dc377a3a0..6a22eaf38056 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -39,6 +39,7 @@
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
+#include "amdgpu_dma_buf.h"
 
 /**
  * DOC: amdgpu_object
@@ -58,9 +59,8 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
 
amdgpu_bo_kunmap(bo);
-
if (bo->tbo.base.import_attach)
-   drm_prime_gem_destroy(&bo->tbo.base, bo->tbo.sg);
+   amdgpu_gem_prime_destroy(bo);
drm_gem_object_release(&bo->tbo.base);
amdgpu_bo_unref(&bo->parent);
kvfree(bo);
-- 
2.25.1



[PATCH 2/2] drm/amdgpu: protected amdgpu_dma_buf_move_notify against hotplug

2022-01-20 Thread Christian König
Add the proper drm_dev_enter()/drm_dev_exit() calls here.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 8756f505c87d..eb31ba3da403 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -36,6 +36,7 @@
 #include "amdgpu_gem.h"
 #include "amdgpu_dma_buf.h"
 #include "amdgpu_xgmi.h"
+#include 
 #include 
 #include 
 #include 
@@ -447,14 +448,18 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment 
*attach)
struct ttm_operation_ctx ctx = { false, false };
struct ttm_placement placement = {};
struct amdgpu_vm_bo_base *bo_base;
-   int r;
+   int idx, r;
 
if (bo->tbo.resource->mem_type == TTM_PL_SYSTEM)
return;
 
+   if (!drm_dev_enter(adev_to_drm(adev), &idx))
+   return;
+
r = ttm_bo_validate(&bo->tbo, &placement, &ctx);
if (r) {
DRM_ERROR("Failed to invalidate DMA-buf import (%d))\n", r);
+   drm_dev_exit(idx);
return;
}
 
@@ -490,6 +495,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment 
*attach)
 
dma_resv_unlock(resv);
}
+   drm_dev_exit(idx);
 }
 
 static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
-- 
2.25.1



Re: amd-staging-drm-next breaks suspend

2022-01-20 Thread Ma, Jun
The WARN_ON is still triggered because of an empty gart.ptr
in amdgpu_gart_bind().

On 1/20/2022 10:56 AM, Chen, Guchun wrote:
> [Public]
> 
> [ 1.310551] trying to bind memory to uninitialized GART !
> 
> This is a warning only, it should not break suspend/resume. There is a fix on 
> drm-next for this "drm/amdgpu: remove gart.ready flag", pls have a try.
> If you still observe suspend issue, I guess it's caused by other regression. 
> Then can you pls bisect it?
> 
> Regards,
> Guchun
> 
> -Original Message-
> From: amd-gfx  On Behalf Of Bert 
> Karwatzki
> Sent: Thursday, January 20, 2022 5:52 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chris Hixon ; Zhuo, Qingqing (Lillian) 
> ; Scott Bruce ; Limonciello, Mario 
> ; Alex Deucher ; 
> Kazlauskas, Nicholas 
> Subject: amd-staging-drm-next breaks suspend
> 
> I just tested drm-staging-drm-next with HEAD
> f1b2924ee6929cb431440e6f961f06eb65d52beb:
> Going into suspend leads to a hang again:
> This is probably caused by
> [ 1.310551] trying to bind memory to uninitialized GART !
> and/or
> [ 3.976438] trying to bind memory to uninitialized GART !
> 
> 
> Here's the complete dmesg:
> [ 0.00] Linux version 5.13.0+ (bert@lisa) (gcc (Debian 11.2.0-14)
> 11.2.0, GNU ld (GNU Binutils for Debian) 2.37.50.20220106) #4 SMP Wed
> Jan 19 22:19:19 CET 2022
> [ 0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0+
> root=UUID=78dcbf14-902d-49c0-9d4d-b7ad84550d9a ro
> mt7921e.disable_aspm=1 quiet
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating
> point registers'
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys
> User registers'
> [ 0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
> [ 0.00] x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
> [ 0.00] x86/fpu: Enabled xstate features 0x207, context size is 840
> bytes, using 'compacted' format.
> [ 0.00] BIOS-provided physical RAM map:
> [ 0.00] BIOS-e820: [mem 0x-0x0009]
> usable
> [ 0.00] BIOS-e820: [mem 0x000a-0x000f]
> reserved
> [ 0.00] BIOS-e820: [mem 0x0010-0x09bfefff]
> usable
> [ 0.00] BIOS-e820: [mem 0x09bff000-0x0a000fff]
> reserved
> [ 0.00] BIOS-e820: [mem 0x0a001000-0x0a1f]
> usable
> [ 0.00] BIOS-e820: [mem 0x0a20-0x0a20efff] ACPI
> NVS
> [ 0.00] BIOS-e820: [mem 0x0a20f000-0xe9e1]
> usable
> [ 0.00] BIOS-e820: [mem 0xe9e2-0xeb33efff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xeb33f000-0xeb39efff] ACPI
> data
> [ 0.00] BIOS-e820: [mem 0xeb39f000-0xeb556fff] ACPI
> NVS
> [ 0.00] BIOS-e820: [mem 0xeb557000-0xed17cfff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xed17d000-0xed1fefff] type
> 20
> [ 0.00] BIOS-e820: [mem 0xed1ff000-0xedff]
> usable
> [ 0.00] BIOS-e820: [mem 0xee00-0xf7ff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfd00-0xfdff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfeb8-0xfec01fff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfec1-0xfec10fff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfed0-0xfed00fff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfed4-0xfed44fff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfed8-0xfed8]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfedc4000-0xfedc9fff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfedcc000-0xfedcefff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xfedd5000-0xfedd5fff]
> reserved
> [ 0.00] BIOS-e820: [mem 0xff00-0x]
> reserved
> [ 0.00] BIOS-e820: [mem 0x0001-0x0003ee2f]
> usable
> [ 0.00] BIOS-e820: [mem 0x0003ee30-0x00040fff]
> reserved
> [ 0.00] NX (Execute Disable) protection: active
> [ 0.00] efi: EFI v2.70 by American Megatrends
> [ 0.00] efi: ACPI=0xeb54 ACPI 2.0=0xeb540014
> TPMFinalLog=0xeb50c000 SMBIOS=0xed02 SMBIOS 3.0=0xed01f000
> MEMATTR=0xe6fa3018 ESRT=0xe87cb918 MOKvar=0xe6fa
> [ 0.00] SMBIOS 3.3.0 present.
> [ 0.00] DMI: Micro-Star International Co., Ltd. Alpha 15 B5EEK/MS-
> 158L, BIOS E158LAMS.107 11/10/2021
> [ 0.00] tsc: Fast TSC calibration using PIT
> [ 0.00] tsc: Detected 3194.034 MHz processor
> [ 0.000125] e820: update [mem 0x-0x0fff] usable ==>
> reserved
> [ 0.000126] e820: remove [mem 0x000a-0x000f] usable
> [ 0.000131] last_pfn = 0x3ee300 max_arch_pfn = 0x4
> [ 0.000363] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
> [ 0.000577] e820: update [mem 0xf000-0xff

Re: [PATCH] drm/amdgpu: force using sdma to update vm page table when mmio is blocked

2022-01-20 Thread Christian König

Well NAK.

Even when we can't R/W HDP registers we need a way to invalidate the HDP 
or quite a bunch of functions won't work correctly.


Blocking CPU base page table updates only works around the symptoms, but 
doesn't really solve anything.


Regards,
Christian.

Am 20.01.22 um 09:46 schrieb Wang, Yang(Kevin):


[AMD Official Use Only]


ping...

add @Liu, Monk  @Koenig, Christian 
 @Deucher, Alexander 



Best Regards,
Kevin

*From:* Wang, Yang(Kevin) 
*Sent:* Wednesday, January 19, 2022 11:16 AM
*To:* amd-gfx@lists.freedesktop.org 
*Cc:* Min, Frank ; Chen, Horace 
; Wang, Yang(Kevin) 
*Subject:* [PATCH] drm/amdgpu: force using sdma to update vm page 
table when mmio is blocked

when the mmio protection feature is enabled in the hypervisor,
the guest OS can't R/W HDP registers,
and using the cpu to update page tables does not work well.

force using sdma to update page table when mmio is blocked.

Signed-off-by: Yang Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index b23cb463b106..0f86f0b2e183 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2959,6 +2959,9 @@ int amdgpu_vm_init(struct amdgpu_device *adev, 
struct amdgpu_vm *vm)

 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
AMDGPU_VM_USE_CPU_FOR_GFX);

+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && amdgpu_virt_mmio_blocked(adev))
+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
  vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
@@ -3094,6 +3097,10 @@ int amdgpu_vm_make_compute(struct amdgpu_device 
*adev, struct amdgpu_vm *vm)

 /* Update VM state */
 vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
AMDGPU_VM_USE_CPU_FOR_COMPUTE);
+
+   if (vm->use_cpu_for_update && amdgpu_sriov_vf(adev) && amdgpu_virt_mmio_blocked(adev))
+   vm->use_cpu_for_update = false;
+
 DRM_DEBUG_DRIVER("VM update mode is %s\n",
  vm->use_cpu_for_update ? "CPU" : "SDMA");
 WARN_ONCE((vm->use_cpu_for_update &&
--
2.25.1





RE: [PATCH V2 1/2] drm/amdgpu: Move xgmi ras initialization from .late_init to .early_init

2022-01-20 Thread Chai, Thomas


-Original Message-
From: Lazar, Lijo  
Sent: Thursday, January 20, 2022 3:32 PM
To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
Cc: Zhou1, Tao ; Zhang, Hawking ; 
Clements, John 
Subject: Re: [PATCH V2 1/2] drm/amdgpu: Move xgmi ras initialization from 
.late_init to .early_init



On 1/20/2022 12:57 PM, Chai, Thomas wrote:
> 
> -Original Message-
> From: Lazar, Lijo 
> Sent: Thursday, January 20, 2022 1:49 PM
> To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
> Cc: Zhou1, Tao ; Zhang, Hawking 
> ; Clements, John ; Chai, 
> Thomas 
> Subject: Re: [PATCH V2 1/2] drm/amdgpu: Move xgmi ras initialization 
> from .late_init to .early_init
> 
> 
> 
> On 1/20/2022 8:48 AM, yipechai wrote:
>> Move xgmi ras initialization from .late_init to .early_init, which 
>> let xgmi ras can be initialized only once.
>>
>> Signed-off-by: yipechai 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 15 ++-
>>drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
>>drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  5 +
>>drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   |  5 +
>>4 files changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> index 3483a82f5734..788c0257832d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> @@ -436,6 +436,16 @@ void amdgpu_gmc_filter_faults_remove(struct 
>> amdgpu_device *adev, uint64_t addr,
>>  } while (fault->timestamp < tmp);
>>}
>>
>> +int amdgpu_gmc_ras_early_init(struct amdgpu_device *adev) {
>> +if (!adev->gmc.xgmi.connected_to_cpu) {
>> +adev->gmc.xgmi.ras = &xgmi_ras;
>> +amdgpu_ras_register_ras_block(adev, 
>> &adev->gmc.xgmi.ras->ras_block);
>> +}
>> +
>> +return 0;
>> +}
>> +
>>int amdgpu_gmc_ras_late_init(struct amdgpu_device *adev)
>>{
>>  int r;
>> @@ -452,11 +462,6 @@ int amdgpu_gmc_ras_late_init(struct amdgpu_device *adev)
>>  return r;
>>  }
>>
>> -if (!adev->gmc.xgmi.connected_to_cpu) {
>> -adev->gmc.xgmi.ras = &xgmi_ras;
>> -amdgpu_ras_register_ras_block(adev, 
>> &adev->gmc.xgmi.ras->ras_block);
>> -}
>> -
>>  if (adev->gmc.xgmi.ras && adev->gmc.xgmi.ras->ras_block.ras_late_init) {
>>  r = adev->gmc.xgmi.ras->ras_block.ras_late_init(adev, NULL);
>>  if (r)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> index 0001631cfedb..ac4c0e50b45c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> @@ -318,6 +318,7 @@ bool amdgpu_gmc_filter_faults(struct amdgpu_device *adev,
>>uint16_t pasid, uint64_t timestamp);
>>void amdgpu_gmc_filter_faults_remove(struct amdgpu_device *adev, uint64_t 
>> addr,
>>   uint16_t pasid);
>> +int amdgpu_gmc_ras_early_init(struct amdgpu_device *adev);
>>int amdgpu_gmc_ras_late_init(struct amdgpu_device *adev);
>>void amdgpu_gmc_ras_fini(struct amdgpu_device *adev);
>>int amdgpu_gmc_allocate_vm_inv_eng(struct amdgpu_device *adev); 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> index 4f8d356f8432..7a6ad5d467b2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> @@ -719,6 +719,7 @@ static void gmc_v10_0_set_gfxhub_funcs(struct 
>> amdgpu_device *adev)
>>
>>static int gmc_v10_0_early_init(void *handle)
>>{
>> +int r;
>>  struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>
>>  gmc_v10_0_set_mmhub_funcs(adev);
>> @@ -734,6 +735,10 @@ static int gmc_v10_0_early_init(void *handle)
>>  adev->gmc.private_aperture_end =
>>  adev->gmc.private_aperture_start + (4ULL << 30) - 1;
>>
>> +r = amdgpu_gmc_ras_early_init(adev);
>> +if (r)
>> +return r;
>> +
> 
>> At this point it's unknown if RAS is applicable for the SKU. I think this 
>> failure check shouldn't be there (here and below one).
> 
>> amdgpu_gmc_ras_early_init is return 0 always, that way also this check is 
>> not needed.
> 
> [Thomas]  Just like calling amdgpu_gmc_ras_late_init,  checking the return 
> status may make the code extensible.
>  In amdgpu_gmc_ras_early_init,  the xgmi ras initialization may 
> always return 0, but it may add functions that need to check the return 
> status in future.
> 

>At this point, it's unknown

>1) If the device is part of XGMI hive or not.
>2) If the device supports RAS.

>For such devices, it doesn't make any sense to fail here based on this 
>function.

[Thomas] The current code in amdgpu_gmc_ras_early_init has no effect on these
devices, whether the device is part of an xgmi hive or not, and whether it
supports RAS or not.
  Checking the return status is just fo

Re: [PATCH v3 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-01-20 Thread Alistair Popple
On Wednesday, 12 January 2022 10:06:03 PM AEDT Alistair Popple wrote:
> I have been looking at this in relation to the migration code and noticed we
> have the following in try_to_migrate():
> 
> if (is_zone_device_page(page) && !is_device_private_page(page))
> return;
> 
> Which if I'm understanding correctly means that migration of device coherent
> pages will always fail. Given that I do wonder how hmm-tests are passing, but
> I assume you must always be hitting this fast path in
> migrate_vma_collect_pmd():
> 
> /*
>  * Optimize for the common case where page is only mapped once
>  * in one process. If we can lock the page, then we can safely
>  * set up a special migration page table entry now.
>  */
> 
> Meaning that try_to_migrate() never gets called from migrate_vma_unmap(). So
> you will also need some changes to try_to_migrate() and possibly
> try_to_migrate_one() to make this reliable.

I have been running the hmm tests with the changes below. I'm pretty sure these
are correct because the only zone device pages try_to_migrate_one() should be
called on are device coherent/private, and coherent pages can be treated just
the same as a normal pages for migration. However it would be worth checking I
haven't missed anything.

 - Alistair

---

diff --git a/mm/rmap.c b/mm/rmap.c
index 163ac4e6bcee..15f56c27daab 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1806,7 +1806,7 @@ static bool try_to_migrate_one(struct page *page, struct 
vm_area_struct *vma,
/* Update high watermark before we lower rss */
update_hiwater_rss(mm);
 
-   if (is_zone_device_page(page)) {
+   if (is_device_private_page(page)) {
unsigned long pfn = page_to_pfn(page);
swp_entry_t entry;
pte_t swp_pte;
@@ -1947,7 +1947,7 @@ void try_to_migrate(struct page *page, enum ttu_flags 
flags)
TTU_SYNC)))
return;
 
-   if (is_zone_device_page(page) && !is_device_private_page(page))
+   if (is_zone_device_page(page) && !is_device_page(page))
return;
 
/*





Re: [PATCH v3 08/10] lib: add support for device coherent type in test_hmm

2022-01-20 Thread Alistair Popple
On Tuesday, 11 January 2022 9:31:59 AM AEDT Alex Sierra wrote:
> Device Coherent type uses device memory that is coherently accessible by
> the CPU. This shows up as an SP (special purpose) memory range in the
> BIOS-e820 memory enumeration. If no SP memory is supported by the
> system, it can be faked by setting CONFIG_EFI_FAKE_MEMMAP.
> 
> Currently, test_hmm only supports two different SP ranges of at least
> 256MB size. This could be specified in the kernel parameter variable
> efi_fake_mem. Ex. Two SP ranges of 1GB starting at 0x1 &
> 0x14000 physical address. Ex.
> efi_fake_mem=1G@0x1:0x4,1G@0x14000:0x4
> 
> Private and coherent device mirror instances can be created in the same
> probe. This is done by passing the module parameters spm_addr_dev0 &
> spm_addr_dev1. In this case, it will create four instances of
> device_mirror. The first two correspond to private device type, the
> last two to coherent type. Then, they can be easily accessed from user
> space through /dev/hmm_mirror. Usually num_device 0 and 1
> are for private, and 2 and 3 for coherent types. If no module
> parameters are passed, two instances of private type device_mirror will
> be created only.
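Putting the commit message's two steps together — reserve SP ranges at boot, then load the driver — a typical invocation might look like the following sketch (the addresses and the `:0x40000` attribute value, assumed to be EFI_MEMORY_SP, are illustrative; adjust to whatever ranges your system reserves):

```shell
# Kernel command line: fake two 1 GiB special-purpose (SP) ranges,
# requires CONFIG_EFI_FAKE_MEMMAP:
#   efi_fake_mem=1G@0x100000000:0x40000,1G@0x140000000:0x40000

# Then load the test driver with both SP start addresses, creating two
# private and two coherent device_mirror instances:
modprobe test_hmm spm_addr_dev0=0x100000000 spm_addr_dev1=0x140000000
```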
> 
> Signed-off-by: Alex Sierra 
> ---
>  lib/test_hmm.c  | 247 
>  lib/test_hmm_uapi.h |  15 ++-
>  2 files changed, 193 insertions(+), 69 deletions(-)
> 
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index 9edeff52302e..7c641c5a9cfa 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -29,11 +29,22 @@
>  
>  #include "test_hmm_uapi.h"
>  
> -#define DMIRROR_NDEVICES 2
> +#define DMIRROR_NDEVICES 4
>  #define DMIRROR_RANGE_FAULT_TIMEOUT  1000
>  #define DEVMEM_CHUNK_SIZE(256 * 1024 * 1024U)
>  #define DEVMEM_CHUNKS_RESERVE16
>  
> +/*
> + * For device_private pages, dpage is just a dummy struct page
> + * representing a piece of device memory. dmirror_devmem_alloc_page
> + * allocates a real system memory page as backing storage to fake a
> + * real device. zone_device_data points to that backing page. But
> + * for device_coherent memory, the struct page represents real
> + * physical CPU-accessible memory that we can use directly.
> + */
> +#define BACKING_PAGE(page) (is_device_private_page((page)) ? \
> +(page)->zone_device_data : (page))
> +
>  static unsigned long spm_addr_dev0;
>  module_param(spm_addr_dev0, long, 0644);
>  MODULE_PARM_DESC(spm_addr_dev0,
> @@ -122,6 +133,21 @@ static int dmirror_bounce_init(struct dmirror_bounce 
> *bounce,
>   return 0;
>  }
>  
> +static bool dmirror_is_private_zone(struct dmirror_device *mdevice)
> +{
> + return (mdevice->zone_device_type ==
> + HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ? true : false;
> +}
> +
> +static enum migrate_vma_direction
> + dmirror_select_device(struct dmirror *dmirror)
> +{
> + return (dmirror->mdevice->zone_device_type ==
> + HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ?
> + MIGRATE_VMA_SELECT_DEVICE_PRIVATE :
> + MIGRATE_VMA_SELECT_DEVICE_COHERENT;
> +}
> +
>  static void dmirror_bounce_fini(struct dmirror_bounce *bounce)
>  {
>   vfree(bounce->ptr);
> @@ -572,16 +598,19 @@ static int dmirror_allocate_chunk(struct dmirror_device 
> *mdevice,
>  static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
>  {
>   struct page *dpage = NULL;
> - struct page *rpage;
> + struct page *rpage = NULL;
>  
>   /*
> -  * This is a fake device so we alloc real system memory to store
> -  * our device memory.
> +  * For ZONE_DEVICE private type, this is a fake device so we alloc real
> +  * system memory to store our device memory.
> +  * For ZONE_DEVICE coherent type we use the actual dpage to store the 
> data
> +  * and ignore rpage.
>*/
> - rpage = alloc_page(GFP_HIGHUSER);
> - if (!rpage)
> - return NULL;
> -
> + if (dmirror_is_private_zone(mdevice)) {
> + rpage = alloc_page(GFP_HIGHUSER);
> + if (!rpage)
> + return NULL;
> + }
>   spin_lock(&mdevice->lock);
>  
>   if (mdevice->free_pages) {
> @@ -601,7 +630,8 @@ static struct page *dmirror_devmem_alloc_page(struct 
> dmirror_device *mdevice)
>   return dpage;
>  
>  error:
> - __free_page(rpage);
> + if (rpage)
> + __free_page(rpage);
>   return NULL;
>  }
>  
> @@ -627,12 +657,15 @@ static void dmirror_migrate_alloc_and_copy(struct 
> migrate_vma *args,
>* unallocated pte_none() or read-only zero page.
>*/
>   spage = migrate_pfn_to_page(*src);
> + WARN(spage && is_zone_device_page(spage),
> +  "page already in device spage pfn: 0x%lx\n",
> +  page_to_pfn(spage));

This should also lead to test failure because we are only supposed to be
selecting sy

Re: [PATCH v3 10/10] tools: update test_hmm script to support SP config

2022-01-20 Thread Alistair Popple
Looks good,

Reviewed-by: Alistair Popple 

On Tuesday, 11 January 2022 9:32:01 AM AEDT Alex Sierra wrote:
> Add two more parameters to set spm_addr_dev0 & spm_addr_dev1
> addresses. These two parameters configure the start SP
> addresses for each device in test_hmm driver.
> Consequently, this configures zone device type as coherent.
> 
> Signed-off-by: Alex Sierra 
> ---
> v2:
> Add more mknods for device coherent type. These are represented under
> /dev/hmm_mirror2 and /dev/hmm_mirror3, only in case they have created
> at probing the hmm-test driver.
> ---
>  tools/testing/selftests/vm/test_hmm.sh | 24 +---
>  1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/vm/test_hmm.sh 
> b/tools/testing/selftests/vm/test_hmm.sh
> index 0647b525a625..539c9371e592 100755
> --- a/tools/testing/selftests/vm/test_hmm.sh
> +++ b/tools/testing/selftests/vm/test_hmm.sh
> @@ -40,11 +40,26 @@ check_test_requirements()
>  
>  load_driver()
>  {
> - modprobe $DRIVER > /dev/null 2>&1
> + if [ $# -eq 0 ]; then
> + modprobe $DRIVER > /dev/null 2>&1
> + else
> + if [ $# -eq 2 ]; then
> + modprobe $DRIVER spm_addr_dev0=$1 spm_addr_dev1=$2
> + > /dev/null 2>&1
> + else
> + echo "Missing module parameters. Make sure pass"\
> + "spm_addr_dev0 and spm_addr_dev1"
> + usage
> + fi
> + fi
>   if [ $? == 0 ]; then
>   major=$(awk "\$2==\"HMM_DMIRROR\" {print \$1}" /proc/devices)
>   mknod /dev/hmm_dmirror0 c $major 0
>   mknod /dev/hmm_dmirror1 c $major 1
> + if [ $# -eq 2 ]; then
> + mknod /dev/hmm_dmirror2 c $major 2
> + mknod /dev/hmm_dmirror3 c $major 3
> + fi
>   fi
>  }
>  
> @@ -58,7 +73,7 @@ run_smoke()
>  {
>   echo "Running smoke test. Note, this test provides basic coverage."
>  
> - load_driver
> + load_driver $1 $2
>   $(dirname "${BASH_SOURCE[0]}")/hmm-tests
>   unload_driver
>  }
> @@ -75,6 +90,9 @@ usage()
>   echo "# Smoke testing"
>   echo "./${TEST_NAME}.sh smoke"
>   echo
> + echo "# Smoke testing with SPM enabled"
> + echo "./${TEST_NAME}.sh smoke  "
> + echo
>   exit 0
>  }
>  
> @@ -84,7 +102,7 @@ function run_test()
>   usage
>   else
>   if [ "$1" = "smoke" ]; then
> - run_smoke
> + run_smoke $2 $3
>   else
>   usage
>   fi
> 






Re: [PATCH v3 07/10] lib: test_hmm add module param for zone device type

2022-01-20 Thread Alistair Popple
Thanks for splitting the coherent devices into separate device nodes. Couple of
comments below.

On Tuesday, 11 January 2022 9:31:58 AM AEDT Alex Sierra wrote:
> In order to configure device coherent in test_hmm, two module parameters
> should be passed, which correspond to the SP start address of each
> device (2) spm_addr_dev0 & spm_addr_dev1. If no parameters are passed,
> private device type is configured.
> 
> Signed-off-by: Alex Sierra 
> ---
>  lib/test_hmm.c  | 74 +++--
>  lib/test_hmm_uapi.h |  1 +
>  2 files changed, 53 insertions(+), 22 deletions(-)
> 
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index 97e48164d56a..9edeff52302e 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -34,6 +34,16 @@
>  #define DEVMEM_CHUNK_SIZE(256 * 1024 * 1024U)
>  #define DEVMEM_CHUNKS_RESERVE16
>  
> +static unsigned long spm_addr_dev0;
> +module_param(spm_addr_dev0, long, 0644);
> +MODULE_PARM_DESC(spm_addr_dev0,
> + "Specify start address for SPM (special purpose memory) used 
> for device 0. By setting this Coherent device type will be used. Make sure 
> spm_addr_dev1 is set too");

It would be useful if you could mention the required size for this region
(ie. DEVMEM_CHUNK_SIZE).

> +
> +static unsigned long spm_addr_dev1;
> +module_param(spm_addr_dev1, long, 0644);
> +MODULE_PARM_DESC(spm_addr_dev1,
> + "Specify start address for SPM (special purpose memory) used 
> for device 1. By setting this Coherent device type will be used. Make sure 
> spm_addr_dev0 is set too");
> +
>  static const struct dev_pagemap_ops dmirror_devmem_ops;
>  static const struct mmu_interval_notifier_ops dmirror_min_ops;
>  static dev_t dmirror_dev;
> @@ -452,29 +462,44 @@ static int dmirror_write(struct dmirror *dmirror, 
> struct hmm_dmirror_cmd *cmd)
>   return ret;
>  }
>  
> -static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
> +static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
>  struct page **ppage)
>  {
>   struct dmirror_chunk *devmem;
> - struct resource *res;
> + struct resource *res = NULL;
>   unsigned long pfn;
>   unsigned long pfn_first;
>   unsigned long pfn_last;
>   void *ptr;
> + int ret = -ENOMEM;
>  
>   devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
>   if (!devmem)
> - return false;
> + return ret;
>  
> - res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
> -   "hmm_dmirror");
> - if (IS_ERR(res))
> + switch (mdevice->zone_device_type) {
> + case HMM_DMIRROR_MEMORY_DEVICE_PRIVATE:
> + res = request_free_mem_region(&iomem_resource, 
> DEVMEM_CHUNK_SIZE,
> +   "hmm_dmirror");
> + if (IS_ERR_OR_NULL(res))
> + goto err_devmem;
> + devmem->pagemap.range.start = res->start;
> + devmem->pagemap.range.end = res->end;
> + devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
> + break;
> + case HMM_DMIRROR_MEMORY_DEVICE_COHERENT:
> + devmem->pagemap.range.start = (MINOR(mdevice->cdevice.dev) - 2) 
> ?
> + spm_addr_dev0 :
> + spm_addr_dev1;
> + devmem->pagemap.range.end = devmem->pagemap.range.start +
> + DEVMEM_CHUNK_SIZE - 1;
> + devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
> + break;
> + default:
> + ret = -EINVAL;
>   goto err_devmem;
> + }
>  
> - mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;

What initialises mdevice->zone_device_type now? It looks like it needs to get
initialised in hmm_dmirror_init(), which would be easier to do in the previous
patch rather than adding it here in the first place.

> - devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
> - devmem->pagemap.range.start = res->start;
> - devmem->pagemap.range.end = res->end;
>   devmem->pagemap.nr_range = 1;
>   devmem->pagemap.ops = &dmirror_devmem_ops;
>   devmem->pagemap.owner = mdevice;
> @@ -495,10 +520,14 @@ static bool dmirror_allocate_chunk(struct 
> dmirror_device *mdevice,
>   mdevice->devmem_capacity = new_capacity;
>   mdevice->devmem_chunks = new_chunks;
>   }
> -
>   ptr = memremap_pages(&devmem->pagemap, numa_node_id());
> - if (IS_ERR(ptr))
> + if (IS_ERR_OR_NULL(ptr)) {
> + if (ptr)
> + ret = PTR_ERR(ptr);
> + else
> + ret = -EFAULT;
>   goto err_release;
> + }
>  
>   devmem->mdevice = mdevice;
>   pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT;
> @@ -527,15 +556,17 @@ static bool dmirror_allocate_chunk(struct 
> dm
