Re: [RFC PATCH 02/18] drm/ttm: Add per-BO eviction tracking

2024-04-25 Thread Christian König

On 25.04.24 21:02, Matthew Brost wrote:

On Thu, Apr 25, 2024 at 08:18:38AM +0200, Christian König wrote:

On 24.04.24 18:56, Friedrich Vock wrote:

Make each buffer object aware of whether it has been evicted or not.

That reverts some changes we made a couple of years ago.

In general the idea is that eviction isn't something we need to reverse in
TTM.

Rather the driver gives the desired placement.

Regards,
Christian.


We have added a concept similar to this in drm_gpuvm [1]. GPUVM
maintains a list of evicted BOs, and when the GPUVM is locked for
submission it has a validate vfunc which is called on each BO. If the
driver is using TTM, this is where the driver would call TTM BO
validate, which unevicts the BO. At least this is what we do in Xe [2].

The uneviction is a per-VM operation, not a global one. With this, a
global eviction list does not seem correct (admittedly I haven't gone
through the entire series).
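
A minimal sketch of such a validate vfunc for a TTM-based driver
(illustrative only; the my_* names and the VRAM-preferred placement
are hypothetical, not taken from Xe):

    /* Called by drm_gpuvm_validate() for each BO on the VM's evicted
     * list; ttm_bo_validate() moves the BO back into a valid
     * placement, i.e. unevicts it.
     */
    static int my_vm_bo_validate(struct drm_gpuvm_bo *vm_bo,
                                 struct drm_exec *exec)
    {
            struct ttm_buffer_object *tbo = my_gem_to_ttm(vm_bo->obj);
            struct ttm_operation_ctx ctx = { .interruptible = true };

            /* The driver picks the placement; VRAM here is an assumption. */
            return ttm_bo_validate(tbo, &my_vram_placement, &ctx);
    }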


Yeah, that's exactly what I meant when I wrote that this is controlled 
by the "driver" :)


The state machine in AMDGPU's VM code is pretty much the same.

Regards,
Christian.



Matt

[1] 
https://elixir.bootlin.com/linux/v6.8.7/source/drivers/gpu/drm/drm_gpuvm.c#L86
[2] 
https://elixir.bootlin.com/linux/v6.8.7/source/drivers/gpu/drm/xe/xe_vm.c#L464


Signed-off-by: Friedrich Vock 
---
   drivers/gpu/drm/ttm/ttm_bo.c |  1 +
   include/drm/ttm/ttm_bo.h | 11 +++
   2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index edf10618fe2b2..3968b17453569 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -980,6 +980,7 @@ int ttm_bo_init_reserved(struct ttm_device *bdev, struct ttm_buffer_object *bo,
bo->pin_count = 0;
bo->sg = sg;
bo->bulk_move = NULL;
+   bo->evicted_type = TTM_NUM_MEM_TYPES;
if (resv)
bo->base.resv = resv;
else
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 0223a41a64b24..8a1a29c6fbc50 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -121,6 +121,17 @@ struct ttm_buffer_object {
unsigned priority;
unsigned pin_count;

+   /**
+    * @evicted_type: Memory type this BO was evicted from, if any.
+    * TTM_NUM_MEM_TYPES if this BO was not evicted.
+    */
+   int evicted_type;
+   /**
+    * @evicted: Entry in the evicted list for the resource manager
+    * this BO was evicted from.
+    */
+   struct list_head evicted;
+
/**
 * @delayed_delete: Work item used when we can't delete the BO
 * immediately
--
2.44.0
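
As an aside, a rough sketch of the bookkeeping these two fields enable
(my assumption of the intended use; the actual users are introduced by
later patches in this series):

    /* On eviction: remember the origin and queue on that manager's list. */
    static void mark_bo_evicted(struct ttm_buffer_object *bo,
                                struct list_head *evicted_list)
    {
            bo->evicted_type = bo->resource->mem_type;
            list_add_tail(&bo->evicted, evicted_list);
    }

    /* On uneviction: drop from the list and restore the sentinel. */
    static void clear_bo_evicted(struct ttm_buffer_object *bo)
    {
            list_del_init(&bo->evicted);
            bo->evicted_type = TTM_NUM_MEM_TYPES;
    }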





[PATCH 4/4] drm/amdgpu: Move ras resume into SRIOV function

2024-04-25 Thread Yunxiang Li
This is part of the reset sequence, so move it into the reset function.

Signed-off-by: Yunxiang Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3c4755f3c116..8f2c1f71ed9a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5119,6 +5119,11 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
amdgpu_amdkfd_post_reset(adev);
amdgpu_virt_release_full_gpu(adev, true);
 
+   /* Aldebaran and gfx_11_0_3 support ras in SRIOV, so need resume ras during reset */
+   if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 2) ||
+   amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) ||
+   amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(11, 0, 3))
+   amdgpu_ras_resume(adev);
return 0;
 }
 
@@ -5823,13 +5828,6 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
goto retry;
if (r)
adev->asic_reset_res = r;
-
-   /* Aldebaran and gfx_11_0_3 support ras in SRIOV, so need resume ras during reset */
-   if (amdgpu_ip_version(adev, GC_HWIP, 0) ==
-   IP_VERSION(9, 4, 2) ||
-   amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) ||
-   amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(11, 0, 3))
-   amdgpu_ras_resume(adev);
} else {
r = amdgpu_do_asic_reset(device_list_handle, reset_context);
if (r && r == -EAGAIN)
-- 
2.34.1



[PATCH 3/4] drm/amdgpu: Fix amdgpu_device_reset_sriov retry logic

2024-04-25 Thread Yunxiang Li
The retry loop for SRIOV reset has refcount and memory leak issues.
Depending on which function call fails, it can potentially call
amdgpu_amdkfd_pre/post_reset a different number of times and cause the
kfd_locked count to be wrong. This will block all future attempts at
opening /dev/kfd. The retry loop also leaks resources by calling
amdgpu_virt_init_data_exchange multiple times without calling the
corresponding fini function.

Align with the bare-metal reset path, which doesn't have these issues.
This means taking the amdgpu_amdkfd_pre/post_reset functions out of the
reset loop and calling amdgpu_device_pre_asic_reset on each retry, which
properly frees the resources from the previous try by calling
amdgpu_virt_fini_data_exchange.
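
The resulting flow is roughly (simplified sketch; see the diff below):

    retry_limit = AMDGPU_MAX_RETRY_LIMIT;
    ...
    retry:
            /* runs on every attempt and calls amdgpu_virt_fini_data_exchange,
             * freeing whatever the previous try set up */
            amdgpu_device_pre_asic_reset(tmp_adev, reset_context);
            ...
            r = amdgpu_device_reset_sriov(adev, reset_context);
            if (AMDGPU_RETRY_SRIOV_RESET(r) && (retry_limit--) > 0)
                    goto retry;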

Signed-off-by: Yunxiang Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 50 ++
 1 file changed, 22 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1fd9637daafc..3c4755f3c116 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5063,19 +5063,14 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
 static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 struct amdgpu_reset_context *reset_context)
 {
-   int r;
+   int r = 0;
struct amdgpu_hive_info *hive = NULL;
-   int retry_limit = 0;
-
-retry:
-   amdgpu_amdkfd_pre_reset(adev);
 
if (test_bit(AMDGPU_HOST_FLR, &reset_context->flags))
r = amdgpu_virt_request_full_gpu(adev, true);
else
r = amdgpu_virt_reset_gpu(adev);
-   if (r)
-   return r;
+
amdgpu_ras_set_fed(adev, false);
amdgpu_irq_gpu_reset_resume_helper(adev);
 
@@ -5085,7 +5080,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
/* Resume IP prior to SMC */
r = amdgpu_device_ip_reinit_early_sriov(adev);
if (r)
-   goto error;
+   return r;
 
amdgpu_virt_init_data_exchange(adev);
 
@@ -5096,38 +5091,35 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
/* now we are okay to resume SMC/CP/SDMA */
r = amdgpu_device_ip_reinit_late_sriov(adev);
if (r)
-   goto error;
+   return r;
 
hive = amdgpu_get_xgmi_hive(adev);
/* Update PSP FW topology after reset */
if (hive && adev->gmc.xgmi.num_physical_nodes > 1)
r = amdgpu_xgmi_update_topology(hive, adev);
-
if (hive)
amdgpu_put_xgmi_hive(hive);
+   if (r)
+   return r;
 
-   if (!r) {
-   r = amdgpu_ib_ring_tests(adev);
-
-   amdgpu_amdkfd_post_reset(adev);
-   }
+   r = amdgpu_ib_ring_tests(adev);
+   if (r)
+   return r;
 
-error:
-   if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
+   if (adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
amdgpu_inc_vram_lost(adev);
r = amdgpu_device_recover_vram(adev);
}
-   amdgpu_virt_release_full_gpu(adev, true);
+   if (r)
+   return r;
 
-   if (AMDGPU_RETRY_SRIOV_RESET(r)) {
-   if (retry_limit < AMDGPU_MAX_RETRY_LIMIT) {
-   retry_limit++;
-   goto retry;
-   } else
-   DRM_ERROR("GPU reset retry is beyond the retry limit\n");
-   }
+   /* need to be called during full access so we can't do it later like
+    * bare-metal does.
+    */
+   amdgpu_amdkfd_post_reset(adev);
+   amdgpu_virt_release_full_gpu(adev, true);
 
-   return r;
+   return 0;
 }
 
 /**
@@ -5686,6 +5678,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
int i, r = 0;
bool need_emergency_restart = false;
bool audio_suspended = false;
+   int retry_limit = AMDGPU_MAX_RETRY_LIMIT;
 
/*
 * Special case: RAS triggered and full reset isn't supported
@@ -5767,8 +5760,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
cancel_delayed_work_sync(&tmp_adev->delayed_init_work);
 
-   if (!amdgpu_sriov_vf(tmp_adev))
-   amdgpu_amdkfd_pre_reset(tmp_adev);
+   amdgpu_amdkfd_pre_reset(tmp_adev);
 
/*
 * Mark these ASICs to be reseted as untracked first
@@ -5827,6 +5819,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
/* Host driver will handle XGMI hive reset for SRIOV */
if (amdgpu_sriov_vf(adev)) {
r = amdgpu_device_reset_sriov(adev, reset_context);
+   if (AMDGPU_RETRY_SRIOV_RESET(r) && (retry_limit--) > 0)
+   goto retry;
if (r)
adev->asic_reset_res = r;

[PATCH v3 2/4] drm/amdgpu: Add reset_context flag for host FLR

2024-04-25 Thread Yunxiang Li
There are other reset sources that pass NULL as the job pointer, such as
amdgpu_amdkfd_reset_work. Therefore, using the job pointer to check if
the FLR comes from the host does not work.

Add a flag in reset_context to explicitly mark a host-triggered reset, and
set this flag when we receive the host reset notification.

Signed-off-by: Yunxiang Li 
---
v2: fix typo
v3: pass reset_context directly

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 1 +
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  | 1 +
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c  | 1 +
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c  | 1 +
 5 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8befd10bf007..1fd9637daafc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5055,13 +5055,13 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
  * amdgpu_device_reset_sriov - reset ASIC for SR-IOV vf
  *
  * @adev: amdgpu_device pointer
- * @from_hypervisor: request from hypervisor
+ * @reset_context: amdgpu reset context pointer
  *
  * do VF FLR and reinitialize Asic
  * return 0 means succeeded otherwise failed
  */
 static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
-bool from_hypervisor)
+struct amdgpu_reset_context *reset_context)
 {
int r;
struct amdgpu_hive_info *hive = NULL;
@@ -5070,7 +5070,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 retry:
amdgpu_amdkfd_pre_reset(adev);
 
-   if (from_hypervisor)
+   if (test_bit(AMDGPU_HOST_FLR, &reset_context->flags))
r = amdgpu_virt_request_full_gpu(adev, true);
else
r = amdgpu_virt_reset_gpu(adev);
@@ -5826,7 +5826,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
/* Actual ASIC resets if needed.*/
/* Host driver will handle XGMI hive reset for SRIOV */
if (amdgpu_sriov_vf(adev)) {
-   r = amdgpu_device_reset_sriov(adev, job ? false : true);
+   r = amdgpu_device_reset_sriov(adev, reset_context);
if (r)
adev->asic_reset_res = r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index b11d190ece53..5a9cc043b858 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -33,6 +33,7 @@ enum AMDGPU_RESET_FLAGS {
AMDGPU_NEED_FULL_RESET = 0,
AMDGPU_SKIP_HW_RESET = 1,
AMDGPU_SKIP_COREDUMP = 2,
+   AMDGPU_HOST_FLR = 3,
 };
 
 struct amdgpu_reset_context {
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index c5ba9c4757a8..f4c47492e0cd 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -292,6 +292,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
reset_context.method = AMD_RESET_METHOD_NONE;
reset_context.reset_req_dev = adev;
clear_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
+   set_bit(AMDGPU_HOST_FLR, &reset_context.flags);
 
amdgpu_device_gpu_recover(adev, NULL, &reset_context);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index fa9d1b02f391..14cc7910e5cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -328,6 +328,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
reset_context.method = AMD_RESET_METHOD_NONE;
reset_context.reset_req_dev = adev;
clear_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
+   set_bit(AMDGPU_HOST_FLR, &reset_context.flags);
 
amdgpu_device_gpu_recover(adev, NULL, &reset_context);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
index 14a065516ae4..78cd07744ebe 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
@@ -529,6 +529,7 @@ static void xgpu_vi_mailbox_flr_work(struct work_struct *work)
reset_context.method = AMD_RESET_METHOD_NONE;
reset_context.reset_req_dev = adev;
clear_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
+   set_bit(AMDGPU_HOST_FLR, &reset_context.flags);
 
amdgpu_device_gpu_recover(adev, NULL, &reset_context);
}
-- 
2.34.1



[PATCH v3 1/4] drm/amdgpu: Fix two reset triggered in a row

2024-04-25 Thread Yunxiang Li
Sometimes a hung GPU causes multiple reset sources to schedule resets.
The second source can trigger an unnecessary reset if it schedules
after we call amdgpu_device_stop_pending_resets.

Move amdgpu_device_stop_pending_resets to after the reset is done. Since
at this point the GPU is supposedly in a good state, any reset scheduled
after this point would be a legitimate reset.

Remove the unnecessary and incorrect amdgpu_in_reset checks that were
kinda serving this purpose.
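
The resulting ordering is roughly (sketch):

    r = amdgpu_device_reset_sriov(adev, reset_context);  /* or bare-metal reset */
    ...
    /* only after the hardware reset succeeded: */
    list_for_each_entry(tmp_adev, device_list_handle, reset_list)
            amdgpu_device_stop_pending_resets(tmp_adev);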

Signed-off-by: Yunxiang Li 
---
v2: instead of adding amdgpu_in_reset check, move when we cancel pending
resets
v3: no changes from v2, collect all the patches in one series for easier review

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c  |  2 +-
 5 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 861ccff78af9..8befd10bf007 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5070,8 +5070,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 retry:
amdgpu_amdkfd_pre_reset(adev);
 
-   amdgpu_device_stop_pending_resets(adev);
-
if (from_hypervisor)
r = amdgpu_virt_request_full_gpu(adev, true);
else
@@ -5823,13 +5821,6 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
  r, adev_to_drm(tmp_adev)->unique);
tmp_adev->asic_reset_res = r;
}
-
-   if (!amdgpu_sriov_vf(tmp_adev))
-   /*
-   * Drop all pending non scheduler resets. Scheduler resets
-   * were already dropped during drm_sched_stop
-   */
-   amdgpu_device_stop_pending_resets(tmp_adev);
}
 
/* Actual ASIC resets if needed.*/
@@ -5851,6 +5842,16 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
goto retry;
}
 
+   list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
+   /*
+    * Drop any pending non scheduler resets queued before reset is done.
+    * Any reset scheduled after this point would be valid. Scheduler resets
+    * were already dropped during drm_sched_stop and no new ones can come
+    * in before drm_sched_start.
+    */
+   amdgpu_device_stop_pending_resets(tmp_adev);
+   }
+
 skip_hw_reset:
 
/* Post ASIC reset for all devs .*/
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 54ab51a4ada7..c2385178d6b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -597,7 +597,7 @@ static void amdgpu_virt_update_vf2pf_work_item(struct work_struct *work)
if (ret) {
adev->virt.vf2pf_update_retry_cnt++;
if ((adev->virt.vf2pf_update_retry_cnt >= AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT) &&
-   amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev)) {
+   amdgpu_sriov_runtime(adev)) {
amdgpu_ras_set_fed(adev, true);
if (amdgpu_reset_domain_schedule(adev->reset_domain,
  &adev->virt.flr_work))
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 0c7275bca8f7..c5ba9c4757a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -319,7 +319,7 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev,
 
switch (event) {
case IDH_FLR_NOTIFICATION:
-   if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
+   if (amdgpu_sriov_runtime(adev))
WARN_ONCE(!amdgpu_reset_domain_schedule(adev->reset_domain,
&adev->virt.flr_work),
  "Failed to queue work! at %s",
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index aba00d961627..fa9d1b02f391 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -358,7 +358,7 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev,
 
switch (event) {
case IDH_FLR_NOTIFICATION:
-   if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
+   if (amdgpu_sriov_runtime(adev))
WARN_ONCE(!amdgpu_reset_domain_schedule(adev->reset_domain,

Re: [PATCH] drm/amdgpu: Fix out-of-bounds write warning

2024-04-25 Thread Ma, Jun



On 4/25/2024 8:39 PM, Christian König wrote:
> 
> 
> On 25.04.24 12:00, Ma Jun wrote:
>> Check the ring type value to fix the out-of-bounds
>> write warning
>>
>> Signed-off-by: Ma Jun 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 5 +
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> index 06f0a6534a94..1e0b5bb47bc9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> @@ -353,6 +353,11 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>  ring->hw_prio = hw_prio;
>>   
>>  if (!ring->no_scheduler) {
>> +if (ring->funcs->type >= AMDGPU_HW_IP_NUM) {
>> +dev_warn(adev->dev, "ring type %d has no scheduler\n", 
>> ring->funcs->type);
>> +return 0;
>> +}
>> +
> 
> That check should probably be at the beginning of the function since 
> trying to initialize a ring with an invalid type should be rejected 
> immediately.
> 
This check is used to skip the gpu_sched setup for the rings
which don't have a scheduler, such as KIQ, MES, and UMSCH_MM.
Without this check, there could be a potential out-of-bounds write
when ring->no_scheduler is not set correctly.

Regards,
Ma Jun

> Regards,
> Christian.
> 
>>  hw_ip = ring->funcs->type;
>>  num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
>>  adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] =
> 


RE: [PATCH] drm/amdgpu: add ACA error query support for umc_v12_0

2024-04-25 Thread Zhou1, Tao

> -Original Message-
> From: Wang, Yang(Kevin) 
> Sent: Wednesday, April 17, 2024 11:10 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Zhou1, Tao
> ; Chai, Thomas 
> Subject: [PATCH] drm/amdgpu: add ACA error query support for umc_v12_0
>
> add ACA error query support for umc_v12_0.
>
> Signed-off-by: Yang Wang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c |  6 +++---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  4 
> drivers/gpu/drm/amd/amdgpu/umc_v12_0.c  | 18 ++
>  3 files changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 352ce16a0963..46b7f0c5cd8a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1268,9 +1268,9 @@ int amdgpu_ras_unbind_aca(struct amdgpu_device *adev, enum amdgpu_ras_block blk)
>   return 0;
>  }
>
> -static int amdgpu_aca_log_ras_error_data(struct amdgpu_device *adev, enum amdgpu_ras_block blk,
> -  enum aca_error_type type, struct ras_err_data *err_data,
> -  struct ras_query_context *qctx)
> +int amdgpu_aca_log_ras_error_data(struct amdgpu_device *adev, enum amdgpu_ras_block blk,
> +   enum aca_error_type type, struct ras_err_data *err_data,
> +   struct ras_query_context *qctx)
>  {
>   struct ras_manager *obj;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> index 8d26989c75c8..487548879c49 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> @@ -898,6 +898,10 @@ int amdgpu_ras_unbind_aca(struct amdgpu_device
> *adev, enum amdgpu_ras_block blk)  ssize_t amdgpu_ras_aca_sysfs_read(struct
> device *dev, struct device_attribute *attr,
> struct aca_handle *handle, char *buf, void
> *data);
>
> +int amdgpu_aca_log_ras_error_data(struct amdgpu_device *adev, enum amdgpu_ras_block blk,
> +   enum aca_error_type type, struct ras_err_data *err_data,
> +   struct ras_query_context *qctx);

[Tao] is it used in this patch?

> +
>  void amdgpu_ras_add_mca_err_addr(struct ras_err_info *err_info,
>   struct ras_err_addr *err_addr);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
> b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
> index f69871902233..9f2c46814a4f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c
> @@ -317,16 +317,26 @@ static int umc_v12_0_err_cnt_init_per_channel(struct amdgpu_device *adev,
>  static void umc_v12_0_ecc_info_query_ras_error_count(struct amdgpu_device *adev,
>   void *ras_error_status)
>  {
> + struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status;
>   struct ras_query_context qctx;
>
>   memset(&qctx, 0, sizeof(qctx));
>   qctx.event_id = amdgpu_ras_acquire_event_id(adev, amdgpu_ras_intr_triggered() ?
>   RAS_EVENT_TYPE_ISR : RAS_EVENT_TYPE_INVALID);
>
> - amdgpu_mca_smu_log_ras_error(adev,
> - AMDGPU_RAS_BLOCK__UMC, AMDGPU_MCA_ERROR_TYPE_CE, ras_error_status, &qctx);
> - amdgpu_mca_smu_log_ras_error(adev,
> - AMDGPU_RAS_BLOCK__UMC, AMDGPU_MCA_ERROR_TYPE_UE, ras_error_status, &qctx);
> + if (amdgpu_aca_is_enabled(adev)) {
> + amdgpu_aca_get_error_data(adev, AMDGPU_RAS_BLOCK__UMC, ACA_ERROR_TYPE_CE,
> +   err_data, &qctx);
> + amdgpu_aca_get_error_data(adev, AMDGPU_RAS_BLOCK__UMC, ACA_ERROR_TYPE_UE,
> +   err_data, &qctx);
> + amdgpu_aca_get_error_data(adev, AMDGPU_RAS_BLOCK__UMC, ACA_ERROR_TYPE_DEFERRED,
> +   err_data, &qctx);
> + } else {
> + amdgpu_mca_smu_log_ras_error(adev, AMDGPU_RAS_BLOCK__UMC, AMDGPU_MCA_ERROR_TYPE_CE,
> +  err_data, &qctx);
> + amdgpu_mca_smu_log_ras_error(adev, AMDGPU_RAS_BLOCK__UMC, AMDGPU_MCA_ERROR_TYPE_UE,
> +  err_data, &qctx);
> + }
>  }
>
>  static void umc_v12_0_ecc_info_query_ras_error_address(struct amdgpu_device *adev,
> --
> 2.34.1



Re: [PATCH v2 2/2] drm/amdgpu: Fix the uninitialized variable warning

2024-04-25 Thread Ma, Jun



On 4/25/2024 6:10 PM, Lazar, Lijo wrote:
> 
> 
> On 4/25/2024 3:30 PM, Ma Jun wrote:
>> Initialize the phy_id to 0 to fix the warning of
>> "Using uninitialized value phy_id"
>>
>> Signed-off-by: Ma Jun 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 6 +-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
>> index 8ed0e073656f..53d85fafd8ab 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
>> @@ -95,7 +95,7 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct file *f, const char __u
>>  struct psp_context *psp = &adev->psp;
>>  struct ta_securedisplay_cmd *securedisplay_cmd;
>>  struct drm_device *dev = adev_to_drm(adev);
>> -uint32_t phy_id;
>> +uint32_t phy_id = 0;
>>  uint32_t op;
>>  char str[64];
>>  int ret;
>> @@ -135,6 +135,10 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct file *f, const char __u
>>  mutex_unlock(&psp->securedisplay_context.mutex);
>>  break;
>>  case 2:
>> +if (size < 3) {
>> +dev_err(adev->dev, "Invalid input: %s\n", str);
>> +return -EINVAL;
>> +}
> 
> Better to check the return value of sscanf to see if phy_id was
> successfully scanned. Otherwise, return an error.
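
Something along these lines, I assume (sketch only; the exact format
string may differ):

    if (sscanf(str, "%u %u", &op, &phy_id) != 2)
            return -EINVAL;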

Thanks, will fix in v3

Regards,
Ma Jun
> 
> Thanks,
> Lijo
> 
>>  mutex_lock(&psp->securedisplay_context.mutex);
>>  psp_prep_securedisplay_cmd_buf(psp, &securedisplay_cmd,
>>  TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);


RE: [PATCH] drm/amdgpu: skip to create ras xxx_err_count node when ACA is enabled

2024-04-25 Thread Zhou1, Tao

Reviewed-by: Tao Zhou 

[Tao] it's better to add a comment to explain how to get the error count
when ACA is enabled.

BTW, according to the change, do we need to update the RAS tool?
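
For example, something like this (the comment wording is just a suggestion):

    /* With ACA enabled, error counts are reported through the ACA sysfs
     * entries bound via amdgpu_ras_bind_aca(), so skip creating the
     * legacy xxx_err_count node here.
     */
    if (amdgpu_aca_is_enabled(adev))
            return 0;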

> -Original Message-
> From: Wang, Yang(Kevin) 
> Sent: Wednesday, April 24, 2024 10:50 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Zhou1, Tao
> 
> Subject: [PATCH] drm/amdgpu: skip to create ras xxx_err_count node when ACA
> is enabled
>
> skip to create 'xxx_err_count' node when ACA is enabled.
>
> Signed-off-by: Yang Wang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 1e2b866751c3..96a8359b703b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1756,6 +1756,9 @@ int amdgpu_ras_sysfs_create(struct amdgpu_device *adev,
>   if (!obj || obj->attr_inuse)
>   return -EINVAL;
>
> + if (amdgpu_aca_is_enabled(adev))
> + return 0;
> +
>   get_obj(obj);
>
>   snprintf(obj->fs_data.sysfs_name, sizeof(obj->fs_data.sysfs_name),
> @@ -1790,6 +1793,9 @@ int amdgpu_ras_sysfs_remove(struct amdgpu_device *adev,
>   if (!obj || !obj->attr_inuse)
>   return -EINVAL;
>
> + if (amdgpu_aca_is_enabled(adev))
> + return 0;
> +
>   if (adev->dev->kobj.sd)
>   sysfs_remove_file_from_group(&adev->dev->kobj,
>   &obj->sysfs_attr.attr,
> --
> 2.34.1



[PATCH] drm/amdgpu: fix uninitialized scalar variable warning

2024-04-25 Thread Tim Huang
Clear the warning that the field bp is uninitialized when
calling amdgpu_virt_ras_add_bps.

Signed-off-by: Tim Huang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 54ab51a4ada7..a2f15edfe812 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -395,6 +395,8 @@ static void amdgpu_virt_add_bad_page(struct amdgpu_device *adev,
else
vram_usage_va = adev->mman.drv_vram_usage_va;
 
+   memset(&bp, 0, sizeof(struct eeprom_table_record));
+
if (bp_block_size) {
bp_cnt = bp_block_size / sizeof(uint64_t);
for (bp_idx = 0; bp_idx < bp_cnt; bp_idx++) {
-- 
2.39.2



Re: [PATCH] drm/amdgpu: fix overflowed array index read warning

2024-04-25 Thread Alex Deucher
On Thu, Apr 25, 2024 at 8:37 PM Tim Huang  wrote:
>
> Clear overflowed array index read warning by cast operation.
>
> Signed-off-by: Tim Huang 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 06f0a6534a94..15c240656470 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -473,8 +473,9 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf,
> size_t size, loff_t *pos)
>  {
> struct amdgpu_ring *ring = file_inode(f)->i_private;
> -   int r, i;
> uint32_t value, result, early[3];
> +   loff_t i;
> +   int r;
>
> if (*pos & 3 || size & 3)
> return -EINVAL;
> --
> 2.39.2
>


[PATCH] drm/amdgpu: fix overflowed array index read warning

2024-04-25 Thread Tim Huang
Clear overflowed array index read warning by cast operation.

Signed-off-by: Tim Huang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 06f0a6534a94..15c240656470 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -473,8 +473,9 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf,
size_t size, loff_t *pos)
 {
struct amdgpu_ring *ring = file_inode(f)->i_private;
-   int r, i;
uint32_t value, result, early[3];
+   loff_t i;
+   int r;
 
if (*pos & 3 || size & 3)
return -EINVAL;
-- 
2.39.2



RE: [PATCH v2] drm/amdgpu: fix overflowed array index read warning

2024-04-25 Thread Huang, Tim
[AMD Official Use Only - General]

-Original Message-
From: Koenig, Christian 
Sent: Thursday, April 25, 2024 9:31 PM
To: Alex Deucher ; Huang, Tim 
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 

Subject: Re: [PATCH v2] drm/amdgpu: fix overflowed array index read warning

On 25.04.24 15:28, Alex Deucher wrote:
> On Thu, Apr 25, 2024 at 3:22 AM Tim Huang  wrote:
>> From: Tim Huang 
>>
>> Clear warning that cast operation might have overflowed.
>>
>> v2: keep reverse xmas tree order to declare "int r;" (Christian)
>>
>> Signed-off-by: Tim Huang 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> index 06f0a6534a94..8cf60acb2970 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>> @@ -473,8 +473,8 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf,
>>  size_t size, loff_t *pos)
>>   {
>>  struct amdgpu_ring *ring = file_inode(f)->i_private;
>> -   int r, i;
>>  uint32_t value, result, early[3];
>> +   int r;
>>
>>  if (*pos & 3 || size & 3)
>>  return -EINVAL;
>> @@ -485,7 +485,7 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf,
>>  early[0] = amdgpu_ring_get_rptr(ring) & ring->buf_mask;
>>  early[1] = amdgpu_ring_get_wptr(ring) & ring->buf_mask;
>>  early[2] = ring->wptr & ring->buf_mask;
>> -   for (i = *pos / 4; i < 3 && size; i++) {
>> +   for (loff_t i = *pos / 4; i < 3 && size; i++) {
> Some older compilers complain about declarations mixed with code like
> this.  Not sure how big a deal that would be.

>Good point, we would like to be able to backport this.

>Somebody from Alvin's team needs to comment, but IIRC we agreed that this
>would be legal and that we take care of it by using appropriate compiler
>flags on older kernels.

>Christian.

Thanks for pointing out. Will avoid doing this.

>
> Alex
>
>>  r = put_user(early[i], (uint32_t *)buf);
>>  if (r)
>>  return r;
>> --
>> 2.39.2
>>



Re: [PATCH v3] drm/amdgpu: IB test encode test package change for VCN5

2024-04-25 Thread Jiang, Sonny

In testing, I didn't find any errors on VCN1 through VCN4.

Thanks,
Sonny


From: Jiang, Sonny 
Sent: Thursday, April 25, 2024 4:10 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Jiang, Sonny ; Jiang, Sonny 
Subject: [PATCH v3] drm/amdgpu: IB test encode test package change for VCN5

From: Sonny Jiang 

The VCN5 session info package interface changed.

Signed-off-by: Sonny Jiang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 677eb141554e..b89605b400c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -885,7 +885,7 @@ static int amdgpu_vcn_enc_get_create_msg(struct amdgpu_ring *ring, uint32_t hand
 ib->ptr[ib->length_dw++] = handle;
 ib->ptr[ib->length_dw++] = upper_32_bits(addr);
 ib->ptr[ib->length_dw++] = addr;
-   ib->ptr[ib->length_dw++] = 0x000b;
+   ib->ptr[ib->length_dw++] = 0x;

 ib->ptr[ib->length_dw++] = 0x0014;
 ib->ptr[ib->length_dw++] = 0x0002; /* task info */
@@ -952,7 +952,7 @@ static int amdgpu_vcn_enc_get_destroy_msg(struct amdgpu_ring *ring, uint32_t han
 ib->ptr[ib->length_dw++] = handle;
 ib->ptr[ib->length_dw++] = upper_32_bits(addr);
 ib->ptr[ib->length_dw++] = addr;
-   ib->ptr[ib->length_dw++] = 0x000b;
+   ib->ptr[ib->length_dw++] = 0x;

 ib->ptr[ib->length_dw++] = 0x0014;
 ib->ptr[ib->length_dw++] = 0x0002;
--
2.43.2



[PATCH v3] drm/amdgpu: IB test encode test package change for VCN5

2024-04-25 Thread Sonny Jiang
From: Sonny Jiang 

The VCN5 session info package interface changed.

Signed-off-by: Sonny Jiang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 677eb141554e..b89605b400c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -885,7 +885,7 @@ static int amdgpu_vcn_enc_get_create_msg(struct amdgpu_ring *ring, uint32_t hand
ib->ptr[ib->length_dw++] = handle;
ib->ptr[ib->length_dw++] = upper_32_bits(addr);
ib->ptr[ib->length_dw++] = addr;
-   ib->ptr[ib->length_dw++] = 0x000b;
+   ib->ptr[ib->length_dw++] = 0x;
 
ib->ptr[ib->length_dw++] = 0x0014;
ib->ptr[ib->length_dw++] = 0x0002; /* task info */
@@ -952,7 +952,7 @@ static int amdgpu_vcn_enc_get_destroy_msg(struct amdgpu_ring *ring, uint32_t han
ib->ptr[ib->length_dw++] = handle;
ib->ptr[ib->length_dw++] = upper_32_bits(addr);
ib->ptr[ib->length_dw++] = addr;
-   ib->ptr[ib->length_dw++] = 0x000b;
+   ib->ptr[ib->length_dw++] = 0x;
 
ib->ptr[ib->length_dw++] = 0x0014;
ib->ptr[ib->length_dw++] = 0x0002;
-- 
2.43.2



[PATCH 1/5] drm/amdgpu: Add gfx v12 pte/pde format change

2024-04-25 Thread Alex Deucher
From: Hawking Zhang 

Add gfx v12 pte/pde format change.
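
As a usage illustration (not part of this patch; patch 3 in this series
programs the GART flags this way):

    uint64_t pte_flags = AMDGPU_PTE_MTYPE_GFX12(MTYPE_UC) |
                         AMDGPU_PTE_EXECUTABLE |
                         AMDGPU_PTE_IS_PTE;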

Signed-off-by: Hawking Zhang 
Reviewed-by: Likun Gao 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 54d7da396de0..e0e7e944a323 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -111,6 +111,19 @@ struct amdgpu_mem_stats;
 #define AMDGPU_PTE_MTYPE_NV10(a)   ((uint64_t)(a) << 48)
 #define AMDGPU_PTE_MTYPE_NV10_MASK AMDGPU_PTE_MTYPE_NV10(7ULL)
 
+/* gfx12 */
+#define AMDGPU_PTE_PRT_GFX12   (1ULL << 56)
+
+#define AMDGPU_PTE_MTYPE_GFX12(a)  ((uint64_t)(a) << 54)
+#define AMDGPU_PTE_MTYPE_GFX12_MASKAMDGPU_PTE_MTYPE_GFX12(3ULL)
+
+#define AMDGPU_PTE_IS_PTE  (1ULL << 63)
+
+/* PDE Block Fragment Size for gfx v12 */
+#define AMDGPU_PDE_BFS_GFX12(a)((uint64_t)((a) & 0x1fULL) << 58)
+/* PDE is handled as PTE for gfx v12 */
+#define AMDGPU_PDE_PTE_GFX12   (1ULL << 63)
+
 /* How to program VM fault handling */
 #define AMDGPU_VM_FAULT_STOP_NEVER 0
 #define AMDGPU_VM_FAULT_STOP_FIRST 1
-- 
2.44.0



[PATCH 4/5] drm/amdgpu: support gfx v12 specific pte/pde fields

2024-04-25 Thread Alex Deucher
From: Hawking Zhang 

Add gfx v12 pte/pde support to gmc common helper.

v2: squash in fixes (Alex)
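
The adev-aware wrappers presumably dispatch on the GC IP version, along
these lines (sketch of the assumed shape; the real definitions live in
the amdgpu_vm.h hunk of this patch, not shown in full here):

    #define AMDGPU_PTE_PRT_FLAG(adev)                                          \
            ((amdgpu_ip_version((adev), GC_HWIP, 0) >= IP_VERSION(12, 0, 0)) ? \
             AMDGPU_PTE_PRT_GFX12 : AMDGPU_PTE_PRT)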

Signed-off-by: Hawking Zhang 
Reviewed-by: Likun Gao 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 12 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c |  6 +++---
 5 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 6bbab141eaae..8fe825479194 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -683,7 +683,7 @@ uint64_t amdgpu_gem_va_map_flags(struct amdgpu_device *adev, uint32_t flags)
if (flags & AMDGPU_VM_PAGE_WRITEABLE)
pte_flag |= AMDGPU_PTE_WRITEABLE;
if (flags & AMDGPU_VM_PAGE_PRT)
-   pte_flag |= AMDGPU_PTE_PRT;
+   pte_flag |= AMDGPU_PTE_PRT_FLAG(adev);
if (flags & AMDGPU_VM_PAGE_NOALLOC)
pte_flag |= AMDGPU_PTE_NOALLOC;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index be4629cdac04..9fcf194fea33 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1015,7 +1015,7 @@ void amdgpu_gmc_init_pdb0(struct amdgpu_device *adev)
flags |= AMDGPU_PTE_WRITEABLE;
flags |= AMDGPU_PTE_SNOOPED;
flags |= AMDGPU_PTE_FRAG((adev->gmc.vmid0_page_table_block_size + 9*1));
-   flags |= AMDGPU_PDE_PTE;
+   flags |= AMDGPU_PDE_PTE_FLAG(adev);
 
/* The first n PDE0 entries are used as PTE,
 * pointing to vram
@@ -1028,7 +1028,7 @@ void amdgpu_gmc_init_pdb0(struct amdgpu_device *adev)
 * pointing to a 4K system page
 */
flags = AMDGPU_PTE_VALID;
-   flags |= AMDGPU_PDE_BFS(0) | AMDGPU_PTE_SNOOPED;
+   flags |= AMDGPU_PTE_SNOOPED | AMDGPU_PDE_BFS_FLAG(adev, 0);
/* Requires gart_ptb_gpu_pa to be 4K aligned */
amdgpu_gmc_set_pte_pde(adev, adev->gmc.ptr_pdb0, i, gart_ptb_gpu_pa, flags);
drm_dev_exit(idx);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4e2391c83d7c..991e4d69c6a2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1055,7 +1055,7 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,
params.pages_addr = NULL;
}
 
-   } else if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) {
+   } else if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT_FLAG(adev))) {
addr = vram_base + cursor.start;
} else {
addr = 0;
@@ -1369,7 +1369,7 @@ static void amdgpu_vm_free_mapping(struct amdgpu_device *adev,
   struct amdgpu_bo_va_mapping *mapping,
   struct dma_fence *fence)
 {
-   if (mapping->flags & AMDGPU_PTE_PRT)
+   if (mapping->flags & AMDGPU_PTE_PRT_FLAG(adev))
amdgpu_vm_add_prt_cb(adev, fence);
kfree(mapping);
 }
@@ -1637,7 +1637,7 @@ static void amdgpu_vm_bo_insert_map(struct amdgpu_device *adev,
list_add(&mapping->list, &bo_va->invalids);
amdgpu_vm_it_insert(mapping, &vm->va);
 
-   if (mapping->flags & AMDGPU_PTE_PRT)
+   if (mapping->flags & AMDGPU_PTE_PRT_FLAG(adev))
amdgpu_vm_prt_get(adev);
 
if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
@@ -1939,7 +1939,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
struct amdgpu_bo *bo = before->bo_va->base.bo;
 
amdgpu_vm_it_insert(before, &vm->va);
-   if (before->flags & AMDGPU_PTE_PRT)
+   if (before->flags & AMDGPU_PTE_PRT_FLAG(adev))
amdgpu_vm_prt_get(adev);
 
if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
@@ -1954,7 +1954,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
struct amdgpu_bo *bo = after->bo_va->base.bo;
 
amdgpu_vm_it_insert(after, &vm->va);
-   if (after->flags & AMDGPU_PTE_PRT)
+   if (after->flags & AMDGPU_PTE_PRT_FLAG(adev))
amdgpu_vm_prt_get(adev);
 
if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
@@ -2605,7 +2605,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
dma_fence_put(vm->last_tlb_flush);
 
list_for_each_entry_safe(mapping, tmp, &vm->freed, list) {
-   if (mapping->flags & AMDGPU_PTE_PRT && prt_fini_needed) {
+   if (mapping->flags & AMDGPU_PTE_PRT_FLAG(adev) && prt_fini_needed) {

[PATCH 5/5] drm/amdgpu/discovery: Add gmc v12_0 ip block

2024-04-25 Thread Alex Deucher
From: Likun Gao 

Add gmc v12_0 ip block.

v2: Squash in updates (Alex)

Signed-off-by: Likun Gao 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 79b43e4bf7c8..98d6915e955e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -55,6 +55,7 @@
 #include "smuio_v9_0.h"
 #include "gmc_v10_0.h"
 #include "gmc_v11_0.h"
+#include "gmc_v12_0.h"
 #include "gfxhub_v2_0.h"
 #include "mmhub_v2_0.h"
 #include "nbio_v2_3.h"
@@ -1753,6 +1754,10 @@ static int amdgpu_discovery_set_gmc_ip_blocks(struct amdgpu_device *adev)
case IP_VERSION(11, 5, 1):
amdgpu_device_ip_block_add(adev, &gmc_v11_0_ip_block);
break;
+   case IP_VERSION(12, 0, 0):
+   case IP_VERSION(12, 0, 1):
+   amdgpu_device_ip_block_add(adev, &gmc_v12_0_ip_block);
+   break;
default:
dev_err(adev->dev, "Failed to add gmc ip block(GC_HWIP:0x%x)\n",
amdgpu_ip_version(adev, GC_HWIP, 0));
-- 
2.44.0



[PATCH 2/5] drm/amdgpu: Add gmc v12_0 ip block support (v7)

2024-04-25 Thread Alex Deucher
From: Hawking Zhang 

Add initial support for GMC v12.

v1: Add gmc v12_0 ip block support.
v2: Switch to gfx.kiq array.
v3: Switch to vmhubs_mask.
v4: Switch to AMDGPU_MMHUB0(0) and AMDGPU_GFXHUB(0)
v5: Rebase (Alex)
v6: Squash in fixes for AGP handling, gfxhub init order,
vmhub index (Alex)
v7: Rebase (Alex)
v8: squash in ecc fix (Alex)

Signed-off-by: Hawking Zhang 
Reviewed-by: Likun Gao 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/Makefile|2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 1000 
 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.h |   30 +
 3 files changed, 1031 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 5c0e7b512e25..9a793f4d8fcf 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -116,7 +116,7 @@ amdgpu-y += \
gfxhub_v2_0.o mmhub_v2_0.o gmc_v10_0.o gfxhub_v2_1.o mmhub_v2_3.o \
mmhub_v1_7.o gfxhub_v3_0.o mmhub_v3_0.o mmhub_v3_0_2.o gmc_v11_0.o \
mmhub_v3_0_1.o gfxhub_v3_0_3.o gfxhub_v1_2.o mmhub_v1_8.o mmhub_v3_3.o \
-   gfxhub_v11_5_0.o mmhub_v4_1_0.o gfxhub_v12_0.o
+   gfxhub_v11_5_0.o mmhub_v4_1_0.o gfxhub_v12_0.o gmc_v12_0.o
 
 # add UMC block
 amdgpu-y += \
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c
new file mode 100644
index ..c85ebc8360e1
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c
@@ -0,0 +1,1000 @@
+/*
+ * Copyright 2023 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+#include 
+#include 
+
+#include 
+
+#include "amdgpu.h"
+#include "amdgpu_atomfirmware.h"
+#include "gmc_v12_0.h"
+#include "athub/athub_4_1_0_sh_mask.h"
+#include "athub/athub_4_1_0_offset.h"
+#include "oss/osssys_7_0_0_offset.h"
+#include "ivsrcid/vmc/irqsrcs_vmc_1_0.h"
+#include "soc24_enum.h"
+#include "soc24.h"
+#include "soc15d.h"
+#include "soc15_common.h"
+#include "nbif_v6_3_1.h"
+#include "gfxhub_v12_0.h"
+#include "mmhub_v4_1_0.h"
+#include "athub_v4_1_0.h"
+
+
+static int gmc_v12_0_ecc_interrupt_state(struct amdgpu_device *adev,
+struct amdgpu_irq_src *src,
+unsigned type,
+enum amdgpu_interrupt_state state)
+{
+   return 0;
+}
+
+static int gmc_v12_0_vm_fault_interrupt_state(struct amdgpu_device *adev,
+ struct amdgpu_irq_src *src, unsigned type,
+ enum amdgpu_interrupt_state state)
+{
+   switch (state) {
+   case AMDGPU_IRQ_STATE_DISABLE:
+   /* MM HUB */
+   amdgpu_gmc_set_vm_fault_masks(adev, AMDGPU_MMHUB0(0), false);
+   /* GFX HUB */
+   /* This works because this interrupt is only
+* enabled at init/resume and disabled in
+* fini/suspend, so the overall state doesn't
+* change over the course of suspend/resume.
+*/
+   if (!adev->in_s0ix)
+   amdgpu_gmc_set_vm_fault_masks(adev, AMDGPU_GFXHUB(0), false);
+   break;
+   case AMDGPU_IRQ_STATE_ENABLE:
+   /* MM HUB */
+   amdgpu_gmc_set_vm_fault_masks(adev, AMDGPU_MMHUB0(0), true);
+   /* GFX HUB */
+   /* This works because this interrupt is only
+* enabled at init/resume and disabled in
+* fini/suspend, so the overall state doesn't
+* change over the course of suspend/resume.
+*/
+   if (!adev->in_s0ix)
+   amdgpu_gmc_set_vm_fault_masks(adev, AMDGPU_GFXHUB(0), true);
+  

[PATCH 3/5] drm/amdgpu: Set pte_is_pte flag in gmc v12 gart

2024-04-25 Thread Alex Deucher
From: Hawking Zhang 

pte_is_pte is a new flag introduced in GMC v12 that
needs to be set by default for PTEs.

Signed-off-by: Hawking Zhang 
Reviewed-by: Likun Gao 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c
index c85ebc8360e1..c24f5bd3e09c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c
@@ -686,7 +686,8 @@ static int gmc_v12_0_gart_init(struct amdgpu_device *adev)
 
adev->gart.table_size = adev->gart.num_gpu_pages * 8;
adev->gart.gart_pte_flags = AMDGPU_PTE_MTYPE_GFX12(MTYPE_UC) |
-AMDGPU_PTE_EXECUTABLE;
+   AMDGPU_PTE_EXECUTABLE |
+   AMDGPU_PTE_IS_PTE;
 
return amdgpu_gart_table_vram_alloc(adev);
 }
-- 
2.44.0



[PATCH] drm/amdgpu: Add gfxhub v12_0 ip block support (v3)

2024-04-25 Thread Alex Deucher
From: Likun Gao 

Add initial gfxhub v12 support.

v1: Add gfxhub v12_0 ip block support (Likun)
v2: Switch to AMDGPU_GFXHUB(0) (Hawking)
v3: Squash in keep default error response mode (Hawking)

Signed-off-by: Likun Gao 
Signed-off-by: Hawking Zhang 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
 drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.c | 501 ++
 drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.h |  29 ++
 3 files changed, 531 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 52e21ea7252d..5c0e7b512e25 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -116,7 +116,7 @@ amdgpu-y += \
gfxhub_v2_0.o mmhub_v2_0.o gmc_v10_0.o gfxhub_v2_1.o mmhub_v2_3.o \
mmhub_v1_7.o gfxhub_v3_0.o mmhub_v3_0.o mmhub_v3_0_2.o gmc_v11_0.o \
mmhub_v3_0_1.o gfxhub_v3_0_3.o gfxhub_v1_2.o mmhub_v1_8.o mmhub_v3_3.o \
-   gfxhub_v11_5_0.o mmhub_v4_1_0.o
+   gfxhub_v11_5_0.o mmhub_v4_1_0.o gfxhub_v12_0.o
 
 # add UMC block
 amdgpu-y += \
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.c
new file mode 100644
index ..7ea64f1e1e48
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.c
@@ -0,0 +1,501 @@
+/*
+ * Copyright 2023 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu.h"
+#include "gfxhub_v12_0.h"
+
+#include "gc/gc_12_0_0_offset.h"
+#include "gc/gc_12_0_0_sh_mask.h"
+#include "soc24_enum.h"
+#include "soc15_common.h"
+
+#define regGCVM_L2_CNTL3_DEFAULT   0x80120007
+#define regGCVM_L2_CNTL4_DEFAULT   0x00c1
+#define regGCVM_L2_CNTL5_DEFAULT   0x3fe0
+#define regGRBM_GFX_INDEX_DEFAULT  0xe000
+
+static const char *gfxhub_client_ids[] = {
+   /* TODO */
+};
+
+static uint32_t gfxhub_v12_0_get_invalidate_req(unsigned int vmid,
+   uint32_t flush_type)
+{
+   u32 req = 0;
+
+   /* invalidate using legacy mode on vmid*/
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ,
+   PER_VMID_INVALIDATE_REQ, 1 << vmid);
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ, FLUSH_TYPE, 
flush_type);
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PTES, 
1);
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PDE0, 
1);
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PDE1, 
1);
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PDE2, 
1);
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ, INVALIDATE_L1_PTES, 
1);
+   req = REG_SET_FIELD(req, GCVM_INVALIDATE_ENG0_REQ,
+   CLEAR_PROTECTION_FAULT_STATUS_ADDR, 0);
+
+   return req;
+}
+
+static void
+gfxhub_v12_0_print_l2_protection_fault_status(struct amdgpu_device *adev,
+ uint32_t status)
+{
+   u32 cid = REG_GET_FIELD(status,
+   GCVM_L2_PROTECTION_FAULT_STATUS_LO32, CID);
+
+   dev_err(adev->dev,
+   "GCVM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",
+   status);
+   dev_err(adev->dev, "\t Faulty UTCL2 client ID: %s (0x%x)\n",
+   cid >= ARRAY_SIZE(gfxhub_client_ids) ? "unknown" : gfxhub_client_ids[cid],
+   cid);
+   dev_err(adev->dev, "\t MORE_FAULTS: 0x%lx\n",
+   REG_GET_FIELD(status,
+   GCVM_L2_PROTECTION_FAULT_STATUS_LO32, MORE_FAULTS));
+   dev_err(adev->dev, "\t WALKER_ERROR: 0x%lx\n",
+   REG_GET_FIELD(status,
+   GCVM_L2_PROTECTION_FAULT_STATUS_LO32, WALKER_ERRO

[PATCH 2/2] drm/amdgpu: Add mmhub v4_1_0 ip block support (v4)

2024-04-25 Thread Alex Deucher
From: Hawking Zhang 

Add initial support for MMHUB 4.1.0.

v1: Add mmhub v4_1_0 ip block support.
v2: Switch to AMDGPU_MMHUB0(0).
v3: squash in fix for ip version check (Alex)
v4: squash in vm_contexts_disable fix (Alex)

Signed-off-by: Hawking Zhang 
Reviewed-by: Likun Gao 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c | 654 ++
 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.h |  28 +
 3 files changed, 683 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 240c86f54ce8..52e21ea7252d 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -116,7 +116,7 @@ amdgpu-y += \
gfxhub_v2_0.o mmhub_v2_0.o gmc_v10_0.o gfxhub_v2_1.o mmhub_v2_3.o \
mmhub_v1_7.o gfxhub_v3_0.o mmhub_v3_0.o mmhub_v3_0_2.o gmc_v11_0.o \
mmhub_v3_0_1.o gfxhub_v3_0_3.o gfxhub_v1_2.o mmhub_v1_8.o mmhub_v3_3.o \
-   gfxhub_v11_5_0.o
+   gfxhub_v11_5_0.o mmhub_v4_1_0.o
 
 # add UMC block
 amdgpu-y += \
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c
new file mode 100644
index ..5bbaa2b2caab
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c
@@ -0,0 +1,654 @@
+/*
+ * Copyright 2023 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu.h"
+#include "mmhub_v4_1_0.h"
+
+#include "mmhub/mmhub_4_1_0_offset.h"
+#include "mmhub/mmhub_4_1_0_sh_mask.h"
+
+#include "soc15_common.h"
+#include "soc24_enum.h"
+
+#define regMMVM_L2_CNTL3_DEFAULT   0x8017
+#define regMMVM_L2_CNTL4_DEFAULT   0x00c1
+#define regMMVM_L2_CNTL5_DEFAULT   0x3fe0
+
+static const char *mmhub_client_ids_v4_1_0[][2] = {
+   [0][0] = "VMC",
+   [4][0] = "DCEDMC",
+   [5][0] = "DCEVGA",
+   [6][0] = "MP0",
+   [7][0] = "MP1",
+   [8][0] = "MPIO",
+   [16][0] = "HDP",
+   [17][0] = "LSDMA",
+   [18][0] = "JPEG",
+   [19][0] = "VCNU0",
+   [21][0] = "VSCH",
+   [22][0] = "VCNU1",
+   [23][0] = "VCN1",
+   [32+20][0] = "VCN0",
+   [2][1] = "DBGUNBIO",
+   [3][1] = "DCEDWB",
+   [4][1] = "DCEDMC",
+   [5][1] = "DCEVGA",
+   [6][1] = "MP0",
+   [7][1] = "MP1",
+   [8][1] = "MPIO",
+   [10][1] = "DBGU0",
+   [11][1] = "DBGU1",
+   [12][1] = "DBGU2",
+   [13][1] = "DBGU3",
+   [14][1] = "XDP",
+   [15][1] = "OSSSYS",
+   [16][1] = "HDP",
+   [17][1] = "LSDMA",
+   [18][1] = "JPEG",
+   [19][1] = "VCNU0",
+   [20][1] = "VCN0",
+   [21][1] = "VSCH",
+   [22][1] = "VCNU1",
+   [23][1] = "VCN1",
+};
+
+static uint32_t mmhub_v4_1_0_get_invalidate_req(unsigned int vmid,
+   uint32_t flush_type)
+{
+   u32 req = 0;
+
+   /* invalidate using legacy mode on vmid*/
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ,
+   PER_VMID_INVALIDATE_REQ, 1 << vmid);
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ, FLUSH_TYPE, flush_type);
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PTES, 1);
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PDE0, 1);
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PDE1, 1);
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ, INVALIDATE_L2_PDE2, 1);
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ, INVALIDATE_L1_PTES, 1);
+   req = REG_SET_FIELD(req, MMVM_INVALIDATE_ENG0_REQ,
+   CLEAR_PROTECTION_FAULT_STATUS_ADDR, 0);
+
+   

[PATCH 0/2] Add mmhub 4.1.x support

2024-04-25 Thread Alex Deucher
Add support for mmhub 4.1.x.

The first patch adds new register headers which
have been omitted due to size.

Hawking Zhang (2):
  drm/amdgpu: Add mmhub v4_1_0 ip headers (v4)
  drm/amdgpu: Add mmhub v4_1_0 ip block support (v4)

 drivers/gpu/drm/amd/amdgpu/Makefile   |2 +-
 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c |  654 ++
 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.h |   28 +
 .../asic_reg/mmhub/mmhub_4_1_0_offset.h   | 1341 
 .../asic_reg/mmhub/mmhub_4_1_0_sh_mask.h  | 6943 +
 5 files changed, 8967 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.h
 create mode 100644 
drivers/gpu/drm/amd/include/asic_reg/mmhub/mmhub_4_1_0_offset.h
 create mode 100644 
drivers/gpu/drm/amd/include/asic_reg/mmhub/mmhub_4_1_0_sh_mask.h

-- 
2.44.0



[PATCH 4/4] drm/amdgpu/discovery: Add common soc24 ip block

2024-04-25 Thread Alex Deucher
From: Likun Gao 

Add common soc24 ip block.

v2: squash in updates (Alex)

Signed-off-by: Likun Gao 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 0e31bdb4b7cb..79b43e4bf7c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -68,6 +68,7 @@
 #include "hdp_v7_0.h"
 #include "nv.h"
 #include "soc21.h"
+#include "soc24.h"
 #include "navi10_ih.h"
 #include "ih_v6_0.h"
 #include "ih_v6_1.h"
@@ -1700,6 +1701,10 @@ static int amdgpu_discovery_set_common_ip_blocks(struct amdgpu_device *adev)
case IP_VERSION(11, 5, 1):
amdgpu_device_ip_block_add(adev, &soc21_common_ip_block);
break;
+   case IP_VERSION(12, 0, 0):
+   case IP_VERSION(12, 0, 1):
+   amdgpu_device_ip_block_add(adev, &soc24_common_ip_block);
+   break;
default:
dev_err(adev->dev,
"Failed to add common ip block(GC_HWIP:0x%x)\n",
-- 
2.44.0



[PATCH 3/4] drm/amdgpu: Add soc24 common ip block (v2)

2024-04-25 Thread Alex Deucher
From: Hawking Zhang 

Add initial soc24 support.

v1: Add soc24 common ip block.
v2: Switch to new select_se_sh/enter_safe_mode
interface.
v3: squash in correct ext rev id, etc. (Alex)

Signed-off-by: Hawking Zhang 
Reviewed-by: Likun Gao 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/Makefile |   2 +-
 drivers/gpu/drm/amd/amdgpu/soc24.c  | 532 
 drivers/gpu/drm/amd/amdgpu/soc24.h  |  30 ++
 3 files changed, 563 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/soc24.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/soc24.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 1f6b56ec99f6..240c86f54ce8 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -97,7 +97,7 @@ amdgpu-$(CONFIG_DRM_AMDGPU_SI)+= si.o gmc_v6_0.o gfx_v6_0.o si_ih.o si_dma.o dce
 amdgpu-y += \
	vi.o mxgpu_vi.o nbio_v6_1.o soc15.o emu_soc.o mxgpu_ai.o nbio_v7_0.o vega10_reg_init.o \
	vega20_reg_init.o nbio_v7_4.o nbio_v2_3.o nv.o arct_reg_init.o mxgpu_nv.o \
-	nbio_v7_2.o hdp_v4_0.o hdp_v5_0.o aldebaran_reg_init.o aldebaran.o soc21.o \
+	nbio_v7_2.o hdp_v4_0.o hdp_v5_0.o aldebaran_reg_init.o aldebaran.o soc21.o soc24.o \
	sienna_cichlid.o smu_v13_0_10.o nbio_v4_3.o hdp_v6_0.o nbio_v7_7.o hdp_v5_2.o lsdma_v6_0.o \
	nbio_v7_9.o aqua_vanjaram.o nbio_v7_11.o lsdma_v7_0.o hdp_v7_0.o nbif_v6_3_1.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/soc24.c 
b/drivers/gpu/drm/amd/amdgpu/soc24.c
new file mode 100644
index ..3010dbff695d
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/soc24.c
@@ -0,0 +1,532 @@
+/*
+ * Copyright 2023 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+
+#include "amdgpu.h"
+#include "amdgpu_atombios.h"
+#include "amdgpu_ih.h"
+#include "amdgpu_uvd.h"
+#include "amdgpu_vce.h"
+#include "amdgpu_ucode.h"
+#include "amdgpu_psp.h"
+#include "amdgpu_smu.h"
+#include "atom.h"
+#include "amd_pcie.h"
+
+#include "gc/gc_12_0_0_offset.h"
+#include "gc/gc_12_0_0_sh_mask.h"
+#include "mp/mp_14_0_2_offset.h"
+
+#include "soc15.h"
+#include "soc15_common.h"
+#include "soc24.h"
+#include "mxgpu_nv.h"
+
+static const struct amd_ip_funcs soc24_common_ip_funcs;
+
+static u32 soc24_get_config_memsize(struct amdgpu_device *adev)
+{
+   return adev->nbio.funcs->get_memsize(adev);
+}
+
+static u32 soc24_get_xclk(struct amdgpu_device *adev)
+{
+   return adev->clock.spll.reference_freq;
+}
+
+void soc24_grbm_select(struct amdgpu_device *adev,
+  u32 me, u32 pipe, u32 queue, u32 vmid)
+{
+   u32 grbm_gfx_cntl = 0;
+   grbm_gfx_cntl = REG_SET_FIELD(grbm_gfx_cntl, GRBM_GFX_CNTL, PIPEID, 
pipe);
+   grbm_gfx_cntl = REG_SET_FIELD(grbm_gfx_cntl, GRBM_GFX_CNTL, MEID, me);
+   grbm_gfx_cntl = REG_SET_FIELD(grbm_gfx_cntl, GRBM_GFX_CNTL, VMID, vmid);
+   grbm_gfx_cntl = REG_SET_FIELD(grbm_gfx_cntl, GRBM_GFX_CNTL, QUEUEID, 
queue);
+
+   WREG32_SOC15(GC, 0, regGRBM_GFX_CNTL, grbm_gfx_cntl);
+}
+
+static struct soc15_allowed_register_entry soc24_allowed_read_registers[] = {
+   { SOC15_REG_ENTRY(GC, 0, regGRBM_STATUS)},
+   { SOC15_REG_ENTRY(GC, 0, regGRBM_STATUS2)},
+   { SOC15_REG_ENTRY(GC, 0, regGRBM_STATUS_SE0)},
+   { SOC15_REG_ENTRY(GC, 0, regGRBM_STATUS_SE1)},
+   { SOC15_REG_ENTRY(GC, 0, regGRBM_STATUS_SE2)},
+   { SOC15_REG_ENTRY(GC, 0, regGRBM_STATUS_SE3)},
+   { SOC15_REG_ENTRY(SDMA0, 0, regSDMA0_STATUS_REG)},
+   { SOC15_REG_ENTRY(SDMA1, 0, regSDMA1_STATUS_REG)},
+   { SOC15_REG_ENTRY(GC, 0, regCP_STAT)},
+   { SOC15_REG_ENTRY(GC, 0, regCP_STALLED_STAT1)},
+   { SOC15_REG_ENTRY(GC, 0, regCP_STALLED_STAT2)},
+   { SOC15_REG_ENTRY(GC, 0, regCP_STALLED_STAT3)},
+   { SOC15_REG_ENTRY(GC, 0, regCP_CPF_BUSY_STAT)},
+   { SOC15_REG_

[PATCH 0/4] add soc24 support

2024-04-25 Thread Alex Deucher
Add SoC handler for SoC24 platforms.

First two patches add new headers which are omitted due
to size.

Hawking Zhang (3):
  drm/amdgpu: Add gc v12_0_0 ip headers (v4)
  drm/amdgpu: Add soc24 chip enum definitions (v4)
  drm/amdgpu: Add soc24 common ip block (v2)

Likun Gao (1):
  drm/amdgpu/discovery: Add common soc24 ip block

 drivers/gpu/drm/amd/amdgpu/Makefile   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 5 +
 drivers/gpu/drm/amd/amdgpu/soc24.c|   532 +
 drivers/gpu/drm/amd/amdgpu/soc24.h|30 +
 .../include/asic_reg/gc/gc_12_0_0_offset.h| 11053 +
 .../include/asic_reg/gc/gc_12_0_0_sh_mask.h   | 40452 
 drivers/gpu/drm/amd/include/soc24_enum.h  | 21073 
 7 files changed, 73146 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/soc24.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/soc24.h
 create mode 100644 drivers/gpu/drm/amd/include/asic_reg/gc/gc_12_0_0_offset.h
 create mode 100644 drivers/gpu/drm/amd/include/asic_reg/gc/gc_12_0_0_sh_mask.h
 create mode 100644 drivers/gpu/drm/amd/include/soc24_enum.h

-- 
2.44.0



Re: [RFC PATCH 02/18] drm/ttm: Add per-BO eviction tracking

2024-04-25 Thread Matthew Brost
On Thu, Apr 25, 2024 at 08:18:38AM +0200, Christian König wrote:
> Am 24.04.24 um 18:56 schrieb Friedrich Vock:
> > Make each buffer object aware of whether it has been evicted or not.
> 
> That reverts some changes we made a couple of years ago.
> 
> In general the idea is that eviction isn't something we need to reverse in
> TTM.
> 
> Rather the driver gives the desired placement.
> 
> Regards,
> Christian.
> 

We have added a concept similar to this in drm_gpuvm [1]. GPUVM
maintains a list of evicted BOs and when the GPUVM is locked for
submission it has validate vfunc which is called on each BO. If driver
is using TTM, this is where the driver would call TTM BO validate which
unevicts the BO. Well, at least this is what we do in Xe [2].

The uneviction is a per-VM operation, not a global one. With this, a
global eviction list does not seem correct (admittedly I haven't gone
through the entire series).

Matt

[1] 
https://elixir.bootlin.com/linux/v6.8.7/source/drivers/gpu/drm/drm_gpuvm.c#L86
[2] 
https://elixir.bootlin.com/linux/v6.8.7/source/drivers/gpu/drm/xe/xe_vm.c#L464
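For readers following along, a minimal sketch of that pattern — the
callback name and the placement lookup are illustrative assumptions,
not Xe's actual code:

	/* Hypothetical drm_gpuvm vm_bo_validate callback for a TTM-backed
	 * driver: drm_gpuvm_validate() walks the VM's evicted list and
	 * calls this per BO; validating the BO back into its preferred
	 * placement is what "unevicts" it. */

	/* assumed driver-specific helper returning e.g. a VRAM placement */
	struct ttm_placement *example_preferred_placement(struct ttm_buffer_object *bo);

	static int example_vm_bo_validate(struct drm_gpuvm_bo *vm_bo,
					  struct drm_exec *exec)
	{
		struct ttm_buffer_object *tbo =
			container_of(vm_bo->obj, struct ttm_buffer_object, base);
		struct ttm_operation_ctx ctx = {
			.interruptible = true,
			.no_wait_gpu = false,
		};

		return ttm_bo_validate(tbo, example_preferred_placement(tbo), &ctx);
	}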

> > 
> > Signed-off-by: Friedrich Vock 
> > ---
> >   drivers/gpu/drm/ttm/ttm_bo.c |  1 +
> >   include/drm/ttm/ttm_bo.h | 11 +++
> >   2 files changed, 12 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> > index edf10618fe2b2..3968b17453569 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -980,6 +980,7 @@ int ttm_bo_init_reserved(struct ttm_device *bdev, 
> > struct ttm_buffer_object *bo,
> > bo->pin_count = 0;
> > bo->sg = sg;
> > bo->bulk_move = NULL;
> > +   bo->evicted_type = TTM_NUM_MEM_TYPES;
> > if (resv)
> > bo->base.resv = resv;
> > else
> > diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
> > index 0223a41a64b24..8a1a29c6fbc50 100644
> > --- a/include/drm/ttm/ttm_bo.h
> > +++ b/include/drm/ttm/ttm_bo.h
> > @@ -121,6 +121,17 @@ struct ttm_buffer_object {
> > unsigned priority;
> > unsigned pin_count;
> > 
> > +   /**
> > +* @evicted_type: Memory type this BO was evicted from, if any.
> > +* TTM_NUM_MEM_TYPES if this BO was not evicted.
> > +*/
> > +   int evicted_type;
> > +   /**
> > +* @evicted: Entry in the evicted list for the resource manager
> > +* this BO was evicted from.
> > +*/
> > +   struct list_head evicted;
> > +
> > /**
> >  * @delayed_delete: Work item used when we can't delete the BO
> >  * immediately
> > --
> > 2.44.0
> > 
> 


[PATCH v5] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-25 Thread Arunpravin Paneer Selvam
Now we have two flags for contiguous VRAM buffer allocation.
If the application requests AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the
buffer's placement function.

This patch will change the default behaviour of the two flags.

When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
- This means contiguous is not mandatory.
- We will try to allocate the buffer contiguously. If the
  allocation fails, we fall back to allocating the individual pages.

When we set TTM_PL_FLAG_CONTIGUOUS
- This means contiguous allocation is mandatory.
- We are setting this in amdgpu_bo_pin_restricted() before bo validation
  and check this flag in the vram manager file.
- If this is set, we should allocate the buffer pages contiguously.
  If the allocation fails, we return -ENOSPC.
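For context, a minimal userspace sketch of how this hint is requested
through the standard amdgpu GEM create ioctl (fd and size are assumed
to exist; error handling omitted):

	union drm_amdgpu_gem_create args = {
		.in = {
			.bo_size = size,
			.alignment = 4096,
			.domains = AMDGPU_GEM_DOMAIN_VRAM,
			/* after this patch: a hint, with fallback to
			 * scattered pages instead of failing */
			.domain_flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
		},
	};
	int r = drmCommandWriteRead(fd, DRM_AMDGPU_GEM_CREATE,
				    &args, sizeof(args));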

v2:
  - keep the mem_flags and bo->flags check as is (Christian)
  - place the TTM_PL_FLAG_CONTIGUOUS flag setting into the
    amdgpu_bo_pin_restricted function placement range iteration
    loop (Christian)
  - rename find_pages to amdgpu_vram_mgr_calculate_pages_per_block
    (Christian)
  - Keep the kernel BO allocation as is (Christian)
  - If BO pin vram allocation failed, we need to return -ENOSPC as
    RDMA cannot work with scattered VRAM pages (Philip)

v3(Christian):
  - keep contiguous flag handling outside of pages_per_block
calculation
  - remove the hacky implementation in contiguous flag error
handling code

v4(Christian):
  - use any variable and return value for non-contiguous
fallback

v5: rebase to amd-staging-drm-next branch

Signed-off-by: Arunpravin Paneer Selvam 
Suggested-by: Christian König 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |  8 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 23 +++-
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 331b9ed8062c..316a9f897f2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -153,8 +153,10 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo 
*abo, u32 domain)
else
places[c].flags |= TTM_PL_FLAG_TOPDOWN;
 
-   if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
+   if (abo->tbo.type == ttm_bo_type_kernel &&
+   flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
places[c].flags |= TTM_PL_FLAG_CONTIGUOUS;
+
c++;
}
 
@@ -964,6 +966,10 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 
domain,
if (!bo->placements[i].lpfn ||
(lpfn && lpfn < bo->placements[i].lpfn))
bo->placements[i].lpfn = lpfn;
+
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->placements[i].mem_type == TTM_PL_VRAM)
+   bo->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
}
 
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 8db880244324..f23002ed2b42 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -450,6 +450,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
 {
struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
struct amdgpu_device *adev = to_amdgpu_device(mgr);
+   struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
u64 vis_usage = 0, max_bytes, min_block_size;
struct amdgpu_vram_mgr_resource *vres;
u64 size, remaining_size, lpfn, fpfn;
@@ -468,7 +469,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
if (tbo->type != ttm_bo_type_kernel)
max_bytes -= AMDGPU_VM_RESERVED_VRAM;
 
-   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) {
pages_per_block = ~0ul;
} else {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -477,7 +478,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
/* default to 2MB */
pages_per_block = 2UL << (20UL - PAGE_SHIFT);
 #endif
-   pages_per_block = max_t(uint32_t, pages_per_block,
+   pages_per_block = max_t(u32, pages_per_block,
tbo->page_alignment);
}
 
@@ -498,7 +499,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
if (place->flags & TTM_PL_FLAG_TOPDOWN)
vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
 
-   if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
vres->flags |= DRM_BUDDY_CONTIGUOUS_ALLOCATION;
 
if (fpfn || lpfn != mgr->mm.size)
@@ -514,21 +515,31 @@ static int

Re: [PATCH] drm/amdgpu: skip ip dump if devcoredump flag is set

2024-04-25 Thread Khatri, Sunil


On 4/25/2024 7:43 PM, Lazar, Lijo wrote:


On 4/25/2024 3:53 PM, Sunil Khatri wrote:

Do not dump the ip registers during driver reload
in passthrough environment.

Signed-off-by: Sunil Khatri
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 869256394136..b50758482530 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5372,10 +5372,12 @@ int amdgpu_do_asic_reset(struct list_head 
*device_list_handle,
amdgpu_reset_reg_dumps(tmp_adev);

Probably not related, can the above step be clubbed with what's being
done below? Or, can we move all such to start with amdgpu_reset_dump_*?

Sure Lijo,

I will club both dump_ip_state and amdgpu_reset_reg_dumps under one if
condition in the patch to push.


Regards Sunil
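Something along these lines, as a sketch of the clubbed form:

	/* Sketch: skip both the register dump and the IP-state dump
	 * when the devcoredump flag asks us to. */
	if (!test_bit(AMDGPU_SKIP_COREDUMP, &reset_context->flags)) {
		amdgpu_reset_reg_dumps(tmp_adev);

		/* Trigger ip dump before we reset the asic */
		for (i = 0; i < tmp_adev->num_ip_blocks; i++)
			if (tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
				tmp_adev->ip_blocks[i].version->funcs
					->dump_ip_state((void *)tmp_adev);
	}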

  
  	/* Trigger ip dump before we reset the asic */

-   for (i = 0; i < tmp_adev->num_ip_blocks; i++)
-   if (tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
-   tmp_adev->ip_blocks[i].version->funcs->dump_ip_state(
-   (void *)tmp_adev);
+   if (!test_bit(AMDGPU_SKIP_COREDUMP, &reset_context->flags)) {
+   for (i = 0; i < tmp_adev->num_ip_blocks; i++)
+   if 
(tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
+   tmp_adev->ip_blocks[i].version->funcs
+   ->dump_ip_state((void *)tmp_adev);
+   }


Anyway,

Reviewed-by: Lijo Lazar

Thanks,
Lijo
  
  	reset_context->reset_device_list = device_list_handle;

r = amdgpu_reset_perform_reset(tmp_adev, reset_context);

RE: [PATCH] drm/amdgpu: Fix two reset triggered in a row

2024-04-25 Thread Li, Yunxiang (Teddy)
[Public]

> Looks like that is handled by the scheduler work item now as well. See 
> function gfx_v9_0_fault() for an example.

Cool, so it is blocked by drm_sched_stop also. I think that covers everything.


Re: [PATCH] drm/amdgpu: skip ip dump if devcoredump flag is set

2024-04-25 Thread Lazar, Lijo



On 4/25/2024 3:53 PM, Sunil Khatri wrote:
> Do not dump the ip registers during driver reload
> in passthrough environment.
> 
> Signed-off-by: Sunil Khatri 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 869256394136..b50758482530 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5372,10 +5372,12 @@ int amdgpu_do_asic_reset(struct list_head 
> *device_list_handle,
>   amdgpu_reset_reg_dumps(tmp_adev);

Probably not related, can the above step be clubbed with what's being
done below? Or, can we move all such to start with amdgpu_reset_dump_*?
>  
>   /* Trigger ip dump before we reset the asic */
> - for (i = 0; i < tmp_adev->num_ip_blocks; i++)
> - if (tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
> - tmp_adev->ip_blocks[i].version->funcs->dump_ip_state(
> - (void *)tmp_adev);
> + if (!test_bit(AMDGPU_SKIP_COREDUMP, &reset_context->flags)) {
> + for (i = 0; i < tmp_adev->num_ip_blocks; i++)
> + if 
> (tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
> + tmp_adev->ip_blocks[i].version->funcs
> + ->dump_ip_state((void *)tmp_adev);
> + }


Anyway,

Reviewed-by: Lijo Lazar 

Thanks,
Lijo
>  
>   reset_context->reset_device_list = device_list_handle;
>   r = amdgpu_reset_perform_reset(tmp_adev, reset_context);


Re: [PATCH] drm/amdgpu: Fix two reset triggered in a row

2024-04-25 Thread Christian König

Am 24.04.24 um 15:13 schrieb Li, Yunxiang (Teddy):

[Public]


We have the KFD, FLR, the per engine one in the scheduler and IIRC one more for 
the CP (illegal operation and register write).

I'm not sure about the CP one, but all others should be handled correctly with 
the V2 patch as far as I can see.

Where can I find the CP one? Nothing came up when I search for 
amdgpu_device_gpu_recover


I had to dig that up as well in the code since I haven't looked into it 
in years.


Looks like that is handled by the scheduler work item now as well. See 
function gfx_v9_0_fault() for an example.


Regards,
Christian.


Re: [PATCH 5/5] drm/amdgpu/gfx: enable mes to map legacy queue support

2024-04-25 Thread Christian König

Shashank can you take a look as well.

Thanks,
Christian.

Am 25.04.24 um 15:40 schrieb Alex Deucher:

Series looks good to me.

Reviewed-by: Alex Deucher 

On Thu, Apr 25, 2024 at 6:07 AM Jack Xiao  wrote:

Enable mes to map legacy queue support.

Signed-off-by: Jack Xiao 
Reviewed-by: Hawking Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 39 +
  1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index d9dc5485..172b7ba5d0a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -622,10 +622,28 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, int 
xcc_id)
 queue_mask |= (1ull << 
amdgpu_queue_mask_bit_to_set_resource_bit(adev, i));
 }

-   DRM_INFO("kiq ring mec %d pipe %d q %d\n", kiq_ring->me, kiq_ring->pipe,
-   kiq_ring->queue);
 amdgpu_device_flush_hdp(adev, NULL);

+   if (adev->enable_mes)
+   queue_mask = ~0ULL;
+
+   if (adev->enable_mes) {
+   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+   j = i + xcc_id * adev->gfx.num_compute_rings;
+   r = amdgpu_mes_map_legacy_queue(adev,
+   
&adev->gfx.compute_ring[j]);
+   if (r) {
+   DRM_ERROR("failed to map compute queue\n");
+   return r;
+   }
+   }
+
+   return 0;
+   }
+
+   DRM_INFO("kiq ring mec %d pipe %d q %d\n", kiq_ring->me, kiq_ring->pipe,
+kiq_ring->queue);
+
 spin_lock(&kiq->ring_lock);
 r = amdgpu_ring_alloc(kiq_ring, kiq->pmf->map_queues_size *
 adev->gfx.num_compute_rings +
@@ -636,9 +654,6 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, int 
xcc_id)
 return r;
 }

-   if (adev->enable_mes)
-   queue_mask = ~0ULL;
-
 kiq->pmf->kiq_set_resources(kiq_ring, queue_mask);
 for (i = 0; i < adev->gfx.num_compute_rings; i++) {
 j = i + xcc_id * adev->gfx.num_compute_rings;
@@ -665,6 +680,20 @@ int amdgpu_gfx_enable_kgq(struct amdgpu_device *adev, int 
xcc_id)

 amdgpu_device_flush_hdp(adev, NULL);

+   if (adev->enable_mes) {
+   for (i = 0; i < adev->gfx.num_gfx_rings; i++) {
+   j = i + xcc_id * adev->gfx.num_gfx_rings;
+   r = amdgpu_mes_map_legacy_queue(adev,
+   &adev->gfx.gfx_ring[j]);
+   if (r) {
+   DRM_ERROR("failed to map gfx queue\n");
+   return r;
+   }
+   }
+
+   return 0;
+   }
+
 spin_lock(&kiq->ring_lock);
 /* No need to map kcq on the slave */
 if (amdgpu_gfx_is_master_xcc(adev, xcc_id)) {
--
2.41.0





Re: [PATCH 5/5] drm/amdgpu/gfx: enable mes to map legacy queue support

2024-04-25 Thread Alex Deucher
Series looks good to me.

Reviewed-by: Alex Deucher 

On Thu, Apr 25, 2024 at 6:07 AM Jack Xiao  wrote:
>
> Enable mes to map legacy queue support.
>
> Signed-off-by: Jack Xiao 
> Reviewed-by: Hawking Zhang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 39 +
>  1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index d9dc5485..172b7ba5d0a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -622,10 +622,28 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, 
> int xcc_id)
> queue_mask |= (1ull << 
> amdgpu_queue_mask_bit_to_set_resource_bit(adev, i));
> }
>
> -   DRM_INFO("kiq ring mec %d pipe %d q %d\n", kiq_ring->me, 
> kiq_ring->pipe,
> -   kiq_ring->queue);
> amdgpu_device_flush_hdp(adev, NULL);
>
> +   if (adev->enable_mes)
> +   queue_mask = ~0ULL;
> +
> +   if (adev->enable_mes) {
> +   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
> +   j = i + xcc_id * adev->gfx.num_compute_rings;
> +   r = amdgpu_mes_map_legacy_queue(adev,
> +   
> &adev->gfx.compute_ring[j]);
> +   if (r) {
> +   DRM_ERROR("failed to map compute queue\n");
> +   return r;
> +   }
> +   }
> +
> +   return 0;
> +   }
> +
> +   DRM_INFO("kiq ring mec %d pipe %d q %d\n", kiq_ring->me, 
> kiq_ring->pipe,
> +kiq_ring->queue);
> +
> spin_lock(&kiq->ring_lock);
> r = amdgpu_ring_alloc(kiq_ring, kiq->pmf->map_queues_size *
> adev->gfx.num_compute_rings +
> @@ -636,9 +654,6 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, int 
> xcc_id)
> return r;
> }
>
> -   if (adev->enable_mes)
> -   queue_mask = ~0ULL;
> -
> kiq->pmf->kiq_set_resources(kiq_ring, queue_mask);
> for (i = 0; i < adev->gfx.num_compute_rings; i++) {
> j = i + xcc_id * adev->gfx.num_compute_rings;
> @@ -665,6 +680,20 @@ int amdgpu_gfx_enable_kgq(struct amdgpu_device *adev, 
> int xcc_id)
>
> amdgpu_device_flush_hdp(adev, NULL);
>
> +   if (adev->enable_mes) {
> +   for (i = 0; i < adev->gfx.num_gfx_rings; i++) {
> +   j = i + xcc_id * adev->gfx.num_gfx_rings;
> +   r = amdgpu_mes_map_legacy_queue(adev,
> +   
> &adev->gfx.gfx_ring[j]);
> +   if (r) {
> +   DRM_ERROR("failed to map gfx queue\n");
> +   return r;
> +   }
> +   }
> +
> +   return 0;
> +   }
> +
> spin_lock(&kiq->ring_lock);
> /* No need to map kcq on the slave */
> if (amdgpu_gfx_is_master_xcc(adev, xcc_id)) {
> --
> 2.41.0
>


Re: [RFC PATCH 00/18] TTM interface for managing VRAM oversubscription

2024-04-25 Thread Christian König

Yeah, and this patch set here is removing that functionality.

Which is a major concern from my side as well.

Instead of removing it, my long term plan was to move this into TTM (the
recent flags rework is going in that direction), so that both amdgpu
and radeon can use the same code again *and* we can also apply it on
VM_ALWAYS_VALID BOs.


Christian.

Am 25.04.24 um 15:22 schrieb Marek Olšák:

The most extreme ping-ponging is mitigated by throttling buffer moves
in the kernel, but it only works without VM_ALWAYS_VALID and you can
set BO priorities in the BO list. A better approach that works with
VM_ALWAYS_VALID would be nice.

Marek

On Wed, Apr 24, 2024 at 1:12 PM Friedrich Vock  wrote:

Hi everyone,

recently I've been looking into remedies for apps (in particular, newer
games) that experience significant performance loss when they start to
hit VRAM limits, especially on older or lower-end cards that struggle
to fit both desktop apps and all the game data into VRAM at once.

The root of the problem lies in the fact that from userspace's POV,
buffer eviction is very opaque: Userspace applications/drivers cannot
tell how oversubscribed VRAM is, nor do they have fine-grained control
over which buffers get evicted.  At the same time, with GPU APIs becoming
increasingly lower-level and GPU-driven, only the application itself
can know which buffers are used within a particular submission, and
how important each buffer is. For this, GPU APIs include interfaces
to query oversubscription and specify memory priorities: In Vulkan,
oversubscription can be queried through the VK_EXT_memory_budget
extension. Different buffers can also be assigned priorities via the
VK_EXT_pageable_device_local_memory extension. Modern games, especially
D3D12 games via vkd3d-proton, rely on oversubscription being reported and
priorities being respected in order to perform their memory management.

However, relaying this information to the kernel via the current KMD uAPIs
is not possible. On AMDGPU for example, all work submissions include a
"bo list" that contains any buffer object that is accessed during the
course of the submission. If VRAM is oversubscribed and a buffer in the
list was evicted to system memory, that buffer is moved back to VRAM
(potentially evicting other unused buffers).

Since the usermode driver doesn't know what buffers are used by the
application, its only choice is to submit a bo list that contains every
buffer the application has allocated. In case of VRAM oversubscription,
it is highly likely that some of the application's buffers were evicted,
which almost guarantees that some buffers will get moved around. Since
the bo list is only known at submit time, this also means the buffers
will get moved right before submitting application work, which is the
worst possible time to move buffers from a latency perspective. Another
consequence of the large bo list is that nearly all memory from other
applications will be evicted, too. When different applications (e.g. game
and compositor) submit work one after the other, this causes a ping-pong
effect where each app's submission evicts the other app's memory,
resulting in a large amount of unnecessary moves.

This overly aggressive eviction behavior led to RADV adopting a change
that effectively allows all VRAM applications to reside in system memory
[1].  This worked around the ping-ponging/excessive buffer moving problem,
but also meant that any memory evicted to system memory would forever
stay there, regardless of how VRAM is used.

My proposal aims at providing a middle ground between these extremes.
The goals I want to meet are:
- Userspace is accurately informed about VRAM oversubscription/how much
   VRAM has been evicted
- Buffer eviction respects priorities set by userspace
- Wasteful ping-ponging is avoided to the extent possible

I have been testing out some prototypes, and came up with this rough
sketch of an API:

- For each ttm_resource_manager, the amount of evicted memory is tracked
   (similarly to how "usage" tracks the memory usage). When memory is
   evicted via ttm_bo_evict, the size of the evicted memory is added, when
   memory is un-evicted (see below), its size is subtracted. The amount of
   evicted memory for e.g. VRAM can be queried by userspace via an ioctl.

- Each ttm_resource_manager maintains a list of evicted buffer objects.

- ttm_mem_unevict walks the list of evicted bos for a given
   ttm_resource_manager and tries moving evicted resources back. When a
   buffer is freed, this function is called to immediately restore some
   evicted memory.

- Each ttm_buffer_object independently tracks the mem_type it wants
   to reside in.

- ttm_bo_try_unevict is added as a helper function which attempts to
   move the buffer to its preferred mem_type. If no space is available
   there, it fails with -ENOSPC/-ENOMEM.

- Similar to how ttm_bo_evict works, each driver can implement
   uneviction_valuable/unevict_flags callbacks to con

Re: [PATCH v2] drm/amdgpu: fix overflowed array index read warning

2024-04-25 Thread Christian König

Am 25.04.24 um 15:28 schrieb Alex Deucher:

On Thu, Apr 25, 2024 at 3:22 AM Tim Huang  wrote:

From: Tim Huang 

Clear warning that cast operation might have overflowed.

v2: keep reverse xmas tree order to declare "int r;" (Christian)

Signed-off-by: Tim Huang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 06f0a6534a94..8cf60acb2970 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -473,8 +473,8 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, 
char __user *buf,
 size_t size, loff_t *pos)
  {
 struct amdgpu_ring *ring = file_inode(f)->i_private;
-   int r, i;
 uint32_t value, result, early[3];
+   int r;

 if (*pos & 3 || size & 3)
 return -EINVAL;
@@ -485,7 +485,7 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, 
char __user *buf,
 early[0] = amdgpu_ring_get_rptr(ring) & ring->buf_mask;
 early[1] = amdgpu_ring_get_wptr(ring) & ring->buf_mask;
 early[2] = ring->wptr & ring->buf_mask;
-   for (i = *pos / 4; i < 3 && size; i++) {
+   for (loff_t i = *pos / 4; i < 3 && size; i++) {

Some older compilers complain about declarations mixed with code like
this.  Not sure how big a deal that would be.
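A sketch of the C89-friendly alternative would hoist the declaration
to the top of the function instead of scoping it to the loop:

	uint32_t value, result, early[3];
	loff_t i;
	int r;

	/* ... */
	for (i = *pos / 4; i < 3 && size; i++) {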


Good point, we would like to be able to backport this.

Somebody from Alvin's team needs to comment, but IIRC we agreed that
this would be legal and we take care of it by using appropriate compiler
flags on older kernels.


Christian.



Alex


 r = put_user(early[i], (uint32_t *)buf);
 if (r)
 return r;
--
2.39.2





Re: [PATCH] drm/amdgpu: fix the warning about the expression (int)size - len

2024-04-25 Thread Alex Deucher
On Thu, Apr 25, 2024 at 3:37 AM Jesse Zhang  wrote:
>
> Converting size from size_t to int may overflow.
> v2: keep reverse xmas tree order (Christian)
>
> Signed-off-by: Jesse Zhang 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index f5d0fa207a88..b62ae3c91a9d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -2065,12 +2065,13 @@ static ssize_t 
> amdgpu_reset_dump_register_list_write(struct file *f,
> struct amdgpu_device *adev = (struct amdgpu_device 
> *)file_inode(f)->i_private;
> char reg_offset[11];
> uint32_t *new = NULL, *tmp = NULL;
> -   int ret, i = 0, len = 0;
> +   unsigned int len = 0;
> +   int ret, i = 0;
>
> do {
> memset(reg_offset, 0, 11);
> if (copy_from_user(reg_offset, buf + len,
> -   min(10, ((int)size-len {
> +   min(10, (size-len {
> ret = -EFAULT;
> goto error_free;
> }
> --
> 2.25.1
>


Re: [PATCH] drm/amdgpu: skip ip dump if devcoredump flag is set

2024-04-25 Thread Alex Deucher
On Thu, Apr 25, 2024 at 6:23 AM Sunil Khatri  wrote:
>
> Do not dump the ip registers during driver reload
> in passthrough environment.
>
> Signed-off-by: Sunil Khatri 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 869256394136..b50758482530 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5372,10 +5372,12 @@ int amdgpu_do_asic_reset(struct list_head 
> *device_list_handle,
> amdgpu_reset_reg_dumps(tmp_adev);
>
> /* Trigger ip dump before we reset the asic */
> -   for (i = 0; i < tmp_adev->num_ip_blocks; i++)
> -   if (tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
> -   tmp_adev->ip_blocks[i].version->funcs->dump_ip_state(
> -   (void *)tmp_adev);
> +   if (!test_bit(AMDGPU_SKIP_COREDUMP, &reset_context->flags)) {
> +   for (i = 0; i < tmp_adev->num_ip_blocks; i++)
> +   if 
> (tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
> +   tmp_adev->ip_blocks[i].version->funcs
> +   ->dump_ip_state((void *)tmp_adev);
> +   }
>
> reset_context->reset_device_list = device_list_handle;
> r = amdgpu_reset_perform_reset(tmp_adev, reset_context);
> --
> 2.34.1
>


Re: [PATCH v2] drm/amdgpu: fix overflowed array index read warning

2024-04-25 Thread Alex Deucher
On Thu, Apr 25, 2024 at 3:22 AM Tim Huang  wrote:
>
> From: Tim Huang 
>
> Clear warning that cast operation might have overflowed.
>
> v2: keep reverse xmas tree order to declare "int r;" (Christian)
>
> Signed-off-by: Tim Huang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 06f0a6534a94..8cf60acb2970 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -473,8 +473,8 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, 
> char __user *buf,
> size_t size, loff_t *pos)
>  {
> struct amdgpu_ring *ring = file_inode(f)->i_private;
> -   int r, i;
> uint32_t value, result, early[3];
> +   int r;
>
> if (*pos & 3 || size & 3)
> return -EINVAL;
> @@ -485,7 +485,7 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, 
> char __user *buf,
> early[0] = amdgpu_ring_get_rptr(ring) & ring->buf_mask;
> early[1] = amdgpu_ring_get_wptr(ring) & ring->buf_mask;
> early[2] = ring->wptr & ring->buf_mask;
> -   for (i = *pos / 4; i < 3 && size; i++) {
> +   for (loff_t i = *pos / 4; i < 3 && size; i++) {

Some older compilers complain about declarations mixed with code like
this.  Not sure how big a deal that would be.

Alex

> r = put_user(early[i], (uint32_t *)buf);
> if (r)
> return r;
> --
> 2.39.2
>


Re: [RFC PATCH 00/18] TTM interface for managing VRAM oversubscription

2024-04-25 Thread Marek Olšák
The most extreme ping-ponging is mitigated by throttling buffer moves
in the kernel, but it only works without VM_ALWAYS_VALID and you can
set BO priorities in the BO list. A better approach that works with
VM_ALWAYS_VALID would be nice.

Marek

On Wed, Apr 24, 2024 at 1:12 PM Friedrich Vock  wrote:
>
> Hi everyone,
>
> recently I've been looking into remedies for apps (in particular, newer
> games) that experience significant performance loss when they start to
> hit VRAM limits, especially on older or lower-end cards that struggle
> to fit both desktop apps and all the game data into VRAM at once.
>
> The root of the problem lies in the fact that from userspace's POV,
> buffer eviction is very opaque: Userspace applications/drivers cannot
> tell how oversubscribed VRAM is, nor do they have fine-grained control
> over which buffers get evicted.  At the same time, with GPU APIs becoming
> increasingly lower-level and GPU-driven, only the application itself
> can know which buffers are used within a particular submission, and
> how important each buffer is. For this, GPU APIs include interfaces
> to query oversubscription and specify memory priorities: In Vulkan,
> oversubscription can be queried through the VK_EXT_memory_budget
> extension. Different buffers can also be assigned priorities via the
> VK_EXT_pageable_device_local_memory extension. Modern games, especially
> D3D12 games via vkd3d-proton, rely on oversubscription being reported and
> priorities being respected in order to perform their memory management.
>
> However, relaying this information to the kernel via the current KMD uAPIs
> is not possible. On AMDGPU for example, all work submissions include a
> "bo list" that contains any buffer object that is accessed during the
> course of the submission. If VRAM is oversubscribed and a buffer in the
> list was evicted to system memory, that buffer is moved back to VRAM
> (potentially evicting other unused buffers).
>
> Since the usermode driver doesn't know what buffers are used by the
> application, its only choice is to submit a bo list that contains every
> buffer the application has allocated. In case of VRAM oversubscription,
> it is highly likely that some of the application's buffers were evicted,
> which almost guarantees that some buffers will get moved around. Since
> the bo list is only known at submit time, this also means the buffers
> will get moved right before submitting application work, which is the
> worst possible time to move buffers from a latency perspective. Another
> consequence of the large bo list is that nearly all memory from other
> applications will be evicted, too. When different applications (e.g. game
> and compositor) submit work one after the other, this causes a ping-pong
> effect where each app's submission evicts the other app's memory,
> resulting in a large amount of unnecessary moves.
>
> This overly aggressive eviction behavior led to RADV adopting a change
> that effectively allows all VRAM applications to reside in system memory
> [1].  This worked around the ping-ponging/excessive buffer moving problem,
> but also meant that any memory evicted to system memory would forever
> stay there, regardless of how VRAM is used.
>
> My proposal aims at providing a middle ground between these extremes.
> The goals I want to meet are:
> - Userspace is accurately informed about VRAM oversubscription/how much
>   VRAM has been evicted
> - Buffer eviction respects priorities set by userspace
> - Wasteful ping-ponging is avoided to the extent possible
>
> I have been testing out some prototypes, and came up with this rough
> sketch of an API:
>
> - For each ttm_resource_manager, the amount of evicted memory is tracked
>   (similarly to how "usage" tracks the memory usage). When memory is
>   evicted via ttm_bo_evict, the size of the evicted memory is added, when
>   memory is un-evicted (see below), its size is subtracted. The amount of
>   evicted memory for e.g. VRAM can be queried by userspace via an ioctl.
>
> - Each ttm_resource_manager maintains a list of evicted buffer objects.
>
> - ttm_mem_unevict walks the list of evicted bos for a given
>   ttm_resource_manager and tries moving evicted resources back. When a
>   buffer is freed, this function is called to immediately restore some
>   evicted memory.
>
> - Each ttm_buffer_object independently tracks the mem_type it wants
>   to reside in.
>
> - ttm_bo_try_unevict is added as a helper function which attempts to
>   move the buffer to its preferred mem_type. If no space is available
>   there, it fails with -ENOSPC/-ENOMEM.
>
> - Similar to how ttm_bo_evict works, each driver can implement
>   uneviction_valuable/unevict_flags callbacks to control buffer
>   un-eviction.
>
> This is what patches 1-10 accomplish (together with an amdgpu
> implementation utilizing the new API).
>
> Userspace priorities could then be implemented as follows:
>
> - TTM already manages priorities for each buffer object. These pri
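As a rough illustration of the un-evict walk proposed above (all names
come from the RFC sketch — &man->evicted, bo->evicted and
ttm_bo_try_unevict() are proposed, not upstream TTM):

	/* Illustrative only: walk the manager's evicted list and try to
	 * move each BO back to its preferred mem_type, stopping once the
	 * preferred type runs out of space. */
	static void example_mem_unevict(struct ttm_resource_manager *man)
	{
		struct ttm_buffer_object *bo, *tmp;

		list_for_each_entry_safe(bo, tmp, &man->evicted, evicted) {
			/* per the proposal, returns -ENOSPC/-ENOMEM when
			 * no space is available in the preferred type */
			if (ttm_bo_try_unevict(bo) == -ENOSPC)
				break;
		}
	}

Whether this walk runs on buffer free or on demand is the policy
question the proposal leaves to the driver.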

Re: [PATCH v2 1/2] drm/amdgpu: Fix uninitialized variable warning in amdgpu_afmt_acr

2024-04-25 Thread Alex Deucher
On Thu, Apr 25, 2024 at 6:07 AM Ma Jun  wrote:
>
> Assign value to clock to fix the warning below:
> "Using uninitialized value res. Field res.clock is uninitialized"
>
> Signed-off-by: Ma Jun 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c
> index a4d65973bf7c..80771b1480ff 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c
> @@ -100,6 +100,7 @@ struct amdgpu_afmt_acr amdgpu_afmt_acr(uint32_t clock)
> amdgpu_afmt_calc_cts(clock, &res.cts_32khz, &res.n_32khz, 32000);
> amdgpu_afmt_calc_cts(clock, &res.cts_44_1khz, &res.n_44_1khz, 44100);
> amdgpu_afmt_calc_cts(clock, &res.cts_48khz, &res.n_48khz, 48000);
> +   res.clock = clock;
>
> return res;
>  }
> --
> 2.34.1
>


Re: [PATCH v2 2/2] drm/amdgpu: Fix the uninitialized variable warning

2024-04-25 Thread Alex Deucher
On Thu, Apr 25, 2024 at 6:17 AM Ma Jun  wrote:
>
> Initialize the phy_id to 0 to fix the warning of
> "Using uninitialized value phy_id"
>
> Signed-off-by: Ma Jun 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> index 8ed0e073656f..53d85fafd8ab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> @@ -95,7 +95,7 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
> file *f, const char __u
> struct psp_context *psp = &adev->psp;
> struct ta_securedisplay_cmd *securedisplay_cmd;
> struct drm_device *dev = adev_to_drm(adev);
> -   uint32_t phy_id;
> +   uint32_t phy_id = 0;

You can drop this hunk now that you added the check below.

Alex

> uint32_t op;
> char str[64];
> int ret;
> @@ -135,6 +135,10 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
> file *f, const char __u
> mutex_unlock(&psp->securedisplay_context.mutex);
> break;
> case 2:
> +   if (size < 3) {
> +   dev_err(adev->dev, "Invalid input: %s\n", str);
> +   return -EINVAL;
> +   }
> mutex_lock(&psp->securedisplay_context.mutex);
> psp_prep_securedisplay_cmd_buf(psp, &securedisplay_cmd,
> TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);
> --
> 2.34.1
>


[PATCH v2] drm/amd/display: re-indent dc_power_down_on_boot()

2024-04-25 Thread Dan Carpenter
These lines are indented too far.  Clean the whitespace.

Signed-off-by: Dan Carpenter 
---
v2: Delete another blank line (checkpatch.pl --strict).

 drivers/gpu/drm/amd/display/dc/core/dc.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 3e16041bf4f9..5a0835f884a8 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -5192,11 +5192,9 @@ void dc_interrupt_ack(struct dc *dc, enum dc_irq_source 
src)
 void dc_power_down_on_boot(struct dc *dc)
 {
if (dc->ctx->dce_environment != DCE_ENV_VIRTUAL_HW &&
-   dc->hwss.power_down_on_boot) {
-
-   if (dc->caps.ips_support)
-   dc_exit_ips_for_hw_access(dc);
-
+   dc->hwss.power_down_on_boot) {
+   if (dc->caps.ips_support)
+   dc_exit_ips_for_hw_access(dc);
dc->hwss.power_down_on_boot(dc);
}
 }
-- 
2.43.0



Re: [PATCH] drm/amdgpu: Fix out-of-bounds write warning

2024-04-25 Thread Christian König




Am 25.04.24 um 12:00 schrieb Ma Jun:

Check the ring type value to fix the out-of-bounds
write warning

Signed-off-by: Ma Jun 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 06f0a6534a94..1e0b5bb47bc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -353,6 +353,11 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
ring->hw_prio = hw_prio;
  
  	if (!ring->no_scheduler) {

+   if (ring->funcs->type >= AMDGPU_HW_IP_NUM) {
+   dev_warn(adev->dev, "ring type %d has no scheduler\n", 
ring->funcs->type);
+   return 0;
+   }
+


That check should probably be at the beginning of the function since 
trying to initialize a ring with an invalid type should be rejected 
immediately.


Regards,
Christian.


hw_ip = ring->funcs->type;
num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] =
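i.e., roughly this at the top of amdgpu_ring_init(), as a sketch of
Christian's suggestion:

	/* Reject an invalid ring type immediately instead of only when
	 * hooking up the scheduler. */
	if (ring->funcs->type >= AMDGPU_HW_IP_NUM) {
		dev_err(adev->dev, "invalid ring type %d\n",
			ring->funcs->type);
		return -EINVAL;
	}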




Re: [PATCH 1/2] drm/print: drop include debugfs.h and include where needed

2024-04-25 Thread Robert Foss
On Mon, Apr 22, 2024 at 2:10 PM Jani Nikula  wrote:
>
> Surprisingly many places depend on debugfs.h to be included via
> drm_print.h. Fix them.
>
> v3: Also fix armada, ite-it6505, imagination, msm, sti, vc4, and xe
>
> v2: Also fix ivpu and vmwgfx
>
> Reviewed-by: Andrzej Hajda 
> Acked-by: Maxime Ripard 
> Link: 
> https://patchwork.freedesktop.org/patch/msgid/20240410141434.157908-1-jani.nik...@intel.com
> Signed-off-by: Jani Nikula 
>
> ---
>
> Cc: Jacek Lawrynowicz 
> Cc: Stanislaw Gruszka 
> Cc: Oded Gabbay 
> Cc: Russell King 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Andrzej Hajda 
> Cc: Neil Armstrong 
> Cc: Robert Foss 
> Cc: Laurent Pinchart 
> Cc: Jonas Karlman 
> Cc: Jernej Skrabec 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: Jani Nikula 
> Cc: Rodrigo Vivi 
> Cc: Joonas Lahtinen 
> Cc: Tvrtko Ursulin 
> Cc: Frank Binns 
> Cc: Matt Coster 
> Cc: Rob Clark 
> Cc: Abhinav Kumar 
> Cc: Dmitry Baryshkov 
> Cc: Sean Paul 
> Cc: Marijn Suijten 
> Cc: Karol Herbst 
> Cc: Lyude Paul 
> Cc: Danilo Krummrich 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: "Pan, Xinhui" 
> Cc: Alain Volmat 
> Cc: Huang Rui 
> Cc: Zack Rusin 
> Cc: Broadcom internal kernel review list 
> 
> Cc: Lucas De Marchi 
> Cc: "Thomas Hellström" 
> Cc: dri-de...@lists.freedesktop.org
> Cc: intel-...@lists.freedesktop.org
> Cc: intel...@lists.freedesktop.org
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedr...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> ---
>  drivers/accel/ivpu/ivpu_debugfs.c   | 2 ++
>  drivers/gpu/drm/armada/armada_debugfs.c | 1 +
>  drivers/gpu/drm/bridge/ite-it6505.c | 1 +
>  drivers/gpu/drm/bridge/panel.c  | 2 ++
>  drivers/gpu/drm/drm_print.c | 6 +++---
>  drivers/gpu/drm/i915/display/intel_dmc.c| 1 +
>  drivers/gpu/drm/imagination/pvr_fw_trace.c  | 1 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c | 2 ++
>  drivers/gpu/drm/nouveau/dispnv50/crc.c  | 2 ++
>  drivers/gpu/drm/radeon/r100.c   | 1 +
>  drivers/gpu/drm/radeon/r300.c   | 1 +
>  drivers/gpu/drm/radeon/r420.c   | 1 +
>  drivers/gpu/drm/radeon/r600.c   | 3 ++-
>  drivers/gpu/drm/radeon/radeon_fence.c   | 1 +
>  drivers/gpu/drm/radeon/radeon_gem.c | 1 +
>  drivers/gpu/drm/radeon/radeon_ib.c  | 2 ++
>  drivers/gpu/drm/radeon/radeon_pm.c  | 1 +
>  drivers/gpu/drm/radeon/radeon_ring.c| 2 ++
>  drivers/gpu/drm/radeon/radeon_ttm.c | 1 +
>  drivers/gpu/drm/radeon/rs400.c  | 1 +
>  drivers/gpu/drm/radeon/rv515.c  | 1 +
>  drivers/gpu/drm/sti/sti_drv.c   | 1 +
>  drivers/gpu/drm/ttm/ttm_device.c| 1 +
>  drivers/gpu/drm/ttm/ttm_resource.c  | 3 ++-
>  drivers/gpu/drm/ttm/ttm_tt.c| 5 +++--
>  drivers/gpu/drm/vc4/vc4_drv.h   | 1 +
>  drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 2 ++
>  drivers/gpu/drm/xe/xe_debugfs.c | 1 +
>  drivers/gpu/drm/xe/xe_gt_debugfs.c  | 2 ++
>  drivers/gpu/drm/xe/xe_uc_debugfs.c  | 2 ++
>  include/drm/drm_print.h | 2 +-
>  31 files changed, 46 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
> b/drivers/accel/ivpu/ivpu_debugfs.c
> index d09d29775b3f..e07e447d08d1 100644
> --- a/drivers/accel/ivpu/ivpu_debugfs.c
> +++ b/drivers/accel/ivpu/ivpu_debugfs.c
> @@ -3,6 +3,8 @@
>   * Copyright (C) 2020-2023 Intel Corporation
>   */
>
> +#include <linux/debugfs.h>
> +
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/armada/armada_debugfs.c 
> b/drivers/gpu/drm/armada/armada_debugfs.c
> index 29f4b52e3c8d..a763349dd89f 100644
> --- a/drivers/gpu/drm/armada/armada_debugfs.c
> +++ b/drivers/gpu/drm/armada/armada_debugfs.c
> @@ -5,6 +5,7 @@
>   */
>
>  #include 
> +#include <linux/debugfs.h>
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/bridge/ite-it6505.c 
> b/drivers/gpu/drm/bridge/ite-it6505.c
> index 27334173e911..3f68c82888c2 100644
> --- a/drivers/gpu/drm/bridge/ite-it6505.c
> +++ b/drivers/gpu/drm/bridge/ite-it6505.c
> @@ -3,6 +3,7 @@
>   * Copyright (c) 2020, The Linux Foundation. All rights reserved.
>   */
>  #include 
> +#include <linux/debugfs.h>
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
> index 7f41525f7a6e..32506524d9a2 100644
> --- a/drivers/gpu/drm/bridge/panel.c
> +++ b/drivers/gpu/drm/bridge/panel.c
> @@ -4,6 +4,8 @@
>   * Copyright (C) 2017 Broadcom
>   */
>
> +#include <linux/debugfs.h>
> +
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 699b7dbffd7b..cf2efb44722c 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -23,13 +23,13 @@
>   * Rob Clark 
>   */
>
> -#include 
> -
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 

[PATCH] drm/amdgpu: skip ip dump if devcoredump flag is set

2024-04-25 Thread Sunil Khatri
Do not dump the IP registers during driver reload
in a passthrough environment.

Signed-off-by: Sunil Khatri 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 869256394136..b50758482530 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5372,10 +5372,12 @@ int amdgpu_do_asic_reset(struct list_head 
*device_list_handle,
amdgpu_reset_reg_dumps(tmp_adev);
 
/* Trigger ip dump before we reset the asic */
-   for (i = 0; i < tmp_adev->num_ip_blocks; i++)
-   if (tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
-   tmp_adev->ip_blocks[i].version->funcs->dump_ip_state(
-   (void *)tmp_adev);
+   if (!test_bit(AMDGPU_SKIP_COREDUMP, &reset_context->flags)) {
+   for (i = 0; i < tmp_adev->num_ip_blocks; i++)
+   if 
(tmp_adev->ip_blocks[i].version->funcs->dump_ip_state)
+   tmp_adev->ip_blocks[i].version->funcs
+   ->dump_ip_state((void *)tmp_adev);
+   }
 
reset_context->reset_device_list = device_list_handle;
r = amdgpu_reset_perform_reset(tmp_adev, reset_context);
-- 
2.34.1



Re: [PATCH v2 2/2] drm/amdgpu: Fix the uninitialized variable warning

2024-04-25 Thread Lazar, Lijo



On 4/25/2024 3:30 PM, Ma Jun wrote:
> Initialize the phy_id to 0 to fix the warning of
> "Using uninitialized value phy_id"
> 
> Signed-off-by: Ma Jun 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> index 8ed0e073656f..53d85fafd8ab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
> @@ -95,7 +95,7 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
> file *f, const char __u
>   struct psp_context *psp = &adev->psp;
>   struct ta_securedisplay_cmd *securedisplay_cmd;
>   struct drm_device *dev = adev_to_drm(adev);
> - uint32_t phy_id;
> + uint32_t phy_id = 0;
>   uint32_t op;
>   char str[64];
>   int ret;
> @@ -135,6 +135,10 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
> file *f, const char __u
>   mutex_unlock(&psp->securedisplay_context.mutex);
>   break;
>   case 2:
> + if (size < 3) {
> + dev_err(adev->dev, "Invalid input: %s\n", str);
> + return -EINVAL;
> + }

Better to check the return of sscanf to see if the phy_id value was
successfully scanned; otherwise, return an error.

Thanks,
Lijo

>   mutex_lock(&psp->securedisplay_context.mutex);
>   psp_prep_securedisplay_cmd_buf(psp, &securedisplay_cmd,
>   TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);
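A minimal sketch of that suggestion, assuming the input string carries
both op and phy_id:

	/* Let sscanf tell us whether phy_id was actually parsed. */
	if (sscanf(str, "%u %u", &op, &phy_id) != 2) {
		dev_err(adev->dev, "Invalid input: %s\n", str);
		return -EINVAL;
	}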


[PATCH] drm/amdgpu: Fix out-of-bounds write warning

2024-04-25 Thread Ma Jun
Check the ring type value to fix the out-of-bounds
write warning.

Signed-off-by: Ma Jun 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 06f0a6534a94..1e0b5bb47bc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -353,6 +353,11 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
ring->hw_prio = hw_prio;
 
if (!ring->no_scheduler) {
+   if (ring->funcs->type >= AMDGPU_HW_IP_NUM) {
+   dev_warn(adev->dev, "ring type %d has no scheduler\n", 
ring->funcs->type);
+   return 0;
+   }
+
hw_ip = ring->funcs->type;
num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] =
-- 
2.34.1



[PATCH v2 2/2] drm/amdgpu: Fix the uninitialized variable warning

2024-04-25 Thread Ma Jun
Initialize the phy_id to 0 to fix the warning of
"Using uninitialized value phy_id"

Signed-off-by: Ma Jun 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
index 8ed0e073656f..53d85fafd8ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
@@ -95,7 +95,7 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct file 
*f, const char __u
struct psp_context *psp = &adev->psp;
struct ta_securedisplay_cmd *securedisplay_cmd;
struct drm_device *dev = adev_to_drm(adev);
-   uint32_t phy_id;
+   uint32_t phy_id = 0;
uint32_t op;
char str[64];
int ret;
@@ -135,6 +135,10 @@ static ssize_t amdgpu_securedisplay_debugfs_write(struct 
file *f, const char __u
mutex_unlock(&psp->securedisplay_context.mutex);
break;
case 2:
+   if (size < 3) {
+   dev_err(adev->dev, "Invalid input: %s\n", str);
+   return -EINVAL;
+   }
mutex_lock(&psp->securedisplay_context.mutex);
psp_prep_securedisplay_cmd_buf(psp, &securedisplay_cmd,
TA_SECUREDISPLAY_COMMAND__SEND_ROI_CRC);
-- 
2.34.1



[PATCH v2 1/2] drm/amdgpu: Fix uninitialized variable warning in amdgpu_afmt_acr

2024-04-25 Thread Ma Jun
Assign a value to clock to fix the warning below:
"Using uninitialized value res. Field res.clock is uninitialized"

Signed-off-by: Ma Jun 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c
index a4d65973bf7c..80771b1480ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_afmt.c
@@ -100,6 +100,7 @@ struct amdgpu_afmt_acr amdgpu_afmt_acr(uint32_t clock)
amdgpu_afmt_calc_cts(clock, &res.cts_32khz, &res.n_32khz, 32000);
amdgpu_afmt_calc_cts(clock, &res.cts_44_1khz, &res.n_44_1khz, 44100);
amdgpu_afmt_calc_cts(clock, &res.cts_48khz, &res.n_48khz, 48000);
+   res.clock = clock;
 
return res;
 }
-- 
2.34.1



[PATCH 5/5] drm/amdgpu/gfx: enable mes to map legacy queue support

2024-04-25 Thread Jack Xiao
Enable support for mes to map legacy queues.

Signed-off-by: Jack Xiao 
Reviewed-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 39 +
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index d9dc5485..172b7ba5d0a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -622,10 +622,28 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, int 
xcc_id)
queue_mask |= (1ull << 
amdgpu_queue_mask_bit_to_set_resource_bit(adev, i));
}
 
-   DRM_INFO("kiq ring mec %d pipe %d q %d\n", kiq_ring->me, kiq_ring->pipe,
-   kiq_ring->queue);
amdgpu_device_flush_hdp(adev, NULL);
 
+   if (adev->enable_mes)
+   queue_mask = ~0ULL;
+
+   if (adev->enable_mes) {
+   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+   j = i + xcc_id * adev->gfx.num_compute_rings;
+   r = amdgpu_mes_map_legacy_queue(adev,
+   
&adev->gfx.compute_ring[j]);
+   if (r) {
+   DRM_ERROR("failed to map compute queue\n");
+   return r;
+   }
+   }
+
+   return 0;
+   }
+
+   DRM_INFO("kiq ring mec %d pipe %d q %d\n", kiq_ring->me, kiq_ring->pipe,
+kiq_ring->queue);
+
spin_lock(&kiq->ring_lock);
r = amdgpu_ring_alloc(kiq_ring, kiq->pmf->map_queues_size *
adev->gfx.num_compute_rings +
@@ -636,9 +654,6 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev, int 
xcc_id)
return r;
}
 
-   if (adev->enable_mes)
-   queue_mask = ~0ULL;
-
kiq->pmf->kiq_set_resources(kiq_ring, queue_mask);
for (i = 0; i < adev->gfx.num_compute_rings; i++) {
j = i + xcc_id * adev->gfx.num_compute_rings;
@@ -665,6 +680,20 @@ int amdgpu_gfx_enable_kgq(struct amdgpu_device *adev, int 
xcc_id)
 
amdgpu_device_flush_hdp(adev, NULL);
 
+   if (adev->enable_mes) {
+   for (i = 0; i < adev->gfx.num_gfx_rings; i++) {
+   j = i + xcc_id * adev->gfx.num_gfx_rings;
+   r = amdgpu_mes_map_legacy_queue(adev,
+   &adev->gfx.gfx_ring[j]);
+   if (r) {
+   DRM_ERROR("failed to map gfx queue\n");
+   return r;
+   }
+   }
+
+   return 0;
+   }
+
spin_lock(&kiq->ring_lock);
/* No need to map kcq on the slave */
if (amdgpu_gfx_is_master_xcc(adev, xcc_id)) {
-- 
2.41.0



[PATCH 4/5] drm/amdgpu/mes11: adjust mes initialization sequence

2024-04-25 Thread Jack Xiao
Adjust the mes queue initialization to run before kgq/kcq initialization
so that mes can map the legacy queues.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 91e4e38b30c5..28a04f0f3541 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -52,7 +52,7 @@ MODULE_FIRMWARE("amdgpu/gc_11_5_0_mes1.bin");
 MODULE_FIRMWARE("amdgpu/gc_11_5_1_mes_2.bin");
 MODULE_FIRMWARE("amdgpu/gc_11_5_1_mes1.bin");
 
-
+static int mes_v11_0_hw_init(void *handle);
 static int mes_v11_0_hw_fini(void *handle);
 static int mes_v11_0_kiq_hw_init(struct amdgpu_device *adev);
 static int mes_v11_0_kiq_hw_fini(struct amdgpu_device *adev);
@@ -1292,6 +1292,10 @@ static int mes_v11_0_kiq_hw_init(struct amdgpu_device 
*adev)
if (r)
goto failure;
 
+   r = mes_v11_0_hw_init(adev);
+   if (r)
+   goto failure;
+
return r;
 
 failure:
@@ -1321,6 +1325,9 @@ static int mes_v11_0_hw_init(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   if (adev->mes.ring.sched.ready)
+   return 0;
+
if (!adev->enable_mes_kiq) {
if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
r = mes_v11_0_load_microcode(adev,
-- 
2.41.0



[PATCH 2/5] drm/amdgpu/mes11: update ADD_QUEUE interface

2024-04-25 Thread Jack Xiao
Update the ADD_QUEUE interface for mes11 to support
mes mapping of legacy queues.
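
As a sanity check on the flag-word layout (a sketch, not part of the patch):
the three new 1-bit flags are carved out of the old 17-bit reserved field,
so the overall width of the flag word is unchanged.

	/* Hypothetical compile-time check of the bit budget:
	 * old layout: 3 named flag bits + reserved:17 = 20 bits
	 * new layout: 6 named flag bits + reserved:14 = 20 bits
	 */
	_Static_assert(3 + 17 == 6 + 14, "ADD_QUEUE flag bits unchanged");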

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/include/mes_v11_api_def.h | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/mes_v11_api_def.h 
b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
index 410c8d664336..b72d5d362251 100644
--- a/drivers/gpu/drm/amd/include/mes_v11_api_def.h
+++ b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
@@ -299,10 +299,21 @@ union MESAPI__ADD_QUEUE {
uint32_t skip_process_ctx_clear : 1;
uint32_t map_legacy_kq  : 1;
uint32_t exclusively_scheduled  : 1;
-   uint32_t reserved   : 17;
+   uint32_t is_long_running: 1;
+   uint32_t is_dwm_queue   : 1;
+   uint32_t is_video_blit_queue: 1;
+   uint32_t reserved   : 14;
};
-   struct MES_API_STATUS   api_status;
-   uint64_ttma_addr;
+   struct MES_API_STATUS   api_status;
+   uint64_ttma_addr;
+   uint32_tsch_id;
+   uint64_ttimestamp;
+   uint32_tprocess_context_array_index;
+   uint32_tgang_context_array_index;
+   uint32_tpipe_id;
+   uint32_tqueue_id;
+   uint32_talignment_mode_setting;
+   uint64_tunmap_flag_addr;
};
 
uint32_tmax_dwords_in_api[API_FRAME_SIZE_IN_DWORDS];
-- 
2.41.0



[PATCH 3/5] drm/amdgpu/mes11: add mes mapping legacy queue support

2024-04-25 Thread Jack Xiao
Add mes11 map legacy queue packet submission.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 0d1407f25005..91e4e38b30c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -325,6 +325,31 @@ static int mes_v11_0_remove_hw_queue(struct amdgpu_mes 
*mes,
offsetof(union MESAPI__REMOVE_QUEUE, api_status));
 }
 
+static int mes_v11_0_map_legacy_queue(struct amdgpu_mes *mes,
+ struct mes_map_legacy_queue_input *input)
+{
+   union MESAPI__ADD_QUEUE mes_add_queue_pkt;
+
+   memset(&mes_add_queue_pkt, 0, sizeof(mes_add_queue_pkt));
+
+   mes_add_queue_pkt.header.type = MES_API_TYPE_SCHEDULER;
+   mes_add_queue_pkt.header.opcode = MES_SCH_API_ADD_QUEUE;
+   mes_add_queue_pkt.header.dwsize = API_FRAME_SIZE_IN_DWORDS;
+
+   mes_add_queue_pkt.pipe_id = input->pipe_id;
+   mes_add_queue_pkt.queue_id = input->queue_id;
+   mes_add_queue_pkt.doorbell_offset = input->doorbell_offset;
+   mes_add_queue_pkt.mqd_addr = input->mqd_addr;
+   mes_add_queue_pkt.wptr_addr = input->wptr_addr;
+   mes_add_queue_pkt.queue_type =
+   convert_to_mes_queue_type(input->queue_type);
+   mes_add_queue_pkt.map_legacy_kq = 1;
+
+   return mes_v11_0_submit_pkt_and_poll_completion(mes,
+   &mes_add_queue_pkt, sizeof(mes_add_queue_pkt),
+   offsetof(union MESAPI__ADD_QUEUE, api_status));
+}
+
 static int mes_v11_0_unmap_legacy_queue(struct amdgpu_mes *mes,
struct mes_unmap_legacy_queue_input *input)
 {
@@ -538,6 +563,7 @@ static int mes_v11_0_set_hw_resources_1(struct amdgpu_mes 
*mes)
 static const struct amdgpu_mes_funcs mes_v11_0_funcs = {
.add_hw_queue = mes_v11_0_add_hw_queue,
.remove_hw_queue = mes_v11_0_remove_hw_queue,
+   .map_legacy_queue = mes_v11_0_map_legacy_queue,
.unmap_legacy_queue = mes_v11_0_unmap_legacy_queue,
.suspend_gang = mes_v11_0_suspend_gang,
.resume_gang = mes_v11_0_resume_gang,
-- 
2.41.0



[PATCH 0/5] enable mes to map kgq/kcq

2024-04-25 Thread Jack Xiao
Jack Xiao (5):
  drm/amdgpu/mes: add mes mapping legacy queue support
  drm/amdgpu/mes11: update ADD_QUEUE interface
  drm/amdgpu/mes11: add mes mapping legacy queue support
  drm/amdgpu/mes11: adjust mes initialization sequence
  drm/amdgpu/gfx: enable mes to map legacy queue support

 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c   | 39 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c   | 22 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h   | 14 +++
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 35 -
 drivers/gpu/drm/amd/include/mes_v11_api_def.h | 17 ++--
 5 files changed, 118 insertions(+), 9 deletions(-)

-- 
2.41.0



[PATCH 1/5] drm/amdgpu/mes: add mes mapping legacy queue support

2024-04-25 Thread Jack Xiao
Add mes mapping legacy queue framework support.

Signed-off-by: Jack Xiao 
Reviewed-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 22 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 14 ++
 2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 8783b339066f..b22d50653899 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -786,6 +786,28 @@ int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, 
int queue_id)
return 0;
 }
 
+int amdgpu_mes_map_legacy_queue(struct amdgpu_device *adev,
+   struct amdgpu_ring *ring)
+{
+   struct mes_map_legacy_queue_input queue_input;
+   int r;
+
+   memset(&queue_input, 0, sizeof(queue_input));
+
+   queue_input.queue_type = ring->funcs->type;
+   queue_input.doorbell_offset = ring->doorbell_index;
+   queue_input.pipe_id = ring->pipe;
+   queue_input.queue_id = ring->queue;
+   queue_input.mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
+   queue_input.wptr_addr = ring->wptr_gpu_addr;
+
+   r = adev->mes.funcs->map_legacy_queue(&adev->mes, &queue_input);
+   if (r)
+   DRM_ERROR("failed to map legacy queue\n");
+
+   return r;
+}
+
 int amdgpu_mes_unmap_legacy_queue(struct amdgpu_device *adev,
  struct amdgpu_ring *ring,
  enum amdgpu_unmap_queues_action action,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index b99a2b3cffe3..df9f0404d842 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -248,6 +248,15 @@ struct mes_remove_queue_input {
uint64_tgang_context_addr;
 };
 
+struct mes_map_legacy_queue_input {
+   uint32_t   queue_type;
+   uint32_t   doorbell_offset;
+   uint32_t   pipe_id;
+   uint32_t   queue_id;
+   uint64_t   mqd_addr;
+   uint64_t   wptr_addr;
+};
+
 struct mes_unmap_legacy_queue_input {
enum amdgpu_unmap_queues_actionaction;
uint32_t   queue_type;
@@ -324,6 +333,9 @@ struct amdgpu_mes_funcs {
int (*remove_hw_queue)(struct amdgpu_mes *mes,
   struct mes_remove_queue_input *input);
 
+   int (*map_legacy_queue)(struct amdgpu_mes *mes,
+   struct mes_map_legacy_queue_input *input);
+
int (*unmap_legacy_queue)(struct amdgpu_mes *mes,
  struct mes_unmap_legacy_queue_input *input);
 
@@ -367,6 +379,8 @@ int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int 
gang_id,
int *queue_id);
 int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id);
 
+int amdgpu_mes_map_legacy_queue(struct amdgpu_device *adev,
+   struct amdgpu_ring *ring);
 int amdgpu_mes_unmap_legacy_queue(struct amdgpu_device *adev,
  struct amdgpu_ring *ring,
  enum amdgpu_unmap_queues_action action,
-- 
2.41.0



RE: [PATCH 4/4] drm/amdgpu: avoid dumping mca bank log multiple times during ras ISR

2024-04-25 Thread Wang, Yang(Kevin)
[AMD Official Use Only - General]

-Original Message-
From: Zhou1, Tao 
Sent: Thursday, April 25, 2024 4:31 PM
To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Li, Candice 
Subject: RE: [PATCH 4/4] drm/amdgpu: avoid dumping mca bank log multiple times 
during ras ISR

[AMD Official Use Only - General]

> -Original Message-
> From: Wang, Yang(Kevin) 
> Sent: Tuesday, April 23, 2024 4:27 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Zhou1, Tao
> ; Li, Candice 
> Subject: [PATCH 4/4] drm/amdgpu: avoid dumping mca bank log multiple times
> during ras ISR
>
> because the ue valid mca count will only be cleared after gpu reset,
> only dump the mca log the first time the mca bank is fetched after
> receiving a RAS interrupt.
>
> Signed-off-by: Yang Wang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c | 28
> +  drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h |
> 1 +
>  2 files changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> index 264f56fd4f66..b581523fa8d7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> @@ -229,6 +229,8 @@ int amdgpu_mca_init(struct amdgpu_device *adev)
>   struct mca_bank_cache *mca_cache;
>   int i;
>
> + atomic_set(&mca->ue_update_flag, 0);
> +
>   for (i = 0; i < ARRAY_SIZE(mca->mca_caches); i++) {
>   mca_cache = &mca->mca_caches[i];
>   mutex_init(&mca_cache->lock); @@ -244,6 +246,8 @@ void
> amdgpu_mca_fini(struct amdgpu_device *adev)
>   struct mca_bank_cache *mca_cache;
>   int i;
>
> + atomic_set(&mca->ue_update_flag, 0);
> +
>   for (i = 0; i < ARRAY_SIZE(mca->mca_caches); i++) {
>   mca_cache = &mca->mca_caches[i];
>   amdgpu_mca_bank_set_release(&mca_cache->mca_set);
> @@ -325,6 +329,27 @@ static int amdgpu_mca_smu_get_mca_entry(struct
> amdgpu_device *adev, enum amdgpu_
>   return mca_funcs->mca_get_mca_entry(adev, type, idx, entry);  }
>
> +static bool amdgpu_mca_bank_should_update(struct amdgpu_device *adev,
> +enum amdgpu_mca_error_type type) {
> + struct amdgpu_mca *mca = &adev->mca;
> + bool ret = true;
> +
> + /*
> +  * Because the UE Valid MCA count will only be cleared after reset,
> +  * in order to avoid repeated counting of the error count,
> +  * the aca bank is only updated once during the gpu recovery stage.
> +  */
> + if (type == AMDGPU_MCA_ERROR_TYPE_UE) {
> + if (amdgpu_ras_intr_triggered())
> + ret = atomic_cmpxchg(&mca->ue_update_flag, 0, 1)
> + ==
> 0;
> + else
> + atomic_set(&mca->ue_update_flag, 0);
> + }
> +
> + return ret;
> +}
> +
> +

[Tao] redundant line, with this fixed, the patch is:

Reviewed-by: Tao Zhou 

[Kevin]:
Thanks for reminding me.

Best Regards,
Kevin

>  static int amdgpu_mca_smu_get_mca_set(struct amdgpu_device *adev,
> enum amdgpu_mca_error_type type, struct mca_bank_set *mca_set,
> struct ras_query_context *qctx)  {
> @@ -
> 335,6 +360,9 @@ static int amdgpu_mca_smu_get_mca_set(struct
> amdgpu_device *adev, enum amdgpu_mc
>   if (!mca_set)
>   return -EINVAL;
>
> + if (!amdgpu_mca_bank_should_update(adev, type))
> + return 0;
> +
>   ret = amdgpu_mca_smu_get_valid_mca_count(adev, type, &count);
>   if (ret)
>   return ret;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> index 9b97cfa28e05..e80323ff90c1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> @@ -93,6 +93,7 @@ struct amdgpu_mca {
>   struct amdgpu_mca_ras mpio;
>   const struct amdgpu_mca_smu_funcs *mca_funcs;
>   struct mca_bank_cache mca_caches[AMDGPU_MCA_ERROR_TYPE_DE];
> + atomic_t ue_update_flag;
>  };
>
>  enum mca_reg_idx {
> --
> 2.34.1




Re: [PATCH v4] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-25 Thread Christian König

Am 25.04.24 um 10:15 schrieb Arunpravin Paneer Selvam:

Now we have two flags for contiguous VRAM buffer allocation.
If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the
buffer's placement function.

This patch will change the default behaviour of the two flags.

When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
- This means contiguous is not mandatory.
- we will try to allocate the contiguous buffer. Say if the
   allocation fails, we fall back to allocating the individual pages.

When we set TTM_PL_FLAG_CONTIGUOUS
- This means contiguous allocation is mandatory.
- we are setting this in amdgpu_bo_pin_restricted() before bo validation
   and check this flag in the vram manager file.
- if this is set, we should allocate the buffer pages contiguously.
   If the allocation fails, we return -ENOSPC.

v2:
   - keep the mem_flags and bo->flags check as is(Christian)
   - place the TTM_PL_FLAG_CONTIGUOUS flag setting into the
 amdgpu_bo_pin_restricted function placement range iteration
 loop(Christian)
   - rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block
 (Christian)
   - Keep the kernel BO allocation as is (Christian)
   - If BO pin vram allocation failed, we need to return -ENOSPC as
 RDMA cannot work with scattered VRAM pages(Philip)

v3(Christian):
   - keep contiguous flag handling outside of pages_per_block
 calculation
   - remove the hacky implementation in contiguous flag error
 handling code

v4(Christian):
   - use any variable and return value for non-contiguous
 fallback

Signed-off-by: Arunpravin Paneer Selvam 
Suggested-by: Christian König 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |  8 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 22 ++--
  2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 492aebc44e51..c594d2a5978e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -154,8 +154,10 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo 
*abo, u32 domain)
else
places[c].flags |= TTM_PL_FLAG_TOPDOWN;
  
-		if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)

+   if (abo->tbo.type == ttm_bo_type_kernel &&
+   flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
places[c].flags |= TTM_PL_FLAG_CONTIGUOUS;
+
c++;
}
  
@@ -965,6 +967,10 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,

if (!bo->placements[i].lpfn ||
(lpfn && lpfn < bo->placements[i].lpfn))
bo->placements[i].lpfn = lpfn;
+
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->placements[i].mem_type == TTM_PL_VRAM)
+   bo->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
}
  
  	r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index e494f5bf136a..6c30eceec896 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -469,7 +469,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
if (tbo->type != ttm_bo_type_kernel)
max_bytes -= AMDGPU_VM_RESERVED_VRAM;
  
-	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {

+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) {
pages_per_block = ~0ul;
} else {
  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -478,7 +478,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
/* default to 2MB */
pages_per_block = 2UL << (20UL - PAGE_SHIFT);
  #endif
-   pages_per_block = max_t(uint32_t, pages_per_block,
+   pages_per_block = max_t(u32, pages_per_block,
tbo->page_alignment);
}
  
@@ -499,7 +499,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,

if (place->flags & TTM_PL_FLAG_TOPDOWN)
vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
  
-	if (place->flags & TTM_PL_FLAG_CONTIGUOUS)

+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
vres->flags |= DRM_BUDDY_CONTIGUOUS_ALLOCATION;
  
  	if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED)

@@ -518,21 +518,31 @@ static int amdgpu_vram_mgr_new(struct 
ttm_resource_manager *man,
else
min_block_size = mgr->default_page_size;
  
-		BUG_ON(min_block_size < mm->chunk_size);

-
/* Limit maximum size to 2GiB due to SG table limitations */
size = min(remaining_size, 2ULL << 30);
  
  		if ((size >= (u64)pages_per_block << PAGE_

RE: [PATCH 4/4] drm/amdgpu: avoid dumping mca bank log multiple times during ras ISR

2024-04-25 Thread Zhou1, Tao
[AMD Official Use Only - General]

> -Original Message-
> From: Wang, Yang(Kevin) 
> Sent: Tuesday, April 23, 2024 4:27 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Zhou1, Tao
> ; Li, Candice 
> Subject: [PATCH 4/4] drm/amdgpu: avoid dumping mca bank log multiple times
> during ras ISR
>
> because the ue valid mca count will only be cleared after gpu reset, only
> dump the mca log the first time the mca bank is fetched after receiving a
> RAS interrupt.
>
> Signed-off-by: Yang Wang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c | 28
> +  drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h |
> 1 +
>  2 files changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> index 264f56fd4f66..b581523fa8d7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> @@ -229,6 +229,8 @@ int amdgpu_mca_init(struct amdgpu_device *adev)
>   struct mca_bank_cache *mca_cache;
>   int i;
>
> + atomic_set(&mca->ue_update_flag, 0);
> +
>   for (i = 0; i < ARRAY_SIZE(mca->mca_caches); i++) {
>   mca_cache = &mca->mca_caches[i];
>   mutex_init(&mca_cache->lock);
> @@ -244,6 +246,8 @@ void amdgpu_mca_fini(struct amdgpu_device *adev)
>   struct mca_bank_cache *mca_cache;
>   int i;
>
> + atomic_set(&mca->ue_update_flag, 0);
> +
>   for (i = 0; i < ARRAY_SIZE(mca->mca_caches); i++) {
>   mca_cache = &mca->mca_caches[i];
>   amdgpu_mca_bank_set_release(&mca_cache->mca_set);
> @@ -325,6 +329,27 @@ static int amdgpu_mca_smu_get_mca_entry(struct
> amdgpu_device *adev, enum amdgpu_
>   return mca_funcs->mca_get_mca_entry(adev, type, idx, entry);  }
>
> +static bool amdgpu_mca_bank_should_update(struct amdgpu_device *adev,
> +enum amdgpu_mca_error_type type) {
> + struct amdgpu_mca *mca = &adev->mca;
> + bool ret = true;
> +
> + /*
> +  * Because the UE Valid MCA count will only be cleared after reset,
> +  * in order to avoid repeated counting of the error count,
> +  * the aca bank is only updated once during the gpu recovery stage.
> +  */
> + if (type == AMDGPU_MCA_ERROR_TYPE_UE) {
> + if (amdgpu_ras_intr_triggered())
> + ret = atomic_cmpxchg(&mca->ue_update_flag, 0, 1) ==
> 0;
> + else
> + atomic_set(&mca->ue_update_flag, 0);
> + }
> +
> + return ret;
> +}
> +
> +

[Tao] redundant line, with this fixed, the patch is:

Reviewed-by: Tao Zhou 

>  static int amdgpu_mca_smu_get_mca_set(struct amdgpu_device *adev, enum
> amdgpu_mca_error_type type, struct mca_bank_set *mca_set,
> struct ras_query_context *qctx)  { @@ -
> 335,6 +360,9 @@ static int amdgpu_mca_smu_get_mca_set(struct
> amdgpu_device *adev, enum amdgpu_mc
>   if (!mca_set)
>   return -EINVAL;
>
> + if (!amdgpu_mca_bank_should_update(adev, type))
> + return 0;
> +
>   ret = amdgpu_mca_smu_get_valid_mca_count(adev, type, &count);
>   if (ret)
>   return ret;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> index 9b97cfa28e05..e80323ff90c1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> @@ -93,6 +93,7 @@ struct amdgpu_mca {
>   struct amdgpu_mca_ras mpio;
>   const struct amdgpu_mca_smu_funcs *mca_funcs;
>   struct mca_bank_cache mca_caches[AMDGPU_MCA_ERROR_TYPE_DE];
> + atomic_t ue_update_flag;
>  };
>
>  enum mca_reg_idx {
> --
> 2.34.1



Re: [PATCH v3] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-25 Thread Paneer Selvam, Arunpravin

Hi Christian,

On 4/24/2024 2:02 PM, Christian König wrote:

Am 24.04.24 um 09:13 schrieb Arunpravin Paneer Selvam:

Now we have two flags for contiguous VRAM buffer allocation.
If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the
buffer's placement function.

This patch will change the default behaviour of the two flags.

When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
- This means contiguous is not mandatory.
- we will try to allocate the contiguous buffer. Say if the
   allocation fails, we fall back to allocating the individual pages.

When we set TTM_PL_FLAG_CONTIGUOUS
- This means contiguous allocation is mandatory.
- we are setting this in amdgpu_bo_pin_restricted() before bo validation
   and check this flag in the vram manager file.
- if this is set, we should allocate the buffer pages contiguously.
   If the allocation fails, we return -ENOSPC.

v2:
   - keep the mem_flags and bo->flags check as is(Christian)
   - place the TTM_PL_FLAG_CONTIGUOUS flag setting into the
 amdgpu_bo_pin_restricted function placement range iteration
 loop(Christian)
   - rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block
 (Christian)
   - Keep the kernel BO allocation as is (Christian)
   - If BO pin vram allocation failed, we need to return -ENOSPC as
 RDMA cannot work with scattered VRAM pages(Philip)

v3(Christian):
   - keep contiguous flag handling outside of pages_per_block
 calculation
   - remove the hacky implementation in contiguous flag error
 handling code

Signed-off-by: Arunpravin Paneer Selvam 


Suggested-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |  8 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 83 ++--
  2 files changed, 65 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c

index 492aebc44e51..c594d2a5978e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -154,8 +154,10 @@ void amdgpu_bo_placement_from_domain(struct 
amdgpu_bo *abo, u32 domain)

  else
  places[c].flags |= TTM_PL_FLAG_TOPDOWN;
  -    if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
+    if (abo->tbo.type == ttm_bo_type_kernel &&
+    flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
  places[c].flags |= TTM_PL_FLAG_CONTIGUOUS;
+
  c++;
  }
  @@ -965,6 +967,10 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo 
*bo, u32 domain,

  if (!bo->placements[i].lpfn ||
  (lpfn && lpfn < bo->placements[i].lpfn))
  bo->placements[i].lpfn = lpfn;
+
+    if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+    bo->placements[i].mem_type == TTM_PL_VRAM)
+    bo->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
  }
    r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c

index e494f5bf136a..17c5d9ce9927 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -88,6 +88,23 @@ static inline u64 
amdgpu_vram_mgr_blocks_size(struct list_head *head)

  return size;
  }
  +static inline void amdgpu_vram_mgr_limit_min_block_size(unsigned 
long pages_per_block,

+    u64 size,
+    u64 *min_block_size,
+    bool contiguous_enabled)
+{
+    if (contiguous_enabled)
+    return;
+
+    /*
+ * if size >= 2MiB, limit the min_block_size to 2MiB
+ * for better TLB usage.
+ */
+    if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
+    !(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
+    *min_block_size = (u64)pages_per_block << PAGE_SHIFT;
+}
+
  /**
   * DOC: mem_info_vram_total
   *
@@ -452,11 +469,12 @@ static int amdgpu_vram_mgr_new(struct 
ttm_resource_manager *man,

  struct amdgpu_device *adev = to_amdgpu_device(mgr);
  struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
  u64 vis_usage = 0, max_bytes, min_block_size;
+    struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
  struct amdgpu_vram_mgr_resource *vres;
  u64 size, remaining_size, lpfn, fpfn;
  struct drm_buddy *mm = &mgr->mm;
-    struct drm_buddy_block *block;
  unsigned long pages_per_block;
+    struct drm_buddy_block *block;
  int r;
    lpfn = (u64)place->lpfn << PAGE_SHIFT;
@@ -469,18 +487,14 @@ static int amdgpu_vram_mgr_new(struct 
ttm_resource_manager *man,

  if (tbo->type != ttm_bo_type_kernel)
  max_bytes -= AMDGPU_VM_RESERVED_VRAM;
  -    if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
-    pages_per_block = ~0ul;
-    } else {
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+    if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
  pages_per_block = HPAGE_PMD_NR;


That won't work like this.

[PATCH v4] drm/amdgpu: Modify the contiguous flags behaviour

2024-04-25 Thread Arunpravin Paneer Selvam
Now we have two flags for contiguous VRAM buffer allocation.
If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the
buffer's placement function.

This patch will change the default behaviour of the two flags.

When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
- This means contiguous is not mandatory.
- we will try to allocate the contiguous buffer. Say if the
  allocation fails, we fall back to allocating the individual pages.

When we set TTM_PL_FLAG_CONTIGUOUS
- This means contiguous allocation is mandatory.
- we are setting this in amdgpu_bo_pin_restricted() before bo validation
  and check this flag in the vram manager file.
- if this is set, we should allocate the buffer pages contiguously.
  If the allocation fails, we return -ENOSPC.
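
A condensed sketch of the resulting vram_mgr decision follows (names match
the patch, but the retry plumbing is illustrative rather than the literal
diff):

	/* AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS alone is a hint: request one
	 * buddy block first, then retry scattered on failure. Only the
	 * hard TTM_PL_FLAG_CONTIGUOUS set at pin time makes -ENOSPC final.
	 */
	if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
		vres->flags |= DRM_BUDDY_CONTIGUOUS_ALLOCATION;

	r = drm_buddy_alloc_blocks(mm, fpfn, lpfn, size, min_block_size,
				   &vres->blocks, vres->flags);
	if (r == -ENOSPC &&
	    (vres->flags & DRM_BUDDY_CONTIGUOUS_ALLOCATION) &&
	    !(place->flags & TTM_PL_FLAG_CONTIGUOUS)) {
		/* hint only: drop the contiguous requirement and retry */
		vres->flags &= ~DRM_BUDDY_CONTIGUOUS_ALLOCATION;
		r = drm_buddy_alloc_blocks(mm, fpfn, lpfn, size,
					   min_block_size, &vres->blocks,
					   vres->flags);
	}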

v2:
  - keep the mem_flags and bo->flags check as is(Christian)
  - place the TTM_PL_FLAG_CONTIGUOUS flag setting into the
amdgpu_bo_pin_restricted function placement range iteration
loop(Christian)
  - rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block
(Christian)
  - Keep the kernel BO allocation as is (Christian)
  - If BO pin vram allocation failed, we need to return -ENOSPC as
RDMA cannot work with scattered VRAM pages(Philip)

v3(Christian):
  - keep contiguous flag handling outside of pages_per_block
calculation
  - remove the hacky implementation in contiguous flag error
handling code

v4(Christian):
  - use any variable and return value for non-contiguous
fallback

Signed-off-by: Arunpravin Paneer Selvam 
Suggested-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |  8 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 22 ++--
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 492aebc44e51..c594d2a5978e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -154,8 +154,10 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo 
*abo, u32 domain)
else
places[c].flags |= TTM_PL_FLAG_TOPDOWN;
 
-   if (flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
+   if (abo->tbo.type == ttm_bo_type_kernel &&
+   flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
places[c].flags |= TTM_PL_FLAG_CONTIGUOUS;
+
c++;
}
 
@@ -965,6 +967,10 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 
domain,
if (!bo->placements[i].lpfn ||
(lpfn && lpfn < bo->placements[i].lpfn))
bo->placements[i].lpfn = lpfn;
+
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS &&
+   bo->placements[i].mem_type == TTM_PL_VRAM)
+   bo->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
}
 
r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index e494f5bf136a..6c30eceec896 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -469,7 +469,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
if (tbo->type != ttm_bo_type_kernel)
max_bytes -= AMDGPU_VM_RESERVED_VRAM;
 
-   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) {
pages_per_block = ~0ul;
} else {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -478,7 +478,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
/* default to 2MB */
pages_per_block = 2UL << (20UL - PAGE_SHIFT);
 #endif
-   pages_per_block = max_t(uint32_t, pages_per_block,
+   pages_per_block = max_t(u32, pages_per_block,
tbo->page_alignment);
}
 
@@ -499,7 +499,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
if (place->flags & TTM_PL_FLAG_TOPDOWN)
vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
 
-   if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
+   if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)
vres->flags |= DRM_BUDDY_CONTIGUOUS_ALLOCATION;
 
if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED)
@@ -518,21 +518,31 @@ static int amdgpu_vram_mgr_new(struct 
ttm_resource_manager *man,
else
min_block_size = mgr->default_page_size;
 
-   BUG_ON(min_block_size < mm->chunk_size);
-
/* Limit maximum size to 2GiB due to SG table limitations */
size = min(remaining_size, 2ULL << 30);
 
if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
-   !(size & (((u64)pages_per_b

[PATCH] drm/amd/display: Remove duplicate dcn401/dcn401_clk_mgr.h header

2024-04-25 Thread Jiapeng Chong
./drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c: 
dcn401/dcn401_clk_mgr.h is included more than once.

Reported-by: Abaci Robot 
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8885
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c
index d146c35f6d60..005092b0a0cb 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn401/dcn401_clk_mgr.c
@@ -21,7 +21,6 @@
 #include "dcn/dcn_4_1_0_offset.h"
 #include "dcn/dcn_4_1_0_sh_mask.h"
 
-#include "dcn401/dcn401_clk_mgr.h"
 #include "dml/dcn401/dcn401_fpu.h"
 
 #define mmCLK01_CLK0_CLK_PLL_REQ0x16E37
-- 
2.20.1.7.g153144c



Re: [PATCH 1/2] drm/print: drop include debugfs.h and include where needed

2024-04-25 Thread Jani Nikula
On Mon, 22 Apr 2024, Jani Nikula  wrote:
> Surprisingly many places depend on debugfs.h to be included via
> drm_print.h. Fix them.
>
> v3: Also fix armada, ite-it6505, imagination, msm, sti, vc4, and xe
>
> v2: Also fix ivpu and vmwgfx
>
> Reviewed-by: Andrzej Hajda 
> Acked-by: Maxime Ripard 
> Link: 
> https://patchwork.freedesktop.org/patch/msgid/20240410141434.157908-1-jani.nik...@intel.com
> Signed-off-by: Jani Nikula 

While the changes all over the place are small, mostly just adding the
debugfs.h include, please consider acking. I've sent this a few times
already.

Otherwise, I'll merge this by the end of the week, acks or not.

Thanks,
Jani.



>
> ---
>
> Cc: Jacek Lawrynowicz 
> Cc: Stanislaw Gruszka 
> Cc: Oded Gabbay 
> Cc: Russell King 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Andrzej Hajda 
> Cc: Neil Armstrong 
> Cc: Robert Foss 
> Cc: Laurent Pinchart 
> Cc: Jonas Karlman 
> Cc: Jernej Skrabec 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: Jani Nikula 
> Cc: Rodrigo Vivi 
> Cc: Joonas Lahtinen 
> Cc: Tvrtko Ursulin 
> Cc: Frank Binns 
> Cc: Matt Coster 
> Cc: Rob Clark 
> Cc: Abhinav Kumar 
> Cc: Dmitry Baryshkov 
> Cc: Sean Paul 
> Cc: Marijn Suijten 
> Cc: Karol Herbst 
> Cc: Lyude Paul 
> Cc: Danilo Krummrich 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: "Pan, Xinhui" 
> Cc: Alain Volmat 
> Cc: Huang Rui 
> Cc: Zack Rusin 
> Cc: Broadcom internal kernel review list 
> 
> Cc: Lucas De Marchi 
> Cc: "Thomas Hellström" 
> Cc: dri-de...@lists.freedesktop.org
> Cc: intel-...@lists.freedesktop.org
> Cc: intel...@lists.freedesktop.org
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedr...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> ---
>  drivers/accel/ivpu/ivpu_debugfs.c   | 2 ++
>  drivers/gpu/drm/armada/armada_debugfs.c | 1 +
>  drivers/gpu/drm/bridge/ite-it6505.c | 1 +
>  drivers/gpu/drm/bridge/panel.c  | 2 ++
>  drivers/gpu/drm/drm_print.c | 6 +++---
>  drivers/gpu/drm/i915/display/intel_dmc.c| 1 +
>  drivers/gpu/drm/imagination/pvr_fw_trace.c  | 1 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c | 2 ++
>  drivers/gpu/drm/nouveau/dispnv50/crc.c  | 2 ++
>  drivers/gpu/drm/radeon/r100.c   | 1 +
>  drivers/gpu/drm/radeon/r300.c   | 1 +
>  drivers/gpu/drm/radeon/r420.c   | 1 +
>  drivers/gpu/drm/radeon/r600.c   | 3 ++-
>  drivers/gpu/drm/radeon/radeon_fence.c   | 1 +
>  drivers/gpu/drm/radeon/radeon_gem.c | 1 +
>  drivers/gpu/drm/radeon/radeon_ib.c  | 2 ++
>  drivers/gpu/drm/radeon/radeon_pm.c  | 1 +
>  drivers/gpu/drm/radeon/radeon_ring.c| 2 ++
>  drivers/gpu/drm/radeon/radeon_ttm.c | 1 +
>  drivers/gpu/drm/radeon/rs400.c  | 1 +
>  drivers/gpu/drm/radeon/rv515.c  | 1 +
>  drivers/gpu/drm/sti/sti_drv.c   | 1 +
>  drivers/gpu/drm/ttm/ttm_device.c| 1 +
>  drivers/gpu/drm/ttm/ttm_resource.c  | 3 ++-
>  drivers/gpu/drm/ttm/ttm_tt.c| 5 +++--
>  drivers/gpu/drm/vc4/vc4_drv.h   | 1 +
>  drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 2 ++
>  drivers/gpu/drm/xe/xe_debugfs.c | 1 +
>  drivers/gpu/drm/xe/xe_gt_debugfs.c  | 2 ++
>  drivers/gpu/drm/xe/xe_uc_debugfs.c  | 2 ++
>  include/drm/drm_print.h | 2 +-
>  31 files changed, 46 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/accel/ivpu/ivpu_debugfs.c 
> b/drivers/accel/ivpu/ivpu_debugfs.c
> index d09d29775b3f..e07e447d08d1 100644
> --- a/drivers/accel/ivpu/ivpu_debugfs.c
> +++ b/drivers/accel/ivpu/ivpu_debugfs.c
> @@ -3,6 +3,8 @@
>   * Copyright (C) 2020-2023 Intel Corporation
>   */
>  
> +#include 
> +
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/armada/armada_debugfs.c 
> b/drivers/gpu/drm/armada/armada_debugfs.c
> index 29f4b52e3c8d..a763349dd89f 100644
> --- a/drivers/gpu/drm/armada/armada_debugfs.c
> +++ b/drivers/gpu/drm/armada/armada_debugfs.c
> @@ -5,6 +5,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/bridge/ite-it6505.c 
> b/drivers/gpu/drm/bridge/ite-it6505.c
> index 27334173e911..3f68c82888c2 100644
> --- a/drivers/gpu/drm/bridge/ite-it6505.c
> +++ b/drivers/gpu/drm/bridge/ite-it6505.c
> @@ -3,6 +3,7 @@
>   * Copyright (c) 2020, The Linux Foundation. All rights reserved.
>   */
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
> index 7f41525f7a6e..32506524d9a2 100644
> --- a/drivers/gpu/drm/bridge/panel.c
> +++ b/drivers/gpu/drm/bridge/panel.c
> @@ -4,6 +4,8 @@
>   * Copyright (C) 2017 Broadcom
>   */
>  
> +#include 
> +
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 699b7dbffd7b..cf2efb44722c 100644
> --- a/driver

Re: [PATCH] drm/amd/display: re-indent dc_power_down_on_boot()

2024-04-25 Thread Dan Carpenter
On Wed, Apr 24, 2024 at 03:11:08PM +0200, Christian König wrote:
> Am 24.04.24 um 13:41 schrieb Dan Carpenter:
> > These lines are indented too far.  Clean the whitespace.
> > 
> > Signed-off-by: Dan Carpenter 
> > ---
> >   drivers/gpu/drm/amd/display/dc/core/dc.c | 7 +++
> >   1 file changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
> > b/drivers/gpu/drm/amd/display/dc/core/dc.c
> > index 8eefba757da4..f64d7229eb6c 100644
> > --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> > +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> > @@ -5043,11 +5043,10 @@ void dc_interrupt_ack(struct dc *dc, enum 
> > dc_irq_source src)
> >   void dc_power_down_on_boot(struct dc *dc)
> >   {
> > if (dc->ctx->dce_environment != DCE_ENV_VIRTUAL_HW &&
> > -   dc->hwss.power_down_on_boot) {
> > -
> > -   if (dc->caps.ips_support)
> > -   dc_exit_ips_for_hw_access(dc);
> > +   dc->hwss.power_down_on_boot) {
> > +   if (dc->caps.ips_support)
> > +   dc_exit_ips_for_hw_access(dc);
> 
> Well while at it can't the two ifs be merged here?
> 
> (I don't know this code to well, but it looks like it).
> 

I'm sorry, I don't see what you're saying.

I probably should have deleted the other blank line as well, though.
It introduces a checkpatch.pl --strict warning.

regards,
dan carpenter



[PATCH] drm/amd/display: Remove duplicate spl/dc_spl_types.h header

2024-04-25 Thread Jiapeng Chong
./drivers/gpu/drm/amd/display/dc/inc/hw/transform.h: spl/dc_spl_types.h is 
included more than once.

Reported-by: Abaci Robot 
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8884
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/display/dc/inc/hw/transform.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/transform.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/transform.h
index 5aa2f1a1fb83..28da1dddf0a0 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/transform.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/transform.h
@@ -31,8 +31,6 @@
 #include "fixed31_32.h"
 #include "spl/dc_spl_types.h"
 
-#include "spl/dc_spl_types.h"
-
 #define CSC_TEMPERATURE_MATRIX_SIZE 12
 
 struct bit_depth_reduction_params;
-- 
2.19.1.6.gb485710b



[PATCH][next] drm/amdgpu: Fix spelling mistake "PRORITY" -> "PRIORITY"

2024-04-25 Thread Colin Ian King
There are spelling mistakes in a literal string and enums, fix these.
Currently there are no uses of the enums that got renamed in this fix.

Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 2 +-
 drivers/gpu/drm/amd/include/mes_api_def.h | 2 +-
 drivers/gpu/drm/amd/include/mes_v11_api_def.h | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index fbe31afad1d4..44f1af6da21e 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -111,7 +111,7 @@ static const char *mes_v11_0_opcodes[] = {
"RESUME",
"RESET",
"SET_LOG_BUFFER",
-   "CHANGE_GANG_PRORITY",
+   "CHANGE_GANG_PRIORITY",
"QUERY_SCHEDULER_STATUS",
"PROGRAM_GDS",
"SET_DEBUG_VMID",
diff --git a/drivers/gpu/drm/amd/include/mes_api_def.h 
b/drivers/gpu/drm/amd/include/mes_api_def.h
index bf3d6ad263f9..ed479575df18 100644
--- a/drivers/gpu/drm/amd/include/mes_api_def.h
+++ b/drivers/gpu/drm/amd/include/mes_api_def.h
@@ -54,7 +54,7 @@ enum MES_SCH_API_OPCODE {
MES_SCH_API_RESUME  = 7,
MES_SCH_API_RESET   = 8,
MES_SCH_API_SET_LOG_BUFFER  = 9,
-   MES_SCH_API_CHANGE_GANG_PRORITY = 10,
+   MES_SCH_API_CHANGE_GANG_PRIORITY= 10,
MES_SCH_API_QUERY_SCHEDULER_STATUS  = 11,
MES_SCH_API_PROGRAM_GDS = 12,
MES_SCH_API_SET_DEBUG_VMID  = 13,
diff --git a/drivers/gpu/drm/amd/include/mes_v11_api_def.h 
b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
index 410c8d664336..5b8fd9465cf3 100644
--- a/drivers/gpu/drm/amd/include/mes_v11_api_def.h
+++ b/drivers/gpu/drm/amd/include/mes_v11_api_def.h
@@ -54,7 +54,7 @@ enum MES_SCH_API_OPCODE {
MES_SCH_API_RESUME  = 7,
MES_SCH_API_RESET   = 8,
MES_SCH_API_SET_LOG_BUFFER  = 9,
-   MES_SCH_API_CHANGE_GANG_PRORITY = 10,
+   MES_SCH_API_CHANGE_GANG_PRIORITY= 10,
MES_SCH_API_QUERY_SCHEDULER_STATUS  = 11,
MES_SCH_API_PROGRAM_GDS = 12,
MES_SCH_API_SET_DEBUG_VMID  = 13,
-- 
2.39.2



Re: [PATCH] drm/amd/display: re-indent dc_power_down_on_boot()

2024-04-25 Thread Dan Carpenter
On Wed, Apr 24, 2024 at 03:33:11PM +0200, Christian König wrote:
> Am 24.04.24 um 15:20 schrieb Dan Carpenter:
> > On Wed, Apr 24, 2024 at 03:11:08PM +0200, Christian König wrote:
> > > Am 24.04.24 um 13:41 schrieb Dan Carpenter:
> > > > These lines are indented too far.  Clean the whitespace.
> > > > 
> > > > Signed-off-by: Dan Carpenter 
> > > > ---
> > > >drivers/gpu/drm/amd/display/dc/core/dc.c | 7 +++
> > > >1 file changed, 3 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
> > > > b/drivers/gpu/drm/amd/display/dc/core/dc.c
> > > > index 8eefba757da4..f64d7229eb6c 100644
> > > > --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> > > > +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> > > > @@ -5043,11 +5043,10 @@ void dc_interrupt_ack(struct dc *dc, enum 
> > > > dc_irq_source src)
> > > >void dc_power_down_on_boot(struct dc *dc)
> > > >{
> > > > if (dc->ctx->dce_environment != DCE_ENV_VIRTUAL_HW &&
> > > > -   dc->hwss.power_down_on_boot) {
> > > > -
> > > > -   if (dc->caps.ips_support)
> > > > -   dc_exit_ips_for_hw_access(dc);
> > > > +   dc->hwss.power_down_on_boot) {
> > > > +   if (dc->caps.ips_support)
> > > > +   dc_exit_ips_for_hw_access(dc);
> > > Well while at it can't the two ifs be merged here?
> > > 
> > > (I don't know this code to well, but it looks like it).
> > > 
> > I'm sorry, I don't see what you're saying.
> 
The indentation was so messed up that I thought the call to
> power_down_on_boot() was after both ifs, but it is still inside the first.
> 
> So your patch is actually right, sorry for the noise.

Okay, but let me send a v2 anyway to delete the extra blank line.

regards,
dan carpenter



[PATCH] drm/amd/display: re-indent dc_power_down_on_boot()

2024-04-25 Thread Dan Carpenter
These lines are indented too far.  Clean the whitespace.

Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 8eefba757da4..f64d7229eb6c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -5043,11 +5043,10 @@ void dc_interrupt_ack(struct dc *dc, enum dc_irq_source 
src)
 void dc_power_down_on_boot(struct dc *dc)
 {
if (dc->ctx->dce_environment != DCE_ENV_VIRTUAL_HW &&
-   dc->hwss.power_down_on_boot) {
-
-   if (dc->caps.ips_support)
-   dc_exit_ips_for_hw_access(dc);
+   dc->hwss.power_down_on_boot) {
 
+   if (dc->caps.ips_support)
+   dc_exit_ips_for_hw_access(dc);
dc->hwss.power_down_on_boot(dc);
}
 }
-- 
2.43.0



[PATCH][next] drm/amd/display: Fix various spelling mistakes

2024-04-25 Thread Colin Ian King
There are various spelling mistakes in dml2_printf messages, fix them.

Signed-off-by: Colin Ian King 
---
 .../dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c  | 6 +++---
 .../display/dc/dml2/dml21/src/dml2_core/dml2_core_shared.c  | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git 
a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c
 
b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c
index 846b0ae48596..2dea5965d02f 100644
--- 
a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c
+++ 
b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c
@@ -5566,7 +5566,7 @@ static bool CalculatePrefetchSchedule(struct 
dml2_core_internal_scratch *scratch
dml2_printf("DML: Tvm: %fus - time to fetch vm\n", 
s->TimeForFetchingVM);
dml2_printf("DML: Tr0: %fus - time to fetch first row of data 
pagetables\n", s->TimeForFetchingRowInVBlank);
dml2_printf("DML: Tsw: %fus = time to fetch enough pixel data 
and cursor data to feed the scalers init position and detile\n", 
(double)s->LinesToRequestPrefetchPixelData * s->LineTime);
-   dml2_printf("DML: To: %fus - time for propogation from scaler 
to optc\n", (*p->DSTYAfterScaler + ((double)(*p->DSTXAfterScaler) / 
(double)p->myPipe->HTotal)) * s->LineTime);
+   dml2_printf("DML: To: %fus - time for propagation from scaler 
to optc\n", (*p->DSTYAfterScaler + ((double)(*p->DSTXAfterScaler) / 
(double)p->myPipe->HTotal)) * s->LineTime);
dml2_printf("DML: Tvstartup - TSetup - Tcalc - TWait - Tpre - 
To > 0\n");
dml2_printf("DML: Tslack(pre): %fus - time left over in 
schedule\n", p->VStartup * s->LineTime - s->TimeForFetchingVM - 2 * 
s->TimeForFetchingRowInVBlank - (*p->DSTYAfterScaler + 
((double)(*p->DSTXAfterScaler) / (double)p->myPipe->HTotal)) * s->LineTime - 
p->TWait - p->TCalc - *p->TSetup);
dml2_printf("DML: row_bytes = dpte_row_bytes (per_pipe) = 
PixelPTEBytesPerRow = : %u\n", p->PixelPTEBytesPerRow);
@@ -7825,7 +7825,7 @@ static bool dml_core_mode_support(struct 
dml2_core_calcs_mode_support_ex *in_out
dml2_printf("DML::%s: mode_lib->ms.FabricClock = %f\n", __func__, 
mode_lib->ms.FabricClock);
dml2_printf("DML::%s: mode_lib->ms.uclk_freq_mhz = %f\n", __func__, 
mode_lib->ms.uclk_freq_mhz);
dml2_printf("DML::%s: max_urgent_latency_us = %f\n", __func__, 
mode_lib->ms.support.max_urgent_latency_us);
-   dml2_printf("DML::%s: urgent latency tolarance = %f\n", __func__, 
((mode_lib->ip.rob_buffer_size_kbytes - mode_lib->ip.pixel_chunk_size_kbytes) * 
1024 / (mode_lib->ms.DCFCLK * mode_lib->soc.return_bus_width_bytes)));
+   dml2_printf("DML::%s: urgent latency tolerance = %f\n", __func__, 
((mode_lib->ip.rob_buffer_size_kbytes - mode_lib->ip.pixel_chunk_size_kbytes) * 
1024 / (mode_lib->ms.DCFCLK * mode_lib->soc.return_bus_width_bytes)));
dml2_printf("DML::%s: ROBSupport = %u\n", __func__, 
mode_lib->ms.support.ROBSupport);
 #endif
 
@@ -10603,7 +10603,7 @@ static bool dml_core_mode_programming(struct 
dml2_core_calcs_mode_programming_ex
if 
(display_cfg->plane_descriptors[k].immediate_flip && 
mode_lib->mp.ImmediateFlipSupportedForPipe[k] == false) {
mode_lib->mp.ImmediateFlipSupported = 
false;
 #ifdef __DML_VBA_DEBUG__
-   dml2_printf("DML::%s: Pipe %0d not 
supporing iflip!\n", __func__, k);
+   dml2_printf("DML::%s: Pipe %0d not 
supporting iflip!\n", __func__, k);
 #endif
}
}
diff --git 
a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared.c 
b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared.c
index 0ef77a89d984..d1d4fe062d4e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared.c
+++ b/drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_shared.c
@@ -2023,7 +2023,7 @@ bool dml2_core_shared_mode_support(struct 
dml2_core_calcs_mode_support_ex *in_ou
dml2_printf("DML::%s: mode_lib->ms.FabricClock = %f\n", __func__, 
mode_lib->ms.FabricClock);
dml2_printf("DML::%s: mode_lib->ms.uclk_freq_mhz = %f\n", __func__, 
mode_lib->ms.uclk_freq_mhz);
dml2_printf("DML::%s: max_urgent_latency_us = %f\n", __func__, 
mode_lib->ms.support.max_urgent_latency_us);
-   dml2_printf("DML::%s: urgent latency tolarance = %f\n", __func__, 
((mode_lib->ip.rob_buffer_size_kbytes - mode_lib->ip.pixel_chunk_size_kbytes) * 
1024 / (mode_lib->ms.DCFCLK * mode_lib->soc.return_bus_width_bytes)));
+   dml2_printf("DML::%s: urgent latency tolerance = %f\n", __func__, 
((mode_lib->ip.rob_buffer_size_kbytes - mode_lib->ip.pixel_chunk_size_kbytes) * 
1024 / (mode_lib->ms.DCFCLK * mode_lib->s

Re: [RFC PATCH 10/18] drm/amdgpu: Don't add GTT to initial domains after failing to allocate VRAM

2024-04-25 Thread Christian König

Am 25.04.24 um 09:39 schrieb Friedrich Vock:

On 25.04.24 08:25, Christian König wrote:

Am 24.04.24 um 18:57 schrieb Friedrich Vock:

This adds GTT to the "preferred domains" of this buffer object, which
will also prevent any attempts at moving the buffer back to VRAM if
there is space. If VRAM is full, GTT will already be chosen as a
fallback.


Big NAK to that one, this is mandatory for correct operation.


Hm, how is correctness affected here? We still fall back to GTT if
allocating in VRAM doesn't work, I don't see a difference except that
now we'll actually try moving it back into VRAM again.


Well this is the fallback. Only during CS do we try to allocate from GTT if 
allocating in VRAM doesn't work.


When you remove this here then any failed allocation from VRAM would be 
fatal.


It could be that the handling is buggy and that when we update the 
initial domain we also add GTT to the preferred domain, but that should 
then be fixed.


Regards,
Christian.



Regards,
Friedrich


Regards,
Christian.



Signed-off-by: Friedrich Vock 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
  2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 6bbab141eaaeb..aea3770d3ea2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -378,10 +378,6 @@ int amdgpu_gem_create_ioctl(struct drm_device
*dev, void *data,
  goto retry;
  }

-    if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
-    initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
-    goto retry;
-    }
  DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu,
%d)\n",
  size, initial_domain, args->in.alignment, r);
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 85c10d8086188..9978b85ed6f40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -619,7 +619,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
    AMDGPU_GEM_DOMAIN_GDS))
  amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
  else
-    amdgpu_bo_placement_from_domain(bo, bp->domain);
+    amdgpu_bo_placement_from_domain(bo, bo->allowed_domains);
  if (bp->type == ttm_bo_type_kernel)
  bo->tbo.priority = 2;
  else if (!(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE))
--
2.44.0







Re: [RFC PATCH 10/18] drm/amdgpu: Don't add GTT to initial domains after failing to allocate VRAM

2024-04-25 Thread Friedrich Vock

On 25.04.24 08:25, Christian König wrote:

Am 24.04.24 um 18:57 schrieb Friedrich Vock:

This adds GTT to the "preferred domains" of this buffer object, which
will also prevent any attempts at moving the buffer back to VRAM if
there is space. If VRAM is full, GTT will already be chosen as a
fallback.


Big NAK to that one, this is mandatory for correct operation.


Hm, how is correctness affected here? We still fall back to GTT if
allocating in VRAM doesn't work, I don't see a difference except that
now we'll actually try moving it back into VRAM again.

Regards,
Friedrich


Regards,
Christian.



Signed-off-by: Friedrich Vock 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
  2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 6bbab141eaaeb..aea3770d3ea2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -378,10 +378,6 @@ int amdgpu_gem_create_ioctl(struct drm_device
*dev, void *data,
  goto retry;
  }

-    if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
-    initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
-    goto retry;
-    }
  DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu,
%d)\n",
  size, initial_domain, args->in.alignment, r);
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 85c10d8086188..9978b85ed6f40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -619,7 +619,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
    AMDGPU_GEM_DOMAIN_GDS))
  amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
  else
-    amdgpu_bo_placement_from_domain(bo, bp->domain);
+    amdgpu_bo_placement_from_domain(bo, bo->allowed_domains);
  if (bp->type == ttm_bo_type_kernel)
  bo->tbo.priority = 2;
  else if (!(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE))
--
2.44.0





Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-25 Thread Friedrich Vock

On 25.04.24 09:15, Christian König wrote:

Am 25.04.24 um 09:06 schrieb Friedrich Vock:

On 25.04.24 08:58, Christian König wrote:

Am 25.04.24 um 08:46 schrieb Friedrich Vock:

On 25.04.24 08:32, Christian König wrote:

Am 24.04.24 um 18:57 schrieb Friedrich Vock:

Used by userspace to adjust buffer priorities in response to
changes in
application demand and memory pressure.


Yeah, that was discussed over and over again. One big design criterion
is that we can't have global priorities from userspace!

The background here is that this can trivially be abused.


I see your point when apps are allowed to prioritize themselves above
other apps, and I agree that should probably be disallowed at least
for
unprivileged apps.

Disallowing this is a pretty trivial change though, and I don't really
see the abuse potential in being able to downgrade your own priority?


Yeah, I know what you mean and I'm also leaning towards that
argumentation. But another good point is also that it doesn't actually
help.

For example when you have desktop apps fighting with a game, you
probably don't want to use static priorities, but rather evict the
apps which are inactive and keep the apps which are active in the
background.


Sadly things are not as simple as "evict everything from app 1, keep
everything from app 2 active". The simplest failure case of this is
games that already oversubscribe VRAM on their own. Keeping the whole
app inside VRAM is literally impossible there, and it helps a lot to
know which buffers the app is most happy with evicting.

In other words the priority just tells you which stuff from each app
to evict first, but not which app to globally throw out.


Yeah, but per-buffer priority system could do both of these.


Yeah, but we already have that. See amdgpu_bo_list_entry_cmp() and
amdgpu_bo_list_create().

This is the per application priority which can be used by userspace to
give priority to each BO in a submission (or application wide).

The problem is rather that amdgpu/TTM never really made good use of
that information.


I think it's nigh impossible to make good use of priority information if
you wrap it in the BO list which you only know on submit. For example,
you don't know when priorities change unless you duplicate all the
tracking work (that the application has to do too!) in the kernel. You
also have no way of knowing the priority changed until right when the
app wants to submit work using that BO, and starting to move BOs around
at that point is bad for submission latency. That's why I didn't go
forward with tracking priorities on a BO-list basis.

Also, the priorities being local to a submission is actually not that
great when talking about lowering priorities. Consider a case where an
app's working set fits into VRAM completely, but combined with the
working set of other apps running in parallel, VRAM is oversubscribed.
The app recognizes this and asks the kernel to evict one of its
rarely-used buffers by setting the priority to the lowest possible, to
make space for the other applications.
Without global priorities, the kernel can't honor that request, even
though it would solve the oversubscription with minimal performance
impact. Even with per-app priorities, the kernel isn't likely to evict
buffers from the requesting application unless all the other
applications have a higher priority.

Regards,
Friedrich


Regards,
Christian.



Regards,
Friedrich


Regards,
Christian.



Regards,
Friedrich


What we can do is to have per process priorities, but that needs
to be
in the VM subsystem.

That's also the reason why I personally think that the handling
shouldn't be inside TTM at all.

Regards,
Christian.



Signed-off-by: Friedrich Vock 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 20 
  include/uapi/drm/amdgpu_drm.h   |  1 +
  2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 5ca13e2e50f50..6107810a9c205 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -836,8 +836,10 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,
  {
  struct amdgpu_device *adev = drm_to_adev(dev);
  struct drm_amdgpu_gem_op *args = data;
+    struct ttm_resource_manager *man;
  struct drm_gem_object *gobj;
  struct amdgpu_vm_bo_base *base;
+    struct ttm_operation_ctx ctx;
  struct amdgpu_bo *robj;
  int r;

@@ -851,6 +853,9 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,
  if (unlikely(r))
  goto out;

+    memset(&ctx, 0, sizeof(ctx));
+    ctx.interruptible = true;
+
  switch (args->op) {
  case AMDGPU_GEM_OP_GET_GEM_CREATE_INFO: {
  struct drm_amdgpu_gem_create_in info;
@@ -898,6 +903,21 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,

  amdgpu_bo_unreserve(robj);
  break;
+    case AMDGPU_GEM_OP_SET_PRIORITY:
+    if (args->value 

RE: [PATCH Review 1/1] drm/amdgpu: Adjust XGMI WAFL ras enable bit

2024-04-25 Thread Yang, Stanley
[AMD Official Use Only - General]

Thanks for the reminder. The XGMI/WAFL caps are set on devices without an
XGMI link; I will notify the PSP firmware team to fix it.

Regards,
Stanley
> -----Original Message-----
> From: Zhang, Hawking 
> Sent: Thursday, April 25, 2024 3:26 PM
> To: Yang, Stanley ; amd-gfx@lists.freedesktop.org
> Cc: Yang, Stanley 
> Subject: RE: [PATCH Review 1/1] drm/amdgpu: Adjust XGMI WAFL ras enable bit
>
> [AMD Official Use Only - General]
>
> Hmm... we do expect PSP to report the XGMI/WAFL Caps. This is different from
> the legacy RAS CAP check through atomfirmware. But if you find the XGMI/WAFL
> bits are not set properly in the new PSP interface, let's reach out to PSP 
> firmware
> team for a fix.
>
> Regards,
> Hawking
>
> -----Original Message-----
> From: amd-gfx  On Behalf Of
> Stanley.Yang
> Sent: Thursday, April 25, 2024 15:08
> To: amd-gfx@lists.freedesktop.org
> Cc: Yang, Stanley 
> Subject: [PATCH Review 1/1] drm/amdgpu: Adjust XGMI WAFL ras enable bit
>
> The way to get the RAS capability has changed for some ASICs; both paths need
> to check the XGMI physical node count to set the XGMI WAFL RAS enable bit.
>
> Signed-off-by: Stanley.Yang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index b2a883d3e19d..ea77e00cc002 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2918,13 +2918,6 @@ static void
> amdgpu_ras_query_ras_capablity_from_vbios(struct amdgpu_device *adev
> else
> adev->ras_hw_enabled &= ~(1 << AMDGPU_RAS_BLOCK__VCN |
>   1 << 
> AMDGPU_RAS_BLOCK__JPEG);
> -
> -   /*
> -* XGMI RAS is not supported if xgmi num physical nodes
> -* is zero
> -*/
> -   if (!adev->gmc.xgmi.num_physical_nodes)
> -   adev->ras_hw_enabled &= ~(1 <<
> AMDGPU_RAS_BLOCK__XGMI_WAFL);
> } else {
> dev_info(adev->dev, "SRAM ECC is not presented.\n");
> }
> @@ -3002,6 +2995,13 @@ static void amdgpu_ras_check_supported(struct
> amdgpu_device *adev)
> amdgpu_ras_query_poison_mode(adev);
>
>  init_ras_enabled_flag:
> +   /*
> +* XGMI RAS is not supported if xgmi num physical nodes
> +* is zero
> +*/
> +   if (!adev->gmc.xgmi.num_physical_nodes)
> +   adev->ras_hw_enabled &= ~(1 <<
> AMDGPU_RAS_BLOCK__XGMI_WAFL);
> +
> /* hw_supported needs to be aligned with RAS block mask. */
> adev->ras_hw_enabled &= AMDGPU_RAS_BLOCK_MASK;
>
> --
> 2.25.1
>



RE: [PATCH Review 1/1] drm/amdgpu: Adjust XGMI WAFL ras enable bit

2024-04-25 Thread Zhang, Hawking
[AMD Official Use Only - General]

Hmm... we do expect PSP to report the XGMI/WAFL Caps. This is different from
the legacy RAS CAP check through atomfirmware. But if you find the XGMI/WAFL bits
are not set properly in the new PSP interface, let's reach out to PSP firmware 
team for a fix.

Regards,
Hawking

-----Original Message-----
From: amd-gfx  On Behalf Of Stanley.Yang
Sent: Thursday, April 25, 2024 15:08
To: amd-gfx@lists.freedesktop.org
Cc: Yang, Stanley 
Subject: [PATCH Review 1/1] drm/amdgpu: Adjust XGMI WAFL ras enable bit

The way to get the RAS capability has changed for some ASICs; both paths need
to check the XGMI physical node count to set the XGMI WAFL RAS enable bit.

Signed-off-by: Stanley.Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index b2a883d3e19d..ea77e00cc002 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2918,13 +2918,6 @@ static void 
amdgpu_ras_query_ras_capablity_from_vbios(struct amdgpu_device *adev
else
adev->ras_hw_enabled &= ~(1 << AMDGPU_RAS_BLOCK__VCN |
  1 << AMDGPU_RAS_BLOCK__JPEG);
-
-   /*
-* XGMI RAS is not supported if xgmi num physical nodes
-* is zero
-*/
-   if (!adev->gmc.xgmi.num_physical_nodes)
-   adev->ras_hw_enabled &= ~(1 << 
AMDGPU_RAS_BLOCK__XGMI_WAFL);
} else {
dev_info(adev->dev, "SRAM ECC is not presented.\n");
}
@@ -3002,6 +2995,13 @@ static void amdgpu_ras_check_supported(struct 
amdgpu_device *adev)
amdgpu_ras_query_poison_mode(adev);

 init_ras_enabled_flag:
+   /*
+* XGMI RAS is not supported if xgmi num physical nodes
+* is zero
+*/
+   if (!adev->gmc.xgmi.num_physical_nodes)
+   adev->ras_hw_enabled &= ~(1 << AMDGPU_RAS_BLOCK__XGMI_WAFL);
+
/* hw_supported needs to be aligned with RAS block mask. */
adev->ras_hw_enabled &= AMDGPU_RAS_BLOCK_MASK;

--
2.25.1



[PATCH] drm/amdgpu: fix the warning about the expression (int)size - len

2024-04-25 Thread Jesse Zhang
Converting size from size_t to int may overflow.
v2: keep reverse xmas tree order (Christian)

Signed-off-by: Jesse Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index f5d0fa207a88..b62ae3c91a9d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -2065,12 +2065,13 @@ static ssize_t 
amdgpu_reset_dump_register_list_write(struct file *f,
struct amdgpu_device *adev = (struct amdgpu_device 
*)file_inode(f)->i_private;
char reg_offset[11];
uint32_t *new = NULL, *tmp = NULL;
-   int ret, i = 0, len = 0;
+   unsigned int len = 0;
+   int ret, i = 0;
 
do {
memset(reg_offset, 0, 11);
if (copy_from_user(reg_offset, buf + len,
-   min(10, ((int)size-len)))) {
+   min(10, (size-len)))) {
ret = -EFAULT;
goto error_free;
}
-- 
2.25.1
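
A standalone illustration of the pitfall the patch above removes: once size
exceeds INT_MAX, the (int) cast turns it negative on the usual
two's-complement targets, so min(10, (int)size - len) can pass a negative
length onwards. (Hypothetical userspace demo, not kernel code.)

#include <stdio.h>
#include <limits.h>
#include <stddef.h>

int main(void)
{
	size_t size = (size_t)INT_MAX + 2; /* possible on 64-bit systems */
	int len = 0;

	/* prints a large negative number instead of a byte count */
	printf("(int)size - len = %d\n", (int)size - len);
	return 0;
}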



Re: [PATCH V2] drm/amdgpu: fix the warning about the expression (int)size - len

2024-04-25 Thread Christian König

Am 25.04.24 um 09:11 schrieb Jesse Zhang:

Converting size from size_t to int may overflow.
v2: keep reverse xmas tree order (Christian)

Signed-off-by: Jesse Zhang 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index f5d0fa207a88..eed60d4b3390 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -2065,12 +2065,13 @@ static ssize_t 
amdgpu_reset_dump_register_list_write(struct file *f,
struct amdgpu_device *adev = (struct amdgpu_device 
*)file_inode(f)->i_private;
char reg_offset[11];
uint32_t *new = NULL, *tmp = NULL;
+   unsigned int len = 0;
int ret, i = 0, len = 0;


Well now you have len defined twice :)

Christian.

  
  	do {

memset(reg_offset, 0, 11);
if (copy_from_user(reg_offset, buf + len,
-   min(10, ((int)size-len)))) {
+   min(10, (size-len)))) {
ret = -EFAULT;
goto error_free;
}




Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-25 Thread Christian König

Am 25.04.24 um 09:06 schrieb Friedrich Vock:

On 25.04.24 08:58, Christian König wrote:

Am 25.04.24 um 08:46 schrieb Friedrich Vock:

On 25.04.24 08:32, Christian König wrote:

Am 24.04.24 um 18:57 schrieb Friedrich Vock:

Used by userspace to adjust buffer priorities in response to
changes in
application demand and memory pressure.


Yeah, that was discussed over and over again. One big design criterion
is that we can't have global priorities from userspace!

The background here is that this can trivially be abused.


I see your point when apps are allowed to prioritize themselves above
other apps, and I agree that should probably be disallowed at least for
unprivileged apps.

Disallowing this is a pretty trivial change though, and I don't really
see the abuse potential in being able to downgrade your own priority?


Yeah, I know what you mean and I'm also leaning towards that
argumentation. But another good point is also that it doesn't actually
help.

For example when you have desktop apps fighting with a game, you
probably don't want to use static priorities, but rather evict the
apps which are inactive and keep the apps which are active in the
background.


Sadly things are not as simple as "evict everything from app 1, keep
everything from app 2 active". The simplest failure case of this is
games that already oversubscribe VRAM on their own. Keeping the whole
app inside VRAM is literally impossible there, and it helps a lot to
know which buffers the app is most happy with evicting.

In other words the priority just tells you which stuff from each app
to evict first, but not which app to globally throw out.


Yeah, but a per-buffer priority system could do both of these.


Yeah, but we already have that. See amdgpu_bo_list_entry_cmp() and 
amdgpu_bo_list_create().


This is the per-application priority which can be used by userspace to
give priority to each BO in a submission (or application-wide).


The problem is rather that amdgpu/TTM never really made good use of that 
information.
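
For reference, the existing per-submission path looks roughly like this
from userspace (a sketch against the mainline uapi; one-BO list for
brevity, error handling omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/amdgpu_drm.h>

/* Create a BO list carrying a per-BO priority for the next submission;
 * bo_priority is what amdgpu_bo_list_entry_cmp() sorts on.
 */
static int bo_list_create_with_prio(int drm_fd, uint32_t bo_handle,
				    uint32_t prio, uint32_t *list_handle)
{
	struct drm_amdgpu_bo_list_entry entry = {
		.bo_handle = bo_handle,
		.bo_priority = prio,
	};
	union drm_amdgpu_bo_list args = {
		.in = {
			.operation = AMDGPU_BO_LIST_OP_CREATE,
			.bo_number = 1,
			.bo_info_size = sizeof(entry),
			.bo_info_ptr = (uintptr_t)&entry,
		},
	};
	int r = ioctl(drm_fd, DRM_IOCTL_AMDGPU_BO_LIST, &args);

	if (!r)
		*list_handle = args.out.list_handle;
	return r;
}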


Regards,
Christian.



Regards,
Friedrich


Regards,
Christian.



Regards,
Friedrich


What we can do is to have per-process priorities, but that needs to be
in the VM subsystem.

That's also the reason why I personally think that the handling
shouldn't be inside TTM at all.

Regards,
Christian.



Signed-off-by: Friedrich Vock 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 20 
  include/uapi/drm/amdgpu_drm.h   |  1 +
  2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 5ca13e2e50f50..6107810a9c205 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -836,8 +836,10 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,
  {
  struct amdgpu_device *adev = drm_to_adev(dev);
  struct drm_amdgpu_gem_op *args = data;
+    struct ttm_resource_manager *man;
  struct drm_gem_object *gobj;
  struct amdgpu_vm_bo_base *base;
+    struct ttm_operation_ctx ctx;
  struct amdgpu_bo *robj;
  int r;

@@ -851,6 +853,9 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,
  if (unlikely(r))
  goto out;

+    memset(&ctx, 0, sizeof(ctx));
+    ctx.interruptible = true;
+
  switch (args->op) {
  case AMDGPU_GEM_OP_GET_GEM_CREATE_INFO: {
  struct drm_amdgpu_gem_create_in info;
@@ -898,6 +903,21 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,

  amdgpu_bo_unreserve(robj);
  break;
+    case AMDGPU_GEM_OP_SET_PRIORITY:
+    if (args->value > AMDGPU_BO_PRIORITY_MAX_USER)
+    args->value = AMDGPU_BO_PRIORITY_MAX_USER;
+    ttm_bo_update_priority(&robj->tbo, args->value);
+    if (robj->tbo.evicted_type != TTM_NUM_MEM_TYPES) {
+    ttm_bo_try_unevict(&robj->tbo, &ctx);
+    amdgpu_bo_unreserve(robj);
+    } else {
+    amdgpu_bo_unreserve(robj);
+    man = ttm_manager_type(robj->tbo.bdev,
+    robj->tbo.resource->mem_type);
+    ttm_mem_unevict_evicted(robj->tbo.bdev, man,
+    true);
+    }
+    break;
  default:
  amdgpu_bo_unreserve(robj);
  r = -EINVAL;
diff --git a/include/uapi/drm/amdgpu_drm.h
b/include/uapi/drm/amdgpu_drm.h
index bdbe6b262a78d..53552dd489b9b 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -531,6 +531,7 @@ union drm_amdgpu_wait_fences {

  #define AMDGPU_GEM_OP_GET_GEM_CREATE_INFO    0
  #define AMDGPU_GEM_OP_SET_PLACEMENT    1
+#define AMDGPU_GEM_OP_SET_PRIORITY  2

  /* Sets or returns a value associated with a buffer. */
  struct drm_amdgpu_gem_op {
--
2.44.0









[PATCH V2] drm/amdgpu: fix the warning about the expression (int)size - len

2024-04-25 Thread Jesse Zhang
Converting size from size_t to int may overflow.
v2: keep reverse xmas tree order (Christian)

Signed-off-by: Jesse Zhang 

---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index f5d0fa207a88..eed60d4b3390 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -2065,12 +2065,13 @@ static ssize_t 
amdgpu_reset_dump_register_list_write(struct file *f,
struct amdgpu_device *adev = (struct amdgpu_device 
*)file_inode(f)->i_private;
char reg_offset[11];
uint32_t *new = NULL, *tmp = NULL;
+   unsigned int len = 0;
int ret, i = 0, len = 0;
 
do {
memset(reg_offset, 0, 11);
if (copy_from_user(reg_offset, buf + len,
-   min(10, ((int)size-len)))) {
+   min(10, (size-len)))) {
ret = -EFAULT;
goto error_free;
}
-- 
2.25.1



[PATCH Review 1/1] drm/amdgpu: Adjust XGMI WAFL ras enable bit

2024-04-25 Thread Stanley.Yang
The way to get the RAS capability has changed for some ASICs;
both paths need to check the XGMI physical node count to
set the XGMI WAFL RAS enable bit.

Signed-off-by: Stanley.Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index b2a883d3e19d..ea77e00cc002 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2918,13 +2918,6 @@ static void 
amdgpu_ras_query_ras_capablity_from_vbios(struct amdgpu_device *adev
else
adev->ras_hw_enabled &= ~(1 << AMDGPU_RAS_BLOCK__VCN |
  1 << AMDGPU_RAS_BLOCK__JPEG);
-
-   /*
-* XGMI RAS is not supported if xgmi num physical nodes
-* is zero
-*/
-   if (!adev->gmc.xgmi.num_physical_nodes)
-   adev->ras_hw_enabled &= ~(1 << 
AMDGPU_RAS_BLOCK__XGMI_WAFL);
} else {
dev_info(adev->dev, "SRAM ECC is not presented.\n");
}
@@ -3002,6 +2995,13 @@ static void amdgpu_ras_check_supported(struct 
amdgpu_device *adev)
amdgpu_ras_query_poison_mode(adev);
 
 init_ras_enabled_flag:
+   /*
+* XGMI RAS is not supported if xgmi num physical nodes
+* is zero
+*/
+   if (!adev->gmc.xgmi.num_physical_nodes)
+   adev->ras_hw_enabled &= ~(1 << AMDGPU_RAS_BLOCK__XGMI_WAFL);
+
/* hw_supported needs to be aligned with RAS block mask. */
adev->ras_hw_enabled &= AMDGPU_RAS_BLOCK_MASK;
 
-- 
2.25.1



Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-25 Thread Friedrich Vock

On 25.04.24 08:58, Christian König wrote:

Am 25.04.24 um 08:46 schrieb Friedrich Vock:

On 25.04.24 08:32, Christian König wrote:

Am 24.04.24 um 18:57 schrieb Friedrich Vock:

Used by userspace to adjust buffer priorities in response to
changes in
application demand and memory pressure.


Yeah, that was discussed over and over again. One big design criterion
is that we can't have global priorities from userspace!

The background here is that this can trivially be abused.


I see your point when apps are allowed to prioritize themselves above
other apps, and I agree that should probably be disallowed at least for
unprivileged apps.

Disallowing this is a pretty trivial change though, and I don't really
see the abuse potential in being able to downgrade your own priority?


Yeah, I know what you mean and I'm also leaning towards that
argumentation. But another good point is also that it doesn't actually
help.

For example when you have desktop apps fighting with a game, you
probably don't want to use static priorities, but rather evict the
apps which are inactive and keep the apps which are active in the
background.


Sadly things are not as simple as "evict everything from app 1, keep
everything from app 2 active". The simplest failure case of this is
games that already oversubscribe VRAM on their own. Keeping the whole
app inside VRAM is literally impossible there, and it helps a lot to
know which buffers the app is most happy with evicting.

In other words the priority just tells you which stuff from each app
to evict first, but not which app to globally throw out.


Yeah, but a per-buffer priority system could do both of these.

Regards,
Friedrich


Regards,
Christian.



Regards,
Friedrich


What we can do is to have per-process priorities, but that needs to be
in the VM subsystem.

That's also the reason why I personally think that the handling
shouldn't be inside TTM at all.

Regards,
Christian.



Signed-off-by: Friedrich Vock 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 20 
  include/uapi/drm/amdgpu_drm.h   |  1 +
  2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 5ca13e2e50f50..6107810a9c205 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -836,8 +836,10 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,
  {
  struct amdgpu_device *adev = drm_to_adev(dev);
  struct drm_amdgpu_gem_op *args = data;
+    struct ttm_resource_manager *man;
  struct drm_gem_object *gobj;
  struct amdgpu_vm_bo_base *base;
+    struct ttm_operation_ctx ctx;
  struct amdgpu_bo *robj;
  int r;

@@ -851,6 +853,9 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,
  if (unlikely(r))
  goto out;

+    memset(&ctx, 0, sizeof(ctx));
+    ctx.interruptible = true;
+
  switch (args->op) {
  case AMDGPU_GEM_OP_GET_GEM_CREATE_INFO: {
  struct drm_amdgpu_gem_create_in info;
@@ -898,6 +903,21 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev,
void *data,

  amdgpu_bo_unreserve(robj);
  break;
+    case AMDGPU_GEM_OP_SET_PRIORITY:
+    if (args->value > AMDGPU_BO_PRIORITY_MAX_USER)
+    args->value = AMDGPU_BO_PRIORITY_MAX_USER;
+    ttm_bo_update_priority(&robj->tbo, args->value);
+    if (robj->tbo.evicted_type != TTM_NUM_MEM_TYPES) {
+    ttm_bo_try_unevict(&robj->tbo, &ctx);
+    amdgpu_bo_unreserve(robj);
+    } else {
+    amdgpu_bo_unreserve(robj);
+    man = ttm_manager_type(robj->tbo.bdev,
+    robj->tbo.resource->mem_type);
+    ttm_mem_unevict_evicted(robj->tbo.bdev, man,
+    true);
+    }
+    break;
  default:
  amdgpu_bo_unreserve(robj);
  r = -EINVAL;
diff --git a/include/uapi/drm/amdgpu_drm.h
b/include/uapi/drm/amdgpu_drm.h
index bdbe6b262a78d..53552dd489b9b 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -531,6 +531,7 @@ union drm_amdgpu_wait_fences {

  #define AMDGPU_GEM_OP_GET_GEM_CREATE_INFO    0
  #define AMDGPU_GEM_OP_SET_PLACEMENT    1
+#define AMDGPU_GEM_OP_SET_PRIORITY  2

  /* Sets or returns a value associated with a buffer. */
  struct drm_amdgpu_gem_op {
--
2.44.0







Re: [PATCH v2] drm/amdgpu: Fix buffer size in gfx_v9_4_3_init_cp_compute_microcode() and rlc_microcode()

2024-04-25 Thread Lazar, Lijo



On 4/25/2024 12:05 PM, Srinivasan Shanmugam wrote:
> The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating a
> warning about potential truncation of output when using the snprintf function.
> The issue was due to the size of the destination buffer 'fw_name' being too
> small to accommodate the maximum possible length of the string being
> written into it.
> 
> The string being written is "amdgpu/%s_mec.bin" or "amdgpu/%s_rlc.bin",
> where %s is replaced by the value of 'chip_name'. The length of this
> string without the %s is 16 characters. The warning message indicated
> that 'chip_name' could be up to 29 characters long, resulting in a total
> of 45 characters, which exceeds the buffer size of 30 characters.
> 
> To resolve this issue, the size of the 'ucode_prefix' buffer has been
> reduced from 30 to 15. This ensures that the maximum possible length of
> the string being written into the buffer will not exceed its size, thus
> preventing potential buffer overflow and truncation issues.
> 
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: In function ‘gfx_v9_4_3_early_init’:
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:52: warning: ‘%s’ directive 
> output may be truncated writing up to 29 bytes into a region of size 23 
> [-Wformat-truncation=]
>   379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", 
> chip_name);
>   |^~
> ..
>   439 | r = gfx_v9_4_3_init_rlc_microcode(adev, ucode_prefix);
>   | 
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:9: note: ‘snprintf’ output 
> between 16 and 45 bytes into a destination of size 30
>   379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", 
> chip_name);
>   | 
> ^~
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:52: warning: ‘%s’ directive 
> output may be truncated writing up to 29 bytes into a region of size 23 
> [-Wformat-truncation=]
>   413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", 
> chip_name);
>   |^~
> ..
>   443 | r = gfx_v9_4_3_init_cp_compute_microcode(adev, ucode_prefix);
>   |
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:9: note: ‘snprintf’ output 
> between 16 and 45 bytes into a destination of size 30
>   413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", 
> chip_name);
>   | 
> ^~
> 
> Fixes: 86301129698b ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0")
> Cc: Hawking Zhang 
> Cc: Christian König 
> Cc: Alex Deucher 
> Cc: Lijo Lazar 
> Signed-off-by: Srinivasan Shanmugam 
> Suggested-by: Lijo Lazar 

Reviewed-by: Lijo Lazar 

Thanks,
Lijo
> ---
> v2:
>  - reduced the size in ucode_prefix to 15 instead of changing size in
>fw_name (Lijo)
> 
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> index 0e429b7ed036..7b16e8cca86a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> @@ -431,7 +431,7 @@ static int gfx_v9_4_3_init_cp_compute_microcode(struct 
> amdgpu_device *adev,
>  
>  static int gfx_v9_4_3_init_microcode(struct amdgpu_device *adev)
>  {
> - char ucode_prefix[30];
> + char ucode_prefix[15];
>   int r;
>  
>   amdgpu_ucode_ip_version_decode(adev, GC_HWIP, ucode_prefix, 
> sizeof(ucode_prefix));
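
A hypothetical standalone reproduction of the warning discussed above: the
fixed part of "amdgpu/%s_rlc.bin" needs 16 bytes including the NUL, so a
29-character chip name requires 45 bytes, while the destination only
offers 30.

#include <stdio.h>

int main(void)
{
	char chip_name[30] = "a_hypothetical_29_char_prefix"; /* 29 chars */
	char fw_name[30];

	/* a W=1 build flags this snprintf via -Wformat-truncation:
	 * output may need up to 45 bytes, destination holds 30.
	 */
	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name);
	printf("%s\n", fw_name); /* silently truncated */
	return 0;
}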


[PATCH v2] drm/amdgpu: fix overflowed array index read warning

2024-04-25 Thread Tim Huang
From: Tim Huang 

Clear the warning that the cast operation might overflow.

v2: keep reverse xmas tree order to declare "int r;" (Christian)

Signed-off-by: Tim Huang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 06f0a6534a94..8cf60acb2970 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -473,8 +473,8 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, 
char __user *buf,
size_t size, loff_t *pos)
 {
struct amdgpu_ring *ring = file_inode(f)->i_private;
-   int r, i;
uint32_t value, result, early[3];
+   int r;
 
if (*pos & 3 || size & 3)
return -EINVAL;
@@ -485,7 +485,7 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, 
char __user *buf,
early[0] = amdgpu_ring_get_rptr(ring) & ring->buf_mask;
early[1] = amdgpu_ring_get_wptr(ring) & ring->buf_mask;
early[2] = ring->wptr & ring->buf_mask;
-   for (i = *pos / 4; i < 3 && size; i++) {
+   for (loff_t i = *pos / 4; i < 3 && size; i++) {
r = put_user(early[i], (uint32_t *)buf);
if (r)
return r;
-- 
2.39.2