RE: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved

2024-07-19 Thread YuanShang Mao (River)
[AMD Official Use Only - AMD Internal Distribution Only]

Same issue on CPU page table update.

-Original Message-
From: Kuehling, Felix 
Sent: Thursday, July 18, 2024 12:28 AM
To: Christian König ; YuanShang Mao (River) 
; Huang, Trigger ; 
amd-gfx@lists.freedesktop.org; cao, lin 
Subject: Re: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved


On 2024-07-15 08:39, Christian König wrote:
> Hi Felix,
>
> yes that is a perfectly expected consequence.
>
> The last time we talked about it the problem to solve this was that
> amdgpu_vm_sdma_prepare() couldn't read the fences from a resv object
> which wasn't locked.

Why only amdgpu_vm_sdma_prepare? Doesn't CPU page table update have the same 
problem?


>
> That happens both during amdgpu_vm_handle_moved() as well as unlocked
> in validations of the page tables.

By "unlocked validations of page table entries" do you mean the
"unlocked" flag in amdgpu_vm_update_range? That should only be used for
invalidating page table entries in MMU notifiers for SVM ranges. It
should not affect normal BOs.

amdgpu_vm_handle_moved tries to lock the reservations. But if it fails,
it clears page table entries. So this is another case of "unlocked
invalidations". This one does affect normal BOs. I think
amdgpu_vm_handle_moved assumes that the other user of the BO is in a
different VM, not the same VM. In that case, clearing the PTEs in this VM
is safe even though the BO move is still waiting for some other VM to
finish accessing it.

I think here we have a case where the BO is used by something else in
the same VM. In this case we cannot safely clear the PTEs before the BO
fence signals.

We want to clear the PTEs before the move happens. Otherwise we risk
memory corruption. Maybe the same job that does the move blit should
also invalidate the PTEs?

Regards,
   Felix


>
> IIRC we postponed looking into the issue until it really becomes a
> problem which is probably now :)
>
> Regards,
> Christian.
>
> Am 12.07.24 um 16:56 schrieb Felix Kuehling:
>> KFD eviction fences are triggered by the enable_signaling callback on
>> the eviction fence. Any move operations scheduled by amdgpu_bo_move
>> are held up by the GPU scheduler until the eviction fence is signaled
>> by the KFD eviction handler, which only happens after the user mode
>> queues are stopped.
>>
>> As I understand it, VM BO invalidation does not unmap anything from
>> the page table itself. So the KFD queues are OK to continue running
>> until the eviction handler stops them and signals the fence.
>>
>> However, if amdgpu_vm_handle_moved gets called before the eviction
>> fence is signaled, then there could be a problem. In applications
>> that do compute-graphics interop, the VM is shared between compute
>> and graphics. So graphics and compute submissions at the same time
>> are possible. @Christian, this is a consequence of using libdrm and
>> insisting that each process uses only a single VM per GPU.
>>
>> Regards,
>>Felix
>>
>> On 2024-07-12 3:39, Christian König wrote:
>>> Hi River,
>>>
>>> well, that isn't an error at all; this is perfectly expected behavior.
>>>
>>> The VMs used by the KFD process are currently not meant to be used
>>> by classic CS at the same time.
>>>
>>> This is one of the reasons for that.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 12.07.24 um 09:35 schrieb YuanShang Mao (River):
>>>> [AMD Official Use Only - AMD Internal Distribution Only]
>>>>
>>>> Add more info and CC @Kuehling, Felix @cao, lin
>>>>
>>>> In amdgpu_amdkfd_fence.c, there is a design description:
>>>>
>>>> /* Eviction Fence
>>>>* Fence helper functions to deal with KFD memory eviction.
>>>>* Big Idea - Since KFD submissions are done by user queues, a BO
>>>> cannot be
>>>>*  evicted unless all the user queues for that process are evicted.
>>>>*
>>>>* All the BOs in a process share an eviction fence. When process
>>>> X wants
>>>>* to map VRAM memory but TTM can't find enough space, TTM will
>>>> attempt to
>>>>* evict BOs from its LRU list. TTM checks if the BO is valuable
>>>> to evict
>>>>* by calling ttm_device_funcs->eviction_valuable().
>>>>*
>>>>* ttm_device_funcs->eviction_valuable() - will return false if
>>>> the BO belongs
>>>>*  to process X. Otherwise, it will return true to indicate BO
>>>> can be
>>>>*  evicted by TTM.

RE: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved

2024-07-12 Thread YuanShang Mao (River)

Add more info and CC @Kuehling, Felix @cao, lin

In amdgpu_amdkfd_fence.c, there is a design description:

/* Eviction Fence
 * Fence helper functions to deal with KFD memory eviction.
 * Big Idea - Since KFD submissions are done by user queues, a BO cannot be
 *  evicted unless all the user queues for that process are evicted.
 *
 * All the BOs in a process share an eviction fence. When process X wants
 * to map VRAM memory but TTM can't find enough space, TTM will attempt to
 * evict BOs from its LRU list. TTM checks if the BO is valuable to evict
 * by calling ttm_device_funcs->eviction_valuable().
 *
 * ttm_device_funcs->eviction_valuable() - will return false if the BO belongs
 *  to process X. Otherwise, it will return true to indicate BO can be
 *  evicted by TTM.
 *
 * If ttm_device_funcs->eviction_valuable returns true, then TTM will continue
 * the eviction process for that BO by calling ttm_bo_evict --> amdgpu_bo_move
 * --> amdgpu_copy_buffer(). This sets up a job in the GPU scheduler.
 *
 * GPU Scheduler (amd_sched_main) - sets up a cb (fence_add_callback) to
 *  notify when the BO is free to move. fence_add_callback --> enable_signaling
 *  --> amdgpu_amdkfd_fence.enable_signaling
 *
 * amdgpu_amdkfd_fence.enable_signaling - Start a work item that will quiesce
 * user queues and signal fence. The work item will also start another delayed
 * work item to restore BOs
 */

If BOs are marked as invalidated before the job to move the buffer is
submitted, the user queue is still active.
During the window before the user queue is evicted, if a DRM job arrives,
amdgpu_cs_vm_handling will call amdgpu_vm_handle_moved to clear the PTEs of
the invalidated BOs. A page fault then happens because the compute shader is
still accessing the "invalidated" BO.

I am not familiar with the amdgpu_vm_bo state machine, so I don't know
whether this is a code error or a design error.

Thanks
River


-----Original Message-
From: YuanShang Mao (River)
Sent: Friday, July 12, 2024 10:55 AM
To: Christian König 
Cc: Huang, Trigger ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved

We need to make sure that all BOs of an active KFD process are validated.
Moving a buffer will trigger process eviction.
If the BO is marked as invalidated before the process eviction, the related
KFD process is still active and may attempt to access this invalidated BO.

Agree with Trigger. It seems KFD eviction should be synced to the move
notify, not the move action.

Thanks
River

-Original Message-
From: Christian König 
Sent: Thursday, July 11, 2024 8:39 PM
To: Huang, Trigger ; YuanShang Mao (River) 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved

Yeah, completely agree. This patch doesn't really make sense.

Please explain why you would want to do this?

Regards,
Christian.

Am 11.07.24 um 13:56 schrieb Huang, Trigger:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> This patch seems to be wrong.
> Quite a lot of preparations have been done in amdgpu_bo_move_notify.
> For example, amdgpu_bo_kunmap() will be called to prevent the BO from being 
> accessed by CPU. If not called, the CPU may attempt to access the BO while it 
> is being moved.
>
> Thanks,
> Trigger
>
>> -Original Message-
>> From: amd-gfx  On Behalf Of
>> YuanShang
>> Sent: Thursday, July 11, 2024 5:10 PM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: YuanShang Mao (River) ; YuanShang Mao
>> (River) 
>> Subject: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved
>>
>> Caution: This message originated from an External Source. Use proper
>> caution when opening attachments, clicking links, or responding.
>>
>>
>> It leads to a race condition if the amdgpu_bo is marked as invalid
>> before it is really moved.
>>
>> Signed-off-by: YuanShang 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +-
>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 29e4b5875872..a29d5132ad3d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -519,8 +519,8 @@ static int amdgpu_bo_move(struct
>> ttm_buffer_object *bo, bool evict,
>>
>>  if (!old_mem || (old_mem->mem_type == TTM_PL_SYSTEM &&
>>   bo->ttm == NULL)) {
>> -   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  ttm_bo_move_null(bo, new_mem);
>> +   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  return 0;
>>  }
>>  if (ol

RE: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved

2024-07-11 Thread YuanShang Mao (River)

We need to make sure that all BOs of an active KFD process are validated.
Moving a buffer will trigger process eviction.
If the BO is marked as invalidated before the process eviction, the related
KFD process is still active and may attempt to access this invalidated BO.

Agree with Trigger. It seems KFD eviction should be synced to the move
notify, not the move action.

Thanks
River

-Original Message-
From: Christian König 
Sent: Thursday, July 11, 2024 8:39 PM
To: Huang, Trigger ; YuanShang Mao (River) 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved

Yeah, completely agree. This patch doesn't really make sense.

Please explain why you would want to do this?

Regards,
Christian.

Am 11.07.24 um 13:56 schrieb Huang, Trigger:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> This patch seems to be wrong.
> Quite a lot of preparations have been done in amdgpu_bo_move_notify.
> For example, amdgpu_bo_kunmap() will be called to prevent the BO from being 
> accessed by CPU. If not called, the CPU may attempt to access the BO while it 
> is being moved.
>
> Thanks,
> Trigger
>
>> -Original Message-
>> From: amd-gfx  On Behalf Of
>> YuanShang
>> Sent: Thursday, July 11, 2024 5:10 PM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: YuanShang Mao (River) ; YuanShang Mao
>> (River) 
>> Subject: [PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved
>>
>>
>>
>> It leads to a race condition if the amdgpu_bo is marked as invalid
>> before it is really moved.
>>
>> Signed-off-by: YuanShang 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +-
>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 29e4b5875872..a29d5132ad3d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -519,8 +519,8 @@ static int amdgpu_bo_move(struct
>> ttm_buffer_object *bo, bool evict,
>>
>>  if (!old_mem || (old_mem->mem_type == TTM_PL_SYSTEM &&
>>   bo->ttm == NULL)) {
>> -   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  ttm_bo_move_null(bo, new_mem);
>> +   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  return 0;
>>  }
>>  if (old_mem->mem_type == AMDGPU_GEM_DOMAIN_DGMA || @@ -
>> 530,8 +530,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object
>> *bo, bool evict,
>>  if (old_mem->mem_type == TTM_PL_SYSTEM &&
>>  (new_mem->mem_type == TTM_PL_TT ||
>>   new_mem->mem_type == AMDGPU_PL_PREEMPT)) {
>> -   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  ttm_bo_move_null(bo, new_mem);
>> +   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  return 0;
>>  }
>>  if ((old_mem->mem_type == TTM_PL_TT || @@ -542,9 +542,9 @@
>> static int amdgpu_bo_move(struct ttm_buffer_object *bo, bool evict,
>>  return r;
>>
>>  amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm);
>> -   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  ttm_resource_free(bo, >resource);
>>  ttm_bo_assign_mem(bo, new_mem);
>> +   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  return 0;
>>  }
>>
>> @@ -557,8 +557,8 @@ static int amdgpu_bo_move(struct
>> ttm_buffer_object *bo, bool evict,
>>  new_mem->mem_type == AMDGPU_PL_OA ||
>>  new_mem->mem_type == AMDGPU_PL_DOORBELL) {
>>  /* Nothing to save here */
>> -   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  ttm_bo_move_null(bo, new_mem);
>> +   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  return 0;
>>  }
>>
>> @@ -583,11 +583,11 @@ static int amdgpu_bo_move(struct
>> ttm_buffer_object *bo, bool evict,
>>  return -EMULTIHOP;
>>  }
>>
>> -   amdgpu_bo_move_notify(bo, evict, new_mem);
>>  if (adev->mman.buffer_funcs_enabled)
>>  r = amdgpu_move_blit(bo, evict, new_mem, old_mem);
>>  else
>>  r = -ENODEV;
>> +   amdgpu_bo_move_notify(bo, evict, new_mem);
>>
>>  if (r) {
>>  /* Check that all memory is CPU accessible */
>> --
>> 2.25.1



[PATCH] drm/amdgpu: Mark amdgpu_bo as invalid after moved

2024-07-11 Thread YuanShang
It leads to a race condition if the amdgpu_bo is marked as invalid
before it is really moved.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 29e4b5875872..a29d5132ad3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -519,8 +519,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
 
if (!old_mem || (old_mem->mem_type == TTM_PL_SYSTEM &&
 bo->ttm == NULL)) {
-   amdgpu_bo_move_notify(bo, evict, new_mem);
ttm_bo_move_null(bo, new_mem);
+   amdgpu_bo_move_notify(bo, evict, new_mem);
return 0;
}
if (old_mem->mem_type == AMDGPU_GEM_DOMAIN_DGMA ||
@@ -530,8 +530,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
if (old_mem->mem_type == TTM_PL_SYSTEM &&
(new_mem->mem_type == TTM_PL_TT ||
 new_mem->mem_type == AMDGPU_PL_PREEMPT)) {
-   amdgpu_bo_move_notify(bo, evict, new_mem);
ttm_bo_move_null(bo, new_mem);
+   amdgpu_bo_move_notify(bo, evict, new_mem);
return 0;
}
if ((old_mem->mem_type == TTM_PL_TT ||
@@ -542,9 +542,9 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
return r;
 
amdgpu_ttm_backend_unbind(bo->bdev, bo->ttm);
-   amdgpu_bo_move_notify(bo, evict, new_mem);
ttm_resource_free(bo, >resource);
ttm_bo_assign_mem(bo, new_mem);
+   amdgpu_bo_move_notify(bo, evict, new_mem);
return 0;
}
 
@@ -557,8 +557,8 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
new_mem->mem_type == AMDGPU_PL_OA ||
new_mem->mem_type == AMDGPU_PL_DOORBELL) {
/* Nothing to save here */
-   amdgpu_bo_move_notify(bo, evict, new_mem);
ttm_bo_move_null(bo, new_mem);
+   amdgpu_bo_move_notify(bo, evict, new_mem);
return 0;
}
 
@@ -583,11 +583,11 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo, 
bool evict,
return -EMULTIHOP;
}
 
-   amdgpu_bo_move_notify(bo, evict, new_mem);
if (adev->mman.buffer_funcs_enabled)
r = amdgpu_move_blit(bo, evict, new_mem, old_mem);
else
r = -ENODEV;
+   amdgpu_bo_move_notify(bo, evict, new_mem);
 
if (r) {
/* Check that all memory is CPU accessible */
-- 
2.25.1



RE: [PATCH v3] drm/amd/amdgpu: Update RLC_SPM_MC_CNT by ring wreg

2024-01-15 Thread YuanShang Mao (River)
[AMD Official Use Only - General]

Ping...

-Original Message-
From: YuanShang Mao (River) 
Sent: Saturday, January 13, 2024 2:58 PM
To: amd-gfx@lists.freedesktop.org
Cc: YuanShang Mao (River) ; YuanShang Mao (River) 

Subject: [PATCH v3] drm/amd/amdgpu: Update RLC_SPM_MC_CNT by ring wreg

[Why]
RLC_SPM_MC_CNTL cannot be updated by MMIO
since MMIO protection is enabled at runtime in the guest machine.

[How]
Submit a wreg command on the amdgpu ring to update RLC_SPM_MC_CNTL.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  2 +-  
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  2 +-  
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  2 +-  
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  | 12 +---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c |  4 ++--
 8 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
index b591d33af264..5a17e0ff2ab8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
@@ -169,7 +169,7 @@ struct amdgpu_rlc_funcs {
void (*stop)(struct amdgpu_device *adev);
void (*reset)(struct amdgpu_device *adev);
void (*start)(struct amdgpu_device *adev);
-   void (*update_spm_vmid)(struct amdgpu_device *adev, unsigned vmid);
+   void (*update_spm_vmid)(struct amdgpu_device *adev, struct amdgpu_ring
+*ring, unsigned vmid);
bool (*is_rlcg_access_range)(struct amdgpu_device *adev, uint32_t reg); 
 };

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7da71b6a9dc6..13b2c82e5f48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -650,7 +650,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,
amdgpu_gmc_emit_pasid_mapping(ring, job->vmid, job->pasid);

if (spm_update_needed && adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, ring, job->vmid);

if (!ring->is_mes_queue && ring->funcs->emit_gds_switch &&
gds_switch_needed) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index c8a3bf01743f..830ed6fe1baf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7951,7 +7951,7 @@ static void gfx_v10_0_update_spm_vmid_internal(struct 
amdgpu_device *adev,
WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);  }

-static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned int 
vmid)
+static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev,
+struct amdgpu_ring *ring, unsigned int vmid)
 {
amdgpu_gfx_off_ctrl(adev, false);

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index c659ef0f47ce..42e9976c053e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -749,7 +749,7 @@ static int gfx_v11_0_rlc_init(struct amdgpu_device *adev)

/* init spm vmid with 0xf */
if (adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, NULL, 0xf);

return 0;
 }
@@ -5002,7 +5002,7 @@ static int gfx_v11_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
return 0;
 }

-static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)
+static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev,
+struct amdgpu_ring *ring, unsigned vmid)
 {
u32 data;

@@ -5013,9 +5013,15 @@ static void gfx_v11_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;

-   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
+   if (ring == NULL)
+   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);

amdgpu_gfx_off_ctrl(adev, true);
+
+   if (ring) {
+   uint32_t reg = SOC15_REG_OFFSET(GC, 0, regRLC_SPM_MC_CNTL);
+   amdgpu_ring_emit_wreg(ring, reg, data);
+   }
 }

 static const struct amdgpu_rlc_funcs gfx_v11_0_rlc_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index c2faf6b4c2fc..86a4865b1ae5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3274,7 +3274,7 @@ static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)

/* i

[PATCH v3] drm/amd/amdgpu: Update RLC_SPM_MC_CNT by ring wreg

2024-01-12 Thread YuanShang
[Why]
RLC_SPM_MC_CNTL cannot be updated by MMIO
since MMIO protection is enabled at runtime in
the guest machine.

[How]
Submit a wreg command on the amdgpu ring to update
RLC_SPM_MC_CNTL.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  | 12 +---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c |  4 ++--
 8 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
index b591d33af264..5a17e0ff2ab8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
@@ -169,7 +169,7 @@ struct amdgpu_rlc_funcs {
void (*stop)(struct amdgpu_device *adev);
void (*reset)(struct amdgpu_device *adev);
void (*start)(struct amdgpu_device *adev);
-   void (*update_spm_vmid)(struct amdgpu_device *adev, unsigned vmid);
+   void (*update_spm_vmid)(struct amdgpu_device *adev, struct amdgpu_ring 
*ring, unsigned vmid);
bool (*is_rlcg_access_range)(struct amdgpu_device *adev, uint32_t reg);
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7da71b6a9dc6..13b2c82e5f48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -650,7 +650,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,
amdgpu_gmc_emit_pasid_mapping(ring, job->vmid, job->pasid);
 
if (spm_update_needed && adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, ring, job->vmid);
 
if (!ring->is_mes_queue && ring->funcs->emit_gds_switch &&
gds_switch_needed) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index c8a3bf01743f..830ed6fe1baf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7951,7 +7951,7 @@ static void gfx_v10_0_update_spm_vmid_internal(struct 
amdgpu_device *adev,
WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
-static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned int 
vmid)
+static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, struct 
amdgpu_ring *ring, unsigned int vmid)
 {
amdgpu_gfx_off_ctrl(adev, false);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index c659ef0f47ce..42e9976c053e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -749,7 +749,7 @@ static int gfx_v11_0_rlc_init(struct amdgpu_device *adev)
 
/* init spm vmid with 0xf */
if (adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, NULL, 0xf);
 
return 0;
 }
@@ -5002,7 +5002,7 @@ static int gfx_v11_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
return 0;
 }
 
-static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)
+static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev, struct 
amdgpu_ring *ring, unsigned vmid)
 {
u32 data;
 
@@ -5013,9 +5013,15 @@ static void gfx_v11_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
+   if (ring == NULL)
+   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
 
amdgpu_gfx_off_ctrl(adev, true);
+
+   if (ring) {
+   uint32_t reg = SOC15_REG_OFFSET(GC, 0, regRLC_SPM_MC_CNTL);
+   amdgpu_ring_emit_wreg(ring, reg, data);
+   }
 }
 
 static const struct amdgpu_rlc_funcs gfx_v11_0_rlc_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index c2faf6b4c2fc..86a4865b1ae5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3274,7 +3274,7 @@ static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)
 
/* init spm vmid with 0xf */
if (adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, NULL, 0xf);
 
return 0;
 }
@@ -3500,7 +3500,7 @@ static int gf

[PATCH v2] drm/amd/amdgpu: Update RLC_SPM_MC_CNT by ring wreg

2024-01-11 Thread YuanShang
[Why]
RLC_SPM_MC_CNTL cannot be updated by MMIO
since MMIO protection is enabled at runtime in
the guest machine.

[How]
Submit a wreg command on the amdgpu ring to update
RLC_SPM_MC_CNTL.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  | 12 +---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c |  4 ++--
 8 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
index b591d33af264..5a17e0ff2ab8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
@@ -169,7 +169,7 @@ struct amdgpu_rlc_funcs {
void (*stop)(struct amdgpu_device *adev);
void (*reset)(struct amdgpu_device *adev);
void (*start)(struct amdgpu_device *adev);
-   void (*update_spm_vmid)(struct amdgpu_device *adev, unsigned vmid);
+   void (*update_spm_vmid)(struct amdgpu_device *adev, struct amdgpu_ring 
*ring, unsigned vmid);
bool (*is_rlcg_access_range)(struct amdgpu_device *adev, uint32_t reg);
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7da71b6a9dc6..13b2c82e5f48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -650,7 +650,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,
amdgpu_gmc_emit_pasid_mapping(ring, job->vmid, job->pasid);
 
if (spm_update_needed && adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, ring, job->vmid);
 
if (!ring->is_mes_queue && ring->funcs->emit_gds_switch &&
gds_switch_needed) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index c8a3bf01743f..830ed6fe1baf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7951,7 +7951,7 @@ static void gfx_v10_0_update_spm_vmid_internal(struct 
amdgpu_device *adev,
WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
-static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned int 
vmid)
+static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, struct 
amdgpu_ring *ring, unsigned int vmid)
 {
amdgpu_gfx_off_ctrl(adev, false);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index c659ef0f47ce..a981071e7c93 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -749,7 +749,7 @@ static int gfx_v11_0_rlc_init(struct amdgpu_device *adev)
 
/* init spm vmid with 0xf */
if (adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, NULL, 0xf);
 
return 0;
 }
@@ -5002,7 +5002,7 @@ static int gfx_v11_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
return 0;
 }
 
-static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev, unsigned 
vmid)
+static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev, struct 
amdgpu_ring *ring, unsigned vmid)
 {
u32 data;
 
@@ -5013,9 +5013,15 @@ static void gfx_v11_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << 
RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
+   if (ring == NULL)
+   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
 
amdgpu_gfx_off_ctrl(adev, true);
+
+   if (ring) {
+   uint32_t reg = SOC15_REG_OFFSET(GC, 0, 
regRLC_SPM_MC_CNTL);
+   amdgpu_ring_emit_wreg(ring, reg, data);
+   }
 }
 
 static const struct amdgpu_rlc_funcs gfx_v11_0_rlc_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index c2faf6b4c2fc..86a4865b1ae5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3274,7 +3274,7 @@ static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)
 
/* init spm vmid with 0xf */
if (adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, NULL, 0xf);
 
return 0;
 }
@@ -3500,7 +3500,7 @@ static

[PATCH] SWDEV-439292 - Update RLC_SPM_MC_CNT by ring wreg

2024-01-11 Thread YuanShang
Why: RLC_SPM_MC_CNTL cannot be updated by MMIO
since MMIO protection is enabled at runtime.

How: submit a wreg command on the amdgpu ring to update
RLC_SPM_MC_CNTL.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  | 12 +---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c |  4 ++--
 8 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
index b591d33af264..5a17e0ff2ab8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rlc.h
@@ -169,7 +169,7 @@ struct amdgpu_rlc_funcs {
void (*stop)(struct amdgpu_device *adev);
void (*reset)(struct amdgpu_device *adev);
void (*start)(struct amdgpu_device *adev);
-   void (*update_spm_vmid)(struct amdgpu_device *adev, unsigned vmid);
+   void (*update_spm_vmid)(struct amdgpu_device *adev, struct amdgpu_ring 
*ring, unsigned vmid);
bool (*is_rlcg_access_range)(struct amdgpu_device *adev, uint32_t reg);
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 7da71b6a9dc6..13b2c82e5f48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -650,7 +650,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct 
amdgpu_job *job,
amdgpu_gmc_emit_pasid_mapping(ring, job->vmid, job->pasid);
 
if (spm_update_needed && adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, ring, job->vmid);
 
if (!ring->is_mes_queue && ring->funcs->emit_gds_switch &&
gds_switch_needed) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index c8a3bf01743f..830ed6fe1baf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7951,7 +7951,7 @@ static void gfx_v10_0_update_spm_vmid_internal(struct 
amdgpu_device *adev,
WREG32_SOC15_NO_KIQ(GC, 0, mmRLC_SPM_MC_CNTL, data);
 }
 
-static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, unsigned int vmid)
+static void gfx_v10_0_update_spm_vmid(struct amdgpu_device *adev, struct amdgpu_ring *ring, unsigned int vmid)
 {
amdgpu_gfx_off_ctrl(adev, false);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index c659ef0f47ce..a981071e7c93 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -749,7 +749,7 @@ static int gfx_v11_0_rlc_init(struct amdgpu_device *adev)
 
/* init spm vmid with 0xf */
if (adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, NULL, 0xf);
 
return 0;
 }
@@ -5002,7 +5002,7 @@ static int gfx_v11_0_update_gfx_clock_gating(struct 
amdgpu_device *adev,
return 0;
 }
 
-static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev, unsigned vmid)
+static void gfx_v11_0_update_spm_vmid(struct amdgpu_device *adev, struct amdgpu_ring *ring, unsigned vmid)
 {
u32 data;
 
@@ -5013,9 +5013,15 @@ static void gfx_v11_0_update_spm_vmid(struct 
amdgpu_device *adev, unsigned vmid)
data &= ~RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK;
data |= (vmid & RLC_SPM_MC_CNTL__RLC_SPM_VMID_MASK) << RLC_SPM_MC_CNTL__RLC_SPM_VMID__SHIFT;
 
-   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
+   if (ring == NULL)
+   WREG32_SOC15_NO_KIQ(GC, 0, regRLC_SPM_MC_CNTL, data);
 
amdgpu_gfx_off_ctrl(adev, true);
+
+   if (ring) {
+   uint32_t reg = SOC15_REG_OFFSET(GC, 0, 
regRLC_SPM_MC_CNTL);
+   amdgpu_ring_emit_wreg(ring, reg, data);
+   }
 }
 
 static const struct amdgpu_rlc_funcs gfx_v11_0_rlc_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index c2faf6b4c2fc..86a4865b1ae5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -3274,7 +3274,7 @@ static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)
 
/* init spm vmid with 0xf */
if (adev->gfx.rlc.funcs->update_spm_vmid)
-   adev->gfx.rlc.funcs->update_spm_vmid(adev, 0xf);
+   adev->gfx.rlc.funcs->update_spm_vmid(adev, NULL, 0xf);
 
return 0;
 }
@@ -3500,7 +3500,7 @@ static int gf

[PATCH] drm/amdgpu: correct chunk_ptr to a pointer to chunk.

2023-10-30 Thread YuanShang
The variable "chunk_ptr" should be a pointer to a
struct drm_amdgpu_cs_chunk, not a pointer to such a pointer.
Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 74769afaa33d..551b9466a441 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -208,7 +208,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
}
 
for (i = 0; i < p->nchunks; i++) {
-   struct drm_amdgpu_cs_chunk __user **chunk_ptr = NULL;
+   struct drm_amdgpu_cs_chunk __user *chunk_ptr = NULL;
struct drm_amdgpu_cs_chunk user_chunk;
uint32_t __user *cdata;
 
-- 
2.25.1



[PATCH v3] drm/amdgpu: load sdma ucode in the guest machine

2023-07-18 Thread YuanShang
[why]
The user mode driver needs to check the sdma ucode version to
see whether the sdma engine supports a new type of PM4 packet.
Under SRIOV, sdma is loaded by the host, so the guest machine
has no way to check the host's sdma ucode version for
CHIP_NAVI12 and CHIP_SIENNA_CICHLID.

[how]
Load the sdma ucode for CHIP_NAVI12 and CHIP_SIENNA_CICHLID
in the guest machine.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c |  3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c   |  8 +++-
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index dacf281d2b21..e2b9392d7f0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -239,9 +239,6 @@ int amdgpu_sdma_init_microcode(struct amdgpu_device *adev,
   sizeof(struct amdgpu_sdma_instance));
}
 
-   if (amdgpu_sriov_vf(adev))
-   return 0;
-
DRM_DEBUG("psp_load == '%s'\n",
  adev->firmware.load_type == AMDGPU_FW_LOAD_PSP ? "true" : 
"false");
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 41aa853a07d2..3365fe04275a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -845,6 +845,17 @@ bool amdgpu_virt_fw_load_skip_check(struct amdgpu_device 
*adev, uint32_t ucode_i
return false;
else
return true;
+   case IP_VERSION(11, 0, 9):
+   case IP_VERSION(11, 0, 7):
+   /* black list for CHIP_NAVI12 and CHIP_SIENNA_CICHLID */
+   if (ucode_id == AMDGPU_UCODE_ID_RLC_G
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_SRM_MEM
+   || ucode_id == AMDGPU_UCODE_ID_SMC)
+   return true;
+   else
+   return false;
case IP_VERSION(13, 0, 10):
/* white list */
if (ucode_id == AMDGPU_UCODE_ID_CAP
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 5c4d4df9cf94..1cc34efb455b 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -237,17 +237,15 @@ static void sdma_v5_0_init_golden_registers(struct 
amdgpu_device *adev)
 // emulation only, won't work on real chip
 // navi10 real chip need to use PSP to load firmware
 static int sdma_v5_0_init_microcode(struct amdgpu_device *adev)
-{  int ret, i;
-
-   if (amdgpu_sriov_vf(adev) && (adev->ip_versions[SDMA0_HWIP][0] == IP_VERSION(5, 0, 5)))
-   return 0;
+{
+   int ret, i;
 
for (i = 0; i < adev->sdma.num_instances; i++) {
ret = amdgpu_sdma_init_microcode(adev, i, false);
if (ret)
return ret;
}
-   
+
return ret;
 }
 
-- 
2.25.1



[PATCH v2] drm/amdgpu: load sdma ucode in the guest machine

2023-07-17 Thread YuanShang
[why]
The user mode driver needs to check the sdma ucode version to
see whether the sdma engine supports a new type of PM4 packet.
Under SRIOV, sdma is loaded by the host, so the guest machine
has no way to check the host's sdma ucode version for
CHIP_NAVI12 and CHIP_SIENNA_CICHLID.

[how]
Load the sdma ucode for CHIP_NAVI12 and CHIP_SIENNA_CICHLID
in the guest machine.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c |  3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c   |  8 +++-
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index dacf281d2b21..e2b9392d7f0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -239,9 +239,6 @@ int amdgpu_sdma_init_microcode(struct amdgpu_device *adev,
   sizeof(struct amdgpu_sdma_instance));
}
 
-   if (amdgpu_sriov_vf(adev))
-   return 0;
-
DRM_DEBUG("psp_load == '%s'\n",
  adev->firmware.load_type == AMDGPU_FW_LOAD_PSP ? "true" : 
"false");
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 41aa853a07d2..ab76817d94ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -845,6 +845,17 @@ bool amdgpu_virt_fw_load_skip_check(struct amdgpu_device 
*adev, uint32_t ucode_i
return false;
else
return true;
+   case IP_VERSION(13, 0, 7):
+   case IP_VERSION(13, 0, 9):
+   /* black list for CHIP_NAVI12 and CHIP_SIENNA_CICHLID */
+   if (ucode_id == AMDGPU_UCODE_ID_RLC_G
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_SRM_MEM
+   || ucode_id == AMDGPU_UCODE_ID_SMC)
+   return true;
+   else
+   return false;
case IP_VERSION(13, 0, 10):
/* white list */
if (ucode_id == AMDGPU_UCODE_ID_CAP
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 5c4d4df9cf94..1cc34efb455b 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -237,17 +237,15 @@ static void sdma_v5_0_init_golden_registers(struct 
amdgpu_device *adev)
 // emulation only, won't work on real chip
 // navi10 real chip need to use PSP to load firmware
 static int sdma_v5_0_init_microcode(struct amdgpu_device *adev)
-{  int ret, i;
-
-   if (amdgpu_sriov_vf(adev) && (adev->ip_versions[SDMA0_HWIP][0] == IP_VERSION(5, 0, 5)))
-   return 0;
+{
+   int ret, i;
 
for (i = 0; i < adev->sdma.num_instances; i++) {
ret = amdgpu_sdma_init_microcode(adev, i, false);
if (ret)
return ret;
}
-   
+
return ret;
 }
 
-- 
2.25.1



[PATCH] drm/amdgpu: load sdma ucode in the guest machine

2023-07-13 Thread YuanShang
Load the sdma ucode in the guest machine for CHIP_NAVI12
and CHIP_SIENNA_CICHLID, so that the guest can check
the version of the current sdma ucode.
This is needed by KFDTopologyTest.BasicTest, which
uses the sdma ucode version to see whether the sdma
engine supports a new type of packet (Barrier Value
Packet).

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c |  3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 11 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c   |  6 ++
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index dacf281d2b21..e2b9392d7f0d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -239,9 +239,6 @@ int amdgpu_sdma_init_microcode(struct amdgpu_device *adev,
   sizeof(struct amdgpu_sdma_instance));
}
 
-   if (amdgpu_sriov_vf(adev))
-   return 0;
-
DRM_DEBUG("psp_load == '%s'\n",
  adev->firmware.load_type == AMDGPU_FW_LOAD_PSP ? "true" : 
"false");
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 41aa853a07d2..16e4e30ee28e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -845,6 +845,17 @@ bool amdgpu_virt_fw_load_skip_check(struct amdgpu_device 
*adev, uint32_t ucode_i
return false;
else
return true;
+   case IP_VERSION(13, 0, 7):
+   case IP_VERSION(13, 0, 9):
+   /* black list for navi12 and navi21 */
+   if (ucode_id == AMDGPU_UCODE_ID_RLC_G
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM
+   || ucode_id == AMDGPU_UCODE_ID_RLC_RESTORE_LIST_SRM_MEM
+   || ucode_id == AMDGPU_UCODE_ID_SMC)
+   return true;
+   else
+   return false;
case IP_VERSION(13, 0, 10):
/* white list */
if (ucode_id == AMDGPU_UCODE_ID_CAP
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 5c4d4df9cf94..aa6b7390a7a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -237,10 +237,8 @@ static void sdma_v5_0_init_golden_registers(struct 
amdgpu_device *adev)
 // emulation only, won't work on real chip
 // navi10 real chip need to use PSP to load firmware
 static int sdma_v5_0_init_microcode(struct amdgpu_device *adev)
-{  int ret, i;
-
-   if (amdgpu_sriov_vf(adev) && (adev->ip_versions[SDMA0_HWIP][0] == IP_VERSION(5, 0, 5)))
-   return 0;
+{
+   int ret, i;
 
for (i = 0; i < adev->sdma.num_instances; i++) {
ret = amdgpu_sdma_init_microcode(adev, i, false);
-- 
2.25.1



[PATCH] drm/amdgpu: Remove IMU ucode in vf2pf

2023-05-09 Thread YuanShang
The IMU firmware is loaded on the host side, not in the guest.
Remove IMU from the vf2pf ucode id enum.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 1 -
 drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 1311e42ab8e9..0af871735a74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -557,7 +557,6 @@ static void amdgpu_virt_populate_vf2pf_ucode_info(struct 
amdgpu_device *adev)
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_RLC_SRLS, 
adev->gfx.rlc_srls_fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_MEC,  
adev->gfx.mec_fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_MEC2, 
adev->gfx.mec2_fw_version);
-   POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_IMU,  
adev->gfx.imu_fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_SOS,  
adev->psp.sos.fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_ASD,
adev->psp.asd_context.bin_desc.fw_version);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h 
b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
index 24d42d24e6a0..104a5ad8397d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
@@ -70,7 +70,6 @@ enum amd_sriov_ucode_engine_id {
AMD_SRIOV_UCODE_ID_RLC_SRLS,
AMD_SRIOV_UCODE_ID_MEC,
AMD_SRIOV_UCODE_ID_MEC2,
-   AMD_SRIOV_UCODE_ID_IMU,
AMD_SRIOV_UCODE_ID_SOS,
AMD_SRIOV_UCODE_ID_ASD,
AMD_SRIOV_UCODE_ID_TA_RAS,
-- 
2.25.1



[PATCH] drm/amd: Remove IMU ucode in vf2pf

2023-05-09 Thread YuanShang
Remove IMU from the vf2pf ucode id enum to align with the host's definition.

Signed-off-by: YuanShang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c| 1 -
 drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 1311e42ab8e9..0af871735a74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -557,7 +557,6 @@ static void amdgpu_virt_populate_vf2pf_ucode_info(struct 
amdgpu_device *adev)
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_RLC_SRLS, 
adev->gfx.rlc_srls_fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_MEC,  
adev->gfx.mec_fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_MEC2, 
adev->gfx.mec2_fw_version);
-   POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_IMU,  
adev->gfx.imu_fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_SOS,  
adev->psp.sos.fw_version);
POPULATE_UCODE_INFO(vf2pf_info, AMD_SRIOV_UCODE_ID_ASD,
adev->psp.asd_context.bin_desc.fw_version);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h 
b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
index 24d42d24e6a0..104a5ad8397d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h
@@ -70,7 +70,6 @@ enum amd_sriov_ucode_engine_id {
AMD_SRIOV_UCODE_ID_RLC_SRLS,
AMD_SRIOV_UCODE_ID_MEC,
AMD_SRIOV_UCODE_ID_MEC2,
-   AMD_SRIOV_UCODE_ID_IMU,
AMD_SRIOV_UCODE_ID_SOS,
AMD_SRIOV_UCODE_ID_ASD,
AMD_SRIOV_UCODE_ID_TA_RAS,
-- 
2.25.1