Re: [PATCH] drm/amd check num of link levels when update pcie param

2023-10-24 Thread Chen, JingWen (Wayne)

Acked-by: Jingwen Chen 

Best Regards,
JingWen Chen

On 2023/10/19 17:46, Lin.Cao wrote:

In an SR-IOV environment, the value of pcie_table->num_of_link_levels will
be 0, so num_of_levels - 1 causes an out-of-bounds array index.

Signed-off-by: Lin.Cao 
---
  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index bcb7ab9d2221..6906b0a7d1d1 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -2437,6 +2437,9 @@ int smu_v13_0_update_pcie_parameters(struct smu_context *smu,
uint32_t smu_pcie_arg;
int ret, i;
 
+	if (!num_of_levels)
+		return 0;
+
if (!amdgpu_device_pcie_dynamic_switching_supported()) {
if (pcie_table->pcie_gen[num_of_levels - 1] < pcie_gen_cap)
pcie_gen_cap = pcie_table->pcie_gen[num_of_levels - 1];
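
The guard in the patch above can be illustrated outside the driver. This is a minimal sketch, not the driver code: the struct, the capacity constant and the helper name are assumptions made for illustration. It shows why an empty table must be rejected before indexing with num_of_levels - 1, since an unsigned 0 - 1 wraps to a very large index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS_SKETCH 3 /* hypothetical capacity, for illustration only */

/* Simplified stand-in for the driver's PCIe DPM table (assumed layout). */
struct pcie_table_sketch {
	uint8_t pcie_gen[MAX_LEVELS_SKETCH];
	uint32_t num_of_link_levels;
};

/*
 * Clamp gen_cap to the generation of the highest link level.
 * Returns gen_cap unchanged when the table is empty, which is the
 * situation the patch describes for SR-IOV.
 */
static uint8_t clamp_gen_cap(const struct pcie_table_sketch *t, uint8_t gen_cap)
{
	uint32_t n;

	assert(t != NULL);
	n = t->num_of_link_levels;
	if (!n)
		return gen_cap; /* n - 1 would wrap around and index out of bounds */
	assert(n <= MAX_LEVELS_SKETCH);
	if (t->pcie_gen[n - 1] < gen_cap)
		gen_cap = t->pcie_gen[n - 1];
	return gen_cap;
}
```

The real function returns 0 early and skips the whole update; the sketch only keeps the indexing logic that the guard protects.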


Re: [PATCH] drm/amdgpu: save VCN instances init info before jpeg init

2023-10-10 Thread Chen, JingWen (Wayne)

Reviewed-by: Jingwen Chen 

Best Regards,
JingWen Chen

On 2023/10/10 14:27, Lin.Cao wrote:

The JPEG init header will overwrite the VCN init header info, which
loses some debug information.

Signed-off-by: Lin.Cao 
---
  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
index a3768aefb6b6..bc38b90f8cf8 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
@@ -431,6 +431,10 @@ static int jpeg_v4_0_start_sriov(struct amdgpu_device *adev)
end.cmd_header.command_type =
MMSCH_COMMAND__END;
 
+	size = sizeof(struct mmsch_v4_0_init_header);
+	table_loc = (uint32_t *)table->cpu_addr;
+	memcpy(&header, (void *)table_loc, size);
+
header.version = MMSCH_VERSION;
header.total_size = RREG32_SOC15(VCN, 0, regMMSCH_VF_CTX_SIZE);
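
The idea of the patch above, loading the existing header before modifying it, can be sketched as a read-modify-write. This is a hedged illustration only: the header layout, field names and helper are hypothetical and do not match the real mmsch_v4_0_init_header, and the write-back step is an assumption about how the header reaches the shared table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-engine header; the real mmsch_v4_0_init_header differs. */
struct init_header_sketch {
	uint32_t version;
	uint32_t total_size;
	uint32_t vcn_init_status;  /* written earlier by VCN init */
	uint32_t jpeg_init_status; /* written by JPEG init */
};

/* Update only the JPEG-owned field of the header stored at the start of table. */
static void jpeg_write_header(void *table, uint32_t jpeg_status)
{
	struct init_header_sketch header;

	assert(table != NULL);
	/* Load the existing header first so the VCN fields survive the write-back. */
	memcpy(&header, table, sizeof(header));
	header.jpeg_init_status = jpeg_status;
	memcpy(table, &header, sizeof(header));
}
```

Without the first memcpy, a freshly initialized local header would be written back and the fields filled in by VCN init would be lost.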
  


Re: [PATCH] drm/amdgpu: Return -EINVAL when MMSCH init status incorrect

2023-10-09 Thread Chen, JingWen (Wayne)

Reviewed-by: Jingwen Chen 
--
Best Regards,
JingWen Chen

On 2023/10/8 18:06, Lin.Cao wrote:

Return -EINVAL when MMSCH init fails, so that the failure can be handled
correctly by amdgpu_device_reset_sriov.

Signed-off-by: Lin.Cao 
---
  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
index ac614b869aaf..a3768aefb6b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c
@@ -518,8 +518,11 @@ static int jpeg_v4_0_start_sriov(struct amdgpu_device *adev)
return -EBUSY;
}
}
-	if (resp != expected && resp != MMSCH_VF_MAILBOX_RESP__INCOMPLETE && init_status != MMSCH_VF_ENGINE_STATUS__PASS)
+	if (resp != expected && resp != MMSCH_VF_MAILBOX_RESP__INCOMPLETE
+		&& init_status != MMSCH_VF_ENGINE_STATUS__PASS) {
DRM_ERROR("MMSCH init status is incorrect! readback=0x%08x, header init status for jpeg: %x\n", resp, init_status);
+		return -EINVAL;
+	}
 
 	return 0;
  




Re: [RFC v4 02/11] drm/amdgpu: Move scheduler init to after XGMI is ready

2022-03-03 Thread Chen, JingWen
Thanks a lot

Best Regards,
JingWen Chen



> On Mar 4, 2022, at 00:36, Grodzovsky, Andrey  
> wrote:
> 
> I pushed all the changes including your patch.
> 
> Andrey
> 
> On 2022-03-02 22:16, Andrey Grodzovsky wrote:
>> OK, I will do a quick smoke test tomorrow and push all of it then.
>> 
>> Andrey
>> 
>> On 2022-03-02 21:59, Chen, JingWen wrote:
>>> Hi Andrey,
>>> 
>>> I don't have a bare-metal environment; I can only test the SRIOV cases.
>>> 
>>> Best Regards,
>>> JingWen Chen
>>> 
>>> 
>>> 
>>>> On Mar 3, 2022, at 01:55, Grodzovsky, Andrey  
>>>> wrote:
>>>> 
>>>> The patch is acked-by: Andrey Grodzovsky 
>>>> 
>>>> If you also smoke tested bare metal, feel free to apply all the patches; 
>>>> if not, let me know.
>>>> 
>>>> Andrey
>>>> 
>>>> On 2022-03-02 04:51, JingWen Chen wrote:
>>>>> Hi Andrey,
>>>>> 
>>>>> Most of the patches are OK, but the code will introduce an IB test 
>>>>> failure on the disabled VCN of sienna_cichlid.
>>>>> 
>>>>> In the SRIOV use case we disable one VCN on sienna_cichlid. I have 
>>>>> attached a patch to fix this issue; please check the attachment.
>>>>> 
>>>>> Best Regards,
>>>>> 
>>>>> Jingwen Chen
>>>>> 
>>>>> 
>>>>> On 2/26/22 5:22 AM, Andrey Grodzovsky wrote:
>>>>>> Hey, patches attached - I applied the patches and resolved merge 
>>>>>> conflicts but wasn't able to test, as my onboard network card doesn't 
>>>>>> work with the 5.16 kernel (it does with 5.17; maybe it's a Kconfig issue and I 
>>>>>> need to check more).
>>>>>> The patches are on top of 'cababde192b2 Yifan Zhang 2 days ago   
>>>>>>   drm/amd/pm: fix mode2 reset fail for smu 13.0.5 ' commit.
>>>>>> 
>>>>>> Please test and let me know. Maybe by Monday I will be able to resolve 
>>>>>> the connectivity issue on 5.16.
>>>>>> 
>>>>>> Andrey
>>>>>> 
>>>>>> On 2022-02-24 22:13, JingWen Chen wrote:
>>>>>>> Hi Andrey,
>>>>>>> 
>>>>>>> Sorry for the confusion; I meant the whole patch series. We are 
>>>>>>> depending on this patch series to fix the concurrency issue within 
>>>>>>> the SRIOV TDR sequence.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 2/25/22 1:26 AM, Andrey Grodzovsky wrote:
>>>>>>>> No problem if so but before I do,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> JingWen - why do you think this patch is needed as a standalone now? It 
>>>>>>>> has no use without the
>>>>>>>> entire feature together with it. Are there some changes you want to make on 
>>>>>>>> top of that code?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Andrey
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 2022-02-24 12:12, Deucher, Alexander wrote:
>>>>>>>>> [Public]
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> If it applies cleanly, feel free to drop it in. I'll drop those 
>>>>>>>>> patches for drm-next since they are already in drm-misc.
>>>>>>>>> 
>>>>>>>>> Alex
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>  
>>>>>>>>> *From:* amd-gfx  on behalf of 
>>>>>>>>> Andrey Grodzovsky 
>>>>>>>>> *Sent:* Thursday, February 24, 2022 11:24 AM
>>>>>>>>> *To:* Chen, JingWen ; Christian König 
>>>>>>>>> ; dri-de...@lists.freedesktop.org 
>>>>>>>>> ; amd-gfx@lists.freedesktop.org 
>>>>>>>>> 
>>>>>>>>> *Cc:* Liu, Monk ; Chen, Horace 
>>>>>>>>> ; Lazar, Lijo ; Koenig, 
>>>>>>>>> Christian

Re: [RFC v4 02/11] drm/amdgpu: Move scheduler init to after XGMI is ready

2022-03-02 Thread Chen, JingWen
Hi Andrey,

I don't have a bare-metal environment; I can only test the SRIOV cases.

Best Regards,
JingWen Chen



> On Mar 3, 2022, at 01:55, Grodzovsky, Andrey  
> wrote:
> 
> The patch is acked-by: Andrey Grodzovsky 
> 
> If you also smoke tested bare metal, feel free to apply all the patches; if 
> not, let me know.
> 
> Andrey
> 
> On 2022-03-02 04:51, JingWen Chen wrote:
>> Hi Andrey,
>> 
>> Most of the patches are OK, but the code will introduce an IB test failure 
>> on the disabled VCN of sienna_cichlid.
>> 
>> In the SRIOV use case we disable one VCN on sienna_cichlid. I have attached 
>> a patch to fix this issue; please check the attachment.
>> 
>> Best Regards,
>> 
>> Jingwen Chen
>> 
>> 
>> On 2/26/22 5:22 AM, Andrey Grodzovsky wrote:
>>> Hey, patches attached - I applied the patches and resolved merge conflicts 
>>> but wasn't able to test, as my onboard network card doesn't work with the 
>>> 5.16 kernel (it does with 5.17; maybe it's a Kconfig issue and I need to 
>>> check more).
>>> The patches are on top of 'cababde192b2 Yifan Zhang 2 days ago 
>>> drm/amd/pm: fix mode2 reset fail for smu 13.0.5 ' commit.
>>> 
>>> Please test and let me know. Maybe by Monday I will be able to resolve the 
>>> connectivity issue on 5.16.
>>> 
>>> Andrey
>>> 
>>> On 2022-02-24 22:13, JingWen Chen wrote:
>>>> Hi Andrey,
>>>> 
>>>> Sorry for the confusion; I meant the whole patch series. We are depending 
>>>> on this patch series to fix the concurrency issue within the SRIOV TDR 
>>>> sequence.
>>>> 
>>>> 
>>>> 
>>>> On 2/25/22 1:26 AM, Andrey Grodzovsky wrote:
>>>>> No problem if so but before I do,
>>>>> 
>>>>> 
>>>>> JingWen - why do you think this patch is needed as a standalone now? It has 
>>>>> no use without the
>>>>> entire feature together with it. Are there some changes you want to make on top 
>>>>> of that code?
>>>>> 
>>>>> 
>>>>> Andrey
>>>>> 
>>>>> 
>>>>> On 2022-02-24 12:12, Deucher, Alexander wrote:
>>>>>> [Public]
>>>>>> 
>>>>>> 
>>>>>> If it applies cleanly, feel free to drop it in.  I'll drop those patches 
>>>>>> for drm-next since they are already in drm-misc.
>>>>>> 
>>>>>> Alex
>>>>>> 
>>>>>> 
>>>>>> *From:* amd-gfx  on behalf of 
>>>>>> Andrey Grodzovsky 
>>>>>> *Sent:* Thursday, February 24, 2022 11:24 AM
>>>>>> *To:* Chen, JingWen ; Christian König 
>>>>>> ; dri-de...@lists.freedesktop.org 
>>>>>> ; amd-gfx@lists.freedesktop.org 
>>>>>> 
>>>>>> *Cc:* Liu, Monk ; Chen, Horace ; 
>>>>>> Lazar, Lijo ; Koenig, Christian 
>>>>>> ; dan...@ffwll.ch 
>>>>>> *Subject:* Re: [RFC v4 02/11] drm/amdgpu: Move scheduler init to after 
>>>>>> XGMI is ready
>>>>>> No, because the whole patch set, including this patch, has landed in
>>>>>> drm-misc-next and will reach amd-staging-drm-next on the next upstream
>>>>>> rebase, I guess.
>>>>>> 
>>>>>> Andrey
>>>>>> 
>>>>>> On 2022-02-24 01:47, JingWen Chen wrote:
>>>>>>> Hi Andrey,
>>>>>>> 
>>>>>>> Will you port this patch into amd-staging-drm-next?
>>>>>>> 
>>>>>>> on 2/10/22 2:06 AM, Andrey Grodzovsky wrote:
>>>>>>>> All comments are fixed and code pushed. Thanks for everyone
>>>>>>>> who helped reviewing.
>>>>>>>> 
>>>>>>>> Andrey
>>>>>>>> 
>>>>>>>> On 2022-02-09 02:53, Christian König wrote:
>>>>>>>>> On 09.02.22 at 01:23, Andrey Grodzovsky wrote:
>>>>>>>>>> Before we initialize schedulers we must know which reset
>>>>>>>>>> domain are we in - for single device there is a single
>>>>>>>>>> domain per device and so single wq per device. For XGMI

RE: [PATCH] drm/amd/amdgpu: consider paging job always not guilty

2021-07-21 Thread Chen, JingWen
[AMD Official Use Only]

Hi Christian,

I have uploaded the latest patch according to your suggestion.

Best Regards,
JingWen Chen

-Original Message-
From: Christian König 
Sent: Tuesday, July 20, 2021 8:13 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Chen, Horace ; Liu, Monk 
Subject: Re: [PATCH] drm/amd/amdgpu: consider paging job always not guilty



On 20.07.21 at 13:02, Jingwen Chen wrote:
> [Why]
> Currently every timed-out job is considered guilty. In the SRIOV
> multi-VF use case, the VF FLR happens first and then the job timeout is
> found. Several jobs can time out within a very small time slice.
> If the timeout of an innocent SDMA job is found before that of the real bad
> job, the innocent SDMA job will be marked guilty. This leads
> to a page fault after the job is resubmitted.
>
> [How]
> If the job is a paging job, we will always consider it not guilty

Don't say "paging job", better "kernel job". Since the PTE updates we are using 
here are not even remotely related to paging.

Regards,
Christian.

>
> Signed-off-by: Jingwen Chen 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 37fa199be8b3..40461547701a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4410,7 +4410,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
>   amdgpu_fence_driver_force_completion(ring);
>   }
>
> - if(job)
> + if (job && job->vm)
>   drm_sched_increase_karma(&job->base);
>
>   r = amdgpu_reset_prepare_hwcontext(adev, reset_context);
> @@ -4874,7 +4874,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>   DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
>   job ? job->base.id : -1, hive->hive_id);
>   amdgpu_put_xgmi_hive(hive);
> - if (job)
> + if (job && job->vm)
>   drm_sched_increase_karma(&job->base);
>   return 0;
>   }
> @@ -4898,7 +4898,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>   job ? job->base.id : -1);
>
>   /* even we skipped this reset, still need to set the job to guilty */
> - if (job)
> + if (job && job->vm)
>   drm_sched_increase_karma(&job->base);
>   goto skip_recovery;
>   }
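
The rule applied by the quoted patch, only blaming jobs that carry a VM pointer, can be sketched in isolation. The types and the helper below are simplified stand-ins invented for illustration; they are not the amdgpu or DRM scheduler structures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real amdgpu_job and amdgpu_vm are much larger. */
struct vm_sketch {
	int id;
};

struct job_sketch {
	struct vm_sketch *vm; /* NULL for kernel submissions */
	int karma;
};

/*
 * Raise karma only for userspace submissions. A NULL job, or a job
 * without a VM (a kernel job), is never blamed.
 * Returns 1 if karma was raised, 0 otherwise.
 */
static int blame_if_userspace(struct job_sketch *job)
{
	if (job && job->vm) {
		job->karma++;
		return 1;
	}
	return 0;
}
```

This follows the suggestion elsewhere in the thread to use job->vm to tell a userspace submission from a kernel one.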

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

2021-07-20 Thread Chen, JingWen
[AMD Official Use Only]

Hi Christian,

I agree that changing the priority is not the right way.
So to not consider paging job guilty, do you think it's OK to not 
increase_karma for it(if (job->vmid == 0))?

Best Regards,
JingWen Chen

-Original Message-
From: Christian König 
Sent: Tuesday, July 20, 2021 4:29 PM
To: Chen, JingWen ; Liu, Monk ; 
amd-gfx@lists.freedesktop.org
Cc: Chen, Horace 
Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

Hi JingWen,

you can look at the job->vm pointer to distinguish a userspace submission
from a kernel submission.

The priority is not really related to the submission type, we just
happen to treat paging jobs with the highest priority since they are
important for the system as a whole.

Regards,
Christian.

On 20.07.21 at 09:49, Chen, JingWen wrote:
> [AMD Official Use Only]
>
> Hi Christian,
>
> Even if this is a userspace mapping, it's still packaged by the kernel, so 
> it's always assumed to be correct. In which case modifying the priority 
> should have no side effects. May I know the detailed reason for your concern?
>
> And if we eventually decide not to change the priority, do you have any 
> suggestions about how to make sure the paging jobs are not considered guilty?
>
> Best Regards,
> JingWen Chen
>
> -Original Message-
> From: Christian König 
> Sent: Monday, July 19, 2021 7:10 PM
> To: Liu, Monk ; Chen, JingWen ; 
> amd-gfx@lists.freedesktop.org
> Cc: Chen, Horace 
> Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority
>
> On 19.07.21 at 11:42, Liu, Monk wrote:
>> [AMD Official Use Only]
>>
>> Besides, I think our current KMD has three types of kernel sdma jobs:
>> 1) adev->mman.entity, it is already a KERNEL priority entity
>> 2) vm->immediate
>> 3) vm->delayed
>>
>> Do you mean that vm->immediate or vm->delayed are now used for move jobs instead of 
>> mman.entity ?
> No, exactly that's the point. vm->immediate and vm->delayed are not for 
> kernel paging jobs.
>
> Those are used for userspace page table updates.
>
> I agree that those should probably not be considered guilty, but modifying the 
> priority of them is not the right way of doing that.
>
> Regards,
> Christian.
>
>> Thanks
>>
>> ----------
>> Monk Liu | Cloud-GPU Core team
>> --
>>
>> -Original Message-
>> From: Liu, Monk
>> Sent: Monday, July 19, 2021 5:40 PM
>> To: 'Christian König' ; Chen,
>> JingWen ; amd-gfx@lists.freedesktop.org
>> Cc: Chen, Horace 
>> Subject: RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel
>> priority
>>
>> [AMD Official Use Only]
>>
>> If there are move jobs clashing there, we probably need to fix the bugs
>> of those move jobs
>>
>> Previously, I believe you also remember, we agreed to always trust
>> kernel jobs, especially paging jobs,
>>
>> Without setting the paging jobs' priority to KERNEL level, how can we keep that 
>> protocol? Do you have a better idea?
>>
>> Thanks
>>
>> --
>> Monk Liu | Cloud-GPU Core team
>> --
>>
>> -Original Message-
>> From: Christian König 
>> Sent: Monday, July 19, 2021 4:25 PM
>> To: Chen, JingWen ;
>> amd-gfx@lists.freedesktop.org
>> Cc: Chen, Horace ; Liu, Monk 
>> Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel
>> priority
>>
>> On 19.07.21 at 07:57, Jingwen Chen wrote:
>>> [Why]
>>> Current vm_pte entities have NORMAL priority. In the SRIOV multi-VF use
>>> case, the VF FLR happens first and then the job timeout is found.
>>> Several jobs can time out within a very small time slice.
>>> If the timeout of an innocent SDMA job is found before that of the real bad
>>> job, the innocent SDMA job will be marked guilty as it only has
>>> NORMAL priority. This leads to a page fault after the job is
>>> resubmitted.
>>>
>>> [How]
>>> sdma should always have KERNEL priority. The kernel job will always
>>> be resubmitted.
>> I'm not sure if that is a good idea. We intentionally didn't give the page 
>> table updates kernel priority to avoid clashing with the move jobs.
>>
>> Christian.
>>
>>> Signed-off-by: Jingwen Chen 
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)

RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

2021-07-20 Thread Chen, JingWen
[AMD Official Use Only]

Hi Christian,

Even if this is a userspace mapping, it's still packaged by the kernel, so it's 
always assumed to be correct. In which case modifying the priority should have 
no side effects. May I know the detailed reason for your concern?

And if we eventually decide not to change the priority, do you have any 
suggestions about how to make sure the paging jobs are not considered guilty?

Best Regards,
JingWen Chen

-Original Message-
From: Christian König 
Sent: Monday, July 19, 2021 7:10 PM
To: Liu, Monk ; Chen, JingWen ; 
amd-gfx@lists.freedesktop.org
Cc: Chen, Horace 
Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel priority

On 19.07.21 at 11:42, Liu, Monk wrote:
> [AMD Official Use Only]
>
> Besides, I think our current KMD has three types of kernel sdma jobs:
> 1) adev->mman.entity, it is already a KERNEL priority entity
> 2) vm->immediate
> 3) vm->delayed
>
> Do you mean that vm->immediate or vm->delayed are now used for move jobs instead of 
> mman.entity ?

No, exactly that's the point. vm->immediate and vm->delayed are not for kernel 
paging jobs.

Those are used for userspace page table updates.

I agree that those should probably not be considered guilty, but modifying the 
priority of them is not the right way of doing that.

Regards,
Christian.

>
> Thanks
>
> --
> Monk Liu | Cloud-GPU Core team
> --
>
> -Original Message-
> From: Liu, Monk
> Sent: Monday, July 19, 2021 5:40 PM
> To: 'Christian König' ; Chen,
> JingWen ; amd-gfx@lists.freedesktop.org
> Cc: Chen, Horace 
> Subject: RE: [PATCH] drm/amd/amdgpu: vm entities should have kernel
> priority
>
> [AMD Official Use Only]
>
> If there are move jobs clashing there, we probably need to fix the bugs
> of those move jobs
>
> Previously, I believe you also remember, we agreed to always trust
> kernel jobs, especially paging jobs,
>
> Without setting the paging jobs' priority to KERNEL level, how can we keep that 
> protocol? Do you have a better idea?
>
> Thanks
>
> --
> Monk Liu | Cloud-GPU Core team
> --
>
> -Original Message-
> From: Christian König 
> Sent: Monday, July 19, 2021 4:25 PM
> To: Chen, JingWen ;
> amd-gfx@lists.freedesktop.org
> Cc: Chen, Horace ; Liu, Monk 
> Subject: Re: [PATCH] drm/amd/amdgpu: vm entities should have kernel
> priority
>
> On 19.07.21 at 07:57, Jingwen Chen wrote:
>> [Why]
>> Current vm_pte entities have NORMAL priority. In the SRIOV multi-VF use
>> case, the VF FLR happens first and then the job timeout is found.
>> Several jobs can time out within a very small time slice.
>> If the timeout of an innocent SDMA job is found before that of the real bad
>> job, the innocent SDMA job will be marked guilty as it only has
>> NORMAL priority. This leads to a page fault after the job is
>> resubmitted.
>>
>> [How]
>> sdma should always have KERNEL priority. The kernel job will always
>> be resubmitted.
> I'm not sure if that is a good idea. We intentionally didn't give the page 
> table updates kernel priority to avoid clashing with the move jobs.
>
> Christian.
>
>> Signed-off-by: Jingwen Chen 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
>>1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 358316d6a38c..f7526b67cc5d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>>  INIT_LIST_HEAD(&vm->done);
>>
>>  /* create scheduler entities for page table updates */
>> -r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
>> +r = drm_sched_entity_init(&vm->immediate,
>> +DRM_SCHED_PRIORITY_KERNEL,
>>adev->vm_manager.vm_pte_scheds,
>>adev->vm_manager.vm_pte_num_scheds, NULL);
>>  if (r)
>>  return r;
>>
>> -r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
>> +r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
>>adev->vm_manager.vm_pte_scheds,
>>adev->vm_manager.vm_pte_num_scheds, NULL);
>>  if (r)



RE: [PATCH] drm/amd/pm: Disable SMU messages in navi10 sriov

2021-06-11 Thread Chen, JingWen
[AMD Official Use Only]

Acked-by: Jingwen Chen 

Best Regards,
JingWen Chen

-Original Message-
From: Yifan Zha 
Sent: Friday, June 11, 2021 6:49 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Chen, JingWen ; Zha, 
YiFan(Even) 
Subject: [PATCH] drm/amd/pm: Disable SMU messages in navi10 sriov

[Why]
An SR-IOV VF sending unsupported SMU messages leads to failures.

[How]
Disable the related messages in SR-IOV.

Signed-off-by: Yifan Zha 
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 78fe13183e8b..e1b019115e92 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -80,10 +80,10 @@ static struct cmn2asic_msg_mapping navi10_message_map[SMU_MSG_MAX_COUNT] = {
MSG_MAP(SetAllowedFeaturesMaskHigh,	PPSMC_MSG_SetAllowedFeaturesMaskHigh,	0),
MSG_MAP(EnableAllSmuFeatures,	PPSMC_MSG_EnableAllSmuFeatures,	0),
MSG_MAP(DisableAllSmuFeatures,	PPSMC_MSG_DisableAllSmuFeatures,	0),
-	MSG_MAP(EnableSmuFeaturesLow,	PPSMC_MSG_EnableSmuFeaturesLow,	1),
-	MSG_MAP(EnableSmuFeaturesHigh,	PPSMC_MSG_EnableSmuFeaturesHigh,	1),
-	MSG_MAP(DisableSmuFeaturesLow,	PPSMC_MSG_DisableSmuFeaturesLow,	1),
-	MSG_MAP(DisableSmuFeaturesHigh,	PPSMC_MSG_DisableSmuFeaturesHigh,	1),
+	MSG_MAP(EnableSmuFeaturesLow,	PPSMC_MSG_EnableSmuFeaturesLow,	0),
+	MSG_MAP(EnableSmuFeaturesHigh,	PPSMC_MSG_EnableSmuFeaturesHigh,	0),
+	MSG_MAP(DisableSmuFeaturesLow,	PPSMC_MSG_DisableSmuFeaturesLow,	0),
+	MSG_MAP(DisableSmuFeaturesHigh,	PPSMC_MSG_DisableSmuFeaturesHigh,	0),
MSG_MAP(GetEnabledSmuFeaturesLow,	PPSMC_MSG_GetEnabledSmuFeaturesLow,	1),
MSG_MAP(GetEnabledSmuFeaturesHigh,	PPSMC_MSG_GetEnabledSmuFeaturesHigh,	1),
MSG_MAP(SetWorkloadMask,	PPSMC_MSG_SetWorkloadMask,	1),
--
2.25.1
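
The patch above flips the last MSG_MAP argument from 1 to 0 for four messages. Assuming that argument marks whether a message may be sent from a VF, which the [How] text suggests but the patch does not spell out, the effect can be sketched with a small lookup table. All names, message ids and error codes below are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mapping entry: firmware id plus a "valid in VF" flag. */
struct msg_mapping_sketch {
	int asic_msg;     /* firmware message id (made-up values) */
	bool valid_in_vf; /* may a virtual function send it? */
};

enum {
	MSG_ENABLE_FEATURES_LOW,
	MSG_GET_ENABLED_FEATURES_LOW,
	MSG_COUNT
};

static const struct msg_mapping_sketch msg_map[MSG_COUNT] = {
	[MSG_ENABLE_FEATURES_LOW]      = { 0x7, false }, /* disabled for VFs */
	[MSG_GET_ENABLED_FEATURES_LOW] = { 0xb, true },
};

/* Translate a generic message index, refusing VF-invalid messages. */
static int to_asic_msg(int msg, bool is_vf)
{
	if (msg < 0 || msg >= MSG_COUNT)
		return -EINVAL;
	if (is_vf && !msg_map[msg].valid_in_vf)
		return -EACCES;
	return msg_map[msg].asic_msg;
}
```

With a table like this, clearing the flag is enough to stop a guest from sending the message, and no per-call SR-IOV check is needed at the call sites.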



RE: [PATCH] drm/amd/amdgpu:save psp ring wptr in SRIOV to avoid attack

2021-05-26 Thread Chen, JingWen
Hi Monk,

New patch submitted according to your suggestion

Best Regards,
JingWen Chen

-Original Message-
From: Liu, Monk  
Sent: Wednesday, May 26, 2021 3:24 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Zhao, Victor ; Chen, JingWen 
Subject: RE: [PATCH] drm/amd/amdgpu:save psp ring wptr in SRIOV to avoid attack

[AMD Official Use Only]

>>+ ring->ring_wptr = psp_write_ptr_reg;

Put the cache mechanism into the callbacks please

>>+ ring->ring_wptr = 0;

It is not needed

Lastly, please put more details in the comment to let people know why we need 
this PSP WPTR cache mechanism.


Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Jingwen Chen  
Sent: Wednesday, May 26, 2021 2:55 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk ; Zhao, Victor ; Chen, 
JingWen 
Subject: [PATCH] drm/amd/amdgpu:save psp ring wptr in SRIOV to avoid attack

From: Victor Zhao 

Save the PSP ring wptr in SRIOV to avoid attacks that make extra changes to the
MP0_SMN_C2PMSG_102 reg.

Change-Id: Idee78e8c1c781463048f2f6311fdc70488ef05b2
Signed-off-by: Victor Zhao 
Signed-off-by: Jingwen Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 +
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c  | 3 ++-
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c   | 3 ++-
 4 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 55378c6b9722..20e06b3ec686 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -2701,6 +2701,7 @@ int psp_ring_cmd_submit(struct psp_context *psp,
/* Update the write Pointer in DWORDs */
psp_write_ptr_reg = (psp_write_ptr_reg + rb_frame_size_dw) % ring_size_dw;
psp_ring_set_wptr(psp, psp_write_ptr_reg);
+	ring->ring_wptr = psp_write_ptr_reg;
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 46a5328e00e0..60aa99a39a74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -76,6 +76,7 @@ struct psp_ring
uint64_t	ring_mem_mc_addr;
void	*ring_mem_handle;
uint32_t	ring_size;
+	uint32_t	ring_wptr;
 };
 
 /* More registers may will be supported */
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index 1f2e7e35c91e..4a32b0c84ef4 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -474,6 +474,7 @@ static int psp_v11_0_ring_create(struct psp_context *psp,
return ret;
}
 
+	ring->ring_wptr = 0;
/* Write low address of the ring to C2PMSG_102 */
psp_ring_reg = lower_32_bits(ring->ring_mem_mc_addr);
WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_102, psp_ring_reg);
@@ -733,7 +734,7 @@ static uint32_t psp_v11_0_ring_get_wptr(struct psp_context *psp)
struct amdgpu_device *adev = psp->adev;
 
if (amdgpu_sriov_vf(adev))
-	data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_102);
+	data = psp->km_ring.ring_wptr;
else
data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_67);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index f2e725f72d2f..160f78eb6403 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -237,6 +237,7 @@ static int psp_v3_1_ring_create(struct psp_context *psp,
return ret;
}
 
+	ring->ring_wptr = 0;
/* Write low address of the ring to C2PMSG_102 */
psp_ring_reg = lower_32_bits(ring->ring_mem_mc_addr);
WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_102, psp_ring_reg);
@@ -379,7 +380,7 @@ static uint32_t psp_v3_1_ring_get_wptr(struct psp_context *psp)
struct amdgpu_device *adev = psp->adev;
 
if (amdgpu_sriov_vf(adev))
-	data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_102);
+	data = psp->km_ring.ring_wptr;
else
data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_67);
return data;
--
2.25.1
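
The review comment in this thread asks for the cache to live in the wptr callbacks. A minimal sketch of that arrangement follows, assuming a driver-side copy that is updated whenever the write pointer is set and is the only value trusted under SR-IOV. The struct, the hw_wptr_reg field (a plain variable standing in for the shared register) and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct psp_ring_sketch {
	uint32_t ring_size_dw;
	uint32_t ring_wptr;   /* driver-side copy of the last value written */
	uint32_t hw_wptr_reg; /* stand-in for the shared hardware register */
};

/* The set callback updates the register and the cached copy together. */
static void ring_set_wptr(struct psp_ring_sketch *ring, uint32_t wptr)
{
	ring->hw_wptr_reg = wptr;
	ring->ring_wptr = wptr;
}

/* Under SR-IOV, trust the cached copy instead of reading the register back. */
static uint32_t ring_get_wptr(const struct psp_ring_sketch *ring, bool is_vf)
{
	return is_vf ? ring->ring_wptr : ring->hw_wptr_reg;
}

/* Advance the write pointer by one frame, wrapping at the ring size. */
static void ring_submit(struct psp_ring_sketch *ring, uint32_t frame_dw, bool is_vf)
{
	uint32_t wptr;

	assert(ring->ring_size_dw != 0);
	wptr = (ring_get_wptr(ring, is_vf) + frame_dw) % ring->ring_size_dw;
	ring_set_wptr(ring, wptr);
}
```

If the register is changed behind the driver's back between two submissions, the VF path still continues from its own cached value.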


RE: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job

2021-02-24 Thread Chen, JingWen
[AMD Official Use Only - Internal Distribution Only]

Consider this sequence:
1. GPU reset begins
2. device reset count + 1
3. job id 1 is scheduled with vm_need_flush=false
4. When handling this job in vm_flush, amdgpu_vmid_had_gpu_reset returns 
true, so this job is flushed and vmid_reset_count is set equal 
to device_reset_count
5. the drm scheduler is stopped
6. the GPU does the real reset
7. job id 1 is resubmitted with vm_need_flush=false
8. Job id 1 is the first job to be resubmitted after the reset. This time 
amdgpu_vmid_had_gpu_reset returns false and vm_need_flush==false

Then nothing flushes pd_addr and vmid for jobs after the resubmit. In this 
sequence amdgpu_vmid_had_gpu_reset doesn't work.
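
The sequence above can be replayed with two counters. This is a toy model of the steps as described in this mail, not the driver logic: the struct names, the place where the cached counter is synced, and the replay helper are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct dev_sketch {
	uint32_t reset_count;      /* device reset count */
};

struct vmid_sketch {
	uint32_t seen_reset_count; /* reset count cached per VMID */
};

/* Models the check: has the device been reset since this VMID last synced? */
static bool vmid_had_reset(const struct dev_sketch *d, const struct vmid_sketch *v)
{
	return v->seen_reset_count != d->reset_count;
}

/* Returns true if the job flushes; syncs the cached counter when it does so. */
static bool vm_flush(const struct dev_sketch *d, struct vmid_sketch *v, bool need_flush)
{
	if (vmid_had_reset(d, v)) {
		v->seen_reset_count = d->reset_count;
		return true;
	}
	return need_flush;
}

/* Replays steps 2 to 8; returns whether the resubmitted job flushed. */
static bool replay_tdr_race(void)
{
	struct dev_sketch dev = { 0 };
	struct vmid_sketch vmid = { 0 };
	bool first;

	dev.reset_count++;                    /* step 2 */
	first = vm_flush(&dev, &vmid, false); /* steps 3-4: flushes, counters synced */
	assert(first);
	/* steps 5-6: scheduler stopped, the real reset wipes the flushed state */
	return vm_flush(&dev, &vmid, false);  /* steps 7-8: resubmitted job */
}
```

In this model replay_tdr_race() returns false: the counters already match when the job is resubmitted, so the flush is skipped even though the hardware state was lost in between.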


Best Regards,
JingWen Chen

-Original Message-
From: Koenig, Christian 
Sent: Thursday, February 25, 2021 3:46 PM
To: Chen, JingWen ; amd-gfx@lists.freedesktop.org
Cc: Liu, Monk 
Subject: Re: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job



On 25.02.21 at 06:27, Jingwen Chen wrote:
> [Why]
> When a job is scheduled during TDR (after the device reset count increases
> and before drm_sched_stop), this job won't do a vm_flush when it is resubmitted
> after the GPU reset is done. This can lead to a page fault.
>
> [How]
> Always do vm_flush for resubmit job.

NAK, what do you think amdgpu_vmid_had_gpu_reset already does?

Christian.

>
> Signed-off-by: Jingwen Chen 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index fdbe7d4e8b8b..4af2c5d15950 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1088,7 +1088,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
>   if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid)
>   adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid);
>
> -if (amdgpu_vmid_had_gpu_reset(adev, id)) {
> +if (amdgpu_vmid_had_gpu_reset(adev, id)||
> +(job->base.flags & DRM_FLAG_RESUBMIT_JOB)) {
>   gds_switch_needed = true;
>   vm_flush_needed = true;
>   pasid_mapping_needed = true;



RE: [PATCH] drm/amd/amdgpu: add error handling to amdgpu_virt_read_pf2vf_data

2021-01-20 Thread Chen, JingWen
[AMD Official Use Only - Approved for External Use]

Ping

Best Regards,
JingWen Chen

-Original Message-
From: Jingwen Chen  
Sent: Tuesday, January 19, 2021 5:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Chen, JingWen ; Chen, JingWen 
Subject: [PATCH] drm/amd/amdgpu: add error handling to 
amdgpu_virt_read_pf2vf_data

[Why]
When VRAM is lost in the guest, trying to write to VRAM can cause the kernel to hang.

[How]
When the readback data is invalid, skip the write and directly reschedule 
the work.

Signed-off-by: Jingwen Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index c649944e49da..3dd7eec52344 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -558,10 +558,14 @@ static int amdgpu_virt_write_vf2pf_data(struct amdgpu_device *adev)
 static void amdgpu_virt_update_vf2pf_work_item(struct work_struct *work)
 {
struct amdgpu_device *adev = container_of(work, struct amdgpu_device, virt.vf2pf_work.work);
+	int ret;
 
-	amdgpu_virt_read_pf2vf_data(adev);
+	ret = amdgpu_virt_read_pf2vf_data(adev);
+	if (ret)
+		goto out;
amdgpu_virt_write_vf2pf_data(adev);
 
+out:
schedule_delayed_work(&(adev->virt.vf2pf_work), adev->virt.vf2pf_update_interval_ms);
 }
 
--
2.25.1
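
The control flow of the patch above, skipping the write when the read fails but always re-arming the work, can be sketched with counters in place of the real VRAM access and workqueue calls. Every name below is a hypothetical stand-in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct virt_sketch {
	int vram_valid;  /* 0 models lost VRAM, so the readback is invalid */
	int writes;      /* number of vf2pf writes performed */
	int reschedules; /* number of times the periodic work was re-queued */
};

/* Stand-in for the pf2vf read: fails when the readback data is invalid. */
static int read_pf2vf(const struct virt_sketch *v)
{
	return v->vram_valid ? 0 : -EINVAL;
}

/* One run of the periodic work item. */
static void update_work_item(struct virt_sketch *v)
{
	assert(v != NULL);
	if (read_pf2vf(v) == 0)
		v->writes++;   /* only write when the readback was valid */
	v->reschedules++;      /* always re-arm the periodic work */
}
```

Re-arming unconditionally means the exchange resumes by itself once the readback becomes valid again.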


RE: [PATCH 2/2] drm/amd: Skip not used microcode loading in SRIOV

2020-09-22 Thread Chen, JingWen
[AMD Public Use]


Hi Hawking,

We may need other features in PSP in the future, e.g. load cap fw. So we can't 
skip the whole psp_init_microcode.

Best Regards,
JingWen Chen

From: Deng, Emily 
Sent: Tuesday, September 22, 2020 6:22 PM
To: Wang, Kevin(Yang) ; Zhang, Hawking 
; Chen, JingWen ; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 2/2] drm/amd: Skip not used microcode loading in SRIOV


[AMD Public Use]

Hi Kevin and Hawking,
I think you are both right, but currently we don't have a good method to handle 
this. It seems we would need to re-architect the whole driver, not only this patch. 
As far as this patch alone is concerned, I think it is OK.

Best wishes
Emily Deng
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Wang, Kevin(Yang)
Sent: Tuesday, September 22, 2020 3:38 PM
To: Zhang, Hawking <hawking.zh...@amd.com>; Chen, JingWen <jingwen.ch...@amd.com>;
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/amd: Skip not used microcode loading in SRIOV


[AMD Public Use]

Embedding these SRIOV checks into the underlying functions in many places is
not conducive to subsequent code optimization and maintenance.
It took a long time to clean up the SMU code before, but now some new checks 
have been introduced into the SMU code.
I think a new method should be adopted to solve this problem unless there's a 
special reason.

Best Regards,
Kevin

From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Zhang, Hawking <hawking.zh...@amd.com>
Sent: Tuesday, September 22, 2020 3:25 PM
To: Chen, JingWen <jingwen.ch...@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Chen, JingWen <jingwen.ch...@amd.com>
Subject: RE: [PATCH 2/2] drm/amd: Skip not used microcode loading in SRIOV

[AMD Public Use]

1. Please do not add the amdgpu_sriov_vf check in every psp fw init_microcode 
function. psp_init_microcode is the entry point for all kinds of psp fw 
microcode initialization.
2. I'd like to get the whole picture of all the sequences you want to skip from
the guest side so that we can have a more organized/reasonable approach to
exclude those programming sequences for SRIOV, instead of having the
amdgpu_sriov_vf check patched in case by case...

Regards,
Hawking
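
One way to read the two points above is that the VF decision could live once at the entry point rather than in each loader. The following is a minimal userspace sketch of that idea, not the driver's code: `enum fw_kind`, `struct fake_psp`, `needed_on_vf`, `init_one` and `init_microcode` are hypothetical names, and the policy table values are assumptions for illustration (cap firmware is marked as needed only because the thread mentions it as a possible future requirement).

```c
#include <stdbool.h>

/* Hypothetical firmware kinds; the names are illustrative only. */
enum fw_kind { FW_ASD, FW_SOS, FW_TA, FW_CAP, FW_KIND_COUNT };

struct fake_psp {
	bool is_vf;                 /* true when running as an SR-IOV guest */
	bool loaded[FW_KIND_COUNT]; /* which images were initialised */
};

/* Single policy table: which firmware a VF still needs (assumed values). */
static const bool needed_on_vf[FW_KIND_COUNT] = {
	[FW_ASD] = false,
	[FW_SOS] = false,
	[FW_TA]  = false,
	[FW_CAP] = true,
};

/* Placeholder loader: a real one would request and parse the image. */
static int init_one(struct fake_psp *psp, enum fw_kind kind)
{
	psp->loaded[kind] = true;
	return 0;
}

/* Entry point: the VF decision is made here once, not in every loader. */
static int init_microcode(struct fake_psp *psp)
{
	for (int k = 0; k < FW_KIND_COUNT; k++) {
		if (psp->is_vf && !needed_on_vf[k])
			continue;
		int err = init_one(psp, (enum fw_kind)k);
		if (err)
			return err;
	}
	return 0;
}
```

With this shape, changing what a guest skips means editing one table, and the individual loaders stay free of virtualization checks.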

-Original Message-
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Jingwen Chen
Sent: Tuesday, September 22, 2020 15:09
To: amd-gfx@lists.freedesktop.org
Cc: Chen, JingWen <jingwen.ch...@amd.com>
Subject: [PATCH 2/2] drm/amd: Skip not used microcode loading in SRIOV

smc, sdma, sos, ta and asd fw are not used in SRIOV. Skip them to accelerate
sw_init for navi12.

v2: skip above fw in SRIOV for vega10 and sienna_cichlid
Signed-off-by: Jingwen Chen <jingwen.ch...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c  |  9 +
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   |  3 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c   |  3 +++
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c   |  3 +++
 .../gpu/drm/amd/pm/powerplay/smumgr/vega10_smumgr.c  | 12 +++-
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c| 11 +++
 6 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 2c66e20b2ed9..9e2038de6ea7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -2385,6 +2385,9 @@ int psp_init_asd_microcode(struct psp_context *psp,
 const struct psp_firmware_header_v1_0 *asd_hdr;
 int err = 0;

+   if (amdgpu_sriov_vf(adev))
+   return 0;
+
 if (!chip_name) {
 dev_err(adev->dev, "invalid chip name for asd microcode\n");
 return -EINVAL;
@@ -2424,6 +2427,9 @@ int psp_init_sos_microcode(struct psp_context *psp,
 const struct psp_firmware_header_v1_3 *sos_hdr_v1_3;
 int err = 0;

+   if (amdgpu_sriov_vf(adev))
+   return 0;
+
 if (!chip_name) {
 dev_err(adev->dev, "invalid chip name for sos microcode\n");
 return -EINVAL;
@@ -2558,6 +2564,9 @@ int psp_init_ta_microcode(struct psp_context *psp,
 int err = 0;
 int ta_index = 0;

+   if (amdgpu_sriov_vf(adev))
+   return 0;
+
 if (!chip_name) {
 dev_err(adev->dev, "invalid chip name for ta microcode\n");
 return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 810635cbf4c

RE: [PATCH] drm/amd/pm: Skip smu_post_init in SRIOV

2020-09-17 Thread Chen, JingWen
[AMD Public Use]

Typo fixed in v3

Best Regards,
JingWen Chen

> -Original Message-
> From: Chen, Guchun 
> Sent: Thursday, September 17, 2020 5:40 PM
> To: Chen, JingWen ; amd-g...@lists.freedesktop.org
> Cc: Chen, JingWen 
> Subject: RE: [PATCH] drm/amd/pm: Skip smu_post_init in SRIOV
> 
> [AMD Public Use]
> 
> You want to call it in SRIOV case or in bare-metal case?
> 
> Regards,
> Guchun
> 
> -Original Message-
> From: amd-gfx  On Behalf Of
> Jingwen Chen
> Sent: Thursday, September 17, 2020 5:17 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chen, JingWen 
> Subject: [PATCH] drm/amd/pm: Skip smu_post_init in SRIOV
> 
> smu_post_init needs to enable SMU features, which requires virtualization to
> be off. Skip it since this feature is not used in SRIOV.
> 
> v2: move the check to the early stage of smu_post_init.
> 
> Signed-off-by: Jingwen Chen 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index a027c7fdad56..a950f009c794 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2631,6 +2631,9 @@ static int navi10_post_smu_init(struct smu_context
> *smu)
>   uint64_t feature_mask = 0;
>   int ret = 0;
> 
> + if (!amdgpu_sriov_vf(adev))
> + return 0;
> +
>   /* For Naiv1x, enable these features only after DAL initialization */
>   if (adev->pm.pp_feature & PP_SOCCLK_DPM_MASK)
>   feature_mask |=
> FEATURE_MASK(FEATURE_DPM_SOCCLK_BIT);
> --
> 2.25.1
> 
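
The guard in the quoted v2 hunk returns early when `!amdgpu_sriov_vf(adev)` is true, that is, on bare metal, which appears to be what the question about the SRIOV versus bare-metal case points at. A minimal sketch of the polarity that matches the commit message (skip only when running as a VF) follows. It is not the actual v3 patch: `struct fake_smu`, `FAKE_FEATURE_SOCCLK` and `post_smu_init` are hypothetical stand-ins for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the SMU context; not the amdgpu structure. */
struct fake_smu {
	bool is_vf;            /* true when running as an SR-IOV guest */
	uint64_t feature_mask; /* features enabled by the post-init step */
};

#define FAKE_FEATURE_SOCCLK (1ULL << 0)

static int post_smu_init(struct fake_smu *smu)
{
	/* Skip on a VF: return early when the device IS a virtual function. */
	if (smu->is_vf)
		return 0;

	/* Bare metal only: enable the feature after display init. */
	smu->feature_mask |= FAKE_FEATURE_SOCCLK;
	return 0;
}
```

With the negated condition, the same function would do nothing on bare metal and attempt the feature enable in the guest, the opposite of what the commit message asks for.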


RE: [PATCH] drm/amd/pm: Skip smu_post_init in SRIOV

2020-09-17 Thread Chen, JingWen
[AMD Public Use]

Done in v2

Best Regards,
JingWen Chen

> -Original Message-
> From: Chen, Guchun 
> Sent: Thursday, September 17, 2020 4:21 PM
> To: Chen, JingWen ; amd-g...@lists.freedesktop.org
> Cc: Chen, JingWen 
> Subject: RE: [PATCH] drm/amd/pm: Skip smu_post_init in SRIOV
> 
> [AMD Public Use]
> 
> Why not move the check into smu_post_init and return 0 at the earliest
> stage in the SRIOV case?
> 
> Regards,
> Guchun
> 
> -Original Message-
> From: amd-gfx  On Behalf Of
> Jingwen Chen
> Sent: Thursday, September 17, 2020 4:11 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chen, JingWen 
> Subject: [PATCH] drm/amd/pm: Skip smu_post_init in SRIOV
> 
> smu_post_init needs to enable SMU features, which requires virtualization to
> be off. Skip it since this feature is not used in SRIOV.
> 
> Signed-off-by: Jingwen Chen 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index 5c4b74f964fc..79163d0ff762 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -469,10 +469,12 @@ static int smu_late_init(void *handle)
>   if (!smu->pm_enabled)
>   return 0;
> 
> - ret = smu_post_init(smu);
> - if (ret) {
> - dev_err(adev->dev, "Failed to post smu init!\n");
> - return ret;
> + if (!amdgpu_sriov_vf(adev)) {
> + ret = smu_post_init(smu);
> + if (ret) {
> + dev_err(adev->dev, "Failed to post smu init!\n");
> + return ret;
> + }
>   }
> 
>   ret = smu_set_default_od_settings(smu);
> --
> 2.25.1
> 