Re: [PATCH] drm/amdgpu: Fix minmax error
Am 25.11.22 um 08:56 schrieb Luben Tuikov:
> On 2022-11-25 02:45, Christian König wrote:
>> Am 24.11.22 um 22:19 schrieb Luben Tuikov:
>>> Fix minmax compilation error by using min_t()/max_t(), of the assignment
>>> type.
>>>
>>> Cc: James Zhu
>>> Cc: Felix Kuehling
>>> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
>>> Signed-off-by: Luben Tuikov
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>>>   	hmm_range->dev_private_owner = owner;
>>>
>>>   	do {
>>> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
>>> +		hmm_range->end = min_t(typeof(hmm_range->end),
>>> +				       hmm_range->start + MAX_WALK_BYTE,
>>> +				       end);
>> Since end is a local variable I would strongly prefer to just have it
>> use the correct type for it.
>>
>> Otherwise we might end up using something which doesn't work on all
>> architectures.
> They all appear to be "unsigned long". I thought, since we assign to
> hmm_range->end, we use that type.

Mhm, then why does the compiler complain here?

As far as I can see "unsigned long" is correct here, but if we somehow
have a typecast then something is not working as expected.

Is MAX_WALK_BYTE maybe of signed type?

> Would you prefer at the top of the function to define "timeout" and
> "end" as,
> 	typeof(hmm_range->end) end, timeout;

Well for end that might make sense, but timeout is independent of the
hmm range.

Regards,
Christian.

> Regards,
> Luben
Re: [PATCH] drm/amdgpu: Fix minmax error
On 2022-11-25 02:45, Christian König wrote:
> Am 24.11.22 um 22:19 schrieb Luben Tuikov:
>> Fix minmax compilation error by using min_t()/max_t(), of the assignment
>> type.
>>
>> Cc: James Zhu
>> Cc: Felix Kuehling
>> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
>> Signed-off-by: Luben Tuikov
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>>   	hmm_range->dev_private_owner = owner;
>>
>>   	do {
>> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
>> +		hmm_range->end = min_t(typeof(hmm_range->end),
>> +				       hmm_range->start + MAX_WALK_BYTE,
>> +				       end);
>
> Since end is a local variable I would strongly prefer to just have it
> use the correct type for it.
>
> Otherwise we might end up using something which doesn't work on all
> architectures.

They all appear to be "unsigned long". I thought, since we assign to
hmm_range->end, we use that type.

Would you prefer at the top of the function to define "timeout" and "end" as,

	typeof(hmm_range->end) end, timeout;

Regards,
Luben
Re: [PATCH] drm/amdgpu: Fix minmax error
Am 24.11.22 um 22:19 schrieb Luben Tuikov:
> Fix minmax compilation error by using min_t()/max_t(), of the assignment
> type.
>
> Cc: James Zhu
> Cc: Felix Kuehling
> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
> Signed-off-by: Luben Tuikov
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>   	hmm_range->dev_private_owner = owner;
>
>   	do {
> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
> +		hmm_range->end = min_t(typeof(hmm_range->end),
> +				       hmm_range->start + MAX_WALK_BYTE,
> +				       end);

Since end is a local variable I would strongly prefer to just have it
use the correct type for it.

Otherwise we might end up using something which doesn't work on all
architectures.

Regards,
Christian.

>   		pr_debug("hmm range: start = 0x%lx, end = 0x%lx",
>   			 hmm_range->start, hmm_range->end);
>
>   		/* Assuming 512MB takes maxmium 1 second to fault page address */
> -		timeout = max((hmm_range->end - hmm_range->start) >> 29, 1ULL) *
> -			  HMM_RANGE_DEFAULT_TIMEOUT;
> +		timeout = max_t(typeof(timeout),
> +				(hmm_range->end - hmm_range->start) >> 29,
> +				1ULL);
> +		timeout *= HMM_RANGE_DEFAULT_TIMEOUT;
>   		timeout = jiffies + msecs_to_jiffies(timeout);
>
> retry:
>
> base-commit: d5e8f4912061ad2e577b4909556e1364e2c2018e
> prerequisite-patch-id: 6024d0c36cae3e4a995a8fcf787b91f511a37486
RE: [PATCH] drm/amdgpu: fix for set df cstate failed under sriov
[AMD Official Use Only - General]

Hi Shikang,

Your patch will impact all amdgpu devices. May I know which device/asic has
this issue? In other words, you'd better move your changes into the SMU side
as well (like smu_v13_xx_ppt.c).

Best Regards,
Kevin

-----Original Message-----
From: amd-gfx On Behalf Of Shikang Fan
Sent: November 25, 2022 15:15
To: amd-gfx@lists.freedesktop.org
Cc: Fan, Shikang
Subject: [PATCH] drm/amdgpu: fix for set df cstate failed under sriov

- set_df_cstate won't be called under sriov case.

Signed-off-by: Shikang Fan
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 818fa72c670d..c557585aa46c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2943,13 +2943,15 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
 	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
 	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
-	/*
-	 * Per PMFW team's suggestion, driver needs to handle gfxoff
-	 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
-	 * scenario. Add the missing df cstate disablement here.
-	 */
-	if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
-		dev_warn(adev->dev, "Failed to disallow df cstate");
+	if (!amdgpu_sriov_vf(adev)) {
+		/*
+		 * Per PMFW team's suggestion, driver needs to handle gfxoff
+		 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
+		 * scenario. Add the missing df cstate disablement here.
+		 */
+		if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+			dev_warn(adev->dev, "Failed to disallow df cstate");
+	}
 
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.valid)
-- 
2.25.1
[PATCH] drm/amdgpu: fix for set df cstate failed under sriov
- set_df_cstate won't be called under sriov case.

Signed-off-by: Shikang Fan
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 818fa72c670d..c557585aa46c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2943,13 +2943,15 @@ static int amdgpu_device_ip_suspend_phase1(struct amdgpu_device *adev)
 	amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
 	amdgpu_device_set_cg_state(adev, AMD_CG_STATE_UNGATE);
 
-	/*
-	 * Per PMFW team's suggestion, driver needs to handle gfxoff
-	 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
-	 * scenario. Add the missing df cstate disablement here.
-	 */
-	if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
-		dev_warn(adev->dev, "Failed to disallow df cstate");
+	if (!amdgpu_sriov_vf(adev)) {
+		/*
+		 * Per PMFW team's suggestion, driver needs to handle gfxoff
+		 * and df cstate features disablement for gpu reset(e.g. Mode1Reset)
+		 * scenario. Add the missing df cstate disablement here.
+		 */
+		if (amdgpu_dpm_set_df_cstate(adev, DF_CSTATE_DISALLOW))
+			dev_warn(adev->dev, "Failed to disallow df cstate");
+	}
 
 	for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
 		if (!adev->ip_blocks[i].status.valid)
-- 
2.25.1
Re: [PATCH] drm/amdgpu: add mb for si
On 11/25/2022 7:43 AM, Quan, Evan wrote:
> [AMD Official Use Only - General]
>
>> -----Original Message-----
>> From: Lazar, Lijo
>> Sent: Thursday, November 24, 2022 6:49 PM
>> To: Quan, Evan ; 李真能 ; Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>
>> On 11/24/2022 4:11 PM, Lazar, Lijo wrote:
>>> On 11/24/2022 3:34 PM, Quan, Evan wrote:
>>>> [AMD Official Use Only - General]
>>>>
>>>> Could the attached patch help?
>>>>
>>>> Evan
>>>>> -----Original Message-----
>>>>> From: amd-gfx On Behalf Of 李真能
>>>>> Sent: Friday, November 18, 2022 5:25 PM
>>>>> To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>>>>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>>>>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>>>>
>>>>> On 2022/11/18 17:18, Michel Dänzer wrote:
>>>>>> On 11/18/22 09:01, Christian König wrote:
>>>>>>> Am 18.11.22 um 08:48 schrieb Zhenneng Li:
>>>>>>>> During reboot test on arm64 platform, it may fail on boot, so
>>>>>>>> add this mb in smc.
>>>>>>>>
>>>>>>>> The error messages are as follows:
>>>>>>>> [    6.996395][  7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR*
>>>>>>>> late_init of IP block failed -22
>>>>>>>> [    7.006919][  7] [  T295] amdgpu :04:00.0:
>>> The issue is happening in late_init() which eventually does
>>>
>>> 	ret = si_thermal_enable_alert(adev, false);
>>>
>>> Just before this, si_thermal_start_thermal_controller is called in
>>> hw_init and that enables thermal alert.
>>>
>>> Maybe the issue is with enable/disable of thermal alerts in quick
>>> succession. Adding a delay inside si_thermal_start_thermal_controller
>>> might help.
>> On a second look, temperature range is already set as part of
>> si_thermal_start_thermal_controller in hw_init
>> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L6780
>>
>> There is no need to set it again here -
>> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L7635
>>
>> I think it is safe to remove the call from late_init altogether. Alex/Evan?
> [Quan, Evan] Yes, it makes sense to me. But I'm not sure whether that's
> related with the issue here. Since per my understandings, if the issue is
> caused by double calling of thermal_alert enablement, it will fail every
> time. That cannot explain why adding some delays or a mb() calling can
> help. The side effect of the patch is just some random delay introduced
> for every SMC message.

The issue happens in late_init(). Between late_init() and dpm enablement,
there are many smc messages sent which don't have this issue. So I think
the issue is not with FW not running. Thus the only case I see is
enable/disable of thermal alert in random succession.

Thanks,
Lijo

> BR
> Evan
>> Thanks,
>> Lijo
>>> Thanks,
>>> Lijo
>>>>>>>> amdgpu_device_ip_late_init failed
>>>>>>>> [    7.014224][  7] [  T295] amdgpu :04:00.0: Fatal error during GPU init
>>>>>>> Memory barries are not supposed to be sprinkled around like this,
>>>>>>> you need to give a detailed explanation why this is necessary.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> Signed-off-by: Zhenneng Li
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>>> index 8f994ffa9cd1..c7656f22278d 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
>>>>>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>>>>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>>>>>>> +	mb();
>>>>>>>> +
>>>>>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>>>>>>>  		return true;
>>>>>> In particular, it makes no sense in this specific place, since it
>>>>>> cannot directly affect the values of rst & clk.
>>>>>
>>>>> I think so too.
>>>>>
>>>>> But when I do reboot test using nine desktop machines, there may be
>>>>> this error reported on one or two machines after hundreds or thousands
>>>>> of reboot cycles. At the beginning, I used msleep() instead of mb();
>>>>> these two methods both work, but I don't know what the root cause is.
>>>>>
>>>>> I used this method on another vendor's oland card, and this error
>>>>> message was reported again.
>>>>>
>>>>> What could be the root reason?
>>>>>
>>>>> test environment:
>>>>>
>>>>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>>>>>
>>>>> driver: amdgpu
>>>>>
>>>>> os: ubuntu 20.04
>>>>>
>>>>> platform: arm64
>>>>>
>>>>> kernel: 5.4.18
Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm
[AMD Official Use Only - General]

-----Original Message-----
From: Paneer Selvam, Arunpravin
Sent: November 24, 2022 22:57
To: Christian König ; Wang, Yang(Kevin) ; Koenig, Christian ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Zhang, Hawking
Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm

On 11/24/2022 1:17 PM, Christian König wrote:
> Am 24.11.22 um 08:45 schrieb Wang, Yang(Kevin):
>> [AMD Official Use Only - General]
>>
>> -----Original Message-----
>> From: Koenig, Christian
>> Sent: Thursday, November 24, 2022 3:25 PM
>> To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org; Paneer Selvam, Arunpravin
>> Cc: Zhang, Hawking ; Deucher, Alexander
>> Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm
>>
>> Am 24.11.22 um 06:49 schrieb Yang Wang:
>>> add vram usage information into dri debugfs amdgpu_vram_mm node.
>>>
>>> Background:
>>> when amdgpu driver introduces drm buddy allocator, the kernel driver
>>> (and developer) is difficult to track the vram usage information.
>>>
>>> Field:
>>> 0x-0x: vram used range.
>>> type: kernel, device, sg
>>> usage: normal, vm, user
>>> domain: C-CPU, G-GTT, V-VRAM, P-PRIV
>>> @x: the address of "amdgpu_bo" object in kernel space.
>>> 4096: vram range size.
>>>
>>> Example:
>>> 0x0003fea68000-0x0003fea68fff: (type:kernel usage:vm domain:--V- --V-) @1d33dfee 4096 bytes
>>> 0x0003fea69000-0x0003fea69fff: (type:kernel usage:vm domain:--V- --V-) @a79155b5 4096 bytes
>>> 0x0003fea6b000-0x0003fea6bfff: (type:kernel usage:vm domain:--V- --V-) @38ad633b 4096 bytes
>>> 0x0003fea6c000-0x0003fea6cfff: (type:device usage:user domain:--V- --V-) @e302f90b 4096 bytes
>>> 0x0003fea6d000-0x0003fea6dfff: (type:device usage:user domain:--V- --V-) @e664c172 4096 bytes
>>> 0x0003fea6e000-0x0003fea6efff: (type:kernel usage:vm domain:--V- --V-) @4528cb2f 4096 bytes
>>> 0x0003fea6f000-0x0003fea6: (type:kernel usage:vm domain:--V- --V-) @a446bdbf 4096 bytes
>>> 0x0003fea7-0x0003fea7: (type:device usage:user domain:--V- --V-) @78fae42f 65536 bytes
>>> 0x0003fead8000-0x0003feadbfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 16384 bytes
>>> 0x0003feadc000-0x0003feadcfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 4096 bytes
>>> 0x0003feadd000-0x0003feaddfff: (type:kernel usage:normal domain:--V- --V-) @b9706fc1 4096 bytes
>>> 0x0003feade000-0x0003feadefff: (type:kernel usage:vm domain:--V- --V-) @71a25571 4096 bytes
>>>
>>> Note:
>>> although some vram ranges can be merged in the example above,
>>> this reflects the actual distribution of the drm buddy allocator.
>> Well, completely NAK. This is way too much complexity for simple
>> debugging.
>>
>> Question is what are your requirements here? E.g. what information do
>> you want and why doesn't the buddy allocator already expose this?
>>
>> Regards,
>> Christian.
>>
>> [Kevin]:
>>
>> For KMD debug.
>> The DRM buddy interface doesn't provide an interface to query which
>> ranges of ram (resource) are used.
>> It is not easy to debug on the KMD side if the driver fails to create
>> a BO at a specific location.
>> And from the view of KMD, the VRAM at some locations has special
>> purposes.
>> With this patch, we can know which ranges of vram are actually used.
> Well that's not a good reason to add this complexity. Debugging
> doesn't justify that.
>
> Please work with Arun to add the necessary information to the buddy
> allocator interface.
>
> Regards,
> Christian.

Hi Kevin,

I will check and list down some of the necessary information that we can
add to the buddy allocator interface.

Regards,
Arun.

[kevin]:
Thanks, but some information is AMDGPU specific, so I hope we can add some
flexible interfaces to adapt the different drivers' extensions.

Best Regards,
Kevin

>> Best Regards,
>> Kevin
>>> Signed-off-by: Yang Wang
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |   6 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h   |   3 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 130 ++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h |   1 +
>>>   4 files changed, 136 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 90eb07106609..117c754409b3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -53,7 +53,7 @@
>>>    *
>>>    */
>>>
>>> -static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>> +void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>>   {
>>>   	struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
>>>
>>> @@ -66,7 +66,7 @@ static voi
Re: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
Applied!

Thanks,
Luben

On 2022-11-24 05:10, Quan, Evan wrote:
> [AMD Official Use Only - General]
>
> Reviewed-by: Evan Quan
>
>> -----Original Message-----
>> From: amd-gfx On Behalf Of Yu Songping
>> Sent: Thursday, November 24, 2022 9:53 AM
>> To: airl...@gmail.com; dan...@ffwll.ch
>> Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; linux-ker...@vger.kernel.org
>> Subject: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
>>
>> The logical operator '&&' will make
>> smu->ppt_funcs->set_gfx_power_up_by_imu segment fault when ppt_funcs is
>> NULL.
>>
>> Signed-off-by: Yu Songping
>> ---
>>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
>> index b880f4d7d67e..1cb728b0b7ee 100644
>> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
>> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
>> @@ -161,7 +161,7 @@ int smu_get_dpm_freq_range(struct smu_context *smu,
>>
>>  int smu_set_gfx_power_up_by_imu(struct smu_context *smu)
>>  {
>> -	if (!smu->ppt_funcs && !smu->ppt_funcs->set_gfx_power_up_by_imu)
>> +	if (!smu->ppt_funcs || !smu->ppt_funcs->set_gfx_power_up_by_imu)
>>  		return -EOPNOTSUPP;
>>
>>  	return smu->ppt_funcs->set_gfx_power_up_by_imu(smu);
>> --
>> 2.17.1
RE: [PATCH] drm/amdgpu: add mb for si
[AMD Official Use Only - General]

> -----Original Message-----
> From: Lazar, Lijo
> Sent: Thursday, November 24, 2022 6:49 PM
> To: Quan, Evan ; 李真能 ; Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>
> On 11/24/2022 4:11 PM, Lazar, Lijo wrote:
>> On 11/24/2022 3:34 PM, Quan, Evan wrote:
>>> [AMD Official Use Only - General]
>>>
>>> Could the attached patch help?
>>>
>>> Evan
>>>> -----Original Message-----
>>>> From: amd-gfx On Behalf Of 李真能
>>>> Sent: Friday, November 18, 2022 5:25 PM
>>>> To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>>>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>>>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>>>
>>>> On 2022/11/18 17:18, Michel Dänzer wrote:
>>>>> On 11/18/22 09:01, Christian König wrote:
>>>>>> Am 18.11.22 um 08:48 schrieb Zhenneng Li:
>>>>>>> During reboot test on arm64 platform, it may fail on boot, so
>>>>>>> add this mb in smc.
>>>>>>>
>>>>>>> The error messages are as follows:
>>>>>>> [    6.996395][  7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR*
>>>>>>> late_init of IP block failed -22
>>>>>>> [    7.006919][  7] [  T295] amdgpu :04:00.0:
>> The issue is happening in late_init() which eventually does
>>
>> 	ret = si_thermal_enable_alert(adev, false);
>>
>> Just before this, si_thermal_start_thermal_controller is called in
>> hw_init and that enables thermal alert.
>>
>> Maybe the issue is with enable/disable of thermal alerts in quick
>> succession. Adding a delay inside si_thermal_start_thermal_controller
>> might help.
>
> On a second look, temperature range is already set as part of
> si_thermal_start_thermal_controller in hw_init
> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L6780
>
> There is no need to set it again here -
> https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L7635
>
> I think it is safe to remove the call from late_init altogether. Alex/Evan?

[Quan, Evan] Yes, it makes sense to me. But I'm not sure whether that's
related with the issue here. Since per my understandings, if the issue is
caused by double calling of thermal_alert enablement, it will fail every
time. That cannot explain why adding some delays or a mb() calling can
help.

BR
Evan

> Thanks,
> Lijo
>
>> Thanks,
>> Lijo
>>
>>>>>>> amdgpu_device_ip_late_init failed
>>>>>>> [    7.014224][  7] [  T295] amdgpu :04:00.0: Fatal error during GPU init
>>>>>> Memory barries are not supposed to be sprinkled around like this,
>>>>>> you need to give a detailed explanation why this is necessary.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Zhenneng Li
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>> index 8f994ffa9cd1..c7656f22278d 100644
>>>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
>>>>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>>>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>>>>>> +	mb();
>>>>>>> +
>>>>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>>>>>>  		return true;
>>>>> In particular, it makes no sense in this specific place, since it
>>>>> cannot directly affect the values of rst & clk.
>>>>
>>>> I think so too.
>>>>
>>>> But when I do reboot test using nine desktop machines, there may be
>>>> this error reported on one or two machines after hundreds or thousands
>>>> of reboot cycles. At the beginning, I used msleep() instead of mb();
>>>> these two methods both work, but I don't know what the root cause is.
>>>>
>>>> I used this method on another vendor's oland card, and this error
>>>> message was reported again.
>>>>
>>>> What could be the root reason?
>>>>
>>>> test environment:
>>>>
>>>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>>>>
>>>> driver: amdgpu
>>>>
>>>> os: ubuntu 20.04
>>>>
>>>> platform: arm64
>>>>
>>>> kernel: 5.4.18
RE: [PATCH] drm/amdgpu: add mb for si
[AMD Official Use Only - General]

Did you see that? It's a patch which I created by git-format-patch.
Anyway, I will paste the changes below.
I was suspecting maybe we need some waits for smu running.

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 49c398ec0aaf..9f308a021b2d 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -6814,6 +6814,7 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	struct si_power_info *si_pi = si_get_pi(adev);
 	struct amdgpu_ps *boot_ps = adev->pm.dpm.boot_ps;
 	int ret;
+	int i;
 
 	if (amdgpu_si_is_smc_running(adev))
 		return -EINVAL;
@@ -6909,6 +6910,17 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	si_program_response_times(adev);
 	si_program_ds_registers(adev);
 	si_dpm_start_smc(adev);
+	/* Waiting for smc alive */
+	for (i = 0; i < adev->usec_timeout; i++) {
+		if (amdgpu_si_is_smc_running(adev))
+			break;
+		udelay(1);
+	}
+	if (i >= adev->usec_timeout) {
+		DRM_ERROR("Timedout on waiting for smu running\n");
+		return -EINVAL;
+	}
+
 	ret = si_notify_smc_display_change(adev, false);
 	if (ret) {
 		DRM_ERROR("si_notify_smc_display_change failed\n");

BR
Evan

> -----Original Message-----
> From: Christian König
> Sent: Thursday, November 24, 2022 6:06 PM
> To: Quan, Evan ; 李真能 ; Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
> Cc: dri-de...@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>
> That's not a patch but some binary file?
>
> Christian.
>
> Am 24.11.22 um 11:04 schrieb Quan, Evan:
>> [AMD Official Use Only - General]
>>
>> Could the attached patch help?
>>
>> Evan
>>> -----Original Message-----
>>> From: amd-gfx On Behalf Of 李真能
>>> Sent: Friday, November 18, 2022 5:25 PM
>>> To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander
>>> Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>>>
>>> On 2022/11/18 17:18, Michel Dänzer wrote:
>>>> On 11/18/22 09:01, Christian König wrote:
>>>>> Am 18.11.22 um 08:48 schrieb Zhenneng Li:
>>>>>> During reboot test on arm64 platform, it may fail on boot, so
>>>>>> add this mb in smc.
>>>>>>
>>>>>> The error messages are as follows:
>>>>>> [    6.996395][  7] [  T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR*
>>>>>> late_init of IP block failed -22
>>>>>> [    7.006919][  7] [  T295] amdgpu :04:00.0:
>>>>>> amdgpu_device_ip_late_init failed
>>>>>> [    7.014224][  7] [  T295] amdgpu :04:00.0: Fatal error during GPU init
>>>>> Memory barries are not supposed to be sprinkled around like this,
>>>>> you need to give a detailed explanation why this is necessary.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Zhenneng Li
>>>>>> ---
>>>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
>>>>>>  1 file changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>> index 8f994ffa9cd1..c7656f22278d 100644
>>>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
>>>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
>>>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
>>>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
>>>>>> +	mb();
>>>>>> +
>>>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
>>>>>>  		return true;
>>>> In particular, it makes no sense in this specific place, since it
>>>> cannot directly affect the values of rst & clk.
>>>
>>> I think so too.
>>>
>>> But when I do reboot test using nine desktop machines, there may be
>>> this error reported on one or two machines after hundreds or thousands
>>> of reboot cycles. At the beginning, I used msleep() instead of mb();
>>> these two methods both work, but I don't know what the root cause is.
>>>
>>> I used this method on another vendor's oland card, and this error
>>> message was reported again.
>>>
>>> What could be the root reason?
>>>
>>> test environment:
>>>
>>> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>>>
>>> driver: amdgpu
>>>
>>> os: ubuntu 20.04
>>>
>>> platform: arm64
>>>
>>> kernel: 5.4.18
Re: [PATCH] drm/amdgpu: Fix minmax error
This patch is Reviewed-by: James Zhu

On 2022-11-24 16:19, Luben Tuikov wrote:
> Fix minmax compilation error by using min_t()/max_t(), of the assignment
> type.
>
> Cc: James Zhu
> Cc: Felix Kuehling
> Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory")
> Signed-off-by: Luben Tuikov
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> index 8a2e5716d8dba2..d22d14b0ef0c84 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>   	hmm_range->dev_private_owner = owner;
>
>   	do {
> -		hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end);
> +		hmm_range->end = min_t(typeof(hmm_range->end),
> +				       hmm_range->start + MAX_WALK_BYTE,
> +				       end);
>
>   		pr_debug("hmm range: start = 0x%lx, end = 0x%lx",
>   			 hmm_range->start, hmm_range->end);
>
>   		/* Assuming 512MB takes maxmium 1 second to fault page address */
> -		timeout = max((hmm_range->end - hmm_range->start) >> 29, 1ULL) *
> -			  HMM_RANGE_DEFAULT_TIMEOUT;
> +		timeout = max_t(typeof(timeout),
> +				(hmm_range->end - hmm_range->start) >> 29,
> +				1ULL);
> +		timeout *= HMM_RANGE_DEFAULT_TIMEOUT;
>   		timeout = jiffies + msecs_to_jiffies(timeout);
>
> retry:
>
> base-commit: d5e8f4912061ad2e577b4909556e1364e2c2018e
> prerequisite-patch-id: 6024d0c36cae3e4a995a8fcf787b91f511a37486
Re: [PATCH 19/29] drm/amdkfd: add debug set exceptions enabled operation
Am 2022-10-31 um 12:23 schrieb Jonathan Kim:
> The debugger subscribes to notification for requested exceptions on attach.
> Allow the debugger to change its subscription later on.
>
> Signed-off-by: Jonathan Kim

Reviewed-by: Felix Kuehling

> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  3 ++
>   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 36
>   drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  2 ++
>   3 files changed, 41 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 27cd5af72521..61612b9bdf8c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -2887,6 +2887,9 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v
>   			args->send_runtime_event.exception_mask);
>   		break;
>   	case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED:
> +		kfd_dbg_set_enabled_debug_exception_mask(target,
> +				args->set_exceptions_enabled.exception_mask);
> +		break;
>   	case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE:
>   	case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE:
>   	case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES:
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> index 3d304e8c286e..594ccca25cae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> @@ -441,3 +441,39 @@ int kfd_dbg_trap_enable(struct kfd_process *target, uint32_t fd,
>
>   	return r;
>   }
> +
> +void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target,
> +					uint64_t exception_set_mask)
> +{
> +	uint64_t found_mask = 0;
> +	struct process_queue_manager *pqm;
> +	struct process_queue_node *pqn;
> +	static const char write_data = '.';
> +	loff_t pos = 0;
> +	int i;
> +
> +	mutex_lock(&target->event_mutex);
> +
> +	found_mask |= target->exception_status;
> +
> +	pqm = &target->pqm;
> +	list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
> +		if (!pqn)
> +			continue;
> +
> +		found_mask |= pqn->q->properties.exception_status;
> +	}
> +
> +	for (i = 0; i < target->n_pdds; i++) {
> +		struct kfd_process_device *pdd = target->pdds[i];
> +
> +		found_mask |= pdd->exception_status;
> +	}
> +
> +	if (exception_set_mask & found_mask)
> +		kernel_write(target->dbg_ev_file, &write_data, 1, &pos);
> +
> +	target->exception_enable_mask = exception_set_mask;
> +
> +	mutex_unlock(&target->event_mutex);
> +}
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> index 5270d5749828..837e09491a76 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> @@ -58,6 +58,8 @@ static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_dev *dev)
>
>   void debug_event_write_work_handler(struct work_struct *work);
>
> +void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target,
> +					uint64_t exception_set_mask);
>   /*
>    * If GFX off is enabled, chips that do not support RLC restore for the debug
>    * registers will disable GFX off temporarily for the entire debug session.
[PATCH] drm/amdgpu: Fix minmax error
Fix minmax compilation error by using min_t()/max_t(), of the assignment type. Cc: James Zhu Cc: Felix Kuehling Fixes: 58170a7a002ad6 ("drm/amdgpu: fix stall on CPU when allocate large system memory") Signed-off-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c index 8a2e5716d8dba2..d22d14b0ef0c84 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c @@ -191,14 +191,18 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier, hmm_range->dev_private_owner = owner; do { - hmm_range->end = min(hmm_range->start + MAX_WALK_BYTE, end); + hmm_range->end = min_t(typeof(hmm_range->end), + hmm_range->start + MAX_WALK_BYTE, + end); pr_debug("hmm range: start = 0x%lx, end = 0x%lx", hmm_range->start, hmm_range->end); /* Assuming 512MB takes maxmium 1 second to fault page address */ - timeout = max((hmm_range->end - hmm_range->start) >> 29, 1ULL) * - HMM_RANGE_DEFAULT_TIMEOUT; + timeout = max_t(typeof(timeout), + (hmm_range->end - hmm_range->start) >> 29, + 1ULL); + timeout *= HMM_RANGE_DEFAULT_TIMEOUT; timeout = jiffies + msecs_to_jiffies(timeout); retry: base-commit: d5e8f4912061ad2e577b4909556e1364e2c2018e prerequisite-patch-id: 6024d0c36cae3e4a995a8fcf787b91f511a37486 -- 2.39.0.rc0
[PATCH 09/12] drm/amd/display: Don't overwrite subvp pipe info in fast updates
From: Alvin Lee [Description] - This is a workaround to avoid concurrency issues -- a fast update creates a shallow copy of the dc current_state, and removes all subvp/phantom related flags. - We want to prevent the fast update thread from removing those flags in case there's another thread running that requires the info for proper programming Reviewed-by: Jun Lei Acked-by: Jasdeep Dhillon Signed-off-by: Alvin Lee --- drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +- .../drm/amd/display/dc/dcn32/dcn32_resource.c | 64 +++ .../drm/amd/display/dc/dcn32/dcn32_resource.h | 2 +- .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 4 +- .../gpu/drm/amd/display/dc/inc/core_types.h | 2 +- 5 files changed, 44 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index a7bfe0b6a5f3..87994ae0a397 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -3061,7 +3061,7 @@ static bool update_planes_and_stream_state(struct dc *dc, * Ensures that we have enough pipes for newly added MPO planes */ if (dc->res_pool->funcs->remove_phantom_pipes) - dc->res_pool->funcs->remove_phantom_pipes(dc, context); + dc->res_pool->funcs->remove_phantom_pipes(dc, context, false); /*remove old surfaces from context */ if (!dc_rem_all_planes_for_stream(dc, stream, context)) { diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c index 06489df85ac1..e4dbc8353ea3 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c @@ -1743,7 +1743,7 @@ void dcn32_retain_phantom_pipes(struct dc *dc, struct dc_state *context) } // return true if removed piped from ctx, false otherwise -bool dcn32_remove_phantom_pipes(struct dc *dc, struct dc_state *context) +bool dcn32_remove_phantom_pipes(struct dc *dc, struct dc_state *context, bool fast_update) { int i; bool 
removed_pipe = false; @@ -1770,14 +1770,23 @@ bool dcn32_remove_phantom_pipes(struct dc *dc, struct dc_state *context) removed_pipe = true; } - // Clear all phantom stream info - if (pipe->stream) { - pipe->stream->mall_stream_config.type = SUBVP_NONE; - pipe->stream->mall_stream_config.paired_stream = NULL; - } + /* For non-full updates, a shallow copy of the current state +* is created. In this case we don't want to erase the current +* state (there can be 2 HIRQL threads, one in flip, and one in +* checkMPO) that can cause a race condition. +* +* This is just a workaround, needs a proper fix. +*/ + if (!fast_update) { + // Clear all phantom stream info + if (pipe->stream) { + pipe->stream->mall_stream_config.type = SUBVP_NONE; + pipe->stream->mall_stream_config.paired_stream = NULL; + } - if (pipe->plane_state) { - pipe->plane_state->is_phantom = false; + if (pipe->plane_state) { + pipe->plane_state->is_phantom = false; + } } } return removed_pipe; @@ -1950,23 +1959,28 @@ int dcn32_populate_dml_pipes_from_context( pipes[pipe_cnt].pipe.src.unbounded_req_mode = false; pipes[pipe_cnt].pipe.scale_ratio_depth.lb_depth = dm_lb_19; - switch (pipe->stream->mall_stream_config.type) { - case SUBVP_MAIN: - pipes[pipe_cnt].pipe.src.use_mall_for_pstate_change = dm_use_mall_pstate_change_sub_viewport; - subvp_in_use = true; - break; - case SUBVP_PHANTOM: - pipes[pipe_cnt].pipe.src.use_mall_for_pstate_change = dm_use_mall_pstate_change_phantom_pipe; - pipes[pipe_cnt].pipe.src.use_mall_for_static_screen = dm_use_mall_static_screen_disable; - // Disallow unbounded req for SubVP according to DCHUB programming guide - pipes[pipe_cnt].pipe.src.unbounded_req_mode = false; - break; - case SUBVP_NONE: - pipes[pipe_cnt].pipe.src.use_mall_for_pstate_change = dm_use_mall_pstate_change_disable; - pipes[pipe_cnt].pipe.src.use_mall_for_static_screen = dm_use_mall_static_screen_disable; - break; - default: -
[PATCH 08/12] drm/amd/display: program output tf when required
From: Dillon Varone [Description] Output transfer function must be programmed per pipe as part of a front end update when the plane changes, or output transfer function changes for a given plane. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c index db57b17061ae..bc4a303cd864 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -1741,7 +1741,10 @@ static void dcn20_program_pipe( * only do gamma programming for powering on, internal memcmp to avoid * updating on slave planes */ - if (pipe_ctx->update_flags.bits.enable || pipe_ctx->stream->update_flags.bits.out_tf) + if (pipe_ctx->update_flags.bits.enable || + pipe_ctx->update_flags.bits.plane_changed || + pipe_ctx->stream->update_flags.bits.out_tf || + pipe_ctx->plane_state->update_flags.bits.output_tf_change) hws->funcs.set_output_transfer_func(dc, pipe_ctx, pipe_ctx->stream); /* If the pipe has been enabled or has a different opp, we -- 2.34.1
[PATCH 12/12] drm/amd/display: 3.2.214
From: Aric Cyr This version brings along the following fixes: -Program output transfer function when required -Fix arithmetic error in MALL size calculations for subvp -DCC Meta pitch used for MALL allocation -Debugfs entry to tell if connector is DPIA link -Use largest vready_offset in pipe group -Fixes race condition in DPIA Aux transfer Reviewed-by: Rodrigo Siqueira Acked-by: Jasdeep Dhillon Signed-off-by: Aric Cyr --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index a19a890f1d76..4a7c0356d9c7 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -47,7 +47,7 @@ struct aux_payload; struct set_config_cmd_payload; struct dmub_notification; -#define DC_VER "3.2.213" +#define DC_VER "3.2.214" #define MAX_SURFACES 3 #define MAX_PLANES 6 -- 2.34.1
[PATCH 06/12] drm/amd/display: Retain phantom pipes when min transition into subvp
From: Alvin Lee [Description] - When entering into a SubVP config that requires a minimal transition we need to retain phantom pipes and also restore the mall config - This is because the min transition will remove phantom pipes from the context (shallow copy) and not restore its original state - This is just a workaround, and needs a proper fix Reviewed-by: Jun Lei Acked-by: Jasdeep Dhillon Signed-off-by: Alvin Lee --- drivers/gpu/drm/amd/display/dc/core/dc.c | 21 ++- drivers/gpu/drm/amd/display/dc/dc_stream.h| 11 ++ .../drm/amd/display/dc/dcn32/dcn32_resource.c | 2 ++ .../drm/amd/display/dc/dcn32/dcn32_resource.h | 11 -- .../amd/display/dc/dcn321/dcn321_resource.c | 2 ++ .../gpu/drm/amd/display/dc/inc/core_types.h | 2 ++ 6 files changed, 37 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index f9b8b6f6fd31..a7bfe0b6a5f3 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -3954,6 +3954,7 @@ bool dc_update_planes_and_stream(struct dc *dc, struct dc_state *context; enum surface_update_type update_type; int i; + struct mall_temp_config mall_temp_config; /* In cases where MPO and split or ODM are used transitions can * cause underflow. Apply stream configuration with minimal pipe @@ -3985,11 +3986,29 @@ bool dc_update_planes_and_stream(struct dc *dc, /* on plane removal, minimal state is the new one */ if (force_minimal_pipe_splitting && !is_plane_addition) { + /* Since all phantom pipes are removed in full validation, +* we have to save and restore the subvp/mall config when +* we do a minimal transition since the flags marking the +* pipe as subvp/phantom will be cleared (dc copy constructor +* creates a shallow copy). 
+*/ + if (dc->res_pool->funcs->save_mall_state) + dc->res_pool->funcs->save_mall_state(dc, context, &mall_temp_config); if (!commit_minimal_transition_state(dc, context)) { dc_release_state(context); return false; } - + if (dc->res_pool->funcs->restore_mall_state) + dc->res_pool->funcs->restore_mall_state(dc, context, &mall_temp_config); + + /* If we do a minimal transition with plane removal and the context +* has subvp we also have to retain back the phantom stream / planes +* since the refcount is decremented as part of the min transition +* (we commit a state with no subvp, so the phantom streams / planes +* had to be removed). +*/ + if (dc->res_pool->funcs->retain_phantom_pipes) + dc->res_pool->funcs->retain_phantom_pipes(dc, context); update_type = UPDATE_TYPE_FULL; } diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h b/drivers/gpu/drm/amd/display/dc/dc_stream.h index e0cee9666c48..dfd3df1d2f7e 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_stream.h +++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h @@ -160,6 +160,17 @@ struct mall_stream_config { struct dc_stream_state *paired_stream; // master / slave stream }; +/* Temp struct used to save and restore MALL config + * during validation. + * + * TODO: Move MALL config into dc_state instead of stream struct + * to avoid needing to save/restore. 
+ */ +struct mall_temp_config { + struct mall_stream_config mall_stream_config[MAX_PIPES]; + bool is_phantom_plane[MAX_PIPES]; +}; + struct dc_stream_state { // sink is deprecated, new code should not reference // this pointer diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c index 99ddd2232322..06489df85ac1 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c @@ -2055,6 +2055,8 @@ static struct resource_funcs dcn32_res_pool_funcs = { .add_phantom_pipes = dcn32_add_phantom_pipes, .remove_phantom_pipes = dcn32_remove_phantom_pipes, .retain_phantom_pipes = dcn32_retain_phantom_pipes, + .save_mall_state = dcn32_save_mall_state, + .restore_mall_state = dcn32_restore_mall_state, }; diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h index c50bb34b515f..4e6b71832187 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.h @@ -45,17 +45,6 @@ extern struct _vcs_dpi_ip_params_st dcn3_2_ip; extern struct _vcs_dpi_soc_bounding_box_st dcn3_2_soc; -/* Temp struct used to
[PATCH 10/12] drm/amd/display: Fix DTBCLK disable requests and SRC_SEL programming
From: Alvin Lee [Description] - When transitioning FRL / DP2 is not required, we will always request DTBCLK = 0 MHz, but PMFW returns the min freq - This causes us to make DTBCLK requests every time we call optimize after transitioning from FRL to non-FRL - If DTBCLK is not required, request the min instead (then we only need to make 1 extra request at boot time) - Also when programming PIPE_DTO_SRC_SEL, don't program for DP first, just program once for the required selection (programming DP on an HDMI connection then switching back causes corruption) Reviewed-by: Dillon Varone Acked-by: Jasdeep Dhillon Signed-off-by: Alvin Lee --- .../gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c| 2 +- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c | 6 +- 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c index 6f77d8e538ab..9eb9fe5b8d2c 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c @@ -438,7 +438,7 @@ static void dcn32_update_clocks(struct clk_mgr *clk_mgr_base, } if (!new_clocks->dtbclk_en) { - new_clocks->ref_dtbclk_khz = 0; + new_clocks->ref_dtbclk_khz = clk_mgr_base->bw_params->clk_table.entries[0].dtbclk_mhz * 1000; } /* clock limits are received with MHz precision, divide by 1000 to prevent setting clocks at every call */ diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c index df4f25119142..e4472c6be6c3 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c @@ -225,11 +225,7 @@ static void dccg32_set_dtbclk_dto( } else { REG_UPDATE_2(OTG_PIXEL_RATE_CNTL[params->otg_inst], DTBCLK_DTO_ENABLE[params->otg_inst], 0, - PIPE_DTO_SRC_SEL[params->otg_inst], 1); - if (params->is_hdmi) - 
REG_UPDATE(OTG_PIXEL_RATE_CNTL[params->otg_inst], - PIPE_DTO_SRC_SEL[params->otg_inst], 0); - + PIPE_DTO_SRC_SEL[params->otg_inst], params->is_hdmi ? 0 : 1); REG_WRITE(DTBCLK_DTO_MODULO[params->otg_inst], 0); REG_WRITE(DTBCLK_DTO_PHASE[params->otg_inst], 0); } -- 2.34.1
[PATCH 11/12] drm/amd/display: set per pipe dppclk to 0 when dpp is off
From: Dmytro Laktyushkin The 'commit 52e4fdf09ebc ("drm/amd/display: use low clocks for no plane configs")' introduced a change that set low clock values for DCN31 and DCN32. As a result of these changes, DC started to spam the log with the following warning: [ cut here ] WARNING: CPU: 8 PID: 1486 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_dccg.c:58 dccg2_update_dpp_dto+0x3f/0xf0 [amdgpu] [..] CPU: 8 PID: 1486 Comm: kms_atomic Tainted: G W 5.18.0+ #1 RIP: 0010:dccg2_update_dpp_dto+0x3f/0xf0 [amdgpu] RSP: 0018:bbd8025334d0 EFLAGS: 00010206 RAX: 01ee RBX: a02c87dd3de0 RCX: 000a7f80 RDX: 0007dec3 RSI: RDI: a02c87dd3de0 RBP: bbd8025334e8 R08: 0001 R09: 0005 R10: 000331a0 R11: c0b03d80 R12: a02ca576d000 R13: a02cd02c R14: 001453bc R15: a02cdc28 [..] dcn20_update_clocks_update_dpp_dto+0x4e/0xa0 [amdgpu] dcn32_update_clocks+0x5d9/0x650 [amdgpu] dcn20_prepare_bandwidth+0x49/0x100 [amdgpu] dcn30_prepare_bandwidth+0x63/0x80 [amdgpu] dc_commit_state_no_check+0x39d/0x13e0 [amdgpu] dc_commit_streams+0x1f9/0x3b0 [amdgpu] dc_commit_state+0x37/0x120 [amdgpu] amdgpu_dm_atomic_commit_tail+0x5e5/0x2520 [amdgpu] ? _raw_spin_unlock_irqrestore+0x1f/0x40 ? down_trylock+0x2c/0x40 ? vprintk_emit+0x186/0x2c0 ? vprintk_default+0x1d/0x20 ? vprintk+0x4e/0x60 We can easily trigger this issue by using a 4k@120 or a 2k@165 and running some of the kms_atomic tests. This warning is triggered because the per-pipe clock update is not happening; this commit fixes this issue by ensuring that DPPCLK is updated when calculating the watermark and dlg is invoked. 
Fixes: 52e4fdf09ebc ("drm/amd/display: use low clocks for no plane configs") Reported-by: Mark Broadworth Reviewed-by: Rodrigo Siqueira Signed-off-by: Dmytro Laktyushkin --- drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 3 +++ drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 5 - 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c index 12b23bd50e19..b37d14369a62 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c @@ -559,6 +559,9 @@ void dcn31_calculate_wm_and_dlg_fp( context->bw_ctx.bw.dcn.clk.dramclk_khz = 0; context->bw_ctx.bw.dcn.clk.fclk_khz = 0; context->bw_ctx.bw.dcn.clk.p_state_change_support = true; + for (i = 0; i < dc->res_pool->pipe_count; i++) + if (context->res_ctx.pipe_ctx[i].stream) + context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = 0; } } diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c index 5a4cdb559d4e..f94abd124021 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c @@ -1320,7 +1320,10 @@ static void dcn32_calculate_dlg_params(struct dc *dc, struct dc_state *context, if (context->bw_ctx.bw.dcn.clk.dppclk_khz < pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000) context->bw_ctx.bw.dcn.clk.dppclk_khz = pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000; - context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000; + if (context->res_ctx.pipe_ctx[i].plane_state) + context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000; + else + context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = 0; context->res_ctx.pipe_ctx[i].pipe_dlg_param = pipes[pipe_idx].pipe.dest; pipe_idx++; } -- 2.34.1
[PATCH 05/12] drm/amd/display: Use DCC meta pitch for MALL allocation requirements
From: Dillon Varone [Description] Calculations for determining DCC meta size should be pitch*height*bpp/256. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c| 6 +++--- .../drm/amd/display/dc/dcn32/dcn32_resource_helpers.c | 11 --- 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c index 76548b4b822c..c9b2343947be 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -262,11 +262,11 @@ static uint32_t dcn32_calculate_cab_allocation(struct dc *dc, struct dc_state *c num_mblks = ((mall_alloc_width_blk_aligned + mblk_width - 1) / mblk_width) * ((mall_alloc_height_blk_aligned + mblk_height - 1) / mblk_height); - /* For DCC: -* meta_num_mblk = CEILING(full_mblk_width_ub_l*full_mblk_height_ub_l*Bpe/256/mblk_bytes, 1) + /*For DCC: +* meta_num_mblk = CEILING(meta_pitch*full_vp_height*Bpe/256/mblk_bytes, 1) */ if (pipe->plane_state->dcc.enable) - num_mblks += (mall_alloc_width_blk_aligned * mall_alloc_width_blk_aligned * bytes_per_pixel + + num_mblks += (pipe->plane_state->dcc.meta_pitch * pipe->plane_res.scl_data.viewport.height * bytes_per_pixel + (256 * DCN3_2_MALL_MBLK_SIZE_BYTES) - 1) / (256 * DCN3_2_MALL_MBLK_SIZE_BYTES); bytes_in_mall = num_mblks * DCN3_2_MALL_MBLK_SIZE_BYTES; diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c index fa3778849db1..94fd125daa6b 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c @@ -121,14 +121,19 @@ uint32_t dcn32_helper_calculate_num_ways_for_subvp(struct dc *dc, struct dc_stat */ num_mblks = ((mall_alloc_width_blk_aligned + mblk_width - 1) / mblk_width) * ((mall_alloc_height_blk_aligned + 
mblk_height - 1) / mblk_height); + + /*For DCC: +* meta_num_mblk = CEILING(meta_pitch*full_vp_height*Bpe/256/mblk_bytes, 1) +*/ + if (pipe->plane_state->dcc.enable) + num_mblks += (pipe->plane_state->dcc.meta_pitch * pipe->plane_res.scl_data.viewport.height * bytes_per_pixel + + (256 * DCN3_2_MALL_MBLK_SIZE_BYTES) - 1) / (256 * DCN3_2_MALL_MBLK_SIZE_BYTES); + bytes_in_mall = num_mblks * DCN3_2_MALL_MBLK_SIZE_BYTES; // cache lines used is total bytes / cache_line size. Add +2 for worst case alignment // (MALL is 64-byte aligned) cache_lines_per_plane = bytes_in_mall / dc->caps.cache_line_size + 2; - /* For DCC divide by 256 */ - if (pipe->plane_state->dcc.enable) - cache_lines_per_plane = cache_lines_per_plane + (cache_lines_per_plane / 256) + 1; cache_lines_used += cache_lines_per_plane; } } -- 2.34.1
[PATCH 07/12] drm/amd/display: Fix arithmetic error in MALL size calculations for subvp
From: Dillon Varone [Description] Need to subtract unused section of the viewport when calculating required space in MALL for subvp instead of adding, to prevent over allocation. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c index 94fd125daa6b..783935c4e664 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource_helpers.c @@ -97,14 +97,14 @@ uint32_t dcn32_helper_calculate_num_ways_for_subvp(struct dc *dc, struct dc_stat * FLOOR(vp_x_start, blk_width) */ full_vp_width_blk_aligned = ((pipe->plane_res.scl_data.viewport.x + - pipe->plane_res.scl_data.viewport.width + mblk_width - 1) / mblk_width * mblk_width) + + pipe->plane_res.scl_data.viewport.width + mblk_width - 1) / mblk_width * mblk_width) - (pipe->plane_res.scl_data.viewport.x / mblk_width * mblk_width); /* full_vp_height_blk_aligned = FLOOR(vp_y_start + full_vp_height + blk_height - 1, blk_height) - * FLOOR(vp_y_start, blk_height) */ full_vp_height_blk_aligned = ((pipe->plane_res.scl_data.viewport.y + - full_vp_height + mblk_height - 1) / mblk_height * mblk_height) + + full_vp_height + mblk_height - 1) / mblk_height * mblk_height) - (pipe->plane_res.scl_data.viewport.y / mblk_height * mblk_height); /* mall_alloc_width_blk_aligned_l/c = full_vp_width_blk_aligned_l/c */ -- 2.34.1
[PATCH 01/12] drm/amd/display: Fix race condition in DPIA AUX transfer
From: Stylon Wang [Why] This fix was intended for improving on coding style but in the process uncovers a race condition, which explains why we are getting incorrect length in DPIA AUX replies. Due to the call path of DPIA AUX going from DC back to DM layer then again into DC and the added complexities on top of current DC AUX implementation, a proper fix to rely on current dc_lock to address the race condition is difficult without a major overhaul on how DPIA AUX is implemented. [How] - Add a mutex dpia_aux_lock to protect DPIA AUX transfers - Remove DMUB_ASYNC_TO_SYNC_ACCESS_* codes and rely solely on aux_return_code_type for error reporting and handling - Separate SET_CONFIG from DPIA AUX transfer because they have quite different processing logic - Remove unnecessary type casting to and from void * type Reviewed-by: Nicholas Kazlauskas Acked-by: Jasdeep Dhillon Signed-off-by: Stylon Wang --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 151 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 17 +- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 10 +- 3 files changed, 91 insertions(+), 87 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 4fe7971e3e58..da1be67831d6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -147,14 +147,6 @@ MODULE_FIRMWARE(FIRMWARE_NAVI12_DMCU); /* Number of bytes in PSP footer for firmware. 
*/ #define PSP_FOOTER_BYTES 0x100 -/* - * DMUB Async to Sync Mechanism Status - */ -#define DMUB_ASYNC_TO_SYNC_ACCESS_FAIL 1 -#define DMUB_ASYNC_TO_SYNC_ACCESS_TIMEOUT 2 -#define DMUB_ASYNC_TO_SYNC_ACCESS_SUCCESS 3 -#define DMUB_ASYNC_TO_SYNC_ACCESS_INVALID 4 - /** * DOC: overview * @@ -1442,6 +1434,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev) memset(&init_params, 0, sizeof(init_params)); #endif + mutex_init(&adev->dm.dpia_aux_lock); mutex_init(&adev->dm.dc_lock); mutex_init(&adev->dm.audio_lock); @@ -1806,6 +1799,7 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev) mutex_destroy(&adev->dm.audio_lock); mutex_destroy(&adev->dm.dc_lock); + mutex_destroy(&adev->dm.dpia_aux_lock); return; } @@ -10211,91 +10205,92 @@ uint32_t dm_read_reg_func(const struct dc_context *ctx, uint32_t address, return value; } -static int amdgpu_dm_set_dmub_async_sync_status(bool is_cmd_aux, - struct dc_context *ctx, - uint8_t status_type, - uint32_t *operation_result) +int amdgpu_dm_process_dmub_aux_transfer_sync( + struct dc_context *ctx, + unsigned int link_index, + struct aux_payload *payload, + enum aux_return_code_type *operation_result) { struct amdgpu_device *adev = ctx->driver_context; - int return_status = -1; struct dmub_notification *p_notify = adev->dm.dmub_notify; + int ret = -1; - if (is_cmd_aux) { - if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_SUCCESS) { - return_status = p_notify->aux_reply.length; - *operation_result = p_notify->result; - } else if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_TIMEOUT) { - *operation_result = AUX_RET_ERROR_TIMEOUT; - } else if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_FAIL) { - *operation_result = AUX_RET_ERROR_ENGINE_ACQUIRE; - } else if (status_type == DMUB_ASYNC_TO_SYNC_ACCESS_INVALID) { - *operation_result = AUX_RET_ERROR_INVALID_REPLY; - } else { - *operation_result = AUX_RET_ERROR_UNKNOWN; + mutex_lock(&adev->dm.dpia_aux_lock); + if (!dc_process_dmub_aux_transfer_async(ctx->dc, link_index, payload)) { + 
*operation_result = AUX_RET_ERROR_ENGINE_ACQUIRE; + goto out; + } + + if (!wait_for_completion_timeout(&adev->dm.dmub_aux_transfer_done, 10 * HZ)) { + DRM_ERROR("wait_for_completion_timeout timeout!"); + *operation_result = AUX_RET_ERROR_TIMEOUT; + goto out; + } + + if (p_notify->result != AUX_RET_SUCCESS) { + /* +* Transient states before tunneling is enabled could +* lead to this error. We can ignore this for now. +*/ + if (p_notify->result != AUX_RET_ERROR_PROTOCOL_ERROR) { + DRM_WARN("DPIA AUX failed on 0x%x(%d), error %d\n", + payload->address, payload->length, + p_notify->result); } - } else { - if (status_type == DMUB_ASYNC_TO
[PATCH 04/12] drm/amd/display: MALL SS calculations should iterate over all pipes for cursor
From: Dillon Varone [Description] MALL SS allocation calculations should iterate over all pipes to determine the allocation size required for HW cursor. Reviewed-by: Alvin Lee Acked-by: Jasdeep Dhillon Signed-off-by: Dillon Varone --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c index 2f19f711d8be..76548b4b822c 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -316,8 +316,8 @@ static uint32_t dcn32_calculate_cab_allocation(struct dc *dc, struct dc_state *c cache_lines_used += (((cursor_size + DCN3_2_MALL_MBLK_SIZE_BYTES - 1) / DCN3_2_MALL_MBLK_SIZE_BYTES) * DCN3_2_MALL_MBLK_SIZE_BYTES) / dc->caps.cache_line_size + 2; + break; } - break; } } -- 2.34.1
[PATCH 03/12] drm/amd/display: Create debugfs to tell if connector is DPIA link
From: Stylon Wang [Why] Tests need to tell if display is connected via USB4 DPIA link. Currently this is only possible via analyzing dmesg logs. [How] Create a per-connector debugfs entry to report if the link is tunneled via USB4 DPIA. Reviewed-by: Wayne Lin Acked-by: Jasdeep Dhillon Signed-off-by: Stylon Wang --- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 2c43cdd2e707..461037a3dd75 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -2639,6 +2639,25 @@ static int dp_mst_progress_status_show(struct seq_file *m, void *unused) return 0; } +/* + * Reports whether the connected display is a USB4 DPIA tunneled display + * Example usage: cat /sys/kernel/debug/dri/0/DP-8/is_dpia_link + */ +static int is_dpia_link_show(struct seq_file *m, void *data) +{ + struct drm_connector *connector = m->private; + struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector); + struct dc_link *link = aconnector->dc_link; + + if (connector->status != connector_status_connected) + return -ENODEV; + + seq_printf(m, "%s\n", (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA) ? "yes" : + (link->ep_type == DISPLAY_ENDPOINT_PHY) ? 
"no" : "unknown"); + + return 0; +} + DEFINE_SHOW_ATTRIBUTE(dp_dsc_fec_support); DEFINE_SHOW_ATTRIBUTE(dmub_fw_state); DEFINE_SHOW_ATTRIBUTE(dmub_tracebuffer); @@ -2650,6 +2669,7 @@ DEFINE_SHOW_ATTRIBUTE(internal_display); DEFINE_SHOW_ATTRIBUTE(psr_capability); DEFINE_SHOW_ATTRIBUTE(dp_is_mst_connector); DEFINE_SHOW_ATTRIBUTE(dp_mst_progress_status); +DEFINE_SHOW_ATTRIBUTE(is_dpia_link); static const struct file_operations dp_dsc_clock_en_debugfs_fops = { .owner = THIS_MODULE, @@ -2794,7 +2814,8 @@ static const struct { {"max_bpc", &dp_max_bpc_debugfs_fops}, {"dsc_disable_passthrough", &dp_dsc_disable_passthrough_debugfs_fops}, {"is_mst_connector", &dp_is_mst_connector_fops}, - {"mst_progress_status", &dp_mst_progress_status_fops} + {"mst_progress_status", &dp_mst_progress_status_fops}, + {"is_dpia_link", &is_dpia_link_fops} }; #ifdef CONFIG_DRM_AMD_DC_HDCP -- 2.34.1
[PATCH 00/12] DC Patches November 28 2022
This DC patchset brings improvements in multiple areas. In summary, we have: * Program output transfer function when required * Fix arithmetic error in MALL size calculations for subvp * DCC Meta pitch used for MALL allocation * Debugfs entry to tell if connector is DPIA link * Use largest vready_offset in pipe group * Fixes race condition in DPIA Aux transfer Cc: Daniel Wheeler Alvin Lee (3): drm/amd/display: Retain phantom pipes when min transition into subvp drm/amd/display: Don't overwrite subvp pipe info in fast updates drm/amd/display: Fix DTBCLK disable requests and SRC_SEL programming Aric Cyr (1): drm/amd/display: 3.2.214 Dillon Varone (4): drm/amd/display: MALL SS calculations should iterate over all pipes for cursor drm/amd/display: Use DCC meta pitch for MALL allocation requirements drm/amd/display: Fix arithmetic error in MALL size calculations for subvp drm/amd/display: program output tf when required Dmytro Laktyushkin (1): drm/amd/display: set per pipe dppclk to 0 when dpp is off Stylon Wang (2): drm/amd/display: Fix race condition in DPIA AUX transfer drm/amd/display: Create debugfs to tell if connector is DPIA link Wesley Chalmers (1): drm/amd/display: Use the largest vready_offset in pipe group .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 151 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 17 +- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 23 ++- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 10 +- .../display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c | 2 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 23 ++- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- drivers/gpu/drm/amd/display/dc/dc_stream.h| 11 ++ .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 30 +++- .../drm/amd/display/dc/dcn20/dcn20_hwseq.c| 34 +++- .../gpu/drm/amd/display/dc/dcn32/dcn32_dccg.c | 6 +- .../drm/amd/display/dc/dcn32/dcn32_hwseq.c| 8 +- .../drm/amd/display/dc/dcn32/dcn32_resource.c | 66 +--- .../drm/amd/display/dc/dcn32/dcn32_resource.h | 13 +-
.../display/dc/dcn32/dcn32_resource_helpers.c | 15 +- .../amd/display/dc/dcn321/dcn321_resource.c | 2 + .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 3 + .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 9 +- .../gpu/drm/amd/display/dc/inc/core_types.h | 4 +- 19 files changed, 277 insertions(+), 152 deletions(-) -- 2.34.1
[PATCH 02/12] drm/amd/display: Use the largest vready_offset in pipe group
From: Wesley Chalmers

[WHY]
Corruption can occur in LB if vready_offset is not large enough.
DML calculates vready_offset for each pipe, but we currently select the
top pipe's vready_offset, which is not necessarily enough for all pipes
in the group.

[HOW]
Wherever program_global_sync is currently called, iterate through the
entire pipe group and find the highest vready_offset.

Reviewed-by: Dillon Varone
Acked-by: Jasdeep Dhillon
Signed-off-by: Wesley Chalmers
---
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 30 +-
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c    | 29 +-
 2 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 0db02e76dcc5..355ffed7380b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -869,6 +869,32 @@ static void false_optc_underflow_wa(
 		tg->funcs->clear_optc_underflow(tg);
 }
 
+static int calculate_vready_offset_for_group(struct pipe_ctx *pipe)
+{
+	struct pipe_ctx *other_pipe;
+	int vready_offset = pipe->pipe_dlg_param.vready_offset;
+
+	/* Always use the largest vready_offset of all connected pipes */
+	for (other_pipe = pipe->bottom_pipe; other_pipe != NULL; other_pipe = other_pipe->bottom_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->top_pipe; other_pipe != NULL; other_pipe = other_pipe->top_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->next_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->next_odm_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->prev_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->prev_odm_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+
+	return vready_offset;
+}
+
 enum dc_status dcn10_enable_stream_timing(
 	struct pipe_ctx *pipe_ctx,
 	struct dc_state *context,
@@ -912,7 +938,7 @@ enum dc_status dcn10_enable_stream_timing(
 	pipe_ctx->stream_res.tg->funcs->program_timing(
 			pipe_ctx->stream_res.tg,
 			&stream->timing,
-			pipe_ctx->pipe_dlg_param.vready_offset,
+			calculate_vready_offset_for_group(pipe_ctx),
 			pipe_ctx->pipe_dlg_param.vstartup_start,
 			pipe_ctx->pipe_dlg_param.vupdate_offset,
 			pipe_ctx->pipe_dlg_param.vupdate_width,
@@ -2908,7 +2934,7 @@ void dcn10_program_pipe(
 
 		pipe_ctx->stream_res.tg->funcs->program_global_sync(
 				pipe_ctx->stream_res.tg,
-				pipe_ctx->pipe_dlg_param.vready_offset,
+				calculate_vready_offset_for_group(pipe_ctx),
 				pipe_ctx->pipe_dlg_param.vstartup_start,
 				pipe_ctx->pipe_dlg_param.vupdate_offset,
 				pipe_ctx->pipe_dlg_param.vupdate_width);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 3f3d4daa6294..db57b17061ae 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1652,6 +1652,31 @@ static void dcn20_update_dchubp_dpp(
 		hubp->funcs->phantom_hubp_post_enable(hubp);
 }
 
+static int calculate_vready_offset_for_group(struct pipe_ctx *pipe)
+{
+	struct pipe_ctx *other_pipe;
+	int vready_offset = pipe->pipe_dlg_param.vready_offset;
+
+	/* Always use the largest vready_offset of all connected pipes */
+	for (other_pipe = pipe->bottom_pipe; other_pipe != NULL; other_pipe = other_pipe->bottom_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->top_pipe; other_pipe != NULL; other_pipe = other_pipe->top_pipe) {
+		if (other_pipe->pipe_dlg_param.vready_offset > vready_offset)
+			vready_offset = other_pipe->pipe_dlg_param.vready_offset;
+	}
+	for (other_pipe = pipe->next_odm_pipe; other_pipe != NULL; other_pipe = other_pipe->next_odm_pipe) {
+
Re: [PATCH] drm/amdgpu: remove redundant NULL check
Applied.

Thanks,
Luben

On 2022-11-23 02:44, zys.zlj...@gmail.com wrote:
> From: Yushan Zhou
>
> release_firmware() checks whether the firmware pointer is NULL.
> Remove the redundant NULL checks in psp_sw_fini().
>
> Signed-off-by: Yushan Zhou
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 36 +++--
>  1 file changed, 16 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index effa7df3ddbf..77b966ab5439 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -484,26 +484,22 @@ static int psp_sw_fini(void *handle)
>  	struct psp_gfx_cmd_resp *cmd = psp->cmd;
>  
>  	psp_memory_training_fini(psp);
> -	if (psp->sos_fw) {
> -		release_firmware(psp->sos_fw);
> -		psp->sos_fw = NULL;
> -	}
> -	if (psp->asd_fw) {
> -		release_firmware(psp->asd_fw);
> -		psp->asd_fw = NULL;
> -	}
> -	if (psp->ta_fw) {
> -		release_firmware(psp->ta_fw);
> -		psp->ta_fw = NULL;
> -	}
> -	if (psp->cap_fw) {
> -		release_firmware(psp->cap_fw);
> -		psp->cap_fw = NULL;
> -	}
> -	if (psp->toc_fw) {
> -		release_firmware(psp->toc_fw);
> -		psp->toc_fw = NULL;
> -	}
> +
> +	release_firmware(psp->sos_fw);
> +	psp->sos_fw = NULL;
> +
> +	release_firmware(psp->asd_fw);
> +	psp->asd_fw = NULL;
> +
> +	release_firmware(psp->ta_fw);
> +	psp->ta_fw = NULL;
> +
> +	release_firmware(psp->cap_fw);
> +	psp->cap_fw = NULL;
> +
> +	release_firmware(psp->toc_fw);
> +	psp->toc_fw = NULL;
> +
>  	if (adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 0) ||
>  	    adev->ip_versions[MP0_HWIP][0] == IP_VERSION(11, 0, 7))
>  		psp_sysfs_fini(adev);
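The rationale for the cleanup above is that release_firmware(), like kfree(), already tolerates a NULL pointer, so caller-side guards add nothing. A minimal user-space model of that convention, with names invented for the sketch (this is not the real firmware API):

```c
#include <stdlib.h>

/* Illustrative firmware handle; the release function itself tolerates
 * NULL, so callers need no guard. */
struct fw {
	char *data;
};

static int released;	/* count real releases, for demonstration */

static void release_fw(struct fw *fw)
{
	if (!fw)	/* the NULL check lives here, once */
		return;
	free(fw->data);
	free(fw);
	released++;
}
```

With the check centralized, every call site shrinks to `release_fw(p); p = NULL;`, which is exactly the shape the patch gives psp_sw_fini().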
RE: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
[AMD Official Use Only - General]

> -----Original Message-----
> From: Kuehling, Felix
> Sent: November 24, 2022 11:24 AM
> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
> Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
>
> Am 2022-11-24 um 09:51 schrieb Kim, Jonathan:
> > [Public]
> >
> >> -----Original Message-----
> >> From: Kuehling, Felix
> >> Sent: November 22, 2022 7:45 PM
> >> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
> >> Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
> >>
> >>
> >> On 2022-10-31 12:23, Jonathan Kim wrote:
> >>> From: Jay Cornwall
> >>>
> >>> Trap handler behavior will differ when a debugger is attached.
> >>>
> >>> Make the debug trap flag available in the trap handler TMA.
> >>> Update it when the debug trap ioctl is invoked.
> >>>
> >>> v3: Rebase for upstream
> >>>
> >>> v2: Add missing debug flag setup on APUs
> >>>
> >>> Signed-off-by: Jay Cornwall
> >>> Reviewed-by: Felix Kuehling
> >>> Signed-off-by: Jonathan Kim
> >>> ---
> >>>   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   |  4 ++++
> >>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  2 ++
> >>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 ++++++++++++++++
> >>>   3 files changed, 22 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> >>> index ae6e701a2656..d4f87f2adada 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> >>> @@ -193,6 +193,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind
> >>>   		if (unwind && count == unwind_count)
> >>>   			break;
> >>>
> >>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, false);
> >>> +
> >>>   		/* GFX off is already disabled by debug activate if not RLC restore supported. */
> >>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
> >>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
> >>> @@ -278,6 +280,8 @@ int kfd_dbg_trap_activate(struct kfd_process *target)
> >>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
> >>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, true);
> >>>
> >>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, true);
> >>> +
> >>>   		r = debug_refresh_runlist(pdd->dev->dqm);
> >>>   		if (r) {
> >>>   			target->runtime_info.runtime_state =
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >>> index 9690a2adb9ed..82b28588ab72 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >>> @@ -1101,6 +1101,8 @@ int kfd_init_apertures(struct kfd_process *process);
> >>>   void kfd_process_set_trap_handler(struct qcm_process_device *qpd,
> >>>   				  uint64_t tba_addr,
> >>>   				  uint64_t tma_addr);
> >>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
> >>> +				     bool enabled);
> >>>
> >>>   /* CWSR initialization */
> >>>   int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file *filep);
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> index 59c4c38833b6..d62e0c62df76 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> @@ -1252,6 +1252,8 @@ int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
> >>>   		memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
> >>>
> >>> + 		kfd_process_set_trap_debug_flag(qpd, p->debug_trap_enabled);
> >>> +
> >>>   		qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
> >>>   		pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
> >>>   			 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
> >>> @@ -1288,6 +1290,9 @@ static int kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd)
> >>>   	memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
> >>>
> >>> + 	kfd_process_set_trap_debug_flag(&pdd->qpd,
> >>> + 					pdd->process->debug_trap_enabled);
> >>> +
> >>>   	qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
> >>>   	pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
> >>>   		 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
> >>> @@ -1374,6 +1379,17 @@ bool kfd_process_xnack_mode(struct kfd_process *p, bool supported)
> >>>   	return true;
> >>>   }
> >>>
> >>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
> >>> +				     bool enabled)
> >>> +{
> >>> +	/* If TMA doesn't exist then f
[linux-next:master] BUILD REGRESSION c35bd4e428856ed8c1fae7f7dfa08a9141c153d1
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: c35bd4e428856ed8c1fae7f7dfa08a9141c153d1  Add linux-next specific files for 20221124

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202211090634.ryfkk0ws-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211241736.k6437e7j-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211242021.fdzrfna8-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211242120.mzzvguln-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211250227.grpjpxyn-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

arch/arm/mach-s3c/devs.c:32:10: fatal error: linux/platform_data/dma-s3c24xx.h: No such file or directory
arch/powerpc/kernel/kvm_emul.o: warning: objtool: kvm_template_end(): can't find starting instruction
arch/powerpc/kernel/optprobes_head.o: warning: objtool: optprobe_template_end(): can't find starting instruction
drivers/clk/clk.c:1022:5: error: redefinition of 'clk_prepare'
drivers/clk/clk.c:1268:6: error: redefinition of 'clk_is_enabled_when_prepared'
drivers/clk/clk.c:941:6: error: redefinition of 'clk_unprepare'
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4968: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:5075:24: warning: implicit conversion from 'enum ' to 'enum dc_status' [-Wenum-conversion]
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:40:20: warning: no previous prototype for 'to_dal_irq_source_dcn201' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous prototype for 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c:451:1: warning: no previous prototype for function 'gf100_fifo_nonstall_block' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous prototype for 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/fifo/runl.c:34:1: warning: no previous prototype for function 'nvkm_engn_cgrp_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous prototype for 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c:210:1: warning: no previous prototype for function 'tu102_gr_load' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype for 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/nvfw/acr.c:49:1: warning: no previous prototype for function 'wpr_generic_header_dump' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c:221:21: warning: variable 'loc' set but not used [-Wunused-but-set-variable]
drivers/iio/addac/ad74115.c:320:27: warning: 'ad74115_dac_slew_rate_hz_tbl' defined but not used [-Wunused-const-variable=]
mm/vmscan.c:4090:30: error: implicit declaration of function 'pmd_young'; did you mean 'pte_young'? [-Werror=implicit-function-declaration]
net/netfilter/nf_conntrack_netlink.c:2674:6: warning: unused variable 'mark' [-Wunused-variable]
vmlinux.o: warning: objtool: __btrfs_map_block+0x1d77: unreachable instruction

Unverified Error/Warning (likely false positive, please contact us if interested):

drivers/usb/fotg210/fotg210-udc.c:632:17: sparse: sparse: restricted __le16 degrades to integer

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-irq-dcn201-irq_service_dcn201.c:warning:no-previous-prototype-for-to_dal_irq_source_dcn201
|   |-- drivers-gpu-drm-nouveau-nvkm-engine-fifo-gf100.c:warning:no-previous-prototype-for-gf100_fifo_nonstall_block
|   |-- drivers-gpu-drm-nouveau-nvkm-engine-fifo-runl.c:warning:no-previous-prototype-for-nvkm_engn_cgrp_get
|   |-- drivers-gpu-drm-nouveau-nvkm-engine-gr-tu102.c:warning:no-previous-prototype-for-tu102_gr_load
|   |-- drivers-gpu-drm-nouveau-nvkm-nvfw-acr.c:warning:no-previous-prototype-for-wpr_generic_header_dump
|   `-- drivers-gpu-drm-nouveau-nvkm-subdev-acr-lsfw.c:warning:variable-loc-set-but-not-used
|-- alpha-randconfig-r005-20221124
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-
Re: [PATCH AUTOSEL 6.0 38/44] drm/amdgpu: Unlock bo_list_mutex after error handling
On Mon, Nov 21, 2022 at 12:07:40PM +0100, Christian König wrote:
> Am 21.11.22 um 10:57 schrieb Michel Dänzer:
> > On 11/19/22 03:11, Sasha Levin wrote:
> > > From: Philip Yang
> > >
> > > [ Upstream commit 64f65135c41a75f933d3bca236417ad8e9eb75de ]
> > >
> > > Get below kernel WARNING backtrace when pressing ctrl-C to kill the
> > > kfdtest application.
> > >
> > > If amdgpu_cs_parser_bos returns an error after taking bo_list_mutex,
> > > the caller amdgpu_cs_ioctl will not unlock bo_list_mutex, which
> > > generates the kernel WARNING. Add an unlock of bo_list_mutex after
> > > the amdgpu_cs_parser_bos error handling to clean up the bo_list
> > > userptr bo.
> > >
> > > WARNING: kfdtest/2930 still has locks held!
> > > 1 lock held by kfdtest/2930:
> > > (&list->bo_list_mutex){+.+.}-{3:3}, at: amdgpu_cs_ioctl+0xce5/0x1f10 [amdgpu]
> > > stack backtrace:
> > >   dump_stack_lvl+0x44/0x57
> > >   get_signal+0x79f/0xd00
> > >   arch_do_signal_or_restart+0x36/0x7b0
> > >   exit_to_user_mode_prepare+0xfd/0x1b0
> > >   syscall_exit_to_user_mode+0x19/0x40
> > >   do_syscall_64+0x40/0x80
> > >
> > > Signed-off-by: Philip Yang
> > > Reviewed-by: Christian König
> > > Signed-off-by: Alex Deucher
> > > Signed-off-by: Sasha Levin
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > index b7bae833c804..9d59f83c8faa 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > @@ -655,6 +655,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > >  		}
> > >  		mutex_unlock(&p->bo_list->bo_list_mutex);
> > >  	}
> > > +	mutex_unlock(&p->bo_list->bo_list_mutex);
> > >  	return r;
> > >  }
> >
> > This looks doubtful as a backport: there is an identical
> > mutex_unlock call just above the added one.
>
> Oh, yes, good point. This patch doesn't need to be backported at all,
> because it just fixes a problem introduced in the same cycle.

Dropping it, thanks!

-- 
Thanks,
Sasha
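The underlying bug being backported here is a lock leaked on an error return: the function could return with bo_list_mutex still held. A toy model of the fixed shape, with an invented lock counter standing in for the real mutex:

```c
#include <errno.h>

/* Toy lock to mirror the bug: the error path originally returned with
 * bo_list_mutex still held. Balancing lock/unlock on every return path
 * is the whole fix. */
static int lock_depth;

static void lock(void)   { lock_depth++; }
static void unlock(void) { lock_depth--; }

/* Sketch of the fixed shape: unlock before every return, including the
 * error path. @fail selects the error path for demonstration. */
static int parse_bos(int fail)
{
	int r = 0;

	lock();
	if (fail) {
		r = -EINVAL;
		goto out;	/* the error path must still unlock */
	}
	/* ... normal processing under the lock ... */
out:
	unlock();
	return r;
}
```

Lockdep catches the original bug exactly as the quoted WARNING shows: the task exits a syscall while `lock_depth` (in the toy) would still be nonzero.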
Re: [PATCH v2] drm/amdgpu: Fix potential double free and null pointer dereference
Applied.

Regards,
Luben

On 2022-11-22 19:10, Luben Tuikov wrote:
> amdgpu_xgmi_hive_type does provide a release method which frees the
> allocated "hive", so we don't need a kfree() after a kobject_put().
>
> Reviewed-by: Luben Tuikov
>
> Regards,
> Luben
>
> On 2022-11-21 23:28, Liang He wrote:
>> In amdgpu_get_xgmi_hive(), we should not call kfree() after
>> kobject_put() as the PUT will call kfree().
>>
>> In amdgpu_device_ip_init(), we need to check the returned *hive*
>> which can be NULL before we dereference it.
>>
>> Signed-off-by: Liang He
>> ---
>> v1->v2: we need the extra GET to keep *hive* alive; it was
>> my fault to remove the GET in v1.
>>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 2 --
>>  2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index f1e9663b4051..00976e15b698 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2462,6 +2462,11 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>>  	if (!amdgpu_sriov_vf(adev)) {
>>  		struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
>>
>> +		if (WARN_ON(!hive)) {
>> +			r = -ENOENT;
>> +			goto init_failed;
>> +		}
>> +
>>  		if (!hive->reset_domain ||
>>  		    !amdgpu_reset_get_reset_domain(hive->reset_domain)) {
>>  			r = -ENOENT;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> index 47159e9a0884..4b9e7b050ccd 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> @@ -386,7 +386,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>  	if (ret) {
>>  		dev_err(adev->dev, "XGMI: failed initializing kobject for xgmi hive\n");
>>  		kobject_put(&hive->kobj);
>> -		kfree(hive);
>>  		hive = NULL;
>>  		goto pro_end;
>>  	}
>> @@ -410,7 +409,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>  			dev_err(adev->dev, "XGMI: failed initializing reset domain for xgmi hive\n");
>>  			ret = -ENOMEM;
>>  			kobject_put(&hive->kobj);
>> -			kfree(hive);
>>  			hive = NULL;
>>  			goto pro_end;
>>  		}
>
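The reason the kfree() is redundant, and in fact harmful, is that the kobject's release method already frees the hive on the final put. A small user-space model of a refcounted object with a release callback (not the real kobject API; names invented for the sketch):

```c
#include <stdlib.h>

/* Minimal refcounted object with a release callback, modelling why a
 * kfree() after kobject_put() is a double free: the final put already
 * runs the release method, and the release method frees the object. */
struct obj {
	int refcount;
	void (*release)(struct obj *obj);
};

static int release_calls;

static void obj_release(struct obj *obj)
{
	release_calls++;
	free(obj);	/* the release method owns the final free */
}

static struct obj *obj_create(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;
	o->release = obj_release;
	return o;
}

static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		o->release(o);	/* after this, 'o' is gone: no free() allowed */
}
```

A call like `obj_put(o); free(o);` would free the same allocation twice, which is exactly the bug the patch removes from the two error paths in amdgpu_get_xgmi_hive().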
Re: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
Am 2022-11-24 um 09:58 schrieb Kim, Jonathan:
> [AMD Official Use Only - General]
>
>> -----Original Message-----
>> From: Kuehling, Felix
>> Sent: November 22, 2022 6:59 PM
>> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
>> Subject: Re: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
>>
>> On 2022-10-31 12:23, Jonathan Kim wrote:
>>> On GFX9.4.1, the implicit wait count instruction on s_barrier is
>>> disabled by default in the driver during normal operation for
>>> performance requirements.
>>>
>>> There is a hardware bug in GFX9.4.1 where, if the implicit wait count
>>> instruction after an s_barrier instruction is disabled, any wave that
>>> hits an exception may step over the s_barrier when returning from the
>>> trap handler, with the barrier logic having no ability to be aware of
>>> this, thereby causing other waves to wait at the barrier indefinitely,
>>> resulting in a shader hang. This bug has been corrected for GFX9.4.2
>>> and onward.
>>>
>>> Since the debugger subscribes to hardware exceptions, in order to
>>> avoid this bug, the debugger must enable implicit wait count on
>>> s_barrier for a debug session and disable it on detach.
>>>
>>> In order to change this setting in the device global SQ_CONFIG
>>> register, the GFX pipeline must be idle. GFX9.4.1 as a compute device
>>> will either dispatch work through the compute ring buffers used for
>>> image post processing or through the hardware scheduler by the KFD.
>>>
>>> Have the KGD suspend and drain the compute ring buffer, then suspend
>>> the hardware scheduler and block any future KFD process job requests
>>> before changing the implicit wait count setting. Once set, resume all
>>> work.
>>>
>>> Signed-off-by: Jonathan Kim
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   3 +
>>>  .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   | 105 +-
>>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c         |   4 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c      |   2 +-
>>>  4 files changed, 110 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 0e6ddf05c23c..9f2499f52d2c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -1034,6 +1034,9 @@ struct amdgpu_device {
>>>  	struct pci_saved_state          *pci_state;
>>>  	pci_channel_state_t             pci_channel_state;
>>>
>>> +	/* Track auto wait count on s_barrier settings */
>>> +	bool                            barrier_has_auto_waitcnt;
>>> +
>>>  	struct amdgpu_reset_control     *reset_cntl;
>>>  	uint32_t                        ip_versions[MAX_HWIP][HWIP_MAX_INSTANCE];
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
>>> index 4191af5a3f13..13f02a0aa828 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
>>> @@ -26,6 +26,7 @@
>>>  #include "amdgpu.h"
>>>  #include "amdgpu_amdkfd.h"
>>>  #include "amdgpu_amdkfd_arcturus.h"
>>> +#include "amdgpu_reset.h"
>>>  #include "sdma0/sdma0_4_2_2_offset.h"
>>>  #include "sdma0/sdma0_4_2_2_sh_mask.h"
>>>  #include "sdma1/sdma1_4_2_2_offset.h"
>>> @@ -48,6 +49,8 @@
>>>  #include "amdgpu_amdkfd_gfx_v9.h"
>>>  #include "gfxhub_v1_0.h"
>>>  #include "mmhub_v9_4.h"
>>> +#include "gc/gc_9_0_offset.h"
>>> +#include "gc/gc_9_0_sh_mask.h"
>>>
>>>  #define HQD_N_REGS 56
>>>  #define DUMP_REG(addr) do { \
>>> @@ -276,6 +279,104 @@ int kgd_arcturus_hqd_sdma_destroy(struct amdgpu_device *adev, void *mqd,
>>>  	return 0;
>>>  }
>>>
>>> +/*
>>> + * Helper used to suspend/resume gfx pipe for image post process work to set
>>> + * barrier behaviour.
>>> + */
>>> +static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool suspend)
>>> +{
>>> +	int i, r = 0;
>>> +
>>> +	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>>> +		struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
>>> +
>>> +		if (!(ring && ring->sched.thread))
>>> +			continue;
>>> +
>>> +		/* stop scheduler and drain ring. */
>>> +		if (suspend) {
>>> +			drm_sched_stop(&ring->sched, NULL);
>>> +			r = amdgpu_fence_wait_empty(ring);
>>> +			if (r)
>>> +				goto out;
>>> +		} else {
>>> +			drm_sched_start(&ring->sched, false);
>>> +		}
>>> +	}
>>> +
>>> +out:
>>> +	/* return on resume or failure to drain rings. */
>>> +	if (!suspend || r)
>>> +		return r;
>>> +
>>> +	return amdgpu_device_ip_wait_for_idle(adev, GC_HWIP);
>>> +}
>>> +
>>> +static void set_barrier_auto_waitcnt(struct amdgpu_device *adev, bool enable_waitcnt)
>>> +{
>>> +	uint32_t data;
>>> +
>>> +	WRITE_ONCE(adev->barrier_has_auto_waitcnt, enable_waitcnt);
>>> +
>>> +	if (!down_read_trylock(&adev->reset_domain->sem))
>>> +		return;
>>> +
>>> +	amdgpu_amdkfd_suspend(adev, false);
>>> +
>>> +	if (suspend_resume_compute_scheduler(adev, true))
>>> +		goto out;
>>> +
>>> +	data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CONFIG));
>>> +	data = REG_SET_FIELD(data, SQ_CONFIG, DISABLE_BARRIER_WAITCNT,
>>> +			     enable_waitcnt ? 0 : 1);
>>> +	WREG32(
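The SQ_CONFIG update in the patch is a standard read-modify-write of a single register field. A self-contained model of the REG_SET_FIELD pattern, with an invented mask/shift pair and a plain variable standing in for the MMIO register (the real offsets come from the generated gc_9_0 headers):

```c
#include <stdint.h>

/* Invented field position for the illustration; the real
 * DISABLE_BARRIER_WAITCNT mask/shift live in gc_9_0_sh_mask.h. */
#define WAITCNT_MASK  0x00000100u
#define WAITCNT_SHIFT 8

static uint32_t fake_sq_config;	/* stands in for the MMIO register */

static uint32_t rreg32(void)        { return fake_sq_config; }
static void     wreg32(uint32_t v)  { fake_sq_config = v; }

/* Read-modify-write one field, as REG_SET_FIELD + WREG32 do: enabling
 * the implicit waitcnt means clearing the DISABLE bit, and vice versa. */
static void set_barrier_waitcnt(int enable_waitcnt)
{
	uint32_t data = rreg32();

	data &= ~WAITCNT_MASK;	/* clear the field */
	data |= ((enable_waitcnt ? 0u : 1u) << WAITCNT_SHIFT) & WAITCNT_MASK;
	wreg32(data);		/* write the merged value back */
}
```

The point of the surrounding suspend/drain machinery in the patch is that this read-modify-write is only safe while nothing else is dispatching work that depends on the old setting.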
Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
Am 2022-11-24 um 09:51 schrieb Kim, Jonathan:
> [Public]
>
>> -----Original Message-----
>> From: Kuehling, Felix
>> Sent: November 22, 2022 7:45 PM
>> To: Kim, Jonathan ; amd-g...@lists.freedesktop.org
>> Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
>>
>> On 2022-10-31 12:23, Jonathan Kim wrote:
>>> From: Jay Cornwall
>>>
>>> Trap handler behavior will differ when a debugger is attached.
>>>
>>> Make the debug trap flag available in the trap handler TMA.
>>> Update it when the debug trap ioctl is invoked.
>>>
>>> v3: Rebase for upstream
>>>
>>> v2: Add missing debug flag setup on APUs
>>>
>>> Signed-off-by: Jay Cornwall
>>> Reviewed-by: Felix Kuehling
>>> Signed-off-by: Jonathan Kim
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   |  4 ++++
>>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  2 ++
>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 ++++++++++++++++
>>>   3 files changed, 22 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>>> index ae6e701a2656..d4f87f2adada 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
>>> @@ -193,6 +193,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind
>>>   		if (unwind && count == unwind_count)
>>>   			break;
>>>
>>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, false);
>>> +
>>>   		/* GFX off is already disabled by debug activate if not RLC restore supported. */
>>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
>>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
>>> @@ -278,6 +280,8 @@ int kfd_dbg_trap_activate(struct kfd_process *target)
>>>   		if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
>>>   			amdgpu_gfx_off_ctrl(pdd->dev->adev, true);
>>>
>>> + 		kfd_process_set_trap_debug_flag(&pdd->qpd, true);
>>> +
>>>   		r = debug_refresh_runlist(pdd->dev->dqm);
>>>   		if (r) {
>>>   			target->runtime_info.runtime_state =
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> index 9690a2adb9ed..82b28588ab72 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> @@ -1101,6 +1101,8 @@ int kfd_init_apertures(struct kfd_process *process);
>>>   void kfd_process_set_trap_handler(struct qcm_process_device *qpd,
>>>   				  uint64_t tba_addr,
>>>   				  uint64_t tma_addr);
>>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
>>> +				     bool enabled);
>>>
>>>   /* CWSR initialization */
>>>   int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file *filep);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index 59c4c38833b6..d62e0c62df76 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -1252,6 +1252,8 @@ int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
>>>   		memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
>>>
>>> + 		kfd_process_set_trap_debug_flag(qpd, p->debug_trap_enabled);
>>> +
>>>   		qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
>>>   		pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
>>>   			 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
>>> @@ -1288,6 +1290,9 @@ static int kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd)
>>>   	memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
>>>
>>> + 	kfd_process_set_trap_debug_flag(&pdd->qpd,
>>> + 					pdd->process->debug_trap_enabled);
>>> +
>>>   	qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
>>>   	pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
>>>   		 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
>>> @@ -1374,6 +1379,17 @@ bool kfd_process_xnack_mode(struct kfd_process *p, bool supported)
>>>   	return true;
>>>   }
>>>
>>> +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
>>> +				     bool enabled)
>>> +{
>>> +	/* If TMA doesn't exist then flag will be set during allocation. */
>>
>> I would expect a change to the TMA allocation function, but that isn't
>> in this patch?
>
> The TMA is allocated under kfd_process_init_cwsr_* and CWSR enabled is a
> pre-condition for the 1st level trap handler loading. The lack of
> context in the patch for those functions may be hiding that fact.
> Is the placement of this comment misleading? Maybe it should go in
> kfd_dbg_trap_activate when kfd_process_set_trap_debug_flag is called?
> Or should it just be removed, since the combined calls within
> initialization of CWSR + debug enable seem complete for enablement?

I think the comment is fine. I was sort of expecting to see the
corresponding change in the TMA allocation in the same patch. So my
question is just lack of context. If that change in the TMA allocation
got squashed into another patch in the series, maybe it would make se
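The mechanism under discussion is publishing a flag word at a known offset of the trap memory area (TMA) so the trap handler can read it. A hedged sketch, with an invented offset and a plain buffer standing in for the real mapped TMA:

```c
#include <stdint.h>
#include <string.h>

/* Invented layout for illustration; the real TMA offsets are defined by
 * the CWSR trap handler ABI, not by this sketch. */
#define TMA_DEBUG_FLAG_OFFSET 0x10

/* Publish the debug-trap flag into the TMA. If the TMA hasn't been
 * allocated yet, do nothing: as the patch comment says, the flag is
 * then set during allocation instead. */
static void set_trap_debug_flag(uint8_t *tma, int enabled)
{
	uint32_t flag = enabled ? 1u : 0u;

	if (!tma)
		return;
	memcpy(tma + TMA_DEBUG_FLAG_OFFSET, &flag, sizeof(flag));
}
```

This is why the patch needs both call sites: one in the debug attach/detach paths for processes whose TMA already exists, and one in the CWSR init paths for processes whose TMA is allocated after the debugger attached.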
RE: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
[AMD Official Use Only - General] > -Original Message- > From: Kuehling, Felix > Sent: November 22, 2022 6:59 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 07/29] drm/amdgpu: add gfx9.4.1 hw debug mode > enable and disable calls > > > On 2022-10-31 12:23, Jonathan Kim wrote: > > On GFX9.4.1, the implicit wait count instruction on s_barrier is > > disabled by default in the driver during normal operation for > > performance requirements. > > > > There is a hardware bug in GFX9.4.1 where if the implicit wait count > > instruction after an s_barrier instruction is disabled, any wave that > > hits an exception may step over the s_barrier when returning from the > > trap handler with the barrier logic having no ability to be > > aware of this, thereby causing other waves to wait at the barrier > > indefinitely resulting in a shader hang. This bug has been corrected > > for GFX9.4.2 and onward. > > > > Since the debugger subscribes to hardware exceptions, in order to avoid > > this bug, the debugger must enable implicit wait count on s_barrier > > for a debug session and disable it on detach. > > > > In order to change this setting in the in the device global SQ_CONFIG > > register, the GFX pipeline must be idle. GFX9.4.1 as a compute device > > will either dispatch work through the compute ring buffers used for > > image post processing or through the hardware scheduler by the KFD. > > > > Have the KGD suspend and drain the compute ring buffer, then suspend > the > > hardware scheduler and block any future KFD process job requests before > > changing the implicit wait count setting. Once set, resume all work. 
> > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 + > > .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 105 > +- > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +- > > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +- > > 4 files changed, 110 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > index 0e6ddf05c23c..9f2499f52d2c 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > @@ -1034,6 +1034,9 @@ struct amdgpu_device { > > struct pci_saved_state *pci_state; > > pci_channel_state_t pci_channel_state; > > > > + /* Track auto wait count on s_barrier settings */ > > + boolbarrier_has_auto_waitcnt; > > + > > struct amdgpu_reset_control *reset_cntl; > > uint32_t > ip_versions[MAX_HWIP][HWIP_MAX_INSTANCE]; > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > index 4191af5a3f13..13f02a0aa828 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > @@ -26,6 +26,7 @@ > > #include "amdgpu.h" > > #include "amdgpu_amdkfd.h" > > #include "amdgpu_amdkfd_arcturus.h" > > +#include "amdgpu_reset.h" > > #include "sdma0/sdma0_4_2_2_offset.h" > > #include "sdma0/sdma0_4_2_2_sh_mask.h" > > #include "sdma1/sdma1_4_2_2_offset.h" > > @@ -48,6 +49,8 @@ > > #include "amdgpu_amdkfd_gfx_v9.h" > > #include "gfxhub_v1_0.h" > > #include "mmhub_v9_4.h" > > +#include "gc/gc_9_0_offset.h" > > +#include "gc/gc_9_0_sh_mask.h" > > > > #define HQD_N_REGS 56 > > #define DUMP_REG(addr) do { \ > > @@ -276,6 +279,104 @@ int kgd_arcturus_hqd_sdma_destroy(struct > amdgpu_device *adev, void *mqd, > > return 0; > > } > > > > +/* > > + * Helper used to suspend/resume gfx pipe for image post process work > to set > > + * barrier behaviour. 
> > + */ > > +static int suspend_resume_compute_scheduler(struct amdgpu_device > *adev, bool suspend) > > +{ > > + int i, r = 0; > > + > > + for (i = 0; i < adev->gfx.num_compute_rings; i++) { > > + struct amdgpu_ring *ring = &adev->gfx.compute_ring[i]; > > + > > + if (!(ring && ring->sched.thread)) > > + continue; > > + > > + /* stop scheduler and drain ring. */ > > + if (suspend) { > > + drm_sched_stop(&ring->sched, NULL); > > + r = amdgpu_fence_wait_empty(ring); > > + if (r) > > + goto out; > > + } else { > > + drm_sched_start(&ring->sched, false); > > + } > > + } > > + > > +out: > > + /* return on resume or failure to drain rings. */ > > + if (!suspend || r) > > + return r; > > + > > + return amdgpu_device_ip_wait_for_idle(adev, GC_HWIP); > > +} > > + > > +static void set_barrier_auto_waitcnt(struct amdgpu_device *adev, bool > enable_waitcnt) > > +{ > > + uint32_t data; > > + > > + WRITE_ONCE(adev->barrier_has_auto_waitcnt, enable_waitcnt); > > + > > + if (!down_rea
Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm
On 11/24/2022 1:17 PM, Christian König wrote: Am 24.11.22 um 08:45 schrieb Wang, Yang(Kevin): [AMD Official Use Only - General] -Original Message- From: Koenig, Christian Sent: Thursday, November 24, 2022 3:25 PM To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org; Paneer Selvam, Arunpravin Cc: Zhang, Hawking ; Deucher, Alexander Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm Am 24.11.22 um 06:49 schrieb Yang Wang: add vram usage information into dri debugfs amdgpu_vram_mm node. Background: since the amdgpu driver introduced the drm buddy allocator, it is difficult for the kernel driver (and developers) to track vram usage information. Field: 0x-0x: vram usage range. type: kernel, device, sg usage: normal, vm, user domain: C-CPU, G-GTT, V-VRAM, P-PRIV @x: the address of "amdgpu_bo" object in kernel space. 4096: vram range size. Example: 0x0003fea68000-0x0003fea68fff: (type:kernel usage:vm domain:--V- --V-) @1d33dfee 4096 bytes 0x0003fea69000-0x0003fea69fff: (type:kernel usage:vm domain:--V- --V-) @a79155b5 4096 bytes 0x0003fea6b000-0x0003fea6bfff: (type:kernel usage:vm domain:--V- --V-) @38ad633b 4096 bytes 0x0003fea6c000-0x0003fea6cfff: (type:device usage:user domain:--V- --V-) @e302f90b 4096 bytes 0x0003fea6d000-0x0003fea6dfff: (type:device usage:user domain:--V- --V-) @e664c172 4096 bytes 0x0003fea6e000-0x0003fea6efff: (type:kernel usage:vm domain:--V- --V-) @4528cb2f 4096 bytes 0x0003fea6f000-0x0003fea6: (type:kernel usage:vm domain:--V- --V-) @a446bdbf 4096 bytes 0x0003fea7-0x0003fea7: (type:device usage:user domain:--V- --V-) @78fae42f 65536 bytes 0x0003fead8000-0x0003feadbfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 16384 bytes 0x0003feadc000-0x0003feadcfff: (type:kernel usage:normal domain:--V- --V-) @1327b7ff 4096 bytes 0x0003feadd000-0x0003feaddfff: (type:kernel usage:normal domain:--V- --V-) @b9706fc1 4096 bytes 0x0003feade000-0x0003feadefff: (type:kernel usage:vm domain:--V- --V-) @71a25571 4096 bytes Note: 
Although some vram ranges could be merged in the example above, this reflects the actual distribution of the drm buddy allocator. Well completely NAK. This is way too much complexity for simple debugging. Question is what are your requirements here? E.g. what information do you want and why doesn't the buddy allocator already expose this? Regards, Christian. [Kevin]: For KMD debug. The DRM buddy interface doesn't provide an interface to query which ranges of ram (resource) are used. It is not easy to debug on the KMD side if the driver fails to create a BO at a specific location. And from the view of KMD, the VRAM at some locations has special purposes. With this patch, we can know which ranges of vram are actually used. Well that's not a good reason to add this complexity. Debugging doesn't justify that. Please work with Arun to add the necessary information to the buddy allocator interface. Regards, Christian. Hi Kevin, I will check and list down some of the necessary information that we can add to the buddy allocator interface. Regards, Arun. 
Best Regards, Kevin Signed-off-by: Yang Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 130 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 1 + 4 files changed, 136 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 90eb07106609..117c754409b3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -53,7 +53,7 @@ * */ -static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo) +void amdgpu_bo_destroy(struct ttm_buffer_object *tbo) { struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); @@ -66,7 +66,7 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo) kvfree(bo); } -static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo) +void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo) { struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); struct amdgpu_bo_user *ubo; @@ -76,7 +76,7 @@ static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo) amdgpu_bo_destroy(tbo); } -static void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo) +void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo) { struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev); struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h index 147b79c10c
RE: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to TMA
[Public] > -Original Message- > From: Kuehling, Felix > Sent: November 22, 2022 7:45 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org > Subject: Re: [PATCH 17/29] drm/amdkfd: Add debug trap enabled flag to > TMA > > > On 2022-10-31 12:23, Jonathan Kim wrote: > > From: Jay Cornwall > > > > Trap handler behavior will differ when a debugger is attached. > > > > Make the debug trap flag available in the trap handler TMA. > > Update it when the debug trap ioctl is invoked. > > > > v3: Rebase for upstream > > > > v2: > > Add missing debug flag setup on APUs > > > > Signed-off-by: Jay Cornwall > > Reviewed-by: Felix Kuehling > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 4 > > drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++ > > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 > > 3 files changed, 22 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > index ae6e701a2656..d4f87f2adada 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > @@ -193,6 +193,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process > *target, bool unwind, int unwind > > if (unwind && count == unwind_count) > > break; > > > > + kfd_process_set_trap_debug_flag(&pdd->qpd, false); > > + > > /* GFX off is already disabled by debug activate if not RLC > restore supported. 
*/ > > if (kfd_dbg_is_rlc_restore_supported(pdd->dev)) > > amdgpu_gfx_off_ctrl(pdd->dev->adev, false); > > @@ -278,6 +280,8 @@ int kfd_dbg_trap_activate(struct kfd_process > *target) > > if (kfd_dbg_is_rlc_restore_supported(pdd->dev)) > > amdgpu_gfx_off_ctrl(pdd->dev->adev, true); > > > > + kfd_process_set_trap_debug_flag(&pdd->qpd, true); > > + > > r = debug_refresh_runlist(pdd->dev->dqm); > > if (r) { > > target->runtime_info.runtime_state = > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > > index 9690a2adb9ed..82b28588ab72 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > > @@ -1101,6 +1101,8 @@ int kfd_init_apertures(struct kfd_process > *process); > > void kfd_process_set_trap_handler(struct qcm_process_device *qpd, > > uint64_t tba_addr, > > uint64_t tma_addr); > > +void kfd_process_set_trap_debug_flag(struct qcm_process_device > *qpd, > > +bool enabled); > > > > /* CWSR initialization */ > > int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file > *filep); > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > index 59c4c38833b6..d62e0c62df76 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > @@ -1252,6 +1252,8 @@ int kfd_process_init_cwsr_apu(struct > kfd_process *p, struct file *filep) > > > > memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev- > >cwsr_isa_size); > > > > + kfd_process_set_trap_debug_flag(qpd, p- > >debug_trap_enabled); > > + > > qpd->tma_addr = qpd->tba_addr + > KFD_CWSR_TMA_OFFSET; > > pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for > pqm.\n", > > qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr); > > @@ -1288,6 +1290,9 @@ static int > kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd) > > > > memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size); > > > > + 
kfd_process_set_trap_debug_flag(&pdd->qpd, > > + pdd->process- > >debug_trap_enabled); > > + > > qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET; > > pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n", > > qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr); > > @@ -1374,6 +1379,17 @@ bool kfd_process_xnack_mode(struct > kfd_process *p, bool supported) > > return true; > > } > > > > +void kfd_process_set_trap_debug_flag(struct qcm_process_device > *qpd, > > +bool enabled) > > +{ > > + /* If TMA doesn't exist then flag will be set during allocation. */ > > I would expect a change to the TMA allocation function, but that isn't > in this patch? The TMA is allocated under kfd_process_init_cwsr_* and CWSR enabled is a pre-condition for the 1st level trap handler loading. The lack of context in the patch for those functions may be hiding that fact. Is the placement of this comment misleading? Maybe it should go in kfd_dbg_trap_activate when kfd_process_set_trap_debug_flag is called? Or should it just be removed since the combined calls within initialization of CWSR + debug enable seem complete for en
Re: [PATCH] drm/amdgpu: add mb for si
On 11/24/2022 4:11 PM, Lazar, Lijo wrote: On 11/24/2022 3:34 PM, Quan, Evan wrote: [AMD Official Use Only - General] Could the attached patch help? Evan -Original Message- From: amd-gfx On Behalf Of ??? Sent: Friday, November 18, 2022 5:25 PM To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add mb for si 在 2022/11/18 17:18, Michel Dänzer 写道: On 11/18/22 09:01, Christian König wrote: Am 18.11.22 um 08:48 schrieb Zhenneng Li: During reboot test on arm64 platform, it may fail on boot, so add this mb in smc. The error messages are as follows: [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block failed -22 [ 7.006919][ 7] [ T295] amdgpu :04:00.0: The issue is happening in late_init() which eventually does ret = si_thermal_enable_alert(adev, false); Just before this, si_thermal_start_thermal_controller is called in hw_init and that enables thermal alert. Maybe the issue is with enable/disable of thermal alerts in quick succession. Adding a delay inside si_thermal_start_thermal_controller might help. On a second look, temperature range is already set as part of si_thermal_start_thermal_controller in hw_init https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L6780 There is no need to set it again here - https://elixir.bootlin.com/linux/v6.1-rc6/source/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c#L7635 I think it is safe to remove the call from late_init altogether. Alex/Evan? Thanks, Lijo Thanks, Lijo amdgpu_device_ip_late_init failed [ 7.014224][ 7] [ T295] amdgpu :04:00.0: Fatal error during GPU init Memory barriers are not supposed to be sprinkled around like this, you need to give a detailed explanation why this is necessary. Regards, Christian. 
Signed-off-by: Zhenneng Li --- drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c index 8f994ffa9cd1..c7656f22278d 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev) u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); + mb(); + if (!(rst & RST_REG) && !(clk & CK_DISABLE)) return true; In particular, it makes no sense in this specific place, since it cannot directly affect the values of rst & clk. I think so too. But when I run reboot tests using nine desktop machines, this error may be reported on one or two machines after hundreds or thousands of reboots. At first I used msleep() instead of mb(); both methods work, but I don't know what the root cause is. I used this method on another vendor's oland card, where this error message was reported again. What could be the root reason? Test environment: graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 driver: amdgpu os: ubuntu 2004 platform: arm64 kernel: 5.4.18
Re: [PATCH] drm/amdgpu: add mb for si
On 11/24/2022 3:34 PM, Quan, Evan wrote: [AMD Official Use Only - General] Could the attached patch help? Evan -Original Message- From: amd-gfx On Behalf Of ??? Sent: Friday, November 18, 2022 5:25 PM To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add mb for si 在 2022/11/18 17:18, Michel Dänzer 写道: On 11/18/22 09:01, Christian König wrote: Am 18.11.22 um 08:48 schrieb Zhenneng Li: During reboot test on arm64 platform, it may fail on boot, so add this mb in smc. The error messages are as follows: [6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block failed -22 [ 7.006919][ 7] [ T295] amdgpu :04:00.0: The issue is happening in late_init() which eventually does ret = si_thermal_enable_alert(adev, false); Just before this, si_thermal_start_thermal_controller is called in hw_init and that enables thermal alert. Maybe the issue is with enable/disable of thermal alerts in quick succession. Adding a delay inside si_thermal_start_thermal_controller might help. Thanks, Lijo amdgpu_device_ip_late_init failed [7.014224][ 7] [ T295] amdgpu :04:00.0: Fatal error during GPU init Memory barriers are not supposed to be sprinkled around like this, you need to give a detailed explanation why this is necessary. Regards, Christian. 
Signed-off-by: Zhenneng Li --- drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c index 8f994ffa9cd1..c7656f22278d 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev) u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); +mb(); + if (!(rst & RST_REG) && !(clk & CK_DISABLE)) return true; In particular, it makes no sense in this specific place, since it cannot directly affect the values of rst & clk. I think so too. But when I run reboot tests using nine desktop machines, this error may be reported on one or two machines after hundreds or thousands of reboots. At first I used msleep() instead of mb(); both methods work, but I don't know what the root cause is. I used this method on another vendor's oland card, where this error message was reported again. What could be the root reason? Test environment: graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 driver: amdgpu os: ubuntu 2004 platform: arm64 kernel: 5.4.18
Re: [PATCH 1/3] drm/ttm: remove ttm_bo_(un)lock_delayed_workqueue
On Thu, 24 Nov 2022 at 10:03, Christian König wrote: > > Those functions never worked correctly since it is still perfectly > possible that a buffer object is released and the background worker > restarted even after calling them. > > Signed-off-by: Christian König I know you usually do, but just a friendly reminder to Cc: intel-gfx on the next revision or before merging, just so our CI can give the series a quick test. Thanks.
RE: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
[AMD Official Use Only - General] Reviewed-by: Evan Quan > -Original Message- > From: amd-gfx On Behalf Of Yu > Songping > Sent: Thursday, November 24, 2022 9:53 AM > To: airl...@gmail.com; dan...@ffwll.ch > Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; linux- > ker...@vger.kernel.org > Subject: [PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition > > The logical operator '&&' will make > smu->ppt_funcs->set_gfx_power_up_by_imu segment fault when > smu->ppt_funcs is NULL. > > Signed-off-by: Yu Songping > --- > drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > index b880f4d7d67e..1cb728b0b7ee 100644 > --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > @@ -161,7 +161,7 @@ int smu_get_dpm_freq_range(struct smu_context > *smu, > > int smu_set_gfx_power_up_by_imu(struct smu_context *smu) { > - if (!smu->ppt_funcs && !smu->ppt_funcs- > >set_gfx_power_up_by_imu) > + if (!smu->ppt_funcs || !smu->ppt_funcs- > >set_gfx_power_up_by_imu) > return -EOPNOTSUPP; > > return smu->ppt_funcs->set_gfx_power_up_by_imu(smu); > -- > 2.17.1 <>
Re: [PATCH] drm/amdgpu: add mb for si
That's not a patch but some binary file? Christian. Am 24.11.22 um 11:04 schrieb Quan, Evan: [AMD Official Use Only - General] Could the attached patch help? Evan -Original Message- From: amd-gfx On Behalf Of ??? Sent: Friday, November 18, 2022 5:25 PM To: Michel Dänzer ; Koenig, Christian ; Deucher, Alexander Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: add mb for si 在 2022/11/18 17:18, Michel Dänzer 写道: On 11/18/22 09:01, Christian König wrote: Am 18.11.22 um 08:48 schrieb Zhenneng Li: During reboot test on arm64 platform, it may fail on boot, so add this mb in smc. The error messages are as follows: [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block failed -22 [ 7.006919][ 7] [ T295] amdgpu :04:00.0: amdgpu_device_ip_late_init failed [ 7.014224][ 7] [ T295] amdgpu :04:00.0: Fatal error during GPU init Memory barriers are not supposed to be sprinkled around like this, you need to give a detailed explanation why this is necessary. Regards, Christian. Signed-off-by: Zhenneng Li --- drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c index 8f994ffa9cd1..c7656f22278d 100644 --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev) u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); + mb(); + if (!(rst & RST_REG) && !(clk & CK_DISABLE)) return true; In particular, it makes no sense in this specific place, since it cannot directly affect the values of rst & clk. I think so too. 
But when I run reboot tests using nine desktop machines, this error may be reported on one or two machines after hundreds or thousands of reboots. At first I used msleep() instead of mb(); both methods work, but I don't know what the root cause is. I used this method on another vendor's oland card, where this error message was reported again. What could be the root reason? Test environment: graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 driver: amdgpu os: ubuntu 2004 platform: arm64 kernel: 5.4.18
RE: [PATCH] drm/amdgpu: add mb for si
[AMD Official Use Only - General] Could the attached patch help? Evan > -Original Message- > From: amd-gfx On Behalf Of ??? > Sent: Friday, November 18, 2022 5:25 PM > To: Michel Dänzer ; Koenig, Christian > ; Deucher, Alexander > > Cc: amd-gfx@lists.freedesktop.org; Pan, Xinhui ; > linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org > Subject: Re: [PATCH] drm/amdgpu: add mb for si > > > 在 2022/11/18 17:18, Michel Dänzer 写道: > > On 11/18/22 09:01, Christian König wrote: > >> Am 18.11.22 um 08:48 schrieb Zhenneng Li: > >>> During reboot test on arm64 platform, it may fail on boot, so add > >>> this mb in smc. > >>> > >>> The error messages are as follows: > >>> [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init > >>> [amdgpu]] *ERROR* > >>> late_init of IP block failed -22 [ > >>> 7.006919][ 7] [ T295] amdgpu :04:00.0: > >>> amdgpu_device_ip_late_init failed [ 7.014224][ 7] [ T295] amdgpu > >>> :04:00.0: Fatal error during GPU init > >> Memory barriers are not supposed to be sprinkled around like this, you > need to give a detailed explanation why this is necessary. > >> > >> Regards, > >> Christian. > >>> Signed-off-by: Zhenneng Li > >>> --- > >>> drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++ > >>> 1 file changed, 2 insertions(+) > >>> > >>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> index 8f994ffa9cd1..c7656f22278d 100644 > >>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c > >>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct > >>> amdgpu_device *adev) > >>> u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL); > >>> u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0); > >>> + mb(); > >>> + > >>> if (!(rst & RST_REG) && !(clk & CK_DISABLE)) > >>> return true; > > In particular, it makes no sense in this specific place, since it cannot > > directly > affect the values of rst & clk. > > I think so too. 
> > But when I run reboot tests using nine desktop machines, this error may be > reported on one or two machines after hundreds or thousands of > reboots. At first I used msleep() instead of mb(); both > methods work, but I don't know what the root cause is. > > I used this method on another vendor's oland card, where this error message > was reported again. > > What could be the root reason? > > Test environment: > > graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87 > > driver: amdgpu > > os: ubuntu 2004 > > platform: arm64 > > kernel: 5.4.18 > > > <>
[PATCH 2/3] drm/ttm: use per BO cleanup workers
Instead of a single worker going over the list of delete BOs in regular intervals use a per BO worker which blocks for the resv object and locking of the BO. This not only simplifies the handling massively, but also results in much better response time when cleaning up buffers. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/i915/i915_gem.c| 2 +- drivers/gpu/drm/i915/intel_region_ttm.c| 2 +- drivers/gpu/drm/ttm/ttm_bo.c | 112 - drivers/gpu/drm/ttm/ttm_bo_util.c | 1 - drivers/gpu/drm/ttm/ttm_device.c | 24 ++--- include/drm/ttm/ttm_bo_api.h | 18 +--- include/drm/ttm/ttm_device.h | 7 +- 8 files changed, 57 insertions(+), 111 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ff2ae0be2c28..05f458d67eb8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3966,7 +3966,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) amdgpu_fence_driver_hw_fini(adev); if (adev->mman.initialized) - flush_delayed_work(&adev->mman.bdev.wq); + drain_workqueue(adev->mman.bdev.wq); if (adev->pm_sysfs_en) amdgpu_pm_sysfs_fini(adev); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 299f94a9fb87..413c9dbe5be1 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1099,7 +1099,7 @@ void i915_gem_drain_freed_objects(struct drm_i915_private *i915) { while (atomic_read(&i915->mm.free_count)) { flush_work(&i915->mm.free_work); - flush_delayed_work(&i915->bdev.wq); + drain_workqueue(i915->bdev.wq); rcu_barrier(); } } diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c b/drivers/gpu/drm/i915/intel_region_ttm.c index cf89d0c2a2d9..657bbc16a48a 100644 --- a/drivers/gpu/drm/i915/intel_region_ttm.c +++ b/drivers/gpu/drm/i915/intel_region_ttm.c @@ -132,7 +132,7 @@ int intel_region_ttm_fini(struct intel_memory_region *mem) break; msleep(20); - 
flush_delayed_work(&mem->i915->bdev.wq); + drain_workqueue(mem->i915->bdev.wq); } /* If we leaked objects, Don't free the region causing use after free */ diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index b77262a623e0..4749b65bedc4 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -280,14 +280,13 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, ret = 0; } - if (ret || unlikely(list_empty(&bo->ddestroy))) { + if (ret) { if (unlock_resv) dma_resv_unlock(bo->base.resv); spin_unlock(&bo->bdev->lru_lock); return ret; } - list_del_init(&bo->ddestroy); spin_unlock(&bo->bdev->lru_lock); ttm_bo_cleanup_memtype_use(bo); @@ -300,47 +299,21 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, } /* - * Traverse the delayed list, and call ttm_bo_cleanup_refs on all - * encountered buffers. + * Block for the dma_resv object to become idle, lock the buffer and clean up + * the resource and tt object. */ -bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all) +static void ttm_bo_delayed_delete(struct work_struct *work) { - struct list_head removed; - bool empty; - - INIT_LIST_HEAD(&removed); - - spin_lock(&bdev->lru_lock); - while (!list_empty(&bdev->ddestroy)) { - struct ttm_buffer_object *bo; - - bo = list_first_entry(&bdev->ddestroy, struct ttm_buffer_object, - ddestroy); - list_move_tail(&bo->ddestroy, &removed); - if (!ttm_bo_get_unless_zero(bo)) - continue; - - if (remove_all || bo->base.resv != &bo->base._resv) { - spin_unlock(&bdev->lru_lock); - dma_resv_lock(bo->base.resv, NULL); - - spin_lock(&bdev->lru_lock); - ttm_bo_cleanup_refs(bo, false, !remove_all, true); - - } else if (dma_resv_trylock(bo->base.resv)) { - ttm_bo_cleanup_refs(bo, false, !remove_all, true); - } else { - spin_unlock(&bdev->lru_lock); - } + struct ttm_buffer_object *bo; - ttm_bo_put(bo); - spin_lock(&bdev->lru_lock); - } - list_splice_tail(&removed, &bdev->ddestroy); - empty = 
list_empty(&bdev->ddestroy); - spin_unlock(&bdev->lru_lock); + bo = container_of(work, typeof(*bo), delayed_delete); -
[PATCH 3/3] drm/amdgpu: generally allow over-commit during BO allocation
We already fallback to a dummy BO with no backing store when we allocate GDS,GWS and OA resources and to GTT when we allocate VRAM. Drop all those workarounds and generalize this for GTT as well. This fixes ENOMEM issues with runaway applications which try to allocate/free GTT in a loop and are otherwise only limited by the CPU speed. The CS will wait for the cleanup of freed up BOs to satisfy the various domain specific limits and so effectively throttle those buggy applications down to a sane allocation behavior again. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 16 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +- 2 files changed, 4 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 8ef31d687ef3..286509e5e17e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -112,7 +112,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size, bp.resv = resv; bp.preferred_domain = initial_domain; bp.flags = flags; - bp.domain = initial_domain; + bp.domain = initial_domain | AMDGPU_GEM_DOMAIN_CPU; bp.bo_ptr_size = sizeof(struct amdgpu_bo); r = amdgpu_bo_create_user(adev, &bp, &ubo); @@ -331,20 +331,10 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, } initial_domain = (u32)(0x & args->in.domains); -retry: r = amdgpu_gem_object_create(adev, size, args->in.alignment, -initial_domain, -flags, ttm_bo_type_device, resv, &gobj); +initial_domain, flags, ttm_bo_type_device, +resv, &gobj); if (r && r != -ERESTARTSYS) { - if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { - flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; - goto retry; - } - - if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) { - initial_domain |= AMDGPU_GEM_DOMAIN_GTT; - goto retry; - } DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", size, initial_domain, args->in.alignment, r); } diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 974e85d8b6cc..919bbea2e3ac 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -581,11 +581,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev, bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE; bo->tbo.bdev = &adev->mman.bdev; - if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA | - AMDGPU_GEM_DOMAIN_GDS)) - amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU); - else - amdgpu_bo_placement_from_domain(bo, bp->domain); + amdgpu_bo_placement_from_domain(bo, bp->domain); if (bp->type == ttm_bo_type_kernel) bo->tbo.priority = 1; -- 2.34.1
[PATCH 1/3] drm/ttm: remove ttm_bo_(un)lock_delayed_workqueue
Those functions never worked correctly since it is still perfectly possible that a buffer object is released and the background worker restarted even after calling them. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +--- drivers/gpu/drm/radeon/radeon_device.c | 5 - drivers/gpu/drm/radeon/radeon_pm.c | 4 +--- drivers/gpu/drm/ttm/ttm_bo.c| 14 -- include/drm/ttm/ttm_bo_api.h| 16 6 files changed, 3 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index de61a85c4b02..8c57a5b7ecba 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -1717,7 +1717,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring) static int amdgpu_debugfs_ib_preempt(void *data, u64 val) { - int r, resched, length; + int r, length; struct amdgpu_ring *ring; struct dma_fence **fences = NULL; struct amdgpu_device *adev = (struct amdgpu_device *)data; @@ -1747,8 +1747,6 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) /* stop the scheduler */ kthread_park(ring->sched.thread); - resched = ttm_bo_lock_delayed_workqueue(&adev->mman.bdev); - /* preempt the IB */ r = amdgpu_ring_preempt_ib(ring); if (r) { @@ -1785,8 +1783,6 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) up_read(&adev->reset_domain->sem); - ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, resched); - pro_end: kfree(fences); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index cc65df9f2419..ff2ae0be2c28 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3965,10 +3965,8 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) } amdgpu_fence_driver_hw_fini(adev); - if (adev->mman.initialized) { + if (adev->mman.initialized) 
flush_delayed_work(&adev->mman.bdev.wq); - ttm_bo_lock_delayed_workqueue(&adev->mman.bdev); - } if (adev->pm_sysfs_en) amdgpu_pm_sysfs_fini(adev); diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index a556b6be1137..ea10306809cf 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1768,7 +1768,6 @@ int radeon_gpu_reset(struct radeon_device *rdev) bool saved = false; int i, r; - int resched; down_write(&rdev->exclusive_lock); @@ -1780,8 +1779,6 @@ int radeon_gpu_reset(struct radeon_device *rdev) atomic_inc(&rdev->gpu_reset_counter); radeon_save_bios_scratch_regs(rdev); - /* block TTM */ - resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev); radeon_suspend(rdev); radeon_hpd_fini(rdev); @@ -1840,8 +1837,6 @@ int radeon_gpu_reset(struct radeon_device *rdev) /* reset hpd state */ radeon_hpd_init(rdev); - ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched); - rdev->in_reset = true; rdev->needs_reset = false; diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c index 04c693ca419a..cbc554928bcc 100644 --- a/drivers/gpu/drm/radeon/radeon_pm.c +++ b/drivers/gpu/drm/radeon/radeon_pm.c @@ -1853,11 +1853,10 @@ static bool radeon_pm_debug_check_in_vbl(struct radeon_device *rdev, bool finish static void radeon_dynpm_idle_work_handler(struct work_struct *work) { struct radeon_device *rdev; - int resched; + rdev = container_of(work, struct radeon_device, pm.dynpm_idle_work.work); - resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev); mutex_lock(&rdev->pm.mutex); if (rdev->pm.dynpm_state == DYNPM_STATE_ACTIVE) { int not_processed = 0; @@ -1908,7 +1907,6 @@ static void radeon_dynpm_idle_work_handler(struct work_struct *work) msecs_to_jiffies(RADEON_IDLE_LOOP_MS)); } mutex_unlock(&rdev->pm.mutex); - ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched); } /* diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 
c3f4b33136e5..b77262a623e0 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -418,20 +418,6 @@ void ttm_bo_put(struct ttm_buffer_object *bo) } EXPORT_SYMBOL(ttm_bo_put); -int ttm_bo_lock_delayed_workqueue(struct ttm_device *bdev) -{ - return cancel_delayed_work_sync(&bdev->wq); -} -EXPORT_SYMBOL(ttm_bo_lock_delayed_workqueue); - -void ttm_bo_unlock_delayed_workqueue(struct ttm_device *bdev, int resc
[PATCH] swsmu/amdgpu_smu: Fix the wrong if-condition
The logical operator '&&' will make smu->ppt_funcs->set_gfx_power_up_by_imu segment fault when smu->ppt_funcs is NULL. Signed-off-by: Yu Songping --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index b880f4d7d67e..1cb728b0b7ee 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -161,7 +161,7 @@ int smu_get_dpm_freq_range(struct smu_context *smu, int smu_set_gfx_power_up_by_imu(struct smu_context *smu) { - if (!smu->ppt_funcs && !smu->ppt_funcs->set_gfx_power_up_by_imu) + if (!smu->ppt_funcs || !smu->ppt_funcs->set_gfx_power_up_by_imu) return -EOPNOTSUPP; return smu->ppt_funcs->set_gfx_power_up_by_imu(smu); -- 2.17.1