Re: [PATCH 05/18] drm/amd/display: Use mdelay to avoid crashes

2022-12-16 Thread Harry Wentland



On 12/15/22 16:02, Alex Hung wrote:
> 
> 
> On 2022-12-15 08:17, Harry Wentland wrote:
>>
>>
>> On 12/15/22 05:29, Michel Dänzer wrote:
>>> On 12/15/22 09:09, Christian König wrote:
>>>> On 15.12.22 at 00:33, Alex Hung wrote:
>>>>> On 2022-12-14 16:06, Alex Deucher wrote:
>>>>>> On Wed, Dec 14, 2022 at 5:56 PM Alex Hung wrote:
>>>>>>> On 2022-12-14 15:35, Alex Deucher wrote:
>>>>>>>> On Wed, Dec 14, 2022 at 5:25 PM Alex Hung wrote:
>>>>>>>>> On 2022-12-14 14:54, Alex Deucher wrote:
>>>>>>>>>> On Wed, Dec 14, 2022 at 4:50 PM Alex Hung wrote:
>>>>>>>>>>> On 2022-12-14 13:48, Alex Deucher wrote:
>>>>>>>>>>>> On Wed, Dec 14, 2022 at 3:22 PM Aurabindo Pillai wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> From: Alex Hung
>>>>>>>>>>>>>



>>
>> It can come through handle_hpd_rx_irq but we're using a workqueue
>> to queue interrupt handling so this shouldn't come from an atomic
>> context. I currently don't see where else it might be used in an
>> atomic context. Alex Hung, can you do a dump_stack() in this function
>> to see where the problematic call is coming from?
> 
> 
> IGT's kms_bw executes as below (when passing)
> 
> IGT-Version: 1.26-gf4067678 (x86_64) (Linux: 5.19.0-99-custom x86_64)
> Starting subtest: linear-tiling-1-displays-1920x1080p
> Subtest linear-tiling-1-displays-1920x1080p: SUCCESS (0.225s)
> Starting subtest: linear-tiling-1-displays-2560x1440p
> Subtest linear-tiling-1-displays-2560x1440p: SUCCESS (0.111s)
> Starting subtest: linear-tiling-1-displays-3840x2160p
> Subtest linear-tiling-1-displays-3840x2160p: SUCCESS (0.118s)
> Starting subtest: linear-tiling-2-displays-1920x1080p
> Subtest linear-tiling-2-displays-1920x1080p: SUCCESS (0.409s)
> Starting subtest: linear-tiling-2-displays-2560x1440p
> Subtest linear-tiling-2-displays-2560x1440p: SUCCESS (0.417s)
> Starting subtest: linear-tiling-2-displays-3840x2160p
> Subtest linear-tiling-2-displays-3840x2160p: SUCCESS (0.444s)
> Starting subtest: linear-tiling-3-displays-1920x1080p
> Subtest linear-tiling-3-displays-1920x1080p: SUCCESS (0.547s)
> Starting subtest: linear-tiling-3-displays-2560x1440p
> Subtest linear-tiling-3-displays-2560x1440p: SUCCESS (0.555s)
> Starting subtest: linear-tiling-3-displays-3840x2160p
> Subtest linear-tiling-3-displays-3840x2160p: SUCCESS (0.586s)
> Starting subtest: linear-tiling-4-displays-1920x1080p
> Subtest linear-tiling-4-displays-1920x1080p: SUCCESS (0.734s)
> Starting subtest: linear-tiling-4-displays-2560x1440p
> Subtest linear-tiling-4-displays-2560x1440p: SUCCESS (0.742s)
> Starting subtest: linear-tiling-4-displays-3840x2160p
> Subtest linear-tiling-4-displays-3840x2160p: SUCCESS (0.778s)
> Starting subtest: linear-tiling-5-displays-1920x1080p
> Subtest linear-tiling-5-displays-1920x1080p: SUCCESS (0.734s)
> Starting subtest: linear-tiling-5-displays-2560x1440p
> Subtest linear-tiling-5-displays-2560x1440p: SUCCESS (0.743s)
> Starting subtest: linear-tiling-5-displays-3840x2160p
> Subtest linear-tiling-5-displays-3840x2160p: SUCCESS (0.781s)
> Starting subtest: linear-tiling-6-displays-1920x1080p
> Test requirement not met in function run_test_linear_tiling, file 
> ../tests/kms_bw.c:156:
> Test requirement: !(pipe > num_pipes)
> ASIC does not have 5 pipes

Does this IGT patch fix the !(pipe > num_pipes) issue?

https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/commit/3bb9ee157642ec2433f01b51e09866da2b0f3dd8

> Subtest linear-tiling-6-displays-1920x1080p: SKIP (0.000s)
> Starting subtest: linear-tiling-6-displays-2560x1440p
> Test requirement not met in function run_test_linear_tiling, file 
> ../tests/kms_bw.c:156:
> Test requirement: !(pipe > num_pipes)
> ASIC does not have 5 pipes
> Subtest linear-tiling-6-displays-2560x1440p: SKIP (0.000s)
> Starting subtest: linear-tiling-6-displays-3840x2160p
> Test requirement not met in function run_test_linear_tiling, file 
> ../tests/kms_bw.c:156:
> Test requirement: !(pipe > num_pipes)
> ASIC does not have 5 pipes
> Subtest linear-tiling-6-displays-3840x2160p: SKIP (0.000s)
> 
> The crash usually occurs when executing "linear-tiling-3-displays-1920x1080p",
> but it can also occur at "linear-tiling-3-displays-2560x1440p".
> 
> 
> This is the dump_stack() output from right before the failing msleep.
> 
> [IGT] kms_bw: starting subtest linear-tiling-3-displays-1920x1080p
> CPU: 1 PID: 76 Comm: kworker/1:1 Not tainted 5.19.0-99-custom #126
> Workqueue: events drm_mode_rmfb_work_fn [drm]
> Call Trace:
>  
>  dump_stack_lvl+0x49/0x63
>  dump_stack+0x10/0x16
>  dce110_blank_stream.cold+0x5/0x14 [amdgpu]
>  core_link_disable_stream+0xe0/0x6b0 [amdgpu]
>  ? optc1_set_vtotal_min_max+0x6b/0x80 [amdgpu]
>  dcn31_reset_hw_ctx_wrap+0x229/0x410 [amdgpu]
>  dce110_apply_ctx_to_hw+0x6e/0x6c0 [amdgpu]
>  ? dcn20_plane_atomic_disable+0xb2/0x160 [amdgpu]
>  ? dcn20_disable_plane+0x2c/0x60 [amdgpu]
>  ? dcn20_post_unlock_program_front_end+0x77/0x2c0 [amdgpu]
>  dc_commit_state_no_check+0x39a/0xcd0 [amdgpu]
>  ? 

Re: [PATCH] drm/amd/pm: avoid large variable on kernel stack

2022-12-16 Thread Alex Deucher
On Thu, Dec 15, 2022 at 2:46 PM Christophe JAILLET
 wrote:
>
> On 15/12/2022 at 17:36, Arnd Bergmann wrote:
> > From: Arnd Bergmann 
> >
> > The activity_monitor_external[] array is too big to fit on the
> > kernel stack, resulting in this warning with clang:
> >
> > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_7_ppt.c:1438:12: 
> > error: stack frame size (1040) exceeds limit (1024) in 
> > 'smu_v13_0_7_get_power_profile_mode' [-Werror,-Wframe-larger-than]
> >
> > Use dynamic allocation instead. It should also be possible to
> > have single element here instead of the array, but this seems
> > easier.
> >
> > Fixes: 334682ae8151 ("drm/amd/pm: enable workload type change on 
> > smu_v13_0_7")
> > Signed-off-by: Arnd Bergmann 
> > ---
> >   .../drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c  | 21 ++-
> >   1 file changed, 16 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c 
> > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > index c270f94a1b86..7eba854e09ec 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > @@ -1439,7 +1439,7 @@ static int smu_v13_0_7_get_power_limit(struct 
> > smu_context *smu,
> >
> >   static int smu_v13_0_7_get_power_profile_mode(struct smu_context *smu, 
> > char *buf)
> >   {
> > - DpmActivityMonitorCoeffIntExternal_t 
> > activity_monitor_external[PP_SMC_POWER_PROFILE_COUNT];
> > + DpmActivityMonitorCoeffIntExternal_t *activity_monitor_external;
> >   uint32_t i, j, size = 0;
> >   int16_t workload_type = 0;
> >   int result = 0;
> > @@ -1447,6 +1447,12 @@ static int smu_v13_0_7_get_power_profile_mode(struct 
> > smu_context *smu, char *buf
> >   if (!buf)
> >   return -EINVAL;
> >
> > + activity_monitor_external = kcalloc(sizeof(activity_monitor_external),
>
> Hi,
>
> Before, 'activity_monitor_external' was not initialized.
> Maybe kcalloc() is enough?
>
> sizeof(*activity_monitor_external)?
>   

I've fixed this up when applying.

Alex

>
> > + PP_SMC_POWER_PROFILE_COUNT,
> > + GFP_KERNEL);
> > + if (!activity_monitor_external)
> > + return -ENOMEM;
> > +
> >   size += sysfs_emit_at(buf, size, "  ");
> >   for (i = 0; i <= PP_SMC_POWER_PROFILE_WINDOW3D; i++)
>
> Unrelated, but wouldn't it be more straightforward with "<
> PP_SMC_POWER_PROFILE_COUNT"?
>
> >   size += sysfs_emit_at(buf, size, "%-14s%s", 
> > amdgpu_pp_profile_name[i],
> > @@ -1459,15 +1465,17 @@ static int 
> > smu_v13_0_7_get_power_profile_mode(struct smu_context *smu, char *buf
> >   workload_type = smu_cmn_to_asic_specific_index(smu,
> >  
> > CMN2ASIC_MAPPING_WORKLOAD,
> >  i);
> > - if (workload_type < 0)
> > - return -EINVAL;
> > + if (workload_type < 0) {
> > + result = -EINVAL;
> > + goto out;
> > + }
> >
> >   result = smu_cmn_update_table(smu,
> > SMU_TABLE_ACTIVITY_MONITOR_COEFF, 
> > workload_type,
> > (void *)(&activity_monitor_external[i]), false);
> >   if (result) {
> >   dev_err(smu->adev->dev, "[%s] Failed to get activity 
> > monitor!", __func__);
> > - return result;
> > + goto out;
> >   }
> >   }
> >
> > @@ -1495,7 +1503,10 @@ do { 
> >   \
> >   PRINT_DPM_MONITOR(Fclk_BoosterFreq);
> >   #undef PRINT_DPM_MONITOR
> >
> > - return size;
> > + result = size;
> > +out:
> > + kfree(activity_monitor_external);
> > + return result;
> >   }
> >
> >   static int smu_v13_0_7_set_power_profile_mode(struct smu_context *smu, 
> > long *input, uint32_t size)
>
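The two points raised in this review (the element size passed to the allocator, and the single exit path that frees the buffer) can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: `struct coeff_table`, `PROFILE_COUNT` and `fill_tables()` are hypothetical stand-ins, the table size is an assumption, and `calloc()`/`free()` stand in for `kcalloc()`/`kfree()`.

```c
#include <stdlib.h>

/* Stand-in for DpmActivityMonitorCoeffIntExternal_t. The real layout is
 * not shown in the thread, so this size is an assumption. */
struct coeff_table {
	unsigned char raw[96];
};

/* Hypothetical stand-in for PP_SMC_POWER_PROFILE_COUNT. */
#define PROFILE_COUNT 7

/* Returns PROFILE_COUNT on success, -22 (mimicking -EINVAL) when the
 * simulated lookup fails at index fail_at, or -1 if allocation fails. */
static int fill_tables(int fail_at)
{
	struct coeff_table *tables;
	int result;
	int i;

	/* sizeof(*tables) is the element size. sizeof(tables) would only be
	 * the size of the pointer and would under-allocate the array. */
	tables = calloc(PROFILE_COUNT, sizeof(*tables));
	if (!tables)
		return -1;

	for (i = 0; i < PROFILE_COUNT; i++) {
		if (i == fail_at) {
			result = -22;
			goto out;	/* error path still frees the buffer */
		}
		tables[i].raw[0] = (unsigned char)i;
	}

	result = PROFILE_COUNT;
out:
	free(tables);
	return result;
}
```

Calling `fill_tables(-1)` exercises the success path, while `fill_tables(3)` takes the early exit. Both paths pass through the same `free()`, which is the reason the patch replaces the direct `return` statements with `goto out`.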


Re: drm/amdgpu: skip MES for S0ix as well since it's part of GFX

2022-12-16 Thread Limonciello, Mario

On 12/16/2022 10:44, Alex Deucher wrote:

It's also part of gfxoff.

Signed-off-by: Alex Deucher 


Reviewed-by: Mario Limonciello 

Even without the other series this alone has been shown
to improve things for the affected ASIC, so it should
probably go to stable.

Cc: sta...@vger.kernel.org # 6.0, 6.1


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 582a80a9850e..e4609b8d574c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3018,14 +3018,15 @@ static int amdgpu_device_ip_suspend_phase2(struct 
amdgpu_device *adev)
continue;
}
  
-		/* skip suspend of gfx and psp for S0ix

+   /* skip suspend of gfx/mes and psp for S0ix
 * gfx is in gfxoff state, so on resume it will exit gfxoff just
 * like at runtime. PSP is also part of the always on hardware
 * so no need to suspend it.
 */
if (adev->in_s0ix &&
(adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_PSP 
||
-adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GFX))
+adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GFX 
||
+adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_MES))
continue;
  
  		/* SDMA 5.x+ is part of GFX power domain so it's covered by GFXOFF */




Re: [5/7] drm/amdgpu: for S0ix, skip SMDA 5.x+ suspend/resume

2022-12-16 Thread Alex Deucher
On Fri, Dec 16, 2022 at 9:35 AM Limonciello, Mario
 wrote:
>
> +Tim
>
> On 12/15/2022 16:10, Alex Deucher wrote:
> > SDMA 5.x is part of the GFX block so it's controlled via
> > GFXOFF.  Skip suspend as it should be handled the same
> > as GFX.
> >
> > v2: drop SDMA 4.x.  That requires special handling.
> >
> > Acked-by: Rajneesh Bhardwaj 
> > Signed-off-by: Alex Deucher 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++
> >   1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index a99b327d5f09..5c0719c03c37 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3028,6 +3028,12 @@ static int amdgpu_device_ip_suspend_phase2(struct 
> > amdgpu_device *adev)
> >adev->ip_blocks[i].version->type == 
> > AMD_IP_BLOCK_TYPE_GFX))
> >   continue;
> >
> > + /* SDMA 5.x+ is part of GFX power domain so it's covered by 
> > GFXOFF */
> > + if (adev->in_s0ix &&
> > + (adev->ip_versions[SDMA0_HWIP][0] >= IP_VERSION(5, 0, 0)) 
> > &&
> > + (adev->ip_blocks[i].version->type == 
> > AMD_IP_BLOCK_TYPE_SDMA))
> > + continue;
> > +
>
> I think we want to also skip MES here too, right?  That might be a
> follow up patch though.

Sent as a follow up.

Alex

>
> >   /* XXX handle errors */
> >   r = adev->ip_blocks[i].version->funcs->suspend(adev);
> >   /* XXX handle errors */
>


[PATCH] drm/amdgpu: skip MES for S0ix as well since it's part of GFX

2022-12-16 Thread Alex Deucher
It's also part of gfxoff.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 582a80a9850e..e4609b8d574c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3018,14 +3018,15 @@ static int amdgpu_device_ip_suspend_phase2(struct 
amdgpu_device *adev)
continue;
}
 
-   /* skip suspend of gfx and psp for S0ix
+   /* skip suspend of gfx/mes and psp for S0ix
 * gfx is in gfxoff state, so on resume it will exit gfxoff just
 * like at runtime. PSP is also part of the always on hardware
 * so no need to suspend it.
 */
if (adev->in_s0ix &&
(adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_PSP 
||
-adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GFX))
+adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GFX 
||
+adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_MES))
continue;
 
/* SDMA 5.x+ is part of GFX power domain so it's covered by 
GFXOFF */
-- 
2.38.1
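Taken together with the SDMA check visible in the context lines, the skip decision in phase-2 suspend can be modelled as one predicate. This is a minimal userspace sketch under stated assumptions: the enum, the `SKETCH_IP_VERSION()` encoding and `skip_suspend_in_s0ix()` are hypothetical names made up for illustration and are not the amdgpu definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the IP block types named in the patch. */
enum ip_block_type {
	IP_BLOCK_PSP,
	IP_BLOCK_GFX,
	IP_BLOCK_MES,
	IP_BLOCK_SDMA,
	IP_BLOCK_OTHER,
};

/* Packs major/minor/revision into one comparable integer. The exact
 * encoding is an assumption made for this sketch. */
#define SKETCH_IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Decide whether phase-2 suspend should skip a block while in S0ix. */
static bool skip_suspend_in_s0ix(bool in_s0ix, enum ip_block_type type,
				 unsigned int sdma_version)
{
	if (!in_s0ix)
		return false;

	/* PSP is always-on hardware; GFX and MES are handled by gfxoff. */
	if (type == IP_BLOCK_PSP || type == IP_BLOCK_GFX ||
	    type == IP_BLOCK_MES)
		return true;

	/* SDMA 5.x+ sits in the GFX power domain, older SDMA does not. */
	return type == IP_BLOCK_SDMA &&
	       sdma_version >= SKETCH_IP_VERSION(5, 0, 0);
}
```

Keeping the decision in one predicate makes it easy to see that SDMA 4.x still goes through the normal suspend path, which matches the "v2: drop SDMA 4.x" note in the related patch.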



[linux-next:master] BUILD REGRESSION ca39c4daa6f7f770b1329ffb46f1e4a6bcc3f291

2022-12-16 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: ca39c4daa6f7f770b1329ffb46f1e4a6bcc3f291  Add linux-next specific 
files for 20221216

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202211180516.dtowileo-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211180955.uixgtkeu-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211190207.rf66o1j0-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211242120.mzzvguln-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212020520.0okmino3-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212051759.cev6fyhy-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202212142121.vendksoc-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

Documentation/gpu/drm-internals:179: ./include/drm/drm_file.h:411: WARNING: 
undefined label: drm_accel_node (if the link has no caption the label must 
precede a section header)
Documentation/networking/devlink/etas_es58x.rst: WARNING: document isn't 
included in any toctree
Warning: tools/power/cpupower/man/cpupower-powercap-info.1 references a file 
that doesn't exist: Documentation/power/powercap/powercap.txt
arch/powerpc/kernel/kvm_emul.o: warning: objtool: kvm_template_end(): can't 
find starting instruction
arch/powerpc/kernel/optprobes_head.o: warning: objtool: 
optprobe_template_end(): can't find starting instruction
drivers/gpu/drm/amd/amdgpu/../display/dc/irq/dcn201/irq_service_dcn201.c:40:20: 
warning: no previous prototype for 'to_dal_irq_source_dcn201' 
[-Wmissing-prototypes]
drivers/regulator/tps65219-regulator.c:310:32: warning: parameter 'dev' set but 
not used [-Wunused-but-set-parameter]
drivers/regulator/tps65219-regulator.c:310:60: warning: parameter 'dev' set but 
not used [-Wunused-but-set-parameter]
drivers/regulator/tps65219-regulator.c:370:26: sparse:int
drivers/regulator/tps65219-regulator.c:370:26: sparse:struct regulator_dev 
*[assigned] rdev
drivers/regulator/tps65219-regulator.c:370:26: warning: ordered comparison of 
pointer with integer zero [-Wextra]

Unverified Error/Warning (likely false positive, please contact us if 
interested):

drivers/accessibility/speakup/main.c:1290:26: sparse: sparse: obsolete array 
initializer, use C99 syntax
drivers/i2c/busses/i2c-qcom-geni.c:1028:28: sparse: sparse: symbol 
'i2c_master_hub' was not declared. Should it be static?
drivers/media/platform/ti/davinci/vpif.c:483:20: sparse: sparse: cast from 
non-scalar
drivers/media/platform/ti/davinci/vpif.c:483:20: sparse: sparse: cast to 
non-scalar
drivers/media/test-drivers/visl/visl-video.c:690:22: sparse: sparse: symbol 
'visl_qops' was not declared. Should it be static?
drivers/usb/misc/sisusbvga/sisusbvga.c:528:9: sparse: sparse: incorrect type in 
assignment (different base types)
fs/xfs/xfs_iomap.c:86:29: sparse: sparse: symbol 'xfs_iomap_page_ops' was not 
declared. Should it be static?
hidma.c:(.text+0x46): undefined reference to `devm_ioremap_resource'
mm/hugetlb.c:6897 hugetlb_reserve_pages() error: uninitialized symbol 'chg'.

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-irq-dcn201-irq_service_dcn201.c:warning:no-previous-prototype-for-to_dal_irq_source_dcn201
|   |-- 
drivers-regulator-tps65219-regulator.c:warning:ordered-comparison-of-pointer-with-integer-zero
|   `-- 
drivers-regulator-tps65219-regulator.c:warning:parameter-dev-set-but-not-used
|-- arc-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-irq-dcn201-irq_service_dcn201.c:warning:no-previous-prototype-for-to_dal_irq_source_dcn201
|   |-- 
drivers-regulator-tps65219-regulator.c:warning:ordered-comparison-of-pointer-with-integer-zero
|   `-- 
drivers-regulator-tps65219-regulator.c:warning:parameter-dev-set-but-not-used
|-- arc-randconfig-s053-20221216
|   |-- 
drivers-i2c-busses-i2c-qcom-geni.c:sparse:sparse:symbol-i2c_master_hub-was-not-declared.-Should-it-be-static
|   `-- 
drivers-media-test-drivers-visl-visl-video.c:sparse:sparse:symbol-visl_qops-was-not-declared.-Should-it-be-static
|-- arm-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-irq-dcn201-irq_service_dcn201.c:warning:no-previous-prototype-for-to_dal_irq_source_dcn201
|   |-- 
drivers-regulator-tps65219-regulator.c:warning:ordered-comparison-of-pointer-with-integer-zero
|   `-- 
drivers-regulator-tps65219-regulator.c:warning:parameter-dev-set-but-not-used
|-- arm64-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-irq-dcn201-irq_service_dcn201.c:warning:no-previous-prototype-for-to_dal_irq_source_dcn201
|   |-- 
drivers-regulator-tps65219-regulator.c:warning:ordered-comparison-of-pointer-with-integer-zero
|   `-- 
drivers-regulator-tps65219-regulator.c:warning:parameter-dev-set-but-not-used
|-- arm64-buildonly-randconfig-r006-20221215
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-irq-dcn201-irq_service_dcn201.c:warning:no-previous

RE: [PATCH 5/7] drm/amdgpu: for S0ix, skip SMDA 5.x+ suspend/resume

2022-12-16 Thread Russell, Kent
[AMD Official Use Only - General]

Probably want to fix that typo from SMDA to SDMA in the subject line before 
pushing.

 Kent

> -----Original Message-----
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Thursday, December 15, 2022 5:11 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Bhardwaj, Rajneesh
> 
> Subject: [PATCH 5/7] drm/amdgpu: for S0ix, skip SMDA 5.x+ suspend/resume
> 
> SDMA 5.x is part of the GFX block so it's controlled via
> GFXOFF.  Skip suspend as it should be handled the same
> as GFX.
> 
> v2: drop SDMA 4.x.  That requires special handling.
> 
> Acked-by: Rajneesh Bhardwaj 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a99b327d5f09..5c0719c03c37 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3028,6 +3028,12 @@ static int amdgpu_device_ip_suspend_phase2(struct
> amdgpu_device *adev)
>adev->ip_blocks[i].version->type ==
> AMD_IP_BLOCK_TYPE_GFX))
>   continue;
> 
> + /* SDMA 5.x+ is part of GFX power domain so it's covered by
> GFXOFF */
> + if (adev->in_s0ix &&
> + (adev->ip_versions[SDMA0_HWIP][0] >= IP_VERSION(5, 0, 0))
> &&
> + (adev->ip_blocks[i].version->type ==
> AMD_IP_BLOCK_TYPE_SDMA))
> + continue;
> +
>   /* XXX handle errors */
>   r = adev->ip_blocks[i].version->funcs->suspend(adev);
>   /* XXX handle errors */
> --
> 2.38.1


Re: [PATCH] drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics

2022-12-16 Thread Alex Deucher
With Christian's comment addressed, the patch is:
Acked-by: Alex Deucher 

On Fri, Dec 16, 2022 at 6:50 AM Christian König
 wrote:
>
> On 16.12.22 at 11:35, Evan Quan wrote:
> > For SMU 13.0.0 and 13.0.7, the output from PMFW is in percent. The driver
> > needs to convert that into the correct PWM (0-255) based value.
> >
> > Signed-off-by: Evan Quan 
> > Change-Id: I7bbeae3c0d81c6cf6e0033aa28ca6d26f5b6d178
> > ---
> >   .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c  | 15 ---
> >   .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c  | 15 ---
> >   2 files changed, 24 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
> > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > index 636cb561fea9..283cf7cf95ab 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
> > @@ -1445,12 +1445,21 @@ static void smu_v13_0_0_get_unique_id(struct 
> > smu_context *smu)
> >   static int smu_v13_0_0_get_fan_speed_pwm(struct smu_context *smu,
> >uint32_t *speed)
> >   {
> > + int ret = 0;
>
> Please don't initialize local variables when there isn't a need for this.
>
> We often get complaints about this from automated scripts.
>
> Regards,
> Christian.
>
> > +
> >   if (!speed)
> >   return -EINVAL;
> >
> > - return smu_v13_0_0_get_smu_metrics_data(smu,
> > - METRICS_CURR_FANPWM,
> > - speed);
> > + ret = smu_v13_0_0_get_smu_metrics_data(smu,
> > +METRICS_CURR_FANPWM,
> > +speed);
> > + if (ret)
> > + return ret;
> > +
> > + /* Convert the PMFW output which is in percent to pwm(255) based */
> > + *speed = MIN(*speed * 255 / 100, 255);
> > +
> > + return 0;
> >   }
> >
> >   static int smu_v13_0_0_get_fan_speed_rpm(struct smu_context *smu,
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c 
> > b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > index 5e937e4efb51..f207f102ed7e 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
> > @@ -1365,12 +1365,21 @@ static int 
> > smu_v13_0_7_populate_umd_state_clk(struct smu_context *smu)
> >   static int smu_v13_0_7_get_fan_speed_pwm(struct smu_context *smu,
> >uint32_t *speed)
> >   {
> > + int ret = 0;
> > +
> >   if (!speed)
> >   return -EINVAL;
> >
> > - return smu_v13_0_7_get_smu_metrics_data(smu,
> > - METRICS_CURR_FANPWM,
> > - speed);
> > + ret = smu_v13_0_7_get_smu_metrics_data(smu,
> > +METRICS_CURR_FANPWM,
> > +speed);
> > + if (ret)
> > + return ret;
> > +
> > + /* Convert the PMFW output which is in percent to pwm(255) based */
> > + *speed = MIN(*speed * 255 / 100, 255);
> > +
> > + return 0;
> >   }
> >
> >   static int smu_v13_0_7_get_fan_speed_rpm(struct smu_context *smu,
>


Re: [PATCH] drm/amd/display: fix dp_retrieve_lttpr_cap return code

2022-12-16 Thread Arnd Bergmann
On Thu, Dec 15, 2022, at 18:56, Michel Dänzer wrote:
> On 12/15/22 17:37, Arnd Bergmann wrote:
/amd/display/dc/core/dc_link_dp.c
>> index af9411ee3c74..95dbfa4e996a 100644
>> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
>> @@ -5095,7 +5095,7 @@ enum dc_status dp_retrieve_lttpr_cap(struct dc_link 
>> *link)
>>  bool vbios_lttpr_interop = link->dc->caps.vbios_lttpr_aware;
>>  
>>  if (!vbios_lttpr_interop || 
>> !link->dc->caps.extended_aux_timeout_support)
>> -return false;
>> +return DC_OK;
>
>   return status;
>
> seems more appropriate. (Otherwise the status = DC_ERROR_UNEXPECTED 
> initialization has no effect)

Ok, makes sense. I'd also remove the unused initialization in that
case though:

 enum dc_status dp_retrieve_lttpr_cap(struct dc_link *link)
 {
uint8_t lttpr_dpcd_data[8];
-   enum dc_status status = DC_ERROR_UNEXPECTED;
-   bool is_lttpr_present = false;
+   enum dc_status status;
+   bool is_lttpr_present;
 
/* Logic to determine LTTPR support*/
bool vbios_lttpr_interop = link->dc->caps.vbios_lttpr_aware;
 
if (!vbios_lttpr_interop || 
!link->dc->caps.extended_aux_timeout_support)
-   return false;
+   return DC_ERROR_UNEXPECTED;
 
/* By reading LTTPR capability, RX assumes that we will enable
 * LTTPR extended aux timeout if LTTPR is present.

I'll send that as a v2 once that passes my build test and nobody
has further suggestions.
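
The underlying problem, returning `false` from a function whose return type is a status enum, can be reduced to a few lines. This is a minimal userspace sketch: `enum sketch_status` and both functions are hypothetical, and the numeric enumerator values are assumptions chosen for illustration rather than copied from the driver headers.

```c
#include <stdbool.h>

/* Hypothetical status enum. The numeric values are assumptions. */
enum sketch_status {
	SKETCH_OK = 1,
	SKETCH_ERROR_UNEXPECTED = -1,
};

/* Buggy shape: "false" silently converts to 0, which matches neither
 * enumerator, so a caller checking for SKETCH_OK or for a specific
 * error code sees something else entirely. */
static enum sketch_status retrieve_cap_buggy(bool supported)
{
	if (!supported)
		return false;
	return SKETCH_OK;
}

/* Fixed shape: the early-exit path returns an explicit status. */
static enum sketch_status retrieve_cap_fixed(bool supported)
{
	if (!supported)
		return SKETCH_ERROR_UNEXPECTED;
	return SKETCH_OK;
}
```

With these assumed values, the buggy variant returns 0 on the early-exit path, while the fixed variant returns a value the caller can actually compare against.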

   Arnd


Re: [5/7] drm/amdgpu: for S0ix, skip SMDA 5.x+ suspend/resume

2022-12-16 Thread Limonciello, Mario

+Tim

On 12/15/2022 16:10, Alex Deucher wrote:

SDMA 5.x is part of the GFX block so it's controlled via
GFXOFF.  Skip suspend as it should be handled the same
as GFX.

v2: drop SDMA 4.x.  That requires special handling.

Acked-by: Rajneesh Bhardwaj 
Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a99b327d5f09..5c0719c03c37 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3028,6 +3028,12 @@ static int amdgpu_device_ip_suspend_phase2(struct 
amdgpu_device *adev)
 adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GFX))
continue;
  
+		/* SDMA 5.x+ is part of GFX power domain so it's covered by GFXOFF */

+   if (adev->in_s0ix &&
+   (adev->ip_versions[SDMA0_HWIP][0] >= IP_VERSION(5, 0, 0)) &&
+   (adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_SDMA))
+   continue;
+


I think we want to also skip MES here too, right?  That might be a 
follow up patch though.



/* XXX handle errors */
r = adev->ip_blocks[i].version->funcs->suspend(adev);
/* XXX handle errors */




Re: [PATCH 0/7 v2] Improve S0ix stability

2022-12-16 Thread Limonciello, Mario

On 12/15/2022 16:10, Alex Deucher wrote:

This series improves S0ix stability by avoiding touching
registers that should be handled as part of gfxoff.

v2: add comments in gmc code to explain why we can
skip the vm fault state setting for gfxhub.

Alex Deucher (7):
   drm/amdgpu/gmc9: don't touch gfxhub registers during S0ix
   drm/amdgpu/gmc10: don't touch gfxhub registers during S0ix
   drm/amdgpu/gmc11: don't touch gfxhub registers during S0ix
   drm/amdgpu: don't mess with SDMA clock or powergating in S0ix
   drm/amdgpu: for S0ix, skip SMDA 5.x+ suspend/resume
   Revert "drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle
 resume"
   Revert "drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix"

  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 32 ---
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 36 --
  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 16 --
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 36 ++
  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c |  8 -
  5 files changed, 83 insertions(+), 45 deletions(-)



Series is:

Reviewed-by: Mario Limonciello 


Re: [PATCH V2] drm/plane-helper: Add the missing declaration of drm_atomic_state

2022-12-16 Thread Thomas Zimmermann

Hi

On 16.12.22 at 04:05, Ma Jun wrote:

Add the missing declaration of struct drm_atomic_state to fix the
compile error below:

error: 'struct drm_atomic_state' declared inside parameter
list will not be visible outside of this definition or declaration [-Werror]

Signed-off-by: Ma Jun 


Thanks. I added a Fixes tag and merged the patch into drm-misc-fixes.

Best regards
Thomas


---
  include/drm/drm_plane_helper.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/include/drm/drm_plane_helper.h b/include/drm/drm_plane_helper.h
index b00ad36cf5b6..90156e13ac11 100644
--- a/include/drm/drm_plane_helper.h
+++ b/include/drm/drm_plane_helper.h
@@ -26,6 +26,7 @@
  
  #include 
  
+struct drm_atomic_state;

  struct drm_crtc;
  struct drm_framebuffer;
  struct drm_modeset_acquire_ctx;


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
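
The compile error quoted in this thread comes from using a struct tag in a prototype before the tag has been declared at file scope. The following is a minimal sketch of the forward-declaration pattern the patch applies; `struct sketch_state` and `sketch_check()` are hypothetical names, not DRM APIs.

```c
/* Forward declaration: tells the compiler the tag exists at file scope,
 * so pointers to it can appear in prototypes without the definition. */
struct sketch_state;

/* Without the declaration above, the tag would be scoped to this
 * parameter list only, which is what the quoted warning reports. */
int sketch_check(struct sketch_state *state);

/* The full definition can live elsewhere, for example in another header. */
struct sketch_state {
	int planes;
};

int sketch_check(struct sketch_state *state)
{
	return state ? state->planes : -1;
}
```

A forward declaration is enough here because the header only passes pointers around; the full definition is needed only where members are accessed.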




RE: [PATCH] drm/amdgpu: Fixed bug on error when uninstalling amdgpu

2022-12-16 Thread Chai, Thomas
[AMD Official Use Only - General]

OK, I will update subject line.  Thanks!


-
Best Regards,
Thomas

-----Original Message-----
From: Christian König  
Sent: Friday, December 16, 2022 4:50 PM
To: Chai, Thomas ; amd-gfx@lists.freedesktop.org; Paneer 
Selvam, Arunpravin 
Cc: Zhou1, Tao ; Zhang, Hawking ; 
Chai, Thomas 
Subject: Re: [PATCH] drm/amdgpu: Fixed bug on error when uninstalling amdgpu

On 16.12.22 at 03:56, YiPeng Chai wrote:
> Fixed bug on error when uninstalling amdgpu.
> The error message is as follows:
> [  304.852489] kernel BUG at drivers/gpu/drm/drm_buddy.c:278!
> [  304.852503] invalid opcode:  [#1] PREEMPT SMP NOPTI
> [  304.852510] CPU: 2 PID: 4192 Comm: modprobe Tainted: GW IOE 
> 5.19.0-thomas #1
> [  304.852519] Hardware name: ASUS System Product Name/PRIME Z390-A, 
> BIOS 2004 11/02/2021 [  304.852526] RIP: 
> 0010:drm_buddy_free_block+0x26/0x30 [drm_buddy] [  304.852535] Code: 
> 00 00 00 90 0f 1f 44 00 00 48 8b 0e 89 c8 25 00 0c 00 00 3d 00 04 00 
> 00 75 10 48 8b 47 18 48 d3 e0 48 01 47 28 e9 fa fe ff ff <0f> 0b 0f 1f 
> 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 48 89 f5 53 [  304.852549] 
> RSP: 0018:9afac17bbcb8 EFLAGS: 00010287 [  304.852556] RAX: 
>  RBX: 8dacd37fd778 RCX:  [  
> 304.852563] RDX: 8dacd37fd7a0 RSI: 8dacd37fd3b8 RDI: 
> 8dac672a5f80 [  304.852570] RBP: 8dacd37fd3a0 R08: 
> 0001 R09:  [  304.852577] R10: 
> 8dac68185500 R11: 9afac17bbd00 R12: 8dac672a5f80 [  
> 304.852584] R13: 8dac672a5fe0 R14: 8dacd37fd380 R15: 
> 8dac672a5f80 [  304.852590] FS:  7f0fa9b30c40() 
> GS:8dadb648() knlGS: [  304.852598] CS:  0010 DS: 
>  ES:  CR0: 80050033 [  304.852604] CR2: 7f4bf1a1ba50 CR3: 
> 000108c58004 CR4: 003706e0 [  304.852611] DR0:  
> DR1:  DR2:  [  304.852618] DR3: 
>  DR6: fffe0ff0 DR7: 0400 [  304.852625] 
> Call Trace:
> [  304.852629]  
> [  304.852632]  drm_buddy_free_list+0x2a/0x60 [drm_buddy] [  
> 304.852639]  amdgpu_vram_mgr_fini+0xea/0x180 [amdgpu] [  304.852827]  
> amdgpu_ttm_fini+0x1f9/0x280 [amdgpu] [  304.852925]  
> amdgpu_bo_fini+0x22/0x90 [amdgpu] [  304.853022]  
> gmc_v11_0_sw_fini+0x26/0x30 [amdgpu] [  304.853132]  
> amdgpu_device_fini_sw+0xc5/0x3b0 [amdgpu] [  304.853229]  
> amdgpu_driver_release_kms+0x12/0x30 [amdgpu] [  304.853327]  
> drm_dev_release+0x20/0x40 [drm] [  304.853352]  
> release_nodes+0x35/0xb0 [  304.853359]  devres_release_all+0x8b/0xc0 [  
> 304.853364]  device_unbind_cleanup+0xe/0x70 [  304.853370]  
> device_release_driver_internal+0xee/0x160
> [  304.853377]  driver_detach+0x44/0x90 [  304.853382]  
> bus_remove_driver+0x55/0xe0 [  304.853387]  
> pci_unregister_driver+0x3b/0x90 [  304.853393]  amdgpu_exit+0x11/0x69 
> [amdgpu] [  304.853540]  __x64_sys_delete_module+0x142/0x260
> [  304.853548]  ? exit_to_user_mode_prepare+0x3e/0x190
> [  304.853555]  do_syscall_64+0x38/0x90 [  304.853562]  
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Signed-off-by: YiPeng Chai 

The subject line should probably read "when unloading amdgpu", but apart from 
that good catch.

Reviewed-by: Christian König 

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 0b598b510bd8..eb63324c30d2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -829,7 +829,7 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev)
>   kfree(rsv);
>   
>   list_for_each_entry_safe(rsv, temp, &mgr->reserved_pages, blocks) {
> - drm_buddy_free_list(&mgr->mm, &rsv->blocks);
> + drm_buddy_free_list(&mgr->mm, &rsv->allocated);
>   kfree(rsv);
>   }
>   drm_buddy_fini(&mgr->mm);


[PATCH -next] drm/amd/display: Remove redundant assignment to variable dc

2022-12-16 Thread Yi Yang
Smatch reports the following warning:

Line 53679: drivers/gpu/drm/amd/display/dc/core/dc_stream.c:402
dc_stream_set_cursor_position() warn: variable dereferenced before
check 'stream'

The value of 'dc' is already assigned after checking whether 'stream' is
NULL. Fix the warning by removing the redundant assignment at declaration.

Signed-off-by: Yi Yang 
---
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 20e534f73513..78d31bb875d1 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -408,7 +408,7 @@ bool dc_stream_set_cursor_position(
struct dc_stream_state *stream,
const struct dc_cursor_position *position)
 {
-   struct dc  *dc = stream->ctx->dc;
+   struct dc *dc;
bool reset_idle_optimizations = false;
 
if (NULL == stream) {
-- 
2.17.1
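
For readers following along, the pattern the patch relies on can be sketched in isolation. This is only an illustration under assumptions: the struct definitions and the stream_get_dc() helper below are hypothetical stand-ins, not the real DC code. The point is that the pointer is declared without an initializer and only dereferenced once the NULL check has passed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real DC structures. */
struct dc { int id; };
struct dc_context { struct dc *dc; };
struct dc_stream_state { struct dc_context *ctx; };

/* Hypothetical helper: stores the owning dc in *out when stream is usable. */
static bool stream_get_dc(struct dc_stream_state *stream, struct dc **out)
{
	struct dc *dc; /* deliberately not initialized from stream->ctx->dc */

	if (stream == NULL || stream->ctx == NULL)
		return false;

	dc = stream->ctx->dc; /* dereference only after the check */
	*out = dc;
	return true;
}
```

Calling stream_get_dc(NULL, &out) returns false without touching the pointer, which is the behaviour the Smatch warning is asking for.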



Re: [PATCH] drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics

2022-12-16 Thread Christian König

Am 16.12.22 um 11:35 schrieb Evan Quan:

For SMU 13.0.0 and 13.0.7, the output from PMFW is in percent. The driver
needs to convert that into a PWM value in the 0-255 range.

Signed-off-by: Evan Quan 
Change-Id: I7bbeae3c0d81c6cf6e0033aa28ca6d26f5b6d178
---
  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c  | 15 ++++++++++++---
  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c  | 15 ++++++++++++---
  2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index 636cb561fea9..283cf7cf95ab 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -1445,12 +1445,21 @@ static void smu_v13_0_0_get_unique_id(struct 
smu_context *smu)
  static int smu_v13_0_0_get_fan_speed_pwm(struct smu_context *smu,
 uint32_t *speed)
  {
+   int ret = 0;


Please don't initialize local variables when there is no need for it.

We often get complaints about this from automated scripts.

Regards,
Christian.


+
if (!speed)
return -EINVAL;
  
-   return smu_v13_0_0_get_smu_metrics_data(smu,
-   METRICS_CURR_FANPWM,
-   speed);
+   ret = smu_v13_0_0_get_smu_metrics_data(smu,
+  METRICS_CURR_FANPWM,
+  speed);
+   if (ret)
+   return ret;
+
+   /* Convert the PMFW output which is in percent to pwm(255) based */
+   *speed = MIN(*speed * 255 / 100, 255);
+
+   return 0;
  }
  
  static int smu_v13_0_0_get_fan_speed_rpm(struct smu_context *smu,

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
index 5e937e4efb51..f207f102ed7e 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
@@ -1365,12 +1365,21 @@ static int smu_v13_0_7_populate_umd_state_clk(struct 
smu_context *smu)
  static int smu_v13_0_7_get_fan_speed_pwm(struct smu_context *smu,
 uint32_t *speed)
  {
+   int ret = 0;
+
if (!speed)
return -EINVAL;
  
-   return smu_v13_0_7_get_smu_metrics_data(smu,
-   METRICS_CURR_FANPWM,
-   speed);
+   ret = smu_v13_0_7_get_smu_metrics_data(smu,
+  METRICS_CURR_FANPWM,
+  speed);
+   if (ret)
+   return ret;
+
+   /* Convert the PMFW output which is in percent to pwm(255) based */
+   *speed = MIN(*speed * 255 / 100, 255);
+
+   return 0;
  }
  
  static int smu_v13_0_7_get_fan_speed_rpm(struct smu_context *smu,
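
To illustrate the review comment, here is a rough sketch of the same function shape with "ret" left uninitialized. It is not the driver code: get_fan_percent() is a hypothetical stub standing in for the metrics call, and get_fan_speed_pwm() is a simplified name. Since "ret" is assigned before its first use, the "= 0" adds nothing, and leaving it out lets the compiler warn if a later change ever reads it too early.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stub standing in for the PMFW metrics query. */
static int get_fan_percent(uint32_t *value)
{
	*value = 40; /* pretend the firmware reports 40 percent */
	return 0;
}

static int get_fan_speed_pwm(uint32_t *speed)
{
	int ret; /* no initializer: assigned before its first use below */

	if (!speed)
		return -EINVAL;

	ret = get_fan_percent(speed);
	if (ret)
		return ret;

	/* Convert percent to the 0-255 PWM range and clamp. */
	*speed = *speed * 255 / 100;
	if (*speed > 255)
		*speed = 255;

	return 0;
}
```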




Re: [PATCH 16/16] drm/amd/display: Don't restrict bpc to 8 bpc

2022-12-16 Thread Michel Dänzer
On 12/15/22 10:07, Michel Dänzer wrote:
> On 12/14/22 16:46, Alex Deucher wrote:
>> On Wed, Dec 14, 2022 at 4:01 AM Pekka Paalanen  wrote:
>>> On Tue, 13 Dec 2022 18:20:59 +0100
>>> Michel Dänzer  wrote:
>>>> On 12/12/22 19:21, Harry Wentland wrote:
>>>>> This will let us pass kms_hdr.bpc_switch.
>>>>>
>>>>> I don't see any good reasons why we still need to
>>>>> limit bpc to 8 bpc and doing so is problematic when
>>>>> we enable HDR.
>>>>>
>>>>> If I remember correctly there might have been some
>>>>> displays out there where the advertised link bandwidth
>>>>> was not large enough to drive the default timing at
>>>>> max bpc. This would lead to an atomic commit/check
>>>>> failure which should really be handled in compositors
>>>>> with some sort of fallback mechanism.
>>>>>
>>>>> If this somehow turns out to still be an issue I
>>>>> suggest we add a module parameter to allow users to
>>>>> limit the max_bpc to a desired value.
>>>>
>>>> While leaving the fallback for user space to handle makes some sense
>>>> in theory, in practice most KMS display servers likely won't handle
>>>> it.
>>>>
>>>> Another issue is that if mode validation is based on the maximum bpc
>>>> value, it may reject modes which would work with lower bpc.
>>>>
>>>>
>>>> What Ville (CC'd) suggested before instead (and what i915 seems to be
>>>> doing already) is that the driver should do mode validation based on
>>>> the *minimum* bpc, and automatically make the effective bpc lower
>>>> than the maximum as needed to make the rest of the atomic state work.
>>>
>>> A driver is always allowed to choose a bpc lower than max_bpc, so it
>>> very well should do so when necessary due to *known* hardware etc.
>>> limitations.
>>>
>>
>> In the amdgpu case, it's more of a preference thing.  The driver would
>> enable higher bpcs at the expense of refresh rate and it seemed most
>> users want higher refresh rates than higher bpc. 
> 
> I wrote the above because I thought that this patch might result in some 
> modes getting pruned because they can't work with the max bpc. However, I see 
> now that cbd14ae7ea93 ("drm/amd/display: Fix incorrectly pruned modes with 
> deep color") should prevent that AFAICT.
> 
> The question then is: What happens if user space tries to use a mode which 
> doesn't work with the max bpc? Does the driver automatically lower the 
> effective bpc as needed, or does the atomic commit (check) fail? The latter 
> would seem bad.

Per my previous post in the other sub-thread, cbd14ae7ea93 ("drm/amd/display: 
Fix incorrectly pruned modes with deep color") seems to do the former. The 
commit log of this patch should probably be changed to reflect that.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer
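
As a rough illustration of the "lower the effective bpc as needed" behaviour discussed above, the selection can be sketched as a simple search from max_bpc downwards. This is a sketch under stated assumptions, not what amdgpu or i915 actually implement: pick_effective_bpc() and its parameters are hypothetical, and the bandwidth estimate ignores blanking, link encoding overhead and DSC.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns the highest bpc <= max_bpc (stepping down
 * by 2, never below min_bpc) whose raw data rate fits the link, or 0 if
 * even min_bpc does not fit.
 */
static unsigned int pick_effective_bpc(uint64_t pixel_clock_khz,
				       uint64_t link_bw_kbps,
				       unsigned int max_bpc,
				       unsigned int min_bpc)
{
	unsigned int bpc;

	for (bpc = max_bpc; bpc >= min_bpc; bpc -= 2) {
		/* Three color components per pixel, overhead ignored. */
		uint64_t needed_kbps = pixel_clock_khz * bpc * 3;

		if (needed_kbps <= link_bw_kbps)
			return bpc;

		/* Stop before the next step would drop below min_bpc. */
		if (bpc < min_bpc + 2)
			break;
	}

	return 0;
}
```

With a scheme like this, mode validation would only need to check that the mode fits at min_bpc, and the atomic check would fail only when the helper returns 0.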



[PATCH] drm/amd/display: Remove redundant logs from DSC code

2022-12-16 Thread Praful Swarnakar
[Why & How]
Remove redundant logs in the DSC code that only print blank lines.

Signed-off-by: Praful Swarnakar 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 1 -
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dsc.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index af9411ee3c74..f2b6d40e4f5c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -7510,7 +7510,6 @@ bool dp_set_dsc_pps_sdp(struct pipe_ctx *pipe_ctx, bool enable, bool immediate_u
dsc_cfg.is_odm = pipe_ctx->next_odm_pipe ? true : false;
dsc_cfg.dc_dsc_cfg = stream->timing.dsc_cfg;
 
-   DC_LOG_DSC(" ");
dsc->funcs->dsc_get_packed_pps(dsc, &dsc_cfg, &dsc_packed_pps[0]);
memcpy(&stream->dsc_packed_pps[0], &dsc_packed_pps[0], sizeof(stream->dsc_packed_pps));
if (dc_is_dp_signal(stream->signal)) {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dsc.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dsc.c
index 784a8b6f360d..c08c01e05dcf 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dsc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dsc.c
@@ -200,7 +200,6 @@ static void dsc2_set_config(struct display_stream_compressor *dsc, const struct
bool is_config_ok;
struct dcn20_dsc *dsc20 = TO_DCN20_DSC(dsc);
 
-   DC_LOG_DSC(" ");
DC_LOG_DSC("Setting DSC Config at DSC inst %d", dsc->inst);
dsc_config_log(dsc, dsc_cfg);
is_config_ok = dsc_prepare_config(dsc_cfg, &dsc20->reg_vals, dsc_optc_cfg);
-- 
2.25.1



[PATCH] drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics

2022-12-16 Thread Evan Quan
For SMU 13.0.0 and 13.0.7, the output from PMFW is in percent. The driver
needs to convert that into a PWM value in the 0-255 range.

Signed-off-by: Evan Quan 
Change-Id: I7bbeae3c0d81c6cf6e0033aa28ca6d26f5b6d178
---
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c  | 15 ++++++++++++---
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c  | 15 ++++++++++++---
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index 636cb561fea9..283cf7cf95ab 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -1445,12 +1445,21 @@ static void smu_v13_0_0_get_unique_id(struct 
smu_context *smu)
 static int smu_v13_0_0_get_fan_speed_pwm(struct smu_context *smu,
 uint32_t *speed)
 {
+   int ret = 0;
+
if (!speed)
return -EINVAL;
 
-   return smu_v13_0_0_get_smu_metrics_data(smu,
-   METRICS_CURR_FANPWM,
-   speed);
+   ret = smu_v13_0_0_get_smu_metrics_data(smu,
+  METRICS_CURR_FANPWM,
+  speed);
+   if (ret)
+   return ret;
+
+   /* Convert the PMFW output which is in percent to pwm(255) based */
+   *speed = MIN(*speed * 255 / 100, 255);
+
+   return 0;
 }
 
 static int smu_v13_0_0_get_fan_speed_rpm(struct smu_context *smu,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
index 5e937e4efb51..f207f102ed7e 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
@@ -1365,12 +1365,21 @@ static int smu_v13_0_7_populate_umd_state_clk(struct 
smu_context *smu)
 static int smu_v13_0_7_get_fan_speed_pwm(struct smu_context *smu,
 uint32_t *speed)
 {
+   int ret = 0;
+
if (!speed)
return -EINVAL;
 
-   return smu_v13_0_7_get_smu_metrics_data(smu,
-   METRICS_CURR_FANPWM,
-   speed);
+   ret = smu_v13_0_7_get_smu_metrics_data(smu,
+  METRICS_CURR_FANPWM,
+  speed);
+   if (ret)
+   return ret;
+
+   /* Convert the PMFW output which is in percent to pwm(255) based */
+   *speed = MIN(*speed * 255 / 100, 255);
+
+   return 0;
 }
 
 static int smu_v13_0_7_get_fan_speed_rpm(struct smu_context *smu,
-- 
2.34.1
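
The conversion in the patch is plain integer arithmetic, so it may help to see it on its own. The helper name percent_to_pwm() below is hypothetical and used only for illustration; the expression mirrors MIN(*speed * 255 / 100, 255) from the diff. Note that the integer division truncates, so 50 percent maps to 127 rather than 128, and any input above 100 percent is clamped to 255.

```c
#include <stdint.h>

/* Hypothetical helper mirroring MIN(percent * 255 / 100, 255). */
static uint32_t percent_to_pwm(uint32_t percent)
{
	uint32_t pwm = percent * 255 / 100; /* integer division truncates */

	return pwm > 255 ? 255 : pwm;
}
```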



Re: [PATCH] drm/amdgpu: Fixed bug on error when uninstalling amdgpu

2022-12-16 Thread Christian König

Am 16.12.22 um 03:56 schrieb YiPeng Chai:

Fixed bug on error when uninstalling amdgpu.
The error message is as follows:
[  304.852489] kernel BUG at drivers/gpu/drm/drm_buddy.c:278!
[  304.852503] invalid opcode:  [#1] PREEMPT SMP NOPTI
[  304.852510] CPU: 2 PID: 4192 Comm: modprobe Tainted: GW IOE 
5.19.0-thomas #1
[  304.852519] Hardware name: ASUS System Product Name/PRIME Z390-A, BIOS 2004 
11/02/2021
[  304.852526] RIP: 0010:drm_buddy_free_block+0x26/0x30 [drm_buddy]
[  304.852535] Code: 00 00 00 90 0f 1f 44 00 00 48 8b 0e 89 c8 25 00 0c 00 00 3d 00 
04 00 00 75 10 48 8b 47 18 48 d3 e0 48 01 47 28 e9 fa fe ff ff <0f> 0b 0f 1f 84 
00 00 00 00 00 0f 1f 44 00 00 41 54 55 48 89 f5 53
[  304.852549] RSP: 0018:9afac17bbcb8 EFLAGS: 00010287
[  304.852556] RAX:  RBX: 8dacd37fd778 RCX: 
[  304.852563] RDX: 8dacd37fd7a0 RSI: 8dacd37fd3b8 RDI: 8dac672a5f80
[  304.852570] RBP: 8dacd37fd3a0 R08: 0001 R09: 
[  304.852577] R10: 8dac68185500 R11: 9afac17bbd00 R12: 8dac672a5f80
[  304.852584] R13: 8dac672a5fe0 R14: 8dacd37fd380 R15: 8dac672a5f80
[  304.852590] FS:  7f0fa9b30c40() GS:8dadb648() 
knlGS:
[  304.852598] CS:  0010 DS:  ES:  CR0: 80050033
[  304.852604] CR2: 7f4bf1a1ba50 CR3: 000108c58004 CR4: 003706e0
[  304.852611] DR0:  DR1:  DR2: 
[  304.852618] DR3:  DR6: fffe0ff0 DR7: 0400
[  304.852625] Call Trace:
[  304.852629]  
[  304.852632]  drm_buddy_free_list+0x2a/0x60 [drm_buddy]
[  304.852639]  amdgpu_vram_mgr_fini+0xea/0x180 [amdgpu]
[  304.852827]  amdgpu_ttm_fini+0x1f9/0x280 [amdgpu]
[  304.852925]  amdgpu_bo_fini+0x22/0x90 [amdgpu]
[  304.853022]  gmc_v11_0_sw_fini+0x26/0x30 [amdgpu]
[  304.853132]  amdgpu_device_fini_sw+0xc5/0x3b0 [amdgpu]
[  304.853229]  amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
[  304.853327]  drm_dev_release+0x20/0x40 [drm]
[  304.853352]  release_nodes+0x35/0xb0
[  304.853359]  devres_release_all+0x8b/0xc0
[  304.853364]  device_unbind_cleanup+0xe/0x70
[  304.853370]  device_release_driver_internal+0xee/0x160
[  304.853377]  driver_detach+0x44/0x90
[  304.853382]  bus_remove_driver+0x55/0xe0
[  304.853387]  pci_unregister_driver+0x3b/0x90
[  304.853393]  amdgpu_exit+0x11/0x69 [amdgpu]
[  304.853540]  __x64_sys_delete_module+0x142/0x260
[  304.853548]  ? exit_to_user_mode_prepare+0x3e/0x190
[  304.853555]  do_syscall_64+0x38/0x90
[  304.853562]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Signed-off-by: YiPeng Chai 


The subject line should probably read "when unloading amdgpu", but apart 
from that good catch.


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 0b598b510bd8..eb63324c30d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -829,7 +829,7 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev)
kfree(rsv);
  
list_for_each_entry_safe(rsv, temp, &mgr->reserved_pages, blocks) {
-   drm_buddy_free_list(&mgr->mm, &rsv->blocks);
+   drm_buddy_free_list(&mgr->mm, &rsv->allocated);
kfree(rsv);
}
drm_buddy_fini(&mgr->mm);
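
A simplified user-space sketch of why the one-word change matters, with hypothetical types (singly linked lists instead of the kernel's struct list_head): in the loop above, "blocks" is the member passed to list_for_each_entry_safe(), i.e. the link that chains each reservation into reserved_pages, while "allocated" is the list of blocks the reservation owns. Only the latter is a list of blocks that can be handed to the free routine.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; the kernel uses struct list_head. */
struct block {
	struct block *next;
};

struct reservation {
	struct reservation *next; /* link in the reserved list (role of 'blocks') */
	struct block *allocated;  /* blocks owned by this entry (role of 'allocated') */
};

/* Frees every block on the list and returns how many were freed. */
static int free_block_list(struct block **head)
{
	int n = 0;

	while (*head) {
		struct block *b = *head;

		*head = b->next;
		free(b);
		n++;
	}
	return n;
}

/* Tears down all reservations; returns the total number of blocks freed. */
static int free_reservations(struct reservation **head)
{
	int freed = 0;

	while (*head) {
		struct reservation *rsv = *head;

		/* Save the successor before freeing, as the _safe iterator does. */
		*head = rsv->next;
		/* Free the owned blocks, not the link that chains reservations. */
		freed += free_block_list(&rsv->allocated);
		free(rsv);
	}
	return freed;
}
```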




Re: [linux-next:master] BUILD REGRESSION 459c73db4069c27c1d4a0e20d055b837396364b8

2022-12-16 Thread Vincent Mailhol
On Tue. 15 Dec. 2022 at 22:57, kernel test robot  wrote:
> tree/branch: 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> branch HEAD: 459c73db4069c27c1d4a0e20d055b837396364b8  Add linux-next 
> specific files for 20221215
>
> Error/Warning reports:

(...)

> Documentation/networking/devlink/etas_es58x.rst: WARNING: document isn't 
> included in any toctree

A patch for this warning is on its way:
  
https://lore.kernel.org/linux-next/20221213051136.721887-1-mailhol.vinc...@wanadoo.fr/T/#u

(...)

Yours sincerely,
Vincent Mailhol