[PATCH] drm/amdgpu: add ras event id support

2024-03-14 Thread Yang Wang
Add amdgpu RAS event ID support to better distinguish different error information sources in dmesg logs. The following logs will be identified by event ID: {event_id} interrupt to inform RAS event {event_id} ACA logs {event_id} errors statistic since from current injection/error query {event_id} erro

RE: [PATCH] drm/amdgpu: add ras event id support

2024-03-14 Thread Zhou1, Tao
[AMD Official Use Only - General] Reviewed-by: Tao Zhou > -Original Message- > From: amd-gfx On Behalf Of Yang > Wang > Sent: Thursday, March 14, 2024 4:12 PM > To: amd-gfx@lists.freedesktop.org > Cc: Wang, Yang(Kevin) ; Zhang, Hawking > > Subject: [PATCH] drm/amdgpu: add ras event id

Re: [PATCH] drm/scheduler: fix null-ptr-deref in init entity

2024-03-14 Thread Christian König
On 13.03.24 at 22:20, vitaly.pros...@amd.com wrote: From: Vitaly Prosyak The bug can be triggered by sending an amdgpu_cs_wait_ioctl to the AMDGPU DRM driver on any ASICs with valid context. The bug was reported by Joonkyo Jung . For example the following code: static void Syzkaller2(int

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-14 Thread Khatri, Sunil
On 3/14/2024 11:40 AM, Sharma, Shashank wrote: On 14/03/2024 06:58, Khatri, Sunil wrote: On 3/14/2024 2:06 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: Add firmware version information of each IP and each instance where applicable. Is there a way we can sh

[PATCH] drm/amdgpu: Bypass display ta if it is harvested

2024-03-14 Thread Hawking Zhang
Display TA doesn't need to be loaded/invoked if it is harvested. Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c ind

RE: [PATCH] drm/amdgpu: Bypass display ta if it is harvested

2024-03-14 Thread Zhang, Hawking
[AMD Official Use Only - General] Copy @Deucher, Alexander for the awareness. Regards, Hawking -Original Message- From: Zhang, Hawking Sent: Thursday, March 14, 2024 18:36 To: amd-gfx@lists.freedesktop.org; Pillai, Aurabindo ; Feng, Kenneth Cc: Zhang

[PATCH v2 0/9] Add PM policy interfaces

2024-03-14 Thread Lijo Lazar
This series adds APIs to get the supported PM policies and also set them. A PM policy type is a predefined policy type supported by an SOC and each policy may define two or more levels to choose from. A user can select the appropriate level through amdgpu_dpm_set_pm_policy() or through sysfs node p

[PATCH v2 2/9] drm/amd/pm: Update PMFW messages for SMUv13.0.6

2024-03-14 Thread Lijo Lazar
Add PMFW message to select a Pstate policy in SOCs with SMU v13.0.6. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 3 ++- drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 ++- 2 files changed, 4 insertions(

[PATCH v2 4/9] drm/amd/pm: Add xgmi plpd policy to pm_policy

2024-03-14 Thread Lijo Lazar
Add support to set XGMI PLPD policy levels through pm_policy sysfs node. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/include/kgd_pp_interface.h | 1 + drivers/gpu/drm/amd/pm/amdgpu_pm.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/drivers/

[PATCH v2 5/9] drm/amd/pm: Add xgmi plpd to SMU v13.0.6 pm_policy

2024-03-14 Thread Lijo Lazar
On SOCs with SMU v13.0.6, allow changing xgmi plpd policy through pm_policy sysfs interface. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 19 ++- .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 51 +-- drivers/gpu/d

[PATCH v2 1/9] drm/amd/pm: Add support for DPM policies

2024-03-14 Thread Lijo Lazar
Add support to set/get information about different DPM policies. The support is only available on SOCs which use swsmu architecture. A DPM policy type may be defined with different levels. For example, a policy may be defined to select Pstate preference and then later a pstate preference may be ch

[PATCH v2 3/9] drm/amd/pm: Add support to select pstate policy

2024-03-14 Thread Lijo Lazar
Add support to select pstate policy in SOCs with SMUv13.0.6 Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 + .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 71 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 30 +

[PATCH v2 6/9] drm/amd/pm: Add xgmi plpd to aldebaran pm_policy

2024-03-14 Thread Lijo Lazar
On aldebaran, allow changing xgmi plpd policy through pm_policy sysfs interface. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 35 +++ 1 file changed, 35 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/ald

[PATCH v2 8/9] drm/amd/pm: Remove legacy interface for xgmi plpd

2024-03-14 Thread Lijo Lazar
Replace the legacy interface with amdgpu_dpm_set_pm_policy to set XGMI PLPD mode. Also, xgmi_plpd sysfs node is not used by any client. Remove that as well. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 4 +- drivers/gpu/drm/amd/pm/amd

[PATCH v2 7/9] drm/amd/pm: Add xgmi plpd to arcturus pm_policy

2024-03-14 Thread Lijo Lazar
On arcturus, allow changing xgmi plpd policy through pm_policy sysfs interface. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 7 ++-- .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 42 +++ 2 files changed, 46 insertion

[PATCH v2 9/9] drm/amd/pm: Remove unused interface to set plpd

2024-03-14 Thread Lijo Lazar
Remove unused callback to set PLPD policy and its implementation from arcturus, aldebaran and SMUv13.0.6 SOCs. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 6 --- .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 22 --- .../drm

Re: [PATCH] drm/amdgpu/vpe: power on vpe when hw_init

2024-03-14 Thread Alex Deucher
On Wed, Mar 13, 2024 at 9:33 PM Lee, Peyton wrote: > > [AMD Official Use Only - General] > > Hi Alex, > > There are two places where VPE will lose power: When there is a system call > to vpe_hw_fini(), and the vpe-thread finds that VPE has no jobs in the queue. > This patch is to make sure that V

RE: [PATCH] drm/amdgpu: Bypass display ta if it is harvested

2024-03-14 Thread Zhang, Hawking
[AMD Official Use Only - General] Never mind. There is a helper function to check if display hardware is available. I will move to the helper in v2. Thanks @Wang, Yang(Kevin) for the reminder. Regards, Hawking From: amd-gfx On Behalf Of Zhang, Hawking Sent: Thurs

Re: [RFC PATCH v4 02/42] drm: Add helper for conversion from signed-magnitude

2024-03-14 Thread Melissa Wen
On 02/26, Harry Wentland wrote: > CTM values are defined as signed-magnitude values. Add > a helper that converts from CTM signed-magnitude fixed > point value to the twos-complement value used by > drm_fixed. > > Signed-off-by: Harry Wentland > --- > include/drm/drm_fixed.h | 18 +++
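The conversion described in this patch can be illustrated with a small userspace sketch. It assumes a 64-bit signed-magnitude layout with the sign in bit 63 and the magnitude in the remaining bits; the function name `sm_to_twos_complement` is hypothetical and is not the helper added by the patch.

```c
#include <stdint.h>

/*
 * Convert a 64-bit signed-magnitude value to two's complement.
 * Assumed layout: bit 63 is the sign, bits 62..0 are the magnitude.
 * Hypothetical illustration, not the kernel helper itself.
 */
static int64_t sm_to_twos_complement(uint64_t a)
{
	const uint64_t sign_bit = (uint64_t)1 << 63;
	/* The magnitude fits in 63 bits, so the cast cannot overflow. */
	int64_t magnitude = (int64_t)(a & ~sign_bit);

	/* Negating a 63-bit magnitude is always representable. */
	return (a & sign_bit) ? -magnitude : magnitude;
}
```

Note that signed-magnitude has two encodings of zero; in this sketch both map to 0.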

Re: [PATCH] drm/amdgpu: Bypass display ta if it is harvested

2024-03-14 Thread Alex Deucher
On Thu, Mar 14, 2024 at 6:47 AM Hawking Zhang wrote: > > Display TA doesn't need to be loaded/invoked if it > is harvested. > > Signed-off-by: Hawking Zhang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 18 ++ > 1 file changed, 18 insertions(+) > > diff --git a/drivers/gpu/dr

[PATCH] drm/amdgpu: Bypass display ta if display hw is not available

2024-03-14 Thread Hawking Zhang
Do not load/invoke display TA if display hardware is not available Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c i

Re: [PATCH] drm/amdgpu: Bypass display ta if display hw is not available

2024-03-14 Thread Alex Deucher
On Thu, Mar 14, 2024 at 9:18 AM Hawking Zhang wrote: > > Do not load/invoke display TA if display hardware is not > available > > Signed-off-by: Hawking Zhang Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 18 ++ > 1 file changed, 18 insertions(+)

RE: [PATCH] drm/amdgpu: Bypass display ta if it is harvested

2024-03-14 Thread Zhang, Hawking
[AMD Official Use Only - General] Hi Alex, Please check my comments inline. Please also check v2 where I switch to an existing helper to check if display hardware is available. Regards, Hawking -Original Message- From: Alex Deucher Sent: Thursday, March 14, 2024 21:17 To: Zhang, Hawk

RE: [PATCH] drm/amdgpu: Bypass display ta if display hw is not available

2024-03-14 Thread Zhang, Hawking
[AMD Official Use Only - General] Thanks Alex! Regards, Hawking -Original Message- From: Alex Deucher Sent: Thursday, March 14, 2024 21:23 To: Zhang, Hawking Cc: amd-gfx@lists.freedesktop.org; Pillai, Aurabindo ; Feng, Kenneth ; Deucher, Alexander ; Wang, Yang(Kevin) Subject: Re: [

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-14 Thread Alex Deucher
On Thu, Mar 14, 2024 at 2:10 AM Sharma, Shashank wrote: > > > On 14/03/2024 06:58, Khatri, Sunil wrote: > > > > On 3/14/2024 2:06 AM, Alex Deucher wrote: > >> On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri > >> wrote: > >>> Add firmware version information of each > >>> IP and each instance where a

Re: [RFC PATCH v4 01/42] drm: Don't treat 0 as -1 in drm_fixp2int_ceil

2024-03-14 Thread Melissa Wen
On 02/26, Harry Wentland wrote: > Unit testing this in VKMS shows that passing 0 into > this function returns -1, which is highly counter- > intuitive. Fix it by checking whether the input is > >= 0 instead of > 0. > > Fixes: 64566b5e767f9 ("drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil")
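The off-by-one described in this patch can be reproduced in a small userspace sketch. It assumes a 32.32 fixed-point format; all names below (`fixp2int`, `fixp2int_ceil_old`, `fixp2int_ceil_new`) are hypothetical stand-ins rather than the kernel functions, and the handling of negative inputs is an assumption. Only the `> 0` versus `>= 0` comparison comes from the patch description.

```c
#include <stdint.h>

#define FIXED_POINT      32                       /* assumed 32.32 format */
#define FIXED_ONE        ((int64_t)1 << FIXED_POINT)
#define FIXED_ALMOST_ONE (FIXED_ONE - 1)          /* one epsilon below 1.0 */

/* Floor conversion to integer, written as a division so that it does not
 * depend on how the compiler shifts negative values. */
static int fixp2int(int64_t a)
{
	int64_t q = a / FIXED_ONE;

	if (a % FIXED_ONE < 0)
		q--;
	return (int)q;
}

/* Behaviour before the fix: zero falls into the negative branch. */
static int fixp2int_ceil_old(int64_t a)
{
	if (a > 0)
		return fixp2int(a + FIXED_ALMOST_ONE);
	return fixp2int(a - FIXED_ALMOST_ONE);
}

/* Behaviour after the fix: zero is handled by the non-negative branch. */
static int fixp2int_ceil_new(int64_t a)
{
	if (a >= 0)
		return fixp2int(a + FIXED_ALMOST_ONE);
	return fixp2int(a - FIXED_ALMOST_ONE);
}
```

With an input of 0, the old variant subtracts almost one and floors the result to -1, while the new variant adds almost one and floors to 0.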

RE: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization

2024-03-14 Thread Deucher, Alexander
[Public] > -Original Message- > From: Qiang Ma > Sent: Wednesday, March 13, 2024 2:18 AM > To: Deucher, Alexander ; Koenig, Christian > ; Pan, Xinhui ; > airl...@gmail.com; dan...@ffwll.ch; SHANMUGAM, SRINIVASAN > ; sunran...@208suo.com > Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-14 Thread Alex Deucher
On Thu, Mar 14, 2024 at 1:44 AM Khatri, Sunil wrote: > > > On 3/14/2024 1:58 AM, Alex Deucher wrote: > > On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: > >> Add all the IP's information on a SOC to the > >> devcoredump. > >> > >> Signed-off-by: Sunil Khatri > >> --- > >> drivers/gpu/drm/a

Re: [PATCH] drm/amdgpu: correct the KGQ fallback message

2024-03-14 Thread Deucher, Alexander
[Public] Reviewed-by: Alex Deucher From: Liang, Prike Sent: Wednesday, March 13, 2024 5:29 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Liang, Prike Subject: [PATCH] drm/amdgpu: correct the KGQ fallback message Fix the KGQ fallback function n

[PATCH 2/2] drm/amdkfd: Check preemption status on all XCDs

2024-03-14 Thread Mukul Joshi
This patch adds the following functionality: - Check the queue preemption status on all XCDs in a partition for GFX 9.4.3. - Update the queue preemption debug message to print the queue doorbell id for which preemption failed. - Change the signature of check preemption failed function to retu

[PATCH 1/2] drm/amdkfd: Rename read_doorbell_id in MQD functions

2024-03-14 Thread Mukul Joshi
Rename read_doorbell_id function to a more meaningful name, implying what it is used for. No functional change. Suggested-by: Jay Cornwall Signed-off-by: Mukul Joshi --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h | 2 +- d

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-14 Thread Khatri, Sunil
On 3/14/2024 8:12 PM, Alex Deucher wrote: On Thu, Mar 14, 2024 at 1:44 AM Khatri, Sunil wrote: On 3/14/2024 1:58 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drive

RE: [PATCH] drm/amdgpu/: Remove bo_create_kernel_at path from virt page

2024-03-14 Thread Luo, Zhigang
[AMD Official Use Only - General] Reviewed-by: Zhigang Luo -Original Message- From: Skvortsov, Victor Sent: Tuesday, March 12, 2024 1:51 PM To: Skvortsov, Victor ; Luo, Zhigang ; amd-gfx@lists.freedesktop.org Cc: Koenig, Christian Subject: [PATCH] drm/amdgpu/: Remove bo_create_kernel_

RE: [PATCH] drm/amdgpu: Skip virt_exchange_init on SDMA poison consumption

2024-03-14 Thread Luo, Zhigang
[AMD Official Use Only - General] Reviewed-by: Zhigang Luo -Original Message- From: Skvortsov, Victor Sent: Tuesday, March 12, 2024 10:09 PM To: Skvortsov, Victor ; Chai, Thomas ; Zhang, Hawking ; Luo, Zhigang ; amd-gfx@lists.freedesktop.org Cc: Skvortsov, Victor Subject: [PATCH] drm

[PATCH] drm/amdgpu: trigger flr_work if reading pf2vf data failed

2024-03-14 Thread Zhigang Luo
If reading pf2vf data fails 5 times in a row, something is wrong and flr_work needs to be triggered to recover. Also use dev_err to print the error message so that it shows which device has the issue, and add a warning message if waiting for IDH_FLR_NOTIFICATION_CMPL times out. Signed-off-by: Zhigang Luo
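The "five consecutive failures" policy in this description can be sketched in plain C. This is only an illustration of the counting logic under stated assumptions: `struct vf_state`, `vf_note_read_result`, and `MAX_CONSECUTIVE_FAILURES` are hypothetical names, and setting a flag stands in for scheduling flr_work in the driver.

```c
#include <stdbool.h>

#define MAX_CONSECUTIVE_FAILURES 5   /* threshold taken from the patch description */

/* Hypothetical per-device bookkeeping. */
struct vf_state {
	int failures;           /* consecutive failed pf2vf reads */
	bool recovery_queued;   /* stands in for scheduling flr_work */
};

/* Record the outcome of one read. Returns true when recovery is triggered. */
static bool vf_note_read_result(struct vf_state *s, bool read_ok)
{
	if (read_ok) {
		s->failures = 0;    /* a success breaks the failure streak */
		return false;
	}

	if (++s->failures < MAX_CONSECUTIVE_FAILURES)
		return false;

	s->failures = 0;            /* start counting again after recovery */
	s->recovery_queued = true;
	return true;
}
```

Resetting the counter on every successful read is what makes the threshold apply to consecutive failures rather than to a running total.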

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-14 Thread Alex Deucher
On Thu, Mar 14, 2024 at 12:16 PM Khatri, Sunil wrote: > > > On 3/14/2024 8:12 PM, Alex Deucher wrote: > > On Thu, Mar 14, 2024 at 1:44 AM Khatri, Sunil wrote: > >> > >> On 3/14/2024 1:58 AM, Alex Deucher wrote: > >>> On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: > Add all the IP's inf

Re: [PATCH 2/2] drm/amdkfd: Check preemption status on all XCDs

2024-03-14 Thread Felix Kuehling
On 2024-03-14 12:00, Mukul Joshi wrote: This patch adds the following functionality: - Check the queue preemption status on all XCDs in a partition for GFX 9.4.3. - Update the queue preemption debug message to print the queue doorbell id for which preemption failed. - Change the signature o

RE: [PATCH 2/2] drm/amdkfd: Check preemption status on all XCDs

2024-03-14 Thread Joshi, Mukul
[AMD Official Use Only - General] From: Kuehling, Felix Sent: Thursday, March 14, 2024 2:39 PM To: Joshi, Mukul ; amd-gfx@lists.freedesktop.org Cc: Cornwall, Jay Subject: Re: [PATCH 2/2] drm/amdkfd: Check preemption status on all XCDs On 2024-03-14 12:00, Mukul Joshi wrote: This patch adds t

RE: [PATCH v2 1/9] drm/amd/pm: Add support for DPM policies

2024-03-14 Thread Deucher, Alexander
[Public] > -Original Message- > From: Lazar, Lijo > Sent: Thursday, March 14, 2024 7:56 AM > To: amd-gfx@lists.freedesktop.org > Cc: Zhang, Hawking ; Deucher, Alexander > ; Liu, Shuzhou (Bill) > > Subject: [PATCH v2 1/9] drm/amd/pm: Add support for DPM policies > > Add support to set/get

Re: [PATCH] drm/amdgpu/vpe: power on vpe when hw_init

2024-03-14 Thread Lee, Peyton
[AMD Official Use Only - General] Hi Alex > I think it will continue to be powered up until a VPE job comes in and > completes and the idle handler gets scheduled. If a VPE job doesn't come in, > it will stay powered up I think. Yes, correct. And after the VPE is called to do initialization, a

Re: Proposal to add CRIU support to DRM render nodes

2024-03-14 Thread Felix Kuehling
On 2024-03-12 5:45, Tvrtko Ursulin wrote: On 11/03/2024 14:48, Tvrtko Ursulin wrote: Hi Felix, On 06/12/2023 21:23, Felix Kuehling wrote: Executive Summary: We need to add CRIU support to DRM render nodes in order to maintain CRIU support for ROCm application once they start relying on re

[PATCH] drm/sched: fix null-ptr-deref in init entity

2024-03-14 Thread vitaly.prosyak
From: Vitaly Prosyak The bug can be triggered by sending an amdgpu_cs_wait_ioctl to the AMDGPU DRM driver on any ASICs with valid context. The bug was reported by Joonkyo Jung . For example the following code: static void Syzkaller2(int fd) { union drm_amdgpu_ctx arg1; un

[PATCH] drm/amdgpu: Fix the iounmap error of rmmio

2024-03-14 Thread Ma Jun
Setting the rmmio pointer to NULL to fix the following iounmap error and calltrace. iounmap: bad address d0b3631f Fixes: 923f7a82d2e1 ("drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()") Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7

[PATCH 2/2] drm/amd/pm: Use metric table for pcie speed/width

2024-03-14 Thread Asad Kamal
Report pcie link speed/width using metric table in case of one vf & if pmfw support is available, else report directly from registers in case of pf. Skip reporting it for other cases. Signed-off-by: Asad Kamal --- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 14 +- 1 file cha

[PATCH 1/2] drm/amd/pm: Update SMUv13.0.6 PMFW headers

2024-03-14 Thread Asad Kamal
Update PMFW interface headers for updated metrics table with pcie link speed and pcie link width Signed-off-by: Asad Kamal --- drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_pmfw.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw

Re: [PATCH] drm/amdgpu: Fix the iounmap error of rmmio

2024-03-14 Thread SRINIVASAN SHANMUGAM
On 3/15/2024 10:47 AM, Ma Jun wrote: Setting the rmmio pointer to NULL to fix the following iounmap error and calltrace. iounmap: bad address d0b3631f Fixes: 923f7a82d2e1 ("drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()") Signed-off-by: Ma Jun Acked-by:

[PATCH] drm/amd/amdgpu: add pipe1 hardware support

2024-03-14 Thread ZhenGuo Yin
Enable pipe1 support starting from SIENNA CICHLID asic. Need to use correct ref/mask for pipe1 hdp flush. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2117 Fixes: 085292c3d780 ("Revert "drm/amd/amdgpu: add pipe1 hardware support"") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgp

[PATCH] drm/amd/pm set pp_dpm_*clk as read only for SRIOV one VF mode

2024-03-14 Thread Lin . Cao
pp_dpm_*clk should be set as read only for SRIOV one VF mode, remove S_IWUGO flag and _store function of these debugfs in one VF mode. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/p

Re: [PATCH 2/2] drm/amd/pm: Use metric table for pcie speed/width

2024-03-14 Thread Lazar, Lijo
On 3/15/2024 11:11 AM, Asad Kamal wrote: > Report pcie link speed/width using metric table in case > of one vf & if pmfw support is available, else report directly from > registers in case of pf. Skip reporting it for other cases. > > Signed-off-by: Asad Kamal > --- > .../gpu/drm/amd/pm/swsmu

RE: [PATCH 2/2] drm/amd/pm: Use metric table for pcie speed/width

2024-03-14 Thread Kamal, Asad
[AMD Official Use Only - General] -Original Message- From: Lazar, Lijo Sent: Friday, March 15, 2024 12:25 PM To: Kamal, Asad ; amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Ma, Le ; Zhang, Morris ; Cheung, Donald ; Khatir, Sepehr ; Oliveira, Daniel ; Poag, Charis Subject: Re: [PA