On 6/7/2024 12:31 PM, SRINIVASAN SHANMUGAM wrote:
>
> On 6/6/2024 10:58 PM, Lazar, Lijo wrote:
>> On 6/6/2024 5:35 PM, Srinivasan Shanmugam wrote:
>>> Previously, this check was performed in the gfx_v9_4_3_sw_init function,
>>> and the amdgpu_gfx_sysfs_compute
On 6/6/2024 9:13 PM, Eric Huang wrote:
> amdgpu_job_ring may return NULL, which causes a kernel NULL
> pointer error. Use another way to print the ring name instead
> of ring->name.
>
> Suggested-by: Lijo Lazar
> Signed-off-by: Eric Huang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
>
On 6/6/2024 5:35 PM, Srinivasan Shanmugam wrote:
> Previously, this check was performed in the gfx_v9_4_3_sw_init function,
> and the amdgpu_gfx_sysfs_compute_init function was only called if the
> GPU was not a VF in SR-IOV mode. This is because the sysfs entries
> created by
On 6/3/2024 11:42 PM, Eric Huang wrote:
> reset cause is requested by customer as additional
> info for gpu reset smi event.
>
> v2: integrate reset sources suggested by Lijo Lazar
>
> Signed-off-by: Eric Huang
This series is
Reviewed-by: Lijo Lazar
I think SMI needs to get all
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Eric,
To consider other reset cases also, you may have something like attached.
Thanks,
Lijo
-Original Message-
From: amd-gfx On Behalf Of Eric Huang
Sent: Friday, May 31, 2024 8:38 PM
To: amd-gfx@lists.freedesktop.org
Cc:
On 5/28/2024 5:20 PM, Arnd Bergmann wrote:
> From: Arnd Bergmann
>
> The amdkfd support fails to link when CONFIG_CRC16 is disabled:
>
> aarch64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_topology.o: in function
> `kfd_topology_add_device':
> kfd_topology.c:(.text+0x3a4c): undefined reference
Thanks,
Lijo
> Thanks,
> Victor
>
> -Original Message-
> From: Lazar, Lijo
> Sent: Wednesday, May 22, 2024 2:14 PM
> To: Zhao, Victor ; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH] drm/amd/amdgpu: fix the inst passed to
> amdgpu_virt_rlcg_reg_rw
>
> [AMD Off
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Victor,
Could you check if an approach like the attached one helps?
Thanks,
Lijo
-Original Message-
From: Zhao, Victor
Sent: Wednesday, May 22, 2024 11:13 AM
To: Zhao, Victor ; amd-gfx@lists.freedesktop.org; Lazar,
Lijo
On 5/22/2024 7:49 AM, Zhang, Jesse(Jie) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Lijo
>
> -Original Message-
> From: Lazar, Lijo
> Sent: Tuesday, May 21, 2024 4:20 PM
> To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org
On 5/21/2024 12:46 PM, Jesse Zhang wrote:
> Since the type of data_size is uint32_t, adev->umsch_mm.data_size - 1 >> 16 >> 16
> is 0 regardless of the values of its operands
>
> So removing the operations upper_32_bits and lower_32_bits.
>
> Signed-off-by: Jesse Zhang
> Suggested-by: Tim
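
The point in the quoted message above can be checked in user space. This is only a minimal sketch: the helper names are hypothetical stand-ins that mirror the double 16-bit shift mentioned in the message, not the kernel's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the ">> 16 >> 16" shift from the quoted
 * message; this is not a kernel symbol. */
uint32_t upper_half_of_u32(uint32_t v)
{
    /* v is only 32 bits wide, so shifting out 32 bits leaves nothing. */
    return (v >> 16) >> 16;
}

/* The low 32 bits of a 32-bit value are the value itself. */
uint32_t lower_half_of_u32(uint32_t v)
{
    return v;
}

/* Self-check: splitting (data_size - 1) into halves adds no information. */
void check_split_is_redundant(uint32_t data_size)
{
    uint32_t last = data_size - 1; /* wraps to UINT32_MAX when data_size is 0 */

    assert(upper_half_of_u32(last) == 0);
    assert(lower_half_of_u32(last) == last);
}
```

Since the upper half is always zero for a 32-bit operand, writing the value directly is equivalent, which appears to be what the patch does.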
On 5/21/2024 1:07 PM, Srinivasan Shanmugam wrote:
> This commit fixes a format truncation issue caused by the snprintf
> function potentially writing more characters into the ring->name buffer
> than it can hold, in the amdgpu_gfx_kiq_init_ring function
>
> The issue occurred because the
On 5/21/2024 10:13 AM, Srinivasan Shanmugam wrote:
> This commit fixes a format truncation issue caused by the snprintf
> function potentially writing more characters into the ring->name buffer
> than it can hold, in the amdgpu_gfx_kiq_init_ring function
>
> The issue occurred because the '%d'
On 5/20/2024 4:44 PM, Victor Zhao wrote:
> The inst passed to reg read/write should be the physical instance.
> Fix the mismatched code.
>
> Signed-off-by: Victor Zhao
> ---
> .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c | 6 ++---
> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 2 +-
>
On 5/20/2024 10:31 AM, Asad Kamal wrote:
> Remove gpu_metrics_v1_6 usage for SMUv13.0.6 temporarily and use
> gpu_metrics_v1_5 until tool support is ready for it.
>
> This reverts commit e6efb71ae640eada28f44cc97aa79e8ae4901e63.
>
> Signed-off-by: Asad Kamal
Series is
Reviewed-by:
On 5/14/2024 4:35 PM, Lijo Lazar wrote:
> This series adds APIs to get the supported PM policies and also set them. A PM
> policy type is a predefined policy type supported by an SOC and each policy
> may
> define two or more levels to choose from. A user can select the appropriate
> level
On 5/14/2024 12:28 PM, Jesse Zhang wrote:
> To avoid warning problems, drop index and
> use PPSMC_MSG_GfxDriverReset instead of index for aldebaran.
>
> Signed-off-by: Jesse Zhang
> Suggested-by: Lijo Lazar
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 13 +++--
> 1
On 5/14/2024 12:37 PM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> -Original Message-
> From: amd-gfx On Behalf Of Lazar, Lijo
> Sent: Tuesday, May 14, 2024 2:07 PM
> To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop
On 5/14/2024 9:43 AM, Ma Jun wrote:
> Drop hard-code value of nsTmax because we read this
> value from fantable below.
>
> Signed-off-by: Ma Jun
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 2 --
> 1 file changed, 2
On 5/14/2024 9:42 AM, Ma Jun wrote:
> Check ras_manager before using it
>
> Signed-off-by: Ma Jun
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git
On 5/14/2024 11:34 AM, Jesse Zhang wrote:
> To avoid warning problems, drop index and
> use PPSMC_MSG_GfxDriverReset instead of index for aldebaran.
>
> Signed-off-by: Jesse Zhang
> Suggested-by: Lijo Lazar
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 13 +++--
> 1
On 5/14/2024 6:30 AM, Ma, Jun wrote:
> Hi Lijo & Kevin, thanks for review, will drop this patch
>
In the original function below check is there.
if (!handle || !info || type >= ACA_ERROR_TYPE_COUNT)
return -EINVAL;
So moving this to a later stage is still valid.
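
A minimal sketch of the pattern discussed above, assuming hypothetical stand-in types and names (none of these are the driver's real definitions): the arguments are validated first, and anything derived from the pointer is computed only after that check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the driver's real types. */
enum err_type { ERR_TYPE_A, ERR_TYPE_B, ERR_TYPE_COUNT };
struct err_info { int count; };
struct err_handle { int totals[ERR_TYPE_COUNT]; };

int log_error(struct err_handle *handle, enum err_type type,
              const struct err_info *info)
{
    int *total; /* deliberately not initialized from handle: it is unchecked here */

    if (!handle || !info || type >= ERR_TYPE_COUNT)
        return -EINVAL;

    /* Safe: both pointers and the index were validated above. */
    total = &handle->totals[type];
    *total += info->count;
    return 0;
}

/* Self-check covering the rejected and accepted paths. */
void log_error_demo(void)
{
    struct err_handle h = { { 0 } };
    struct err_info i = { 3 };

    assert(log_error(NULL, ERR_TYPE_A, &i) == -EINVAL);
    assert(log_error(&h, ERR_TYPE_COUNT, &i) == -EINVAL);
    assert(log_error(&h, ERR_TYPE_B, &i) == 0);
    assert(h.totals[ERR_TYPE_B] == 3);
}
```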
On 5/10/2024 8:20 AM, Jesse Zhang wrote:
> Check for specific indexes that may be invalid values.
>
> Signed-off-by: Jesse Zhang
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git
On 5/13/2024 4:27 PM, Lazar, Lijo wrote:
>
>
> On 5/10/2024 8:20 AM, Jesse Zhang wrote:
>> Check for specific indexes that may be invalid values.
>>
>> Signed-off-by: Jesse Zhang
>> ---
>> drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
On 5/10/2024 8:20 AM, Jesse Zhang wrote:
> Check for specific indexes that may be invalid values.
>
> Signed-off-by: Jesse Zhang
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git
On 5/13/2024 2:26 PM, Ma Jun wrote:
> Check handle pointer before using it
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c | 6 +-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c
>
On 5/13/2024 2:26 PM, Ma Jun wrote:
> Check ras_manager before using it
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 +++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>
On 5/13/2024 9:44 AM, Ori Messinger wrote:
> This patch adds 'ring hang' events to the driver.
> This is done by adding a 'reset_ring_hang' bool variable to the
> struct 'amdgpu_reset_context' in the amdgpu_reset.h file.
> The purpose for this 'reset_ring_hang' variable is whenever a GPU
>
On 5/10/2024 1:36 AM, Harish Kasiviswanathan wrote:
> gpu_id needs to be unique for user space to identify GPUs via KFD
> interface. In the current implementation there is a very small
> probability of having non unique gpu_ids.
>
> v2: Add check to confirm if gpu_id is unique. If not unique,
On 5/10/2024 1:56 PM, Jesse Zhang wrote:
> Checks the partition mode and returns an error for an invalid mode.
>
> Signed-off-by: Jesse Zhang
> Suggested-by: Lijo Lazar
> ---
> drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 7 +++
> 1 file changed, 7 insertions(+)
>
> diff --git
On 5/10/2024 1:09 PM, Zhang, Jesse(Jie) wrote:
> [AMD Official Use Only - General]
>
> Hi Lijo,
>
> -Original Message-
> From: amd-gfx On Behalf Of Lazar, Lijo
> Sent: Friday, May 10, 2024 3:16 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: Re: [P
On 5/10/2024 8:20 AM, Jesse Zhang wrote:
> Dividing expression num_xcc_per_xcp which may be zero has undefined behavior.
>
> Signed-off-by: Jesse Zhang
> ---
> drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git
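
For illustration only, a guard of the kind described in the quoted message could look like the sketch below. The helper name and its error convention are assumptions, not the code from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: returns the number of partitions, or -EINVAL when
 * the divisor is not usable. */
int xcp_count(int num_xcc, int num_xcc_per_xcp)
{
    /* Reject zero (and negative) divisors before dividing. */
    if (num_xcc_per_xcp <= 0)
        return -EINVAL;

    return num_xcc / num_xcc_per_xcp;
}

/* Self-check for the valid and the rejected case. */
void xcp_count_demo(void)
{
    assert(xcp_count(8, 2) == 4);
    assert(xcp_count(8, 0) == -EINVAL);
}
```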
On 5/7/2024 10:14 PM, Srinivasan Shanmugam wrote:
> This commit fixes potential truncation when writing the string _imu.bin
> into the fw_name buffer in the imu_v12_0_init_microcode function in the
> imu_v12_0.c file
>
> The ucode_prefix size was reduced from 30 to 15 to ensure the snprintf
>
On 5/7/2024 10:14 PM, Srinivasan Shanmugam wrote:
> This commit addresses multiple warnings in the gfx_v12_0_init_microcode
> function in the gfx_v12_0.c file. The warnings were related to potential
> truncation when writing the strings _pfp.bin, _me.bin, _rlc.bin, and
> _mec.bin into the
On 5/4/2024 3:36 AM, Harish Kasiviswanathan wrote:
> gpu_id needs to be unique for user space to identify GPUs via KFD
> interface. In the current implementation there is a very small
> probability of having non unique gpu_ids.
>
> v2: Add check to confirm if gpu_id is unique. If not unique,
On 5/7/2024 6:00 AM, Harry Wentland wrote:
> This patch is causing crashes of Manor Lords on my Navi 21 on the 6.8.9
> stable kernel. It leads to an assertion failure in wine:
>
> File: ../src-wine/dlls/winevulkan/loader_thunks.c
> Line: 3621
>
> Expression "!status && vkEndCommandBuffer""
>
On 5/2/2024 7:01 PM, Asad Kamal wrote:
> Validate tbo resource pointer, skip if NULL
>
> Signed-off-by: Asad Kamal
> Reviewed-by: Christian König
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
> 1 file changed, 2 insertions(+), 1
On 4/26/2024 9:27 AM, Yunxiang Li wrote:
> Sometimes a hung GPU causes multiple reset sources to schedule resets.
> The second source will be able to trigger an unnecessary reset if they
> schedule after we call amdgpu_device_stop_pending_resets.
>
> Move amdgpu_device_stop_pending_resets to
On 4/30/2024 7:53 PM, Zhigang Luo wrote:
> VF can't access FB when host is doing mode1 reset. Using sizeof to get
> vf2pf info size, instead of reading it from vf2pf header stored in FB.
>
> Signed-off-by: Zhigang Luo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 2 +-
> 1 file changed,
On 4/28/2024 12:38 PM, YiPeng Chai wrote:
> Add mutex to protect ras shared memory.
>
> Signed-off-by: YiPeng Chai
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 121 ++---
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h| 1 +
>
On 4/25/2024 3:53 PM, Sunil Khatri wrote:
> Do not dump the ip registers during driver reload
> in passthrough environment.
>
> Signed-off-by: Sunil Khatri
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git
On 4/25/2024 3:30 PM, Ma Jun wrote:
> Initialize the phy_id to 0 to fix the warning of
> "Using uninitialized value phy_id"
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 6 +-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git
On 4/25/2024 12:05 PM, Srinivasan Shanmugam wrote:
> The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating
> warnings about potential truncation of output when using the snprintf function.
> The issue was due to the size of the buffer 'ucode_prefix' being too
> small to accommodate the
On 4/23/2024 7:13 AM, Srinivasan Shanmugam wrote:
> The buffer size is determined by the declaration char fw_name[30]; This
> means fw_name can hold up to 30 characters, including the null character
> that marks the end of the string.
>
> The string to be written is "amdgpu/%s_mec.bin" or
On 4/23/2024 1:15 AM, Yunxiang Li wrote:
> Reset request from KFD is missing a check for if a reset is already in
> progress, this causes a second reset to be triggered right after the
> previous one finishes. Add the check to align with the other reset sources.
>
> Signed-off-by: Yunxiang Li
On 4/19/2024 9:14 PM, Srinivasan Shanmugam wrote:
> This commit addresses buffer overflow in the smu_v14_0_init_microcode
> function. The issue was about the snprintf function writing more bytes
> into the fw_name buffer than it can hold.
>
> The line of code is:
>
> snprintf(fw_name,
On 4/22/2024 4:52 PM, Christian König wrote:
> Am 22.04.24 um 11:37 schrieb Lazar, Lijo:
>>
>> On 4/22/2024 2:59 PM, Christian König wrote:
>>> Am 22.04.24 um 10:47 schrieb Jack Xiao:
>>>> Delete fence fallback timer to fix the random
>>>>
On 4/22/2024 3:09 PM, Jack Xiao wrote:
> Delete fence fallback timer to fix the random
> use-after-free issue.
>
> v2: move to amdgpu_mes.c
>
> Signed-off-by: Jack Xiao
Acked-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 1 +
> 1 file changed, 1
On 4/22/2024 2:59 PM, Christian König wrote:
> Am 22.04.24 um 10:47 schrieb Jack Xiao:
>> Delete fence fallback timer to fix the random
>> use-after-free issue.
>
> That's already done in amdgpu_fence_driver_hw_fini() and absolutely
> shouldn't be in amdgpu_ring_fini().
>
> And the
On 4/22/2024 11:23 AM, Le Ma wrote:
> To adapt to different gc versions in gfx_v9_4_3.c file.
>
> Change-Id: Ib4465aade0dcbbcc43318c6dc865f813c5411097
> Signed-off-by: Le Ma
> Reviewed-by: Hawking Zhang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
On 4/22/2024 6:42 AM, Rajneesh Bhardwaj wrote:
> Tune coarse grain clock gating idle threshold and rlc idle timeout to
> achieve better kernel launch latency.
>
> Signed-off-by: Rajneesh Bhardwaj
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 8
). If that happens, then the purpose of
the patch - to get the context of a device hang - is lost. We may not even get
a proper dmesg log.
Thanks,
Lijo
-Original Message-
From: Khatri, Sunil
Sent: Wednesday, April 17, 2024 9:42 PM
To: Lazar, Lijo ; Alex Deucher ;
Khatri, Sunil
Cc
On 4/17/2024 9:21 PM, Alex Deucher wrote:
> On Wed, Apr 17, 2024 at 5:38 AM Sunil Khatri wrote:
>>
>> Adding gfx10 gc registers to be used for register
>> dump via devcoredump during a gpu reset.
>>
>> Signed-off-by: Sunil Khatri
>
> Reviewed-by: Alex Deucher
>
>> ---
>>
On 4/17/2024 11:23 AM, Ma Jun wrote:
> gpu_od should be removed if it's an empty directory
>
> Signed-off-by: Ma Jun
> Reported-by: Yang Wang
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 7 +++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>
On 4/17/2024 3:10 PM, Ma Jun wrote:
> Print the od status info if it's not supported.
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>
On 4/17/2024 1:14 PM, Khatri, Sunil wrote:
>
> On 4/17/2024 1:06 PM, Khatri, Sunil wrote:
>> devcoredump is used to debug gpu hangs/resets. So in normal process
>> when there is a hang due to ring timeout or page fault we are doing a
>> hard reset as soft reset fail in those cases. How are we
On 4/17/2024 9:43 AM, Ahmad Rehman wrote:
> In passthrough environment, the driver triggers the mode-1 reset on
> reload. The reset causes the core dump collection which is delayed task
> and prevents driver from unloading until it is completed. Since we do
> not need to collect data on "reset
On 4/17/2024 12:05 AM, Ahmad Rehman wrote:
> In passthrough environment, the driver triggers the mode-1 reset on
> reload. The reset causes the core dump collection which is delayed task
> and prevents driver from unloading until it is completed. Since we do
> not need to collect data on "reset
[Public]
Is this applicable for aldebaran also?
Thanks,
Lijo
-Original Message-
From: amd-gfx On Behalf Of Hawking Zhang
Sent: Tuesday, April 16, 2024 11:46 AM
To: amd-gfx@lists.freedesktop.org; Zhou1, Tao
Cc: Zhang, Hawking
Subject: [PATCH] drm/amdgpu: Use driver mode reset for data
On 4/3/2024 8:27 AM, Ma Jun wrote:
> Refactor the runtime pm mode detection code to support both the
> amdgpu_runtime_pm=2 and amdgpu_runtime_pm=1 cases
>
> Signed-off-by: Ma Jun
> Reviewed-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> v1->v2:
> - Fix logic and output info (Lijo)
> - Fix code
On 4/8/2024 10:50 PM, Alex Deucher wrote:
> Need to take the srbm_mutex and while we are here, use the
> helper function soc21_grbm_select();
>
> Signed-off-by: Alex Deucher
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 15 +--
> 1 file
On 4/3/2024 11:42 PM, Zhigang Luo wrote:
> 1. change AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT from 30 to 5.
> 2. set fatal error detected flag.
>
> Signed-off-by: Zhigang Luo
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
>
On 4/2/2024 4:00 PM, Lazar, Lijo wrote:
>
>
> On 4/2/2024 3:52 PM, Asad Kamal wrote:
>> Report max set uclk and sclk for smu_v_13_0_6
>>
>
> You may rephrase as
>
> "Use OD (pp_od_clk_voltage) interface to report current limits, default
>
On 4/2/2024 3:52 PM, Asad Kamal wrote:
> Update max set uclk and sclk reporting format for smu_v_13_0_0
>
Use aldebaran instead of smu v13.0.0 - both are different. You may also
add the description similar to patch 1.
With those updates,
Reviewed-by: Lijo Lazar
Thanks,
Lijo
>
On 4/2/2024 3:52 PM, Asad Kamal wrote:
> Report max set uclk and sclk for smu_v_13_0_6
>
You may rephrase as
"Use OD (pp_od_clk_voltage) interface to report current limits, default
or those set by user, for SCLK and UCLK."
Thanks,
Lijo
> Signed-off-by: Asad Kamal
> ---
>
On 3/29/2024 1:58 PM, Ma Jun wrote:
> Refactor the runtime pm mode detection code to support both the
> amdgpu_runtime_pm=2 and amdgpu_runtime_pm=1 cases
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 68 ++
On 4/1/2024 4:45 PM, Kamal, Asad wrote:
> [AMD Official Use Only - General]
>
> -Original Message-
> From: amd-gfx On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 8:06 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Deucher, Alexander
> ; Wang, Yang(Kevin)
>
On 3/27/2024 4:40 PM, Ma Jun wrote:
> Add a new runtime pm mode AMDGPU_RUNPM_BAMACO
> and related macro definition
>
> Signed-off-by: Ma Jun
Series is
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 4
> 1 file changed, 4 insertions(+)
>
>
On 3/28/2024 8:49 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - General]
>
> -Original Message-
> From: amd-gfx On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 11:06 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Deucher, Alexander
>
> Subject:
On 3/28/2024 8:57 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - General]
>
> -Original Message-
> From: amd-gfx On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 10:36 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Deucher, Alexander
> ; Wang,
On 3/26/2024 2:32 PM, Yang Wang wrote:
> add a new enumeration type to identify device attribute node,
> this method is relatively more efficient compared with 'strcmp' in
> update_attr() function.
>
> Signed-off-by: Yang Wang
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 4 +--
>
On 3/23/2024 1:27 AM, Zhigang Luo wrote:
> Signed-off-by: Zhigang Luo
> Change-Id: I2a98d513c26107ac76ecf20e951c188afbc7ede6
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 20
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 5 -
>
On 3/26/2024 2:59 PM, Lazar, Lijo wrote:
>
>
> On 3/25/2024 3:45 PM, Ma Jun wrote:
>> Optimize the code to add support for BAMACO mode checking
>>
>> Signed-off-by: Ma Jun
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +-
>>
On 3/25/2024 3:45 PM, Ma Jun wrote:
> Optimize the code to add support for BAMACO mode checking
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 74 +++--
>
On 3/25/2024 3:45 PM, Ma Jun wrote:
> Add support for MACO flag checking.
> MACO mode only works if BACO is supported.
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 ++--
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>
On 3/22/2024 12:33 PM, Srinivasan Shanmugam wrote:
> Reducing the size of ucode_prefix to 25 in the smu_v11_0_init_microcode
> function ensures that fw_name can accommodate the maximum possible
> string size
>
> Fixes the below with gcc W=1:
>
On 3/22/2024 12:24 PM, Srinivasan Shanmugam wrote:
> The total size of the fw_name buffer is 8 (for "amdgpu/") + 30 (for
> ucode_prefix) + 5 (for "_pfp") + 5 (for "_wks") + 5 (for ".bin") = 53
> characters.
>
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c: In function
On 3/22/2024 12:02 PM, Srinivasan Shanmugam wrote:
> Reducing the size of ucode_prefix to 25 in the gfx_v11_0_init_microcode
> function. This would ensure that the total number of characters being
> written into fw_name does not exceed its size of 40.
>
> Fixes the below with gcc W=1:
>
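
The sizing argument in the quoted message (a 25-byte ucode_prefix and a 40-byte fw_name) can be sanity-checked with a small user-space sketch. The "_pfp.bin" suffix and the helper name are assumptions for illustration; the check relies only on the standard snprintf return value, which is the length the full string would have had.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

#define FW_NAME_LEN      40 /* fw_name size quoted in the message */
#define UCODE_PREFIX_LEN 25 /* ucode_prefix size quoted in the message */

/* Hypothetical helper: returns 0 if the name fit into fw_name, or -1 if
 * snprintf failed or reported that the output was truncated. */
int build_fw_name(char *fw_name, size_t len,
                  const char *ucode_prefix, const char *suffix)
{
    int n = snprintf(fw_name, len, "amdgpu/%s%s", ucode_prefix, suffix);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Self-check: the longest prefix a 25-byte buffer can hold is 24 characters,
 * so "amdgpu/" (7) + 24 + "_pfp.bin" (8) + NUL (1) = 40 bytes, which fits. */
void build_fw_name_demo(void)
{
    char fw_name[FW_NAME_LEN];
    const char *max_prefix = "aaaaaaaa" "aaaaaaaa" "aaaaaaaa";      /* 24 chars */
    const char *old_prefix = "aaaaaaaa" "aaaaaaaa" "aaaaaaaa" "aaaaa"; /* 29 chars */

    assert(build_fw_name(fw_name, sizeof(fw_name), max_prefix, "_pfp.bin") == 0);
    assert(build_fw_name(fw_name, sizeof(fw_name), old_prefix, "_pfp.bin") == -1);
}
```

With a longer prefix buffer the worst case no longer fits in 40 bytes, which is the situation gcc W=1 warns about at compile time.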
On 3/22/2024 11:54 AM, Srinivasan Shanmugam wrote:
> The size of fw_name is increased to ensure that it can accommodate
> the maximum possible size of the string being written into it.
>
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c: In function ‘gfx_v9_0_early_init’:
On 3/21/2024 11:16 AM, Srinivasan Shanmugam wrote:
> The snprintf function is used to write a formatted string into fw_name.
> The format of the string is "amdgpu/%s_mes%s.bin", where the first %s is replaced
> by the string in ucode_prefix and the second %s is replaced by either
> "_2" or "1" depending
On 3/21/2024 10:29 AM, Srinivasan Shanmugam wrote:
> Reducing the size of ucode_prefix to 25 in the amdgpu_vcn_early_init
> function. This would ensure that the total number of characters being
> written into fw_name does not exceed its size of 40.
>
> Fixes the below with gcc W=1:
>
On 3/21/2024 12:28 PM, Ma, Jun wrote:
>
>
> On 3/20/2024 9:38 PM, Lazar, Lijo wrote:
>>
>>
>> On 3/20/2024 6:54 PM, Alex Deucher wrote:
>>> On Wed, Mar 20, 2024 at 6:17 AM Ma Jun wrote:
>>>>
>>>> Because of the logic error,
On 3/20/2024 8:28 PM, SRINIVASAN SHANMUGAM wrote:
>
> On 3/20/2024 3:12 PM, Lazar, Lijo wrote:
>>
>> On 3/20/2024 2:15 PM, Srinivasan Shanmugam wrote:
>>> The issue was present in the lines where 'fw_name' was being formatted.
>>> This fix ensures that the o
On 3/20/2024 6:54 PM, Alex Deucher wrote:
> On Wed, Mar 20, 2024 at 6:17 AM Ma Jun wrote:
>>
>> Because of the logic error, Arcturus and vega20 currently
>> use the AMDGPU_RUNPM_NONE for runtime pm even though they
>> support BACO. So, the code is optimized to fix this error.
>>
>>
On 3/20/2024 2:15 PM, Srinivasan Shanmugam wrote:
> The issue was present in the lines where 'fw_name' was being formatted.
> This fix ensures that the output is not truncated
>
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c: In function ‘amdgpu_vcn_early_init’:
>
On 3/19/2024 7:27 PM, Khatri, Sunil wrote:
>
> On 3/19/2024 7:19 PM, Lazar, Lijo wrote:
>>
>> On 3/19/2024 6:02 PM, Sunil Khatri wrote:
>>> Refactor the code so debugfs and devcoredump can reuse
>>> the common information and avoid unnecessary copy of it.
On 3/19/2024 6:02 PM, Sunil Khatri wrote:
> Refactor the code so debugfs and devcoredump can reuse
> the common information and avoid unnecessary copy of it.
>
> created a new file which would be the right place to
> hold functions which will be used between sysfs, debugfs
> and devcoredump.
>
[Public]
Reviewed-by: Lijo Lazar
Thanks,
Lijo
-Original Message-
From: SHANMUGAM, SRINIVASAN
Sent: Saturday, March 16, 2024 10:20 PM
To: Koenig, Christian ; Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN
; Lazar, Lijo
Subject: [PATCH] drm/amdgpu: Fix
On 3/15/2024 5:45 PM, Ma, Le wrote:
> [AMD Official Use Only - General]
>
>
>
>> -Original Message-----
>> From: Lazar, Lijo
>> Sent: Friday, March 15, 2024 6:14 PM
>> To: Ma, Le <_L
On 3/14/2024 10:24 PM, Zhigang Luo wrote:
> if reading pf2vf data failed 5 times continuously, it means something is
> wrong. Need to trigger flr_work to recover the issue.
>
> also use dev_err to print the error message to get which device has
> issue and add warning message if waiting
On 3/15/2024 3:43 PM, Lazar, Lijo wrote:
>
>
> On 3/15/2024 2:46 PM, Le Ma wrote:
>> To fix the entity rq NULL issue. This setting has been moved to upper level.
>>
>
> Need to call amdgpu_ttm_set_buffer_funcs_status(adev, true/false) in
> mode-2 reset handlers
On 3/15/2024 2:46 PM, Le Ma wrote:
> To fix the entity rq NULL issue. This setting has been moved to upper level.
>
Need to call amdgpu_ttm_set_buffer_funcs_status(adev, true/false) in
mode-2 reset handlers as well.
Thanks,
Lijo
> Fixes b70438004a14 ("drm/amdgpu: move buffer funcs setting
On 3/15/2024 1:13 PM, Asad Kamal wrote:
> Update PMFW interface headers for updated metrics table
> with pcie link speed and pcie link width
>
> Signed-off-by: Asad Kamal
Series is -
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
>
On 3/15/2024 11:11 AM, Asad Kamal wrote:
> Report pcie link speed/width using metric table in case
> of one vf & if pmfw support is available, else report directly from
> registers in case of pf. Skip reporting it for other cases.
>
> Signed-off-by: Asad Kamal
> ---
>
This one is missing some NULL checks. Will send a v2.
Thanks,
Lijo
On 3/13/2024 4:32 PM, Lijo Lazar wrote:
> Add support to set/get information about different DPM policies. The
> support is only available on SOCs which use swsmu architecture.
>
> A DPM policy type may be defined with different
On 3/14/2024 1:19 AM, Felix Kuehling wrote:
>
> On 2024-03-13 5:41, Lijo Lazar wrote:
>> Check if the device is present in the bus before trying to recover. It
>> could be that device itself is lost from the bus in some hang
>> situations.
>>
>> Signed-off-by: Lijo Lazar
>> ---
>>
On 3/13/2024 8:15 AM, Ma, Jun wrote:
>
>
> On 3/12/2024 8:57 PM, Lazar, Lijo wrote:
>>
>>
>> On 3/12/2024 4:29 PM, Ma Jun wrote:
>>> Sometimes user may want to enable the od feature
>>> by setting ppfeaturemask when loading amdgpu driver.
>>>
On 3/12/2024 4:29 PM, Ma Jun wrote:
> Sometimes user may want to enable the od feature
> by setting ppfeaturemask when loading amdgpu driver.
> However, not all ASICs support this feature.
> So we need to restore the ppfeature value and print
> a warning info.
>
> Signed-off-by: Ma Jun
> ---
>
On 3/8/2024 10:17 PM, Felix Kuehling wrote:
> On 2024-03-08 11:22, Mukul Joshi wrote:
>> In certain situations, some apps can import a BO multiple times
>> (through IPC for example). To restore such processes successfully,
>> we need to tell drm to ignore duplicate BOs.
>> While at it, also add
On 3/8/2024 3:21 PM, Ma Jun wrote:
> Because powerplay_table initialization is skipped in the
> SR-IOV case, we set default lower and upper OD values to
> avoid a NULL pointer issue.
>
> Also, it's necessary to check the OD capability before
> using the power limit value from powerplay_table.
>
>
On 3/7/2024 7:42 AM, Ma, Jun wrote:
> Hi Lijo,
>
> On 3/6/2024 7:16 PM, Lazar, Lijo wrote:
>>
>>
>> On 3/6/2024 3:56 PM, Ma Jun wrote:
>>> Because powerplay_table initialization is skipped under
>>> sriov case, We set default lower and u