On 5/14/2024 4:35 PM, Lijo Lazar wrote:
> This series adds APIs to get the supported PM policies and also set them. A PM
> policy type is a predefined policy type supported by an SOC and each policy may
> define two or more levels to choose from. A user can select the appropriate
> level
On 5/14/2024 12:28 PM, Jesse Zhang wrote:
> To avoid warning problems, drop index and
> use PPSMC_MSG_GfxDriverReset instead of index for aldebaran.
>
> Signed-off-by: Jesse Zhang
> Suggested-by: Lijo Lazar
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 13 +++--
> 1
On 5/14/2024 12:37 PM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> -----Original Message-----
> From: amd-gfx On Behalf Of Lazar, Lijo
> Sent: Tuesday, May 14, 2024 2:07 PM
> To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop
On 5/14/2024 9:43 AM, Ma Jun wrote:
> Drop hard-code value of nsTmax because we read this
> value from fantable below.
>
> Signed-off-by: Ma Jun
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 2 --
> 1 file changed, 2
On 5/14/2024 9:42 AM, Ma Jun wrote:
> Check ras_manager before using it
>
> Signed-off-by: Ma Jun
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git
On 5/14/2024 11:34 AM, Jesse Zhang wrote:
> To avoid warning problems, drop index and
> use PPSMC_MSG_GfxDriverReset instead of index for aldebaran.
>
> Signed-off-by: Jesse Zhang
> Suggested-by: Lijo Lazar
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 13 +++--
> 1
On 5/14/2024 6:30 AM, Ma, Jun wrote:
> Hi Lijo & Kevin, thanks for review, will drop this patch
>
In the original function, the check below is already there:
	if (!handle || !info || type >= ACA_ERROR_TYPE_COUNT)
		return -EINVAL;
So moving this to a later stage is still valid.
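As an illustration of the point above, here is a minimal sketch of an entry point that rejects bad arguments before anything is dereferenced. The names `example_handle`, `example_info`, `EXAMPLE_TYPE_*` and `example_log_error` are hypothetical stand-ins, not the amdgpu_aca.c definitions; only the shape of the check is taken from the quoted code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ACA types; not the real amdgpu structures. */
enum example_error_type {
	EXAMPLE_TYPE_UE,
	EXAMPLE_TYPE_CE,
	EXAMPLE_TYPE_COUNT,
};

struct example_handle {
	unsigned int counts[EXAMPLE_TYPE_COUNT];
};

struct example_info {
	unsigned int count;
};

/*
 * Validate every argument up front. Once this check has passed, later
 * code in the same function can use handle and info without re-checking.
 */
static int example_log_error(struct example_handle *handle,
			     unsigned int type,
			     const struct example_info *info)
{
	if (!handle || !info || type >= EXAMPLE_TYPE_COUNT)
		return -EINVAL;

	handle->counts[type] += info->count;
	return 0;
}
```

Because the guard is the first statement, any dereference placed after it is safe, which is the sense in which a later use "is still valid".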
On 5/10/2024 8:20 AM, Jesse Zhang wrote:
> Check for specific indexes that may be invalid values.
>
> Signed-off-by: Jesse Zhang
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git
On 5/13/2024 4:27 PM, Lazar, Lijo wrote:
>
>
> On 5/10/2024 8:20 AM, Jesse Zhang wrote:
>> Check for specific indexes that may be invalid values.
>>
>> Signed-off-by: Jesse Zhang
>> ---
>> drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
On 5/10/2024 8:20 AM, Jesse Zhang wrote:
> Check for specific indexes that may be invalid values.
>
> Signed-off-by: Jesse Zhang
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git
On 5/13/2024 2:26 PM, Ma Jun wrote:
> Check handle pointer before using it
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c | 6 +-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c
>
On 5/13/2024 2:26 PM, Ma Jun wrote:
> Check ras_manager before using it
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 +++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
>
On 5/13/2024 9:44 AM, Ori Messinger wrote:
> This patch adds 'ring hang' events to the driver.
> This is done by adding a 'reset_ring_hang' bool variable to the
> struct 'amdgpu_reset_context' in the amdgpu_reset.h file.
> The purpose for this 'reset_ring_hang' variable is whenever a GPU
>
On 5/10/2024 1:36 AM, Harish Kasiviswanathan wrote:
> gpu_id needs to be unique for user space to identify GPUs via KFD
> interface. In the current implementation there is a very small
> probability of having non unique gpu_ids.
>
> v2: Add check to confirm if gpu_id is unique. If not unique,
On 5/10/2024 1:56 PM, Jesse Zhang wrote:
> Checks the partition mode and returns an error for an invalid mode.
>
> Signed-off-by: Jesse Zhang
> Suggested-by: Lijo Lazar
> ---
> drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 7 +++
> 1 file changed, 7 insertions(+)
>
> diff --git
On 5/10/2024 1:09 PM, Zhang, Jesse(Jie) wrote:
> [AMD Official Use Only - General]
>
> Hi Lijo,
>
> -----Original Message-----
> From: amd-gfx On Behalf Of Lazar, Lijo
> Sent: Friday, May 10, 2024 3:16 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: Re: [P
On 5/10/2024 8:20 AM, Jesse Zhang wrote:
> Dividing expression num_xcc_per_xcp which may be zero has undefined behavior.
>
> Signed-off-by: Jesse Zhang
> ---
> drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git
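The truncated patch above is not visible here, so the following is only a sketch of the general technique it describes: check a divisor for zero before dividing. The function name `example_num_xcp` and its parameters are hypothetical and are not taken from aqua_vanjaram.c.

```c
#include <errno.h>

/*
 * Hypothetical helper: compute how many partitions result from splitting
 * num_xcc compute cores into groups of num_xcc_per_xcp. Integer division
 * by zero is undefined behavior in C, so the divisor is checked first.
 */
static int example_num_xcp(unsigned int num_xcc,
			   unsigned int num_xcc_per_xcp,
			   unsigned int *num_xcp)
{
	if (!num_xcp || !num_xcc_per_xcp)
		return -EINVAL;

	*num_xcp = num_xcc / num_xcc_per_xcp;
	return 0;
}
```

Returning an error code rather than a default quotient lets the caller decide how to handle a partition mode that has not been set up yet.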
On 5/7/2024 10:14 PM, Srinivasan Shanmugam wrote:
> This commit fixes potential truncation when writing the string _imu.bin
> into the fw_name buffer in the imu_v12_0_init_microcode function in the
> imu_v12_0.c file
>
> The ucode_prefix size was reduced from 30 to 15 to ensure the snprintf
>
On 5/7/2024 10:14 PM, Srinivasan Shanmugam wrote:
> This commit addresses multiple warnings in the gfx_v12_0_init_microcode
> function in the gfx_v12_0.c file. The warnings were related to potential
> truncation when writing the strings _pfp.bin, _me.bin, _rlc.bin, and
> _mec.bin into the
On 5/4/2024 3:36 AM, Harish Kasiviswanathan wrote:
> gpu_id needs to be unique for user space to identify GPUs via KFD
> interface. In the current implementation there is a very small
> probability of having non unique gpu_ids.
>
> v2: Add check to confirm if gpu_id is unique. If not unique,
On 5/7/2024 6:00 AM, Harry Wentland wrote:
> This patch is causing crashes of Manor Lords on my Navi 21 on the 6.8.9
> stable kernel. It leads to an assertion failure in wine:
>
> File: ../src-wine/dlls/winevulkan/loader_thunks.c
> Line: 3621
>
> Expression "!status && vkEndCommandBuffer""
>
On 5/2/2024 7:01 PM, Asad Kamal wrote:
> Validate tbo resource pointer, skip if NULL
>
> Signed-off-by: Asad Kamal
> Reviewed-by: Christian König
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
> 1 file changed, 2 insertions(+), 1
On 4/26/2024 9:27 AM, Yunxiang Li wrote:
> Sometimes a hung GPU causes multiple reset sources to schedule resets.
> The second source will be able to trigger an unnecessary reset if it
> schedules after we call amdgpu_device_stop_pending_resets.
>
> Move amdgpu_device_stop_pending_resets to
On 4/30/2024 7:53 PM, Zhigang Luo wrote:
> VF can't access FB when host is doing mode1 reset. Using sizeof to get
> vf2pf info size, instead of reading it from vf2pf header stored in FB.
>
> Signed-off-by: Zhigang Luo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 2 +-
> 1 file changed,
On 4/28/2024 12:38 PM, YiPeng Chai wrote:
> Add mutex to protect ras shared memory.
>
> Signed-off-by: YiPeng Chai
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 121 ++---
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h| 1 +
>
On 4/25/2024 3:53 PM, Sunil Khatri wrote:
> Do not dump the ip registers during driver reload
> in passthrough environment.
>
> Signed-off-by: Sunil Khatri
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git
On 4/25/2024 3:30 PM, Ma Jun wrote:
> Initialize the phy_id to 0 to fix the warning of
> "Using uninitialized value phy_id"
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 6 +-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git
On 4/25/2024 12:05 PM, Srinivasan Shanmugam wrote:
> The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating
> warnings about potential truncation of output when using the snprintf function.
> The issue was due to the size of the buffer 'ucode_prefix' being too
> small to accommodate the
On 4/23/2024 7:13 AM, Srinivasan Shanmugam wrote:
> The buffer size is determined by the declaration char fw_name[30]; This
> means fw_name can hold up to 30 characters, including the null character
> that marks the end of the string.
>
> The string to be written is "amdgpu/%s_mec.bin" or
On 4/23/2024 1:15 AM, Yunxiang Li wrote:
> Reset request from KFD is missing a check for whether a reset is already in
> progress; this causes a second reset to be triggered right after the
> previous one finishes. Add the check to align with the other reset sources.
>
> Signed-off-by: Yunxiang Li
On 4/19/2024 9:14 PM, Srinivasan Shanmugam wrote:
> This commit addresses buffer overflow in the smu_v14_0_init_microcode
> function. The issue was about the snprintf function writing more bytes
> into the fw_name buffer than it can hold.
>
> The line of code is:
>
> snprintf(fw_name,
On 4/22/2024 4:52 PM, Christian König wrote:
> Am 22.04.24 um 11:37 schrieb Lazar, Lijo:
>>
>> On 4/22/2024 2:59 PM, Christian König wrote:
>>> Am 22.04.24 um 10:47 schrieb Jack Xiao:
>>>> Delete fence fallback timer to fix the random
>>>>
On 4/22/2024 3:09 PM, Jack Xiao wrote:
> Delete fence fallback timer to fix the random
> use-after-free issue.
>
> v2: move to amdgpu_mes.c
>
> Signed-off-by: Jack Xiao
Acked-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 1 +
> 1 file changed, 1
On 4/22/2024 2:59 PM, Christian König wrote:
> Am 22.04.24 um 10:47 schrieb Jack Xiao:
>> Delete fence fallback timer to fix the random
>> use-after-free issue.
>
> That's already done in amdgpu_fence_driver_hw_fini() and absolutely
> shouldn't be in amdgpu_ring_fini().
>
> And the
On 4/22/2024 11:23 AM, Le Ma wrote:
> To adapt to different gc versions in gfx_v9_4_3.c file.
>
> Change-Id: Ib4465aade0dcbbcc43318c6dc865f813c5411097
> Signed-off-by: Le Ma
> Reviewed-by: Hawking Zhang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
On 4/22/2024 6:42 AM, Rajneesh Bhardwaj wrote:
> Tune coarse grain clock gating idle threshold and rlc idle timeout to
> achieve better kernel launch latency.
>
> Signed-off-by: Rajneesh Bhardwaj
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 8
). If that happens, then the purpose of
the patch - to get the context of a device hang - is lost. We may not even get
a proper dmesg log.
Thanks,
Lijo
-----Original Message-----
From: Khatri, Sunil
Sent: Wednesday, April 17, 2024 9:42 PM
To: Lazar, Lijo ; Alex Deucher ;
Khatri, Sunil
Cc
On 4/17/2024 9:21 PM, Alex Deucher wrote:
> On Wed, Apr 17, 2024 at 5:38 AM Sunil Khatri wrote:
>>
>> Adding gfx10 gc registers to be used for register
>> dump via devcoredump during a gpu reset.
>>
>> Signed-off-by: Sunil Khatri
>
> Reviewed-by: Alex Deucher
>
>> ---
>>
On 4/17/2024 11:23 AM, Ma Jun wrote:
> gpu_od should be removed if it's an empty directory
>
> Signed-off-by: Ma Jun
> Reported-by: Yang Wang
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 7 +++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>
On 4/17/2024 3:10 PM, Ma Jun wrote:
> Print the od status info if it's not supported.
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>
On 4/17/2024 1:14 PM, Khatri, Sunil wrote:
>
> On 4/17/2024 1:06 PM, Khatri, Sunil wrote:
>> devcoredump is used to debug gpu hangs/resets. So in normal process
>> when there is a hang due to ring timeout or page fault we are doing a
>> hard reset as soft reset fail in those cases. How are we
On 4/17/2024 9:43 AM, Ahmad Rehman wrote:
> In passthrough environment, the driver triggers the mode-1 reset on
> reload. The reset causes the core dump collection which is delayed task
> and prevents driver from unloading until it is completed. Since we do
> not need to collect data on "reset
On 4/17/2024 12:05 AM, Ahmad Rehman wrote:
> In passthrough environment, the driver triggers the mode-1 reset on
> reload. The reset causes the core dump collection which is delayed task
> and prevents driver from unloading until it is completed. Since we do
> not need to collect data on "reset
[Public]
Is this applicable for aldebaran also?
Thanks,
Lijo
-----Original Message-----
From: amd-gfx On Behalf Of Hawking Zhang
Sent: Tuesday, April 16, 2024 11:46 AM
To: amd-gfx@lists.freedesktop.org; Zhou1, Tao
Cc: Zhang, Hawking
Subject: [PATCH] drm/amdgpu: Use driver mode reset for data
On 4/3/2024 8:27 AM, Ma Jun wrote:
> Refactor the runtime pm mode detection code to support
> both the amdgpu_runtime_pm=2 and amdgpu_runtime_pm=1 cases
>
> Signed-off-by: Ma Jun
> Reviewed-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> v1->v2:
> - Fix logic and output info (Lijo)
> - Fix code
On 4/8/2024 10:50 PM, Alex Deucher wrote:
> Need to take the srbm_mutex and while we are here, use the
> helper function soc21_grbm_select();
>
> Signed-off-by: Alex Deucher
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 15 +--
> 1 file
On 4/3/2024 11:42 PM, Zhigang Luo wrote:
> 1. change AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT from 30 to 5.
> 2. set fatal error detected flag.
>
> Signed-off-by: Zhigang Luo
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
>
On 4/2/2024 4:00 PM, Lazar, Lijo wrote:
>
>
> On 4/2/2024 3:52 PM, Asad Kamal wrote:
>> Report max set uclk and sclk for smu_v_13_0_6
>>
>
> You may rephrase as
>
> "Use OD (pp_od_clk_voltage) interface to report current limits, default
>
On 4/2/2024 3:52 PM, Asad Kamal wrote:
> Update max set uclk and sclk reporting format for smu_v_13_0_0
>
Use aldebaran instead of smu v13.0.0 - both are different. You may also
add the description similar to patch 1.
With those updates,
Reviewed-by: Lijo Lazar
Thanks,
Lijo
>
On 4/2/2024 3:52 PM, Asad Kamal wrote:
> Report max set uclk and sclk for smu_v_13_0_6
>
You may rephrase as
"Use OD (pp_od_clk_voltage) interface to report current limits, default
or those set by user, for SCLK and UCLK."
Thanks,
Lijo
> Signed-off-by: Asad Kamal
> ---
>
On 3/29/2024 1:58 PM, Ma Jun wrote:
> Refactor the runtime pm mode detection code to support
> both the amdgpu_runtime_pm=2 and amdgpu_runtime_pm=1 cases
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 68 ++
On 4/1/2024 4:45 PM, Kamal, Asad wrote:
> [AMD Official Use Only - General]
>
> -----Original Message-----
> From: amd-gfx On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 8:06 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Deucher, Alexander
> ; Wang, Yang(Kevin)
>
On 3/27/2024 4:40 PM, Ma Jun wrote:
> Add a new runtime pm mode AMDGPU_RUNPM_BAMACO
> and related macro definition
>
> Signed-off-by: Ma Jun
Series is
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 4
> 1 file changed, 4 insertions(+)
>
>
On 3/28/2024 8:49 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - General]
>
> -----Original Message-----
> From: amd-gfx On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 11:06 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Deucher, Alexander
>
> Subject:
On 3/28/2024 8:57 AM, Wang, Yang(Kevin) wrote:
> [AMD Official Use Only - General]
>
> -----Original Message-----
> From: amd-gfx On Behalf Of Lijo Lazar
> Sent: Thursday, March 28, 2024 10:36 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Deucher, Alexander
> ; Wang,
On 3/26/2024 2:32 PM, Yang Wang wrote:
> add a new enumeration type to identify device attribute node,
> this method is relatively more efficient compared with 'strcmp' in
> update_attr() function.
>
> Signed-off-by: Yang Wang
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 4 +--
>
On 3/23/2024 1:27 AM, Zhigang Luo wrote:
> Signed-off-by: Zhigang Luo
> Change-Id: I2a98d513c26107ac76ecf20e951c188afbc7ede6
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 20
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 5 -
>
On 3/26/2024 2:59 PM, Lazar, Lijo wrote:
>
>
> On 3/25/2024 3:45 PM, Ma Jun wrote:
>> Optimize the code to add support for BAMACO mode checking
>>
>> Signed-off-by: Ma Jun
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +-
>>
On 3/25/2024 3:45 PM, Ma Jun wrote:
> Optimize the code to add support for BAMACO mode checking
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 74 +++--
>
On 3/25/2024 3:45 PM, Ma Jun wrote:
> Add support for MACO flag checking.
> MACO mode only works if BACO is supported.
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 ++--
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>
On 3/22/2024 12:33 PM, Srinivasan Shanmugam wrote:
> By reducing the size of ucode_prefix to 25 in the smu_v11_0_init_microcode
> function, we ensure that fw_name can accommodate the maximum possible
> string size.
>
> Fixes the below with gcc W=1:
>
On 3/22/2024 12:24 PM, Srinivasan Shanmugam wrote:
> The total size of the fw_name buffer is 8 (for "amdgpu/") + 30 (for
> ucode_prefix) + 5 (for "_pfp") + 5 (for "_wks") + 5 (for ".bin") = 53
> characters.
>
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c: In function
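The buffer-size arithmetic in the snippet above can also be checked at run time, since snprintf returns the length it would have written. The sketch below is a standalone illustration, not the patch itself: the helper name `example_build_fw_name` and the fixed format string "amdgpu/%s_pfp_wks.bin" are assumptions made for the example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a firmware file name from a ucode prefix and
 * report whether it fit. snprintf never writes more than len bytes and
 * returns the number of characters the full string needs (excluding the
 * terminating NUL), so a return value >= len means the output was truncated.
 */
static int example_build_fw_name(char *buf, size_t len,
				 const char *ucode_prefix)
{
	int n = snprintf(buf, len, "amdgpu/%s_pfp_wks.bin", ucode_prefix);

	if (n < 0 || (size_t)n >= len)
		return -1;	/* encoding error or truncated output */
	return 0;
}
```

A caller would pass `sizeof(fw_name)` as `len`. The gcc W=1 truncation warnings mentioned in these threads flag the same condition at compile time, based on the declared sizes of the source and destination buffers.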
On 3/22/2024 12:02 PM, Srinivasan Shanmugam wrote:
> Reducing the size of ucode_prefix to 25 in the gfx_v11_0_init_microcode
> function. This would ensure that the total number of characters being
> written into fw_name does not exceed its size of 40.
>
> Fixes the below with gcc W=1:
>
On 3/22/2024 11:54 AM, Srinivasan Shanmugam wrote:
> The size of fw_name is increased to ensure that it can accommodate
> the maximum possible size of the string being written into it.
>
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c: In function ‘gfx_v9_0_early_init’:
On 3/21/2024 11:16 AM, Srinivasan Shanmugam wrote:
> The snprintf function is used to write a formatted string into fw_name.
> The format of the string is "amdgpu/%s_mes%s.bin", where %s is replaced
> by the string in ucode_prefix and the second %s is replaced by either
> "_2" or "1" depending
On 3/21/2024 10:29 AM, Srinivasan Shanmugam wrote:
> Reducing the size of ucode_prefix to 25 in the amdgpu_vcn_early_init
> function. This would ensure that the total number of characters being
> written into fw_name does not exceed its size of 40.
>
> Fixes the below with gcc W=1:
>
On 3/21/2024 12:28 PM, Ma, Jun wrote:
>
>
> On 3/20/2024 9:38 PM, Lazar, Lijo wrote:
>>
>>
>> On 3/20/2024 6:54 PM, Alex Deucher wrote:
>>> On Wed, Mar 20, 2024 at 6:17 AM Ma Jun wrote:
>>>>
>>>> Because of the logic error,
On 3/20/2024 8:28 PM, SRINIVASAN SHANMUGAM wrote:
>
> On 3/20/2024 3:12 PM, Lazar, Lijo wrote:
>>
>> On 3/20/2024 2:15 PM, Srinivasan Shanmugam wrote:
>>> The issue was present in the lines where 'fw_name' was being formatted.
>>> This fix ensures that the o
On 3/20/2024 6:54 PM, Alex Deucher wrote:
> On Wed, Mar 20, 2024 at 6:17 AM Ma Jun wrote:
>>
>> Because of the logic error, Arcturus and vega20 currently
>> use the AMDGPU_RUNPM_NONE for runtime pm even though they
>> support BACO. So, the code is optimized to fix this error.
>>
>>
On 3/20/2024 2:15 PM, Srinivasan Shanmugam wrote:
> The issue was present in the lines where 'fw_name' was being formatted.
> This fix ensures that the output is not truncated
>
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c: In function ‘amdgpu_vcn_early_init’:
>
On 3/19/2024 7:27 PM, Khatri, Sunil wrote:
>
> On 3/19/2024 7:19 PM, Lazar, Lijo wrote:
>>
>> On 3/19/2024 6:02 PM, Sunil Khatri wrote:
>>> Refactor the code so debugfs and devcoredump can reuse
>>> the common information and avoid unnecessary copy of it.
>
On 3/19/2024 6:02 PM, Sunil Khatri wrote:
> Refactor the code so debugfs and devcoredump can reuse
> the common information and avoid unnecessary copy of it.
>
> created a new file which would be the right place to
> hold functions which will be used between sysfs, debugfs
> and devcoredump.
>
[Public]
Reviewed-by: Lijo Lazar
Thanks,
Lijo
-----Original Message-----
From: SHANMUGAM, SRINIVASAN
Sent: Saturday, March 16, 2024 10:20 PM
To: Koenig, Christian ; Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN
; Lazar, Lijo
Subject: [PATCH] drm/amdgpu: Fix
On 3/15/2024 5:45 PM, Ma, Le wrote:
> [AMD Official Use Only - General]
>
>
>
>> -----Original Message-----
>> From: Lazar, Lijo
>> Sent: Friday, March 15, 2024 6:14 PM
>> To: Ma, Le <_L
On 3/14/2024 10:24 PM, Zhigang Luo wrote:
> if reading pf2vf data failed 5 times continuously, it means something is
> wrong. Need to trigger flr_work to recover the issue.
>
> also use dev_err to print the error message to get which device has
> issue and add warning message if waiting
On 3/15/2024 3:43 PM, Lazar, Lijo wrote:
>
>
> On 3/15/2024 2:46 PM, Le Ma wrote:
>> To fix the entity rq NULL issue. This setting has been moved to upper level.
>>
>
> Need to call amdgpu_ttm_set_buffer_funcs_status(adev, true/false) in
> mode-2 reset handlers
On 3/15/2024 2:46 PM, Le Ma wrote:
> To fix the entity rq NULL issue. This setting has been moved to upper level.
>
Need to call amdgpu_ttm_set_buffer_funcs_status(adev, true/false) in
mode-2 reset handlers as well.
Thanks,
Lijo
> Fixes b70438004a14 ("drm/amdgpu: move buffer funcs setting
On 3/15/2024 1:13 PM, Asad Kamal wrote:
> Update PMFW interface headers for updated metrics table
> with pcie link speed and pcie link width
>
> Signed-off-by: Asad Kamal
Series is -
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
>
On 3/15/2024 11:11 AM, Asad Kamal wrote:
> Report pcie link speed/width using metric table in case
> of one vf & if pmfw support is available, else report directly from
> registers in case of pf. Skip reporting it for other cases.
>
> Signed-off-by: Asad Kamal
> ---
>
This one is missing some NULL checks. Will send a v2.
Thanks,
Lijo
On 3/13/2024 4:32 PM, Lijo Lazar wrote:
> Add support to set/get information about different DPM policies. The
> support is only available on SOCs which use swsmu architecture.
>
> A DPM policy type may be defined with different
On 3/14/2024 1:19 AM, Felix Kuehling wrote:
>
> On 2024-03-13 5:41, Lijo Lazar wrote:
>> Check if the device is present in the bus before trying to recover. It
>> could be that device itself is lost from the bus in some hang
>> situations.
>>
>> Signed-off-by: Lijo Lazar
>> ---
>>
On 3/13/2024 8:15 AM, Ma, Jun wrote:
>
>
> On 3/12/2024 8:57 PM, Lazar, Lijo wrote:
>>
>>
>> On 3/12/2024 4:29 PM, Ma Jun wrote:
>>> Sometimes user may want to enable the od feature
>>> by setting ppfeaturemask when loading amdgpu driver.
>>>
On 3/12/2024 4:29 PM, Ma Jun wrote:
> Sometimes user may want to enable the od feature
> by setting ppfeaturemask when loading amdgpu driver.
> However, not all ASICs support this feature.
> So we need to restore the ppfeature value and print
> a warning info.
>
> Signed-off-by: Ma Jun
> ---
>
On 3/8/2024 10:17 PM, Felix Kuehling wrote:
> On 2024-03-08 11:22, Mukul Joshi wrote:
>> In certain situations, some apps can import a BO multiple times
>> (through IPC for example). To restore such processes successfully,
>> we need to tell drm to ignore duplicate BOs.
>> While at it, also add
On 3/8/2024 3:21 PM, Ma Jun wrote:
> Because powerplay_table initialization is skipped under
> sriov case, we set default lower and upper OD value to
> avoid NULL pointer issue.
>
> Also, it's necessary to check od capability before
> using the power limit value from powerplay_table.
>
>
On 3/7/2024 7:42 AM, Ma, Jun wrote:
> Hi Lijo,
>
> On 3/6/2024 7:16 PM, Lazar, Lijo wrote:
>>
>>
>> On 3/6/2024 3:56 PM, Ma Jun wrote:
>>> Because powerplay_table initialization is skipped under
>>> sriov case, we set default lower and u
On 3/6/2024 3:56 PM, Ma Jun wrote:
> Because powerplay_table initialization is skipped under
> sriov case, we set default lower and upper OD value to
> avoid NULL pointer issue.
pp_od_clk_voltage is not enabled in SRIOV (except for GC 9.4.3 one VF
mode). Since the interface is not available
On 3/5/2024 2:44 PM, Christian König wrote:
> Am 05.03.24 um 10:01 schrieb Lazar, Lijo:
>> On 3/5/2024 2:22 PM, Christian König wrote:
>>> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>>>> VCN 4.0.3 cannot trigger HDP flush with RRMT enabled. Instead, trigger
>>
On 3/5/2024 2:48 PM, Christian König wrote:
> Am 05.03.24 um 10:03 schrieb Lazar, Lijo:
>>
>> On 3/5/2024 2:24 PM, Christian König wrote:
>>>
>>> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>>>> For VCN 4.0.3, use only the local addressing scheme wh
On 3/5/2024 2:24 PM, Christian König wrote:
>
>
> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>> For VCN 4.0.3, use only the local addressing scheme while in VF
>> mode. This includes addressing scheme used for HUB offsets.
>>
>> Signed-off-by: Lijo Lazar
>> ---
>>
On 3/5/2024 2:22 PM, Christian König wrote:
> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>> VCN 4.0.3 cannot trigger HDP flush with RRMT enabled. Instead, trigger
>> HDP flush from host side before ringing doorbell.
>
> Well that won't work like that.
>
> The HDP flush is supposed to be emitted
On 3/1/2024 7:52 PM, Christian König wrote:
> Am 01.03.24 um 15:01 schrieb Lazar, Lijo:
>> On 3/1/2024 6:15 PM, Srinivasan Shanmugam wrote:
>>> The 'mask' array could be used in a way that would make the code
>>> vulnerable to a Spectre attack. The issue is
On 3/1/2024 6:15 PM, Srinivasan Shanmugam wrote:
> The 'mask' array could be used in a way that would make the code
> vulnerable to a Spectre attack. The issue is likely related to the fact
> that the 'mask' array is being indexed using values that are derived
> from user input (the 'se' and
On 3/1/2024 1:15 PM, Ma Jun wrote:
> Fix the error in the pwm_mode value used for
> pwm1_enable setting
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 12 +++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git
On 2/29/2024 4:40 PM, Ma, Jun wrote:
> Hi Lijo,
>
> On 2/29/2024 3:33 PM, Lazar, Lijo wrote:
>>
>>
>> On 2/29/2024 11:49 AM, Ma Jun wrote:
>>> Check return value of amdgpu_device_baco_enter/exit and print
>>> warning message because these errors ma
On 2/29/2024 11:49 AM, Ma Jun wrote:
> Check return value of amdgpu_device_baco_enter/exit and print
> warning message because these errors may cause runtime resume failure
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 29 --
> 1 file
On 2/27/2024 9:23 PM, Harish Kasiviswanathan wrote:
> Also passing adev is misleading if BO is associated with different adev.
> In this case BO is mapped to a different device
>
Looks like a typo in subject - unused?
Thanks,
Lijo
> Signed-off-by: Harish Kasiviswanathan
> ---
>
On 2/28/2024 5:14 PM, Ma Jun wrote:
> Because the rpm_mode flag is already set when the driver
> is initialized, we use it directly for runtime suspend/resume
> instead of checking it again
>
> Signed-off-by: Ma Jun
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
>
On 2/28/2024 5:14 PM, Ma Jun wrote:
> Check return value of amdgpu_device_baco_enter/exit and print
> warning message because these errors may cause runtime resume failure
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8 ++--
> 1 file changed, 6
On 2/27/2024 10:05 PM, Srinivasan Shanmugam wrote:
> Fixes snprintf calls that could write more bytes into various buffers than
> they can hold.
>
> In several files - smu_v13_0.c, gfx_v11_0.c, gfx_v10_0.c, gfx_v9_0.c,
> and amdgpu_mes.c. They were related to different directives, such as
> '%s',