On 3/6/2024 3:56 PM, Ma Jun wrote:
> Because powerplay_table initialization is skipped in the
> SRIOV case, we set default lower and upper OD values to
> avoid a NULL pointer issue.
pp_od_clk_voltage is not enabled in SRIOV (except for GC 9.4.3 one VF
mode). Since the interface is not available
On 3/5/2024 2:44 PM, Christian König wrote:
> Am 05.03.24 um 10:01 schrieb Lazar, Lijo:
>> On 3/5/2024 2:22 PM, Christian König wrote:
>>> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>>>> VCN 4.0.3 cannot trigger HDP flush with RRMT enabled. Instead, trigger
>>
On 3/5/2024 2:48 PM, Christian König wrote:
> Am 05.03.24 um 10:03 schrieb Lazar, Lijo:
>>
>> On 3/5/2024 2:24 PM, Christian König wrote:
>>>
>>> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>>>> For VCN 4.0.3, use only the local addressing scheme wh
On 3/5/2024 2:24 PM, Christian König wrote:
>
>
> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>> For VCN 4.0.3, use only the local addressing scheme while in VF
>> mode. This includes addressing scheme used for HUB offsets.
>>
>> Signed-off-by: Lijo Lazar
>> ---
>>
On 3/5/2024 2:22 PM, Christian König wrote:
> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>> VCN 4.0.3 cannot trigger HDP flush with RRMT enabled. Instead, trigger
>> HDP flush from host side before ringing doorbell.
>
> Well that won't work like that.
>
> The HDP flush is supposed to be emitted
On 3/1/2024 7:52 PM, Christian König wrote:
> Am 01.03.24 um 15:01 schrieb Lazar, Lijo:
>> On 3/1/2024 6:15 PM, Srinivasan Shanmugam wrote:
>>> The 'mask' array could be used in a way that would make the code
>>> vulnerable to a Spectre attack. The issue is
On 3/1/2024 6:15 PM, Srinivasan Shanmugam wrote:
> The 'mask' array could be used in a way that would make the code
> vulnerable to a Spectre attack. The issue is likely related to the fact
> that the 'mask' array is being indexed using values that are derived
> from user input (the 'se' and
On 3/1/2024 1:15 PM, Ma Jun wrote:
> Fix the pwm_mode value error, which is used for
> the pwm1_enable setting
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 12 +++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git
On 2/29/2024 4:40 PM, Ma, Jun wrote:
> Hi Lijo,
>
> On 2/29/2024 3:33 PM, Lazar, Lijo wrote:
>>
>>
>> On 2/29/2024 11:49 AM, Ma Jun wrote:
>>> Check return value of amdgpu_device_baco_enter/exit and print
>>> warning message because these errors ma
On 2/29/2024 11:49 AM, Ma Jun wrote:
> Check return value of amdgpu_device_baco_enter/exit and print
> warning message because these errors may cause runtime resume failure
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 29 --
> 1 file
On 2/27/2024 9:23 PM, Harish Kasiviswanathan wrote:
> Also passing adev is misleading if BO is associated with different adev.
> In this case BO is mapped to a different device
>
Looks like a typo in subject - unused?
Thanks,
Lijo
> Signed-off-by: Harish Kasiviswanathan
> ---
>
On 2/28/2024 5:14 PM, Ma Jun wrote:
> Because the rpm_mode flag is already set when the driver
> is initialized, we use it directly for runtime suspend/resume
> instead of checking it again
>
> Signed-off-by: Ma Jun
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
>
On 2/28/2024 5:14 PM, Ma Jun wrote:
> Check return value of amdgpu_device_baco_enter/exit and print
> warning message because these errors may cause runtime resume failure
>
> Signed-off-by: Ma Jun
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8 ++--
> 1 file changed, 6
On 2/27/2024 10:05 PM, Srinivasan Shanmugam wrote:
> Fixes snprintf function by writing more bytes into various buffers than
> they can hold.
>
> In several files - smu_v13_0.c, gfx_v11_0.c, gfx_v10_0.c, gfx_v9_0.c,
> and amdgpu_mes.c. They were related to different directives, such as
> '%s',
On 2/28/2024 12:30 PM, Yang Wang wrote:
> v1:
> enable pp_od_clk_voltage node for gfx 9.4.3 SRIOV and BM.
>
> v2:
> add onevf check for gfx 9.4.3
>
> v3:
> refine code check order to make the function clearer.
>
> Signed-off-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
>
On 2/28/2024 12:08 PM, Yang Wang wrote:
> v1:
> enable pp_od_clk_voltage node for gfx 9.4.3 SRIOV and BM.
>
> v2:
> add onevf check for gfx 9.4.3
>
> Signed-off-by: Yang Wang
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 35 +-
> 1 file changed, 30 insertions(+),
On 2/28/2024 11:28 AM, Yang Wang wrote:
> enable pp_od_clk_voltage node for gfx 9.4.3 SRIOV and BM.
>
> Signed-off-by: Yang Wang
> ---
> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 29 -
> 1 file changed, 24 insertions(+), 5 deletions(-)
>
> diff --git
On 2/28/2024 9:30 AM, Asad Kamal wrote:
> Skip reporting pcie link width/speed on vfs for
> smu_v13_0_6 & smu_v13_0_2
>
> Signed-off-by: Asad Kamal
> Reviewed-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> .../gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 10 ++
>
On 2/20/2024 7:52 PM, Christian König wrote:
> Am 20.02.24 um 07:32 schrieb Lazar, Lijo:
>> On 2/16/2024 8:43 PM, Alex Deucher wrote:
>>> Use the new reset critical section accessors for debugfs, sysfs,
>>> and the INFO IOCTL to provide proper mutual exclusivity
>
On 2/16/2024 8:43 PM, Alex Deucher wrote:
> Use the new reset critical section accessors for debugfs, sysfs,
> and the INFO IOCTL to provide proper mutual exclusivity
> to hardware with respect to GPU resets.
>
This looks more like a priority inversion. When the device needs reset,
it
On 2/19/2024 1:45 PM, Tao Zhou wrote:
> Let kfd interrupt handler process it.
>
> Signed-off-by: Tao Zhou
> ---
> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 10 +-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>
On 2/18/2024 12:26 PM, Tao Zhou wrote:
> Add help function to query and reset RAS UTCL2 poison status.
>
> Signed-off-by: Tao Zhou
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 14 ++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
Sending another one, please ignore.
Thanks,
Lijo
On 2/9/2024 12:04 PM, Lijo Lazar wrote:
> Allow reducing max UCLK in MANUAL performance level. New UCLK value
> should be less than the max DPM level UCLK level value.
>
> Ex:
> echo manual >
On 2/8/2024 11:04 AM, Kenneth Feng wrote:
> denote S to the deep sleep clock for the clock output on smu
> v13.0.0/v13.0.7/v13.0.10
>
> Signed-off-by: Kenneth Feng
> ---
> .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 27 +--
> .../drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
On 2/7/2024 2:03 PM, Kenneth Feng wrote:
> denote S to the actual clock in smu v13.0.0/v13.0.7/v13.0.10
>
> Signed-off-by: Kenneth Feng
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 12 ++--
> drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 12 ++--
>
On 1/26/2024 2:30 PM, Liang, Prike wrote:
> [AMD Official Use Only - General]
>
>>
>> On 1/25/2024 8:52 AM, Prike Liang wrote:
>>> In the pm abort case the gfx power rail is not turned off from the FCH side and
>>> this will lead to gfx reinitialization failing because of the unknown gfx
>>> HW status, so
On 1/25/2024 1:37 PM, Le Ma wrote:
> This patch is to eliminate interrupt warning below:
>
> "[drm] Fence fallback timer expired on ring sdma0.0".
>
> An early vm pt clearing job is sent to SDMA ahead of interrupt enabled,
> introduced by patch below:
>
> - drm/amdkfd: Export DMABufs
On 1/25/2024 8:52 AM, Prike Liang wrote:
> In the pm abort case the gfx power rail is not turned off from the FCH side and
> this will lead to gfx reinitialization failing because of the unknown gfx
> HW status, so let's reset the gpu to a known good power state.
>
From the description, this is an APU only
On 1/25/2024 2:20 PM, Ma Jun wrote:
> Replace the hard-coded numbers with macro definition
>
> Signed-off-by: Ma Jun
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> v3:
> - Add new SMU_IH_INTERRUPT_* macros for smu, keeping the original
> macro definitions in sync with pmfw (kevin)
> ---
>
On 1/25/2024 11:48 AM, Le Ma wrote:
> This patch is to eliminate interrupt warning below:
>
> "[drm] Fence fallback timer expired on ring sdma0.0".
>
> An early vm pt clearing job is sent to SDMA ahead of interrupt enabled,
> introduced by patch below:
>
> - drm/amdkfd: Export DMABufs
On 1/24/2024 2:28 PM, Le Ma wrote:
> This patch is to eliminate interrupt warning below:
>
> "[drm] Fence fallback timer expired on ring sdma0.0".
>
> An early vm pt clearing job is sent to SDMA ahead of interrupt enabled,
> introduced by patch below:
>
> - drm/amdkfd: Export DMABufs
Mukul posted a patch for this already.
"drm/amdgpu: Fix module unload hang with RAS enabled"
Thanks,
Lijo
On 1/24/2024 9:09 AM, YiPeng Chai wrote:
> The following is the error message:
> [ 484.495995] task:rmmod state:D stack:0 pid: 2195 ppid: 2194
> flags:0x4002
>
On 1/23/2024 1:43 PM, Ma Jun wrote:
> Replace the hard-coded numbers with macro definition
>
> Signed-off-by: Ma Jun
Series is
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h | 11 ---
>
On 1/22/2024 8:30 PM, Alex Deucher wrote:
> Replace [1] with []. Silences UBSAN warnings.
>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3107
> Signed-off-by: Alex Deucher
typo => covert-> convert
With the typo fixed in the subject -
Reviewed-by: Lijo Lazar
Thanks,
On 1/21/2024 5:49 AM, vitaly.pros...@amd.com wrote:
> From: Vitaly Prosyak
>
> The issue started to appear after the following commit
> 11b3b9f461c5c4f700f6c8da202fcc2fd6418e1f (scheduler to variable number
> of run-queues). The scheduler flag ready (ring->sched.ready) could not be
>
On 1/23/2024 1:38 PM, Srinivasan Shanmugam wrote:
> 'adev->gfx.rlc_fw' may not be released before end of
> gfx_v10_0_init_microcode() function.
>
> Using the function release_firmware() to release adev->gfx.rlc_fw.
>
> Fixes the below:
> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4046
On 1/22/2024 2:12 PM, Ma Jun wrote:
> Replace the hard-coded numbers with macro definition
>
> Signed-off-by: Ma Jun
> ---
> .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h | 11 +--
> .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h | 11 ---
>
[AMD Official Use Only - General]
Reviewed-by: Lijo Lazar
Thanks,
Lijo
-----Original Message-----
From: Zhang, Hawking
Sent: Monday, January 22, 2024 3:27 PM
To: amd-gfx@lists.freedesktop.org; Lazar, Lijo ; Deucher,
Alexander ; Ma, Le
Cc: Zhang, Hawking
Subject: [PATCH] drm/amdgpu: Fix
On 1/19/2024 7:24 AM, Ma, Jun wrote:
Hi Lijo,
On 1/18/2024 5:24 PM, Lazar, Lijo wrote:
On 1/18/2024 2:31 PM, Ma, Jun wrote:
On 1/18/2024 4:38 PM, Lazar, Lijo wrote:
On 1/18/2024 12:57 PM, Ma Jun wrote:
The power source flag should be updated when
[1] System receives an interrupt
On 1/19/2024 9:17 AM, Yang Wang wrote:
update smu v13.0.6 message to allow guest driver set gfx clock.
Signed-off-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff
On 1/18/2024 2:31 PM, Ma, Jun wrote:
On 1/18/2024 4:38 PM, Lazar, Lijo wrote:
On 1/18/2024 12:57 PM, Ma Jun wrote:
The power source flag should be updated when
[1] System receives an interrupt indicating that the power source
has changed.
[2] System resumes from suspend or runtime suspend
On 1/18/2024 12:57 PM, Ma Jun wrote:
The power source flag should be updated when
[1] System receives an interrupt indicating that the power source
has changed.
[2] System resumes from suspend or runtime suspend
Signed-off-by: Ma Jun
---
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 13
On 1/18/2024 11:07 AM, Yang Wang wrote:
From: Yang Wang
v1:
enable amdgpu smu driver message log.
v2:
add smu/pmfw response value into debug log.
Signed-off-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 9 -
1 file changed,
On 1/18/2024 8:56 AM, Yang Wang wrote:
From: Yang Wang
enable amdgpu smu driver message log.
Signed-off-by: Yang Wang
---
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
On 1/18/2024 7:54 AM, Ma, Jun wrote:
Hi Lijo,
On 1/17/2024 5:41 PM, Lazar, Lijo wrote:
On 1/17/2024 2:22 PM, Ma Jun wrote:
The power source flag should be updated when
[1] System receives an interrupt indicating that the power source
has changed.
[2] System resumes from suspend or runtime
On 1/17/2024 2:22 PM, Ma Jun wrote:
The power source flag should be updated when
[1] System receives an interrupt indicating that the power source
has changed.
[2] System resumes from suspend or runtime suspend
Signed-off-by: Ma Jun
---
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 24
On 1/16/2024 4:32 PM, Yang Wang wrote:
fix array index out of bounds issue for ras_block_string[] array.
Fixes: 2e3675fe4e3ee ("drm/amdgpu: Align ras block enum with firmware")
Signed-off-by: Yang Wang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 -
1 file changed, 4 insertions(+),
On 1/9/2024 6:30 PM, Le Ma wrote:
Use debug_mask=0x8 param to help isolate data path issues
on new systems in early phase.
v2: rename the flag for explicitness (lijo)
Signed-off-by: Le Ma
Series is
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h
On 1/9/2024 3:43 PM, Le Ma wrote:
Use debug_mask=0x8 param to help isolate data path issues
on new systems in early phase.
Signed-off-by: Le Ma
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++
On 1/8/2024 4:27 PM, Asad Kamal wrote:
Re-evaluate the original workaround: commit f5c7e7797060 ("drm/amdgpu:
Adjust removal control flow for smu v13_0_2")
This reverts commit 2e8e792e6a51e8cb7f5f96148146b6525dbb9cef.
Signed-off-by: Asad Kamal
You may reword the commit message as 'revert
On 1/8/2024 1:51 PM, Christian König wrote:
Am 08.01.24 um 09:13 schrieb Kamal, Asad:
[AMD Official Use Only - General]
Hi Christian,
Thank you for the comment.
This is not normal reset, it is reset done during unload for smu
v_13_0_2.
Yeah, but this doesn't explain the rationale for this.
On 1/5/2024 8:51 PM, Asad Kamal wrote:
In certain special cases, e.g device reset before module
unload, irq gets disabled as part of reset sequence and
won't get enabled back. Add special check to cover such scenarios
Signed-off-by: Asad Kamal
Suggested-by: Lijo Lazar
Please also add the
On 12/22/2023 10:52 PM, Asad Kamal wrote:
Expose sysfs entry mem_busy_percent for GC version
9.4.3 APU system
Signed-off-by: Asad Kamal
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/pm/amdgpu_pm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git
On 12/21/2023 11:35 AM, Stanley.Yang wrote:
The ecc_irq is disabled during the GPU mode2 reset suspend process,
but is not enabled during the GPU mode2 reset resume process.
Changed from V1:
only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip
delete amdgpu_ras_late_resume
On 12/21/2023 9:13 PM, Srinivasan Shanmugam wrote:
In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.
Using the function release_firmware() to release adev->pm.fw.
Thus fixing the below:
On 12/22/2023 9:44 AM, Srinivasan Shanmugam wrote:
Before using list_first_entry, make sure to check that list is not
empty.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1347
kfd_create_indirect_link_prop() warn: can 'gpu_link' even be NULL?
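The empty-list check described above can be sketched in plain C. This is only an illustration, not the code from the patch: the list helpers below are simplified user-space stand-ins for the kernel's list macros, and `struct example_link` and `first_link_or_null()` are hypothetical names.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's doubly linked list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

/* An empty list is a head whose next pointer refers back to itself. */
static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical element type, used only for this illustration. */
struct example_link {
	int id;
	struct list_head node;
};

/*
 * On an empty list, list_first_entry() would compute a pointer from the
 * head itself, which is not a valid element. Check for emptiness first.
 */
static struct example_link *first_link_or_null(struct list_head *head)
{
	if (list_empty(head))
		return NULL;
	return list_first_entry(head, struct example_link, node);
}
```

In kernel code the same pattern is available as the list_first_entry_or_null() macro from include/linux/list.h.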
On 12/21/2023 8:12 AM, Srinivasan Shanmugam wrote:
Doing a bitwise AND between a bool and an int is generally not a good
idea. The bool will be promoted to an int with value 0 or 1, the int is
generally regarded as true with a non-zero value, thus ANDing them using
bitwise has the potential to
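A minimal sketch of the promotion issue described above, assuming a hypothetical flag value (EXAMPLE_FLAG) and helper names that do not come from the patch:

```c
#include <stdbool.h>

/* Hypothetical flag value chosen for illustration; bit 0 is not set. */
#define EXAMPLE_FLAG 0x4

/*
 * Bitwise AND: the bool is promoted to an int with value 0 or 1, so only
 * bit 0 of flags can survive. A non-zero flags value such as 0x4 yields 0.
 */
static int check_bitwise(bool enabled, int flags)
{
	return enabled & flags;
}

/* Logical AND: yields 1 whenever both operands are non-zero. */
static int check_logical(bool enabled, int flags)
{
	return enabled && flags;
}
```

With enabled set to true and flags set to EXAMPLE_FLAG, check_bitwise() returns 0 while check_logical() returns 1, even though both operands are "true" in the usual C sense.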
On 12/16/2023 1:25 AM, Mario Limonciello wrote:
The SW CTF delayed work handler triggers a shutdown if a sensor
read failed for any reason.
The specific circumstance of a busy sensor should be retried
however to ensure that a good value can be returned.
Signed-off-by: Mario Limonciello
---
On 12/14/2023 10:15 PM, Mario Limonciello wrote:
The SW CTF handler assumes that the read_sensor() call always succeeds
and has updated `hotspot_tmp`, but this may not be guaranteed.
For example some of the read_sensor() callbacks will return 0 when a RAS
interrupt is triggered in which case
On 12/7/2023 1:09 AM, Dmitrii Galantsev wrote:
Fix pp_dpm_sclk_od and pp_dpm_mclk_od typos.
Those were defined as pp_*clk_od but used as pp_dpm_*clk_od instead.
This change removes the _dpm part.
Signed-off-by: Dmitrii Galantsev
Reviewed-by: Lijo Lazar
Add below tag also before
On 12/6/2023 7:40 PM, Alex Deucher wrote:
On Wed, Dec 6, 2023 at 7:57 AM Lijo Lazar wrote:
Replace direct usage of adev->ip_versions with amdgpu_ip_version.
Signed-off-by: Lijo Lazar
Reviewed-by: Alex Deucher
I see two more instances of direct use. Will send a v2.
Thanks,
Lijo
On 11/30/2023 4:17 PM, Ma, Jun wrote:
Hi Lijo,
On 11/30/2023 5:18 PM, Lazar, Lijo wrote:
On 11/30/2023 11:59 AM, Ma, Jun wrote:
Hi Alex,
On 11/30/2023 12:39 AM, Alex Deucher wrote:
On Wed, Nov 29, 2023 at 11:37 AM Ma Jun wrote:
Some platforms can't resume from d3cold state, so add
On 11/30/2023 11:59 AM, Ma, Jun wrote:
Hi Alex,
On 11/30/2023 12:39 AM, Alex Deucher wrote:
On Wed, Nov 29, 2023 at 11:37 AM Ma Jun wrote:
Some platforms can't resume from d3cold state, so add a
new module parameter to disable d3cold state for debugging
purpose or workaround.
Doesn't
On 11/30/2023 10:39 AM, Yang Wang wrote:
Use amdgpu_ip_version() helper function to check ip version.
The ip version contains other information,
use the helper function to avoid reading wrong value.
Signed-off-by: Yang Wang
May refine the subject to "Fix missing mca debugfs node"
On 11/29/2023 12:22 AM, Mario Limonciello wrote:
As the module parameter can be used to control behavior, all parts
of the driver should obey what has been programmed by user or
detected by auto mode rather than what "can" be supported.
This is also not correct. You can very well disable
On 11/29/2023 12:22 AM, Mario Limonciello wrote:
Rather than plumbing module parameter deep into IP declare BAMACO
runpm mode at amdgpu_driver_set_runtime_pm_mode() and then detect
this mode in consumers.
Signed-off-by: Mario Limonciello
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
On 11/29/2023 12:22 AM, Mario Limonciello wrote:
On products that support both BOCO and BACO it should be possible
to override the BOCO detection and force BACO by amdgpu.runpm=1 but
this doesn't work today.
Adjust the logic used in amdgpu_driver_load_kms() to make sure that
module
On 11/28/2023 9:51 PM, Mario Limonciello wrote:
Hi,
In amd-staging-drm-next 46fe6312082c ("drm/amdgpu: update retry times
for psp BL wait") and upstream a11156ff6f41 ("drm/amdgpu: update retry
times for psp BL wait") the number of loops for
psp_v13_0_wait_for_bootloader() to try again
On 11/28/2023 3:07 PM, Christian König wrote:
Am 27.11.23 um 22:55 schrieb Alex Deucher:
On Mon, Nov 27, 2023 at 2:22 PM Christian König
wrote:
Am 27.11.23 um 19:29 schrieb Lijo Lazar:
The return value is uninitialized if ras context is NULL.
Fixes: 0f4c8faa043c (drm/amdgpu: Move mca
On 11/24/2023 4:25 AM, Felix Kuehling wrote:
Make restore workers freezable so we don't have to explicitly flush them
in suspend and GPU reset code paths, and we don't accidentally try to
restore BOs while the GPU is suspended. Not having to flush restore_work
also helps avoid lock/fence
On 11/17/2023 10:50 AM, David Yat Sin wrote:
Fixes issue where user events of type KFD_EVENT_TYPE_HW_EXCEPTION do not
have valid data
Signed-off-by: David Yat Sin
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 4
1 file changed, 4
On 11/10/2023 1:25 PM, Lijo Lazar wrote:
Refactor code such that ras block decides the default mca debug mode,
and not swsmu block.
By default mca debug mode is set to false.
Signed-off-by: Lijo Lazar
---
v3: Default mca debug mode is set to false
v2: Set mca debug mode early before ras
On 11/16/2023 2:39 AM, Mario Limonciello wrote:
On 11/15/2023 11:04, Mario Limonciello wrote:
On 11/14/2023 21:23, Lazar, Lijo wrote:
On 11/15/2023 1:37 AM, Mario Limonciello wrote:
The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric
On 11/11/2023 4:04 AM, Mario Limonciello wrote:
When bandwidth limits are looked up using pcie_bandwidth_available()
virtual links such as USB4 are analyzed which might not represent the
real speed. Furthermore devices may change speeds autonomously which
may introduce conditional variation
On 11/15/2023 10:48 AM, Ma Jun wrote:
Set dpm_enabled flag to false only when dpms is
successfully disabled.
This is a software flag and we block many services based on this flag
status. I think the purpose of setting it early is to block other
service calls which could come in between. Did
On 11/15/2023 1:37 AM, Mario Limonciello wrote:
The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric
On 11/14/2023 2:25 PM, Asad Kamal wrote:
Update pmfw metric table to include pcie
instantaneous bandwidth & pcie error counters
Signed-off-by: Asad Kamal
Reviewed-by: Le Ma
Series is -
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
On 11/10/2023 8:18 PM, Christian König wrote:
Am 09.11.23 um 08:38 schrieb Lijo Lazar:
cancel_work is not backported to all custom kernels.
Well this is pretty clear NAK to pushing this upstream. We absolutely
can't add workaround for older kernels.
You could keep this in the backported
On 11/9/2023 1:08 PM, Lijo Lazar wrote:
cancel_work is not backported to all custom kernels. Add a workaround to
skip execution of already queued recovery jobs, if the device is already
reset.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
On 11/9/2023 2:11 PM, José Pekkarinen wrote:
The following patch will convert the power values returned by
amdgpu_hwmon_get_power to signed, fixing the following warnings reported
by coccinelle:
drivers/gpu/drm/amd/pm/amdgpu_pm.c:2801:5-8: WARNING: Unsigned expression compared
with zero:
On 11/10/2023 3:44 AM, Alex Deucher wrote:
Some chips provide both average and input power. Previously
we just exposed average power, add a new query for input
power.
Input looks like a misnomer (not the supply side, but the power
consumed). Better to rename to instantaneous or current
On 11/3/2023 8:26 PM, Victor Lu wrote:
The "rlcg_reg_access_supported" flag is missing. Add it back in.
Signed-off-by: Victor Lu
Series is
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 1 +
1 file changed, 1 insertion(+)
diff --git
On 11/7/2023 9:58 PM, Hunter Chasens wrote:
Resolves Sphinx unexpected indentation warning when compiling
documentation (e.g. `make htmldocs`). Replaces tabs with spaces and adds
a literal block to keep vertical formatting of the
example power state list.
Signed-off-by: Hunter Chasens
On 11/7/2023 1:47 AM, Hunter Chasens wrote:
Resolves Sphinx unexpected indentation warning when compiling
documentation (e.g. `make htmldocs`). Replaces tabs with spaces and adds
a literal block to keep vertical formatting of the
example power state list.
Signed-off-by: Hunter Chasens
---
On 11/6/2023 2:30 AM, Hunter Chasens wrote:
Resolves Sphinx unexpected indentation warning when compiling
documentation (e.g. `make htmldocs`). Replaces tabs with spaces and adds
a literal block to keep vertical formatting of the
example power state list.
Signed-off-by: Hunter Chasens
On 11/4/2023 12:37 AM, Mario Limonciello wrote:
The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric
On 11/2/2023 8:34 PM, Victor Lu wrote:
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0.
Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC
and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter.
Using amdgpu_sriov_runtime to determine whether to access
On 10/31/2023 8:20 AM, Yang Wang wrote:
correct following amdgpu ip block version information:
- gfx_v9_4_3
- sdma_v4_4_2
Signed-off-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2
On 10/31/2023 7:42 AM, Yang Wang wrote:
remove unused macro HW_REV
Signed-off-by: Yang Wang
Reviewed-by: Lijo Lazar
Thanks,
Lijo
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
On 10/28/2023 1:41 AM, Victor Lu wrote:
amdgpu_virt_kiq_reg_write_reg_wait is hardcoded to use MEC engine 0.
Add xcc_inst as a parameter to allow it to use different MEC engines.
v3: use first xcc for MMHUB in gmc_v9_0_flush_gpu_tlb
v2: rebase
Signed-off-by: Victor Lu
---
On 10/28/2023 1:36 AM, Victor Lu wrote:
The WREG32/RREG32_SOC15_IP_NO_KIQ call is using XCC0's RLCG interface
when programming other XCCs.
Add xcc instance parameter to them.
v3: xcc not needed for MMHUB
v2: rebase
Signed-off-by: Victor Lu
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
On 10/26/2023 2:22 AM, Victor Lu wrote:
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0.
Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC
and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter.
v3: use W/RREG32_XCC to handle non-kiq case
v2: define
On 10/26/2023 2:22 AM, Victor Lu wrote:
The WREG32/RREG32_SOC15_IP_NO_KIQ call is using XCC0's RLCG interface
when programming other XCCs.
Add xcc instance parameter to them.
v2: rebase
Signed-off-by: Victor Lu
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16
On 10/26/2023 2:22 AM, Victor Lu wrote:
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0.
Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC
and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter.
v3: use W/RREG32_XCC to handle non-kiq case
v2: define
On 10/4/2023 6:26 AM, Victor Lu wrote:
WREG32_RLC does not specify the correct XCC so the RLCG interface does
not work.
Define WREG32_RLC_XCC to be like WREG32_RLC but include a parameter to
specify the XCC.
v2: Add new macro WREG32_RLC_XCC instead of modifying existing WREG32_RLC
macro
[AMD Official Use Only - General]
Thanks,
Lijo
From: amd-gfx on behalf of Lijo Lazar
Sent: Friday, October 20, 2023 8:44:22 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Kasiviswanathan, Harish
; Zhang, Hawking
Subject: [PATCH] drm/amdgpu:
On 10/17/2023 9:58 AM, Asad Kamal wrote:
Add hive ras recovery check and propagate fatal
error to aids of all sockets in the hive
May be reword it as 'If one of the devices in the hive detects a fatal
error, need to send ras recovery reset message to PMFW of all devices in
the hive. For
[AMD Official Use Only - General]
Please ignore this patch as tOS is not loaded on VF and hence the path is not
taken.
Thanks,
Lijo
-----Original Message-----
From: amd-gfx On Behalf Of Lijo Lazar
Sent: Thursday, October 12, 2023 11:21 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher,
From: Lazar, Lijo
Sent: Wednesday, October 11, 2023 23:32
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander
Subject: [PATCH 3/3] drm/amd/pm: Add P2S tables for SMU v13.0.6
Add P2S table load support on SMU v13.0.6 ASICs.
Signed-off-by: Lijo Lazar
---
drivers/gpu/drm/
On 10/11/2023 2:55 PM, Asad Kamal wrote:
Expose ras table version & schema info to sysfs
v2: Updated schema to get poison support info
from ras context, removed asic specific checks
Signed-off-by: Asad Kamal
One nit inline. With/without that change,
Reviewed-by: Lijo Lazar
---