[PATCH] drm/amdgpu: set start timestamp of fence in the right place

2024-07-08 Thread jiadong.zhu
From: Jiadong Zhu The job's embedded fence is a dma_fence, which shall not be converted to amdgpu_fence. The start timestamp shall be saved on the job for hw_fence. Signed-off-by: Jiadong Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 33 --- drivers/gpu/drm/amd/amdgpu/amdgpu

RE: [PATCH V2 2/2] drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed

2024-07-08 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] The series is: Reviewed-by: Tao Zhou > -Original Message- > From: Chai, Thomas > Sent: Tuesday, July 9, 2024 1:56 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhang, Hawking ; Zhou1, Tao > ; Li, Candice ; Wang, Yang(Kevin) > ; Ya

[PATCH v1 3/3] drm/amdgpu: select compute ME engines dynamically

2024-07-08 Thread Sunil Khatri
There is currently one GFX ME, but this could change in future SoCs. Use the number of GFX MEs as the starting ME index for compute on GFX12. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/g

[PATCH v1 1/3] drm/amdgpu: select compute ME engines dynamically

2024-07-08 Thread Sunil Khatri
There is currently one GFX ME, but this could change in future SoCs. Use the number of GFX MEs as the starting ME index for compute on GFX10. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/g

[PATCH v1 2/3] drm/amdgpu: select compute ME engines dynamically

2024-07-08 Thread Sunil Khatri
There is currently one GFX ME, but this could change in future SoCs. Use the number of GFX MEs as the starting ME index for compute on GFX11. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/g

[PATCH v1 0/3] num_me of gfx as start point for ME for compute

2024-07-08 Thread Sunil Khatri
To support future SoCs, which could have more than one ME for GFX. Sunil Khatri (3): drm/amdgpu: select compute ME engines dynamically drm/amdgpu: select compute ME engines dynamically drm/amdgpu: select compute ME engines dynamically drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- dr

[PATCH] drm/amdgpu: avoid repeatedly executing gpu ras reset

2024-07-08 Thread YiPeng Chai
When a gpu in hive is performing ras reset, other gpus in hive do not need to schedule recovery work to reset the gpu. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/a

[PATCH V2 1/2] drm/amdgpu: flush all cached ras bad pages to eeprom

2024-07-08 Thread YiPeng Chai
Before uninstalling gpu driver, flush all cached ras bad pages to eeprom. v2: Put the same code into a function and reuse the function. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 35 - 1 file changed, 29 insertions(+), 6 deletions(-) diff

[PATCH V2 2/2] drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed

2024-07-08 Thread YiPeng Chai
The problem case is as follows: 1. GPU A triggers a gpu ras reset, and GPU A drives GPU B to also perform a gpu ras reset. 2. After gpu B ras reset started, gpu B queried a DE data. Since the DE data was queried in the ras reset thread instead of the page retirement thread, bad page ret

[PATCH] drm/amdgpu: Initialize VF partition mode

2024-07-08 Thread Lijo Lazar
For SOCs with GFX v9.4.3, a VF may have multiple compute partitions. Fetch the partition information during init and initialize partition nodes. There is no support to switch partition mode in VF mode, hence disable the same. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h

RE: [PATCH] drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.

2024-07-08 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Gavin Wan Sent: Tuesday, July 9, 2024 10:27 To: amd-gfx@lists.freedesktop.org Cc: Wan, Gavin Subject: [PATCH] drm/amd/amdgpu: fix SDMA IRQ

Re: [PATCH] drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.

2024-07-08 Thread Wan, Gavin
[AMD Official Use Only - AMD Internal Distribution Only] Hi Hawking, Fixed the typo and a new email is sent. Thanks, Gavin From: Zhang, Hawking Sent: Monday, July 8, 2024 10:23 PM To: Wan, Gavin ; amd-gfx@lists.freedesktop.org Cc: Wan, Gavin Subject: RE: [PAT

[PATCH] drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.

2024-07-08 Thread Gavin Wan
SDMA has 2 instances in SRIOV CPX mode. Odd-numbered VFs have sdma0/sdma1 instances. Even-numbered VFs have sdma2/sdma3. For even-numbered VFs, sdma2 & sdma3 (irq source id CLIENTID_SDMA2 and CLIENTID_SDMA3) should map to irq seq 0 & 1. Signed-off-by: Gavin Wan Change-Id: Ie850114932ca98ea3c91

RE: [PATCH] drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.

2024-07-08 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Please correct the typo in description CLIENTID_SDMA2 and CLIENTID_SDMA2 With that fixed, the patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Gavin Wan Sent: Tuesday, July 9, 2

Re: Re:Proposal to add CRIU support to DRM render nodes

2024-07-08 Thread Felix Kuehling
On 2024-07-08 2:51, 周春明(日月) wrote: > > Hi Felix, > > When I learn CRIU you introduced in  > https://github.com/checkpoint-restore/criu/tree/criu-dev/plugins/amdgpu >  , > there is a sentence > "ROCm manages memory in the

[PATCH 2/2] drm/amdgpu/mes12: add missing opcode string

2024-07-08 Thread Alex Deucher
Fixes the indexing of the string array. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c index 106eef1ff5cc..c9f74231ad59 100644 --- a/driver

[PATCH 1/2] drm/amdgpu/mes11: update opcode strings

2024-07-08 Thread Alex Deucher
Add new packet. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 1376b6ff1b77..8ce51b9236c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/

[PATCH] drm/amdgpu: Restore uncache behaviour on GFX12

2024-07-08 Thread David Belanger
Always use MTYPE_UC if UNCACHED flag is specified. This makes kernarg region uncached and it restores usermode cache disable debug flag functionality. Do not set MTYPE_UC for COHERENT flag, on GFX12 coherence is handled by shader code. Signed-off-by: David Belanger --- drivers/gpu/drm/amd/amdg

[PATCH 2/2] drm/amd/display: use drm_crtc_set_vblank_offdelay()

2024-07-08 Thread Hamza Mahfooz
Hook up drm_crtc_set_vblank_offdelay() in amdgpu_dm, so that we can enable PSR more quickly for displays that support it. Signed-off-by: Hamza Mahfooz --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 30 ++- 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/driver

[PATCH 1/2] drm/vblank: allow dynamic per-crtc vblank off delay

2024-07-08 Thread Hamza Mahfooz
We would like to be able to adjust the vblank off delay dynamically for a given CRTC, since doing so will allow drivers to apply static screen optimizations more quickly and consequently allow users to benefit more from the power savings afforded by those optimizations. Signed-off-by: H

Re: [PATCH] drm/buddy: Add start address support to trim function

2024-07-08 Thread Alex Deucher
On Thu, Jul 4, 2024 at 4:40 AM Arunpravin Paneer Selvam wrote: > > - Add a new start parameter in trim function to specify exact > address from where to start the trimming. This would help us > in situations like if drivers would like to do address alignment > for specific requirements. > >

Re: [PATCH v3 6/6] drm/radeon: change drm_dev_alloc to devm_drm_dev_alloc

2024-07-08 Thread Alex Deucher
Applied the series. Thanks! Alex On Wed, Jul 3, 2024 at 4:55 AM Thomas Zimmermann wrote: > > > > Am 30.06.24 um 18:59 schrieb Wu Hoi Pok: > > "drm_dev_alloc" is deprecated, in order to use the newer > > "devm_drm_dev_alloc", > > the "drm_device" is stored inside "radeon_device", by changing >

[PATCH] drm/amdgpu/job: Replace DRM_INFO/ERROR logging

2024-07-08 Thread Alex Deucher
Use the dev_info/err variants so we get per device logging in multi-GPU cases. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gp

[PATCH] drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.

2024-07-08 Thread Gavin Wan
SDMA has 2 instances in SRIOV CPX mode. Odd-numbered VFs have sdma0/sdma1 instances. Even-numbered VFs have sdma2/sdma3. For even-numbered VFs, sdma2 & sdma3 (irq source id CLIENTID_SDMA2 and CLIENTID_SDMA2) should map to irq seq 0 & 1. Signed-off-by: Gavin Wan Change-Id: Ie850114932ca98ea3c91

Re: [PATCH] drm/amd/display: Allow display DCC for DCN401

2024-07-08 Thread Alex Deucher
On Mon, Jul 8, 2024 at 12:06 PM Aurabindo Pillai wrote: > > To enable mesa to use display dcc, DM should expose them in the > supported modifiers. Add the best (most efficient) modifiers first. > > Signed-off-by: Aurabindo Pillai Acked-by: Alex Deucher > --- > .../amd/display/amdgpu_dm/amdgpu

Re: [PATCH] drm/amdgpu/acpi: Add NULL check for event->device_class in amdgpu_atif_handler

2024-07-08 Thread Alex Deucher
On Tue, Jul 2, 2024 at 4:50 AM Srinivasan Shanmugam wrote: > > This commit addresses a NULL dereference issue in the > amdgpu_atif_handler function. > > The issue arises when event->device_class is NULL and is passed to the > DRM_DEBUG_DRIVER macro, which attempts to print the NULL string with the

[Patch v2] drm/ttm: Allow direct reclaim to allocate local memory

2024-07-08 Thread Rajneesh Bhardwaj
Limiting the allocation of higher order pages to the closest NUMA node and enabling direct memory reclaim provides not only a failsafe against situations when memory becomes too fragmented and the allocator is not able to satisfy the request from the local node but falls back to remote pages (HU

[PATCH] drm/amd/display: Allow display DCC for DCN401

2024-07-08 Thread Aurabindo Pillai
To enable mesa to use display dcc, DM should expose them in the supported modifiers. Add the best (most efficient) modifiers first. Signed-off-by: Aurabindo Pillai --- .../amd/display/amdgpu_dm/amdgpu_dm_plane.c | 31 +++ 1 file changed, 25 insertions(+), 6 deletions(-) diff -

RE: [PATCH] drm/amd/pm: Ignore initial value in smu response register

2024-07-08 Thread Slivka, Danijel
[Public] >-Original Message- >From: Lazar, Lijo >Sent: Monday, July 8, 2024 12:13 PM >To: Slivka, Danijel ; amd-gfx@lists.freedesktop.org >Cc: Slivka, Danijel >Subject: RE: [PATCH] drm/amd/pm: Ignore initial value in smu response register > >[Public] > >One problem is it's also bypassing

[PATCH v2] drm/amd/pm: Ignore initial value in smu response register

2024-07-08 Thread Danijel Slivka
Why: If the reg mmMP1_SMN_C2PMSG_90 is being written to during amdgpu driver load or driver unload, subsequent amdgpu driver load will fail at smu_hw_init. The default value of the mmMP1_SMN_C2PMSG_90 register in a clean environment is 0x1, and if the value differs from the expected one, amdgpu driver load will fail. Ho

RE: [PATCH] drm/amd/pm: Ignore initial value in smu response register

2024-07-08 Thread Lazar, Lijo
[Public] One problem is it's also bypassing a valid 0 response, which usually means FW may not have completed processing the previous message. What I thought was that it shouldn't even attempt sending a message if it identified a FW hang. Is there a possibility to have the same problem whenever t

[PATCH] drm/amd/pm: Ignore initial value in smu response register

2024-07-08 Thread Danijel Slivka
Why: If the reg mmMP1_SMN_C2PMSG_90 is being written to during amdgpu driver load or driver unload, subsequent amdgpu driver load will fail at smu_hw_init. The default value of the mmMP1_SMN_C2PMSG_90 register in a clean environment is 0x1, and if the value differs from the expected one, amdgpu driver load will fail. Ho