Re: Fwd: Kernel 5.11 crashes when it boots, it produces black screen.

2023-05-10 Thread Bagas Sanjaya
On 5/10/23 16:51, Linux regression tracking (Thorsten Leemhuis) wrote: > Bagas, thx for all your help with regression tracking, much appreciated > (side note, as I'm curious for a while already: what is your motivation? > Just want to help? But whatever, any help is great!). > I did this when I w

Re: [PATCH] drm/amdgpu: release correct lock in amdgpu_gfx_enable_kgq()

2023-05-10 Thread Alex Deucher
Applied. Thanks! Alex On Tue, May 9, 2023 at 10:32 AM Dan Carpenter wrote: > > This function was releasing the incorrect lock on the error path. > > Reported-by: kernel test robot > Fixes: 9bfa241d1289 ("drm/amdgpu: add [en/dis]able_kgq() functions") > Signed-off-by: Dan Carpenter > --- > The

RE: [PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in jpeg_v4_0_hw_fini

2023-05-10 Thread Zhang, Horatio
[AMD Official Use Only - General] Got it! Thanks, Horatio -Original Message- From: Zhang, Hawking Sent: Thursday, May 11, 2023 10:28 AM To: Zhang, Horatio ; Zhou1, Tao ; amd-gfx@lists.freedesktop.org Cc: Xu, Feifei ; Liu, Leo ; Jiang, Sonny ; Limonciello, Mario ; Liu, HaoPing (Alan)

RE: [PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in jpeg_v4_0_hw_fini

2023-05-10 Thread Zhang, Hawking
[AMD Official Use Only - General] Please register dedicated ras_irq src and funcs for UVD_POISON, which should allow you to create vcn ras sw calls like gfx/sdma ip block. Regards, Hawking -Original Message- From: Zhang, Horatio Sent: Wednesday, May 10, 2023 18:55 To: Zhang, Hawking ;

Re: [PATCH] drm/amd/amdgpu: Remove redundant else branch in amdgpu_encoders.c

2023-05-10 Thread Alex Deucher
On Tue, May 9, 2023 at 1:17 AM SHANMUGAM, SRINIVASAN wrote: > > [AMD Official Use Only - General] > > > > -Original Message- > From: Alex Deucher > Sent: Monday, May 8, 2023 9:27 PM > To: SHANMUGAM, SRINIVASAN > Cc: Koenig, Christian ; Deucher, Alexander > ; amd-gfx@lists.freedesktop.or

[PATCH] drm/amdgpu: change gfx 11.0.4 external_id range

2023-05-10 Thread Yifan Zhang
gfx 11.0.4 range starts from 0x80. Fixes: 311d52367d0a ("drm/amdgpu: add soc21 common ip block support for GC 11.0.4") Cc: sta...@vger.kernel.org Signed-off-by: Yifan Zhang Reported-by: Yogesh Mohan Marimuthu Acked-by: Alex Deucher Reviewed-by: Tim Huang --- drivers/gpu/drm/amd/amdgpu/soc21.

Re: [PATCH] drm/amd/amdgpu: Fix warnings in amdgpu _object, _ring.c

2023-05-10 Thread Alex Deucher
On Tue, May 9, 2023 at 10:03 AM Srinivasan Shanmugam wrote: > > Fix below warnings reported by checkpatch: > > WARNING: Prefer 'unsigned int' to bare use of 'unsigned' > WARNING: static const char * array should probably be static const char * > const > WARNING: space prohibited between function

RE: [PATCH] drm/amdgpu: change gfx 11.0.4 external_id range

2023-05-10 Thread Huang, Tim
[AMD Official Use Only - General] This patch is Reviewed-by: Tim Huang Best Regards, Tim Huang -Original Message- From: Zhang, Yifan Sent: Wednesday, May 10, 2023 4:38 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Tim ; Du, Xiaojian ; Limonciello, Mario ; Mo

[PATCH 5/5] drm/amdgpu: add check for RAS instance mask

2023-05-10 Thread Alex Deucher
From: Tao Zhou The mask is only needed to be set when RAS block instance number is more than 1 and invalid bits should be also masked out. We only check valid bits for GFX and SDMA block for now, and will add check for other RAS blocks in the future. v2: move the check under injection operation

[PATCH 3/5] drm/amdgpu: reorganize RAS injection flow

2023-05-10 Thread Alex Deucher
From: Tao Zhou So GFX RAS injection could use default function if it doesn't define its own injection interface. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang Reviewed-by: Stanley.Yang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 13 ++--- 1 file ch

[PATCH 2/5] drm/amdgpu: add instance mask for RAS inject

2023-05-10 Thread Alex Deucher
From: Tao Zhou User can specify injected instances by the mask. For backward compatibility, the mask value is incorporated into sub block index without interface change of RAS TA. User uses logical mask and driver should convert it to physical value before sending it to RAS TA. v2: update parame

[PATCH 4/5] drm/amdgpu: remove RAS GFX injection for gfx_v9_4/gfx_v9_4_2

2023-05-10 Thread Alex Deucher
From: Tao Zhou No special requirement in RAS injection for the two versions, switch to use default injection interface. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang Reviewed-by: Stanley.Yang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c | 24 -

[PATCH 1/5] drm/amdgpu: convert logical instance mask to physical one

2023-05-10 Thread Alex Deucher
From: Tao Zhou Convert instance mask for the convenience of RAS TA. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang Reviewed-by: Stanley.Yang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 6 -- .../drm/amd/amdgpu/aqua_vanjaram_reg_init.c| 18 ++

[PATCH] drm/amdgpu: Enable IH CAM on GFX9.4.3

2023-05-10 Thread Alex Deucher
From: Mukul Joshi This patch enables IH CAM on GFX9.4.3 ASIC. Signed-off-by: Mukul Joshi Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 3 ++- 2 files changed, 4 insertions(+), 3 deletions(

[PATCH 27/29] drm/amdgpu: route ioctls on primary node of XCPs to primary device

2023-05-10 Thread Alex Deucher
From: Shiwu Zhang During XCP init, unlike the primary device, there is no amdgpu_device attached to each XCP's drm_device. In case a user tries to open/close the primary node of an XCP drm_device, this rerouting solves the NULL pointer issue caused by referring to any member of the amdgpu_d

[PATCH 29/29] drm/amdgpu: Correct get_xcp_mem_id calculation

2023-05-10 Thread Alex Deucher
From: Philip Yang Current calculation only works for NPS4/QPX mode, correct it for NPS4/CPX mode. Signed-off-by: Philip Yang Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/aqua_vanjaram_reg_init.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)

[PATCH 18/29] drm/amdkfd: Update MTYPE for far memory partition

2023-05-10 Thread Alex Deucher
From: Philip Yang Use MTYPE RW/MTYPE_CC for mapping system memory or VRAM to KFD node within the same memory partition, use MTYPE_NC for mapping on KFD node from the far memory partition of the same socket or from another socket on same XGMI hive. On NPS4 or 4P system, MTYPE will be overridden p

[PATCH 24/29] drm/amdkfd: Move local_mem_info to kfd_node

2023-05-10 Thread Alex Deucher
From: Mukul Joshi We need to track memory usage on a per partition basis. To do that, store the local memory information in KFD node instead of kfd device. v2: squash in fix ("amdkfd: Use mem_id to access mem_partition info") Signed-off-by: Mukul Joshi Reviewed-by: Felix Kuehling Signed-off-b

[PATCH 23/29] drm/amdgpu: use xcp partition ID for amdgpu_gem

2023-05-10 Thread Alex Deucher
From: James Zhu Find xcp_id from amdgpu_fpriv, use it for amdgpu_gem_object_create. Signed-off-by: James Zhu Acked-by: Felix Kuehling Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gp

[PATCH 19/29] drm/amdgpu: Alloc page table on correct memory partition

2023-05-10 Thread Alex Deucher
From: Philip Yang Alloc kernel mode page table bo uses the amdgpu_vm->mem_id + 1 as bp mem_id_plus1 parameter. For APU mode, select the correct TTM pool to alloc page from the corresponding memory partition, this will be the closest NUMA node. For dGPU mode, select the correct address range for v

[PATCH 15/29] drm/amdkfd: Alloc memory of GPU support memory partition

2023-05-10 Thread Alex Deucher
From: Philip Yang For dGPU mode VRAM allocation, create amdgpu_bo from amdgpu_vm->mem_id, to alloc from the correct memory range. For APU mode VRAM allocation, set alloc domain to GTT, and set bp->mem_id_plus1 from amdgpu_vm->mem_id + 1 to create amdgpu_bo, to allocate system memory from correct

[PATCH 25/29] drm/amdkfd: Fix memory reporting on GFX 9.4.3

2023-05-10 Thread Alex Deucher
From: Mukul Joshi This patch fixes memory reporting on the GFX 9.4.3 APU and dGPU by reporting available memory on a per partition basis. If it's an APU, available and used memory calculations take into account system and TTM memory. v2: squash in fix ("drm/amdkfd: Fix array out of bound warning"

[PATCH 22/29] drm/amdgpu: KFD graphics interop support compute partition

2023-05-10 Thread Alex Deucher
From: Philip Yang kfd_ioctl_get_dmabuf use the amdgpu bo xcp_id to get the gpu_id of the KFD node from the exported dmabuf_adev, and then create kfd bo on the correct adev and KFD node when importing the amdgpu bo to KFD. Remove function kfd_device_by_adev, it is not needed as it is the same res

[PATCH 21/29] drm/amdkfd: Store xcp partition id to amdgpu bo

2023-05-10 Thread Alex Deucher
From: Philip Yang For memory accounting per compute partition and export drm amdgpu bo and then import to KFD, we need the xcp id to account the memory usage or find the KFD node of the original amdgpu bo to create the KFD bo on the correct adev KFD node. Set xcp_id_plus1 of amdgpu_bo_param to c

[PATCH 17/29] drm/amdgpu: dGPU mode placement support memory partition

2023-05-10 Thread Alex Deucher
From: Philip Yang dGPU mode uses VRAM manager to validate bo, amdgpu bo placement use the mem_id to get the allocation range first, last page frame number from xcp manager, pass to drm buddy allocator as the allowed range. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by:

[PATCH 26/29] drm/amdkfd: APU mode set max svm range pages

2023-05-10 Thread Alex Deucher
From: Philip Yang svm_migrate_init sets the max svm range pages based on the KFD nodes partition size. APU mode doesn't init pgmap because there is no migration. kgd2kfd_device_init calls svm_migrate_init after KFD nodes allocation and initialization. Signed-off-by: Philip Yang Reviewed-by: Felix

[PATCH 28/29] drm/amdkfd: Refactor migrate init to support partition switch

2023-05-10 Thread Alex Deucher
From: Philip Yang Rename svm_migrate_init to a better name, kgd2kfd_init_zone_device, because it sets up the zone device pgmap for page migration, and keep it in kfd_migrate.c to access static functions svm_migrate_pgmap_ops. Call it only once in amdgpu_device_ip_init after adev ip blocks are initialized,

[PATCH 16/29] drm/amdkfd: SVM range allocation support memory partition

2023-05-10 Thread Alex Deucher
From: Philip Yang Pass kfd node->xcp->mem_id to amdgpu bo create parameter mem_id_plus1 to allocate new svm_bo on the specified memory partition. This is only for dGPU mode as we don't migrate with APU mode. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher -

[PATCH 14/29] drm/amdgpu: Add memory partition mem_id to amdgpu_bo

2023-05-10 Thread Alex Deucher
From: Philip Yang Add mem_id_plus1 parameter to amdgpu_gem_object_create and pass it to amdgpu_bo_create. For dGPU mode allocation, mem_id is used by VRAM manager to get the memory partition fpfn, lpfn from xcp manager. For APU native mode allocation, mem_id is used to get NUMA node id from xcp m

[PATCH 20/29] drm/amdgpu: dGPU mode set VRAM range lpfn as exclusive

2023-05-10 Thread Alex Deucher
From: Philip Yang TTM place lpfn is exclusive, used as the end (start + size) in drm and the buddy allocator; the adev->gmc memory partition range lpfn is inclusive (start + size - 1), so add 1 when setting the TTM place lpfn. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher -
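
The inclusive-to-exclusive conversion described in this patch summary can be sketched as follows. This is only an illustration of the off-by-one being fixed, not the driver's code: the struct and function names (mem_range_inclusive, place_exclusive, to_place) are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Hypothetical types for illustration; not the driver's actual structures. */
struct mem_range_inclusive {
	uint64_t fpfn; /* first page frame number */
	uint64_t lpfn; /* last page frame number, inclusive: start + size - 1 */
};

struct place_exclusive {
	uint64_t fpfn; /* first page frame number */
	uint64_t lpfn; /* exclusive end: start + size */
};

/* Convert an inclusive partition range into an exclusive placement range. */
static struct place_exclusive to_place(struct mem_range_inclusive r)
{
	struct place_exclusive p = { .fpfn = r.fpfn, .lpfn = r.lpfn + 1 };

	return p;
}
```

With this convention, lpfn - fpfn of the converted range equals the number of pages in the partition, which is what an allocator taking an exclusive end expects.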

[PATCH 07/29] drm/amdgpu: add partition schedule for GC(9, 4, 3)

2023-05-10 Thread Alex Deucher
From: James Zhu Implement partition schedule for GC(9, 4, 3). Signed-off-by: James Zhu Acked-by: Lijo Lazar Signed-off-by: Alex Deucher --- .../drm/amd/amdgpu/aqua_vanjaram_reg_init.c | 41 +++ 1 file changed, 41 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/aqua_v

[PATCH 05/29] drm/amdgpu: add partition scheduler list update

2023-05-10 Thread Alex Deucher
From: James Zhu Add partition scheduler list update in late init and xcp partition mode switch. Signed-off-by: James Zhu Acked-by: Lijo Lazar Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c | 2 + .../drm/am

[PATCH 08/29] drm/amdgpu: run partition schedule if it is supported

2023-05-10 Thread Alex Deucher
From: James Zhu Run partition schedule if it is supported during ctx init entity. Signed-off-by: James Zhu Acked-by: Lijo Lazar Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/drivers

[PATCH 09/29] drm/amdgpu: update ref_cnt before ctx free

2023-05-10 Thread Alex Deucher
From: James Zhu Update ref_cnt before ctx free. Signed-off-by: James Zhu Acked-by: Lijo Lazar Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 7 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c | 16 drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.h | 2 ++

[PATCH 06/29] drm/amdgpu: keep amdgpu_ctx_mgr in ctx structure

2023-05-10 Thread Alex Deucher
From: James Zhu Keep amdgpu_ctx_mgr in ctx structure to track fpriv. Signed-off-by: James Zhu Acked-by: Lijo Lazar Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 + 2 files changed, 2 insertions(+) diff --git a/driv

[PATCH 11/29] drm/amdkfd: Store drm node minor number for kfd nodes

2023-05-10 Thread Alex Deucher
From: Philip Yang From KFD topology, application will find kfd node with the corresponding drm device node minor number, for example if partition drm node starts from /dev/dri/renderD129, then KFD node 0 will store drm node minor number 129. Application will open drm node /dev/dri/renderD129 to

[PATCH 13/29] drm/amdkfd: Show KFD node memory partition info

2023-05-10 Thread Alex Deucher
From: Philip Yang Show KFD node memory partition id and size, add helper function KFD_XCP_MEMORY_SIZE to get kfd node memory size, will be used later to support memory accounting per partition. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher --- drivers/gpu

[PATCH 12/29] drm/amdgpu: Add memory partition id to amdgpu_vm

2023-05-10 Thread Alex Deucher
From: Philip Yang If xcp_mgr is initialized, add mem_id to amdgpu_vm structure to store memory partition number when creating amdgpu_vm for the xcp. The xcp number is decided when opening the render device, for example /dev/dri/renderD129 is xcp_id 0, /dev/dri/renderD130 is xcp_id 1. Signed-off-b
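
The renderD129/renderD130 example in this summary amounts to a mapping from render minor to partition ID. A minimal sketch of that mapping follows, assuming the partition render nodes use consecutive minors starting at a known first minor. That assumption, the helper name xcp_id_from_render_minor, and its parameters are all hypothetical; the driver's real lookup may work differently.

```c
/*
 * Hypothetical helper for illustration; not the driver's actual lookup.
 * Assumes partition render nodes use consecutive minors starting at
 * first_minor (129 in the example from the patch summary).
 */
static int xcp_id_from_render_minor(int minor, int first_minor, int num_xcp)
{
	int id = minor - first_minor;

	if (id < 0 || id >= num_xcp)
		return -1; /* not one of the partition nodes */
	return id;
}
```

Under these assumptions, minor 129 yields xcp_id 0 and minor 130 yields xcp_id 1, matching the example in the patch description.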

[PATCH 10/29] drm/amdgpu: Add xcp manager num_xcp_per_mem_partition

2023-05-10 Thread Alex Deucher
From: Philip Yang Used by KFD to check memory limit accounting. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.h | 3 +++ 2 files changed, 4 insertions(+) diff --git

[PATCH 03/29] drm/amdgpu: add partition ID track in ring

2023-05-10 Thread Alex Deucher
From: James Zhu Keep track partition ID in ring. Signed-off-by: James Zhu Acked-by: Lijo Lazar Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 + .../drm/amd/amdgpu/aqua_vanjaram_reg_init.c | 41 +++ 2 files changed, 42 insertions(+) diff

[PATCH 02/29] drm/amdgpu: find partition ID when open device

2023-05-10 Thread Alex Deucher
From: James Zhu Find partition ID when open device from render device minor. Signed-off-by: Christian König Signed-off-by: James Zhu Reviewed-and-tested-by: Philip Yang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |

[PATCH 04/29] drm/amdgpu: update header to support partition scheduling

2023-05-10 Thread Alex Deucher
From: James Zhu Update header to support partition scheduling. Signed-off-by: James Zhu Acked-by: Lijo Lazar Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.h | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.h b

[PATCH 01/29] drm/amdgpu: support partition drm devices

2023-05-10 Thread Alex Deucher
From: James Zhu Support partition drm devices on GC_HWIP IP_VERSION(9, 4, 3). This is a temporary solution and will be superseded. Signed-off-by: Christian König Signed-off-by: James Zhu Reviewed-and-tested-by: Philip Yang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h

Re: [PATCH 10/66] drm/amd/display: Do not set drr on pipe commit

2023-05-10 Thread Aurabindo Pillai
On 5/10/23 09:20, Michel Dänzer wrote: > On 5/9/23 23:07, Pillai, Aurabindo wrote: >> >> Sorry - the firmware in the previous message is for DCN32. For Navi2x, >> please use the firmware attached here. > > Same problem (contents of /sys/kernel/debug/dri/0/amdgpu_firmware_info below). > > Even

[PATCH 07/10] drm/amd/display: Make unbounded req update separate from dlg/ttu

2023-05-10 Thread Aurabindo Pillai
From: Alvin Lee [Description] - Updates to unbounded requesting should not be conditional on updates to dlg / ttu, as this could prevent unbounded requesting from being updated if dlg / ttu does not change Reviewed-by: Jun Lei Acked-by: Aurabindo Pillai Signed-off-by: Alvin Lee --- drive

[PATCH 09/10] drm/amd/display: Remove v_startup workaround for dcn3+

2023-05-10 Thread Aurabindo Pillai
From: Daniel Miess [Why] Calls to dcn20_adjust_freesync_v_startup are no longer needed as of dcn3+ and can cause underflow in some cases [How] Move calls to dcn20_adjust_freesync_v_startup up into validate_bandwidth for dcn2.x Reviewed-by: Jun Lei Acked-by: Aurabindo Pillai Signed-off-by: Dan

[PATCH 10/10] drm/amd/display: 3.2.236

2023-05-10 Thread Aurabindo Pillai
From: Aric Cyr Acked-by: Aurabindo Pillai Signed-off-by: Aric Cyr --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 8be2e6d6d888..2dff1a5cf3b1 100644 --- a

[PATCH 08/10] drm/amd/display: Remove unnecessary variable

2023-05-10 Thread Aurabindo Pillai
From: Rodrigo Siqueira There is no need to use dc_version in the dc_construct_ctx since this value is copied to dc_ctx->dce_version later. This commit removes the extra steps. Reviewed-by: Alex Hung Acked-by: Aurabindo Pillai Signed-off-by: Rodrigo Siqueira --- drivers/gpu/drm/amd/display/dc

[PATCH 06/10] drm/amd/display: Add visual confirm color support for MCLK switch

2023-05-10 Thread Aurabindo Pillai
From: "Leo (Hanghong) Ma" [Why && How] We would like to have visual confirm color support for MCLK switch. 1. Set visual confirm color to yellow: Vblank MCLK switch. 2. Set visual confirm color to cyan: FPO + Vblank MCLK switch. 3. Set visual confirm color to pink:

[PATCH 05/10] drm/amd/display: Fix possible underflow for displays with large vblank

2023-05-10 Thread Aurabindo Pillai
From: Daniel Miess [Why] Underflow observed when using a display with a large vblank region and low refresh rate [How] Simplify calculation of vblank_nom Increase value for VBlankNomDefaultUS to 800us Reviewed-by: Jun Lei Acked-by: Aurabindo Pillai Signed-off-by: Daniel Miess --- .../amd/d

[PATCH 04/10] drm/amd/display: Convert connector signal id to string

2023-05-10 Thread Aurabindo Pillai
From: Rodrigo Siqueira To improve the readability of the log, this commit introduces a function that converts the signal type id to a human-readable string. Reviewed-by: Jerry Zuo Acked-by: Aurabindo Pillai Signed-off-by: Rodrigo Siqueira --- .../drm/amd/display/dc/link/link_factory.c

[PATCH 01/10] drm/amd/display: enable dpia validate

2023-05-10 Thread Aurabindo Pillai
From: Mustapha Ghaddar Use dpia_validate_usb4_bw() function Fixes: 6d86146dd62f ("drm/amd/display: Add function pointer for validate bw usb4") Reviewed-by: Roman Li Reviewed-by: Meenakshikumar Somasundaram Acked-by: Aurabindo Pillai Signed-off-by: Mustapha Ghaddar --- drivers/gpu/drm/amd/d

[PATCH 03/10] drm/amd/display: Update vactive margin and max vblank for fpo + vactive

2023-05-10 Thread Aurabindo Pillai
From: Alvin Lee [Description] - Some 1920x1080@60hz displays have VBLANK time > 600us which we still want to accept for FPO + Vactive configs based on testing - Increase max VBLANK time to 1000us to allow these configs for FPO + Vactive - Increase minimum vactive switch margin

[PATCH 02/10] drm/amd/display: Only skip update for DCFCLK, UCLK, FCLK on overclock

2023-05-10 Thread Aurabindo Pillai
From: Alvin Lee [Description] - Update clocks is skipped in the GPU overclock sequence - However, we still need to update DISPCLK, DPPCLK, and DTBCLK because the GPU overclock sequence could temporarily disable ODM 2:1 combine because we disable all planes in the sequence Reviewed-by: Jun Le

[PATCH 00/10] DC Patches for 15 May 2023

2023-05-10 Thread Aurabindo Pillai
This DC patchset brings improvements in multiple areas. In summary, we highlight: * DC v3.2.236 * Fixes related to DCN clock sequencing * Changes to FPO acceptance heuristics for various modelines * Dmesg log readability, visual debug improvements and various bug fixes. Cc: Daniel Wheeler --- A

Re: [RFC PATCH 0/4] Add support for DRM cgroup memory accounting.

2023-05-10 Thread Tejun Heo
Hello, On Wed, May 10, 2023 at 04:59:01PM +0200, Maarten Lankhorst wrote: > The misc controller is not granular enough. A single computer may have any > number of > graphics cards, some of them with multiple regions of vram inside a single > card. Extending the misc controller to support dynami

[PATCH 6/6] drm/amdgpu/bu: update mtype_local parameter settings

2023-05-10 Thread Alex Deucher
From: Graham Sider Update mtype_local module parameter to use MTYPE_RW by default. 0: MTYPE_RW (default) 1: MTYPE_NC 2: MTYPE_CC Signed-off-by: Graham Sider Reviewed-by: Harish Kasiviswanathan Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +- drivers/gpu/drm/a

[PATCH 5/6] drm/amdgpu/bu: add mtype_local as a module parameter

2023-05-10 Thread Alex Deucher
From: David Francis Selects the MTYPE to be used for local memory, (0 = MTYPE_CC (default), 1 = MTYPE_NC, 2 = MTYPE_RW) This change is for internal testing only - do not upstream. v2: squash in build fix (Alex) Reviewed-by: Graham Sider Signed-off-by: David Francis Signed-off-by: Alex Deuche

[PATCH 3/6] drm/amdgpu: Fix per-BO MTYPE selection for GFXv9.4.3

2023-05-10 Thread Alex Deucher
From: Felix Kuehling Treat system memory on NUMA systems as remote by default. Overriding with a more efficient MTYPE per page will be implemented in the next patch. No need for a special case for APP APUs. System memory is handled the same for carve-out and native mode. And VRAM doesn't exist i

[PATCH 4/6] drm/amdgpu: Override MTYPE per page on GFXv9.4.3 APUs

2023-05-10 Thread Alex Deucher
From: Felix Kuehling On GFXv9.4.3 NUMA APUs, system memory locality must be determined per page to choose the correct MTYPE. This patch adds a GMC callback that can provide this per-page override and implements it for native mode. Carve-out mode is not yet supported and will use the safe default

[PATCH 2/6] drm/amdgpu/bu: Add use_mtype_cc_wa module param

2023-05-10 Thread Alex Deucher
From: Graham Sider By default, set use_mtype_cc_wa to 1 to set PTE coherence flag MTYPE_CC instead of MTYPE_RW by default. This is required for the time being to mitigate a bug causing XCCs to hit stale data due to TCC marking fully dirty lines as exclusive. Signed-off-by: Graham Sider Reviewed

[PATCH 1/6] drm/amdgpu/bu: Use legacy TLB flush for gfx943

2023-05-10 Thread Alex Deucher
From: Graham Sider Invalidate TLBs via a legacy flush request (flush_type=0) prior to the heavyweight flush requests (flush_type=2) in gmc_v9_0.c. This is temporarily required to mitigate a bug causing CPC UTCL1 to return stale translations after invalidation requests in address range mode. Sign

[linux-next:master] BUILD SUCCESS WITH WARNING 578215f3e21c472c08d70b8796edf1ac58f88578

2023-05-10 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 578215f3e21c472c08d70b8796edf1ac58f88578 Add linux-next specific files for 20230510 Warning reports: https://lore.kernel.org/oe-kbuild-all/202304140707.coh337ux-...@intel.com Warning

Re: [PATCH] drm/amdkfd: Remove skiping userptr buffer mapping when mmu notifier marks it as invalid

2023-05-10 Thread Alex Deucher
On Wed, May 10, 2023 at 11:00 AM Felix Kuehling wrote: > > On 2023-05-09 at 18:17, Alex Deucher wrote: > > From: Xiaogang Chen > > > > mmu notifier does not always hold mm->sem during call back. That causes > > a race condition between kfd userptr buffer mapping and mmu notifier > > which leads t

Re: [PATCH] drm/amdkfd: Remove skiping userptr buffer mapping when mmu notifier marks it as invalid

2023-05-10 Thread Felix Kuehling
On 2023-05-09 at 18:17, Alex Deucher wrote: From: Xiaogang Chen mmu notifier does not always hold mm->sem during call back. That causes a race condition between kfd userptr buffer mapping and mmu notifier which leads to gpu shader or SDMA access userptr buffer before it has been mapped to gpu

Re: [RFC PATCH 0/4] Add support for DRM cgroup memory accounting.

2023-05-10 Thread Maarten Lankhorst
Hey, On 2023-05-05 21:50, Tejun Heo wrote: Hello, On Wed, May 03, 2023 at 10:34:56AM +0200, Maarten Lankhorst wrote: RFC as I'm looking for comments. For long running compute, it can be beneficial to partition the GPU memory between cgroups, so each cgroup can use its maximum amount of memory

Re: [PATCH] drm/sched: Check scheduler work queue before calling timeout handling

2023-05-10 Thread Luben Tuikov
On 2023-05-10 10:24, vitaly prosyak wrote: > > On 2023-05-10 10:19, Luben Tuikov wrote: >> On 2023-05-10 09:51, vitaly.pros...@amd.com wrote: >>> From: Vitaly Prosyak >>> >>> During an IGT GPU reset test we see again oops despite of >>> commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready befo

Re: [PATCH] drm/sched: Check scheduler work queue before calling timeout handling

2023-05-10 Thread vitaly prosyak
On 2023-05-10 10:19, Luben Tuikov wrote: > On 2023-05-10 09:51, vitaly.pros...@amd.com wrote: >> From: Vitaly Prosyak >> >> During an IGT GPU reset test we see again oops despite of >> commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling >> timeout handling). >> >> It uses read

Re: [PATCH] drm/sched: Check scheduler work queue before calling timeout handling

2023-05-10 Thread Luben Tuikov
On 2023-05-10 09:51, vitaly.pros...@amd.com wrote: > From: Vitaly Prosyak > > During an IGT GPU reset test we see again oops despite of > commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling > timeout handling). > > It uses ready condition whether to call drm_sched_fault which

[PATCH] drm/sched: Check scheduler work queue before calling timeout handling

2023-05-10 Thread vitaly.prosyak
From: Vitaly Prosyak During an IGT GPU reset test we see again oops despite of commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling timeout handling). It uses ready condition whether to call drm_sched_fault which unwind the TDR leads to GPU reset. However it looks the ready con

[PATCH] drm/sched: Check scheduler work queue before calling timeout handling

2023-05-10 Thread vitaly.prosyak
From: Vitaly Prosyak During an IGT GPU reset test we see again oops despite of commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling timeout handling). It uses ready condition whether to call drm_sched_fault which unwind the TDR leads to GPU reset. However it looks the ready con

Re: [PATCH 10/66] drm/amd/display: Do not set drr on pipe commit

2023-05-10 Thread Michel Dänzer
On 5/9/23 23:07, Pillai, Aurabindo wrote: > > Sorry - the firmware in the previous message is for DCN32. For Navi2x, please > use the firmware attached here. Same problem (contents of /sys/kernel/debug/dri/0/amdgpu_firmware_info below). Even if it did work with newer FW, the kernel must keep wo

Re: [PATCH] drm/amdgpu: change gfx 11.0.4 external_id range

2023-05-10 Thread Alex Deucher
On Wed, May 10, 2023 at 4:38 AM Yifan Zhang wrote: > > gfx 11.0.4 range starts from 0x80. > > Fixes: 311d52367d0a ("drm/amdgpu: add soc21 common ip block support for GC > 11.0.4") > > Signed-off-by: Yifan Zhang Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/soc21.c | 2 +- > 1 fil

Fwd: Kernel 5.11 crashes when it boots, it produces black screen.

2023-05-10 Thread Bagas Sanjaya
Hi, I noticed a regression report on Bugzilla ([1]). As many developers don't have a look on it, I decided to forward it by email. See the report for the full thread. Quoting from the report: > Azamat S. Kalimoulline 2021-04-06 15:45:08 UTC > > Same as in https://bugzilla.kernel.org/show_bug.c

Re: Fwd: Kernel 5.11 crashes when it boots, it produces black screen.

2023-05-10 Thread Linux regression tracking (Thorsten Leemhuis)
Hi! On 10.05.23 10:26, Bagas Sanjaya wrote: > > I noticed a regression report on Bugzilla ([1]). As many developers don't > have a look on it, I decided to forward it by email. See the report > for the full thread. > > Quoting from the report: > >> Azamat S. Kalimoulline 2021-04-06 15:45:08 UT

RE: [PATCH 1/2] drm/amdgpu: fix amdgpu_irq_put call trace in jpeg_v4_0_hw_fini

2023-05-10 Thread Zhang, Horatio
[AMD Official Use Only - General] Hi Hawking, When modprobe, the interrupt of jpeg/vcn was enabled in amdgpu_fence_driver_hw_init(). If the amdgpu_irq_get function is added in amdgpu_xxx_ras_late_init/xxx_v4_0_late_init, it will enable the instance interrupt twice. My previous modification pl

Re: [PATCH] drm/sched: Check scheduler work queue before calling timeout handling

2023-05-10 Thread Luben Tuikov
On 2023-05-09 17:43, vitaly.pros...@amd.com wrote: > From: Vitaly Prosyak > > During an IGT GPU reset test we see again oops despite of > commit 0c8c901aaaebc9bf8bf189ffc116e678f7a2dc16 > drm/sched: Check scheduler ready before calling timeout handling. You can probably use the more succinct fix

[PATCH] drm/amdgpu: change gfx 11.0.4 external_id range

2023-05-10 Thread Yifan Zhang
gfx 11.0.4 range starts from 0x80. Fixes: 311d52367d0a ("drm/amdgpu: add soc21 common ip block support for GC 11.0.4") Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/soc21.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/dri