[PATCH v3 5/6] drm/ci: skip driver specific tests

2024-05-28 Thread Vignesh Raman
Skip driver specific tests and skip kms tests for panfrost driver since it is not a kms driver. Reviewed-by: Dmitry Baryshkov Signed-off-by: Vignesh Raman --- v2: - Skip xe tests for amdgpu and virtio. v3: - No changes. --- .../gpu/drm/ci/xfails/amdgpu-stoney-skips.txt | 15

[PATCH v3 4/6] drm/ci: uprev IGT

2024-05-28 Thread Vignesh Raman
test-list.txt and test-list-full.txt are not generated for cross-builds and they are required by drm-ci for testing arm32 targets. This is fixed in igt-gpu-tools. So uprev IGT to include the commit which fixes this issue. Also disable building xe driver tests for non-intel platforms. Reviewed-by:

[PATCH v3 6/6] drm/ci: update xfails for the new testlist

2024-05-28 Thread Vignesh Raman
Now the testlist is used from IGT build, so update xfails with the new testlist. Set the timeout of all i915 jobs to 1h30m since some jobs take more than 1 hour to complete. Reviewed-by: Dmitry Baryshkov Signed-off-by: Vignesh Raman --- v2: - Set the timeout of all i915 jobs to 1h30m and

[PATCH v3 3/6] drm/ci: generate testlist from build

2024-05-28 Thread Vignesh Raman
Stop vendoring the testlist into the kernel. Instead, use the testlist from the IGT build to ensure we do not miss renamed or newly added tests. Signed-off-by: Vignesh Raman --- v2: - Fix testlist generation for arm and arm64 builds. v3: - Rename generated testlist file to ci-testlist.

[PATCH v3 2/6] drm/ci: add farm variable

2024-05-28 Thread Vignesh Raman
Mesa uses structured logs for logging and debug purpose, https://mesa.pages.freedesktop.org/-/mesa/-/jobs/59165650/artifacts/results/job_detail.json Since drm-ci uses the mesa scripts, add the farm variable and update the device type for missing jobs. Signed-off-by: Vignesh Raman --- v3: -

[PATCH v3 1/6] drm/ci: uprev mesa version

2024-05-28 Thread Vignesh Raman
zlib.net is not allowing tarball download anymore and results in below error in kernel+rootfs_arm32 container build, urllib.error.HTTPError: HTTP Error 403: Forbidden urllib.error.HTTPError: HTTP Error 415: Unsupported Media Type Uprev mesa to latest version which includes a fix for this issue.

[PATCH v3 0/6] drm/ci: uprev mesa/IGT and generate testlist

2024-05-28 Thread Vignesh Raman
Uprev mesa and IGT to the latest version and stop vendoring the testlist into the kernel. Instead, use the testlist from the IGT build to ensure we do not miss renamed or newly added tests. Update the xfails with the latest testlist run. Add farm variable and update device type variable.

[PATCH] drm/amdkfd: Handle deallocated VGPRs in gfx10+ trap handler

2024-05-28 Thread Jay Cornwall
A wavefront may deallocate its VGPRs at the end of a program while waiting for memory transactions to complete. If it subsequently receives a context save exception it will be unable to save, since this requires VGPRs. In this case the trap handler should terminate the wavefront. Fixes

[PATCH v2] drm/client: Detect when ACPI lid is closed during initialization

2024-05-28 Thread Mario Limonciello
If the lid on a laptop is closed when eDP connectors are populated then it remains enabled when the initial framebuffer configuration is built. When creating the initial framebuffer configuration detect the ACPI lid status and if it's closed disable any eDP connectors. Reported-by: Chris

[PATCH v4 3/3] drm/display: split DSC helpers from DP helpers

2024-05-28 Thread Dmitry Baryshkov
Currently the DRM DSC functions are selected by the DRM_DISPLAY_DP_HELPER Kconfig symbol. This is not optimal, since the DSI code (both panel and host drivers) end up selecting the seemingly irrelevant DP helpers. Split the DSC code to be guarded by the separate DRM_DISPLAY_DSC_HELPER Kconfig

[PATCH v4 1/3] drm/panel/lg-sw43408: select CONFIG_DRM_DISPLAY_DP_HELPER

2024-05-28 Thread Dmitry Baryshkov
This panel driver uses DSC PPS functions and as such depends on the DRM_DISPLAY_DP_HELPER. Select this symbol to make required functions available to the driver. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202404200800.kysryyli-...@intel.com/ Fixes: 069a6c0e94f9

[PATCH v4 2/3] drm/panel/lg-sw43408: mark sw43408_backlight_ops as static

2024-05-28 Thread Dmitry Baryshkov
Fix sparse warning regarding symbol 'sw43408_backlight_ops' not being declared. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202404200739.hbwzvohr-...@intel.com/ Reviewed-by: Neil Armstrong Fixes: 069a6c0e94f9 ("drm: panel: Add LG sw43408 panel driver")

[PATCH v4 0/3] drm/panel: two fixes for lg-sw43408

2024-05-28 Thread Dmitry Baryshkov
Fix two issues with the panel-lg-sw43408 driver reported by the kernel test robot. Signed-off-by: Dmitry Baryshkov --- Changes in v4: - Reorder patches so that fixes come first, to be able to land them to drm-misc-fixes - Link to v3:

Re: [linux-next:master] BUILD REGRESSION 6dc544b66971c7f9909ff038b62149105272d26a

2024-05-28 Thread Jakub Kicinski
On Wed, 29 May 2024 02:19:47 +0800 kernel test robot wrote: > | `-- > net-ipv6-route.c-rt6_fill_node()-error:we-previously-assumed-dst-could-be-null-(see-line-) Is there a way for us to mark this as false positive?

RE: [PATCH v2 10/10] Revert "drm/amdgpu: Queue KFD reset workitem in VF FED"

2024-05-28 Thread Skvortsov, Victor
[AMD Official Use Only - AMD Internal Distribution Only] Nack to the revert. The FLR sequence is defined as the following (host-initiated reset): 1) host sends FLR_NOTIFICATION 2) Guest gets interrupt and queues FLR work item 3) Guest sends READY_TO_RESET 4) Host sends

[linux-next:master] BUILD REGRESSION 6dc544b66971c7f9909ff038b62149105272d26a

2024-05-28 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 6dc544b66971c7f9909ff038b62149105272d26a Add linux-next specific files for 20240528 Error/Warning reports: https://lore.kernel.org/oe-kbuild-all/202405282036.maedo54q-...@intel.com https

Re: [PATCH v3 1/3] drm/display: split DSC helpers from DP helpers

2024-05-28 Thread Jessica Zhang
On 5/21/2024 11:25 PM, Dmitry Baryshkov wrote: Currently the DRM DSC functions are selected by the DRM_DISPLAY_DP_HELPER Kconfig symbol. This is not optimal, since the DSI code (both panel and host drivers) end up selecting the seemingly irrelevant DP helpers. Split the DSC code to be guarded

[PATCH v2 09/10] drm/amdgpu: fix missing reset domain locks

2024-05-28 Thread Yunxiang Li
These functions are missing the lock for reset domain. Signed-off-by: Yunxiang Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++-- drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 9 +++-- 3

[PATCH v2 04/10] drm/amdgpu/kfd: remove is_hws_hang and is_resetting

2024-05-28 Thread Yunxiang Li
is_hws_hang and is_resetting serve pretty much the same purpose and they both duplicate the work of the reset_domain lock, just check that directly instead. This also eliminates a few bugs listed below and gets rid of dqm->ops.pre_reset. kfd_hws_hang did not need to avoid scheduling another reset.

[PATCH v2 06/10] drm/amdgpu: remove tlb flush in amdgpu_gtt_mgr_recover

2024-05-28 Thread Yunxiang Li
At this point the gart is not set up, there's no point to invalidate tlb here and it could even be harmful. Signed-off-by: Yunxiang Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c

[PATCH v2 08/10] drm/amdgpu: fix locking scope when flushing tlb

2024-05-28 Thread Yunxiang Li
Which method is used to flush tlb does not depend on whether a reset is in progress or not. We should skip flush altogether if the GPU will get reset. So put both paths under reset_domain read lock. Signed-off-by: Yunxiang Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 66

[PATCH v2 10/10] Revert "drm/amdgpu: Queue KFD reset workitem in VF FED"

2024-05-28 Thread Yunxiang Li
This reverts commit 2149ee697a7a3091a16447c647d4a30f7468553a. The issue is already fixed by fa5a7f2ccb7e ("drm/amdgpu: Fix two reset triggered in a row") Signed-off-by: Yunxiang Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[PATCH v2 07/10] drm/amdgpu: use helper in amdgpu_gart_unbind

2024-05-28 Thread Yunxiang Li
When the amdgpu_gart_invalidate_tlb helper was introduced, this part was left out of the conversion. Avoid the code duplication here. Signed-off-by: Yunxiang Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git

[PATCH v2 03/10] drm/amdgpu: abort fence poll if reset is started

2024-05-28 Thread Yunxiang Li
If a reset is triggered, there's no point in waiting for the fence back anymore, it just makes the reset code wait for a long time for the reset_domain read lock to be dropped. This also makes our reply to host FLR fast enough so the host doesn't timeout. Signed-off-by: Yunxiang Li ---

[PATCH v2 05/10] drm/amd/amdgpu: remove unnecessary flush when enable gart

2024-05-28 Thread Yunxiang Li
From: Likun Gao Remove hdp flush for gc v11/12 when enable gart. Remove flush tlb for gc v10/11/12 when enable gart. Signed-off-by: Likun Gao Signed-off-by: Yunxiang Li --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 3 ---

[PATCH v2 02/10] drm/amdgpu: fix sriov host flr handler

2024-05-28 Thread Yunxiang Li
We send back the ready to reset message before we stop anything. This is wrong. Move it to when we are actually ready for the FLR to happen. In the current state since we take tens of seconds to stop everything, it is very likely that host would give up waiting and reset the GPU before we send

[PATCH v2 01/10] drm/amdgpu: add skip_hw_access checks for sriov

2024-05-28 Thread Yunxiang Li
Accessing registers via host is missing the check for skip_hw_access and the lockdep check that comes with it. Signed-off-by: Yunxiang Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c

[PATCH v2 00/10] drm/amdgpu: prevent concurrent GPU access during reset

2024-05-28 Thread Yunxiang Li
If another thread accesses the gpu while the GPU is being reset, the reset could fail. This is especially problematic on SRIOV since host may reset the GPU even if guest is not yet ready. There is code in place that tries to prevent stray access, but over time bugs have crept in making it not

Re: [PATCH] drm/amd/amdgpu: Fix 'snprintf' output truncation warning

2024-05-28 Thread Mario Limonciello
On 5/28/2024 10:24, Pratap Nirujogi wrote: snprintf can truncate the output fw filename if the isp ucode_prefix exceeds 29 characters. Knowing ISP ucode_prefix is in the format isp_x_x_x, limiting the size of ucode_prefix[] to 10 characters to fix the warning. Fixes the below warning:

[PATCH] drm/amd/amdgpu: Fix 'snprintf' output truncation warning

2024-05-28 Thread Pratap Nirujogi
snprintf can truncate the output fw filename if the isp ucode_prefix exceeds 29 characters. Knowing ISP ucode_prefix is in the format isp_x_x_x, limiting the size of ucode_prefix[] to 10 characters to fix the warning. Fixes the below warning: drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c: In
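The truncation concern in this patch can be illustrated with a minimal sketch. It assumes a hypothetical helper name (build_fw_name), a hypothetical "amdgpu/%s.bin" path format, and an example prefix in the isp_x_x_x shape; none of these are taken from amdgpu_isp.c. It only shows how the return value of snprintf reveals that the output did not fit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the amdgpu code: build "amdgpu/<prefix>.bin"
 * into out and report truncation instead of silently accepting it.
 * Returns 0 on success, -1 if the name did not fit or formatting failed.
 */
static int build_fw_name(char *out, size_t out_len, const char *prefix)
{
	int n = snprintf(out, out_len, "amdgpu/%s.bin", prefix);

	/* snprintf returns the length it wanted to write, excluding the NUL. */
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	return 0;
}
```

Bounding the prefix buffer, as the patch describes, lets the compiler prove the formatted name always fits; checking the return value is a complementary runtime guard.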

Re: [PATCH 2/2] drivers/gpu: Fix misalignment in comment block

2024-05-28 Thread Alex Deucher
Applied. Thanks! Alex On Tue, May 28, 2024 at 10:47 AM Bruno Rocha Levi wrote: > > This patch fixes a warning from checkpatch by ensuring the trailing */ is > aligned with the rest of the *, improving readability. > > Co-developed-by: Lucas Antonio > Signed-off-by: Lucas Antonio >

Re: [PATCH] drm/amdgpu: Add lock around VF RLCG interface

2024-05-28 Thread Christian König
On 27.05.24 at 22:19, Victor Skvortsov wrote: flush_gpu_tlb may be called from another thread while device_gpu_recover is running. No, that would be illegal. Where do you see that? Regards, Christian. Both of these threads access registers through the VF RLCG interface during VF Full

RE: [PATCH] drm/amdgpu: Add lock around VF RLCG interface

2024-05-28 Thread Luo, Zhigang
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Zhigang Luo -Original Message- From: Skvortsov, Victor Sent: Monday, May 27, 2024 4:19 PM To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Lazar, Lijo ; Luo, Zhigang Cc: Skvortsov, Victor Subject: [PATCH]

Re: [PATCH 1/3] drm/amdgpu/gfx11: select HDP ref/mask according to gfx ring pipe

2024-05-28 Thread Christian König
Reviewed-by: Christian König for the entire series. Regards, Christian. On 13.05.24 at 22:25, Alex Deucher wrote: Use correct ref/mask for different gfx ring pipe. Ported from ZhenGuo's patch for gfx10. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +- 1

Re: [PATCH] drm/amdgpu: drop some kernel messages in VCN code

2024-05-28 Thread Christian König
Acked-by: Christian König Thanks, Christian. On 23.05.24 at 19:11, Jiang, Sonny wrote: [AMD Official Use Only - AMD Internal Distribution Only] The patch is Reviewed-by: Sonny Jiang Thanks, Sonny *From:* Dong,

Re: [PATCH] drm/amdgpu: Add flags to distinguish vf/pf/pt mode

2024-05-28 Thread Christian König
On 27.05.24 at 18:28, Asad Kamal wrote: Add extra flag definition for ids_flag field to distinguish between vf/pf/pt modes v2: Updated kms driver minor version & removed pf check as default is 0 Signed-off-by: Asad Kamal Reviewed-by: Lijo Lazar Acked-by: Christian König ---

Re: 6.10/regression/bisected commit c4cb23111103 causes sleeping function called from invalid context at kernel/locking/mutex.c:585

2024-05-28 Thread Linux regression tracking (Thorsten Leemhuis)
On 22.05.24 23:18, Chris Bainbridge wrote: > On Tue, May 21, 2024 at 02:39:06PM +0500, Mikhail Gavrilov wrote: >> Yesterday on the fresh kernel snapshot >> I spotted a new bug message with follow stacktrace: >> [4.307097] BUG: sleeping function called from invalid context at >>

Re: [PATCH] drm/radeon/r100: enhance error handling in r100_cp_init_microcode

2024-05-28 Thread Zhouyi Zhou
Fix some errors in my previous email On Tue, May 28, 2024 at 9:36 AM Zhouyi Zhou wrote: > > Thanks for reviewing the patch > > On Mon, May 27, 2024 at 3:58 PM Christian König > wrote: > > > > On 27.05.24 at 03:20, Zhouyi Zhou wrote: > > > In r100_cp_init_microcode, if rdev->family don't match

Re: [PATCH] drm/client: Detect when ACPI lid is closed during initialization

2024-05-28 Thread Chris Bainbridge
On Mon, 27 May 2024 at 15:23, Mario Limonciello wrote: > > If the lid on a laptop is closed when eDP connectors are populated > then it remains enabled when the initial framebuffer configuration > is built. > > When creating the initial framebuffer configuration detect the ACPI > lid status and

Re: [PATCH] drm/radeon/r100: enhance error handling in r100_cp_init_microcode

2024-05-28 Thread Zhouyi Zhou
Thanks for reviewing the patch On Mon, May 27, 2024 at 3:58 PM Christian König wrote: > > On 27.05.24 at 03:20, Zhouyi Zhou wrote: > > In r100_cp_init_microcode, if rdev->family don't match any of > > if statement, fw_name will be NULL, which will cause > > gcc (11.4.0 powerpc64le-linux-gnu)

[PATCH] drm/amd/display: Convert some legacy DRM debug macros into appropriate categories

2024-05-28 Thread Tvrtko Ursulin
From: Tvrtko Ursulin Currently when one enables driver debugging dmesg gets spammed, at I suspect vblank rate, with messages like: [drm:amdgpu_dm_atomic_check [amdgpu]] MPO enablement requested on crtc:[f073c3bb] Fix it by converting some logging from deprecated and incorrect

Re: [PATCH] drm/amdkfd: select CONFIG_CRC16

2024-05-28 Thread Lazar, Lijo
On 5/28/2024 5:20 PM, Arnd Bergmann wrote: > From: Arnd Bergmann > > The amdkfd support fails to link when CONFIG_CRC16 is disabled: > > aarch64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_topology.o: in function > `kfd_topology_add_device': > kfd_topology.c:(.text+0x3a4c): undefined reference

[PATCH 4/4] drm/amd/display: Move 'struct scaler_data' off stack

2024-05-28 Thread Arnd Bergmann
From: Arnd Bergmann The scaler_data structure is implicitly copied onto the stack twice by being returned from a function. This is usually a bad idea, but it was not flagged by the compiler until a recent addition that pushed it over the 1024 byte function stack limit:
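The general pattern behind this kind of fix can be sketched as follows. The structure and function names here are hypothetical stand-ins (the real scaler_data layout and the actual change in the patch are not shown in this excerpt); the sketch only contrasts returning a large structure by value with filling caller-provided storage.

```c
#include <stddef.h>

/* Hypothetical stand-in for a large structure; not the real scaler_data. */
struct big_data {
	int taps[128];
	int ratios[128];
};

/* Returns the structure by value: callers may need stack room for a copy. */
static struct big_data fill_by_value(int seed)
{
	struct big_data d;

	for (size_t i = 0; i < 128; i++) {
		d.taps[i] = seed + (int)i;
		d.ratios[i] = seed - (int)i;
	}
	return d;
}

/* Writes into caller-provided storage, which can live off the stack. */
static void fill_in_place(struct big_data *out, int seed)
{
	for (size_t i = 0; i < 128; i++) {
		out->taps[i] = seed + (int)i;
		out->ratios[i] = seed - (int)i;
	}
}
```

With the second form, the caller decides where the object lives (static storage, heap, or an existing parent structure), so the callee's frame no longer has to hold it.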

[PATCH 3/4] drm/amd/display: avoid large on-stack structures

2024-05-28 Thread Arnd Bergmann
From: Arnd Bergmann Putting excessively large objects on a function stack causes a warning about possibly overflowing the 8KiB of kernel stack: drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn401/dcn401_resource.c: In function 'dcn401_update_bw_bounding_box':

[PATCH 2/4] [RESEND] drm/amd/display: fix graphics_object_id size

2024-05-28 Thread Arnd Bergmann
From: Arnd Bergmann The graphics_object_id structure is meant to fit into 32 bits, as it's passed by value in and out of functions. A recent change increased the size to 128 bits, so it's now always passed by reference, which is clearly not intended and ends up producing a compile-time warning:
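A compile-time size check is one way to catch this class of regression early. The sketch below uses a hypothetical structure with made-up field names and widths (it is not the layout from the amdgpu display headers) and assumes a GCC- or Clang-style bitfield layout, where uint32_t bitfields totalling 32 bits occupy four bytes.

```c
#include <stdint.h>

/*
 * Illustrative layout only: field names and widths are hypothetical and
 * are not taken from the amdgpu display headers.
 */
struct object_id32 {
	uint32_t id       : 8;
	uint32_t enum_id  : 4;
	uint32_t type     : 4;
	uint32_t reserved : 16;
};

/* Fails the build if a later change grows the struct past 32 bits. */
_Static_assert(sizeof(struct object_id32) == sizeof(uint32_t),
	       "object_id32 must stay 32 bits so it can be passed by value");

/* Small enough to be passed and returned by value. */
static struct object_id32 make_id(uint32_t id, uint32_t enum_id, uint32_t type)
{
	struct object_id32 v = {
		.id = id & 0xffu,
		.enum_id = enum_id & 0xfu,
		.type = type & 0xfu,
	};

	return v;
}
```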

[PATCH 1/4] [RESEND] drm/amd/display: dynamically allocate dml2_configuration_options structures

2024-05-28 Thread Arnd Bergmann
From: Arnd Bergmann This structure is too large to fit on a stack, as shown by the newly introduced warnings from a recent code change: drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c: In function 'dcn32_update_bw_bounding_box':

[PATCH] drm/amdkfd: select CONFIG_CRC16

2024-05-28 Thread Arnd Bergmann
From: Arnd Bergmann The amdkfd support fails to link when CONFIG_CRC16 is disabled: aarch64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_topology.o: in function `kfd_topology_add_device': kfd_topology.c:(.text+0x3a4c): undefined reference to `crc16' This is a library module that needs to be

RE: [PATCH] drm/amdgpu: Add flags to distinguish vf/pf/pt mode

2024-05-28 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Kamal, Asad Sent: Tuesday, May 28, 2024 00:29 To: amd-gfx@lists.freedesktop.org Cc: Lazar, Lijo ; Zhang, Hawking ; Ma, Le ; Zhang, Morris ; Kamal, Asad ;

RE: [PATCH] drm/amdgpu: Fix missing error code in amdgpu_od_set_init

2024-05-28 Thread Wang, Yang(Kevin)
[AMD Official Use Only - AMD Internal Distribution Only] Hi SRINIVASAN, Please pause this patch. The original intention of the patch was to avoid creating an empty directory ("gpu_od") , as this change may result in incorrect results. Thanks. Best Regards, Kevin -Original Message-

[PATCH] drm/amdgpu: Fix missing error code in amdgpu_od_set_init

2024-05-28 Thread Srinivasan Shanmugam
This commit ensures that an error code -EINVAL is set in the amdgpu_od_set_init function when the od_kobj_list has only one entry, indicating that the list is not in the expected state. Fixes the below: drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_pm.c:4355 amdgpu_od_set_init() warn: missing error

[PATCH] drm/amd/display: Add null checks for 'stream' and 'plane' before dereferencing

2024-05-28 Thread Srinivasan Shanmugam
This commit adds null checks for the 'stream' and 'plane' variables in the dcn30_apply_idle_power_optimizations function. These variables were previously assumed to be null at line 922, but they were used later in the code without checking if they were null. This could potentially lead to a null

RE: [PATCH] drm/amdgpu: Estimate RAS reservation when report capacity v2

2024-05-28 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Thanks Tao. Yes, I added the comments to amdgpu_ras.h. +/* Reserve 8 physical dram row for possible retirement. + * In worst cases, it will lose 8 * 2MB memory in vram domain */ +#define AMDGPU_RAS_RESERVED_VRAM_SIZE (16ULL << 20)
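The arithmetic in the quoted comment (8 rows at 2 MB each giving 16 MB) can be checked with a short sketch. Only AMDGPU_RAS_RESERVED_VRAM_SIZE and its value come from the thread; the row-count and per-row macros and the usable_vram helper are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Value quoted in the thread. */
#define AMDGPU_RAS_RESERVED_VRAM_SIZE (16ULL << 20)

/* Hypothetical names for the two factors given in the quoted comment. */
#define RAS_RESERVED_ROWS 8ULL
#define RAS_BYTES_PER_ROW (2ULL << 20)

/* 8 rows * 2 MiB per row must equal the 16 MiB reservation. */
_Static_assert(RAS_RESERVED_ROWS * RAS_BYTES_PER_ROW ==
	       AMDGPU_RAS_RESERVED_VRAM_SIZE,
	       "reservation should equal rows times bytes lost per row");

/* Hypothetical helper: capacity left after subtracting the reservation. */
static uint64_t usable_vram(uint64_t total_bytes)
{
	if (total_bytes < AMDGPU_RAS_RESERVED_VRAM_SIZE)
		return 0;
	return total_bytes - AMDGPU_RAS_RESERVED_VRAM_SIZE;
}
```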

RE: [PATCH] drm/amdgpu: Estimate RAS reservation when report capacity v2

2024-05-28 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] I prefer to add comment for AMDGPU_RAS_RESERVED_VRAM_SIZE to explain the value of 16MB, anyway the patch is: Reviewed-by: Tao Zhou > -Original Message- > From: Zhang, Hawking > Sent: Tuesday, May 28, 2024 1:57 PM > To: