[PATCH] drm/amdgpu: enable tmz by default for GC 11.0.1

2023-05-25 Thread Ikshwaku Chauhan
Add IP GC 11.0.1 to the list of targets that have tmz enabled by default. Signed-off-by: Ikshwaku Chauhan diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 3f5dd9e32e08..348d856626c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drive

[PATCH 1/3] drm/amdgpu: golden settings for ASIC rev_id 0

2023-05-25 Thread Shiwu Zhang
Suggested by FW team that GB_ADDR_CONFIG is handled by golden settings in driver to get the expected value Signed-off-by: Shiwu Zhang Reviewed-by: Hawking Zhang Acked-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/d

[PATCH 3/3] drm/amdgpu: set the APU flag based on package type

2023-05-25 Thread Shiwu Zhang
Since currently APU and dGPU share the same pcie class while gmc init needs the flag to set up correctly for upcoming memory allocations v2: call get_pkg_type in smuio 13_0_3 is enough (hawking) Signed-off-by: Shiwu Zhang Reviewed-by: Hawking Zhang Acked-by: Alex Deucher --- drivers/gpu/drm/

[PATCH 2/3] drm/amdgpu: add the accelerator pcie class

2023-05-25 Thread Shiwu Zhang
v2: add the base class id for accelerator (lijo) v3: add the new pci class in amdgpu tree (hawking) Signed-off-by: Shiwu Zhang Acked-by: Lijo Lazar Acked-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 + drivers/gpu/drm/amd/include/amd_shared.h | 1 + 2 files changed, 6

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Jani Nikula
On Wed, 24 May 2023, Hamza Mahfooz wrote: > + Kees > > On 5/24/23 15:50, Alex Deucher wrote: >> On Wed, May 24, 2023 at 3:46 PM Felix Kuehling >> wrote: >>> >>> Sure, I think we tried enabling warnings as errors before and had to >>> revert it because of weird compiler quirks or the variety of c

RE: [PATCH 3/3] drm/amdgpu: set the APU flag based on package type

2023-05-25 Thread Zhang, Hawking
[AMD Official Use Only - General] Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Shiwu Zhang Sent: Thursday, May 25, 2023 15:46 To: amd-gfx@lists.freedesktop.org Subject: [PATCH 3/3] drm/amdgpu: set the APU flag based on package type

[PATCH] drm/amdgpu: change reserved vram info print

2023-05-25 Thread YiPeng Chai
The link object of mgr->reserved_pages is the blocks variable in struct amdgpu_vram_reservation, not the link variable in struct drm_buddy_block. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a
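The point about which member links an object into a list can be illustrated with a small userspace sketch. Everything below is hypothetical (`list_node`, `container_of_sketch`, `reservation_sketch`, `total_reserved`); it is not the amdgpu code, only an assumption-laden model of an intrusive list in which the walker must name the member that is actually linked, here `blocks`.

```c
#include <stddef.h>

/* Hypothetical minimal intrusive list node (stand-in for struct list_head). */
struct list_node {
    struct list_node *next;
};

/* Recover the enclosing object from a pointer to its embedded list node. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical reservation record; 'blocks' is the member linked into the list. */
struct reservation_sketch {
    unsigned long start;
    unsigned long size;
    struct list_node blocks;
};

/* Walk a circular list of reservations and sum their sizes. */
static unsigned long total_reserved(struct list_node *head)
{
    unsigned long total = 0;
    struct list_node *n;

    for (n = head->next; n != head; n = n->next) {
        /* The member named here must be the one the list was built with. */
        struct reservation_sketch *r =
            container_of_sketch(n, struct reservation_sketch, blocks);
        total += r->size;
    }
    return total;
}
```

Naming a different struct type or member in the `container_of`-style lookup would compute the wrong base address, which is the class of mistake the patch description points at.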

[PATCH] drm/amdgpu: keep irq count in amdgpu_irq_disable_all

2023-05-25 Thread Guchun Chen
This can clean up all irq warnings because of unbalanced amdgpu_irq_get/put when unplugging/unbind device, and leave irq count decrease in each ip fini function. Signed-off-by: Guchun Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/

Re: [PATCH 1/2] Revert "drm/amd/display: Block optimize on consecutive FAMS enables"

2023-05-25 Thread Michel Dänzer
On 5/23/23 18:09, Hamza Mahfooz wrote: > On 5/22/23 09:08, Michel Dänzer wrote: >> From: Michel Dänzer >> >> This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2. >> >> It depends on its parent commit, which we want to revert. >> >> Signed-off-by: Michel Dänzer > > I have applied the ser

[PATCH] drm/amdgpu: Fix unused sq_int_priv variable in event_interrupt_wq_v11

2023-05-25 Thread Srinivasan Shanmugam
gcc with W=1 drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c: In function ‘event_interrupt_wq_v11’: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c:282:38: warning: variable ‘sq_int_priv’ set but not used [-Wunused-but-set-variable] 282 | uint8_t sq_int_enc, sq_int_errtyp

Re: [PATCH] drm/amdgpu: enable tmz by default for GC 11.0.1

2023-05-25 Thread Alex Deucher
Reviewed-by: Alex Deucher On Thu, May 25, 2023 at 3:22 AM Ikshwaku Chauhan wrote: > > Add IP GC 11.0.1 in the list of target to have > tmz enabled by default. > > Signed-off-by: Ikshwaku Chauhan > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.

Re: [PATCH] drm/amdgpu: keep irq count in amdgpu_irq_disable_all

2023-05-25 Thread Christian König
Am 25.05.23 um 11:28 schrieb Guchun Chen: This can clean up all irq warnings because of unbalanced amdgpu_irq_get/put when unplugging/unbind device, and leave irq count decrease in each ip fini function. Signed-off-by: Guchun Chen Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdg

Re: [PATCH 1/2] Revert "drm/amd/display: Block optimize on consecutive FAMS enables"

2023-05-25 Thread Alex Deucher
On Thu, May 25, 2023 at 6:27 AM Michel Dänzer wrote: > > On 5/23/23 18:09, Hamza Mahfooz wrote: > > On 5/22/23 09:08, Michel Dänzer wrote: > >> From: Michel Dänzer > >> > >> This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2. > >> > >> It depends on its parent commit, which we want to r

Re: [PATCH 02/36] drm/drm_property: make replace_property_blob_from_id a DRM helper

2023-05-25 Thread Liviu Dudau
On Tue, May 23, 2023 at 09:14:46PM -0100, Melissa Wen wrote: > Place it in drm_property where drm_property_replace_blob and > drm_property_lookup_blob live. Then we can use the DRM helper for > driver-specific KMS properties too. > > Signed-off-by: Melissa Wen I know that I've got Cc-ed because

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor
On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > Silencing the compiler from below compilation error: > > drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable > 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted > [-Werror,-Wunneeded-internal

Re: [PATCH 06/36] drm/amd/display: add CRTC driver-specific property for gamma TF

2023-05-25 Thread Harry Wentland
On 5/24/23 04:24, Pekka Paalanen wrote: > On Tue, 23 May 2023 21:14:50 -0100 > Melissa Wen wrote: > >> Hook up driver-specific atomic operations for managing AMD color >> properties and create AMD driver-specific color management properties >> and attach them according to HW capabilities defin

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Kees Cook
Hi! On Wed, May 24, 2023 at 04:27:31PM -0400, Hamza Mahfooz wrote: > + Kees > > On 5/24/23 15:50, Alex Deucher wrote: > > On Wed, May 24, 2023 at 3:46 PM Felix Kuehling > > wrote: > > > > > > Sure, I think we tried enabling warnings as errors before and had to > > > revert it because of weird

Re: [PATCH] drm/amdgpu: add the accelerator pcie class

2023-05-25 Thread Christoph Hellwig
On Tue, May 23, 2023 at 10:02:32AM -0400, Alex Deucher wrote: > On Tue, May 23, 2023 at 5:25 AM Christoph Hellwig wrote: > > > > On Tue, May 23, 2023 at 12:02:32PM +0800, Shiwu Zhang wrote: > > > + { PCI_DEVICE(0x1002, PCI_ANY_ID), > > > + .class = PCI_CLASS_ACCELERATOR_PROCESSING << 8,

[PATCH] drm/amd/amdgpu: Fix up locking etc in amdgpu_debugfs_gprwave_ioctl()

2023-05-25 Thread Dan Carpenter
There are two bugs here. 1) Drop the lock if copy_from_user() fails. 2) If the copy fails then the correct error code is -EFAULT instead of -EINVAL. I also broke up the long line and changed "sizeof rd->id" to "sizeof(rd->id)". Fixes: 164fb2940933 ("drm/amd/amdgpu: Update debugfs for XCC suppo
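The two fixes described above (release the lock on the failure path, and report a failed copy as -EFAULT) can be sketched in userspace C. This is only an illustration under stated assumptions: `fake_mutex`, `fake_copy_from_user`, `fake_reader` and `set_state_sketch` are hypothetical stand-ins, not the driver's actual code or kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel mutex, so the sketch runs in userspace. */
struct fake_mutex {
    int held;
};

static void fake_lock(struct fake_mutex *m)
{
    assert(!m->held);
    m->held = 1;
}

static void fake_unlock(struct fake_mutex *m)
{
    assert(m->held);
    m->held = 0;
}

/* Mimics copy_from_user(): returns the number of bytes NOT copied (0 on success). */
static size_t fake_copy_from_user(void *dst, const void *src, size_t n)
{
    if (src == NULL)
        return n;               /* simulate a faulting user pointer */
    memcpy(dst, src, n);
    return 0;
}

/* Hypothetical per-file state guarded by a lock. */
struct fake_reader {
    struct fake_mutex lock;
    int id[4];
};

/* Hypothetical ioctl-style helper: every exit path releases the lock. */
static int set_state_sketch(struct fake_reader *rd, const void *user_data)
{
    int ret = 0;

    fake_lock(&rd->lock);
    if (fake_copy_from_user(&rd->id, user_data, sizeof(rd->id)))
        ret = -EFAULT;          /* a failed copy is a bad address, not a bad argument */
    fake_unlock(&rd->lock);

    return ret;
}
```

Funnelling both outcomes through a single unlock keeps the lock balanced whichever branch is taken.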

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Christoph Hellwig
> +subdir-ccflags-y += -Werror -Wunused -Wmisleading-indentation We have a config option for -Werror. Blindly adding this will create problems with too new (or sometimes too old, or just too weird) compilers all the time. Don't do this.

[PATCH 1/2] drm/amdgpu: Fix no-procfs build

2023-05-25 Thread Rob Clark
From: Rob Clark Fixes undefined symbol when PROC_FS is not enabled. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202305251510.u0r2as7k-...@intel.com/ Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/a

[PATCH 2/2] drm/amdgpu: Remove duplicate fdinfo fields

2023-05-25 Thread Rob Clark
From: Rob Clark Some of the fields that are handled by drm_show_fdinfo() crept back in when rebasing the patch. Remove them again. Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 --- 1 file changed, 3 del

[PATCH] drm/amdgpu: Fix up missing kdoc in sdma_v6_0.c

2023-05-25 Thread Srinivasan Shanmugam
Address a bunch of kdoc warnings: gcc with W=1 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'job' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'flags' not described in 'sdma_v6_0_r

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Nathan Chancellor
On Thu, May 25, 2023 at 08:37:07AM -0700, Kees Cook wrote: > Hi! > > On Wed, May 24, 2023 at 04:27:31PM -0400, Hamza Mahfooz wrote: > > + Kees > > > > On 5/24/23 15:50, Alex Deucher wrote: > > > On Wed, May 24, 2023 at 3:46 PM Felix Kuehling > > > wrote: > > > > > > > > Sure, I think we tried

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Luben Tuikov
On 2023-05-25 11:22, Nathan Chancellor wrote: > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: >> Silencing the compiler from below compilation error: >> >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor
On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > On 2023-05-25 11:22, Nathan Chancellor wrote: > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > >> Silencing the compiler from below compilation error: > >> > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23:

[PATCH] drm/amd/amdgpu: introduce DRM_AMDGPU_WERROR

2023-05-25 Thread Hamza Mahfooz
We want to do -Werror builds on our CI. However, non-amdgpu breakages have prevented us from doing so thus far. Also, there are a number of additional checks that we should enable, that the community cares about and are hidden behind -Wextra. So, define DRM_AMDGPU_WERROR to only enable -Werror for

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Alex Deucher
On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor wrote: > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > > On 2023-05-25 11:22, Nathan Chancellor wrote: > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > > >> Silencing the compiler from below compila

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Luben Tuikov
On 2023-05-25 12:29, Nathan Chancellor wrote: > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: >> On 2023-05-25 11:22, Nathan Chancellor wrote: >>> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: Silencing the compiler from below compilation error: >>>

[PATCH 2/3] drm/amdgpu: cache gpuvm fault information for gmc7+

2023-05-25 Thread Alex Deucher
Cache the current fault info in the vm struct. This can be queried by userspace later to help debug UMDs. Cc: samuel.pitoi...@gmail.com Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/gm

[PATCH 3/3] drm/amdgpu: add new INFO ioctl query for the last GPU page fault

2023-05-25 Thread Alex Deucher
Add an interface to query the last GPU page fault for the process. Useful for debugging context lost errors. v2: split vmhub representation between kernel and userspace Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 libdrm MR: https://gitlab.freedesktop.org/mesa/mesa/-/me

[PATCH 0/3] Add GPU page fault query interface

2023-05-25 Thread Alex Deucher
This patch set adds support for an application to query GPU page faults. It's useful for debugging and there are vulkan extensions that could make use of this. Preliminary user space code which uses this can be found here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 https://gi

[PATCH 1/3] drm/amdgpu: add cached GPU fault structure to vm struct

2023-05-25 Thread Alex Deucher
When we get a GPU page fault, cache the fault for later analysis. Cc: samuel.pitoi...@gmail.com Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 18 +++ 2 files changed, 49 insertions(+) d

Re: [PATCH] drm/amdgpu: Fix up missing kdoc in sdma_v6_0.c

2023-05-25 Thread Alex Deucher
Reviewed-by: Alex Deucher On Thu, May 25, 2023 at 12:15 PM Srinivasan Shanmugam wrote: > > Address a bunch of kdoc warnings: > > gcc with W=1 > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or > member 'job' not described in 'sdma_v6_0_ring_emit_ib' > drivers/gpu/drm/a

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor
On Thu, May 25, 2023 at 12:42:05PM -0400, Alex Deucher wrote: > On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor wrote: > > > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > > > On 2023-05-25 11:22, Nathan Chancellor wrote: > > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srin

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor
On Thu, May 25, 2023 at 12:45:13PM -0400, Luben Tuikov wrote: > On 2023-05-25 12:29, Nathan Chancellor wrote: > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > >> On 2023-05-25 11:22, Nathan Chancellor wrote: > >>> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wro

[PATCH] drm/amdgpu: Fix up kdoc in sdma_v4_4_2.c

2023-05-25 Thread Srinivasan Shanmugam
Address a bunch of kdoc warnings: gcc with W=1 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_gfx_stop' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:457: warning: Function parameter or member 'inst_mask' not describe

[PATCH 02/33] drm/amdkfd: display debug capabilities

2023-05-25 Thread Jonathan Kim
Expose debug capabilities in the KFD topology node's HSA capabilities and debug properties flags. Ensure correct capabilities are exposed based on firmware support. Flag definitions can be referenced in uapi/linux/kfd_sysfs.h. v2: rebase topology fw check fix with kfd_node struct update Signed-

[PATCH 01/33] drm/amdkfd: add debug and runtime enable interface

2023-05-25 Thread Jonathan Kim
Introduce the GPU debug operations interface. For ROCm-GDB to extend the GNU Debugger's ability to inspect the AMD GPU instruction set, provide the necessary interface to allow the debugger to HW debug-mode set and query exceptions per HSA queue, process or device. The runtime_enable interface co

[PATCH 03/33] drm/amdkfd: prepare per-process debug enable and disable

2023-05-25 Thread Jonathan Kim
The ROCm debugger will attach to a process to debug by PTRACE and will expect the KFD to prepare a process for the target PID, whether the target PID has opened the KFD device or not. This patch is to explicitly handle this requirement. Further HW mode setting and runtime coordination requirements

[PATCH 04/33] drm/amdgpu: add kgd hw debug mode setting interface

2023-05-25 Thread Jonathan Kim
Introduce the required KGD debug calls that will execute hardware debug mode setting. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../gpu/drm/amd/include/kgd_kfd_interface.h | 34 +++ 1 file changed, 34 insertions(+) diff --git a/drivers/gpu/drm/amd/include/kgd

[PATCH 06/33] drm/amdgpu: add gfx9 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim
Implement the per-device calls to enable or disable HW debug mode for GFX9 prior to GFX9.4.1. GFX9.4.1 and onward will require their own enable/disable sequence as follow on patches. When hardware debug mode setting is requested, waves will inherit these settings in the Shader Processor Input's (

[PATCH 05/33] drm/amdgpu: setup hw debug registers on driver initialization

2023-05-25 Thread Jonathan Kim
Add missing debug trap registers references and initialize all debug registers on boot by clearing the hardware exception overrides and the wave allocation ID index. The debugger requires that TTMPs 6 & 7 save the dispatch ID to map waves onto dispatch during compute context inspection. In order t

[PATCH 09/33] drm/amdgpu: add gfx10 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim
Similar to GFX9 debug devices, set the hardware debug mode by draining the SPI appropriately prior to the mode setting request. Because GFX10 has waves allocated by the work group boundary and each SE's SPI instances do not communicate, the SPI drain time is much longer. This long drain time will be

[PATCH 14/33] drm/amdgpu: prepare map process for multi-process debug devices

2023-05-25 Thread Jonathan Kim
Unlike single process debug devices, multi-process debug devices allow debug mode setting per-VMID (non-device-global). Because the HWS manages PASID-VMID mapping, the new MAP_PROCESS API allows the KFD to forward the required SPI debug register write requests. To request a new debug mode setting

[PATCH 12/33] drm/amdgpu: add configurable grace period for unmap queues

2023-05-25 Thread Jonathan Kim
The HWS schedule allows a grace period for wave completion prior to preemption for better performance by avoiding CWSR on waves that can potentially complete quickly. The debugger, on the other hand, will want to inspect wave status immediately after it actively triggers preemption (a suspend funct

[PATCH 08/33] drm/amdkfd: fix kfd_suspend_all_processes

2023-05-25 Thread Jonathan Kim
Flush delayed restore work in kfd_suspend_all_queues instead of cancelling. Cancelling the work before it runs results in the queues becoming permanently disabled. Flushing the work ensures that the queue suspend/resume state stays balanced. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling

[PATCH 07/33] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim
On GFX9.4.1, the implicit wait count instruction on s_barrier is disabled by default in the driver during normal operation for performance requirements. There is a hardware bug in GFX9.4.1 where if the implicit wait count instruction after an s_barrier instruction is disabled, any wave that hits a

[PATCH 13/33] drm/amdkfd: prepare map process for single process debug devices

2023-05-25 Thread Jonathan Kim
Older HW only supports debugging on a single process because the SPI debug mode setting registers are device global. The HWS has supplied a single pinned VMID (0xf) for MAP_PROCESS for debug purposes. To pin the VMID, the KFD will remove the VMID from the HWS dynamic VMID allocation via SET_RESOUC

[PATCH 16/33] drm/amdkfd: add per process hw trap enable and disable functions

2023-05-25 Thread Jonathan Kim
To enable HW debug mode per process, all devices must be debug enabled successfully. If a failure occurs, rewind the enablement of debug mode on the enabled devices. A power management scenario that needs to be considered is HW debug mode setting during GFXOFF. During GFXOFF, these registers wi

[PATCH 20/33] drm/amdkfd: add runtime enable operation

2023-05-25 Thread Jonathan Kim
The debugger can attach to a process prior to HSA enablement (i.e. inferior is spawned by the debugger and attached to immediately before target process has been enabled for HSA dispatches) or it can attach to a running target that is already HSA enabled. Either way, the debugger needs to know the

[PATCH 19/33] drm/amdkfd: add send exception operation

2023-05-25 Thread Jonathan Kim
Add a debug operation that allows the debugger to send an exception directly to runtime through a payload address. For memory violations, normal vmfault signals will be applied to notify runtime instead after passing in the saved exception data when a memory violation was raised to the debugger.

[PATCH 18/33] drm/amdkfd: add raise exception event function

2023-05-25 Thread Jonathan Kim
Exception events can be generated from interrupts or queue activity. The raise event function will save exception status of a queue, device or process then notify the debugger of the status change by writing to a debugger polled file descriptor that the debugger provides during debug attach. Fo

[PATCH 10/33] drm/amdgpu: add gfx9.4.2 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim
GFX9.4.2 now supports per-VMID debug mode controls registers (SPI_GDBG_PER_VMID_CNTL). Because the KFD lets the HWS handle PASID-VMID mapping, the KFD will forward all debug mode setting register writes to the HWS scheduler using a new MAP_PROCESS API, so instead of writing to registers, return th

[PATCH 26/33] drm/amdkfd: add debug suspend and resume process queues operation

2023-05-25 Thread Jonathan Kim
In order to inspect waves from the saved context at any point during a debug session, the debugger must be able to preempt queues to trigger context save by suspending them. On queue suspend, the KFD will copy the context save header information so that the debugger can correctly crawl the appropr

[PATCH 24/33] drm/amdkfd: add debug wave launch override operation

2023-05-25 Thread Jonathan Kim
This operation allows the debugger to override the enabled HW exceptions on the device. On debug devices that only support the debugging of a single process, the HW exceptions are global and set through the SPI_GDBG_TRAP_MASK register. Because they are global, only address watch exceptions are all

[PATCH 28/33] drm/amdkfd: add debug set flags operation

2023-05-25 Thread Jonathan Kim
Allow the debugger to set single memory and single ALU operations. Some exceptions are imprecise (memory violations, address watch) in the sense that a trap occurs only when the exception interrupt occurs and not at the non-halting faulty instruction. Trap temporaries 0 & 1 save the program count

[PATCH 11/33] drm/amdgpu: add gfx11 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim
Implement the per-device calls to enable or disable HW debug mode for GFX11. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 38 +++ 1 file changed, 38 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkf

[PATCH 15/33] drm/amdgpu: expose debug api for mes

2023-05-25 Thread Jonathan Kim
Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process debug API. The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES from clearing the process context when the first queue is added to the scheduler in order to maintain debug mode settings during queue preem

[PATCH 17/33] drm/amdkfd: apply trap workaround for gfx11

2023-05-25 Thread Jonathan Kim
Due to a HW bug, waves in only half the shader arrays can enter trap. When starting a debug session, relocate all waves to the first shader array of each shader engine and mask off the 2nd shader array as unavailable. When ending a debug session, re-enable the 2nd shader array per shader engine.

[PATCH 21/33] drm/amdkfd: add debug trap enabled flag to tma

2023-05-25 Thread Jonathan Kim
From: Jay Cornwall Trap handler behavior will differ when a debugger is attached. Make the debug trap flag available in the trap handler TMA. Update it when the debug trap ioctl is invoked. Signed-off-by: Jay Cornwall Reviewed-by: Felix Kuehling Signed-off-by: Jonathan Kim Reviewed-by: Felix

[PATCH 25/33] drm/amdkfd: add debug wave launch mode operation

2023-05-25 Thread Jonathan Kim
Allow the debugger to set wave behaviour to either operate normally, halt at launch, trap on every instruction, terminate immediately or stall on allocation. v2: fixup with new kfd_node struct reference for mes check Signed-off-by: Jonathan Kim --- .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.

[PATCH 27/33] drm/amdkfd: add debug set and clear address watch points operation

2023-05-25 Thread Jonathan Kim
Shader read, write and atomic memory operations can be alerted to the debugger as an address watch exception. Allow the debugger to pass in a watch point to a particular memory address per device. Note that there exist only 4 watch points per device to date, so have the KFD keep track of what w

[PATCH 22/33] drm/amdkfd: update process interrupt handling for debug events

2023-05-25 Thread Jonathan Kim
The debugger must be notified by any debugger subscribed exception that comes from hardware interrupts. If a debugger session exits, any exceptions it subscribed to may still have interrupts in the interrupt ring buffer or KGD/KFD pipeline. To prevent a new session from inheriting stale interrupts

[PATCH 23/33] drm/amdkfd: add debug set exceptions enabled operation

2023-05-25 Thread Jonathan Kim
The debugger subscribes to notification for requested exceptions on attach. Allow the debugger to change its subscription later on. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 36 +++

[PATCH 33/33] drm/amdkfd: bump kfd ioctl minor version for debug api availability

2023-05-25 Thread Jonathan Kim
Bump the minor version to declare debugging capability is now available. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 - include/uapi/linux/kfd_ioctl.h | 3 ++- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/d

[PATCH 29/33] drm/amdkfd: add debug query event operation

2023-05-25 Thread Jonathan Kim
Allow the debugger to query a single queue, device and process exception. The KFD should also return the GPU or Queue id of the exception. The debugger also has the option of clearing exceptions after being queried. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd

[PATCH 31/33] drm/amdkfd: add debug queue snapshot operation

2023-05-25 Thread Jonathan Kim
Allow the debugger to get a snapshot of a specified number of queues containing various queue property information that is copied to the debugger. Since the debugger doesn't know how many queues exist at any given time, allow the debugger to pass the requested number of snapshots as 0 to get the a

[PATCH 30/33] drm/amdkfd: add debug query exception info operation

2023-05-25 Thread Jonathan Kim
Allow the debugger to query additional info based on an exception code. For device exceptions, it's currently only memory violation information. For process exceptions, it's currently only runtime information. Queue exceptions only report the queue exception status. The debugger has the option of c

[PATCH 32/33] drm/amdkfd: add debug device snapshot operation

2023-05-25 Thread Jonathan Kim
Similar to queue snapshot, return an array of device information using an entry_size check and return. Unlike queue snapshots, the debugger needs to pass the correct number of devices that exist. If it fails to do so, the KFD will return the number of actual devices so that the debugger can make a
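The count negotiation described here (the caller states how many entries it has room for, and the callee reports how many actually exist) can be sketched as follows. This is a userspace illustration only: `dev_entry_sketch` and `device_snapshot_sketch` are hypothetical names, the entry_size check is left out, and whether the real interface copies a partial array or returns an error on a mismatch is an assumption, not something the snippet confirms.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-device record returned to the caller. */
struct dev_entry_sketch {
    unsigned int gpu_id;
};

/*
 * Hypothetical snapshot helper.
 * On entry, *num is the number of entries the caller has room for.
 * On return, *num is the number of devices that actually exist, so a
 * caller that guessed wrong can resize its buffer and call again.
 */
static int device_snapshot_sketch(const struct dev_entry_sketch *devs,
                                  size_t actual,
                                  struct dev_entry_sketch *out,
                                  size_t *num)
{
    size_t n = *num < actual ? *num : actual;

    if (n)
        memcpy(out, devs, n * sizeof(*out));    /* never write past the caller's buffer */
    *num = actual;                              /* always report the real count */
    return 0;
}
```

A caller would compare the returned count with what it passed in and retry with a larger buffer if the two differ.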

[PATCH] drm/amdgpu: Fix up kdoc in amdgpu_device.c

2023-05-25 Thread Srinivasan Shanmugam
Fix these warnings by deleting the deviant arguments. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function parameter 'pcie_index' description in 'amdgpu_device_indirect_wreg' drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function parameter 'pcie

Re: [PATCH] drm/amdgpu: Fix up kdoc in sdma_v4_4_2.c

2023-05-25 Thread Alex Deucher
On Thu, May 25, 2023 at 1:08 PM Srinivasan Shanmugam wrote: > > Address a bunch of kdoc warnings: > > gcc with W=1 > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_gfx_stop' > drivers/gpu/drm/amd/amdgpu/sdma_v4_

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nick Desaulniers
On Thu, May 25, 2023 at 9:42 AM Alex Deucher wrote: > > On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor wrote: > > > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > > > On 2023-05-25 11:22, Nathan Chancellor wrote: > > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan

[PATCH] drm/amdgpu: Fix up kdoc in amdgpu_acpi.c

2023-05-25 Thread Srinivasan Shanmugam
Fix these warnings by adding & deleting the deviant arguments. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Function parameter or member 'numa_info' not described in 'amdgpu_acpi_get_node_id' drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Excess function parameter

[PATCH AUTOSEL 6.3 64/67] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

2023-05-25 Thread Sasha Levin
From: Guchun Chen [ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ] When performing device unbind or halt, we have disabled all irqs at the very beginning like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, othe

[PATCH AUTOSEL 6.1 54/57] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

2023-05-25 Thread Sasha Levin
From: Guchun Chen [ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ] When performing device unbind or halt, we have disabled all irqs at the very beginning like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, othe

[PATCH AUTOSEL 5.15 42/43] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

2023-05-25 Thread Sasha Levin
From: Guchun Chen [ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ] When performing device unbind or halt, we have disabled all irqs at the very beginning like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, othe

[PATCH v4 00/13] Enable Colorspace connector property in amdgpu

2023-05-25 Thread Harry Wentland
This patchset is based on Joshua's previous patchset [1], as well as my previous patchset [2]. It is - enabling support for the colorspace property in amdgpu, as well as - allowing drivers to specify the supported set of colorspaces, and Colorspace, Infoframes, and YCbCr matrix --

[PATCH v4 01/13] drm/connector: Convert DRM_MODE_COLORIMETRY to enum

2023-05-25 Thread Harry Wentland
This allows us to use strongly typed arguments. v2: - Bring NO_DATA back - Provide explicit enum values v3: - Drop unnecessary '&' from kerneldoc (emersion) v4: - Fix Normal Colorimetry comment Signed-off-by: Harry Wentland Reviewed-by: Simon Ser Cc: Pekka Paalanen Cc: Sebastian Wick Cc:

[PATCH v4 02/13] drm/connector: Add enum documentation to drm_colorspace

2023-05-25 Thread Harry Wentland
From: Joshua Ashton To match the other enums, and add more information about these values. v2: - Specify where an enum entry comes from - Clarify DEFAULT and NO_DATA behavior - BT.2020 CYCC is "constant luminance" - correct type for BT.601 v4: - drop DP/HDMI clarifications that might create

[PATCH v4 04/13] drm/connector: Use common colorspace_names array

2023-05-25 Thread Harry Wentland
We can use bitfields to track the supported ones for HDMI and DP. This allows us to print colorspaces in a consistent manner without needing to know whether we're dealing with DP or HDMI. v4: - Rename _MAX to _COUNT and leave comment to indicate it's not a valid value - Fix misplaced function doc

[PATCH v4 03/13] drm/connector: Pull out common create_colorspace_property code

2023-05-25 Thread Harry Wentland
Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Jani Nikula Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- dri

[PATCH v4 06/13] drm/connector: Allow drivers to pass list of supported colorspaces

2023-05-25 Thread Harry Wentland
Drivers might not support all colorspaces defined in dp_colorspaces and hdmi_colorspaces. This results in undefined behavior when userspace is setting an unsupported colorspace. Allow drivers to pass the list of supported colorspaces when creating the colorspace property. v2: - Use 0 to indicate

[PATCH v4 07/13] drm/amd/display: Always pass connector_state to stream validation

2023-05-25 Thread Harry Wentland
We need the connector_state for colorspace and scaling information and can get it from connector->state. Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-

[PATCH v4 09/13] drm/amd/display: Signal mode_changed if colorspace changed

2023-05-25 Thread Harry Wentland
We need to signal mode_changed to make sure we update the output colorspace. v2: No need to call drm_hdmi_avi_infoframe_colorimetry as DC does its own infoframe packing. Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Josh

[PATCH v4 08/13] drm/amd/display: Register Colorspace property for DP and HDMI

2023-05-25 Thread Harry Wentland
We want compositors to be able to set the output colorspace on DP and HDMI outputs, based on the caps reported from the receiver via EDID. Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Ville Syrjälä Cc: Meli

[PATCH v4 12/13] drm/amd/display: Add debugfs for testing output colorspace

2023-05-25 Thread Harry Wentland
In order to IGT test colorspace we'll want to print the currently enabled colorspace on a stream. We add a new debugfs to do so, using the same scheme as current bpc reporting. This might also come in handy when debugging display issues. v4: - Fix function doc comment - Fix sRGB debug print Sign

[PATCH v4 10/13] drm/amd/display: Send correct DP colorspace infopacket

2023-05-25 Thread Harry Wentland
Look at connector->colorimetry to determine output colorspace. We don't want to impact current SDR behavior, so DRM_MODE_COLORIMETRY_DEFAULT preserves current behavior. Also add support to explicitly set BT601 and BT709. v4: - Roll support for BT709 and BT601 into this patch - Add default case t

[PATCH v4 13/13] drm/amd/display: Refactor avi_info_frame colorimetry determination

2023-05-25 Thread Harry Wentland
From: Joshua Ashton Replace the messy two if-else chains here that were on the same value with a switch on the enum. Signed-off-by: Joshua Ashton Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen

[PATCH v4 05/13] drm/connector: Print connector colorspace in state debugfs

2023-05-25 Thread Harry Wentland
v3: Fix kerneldocs (kernel test robot) v4: Avoid returning NULL from drm_get_colorspace_name Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Jani Nikula Cc: Simon Ser Cc: Ville Syrjälä

[PATCH v4 11/13] drm/amd/display: Always set crtcinfo from create_stream_for_sink

2023-05-25 Thread Harry Wentland
From: Joshua Ashton Given that we always pass dm_state into here now, this won't ever trigger anymore. This is needed because otherwise we will always fail mode validation with invalid clocks or link bandwidth errors. Signed-off-by: Joshua Ashton Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebast

Re: [PATCH] drm/amd/amdgpu: Fix up locking etc in amdgpu_debugfs_gprwave_ioctl()

2023-05-25 Thread Alex Deucher
Applied. Thanks! Alex On Thu, May 25, 2023 at 4:05 AM Dan Carpenter wrote: > > There are two bugs here. > 1) Drop the lock if copy_from_user() fails. > 2) If the copy fails then the correct error code is -EFAULT instead of >-EINVAL. > > I also broke up the long line and changed "sizeof rd->
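
The two fixes quoted above (release the lock when the copy fails, and report -EFAULT rather than -EINVAL) follow a common kernel pattern that can be sketched in plain userspace C. This is an illustration under stated assumptions, not the amdgpu code: `toy_lock`, `toy_copy_from_user`, `toy_req`, and `toy_ioctl` are hypothetical stand-ins. The one detail borrowed from the real API is that `copy_from_user()` returns the number of bytes it could not copy, so any nonzero result means a fault.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy lock that only counts its depth, so a leaked lock is observable. */
struct toy_lock { int depth; };

static void toy_lock_acquire(struct toy_lock *l) { l->depth++; }
static void toy_lock_release(struct toy_lock *l)
{
	assert(l->depth > 0);
	l->depth--;
}

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static size_t toy_copy_from_user(void *dst, const void *src, size_t n)
{
	if (!src)
		return n;	/* simulate a faulting user pointer */
	memcpy(dst, src, n);
	return 0;
}

/* Hypothetical request layout, for illustration only. */
struct toy_req { int sq; int wave; };

static int toy_ioctl(struct toy_lock *lock, struct toy_req *state,
		     const void *uarg)
{
	struct toy_req req;
	int ret = 0;

	toy_lock_acquire(lock);

	if (toy_copy_from_user(&req, uarg, sizeof(req))) {
		ret = -EFAULT;	/* a failed copy is a fault, not a bad argument */
		goto out_unlock;
	}
	*state = req;

out_unlock:
	toy_lock_release(lock);	/* every exit path drops the lock */
	return ret;
}
```

Routing both the success and the error path through a single `out_unlock` label makes it hard to return with the lock still held, which is the first bug described in the message.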

[PATCH] drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices

2023-05-25 Thread Alex Deucher
Certain boards with GC IP 11.0.3 need slightly different handling in the shader compiler due to board specific bounding box optimizations. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drive

[PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap

2023-05-25 Thread Tom Rix
clang with W=1 reports drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error: unused function 'get_reserved_sdma_queues_bitmap' [-Werror,-Wunused-function] static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm) ^ Th

Re: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap

2023-05-25 Thread Nathan Chancellor
You can actually go a step farther and remove the reserved_sdma_queues_bitmap member from 'struct kfd_device_info' because it is now only assigned, never read. $ git grep reserved_sdma_queues_bitmap next-20230525 next:20230525:drivers/gpu/drm/amd/amdkfd/kfd_device.c: kfd->device_info.

Re: [PATCH] drm/amdgpu: Fix up kdoc in amdgpu_acpi.c

2023-05-25 Thread Alex Deucher
On Thu, May 25, 2023 at 2:03 PM Srinivasan Shanmugam wrote: > > Fix these warnings by adding & deleting the deviant arguments. > > gcc with W=1 > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Function parameter or > member 'numa_info' not described in 'amdgpu_acpi_get_node_id' > drivers/

Re: [PATCH 2/2] drm/amdgpu: Remove duplicate fdinfo fields

2023-05-25 Thread Alex Deucher
On Thu, May 25, 2023 at 11:52 AM Rob Clark wrote: > > From: Rob Clark > > Some of the fields that are handled by drm_show_fdinfo() crept back in > when rebasing the patch. Remove them again. > > Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") > Signed-off-by: Rob Clark Series is: R

[PATCH] drm/amdgpu: move gfx9_cs_data definition

2023-05-25 Thread Tom Rix
gcc with W=1 reports In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:32: drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: error: ‘gfx9_cs_data’ defined but not used [-Werror=unused-const-variable=] 939 | static const struct cs_section_def gfx9_cs_data[] = { |

Re: [PATCH] drm/amdgpu: move gfx9_cs_data definition

2023-05-25 Thread Alex Deucher
On Thu, May 25, 2023 at 4:35 PM Tom Rix wrote: > > gcc with W=1 reports > In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:32: > drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: error: > ‘gfx9_cs_data’ defined but not used [-Werror=unused-const-variable=] > 939 | static const

Re: [PATCH 01/13] drm: execution context for GEM buffers v4

2023-05-25 Thread Danilo Krummrich
On 5/4/23 13:51, Christian König wrote: This adds the infrastructure for an execution context for GEM buffers which is similar to the existing TTM execbuf util and intended to replace it in the long term. The basic functionality is that we abstract the necessary loop to lock many different GEM
