We need heavy-weight flushes not just for SVM. If this is broken it will
affect ROCm either way.
Regards,
Felix
On 2023-09-07 08:08, Lang Yu wrote:
GC 10.1.3/4 have problems with TLB_FLUSH_HEAVYWEIGHT
which is used by SVM in svm_range_unmap_from_gpus().
This causes problems on GC 10.1.3/4.
] amdgpu_driver_load_kms+0x1a/0x110 [amdgpu]
[ +0.000545] amdgpu_pci_probe+0x197/0x400 [amdgpu]
Fixes: cfeaeb3c0ce7 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells")
Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
---
v1->v2:
- Update the logic to make it work with both 32
On 2023-08-31 16:33, Chen, Xiaogang wrote:
That said, I'm not actually sure why we're freeing the DMA
address array after migration to RAM at all. I think we still
need it even when we're using VRAM. We call svm_range_dma_map in
svm_range_validate_and_map regardless of whether the range is in
ee the last value written by user mode. With your
changes, this is no longer writable, and driver code is now looking at
adev->debug_vm, which cannot be updated through sysfs. As long as
everyone is OK with that change, I have no objections. Just pointing it out.
Regardless
On 2023-08-30 19:02, Chen, Xiaogang wrote:
On 8/30/2023 3:56 PM, Felix Kuehling wrote:
On 2023-08-30 15:39, Chen, Xiaogang wrote:
On 8/28/2023 5:37 PM, Felix Kuehling wrote:
On 2023-08-28 16:57, Chen, Xiaogang wrote:
On 8/28/2023 2:06 PM, Felix Kuehling wrote:
On 2023-08-24 18:08
On 2023-08-30 15:39, Chen, Xiaogang wrote:
On 8/28/2023 5:37 PM, Felix Kuehling wrote:
On 2023-08-28 16:57, Chen, Xiaogang wrote:
On 8/28/2023 2:06 PM, Felix Kuehling wrote:
On 2023-08-24 18:08, Xiaogang.Chen wrote:
From: Xiaogang Chen
This patch implements partial migration in gpu
On 2023-08-30 16:01, Mukul Joshi wrote:
This patch fixes the following unaligned 64-bit doorbell
warning seen when submitting packets on HIQ on GFX v9.4.3
by making the HIQ doorbell 64-bit aligned.
The warning is seen when GPU is loaded in any mode other
than SPX mode.
[ +0.000301]
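For illustration only, assuming doorbells are indexed in 32-bit dwords: a 64-bit doorbell is then 8-byte aligned exactly when its dword index is even. The sketch below shows that rule in plain userspace C. The helper names are hypothetical, and the actual change made by the patch is not visible in this excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a dword-based doorbell index is 8-byte aligned. */
static inline bool doorbell_index_is_64bit_aligned(uint32_t dword_index)
{
    return (dword_index & 1u) == 0;
}

/* Hypothetical helper: round a dword index up to the next even index. */
static inline uint32_t doorbell_index_align_64bit(uint32_t dword_index)
{
    return (dword_index + 1u) & ~1u;
}
```

An odd index such as 3 would be moved to 4, while an already even index is left unchanged.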
+Shashank, FYI. I believe this is a regression from your patch
"drm/amdgpu: use doorbell mgr for kfd kernel doorbells".
On 2023-08-29 12:16, Mukul Joshi wrote:
This patch fixes the following unaligned 64-bit doorbell
warning seen when submitting packets on HIQ on GFX v9.4.3
by making the HIQ
On 2023-08-28 16:57, Chen, Xiaogang wrote:
On 8/28/2023 2:06 PM, Felix Kuehling wrote:
On 2023-08-24 18:08, Xiaogang.Chen wrote:
From: Xiaogang Chen
This patch implements partial migration in gpu page fault according
to migration
granularity(default 2MB) and not split svm range in cpu
On 2023-08-24 18:08, Xiaogang.Chen wrote:
From: Xiaogang Chen
This patch implements partial migration in gpu page fault according to migration
granularity(default 2MB) and not split svm range in cpu page fault handling.
Now a svm range may have pages from both system ram and vram of one gpu.
Signed-off-by: Harish Kasiviswanathan
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
index 23
On 2023-08-28 11:35, Alex Sierra wrote:
Interrupt sq data bits were not taken properly from contextid0 and contextid1.
Use macro KFD_CONTEXT_ID_GET_SQ_INT_DATA instead.
Signed-off-by: Alex Sierra
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 2
On 2023-08-26 09:41, Asad Kamal wrote:
Replace pr_err with dev_err to show the bus-id of
failing device with kfd queue errors
Signed-off-by: Asad Kamal
Reviewed-by: Lijo Lazar
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 116
no need to enable this.
It's often buggy on consumer platforms anyway.
This is not needed for stable.
I agree. I was about to comment in the 5.10 patch as well.
Regards,
Felix
Alex
Reviewed-by: Felix Kuehling
Acked-by: Christian König
Tested-by: Mike Lothian
Signed-off-by: Alex
On 2023-08-22 9:49, Bhardwaj, Rajneesh wrote:
On 8/21/2023 4:32 PM, Felix Kuehling wrote:
On 2023-08-21 15:20, Rajneesh Bhardwaj wrote:
Rework the KFD max system memory and ttm limit to allow bigger
system memory allocations up to 63/64 of the available memory which is
controlled by ttm
Would it make sense to include a link to a better explanation of the
underlying issue? E.g. https://lwn.net/Articles/624126/?
Regards,
Felix
On 2023-08-21 07:23, Christian König wrote:
Am 04.08.23 um 07:46 schrieb Srinivasan Shanmugam:
Instead of declaring pointers use READ_ONCE(), when
On 2023-08-21 15:29, Philip Yang wrote:
If mGPUs are in the same IOMMU group, or RAM is direct mapped, then mGPUs
can share the original BO for GTT mapping DMA addresses, without creating
a new BO from dmabuf export/import.
Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
---
drivers
On 2023-08-21 15:20, Rajneesh Bhardwaj wrote:
Rework the KFD max system memory and ttm limit to allow bigger
system memory allocations up to 63/64 of the available memory which is
controlled by ttm module params pages_limit and page_pool_size. Also for
NPS1 mode, report the max ttm limit as the
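As a worked example of the 63/64 figure quoted above: one plain way to compute such a limit is to subtract 1/64 of the total. This is only arithmetic on that number. The function name is hypothetical and the driver's real computation is not shown in this excerpt.

```c
#include <stdint.h>

/*
 * Hypothetical helper: 63/64 of total_bytes, computed as total minus
 * total/64. The result is exact when total_bytes is a multiple of 64;
 * otherwise the subtracted part is rounded down.
 */
static inline uint64_t limit_63_of_64(uint64_t total_bytes)
{
    return total_bytes - (total_bytes >> 6);
}
```

With 64 GiB of memory, this yields a 63 GiB limit.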
ace Chen
Acked-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index df633e9ce920..cdf6087706aa 100
pages fallback.
Signed-off-by: Alex Sierra
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 93609ea42163..3ebd5d99f39e 100644
some calls to cond_resched().
But then I would expect cond_resched() to fix the problem, according to
this document.
Regards,
Felix
On 2023-08-11 17:27, Chen, Xiaogang wrote:
On 8/11/2023 4:22 PM, Felix Kuehling wrote:
On 2023-08-11 17:12, Chen, Xiaogang wrote:
I know the original
Zhu
Acked-by: Christian König
Reviewed-by: Felix Kuehling
-v2: added warning message
-v3: use dev_warn
---
drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c | 13 -
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 10 +-
2 files changed, 21 insertions(+), 2 deletions(-)
diff
On 2023-08-11 17:12, Chen, Xiaogang wrote:
I know the original jira ticket. The system hit an RCU CPU stall, then the
kernel panicked and stopped responding, including over ssh. This patch lets
the prange list update task yield the CPU after each range update. That can
prevent the task from holding the mm lock too long.
Calling
with preemption disabled.
- A CPU looping with bottom halves disabled.
Or is there another thread that has an mmap_write_lock inside an RCU
read critical section that's getting stalled by the mmap_read_lock?
Regards,
Felix
On 2023-08-11 16:50, James Zhu wrote:
On 2023-08-11 16:06, Felix
On 2023-08-11 16:23, James Zhu wrote:
Return 0 when drm device alloc failed with -ENOSPC in
order to allow amdgpu driver loading. But the xcp without
drm device node assigned won't be visible in user space.
This helps amdgpu driver loading on system which has more
than 64 nodes, the current
On 2023-08-11 15:11, James Zhu wrote:
update_list could be big in list_for_each_entry(prange, &update_list,
update_list),
and mmap_read_lock(mm) is held the whole time; adding schedule() can remove
the RCU stall on CPU for this case.
RIP: 0010:svm_range_cpu_invalidate_pagetables+0x317/0x610 [amdgpu]
Am 2023-08-09 um 15:09 schrieb Alex Deucher:
We need the domains in amdgpu_drm.h for the kernel driver to manage
the pool, but we don't want userspace using it until the code
is ready. So reject for now.
Signed-off-by: Alex Deucher
Acked-by: Felix Kuehling
---
drivers/gpu/drm/amd
than one watchpoint event, so test B check out and report error on
second or third watchpoint not set by itself.
Regards,
Eric
On 2023-08-10 17:56, Felix Kuehling wrote:
I think Jon is suggesting that the UNMAP_QUEUES command should clear
the address watch registers. Requesting such a change from
Am 2023-08-11 um 06:11 schrieb Mike Lothian:
On Thu, 3 Aug 2023 at 20:43, Felix Kuehling wrote:
Is your kernel configured without dynamic debugging? Maybe we need to
wrap this in some #if defined(CONFIG_DYNAMIC_DEBUG_CORE).
Apologies, I thought I'd replied to this, yes I didn't have dynamic
different because it needs to support multiple XCCs.
That said, this patch is
Reviewed-by: Felix Kuehling
On 2023-08-10 16:47, Eric Huang wrote:
KFD currently relies on MEC FW to clear tcp watch control
register by sending MAP_PROCESS packet with 0 of field
tcp_watch_cntl to HWS
I think amdgpu_amdkfd_gc_9_4_3.c needs a similar fix. But maybe a bit
different because it needs to support multiple XCCs.
That said, this patch is
Reviewed-by: Felix Kuehling
On 2023-08-10 16:47, Eric Huang wrote:
KFD currently relies on MEC FW to clear tcp watch control
register
On 2023-08-10 15:03, Jonathan Kim wrote:
Remove redundant assignment when skipping process ctx clear.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm
On 2023-08-09 17:26, Jay Cornwall wrote:
Previously asymptomatic because high 32 bits were zero.
Fixes: 615222cfed20 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
Signed-off-by: Jay Cornwall
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_packet_
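A generic illustration of how a bug can stay asymptomatic while the high 32 bits are zero: if only the low half of a 64-bit address is handled correctly, nothing goes wrong until the address moves above 4 GiB. The helpers below are a userspace sketch with hypothetical names. The actual fix in this patch is not visible in the excerpt.

```c
#include <stdint.h>

/* Hypothetical helper: low 32 bits of a 64-bit address. */
static inline uint32_t addr_lo32(uint64_t addr)
{
    return (uint32_t)(addr & 0xffffffffu);
}

/* Hypothetical helper: high 32 bits of a 64-bit address. */
static inline uint32_t addr_hi32(uint64_t addr)
{
    return (uint32_t)(addr >> 32);
}
```

For an address below 4 GiB the high half is 0, so a mistake in computing it would go unnoticed until addresses with nonzero high bits are used.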
The patch is
Reviewed-by: Felix Kuehling
I'm applying it to amd-staging-drm-next.
Regards,
Felix
---
v1 -> v2
caller checks for errors, hence removed
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 10 +-
1 file changed, 1 insertion(+), 9 deletions(-)
diff --
On 2023-08-08 16:57, Atul Raut wrote:
To prevent its redundant implementation and streamline
code, use memdup_user.
This fixes warnings reported by Coccinelle:
./drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:2811:13-20: WARNING
opportunity for memdup_user
Signed-off-by: Atul Raut
---
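For readers unfamiliar with the pattern Coccinelle flags here: the kernel's memdup_user() combines the allocation and the copy from user space into one call. The sketch below is a userspace analog of the same idea, not kernel code. The name dup_buffer is hypothetical, and it reports failure with NULL, whereas memdup_user() returns an ERR_PTR value.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analog of the allocate-then-copy pattern:
 * allocate len bytes and copy src into them. Returns NULL if the
 * allocation fails. The caller frees the result with free().
 */
static void *dup_buffer(const void *src, size_t len)
{
    void *copy = malloc(len ? len : 1);

    if (!copy)
        return NULL;
    memcpy(copy, src, len);
    return copy;
}
```

Callers then need a single error check instead of separate checks for the allocation and the copy.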
On 2023-08-07 18:05, Alex Deucher wrote:
We are dropping the IOMMUv2 path, so no need to enable this.
It's often buggy on consumer platforms anyway.
Signed-off-by: Alex Deucher
The series is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 4
1 file
I just applied Arnd Bergmann's patch "drm/amdkfd: fix build failure
without CONFIG_DYNAMIC_DEBUG". This patch is no longer needed.
Regards,
Felix
On 2023-08-04 12:05, Alex Sierra wrote:
This causes a compilation error if CONFIG_DYNAMIC_DEBUG_CORE is not
defined.
Signed-off-by: Alex Sierra
debug disabled")
Signed-off-by: Arnd Bergmann
The patch is
Reviewed-by: Felix Kuehling
I'm applying it to amd-staging-drm-next.
Thanks,
Felix
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
b/driver
/../amdkfd/kfd_svm.c:50:2: note: expanded
from macro 'dynamic_svm_range_dump'
_dynamic_func_call_no_desc("svm_range_dump", svm_range_debug_dump, svms)
^
1 error generated.
Cheers
Mike
On Wed, 19 Jul 2023 at 22:27, Felix Kuehling wrote:
Am 2023-07-19 um 17:22 schrieb A
On 2023-07-31 16:40, Jay Cornwall wrote:
Some changes have been lost during rebases. Rebuild sources.
Signed-off-by: Jay Cornwall
The series is
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 741 +-
1 file changed, 371 insertions
mapping
intact.
Signed-off-by: Alex Sierra
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 7 +--
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 61 +---
drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 2 +-
3 files changed, 50 insertions(+), 20
On 2023-07-28 15:39, David Francis wrote:
These flags (for GEM and SVM allocations) allocate
memory that allows for system-scope atomic semantics.
On GFX943 these flags cause caches to be avoided on
non-local memory.
On all other ASICs they are identical in functionality to the
equivalent
There are some APU-specific code paths for Kaveri and Carrizo in the
device queue manager and MQD manager. I think a minimal fix would be to
change device_queue_manager_init to call
device_queue_manager_init_cik_hawaii for Kaveri and
device_queue_manager_init_vi_tonga for Carrizo to use the
On 2023-07-27 19:43, Alex Sierra wrote:
DMA address reference within svm_ranges should be unmapped only after
the memory has been released from the system. In case of range
splitting, the DMA address information should be copied to the
corresponding range after this has split. But leaving dma
In amdgpu_dma_buf_create_obj we copy the coherence-related flags to the
SG BO that's used to attach the BO to the importer device. You need to
add the new flag to the list.
Some more nit-picks inline.
Am 2023-07-26 um 09:34 schrieb David Francis:
These flags (for GEM and SVM allocations)
* is less GPUVM Base
+*/
+ if (((uint64_t)kgd_mem->va <= pdd->gpuvm_base) &&
kgd_mem->va)
Unnecessary parentheses around (a <= b). In this condition I'd also
prefer to put kgd_mem->va first, because it short-circuits execution for
the case tha
Am 2023-07-25 um 16:04 schrieb Errabolu, Ramesh:
[AMD Official Use Only - General]
Responses inline.
-Original Message-
From: Kuehling, Felix
Sent: Monday, July 24, 2023 2:51 PM
To: amd-gfx@lists.freedesktop.org; Errabolu, Ramesh
Subject: Re: [PATCH] drm/amdgpu: Checkpoint and
Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
Michel, can you test whether this fixes your regression on Raven? Would
be good to get a Tested-by for this patch, since we haven't been able to
reproduce the problem yet.
Thanks,
Felix
---
drivers/gpu/drm/amd/amdkfd
On 2023-07-24 11:57, Ramesh Errabolu wrote:
Extend checkpoint logic to allow inclusion of VRAM BOs that
do not have a VA attached
Signed-off-by: Ramesh Errabolu
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git
and svm_range_debug_dump functions
are dynamically enabled to print svm_range_debug_dump debug traces.
Signed-off-by: Alex Sierra
Tested-by: Alex Sierra
Signed-off-by: Philip Yang
Signed-off-by: Felix Kuehling
I don't think my name on a Signed-off-by is appropriate here. I didn't
write the patch. And I'm
and svm_range_debug_dump functions
are dynamically enabled to print svm_range_debug_dump debug traces.
Signed-off-by: Alex Sierra
Tested-by: Alex Sierra
Signed-off-by: Philip Yang
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 3
Am 2023-07-08 um 12:57 schrieb Alex Sierra:
svm_range_debug_dump should not be called at all when dynamic debug
is disabled to avoid iterating over SVM lists. This could drop
performance, especially with a large number of SVM ranges.
Signed-off-by: Alex Sierra
Signed-off-by: Philip Yang
---
inline. With those fixed, the patch is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 +
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 ++
drivers/gpu/drm/amd/amdkfd
Am 2023-07-19 um 03:51 schrieb Kefeng Wang:
Use the helpers to simplify code.
Cc: Felix Kuehling
Cc: Alex Deucher
Cc: "Christian König"
Cc: "Pan, Xinhui"
Cc: David Airlie
Cc: Daniel Vetter
Signed-off-by: Kefeng Wang
Reviewed-by: Felix Kuehling
---
driver
Am 2023-07-14 um 05:37 schrieb Jonathan Kim:
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.
Signed-off-by: Jonathan Kim
This patch is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 5
Am 2023-07-14 um 05:37 schrieb Jonathan Kim:
MES can concurrently schedule queues on the device that require
exclusive device access if marked exclusively_scheduled without the
requirement of GWS. Similar to the F32 HWS, MES will manage
quality of service for these queues.
Use this for
On 2023-07-16 22:26, Guchun Chen wrote:
~0 as no xcp partition is used in several places, so improve its
definition by a macro for code consistency.
Suggested-by: Christian König
Signed-off-by: Guchun Chen
The series is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu
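A minimal sketch of the idea of naming the ~0 sentinel with a macro, so that every user compares against the same symbol. The macro and helper names here are hypothetical. The names actually used by the series are not visible in this excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the "no xcp partition" sentinel value. */
#define XCP_NO_PARTITION (~0u)

/* Hypothetical helper: true if xcp_id refers to a real partition. */
static inline bool xcp_id_is_assigned(uint32_t xcp_id)
{
    return xcp_id != XCP_NO_PARTITION;
}
```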
Am 2023-07-14 um 10:26 schrieb Vlastimil Babka:
On 7/12/23 18:24, Felix Kuehling wrote:
Allocations in the heap and stack tend to be small, with several
allocations sharing the same page. Sharing the same page for different
allocations with different access patterns leads to thrashing when we
with PDD, delay doorbell process
page allocation until really needed (Felix)
Cc: Alex Deucher
Cc: Christian Koenig
Cc: Felix Kuehling
Acked-by: Christian König
Signed-off-by: Shashank Sharma
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 20
(Alex)
V3:
- Move single variable declaration below (Christian)
- Add a to-do item to reuse the KGD kernel level doorbells for
KFD for non-MES cases, instead of reserving one page (Felix)
Cc: Alex Deucher
Cc: Christian Koenig
Cc: Felix Kuehling
Signed-off-by: Shashank Sharma
Reviewed
On 2023-07-13 10:50, Stanley.Yang wrote:
Disable RAS feature by default for aqua vanjaram on APU platform.
Changed from V1:
Split Disable RAS by default on APU platform into a
separated patch.
Signed-off-by: Stanley.Yang
Reviewed-by: Hawking Zhang
---
Allocations in the heap and stack tend to be small, with several
allocations sharing the same page. Sharing the same page for different
allocations with different access patterns leads to thrashing when we
migrate data back and forth on GPU and CPU access. To avoid this we
disable HMM
Am 2023-07-12 um 11:55 schrieb Shashank Sharma:
On 11/07/2023 21:51, Felix Kuehling wrote:
On 2023-07-06 09:39, Christian König wrote:
Am 06.07.23 um 15:37 schrieb Shashank Sharma:
On 06/07/2023 15:22, Christian König wrote:
Am 06.07.23 um 14:35 schrieb Shashank Sharma:
A Memory queue
On 2023-07-06 09:39, Christian König wrote:
Am 06.07.23 um 15:37 schrieb Shashank Sharma:
On 06/07/2023 15:22, Christian König wrote:
Am 06.07.23 um 14:35 schrieb Shashank Sharma:
A Memory queue descriptor (MQD) of a userqueue defines it in
the hw's context. As MQD format can vary between
On 2023-07-11 09:31, Christian König wrote:
Avoids quite a bit of logic and kmalloc overhead.
v2: fix multiple problems pointed out by Felix
Signed-off-by: Christian König
Two nit-picks inline about DRM_EXEC_INTERRUPTIBLE_WAIT. With those
fixed, the patch is
Reviewed-by: Felix Kuehling
On 2023-07-11 10:28, Eric Huang wrote:
Read/write grace period from/to first xcc instance of
xcp in kfd node.
Signed-off-by: Eric Huang
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 21 ---
.../drm/amd/amdkfd/kfd_device_queue_manager.h | 2 +-
ways available for debug/trap
handling.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
b/drivers/gpu/drm/amd/am
oing to flood the log. It would be a
good idea to apply a rate-limit, or use dev_warn_once. With that fixed,
the patch is
Reviewed-by: Felix Kuehling
}
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index dd1c2eded6b9..6c6184f0dbc1 1
Am 2023-07-07 um 10:14 schrieb Philip Yang:
Retry faults are delegated to IH soft ring and then processed by
deferred worker. Current IH soft ring size PAGE_SIZE can store 128
entries, which may overflow and drop retry faults, causing the HW to get stuck
because the retry fault is not recovered.
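The 128-entry figure can be checked with simple arithmetic, assuming a 4 KiB PAGE_SIZE: 128 entries in 4096 bytes implies 32 bytes per entry. Both constants below are assumptions derived from those numbers, and the names are hypothetical.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define ASSUMED_PAGE_SIZE 4096u
/* Assumption: 32 bytes per IH entry, i.e. 4096 / 128. */
#define ASSUMED_IH_ENTRY_BYTES 32u

/* Hypothetical helper: how many IH entries fit in a ring of ring_bytes. */
static inline size_t ih_ring_capacity(size_t ring_bytes)
{
    return ring_bytes / ASSUMED_IH_ENTRY_BYTES;
}
```

Under these assumptions, any larger ring size scales the capacity linearly, for example 8 pages would hold 1024 entries.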
Am 2023-06-20 um 22:11 schrieb Ramesh Errabolu:
Call KFD api to get Dmabuf instead of calling GEM Prime API
Signed-off-by: Ramesh Errabolu
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git
Can we change the flags if needed? E.g. see what
amdgpu_bo_pin_restricted does:
if (!(bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS))
bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
amdgpu_bo_placement_from_domain(bo, domain);
This shouldn't really change
to a no-retry fault.
Additionally, have 2 sets of invalid PTE settings, one for
TF enabled, the other for TF disabled. The setting with
TF disabled, doesn't work with TF enabled.
Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
---
v1->v2:
- Update handling according to Christian's feedb
at amdgpu_vm_make_compute.
Signed-off-by: Xiaogang Chen
Reviewed-by: Felix Kuehling
As discussed, we can follow this up with a change that enables ATS for
graphics VMs as well, so we don't need to enable ATS in
amdgpu_vm_make_compute. This would improve interop for Raven. We only
enable ATS for the lower half
On 2023-06-19 15:06, Xiaogang.Chen wrote:
From: Xiaogang Chen
Since we allow KFD and graphics to operate on the same GPU VM for interoperation
between them, the GPU VM may have been used by graphics VM operations before KFD turns
a GFX VM into a compute VM. Remove vm clean checking at
On 2023-06-16 14:44, Mukul Joshi wrote:
Enable GWS capable queue creation for forward
progress guarantee on GFX 9.4.3.
Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
---
v1->v2:
- Update the condition for setting pqn->q->gws
for GFX 9.4.3.
drivers/gpu/drm/a
On 2023-06-16 14:00, Mukul Joshi wrote:
Currently, we unmap HIQ by directly writing to HQD
registers. This doesn't work for GFX9.4.3. Instead,
use KIQ to unmap HIQ, similar to how we use KIQ to
map HIQ. Using KIQ to unmap HIQ works for all GFX
series post GFXv9.
Signed-off-by: Mukul Joshi
On 2023-06-16 13:59, Mukul Joshi wrote:
Enable GWS capable queue creation for forward
progress guarantee on GFX 9.4.3.
Signed-off-by: Mukul Joshi
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 1 +
.../amd/amdkfd/kfd_process_queue_manager.c| 31 ---
2 files
Am 2023-06-16 um 06:23 schrieb Lijo Lazar:
Modify it such that it doesn't change the instance mask parameter.
Signed-off-by: Lijo Lazar
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git
Am 2023-06-16 um 00:29 schrieb Felix Kuehling:
Am 2023-06-15 um 18:54 schrieb Alex Sierra:
This flag determines whether the host possesses coherent access to
the memory of the device.
Signed-off-by: Alex Sierra
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4
1 file changed, 4
Am 2023-06-15 um 18:54 schrieb Alex Sierra:
This flag determines whether the host possesses coherent access to
the memory of the device.
Signed-off-by: Alex Sierra
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4
1 file changed, 4 insertions(+)
diff --git
Am 2023-06-15 um 03:37 schrieb Christian König:
Am 14.06.23 um 17:42 schrieb Felix Kuehling:
Am 2023-06-14 um 06:38 schrieb Christian König:
Am 10.05.23 um 00:01 schrieb Alex Deucher:
From: Rajneesh Bhardwaj
This adds dummy vram manager to support ASICs that do not have a
dedicated
through the regular VRAM manager.
Regards,
Felix
Christian.
Reviewed-by: Felix Kuehling
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 67 ++--
1 file changed, 60 insertions(+), 7 deletions(-)
diff --git
Am 2023-06-13 um 22:04 schrieb Jiapeng Chong:
Use memdup_user() rather than duplicating its implementation. This is a
little bit restricted to reduce false positives.
./drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:2813:13-20: WARNING
opportunity for memdup_user.
Reported-by: Abaci
On 2023-06-13 17:48, Jonathan Kim wrote:
Queue count should decrement on queue destruction regardless of HWS
support type.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
1 file changed, 1 insertion(+), 1
Am 2023-06-07 um 12:31 schrieb Alex Deucher:
The wptr needs to be incremented at at least 64 dword intervals,
use 256 to align with windows. This should fix potential hangs
with unaligned updates.
Signed-off-by: Alex Deucher
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu
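One way to picture an alignment requirement like this is rounding a write pointer up to the next multiple of the chosen interval. The sketch below does that for a power-of-two interval measured in dwords. The function name is hypothetical, and whether the patch pads, rounds, or does something else is not visible in this excerpt.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round wptr_dw up to the next multiple of
 * align_dw. align_dw must be a power of two (for example 64 or 256).
 */
static inline uint64_t wptr_align_up(uint64_t wptr_dw, uint64_t align_dw)
{
    return (wptr_dw + align_dw - 1) & ~(align_dw - 1);
}
```

With a 256-dword interval, a pointer at 1 moves to 256 and a pointer at 257 moves to 512. Any multiple of 256 is also a multiple of 64, so the larger interval satisfies the smaller requirement too.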
Am 2023-06-12 um 12:23 schrieb Mukul Joshi:
Update the invalid PTE flag setting with TF enabled.
This is to ensure, in addition to transitioning the
retry fault to a no-retry fault, it also causes the
wavefront to enter the trap handler. With the current
setting, the fault only transitions to
Am 2023-06-12 um 11:46 schrieb Jonathan Kim:
Null check should be done on queue struct itself and not on the
process queue list node.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion
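A small userspace sketch of the distinction the commit message draws: while walking a list, the node being visited is never NULL, so the meaningful check is on the queue pointer stored in the node. The struct and function names are hypothetical, and a plain singly linked list stands in for the kernel list helpers.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the queue and its list node. */
struct queue {
    int id;
};

struct queue_node {
    struct queue *q;          /* may be NULL if no queue is attached */
    struct queue_node *next;
};

/* Count the nodes that actually carry a queue. */
static int count_attached_queues(const struct queue_node *head)
{
    int count = 0;

    for (const struct queue_node *node = head; node; node = node->next) {
        /* node is non-NULL here by construction; check node->q instead. */
        if (!node->q)
            continue;
        count++;
    }
    return count;
}
```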
, the series is
Reviewed-by: Felix Kuehling
David, I looked at the ROCr and Thunk changes as well, and they look
reasonable to me. Do you have any feedback on these patches from a ROCr
point of view? Is there a reasonable stress test that could be used to check that
this handles the race conditions
[+Jon]
Am 2023-06-12 um 07:58 schrieb Lu Hongfei:
pqn bound in list_for_each_entry loop will not be null, so there is
no need to check whether pqn is NULL or not.
Thus remove a redundant null pointer check.
Signed-off-by: Lu Hongfei
---
The filename of the previous version was:
From the KFD perspective, the series is
Reviewed-by: Felix Kuehling
David, I looked at the ROCr and Thunk changes as well, and they look
reasonable to me. Do you have any feedback on these patches from a ROCr
point of view? Is there a reasonable stress test that could be used to
check
On 2023-06-09 16:13, James Zhu wrote:
Set waiter's activated flag true when event age does not match last_event_age.
-v4: add event type check
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 15 +++
1 file changed, 11 insertions(+), 4 deletions(-)
diff
On 2023-06-08 13:07, James Zhu wrote:
Set waiter's activated flag true when event age does not match last_event_age.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 15 +++
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git
-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 32 ---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 12 -
2 files changed, 44 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index
On 2023-04-25 15:59, Shashank Sharma wrote:
On 24/04/2023 21:56, Felix Kuehling wrote:
On 2023-04-22 2:39, Shashank Sharma wrote:
- KFD process level doorbells: doorbell pages which are allocated by
kernel but mapped and written by userspace processes, saved in
struct pdd->qpd->doo
eld? That
could have avoided the need for the compatibility checks.
Anyway, the patch is
Reviewed-by: Felix Kuehling
amdgpu_mes_lock(&adev->mes);
r = adev->mes.funcs->misc_op(&adev->mes, &op_input);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
b/drivers/gpu/drm/a
On 2023-06-07 13:32, Jonathan Kim wrote:
Legacy debug devices limited to pinning a single debug VMID for debugging
are the only devices that require disabling GFX OFF while accessing
debug registers. Debug devices that support multi-process debugging
rely on the hardware scheduler to update
On 2023-06-07 13:26, Jonathan Kim wrote:
There are a few fixes required to enable gfx11 debugging.
First, ADD_QUEUE.trap_en is an inappropriate place to toggle
a per-process register so move it to SET_SHADER_DEBUGGER.trap_en.
When ADD_QUEUE.skip_process_ctx_clear is set, MES will prioritize
On 2023-06-06 12:24, James Zhu wrote:
Don't sleep when the event age does not match, and update last_event_age.
It is only for KFD_EVENT_TYPE_SIGNAL which is checked by user space.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 15 +++
1 file changed, 15
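The behaviour described above can be sketched as a small state update: if the age the waiter last saw differs from the event's current age, record the new age and mark the waiter as activated so it does not sleep. This is a userspace sketch under that reading of the commit message. The struct and function names are hypothetical and do not reflect the driver's actual data layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical waiter state. */
struct event_waiter {
    bool activated;
    uint64_t last_event_age;
};

/*
 * Compare the waiter's recorded age with the event's current age.
 * On a mismatch, update the recorded age, mark the waiter activated,
 * and return true to indicate that it should not sleep.
 */
static bool waiter_check_event_age(struct event_waiter *w, uint64_t event_age)
{
    if (w->last_event_age == event_age)
        return false;

    w->last_event_age = event_age;
    w->activated = true;
    return true;
}
```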
On 2023-06-06 12:24, James Zhu wrote:
Add event age tracking
Signed-off-by: James Zhu
---
include/uapi/linux/kfd_ioctl.h | 13 +++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index
mdkfd: Update SDMA queue management for GFX9.4.3")
Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 13 ++---
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 10 +-
drivers/gpu/drm/amd/amdkfd/