[PATCH] drm/amdkfd: avoid recursive lock in migrations back to RAM

2021-11-01 Thread Alex Sierra
[Why]: When we call hmm_range_fault to map memory after a migration, we don't expect memory to be migrated again as a result of hmm_range_fault. The driver ensures that all memory is in GPU-accessible locations so that no migration should be needed. However, there is one corner case where hmm_range

Re: [PATCH] drm/amdkfd: update gfx target version for Renoir

2021-11-01 Thread Huang Rui
On Tue, Nov 02, 2021 at 02:01:17AM +0800, Sider, Graham wrote: > Previously Renoir compiler gfx target version was forced to Raven. > Update driver side for completeness. > > Signed-off-by: Graham Sider Reviewed-by: Huang Rui > --- > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +- > 1 file ch

Re: [PATCH v2 1/1] drm/amdkfd: Add sysfs bitfields and enums to uAPI

2021-11-01 Thread Felix Kuehling
Ping. On 2021-09-13 at 5:23 p.m., Felix Kuehling wrote: > These bits are de-facto part of the uAPI, so declare them in a uAPI header. > > The corresponding bit-fields and enums in user mode are defined in > https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/master/include/hsakmttypes.

Re: [PATCH] drm/amdkfd: avoid recursive lock in migrations back to RAM

2021-11-01 Thread Felix Kuehling
On 2021-11-01 at 6:04 p.m., Alex Sierra wrote: > [Why]: > During hmm_range_fault validation calls on VRAM migrations, This sounds a bit confusing. I think the hmm_range_fault is not called from a migration, but right after a migration, in the context of a GPU page fault handler. I would explain t

[PATCH] drm/amdkfd: avoid recursive lock in migrations back to RAM

2021-11-01 Thread Alex Sierra
[Why]: During hmm_range_fault validation calls on VRAM migrations, there could be cases where some pages within the range could be marked as Read Only (COW) triggering a migration back to RAM. In this case, the migration to RAM will try to grab mutexes that have been held already before the hmm_ran

RE: [PATCH] drm/amdgpu: Make sure to reserve BOs before adding or removing

2021-11-01 Thread Russell, Kent
[AMD Official Use Only] Hi Ramesh, kfd_mem_attach_dmabuf would deadlock if the BO is already reserved, which is obviously problematic. Also, if it's AQL, we make 2 BOs, and would need to reserve each one of those during addition, which we can't easily do if we lock outside of attach. Kent

[PATCH v3 1/3] drm/amd/pm: Add missing mutex for pp_get_power_profile_mode

2021-11-01 Thread Mario Limonciello
Prevent possible issues from set and get being called simultaneously. Signed-off-by: Mario Limonciello --- Changes from v2->v3: * New patch drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/

[PATCH v3 3/3] drm/amdgpu/pm: Don't show pp_power_profile_mode for unsupported devices

2021-11-01 Thread Mario Limonciello
This command corresponding to this attribute was deprecated in the PMFW for YC so don't show a non-functional attribute. Verify that the function has been implemented by the subsystem. Suggested-by: Alex Deucher Signed-off-by: Mario Limonciello --- Changes from v2->v3: * Handle powerplay to re

[PATCH v3 2/3] drm/amd/pm: Adjust returns when power_profile_mode is not supported

2021-11-01 Thread Mario Limonciello
This better aligns that the caller can make a mistake with the buffer and -EINVAL should be returned, but if the hardware doesn't support the feature it should be -EOPNOTSUPP. Signed-off-by: Mario Limonciello --- Changes from v2->v3: * New patch drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
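The return-code split described in this patch can be illustrated with a minimal sketch. It is not the driver code: `struct pm_funcs_sketch`, `get_power_profile_mode_sketch` and `fake_get_mode` are hypothetical names introduced here, and the order of the two checks is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a subsystem's power-management callbacks. */
struct pm_funcs_sketch {
	/* NULL when the hardware/firmware does not implement the feature. */
	int (*get_power_profile_mode)(char *buf);
};

/*
 * Sketch of the convention from the patch description:
 *   -EOPNOTSUPP : the hardware does not support the feature
 *   -EINVAL     : the caller passed a bad buffer
 * Checking support before the buffer is an assumption of this sketch.
 */
static int get_power_profile_mode_sketch(const struct pm_funcs_sketch *funcs,
					 char *buf)
{
	if (!funcs || !funcs->get_power_profile_mode)
		return -EOPNOTSUPP;
	if (!buf)
		return -EINVAL;
	return funcs->get_power_profile_mode(buf);
}

/* Hypothetical backend that "supports" the feature and writes an empty string. */
static int fake_get_mode(char *buf)
{
	buf[0] = '\0';
	return 0;
}
```

With this split, a caller can tell "do not expose the attribute on this device" (-EOPNOTSUPP) apart from "the request itself was malformed" (-EINVAL).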

RE: [PATCH] drm/amdgpu: Make sure to reserve BOs before adding or removing

2021-11-01 Thread Errabolu, Ramesh
Kent, the call to map has the same structure. Is it not possible to call kfd_mem_attach after BO's are reserved? Regards, Ramesh -Original Message- From: amd-gfx On Behalf Of Felix Kuehling Sent: Monday, November 1, 2021 3:18 PM To: Russell, Kent ; amd-gfx@lists.freedesktop.org Subject

Re: [PATCH] drm/amdgpu: Make sure to reserve BOs before adding or removing

2021-11-01 Thread Felix Kuehling
On 2021-11-01 at 4:14 p.m., Kent Russell wrote: > BOs need to be reserved before they are added or removed, so ensure that > they are reserved during kfd_mem_attach and kfd_mem_detach > > Signed-off-by: Kent Russell Reviewed-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd

[PATCH] drm/amdgpu: Make sure to reserve BOs before adding or removing

2021-11-01 Thread Kent Russell
BOs need to be reserved before they are added or removed, so ensure that they are reserved during kfd_mem_attach and kfd_mem_detach Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/

Re: [PATCH v2] drm/amdgpu/pm: Don't show pp_power_profile_mode for unsupported devices

2021-11-01 Thread Alex Deucher
On Mon, Nov 1, 2021 at 3:10 PM Mario Limonciello wrote: > > This command corresponding to this attribute was deprecated in the PMFW > for YC so don't show a non-functional attribute. > > Verify that the function has been implemented by the subsystem. > > Suggested-by: Alex Deucher > Signed-off-by

[PATCH v2] drm/amdgpu/pm: Don't show pp_power_profile_mode for unsupported devices

2021-11-01 Thread Mario Limonciello
This command corresponding to this attribute was deprecated in the PMFW for YC so don't show a non-functional attribute. Verify that the function has been implemented by the subsystem. Suggested-by: Alex Deucher Signed-off-by: Mario Limonciello --- Changes from v1->v2: * Change smu_get_power_p

[PATCH 21.40 2/2] drm/amdgpu: Temporary exclude GTT BOs from KFD memory attachment

2021-11-01 Thread Kent Russell
kfd_mem_attach is not sharing GTT BOs correctly between processes. This can cause kernel panics due to NULL pointers depending on which processes hold references to specific GTT BOs. Temporarily disable this functionality for now while a proper code refactor and proper fix is done. Note that since

[PATCH 21.40 1/2] drm/amdgpu: Make sure to reserve BOs before adding or removing

2021-11-01 Thread Kent Russell
BOs need to be reserved before they are added or removed, so ensure that they are reserved during kfd_mem_attach and kfd_mem_detach Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/

[PATCH] drm/amdkfd: update gfx target version for Renoir

2021-11-01 Thread Graham Sider
Previously Renoir compiler gfx target version was forced to Raven. Update driver side for completeness. Signed-off-by: Graham Sider --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/g

Re: Lockdep spalt on killing a processes

2021-11-01 Thread Andrey Grodzovsky
Pushed to drm-misc-next Andrey On 2021-10-29 3:07 a.m., Christian König wrote: Attached a patch. Give it a try please, I tested it on my side and tried to generate the right conditions to trigger this code path by repeatedly submitting commands while issuing GPU reset to stop the scheduler

Re: [PATCH 1/3] drm/amdkfd: Fix SVM_ATTR_PREFERRED_LOC

2021-11-01 Thread philip yang
On 2021-11-01 8:55 a.m., Felix Kuehling wrote: The preferred location should be used as the migration destination whenever it is accessible by the faulting GPU. System memory is always accessible. Peer memory is accessible if it's in the same XGMI hive.
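The accessibility rule quoted above can be sketched as a small standalone function. This is an illustration only, not the kfd_svm.c implementation: `struct gpu_sketch`, `LOC_SYSMEM`, `preferred_loc_accessible` and `choose_migration_dest` are hypothetical names, and treating 0 as "system memory", 0 as "no hive", and a GPU's own memory as accessible to itself are assumptions of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOC_SYSMEM 0u /* assumption: location 0 means system memory */

/* Hypothetical, simplified GPU descriptor. */
struct gpu_sketch {
	uint32_t id;           /* non-zero GPU identifier */
	uint32_t xgmi_hive_id; /* assumption: 0 means "not in any XGMI hive" */
};

/* Is the preferred location accessible by the faulting GPU? */
static bool preferred_loc_accessible(const struct gpu_sketch *faulting,
				     uint32_t preferred_loc,
				     const struct gpu_sketch *preferred_gpu)
{
	if (preferred_loc == LOC_SYSMEM)
		return true; /* system memory is always accessible */
	if (!preferred_gpu)
		return false;
	if (preferred_gpu->id == faulting->id)
		return true; /* assumption: a GPU can access its own memory */
	/* Peer memory is accessible only within the same XGMI hive. */
	return faulting->xgmi_hive_id != 0 &&
	       faulting->xgmi_hive_id == preferred_gpu->xgmi_hive_id;
}

/* Use the preferred location when accessible, else a caller-supplied fallback. */
static uint32_t choose_migration_dest(const struct gpu_sketch *faulting,
				      uint32_t preferred_loc,
				      const struct gpu_sketch *preferred_gpu,
				      uint32_t fallback_loc)
{
	if (preferred_loc_accessible(faulting, preferred_loc, preferred_gpu))
		return preferred_loc;
	return fallback_loc;
}
```

What the real driver falls back to when the preferred location is not accessible is not stated in the message, so the sketch leaves that choice to the caller.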

Re: [PATCH] drm/amd/display: Fix error handling on waiting for completion

2021-11-01 Thread Wang, Chao-kai (Stylon)
[AMD Official Use Only] Hi Michel, The problem with -ERESTARTSYS is that the same half-baked atomic state, with the modifications we made in the interrupted atomic check, is reused in the next retry and fails the atomic check. What we expect in the next retry is the original atomic state. I am goin

RE: [PATCH 00/14] DC patches for Nov 1, 2021

2021-11-01 Thread Wheeler, Daniel
[Public] Hi all, This week this patchset was tested on the following systems: HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 60hz, 4k 60hz (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA) AMD

Re: [PATCH] drm/amdgpu/pm: Don't show pp_power_profile_mode for YC and later APUs

2021-11-01 Thread Alex Deucher
Right. Alex On Sun, Oct 31, 2021 at 11:41 PM Lazar, Lijo wrote: > > There are two subsystems - powerplay and swsmu. The subsystem implementation > details are hidden from amdgpu_pm funcs. I thought Alex is talking about that. > > Thanks, > Lijo > > From: Limonci

[PATCH 3/3] drm/amdkfd: Handle incomplete migration to system memory

2021-11-01 Thread Felix Kuehling
If some pages fail to migrate to system memory, don't update prange->actual_loc = 0. This prevents endless CPU page faults after partial migration failures due to contested page locks. Migration to RAM must be complete during migrations from VRAM to VRAM and during evictions. Implement retry and f

[PATCH 2/3] drm/amdkfd: Avoid thrashing of stack and heap

2021-11-01 Thread Felix Kuehling
Stack and heap pages tend to be shared by many small allocations. Concurrent access by CPU and GPU is therefore likely, which can lead to thrashing. Avoid this by setting the preferred location to system memory. Signed-off-by: Felix Kuehling Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/am

[PATCH 1/3] drm/amdkfd: Fix SVM_ATTR_PREFERRED_LOC

2021-11-01 Thread Felix Kuehling
The preferred location should be used as the migration destination whenever it is accessible by the faulting GPU. System memory is always accessible. Peer memory is accessible if it's in the same XGMI hive. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 14 +++--

Re: [PATCH 3/6] drm/etnaviv: stop getting the excl fence separately here

2021-11-01 Thread Lucas Stach
On Thursday, 28.10.2021 at 15:26 +0200, Christian König wrote: > Just grab all fences in one go. > > Signed-off-by: Christian König Reviewed-by: Lucas Stach > --- > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/dr