Instead of blocking the various unsupported MP1 states at the upper level,
defer the decision and skip such MP1 state handling in the ASIC-specific code.
Signed-off-by: Lijo Lazar
Signed-off-by: Guchun Chen
---
drivers/gpu/drm/amd/pm/amdgpu_dpm.c| 3 ---
.../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
Typo in the title:
s/dispaly/display
- Joshie 🐸✨
On 3/22/21 8:11 AM, Lang Yu wrote:
In amdgpu reset, while dm.dc_lock is held by dm_suspend,
handle_hpd_rx_irq tries to acquire it. Deadlock occurred!
Deadlock log:
[ 104.528304] amdgpu :03:00.0: amdgpu: GPU reset begin!
[ 104.640084] =
[AMD Official Use Only - Internal Distribution Only]
-Original Message-
From: Grodzovsky, Andrey
Sent: Monday, March 22, 2021 11:01 PM
To: Yu, Lang ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Huang, Ray
Subject: Re: [PATCH] drm/amd/dispaly: fix deadlock issue in amdgpu r
Convert IRQ-based prints from DRM_DEBUG_DRIVER to
DRM_DEBUG_DP, as the latter is not used in drm/amd
prior to this patch and since IRQ-based prints
drown out the rest of the driver's
DRM_DEBUG_DRIVER messages.
Cc: Harry Wentland
Cc: Alex Deucher
Signed-off-by: Luben Tuikov
---
.../gpu/drm/amd/
[AMD Official Use Only - Internal Distribution Only]
Hi,
The updated patch has been merged and is available with commit ID
"ef5c594461650de0a18aa0bfd240189991790d7e".
I somehow missed mailing the updated version; the updated patch is attached.
Please review and let me know if any changes requ
On Mon 22-03-21 14:05:48, Matthew Wilcox wrote:
> On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> > On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > > Am 20.03.21 um 14:17 schrieb Daniel Vetter:
> > > > On Sat, Mar 20, 2021 at 10:04 AM Christian König
> > > > w
Am 22.03.21 um 18:02 schrieb Daniel Vetter:
On Mon, Mar 22, 2021 at 5:06 PM Michal Hocko wrote:
On Mon 22-03-21 14:05:48, Matthew Wilcox wrote:
On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
Am 20.03.21 um 14:17
On Thu, Mar 18, 2021 at 8:19 AM Harvey wrote:
>
> Alex,
>
> I waited for kernel 5.11.7 to hit our repos yesterday evening and tested
> again:
>
> 1. The suspend issue is gone - suspend and resume now work as expected.
>
> 2. System hibernation seems to be a different beast - still freezing
You ne
On Mon, Mar 22, 2021 at 5:07 PM Felix Kuehling wrote:
>
> Am 2021-03-22 um 10:15 a.m. schrieb Daniel Vetter:
> > On Mon, Mar 22, 2021 at 06:58:16AM -0400, Felix Kuehling wrote:
> >> Since the last patch series I sent on Jan 6 a lot has changed. Patches 1-33
> >> are the cleaned up, rebased on amd-
On Mon, Mar 22, 2021 at 5:06 PM Michal Hocko wrote:
>
> On Mon 22-03-21 14:05:48, Matthew Wilcox wrote:
> > On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> > > On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > > > Am 20.03.21 um 14:17 schrieb Daniel Vetter:
> >
Still freezing on 5.11.8 and 5.12-rc4.
Log on 5.12-rc4 looks a little different:
Mär 22 17:40:26 obelix systemd[1]: Reached target Sleep.
Mär 22 17:40:26 obelix systemd[1]: Starting Hibernate...
Mär 22 17:40:26 obelix kernel: PM: hibernation: hibernation entry
Mär 22 17:40:26 obelix systemd-sle
Am 2021-03-22 um 10:15 a.m. schrieb Daniel Vetter:
> On Mon, Mar 22, 2021 at 06:58:16AM -0400, Felix Kuehling wrote:
>> Since the last patch series I sent on Jan 6 a lot has changed. Patches 1-33
>> are the cleaned up, rebased on amd-staging-drm-next 5.11 version from about
>> a week ago. The remai
"friendly ping"
On Wed, Mar 10, 2021 at 11:14 AM Mark Yacoub
wrote:
> From: Mark Yacoub
>
> On initializing the framebuffer, call drm_any_plane_has_format to do a
> check if the modifier is supported. drm_any_plane_has_format calls
> dm_plane_format_mod_supported which is extended to validate t
On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > Am 20.03.21 um 14:17 schrieb Daniel Vetter:
> > > On Sat, Mar 20, 2021 at 10:04 AM Christian König
> > > wrote:
> > > > Am 19.03.21 um 20:06 schrieb Daniel Vetter:
On 3/22/21 4:54 AM, Arnd Bergmann wrote:
> From: Arnd Bergmann
>
> clang points out that the %hu format string does not match the type
> of the variables here:
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type
> 'unsigned short' but the argument has type 'unsigne
On 3/22/21 09:04, Chen, Guchun wrote:
> [AMD Public Use]
>
> Thanks for your patch, Silva. The issue has been fixed by " a5c6007e20e1
> drm/amd/display: fix modprobe failure on vega series".
Great. :)
Good to know this is already fixed.
Thanks!
--
Gustavo
On Fri, Mar 19, 2021 at 10:03:12PM +0100, Mario Kleiner wrote:
> Hi,
>
> this patch series adds the fourcc's for 16 bit fixed point unorm
> framebuffers to the core, and then an implementation for AMD gpu's
> with DisplayCore.
>
> This is intended to allow for pageflipping to, and direct scanout
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: Alex Deucher
From: amd-gfx on behalf of Evan Quan
Sent: Monday, March 22, 2021 2:11 AM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan
Subject: [PATCH] drm/amd/pm: drop redundant and unneeded
On 15/03/2021 05:23, Zhang, Jack (Jian) wrote:
[AMD Public Use]
Hi, Rob/Tomeu/Steven,
Would you please help to review this patch for panfrost driver?
Thanks,
Jack Zhang
-Original Message-
From: Jack Zhang
Sent: Monday, March 15, 2021 1:21 PM
To: dri-de...@lists.freedesktop.org; amd-g
On 2021-03-22 4:11 a.m., Lang Yu wrote:
In amdgpu reset, while dm.dc_lock is held by dm_suspend,
handle_hpd_rx_irq tries to acquire it. Deadlock occurred!
Deadlock log:
[ 104.528304] amdgpu :03:00.0: amdgpu: GPU reset begin!
[ 104.640084] =
On Sun, Mar 21, 2021 at 8:12 PM Evan Benn wrote:
>
> On Sat, Mar 20, 2021 at 8:36 AM Alex Deucher wrote:
> >
> > On Fri, Mar 19, 2021 at 5:31 PM Evan Benn wrote:
> > >
> > > On Sat, 20 Mar 2021 at 02:10, Harry Wentland
> > > wrote:
> > > > On 2021-03-19 10:22 a.m., Alex Deucher wrote:
> > > >
On Mon, Mar 22, 2021 at 02:05:48PM +, Matthew Wilcox wrote:
> On Mon, Mar 22, 2021 at 02:49:27PM +0100, Daniel Vetter wrote:
> > On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> > > Am 20.03.21 um 14:17 schrieb Daniel Vetter:
> > > > On Sat, Mar 20, 2021 at 10:04 AM Christian
On Mon, Mar 22, 2021 at 06:58:16AM -0400, Felix Kuehling wrote:
> Since the last patch series I sent on Jan 6 a lot has changed. Patches 1-33
> are the cleaned up, rebased on amd-staging-drm-next 5.11 version from about
> a week ago. The remaining 11 patches are current work-in-progress with
> furt
[AMD Public Use]
Thanks for your patch, Silva. The issue has been fixed by " a5c6007e20e1
drm/amd/display: fix modprobe failure on vega series".
Regards,
Guchun
-Original Message-
From: amd-gfx On Behalf Of Gustavo A.
R. Silva
Sent: Monday, March 22, 2021 8:51 PM
To: Lee Jones ; Wentl
[AMD Official Use Only - Internal Distribution Only]
Hello all,
Can someone help review the patches below? We verified them with the firmware
team and want to check them in together with the psp firmware
Regards,
Oak
On 2021-03-12, 4:24 PM, "Zeng, Oak" wrote:
This new interface passes both virtual and p
On Sat, Mar 20, 2021 at 3:52 AM Randy Dunlap
wrote:
>
>
>
> On Fri, 19 Mar 2021, Bhaskar Chowdhury wrote:
>
> > s/traing/training/
> >
> > ...Plus the entire sentence construction for better readability.
> >
> > Signed-off-by: Bhaskar Chowdhury
> > ---
> > Changes from V1:
> > Alex and Randy's s
The wrong sizeof values are currently being used as arguments to
kzalloc().
Fix this by using the right arguments *dceip and *vbios,
correspondingly.
Addresses-Coverity-ID: 1502901 ("Wrong sizeof argument")
Fixes: fca1e079055e ("drm/amd/display/dc/calcs/dce_calcs: Remove some large
variables fro
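The bug class named above can be illustrated with a small userspace sketch. It assumes the original mistake was of the sizeof(pointer) kind, which the snippet does not show; struct demo_dceip and the demo_* helpers are hypothetical stand-ins, and calloc() takes the place of kzalloc(). This is not the dce_calcs code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large parameter struct (not a real driver type). */
struct demo_dceip {
    unsigned int version;
    unsigned int limits[64];
};

/* Buggy pattern: sizeof(p) is the size of the pointer, not of the struct. */
size_t demo_wrong_size(void)
{
    struct demo_dceip *p = NULL;

    return sizeof(p);
}

/* Fixed pattern: sizeof(*p) is the size of the object p points to. */
size_t demo_right_size(void)
{
    struct demo_dceip *p = NULL;

    return sizeof(*p);
}

/* Zeroed allocation sized from the pointed-to object; caller frees it. */
struct demo_dceip *demo_alloc(void)
{
    struct demo_dceip *p = calloc(1, sizeof(*p));

    return p;
}
```

Sizing from the dereferenced pointer keeps the allocation correct even if the variable's type changes later.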
On Sun, Mar 21, 2021 at 03:18:28PM +0100, Christian König wrote:
> Am 20.03.21 um 14:17 schrieb Daniel Vetter:
> > On Sat, Mar 20, 2021 at 10:04 AM Christian König
> > wrote:
> > > Am 19.03.21 um 20:06 schrieb Daniel Vetter:
> > > > On Fri, Mar 19, 2021 at 07:53:48PM +0100, Christian König wrote:
On Mon, Mar 22, 2021 at 12:22 PM Christian König
wrote:
>
> Don't print a warning when we fail to allocate a page for swapping things out.
>
> v2: only stop the warning
>
> Signed-off-by: Christian König
Reviewed-by: Daniel Vetter
It is kinda surprising that page allocator warns here even thou
Am 22.03.21 um 11:58 schrieb Felix Kuehling:
From: Philip Yang
A fence slot was not reserved before using sdma to update the page table, which
causes the kernel BUG backtrace below when handling a vm retry fault while the
application is exiting.
[ 133.048143] kernel BUG at
/home/yangp/git/compute_staging/kernel/drivers
[AMD Public Use]
Hi all,
This week this patchset was tested on a HP Envy 360, with Ryzen 5 4500U, on the
following display types (via usb-c to dp/dvi/hdmi/vga):
4k 60hz, 1440p 144hz, 1680*1050 60hz, internal eDP 1080p 60hz
Tested on a Sapphire Pulse RX5700XT on the following display types (via D
On Fri, 19 Mar 2021, Daniel Vetter wrote:
> On Fri, Mar 19, 2021 at 08:24:07AM +, Lee Jones wrote:
> > On Thu, 18 Mar 2021, Daniel Vetter wrote:
> >
> > > On Wed, Mar 17, 2021 at 9:32 PM Daniel Vetter wrote:
> > > >
> > > > On Wed, Mar 17, 2021 at 9:17 AM Lee Jones wrote:
> > > > >
> > > >
On Mon, 22 Mar 2021, Guchun Chen wrote:
> Fixes: d88b34caee83 ("Remove some large variables from the stack")
>
> [ 41.232097] Call Trace:
> [ 41.232105] kvasprintf+0x66/0xd0
> [ 41.232122] kasprintf+0x49/0x70
> [ 41.232136] __drm_crtc_init_with_planes+0x2e1/0x340 [drm]
> [ 41.232219]
amdgpu_hdp.h has been included at line 91, so remove
the duplicate include.
Signed-off-by: Wan Jiabing
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 49267eb64302..68836c2
From: wengjianfeng
change 'addres' to 'address'
Signed-off-by: wengjianfeng
---
drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/smumgr/smu7_smumgr.c
b/drivers/gpu/drm/amd/pm/powerpla
[AMD Public Use]
Hi Christian,
I will conduct one stress test for this tomorrow. Would you mind waiting for my
ack before submitting?
Regards,
Guchun
-Original Message-
From: Christian König
Sent: Monday, March 22, 2021 8:41 PM
To: amd-gfx@lists.freedesktop.org
Cc: Chen, Guchun ; Das
Now that we found the underlying problem we can re-apply this patch.
This reverts commit 867fee7f8821ff42e7308088cf0c3450ac49c17c.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 55 +-
1 file changed, 18 insertions(+), 37 deletions(-)
diff -
Am 22.03.21 um 13:02 schrieb Wan Jiabing:
amdgpu_hdp.h has been included at line 91, so remove
the duplicate include.
Signed-off-by: Wan Jiabing
Acked-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amd
Am 22.03.21 um 12:54 schrieb Arnd Bergmann:
From: Arnd Bergmann
clang points out that the %hu format string does not match the type
of the variables here:
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type
'unsigned short' but the argument has type 'unsigned int' [-
From: Arnd Bergmann
clang points out that the %hu format string does not match the type
of the variables here:
drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type
'unsigned short' but the argument has type 'unsigned int' [-Wformat]
ver
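A minimal userspace sketch of the mismatch clang reports, assuming the version fields are held in unsigned int as the warning says. The demo_* function names are hypothetical, and using %u is only one way to make the format string and arguments agree; the snippet does not show which fix the patch chose.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats two version fields stored as unsigned int with the matching
 * %u conversion, so the format string and argument types agree. */
int demo_format_version(char *buf, size_t len,
                        unsigned int major, unsigned int minor)
{
    return snprintf(buf, len, "%u.%u", major, minor);
}

/* Shows what a 16-bit conversion would display: the value is explicitly
 * narrowed to unsigned short first, so %hu matches its argument. Where
 * unsigned short is 16 bits, 70000 is displayed as 4464. */
int demo_format_narrow(char *buf, size_t len, unsigned int value)
{
    return snprintf(buf, len, "%hu", (unsigned short)value);
}
```

The second helper is there only to show why the narrower specifier is a poor fit for values that can exceed 16 bits.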
Don't print a warning when we fail to allocate a page for swapping things out.
v2: only stop the warning
Signed-off-by: Christian König
---
drivers/gpu/drm/ttm/ttm_tt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.
svm_range_split_by_granularity always added the parent range and only
the parent range to the update list for the caller to add it to the
deferred work list. So just do that in the caller unconditionally and
eliminate the update_list parameter.
Split the range so that the original prange is always
This saves callers from looking up the pdd with a linear search later.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 8 +++-
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 10 -
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 51 +++-
3 files change
This can happen when system memory pages were never allocated. Skip them
during the migration. 0-initialize the BO.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 50 ++--
1 file changed, 38 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/
Mapping without validation is broken. Also removed saving the pages from
the last migration. They may be invalidated without an MMU notifier to
catch it, so let the next proper validation take care of it.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 14 ---
This fixes potential race conditions between any code that validates and
maps SVM ranges and MMU notifiers. The whole sequence is encapsulated in
svm_range_validate_and_map. The page_addr and hmm_range structures are
not useful outside that function, so they were removed from
struct svm_range.
Val
Don't dma_unmap in unmap_from_gpu. The dma_addr arrays are protected
by the migrate_mutex, which we cannot hold when unmapping in MMU
notifiers.
Instead dma_unmap and free dma_addr arrays whenever the pages_array
is invalidated: when migrating to VRAM and when re-validating RAM.
Freeing dma_addr
This allows validation of child ranges, so the GPU page fault handler
can be more light-weight.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 8 +
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 40 +---
2 files changed, 37 insertions(+), 11 del
If a range is prefetched to a GPU while its actual location is another GPU, or
a GPU retry fault restores pages by migrating a range whose actual location is
a GPU, then migrate from one GPU to the other.
Use system memory as a bridge because the sdma engine may not be able to access
another GPU's vram; use sdma of source gpu
Take the svm_bo_list spin lock when iterating over the range list during
eviction.
Change-Id: I979d959e06c32e114cea8d151933b8ee7455627e
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 19 +--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/dr
From: Philip Yang
Add the SVMAPISupported property to HSA_CAPABILITY; the value matches
HSA_CAPABILITY as defined in the Thunk spec:
SVMAPISupported: it will not be supported on older kernels that don't
have HMM or on systems with GFXv8 or older GPUs without support for
48-bit virtual addresses.
CoherentH
Destroy SVM-related mutexes correctly.
Change-Id: I85da30b1b0dce72433e6d3b507cb0b55b83b433c
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
Restore can appear to fail if the svms->evicted counter changes before
the function can acquire the necessary locks. Re-read the counter after
acquiring the lock to minimize the chances of having to reschedule the
worker.
Change-Id: I236b912bddf106583be264abde2f6bd1a5d5a083
Signed-off-by: Felix Ku
With xnack on, add a validate timestamp in order to handle GPU vm faults
from multiple GPUs.
If a GPU retry fault needs to migrate the range to the best restore location,
use the range validate timestamp to record the system timestamp after the range
is restored to update the GPU page table.
Because multiple pages of sam
There are several race conditions with XNACK enabled. For now just some
FIXME comments with ideas how to fix it.
Change-Id: If0abab6dcb8f4e95c9d8820f6c569263eda29a89
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 5 +
drivers/gpu/drm/amd/amdkfd/kfd_svm.c |
With xnack on, the GPU vm fault handler decides the best restore location,
then migrates the range to the best restore location and updates the GPU mapping
to recover from the GPU vm fault.
Signed-off-by: Philip Yang
Signed-off-by: Alex Sierra
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_migra
From: Alex Sierra
This flag is useful when deciding how to handle CPU page table
invalidation: it selects between queue eviction and page fault.
Signed-off-by: Alex Sierra
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 4 +++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 36 +++
From: Alex Sierra
[why]
As part of the SVM functionality, the eviction mechanism used for
SVM_BOs is different. This mechanism uses one eviction fence per prange,
instead of one fence per kfd_process.
[how]
A svm_bo reference to amdgpu_amdkfd_fence to allow differentiate between
SVM_BO or regula
From: Alex Sierra
Xnack retries are used for page fault recovery. Some AMD chip
families support continuous retry while page table entries are invalid.
The driver must handle the page fault interrupt and fill in a valid entry
for the GPU to continue.
This ioctl allows enabling/disabling XNACK r
If svm range prefetch location is not zero, use TTM to alloc
amdgpu_bo vram nodes to validate svm range, then map vram nodes to GPUs.
Use offset to sub allocate from the same amdgpu_bo to handle overlap
vram range while adding new range or unmapping range.
svm_bo has ref count to trace the shared
From: Alex Sierra
Add CREATE_SVM_BO define bit for SVM BOs.
Another define flag was moved to concentrate these
KFD type flags in one include file.
Signed-off-by: Alex Sierra
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 7 ++-
drivers/gpu/drm/amd/amd
From: Alex Sierra
[why]
To support svm bo eviction mechanism.
[how]
If the BO created has AMDGPU_AMDKFD_CREATE_SVM_BO flag set,
enable_signal callback will be called inside amdgpu_evict_flags.
This also causes gutting of the BO by removing all placements,
so that TTM won't actually do an eviction
From: Alex Sierra
Add support for svm_bo fence eviction to the
amdgpu_amdkfd_fence.enable_signal callback.
Signed-off-by: Alex Sierra
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 11 ---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a
GPU page tables are invalidated by unmapping prange directly at
the mmu notifier, when page fault retry is enabled through
amdgpu_noretry global parameter. The restore page table is
performed at the page fault handler.
If xnack is on, we update GPU mappings after migration to avoid
unnecessary GPU
When an svm range is registered with the same address and size but its
preferred_location is changed from CPU to GPU or from GPU to CPU, trigger
migration of the svm range from ram to vram or from vram to ram.
If svm range prefetch location is GPU with flags
KFD_IOCTL_SVM_FLAG_HOST_ACCESS, validate the svm range on ram f
From: Philip Yang
Use sdma linear copy to migrate data between ram and vram. The sdma
linear copy command uses kernel buffer function queue to access system
memory through gart table.
Use reserved gart table window 0 to map system page address, and vram
page address is direct mapping. Use the sa
From: Philip Yang
Add svm (shared virtual memory) ioctl data structure and API definition.
The svm ioctl API is designed to be extensible in the future. All
operations are provided by a single IOCTL to preserve ioctl number
space. The arguments structure ends with a variable size array of
attrib
If a CPU page fault happens, the HMM pgmap_ops callback migrate_to_ram starts
migrating memory from vram to ram in steps:
1. migrate_vma_pages gets the vram pages and notifies HMM to invalidate the
pages; the HMM interval notifier callback evicts the process queues
2. Allocate system memory pages
3. Use svm copy memory t
Page table restore implementation in SVM API. This is called from
the fault handler in amdgpu_vm to update page tables through
the page fault retry IH.
Signed-off-by: Alex Sierra
Signed-off-by: Philip Yang
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 69 +++
From: Alex Sierra
By default this timestamp is a 32-bit counter. It
overflows in around 10 minutes.
Change-Id: I7c46604b0272dcfd1ce24351437c16fe53dca0ab
Signed-off-by: Alex Sierra
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 1 +
1 file changed, 1 insertion(+)
From: Alex Sierra
Use SVM API to restore page tables when retry fault and
compute context are enabled.
Signed-off-by: Alex Sierra
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 20 +++-
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/
From: Philip Yang
Move the HMM get pages function from amdgpu_ttm to amdgpu_mn. This
common function will be used by new svm APIs.
Signed-off-by: Philip Yang
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 83 +
drivers/gpu/drm/amd/amdgp
From: Philip Yang
A fence slot was not reserved before using sdma to update the page table, which
causes the kernel BUG backtrace below when handling a vm retry fault while the
application is exiting.
[ 133.048143] kernel BUG at
/home/yangp/git/compute_staging/kernel/drivers/dma-buf/dma-resv.c:281!
[ 133.048487] Workqueue
svm_bo eviction mechanism is different from regular BOs.
Every SVM_BO created contains one eviction fence and one
worker item for eviction process.
SVM_BOs can be attached to one or more pranges.
For SVM_BO eviction mechanism, TTM will start to call
enable_signal callback for every SVM_BO until VRA
From: Philip Yang
amdgpu_gmc_get_vm_pte uses the bo_va->is_xgmi same-hive information to set
pte flags to update the GPU mapping. Add a local structure variable bo_va,
update bo_va.is_xgmi, and pass it to mapping->bo_va while mapping to GPU.
Assuming xgmi pstate is hi after boot.
Signed-off-by: Philip Yan
Since the last patch series I sent on Jan 6 a lot has changed. Patches 1-33
are the cleaned up, rebased on amd-staging-drm-next 5.11 version from about
a week ago. The remaining 11 patches are current work-in-progress with
further cleanup and fixes.
MMU notifiers and CPU page faults now can split
From: Philip Yang
Get the intersection of attributes over all memory in the given
range
Signed-off-by: Philip Yang
Signed-off-by: Alex Sierra
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 164 +++
1 file changed, 164 insertions(+)
diff --gi
From: Philip Yang
When the application explicitly calls unmap, or unmap is called from mmput when
the application exits, the driver will receive an MMU_NOTIFY_UNMAP event to
remove the svm range from the process svms object tree and list first, then
unmap from GPUs (in the following patch).
Split the svm ranges to handle partial unmapp
From: Philip Yang
It will be used by kfd to map svm range to GPU, because svm range does
not have amdgpu_bo and bo_va, cannot use amdgpu_bo_update interface, use
amdgpu vm update interface directly.
Signed-off-by: Philip Yang
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu
The HMM interval notifier callback notifies that the CPU page table will be
updated; stop process queues if the updated address belongs to an svm range
registered in the process svms objects tree. Schedule restore work to
update the GPU page table using the new page addresses in the updated svm range.
The restore worker flushe
From: Philip Yang
Use HMM to get system memory pages address, which will be used to
map to GPUs or migrate to vram.
Signed-off-by: Philip Yang
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 103 ++-
drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 4 +
Use amdgpu_vm_bo_update_mapping to update GPU page table to map or unmap
svm range system memory pages address to GPUs.
Signed-off-by: Philip Yang
Signed-off-by: Alex Sierra
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 395 +--
drivers/gpu/dr
From: Philip Yang
Register vram memory as MEMORY_DEVICE_PRIVATE type resource, to
allocate vram backing pages for page migration.
Signed-off-by: Philip Yang
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 4 +
drivers/gpu/drm/amd/amdkfd/Kconfig | 1 +
From: Alex Sierra
svm range uses a gpu bitmap to store which GPUs the svm range maps to.
The application passes the driver a gpu id to specify the GPU, so a helper is
needed to convert the gpu id to a gpu bitmap idx.
Access through kfd_process_device pointers array from kfd_process.
Signed-off-by: Alex Sierra
Signed-off-by
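A rough standalone sketch of the conversion described above, assuming a linear search over a per-process array of device pointers. All demo_* names, the array size, and the 64-bit bitmap are illustrative assumptions, not the kfd structures or the helper the patch adds.

```c
#include <stdint.h>

#define DEMO_MAX_GPUS 64 /* hypothetical limit, chosen to fit a 64-bit bitmap */

/* Hypothetical per-device record holding the id the application passes in. */
struct demo_pdd {
    uint32_t gpu_id;
};

/* Hypothetical process record with a pointer array of device records. */
struct demo_process {
    struct demo_pdd *pdds[DEMO_MAX_GPUS];
    unsigned int n_pdds;
};

/* Returns the array index (bitmap position) for gpu_id, or -1 if unknown. */
int demo_gpuidx_from_gpuid(const struct demo_process *p, uint32_t gpu_id)
{
    unsigned int i;

    for (i = 0; i < p->n_pdds && i < DEMO_MAX_GPUS; i++)
        if (p->pdds[i] && p->pdds[i]->gpu_id == gpu_id)
            return (int)i;
    return -1;
}

/* Sets the bit for gpu_id in *bitmap; returns 0 on success, -1 if unknown. */
int demo_bitmap_set_gpu(uint64_t *bitmap, const struct demo_process *p,
                        uint32_t gpu_id)
{
    int idx = demo_gpuidx_from_gpuid(p, gpu_id);

    if (idx < 0)
        return -1;
    *bitmap |= UINT64_C(1) << idx;
    return 0;
}
```

Keeping the bitmap position equal to the array index means a set bit can be mapped straight back to the device record without another search.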
From: Philip Yang
svm range structure stores the range start address, size, attributes,
flags, prefetch location and gpu bitmap which indicates which GPU this
range maps to. Same virtual address is shared by CPU and GPUs.
Process has svm range list which uses both interval tree and list to
store
From: Alex Sierra
Remove per_device_list from kfd_process and replace it with a
kfd_process_device pointers array of MAX_GPU_INSTANCES size. This helps
to manage the kfd_process_devices bound to a specific kfd_process.
Also, functions used by kfd_chardev to iterate over the list were
removed, si
On Mon, 22 Mar 2021 at 11:34, Christian König
wrote:
>
> Hi Daniel,
>
> Am 22.03.21 um 10:38 schrieb Daniel Gomez:
> > On Fri, 19 Mar 2021 at 21:29, Felix Kuehling wrote:
> >> This caused a regression in kfdtest in a large-buffer stress test after
> >> memory allocation for user pages fails:
> >
Hi Daniel,
Am 22.03.21 um 10:38 schrieb Daniel Gomez:
On Fri, 19 Mar 2021 at 21:29, Felix Kuehling wrote:
This caused a regression in kfdtest in a large-buffer stress test after
memory allocation for user pages fails:
I'm sorry to hear that. BTW, I guess you meant amdgpu leak patch and
not th
On Fri, 19 Mar 2021 at 21:29, Felix Kuehling wrote:
>
> This caused a regression in kfdtest in a large-buffer stress test after
> memory allocation for user pages fails:
I'm sorry to hear that. BTW, I guess you meant amdgpu leak patch and
not this one.
Just some background for the mem leak patch
On 2021-03-20 1:31 a.m., R, Bindu wrote:
>
> The Update patch has been submitted.
Submitted where? Still can't see it.
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer
__
Fixes: d88b34caee83 ("Remove some large variables from the stack")
[ 41.232097] Call Trace:
[ 41.232105] kvasprintf+0x66/0xd0
[ 41.232122] kasprintf+0x49/0x70
[ 41.232136] __drm_crtc_init_with_planes+0x2e1/0x340 [drm]
[ 41.232219] ? create_object+0x263/0x3b0
[ 41.232231] drm_crtc_
- no functional changes
Signed-off-by: Tobias Jakobi
---
drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 4 ++--
drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_pp
Thanks,
Fixed as suggested and sent as v2.
- Tong
On Sun, Mar 21, 2021 at 9:26 AM Christian König
wrote:
>
>
>
> Am 20.03.21 um 21:10 schrieb Tong Zhang:
> > TTM_PL_VRAM may not be initialized at all when calling
> > radeon_bo_evict_vram(). We need to check before doing eviction.
> >
> > [2.1608
Here is the system crash log:
[ 1272.884438] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 1272.88] IP: [< (null)>] (null)
[ 1272.884447] PGD 825b09067 PUD 8267c8067 PMD 0
[ 1272.884452] Oops: 0010 [#1] SMP
[ 1272.884509] CPU: 13 PID: 3485 Comm: cat Kdump:
TTM_PL_VRAM may not be initialized at all when calling
radeon_bo_evict_vram(). We need to check before doing eviction.
[2.160837] BUG: kernel NULL pointer dereference, address: 0020
[2.161212] #PF: supervisor read access in kernel mode
[2.161490] #PF: error_code(0x) - not-
On Fri, 19 Mar 2021, Bhaskar Chowdhury wrote:
s/traing/training/
...Plus the entire sentence construction for better readability.
Signed-off-by: Bhaskar Chowdhury
---
Changes from V1:
Alex and Randy's suggestions incorporated.
drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 8
1 file ch
TTM_PL_VRAM may not be initialized at all when calling
radeon_bo_evict_vram(). We need to check before doing eviction.
[2.160837] BUG: kernel NULL pointer dereference, address: 0020
[2.161212] #PF: supervisor read access in kernel mode
[2.161490] #PF: error_code(0x) - not-
Fix the following coccicheck warnings:
./drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mpc.c:875:62-67: WARNING:
conversion to bool not needed here.
Reported-by: Abaci Robot
Signed-off-by: Jiapeng Chong
---
drivers/gpu/drm/amd/display/dc/dcn30/dcn30_mpc.c | 2 +-
1 file changed, 1 insertion(+), 1
On Sat, Mar 20, 2021 at 8:36 AM Alex Deucher wrote:
>
> On Fri, Mar 19, 2021 at 5:31 PM Evan Benn wrote:
> >
> > On Sat, 20 Mar 2021 at 02:10, Harry Wentland wrote:
> > > On 2021-03-19 10:22 a.m., Alex Deucher wrote:
> > > > On Fri, Mar 19, 2021 at 3:23 AM Evan Benn wrote:
> > > >>
> > > >> AMD
In amdgpu reset, while dm.dc_lock is held by dm_suspend,
handle_hpd_rx_irq tries to acquire it. Deadlock occurred!
Deadlock log:
[ 104.528304] amdgpu :03:00.0: amdgpu: GPU reset begin!
[ 104.640084] ==
[ 104.640092] WARNING: possible ci