Hoi Pok Wu wrote:
I did it because it is part of the todo list,
where the task is to remove the load/unload callbacks.
Only 2 drm_drivers still use them, that's why.
I thought my amdgpu could test radeonsi, but no; I sent it anyway.
regards,
wu
On Fri, Jun 7, 2024 at 3:51 AM Christian
On 07.06.24 at 16:43, Joshi, Mukul wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
-----Original Message-----
From: Koenig, Christian
Sent: Friday, June 7, 2024 3:26 AM
To: Joshi, Mukul ; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Bhardwaj, Rajneesh
; Yang, Philip
On 07.06.24 at 10:33, Bob Zhou wrote:
To avoid a null pointer dereference, check the return value and
conduct error handling.
That doesn't make much sense.
At this point the amdgpu_mes_ctx_get_offs_cpu_addr() shouldn't be able
to return NULL in the first place.
Regards,
Christian.
On 07.06.24 at 10:33, Bob Zhou wrote:
To fix a potential overflowed constant warning, change the variables to u32
for getting the return value of RREG32_SOC15().
Signed-off-by: Bob Zhou
Acked-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 2 +-
drivers/gpu/drm/amd
nto drm-misc-fixes to be
sure the patch makes it into 6.10.
Feel free to add Reviewed-by: Christian König .
Regards,
Christian.
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Arunpravin Paneer Selvam
Suggested-by: Christian König
---
driv
On 07.06.24 at 03:14, wu hoi pok wrote:
This patch removes the load callback from the kms_driver.
Following amdgpu closely, radeon_driver_load_kms and devm_drm_dev_alloc
are used. Most of the changes here are rdev->ddev to rdev_to_drm,
which maps to adev_to_drm in amdgpu. however this
On 06.06.24 at 21:22, Mukul Joshi wrote:
Make sure we do not overflow the memory limits set for a cgroup when doing
GTT memory allocations.
NAK, That's intentionally not done like that.
Please see the cgroup discussion on memory management on the public
mailing list.
Regards,
Christian.
. See the partitioning mode is
something which is fundamentally incompatible with SRIOV.
So this is not IP version specific at all.
Regards,
Christian.
Thanks,
Lijo
Cc: Alex Deucher
Cc: Christian König
Suggested-by: Christian König
Signed-off-by: Srinivasan Shanmugam
---
drivers/gpu/drm
make it mandatory to keep the runtime
pm reference would be if we pin the buffer into VRAM, and that's not
something we currently do.
v2: improve the commit message
Signed-off-by: Christian König
Reviewed-by: Alex Deucher
CC: sta...@vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
the code organization and maintainability. If in
the future the conditions for creating the compute partition sysfs
entries change, we would only need to update the
amdgpu_gfx_sysfs_compute_init function.
Cc: Alex Deucher
Cc: Christian König
Suggested-by: Christian König
Signed-off-by: Srinivasan
not recoverable in any way when VRAM is
lost.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 -
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 87 +
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 67 +---
drivers/gpu/drm/amd/amdgpu
We haven't used the functionality to pin BOs in a certain range at all
while the driver existed. Just nuke it.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 56 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 -
2 files changed, 5
Instead of having that in the amdgpu_bo_pin() function applied for all
pinned BOs.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c| 1 +
drivers/gpu/drm/amd
On 03.06.24 at 03:41, Yang Wang wrote:
Adding formatting string feature to improve function flexibility.
Signed-off-by: Yang Wang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 30 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 3 ++-
2 files changed, 24
ing.
With that done the patch is Reviewed-by: Christian König
Regards,
Christian.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 63 +++---
1 file changed, 33 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amd
On 05.06.24 at 15:20, Alex Deucher wrote:
On Wed, Jun 5, 2024 at 8:32 AM Christian König
wrote:
This reverts commit b8c415e3bf989be1b749409951debe6b36f5c78c and
commit 425285d39afddaf4a9dab36045b816af0cc3e400.
Taking a runtime pm reference for DMA-buf is actually completely
unnecessary
it is in VRAM the buffer gets migrated to
GTT before powering down.
The only use case which would make it mandatory to keep the runtime
pm reference would be if we pin the buffer into VRAM, and that's not
something we currently do.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu
You haven't addressed any of my comments on patch #1.
Regards,
Christian.
On 05.06.24 at 11:33, Wang, Yang(Kevin) wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
Ping...
Best Regards,
Kevin
-----Original Message-----
From: amd-gfx On Behalf Of Yang Wang
Sent: Monday, June
Hi guys,
just FYI: Alex published yesterday a bunch of new firmware files:
https://gitlab.freedesktop.org/drm/firmware/-/commits/amd-staging
One major issue which should be fixed by those is that page faults can
no longer overflow the IH ring buffer on APUs and older dGPUs.
Newer dGPU with
On 04.06.24 at 20:08, Felix Kuehling wrote:
On 2024-06-03 22:13, Al Viro wrote:
Using drm_gem_prime_handle_to_fd() to set dmabuf up and insert it into
descriptor table, only to have it looked up by file descriptor and
remove it from descriptor table is not just too convoluted - it's
racy;
On 04.06.24 at 17:58, Eric Huang wrote:
To fulfill the reset event description.
Suggested-by: Lijo Lazar
Signed-off-by: Eric Huang
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 1 +
drivers/gpu/drm/amd
ng...@amd.com
Cc: rodrigo.sique...@amd.com
Acked-by: Christian König
---
drivers/gpu/drm/amd/display/dc/dc.h | 1 +
.../drm/amd/display/dc/resource/dcn32/dcn32_resource.c| 8 +++-
.../drm/amd/display/dc/resource/dcn321/dcn321_resource.c | 8 +++-
3 fil
On 04.06.24 at 16:57, Arnd Bergmann wrote:
On Tue, Jun 4, 2024, at 16:22, Christian König wrote:
On 04.06.24 at 15:50, Alex Deucher wrote:
This can be called in atomic context. Should fix:
BUG: sleeping function called from invalid context at
include/linux/sched/mm.h:306
in_atomic(): 1
investigation.
With that done the patch is Reviewed-by: Christian König
Regards,
Christian.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 4 +---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 2 +-
3 files changed, 3 insertions(+), 5 deletions(-)
diff
This should prevent buffer moves when the threshold is reached during
CS.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 36 --
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 22 +
2 files changed, 29 insertions(+), 29 deletions
This adds support to enable a placement only when a certain threshold of
moved bytes is reached. It's a context flag which will be handled
together with TTM_PL_FLAG_DESIRED and TTM_PL_FLAG_FALLBACK.
Signed-off-by: Christian König
---
drivers/gpu/drm/ttm/ttm_bo.c | 5 ++---
drivers/gpu/drm
something is in its preferred
placement or not and also disable the handling on APUs.
Signed-off-by: Tvrtko Ursulin
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/
That should probably come last.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index
That is just a waste of time on APUs.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8d8c39be6129
compile tested.
While at it clean up the coding style.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 76 --
1 file changed, 48 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
b/drivers/gpu/drm/amd/amdgpu
Hi guys,
as already discussed on the mailing list Tvrtko and Friedrich stumbled
over a bunch of problems with the memory management. In particular, the
move rate limit didn't seem to work for VRAM|GTT BOs, causing a bunch
of additional and unnecessary overhead during CS.
This (not well tested)
On 04.06.24 at 15:50, Alex Deucher wrote:
This can be called in atomic context. Should fix:
BUG: sleeping function called from invalid context at
include/linux/sched/mm.h:306
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
preempt_count: 2, expected: 0
RCU
On 04.06.24 at 09:08, Bob Zhou wrote:
To fix a potential overflowed constant warning reported by Coverity,
modify the variables to uint32_t.
Signed-off-by: Bob Zhou
Acked-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/imu_v12_0.c | 7 ---
1 file changed, 4 insertions(+), 3
To: Liu, Shaoyun ; Christian König ; Li,
Yunxiang (Teddy) ; amd-gfx@lists.freedesktop.org; Deucher, Alexander
; Xiao, Hua
Subject: Re: [PATCH v2 03/10] drm/amdgpu: abort fence poll if reset is started
Hi Shaoyun,
yes my thinking goes into the same direction. The basic problem here is that we
On 03.06.24 at 13:46, Yiqing Yao wrote:
When flushing gpu tlb using kiq from gfxhub, kiq ring is always
local as xcc instance is selected for it. Thus using lower 18 bits
to access mmregs inside local xcc instead of full address used
when accessing regs outside of local xcc.
Remove redundant
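The "lower 18 bits" mentioned in the snippet above amounts to a plain mask. A minimal sketch in userspace C; the macro and function names are made up for illustration and are not taken from the amdgpu driver:

```c
#include <stdint.h>

/* Hypothetical names: number of offset bits kept for a local-XCC access. */
#define LOCAL_XCC_REG_BITS 18u
#define LOCAL_XCC_REG_MASK ((UINT32_C(1) << LOCAL_XCC_REG_BITS) - 1u)

/* Keep only the lower 18 bits of a register address. */
uint32_t local_xcc_reg_offset(uint32_t full_addr)
{
	return full_addr & LOCAL_XCC_REG_MASK;
}
```

Anything at or above bit 18 is dropped, so two full addresses that differ only in the upper bits map to the same local offset.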
On 03.06.24 at 13:52, Pierre-Eric Pelloux-Prayer wrote:
Hi Christian,
On 03/06/2024 at 11:58, Christian König wrote:
On 03.06.24 at 10:46, Pierre-Eric Pelloux-Prayer wrote:
These 2 trace events are tied to a specific VM so in order for them
to be useful for a tool we need to trace
On 03.06.24 at 10:53, Zhou, Bob wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Christian,
It fixes a potential Overflowed constant (INTEGER_OVERFLOW) warning reported by
Coverity.
You need to mention that in the commit message.
And I haven't checked the hardware docs,
On 31.05.24 at 08:52, Yang Wang wrote:
Adding formatting string feature to improve function flexibility.
Signed-off-by: Yang Wang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 30 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 3 ++-
2 files changed, 24
with the seq number. In this case the driver can get the
failure of the submission to MES in time and make its own decision on what
to do next. What do you think about this?
Regards
Shaoyun.liu
-----Original Message-----
From: amd-gfx On Behalf Of Christian
König
Sent: Wednesday, May 29
On 03.06.24 at 10:46, Pierre-Eric Pelloux-Prayer wrote:
These 2 trace events are tied to a specific VM so in order for them
to be useful for a tool we need to trace the amdgpu_vm as well.
The bo_va already contains the VM pointer the map/unmap operation
belongs to.
Signed-off-by:
On 03.06.24 at 07:59, Bob Zhou wrote:
The return value of RREG32_SOC15 is unsigned int, so modify variable to
unsigned.
And why is that an improvement?
Regards,
Christian.
Signed-off-by: Bob Zhou
---
drivers/gpu/drm/amd/amdgpu/imu_v12_0.c | 6 +++---
1 file changed, 3 insertions(+),
On 31.05.24 at 14:34, Lijo Lazar wrote:
Skip scheduling coredump when gpu reset is intentionally triggered
through debugfs.
Signed-off-by: Lijo Lazar
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers
20784781 10.00 37.00 89.67 22.00 12.33
patched 4227688 13.67 37.00 81.33 23.33 15.00
The disclaimer I have is that more runs would be needed to be more confident
about the results. And more games. And APU versus discrete.
Cc: Christian König
Cc: Fried
On 31.05.24 at 00:02, Felix Kuehling wrote:
On 2024-05-28 13:23, Yunxiang Li wrote:
These functions are missing the lock for reset domain.
Signed-off-by: Yunxiang Li
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
On 30.05.24 at 23:48, Yunxiang Li wrote:
These functions are missing the lock for reset domain.
Please separate the GART changes from the KFD changes. Apart from that
looks good to me.
Thanks,
Christian.
Signed-off-by: Yunxiang Li
---
v3: only bracket amdgpu_device_flush_hdp with the
On 30.05.24 at 05:50, Jesse Zhang wrote:
To fix the warning about unused value, comment out the variable use_static.
Commenting out variables with // will just get you another warning from
checkpatch.
Christian.
Signed-off-by: Jesse Zhang
---
On 30.05.24 at 05:48, Jesse Zhang wrote:
If the SVM migration GART memory copy or the DMA page mapping fails on
the first iteration, the variable i is still 0, and executing i-- will
overflow.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 3 ++-
1 file
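The `i--` overflow described in the snippet above comes from decrementing an unsigned counter that is still 0. A minimal sketch in userspace C; the function names are hypothetical, and the guarded loop is only one possible way to avoid the wrap, not necessarily what the truncated patch does:

```c
#include <stddef.h>
#include <stdint.h>

/* Naive step back: with an unsigned counter, 0 - 1 wraps to UINT64_MAX. */
uint64_t prev_index_naive(uint64_t i)
{
	i--;
	return i;
}

/*
 * Undo the first `mapped` entries after a failure. `while (i--)` tests i
 * before the decrement, so mapped == 0 runs zero iterations and never
 * indexes the array. Returns how many entries were undone.
 */
size_t unwind_mappings(int *state, size_t mapped)
{
	size_t undone = 0;
	size_t i = mapped;

	while (i--) {
		state[i] = 0;	/* stand-in for the real unmap call */
		undone++;
	}
	return undone;
}
```

If the very first mapping fails, `unwind_mappings(state, 0)` touches nothing, whereas using the result of `prev_index_naive(0)` as an array index would land far out of bounds.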
ybe better change the type of the local variable instead?
On the other hand feel free to add Reviewed-by: Christian König
to this one as well.
Regards,
Christian.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +-
1 file changed, 1 insertion(+), 1 delet
On 29.05.24 at 16:48, Li, Yunxiang (Teddy) wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
Yeah, I know. That's one of the reasons I've pointed out on the patch adding
that that this behavior is actually completely broken.
If you run into issues with the MES because of this
On 29.05.24 at 16:31, Li, Yunxiang (Teddy) wrote:
[Public]
The problem is that we don't force complete the non scheduler rings, e.g. MES,
KIQ etc...
Try to remove this check here from the loop in
amdgpu_device_pre_asic_reset():
if (!amdgpu_ring_sched_ready(ring))
On 29.05.24 at 15:44, Li, Yunxiang (Teddy) wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
I don't think trying to add some reset handling here makes sense in the first
place.
Part of the reset/recovery procedure is to signal all fences and that includes
the one we are
On 29.05.24 at 15:22, Li, Yunxiang (Teddy) wrote:
[Public]
It's perfectly possible that the reset has already started before we enter the
function.
Yeah, this could and does happen, but it just means we are back to the old behavior. I
guess I could use "can I take the read side lock?" to
On 02.05.24 at 23:41, Alex Deucher wrote:
It was an enablement vehicle for MES 11 and was never
productized. Remove it.
Signed-off-by: Alex Deucher
Acked-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/Makefile |1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
Acked-by: Christian König for the whole series.
On 06.05.24 at 20:45, Alex Deucher wrote:
Will be used to consolidate reg remap settings and fix HDP
flushes on systems with non-4K pages.
Reviewed-by: Felix Kuehling
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
Hi,
when the issue is easy to reproduce I suggest to bisect the changes
between 6.9 and 6.10-rc1.
On the other hand it's not unlikely that we have a known bug in -rc1
which will be fixed by -rc2.
Anyway added Leo to the mail thread since he is the one responsible for
the video decoding
On 28.05.24 at 19:23, Yunxiang Li wrote:
These functions are missing the lock for reset domain.
Signed-off-by: Yunxiang Li
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++--
On 28.05.24 at 19:23, Yunxiang Li wrote:
Which method is used to flush tlb does not depend on whether a reset is
in progress or not. We should skip flush altogether if the GPU will get
reset. So put both paths under reset_domain read lock.
Signed-off-by: Yunxiang Li
Reviewed-by: Christian
On 28.05.24 at 19:23, Yunxiang Li wrote:
When the amdgpu_gart_invalidate_tlb helper was introduced this part was left
out of the conversion. Avoid the code duplication here.
Signed-off-by: Yunxiang Li
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 5 +
1
On 28.05.24 at 19:23, Yunxiang Li wrote:
At this point the gart is not set up, there's no point to invalidate tlb
here and it could even be harmful.
Signed-off-by: Yunxiang Li
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 2 --
1 file changed, 2
Signed-off-by: Yunxiang Li
With the commit message improved the patch is Reviewed-by: Christian
König .
Regards,
Christian.
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 ---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 3 ---
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 ---
drivers/gpu/drm/
On 28.05.24 at 19:23, Yunxiang Li wrote:
is_hws_hang and is_resetting serve pretty much the same purpose and
they both duplicate the work of the reset_domain lock, so just check that
directly instead. This also eliminates a few bugs listed below and gets
rid of dqm->ops.pre_reset.
kfd_hws_hang did
a nice cleanup to me, but that is absolutely not my field
of expertise.
But feel free to add an Acked-by: Christian König
Regards,
Christian.
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 14
drivers/gpu/drm/amd/amdgpu
On 28.05.24 at 19:23, Yunxiang Li wrote:
If a reset is triggered, there's no point in waiting for the fence back
anymore, it just makes the reset code wait for a long time for the
reset_domain read lock to be dropped.
This also makes our reply to host FLR fast enough so the host doesn't
is duplicated.
On the other hand an extra check doesn't really hurt us.
So either way the patch is Reviewed-by: Christian König
Regards,
Christian.
reg_access_ctrl = &adev->gfx.rlc.reg_access_ctrl[xcc_id];
scratch_reg0 = (void __iomem *)adev->rmmio + 4 *
reg_access_ctrl->scr
On 27.05.24 at 22:19, Victor Skvortsov wrote:
flush_gpu_tlb may be called from another thread while
device_gpu_recover is running.
No, that would be illegal. Where do you see that?
Regards,
Christian.
Both of these threads access registers through the VF
RLCG interface during VF Full
Reviewed-by: Christian König for the entire
series.
Regards,
Christian.
On 13.05.24 at 22:25, Alex Deucher wrote:
Use correct ref/mask for different gfx ring pipe. Ported from
ZhenGuo's patch for gfx10.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
1
Acked-by: Christian König
Thanks,
Christian.
On 23.05.24 at 19:11, Jiang, Sonny wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
The patch is Reviewed-by: Sonny Jiang
Thanks,
Sonny
*From:* Dong
On 27.05.24 at 18:28, Asad Kamal wrote:
Add extra flag definition for ids_flag field to distinguish
between vf/pf/pt modes
v2: Updated kms driver minor version & removed pf check as default is 0
Signed-off-by: Asad Kamal
Reviewed-by: Lijo Lazar
Acked-by: Christian König
---
dri
On 23.05.24 at 21:48, Alex Deucher wrote:
It was an enablement vehicle for MES 11 and was never
productized. Remove it.
v2: drop additional checks in the GFX10 code.
v3: drop mes_api_def.h
Signed-off-by: Alex Deucher
Acked-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/Makefile
On 27.05.24 at 03:20, Zhouyi Zhou wrote:
In r100_cp_init_microcode, if rdev->family doesn't match any of the
if statements, fw_name will be NULL, which will cause
gcc (11.4.0 powerpc64le-linux-gnu) to complain:
In function ‘r100_cp_init_microcode’,
inlined from ‘r100_cp_init’ at
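The situation described above (a name pointer that stays NULL when no branch matches) can be sketched in userspace C as follows. The enum values, file names, and the early -EINVAL return are assumptions for illustration only; the snippet is truncated, so this is not claimed to be the fix the patch applies:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical chip families, not the real radeon family list. */
enum chip_family { FAMILY_A, FAMILY_B, FAMILY_UNKNOWN };

/* Build a firmware path; refuse to format when no family matched. */
int build_fw_path(enum chip_family family, char *buf, size_t len)
{
	const char *fw_name = NULL;

	if (family == FAMILY_A)
		fw_name = "fw_a.bin";
	else if (family == FAMILY_B)
		fw_name = "fw_b.bin";

	/* Without this check a NULL fw_name would reach the %s below. */
	if (!fw_name)
		return -EINVAL;

	snprintf(buf, len, "firmware/%s", fw_name);
	return 0;
}
```

With the check in place the compiler can see that the format call is only reached with a non-NULL string.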
and maintainability of the
code. It also increases the reusability of the attribute management
functions, allowing them to be used by multiple modules.
Cc: Lijo Lazar
Cc: Alex Deucher
Cc: Christian König
Suggested-by: Alex Deucher
Signed-off-by: Srinivasan Shanmugam
While at it you could
On 24.05.24 at 15:35, Li, Yunxiang (Teddy) wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
If that is true you could in theory lower the locked area of the existing lock,
but adding a new one is a strict no-go from my side.
I'll try this, right now I see two places where this
On 16.05.24 at 15:55, Alex Deucher wrote:
Convert a variable sized array from [1] to [].
Signed-off-by: Alex Deucher
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/include/atomfirmware.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd
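For readers unfamiliar with the `[1]` to `[]` conversion mentioned in the snippet above, here is a minimal userspace sketch of the difference. The struct and function names are hypothetical and not taken from atomfirmware.h:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Old style: a one-element array standing in for "n entries follow". */
struct table_old {
	uint32_t count;
	uint32_t entries[1];
};

/* C99 flexible array member: the array adds nothing to sizeof. */
struct table_new {
	uint32_t count;
	uint32_t entries[];
};

/* Allocate a table with room for n entries; returns NULL on failure. */
struct table_new *table_alloc(uint32_t n)
{
	struct table_new *t;

	t = calloc(1, sizeof(*t) + (size_t)n * sizeof(t->entries[0]));
	if (t)
		t->count = n;
	return t;
}
```

With `[]` the header size and the payload size are computed separately, and the declaration no longer claims the array has exactly one element, which is what tends to confuse bounds checking with the `[1]` form.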
On 16.05.24 at 17:05, Alex Deucher wrote:
Use current speed/width on devices which don't support
dynamic PCIe switching.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3289
Signed-off-by: Alex Deucher
Acked-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
On 23.05.24 at 17:35, Li, Yunxiang (Teddy) wrote:
[Public]
Here it is taking a different lock than the reset_domain->sem. It is a separate
reset_domain->gpu_sem that is only locked when we will actually do a reset; it is not
taken in the skip_hw_reset path.
Exactly that is what you should *not*
On 23.05.24 at 13:36, Li, Yunxiang (Teddy) wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
+void amdgpu_lock_hw_access(struct amdgpu_device *adev);
+void amdgpu_unlock_hw_access(struct amdgpu_device *adev);
+int amdgpu_begin_hw_access(struct amdgpu_device *adev);
+void
On 23.05.24 at 11:16, Jesse Zhang wrote:
The pointer parent may be NULLed by the function amdgpu_vm_pt_parent.
To make the code more robust, check the pointer parent.
Signed-off-by: Jesse Zhang
Suggested-by: Christian König
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu
On 23.05.24 at 10:07, Jesse Zhang wrote:
The pointer parent may be NULLed by the function amdgpu_vm_pt_parent.
To make the code more robust, check the pointer parent.
V2: When parent is NULL here we should
probably call BUG() instead. (Christian)
Signed-off-by: Jesse Zhang
---
On 23.05.24 at 08:13, Jesse Zhang wrote:
The pointer parent may be NULLed by the function amdgpu_vm_pt_parent.
To make the code more robust, check the pointer parent.
No that doesn't make any sense.
When parent is NULL here we should probably call BUG() instead.
Regards,
Christian.
On 22.05.24 at 19:27, Yunxiang Li wrote:
Random accesses to the GPU while it is not re-initialized can lead to a
bad time. So add a rwsem to prevent such accesses. Normal accesses will
now take the read lock for shared GPU access, reset takes the write lock
for exclusive GPU access.
Care need
On 21.05.24 at 07:11, Rino Andre Johnsen wrote:
[Why]
For debugging and testing purposes.
[How]
Create amdgpu_current_pixelencoding debugfs entry.
Usage: cat /sys/kernel/debug/dri/1/crtc-0/amdgpu_current_pixelencoding
Why isn't that available as standard DRM CRTC property in either sysfs
or
On 20.05.24 at 10:18, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Align kerneldoc with the function argument name.
Signed-off-by: Tvrtko Ursulin
Reported-by: Stephen Rothwell
Fixes: 26e20235ce00 ("drm/amdgpu: Add amdgpu_bo_is_vm_bo helper")
Cc: Christian König
Cc: Alex Deucher
On 17.05.24 at 17:46, Alex Deucher wrote:
On Fri, May 17, 2024 at 2:35 AM Christian König
wrote:
On 16.05.24 at 19:57, Tim Van Patten wrote:
From: Tim Van Patten
The following commit updated gmc->noretry from 0 to 1 for GC HW IP
9.3.0:
commit 5f3854f1f4e2 ("drm/amdgpu:
On 16.05.24 at 14:21, Tvrtko Ursulin wrote:
Hi Christian,
On 08/05/2024 09:26, Tvrtko Ursulin wrote:
On 08/05/2024 06:42, Christian König wrote:
On 06.05.24 at 18:26, Tvrtko Ursulin wrote:
On 03/05/2024 10:14, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Help code readability
On 16.05.24 at 19:57, Tim Van Patten wrote:
From: Tim Van Patten
The following commit updated gmc->noretry from 0 to 1 for GC HW IP
9.3.0:
commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
This causes the device to hang when a page fault occurs, until the
device is
On 15.05.24 at 12:59, Tvrtko Ursulin wrote:
On 15/05/2024 08:20, Christian König wrote:
On 08.05.24 at 20:09, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Current code appears to live under a misconception that playing with buffer
allowed and preferred placements can control the decision
-by: Tvrtko Ursulin
Cc: Christian König
Cc: Friedrich Vock
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 22708954ae68..d07a1dd7c880
budget spent.
Fix it by looking at the before and after buffer object backing store and
only account if there was a change.
FIXME:
I think this needs a better solution to account for migrations between
VRAM visible and non-visible portions.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc
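The fix described in the snippet above (compare the backing store before and after, and only account when it changed) can be sketched as follows. This is a simplified userspace illustration; the enum and function names are hypothetical and do not come from the amdgpu or TTM code:

```c
#include <stdint.h>

/* Hypothetical placement IDs; real drivers use their own memory-type constants. */
enum placement { PLACE_SYSTEM, PLACE_GTT, PLACE_VRAM };

/*
 * Charge the buffer size against the move budget only if the backing
 * store actually changed. Returns the updated bytes-moved counter.
 */
uint64_t account_move(uint64_t bytes_moved, enum placement before,
		      enum placement after, uint64_t bo_size)
{
	if (before != after)
		bytes_moved += bo_size;
	return bytes_moved;
}
```

A validation that leaves the buffer where it already was then costs nothing against the budget. As the FIXME in the snippet notes, a comparison this coarse would not distinguish moves between the visible and non-visible portions of VRAM.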
ably rough but should be good enough for discussion.
I am curious to hear if I identified at least something correctly as a real
problem.
It would also be good to hear what the suggested games to check are, to see
whether there is any improvement.
Cc: Christian König
Cc: Friedrich Vock
Tvrt
On 14.05.24 at 10:13, Liang, Prike wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
From: Koenig, Christian
Sent: Friday, May 10, 2024 5:31 PM
To: Liang, Prike ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: [PATCH] drm/amdgpu: Use the slab allocator to
ked-by: Christian König
Fixes: 1bece222eab ("drm/amdgpu: Clear doorbell interrupt status for Sienna
Cichlid")
Cc: Alex Deucher
Cc: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/d
On 09.05.24 at 22:41, Ori Messinger wrote:
This patch adds 'ring hang' events to the driver.
This is done by adding a 'reset_ring_hang' bool variable to the
struct 'amdgpu_reset_context' in the amdgpu_reset.h file.
The purpose for this 'reset_ring_hang' variable is whenever a GPU
reset is
On 13.05.24 at 19:41, David Wu wrote:
On 2024-05-13 13:11, Christian König wrote:
On 09.05.24 at 20:40, David (Ming Qiang) Wu wrote:
We do not directly enable/disable VCN IRQ in vcn 5.0.0.
And we do not handle the IRQ state as well. So the calls to
disable IRQ and set state are removed
On 09.05.24 at 20:40, David (Ming Qiang) Wu wrote:
We do not directly enable/disable VCN IRQ in vcn 5.0.0.
And we do not handle the IRQ state as well. So the calls to
disable IRQ and set state are removed. This effectively gets
rid of the warning of
"WARN_ON(!amdgpu_irq_enabled(adev,
On 10.05.24 at 10:50, Arunpravin Paneer Selvam wrote:
Add support to handle the userqueue protected fence signal hardware
interrupt.
Create an xarray which maps the doorbell index to the fence driver address.
This would help to retrieve the fence driver information when a userq fence
interrupt
On 10.05.24 at 10:50, Arunpravin Paneer Selvam wrote:
Remove MES self test as this conflicts with the userqueue fence
interrupts.
Please also completely remove the amdgpu_mes_self_test() function and
any now unused code.
Regards,
Christian.
Signed-off-by: Arunpravin Paneer Selvam
---
On 13.05.24 at 06:14, Ori Messinger wrote:
This patch adds 'ring hang' events to the driver.
This is done by adding a 'reset_ring_hang' bool variable to the
struct 'amdgpu_reset_context' in the amdgpu_reset.h file.
The purpose for this 'reset_ring_hang' variable is whenever a GPU
reset is
without holding a lock.
Signed-off-by: Arunpravin Paneer Selvam
Suggested-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +
.../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 431 +-
.../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h | 6 +
drivers/gp
On 13.05.24 at 10:56, Ma Jun wrote:
Check bo before using it
Signed-off-by: Ma Jun
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 16 +++-
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c
b
.
The driver core will already nuke the pointer for us when the pci device
is removed, so should be safe to simply drop. Alternative would be to
move to the driver pci remove callback.
Signed-off-by: Matthew Auld
Cc: Christian König
Cc: Daniel Vetter
Cc: amd-gfx@lists.freedesktop.org
Oh! Very good