On 09/10/2024 13:07, Christian König wrote:
On 09.10.24 at 09:41, Tvrtko Ursulin wrote:
On 08/10/2024 19:11, Christian König wrote:
Volatile only prevents the compiler from re-ordering reads and writes.
Since we always only modify the ring buffer from one CPU thread and have
an explicit
On 08/10/2024 19:10, Christian König wrote:
On 08.10.24 at 17:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
I've noticed the hardware ring padding optimisations have landed so I decided
to respin the CPU side optimisations.
First two patches are simply adding ring fill helpers
On 08/10/2024 19:11, Christian König wrote:
Stop masking the wptr and decrementing the count_dw while writing into
the ring buffer. We can do that all at once while pushing the changes to
the HW.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 11 +--
On 08/10/2024 19:11, Christian König wrote:
Volatile only prevents the compiler from re-ordering reads and writes.
Since we always only modify the ring buffer from one CPU thread and have
an explicit barrier before signaling the HW this should have no effect at
all and just prevents compiler op
From: Tvrtko Ursulin
I've noticed there is really a lot of places which write addresses into
the ring as two writes of lower_32_bits() followed by upper_32_bits().
Is it worth adding a helper to do those in one go?
It shrinks the source and binary a bit but is the readability better, or
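To make the question concrete, here is a minimal userspace sketch of what such a helper could look like. Everything in it is an assumption for illustration: `struct ring`, `ring_write()` and `ring_write_addr64()` are hypothetical names, and `lo32()`/`hi32()` merely stand in for the kernel's lower_32_bits()/upper_32_bits(). It is not the helper proposed in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified ring; not the real struct amdgpu_ring. */
struct ring {
	uint32_t *buf;     /* backing storage, size_dw entries */
	uint32_t size_dw;  /* ring size in dwords, must be a power of two */
	uint32_t wptr;     /* write pointer in dwords, masked on use */
};

/* Stand-ins for the kernel's lower_32_bits() / upper_32_bits(). */
static inline uint32_t lo32(uint64_t v) { return (uint32_t)v; }
static inline uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Write one dword and advance the write pointer, wrapping via the mask. */
static void ring_write(struct ring *r, uint32_t v)
{
	assert((r->size_dw & (r->size_dw - 1)) == 0);
	r->buf[r->wptr++ & (r->size_dw - 1)] = v;
}

/* Emit a 64-bit address as two dwords, low half first. */
static void ring_write_addr64(struct ring *r, uint64_t addr)
{
	ring_write(r, lo32(addr));
	ring_write(r, hi32(addr));
}
```

A call site then shrinks from two ring_write() calls to a single ring_write_addr64(ring, addr); whether that reads better is exactly the open question above.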
From: Tvrtko Ursulin
Similarly as in the previous patch, we add a new amdgpu_ring_fill2x32()
helper which can write out the nops more efficiently using memset64().
This should have a lesser effect than the previous patch, given how the
affected rings have at most 64 dword alignment restriction
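The block-fill idea described in these two patches can be sketched roughly as follows. This is a simplified userspace illustration under stated assumptions, not the code from the series: `struct ring`, `fill32()` and `ring_fill_nops()` are hypothetical names, and the plain loop in `fill32()` stands in for the kernel's memset32()/memset64() helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ring; not the real struct amdgpu_ring. */
struct ring {
	uint32_t *buf;     /* backing storage, size_dw entries */
	uint32_t size_dw;  /* ring size in dwords, must be a power of two */
	uint32_t wptr;     /* write pointer in dwords, masked on use */
};

/* Userspace stand-in for a kernel block-fill helper such as memset32(). */
static void fill32(uint32_t *dst, uint32_t val, size_t count)
{
	for (size_t i = 0; i < count; i++)
		dst[i] = val;
}

/*
 * Write count nops as at most two contiguous blocks (up to the end of the
 * buffer, then the wrapped remainder) instead of one masked write per nop.
 */
static void ring_fill_nops(struct ring *r, uint32_t nop, uint32_t count)
{
	uint32_t start = r->wptr & (r->size_dw - 1);
	uint32_t first = r->size_dw - start; /* room before the wrap point */

	assert(count <= r->size_dw);
	if (first > count)
		first = count;

	fill32(r->buf + start, nop, first);
	fill32(r->buf, nop, count - first);  /* may be a zero-length fill */

	r->wptr += count;
}
```

The write pointer arithmetic is done once per fill rather than once per dword, which is where the saving over the per-nop loop would come from.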
From: Tvrtko Ursulin
Similar to the previous patch but with the addition of a magic bit1 set on
big endian platforms. No idea what it is but maybe adding a helper and
giving both it and the magic bit a proper name would be worth it.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Sunil
From: Tvrtko Ursulin
Having noticed that typically 200+ nops per submission are written into
the ring, using a rather verbose one-nop-at-a-time-plus-ring-buffer-
arithmetic as done in amdgpu_ring_write(), the obvious idea was to
improve it by filling those nops in blocks.
This patch therefore
From: Tvrtko Ursulin
I've noticed the hardware ring padding optimisations have landed so I decided
to respin the CPU side optimisations.
First two patches are simply adding ring fill helpers which deal with reducing
the CPU cost of emitting hundreds of nops from the for-amdgpu_ring_write
On 07/10/2024 15:39, Alex Deucher wrote:
On Mon, Oct 7, 2024 at 8:52 AM Tvrtko Ursulin wrote:
On 04/10/2024 15:15, Alex Deucher wrote:
Applied. Thanks!
Thanks Alex!
Could you perhaps also merge
https://lore.kernel.org/amd-gfx/20240813135712.82611-1-tursu...@igalia.com/
via your tree
On 04/10/2024 15:15, Alex Deucher wrote:
Applied. Thanks!
Thanks Alex!
Could you perhaps also merge
https://lore.kernel.org/amd-gfx/20240813135712.82611-1-tursu...@igalia.com/
via your tree? If it still applies that is.
Regards,
Tvrtko
On Fri, Oct 4, 2024 at 3:28 AM Tvrtko Ursulin
On 24/09/2024 13:06, Christian König wrote:
On 24.09.24 at 11:51, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
While loop makes it sound like amdgpu_vmid_grab() potentially needs to be
called multiple times to produce a fence, while in reality all code paths
either return an error, assign a
On 30/09/2024 14:07, Christian König wrote:
On 30.09.24 at 15:01, Tvrtko Ursulin wrote:
On 13/09/2024 17:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Entities run queue can change during drm_sched_entity_push_job() so make
sure to update the score consistently.
Signed-off-by: Tvrtko
On 13/09/2024 17:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Entities run queue can change during drm_sched_entity_push_job() so make
sure to update the score consistently.
Signed-off-by: Tvrtko Ursulin
Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple
q
On 27/09/2024 09:48, Pierre-Eric Pelloux-Prayer wrote:
If a drm_file name is set append it to the process name.
This information is useful with the virtio/native-context driver: this
allows the guest application's identifier to be visible in amdgpu's output.
The output in amdgpu_vm_info/amdgpu_ge
sg, fdinfo, etc), -EINVAL is returned.
A 0-length string is a valid use, and clears the existing name.
Reviewed-by: Tvrtko Ursulin
Signed-off-by: Pierre-Eric Pelloux-Prayer
---
drivers/gpu/drm/drm_debugfs.c | 14 ++---
drivers/gpu/drm/drm_file.c| 5
drivers/gpu/drm/drm
On 26/09/2024 09:15, Philipp Stanner wrote:
On Mon, 2024-09-23 at 15:35 +0100, Tvrtko Ursulin wrote:
Ping Christian and Philipp - reasonably happy with v2? I think it's the
only unreviewed patch from the series.
Howdy,
sry for the delay, I had been traveling.
I have a few nits
they also go to
drm-misc-fixes? I am not too familiar with the drm-misc flow.
Or the series now needs to wait for some backmerge?
Regards,
Tvrtko
On 24.09.24 at 12:19, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Since drm_sched_entity_modify_sched() can modify the entities run queue,
lets ma
On 24/09/2024 15:20, Christian König wrote:
On 24.09.24 at 16:12, Tvrtko Ursulin wrote:
On 24/09/2024 14:55, Christian König wrote:
I've pushed the first to drm-misc-next, but that one here fails to
apply cleanly.
This appears due 440d52b370b0 ("drm/sched: Fix dynamic job-flo
On 24/09/2024 10:45, Tvrtko Ursulin wrote:
On 24/09/2024 09:20, Christian König wrote:
On 16.09.24 at 19:30, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", wit
On 24/09/2024 09:20, Christian König wrote:
On 16.09.24 at 19:30, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we ca
On 24/09/2024 09:23, Christian König wrote:
On 23.09.24 at 12:25, Tvrtko Ursulin wrote:
On 20/09/2024 10:06, Pierre-Eric Pelloux-Prayer wrote:
At this point the vm is locked so we safely modify it without risk of
concurrent access.
To which particular lock this is referring to and does
On 24/09/2024 09:22, Pierre-Eric Pelloux-Prayer wrote:
On 23/09/2024 at 12:06, Tvrtko Ursulin wrote:
On 20/09/2024 10:06, Pierre-Eric Pelloux-Prayer wrote:
Giving the opportunity to userspace to associate a free-form
name with a drm_file struct is helpful for tracking and debugging
#define DRM_IOCTL_MODE_CLOSEFB DRM_IOWR(0xD0, struct drm_mode_closefb)
+/**
+ * DRM_IOCTL_SET_NAME - Attach a name to a drm_file
+ *
+ * This ioctl is similar to DMA_BUF_SET_NAME - it allows for easier tracking
+ * and debugging.
+ * The length of the name must <= DRM_NAME_MA
Ping Christian and Philipp - reasonably happy with v2? I think it's the
only unreviewed patch from the series.
Regards,
Tvrtko
On 16/09/2024 18:30, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sc
On 20/09/2024 10:06, Pierre-Eric Pelloux-Prayer wrote:
This will allow to use flexible array to store the process name and
other information.
This also means that process name will be determined once and for all,
instead of at each submit.
But the pid and others can still change? By design?
On 20/09/2024 10:06, Pierre-Eric Pelloux-Prayer wrote:
At this point the vm is locked so we safely modify it without risk of
concurrent access.
To which particular lock this is referring to and does this imply
previous placement was unsafe?
Regards,
Tvrtko
Signed-off-by: Pierre-Eric Pel
From: Tvrtko Ursulin
It does not seem there is a need to set the current entity in FIFO mode
since it only serves as being a "cursor" in round-robin mode. Even if
scheduling mode is changed at runtime the change in behaviour is simply
to restart from the first entity, instead of contin
From: Tvrtko Ursulin
Current kerneldoc for struct drm_sched_rq incompletely documents what
fields are protected by the lock.
This is not good because it is misleading.
Lets fix it by listing all the elements which are protected by the lock.
While at it, lets also re-order the members so all
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we can do the same optimisation on the rq->lock.
(Currently both drm_sched_rq_
From: Tvrtko Ursulin
Since drm_sched_entity_modify_sched() can modify the entities run queue,
lets make sure to only dereference the pointer once so both adding and
waking up are guaranteed to be consistent.
Alternative of moving the spin_unlock to after the wake up would for now
be more
From: Tvrtko Ursulin
Entities run queue can change during drm_sched_entity_push_job() so make
sure to update the score consistently.
Signed-off-by: Tvrtko Ursulin
Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple
queues")
Cc: Nirmoy Das
Cc: Christian
From: Tvrtko Ursulin
Without the locking amdgpu currently can race between
amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
drm_sched_job_arm(), leading to the latter accessing potentially
inconsistent entity->sched_list and entity->num_sched_list pair.
v2:
* I
From: Tvrtko Ursulin
Christian suggested to rename the lock and improve the documentation of
what it protects. And to also re-order the structure members so all
protected by the lock are together in a block.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
From: Tvrtko Ursulin
In FIFO mode we can avoid dropping the lock only to immediately re-acquire
by adding a new drm_sched_rq_update_fifo_locked() helper.
v2:
* Remove drm_sched_rq_update_fifo() altogether. (Christian)
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc
From: Tvrtko Ursulin
All reviewed now, re-sending after rebasing on latest drm-tip so it is in a
mergeable state.
Tvrtko Ursulin (8):
drm/sched: Add locking to drm_sched_entity_modify_sched
drm/sched: Always wake up correct scheduler in
drm_sched_entity_push_job
drm/sched: Always
From: Tvrtko Ursulin
While loop makes it sound like amdgpu_vmid_grab() potentially needs to be
called multiple times to produce a fence, while in reality all code paths
either return an error, assign a valid job->vmid or assign a vmid which
will be valid once the returned fence sign
From: Tvrtko Ursulin
Fence has been initialised to NULL so no need to test it.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
b/drivers
From: Tvrtko Ursulin
Fence argument is unused so lets drop it.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 6 ++
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
b/drivers/gpu/drm
On 16/09/2024 13:20, Tvrtko Ursulin wrote:
On 16/09/2024 13:11, Christian König wrote:
On 13.09.24 at 18:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", wit
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we can do the same optimisation on the rq->lock.
(Currently both drm_sched_rq_
On 16/09/2024 09:16, Philipp Stanner wrote:
On Fri, 2024-09-13 at 17:05 +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Current kerneldoc for struct drm_sched_rq incompletely documents what
fields are protected by the lock.
This is not good because it is misleading.
Lets fix it by
On 16/09/2024 13:11, Christian König wrote:
On 13.09.24 at 18:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we ca
On 10/09/2024 16:03, Christian König wrote:
On 10.09.24 at 11:46, Tvrtko Ursulin wrote:
On 10/09/2024 10:08, Christian König wrote:
On 09.09.24 at 19:19, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"d
On 13/09/2024 13:19, Philipp Stanner wrote:
On Wed, 2024-09-11 at 13:22 +0100, Tvrtko Ursulin wrote:
On 10/09/2024 11:25, Philipp Stanner wrote:
On Mon, 2024-09-09 at 18:19 +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a pa
From: Tvrtko Ursulin
Entities run queue can change during drm_sched_entity_push_job() so make
sure to update the score consistently.
Signed-off-by: Tvrtko Ursulin
Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple
queues")
Cc: Nirmoy Das
Cc: Christian
From: Tvrtko Ursulin
Without the locking amdgpu currently can race between
amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
drm_sched_job_arm(), leading to the latter accessing potentially
inconsistent entity->sched_list and entity->num_sched_list pair.
v2:
* I
From: Tvrtko Ursulin
Christian suggested to rename the lock and improve the documentation of
what it protects. And to also re-order the structure members so all
protected by the lock are together in a block.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
From: Tvrtko Ursulin
Current kerneldoc for struct drm_sched_rq incompletely documents what
fields are protected by the lock.
This is not good because it is misleading.
Lets fix it by listing all the elements which are protected by the lock.
While at it, lets also re-order the members so all
From: Tvrtko Ursulin
Since drm_sched_entity_modify_sched() can modify the entities run queue,
lets make sure to only dereference the pointer once so both adding and
waking up are guaranteed to be consistent.
Alternative of moving the spin_unlock to after the wake up would for now
be more
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we can do the same optimisation on the rq->lock.
(Currently both drm_sched_rq_
From: Tvrtko Ursulin
In FIFO mode we can avoid dropping the lock only to immediately re-acquire
by adding a new drm_sched_rq_update_fifo_locked() helper.
v2:
* Remove drm_sched_rq_update_fifo() altogether. (Christian)
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc
From: Tvrtko Ursulin
It does not seem there is a need to set the current entity in FIFO mode
since it only serves as being a "cursor" in round-robin mode. Even if
scheduling mode is changed at runtime the change in behaviour is simply
to restart from the first entity, instead of contin
From: Tvrtko Ursulin
Re-spin of the series from last week. Changelog is in individual patches.
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
Cc: Matthew Brost
Cc: Philipp Stanner
Tvrtko Ursulin (8):
drm/sched: Add locking to drm_sched_entity_modify_sched
drm/sched: Always wake
On 10/09/2024 11:25, Philipp Stanner wrote:
On Mon, 2024-09-09 at 18:19 +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactor
On 10/09/2024 11:05, Philipp Stanner wrote:
On Mon, 2024-09-09 at 18:19 +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Lets re-order the members to make it clear which are protected by the lock
and at the same time document it via kerneldoc.
I'd prefer if commit messages follo
On 10/09/2024 10:08, Christian König wrote:
On 09.09.24 at 19:19, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we ca
On 09/09/2024 13:46, Philipp Stanner wrote:
On Mon, 2024-09-09 at 13:37 +0100, Tvrtko Ursulin wrote:
On 09/09/2024 13:18, Christian König wrote:
On 09.09.24 at 14:13, Philipp Stanner wrote:
On Mon, 2024-09-09 at 13:29 +0200, Christian König wrote:
On 09.09.24 at 11:44, Philipp
From: Tvrtko Ursulin
Christian suggested to rename the lock and improve the documentation of
what it protects. And to also re-order the structure members so all
protected by the lock are together in a block.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
From: Tvrtko Ursulin
Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we can do the same optimisation on the rq->lock.
(Currently both drm_sched_rq_
From: Tvrtko Ursulin
It does not seem there is a need to set the current entity in FIFO mode
since it only serves as being a "cursor" in round-robin mode. Even if
scheduling mode is changed at runtime the change in behaviour is simply
to restart from the first entity, instead of contin
From: Tvrtko Ursulin
Entities run queue can change during drm_sched_entity_push_job() so make
sure to update the score consistently.
Signed-off-by: Tvrtko Ursulin
Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple
queues")
Cc: Nirmoy Das
Cc: Christian
From: Tvrtko Ursulin
Without the locking amdgpu currently can race between
amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
drm_sched_job_arm(), leading to the latter accessing potentially
inconsistent entity->sched_list and entity->num_sched_list pair.
v2:
* I
From: Tvrtko Ursulin
Lets re-order the members to make it clear which are protected by the lock
and at the same time document it via kerneldoc.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
Cc: Matthew Brost
Cc: Philipp Stanner
---
include/drm
From: Tvrtko Ursulin
In FIFO mode we can avoid dropping the lock only to immediately re-acquire
by adding a new drm_sched_rq_update_fifo_locked() helper.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers
From: Tvrtko Ursulin
Since drm_sched_entity_modify_sched() can modify the entities run queue,
lets make sure to only dereference the pointer once so both adding and
waking up are guaranteed to be consistent.
Alternative of moving the spin_unlock to after the wake up would for now
be more
From: Tvrtko Ursulin
Re-spin of the series from two days ago with review feedback addressed and
some new patches added.
Changelog is in individual patches but essentially new patches are renames
and struct members re-ordering as discussed in v1, plus one more optimisation
when I noticed we can
On 06/09/2024 19:12, Alex Deucher wrote:
On Wed, Sep 4, 2024 at 4:36 AM Tvrtko Ursulin wrote:
On 21/08/2024 21:47, Alex Deucher wrote:
On Tue, Aug 13, 2024 at 9:57 AM Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Currently it is not well defined what is drm-memory- compared to other
On 09/09/2024 13:18, Christian König wrote:
On 09.09.24 at 14:13, Philipp Stanner wrote:
On Mon, 2024-09-09 at 13:29 +0200, Christian König wrote:
On 09.09.24 at 11:44, Philipp Stanner wrote:
On Fri, 2024-09-06 at 19:06 +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Without the
On 09/09/2024 09:47, Philipp Stanner wrote:
Hi,
On Fri, 2024-09-06 at 19:06 +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
In a recent conversation with Christian there was a thought that
drm_sched_entity_modify_sched() should start using the entity->rq_lock to be
safe against job
From: Tvrtko Ursulin
Now that no callers exist, lets remove the whole misleading helper.
Misleading because runtime changes do not reliably work due
drm_sched_entity_select_rq() only acting on idle entities.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
From: Tvrtko Ursulin
According to Christian the dynamic DRM priority override was only
interesting before the hardware priority (done via
drm_sched_entity_modify_sched()) existed. Furthermore, both
overrides also only work somewhat on paper while in reality they are only
effective if the entity
From: Tvrtko Ursulin
In a recent conversation with Christian there was a thought that dynamic DRM
scheduling priority changes are not required, or even not desired (actively
prevented?!), and can be ripped out.
For more context, starting point for that conversation was me observing that
they
From: Tvrtko Ursulin
Without the locking amdgpu currently can race
amdgpu_ctx_set_entity_priority() and drm_sched_job_arm(), leading to the
latter accessing potentially inconsistent entity->sched_list and
entity->num_sched_list pair.
The comment on drm_sched_entity_modify_sched() howeve
From: Tvrtko Ursulin
In a recent conversation with Christian there was a thought that
drm_sched_entity_modify_sched() should start using the entity->rq_lock to be
safe against job submission and simultaneous priority changes.
The kerneldoc accompanying that function however is a bit unclear
From: Tvrtko Ursulin
In FIFO mode we can avoid dropping the lock only to immediately re-acquire
by adding a new drm_sched_rq_update_fifo_locked() helper.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Alex Deucher
Cc: Luben Tuikov
Cc: Matthew Brost
---
drivers/gpu/drm/scheduler
From: Tvrtko Ursulin
Since drm_sched_entity_modify_sched() can modify the entities run queue
lets make sure to only dereference the pointer once so both adding and
waking up are guaranteed to be consistent.
Signed-off-by: Tvrtko Ursulin
Fixes: b37aced31eb0 ("drm/scheduler: implement a fun
From: Tvrtko Ursulin
Entities run queue can change during drm_sched_entity_push_job() so make
sure to update the score consistently.
Signed-off-by: Tvrtko Ursulin
Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple
queues")
Cc: Nirmoy Das
Cc: Christian
On 21/08/2024 21:47, Alex Deucher wrote:
On Tue, Aug 13, 2024 at 9:57 AM Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Currently it is not well defined what is drm-memory- compared to other
categories.
In practice the only driver which emits these keys is amdgpu and in them
exposes the
On 13/08/2024 19:47, Rob Clark wrote:
On Tue, Aug 13, 2024 at 6:57 AM Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Currently it is not well defined what is drm-memory- compared to other
categories.
In practice the only driver which emits these keys is amdgpu and in them
exposes the current
On 13/08/2024 15:08, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
When CONFIG_INIT_STACK_ALL_ZERO is set and so -ftrivial-auto-var-init=zero
compiler option active, compiler fails to notice that later in
amdgpu_vm_pt_clear() there is a second memset to clear the same on stack
struct
I was waiting for some replies elsewhere on this thread. Anyway.. for
the below, because I don't understand how come an important fix like
this is not garnering more attention:
On 04/06/2024 17:05, Christian König wrote:
From: Tvrtko Ursulin
Since you pretty much changed my logi
From: Tvrtko Ursulin
When CONFIG_INIT_STACK_ALL_ZERO is set and so -ftrivial-auto-var-init=zero
compiler option active, compiler fails to notice that later in
amdgpu_vm_pt_clear() there is a second memset to clear the same on stack
struct amdgpu_vm_update_params.
If we replace this memset with
From: Tvrtko Ursulin
When CONFIG_INIT_STACK_ALL_ZERO is set and so -ftrivial-auto-var-init=zero
compiler option active, compiler fails to notice that inside
amdgpu_cs_parser_init() there is a second memset to clear the same on
stack struct amdgpu_cs_parser.
If we pull this memset one level out
From: Tvrtko Ursulin
Re-sending these two since they garnered little attention last time round.
First patch clarifies what drm-memory- is, and that it is legacy, and second
patch updates amdgpu to start emitting new keys together with the legacy (by
using the common DRM helper).
With that
From: Tvrtko Ursulin
Convert fdinfo memory stats to use the common drm_print_memory_stats
helper.
This achieves alignment with the common keys as documented in
drm-usage-stats.rst, adding specifically drm-total- key the driver was
missing until now.
Additionally I made the code stop skipping
From: Tvrtko Ursulin
Currently it is not well defined what is drm-memory- compared to other
categories.
In practice the only driver which emits these keys is amdgpu and in them
exposes the current resident buffer object memory (including shared).
To prevent any confusion, document that drm
On 04/08/2024 19:11, Marek Olšák wrote:
On Thu, Aug 1, 2024 at 2:55 PM Marek Olšák wrote:
On Thu, Aug 1, 2024, 03:37 Christian König wrote:
On 01.08.24 at 08:53, Marek Olšák wrote:
On Thu, Aug 1, 2024, 00:28 Khatri, Sunil wrote:
On 8/1/2024 8:49 AM, Marek Olšák wrote:
+ /* He
On 24/07/2024 12:16, Christian König wrote:
On 24.07.24 at 10:16, Tvrtko Ursulin wrote:
[SNIP]
Absolutely.
Absolutely good and absolutely me, or absolutely you? :)
You, I don't even have time to finish all the stuff I already started :/
Okay, I think I can squeeze it in.
Thes
On 22/07/2024 16:13, Christian König wrote:
On 22.07.24 at 16:43, Tvrtko Ursulin wrote:
On 22/07/2024 15:06, Christian König wrote:
On 22.07.24 at 15:52, Tvrtko Ursulin wrote:
On 19/07/2024 16:18, Christian König wrote:
On 19.07.24 at 15:02, Christian König wrote:
On 19.07.24 at 11
On 22/07/2024 15:06, Christian König wrote:
On 22.07.24 at 15:52, Tvrtko Ursulin wrote:
On 19/07/2024 16:18, Christian König wrote:
On 19.07.24 at 15:02, Christian König wrote:
On 19.07.24 at 11:47, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Long time ago in commit b3ac17667f11 ("
On 19/07/2024 16:18, Christian König wrote:
On 19.07.24 at 15:02, Christian König wrote:
On 19.07.24 at 11:47, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Long time ago in commit b3ac17667f11 ("drm/scheduler: rework entity
creation") a change was made which prevented priority c
From: Tvrtko Ursulin
Long time ago in commit b3ac17667f11 ("drm/scheduler: rework entity
creation") a change was made which prevented priority changes for entities
with only one assigned scheduler.
The commit reduced drm_sched_entity_set_priority() to simply update the
entities pri
From: Tvrtko Ursulin
Having noticed that typically 200+ nops per submission are written into
the ring, using a rather verbose one-nop-at-a-time-plus-ring-buffer-
arithmetic as done in amdgpu_ring_write(), the obvious idea was to
improve it by filling those nops in blocks.
This patch therefore
On 12/07/2024 16:28, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
Having noticed that typically 200+ nops per submission are written into
the ring, using a rather verbose one-nop-at-a-time-plus-ring-buffer-
arithmetic as done in amdgpu_ring_write(), the obvious idea was to
improve it by
On 12/07/2024 14:04, Christian König wrote:
On 12.07.24 at 11:14, Tvrtko Ursulin wrote:
On 12/07/2024 08:33, Christian König wrote:
On 11.07.24 at 20:17, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
From the department of questionable optimisations today we have a minor
improvement to
On 12/07/2024 08:33, Christian König wrote:
On 11.07.24 at 20:17, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
From the department of questionable optimisations today we have a minor
improvement to how padding / filling the rings with nops is done.
Having noticed that typically 200+ nops
From: Tvrtko Ursulin
Having noticed that typically 200+ nops per submission are written into
the ring, using a rather verbose one-nop-at-a-time-plus-ring-buffer-
arithmetic as done in amdgpu_ring_write(), the obvious idea was to
improve it by filling those nops in blocks.
This patch therefore
From: Tvrtko Ursulin
Similarly as in the previous patch, we add a new amdgpu_ring_fill64()
helper which can write out the nops more efficiently using memset64().
This should have a lesser effect than the previous patch, given how the
affected rings have at most 64 dword alignment restriction
From: Tvrtko Ursulin
Three patches to streamline the ring nop padding process which happens on
every submission.
I smoke tested graphics and video decode on the Steam Deck but cannot do much
more testing than that. Therefore no guarantees I did not break something.
Cc: Christian König