Re: [PATCH 1/2] drm/ttm: improve idle/busy handling v4

2024-02-27 Thread Matthew Auld

On 26/02/2024 20:21, Thomas Hellström wrote:

Hi, Christian

On Fri, 2024-02-23 at 15:30 +0100, Christian König wrote:

Am 06.02.24 um 13:56 schrieb Christian König:

Am 06.02.24 um 13:53 schrieb Thomas Hellström:

Hi, Christian,

On Fri, 2024-01-26 at 15:09 +0100, Christian König wrote:

Previously we would never try to move a BO into the preferred placements
once it had landed in a busy placement, since those were considered
compatible.

Rework the whole handling and finally unify the idle and busy handling.
ttm_bo_validate() is now responsible for trying the idle placement first
and then using the busy placement if that didn't work.

Drawback is that we now always try the idle placement first for each
validation, which might cause some additional CPU overhead on
overcommit.
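
[Editor's note: as a rough sketch of the described split, an assumption
based on this patch's interface rather than a verbatim excerpt,
ttm_bo_validate() now does roughly:

    /* Try placements without evicting first, then allow eviction. */
    ret = ttm_bo_alloc_resource(bo, placement, ctx, false, &res);  /* idle */
    if (ret == -ENOSPC)
            ret = ttm_bo_alloc_resource(bo, placement, ctx, true, &res);  /* busy */
]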

v2: fix kerneldoc warning and coding style
v3: take care of XE as well
v4: keep the ttm_bo_mem_space functionality as it is for now, only add
    new handling for ttm_bo_validate as suggested by Thomas

Signed-off-by: Christian König 
Reviewed-by: Zack Rusin  v3

Sending this through xe CI, will try to review asap.


Take your time. At the moment people are bombarding me with work
and I
have only two hands and one head as well :(


So I've dug myself out of that hole and would rather like to get this
new feature into 6.9.

Any time to review it? I can also plan some time to review your LRU
changes next week.

Thanks,
Christian.


Sorry for the late response. Was planning to review but saw that there
was still an xe CI failure.

https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-129579v1/bat-atsm-2/igt@xe_evict_...@evict-overcommit-parallel-nofree-samefd.html

I haven't really had time to look into what might be causing this,
though.

Maybe in ttm_bo_alloc_resource():

@@ -772,7 +772,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,


do {
ret = ttm_resource_alloc(bo, place, res);
-   if (unlikely(ret != -ENOSPC))
+   if (unlikely(ret && ret != -ENOSPC))
return ret;
if (likely(!ret) || !force_space)
break;

Otherwise we allocate VRAM but never correctly synchronise against the
move fence, since we missed adding it to the BO. When we trigger async
evictions, that would explain the above test failure where we detect VRAM
corruption, since someone else is still using the VRAM we allocated.
What do you think?




/Thomas
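
[Editor's note: to spell out why the one-character change matters, here is
a simplified sketch of the loop with the suggested fix applied; it is an
illustration, not the full function. With the unfixed test, a successful
allocation (ret == 0) also satisfies ret != -ENOSPC, so the function
returns before ttm_bo_add_move_fence() runs:

    do {
            ret = ttm_resource_alloc(bo, place, res);
            if (unlikely(ret && ret != -ENOSPC))
                    return ret;             /* hard error only */
            if (likely(!ret) || !force_space)
                    break;                  /* success, or eviction not allowed */
            /* -ENOSPC with force_space: evict something and retry */
    } while (1);
    /* only now is the move fence added to the BO */
]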








---
   drivers/gpu/drm/ttm/ttm_bo.c   | 231 +--------
   drivers/gpu/drm/ttm/ttm_resource.c |  16 +-
   include/drm/ttm/ttm_resource.h |   3 +-
   3 files changed, 121 insertions(+), 129 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
b/drivers/gpu/drm/ttm/ttm_bo.c
index ba3f09e2d7e6..b12f435542a9 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -724,64 +724,36 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
   return ret;
   }
-/*
- * Repeatedly evict memory from the LRU for @mem_type until we create enough
- * space, or we've evicted everything and there isn't enough space.
- */
-static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
-      const struct ttm_place *place,
-      struct ttm_resource **mem,
-      struct ttm_operation_ctx *ctx)
-{
-    struct ttm_device *bdev = bo->bdev;
-    struct ttm_resource_manager *man;
-    struct ww_acquire_ctx *ticket;
-    int ret;
-
-    man = ttm_manager_type(bdev, place->mem_type);
-    ticket = dma_resv_locking_ctx(bo->base.resv);
-    do {
-    ret = ttm_resource_alloc(bo, place, mem);
-    if (likely(!ret))
-    break;
-    if (unlikely(ret != -ENOSPC))
-    return ret;
-    ret = ttm_mem_evict_first(bdev, man, place, ctx,
-      ticket);
-    if (unlikely(ret != 0))
-    return ret;
-    } while (1);
-
-    return ttm_bo_add_move_fence(bo, man, *mem, ctx->no_wait_gpu);
-}
-
   /**
- * ttm_bo_mem_space
+ * ttm_bo_alloc_resource - Allocate backing store for a BO
    *
- * @bo: Pointer to a struct ttm_buffer_object. the data of which
- * we want to allocate space for.
- * @placement: Proposed new placement for the buffer object.
- * @mem: A struct ttm_resource.
+ * @bo: Pointer to a struct ttm_buffer_object of which we want a resource for
+ * @placement: Proposed new placement for the buffer object
    * @ctx: if and how to sleep, lock buffers and alloc memory
+ * @force_space: If we should evict buffers to force space
+ * @res: The resulting struct ttm_resource.
    *
- * Allocate memory space for the buffer object pointed to by @bo, using
- * the placement flags in @placement, potentially evicting other idle buffer objects.
- * This function may sleep while waiting for space to become available.
+ * Allocates a resource for the buffer object pointed to by @bo, using

Re: [PATCH 1/2] drm/ttm: improve idle/busy handling v4

2024-02-27 Thread Christian König

Am 27.02.24 um 09:12 schrieb Matthew Auld:

On 26/02/2024 20:21, Thomas Hellström wrote:

Sorry for the late response. Was planning to review but saw that there
was still an xe CI failure.

https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-129579v1/bat-atsm-2/igt@xe_evict_...@evict-overcommit-parallel-nofree-samefd.html 



I haven't really had time to look into what might be causing this,
though.

Maybe in ttm_bo_alloc_resource():

@@ -772,7 +772,7 @@ static int ttm_bo_alloc_resource(struct ttm_buffer_object *bo,


    do {
    ret = ttm_resource_alloc(bo, place, res);
-   if (unlikely(ret != -ENOSPC))
+   if (unlikely(ret && ret != -ENOSPC))
    return ret;
    if (likely(!ret) || !force_space)
    break;

Otherwise we allocate VRAM but never correctly synchronise against the
move fence, since we missed adding it to the BO. When we trigger async
evictions, that would explain the above test failure where we detect
VRAM corruption, since someone else is still using the VRAM we
allocated. What do you think?


Yup, that looks like the right thing to do. Thanks.

Give me a moment to fix that.

Christian.






[PATCH 01/13] drm/gem-shmem: Acquire reservation lock in GEM pin/unpin callbacks

2024-02-27 Thread Thomas Zimmermann
Export drm_gem_shmem_pin_locked() and acquire the reservation lock
directly in GEM pin callback. Same for unpin. Prepares for further
changes.

Dma-buf locking semantics require callers to hold the buffer's
reservation lock when invoking the pin and unpin callbacks. Prepare
gem-shmem accordingly by pushing locking out of the implementation.
A follow-up patch will fix locking for all GEM code at once.
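
[Editor's note: as a hedged illustration of the locking contract this
series converges on, caller code only; the helper names are from the
patch below:

    int ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
    if (ret)
            return ret;
    ret = drm_gem_shmem_pin_locked(shmem);  /* asserts the lock is held */
    dma_resv_unlock(shmem->base.resv);
]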

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c |  6 --
 include/drm/drm_gem_shmem_helper.h | 16 ++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index e435f986cd135..0ac3dddb917f3 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -228,7 +228,7 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object 
*shmem)
 }
 EXPORT_SYMBOL(drm_gem_shmem_put_pages);
 
-static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
+int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
 {
int ret;
 
@@ -238,13 +238,15 @@ static int drm_gem_shmem_pin_locked(struct 
drm_gem_shmem_object *shmem)
 
return ret;
 }
+EXPORT_SYMBOL(drm_gem_shmem_pin_locked);
 
-static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
+void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
 {
dma_resv_assert_held(shmem->base.resv);
 
drm_gem_shmem_put_pages(shmem);
 }
+EXPORT_SYMBOL(drm_gem_shmem_unpin_locked);
 
 /**
  * drm_gem_shmem_pin - Pin backing pages for a shmem GEM object
diff --git a/include/drm/drm_gem_shmem_helper.h 
b/include/drm/drm_gem_shmem_helper.h
index bf0c31aa8fbe4..eb12aa9a8c556 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -108,6 +108,9 @@ void drm_gem_shmem_vunmap(struct drm_gem_shmem_object 
*shmem,
  struct iosys_map *map);
 int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct 
vm_area_struct *vma);
 
+int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem);
+void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem);
+
 int drm_gem_shmem_madvise(struct drm_gem_shmem_object *shmem, int madv);
 
 static inline bool drm_gem_shmem_is_purgeable(struct drm_gem_shmem_object 
*shmem)
@@ -172,8 +175,15 @@ static inline void drm_gem_shmem_object_print_info(struct 
drm_printer *p, unsign
 static inline int drm_gem_shmem_object_pin(struct drm_gem_object *obj)
 {
struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
+   int ret;
+
+   ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
+   if (ret)
+   return ret;
+   ret = drm_gem_shmem_pin_locked(shmem);
+   dma_resv_unlock(shmem->base.resv);
 
-   return drm_gem_shmem_pin(shmem);
+   return ret;
 }
 
 /**
@@ -187,7 +197,9 @@ static inline void drm_gem_shmem_object_unpin(struct 
drm_gem_object *obj)
 {
struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
 
-   drm_gem_shmem_unpin(shmem);
+   dma_resv_lock(shmem->base.resv, NULL);
+   drm_gem_shmem_unpin_locked(shmem);
+   dma_resv_unlock(shmem->base.resv);
 }
 
 /**
-- 
2.43.2



[PATCH 02/13] drm/gem-vram: Acquire reservation lock in GEM pin/unpin callbacks

2024-02-27 Thread Thomas Zimmermann
Acquire the reservation lock directly in GEM pin callback. Same for
unpin. Prepares for further changes.

Dma-buf locking semantics require callers to hold the buffer's
reservation lock when invoking the pin and unpin callbacks. Prepare
gem-vram accordingly by pushing locking out of the implementation.
A follow-up patch will fix locking for all GEM code at once.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_gem_vram_helper.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c 
b/drivers/gpu/drm/drm_gem_vram_helper.c
index 75f2eaf0d5b61..15029d89badf8 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -283,6 +283,8 @@ static int drm_gem_vram_pin_locked(struct 
drm_gem_vram_object *gbo,
struct ttm_operation_ctx ctx = { false, false };
int ret;
 
+   dma_resv_assert_held(gbo->bo.base.resv);
+
if (gbo->bo.pin_count)
goto out;
 
@@ -338,6 +340,8 @@ EXPORT_SYMBOL(drm_gem_vram_pin);
 
 static void drm_gem_vram_unpin_locked(struct drm_gem_vram_object *gbo)
 {
+   dma_resv_assert_held(gbo->bo.base.resv);
+
ttm_bo_unpin(&gbo->bo);
 }
 
@@ -770,8 +774,14 @@ EXPORT_SYMBOL(drm_gem_vram_simple_display_pipe_cleanup_fb);
 static int drm_gem_vram_object_pin(struct drm_gem_object *gem)
 {
struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
+   int ret;
 
-   /* Fbdev console emulation is the use case of these PRIME
+   ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
+   if (ret)
+   return ret;
+
+   /*
+* Fbdev console emulation is the use case of these PRIME
 * helpers. This may involve updating a hardware buffer from
 * a shadow FB. We pin the buffer to it's current location
 * (either video RAM or system memory) to prevent it from
@@ -779,7 +789,10 @@ static int drm_gem_vram_object_pin(struct drm_gem_object 
*gem)
 * the buffer to be pinned to VRAM, implement a callback that
 * sets the flags accordingly.
 */
-   return drm_gem_vram_pin(gbo, 0);
+   ret = drm_gem_vram_pin_locked(gbo, 0);
+   ttm_bo_unreserve(&gbo->bo);
+
+   return ret;
 }
 
 /**
@@ -790,8 +803,13 @@ static int drm_gem_vram_object_pin(struct drm_gem_object 
*gem)
 static void drm_gem_vram_object_unpin(struct drm_gem_object *gem)
 {
struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
+   int ret;
 
-   drm_gem_vram_unpin(gbo);
+   ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
+   if (ret)
+   return;
+   drm_gem_vram_unpin_locked(gbo);
+   ttm_bo_unreserve(&gbo->bo);
 }
 
 /**
-- 
2.43.2



[PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Thomas Zimmermann
Dma-buf locking semantics require the caller of pin and unpin to hold
the buffer's reservation lock. Fix DRM to adhere to the specs. This
enables us to fix the locking in DRM's console emulation. Similar
changes for vmap and mmap have been posted at [1][2].

Most DRM drivers and memory managers acquire the buffer object's
reservation lock within their GEM pin and unpin callbacks. This
violates dma-buf locking semantics. We get away with it because PRIME
does not provide pin/unpin, but attach/detach, for which the locking
semantics are correct.

Patches 1 to 8 rework DRM GEM code in various implementations to
acquire the reservation lock when entering the pin and unpin callbacks.
This prepares them for the next patch. Drivers that are not affected
by these patches either don't acquire the reservation lock (amdgpu)
or don't need preparation (loongson).

Patch 9 moves reservation locking from the GEM pin/unpin callbacks
into drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
internally it still gets the reservation lock.

With the updated GEM callbacks, the rest of the patchset fixes the
fbdev emulation's buffer locking. Fbdev emulation needs to keep its
GEM buffer object in place while updating its content. This requires
implicit pinning, and apparently amdgpu didn't do this at all.

Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
The former function maps a GEM buffer into the kernel's address space
with regular vmap operations, but keeps holding the reservation lock.
The _vunmap_local() helper undoes the vmap and releases the lock. The
updated GEM callbacks make this possible. Between the two calls, the
fbdev emulation can update the buffer content without having the buffer
moved or evicted. Update fbdev-generic to use the vmap_local helpers,
which fixes amdgpu. The idea of adding a "local vmap" has previously
been attempted at [3] in a different form.

Patch 11 adds implicit pinning to the DRM client's regular vmap
helper so that long-term vmap'ed buffers won't be evicted. This only
affects fbdev-dma, but GEM DMA helpers don't require pinning. So
there are no practical changes.

Patches 12 and 13 remove implicit pinning from the vmap and vunmap
operations in gem-vram and qxl. These pin operations are not supposed
to be part of vmap code, but were required to keep the buffers in place
for fbdev emulation. With the conversion of fbdev-generic to the
vmap_local helpers, that code can finally be removed.

Tested with amdgpu, nouveau, radeon, simpledrm and vc4.

[1] https://patchwork.freedesktop.org/series/106371/
[2] https://patchwork.freedesktop.org/series/116001/
[3] https://patchwork.freedesktop.org/series/84732/

Thomas Zimmermann (13):
  drm/gem-shmem: Acquire reservation lock in GEM pin/unpin callbacks
  drm/gem-vram: Acquire reservation lock in GEM pin/unpin callbacks
  drm/msm: Provide msm_gem_get_pages_locked()
  drm/msm: Acquire reservation lock in GEM pin/unpin callback
  drm/nouveau: Provide nouveau_bo_{pin,unpin}_locked()
  drm/nouveau: Acquire reservation lock in GEM pin/unpin callbacks
  drm/qxl: Provide qxl_bo_{pin,unpin}_locked()
  drm/qxl: Acquire reservation lock in GEM pin/unpin callbacks
  drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()
  drm/fbdev-generic: Fix locking with drm_client_buffer_vmap_local()
  drm/client: Pin vmap'ed GEM buffers
  drm/gem-vram: Do not pin buffer objects for vmap
  drm/qxl: Do not pin buffer objects for vmap

 drivers/gpu/drm/drm_client.c|  92 ++---
 drivers/gpu/drm/drm_fbdev_generic.c |   4 +-
 drivers/gpu/drm/drm_gem.c   |  34 +++-
 drivers/gpu/drm/drm_gem_shmem_helper.c  |   6 +-
 drivers/gpu/drm/drm_gem_vram_helper.c   | 101 ++--
 drivers/gpu/drm/drm_internal.h  |   2 +
 drivers/gpu/drm/loongson/lsdc_gem.c |  13 +--
 drivers/gpu/drm/msm/msm_gem.c   |  20 ++---
 drivers/gpu/drm/msm/msm_gem.h   |   4 +-
 drivers/gpu/drm/msm/msm_gem_prime.c |  20 +++--
 drivers/gpu/drm/nouveau/nouveau_bo.c|  43 +++---
 drivers/gpu/drm/nouveau/nouveau_bo.h|   2 +
 drivers/gpu/drm/nouveau/nouveau_prime.c |   8 +-
 drivers/gpu/drm/qxl/qxl_object.c|  26 +++---
 drivers/gpu/drm/qxl/qxl_object.h|   2 +
 drivers/gpu/drm/qxl/qxl_prime.c |   4 +-
 drivers/gpu/drm/radeon/radeon_prime.c   |  11 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_gem.c |  25 ++
 include/drm/drm_client.h|  10 +++
 include/drm/drm_gem.h   |   3 +
 include/drm/drm_gem_shmem_helper.h  |   7 +-
 21 files changed, 265 insertions(+), 172 deletions(-)


base-commit: 7291e2e67dff0ff573900266382c9c9248a7dea5
prerequisite-patch-id: bdfa0e6341b30cc9d7647172760b3473007c1216
prerequisite-patch-id: bc27ac702099f481890ae2c7c4a9c531f4a62d64
prerequisite-patch-id: f5d4bf16dc45334254527c2e31ee21ba4582761c
prerequisite-patch-id: 734c87e610747779aa41be12eb9e4c984bdfa743
prerequisite-patch-id: 0aa359f6144c4015c140c8a6750be19099c676fb

[PATCH 03/13] drm/msm: Provide msm_gem_get_pages_locked()

2024-02-27 Thread Thomas Zimmermann
Rename msm_gem_pin_pages_locked() to msm_gem_get_pages_locked(). The
function doesn't pin any pages, but only acquires them. Renaming the
function makes the old name available.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/msm/msm_gem.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 175ee4ab8a6f7..bb729353d3a8d 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -219,7 +219,7 @@ static void put_pages(struct drm_gem_object *obj)
}
 }
 
-static struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj,
+static struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj,
  unsigned madv)
 {
struct msm_gem_object *msm_obj = to_msm_bo(obj);
@@ -262,7 +262,7 @@ struct page **msm_gem_pin_pages(struct drm_gem_object *obj)
struct page **p;
 
msm_gem_lock(obj);
-   p = msm_gem_pin_pages_locked(obj, MSM_MADV_WILLNEED);
+   p = msm_gem_get_pages_locked(obj, MSM_MADV_WILLNEED);
if (!IS_ERR(p))
pin_obj_locked(obj);
msm_gem_unlock(obj);
@@ -489,7 +489,7 @@ int msm_gem_pin_vma_locked(struct drm_gem_object *obj, 
struct msm_gem_vma *vma)
 
msm_gem_assert_locked(obj);
 
-   pages = msm_gem_pin_pages_locked(obj, MSM_MADV_WILLNEED);
+   pages = msm_gem_get_pages_locked(obj, MSM_MADV_WILLNEED);
if (IS_ERR(pages))
return PTR_ERR(pages);
 
@@ -703,7 +703,7 @@ static void *get_vaddr(struct drm_gem_object *obj, unsigned 
madv)
if (obj->import_attach)
return ERR_PTR(-ENODEV);
 
-   pages = msm_gem_pin_pages_locked(obj, madv);
+   pages = msm_gem_get_pages_locked(obj, madv);
if (IS_ERR(pages))
return ERR_CAST(pages);
 
-- 
2.43.2




[PATCH 04/13] drm/msm: Acquire reservation lock in GEM pin/unpin callback

2024-02-27 Thread Thomas Zimmermann
Export msm_gem_pin_pages_locked() and acquire the reservation lock
directly in GEM pin callback. Same for unpin. Prepares for further
changes.

Dma-buf locking semantics require callers to hold the buffer's
reservation lock when invoking the pin and unpin callbacks. Prepare
msm accordingly by pushing locking out of the implementation. A
follow-up patch will fix locking for all GEM code at once.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/msm/msm_gem.c   | 12 ++--
 drivers/gpu/drm/msm/msm_gem.h   |  4 ++--
 drivers/gpu/drm/msm/msm_gem_prime.c | 24 +++-
 3 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index bb729353d3a8d..a5c6498a43f06 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -257,24 +257,24 @@ static void pin_obj_locked(struct drm_gem_object *obj)
mutex_unlock(&priv->lru.lock);
 }
 
-struct page **msm_gem_pin_pages(struct drm_gem_object *obj)
+struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj)
 {
struct page **p;
 
-   msm_gem_lock(obj);
+   msm_gem_assert_locked(obj);
+
p = msm_gem_get_pages_locked(obj, MSM_MADV_WILLNEED);
if (!IS_ERR(p))
pin_obj_locked(obj);
-   msm_gem_unlock(obj);
 
return p;
 }
 
-void msm_gem_unpin_pages(struct drm_gem_object *obj)
+void msm_gem_unpin_pages_locked(struct drm_gem_object *obj)
 {
-   msm_gem_lock(obj);
+   msm_gem_assert_locked(obj);
+
msm_gem_unpin_locked(obj);
-   msm_gem_unlock(obj);
 }
 
 static pgprot_t msm_gem_pgprot(struct msm_gem_object *msm_obj, pgprot_t prot)
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 8d414b072c29d..85f0257e83dab 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -140,8 +140,8 @@ int msm_gem_get_and_pin_iova(struct drm_gem_object *obj,
 void msm_gem_unpin_iova(struct drm_gem_object *obj,
struct msm_gem_address_space *aspace);
 void msm_gem_pin_obj_locked(struct drm_gem_object *obj);
-struct page **msm_gem_pin_pages(struct drm_gem_object *obj);
-void msm_gem_unpin_pages(struct drm_gem_object *obj);
+struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj);
+void msm_gem_unpin_pages_locked(struct drm_gem_object *obj);
 int msm_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
struct drm_mode_create_dumb *args);
 int msm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c 
b/drivers/gpu/drm/msm/msm_gem_prime.c
index 0915f3b68752e..0d22df53ab98a 100644
--- a/drivers/gpu/drm/msm/msm_gem_prime.c
+++ b/drivers/gpu/drm/msm/msm_gem_prime.c
@@ -47,13 +47,27 @@ struct drm_gem_object *msm_gem_prime_import_sg_table(struct 
drm_device *dev,
 
 int msm_gem_prime_pin(struct drm_gem_object *obj)
 {
-   if (!obj->import_attach)
-   msm_gem_pin_pages(obj);
-   return 0;
+   struct page **pages;
+   int ret = 0;
+
+   if (obj->import_attach)
+   return 0;
+
+   msm_gem_lock(obj);
+   pages = msm_gem_pin_pages_locked(obj);
+   if (IS_ERR(pages))
+   ret = PTR_ERR(pages);
+   msm_gem_unlock(obj);
+
+   return ret;
 }
 
 void msm_gem_prime_unpin(struct drm_gem_object *obj)
 {
-   if (!obj->import_attach)
-   msm_gem_unpin_pages(obj);
+   if (obj->import_attach)
+   return;
+
+   msm_gem_lock(obj);
+   msm_gem_unpin_pages_locked(obj);
+   msm_gem_unlock(obj);
 }
-- 
2.43.2



[PATCH 06/13] drm/nouveau: Acquire reservation lock in GEM pin/unpin callbacks

2024-02-27 Thread Thomas Zimmermann
Acquire the reservation lock directly in GEM pin callback. Same for
unpin. Prepares for further changes.

Dma-buf locking semantics require callers to hold the buffer's
reservation lock when invoking the pin and unpin callbacks. Prepare
nouveau accordingly by pushing locking out of the implementation. A
follow-up patch will fix locking for all GEM code at once.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/nouveau/nouveau_prime.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c 
b/drivers/gpu/drm/nouveau/nouveau_prime.c
index 1b2ff0c40fc1c..774f9bd031102 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -86,21 +86,32 @@ struct drm_gem_object 
*nouveau_gem_prime_import_sg_table(struct drm_device *dev,
 int nouveau_gem_prime_pin(struct drm_gem_object *obj)
 {
struct nouveau_bo *nvbo = nouveau_gem_object(obj);
+   struct ttm_buffer_object *bo = &nvbo->bo;
int ret;
 
-   /* pin buffer into GTT */
-   ret = nouveau_bo_pin(nvbo, NOUVEAU_GEM_DOMAIN_GART, false);
+   ret = ttm_bo_reserve(bo, false, false, NULL);
if (ret)
return -EINVAL;
+   /* pin buffer into GTT */
+   ret = nouveau_bo_pin_locked(nvbo, NOUVEAU_GEM_DOMAIN_GART, false);
+   if (ret)
+   ret = -EINVAL;
+   ttm_bo_unreserve(bo);
 
-   return 0;
+   return ret;
 }
 
 void nouveau_gem_prime_unpin(struct drm_gem_object *obj)
 {
struct nouveau_bo *nvbo = nouveau_gem_object(obj);
+   struct ttm_buffer_object *bo = &nvbo->bo;
+   int ret;
 
-   nouveau_bo_unpin(nvbo);
+   ret = ttm_bo_reserve(bo, false, false, NULL);
+   if (ret)
+   return;
+   nouveau_bo_unpin_locked(nvbo);
+   ttm_bo_unreserve(bo);
 }
 
 struct dma_buf *nouveau_gem_prime_export(struct drm_gem_object *gobj,
-- 
2.43.2



[PATCH 05/13] drm/nouveau: Provide nouveau_bo_{pin,unpin}_locked()

2024-02-27 Thread Thomas Zimmermann
Implement pinning without locking in nouveau_bo_pin_locked(). Keep
nouveau_bo_pin() for acquiring the buffer object's reservation lock.
The new helper will be useful for implementing the GEM pin callback
with correct semantics. Same for unpin.
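
[Editor's note: a hedged sketch of how the two variants are meant to be
used; the caller code is illustrative, the signatures match the diff
below:

    /* Caller does not hold the reservation lock: use the locking wrapper. */
    ret = nouveau_bo_pin(nvbo, NOUVEAU_GEM_DOMAIN_GART, false);

    /* Caller already holds it, e.g. inside a GEM pin callback: */
    dma_resv_assert_held(nvbo->bo.base.resv);
    ret = nouveau_bo_pin_locked(nvbo, NOUVEAU_GEM_DOMAIN_GART, false);
]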

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 43 +++-
 drivers/gpu/drm/nouveau/nouveau_bo.h |  2 ++
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 56dcd25db1ce2..4a7c002a325a4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -467,17 +467,14 @@ nouveau_bo_placement_set(struct nouveau_bo *nvbo, 
uint32_t domain,
set_placement_range(nvbo, domain);
 }
 
-int
-nouveau_bo_pin(struct nouveau_bo *nvbo, uint32_t domain, bool contig)
+int nouveau_bo_pin_locked(struct nouveau_bo *nvbo, uint32_t domain, bool 
contig)
 {
struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
struct ttm_buffer_object *bo = &nvbo->bo;
bool force = false, evict = false;
-   int ret;
+   int ret = 0;
 
-   ret = ttm_bo_reserve(bo, false, false, NULL);
-   if (ret)
-   return ret;
+   dma_resv_assert_held(bo->base.resv);
 
if (drm->client.device.info.family >= NV_DEVICE_INFO_V0_TESLA &&
domain == NOUVEAU_GEM_DOMAIN_VRAM && contig) {
@@ -540,20 +537,15 @@ nouveau_bo_pin(struct nouveau_bo *nvbo, uint32_t domain, 
bool contig)
 out:
if (force && ret)
nvbo->contig = false;
-   ttm_bo_unreserve(bo);
return ret;
 }
 
-int
-nouveau_bo_unpin(struct nouveau_bo *nvbo)
+void nouveau_bo_unpin_locked(struct nouveau_bo *nvbo)
 {
struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
struct ttm_buffer_object *bo = &nvbo->bo;
-   int ret;
 
-   ret = ttm_bo_reserve(bo, false, false, NULL);
-   if (ret)
-   return ret;
+   dma_resv_assert_held(bo->base.resv);
 
ttm_bo_unpin(&nvbo->bo);
if (!nvbo->bo.pin_count) {
@@ -568,8 +560,33 @@ nouveau_bo_unpin(struct nouveau_bo *nvbo)
break;
}
}
+}
+
+int nouveau_bo_pin(struct nouveau_bo *nvbo, uint32_t domain, bool contig)
+{
+   struct ttm_buffer_object *bo = &nvbo->bo;
+   int ret;
 
+   ret = ttm_bo_reserve(bo, false, false, NULL);
+   if (ret)
+   return ret;
+   ret = nouveau_bo_pin_locked(nvbo, domain, contig);
+   ttm_bo_unreserve(bo);
+
+   return ret;
+}
+
+int nouveau_bo_unpin(struct nouveau_bo *nvbo)
+{
+   struct ttm_buffer_object *bo = &nvbo->bo;
+   int ret;
+
+   ret = ttm_bo_reserve(bo, false, false, NULL);
+   if (ret)
+   return ret;
+   nouveau_bo_unpin_locked(nvbo);
ttm_bo_unreserve(bo);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h 
b/drivers/gpu/drm/nouveau/nouveau_bo.h
index e9dfab6a81560..4e891752c2551 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.h
@@ -85,6 +85,8 @@ int  nouveau_bo_new(struct nouveau_cli *, u64 size, int 
align, u32 domain,
u32 tile_mode, u32 tile_flags, struct sg_table *sg,
struct dma_resv *robj,
struct nouveau_bo **);
+int  nouveau_bo_pin_locked(struct nouveau_bo *nvbo, uint32_t domain, bool 
contig);
+void nouveau_bo_unpin_locked(struct nouveau_bo *nvbo);
 int  nouveau_bo_pin(struct nouveau_bo *, u32 flags, bool contig);
 int  nouveau_bo_unpin(struct nouveau_bo *);
 int  nouveau_bo_map(struct nouveau_bo *);
-- 
2.43.2



[PATCH 08/13] drm/qxl: Acquire reservation lock in GEM pin/unpin callbacks

2024-02-27 Thread Thomas Zimmermann
Acquire the reservation lock directly in GEM pin callback. Same for
unpin. Prepares for further changes.

Dma-buf locking semantics require callers to hold the buffer's
reservation lock when invoking the pin and unpin callbacks. Prepare
qxl accordingly by pushing locking out of the implementation. A
follow-up patch will fix locking for all GEM code at once.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/qxl/qxl_prime.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
index 9169c26357d36..f2646603e12eb 100644
--- a/drivers/gpu/drm/qxl/qxl_prime.c
+++ b/drivers/gpu/drm/qxl/qxl_prime.c
@@ -31,15 +31,27 @@
 int qxl_gem_prime_pin(struct drm_gem_object *obj)
 {
struct qxl_bo *bo = gem_to_qxl_bo(obj);
+   int r;
 
-   return qxl_bo_pin(bo);
+   r = qxl_bo_reserve(bo);
+   if (r)
+   return r;
+   r = qxl_bo_pin_locked(bo);
+   qxl_bo_unreserve(bo);
+
+   return r;
 }
 
 void qxl_gem_prime_unpin(struct drm_gem_object *obj)
 {
struct qxl_bo *bo = gem_to_qxl_bo(obj);
+   int r;
 
-   qxl_bo_unpin(bo);
+   r = qxl_bo_reserve(bo);
+   if (r)
+   return;
+   qxl_bo_unpin_locked(bo);
+   qxl_bo_unreserve(bo);
 }
 
 struct sg_table *qxl_gem_prime_get_sg_table(struct drm_gem_object *obj)
-- 
2.43.2



[PATCH 10/13] drm/fbdev-generic: Fix locking with drm_client_buffer_vmap_local()

2024-02-27 Thread Thomas Zimmermann
Temporarily lock the fbdev buffer object during updates to prevent
memory managers from evicting/moving the buffer. Moving a buffer
object while updating its content results in undefined behaviour.

Fbdev-generic updates its buffer object from a shadow buffer. Gem-shmem
and gem-dma helpers do not move buffer objects, so they are safe to be
used with fbdev-generic. Gem-vram and qxl are based on TTM, but pin
buffer objects as part of the vmap operation. So both are also safe
to be used with fbdev-generic.

Amdgpu and nouveau do not pin or lock the buffer object during an
update. Their TTM-based memory management could move the buffer object
while the update is ongoing.

The new vmap_local and vunmap_local helpers hold the buffer object's
reservation lock during the buffer update. This prevents moving the
buffer object on all memory managers.
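
[Editor's note: a sketch of the intended call pattern during a screen
update, simplified from the fbdev-generic hunk below:

    struct iosys_map map, dst;

    ret = drm_client_buffer_vmap_local(buffer, &map);  /* takes the resv lock */
    if (ret)
            goto out;
    dst = map;
    /* ... blit the damaged region from the shadow buffer into dst ... */
    drm_client_buffer_vunmap_local(buffer);            /* drops the resv lock */
]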

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_client.c| 68 +
 drivers/gpu/drm/drm_fbdev_generic.c |  4 +-
 drivers/gpu/drm/drm_gem.c   | 12 +
 include/drm/drm_client.h| 10 +
 include/drm/drm_gem.h   |  3 ++
 5 files changed, 87 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index 9403b3f576f7b..2cc81831236b5 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -304,6 +304,66 @@ drm_client_buffer_create(struct drm_client_dev *client, 
u32 width, u32 height,
return ERR_PTR(ret);
 }
 
+/**
+ * drm_client_buffer_vmap_local - Map DRM client buffer into address space
+ * @buffer: DRM client buffer
+ * @map_copy: Returns the mapped memory's address
+ *
+ * This function maps a client buffer into kernel address space. If the
+ * buffer is already mapped, it returns the existing mapping's address.
+ *
+ * Client buffer mappings are not ref'counted. Each call to
+ * drm_client_buffer_vmap_local() should be closely followed by a call to
+ * drm_client_buffer_vunmap_local(). See drm_client_buffer_vmap() for
+ * long-term mappings.
+ *
+ * The returned address is a copy of the internal value. In contrast to
+ * other vmap interfaces, you don't need it for the client's vunmap
+ * function. So you can modify it at will during blit and draw operations.
+ *
+ * Returns:
+ * 0 on success, or a negative errno code otherwise.
+ */
+int drm_client_buffer_vmap_local(struct drm_client_buffer *buffer,
+struct iosys_map *map_copy)
+{
+   struct drm_gem_object *gem = buffer->gem;
+   struct iosys_map *map = &buffer->map;
+   int ret;
+
+   drm_gem_lock(gem);
+
+   ret = drm_gem_vmap(gem, map);
+   if (ret)
+   goto err_drm_gem_vmap_unlocked;
+   *map_copy = *map;
+
+   return 0;
+
+err_drm_gem_vmap_unlocked:
+   drm_gem_unlock(gem);
+   return ret;
+}
+EXPORT_SYMBOL(drm_client_buffer_vmap_local);
+
+/**
+ * drm_client_buffer_vunmap_local - Unmap DRM client buffer
+ * @buffer: DRM client buffer
+ *
+ * This function removes a client buffer's memory mapping established
+ * with drm_client_buffer_vmap_local(). Calling this function is only
+ * required by clients that manage their buffer mappings by themselves.
+ */
+void drm_client_buffer_vunmap_local(struct drm_client_buffer *buffer)
+{
+   struct drm_gem_object *gem = buffer->gem;
+   struct iosys_map *map = &buffer->map;
+
+   drm_gem_vunmap(gem, map);
+   drm_gem_unlock(gem);
+}
+EXPORT_SYMBOL(drm_client_buffer_vunmap_local);
+
 /**
  * drm_client_buffer_vmap - Map DRM client buffer into address space
  * @buffer: DRM client buffer
@@ -331,14 +391,6 @@ drm_client_buffer_vmap(struct drm_client_buffer *buffer,
struct iosys_map *map = &buffer->map;
int ret;
 
-   /*
-* FIXME: The dependency on GEM here isn't required, we could
-* convert the driver handle to a dma-buf instead and use the
-* backend-agnostic dma-buf vmap support instead. This would
-* require that the handle2fd prime ioctl is reworked to pull the
-* fd_install step out of the driver backend hooks, to make that
-* final step optional for internal users.
-*/
ret = drm_gem_vmap_unlocked(buffer->gem, map);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/drm_fbdev_generic.c 
b/drivers/gpu/drm/drm_fbdev_generic.c
index d647d89764cb9..be357f926faec 100644
--- a/drivers/gpu/drm/drm_fbdev_generic.c
+++ b/drivers/gpu/drm/drm_fbdev_generic.c
@@ -197,14 +197,14 @@ static int drm_fbdev_generic_damage_blit(struct 
drm_fb_helper *fb_helper,
 */
mutex_lock(&fb_helper->lock);
 
-   ret = drm_client_buffer_vmap(buffer, &map);
+   ret = drm_client_buffer_vmap_local(buffer, &map);
if (ret)
goto out;
 
dst = map;
drm_fbdev_generic_damage_blit_real(fb_helper, clip, &dst);
 
-   drm_client_buffer_vunmap(buffer);
+   drm_client_buffer_vunmap_local(buffer);

[PATCH 07/13] drm/qxl: Provide qxl_bo_{pin,unpin}_locked()

2024-02-27 Thread Thomas Zimmermann
Rename __qxl_bo_pin() to qxl_bo_pin_locked() and update all callers.
The function will be helpful for implementing the GEM pin callback
with correct semantics. Same for __qxl_bo_unpin().

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/qxl/qxl_object.c | 25 +
 drivers/gpu/drm/qxl/qxl_object.h |  2 ++
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_object.c b/drivers/gpu/drm/qxl/qxl_object.c
index 1e46b0a6e4787..39218e979a807 100644
--- a/drivers/gpu/drm/qxl/qxl_object.c
+++ b/drivers/gpu/drm/qxl/qxl_object.c
@@ -29,9 +29,6 @@
 #include "qxl_drv.h"
 #include "qxl_object.h"
 
-static int __qxl_bo_pin(struct qxl_bo *bo);
-static void __qxl_bo_unpin(struct qxl_bo *bo);
-
 static void qxl_ttm_bo_destroy(struct ttm_buffer_object *tbo)
 {
struct qxl_bo *bo;
@@ -167,13 +164,13 @@ int qxl_bo_vmap_locked(struct qxl_bo *bo, struct 
iosys_map *map)
goto out;
}
 
-   r = __qxl_bo_pin(bo);
+   r = qxl_bo_pin_locked(bo);
if (r)
return r;
 
r = ttm_bo_vmap(&bo->tbo, &bo->map);
if (r) {
-   __qxl_bo_unpin(bo);
+   qxl_bo_unpin_locked(bo);
return r;
}
bo->map_count = 1;
@@ -246,7 +243,7 @@ void qxl_bo_vunmap_locked(struct qxl_bo *bo)
return;
bo->kptr = NULL;
ttm_bo_vunmap(&bo->tbo, &bo->map);
-   __qxl_bo_unpin(bo);
+   qxl_bo_unpin_locked(bo);
 }
 
 int qxl_bo_vunmap(struct qxl_bo *bo)
@@ -290,12 +287,14 @@ struct qxl_bo *qxl_bo_ref(struct qxl_bo *bo)
return bo;
 }
 
-static int __qxl_bo_pin(struct qxl_bo *bo)
+int qxl_bo_pin_locked(struct qxl_bo *bo)
 {
struct ttm_operation_ctx ctx = { false, false };
struct drm_device *ddev = bo->tbo.base.dev;
int r;
 
+   dma_resv_assert_held(bo->tbo.base.resv);
+
if (bo->tbo.pin_count) {
ttm_bo_pin(&bo->tbo);
return 0;
@@ -309,14 +308,16 @@ static int __qxl_bo_pin(struct qxl_bo *bo)
return r;
 }
 
-static void __qxl_bo_unpin(struct qxl_bo *bo)
+void qxl_bo_unpin_locked(struct qxl_bo *bo)
 {
+   dma_resv_assert_held(bo->tbo.base.resv);
+
ttm_bo_unpin(&bo->tbo);
 }
 
 /*
  * Reserve the BO before pinning the object.  If the BO was reserved
- * beforehand, use the internal version directly __qxl_bo_pin.
+ * beforehand, use the internal version directly qxl_bo_pin_locked.
  *
  */
 int qxl_bo_pin(struct qxl_bo *bo)
@@ -327,14 +328,14 @@ int qxl_bo_pin(struct qxl_bo *bo)
if (r)
return r;
 
-   r = __qxl_bo_pin(bo);
+   r = qxl_bo_pin_locked(bo);
qxl_bo_unreserve(bo);
return r;
 }
 
 /*
  * Reserve the BO before pinning the object.  If the BO was reserved
- * beforehand, use the internal version directly __qxl_bo_unpin.
+ * beforehand, use the internal version directly qxl_bo_unpin_locked.
  *
  */
 int qxl_bo_unpin(struct qxl_bo *bo)
@@ -345,7 +346,7 @@ int qxl_bo_unpin(struct qxl_bo *bo)
if (r)
return r;
 
-   __qxl_bo_unpin(bo);
+   qxl_bo_unpin_locked(bo);
qxl_bo_unreserve(bo);
return 0;
 }
diff --git a/drivers/gpu/drm/qxl/qxl_object.h b/drivers/gpu/drm/qxl/qxl_object.h
index 53392cb90eecf..1cf5bc7591016 100644
--- a/drivers/gpu/drm/qxl/qxl_object.h
+++ b/drivers/gpu/drm/qxl/qxl_object.h
@@ -67,6 +67,8 @@ void *qxl_bo_kmap_atomic_page(struct qxl_device *qdev, struct 
qxl_bo *bo, int pa
 void qxl_bo_kunmap_atomic_page(struct qxl_device *qdev, struct qxl_bo *bo, 
void *map);
 extern struct qxl_bo *qxl_bo_ref(struct qxl_bo *bo);
 extern void qxl_bo_unref(struct qxl_bo **bo);
+extern int qxl_bo_pin_locked(struct qxl_bo *bo);
+extern void qxl_bo_unpin_locked(struct qxl_bo *bo);
 extern int qxl_bo_pin(struct qxl_bo *bo);
 extern int qxl_bo_unpin(struct qxl_bo *bo);
 extern void qxl_ttm_placement_from_domain(struct qxl_bo *qbo, u32 domain);
-- 
2.43.2




[PATCH 12/13] drm/gem-vram: Do not pin buffer objects for vmap

2024-02-27 Thread Thomas Zimmermann
Pin and vmap are distinct operations. Do not perform a pin as part
of the vmap call. This used to be necessary to keep the fbdev buffer
in place while it is being updated. Fbdev emulation has meanwhile
been fixed to lock the buffer correctly. Same for vunmap.

For refactoring the code, remove the pin calls from the helper's
vmap implementation in drm_gem_vram_vmap() and inline the call to
drm_gem_vram_kmap_locked(). This gives a vmap helper that only
maps the buffer object's memory pages without pinning or locking.
Do a similar refactoring for vunmap.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_gem_vram_helper.c | 90 ++-
 1 file changed, 32 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c 
b/drivers/gpu/drm/drm_gem_vram_helper.c
index 5a16b3e0a4134..45650b9b3de91 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -368,11 +368,28 @@ int drm_gem_vram_unpin(struct drm_gem_vram_object *gbo)
 }
 EXPORT_SYMBOL(drm_gem_vram_unpin);
 
-static int drm_gem_vram_kmap_locked(struct drm_gem_vram_object *gbo,
-   struct iosys_map *map)
+/**
+ * drm_gem_vram_vmap() - Pins and maps a GEM VRAM object into kernel address
+ *   space
+ * @gbo: The GEM VRAM object to map
+ * @map: Returns the kernel virtual address of the VRAM GEM object's backing
+ *   store.
+ *
+ * The vmap function pins a GEM VRAM object to its current location, either
+ * system or video memory, and maps its buffer into kernel address space.
+ * As pinned object cannot be relocated, you should avoid pinning objects
+ * permanently. Call drm_gem_vram_vunmap() with the returned address to
+ * unmap and unpin the GEM VRAM object.
+ *
+ * Returns:
+ * 0 on success, or a negative error code otherwise.
+ */
+int drm_gem_vram_vmap(struct drm_gem_vram_object *gbo, struct iosys_map *map)
 {
int ret;
 
+   dma_resv_assert_held(gbo->bo.base.resv);
+
if (gbo->vmap_use_count > 0)
goto out;
 
@@ -393,12 +410,23 @@ static int drm_gem_vram_kmap_locked(struct 
drm_gem_vram_object *gbo,
 
return 0;
 }
+EXPORT_SYMBOL(drm_gem_vram_vmap);
 
-static void drm_gem_vram_kunmap_locked(struct drm_gem_vram_object *gbo,
-  struct iosys_map *map)
+/**
+ * drm_gem_vram_vunmap() - Unmaps and unpins a GEM VRAM object
+ * @gbo: The GEM VRAM object to unmap
+ * @map: Kernel virtual address where the VRAM GEM object was mapped
+ *
+ * A call to drm_gem_vram_vunmap() unmaps and unpins a GEM VRAM buffer. See
+ * the documentation for drm_gem_vram_vmap() for more information.
+ */
+void drm_gem_vram_vunmap(struct drm_gem_vram_object *gbo,
+struct iosys_map *map)
 {
struct drm_device *dev = gbo->bo.base.dev;
 
+   dma_resv_assert_held(gbo->bo.base.resv);
+
if (drm_WARN_ON_ONCE(dev, !gbo->vmap_use_count))
return;
 
@@ -415,60 +443,6 @@ static void drm_gem_vram_kunmap_locked(struct 
drm_gem_vram_object *gbo,
 * from memory. See drm_gem_vram_bo_driver_move_notify().
 */
 }
-
-/**
- * drm_gem_vram_vmap() - Pins and maps a GEM VRAM object into kernel address
- *   space
- * @gbo: The GEM VRAM object to map
- * @map: Returns the kernel virtual address of the VRAM GEM object's backing
- *   store.
- *
- * The vmap function pins a GEM VRAM object to its current location, either
- * system or video memory, and maps its buffer into kernel address space.
- * As pinned object cannot be relocated, you should avoid pinning objects
- * permanently. Call drm_gem_vram_vunmap() with the returned address to
- * unmap and unpin the GEM VRAM object.
- *
- * Returns:
- * 0 on success, or a negative error code otherwise.
- */
-int drm_gem_vram_vmap(struct drm_gem_vram_object *gbo, struct iosys_map *map)
-{
-   int ret;
-
-   dma_resv_assert_held(gbo->bo.base.resv);
-
-   ret = drm_gem_vram_pin_locked(gbo, 0);
-   if (ret)
-   return ret;
-   ret = drm_gem_vram_kmap_locked(gbo, map);
-   if (ret)
-   goto err_drm_gem_vram_unpin_locked;
-
-   return 0;
-
-err_drm_gem_vram_unpin_locked:
-   drm_gem_vram_unpin_locked(gbo);
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_vram_vmap);
-
-/**
- * drm_gem_vram_vunmap() - Unmaps and unpins a GEM VRAM object
- * @gbo: The GEM VRAM object to unmap
- * @map: Kernel virtual address where the VRAM GEM object was mapped
- *
- * A call to drm_gem_vram_vunmap() unmaps and unpins a GEM VRAM buffer. See
- * the documentation for drm_gem_vram_vmap() for more information.
- */
-void drm_gem_vram_vunmap(struct drm_gem_vram_object *gbo,
-struct iosys_map *map)
-{
-   dma_resv_assert_held(gbo->bo.base.resv);
-
-   drm_gem_vram_kunmap_locked(gbo, map);
-   drm_gem_vram_unpin_locked(gbo);
-}
 EXPORT_SYMBOL(drm_gem_vram_vunmap);
 
 /**
-- 
2.43.2



[PATCH 09/13] drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()

2024-02-27 Thread Thomas Zimmermann
Acquire the buffer object's reservation lock in drm_gem_pin() and
remove locking the drivers' GEM callbacks where necessary. Same for
unpin().

DRM drivers and memory managers modified by this patch will now have
correct dma-buf locking semantics: the caller is responsible for
holding the reservation lock when calling the pin or unpin callback.

DRM drivers and memory managers that are not modified will now be
protected against concurrent invocation of their pin and unpin callbacks.

PRIME does not implement struct dma_buf_ops.pin, which requires
the caller to hold the reservation lock. It does implement struct
dma_buf_ops.attach, which requires the callee to acquire the
reservation lock. The PRIME code uses drm_gem_pin(), so locks
are now taken as specified. Same for unpin and detach.
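
[Editor's note: for reference, the dma-buf side of that contract,
sketched from the importer's point of view; dma_buf_pin() is the real
interface, the surrounding lines are illustrative:

    dma_resv_lock(attach->dmabuf->resv, NULL);
    ret = dma_buf_pin(attach);      /* may invoke the exporter's .pin */
    /* ... map and use the attachment ... */
    dma_resv_unlock(attach->dmabuf->resv);
]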

The patch harmonizes GEM pin and unpin to have non-interruptible
reservation locking across all drivers, as is already the case for
vmap and vunmap. This affects gem-shmem, gem-vram, loongson, qxl and
radeon.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_gem.c   | 22 --
 drivers/gpu/drm/drm_gem_vram_helper.c   | 15 +--
 drivers/gpu/drm/drm_internal.h  |  2 ++
 drivers/gpu/drm/loongson/lsdc_gem.c | 13 ++---
 drivers/gpu/drm/msm/msm_gem_prime.c |  4 
 drivers/gpu/drm/nouveau/nouveau_prime.c | 11 ---
 drivers/gpu/drm/qxl/qxl_prime.c | 14 +-
 drivers/gpu/drm/radeon/radeon_prime.c   | 11 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 25 ++---
 include/drm/drm_gem_shmem_helper.h  | 11 +--
 10 files changed, 33 insertions(+), 95 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 44a948b80ee14..e0f80c6a7096f 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1161,7 +1161,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned 
int indent,
obj->funcs->print_info(p, indent, obj);
 }
 
-int drm_gem_pin(struct drm_gem_object *obj)
+int drm_gem_pin_locked(struct drm_gem_object *obj)
 {
if (obj->funcs->pin)
return obj->funcs->pin(obj);
@@ -1169,12 +1169,30 @@ int drm_gem_pin(struct drm_gem_object *obj)
return 0;
 }
 
-void drm_gem_unpin(struct drm_gem_object *obj)
+void drm_gem_unpin_locked(struct drm_gem_object *obj)
 {
if (obj->funcs->unpin)
obj->funcs->unpin(obj);
 }
 
+int drm_gem_pin(struct drm_gem_object *obj)
+{
+   int ret;
+
+   dma_resv_lock(obj->resv, NULL);
+   ret = drm_gem_pin_locked(obj);
+   dma_resv_unlock(obj->resv);
+
+   return ret;
+}
+
+void drm_gem_unpin(struct drm_gem_object *obj)
+{
+   dma_resv_lock(obj->resv, NULL);
+   drm_gem_unpin_locked(obj);
+   dma_resv_unlock(obj->resv);
+}
+
 int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
 {
int ret;
diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c 
b/drivers/gpu/drm/drm_gem_vram_helper.c
index 15029d89badf8..5a16b3e0a4134 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -774,11 +774,6 @@ EXPORT_SYMBOL(drm_gem_vram_simple_display_pipe_cleanup_fb);
 static int drm_gem_vram_object_pin(struct drm_gem_object *gem)
 {
struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
-   int ret;
-
-   ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
-   if (ret)
-   return ret;
 
/*
 * Fbdev console emulation is the use case of these PRIME
@@ -789,10 +784,7 @@ static int drm_gem_vram_object_pin(struct drm_gem_object 
*gem)
 * the buffer to be pinned to VRAM, implement a callback that
 * sets the flags accordingly.
 */
-   ret = drm_gem_vram_pin_locked(gbo, 0);
-   ttm_bo_unreserve(&gbo->bo);
-
-   return ret;
+   return drm_gem_vram_pin_locked(gbo, 0);
 }
 
 /**
@@ -803,13 +795,8 @@ static int drm_gem_vram_object_pin(struct drm_gem_object 
*gem)
 static void drm_gem_vram_object_unpin(struct drm_gem_object *gem)
 {
struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
-   int ret;
 
-   ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
-   if (ret)
-   return;
drm_gem_vram_unpin_locked(gbo);
-   ttm_bo_unreserve(&gbo->bo);
 }
 
 /**
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 8e4faf0a28e6c..40b2d3a274d6c 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -170,6 +170,8 @@ void drm_gem_release(struct drm_device *dev, struct 
drm_file *file_private);
 void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
const struct drm_gem_object *obj);
 
+int drm_gem_pin_locked(struct drm_gem_object *obj);
+void drm_gem_unpin_locked(struct drm_gem_object *obj);
 int drm_gem_pin(struct drm_gem_object *obj);
 void drm_gem_unpin(struct drm_gem_object *obj);
 int

[PATCH 11/13] drm/client: Pin vmap'ed GEM buffers

2024-02-27 Thread Thomas Zimmermann
The function drm_client_buffer_vmap() establishes a long-term mapping
of the client's buffer object into the kernel address space. Make sure
that the buffer does not move by pinning it to its current location.
Same for vunmap with unpin.

The only caller of drm_client_buffer_vmap() is fbdev-dma, which uses
gem-dma. As DMA-backed GEM buffers do not move, this change is for
correctness with little impact in practice.
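
[Editor's note: from the client's side the change is invisible; a hedged
usage sketch, with helper signatures matching the diff below:

    struct iosys_map map;

    ret = drm_client_buffer_vmap(buffer, &map);  /* now also pins the buffer */
    if (ret)
            return ret;
    /* ... long-term use of map ... */
    drm_client_buffer_vunmap(buffer);            /* unmaps and unpins */
]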

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_client.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index 2cc81831236b5..77fe217aeaf36 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -388,16 +388,30 @@ int
 drm_client_buffer_vmap(struct drm_client_buffer *buffer,
   struct iosys_map *map_copy)
 {
+   struct drm_gem_object *gem = buffer->gem;
struct iosys_map *map = &buffer->map;
int ret;
 
-   ret = drm_gem_vmap_unlocked(buffer->gem, map);
+   drm_gem_lock(gem);
+
+   ret = drm_gem_pin_locked(gem);
if (ret)
-   return ret;
+   goto err_drm_gem_pin_locked;
+   ret = drm_gem_vmap(gem, map);
+   if (ret)
+   goto err_drm_gem_vmap;
+
+   drm_gem_unlock(gem);
 
*map_copy = *map;
 
return 0;
+
+err_drm_gem_vmap:
+   drm_gem_unpin_locked(buffer->gem);
+err_drm_gem_pin_locked:
+   drm_gem_unlock(gem);
+   return ret;
 }
 EXPORT_SYMBOL(drm_client_buffer_vmap);
 
@@ -411,9 +425,13 @@ EXPORT_SYMBOL(drm_client_buffer_vmap);
  */
 void drm_client_buffer_vunmap(struct drm_client_buffer *buffer)
 {
+   struct drm_gem_object *gem = buffer->gem;
struct iosys_map *map = &buffer->map;
 
-   drm_gem_vunmap_unlocked(buffer->gem, map);
+   drm_gem_lock(gem);
+   drm_gem_vunmap(gem, map);
+   drm_gem_unpin_locked(gem);
+   drm_gem_unlock(gem);
 }
 EXPORT_SYMBOL(drm_client_buffer_vunmap);
 
-- 
2.43.2



[PATCH 13/13] drm/qxl: Do not pin buffer objects for vmap

2024-02-27 Thread Thomas Zimmermann
Pin and vmap are distinct operations. Do not perform a pin as part
of the vmap call. This used to be necessary to keep the fbdev buffer
in place while it is being updated. Fbdev emulation has meanwhile
been fixed to lock the buffer correctly. Same for vunmap.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/qxl/qxl_object.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_object.c b/drivers/gpu/drm/qxl/qxl_object.c
index 39218e979a807..5893e27a7ae50 100644
--- a/drivers/gpu/drm/qxl/qxl_object.c
+++ b/drivers/gpu/drm/qxl/qxl_object.c
@@ -164,10 +164,6 @@ int qxl_bo_vmap_locked(struct qxl_bo *bo, struct iosys_map 
*map)
goto out;
}
 
-   r = qxl_bo_pin_locked(bo);
-   if (r)
-   return r;
-
r = ttm_bo_vmap(&bo->tbo, &bo->map);
if (r) {
qxl_bo_unpin_locked(bo);
@@ -243,7 +239,6 @@ void qxl_bo_vunmap_locked(struct qxl_bo *bo)
return;
bo->kptr = NULL;
ttm_bo_vunmap(&bo->tbo, &bo->map);
-   qxl_bo_unpin_locked(bo);
 }
 
 int qxl_bo_vunmap(struct qxl_bo *bo)
-- 
2.43.2



Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Christian König

Nice, looks totally valid to me.

Feel free to add to patches #2, #9, #10, #11 and #12:
Reviewed-by: Christian König 


And Acked-by: Christian König  to the rest.

Regards,
Christian.


Re: [PATCH] [v4][RFC] drm/nouveau: expose GSP-RM logging buffers via debugfs

2024-02-27 Thread Danilo Krummrich

Hi Timur,

thanks for re-working this patch!

On 2/26/24 22:02, Timur Tabi wrote:

The LOGINIT, LOGINTR, LOGRM, and LOGPMU buffers are circular buffers
that have printf-like logs from GSP-RM and PMU encoded in them.

LOGINIT, LOGINTR, and LOGRM are allocated by Nouveau and their DMA
addresses are passed to GSP-RM during initialization.  The buffers are
required for GSP-RM to initialize properly.

LOGPMU is also allocated by Nouveau, but its contents are updated
when Nouveau receives an NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPC from
GSP-RM.  Nouveau then copies the RPC to the buffer.

The messages are encoded as an array of variable-length structures that
contain the parameters to an NV_PRINTF call.  The format string and
parameter count are stored in a special ELF image that contains only
logging strings.  This image is not currently shipped with the Nvidia
driver.

There are two methods to extract the logs.

OpenRM tries to load the logging ELF, and if present, parses the log
buffers in real time and outputs the strings to the kernel console.

Alternatively, and this is the method used by this patch, the buffers
can be exposed to user space, and a user-space tool (along with the
logging ELF image) can parse the buffer and dump the logs.

This method has the advantage that it allows the buffers to be parsed
even when the logging ELF file is not available to the user.  However,
it has the disadvantage the debubfs entries need to remain until the


Typo in "debugfs".


driver is unloaded.

The buffers are exposed via debugfs.  The debugfs entries must be
created before GSP-RM is started, to ensure that they are available
during GSP-RM initialization.


Generally agree. Just wondering why you are pointing this out
explicitly.

What if we'd create them directly after GSP init? Is there anything
subtle to that?



If GSP-RM fails to initialize, then Nouveau immediately shuts down
the GSP interface.  This would normally also deallocate the logging
buffers, thereby preventing the user from capturing the debug logs.
To avoid this, use the keep-gsp-logging command line parameter.  If


"introduce the keep-gsp-logging command line parameter"


specified, and if at least one logging buffer has contents, then
Nouveau will migrate these buffers into new debugfs entries that are
retained until the driver unloads.
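
A sketch of how such a parameter is typically wired up (illustrative
only; the patch itself defines the real one):

	static bool keep_gsp_logging;
	module_param(keep_gsp_logging, bool, 0444);
	MODULE_PARM_DESC(keep_gsp_logging,
			 "Migrate GSP-RM logging buffers to retained debugfs entries");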

An end-user can capture the logs using the following commands:

 cp /sys/kernel/debug/dri/<id>/loginit loginit
 cp /sys/kernel/debug/dri/<id>/logrm logrm
 cp /sys/kernel/debug/dri/<id>/logintr logintr
 cp /sys/kernel/debug/dri/<id>/logpmu logpmu

where <id> is the PCI ID of the GPU (e.g. 0000:65:00.0).

Since LOGPMU is not needed for normal GSP-RM operation, it is only
created if debugfs is available.  Otherwise, the
NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPCs are ignored.


The commit message and some DOC comments contain trailing spaces
between words.




Signed-off-by: Timur Tabi 
---
  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |  25 ++
  drivers/gpu/drm/nouveau/nouveau_drm.c |  13 +
  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 340 +-
  3 files changed, 377 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h 
b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
index a9be0d86e412..e2f8ff58d408 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -5,6 +5,8 @@
  #include 
  #include 
  
+#include 

+
  #define GSP_PAGE_SHIFT 12
  #define GSP_PAGE_SIZE  BIT(GSP_PAGE_SHIFT)
  
@@ -27,6 +29,20 @@ typedef int (*nvkm_gsp_msg_ntfy_func)(void *priv, u32 fn, void *repv, u32 repc);

  struct nvkm_gsp_event;
  typedef void (*nvkm_gsp_event_func)(struct nvkm_gsp_event *, void *repv, u32 
repc);
  
+/**

+ * nvkm_gsp_log - structure for tracking GSP-RM logging buffers
+ * @head: list head
+ * @shutdown: pointer to function to call to clean up
+ *
+ * One of these is created for each GPU upon GSP shutdown if the
+ * "keep_gsp_logging" command-line parameter is specified.  This is used to
+ * track the alternative debugfs entries for the GSP-RM logs.
+ */
+struct nvkm_gsp_log {


Maybe better to name this structure nvif_log instead and move it to
include/nvif/log.h for consistency. It's not really limited to GSP,
although I don't see what else it could ever be used for.


+   struct list_head head;
+   void (*shutdown)(struct nvkm_gsp_log *);
+};
+
  struct nvkm_gsp {
const struct nvkm_gsp_func *func;
struct nvkm_subdev subdev;
@@ -218,6 +234,15 @@ struct nvkm_gsp {
  
  	/* The size of the registry RPC */

size_t registry_rpc_size;
+
+   /*
+* Logging buffers in debugfs.  The wrapper objects need to remain
+* in memory until the dentry is deleted.
+*/
+   struct debugfs_blob_wrapper blob_init;
+   struct debugfs_blob_wrapper blob_intr;
+   struct debugfs_blob_wrapper blob_rm;
+   struct debugfs_blob_wrapper blob_pmu;

Re: [PATCH] drm/nouveau: retain device pointer in nvkm_gsp_mem object

2024-02-27 Thread Danilo Krummrich

Hi Timur,

On 2/26/24 22:04, Timur Tabi wrote:

Store the struct device pointer used to allocate the DMA buffer in
the nvkm_gsp_mem object.  This allows nvkm_gsp_mem_dtor() to release
the buffer without needing the nvkm_gsp.  This is needed so that
we can retain DMA buffers even after the nvkm_gsp object is deleted.


Considering "[v4][RFC] drm/nouveau: expose GSP-RM logging buffers via
debugfs", I don't see where this one is needed.

Do I miss anything?

- Danilo



Signed-off-by: Timur Tabi 
---
  .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |  1 +
  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c| 30 ++-
  2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h 
b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
index 3fbc57b16a05..a9be0d86e412 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -9,6 +9,7 @@
  #define GSP_PAGE_SIZE  BIT(GSP_PAGE_SHIFT)
  
  struct nvkm_gsp_mem {

+   struct device *dev;
size_t size;
void *data;
dma_addr_t addr;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 40757a21e150..1eb1bc5df39a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1000,7 +1000,7 @@ r535_gsp_rpc_get_gsp_static_info(struct nvkm_gsp *gsp)
  }
  
  static void

-nvkm_gsp_mem_dtor(struct nvkm_gsp *gsp, struct nvkm_gsp_mem *mem)
+nvkm_gsp_mem_dtor(struct nvkm_gsp_mem *mem)
  {
if (mem->data) {
/*
@@ -1009,7 +1009,7 @@ nvkm_gsp_mem_dtor(struct nvkm_gsp *gsp, struct 
nvkm_gsp_mem *mem)
 */
memset(mem->data, 0xFF, mem->size);
  
-		dma_free_coherent(gsp->subdev.device->dev, mem->size, mem->data, mem->addr);

+   dma_free_coherent(mem->dev, mem->size, mem->data, mem->addr);
memset(mem, 0, sizeof(*mem));
}
  }
@@ -1017,11 +1017,13 @@ nvkm_gsp_mem_dtor(struct nvkm_gsp *gsp, struct 
nvkm_gsp_mem *mem)
  static int
  nvkm_gsp_mem_ctor(struct nvkm_gsp *gsp, size_t size, struct nvkm_gsp_mem *mem)
  {
-   mem->size = size;
mem->data = dma_alloc_coherent(gsp->subdev.device->dev, size, 
&mem->addr, GFP_KERNEL);
if (WARN_ON(!mem->data))
return -ENOMEM;
  
+	mem->size = size;

+   mem->dev = gsp->subdev.device->dev;
+
return 0;
  }
  
@@ -1054,10 +1056,10 @@ r535_gsp_postinit(struct nvkm_gsp *gsp)

nvkm_wr32(device, 0x110004, 0x0040);
  
  	/* Release the DMA buffers that were needed only for boot and init */

-   nvkm_gsp_mem_dtor(gsp, &gsp->boot.fw);
-   nvkm_gsp_mem_dtor(gsp, &gsp->libos);
-   nvkm_gsp_mem_dtor(gsp, &gsp->rmargs);
-   nvkm_gsp_mem_dtor(gsp, &gsp->wpr_meta);
+   nvkm_gsp_mem_dtor(&gsp->boot.fw);
+   nvkm_gsp_mem_dtor(&gsp->libos);
+   nvkm_gsp_mem_dtor(&gsp->rmargs);
+   nvkm_gsp_mem_dtor(&gsp->wpr_meta);
  
  	return ret;

  }
@@ -2249,7 +2251,7 @@ static void
  nvkm_gsp_radix3_dtor(struct nvkm_gsp *gsp, struct nvkm_gsp_radix3 *rx3)
  {
for (int i = ARRAY_SIZE(rx3->mem) - 1; i >= 0; i--)
-   nvkm_gsp_mem_dtor(gsp, &rx3->mem[i]);
+   nvkm_gsp_mem_dtor(&rx3->mem[i]);
  }
  
  /**

@@ -2407,7 +2409,7 @@ r535_gsp_init(struct nvkm_gsp *gsp)
  
  done:

if (gsp->sr.meta.data) {
-   nvkm_gsp_mem_dtor(gsp, &gsp->sr.meta);
+   nvkm_gsp_mem_dtor(&gsp->sr.meta);
nvkm_gsp_radix3_dtor(gsp, &gsp->sr.radix3);
nvkm_gsp_sg_free(gsp->subdev.device, &gsp->sr.sgt);
return ret;
@@ -2488,7 +2490,7 @@ r535_gsp_dtor(struct nvkm_gsp *gsp)
mutex_destroy(&gsp->client_id.mutex);
  
  	nvkm_gsp_radix3_dtor(gsp, &gsp->radix3);

-   nvkm_gsp_mem_dtor(gsp, &gsp->sig);
+   nvkm_gsp_mem_dtor(&gsp->sig);
nvkm_firmware_dtor(&gsp->fw);
  
  	nvkm_falcon_fw_dtor(&gsp->booter.unload);

@@ -2499,10 +2501,10 @@ r535_gsp_dtor(struct nvkm_gsp *gsp)
  
  	r535_gsp_dtor_fws(gsp);
  
-	nvkm_gsp_mem_dtor(gsp, &gsp->shm.mem);

-   nvkm_gsp_mem_dtor(gsp, &gsp->loginit);
-   nvkm_gsp_mem_dtor(gsp, &gsp->logintr);
-   nvkm_gsp_mem_dtor(gsp, &gsp->logrm);
+   nvkm_gsp_mem_dtor(&gsp->shm.mem);
+   nvkm_gsp_mem_dtor(&gsp->loginit);
+   nvkm_gsp_mem_dtor(&gsp->logintr);
+   nvkm_gsp_mem_dtor(&gsp->logrm);
  }
  
  int




Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Thomas Zimmermann

Hi

On 27.02.24 at 15:03, Christian König wrote:

Nice, looks totally valid to me.

Feel free to add to patch #2, #9, #10, #11 and #12 Reviewed-by: 
Christian König 


And Acked-by: Christian König  to the rest.


Oh, wow. That was quick! Thanks a lot.

Best regards
Thomas



Regards,
Christian.

On 27.02.24 at 11:14, Thomas Zimmermann wrote:

Dma-buf locking semantics require the caller of pin and unpin to hold
the buffer's reservation lock. Fix DRM to adhere to the specs. This
enables us to fix the locking in DRM's console emulation. Similar changes
for vmap and mmap have been posted at [1][2].

Most DRM drivers and memory managers acquire the buffer object's
reservation lock within their GEM pin and unpin callbacks. This
violates dma-buf locking semantics. We get away with it because PRIME
does not provide pin/unpin, but attach/detach, for which the locking
semantics is correct.

Patches 1 to 8 rework DRM GEM code in various implementations to
acquire the reservation lock when entering the pin and unpin callbacks.
This prepares them for the next patch. Drivers that are not affected
by these patches either don't acquire the reservation lock (amdgpu)
or don't need preparation (loongson).

Patch 9 moves reservation locking from the GEM pin/unpin callbacks
into drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
internally it still gets the reservation lock.

With the updated GEM callbacks, the rest of the patchset fixes the
fbdev emulation's buffer locking. Fbdev emulation needs to keep its
GEM buffer object in place while updating its content. This requires
an implicit pinning, and apparently amdgpu didn't do this at all.

Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
The former function maps a GEM buffer into the kernel's address space
with regular vmap operations, but keeps holding the reservation lock.
The _vunmap_local() helper undoes the vmap and releases the lock. The
updated GEM callbacks make this possible. Between the two calls, the
fbdev emulation can update the buffer content without having the buffer
moved or evicted. Update fbdev-generic to use the vmap_local helpers,
which fixes amdgpu. The idea of adding a "local vmap" has previously been
attempted at [3] in a different form.

Patch 11 adds implicit pinning to the DRM client's regular vmap
helper so that long-term vmap'ed buffers won't be evicted. This only
affects fbdev-dma, but GEM DMA helpers don't require pinning. So
there are no practical changes.

Patches 12 and 13 remove implicit pinning from the vmap and vunmap
operations in gem-vram and qxl. These pin operations are not supposed
to be part of vmap code, but were required to keep the buffers in place
for fbdev emulation. With the conversion of fbdev-generic to
vmap_local helpers, that code can finally be removed.

Tested with amdgpu, nouveau, radeon, simpledrm and vc4.

[1] https://patchwork.freedesktop.org/series/106371/
[2] https://patchwork.freedesktop.org/series/116001/
[3] https://patchwork.freedesktop.org/series/84732/

Thomas Zimmermann (13):
   drm/gem-shmem: Acquire reservation lock in GEM pin/unpin callbacks
   drm/gem-vram: Acquire reservation lock in GEM pin/unpin callbacks
   drm/msm: Provide msm_gem_get_pages_locked()
   drm/msm: Acquire reservation lock in GEM pin/unpin callback
   drm/nouveau: Provide nouveau_bo_{pin,unpin}_locked()
   drm/nouveau: Acquire reservation lock in GEM pin/unpin callbacks
   drm/qxl: Provide qxl_bo_{pin,unpin}_locked()
   drm/qxl: Acquire reservation lock in GEM pin/unpin callbacks
   drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()
   drm/fbdev-generic: Fix locking with drm_client_buffer_vmap_local()
   drm/client: Pin vmap'ed GEM buffers
   drm/gem-vram: Do not pin buffer objects for vmap
   drm/qxl: Do not pin buffer objects for vmap

  drivers/gpu/drm/drm_client.c    |  92 ++---
  drivers/gpu/drm/drm_fbdev_generic.c |   4 +-
  drivers/gpu/drm/drm_gem.c   |  34 +++-
  drivers/gpu/drm/drm_gem_shmem_helper.c  |   6 +-
  drivers/gpu/drm/drm_gem_vram_helper.c   | 101 ++--
  drivers/gpu/drm/drm_internal.h  |   2 +
  drivers/gpu/drm/loongson/lsdc_gem.c |  13 +--
  drivers/gpu/drm/msm/msm_gem.c   |  20 ++---
  drivers/gpu/drm/msm/msm_gem.h   |   4 +-
  drivers/gpu/drm/msm/msm_gem_prime.c |  20 +++--
  drivers/gpu/drm/nouveau/nouveau_bo.c    |  43 +++---
  drivers/gpu/drm/nouveau/nouveau_bo.h    |   2 +
  drivers/gpu/drm/nouveau/nouveau_prime.c |   8 +-
  drivers/gpu/drm/qxl/qxl_object.c    |  26 +++---
  drivers/gpu/drm/qxl/qxl_object.h    |   2 +
  drivers/gpu/drm/qxl/qxl_prime.c |   4 +-
  drivers/gpu/drm/radeon/radeon_prime.c   |  11 ---
  drivers/gpu/drm/vmwgfx/vmwgfx_gem.c |  25 ++
  include/drm/drm_client.h    |  10 +++
  include/drm/drm_gem.h   |   3 +
  include/drm/drm_gem_shmem_helper.h  |   7 +-
  21 files changed, 265 insertions(+), 172 dele

Re: [PATCH] drm/nouveau: retain device pointer in nvkm_gsp_mem object

2024-02-27 Thread Timur Tabi
On Tue, 2024-02-27 at 15:23 +0100, Danilo Krummrich wrote:
> > Store the struct device pointer used to allocate the DMA buffer in
> > the nvkm_gsp_mem object.  This allows nvkm_gsp_mem_dtor() to release
> > the buffer without needing the nvkm_gsp.  This is needed so that
> > we can retain DMA buffers even after the nvkm_gsp object is deleted.
> 
> Considering "[v4][RFC] drm/nouveau: expose GSP-RM logging buffers via
> debugfs", I don't see where this one is needed.
> 
> Do I miss anything?

No, I should have deleted that last sentence before sending it out.  I wrote
this patch for an earlier version of my debugfs patch, but then I changed
the design to not need it.  However, I still think this is a good patch to
include.  

I will revise the commit message and post a v2 when I post the next version
of my debugfs patch.



Re: [PATCH] [v4][RFC] drm/nouveau: expose GSP-RM logging buffers via debugfs

2024-02-27 Thread Timur Tabi
On Tue, 2024-02-27 at 15:20 +0100, Danilo Krummrich wrote:
> Hi Timur,
> 
> thanks for re-working this patch!
> 
> On 2/26/24 22:02, Timur Tabi wrote:
> > The LOGINIT, LOGINTR, LOGRM, and LOGPMU buffers are circular buffers
> > that have printf-like logs from GSP-RM and PMU encoded in them.
> > 
> > LOGINIT, LOGINTR, and LOGRM are allocated by Nouveau and their DMA
> > addresses are passed to GSP-RM during initialization.  The buffers are
> > required for GSP-RM to initialize properly.
> > 
> > LOGPMU is also allocated by Nouveau, but its contents are updated
> > when Nouveau receives an NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPC from
> > GSP-RM.  Nouveau then copies the RPC to the buffer.
> > 
> > The messages are encoded as an array of variable-length structures that
> > contain the parameters to an NV_PRINTF call.  The format string and
> > parameter count are stored in a special ELF image that contains only
> > logging strings.  This image is not currently shipped with the Nvidia
> > driver.
> > 
> > There are two methods to extract the logs.
> > 
> > OpenRM tries to load the logging ELF, and if present, parses the log
> > buffers in real time and outputs the strings to the kernel console.
> > 
> > Alternatively, and this is the method used by this patch, the buffers
> > can be exposed to user space, and a user-space tool (along with the
> > logging ELF image) can parse the buffer and dump the logs.
> > 
> > This method has the advantage that it allows the buffers to be parsed
> > even when the logging ELF file is not available to the user.  However,
> > it has the disadvantage the debubfs entries need to remain until the
> 
> Typo in "debugfs".

Thanks.

> 
> > driver is unloaded.
> > 
> > The buffers are exposed via debugfs.  The debugfs entries must be
> > created before GSP-RM is started, to ensure that they are available
> > during GSP-RM initialization.
> 
> Generally agree. Just wondering why you are pointing this out
> explicitly.

There is so much about GSP-RM initialization that is undocumented that I
just like documenting as much as I can.  

Now that I think about it, that statement isn't technically true.  If you don't
have debugfs at all, GSP-RM will still boot.  I'll delete it.

> What if we'd create them directly after GSP init? Is there anything
> subtle to that?

No.  I suppose we could technically create them later; I just don't see an
advantage to that.

> > If GSP-RM fails to initialize, then Nouveau immediately shuts down
> > the GSP interface.  This would normally also deallocate the logging
> > buffers, thereby preventing the user from capturing the debug logs.
> > To avoid this, use the keep-gsp-logging command line parameter.  If
> 
> "introduce the keep-gsp-logging command line parameter"

Ok.

> > specified, and if at least one logging buffer has contents, then
> > Nouveau will migrate these buffers into new debugfs entries that are
> > retained until the driver unloads.
> > 
> > An end-user can capture the logs using the following commands:
> > 
> >  cp /sys/kernel/debug/dri/<id>/loginit loginit
> >  cp /sys/kernel/debug/dri/<id>/logrm logrm
> >  cp /sys/kernel/debug/dri/<id>/logintr logintr
> >  cp /sys/kernel/debug/dri/<id>/logpmu logpmu
> > 
> > where <id> is the PCI ID of the GPU (e.g. 0000:65:00.0).
> > 
> > Since LOGPMU is not needed for normal GSP-RM operation, it is only
> > created if debugfs is available.  Otherwise, the
> > NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT RPCs are ignored.
> 
> The commit message and some DOC comments contain trailing spaces
> between words.

Ok, I'll review and fix.

> >   
> > +/**
> > + * nvkm_gsp_log - structure for tracking GSP-RM logging buffers
> > + * @head: list head
> > + * @shutdown: pointer to function to call to clean up
> > + *
> > + * One of these is created for each GPU upon GSP shutdown if the
> > + * "keep_gsp_logging" command-line parameter is specified.  This is used to
> > + * track the alternative debugfs entries for the GSP-RM logs.
> > + */
> > +struct nvkm_gsp_log {
> 
> Maybe better to name this structure nvif_log instead and move it to
> include/nvif/log.h for consistency. It's not really limited to GSP,
> although I don't see what else it could ever be used for.

Well, that's why I named it that.  But technically it's not even limited to
logging.  It would be nice if the kernel had a general mechanism for
callbacks on driver exit, like it does for system shutdown.

I suppose I could create a notifier chain instead.  How about that?
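
For reference, a hypothetical sketch of the notifier-chain variant
(all names illustrative; log->nb would replace the shutdown pointer):

	static BLOCKING_NOTIFIER_HEAD(gsp_log_notifiers);

	static int gsp_log_shutdown(struct notifier_block *nb,
				    unsigned long action, void *data)
	{
		/* clean up the debugfs blobs tracked via container_of(nb, ...) */
		return NOTIFY_OK;
	}

	/* on GSP shutdown: blocking_notifier_chain_register(&gsp_log_notifiers, &log->nb); */
	/* on driver unload: blocking_notifier_call_chain(&gsp_log_notifiers, 0, NULL); */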

> > +   dir_init = debugfs_create_blob("loginit", 0444, dir, &gsp->blob_init);
> > +   if (IS_ERR(dir_init)) {
> > +   nvkm_error(&gsp->subdev, "failed to create loginit debugfs 
> > entry\n");
> > +   goto error;
> 
> If we hit this, dir_intr and dir_rm are uninitialized.

Thanks!
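
For reference, one way to keep that error path well-defined is to
initialize all dentries and unwind explicitly; debugfs_remove()
accepts both NULL and error pointers (a sketch, reusing the names
from the patch):

	struct dentry *dir_init = NULL, *dir_intr = NULL, *dir_rm = NULL;

	dir_init = debugfs_create_blob("loginit", 0444, dir, &gsp->blob_init);
	if (IS_ERR(dir_init))
		goto error;

	dir_intr = debugfs_create_blob("logintr", 0444, dir, &gsp->blob_intr);
	if (IS_ERR(dir_intr))
		goto error;

	dir_rm = debugfs_create_blob("logrm", 0444, dir, &gsp->blob_rm);
	if (IS_ERR(dir_rm))
		goto error;

	return;

	error:
		debugfs_remove(dir_rm);
		debugfs_remove(dir_intr);
		debugfs_remove(dir_init);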

> >   
> > +#ifdef CONFIG_DEBUG_FS
> > +
> > +/*
> > + * If GSP-RM load fails, then the GSP nvkm object will be deleted, the
> > + * logging debugfs entries will be deleted, and it will not be possi

Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Dmitry Osipenko
Hello,

Thank you for the patches!

On 2/27/24 13:14, Thomas Zimmermann wrote:
> Dma-buf locking semantics require the caller of pin and unpin to hold
> the buffer's reservation lock. Fix DRM to adhere to the specs. This
> enables us to fix the locking in DRM's console emulation. Similar changes
> for vmap and mmap have been posted at [1][2].
> 
> Most DRM drivers and memory managers acquire the buffer object's
> reservation lock within their GEM pin and unpin callbacks. This
> violates dma-buf locking semantics. We get away with it because PRIME
> does not provide pin/unpin, but attach/detach, for which the locking
> semantics is correct.
> 
> Patches 1 to 8 rework DRM GEM code in various implementations to
> acquire the reservation lock when entering the pin and unpin callbacks.
> This prepares them for the next patch. Drivers that are not affected
> by these patches either don't acquire the reservation lock (amdgpu)
> or don't need preparation (loongson).
> 
> Patch 9 moves reservation locking from the GEM pin/unpin callbacks
> into drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
> internally it still gets the reservation lock.
> 
> With the updated GEM callbacks, the rest of the patchset fixes the
> fbdev emulation's buffer locking. Fbdev emulation needs to keep its
> GEM buffer object in place while updating its content. This requires
> an implicit pinning, and apparently amdgpu didn't do this at all.
> 
> Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
> The former function maps a GEM buffer into the kernel's address space
> with regular vmap operations, but keeps holding the reservation lock.
> The _vunmap_local() helper undoes the vmap and releases the lock. The
> updated GEM callbacks make this possible. Between the two calls, the
> fbdev emulation can update the buffer content without having the buffer
> moved or evicted. Update fbdev-generic to use the vmap_local helpers,
> which fixes amdgpu. The idea of adding a "local vmap" has previously been
> attempted at [3] in a different form.
> 
> Patch 11 adds implicit pinning to the DRM client's regular vmap
> helper so that long-term vmap'ed buffers won't be evicted. This only
> affects fbdev-dma, but GEM DMA helpers don't require pinning. So
> there are no practical changes.
> 
> Patches 12 and 13 remove implicit pinning from the vmap and vunmap
> operations in gem-vram and qxl. These pin operations are not supposed
> to be part of vmap code, but were required to keep the buffers in place
> for fbdev emulation. With the conversion of fbdev-generic to
> vmap_local helpers, that code can finally be removed.

Isn't it a common behaviour for all DRM drivers to implicitly pin a BO
while it's vmapped? I was sure it should be common /o\

Why would you want to kmap a BO that isn't pinned?

Shouldn't TTM's vmap() be changed to do the pinning?

I missed that TTM doesn't pin the BO on vmap() and am now surprised to see it.
It should be a rather serious problem requiring backporting of the
fixes, but I don't see the fixes tags on the patches (?)

-- 
Best regards,
Dmitry



Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Christian König

On 27.02.24 at 19:14, Dmitry Osipenko wrote:

Hello,

Thank you for the patches!

On 2/27/24 13:14, Thomas Zimmermann wrote:

Dma-buf locking semantics require the caller of pin and unpin to hold
the buffer's reservation lock. Fix DRM to adhere to the specs. This
enables us to fix the locking in DRM's console emulation. Similar changes
for vmap and mmap have been posted at [1][2].

Most DRM drivers and memory managers acquire the buffer object's
reservation lock within their GEM pin and unpin callbacks. This
violates dma-buf locking semantics. We get away with it because PRIME
does not provide pin/unpin, but attach/detach, for which the locking
semantics is correct.

Patches 1 to 8 rework DRM GEM code in various implementations to
acquire the reservation lock when entering the pin and unpin callbacks.
This prepares them for the next patch. Drivers that are not affected
by these patches either don't acquire the reservation lock (amdgpu)
or don't need preparation (loongson).

Patch 9 moves reservation locking from the GEM pin/unpin callbacks
into drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
internally it still gets the reservation lock.

With the updated GEM callbacks, the rest of the patchset fixes the
fbdev emulation's buffer locking. Fbdev emulation needs to keep its
GEM buffer object in place while updating its content. This requires
an implicit pinning, and apparently amdgpu didn't do this at all.

Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
The former function maps a GEM buffer into the kernel's address space
with regular vmap operations, but keeps holding the reservation lock.
The _vunmap_local() helper undoes the vmap and releases the lock. The
updated GEM callbacks make this possible. Between the two calls, the
fbdev emulation can update the buffer content without having the buffer
moved or evicted. Update fbdev-generic to use the vmap_local helpers,
which fixes amdgpu. The idea of adding a "local vmap" has previously been
attempted at [3] in a different form.

Patch 11 adds implicit pinning to the DRM client's regular vmap
helper so that long-term vmap'ed buffers won't be evicted. This only
affects fbdev-dma, but GEM DMA helpers don't require pinning. So
there are no practical changes.

Patches 12 and 13 remove implicit pinning from the vmap and vunmap
operations in gem-vram and qxl. These pin operations are not supposed
to be part of vmap code, but were required to keep the buffers in place
for fbdev emulation. With the conversion of fbdev-generic to
vmap_local helpers, that code can finally be removed.

Isn't it a common behaviour for all DRM drivers to implicitly pin a BO
while it's vmapped? I was sure it should be common /o\


No, at least amdgpu and radeon don't pin kmapped BOs, and I don't think
nouveau does either.



Why would you want to kmap a BO that isn't pinned?


The usual use case is to call the ttm kmap function when you need CPU 
access.


When the buffer hasn't moved, we can use the cached CPU mapping; if the
buffer has moved since the last time, or this is the first time it is
called, we set up a new mapping.
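
For illustration, the usual pattern looks roughly like this (a
sketch; the caller must already hold the BO's reservation lock):

	struct ttm_bo_kmap_obj map;
	bool is_iomem;
	void *ptr;
	int ret;

	ret = ttm_bo_kmap(bo, 0, PFN_UP(size), &map);
	if (ret)
		return ret;

	ptr = ttm_bo_kmap_obj_virtual(&map, &is_iomem);
	/* CPU access through ptr; no pin needed. If the BO moved since
	 * the last call, ttm_bo_kmap() set up a fresh mapping above. */
	ttm_bo_kunmap(&map);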



Shouldn't TTM's vmap() be changed to do the pinning?


Absolutely not, no. That would break tons of use cases.

Regards,
Christian.



I missed that TTM doesn't pin the BO on vmap() and am now surprised to see it.
It should be a rather serious problem requiring backporting of the
fixes, but I don't see the fixes tags on the patches (?)






[linux-next:master] BUILD REGRESSION 22ba90670a51a18c6b36d285fddf92b9887c0bc3

2024-02-27 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 22ba90670a51a18c6b36d285fddf92b9887c0bc3  Add linux-next specific 
files for 20240227

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202402271629.7zzu8scf-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

WARNING: modpost: vmlinux: section mismatch in reference: virtio_fs_init+0x8c 
(section: .init.text) -> map_benchmark_cleanup (section: .exit.text)
WARNING: modpost: vmlinux: section mismatch in reference: virtio_fs_init+0xdc 
(section: .init.text) -> sq_api_exit (section: .exit.text)

Unverified Error/Warning (likely false positive, please contact us if 
interested):

{standard input}:1474: Error: unknown pseudo-op: `.l141'

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- arc-randconfig-001-20240227
|   `-- 
fs-ubifs-journal.c:warning:expecting-prototype-for-wake_up_reservation().-Prototype-was-for-add_or_start_queue()-instead
|-- arm-randconfig-002-20240227
|   `-- 
fs-ubifs-journal.c:warning:expecting-prototype-for-wake_up_reservation().-Prototype-was-for-add_or_start_queue()-instead
|-- arm64-defconfig
|   |-- 
drivers-gpu-drm-nouveau-nvkm-subdev-gsp-r535.c:warning:Function-parameter-or-struct-member-gsp-not-described-in-nvkm_gsp_radix3_sg
|   `-- 
fs-ubifs-journal.c:warning:expecting-prototype-for-wake_up_reservation().-Prototype-was-for-add_or_start_queue()-instead
|-- i386-buildonly-randconfig-002-20240227
|   `-- 
drivers-gpu-drm-nouveau-nvkm-subdev-gsp-r535.c:warning:Function-parameter-or-struct-member-gsp-not-described-in-nvkm_gsp_radix3_sg
|-- parisc-defconfig
|   `-- 
drivers-gpu-drm-nouveau-nvkm-subdev-gsp-r535.c:warning:Function-parameter-or-struct-member-gsp-not-described-in-nvkm_gsp_radix3_sg
|-- parisc64-defconfig
|   `-- 
drivers-gpu-drm-nouveau-nvkm-subdev-gsp-r535.c:warning:Function-parameter-or-struct-member-gsp-not-described-in-nvkm_gsp_radix3_sg
|-- sh-buildonly-randconfig-r003-20230528
|   `-- 
WARNING:modpost:vmlinux:section-mismatch-in-reference:virtio_fs_init-(section:.init.text)-map_benchmark_cleanup-(section:.exit.text)
|-- sh-randconfig-r003-20230713
|   `-- standard-input:Error:unknown-pseudo-op:l141
`-- sh-randconfig-r004-20220501
`-- 
WARNING:modpost:vmlinux:section-mismatch-in-reference:virtio_fs_init-(section:.init.text)-sq_api_exit-(section:.exit.text)
clang_recent_errors
|-- hexagon-allmodconfig
|   `-- 
fs-ubifs-journal.c:warning:expecting-prototype-for-wake_up_reservation().-Prototype-was-for-add_or_start_queue()-instead
|-- hexagon-allyesconfig
|   `-- 
fs-ubifs-journal.c:warning:expecting-prototype-for-wake_up_reservation().-Prototype-was-for-add_or_start_queue()-instead
`-- riscv-defconfig
`-- 
drivers-gpu-drm-nouveau-nvkm-subdev-gsp-r535.c:warning:Function-parameter-or-struct-member-gsp-not-described-in-nvkm_gsp_radix3_sg

elapsed time: 724m

configs tested: 179
configs skipped: 3

tested configs:
alpha allnoconfig   gcc  
alphaallyesconfig   gcc  
alpha   defconfig   gcc  
arc  allmodconfig   gcc  
arc   allnoconfig   gcc  
arc  allyesconfig   gcc  
arc defconfig   gcc  
arc   randconfig-001-20240227   gcc  
arc   randconfig-002-20240227   gcc  
arm  allmodconfig   gcc  
arm   allnoconfig   clang
arm  allyesconfig   gcc  
arm defconfig   clang
armkeystone_defconfig   gcc  
armmulti_v7_defconfig   gcc  
arm   randconfig-001-20240227   gcc  
arm   randconfig-002-20240227   gcc  
arm   randconfig-003-20240227   gcc  
arm   randconfig-004-20240227   gcc  
arm   tegra_defconfig   gcc  
armvt8500_v6_v7_defconfig   gcc  
arm64allmodconfig   clang
arm64 allnoconfig   gcc  
arm64   defconfig   gcc  
arm64 randconfig-001-20240227   clang
arm64 randconfig-002-20240227   gcc  
arm64 randconfig-003-20240227   gcc  
arm64 randconfig-004-20240227   gcc  
csky allmodconfig   gcc  
csky  allnoconfig   gcc  
csky allyesconfig   gcc  
cskydefconfig   gcc  
csky  randconfig-001-20240227   gcc  
csky  randconfig-002-20240227   gcc  
hexagon  allmodconfig   clang
hexagon   allnoconfig   clang
hexagon  allyesconfig   cl

DisplayPort: handling of HPD events / link training

2024-02-27 Thread Dmitry Baryshkov
Hello,

We are currently looking at checking and/or possibly redesigning the
way the MSM DRM driver handles the HPD events and link training.

After a quick glance at the drivers implementing DP support, I noticed
following main approaches:
- Perform link training at the atomic_enable time, don't report
failures (mtk, analogix, zynqmp, tegra, nouveau)
- Perform link training at the atomic_enable time, report errors using
link_status property (i915, mhdp8546)
- Perform link training on the plug event (msm, it6505).
- Perform link training from the DPMS handler, also calling it from
the enable callback (AMDGPU, radeon).

It looks like the majority wins and we should move link training to
atomic_enable time. Is that assumption correct?

Also two related questions:
- Is there a plan to actually make use of the link_status property?
Intel presented it at FOSDEM 2018, but since that time it was not
picked up by other drivers.

- Is there any plan to create generic DP link training helpers? After
glancing through the DP drivers there is a lot of similar code in the
link training functions, with minor differences here and there. And
it's those minor differences that bug me. It means that drivers might
respond differently to similar devices. Or that there might be minor
bugs here and there.
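
For illustration, the "train at atomic_enable, report via
link_status" approach could look roughly like this (my_dp and
dp_link_train() are hypothetical; only the property helper is an
existing DRM API):

	static void my_dp_atomic_enable(struct drm_bridge *bridge,
					struct drm_bridge_state *old_state)
	{
		struct my_dp *dp = to_my_dp(bridge);

		if (dp_link_train(dp) < 0) {
			/* ask userspace to re-check the link and re-modeset */
			drm_connector_set_link_status_property(dp->connector,
							       DRM_MODE_LINK_STATUS_BAD);
			return;
		}

		/* ... enable the video stream ... */
	}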

-- 
With best wishes
Dmitry


Re: [PATCH 08/13] drm/qxl: Acquire reservation lock in GEM pin/unpin callbacks

2024-02-27 Thread Zack Rusin
On Tue, Feb 27, 2024 at 6:39 AM Thomas Zimmermann  wrote:
>
> Acquire the reservation lock directly in GEM pin callback. Same for
> unpin. Prepares for further changes.
>
> Dma-buf locking semantics require callers to hold the buffer's
> reservation lock when invoking the pin and unpin callbacks. Prepare
> qxl accordingly by pushing locking out of the implementation. A
> follow-up patch will fix locking for all GEM code at once.
>
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/qxl/qxl_prime.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
> index 9169c26357d36..f2646603e12eb 100644
> --- a/drivers/gpu/drm/qxl/qxl_prime.c
> +++ b/drivers/gpu/drm/qxl/qxl_prime.c
> @@ -31,15 +31,27 @@
>  int qxl_gem_prime_pin(struct drm_gem_object *obj)
>  {
> struct qxl_bo *bo = gem_to_qxl_bo(obj);
> +   int r;
>
> -   return qxl_bo_pin(bo);
> +   r = qxl_bo_reserve(bo);
> +   if (r)
> +   return r;
> +   r = qxl_bo_pin_locked(bo);
> +   qxl_bo_unreserve(bo);
> +
> +   return r;
>  }
>
>  void qxl_gem_prime_unpin(struct drm_gem_object *obj)
>  {
> struct qxl_bo *bo = gem_to_qxl_bo(obj);
> +   int r;
>
> -   qxl_bo_unpin(bo);
> +   r = qxl_bo_reserve(bo);
> +   if (r)
> +   return;
> +   qxl_bo_unpin_locked(bo);
> +   qxl_bo_unreserve(bo);
>  }

It looks like gem_prime_pin/unpin is largely the same between a lot of
drivers now. That might be a nice cleanup in the future.

z


Re: [PATCH 09/13] drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()

2024-02-27 Thread Zack Rusin
On Tue, Feb 27, 2024 at 6:39 AM Thomas Zimmermann  wrote:
>
> Acquire the buffer object's reservation lock in drm_gem_pin() and
> remove locking the drivers' GEM callbacks where necessary. Same for
> unpin().
>
> DRM drivers and memory managers modified by this patch will now have
> correct dma-buf locking semantics: the caller is responsible for
> holding the reservation lock when calling the pin or unpin callback.
>
> DRM drivers and memory managers that are not modified will now be
> protected against concurrent invocation of their pin and unpin callbacks.
>
> PRIME does not implement struct dma_buf_ops.pin, which requires
> the caller to hold the reservation lock. It does implement struct
> dma_buf_ops.attach, which requires the callee to acquire the
> reservation lock. The PRIME code uses drm_gem_pin(), so locks
> are now taken as specified. Same for unpin and detach.
>
> The patch harmonizes GEM pin and unpin to have non-interruptible
> reservation locking across all drivers, as is already the case for
> vmap and vunmap. This affects gem-shmem, gem-vram, loongson, qxl and
> radeon.
>
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/drm_gem.c   | 22 --
>  drivers/gpu/drm/drm_gem_vram_helper.c   | 15 +--
>  drivers/gpu/drm/drm_internal.h  |  2 ++
>  drivers/gpu/drm/loongson/lsdc_gem.c | 13 ++---
>  drivers/gpu/drm/msm/msm_gem_prime.c |  4 
>  drivers/gpu/drm/nouveau/nouveau_prime.c | 11 ---
>  drivers/gpu/drm/qxl/qxl_prime.c | 14 +-
>  drivers/gpu/drm/radeon/radeon_prime.c   | 11 ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 25 ++---
>  include/drm/drm_gem_shmem_helper.h  | 11 +--
>  10 files changed, 33 insertions(+), 95 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 44a948b80ee14..e0f80c6a7096f 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1161,7 +1161,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned 
> int indent,
> obj->funcs->print_info(p, indent, obj);
>  }
>
> -int drm_gem_pin(struct drm_gem_object *obj)
> +int drm_gem_pin_locked(struct drm_gem_object *obj)
>  {
> if (obj->funcs->pin)
> return obj->funcs->pin(obj);
> @@ -1169,12 +1169,30 @@ int drm_gem_pin(struct drm_gem_object *obj)
> return 0;
>  }
>
> -void drm_gem_unpin(struct drm_gem_object *obj)
> +void drm_gem_unpin_locked(struct drm_gem_object *obj)
>  {
> if (obj->funcs->unpin)
> obj->funcs->unpin(obj);
>  }
>
> +int drm_gem_pin(struct drm_gem_object *obj)
> +{
> +   int ret;
> +
> +   dma_resv_lock(obj->resv, NULL);
> +   ret = drm_gem_pin_locked(obj);
> +   dma_resv_unlock(obj->resv);
> +
> +   return ret;
> +}
> +
> +void drm_gem_unpin(struct drm_gem_object *obj)
> +{
> +   dma_resv_lock(obj->resv, NULL);
> +   drm_gem_unpin_locked(obj);
> +   dma_resv_unlock(obj->resv);
> +}
> +
>  int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map)
>  {
> int ret;
> diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c 
> b/drivers/gpu/drm/drm_gem_vram_helper.c
> index 15029d89badf8..5a16b3e0a4134 100644
> --- a/drivers/gpu/drm/drm_gem_vram_helper.c
> +++ b/drivers/gpu/drm/drm_gem_vram_helper.c
> @@ -774,11 +774,6 @@ 
> EXPORT_SYMBOL(drm_gem_vram_simple_display_pipe_cleanup_fb);
>  static int drm_gem_vram_object_pin(struct drm_gem_object *gem)
>  {
> struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
> -   int ret;
> -
> -   ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
> -   if (ret)
> -   return ret;
>
> /*
>  * Fbdev console emulation is the use case of these PRIME
> @@ -789,10 +784,7 @@ static int drm_gem_vram_object_pin(struct drm_gem_object 
> *gem)
>  * the buffer to be pinned to VRAM, implement a callback that
>  * sets the flags accordingly.
>  */
> -   ret = drm_gem_vram_pin_locked(gbo, 0);
> -   ttm_bo_unreserve(&gbo->bo);
> -
> -   return ret;
> +   return drm_gem_vram_pin_locked(gbo, 0);
>  }
>
>  /**
> @@ -803,13 +795,8 @@ static int drm_gem_vram_object_pin(struct drm_gem_object 
> *gem)
>  static void drm_gem_vram_object_unpin(struct drm_gem_object *gem)
>  {
> struct drm_gem_vram_object *gbo = drm_gem_vram_of_gem(gem);
> -   int ret;
>
> -   ret = ttm_bo_reserve(&gbo->bo, true, false, NULL);
> -   if (ret)
> -   return;
> drm_gem_vram_unpin_locked(gbo);
> -   ttm_bo_unreserve(&gbo->bo);
>  }
>
>  /**
> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> index 8e4faf0a28e6c..40b2d3a274d6c 100644
> --- a/drivers/gpu/drm/drm_internal.h
> +++ b/drivers/gpu/drm/drm_internal.h
> @@ -170,6 +170,8 @@ void drm_gem_release(struct drm_device *dev, struct 
> drm_file *file_private);
>  void drm_gem_print_info(st

Re: [PATCH 00/13] drm: Fix reservation locking for pin/unpin and console

2024-02-27 Thread Zack Rusin
On Tue, Feb 27, 2024 at 6:38 AM Thomas Zimmermann  wrote:
>
> Dma-buf locking semantics require the caller of pin and unpin to hold
> the buffer's reservation lock. Fix DRM to adhere to the specs. This
> enables us to fix the locking in DRM's console emulation. Similar changes
> for vmap and mmap have been posted at [1][2].
>
> Most DRM drivers and memory managers acquire the buffer object's
> reservation lock within their GEM pin and unpin callbacks. This
> violates dma-buf locking semantics. We get away with it because PRIME
> does not provide pin/unpin, but attach/detach, for which the locking
> semantics is correct.
>
> Patches 1 to 8 rework DRM GEM code in various implementations to
> acquire the reservation lock when entering the pin and unpin callbacks.
> This prepares them for the next patch. Drivers that are not affected
> by these patches either don't acquire the reservation lock (amdgpu)
> or don't need preparation (loongson).
>
> Patch 9 moves reservation locking from the GEM pin/unpin callbacks
> into drm_gem_pin() and drm_gem_unpin(). As PRIME uses these functions
> internally it still gets the reservation lock.
>
> With the updated GEM callbacks, the rest of the patchset fixes the
> fbdev emulation's buffer locking. Fbdev emulation needs to keep its
> GEM buffer object in place while updating its content. This requires
> an implicit pinning, and apparently amdgpu didn't do this at all.
>
> Patch 10 introduces drm_client_buffer_vmap_local() and _vunmap_local().
> The former function maps a GEM buffer into the kernel's address space
> with regular vmap operations, but keeps holding the reservation lock.
> The _vunmap_local() helper undoes the vmap and releases the lock. The
> updated GEM callbacks make this possible. Between the two calls, the
> fbdev emulation can update the buffer content without having the buffer
> moved or evicted. Update fbdev-generic to use the vmap_local helpers,
> which fixes amdgpu. The idea of adding a "local vmap" has previously been
> attempted at [3] in a different form.
>
> Patch 11 adds implicit pinning to the DRM client's regular vmap
> helper so that long-term vmap'ed buffers won't be evicted. This only
> affects fbdev-dma, but GEM DMA helpers don't require pinning. So
> there are no practical changes.
>
> Patches 12 and 13 remove implicit pinning from the vmap and vunmap
> operations in gem-vram and qxl. These pin operations are not supposed
> to be part of vmap code, but were required to keep the buffers in place
> for fbdev emulation. With the conversion of fbdev-generic to
> vmap_local helpers, that code can finally be removed.
>
> Tested with amdgpu, nouveau, radeon, simpledrm and vc4.
>
> [1] https://patchwork.freedesktop.org/series/106371/
> [2] https://patchwork.freedesktop.org/series/116001/
> [3] https://patchwork.freedesktop.org/series/84732/
>
> Thomas Zimmermann (13):
>   drm/gem-shmem: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/gem-vram: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/msm: Provide msm_gem_get_pages_locked()
>   drm/msm: Acquire reservation lock in GEM pin/unpin callback
>   drm/nouveau: Provide nouveau_bo_{pin,unpin}_locked()
>   drm/nouveau: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/qxl: Provide qxl_bo_{pin,unpin}_locked()
>   drm/qxl: Acquire reservation lock in GEM pin/unpin callbacks
>   drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()
>   drm/fbdev-generic: Fix locking with drm_client_buffer_vmap_local()
>   drm/client: Pin vmap'ed GEM buffers
>   drm/gem-vram: Do not pin buffer objects for vmap
>   drm/qxl: Do not pin buffer objects for vmap
>
>  drivers/gpu/drm/drm_client.c|  92 ++---
>  drivers/gpu/drm/drm_fbdev_generic.c |   4 +-
>  drivers/gpu/drm/drm_gem.c   |  34 +++-
>  drivers/gpu/drm/drm_gem_shmem_helper.c  |   6 +-
>  drivers/gpu/drm/drm_gem_vram_helper.c   | 101 ++--
>  drivers/gpu/drm/drm_internal.h  |   2 +
>  drivers/gpu/drm/loongson/lsdc_gem.c |  13 +--
>  drivers/gpu/drm/msm/msm_gem.c   |  20 ++---
>  drivers/gpu/drm/msm/msm_gem.h   |   4 +-
>  drivers/gpu/drm/msm/msm_gem_prime.c |  20 +++--
>  drivers/gpu/drm/nouveau/nouveau_bo.c|  43 +++---
>  drivers/gpu/drm/nouveau/nouveau_bo.h|   2 +
>  drivers/gpu/drm/nouveau/nouveau_prime.c |   8 +-
>  drivers/gpu/drm/qxl/qxl_object.c|  26 +++---
>  drivers/gpu/drm/qxl/qxl_object.h|   2 +
>  drivers/gpu/drm/qxl/qxl_prime.c |   4 +-
>  drivers/gpu/drm/radeon/radeon_prime.c   |  11 ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_gem.c |  25 ++
>  include/drm/drm_client.h|  10 +++
>  include/drm/drm_gem.h   |   3 +
>  include/drm/drm_gem_shmem_helper.h  |   7 +-
>  21 files changed, 265 insertions(+), 172 deletions(-)
>
>
> base-commit: 7291e2e67dff0ff573900266382c9c9248a7dea5
> prerequisite-patch-id: bdfa0e6341b30cc9d7647172760b3473007c1216
> prerequisite-patch