Re: [PATCH v4 2/4] drm/ttm: Use LRU hitches

2024-03-11 Thread Thomas Hellström
Hi! Thanks for reviewing. 
On Fri, 2024-03-08 at 18:50 +0530, Somalapuram, Amaranath wrote:
> 
> On 3/6/2024 12:31 PM, Thomas Hellström wrote:
> > Have iterators insert themselves into the list they are iterating
> > over using hitch list nodes. Since only the iterator owner
> > can remove these list nodes from the list, it's safe to unlock
> > the list and when continuing, use them as a starting point. Due to
> > the way LRU bumping works in TTM, newly added items will not be
> > missed, and bumped items will be iterated over a second time before
> > reaching the end of the list.
> > 
> > The exception is lists with bulk move sublists. When bumping a
> > sublist, a hitch that is part of that sublist will also be moved
> > and we might miss items if restarting from it. This will be
> > addressed in a later patch.
> > 
> > v2:
> > - Updated ttm_resource_cursor_fini() documentation.
> > 
> > Cc: Christian König 
> > Cc: Somalapuram Amaranath 
> > Cc: 
> > Signed-off-by: Thomas Hellström 
> > ---
> >   drivers/gpu/drm/ttm/ttm_bo.c   |  1 +
> >   drivers/gpu/drm/ttm/ttm_device.c   |  9 ++-
> >   drivers/gpu/drm/ttm/ttm_resource.c | 94 --
> >   include/drm/ttm/ttm_resource.h | 16 +++--
> >   4 files changed, 82 insertions(+), 38 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > b/drivers/gpu/drm/ttm/ttm_bo.c
> > index e059b1e1b13b..b6f75a0ff2e5 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -622,6 +622,7 @@ int ttm_mem_evict_first(struct ttm_device
> > *bdev,
> >     if (locked)
> >     dma_resv_unlock(res->bo->base.resv);
> >     }
> > +   ttm_resource_cursor_fini_locked(&cursor);
> >   
> >     if (!bo) {
> >     if (busy_bo && !ttm_bo_get_unless_zero(busy_bo))
> > diff --git a/drivers/gpu/drm/ttm/ttm_device.c
> > b/drivers/gpu/drm/ttm/ttm_device.c
> > index f27406e851e5..e8a6a1dab669 100644
> > --- a/drivers/gpu/drm/ttm/ttm_device.c
> > +++ b/drivers/gpu/drm/ttm/ttm_device.c
> > @@ -169,12 +169,17 @@ int ttm_device_swapout(struct ttm_device
> > *bdev, struct ttm_operation_ctx *ctx,
> >     num_pages = PFN_UP(bo->base.size);
> >     ret = ttm_bo_swapout(bo, ctx, gfp_flags);
> >     /* ttm_bo_swapout has dropped the lru_lock
> > */
> > -   if (!ret)
> > +   if (!ret) {
> > +   ttm_resource_cursor_fini(&cursor);
> 
> is spin_unlock(&bdev->lru_lock) missing?
> 
> >     return num_pages;
> > -   if (ret != -EBUSY)
> > +   }
> > +   if (ret != -EBUSY) {
> > +   ttm_resource_cursor_fini(&cursor);
> 
> is spin_unlock(&bdev->lru_lock) missing?

The ttm_bo_swapout() function returns with the lru_lock either held or
dropped depending on the error code. IIRC it only returns with the lock
still held on -EBUSY, so no extra unlock is needed on the other paths.
That is something we can hopefully change once this series is in place.
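
To spell that out, here is a rough control-flow sketch of
ttm_device_swapout() after this patch (schematic only, the LRU walk
itself is elided; the names are the ones from the diff, and the
locking contract of ttm_bo_swapout() is as described above):

        spin_lock(&bdev->lru_lock);
        /* ... walk the LRU using the cursor ... */
                ret = ttm_bo_swapout(bo, ctx, gfp_flags);
                if (!ret) {
                        /* lru_lock already dropped by ttm_bo_swapout() */
                        ttm_resource_cursor_fini(&cursor); /* takes and drops the lock itself */
                        return num_pages;
                }
                if (ret != -EBUSY) {
                        /* lru_lock already dropped here as well */
                        ttm_resource_cursor_fini(&cursor);
                        return ret;
                }
                /* -EBUSY: lru_lock is still held, keep iterating */
        /* ... end of walk ... */
        ttm_resource_cursor_fini_locked(&cursor); /* lock still held here */
        spin_unlock(&bdev->lru_lock);
        return 0;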

/Thomas


> 
> Regards,
> S.Amarnath
> >     return ret;
> > +   }
> >     }
> >     }
> > +   ttm_resource_cursor_fini_locked(&cursor);
> >     spin_unlock(&bdev->lru_lock);
> >     return 0;
> >   }
> > diff --git a/drivers/gpu/drm/ttm/ttm_resource.c
> > b/drivers/gpu/drm/ttm/ttm_resource.c
> > index ee1865f82cb4..971014fca10a 100644
> > --- a/drivers/gpu/drm/ttm/ttm_resource.c
> > +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> > @@ -32,6 +32,37 @@
> >   
> >   #include 
> >   
> > +/**
> > + * ttm_resource_cursor_fini_locked() - Finalize the LRU list
> > cursor usage
> > + * @cursor: The struct ttm_resource_cursor to finalize.
> > + *
> > + * The function pulls the LRU list cursor off any lists it was previously
> > + * attached to. Needs to be called with the LRU lock held. The function
> > + * can be called multiple times in succession.
> > + */
> > +void ttm_resource_cursor_fini_locked(struct ttm_resource_cursor
> > *cursor)
> > +{
> > +   lockdep_assert_held(&cursor->man->bdev->lru_lock);
> > +   list_del_init(&cursor->hitch.link);
> > +}
> > +
> > +/**
> > + * ttm_resource_cursor_fini() - Finalize the LRU list cursor usage
> > + * @cursor: The struct ttm_resource_cursor to finalize.
> > + *
> > + * The function pulls the LRU list cursor off any lists it was previously
> > + * attached to. Needs to be called without the LRU list lock held. The
> > + * function can be called multiple times in succession.
> > + */
> > +void ttm_resource_cursor_fini(struct ttm_resource_cursor *cursor)
> > +{
> > +   spinlock_t *lru_lock = &cursor->man->bdev->lru_lock;
> > +
> > +   spin_lock(lru_lock);
> > +   ttm_resource_cursor_fini_locked(cursor);
> > +   spin_unlock(lru_lock);
> > +}
> > +
> >   /**
> >    * ttm_lru_bulk_move_init - initialize a bulk move structure
> >    * @bulk: the structure to init
> > @@ -483,62 +514,63 @@ void ttm_resource_manager_debug(struct
> > ttm_resource_manager *man,
> >   EXPORT_SYMBOL(ttm_resource_manager_debug);
> >   
> >   /**
> > - * 

Re: [PATCH v4 2/4] drm/ttm: Use LRU hitches

2024-03-08 Thread Somalapuram, Amaranath



On 3/6/2024 12:31 PM, Thomas Hellström wrote:

Have iterators insert themselves into the list they are iterating
over using hitch list nodes. Since only the iterator owner
can remove these list nodes from the list, it's safe to unlock
the list and when continuing, use them as a starting point. Due to
the way LRU bumping works in TTM, newly added items will not be
missed, and bumped items will be iterated over a second time before
reaching the end of the list.

The exception is lists with bulk move sublists. When bumping a
sublist, a hitch that is part of that sublist will also be moved
and we might miss items if restarting from it. This will be
addressed in a later patch.

v2:
- Updated ttm_resource_cursor_fini() documentation.

Cc: Christian König 
Cc: Somalapuram Amaranath 
Cc: 
Signed-off-by: Thomas Hellström 
---
  drivers/gpu/drm/ttm/ttm_bo.c   |  1 +
  drivers/gpu/drm/ttm/ttm_device.c   |  9 ++-
  drivers/gpu/drm/ttm/ttm_resource.c | 94 --
  include/drm/ttm/ttm_resource.h | 16 +++--
  4 files changed, 82 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index e059b1e1b13b..b6f75a0ff2e5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -622,6 +622,7 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
if (locked)
dma_resv_unlock(res->bo->base.resv);
}
+   ttm_resource_cursor_fini_locked(&cursor);
  
  	if (!bo) {

if (busy_bo && !ttm_bo_get_unless_zero(busy_bo))
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index f27406e851e5..e8a6a1dab669 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -169,12 +169,17 @@ int ttm_device_swapout(struct ttm_device *bdev, struct 
ttm_operation_ctx *ctx,
num_pages = PFN_UP(bo->base.size);
ret = ttm_bo_swapout(bo, ctx, gfp_flags);
/* ttm_bo_swapout has dropped the lru_lock */
-   if (!ret)
+   if (!ret) {
+   ttm_resource_cursor_fini(&cursor);


is spin_unlock(&bdev->lru_lock) missing?


return num_pages;
-   if (ret != -EBUSY)
+   }
+   if (ret != -EBUSY) {
+   ttm_resource_cursor_fini(&cursor);


is spin_unlock(&bdev->lru_lock) missing?

Regards,
S.Amarnath

return ret;
+   }
}
}
+   ttm_resource_cursor_fini_locked(&cursor);
spin_unlock(&bdev->lru_lock);
return 0;
  }
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
b/drivers/gpu/drm/ttm/ttm_resource.c
index ee1865f82cb4..971014fca10a 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -32,6 +32,37 @@
  
  #include 
  
+/**
+ * ttm_resource_cursor_fini_locked() - Finalize the LRU list cursor usage
+ * @cursor: The struct ttm_resource_cursor to finalize.
+ *
+ * The function pulls the LRU list cursor off any lists it was previously
+ * attached to. Needs to be called with the LRU lock held. The function
+ * can be called multiple times in succession.
+ */
+void ttm_resource_cursor_fini_locked(struct ttm_resource_cursor *cursor)
+{
+   lockdep_assert_held(&cursor->man->bdev->lru_lock);
+   list_del_init(&cursor->hitch.link);
+}
+
+/**
+ * ttm_resource_cursor_fini() - Finalize the LRU list cursor usage
+ * @cursor: The struct ttm_resource_cursor to finalize.
+ *
+ * The function pulls the LRU list cursor off any lists it was previously
+ * attached to. Needs to be called without the LRU list lock held. The
+ * function can be called multiple times in succession.
+ */
+void ttm_resource_cursor_fini(struct ttm_resource_cursor *cursor)
+{
+   spinlock_t *lru_lock = &cursor->man->bdev->lru_lock;
+
+   spin_lock(lru_lock);
+   ttm_resource_cursor_fini_locked(cursor);
+   spin_unlock(lru_lock);
+}
+
  /**
   * ttm_lru_bulk_move_init - initialize a bulk move structure
   * @bulk: the structure to init
@@ -483,62 +514,63 @@ void ttm_resource_manager_debug(struct 
ttm_resource_manager *man,
  EXPORT_SYMBOL(ttm_resource_manager_debug);
  
  /**

- * ttm_resource_manager_first
- *
- * @man: resource manager to iterate over
+ * ttm_resource_manager_next() - Continue iterating over the resource manager
+ * resources
   * @cursor: cursor to record the position
   *
- * Returns the first resource from the resource manager.
+ * Return: The next resource from the resource manager.
   */
  struct ttm_resource *
-ttm_resource_manager_first(struct ttm_resource_manager *man,
-  struct ttm_resource_cursor *cursor)
+ttm_resource_manager_next(struct ttm_resource_cursor *cursor)
  {
+   struct ttm_resource_manager *man = cursor->man;
struct ttm_lru_item *lru;
  
lockdep_assert_held(&man->bdev->lru_lock);
  

[PATCH v4 2/4] drm/ttm: Use LRU hitches

2024-03-05 Thread Thomas Hellström
Have iterators insert themselves into the list they are iterating
over using hitch list nodes. Since only the iterator owner
can remove these list nodes from the list, it's safe to unlock
the list and when continuing, use them as a starting point. Due to
the way LRU bumping works in TTM, newly added items will not be
missed, and bumped items will be iterated over a second time before
reaching the end of the list.

The exception is lists with bulk move sublists. When bumping a
sublist, a hitch that is part of that sublist will also be moved
and we might miss items if restarting from it. This will be
addressed in a later patch.
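
Schematically, the hitch idea looks like the sketch below. This is
illustrative only and not the actual TTM implementation; "struct item",
"struct cursor" and the helper names are made up for the example, only
the standard list helpers are real:

        #include <linux/list.h>

        struct item {                           /* stand-in for an LRU item */
                struct list_head link;
        };

        struct cursor {
                struct list_head hitch;         /* owned by the iterator only */
        };

        /* Start the walk: park the hitch at the head of the LRU list. */
        static void cursor_first(struct list_head *lru, struct cursor *c)
        {
                list_add(&c->hitch, lru);
        }

        /*
         * Return the item after the hitch and move the hitch past it.
         * Since nobody else removes the hitch, the list lock can be
         * dropped between calls and the walk resumes right here.
         */
        static struct item *cursor_next(struct list_head *lru, struct cursor *c)
        {
                struct list_head *next = c->hitch.next;

                if (next == lru) {
                        list_del_init(&c->hitch);       /* done, detach the hitch */
                        return NULL;
                }
                list_move(&c->hitch, next);             /* hitch now trails the item */
                return list_entry(next, struct item, link);
        }

An item that is bumped to the tail of the list ends up ahead of the
hitch again and is simply visited a second time, and items added at the
tail are not missed either, which is the behaviour relied on above.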

v2:
- Updated ttm_resource_cursor_fini() documentation.

Cc: Christian König 
Cc: Somalapuram Amaranath 
Cc: 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/ttm/ttm_bo.c   |  1 +
 drivers/gpu/drm/ttm/ttm_device.c   |  9 ++-
 drivers/gpu/drm/ttm/ttm_resource.c | 94 --
 include/drm/ttm/ttm_resource.h | 16 +++--
 4 files changed, 82 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index e059b1e1b13b..b6f75a0ff2e5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -622,6 +622,7 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
if (locked)
dma_resv_unlock(res->bo->base.resv);
}
+   ttm_resource_cursor_fini_locked(&cursor);
 
if (!bo) {
if (busy_bo && !ttm_bo_get_unless_zero(busy_bo))
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index f27406e851e5..e8a6a1dab669 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -169,12 +169,17 @@ int ttm_device_swapout(struct ttm_device *bdev, struct 
ttm_operation_ctx *ctx,
num_pages = PFN_UP(bo->base.size);
ret = ttm_bo_swapout(bo, ctx, gfp_flags);
/* ttm_bo_swapout has dropped the lru_lock */
-   if (!ret)
+   if (!ret) {
+   ttm_resource_cursor_fini(&cursor);
return num_pages;
-   if (ret != -EBUSY)
+   }
+   if (ret != -EBUSY) {
+   ttm_resource_cursor_fini(&cursor);
return ret;
+   }
}
}
+   ttm_resource_cursor_fini_locked(&cursor);
spin_unlock(&bdev->lru_lock);
return 0;
 }
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c 
b/drivers/gpu/drm/ttm/ttm_resource.c
index ee1865f82cb4..971014fca10a 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -32,6 +32,37 @@
 
 #include 
 
+/**
+ * ttm_resource_cursor_fini_locked() - Finalize the LRU list cursor usage
+ * @cursor: The struct ttm_resource_cursor to finalize.
+ *
+ * The function pulls the LRU list cursor off any lists it was previously
+ * attached to. Needs to be called with the LRU lock held. The function
+ * can be called multiple times in succession.
+ */
+void ttm_resource_cursor_fini_locked(struct ttm_resource_cursor *cursor)
+{
+   lockdep_assert_held(&cursor->man->bdev->lru_lock);
+   list_del_init(&cursor->hitch.link);
+}
+
+/**
+ * ttm_resource_cursor_fini() - Finalize the LRU list cursor usage
+ * @cursor: The struct ttm_resource_cursor to finalize.
+ *
+ * The function pulls the LRU list cursor off any lists it was previously
+ * attached to. Needs to be called without the LRU list lock held. The
+ * function can be called multiple times in succession.
+ */
+void ttm_resource_cursor_fini(struct ttm_resource_cursor *cursor)
+{
+   spinlock_t *lru_lock = &cursor->man->bdev->lru_lock;
+
+   spin_lock(lru_lock);
+   ttm_resource_cursor_fini_locked(cursor);
+   spin_unlock(lru_lock);
+}
+
 /**
  * ttm_lru_bulk_move_init - initialize a bulk move structure
  * @bulk: the structure to init
@@ -483,62 +514,63 @@ void ttm_resource_manager_debug(struct 
ttm_resource_manager *man,
 EXPORT_SYMBOL(ttm_resource_manager_debug);
 
 /**
- * ttm_resource_manager_first
- *
- * @man: resource manager to iterate over
+ * ttm_resource_manager_next() - Continue iterating over the resource manager
+ * resources
  * @cursor: cursor to record the position
  *
- * Returns the first resource from the resource manager.
+ * Return: The next resource from the resource manager.
  */
 struct ttm_resource *
-ttm_resource_manager_first(struct ttm_resource_manager *man,
-  struct ttm_resource_cursor *cursor)
+ttm_resource_manager_next(struct ttm_resource_cursor *cursor)
 {
+   struct ttm_resource_manager *man = cursor->man;
struct ttm_lru_item *lru;
 
lockdep_assert_held(&man->bdev->lru_lock);
 
-   for (cursor->priority = 0; cursor->priority < TTM_MAX_BO_PRIORITY;
-        ++cursor->priority)
-   list_for_each_entry(lru,