Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

2021-10-05 Thread Zeng, Oak
Thanks for explanation. This patch is Acked-by: Oak Zeng 

Regards,
Oak

> -Original Message-
> From: Auld, Matthew 
> Sent: October 5, 2021 1:07 PM
> To: Zeng, Oak ; Thomas Hellström
> ; intel-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org; Christian König
> 
> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> backend
> 
> On 05/10/2021 15:23, Zeng, Oak wrote:
> >
> >
> > Regards,
> > Oak
> >
> >> -Original Message-
> >> From: Thomas Hellström 
> >> Sent: October 5, 2021 9:48 AM
> >> To: Zeng, Oak ; Auld, Matthew
> >> ; intel-gfx@lists.freedesktop.org
> >> Cc: dri-de...@lists.freedesktop.org; Christian König
> >> 
> >> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> >> backend
> >>
> >>
> >> On 10/5/21 04:05, Zeng, Oak wrote:
> >>> Hi Matthew/Thomas,
> >>>
> >>> See one question inline
> >>>
> >>> Regards,
> >>> Oak
> >>>
> >>> -Original Message-----
> >>> From: Intel-gfx  On Behalf Of
> >> Matthew Auld
> >>> Sent: September 27, 2021 7:41 AM
> >>> To: intel-gfx@lists.freedesktop.org
> >>> Cc: dri-de...@lists.freedesktop.org; Thomas Hellström
> >> ; Christian König
> >> 
> >>> Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> backend
> >>>
> >>> For cached objects we can allocate our pages directly in shmem. This
> should
> >> make it possible(in a later patch) to utilise the existing i915-gem 
> >> shrinker
> code
> >> for such objects. For now this is still disabled.
> >>>
> >>> v2(Thomas):
> >>> - Add optional try_to_writeback hook for objects. Importantly we need
> >>>   to check if the object is even still shrinkable; in between us
> >>>   dropping the shrinker LRU lock and acquiring the object lock it 
> >>> could for
> >>>   example have been moved. Also we need to differentiate between
> >>>   "lazy" shrinking and the immediate writeback mode. Also later we
> need
> >> to
> >>>   handle objects which don't even have mm.pages, so bundling this into
> >>>   put_pages() would require somehow handling that edge case, hence
> >>>   just letting the ttm backend handle everything in try_to_writeback
> >>>   doesn't seem too bad.
> >>> v3(Thomas):
> >>> - Likely a bad idea to touch the object from the unpopulate hook,
> >>>   since it's not possible to hold a reference, without also creating
> >>>   circular dependency, so likely this is too fragile. For now just
> >>>   ensure we at least mark the pages as dirty/accessed when called from
> the
> >>>   shrinker on WILLNEED objects.
> >>> - s/try_to_writeback/shrinker_release_pages, since this can do more
> >>>   than just writeback.
> >>> - Get rid of do_backup boolean and just set the SWAPPED flag prior to
> >>>   calling unpopulate.
> >>> - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
> >> since
> >>>   these just get skipped anyway. We can try to come up with something
> >>>   better later.
> >>>
> >>> Signed-off-by: Matthew Auld 
> >>> Cc: Thomas Hellström 
> >>> Cc: Christian König 
> >>> ---
> >>>drivers/gpu/drm/i915/gem/i915_gem_object.h|   8 +
> >>>.../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
> >>>drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  14 +-
> >>>drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
> >>>drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 240
> -
> >> -
> >>>5 files changed, 245 insertions(+), 36 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> >> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> >>> index 3043fcbd31bd..1c9a1d8d3434 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> >>> @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct
> >> drm_i915_gem_object *obj,  bool
> >> i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
> >>&g

Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

2021-10-05 Thread Matthew Auld

On 05/10/2021 15:23, Zeng, Oak wrote:



Regards,
Oak


-Original Message-
From: Thomas Hellström 
Sent: October 5, 2021 9:48 AM
To: Zeng, Oak ; Auld, Matthew
; intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Christian König

Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
backend


On 10/5/21 04:05, Zeng, Oak wrote:

Hi Matthew/Thomas,

See one question inline

Regards,
Oak

-Original Message-
From: Intel-gfx  On Behalf Of

Matthew Auld

Sent: September 27, 2021 7:41 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Thomas Hellström

; Christian König


Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This should

make it possible(in a later patch) to utilise the existing i915-gem shrinker 
code
for such objects. For now this is still disabled.


v2(Thomas):
- Add optional try_to_writeback hook for objects. Importantly we need
  to check if the object is even still shrinkable; in between us
  dropping the shrinker LRU lock and acquiring the object lock it could for
  example have been moved. Also we need to differentiate between
  "lazy" shrinking and the immediate writeback mode. Also later we need

to

  handle objects which don't even have mm.pages, so bundling this into
  put_pages() would require somehow handling that edge case, hence
  just letting the ttm backend handle everything in try_to_writeback
  doesn't seem too bad.
v3(Thomas):
- Likely a bad idea to touch the object from the unpopulate hook,
  since it's not possible to hold a reference, without also creating
  circular dependency, so likely this is too fragile. For now just
  ensure we at least mark the pages as dirty/accessed when called from the
  shrinker on WILLNEED objects.
- s/try_to_writeback/shrinker_release_pages, since this can do more
  than just writeback.
- Get rid of do_backup boolean and just set the SWAPPED flag prior to
  calling unpopulate.
- Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,

since

  these just get skipped anyway. We can try to come up with something
  better later.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Christian König 
---
   drivers/gpu/drm/i915/gem/i915_gem_object.h|   8 +
   .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
   drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  14 +-
   drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
   drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 240 -

-

   5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h

b/drivers/gpu/drm/i915/gem/i915_gem_object.h

index 3043fcbd31bd..1c9a1d8d3434 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct

drm_i915_gem_object *obj,  bool
i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,

 enum intel_memory_type type);

+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+   size_t size, struct intel_memory_region *mr,
+   struct address_space *mapping,
+   unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+  bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);
+
   #ifdef CONFIG_MMU_NOTIFIER
   static inline bool
   i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git

a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h

index fa2ba9e2a4d0..f0fb17be2f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
   struct sg_table *pages);
 void (*truncate)(struct drm_i915_gem_object *obj);
 void (*writeback)(struct drm_i915_gem_object *obj);
+   int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+ bool should_writeback);

 int (*pread)(struct drm_i915_gem_object *obj,
  const struct drm_i915_gem_pread *arg); diff --git

a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c

index 36b711ae9e28..19e55cc29a15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec

*pvec)

 cond_resched();
   }

-static void shmem_free_st(struct sg_table *st, struct address_space

*mapping,

- bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, stru

Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

2021-10-05 Thread Zeng, Oak


Regards,
Oak

> -Original Message-
> From: Thomas Hellström 
> Sent: October 5, 2021 9:48 AM
> To: Zeng, Oak ; Auld, Matthew
> ; intel-gfx@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org; Christian König
> 
> Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem
> backend
> 
> 
> On 10/5/21 04:05, Zeng, Oak wrote:
> > Hi Matthew/Thomas,
> >
> > See one question inline
> >
> > Regards,
> > Oak
> >
> > -Original Message-
> > From: Intel-gfx  On Behalf Of
> Matthew Auld
> > Sent: September 27, 2021 7:41 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: dri-de...@lists.freedesktop.org; Thomas Hellström
> ; Christian König
> 
> > Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend
> >
> > For cached objects we can allocate our pages directly in shmem. This should
> make it possible(in a later patch) to utilise the existing i915-gem shrinker 
> code
> for such objects. For now this is still disabled.
> >
> > v2(Thomas):
> >- Add optional try_to_writeback hook for objects. Importantly we need
> >  to check if the object is even still shrinkable; in between us
> >  dropping the shrinker LRU lock and acquiring the object lock it could 
> > for
> >  example have been moved. Also we need to differentiate between
> >  "lazy" shrinking and the immediate writeback mode. Also later we need
> to
> >  handle objects which don't even have mm.pages, so bundling this into
> >  put_pages() would require somehow handling that edge case, hence
> >  just letting the ttm backend handle everything in try_to_writeback
> >  doesn't seem too bad.
> > v3(Thomas):
> >- Likely a bad idea to touch the object from the unpopulate hook,
> >  since it's not possible to hold a reference, without also creating
> >  circular dependency, so likely this is too fragile. For now just
> >  ensure we at least mark the pages as dirty/accessed when called from 
> > the
> >  shrinker on WILLNEED objects.
> >- s/try_to_writeback/shrinker_release_pages, since this can do more
> >  than just writeback.
> >- Get rid of do_backup boolean and just set the SWAPPED flag prior to
> >  calling unpopulate.
> >- Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
> since
> >  these just get skipped anyway. We can try to come up with something
> >  better later.
> >
> > Signed-off-by: Matthew Auld 
> > Cc: Thomas Hellström 
> > Cc: Christian König 
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_object.h|   8 +
> >   .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
> >   drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  14 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 240 -
> -
> >   5 files changed, 245 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > index 3043fcbd31bd..1c9a1d8d3434 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> > @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct
> drm_i915_gem_object *obj,  bool
> i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
> > enum intel_memory_type type);
> >
> > +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
> > +   size_t size, struct intel_memory_region *mr,
> > +   struct address_space *mapping,
> > +   unsigned int max_segment);
> > +void shmem_free_st(struct sg_table *st, struct address_space *mapping,
> > +  bool dirty, bool backup);
> > +void __shmem_writeback(size_t size, struct address_space *mapping);
> > +
> >   #ifdef CONFIG_MMU_NOTIFIER
> >   static inline bool
> >   i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git
> a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > index fa2ba9e2a4d0..f0fb17be2f7a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
> >   struct sg_table *pages);
> > void (*truncate)(struct drm_i915_gem_object *obj);
> >

Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

2021-10-05 Thread Thomas Hellström



On 10/5/21 04:05, Zeng, Oak wrote:

Hi Matthew/Thomas,

See one question inline

Regards,
Oak

-Original Message-
From: Intel-gfx  On Behalf Of Matthew 
Auld
Sent: September 27, 2021 7:41 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Thomas Hellström 
; Christian König 
Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This should 
make it possible(in a later patch) to utilise the existing i915-gem shrinker 
code for such objects. For now this is still disabled.

v2(Thomas):
   - Add optional try_to_writeback hook for objects. Importantly we need
 to check if the object is even still shrinkable; in between us
 dropping the shrinker LRU lock and acquiring the object lock it could for
 example have been moved. Also we need to differentiate between
 "lazy" shrinking and the immediate writeback mode. Also later we need to
 handle objects which don't even have mm.pages, so bundling this into
 put_pages() would require somehow handling that edge case, hence
 just letting the ttm backend handle everything in try_to_writeback
 doesn't seem too bad.
v3(Thomas):
   - Likely a bad idea to touch the object from the unpopulate hook,
 since it's not possible to hold a reference, without also creating
 circular dependency, so likely this is too fragile. For now just
 ensure we at least mark the pages as dirty/accessed when called from the
 shrinker on WILLNEED objects.
   - s/try_to_writeback/shrinker_release_pages, since this can do more
 than just writeback.
   - Get rid of do_backup boolean and just set the SWAPPED flag prior to
 calling unpopulate.
   - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
 these just get skipped anyway. We can try to come up with something
 better later.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Christian König 
---
  drivers/gpu/drm/i915/gem/i915_gem_object.h|   8 +
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
  drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  14 +-
  drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 240 --
  5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3043fcbd31bd..1c9a1d8d3434 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct 
drm_i915_gem_object *obj,  bool i915_gem_object_placement_possible(struct 
drm_i915_gem_object *obj,
enum intel_memory_type type);
  
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,

+   size_t size, struct intel_memory_region *mr,
+   struct address_space *mapping,
+   unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+  bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);
+
  #ifdef CONFIG_MMU_NOTIFIER
  static inline bool
  i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git 
a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index fa2ba9e2a4d0..f0fb17be2f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
  struct sg_table *pages);
void (*truncate)(struct drm_i915_gem_object *obj);
void (*writeback)(struct drm_i915_gem_object *obj);
+   int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+ bool should_writeback);
  
  	int (*pread)(struct drm_i915_gem_object *obj,

 const struct drm_i915_gem_pread *arg); diff --git 
a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 36b711ae9e28..19e55cc29a15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
cond_resched();
  }
  
-static void shmem_free_st(struct sg_table *st, struct address_space *mapping,

- bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+  bool dirty, bool backup)
  {
struct sgt_iter sgt_iter;
struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct 
address_space *mapping,
kfree(st);
  }
  
-static struct sg_table *shmem_alloc_st(stru

Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

2021-10-04 Thread Zeng, Oak
Hi Matthew/Thomas,

See one question inline

Regards,
Oak

-Original Message-
From: Intel-gfx  On Behalf Of Matthew 
Auld
Sent: September 27, 2021 7:41 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Thomas Hellström 
; Christian König 
Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This should 
make it possible(in a later patch) to utilise the existing i915-gem shrinker 
code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
to check if the object is even still shrinkable; in between us
dropping the shrinker LRU lock and acquiring the object lock it could for
example have been moved. Also we need to differentiate between
"lazy" shrinking and the immediate writeback mode. Also later we need to
handle objects which don't even have mm.pages, so bundling this into
put_pages() would require somehow handling that edge case, hence
just letting the ttm backend handle everything in try_to_writeback
doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
since it's not possible to hold a reference, without also creating
circular dependency, so likely this is too fragile. For now just
ensure we at least mark the pages as dirty/accessed when called from the
shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
these just get skipped anyway. We can try to come up with something
better later.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Christian König 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 240 --
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3043fcbd31bd..1c9a1d8d3434 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct 
drm_i915_gem_object *obj,  bool i915_gem_object_placement_possible(struct 
drm_i915_gem_object *obj,
enum intel_memory_type type);
 
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+   size_t size, struct intel_memory_region *mr,
+   struct address_space *mapping,
+   unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+  bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git 
a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index fa2ba9e2a4d0..f0fb17be2f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
  struct sg_table *pages);
void (*truncate)(struct drm_i915_gem_object *obj);
void (*writeback)(struct drm_i915_gem_object *obj);
+   int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+ bool should_writeback);
 
int (*pread)(struct drm_i915_gem_object *obj,
 const struct drm_i915_gem_pread *arg); diff --git 
a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 36b711ae9e28..19e55cc29a15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
cond_resched();
 }
 
-static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
- bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+  bool dirty, bool backup)
 {
struct sgt_iter sgt_iter;
struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct 
address_space *mapping,
kfree(st);
 }
 
-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
-  size_t size, struct

Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

2021-09-29 Thread Thomas Hellström
On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:
> For cached objects we can allocate our pages directly in shmem. This
> should make it possible(in a later patch) to utilise the existing
> i915-gem shrinker code for such objects. For now this is still
> disabled.

Some minor comments below, with those either fixed or deemed
unnecessary, 
Reviewed-by: Thomas Hellström 




> 
> v2(Thomas):
>   - Add optional try_to_writeback hook for objects. Importantly we
> need
>     to check if the object is even still shrinkable; in between us
>     dropping the shrinker LRU lock and acquiring the object lock it
> could for
>     example have been moved. Also we need to differentiate between
>     "lazy" shrinking and the immediate writeback mode. Also later we
> need to
>     handle objects which don't even have mm.pages, so bundling this
> into
>     put_pages() would require somehow handling that edge case, hence
>     just letting the ttm backend handle everything in
> try_to_writeback
>     doesn't seem too bad.
> v3(Thomas):
>   - Likely a bad idea to touch the object from the unpopulate hook,
>     since it's not possible to hold a reference, without also
> creating
>     circular dependency, so likely this is too fragile. For now just
>     ensure we at least mark the pages as dirty/accessed when called
> from the
>     shrinker on WILLNEED objects.
>   - s/try_to_writeback/shrinker_release_pages, since this can do more
>     than just writeback.
>   - Get rid of do_backup boolean and just set the SWAPPED flag prior
> to
>     calling unpopulate.
>   - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
> since
>     these just get skipped anyway. We can try to come up with
> something
>     better later.
> 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Christian König 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  14 +-
>  drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 240
> --
>  5 files changed, 245 insertions(+), 36 deletions(-)
> 
> 

...

> +
> +   err = dma_map_sg_attrs(i915_tt->dev,
> +  st->sgl, st->nents,
> +  PCI_DMA_BIDIRECTIONAL,

nit: Since this is a dma api call, should we use DMA_BIDIRECTIONAL
instead of PCI_DMA_BIDIRECTIONAL? DMA_BIDIRECTIONAL is used elsewhere
in this file, but not throughout the driver IIRC.

> +  DMA_ATTR_SKIP_CPU_SYNC |
> +  DMA_ATTR_NO_KERNEL_MAPPING |
> +  DMA_ATTR_NO_WARN);
> +   if (err <= 0) {
> +   err = -EINVAL;
> +   goto err_free_st;
> +   }
> +
> +   i = 0;
> +   for_each_sgt_page(page, sgt_iter, st)
> +   ttm->pages[i++] = page;
> +
> +   if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
> +   ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
> +
> +   i915_tt->cached_st = st;
> +   return 0;
> +
> +err_free_st:
> +   shmem_free_st(st, filp->f_mapping, false, false);
> +   return err;
> +}
> +
> +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm)
> +{
> +   struct i915_ttm_tt *i915_tt = container_of(ttm,
> typeof(*i915_tt), ttm);
> +   bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
> +
> +   dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
> +    i915_tt->cached_st->nents,
> +    PCI_DMA_BIDIRECTIONAL);

Same here. 
> +
> +   shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
> + file_inode(i915_tt->filp)->i_mapping,
> + backup, backup);
> +}
> +
>  static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object
> *bo,
>  uint32_t page_flags)
>  {
> struct ttm_resource_manager *man =
> ttm_manager_type(bo->bdev, bo->resource->mem_type);
> struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> +   enum ttm_caching caching = i915_ttm_select_tt_caching(obj);
> struct i915_ttm_tt *i915_tt;
> int ret;
>  
> @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct
> ttm_buffer_object *bo,
>     man->use_tt)
> page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
>  
> -   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
> - i915_ttm_select_tt_caching(obj));
> -   if (ret) {
> -   kfree(i915_tt);
> -   return NULL;
> +   if (i915_gem_object_is_shrinkable(obj) && caching ==
> ttm_cached) {
> +   page_flags |= TTM_TT_FLAG_EXTERNAL |
> + TTM_TT_FLAG_EXTERNAL_MAPPABLE;
> +   i915_tt->is_shmem = true;
> }
>  
> +   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
> +   if (ret)
>

[Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

2021-09-27 Thread Matthew Auld
For cached objects we can allocate our pages directly in shmem. This
should make it possible(in a later patch) to utilise the existing
i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
to check if the object is even still shrinkable; in between us
dropping the shrinker LRU lock and acquiring the object lock it could for
example have been moved. Also we need to differentiate between
"lazy" shrinking and the immediate writeback mode. Also later we need to
handle objects which don't even have mm.pages, so bundling this into
put_pages() would require somehow handling that edge case, hence
just letting the ttm backend handle everything in try_to_writeback
doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
since it's not possible to hold a reference, without also creating
circular dependency, so likely this is too fragile. For now just
ensure we at least mark the pages as dirty/accessed when called from the
shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
these just get skipped anyway. We can try to come up with something
better later.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Christian König 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 240 --
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3043fcbd31bd..1c9a1d8d3434 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct 
drm_i915_gem_object *obj,
 bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
enum intel_memory_type type);
 
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+   size_t size, struct intel_memory_region *mr,
+   struct address_space *mapping,
+   unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+  bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index fa2ba9e2a4d0..f0fb17be2f7a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops {
  struct sg_table *pages);
void (*truncate)(struct drm_i915_gem_object *obj);
void (*writeback)(struct drm_i915_gem_object *obj);
+   int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+ bool should_writeback);
 
int (*pread)(struct drm_i915_gem_object *obj,
 const struct drm_i915_gem_pread *arg);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 36b711ae9e28..19e55cc29a15 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec)
cond_resched();
 }
 
-static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
- bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+  bool dirty, bool backup)
 {
struct sgt_iter sgt_iter;
struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct 
address_space *mapping,
kfree(st);
 }
 
-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
-  size_t size, struct intel_memory_region 
*mr,
-  struct address_space *mapping,
-  unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+   size_t size, struct intel_memory_region *mr,
+   struct address_space *mapping,
+