Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer

2021-02-09 Thread John Stultz
On Tue, Feb 9, 2021 at 4:14 AM Christian König  wrote:
> Am 05.02.21 um 20:47 schrieb John Stultz:
> > On Fri, Feb 5, 2021 at 12:28 AM Christian König
> >  wrote:
> >> Adding this to all pages would increase the memory footprint drastically.
> > Yea, that's a good point!  Hrm... bummer. I'll have to see if there's
> > some other way I can get the needed context for the free from the
> > generic page-pool side.
>
> What exactly is the problem here?

Me, usually. :)

> As far as I can see we just have the
> lru entry (list_head) and the pool.

Yea, I reworked it to an embedded drm_page_pool struct, but that is
mostly a list_head.

> How the lru is cast to the page can be completely pool implementation
> specific.

Yea, I had it do container_of(), just haven't gotten around to sending
it out yet.
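
(Aside for readers following the thread: a minimal sketch of that container_of() idea, assuming it lives alongside the existing helpers in ttm_pool.c. The struct layouts below are simplified illustrations of the rework being discussed, not the real definitions.)

```c
/* Sketch only: embed the generic pool in the TTM pool type and use
 * container_of() in the free callback to recover the TTM context,
 * instead of stashing it in every struct page.
 */
struct drm_page_pool {
	struct list_head pages;		/* lru of pooled pages */
};

struct ttm_pool_type {			/* simplified stand-in */
	struct ttm_pool *pool;
	unsigned int order;
	enum ttm_caching caching;
	struct drm_page_pool free_list;	/* embedded generic pool */
};

static int ttm_pool_free_cb(struct drm_page_pool *dpool, struct page *p)
{
	struct ttm_pool_type *pt =
		container_of(dpool, struct ttm_pool_type, free_list);

	/* pool, caching and order come from the containing
	 * ttm_pool_type, so no per-page allocation is needed.
	 */
	ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
	return 1 << pt->order;
}
```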

Thanks so much for the feedback and ideas!

thanks
-john


Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer

2021-02-09 Thread Christian König



Am 05.02.21 um 20:47 schrieb John Stultz:

On Fri, Feb 5, 2021 at 12:28 AM Christian König
 wrote:

Am 05.02.21 um 09:06 schrieb John Stultz:

This refactors ttm_pool_free_page(), and by adding extra entries
to ttm_pool_page_dat, we then use it for all allocations, which
allows us to simplify the arguments needed to be passed to
ttm_pool_free_page().

This is a clear NAK since the peer page data is just a workaround for
the DMA-API hack to grab pages from there.

Adding this to all pages would increase the memory footprint drastically.

Yea, that's a good point!  Hrm... bummer. I'll have to see if there's
some other way I can get the needed context for the free from the
generic page-pool side.


What exactly is the problem here? As far as I can see we just have the 
lru entry (list_head) and the pool.


How the lru is cast to the page can be completely pool implementation 
specific.


Regards,
Christian.



Thanks so much for the review!
-john




Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer

2021-02-05 Thread John Stultz
On Fri, Feb 5, 2021 at 12:28 AM Christian König
 wrote:
> Am 05.02.21 um 09:06 schrieb John Stultz:
> > This refactors ttm_pool_free_page(), and by adding extra entries
> > to ttm_pool_page_dat, we then use it for all allocations, which
> > allows us to simplify the arguments needed to be passed to
> > ttm_pool_free_page().
>
> This is a clear NAK since the peer page data is just a workaround for
> the DMA-API hack to grab pages from there.
>
> Adding this to all pages would increase the memory footprint drastically.

Yea, that's a good point!  Hrm... bummer. I'll have to see if there's
some other way I can get the needed context for the free from the
generic page-pool side.

Thanks so much for the review!
-john


Re: [RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer

2021-02-05 Thread Christian König

Am 05.02.21 um 09:06 schrieb John Stultz:

This refactors ttm_pool_free_page(), and by adding extra entries
to ttm_pool_page_dat, we then use it for all allocations, which
allows us to simplify the arguments needed to be passed to
ttm_pool_free_page().


This is a clear NAK since the peer page data is just a workaround for 
the DMA-API hack to grab pages from there.


Adding this to all pages would increase the memory footprint drastically.

christian.
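
(Rough back-of-the-envelope numbers for that footprint concern, assuming a typical 64-bit kernel; exact padding and slab behaviour may differ.)

```c
struct ttm_pool_page_dat {
	struct ttm_pool *pool;		/*  8 bytes */
	enum ttm_caching caching;	/*  4 bytes (+ 4 padding) */
	dma_addr_t addr;		/*  8 bytes */
	unsigned long vaddr;		/*  8 bytes */
};					/* ~32 bytes total */

/*
 * With the patch below this helper is kmalloc()ed for every page,
 * including the plain alloc_pages() path, so each order-0 (4 KiB)
 * page carries roughly 32 bytes of slab memory (~0.8% overhead) plus
 * a kmalloc()/kfree() round trip on every allocation and free.
 */
```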



This is critical for allowing the free function to be called
by the sharable drm_page_pool logic.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
  drivers/gpu/drm/ttm/ttm_pool.c | 60 ++
  1 file changed, 32 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index c0274e256be3..eca36678f967 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -44,10 +44,14 @@
  /**
   * struct ttm_pool_page_dat - Helper object for coherent DMA mappings
   *
+ * @pool: ttm_pool pointer the page was allocated by
+ * @caching: the caching value the allocated page was configured for
   * @addr: original DMA address returned for the mapping
   * @vaddr: original vaddr return for the mapping and order in the lower bits
   */
  struct ttm_pool_page_dat {
+   struct ttm_pool *pool;
+   enum ttm_caching caching;
dma_addr_t addr;
unsigned long vaddr;
  };
@@ -71,13 +75,20 @@ static struct shrinker mm_shrinker;
  
  /* Allocate pages of size 1 << order with the given gfp_flags */

  static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
-   unsigned int order)
+   unsigned int order, enum ttm_caching caching)
  {
unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
struct ttm_pool_page_dat *dat;
struct page *p;
void *vaddr;
  
+	dat = kmalloc(sizeof(*dat), GFP_KERNEL);

+   if (!dat)
+   return NULL;
+
+   dat->pool = pool;
+   dat->caching = caching;
+
/* Don't set the __GFP_COMP flag for higher order allocations.
 * Mapping pages directly into an userspace process and calling
 * put_page() on a TTM allocated page is illegal.
@@ -88,15 +99,13 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
  
  	if (!pool->use_dma_alloc) {

p = alloc_pages(gfp_flags, order);
-   if (p)
-   p->private = order;
+   if (!p)
+   goto error_free;
+   dat->vaddr = order;
+   p->private = (unsigned long)dat;
return p;
}
  
-	dat = kmalloc(sizeof(*dat), GFP_KERNEL);

-   if (!dat)
-   return NULL;
-
if (order)
attr |= DMA_ATTR_NO_WARN;
  
@@ -123,34 +132,34 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,

  }
  
  /* Reset the caching and pages of size 1 << order */

-static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
-  unsigned int order, struct page *p)
+static int ttm_pool_free_page(struct page *p, unsigned int order)
  {
unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
-   struct ttm_pool_page_dat *dat;
+   struct ttm_pool_page_dat *dat = (void *)p->private;
void *vaddr;
  
  #ifdef CONFIG_X86

/* We don't care that set_pages_wb is inefficient here. This is only
 * used when we have to shrink and CPU overhead is irrelevant then.
 */
-   if (caching != ttm_cached && !PageHighMem(p))
+   if (dat->caching != ttm_cached && !PageHighMem(p))
set_pages_wb(p, 1 << order);
  #endif
  
-	if (!pool || !pool->use_dma_alloc) {

+   if (!dat->pool || !dat->pool->use_dma_alloc) {
__free_pages(p, order);
-   return;
+   goto out;
}
  
  	if (order)

attr |= DMA_ATTR_NO_WARN;
  
-	dat = (void *)p->private;

vaddr = (void *)(dat->vaddr & PAGE_MASK);
-   dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
+   dma_free_attrs(dat->pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
   attr);
+out:
kfree(dat);
+   return 1 << order;
  }
  
  /* Apply a new caching to an array of pages */

@@ -264,7 +273,7 @@ static void ttm_pool_type_fini(struct ttm_pool_type *pt)
	mutex_unlock(&shrinker_lock);
  
  	list_for_each_entry_safe(p, tmp, &pt->pages, lru)

-  

[RFC][PATCH v6 3/7] drm: ttm_pool: Rework ttm_pool_free_page to allow us to use it as a function pointer

2021-02-05 Thread John Stultz
This refactors ttm_pool_free_page(), and by adding extra entries
to ttm_pool_page_dat, we then use it for all allocations, which
allows us to simplify the arguments needed to be passed to
ttm_pool_free_page().

This is critical for allowing the free function to be called
by the sharable drm_page_pool logic.
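
(A rough sketch of what that enables on the shared-pool side; the drm_page_pool shape below is a guess for illustration only, the real structure is presumably introduced elsewhere in this series.)

```c
/* Illustrative only: with everything the free path needs reachable
 * from the page itself, a shared pool can keep a callback of this
 * shape and call it without any TTM-specific context.
 */
struct drm_page_pool {
	struct list_head items;		/* pooled pages, linked via page->lru */
	unsigned int order;		/* page order of this pool */
	int (*free)(struct page *p, unsigned int order);
};

static unsigned long drm_page_pool_shrink_one(struct drm_page_pool *pool)
{
	struct page *p;

	if (list_empty(&pool->items))
		return 0;

	p = list_first_entry(&pool->items, struct page, lru);
	list_del_init(&p->lru);

	/* The reworked ttm_pool_free_page(p, order) fits this callback
	 * directly and reports how many pages it released.
	 */
	return pool->free(p, pool->order);
}
```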

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 drivers/gpu/drm/ttm/ttm_pool.c | 60 ++
 1 file changed, 32 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index c0274e256be3..eca36678f967 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -44,10 +44,14 @@
 /**
  * struct ttm_pool_page_dat - Helper object for coherent DMA mappings
  *
+ * @pool: ttm_pool pointer the page was allocated by
+ * @caching: the caching value the allocated page was configured for
  * @addr: original DMA address returned for the mapping
  * @vaddr: original vaddr return for the mapping and order in the lower bits
  */
 struct ttm_pool_page_dat {
+   struct ttm_pool *pool;
+   enum ttm_caching caching;
dma_addr_t addr;
unsigned long vaddr;
 };
@@ -71,13 +75,20 @@ static struct shrinker mm_shrinker;
 
 /* Allocate pages of size 1 << order with the given gfp_flags */
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
-   unsigned int order)
+   unsigned int order, enum ttm_caching caching)
 {
unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
struct ttm_pool_page_dat *dat;
struct page *p;
void *vaddr;
 
+   dat = kmalloc(sizeof(*dat), GFP_KERNEL);
+   if (!dat)
+   return NULL;
+
+   dat->pool = pool;
+   dat->caching = caching;
+
/* Don't set the __GFP_COMP flag for higher order allocations.
 * Mapping pages directly into an userspace process and calling
 * put_page() on a TTM allocated page is illegal.
@@ -88,15 +99,13 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 
if (!pool->use_dma_alloc) {
p = alloc_pages(gfp_flags, order);
-   if (p)
-   p->private = order;
+   if (!p)
+   goto error_free;
+   dat->vaddr = order;
+   p->private = (unsigned long)dat;
return p;
}
 
-   dat = kmalloc(sizeof(*dat), GFP_KERNEL);
-   if (!dat)
-   return NULL;
-
if (order)
attr |= DMA_ATTR_NO_WARN;
 
@@ -123,34 +132,34 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 }
 
 /* Reset the caching and pages of size 1 << order */
-static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
-  unsigned int order, struct page *p)
+static int ttm_pool_free_page(struct page *p, unsigned int order)
 {
unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
-   struct ttm_pool_page_dat *dat;
+   struct ttm_pool_page_dat *dat = (void *)p->private;
void *vaddr;
 
 #ifdef CONFIG_X86
/* We don't care that set_pages_wb is inefficient here. This is only
 * used when we have to shrink and CPU overhead is irrelevant then.
 */
-   if (caching != ttm_cached && !PageHighMem(p))
+   if (dat->caching != ttm_cached && !PageHighMem(p))
set_pages_wb(p, 1 << order);
 #endif
 
-   if (!pool || !pool->use_dma_alloc) {
+   if (!dat->pool || !dat->pool->use_dma_alloc) {
__free_pages(p, order);
-   return;
+   goto out;
}
 
if (order)
attr |= DMA_ATTR_NO_WARN;
 
-   dat = (void *)p->private;
vaddr = (void *)(dat->vaddr & PAGE_MASK);
-   dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
+   dma_free_attrs(dat->pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dat->addr,
   attr);
+out:
kfree(dat);
+   return 1 << order;
 }
 
 /* Apply a new caching to an array of pages */
@@ -264,7 +273,7 @@ static void ttm_pool_type_fini(struct ttm_pool_type *pt)
	mutex_unlock(&shrinker_lock);
 
	list_for_each_entry_safe(p, tmp, &pt->pages, lru)
-   ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
+   ttm_pool_free_page(p, pt->order);
 }
 
 /* Return the pool_type to use for the given caching and order */
@@ -307,7 +316,7 @@ static unsigned int ttm_pool_shrink(void)
 
p