On Thu, 8 Jan 2026 11:42:32 +0100
Boris Brezillon <[email protected]> wrote:

> On Thu, 8 Jan 2026 10:28:00 +0000
> Steven Price <[email protected]> wrote:
> 
> > Hi Boris,
> > 
> > Happy new year!
> > 
> > On 08/01/2026 10:10, Boris Brezillon wrote:  
> > > drm_gem_put_pages(), which we rely on for returning BO pages to shmem,
> > > assumes per-folio refcounting and not per-page. If we call
> > > shmem_read_mapping_page() per page, we break this assumption and leak
> > > pages every time a huge page gets allocated.
> > > 
> > > Cc: Loïc Molinari <[email protected]>
> > > Fixes: c12e9fcb5a5a ("drm/panfrost: Introduce huge tmpfs mountpoint option")
> > > Signed-off-by: Boris Brezillon <[email protected]>
> > > ---
> > >  drivers/gpu/drm/panfrost/panfrost_mmu.c | 37 +++++++++++++++++++++----
> > >  1 file changed, 31 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > index 8f3b7a7b6ad0..9b61ad98a77e 100644
> > > --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > @@ -595,6 +595,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >   pgoff_t page_offset;
> > >   struct sg_table *sgt;
> > >   struct page **pages;
> > > + u32 nr_pages;
> > >  
> > >   bomapping = addr_to_mapping(pfdev, as, addr);
> > >   if (!bomapping)
> > > @@ -613,6 +614,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >   addr &= ~((u64)SZ_2M - 1);
> > >   page_offset = addr >> PAGE_SHIFT;
> > >   page_offset -= bomapping->mmnode.start;
> > > + nr_pages = bo->base.base.size >> PAGE_SHIFT;
> > >  
> > >   obj = &bo->base.base;
> > >  
> > > @@ -626,8 +628,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >                   goto err_unlock;
> > >           }
> > >  
> > > -         pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
> > > -                                sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > > +         pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > >           if (!pages) {
> > >                   kvfree(bo->sgts);
> > >                   bo->sgts = NULL;
> > > @@ -650,6 +651,8 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >   mapping_set_unevictable(mapping);
> > >  
> > >   for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> > > +         struct folio *folio;
> > > +
> > >           /* Can happen if the last fault only partially filled this
> > >            * section of the pages array before failing. In that case
> > >            * we skip already filled pages.
> > > @@ -657,12 +660,34 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >           if (pages[i])
> > >                   continue;
> > >  
> > > -         pages[i] = shmem_read_mapping_page(mapping, i);
> > > -         if (IS_ERR(pages[i])) {
> > > -                 ret = PTR_ERR(pages[i]);
> > > -                 pages[i] = NULL;
> > > +         folio = shmem_read_folio(mapping, i);
> > > +         if (IS_ERR(folio)) {
> > > +                 ret = PTR_ERR(folio);
> > >                   goto err_unlock;
> > >           }
> > > +
> > > +         /* We always fill the page array at folio granularity, so
> > > +          * a missing page should always be the first page of its
> > > +          * folio. We want to ensure that's the case to avoid
> > > +          * unbalanced folio_{get,put}() leading to leaks, because
> > > +          * drm_gem_put_pages() assumes per-folio refcounting, not
> > > +          * per-page.
> > > +          */
> > 
> > I'm a little uneasy about this reasoning. Above we do mask the address
> > to be a multiple of 2MB, but the folio could (in theory at least) be
> > bigger than 2MB. So I don't see what stops a GPU job faulting in the
> > middle of a buffer and triggering this warning.
> > 
> > Can we not walk backwards to the beginning of the folio if the address
> > isn't aligned and check that?  
> 
> Yep, I assumed the heap buffer would be faulted from bottom to top
> (which I think is a valid assumption for the tiler, but I might be
> wrong). I can certainly revisit the logic to map on both sides if we're
> in the middle of a folio and drop this WARN_ON().

Actually no, it can and will allocate from both ends, and if we cross a
2M boundary that's backed by the same folio when doing a top->bottom
allocation, we'll hit this WARN_ON(). I'll send a v2 with a revisited
for() loop, as suggested.
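
Roughly what I have in mind for the v2 loop (untested sketch, so the
exact folio helpers might need adjusting): fill the pages array at
folio granularity, walking back to the start of the folio when the
faulting index lands in the middle of one:

	for (i = page_offset; i < page_offset + NUM_FAULT_PAGES;) {
		struct folio *folio;
		pgoff_t first;
		long npages, j;

		/* Skip slots filled by a previous partial fault. */
		if (pages[i]) {
			i++;
			continue;
		}

		folio = shmem_read_folio(mapping, i);
		if (IS_ERR(folio)) {
			ret = PTR_ERR(folio);
			goto err_unlock;
		}

		/* The folio may start before (and extend past) the 2M
		 * section we're filling, so walk back to its first page
		 * and populate every slot it covers. Because we always
		 * fill whole folios, a NULL slot means none of this
		 * folio's slots are populated yet, and the single
		 * reference returned by shmem_read_folio() matches the
		 * per-folio put done by drm_gem_put_pages().
		 */
		first = folio->index;
		npages = folio_nr_pages(folio);
		for (j = 0; j < npages && first + j < nr_pages; j++)
			pages[first + j] = folio_page(folio, j);

		i = first + npages;
	}

That would also let us drop the WARN_ON() entirely, since we'd no
longer assume the fault address matches the start of a folio.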
