On Thu, 8 Jan 2026 11:42:32 +0100 Boris Brezillon <[email protected]> wrote:
> On Thu, 8 Jan 2026 10:28:00 +0000
> Steven Price <[email protected]> wrote:
>
> > Hi Boris,
> >
> > Happy new year!
> >
> > On 08/01/2026 10:10, Boris Brezillon wrote:
> > > drm_gem_put_pages(), which we rely on for returning BO pages to shmem,
> > > assumes per-folio refcounting and not per-page. If we call
> > > shmem_read_mapping_page() per-page, we break this assumption and leak
> > > pages every time we get a huge page allocated.
> > >
> > > Cc: Loïc Molinari <[email protected]>
> > > Fixes: c12e9fcb5a5a ("drm/panfrost: Introduce huge tmpfs mountpoint option")
> > > Signed-off-by: Boris Brezillon <[email protected]>
> > > ---
> > >  drivers/gpu/drm/panfrost/panfrost_mmu.c | 37 +++++++++++++++++++++----
> > >  1 file changed, 31 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > index 8f3b7a7b6ad0..9b61ad98a77e 100644
> > > --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > @@ -595,6 +595,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >  	pgoff_t page_offset;
> > >  	struct sg_table *sgt;
> > >  	struct page **pages;
> > > +	u32 nr_pages;
> > >
> > >  	bomapping = addr_to_mapping(pfdev, as, addr);
> > >  	if (!bomapping)
> > > @@ -613,6 +614,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >  	addr &= ~((u64)SZ_2M - 1);
> > >  	page_offset = addr >> PAGE_SHIFT;
> > >  	page_offset -= bomapping->mmnode.start;
> > > +	nr_pages = bo->base.base.size >> PAGE_SHIFT;
> > >
> > >  	obj = &bo->base.base;
> > >
> > > @@ -626,8 +628,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >  		goto err_unlock;
> > >  	}
> > >
> > > -	pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
> > > -			       sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > > +	pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > >  	if (!pages) {
> > >  		kvfree(bo->sgts);
> > >  		bo->sgts = NULL;
> > > @@ -650,6 +651,8 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >  	mapping_set_unevictable(mapping);
> > >
> > >  	for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> > > +		struct folio *folio;
> > > +
> > >  		/* Can happen if the last fault only partially filled this
> > >  		 * section of the pages array before failing. In that case
> > >  		 * we skip already filled pages.
> > > @@ -657,12 +660,34 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > >  		if (pages[i])
> > >  			continue;
> > >
> > > -		pages[i] = shmem_read_mapping_page(mapping, i);
> > > -		if (IS_ERR(pages[i])) {
> > > -			ret = PTR_ERR(pages[i]);
> > > -			pages[i] = NULL;
> > > +		folio = shmem_read_folio(mapping, i);
> > > +		if (IS_ERR(folio)) {
> > > +			ret = PTR_ERR(folio);
> > >  			goto err_unlock;
> > >  		}
> > > +
> > > +		/* We always fill the page array at a folio granularity so
> > > +		 * there's no reason for a missing page to not be the first
> > > +		 * in the folio. We want to ensure that's the case to avoid
> > > +		 * unbalanced folio_{get,put}() leading to leaks, because
> > > +		 * drm_gem_put_pages() assumes per-folio refcounting not
> > > +		 * per-page.
> > > +		 */
> >
> > I'm a little uneasy about this reasoning. Above we do mask the address
> > to be a multiple of 2MB, but the folio could (in theory at least) be
> > bigger than 2MB.
> > So I don't see what stops a GPU job faulting in the middle of a
> > buffer and triggering this warning.
> >
> > Can we not walk backwards to the beginning of the folio if the
> > address isn't aligned and check that?
>
> Yep, I assumed the heap buffer would be faulted from bottom to top
> (which I think is a valid assumption for the tiler, but I might be
> wrong). I can certainly revisit the logic to map on both sides if
> we're in the middle of a folio and drop this WARN_ON().

Actually no, it can and will allocate from both ends, and if we cross
a 2M boundary that's backed by the same folio when doing a top->bottom
alloc, we'll hit this WARN_ON(). I'll send a v2 with a revisited for()
loop, as suggested.
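
Roughly, I'm thinking of something along these lines. This is an
untested sketch only, to show the direction for v2: it reuses nr_pages
from the patch above, the first/j locals are just placeholders, and the
handling of folios partially populated by an earlier fault is an
assumption on my side, not settled code:

	for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
		struct folio *folio;
		pgoff_t first, j;

		if (pages[i])
			continue;

		folio = shmem_read_folio(mapping, i);
		if (IS_ERR(folio)) {
			ret = PTR_ERR(folio);
			goto err_unlock;
		}

		/* Walk back to the first page of the folio so the fill
		 * below is always folio-aligned, no matter which end of
		 * the heap the fault came from.
		 */
		first = folio_index(folio);

		/* If an earlier fault already populated part of this
		 * folio, it also took the single reference that
		 * drm_gem_put_pages() will drop, so release the extra
		 * one we just got.
		 */
		if (first < i && pages[first])
			folio_put(folio);

		/* Fill every pages[] slot covered by the folio, clamped
		 * to the BO size, keeping exactly one reference per
		 * folio as drm_gem_put_pages() expects.
		 */
		for (j = first; j < first + folio_nr_pages(folio); j++) {
			if (j >= nr_pages)
				break;
			pages[j] = folio_page(folio, j - first);
		}
	}

The point is just to keep the refcounting per-folio regardless of the
fault order, which should let us drop the WARN_ON() since the fill no
longer cares whether the fault lands at the start or in the middle of
a folio.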
