On Mon, Jan 12, 2026 at 06:15:26PM -0500, Zi Yan wrote: > On 12 Jan 2026, at 16:49, Matthew Brost wrote: > > > On Mon, Jan 12, 2026 at 09:45:10AM -0400, Jason Gunthorpe wrote: > > > > Hi, catching up here. > > > >> On Sun, Jan 11, 2026 at 07:51:01PM -0500, Zi Yan wrote: > >>> On 11 Jan 2026, at 19:19, Balbir Singh wrote: > >>> > >>>> On 1/12/26 08:35, Matthew Wilcox wrote: > >>>>> On Sun, Jan 11, 2026 at 09:55:40PM +0100, Francois Dugast wrote: > >>>>>> The core MM splits the folio before calling folio_free, restoring the > >>>>>> zone pages associated with the folio to an initialized state (e.g., > >>>>>> non-compound, pgmap valid, etc...). The order argument represents the > >>>>>> folio’s order prior to the split which can be used driver side to know > >>>>>> how many pages are being freed. > >>>>> > >>>>> This really feels like the wrong way to fix this problem. > >>>>> > >>> > >>> Hi Matthew, > >>> > >>> I think the wording is confusing, since the actual issue is that: > >>> > >>> 1. zone_device_page_init() calls prep_compound_page() to form a large > >>> folio, > >>> 2. but free_zone_device_folio() never reverse the course, > >>> 3. the undo of prep_compound_page() in free_zone_device_folio() needs to > >>> be done before driver callback ->folio_free(), since once > >>> ->folio_free() > >>> is called, the folio can be reallocated immediately, > >>> 4. after the undo of prep_compound_page(), folio_order() can no longer > >>> provide > >>> the original order information, thus, folio_free() needs that for > >>> proper > >>> device side ref manipulation. > >> > >> There is something wrong with the driver if the "folio can be > >> reallocated immediately". > >> > >> The flow generally expects there to be a driver allocator linked to > >> folio_free() > >> > >> 1) Allocator finds free memory > >> 2) zone_device_page_init() allocates the memory and makes refcount=1 > >> 3) __folio_put() knows the recount 0. > >> 4) free_zone_device_folio() calls folio_free(), but it doesn't > >> actually need to undo prep_compound_page() because *NOTHING* can > >> use the page pointer at this point. > > > > Correct—nothing can use the folio prior to calling folio_free(). Once > > folio_free() returns, the driver side is free to immediately reallocate > > the folio (or a subset of its pages). > > > >> 5) Driver puts the memory back into the allocator and now #1 can > >> happen. It knows how much memory to put back because folio->order > >> is valid from #2 > >> 6) #1 happens again, then #2 happens again and the folio is in the > >> right state for use. The successor #2 fully undoes the work of the > >> predecessor #2. > >> > >> If you have races where #1 can happen immediately after #3 then the > >> driver design is fundamentally broken and passing around order isn't > >> going to help anything. > >> > > > > The above race does not exist; if it did, I agree we’d be solving > > nothing here. > > > >> If the allocator is using the struct page memory then step #5 should > >> also clean up the struct page with the allocator data before returning > >> it to the allocator. > >> > > > > We could move the call to free_zone_device_folio_prepare() [1] into the > > driver-side implementation of ->folio_free() and drop the order argument > > here. Zi didn’t particularly like that; he preferred calling > > free_zone_device_folio_prepare() [2] before invoking ->folio_free(), > > which is why this patch exists. > > On a second thought, if calling free_zone_device_folio_prepare() in > ->folio_free() works, feel free to do so. >
+1, testing this change right now and it does indeed work. Matt > > > > FWIW, I do not have a strong opinion here—either way works. Xe doesn’t > > actually need the order regardless of where > > free_zone_device_folio_prepare() is called, but Nouveau does need the > > order if free_zone_device_folio_prepare() is called before > > ->folio_free(). > > > > [1] https://patchwork.freedesktop.org/patch/697877/?series=159120&rev=4 > > [2] > > https://patchwork.freedesktop.org/patch/697709/?series=159120&rev=3#comment_1282405 > > > >> I vaugely remember talking about this before in the context of the Xe > >> driver.. You can't just take an existing VRAM allocator and layer it > >> on top of the folios and have it broadly ignore the folio_free > >> callback. > >> > > > > We are definitely not ignoring the ->folio_free callback—that is the > > point at which we tell our VRAM allocator (DRM buddy) it is okay to > > release the allocation and make it available for reuse. > > > > Matt > > > >> Jsaon > > > Best Regards, > Yan, Zi
