On 12 Jan 2026, at 11:50, Jason Gunthorpe wrote:

> On Mon, Jan 12, 2026 at 11:31:04AM -0500, Zi Yan wrote:
>>> folio_free()
>>>
>>> 1) Allocator finds free memory
>>> 2) zone_device_page_init() allocates the memory and makes refcount=1
>>> 3) __folio_put() knows the refcount 0.
>>> 4) free_zone_device_folio() calls folio_free(), but it doesn't
>>>    actually need to undo prep_compound_page() because *NOTHING* can
>>>    use the page pointer at this point.
>>> 5) Driver puts the memory back into the allocator and now #1 can
>>>    happen. It knows how much memory to put back because folio->order
>>>    is valid from #2
>>> 6) #1 happens again, then #2 happens again and the folio is in the
>>>    right state for use. The successor #2 fully undoes the work of the
>>>    predecessor #2.
>>
>> But how can a successor #2 undo the work if the second #1 only allocates
>> half of the original folio? For example, an order-9 at PFN 0 is
>> allocated and freed, then an order-8 at PFN 0 is allocated and another
>> order-8 at PFN 256 is allocated. How can two #2s undo the same order-9
>> without corrupting each other’s data?
>
> What do you mean? The fundamental rule is you can't read the folio or
> the order outside folio_free once its refcount reaches 0.

There is no such rule. In core MM, folio_split(), which splits a high-order
folio into lower-order ones, freezes the folio (setting its refcount to 0) and
manipulates the folio order and the compound_head of all tail pages to
restructure the folio. Your fundamental rule breaks this. Allowing compound
information to persist after a folio is freed means you cannot tell whether a
folio is being split or has been freed.

>
> So the successor #2 will write updated heads and order to the order 8
> pages at PFN 0 and the ones starting at PFN 256 will remain with
> garbage.
>
> This is OK because nothing is allowed to read them as their refcount
> is 0.
>
> If later PFN256 is allocated then it will get updated head and order
> at the same time its refcount becomes 1.
>
> There is no corruption and they don't corrupt each other's data.
>
>>> If the allocator is using the struct page memory then step #5 should
>>> also clean up the struct page with the allocator data before returning
>>> it to the allocator.
>>
>> Do you mean ->folio_free() callback should undo prep_compound_page()
>> instead?
>
> I wouldn't say undo, I was very careful to say it needs to get the
> struct page memory into a state that the allocator algorithm expects,
> whatever that means.
>
> Jason


Best Regards,
Yan, Zi
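
For reference, the order-9 / order-8 example discussed above can be sketched as a toy user-space model. This is only an illustration under stated assumptions: `struct toy_page` and every `toy_*` function are hypothetical names, not kernel APIs, and the struct is not the real `struct page` layout. `toy_page_init()` stands in for step #2, and `toy_folio_put()` stands in for steps #3 to #5 with compound metadata deliberately left in place.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_PAGES 512u

/* Hypothetical stand-in for struct page; NOT the kernel layout. */
struct toy_page {
	int refcount;        /* 1 on a live head page, 0 otherwise */
	unsigned int order;  /* only meaningful on a head page */
	size_t head;         /* index of the head page (compound_head analogue) */
};

static struct toy_page toy_pages[TOY_NR_PAGES];

/* Analogue of step #2: write head/order for the whole range, refcount=1. */
static void toy_page_init(size_t pfn, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	assert(pfn + nr <= TOY_NR_PAGES);
	for (size_t i = 0; i < nr; i++) {
		toy_pages[pfn + i].head = pfn;
		toy_pages[pfn + i].order = 0;
		toy_pages[pfn + i].refcount = 0;
	}
	toy_pages[pfn].order = order;
	toy_pages[pfn].refcount = 1;
}

/* Analogue of steps #3 to #5: drop the last reference, keep stale metadata. */
static void toy_folio_put(size_t pfn)
{
	assert(toy_pages[pfn].refcount == 1);
	toy_pages[pfn].refcount = 0;
	/* Deliberately no undo of head/order here. */
}

/* Count pages whose recorded head does not actually cover them. */
static size_t toy_count_stale(void)
{
	size_t stale = 0;

	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		size_t h = toy_pages[i].head;
		size_t end = h + ((size_t)1 << toy_pages[h].order);

		if (toy_pages[h].refcount == 0 || i >= end)
			stale++;
	}
	return stale;
}

/* Order-9 at PFN 0 allocated and freed, then order-8 at PFN 0 allocated. */
static size_t toy_run_scenario(void)
{
	toy_page_init(0, 9);
	toy_folio_put(0);
	toy_page_init(0, 8);
	return toy_count_stale();
}
```

In this model, after the order-8 allocation at PFN 0, the pages at PFN 256 and above still record PFN 0 as their head even though that head no longer covers them. They only become consistent once `toy_page_init(256, 8)` runs. Whether any reader may legitimately look at those pages while their refcount is 0 is the point under dispute; the model does not settle it.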
