On Mon, Jan 19, 2026 at 03:09:00PM -0500, Zi Yan wrote:
> > diff --git a/mm/internal.h b/mm/internal.h
> > index e430da900430a1..a7d3f5e4b85e49 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -806,14 +806,21 @@ static inline void prep_compound_head(struct page *page, unsigned int order)
> > atomic_set(&folio->_pincount, 0);
> > atomic_set(&folio->_entire_mapcount, -1);
> > }
> > - if (order > 1)
> > + if (order > 1) {
> > INIT_LIST_HEAD(&folio->_deferred_list);
> > + } else {
> > + folio->mapping = NULL;
> > +#ifdef CONFIG_MEMCG
> > + folio->memcg_data = 0;
> > +#endif
> > + }
>
> prep_compound_head() is only called on >0 order pages. The above
> code means when order == 1, folio->mapping and folio->memcg_data are
> assigned NULL.
OK, fair enough. The conditionals would have to change, and maybe it
shouldn't be called "compound_head" if it also cleans up normal pages.
> > static inline void prep_compound_tail(struct page *head, int tail_idx)
> > {
> > struct page *p = head + tail_idx;
> >
> > + p->flags.f &= ~0xffUL; /* Clear possible order, page head */
>
> No one cares about tail page flags if it is not checked in check_new_page()
> from mm/page_alloc.c.
At least page_fixed_fake_head() does check PG_head in some
configurations. It does seem safer to clear it. Possibly order is
never used, but it is free to clear it.
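To illustrate what that mask does, here is a small userspace sketch,
not kernel code. It assumes, as the comment in the quoted hunk suggests,
that any stale order and head state sits in the low eight bits of the
flags word. FLAG_DEMO_HEAD and demo_clear_low_flags() are made-up names,
and the bit position is only illustrative:

```c
#include <assert.h>

/* Hypothetical bit position, for illustration only; not the kernel layout. */
#define FLAG_DEMO_HEAD	(1UL << 6)
#define FLAG_DEMO_LOW	0xffUL

/* Clear the low eight bits and leave every higher bit untouched. */
static unsigned long demo_clear_low_flags(unsigned long flags)
{
	return flags & ~FLAG_DEMO_LOW;
}
```

Bits above the low byte survive the mask, so anything else stored in
the flags word is preserved.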
> > - if (order)
> > - prep_compound_page(page, order);
> > + prep_compound_page(page, order);
>
> prep_compound_page() should only be called for >0 order pages. This creates
> another weirdness in device pages by assuming all pages are
> compound.
OK
> > + folio = page_folio(page);
> > + folio->pgmap = pgmap;
> > + folio_lock(folio);
> > + folio_set_count(folio, 1);
>
> /* clear possible previous page->mapping */
> folio->mapping = NULL;
>
> /* clear possible previous page->_nr_pages */
> #ifdef CONFIG_MEMCG
> folio->memcg_data = 0;
> #endif
This is reasonable too, but prep_compound_head() was doing more than
that: it also clears the order, and this path needs to clear the head
bit as well. That's why it was appealing to reuse those functions, but
you are right that they are not ideal.
I suppose we want some prep_single_page(page) and some reorg to share
code with the other prep function.
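Roughly the shape I have in mind, as a userspace sketch only. struct
sketch_page, SKETCH_HEAD and all three functions are hypothetical
stand-ins, not the real struct page or existing mm helpers, and the low
byte holding the order/head state is an assumption taken from the
comment in the patch:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_HEAD	(1UL << 6)	/* hypothetical head bit */

/* Simplified stand-in for the few fields discussed here. */
struct sketch_page {
	unsigned long flags;
	void *mapping;
	unsigned long memcg_data;
};

/* Shared step: drop state left over from a previous use of the page. */
static void sketch_reset_stale(struct sketch_page *p)
{
	p->mapping = NULL;
	p->memcg_data = 0;
	p->flags &= ~0xffUL;	/* assumed location of stale order/head bits */
}

/* Hypothetical prep_single_page(): the order-0 case. */
static void sketch_prep_single_page(struct sketch_page *p)
{
	sketch_reset_stale(p);
}

/* Hypothetical head prep for order > 0: same reset, then mark the head. */
static void sketch_prep_head(struct sketch_page *p)
{
	sketch_reset_stale(p);
	p->flags |= SKETCH_HEAD;
}
```

The point is only that both paths go through one reset helper, so the
order-0 case does not have to pretend to be a compound page.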
> This patch mixed the concept of page and folio together, thus
> causing confusion. Core MM sees page and folio two separate things:
> 1. page is the smallest internal physical memory management unit,
> 2. folio is an abstraction on top of pages, and other abstractions can be
> slab, ptdesc, and more (https://kernelnewbies.org/MatthewWilcox/Memdescs).
I think the users of zone_device_page_init() are principally trying to
create something that can be installed in a non-special PTE, meaning
the output is always a folio because it is going to be read as a folio
in the page walkers.
Thus, the job of this function is to take the range of 2^order pages
starting at page and turn it into a single valid folio with a refcount
of 1.
> If device pages have to initialize on top of pages with obsolete states,
> at least it should be first initialized as pages, then as folios to avoid
> confusion.
I don't think so. It should do the above job efficiently and iterate
over the page list exactly once.
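As a userspace sketch of that single pass (not the actual kernel
implementation): struct range_page, RANGE_HEAD and range_init_folio()
are hypothetical names, and the low-byte mask is again an assumption
borrowed from the patch comment:

```c
#include <assert.h>
#include <stddef.h>

#define RANGE_HEAD	(1UL << 6)	/* hypothetical head bit */

/* Simplified stand-in for struct page. */
struct range_page {
	unsigned long flags;
	int refcount;
	struct range_page *head;	/* NULL on the head page itself */
};

/* Turn 2^order pages starting at page into one folio-like unit. */
static void range_init_folio(struct range_page *page, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	/* One loop: each page is visited exactly once. */
	for (size_t i = 0; i < nr; i++) {
		struct range_page *p = &page[i];

		p->flags &= ~0xffUL;	/* drop stale order/head state */
		if (i == 0) {
			if (order)
				p->flags |= RANGE_HEAD;
			p->head = NULL;
			p->refcount = 1;	/* the folio holds the only ref */
		} else {
			p->head = page;
			p->refcount = 0;
		}
	}
}
```

With order == 0 the loop body runs once and only the head branch is
taken, so the single-page and compound cases share the same walk.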
Jason