On 1/16/26 12:10, Francois Dugast wrote:
> From: Matthew Brost <[email protected]>
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 63c6ab4fdf08..ac7be07e3361 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -477,10 +477,43 @@ void free_zone_device_folio(struct folio *folio)
>       }
>  }
>  
> -void zone_device_page_init(struct page *page, unsigned int order)
> +void zone_device_page_init(struct page *page, struct dev_pagemap *pgmap,
> +                        unsigned int order)
>  {
> +     struct page *new_page = page;
> +     unsigned int i;
> +
>       VM_WARN_ON_ONCE(order > MAX_ORDER_NR_PAGES);
>  
> +     for (i = 0; i < (1UL << order); ++i, ++new_page) {
> +             struct folio *new_folio = (struct folio *)new_page;
> +
> +             /*
> +              * new_page could have been part of a previous higher order
> +              * folio, which encodes the order, in page + 1, in the flags
> +              * bits. We blindly clear the bits which could have set the
> +              * order field here, including page head.
> +              */
> +             new_page->flags.f &= ~0xffUL;   /* Clear possible order, page head */
> +
> +#ifdef NR_PAGES_IN_LARGE_FOLIO
> +             /*
> +              * This pointer math looks odd, but new_page could have been
> +              * part of a previous higher order folio, which sets _nr_pages
> +              * in page + 1 (new_page). Therefore, we use pointer casting to
> +              * correctly locate the _nr_pages bits within new_page, which
> +              * could have been modified by a previous higher order folio.
> +              */
> +             ((struct folio *)(new_page - 1))->_nr_pages = 0;
> +#endif
> +
> +             new_folio->mapping = NULL;
> +             new_folio->pgmap = pgmap;       /* Also clear compound head */
> +             new_folio->share = 0;   /* fsdax only, unused for device private */
> +             VM_WARN_ON_FOLIO(folio_ref_count(new_folio), new_folio);
> +             VM_WARN_ON_FOLIO(!folio_is_zone_device(new_folio), new_folio);
> +     }
> +
>       /*
>        * Drivers shouldn't be allocating pages after calling
>        * memunmap_pages().

Can't say I'm a fan of this. It probably works now (so I'm not nacking), but it
seems rather fragile. It seems likely to me that somebody will change some
implementation detail in the page allocator (say, where the folio order or
_nr_pages are stored) and not notice that it breaks this. I hope we can
eventually get to something more robust.

