On Mon, Jan 26, 2026 at 12:21:17PM +0100, Jan Beulich wrote:
> On 22.01.2026 18:38, Roger Pau Monne wrote:
> > The current logic allows for up to 1G pages to be scrubbed in place, which
> > can cause the watchdog to trigger in practice.  Reduce the limit for
> > in-place scrubbed allocations to a newly introduced define:
> > CONFIG_DIRTY_MAX_ORDER.  This currently defaults to CONFIG_PTDOM_MAX_ORDER
> > on all architectures.  Also introduce a command line option to set the
> > value.
> > 
> > Fixes: 74d2e11ccfd2 ("mm: Scrub pages in alloc_heap_pages() if needed")
> > Signed-off-by: Roger Pau Monné <[email protected]>
> 
> Apart from a nit (see below) looks technically okay to me now. Still I have
> an uneasy feeling about introducing such a restriction, so I'm (still)
> hesitant to ack the change.

OK, I understand that, and I'm not going to argue there's no risk.
Overall, even if this commit is not fully correct, it's a step in the
right direction IMO: we need to limit such allocations.  For callers
that legitimately need bigger orders we will have to add preemptive
scrubbing, like we do for populate physmap (rough sketch below).
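
For illustration, a minimal sketch of what I have in mind (not the
actual implementation; it assumes the existing scrub_one_page() and
process_pending_softirqs() helpers, and the function name and yield
interval are made up):

    /* Scrub a high-order allocation preemptively, yielding to softirqs. */
    static void scrub_order_preemptible(struct page_info *pg,
                                        unsigned int order)
    {
        unsigned long i;

        for ( i = 0; i < (1UL << order); i++ )
        {
            scrub_one_page(&pg[i]);
            /* Process softirqs periodically so the watchdog can't trigger. */
            if ( !(i & 0xff) )
                process_pending_softirqs();
        }
    }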

> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -267,6 +267,13 @@ static PAGE_LIST_HEAD(page_offlined_list);
> >  /* Broken page list, protected by heap_lock. */
> >  static PAGE_LIST_HEAD(page_broken_list);
> >  
> > +/* Maximum order allowed for allocations with MEMF_no_scrub. */
> > +#ifndef CONFIG_DIRTY_MAX_ORDER
> > +# define CONFIG_DIRTY_MAX_ORDER CONFIG_PTDOM_MAX_ORDER
> > +#endif
> > +static unsigned int __ro_after_init dirty_max_order = CONFIG_DIRTY_MAX_ORDER;
> > +integer_param("max-order-dirty", dirty_max_order);
> 
> The comment may want to mention "post-boot", to account for ...
> 
> > @@ -1008,7 +1015,13 @@ static struct page_info *alloc_heap_pages(
> >  
> >      pg = get_free_buddy(zone_lo, zone_hi, order, memflags, d);
> >      /* Try getting a dirty buddy if we couldn't get a clean one. */
> > -    if ( !pg && !(memflags & MEMF_no_scrub) )
> > +    if ( !pg && !(memflags & MEMF_no_scrub) &&
> > +         /*
> > +          * Allow any order unscrubbed allocations during boot time, we
> > +          * compensate by processing softirqs in the scrubbing loop below once
> > +          * irqs are enabled.
> > +          */
> > +         (order <= dirty_max_order || system_state < SYS_STATE_active) )
> 
> ... the system_state check here.

Added.
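
FWIW, the adjusted comment now reads along the lines of:

    /* Maximum order allowed for post-boot allocations with MEMF_no_scrub. */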
