On 27.01.2026 16:01, Roger Pau Monné wrote:
> On Tue, Jan 27, 2026 at 12:06:32PM +0100, Jan Beulich wrote:
>> On 27.01.2026 11:40, Roger Pau Monné wrote:
>>> On Mon, Jan 26, 2026 at 12:14:35PM +0100, Jan Beulich wrote:
>>>> On 22.01.2026 18:38, Roger Pau Monne wrote:
>>>>> --- a/xen/common/memory.c
>>>>> +++ b/xen/common/memory.c
>>>>> @@ -159,6 +159,66 @@ static void increase_reservation(struct memop_args *a)
>>>>>      a->nr_done = i;
>>>>>  }
>>>>>  
>>>>> +/*
>>>>> + * Temporary storage for a domain-assigned page that's not been fully scrubbed.
>>>>> + * Stored pages must be domheap ones.
>>>>> + *
>>>>> + * The stashed page can be freed at any time by Xen; the caller must pass the
>>>>> + * order and NUMA node requirement to the fetch function to ensure the
>>>>> + * currently stashed page matches its requirements.
>>>>> + */
>>>>> +static void stash_allocation(struct domain *d, struct page_info *page,
>>>>> +                             unsigned int order, unsigned int scrub_index)
>>>>> +{
>>>>> +    rspin_lock(&d->page_alloc_lock);
>>>>> +
>>>>> +    /*
>>>>> +     * Drop any stashed allocation to accommodate the current one.  This
>>>>> +     * interface is designed to be used for single-threaded domain creation.
>>>>> +     */
>>>>> +    if ( d->pending_scrub )
>>>>> +        free_domheap_pages(d->pending_scrub, d->pending_scrub_order);
>>>>
>>>> Didn't you indicate you'd move the freeing ...
>>>>
>>>>> +    d->pending_scrub_index = scrub_index;
>>>>> +    d->pending_scrub_order = order;
>>>>> +    d->pending_scrub = page;
>>>>> +
>>>>> +    rspin_unlock(&d->page_alloc_lock);
>>>>> +}
>>>>> +
>>>>> +static struct page_info *get_stashed_allocation(struct domain *d,
>>>>> +                                                unsigned int order,
>>>>> +                                                nodeid_t node,
>>>>> +                                                unsigned int *scrub_index)
>>>>> +{
>>>>
>>>> ... into this function?
>>>
>>> I could add freeing to get_stashed_allocation(), but it seems
>>> pointless, because the freeing in stash_allocation() will have to stay
>>> to deal with concurrent callers.  Even if a context frees the stashed
>>> page in get_stashed_allocation() there's no guarantee the field will
>>> still be free when stash_allocation() is called, as another concurrent
>>> thread might have stashed a page in the meantime.
>>
>> Hmm, yes, yet still ...
>>
>>> I think it's best to consistently do it only in stash_allocation(), as
>>> that's clearer.
>>
>> ... no, as (to me) "clearer" is only a secondary criterion here. What I'm
>> worried about is potentially holding back a 1GB page when the new request is,
>> say, a 2MB one, and then not having enough memory available just because
>> of that detained huge page.
> 
> If that's really the case then either the caller is using a broken
> toolstack that's making bogus populate physmap calls, or the caller is
> attempting to populate the physmap in parallel and hasn't properly
> checked whether there's enough free memory in the system.  In the
> latter case the physmap population would end up failing anyway.
> 
>> In fact, if stash_allocation() finds the field re-populated despite
>> get_stashed_allocation() having cleared it, it's not quite clear which
>> of the two allocations should actually be undone. The other vCPU may be
>> quicker in retrying, and to avoid ping-pong freeing the new (local)
>> allocation rather than stashing it might possibly be better. Thoughts?
> 
> TBH I didn't give it much thought, as in any case progression when
> attempting to populate the physmap in parallel will be far from
> optimal.  If you prefer I can switch to the approach where the freeing
> of the stashed page is done in get_stashed_allocation() and
> stash_allocation() instead frees the current one if it finds the field
> is already in use.

I'd prefer that, yes. Of course if others were to agree with your take ...

Jan
