On 12.07.2022 09:01, Jan Beulich wrote:
> On 11.07.2022 18:21, Anthony PERARD wrote:
>> On Fri, Jul 08, 2022 at 03:39:38PM +0200, Jan Beulich wrote:
>>> While in principle possible also under other conditions as long as other
>>> parallel operations potentially consuming memory aren't "locked out", in
>>> particular with IOMMU large page mappings used in Dom0 (for PV when in
>>> strict mode; for PVH when not sharing page tables with HAP) ballooning
>>> out of individual pages can actually lead to less free memory available
>>> afterwards. This is because to split a large page, one or more page
>>> table pages are necessary (one per level that is split).
>>>
>>> When rebooting a guest I've observed freemem() to fail: A single page
>>> was required to be ballooned out (presumably because of heap
>>> fragmentation in the hypervisor). This ballooning out of a single page
>>> of course went fast, but freemem() then found that it would require to
>>> balloon out another page. This repeating just another time leads to the
>>> function to signal failure to the caller - without having come anywhere
>>> near the designated 30s that the whole process is allowed to not make
>>> any progress at all.
>>>
>>> Convert from a simple retry count to actually calculating elapsed time,
>>> subtracting from an initial credit of 30s. Don't go as far as limiting
>>> the "wait_secs" value passed to libxl_wait_for_memory_target(), though.
>>> While this leads to the overall process now possibly taking longer (if
>>> the previous iteration ended very close to the intended 30s), this
>>> compensates to some degree for the value passed really meaning "allowed
>>> to run for this long without making progress".
>>>
>>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>>> ---
>>> I further wonder whether the "credit expired" loop exit wouldn't better
>>> be moved to the middle of the loop, immediately after "return true".
>>> That way having reached the goal on the last iteration would be reported
>>> as success to the caller, rather than as "timed out".
>>
>> That would sound like a good improvement to the patch.
> 
> Oh. I would have made it a separate one, if deemed sensible. Order
> shouldn't matter as I'd consider both backporting candidates.

Except, of course, if the change here is controversial or needs a lot
of further refinement, in which case the other one may better go first.
Please let me know ...

Jan

Reply via email to