The following patch-set proposes an efficient mechanism for handing freed 
memory between the guest and the host. It enables guests with no page cache 
to rapidly free memory to, and reclaim memory from, the host.

Benefit:
With this patch-series, in our test case, executed on a single system and 
single NUMA node with 15GB memory, we were able to successfully launch at least 
5 guests when page hinting was enabled and 3 without it. (A detailed 
explanation of the test procedure is provided at the bottom.)

Changelog in V8:
In this patch-series, the earlier approach [1] which was used to capture and 
scan the pages freed by the guest has been changed. The new approach is briefly 
described below:

The patch-set still leverages the existing arch_free_page() to add this 
functionality. It maintains a per-CPU array which stores the pages freed by 
the guest. The maximum number of entries it can hold is defined by 
MAX_FGPT_ENTRIES (1000). When the array is completely filled, it is scanned 
and only the pages which are still available in the buddy are kept. This 
process continues until the array is filled with pages which are part of the 
buddy free list, after which a per-CPU kernel thread is woken up.
This per-CPU kernel thread rescans the per-CPU array for any re-allocation. 
If a page has not been reallocated and is still present in the buddy, the 
kernel thread attempts to isolate it from the buddy. If it is successfully 
isolated, the page is added to another per-CPU array. Once the entire scanning 
process is complete, all the isolated pages are reported to the host through 
the existing virtio-balloon driver.
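
The capture and isolation steps described above can be sketched in plain 
userspace C. This is only an illustration of the flow, not the code from the 
patches: struct fgpt_array, fgpt_compact(), fgpt_capture(), fgpt_isolate() and 
the demo_* callbacks are hypothetical names, and the buddy lookup and the 
isolation are modelled as callbacks. Only MAX_FGPT_ENTRIES and its value come 
from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FGPT_ENTRIES 1000   /* value taken from the description above */

/* Hypothetical stand-ins for "is this page still in the buddy free list?"
 * and "try to isolate this page from the buddy". */
typedef bool (*page_pred_fn)(unsigned long pfn);

struct fgpt_array {
    unsigned long pfn[MAX_FGPT_ENTRIES];
    size_t count;
};

/* Demo callbacks (assumptions for illustration only). */
static bool demo_even_is_free(unsigned long pfn) { return pfn % 2 == 0; }
static bool demo_isolate_ok(unsigned long pfn) { (void)pfn; return true; }

/* Keep only the entries that are still free; return how many remain. */
static size_t fgpt_compact(struct fgpt_array *a, page_pred_fn in_buddy)
{
    size_t kept = 0;

    for (size_t i = 0; i < a->count; i++)
        if (in_buddy(a->pfn[i]))
            a->pfn[kept++] = a->pfn[i];
    a->count = kept;
    return kept;
}

/* Record one freed page. Returns true once the array is full of pages that
 * are all still free, i.e. the point where the per-CPU thread would be woken.
 * The caller must drain the array (fgpt_isolate) before capturing again. */
static bool fgpt_capture(struct fgpt_array *a, unsigned long pfn,
                         page_pred_fn in_buddy)
{
    assert(a->count < MAX_FGPT_ENTRIES);
    a->pfn[a->count++] = pfn;
    if (a->count < MAX_FGPT_ENTRIES)
        return false;
    return fgpt_compact(a, in_buddy) == MAX_FGPT_ENTRIES;
}

/* Thread side: rescan, isolate what is still free into a second array, and
 * empty the capture array. Returns the number of pages ready to report. */
static size_t fgpt_isolate(struct fgpt_array *captured,
                           struct fgpt_array *isolated,
                           page_pred_fn in_buddy, page_pred_fn try_isolate)
{
    isolated->count = 0;
    for (size_t i = 0; i < captured->count; i++) {
        unsigned long pfn = captured->pfn[i];

        if (in_buddy(pfn) && try_isolate(pfn))
            isolated->pfn[isolated->count++] = pfn;
    }
    captured->count = 0;
    return isolated->count;
}
```

In this sketch the wake-up is just a boolean return value; the locking and the 
actual reporting through virtio-balloon are left out.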

Known Issues:
        * Fixed array size: The problem with having a fixed/hardcoded array 
size arises when the size of the guest varies. For example, when the guest size 
increases and it starts making large allocations, the fixed size limits this 
solution's ability to capture all the freed pages. This results in less 
guest free memory getting reported to the host.

Known code re-work:
        * Plan to re-use Wei's work, which communicates the poison value to the 
host.
        * The nomenclature used in virtio-balloon needs to be changed so that 
the code can easily be distinguished from Wei's Free Page Hint code.
        * Sorting based on zonenum, to avoid repetitive zone locks for the same 
zone.
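
To illustrate why sorting by zonenum would help, here is a small userspace C 
sketch. It assumes the zone lock is taken once per run of consecutive entries 
that belong to the same zone; struct hint_entry, sort_by_zone() and 
zone_lock_acquisitions() are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical entry: a freed page and the zone number it belongs to. */
struct hint_entry {
    unsigned long pfn;
    int zonenum;
};

static int cmp_zone(const void *x, const void *y)
{
    const struct hint_entry *a = x, *b = y;

    return (a->zonenum > b->zonenum) - (a->zonenum < b->zonenum);
}

/* Group entries of the same zone together. */
static void sort_by_zone(struct hint_entry *e, size_t n)
{
    qsort(e, n, sizeof(*e), cmp_zone);
}

/* Count lock acquisitions, assuming one lock/unlock pair per run of
 * consecutive entries with the same zonenum. */
static size_t zone_lock_acquisitions(const struct hint_entry *e, size_t n)
{
    size_t locks = 0;

    for (size_t i = 0; i < n; i++)
        if (i == 0 || e[i].zonenum != e[i - 1].zonenum)
            locks++;
    return locks;
}
```

With entries whose zones alternate, the unsorted array needs one acquisition 
per entry, while the sorted array needs one per distinct zone.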

Other required work:
        * Run other benchmarks to evaluate the performance/impact of this 
approach.

Test case:
Setup:
Memory: 15837 MB
Guest Memory Size: 5 GB
Swap: Disabled
Test Program: Simple program which allocates 4GB memory via malloc, touches it 
via memset and exits.
Use case: Number of guests that can be launched completely including the 
successful execution of the test program.
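
For reference, a minimal sketch of what such a test program could look like 
(this is not the exact program used for the numbers in this mail). The helper 
name alloc_and_touch(), the fill byte and the volatile read-back are 
assumptions; the read-back is only there so that the compiler does not 
optimize the allocation and the memset away.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Allocate 'bytes' with malloc, touch every byte with memset, then free.
 * Returns 0 on success and -1 if the allocation fails or the read-back
 * does not match. */
static int alloc_and_touch(size_t bytes)
{
    unsigned char *buf;
    volatile unsigned char *probe;
    int ok;

    if (bytes == 0)
        return -1;
    buf = malloc(bytes);
    if (!buf)
        return -1;

    memset(buf, 0xA5, bytes);   /* fill value is arbitrary */

    /* Read back through a volatile pointer to keep the stores alive. */
    probe = buf;
    ok = probe[0] == 0xA5 && probe[bytes - 1] == 0xA5;

    free(buf);
    return ok ? 0 : -1;
}
```

On a 64-bit build, a main() that returns alloc_and_touch((size_t)4 << 30) 
would issue the 4 GB request described above and then exit.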
Procedure: 
The first guest is launched and, once its console is up, the test allocation 
program is executed with a 4 GB memory request (due to this, the guest occupies 
almost 4-5 GB of memory in the host on a system without page hinting). Once 
this program exits, another guest is launched in the host and the same process 
is followed. We continue launching guests until a guest gets killed due to a 
low-memory condition in the host.

Result:
Without Hinting: 3 guests
With Hinting: 5 to 7 guests (based on the amount of memory freed/captured).

[1] https://www.spinics.net/lists/kvm/msg170113.html 

