Xen's current interface for memory ballooning is quite a mess. It has
several shortcomings which should be addressed. After a discussion on
IRC there was consensus that we should try to design a new interface
addressing the current and probable future needs.

Current interface
-----------------
A guest has access to the following memory-related information (all
for x86; a sketch of reading the Xenstore entries follows the list):

- the memory map (E820 or EFI)
- ACPI tables for HVM/PVH guests
- actual maximum size via the XENMEM_maximum_reservation hypercall
  (the hypervisor will deny attempts by the guest to allocate more)
- current size via the XENMEM_current_reservation hypercall
- Xenstore entry "memory/static-max" for the upper bound of memory size
  (information for the guest which memory size might be reached without
  hotplugging memory)
- Xenstore entry "memory/target" for current target size (used for
  ballooning: Xen tools set the size the guest should try to reach by
  allocating or releasing memory)
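
For illustration, here is a minimal guest-side sketch reading the two
Xenstore entries via libxenstore (the hypercall values are only
reachable from kernel context, so they are omitted here). This is just
a sketch, assuming libxenstore is available in the guest and that
relative paths are resolved against the guest's own domain directory:

    /* Minimal sketch: read the ballooning-related Xenstore entries
     * from inside a guest.  Relative paths are resolved against the
     * guest's own /local/domain/<domid> directory.
     * Build (hypothetically): gcc -o memread memread.c -lxenstore
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <xenstore.h>

    int main(void)
    {
        struct xs_handle *xsh = xs_open(0);
        unsigned int len;

        if (!xsh) {
            perror("xs_open");
            return 1;
        }

        /* Both entries are in KiB. */
        char *max = xs_read(xsh, XBT_NULL, "memory/static-max", &len);
        char *tgt = xs_read(xsh, XBT_NULL, "memory/target", &len);

        printf("static-max: %s KiB\n", max ? max : "<missing>");
        printf("target:     %s KiB\n", tgt ? tgt : "<missing>");

        free(max);
        free(tgt);
        xs_close(xsh);
        return 0;
    }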

The main problem with this interface is that the guest doesn't know in
all cases which memory is included in these values (e.g. memory
allocated by the Xen tools for the firmware of an HVM guest is included
in the Xenstore and hypercall information, but not in the memory map).

So without tweaking the available information, an HVM guest booted
with a certain amount of memory will believe it has to balloon up, as
the target value in Xenstore will be larger than the amount of memory
the guest assumes to have according to the memory map; it will then
try to allocate memory it can never get.

Additional complexity is added by the Xen tools, which add a magic
size constant, depending on the guest type, to the Xenstore values.

The current interface has no way to specify (virtual) NUMA nodes for
ballooning. If vNUMA support is added to Xen, the ballooning interface
will need an extension, too.


Suggested new interface
-----------------------
Hypercalls, memory map(s) and ACPI tables should stay the same (for
compatibility reasons or because they are architectural interfaces).

As the main confusion in the current interface stems from how the
target memory size is specified, this part of the interface should be
changed: specifying the size of the ballooned area instead is much
clearer and will be the same for all guest types (no firmware memory
or magic additions involved).

In order to support vNUMA the balloon size should be per vNUMA node.

With the new interface in use, the Xen tools will calculate the
balloon size per vnode and write the related values to Xenstore:

memory/vnode<n>/target-balloon-size
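
A minimal toolstack-side sketch of what writing such an entry could
look like follows. The path layout, the KiB unit and the helper name
set_vnode_balloon() are assumptions based on this proposal, not an
existing libxl interface; the balloon size would be derived per vnode,
e.g. as the vnode's static maximum minus its target size:

    /* Sketch of the toolstack side of the proposed interface: write
     * the per-vnode balloon size to Xenstore.  Path layout and KiB
     * unit are assumptions based on this proposal.
     */
    #include <stdio.h>
    #include <string.h>
    #include <xenstore.h>

    static int set_vnode_balloon(struct xs_handle *xsh, int domid,
                                 unsigned int vnode,
                                 unsigned long balloon_kib)
    {
        char path[128], val[32];

        snprintf(path, sizeof(path),
                 "/local/domain/%d/memory/vnode%u/target-balloon-size",
                 domid, vnode);
        snprintf(val, sizeof(val), "%lu", balloon_kib);

        /* xs_write() returns false on failure. */
        return xs_write(xsh, XBT_NULL, path, val, strlen(val)) ? 0 : -1;
    }

E.g. shrinking a vnode holding 4 GiB down to 3 GiB would mean writing
1048576 (KiB) here; the guest never has to know about firmware
allocations or magic constants.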

The guest will have set up a watch on those entries, so it can react
to a modification as it does today.

The guest will indicate support of the new ballooning interface by
writing the value "1" to the Xenstore entry control/feature-balloon-vnode.
If both Xen and the guest support the new interface, only the new
interface should be used; the Xen tools will then remove the old
memory/target node.
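
Putting the last two paragraphs together, a hedged guest-side sketch
(userspace for brevity; a real guest would do this in its balloon
driver) advertising the feature and watching a vnode's balloon target.
The entry names are taken from this proposal:

    /* Guest-side sketch: advertise the proposed feature and watch
     * vnode0's balloon target.  Entry names are from this proposal.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <xenstore.h>

    int main(void)
    {
        struct xs_handle *xsh = xs_open(0);
        if (!xsh)
            return 1;

        /* Tell the tools we understand the new interface. */
        xs_write(xsh, XBT_NULL, "control/feature-balloon-vnode", "1", 1);

        /* One watch per vnode; only vnode0 shown here. */
        xs_watch(xsh, "memory/vnode0/target-balloon-size", "vnode0");

        for (;;) {
            unsigned int num, len;
            char **ev = xs_read_watch(xsh, &num);  /* blocks */
            if (!ev)
                break;

            char *val = xs_read(xsh, XBT_NULL, ev[XS_WATCH_PATH], &len);
            if (val)
                printf("%s: new balloon size %s KiB\n",
                       ev[XS_WATCH_TOKEN], val);
            /* A real driver would balloon towards the new size here. */
            free(val);
            free(ev);
        }

        xs_close(xsh);
        return 0;
    }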

Open questions
--------------
Should we add memory size information to the memory/vnode<n> nodes?

Should the guest add information about its current balloon sizes to the
memory/vnode<n> nodes (i.e. after ballooning, or every x seconds while
ballooning)?

Should we specify whether the guest is free to balloon a vnode other
than the one specified?

Should memory hotplug (at least for PV domains) use the vnode-specific
Xenstore paths, too, if supported by the guest?


Any further thoughts on this?


Juergen
