+list (sic, not my day for reply-alls)

---------- Forwarded message ----------
From: Guido Trotter <[email protected]>
Date: Thu, Jul 11, 2013 at 3:20 PM
Subject: Re: [RFC] Add HugePages Support Design Doc
To: Izhar ul Hassan <[email protected]>


On Thu, Jul 11, 2013 at 2:43 PM,  <[email protected]> wrote:
> From: Izhar <[email protected]>
>
> ---
>  doc/design-hugepages-ganeti-support.rst | 87 
> +++++++++++++++++++++++++++++++++
>  1 file changed, 87 insertions(+)
>  create mode 100644 doc/design-hugepages-ganeti-support.rst
>
> diff --git a/doc/design-hugepages-ganeti-support.rst 
> b/doc/design-hugepages-ganeti-support.rst
> new file mode 100644
> index 0000000..a4ba827
> --- /dev/null
> +++ b/doc/design-hugepages-ganeti-support.rst
> @@ -0,0 +1,87 @@
> +===============================
> +Huge Pages Support for Ganeti
> +===============================
> +This is a design document about implementing support for huge pages in
> +Ganeti. (Please note that Ganeti works with Transparent Huge Pages i.e.
> +THP and any reference in this document to Huge Pages refers to explicit
> +Huge Pages).
> +
> +Current State and Shortcomings:
> +-------------------------------
> +The Linux kernel allows using pages of larger size by setting aside a
> +portion of the memory. Using larger page size may enhance the
> +performance of applications that require a lot of memory by improving
> +page hits. To use huge pages, memory has to be reserved beforehand. This
> +portion of memory is subtracted from free memory and is considered as in
> +use. Currently Ganeti cannot take proper advantage of huge pages. On a
> +node, if huge pages are reserved and are available to fulfill the VM
> +request, Ganeti fails to recognize huge pages and considers the memory
> +reserved for huge pages as used memory.  This leads to failure of
> +launching VMs on a node where memory is available in the form of huge
> +pages rather than normal pages.
> +
> +Proposed Changes:
> +-----------------
> +The following components will be changed in order for Ganeti to take
> +advantage of Huge Pages.
> +
> +Hypervisor Parameters:
> +----------------------
> +Currently, It is possible to set or modify huge pages mount point at
> +cluster level via the hypervisor parameter ``mem_path`` as::
> +
> +       $ gnt-cluster init --no-lvm-storage --no-drbd-storage \
> +       >--enabled-hypervisors=kvm -nic-parameters link=br100 \
> +       > -H kvm:mem_path=/mount/point/for/hugepages
> +
> +This hypervisor parameter is  inherited by all the instances as
> +default although it can be overriden at the instance level.
> +
> +The following changes will be made to the inheritence behaviour.
> +
> +-  The hypervisor parameter ``mem_path`` must be made available at the
> +   node group level (in addition to the cluster level),so that users can
> +   set it as default for the node group::
> +
> +       $ gnt-group add/modify\
> +       > -H kvm:mem_path=/mount/point/for/hugepages
> +
> +-  Furthermore, the hypervisor parameter ``mem_path`` will be changeable
> +   only at the cluster or node group level and users must not be able to
> +   override this when creating new instances. The following command must
> +   produce an error message that ``mem_path`` may only be set at either
> +   the cluster or the node group level::
> +
> +       $ gnt-instance add -H kvm:mem_path=/mount/point/for/hugepages
> +

Ack. Please specify that this means that all hypervisor parameters
will be made available at node group level as well, and that the
inheritance order will be

cluster -> group -> os -> instance

> +Memory Pools:
> +-------------
> +Memory management of Ganeti will be improved by creating separate pools
> +for memory used by the node itself, memory used by the hypervisor and
> +the memory reserved for huge pages as:
> +- mtotal/xen (Xen memory)
> +- mfree/xen (Xen unused memory)
> +- mtotal/hp (Memory reserved for Huge Pages)
> +- mfree/hp (Memory available from unused huge pages)
> +- mpgsize/hp (Size of a huge page)
> +
> +mfree and mtotal will be changed to mean "the total and free memory for
> +the default method in this cluster/nodegroup"
> +

Note that the default method depends both on the default hypervisor
and on its parameters.

> +iAllocator Changes:
> +-------------------
> +If huge pages are set as default for a cluster of node group, then
> +iAllocator must consider the huge pages memory on the nodes, as a
> +parameter when trying to find the best node for the VM.
> +

Ack. Note that the iallocator will also be changed to use the correct
parameter depending on the cluster/group.

> +hbal Changes:
> +-------------
> +The cluster balancer (hbal) will also be changed to consider memory
> +pools and recognize memory reserved for huge pages when trying to
> +rebalance the cluster.
> +

Also: the cluster balances will be changed to act on the default memory pool.

Thanks,
(LGTM in general)

Guido


--
Guido Trotter
Ganeti Engineering
Google Germany GmbH
Dienerstr. 12, 80331, München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Katherine Stephens

Steuernummer: 48/725/00206
Umsatzsteueridentifikationsnummer: DE813741370

Reply via email to