Avi Kivity wrote:
Andre Przywara wrote:
The user (or better: a management application) specifies the host nodes
the guest should use: -nodes 2,3 would create a two-node guest mapped to
nodes 2 and 3 on the host. These numbers are handed over to libnuma:
VCPUs are pinned to the nodes, and the allocated guest memory is bound to
its respective node. Since libnuma does not seem to be installed
everywhere, the user has to enable this via configure --enable-numa
In the BIOS code an ACPI SRAT table was added, which describes the NUMA
topology to the guest. The number of nodes is communicated via the CMOS
RAM (offset 0x3E). If anyone thinks this is a bad idea, tell me.

There exists now a firmware interface in qemu for this kind of communications.
Oh, right you are, I missed that (it was well hidden). I was looking at how the BIOS detects the memory size and the number of CPUs, and these methods are quite cumbersome. Why not convert them to the FW_CFG interface (which the qemu side already sets up)? To avoid diverging too much from the original Bochs BIOS?

Node over-committing is allowed (-nodes 0,0,0,0); omitting the -nodes
parameter reverts to the old behavior.

'-nodes' is too generic a name ('node' could also mean a host). Suggest -numanode.

Need more flexibility: specify the range of memory per node, which CPUs are in the node, and relative weights for the SRAT table:

  -numanode node=1,cpu=2,cpu=3,start=1G,size=1G,hostnode=3

I converted my code to use the new firmware interface. This also makes it possible to pass more information between qemu and the BIOS (the lack of which prevented a more flexible command line in the first version).
So I would opt for the following:
- use -numanode (or simply -numa?) instead of the misleading -nodes
- allow passing memory sizes, VCPU subsets and host CPU pin info
I would prefer Daniel's version:
-numa <nrnodes>[,mem:<node1size>[;<node2size>...]]
[,cpu:<node1cpus>[;<node2cpus>...]]
[,pin:<node1hostnode>[;<node2hostnode>...]]

That would allow easy invocations like -numa 2 (for a two-node guest); options that are not given would fall back to defaults (equally split-up resources).

The only problem is the default for the host side, as libnuma requires the nodes to be named explicitly. Maybe make the pin: part _not_ optional? I would at least want to pin the memory; the VCPUs are debatable...


Also need a monitor command to change host nodes dynamically:
(qemu) numanode 1 0

Implementing a monitor interface is a good idea. Does that include page migration? That would be easily possible with mbind(MPOL_MF_MOVE), but it would take some time and resources (which I think is OK if explicitly triggered from the monitor). Any other useful commands for the monitor? Maybe (temporary) VCPU migration without page migration?

Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 277-84917
----to satisfy European Law for business letters:
AMD Saxony Limited Liability Company & Co. KG,
Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Register Court Dresden: HRA 4896, General Partner authorized
to represent: AMD Saxony LLC (Wilmington, Delaware, US)
General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy
