On Mon, Dec 01, 2008 at 03:15:19PM +0100, Andre Przywara wrote:
> Avi Kivity wrote:
> >>Node over-committing is allowed (-nodes 0,0,0,0), omitting the -nodes
> >>parameter reverts to the old behavior.
> >
> >'-nodes' is too generic a name ('node' could also mean a host).  Suggest 
> >-numanode.
> >
> >Need more flexibility: specify the range of memory per node, which cpus 
> >are in the node, relative weights for the SRAT table:
> >
> >  -numanode node=1,cpu=2,cpu=3,start=1G,size=1G,hostnode=3
> 
> I converted my code to use the new firmware interface. This also makes 
> it possible to pass more information between qemu and BIOS (which 
> prevented a more flexible command line in the first version).
> So I would opt for the following:
> - use numanode (or simply numa?) instead of the misleading -nodes
> - allow passing memory sizes, VCPU subsets and host CPU pin info
> I would prefer Daniel's version:
> -numa <nrnodes>[,mem:<node1size>[;<node2size>...]]
> [,cpu:<node1cpus>[;<node2cpus>...]]
> [,pin:<node1hostnode>[;<node2hostnode>...]]
> 
> That would allow easy things like -numa 2 (for a two guest node), not 
> given options would result in defaults (equally split-up resources).
> 
> The only problem is the default option for the host side, as libnuma 
> requires to explicitly name the nodes. Maybe make the pin: part _not_ 
> optional? I would at least want to pin the memory, one could discuss 
> about the VCPUs...

I think keeping it optional makes things more flexible for people
invoking KVM. If it is omitted, the KVM process can query its current
CPU pinning to determine which host NUMA nodes to allocate from.

The topology exposed to a guest will likely be the same every time
you launch a particular VM, while the guest <-> host pinning is a
point-in-time decision based on currently available resources.
Thus some apps / users may find it more convenient to have a fixed set
of args they always use to invoke the KVM process, and instead control
placement when fork/exec'ing KVM, by explicitly calling
sched_setaffinity or by using numactl to launch it. If they leave out
the pin: arg, it should be easy enough to use sched_getaffinity to
query the current pinning and determine the appropriate NUMA nodes
from that.
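
As a rough sketch of that fallback (not taken from any patch): the
function names, the 64-node limit and the sysfs-based CPU-to-node
lookup below are all assumptions made for illustration, and libnuma
could supply the mapping instead. The idea is simply to take the
affinity mask inherited from the launcher and collect the host nodes
its CPUs belong to.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_HOST_NODES 64   /* assumed upper bound, fits one 64-bit mask */

/* Hypothetical helper: find a CPU's host node by probing the sysfs
 * entries /sys/devices/system/node/node<N>/cpu<C>. Returns -1 if the
 * CPU is not listed under any node (e.g. no NUMA info available). */
static int sysfs_cpu_to_node(int cpu)
{
    char path[128];

    for (int node = 0; node < MAX_HOST_NODES; node++) {
        snprintf(path, sizeof(path),
                 "/sys/devices/system/node/node%d/cpu%d", node, cpu);
        if (access(path, F_OK) == 0)
            return node;
    }
    return -1;
}

/* Return a bitmask of the host nodes covered by the CPUs set in 'mask'.
 * cpu_node[cpu] holds the node of each CPU, or -1 if unknown. */
static unsigned long long nodes_from_affinity(cpu_set_t *mask,
                                              const int *cpu_node,
                                              int ncpus)
{
    unsigned long long nodes = 0;

    assert(mask != NULL && cpu_node != NULL);
    for (int cpu = 0; cpu < ncpus && cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, mask))
            continue;
        if (cpu_node[cpu] >= 0 && cpu_node[cpu] < MAX_HOST_NODES)
            nodes |= 1ULL << cpu_node[cpu];
    }
    return nodes;
}

/* Default used when no pin: part is given: derive the host nodes from
 * the affinity the process was started with. Returns 0 on success. */
static int default_host_nodes(unsigned long long *nodes)
{
    static int cpu_node[CPU_SETSIZE];
    cpu_set_t mask;

    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    /* Only probe sysfs for CPUs we are actually allowed to run on. */
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        cpu_node[cpu] = CPU_ISSET(cpu, &mask) ? sysfs_cpu_to_node(cpu) : -1;
    *nodes = nodes_from_affinity(&mask, cpu_node, CPU_SETSIZE);
    return 0;
}
```

With something like this, a management app that starts KVM under
numactl or after its own sched_setaffinity call would get matching
memory placement without having to repeat the node list on the
command line.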

Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|