On Tue, 2013-11-19 at 12:52 +0000, Daniel P. Berrange wrote:
> On Wed, Nov 13, 2013 at 02:46:06PM +0200, Tuomas Paappanen wrote:
> > Hi all,
> >
> > I would like to hear your thoughts about core pinning in OpenStack.
> > Currently nova (with qemu-kvm) supports a CPU set of pCPUs that can
> > be used by instances. I didn't find a blueprint, but I think this
> > feature is meant to isolate the CPUs used by the host from the CPUs
> > used by instances (vCPUs).
> >
> > But, from a performance point of view, it is better to exclusively
> > dedicate pCPUs to vCPUs and the emulator. In some cases you may want
> > to guarantee that only one instance (and its vCPUs) is using certain
> > pCPUs. By using core pinning you can optimize instance performance
> > based on e.g. cache sharing, NUMA topology, interrupt handling, or
> > PCI passthrough (SR-IOV) on multi-socket hosts.
> >
> > We have already implemented a feature like this (a PoC with
> > limitations) on the Nova Grizzly version and would like to hear your
> > opinion about it.
> >
> > The current implementation consists of three main parts:
> > - Definition of pCPU-vCPU maps for instances and instance spawning
> > - (optional) Compute resource and capability advertising, including
> >   free pCPUs and NUMA topology.
> > - (optional) Scheduling based on free CPUs and NUMA topology.
> >
> > The implementation is quite simple:
> >
> > (additional/optional parts)
> > Nova-computes advertise free pCPUs and NUMA topology in the same
> > manner as host capabilities. Instances are scheduled based on this
> > information.
> >
> > (core pinning)
> > The admin can set pCPUs for the vCPUs and for the emulator process,
> > or select a NUMA cell for the instance vCPUs, by adding key:value
> > pairs to the flavor's extra specs.
> >
> > EXAMPLE:
> > the instance has 4 vcpus
> > <key>:<value>
> > vcpus:1,2,3,4 --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2...
> > emulator:5    --> emulator pinned to pcpu5
> > or
> > numacell:0    --> all vcpus are pinned to pcpus in NUMA cell 0.
> >
> > In nova-compute, the core pinning information is read from the extra
> > specs and added to the domain XML in the same way as the CPU quota
> > values (cputune).
> >
> > <cputune>
> >   <vcpupin vcpu='0' cpuset='1'/>
> >   <vcpupin vcpu='1' cpuset='2'/>
> >   <vcpupin vcpu='2' cpuset='3'/>
> >   <vcpupin vcpu='3' cpuset='4'/>
> >   <emulatorpin cpuset='5'/>
> > </cputune>
> >
> > What do you think? Implementation alternatives? Is this worth a
> > blueprint? All related comments are welcome!
>
> I think there are several use cases mixed up in your description here
> which should likely be considered independently:
>
> - pCPU/vCPU pinning
>
>   I don't really think this is a good idea as a general-purpose
>   feature in its own right. It tends to lead to fairly inefficient use
>   of CPU resources when you consider that a large % of guests will be
>   mostly idle most of the time. It also has a fairly high
>   administrative burden to maintain explicit pinning. This feels like
>   a data center virt use case rather than a cloud use case, really.
>
> - Dedicated CPU reservation
>
>   The ability of an end user to request that their VM (or their group
>   of VMs) gets assigned a dedicated host CPU set to run on. This is
>   obviously something that would have to be controlled at a flavour
>   level, and in a commercial deployment would carry a hefty pricing
>   premium.
>
>   I don't think you want to expose explicit pCPU/vCPU placement for
>   this, though. Just request the high-level concept and allow the virt
>   host to decide the actual placement.
>
> - Host NUMA placement
>
>   By not taking NUMA into account, the libvirt driver at least is
>   currently badly wasting resources. Having too much cross-NUMA-node
>   memory access by guests just kills scalability. The virt driver
>   should really figure out CPU & memory pinning within the scope of a
>   NUMA node automatically. No admin config should be required for
>   this.
>
> - Guest NUMA topology
>
>   If the flavour memory size / CPU count exceeds the size of a single
>   NUMA node, then the flavour should likely have a way to express that
>   the guest should see multiple NUMA nodes. The virt host would then
>   set the guest NUMA topology to match the way it places vCPUs &
>   memory on host NUMA nodes. Again, you don't want explicit pCPU/vCPU
>   mapping done by the admin for this.
>
> Regards,
> Daniel
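For reference, here is a minimal sketch of the extra-specs-to-cputune
translation described above. It is an illustration only, not the PoC code
from the thread: the extra-spec keys follow Tuomas's example, while the
function name and the standalone usage at the bottom are assumptions.

# Illustrative sketch only -- not the actual PoC. Shows how flavor extra
# specs such as vcpus:1,2,3,4 and emulator:5 could be rendered into the
# <cputune> element quoted above, using only the standard library.
from xml.etree import ElementTree as ET


def build_cputune(extra_specs):
    """Build a libvirt <cputune> element from flavor extra specs."""
    cputune = ET.Element('cputune')

    # "vcpus:1,2,3,4" --> vcpu0 pinned to pcpu1, vcpu1 pinned to pcpu2, ...
    vcpu_spec = extra_specs.get('vcpus')
    if vcpu_spec:
        for vcpu, pcpu in enumerate(vcpu_spec.split(',')):
            ET.SubElement(cputune, 'vcpupin',
                          vcpu=str(vcpu), cpuset=pcpu.strip())

    # "emulator:5" --> emulator threads pinned to pcpu5
    emulator_spec = extra_specs.get('emulator')
    if emulator_spec:
        ET.SubElement(cputune, 'emulatorpin', cpuset=emulator_spec.strip())

    return cputune


if __name__ == '__main__':
    specs = {'vcpus': '1,2,3,4', 'emulator': '5'}
    print(ET.tostring(build_cputune(specs), encoding='unicode'))

In a real driver the resulting element would presumably be merged into the
generated domain XML next to the existing cputune quota values, and the
numacell:0 case would instead set every vcpupin cpuset to the pCPU list of
that NUMA cell.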
Quite a clear split, and +1 for the P/V pinning option.

--jyh

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
