On 06/23/2017 09:35 AM, Henning Schild wrote:
> On Fri, 23 Jun 2017 11:11:10 +0200,
> Sahid Orentino Ferdjaoui <sferd...@redhat.com> wrote:
>> In the Linux RT context, and as you mentioned, the non-RT vCPU can acquire
>> some guest kernel lock and then be pre-empted by the emulator thread while
>> holding this lock. This situation blocks the RT vCPUs from doing their
>> work. So that is why we have implemented [2]. For DPDK I don't think
>> we have such problems because it's running in userland.
>> So for the DPDK context I think we could have a mask like we have for RT,
>> and basically consider vCPU0 to handle best-effort work (emulator
>> threads, SSH...). I think that is the current pattern used by DPDK users.
> DPDK is just a library, and one can imagine an application that has
> cross-core communication/synchronisation needs where the emulator
> slowing down vCPU0 will also slow down vCPU1. Your DPDK application would
> have to know which of its cores did not get a full pCPU.
> I am not sure what the DPDK example is doing in this discussion; would
> that not just be cpu_policy=dedicated? I guess the normal behaviour of
> dedicated is that emulators and io happily share pCPUs with vCPUs, and
> you are looking for a way to restrict emulators/io to a subset of pCPUs
> because you can live with some of them being not 100%.
Yes. A typical DPDK-using VM might look something like this:
vCPU0: non-realtime, housekeeping and I/O, handles all virtual interrupts and
"normal" Linux stuff, emulator runs on the same pCPU
vCPU1: realtime, runs in tight loop in userspace processing packets
vCPU2: realtime, runs in tight loop in userspace processing packets
vCPU3: realtime, runs in tight loop in userspace processing packets
In this context, vCPUs 1-3 don't really ever enter the kernel, and we've
offloaded as much kernel work as possible from them onto vCPU0. This works
pretty well with the current system.
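
For reference, a guest like that can mostly be described already with the
existing extra specs; a rough sketch (the flavor name and sizes are just
illustrative):

    openstack flavor create dpdk.small --vcpus 4 --ram 4096 --disk 20
    openstack flavor set dpdk.small \
        --property hw:cpu_policy=dedicated \
        --property hw:cpu_realtime=yes \
        --property "hw:cpu_realtime_mask=^0"

That gives vCPU0 as the non-realtime housekeeping core and vCPUs 1-3 as
realtime cores. What we can't express today is where the emulator threads
land, which is what the proposals below are about.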
>> For RT we have to isolate the emulator threads to an additional pCPU
>> per guest or, as you are suggesting, to a set of pCPUs for all the
>> guests running.
>> I think we should introduce a new option:
>> - hw:cpu_emulator_threads_mask=^1
>> If in 'nova.conf', that mask will be applied to the set of all host
>> CPUs (vcpu_pin_set) to basically pack the emulator threads of all VMs
>> running there (useful for the RT context).
> That would allow modelling exactly what we need.
> In nova.conf we are talking about absolute known values; there is no need for
> a mask, and a set is much easier to read. Also, using the same name does not
> sound like a good idea.
> And the name vcpu_pin_set clearly suggests what kind of load runs here;
> if using a mask, it should be called pin_set.
I agree with Henning.
In nova.conf we should just use a set, something like "rt_emulator_vcpu_pin_set"
which would be used for running the emulator/io threads of *only* realtime
instances.
We may also want to have "rt_emulator_overcommit_ratio" to control how many
threads/instances we allow per pCPU.
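
To make that concrete, a sketch of how those options might look in nova.conf
(option names as proposed above; nothing like this exists today and the values
are made up):

    [DEFAULT]
    # pCPUs available for pinning guest vCPUs
    vcpu_pin_set = 2-7
    # pCPUs reserved for the emulator/io threads of realtime instances only
    rt_emulator_vcpu_pin_set = 0,1
    # how many realtime instances' emulator/io threads may share one of those pCPUs
    rt_emulator_overcommit_ratio = 4.0

Once that ratio is exceeded the compute node would presumably refuse to take
any more realtime instances.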
>> If on flavor extra-specs, it will be applied to the vCPUs dedicated to
>> the guest (useful for the DPDK context).
> And if both are present, the flavor wins and nova.conf is ignored?
In the flavor I'd like to see it be a full bitmask, not an exclusion mask with
an implicit full set. Thus the end-user could specify
"hw:cpu_emulator_threads_mask=0" and get the emulator threads to run alongside
vCPU0.
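
With the DPDK layout above, that might look like this on the flavor (proposed
syntax, not something nova understands today; "dpdk.small" is the illustrative
flavor from earlier):

    openstack flavor set dpdk.small \
        --property "hw:cpu_emulator_threads_mask=0"

i.e. the mask names the vCPU(s) whose pCPUs the emulator threads are allowed
to run alongside (here vCPU0), rather than excluding vCPUs from an implicit
full set.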
Henning, there is no conflict: the nova.conf setting and the flavor setting are
used for two different things.
Chris