On Tue, 2016-05-31 at 16:08 +1000, David Gibson wrote:
> > QEMU fails with errors like
> > 
> >   qemu-kvm: Cannot support more than 8 threads on PPC with KVM
> >   qemu-kvm: Cannot support more than 1 threads on PPC with TCG
> > 
> > depending on the guest type.
> 
> Note that in a sense the two errors come about for different reasons.
> 
> On Power, to a much greater degree than x86, threads on the same core
> have observably different behaviour from threads on different cores.
> Because of that, there's no reasonable way for KVM to present more
> guest threads-per-core than there are host threads-per-core.
> 
> The limit of 1 thread on TCG is simply because no-one's ever bothered
> to implement SMT emulation in qemu.
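
Just to make sure I'm reading this right: the '8' in the first error
is simply the host's threads-per-core, so on a host core with all
eight SMT threads available the biggest guest core KVM should accept
would look something like the snippet below (the numbers are only
there to illustrate the constraint, they're not taken from a real
setup):

  <cpu>
    <!-- threads= can be at most the host's threads-per-core;
         8 here assumes an SMT8-capable host core -->
    <topology sockets='1' cores='1' threads='8'/>
  </cpu>

while with TCG anything other than threads='1' is rejected today.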

That just means in the future we might have to expose something
other than a hardcoded '1' as guest thread limit for TCG guests;
the interface would remain valid AFAICT.

> > physical_core_id would be 32 for all of the above - it would
> > just be the very value of core_id the kernel reads from the
> > hardware and reports through sysfs.
> > 
> > The tricky bit is that, when subcores are in use, core_id and
> > physical_core_id would not match. They will always match on
> > architectures that lack the concept of subcores, though.
> 
> Yeah, I'm still not terribly convinced that we should even be
> presenting physical core info instead of *just* logical core info. If
> you care that much about physical core topology, you probably
> shouldn't be running your system in subcore mode.

Me neither. We could leave it out initially, and add it later if it
turns out to be useful, I guess.

> > > > The optimal guest topology in this case would be
> > > > 
> > > >   <vcpu placement='static' cpuset='4'>4</vcpu>
> > > >   <cpu>
> > > >     <topology sockets='1' cores='1' threads='4'/>
> > > >   </cpu>
> > > 
> > > So when we pin to logical CPU #4, ppc KVM is smart enough to see
> > > that it's a subcore thread, and will then make use of the offline
> > > threads in the same subcore? Or does libvirt do anything fancy to
> > > facilitate this case?
> > 
> > My understanding is that libvirt shouldn't have to do anything
> > to pass the hint to kvm, but David will have the authoritative
> > answer here.
> 
> Um.. I'm not totally certain. It will be one of two things:
>   a) you just bind the guest thread to the representative host thread
>   b) you bind the guest thread to a cpumask with all of the host
>      threads on the relevant (sub)core - including the offline host
>      threads
> 
> I'll try to figure out which one it is.

I played with this a bit: I created a guest with

  <vcpu placement='static' cpuset='0,8'>8</vcpu>
  <cpu>
    <topology sockets='1' cores='2' threads='4'/>
  </cpu>

and then, inside the guest, I used cgroups to pin a bunch of busy
loops to specific vCPUs.

As long as all the load (8+ busy loops) was distributed only across
vCPUs 0-3, one of the host threads remained idle. As soon as the
first of the jobs was moved to vCPUs 4-7, the other host thread
immediately jumped to 100%.

This seems to indicate that QEMU / KVM are actually smart enough to
schedule guest threads on the corresponding host threads. I think :)

On the other hand, when I changed the guest to distribute the 8 vCPUs
among 2 sockets with 4 cores each instead, the second host thread
would start running as soon as I started the second busy loop.

> > We won't know whether the proposal is actually sensible until
> > David weighs in, but I'm adding Martin back in the loop so
> > he can maybe give us the oVirt angle in the meantime.
> 
> TBH, I'm not really sure what you want from me. Most of the questions
> seem to be libvirt design decisions which are independent of the
> layers below.

I mostly need you to sanity check my proposals and point out any
incorrect / dubious claims, just like you did above :)

The design of features like this one can have pretty significant
consequences for the interactions between the various layers, and
when the choices are not straightforward I think it's better to
gather as much feedback as possible from across the stack before
moving forward with an implementation.

-- 
Andrea Bolognani
Software Engineer - Virtualization Team
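
P.S. To make the a) / b) distinction above a bit more concrete: if b)
turned out to be the answer, I'd expect the equivalent explicit
per-vCPU pinning in the domain XML to look roughly like the snippet
below. The host CPU numbers are made up (I'm assuming a subcore whose
threads are host CPUs 4-7, with only CPU 4 online), so please treat
it as an untested sketch rather than something I've verified:

  <vcpu placement='static'>4</vcpu>
  <cputune>
    <!-- each guest thread allowed on any thread of the (sub)core;
         host CPUs 4-7 are assumed values, not real ones -->
    <vcpupin vcpu='0' cpuset='4-7'/>
    <vcpupin vcpu='1' cpuset='4-7'/>
    <vcpupin vcpu='2' cpuset='4-7'/>
    <vcpupin vcpu='3' cpuset='4-7'/>
  </cputune>
  <cpu>
    <topology sockets='1' cores='1' threads='4'/>
  </cpu>

With a), every cpuset above would instead collapse to just '4', which
is effectively what the <vcpu placement='static' cpuset='4'>4</vcpu>
example earlier in the thread already expresses.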