Re: [PATCH v5] i386: Introduce ARAT CPU feature

2015-06-21 Thread Jan Kiszka
On 2015-06-18 22:21, Eduardo Habkost wrote:
 On Sun, Jun 07, 2015 at 11:15:08AM +0200, Jan Kiszka wrote:
 From: Jan Kiszka jan.kis...@siemens.com

 ARAT signals that the APIC timer does not stop in power saving states.
 As our APICs are emulated, it's fine to expose this feature to guests,
 at least when asking for KVM host features or with CPU types that
 include the flag. The exact model number that introduced the feature is
 not known, but reports indicate that it has been available since at
 least Sandy Bridge.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 
 The code looks good now, but: what are the real consequences of
 enabling/disabling the flag? What exactly do guests use it for?
 
 Isn't this going to make guests have additional expectations about the
 APIC timer that may be broken when live-migrating or pausing the VM?

ARAT only refers to the timer stopping in certain power states (which
we do not even emulate, IIRC). If the timer can stop, the OS is at risk
of sleeping forever and thus needs to look for a different wakeup
source. Live migration and VM pausing are external effects on all
timers of the guest, not only the APIC timer. However, neither of them
causes a missed wakeup - provided the host eventually decides to resume
the guest.
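
For reference, a guest would typically detect ARAT via CPUID leaf 6,
EAX bit 2. A minimal user-space sketch (illustration only, not part of
the patch):

/* Check for ARAT (CPUID leaf 6, EAX bit 2) from user space. */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    if (!__get_cpuid(6, &eax, &ebx, &ecx, &edx)) {
        printf("CPUID leaf 6 not available\n");
        return 1;
    }
    /* ARAT set: the APIC timer keeps running in deep C-states. */
    printf("ARAT %s\n", (eax & (1 << 2)) ? "present" : "absent");
    return 0;
}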

Jan


Re: [PATCH 0/2] KVM: PPC: Book3S HV: Dynamic micro-threading/split-core

2015-06-21 Thread Paul Mackerras
On Wed, Jun 17, 2015 at 07:30:09PM +0200, Laurent Vivier wrote:
 
 Tested-by: Laurent Vivier lviv...@redhat.com
 
 Performance is better, but Paul, could you explain why it is better if I
 disable dynamic micro-threading? Did I miss something?
 
 My test system is an IBM Power S822L.
 
 I run two guests with 8 vCPUs each (-smp 8,sockets=8,cores=1,threads=1), both
 pinned to the same core (using the pinning option of virt-manager). Then I
 measure the time needed to compile a kernel in parallel in both guests
 with make -j 16.
 
 My kernel without micro-threading:
 
 real  37m23.424s    real  37m24.959s
 user  167m31.474s   user  165m44.142s
 sys   113m26.195s   sys   113m45.072s
 
 With micro-threading patches (PATCH 1+2):
 
 target_smt_mode 0 [in fact it was 8 here, but it should behave like 0, as it
 is the max threads/sub-core]
 dynamic_mt_modes 6
 
 real  32m13.338s    real  32m26.652s
 user  139m21.181s   user  140m20.994s
 sys   77m35.339s    sys   78m16.599s
 
 It's better, but if I disable dynamic micro-threading (still with PATCH 1+2 applied):
 
 target_smt_mode 0
 dynamic_mt_modes 0
 
 real  30m49.100s    real  30m48.161s
 user  144m22.989s   user  142m53.886s
 sys   65m4.942s     sys   66m8.159s
 
 it's even better.

I think what's happening here is that with dynamic_mt_modes=0 the
system alternates between the two guests, whereas with
dynamic_mt_modes=6 it will spend some of the time running both guests
simultaneously in two-way split mode.  Since you have two
compute-bound guests that each have threads=1 and 8 vcpus, it can fill
up the core either way.  In that case it is more efficient to fill up
the core with vcpus from one guest and not have to split the core,
firstly because you avoid the split/unsplit latency and secondly
because the threads run a little faster in whole-core mode than in
split-core.

I am considering adding an additional heuristic, which would be to do
two passes through the list of preempted vcores, considering only
vcores from the same guest as the primary vcore on the first pass, and
then considering all vcores on the second pass.  Maybe we could also
say that if, after the first pass, we have collected 4 or more runnable
vcpus, we don't bother with the second pass.
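
Roughly, in C the two-pass idea would look like this (a standalone
sketch; all struct and function names below are illustrative, not the
actual kvm-hv code):

/* Illustrative only: two-pass collection of preempted vcores. */
struct vcore {
    void *guest;        /* identifies which guest owns this vcore */
    int   nr_runnable;  /* runnable vcpus in this vcore */
    int   picked;       /* already collected onto the core? */
};

static int fits_on_core(int collected, const struct vcore *vc)
{
    /* placeholder for the real "does this vcore still fit" check */
    return collected + vc->nr_runnable <= 8;
}

/* Collect preempted vcores to run alongside 'primary' on one core. */
static int collect_piggybacks(struct vcore *primary,
                              struct vcore *preempted, int n)
{
    int collected = primary->nr_runnable;
    int pass, i;

    for (pass = 0; pass < 2; pass++) {
        for (i = 0; i < n; i++) {
            struct vcore *vc = &preempted[i];

            /* First pass: only vcores from the primary's guest. */
            if (pass == 0 && vc->guest != primary->guest)
                continue;
            if (vc->picked || !fits_on_core(collected, vc))
                continue;
            vc->picked = 1;
            collected += vc->nr_runnable;
        }
        /* Enough runnable vcpus after the first pass? Skip the second. */
        if (collected >= 4)
            break;
    }
    return collected;
}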

Paul.


Re: [PATCH 2/3] powerpc: use default endianness for converting guest/init

2015-06-21 Thread Michael Ellerman
On Fri, 2015-06-19 at 17:15 +0100, Andre Przywara wrote:

 What works though is using xxd to convert the binary guest/init into a C
 array:
 $ xxd -i guest/init | $(CC) -x c -c - -o guest/guest_init.o
 This has the nice property of using the same compiler that generates the
 other object files and thus automatically matches them (which is a
 problem under MIPS atm, as ld seems to default to some different ELF type).
 The only issue is that xxd is part of the vim package, which would annoy
 Emacs users. Not sure we are in a position to mandate vim for compiling
 kvmtool ;-)

You'd be doing them a favor, so fine by me :)

cheers
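
For reference, xxd -i guest/init emits an ordinary C translation unit
along the lines below (the byte values and the length are made up
here), which is why whatever compiler builds the rest of kvmtool can
turn it into a matching object file:

/* Illustrative xxd -i output; byte values and length are invented. */
unsigned char guest_init[] = {
    0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
    /* ...rest of the binary... */
};
unsigned int guest_init_len = 4096;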


Re: [PATCH v3 4/4] KVM: x86: Add support for local interrupt requests from userspace

2015-06-21 Thread Paolo Bonzini


On 20/06/2015 02:41, Steve Rutherford wrote:
 Pinging this thread.
 
 Should I go with skipping the round trip, and combining
 KVM_REQUEST_PIC_INJECTION with KVM_INTERRUPT (a vcpu ioctl)?
 [It's currently a VM ioctl, which seems reasonable, given that the
 PIC is a per-VM device. When skipping the round trip, a vcpu ioctl
 seems sensible, given that an interrupt is associated with a specific
 CPU.]

Yes, please.  Sorry for not answering, I didn't understand a question
was implied.  The roundtrip can be done in userspace.
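
For context, a minimal sketch of how userspace drives the existing
KVM_INTERRUPT vcpu ioctl today (assuming an open vcpu file descriptor
and a userspace irqchip):

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Queue an external interrupt on a vcpu; returns 0 on success. */
static int inject_extint(int vcpu_fd, unsigned int vector)
{
    struct kvm_interrupt irq = { .irq = vector };

    return ioctl(vcpu_fd, KVM_INTERRUPT, &irq);
}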

Paolo