Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-25 Thread Avi Kivity
On 10/24/2012 11:19 AM, Stefan Fritsch wrote:

 With the decode table fix I think it should work.
 
 It needs some more changes. The patch below did the trick for me. It is
 against 3.5, because I didn't want to build a whole new kernel (my test
 machine is a dead slow AMD E-350).
 
 The patch is definitely incomplete. It now allows the lock prefix for
 all mov operations on the cr1-7, which should not be the case. Apart
 from that, do the changes look reasonable? I have not checked that this
 is the minimal patch that works. But the LockReg bit was definitely
 necessary, that was the final piece to make it work.
 
 Cheers,
 Stefan
 
 diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
 index 4837375..c7f0ec7 100644
 --- a/arch/x86/kvm/emulate.c
 +++ b/arch/x86/kvm/emulate.c
 @@ -128,6 +128,7 @@
  #define Priv        (1<<27) /* instruction generates #GP if current CPL != 0 */
  #define No64        (1<<28)
  #define PageTable   (1 << 29)   /* instruction used to write page table */
 +#define LockReg     (1<<30) /* lock prefix is allowed for the instruction even for reg destination */
  /* Source 2 operand type */
  #define Src2Shift   (30)

LockReg conflicts with Src2Shift.
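
To make the conflict concrete, here is a minimal sketch. It assumes the 3.5-era layout in which the Src2 operand type is a 5-bit field starting at bit Src2Shift and OpNone is 0; the OpBits and OpMask values are assumptions and do not appear in this thread. Under those assumptions bit 30 lies inside the Src2 field, so a set LockReg bit reads back as part of the Src2 operand type.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the quoted diff. */
#define PageTable   (1ull << 29)
#define LockReg     (1ull << 30)   /* proposed flag */
#define Src2Shift   (30)

/* Assumed: operand types are OpBits wide (not shown in this thread). */
#define OpBits      5
#define OpMask      ((1ull << OpBits) - 1)
#define Src2Mask    (OpMask << Src2Shift)

/* Returns nonzero if a flag bit falls inside the Src2 operand field. */
static int overlaps_src2(uint64_t flag)
{
    return (flag & Src2Mask) != 0;
}

/* Extracts the Src2 operand type from a flags word. */
static unsigned src2_type(uint64_t flags)
{
    return (unsigned)((flags >> Src2Shift) & OpMask);
}
```

With these definitions src2_type(LockReg) is 1 rather than 0, so an instruction tagged LockReg would appear to carry a Src2 operand. A bit outside every operand field would be needed instead.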

  #define Src2None    (OpNone << Src2Shift)
 @@ -420,6 +421,7 @@ static int emulator_check_intercept(struct x86_emulate_ctxt *ctxt,
  	struct x86_instruction_info info = {
  		.intercept  = intercept,
  		.rep_prefix = ctxt->rep_prefix,
 +		.lock_prefix = ctxt->lock_prefix,
  		.modrm_mod  = ctxt->modrm_mod,
  		.modrm_reg  = ctxt->modrm_reg,
  		.modrm_rm   = ctxt->modrm_rm,
 @@ -2874,7 +2876,10 @@ static int em_mov(struct x86_emulate_ctxt *ctxt)
 
  static int em_cr_write(struct x86_emulate_ctxt *ctxt)
  {
 -	if (ctxt->ops->set_cr(ctxt, ctxt->modrm_reg, ctxt->src.val))
 +	int cr = ctxt->modrm_reg;

Blank line here.

 +	if (ctxt->lock_prefix && cr == 0)
 +		cr = 8;

But maybe this is better dealt with during general decode, and
ctxt->modrm_reg adjusted instead.  This removes the code triplication.
Please also #UD if modrm_reg != 0, and if the feature is not exposed to
the guest via cpuid.

Please regenerate against kvm.git next, there have been changes to
emulate.c.
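
The decode-time approach suggested above could look roughly like the following sketch. It is not kernel code: struct decode_state, remap_lock_mov_cr() and the guest_has_cr8_legacy field are hypothetical stand-ins for the emulator context and the cpuid check.

```c
#include <assert.h>
#include <stdbool.h>

enum decode_result { DECODE_OK, DECODE_UD };

/* Hypothetical, simplified stand-in for the emulator context. */
struct decode_state {
    bool lock_prefix;           /* F0 prefix seen */
    unsigned modrm_reg;         /* control register number from ModRM.reg */
    bool guest_has_cr8_legacy;  /* alternate CR8 encoding exposed via cpuid */
};

/*
 * Meant to run once while decoding MOV to/from CRn, so that the later
 * handlers only ever see the adjusted register number.
 */
static enum decode_result remap_lock_mov_cr(struct decode_state *s)
{
    if (!s->lock_prefix)
        return DECODE_OK;       /* ordinary MOV CRn, nothing to adjust */
    if (!s->guest_has_cr8_legacy)
        return DECODE_UD;       /* feature not exposed to the guest */
    if (s->modrm_reg != 0)
        return DECODE_UD;       /* only LOCK MOV CR0 is remapped */
    s->modrm_reg = 8;           /* LOCK MOV CR0 is treated as MOV CR8 */
    return DECODE_OK;
}
```

Doing the remap in one place means the CR read, CR write and intercept-check paths need no lock-prefix special cases of their own.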

-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-22 Thread Avi Kivity
On 10/20/2012 12:39 AM, Stefan Fritsch wrote:
 On Thursday 18 October 2012, Avi Kivity wrote:
 On 10/18/2012 11:35 AM, Gleb Natapov wrote:
  You misunderstood the description. V_INTR_MASKING=1 means that
  CR8 writes are not propagated to real HW APIC.
  
  But KVM does not trap access to CR8 unconditionally. It enables
  CR8 intercept only when there is pending interrupt in IRR that
  cannot be immediately delivered due to current TPR value. This
  should eliminate 99% of CR8 intercepts.
 
 Right.  You will need to expose the alternate encoding of cr8 (IIRC
 lock mov reg, cr0) on AMD via cpuid, but otherwise it should just
 work.  Be aware that this will break cross-vendor migration.
 
 I get an exception and I am not sure why:
 
 kvm_entry: vcpu 0
 kvm_exit: reason write_cr8 rip 0xd0203788 info 0 0
 kvm_emulate_insn: 0:d0203788: f0 0f 22 c0 (prot32)
 kvm_inj_exception: #UD (0x0)
 
 This is qemu-kvm 1.1.2 on Linux 3.2.
 
 When I look at arch/x86/kvm/emulate.c (both the current and the v3.2 
 version), I don't see any special case handling for lock mov reg, 
 cr0 to mean mov reg, cr8.

emulate.c will #UD if the Lock flag is missing in the instruction decode
table.
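
As a rough model of that check (a sketch, not the emulator's actual code): the lock prefix is accepted only when the decode-table entry carries the Lock flag and, as far as I recall, the destination is a memory operand. The bit position used for Lock and the op_type enum are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define Lock (1u << 26)  /* illustrative bit position, an assumption */

enum op_type { OP_REG, OP_MEM };

/* Returns true if this model would inject #UD for the combination. */
static bool lock_prefix_faults(bool lock_prefix, uint32_t table_flags,
                               enum op_type dst)
{
    if (!lock_prefix)
        return false;           /* no prefix, nothing to check */
    return !(table_flags & Lock) || dst != OP_MEM;
}
```

In this model a MOV to a control register still faults after Lock is added to its table entry, because the destination is not memory. That is consistent with the report elsewhere in this thread that an extra LockReg-style exception was needed.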

 Before I spend lots of time on debugging my code, can you verify if 
 the alternate encoding of cr8 is actually supported in kvm or if it is 
 maybe missing? Thanks in advance.

With the decode table fix I think it should work.


-- 
error compiling committee.c: too many arguments to function


Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-19 Thread Stefan Fritsch
On Thursday 18 October 2012, Avi Kivity wrote:
 On 10/18/2012 11:35 AM, Gleb Natapov wrote:
  You misunderstood the description. V_INTR_MASKING=1 means that
  CR8 writes are not propagated to real HW APIC.
  
  But KVM does not trap access to CR8 unconditionally. It enables
  CR8 intercept only when there is pending interrupt in IRR that
  cannot be immediately delivered due to current TPR value. This
  should eliminate 99% of CR8 intercepts.
 
 Right.  You will need to expose the alternate encoding of cr8 (IIRC
 lock mov reg, cr0) on AMD via cpuid, but otherwise it should just
 work.  Be aware that this will break cross-vendor migration.

I get an exception and I am not sure why:

kvm_entry: vcpu 0
kvm_exit: reason write_cr8 rip 0xd0203788 info 0 0
kvm_emulate_insn: 0:d0203788: f0 0f 22 c0 (prot32)
kvm_inj_exception: #UD (0x0)

This is qemu-kvm 1.1.2 on Linux 3.2.

When I look at arch/x86/kvm/emulate.c (both the current and the v3.2 
version), I don't see any special case handling for lock mov reg, 
cr0 to mean mov reg, cr8.
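
For reference, the four bytes in the trace can be taken apart with a small sketch like the one below. The struct and function names are made up for illustration; only the byte values come from the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mov_cr_insn {
    bool lock;      /* F0 prefix present */
    bool to_cr;     /* true: 0F 22 (write CRn); false: 0F 20 (read CRn) */
    unsigned mod;   /* ModRM bits 7:6 */
    unsigned cr;    /* ModRM bits 5:3, control register number */
    unsigned gpr;   /* ModRM bits 2:0, general-purpose register number */
};

/* Parses [F0] 0F 20/22 ModRM; returns false for anything else. */
static bool parse_mov_cr(const uint8_t *b, size_t len, struct mov_cr_insn *out)
{
    size_t i = 0;

    out->lock = false;
    if (i < len && b[i] == 0xf0) {
        out->lock = true;
        i++;
    }
    if (len - i != 3 || b[i] != 0x0f)
        return false;
    if (b[i + 1] != 0x20 && b[i + 1] != 0x22)
        return false;
    out->to_cr = (b[i + 1] == 0x22);
    out->mod = b[i + 2] >> 6;
    out->cr  = (b[i + 2] >> 3) & 7;
    out->gpr = b[i + 2] & 7;
    return true;
}
```

For f0 0f 22 c0 this yields lock set, a write to a control register, mod 3, CR number 0 and register 0 (%eax), i.e. lock mov %eax, %cr0, which matches the write_cr8 exit reason in the trace.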

Before I spend lots of time on debugging my code, can you verify if 
the alternate encoding of cr8 is actually supported in kvm or if it is 
maybe missing? Thanks in advance.

Cheers,
Stefan


Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-18 Thread Jan Kiszka
On 2012-10-17 21:24, Stefan Fritsch wrote:
 Hi,
 
 OpenBSD/i386 seems to be one of the few operating systems that still 
 uses the LAPIC task priority register for interrupt handling. On AMD 

Yeah, only very special OSes do this... ;)

 CPUs and on older Intel CPUs without the flexpriority feature, this 
 causes a huge performance impact on kvm. I have seen slowdown by a 
 factor of 10.
 
 Is there a way to use the TPR under kvm without the slowdown? There 
 are some MSRs inherited from Hyper-V, but using these does not make 
 that much difference. I think this is because they still cause a 
 vmexit for every TPR access. I expect the same is true for x2apic 
 emulation, isn't it?

Didn't study the HyperV interface yet, but the trick is indeed to avoid
as many vmexits as possible, specifically when lowering the TPR value
has no effect as no interrupts are pending.

 
 There is also the kvmvapic, but kvm does not expose a sane interface 
 to it and only uses it for Windows XP specific binary patching.

The kvmvapic is not a classic paravirtual interface in that it does not
really require guest OS awareness. But it requires the guest to accept
being patched. That's the case for certain Windows versions. Also, the
option ROMs, including our kvmvapic ROM, have to be mapped at fixed,
accessible addresses to allow jumping to it from patched TPR instructions.

Therefore, we limited the patching to known OS versions to avoid
messing around with other, untested OSes. However, it may be possible to
accept OpenBSD as well by adjusting the tests in kvmvapic and possibly
adjusting some other details.

 
 Another possibility is TPR access via CR8 on AMD, but the missing 
 cr8_legacy CPUID bit and this discussion [1] make me believe that this 
 is not supported under kvm, at least in 32bit mode. Could this be 
 easily fixed? If yes, would it solve the performance problems, i.e. 
 offer performance comparable to Intel's flexpriority feature?

Everything that unconditionally traps, and so do CR8 accesses, does not
help.

 
 OpenBSD seems to be reluctant to stop using the TPR. In fact, in a 
 recent discussion, there has been a suggestion that OpenBSD should 
 switch to using TPR also on OpenBSD/amd64 to solve some problems with 
 boot interrupts. How do you expect this would affect performance under 
 kvm (if using CR8)?
 
 Or do you have any other suggestions? One could also modify kvm to 
 expose a real interface to kvmvapic, e.g. allow the guest OS to 
 provide the virtual address of the option rom and the offset of the 
 CPU number in the %fs segment, instead of using hard coded values for 
 Windows XP.

Of course, though all we need is a stable address in fact. See
vapic_write() for the existing PV interface (between option ROM and
hypervisor so far). We can extend it as long as it is compatible with
the existing one.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-18 Thread Stefan Fritsch

Hi Jan,

On Thu, 18 Oct 2012, Jan Kiszka wrote:

>> There is also the kvmvapic, but kvm does not expose a sane interface
>> to it and only uses it for Windows XP specific binary patching.
>
> The kvmvapic is not a classic paravirtual interface in that it does not
> really require guest OS awareness. But it requires the guest to accept
> being patched. That's the case for certain Windows versions. Also, the
> option ROMs, including our kvmvapic ROM, have to be mapped at fixed,
> accessible addresses to allow jumping to it from patched TPR instructions.
>
> Therefore, we limited the patching to known OS versions to avoid
> messing around with other, untested OSes. However, it may be possible to
> accept OpenBSD as well by adjusting the tests in kvmvapic and possibly
> adjusting some other details.


The problem I see here is that OpenBSD is, unlike Windows XP, still a 
moving target and details like the offsets in the cpu info struct may 
change in the future. So some interface where the guest OS provides the 
necessary details may be nicer.



>> Another possibility is TPR access via CR8 on AMD, but the missing
>> cr8_legacy CPUID bit and this discussion [1] make me believe that this
>> is not supported under kvm, at least in 32bit mode. Could this be
>> easily fixed? If yes, would it solve the performance problems, i.e.
>> offer performance comparable to Intel's flexpriority feature?
>
> Everything that unconditionally traps, and so do CR8 accesses, does not
> help.


I was hoping that CR8 access would not trap unconditionally. The AMD 
Programmer's Manual Vol. 2, section 15.21.2 seems to imply that there is a 
mode where this is not the case:


<quote>
SVM provides a virtual TPR register, V_TPR, for use by the guest; its 
value is loaded from the VMCB by VMRUN and written back to the VMCB by 
#VMEXIT. The APIC's TPR always controls the task priority for physical 
interrupts, and the V_TPR always controls virtual interrupts.

While running a guest with V_INTR_MASKING cleared to 0:
* Writes to CR8 affect both the APIC's TPR and the V_TPR register
* Reads from CR8 operate as they would without SVM

While running a guest with V_INTR_MASKING set to 1:
* Writes to CR8 affect only the V_TPR register
* Reads from CR8 return V_TPR.
</quote>
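
Read literally, the quoted rules can be modeled as below. This is only a sketch of the text above, with made-up names, and it treats "as they would without SVM" as reading the APIC's TPR; it does not describe what kvm actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct tpr_state {
    bool v_intr_masking;  /* VMCB V_INTR_MASKING bit */
    uint8_t apic_tpr;     /* physical APIC task priority, as seen via CR8 */
    uint8_t v_tpr;        /* virtual TPR held in the VMCB */
};

static void guest_write_cr8(struct tpr_state *s, uint8_t val)
{
    val &= 0xf;                   /* CR8 carries a 4-bit priority class */
    s->v_tpr = val;               /* V_TPR is written in both modes */
    if (!s->v_intr_masking)
        s->apic_tpr = val;        /* reaches the real APIC only when cleared */
}

static uint8_t guest_read_cr8(const struct tpr_state *s)
{
    return s->v_intr_masking ? s->v_tpr : s->apic_tpr;
}
```

The model only covers where the value ends up. Whether a CR8 access causes a #VMEXIT is a separate question that the quoted passage does not address.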

Is V_INTR_MASKING == 1 not used in kvm? Is it not usable at all for some 
reason? Or have I misunderstood the description?



It would be more likely that other hypervisors add support for CR8 
access than that they add kvmvapic compatibility. Therefore such a 
solution would seem preferable, if it is possible.


Cheers,
Stefan


Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-18 Thread Gleb Natapov
On Thu, Oct 18, 2012 at 09:43:46AM +0200, Stefan Fritsch wrote:
 Everything that unconditionally traps, and so do CR8 accesses, does not
 help.
 
 I was hoping that CR8 access would not trap unconditionally. The AMD
 Programmer's Manual Vol. 2, section 15.21.2 seems to imply that
 there is a mode where this is not the case:
 
 <quote>
 SVM provides a virtual TPR register, V_TPR, for use by the guest;
 its value is loaded from the VMCB by VMRUN and written back to the
 VMCB by #VMEXIT. The APIC's TPR always controls the task priority
 for physical interrupts, and the V_TPR always controls virtual
 interrupts.
 
 While running a guest with V_INTR_MASKING cleared to 0:
 * Writes to CR8 affect both the APIC's TPR and the V_TPR register
 * Reads from CR8 operate as they would without SVM
 
 While running a guest with V_INTR_MASKING set to 1:
 * Writes to CR8 affect only the V_TPR register
 * Reads from CR8 return V_TPR.
 </quote>
 
 Is V_INTR_MASKING == 1 not used in kvm? Is it not usable at all for
 some reason? Or have I misunderstood the description?
 
You misunderstood the description. V_INTR_MASKING=1 means that CR8 writes
are not propagated to real HW APIC.

But KVM does not trap access to CR8 unconditionally. It enables CR8
intercept only when there is pending interrupt in IRR that cannot be
immediately delivered due to current TPR value. This should eliminate 99%
of CR8 intercepts.
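
The rule described above can be sketched as a single predicate. The function name is hypothetical and both arguments are assumed to be 4-bit priority classes; the real KVM code may be organized differently.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * tpr:     current task priority class (0..15)
 * max_irr: priority class of the highest pending interrupt in IRR,
 *          or -1 if nothing is pending
 */
static bool need_cr8_write_intercept(int tpr, int max_irr)
{
    if (max_irr == -1)
        return false;        /* nothing pending: CR8 writes run untrapped */
    return tpr >= max_irr;   /* a pending interrupt is blocked by the TPR */
}
```

Only in the blocked case does the hypervisor need to see the next CR8 write, since lowering the TPR may then make the pending interrupt deliverable.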

--
Gleb.


Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-18 Thread Avi Kivity
On 10/18/2012 11:35 AM, Gleb Natapov wrote:

 You misunderstood the description. V_INTR_MASKING=1 means that CR8 writes
 are not propagated to real HW APIC.
 
 But KVM does not trap access to CR8 unconditionally. It enables CR8
 intercept only when there is pending interrupt in IRR that cannot be
 immediately delivered due to current TPR value. This should eliminate 99%
 of CR8 intercepts.
 

Right.  You will need to expose the alternate encoding of cr8 (IIRC lock
mov reg, cr0) on AMD via cpuid, but otherwise it should just work.  Be
aware that this will break cross-vendor migration.


-- 
error compiling committee.c: too many arguments to function


Re: How to do fast accesses to LAPIC TPR under kvm?

2012-10-18 Thread Stefan Fritsch

On Thu, 18 Oct 2012, Avi Kivity wrote:


> On 10/18/2012 11:35 AM, Gleb Natapov wrote:
>
>> You misunderstood the description. V_INTR_MASKING=1 means that CR8 writes
>> are not propagated to real HW APIC.
>>
>> But KVM does not trap access to CR8 unconditionally. It enables CR8
>> intercept only when there is pending interrupt in IRR that cannot be
>> immediately delivered due to current TPR value. This should eliminate 99%
>> of CR8 intercepts.
>
> Right.  You will need to expose the alternate encoding of cr8 (IIRC lock
> mov reg, cr0) on AMD via cpuid, but otherwise it should just work.  Be
> aware that this will break cross-vendor migration.


Thanks for the clarifications. I will try that.


How to do fast accesses to LAPIC TPR under kvm?

2012-10-17 Thread Stefan Fritsch
Hi,

OpenBSD/i386 seems to be one of the few operating systems that still 
uses the LAPIC task priority register for interrupt handling. On AMD 
CPUs and on older Intel CPUs without the flexpriority feature, this 
causes a huge performance impact on kvm. I have seen slowdown by a 
factor of 10.

Is there a way to use the TPR under kvm without the slowdown? There 
are some MSRs inherited from Hyper-V, but using these does not make 
that much difference. I think this is because they still cause a 
vmexit for every TPR access. I expect the same is true for x2apic 
emulation, isn't it?

There is also the kvmvapic, but kvm does not expose a sane interface 
to it and only uses it for Windows XP specific binary patching.

Another possibility is TPR access via CR8 on AMD, but the missing 
cr8_legacy CPUID bit and this discussion [1] make me believe that this 
is not supported under kvm, at least in 32bit mode. Could this be 
easily fixed? If yes, would it solve the performance problems, i.e. 
offer performance comparable to Intel's flexpriority feature?

OpenBSD seems to be reluctant to stop using the TPR. In fact, in a 
recent discussion, there has been a suggestion that OpenBSD should 
switch to using TPR also on OpenBSD/amd64 to solve some problems with 
boot interrupts. How do you expect this would affect performance under 
kvm (if using CR8)?

Or do you have any other suggestions? One could also modify kvm to 
expose a real interface to kvmvapic, e.g. allow the guest OS to 
provide the virtual address of the option rom and the offset of the 
CPU number in the %fs segment, instead of using hard coded values for 
Windows XP.

Cheers,
Stefan

[1] http://www.mail-archive.com/kvm@vger.kernel.org/msg30627.html