Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-06 Thread Avi Kivity

On 02/03/2011 04:15 PM, Gleb Natapov wrote:


  Maybe this is true for the in-kernel model, but I don't see the issue
  (anymore) for the way user space works.

With patch below I can boot Windows7.

diff --git a/hw/apic.c b/hw/apic.c
index 146deca..fdcac88 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
  intno = get_highest_priority_int(s-irr);
  if (intno  0)
  return -1;
-if (s-tpr  intno= s-tpr)
+if ((s-tpr  4)  (intno  4)= (s-tpr  4))
  return s-spurious_vec  0xff;
  reset_bit(s-irr, intno);
  set_bit(s-isr, intno);


That still allows interrupts that have higher priority than the TPR, but 
lower priority than interrupts in the ISR to be injected.  I think we 
need to use the PPR here (same as apic_update_irq()).


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-06 Thread Gleb Natapov
On Sun, Feb 06, 2011 at 12:26:40PM +0200, Avi Kivity wrote:
 On 02/03/2011 04:15 PM, Gleb Natapov wrote:
 
   Maybe this is true for the in-kernel model, but I don't see the issue
   (anymore) for the way user space works.
 
 With patch below I can boot Windows7.
 
 diff --git a/hw/apic.c b/hw/apic.c
 index 146deca..fdcac88 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
   intno = get_highest_priority_int(s-irr);
   if (intno  0)
   return -1;
 -if (s-tpr  intno= s-tpr)
 +if ((s-tpr  4)  (intno  4)= (s-tpr  4))
   return s-spurious_vec  0xff;
   reset_bit(s-irr, intno);
   set_bit(s-isr, intno);
 
 That still allows interrupts that have higher priority than the TPR,
 but lower priority than interrupts in the ISR to be injected.  I
 think we need to use the PPR here (same as apic_update_irq()).
 
We shouldn't get here if isr is non-empty, but see the patch I posted
today to qemu-devel. It does what you say anyway.

--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Avi Kivity

On 02/02/2011 05:52 PM, Jan Kiszka wrote:


  If there is no problem in the logic of this commit (and I do not see
  one yet) then we somewhere miss kicking vcpu when interrupt, that should be
  handled, arrives?

I'm not yet confident about the logic of the kernel patch: mov to cr8 is
serializing. If the guest raises the tpr and then signals this with a
succeeding, non vm-exiting instruction to the other vcpus, one of those
could inject an interrupt with a higher priority than the previous tpr,
but a lower one than current tpr. QEMU user space would accept this
interrupt - and would likely surprise the guest. Do I miss something?


apic_get_interrupt() is only called from the vcpu thread, so it should 
see a correct tpr.


The only difference I can see with the patch is that we may issue a 
spurious cpu_interrupt().  But that shouldn't do anything bad, should it?


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka
On 2011-02-03 08:42, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 05:51:32PM +0100, Jan Kiszka wrote:
 Just did so, and I can no longer reproduce the problem. Hmm...

 If there is no problem in the logic of this commit (and I do not see
 one yet) then we somewhere miss kicking vcpu when interrupt, that 
 should be
 handled, arrives?

 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?

 Injection happens by vcpu thread on cpu entry:
 run-request_interrupt_window = kvm_arch_try_push_interrupts(env);
 and tpr is synced on vcpu exit, so I do not yet see how what you describe
 above may happen since during injection vcpu should see correct tpr.

 Hmm, maybe this is the key: Once we call into apic_get_interrupt
 (because CPU_INTERRUPT_HARD was set as described above) and we find a
 pending irq below the tpr, we inject a spurious vector instead.

 That should be easy to verify. I expect Windows to BSOD upon receiving
 spurious vector though.

 I hacked spurious irq injection away, but the issue remains. At the same
 time, Windows is receiving tons of spurious interrupts without any
 complaints, even without that tpr optimization in the kernel. So this is
 obviously not yet the key.

 Let's try your idea that we miss a wakeup.

 That is unlikely too. If vcpu missed wakeup, info cpus would solve the
 hang since it would kick vcpu out of the kernel and missed interrupt would be
 injected on re-entry.

Yeah, and it wouldn't explain the various BSOFs I'm seeing (you get an
even broader spectrum when trying the Windows installations DVDs).

We are probably digging at the wrong site.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka
On 2011-02-03 09:18, Avi Kivity wrote:
 On 02/02/2011 05:52 PM, Jan Kiszka wrote:

  If there is no problem in the logic of this commit (and I do not see
  one yet) then we somewhere miss kicking vcpu when interrupt, that should be
  handled, arrives?

 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?
 
 apic_get_interrupt() is only called from the vcpu thread, so it should 
 see a correct tpr.
 
 The only difference I can see with the patch is that we may issue a 
 spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

I tested this yesterday, and it doesn't confuse Windows. It actually
receives quite a few spurious IRQs in normal operation, w/ or w/o the
kernel's tpr optimization.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Avi Kivity

On 02/03/2011 11:32 AM, Jan Kiszka wrote:

On 2011-02-03 09:18, Avi Kivity wrote:
  On 02/02/2011 05:52 PM, Jan Kiszka wrote:

   If there is no problem in the logic of this commit (and I do not see
   one yet) then we somewhere miss kicking vcpu when interrupt, that should 
be
   handled, arrives?

  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
  serializing. If the guest raises the tpr and then signals this with a
  succeeding, non vm-exiting instruction to the other vcpus, one of those
  could inject an interrupt with a higher priority than the previous tpr,
  but a lower one than current tpr. QEMU user space would accept this
  interrupt - and would likely surprise the guest. Do I miss something?

  apic_get_interrupt() is only called from the vcpu thread, so it should
  see a correct tpr.

  The only difference I can see with the patch is that we may issue a
  spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

I tested this yesterday, and it doesn't confuse Windows. It actually
receives quite a few spurious IRQs in normal operation, w/ or w/o the
kernel's tpr optimization.


I don't see why there should be any spurious interrupts in normal 
operation.  From the docs, these happen due to an INTA cycle racing with 
raising the TPR, but in ioapic mode, there shouldn't be any INTA cycles.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Marcelo Tosatti
On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
 On 2011-02-03 09:18, Avi Kivity wrote:
  On 02/02/2011 05:52 PM, Jan Kiszka wrote:
 
   If there is no problem in the logic of this commit (and I do not see
   one yet) then we somewhere miss kicking vcpu when interrupt, that should 
  be
   handled, arrives?
 
  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
  serializing. If the guest raises the tpr and then signals this with a
  succeeding, non vm-exiting instruction to the other vcpus, one of those
  could inject an interrupt with a higher priority than the previous tpr,
  but a lower one than current tpr. QEMU user space would accept this
  interrupt - and would likely surprise the guest. Do I miss something?
  
  apic_get_interrupt() is only called from the vcpu thread, so it should 
  see a correct tpr.
  
  The only difference I can see with the patch is that we may issue a 
  spurious cpu_interrupt().  But that shouldn't do anything bad, should it?
 
 I tested this yesterday, and it doesn't confuse Windows. It actually
 receives quite a few spurious IRQs in normal operation, w/ or w/o the
 kernel's tpr optimization.

http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html

tpr of a vcpu should always be inspected in vcpu context, instead of 
iothread context?




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka
On 2011-02-03 11:04, Marcelo Tosatti wrote:
 On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
 On 2011-02-03 09:18, Avi Kivity wrote:
 On 02/02/2011 05:52 PM, Jan Kiszka wrote:

  If there is no problem in the logic of this commit (and I do not see
  one yet) then we somewhere miss kicking vcpu when interrupt, that should 
 be
  handled, arrives?

 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?

 apic_get_interrupt() is only called from the vcpu thread, so it should 
 see a correct tpr.

 The only difference I can see with the patch is that we may issue a 
 spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

 I tested this yesterday, and it doesn't confuse Windows. It actually
 receives quite a few spurious IRQs in normal operation, w/ or w/o the
 kernel's tpr optimization.
 
 http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html

Don't get the scenario yet: We do not inject (or set isr) over the
context of apic_set_irq caller.

 
 tpr of a vcpu should always be inspected in vcpu context, instead of 
 iothread context?

Maybe this is true for the in-kernel model, but I don't see the issue
(anymore) for the way user space works.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka
On 2011-02-03 11:01, Avi Kivity wrote:
 On 02/03/2011 11:32 AM, Jan Kiszka wrote:
 On 2011-02-03 09:18, Avi Kivity wrote:
  On 02/02/2011 05:52 PM, Jan Kiszka wrote:

   If there is no problem in the logic of this commit (and I do not see
   one yet) then we somewhere miss kicking vcpu when interrupt, that 
 should be
   handled, arrives?

  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
  serializing. If the guest raises the tpr and then signals this with a
  succeeding, non vm-exiting instruction to the other vcpus, one of those
  could inject an interrupt with a higher priority than the previous tpr,
  but a lower one than current tpr. QEMU user space would accept this
  interrupt - and would likely surprise the guest. Do I miss something?

  apic_get_interrupt() is only called from the vcpu thread, so it should
  see a correct tpr.

  The only difference I can see with the patch is that we may issue a
  spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

 I tested this yesterday, and it doesn't confuse Windows. It actually
 receives quite a few spurious IRQs in normal operation, w/ or w/o the
 kernel's tpr optimization.
 
 I don't see why there should be any spurious interrupts in normal 
 operation.  From the docs, these happen due to an INTA cycle racing with 
 raising the TPR, but in ioapic mode, there shouldn't be any INTA cycles.
 

I added an instrumentation to the line of apic_get_interrupt that
returns the spurious vector, and it triggered fairly often. Just didn't
examined why this happens even without the tpr optimization.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Gleb Natapov
On Thu, Feb 03, 2011 at 11:11:23AM +0100, Jan Kiszka wrote:
 On 2011-02-03 11:04, Marcelo Tosatti wrote:
  On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
  On 2011-02-03 09:18, Avi Kivity wrote:
  On 02/02/2011 05:52 PM, Jan Kiszka wrote:
 
   If there is no problem in the logic of this commit (and I do not see
   one yet) then we somewhere miss kicking vcpu when interrupt, that 
  should be
   handled, arrives?
 
  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
  serializing. If the guest raises the tpr and then signals this with a
  succeeding, non vm-exiting instruction to the other vcpus, one of those
  could inject an interrupt with a higher priority than the previous tpr,
  but a lower one than current tpr. QEMU user space would accept this
  interrupt - and would likely surprise the guest. Do I miss something?
 
  apic_get_interrupt() is only called from the vcpu thread, so it should 
  see a correct tpr.
 
  The only difference I can see with the patch is that we may issue a 
  spurious cpu_interrupt().  But that shouldn't do anything bad, should it?
 
  I tested this yesterday, and it doesn't confuse Windows. It actually
  receives quite a few spurious IRQs in normal operation, w/ or w/o the
  kernel's tpr optimization.
  
  http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html
 
 Don't get the scenario yet: We do not inject (or set isr) over the
 context of apic_set_irq caller.
 
  
  tpr of a vcpu should always be inspected in vcpu context, instead of 
  iothread context?
 
 Maybe this is true for the in-kernel model, but I don't see the issue
 (anymore) for the way user space works.
 
With patch below I can boot Windows7.

diff --git a/hw/apic.c b/hw/apic.c
index 146deca..fdcac88 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
 intno = get_highest_priority_int(s-irr);
 if (intno  0)
 return -1;
-if (s-tpr  intno = s-tpr)
+if ((s-tpr  4)  (intno  4) = (s-tpr  4))
 return s-spurious_vec  0xff;
 reset_bit(s-irr, intno);
 set_bit(s-isr, intno);
--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka
On 2011-02-03 15:15, Gleb Natapov wrote:
 On Thu, Feb 03, 2011 at 11:11:23AM +0100, Jan Kiszka wrote:
 On 2011-02-03 11:04, Marcelo Tosatti wrote:
 On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
 On 2011-02-03 09:18, Avi Kivity wrote:
 On 02/02/2011 05:52 PM, Jan Kiszka wrote:

  If there is no problem in the logic of this commit (and I do not see
  one yet) then we somewhere miss kicking vcpu when interrupt, that 
 should be
  handled, arrives?

 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?

 apic_get_interrupt() is only called from the vcpu thread, so it should 
 see a correct tpr.

 The only difference I can see with the patch is that we may issue a 
 spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

 I tested this yesterday, and it doesn't confuse Windows. It actually
 receives quite a few spurious IRQs in normal operation, w/ or w/o the
 kernel's tpr optimization.

 http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html

 Don't get the scenario yet: We do not inject (or set isr) over the
 context of apic_set_irq caller.


 tpr of a vcpu should always be inspected in vcpu context, instead of 
 iothread context?

 Maybe this is true for the in-kernel model, but I don't see the issue
 (anymore) for the way user space works.

 With patch below I can boot Windows7.
 
 diff --git a/hw/apic.c b/hw/apic.c
 index 146deca..fdcac88 100644
 --- a/hw/apic.c
 +++ b/hw/apic.c
 @@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
  intno = get_highest_priority_int(s-irr);
  if (intno  0)
  return -1;
 -if (s-tpr  intno = s-tpr)
 +if ((s-tpr  4)  (intno  4) = (s-tpr  4))
  return s-spurious_vec  0xff;
  reset_bit(s-irr, intno);
  set_bit(s-isr, intno);
 --
   Gleb.

Cool, /me too. I would just suggest

diff --git a/hw/apic.c b/hw/apic.c
index 05a115f..13bd7b4 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -582,6 +582,7 @@ int apic_get_interrupt(DeviceState *d)
 {
 APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
 int intno;
+int tpr;
 
 /* if the APIC is installed or enabled, we let the 8259 handle the
IRQs */
@@ -594,8 +595,10 @@ int apic_get_interrupt(DeviceState *d)
 intno = get_highest_priority_int(s-irr);
 if (intno  0)
 return -1;
-if (s-tpr  intno = s-tpr)
+tpr = s-tpr  4;
+if (tpr  (intno  4) = tpr) {
 return s-spurious_vec  0xff;
+}
 reset_bit(s-irr, intno);
 set_bit(s-isr, intno);
 apic_update_irq(s);


Unfortunately, that issue was not related to the emulation mode
problems of QEMU.

Thanks!
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Tue, Feb 01, 2011 at 07:02:03PM +0100, Jan Kiszka wrote:
 Hi,
 
 testing my KVM patches, I noticed that none of the 64-bit Windows
 versions I have around (early Win7  2003 server) boot in KVM mode when
 using 2 or more VCPUs and the user space irqchip. This applies to both
 upstream KVM and qemu-kvm, with our without any of my current patches. A
 subtle difference in the APIC/IOAPIC emulation?
 
 Can anyone confirm this?
 
Just booted windows7-64 on qemu-kvm -no-kvm-irqchip + latest kernel here.

--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Wed, Feb 02, 2011 at 02:09:24PM +0100, Jan Kiszka wrote:
 On 2011-02-02 14:05, Avi Kivity wrote:
  On 02/02/2011 02:50 PM, Jan Kiszka wrote:
 
   Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.
 
  Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
  Screen (Stop 0x00b8).
  
  Userspace APIC is broken since it may run with an outdated cr8, does 
  reverting 27a4f7976d5 help?
  
 
 -ECOMMITNOTFOUND, neither in qemu-kvm nor upstream.
 
This is kernel commit, but it is too old. I am pretty sure userspace irq
chip worked back then.

--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 14:05, Avi Kivity wrote:
 On 02/02/2011 02:50 PM, Jan Kiszka wrote:

  Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

 Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
 Screen (Stop 0x00b8).
 
 Userspace APIC is broken since it may run with an outdated cr8, does 
 reverting 27a4f7976d5 help?
 

-ECOMMITNOTFOUND, neither in qemu-kvm nor upstream.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Wed, Feb 02, 2011 at 03:14:26PM +0200, Avi Kivity wrote:
 On 02/02/2011 03:11 PM, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 02:09:24PM +0100, Jan Kiszka wrote:
   On 2011-02-02 14:05, Avi Kivity wrote:
 On 02/02/2011 02:50 PM, Jan Kiszka wrote:
   
  Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.
   
 Ah, good (or not good). With Windows 2003 Server, I actually get a 
  Blue
 Screen (Stop 0x00b8).
   
 Userspace APIC is broken since it may run with an outdated cr8, does
 reverting 27a4f7976d5 help?
   
 
   -ECOMMITNOTFOUND, neither in qemu-kvm nor upstream.
 
 This is kernel commit, but it is too old. I am pretty sure userspace irq
 chip worked back then.
 
 I have memories of it failing autotest, but the conclusion was that
 the commit did not cause the problem, simply enlarged the window in
 which it could happen.
 
Ah, I remember something like that. RHEL6 kernel + qemu-kvm fails
in the same way so problem is not new.
 
--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Avi Kivity

On 02/02/2011 03:11 PM, Gleb Natapov wrote:

On Wed, Feb 02, 2011 at 02:09:24PM +0100, Jan Kiszka wrote:
  On 2011-02-02 14:05, Avi Kivity wrote:
On 02/02/2011 02:50 PM, Jan Kiszka wrote:
  
 Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.
  
Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
Screen (Stop 0x00b8).
  
Userspace APIC is broken since it may run with an outdated cr8, does
reverting 27a4f7976d5 help?
  

  -ECOMMITNOTFOUND, neither in qemu-kvm nor upstream.

This is kernel commit, but it is too old. I am pretty sure userspace irq
chip worked back then.


I have memories of it failing autotest, but the conclusion was that the 
commit did not cause the problem, simply enlarged the window in which it 
could happen.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Avi Kivity

On 02/02/2011 02:50 PM, Jan Kiszka wrote:


  Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
Screen (Stop 0x00b8).


Userspace APIC is broken since it may run with an outdated cr8, does 
reverting 27a4f7976d5 help?


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 14:05, Avi Kivity wrote:
 On 02/02/2011 02:50 PM, Jan Kiszka wrote:

  Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

 Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
 Screen (Stop 0x00b8).
 
 Userspace APIC is broken since it may run with an outdated cr8, does 
 reverting 27a4f7976d5 help?

Can you elaborate on what is broken? The way hw/apic.c maintains the
tpr? Would it make sense to compare this against the in-kernel model? Or
do you mean something else?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 12:55, Gleb Natapov wrote:
 On Tue, Feb 01, 2011 at 07:02:03PM +0100, Jan Kiszka wrote:
 Hi,

 testing my KVM patches, I noticed that none of the 64-bit Windows
 versions I have around (early Win7  2003 server) boot in KVM mode when
 using 2 or more VCPUs and the user space irqchip. This applies to both
 upstream KVM and qemu-kvm, with our without any of my current patches. A
 subtle difference in the APIC/IOAPIC emulation?

 Can anyone confirm this?

 Just booted windows7-64 on qemu-kvm -no-kvm-irqchip + latest kernel here.

Strange. -smp 2?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Avi Kivity

On 02/02/2011 04:30 PM, Jan Kiszka wrote:

On 2011-02-02 14:05, Avi Kivity wrote:
  On 02/02/2011 02:50 PM, Jan Kiszka wrote:

   Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

  Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
  Screen (Stop 0x00b8).

  Userspace APIC is broken since it may run with an outdated cr8, does
  reverting 27a4f7976d5 help?

Can you elaborate on what is broken? The way hw/apic.c maintains the
tpr? Would it make sense to compare this against the in-kernel model? Or
do you mean something else?


The problem, IIRC, was that we look up the TPR but it may already have 
been changed by the running vcpu.  Not 100% sure.


If that is indeed the problem then the fix would be to process the APIC 
in vcpu context (which is what the kernel does - we set a bit in the IRR 
and all further processing is synchronous).


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 15:43, Jan Kiszka wrote:
 On 2011-02-02 15:35, Avi Kivity wrote:
 On 02/02/2011 04:30 PM, Jan Kiszka wrote:
 On 2011-02-02 14:05, Avi Kivity wrote:
  On 02/02/2011 02:50 PM, Jan Kiszka wrote:

   Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

  Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
  Screen (Stop 0x00b8).

  Userspace APIC is broken since it may run with an outdated cr8, does
  reverting 27a4f7976d5 help?

 Can you elaborate on what is broken? The way hw/apic.c maintains the
 tpr? Would it make sense to compare this against the in-kernel model? Or
 do you mean something else?

 The problem, IIRC, was that we look up the TPR but it may already have 
 been changed by the running vcpu.  Not 100% sure.

 If that is indeed the problem then the fix would be to process the APIC 
 in vcpu context (which is what the kernel does - we set a bit in the IRR 
 and all further processing is synchronous).
 
 You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
 then we return from the kernel and overwrite the tpr in the apic with
 the vcpu's view, right?

Hmm, probably rather that there is a discrepancy between tpr and irr.
The latter is changed asynchronously /wrt to the vcpu, the former /wrt
the user space device model.

Run apic_set_irq on the vcpu?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 17:39, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 05:36:53PM +0100, Jan Kiszka wrote:
 On 2011-02-02 17:29, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 04:52:11PM +0100, Jan Kiszka wrote:
 On 2011-02-02 16:46, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
 On 2011-02-02 16:09, Avi Kivity wrote:
 On 02/02/2011 04:52 PM, Jan Kiszka wrote:
 On 2011-02-02 15:43, Jan Kiszka wrote:
  On 2011-02-02 15:35, Avi Kivity wrote:
  On 02/02/2011 04:30 PM, Jan Kiszka wrote:
  On 2011-02-02 14:05, Avi Kivity wrote:
   On 02/02/2011 02:50 PM, Jan Kiszka wrote:

Opps, -smp 1. With -smp 2 it boot almost completely and then 
 hangs.

   Ah, good (or not good). With Windows 2003 Server, I actually 
 get a Blue
   Screen (Stop 0x00b8).

   Userspace APIC is broken since it may run with an outdated cr8, 
 does
   reverting 27a4f7976d5 help?

  Can you elaborate on what is broken? The way hw/apic.c maintains 
 the
  tpr? Would it make sense to compare this against the in-kernel 
 model? Or
  do you mean something else?

  The problem, IIRC, was that we look up the TPR but it may already 
 have
  been changed by the running vcpu.  Not 100% sure.

  If that is indeed the problem then the fix would be to process the 
 APIC
  in vcpu context (which is what the kernel does - we set a bit in 
 the IRR
  and all further processing is synchronous).

  You mean: user space changes the tpr value while the vcpu is in 
 KVM_RUN,
  then we return from the kernel and overwrite the tpr in the apic with
  the vcpu's view, right?

 Hmm, probably rather that there is a discrepancy between tpr and irr.
 The latter is changed asynchronously /wrt to the vcpu, the former /wrt
 the user space device model.

 And yet, both are synchronized via qemu_mutex.  So we're still missing 
 something in this picture.

 Run apic_set_irq on the vcpu?

 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
  apic_irq_delivered += !get_bit(s-irr, vector_num);

  trace_apic_set_irq(apic_irq_delivered);

  set_bit(s-irr, vector_num);

 This is even more async with kernel irqchip

  if (trigger_mode)
  set_bit(s-tmr, vector_num);
  else
  reset_bit(s-tmr, vector_num);

 This is protected by qemu_mutex

  apic_update_irq(s);

 This will be run the next time the vcpu exits, via apic_get_interrupt().

 The decision to pend an IRQ (and potentially kick the vcpu) takes place
 immediately in acip_update_irq. And it is based on current irr as well
 as tpr. But we update again when user space returns with a new value.


 }

 Did you check whether reverting that commit helps?


 Just did so, and I can no longer reproduce the problem. Hmm...

 If there is no problem in the logic of this commit (and I do not see
 one yet) then we somewhere miss kicking vcpu when interrupt, that should 
 be
 handled, arrives?

 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?

 Injection happens by vcpu thread on cpu entry:
 run-request_interrupt_window = kvm_arch_try_push_interrupts(env);
 and tpr is synced on vcpu exit, so I do not yet see how what you describe
 above may happen since during injection vcpu should see correct tpr.

 Hmm, maybe this is the key: Once we call into apic_get_interrupt
 (because CPU_INTERRUPT_HARD was set as described above) and we find a
 pending irq below the tpr, we inject a spurious vector instead.

 That should be easy to verify. I expect Windows to BSOD upon receiving
 spurious vector though.

I hacked spurious irq injection away, but the issue remains. At the same
time, Windows is receiving tons of spurious interrupts without any
complaints, even without that tpr optimization in the kernel. So this is
obviously not yet the key.

Let's try your idea that we miss a wakeup.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Wed, Feb 02, 2011 at 04:52:11PM +0100, Jan Kiszka wrote:
 On 2011-02-02 16:46, Gleb Natapov wrote:
  On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
  On 2011-02-02 16:09, Avi Kivity wrote:
  On 02/02/2011 04:52 PM, Jan Kiszka wrote:
  On 2011-02-02 15:43, Jan Kiszka wrote:
   On 2011-02-02 15:35, Avi Kivity wrote:
   On 02/02/2011 04:30 PM, Jan Kiszka wrote:
   On 2011-02-02 14:05, Avi Kivity wrote:
On 02/02/2011 02:50 PM, Jan Kiszka wrote:
 
 Opps, -smp 1. With -smp 2 it boot almost completely and then 
  hangs.
 
Ah, good (or not good). With Windows 2003 Server, I actually get 
  a Blue
Screen (Stop 0x00b8).
 
Userspace APIC is broken since it may run with an outdated cr8, 
  does
reverting 27a4f7976d5 help?
 
   Can you elaborate on what is broken? The way hw/apic.c maintains the
   tpr? Would it make sense to compare this against the in-kernel 
  model? Or
   do you mean something else?
 
   The problem, IIRC, was that we look up the TPR but it may already have
   been changed by the running vcpu.  Not 100% sure.
 
   If that is indeed the problem then the fix would be to process the 
  APIC
   in vcpu context (which is what the kernel does - we set a bit in the 
  IRR
   and all further processing is synchronous).
 
   You mean: user space changes the tpr value while the vcpu is in 
  KVM_RUN,
   then we return from the kernel and overwrite the tpr in the apic with
   the vcpu's view, right?
 
  Hmm, probably rather that there is a discrepancy between tpr and irr.
  The latter is changed asynchronously /wrt to the vcpu, the former /wrt
  the user space device model.
 
  And yet, both are synchronized via qemu_mutex.  So we're still missing 
  something in this picture.
 
  Run apic_set_irq on the vcpu?
 
  static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
  {
   apic_irq_delivered += !get_bit(s-irr, vector_num);
 
   trace_apic_set_irq(apic_irq_delivered);
 
   set_bit(s-irr, vector_num);
 
  This is even more async with kernel irqchip
 
   if (trigger_mode)
   set_bit(s-tmr, vector_num);
   else
   reset_bit(s-tmr, vector_num);
 
  This is protected by qemu_mutex
 
   apic_update_irq(s);
 
  This will be run the next time the vcpu exits, via apic_get_interrupt().
 
  The decision to pend an IRQ (and potentially kick the vcpu) takes place
  immediately in acip_update_irq. And it is based on current irr as well
  as tpr. But we update again when user space returns with a new value.
 
 
  }
 
  Did you check whether reverting that commit helps?
 
 
  Just did so, and I can no longer reproduce the problem. Hmm...
 
  If there is no problem in the logic of this commit (and I do not see
  one yet) then we somewhere miss kicking vcpu when interrupt, that should be
  handled, arrives?
 
 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?
 
Injection happens by vcpu thread on cpu entry:
run-request_interrupt_window = kvm_arch_try_push_interrupts(env);
and tpr is synced on vcpu exit, so I do not yet see how what you describe
above may happen since during injection vcpu should see correct tpr.

--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Wed, Feb 02, 2011 at 05:36:53PM +0100, Jan Kiszka wrote:
 On 2011-02-02 17:29, Gleb Natapov wrote:
  On Wed, Feb 02, 2011 at 04:52:11PM +0100, Jan Kiszka wrote:
  On 2011-02-02 16:46, Gleb Natapov wrote:
  On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
  On 2011-02-02 16:09, Avi Kivity wrote:
  On 02/02/2011 04:52 PM, Jan Kiszka wrote:
  On 2011-02-02 15:43, Jan Kiszka wrote:
   On 2011-02-02 15:35, Avi Kivity wrote:
   On 02/02/2011 04:30 PM, Jan Kiszka wrote:
   On 2011-02-02 14:05, Avi Kivity wrote:
On 02/02/2011 02:50 PM, Jan Kiszka wrote:
 
 Opps, -smp 1. With -smp 2 it boot almost completely and then 
  hangs.
 
Ah, good (or not good). With Windows 2003 Server, I actually 
  get a Blue
Screen (Stop 0x00b8).
 
Userspace APIC is broken since it may run with an outdated cr8, 
  does
reverting 27a4f7976d5 help?
 
   Can you elaborate on what is broken? The way hw/apic.c maintains 
  the
   tpr? Would it make sense to compare this against the in-kernel 
  model? Or
   do you mean something else?
 
   The problem, IIRC, was that we look up the TPR but it may already 
  have
   been changed by the running vcpu.  Not 100% sure.
 
   If that is indeed the problem then the fix would be to process the 
  APIC
   in vcpu context (which is what the kernel does - we set a bit in 
  the IRR
   and all further processing is synchronous).
 
   You mean: user space changes the tpr value while the vcpu is in 
  KVM_RUN,
   then we return from the kernel and overwrite the tpr in the apic with
   the vcpu's view, right?
 
  Hmm, probably rather that there is a discrepancy between tpr and irr.
  The latter is changed asynchronously /wrt to the vcpu, the former /wrt
  the user space device model.
 
  And yet, both are synchronized via qemu_mutex.  So we're still missing 
  something in this picture.
 
  Run apic_set_irq on the vcpu?
 
  static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
  {
   apic_irq_delivered += !get_bit(s-irr, vector_num);
 
   trace_apic_set_irq(apic_irq_delivered);
 
   set_bit(s-irr, vector_num);
 
  This is even more async with kernel irqchip
 
   if (trigger_mode)
   set_bit(s-tmr, vector_num);
   else
   reset_bit(s-tmr, vector_num);
 
  This is protected by qemu_mutex
 
   apic_update_irq(s);
 
  This will be run the next time the vcpu exits, via apic_get_interrupt().
 
  The decision to pend an IRQ (and potentially kick the vcpu) takes place
  immediately in acip_update_irq. And it is based on current irr as well
  as tpr. But we update again when user space returns with a new value.
 
 
  }
 
  Did you check whether reverting that commit helps?
 
 
  Just did so, and I can no longer reproduce the problem. Hmm...
 
  If there is no problem in the logic of this commit (and I do not see
  one yet) then we somewhere miss kicking vcpu when interrupt, that should 
  be
  handled, arrives?
 
  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
  serializing. If the guest raises the tpr and then signals this with a
  succeeding, non vm-exiting instruction to the other vcpus, one of those
  could inject an interrupt with a higher priority than the previous tpr,
  but a lower one than current tpr. QEMU user space would accept this
  interrupt - and would likely surprise the guest. Do I miss something?
 
  Injection happens by vcpu thread on cpu entry:
  run-request_interrupt_window = kvm_arch_try_push_interrupts(env);
  and tpr is synced on vcpu exit, so I do not yet see how what you describe
  above may happen since during injection vcpu should see correct tpr.
 
 Hmm, maybe this is the key: Once we call into apic_get_interrupt
 (because CPU_INTERRUPT_HARD was set as described above) and we find a
 pending irq below the tpr, we inject a spurious vector instead.
 
That should be easy to verify. I expect Windows to BSOD upon receiving
spurious vector though.

--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 15:35, Avi Kivity wrote:
 On 02/02/2011 04:30 PM, Jan Kiszka wrote:
 On 2011-02-02 14:05, Avi Kivity wrote:
  On 02/02/2011 02:50 PM, Jan Kiszka wrote:

   Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

  Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
  Screen (Stop 0x00b8).

  Userspace APIC is broken since it may run with an outdated cr8, does
  reverting 27a4f7976d5 help?

 Can you elaborate on what is broken? The way hw/apic.c maintains the
 tpr? Would it make sense to compare this against the in-kernel model? Or
 do you mean something else?
 
 The problem, IIRC, was that we look up the TPR but it may already have 
 been changed by the running vcpu.  Not 100% sure.
 
 If that is indeed the problem then the fix would be to process the APIC 
 in vcpu context (which is what the kernel does - we set a bit in the IRR 
 and all further processing is synchronous).

You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
then we return from the kernel and overwrite the tpr in the apic with
the vcpu's view, right?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Avi Kivity

On 02/02/2011 05:35 PM, Jan Kiszka wrote:


  And yet, both are synchronized via qemu_mutex.  So we're still missing
  something in this picture.

  Run apic_set_irq on the vcpu?

  static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
  {
   apic_irq_delivered += !get_bit(s-irr, vector_num);

   trace_apic_set_irq(apic_irq_delivered);

   set_bit(s-irr, vector_num);

  This is even more async with kernel irqchip

   if (trigger_mode)
   set_bit(s-tmr, vector_num);
   else
   reset_bit(s-tmr, vector_num);

  This is protected by qemu_mutex

   apic_update_irq(s);

  This will be run the next time the vcpu exits, via apic_get_interrupt().

The decision to pend an IRQ (and potentially kick the vcpu) takes place
immediately in acip_update_irq. And it is based on current irr as well
as tpr. But we update again when user space returns with a new value.


Right.  My current understanding is that it is correct.  But I 
distinctly remember that I came to a different conclusion when the 
failure first occurred (and the conclusion was that the patch did not 
cause the problem, just made it much more likely to see a not-up-to-date 
TPR).




  }

  Did you check whether reverting that commit helps?


Just did so, and I can no longer reproduce the problem. Hmm...


At least we have a pointer.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Wed, Feb 02, 2011 at 12:58:47PM +0100, Jan Kiszka wrote:
 On 2011-02-02 12:55, Gleb Natapov wrote:
  On Tue, Feb 01, 2011 at 07:02:03PM +0100, Jan Kiszka wrote:
  Hi,
 
  testing my KVM patches, I noticed that none of the 64-bit Windows
  versions I have around (early Win7  2003 server) boot in KVM mode when
  using 2 or more VCPUs and the user space irqchip. This applies to both
  upstream KVM and qemu-kvm, with our without any of my current patches. A
  subtle difference in the APIC/IOAPIC emulation?
 
  Can anyone confirm this?
 
  Just booted windows7-64 on qemu-kvm -no-kvm-irqchip + latest kernel here.
 
 Strange. -smp 2?
 
Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 16:46, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
 On 2011-02-02 16:09, Avi Kivity wrote:
 On 02/02/2011 04:52 PM, Jan Kiszka wrote:
 On 2011-02-02 15:43, Jan Kiszka wrote:
  On 2011-02-02 15:35, Avi Kivity wrote:
  On 02/02/2011 04:30 PM, Jan Kiszka wrote:
  On 2011-02-02 14:05, Avi Kivity wrote:
   On 02/02/2011 02:50 PM, Jan Kiszka wrote:

Opps, -smp 1. With -smp 2 it boot almost completely and then 
 hangs.

   Ah, good (or not good). With Windows 2003 Server, I actually get a 
 Blue
   Screen (Stop 0x00b8).

   Userspace APIC is broken since it may run with an outdated cr8, does
   reverting 27a4f7976d5 help?

  Can you elaborate on what is broken? The way hw/apic.c maintains the
  tpr? Would it make sense to compare this against the in-kernel model? 
 Or
  do you mean something else?

  The problem, IIRC, was that we look up the TPR but it may already have
  been changed by the running vcpu.  Not 100% sure.

  If that is indeed the problem then the fix would be to process the APIC
  in vcpu context (which is what the kernel does - we set a bit in the IRR
  and all further processing is synchronous).

  You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
  then we return from the kernel and overwrite the tpr in the apic with
  the vcpu's view, right?

 Hmm, probably rather that there is a discrepancy between tpr and irr.
 The latter is changed asynchronously /wrt to the vcpu, the former /wrt
 the user space device model.

 And yet, both are synchronized via qemu_mutex.  So we're still missing 
 something in this picture.

 Run apic_set_irq on the vcpu?

 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
  apic_irq_delivered += !get_bit(s-irr, vector_num);

  trace_apic_set_irq(apic_irq_delivered);

  set_bit(s-irr, vector_num);

 This is even more async with kernel irqchip

  if (trigger_mode)
  set_bit(s-tmr, vector_num);
  else
  reset_bit(s-tmr, vector_num);

 This is protected by qemu_mutex

  apic_update_irq(s);

 This will be run the next time the vcpu exits, via apic_get_interrupt().

 The decision to pend an IRQ (and potentially kick the vcpu) takes place
 immediately in acip_update_irq. And it is based on current irr as well
 as tpr. But we update again when user space returns with a new value.


 }

 Did you check whether reverting that commit helps?


 Just did so, and I can no longer reproduce the problem. Hmm...

 If there is no problem in the logic of this commit (and I do not see
 one yet) then we somewhere miss kicking vcpu when interrupt, that should be
 handled, arrives?

I'm not yet confident about the logic of the kernel patch: mov to cr8 is
serializing. If the guest raises the tpr and then signals this with a
succeeding, non vm-exiting instruction to the other vcpus, one of those
could inject an interrupt with a higher priority than the previous tpr,
but a lower one than current tpr. QEMU user space would accept this
interrupt - and would likely surprise the guest. Do I miss something?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 13:35, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 12:58:47PM +0100, Jan Kiszka wrote:
 On 2011-02-02 12:55, Gleb Natapov wrote:
 On Tue, Feb 01, 2011 at 07:02:03PM +0100, Jan Kiszka wrote:
 Hi,

 testing my KVM patches, I noticed that none of the 64-bit Windows
 versions I have around (early Win7  2003 server) boot in KVM mode when
 using 2 or more VCPUs and the user space irqchip. This applies to both
 upstream KVM and qemu-kvm, with our without any of my current patches. A
 subtle difference in the APIC/IOAPIC emulation?

 Can anyone confirm this?

 Just booted windows7-64 on qemu-kvm -no-kvm-irqchip + latest kernel here.

 Strange. -smp 2?

 Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
Screen (Stop 0x00b8).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Avi Kivity

On 02/02/2011 04:52 PM, Jan Kiszka wrote:

On 2011-02-02 15:43, Jan Kiszka wrote:
  On 2011-02-02 15:35, Avi Kivity wrote:
  On 02/02/2011 04:30 PM, Jan Kiszka wrote:
  On 2011-02-02 14:05, Avi Kivity wrote:
   On 02/02/2011 02:50 PM, Jan Kiszka wrote:

Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

   Ah, good (or not good). With Windows 2003 Server, I actually get a Blue
   Screen (Stop 0x00b8).

   Userspace APIC is broken since it may run with an outdated cr8, does
   reverting 27a4f7976d5 help?

  Can you elaborate on what is broken? The way hw/apic.c maintains the
  tpr? Would it make sense to compare this against the in-kernel model? Or
  do you mean something else?

  The problem, IIRC, was that we look up the TPR but it may already have
  been changed by the running vcpu.  Not 100% sure.

  If that is indeed the problem then the fix would be to process the APIC
  in vcpu context (which is what the kernel does - we set a bit in the IRR
  and all further processing is synchronous).

  You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
  then we return from the kernel and overwrite the tpr in the apic with
  the vcpu's view, right?

Hmm, probably rather that there is a discrepancy between tpr and irr.
The latter is changed asynchronously /wrt to the vcpu, the former /wrt
the user space device model.


And yet, both are synchronized via qemu_mutex.  So we're still missing 
something in this picture.



Run apic_set_irq on the vcpu?


static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
{
apic_irq_delivered += !get_bit(s-irr, vector_num);

trace_apic_set_irq(apic_irq_delivered);

set_bit(s-irr, vector_num);

This is even more async with kernel irqchip

if (trigger_mode)
set_bit(s-tmr, vector_num);
else
reset_bit(s-tmr, vector_num);

This is protected by qemu_mutex

apic_update_irq(s);

This will be run the next time the vcpu exits, via apic_get_interrupt().

}

Did you check whether reverting that commit helps?

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 16:09, Avi Kivity wrote:
 On 02/02/2011 04:52 PM, Jan Kiszka wrote:
 On 2011-02-02 15:43, Jan Kiszka wrote:
  On 2011-02-02 15:35, Avi Kivity wrote:
  On 02/02/2011 04:30 PM, Jan Kiszka wrote:
  On 2011-02-02 14:05, Avi Kivity wrote:
   On 02/02/2011 02:50 PM, Jan Kiszka wrote:

Opps, -smp 1. With -smp 2 it boot almost completely and then hangs.

   Ah, good (or not good). With Windows 2003 Server, I actually get a 
 Blue
   Screen (Stop 0x00b8).

   Userspace APIC is broken since it may run with an outdated cr8, does
   reverting 27a4f7976d5 help?

  Can you elaborate on what is broken? The way hw/apic.c maintains the
  tpr? Would it make sense to compare this against the in-kernel model? Or
  do you mean something else?

  The problem, IIRC, was that we look up the TPR but it may already have
  been changed by the running vcpu.  Not 100% sure.

  If that is indeed the problem then the fix would be to process the APIC
  in vcpu context (which is what the kernel does - we set a bit in the IRR
  and all further processing is synchronous).

  You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
  then we return from the kernel and overwrite the tpr in the apic with
  the vcpu's view, right?

 Hmm, probably rather that there is a discrepancy between tpr and irr.
 The latter is changed asynchronously /wrt to the vcpu, the former /wrt
 the user space device model.
 
 And yet, both are synchronized via qemu_mutex.  So we're still missing 
 something in this picture.
 
 Run apic_set_irq on the vcpu?
 
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
  apic_irq_delivered += !get_bit(s-irr, vector_num);
 
  trace_apic_set_irq(apic_irq_delivered);
 
  set_bit(s-irr, vector_num);
 
 This is even more async with kernel irqchip
 
  if (trigger_mode)
  set_bit(s-tmr, vector_num);
  else
  reset_bit(s-tmr, vector_num);
 
 This is protected by qemu_mutex
 
  apic_update_irq(s);
 
 This will be run the next time the vcpu exits, via apic_get_interrupt().

The decision to pend an IRQ (and potentially kick the vcpu) takes place
immediately in acip_update_irq. And it is based on current irr as well
as tpr. But we update again when user space returns with a new value.

 
 }
 
 Did you check whether reverting that commit helps?
 

Just did so, and I can no longer reproduce the problem. Hmm...

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
 On 2011-02-02 16:09, Avi Kivity wrote:
  On 02/02/2011 04:52 PM, Jan Kiszka wrote:
  On 2011-02-02 15:43, Jan Kiszka wrote:
   On 2011-02-02 15:35, Avi Kivity wrote:
   On 02/02/2011 04:30 PM, Jan Kiszka wrote:
   On 2011-02-02 14:05, Avi Kivity wrote:
On 02/02/2011 02:50 PM, Jan Kiszka wrote:
 
 Opps, -smp 1. With -smp 2 it boot almost completely and then 
  hangs.
 
Ah, good (or not good). With Windows 2003 Server, I actually get a 
  Blue
Screen (Stop 0x00b8).
 
Userspace APIC is broken since it may run with an outdated cr8, does
reverting 27a4f7976d5 help?
 
   Can you elaborate on what is broken? The way hw/apic.c maintains the
   tpr? Would it make sense to compare this against the in-kernel model? 
  Or
   do you mean something else?
 
   The problem, IIRC, was that we look up the TPR but it may already have
   been changed by the running vcpu.  Not 100% sure.
 
   If that is indeed the problem then the fix would be to process the APIC
   in vcpu context (which is what the kernel does - we set a bit in the IRR
   and all further processing is synchronous).
 
   You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
   then we return from the kernel and overwrite the tpr in the apic with
   the vcpu's view, right?
 
  Hmm, probably rather that there is a discrepancy between tpr and irr.
  The latter is changed asynchronously /wrt to the vcpu, the former /wrt
  the user space device model.
  
  And yet, both are synchronized via qemu_mutex.  So we're still missing 
  something in this picture.
  
  Run apic_set_irq on the vcpu?
  
  static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
  {
   apic_irq_delivered += !get_bit(s-irr, vector_num);
  
   trace_apic_set_irq(apic_irq_delivered);
  
   set_bit(s-irr, vector_num);
  
  This is even more async with kernel irqchip
  
   if (trigger_mode)
   set_bit(s-tmr, vector_num);
   else
   reset_bit(s-tmr, vector_num);
  
  This is protected by qemu_mutex
  
   apic_update_irq(s);
  
  This will be run the next time the vcpu exits, via apic_get_interrupt().
 
 The decision to pend an IRQ (and potentially kick the vcpu) takes place
 immediately in acip_update_irq. And it is based on current irr as well
 as tpr. But we update again when user space returns with a new value.
 
  
  }
  
  Did you check whether reverting that commit helps?
  
 
 Just did so, and I can no longer reproduce the problem. Hmm...
 
If there is no problem in the logic of this commit (and I do not see
one yet) then we somewhere miss kicking vcpu when interrupt, that should be
handled, arrives?

--
Gleb.



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Jan Kiszka
On 2011-02-02 17:29, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 04:52:11PM +0100, Jan Kiszka wrote:
 On 2011-02-02 16:46, Gleb Natapov wrote:
 On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
 On 2011-02-02 16:09, Avi Kivity wrote:
 On 02/02/2011 04:52 PM, Jan Kiszka wrote:
 On 2011-02-02 15:43, Jan Kiszka wrote:
  On 2011-02-02 15:35, Avi Kivity wrote:
  On 02/02/2011 04:30 PM, Jan Kiszka wrote:
  On 2011-02-02 14:05, Avi Kivity wrote:
   On 02/02/2011 02:50 PM, Jan Kiszka wrote:

Opps, -smp 1. With -smp 2 it boot almost completely and then 
 hangs.

   Ah, good (or not good). With Windows 2003 Server, I actually get 
 a Blue
   Screen (Stop 0x00b8).

   Userspace APIC is broken since it may run with an outdated cr8, 
 does
   reverting 27a4f7976d5 help?

  Can you elaborate on what is broken? The way hw/apic.c maintains the
  tpr? Would it make sense to compare this against the in-kernel 
 model? Or
  do you mean something else?

  The problem, IIRC, was that we look up the TPR but it may already have
  been changed by the running vcpu.  Not 100% sure.

  If that is indeed the problem then the fix would be to process the 
 APIC
  in vcpu context (which is what the kernel does - we set a bit in the 
 IRR
  and all further processing is synchronous).

  You mean: user space changes the tpr value while the vcpu is in 
 KVM_RUN,
  then we return from the kernel and overwrite the tpr in the apic with
  the vcpu's view, right?

 Hmm, probably rather that there is a discrepancy between tpr and irr.
 The latter is changed asynchronously /wrt to the vcpu, the former /wrt
 the user space device model.

 And yet, both are synchronized via qemu_mutex.  So we're still missing 
 something in this picture.

 Run apic_set_irq on the vcpu?

 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
  apic_irq_delivered += !get_bit(s-irr, vector_num);

  trace_apic_set_irq(apic_irq_delivered);

  set_bit(s-irr, vector_num);

 This is even more async with kernel irqchip

  if (trigger_mode)
  set_bit(s-tmr, vector_num);
  else
  reset_bit(s-tmr, vector_num);

 This is protected by qemu_mutex

  apic_update_irq(s);

 This will be run the next time the vcpu exits, via apic_get_interrupt().

 The decision to pend an IRQ (and potentially kick the vcpu) takes place
 immediately in acip_update_irq. And it is based on current irr as well
 as tpr. But we update again when user space returns with a new value.


 }

 Did you check whether reverting that commit helps?


 Just did so, and I can no longer reproduce the problem. Hmm...

 If there is no problem in the logic of this commit (and I do not see
 one yet) then we somewhere miss kicking vcpu when interrupt, that should be
 handled, arrives?

 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?

 Injection happens by vcpu thread on cpu entry:
 run-request_interrupt_window = kvm_arch_try_push_interrupts(env);
 and tpr is synced on vcpu exit, so I do not yet see how what you describe
 above may happen since during injection vcpu should see correct tpr.

Hmm, maybe this is the key: Once we call into apic_get_interrupt
(because CPU_INTERRUPT_HARD was set as described above) and we find a
pending irq below the tpr, we inject a spurious vector instead.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-02 Thread Gleb Natapov
On Wed, Feb 02, 2011 at 05:51:32PM +0100, Jan Kiszka wrote:
  Just did so, and I can no longer reproduce the problem. Hmm...
 
  If there is no problem in the logic of this commit (and I do not see
  one yet) then we somewhere miss kicking vcpu when interrupt, that 
  should be
  handled, arrives?
 
  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
  serializing. If the guest raises the tpr and then signals this with a
  succeeding, non vm-exiting instruction to the other vcpus, one of those
  could inject an interrupt with a higher priority than the previous tpr,
  but a lower one than current tpr. QEMU user space would accept this
  interrupt - and would likely surprise the guest. Do I miss something?
 
  Injection happens by vcpu thread on cpu entry:
  run-request_interrupt_window = kvm_arch_try_push_interrupts(env);
  and tpr is synced on vcpu exit, so I do not yet see how what you describe
  above may happen since during injection vcpu should see correct tpr.
 
  Hmm, maybe this is the key: Once we call into apic_get_interrupt
  (because CPU_INTERRUPT_HARD was set as described above) and we find a
  pending irq below the tpr, we inject a spurious vector instead.
 
  That should be easy to verify. I expect Windows to BSOD upon receiving
  spurious vector though.
 
 I hacked spurious irq injection away, but the issue remains. At the same
 time, Windows is receiving tons of spurious interrupts without any
 complaints, even without that tpr optimization in the kernel. So this is
 obviously not yet the key.
 
 Let's try your idea that we miss a wakeup.
 
That is unlikely too. If vcpu missed wakeup, info cpus would solve the
hang since it would kick vcpu out of the kernel and missed interrupt would be
injected on re-entry.

--
Gleb.



[Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-01 Thread Jan Kiszka
Hi,

testing my KVM patches, I noticed that none of the 64-bit Windows
versions I have around (early Win7  2003 server) boot in KVM mode when
using 2 or more VCPUs and the user space irqchip. This applies to both
upstream KVM and qemu-kvm, with our without any of my current patches. A
subtle difference in the APIC/IOAPIC emulation?

Can anyone confirm this?

Jan

PS: Looks like they boot fine without CONFIG_IOTHREAD (over upstream).

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux