Re: [question] e1000 interrupt storm happened because of its corresponding ioapic->irr bit always set

2014-08-25 Thread Zhang Haoyu
 Hi, all

 I use a qemu-1.4.1/qemu-2.0.0 to run win7 guest, and encounter e1000 NIC 
 interrupt storm, 
 because if (!ent->fields.mask && (ioapic->irr & (1 << i))) is always true 
 in __kvm_ioapic_update_eoi().

 Any ideas?

We have hit this several times: search for the autoneg patches for an
example of a workaround in qemu, and for the patch "kvm: ioapic:
conditionally delay irq delivery during eoi broadcast" for a workaround
in kvm (rejected).

Thanks, Jason,
I searched for e1000 autoneg in gmane.comp.emulators.qemu and found the 
patches below:
http://thread.gmane.org/gmane.comp.emulators.qemu/143001/focus=143007
http://thread.gmane.org/gmane.comp.emulators.qemu/284105/focus=284765
http://thread.gmane.org/gmane.comp.emulators.qemu/186159/focus=187351
Which one tries to fix this problem, or do all of them?

That was probably caused by something wrong in the e1000 emulation that
causes an interrupt to be injected into the Windows guest before its
interrupt handler is registered. And a Windows guest does not have a
mechanism to detect and disable the irq in such a condition.

Sorry, I don't understand.
I think an interrupt should not be enabled before its handler is 
successfully registered. 
Is it possible that the e1000 emulation injects the interrupt before the 
interrupt is successfully enabled?

Thanks,
Zhang Haoyu
 
e1000 emulation is far from stable and complete (e.g. running the e1000
ethtool selftest in a Linux guest may show lots of errors). It's
complicated and subtle (it even has undocumented registers and behaviour).
You should consider using virtio, which is more stable and faster in a kvm
guest (unless some Intel guys get involved to improve e1000 emulation).

Thanks

 Thanks,
 Zhang Haoyu



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [question] e1000 interrupt storm happened because of its corresponding ioapic->irr bit always set

2014-08-25 Thread Jason Wang
On 08/25/2014 03:17 PM, Zhang Haoyu wrote:
 Hi, all

 I use a qemu-1.4.1/qemu-2.0.0 to run win7 guest, and encounter e1000 NIC 
 interrupt storm, 
 because if (!ent->fields.mask && (ioapic->irr & (1 << i))) is always true 
 in __kvm_ioapic_update_eoi().

 Any ideas?
 We meet this several times: search the autoneg patches for an example of
 workaround for this in qemu, and patch kvm: ioapic: conditionally delay
 irq delivery during eoi broadcast for an workaround in kvm (rejected).

 Thanks, Jason,
 I searched e1000 autoneg in gmane.comp.emulators.qemu, and found below 
 patches, 
 http://thread.gmane.org/gmane.comp.emulators.qemu/143001/focus=143007

This series is the first try to fix the guest hang during guest
hibernation or driver enable/disable.
 http://thread.gmane.org/gmane.comp.emulators.qemu/284105/focus=284765
 http://thread.gmane.org/gmane.comp.emulators.qemu/186159/focus=187351

Those are follow-ups that try to fix the bugs introduced by the autoneg
hack.
 which one tries to fix this problem, or all of them?

As you can see, this kind of hack may not be as good as we expect,
since we don't know exactly how e1000 works. The register function
descriptions in Intel's manual alone may not be sufficient. If you
search for e1000 in the archives, you will find that some e1000
registers do not behave the way the spec says. Using virtio-net
instead of e1000 in the guest is strongly suggested.

 That was probably caused by something wrong in e1000 emulation which
 causes interrupt to be injected into windows guest before its interrupt
 handler is registered. And Windows guest does not have a mechanism to
 detect and disable irq in such condition.

 Sorry, I don't understand,
 I think one interrupt should not been enabled before its handler is 
 successfully registered, 
 is it possible that e1000 emulation inject the interrupt before the interrupt 
 is succesfully enabled?

 Thanks,
 Zhang Haoyu
  
 e1000 emulation is far from stable and complete (e.g run e1000 ethtool
 selftest in linux guest may see lots of errors). It's complicate and
 subtle (even has undocumented registers and behaviour). You should
 better consider to use virtio which are more stable and fast in a kvm
 guest (unless some intel guys are involved to improve e1000 emulation).

 Thanks
 Thanks,
 Zhang Haoyu





Re: [question] e1000 interrupt storm happened because of its corresponding ioapic->irr bit always set

2014-08-25 Thread Jason Wang
On 08/25/2014 03:17 PM, Zhang Haoyu wrote:
 Hi, all
 
  I use a qemu-1.4.1/qemu-2.0.0 to run win7 guest, and encounter e1000 NIC 
  interrupt storm, 
  because if (!ent->fields.mask && (ioapic->irr & (1 << i))) is always 
  true in __kvm_ioapic_update_eoi().
 
  Any ideas?
 
 We meet this several times: search the autoneg patches for an example of
 workaround for this in qemu, and patch kvm: ioapic: conditionally delay
 irq delivery during eoi broadcast for an workaround in kvm (rejected).
 
 Thanks, Jason,
 I searched e1000 autoneg in gmane.comp.emulators.qemu, and found below 
 patches, 
 http://thread.gmane.org/gmane.comp.emulators.qemu/143001/focus=143007
 http://thread.gmane.org/gmane.comp.emulators.qemu/284105/focus=284765
 http://thread.gmane.org/gmane.comp.emulators.qemu/186159/focus=187351
 which one tries to fix this problem, or all of them?

 That was probably caused by something wrong in e1000 emulation which
 causes interrupt to be injected into windows guest before its interrupt
 handler is registered. And Windows guest does not have a mechanism to
 detect and disable irq in such condition.
 
 Sorry, I don't understand,
 I think one interrupt should not been enabled before its handler is 
 successfully registered, 
 is it possible that e1000 emulation inject the interrupt before the interrupt 
 is succesfully enabled?

There's no way for qemu to know whether or not the irq handler has been
registered in the guest. So if qemu behaves differently from a physical
card, it may cause the interrupt to be injected into the guest too early.
You can search the Red Hat bugzilla for lots of related bugs, some even
with in-depth analysis.
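
To illustrate why such an early interrupt turns into a storm, here is a
minimal userspace sketch (not the kernel code) of the EOI path discussed in
this thread. It assumes a level-triggered line that the device keeps
asserted. struct ioapic_model, model_service(), model_eoi() and
simulate_eois() are hypothetical names made up for this sketch; only the
re-delivery condition mirrors the check quoted from
__kvm_ioapic_update_eoi().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NPINS 24

/* Hypothetical model of the IOAPIC state relevant to EOI handling. */
struct ioapic_model {
	uint32_t irr;                     /* bit i set: pin i is asserted */
	bool masked[MODEL_NPINS];         /* mask bit of each redirection entry */
	unsigned delivered[MODEL_NPINS];  /* interrupts injected per pin */
};

/* Inject the interrupt for one pin into the (imaginary) guest. */
static void model_service(struct ioapic_model *io, int pin)
{
	io->delivered[pin]++;
}

/* EOI for a pin: re-deliver if it is unmasked and still asserted. */
static void model_eoi(struct ioapic_model *io, int pin)
{
	if (!io->masked[pin] && (io->irr & (1u << pin)))
		model_service(io, pin);
}

/*
 * Raise the line once, then let the guest issue 'eois' EOIs. If
 * handler_acks_device is false, nothing ever makes the device lower the
 * line (e.g. no real handler is registered yet). Returns the number of
 * interrupts delivered.
 */
static unsigned simulate_eois(int pin, unsigned eois, bool handler_acks_device)
{
	struct ioapic_model io = { 0 };

	assert(pin >= 0 && pin < MODEL_NPINS);

	io.irr |= 1u << pin;      /* device raises the line */
	model_service(&io, pin);  /* first delivery */

	for (unsigned n = 0; n < eois; n++) {
		if (handler_acks_device)
			io.irr &= ~(1u << pin);  /* device deasserts the line */
		model_eoi(&io, pin);
	}
	return io.delivered[pin];
}
```

With a guest that EOIs five times without ever silencing the device,
simulate_eois(10, 5, false) reports six deliveries (the initial one plus one
per EOI); once the handler makes the device deassert the line, the count
stays at one.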

Thanks

 Thanks,
 Zhang Haoyu
  



[PATCH] KVM: x86: fix xen guest panic due to lack of KVM_REQ_EVENT

2014-08-25 Thread Wanpeng Li
This patch fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=82211

(XEN) ..MP-BIOS bug: 8254 timer not connected to IO-APIC
(XEN) ...trying to set up timer (IRQ0) through the 8259A ...  failed.
(XEN) ...trying to set up timer as Virtual Wire IRQ... failed.
(XEN) ...trying to set up timer as ExtINT IRQ... failed :(.
(XEN) 
(XEN) 
(XEN) Panic on CPU 0:
(XEN) IO-APIC + timer doesn't work!  Boot with apic_verbosity=debug and send a 
report.
(XEN) 

Commit 6addfc42992b (KVM: x86: avoid useless set of KVM_REQ_EVENT after 
emulation) sets a KVM_REQ_EVENT if an interrupt could be injected, which 
happens a) if an interrupt shadow bit (STI or MOV SS) has gone away; b) 
if the interrupt flag has just been set. However, a KVM_REQ_EVENT should 
be set if there is no sti sequence. This patch fixes it by setting a 
KVM_REQ_EVENT if neither the first nor the second instruction is sti.
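
The effect of the added check can be sketched with a small userspace model
(not kernel code). It rests on two assumptions that are not visible in the
diff context: that the function clears mask when the shadow bit was already
set, and that the existing request sits inside an "if (int_shadow || mask)"
block. model_requests_event() and model_spot_checks() are hypothetical
names used only here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the modelled toggle_interruptibility() would request
 * KVM_REQ_EVENT. int_shadow is the shadow state left by the previous
 * instruction, mask the shadow requested by the emulated instruction.
 */
static bool model_requests_event(uint32_t int_shadow, uint32_t mask,
				 bool with_patch)
{
	bool requested = false;

	if (int_shadow & mask)          /* assumption: second sti in a row */
		mask = 0;
	if (int_shadow || mask) {       /* assumption: enclosing condition */
		if (!mask)
			requested = true;   /* shadow has just gone away */
	}
	if (with_patch && !(int_shadow || mask))
		requested = true;       /* the check added by this patch */
	return requested;
}

/* A few spot checks of the model. */
static void model_spot_checks(void)
{
	assert(!model_requests_event(0, 0, false));
	assert(model_requests_event(0, 0, true));
	assert(model_requests_event(1, 0, false));
}
```

Under these assumptions, the patched function requests an event whenever
the final mask is zero, i.e. after every emulated instruction that does not
itself start a new interrupt shadow.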

Signed-off-by: Wanpeng Li <wanpeng...@linux.intel.com>
---
 arch/x86/kvm/x86.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c10408e..b7c0073 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4928,6 +4928,8 @@ static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
 		if (!mask)
 			kvm_make_request(KVM_REQ_EVENT, vcpu);
 	}
+	if (!(int_shadow || mask))
+		kvm_make_request(KVM_REQ_EVENT, vcpu);
 }
 
 static void inject_emulated_exception(struct kvm_vcpu *vcpu)
-- 
1.9.1



Re: [PATCH] KVM: avoid unnecessary synchronize_rcu

2014-08-25 Thread Christian Borntraeger
On 19/08/14 16:45, Christian Borntraeger wrote:
 We don't have to wait for a grace period if there is no oldpid that
 we are going to free. put_pid also checks for NULL, so this patch
 only fences synchronize_rcu.
 
 Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
 ---
  virt/kvm/kvm_main.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 33712fb..39b1603 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -129,7 +129,8 @@ int vcpu_load(struct kvm_vcpu *vcpu)
   struct pid *oldpid = vcpu->pid;
   struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
   rcu_assign_pointer(vcpu->pid, newpid);
 - synchronize_rcu();
 + if (oldpid)
 + synchronize_rcu();
   put_pid(oldpid);
   }
   cpu = get_cpu();
 

Ping.
That variant should be enough for us for future QEMUs. David has prepared some 
patches in QEMU that make the other problems go away (mostly); they are 
currently under internal review/test.
Let me know if you want to have the put_pid inside the if as well (or feel free 
to fix up the code and patch description yourself).

Thanks

Christian



Re: [PATCH] KVM: avoid unnecessary synchronize_rcu

2014-08-25 Thread Christian Borntraeger
On 25/08/14 10:24, Christian Borntraeger wrote:
 On 19/08/14 16:45, Christian Borntraeger wrote:
 We dont have to wait for a grace period if there is no oldpid that
 we are going to free. putpid also checks for NULL, so this patch
 only fences synchronize_rcu.

 Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
 ---
  virt/kvm/kvm_main.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 33712fb..39b1603 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -129,7 +129,8 @@ int vcpu_load(struct kvm_vcpu *vcpu)
  struct pid *oldpid = vcpu->pid;
  struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
  rcu_assign_pointer(vcpu->pid, newpid);
 -synchronize_rcu();
 +if (oldpid)
 +synchronize_rcu();
  put_pid(oldpid);
  }
  cpu = get_cpu();

 
 Ping.
 That variant should be enough for us for future QEMUs. David has prepared 
 some patches in QEMU that makes the other problems go away (mostly) which are 
 currently under internal review/test.
 Let me know if you want to have the put_pid inside the if as well (or feel 
 free to fix up yourself code  and patch description).
 
 Thanks

Just updated kvm/next. Sorry for the noise.



Re: [question] e1000 interrupt storm happened because of its corresponding ioapic->irr bit always set

2014-08-25 Thread Zhang Haoyu
 Hi, all

 I use a qemu-1.4.1/qemu-2.0.0 to run win7 guest, and encounter e1000 NIC 
 interrupt storm, 
 because if (!ent->fields.mask && (ioapic->irr & (1 << i))) is always 
 true in __kvm_ioapic_update_eoi().

 Any ideas?
 We meet this several times: search the autoneg patches for an example of
 workaround for this in qemu, and patch kvm: ioapic: conditionally delay
 irq delivery during eoi broadcast for an workaround in kvm (rejected).

 Thanks, Jason,
 I searched e1000 autoneg in gmane.comp.emulators.qemu, and found below 
 patches, 
 http://thread.gmane.org/gmane.comp.emulators.qemu/143001/focus=143007

This series is the first try to fix the guest hang during guest
hibernation or driver enable/disable.
 http://thread.gmane.org/gmane.comp.emulators.qemu/284105/focus=284765
 http://thread.gmane.org/gmane.comp.emulators.qemu/186159/focus=187351

Those are follow-ups that try to fix the bugs introduced by the autoneg
hack.
 which one tries to fix this problem, or all of them?

As you can see, this kind of hack may not be as good as we expect,
since we don't know exactly how e1000 works. The register function
descriptions in Intel's manual alone may not be sufficient. If you
search for e1000 in the archives, you will find that some e1000
registers do not behave the way the spec says. Using virtio-net
instead of e1000 in the guest is strongly suggested.

We support both. virtio-net is the recommended option, but for some 
guests (e.g., Windows Server 2000) virtio-net is not supported, so e1000 
is the last option.
 

 That was probably caused by something wrong in e1000 emulation which
 causes interrupt to be injected into windows guest before its interrupt
 handler is registered. And Windows guest does not have a mechanism to
 detect and disable irq in such condition.

 Sorry, I don't understand,
 I think one interrupt should not been enabled before its handler is 
 successfully registered, 
 is it possible that e1000 emulation inject the interrupt before the 
 interrupt is succesfully enabled?

 Thanks,
 Zhang Haoyu
  
 e1000 emulation is far from stable and complete (e.g run e1000 ethtool
 selftest in linux guest may see lots of errors). It's complicate and
 subtle (even has undocumented registers and behaviour). You should
 better consider to use virtio which are more stable and fast in a kvm
 guest (unless some intel guys are involved to improve e1000 emulation).

 Thanks
 Thanks,
 Zhang Haoyu



Re: [PATCH] KVM: x86: fix xen guest panic due to lack of KVM_REQ_EVENT

2014-08-25 Thread Paolo Bonzini
Il 25/08/2014 09:58, Wanpeng Li ha scritto:
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index c10408e..b7c0073 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4928,6 +4928,8 @@ static void toggle_interruptibility(struct kvm_vcpu 
 *vcpu, u32 mask)
   if (!mask)
   kvm_make_request(KVM_REQ_EVENT, vcpu);
   }
 + if (!(int_shadow || mask))
 + kvm_make_request(KVM_REQ_EVENT, vcpu);
  }
  
  static void inject_emulated_exception(struct kvm_vcpu *vcpu)

No, this patch undoes the optimization in the buggy patch.

A KVM_REQ_EVENT must be missing somewhere else.

Paolo


Re: [PATCH] KVM: x86: fix xen guest panic due to lack of KVM_REQ_EVENT

2014-08-25 Thread Wanpeng Li
Hi Paolo,
On Mon, Aug 25, 2014 at 11:01:07AM +0200, Paolo Bonzini wrote:
Il 25/08/2014 09:58, Wanpeng Li ha scritto:
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index c10408e..b7c0073 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4928,6 +4928,8 @@ static void toggle_interruptibility(struct kvm_vcpu 
 *vcpu, u32 mask)
  if (!mask)
  kvm_make_request(KVM_REQ_EVENT, vcpu);
  }
 +if (!(int_shadow || mask))
 +kvm_make_request(KVM_REQ_EVENT, vcpu);
  }
  
  static void inject_emulated_exception(struct kvm_vcpu *vcpu)

No, this patch undoes the optimization in the buggy patch.

A KVM_REQ_EVENT must be missing somewhere else.


Could you give me some tips so that I can figure it out?

Regards,
Wanpeng Li 

Paolo


Re: [PATCH] KVM: x86: fix xen guest panic due to lack of KVM_REQ_EVENT

2014-08-25 Thread Paolo Bonzini
Il 25/08/2014 11:08, Wanpeng Li ha scritto:
 Hi Paolo,
 On Mon, Aug 25, 2014 at 11:01:07AM +0200, Paolo Bonzini wrote:
 Il 25/08/2014 09:58, Wanpeng Li ha scritto:
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index c10408e..b7c0073 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4928,6 +4928,8 @@ static void toggle_interruptibility(struct kvm_vcpu 
 *vcpu, u32 mask)
 if (!mask)
 kvm_make_request(KVM_REQ_EVENT, vcpu);
 }
 +   if (!(int_shadow || mask))
 +   kvm_make_request(KVM_REQ_EVENT, vcpu);
  }
  
  static void inject_emulated_exception(struct kvm_vcpu *vcpu)

 No, this patch undoes the optimization in the buggy patch.

 A KVM_REQ_EVENT must be missing somewhere else.

 
 Could you give some tips in order that I can figure it out?

I have no idea right now (I was planning to debug it this week).

(BTW, look at the original commit that introduced KVM_REQ_EVENT --
https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=3842d135 -- and
compare the patch and the commit message.  You can see that it was added
to the emulator because it is a place that can set EFLAGS and this
idea is preserved in the buggy patch).

The important thing is that (despite Xen being involved) this is not
related to nested virtualization.  So I would first of all try to see if
some module parameter makes it go away (apicv and unrestricted mode
especially), then capture a trace of the panic.  At least this is how I
was planning to start... :)

Paolo


Re: [PATCH] KVM: x86: fix xen guest panic due to lack of KVM_REQ_EVENT

2014-08-25 Thread Wanpeng Li
Hi Paolo,
On Mon, Aug 25, 2014 at 11:16:16AM +0200, Paolo Bonzini wrote:
Il 25/08/2014 11:08, Wanpeng Li ha scritto:
 Hi Paolo,
 On Mon, Aug 25, 2014 at 11:01:07AM +0200, Paolo Bonzini wrote:
 Il 25/08/2014 09:58, Wanpeng Li ha scritto:
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index c10408e..b7c0073 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4928,6 +4928,8 @@ static void toggle_interruptibility(struct kvm_vcpu 
 *vcpu, u32 mask)
if (!mask)
kvm_make_request(KVM_REQ_EVENT, vcpu);
}
 +  if (!(int_shadow || mask))
 +  kvm_make_request(KVM_REQ_EVENT, vcpu);
  }
  
  static void inject_emulated_exception(struct kvm_vcpu *vcpu)

 No, this patch undoes the optimization in the buggy patch.

 A KVM_REQ_EVENT must be missing somewhere else.

 
 Could you give some tips in order that I can figure it out?

I have no idea right now (I was planning to debug it this week).

(BTW, look at the original commit that introduced KVM_REQ_EVENT --
https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=3842d135 -- and
compare the patch and the commit message.  You can see that it was added
to the emulator because it is a place that can set EFLAGS and this
idea is preserved in the buggy patch).

The important thing is that (despite Xen being involved) this is not
related to nested virtualization.  So I would first of all try to see if
some module parameter makes it go away (apicv and unrestricted mode

This bug can be reproduced w/o apicv.

especially), then capture a trace of the panic.  At least this is how I
was planning to start... :)

Great, I will also continue to debug it. ;-)

Regards,
Wanpeng Li 


Paolo


Re: [libvirt] Mentors wanted for Outreach Program for Women October 2014

2014-08-25 Thread Martin Kletzander

On Thu, Aug 21, 2014 at 09:06:39PM +0100, Stefan Hajnoczi wrote:

Dear mentors and core contributors,
Outreach Program for Women is starting the next round in October 2014.
OPW funds women to work on open source software for 12 weeks with the
help of mentors:
https://wiki.gnome.org/OutreachProgramForWomen/

We just completed a successful round of OPW and Google Summer of Code.
Other organizations have also been participating successfully in OPW
like the Linux kernel with Greg KH and other mentors.

Would you like to mentor in OPW October 2014?



I could use some of my time to help others participate in the
community.


Regular code contributors to QEMU, KVM, and libvirt are eligible to
participate as mentors.

We also need project ideas that are achievable in 12 weeks by someone
skilled in programming but not necessarily familiar with open source
or our codebase.  Ideas welcome!



It's just a matter of ideas.  Maybe we could revisit some of those we
had for GSoC.  If I'm reading it right, the deadline for project ideas is
October 22nd, so I think we'll definitely come up with something.

As a first pitch, this might be rewriting the storage driver to handle
jobs (our failed GSoC project from this year), and if the admin API gets
added, there will be many APIs and ideas for how to expand it.

Martin


Stefan





[PATCH 1/1] x86:kvm: fix one typo in comment

2014-08-25 Thread Tiejun Chen
s/drity/dirty

Signed-off-by: Tiejun Chen <tiejun.c...@intel.com>
---
 arch/x86/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9314678..09b9f05 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1180,7 +1180,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 
*sptep)
  * Write-protect on the specified @sptep, @pt_protect indicates whether
  * spte write-protection is caused by protecting shadow page table.
  *
- * Note: write protection is difference between drity logging and spte
+ * Note: write protection is difference between dirty logging and spte
  * protection:
  * - for dirty logging, the spte can be set to writable at anytime if
  *   its dirty bitmap is properly set.
-- 
1.9.1



[PATCH v2 0/2] irqfd support for ARM

2014-08-25 Thread Eric Auger
This patch series enables irqfd on ARM.

The irqfd framework makes it possible to inject a virtual IRQ into a guest
upon an eventfd trigger. Userspace uses the KVM_IRQFD VM ioctl to provide
KVM with a kvm_irqfd struct that associates a VM, an eventfd, and an IRQ
number (aka the gsi). When an actor signals the eventfd (typically a VFIO
platform driver), the kvm irqfd subsystem injects the provided virtual
IRQ into the guest.

Resamplefd is also supported for level-sensitive interrupts, i.e. the
user can provide another eventfd that is triggered when the completion
of the virtual IRQ (gsi) is detected by the GIC.

The gsi must correspond to a shared peripheral interrupt (SPI), ie the
GIC interrupt ID is gsi + 32. It is still under discussion whether PPI
injection support is needed.
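
As an illustration of the gsi numbering described above, here is a minimal
sketch of the mapping. The helper names gsi_to_gic_id() and
gsi_is_valid_spi() are hypothetical and not part of the series. The upper
bound is the last SPI ID defined by the GIC architecture (1019); the
in-kernel vgic may support far fewer interrupts than that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GIC_SPI_BASE 32u    /* IDs 0-15 are SGIs, 16-31 are PPIs */
#define GIC_SPI_LAST 1019u  /* last SPI ID in the GIC architecture */

/* True if the gsi maps to an architecturally valid SPI. */
static bool gsi_is_valid_spi(uint32_t gsi)
{
	return gsi <= GIC_SPI_LAST - GIC_SPI_BASE;
}

/* GIC interrupt ID programmed for a given irqfd gsi: gsi + 32. */
static uint32_t gsi_to_gic_id(uint32_t gsi)
{
	assert(gsi_is_valid_spi(gsi));
	return gsi + GIC_SPI_BASE;
}
```

So a gsi of 0 designates the first SPI (GIC ID 32), and PPIs or SGIs cannot
be expressed with this numbering at all.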

This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

No IRQ routing table is used.

Two patches are included:
- the first one simply removes the inclusion of irq.h. After Paul
  Mackerras' eventfd rework, I think it is no longer needed
- the second patch brings the irqfd integration for ARM, without
  routing

This patch series deprecates the integration with GSI routing
(https://patches.linaro.org/32261/)

The code can be found at git://git.linaro.org/people/eric.auger/linux.git
on branch irqfd_integ_v5

This work was tested with Calxeda Midway xgmac main interrupt with
qemu-system-arm and QEMU VFIO platform device.

v1 -> v2:
- rebase on 3.17rc1
- move of the dist unlock in process_maintenance
- remove of dist lock in __kvm_vgic_sync_hwstate
- remove irq.h

Eric Auger (2):
  KVM: EVENTFD: remove inclusion of irq.h
  KVM: ARM: add irqfd support

 Documentation/virtual/kvm/api.txt |  5 +++-
 arch/arm/include/uapi/asm/kvm.h   |  3 +++
 arch/arm/kvm/Kconfig  |  3 ++-
 arch/arm/kvm/Makefile |  2 +-
 virt/kvm/arm/vgic.c   | 56 ---
 virt/kvm/eventfd.c|  1 -
 6 files changed, 62 insertions(+), 8 deletions(-)

-- 
1.9.1



[PATCH v2 2/2] KVM: ARM: add irqfd support

2014-08-25 Thread Eric Auger
This patch enables irqfd on ARM.

The irqfd framework makes it possible to inject a virtual IRQ into a guest
upon an eventfd trigger. Userspace uses the KVM_IRQFD VM ioctl to provide
KVM with a kvm_irqfd struct that associates a VM, an eventfd, and an IRQ
number (aka the gsi). When an actor signals the eventfd (typically a VFIO
platform driver), the kvm irqfd subsystem injects the provided virtual
IRQ into the guest.

Resamplefd is also supported for level-sensitive interrupts, i.e. the
user can provide another eventfd that is triggered when the completion
of the virtual IRQ (gsi) is detected by the GIC.

The gsi must correspond to a shared peripheral interrupt (SPI), i.e. the
GIC interrupt ID is gsi+32.

This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD.

No IRQ routing table is used.

Signed-off-by: Eric Auger <eric.au...@linaro.org>

---

This patch deprecates the previous patch featuring GSI routing
(https://patches.linaro.org/32261/)

irqchip.c and irq_comm.c are not used at all.

This RFC applies on top of Christoffer Dall's series
"arm/arm64: KVM: Various VGIC cleanups and improvements"
https://lists.cs.columbia.edu/pipermail/kvmarm/2014-June/009979.html

All pieces can be found on git://git.linaro.org/people/eric.auger/linux.git
branch irqfd_integ_v5

This work was tested with Calxeda Midway xgmac main interrupt with
qemu-system-arm and QEMU VFIO platform device.

v1 -> v2:
- rebase on 3.17rc1
- move of the dist unlock in process_maintenance
- remove of dist lock in __kvm_vgic_sync_hwstate
- rewording of the commit message (add resamplefd reference)
- remove irq.h
---
 Documentation/virtual/kvm/api.txt |  5 +++-
 arch/arm/include/uapi/asm/kvm.h   |  3 +++
 arch/arm/kvm/Kconfig  |  3 ++-
 arch/arm/kvm/Makefile |  2 +-
 virt/kvm/arm/vgic.c   | 56 ---
 5 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index beae3fd..8118b12 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2204,7 +2204,7 @@ into the hash PTE second double word).
 4.75 KVM_IRQFD
 
 Capability: KVM_CAP_IRQFD
-Architectures: x86 s390
+Architectures: x86 s390 arm
 Type: vm ioctl
 Parameters: struct kvm_irqfd (in)
 Returns: 0 on success, -1 on error
@@ -2230,6 +2230,9 @@ Note that closing the resamplefd is not sufficient to 
disable the
 irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
 and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
 
+On ARM/arm64 the injected must be a shared peripheral interrupt (SPI).
+This means the programmed GIC interrupt ID is gsi+32.
+
 4.76 KVM_PPC_ALLOCATE_HTAB
 
 Capability: KVM_CAP_PPC_ALLOC_HTAB
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index e6ebdd3..3034c66 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -194,6 +194,9 @@ struct kvm_arch_memory_slot {
 /* Highest supported SPI, from VGIC_NR_IRQS */
 #define KVM_ARM_IRQ_GIC_MAX		127
 
+/* One single KVM irqchip, ie. the VGIC */
+#define KVM_NR_IRQCHIPS  1
+
 /* PSCI interface */
 #define KVM_PSCI_FN_BASE   0x95c1ba5e
 #define KVM_PSCI_FN(n) (KVM_PSCI_FN_BASE + (n))
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 466bd29..82ccd81 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -24,6 +24,7 @@ config KVM
 	select KVM_MMIO
 	select KVM_ARM_HOST
 	depends on ARM_VIRT_EXT && ARM_LPAE
+	select HAVE_KVM_EVENTFD
 	---help---
 	  Support hosting virtualized guest machines. You will also
 	  need to select one or more of the processor modules below.
@@ -55,7 +56,7 @@ config KVM_ARM_MAX_VCPUS
 config KVM_ARM_VGIC
 	bool "KVM support for Virtual GIC"
 	depends on KVM_ARM_HOST && OF
-	select HAVE_KVM_IRQCHIP
+	select HAVE_KVM_IRQFD
 	default y
 	---help---
 	  Adds support for a hardware assisted, in-kernel GIC emulation.
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index f7057ed..859db09 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
 AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)
 
 KVM := ../../../virt/kvm
-kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o
+kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o
 
 obj-y += kvm-arm.o init.o interrupts.o
 obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 7164d2e..586bd11 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1334,7 +1334,10 @@ epilog:
 static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 {
 	u32 status = vgic_get_interrupt_status(vcpu);
+	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	bool level_pending = false;
+   struct kvm *kvm = 

[PATCH v2 1/2] KVM: EVENTFD: remove inclusion of irq.h

2014-08-25 Thread Eric Auger
No longer needed. Also, irq.h is not used on ARM.

Signed-off-by: Eric Auger <eric.au...@linaro.org>
---
 virt/kvm/eventfd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 3c5981c..0c712a7 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -36,7 +36,6 @@
 #include <linux/seqlock.h>
 #include <trace/events/kvm.h>
 
-#include "irq.h"
 #include "iodev.h"
 
 #ifdef CONFIG_HAVE_KVM_IRQFD
-- 
1.9.1



[GIT PULL 2/2] KVM: s390/mm: try a cow on read only pages for key ops

2014-08-25 Thread Christian Borntraeger
The PFMF instruction handler blindly wrote the storage key even if
the page was mapped R/O in the host. Let's try a COW before continuing
and bail out in case of errors.

Signed-off-by: Christian Borntraeger <borntrae...@de.ibm.com>
Reviewed-by: Dominik Dingel <din...@linux.vnet.ibm.com>
Cc: sta...@vger.kernel.org
---
 arch/s390/mm/pgtable.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 19daa53..5404a62 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -986,11 +986,21 @@ int set_guest_storage_key(struct mm_struct *mm, unsigned long addr,
 	pte_t *ptep;
 
 	down_read(&mm->mmap_sem);
+retry:
 	ptep = get_locked_pte(current->mm, addr, &ptl);
 	if (unlikely(!ptep)) {
 		up_read(&mm->mmap_sem);
 		return -EFAULT;
 	}
+	if (!(pte_val(*ptep) & _PAGE_INVALID) &&
+	     (pte_val(*ptep) & _PAGE_PROTECT)) {
+		pte_unmap_unlock(*ptep, ptl);
+		if (fixup_user_fault(current, mm, addr, FAULT_FLAG_WRITE)) {
+			up_read(&mm->mmap_sem);
+			return -EFAULT;
+		}
+		goto retry;
+	}
 
 	new = old = pgste_get_lock(ptep);
 	pgste_val(new) &= ~(PGSTE_GR_BIT | PGSTE_GC_BIT |
-- 
1.8.4.2



[GIT PULL 0/2] KVM: s390: Fixes for 3.17

2014-08-25 Thread Christian Borntraeger
Paolo,

the following changes since commit 7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9:

  Linux 3.17-rc1 (2014-08-16 10:40:26 -0600)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git  
tags/kvm-s390-20140825

for you to fetch changes up to ab3f285f227fec62868037e9b1b1fd18294a83b8:

  KVM: s390/mm: try a cow on read only pages for key ops (2014-08-25 14:35:28 +0200)


Here are two fixes for s390 KVM code that prevent a malicious user from:
1. triggering a kernel BUG
2. changing the storage key of read-only pages


Christian Borntraeger (2):
  KVM: s390: Fix user triggerable bug in dead code
  KVM: s390/mm: try a cow on read only pages for key ops

 arch/s390/kvm/kvm-s390.c | 13 -
 arch/s390/mm/pgtable.c   | 10 ++
 2 files changed, 10 insertions(+), 13 deletions(-)



[GIT PULL 1/2] KVM: s390: Fix user triggerable bug in dead code

2014-08-25 Thread Christian Borntraeger
In the early days, we had some special handling for the
KVM_EXIT_S390_SIEIC exit, but this was gone in 2009 with commit
d7b0b5eb3000 (KVM: s390: Make psw available on all exits, not
just a subset).

Now this switch statement is just a sanity check for userspace
not messing with the kvm_run structure. Unfortunately, this
allows userspace to trigger a kernel BUG. Let's just remove
this switch statement.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
Reviewed-by: Cornelia Huck cornelia.h...@de.ibm.com
Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Cc: sta...@vger.kernel.org
---
 arch/s390/kvm/kvm-s390.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ce81eb2..81b0e11 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1317,19 +1317,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		return -EINVAL;
 	}
 
-	switch (kvm_run->exit_reason) {
-	case KVM_EXIT_S390_SIEIC:
-	case KVM_EXIT_UNKNOWN:
-	case KVM_EXIT_INTR:
-	case KVM_EXIT_S390_RESET:
-	case KVM_EXIT_S390_UCONTROL:
-	case KVM_EXIT_S390_TSCH:
-	case KVM_EXIT_DEBUG:
-		break;
-	default:
-		BUG();
-	}
-
 	vcpu->arch.sie_block->gpsw.mask = kvm_run->psw_mask;
 	vcpu->arch.sie_block->gpsw.addr = kvm_run->psw_addr;
 	if (kvm_run->kvm_dirty_regs & KVM_SYNC_PREFIX) {
-- 
1.8.4.2



[RFC 9/9] KVM: KVM_VFIO: ARM: implement irq forwarding control

2014-08-25 Thread Eric Auger
Implement the ARM-specific KVM-VFIO device group commands:
- KVM_DEV_VFIO_DEVICE_ASSIGN_IRQ
- KVM_DEV_VFIO_DEVICE_DEASSIGN_IRQ
The capability can be queried using KVM_HAS_DEVICE_ATTR.

The new commands turn IRQ forwarding on or off for a given
IRQ index of a VFIO platform device.

As soon as a forwarded IRQ is set, the kvm-vfio device takes a
reference to the VFIO device.

The kvm-vfio device stores the list of assigned devices
(kvm_vfio_device) in kvm_vfio_arch_data. Each kvm_vfio_device
stores its list of assigned IRQs, so that only a subset of the
device's IRQs may be forwarded.

The kvm-vfio device programs both the GIC and the vGIC. It also
clears the active bit on destruction, in case the guest did not
do so itself.

Changing the forwarded state is not allowed in the critical
section that runs from the VFIO IRQ handler to LR programming. It is
up to the client to take care of this.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 arch/arm/include/asm/kvm_host.h |   2 +
 arch/arm/kvm/Makefile   |   2 +-
 arch/arm/kvm/kvm_vfio_arm.c | 599 
 3 files changed, 602 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/kvm/kvm_vfio_arm.c

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4f1edbf..5c300f6 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -25,6 +25,8 @@
 #include <asm/fpstate.h>
 #include <kvm/arm_arch_timer.h>
 
+#define __KVM_HAVE_ARCH_KVM_VFIO
+
 #if defined(CONFIG_KVM_ARM_MAX_VCPUS)
 #define KVM_MAX_VCPUS CONFIG_KVM_ARM_MAX_VCPUS
 #else
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index ea1fa76..26a5a42 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -19,7 +19,7 @@ kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vf
 
 obj-y += kvm-arm.o init.o interrupts.o
 obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
-obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o
+obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o kvm_vfio_arm.o
 obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic.o
 obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
 obj-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
diff --git a/arch/arm/kvm/kvm_vfio_arm.c b/arch/arm/kvm/kvm_vfio_arm.c
new file mode 100644
index 0000000..6619e0b
--- /dev/null
+++ b/arch/arm/kvm/kvm_vfio_arm.c
@@ -0,0 +1,599 @@
+/*
+ * Copyright (C) 2014 Linaro Ltd.
+ * Authors: Eric Auger eric.au...@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/errno.h>
+#include <linux/file.h>
+#include <linux/kvm_host.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/vfio.h>
+#include <linux/irq.h>
+#include <asm/kvm_host.h>
+#include <asm/kvm.h>
+#include <linux/irq.h>
+#include <linux/platform_device.h>
+#include <linux/interrupt.h>
+
+struct vfio_device;
+
+enum kvm_irq_fwd_action {
+   KVM_VFIO_IRQ_SET_FORWARD,
+   KVM_VFIO_IRQ_SET_NORMAL,
+   KVM_VFIO_IRQ_CLEANUP,
+};
+
+/* internal structure describing a forwarded IRQ */
+struct __kvm_arch_fwd_irq {
+   struct list_head link;
+   __u32 irq_index; /* platform device irq index */
+   __u32 hwirq; /*physical IRQ */
+   __u32 guest_irq; /* virtual IRQ */
+   struct kvm_vcpu *vcpu; /* vcpu to inject into*/
+};
+
+struct kvm_vfio_device {
+   struct list_head node;
+   struct vfio_device *vfio_device;
+   /* list of forwarded IRQs for that VFIO device */
+   struct list_head fwd_irq_list;
+   int fd;
+};
+
+struct kvm_vfio_arch_data {
+   /* list of kvm_vfio_devices for which some IRQs are forwarded*/
+   struct list_head assigned_device_list;
+};
+
+/**
+ * set_fwd_state - change the forwarded state of an IRQ
+ * @pfwd: the forwarded irq struct
+ * @action: action to perform (set forward, set back normal, cleanup)
+ *
+ * programs the GIC and VGIC
+ * returns the VGIC map/unmap return status
+ * It is the responsibility of the caller to make sure the physical IRQ
+ * is not active. There is a critical section between the start of the
+ * VFIO IRQ handler and LR programming.
+ */
+int set_fwd_state(struct __kvm_arch_fwd_irq *pfwd,
+ enum kvm_irq_fwd_action action)
+{
+   int ret;
+   struct irq_desc *desc = irq_to_desc(pfwd->hwirq);
+   struct irq_data *d = &desc->irq_data;
+   struct irq_chip *chip = desc->irq_data.chip;
+
+   disable_irq(pfwd->hwirq);
+   /* no fwd state change can happen if the IRQ is in progress */
+   if (irqd_irq_inprogress(d)) {
+   kvm_err(%s 

[RFC 3/9] VFIO: platform: handler tests whether the IRQ is forwarded

2014-08-25 Thread Eric Auger
If the IRQ is forwarded, the VFIO platform IRQ handler no longer
needs to disable it. In that mode, when the handler completes,
the IRQ is not deactivated; only its priority is lowered.

Some other actor (typically a guest) is expected to deactivate the IRQ,
which then allows a new physical IRQ to hit.

In the virtualization use case, the physical IRQ is automatically completed
by the interrupt controller when the guest completes the corresponding
virtual IRQ.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 drivers/vfio/platform/vfio_platform_irq.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 6768508..1f851b2 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -88,13 +88,18 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
 	struct vfio_platform_irq *irq_ctx = dev_id;
 	unsigned long flags;
 	int ret = IRQ_NONE;
+	struct irq_data *d;
+	bool is_forwarded;
 
 	spin_lock_irqsave(&irq_ctx->lock, flags);
 
 	if (!irq_ctx->masked) {
 		ret = IRQ_HANDLED;
+		d = irq_get_irq_data(irq_ctx->hwirq);
+		is_forwarded = irqd_irq_forwarded(d);
 
-		if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED) {
+		if (irq_ctx->flags & VFIO_IRQ_INFO_AUTOMASKED &&
+		    !is_forwarded) {
 			disable_irq_nosync(irq_ctx->hwirq);
 			irq_ctx->masked = true;
 		}
-- 
1.9.1
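
The masking decision described above can be sketched as a small userspace model: an automasked IRQ is disabled by the handler unless it is forwarded, in which case it is left enabled and the guest is expected to deactivate it. All `toy_`/`TOY_` names are hypothetical, and the flag value is an arbitrary stand-in for VFIO_IRQ_INFO_AUTOMASKED.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_IRQ_AUTOMASKED 0x1u /* stand-in for VFIO_IRQ_INFO_AUTOMASKED */

struct toy_irq_ctx {
	unsigned int flags;
	bool masked;
	bool forwarded;     /* models irqd_irq_forwarded() */
	int disable_calls;  /* counts modelled disable_irq_nosync() calls */
};

/* Returns true for "handled" (IRQ_HANDLED), false for IRQ_NONE. */
bool toy_irq_handler(struct toy_irq_ctx *ctx)
{
	assert(ctx);
	if (ctx->masked)
		return false;

	/* Only mask in the handler when nobody else will deactivate the IRQ. */
	if ((ctx->flags & TOY_IRQ_AUTOMASKED) && !ctx->forwarded) {
		ctx->disable_calls++;
		ctx->masked = true;
	}
	return true;
}
```

In this model a forwarded IRQ can be handled repeatedly without the handler ever masking it, while a non-forwarded automasked IRQ is handled once and then stays masked until something unmasks it.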



[RFC 2/9] KVM: ARM: VGIC: add forwarded irq rbtree lock

2014-08-25 Thread Eric Auger
Add a lock that protects rb tree manipulation. The rb tree can be
searched in one thread (the irqfd handler, for instance) while a
map/unmap happens in another.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 virt/kvm/arm/vgic.c | 46 +-
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 195c10c..3311e0a 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1628,9 +1628,15 @@ static struct rb_root *vgic_get_irq_phys_map(struct kvm_vcpu *vcpu,
 
 int vgic_map_phys_irq(struct kvm_vcpu *vcpu, int virt_irq, int phys_irq)
 {
-   struct rb_root *root = vgic_get_irq_phys_map(vcpu, virt_irq);
-   struct rb_node **new = &root->rb_node, *parent = NULL;
+   struct rb_root *root;
+   struct rb_node **new, *parent = NULL;
struct irq_phys_map *new_map;
+   struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+   spin_lock(&dist->rb_tree_lock);
+
+   root = vgic_get_irq_phys_map(vcpu, virt_irq);
+   new = &root->rb_node;
 
/* Boilerplate rb_tree code */
while (*new) {
@@ -1642,13 +1648,17 @@ int vgic_map_phys_irq(struct kvm_vcpu *vcpu, int virt_irq, int phys_irq)
new = &(*new)->rb_left;
else if (this-virt_irq  virt_irq)
new = &(*new)->rb_right;
-   else
+   else {
+   spin_unlock(&dist->rb_tree_lock);
return -EEXIST;
+   }
}
 
new_map = kzalloc(sizeof(*new_map), GFP_KERNEL);
-   if (!new_map)
+   if (!new_map) {
+   spin_unlock(&dist->rb_tree_lock);
return -ENOMEM;
+   }
 
new_map->virt_irq = virt_irq;
new_map->phys_irq = phys_irq;
@@ -1656,6 +1666,8 @@ int vgic_map_phys_irq(struct kvm_vcpu *vcpu, int virt_irq, int phys_irq)
rb_link_node(&new_map->node, parent, new);
rb_insert_color(&new_map->node, root);
 
+   spin_unlock(&dist->rb_tree_lock);
+
return 0;
 }
 
@@ -1683,24 +1695,39 @@ static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
 
 int vgic_get_phys_irq(struct kvm_vcpu *vcpu, int virt_irq)
 {
-   struct irq_phys_map *map = vgic_irq_map_search(vcpu, virt_irq);
+   struct irq_phys_map *map;
+   struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+   int ret;
+
+   spin_lock(&dist->rb_tree_lock);
+   map = vgic_irq_map_search(vcpu, virt_irq);
 
if (map)
-   return map->phys_irq;
+   ret = map->phys_irq;
+   else
+   ret =  -ENOENT;
+
+   spin_unlock(&dist->rb_tree_lock);
+   return ret;
 
-   return -ENOENT;
 }
 
 int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, int virt_irq, int phys_irq)
 {
-   struct irq_phys_map *map = vgic_irq_map_search(vcpu, virt_irq);
+   struct irq_phys_map *map;
+   struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+
+   spin_lock(&dist->rb_tree_lock);
+
+   map = vgic_irq_map_search(vcpu, virt_irq);
 
if (map && map->phys_irq == phys_irq) {
rb_erase(&map->node, vgic_get_irq_phys_map(vcpu, virt_irq));
kfree(map);
+   spin_unlock(&dist->rb_tree_lock);
return 0;
}
-
+   spin_unlock(&dist->rb_tree_lock);
return -ENOENT;
 }
 
@@ -1896,6 +1923,7 @@ int kvm_vgic_create(struct kvm *kvm)
}
 
spin_lock_init(&kvm->arch.vgic.lock);
+   spin_lock_init(&kvm->arch.vgic.rb_tree_lock);
kvm->arch.vgic.in_kernel = true;
kvm->arch.vgic.vctrl_base = vgic->vctrl_base;
kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
-- 
1.9.1
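
The locking scheme in this patch (one lock around every lookup, insert and removal in the virtual-to-physical IRQ map) can be sketched in userspace as follows. This is only an illustration under stated assumptions: a singly linked list replaces the rb tree, a C11 `atomic_flag` replaces the kernel spinlock, and every `toy_` name is hypothetical. The sketch allocates the new entry before taking the lock, because in the kernel a GFP_KERNEL allocation may sleep and sleeping is not allowed while a spinlock is held; this is one possible arrangement, not what the patch itself does.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

struct toy_map_entry {
	int virt_irq;
	int phys_irq;
	struct toy_map_entry *next;
};

struct toy_irq_map {
	atomic_flag lock;             /* stands in for rb_tree_lock */
	struct toy_map_entry *head;   /* a list here, not an rb tree */
};

void toy_map_init(struct toy_irq_map *m)
{
	assert(m);
	atomic_flag_clear(&m->lock);
	m->head = NULL;
}

static void toy_lock(struct toy_irq_map *m)
{
	while (atomic_flag_test_and_set_explicit(&m->lock, memory_order_acquire))
		;                     /* spin until the flag is free */
}

static void toy_unlock(struct toy_irq_map *m)
{
	atomic_flag_clear_explicit(&m->lock, memory_order_release);
}

int toy_map_add(struct toy_irq_map *m, int virt_irq, int phys_irq)
{
	struct toy_map_entry *e, *fresh;
	int ret = 0;

	fresh = malloc(sizeof(*fresh));   /* allocate before locking */
	if (!fresh)
		return -ENOMEM;
	fresh->virt_irq = virt_irq;
	fresh->phys_irq = phys_irq;

	toy_lock(m);
	for (e = m->head; e; e = e->next) {
		if (e->virt_irq == virt_irq) {
			ret = -EEXIST;
			goto out;
		}
	}
	fresh->next = m->head;
	m->head = fresh;
	fresh = NULL;                     /* now owned by the map */
out:
	toy_unlock(m);                    /* single unlock on every path */
	free(fresh);                      /* frees only an unused entry */
	return ret;
}

int toy_map_get(struct toy_irq_map *m, int virt_irq)
{
	struct toy_map_entry *e;
	int ret = -ENOENT;

	toy_lock(m);
	for (e = m->head; e; e = e->next) {
		if (e->virt_irq == virt_irq) {
			ret = e->phys_irq;
			break;
		}
	}
	toy_unlock(m);
	return ret;
}

int toy_map_del(struct toy_irq_map *m, int virt_irq, int phys_irq)
{
	struct toy_map_entry **pp, *victim = NULL;
	int ret;

	toy_lock(m);
	for (pp = &m->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->virt_irq == virt_irq && (*pp)->phys_irq == phys_irq) {
			victim = *pp;
			*pp = victim->next;
			break;
		}
	}
	toy_unlock(m);
	ret = victim ? 0 : -ENOENT;
	free(victim);
	return ret;
}
```

Funnelling each function through a single unlock site also avoids the repeated unlock-then-return pairs that are easy to miss when a new error path is added.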



[RFC 0/9] KVM-VFIO IRQ forward control

2014-08-25 Thread Eric Auger
This RFC proposes an integration of "ARM: Forwarding physical
interrupts to a guest VM" (http://lwn.net/Articles/603514/) in
KVM.

It makes it possible to turn a VFIO platform driver IRQ into a forwarded
IRQ. The direct benefit is that, for a level-sensitive IRQ, a VM
switch can be avoided on guest virtual IRQ completion. Before this
patch, a maintenance IRQ was triggered on virtual IRQ completion.

When the IRQ is forwarded, the VFIO platform driver no longer needs to
disable the IRQ. Indeed, when returning from the IRQ handler
the IRQ is not deactivated; only its priority is lowered. This means
the same IRQ cannot hit again before the guest completes the virtual IRQ
and the GIC automatically deactivates the corresponding physical IRQ.

Injection is still based on irqfd triggering. The only
impact on the irqfd process is that resamplefd is no longer triggered on
virtual IRQ completion, since the latter becomes transparent.

The current integration is based on an extension of the KVM-VFIO
device, previously used by KVM to interact with VFIO groups. The
patch series now enables KVM to interact directly with a VFIO
platform device. The VFIO external API was extended for that purpose.

The KVM-VFIO device can get/put the VFIO platform device, check its
integrity and type, and get the IRQ number associated with an IRQ index.

The KVM-VFIO device is extended with an architecture-specific implementation.
IRQ forward control is implemented in the ARM-specific part.

From a user point of view, the functionality is provided through new
KVM-VFIO device commands, KVM_DEV_VFIO_DEVICE_(DE)ASSIGN_IRQ,
and the capability can be checked with KVM_HAS_DEVICE_ATTR.
Assignment can only be changed when the physical IRQ is not active.
It is the responsibility of the user to do this check.

This patch series has the following dependencies:
- "ARM: Forwarding physical interrupts to a guest VM"
  (http://lwn.net/Articles/603514/)
- [PATCH v2] irqfd for ARM
  which itself depends on
  - arm/arm64: KVM: Various VGIC cleanups and improvements
    http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/263685.html
- and obviously the VFIO platform driver series:
  [RFC PATCH v6 00/20] VFIO support for platform devices on ARM
  https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
Integrated pieces can be found at
git://git.linaro.org/people/eric.auger/linux.git
on branch 3.17rc1_forward_integ_v0

This was tested on Calxeda Midway, assigning the xgmac main IRQ.


Eric Auger (9):
  KVM: ARM: VGIC: fix multiple injection of level sensitive forwarded
IRQ
  KVM: ARM: VGIC: add forwarded irq rbtree lock
  VFIO: platform: handler tests whether the IRQ is forwarded
  KVM: KVM-VFIO: update user API to program forwarded IRQ
  VFIO: Extend external user API
  KVM: KVM-VFIO: allow arch specific implementation
  KVM: KVM-VFIO: add new VFIO external API hooks
  KVM: KVM-VFIO: add kvm_vfio_arch_data and accessors
  KVM: KVM_VFIO: ARM: implement irq forwarding control

 Documentation/virtual/kvm/devices/vfio.txt |  25 ++
 arch/arm/include/asm/kvm_host.h|  16 +
 arch/arm/include/uapi/asm/kvm.h|   6 +
 arch/arm/kvm/Makefile  |   2 +-
 arch/arm/kvm/kvm_vfio_arm.c| 599 +
 drivers/vfio/platform/vfio_platform_irq.c  |   7 +-
 drivers/vfio/vfio.c|  35 ++
 include/kvm/arm_vgic.h |   1 +
 include/linux/kvm_host.h   |  30 ++
 include/linux/vfio.h   |   4 +
 include/uapi/linux/kvm.h   |   3 +
 virt/kvm/arm/vgic.c|  55 ++-
 virt/kvm/vfio.c|  92 +
 13 files changed, 862 insertions(+), 13 deletions(-)
 create mode 100644 arch/arm/kvm/kvm_vfio_arm.c

-- 
1.9.1



[RFC 8/9] KVM: KVM-VFIO: add kvm_vfio_arch_data and accessors

2014-08-25 Thread Eric Auger
Add a pointer to architecture-specific data to struct kvm_vfio, and
add accessors so that struct kvm_vfio can stay private.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 arch/arm/include/asm/kvm_host.h |  8 
 virt/kvm/vfio.c | 21 +
 2 files changed, 29 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 62cbf5b..4f1edbf 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -177,6 +177,14 @@ void kvm_vfio_device_put_external_user(struct vfio_device *vdev);
 int kvm_vfio_external_get_type(struct vfio_device *vdev);
 struct device *kvm_vfio_external_get_base_device(struct vfio_device *vdev);
 
+struct kvm_vfio;
+struct kvm_vfio_arch_data;
+void kvm_vfio_device_set_arch_data(struct kvm_vfio *kv,
+  struct kvm_vfio_arch_data *ptr);
+struct kvm_vfio_arch_data *kvm_vfio_device_get_arch_data(struct kvm_vfio *kv);
+void kvm_vfio_lock(struct kvm_vfio *kv);
+void kvm_vfio_unlock(struct kvm_vfio *kv);
+
 /* We do not have shadow page tables, hence the empty hooks */
 static inline int kvm_age_hva(struct kvm *kvm, unsigned long hva)
 {
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index f1c4e35..177b71e 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -28,6 +28,7 @@ struct kvm_vfio {
struct list_head group_list;
struct mutex lock;
bool noncoherent;
+   struct kvm_vfio_arch_data *arch_data;
 };
 
 static struct vfio_group *kvm_vfio_group_get_external_user(struct file *filep)
@@ -338,6 +339,26 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
return 0;
 }
 
+void kvm_vfio_device_set_arch_data(struct kvm_vfio *kv,
+  struct kvm_vfio_arch_data *ptr)
+{
+   kv->arch_data = ptr;
+}
+
+struct kvm_vfio_arch_data *kvm_vfio_device_get_arch_data(struct kvm_vfio *kv)
+{
+   return kv->arch_data;
+}
+
+void kvm_vfio_lock(struct kvm_vfio *kv)
+{
+   mutex_lock(&kv->lock);
+}
+
+void kvm_vfio_unlock(struct kvm_vfio *kv)
+{
+   mutex_unlock(&kv->lock);
+}
 
 struct kvm_device_ops kvm_vfio_ops = {
.name = "kvm-vfio",
-- 
1.9.1



[RFC 1/9] KVM: ARM: VGIC: fix multiple injection of level sensitive forwarded IRQ

2014-08-25 Thread Eric Auger
Fix multiple injection of level-sensitive forwarded IRQs.
With the current code, the second injection fails because the state bitmaps
are not reset (process_maintenance is not called anymore).
The new implementation fully bypasses the vgic state
management for forwarded IRQs (the checks are skipped in
vgic_update_irq_pending). This obviously assumes the forwarded IRQ is
injected from the kernel side.

---
  An attempt was made to reset the states in __kvm_vgic_sync_hwstate by checking
  for the emptied LR of the forwarded IRQ. Surprisingly, this solution does
  not seem to work: sometimes a new forwarded IRQ injection is observed
  while the LR of the previous instance has not yet been observed as empty.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 include/kvm/arm_vgic.h | 1 +
 virt/kvm/arm/vgic.c| 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 743020f..3da244f 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -177,6 +177,7 @@ struct vgic_dist {
unsigned long   irq_pending_on_cpu;
 
struct rb_root  irq_phys_map;
+   spinlock_t  rb_tree_lock;
 #endif
 };
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 0007300..195c10c 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1517,14 +1517,18 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
int edge_triggered, level_triggered;
int enabled;
bool ret = true;
+   bool is_forwarded;
 
spin_lock(&dist->lock);
 
vcpu = kvm_get_vcpu(kvm, cpuid);
+   is_forwarded = (vgic_get_phys_irq(vcpu, irq_num) 0);
+   
edge_triggered = vgic_irq_is_edge(vcpu, irq_num);
level_triggered = !edge_triggered;
 
-   if (!vgic_validate_injection(vcpu, irq_num, level)) {
+   if (!is_forwarded &&
+   !vgic_validate_injection(vcpu, irq_num, level)) {
ret = false;
goto out;
}
@@ -1557,7 +1561,8 @@ static bool vgic_update_irq_pending(struct kvm *kvm, int cpuid,
goto out;
}
 
-   if (level_triggered && vgic_irq_is_queued(vcpu, irq_num)) {
+   if (!is_forwarded &&
+   level_triggered && vgic_irq_is_queued(vcpu, irq_num)) {
/*
 * Level interrupt in progress, will be picked up
 * when EOId.
-- 
1.9.1
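
The failure described in the commit message can be illustrated with a deliberately simplified model. It assumes that a level-triggered injection is only accepted when the requested level differs from the recorded pending state, and that nothing resets this state for a forwarded IRQ because no maintenance interrupt occurs. The `toy_` names are hypothetical and the model ignores everything else the vgic tracks.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vgic_irq {
	bool forwarded;  /* true when a physical IRQ is mapped to this one */
	bool pending;    /* recorded state; never reset in this model */
	int injected;    /* number of accepted injections at level 1 */
};

/* Returns true when the injection is accepted. */
bool toy_inject_level(struct toy_vgic_irq *irq, bool level)
{
	assert(irq);
	/* Non-forwarded IRQs are validated: the level has to change. */
	if (!irq->forwarded && level == irq->pending)
		return false;

	irq->pending = level;
	if (level)
		irq->injected++;
	return true;
}
```

With `forwarded` clear, a second injection at level 1 is rejected because the recorded state is stale; with `forwarded` set, the check is skipped and both injections go through.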



[RFC 7/9] KVM: KVM-VFIO: add new VFIO external API hooks

2014-08-25 Thread Eric Auger
Add functions that implement the gateway to the extended
external VFIO API:
- kvm_vfio_device_get_external_user
- kvm_vfio_device_put_external_user
- kvm_vfio_external_get_type
- kvm_vfio_external_get_base_device

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 arch/arm/include/asm/kvm_host.h |  6 
 virt/kvm/vfio.c | 62 +
 2 files changed, 68 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 6dfb404..62cbf5b 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -171,6 +171,12 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
 unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
 int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
 
+struct vfio_device;
+struct vfio_device *kvm_vfio_device_get_external_user(struct file *filep);
+void kvm_vfio_device_put_external_user(struct vfio_device *vdev);
+int kvm_vfio_external_get_type(struct vfio_device *vdev);
+struct device *kvm_vfio_external_get_base_device(struct vfio_device *vdev);
+
 /* We do not have shadow page tables, hence the empty hooks */
 static inline int kvm_age_hva(struct kvm *kvm, unsigned long hva)
 {
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index 89d3b75..f1c4e35 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -59,6 +59,67 @@ static void kvm_vfio_group_put_external_user(struct vfio_group *vfio_group)
symbol_put(vfio_group_put_external_user);
 }
 
+struct vfio_device *kvm_vfio_device_get_external_user(struct file *filep)
+{
+   struct vfio_device *vdev;
+   struct vfio_device *(*fn)(struct file *);
+
+   fn = symbol_get(vfio_device_get_external_user);
+   if (!fn)
+   return ERR_PTR(-EINVAL);
+
+   vdev = fn(filep);
+
+   symbol_put(vfio_device_get_external_user);
+
+   return vdev;
+}
+
+void kvm_vfio_device_put_external_user(struct vfio_device *vdev)
+{
+   void (*fn)(struct vfio_device *);
+
+   fn = symbol_get(vfio_device_put_external_user);
+   if (!fn)
+   return;
+
+   fn(vdev);
+
+   symbol_put(vfio_device_put_external_user);
+}
+
+int kvm_vfio_external_get_type(struct vfio_device *vdev)
+{
+   int (*fn)(struct vfio_device *);
+   int ret;
+
+   fn = symbol_get(vfio_external_get_type);
+   if (!fn)
+   return -EINVAL;
+
+   ret = fn(vdev);
+
+   symbol_put(vfio_external_get_type);
+
+   return ret;
+}
+
+struct device *kvm_vfio_external_get_base_device(struct vfio_device *vdev)
+{
+   struct device *(*fn)(struct vfio_device *);
+   struct device *dev;
+
+   fn = symbol_get(vfio_external_get_base_device);
+   if (!fn)
+   return NULL;
+
+   dev = fn(vdev);
+
+   symbol_put(vfio_external_get_base_device);
+
+   return dev;
+}
+
 static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group)
 {
long (*fn)(struct vfio_group *, unsigned long);
@@ -277,6 +338,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
return 0;
 }
 
+
 struct kvm_device_ops kvm_vfio_ops = {
.name = "kvm-vfio",
.create = kvm_vfio_create,
-- 
1.9.1
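
Each gateway function in this patch has the same shape: resolve the VFIO symbol at run time with symbol_get(), bail out if it is not available, call it, and release it with symbol_put(). That keeps KVM free of a hard link-time dependency on the VFIO module. A rough userspace analogy, using a hypothetical in-process symbol table instead of the kernel's module symbol lookup, could look like this:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_fn)(int);

struct toy_symbol {
	const char *name;
	toy_fn fn;
	int refs;   /* models the reference that pins the provider */
};

/* Hypothetical provider function; the +100 is arbitrary. */
static int toy_get_type_impl(int dev_id)
{
	return dev_id + 100;
}

static struct toy_symbol toy_symtab[] = {
	{ "toy_get_type", toy_get_type_impl, 0 },
};

static struct toy_symbol *toy_symbol_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_symtab) / sizeof(toy_symtab[0]); i++)
		if (!strcmp(toy_symtab[i].name, name))
			return &toy_symtab[i];
	return NULL;
}

static struct toy_symbol *toy_symbol_get(const char *name)
{
	struct toy_symbol *sym = toy_symbol_find(name);

	if (sym)
		sym->refs++;
	return sym;
}

static void toy_symbol_put(struct toy_symbol *sym)
{
	assert(sym && sym->refs > 0);
	sym->refs--;
}

/* Gateway: look up, call, release; fail softly if the provider is absent. */
int toy_call_external(const char *name, int arg)
{
	struct toy_symbol *sym = toy_symbol_get(name);
	int ret;

	if (!sym)
		return -EINVAL;
	ret = sym->fn(arg);
	toy_symbol_put(sym);
	return ret;
}

/* Test helper: current reference count, or -EINVAL if unknown. */
int toy_symbol_refs(const char *name)
{
	struct toy_symbol *sym = toy_symbol_find(name);

	return sym ? sym->refs : -EINVAL;
}
```

The reference is held only for the duration of the call, so the provider can be removed again as soon as no call is in flight.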



[RFC 5/9] VFIO: Extend external user API

2014-08-25 Thread Eric Auger
New functions are added, to be called from the ARM KVM-VFIO device:

- vfio_device_get_external_user gets a vfio device from
  its fd
- vfio_device_put_external_user puts the vfio device
- vfio_external_get_type retrieves the type of the device
  (PCI or platform)
- vfio_external_get_base_device gets the
  struct device*, useful to access the platform_device

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 drivers/vfio/vfio.c  | 35 +++
 include/linux/vfio.h |  4 
 2 files changed, 39 insertions(+)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 8e84471..c93b9e4 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1401,6 +1401,41 @@ void vfio_group_put_external_user(struct vfio_group *group)
 }
 EXPORT_SYMBOL_GPL(vfio_group_put_external_user);
 
+struct vfio_device *vfio_device_get_external_user(struct file *filep)
+{
+   struct vfio_device *vdev = filep->private_data;
+
+   if (filep->f_op != &vfio_device_fops)
+   return ERR_PTR(-EINVAL);
+
+   vfio_device_get(vdev);
+   return vdev;
+}
+EXPORT_SYMBOL_GPL(vfio_device_get_external_user);
+
+void vfio_device_put_external_user(struct vfio_device *vdev)
+{
+   vfio_device_put(vdev);
+}
+EXPORT_SYMBOL_GPL(vfio_device_put_external_user);
+
+int vfio_external_get_type(struct vfio_device *vdev)
+{
+   if (!strcmp(vdev->ops->name, "vfio-platform"))
+   return VFIO_DEVICE_FLAGS_PLATFORM;
+   else if (!strcmp(vdev->ops->name, "vfio-pci"))
+   return VFIO_DEVICE_FLAGS_PCI;
+   else
+   return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(vfio_external_get_type);
+
+struct device *vfio_external_get_base_device(struct vfio_device *vdev)
+{
+   return vdev->dev;
+}
+EXPORT_SYMBOL_GPL(vfio_external_get_base_device);
+
 int vfio_external_user_iommu_id(struct vfio_group *group)
 {
return iommu_group_id(group->iommu_group);
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ffe04ed..19e98eb 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -99,6 +99,10 @@ extern void vfio_group_put_external_user(struct vfio_group *group);
 extern int vfio_external_user_iommu_id(struct vfio_group *group);
 extern long vfio_external_check_extension(struct vfio_group *group,
  unsigned long arg);
+extern struct vfio_device *vfio_device_get_external_user(struct file *filep);
+extern void vfio_device_put_external_user(struct vfio_device *vdev);
+extern int vfio_external_get_type(struct vfio_device *vdev);
+extern struct device *vfio_external_get_base_device(struct vfio_device *vdev);
 
 struct pci_dev;
 #ifdef CONFIG_EEH
-- 
1.9.1
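
The type lookup added by this patch classifies a device by comparing the name of its ops structure against known driver names. A minimal sketch of that classification, with hypothetical `TOY_` constants standing in for VFIO_DEVICE_FLAGS_PCI and VFIO_DEVICE_FLAGS_PLATFORM (the values here are arbitrary):

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TOY_DEVICE_PCI      1  /* stand-in for VFIO_DEVICE_FLAGS_PCI */
#define TOY_DEVICE_PLATFORM 2  /* stand-in for VFIO_DEVICE_FLAGS_PLATFORM */

/* Maps a driver ops name to a device type, or -EINVAL if unknown. */
int toy_get_type(const char *ops_name)
{
	assert(ops_name);
	if (!strcmp(ops_name, "vfio-platform"))
		return TOY_DEVICE_PLATFORM;
	if (!strcmp(ops_name, "vfio-pci"))
		return TOY_DEVICE_PCI;
	return -EINVAL;
}
```

Matching on a name string ties the result to the exact spelling each bus driver uses for its ops; an explicit type field would be an alternative design, which this sketch does not explore.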



[RFC 6/9] KVM: KVM-VFIO: allow arch specific implementation

2014-08-25 Thread Eric Auger
Introduce a new __KVM_HAVE_ARCH_KVM_VFIO option.
When it is set, the generic KVM-VFIO code calls architecture-dependent
code.

The architecture-dependent hooks are:
- kvm_arch_vfio_has_attr
- kvm_arch_vfio_set_attr
- kvm_arch_vfio_init
- kvm_arch_vfio_destroy

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 include/linux/kvm_host.h | 30 ++
 virt/kvm/vfio.c  |  9 +
 2 files changed, 39 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a4c33b3..c4ce4af 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1075,6 +1075,36 @@ extern struct kvm_device_ops kvm_vfio_ops;
 extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
 extern struct kvm_device_ops kvm_flic_ops;
 
+#ifdef __KVM_HAVE_ARCH_KVM_VFIO
+
+int kvm_arch_vfio_has_attr(struct kvm_device *dev,
+  struct kvm_device_attr *attr);
+int kvm_arch_vfio_set_attr(struct kvm_device *dev,
+  struct kvm_device_attr *attr);
+int kvm_arch_vfio_init(struct kvm_device *dev);
+
+void kvm_arch_vfio_destroy(struct kvm_device *dev);
+
+#else
+static inline int kvm_arch_vfio_has_attr(struct kvm_device *dev,
+ struct kvm_device_attr *attr)
+{
+   return -ENXIO;
+}
+static inline int kvm_arch_vfio_set_attr(struct kvm_device *dev,
+ struct kvm_device_attr *attr)
+{
+   return -ENXIO;
+}
+static inline int kvm_arch_vfio_init(struct kvm_device *dev)
+{
+   return 0;
+}
+static inline void kvm_arch_vfio_destroy(struct kvm_device *dev)
+{
+}
+#endif
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index ba1a93f..89d3b75 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -207,6 +207,8 @@ static int kvm_vfio_set_attr(struct kvm_device *dev,
switch (attr->group) {
case KVM_DEV_VFIO_GROUP:
return kvm_vfio_set_group(dev, attr->attr, attr->addr);
+   default:
+   return kvm_arch_vfio_set_attr(dev, attr);
}
 
return -ENXIO;
@@ -224,6 +226,9 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
}
 
break;
+
+   default:
+   kvm_arch_vfio_has_attr(dev, attr);
}
 
return -ENXIO;
@@ -234,6 +239,8 @@ static void kvm_vfio_destroy(struct kvm_device *dev)
struct kvm_vfio *kv = dev->private;
struct kvm_vfio_group *kvg, *tmp;
 
+   kvm_arch_vfio_destroy(dev);
+
list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) {
kvm_vfio_group_put_external_user(kvg->vfio_group);
list_del(&kvg->node);
@@ -265,6 +272,8 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
 
dev->private = kv;
 
+   kvm_arch_vfio_init(dev);
+
return 0;
 }
 
-- 
1.9.1
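
The hook mechanism in this patch combines two ideas: a compile-time switch that either declares real architecture hooks or supplies inline fallbacks, and a `default:` branch in the generic dispatch that hands unknown groups to the hook. The sketch below shows both with hypothetical `toy_`/`TOY_` names. In this sketch the default branch returns the hook's result so that an architecture-provided answer reaches the caller. In the quoted kvm_vfio_has_attr hunk the result of kvm_arch_vfio_has_attr() is not returned, so that function would still fall through to `return -ENXIO`; whether that is intended is not stated.

```c
#include <assert.h>
#include <errno.h>

#define TOY_GROUP_GENERIC 1  /* handled by the generic code */
#define TOY_GROUP_ARCH    2  /* only an arch hook could handle this */

struct toy_attr {
	int group;
};

#ifdef TOY_HAVE_ARCH_HOOKS
/* An architecture that defines the macro provides the real hook. */
int toy_arch_has_attr(const struct toy_attr *attr);
#else
/* Fallback used when no architecture hook is compiled in. */
static inline int toy_arch_has_attr(const struct toy_attr *attr)
{
	(void)attr;
	return -ENXIO;
}
#endif

int toy_has_attr(const struct toy_attr *attr)
{
	assert(attr);
	switch (attr->group) {
	case TOY_GROUP_GENERIC:
		return 0;
	default:
		/* Propagate whatever the architecture hook decides. */
		return toy_arch_has_attr(attr);
	}
}
```

Because the fallback is a static inline function, builds without the macro pay no call overhead and need no `#ifdef` at the call site.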



[RFC 4/9] KVM: KVM-VFIO: update user API to program forwarded IRQ

2014-08-25 Thread Eric Auger
Add new device group commands:
- KVM_DEV_VFIO_DEVICE_ASSIGN_IRQ and
  KVM_DEV_VFIO_DEVICE_DEASSIGN_IRQ

which turn forwarded IRQ mode on and off.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 Documentation/virtual/kvm/devices/vfio.txt | 25 +
 arch/arm/include/uapi/asm/kvm.h|  6 ++
 include/uapi/linux/kvm.h   |  3 +++
 3 files changed, 34 insertions(+)

diff --git a/Documentation/virtual/kvm/devices/vfio.txt b/Documentation/virtual/kvm/devices/vfio.txt
index ef51740..c8b3fa1 100644
--- a/Documentation/virtual/kvm/devices/vfio.txt
+++ b/Documentation/virtual/kvm/devices/vfio.txt
@@ -13,6 +13,7 @@ VFIO-group is held by KVM.
 
 Groups:
   KVM_DEV_VFIO_GROUP
+  KVM_DEV_VFIO_DEVICE
 
 KVM_DEV_VFIO_GROUP attributes:
   KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
@@ -20,3 +21,27 @@ KVM_DEV_VFIO_GROUP attributes:
 
 For each, kvm_device_attr.addr points to an int32_t file descriptor
 for the VFIO group.
+
+KVM_DEV_VFIO_DEVICE attributes:
+  KVM_DEV_VFIO_DEVICE_ASSIGN_IRQ
+  KVM_DEV_VFIO_DEVICE_DEASSIGN_IRQ
+
+For each, kvm_device_attr.addr points to a kvm_arch_forwarded_irq.
+This user API makes it possible to create a special IRQ handling mode,
+currently supported only on ARM, where KVM and a VFIO platform driver
+collaborate to improve IRQ handling performance.
+fd represents the file descriptor of a valid VFIO device whose physical
+IRQ, referenced by its irq_index, is injected into the VM as guest_irq.
+
+On ASSIGN_IRQ, the KVM-VFIO device programs:
+- the host, to not complete the physical IRQ itself.
+- the GIC, to automatically complete the physical IRQ when the guest
+  completes the virtual IRQ
+This avoids trapping the end-of-interrupt for level-sensitive IRQs.
+
+On DEASSIGN_IRQ, one comes back to the mode where the host completes the
+physical IRQ and the guest only completes the virtual IRQ.
+
+It is up to the caller of this API to make sure the IRQ is not
+outstanding when ASSIGN/DEASSIGN is called. Otherwise there could be some
+inconsistency about who is going to complete the IRQ.
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 3034c66..1920b33 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -109,6 +109,12 @@ struct kvm_sync_regs {
 struct kvm_arch_memory_slot {
 };
 
+struct kvm_arch_forwarded_irq {
+   __u32 fd; /* file descriptor of the VFIO device */
+   __u32 irq_index; /* platform device index of the IRQ */
+   __u32 guest_irq; /* virtual IRQ number */
+};
+
 /* If you need to interpret the index values, here is the key: */
 #define KVM_REG_ARM_COPROC_MASK0x0FFF
 #define KVM_REG_ARM_COPROC_SHIFT   16
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index cf3a2ff..b149ba8 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -954,6 +954,9 @@ struct kvm_device_attr {
 #define  KVM_DEV_VFIO_GROUP1
 #define   KVM_DEV_VFIO_GROUP_ADD   1
 #define   KVM_DEV_VFIO_GROUP_DEL   2
+#define  KVM_DEV_VFIO_DEVICE   2
+#define   KVM_DEV_VFIO_DEVICE_ASSIGN_IRQ   1
+#define   KVM_DEV_VFIO_DEVICE_DEASSIGN_IRQ 2
 #define KVM_DEV_TYPE_ARM_VGIC_V2   5
 #define KVM_DEV_TYPE_FLIC  6
 
-- 
1.9.1
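
The attribute layout proposed in the patch above can be sketched in userspace as follows. This is only an illustration under stated assumptions: the constants are copied from the patch, the two structs are local mirrors (the names forwarded_irq, device_attr and make_forward_attr are hypothetical), and nothing here is assumed to exist in an installed kernel header.

```c
#include <stdint.h>
#include <string.h>

/* Values copied from the patch; not assumed to be in any installed header. */
#define KVM_DEV_VFIO_DEVICE              2
#define KVM_DEV_VFIO_DEVICE_ASSIGN_IRQ   1
#define KVM_DEV_VFIO_DEVICE_DEASSIGN_IRQ 2

/* Local mirror of the proposed struct kvm_arch_forwarded_irq. */
struct forwarded_irq {
    uint32_t fd;        /* file descriptor of the VFIO device */
    uint32_t irq_index; /* platform device index of the IRQ */
    uint32_t guest_irq; /* virtual IRQ number */
};

/* Local mirror of struct kvm_device_attr (flags, group, attr, addr). */
struct device_attr {
    uint32_t flags;
    uint32_t group;
    uint64_t attr;
    uint64_t addr;
};

/*
 * Build the attribute that asks KVM to start (assign != 0) or stop
 * (assign == 0) forwarding the IRQ described by *irq.
 */
static struct device_attr make_forward_attr(const struct forwarded_irq *irq,
                                            int assign)
{
    struct device_attr a;

    memset(&a, 0, sizeof(a));
    a.group = KVM_DEV_VFIO_DEVICE;
    a.attr = assign ? KVM_DEV_VFIO_DEVICE_ASSIGN_IRQ
                    : KVM_DEV_VFIO_DEVICE_DEASSIGN_IRQ;
    a.addr = (uint64_t)(uintptr_t)irq; /* addr points at the IRQ description */
    return a;
}
```

A real caller would presumably fill the kernel's own struct kvm_device_attr the same way and pass it to the KVM_SET_DEVICE_ATTR ioctl on the KVM-VFIO device fd, after making sure the IRQ is not outstanding.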



Re: [GIT PULL 0/2] KVM: s390: Fixes for 3.17

2014-08-25 Thread Paolo Bonzini
Il 25/08/2014 15:10, Christian Borntraeger ha scritto:
 Paolo,
 
 the following changes since commit 7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9:
 
   Linux 3.17-rc1 (2014-08-16 10:40:26 -0600)
 
 are available in the git repository at:
 
   git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git  
 tags/kvm-s390-20140825
 
 for you to fetch changes up to ab3f285f227fec62868037e9b1b1fd18294a83b8:
 
   KVM: s390/mm: try a cow on read only pages for key ops (2014-08-25 14:35:28 
 +0200)
 
 
 Here are two fixes for s390 KVM code that prevent a malicious user from:
 1. triggering a kernel BUG
 2. changing the storage key of read-only pages
 
 
 Christian Borntraeger (2):
   KVM: s390: Fix user triggerable bug in dead code
   KVM: s390/mm: try a cow on read only pages for key ops
 
  arch/s390/kvm/kvm-s390.c | 13 -
  arch/s390/mm/pgtable.c   | 10 ++
  2 files changed, 10 insertions(+), 13 deletions(-)
 

Thanks, pulled (for now locally).  Since the log message is about
malicious users, I assume there's no urgency in pushing this to Linus
and that normal guests work fine.

Paolo


Re: [GIT PULL 0/2] KVM: s390: Fixes for 3.17

2014-08-25 Thread Christian Borntraeger
On 25/08/14 15:42, Paolo Bonzini wrote:
 Il 25/08/2014 15:10, Christian Borntraeger ha scritto:
 Paolo,

 the following changes since commit 7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9:

   Linux 3.17-rc1 (2014-08-16 10:40:26 -0600)

 are available in the git repository at:

   git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git  
 tags/kvm-s390-20140825

 for you to fetch changes up to ab3f285f227fec62868037e9b1b1fd18294a83b8:

   KVM: s390/mm: try a cow on read only pages for key ops (2014-08-25 
 14:35:28 +0200)

 
 Here are two fixes for s390 KVM code that prevent a malicious user from:
 1. triggering a kernel BUG
 2. changing the storage key of read-only pages

 
 Christian Borntraeger (2):
   KVM: s390: Fix user triggerable bug in dead code
   KVM: s390/mm: try a cow on read only pages for key ops

  arch/s390/kvm/kvm-s390.c | 13 -
  arch/s390/mm/pgtable.c   | 10 ++
  2 files changed, 10 insertions(+), 13 deletions(-)

 
 Thanks, pulled (for now locally).  Since the log message is about
 malicious users, I assume there's no urgency in pushing this to Linus
 and that normal guests work fine.

Yes.



Re: [PATCH v3 5/7] KVM: trace kvm_ple_window grow/shrink

2014-08-25 Thread Sabrina Dubroca
Hello,

2014-08-21, 18:08:09 +0200, Radim Krčmář wrote:
 Tracepoint for dynamic PLE window, fired on every potential change.
 
 Signed-off-by: Radim Krčmář rkrc...@redhat.com
 ---
  arch/x86/kvm/trace.h | 30 ++
  arch/x86/kvm/vmx.c   | 10 --
  arch/x86/kvm/x86.c   |  1 +
  3 files changed, 39 insertions(+), 2 deletions(-)
 
 diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
 index e850a7d..1742dfb 100644
 --- a/arch/x86/kvm/trace.h
 +++ b/arch/x86/kvm/trace.h
 @@ -848,6 +848,36 @@ TRACE_EVENT(kvm_track_tsc,
 __print_symbolic(__entry->host_clock, host_clocks))
  );
  
 +TRACE_EVENT(kvm_ple_window,
 + TP_PROTO(bool grow, unsigned int vcpu_id, int new, int old),
 + TP_ARGS(grow, vcpu_id, new, old),
 +
 + TP_STRUCT__entry(
 + __field(bool,  grow )
 + __field(unsigned int,   vcpu_id )
 + __field( int,   new )
 + __field( int,   old )
 + ),
 +
 + TP_fast_assign(
 + __entry->grow = grow;
 + __entry->vcpu_id = vcpu_id;
 + __entry->new = new;
 + __entry->old = old;
 + ),
 +
 + TP_printk("vcpu %u: ple_window %d (%s %d)",
 +   __entry->vcpu_id,
 +   __entry->new,
 +   __entry->grow ? "grow" : "shrink",
 +   __entry->old)
 +);
 +
 +#define trace_kvm_ple_window_grow(vcpu_id, new, old) \
 + trace_kvm_ple_window(true, vcpu_id, new, old)
 +#define trace_kvm_ple_window_shrink(vcpu_id, new, old) \
 + trace_kvm_ple_window(false, vcpu_id, new, old)
 +
  #endif /* CONFIG_X86_64 */

Looks like these are needed on 32-bit as well.
Today's linux-next doesn't build:

  CC [M]  arch/x86/kvm/x86.o
In file included from include/linux/linkage.h:6:0,
 from include/linux/preempt.h:9,
 from include/linux/preempt_mask.h:4,
 from include/linux/hardirq.h:4,
 from include/linux/kvm_host.h:10,
 from arch/x86/kvm/x86.c:22:
include/linux/tracepoint.h:214:20: error: ‘__tracepoint_kvm_ple_window’ 
undeclared here (not in a function)
  EXPORT_SYMBOL_GPL(__tracepoint_##name)
^
include/linux/export.h:57:16: note: in definition of macro ‘__EXPORT_SYMBOL’
  extern typeof(sym) sym; \
^
include/linux/tracepoint.h:214:2: note: in expansion of macro 
‘EXPORT_SYMBOL_GPL’
  EXPORT_SYMBOL_GPL(__tracepoint_##name)
  ^
arch/x86/kvm/x86.c:7676:1: note: in expansion of macro 
‘EXPORT_TRACEPOINT_SYMBOL_GPL’
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_ple_window);
 ^


and if I comment out the EXPORT_TRACEPOINT_SYMBOL_GPL:

arch/x86/kvm/vmx.c: In function ‘grow_ple_window’:
arch/x86/kvm/vmx.c:5742:2: error: implicit declaration of function 
‘trace_kvm_ple_window_grow’ [-Werror=implicit-function-declaration]
  trace_kvm_ple_window_grow(vcpu->vcpu_id, vmx->ple_window, old);
  ^
arch/x86/kvm/vmx.c: In function ‘shrink_ple_window’:
arch/x86/kvm/vmx.c:5756:2: error: implicit declaration of function 
‘trace_kvm_ple_window_shrink’ [-Werror=implicit-function-declaration]
  trace_kvm_ple_window_shrink(vcpu->vcpu_id, vmx->ple_window, old);
  ^
cc1: some warnings being treated as errors
make[2]: *** [arch/x86/kvm/vmx.o] Error 1



I moved the line

#endif /* CONFIG_X86_64 */

above

TRACE_EVENT(kvm_ple_window,

and it builds.


Thanks,

-- 
Sabrina


Re: [PATCH] KVM-Use value reading from MSR when construct the eptp in VMX mode

2014-08-25 Thread Gleb Natapov
On Mon, Aug 25, 2014 at 11:16:34AM +0800, Dennis Chen wrote:
 On Sun, Aug 24, 2014 at 5:38 PM, Gleb Natapov g...@kernel.org wrote:
  On Sun, Aug 24, 2014 at 11:54:32AM +0800, Dennis Chen wrote:
  This patch constructs the EPTP in VMX mode from values
  read from the MSR, according to the Intel x86 Software Developer's
  Manual.
 
   static u64 construct_eptp(unsigned long root_hpa)
   {
  -u64 eptp;
  +u64 eptp, pwl;
  +
  +if (cpu_has_vmx_ept_4levels())
  +pwl = VMX_EPT_DEFAULT_GAW << VMX_EPT_GAW_EPTP_SHIFT;
  +else {
  +WARN(1, "Unsupported page-walk length of 4.\n");
  Page-walk length of 4 is the only one that is supported.
 
 Since bit 6 of the IA32_VMX_EPT_VPID_CAP MSR indicates support for a
 page-walk length of 4, I think a sanity check is necessary.
 But I just checked the code, and it's already done in the hardware_setup()
 function, which disables the EPT feature if the page-walk length is not
 4. Gleb, any comments on the memory type check part?
Looks fine, but are there CPUs out there that do not support WB for the EPTP? Since
there have been no bug reports about it, I assume not.

--
Gleb.
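
The two checks discussed in this thread (page-walk length and memory type) can be sketched as below. This is a minimal illustration, not the kernel's construct_eptp(): the macro and function names are hypothetical, and the bit positions are taken from the SDM's description of the EPTP and of IA32_VMX_EPT_VPID_CAP (bit 6 for a page-walk length of 4, bit 14 for the WB memory type).

```c
#include <stdint.h>

#define EPT_MT_WB          0x6ull        /* write-back memory type, EPTP bits 2:0 */
#define EPT_WALK_LEN_4     3ull          /* page-walk length minus one */
#define EPT_WALK_LEN_SHIFT 3             /* EPTP bits 5:3 */
#define EPT_CAP_WALK_LEN_4 (1ull << 6)   /* IA32_VMX_EPT_VPID_CAP bit 6 */
#define EPT_CAP_MT_WB      (1ull << 14)  /* IA32_VMX_EPT_VPID_CAP bit 14 */
#define PAGE_MASK_4K       (~0xfffull)   /* keep the 4K-aligned root address */

/* Return 1 if the capability MSR value allows a 4-level, WB-typed EPTP. */
static int ept_usable(uint64_t cap)
{
    return (cap & EPT_CAP_WALK_LEN_4) && (cap & EPT_CAP_MT_WB);
}

/* Compose an EPTP for the given root table physical address. */
static uint64_t build_eptp(uint64_t root_hpa)
{
    return EPT_MT_WB |
           (EPT_WALK_LEN_4 << EPT_WALK_LEN_SHIFT) |
           (root_hpa & PAGE_MASK_4K);
}
```

Doing the capability check once at setup time, as hardware_setup() is said to do above, keeps the per-call construction free of any warning path.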


Re: [PATCH v3 5/7] KVM: trace kvm_ple_window grow/shrink

2014-08-25 Thread Radim Krčmář
2014-08-25 15:53+0200, Sabrina Dubroca:
 Hello,
 
 2014-08-21, 18:08:09 +0200, Radim Krčmář wrote:
  Tracepoint for dynamic PLE window, fired on every potential change.
  +#define trace_kvm_ple_window_grow(vcpu_id, new, old) \
  +   trace_kvm_ple_window(true, vcpu_id, new, old)
  +#define trace_kvm_ple_window_shrink(vcpu_id, new, old) \
  +   trace_kvm_ple_window(false, vcpu_id, new, old)
  +
   #endif /* CONFIG_X86_64 */
 
 Looks like these are needed on 32-bit as well.
 Today's linux-next doesn't build:
 
 [...]
 
 I moved the line
 
 #endif /* CONFIG_X86_64 */
 
 above
 
 TRACE_EVENT(kvm_ple_window,
 
 and it builds.

Thanks!

Paolo, can you still fix this just by rebasing?

---
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 1742dfb..4c2868f 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -848,6 +848,8 @@ TRACE_EVENT(kvm_track_tsc,
  __print_symbolic(__entry->host_clock, host_clocks))
 );
 
+#endif /* CONFIG_X86_64 */
+
 TRACE_EVENT(kvm_ple_window,
TP_PROTO(bool grow, unsigned int vcpu_id, int new, int old),
TP_ARGS(grow, vcpu_id, new, old),
@@ -878,8 +880,6 @@ TRACE_EVENT(kvm_ple_window,
 #define trace_kvm_ple_window_shrink(vcpu_id, new, old) \
trace_kvm_ple_window(false, vcpu_id, new, old)
 
-#endif /* CONFIG_X86_64 */
-
 #endif /* _TRACE_KVM_H */
 
 #undef TRACE_INCLUDE_PATH


Re: [PATCH v3 5/7] KVM: trace kvm_ple_window grow/shrink

2014-08-25 Thread Paolo Bonzini
Il 25/08/2014 16:32, Radim Krčmář ha scritto:
 Paolo, can you still fix this just by rebasing?

Maybe I could, but I just pushed the fix to kvm/next as a separate commit.

Paolo


Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.

2014-08-25 Thread Radim Krčmář
2014-08-22 12:45+0800, Wanpeng Li:
 Hi Radim,
 On Thu, Aug 21, 2014 at 06:50:03PM +0200, Radim Krčmář wrote:
 2014-08-21 18:30+0200, Paolo Bonzini:
  Il 21/08/2014 18:08, Radim Krčmář ha scritto:
  I'm not sure of the usefulness of patch 6, so I'm going to drop it.
  I'll keep it in my local junkyard branch in case it's going to be useful
  in some scenario I didn't think of.
 
 I've been using it to benchmark different values, because it is more
 
 Is there any benchmark data for this patchset?

Sorry, I already returned the testing machine and it wasn't quality
benchmarking, so I haven't kept the results ...

I used ebizzy and dbench, because ebizzy had a large difference between
PLE on/off and dbench a minimal one (without overcommit), so one was looking
for improvements while the other was checking for regressions.
(And they are easy to set up.)

From what I remember, this patch had roughly 5x better performance with
ebizzy on 60 VCPU guests and no obvious difference for dbench.
(And improvement under overcommit was visible for both.)

There was a significant reduction in %sys, which never rose much above
30%, as opposed to the original 90%+.


Re: [libvirt] Mentors wanted for Outreach Program for Women October 2014

2014-08-25 Thread Stefan Hajnoczi
On Mon, Aug 25, 2014 at 11:52 AM, Martin Kletzander mklet...@redhat.com wrote:
 On Thu, Aug 21, 2014 at 09:06:39PM +0100, Stefan Hajnoczi wrote:
 Regular code contributors to QEMU, KVM, and libvirt are eligible to
 participate as mentors.

 We also need project ideas that are achievable in 12 weeks by someone
 skilled in programming but not necessarily familiar with open source
 or our codebase.  Ideas welcome!


 It's just a matter of ideas.  Maybe we could revisit some of those we
 had for GSoC.  If I'm reading it right, the deadline for project ideas is
 October 22nd, so I think we'll definitely come up with something.

Yes, we can continue to offer project ideas that were not done last round.

Thanks for your interest, Martin!  I'll send more information once we
have information on how many slots are funded.

Stefan


Re: [libvirt] Mentors wanted for Outreach Program for Women October 2014

2014-08-25 Thread Marina Zhurakhinskaya
- Original Message -
 From: Stefan Hajnoczi stefa...@gmail.com
 To: Martin Kletzander mklet...@redhat.com
 Cc: qemu-devel qemu-de...@nongnu.org, libvir-l...@redhat.com, kvm 
 kvm@vger.kernel.org, Marina
 Zhurakhinskaya mari...@redhat.com
 Sent: Monday, August 25, 2014 12:29:27 PM
 Subject: Re: [libvirt] Mentors wanted for Outreach Program for Women October 
 2014
 
 On Mon, Aug 25, 2014 at 11:52 AM, Martin Kletzander mklet...@redhat.com
 wrote:
  On Thu, Aug 21, 2014 at 09:06:39PM +0100, Stefan Hajnoczi wrote:
  Regular code contributors to QEMU, KVM, and libvirt are eligible to
  participate as mentors.
 
  We also need project ideas that are achievable in 12 weeks by someone
  skilled in programming but not necessarily familiar with open source
  or our codebase.  Ideas welcome!
 
 
  It's just a matter of ideas.  Maybe we could revisit some of those we
  had for GSoC.  If I'm reading it right, the deadline for project ideas is
  October 22nd, so I think we'll definitely come up with something.

Thank you for your interest in helping revisit GSoC ideas and come up with new 
ones! October 22 is an application deadline for prospective interns. QEMU would 
need to have some project ideas listed by September 8, though you can add more 
ideas through September. The timeline for the program is at
https://wiki.gnome.org/OutreachProgramForWomen/2014/DecemberMarch. You don't
need very many ideas, as you are likely to only have at most 2-3 participants.

We don't yet have any funding confirmed for QEMU, but Stefan and I will be 
working on this. If your organization might be able to sponsor QEMU internships 
in OPW, please contact me and Stefan off-list. You can learn more at 
https://wiki.gnome.org/OutreachProgramForWomen/Admin/InfoForOrgs

Thanks,
Marina

 
 Yes, we can continue to offer project ideas that were not done last round.
 
 Thanks for your interest, Martin!  I'll send more information once we
 have information on how many slots are funded.
 
 Stefan
 


Re: [PATCH] kvm-unit-tests: x86: pmu: call measure for every counter in check_counters_many

2014-08-25 Thread Paolo Bonzini
Il 14/08/2014 22:58, Chris J Arges ha scritto:
 In the check_counters_many function measure was only being called on the last
 counter, causing the pmu test to fail.

I don't understand.  measure loops on all N counters and calls
start_event (which in turn calls global_enable) and stop_event (which
calls global_disable) on all counters.

 This ensures that measure is called for
 each counter in the array before calling verify_counter.

Actually the point of this test is to run the loop while all the
counters are active, so this patch is just masking another bug.

Paolo

 Signed-off-by: Chris J Arges chris.j.ar...@canonical.com
 ---
  x86/pmu.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/x86/pmu.c b/x86/pmu.c
 index 5c85146..3402d1e 100644
 --- a/x86/pmu.c
 +++ b/x86/pmu.c
 @@ -287,11 +287,11 @@ static void check_counters_many(void)
   n++;
   }
  
 - measure(cnt, n);
 -
 - for (i = 0; i < n; i++)
 + for (i = 0; i < n; i++) {
 + measure(cnt[i], 1);
   if (!verify_counter(cnt[i]))
   break;
 + }
  
   report("all counters", i == n);
  }
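
The difference Paolo describes, between measuring all counters at once and measuring them one at a time, can be modelled with a small simulation. Everything here is hypothetical (sim_counter, sim_loop and sim_measure are not kvm-unit-tests names); the sketch only assumes that measure enables every counter it is given, runs the workload once, and then disables them.

```c
#include <stdint.h>

/* Hypothetical stand-in for a PMU counter; not the kvm-unit-tests type. */
struct sim_counter {
    int active;
    uint64_t count;
};

/* Simulated workload: every active counter observes each event. */
static void sim_loop(struct sim_counter *c, int n, int events)
{
    for (int e = 0; e < events; e++)
        for (int i = 0; i < n; i++)
            if (c[i].active)
                c[i].count++;
}

/*
 * Enable all n counters, run the workload once, then disable them.
 * Returns how many counters were active while the workload ran.
 */
static int sim_measure(struct sim_counter *c, int n, int events)
{
    int active = 0;

    for (int i = 0; i < n; i++) {
        c[i].active = 1;
        active++;
    }
    sim_loop(c, n, events);
    for (int i = 0; i < n; i++)
        c[i].active = 0;
    return active;
}
```

Calling sim_measure(cnt, n, ...) exercises n counters concurrently, whereas the loop in the patch above amounts to n calls with a count of 1, so no two counters are ever active together. That would explain why the change makes the test pass without testing what it is meant to test.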
 


Re: [PATCH v3 0/3] Sync MTRRs with KVM and disable on reset

2014-08-25 Thread Paolo Bonzini
Il 14/08/2014 23:39, Alex Williamson ha scritto:
 v3:
  - Fix off-by-one identified by Laszlo in 2/3
  - Add R-b in 1  3
 
 It turns out that not only do we not follow the SDM guidelines for
 resetting MTRR state on vCPU reset, but we really don't even attempt
 to keep KVM MTRR state synchronized with QEMU, which affects not
 only reset, but migration.  This series implements the get/put MSR
 support for KVM, then goes on to properly re-initialize the state on
 vCPU reset.  This resolves the problem described in the last patch
 as well as some potential mismatches around migration.  The migration
 state is unchanged, other than actually passing valid data.
 
 Thanks to Laszlo for his help debugging this and realization of how
 terribly broken MTRR synchronization is.  Thanks,

Applying to uq/master, thanks.

Paolo



Re: [PATCH] kvm-unit-tests: x86: pmu: call measure for every counter in check_counters_many

2014-08-25 Thread Chris J Arges


On 08/25/2014 11:45 AM, Paolo Bonzini wrote:
 Il 14/08/2014 22:58, Chris J Arges ha scritto:
 In the check_counters_many function measure was only being called on the last
 counter, causing the pmu test to fail.
 
 I don't understand.  measure loops on all N counters and calls
 start_event (which in turn calls global_enable) and stop_event (which
 calls global_disable) on all counters.
 
 This ensures that measure is called for
 each counter in the array before calling verify_counter.
 
 Actually the point of this test is to run the loop while all the
 counters are active, so this patch is just masking another bug.
 
 Paolo

Paolo,

Ok I see now where this patch doesn't make sense.
With the latest kvm tree I get:

sudo ./x86-run x86/pmu.flat -smp 1 -cpu host | grep -v PASS


qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/pmu.flat -smp 1 -cpu host
enabling apic
paging enabled
cr0 = 80010011
cr3 = 7fff000
cr4 = 20
PMU version: 2
GP counters: 4
GP counter width:48
Mask length: 7
Fixed counters:  3
Fixed counter width: 48
FAIL: all counters

SUMMARY: 67 tests, 1 unexpected failures
Return value from qemu: 3

I've tested this on a few Intel platforms (sandybridge/haswell), I'll
look into the code more then.

Thanks,
--chris j arges

 
 Signed-off-by: Chris J Arges chris.j.ar...@canonical.com
 ---
  x86/pmu.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

 diff --git a/x86/pmu.c b/x86/pmu.c
 index 5c85146..3402d1e 100644
 --- a/x86/pmu.c
 +++ b/x86/pmu.c
 @@ -287,11 +287,11 @@ static void check_counters_many(void)
  n++;
  }
  
 -measure(cnt, n);
 -
 -for (i = 0; i < n; i++)
 +for (i = 0; i < n; i++) {
 +measure(cnt[i], 1);
  if (!verify_counter(cnt[i]))
  break;
 +}
  
  report("all counters", i == n);
  }

 


Re: [PATCH] kvm-unit-tests: x86: pmu: call measure for every counter in check_counters_many

2014-08-25 Thread Paolo Bonzini
 Ok I see now where this patch doesn't make sense.
 With the latest kvm tree I get:
 
 sudo ./x86-run x86/pmu.flat -smp 1 -cpu host | grep -v PASS
 
 
 qemu-system-x86_64 -enable-kvm -device pc-testdev -device
 isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
 -device pci-testdev -kernel x86/pmu.flat -smp 1 -cpu host
 enabling apic
 paging enabled
 cr0 = 80010011
 cr3 = 7fff000
 cr4 = 20
 PMU version: 2
 GP counters: 4
 GP counter width:48
 Mask length: 7
 Fixed counters:  3
 Fixed counter width: 48
 FAIL: all counters
 
 SUMMARY: 67 tests, 1 unexpected failures
 Return value from qemu: 3
 
 I've tested this on a few Intel platforms (sandybridge/haswell), I'll
 look into the code more then.


Are you using the NMI watchdog in the host?  It eats one PMU counter
and makes this test fail.

Paolo


Re: [PATCH] kvm-unit-tests: x86: pmu: call measure for every counter in check_counters_many

2014-08-25 Thread Chris J Arges


On 08/25/2014 02:32 PM, Paolo Bonzini wrote:
 Ok I see now where this patch doesn't make sense.
 With the latest kvm tree I get:

 sudo ./x86-run x86/pmu.flat -smp 1 -cpu host | grep -v PASS


 qemu-system-x86_64 -enable-kvm -device pc-testdev -device
 isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
 -device pci-testdev -kernel x86/pmu.flat -smp 1 -cpu host
 enabling apic
 paging enabled
 cr0 = 80010011
 cr3 = 7fff000
 cr4 = 20
 PMU version: 2
 GP counters: 4
 GP counter width:48
 Mask length: 7
 Fixed counters:  3
 Fixed counter width: 48
 FAIL: all counters

 SUMMARY: 67 tests, 1 unexpected failures
 Return value from qemu: 3

 I've tested this on a few Intel platforms (sandybridge/haswell), I'll
 look into the code more then.
 
 
 Are you using the NMI watchdog in the host?  It eats one PMU counter
 and makes this test fail.
 
 Paolo
 

Ah, I didn't know that. Yes, disabling the NMI watchdog via:
echo 0 | sudo tee /proc/sys/kernel/nmi_watchdog
allows this test to pass.

Would it make sense to have a check if nmi_watchdog is enabled in this
test case, and skip the all counters test?

--chris j arges
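
One possible shape for such a check is sketched below. It is an assumption-laden illustration: the test binary runs inside the guest and cannot read the host's /proc, so a check like this would have to live in a host-side helper, and the function names are hypothetical.

```c
#include <stdio.h>

/*
 * Interpret the contents of /proc/sys/kernel/nmi_watchdog.
 * Returns 1 if enabled, 0 if disabled, -1 if the text cannot be parsed.
 */
static int parse_nmi_watchdog(const char *text)
{
    int value;

    if (text == NULL || sscanf(text, "%d", &value) != 1)
        return -1;
    return value != 0;
}

/* Read the setting from the given path; returns -1 if it cannot be read. */
static int host_nmi_watchdog(const char *path)
{
    char buf[16];
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_nmi_watchdog(buf);
}
```

A runner could call host_nmi_watchdog("/proc/sys/kernel/nmi_watchdog") and skip or flag the all-counters test when the result is 1, treating -1 as unknown.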




[PATCH v2 0/2] target-i386: tsc_adjust and mpx feature names

2014-08-25 Thread Eduardo Habkost
Add feature names that are missing on the x86 CPU feature name tables. Both had
migration support implemented many months ago.

Changes v1 -> v2:
 * Commit message changes only. Added reference to migration support commit IDs.

Note that v1 was not sent as a series, but as separate individual patches.

Eduardo Habkost (2):
  target-i386: Add mpx CPU feature name
  target-i386: Add tsc_adjust CPU feature name

 target-i386/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
1.9.3



[PATCH v2 1/2] target-i386: Add mpx CPU feature name

2014-08-25 Thread Eduardo Habkost
Migration support for MPX is already implemented (commit
79e9ebebbf2a00c46fcedb6dc7dd5e12bbd30216), so we can add it to the list
of known feature names.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
 target-i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 217500c..c86cf5c 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -258,7 +258,7 @@ static const char *svm_feature_name[] = {
 
 static const char *cpuid_7_0_ebx_feature_name[] = {
 "fsgsbase", NULL, NULL, "bmi1", "hle", "avx2", NULL, "smep",
-"bmi2", "erms", "invpcid", "rtm", NULL, NULL, NULL, NULL,
+"bmi2", "erms", "invpcid", "rtm", NULL, NULL, "mpx", NULL,
 NULL, NULL, "rdseed", "adx", "smap", NULL, NULL, NULL,
 NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
 };
-- 
1.9.3



[PATCH v2 2/2] target-i386: Add tsc_adjust CPU feature name

2014-08-25 Thread Eduardo Habkost
tsc_adjust migration support is already implemented (commit
f28558d3d37ad3bc4e35e8ac93f7bf81a0d5622c), so we can add it to the list
of known feature names.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
 target-i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index c86cf5c..ea0fd9c 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -257,7 +257,7 @@ static const char *svm_feature_name[] = {
 };
 
 static const char *cpuid_7_0_ebx_feature_name[] = {
-"fsgsbase", NULL, NULL, "bmi1", "hle", "avx2", NULL, "smep",
+"fsgsbase", "tsc_adjust", NULL, "bmi1", "hle", "avx2", NULL, "smep",
 "bmi2", "erms", "invpcid", "rtm", NULL, NULL, "mpx", NULL,
 NULL, NULL, "rdseed", "adx", "smap", NULL, NULL, NULL,
 NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
-- 
1.9.3
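
In a table like this, the array index is the bit position in CPUID.(EAX=7,ECX=0):EBX, which is why the two new names land in slots 1 and 14. The sketch below illustrates that mapping with a local copy of the table as it would look after both patches; the lookup helper feature_bit is hypothetical and is not QEMU's own parsing code.

```c
#include <stddef.h>
#include <string.h>

/* Local copy of the name table after both patches; index == EBX bit number. */
static const char *const cpuid_7_0_ebx_names[32] = {
    "fsgsbase", "tsc_adjust", NULL, "bmi1", "hle", "avx2", NULL, "smep",
    "bmi2", "erms", "invpcid", "rtm", NULL, NULL, "mpx", NULL,
    NULL, NULL, "rdseed", "adx", "smap", NULL, NULL, NULL,
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
};

/* Return the EBX bit mask for a feature name, or 0 if the name is unknown. */
static unsigned int feature_bit(const char *name)
{
    for (int i = 0; i < 32; i++) {
        if (cpuid_7_0_ebx_names[i] != NULL &&
            strcmp(cpuid_7_0_ebx_names[i], name) == 0)
            return 1u << i;
    }
    return 0;
}
```

A NULL slot means the bit has no user-visible name, so it cannot be requested by name on the command line even if the rest of the support is in place.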



Re: [RFC 1/5] driver core: amba: add device binding path 'driver_override'

2014-08-25 Thread Kim Phillips
On Fri, 22 Aug 2014 11:01:24 +0200
Antonios Motakis a.mota...@virtualopensystems.com wrote:

 As already demonstrated with PCI [1] and the platform bus [2], a
 driver_override property in sysfs can be used to bypass the id matching
 of a device to an AMBA driver. This can be used by VFIO to bind to any AMBA
 device requested by the user.
 
 [1] 
 http://lists-archives.com/linux-kernel/28030441-pci-introduce-new-device-binding-path-using-pci_dev-driver_override.html
 [2] https://www.redhat.com/archives/libvir-list/2014-April/msg00382.html
 
 Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
 ---
  drivers/amba/bus.c   | 43 +++
  include/linux/amba/bus.h |  1 +

missing Documentation/ABI/testing/sysfs-bus-amba entry?

otherwise looks ok.

Thanks,

Kim


[PATCH v2 1/6] pc: Create pc_compat_2_1() functions

2014-08-25 Thread Eduardo Habkost
We will need new compat code for the 2.1 machine-types.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
 hw/i386/pc_piix.c | 13 -
 hw/i386/pc_q35.c  | 13 -
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 47ac1b5..5b7f9ba 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -302,8 +302,13 @@ static void pc_init_pci(MachineState *machine)
 pc_init1(machine, 1, 1);
 }
 
+static void pc_compat_2_1(MachineState *machine)
+{
+}
+
 static void pc_compat_2_0(MachineState *machine)
 {
+pc_compat_2_1(machine);
 /* This value depends on the actual DSDT and SSDT compiled into
  * the source QEMU; unfortunately it depends on the binary and
  * not on the machine type, so we cannot make pc-i440fx-1.7 work on
@@ -367,6 +372,12 @@ static void pc_compat_1_2(MachineState *machine)
 x86_cpu_compat_disable_kvm_features(FEAT_KVM, KVM_FEATURE_PV_EOI);
 }
 
+static void pc_init_pci_2_1(MachineState *machine)
+{
+pc_compat_2_1(machine);
+pc_init_pci(machine);
+}
+
 static void pc_init_pci_2_0(MachineState *machine)
 {
 pc_compat_2_0(machine);
@@ -470,7 +481,7 @@ static QEMUMachine pc_i440fx_machine_v2_2 = {
 static QEMUMachine pc_i440fx_machine_v2_1 = {
 PC_I440FX_2_1_MACHINE_OPTIONS,
 .name = "pc-i440fx-2.1",
-.init = pc_init_pci,
+.init = pc_init_pci_2_1,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_2_1,
 { /* end of list */ }
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 4b5a274..602daa8 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -276,8 +276,13 @@ static void pc_q35_init(MachineState *machine)
 }
 }
 
+static void pc_compat_2_1(MachineState *machine)
+{
+}
+
 static void pc_compat_2_0(MachineState *machine)
 {
+pc_compat_2_1(machine);
 smbios_legacy_mode = true;
 has_reserved_memory = false;
 }
@@ -310,6 +315,12 @@ static void pc_compat_1_4(MachineState *machine)
 x86_cpu_compat_set_features("Westmere", FEAT_1_ECX, 0, CPUID_EXT_PCLMULQDQ);
 }
 
+static void pc_q35_init_2_1(MachineState *machine)
+{
+pc_compat_2_1(machine);
+pc_q35_init(machine);
+}
+
 static void pc_q35_init_2_0(MachineState *machine)
 {
 pc_compat_2_0(machine);
@@ -361,7 +372,7 @@ static QEMUMachine pc_q35_machine_v2_2 = {
 static QEMUMachine pc_q35_machine_v2_1 = {
 PC_Q35_2_1_MACHINE_OPTIONS,
 .name = "pc-q35-2.1",
-.init = pc_q35_init,
+.init = pc_q35_init_2_1,
 .compat_props = (GlobalProperty[]) {
 PC_COMPAT_2_1,
 { /* end of list */ }
-- 
1.9.3
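
The chaining pattern in this patch, where each older compat function first calls the next newer one, can be illustrated with a small model. The struct and function names below are hypothetical stand-ins, not QEMU code; the point is only that a 2.0 machine inherits whatever pc_compat_2_1() later grows, while a 2.1 machine gets the 2.1 compat changes alone.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the compat functions adjust. */
struct sim_machine {
    bool compat_2_1_applied;
    bool smbios_legacy_mode;
};

/* Compat changes needed by 2.1 and every older machine type. */
static void sim_compat_2_1(struct sim_machine *m)
{
    m->compat_2_1_applied = true;
}

/* 2.0 applies the 2.1 compat changes first, then its own. */
static void sim_compat_2_0(struct sim_machine *m)
{
    sim_compat_2_1(m);
    m->smbios_legacy_mode = true;
}

/* Machine init entry points: compat first, then (omitted) common init. */
static void sim_init_2_1(struct sim_machine *m)
{
    sim_compat_2_1(m);
}

static void sim_init_2_0(struct sim_machine *m)
{
    sim_compat_2_0(m);
}
```

With this layout, adding a behaviour change in 2.2 means touching only the 2.1 compat function; all older machine types pick it up through the chain.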



[PATCH v2 0/6] target-i386: Make most CPU models work with enforce out of the box

2014-08-25 Thread Eduardo Habkost
Changes v1 -> v2:
 * Commit message and comment changes.
 * Update compat code to change pc-*-2.1, not pc-*-2.0.
 * Added patch to disable SVM by default in KVM mode.

Most of the bits that make "enforce" break were introduced in 2010 by commit
8560efed6a72a816c0115f41ddb9d79f7ce63f28. The intention behind that commit made
sense; the only problem is that we can't guarantee guest ABI stability across
hosts if we simply rely on trimming of CPU features based on host capabilities.

So, this series removes CPUID bits from the CPU model definitions so they become
defaults that: 1) won't unexpectedly stop working when we start using the
"enforce" flag; 2) won't silently break the guest ABI when TCG or KVM start
supporting new features.

There's only one non-trivial case left: the qemu32/qemu64 models. The problem
with them is that we have conflicting expectations about them, from different
users:

TCG users expect the default CPU model to contain most TCG-supported features
(and it makes sense). See, for example, commit
f1e00a9cf326acc1f2386a72525af8859852e1df.

KVM users expect the default CPU model to be a conservative choice which will
work on most host CPUs (and will only contain features that are supported by
KVM).

We could solve the qemu32/qemu64 issue by having different defaults for TCG and
KVM. But we have existing management code (libvirt) that already expects qemu32
or qemu64 to be the default, and changing the default would break that code. I
will send an RFC to address that later.

Cc: Aurelien Jarno aurel...@aurel32.net
Cc: Paolo Bonzini pbonz...@redhat.com
Cc: kvm@vger.kernel.org

Eduardo Habkost (6):
  pc: Create pc_compat_2_1() functions
  target-i386: Rename KVM auto-feature-enable compat function
  target-i386: Disable CPUID_ACPI by default on KVM mode
  target-i386: Remove unsupported bits from all CPU models
  target-i386: Don't enable nested VMX by default
  target-i386: Disable SVM by default in KVM mode

 hw/i386/pc_piix.c | 22 ++
 hw/i386/pc_q35.c  | 18 --
 target-i386/cpu.c | 42 --
 target-i386/cpu.h |  3 ++-
 4 files changed, 64 insertions(+), 21 deletions(-)

-- 
1.9.3



[PATCH v2 4/6] target-i386: Remove unsupported bits from all CPU models

2014-08-25 Thread Eduardo Habkost
The following CPU features were never supported by either TCG or KVM,
so they are useless in the CPU model definitions today:

 * CPUID_DTS (DS)
 * CPUID_HT
 * CPUID_TM
 * CPUID_PBE
 * CPUID_EXT_DTES64
 * CPUID_EXT_DSCPL
 * CPUID_EXT_EST
 * CPUID_EXT_TM2
 * CPUID_EXT_XTPR
 * CPUID_EXT_PDCM
 * CPUID_SVM_LBRV

As using enforce mode is the only way to ensure guest ABI doesn't
change when moving to a different host, we should make enforce mode
the default or at least encourage management software to always use it.

In turn, to make enforce usable, we need CPU models that work without
always requiring some features to be explicitly disabled. This patch
removes the above features from all CPU model definitions.

We won't need any machine-type compat code for those changes, because it
is impossible to have existing VMs with those features enabled.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
Cc: Aurelien Jarno aurel...@aurel32.net
---
Changes v1 -> v2:
* Trivial typo fix in comment
---
 target-i386/cpu.c | 33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index b7fc6e0..6f26169 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -680,10 +680,11 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .family = 16,
 .model = 2,
 .stepping = 3,
+/* Missing: CPUID_HT */
 .features[FEAT_1_EDX] =
 PPRO_FEATURES |
 CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA |
-CPUID_PSE36 | CPUID_VME | CPUID_HT,
+CPUID_PSE36 | CPUID_VME,
 .features[FEAT_1_ECX] =
 CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_CX16 |
 CPUID_EXT_POPCNT,
@@ -699,8 +700,9 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .features[FEAT_8000_0001_ECX] =
 CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM |
 CPUID_EXT3_ABM | CPUID_EXT3_SSE4A,
+/* Missing: CPUID_SVM_LBRV */
 .features[FEAT_SVM] =
-CPUID_SVM_NPT | CPUID_SVM_LBRV,
+CPUID_SVM_NPT,
 .xlevel = 0x8000001A,
 .model_id = "AMD Phenom(tm) 9550 Quad-Core Processor"
 },
@@ -711,15 +713,16 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .family = 6,
 .model = 15,
 .stepping = 11,
+/* Missing: CPUID_DTS, CPUID_HT, CPUID_TM, CPUID_PBE */
 .features[FEAT_1_EDX] =
 PPRO_FEATURES |
 CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA |
-CPUID_PSE36 | CPUID_VME | CPUID_DTS | CPUID_ACPI | CPUID_SS |
-CPUID_HT | CPUID_TM | CPUID_PBE,
+CPUID_PSE36 | CPUID_VME | CPUID_ACPI | CPUID_SS,
+/* Missing: CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_EST,
+ * CPUID_EXT_TM2, CPUID_EXT_XTPR, CPUID_EXT_PDCM */
 .features[FEAT_1_ECX] =
 CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_SSSE3 |
-CPUID_EXT_DTES64 | CPUID_EXT_DSCPL | CPUID_EXT_VMX | CPUID_EXT_EST |
-CPUID_EXT_TM2 | CPUID_EXT_CX16 | CPUID_EXT_XTPR | CPUID_EXT_PDCM,
+CPUID_EXT_VMX | CPUID_EXT_CX16,
 .features[FEAT_8000_0001_EDX] =
 CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
 .features[FEAT_8000_0001_ECX] =
@@ -794,13 +797,15 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .family = 6,
 .model = 14,
 .stepping = 8,
+/* Missing: CPUID_DTS, CPUID_HT, CPUID_TM, CPUID_PBE */
 .features[FEAT_1_EDX] =
 PPRO_FEATURES | CPUID_VME |
-CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA | CPUID_DTS | CPUID_ACPI |
-CPUID_SS | CPUID_HT | CPUID_TM | CPUID_PBE,
+CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA | CPUID_ACPI |
+CPUID_SS,
+/* Missing: CPUID_EXT_EST, CPUID_EXT_TM2 , CPUID_EXT_XTPR,
+ * CPUID_EXT_PDCM */
 .features[FEAT_1_ECX] =
-CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_VMX |
-CPUID_EXT_EST | CPUID_EXT_TM2 | CPUID_EXT_XTPR | CPUID_EXT_PDCM,
+CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_VMX,
 .features[FEAT_8000_0001_EDX] =
 CPUID_EXT2_NX,
 .xlevel = 0x80000008,
@@ -873,14 +878,16 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .family = 6,
 .model = 28,
 .stepping = 2,
+/* Missing: CPUID_DTS, CPUID_HT, CPUID_TM, CPUID_PBE */
 .features[FEAT_1_EDX] =
 PPRO_FEATURES |
-CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA | CPUID_VME | CPUID_DTS |
-CPUID_ACPI | CPUID_SS | CPUID_HT | CPUID_TM | CPUID_PBE,
+CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA | CPUID_VME |
+CPUID_ACPI | CPUID_SS,
 /* Some CPUs got no CPUID_SEP */
+/* Missing: CPUID_EXT_DSCPL, CPUID_EXT_EST, CPUID_EXT_TM2,
+ * CPUID_EXT_XTPR */
 .features[FEAT_1_ECX] =
 CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_SSSE3 |
-

[PATCH v2 5/6] target-i386: Don't enable nested VMX by default

2014-08-25 Thread Eduardo Habkost
TCG doesn't support VMX, and nested VMX is not enabled by default on the
KVM kernel module.

So, there's no reason to have VMX enabled by default on the core2duo and
coreduo CPU models, today. Even the newer Intel CPU model definitions
don't have it enabled.

In this case, we need machine-type compat code, as people may be running
the older machine-types on hosts that had VMX nesting enabled.
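
As a rough sketch of what the machine-type compat hook does (this is not the
QEMU implementation; the table, struct and function names below are
hypothetical), an older machine-type looks up a CPU model by name and puts
the old default bit back:

```c
#include <stdint.h>
#include <string.h>

#define BIT_VMX (1u << 5)   /* VMX is CPUID.1:ECX bit 5 */

/* Hypothetical, simplified model table: one feature word per model. */
struct model {
    const char *name;
    uint32_t ecx;
};

static struct model models[] = {
    { "coreduo",  0 },   /* new default: VMX not set */
    { "core2duo", 0 },
};

/* Add and remove bits on a named model; returns -1 if the name is unknown. */
static int compat_set_features(const char *name, uint32_t add, uint32_t remove)
{
    for (size_t i = 0; i < sizeof(models) / sizeof(models[0]); i++) {
        if (strcmp(models[i].name, name) == 0) {
            models[i].ecx |= add;
            models[i].ecx &= ~remove;
            return 0;
        }
    }
    return -1;
}
```

A 2.1-style compat function would call compat_set_features("coreduo",
BIT_VMX, 0) so that guests started with the older machine-type keep seeing
the bit they saw before.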

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
 hw/i386/pc_piix.c | 2 ++
 hw/i386/pc_q35.c  | 2 ++
 target-i386/cpu.c | 8 
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6ee8dfa..c6db762 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -304,6 +304,8 @@ static void pc_init_pci(MachineState *machine)
 
 static void pc_compat_2_1(MachineState *machine)
 {
+x86_cpu_compat_set_features("coreduo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
+x86_cpu_compat_set_features("core2duo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
 }
 
 static void pc_compat_2_0(MachineState *machine)
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 55fc62f..be84352 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -278,6 +278,8 @@ static void pc_q35_init(MachineState *machine)
 
 static void pc_compat_2_1(MachineState *machine)
 {
+x86_cpu_compat_set_features("coreduo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
+x86_cpu_compat_set_features("core2duo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
 }
 
 static void pc_compat_2_0(MachineState *machine)
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 6f26169..011316d 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -719,10 +719,10 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA |
 CPUID_PSE36 | CPUID_VME | CPUID_ACPI | CPUID_SS,
 /* Missing: CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_EST,
- * CPUID_EXT_TM2, CPUID_EXT_XTPR, CPUID_EXT_PDCM */
+ * CPUID_EXT_TM2, CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_VMX */
 .features[FEAT_1_ECX] =
 CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_SSSE3 |
-CPUID_EXT_VMX | CPUID_EXT_CX16,
+CPUID_EXT_CX16,
 .features[FEAT_8000_0001_EDX] =
 CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
 .features[FEAT_8000_0001_ECX] =
@@ -803,9 +803,9 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA | CPUID_ACPI |
 CPUID_SS,
 /* Missing: CPUID_EXT_EST, CPUID_EXT_TM2 , CPUID_EXT_XTPR,
- * CPUID_EXT_PDCM */
+ * CPUID_EXT_PDCM, CPUID_EXT_VMX */
 .features[FEAT_1_ECX] =
-CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | CPUID_EXT_VMX,
+CPUID_EXT_SSE3 | CPUID_EXT_MONITOR,
 .features[FEAT_8000_0001_EDX] =
 CPUID_EXT2_NX,
 .xlevel = 0x80000008,
-- 
1.9.3



[PATCH v2 2/6] target-i386: Rename KVM auto-feature-enable compat function

2014-08-25 Thread Eduardo Habkost
The x86_cpu_compat_disable_kvm_features() name was a bit confusing, as
it won't forcibly disable the feature for all CPU models (i.e. add it to
kvm_default_unset_features), but it will instead turn off the KVM
auto-enabling of the feature (i.e. remove it from kvm_default_features),
meaning the feature may still be enabled by default in some CPU models.
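
The distinction can be sketched as follows. This is a simplified
illustration, not the code in target-i386/cpu.c: it uses a single feature
word, made-up bit values, and hypothetical names (auto_enable, auto_disable,
apply_kvm_defaults, no_autoenable) standing in for kvm_default_features,
kvm_default_unset_features and the renamed compat function.

```c
#include <stdint.h>

/* Illustrative bit values only, not the real CPUID encodings. */
#define BIT_X2APIC   (1u << 0)
#define BIT_MONITOR  (1u << 1)

/* Plays the role of kvm_default_features: bits KVM mode turns on. */
static uint32_t auto_enable = BIT_X2APIC;
/* Plays the role of kvm_default_unset_features: bits KVM mode turns off. */
static uint32_t auto_disable = BIT_MONITOR;

/* Apply the KVM-mode defaults to one feature word of a CPU model. */
static uint32_t apply_kvm_defaults(uint32_t model_bits)
{
    return (model_bits | auto_enable) & ~auto_disable;
}

/* Stop auto-enabling a bit; models that list it themselves keep it. */
static void no_autoenable(uint32_t bits)
{
    auto_enable &= ~bits;
}
```

After no_autoenable(BIT_X2APIC), a model that does not list the bit no longer
gets it, but a model that lists the bit explicitly still does. Forcibly
disabling it would instead mean adding the bit to auto_disable.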

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
 hw/i386/pc_piix.c | 6 +++---
 hw/i386/pc_q35.c  | 2 +-
 target-i386/cpu.c | 2 +-
 target-i386/cpu.h | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 5b7f9ba..6ee8dfa 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -337,7 +337,7 @@ static void pc_compat_1_7(MachineState *machine)
 gigabyte_align = false;
 option_rom_has_mr = true;
 legacy_acpi_table_size = 6414;
-x86_cpu_compat_disable_kvm_features(FEAT_1_ECX, CPUID_EXT_X2APIC);
+x86_cpu_compat_kvm_no_autoenable(FEAT_1_ECX, CPUID_EXT_X2APIC);
 }
 
 static void pc_compat_1_6(MachineState *machine)
@@ -369,7 +369,7 @@ static void pc_compat_1_3(MachineState *machine)
 static void pc_compat_1_2(MachineState *machine)
 {
 pc_compat_1_3(machine);
-x86_cpu_compat_disable_kvm_features(FEAT_KVM, KVM_FEATURE_PV_EOI);
+x86_cpu_compat_kvm_no_autoenable(FEAT_KVM, KVM_FEATURE_PV_EOI);
 }
 
 static void pc_init_pci_2_1(MachineState *machine)
@@ -440,7 +440,7 @@ static void pc_init_isa(MachineState *machine)
 if (!machine->cpu_model) {
 machine->cpu_model = "486";
 }
-x86_cpu_compat_disable_kvm_features(FEAT_KVM, KVM_FEATURE_PV_EOI);
+x86_cpu_compat_kvm_no_autoenable(FEAT_KVM, KVM_FEATURE_PV_EOI);
 enable_compat_apic_id_mode();
 pc_init1(machine, 0, 1);
 }
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 602daa8..55fc62f 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -293,7 +293,7 @@ static void pc_compat_1_7(MachineState *machine)
 smbios_defaults = false;
 gigabyte_align = false;
 option_rom_has_mr = true;
-x86_cpu_compat_disable_kvm_features(FEAT_1_ECX, CPUID_EXT_X2APIC);
+x86_cpu_compat_kvm_no_autoenable(FEAT_1_ECX, CPUID_EXT_X2APIC);
 }
 
 static void pc_compat_1_6(MachineState *machine)
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 217500c..0396410 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -464,7 +464,7 @@ static uint32_t kvm_default_unset_features[FEATURE_WORDS] = {
 [FEAT_1_ECX] = CPUID_EXT_MONITOR,
 };
 
-void x86_cpu_compat_disable_kvm_features(FeatureWord w, uint32_t features)
+void x86_cpu_compat_kvm_no_autoenable(FeatureWord w, uint32_t features)
 {
 kvm_default_features[w] &= ~features;
 }
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index e634d83..346eac1 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -1300,7 +1300,7 @@ void cpu_report_tpr_access(CPUX86State *env, TPRAccess access);
 void x86_cpu_compat_set_features(const char *cpu_model, FeatureWord w,
  uint32_t feat_add, uint32_t feat_remove);
 
-void x86_cpu_compat_disable_kvm_features(FeatureWord w, uint32_t features);
+void x86_cpu_compat_kvm_no_autoenable(FeatureWord w, uint32_t features);
 
 
 /* Return name of 32-bit register, from a R_* constant */
-- 
1.9.3



[PATCH v2 6/6] target-i386: Disable SVM by default in KVM mode

2014-08-25 Thread Eduardo Habkost
Make SVM be disabled by default on all CPU models when in KVM mode.
Nested SVM is enabled by default in the KVM kernel module, but it is
probably less stable than nested VMX (which is already disabled by
default).

Add a new compat function, x86_cpu_compat_kvm_no_autodisable(), to keep
compatibility on previous machine-types.

Suggested-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
 hw/i386/pc_piix.c | 1 +
 hw/i386/pc_q35.c  | 1 +
 target-i386/cpu.c | 6 ++
 target-i386/cpu.h | 1 +
 4 files changed, 9 insertions(+)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index c6db762..87f5b81 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -306,6 +306,7 @@ static void pc_compat_2_1(MachineState *machine)
 {
 x86_cpu_compat_set_features("coreduo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
 x86_cpu_compat_set_features("core2duo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
+x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
 }
 
 static void pc_compat_2_0(MachineState *machine)
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index be84352..5736f8a 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -280,6 +280,7 @@ static void pc_compat_2_1(MachineState *machine)
 {
 x86_cpu_compat_set_features("coreduo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
 x86_cpu_compat_set_features("core2duo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
+x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
 }
 
 static void pc_compat_2_0(MachineState *machine)
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 011316d..d3f40f5 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -463,6 +463,7 @@ static uint32_t kvm_default_features[FEATURE_WORDS] = {
 static uint32_t kvm_default_unset_features[FEATURE_WORDS] = {
 [FEAT_1_EDX] = CPUID_ACPI,
 [FEAT_1_ECX] = CPUID_EXT_MONITOR,
+[FEAT_8000_0001_ECX] = CPUID_EXT3_SVM,
 };
 
 void x86_cpu_compat_kvm_no_autoenable(FeatureWord w, uint32_t features)
@@ -470,6 +471,11 @@ void x86_cpu_compat_kvm_no_autoenable(FeatureWord w, uint32_t features)
 kvm_default_features[w] &= ~features;
 }
 
+void x86_cpu_compat_kvm_no_autodisable(FeatureWord w, uint32_t features)
+{
+kvm_default_unset_features[w] &= ~features;
+}
+
 /*
  * Returns the set of feature flags that are supported and migratable by
  * QEMU, for a given FeatureWord.
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 346eac1..f496571 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -1301,6 +1301,7 @@ void x86_cpu_compat_set_features(const char *cpu_model, FeatureWord w,
  uint32_t feat_add, uint32_t feat_remove);
 
 void x86_cpu_compat_kvm_no_autoenable(FeatureWord w, uint32_t features);
+void x86_cpu_compat_kvm_no_autodisable(FeatureWord w, uint32_t features);
 
 
 /* Return name of 32-bit register, from a R_* constant */
-- 
1.9.3



[PATCH v2 3/6] target-i386: Disable CPUID_ACPI by default on KVM mode

2014-08-25 Thread Eduardo Habkost
KVM never supported the CPUID_ACPI flag, so it doesn't make sense to
have it enabled by default when KVM is enabled.

The motivation here is exactly the same as we had for the MONITOR flag
(disabled by commit 136a7e9a85d7047461f8153f7d12c514a3d68f69).

And as in the MONITOR flag case, we don't need machine-type compat code
because it is currently impossible to run a KVM VM with the ACPI flag set.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
 target-i386/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 0396410..b7fc6e0 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -461,6 +461,7 @@ static uint32_t kvm_default_features[FEATURE_WORDS] = {
 /* Features that are not added by default to any CPU model when KVM is enabled.
  */
 static uint32_t kvm_default_unset_features[FEATURE_WORDS] = {
+[FEAT_1_EDX] = CPUID_ACPI,
 [FEAT_1_ECX] = CPUID_EXT_MONITOR,
 };
 
-- 
1.9.3



Re: [RFC 0/5] vfio: AMBA devices support

2014-08-25 Thread Kim Phillips
On Fri, 22 Aug 2014 11:01:23 +0200
Antonios Motakis a.mota...@virtualopensystems.com wrote:

 This patch series depends on the VFIO for PLATFORM devices patch series,
 and implements AMBA device support for VFIO.

...

  drivers/amba/bus.c|  43 +++
  drivers/vfio/platform/Kconfig |  10 +
  drivers/vfio/platform/Makefile|   6 +-
  drivers/vfio/platform/vfio_amba.c | 129 +
  drivers/vfio/platform/vfio_platform.c | 355 +---
  drivers/vfio/platform/vfio_platform_common.c  | 380 ++
  drivers/vfio/platform/vfio_platform_irq.c |   6 +-
  drivers/vfio/platform/vfio_platform_private.h |   9 +-
  include/linux/amba/bus.h  |   1 +
  9 files changed, 591 insertions(+), 348 deletions(-)
  create mode 100644 drivers/vfio/platform/vfio_amba.c
  create mode 100644 drivers/vfio/platform/vfio_platform_common.c

I think this patchseries should be merged with its dependent
patchseries, in order to not have to review the common vfio_platform
code twice.

just my 2c.

Thanks,

Kim


Re: [PATCH] KVM-Use value reading from MSR when construct the eptp in VMX mode

2014-08-25 Thread Dennis Chen
On Mon, Aug 25, 2014 at 10:04 PM, Gleb Natapov g...@kernel.org wrote:
 On Mon, Aug 25, 2014 at 11:16:34AM +0800, Dennis Chen wrote:
 On Sun, Aug 24, 2014 at 5:38 PM, Gleb Natapov g...@kernel.org wrote:
  On Sun, Aug 24, 2014 at 11:54:32AM +0800, Dennis Chen wrote:
  This patch is used to construct the eptp in vmx mode with values
  read from MSR according to the Intel x86 software developer's
  manual.
 
   static u64 construct_eptp(unsigned long root_hpa)
   {
  -u64 eptp;
  +u64 eptp, pwl;
  +
  +if (cpu_has_vmx_ept_4levels())
  +pwl = VMX_EPT_DEFAULT_GAW << VMX_EPT_GAW_EPTP_SHIFT;
  +else {
  +WARN(1, "Unsupported page-walk length of 4.\n");
  Page-walk length of 4 is the only one that is supported.
 
 Since there is a bit 6 in IA32_VMX_EPT_VPID_CAP MSR indicating the
 support for the page-walk length, I think a sanity check is necessary.
 But I just checked the code, and it's already done in the hardware_setup()
 function, which will disable the ept feature if the page-walk length is not
 4. Gleb, any comments on the memory type check part?
 Looks fine, but are there CPUs out there that do not support WB for eptp?
 Since there were no bug reports about it I assume no.

Hmm, currently I can't find an x86 processor that doesn't support WB for
eptp, and no relevant bug has been reported either.
I just read the Intel SDM 24.6.11: software should read the VMX capability
MSR_IA32_VMX_EPT_VPID_CAP to determine which EPT memory types are supported.
But it looks like this is not a big concern in the community, so let's come
back to this thread if we unfortunately encounter such a CPU in the future.
Thanks for the comments.
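
For the record, the kind of check discussed above could look roughly like
this. It is only a sketch, not the kernel's construct_eptp(): build_eptp()
and the macro names are made up here, the capability value is passed in as a
plain integer instead of being read with rdmsr, and the bit positions are the
ones I understand from the SDM (capability bit 6 for page-walk length 4, bit
8 for UC, bit 14 for WB; EPTP bits 2:0 for the memory type and bits 5:3 for
the page-walk length minus one).

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_VMX_EPT_VPID_CAP bits (hypothetical macro names). */
#define EPT_CAP_PAGE_WALK_4  (1ull << 6)
#define EPT_CAP_MT_UC        (1ull << 8)
#define EPT_CAP_MT_WB        (1ull << 14)

/* EPTP field encodings. */
#define EPTP_MT_UC   0ull
#define EPTP_MT_WB   6ull
#define EPTP_PWL_4   (3ull << 3)   /* page-walk length minus one, bits 5:3 */

/*
 * Build an EPTP from the capability MSR value and the root table address.
 * Prefers WB, falls back to UC, and fails if neither memory type or the
 * 4-level page walk is advertised.
 */
static bool build_eptp(uint64_t cap, uint64_t root_hpa, uint64_t *eptp)
{
    uint64_t mt;

    if (!(cap & EPT_CAP_PAGE_WALK_4)) {
        return false;
    }

    if (cap & EPT_CAP_MT_WB) {
        mt = EPTP_MT_WB;
    } else if (cap & EPT_CAP_MT_UC) {
        mt = EPTP_MT_UC;
    } else {
        return false;
    }

    /* The low 12 bits of the root address are reused for the fields. */
    *eptp = (root_hpa & ~0xfffull) | EPTP_PWL_4 | mt;
    return true;
}
```

On every CPU seen so far the WB branch is taken, which matches the
observation that nobody has reported a problem with the hard-coded default.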


 --
 Gleb.



-- 
Den


Re: [PATCH] KVM: x86: fix xen guest panic due to lack of KVM_REQ_EVENT

2014-08-25 Thread Wanpeng Li
On Mon, Aug 25, 2014 at 11:16:16AM +0200, Paolo Bonzini wrote:
On 25/08/2014 11:08, Wanpeng Li wrote:
 Hi Paolo,
 On Mon, Aug 25, 2014 at 11:01:07AM +0200, Paolo Bonzini wrote:
 On 25/08/2014 09:58, Wanpeng Li wrote:
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index c10408e..b7c0073 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4928,6 +4928,8 @@ static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
if (!mask)
kvm_make_request(KVM_REQ_EVENT, vcpu);
}
 +  if (!(int_shadow || mask))
 +  kvm_make_request(KVM_REQ_EVENT, vcpu);
  }
  
  static void inject_emulated_exception(struct kvm_vcpu *vcpu)

 No, this patch undoes the optimization in the buggy patch.

 A KVM_REQ_EVENT must be missing somewhere else.

 
 Could you give some tips in order that I can figure it out?

I have no idea right now (I was planning to debug it this week).

(BTW, look at the original commit that introduced KVM_REQ_EVENT --
https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?id=3842d135 -- and
compare the patch and the commit message.  You can see that it was added
to the emulator because it is a place that can set EFLAGS and this
idea is preserved in the buggy patch).


From the Xen code that reports the panic:

check_timer 
timer_irq_works

local_save_flags(flags);  = pushf;pop
local_irq_enable();   = sti 
delay  
local_irq_restore(flags); = pushfq;andq;orq;popfq 

Regards,
Wanpeng Li 

The important thing is that (despite Xen being involved) this is not
related to nested virtualization.  So I would first of all try to see if
some module parameter makes it go away (apicv and unrestricted mode
especially), then capture a trace of the panic.  At least this is how I
was planning to start... :)

Paolo