[Bug 60629] Starting a virtual machine ater suspend causes the host system hangup

2014-07-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=60629

ffsi...@yandex.ru changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from ffsi...@yandex.ru ---
Problem solved by updating CPU microcode in BIOS ROM. Somewhat updating
microcode by kernel not solves teh problem.
Now after suspend i have all my CPU cores have right model names.
Sorry for that "bug".

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6 v2] kvm: ppc: bookehv: Added wrapper macros for shadow registers

2014-07-17 Thread Scott Wood
On Thu, 2014-07-17 at 17:01 +0530, Bharat Bhushan wrote:
> There are shadow registers like, GSPRG[0-3], GSRR0, GSRR1 etc on
> BOOKE-HV and these shadow registers are guest accessible.
> So these shadow registers needs to be updated on BOOKE-HV.
> This patch adds new macro for get/set helper of shadow register .
> 
> Signed-off-by: Bharat Bhushan 
> ---
> v1->v2
>  - Fix compilation for book3s (separate macro etc)
> 
>  arch/powerpc/include/asm/kvm_ppc.h | 44 
> +++---
>  1 file changed, 36 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index f3f7611..7646994 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -475,8 +475,20 @@ static inline bool kvmppc_shared_big_endian(struct 
> kvm_vcpu *vcpu)
>  #endif
>  }
>  
> +#define SPRNG_WRAPPER_GET(reg, e500hv_spr)   \
> +static inline ulong kvmppc_get_##reg(struct kvm_vcpu *vcpu)  \
> +{\
> + return mfspr(e500hv_spr);   \
> +}\
> +
> +#define SPRNG_WRAPPER_SET(reg, e500hv_spr)   \
> +static inline void kvmppc_set_##reg(struct kvm_vcpu *vcpu, ulong val)
> \
> +{\
> + mtspr(e500hv_spr, val); \
> +}\

Why "e500hv" rather than "bookehv"?

-Scott


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Scott Wood
On Fri, 2014-07-18 at 02:37 +0200, Alexander Graf wrote:
> On 18.07.14 02:36, Scott Wood wrote:
> > On Fri, 2014-07-18 at 02:33 +0200, Alexander Graf wrote:
> >> On 18.07.14 02:28, Scott Wood wrote:
> >>> On Thu, 2014-07-17 at 18:29 +0200, Alexander Graf wrote:
>  On 17.07.14 18:27, Alexander Graf wrote:
> > On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:
> >>> -Original Message-
> >>> From: Alexander Graf [mailto:ag...@suse.de]
> >>> Sent: Thursday, July 17, 2014 9:41 PM
> >>> To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
> >>> Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
> >>> Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering 
> >>> guest
> >>>
> >>>
> >>> On 16.07.14 08:02, Bharat Bhushan wrote:
>  SPRG3 is guest accessible and SPRG3 can be clobbered by host or
>  another guest, So this need to be restored when loading guest state.
> >>> SPRG3 is not guest writeable.  We should be doing this so that guest
> >>> reads of SPRG3 through the alternative read-only SPR work, not because
> >>> "SPRG3 can be clobbered by host or another guest".
> >>>
>  Signed-off-by: Bharat Bhushan 
>  ---
>   arch/powerpc/kvm/booke_interrupts.S | 2 ++
>   1 file changed, 2 insertions(+)
> 
>  diff --git a/arch/powerpc/kvm/booke_interrupts.S
>  b/arch/powerpc/kvm/booke_interrupts.S
>  index 2c6deb5ef..0d3403f 100644
>  --- a/arch/powerpc/kvm/booke_interrupts.S
>  +++ b/arch/powerpc/kvm/booke_interrupts.S
>  @@ -459,6 +459,8 @@ lightweight_exit:
>    * written directly to the shared area, so we
>    * need to reload them here with the guest's values.
>    */
>  +PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
>  +mtsprSPRN_SPRG3, r3
> >>> We also need to restore it when resuming the host, no?
> >> I do not think host expect some meaningful value when returning from
> >> guest, same true for SPRG4-7.
> >> So there seems no reason to save host values and restore them.
> >>> Linux no longer uses SPRG4-7 for itself.  That is not true of SPRG3, as
> >>> Alex points out.
> >>>
> > Hmm - arch/powerpc/include/asm/reg.h says:
> >
> >* All 32-bit:
> >*  - SPRG3 current thread_info pointer
> >*(virtual on BookE, physical on others)
> >
> > but I can indeed find no trace of usage anywhere. This at least needs
> > to go into the patch description.
>  Bah - it obviously is used. It's SPRN_SPRG_THREAD. And it's so
>  incredibly important that I have no idea how we could possibly run
>  without switching the host value back in very early. And even then our
>  interrupt handlers wouldn't work anymore.
> 
>  This is more complicated :).
> >>> To make this work we need to avoid SPRG3 as well, or at least avoid
> >>> using it for something needed prior to DO_KVM.
> >>>
> >>> We also need to update the documentation in reg.h to reflect the fact
> >>> that we don't use SPRG4-7 anymore on e500.
> >> I would personally prefer if we claim SPRG3R as unsupported on e500v2
> >> until we find someone who actually uses it. There's a good chance we'd
> >> start jumping through a lot of hoops and reduce overall performance for
> >> no real-world gain today.
> > The same problem applies to e500mc.
> 
> There we have SPRN_GSPRG3, no?

Oh, right.

Since it's only a problem for PR-mode, it can be fixed without needing
to avoid SPRG3 entirely, since PR-mode doesn't use DO_KVM.  We'd only
need to avoid using SPRG_THREAD in __KVM_HANDLER (i.e. revert commit
ffe129ecd79779221fdb03305049ec8b5a8beb0f).

And if we decide it's not worthwhile and don't revert that commit, we
should at least remove the comment that "Under KVM, the host SPRG1 is
used to point to the current VCPU data structure"...

-Scott


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Alexander Graf


On 18.07.14 02:36, Scott Wood wrote:

On Fri, 2014-07-18 at 02:33 +0200, Alexander Graf wrote:

On 18.07.14 02:28, Scott Wood wrote:

On Thu, 2014-07-17 at 18:29 +0200, Alexander Graf wrote:

On 17.07.14 18:27, Alexander Graf wrote:

On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:

-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, July 17, 2014 9:41 PM
To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest


On 16.07.14 08:02, Bharat Bhushan wrote:

SPRG3 is guest accessible and SPRG3 can be clobbered by host or
another guest, So this need to be restored when loading guest state.

SPRG3 is not guest writeable.  We should be doing this so that guest
reads of SPRG3 through the alternative read-only SPR work, not because
"SPRG3 can be clobbered by host or another guest".


Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kvm/booke_interrupts.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/booke_interrupts.S
b/arch/powerpc/kvm/booke_interrupts.S
index 2c6deb5ef..0d3403f 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -459,6 +459,8 @@ lightweight_exit:
  * written directly to the shared area, so we
  * need to reload them here with the guest's values.
  */
+PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
+mtsprSPRN_SPRG3, r3

We also need to restore it when resuming the host, no?

I do not think host expect some meaningful value when returning from
guest, same true for SPRG4-7.
So there seems no reason to save host values and restore them.

Linux no longer uses SPRG4-7 for itself.  That is not true of SPRG3, as
Alex points out.


Hmm - arch/powerpc/include/asm/reg.h says:

   * All 32-bit:
   *  - SPRG3 current thread_info pointer
   *(virtual on BookE, physical on others)

but I can indeed find no trace of usage anywhere. This at least needs
to go into the patch description.

Bah - it obviously is used. It's SPRN_SPRG_THREAD. And it's so
incredibly important that I have no idea how we could possibly run
without switching the host value back in very early. And even then our
interrupt handlers wouldn't work anymore.

This is more complicated :).

To make this work we need to avoid SPRG3 as well, or at least avoid
using it for something needed prior to DO_KVM.

We also need to update the documentation in reg.h to reflect the fact
that we don't use SPRG4-7 anymore on e500.

I would personally prefer if we claim SPRG3R as unsupported on e500v2
until we find someone who actually uses it. There's a good chance we'd
start jumping through a lot of hoops and reduce overall performance for
no real-world gain today.

The same problem applies to e500mc.


There we have SPRN_GSPRG3, no?


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Scott Wood
On Fri, 2014-07-18 at 02:33 +0200, Alexander Graf wrote:
> On 18.07.14 02:28, Scott Wood wrote:
> > On Thu, 2014-07-17 at 18:29 +0200, Alexander Graf wrote:
> >> On 17.07.14 18:27, Alexander Graf wrote:
> >>> On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:
> > -Original Message-
> > From: Alexander Graf [mailto:ag...@suse.de]
> > Sent: Thursday, July 17, 2014 9:41 PM
> > To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
> > Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
> > Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest
> >
> >
> > On 16.07.14 08:02, Bharat Bhushan wrote:
> >> SPRG3 is guest accessible and SPRG3 can be clobbered by host or
> >> another guest, So this need to be restored when loading guest state.
> > SPRG3 is not guest writeable.  We should be doing this so that guest
> > reads of SPRG3 through the alternative read-only SPR work, not because
> > "SPRG3 can be clobbered by host or another guest".
> >
> >> Signed-off-by: Bharat Bhushan 
> >> ---
> >> arch/powerpc/kvm/booke_interrupts.S | 2 ++
> >> 1 file changed, 2 insertions(+)
> >>
> >> diff --git a/arch/powerpc/kvm/booke_interrupts.S
> >> b/arch/powerpc/kvm/booke_interrupts.S
> >> index 2c6deb5ef..0d3403f 100644
> >> --- a/arch/powerpc/kvm/booke_interrupts.S
> >> +++ b/arch/powerpc/kvm/booke_interrupts.S
> >> @@ -459,6 +459,8 @@ lightweight_exit:
> >>  * written directly to the shared area, so we
> >>  * need to reload them here with the guest's values.
> >>  */
> >> +PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
> >> +mtsprSPRN_SPRG3, r3
> > We also need to restore it when resuming the host, no?
>  I do not think host expect some meaningful value when returning from
>  guest, same true for SPRG4-7.
>  So there seems no reason to save host values and restore them.
> > Linux no longer uses SPRG4-7 for itself.  That is not true of SPRG3, as
> > Alex points out.
> >
> >>> Hmm - arch/powerpc/include/asm/reg.h says:
> >>>
> >>>   * All 32-bit:
> >>>   *  - SPRG3 current thread_info pointer
> >>>   *(virtual on BookE, physical on others)
> >>>
> >>> but I can indeed find no trace of usage anywhere. This at least needs
> >>> to go into the patch description.
> >> Bah - it obviously is used. It's SPRN_SPRG_THREAD. And it's so
> >> incredibly important that I have no idea how we could possibly run
> >> without switching the host value back in very early. And even then our
> >> interrupt handlers wouldn't work anymore.
> >>
> >> This is more complicated :).
> > To make this work we need to avoid SPRG3 as well, or at least avoid
> > using it for something needed prior to DO_KVM.
> >
> > We also need to update the documentation in reg.h to reflect the fact
> > that we don't use SPRG4-7 anymore on e500.
> 
> I would personally prefer if we claim SPRG3R as unsupported on e500v2 
> until we find someone who actually uses it. There's a good chance we'd 
> start jumping through a lot of hoops and reduce overall performance for 
> no real-world gain today.

The same problem applies to e500mc.

-Scott


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Alexander Graf


On 18.07.14 02:28, Scott Wood wrote:

On Thu, 2014-07-17 at 18:29 +0200, Alexander Graf wrote:

On 17.07.14 18:27, Alexander Graf wrote:

On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:

-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, July 17, 2014 9:41 PM
To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest


On 16.07.14 08:02, Bharat Bhushan wrote:

SPRG3 is guest accessible and SPRG3 can be clobbered by host or
another guest, So this need to be restored when loading guest state.

SPRG3 is not guest writeable.  We should be doing this so that guest
reads of SPRG3 through the alternative read-only SPR work, not because
"SPRG3 can be clobbered by host or another guest".


Signed-off-by: Bharat Bhushan 
---
arch/powerpc/kvm/booke_interrupts.S | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/booke_interrupts.S
b/arch/powerpc/kvm/booke_interrupts.S
index 2c6deb5ef..0d3403f 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -459,6 +459,8 @@ lightweight_exit:
 * written directly to the shared area, so we
 * need to reload them here with the guest's values.
 */
+PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
+mtsprSPRN_SPRG3, r3

We also need to restore it when resuming the host, no?

I do not think host expect some meaningful value when returning from
guest, same true for SPRG4-7.
So there seems no reason to save host values and restore them.

Linux no longer uses SPRG4-7 for itself.  That is not true of SPRG3, as
Alex points out.


Hmm - arch/powerpc/include/asm/reg.h says:

  * All 32-bit:
  *  - SPRG3 current thread_info pointer
  *(virtual on BookE, physical on others)

but I can indeed find no trace of usage anywhere. This at least needs
to go into the patch description.

Bah - it obviously is used. It's SPRN_SPRG_THREAD. And it's so
incredibly important that I have no idea how we could possibly run
without switching the host value back in very early. And even then our
interrupt handlers wouldn't work anymore.

This is more complicated :).

To make this work we need to avoid SPRG3 as well, or at least avoid
using it for something needed prior to DO_KVM.

We also need to update the documentation in reg.h to reflect the fact
that we don't use SPRG4-7 anymore on e500.


I would personally prefer if we claim SPRG3R as unsupported on e500v2 
until we find someone who actually uses it. There's a good chance we'd 
start jumping through a lot of hoops and reduce overall performance for 
no real-world gain today.



Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Scott Wood
On Thu, 2014-07-17 at 18:29 +0200, Alexander Graf wrote:
> On 17.07.14 18:27, Alexander Graf wrote:
> >
> > On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:
> >>
> >>> -Original Message-
> >>> From: Alexander Graf [mailto:ag...@suse.de]
> >>> Sent: Thursday, July 17, 2014 9:41 PM
> >>> To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
> >>> Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
> >>> Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest
> >>>
> >>>
> >>> On 16.07.14 08:02, Bharat Bhushan wrote:
>  SPRG3 is guest accessible and SPRG3 can be clobbered by host or
>  another guest, So this need to be restored when loading guest state.

SPRG3 is not guest writeable.  We should be doing this so that guest
reads of SPRG3 through the alternative read-only SPR work, not because
"SPRG3 can be clobbered by host or another guest".

> 
>  Signed-off-by: Bharat Bhushan 
>  ---
> arch/powerpc/kvm/booke_interrupts.S | 2 ++
> 1 file changed, 2 insertions(+)
> 
>  diff --git a/arch/powerpc/kvm/booke_interrupts.S
>  b/arch/powerpc/kvm/booke_interrupts.S
>  index 2c6deb5ef..0d3403f 100644
>  --- a/arch/powerpc/kvm/booke_interrupts.S
>  +++ b/arch/powerpc/kvm/booke_interrupts.S
>  @@ -459,6 +459,8 @@ lightweight_exit:
>  * written directly to the shared area, so we
>  * need to reload them here with the guest's values.
>  */
>  +PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
>  +mtsprSPRN_SPRG3, r3
> >>> We also need to restore it when resuming the host, no?
> >> I do not think host expect some meaningful value when returning from 
> >> guest, same true for SPRG4-7.
> >> So there seems no reason to save host values and restore them.

Linux no longer uses SPRG4-7 for itself.  That is not true of SPRG3, as
Alex points out.

> > Hmm - arch/powerpc/include/asm/reg.h says:
> >
> >  * All 32-bit:
> >  *  - SPRG3 current thread_info pointer
> >  *(virtual on BookE, physical on others)
> >
> > but I can indeed find no trace of usage anywhere. This at least needs 
> > to go into the patch description.
> 
> Bah - it obviously is used. It's SPRN_SPRG_THREAD. And it's so 
> incredibly important that I have no idea how we could possibly run 
> without switching the host value back in very early. And even then our 
> interrupt handlers wouldn't work anymore.
> 
> This is more complicated :).

To make this work we need to avoid SPRG3 as well, or at least avoid
using it for something needed prior to DO_KVM.

We also need to update the documentation in reg.h to reflect the fact
that we don't use SPRG4-7 anymore on e500.

-Scott


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] CMA: generalize CMA reserved area management functionality (fixup)

2014-07-17 Thread Andrew Morton
On Thu, 17 Jul 2014 11:36:07 +0200 Marek Szyprowski  
wrote:

> MAX_CMA_AREAS is used by other subsystems (i.e. arch/arm/mm/dma-mapping.c),
> so we need to provide correct definition even if CMA is disabled.
> This patch fixes this issue.
> 
> Reported-by: Sylwester Nawrocki 
> Signed-off-by: Marek Szyprowski 
> ---
>  include/linux/cma.h | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 9a18a2b1934c..c077635cad76 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -5,7 +5,11 @@
>   * There is always at least global CMA area and a few optional
>   * areas configured in kernel .config.
>   */
> +#ifdef CONFIG_CMA
>  #define MAX_CMA_AREAS(1 + CONFIG_CMA_AREAS)
> +#else
> +#define MAX_CMA_AREAS(0)
> +#endif
>  
>  struct cma;

Joonsoo already fixed this up, a bit differently:
http://ozlabs.org/~akpm/mmots/broken-out/cma-generalize-cma-reserved-area-management-functionality-fix.patch

Which approach makes more sense?



From: Joonsoo Kim 
Subject: CMA: fix ARM build failure related to MAX_CMA_AREAS definition

If CMA is disabled, CONFIG_CMA_AREAS isn't defined so compile error
happens. To fix it, define MAX_CMA_AREAS if CONFIG_CMA_AREAS
isn't defined.

Signed-off-by: Joonsoo Kim 
Reported-by: Stephen Rothwell 
Signed-off-by: Andrew Morton 
---

 include/linux/cma.h |6 ++
 1 file changed, 6 insertions(+)

diff -puN 
include/linux/cma.h~cma-generalize-cma-reserved-area-management-functionality-fix
 include/linux/cma.h
--- 
a/include/linux/cma.h~cma-generalize-cma-reserved-area-management-functionality-fix
+++ a/include/linux/cma.h
@@ -5,8 +5,14 @@
  * There is always at least global CMA area and a few optional
  * areas configured in kernel .config.
  */
+#ifdef CONFIG_CMA_AREAS
 #define MAX_CMA_AREAS  (1 + CONFIG_CMA_AREAS)
 
+#else
+#define MAX_CMA_AREAS  (0)
+
+#endif
+
 struct cma;
 
 extern phys_addr_t cma_get_base(struct cma *cma);
_


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 2/4] KVM: MMU: allow pinning spte translations (TDP-only)

2014-07-17 Thread Marcelo Tosatti
On Thu, Jul 17, 2014 at 08:18:03PM +0300, Nadav Amit wrote:
> Small question if I may regarding kvm_mmu_pin_pages:
> 
> On 7/9/14, 10:12 PM, mtosa...@redhat.com wrote:
> >+
> >+static int kvm_mmu_pin_pages(struct kvm_vcpu *vcpu)
> >+{
> >+struct kvm_pinned_page_range *p;
> >+int r = 1;
> >+
> >+if (is_guest_mode(vcpu))
> >+return r;
> >+
> >+if (!vcpu->arch.mmu.direct_map)
> >+return r;
> >+
> >+ASSERT(VALID_PAGE(vcpu->arch.mmu.root_hpa));
> >+
> >+list_for_each_entry(p, &vcpu->arch.pinned_mmu_pages, link) {
> >+gfn_t gfn_offset;
> >+
> >+for (gfn_offset = 0; gfn_offset < p->npages; gfn_offset++) {
> >+gfn_t gfn = p->base_gfn + gfn_offset;
> >+int r;
> >+bool pinned = false;
> >+
> >+r = vcpu->arch.mmu.page_fault(vcpu, gfn << PAGE_SHIFT,
> >+ PFERR_WRITE_MASK, false,
> >+ true, &pinned);
> 
> I understand that the current use-case is for pinning only few
> pages. Yet, wouldn't it be better (for performance) to check whether
> the gfn uses a large page and if so to skip forward, increasing
> gfn_offset to point to the next large page?

Sure, that can be a lazy optimization and performed when necessary?

(feel free to do it in advance if you're interested in doing it 
now).

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread Andy Lutomirski
On Thu, Jul 17, 2014 at 11:42 AM, Hannes Frederic Sowa
 wrote:
>
>
> On Thu, Jul 17, 2014, at 19:34, Andy Lutomirski wrote:
>> On Thu, Jul 17, 2014 at 10:32 AM, Theodore Ts'o  wrote:
>> > On Thu, Jul 17, 2014 at 10:12:27AM -0700, Andy Lutomirski wrote:
>> >>
>> >> Unless I'm reading the code wrong, the prandom_reseed_late call can
>> >> happen after userspace is running.
>> >
>> > But there is also the prandom_reseed() call, which happens early.
>> >
>>
>> Right -- I missed that.
>
> prandom_init is a core_initcall, prandom_reseed is a late_initcall.
> During initialization of the network stack we have calls to prandom_u32
> before the late_initcall happens. That said, I think it is not that
> important to seed prandom with rdseed/rdrand as security relevant
> entropy extraction should always use get_random_bytes(), but we should
> do it nonetheless.
>

Regardless, I don't want to do this as part of this patch series.  One
thing at a time...

--Andy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 5/5] x86,kaslr: Use MSR_KVM_GET_RNG_SEED for KASLR if available

2014-07-17 Thread Kees Cook
On Thu, Jul 17, 2014 at 11:22 AM, Andy Lutomirski  wrote:
> It's considerably better than any of the alternatives on KVM.
>
> Rather than reinventing all of the cpu feature query code, this fixes
> native_cpuid to work in PIC objects.
>
> I haven't combined it with boot/cpuflags.c's cpuid implementation:
> including asm/processor.h from boot/cpuflags.c results in a flood of
> unrelated errors, and fixing it might be messy.
>
> Signed-off-by: Andy Lutomirski 

This will be very nice to have under kvm!

Reviewed-by: Kees Cook 

Thanks,

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread Hannes Frederic Sowa


On Thu, Jul 17, 2014, at 19:34, Andy Lutomirski wrote:
> On Thu, Jul 17, 2014 at 10:32 AM, Theodore Ts'o  wrote:
> > On Thu, Jul 17, 2014 at 10:12:27AM -0700, Andy Lutomirski wrote:
> >>
> >> Unless I'm reading the code wrong, the prandom_reseed_late call can
> >> happen after userspace is running.
> >
> > But there is also the prandom_reseed() call, which happens early.
> >
> 
> Right -- I missed that.

prandom_init is a core_initcall, prandom_reseed is a late_initcall.
During initialization of the network stack we have calls to prandom_u32
before the late_initcall happens. That said, I think it is not that
important to seed prandom with rdseed/rdrand as security relevant
entropy extraction should always use get_random_bytes(), but we should
do it nonetheless.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 1/5] x86,kvm: Add MSR_KVM_GET_RNG_SEED and a matching feature bit

2014-07-17 Thread Andy Lutomirski
This adds a simple interface to allow a guest to request 64 bits of
host nonblocking entropy.  This is independent of virtio-rng for a
couple of reasons:

 - It's intended to be usable during early boot, when a trivial
   synchronous interface is needed.

 - virtio-rng gives blocking entropy, and making guest boot wait for
   the host's /dev/random will cause problems.

MSR_KVM_GET_RNG_SEED is intended to provide 64 bits of best-effort
cryptographically secure data for use as a seed.  It provides no
guarantee that the result contains any actual entropy.

Signed-off-by: Andy Lutomirski 
---
 Documentation/virtual/kvm/cpuid.txt  | 3 +++
 arch/x86/include/uapi/asm/kvm_para.h | 2 ++
 arch/x86/kvm/cpuid.c | 3 ++-
 arch/x86/kvm/x86.c   | 4 
 4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/cpuid.txt 
b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..0ab043b 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,9 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest checks 
this feature bit
||   || before enabling paravirtualized
||   || spinlock support.
 --
+KVM_FEATURE_GET_RNG_SEED   || 8 || host provides rng seed data via
+   ||   || MSR_KVM_GET_RNG_SEED.
+--
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no guest-side
||   || per-cpu warps are expected in
||   || kvmclock.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h 
b/arch/x86/include/uapi/asm/kvm_para.h
index 94dc8ca..e2eaf93 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@
 #define KVM_FEATURE_STEAL_TIME 5
 #define KVM_FEATURE_PV_EOI 6
 #define KVM_FEATURE_PV_UNHALT  7
+#define KVM_FEATURE_GET_RNG_SEED   8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -40,6 +41,7 @@
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
 #define MSR_KVM_STEAL_TIME  0x4b564d03
 #define MSR_KVM_PV_EOI_EN  0x4b564d04
+#define MSR_KVM_GET_RNG_SEED 0x4b564d05
 
 struct kvm_steal_time {
__u64 steal;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 38a0afe..40d6763 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -479,7 +479,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 
*entry, u32 function,
 (1 << KVM_FEATURE_ASYNC_PF) |
 (1 << KVM_FEATURE_PV_EOI) |
 (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
-(1 << KVM_FEATURE_PV_UNHALT);
+(1 << KVM_FEATURE_PV_UNHALT) |
+(1 << KVM_FEATURE_GET_RNG_SEED);
 
if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f644933..4e81853 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #define CREATE_TRACE_POINTS
@@ -2480,6 +2481,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, 
u64 *pdata)
case MSR_KVM_PV_EOI_EN:
data = vcpu->arch.pv_eoi.msr_val;
break;
+   case MSR_KVM_GET_RNG_SEED:
+   get_random_bytes(&data, sizeof(data));
+   break;
case MSR_IA32_P5_MC_ADDR:
case MSR_IA32_P5_MC_TYPE:
case MSR_IA32_MCG_CAP:
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 5/5] x86,kaslr: Use MSR_KVM_GET_RNG_SEED for KASLR if available

2014-07-17 Thread Andy Lutomirski
It's considerably better than any of the alternatives on KVM.

Rather than reinventing all of the cpu feature query code, this fixes
native_cpuid to work in PIC objects.

I haven't combined it with boot/cpuflags.c's cpuid implementation:
including asm/processor.h from boot/cpuflags.c results in a flood of
unrelated errors, and fixing it might be messy.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/boot/compressed/aslr.c  | 27 +++
 arch/x86/include/asm/processor.h | 21 ++---
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index fc6091a..8583f0e 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -5,6 +5,8 @@
 #include 
 #include 
 
+#include 
+
 #include 
 #include 
 #include 
@@ -15,6 +17,22 @@
 static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
+static bool kvm_para_has_feature(unsigned int feature)
+{
+   u32 kvm_base;
+   u32 features;
+
+   if (!has_cpuflag(X86_FEATURE_HYPERVISOR))
+   return false;
+
+   kvm_base = hypervisor_cpuid_base("KVMKVMKVM\0\0\0", KVM_CPUID_FEATURES);
+   if (!kvm_base)
+   return false;
+
+   features = cpuid_eax(kvm_base | KVM_CPUID_FEATURES);
+   return features & (1UL << feature);
+}
+
 #define I8254_PORT_CONTROL 0x43
 #define I8254_PORT_COUNTER00x40
 #define I8254_CMD_READBACK 0xC0
@@ -81,6 +99,15 @@ static unsigned long get_random_long(void)
}
}
 
+   if (kvm_para_has_feature(KVM_FEATURE_GET_RNG_SEED)) {
+   u64 seed;
+
+   debug_putstr(" MSR_KVM_GET_RNG_SEED");
+   rdmsrl(MSR_KVM_GET_RNG_SEED, seed);
+   random ^= (unsigned long)seed;
+   use_i8254 = false;
+   }
+
if (has_cpuflag(X86_FEATURE_TSC)) {
debug_putstr(" RDTSC");
rdtscll(raw);
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6096f3c 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -189,10 +189,25 @@ static inline int have_cpuid_p(void)
 static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
unsigned int *ecx, unsigned int *edx)
 {
-   /* ecx is often an input as well as an output. */
-   asm volatile("cpuid"
+   /*
+* This function can be used from the boot code, so it needs
+* to avoid using EBX in constraints in PIC mode.
+*
+* ecx is often an input as well as an output.
+*/
+   asm volatile(".ifnc %%ebx,%1 ; .ifnc %%rbx,%1   \n\t"
+"movl  %%ebx,%1\n\t"
+".endif ; .endif   \n\t"
+"cpuid \n\t"
+".ifnc %%ebx,%1 ; .ifnc %%rbx,%1   \n\t"
+"xchgl %%ebx,%1\n\t"
+".endif ; .endif"
: "=a" (*eax),
- "=b" (*ebx),
+#if defined(__i386__) && defined(__PIC__)
+ "=r" (*ebx),  /* gcc won't let us use ebx */
+#else
+ "=b" (*ebx),  /* ebx is okay */
+#endif
  "=c" (*ecx),
  "=d" (*edx)
: "0" (*eax), "2" (*ecx)
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 2/5] random: Add and use arch_get_rng_seed

2014-07-17 Thread Andy Lutomirski
Currently, init_std_data contains its own logic for using arch
random sources.  This logic is a bit strange: it reads one long of
arch random data per byte of internal state.

This replaces that logic with a generic function arch_get_rng_seed
that allows arch code to supply its own logic.  The default
implementation tries arch_get_random_seed_long and
arch_get_random_long individually, requesting one bit per bit of
internal state being seeded.

Assuming the arch sources are perfect, this is the right thing to
do.  They're not, though, so the followup patch attempts to
implement the correct logic on x86.

Signed-off-by: Andy Lutomirski 
---
 drivers/char/random.c  | 14 +++---
 include/linux/random.h | 40 
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0a7ac0a..be7a94e 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1236,6 +1236,10 @@ void get_random_bytes_arch(void *buf, int nbytes)
 }
 EXPORT_SYMBOL(get_random_bytes_arch);
 
+static void seed_entropy_store(void *ctx, u32 data)
+{
+   mix_pool_bytes((struct entropy_store *)ctx, &data, sizeof(data), NULL);
+}
 
 /*
  * init_std_data - initialize pool with system data
@@ -1251,15 +1255,19 @@ static void init_std_data(struct entropy_store *r)
int i;
ktime_t now = ktime_get_real();
unsigned long rv;
+   char log_prefix[128];
 
r->last_pulled = jiffies;
mix_pool_bytes(r, &now, sizeof(now), NULL);
for (i = r->poolinfo->poolbytes; i > 0; i -= sizeof(rv)) {
-   if (!arch_get_random_seed_long(&rv) &&
-   !arch_get_random_long(&rv))
-   rv = random_get_entropy();
+   rv = random_get_entropy();
mix_pool_bytes(r, &rv, sizeof(rv), NULL);
}
+
+   sprintf(log_prefix, "random: seeded %s pool", r->name);
+   arch_get_rng_seed(r, seed_entropy_store, 8 * r->poolinfo->poolbytes,
+ log_prefix);
+
mix_pool_bytes(r, utsname(), sizeof(*(utsname())), NULL);
 }
 
diff --git a/include/linux/random.h b/include/linux/random.h
index 57fbbff..a17065e 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -106,6 +106,46 @@ static inline int arch_has_random_seed(void)
 }
 #endif
 
+#ifndef __HAVE_ARCH_GET_RNG_SEED
+
+/**
+ * arch_get_rng_seed() - get architectural rng seed data
+ * @ctx: context for the seed function
+ * @seed: function to call for each u32 obtained
+ * @bits_per_source: number of bits from each source to try to use
+ * @log_prefix: beginning of log output (may be NULL)
+ *
+ * Synchronously load some architectural entropy or other best-effort
+ * random seed data.  An arch-specific implementation should be no worse
+ * than this generic implementation.  If the arch code does something
+ * interesting, it may log something of the form "log_prefix with
+ * 8 bits of stuff".
+ *
+ * No arch-specific implementation should be any worse than the generic
+ * implementation.
+ */
+static inline void arch_get_rng_seed(void *ctx,
+void (*seed)(void *ctx, u32 data),
+int bits_per_source,
+const char *log_prefix)
+{
+   int i, longs = (bits_per_source + BITS_PER_LONG - 1) / BITS_PER_LONG;
+
+   for (i = 0; i < longs; i++) {
+   unsigned long rv;
+
+   if (arch_get_random_seed_long(&rv) ||
+   arch_get_random_long(&rv)) {
+   seed(ctx, (u32)rv);
+#if BITS_PER_LONG > 32
+   seed(ctx, (u32)(rv >> 32));
+#endif
+   }
+   }
+}
+
+#endif /* __HAVE_ARCH_GET_RNG_SEED */
+
 /* Pseudo random number generator from numerical recipes. */
 static inline u32 next_pseudo_random32(u32 seed)
 {
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 0/5] random,x86,kvm: Rework arch RNG seeds and get some from kvm

2014-07-17 Thread Andy Lutomirski
This introduces and uses a very simple synchronous mechanism to get
/dev/urandom-style bits appropriate for initial KVM PV guest RNG
seeding.

It also re-works the way that architectural random data is fed into
random.c's pools.  I added a new arch hook called arch_get_rng_seed.
The default implementation uses arch_get_random_seed_long and
arch_get_random_long, but not quite the same way as before.

x86 gets a custom arch_get_rng_seed, which is significantly enhanced
over the generic implementation.  It uses RDSEED less aggressively (the
old implementation requested 4x or 8x as many bits as would fit in the
pool, depending on kernel bitness), but, if using RDRAND, it requests
enough bits to comply with Intel's recommendations.

x86's arch_get_rng_seed will also use KVM_GET_RNG_SEED if available.
If more paravirt seed sources show up, it will be a natural place
to add them.

I sent the corresponding kvm-unit-tests and qemu changes separately.

Changes from v3:
 - Other than KASLR, the guest pieces are completely rewritten.
   Patches 2-4 have essentially nothing in common with v2.

Changes from v2:
 - Bisection fix (patch 2 had a misplaced brace).  The final states is
   identical to that of v2.
 - Improve the 0/5 description a little bit.

Changes from v1:
 - Split patches 2 and 3
 - Log all arch sources in init_std_data
 - Fix the 32-bit kaslr build

Andy Lutomirski (5):
  x86,kvm: Add MSR_KVM_GET_RNG_SEED and a matching feature bit
  random: Add and use arch_get_rng_seed
  x86,random: Add an x86 implementation of arch_get_rng_seed
  x86,random,kvm: Use KVM_GET_RNG_SEED in arch_get_rng_seed
  x86,kaslr: Use MSR_KVM_GET_RNG_SEED for KASLR if available

 Documentation/virtual/kvm/cpuid.txt  |  3 ++
 arch/x86/Kconfig |  4 ++
 arch/x86/boot/compressed/aslr.c  | 27 ++
 arch/x86/include/asm/archrandom.h|  6 +++
 arch/x86/include/asm/kvm_guest.h |  9 
 arch/x86/include/asm/processor.h | 21 ++--
 arch/x86/include/uapi/asm/kvm_para.h |  2 +
 arch/x86/kernel/Makefile |  2 +
 arch/x86/kernel/archrandom.c | 99 
 arch/x86/kernel/kvm.c| 10 
 arch/x86/kvm/cpuid.c |  3 +-
 arch/x86/kvm/x86.c   |  4 ++
 drivers/char/random.c| 14 +++--
 include/linux/random.h   | 40 +++
 14 files changed, 237 insertions(+), 7 deletions(-)
 create mode 100644 arch/x86/kernel/archrandom.c

-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 4/5] x86,random,kvm: Use KVM_GET_RNG_SEED in arch_get_rng_seed

2014-07-17 Thread Andy Lutomirski
This is a straightforward implementation: for each bit of internal
RNG state, request one bit from KVM_GET_RNG_SEED.  This is done even
if RDSEED/RDRAND worked, since KVM_GET_RNG_SEED is likely to provide
cryptographically secure output even if the CPU's RNG is weak or
compromised.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/Kconfig |  4 
 arch/x86/include/asm/kvm_guest.h |  9 +
 arch/x86/kernel/archrandom.c | 22 +-
 arch/x86/kernel/kvm.c| 10 ++
 4 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a8f749e..adfa09c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -593,6 +593,7 @@ config KVM_GUEST
bool "KVM Guest support (including kvmclock)"
depends on PARAVIRT
select PARAVIRT_CLOCK
+   select ARCH_RANDOM
default y
---help---
  This option enables various optimizations for running under the KVM
@@ -1507,6 +1508,9 @@ config ARCH_RANDOM
  If supported, this is a high bandwidth, cryptographically
  secure hardware random number generator.
 
+ This also enables paravirt RNGs such as KVM's if the relevant
+ PV guest support is enabled.
+
 config X86_SMAP
def_bool y
prompt "Supervisor Mode Access Prevention" if EXPERT
diff --git a/arch/x86/include/asm/kvm_guest.h b/arch/x86/include/asm/kvm_guest.h
index a92b176..8c4dbd5 100644
--- a/arch/x86/include/asm/kvm_guest.h
+++ b/arch/x86/include/asm/kvm_guest.h
@@ -3,4 +3,13 @@
 
 int kvm_setup_vsyscall_timeinfo(void);
 
+#if defined(CONFIG_KVM_GUEST) && defined(CONFIG_ARCH_RANDOM)
+extern bool kvm_get_rng_seed(u64 *rv);
+#else
+static inline bool kvm_get_rng_seed(u64 *rv)
+{
+   return false;
+}
+#endif
+
 #endif /* _ASM_X86_KVM_GUEST_H */
diff --git a/arch/x86/kernel/archrandom.c b/arch/x86/kernel/archrandom.c
index 5515fc8..3bcfa58 100644
--- a/arch/x86/kernel/archrandom.c
+++ b/arch/x86/kernel/archrandom.c
@@ -15,6 +15,7 @@
  */
 
 #include 
+#include 
 
 void arch_get_rng_seed(void *ctx,
   void (*seed)(void *ctx, u32 data),
@@ -22,7 +23,7 @@ void arch_get_rng_seed(void *ctx,
   const char *log_prefix)
 {
int i, longs = (bits_per_source + BITS_PER_LONG - 1) / BITS_PER_LONG;
-   int rdseed_bits = 0, rdrand_bits = 0;
+   int rdseed_bits = 0, rdrand_bits = 0, kvm_bits = 0;
int rdrand_longs_wanted = 0;
char buf[128] = "";
char *msgptr = buf;
@@ -74,6 +75,25 @@ void arch_get_rng_seed(void *ctx,
if (rdrand_bits)
msgptr += sprintf(msgptr, ", %d bits from RDRAND", rdrand_bits);
 
+   /*
+* Use KVM_GET_RNG_SEED regardless of whether the CPU RNG worked,
+* since it incorporates entropy unavailable to the CPU.  We
+* request enough bits for the entire internal RNG state, because
+* there's no good reason not to.
+*/
+   for (i = 0; i < (bits_per_source + 63) / 64; i++) {
+   u64 rv;
+
+   if (kvm_get_rng_seed(&rv)) {
+   seed(ctx, (u32)rv);
+   seed(ctx, (u32)(rv >> 32));
+   kvm_bits += 8 * sizeof(rv);
+   }
+   }
+   if (kvm_bits)
+   msgptr += sprintf(msgptr, ", %d bits from KVM_GET_RNG_BITS",
+ kvm_bits);
+
if (buf[0])
pr_info("%s with %s\n", log_prefix, buf + 2);
 }
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3dd8e2c..bd8783a 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -416,6 +416,16 @@ void kvm_disable_steal_time(void)
wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
+bool kvm_get_rng_seed(u64 *v)
+{
+   /*
+* Allow migration from a hypervisor with the GET_RNG_SEED
+* feature to a hypervisor without it.
+*/
+   return (kvm_para_has_feature(KVM_FEATURE_GET_RNG_SEED) &&
+   rdmsrl_safe(MSR_KVM_GET_RNG_SEED, v) == 0);
+}
+
 #ifdef CONFIG_SMP
 static void __init kvm_smp_prepare_boot_cpu(void)
 {
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 3/5] x86,random: Add an x86 implementation of arch_get_rng_seed

2014-07-17 Thread Andy Lutomirski
This is closer to Intel's recommended logic for using RDRAND and
RDSEED.  It will attempt to seed the entire internal state of the
RNG pool using RDSEED (with one bit of RDSEED output per bit of
state).  For any bits that can't be obtained using RDSEED (e.g. if
RDSEED is unavailable), it calculates the number of RDRAND reseeds
needed to obtain the missing bits from the internal NRBG and then
requests enough bits from RDRAND to obtain the full output from at
least that many reseeds.

Arguably, arch_get_random_seed could be removed now: I'm having some
trouble imagining a sensible non-architecture-specific use of it
that wouldn't be better served by arch_get_rng_seed.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/archrandom.h |  6 +++
 arch/x86/kernel/Makefile  |  2 +
 arch/x86/kernel/archrandom.c  | 79 +++
 3 files changed, 87 insertions(+)
 create mode 100644 arch/x86/kernel/archrandom.c

diff --git a/arch/x86/include/asm/archrandom.h 
b/arch/x86/include/asm/archrandom.h
index 69f1366..88f9c5a 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -117,6 +117,12 @@ GET_SEED(arch_get_random_seed_int, unsigned int, 
RDSEED_INT, ASM_NOP4);
 #define arch_has_random()  static_cpu_has(X86_FEATURE_RDRAND)
 #define arch_has_random_seed() static_cpu_has(X86_FEATURE_RDSEED)
 
+#define __HAVE_ARCH_GET_RNG_SEED
+extern void arch_get_rng_seed(void *ctx,
+ void (*seed)(void *ctx, u32 data),
+ int bits_per_source,
+ const char *log_prefix);
+
 #else
 
 static inline int rdrand_long(unsigned long *v)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 047f9ff..0718bae 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -92,6 +92,8 @@ obj-$(CONFIG_PARAVIRT)+= paravirt.o 
paravirt_patch_$(BITS).o
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
 obj-$(CONFIG_PARAVIRT_CLOCK)   += pvclock.o
 
+obj-$(CONFIG_ARCH_RANDOM)  += archrandom.o
+
 obj-$(CONFIG_PCSPKR_PLATFORM)  += pcspeaker.o
 
 obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) += check.o
diff --git a/arch/x86/kernel/archrandom.c b/arch/x86/kernel/archrandom.c
new file mode 100644
index 000..5515fc8
--- /dev/null
+++ b/arch/x86/kernel/archrandom.c
@@ -0,0 +1,79 @@
+/*
+ * This file is part of the Linux kernel.
+ *
+ * Copyright (c) 2014 Andy Lutomirski
+ * Authors: Andy Lutomirski 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+
+void arch_get_rng_seed(void *ctx,
+  void (*seed)(void *ctx, u32 data),
+  int bits_per_source,
+  const char *log_prefix)
+{
+   int i, longs = (bits_per_source + BITS_PER_LONG - 1) / BITS_PER_LONG;
+   int rdseed_bits = 0, rdrand_bits = 0;
+   int rdrand_longs_wanted = 0;
+   char buf[128] = "";
+   char *msgptr = buf;
+
+   for (i = 0; i < longs; i++) {
+   unsigned long rv;
+
+   if (arch_get_random_seed_long(&rv)) {
+   seed(ctx, (u32)rv);
+#if BITS_PER_LONG > 32
+   seed(ctx, (u32)(rv >> 32));
+#endif
+   rdseed_bits += 8 * sizeof(rv);
+   }
+   }
+   if (rdseed_bits)
+   msgptr += sprintf(msgptr, ", %d bits from RDSEED", rdseed_bits);
+
+   /*
+* According to the Intel DRNG Software Implementation Guide 2.0,
+* the RDRAND hardware is guaranteed to provide at least 128 bits
+* of non-deterministic entropy per 511*128 bits of RDRAND output.
+* Nonetheless, the guide suggests using a 512:1 reduction for
+* generating seeds.
+*
+* We use one extra reseed, because we might not own the first
+* or last few samples.
+*
+* We skip using RDRAND for any bits already provided by RDSEED,
+* as they use the same underlying entropy source.
+*/
+   if (rdseed_bits < bits_per_source && arch_has_random()) {
+   int nrbg_bits = bits_per_source - rdseed_bits;
+   int reseeds = (nrbg_bits + 127) / 128 + 1;
+
+   rdrand_longs_wanted = reseeds * 512 * 128 / BITS_PER_LONG;
+   }
+   for (i = 0; i < rdrand_longs_wanted; i++) {
+   unsigned long rv;
+
+   if (arch_get_random_long(&rv)) {
+   seed(ctx, (u32)rv);
+#if BITS_PER_LONG > 32
+   seed(ctx, (u32)(rv >> 32));
+#endif
+ 

Re: [PATCH v3 1/5] x86,kvm: Add MSR_KVM_GET_RNG_SEED and a matching feature bit

2014-07-17 Thread Andy Lutomirski
On Thu, Jul 17, 2014 at 10:43 AM, Andrew Honig  wrote:
>> +   case MSR_KVM_GET_RNG_SEED:
>> +   get_random_bytes(&data, sizeof(data));
>> +   break;
>
> Should this be rate limited in the interest of conserving randomness?
> If there ever is an attack on the prng, this would create very
> favorable conditions for an attacker to exploit it.

IMO if the nonblocking pool has a weakness that requires us to
conserve its output, then this is the least of our worries.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/5] x86,kvm: Add MSR_KVM_GET_RNG_SEED and a matching feature bit

2014-07-17 Thread Andrew Honig
> +   case MSR_KVM_GET_RNG_SEED:
> +   get_random_bytes(&data, sizeof(data));
> +   break;

Should this be rate limited in the interest of conserving randomness?
If there ever is an attack on the prng, this would create very
favorable conditions for an attacker to exploit it.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread Andy Lutomirski
On Thu, Jul 17, 2014 at 10:32 AM, Theodore Ts'o  wrote:
> On Thu, Jul 17, 2014 at 10:12:27AM -0700, Andy Lutomirski wrote:
>>
>> Unless I'm reading the code wrong, the prandom_reseed_late call can
>> happen after userspace is running.
>
> But there is also the prandom_reseed() call, which happens early.
>

Right -- I missed that.

>- Ted



-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread Theodore Ts'o
On Thu, Jul 17, 2014 at 10:12:27AM -0700, Andy Lutomirski wrote:
> 
> Unless I'm reading the code wrong, the prandom_reseed_late call can
> happen after userspace is running.

But there is also the prandom_reseed() call, which happens early.

   - Ted
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 2/4] KVM: MMU: allow pinning spte translations (TDP-only)

2014-07-17 Thread Nadav Amit

Small question if I may regarding kvm_mmu_pin_pages:

On 7/9/14, 10:12 PM, mtosa...@redhat.com wrote:

+
+static int kvm_mmu_pin_pages(struct kvm_vcpu *vcpu)
+{
+   struct kvm_pinned_page_range *p;
+   int r = 1;
+
+   if (is_guest_mode(vcpu))
+   return r;
+
+   if (!vcpu->arch.mmu.direct_map)
+   return r;
+
+   ASSERT(VALID_PAGE(vcpu->arch.mmu.root_hpa));
+
+   list_for_each_entry(p, &vcpu->arch.pinned_mmu_pages, link) {
+   gfn_t gfn_offset;
+
+   for (gfn_offset = 0; gfn_offset < p->npages; gfn_offset++) {
+   gfn_t gfn = p->base_gfn + gfn_offset;
+   int r;
+   bool pinned = false;
+
+   r = vcpu->arch.mmu.page_fault(vcpu, gfn << PAGE_SHIFT,
+PFERR_WRITE_MASK, false,
+true, &pinned);


I understand that the current use-case is for pinning only few pages. 
Yet, wouldn't it be better (for performance) to check whether the gfn 
uses a large page and if so to skip forward, increasing gfn_offset to 
point to the next large page?


Thanks,
Nadav
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread Andy Lutomirski
On Thu, Jul 17, 2014 at 9:39 AM, H. Peter Anvin  wrote:
> On 07/17/2014 03:33 AM, Theodore Ts'o wrote:
>> On Wed, Jul 16, 2014 at 09:55:15PM -0700, H. Peter Anvin wrote:
>>> On 07/16/2014 05:03 PM, Andy Lutomirski wrote:
>
 I meant that prandom isn't using rdrand for early seeding.

>>>
>>> We should probably fix that.
>>
>> It wouldn't hurt to explicitly use arch_get_random_long() in prandom,
>> but it does use get_random_bytes() in early seed, and for CPU's with
>> RDRAND present, we do use it in init_std_data() in
>> drivers/char/random.c, so prandom is already getting initialized via
>> an RNG (which is effectively a DRBG even if it doesn't pass all of
>> NIST's rules) which is derived from RDRAND.
>>
>
> I assumed he was referring to before alternatives.  Not sure if we use
> prandom before that point, though.

Unless I'm reading the code wrong, the prandom_reseed_late call can
happen after userspace is running.

Anyway, I'm working on a near-complete rewrite of the guest part of all of this.

--Andy

>
> -hpa
>
>



-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread H. Peter Anvin
On 07/17/2014 03:33 AM, Theodore Ts'o wrote:
> On Wed, Jul 16, 2014 at 09:55:15PM -0700, H. Peter Anvin wrote:
>> On 07/16/2014 05:03 PM, Andy Lutomirski wrote:

>>> I meant that prandom isn't using rdrand for early seeding.
>>>
>>
>> We should probably fix that.
> 
> It wouldn't hurt to explicitly use arch_get_random_long() in prandom,
> but it does use get_random_bytes() in early seed, and for CPU's with
> RDRAND present, we do use it in init_std_data() in
> drivers/char/random.c, so prandom is already getting initialized via
> an RNG (which is effectively a DRBG even if it doesn't pass all of
> NIST's rules) which is derived from RDRAND.
> 

I assumed he was referring to before alternatives.  Not sure if we use
prandom before that point, though.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread bharat.bhus...@freescale.com


> -Original Message-
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Thursday, July 17, 2014 9:58 PM
> To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
> Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
> Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest
> 
> 
> On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:
> >
> >> -Original Message-
> >> From: Alexander Graf [mailto:ag...@suse.de]
> >> Sent: Thursday, July 17, 2014 9:41 PM
> >> To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
> >> Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
> >> Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering
> >> guest
> >>
> >>
> >> On 16.07.14 08:02, Bharat Bhushan wrote:
> >>> SPRG3 is guest accessible and SPRG3 can be clobbered by host or
> >>> another guest, So this need to be restored when loading guest state.
> >>>
> >>> Signed-off-by: Bharat Bhushan 
> >>> ---
> >>>arch/powerpc/kvm/booke_interrupts.S | 2 ++
> >>>1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/arch/powerpc/kvm/booke_interrupts.S
> >>> b/arch/powerpc/kvm/booke_interrupts.S
> >>> index 2c6deb5ef..0d3403f 100644
> >>> --- a/arch/powerpc/kvm/booke_interrupts.S
> >>> +++ b/arch/powerpc/kvm/booke_interrupts.S
> >>> @@ -459,6 +459,8 @@ lightweight_exit:
> >>>* written directly to the shared area, so we
> >>>* need to reload them here with the guest's values.
> >>>*/
> >>> + PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
> >>> + mtspr   SPRN_SPRG3, r3
> >> We also need to restore it when resuming the host, no?
> > I do not think host expect some meaningful value when returning from guest,
> same true for SPRG4-7.
> > So there seems no reason to save host values and restore them.
> 
> Hmm - arch/powerpc/include/asm/reg.h says:
> 
>   * All 32-bit:
>   *  - SPRG3 current thread_info pointer
>   *(virtual on BookE, physical on others)
> 
> but I can indeed find no trace of usage anywhere. This at least needs to go 
> into
> the patch description.

I will add a comment in code as well.

Thanks
-Bharat

> 
> 
> Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Alexander Graf


On 17.07.14 18:27, Alexander Graf wrote:


On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:



-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, July 17, 2014 9:41 PM
To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest


On 16.07.14 08:02, Bharat Bhushan wrote:

SPRG3 is guest accessible and SPRG3 can be clobbered by host or
another guest, So this need to be restored when loading guest state.

Signed-off-by: Bharat Bhushan 
---
   arch/powerpc/kvm/booke_interrupts.S | 2 ++
   1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/booke_interrupts.S
b/arch/powerpc/kvm/booke_interrupts.S
index 2c6deb5ef..0d3403f 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -459,6 +459,8 @@ lightweight_exit:
* written directly to the shared area, so we
* need to reload them here with the guest's values.
*/
+PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
+mtsprSPRN_SPRG3, r3

We also need to restore it when resuming the host, no?
I do not think host expect some meaningful value when returning from 
guest, same true for SPRG4-7.

So there seems no reason to save host values and restore them.


Hmm - arch/powerpc/include/asm/reg.h says:

 * All 32-bit:
 *  - SPRG3 current thread_info pointer
 *(virtual on BookE, physical on others)

but I can indeed find no trace of usage anywhere. This at least needs 
to go into the patch description.


Bah - it obviously is used. It's SPRN_SPRG_THREAD. And it's so 
incredibly important that I have no idea how we could possibly run 
without switching the host value back in very early. And even then our 
interrupt handlers wouldn't work anymore.


This is more complicated :).


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Alexander Graf


On 17.07.14 18:24, bharat.bhus...@freescale.com wrote:



-Original Message-
From: Alexander Graf [mailto:ag...@suse.de]
Sent: Thursday, July 17, 2014 9:41 PM
To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest


On 16.07.14 08:02, Bharat Bhushan wrote:

SPRG3 is guest accessible and SPRG3 can be clobbered by host or
another guest, So this need to be restored when loading guest state.

Signed-off-by: Bharat Bhushan 
---
   arch/powerpc/kvm/booke_interrupts.S | 2 ++
   1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/booke_interrupts.S
b/arch/powerpc/kvm/booke_interrupts.S
index 2c6deb5ef..0d3403f 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -459,6 +459,8 @@ lightweight_exit:
 * written directly to the shared area, so we
 * need to reload them here with the guest's values.
 */
+   PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
+   mtspr   SPRN_SPRG3, r3

We also need to restore it when resuming the host, no?

I do not think host expect some meaningful value when returning from guest, 
same true for SPRG4-7.
So there seems no reason to save host values and restore them.


Hmm - arch/powerpc/include/asm/reg.h says:

 * All 32-bit:
 *  - SPRG3 current thread_info pointer
 *(virtual on BookE, physical on others)

but I can indeed find no trace of usage anywhere. This at least needs to 
go into the patch description.



Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread bharat.bhus...@freescale.com


> -Original Message-
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Thursday, July 17, 2014 9:41 PM
> To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
> Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
> Subject: Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest
> 
> 
> On 16.07.14 08:02, Bharat Bhushan wrote:
> > SPRG3 is guest accessible and SPRG3 can be clobbered by host or
> > another guest, So this need to be restored when loading guest state.
> >
> > Signed-off-by: Bharat Bhushan 
> > ---
> >   arch/powerpc/kvm/booke_interrupts.S | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/powerpc/kvm/booke_interrupts.S
> > b/arch/powerpc/kvm/booke_interrupts.S
> > index 2c6deb5ef..0d3403f 100644
> > --- a/arch/powerpc/kvm/booke_interrupts.S
> > +++ b/arch/powerpc/kvm/booke_interrupts.S
> > @@ -459,6 +459,8 @@ lightweight_exit:
> >  * written directly to the shared area, so we
> >  * need to reload them here with the guest's values.
> >  */
> > +   PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
> > +   mtspr   SPRN_SPRG3, r3
> 
> We also need to restore it when resuming the host, no?

I do not think host expect some meaningful value when returning from guest, 
same true for SPRG4-7.
So there seems no reason to save host values and restore them.

Thanks
-Bharat
> 
> 
> Alex
> 
> > PPC_LD(r3, VCPU_SHARED_SPRG4, r5)
> > mtspr   SPRN_SPRG4W, r3
> > PPC_LD(r3, VCPU_SHARED_SPRG5, r5)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit

2014-07-17 Thread bharat.bhus...@freescale.com


> -Original Message-
> From: Alexander Graf [mailto:ag...@suse.de]
> Sent: Thursday, July 17, 2014 9:47 PM
> To: Bhushan Bharat-R65777; kvm-...@vger.kernel.org
> Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248
> Subject: Re: [PATCH] kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry
> exit
> 
> 
> On 16.07.14 07:49, Bharat Bhushan wrote:
> > SPRN_SPRG is used by debug interrupt handler, so this is required for
> > debug support.
> >
> > Signed-off-by: Bharat Bhushan 
> > ---
> >   arch/powerpc/include/asm/kvm_host.h   | 1 +
> >   arch/powerpc/kernel/asm-offsets.c | 1 +
> >   arch/powerpc/kvm/bookehv_interrupts.S | 4 
> >   3 files changed, 6 insertions(+)
> >
> > diff --git a/arch/powerpc/include/asm/kvm_host.h
> > b/arch/powerpc/include/asm/kvm_host.h
> > index 372b977..f9e94ed 100644
> > --- a/arch/powerpc/include/asm/kvm_host.h
> > +++ b/arch/powerpc/include/asm/kvm_host.h
> > @@ -588,6 +588,7 @@ struct kvm_vcpu_arch {
> > u32 mmucfg;
> > u32 eptcfg;
> > u32 epr;
> > +   u32 sprg9;
> 
> u32? really?

Must be 64, even I did 64bit save/restore below. Will correct in next version.

Thanks
-Bharat

> 
> 
> Alex
> 
> > u32 pwrmgtcr0;
> > u32 crit_save;
> > /* guest debug registers*/
> > diff --git a/arch/powerpc/kernel/asm-offsets.c
> > b/arch/powerpc/kernel/asm-offsets.c
> > index 17ffcb4..ab9ae04 100644
> > --- a/arch/powerpc/kernel/asm-offsets.c
> > +++ b/arch/powerpc/kernel/asm-offsets.c
> > @@ -668,6 +668,7 @@ int main(void)
> > DEFINE(VCPU_LR, offsetof(struct kvm_vcpu, arch.lr));
> > DEFINE(VCPU_CTR, offsetof(struct kvm_vcpu, arch.ctr));
> > DEFINE(VCPU_PC, offsetof(struct kvm_vcpu, arch.pc));
> > +   DEFINE(VCPU_SPRG9, offsetof(struct kvm_vcpu, arch.sprg9));
> > DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
> > DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
> > DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
> > diff --git a/arch/powerpc/kvm/bookehv_interrupts.S
> > b/arch/powerpc/kvm/bookehv_interrupts.S
> > index a1712b8..f45da85 100644
> > --- a/arch/powerpc/kvm/bookehv_interrupts.S
> > +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> > @@ -441,6 +441,7 @@ _GLOBAL(kvmppc_resume_host)
> >   #ifdef CONFIG_64BIT
> > PPC_LL  r3, PACA_SPRG_VDSO(r13)
> >   #endif
> > +   mfspr   r5, SPRN_SPRG9
> > PPC_STD(r6, VCPU_SHARED_SPRG4, r11)
> > mfspr   r8, SPRN_SPRG6
> > PPC_STD(r7, VCPU_SHARED_SPRG5, r11) @@ -448,6 +449,7 @@
> > _GLOBAL(kvmppc_resume_host)
> >   #ifdef CONFIG_64BIT
> > mtspr   SPRN_SPRG_VDSO_WRITE, r3
> >   #endif
> > +   PPC_STD(r5, VCPU_SPRG9, r4)
> > PPC_STD(r8, VCPU_SHARED_SPRG6, r11)
> > mfxer   r3
> > PPC_STD(r9, VCPU_SHARED_SPRG7, r11) @@ -682,7 +684,9 @@
> > lightweight_exit:
> > mtspr   SPRN_SPRG5W, r6
> > PPC_LD(r8, VCPU_SHARED_SPRG7, r11)
> > mtspr   SPRN_SPRG6W, r7
> > +   PPC_LD(r5, VCPU_SPRG9, r4)
> > mtspr   SPRN_SPRG7W, r8
> > +   mtspr   SPRN_SPRG9, r5
> >
> > /* Load some guest volatiles. */
> > PPC_LL  r3, VCPU_LR(r4)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND PATCH v7 3/4] arm: dirty log write protect management support

2014-07-17 Thread Mario Smarduch
Hi Christoffer,
   Just back from holiday - a short plan to resume work.

- move VM tlb flush and kvm log functions to generic, per Paolo's
comments use Kconfig approach
- update other architectures make sure they compile
- Keep it ARMv7 for now

Get maintainers to test the branch.

In parallel add dirty log support to ARMv8, to test I would
add a QEMU monitor function to validate general operation.

Your thoughts?

Thanks,
  Mario

On 07/03/2014 08:04 AM, Christoffer Dall wrote:
> On Tue, Jun 17, 2014 at 06:41:52PM -0700, Mario Smarduch wrote:
>> On 06/11/2014 12:03 AM, Christoffer Dall wrote:
>>

 There is also the issue of kvm_flush_remote_tlbs(), that's also weak,
 the generic one is using IPIs. Since it's only used in mmu.c maybe make 
 this one static.

>>> So I don't see a lot of use of weak symbols in kvm_main.c (actually on
>>> kvmarm/next I don't see any), but we do want to share code when more
>>> than one architecture implements something in the exact same way, like
>>> it seems x86 and ARM is doing here for this particular function.
>>>
>>> I think the KVM scheme is usually to check for some define, like:
>>>
>>> #ifdef KVM_ARCH_HAVE_GET_DIRTY_LOG
>>> ret = kvm_arch_get_dirty_log(...);
>>> #else
>>> ret = kvm_get_dirty_log(...);
>>> #endif
>>>
>>> but Paolo may have a more informed oppinion of how to deal with these.
>>>
>>> Thanks,
>>> -Christoffer
>>>
>>
>>  
>> One approach I'm trying looking at the code in kvm_main().
>> This approach applies more to selecting features as opposed to
>> selecting generic vs architecture specific functions.
>>
>> 1.-
>>  - add to 'virt/kvm/Kconfig'
>> config HAVE_KVM_ARCH_TLB_FLUSH_ALL
>>bool
>>
>> config HAVE_KVM_ARCH_DIRTY_LOG
>>bool
>> 2.--
>> For ARM and later ARM64 add to 'arch/arm[64]/kvm/Kconfig'
>> config KVM
>> bool "Kernel-based Virtual Machine (KVM) support"
>> ...
>> select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>> ..
>>
>> Not for HAVE_KVM_ARCH_DIRTY_LOG given it's shared with x86,
>> but would need to do it for every other architecture that
>> does not share it (except initially for arm64 since it
>> will use the variant that returns -EINVAL until feature
>> is supported)
>>
>> 3--
>> In kvm_main.c would have something like
>>
>> void kvm_flush_remote_tlbs(struct kvm *kvm)
>> {
>> #ifdef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
>> kvm_arch_flush_remote_tlbs(kvm);
>> #else
>> long dirty_count = kvm->tlbs_dirty;
>>
>> smp_mb();
>> if (make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
>> ++kvm->stat.remote_tlb_flush;
>> cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
>> #endif
>> }
>>
>> Then add void kvm_flush_remote_tlbs(struct kvm *kvm) definition
>> to arm kvm_host.h. Define the function in this case mmu.c
>>
>> For the dirty log function
>> int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
>> struct kvm_dirty_log *log)
>> {
>> #ifdef CONFIG_HAVE_KVM_ARCH_DIRTY_LOG
>> kvm_arch_vm_ioctl_get_dirty_log(kvm, log);
>> #else
>> int r;
>> struct kvm_memory_slot *memslot;
>> unsigned long n, i;
>> unsigned long *dirty_bitmap;
>> unsigned long *dirty_bitmap_buffer;
>> bool is_dirty = false;
>>  ...
>>
>> But then you have to go into every architecture and define the
>> kvm_arch_vm_...() variant.
>>
>> Is this the right way to go? Or is there a simpler way?
>>
> Hmmm, I'm really not an expert in the 'established procedures' for what
> to put in config files etc., but here's my basic take:
> 
> a) you wouldn't put a config option in Kconfig unless it's comething
> that's actually configurable or some generic feature/subsystem that
> should only be enabled if hardware has certain capabilities or other
> config options enabled.
> 
> b) this seems entirely an implementation issue and not depending on
> anything users should select.
> 
> c) therefore, I think it's either a question of always having an
> arch-specific implementation that you probe for its return value or you
> have some sort of define in the header files for the
> arch/X/include/asm/kvm_host.h to control what you need.
> 
> -Christoffer
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit

2014-07-17 Thread Alexander Graf


On 16.07.14 07:49, Bharat Bhushan wrote:

SPRN_SPRG is used by debug interrupt handler, so this is required
for debug support.

Signed-off-by: Bharat Bhushan 
---
  arch/powerpc/include/asm/kvm_host.h   | 1 +
  arch/powerpc/kernel/asm-offsets.c | 1 +
  arch/powerpc/kvm/bookehv_interrupts.S | 4 
  3 files changed, 6 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 372b977..f9e94ed 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -588,6 +588,7 @@ struct kvm_vcpu_arch {
u32 mmucfg;
u32 eptcfg;
u32 epr;
+   u32 sprg9;


u32? really?


Alex


u32 pwrmgtcr0;
u32 crit_save;
/* guest debug registers*/
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 17ffcb4..ab9ae04 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -668,6 +668,7 @@ int main(void)
DEFINE(VCPU_LR, offsetof(struct kvm_vcpu, arch.lr));
DEFINE(VCPU_CTR, offsetof(struct kvm_vcpu, arch.ctr));
DEFINE(VCPU_PC, offsetof(struct kvm_vcpu, arch.pc));
+   DEFINE(VCPU_SPRG9, offsetof(struct kvm_vcpu, arch.sprg9));
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index a1712b8..f45da85 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -441,6 +441,7 @@ _GLOBAL(kvmppc_resume_host)
  #ifdef CONFIG_64BIT
PPC_LL  r3, PACA_SPRG_VDSO(r13)
  #endif
+   mfspr   r5, SPRN_SPRG9
PPC_STD(r6, VCPU_SHARED_SPRG4, r11)
mfspr   r8, SPRN_SPRG6
PPC_STD(r7, VCPU_SHARED_SPRG5, r11)
@@ -448,6 +449,7 @@ _GLOBAL(kvmppc_resume_host)
  #ifdef CONFIG_64BIT
mtspr   SPRN_SPRG_VDSO_WRITE, r3
  #endif
+   PPC_STD(r5, VCPU_SPRG9, r4)
PPC_STD(r8, VCPU_SHARED_SPRG6, r11)
mfxer   r3
PPC_STD(r9, VCPU_SHARED_SPRG7, r11)
@@ -682,7 +684,9 @@ lightweight_exit:
mtspr   SPRN_SPRG5W, r6
PPC_LD(r8, VCPU_SHARED_SPRG7, r11)
mtspr   SPRN_SPRG6W, r7
+   PPC_LD(r5, VCPU_SPRG9, r4)
mtspr   SPRN_SPRG7W, r8
+   mtspr   SPRN_SPRG9, r5
  
  	/* Load some guest volatiles. */

PPC_LL  r3, VCPU_LR(r4)


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: ppc: booke: Restore SPRG3 when entering guest

2014-07-17 Thread Alexander Graf


On 16.07.14 08:02, Bharat Bhushan wrote:

SPRG3 is guest accessible and SPRG3 can be clobbered by host
or another guest, So this need to be restored when loading
guest state.

Signed-off-by: Bharat Bhushan 
---
  arch/powerpc/kvm/booke_interrupts.S | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/booke_interrupts.S 
b/arch/powerpc/kvm/booke_interrupts.S
index 2c6deb5ef..0d3403f 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -459,6 +459,8 @@ lightweight_exit:
 * written directly to the shared area, so we
 * need to reload them here with the guest's values.
 */
+   PPC_LD(r3, VCPU_SHARED_SPRG3, r5)
+   mtspr   SPRN_SPRG3, r3


We also need to restore it when resuming the host, no?


Alex


PPC_LD(r3, VCPU_SHARED_SPRG4, r5)
mtspr   SPRN_SPRG4W, r3
PPC_LD(r3, VCPU_SHARED_SPRG5, r5)


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND PATCH v7 3/4] arm: dirty log write protect management support

2014-07-17 Thread Mario Smarduch
On 07/04/2014 09:29 AM, Paolo Bonzini wrote:
> Il 03/07/2014 17:04, Christoffer Dall ha scritto:
>> Hmmm, I'm really not an expert in the 'established procedures' for what
>> to put in config files etc., but here's my basic take:
>>
>> a) you wouldn't put a config option in Kconfig unless it's comething
>> that's actually configurable or some generic feature/subsystem that
>> should only be enabled if hardware has certain capabilities or other
>> config options enabled.
>>
>> b) this seems entirely an implementation issue and not depending on
>> anything users should select.
> 
> Actually I think Mario's idea is just fine.  Non-user-accessible Kconfig
> symbols are used a lot to invoke an #ifdef elsewhere in the code;
> compare this with his proposal is a bit different but not too much.
> 
> Sometimes #defines are used, sometimes Kconfig symbols, but the idea is
> the same.
> 
> Paolo

Hi Paolo,
  thanks for your feedback. I forgot to add that I tried define 
ARCH_HAVE_... approach but checkpatch rejected it and insisted
on Kconfig.

Thanks,
- Mario
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 0/5] Read guest last instruction from kvmppc_get_last_inst()

2014-07-17 Thread Alexander Graf


On 17.07.14 13:22, Mihai Caraman wrote:

Read guest last instruction from kvmppc_get_last_inst() allowing the function
to fail in order to emulate again. On bookehv architecture search for
the physical address and kmap it, instead of using Load External PID (lwepx)
instruction. This fixes an infinite loop caused by lwepx's data TLB miss
exception handled in the host and the TODO for execute-but-not-read entries
and TLB eviction.


Looks very good apart from minor nits. Please let me know whether you 
can send a final patch set fixing those or you would like me to fix it 
up while applying them.



Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 4/5] KVM: PPC: Alow kvmppc_get_last_inst() to fail

2014-07-17 Thread Alexander Graf


On 17.07.14 13:22, Mihai Caraman wrote:

On book3e, guest last instruction is read on the exit path using load
external pid (lwepx) dedicated instruction. This load operation may fail
due to TLB eviction and execute-but-not-read entries.

This patch lay down the path for an alternative solution to read the guest
last instruction, by allowing kvmppc_get_lat_inst() function to fail.
Architecture specific implmentations of kvmppc_load_last_inst() may read
last guest instruction and instruct the emulation layer to re-execute the
guest in case of failure.

Make kvmppc_get_last_inst() definition common between architectures.

Signed-off-by: Mihai Caraman 
---
v5
  - don't swap when load fail
  - convert the return value space of kvmppc_ld()

v4:
  - these changes compile on book3s, please validate the functionality and
do the necessary adaptations!
  - common declaration and enum for kvmppc_load_last_inst()
  - remove kvmppc_read_inst() in a preceding patch

v3:
  - rework patch description
  - add common definition for kvmppc_get_last_inst()
  - check return values in book3s code

v2:
  - integrated kvmppc_get_last_inst() in book3s code and checked build
  - addressed cosmetic feedback

  arch/powerpc/include/asm/kvm_book3s.h| 26 -
  arch/powerpc/include/asm/kvm_booke.h |  5 ---
  arch/powerpc/include/asm/kvm_ppc.h   | 25 +
  arch/powerpc/kvm/book3s.c| 17 +
  arch/powerpc/kvm/book3s_64_mmu_hv.c  | 17 +++--
  arch/powerpc/kvm/book3s_paired_singles.c | 38 ---
  arch/powerpc/kvm/book3s_pr.c | 63 ++--
  arch/powerpc/kvm/booke.c |  3 ++
  arch/powerpc/kvm/e500_mmu_host.c |  6 +++
  arch/powerpc/kvm/emulate.c   | 18 ++---
  arch/powerpc/kvm/powerpc.c   | 11 +-
  11 files changed, 144 insertions(+), 85 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 20fb6f2..a86ca65 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -276,32 +276,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu 
*vcpu)
return (kvmppc_get_msr(vcpu) & MSR_LE) != (MSR_KERNEL & MSR_LE);
  }
  
-static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc)

-{
-   /* Load the instruction manually if it failed to do so in the
-* exit path */
-   if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
-   kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
-
-   return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
-   vcpu->arch.last_inst;
-}
-
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-   return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
-}
-
-/*
- * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
- * Because the sc instruction sets SRR0 to point to the following
- * instruction, we have to fetch from pc - 4.
- */
-static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
-{
-   return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
-}
-
  static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
  {
return vcpu->arch.fault_dar;
diff --git a/arch/powerpc/include/asm/kvm_booke.h 
b/arch/powerpc/include/asm/kvm_booke.h
index c7aed61..cbb1990 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu 
*vcpu)
return false;
  }
  
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)

-{
-   return vcpu->arch.last_inst;
-}
-
  static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
  {
vcpu->arch.ctr = val;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e2fd5a1..7f9c634 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -47,6 +47,11 @@ enum emulation_result {
EMULATE_EXIT_USER,/* emulation requires exit to user-space */
  };
  
+enum instruction_type {

+   INST_GENERIC,
+   INST_SC,/* system call */
+};
+
  extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
  extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
  extern void kvmppc_handler_highmem(void);
@@ -62,6 +67,9 @@ extern int kvmppc_handle_store(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
   u64 val, unsigned int bytes,
   int is_default_endian);
  
+extern int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,

+enum instruction_type type, u32 *inst);
+
  extern int kvmppc_emulate_instruction(struct kvm_run *run,
struct kvm_vcpu *vcpu);
  extern int kvmppc_emulate_mmio(struct kvm_ru

Re: [PATCH v2 5/5] kvm, mem-hotplug: Do not pin apic access page in memory.

2014-07-17 Thread Gleb Natapov
On Thu, Jul 17, 2014 at 09:34:20PM +0800, Tang Chen wrote:
> Hi Gleb,
> 
> On 07/15/2014 08:40 PM, Gleb Natapov wrote:
> ..
> >>
> >>And yes, we have the problem you said here. We can migrate the page while L2
> >>vm is running.
> >>So I think we should enforce L2 vm to exit to L1. Right ?
> >>
> >We can request APIC_ACCESS_ADDR reload during L2->L1 vmexit emulation, so
> >if APIC_ACCESS_ADDR changes while L2 is running it will be reloaded for L1 
> >too.
> >
> 
> Sorry, I think I don't quite understand the procedure you are talking about
> here.
> 
> Referring to the code, I think we have three machines: L0(host), L1 and L2.
> And we have two types of vmexit: L2->L1 and L2->L0.  Right ?
> 
> We are now talking about this case: L2 and L1 shares the apic page.
> 
> Using patch 5/5, when apic page is migrated on L0, mmu_notifier will notify
> L1,
> and update L1's VMCS. At this time, we are in L0, not L2. Why cannot we
Using patch 5/5, when apic page is migrated on L0, mmu_notifier will notify
L1 or L2 VMCS depending on which one happens to be running right now.
If it is L1 then L2's VMCS will be updated during vmentry emulation, if it is
L2 we need to request reload during vmexit emulation to make sure L1's VMCS is
updated.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/5] KVM: PPC: Book3s: Remove kvmppc_read_inst() function

2014-07-17 Thread Alexander Graf


On 17.07.14 13:22, Mihai Caraman wrote:

In the context of replacing kvmppc_ld() function calls with a version of
kvmppc_get_last_inst() which allow to fail, Alex Graf suggested this:

"If we get EMULATE_AGAIN, we just have to make sure we go back into the guest.
No need to inject an ISI into  the guest - it'll do that all by itself.
With an error returning kvmppc_get_last_inst we can just use completely
get rid of kvmppc_read_inst() and only use kvmppc_get_last_inst() instead."

As a intermediate step get rid of kvmppc_read_inst() and only use kvmppc_ld()
instead.

Signed-off-by: Mihai Caraman 
---
v5:
  - make paired single emulation the unusual

v4:
  - new patch

  arch/powerpc/kvm/book3s_pr.c | 91 ++--
  1 file changed, 37 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index e40765f..02a983e 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -710,42 +710,6 @@ static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong 
fac)
  #endif
  }
  
-static int kvmppc_read_inst(struct kvm_vcpu *vcpu)

-{
-   ulong srr0 = kvmppc_get_pc(vcpu);
-   u32 last_inst = kvmppc_get_last_inst(vcpu);
-   int ret;
-
-   ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-   if (ret == -ENOENT) {
-   ulong msr = kvmppc_get_msr(vcpu);
-
-   msr = kvmppc_set_field(msr, 33, 33, 1);
-   msr = kvmppc_set_field(msr, 34, 36, 0);
-   msr = kvmppc_set_field(msr, 42, 47, 0);
-   kvmppc_set_msr_fast(vcpu, msr);
-   kvmppc_book3s_queue_irqprio(vcpu, 
BOOK3S_INTERRUPT_INST_STORAGE);
-   return EMULATE_AGAIN;
-   }
-
-   return EMULATE_DONE;
-}
-
-static int kvmppc_check_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr)
-{
-
-   /* Need to do paired single emulation? */
-   if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
-   return EMULATE_DONE;
-
-   /* Read out the instruction */
-   if (kvmppc_read_inst(vcpu) == EMULATE_DONE)
-   /* Need to emulate */
-   return EMULATE_FAIL;
-
-   return EMULATE_AGAIN;
-}
-
  /* Handle external providers (FPU, Altivec, VSX) */
  static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 ulong msr)
@@ -1149,31 +1113,49 @@ program_interrupt:
case BOOK3S_INTERRUPT_VSX:
{
int ext_msr = 0;
+   int emul;
+   ulong pc;
+   u32 last_inst;
+
+   if (vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE) {
+   /* Emulate the instruction */


If the flag is set we do paired single instruction emulation.


+
+   pc = kvmppc_get_pc(vcpu);
+   last_inst = kvmppc_get_last_inst(vcpu);
+   emul = kvmppc_ld(vcpu, &pc, sizeof(u32), &last_inst,
+false);
+   if (emul == EMULATE_DONE)
+   goto program_interrupt;
+   else
+   r = RESUME_GUEST;
+   } else {
+   /* Do paired single emulation */


otherwise we deflect the interrupt into our guest.


Alex

  
-		switch (exit_nr) {

-   case BOOK3S_INTERRUPT_FP_UNAVAIL: ext_msr = MSR_FP;  break;
-   case BOOK3S_INTERRUPT_ALTIVEC:ext_msr = MSR_VEC; break;
-   case BOOK3S_INTERRUPT_VSX:ext_msr = MSR_VSX; break;
-   }
+   switch (exit_nr) {
+   case BOOK3S_INTERRUPT_FP_UNAVAIL:
+   ext_msr = MSR_FP;
+   break;
+
+   case BOOK3S_INTERRUPT_ALTIVEC:
+   ext_msr = MSR_VEC;
+   break;
+
+   case BOOK3S_INTERRUPT_VSX:
+   ext_msr = MSR_VSX;
+   break;
+   }
  
-		switch (kvmppc_check_ext(vcpu, exit_nr)) {

-   case EMULATE_DONE:
-   /* everything ok - let's enable the ext */
r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr);
-   break;
-   case EMULATE_FAIL:
-   /* we need to emulate this instruction */
-   goto program_interrupt;
-   break;
-   default:
-   /* nothing to worry about - go again */
-   break;
}
break;
}
case BOOK3S_INTERRUPT_ALIGNMENT:
-   if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
-   u32 last_inst = kvmppc_get_last_inst(vcpu);
+   {
+   ulong pc = kvmppc_get_pc(vcpu);
+   u32 last_i

RE: [PATCH 6/6 v2] kvm: ppc: Add SPRN_EPR get helper function

2014-07-17 Thread mihai.cara...@freescale.com
> -Original Message-
> From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-
> ow...@vger.kernel.org] On Behalf Of Bharat Bhushan
> Sent: Thursday, July 17, 2014 2:32 PM
> To: ag...@suse.de; kvm-...@vger.kernel.org
> Cc: kvm@vger.kernel.org; Wood Scott-B07421; Yoder Stuart-B08248; Bhushan
> Bharat-R65777
> Subject: [PATCH 6/6 v2] kvm: ppc: Add SPRN_EPR get helper function
> 
> kvmppc_set_epr() is already defined in asm/kvm_ppc.h, So
> rename and move get_epr helper function to same file.
> 
> Signed-off-by: Bharat Bhushan 
> ---
> v1->v2
>  - vcpu->arch.epr under CONFIG_BOOKE
> 
>  arch/powerpc/include/asm/kvm_ppc.h | 10 ++
>  arch/powerpc/kvm/booke.c   | 11 +--
>  2 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h
> b/arch/powerpc/include/asm/kvm_ppc.h
> index 58a5202..14e2d87 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -395,6 +395,16 @@ static inline int kvmppc_xics_hcall(struct kvm_vcpu
> *vcpu, u32 cmd)
>   { return 0; }
>  #endif
> 
> +static inline unsigned long kvmppc_get_epr(struct kvm_vcpu *vcpu)
> +{
> +#ifdef CONFIG_KVM_BOOKE_HV
> + return mfspr(SPRN_GEPR);
> +#elif defined(CONFIG_BOOKE)
> + return vcpu->arch.epr;
> +#endif
> + return 0;
> +}
> +
>  static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
>  {
>  #ifdef CONFIG_KVM_BOOKE_HV

EPR is a BookE resource, why don't we move the helpers to kvm_booke.h?

-Mike
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6 v2] Cleanup and fixes related to helper SPRN_XX functions

2014-07-17 Thread Alexander Graf


On 17.07.14 13:31, Bharat Bhushan wrote:

These are primarily the cleanup patches, where shared struct get/set
helper function are enhanced to handle shadow registers and uses those
helper functions.
Eventually this also fix SRR0/1 synchronization from userspace

v1->v2
  - Compilation fix for book3s


Thanks, applied all to kvm-ppc-queue.

(resend without triple-x again)


Alex



Bharat Bhushan (6):
   kvm: ppc: bookehv: Added wrapper macros for shadow registers
   kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1
   kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR
   kvm: ppc: booke: Add shared struct helpers of SPRN_ESR
   kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7
   kvm: ppc: Add SPRN_EPR get helper function

  arch/powerpc/include/asm/kvm_ppc.h |  55 ---
  arch/powerpc/kvm/booke.c   | 108 ++---
  arch/powerpc/kvm/booke_emulate.c   |   8 +--
  3 files changed, 80 insertions(+), 91 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/6 v2] kvm: ppc: Add SPRN_EPR get helper function

2014-07-17 Thread Alexander Graf


On 17.07.14 13:31, Bharat Bhushan wrote:

kvmppc_set_epr() is already defined in asm/kvm_ppc.h, So
rename and move get_epr helper function to same file.

Signed-off-by: Bharat Bhushan 
---
v1->v2
  - vcpu->arch.epr under CONFIG_BOOKE

  arch/powerpc/include/asm/kvm_ppc.h | 10 ++
  arch/powerpc/kvm/booke.c   | 11 +--
  2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 58a5202..14e2d87 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -395,6 +395,16 @@ static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, 
u32 cmd)
{ return 0; }
  #endif
  
+static inline unsigned long kvmppc_get_epr(struct kvm_vcpu *vcpu)

+{
+#ifdef CONFIG_KVM_BOOKE_HV
+   return mfspr(SPRN_GEPR);
+#elif defined(CONFIG_BOOKE)
+   return vcpu->arch.epr;
+#endif
+   return 0;


Let me change that to

#else
  return 0;
#endif


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 5/5] kvm, mem-hotplug: Do not pin apic access page in memory.

2014-07-17 Thread Tang Chen

Hi Gleb,

On 07/15/2014 08:40 PM, Gleb Natapov wrote:
..


And yes, we have the problem you said here. We can migrate the page while L2
vm is running.
So I think we should enforce L2 vm to exit to L1. Right ?


We can request APIC_ACCESS_ADDR reload during L2->L1 vmexit emulation, so
if APIC_ACCESS_ADDR changes while L2 is running it will be reloaded for L1 too.



Sorry, I think I don't quite understand the procedure you are talking 
about here.


Referring to the code, I think we have three machines: L0(host), L1 and L2.
And we have two types of vmexit: L2->L1 and L2->L0.  Right ?

We are now talking about this case: L2 and L1 shares the apic page.

Using patch 5/5, when apic page is migrated on L0, mmu_notifier will 
notify L1,
and update L1's VMCS. At this time, we are in L0, not L2. Why cannot we 
update
the L2's VMCS at the same time ?  Is it because we don't know how many 
L2 vms

there are in L1 ?

And, when will L2->L1 vmexit happen ?  When we enforce L1 to exit to L0 by
calling make_all_cpus_request(), is L2->L1 vmexit triggered automatically ?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread Daniel Borkmann

On 07/17/2014 12:59 AM, H. Peter Anvin wrote:

On 07/16/2014 03:40 PM, Andy Lutomirski wrote:

On Wed, Jul 16, 2014 at 3:13 PM, Andy Lutomirski  wrote:

My personal preference is to defer this until some user shows up.  I
think that even this would be too complicated for KASLR, which is the
only extremely early-boot user that I found.

Hmm.  Does the prandom stuff want to use this?


prandom isn't even using rdrand.  I'd suggest fixing this separately,
or even just waiting until someone goes and deletes prandom.


prandom is exactly the opposite; it is designed for when we need
possibly low quality random numbers very quickly.  RDRAND is actually
too slow.


Yep, prandom() is quite heavily used in the network stack where it's
traded for speed.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] random,x86: Add arch_get_slow_rng_u64

2014-07-17 Thread Theodore Ts'o
On Wed, Jul 16, 2014 at 09:55:15PM -0700, H. Peter Anvin wrote:
> On 07/16/2014 05:03 PM, Andy Lutomirski wrote:
> >>
> > I meant that prandom isn't using rdrand for early seeding.
> > 
> 
> We should probably fix that.

It wouldn't hurt to explicitly use arch_get_random_long() in prandom,
but it does use get_random_bytes() in early seed, and for CPU's with
RDRAND present, we do use it in init_std_data() in
drivers/char/random.c, so prandom is already getting initialized via
an RNG (which is effectively a DRBG even if it doesn't pass all of
NIST's rules) which is derived from RDRAND.

Cheers,

- Ted

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Wanpeng Li
On Thu, Jul 17, 2014 at 02:04:11PM +0200, Paolo Bonzini wrote:
>Il 17/07/2014 13:28, Paolo Bonzini ha scritto:
>> Il 17/07/2014 13:03, Wanpeng Li ha scritto:
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index 4ae5ad8..a704f71 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -8697,6 +8697,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, 
>>> u32 exit_reason,
>>> if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
>>> && nested_exit_intr_ack_set(vcpu)) {
>>> int irq = kvm_cpu_get_interrupt(vcpu);
>>> +
>>> +   if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>>> +   irq = kvm_lapic_find_highest_irr(vcpu);
>>> WARN_ON(irq < 0);
>>> vmcs12->vm_exit_intr_info = irq |
>>> INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
>> 
>> I wonder if this should be kvm_apic_has_interrupt, so that the PPR
>> register is taken into consideration?
>
>
>And actually, I think the acknowledging should include the three steps to
>set-ISR/update-PPR/clear-IRR.  (With APICv update PPR is not strictly
>necessary, but it doesn't hurt either).
>
>You cannot let the processor do these because it would deliver the interrupt
>through the IDT,  but you still must do it in the hypervisor.
>
>This gives this patch:
>
>diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>index bd0da43..a1ec6a5 100644
>--- a/arch/x86/kvm/irq.c
>+++ b/arch/x86/kvm/irq.c
>@@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
> 
>   vector = kvm_cpu_get_extint(v);
> 
>-  if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
>+  if (vector != -1)
>   return vector;  /* PIC */
> 
>   return kvm_get_apic_interrupt(v);   /* APIC */
>diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>index 3855103..6cbc7af 100644
>--- a/arch/x86/kvm/lapic.c
>+++ b/arch/x86/kvm/lapic.c
>@@ -360,10 +360,20 @@ static inline void apic_clear_irr(int vec, struct 
>kvm_lapic *apic)
> 
> static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
> {
>-  /* Note that we never get here with APIC virtualization enabled.  */
>+  if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR)) {
>+  /*
>+   * With APIC virtualization enabled, all caching is disabled
>+   * because the processor can modify ISR under the hood.  Instead
>+   * just set SVI.
>+   */
>+  if (kvm_apic_vid_enabled(vcpu->kvm)) {
>+  kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
>+  return;
>+  }
> 
>-  if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
>   ++apic->isr_count;
>+  }
>+
>   BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
>   /*
>* ISR (in service register) bit is set when injecting an interrupt.
>@@ -1627,11 +1637,16 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
>   int vector = kvm_apic_has_interrupt(vcpu);
>   struct kvm_lapic *apic = vcpu->arch.apic;
> 
>-  /* Note that we never get here with APIC virtualization enabled.  */
>-
>   if (vector == -1)
>   return -1;
> 
>+  /*
>+   * We get here even with APIC virtualization enabled, if doing
>+   * nested virtualization and L1 runs with the "acknowledge interrupt 
>+   * on exit" mode.  Then we cannot inject the interrupt via RVI,
>+   * because the process would deliver it through the IDT.
>+   */
>+
>   apic_set_isr(vector, apic);
>   apic_update_ppr(apic);
>   apic_clear_irr(vector, apic);
>
>
>I think the right way to do it must be something like this; you cannot
>do it just in nested_vmx_vmexit.  Testing is welcome since I don't have
>easy access to APICv-capable hardware (it would take a few days).

I will test it tomorrow, it's late today for me. ;-)

Regards,
Wanpeng Li 

>
>Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Paolo Bonzini
Il 17/07/2014 13:28, Paolo Bonzini ha scritto:
> Il 17/07/2014 13:03, Wanpeng Li ha scritto:
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 4ae5ad8..a704f71 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -8697,6 +8697,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, 
>> u32 exit_reason,
>>  if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
>>  && nested_exit_intr_ack_set(vcpu)) {
>>  int irq = kvm_cpu_get_interrupt(vcpu);
>> +
>> +if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>> +irq = kvm_lapic_find_highest_irr(vcpu);
>>  WARN_ON(irq < 0);
>>  vmcs12->vm_exit_intr_info = irq |
>>  INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> 
> I wonder if this should be kvm_apic_has_interrupt, so that the PPR
> register is taken into consideration?


And actually, I think the acknowledging should include the three steps to
set-ISR/update-PPR/clear-IRR.  (With APICv update PPR is not strictly
necessary, but it doesn't hurt either).

You cannot let the processor do these because it would deliver the interrupt
through the IDT,  but you still must do it in the hypervisor.

This gives this patch:

diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index bd0da43..a1ec6a5 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
 
vector = kvm_cpu_get_extint(v);
 
-   if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
+   if (vector != -1)
return vector;  /* PIC */
 
return kvm_get_apic_interrupt(v);   /* APIC */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3855103..6cbc7af 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -360,10 +360,20 @@ static inline void apic_clear_irr(int vec, struct 
kvm_lapic *apic)
 
 static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
 {
-   /* Note that we never get here with APIC virtualization enabled.  */
+   if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR)) {
+   /*
+* With APIC virtualization enabled, all caching is disabled
+* because the processor can modify ISR under the hood.  Instead
+* just set SVI.
+*/
+   if (kvm_apic_vid_enabled(vcpu->kvm)) {
+   kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
+   return;
+   }
 
-   if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
++apic->isr_count;
+   }
+
BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
/*
 * ISR (in service register) bit is set when injecting an interrupt.
@@ -1627,11 +1637,16 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
int vector = kvm_apic_has_interrupt(vcpu);
struct kvm_lapic *apic = vcpu->arch.apic;
 
-   /* Note that we never get here with APIC virtualization enabled.  */
-
if (vector == -1)
return -1;
 
+   /*
+* We get here even with APIC virtualization enabled, if doing
+* nested virtualization and L1 runs with the "acknowledge interrupt 
+* on exit" mode.  Then we cannot inject the interrupt via RVI,
+* because the process would deliver it through the IDT.
+*/
+
apic_set_isr(vector, apic);
apic_update_ppr(apic);
apic_clear_irr(vector, apic);


I think the right way to do it must be something like this; you cannot
do it just in nested_vmx_vmexit.  Testing is welcome since I don't have
easy access to APICv-capable hardware (it would take a few days).

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/2] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Wanpeng Li
On Thu, Jul 17, 2014 at 01:31:06PM +0200, Paolo Bonzini wrote:
>Il 17/07/2014 13:03, Wanpeng Li ha scritto:
>>+ /*
>>+  * Fall back to old way to inject the interrupt since there
>>+  * is no vAPIC-v for L2.
>>+  */
>>+ if (vcpu->arch.exception.pending ||
>>+ vcpu->arch.nmi_injected ||
>>+ vcpu->arch.interrupt.pending)
>>+ return;
>
>This is just
>
>   if (kvm_event_needs_reinjection(vcpu))
>   return;
>
>but apart from this the patch is okay.  I'll make the change and
>apply it, thanks.
>

Thanks.

Regards,
Wanpeng Li 

>Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/6 v2] kvm: ppc: Add SPRN_EPR get helper function

2014-07-17 Thread Bharat Bhushan
kvmppc_set_epr() is already defined in asm/kvm_ppc.h, So
rename and move get_epr helper function to same file.

Signed-off-by: Bharat Bhushan 
---
v1->v2
 - vcpu->arch.epr under CONFIG_BOOKE

 arch/powerpc/include/asm/kvm_ppc.h | 10 ++
 arch/powerpc/kvm/booke.c   | 11 +--
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 58a5202..14e2d87 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -395,6 +395,16 @@ static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, 
u32 cmd)
{ return 0; }
 #endif
 
+static inline unsigned long kvmppc_get_epr(struct kvm_vcpu *vcpu)
+{
+#ifdef CONFIG_KVM_BOOKE_HV
+   return mfspr(SPRN_GEPR);
+#elif defined(CONFIG_BOOKE)
+   return vcpu->arch.epr;
+#endif
+   return 0;
+}
+
 static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 9606139..5e9a380 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -302,15 +302,6 @@ static void set_guest_mcsrr(struct kvm_vcpu *vcpu, 
unsigned long srr0, u32 srr1)
vcpu->arch.mcsrr1 = srr1;
 }
 
-static unsigned long get_guest_epr(struct kvm_vcpu *vcpu)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   return mfspr(SPRN_GEPR);
-#else
-   return vcpu->arch.epr;
-#endif
-}
-
 /* Deliver the interrupt of the corresponding priority, if possible. */
 static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
 unsigned int priority)
@@ -1483,7 +1474,7 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, 
struct kvm_one_reg *reg)
val = get_reg_val(reg->id, vcpu->arch.dbg_reg.dac2);
break;
case KVM_REG_PPC_EPR: {
-   u32 epr = get_guest_epr(vcpu);
+   u32 epr = kvmppc_get_epr(vcpu);
val = get_reg_val(reg->id, epr);
break;
}
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 2/5] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1

2014-07-17 Thread Mihai Caraman
Add mising defines MAS0_GET_TLBSEL() and MAS1_GET_TSIZE() for Book3E.

Signed-off-by: Mihai Caraman 
---
v5-v2:
 - no change

 arch/powerpc/include/asm/mmu-book3e.h | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h 
b/arch/powerpc/include/asm/mmu-book3e.h
index 8d24f78..cd4f04a 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -40,9 +40,11 @@
 
 /* MAS registers bit definitions */
 
-#define MAS0_TLBSEL_MASK0x3000
-#define MAS0_TLBSEL_SHIFT   28
-#define MAS0_TLBSEL(x)  (((x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK)
+#define MAS0_TLBSEL_MASK   0x3000
+#define MAS0_TLBSEL_SHIFT  28
+#define MAS0_TLBSEL(x) (((x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK)
+#define MAS0_GET_TLBSEL(mas0)  (((mas0) & MAS0_TLBSEL_MASK) >> \
+   MAS0_TLBSEL_SHIFT)
 #define MAS0_ESEL_MASK 0x0FFF
 #define MAS0_ESEL_SHIFT16
 #define MAS0_ESEL(x)   (((x) << MAS0_ESEL_SHIFT) & MAS0_ESEL_MASK)
@@ -60,6 +62,7 @@
 #define MAS1_TSIZE_MASK0x0f80
 #define MAS1_TSIZE_SHIFT   7
 #define MAS1_TSIZE(x)  (((x) << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK)
+#define MAS1_GET_TSIZE(mas1)   (((mas1) & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT)
 
 #define MAS2_EPN   (~0xFFFUL)
 #define MAS2_X00x0040
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/6 v2] kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7

2014-07-17 Thread Bharat Bhushan
Use kvmppc_set_sprg[0-7]() and kvmppc_get_sprg[0-7]() helper
functions

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kvm/booke.c | 32 
 arch/powerpc/kvm/booke_emulate.c |  8 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 81484d9..9606139 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1258,14 +1258,14 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
regs->srr0 = kvmppc_get_srr0(vcpu);
regs->srr1 = kvmppc_get_srr1(vcpu);
regs->pid = vcpu->arch.pid;
-   regs->sprg0 = vcpu->arch.shared->sprg0;
-   regs->sprg1 = vcpu->arch.shared->sprg1;
-   regs->sprg2 = vcpu->arch.shared->sprg2;
-   regs->sprg3 = vcpu->arch.shared->sprg3;
-   regs->sprg4 = vcpu->arch.shared->sprg4;
-   regs->sprg5 = vcpu->arch.shared->sprg5;
-   regs->sprg6 = vcpu->arch.shared->sprg6;
-   regs->sprg7 = vcpu->arch.shared->sprg7;
+   regs->sprg0 = kvmppc_get_sprg0(vcpu);
+   regs->sprg1 = kvmppc_get_sprg1(vcpu);
+   regs->sprg2 = kvmppc_get_sprg2(vcpu);
+   regs->sprg3 = kvmppc_get_sprg3(vcpu);
+   regs->sprg4 = kvmppc_get_sprg4(vcpu);
+   regs->sprg5 = kvmppc_get_sprg5(vcpu);
+   regs->sprg6 = kvmppc_get_sprg6(vcpu);
+   regs->sprg7 = kvmppc_get_sprg7(vcpu);
 
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
@@ -1286,14 +1286,14 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
kvmppc_set_srr0(vcpu, regs->srr0);
kvmppc_set_srr1(vcpu, regs->srr1);
kvmppc_set_pid(vcpu, regs->pid);
-   vcpu->arch.shared->sprg0 = regs->sprg0;
-   vcpu->arch.shared->sprg1 = regs->sprg1;
-   vcpu->arch.shared->sprg2 = regs->sprg2;
-   vcpu->arch.shared->sprg3 = regs->sprg3;
-   vcpu->arch.shared->sprg4 = regs->sprg4;
-   vcpu->arch.shared->sprg5 = regs->sprg5;
-   vcpu->arch.shared->sprg6 = regs->sprg6;
-   vcpu->arch.shared->sprg7 = regs->sprg7;
+   kvmppc_set_sprg0(vcpu, regs->sprg0);
+   kvmppc_set_sprg1(vcpu, regs->sprg1);
+   kvmppc_set_sprg2(vcpu, regs->sprg2);
+   kvmppc_set_sprg3(vcpu, regs->sprg3);
+   kvmppc_set_sprg4(vcpu, regs->sprg4);
+   kvmppc_set_sprg5(vcpu, regs->sprg5);
+   kvmppc_set_sprg6(vcpu, regs->sprg6);
+   kvmppc_set_sprg7(vcpu, regs->sprg7);
 
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
kvmppc_set_gpr(vcpu, i, regs->gpr[i]);
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index 3d143fe..3330faf 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -310,16 +310,16 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int 
sprn, ulong spr_val)
 * guest (PR-mode only).
 */
case SPRN_SPRG4:
-   vcpu->arch.shared->sprg4 = spr_val;
+   kvmppc_set_sprg4(vcpu, spr_val);
break;
case SPRN_SPRG5:
-   vcpu->arch.shared->sprg5 = spr_val;
+   kvmppc_set_sprg5(vcpu, spr_val);
break;
case SPRN_SPRG6:
-   vcpu->arch.shared->sprg6 = spr_val;
+   kvmppc_set_sprg6(vcpu, spr_val);
break;
case SPRN_SPRG7:
-   vcpu->arch.shared->sprg7 = spr_val;
+   kvmppc_set_sprg7(vcpu, spr_val);
break;
 
case SPRN_IVPR:
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/6 v2] kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR

2014-07-17 Thread Bharat Bhushan
Uses kvmppc_set_dar() and kvmppc_get_dar() helper functions

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kvm/booke.c | 24 +++-
 1 file changed, 3 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 096998a..20296c8 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -302,24 +302,6 @@ static void set_guest_mcsrr(struct kvm_vcpu *vcpu, 
unsigned long srr0, u32 srr1)
vcpu->arch.mcsrr1 = srr1;
 }
 
-static unsigned long get_guest_dear(struct kvm_vcpu *vcpu)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   return mfspr(SPRN_GDEAR);
-#else
-   return vcpu->arch.shared->dar;
-#endif
-}
-
-static void set_guest_dear(struct kvm_vcpu *vcpu, unsigned long dear)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   mtspr(SPRN_GDEAR, dear);
-#else
-   vcpu->arch.shared->dar = dear;
-#endif
-}
-
 static unsigned long get_guest_esr(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
@@ -461,7 +443,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
if (update_esr == true)
set_guest_esr(vcpu, vcpu->arch.queued_esr);
if (update_dear == true)
-   set_guest_dear(vcpu, vcpu->arch.queued_dear);
+   kvmppc_set_dar(vcpu, vcpu->arch.queued_dear);
if (update_epr == true) {
if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
@@ -1348,7 +1330,7 @@ static void get_sregs_base(struct kvm_vcpu *vcpu,
sregs->u.e.csrr1 = vcpu->arch.csrr1;
sregs->u.e.mcsr = vcpu->arch.mcsr;
sregs->u.e.esr = get_guest_esr(vcpu);
-   sregs->u.e.dear = get_guest_dear(vcpu);
+   sregs->u.e.dear = kvmppc_get_dar(vcpu);
sregs->u.e.tsr = vcpu->arch.tsr;
sregs->u.e.tcr = vcpu->arch.tcr;
sregs->u.e.dec = kvmppc_get_dec(vcpu, tb);
@@ -1366,7 +1348,7 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
vcpu->arch.csrr1 = sregs->u.e.csrr1;
vcpu->arch.mcsr = sregs->u.e.mcsr;
set_guest_esr(vcpu, sregs->u.e.esr);
-   set_guest_dear(vcpu, sregs->u.e.dear);
+   kvmppc_set_dar(vcpu, sregs->u.e.dear);
vcpu->arch.vrsave = sregs->u.e.vrsave;
kvmppc_set_tcr(vcpu, sregs->u.e.tcr);
 
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/6 v2] kvm: ppc: bookehv: Added wrapper macros for shadow registers

2014-07-17 Thread Bharat Bhushan
There are shadow registers like, GSPRG[0-3], GSRR0, GSRR1 etc on
BOOKE-HV and these shadow registers are guest accessible.
So these shadow registers needs to be updated on BOOKE-HV.
This patch adds new macro for get/set helper of shadow register .

Signed-off-by: Bharat Bhushan 
---
v1->v2
 - Fix compilation for book3s (separate macro etc)

 arch/powerpc/include/asm/kvm_ppc.h | 44 +++---
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index f3f7611..7646994 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -475,8 +475,20 @@ static inline bool kvmppc_shared_big_endian(struct 
kvm_vcpu *vcpu)
 #endif
 }
 
+#define SPRNG_WRAPPER_GET(reg, e500hv_spr) \
+static inline ulong kvmppc_get_##reg(struct kvm_vcpu *vcpu)\
+{  \
+   return mfspr(e500hv_spr);   \
+}  \
+
+#define SPRNG_WRAPPER_SET(reg, e500hv_spr) \
+static inline void kvmppc_set_##reg(struct kvm_vcpu *vcpu, ulong val)  \
+{  \
+   mtspr(e500hv_spr, val); \
+}  \
+
 #define SHARED_WRAPPER_GET(reg, size)  \
-static inline u##size kvmppc_get_##reg(struct kvm_vcpu *vcpu)  \
+static inline u##size kvmppc_get_##reg(struct kvm_vcpu *vcpu)  \
 {  \
if (kvmppc_shared_big_endian(vcpu)) \
   return be##size##_to_cpu(vcpu->arch.shared->reg);\
@@ -497,14 +509,30 @@ static inline void kvmppc_set_##reg(struct kvm_vcpu 
*vcpu, u##size val)   \
SHARED_WRAPPER_GET(reg, size)   \
SHARED_WRAPPER_SET(reg, size)   \
 
+#define SPRNG_WRAPPER(reg, e500hv_spr) \
+   SPRNG_WRAPPER_GET(reg, e500hv_spr)  \
+   SPRNG_WRAPPER_SET(reg, e500hv_spr)  \
+
+#ifdef CONFIG_KVM_BOOKE_HV
+
+#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
+   SPRNG_WRAPPER(reg, e500hv_spr)  \
+
+#else
+
+#define SHARED_SPRNG_WRAPPER(reg, size, e500hv_spr)\
+   SHARED_WRAPPER(reg, size)   \
+
+#endif
+
 SHARED_WRAPPER(critical, 64)
-SHARED_WRAPPER(sprg0, 64)
-SHARED_WRAPPER(sprg1, 64)
-SHARED_WRAPPER(sprg2, 64)
-SHARED_WRAPPER(sprg3, 64)
-SHARED_WRAPPER(srr0, 64)
-SHARED_WRAPPER(srr1, 64)
-SHARED_WRAPPER(dar, 64)
+SHARED_SPRNG_WRAPPER(sprg0, 64, SPRN_GSPRG0)
+SHARED_SPRNG_WRAPPER(sprg1, 64, SPRN_GSPRG1)
+SHARED_SPRNG_WRAPPER(sprg2, 64, SPRN_GSPRG2)
+SHARED_SPRNG_WRAPPER(sprg3, 64, SPRN_GSPRG3)
+SHARED_SPRNG_WRAPPER(srr0, 64, SPRN_GSRR0)
+SHARED_SPRNG_WRAPPER(srr1, 64, SPRN_GSRR1)
+SHARED_SPRNG_WRAPPER(dar, 64, SPRN_GDEAR)
 SHARED_WRAPPER_GET(msr, 64)
 static inline void kvmppc_set_msr_fast(struct kvm_vcpu *vcpu, u64 val)
 {
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/6 v2] kvm: ppc: booke: Add shared struct helpers of SPRN_ESR

2014-07-17 Thread Bharat Bhushan
Add and use kvmppc_set_esr() and kvmppc_get_esr() helper functions

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/include/asm/kvm_ppc.h |  1 +
 arch/powerpc/kvm/booke.c   | 24 +++-
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 7646994..58a5202 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -533,6 +533,7 @@ SHARED_SPRNG_WRAPPER(sprg3, 64, SPRN_GSPRG3)
 SHARED_SPRNG_WRAPPER(srr0, 64, SPRN_GSRR0)
 SHARED_SPRNG_WRAPPER(srr1, 64, SPRN_GSRR1)
 SHARED_SPRNG_WRAPPER(dar, 64, SPRN_GDEAR)
+SHARED_SPRNG_WRAPPER(esr, 64, SPRN_GESR)
 SHARED_WRAPPER_GET(msr, 64)
 static inline void kvmppc_set_msr_fast(struct kvm_vcpu *vcpu, u64 val)
 {
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 20296c8..81484d9 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -302,24 +302,6 @@ static void set_guest_mcsrr(struct kvm_vcpu *vcpu, 
unsigned long srr0, u32 srr1)
vcpu->arch.mcsrr1 = srr1;
 }
 
-static unsigned long get_guest_esr(struct kvm_vcpu *vcpu)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   return mfspr(SPRN_GESR);
-#else
-   return vcpu->arch.shared->esr;
-#endif
-}
-
-static void set_guest_esr(struct kvm_vcpu *vcpu, u32 esr)
-{
-#ifdef CONFIG_KVM_BOOKE_HV
-   mtspr(SPRN_GESR, esr);
-#else
-   vcpu->arch.shared->esr = esr;
-#endif
-}
-
 static unsigned long get_guest_epr(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_KVM_BOOKE_HV
@@ -441,7 +423,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu 
*vcpu,
 
vcpu->arch.pc = vcpu->arch.ivpr | vcpu->arch.ivor[priority];
if (update_esr == true)
-   set_guest_esr(vcpu, vcpu->arch.queued_esr);
+   kvmppc_set_esr(vcpu, vcpu->arch.queued_esr);
if (update_dear == true)
kvmppc_set_dar(vcpu, vcpu->arch.queued_dear);
if (update_epr == true) {
@@ -1329,7 +1311,7 @@ static void get_sregs_base(struct kvm_vcpu *vcpu,
sregs->u.e.csrr0 = vcpu->arch.csrr0;
sregs->u.e.csrr1 = vcpu->arch.csrr1;
sregs->u.e.mcsr = vcpu->arch.mcsr;
-   sregs->u.e.esr = get_guest_esr(vcpu);
+   sregs->u.e.esr = kvmppc_get_esr(vcpu);
sregs->u.e.dear = kvmppc_get_dar(vcpu);
sregs->u.e.tsr = vcpu->arch.tsr;
sregs->u.e.tcr = vcpu->arch.tcr;
@@ -1347,7 +1329,7 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
vcpu->arch.csrr0 = sregs->u.e.csrr0;
vcpu->arch.csrr1 = sregs->u.e.csrr1;
vcpu->arch.mcsr = sregs->u.e.mcsr;
-   set_guest_esr(vcpu, sregs->u.e.esr);
+   kvmppc_set_esr(vcpu, sregs->u.e.esr);
kvmppc_set_dar(vcpu, sregs->u.e.dear);
vcpu->arch.vrsave = sregs->u.e.vrsave;
kvmppc_set_tcr(vcpu, sregs->u.e.tcr);
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/6 v2] kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1

2014-07-17 Thread Bharat Bhushan
Use kvmppc_set_srr0/srr1() and kvmppc_get_srr0/srr1() helper functions

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kvm/booke.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index c2471ed..096998a 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -276,13 +276,8 @@ void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu)
 
 static void set_guest_srr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 srr1)
 {
-#ifdef CONFIG_KVM_BOOKE_HV
-   mtspr(SPRN_GSRR0, srr0);
-   mtspr(SPRN_GSRR1, srr1);
-#else
-   vcpu->arch.shared->srr0 = srr0;
-   vcpu->arch.shared->srr1 = srr1;
-#endif
+   kvmppc_set_srr0(vcpu, srr0);
+   kvmppc_set_srr1(vcpu, srr1);
 }
 
 static void set_guest_csrr(struct kvm_vcpu *vcpu, unsigned long srr0, u32 srr1)
@@ -1296,8 +1291,8 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
regs->lr = vcpu->arch.lr;
regs->xer = kvmppc_get_xer(vcpu);
regs->msr = vcpu->arch.shared->msr;
-   regs->srr0 = vcpu->arch.shared->srr0;
-   regs->srr1 = vcpu->arch.shared->srr1;
+   regs->srr0 = kvmppc_get_srr0(vcpu);
+   regs->srr1 = kvmppc_get_srr1(vcpu);
regs->pid = vcpu->arch.pid;
regs->sprg0 = vcpu->arch.shared->sprg0;
regs->sprg1 = vcpu->arch.shared->sprg1;
@@ -1324,8 +1319,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
struct kvm_regs *regs)
vcpu->arch.lr = regs->lr;
kvmppc_set_xer(vcpu, regs->xer);
kvmppc_set_msr(vcpu, regs->msr);
-   vcpu->arch.shared->srr0 = regs->srr0;
-   vcpu->arch.shared->srr1 = regs->srr1;
+   kvmppc_set_srr0(vcpu, regs->srr0);
+   kvmppc_set_srr1(vcpu, regs->srr1);
kvmppc_set_pid(vcpu, regs->pid);
vcpu->arch.shared->sprg0 = regs->sprg0;
vcpu->arch.shared->sprg1 = regs->sprg1;
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/2] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Paolo Bonzini

Il 17/07/2014 13:03, Wanpeng Li ha scritto:

+   /*
+* Fall back to old way to inject the interrupt since there
+* is no vAPIC-v for L2.
+*/
+   if (vcpu->arch.exception.pending ||
+   vcpu->arch.nmi_injected ||
+   vcpu->arch.interrupt.pending)
+   return;


This is just

if (kvm_event_needs_reinjection(vcpu))
return;

but apart from this the patch is okay.  I'll make the change and apply 
it, thanks.


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Paolo Bonzini
Il 17/07/2014 13:03, Wanpeng Li ha scritto:
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 4ae5ad8..a704f71 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -8697,6 +8697,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, 
> u32 exit_reason,
>   if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
>   && nested_exit_intr_ack_set(vcpu)) {
>   int irq = kvm_cpu_get_interrupt(vcpu);
> +
> + if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
> + irq = kvm_lapic_find_highest_irr(vcpu);
>   WARN_ON(irq < 0);
>   vmcs12->vm_exit_intr_info = irq |
>   INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;

I wonder if this should be kvm_apic_has_interrupt, so that the PPR
register is taken into consideration?

If so, the same change can also be written like this:

diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index bd0da43..a1ec6a5 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
 
vector = kvm_cpu_get_extint(v);
 
-   if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
+   if (vector != -1)
return vector;  /* PIC */
 
return kvm_get_apic_interrupt(v);   /* APIC */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3855103..92a0a58 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1627,10 +1627,13 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
int vector = kvm_apic_has_interrupt(vcpu);
struct kvm_lapic *apic = vcpu->arch.apic;
 
-   /* Note that we never get here with APIC virtualization enabled.  */
+   /*
+* With APIC virtualization enabled, just pass back the
+* vector, the processor will take care of delivery.
+*/
 
-   if (vector == -1)
-   return -1;
+   if (vector == -1 || kvm_apic_vid_enabled(vcpu->kvm))
+   return vector;
 
apic_set_isr(vector, apic);
apic_update_ppr(apic);

The idea is that kvm_cpu_get_interrupt always return the interrupt.  If
you are injecting an interrupt you will test kvm_cpu_has_injectable_intr
outside the call to kvm_cpu_get_interrupt, and kvm_get_apic_interrupt
will never be reached anyway.  Instead, if you are reporting the interrupt,
any interrupt will be okay.

Yang, Wanpeng, what do you think?  Can you test both variants,
that is:

- you patch with kvm_apic_has_interrupt instead of
kvm_lapic_find_highest_irr

- the above untested patch of mine?

Thanks,

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 4/5] KVM: PPC: Alow kvmppc_get_last_inst() to fail

2014-07-17 Thread Mihai Caraman
On book3e, guest last instruction is read on the exit path using load
external pid (lwepx) dedicated instruction. This load operation may fail
due to TLB eviction and execute-but-not-read entries.

This patch lay down the path for an alternative solution to read the guest
last instruction, by allowing kvmppc_get_lat_inst() function to fail.
Architecture specific implmentations of kvmppc_load_last_inst() may read
last guest instruction and instruct the emulation layer to re-execute the
guest in case of failure.

Make kvmppc_get_last_inst() definition common between architectures.

Signed-off-by: Mihai Caraman 
---
v5
 - don't swap when load fail
 - convert the return value space of kvmppc_ld()

v4:
 - these changes compile on book3s, please validate the functionality and
   do the necessary adaptations!
 - common declaration and enum for kvmppc_load_last_inst()
 - remove kvmppc_read_inst() in a preceding patch

v3:
 - rework patch description
 - add common definition for kvmppc_get_last_inst()
 - check return values in book3s code

v2:
 - integrated kvmppc_get_last_inst() in book3s code and checked build
 - addressed cosmetic feedback

 arch/powerpc/include/asm/kvm_book3s.h| 26 -
 arch/powerpc/include/asm/kvm_booke.h |  5 ---
 arch/powerpc/include/asm/kvm_ppc.h   | 25 +
 arch/powerpc/kvm/book3s.c| 17 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 17 +++--
 arch/powerpc/kvm/book3s_paired_singles.c | 38 ---
 arch/powerpc/kvm/book3s_pr.c | 63 ++--
 arch/powerpc/kvm/booke.c |  3 ++
 arch/powerpc/kvm/e500_mmu_host.c |  6 +++
 arch/powerpc/kvm/emulate.c   | 18 ++---
 arch/powerpc/kvm/powerpc.c   | 11 +-
 11 files changed, 144 insertions(+), 85 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 20fb6f2..a86ca65 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -276,32 +276,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu 
*vcpu)
return (kvmppc_get_msr(vcpu) & MSR_LE) != (MSR_KERNEL & MSR_LE);
 }
 
-static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong 
pc)
-{
-   /* Load the instruction manually if it failed to do so in the
-* exit path */
-   if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED)
-   kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false);
-
-   return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) :
-   vcpu->arch.last_inst;
-}
-
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-   return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu));
-}
-
-/*
- * Like kvmppc_get_last_inst(), but for fetching a sc instruction.
- * Because the sc instruction sets SRR0 to point to the following
- * instruction, we have to fetch from pc - 4.
- */
-static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu)
-{
-   return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4);
-}
-
 static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu)
 {
return vcpu->arch.fault_dar;
diff --git a/arch/powerpc/include/asm/kvm_booke.h 
b/arch/powerpc/include/asm/kvm_booke.h
index c7aed61..cbb1990 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -69,11 +69,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu 
*vcpu)
return false;
 }
 
-static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu)
-{
-   return vcpu->arch.last_inst;
-}
-
 static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val)
 {
vcpu->arch.ctr = val;
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e2fd5a1..7f9c634 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -47,6 +47,11 @@ enum emulation_result {
EMULATE_EXIT_USER,/* emulation requires exit to user-space */
 };
 
+enum instruction_type {
+   INST_GENERIC,
+   INST_SC,/* system call */
+};
+
 extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern void kvmppc_handler_highmem(void);
@@ -62,6 +67,9 @@ extern int kvmppc_handle_store(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
   u64 val, unsigned int bytes,
   int is_default_endian);
 
+extern int kvmppc_load_last_inst(struct kvm_vcpu *vcpu,
+enum instruction_type type, u32 *inst);
+
 extern int kvmppc_emulate_instruction(struct kvm_run *run,
   struct kvm_vcpu *vcpu);
 extern int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu);
@@ -234,6 +242,23 @@ struct kvmppc_ops {
 extern struc

[PATCH v5 5/5] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation

2014-07-17 Thread Mihai Caraman
On book3e, KVM uses load external pid (lwepx) dedicated instruction to read
guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI
and LRAT), generated by loading a guest address, needs to be handled by KVM.
These exceptions are generated in a substituted guest translation context
(EPLC[EGS] = 1) from host context (MSR[GS] = 0).

Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1),
doing minimal checks on the fast path to avoid host performance degradation.
lwepx exceptions originate from host state (MSR[GS] = 0) which implies
additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking
at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context
Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious
too intrusive for the host.

Read guest last instruction from kvmppc_load_last_inst() by searching for the
physical address and kmap it. This address the TODO for TLB eviction and
execute-but-not-read entries, and allow us to get rid of lwepx until we are
able to handle failures.

A simple stress benchmark shows a 1% sys performance degradation compared with
previous approach (lwepx without failure handling):

time for i in `seq 1 1`; do /bin/echo > /dev/null; done

real0m 8.85s
user0m 4.34s
sys 0m 4.48s

vs

real0m 8.84s
user0m 4.36s
sys 0m 4.44s

A solution to use lwepx and to handle its exceptions in KVM would be to 
temporary
highjack the interrupt vector from host. This imposes additional 
synchronizations
for cores like FSL e6500 that shares host IVOR registers between hardware 
threads.
This optimized solution can be later developed on top of this patch.

Signed-off-by: Mihai Caraman 
---
v5:
 - return ENULATE_AGAIN in case of failure

v4:
 - add switch and new function when getting last inst earlier
 - use enum instead of prev semnatic
 - get rid of mas0, optimize mas7_mas3
 - give more context in visible messages
 - check storage attributes mismatch on MMUv2
 - get rid of pfn_valid check

v3:
 - reworked patch description
 - use unaltered kmap addr for kunmap
 - get last instruction before beeing preempted

v2:
 - reworked patch description
 - used pr_* functions
 - addressed cosmetic feedback

 arch/powerpc/kvm/booke.c  | 44 +
 arch/powerpc/kvm/bookehv_interrupts.S | 37 --
 arch/powerpc/kvm/e500_mmu_host.c  | 92 +++
 3 files changed, 145 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 34a42b9..843077b 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -869,6 +869,28 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
}
 }
 
+static int kvmppc_resume_inst_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
+ enum emulation_result emulated, u32 last_inst)
+{
+   switch (emulated) {
+   case EMULATE_AGAIN:
+   return RESUME_GUEST;
+
+   case EMULATE_FAIL:
+   pr_debug("%s: load instruction from guest address %lx failed\n",
+  __func__, vcpu->arch.pc);
+   /* For debugging, encode the failing instruction and
+* report it to userspace. */
+   run->hw.hardware_exit_reason = ~0ULL << 32;
+   run->hw.hardware_exit_reason |= last_inst;
+   kvmppc_core_queue_program(vcpu, ESR_PIL);
+   return RESUME_HOST;
+
+   default:
+   BUG();
+   }
+}
+
 /**
  * kvmppc_handle_exit
  *
@@ -880,6 +902,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu 
*vcpu,
int r = RESUME_HOST;
int s;
int idx;
+   u32 last_inst = KVM_INST_FETCH_FAILED;
+   enum emulation_result emulated = EMULATE_DONE;
 
/* update before a new last_exit_type is rewritten */
kvmppc_update_timing_stats(vcpu);
@@ -887,6 +911,20 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
/* restart interrupts if they were meant for the host */
kvmppc_restart_interrupt(vcpu, exit_nr);
 
+   /*
+* get last instruction before beeing preempted
+* TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR & ESR_DATA
+*/
+   switch (exit_nr) {
+   case BOOKE_INTERRUPT_DATA_STORAGE:
+   case BOOKE_INTERRUPT_DTLB_MISS:
+   case BOOKE_INTERRUPT_HV_PRIV:
+   emulated = kvmppc_get_last_inst(vcpu, false, &last_inst);
+   break;
+   default:
+   break;
+   }
+
local_irq_enable();
 
trace_kvm_exit(exit_nr, vcpu);
@@ -895,6 +933,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
run->exit_reason = KVM_EXIT_UNKNOWN;
run->ready_for_interrupt_injection = 1;
 
+   if (emulated != EMULATE_DONE) {
+   r = kvmppc_resume_inst_load(run, vcpu, emulated, last_inst);
+   

[PATCH v5 0/5] Read guest last instruction from kvmppc_get_last_inst()

2014-07-17 Thread Mihai Caraman
Read guest last instruction from kvmppc_get_last_inst() allowing the function
to fail in order to emulate again. On bookehv architecture search for
the physical address and kmap it, instead of using Load External PID (lwepx)
instruction. This fixes an infinite loop caused by lwepx's data TLB miss
exception handled in the host and the TODO for execute-but-not-read entries
and TLB eviction.

Mihai Caraman (5):
  KVM: PPC: e500mc: Revert "add load inst fixup"
  KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
  KVM: PPC: Book3s: Remove kvmppc_read_inst() function
  KVM: PPC: Alow kvmppc_get_last_inst() to fail
  KVM: PPC: Bookehv: Get vcpu's last instruction for  emulation

 arch/powerpc/include/asm/kvm_book3s.h|  26 ---
 arch/powerpc/include/asm/kvm_booke.h |   5 --
 arch/powerpc/include/asm/kvm_ppc.h   |  25 +++
 arch/powerpc/include/asm/mmu-book3e.h|   9 ++-
 arch/powerpc/kvm/book3s.c|  17 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |  17 ++---
 arch/powerpc/kvm/book3s_paired_singles.c |  38 +++
 arch/powerpc/kvm/book3s_pr.c | 114 ---
 arch/powerpc/kvm/booke.c |  47 +
 arch/powerpc/kvm/bookehv_interrupts.S|  55 ++-
 arch/powerpc/kvm/e500_mmu_host.c |  98 ++
 arch/powerpc/kvm/emulate.c   |  18 +++--
 arch/powerpc/kvm/powerpc.c   |  11 ++-
 13 files changed, 309 insertions(+), 171 deletions(-)

-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 3/5] KVM: PPC: Book3s: Remove kvmppc_read_inst() function

2014-07-17 Thread Mihai Caraman
In the context of replacing kvmppc_ld() function calls with a version of
kvmppc_get_last_inst() which allow to fail, Alex Graf suggested this:

"If we get EMULATE_AGAIN, we just have to make sure we go back into the guest.
No need to inject an ISI into  the guest - it'll do that all by itself.
With an error returning kvmppc_get_last_inst we can just use completely
get rid of kvmppc_read_inst() and only use kvmppc_get_last_inst() instead."

As a intermediate step get rid of kvmppc_read_inst() and only use kvmppc_ld()
instead.

Signed-off-by: Mihai Caraman 
---
v5:
 - make paired single emulation the unusual

v4:
 - new patch

 arch/powerpc/kvm/book3s_pr.c | 91 ++--
 1 file changed, 37 insertions(+), 54 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index e40765f..02a983e 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -710,42 +710,6 @@ static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong 
fac)
 #endif
 }
 
-static int kvmppc_read_inst(struct kvm_vcpu *vcpu)
-{
-   ulong srr0 = kvmppc_get_pc(vcpu);
-   u32 last_inst = kvmppc_get_last_inst(vcpu);
-   int ret;
-
-   ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false);
-   if (ret == -ENOENT) {
-   ulong msr = kvmppc_get_msr(vcpu);
-
-   msr = kvmppc_set_field(msr, 33, 33, 1);
-   msr = kvmppc_set_field(msr, 34, 36, 0);
-   msr = kvmppc_set_field(msr, 42, 47, 0);
-   kvmppc_set_msr_fast(vcpu, msr);
-   kvmppc_book3s_queue_irqprio(vcpu, 
BOOK3S_INTERRUPT_INST_STORAGE);
-   return EMULATE_AGAIN;
-   }
-
-   return EMULATE_DONE;
-}
-
-static int kvmppc_check_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr)
-{
-
-   /* Need to do paired single emulation? */
-   if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
-   return EMULATE_DONE;
-
-   /* Read out the instruction */
-   if (kvmppc_read_inst(vcpu) == EMULATE_DONE)
-   /* Need to emulate */
-   return EMULATE_FAIL;
-
-   return EMULATE_AGAIN;
-}
-
 /* Handle external providers (FPU, Altivec, VSX) */
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 ulong msr)
@@ -1149,31 +1113,49 @@ program_interrupt:
case BOOK3S_INTERRUPT_VSX:
{
int ext_msr = 0;
+   int emul;
+   ulong pc;
+   u32 last_inst;
+
+   if (vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE) {
+   /* Emulate the instruction */
+
+   pc = kvmppc_get_pc(vcpu);
+   last_inst = kvmppc_get_last_inst(vcpu);
+   emul = kvmppc_ld(vcpu, &pc, sizeof(u32), &last_inst,
+false);
+   if (emul == EMULATE_DONE)
+   goto program_interrupt;
+   else
+   r = RESUME_GUEST;
+   } else {
+   /* Do paired single emulation */
 
-   switch (exit_nr) {
-   case BOOK3S_INTERRUPT_FP_UNAVAIL: ext_msr = MSR_FP;  break;
-   case BOOK3S_INTERRUPT_ALTIVEC:ext_msr = MSR_VEC; break;
-   case BOOK3S_INTERRUPT_VSX:ext_msr = MSR_VSX; break;
-   }
+   switch (exit_nr) {
+   case BOOK3S_INTERRUPT_FP_UNAVAIL:
+   ext_msr = MSR_FP;
+   break;
+
+   case BOOK3S_INTERRUPT_ALTIVEC:
+   ext_msr = MSR_VEC;
+   break;
+
+   case BOOK3S_INTERRUPT_VSX:
+   ext_msr = MSR_VSX;
+   break;
+   }
 
-   switch (kvmppc_check_ext(vcpu, exit_nr)) {
-   case EMULATE_DONE:
-   /* everything ok - let's enable the ext */
r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr);
-   break;
-   case EMULATE_FAIL:
-   /* we need to emulate this instruction */
-   goto program_interrupt;
-   break;
-   default:
-   /* nothing to worry about - go again */
-   break;
}
break;
}
case BOOK3S_INTERRUPT_ALIGNMENT:
-   if (kvmppc_read_inst(vcpu) == EMULATE_DONE) {
-   u32 last_inst = kvmppc_get_last_inst(vcpu);
+   {
+   ulong pc = kvmppc_get_pc(vcpu);
+   u32 last_inst = kvmppc_get_last_inst(vcpu);
+   int emul = kvmppc_ld(vcpu, &pc, sizeof(u32), &last_inst, false);
+
+   if (emul == EMULATE_DONE)

[PATCH v5 1/5] KVM: PPC: e500mc: Revert "add load inst fixup"

2014-07-17 Thread Mihai Caraman
The commit 1d628af7 "add load inst fixup" made an attempt to handle
failures generated by reading the guest current instruction. The fixup
code that was added works by chance hiding the real issue.

Load external pid (lwepx) instruction, used by KVM to read guest
instructions, is executed in a subsituted guest translation context
(EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage
interrupts need to be handled by KVM, even though these interrupts
are generated from host context (MSR[GS] = 0) where lwepx is executed.

Currently, KVM hooks only interrupts generated from guest context
(MSR[GS] = 1), doing minimal checks on the fast path to avoid host
performance degradation. As a result, the host kernel handles lwepx
faults searching the faulting guest data address (loaded in DEAR) in
its own Logical Partition ID (LPID) 0 context. In case a host translation
is found the execution returns to the lwepx instruction instead of the
fixup, the host ending up in an infinite loop.

Revert the commit "add load inst fixup". lwepx issue will be addressed
in a subsequent patch without needing fixup code.

Signed-off-by: Mihai Caraman 
---
v5-v2:
 - no change

 arch/powerpc/kvm/bookehv_interrupts.S | 26 +-
 1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/arch/powerpc/kvm/bookehv_interrupts.S 
b/arch/powerpc/kvm/bookehv_interrupts.S
index a1712b8..6ff4480 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #ifdef CONFIG_64BIT
 #include 
@@ -164,32 +163,9 @@
PPC_STL r30, VCPU_GPR(R30)(r4)
PPC_STL r31, VCPU_GPR(R31)(r4)
mtspr   SPRN_EPLC, r8
-
-   /* disable preemption, so we are sure we hit the fixup handler */
-   CURRENT_THREAD_INFO(r8, r1)
-   li  r7, 1
-   stw r7, TI_PREEMPT(r8)
-
isync
-
-   /*
-* In case the read goes wrong, we catch it and write an invalid value
-* in LAST_INST instead.
-*/
-1: lwepx   r9, 0, r5
-2:
-.section .fixup, "ax"
-3: li  r9, KVM_INST_FETCH_FAILED
-   b   2b
-.previous
-.section __ex_table,"a"
-   PPC_LONG_ALIGN
-   PPC_LONG 1b,3b
-.previous
-
+   lwepx   r9, 0, r5
mtspr   SPRN_EPLC, r3
-   li  r7, 0
-   stw r7, TI_PREEMPT(r8)
stw r9, VCPU_LAST_INST(r4)
.endif
 
-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Wanpeng Li
On Thu, Jul 17, 2014 at 12:43:58PM +0200, Paolo Bonzini wrote:
>Il 17/07/2014 11:11, Wanpeng Li ha scritto:
>> What hypervisor did you test with? nested_exit_on_intr(vcpu) will

Jailhouse will clear External-interrupt exiting bit. Am I right? Jan.

>> return true for both Xen and KVM (nested_exit_on_intr is not the same
>> thing as ACK_INTR_ON_EXIT).

I guess he want to say External-interrupt exiting bit not ACK_INTR_ON_EXIT.

>>Ah yes, a typo here.
>
>Ok, please repost this patch together with your version of patch 2.

Just send out the version two of 1/3 and 2/3. 

>
>Leave aside patch 3 for now, as I think the original use-after-free
>patch was wrong.

Any proposal is appreciated. ;-)

Regards,
Wanpeng Li 

>
>Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch V2 43/64] x86: kvm: Use ktime_get_boot_ns()

2014-07-17 Thread Paolo Bonzini

Il 16/07/2014 23:04, Thomas Gleixner ha scritto:

Use the new nanoseconds based interface and get rid of the timespec
conversion dance.

Signed-off-by: Thomas Gleixner 
Cc: Gleb Natapov 
Cc: kvm@vger.kernel.org
---
 arch/x86/kvm/x86.c |6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

Index: tip/arch/x86/kvm/x86.c
===
--- tip.orig/arch/x86/kvm/x86.c
+++ tip/arch/x86/kvm/x86.c
@@ -1109,11 +1109,7 @@ static void kvm_get_time_scale(uint32_t

 static inline u64 get_kernel_ns(void)
 {
-   struct timespec ts;
-
-   ktime_get_ts(&ts);
-   monotonic_to_bootbased(&ts);
-   return timespec_to_ns(&ts);
+   return ktime_get_boot_ns();
 }

 #ifdef CONFIG_X86_64




Acked-by: Paolo Bonzini 

I will remove get_kernel_ns if you don't do that for me...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/2] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Wanpeng Li
WARNING: CPU: 9 PID: 7251 at arch/x86/kvm/vmx.c:8719 
nested_vmx_vmexit+0xa4/0x233 [kvm_intel]()
Modules linked in: tun nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 
dns_resolver nfs fscache lockd
sunrpc pci_stub netconsole kvm_intel kvm bridge stp llc autofs4 8021q ipv6 
uinput joydev microcode
pcspkr igb i2c_algo_bit ehci_pci ehci_hcd e1000e ixgbe ptp pps_core hwmon mdio 
i2c_i801 i2c_core
tpm_tis tpm ipmi_si ipmi_msghandler isci libsas scsi_transport_sas button 
dm_mirror dm_region_hash
dm_log dm_mod
CPU: 9 PID: 7251 Comm: qemu-system-x86 Tainted: GW 3.16.0-rc1 #2
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS 
RMLSDP.86I.00.29.D696.131329 11/11/2013
 220f 880ffd107bf8 81493563 220f
  880ffd107c38 8103f0eb 880ffd107c48
 a059709a 881ffc9e0040 8800b74b8000 
Call Trace:
 [] dump_stack+0x49/0x5e
 [] warn_slowpath_common+0x7c/0x96
 [] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [] warn_slowpath_null+0x15/0x17
 [] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [] inject_pending_event+0xd0/0x16e [kvm]
 [] vcpu_enter_guest+0x319/0x704 [kvm]

After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info if L1
asks us to), "Acknowledge interrupt on exit" behavior can be emulated. Current
logic will ask for intr vector if it is nested vmexit and 
VM_EXIT_ACK_INTR_ON_EXIT
is set by L1. However, intr vector for posted intr can't be got by generic read
pending interrupt vector and intack routine, there is a requirement to sync from
pir to irr. This patch fix it by ask the intr vector after sync pir to irr.

Reviewed-by: Yang Zhang 
Signed-off-by: Wanpeng Li 
---
v1 -> v2:
 * replace kvm_get_apic_interrupt() by kvm_lapic_find_highest_irr()

 arch/x86/kvm/lapic.c | 1 +
 arch/x86/kvm/vmx.c   | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0069118..b7d45dc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1637,6 +1637,7 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
apic_clear_irr(vector, apic);
return vector;
 }
+EXPORT_SYMBOL_GPL(kvm_get_apic_interrupt);
 
 void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
struct kvm_lapic_state *s)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4ae5ad8..a704f71 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8697,6 +8697,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 
exit_reason,
if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
&& nested_exit_intr_ack_set(vcpu)) {
int irq = kvm_cpu_get_interrupt(vcpu);
+
+   if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
+   irq = kvm_lapic_find_highest_irr(vcpu);
WARN_ON(irq < 0);
vmcs12->vm_exit_intr_info = irq |
INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/2] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Wanpeng Li
From: Wanpeng Li 

This patch fix bug reported in 
https://bugzilla.kernel.org/show_bug.cgi?id=73331, 
after the patch http://www.spinics.net/lists/kvm/msg105230.html applied, there 
is
some progress and the L2 can boot up, however, slowly. The original idea of 
this 
fix vid injection patch is from "Zhang, Yang Z" .

Interrupt which delivered by vid should be injected to L1 by L0 if current is 
in 
L1, or should be injected to L2 by L0 through the old injection way if L1 
doesn't 
have set External-interrupt exiting bit. The current logic doen't consider 
these 
cases. This patch fix it by vid intr to L1 if current is L1 or L2 through old 
injection way if L1 doen't have External-interrupt exiting bit set.

Signed-off-by: Wanpeng Li 
Signed-off-by: "Zhang, Yang Z" 
---
v1 -> v2:
 * fix the typo in patch description

 arch/x86/kvm/vmx.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 021d84a..ad36646 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7112,8 +7112,22 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, 
int max_irr)
 {
if (max_irr == -1)
return;
-
-   vmx_set_rvi(max_irr);
+   if (!is_guest_mode(vcpu)) {
+   vmx_set_rvi(max_irr);
+   } else if (is_guest_mode(vcpu) && !nested_exit_on_intr(vcpu)) {
+   /*
+* Fall back to old way to inject the interrupt since there
+* is no vAPIC-v for L2.
+*/
+   if (vcpu->arch.exception.pending ||
+   vcpu->arch.nmi_injected ||
+   vcpu->arch.interrupt.pending)
+   return;
+   else if (vmx_interrupt_allowed(vcpu)) {
+   kvm_queue_interrupt(vcpu, max_irr, false);
+   vmx_inject_irq(vcpu);
+   }
+   }
 }
 
 static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch V2 44/64] x86: kvm: Make kvm_get_time_and_clockread() nanoseconds based

2014-07-17 Thread Paolo Bonzini

Il 16/07/2014 23:04, Thomas Gleixner ha scritto:

Convert the relevant base data right away to nanoseconds instead of
doing the conversion on every readout. Reduces text size by 160 bytes.

Signed-off-by: Thomas Gleixner 
Cc: Gleb Natapov 
Cc: kvm@vger.kernel.org
---
 arch/x86/kvm/x86.c |   44 ++--
 1 file changed, 14 insertions(+), 30 deletions(-)

Index: tip/arch/x86/kvm/x86.c
===
--- tip.orig/arch/x86/kvm/x86.c
+++ tip/arch/x86/kvm/x86.c
@@ -984,9 +984,8 @@ struct pvclock_gtod_data {
u32 shift;
} clock;

-   /* open coded 'struct timespec' */
-   u64 monotonic_time_snsec;
-   time_t  monotonic_time_sec;
+   u64 boot_ns;
+   u64 nsec_base;
 };

 static struct pvclock_gtod_data pvclock_gtod_data;
@@ -994,6 +993,9 @@ static struct pvclock_gtod_data pvclock_
 static void update_pvclock_gtod(struct timekeeper *tk)
 {
struct pvclock_gtod_data *vdata = &pvclock_gtod_data;
+   u64 boot_ns;
+
+   boot_ns = ktime_to_ns(ktime_add(tk->base_mono, tk->offs_boot));

write_seqcount_begin(&vdata->seq);

@@ -1004,17 +1006,8 @@ static void update_pvclock_gtod(struct t
vdata->clock.mult= tk->mult;
vdata->clock.shift   = tk->shift;

-   vdata->monotonic_time_sec= tk->xtime_sec
-   + tk->wall_to_monotonic.tv_sec;
-   vdata->monotonic_time_snsec  = tk->xtime_nsec
-   + (tk->wall_to_monotonic.tv_nsec
-   << tk->shift);
-   while (vdata->monotonic_time_snsec >=
-   (((u64)NSEC_PER_SEC) << tk->shift)) {
-   vdata->monotonic_time_snsec -=
-   ((u64)NSEC_PER_SEC) << tk->shift;
-   vdata->monotonic_time_sec++;
-   }
+   vdata->boot_ns   = boot_ns;
+   vdata->nsec_base = tk->xtime_nsec;

write_seqcount_end(&vdata->seq);
 }
@@ -1371,23 +1364,22 @@ static inline u64 vgettsc(cycle_t *cycle
return v * gtod->clock.mult;
 }

-static int do_monotonic(struct timespec *ts, cycle_t *cycle_now)
+static int do_monotonic_boot(s64 *t, cycle_t *cycle_now)
 {
+   struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
unsigned long seq;
-   u64 ns;
int mode;
-   struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
+   u64 ns;

-   ts->tv_nsec = 0;
do {
seq = read_seqcount_begin(>od->seq);
mode = gtod->clock.vclock_mode;
-   ts->tv_sec = gtod->monotonic_time_sec;
-   ns = gtod->monotonic_time_snsec;
+   ns = gtod->nsec_base;
ns += vgettsc(cycle_now);
ns >>= gtod->clock.shift;
+   ns += gtod->boot_ns;
} while (unlikely(read_seqcount_retry(>od->seq, seq)));
-   timespec_add_ns(ts, ns);
+   *t = ns;

return mode;
 }
@@ -1395,19 +1387,11 @@ static int do_monotonic(struct timespec
 /* returns true if host is using tsc clocksource */
 static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
 {
-   struct timespec ts;
-
/* checked again under seqlock below */
if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
return false;

-   if (do_monotonic(&ts, cycle_now) != VCLOCK_TSC)
-   return false;
-
-   monotonic_to_bootbased(&ts);
-   *kernel_ns = timespec_to_ns(&ts);
-
-   return true;
+   return do_monotonic_boot(kernel_ns, cycle_now) == VCLOCK_TSC;
 }
 #endif





Acked-by: Paolo Bonzini 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Paolo Bonzini

Il 17/07/2014 11:11, Wanpeng Li ha scritto:

>> What hypervisor did you test with? nested_exit_on_intr(vcpu) will

>
>Jailhouse will clear External-interrupt exiting bit. Am I right? Jan.
>

>> return true for both Xen and KVM (nested_exit_on_intr is not the same
>> thing as ACK_INTR_ON_EXIT).

>
>I guess he want to say External-interrupt exiting bit not ACK_INTR_ON_EXIT.
>

Ah yes, a typo here.


Ok, please repost this patch together with your version of patch 2.

Leave aside patch 3 for now, as I think the original use-after-free 
patch was wrong.


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Paolo Bonzini

Il 17/07/2014 12:01, Wanpeng Li ha scritto:

That is my original proposal solution of this bug. However, what I concern
after more think is since kvm_lapic_find_highest_irr will not clear
irr, if the intr will be injected by kvm_86_ops->hwapic_irr_update(vcpu,
kvm_lapic_find_highest_irr(vcpu)) which called by vcpu_enter_guest() again?

Any idea, Paolo?


The processor should do that when it does the virtual interrupt 
delivery.  It will do (29.2.2):


   Vector := RVI;
   VISR[Vector] := 1;
   SVI := Vector;
   VIRR[Vector] := 0;
   If VIRR not empty
  then RVI := highest index of bit set in VIRR
  else RVI := 0
   Fi;
   deliver interrupt with Vector through IDT;

Please post a patch, so we can reason on it better.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Wanpeng Li
On Thu, Jul 17, 2014 at 09:13:56AM +, Zhang, Yang Z wrote:
>Paolo Bonzini wrote on 2014-07-17:
>> Il 17/07/2014 06:56, Wanpeng Li ha scritto:
>>> && nested_exit_intr_ack_set(vcpu)) {
>>> int irq = kvm_cpu_get_interrupt(vcpu);
>>> +
>>> +   if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>>> +   irq = kvm_get_apic_interrupt(vcpu);
>> 
>> There's something weird in this patch.  If you "inline"
>> kvm_cpu_get_interrupt, what you get is this:
>> 
>>  int irq;
>>  /* Beginning of kvm_cpu_get_interrupt... */
>>  if (!irqchip_in_kernel(v->kvm))
>>  irq = v->arch.interrupt.nr;
>>  else {
>>  irq = kvm_cpu_get_extint(v); /* PIC */
>>  if (!kvm_apic_vid_enabled(v->kvm) && irq == -1)
>>  irq = kvm_get_apic_interrupt(v); /* APIC */
>>  }
>> 
>>  /* kvm_cpu_get_interrupt done. */
>>  if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>>  irq = kvm_get_apic_interrupt(vcpu);
>> 
>> There are just two callers of kvm_cpu_get_interrupt, and the other is
>> protected by kvm_cpu_has_injectable_intr so it won't be executed if
>> virtual interrupt delivery is enabled.  So you patch is effectively the same 
>> as this:
>> 
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index
>> bd0da43..a1ec6a5 100644 --- a/arch/x86/kvm/irq.c +++
>> b/arch/x86/kvm/irq.c @@ -108,7 +108,7 @@ int
>> kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>> 
>>  vector = kvm_cpu_get_extint(v);
>> -if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
>> +if (vector != -1)
>>  return vector;  /* PIC */
>>   
>>  return kvm_get_apic_interrupt(v);   /* APIC */
>> But in kvm_get_apic_interrupt I have just added this comment:
>> 
>>  /* Note that we never get here with APIC virtualization
>>   * enabled.  */
>> 
>> because kvm_get_apic_interrupt calls apic_set_isr, and apic_set_isr
>> must never be called with APIC virtualization enabled either.  With
>> APIC virtualization enabled, isr_count is always 1, and
>> highest_isr_cache is always -1, and apic_set_isr breaks both of these 
>> invariants.
>> 
>
>You are right. kvm_lapic_find_highest_irr should be the right one.
>

That is my original proposal solution of this bug. However, what I concern
after more think is since kvm_lapic_find_highest_irr will not clear irr, 
if the intr will be injected by kvm_86_ops->hwapic_irr_update(vcpu,
kvm_lapic_find_highest_irr(vcpu)) which called by vcpu_enter_guest()
again?

Any idea, Paolo?

Regards,
Wanpeng Li


>> Paolo
>
>
>Best regards,
>Yang
>
>From b14c444e073a21560961b37be643b78c6c9cba17 Mon Sep 17 00:00:00 2001
From: Wanpeng Li 
Date: Thu, 17 Jul 2014 17:41:28 +0800
Subject: [PATCH v2] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

WARNING: CPU: 9 PID: 7251 at arch/x86/kvm/vmx.c:8719 nested_vmx_vmexit+0xa4/0x233 [kvm_intel]()
Modules linked in: tun nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 dns_resolver nfs fscache lockd
sunrpc pci_stub netconsole kvm_intel kvm bridge stp llc autofs4 8021q ipv6 uinput joydev microcode
pcspkr igb i2c_algo_bit ehci_pci ehci_hcd e1000e ixgbe ptp pps_core hwmon mdio i2c_i801 i2c_core
tpm_tis tpm ipmi_si ipmi_msghandler isci libsas scsi_transport_sas button dm_mirror dm_region_hash
dm_log dm_mod
CPU: 9 PID: 7251 Comm: qemu-system-x86 Tainted: GW 3.16.0-rc1 #2
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.131329 11/11/2013
 220f 880ffd107bf8 81493563 220f
  880ffd107c38 8103f0eb 880ffd107c48
 a059709a 881ffc9e0040 8800b74b8000 
Call Trace:
 [] dump_stack+0x49/0x5e
 [] warn_slowpath_common+0x7c/0x96
 [] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [] warn_slowpath_null+0x15/0x17
 [] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [] inject_pending_event+0xd0/0x16e [kvm]
 [] vcpu_enter_guest+0x319/0x704 [kvm]

After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info if L1
asks us to), "Acknowledge interrupt on exit" behavior can be emulated. Current
logic will ask for intr vector if it is nested vmexit and VM_EXIT_ACK_INTR_ON_EXIT
is set by L1. However, intr vector for posted intr can't be got by generic read
pending interrupt vector and intack routine, there is a requirement to sync from
pir to irr. This patch fix it by ask the intr vector after sync pir to irr.

Signed-off-by: Wanpeng Li 
---
v1 -> v2:
 * replace kvm_get_apic_interrupt() by kvm_lapic_find_highest_irr()

 arch/x86/kvm/lapic.c | 1 +
 arch/x86/kvm/vmx.c   | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0069118..b7d45dc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -16

Re: [PATCH 2/3] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Wanpeng Li
On Thu, Jul 17, 2014 at 09:13:56AM +, Zhang, Yang Z wrote:
>Paolo Bonzini wrote on 2014-07-17:
>> Il 17/07/2014 06:56, Wanpeng Li ha scritto:
>>> && nested_exit_intr_ack_set(vcpu)) {
>>> int irq = kvm_cpu_get_interrupt(vcpu);
>>> +
>>> +   if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>>> +   irq = kvm_get_apic_interrupt(vcpu);
>> 
>> There's something weird in this patch.  If you "inline"
>> kvm_cpu_get_interrupt, what you get is this:
>> 
>>  int irq;
>>  /* Beginning of kvm_cpu_get_interrupt... */
>>  if (!irqchip_in_kernel(v->kvm))
>>  irq = v->arch.interrupt.nr;
>>  else {
>>  irq = kvm_cpu_get_extint(v); /* PIC */
>>  if (!kvm_apic_vid_enabled(v->kvm) && irq == -1)
>>  irq = kvm_get_apic_interrupt(v); /* APIC */
>>  }
>> 
>>  /* kvm_cpu_get_interrupt done. */
>>  if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>>  irq = kvm_get_apic_interrupt(vcpu);
>> 
>> There are just two callers of kvm_cpu_get_interrupt, and the other is
>> protected by kvm_cpu_has_injectable_intr so it won't be executed if
>> virtual interrupt delivery is enabled.  So you patch is effectively the same 
>> as this:
>> 
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index
>> bd0da43..a1ec6a5 100644 --- a/arch/x86/kvm/irq.c +++
>> b/arch/x86/kvm/irq.c @@ -108,7 +108,7 @@ int
>> kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>> 
>>  vector = kvm_cpu_get_extint(v);
>> -if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
>> +if (vector != -1)
>>  return vector;  /* PIC */
>>   
>>  return kvm_get_apic_interrupt(v);   /* APIC */
>> But in kvm_get_apic_interrupt I have just added this comment:
>> 
>>  /* Note that we never get here with APIC virtualization
>>   * enabled.  */
>> 
>> because kvm_get_apic_interrupt calls apic_set_isr, and apic_set_isr
>> must never be called with APIC virtualization enabled either.  With
>> APIC virtualization enabled, isr_count is always 1, and
>> highest_isr_cache is always -1, and apic_set_isr breaks both of these 
>> invariants.
>> 
>
>You are right. kvm_lapic_find_highest_irr should be the right one.

That is my original proposal solution of this bug. However, what I concern 
after more think is since kvm_lapic_find_highest_irr will not clear
irr, if the intr will be injected by kvm_86_ops->hwapic_irr_update(vcpu, 
kvm_lapic_find_highest_irr(vcpu)) which called by vcpu_enter_guest() again? 

Any idea, Paolo?

Regards,
Wanpeng Li 

>
>> Paolo
>
>
>Best regards,
>Yang
>
>From b14c444e073a21560961b37be643b78c6c9cba17 Mon Sep 17 00:00:00 2001
From: Wanpeng Li 
Date: Thu, 17 Jul 2014 17:41:28 +0800
Subject: [PATCH v2] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

WARNING: CPU: 9 PID: 7251 at arch/x86/kvm/vmx.c:8719 nested_vmx_vmexit+0xa4/0x233 [kvm_intel]()
Modules linked in: tun nfsv3 nfs_acl auth_rpcgss oid_registry nfsv4 dns_resolver nfs fscache lockd
sunrpc pci_stub netconsole kvm_intel kvm bridge stp llc autofs4 8021q ipv6 uinput joydev microcode
pcspkr igb i2c_algo_bit ehci_pci ehci_hcd e1000e ixgbe ptp pps_core hwmon mdio i2c_i801 i2c_core
tpm_tis tpm ipmi_si ipmi_msghandler isci libsas scsi_transport_sas button dm_mirror dm_region_hash
dm_log dm_mod
CPU: 9 PID: 7251 Comm: qemu-system-x86 Tainted: GW 3.16.0-rc1 #2
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.131329 11/11/2013
 220f 880ffd107bf8 81493563 220f
  880ffd107c38 8103f0eb 880ffd107c48
 a059709a 881ffc9e0040 8800b74b8000 
Call Trace:
 [] dump_stack+0x49/0x5e
 [] warn_slowpath_common+0x7c/0x96
 [] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [] warn_slowpath_null+0x15/0x17
 [] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [] inject_pending_event+0xd0/0x16e [kvm]
 [] vcpu_enter_guest+0x319/0x704 [kvm]

After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info if L1
asks us to), "Acknowledge interrupt on exit" behavior can be emulated. Current
logic will ask for intr vector if it is nested vmexit and VM_EXIT_ACK_INTR_ON_EXIT
is set by L1. However, intr vector for posted intr can't be got by generic read
pending interrupt vector and intack routine, there is a requirement to sync from
pir to irr. This patch fix it by ask the intr vector after sync pir to irr.

Signed-off-by: Wanpeng Li 
---
v1 -> v2:
 * replace kvm_get_apic_interrupt() by kvm_lapic_find_highest_irr()

 arch/x86/kvm/lapic.c | 1 +
 arch/x86/kvm/vmx.c   | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0069118..b7d45dc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -

Re: [PATCH 2/3] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Paolo Bonzini

Il 17/07/2014 06:56, Wanpeng Li ha scritto:

&& nested_exit_intr_ack_set(vcpu)) {
int irq = kvm_cpu_get_interrupt(vcpu);
+
+   if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
+   irq = kvm_get_apic_interrupt(vcpu);


There's something weird in this patch.  If you "inline" 
kvm_cpu_get_interrupt, what you get is this:


int irq;

/* Beginning of kvm_cpu_get_interrupt... */
if (!irqchip_in_kernel(v->kvm))
irq = v->arch.interrupt.nr;
else {
irq = kvm_cpu_get_extint(v); /* PIC */
if (!kvm_apic_vid_enabled(v->kvm) && irq == -1)
irq = kvm_get_apic_interrupt(v); /* APIC */
}

/* kvm_cpu_get_interrupt done. */
if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
irq = kvm_get_apic_interrupt(vcpu);

There are just two callers of kvm_cpu_get_interrupt, and the other is 
protected by kvm_cpu_has_injectable_intr so it won't be executed if 
virtual interrupt delivery is enabled.  So you patch is effectively the 
same as this:


diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index bd0da43..a1ec6a5 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)

vector = kvm_cpu_get_extint(v);

-   if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
+   if (vector != -1)
return vector;  /* PIC */

return kvm_get_apic_interrupt(v);   /* APIC */

But in kvm_get_apic_interrupt I have just added this comment:

/* Note that we never get here with APIC virtualization
 * enabled.  */

because kvm_get_apic_interrupt calls apic_set_isr, and apic_set_isr must 
never be called with APIC virtualization enabled either.  With APIC 
virtualization enabled, isr_count is always 1, and highest_isr_cache is 
always -1, and apic_set_isr breaks both of these invariants.


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] CMA: generalize CMA reserved area management functionality (fixup)

2014-07-17 Thread Marek Szyprowski
MAX_CMA_AREAS is used by other subsystems (i.e. arch/arm/mm/dma-mapping.c),
so we need to provide correct definition even if CMA is disabled.
This patch fixes this issue.

Reported-by: Sylwester Nawrocki 
Signed-off-by: Marek Szyprowski 
---
 include/linux/cma.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 9a18a2b1934c..c077635cad76 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -5,7 +5,11 @@
  * There is always at least global CMA area and a few optional
  * areas configured in kernel .config.
  */
+#ifdef CONFIG_CMA
 #define MAX_CMA_AREAS  (1 + CONFIG_CMA_AREAS)
+#else
+#define MAX_CMA_AREAS  (0)
+#endif
 
 struct cma;
 
-- 
1.9.2

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 5/5] kvm, mem-hotplug: Do not pin apic access page in memory.

2014-07-17 Thread Tang Chen

Hi Gleb,

Sorry for the delay. Please see below.

On 07/15/2014 10:40 PM, Gleb Natapov wrote:
..



We can request APIC_ACCESS_ADDR reload during L2->L1 vmexit emulation, so
if APIC_ACCESS_ADDR changes while L2 is running it will be reloaded for L1 too.



apic pages for L2 and L1 are not the same page, right ?


If L2 guest enable apic access page then they are different, otherwise
they are the same.


I think, just like we are doing in patch 5/5, we cannot wait for the next
L2->L1 vmexit.
We should enforce a L2->L1 vmexit in mmu_notifier, just like
make_all_cpus_request() does.

Am I right ?


I do not see why forcing APIC_ACCESS_ADDR reload during L2->L1 exit is not 
enough.


Yes, you are right. APIC_ACCESS_ADDR reload should be done during L2->L1 
vmexit.


I mean, before the page is moved to other place, we have to enforce a 
L2->L1 vmexit,
but not wait for the next L2->L1 vmexit. Since when the page is being 
moved, if the
L2 vm is still running, it could access apic page directly. And the vm 
may corrupt.


In the mmu_notifier called before the page is moved, we have to enforce 
a L2->L1
vmexit, and ask vcpus to reload APIC_ACCESS_ADDR for L2 vm. The process 
will wait
till the page migration is completed, and update the APIC_ACCESS_ADDR, 
and re-enter

guest mode.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/3] KVM: nVMX: Fix fail to get nested ack intr's vector during nested vmexit

2014-07-17 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2014-07-17:
> Il 17/07/2014 06:56, Wanpeng Li ha scritto:
>>  && nested_exit_intr_ack_set(vcpu)) {
>>  int irq = kvm_cpu_get_interrupt(vcpu);
>> +
>> +if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>> +irq = kvm_get_apic_interrupt(vcpu);
> 
> There's something weird in this patch.  If you "inline"
> kvm_cpu_get_interrupt, what you get is this:
> 
>  int irq;
>   /* Beginning of kvm_cpu_get_interrupt... */
>  if (!irqchip_in_kernel(v->kvm))
>  irq = v->arch.interrupt.nr;
>   else {
>   irq = kvm_cpu_get_extint(v); /* PIC */
>   if (!kvm_apic_vid_enabled(v->kvm) && irq == -1)
>   irq = kvm_get_apic_interrupt(v); /* APIC */
>   }
> 
>   /* kvm_cpu_get_interrupt done. */
>   if (irq < 0 && kvm_apic_vid_enabled(vcpu->kvm))
>   irq = kvm_get_apic_interrupt(vcpu);
> 
> There are just two callers of kvm_cpu_get_interrupt, and the other is
> protected by kvm_cpu_has_injectable_intr so it won't be executed if
> virtual interrupt delivery is enabled.  So you patch is effectively the same 
> as this:
> 
> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index
> bd0da43..a1ec6a5 100644 --- a/arch/x86/kvm/irq.c +++
> b/arch/x86/kvm/irq.c @@ -108,7 +108,7 @@ int
> kvm_cpu_get_interrupt(struct kvm_vcpu *v)
> 
>   vector = kvm_cpu_get_extint(v);
> - if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
> + if (vector != -1)
>   return vector;  /* PIC */
>   
>   return kvm_get_apic_interrupt(v);   /* APIC */
> But in kvm_get_apic_interrupt I have just added this comment:
> 
>  /* Note that we never get here with APIC virtualization
>* enabled.  */
> 
> because kvm_get_apic_interrupt calls apic_set_isr, and apic_set_isr
> must never be called with APIC virtualization enabled either.  With
> APIC virtualization enabled, isr_count is always 1, and
> highest_isr_cache is always -1, and apic_set_isr breaks both of these 
> invariants.
> 

You are right. kvm_lapic_find_highest_irr should be the right one.

> Paolo


Best regards,
Yang


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Wanpeng Li
On Thu, Jul 17, 2014 at 09:03:01AM +, Zhang, Yang Z wrote:
>Paolo Bonzini wrote on 2014-07-17:
>> Il 17/07/2014 06:56, Wanpeng Li ha scritto:
>>> This patch fix bug reported in
>>> https://bugzilla.kernel.org/show_bug.cgi?id=73331, after the patch
>>> http://www.spinics.net/lists/kvm/msg105230.html applied, there is some
>>> progress and the L2 can boot up, however, slowly. The original idea of
>>> this fix vid injection patch is from "Zhang, Yang Z"
>>> .
>>> 
>>> Interrupt which delivered by vid should be injected to L1 by L0 if
>>> current is in L1, or should be injected to L2 by L0 through the old
>>> injection way if L1 doesn't have set VM_EXIT_ACK_INTR_ON_EXIT. The
>>> current logic doen't consider these cases. This patch fix it by vid
>>> intr to L1 if current is L1 or L2 through old injection way if L1
>>> doen't have VM_EXIT_ACK_INTR_ON_EXIT set.
>>> 
>>> Signed-off-by: Wanpeng Li 
>>> Signed-off-by: "Zhang, Yang Z" 
>>> ---
>>>  arch/x86/kvm/vmx.c | 18 --
>>>  1 file changed, 16 insertions(+), 2 deletions(-)
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index
>>> 021d84a..ad36646 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -7112,8 +7112,22 @@ static void vmx_hwapic_irr_update(struct
>>> kvm_vcpu *vcpu, int max_irr)  {
>>> if (max_irr == -1)
>>> return;
>>> -
>>> -   vmx_set_rvi(max_irr);
>>> +   if (!is_guest_mode(vcpu)) {
>>> +   vmx_set_rvi(max_irr);
>>> +   } else if (is_guest_mode(vcpu) && !nested_exit_on_intr(vcpu)) {
>>> +   /*
>>> +* Fall back to old way to inject the interrupt since there
>>> +* is no vAPIC-v for L2.
>>> +*/
>>> +   if (vcpu->arch.exception.pending ||
>>> +   vcpu->arch.nmi_injected ||
>>> +   vcpu->arch.interrupt.pending)
>>> +   return;
>>> +   else if (vmx_interrupt_allowed(vcpu)) {
>>> +   kvm_queue_interrupt(vcpu, max_irr, false);
>>> +   vmx_inject_irq(vcpu);
>>> +   }
>>> +   }
>>>  }
>>>  
>>>  static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64
>>> *eoi_exit_bitmap)
>>> 
>> 
>> What hypervisor did you test with? nested_exit_on_intr(vcpu) will
>
>Jailhouse will clear External-interrupt exiting bit. Am I right? Jan.
>
>> return true for both Xen and KVM (nested_exit_on_intr is not the same
>> thing as ACK_INTR_ON_EXIT).
>
>I guess he want to say External-interrupt exiting bit not ACK_INTR_ON_EXIT. 
>

Ah yes, a typo here. 

Regards,
Wanpeng Li 

>> 
>> Paolo
>
>
>Best regards,
>Yang
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] KVM: nVMX: Fix vmptrld fail and vmwrite error when L1 goes down

2014-07-17 Thread Paolo Bonzini

Il 17/07/2014 10:56, Paolo Bonzini ha scritto:

Il 17/07/2014 06:56, Wanpeng Li ha scritto:

This bug can be trigger by L1 goes down directly w/ enable_shadow_vmcs.

[ 6413.158950] kvm: vmptrld   (null)/7800 failed
[ 6413.158954] vmwrite error: reg 401e value 4 (err 1)
[ 6413.158957] CPU: 0 PID: 4840 Comm: qemu-system-x86 Tainted:
G   OE 3.16.0kvm+ #2
[ 6413.158958] Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A05
12/05/2013
[ 6413.158959]  0003 880210c9fb58 81741de9
8800d7433f80
[ 6413.158960]  880210c9fb68 a059fa08 880210c9fb78
a05938bf
[ 6413.158962]  880210c9fba8 a059a97f 8800d7433f80
0003
[ 6413.158963] Call Trace:
[ 6413.158968]  [] dump_stack+0x45/0x56
[ 6413.158972]  [] vmwrite_error+0x2c/0x2e [kvm_intel]
[ 6413.158974]  [] vmcs_writel+0x1f/0x30 [kvm_intel]
[ 6413.158976]  [] free_nested.part.73+0x5f/0x170
[kvm_intel]
[ 6413.158978]  [] vmx_free_vcpu+0x33/0x70 [kvm_intel]
[ 6413.158991]  [] kvm_arch_vcpu_free+0x44/0x50 [kvm]
[ 6413.158998]  [] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm]

Commit 26a865 (KVM: VMX: fix use after free of vmx->loaded_vmcs) fix
the use
after free bug by move free_loaded_vmcs() before free_nested(),
however, this
lead to free loaded_vmcs->vmcs premature and vmptrld load a NULL
pointer during
sync shadow vmcs to vmcs12. In addition, vmwrite which used to disable
shadow
vmcs and reset VMCS_LINK_POINTER failed since there is no valid
current-VMCS.
This patch fix it by skipping sync shadow vmcs and reset vmcs field
for L1
destroy since they will be reinitialized after L1 recreate.

Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index fbce89e..2b28da7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6113,9 +6113,9 @@ static void free_nested(struct vcpu_vmx *vmx)
 return;
 vmx->nested.vmxon = false;
 if (vmx->nested.current_vmptr != -1ull) {
-nested_release_vmcs12(vmx);
 vmx->nested.current_vmptr = -1ull;
 vmx->nested.current_vmcs12 = NULL;
+nested_release_vmcs12(vmx);
 }
 if (enable_shadow_vmcs)
 free_vmcs(vmx->nested.current_shadow_vmcs);



This looks good, I'll apply it to kvm/master.


Hmm, on second thought the lifetimes of the VMCSes are a total mess. 
Let me look more at this.


Paolo

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/3] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2014-07-17:
> Il 17/07/2014 06:56, Wanpeng Li ha scritto:
>> This patch fix bug reported in
>> https://bugzilla.kernel.org/show_bug.cgi?id=73331, after the patch
>> http://www.spinics.net/lists/kvm/msg105230.html applied, there is some
>> progress and the L2 can boot up, however, slowly. The original idea of
>> this fix vid injection patch is from "Zhang, Yang Z"
>> .
>> 
>> Interrupt which delivered by vid should be injected to L1 by L0 if
>> current is in L1, or should be injected to L2 by L0 through the old
>> injection way if L1 doesn't have set VM_EXIT_ACK_INTR_ON_EXIT. The
>> current logic doen't consider these cases. This patch fix it by vid
>> intr to L1 if current is L1 or L2 through old injection way if L1
>> doen't have VM_EXIT_ACK_INTR_ON_EXIT set.
>> 
>> Signed-off-by: Wanpeng Li 
>> Signed-off-by: "Zhang, Yang Z" 
>> ---
>>  arch/x86/kvm/vmx.c | 18 --
>>  1 file changed, 16 insertions(+), 2 deletions(-)
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index
>> 021d84a..ad36646 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -7112,8 +7112,22 @@ static void vmx_hwapic_irr_update(struct
>> kvm_vcpu *vcpu, int max_irr)  {
>>  if (max_irr == -1)
>>  return;
>> -
>> -vmx_set_rvi(max_irr);
>> +if (!is_guest_mode(vcpu)) {
>> +vmx_set_rvi(max_irr);
>> +} else if (is_guest_mode(vcpu) && !nested_exit_on_intr(vcpu)) {
>> +/*
>> + * Fall back to old way to inject the interrupt since there
>> + * is no vAPIC-v for L2.
>> + */
>> +if (vcpu->arch.exception.pending ||
>> +vcpu->arch.nmi_injected ||
>> +vcpu->arch.interrupt.pending)
>> +return;
>> +else if (vmx_interrupt_allowed(vcpu)) {
>> +kvm_queue_interrupt(vcpu, max_irr, false);
>> +vmx_inject_irq(vcpu);
>> +}
>> +}
>>  }
>>  
>>  static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64
>> *eoi_exit_bitmap)
>> 
> 
> What hypervisor did you test with? nested_exit_on_intr(vcpu) will

Jailhouse will clear External-interrupt exiting bit. Am I right? Jan.

> return true for both Xen and KVM (nested_exit_on_intr is not the same
> thing as ACK_INTR_ON_EXIT).

I guess he want to say External-interrupt exiting bit not ACK_INTR_ON_EXIT. 

> 
> Paolo


Best regards,
Yang


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] KVM: nVMX: Fix virtual interrupt delivery injection

2014-07-17 Thread Paolo Bonzini

Il 17/07/2014 06:56, Wanpeng Li ha scritto:

This patch fix bug reported in 
https://bugzilla.kernel.org/show_bug.cgi?id=73331,
after the patch http://www.spinics.net/lists/kvm/msg105230.html applied, there 
is
some progress and the L2 can boot up, however, slowly. The original idea of this
fix vid injection patch is from "Zhang, Yang Z" .

Interrupt which delivered by vid should be injected to L1 by L0 if current is in
L1, or should be injected to L2 by L0 through the old injection way if L1 
doesn't
have set VM_EXIT_ACK_INTR_ON_EXIT. The current logic doen't consider these 
cases.
This patch fix it by vid intr to L1 if current is L1 or L2 through old injection
way if L1 doen't have VM_EXIT_ACK_INTR_ON_EXIT set.

Signed-off-by: Wanpeng Li 
Signed-off-by: "Zhang, Yang Z" 
---
 arch/x86/kvm/vmx.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 021d84a..ad36646 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7112,8 +7112,22 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, 
int max_irr)
 {
if (max_irr == -1)
return;
-
-   vmx_set_rvi(max_irr);
+   if (!is_guest_mode(vcpu)) {
+   vmx_set_rvi(max_irr);
+   } else if (is_guest_mode(vcpu) && !nested_exit_on_intr(vcpu)) {
+   /*
+* Fall back to old way to inject the interrupt since there
+* is no vAPIC-v for L2.
+*/
+   if (vcpu->arch.exception.pending ||
+   vcpu->arch.nmi_injected ||
+   vcpu->arch.interrupt.pending)
+   return;
+   else if (vmx_interrupt_allowed(vcpu)) {
+   kvm_queue_interrupt(vcpu, max_irr, false);
+   vmx_inject_irq(vcpu);
+   }
+   }
 }

 static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)



What hypervisor did you test with? nested_exit_on_intr(vcpu) will return 
true for both Xen and KVM (nested_exit_on_intr is not the same thing as 
ACK_INTR_ON_EXIT).


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] KVM: nVMX: Fix vmptrld fail and vmwrite error when L1 goes down

2014-07-17 Thread Paolo Bonzini

Il 17/07/2014 06:56, Wanpeng Li ha scritto:

This bug can be trigger by L1 goes down directly w/ enable_shadow_vmcs.

[ 6413.158950] kvm: vmptrld   (null)/7800 failed
[ 6413.158954] vmwrite error: reg 401e value 4 (err 1)
[ 6413.158957] CPU: 0 PID: 4840 Comm: qemu-system-x86 Tainted: G   OE 
3.16.0kvm+ #2
[ 6413.158958] Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A05 
12/05/2013
[ 6413.158959]  0003 880210c9fb58 81741de9 
8800d7433f80
[ 6413.158960]  880210c9fb68 a059fa08 880210c9fb78 
a05938bf
[ 6413.158962]  880210c9fba8 a059a97f 8800d7433f80 
0003
[ 6413.158963] Call Trace:
[ 6413.158968]  [] dump_stack+0x45/0x56
[ 6413.158972]  [] vmwrite_error+0x2c/0x2e [kvm_intel]
[ 6413.158974]  [] vmcs_writel+0x1f/0x30 [kvm_intel]
[ 6413.158976]  [] free_nested.part.73+0x5f/0x170 [kvm_intel]
[ 6413.158978]  [] vmx_free_vcpu+0x33/0x70 [kvm_intel]
[ 6413.158991]  [] kvm_arch_vcpu_free+0x44/0x50 [kvm]
[ 6413.158998]  [] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm]

Commit 26a865 (KVM: VMX: fix use after free of vmx->loaded_vmcs) fix the use
after free bug by move free_loaded_vmcs() before free_nested(), however, this
lead to free loaded_vmcs->vmcs premature and vmptrld load a NULL pointer during
sync shadow vmcs to vmcs12. In addition, vmwrite which used to disable shadow
vmcs and reset VMCS_LINK_POINTER failed since there is no valid current-VMCS.
This patch fix it by skipping sync shadow vmcs and reset vmcs field for L1
destroy since they will be reinitialized after L1 recreate.

Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index fbce89e..2b28da7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6113,9 +6113,9 @@ static void free_nested(struct vcpu_vmx *vmx)
return;
vmx->nested.vmxon = false;
if (vmx->nested.current_vmptr != -1ull) {
-   nested_release_vmcs12(vmx);
vmx->nested.current_vmptr = -1ull;
vmx->nested.current_vmcs12 = NULL;
+   nested_release_vmcs12(vmx);
}
if (enable_shadow_vmcs)
free_vmcs(vmx->nested.current_shadow_vmcs);



This looks good, I'll apply it to kvm/master.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 -next 5/9] CMA: generalize CMA reserved area management functionality

2014-07-17 Thread Marek Szyprowski

Hello,

On 2014-06-16 07:40, Joonsoo Kim wrote:

Currently, there are two users on CMA functionality, one is the DMA
subsystem and the other is the KVM on powerpc. They have their own code
to manage CMA reserved area even if they looks really similar.
>From my guess, it is caused by some needs on bitmap management. KVM side
wants to maintain bitmap not for 1 page, but for more size. Eventually it
use bitmap where one bit represents 64 pages.

When I implement CMA related patches, I should change those two places
to apply my change and it seem to be painful to me. I want to change
this situation and reduce future code management overhead through
this patch.

This change could also help developer who want to use CMA in their
new feature development, since they can use CMA easily without
copying & pasting this reserved area management code.

In previous patches, we have prepared some features to generalize
CMA reserved area management and now it's time to do it. This patch
moves core functions to mm/cma.c and change DMA APIs to use
these functions.

There is no functional change in DMA APIs.

v2: There is no big change from v1 in mm/cma.c. Mostly renaming.
v3: remove log2.h in dma-contiguous.c (Minchan)
 add some accessor functions to pass aligned base and size to
 dma_contiguous_early_fixup() function
 move MAX_CMA_AREAS to cma.h


I've just noticed that MAX_CMA_AREAS is used also by 
arch/arm/mm/dma-mapping.c,
so we need to provide correct definition if CMA is disabled in kconfig. 
I will

send a fixup patch in a few minutes.


Acked-by: Michal Nazarewicz 
Acked-by: Zhang Yanfei 
Acked-by: Minchan Kim 
Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Joonsoo Kim 

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 4c88935..3116880 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -26,6 +26,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include 

  #include 
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 00e13ce..4eac559 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -283,16 +283,6 @@ config CMA_ALIGNMENT
  
  	  If unsure, leave the default value "8".
  
-config CMA_AREAS

-   int "Maximum count of the CMA device-private areas"
-   default 7
-   help
- CMA allows to create CMA areas for particular devices. This parameter
- sets the maximum number of such device private CMA areas in the
- system.
-
- If unsure, leave the default value "7".
-
  endif
  
  endmenu

diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
index c6eeb2c..0411c1c 100644
--- a/drivers/base/dma-contiguous.c
+++ b/drivers/base/dma-contiguous.c
@@ -24,25 +24,9 @@
  
  #include 

  #include 
-#include 
-#include 
-#include 
  #include 
-#include 
-#include 
-#include 
  #include 
-#include 
-
-struct cma {
-   unsigned long   base_pfn;
-   unsigned long   count;
-   unsigned long   *bitmap;
-   unsigned int order_per_bit; /* Order of pages represented by one bit */
-   struct mutexlock;
-};
-
-struct cma *dma_contiguous_default_area;
+#include 
  
  #ifdef CONFIG_CMA_SIZE_MBYTES

  #define CMA_SIZE_MBYTES CONFIG_CMA_SIZE_MBYTES
@@ -50,6 +34,8 @@ struct cma *dma_contiguous_default_area;
  #define CMA_SIZE_MBYTES 0
  #endif
  
+struct cma *dma_contiguous_default_area;

+
  /*
   * Default global CMA area size can be defined in kernel's .config.
   * This is useful mainly for distro maintainers to create a kernel
@@ -156,169 +142,6 @@ void __init dma_contiguous_reserve(phys_addr_t limit)
}
  }
  
-static DEFINE_MUTEX(cma_mutex);

-
-static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int align_order)
-{
-   return (1 << (align_order >> cma->order_per_bit)) - 1;
-}
-
-static unsigned long cma_bitmap_maxno(struct cma *cma)
-{
-   return cma->count >> cma->order_per_bit;
-}
-
-static unsigned long cma_bitmap_pages_to_bits(struct cma *cma,
-   unsigned long pages)
-{
-   return ALIGN(pages, 1 << cma->order_per_bit) >> cma->order_per_bit;
-}
-
-static void cma_clear_bitmap(struct cma *cma, unsigned long pfn, int count)
-{
-   unsigned long bitmap_no, bitmap_count;
-
-   bitmap_no = (pfn - cma->base_pfn) >> cma->order_per_bit;
-   bitmap_count = cma_bitmap_pages_to_bits(cma, count);
-
-   mutex_lock(&cma->lock);
-   bitmap_clear(cma->bitmap, bitmap_no, bitmap_count);
-   mutex_unlock(&cma->lock);
-}
-
-static int __init cma_activate_area(struct cma *cma)
-{
-   int bitmap_size = BITS_TO_LONGS(cma_bitmap_maxno(cma)) * sizeof(long);
-   unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
-   unsigned i = cma->count >> pageblock_order;
-   struct zone *zone;
-
-   cma->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
-
-   if (!cma->bitmap)
-   return -ENOMEM;
-
-   WARN_ON_ONCE(!pfn_valid(pfn));
-   zone = page_zone(pfn_to_pa