Re: [PATCH 5/6] KVM: PPC: Book3S HV: Access XICS in BE

2014-06-17 Thread Alexander Graf


On 17.06.14 03:02, Paul Mackerras wrote:

On Wed, Jun 11, 2014 at 12:33:50PM +0200, Alexander Graf wrote:

On the exit path from the guest we check what type of interrupt we received
if we received one. This means we're doing hardware access to the XICS interrupt
controller.

However, when running on a little endian system, this access is byte reversed.

So let's make sure to swizzle the bytes back again and virtually make XICS
accesses big endian.

...


@@ -2241,7 +2253,8 @@ kvmppc_read_intr:
  42:   /* It's not an IPI and it's for the host, stash it in the PACA
 * before exit, it will be picked up by the host ICP driver
 */
-   stw r0, HSTATE_SAVED_XIRR(r13)
+   li  r4, HSTATE_SAVED_XIRR
+   STWX_BE r0, r13, r4

This is a paca field, not something mandated by PAPR or shared with
the guest, so why do we need to keep it BE?  If you do make it BE,
don't you also need to fix kvmppc_get_xics_latch()?


Yikes. Yes. Thanks a lot for the catch!


Alex



[PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread Mihai Caraman
On vcpu schedule, the condition checked for tlb pollution is too loose.
The tlb entries of a vcpu become polluted (vs stale) only when a different
vcpu within the same logical partition runs in-between. Optimize the tlb
invalidation condition taking into account the logical partition id.

With the new invalidation condition, a guest shows a 4% performance improvement
on P5020DS while running a memory stress application with the cpu oversubscribed,
the other guest running a cpu-intensive workload.

Guest - old invalidation condition
  real 3.89
  user 3.87
  sys 0.01

Guest - enhanced invalidation condition
  real 3.75
  user 3.73
  sys 0.01

Host
  real 3.70
  user 1.85
  sys 0.00

The memory stress application accesses 4KB pages backed by 75% of available
TLB0 entries:

char foo[ENTRIES][4096] __attribute__ ((aligned (4096)));

int main()
{
        char bar;
        int i, j;

        for (i = 0; i < ITERATIONS; i++)
                for (j = 0; j < ENTRIES; j++)
                        bar = foo[j][0];

        return 0;
}

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Cc: Scott Wood scottw...@freescale.com
---
v2:
 - improve patch name and description
 - add performance results

 arch/powerpc/kvm/e500mc.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 17e4562..d3b814b0 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -111,10 +111,12 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr)
 }
 
 static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(int, last_lpid_on_cpu);
 
 static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 {
struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu);
+   bool update_last = false, inval_tlb = false;
 
kvmppc_booke_vcpu_load(vcpu, cpu);
 
@@ -140,12 +142,24 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
mtspr(SPRN_GDEAR, vcpu->arch.shared->dar);
mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
-   if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-   __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
-   kvmppc_e500_tlbil_all(vcpu_e500);
+   if (vcpu->arch.oldpir != mfspr(SPRN_PIR)) {
+   /* stale tlb entries */
+   inval_tlb = update_last = true;
+   } else if (__get_cpu_var(last_vcpu_on_cpu) != vcpu) {
+   update_last = true;
+   /* polluted tlb entries */
+   inval_tlb = __get_cpu_var(last_lpid_on_cpu) ==
+   vcpu->kvm->arch.lpid;
+   }
+
+   if (update_last) {
__get_cpu_var(last_vcpu_on_cpu) = vcpu;
+   __get_cpu_var(last_lpid_on_cpu) = vcpu->kvm->arch.lpid;
}
 
+   if (inval_tlb)
+   kvmppc_e500_tlbil_all(vcpu_e500);
+
kvmppc_load_guest_fp(vcpu);
 }
 
-- 
1.7.11.7



Re: [PATCH] KVM: PPC: Book3s PR: Disable AIL mode with OPAL

2014-06-17 Thread Alexander Graf


On 12.06.14 05:56, Paul Mackerras wrote:

On Tue, Jun 10, 2014 at 07:23:00PM +0200, Alexander Graf wrote:

When we're using PR KVM we must not allow the CPU to take interrupts
in virtual mode, as the SLB does not contain host kernel mappings
when running inside the guest context.

To make sure we get good performance for non-KVM tasks but still
properly functioning PR KVM, let's just disable AIL whenever a vcpu
is scheduled in.

This patch fixes running PR KVM on POWER8 bare metal for me.

We already handle this for the situation where we're running under a
hypervisor with the calls to pSeries_disable_reloc_on_exc() and
pSeries_enable_reloc_on_exc() in kvmppc_core_init_vm_pr() and
kvmppc_core_destroy_vm_pr() respectively.

The obvious approach to fixing this problem would be to generalize
those calls, perhaps via a ppc_md callback, to work on the powernv
platform too.  If you don't want to do that, for instance because
those calls are defined to operate across the whole machine rather
than a single CPU thread, and you prefer to affect just the one thread
we're running on, then I think you need to explain that in the commit
message.


That's what I did at first, yes. Unfortunately the pSeries call is 
system global, while we need to do the AIL switching per cpu on bare metal.


Once you start considering CPU hotplug and how that affects the 
secondary bringup paths you start wondering whether it's worth the 
hassle :). This way you can have a single guest running which only slows 
down (disables AIL) on its own vcpu, while all the others still enjoy 
the benefits of AIL.



Alex



Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Benjamin Herrenschmidt
On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
 On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
  On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
  Also, why don't we use twi always or something else that actually is
  defined as illegal instruction? I would like to see this shared with
  book3s_32 PR.
  twi will be directed to the guest on HV no ? We want a real illegal
  because those go to the host (for potential emulation by the HV).
 
 Ah, good point. I guess we need different one for PR and HV then to 
 ensure compatibility with older ISAs on PR.

Well, we also need to be careful with what happens if a PR guest puts
that instruction in, does that stop its HV guest/host ?

What if it's done in userspace ? Does that stop the kernel ? :-)

Maddy, I haven't checked, does your patch ensure that we only ever stop
if the instruction is at a recorded bkpt address ? It still means that a
userspace process can practically DOS its kernel by issuing a lot of
these causing a crapload of exits.

Cheers,
Ben.

 Alex
 
  I'm
  trying to see if I can get the architect to set one in stone in a future
  proof way.
 
  Cheers,
  Ben.
 
 




Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Alexander Graf


On 17.06.14 11:32, Benjamin Herrenschmidt wrote:

On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:

On 17.06.14 11:22, Benjamin Herrenschmidt wrote:

On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:

Also, why don't we use twi always or something else that actually is
defined as illegal instruction? I would like to see this shared with
book3s_32 PR.

twi will be directed to the guest on HV no ? We want a real illegal
because those go to the host (for potential emulation by the HV).

Ah, good point. I guess we need different one for PR and HV then to
ensure compatibility with older ISAs on PR.

Well, we also need to be careful with what happens if a PR guest puts
that instruction in, do that stop its HV guest/host ?

What if it's done in userspace ? Do that stop the kernel ? :-)


The way SW breakpointing is handled is that when we see one, it gets 
deflected into user space. User space then has an array of breakpoints 
it configured itself. If the breakpoint is part of that list, it 
consumes it. If not, it injects a debug interrupt (program in this case) 
into the guest.


That way we can overlay that one instruction with as many layers as we 
like :). We only get a performance hit on execution of that instruction.
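
To make the flow concrete, here is a minimal user-space sketch: 
handle_sw_breakpoint() and reinject_program_interrupt() are hypothetical 
names, not the actual QEMU code; only run->debug.arch.address comes from 
the patch under discussion.

#include <stdbool.h>
#include <stdint.h>
#include <linux/kvm.h>

/* Hypothetical user-space sketch of the flow described above: on a
 * KVM_EXIT_DEBUG exit, check the address against our own breakpoint list
 * and, if it is not one of ours, deflect it back into the guest. */
extern void reinject_program_interrupt(uint64_t address);      /* hypothetical */

static bool handle_sw_breakpoint(struct kvm_run *run,
                                 const uint64_t *sw_breakpoints, int nr_bp)
{
        int i;

        for (i = 0; i < nr_bp; i++) {
                if (sw_breakpoints[i] == run->debug.arch.address)
                        return true;    /* ours: hand control to the debugger */
        }

        /* not one of ours: deflect a program interrupt into the guest */
        reinject_program_interrupt(run->debug.arch.address);
        return false;
}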



Maddy, I haven't checked, does your patch ensure that we only ever stop
if the instruction is at a recorded bkpt address ? It still means that a
userspace process can practically DOS its kernel by issuing a lot of
these causing a crapload of exits.


Only user space knows about its breakpoint addresses, so we have to 
deflect. However since time still ticks on, we only increase jitter of 
the guest. The process would still get scheduled away after the same 
amount of real time, no?



Alex



[PATCH v2] KVM: PPC: Book3s PR: Disable AIL mode with OPAL

2014-06-17 Thread Alexander Graf
When we're using PR KVM we must not allow the CPU to take interrupts
in virtual mode, as the SLB does not contain host kernel mappings
when running inside the guest context.

To make sure we get good performance for non-KVM tasks but still
properly functioning PR KVM, let's just disable AIL whenever a vcpu
is scheduled in.

This is fundamentally different from how we deal with AIL on pSeries
type machines where we disable AIL for the whole machine as soon as
a single KVM VM is up.

The reason for that is easy - on pSeries we do not have control over
per-cpu configuration of AIL. We also don't want to mess with CPU hotplug
races and AIL configuration, so setting it per CPU is easier and more
flexible.

This patch fixes running PR KVM on POWER8 bare metal for me.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - Extend patch description

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 3da412e..8ea7da4 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -71,6 +71,12 @@ static void kvmppc_core_vcpu_load_pr(struct kvm_vcpu *vcpu, int cpu)
svcpu->in_use = 0;
svcpu_put(svcpu);
 #endif
+
+   /* Disable AIL if supported */
+   if (cpu_has_feature(CPU_FTR_HVMODE) &&
+       cpu_has_feature(CPU_FTR_ARCH_207S))
+   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR)  ~LPCR_AIL);
+
vcpu->cpu = smp_processor_id();
 #ifdef CONFIG_PPC_BOOK3S_32
current->thread.kvm_shadow_vcpu = vcpu->arch.shadow_vcpu;
@@ -91,6 +97,12 @@ static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
 
kvmppc_giveup_ext(vcpu, MSR_FP | MSR_VEC | MSR_VSX);
kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+
+   /* Enable AIL if supported */
+   if (cpu_has_feature(CPU_FTR_HVMODE) &&
+       cpu_has_feature(CPU_FTR_ARCH_207S))
+   mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) | LPCR_AIL_3);
+
vcpu->cpu = -1;
 }
 


Re: [PATCH v2] KVM: PPC: Book3s PR: Disable AIL mode with OPAL

2014-06-17 Thread Paul Mackerras
On Tue, Jun 17, 2014 at 12:03:58PM +0200, Alexander Graf wrote:
 When we're using PR KVM we must not allow the CPU to take interrupts
 in virtual mode, as the SLB does not contain host kernel mappings
 when running inside the guest context.
 
 To make sure we get good performance for non-KVM tasks but still
 properly functioning PR KVM, let's just disable AIL whenever a vcpu
 is scheduled in.
 
 This is fundamentally different from how we deal with AIL on pSeries
 type machines where we disable AIL for the whole machine as soon as
 a single KVM VM is up.
 
 The reason for that is easy - on pSeries we do not have control over
 per-cpu configuration of AIL. We also don't want to mess with CPU hotplug
 races and AIL configuration, so setting it per CPU is easier and more
 flexible.
 
 This patch fixes running PR KVM on POWER8 bare metal for me.
 
 Signed-off-by: Alexander Graf ag...@suse.de

Acked-by: Paul Mackerras pau...@samba.org


Re: [PATCH 5/6] KVM: PPC: Book3S HV: Access XICS in BE

2014-06-17 Thread Alexander Graf


On 17.06.14 10:37, Alexander Graf wrote:


On 17.06.14 03:02, Paul Mackerras wrote:

On Wed, Jun 11, 2014 at 12:33:50PM +0200, Alexander Graf wrote:
On the exit path from the guest we check what type of interrupt we received
if we received one. This means we're doing hardware access to the XICS interrupt
controller.

However, when running on a little endian system, this access is byte reversed.

So let's make sure to swizzle the bytes back again and virtually make XICS
accesses big endian.

...


@@ -2241,7 +2253,8 @@ kvmppc_read_intr:
 42:	/* It's not an IPI and it's for the host, stash it in the PACA
	 * before exit, it will be picked up by the host ICP driver
	 */
-	stw	r0, HSTATE_SAVED_XIRR(r13)
+	li	r4, HSTATE_SAVED_XIRR
+	STWX_BE	r0, r13, r4

This is a paca field, not something mandated by PAPR or shared with
the guest, so why do we need to keep it BE?  If you do make it BE,
don't you also need to fix kvmppc_get_xics_latch()?


Yikes. Yes. Thanks a lot for the catch!


Eh, no. What we do is we read (good on BE, byte reversed) into r0. Then 
we swab32() from r0 to r3 on LE, mr from r0 to r3 on BE.


r3 gets truncated along the way.

The reason we maintain r0 as wrong-endian is that we write it back using 
the cache inhibited stwcix instruction:



stwcix  r0, r6, r7  /* EOI it */


So during the lifetime of r0 as XIRR it's always byte-reversed on LE. 
That's why we store it using STWX_BE into hstate, because that's the 
time when we actually swab32() it for further interpretation.


Alternatively I could clobber a different register and maintain the byte 
swapped variant in there if you prefer.
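
In C terms the flow looks roughly like this; xics_read_ci(), xics_write_ci() 
and xirr_is_ipi() are hypothetical stand-ins for the cache-inhibited 
lwzcix/stwcix accesses and the IPI check, and only saved_xirr mirrors the real 
HSTATE_SAVED_XIRR field:

#include <linux/types.h>
#include <asm/byteorder.h>
#include <asm/kvm_book3s_asm.h>

/* Illustrative sketch only; the real code is the assembly in
 * book3s_hv_rmhandlers.S. */
extern u32 xics_read_ci(void);
extern void xics_write_ci(u32 raw);
extern bool xirr_is_ipi(u32 xirr);

static void read_intr_sketch(struct kvmppc_host_state *hstate)
{
        u32 raw  = xics_read_ci();      /* "r0": BE layout, byte-reversed on LE */
        u32 xirr = be32_to_cpu(raw);    /* "r3"/"r4": swab32() on LE, plain copy on BE */

        if ((xirr & 0xff) == 0)
                return;                 /* nothing pending in the ICP */

        if (xirr_is_ipi(xirr))
                xics_write_ci(raw);     /* EOI: write the raw value straight back */
        else
                hstate->saved_xirr = xirr; /* stash the native-endian value for the host ICP */
}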



Alex



Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Madhavan Srinivasan
On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:
 
 On 14.06.14 23:08, Madhavan Srinivasan wrote:
 This patch adds kernel side support for software breakpoint.
 Design is that, by using an illegal instruction, we trap to hypervisor
 via Emulation Assistance interrupt, where we check for the illegal
 instruction
 and accordingly we return to Host or Guest. Patch mandates use of
 abs instruction
 (primary opcode 31 and extended opcode 360) as sw breakpoint instruction.
 Based on PowerISA v2.01, ABS instruction has been dropped from the
 architecture
 and treated an illegal instruction.

 Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com
 ---
   arch/powerpc/kvm/book3s.c|  3 ++-
   arch/powerpc/kvm/book3s_hv.c | 23 +++
   2 files changed, 21 insertions(+), 5 deletions(-)

 diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
 index c254c27..b40fe5d 100644
 --- a/arch/powerpc/kvm/book3s.c
 +++ b/arch/powerpc/kvm/book3s.c
 @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
 *vcpu,
   int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
   struct kvm_guest_debug *dbg)
   {
 -return -EINVAL;
 +vcpu->guest_debug = dbg->control;
 +return 0;
   }
 void kvmppc_decrementer_func(unsigned long data)
 diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
 index 7a12edb..688421d 100644
 --- a/arch/powerpc/kvm/book3s_hv.c
 +++ b/arch/powerpc/kvm/book3s_hv.c
 @@ -67,6 +67,14 @@
   /* Used as a null value for timebase values */
   #define TB_NIL(~(u64)0)
   +/*
 + * SW_BRK_DBG_INT is debug Instruction for supporting Software
 Breakpoint.
 + * Instruction mnemonic is ABS, primary opcode is 31 and extended
 opcode is 360.
 + * Based on PowerISA v2.01, ABS instruction has been dropped from the
 architecture
 + * and treated an illegal instruction.
 + */
 +#define SW_BRK_DBG_INT 0x7c0002d0
 
 The instruction we use to trap needs to get exposed to user space via a
 ONE_REG property.
 

Yes. I got to know about that from Bharat (patchset ppc debug: Add
debug stub support). I will change it.

 Also, why don't we use twi always or something else that actually is
 defined as illegal instruction? I would like to see this shared with
 book3s_32 PR.
 
 +
   static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
   static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
   @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct
 kvm_run *run, struct kvm_vcpu *vcpu,
   break;
   /*
* This occurs if the guest executes an illegal instruction.
 - * We just generate a program interrupt to the guest, since
 - * we don't emulate any guest instructions at this stage.
 + * To support software breakpoint, we check for the sw breakpoint
 + * instruction to return back to host, if not we just generate a
 + * program interrupt to the guest.
*/
   case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
 -kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
 -r = RESUME_GUEST;
 +if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {
 
 Don't access last_inst directly. Instead use the provided helpers.
 

Ok. Will look and replace it.

 +run->exit_reason = KVM_EXIT_DEBUG;
 +run->debug.arch.address = vcpu->arch.pc;
 +r = RESUME_HOST;
 +} else {
 +kvmppc_core_queue_program(vcpu, 0x8);
 
 magic numbers
^
I did not understand this?

 +r = RESUME_GUEST;
 +}
   break;
   /*
* This occurs if the guest (kernel or userspace), does
 something that
 
 Please enable PR KVM as well while you're at it.
 
My bad, I did not try the PR KVM. I will try it out.

 
 Alex
 
Thanks for review
Regards
Maddy



Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Alexander Graf


On 17.06.14 13:07, Madhavan Srinivasan wrote:

On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:

On 14.06.14 23:08, Madhavan Srinivasan wrote:

This patch adds kernel side support for software breakpoint.
Design is that, by using an illegal instruction, we trap to hypervisor
via Emulation Assistance interrupt, where we check for the illegal
instruction
and accordingly we return to Host or Guest. Patch mandates use of
abs instruction
(primary opcode 31 and extended opcode 360) as sw breakpoint instruction.
Based on PowerISA v2.01, ABS instruction has been dropped from the
architecture
and treated an illegal instruction.

Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com
---
   arch/powerpc/kvm/book3s.c|  3 ++-
   arch/powerpc/kvm/book3s_hv.c | 23 +++
   2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index c254c27..b40fe5d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
*vcpu,
   int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
   struct kvm_guest_debug *dbg)
   {
-return -EINVAL;
+vcpu->guest_debug = dbg->control;
+return 0;
   }
 void kvmppc_decrementer_func(unsigned long data)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7a12edb..688421d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -67,6 +67,14 @@
   /* Used as a null value for timebase values */
   #define TB_NIL(~(u64)0)
   +/*
+ * SW_BRK_DBG_INT is debug Instruction for supporting Software
Breakpoint.
+ * Instruction mnemonic is ABS, primary opcode is 31 and extended
opcode is 360.
+ * Based on PowerISA v2.01, ABS instruction has been dropped from the
architecture
+ * and treated an illegal instruction.
+ */
+#define SW_BRK_DBG_INT 0x7c0002d0

The instruction we use to trap needs to get exposed to user space via a
ONE_REG property.


Yes. I got to know about that from Bharat (patchset ppc debug: Add
debug stub support). I will change it.


Also, why don't we use twi always or something else that actually is
defined as illegal instruction? I would like to see this shared with
book3s_32 PR.


+
   static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
   static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
   @@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct
kvm_run *run, struct kvm_vcpu *vcpu,
   break;
   /*
* This occurs if the guest executes an illegal instruction.
- * We just generate a program interrupt to the guest, since
- * we don't emulate any guest instructions at this stage.
+ * To support software breakpoint, we check for the sw breakpoint
+ * instruction to return back to host, if not we just generate a
+ * program interrupt to the guest.
*/
   case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
-kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
-r = RESUME_GUEST;
+if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {

Don't access last_inst directly. Instead use the provided helpers.


Ok. Will look and replace it.


+run->exit_reason = KVM_EXIT_DEBUG;
+run->debug.arch.address = vcpu->arch.pc;
+r = RESUME_HOST;
+} else {
+kvmppc_core_queue_program(vcpu, 0x8);

magic numbers

^
I did not understand this?


You're replacing the readable SRR1_PROGILL with the unreadable 0x8. 
That's bad.



Alex



Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Madhavan Srinivasan
On Tuesday 17 June 2014 03:02 PM, Benjamin Herrenschmidt wrote:
 On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
 On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
 On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
 Also, why don't we use twi always or something else that actually is
 defined as illegal instruction? I would like to see this shared with
 book3s_32 PR.
 twi will be directed to the guest on HV no ? We want a real illegal
 because those go to the host (for potential emulation by the HV).

 Ah, good point. I guess we need different one for PR and HV then to 
 ensure compatibility with older ISAs on PR.
 
 Well, we also need to be careful with what happens if a PR guest puts
 that instruction in, do that stop its HV guest/host ?
 

Damn, my mail client is messed up. I did not see the mail till now.


I haven't tried this in the case of a PR guest kernel. I will need to try this
before commenting.

 What if it's done in userspace ? Do that stop the kernel ? :-)
 

Basically the flow is that when we see this instruction, we return to the host,
and the host checks for the address in the SW breakpoint array; if it is not found, it returns to the kernel.

 Maddy, I haven't checked, does your patch ensure that we only ever stop
 if the instruction is at a recorded bkpt address ? It still means that a
 userspace process can practically DOS its kernel by issuing a lot of
 these causing a crapload of exits.
 
This is valid; userspace can create a mess and we need to handle this, meaning
in case we don't find a valid SW breakpoint for this address in the
host, we need to route it to the guest and kill it at the app.

Regards
Maddy

 Cheers,
 Ben.
 
 Alex

 I'm
 trying to see if I can get the architect to set one in stone in a future
 proof way.

 Cheers,
 Ben.


 
 



Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Madhavan Srinivasan
On Tuesday 17 June 2014 03:13 PM, Alexander Graf wrote:
 
 On 17.06.14 11:32, Benjamin Herrenschmidt wrote:
 On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
 On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
 On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
 Also, why don't we use twi always or something else that actually is
 defined as illegal instruction? I would like to see this shared with
 book3s_32 PR.
 twi will be directed to the guest on HV no ? We want a real illegal
 because those go to the host (for potential emulation by the HV).
 Ah, good point. I guess we need different one for PR and HV then to
 ensure compatibility with older ISAs on PR.
 Well, we also need to be careful with what happens if a PR guest puts
 that instruction in, do that stop its HV guest/host ?

 What if it's done in userspace ? Do that stop the kernel ? :-)
 
 The way SW breakpointing is handled is that when we see one, it gets
 deflected into user space. User space then has an array of breakpoints
 it configured itself. If the breakpoint is part of that list, it
 consumes it. If not, it injects a debug interrupt (program in this case)
 into the guest.
 
 That way we can overlay that one instruction with as many layers as we
 like :). We only get a performance hit on execution of that instruction.
 
 Maddy, I haven't checked, does your patch ensure that we only ever stop
 if the instruction is at a recorded bkpt address ? It still means that a
 userspace process can practically DOS its kernel by issuing a lot of
 these causing a crapload of exits.
 
 Only user space knows about its breakpoint addresses, so we have to
 deflect. However since time still ticks on, we only increase jitter of
 the guest. The process would still get scheduled away after the same
 ^^^ Where is this taken care of? I am still trying to understand. Kindly
explain or point me to the code. That would help.

 amount of real time, no?
 
 
 Alex
 
Thanks for review.
Regards
Maddy





Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Madhavan Srinivasan
On Tuesday 17 June 2014 04:38 PM, Alexander Graf wrote:
 
 On 17.06.14 13:07, Madhavan Srinivasan wrote:
 On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:
 On 14.06.14 23:08, Madhavan Srinivasan wrote:
 This patch adds kernel side support for software breakpoint.
 Design is that, by using an illegal instruction, we trap to hypervisor
 via Emulation Assistance interrupt, where we check for the illegal
 instruction
 and accordingly we return to Host or Guest. Patch mandates use of
 abs instruction
 (primary opcode 31 and extended opcode 360) as sw breakpoint
 instruction.
 Based on PowerISA v2.01, ABS instruction has been dropped from the
 architecture
 and treated an illegal instruction.

 Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com
 ---
arch/powerpc/kvm/book3s.c|  3 ++-
arch/powerpc/kvm/book3s_hv.c | 23 +++
2 files changed, 21 insertions(+), 5 deletions(-)

 diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
 index c254c27..b40fe5d 100644
 --- a/arch/powerpc/kvm/book3s.c
 +++ b/arch/powerpc/kvm/book3s.c
 @@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
 *vcpu,
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
{
 -return -EINVAL;
 +vcpu->guest_debug = dbg->control;
 +return 0;
}
  void kvmppc_decrementer_func(unsigned long data)
 diff --git a/arch/powerpc/kvm/book3s_hv.c
 b/arch/powerpc/kvm/book3s_hv.c
 index 7a12edb..688421d 100644
 --- a/arch/powerpc/kvm/book3s_hv.c
 +++ b/arch/powerpc/kvm/book3s_hv.c
 @@ -67,6 +67,14 @@
/* Used as a null value for timebase values */
#define TB_NIL(~(u64)0)
+/*
 + * SW_BRK_DBG_INT is debug Instruction for supporting Software
 Breakpoint.
 + * Instruction mnemonic is ABS, primary opcode is 31 and extended
 opcode is 360.
 + * Based on PowerISA v2.01, ABS instruction has been dropped from the
 architecture
 + * and treated an illegal instruction.
 + */
 +#define SW_BRK_DBG_INT 0x7c0002d0
 The instruction we use to trap needs to get exposed to user space via a
 ONE_REG property.

 Yes. I got to know about that from Bharat (patchset ppc debug: Add
 debug stub support). I will change it.

 Also, why don't we use twi always or something else that actually is
 defined as illegal instruction? I would like to see this shared with
 book3s_32 PR.

 +
static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
@@ -721,12 +729,19 @@ static int kvmppc_handle_exit_hv(struct
 kvm_run *run, struct kvm_vcpu *vcpu,
break;
/*
 * This occurs if the guest executes an illegal instruction.
 - * We just generate a program interrupt to the guest, since
 - * we don't emulate any guest instructions at this stage.
 + * To support software breakpoint, we check for the sw breakpoint
 + * instruction to return back to host, if not we just generate a
 + * program interrupt to the guest.
 */
case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
 -kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
 -r = RESUME_GUEST;
 +if (vcpu->arch.last_inst == SW_BRK_DBG_INT) {
 Don't access last_inst directly. Instead use the provided helpers.

 Ok. Will look and replace it.

 +run->exit_reason = KVM_EXIT_DEBUG;
 +run->debug.arch.address = vcpu->arch.pc;
 +r = RESUME_HOST;
 +} else {
 +kvmppc_core_queue_program(vcpu, 0x8);
 magic numbers
 ^
 I did not understand this?
 
 You're replacing the readable SRR1_PROGILL with the unreadable 0x8.
 That's bad.
 

Oops. My bad. Will undo that. I guess I messed up when I was rebasing it.

 
 Alex
 
Thanks for review
Regards
Maddy



Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Alexander Graf


On 17.06.14 13:20, Madhavan Srinivasan wrote:

On Tuesday 17 June 2014 03:13 PM, Alexander Graf wrote:

On 17.06.14 11:32, Benjamin Herrenschmidt wrote:

On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:

On 17.06.14 11:22, Benjamin Herrenschmidt wrote:

On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:

Also, why don't we use twi always or something else that actually is
defined as illegal instruction? I would like to see this shared with
book3s_32 PR.

twi will be directed to the guest on HV no ? We want a real illegal
because those go to the host (for potential emulation by the HV).

Ah, good point. I guess we need different one for PR and HV then to
ensure compatibility with older ISAs on PR.

Well, we also need to be careful with what happens if a PR guest puts
that instruction in, do that stop its HV guest/host ?

What if it's done in userspace ? Do that stop the kernel ? :-)

The way SW breakpointing is handled is that when we see one, it gets
deflected into user space. User space then has an array of breakpoints
it configured itself. If the breakpoint is part of that list, it
consumes it. If not, it injects a debug interrupt (program in this case)
into the guest.

That way we can overlay that one instruction with as many layers as we
like :). We only get a performance hit on execution of that instruction.


Maddy, I haven't checked, does your patch ensure that we only ever stop
if the instruction is at a recorded bkpt address ? It still means that a
userspace process can practically DOS its kernel by issuing a lot of
these causing a crapload of exits.

Only user space knows about its breakpoint addresses, so we have to
deflect. However since time still ticks on, we only increase jitter of
the guest. The process would still get scheduled away after the same

  ^^^ Where is this taken care. I am still trying to understand. Kindly
can you explain or point to the code. Will help.


We tell the guest via VPA about its steal time which includes QEMU time.


Alex



RE: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition on vcpu schedule

2014-06-17 Thread mihai.cara...@freescale.com
 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Tuesday, June 17, 2014 12:09 PM
 To: Wood Scott-B07421
 Cc: Caraman Mihai Claudiu-B02008; kvm-ppc@vger.kernel.org;
 k...@vger.kernel.org; linuxppc-...@lists.ozlabs.org
 Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation condition
 on vcpu schedule
 
 
 On 13.06.14 21:42, Scott Wood wrote:
  On Fri, 2014-06-13 at 16:55 +0200, Alexander Graf wrote:
  On 13.06.14 16:43, mihai.cara...@freescale.com wrote:
  -Original Message-
  From: Alexander Graf [mailto:ag...@suse.de]
  Sent: Thursday, June 12, 2014 8:05 PM
  To: Caraman Mihai Claudiu-B02008
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
  d...@lists.ozlabs.org; Wood Scott-B07421
  Subject: Re: [PATCH] KVM: PPC: e500mc: Relax tlb invalidation
 condition
  on vcpu schedule
 
  On 06/12/2014 04:00 PM, Mihai Caraman wrote:
  On vcpu schedule, the condition checked for tlb pollution is too
 tight.
  The tlb entries of one vcpu are polluted when a different vcpu from
 the
  same partition runs in-between. Relax the current tlb invalidation
  condition taking into account the lpid.
  Can you quantify the performance improvement from this?  We've had bugs
  in this area before, so let's make sure it's worth it before making
 this
  more complicated.
 
  Signed-off-by: Mihai Caraman mihai.caraman at freescale.com
  Your mailer is broken? :)
  This really should be an @.
 
  I think this should work. Scott, please ack.
  Alex, you were right. I screwed up the patch description by inverting
 relax
  and tight terms :) It should have been more like this:
 
  KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
 
  On vcpu schedule, the condition checked for tlb pollution is too
 loose.
  The tlb entries of a vcpu are polluted (vs stale) only when a
 different vcpu
  within the same logical partition runs in-between. Optimize the tlb
 invalidation
  condition taking into account the lpid.
  Can't we give every vcpu its own lpid? Or don't we trap on global
  invalidates?
  That would significantly increase the odds of exhausting LPIDs,
  especially on large chips like t4240 with similarly large VMs.  If we
  were to do that, the LPIDs would need to be dynamically assigned (like
  PIDs), and should probably be a separate numberspace per physical core.
 
 True, I didn't realize we only have so few of them. It would however
 save us from most flushing as long as we have spare LPIDs available :).

Yes, we had this proposal on the table for the e6500 multithreaded core. This
core lacks a tlb write conditional instruction, so an OS needs to use locks
to protect itself against concurrent tlb writes executed from sibling threads.
When we expose hw threads as single-threaded vcpus (useful when the user opts
not to pin vcpus), the guest can no longer protect itself optimally
(it can protect tlb writes across all threads but this is not acceptable).
So instead, we found a solution at hypervisor level by assigning different
logical partition ids to the guest's vcpus running simultaneously on sibling hw
threads. Currently in the FSL SDK we allocate two lpids to each guest.

I am also a proponent of using the full LPID space (63 values) per (multi-threaded)
physical core, which will lead to fewer invalidates on vcpu schedule and will
accommodate the solution described above.
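
A rough sketch of that assignment, purely illustrative: vcpu_hw_lpid() is a
made-up name and the base-lpid-plus-thread layout is an assumption, not the
FSL SDK implementation.

#include <linux/kvm_host.h>
#include <asm/cputhreads.h>

/* Hypothetical sketch: give each hw thread of a core its own lpid for the
 * same guest, so sibling threads never share TLB entries.  Assumes the
 * guest owns threads_per_core consecutive lpids starting at kvm->arch.lpid. */
static int vcpu_hw_lpid(struct kvm *kvm, int hw_thread)
{
        return kvm->arch.lpid + (hw_thread % threads_per_core);
}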

-Mike


Re: [PATCH 5/6] KVM: PPC: Book3S HV: Access XICS in BE

2014-06-17 Thread Paul Mackerras
On Tue, Jun 17, 2014 at 12:22:32PM +0200, Alexander Graf wrote:
 
 Eh, no. What we do is we read (good on BE, byte reversed) into r0. Then we
 swab32() from r0 to r3 on LE, mr from r0 to r3 on BE.
 
 r3 gets truncated along the way.
 
 The reason we maintain r0 as wrong-endian is that we write it back using the
 cache inhibited stwcix instruction:
 
 stwcix  r0, r6, r7  /* EOI it */
 
 So during the lifetime of r0 as XIRR it's always byte-reversed on LE. That's
 why we store it using STWX_BE into hstate, because that's the time when we
 actually swab32() it for further interpretation.

So the STWX_BE acts more like a be32_to_cpu than the cpu_to_be32 that the
name STWX_BE would suggest.  Sounds like it at least deserves
a comment, or (as you suggest) rearranging the register usage so a
normal store works.
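
In C terms, what the v1 store ends up doing is roughly this (a sketch; the
helper name is made up):

#include <linux/types.h>
#include <asm/byteorder.h>

/* Sketch: the register holds the raw XIRR as loaded with lwzcix, in
 * big-endian byte order even on an LE host; a byte-reversing store of it
 * therefore leaves the native value in memory, which is be32_to_cpu()
 * semantics. */
static inline u32 saved_xirr_v1_semantics(__be32 xirr_raw)
{
        return be32_to_cpu(xirr_raw);   /* what STWX_BE left in HSTATE_SAVED_XIRR */
}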

Paul.


Re: [PATCH 5/6] KVM: PPC: Book3S HV: Access XICS in BE

2014-06-17 Thread Alexander Graf


On 17.06.14 14:13, Paul Mackerras wrote:

On Tue, Jun 17, 2014 at 12:22:32PM +0200, Alexander Graf wrote:

Eh, no. What we do is we read (good on BE, byte reversed) into r0. Then we
swab32() from r0 to r3 on LE, mr from r0 to r3 on BE.

r3 gets truncated along the way.

The reason we maintain r0 as wrong-endian is that we write it back using the
cache inhibited stwcix instruction:


stwcix  r0, r6, r7  /* EOI it */

So during the lifetime of r0 as XIRR it's always byte-reversed on LE. That's
why we store it using STWX_BE into hstate, because that's the time when we
actually swab32() it for further interpretation.

So the STWX_BE is more like a be32_to_cpu than a cpu_to_be32, which is
what the name STWX_BE would suggest.  Sounds like it at least deserves
a comment, or (as you suggest) rearrange the register usage so a
normal store works.


Yes, I have this now:


From a94a66437ec714ec5650f6d8fec050a33e4477ca Mon Sep 17 00:00:00 2001
From: Alexander Graf ag...@suse.de
Date: Wed, 11 Jun 2014 10:37:52 +0200
Subject: [PATCH] KVM: PPC: Book3S HV: Access XICS in BE

On the exit path from the guest we check what type of interrupt we received
if we received one. This means we're doing hardware access to the XICS
interrupt controller.

However, when running on a little endian system, this access is byte
reversed.

So let's make sure to swizzle the bytes back again and virtually make XICS
accesses big endian.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 - v2:

  - Make code easier to follow

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 1a2f471..9829e18 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -36,6 +36,13 @@
 #define NAPPING_CEDE	1
 #define NAPPING_NOVCPU	2
 
+.macro bswap32 regd, regs
+	srwi	\regd,\regs,24
+	rlwimi	\regd,\regs,24,16,23
+	rlwimi	\regd,\regs,8,8,15
+	rlwimi	\regd,\regs,24,0,7
+.endm
+
 /*
  * Call kvmppc_hv_entry in real mode.
  * Must be called with interrupts hard-disabled.
@@ -2325,7 +2332,12 @@ kvmppc_read_intr:
 	cmpdi	r6, 0
 	beq-	1f
 	lwzcix	r0, r6, r7
-	rlwinm.	r3, r0, 0, 0xff
+#ifdef __LITTLE_ENDIAN__
+	bswap32	r4, r0
+#else
+	mr	r4, r0
+#endif
+	rlwinm.	r3, r4, 0, 0xff
 	sync
 	beq	1f	/* if nothing pending in the ICP */

@@ -2360,7 +2372,7 @@ kvmppc_read_intr:
 42:	/* It's not an IPI and it's for the host, stash it in the PACA
 	 * before exit, it will be picked up by the host ICP driver
 	 */
-	stw	r0, HSTATE_SAVED_XIRR(r13)
+	stw	r4, HSTATE_SAVED_XIRR(r13)
 	li	r3, 1
 	b	1b


Alex



Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Alexander Graf


On 17.06.14 13:13, Madhavan Srinivasan wrote:

On Tuesday 17 June 2014 04:38 PM, Alexander Graf wrote:

On 17.06.14 13:07, Madhavan Srinivasan wrote:

On Tuesday 17 June 2014 02:24 PM, Alexander Graf wrote:

On 14.06.14 23:08, Madhavan Srinivasan wrote:

This patch adds kernel side support for software breakpoint.
Design is that, by using an illegal instruction, we trap to hypervisor
via Emulation Assistance interrupt, where we check for the illegal
instruction
and accordingly we return to Host or Guest. Patch mandates use of
abs instruction
(primary opcode 31 and extended opcode 360) as sw breakpoint
instruction.
Based on PowerISA v2.01, ABS instruction has been dropped from the
architecture
and treated an illegal instruction.

Signed-off-by: Madhavan Srinivasan ma...@linux.vnet.ibm.com
---
arch/powerpc/kvm/book3s.c|  3 ++-
arch/powerpc/kvm/book3s_hv.c | 23 +++
2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index c254c27..b40fe5d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -789,7 +789,8 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu
*vcpu,
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
{
-return -EINVAL;
+vcpu-guest_debug = dbg-control;
+return 0;
}
  void kvmppc_decrementer_func(unsigned long data)
diff --git a/arch/powerpc/kvm/book3s_hv.c
b/arch/powerpc/kvm/book3s_hv.c
index 7a12edb..688421d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -67,6 +67,14 @@
/* Used as a null value for timebase values */
#define TB_NIL(~(u64)0)
+/*
+ * SW_BRK_DBG_INT is debug Instruction for supporting Software
Breakpoint.
+ * Instruction mnemonic is ABS, primary opcode is 31 and extended
opcode is 360.
+ * Based on PowerISA v2.01, ABS instruction has been dropped from the
architecture
+ * and treated an illegal instruction.
+ */
+#define SW_BRK_DBG_INT 0x7c0002d0

The instruction we use to trap needs to get exposed to user space via a
ONE_REG property.


Yes. I got to know about that from Bharat (patchset ppc debug: Add
debug stub support). I will change it.


Also please make sure to pick an instruction that preferably looks 
identical regardless of guest endianness. Segher suggested 0x0000. 
Does that trap properly for you?



Alex



[PATCH] KVM: PPC: Book3S PR: Handle hyp doorbell exits

2014-06-17 Thread Alexander Graf
If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts.
Handle those the same way we treat normal doorbells.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_pr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 8ea7da4..3b82e86 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -988,6 +988,7 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
case BOOK3S_INTERRUPT_DECREMENTER:
case BOOK3S_INTERRUPT_HV_DECREMENTER:
case BOOK3S_INTERRUPT_DOORBELL:
+   case BOOK3S_INTERRUPT_H_DOORBELL:
vcpu->stat.dec_exits++;
r = RESUME_GUEST;
break;
-- 
1.8.1.4



Re: [PATCH 3/4] KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue

2014-06-17 Thread Alexander Graf


On 12.06.14 10:16, Anton Blanchard wrote:

To establish addressability quickly, ABIv2 requires the target
address of the function being called to be in r12.

Signed-off-by: Anton Blanchard an...@samba.org


Thanks, applied to kvm-ppc-queue.


Alex


---

Index: b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
===
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1920,8 +1920,8 @@ hcall_try_real_mode:
lwax	r3,r3,r4
cmpwi   r3,0
beq guest_exit_cont
-   add r3,r3,r4
-   mtctr   r3
+   add r12,r3,r4
+   mtctr   r12
mr  r3,r9   /* get vcpu pointer */
ld  r4,VCPU_GPR(R4)(r9)
bctrl




Re: [PATCH 4/4] KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()

2014-06-17 Thread Alexander Graf


On 12.06.14 10:16, Anton Blanchard wrote:

Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are
assembly functions that are exported to modules and also require
a valid r2.

As such we need to use _GLOBAL_TOC so we provide a global entry
point that establishes the TOC (r2).

Signed-off-by: Anton Blanchard an...@samba.org


Thanks, applied to kvm-ppc-queue.

I've not applied patches 1 and 2 for now, as they break BE module support.


Alex



[PATCH] KVM: PPC: Book3S PR: Fix ABIv2 on LE

2014-06-17 Thread Alexander Graf
We switched to ABIv2 on Little Endian systems now which gets rid of the
dotted function names. Branch to the actual functions when we see such
a system.

Signed-off-by: Alexander Graf ag...@suse.de

diff --git a/arch/powerpc/kvm/book3s_interrupts.S b/arch/powerpc/kvm/book3s_interrupts.S
index e2c29e3..d044b8b 100644
--- a/arch/powerpc/kvm/book3s_interrupts.S
+++ b/arch/powerpc/kvm/book3s_interrupts.S
@@ -25,7 +25,11 @@
 #include <asm/exception-64s.h>
 
 #if defined(CONFIG_PPC_BOOK3S_64)
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+#define FUNC(name) name
+#else
 #define FUNC(name) GLUE(.,name)
+#endif
 #define GET_SHADOW_VCPU(reg)	addi	reg, r13, PACA_SVCPU
 
 #elif defined(CONFIG_PPC_BOOK3S_32)
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
index 9eec675..cdcbb63 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -36,7 +36,11 @@
 
 #if defined(CONFIG_PPC_BOOK3S_64)
 
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+#define FUNC(name) name
+#else
 #define FUNC(name) GLUE(.,name)
+#endif
 
 #elif defined(CONFIG_PPC_BOOK3S_32)
 


[PATCH] KVM: PPC: Book3S PR: Fix sparse endian checks

2014-06-17 Thread Alexander Graf
While running sparse with endian checks over the code base, it triggered at
some places that were missing casts or had wrong types. Fix them up.

Signed-off-by: Alexander Graf ag...@suse.de

diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c
index 52a63bf..f7c25c6 100644
--- a/arch/powerpc/kvm/book3s_pr_papr.c
+++ b/arch/powerpc/kvm/book3s_pr_papr.c
@@ -40,8 +40,9 @@ static int kvmppc_h_pr_enter(struct kvm_vcpu *vcpu)
 {
long flags = kvmppc_get_gpr(vcpu, 4);
long pte_index = kvmppc_get_gpr(vcpu, 5);
-   unsigned long pteg[2 * 8];
-   unsigned long pteg_addr, i, *hpte;
+   __be64 pteg[2 * 8];
+   __be64 *hpte;
+   unsigned long pteg_addr, i;
long int ret;
 
i = pte_index & 7;
@@ -93,8 +94,8 @@ static int kvmppc_h_pr_remove(struct kvm_vcpu *vcpu)
pteg = get_pteg_addr(vcpu, pte_index);
mutex_lock(&vcpu->kvm->arch.hpt_mutex);
copy_from_user(pte, (void __user *)pteg, sizeof(pte));
-   pte[0] = be64_to_cpu(pte[0]);
-   pte[1] = be64_to_cpu(pte[1]);
+   pte[0] = be64_to_cpu((__force __be64)pte[0]);
+   pte[1] = be64_to_cpu((__force __be64)pte[1]);
 
ret = H_NOT_FOUND;
if ((pte[0] & HPTE_V_VALID) == 0 ||
@@ -171,8 +172,8 @@ static int kvmppc_h_pr_bulk_remove(struct kvm_vcpu *vcpu)
 
pteg = get_pteg_addr(vcpu, tsh & H_BULK_REMOVE_PTEX);
copy_from_user(pte, (void __user *)pteg, sizeof(pte));
-   pte[0] = be64_to_cpu(pte[0]);
-   pte[1] = be64_to_cpu(pte[1]);
+   pte[0] = be64_to_cpu((__force __be64)pte[0]);
+   pte[1] = be64_to_cpu((__force __be64)pte[1]);
 
/* tsl = AVPN */
flags = (tsh & H_BULK_REMOVE_FLAGS) >> 26;
@@ -211,8 +212,8 @@ static int kvmppc_h_pr_protect(struct kvm_vcpu *vcpu)
pteg = get_pteg_addr(vcpu, pte_index);
mutex_lock(&vcpu->kvm->arch.hpt_mutex);
copy_from_user(pte, (void __user *)pteg, sizeof(pte));
-   pte[0] = be64_to_cpu(pte[0]);
-   pte[1] = be64_to_cpu(pte[1]);
+   pte[0] = be64_to_cpu((__force __be64)pte[0]);
+   pte[1] = be64_to_cpu((__force __be64)pte[1]);
 
ret = H_NOT_FOUND;
if ((pte[0] & HPTE_V_VALID) == 0 ||
@@ -231,8 +232,8 @@ static int kvmppc_h_pr_protect(struct kvm_vcpu *vcpu)
 
rb = compute_tlbie_rb(v, r, pte_index);
vcpu->arch.mmu.tlbie(vcpu, rb, rb & 1 ? true : false);
-   pte[0] = cpu_to_be64(pte[0]);
-   pte[1] = cpu_to_be64(pte[1]);
+   pte[0] = (__force u64)cpu_to_be64(pte[0]);
+   pte[1] = (__force u64)cpu_to_be64(pte[1]);
copy_to_user((void __user *)pteg, pte, sizeof(pte));
ret = H_SUCCESS;
 


[PATCH 0/7] KVM: PPC: Book3S HV: Enable on little endian hosts

2014-06-17 Thread Alexander Graf
So far we've been able to successfully run HV KVM on big endian hosts, but
once you dive into little endian land things start to fall apart.

This patch set enables HV KVM for little endian hosts. This should be the
final piece left missing to get little endian systems fully on par with big
endian ones in the KVM world - modulo bugs.

For now guest threading support is still slightly flaky, but I'm sure that's
only a minor breakage somewhere that we'll find soon.

v1 - v2:

  - fix typo in STWX_BE
  - Add __be hints
  - Fix H_REMOVE
  - Fix dtl_idx
  - Make XICS code easier to follow and use memory for bswap

Alex

Alexander Graf (7):
  PPC: Add asm helpers for BE 32bit load/store
  KVM: PPC: Book3S HV: Make HTAB code LE host aware
  KVM: PPC: Book3S HV: Access guest VPA in BE
  KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE
  KVM: PPC: Book3S HV: Access XICS in BE
  KVM: PPC: Book3S HV: Fix ABIv2 on LE
  KVM: PPC: Book3S HV: Enable for little endian hosts

 arch/powerpc/include/asm/asm-compat.h|   4 +
 arch/powerpc/include/asm/kvm_book3s.h|   4 +-
 arch/powerpc/include/asm/kvm_book3s_64.h |  15 +++-
 arch/powerpc/kvm/Kconfig |   1 -
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 128 ++-
 arch/powerpc/kvm/book3s_hv.c |  22 ++---
 arch/powerpc/kvm/book3s_hv_ras.c |   6 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  | 146 ++-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |  60 -
 9 files changed, 220 insertions(+), 166 deletions(-)

-- 
1.8.1.4



[PATCH 6/7] KVM: PPC: Book3S HV: Fix ABIv2 on LE

2014-06-17 Thread Alexander Graf
We use ABIv2 on Little Endian systems which gets rid of the dotted function
names. Branch to the actual functions when we see such a system.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 1a71f60..1ff3ebd 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -36,6 +36,12 @@
 #define NAPPING_CEDE   1
 #define NAPPING_NOVCPU 2
 
+#if defined(_CALL_ELF) && _CALL_ELF == 2
+#define FUNC(name) name
+#else
+#define FUNC(name) GLUE(.,name)
+#endif
+
 /*
  * Call kvmppc_hv_entry in real mode.
  * Must be called with interrupts hard-disabled.
@@ -668,9 +674,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 
mr  r31, r4
addir3, r31, VCPU_FPRS_TM
-   bl  .load_fp_state
+   bl  FUNC(load_fp_state)
addir3, r31, VCPU_VRS_TM
-   bl  .load_vr_state
+   bl  FUNC(load_vr_state)
mr  r4, r31
lwz r7, VCPU_VRSAVE_TM(r4)
mtspr   SPRN_VRSAVE, r7
@@ -1414,9 +1420,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 
/* Save FP/VSX. */
addir3, r9, VCPU_FPRS_TM
-   bl  .store_fp_state
+   bl  FUNC(store_fp_state)
addir3, r9, VCPU_VRS_TM
-   bl  .store_vr_state
+   bl  FUNC(store_vr_state)
mfspr   r6, SPRN_VRSAVE
stw r6, VCPU_VRSAVE_TM(r9)
 1:
@@ -2405,11 +2411,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
mtmsrd  r8
isync
addir3,r3,VCPU_FPRS
-   bl  .store_fp_state
+   bl  FUNC(store_fp_state)
 #ifdef CONFIG_ALTIVEC
 BEGIN_FTR_SECTION
addir3,r31,VCPU_VRS
-   bl  .store_vr_state
+   bl  FUNC(store_vr_state)
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif
mfspr   r6,SPRN_VRSAVE
@@ -2441,11 +2447,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
mtmsrd  r8
isync
addir3,r4,VCPU_FPRS
-   bl  .load_fp_state
+   bl  FUNC(load_fp_state)
 #ifdef CONFIG_ALTIVEC
 BEGIN_FTR_SECTION
addir3,r31,VCPU_VRS
-   bl  .load_vr_state
+   bl  FUNC(load_vr_state)
 END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 #endif
lwz r7,VCPU_VRSAVE(r31)
-- 
1.8.1.4



[PATCH 4/7] KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE

2014-06-17 Thread Alexander Graf
Some data structures are always stored in big endian. Among those are the LPPACA
fields as well as the shadow slb. These structures might be shared with a
hypervisor.

So whenever we access those fields, make sure we do so in big endian byte order.
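
For reference, the C side does the equivalent through the sparse-annotated
types, roughly like this (a sketch, not part of this patch; bump_yield_count()
is a made-up name):

#include <linux/types.h>
#include <asm/byteorder.h>
#include <asm/lppaca.h>

/* Sketch of the equivalent C-side access: the lppaca yield count is __be32,
 * so it gets swapped explicitly regardless of host endianness. */
static void bump_yield_count(struct lppaca *vpa)
{
        u32 yield = be32_to_cpu(vpa->yield_count);

        vpa->yield_count = cpu_to_be32(yield + 1);
}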

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 7e7fa01..75c7e22 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -32,10 +32,6 @@
 
 #define VCPU_GPRS_TM(reg) (((reg) * ULONG_SIZE) + VCPU_GPR_TM)
 
-#ifdef __LITTLE_ENDIAN__
-#error Need to fix lppaca and SLB shadow accesses in little endian mode
-#endif
-
 /* Values in HSTATE_NAPPING(r13) */
 #define NAPPING_CEDE   1
 #define NAPPING_NOVCPU 2
@@ -595,9 +591,10 @@ kvmppc_got_guest:
ld  r3, VCPU_VPA(r4)
cmpdi   r3, 0
beq 25f
-   lwz r5, LPPACA_YIELDCOUNT(r3)
+   li  r6, LPPACA_YIELDCOUNT
+   LWZX_BE r5, r3, r6
addir5, r5, 1
-   stw r5, LPPACA_YIELDCOUNT(r3)
+   STWX_BE r5, r3, r6
li  r6, 1
stb r6, VCPU_VPA_DIRTY(r4)
 25:
@@ -1442,9 +1439,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
ld  r8, VCPU_VPA(r9)/* do they have a VPA? */
cmpdi   r8, 0
beq 25f
-   lwz r3, LPPACA_YIELDCOUNT(r8)
+   li  r4, LPPACA_YIELDCOUNT
+   LWZX_BE r3, r8, r4
addir3, r3, 1
-   stw r3, LPPACA_YIELDCOUNT(r8)
+   STWX_BE r3, r8, r4
li  r3, 1
stb r3, VCPU_VPA_DIRTY(r9)
 25:
@@ -1757,8 +1755,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 33:ld  r8,PACA_SLBSHADOWPTR(r13)
 
.rept   SLB_NUM_BOLTED
-   ld  r5,SLBSHADOW_SAVEAREA(r8)
-   ld  r6,SLBSHADOW_SAVEAREA+8(r8)
+   li  r3, SLBSHADOW_SAVEAREA
+   LDX_BE  r5, r8, r3
+   addir3, r3, 8
+   LDX_BE  r6, r8, r3
andis.  r7,r5,SLB_ESID_V@h
beq 1f
slbmte  r6,r5
-- 
1.8.1.4



[PATCH 1/7] PPC: Add asm helpers for BE 32bit load/store

2014-06-17 Thread Alexander Graf
From assembly code we might not only have to explicitly access 64bit values in BE,
but sometimes also 32bit ones. Add helpers that allow for easy use of lwzx/stwx
in their respective byte-reversing or native form.
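
In C terms the new helpers behave roughly like this (a sketch; lwzx_be() and
stwx_be() are illustrative names, the real helpers are assembler macros):

#include <linux/types.h>
#include <asm/byteorder.h>

/* C model of the new 32-bit helpers: on a BE host they expand to plain
 * lwzx/stwx, on an LE host to the byte-reversing lwbrx/stwbrx, so either
 * way the access sees big-endian data in memory. */
static inline u32 lwzx_be(const __be32 *p)
{
        return be32_to_cpu(*p);
}

static inline void stwx_be(__be32 *p, u32 val)
{
        *p = cpu_to_be32(val);
}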

Signed-off-by: Alexander Graf ag...@suse.de
CC: Benjamin Herrenschmidt b...@kernel.crashing.org

---

v1 - v2:

  - fix typo in STWX_BE
---
 arch/powerpc/include/asm/asm-compat.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h
index 4b237aa..21be8ae 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -34,10 +34,14 @@
 #define PPC_MIN_STKFRM 112
 
 #ifdef __BIG_ENDIAN__
+#define LWZX_BEstringify_in_c(lwzx)
 #define LDX_BE stringify_in_c(ldx)
+#define STWX_BEstringify_in_c(stwx)
 #define STDX_BEstringify_in_c(stdx)
 #else
+#define LWZX_BEstringify_in_c(lwbrx)
 #define LDX_BE stringify_in_c(ldbrx)
+#define STWX_BEstringify_in_c(stwbrx)
 #define STDX_BEstringify_in_c(stdbrx)
 #endif
 
-- 
1.8.1.4

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] KVM: PPC: Book3S HV: Make HTAB code LE host aware

2014-06-17 Thread Alexander Graf
When running on an LE host all data structures are kept in little endian
byte order. However, the HTAB still needs to be maintained in big endian.

So every time we access any HTAB we need to make sure we do so in the right
byte order. Fix up all accesses to manually byte swap.
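
As a rough sketch of the resulting access pattern (simplified from the
hunks below, not a verbatim excerpt):

	__be64 *hptep = (__be64 *)(kvm->arch.hpt_virt + (index << 4));
	unsigned long v = be64_to_cpu(hptep[0]) & ~HPTE_V_HVLOCK;  /* HTAB -> host */
	/* ... work on v in host byte order ... */
	hptep[0] = cpu_to_be64(v);                                 /* host -> HTAB */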

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - Add __be32 hints
  - Fix H_REMOVE
---
 arch/powerpc/include/asm/kvm_book3s.h|   4 +-
 arch/powerpc/include/asm/kvm_book3s_64.h |  15 +++-
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 128 ++-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  | 146 ++-
 4 files changed, 164 insertions(+), 129 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index a20cc0b..028f4b1 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -161,9 +161,9 @@ extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t 
gfn, bool writing,
bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
unsigned long *rmap, long pte_index, int realmode);
-extern void kvmppc_invalidate_hpte(struct kvm *kvm, unsigned long *hptep,
+extern void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
unsigned long pte_index);
-void kvmppc_clear_ref_hpte(struct kvm *kvm, unsigned long *hptep,
+void kvmppc_clear_ref_hpte(struct kvm *kvm, __be64 *hptep,
unsigned long pte_index);
 extern void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long addr,
unsigned long *nb_ret);
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index c7871f3..e504f88 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -59,20 +59,29 @@ extern unsigned long kvm_rma_pages;
 /* These bits are reserved in the guest view of the HPTE */
 #define HPTE_GR_RESERVED   HPTE_GR_MODIFIED
 
-static inline long try_lock_hpte(unsigned long *hpte, unsigned long bits)
+static inline long try_lock_hpte(__be64 *hpte, unsigned long bits)
 {
unsigned long tmp, old;
+   __be64 be_lockbit, be_bits;
+
+   /*
+* We load/store in native endian, but the HTAB is in big endian. If
+* we byte swap all data we apply on the PTE we're implicitly correct
+* again.
+*/
+   be_lockbit = cpu_to_be64(HPTE_V_HVLOCK);
+   be_bits = cpu_to_be64(bits);
 
	asm volatile("	ldarx	%0,0,%2\n"
		     "	and.	%1,%0,%3\n"
		     "	bne	2f\n"
-		     "	ori	%0,%0,%4\n"
+		     "	or	%0,%0,%4\n"
		     "	stdcx.	%0,0,%2\n"
		     "	beq+	2f\n"
		     "	mr	%1,%3\n"
		     "2:	isync"
		     : "=&r" (tmp), "=&r" (old)
-		     : "r" (hpte), "r" (bits), "i" (HPTE_V_HVLOCK)
+		     : "r" (hpte), "r" (be_bits), "r" (be_lockbit)
		     : "cc", "memory");
return old == 0;
 }
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 8056107..2d154d9 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -450,7 +450,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
unsigned long slb_v;
unsigned long pp, key;
unsigned long v, gr;
-   unsigned long *hptep;
+   __be64 *hptep;
int index;
	int virtmode = vcpu->arch.shregs.msr & (data ? MSR_DR : MSR_IR);
 
@@ -473,13 +473,13 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
preempt_enable();
return -ENOENT;
}
-	hptep = (unsigned long *)(kvm->arch.hpt_virt + (index << 4));
-	v = hptep[0] & ~HPTE_V_HVLOCK;
+	hptep = (__be64 *)(kvm->arch.hpt_virt + (index << 4));
+	v = be64_to_cpu(hptep[0]) & ~HPTE_V_HVLOCK;
	gr = kvm->arch.revmap[index].guest_rpte;
 
	/* Unlock the HPTE */
	asm volatile("lwsync" : : : "memory");
-	hptep[0] = v;
+	hptep[0] = cpu_to_be64(v);
	preempt_enable();
 
	gpte->eaddr = eaddr;
@@ -583,7 +583,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
unsigned long ea, unsigned long dsisr)
 {
struct kvm *kvm = vcpu-kvm;
-   unsigned long *hptep, hpte[3], r;
+   unsigned long hpte[3], r;
+   __be64 *hptep;
unsigned long mmu_seq, psize, pte_size;
unsigned long gpa_base, gfn_base;
unsigned long gpa, gfn, hva, pfn;
@@ -606,16 +607,16 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, 
struct kvm_vcpu *vcpu,
	if (ea != vcpu->arch.pgfault_addr)
		return RESUME_GUEST;
	index = vcpu->arch.pgfault_index;
-   

[PATCH 7/7] KVM: PPC: Book3S HV: Enable for little endian hosts

2014-06-17 Thread Alexander Graf
Now that we've fixed all the issues that HV KVM code had on little endian
hosts, we can enable it in the kernel configuration for users to play with.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index d6a53b9..8aeeda1 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -75,7 +75,6 @@ config KVM_BOOK3S_64
 config KVM_BOOK3S_64_HV
tristate KVM support for POWER7 and PPC970 using hypervisor mode in 
host
depends on KVM_BOOK3S_64
-   depends on !CPU_LITTLE_ENDIAN
select KVM_BOOK3S_HV_POSSIBLE
select MMU_NOTIFIER
select CMA
-- 
1.8.1.4

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/7] KVM: PPC: Book3S HV: Access XICS in BE

2014-06-17 Thread Alexander Graf
On the exit path from the guest we check what type of interrupt we received
if we received one. This means we're doing hardware access to the XICS interrupt
controller.

However, when running on a little endian system, this access is byte reversed.

So let's make sure to swizzle the bytes back again and virtually make XICS
accesses big endian.
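
As a worked example of the swizzle on an LE host (the XIRR value
0x12345678 is invented): the ICP presents the register big endian, so
lwzcix returns it byte reversed as 0x78563412; STWX_BE (stwbrx on LE)
stores it byte reversed again, undoing the swap, so the saved copy reads
back as 0x12345678 with the plain lwz that follows. On BE hosts both
steps are ordinary loads and stores.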

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - Make code easier to follow and use hardware for bswap
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 75c7e22..1a71f60 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2325,7 +2325,18 @@ kvmppc_read_intr:
cmpdi   r6, 0
	beq-	1f
lwzcix  r0, r6, r7
-   rlwinm. r3, r0, 0, 0xff
+   /*
+* Save XIRR for later. Since we get it in reverse endian on LE
+* systems, save it byte reversed and fetch it back in host endian.
+*/
+   li  r3, HSTATE_SAVED_XIRR
+   STWX_BE r0, r3, r13
+#ifdef __LITTLE_ENDIAN__
+   lwz r3, HSTATE_SAVED_XIRR(r13)
+#else
+   mr  r3, r0
+#endif
+   rlwinm. r3, r3, 0, 0xff
sync
beq 1f  /* if nothing pending in the ICP */
 
@@ -2357,10 +2368,9 @@ kvmppc_read_intr:
li  r3, -1
 1: blr
 
-42:/* It's not an IPI and it's for the host, stash it in the PACA
-* before exit, it will be picked up by the host ICP driver
+42:/* It's not an IPI and it's for the host. We saved a copy of XIRR in
+* the PACA earlier, it will be picked up by the host ICP driver
 */
-   stw r0, HSTATE_SAVED_XIRR(r13)
li  r3, 1
b   1b
 
-- 
1.8.1.4

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread mihai.cara...@freescale.com
 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, June 17, 2014 6:36 PM
 To: Caraman Mihai Claudiu-B02008
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
 d...@lists.ozlabs.org
 Subject: Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation
 condition on vcpu schedule
 
 On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote:
  On vcpu schedule, the condition checked for tlb pollution is too loose.
  The tlb entries of a vcpu become polluted (vs stale) only when a
 different
  vcpu within the same logical partition runs in-between. Optimize the
 tlb
  invalidation condition taking into account the logical partition id.
 
  With the new invalidation condition, a guest shows 4% performance
 improvement
  on P5020DS while running a memory stress application with the cpu
 oversubscribed,
  the other guest running a cpu intensive workload.
 
 See
 https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html

Thanks. The original code needs just a simple adjustment to benefit from
this optimization, please review v3.

- Mike


Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread Scott Wood
On Tue, 2014-06-17 at 14:04 -0500, Caraman Mihai Claudiu-B02008 wrote:
  -Original Message-
  From: Wood Scott-B07421
  Sent: Tuesday, June 17, 2014 6:36 PM
  To: Caraman Mihai Claudiu-B02008
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
  d...@lists.ozlabs.org
  Subject: Re: [PATCH v2] KVM: PPC: e500mc: Enhance tlb invalidation
  condition on vcpu schedule
  
  On Tue, 2014-06-17 at 12:02 +0300, Mihai Caraman wrote:
   On vcpu schedule, the condition checked for tlb pollution is too loose.
   The tlb entries of a vcpu become polluted (vs stale) only when a
  different
   vcpu within the same logical partition runs in-between. Optimize the
  tlb
   invalidation condition taking into account the logical partition id.
  
   With the new invalidation condition, a guest shows 4% performance
  improvement
   on P5020DS while running a memory stress application with the cpu
  oversubscribed,
   the other guest running a cpu intensive workload.
  
  See
  https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118547.html
 
 Thanks. The original code needs just a simple adjustment to benefit from
 this optimization, please review v3.

Where is v3?  Or is it forthcoming?

-Scott


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread Mihai Caraman
On vcpu schedule, the condition checked for tlb pollution is too loose.
The tlb entries of a vcpu become polluted (vs stale) only when a different
vcpu within the same logical partition runs in-between. Optimize the tlb
invalidation condition keeping last_vcpu_on_cpu per logical partition id.

With the new invalidation condition, a guest shows 4% performance improvement
on P5020DS while running a memory stress application with the cpu 
oversubscribed,
the other guest running a cpu intensive workload.

Guest - old invalidation condition
  real 3.89
  user 3.87
  sys 0.01

Guest - enhanced invalidation condition
  real 3.75
  user 3.73
  sys 0.01

Host
  real 3.70
  user 1.85
  sys 0.00

The memory stress application accesses 4KB pages backed by 75% of available
TLB0 entries:

char foo[ENTRIES][4096] __attribute__ ((aligned (4096)));

int main()
{
char bar;
int i, j;

for (i = 0; i < ITERATIONS; i++)
        for (j = 0; j < ENTRIES; j++)
bar = foo[j][0];

return 0;
}

Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
Cc: Scott Wood scottw...@freescale.com
---
v3:
 - use existing logic while keeping last_vcpu_per_cpu per lpid
 
v2:
 - improve patch name and description
 - add performance results


 arch/powerpc/kvm/e500mc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 17e4562..95e33e3 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -110,7 +110,7 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 
old_msr)
 {
 }
 
-static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu);
 
 static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu *vcpu, int cpu)
 {
@@ -141,9 +141,9 @@ static void kvmppc_core_vcpu_load_e500mc(struct kvm_vcpu 
*vcpu, int cpu)
	mtspr(SPRN_GESR, vcpu->arch.shared->esr);
 
	if (vcpu->arch.oldpir != mfspr(SPRN_PIR) ||
-	    __get_cpu_var(last_vcpu_on_cpu) != vcpu) {
+	    __get_cpu_var(last_vcpu_on_cpu)[vcpu->kvm->arch.lpid] != vcpu) {
		kvmppc_e500_tlbil_all(vcpu_e500);
-		__get_cpu_var(last_vcpu_on_cpu) = vcpu;
+		__get_cpu_var(last_vcpu_on_cpu)[vcpu->kvm->arch.lpid] = vcpu;
}
 
kvmppc_load_guest_fp(vcpu);
-- 
1.7.11.7

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread Scott Wood
On Tue, 2014-06-17 at 22:09 +0300, Mihai Caraman wrote:
 On vcpu schedule, the condition checked for tlb pollution is too loose.
 The tlb entries of a vcpu become polluted (vs stale) only when a different
 vcpu within the same logical partition runs in-between. Optimize the tlb
 invalidation condition keeping last_vcpu_on_cpu per logical partition id.
 
 With the new invalidation condition, a guest shows 4% performance improvement
 on P5020DS while running a memory stress application with the cpu 
 oversubscribed,
 the other guest running a cpu intensive workload.
 
 Guest - old invalidation condition
   real 3.89
   user 3.87
   sys 0.01
 
 Guest - enhanced invalidation condition
   real 3.75
   user 3.73
   sys 0.01
 
 Host
   real 3.70
   user 1.85
   sys 0.00
 
 The memory stress application accesses 4KB pages backed by 75% of available
 TLB0 entries:
 
 char foo[ENTRIES][4096] __attribute__ ((aligned (4096)));
 
 int main()
 {
   char bar;
   int i, j;
 
   for (i = 0; i < ITERATIONS; i++)
           for (j = 0; j < ENTRIES; j++)
   bar = foo[j][0];
 
   return 0;
 }
 
 Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
 Cc: Scott Wood scottw...@freescale.com
 ---
 v3:
  - use existing logic while keeping last_vcpu_per_cpu per lpid
  
 v2:
  - improve patch name and description
  - add performance results
 
 
  arch/powerpc/kvm/e500mc.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
 index 17e4562..95e33e3 100644
 --- a/arch/powerpc/kvm/e500mc.c
 +++ b/arch/powerpc/kvm/e500mc.c
 @@ -110,7 +110,7 @@ void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 
 old_msr)
  {
  }
  
 -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
 +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS], last_vcpu_on_cpu);

Hmm, I didn't know you could express types like that.  Is this special
syntax that only works for typeof?

No space after *

Name should be adjusted to match, something like last_vcpu_of_lpid (with
the _on_cpu being implied by the fact that it's PER_CPU).

-Scott


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread mihai.cara...@freescale.com
  -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
  +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS],
 last_vcpu_on_cpu);
 
 Hmm, I didn't know you could express types like that.  Is this special
 syntax that only works for typeof?

Yes, AFAIK.
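
Roughly (simplified; the real macro also adds per-cpu section
attributes), DEFINE_PER_CPU(type, name) boils down to
"__typeof__(type) name", and __typeof__ accepts an array-of-pointer
type written this way:

	__typeof__(struct kvm_vcpu *[KVMPPC_NR_LPIDS]) last_vcpu_of_lpid;
	/* i.e. the same object as */
	struct kvm_vcpu *last_vcpu_of_lpid[KVMPPC_NR_LPIDS];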

 No space after *

Checkpatch complains about the missing space ;)

 
 Name should be adjusted to match, something like last_vcpu_of_lpid (with
 the _on_cpu being implied by the fact that it's PER_CPU).

I was thinking of the longer name but it was not appealing; I will change it to
last_vcpu_of_lpid.

-Mike


Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread Scott Wood
On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
   -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
   +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS],
  last_vcpu_on_cpu);
  
  Hmm, I didn't know you could express types like that.  Is this special
  syntax that only works for typeof?
 
 Yes, AFAIK.
 
  No space after *
 
 Checkpatch complains about the missing space ;)

Checkpatch is wrong, which isn't surprising given that this is unusual
syntax.  We don't normally put a space after * when used to represent a
pointer.

-Scott


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread mihai.cara...@freescale.com
 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, June 17, 2014 10:48 PM
 To: Caraman Mihai Claudiu-B02008
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
 d...@lists.ozlabs.org
 Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation
 condition on vcpu schedule
 
 On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
-static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS],
   last_vcpu_on_cpu);
  
   Hmm, I didn't know you could express types like that.  Is this
 special
   syntax that only works for typeof?
 
  Yes, AFAIK.
 
   No space after *
 
  Checkpatch complains about the missing space ;)
 
 Checkpatch is wrong, which isn't surprising given that this is unusual
 syntax.  We don't normally put a space after * when used to represent a
 pointer.

This is not something new. See [PATCH 04/10] percpu: cleanup percpu array
definitions:

https://lkml.org/lkml/2009/6/24/26

-Mike

Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread Scott Wood
On Tue, 2014-06-17 at 15:02 -0500, Caraman Mihai Claudiu-B02008 wrote:
  -Original Message-
  From: Wood Scott-B07421
  Sent: Tuesday, June 17, 2014 10:48 PM
  To: Caraman Mihai Claudiu-B02008
  Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
  d...@lists.ozlabs.org
  Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation
  condition on vcpu schedule
  
  On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008 wrote:
 -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
 +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS],
last_vcpu_on_cpu);
   
Hmm, I didn't know you could express types like that.  Is this
  special
syntax that only works for typeof?
  
   Yes, AFAIK.
  
No space after *
  
   Checkpatch complains about the missing space ;)
  
  Checkpatch is wrong, which isn't surprising given that this is unusual
  syntax.  We don't normally put a space after * when used to represent a
  pointer.
 
 This is not something new. See [PATCH 04/10] percpu: cleanup percpu array
 definitions:
 
   https://lkml.org/lkml/2009/6/24/26

I didn't say it was new, just unusual, and checkpatch doesn't recognize
it.  Checkpatch shouldn't be blindly and silently obeyed when it says
something strange.

-Scott


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread mihai.cara...@freescale.com
 -Original Message-
 From: Wood Scott-B07421
 Sent: Tuesday, June 17, 2014 11:05 PM
 To: Caraman Mihai Claudiu-B02008
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
 d...@lists.ozlabs.org
 Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation
 condition on vcpu schedule
 
 On Tue, 2014-06-17 at 15:02 -0500, Caraman Mihai Claudiu-B02008 wrote:
   -Original Message-
   From: Wood Scott-B07421
   Sent: Tuesday, June 17, 2014 10:48 PM
   To: Caraman Mihai Claudiu-B02008
   Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
   d...@lists.ozlabs.org
   Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation
   condition on vcpu schedule
  
   On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008
 wrote:
  -static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
  +static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS],
 last_vcpu_on_cpu);

 Hmm, I didn't know you could express types like that.  Is this
   special
 syntax that only works for typeof?
   
Yes, AFAIK.
   
 No space after *
   
Checkpatch complains about the missing space ;)
  
   Checkpatch is wrong, which isn't surprising given that this is
 unusual
   syntax.  We don't normally put a space after * when used to represent
 a
   pointer.
 
  This is not something new. See [PATCH 04/10] percpu: cleanup percpu
 array
  definitions:
 
  https://lkml.org/lkml/2009/6/24/26
 
 I didn't say it was new, just unusual, and checkpatch doesn't recognize
 it.  Checkpatch shouldn't be blindly and silently obeyed when it says
 something strange.

I agree with you about the syntax and I know other cases where checkpatch
is a moron. For similar corner cases the checkpatch maintainers did not want
(or found it difficult) to make an exception. I would also like to see Alex's
opinion on this.

-Mike




Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule

2014-06-17 Thread Alexander Graf


On 17.06.14 22:36, mihai.cara...@freescale.com wrote:

-Original Message-
From: Wood Scott-B07421
Sent: Tuesday, June 17, 2014 11:05 PM
To: Caraman Mihai Claudiu-B02008
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
d...@lists.ozlabs.org
Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation
condition on vcpu schedule

On Tue, 2014-06-17 at 15:02 -0500, Caraman Mihai Claudiu-B02008 wrote:

-Original Message-
From: Wood Scott-B07421
Sent: Tuesday, June 17, 2014 10:48 PM
To: Caraman Mihai Claudiu-B02008
Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; linuxppc-
d...@lists.ozlabs.org
Subject: Re: [PATCH v3] KVM: PPC: e500mc: Enhance tlb invalidation
condition on vcpu schedule

On Tue, 2014-06-17 at 14:42 -0500, Caraman Mihai Claudiu-B02008

wrote:

-static DEFINE_PER_CPU(struct kvm_vcpu *, last_vcpu_on_cpu);
+static DEFINE_PER_CPU(struct kvm_vcpu * [KVMPPC_NR_LPIDS],

last_vcpu_on_cpu);

Hmm, I didn't know you could express types like that.  Is this

special

syntax that only works for typeof?

Yes, AFAIK.


No space after *

Checkpatch complains about the missing space ;)

Checkpatch is wrong, which isn't surprising given that this is

unusual

syntax.  We don't normally put a space after * when used to represent

a

pointer.

This is not something new. See [PATCH 04/10] percpu: cleanup percpu

array

definitions:

https://lkml.org/lkml/2009/6/24/26

I didn't say it was new, just unusual, and checkpatch doesn't recognize
it.  Checkpatch shouldn't be blindly and silently obeyed when it says
something strange.

I agree with you about the syntax and I know other cases where checkpatch
is a moron. For similar corner cases the checkpatch maintainers did not want
(or found it difficult) to make an exception. I would also like to see Alex's
opinion on this.


I usually like to apply common sense :).


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html