Re: [PATCH] KVM: PPC: BookE: HV: Fix compile

2012-07-25 Thread Michael Neuling
Alexander Graf  wrote:

> After merging the register type check patches from Ben's tree, the
> hv enabled booke implementation ceased to compile.
> 
> This patch fixes things up so everyone's happy again.

Is there a defconfig which catches this?

Mikey

> 
> Signed-off-by: Alexander Graf 
> ---
>  arch/powerpc/kvm/bookehv_interrupts.S |   77 +++++++++++++++++++++++++++++++++++++++--------------------------------------
>  1 files changed, 39 insertions(+), 38 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
> index d28c2d4..099fe82 100644
> --- a/arch/powerpc/kvm/bookehv_interrupts.S
> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
> @@ -50,8 +50,9 @@
>  #define HOST_R2 (3 * LONGBYTES)
>  #define HOST_CR (4 * LONGBYTES)
>  #define HOST_NV_GPRS    (5 * LONGBYTES)
> -#define HOST_NV_GPR(n)  (HOST_NV_GPRS + ((n - 14) * LONGBYTES))
> -#define HOST_MIN_STACK_SIZE (HOST_NV_GPR(31) + LONGBYTES)
> +#define __HOST_NV_GPR(n)  (HOST_NV_GPRS + ((n - 14) * LONGBYTES))
> +#define HOST_NV_GPR(n)  __HOST_NV_GPR(__REG_##n)
> +#define HOST_MIN_STACK_SIZE (HOST_NV_GPR(R31) + LONGBYTES)
>  #define HOST_STACK_SIZE ((HOST_MIN_STACK_SIZE + 15) & ~15) /* Align. */
>  #define HOST_STACK_LR   (HOST_STACK_SIZE + LONGBYTES) /* In caller stack frame. */
>  
> @@ -410,24 +411,24 @@ heavyweight_exit:
>   PPC_STL r31, VCPU_GPR(R31)(r4)
>  
>   /* Load host non-volatile register state from host stack. */
> - PPC_LL  r14, HOST_NV_GPR(r14)(r1)
> - PPC_LL  r15, HOST_NV_GPR(r15)(r1)
> - PPC_LL  r16, HOST_NV_GPR(r16)(r1)
> - PPC_LL  r17, HOST_NV_GPR(r17)(r1)
> - PPC_LL  r18, HOST_NV_GPR(r18)(r1)
> - PPC_LL  r19, HOST_NV_GPR(r19)(r1)
> - PPC_LL  r20, HOST_NV_GPR(r20)(r1)
> - PPC_LL  r21, HOST_NV_GPR(r21)(r1)
> - PPC_LL  r22, HOST_NV_GPR(r22)(r1)
> - PPC_LL  r23, HOST_NV_GPR(r23)(r1)
> - PPC_LL  r24, HOST_NV_GPR(r24)(r1)
> - PPC_LL  r25, HOST_NV_GPR(r25)(r1)
> - PPC_LL  r26, HOST_NV_GPR(r26)(r1)
> - PPC_LL  r27, HOST_NV_GPR(r27)(r1)
> - PPC_LL  r28, HOST_NV_GPR(r28)(r1)
> - PPC_LL  r29, HOST_NV_GPR(r29)(r1)
> - PPC_LL  r30, HOST_NV_GPR(r30)(r1)
> - PPC_LL  r31, HOST_NV_GPR(r31)(r1)
> + PPC_LL  r14, HOST_NV_GPR(R14)(r1)
> + PPC_LL  r15, HOST_NV_GPR(R15)(r1)
> + PPC_LL  r16, HOST_NV_GPR(R16)(r1)
> + PPC_LL  r17, HOST_NV_GPR(R17)(r1)
> + PPC_LL  r18, HOST_NV_GPR(R18)(r1)
> + PPC_LL  r19, HOST_NV_GPR(R19)(r1)
> + PPC_LL  r20, HOST_NV_GPR(R20)(r1)
> + PPC_LL  r21, HOST_NV_GPR(R21)(r1)
> + PPC_LL  r22, HOST_NV_GPR(R22)(r1)
> + PPC_LL  r23, HOST_NV_GPR(R23)(r1)
> + PPC_LL  r24, HOST_NV_GPR(R24)(r1)
> + PPC_LL  r25, HOST_NV_GPR(R25)(r1)
> + PPC_LL  r26, HOST_NV_GPR(R26)(r1)
> + PPC_LL  r27, HOST_NV_GPR(R27)(r1)
> + PPC_LL  r28, HOST_NV_GPR(R28)(r1)
> + PPC_LL  r29, HOST_NV_GPR(R29)(r1)
> + PPC_LL  r30, HOST_NV_GPR(R30)(r1)
> + PPC_LL  r31, HOST_NV_GPR(R31)(r1)
>  
>   /* Return to kvm_vcpu_run(). */
>   mtlr	r5
> @@ -453,24 +454,24 @@ _GLOBAL(__kvmppc_vcpu_run)
>   stw r5, HOST_CR(r1)
>  
>   /* Save host non-volatile register state to stack. */
> - PPC_STL r14, HOST_NV_GPR(r14)(r1)
> - PPC_STL r15, HOST_NV_GPR(r15)(r1)
> - PPC_STL r16, HOST_NV_GPR(r16)(r1)
> - PPC_STL r17, HOST_NV_GPR(r17)(r1)
> - PPC_STL r18, HOST_NV_GPR(r18)(r1)
> - PPC_STL r19, HOST_NV_GPR(r19)(r1)
> - PPC_STL r20, HOST_NV_GPR(r20)(r1)
> - PPC_STL r21, HOST_NV_GPR(r21)(r1)
> - PPC_STL r22, HOST_NV_GPR(r22)(r1)
> - PPC_STL r23, HOST_NV_GPR(r23)(r1)
> - PPC_STL r24, HOST_NV_GPR(r24)(r1)
> - PPC_STL r25, HOST_NV_GPR(r25)(r1)
> - PPC_STL r26, HOST_NV_GPR(r26)(r1)
> - PPC_STL r27, HOST_NV_GPR(r27)(r1)
> - PPC_STL r28, HOST_NV_GPR(r28)(r1)
> - PPC_STL r29, HOST_NV_GPR(r29)(r1)
> - PPC_STL r30, HOST_NV_GPR(r30)(r1)
> - PPC_STL r31, HOST_NV_GPR(r31)(r1)
> + PPC_STL r14, HOST_NV_GPR(R14)(r1)
> + PPC_STL r15, HOST_NV_GPR(R15)(r1)
> + PPC_STL r16, HOST_NV_GPR(R16)(r1)
> + PPC_STL r17, HOST_NV_GPR(R17)(r1)
> + PPC_STL r18, HOST_NV_GPR(R18)(r1)
> + PPC_STL r19, HOST_NV_GPR(R19)(r1)
> + PPC_STL r20, HOST_NV_GPR(R20)(r1)
> + PPC_STL r21, HOST_NV_GPR(R21)(r1)
> + PPC_STL r22, HOST_NV_GPR(R22)(r1)
> + PPC_STL r23, HOST_NV_GPR(R23)(r1)
> + PPC_STL r24, HOST_NV_GPR(R24)(r1)
> + PPC_STL r25, HOST_NV_GPR(R25)(r1)
> + PPC_STL r26, HOST_NV_GPR(R26)(r1)
> + PPC_STL r27, HOST_NV_GPR(R27)(r1)
> + PPC_STL r28, HOST_NV_GPR(R28)(r1)
> + PPC_STL r29, HOST_NV_GPR(R29)(r1)
> + PPC_STL r30, HOST_NV_GPR(R30)(r1)
> + PPC_STL r31, HOST_NV_GPR(R31)(r1)
>  
>   /* Load guest non-volatiles. */
>   PPC_LL  r14, VCPU_GPR(R14)(r4)
> -- 
> 1.6.0.2
> 

Re: [PATCH 08/27] Add SLB switching code for entry/exit

2009-11-01 Thread Michael Neuling
> This is the really low level of guest entry/exit code.
> 
> Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
> currently aware of.
> 
> The segments in the guest differ from the ones on the host, so we need
> to switch the SLB to tell the MMU that we're in a new context.
> 
> So we store a shadow of the guest's SLB in the PACA, switch to that on
> entry and only restore bolted entries on exit, leaving the rest to the
> Linux SLB fault handler.
> 
> That way we get a really clean way of switching the SLB.
> 
> Signed-off-by: Alexander Graf 
> ---
>  arch/powerpc/kvm/book3s_64_slb.S |  277 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 277 insertions(+), 0 deletions(-)
>  create mode 100644 arch/powerpc/kvm/book3s_64_slb.S
> 
> diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_slb.S
> new file mode 100644
> index 000..00a8367
> --- /dev/null
> +++ b/arch/powerpc/kvm/book3s_64_slb.S
> @@ -0,0 +1,277 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
> + *
> + * Copyright SUSE Linux Products GmbH 2009
> + *
> + * Authors: Alexander Graf 
> + */
> +
> +/******************************************************************************
> + *                                                                            *
> + *                               Entry code                                   *
> + *                                                                            *
> + *****************************************************************************/
> +
> +.global kvmppc_handler_trampoline_enter
> +kvmppc_handler_trampoline_enter:
> +
> + /* Required state:
> +  *
> +  * MSR = ~IR|DR
> +  * R13 = PACA
> +  * R9 = guest IP
> +  * R10 = guest MSR
> +  * R11 = free
> +  * R12 = free
> +  * PACA[PACA_EXMC + EX_R9] = guest R9
> +  * PACA[PACA_EXMC + EX_R10] = guest R10
> +  * PACA[PACA_EXMC + EX_R11] = guest R11
> +  * PACA[PACA_EXMC + EX_R12] = guest R12
> +  * PACA[PACA_EXMC + EX_R13] = guest R13
> +  * PACA[PACA_EXMC + EX_CCR] = guest CR
> +  * PACA[PACA_EXMC + EX_R3] = guest XER
> +  */
> +
> + mtsrr0  r9
> + mtsrr1  r10
> +
> + mtspr   SPRN_SPRG_SCRATCH0, r0
> +
> + /* Remove LPAR shadow entries */
> +
> +#if SLB_NUM_BOLTED == 3

You could alternatively check the persistent entry in the slb_shadow
buffer.  This would give you a run time check.  Not sure what's best
though.  

> +
> + ld  r12, PACA_SLBSHADOWPTR(r13)
> + ld  r10, 0x10(r12)
> + ld  r11, 0x18(r12)

Can you define something in asm-offsets.c for these magic constants 0x10
and 0x18.  Similarly below.
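
For reference, the kind of asm-offsets.c entry being asked for could look
like this (a sketch only -- the DEFINE names are hypothetical, but struct
slb_shadow is the real one from asm/lppaca.h):

	/* arch/powerpc/kernel/asm-offsets.c */
	DEFINE(SLBSHADOW_SAVEAREA,
	       offsetof(struct slb_shadow, save_area));           /* 0x10 */
	DEFINE(SLBSHADOW_ENTRY_SIZE,
	       sizeof(((struct slb_shadow *)0)->save_area[0]));   /* 0x10 */

so the loads below could become "ld r10, SLBSHADOW_SAVEAREA(r12)" etc.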

> + /* Invalid? Skip. */
> + rldicl. r0, r10, 37, 63
> + beq slb_entry_skip_1
> + xoris   r9, r10, SLB_ESID_V@h
> + std r9, 0x10(r12)
> +slb_entry_skip_1:
> + ld  r9, 0x20(r12)
> + /* Invalid? Skip. */
> + rldicl. r0, r9, 37, 63
> + beq slb_entry_skip_2
> + xoris   r9, r9, SLB_ESID_V@h
> + std r9, 0x20(r12)
> +slb_entry_skip_2:
> + ld  r9, 0x30(r12)
> + /* Invalid? Skip. */
> + rldicl. r0, r9, 37, 63
> + beq slb_entry_skip_3
> + xoris   r9, r9, SLB_ESID_V@h
> + std r9, 0x30(r12)

Can these 3 be made into a macro?

> +slb_entry_skip_3:
> + 
> +#else
> +#error unknown number of bolted entries
> +#endif
> +
> + /* Flush SLB */
> +
> + slbia
> +
> + /* r0 = esid & ESID_MASK */
> + rldicr  r10, r10, 0, 35
> + /* r0 |= CLASS_BIT(VSID) */
> + rldic   r12, r11, 56 - 36, 36
> + or  r10, r10, r12
> + slbie   r10
> +
> + isync
> +
> + /* Fill SLB with our shadow */
> +
> + lbz r12, PACA_KVM_SLB_MAX(r13)
> + mulli   r12, r12, 16
> + addir12, r12, PACA_KVM_SLB
> + add r12, r12, r13
> +
> + /* for (r11 = kvm_slb; r11 < kvm_slb + kvm_slb_size; r11+=slb_entry) */
> + li  r11, PACA_KVM_SLB
> + add r11, r11, r13
> +
> +slb_loop_enter:
> +
> + ld  r10, 0(r11)
> +
> + rldicl. r0, r10, 37, 63
> + beq slb_loop_enter_skip
> +
> + ld  r9, 8(r11)
> + slbmte  r9, r10

If you're updating the first 3 slbs, you need to make sure the slb
shadow is updated at the same time (BTW dumb question: can we run this
under PHYP?)

> +
> +slb_loop_enter_skip:
> +

Re: [PATCH 11/27] Add book3s_64 Host MMU handling

2009-11-01 Thread Michael Neuling

> +static void invalidate_pte(struct hpte_cache *pte)
> +{
> + dprintk_mmu("KVM: Flushing SPT %d: 0x%llx (0x%llx) -> 0x%llx\n",
> + i, pte->pte.eaddr, pte->pte.vpage, pte->host_va);
> +
> + ppc_md.hpte_invalidate(pte->slot, pte->host_va,
> +MMU_PAGE_4K, MMU_SEGSIZE_256M,
> +false);

Are we assuming 256M segments here (and elsewhere)?


> +static int kvmppc_mmu_next_segment(struct kvm_vcpu *vcpu, ulong esid)
> +{
> + int i;
> + int max_slb_size = 64;
> + int found_inval = -1;
> + int r;
> +
> + if (!get_paca()->kvm_slb_max)
> + get_paca()->kvm_slb_max = 1;
> +
> + /* Are we overwriting? */
> + for (i = 1; i < get_paca()->kvm_slb_max; i++) {
> + if (!(get_paca()->kvm_slb[i].esid & SLB_ESID_V))
> + found_inval = i;
> + else if ((get_paca()->kvm_slb[i].esid & ESID_MASK) == esid)
> + return i;
> + }
> +
> + /* Found a spare entry that was invalidated before */
> + if (found_inval > 0)
> + return found_inval;
> +
> + /* No spare invalid entry, so create one */
> +
> + if (mmu_slb_size < 64)
> + max_slb_size = mmu_slb_size;

Can we just use the global mmu_slb_size and eliminate max_slb_size?



Mikey


Re: [PATCH 08/27] Add SLB switching code for entry/exit

2009-11-02 Thread Michael Neuling
> >> This is the really low level of guest entry/exit code.
> >>
> >> Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
> >> currently aware of.
> >>
> >> The segments in the guest differ from the ones on the host, so we  
> >> need
> >> to switch the SLB to tell the MMU that we're in a new context.
> >>
> >> So we store a shadow of the guest's SLB in the PACA, switch to that  
> >> on
> >> entry and only restore bolted entries on exit, leaving the rest to  
> >> the
> >> Linux SLB fault handler.
> >>
> >> That way we get a really clean way of switching the SLB.
> >>
> >> Signed-off-by: Alexander Graf 
> >> ---
> >> arch/powerpc/kvm/book3s_64_slb.S |  277 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> 1 files changed, 277 insertions(+), 0 deletions(-)
> >> create mode 100644 arch/powerpc/kvm/book3s_64_slb.S
> >>
> >> diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_slb.S
> >> new file mode 100644
> >> index 000..00a8367
> >> --- /dev/null
> >> +++ b/arch/powerpc/kvm/book3s_64_slb.S
> >> @@ -0,0 +1,277 @@
> >> +/*
> >> + * This program is free software; you can redistribute it and/or  
> >> modify
> >> + * it under the terms of the GNU General Public License, version  
> >> 2, as
> >> + * published by the Free Software Foundation.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License
> >> + * along with this program; if not, write to the Free Software
> >> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA   
> >> 02110-1301, USA.
> >> + *
> >> + * Copyright SUSE Linux Products GmbH 2009
> >> + *
> >> + * Authors: Alexander Graf 
> >> + */
> >> +
> >> +/******************************************************************************
> >> + *                                                                            *
> >> + *                               Entry code                                   *
> >> + *                                                                            *
> >> + *****************************************************************************/
> >> +
> >> +.global kvmppc_handler_trampoline_enter
> >> +kvmppc_handler_trampoline_enter:
> >> +
> >> +/* Required state:
> >> + *
> >> + * MSR = ~IR|DR
> >> + * R13 = PACA
> >> + * R9 = guest IP
> >> + * R10 = guest MSR
> >> + * R11 = free
> >> + * R12 = free
> >> + * PACA[PACA_EXMC + EX_R9] = guest R9
> >> + * PACA[PACA_EXMC + EX_R10] = guest R10
> >> + * PACA[PACA_EXMC + EX_R11] = guest R11
> >> + * PACA[PACA_EXMC + EX_R12] = guest R12
> >> + * PACA[PACA_EXMC + EX_R13] = guest R13
> >> + * PACA[PACA_EXMC + EX_CCR] = guest CR
> >> + * PACA[PACA_EXMC + EX_R3] = guest XER
> >> + */
> >> +
> >> +	mtsrr0	r9
> >> +	mtsrr1	r10
> >> +
> >> +	mtspr	SPRN_SPRG_SCRATCH0, r0
> >> +
> >> +/* Remove LPAR shadow entries */
> >> +
> >> +#if SLB_NUM_BOLTED == 3
> >
> > You could alternatively check the persistent entry in the slb_shadow
> > buffer.  This would give you a run time check.  Not sure what's best
> > though.
> 
> Well we're in the hot path here, so anything using as few registers as  
> possible and being simple is the best :-). I'd guess the more we are  
> clever at compile time the better.

Yeah, I tend to agree.

> 
> >
> >
> >> +
> >> +	ld	r12, PACA_SLBSHADOWPTR(r13)
> >> +	ld	r10, 0x10(r12)
> >> +	ld	r11, 0x18(r12)
> >
> > Can you define something in asm-offsets.c for these magic constants  
> > 0x10
> > and 0x18.  Similarly below.
> >
> >> +	/* Invalid? Skip. */
> >> +	rldicl. r0, r10, 37, 63
> >> +	beq	slb_entry_skip_1
> >> +	xoris	r9, r10, SLB_ESID_V@h
> >> +	std	r9, 0x10(r12)
> >> +slb_entry_skip_1:
> >> +	ld	r9, 0x20(r12)
> >> +	/* Invalid? Skip. */
> >> +	rldicl. r0, r9, 37, 63
> >> +	beq	slb_entry_skip_2
> >> +	xoris	r9, r9, SLB_ESID_V@h
> >> +	std	r9, 0x20(r12)
> >> +slb_entry_skip_2:
> >> +	ld	r9, 0x30(r12)
> >> +	/* Invalid? Skip. */
> >> +	rldicl. r0, r9, 37, 63
> >> +	beq	slb_entry_skip_3
> >> +	xoris	r9, r9, SLB_ESID_V@h
> >> +	std	r9, 0x30(r12)
> >
> > Can these 3 be made into a macro?
> 
> Phew - dynamically generating jump points sounds rather hard. I can  
> give it a try...
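
(For reference, a local numeric label inside an assembler macro avoids
having to generate unique jump points at all.  An untested sketch,
ignoring that the first entry also has to keep its ESID/VSID in r10/r11
for the slbie later:

	.macro SHADOW_SLB_INVALIDATE off
	ld	r9, \off(r12)
	/* Invalid? Skip. */
	rldicl.	r0, r9, 37, 63
	beq	1f
	xoris	r9, r9, SLB_ESID_V@h
	std	r9, \off(r12)
1:
	.endm

	SHADOW_SLB_INVALIDATE 0x10
	SHADOW_SLB_INVALIDATE 0x20
	SHADOW_SLB_INVALIDATE 0x30

The "1:" label is local to each expansion, so the macro can be used any
number of times.)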
> 
> >
> >> +slb_entry_skip_3:
> >> +
> >> +#else
> >> +#error unknown number of bolted entries
> >> +#endif
> >> +
> >> +	/* Flush SLB */
> >> +
> >> +	slbia
> >> +
> >> +	/* r0 = esid & ESID_MASK */
> >> +	rldicr  r10, r10, 0, 35
> >> +	/* r0 |= CLASS_BIT(VSID) */
> >> +	rldic   r12, r11, 56 - 36, 36
> >> +	or	r10, r10, r12
> >> +	slbie	r10
> >> +
> >> +	isync
> >> +
> >> +	/* Fill SLB with our shadow */
> >> +
> >> +	lbz	r12, PACA_KVM_SLB_MAX(r13)

Re: [PATCH 1/4 v6] powerpc: export debug registers save function for KVM

2013-07-29 Thread Michael Neuling
Alexander Graf  wrote:

> 
> On 04.07.2013, at 08:57, Bharat Bhushan wrote:
> 
> > KVM need this function when switching from vcpu to user-space
> > thread. My subsequent patch will use this function.
> > 
> > Signed-off-by: Bharat Bhushan 
> 
> Ben / Michael, please ack.

It's not really my area of expertise, but it applies and compiles for me
and it's relatively simple, so FWIW...

Acked-by: Michael Neuling 

> 
> 
> Alex
> 
> > ---
> > v5->v6
> > - switch_booke_debug_regs() not guarded by the compiler switch
> > 
> > arch/powerpc/include/asm/switch_to.h |1 +
> > arch/powerpc/kernel/process.c|3 ++-
> > 2 files changed, 3 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
> > index 200d763..db68f1d 100644
> > --- a/arch/powerpc/include/asm/switch_to.h
> > +++ b/arch/powerpc/include/asm/switch_to.h
> > @@ -29,6 +29,7 @@ extern void giveup_vsx(struct task_struct *);
> > extern void enable_kernel_spe(void);
> > extern void giveup_spe(struct task_struct *);
> > extern void load_up_spe(struct task_struct *);
> > +extern void switch_booke_debug_regs(struct thread_struct *new_thread);
> > 
> > #ifndef CONFIG_SMP
> > extern void discard_lazy_cpu_state(void);
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 01ff496..da586aa 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -362,12 +362,13 @@ static void prime_debug_regs(struct thread_struct *thread)
> >  * debug registers, set the debug registers from the values
> >  * stored in the new thread.
> >  */
> > -static void switch_booke_debug_regs(struct thread_struct *new_thread)
> > +void switch_booke_debug_regs(struct thread_struct *new_thread)
> > {
> > if ((current->thread.debug.dbcr0 & DBCR0_IDM)
> > || (new_thread->debug.dbcr0 & DBCR0_IDM))
> > prime_debug_regs(new_thread);
> > }
> > +EXPORT_SYMBOL_GPL(switch_booke_debug_regs);
> > #else   /* !CONFIG_PPC_ADV_DEBUG_REGS */
> > #ifndef CONFIG_HAVE_HW_BREAKPOINT
> > static void set_debug_reg_defaults(struct thread_struct *thread)
> > -- 
> > 1.7.0.4
> > 
> > 
> 


[PATCH 2/2] KVM: PPC: Book3S HV: Add register name when loading toc

2014-08-18 Thread Michael Neuling
Add 'r' to register name r2 in kvmppc_hv_enter.

Also update comment at the top of kvmppc_hv_enter to indicate that R2/TOC is
non-volatile.

Signed-off-by: Michael Neuling 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c4bd2d7..1e8c480 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -355,6 +355,7 @@ kvmppc_hv_entry:
 * MSR = ~IR|DR
 * R13 = PACA
 * R1 = host R1
+* R2 = TOC
 * all other volatile GPRS = free
 */
	mflr	r0
@@ -503,7 +504,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 toc_tlbie_lock:
.tc native_tlbie_lock[TC],native_tlbie_lock
.previous
-   ld  r3,toc_tlbie_lock@toc(2)
+   ld  r3,toc_tlbie_lock@toc(r2)
 #ifdef __BIG_ENDIAN__
lwz r8,PACA_LOCK_TOKEN(r13)
 #else
-- 
1.9.1



[PATCH 1/2] KVM: PPC: Book3S HV: Cleanup kvmppc_load/save_fp

2014-08-18 Thread Michael Neuling
This cleans up kvmppc_load/save_fp.  It removes unnecessary isyncs.  It also
removes the unnecessary resetting of the MSR bits on exit of kvmppc_save_fp.

Signed-off-by: Michael Neuling 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index f0c4db7..c4bd2d7 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2434,7 +2434,6 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 #endif
mtmsrd  r8
-   isync
addir3,r3,VCPU_FPRS
bl  store_fp_state
 #ifdef CONFIG_ALTIVEC
@@ -2470,7 +2469,6 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 #endif
mtmsrd  r8
-   isync
addir3,r4,VCPU_FPRS
bl  load_fp_state
 #ifdef CONFIG_ALTIVEC
-- 
1.9.1



Re: [PATCH 1/2] KVM: PPC: Book3S HV: Cleanup kvmppc_load/save_fp

2014-08-18 Thread Michael Neuling
On Tue, 2014-08-19 at 15:24 +1000, Paul Mackerras wrote:
> On Tue, Aug 19, 2014 at 02:59:29PM +1000, Michael Neuling wrote:
> > This cleans up kvmppc_load/save_fp.  It removes unnecessary isyncs.
> 
> NAK - they are necessary on PPC970, which we (still) support.  You
> could put them in a feature section if they are really annoying you.

I'm not fussed, but we should at least have a comment there for why we
need them.
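
Something like this would do both at once -- a sketch of the
feature-section idea (the flag choice is an assumption; CPU_FTR_ARCH_201
is what the HV code uses elsewhere to mean PPC970):

BEGIN_FTR_SECTION
	/* PPC970 needs an isync around the mtmsrd and the FP/VSX access */
	isync
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)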

> >  It also
> > removes the unnecessary resetting of the MSR bits on exit of kvmppc_save_fp.
> 
> ... except it doesn't. :)  That got folded into e4e38121507a ("KVM:
> PPC: Book3S HV: Add transactional memory support").

Arrh, thanks.  This patch was cleaning up stuff from an old local tree
and couldn't see where it had been upstreamed.  I missed this.

Mikey


[PATCH] kvm tools: powerpc: Fix init order for xics

2013-08-19 Thread Michael Neuling
xics_init() assumes kvm->nrcpus is already setup.  kvm->nrcpus is setup
in kvm_cpu_init().

Unfortunately xics_init() and kvm_cpu_init() both use base_init().  So
depending on the order randomly determined by the compiler, xics_init()
may see kvm->nrcpus as 0 and not set up any of the icp VCPU
pointers.  This manifests itself later in boot when trying to raise an
IRQ, resulting in a null pointer dereference/segv.

This moves xics_init() to use dev_base_init() to ensure it happens after
kvm_cpu_init().
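
To illustrate (this is not kvmtool's actual plumbing, just a sketch of
why init levels fix it): everything at one init level runs before
anything at the next level, while order *within* a level is whatever the
linker happened to emit.

	#include <stdio.h>

	struct init_fn { int (*fn)(void); int level; };

	static int nrcpus;
	static int kvm_cpu_init(void) { nrcpus = 4; return 0; }
	static int xics_init(void) { printf("xics sees %d cpus\n", nrcpus); return 0; }

	/* link order is arbitrary ... */
	static struct init_fn fns[] = {
		{ xics_init, 1 },	/* dev_base_init level */
		{ kvm_cpu_init, 0 },	/* base_init level */
	};

	int main(void)
	{
		/* ... but levels are not: level 0 always runs first */
		for (int lvl = 0; lvl < 2; lvl++)
			for (unsigned int i = 0; i < sizeof(fns) / sizeof(fns[0]); i++)
				if (fns[i].level == lvl)
					fns[i].fn();
		return 0;
	}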

Signed-off-by: Michael Neuling 

diff --git a/tools/kvm/powerpc/xics.c b/tools/kvm/powerpc/xics.c
index cf64a08..c1ef35b 100644
--- a/tools/kvm/powerpc/xics.c
+++ b/tools/kvm/powerpc/xics.c
@@ -505,7 +505,7 @@ static int xics_init(struct kvm *kvm)
 
return 0;
 }
-base_init(xics_init);
+dev_base_init(xics_init);
 
 
 void kvm__irq_line(struct kvm *kvm, int irq, int level)


[PATCH RFC] KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg

2013-08-29 Thread Michael Neuling
Alex,

This reserves space in get/set_one_reg ioctl for the extra guest state
needed for POWER8.  It doesn't implement these at all, it just reserves
them so that the ABI is defined now.  

A few things to note here:

- POWER8 has 6 PMCs and an additional 2 SPMCs for the supervisor.  Here
  I'm storing these 2 SPMCs in PMC7/8.

- This adds *a lot* of state for transactional memory.  Because of TM
  suspend mode, this is unavoidable: you can't simply roll back all
  transactions and store only the checkpointed state.  I've added this all
  to get/set_one_reg (including GPRs) rather than creating a new ioctl
  which returns a struct kvm_regs like KVM_GET_REGS does.  This means if
  we need to extract the TM state, we are going to need a bucket load
  of IOCTLs.  Hopefully most of the time this will not be needed as we
  can look at the MSR to see if TM is active and only grab them when
  needed.

- The TM state is offset by 0x1000.  Other than being bigger than the
  SPR space, it's fairly arbitrarily chosen (see the sketch after this
  list for how the ids decompose).

- For TM, I've done away with VMX and FP and created a single 64x128 bit
  VSX register space.
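
To make the layout concrete, here is a standalone sketch of how one of
these ids decomposes (the arch/size mask values are the generic uapi
ones; KVM_REG_PPC and the 0x1000 TM offset are from the patch below):

	#include <stdint.h>
	#include <stdio.h>

	#define KVM_REG_PPC		0x1000000000000000ULL
	#define KVM_REG_SIZE_U64	0x0030000000000000ULL
	#define KVM_REG_PPC_TM		(KVM_REG_PPC | 0x1000)
	#define KVM_REG_PPC_TM_SPR(n)	(KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | (n))

	int main(void)
	{
		uint64_t id = KVM_REG_PPC_TM_SPR(0x60);	/* KVM_REG_PPC_TM_CR */

		printf("arch 0x%016llx size %llu offset 0x%llx\n",
		       (unsigned long long)(id & 0xff00000000000000ULL),
		       (unsigned long long)((id & 0x00f0000000000000ULL) >> 52),
		       (unsigned long long)(id & 0x0000000000ffffffULL));
		return 0;
	}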

Alex: I'll add the documentation Documentation/virtual/kvm/api.txt if
you're happy with all this.

Signed-off-by: Michael Neuling 

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 0fb1a6e..33b8007 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -429,6 +429,11 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_MMCR0  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x10)
 #define KVM_REG_PPC_MMCR1  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x11)
 #define KVM_REG_PPC_MMCRA  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x12)
+#define KVM_REG_PPC_MMCR2  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x13)
+#define KVM_REG_PPC_MMCRS  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x14)
+#define KVM_REG_PPC_SIAR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x15)
+#define KVM_REG_PPC_SDAR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x16)
+#define KVM_REG_PPC_SIER   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x17)
 
 #define KVM_REG_PPC_PMC1   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x18)
 #define KVM_REG_PPC_PMC2   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x19)
@@ -499,6 +504,53 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_TLB3PS (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9a)
 #define KVM_REG_PPC_EPTCFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9b)
 
+/* POWER8 new SPRs */
+#define KVM_REG_PPC_IAMR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x9c)
+#define KVM_REG_PPC_TFHAR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x9d)
+#define KVM_REG_PPC_TFIAR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x9e)
+#define KVM_REG_PPC_TEXASR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x9f)
+#define KVM_REG_PPC_FSCR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa0)
+#define KVM_REG_PPC_PSPB   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xa1)
+#define KVM_REG_PPC_EBBHR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa2)
+#define KVM_REG_PPC_EBBRR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa3)
+#define KVM_REG_PPC_BESCR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa4)
+#define KVM_REG_PPC_TAR(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa5)
+#define KVM_REG_PPC_DPDES  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa6)
+#define KVM_REG_PPC_DAWR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa7)
+#define KVM_REG_PPC_DAWRX  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa8)
+#define KVM_REG_PPC_CIABR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa9)
+#define KVM_REG_PPC_IC (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xaa)
+#define KVM_REG_PPC_VTB(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xab)
+#define KVM_REG_PPC_CSIGR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xac)
+#define KVM_REG_PPC_TACR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xad)
+#define KVM_REG_PPC_TCSCR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xae)
+#define KVM_REG_PPC_PID(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xaf)
+#define KVM_REG_PPC_ACOP   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb0)
+
+/* Transactional Memory checkpointed state:
+ * This is all GPRs, all VSX regs and a subset of SPRs
+ */
+#define KVM_REG_PPC_TM (KVM_REG_PPC | 0x1000)
+/* TM GPRs */
+#define KVM_REG_PPC_TM_GPR0(KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0)
+#define KVM_REG_PPC_TM_GPR(n)  (KVM_REG_PPC_TM_GPR0 + (n))
+#define KVM_REG_PPC_TM_GPR31   (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x1f)
+/* TM VSX */
+#define KVM_REG_PPC_TM_VSR0(KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x20)
+#define KVM_REG_PPC_TM_VSR(n)  (KVM_REG_PPC_VSR0 + (n))
+#define KVM_REG_PPC_VSR63  (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x5f)
+/* TM SPRS */
+#define KVM_REG_PPC_TM_CR  (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x60)
+#define KVM_REG_PPC_TM_LR  (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x61)
+#define KVM_REG_PPC_TM_CTR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x62)
+#define KVM_REG_PPC_TM_FPSCR   (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x63)
+#define KVM_REG_PPC_TM_AMR (KVM_REG_PPC_TM | KVM_REG_SIZE

Re: [PATCH RFC] KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg

2013-08-30 Thread Michael Neuling
On Sat, Aug 31, 2013 at 8:17 AM, Benjamin Herrenschmidt wrote:
> On Fri, 2013-08-30 at 16:01 +0200, Alexander Graf wrote:
>> >
>> > - The TM state is offset by 0x1000.  Other than being bigger than the
>> >   SPR space, it's fairly arbitrarily chosen.
>
> Make it higher, just in case

Ok but how high?  KVM_REG_SIZE is set by generic code and it's way up
at 0x0030ULL

Mikey


Re: [PATCH RFC] KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg

2013-08-30 Thread Michael Neuling
On Sat, Aug 31, 2013 at 12:01 AM, Alexander Graf  wrote:
>
> On 30.08.2013, at 08:09, Michael Neuling wrote:
>
>> Alex,
>>
>> This reserves space in get/set_one_reg ioctl for the extra guest state
>> needed for POWER8.  It doesn't implement these at all, it just reserves
>> them so that the ABI is defined now.
>>
>> A few things to note here:
>>
>> - POWER8 has 6 PMCs and an additional 2 SPMCs for the supervisor.  Here
>>  I'm storing these 2 SPMCs in PMC7/8.
>
> Is this safe to do? Are we guaranteed that POWER9 or POWER10 doesn't 
> introduce a real PMC7?

Good point.  I'll change it.

>
>> - This adds *a lot* of state for transactional memory.  Because of TM
>>  suspend mode, this is unavoidable: you can't simply roll back all
>>  transactions and store only the checkpointed state.  I've added this all to
>>  get/set_one_reg (including GPRs) rather than creating a new ioctl
>>  which returns a struct kvm_regs like KVM_GET_REGS does.  This means if
>>  we need to extract the TM state, we are going to need a bucket load
>>  of IOCTLs.  Hopefully most of the time this will not be needed as we
>>  can look at the MSR to see if TM is active and only grab them when
>>  needed.
>
> If we find this to be a performance issue, we can always add a new ioctl that 
> allows multiple ONE_REG accesses at a time. The only reason we don't have 
> that yet is that bulk one_reg access hasn't happened in any performance 
> critical path so far.

Ok.
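
For the record, the "bucket load of IOCTLs" looks like this from
userspace -- one KVM_GET_ONE_REG call per register (a sketch, error
handling trimmed; KVM_REG_PPC_TM_GPR() is from the ABI proposed here):

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	static int get_tm_gprs(int vcpu_fd, __u64 gprs[32])
	{
		for (int i = 0; i < 32; i++) {
			struct kvm_one_reg reg = {
				.id   = KVM_REG_PPC_TM_GPR(i),
				.addr = (__u64)(unsigned long)&gprs[i],
			};

			if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg) < 0)
				return -1;
		}
		return 0;
	}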

>
>>
>> - The TM state is offset by 0x1000.  Other than being bigger than the
>>  SPR space, it's fairly arbitrarily chosen.
>>
>> - For TM, I've done away with VMX and FP and created a single 64x128 bit
>>  VSX register space.
>>
>> Alex: I'll add the documentation Documentation/virtual/kvm/api.txt if
>> you're happy with all this.
>
> Looks perfectly reasonable to me :).

Thanks.  I'll repost.

Mikey


[PATCH v2] KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg

2013-09-02 Thread Michael Neuling
This reserves space in get/set_one_reg ioctl for the extra guest state
needed for POWER8.  It doesn't implement these at all, it just reserves
them so that the ABI is defined now.

A few things to note here:

- This adds *a lot* of state for transactional memory.  Because of TM
  suspend mode, this is unavoidable: you can't simply roll back all
  transactions and store only the checkpointed state.  I've added this all
  to get/set_one_reg (including GPRs) rather than creating a new ioctl
  which returns a struct kvm_regs like KVM_GET_REGS does.  This means if
  we need to extract the TM state, we are going to need a bucket load
  of IOCTLs.  Hopefully most of the time this will not be needed as we
  can look at the MSR to see if TM is active and only grab them when
  needed.  If this becomes a bottleneck in future we can add another
  ioctl to grab all this state in one go.

- The TM state is offset by 0x8000.

- For TM, I've done away with VMX and FP and created a single 64x128 bit
  VSX register space.

- I've left a space of 1 (at 0x9c) since Paulus needs to add a value
  which applies to POWER7 as well.

Signed-off-by: Michael Neuling 

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index ef925ea..341058c 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1810,6 +1810,45 @@ registers, find a list below:
   PPC   | KVM_REG_PPC_TLB3PS   | 32
   PPC   | KVM_REG_PPC_EPTCFG   | 32
   PPC   | KVM_REG_PPC_ICP_STATE | 64
+  PPC   | KVM_REG_PPC_SPMC1| 32
+  PPC   | KVM_REG_PPC_SPMC2| 32
+  PPC   | KVM_REG_PPC_IAMR | 64
+  PPC   | KVM_REG_PPC_TFHAR| 64
+  PPC   | KVM_REG_PPC_TFIAR| 64
+  PPC   | KVM_REG_PPC_TEXASR   | 64
+  PPC   | KVM_REG_PPC_FSCR | 64
+  PPC   | KVM_REG_PPC_PSPB | 32
+  PPC   | KVM_REG_PPC_EBBHR| 64
+  PPC   | KVM_REG_PPC_EBBRR| 64
+  PPC   | KVM_REG_PPC_BESCR| 64
+  PPC   | KVM_REG_PPC_TAR  | 64
+  PPC   | KVM_REG_PPC_DPDES| 64
+  PPC   | KVM_REG_PPC_DAWR | 64
+  PPC   | KVM_REG_PPC_DAWRX| 64
+  PPC   | KVM_REG_PPC_CIABR| 64
+  PPC   | KVM_REG_PPC_IC   | 64
+  PPC   | KVM_REG_PPC_VTB  | 64
+  PPC   | KVM_REG_PPC_CSIGR| 64
+  PPC   | KVM_REG_PPC_TACR | 64
+  PPC   | KVM_REG_PPC_TCSCR| 64
+  PPC   | KVM_REG_PPC_PID  | 64
+  PPC   | KVM_REG_PPC_ACOP | 64
+  PPC   | KVM_REG_PPC_TM_GPR0  | 64
+  ...
+  PPC   | KVM_REG_PPC_TM_GPR31 | 64
+  PPC   | KVM_REG_PPC_TM_VSR0  | 128
+  ...
+  PPC   | KVM_REG_PPC_TM_VSR63 | 128
+  PPC   | KVM_REG_PPC_TM_CR| 64
+  PPC   | KVM_REG_PPC_TM_LR| 64
+  PPC   | KVM_REG_PPC_TM_CTR   | 64
+  PPC   | KVM_REG_PPC_TM_FPSCR | 64
+  PPC   | KVM_REG_PPC_TM_AMR   | 64
+  PPC   | KVM_REG_PPC_TM_PPR   | 64
+  PPC   | KVM_REG_PPC_TM_VRSAVE| 64
+  PPC   | KVM_REG_PPC_TM_VSCR  | 32
+  PPC   | KVM_REG_PPC_TM_DSCR  | 64
+  PPC   | KVM_REG_PPC_TM_TAR   | 64
 
 ARM registers are mapped using the lower 32 bits.  The upper 16 of that
 is the register group type, or coprocessor number:
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 0fb1a6e..8e687a1 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -429,6 +429,11 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_MMCR0  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x10)
 #define KVM_REG_PPC_MMCR1  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x11)
 #define KVM_REG_PPC_MMCRA  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x12)
+#define KVM_REG_PPC_MMCR2  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x13)
+#define KVM_REG_PPC_MMCRS  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x14)
+#define KVM_REG_PPC_SIAR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x15)
+#define KVM_REG_PPC_SDAR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x16)
+#define KVM_REG_PPC_SIER   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x17)
 
 #define KVM_REG_PPC_PMC1   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x18)
 #define KVM_REG_PPC_PMC2   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x19)
@@ -499,6 +504,55 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_TLB3PS (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9a)
 #define KVM_REG_PPC_EPTCFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9b)
 
+/* POWER8 registers */
+#define KVM_REG_PPC_SPMC1  (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9d)
+#define KVM_REG_PPC_SPMC2  (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9e)
+#define KVM_REG_PPC_IAMR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x9f)
+#define KVM_REG_PPC_TFHAR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa0)
+#define KVM_REG_PPC_TFIAR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa1)
+#define KVM_REG_PPC_TEXASR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa2)
+#define KVM_REG_PPC_FSCR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa3)
+#define KVM_REG_PPC_PSPB   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xa4)
+#define KVM_REG_PPC_EBBHR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa5)
+#define KVM_REG_PPC_EBBRR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa6)
+

[PATCH v3] KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg

2013-09-02 Thread Michael Neuling
This reserves space in get/set_one_reg ioctl for the extra guest state
needed for POWER8.  It doesn't implement these at all, it just reserves
them so that the ABI is defined now.

A few things to note here:

- This adds *a lot* of state for transactional memory.  Because of TM
  suspend mode, this is unavoidable: you can't simply roll back all
  transactions and store only the checkpointed state.  I've added this all
  to get/set_one_reg (including GPRs) rather than creating a new ioctl
  which returns a struct kvm_regs like KVM_GET_REGS does.  This means if
  we need to extract the TM state, we are going to need a bucket load
  of IOCTLs.  Hopefully most of the time this will not be needed as we
  can look at the MSR to see if TM is active and only grab them when
  needed.  If this becomes a bottleneck in future we can add another
  ioctl to grab all this state in one go.

- The TM state is offset by 0x8000.

- For TM, I've done away with VMX and FP and created a single 64x128 bit
  VSX register space.

- I've left a space of 1 (at 0x9c) since Paulus needs to add a value
  which applies to POWER7 as well.

Signed-off-by: Michael Neuling 

--- 
The last one was screwed up... sorry..

v3: 
  fix naming mistake and whitespace screwage.

v2: 
  integrate feedback

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index ef925ea..341058c 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1810,6 +1810,45 @@ registers, find a list below:
   PPC   | KVM_REG_PPC_TLB3PS   | 32
   PPC   | KVM_REG_PPC_EPTCFG   | 32
   PPC   | KVM_REG_PPC_ICP_STATE | 64
+  PPC   | KVM_REG_PPC_SPMC1| 32
+  PPC   | KVM_REG_PPC_SPMC2| 32
+  PPC   | KVM_REG_PPC_IAMR | 64
+  PPC   | KVM_REG_PPC_TFHAR| 64
+  PPC   | KVM_REG_PPC_TFIAR| 64
+  PPC   | KVM_REG_PPC_TEXASR   | 64
+  PPC   | KVM_REG_PPC_FSCR | 64
+  PPC   | KVM_REG_PPC_PSPB | 32
+  PPC   | KVM_REG_PPC_EBBHR| 64
+  PPC   | KVM_REG_PPC_EBBRR| 64
+  PPC   | KVM_REG_PPC_BESCR| 64
+  PPC   | KVM_REG_PPC_TAR  | 64
+  PPC   | KVM_REG_PPC_DPDES| 64
+  PPC   | KVM_REG_PPC_DAWR | 64
+  PPC   | KVM_REG_PPC_DAWRX| 64
+  PPC   | KVM_REG_PPC_CIABR| 64
+  PPC   | KVM_REG_PPC_IC   | 64
+  PPC   | KVM_REG_PPC_VTB  | 64
+  PPC   | KVM_REG_PPC_CSIGR| 64
+  PPC   | KVM_REG_PPC_TACR | 64
+  PPC   | KVM_REG_PPC_TCSCR| 64
+  PPC   | KVM_REG_PPC_PID  | 64
+  PPC   | KVM_REG_PPC_ACOP | 64
+  PPC   | KVM_REG_PPC_TM_GPR0  | 64
+  ...
+  PPC   | KVM_REG_PPC_TM_GPR31 | 64
+  PPC   | KVM_REG_PPC_TM_VSR0  | 128
+  ...
+  PPC   | KVM_REG_PPC_TM_VSR63 | 128
+  PPC   | KVM_REG_PPC_TM_CR| 64
+  PPC   | KVM_REG_PPC_TM_LR| 64
+  PPC   | KVM_REG_PPC_TM_CTR   | 64
+  PPC   | KVM_REG_PPC_TM_FPSCR | 64
+  PPC   | KVM_REG_PPC_TM_AMR   | 64
+  PPC   | KVM_REG_PPC_TM_PPR   | 64
+  PPC   | KVM_REG_PPC_TM_VRSAVE| 64
+  PPC   | KVM_REG_PPC_TM_VSCR  | 32
+  PPC   | KVM_REG_PPC_TM_DSCR  | 64
+  PPC   | KVM_REG_PPC_TM_TAR   | 64
 
 ARM registers are mapped using the lower 32 bits.  The upper 16 of that
 is the register group type, or coprocessor number:
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 0fb1a6e..7ed41c0 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -429,6 +429,11 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_MMCR0  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x10)
 #define KVM_REG_PPC_MMCR1  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x11)
 #define KVM_REG_PPC_MMCRA  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x12)
+#define KVM_REG_PPC_MMCR2  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x13)
+#define KVM_REG_PPC_MMCRS  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x14)
+#define KVM_REG_PPC_SIAR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x15)
+#define KVM_REG_PPC_SDAR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x16)
+#define KVM_REG_PPC_SIER   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x17)
 
 #define KVM_REG_PPC_PMC1   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x18)
 #define KVM_REG_PPC_PMC2   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x19)
@@ -499,6 +504,55 @@ struct kvm_get_htab_header {
 #define KVM_REG_PPC_TLB3PS (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9a)
 #define KVM_REG_PPC_EPTCFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9b)
 
+/* POWER8 registers */
+#define KVM_REG_PPC_SPMC1  (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9d)
+#define KVM_REG_PPC_SPMC2  (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9e)
+#define KVM_REG_PPC_IAMR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x9f)
+#define KVM_REG_PPC_TFHAR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa0)
+#define KVM_REG_PPC_TFIAR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa1)
+#define KVM_REG_PPC_TEXASR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa2)
+#define KVM_REG_PPC_FSCR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa3)
+#define KVM_REG_PPC_PSPB   (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xa4)
+#define KVM_REG_PP

Re: [PATCH RFC 0/5] Eliminate double-copying of FP/VMX/VSX state

2013-09-09 Thread Michael Neuling
> At present, PR KVM and BookE KVM do multiple copies of FP and
> related state because of the way that they use the arrays in the
> thread_struct as an intermediate staging post for the state.  They do
> this so that they can use the existing system functions for loading
> and saving state, and so that they can keep guest state in the CPU
> registers while executing general kernel code.
> 
> This patch series reorganizes things so that KVM and the main kernel
> use the same representation for FP/VMX/VSX state, and so that guest
> state can be loaded/save directly from/to the vcpu struct instead of
> having to go via the thread_struct.  This simplifies things and should
> be a little faster.
> 
> This series is against Alex Graf's kvm-ppc-queue branch plus my recent
> series of 23 patches to make PR and HV KVM coexist.

This is great!

Alex, can you pull this into your tree?  It's going to be very useful
for POWER8 transactional memory as we have to save another set of VSX
state.  The changes in 4/5 are going to make coding this up a lot
cleaner.
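
For anyone following along, the shared representation is roughly this (a
sketch of the idea only; see the series itself for the real definitions):

	/* one container type used by both thread_struct and kvm_vcpu, so
	 * FP state can be loaded/saved directly with no staging copy */
	struct thread_fp_state {
		u64	fpr[32][TS_FPRWIDTH] __attribute__((aligned(16)));
		u64	fpscr;
	};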

There are a lot of generic powerpc changes in here also.  Perhaps, if
Alex is fine with these, he can ACK them and benh can pull this in?

Mikey

> 
> Paul.


Re: [PATCH RFC 0/5] Eliminate double-copying of FP/VMX/VSX state

2013-09-09 Thread Michael Neuling
Alexander Graf  wrote:

> 
> On 09.09.2013, at 09:28, Michael Neuling wrote:
> 
> >> At present, PR KVM and BookE KVM does multiple copies of FP and
> >> related state because of the way that they use the arrays in the
> >> thread_struct as an intermediate staging post for the state.  They do
> >> this so that they can use the existing system functions for loading
> >> and saving state, and so that they can keep guest state in the CPU
> >> registers while executing general kernel code.
> >> 
> >> This patch series reorganizes things so that KVM and the main kernel
> >> use the same representation for FP/VMX/VSX state, and so that guest
> >> state can be loaded/save directly from/to the vcpu struct instead of
> >> having to go via the thread_struct.  This simplifies things and should
> >> be a little faster.
> >> 
> >> This series is against Alex Graf's kvm-ppc-queue branch plus my recent
> >> series of 23 patches to make PR and HV KVM coexist.
> > 
> > This is great!
> > 
> > Alex, can you pull this into your tree?  
> 
> I never apply RFC patches if I can avoid it. Paul, if you think
> they're ready for inclusion, please repost them as actual patches.

Arrh, good point.  I'll talk to paulus about reposting them.  

Your kvm-ppc-queue branch on github seems to be based on 3.11-rc1. Is
that the tree we should be aiming for currently?

Mikey


[PATCH 1/2] KVM: PPC: Book3S HV: Make TM avoid program check

2014-03-27 Thread Michael Neuling
Currently, using kvmppc_set_one_reg(), a transaction could be set up with
the TEXASR Failure Summary (FS) bit not set.  When this is switched back
in by the host, it will result in a TM Bad Thing (ie a 0x700 program
check) when the trechkpt is run.

Avoid this by always setting the TEXASR FS when there is an active
transaction being started.

This patch is on top of Paulus' recent KVM TM patch set.

Signed-off-by: Michael Neuling 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 217a22e..01d5701 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -639,6 +639,14 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
rldicl. r5, r5, 64 - MSR_TS_S_LG, 62
beq 1f  /* TM not active in guest */
 
+   /* Make sure the failure summary is set, otherwise we'll program check
+* when we trechkpt.  It's possible that this might have been not set
+* on a kvmppc_set_one_reg() call but we shouldn't let this crash the
+* host.
+*/
+   orisr7, r7, (TEXASR_FS)@h
+   mtspr   SPRN_TEXASR, r7
+
/*
 * We need to load up the checkpointed state for the guest.
 * We need to do this early as it will blow away any GPRs, VSRs and
-- 
1.8.3.2



[PATCH 2/2] KVM: PPC: Book3S HV: Add branch label

2014-03-27 Thread Michael Neuling
This branch label is over a large section so let's give it a real name.

Signed-off-by: Michael Neuling 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 01d5701..832750d 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -612,7 +612,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 BEGIN_FTR_SECTION
-   b   1f
+   b   skip_tm
 END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 
/* Turn on TM/FP/VSX/VMX so we can restore them. */
@@ -637,7 +637,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
 
ld  r5, VCPU_MSR(r4)
rldicl. r5, r5, 64 - MSR_TS_S_LG, 62
-   beq 1f  /* TM not active in guest */
+   beq skip_tm /* TM not active in guest */
 
/* Make sure the failure summary is set, otherwise we'll program check
 * when we trechkpt.  It's possible that this might have been not set
@@ -717,7 +717,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_TM)
/* Set the MSR RI since we have our registers back. */
li  r5, MSR_RI
mtmsrd  r5, 1
-1:
+skip_tm:
 #endif
 
/* Load guest PMU registers */
-- 
1.8.3.2



[PATCH 0/6] Implement split core for POWER8

2014-04-23 Thread Michael Neuling
This patch series implements split core mode on POWER8.  This enables up to 4
subcores per core which can each independently run guests (per guest SPRs like
SDR1, LPIDR etc are replicated per subcore).  Lots more documentation on this
feature in the code and commit messages.

Most of this code is in the powernv platform but there's a couple of KVM
specific patches too.

Alex: If you're happy with the KVM patches, please ACK them and benh can hold
this series.

Patch series authored by mpe and me with a few bug fixes from others.


[PATCH 4/6] powerpc: Check cpu_thread_in_subcore() in __cpu_up()

2014-04-23 Thread Michael Neuling
From: Michael Ellerman 

To support split core we need to change the check in __cpu_up() that
determines if a cpu is allowed to come online.

Currently we refuse to online cpus which are not the primary thread
within their core.

On POWER8 with split core support this check needs to instead refuse to
online cpus which are not the primary thread within their *sub* core.

On POWER7 and other systems that do not support split core,
threads_per_subcore == threads_per_core and so the check is equivalent.
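
For clarity, the helper's semantics are (a sketch; see asm/cputhreads.h
for the real definition):

	/* nonzero iff cpu is not the first thread of its subcore */
	static inline int cpu_thread_in_subcore(int cpu)
	{
		return cpu & (threads_per_subcore - 1);
	}

so the hunk below refuses to online any thread that is not its subcore's
primary.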

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/kernel/smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6edae3d..b5222c4 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -489,7 +489,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
 * Don't allow secondary threads to come online if inhibited
 */
if (threads_per_core > 1 && secondaries_inhibited() &&
-   cpu % threads_per_core != 0)
+   cpu_thread_in_subcore(cpu))
return -EBUSY;
 
if (smp_ops == NULL ||
-- 
1.8.3.2



[PATCH 1/6] KVM: PPC: Book3S HV: Rework the secondary inhibit code

2014-04-23 Thread Michael Neuling
From: Michael Ellerman 

As part of the support for split core on POWER8, we want to be able to
block splitting of the core while KVM VMs are active.

The logic to do that would be exactly the same as the code we currently
have for inhibiting onlining of secondaries.

Instead of adding an identical mechanism to block split core, rework the
secondary inhibit code to be a "HV KVM is active" check. We can then use
that in both the cpu hotplug code and the upcoming split core code.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/kvm_ppc.h   |  7 +++
 arch/powerpc/include/asm/smp.h   |  8 
 arch/powerpc/kernel/smp.c| 34 +++---
 arch/powerpc/kvm/book3s_hv.c |  8 
 arch/powerpc/kvm/book3s_hv_builtin.c | 31 +++
 5 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 4096f16..2c8e399 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -337,6 +337,10 @@ static inline void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu)
vcpu->kvm->arch.kvm_ops->fast_vcpu_kick(vcpu);
 }
 
+extern void kvm_hv_vm_activated(void);
+extern void kvm_hv_vm_deactivated(void);
+extern bool kvm_hv_mode_active(void);
+
 #else
 static inline void __init kvm_cma_reserve(void)
 {}
@@ -356,6 +360,9 @@ static inline void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu)
 {
kvm_vcpu_kick(vcpu);
 }
+
+static inline bool kvm_hv_mode_active(void){ return false; }
+
 #endif
 
 #ifdef CONFIG_KVM_XICS
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index ff51046..5a6614a 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -68,14 +68,6 @@ void generic_mach_cpu_die(void);
 void generic_set_cpu_dead(unsigned int cpu);
 void generic_set_cpu_up(unsigned int cpu);
 int generic_check_cpu_restart(unsigned int cpu);
-
-extern void inhibit_secondary_onlining(void);
-extern void uninhibit_secondary_onlining(void);
-
-#else /* HOTPLUG_CPU */
-static inline void inhibit_secondary_onlining(void) {}
-static inline void uninhibit_secondary_onlining(void) {}
-
 #endif
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index e2a4232..6edae3d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include <asm/kvm_ppc.h>
 #include 
 #include 
 #include 
@@ -457,38 +458,9 @@ int generic_check_cpu_restart(unsigned int cpu)
return per_cpu(cpu_state, cpu) == CPU_UP_PREPARE;
 }
 
-static atomic_t secondary_inhibit_count;
-
-/*
- * Don't allow secondary CPU threads to come online
- */
-void inhibit_secondary_onlining(void)
-{
-   /*
-* This makes secondary_inhibit_count stable during cpu
-* online/offline operations.
-*/
-   get_online_cpus();
-
-   atomic_inc(&secondary_inhibit_count);
-   put_online_cpus();
-}
-EXPORT_SYMBOL_GPL(inhibit_secondary_onlining);
-
-/*
- * Allow secondary CPU threads to come online again
- */
-void uninhibit_secondary_onlining(void)
-{
-   get_online_cpus();
-   atomic_dec(&secondary_inhibit_count);
-   put_online_cpus();
-}
-EXPORT_SYMBOL_GPL(uninhibit_secondary_onlining);
-
-static int secondaries_inhibited(void)
+static bool secondaries_inhibited(void)
 {
-   return atomic_read(&secondary_inhibit_count);
+   return kvm_hv_mode_active();
 }
 
 #else /* HOTPLUG_CPU */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 8227dba..d7b74f8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2317,10 +2317,10 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
spin_lock_init(&kvm->arch.slot_phys_lock);
 
/*
-* Don't allow secondary CPU threads to come online
-* while any KVM VMs exist.
+* Track that we now have a HV mode VM active. This blocks secondary
+* CPU threads from coming online.
 */
-   inhibit_secondary_onlining();
+   kvm_hv_vm_activated();
 
return 0;
 }
@@ -2336,7 +2336,7 @@ static void kvmppc_free_vcores(struct kvm *kvm)
 
 static void kvmppc_core_destroy_vm_hv(struct kvm *kvm)
 {
-   uninhibit_secondary_onlining();
+   kvm_hv_vm_deactivated();
 
kvmppc_free_vcores(kvm);
if (kvm->arch.rma) {
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 8cd0dae..7cde8a6 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -6,6 +6,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/cpu.h>
 #include 
 #include 
 #include 
@@ -181,3 +182,33 @@ void __init kvm_cma_reserve(void)
kvm_cma_declare_contiguous(selected_size, align_size);

[PATCH 6/6] powerpc/powernv: Add support for POWER8 split core on powernv

2014-04-23 Thread Michael Neuling
From: Michael Ellerman 

Upcoming POWER8 chips support a concept called split core. This is where
the core can be split into subcores that although not full cores, are
able to appear as full cores to a guest.

The splitting & unsplitting procedure is mildly complicated, and
explained at length in the comments within the patch.

One notable detail is that when splitting or unsplitting we need to pull
offline cpus out of their offline state to do work as part of the
procedure.

The interface for changing the split mode is via a sysfs file, eg:

 $ echo 2 > /sys/devices/system/cpu/subcores_per_core

Currently supported values are '1', '2' and '4'. And indicate
respectively that the core should be unsplit, split in half, and split
in quarters. These modes correspond to threads_per_subcore of 8, 4 and
2.

We do not allow changing the split mode while KVM VMs are active. This
is to prevent the value changing while userspace is configuring the VM,
and also to prevent the mode being changed in such a way that existing
guests are unable to be run.

CPU hotplug fixes by Srivatsa.  max_cpus fixes by Mahesh.  cpuset fixes by
benh.  The rest by mikey and mpe.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
Signed-off-by: Srivatsa S. Bhat 
Signed-off-by: Mahesh Salgaonkar 
Signed-off-by: Benjamin Herrenschmidt 
---
 arch/powerpc/include/asm/reg.h   |   9 +
 arch/powerpc/platforms/powernv/Makefile  |   2 +-
 arch/powerpc/platforms/powernv/powernv.h |   2 +
 arch/powerpc/platforms/powernv/smp.c |   8 +
 arch/powerpc/platforms/powernv/subcore-asm.S |  95 +++
 arch/powerpc/platforms/powernv/subcore.c | 392 +++
 arch/powerpc/platforms/powernv/subcore.h |  18 ++
 7 files changed, 525 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/powernv/subcore-asm.S
 create mode 100644 arch/powerpc/platforms/powernv/subcore.c
 create mode 100644 arch/powerpc/platforms/powernv/subcore.h

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index e5d2e0b..154c419 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -224,6 +224,7 @@
 #define   CTRL_TE  0x00c0  /* thread enable */
#define   CTRL_RUNLATCH	0x1
 #define SPRN_DAWR  0xB4
+#define SPRN_RPR   0xBA/* Relative Priority Register */
 #define SPRN_CIABR 0xBB
 #define   CIABR_PRIV   0x3
 #define   CIABR_PRIV_USER  1
@@ -272,8 +273,10 @@
 #define SPRN_HSRR1 0x13B   /* Hypervisor Save/Restore 1 */
 #define SPRN_IC0x350   /* Virtual Instruction Count */
 #define SPRN_VTB   0x351   /* Virtual Time Base */
+#define SPRN_LDBAR 0x352   /* LD Base Address Register */
 #define SPRN_PMICR 0x354   /* Power Management Idle Control Reg */
 #define SPRN_PMSR  0x355   /* Power Management Status Reg */
+#define SPRN_PMMAR 0x356   /* Power Management Memory Activity Register */
 #define SPRN_PMCR  0x374   /* Power Management Control Register */
 
 /* HFSCR and FSCR bit numbers are the same */
@@ -433,6 +436,12 @@
 #define HID0_BTCD  (1<<1)  /* Branch target cache disable */
 #define HID0_NOPDST(1<<1)  /* No-op dst, dstt, etc. instr. */
 #define HID0_NOPTI (1<<0)  /* No-op dcbt and dcbst instr. */
+/* POWER8 HID0 bits */
+#define HID0_POWER8_4LPARMODE  __MASK(61)
+#define HID0_POWER8_2LPARMODE  __MASK(57)
+#define HID0_POWER8_1TO2LPAR   __MASK(52)
+#define HID0_POWER8_1TO4LPAR   __MASK(51)
+#define HID0_POWER8_DYNLPARDIS __MASK(48)
 
 #define SPRN_HID1  0x3F1   /* Hardware Implementation Register 1 */
 #ifdef CONFIG_6xx
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index 63cebb9..4ad0d34 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -1,7 +1,7 @@
 obj-y  += setup.o opal-takeover.o opal-wrappers.o opal.o opal-async.o
 obj-y  += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y  += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
-obj-y  += opal-msglog.o
+obj-y  += opal-msglog.o subcore.o subcore-asm.o
 
 obj-$(CONFIG_SMP)  += smp.o
 obj-$(CONFIG_PCI)  += pci.o pci-p5ioc2.o pci-ioda.o
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index 0051e10..75501bf 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -25,4 +25,6 @@ static inline int pnv_pci_dma_set_mask(struct pci_dev *pdev, u64 dma_mask)
 
 extern void pnv_lpc_init(void);
 
+bool cpu_core_split_required(void);
+
 #endif /* _POWERNV_H */
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index ab8b4ad..7774f85 100644
--- a/arch/powerpc/platforms/powernv/smp

[PATCH 5/6] KVM: PPC: Book3S HV: Use threads_per_subcore in KVM

2014-04-23 Thread Michael Neuling
From: Michael Ellerman 

To support split core on POWER8 we need to modify various parts of the
KVM code to use threads_per_subcore instead of threads_per_core. On
systems that do not support split core threads_per_subcore ==
threads_per_core and these changes are a nop.

We use threads_per_subcore as the value reported by KVM_CAP_PPC_SMT.
This communicates to userspace that guests can only be created with
a value of threads_per_core that is less than or equal to the current
threads_per_subcore. This ensures that guests can only be created with a
thread configuration that we are able to run given the current split
core mode.
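
For illustration, userspace can query the reported limit with the
standard KVM_CHECK_EXTENSION ioctl. A minimal sketch (not part of the
patch):

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	int main(void)
	{
		int kvm = open("/dev/kvm", O_RDWR);
		if (kvm < 0)
			return 1;

		/* With HV KVM enabled this now reports threads_per_subcore */
		int smt = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_PPC_SMT);
		printf("max guest threads per (sub)core: %d\n", smt);
		return 0;
	}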

Although threads_per_subcore can change during the life of the system,
the commit that enables that will ensure that threads_per_subcore does
not change during the life of a KVM VM.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/kvm/book3s_hv.c | 26 --
 arch/powerpc/kvm/powerpc.c   |  2 +-
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d7b74f8..5e86f28 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1266,7 +1266,7 @@ static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm,
int core;
struct kvmppc_vcore *vcore;
 
-   core = id / threads_per_core;
+   core = id / threads_per_subcore;
if (core >= KVM_MAX_VCORES)
goto out;
 
@@ -1305,7 +1305,7 @@ static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm,
init_waitqueue_head(&vcore->wq);
vcore->preempt_tb = TB_NIL;
vcore->lpcr = kvm->arch.lpcr;
-   vcore->first_vcpuid = core * threads_per_core;
+   vcore->first_vcpuid = core * threads_per_subcore;
vcore->kvm = kvm;
}
kvm->arch.vcores[core] = vcore;
@@ -1495,16 +1495,19 @@ static void kvmppc_wait_for_nap(struct kvmppc_vcore *vc)
 static int on_primary_thread(void)
 {
int cpu = smp_processor_id();
-   int thr = cpu_thread_in_core(cpu);
+   int thr;
 
-   if (thr)
+   /* Are we on a primary subcore? */
+   if (cpu_thread_in_subcore(cpu))
return 0;
-   while (++thr < threads_per_core)
+
+   thr = 0;
+   while (++thr < threads_per_subcore)
if (cpu_online(cpu + thr))
return 0;
 
/* Grab all hw threads so they can't go into the kernel */
-   for (thr = 1; thr < threads_per_core; ++thr) {
+   for (thr = 1; thr < threads_per_subcore; ++thr) {
if (kvmppc_grab_hwthread(cpu + thr)) {
/* Couldn't grab one; let the others go */
do {
@@ -1563,15 +1566,18 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
}
 
/*
-* Make sure we are running on thread 0, and that
-* secondary threads are offline.
+* Make sure we are running on primary threads, and that secondary
+* threads are offline.  Also check if the number of threads in this
+* guest are greater than the current system threads per guest.
 */
-   if (threads_per_core > 1 && !on_primary_thread()) {
+   if ((threads_per_core > 1) &&
+   ((vc->num_threads > threads_per_subcore) || !on_primary_thread())) {
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
vcpu->arch.ret = -EBUSY;
goto out;
}
 
+
vc->pcpu = smp_processor_id();
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
kvmppc_start_thread(vcpu);
@@ -1599,7 +1605,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
/* wait for secondary threads to finish writing their state to memory */
if (vc->nap_count < vc->n_woken)
kvmppc_wait_for_nap(vc);
-   for (i = 0; i < threads_per_core; ++i)
+   for (i = 0; i < threads_per_subcore; ++i)
kvmppc_release_hwthread(vc->pcpu + i);
/* prevent other vcpu threads from doing kvmppc_start_thread() now */
vc->vcore_state = VCORE_EXITING;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cf541a..27919a8 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -384,7 +384,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
case KVM_CAP_PPC_SMT:
if (hv_enabled)
-   r = threads_per_core;
+   r = threads_per_subcore;
else
r = 0;
break;
-- 
1.8.3.2


[PATCH 3/6] powerpc: Add threads_per_subcore

2014-04-23 Thread Michael Neuling
From: Michael Ellerman 

On POWER8 we have a new concept of a subcore. This is what happens when
you take a regular core and split it. A subcore is a grouping of two or
four SMT threads, as well as a handful of SPRs, which allows the subcore
to appear as if it were a core from the point of view of a guest.

Unlike threads_per_core which is fixed at boot, threads_per_subcore can
change while the system is running. Most code will not want to use
threads_per_subcore.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/cputhreads.h | 7 +++
 arch/powerpc/kernel/setup-common.c| 4 +++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
index ac3eedb..2bf8e93 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -18,10 +18,12 @@
 
 #ifdef CONFIG_SMP
 extern int threads_per_core;
+extern int threads_per_subcore;
 extern int threads_shift;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core   1
+#define threads_per_subcore    1
 #define threads_shift  0
 #define threads_core_mask  (CPU_MASK_CPU0)
 #endif
@@ -74,6 +76,11 @@ static inline int cpu_thread_in_core(int cpu)
return cpu & (threads_per_core - 1);
 }
 
+static inline int cpu_thread_in_subcore(int cpu)
+{
+   return cpu & (threads_per_subcore - 1);
+}
+
 static inline int cpu_first_thread_sibling(int cpu)
 {
return cpu & ~(threads_per_core - 1);
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 79b7612..f9faf1e 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -382,9 +382,10 @@ void __init check_for_initrd(void)
 
 #ifdef CONFIG_SMP
 
-int threads_per_core, threads_shift;
+int threads_per_core, threads_per_subcore, threads_shift;
 cpumask_t threads_core_mask;
 EXPORT_SYMBOL_GPL(threads_per_core);
+EXPORT_SYMBOL_GPL(threads_per_subcore);
 EXPORT_SYMBOL_GPL(threads_shift);
 EXPORT_SYMBOL_GPL(threads_core_mask);
 
@@ -393,6 +394,7 @@ static void __init cpu_init_thread_core_maps(int tpc)
int i;
 
threads_per_core = tpc;
+   threads_per_subcore = tpc;
cpumask_clear(&threads_core_mask);
 
/* This implementation only supports power of 2 number of threads
-- 
1.8.3.2



[PATCH 2/6] powerpc/powernv: Make it possible to skip the IRQHAPPENED check in power7_nap()

2014-04-23 Thread Michael Neuling
From: Michael Ellerman 

To support split core we need to be able to force all secondaries into
nap, so the core can detect they are idle and do an unsplit.

Currently power7_nap() will return without napping if there is an irq
pending. We want to ignore the pending irq and nap anyway; we will deal
with the interrupt later.
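
With this change the caller chooses the behaviour explicitly via the
new check_irq argument (see the prototype change below), e.g.:

	power7_nap(1);	/* return without napping if an irq is pending */
	power7_nap(0);	/* nap regardless; the irq is handled after wakeup */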

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/processor.h | 2 +-
 arch/powerpc/kernel/idle_power7.S| 9 +
 arch/powerpc/platforms/powernv/smp.c | 2 +-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index d660dc3..6d59072 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -449,7 +449,7 @@ extern unsigned long cpuidle_disable;
 enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
 
 extern int powersave_nap;  /* set if nap mode can be used in idle loop */
-extern void power7_nap(void);
+extern void power7_nap(int check_irq);
 extern void power7_sleep(void);
 extern void flush_instruction_cache(void);
 extern void hard_reset_now(void);
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index c3ab869..063c7cb 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -39,6 +39,10 @@
  * Pass requested state in r3:
  * 0 - nap
  * 1 - sleep
+ *
+ * To check IRQ_HAPPENED in r4
+ * 0 - don't check
+ * 1 - check
  */
 _GLOBAL(power7_powersave_common)
/* Use r3 to pass state nap/sleep/winkle */
@@ -71,6 +75,8 @@ _GLOBAL(power7_powersave_common)
lbz r0,PACAIRQHAPPENED(r13)
cmpwi   cr0,r0,0
beq 1f
+   cmpwi   cr0,r4,0
+   beq 1f
	addi	r1,r1,INT_FRAME_SIZE
ld  r0,16(r1)
	mtlr	r0
@@ -115,14 +121,17 @@ _GLOBAL(power7_idle)
cmpwi   0,r4,0
beqlr
/* fall through */
+   li  r3, 1
 
 _GLOBAL(power7_nap)
+   mr  r4,r3
li  r3,0
b   power7_powersave_common
/* No return */
 
 _GLOBAL(power7_sleep)
li  r3,1
+   li  r4,0
b   power7_powersave_common
/* No return */
 
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 908672b..ab8b4ad 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -156,7 +156,7 @@ static void pnv_smp_cpu_kill_self(void)
 */
mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
while (!generic_check_cpu_restart(cpu)) {
-   power7_nap();
+   power7_nap(1);
if (!generic_check_cpu_restart(cpu)) {
DBG("CPU%d Unexpected exit while offline !\n", cpu);
/* We may be getting an IPI, so we re-enable
-- 
1.8.3.2



Re: [PATCH 6/6] powerpc/powernv: Add support for POWER8 split core on powernv

2014-04-23 Thread Michael Neuling
Joel Stanley  wrote:

> Hi Mikey,
> 
> On Thu, Apr 24, 2014 at 11:02 AM, Michael Neuling  wrote:
> > +static DEVICE_ATTR(subcores_per_core, 0600,
> > +   show_subcores_per_core, store_subcores_per_core);
> 
> Can we make this 644, so users can query the state of the system
> without being root? This is useful for tools like ppc64_cpu --info.

Good point... I'll update.

Thanks,
Mikey


Re: [PATCH 0/6] Implement split core for POWER8

2014-04-29 Thread Michael Neuling
> This patch series implements split core mode on POWER8.  This enables up to 4
> subcores per core which can each independently run guests (per guest SPRs like
> SDR1, LPIDR etc are replicated per subcore).  Lots more documentation on this
> feature in the code and commit messages.
> 
> Most of this code is in the powernv platform but there's a couple of KVM
> specific patches too.
> 
> Alex: If you're happy with the KVM patches, please ACK them and benh can hold
> this series.

Alex,

Any chance we can get an ACK on these two KVM patches so benh can put
this series in his next branch?

Mikey


Re: [PATCH 2/6] KVM: PPC: Book3S PR: Emulate TIR register

2014-04-29 Thread Michael Neuling
> In parallel to the Processor ID Register (PIR) threaded POWER8 also adds a
> Thread ID Register (TID). Since PR KVM doesn't emulate more than one thread

s/TID/TIR/ above

> per core, we can just always expose 0 here.

I'm not sure if we ever do, but if we IPI ourselves using a doorbell,
we'll need to emulate the doorbell as well.

Mikey

> Signed-off-by: Alexander Graf 
> ---
>  arch/powerpc/kvm/book3s_emulate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
> index 914beb2..e4e54fb 100644
> --- a/arch/powerpc/kvm/book3s_emulate.c
> +++ b/arch/powerpc/kvm/book3s_emulate.c
> @@ -563,6 +563,7 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val
>   case SPRN_MMCR0:
>   case SPRN_MMCR1:
>   case SPRN_MMCR2:
> + case SPRN_TIR:
>   *spr_val = 0;
>   break;
>   default:
> -- 
> 1.8.1.4
> 


[PATCH v2 0/6] Implement split core for POWER8

2014-05-23 Thread Michael Neuling
This patch series implements split core mode on POWER8.  This enables up to 4
subcores per core which can each independently run guests (per guest SPRs like
SDR1, LPIDR etc are replicated per subcore).  Lots more documentation on this
feature in the code and commit messages.

Most of this code is in the powernv platform but there's a couple of KVM
specific patches too.

Patch series authored by mpe and me with a few bug fixes from others.

v2:
  There are some minor updates based on comments and I've added the Acks by
  Paulus and Alex for the KVM code.



[PATCH v2 4/6] powerpc: Check cpu_thread_in_subcore() in __cpu_up()

2014-05-23 Thread Michael Neuling
From: Michael Ellerman 

To support split core we need to change the check in __cpu_up() that
determines if a cpu is allowed to come online.

Currently we refuse to online cpus which are not the primary thread
within their core.

On POWER8 with split core support this check needs to instead refuse to
online cpus which are not the primary thread within their *sub* core.

On POWER7 and other systems that do not support split core,
threads_per_subcore == threads_per_core and so the check is equivalent.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/kernel/smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6edae3d..b5222c4 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -489,7 +489,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
 * Don't allow secondary threads to come online if inhibited
 */
if (threads_per_core > 1 && secondaries_inhibited() &&
-   cpu % threads_per_core != 0)
+   cpu_thread_in_subcore(cpu))
return -EBUSY;
 
if (smp_ops == NULL ||
-- 
1.9.1



[PATCH v2 2/6] powerpc/powernv: Make it possible to skip the IRQHAPPENED check in power7_nap()

2014-05-23 Thread Michael Neuling
From: Michael Ellerman 

To support split core we need to be able to force all secondaries into
nap, so the core can detect they are idle and do an unsplit.

Currently power7_nap() will return without napping if there is an irq
pending. We want to ignore the pending irq and nap anyway; we will deal
with the interrupt later.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/processor.h | 2 +-
 arch/powerpc/kernel/idle_power7.S| 9 +
 arch/powerpc/platforms/powernv/smp.c | 2 +-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index d660dc3..6d59072 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -449,7 +449,7 @@ extern unsigned long cpuidle_disable;
 enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
 
 extern int powersave_nap;  /* set if nap mode can be used in idle loop */
-extern void power7_nap(void);
+extern void power7_nap(int check_irq);
 extern void power7_sleep(void);
 extern void flush_instruction_cache(void);
 extern void hard_reset_now(void);
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index dca6e16..2480256 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -39,6 +39,10 @@
  * Pass requested state in r3:
  * 0 - nap
  * 1 - sleep
+ *
+ * To check IRQ_HAPPENED in r4
+ * 0 - don't check
+ * 1 - check
  */
 _GLOBAL(power7_powersave_common)
/* Use r3 to pass state nap/sleep/winkle */
@@ -71,6 +75,8 @@ _GLOBAL(power7_powersave_common)
lbz r0,PACAIRQHAPPENED(r13)
cmpwi   cr0,r0,0
beq 1f
+   cmpwi   cr0,r4,0
+   beq 1f
	addi	r1,r1,INT_FRAME_SIZE
ld  r0,16(r1)
	mtlr	r0
@@ -114,15 +120,18 @@ _GLOBAL(power7_idle)
lwz r4,ADDROFF(powersave_nap)(r3)
cmpwi   0,r4,0
beqlr
+   li  r3, 1
/* fall through */
 
 _GLOBAL(power7_nap)
+   mr  r4,r3
li  r3,0
b   power7_powersave_common
/* No return */
 
 _GLOBAL(power7_sleep)
li  r3,1
+   li  r4,0
b   power7_powersave_common
/* No return */
 
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 1601a1e..65faf99 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -159,7 +159,7 @@ static void pnv_smp_cpu_kill_self(void)
mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
while (!generic_check_cpu_restart(cpu)) {
ppc64_runlatch_off();
-   power7_nap();
+   power7_nap(1);
ppc64_runlatch_on();
if (!generic_check_cpu_restart(cpu)) {
DBG("CPU%d Unexpected exit while offline !\n", cpu);
-- 
1.9.1



[PATCH v2 1/6] KVM: PPC: Book3S HV: Rework the secondary inhibit code

2014-05-23 Thread Michael Neuling
From: Michael Ellerman 

As part of the support for split core on POWER8, we want to be able to
block splitting of the core while KVM VMs are active.

The logic to do that would be exactly the same as the code we currently
have for inhibiting onlining of secondaries.

Instead of adding an identical mechanism to block split core, rework the
secondary inhibit code to be a "HV KVM is active" check. We can then use
that in both the cpu hotplug code and the upcoming split core code.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
Acked-by: Alexander Graf 
Acked-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_ppc.h   |  7 +++
 arch/powerpc/include/asm/smp.h   |  8 
 arch/powerpc/kernel/smp.c| 34 +++---
 arch/powerpc/kvm/book3s_hv.c |  8 
 arch/powerpc/kvm/book3s_hv_builtin.c | 31 +++
 5 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 4096f16..2c8e399 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -337,6 +337,10 @@ static inline void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu)
vcpu->kvm->arch.kvm_ops->fast_vcpu_kick(vcpu);
 }
 
+extern void kvm_hv_vm_activated(void);
+extern void kvm_hv_vm_deactivated(void);
+extern bool kvm_hv_mode_active(void);
+
 #else
 static inline void __init kvm_cma_reserve(void)
 {}
@@ -356,6 +360,9 @@ static inline void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu)
 {
kvm_vcpu_kick(vcpu);
 }
+
+static inline bool kvm_hv_mode_active(void) { return false; }
+
 #endif
 
 #ifdef CONFIG_KVM_XICS
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index ff51046..5a6614a 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -68,14 +68,6 @@ void generic_mach_cpu_die(void);
 void generic_set_cpu_dead(unsigned int cpu);
 void generic_set_cpu_up(unsigned int cpu);
 int generic_check_cpu_restart(unsigned int cpu);
-
-extern void inhibit_secondary_onlining(void);
-extern void uninhibit_secondary_onlining(void);
-
-#else /* HOTPLUG_CPU */
-static inline void inhibit_secondary_onlining(void) {}
-static inline void uninhibit_secondary_onlining(void) {}
-
 #endif
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index e2a4232..6edae3d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -457,38 +458,9 @@ int generic_check_cpu_restart(unsigned int cpu)
return per_cpu(cpu_state, cpu) == CPU_UP_PREPARE;
 }
 
-static atomic_t secondary_inhibit_count;
-
-/*
- * Don't allow secondary CPU threads to come online
- */
-void inhibit_secondary_onlining(void)
-{
-   /*
-* This makes secondary_inhibit_count stable during cpu
-* online/offline operations.
-*/
-   get_online_cpus();
-
-   atomic_inc(&secondary_inhibit_count);
-   put_online_cpus();
-}
-EXPORT_SYMBOL_GPL(inhibit_secondary_onlining);
-
-/*
- * Allow secondary CPU threads to come online again
- */
-void uninhibit_secondary_onlining(void)
-{
-   get_online_cpus();
-   atomic_dec(&secondary_inhibit_count);
-   put_online_cpus();
-}
-EXPORT_SYMBOL_GPL(uninhibit_secondary_onlining);
-
-static int secondaries_inhibited(void)
+static bool secondaries_inhibited(void)
 {
-   return atomic_read(&secondary_inhibit_count);
+   return kvm_hv_mode_active();
 }
 
 #else /* HOTPLUG_CPU */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 8227dba..d7b74f8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2317,10 +2317,10 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
spin_lock_init(&kvm->arch.slot_phys_lock);
 
/*
-* Don't allow secondary CPU threads to come online
-* while any KVM VMs exist.
+* Track that we now have a HV mode VM active. This blocks secondary
+* CPU threads from coming online.
 */
-   inhibit_secondary_onlining();
+   kvm_hv_vm_activated();
 
return 0;
 }
@@ -2336,7 +2336,7 @@ static void kvmppc_free_vcores(struct kvm *kvm)
 
 static void kvmppc_core_destroy_vm_hv(struct kvm *kvm)
 {
-   uninhibit_secondary_onlining();
+   kvm_hv_vm_deactivated();
 
kvmppc_free_vcores(kvm);
if (kvm->arch.rma) {
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 8cd0dae..7cde8a6 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -6,6 +6,7 @@
  * published by the Free Software Foundation.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -181,3 +182,33 @@ void __init kvm_cma_reserve(void)
k
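
(The hunk above is truncated in the archive. Judging from this commit
message and the inhibit code removed from smp.c, the builtin side is
presumably a simple "HV VM active" counter. A sketch, not the verbatim
patch:)

	static atomic_t hv_vm_count;

	void kvm_hv_vm_activated(void)
	{
		/* hold the hotplug lock so the count is stable vs cpu online */
		get_online_cpus();
		atomic_inc(&hv_vm_count);
		put_online_cpus();
	}
	EXPORT_SYMBOL_GPL(kvm_hv_vm_activated);

	void kvm_hv_vm_deactivated(void)
	{
		get_online_cpus();
		atomic_dec(&hv_vm_count);
		put_online_cpus();
	}
	EXPORT_SYMBOL_GPL(kvm_hv_vm_deactivated);

	bool kvm_hv_mode_active(void)
	{
		return atomic_read(&hv_vm_count) != 0;
	}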

[PATCH v2 5/6] KVM: PPC: Book3S HV: Use threads_per_subcore in KVM

2014-05-23 Thread Michael Neuling
From: Michael Ellerman 

To support split core on POWER8 we need to modify various parts of the
KVM code to use threads_per_subcore instead of threads_per_core. On
systems that do not support split core threads_per_subcore ==
threads_per_core and these changes are a nop.

We use threads_per_subcore as the value reported by KVM_CAP_PPC_SMT.
This communicates to userspace that guests can only be created with
a value of threads_per_core that is less than or equal to the current
threads_per_subcore. This ensures that guests can only be created with a
thread configuration that we are able to run given the current split
core mode.

Although threads_per_subcore can change during the life of the system,
the commit that enables that will ensure that threads_per_subcore does
not change during the life of a KVM VM.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
Acked-by: Alexander Graf 
Acked-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv.c | 26 --
 arch/powerpc/kvm/powerpc.c   |  2 +-
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d7b74f8..5e86f28 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1266,7 +1266,7 @@ static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm,
int core;
struct kvmppc_vcore *vcore;
 
-   core = id / threads_per_core;
+   core = id / threads_per_subcore;
if (core >= KVM_MAX_VCORES)
goto out;
 
@@ -1305,7 +1305,7 @@ static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm,
init_waitqueue_head(&vcore->wq);
vcore->preempt_tb = TB_NIL;
vcore->lpcr = kvm->arch.lpcr;
-   vcore->first_vcpuid = core * threads_per_core;
+   vcore->first_vcpuid = core * threads_per_subcore;
vcore->kvm = kvm;
}
kvm->arch.vcores[core] = vcore;
@@ -1495,16 +1495,19 @@ static void kvmppc_wait_for_nap(struct kvmppc_vcore *vc)
 static int on_primary_thread(void)
 {
int cpu = smp_processor_id();
-   int thr = cpu_thread_in_core(cpu);
+   int thr;
 
-   if (thr)
+   /* Are we on a primary subcore? */
+   if (cpu_thread_in_subcore(cpu))
return 0;
-   while (++thr < threads_per_core)
+
+   thr = 0;
+   while (++thr < threads_per_subcore)
if (cpu_online(cpu + thr))
return 0;
 
/* Grab all hw threads so they can't go into the kernel */
-   for (thr = 1; thr < threads_per_core; ++thr) {
+   for (thr = 1; thr < threads_per_subcore; ++thr) {
if (kvmppc_grab_hwthread(cpu + thr)) {
/* Couldn't grab one; let the others go */
do {
@@ -1563,15 +1566,18 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
}
 
/*
-* Make sure we are running on thread 0, and that
-* secondary threads are offline.
+* Make sure we are running on primary threads, and that secondary
+* threads are offline.  Also check if the number of threads in this
+* guest are greater than the current system threads per guest.
 */
-   if (threads_per_core > 1 && !on_primary_thread()) {
+   if ((threads_per_core > 1) &&
+   ((vc->num_threads > threads_per_subcore) || !on_primary_thread())) {
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
vcpu->arch.ret = -EBUSY;
goto out;
}
 
+
vc->pcpu = smp_processor_id();
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
kvmppc_start_thread(vcpu);
@@ -1599,7 +1605,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
/* wait for secondary threads to finish writing their state to memory */
if (vc->nap_count < vc->n_woken)
kvmppc_wait_for_nap(vc);
-   for (i = 0; i < threads_per_core; ++i)
+   for (i = 0; i < threads_per_subcore; ++i)
kvmppc_release_hwthread(vc->pcpu + i);
/* prevent other vcpu threads from doing kvmppc_start_thread() now */
vc->vcore_state = VCORE_EXITING;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 3cf541a..27919a8 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -384,7 +384,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
case KVM_CAP_PPC_SMT:
if (hv_enabled)
-   r = threads_per_core;
+   r = threads_per_subcore;
else
r = 0;

[PATCH v2 3/6] powerpc: Add threads_per_subcore

2014-05-23 Thread Michael Neuling
From: Michael Ellerman 

On POWER8 we have a new concept of a subcore. This is what happens when
you take a regular core and split it. A subcore is a grouping of two or
four SMT threads, as well as a handful of SPRs, which allows the subcore
to appear as if it were a core from the point of view of a guest.

Unlike threads_per_core which is fixed at boot, threads_per_subcore can
change while the system is running. Most code will not want to use
threads_per_subcore.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
---
 arch/powerpc/include/asm/cputhreads.h | 7 +++
 arch/powerpc/kernel/setup-common.c| 4 +++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
index ac3eedb..2bf8e93 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -18,10 +18,12 @@
 
 #ifdef CONFIG_SMP
 extern int threads_per_core;
+extern int threads_per_subcore;
 extern int threads_shift;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core   1
+#define threads_per_subcore    1
 #define threads_shift  0
 #define threads_core_mask  (CPU_MASK_CPU0)
 #endif
@@ -74,6 +76,11 @@ static inline int cpu_thread_in_core(int cpu)
return cpu & (threads_per_core - 1);
 }
 
+static inline int cpu_thread_in_subcore(int cpu)
+{
+   return cpu & (threads_per_subcore - 1);
+}
+
 static inline int cpu_first_thread_sibling(int cpu)
 {
return cpu & ~(threads_per_core - 1);
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 3cf25c8..aa0f5ed 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -390,9 +390,10 @@ void __init check_for_initrd(void)
 
 #ifdef CONFIG_SMP
 
-int threads_per_core, threads_shift;
+int threads_per_core, threads_per_subcore, threads_shift;
 cpumask_t threads_core_mask;
 EXPORT_SYMBOL_GPL(threads_per_core);
+EXPORT_SYMBOL_GPL(threads_per_subcore);
 EXPORT_SYMBOL_GPL(threads_shift);
 EXPORT_SYMBOL_GPL(threads_core_mask);
 
@@ -401,6 +402,7 @@ static void __init cpu_init_thread_core_maps(int tpc)
int i;
 
threads_per_core = tpc;
+   threads_per_subcore = tpc;
cpumask_clear(&threads_core_mask);
 
/* This implementation only supports power of 2 number of threads
-- 
1.9.1



[PATCH v2 6/6] powerpc/powernv: Add support for POWER8 split core on powernv

2014-05-23 Thread Michael Neuling
From: Michael Ellerman 

Upcoming POWER8 chips support a concept called split core. This is where the
core can be split into subcores that, although not full cores, are able to
appear as full cores to a guest.

The splitting & unsplitting procedure is mildly complicated, and explained at
length in the comments within the patch.

One notable detail is that when splitting or unsplitting we need to pull
offline cpus out of their offline state to do work as part of the procedure.

The interface for changing the split mode is via a sysfs file, eg:

 $ echo 2 > /sys/devices/system/cpu/subcores_per_core

Currently supported values are '1', '2' and '4', indicating respectively that
the core should be unsplit, split in half, or split in quarters. These modes
correspond to threads_per_subcore of 8, 4 and 2.

We do not allow changing the split mode while KVM VMs are active. This is to
prevent the value changing while userspace is configuring the VM, and also to
prevent the mode being changed in such a way that existing guests are unable to
be run.

CPU hotplug fixes by Srivatsa.  max_cpus fixes by Mahesh.  cpuset fixes by
benh.  Fix for irq race by paulus.  The rest by mikey and mpe.

Signed-off-by: Michael Ellerman 
Signed-off-by: Michael Neuling 
Signed-off-by: Srivatsa S. Bhat 
Signed-off-by: Mahesh Salgaonkar 
Signed-off-by: Benjamin Herrenschmidt 
---
 arch/powerpc/include/asm/reg.h   |   9 +
 arch/powerpc/platforms/powernv/Makefile  |   2 +-
 arch/powerpc/platforms/powernv/powernv.h |   2 +
 arch/powerpc/platforms/powernv/smp.c |  18 +-
 arch/powerpc/platforms/powernv/subcore-asm.S |  95 +++
 arch/powerpc/platforms/powernv/subcore.c | 392 +++
 arch/powerpc/platforms/powernv/subcore.h |  18 ++
 7 files changed, 527 insertions(+), 9 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/subcore-asm.S
 create mode 100644 arch/powerpc/platforms/powernv/subcore.c
 create mode 100644 arch/powerpc/platforms/powernv/subcore.h

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 29de015..2cd799b 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -225,6 +225,7 @@
 #define   CTRL_TE  0x00c0  /* thread enable */
 #define   CTRL_RUNLATCH   0x1
 #define SPRN_DAWR  0xB4
+#define SPRN_RPR   0xBA    /* Relative Priority Register */
 #define SPRN_CIABR 0xBB
 #define   CIABR_PRIV   0x3
 #define   CIABR_PRIV_USER  1
@@ -273,8 +274,10 @@
 #define SPRN_HSRR1 0x13B   /* Hypervisor Save/Restore 1 */
 #define SPRN_IC    0x350   /* Virtual Instruction Count */
 #define SPRN_VTB   0x351   /* Virtual Time Base */
+#define SPRN_LDBAR 0x352   /* LD Base Address Register */
 #define SPRN_PMICR 0x354   /* Power Management Idle Control Reg */
 #define SPRN_PMSR  0x355   /* Power Management Status Reg */
+#define SPRN_PMMAR 0x356   /* Power Management Memory Activity Register */
 #define SPRN_PMCR  0x374   /* Power Management Control Register */
 
 /* HFSCR and FSCR bit numbers are the same */
@@ -434,6 +437,12 @@
 #define HID0_BTCD  (1<<1)  /* Branch target cache disable */
 #define HID0_NOPDST (1<<1)  /* No-op dst, dstt, etc. instr. */
 #define HID0_NOPTI (1<<0)  /* No-op dcbt and dcbst instr. */
+/* POWER8 HID0 bits */
+#define HID0_POWER8_4LPARMODE  __MASK(61)
+#define HID0_POWER8_2LPARMODE  __MASK(57)
+#define HID0_POWER8_1TO2LPAR   __MASK(52)
+#define HID0_POWER8_1TO4LPAR   __MASK(51)
+#define HID0_POWER8_DYNLPARDIS __MASK(48)
 
 #define SPRN_HID1  0x3F1   /* Hardware Implementation Register 1 */
 #ifdef CONFIG_6xx
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index 63cebb9..4ad0d34 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -1,7 +1,7 @@
 obj-y  += setup.o opal-takeover.o opal-wrappers.o opal.o opal-async.o
 obj-y  += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y  += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
-obj-y  += opal-msglog.o
+obj-y  += opal-msglog.o subcore.o subcore-asm.o
 
 obj-$(CONFIG_SMP)  += smp.o
 obj-$(CONFIG_PCI)  += pci.o pci-p5ioc2.o pci-ioda.o
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index 0051e10..75501bf 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -25,4 +25,6 @@ static inline int pnv_pci_dma_set_mask(struct pci_dev *pdev, u64 dma_mask)
 
 extern void pnv_lpc_init(void);
 
+bool cpu_core_split_required(void);
+
 #endif /* _POWERNV_H */
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 65faf99..0062a43 100644
--- a/

Re: [PATCH v2 0/6] Implement split core for POWER8

2014-05-23 Thread Michael Neuling
On Fri, 2014-05-23 at 11:53 +0200, Alexander Graf wrote:
> On 23.05.14 10:15, Michael Neuling wrote:
> > This patch series implements split core mode on POWER8.  This enables up to 
> > 4
> > subcores per core which can each independently run guests (per guest SPRs 
> > like
> > SDR1, LPIDR etc are replicated per subcore).  Lots more documentation on 
> > this
> > feature in the code and commit messages.
> >
> > Most of this code is in the powernv platform but there's a couple of KVM
> > specific patches too.
> >
> > Patch series authored by mpe and me with a few bug fixes from others.
> >
> > v2:
> >There are some minor updates based on comments and I've added the Acks by
> >Paulus and Alex for the KVM code.
> 
> I don't see changelogs inside the individual patches. Please make sure 
> to always mention what changed from one version to the next in a 
> particular patch, so that I have the chance to check whether that change 
> was good :).

Sure, that was a bit sloppy.

The last patch was the only one that changed.  I changed the sysfs
file from 600 permissions to 644 so that users can read it more easily,
as requested by Joel.

The other change was to fix the possibility of a race when coming out of
nap and checking if we need to split.  This fix was from paulus (worked
offline).

> Also, is there any performance penalty associated with split core mode? 
> If not, could we just always default to split-by-4 on POWER8 bare metal?

Yeah, there is a performance hit.  When you are split (ie
subcores_per_core = 2 or 4), the core is stuck in SMT8 mode.  So if you
only have 1 thread active (others napped), you won't get the benefit of
ST mode in the core (more register renames per HW thread, more FXUs,
more FPUs etc).

Mikey


Re: [PATCH v2 0/6] Implement split core for POWER8

2014-05-23 Thread Michael Neuling
> >> Also, is there any performance penalty associated with split core mode?
> >> If not, could we just always default to split-by-4 on POWER8 bare metal?
> > Yeah, there is a performance hit.  When you are split (ie
> > subcores_per_core = 2 or 4), the core is stuck in SMT8 mode.  So if you
> > only have 1 thread active (others napped), you won't get the benefit of
> > ST mode in the core (more register renames per HW thread, more FXUs,
> > more FPUs etc).
> 
> Ok, imagine I have 1 core with SMT8. I have one process running at 100% 
> occupying one thread, the other 7 threads are idle.
> 
> Do I get performance benefits from having the other threads idle? Or do 
> I have to configure the system into SMT1 mode to get my ST benefits?

You automatically get the performance benefit when they are idle.  When
threads enter nap, the core is able to reduce its SMT mode
automatically. 

> If it's the latter, we could just have ppc64_cpu --smt=x also set the 
> subcore amount in parallel to the thread count.

FWIW on powernv we just nap the threads on hotplug.

> The reason I'm bringing this up is that I'm not quite sure who would be 
> the instance doing these performance tweaks. So I'd guess the majority 
> of users will simply miss out on them.

Everyone, it's automatic on idle... except for split core mode
unfortunately.

Mikey



Re: [PATCH v2 0/6] Implement split core for POWER8

2014-05-23 Thread Michael Neuling
Alex,

> >> If it's the latter, we could just have ppc64_cpu --smt=x also set the
> >> subcore amount in parallel to the thread count.
> > FWIW on powernv we just nap the threads on hotplug.
> >
> >> The reason I'm bringing this up is that I'm not quite sure who would be
> >> the instance doing these performance tweaks. So I'd guess the majority
> >> of users will simply miss out on them.
> > Everyone, it's automatic on idle... except for split core mode
> > unfortunately.
> 
> Oh I meant when you want to use a POWER system as VM host, you have to 
> know about split core mode and configure it accordingly. That's 
> something someone needs to do. And it's different from x86 which means 
> people may miss out on it for their performance benchmarks.

It depends on what's running.  If you have 1 guest per core, then
running unsplit is probably best as you can nap threads as needed and
improve performance.  

If you have more than two guests per core, then running split core can
hugely improve performance as they may be able to run at the same time
without context switching.  4 guests with 2 threads per core can run at
the same time on a single physical core.

One thing to note here is guest doorbell IRQs (new in POWER8).  They
can't cross a core or subcore boundary and there is no way for the
hypervisor to virtualise them.  Hence if you run split 4 on an SMT8
POWER8, you can only run guests up to 2 threads per core (rather than 8
threads per core).

> But if we impose a general performance penalty for everyone with it, I 
> don't think split core mode should be enabled by default.

FWIW we'd like to make this dynamic eventually, so that each core is run
in whatever mode is currently best based on the running guests.

Mikey



Re: [PATCH 3/3] KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling

2014-05-28 Thread Michael Neuling
Alex,

> > +static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
> > +unsigned long resource, unsigned long value1,
> > +unsigned long value2)
> > +{
> > +   switch (resource) {
> > +   case H_SET_MODE_RESOURCE_SET_CIABR:
> > +   if (!kvmppc_power8_compatible(vcpu))
> > +   return H_P2;
> > +   if (value2)
> > +   return H_P4;
> > +   if (mflags)
> > +   return H_UNSUPPORTED_FLAG_START;
> > +   if ((value1 & 0x3) == 0x3)
> 
> What is this?

It's what it says in PAPR (I wish that was public!!!).  Joking aside... 

If you refer to the 2.07 HW arch (not PAPR), the bottom two bits of the
CIABR tell you what mode to match in.  0x3 means match in hypervisor,
which we obviously don't want the guest to be able to do.

I'll add some #defines to make it clearer and repost.

Mikey



[PATCH v2 3/3] KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling

2014-05-28 Thread Michael Neuling
This adds support for the H_SET_MODE hcall.  This hcall is a
multiplexer that has several functions, some of which are called
rarely, and some which are potentially called very frequently.
Here we add support for the functions that set the debug registers
CIABR (Completed Instruction Address Breakpoint Register) and
DAWR/DAWRX (Data Address Watchpoint Register and eXtension),
since they could be updated by the guest as often as every context
switch.

This also adds a kvmppc_power8_compatible() function to test to see
if a guest is compatible with POWER8 or not.  The CIABR and DAWR/X
only exist on POWER8.

Signed-off-by: Michael Neuling 
Signed-off-by: Paul Mackerras 
---
v2:
  add some #defines to make CIABR setting clearer.  No functional change.

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 5dbbb29..85bc8c0 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -279,6 +279,12 @@
 #define H_GET_24X7_DATA            0xF07C
 #define H_GET_PERF_COUNTER_INFO    0xF080
 
+/* Values for 2nd argument to H_SET_MODE */
+#define H_SET_MODE_RESOURCE_SET_CIABR  1
+#define H_SET_MODE_RESOURCE_SET_DAWR   2
+#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE  3
+#define H_SET_MODE_RESOURCE_LE 4
+
 #ifndef __ASSEMBLY__
 
 /**
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e699047..e74dab2 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -557,6 +557,48 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
vcpu->arch.dtl.dirty = true;
 }
 
+static bool kvmppc_power8_compatible(struct kvm_vcpu *vcpu)
+{
+   if (vcpu->arch.vcore->arch_compat >= PVR_ARCH_207)
+   return true;
+   if ((!vcpu->arch.vcore->arch_compat) &&
+   cpu_has_feature(CPU_FTR_ARCH_207S))
+   return true;
+   return false;
+}
+
+static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
+unsigned long resource, unsigned long value1,
+unsigned long value2)
+{
+   switch (resource) {
+   case H_SET_MODE_RESOURCE_SET_CIABR:
+   if (!kvmppc_power8_compatible(vcpu))
+   return H_P2;
+   if (value2)
+   return H_P4;
+   if (mflags)
+   return H_UNSUPPORTED_FLAG_START;
+   /* Guests can't breakpoint the hypervisor */
+   if ((value1 & CIABR_PRIV) == CIABR_PRIV_HYPER)
+   return H_P3;
+   vcpu->arch.ciabr  = value1;
+   return H_SUCCESS;
+   case H_SET_MODE_RESOURCE_SET_DAWR:
+   if (!kvmppc_power8_compatible(vcpu))
+   return H_P2;
+   if (mflags)
+   return H_UNSUPPORTED_FLAG_START;
+   if (value2 & DABRX_HYP)
+   return H_P4;
+   vcpu->arch.dawr  = value1;
+   vcpu->arch.dawrx = value2;
+   return H_SUCCESS;
+   default:
+   return H_TOO_HARD;
+   }
+}
+
 int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
@@ -626,7 +668,14 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 
/* Send the error out to userspace via KVM_RUN */
return rc;
-
+   case H_SET_MODE:
+   ret = kvmppc_h_set_mode(vcpu, kvmppc_get_gpr(vcpu, 4),
+   kvmppc_get_gpr(vcpu, 5),
+   kvmppc_get_gpr(vcpu, 6),
+   kvmppc_get_gpr(vcpu, 7));
+   if (ret == H_TOO_HARD)
+   return RESUME_HOST;
+   break;
case H_XIRR:
case H_CPPR:
case H_EOI:
@@ -652,6 +701,7 @@ static int kvmppc_hcall_impl_hv(struct kvm *kvm, unsigned long cmd)
case H_PROD:
case H_CONFER:
case H_REGISTER_VPA:
+   case H_SET_MODE:
 #ifdef CONFIG_KVM_XICS
case H_XIRR:
case H_CPPR:



powerpc/pseries: Use new defines when calling h_set_mode

2014-05-29 Thread Michael Neuling
> > +/* Values for 2nd argument to H_SET_MODE */
> > +#define H_SET_MODE_RESOURCE_SET_CIABR  1
> > +#define H_SET_MODE_RESOURCE_SET_DAWR   2
> > +#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE  3
> > +#define H_SET_MODE_RESOURCE_LE 4
> 
> 
> Much better, but I think you want to make use of these in non-kvm code too,
> no? At least the LE one is definitely already implemented as a call :)

Sure but that's a different patch below.

Mikey


powerpc/pseries: Use new defines when calling h_set_mode

Now that we define these in the KVM code, use these defines when we call
h_set_mode.  No functional change.

Signed-off-by: Michael Neuling 
--
This depends on the KVM h_set_mode patches.

diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 12c32c5..67859ed 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -273,7 +273,7 @@ static inline long plpar_set_mode(unsigned long mflags, unsigned long resource,
 static inline long enable_reloc_on_exceptions(void)
 {
/* mflags = 3: Exceptions at 0xC0004000 */
-   return plpar_set_mode(3, 3, 0, 0);
+   return plpar_set_mode(3, H_SET_MODE_RESOURCE_ADDR_TRANS_MODE, 0, 0);
 }
 
 /*
@@ -284,7 +284,7 @@ static inline long enable_reloc_on_exceptions(void)
  * returns H_SUCCESS.
  */
 static inline long disable_reloc_on_exceptions(void) {
-   return plpar_set_mode(0, 3, 0, 0);
+   return plpar_set_mode(0, H_SET_MODE_RESOURCE_ADDR_TRANS_MODE, 0, 0);
 }
 
 /*
@@ -297,7 +297,7 @@ static inline long disable_reloc_on_exceptions(void) {
 static inline long enable_big_endian_exceptions(void)
 {
/* mflags = 0: big endian exceptions */
-   return plpar_set_mode(0, 4, 0, 0);
+   return plpar_set_mode(0, H_SET_MODE_RESOURCE_LE, 0, 0);
 }
 
 /*
@@ -310,17 +310,17 @@ static inline long enable_big_endian_exceptions(void)
 static inline long enable_little_endian_exceptions(void)
 {
/* mflags = 1: little endian exceptions */
-   return plpar_set_mode(1, 4, 0, 0);
+   return plpar_set_mode(1, H_SET_MODE_RESOURCE_LE, 0, 0);
 }
 
 static inline long plapr_set_ciabr(unsigned long ciabr)
 {
-   return plpar_set_mode(0, 1, ciabr, 0);
+   return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_CIABR, ciabr, 0);
 }
 
 static inline long plapr_set_watchpoint0(unsigned long dawr0, unsigned long dawrx0)
 {
-   return plpar_set_mode(0, 2, dawr0, dawrx0);
+   return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR, dawr0, dawrx0);
 }
 
 #endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */



Re: powerpc/pseries: Use new defines when calling h_set_mode

2014-05-30 Thread Michael Neuling
On Fri, 2014-05-30 at 18:56 +1000, Michael Ellerman wrote:
> On Thu, 2014-05-29 at 17:45 +1000, Michael Neuling wrote:
> > > > +/* Values for 2nd argument to H_SET_MODE */
> > > > +#define H_SET_MODE_RESOURCE_SET_CIABR  1
> > > > +#define H_SET_MODE_RESOURCE_SET_DAWR   2
> > > > +#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE  3
> > > > +#define H_SET_MODE_RESOURCE_LE 4
> > > 
> > > Much better, but I think you want to make use of these in non-kvm code 
> > > too,
> > > no? At least the LE one is definitely already implemented as a call :)
> > 
> > powerpc/pseries: Use new defines when calling h_set_mode
> > 
> > Now that we define these in the KVM code, use these defines when we call
> > h_set_mode.  No functional change.
> > 
> > Signed-off-by: Michael Neuling 
> > --
> > This depends on the KVM h_set_mode patches.
> > 
> > diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
> > index 12c32c5..67859ed 100644
> > --- a/arch/powerpc/include/asm/plpar_wrappers.h
> > +++ b/arch/powerpc/include/asm/plpar_wrappers.h
> > @@ -273,7 +273,7 @@ static inline long plpar_set_mode(unsigned long mflags, unsigned long resource,
> >  static inline long enable_reloc_on_exceptions(void)
> >  {
> > /* mflags = 3: Exceptions at 0xC0004000 */
> > -   return plpar_set_mode(3, 3, 0, 0);
> > +   return plpar_set_mode(3, H_SET_MODE_RESOURCE_ADDR_TRANS_MODE, 0, 0);
> >  }
> 
> Which header are these coming from, and why aren't we including it? And is it
> going to still build with CONFIG_KVM=n?

From include/asm/hvcall.h in the h_set_mode patch set I sent before.

And yes it compiles with CONFIG_KVM=n fine.

Mikey