[PATCH v3 28/29] KVM: PPC: remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGS

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

In both HV and PR KVM, the KVM_SET_REGS/KVM_GET_REGS ioctls can be
performed without loading the vcpu.

Since the vcpu mutex locking/unlocking has been moved out of
vcpu_load()/vcpu_put(), KVM_SET_REGS/KVM_GET_REGS no longer need to
load the vcpu. This patch removes vcpu_load()/vcpu_put() from the
KVM_SET_REGS/KVM_GET_REGS ioctl handlers.
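
For reference, the serialization that makes this safe now lives in the
generic vcpu ioctl dispatcher, which takes the vcpu mutex before
calling into the arch code. A minimal sketch (simplified, not the
verbatim virt/kvm/kvm_main.c code):

	static long kvm_vcpu_ioctl(struct file *filp,
				   unsigned int ioctl, unsigned long arg)
	{
		struct kvm_vcpu *vcpu = filp->private_data;
		long r;

		/* vcpu->mutex is held across the whole ioctl ... */
		if (mutex_lock_killable(&vcpu->mutex))
			return -EINTR;
		/* ... so KVM_GET_REGS/KVM_SET_REGS arrive serialized */
		r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
		mutex_unlock(&vcpu->mutex);
		return r;
	}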

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/book3s.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 97d4a11..523c68f 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -509,8 +509,6 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
int i;
 
-   vcpu_load(vcpu);
-
regs->pc = kvmppc_get_pc(vcpu);
regs->cr = kvmppc_get_cr(vcpu);
regs->ctr = kvmppc_get_ctr(vcpu);
@@ -532,7 +530,6 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
regs->gpr[i] = kvmppc_get_gpr(vcpu, i);
 
-   vcpu_put(vcpu);
return 0;
 }
 
@@ -540,8 +537,6 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
int i;
 
-   vcpu_load(vcpu);
-
kvmppc_set_pc(vcpu, regs->pc);
kvmppc_set_cr(vcpu, regs->cr);
kvmppc_set_ctr(vcpu, regs->ctr);
@@ -562,7 +557,6 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
for (i = 0; i < ARRAY_SIZE(regs->gpr); i++)
kvmppc_set_gpr(vcpu, i, regs->gpr[i]);
 
-   vcpu_put(vcpu);
return 0;
 }
 
-- 
1.8.3.1



[PATCH v3 27/29] KVM: PPC: remove load/put vcpu for KVM_GET/SET_ONE_REG ioctl

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Since the vcpu mutex locking/unlocking has been moved out of
vcpu_load()/vcpu_put(), KVM_GET_ONE_REG and KVM_SET_ONE_REG no longer
need to load the vcpu. This patch removes vcpu_load()/vcpu_put()
from the KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctls.

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/powerpc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index c9098ff..5def68d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1801,14 +1801,12 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
{
struct kvm_one_reg reg;
r = -EFAULT;
-   vcpu_load(vcpu);
		if (copy_from_user(&reg, argp, sizeof(reg)))
			goto out;
		if (ioctl == KVM_SET_ONE_REG)
			r = kvm_vcpu_ioctl_set_one_reg(vcpu, &reg);
		else
			r = kvm_vcpu_ioctl_get_one_reg(vcpu, &reg);
-   vcpu_put(vcpu);
break;
}
 
-- 
1.8.3.1



[PATCH v3 26/29] KVM: PPC: move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Although we already have kvm_arch_vcpu_async_ioctl(), which doesn't
require the ioctl to load the vcpu, the sync ioctl code needs to be
cleaned up when CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL is not configured.

This patch moves vcpu_load()/vcpu_put() down into each ioctl switch
case so that each ioctl can decide independently whether to do
vcpu_load()/vcpu_put().

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/powerpc.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 1fa5bbe..c9098ff 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1783,16 +1783,16 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
void __user *argp = (void __user *)arg;
long r;
 
-   vcpu_load(vcpu);
-
switch (ioctl) {
case KVM_ENABLE_CAP:
{
struct kvm_enable_cap cap;
r = -EFAULT;
+   vcpu_load(vcpu);
		if (copy_from_user(&cap, argp, sizeof(cap)))
			goto out;
		r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap);
+   vcpu_put(vcpu);
break;
}
 
@@ -1801,12 +1801,14 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
{
struct kvm_one_reg reg;
r = -EFAULT;
+   vcpu_load(vcpu);
		if (copy_from_user(&reg, argp, sizeof(reg)))
			goto out;
		if (ioctl == KVM_SET_ONE_REG)
			r = kvm_vcpu_ioctl_set_one_reg(vcpu, &reg);
		else
			r = kvm_vcpu_ioctl_get_one_reg(vcpu, &reg);
+   vcpu_put(vcpu);
break;
}
 
@@ -1814,9 +1816,11 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
case KVM_DIRTY_TLB: {
struct kvm_dirty_tlb dirty;
r = -EFAULT;
+   vcpu_load(vcpu);
		if (copy_from_user(&dirty, argp, sizeof(dirty)))
			goto out;
		r = kvm_vcpu_ioctl_dirty_tlb(vcpu, &dirty);
+   vcpu_put(vcpu);
break;
}
 #endif
@@ -1825,7 +1829,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
}
 
 out:
-   vcpu_put(vcpu);
return r;
 }
 
-- 
1.8.3.1



[PATCH v3 25/29] KVM: PPC: Book3S PR: enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctl

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

With the current patch set, PR KVM now supports HTM, so this patch
turns the KVM_CAP_PPC_HTM capability on for PR KVM as well.

Tested with:
https://github.com/justdoitqd/publicFiles/blob/master/test_kvm_htm_cap.c
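
A minimal user-space probe for the capability (a sketch assuming a
powerpc host; KVM_CAP_PPC_HTM and KVM_CHECK_EXTENSION come from
linux/kvm.h):

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	int main(void)
	{
		int kvm = open("/dev/kvm", O_RDWR);

		if (kvm < 0)
			return 1;
		/* returns 1 when HTM can be exposed to guests, 0 otherwise */
		printf("KVM_CAP_PPC_HTM: %d\n",
		       ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HTM));
		return 0;
	}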

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/powerpc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index bef27b1..1fa5bbe 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -648,9 +648,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 #endif
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
case KVM_CAP_PPC_HTM:
-   r = hv_enabled &&
-   (!!(cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_HTM) ||
-cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST));
+   r = !!(cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_HTM) ||
+(hv_enabled && cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST));
break;
 #endif
default:
-- 
1.8.3.1



[PATCH v3 24/29] KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTM.

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Currently the guest kernel doesn't handle TAR facility unavailable and
always runs with the TAR bit on. PR KVM will lazily enable TAR. TAR is
not a frequently used register and it is not included in the SVCPU
struct.

Due to the above, the checkpointed TAR val might be a bogus TAR val.
To solve this issue, we make the vcpu->arch.fscr TAR bit consistent
with shadow_fscr when TM is enabled.

At the end of emulating treclaim., the correct TAR val needs to be
loaded into the register if the FSCR_TAR bit is on.
At the beginning of emulating trechkpt., TAR needs to be flushed so
that the right TAR val can be copied into tar_tm.

Tested with:
tools/testing/selftests/powerpc/tm/tm-tar
tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar (with DSCR/PPR
related testing removed).

Signed-off-by: Simon Guo 
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 ++
 arch/powerpc/kvm/book3s_emulate.c |  4 
 arch/powerpc/kvm/book3s_pr.c  | 21 -
 arch/powerpc/kvm/tm.S | 16 ++--
 4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 2940de7..1f345a0 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -271,6 +271,8 @@ static inline void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu) {}
 static inline void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu) {}
 #endif
 
+void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac);
+
 extern int kvm_irq_bypass;
 
 static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 67d0fb40..fdbc695 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -173,6 +173,9 @@ static void kvmppc_emulate_treclaim(struct kvm_vcpu *vcpu, int ra_val)
guest_msr &= ~(MSR_TS_MASK);
kvmppc_set_msr(vcpu, guest_msr);
preempt_enable();
+
+   if (vcpu->arch.shadow_fscr & FSCR_TAR)
+   mtspr(SPRN_TAR, vcpu->arch.tar);
 }
 
 static void kvmppc_emulate_trchkpt(struct kvm_vcpu *vcpu)
@@ -185,6 +188,7 @@ static void kvmppc_emulate_trchkpt(struct kvm_vcpu *vcpu)
 * copy.
 */
kvmppc_giveup_ext(vcpu, MSR_VSX);
+   kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
kvmppc_copyto_vcpu_tm(vcpu);
kvmppc_save_tm_sprs(vcpu);
 
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 526c928..8efc87b 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -55,7 +55,7 @@
 
 static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr,
 ulong msr);
-static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac);
+static int kvmppc_handle_fac(struct kvm_vcpu *vcpu, ulong fac);
 
 /* Some compatibility defines */
 #ifdef CONFIG_PPC_BOOK3S_32
@@ -346,6 +346,7 @@ void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu)
return;
}
 
+   kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
kvmppc_giveup_ext(vcpu, MSR_VSX);
 
preempt_disable();
@@ -357,8 +358,11 @@ void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu)
 {
if (!MSR_TM_ACTIVE(kvmppc_get_msr(vcpu))) {
kvmppc_restore_tm_sprs(vcpu);
-   if (kvmppc_get_msr(vcpu) & MSR_TM)
+   if (kvmppc_get_msr(vcpu) & MSR_TM) {
kvmppc_handle_lost_math_exts(vcpu);
+   if (vcpu->arch.fscr & FSCR_TAR)
+   kvmppc_handle_fac(vcpu, FSCR_TAR_LG);
+   }
return;
}
 
@@ -366,9 +370,11 @@ void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu)
_kvmppc_restore_tm_pr(vcpu, kvmppc_get_msr(vcpu));
preempt_enable();
 
-   if (kvmppc_get_msr(vcpu) & MSR_TM)
+   if (kvmppc_get_msr(vcpu) & MSR_TM) {
kvmppc_handle_lost_math_exts(vcpu);
-
+   if (vcpu->arch.fscr & FSCR_TAR)
+   kvmppc_handle_fac(vcpu, FSCR_TAR_LG);
+   }
 }
 #endif
 
@@ -819,7 +825,7 @@ void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr)
 }
 
 /* Give up facility (TAR / EBB / DSCR) */
-static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac)
+void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac)
 {
 #ifdef CONFIG_PPC_BOOK3S_64
if (!(vcpu->arch.shadow_fscr & (1ULL << fac))) {
@@ -1020,7 +1026,12 @@ void kvmppc_set_fscr(struct kvm_vcpu *vcpu, u64 fscr)
if ((vcpu->arch.fscr & FSCR_TAR) && !(fscr & FSCR_TAR)) {
/* TAR got dropped, drop it in shadow too */
kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+   } else if (!(vcpu->arch.fscr & FSCR_TAR) && (fscr & FSCR_TAR)) {
+   vcpu->arch.fscr = fscr;
+   kvmppc_handle_fac(vcpu, FSCR_TAR_LG);
+   return;
}
+
vcpu->arch.fscr = fscr;
 }
 #endif
diff --git 

[PATCH v3 23/29] KVM: PPC: Book3S PR: add guard code to prevent returning to guest with PR=0 and Transactional state

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Currently PR KVM doesn't support transactional memory in guest
privileged state.

This patch adds a check when setting the guest MSR, so that we can
never return to the guest with PR=0 and TS=0b10. A tabort will be
emulated to indicate this and fail the transaction immediately.

Signed-off-by: Simon Guo 
---
 arch/powerpc/include/uapi/asm/tm.h |  2 +-
 arch/powerpc/kvm/book3s.h  |  6 ++
 arch/powerpc/kvm/book3s_emulate.c  |  2 +-
 arch/powerpc/kvm/book3s_pr.c   | 13 -
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/tm.h b/arch/powerpc/include/uapi/asm/tm.h
index e1bf0e2..e2947c9 100644
--- a/arch/powerpc/include/uapi/asm/tm.h
+++ b/arch/powerpc/include/uapi/asm/tm.h
@@ -13,7 +13,7 @@
 #define TM_CAUSE_TLBI  0xdc
 #define TM_CAUSE_FAC_UNAV  0xda
 #define TM_CAUSE_SYSCALL   0xd8
-#define TM_CAUSE_MISC  0xd6  /* future use */
+#define TM_CAUSE_PRIV_T0xd6
 #define TM_CAUSE_SIGNAL0xd4
 #define TM_CAUSE_ALIGNMENT 0xd2
 #define TM_CAUSE_EMULATE   0xd0
diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h
index 4ad5e28..14ef035 100644
--- a/arch/powerpc/kvm/book3s.h
+++ b/arch/powerpc/kvm/book3s.h
@@ -31,4 +31,10 @@ extern int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu,
 extern int kvmppc_book3s_init_pr(void);
 extern void kvmppc_book3s_exit_pr(void);
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+extern void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val);
+#else
+static inline void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val) {}
+#endif
+
 #endif
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 34f910e..67d0fb40 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -199,7 +199,7 @@ static void kvmppc_emulate_trchkpt(struct kvm_vcpu *vcpu)
 }
 
 /* emulate tabort. at guest privilege state */
-static void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val)
+void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val)
 {
/* currently we only emulate tabort. but no emulation of other
 * tabort variants since there is no kernel usage of them at
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 5359f9c..526c928 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -446,12 +446,23 @@ static void kvm_set_spte_hva_pr(struct kvm *kvm, unsigned long hva, pte_t pte)
 
 static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
 {
-   ulong old_msr = kvmppc_get_msr(vcpu);
+   ulong old_msr;
 
 #ifdef EXIT_DEBUG
printk(KERN_INFO "KVM: Set MSR to 0x%llx\n", msr);
 #endif
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   /* We should never target guest MSR to TS=10 && PR=0,
+* since we always fail transaction for guest privilege
+* state.
+*/
+   if (!(msr & MSR_PR) && MSR_TM_TRANSACTIONAL(msr))
+   kvmppc_emulate_tabort(vcpu,
+   TM_CAUSE_PRIV_T | TM_CAUSE_PERSISTENT);
+#endif
+
+   old_msr = kvmppc_get_msr(vcpu);
msr &= to_book3s(vcpu)->msr_mask;
kvmppc_set_msr_fast(vcpu, msr);
kvmppc_recalc_shadow_msr(vcpu);
-- 
1.8.3.1



[PATCH v3 22/29] KVM: PPC: Book3S PR: add emulation for tabort. for privilege guest

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Currently privileged guests run with TM disabled.

Although a privileged guest cannot initiate a new transaction, it can
use tabort to terminate its problem state's transaction. So it is
still necessary to emulate tabort. for privileged guests.

This patch adds emulation for tabort. of privileged guests.

Tested with:
https://github.com/justdoitqd/publicFiles/blob/master/test_tabort.c
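
For contrast, the problem state (PR=1) path needs no emulation: a
user-space transaction can already abort itself. A sketch (assumes a
TM-capable powerpc host and gcc's HTM built-ins, compile with -mhtm):

	#include <htmintrin.h>
	#include <stdio.h>

	int main(void)
	{
		if (__builtin_tbegin(0)) {
			/* transactional state: abort with a software cause */
			__builtin_tabort(0xfe);
			__builtin_tend(0);	/* never reached */
		} else {
			/* failure handler: cause is recorded in TEXASR */
			printf("TEXASR = 0x%lx\n",
			       (unsigned long)__builtin_get_texasr());
		}
		return 0;
	}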

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/book3s_emulate.c | 68 +++
 1 file changed, 68 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index b7530cf..34f910e 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -50,6 +50,7 @@
 #define OP_31_XOP_SLBMFEE  915
 
 #define OP_31_XOP_TBEGIN   654
+#define OP_31_XOP_TABORT   910
 
 #define OP_31_XOP_TRECLAIM 942
 #define OP_31_XOP_TRCHKPT  1006
@@ -196,6 +197,47 @@ static void kvmppc_emulate_trchkpt(struct kvm_vcpu *vcpu)
kvmppc_restore_tm_pr(vcpu);
preempt_enable();
 }
+
+/* emulate tabort. at guest privilege state */
+static void kvmppc_emulate_tabort(struct kvm_vcpu *vcpu, int ra_val)
+{
+   /* currently we only emulate tabort. but no emulation of other
+* tabort variants since there is no kernel usage of them at
+* present.
+*/
+   unsigned long guest_msr = kvmppc_get_msr(vcpu);
+
+   preempt_disable();
+   tm_enable();
+   tm_abort(ra_val);
+
+   /* CR0 = 0 | MSR[TS] | 0 */
+   vcpu->arch.cr = (vcpu->arch.cr & ~(CR0_MASK << CR0_SHIFT)) |
+   (((guest_msr & MSR_TS_MASK) >> (MSR_TS_S_LG - 1))
+<< CR0_SHIFT);
+
+   vcpu->arch.texasr = mfspr(SPRN_TEXASR);
+   /* failure recording depends on Failure Summary bit,
+* and tabort will be treated as nops in non-transactional
+* state.
+*/
+   if (!(vcpu->arch.texasr & TEXASR_FS) &&
+   MSR_TM_ACTIVE(guest_msr)) {
+   vcpu->arch.texasr &= ~(TEXASR_PR | TEXASR_HV);
+   if (guest_msr & MSR_PR)
+   vcpu->arch.texasr |= TEXASR_PR;
+
+   if (guest_msr & MSR_HV)
+   vcpu->arch.texasr |= TEXASR_HV;
+
+   vcpu->arch.tfiar = kvmppc_get_pc(vcpu);
+   mtspr(SPRN_TEXASR, vcpu->arch.texasr);
+   mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
+   }
+   tm_disable();
+   preempt_enable();
+}
+
 #endif
 
 int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
@@ -468,6 +510,32 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
emulated = EMULATE_FAIL;
break;
}
+   case OP_31_XOP_TABORT:
+   {
+   ulong guest_msr = kvmppc_get_msr(vcpu);
+   unsigned long ra_val = 0;
+
+   if (!cpu_has_feature(CPU_FTR_TM))
+   break;
+
+   if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+   kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   /* only emulate for privilege guest, since problem state
+* guest can run with TM enabled and we don't expect to
+* trap at here for that case.
+*/
+   WARN_ON(guest_msr & MSR_PR);
+
+   if (ra)
+   ra_val = kvmppc_get_gpr(vcpu, ra);
+
+   kvmppc_emulate_tabort(vcpu, ra_val);
+   break;
+   }
case OP_31_XOP_TRECLAIM:
{
ulong guest_msr = kvmppc_get_msr(vcpu);
-- 
1.8.3.1



[PATCH v3 21/29] KVM: PPC: Book3S PR: add emulation for trechkpt in PR KVM.

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

This patch adds host emulation when a PR KVM guest executes
"trechkpt.", which is a privileged instruction and will trap into the
host.

We first copy the vcpu's ongoing content into the vcpu TM checkpoint
content, then perform kvmppc_restore_tm_pr() to do the trechkpt. with
the updated vcpu TM checkpoint values.

Signed-off-by: Simon Guo 
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 ++
 arch/powerpc/kvm/book3s_emulate.c | 61 +++
 arch/powerpc/kvm/book3s_pr.c  |  2 +-
 3 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index c1cea82..2940de7 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -262,10 +262,12 @@ extern void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr,
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu);
 void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu);
+void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu);
 void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu);
 #else
 static inline void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu) {}
 static inline void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu) {}
+static inline void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu) {}
 static inline void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu) {}
 #endif
 
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 04c29e0..b7530cf 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -52,6 +52,7 @@
 #define OP_31_XOP_TBEGIN   654
 
 #define OP_31_XOP_TRECLAIM 942
+#define OP_31_XOP_TRCHKPT  1006
 
 /* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
 #define OP_31_XOP_DCBZ 1010
@@ -172,6 +173,29 @@ static void kvmppc_emulate_treclaim(struct kvm_vcpu *vcpu, int ra_val)
kvmppc_set_msr(vcpu, guest_msr);
preempt_enable();
 }
+
+static void kvmppc_emulate_trchkpt(struct kvm_vcpu *vcpu)
+{
+   unsigned long guest_msr = kvmppc_get_msr(vcpu);
+
+   preempt_disable();
+   /*
+* need flush FP/VEC/VSX to vcpu save area before
+* copy.
+*/
+   kvmppc_giveup_ext(vcpu, MSR_VSX);
+   kvmppc_copyto_vcpu_tm(vcpu);
+   kvmppc_save_tm_sprs(vcpu);
+
+   /*
+* as a result of trecheckpoint. set TS to suspended.
+*/
+   guest_msr &= ~(MSR_TS_MASK);
+   guest_msr |= MSR_TS_S;
+   kvmppc_set_msr(vcpu, guest_msr);
+   kvmppc_restore_tm_pr(vcpu);
+   preempt_enable();
+}
 #endif
 
 int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
@@ -478,6 +502,43 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
kvmppc_emulate_treclaim(vcpu, ra_val);
break;
}
+   case OP_31_XOP_TRCHKPT:
+   {
+   ulong guest_msr = kvmppc_get_msr(vcpu);
+   unsigned long texasr;
+
+   if (!cpu_has_feature(CPU_FTR_TM))
+   break;
+
+   if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+   kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   /* generate interrupt based on priorities */
+   if (guest_msr & MSR_PR) {
+   /* Privileged Instruction type Program Intr */
+   kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   tm_enable();
+   texasr = mfspr(SPRN_TEXASR);
+   tm_disable();
+
+   if (MSR_TM_ACTIVE(guest_msr) ||
+   !(texasr & (TEXASR_FS))) {
+   /* TM bad thing interrupt */
+   kvmppc_core_queue_program(vcpu, SRR1_PROGTM);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   kvmppc_emulate_trchkpt(vcpu);
+   break;
+   }
 #endif
default:
emulated = EMULATE_FAIL;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 9a72460..5359f9c 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -299,7 +299,7 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu)
 }
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-static inline void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu)
+void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu)
 {

[PATCH v3 20/29] KVM: PPC: Book3S PR: adds emulation for treclaim.

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

This patch adds support for "treclaim." emulation when a PR KVM guest
executes treclaim. and traps to the host.

We first do the treclaim. and save the TM checkpoint. Then it is
necessary to update the vcpu's current register content with the
checkpointed values. When we rfid into the guest again, that vcpu
current register content (now the checkpointed values) will be loaded
into the registers.

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/book3s_emulate.c | 76 +++
 1 file changed, 76 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 570339b..04c29e0 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -51,6 +51,8 @@
 
 #define OP_31_XOP_TBEGIN   654
 
+#define OP_31_XOP_TRECLAIM 942
+
 /* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
 #define OP_31_XOP_DCBZ 1010
 
@@ -130,6 +132,46 @@ static inline void kvmppc_copyfrom_vcpu_tm(struct kvm_vcpu *vcpu)
vcpu->arch.vrsave = vcpu->arch.vrsave_tm;
 }
 
+static void kvmppc_emulate_treclaim(struct kvm_vcpu *vcpu, int ra_val)
+{
+   unsigned long guest_msr = kvmppc_get_msr(vcpu);
+   int fc_val = ra_val ? ra_val : 1;
+
+   /* CR0 = 0 | MSR[TS] | 0 */
+   vcpu->arch.cr = (vcpu->arch.cr & ~(CR0_MASK << CR0_SHIFT)) |
+   (((guest_msr & MSR_TS_MASK) >> (MSR_TS_S_LG - 1))
+<< CR0_SHIFT);
+
+   preempt_disable();
+   kvmppc_save_tm_pr(vcpu);
+   kvmppc_copyfrom_vcpu_tm(vcpu);
+
+   tm_enable();
+   vcpu->arch.texasr = mfspr(SPRN_TEXASR);
+   /* failure recording depends on Failure Summary bit */
+   if (!(vcpu->arch.texasr & TEXASR_FS)) {
+   vcpu->arch.texasr &= ~TEXASR_FC;
+   vcpu->arch.texasr |= ((u64)fc_val << TEXASR_FC_LG);
+
+   vcpu->arch.texasr &= ~(TEXASR_PR | TEXASR_HV);
+   if (kvmppc_get_msr(vcpu) & MSR_PR)
+   vcpu->arch.texasr |= TEXASR_PR;
+
+   if (kvmppc_get_msr(vcpu) & MSR_HV)
+   vcpu->arch.texasr |= TEXASR_HV;
+
+   vcpu->arch.tfiar = kvmppc_get_pc(vcpu);
+   mtspr(SPRN_TEXASR, vcpu->arch.texasr);
+   mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
+   }
+   tm_disable();
+   /*
+* treclaim need quit to non-transactional state.
+*/
+   guest_msr &= ~(MSR_TS_MASK);
+   kvmppc_set_msr(vcpu, guest_msr);
+   preempt_enable();
+}
 #endif
 
 int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
@@ -402,6 +444,40 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
emulated = EMULATE_FAIL;
break;
}
+   case OP_31_XOP_TRECLAIM:
+   {
+   ulong guest_msr = kvmppc_get_msr(vcpu);
+   unsigned long ra_val = 0;
+
+   if (!cpu_has_feature(CPU_FTR_TM))
+   break;
+
+   if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+   kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   /* generate interrupts based on priorities */
+   if (guest_msr & MSR_PR) {
+   /* Privileged Instruction type Program Interrupt */
+   kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   if (!MSR_TM_ACTIVE(guest_msr)) {
+   /* TM bad thing interrupt */
+   kvmppc_core_queue_program(vcpu, SRR1_PROGTM);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   if (ra)
+   ra_val = kvmppc_get_gpr(vcpu, ra);
+   kvmppc_emulate_treclaim(vcpu, ra_val);
+   break;
+   }
 #endif
default:
emulated = EMULATE_FAIL;
-- 
1.8.3.1



[PATCH v3 19/29] KVM: PPC: Book3S PR: enable NV reg restore for reading TM SPR at guest privilege state

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Currently kvmppc_handle_fac() does not update NV GPRs and thus it can
return with RESUME_GUEST.

However a PR KVM guest always disables the MSR_TM bit in privileged
state. If a PR privileged-state guest tries to read TM SPRs, it will
trigger a TM facility unavailable exception and fall into
kvmppc_handle_fac(). The emulation is then done by
kvmppc_core_emulate_mfspr_pr(). The mfspr instruction can use an RT
that is an NV register, so it is necessary to restore the NV GPRs in
this case, to reflect the update to the NV RT.

This patch makes kvmppc_handle_fac() return RESUME_GUEST_NV for a TM
facility unavailable exception taken in guest privileged state.
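
For context, the return codes involved are defined in
arch/powerpc/include/asm/kvm_ppc.h roughly as follows (a sketch, not a
verbatim quote):

	#define RESUME_FLAG_NV	(1<<0)	/* reload guest nonvolatile state? */
	#define RESUME_GUEST	0
	#define RESUME_GUEST_NV	RESUME_FLAG_NV

so returning RESUME_GUEST_NV is what makes the exit path reload the
non-volatile GPRs before re-entering the guest.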

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_pr.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 9becca1..9a72460 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -989,6 +989,18 @@ static int kvmppc_handle_fac(struct kvm_vcpu *vcpu, ulong fac)
break;
}
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   /* Since we disabled MSR_TM at privilege state, the mfspr instruction
+* for TM spr can trigger TM fac unavailable. In this case, the
+* emulation is handled by kvmppc_emulate_fac(), which invokes
+* kvmppc_emulate_mfspr() finally. But note the mfspr can include
+* RT for NV registers. So it need to restore those NV reg to reflect
+* the update.
+*/
+   if ((fac == FSCR_TM_LG) && !(kvmppc_get_msr(vcpu) & MSR_PR))
+   return RESUME_GUEST_NV;
+#endif
+
return RESUME_GUEST;
 }
 
@@ -1350,8 +1362,7 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
}
 #ifdef CONFIG_PPC_BOOK3S_64
case BOOK3S_INTERRUPT_FAC_UNAVAIL:
-   kvmppc_handle_fac(vcpu, vcpu->arch.shadow_fscr >> 56);
-   r = RESUME_GUEST;
+   r = kvmppc_handle_fac(vcpu, vcpu->arch.shadow_fscr >> 56);
break;
 #endif
case BOOK3S_INTERRUPT_MACHINE_CHECK:
-- 
1.8.3.1



[PATCH v3 18/29] KVM: PPC: Book3S PR: always fail transaction in guest privilege state

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Currently the kernel doesn't use transactional memory.
And there is an issue for privileged guests: the
tbegin/tsuspend/tresume/tabort TM instructions can impact the MSR TM
bits without trapping into the PR host. So the following code will
lead to a false mfmsr result:
	tbegin	<- MSR bits updated to Transaction active.
	beq	<- failure handler branch
	mfmsr	<- still reads MSR bits from the magic page with
		   transaction inactive.

It is not an issue for a non-privileged guest since its mfmsr is not
patched with the magic page and will always trap into the PR host.

This patch always fails the tbegin attempt for a privileged guest, so
that the above issue is prevented. It is benign since currently the
(guest) kernel doesn't initiate a transaction.

Test case:
https://github.com/justdoitqd/publicFiles/blob/master/test_tbegin_pr.c
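
With this patch, a privileged guest executing tbegin. immediately sees
the architected failure outcome. A sketch of the state the emulation
constructs (mirroring the CR0/TEXASR/TFHAR/TFIAR assignments in the
hunk below):

	/* CR0 = 0b0010: transaction failed (CR0_TBEGIN_FAILURE) */
	/* TEXASR: failure summary, exact, persistent emulation cause */
	texasr = TEXASR_FS | TEXASR_EXACT |
		 ((u64)(TM_CAUSE_EMULATE | TM_CAUSE_PERSISTENT)
		  << TEXASR_FC_LG);
	tfhar = pc + 4;	/* pc = address of the tbegin. */
	tfiar = pc;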

Signed-off-by: Simon Guo 
---
 arch/powerpc/include/asm/kvm_book3s.h |  2 ++
 arch/powerpc/kvm/book3s_emulate.c | 40 +++
 arch/powerpc/kvm/book3s_pr.c  | 11 +-
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 43e8bb1..c1cea82 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -262,9 +262,11 @@ extern void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr,
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu);
 void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu);
+void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu);
 #else
 static inline void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu) {}
 static inline void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu) {}
+static inline void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu) {}
 #endif
 
 extern int kvm_irq_bypass;
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index c4e3ec6..570339b 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "book3s.h"
 #include 
 
@@ -48,6 +49,8 @@
 #define OP_31_XOP_EIOIO854
 #define OP_31_XOP_SLBMFEE  915
 
+#define OP_31_XOP_TBEGIN   654
+
 /* DCBZ is actually 1014, but we patch it to 1010 so we get a trap */
 #define OP_31_XOP_DCBZ 1010
 
@@ -363,6 +366,43 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
 
break;
}
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   case OP_31_XOP_TBEGIN:
+   {
+   if (!cpu_has_feature(CPU_FTR_TM))
+   break;
+
+   if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+   kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   if (!(kvmppc_get_msr(vcpu) & MSR_PR)) {
+   preempt_disable();
+   vcpu->arch.cr = (CR0_TBEGIN_FAILURE |
+ (vcpu->arch.cr & ~(CR0_MASK << CR0_SHIFT)));
+
+   vcpu->arch.texasr = (TEXASR_FS | TEXASR_EXACT |
+   (((u64)(TM_CAUSE_EMULATE | TM_CAUSE_PERSISTENT))
+    << TEXASR_FC_LG));
+
+   if ((inst >> 21) & 0x1)
+   vcpu->arch.texasr |= TEXASR_ROT;
+
+   if (kvmppc_get_msr(vcpu) & MSR_HV)
+   vcpu->arch.texasr |= TEXASR_HV;
+
+   vcpu->arch.tfhar = kvmppc_get_pc(vcpu) + 4;
+   vcpu->arch.tfiar = kvmppc_get_pc(vcpu);
+
+   kvmppc_restore_tm_sprs(vcpu);
+   preempt_enable();
+   } else
+   emulated = EMULATE_FAIL;
+   break;
+   }
+#endif
default:
emulated = EMULATE_FAIL;
}
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index e8e7f3a..9becca1 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -207,6 +207,15 @@ static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
 #ifdef CONFIG_PPC_BOOK3S_64
smsr |= MSR_ISF | MSR_HV;
 #endif
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   /*
+* in guest privileged state, we want to fail all TM transactions.
+* So disable MSR TM bit so that all tbegin. will be able to be
+* trapped into host.
+*/
+   if (!(guest_msr & MSR_PR))
+   smsr &= ~MSR_TM;
+#endif
vcpu->arch.shadow_msr = 

[PATCH v3 17/29] KVM: PPC: Book3S PR: make mtspr/mfspr emulation behavior based on active TM SPRs

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

The mfspr/mtspr instructions on TM SPRs (TEXASR/TFIAR/TFHAR) are
non-privileged and can be executed by the PR KVM guest in problem
state without trapping into the host. We only emulate mtspr/mfspr on
TEXASR/TFIAR/TFHAR in guest PR=0 state.

When we are emulating mtspr on TM SPRs in guest PR=0 state, the
emulation result needs to be visible to guest PR=1 state. That is, the
actual TM SPR value should be loaded into the actual registers.

We already flush the TM SPRs into the vcpu when switching out of the
CPU, and load the TM SPRs when switching back.

This patch corrects the mfspr()/mtspr() emulation for TM SPRs to make
the actual source/destination be the actual TM SPRs.

Signed-off-by: Simon Guo 
---
 arch/powerpc/include/asm/kvm_book3s.h |  1 +
 arch/powerpc/kvm/book3s_emulate.c | 58 +--
 arch/powerpc/kvm/book3s_pr.c  |  2 +-
 3 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index fc15ad9..43e8bb1 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -210,6 +210,7 @@ extern long kvmppc_hv_get_dirty_log_radix(struct kvm *kvm,
 extern void kvmppc_book3s_dequeue_irqprio(struct kvm_vcpu *vcpu,
  unsigned int vec);
 extern void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags);
+extern void kvmppc_trigger_fac_interrupt(struct kvm_vcpu *vcpu, ulong fac);
 extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index f81a921..c4e3ec6 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include "book3s.h"
+#include 
 
 #define OP_19_XOP_RFID 18
 #define OP_19_XOP_RFI  50
@@ -523,13 +524,38 @@ int kvmppc_core_emulate_mtspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
break;
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
case SPRN_TFHAR:
-   vcpu->arch.tfhar = spr_val;
-   break;
case SPRN_TEXASR:
-   vcpu->arch.texasr = spr_val;
-   break;
case SPRN_TFIAR:
-   vcpu->arch.tfiar = spr_val;
+   if (!cpu_has_feature(CPU_FTR_TM))
+   break;
+
+   if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+   kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   if (MSR_TM_ACTIVE(kvmppc_get_msr(vcpu)) &&
+   !((MSR_TM_SUSPENDED(kvmppc_get_msr(vcpu))) &&
+   (sprn == SPRN_TFHAR))) {
+   /* it is illegal to mtspr() TM regs in
+* other than non-transactional state, with
+* the exception of TFHAR in suspend state.
+*/
+   kvmppc_core_queue_program(vcpu, SRR1_PROGTM);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   tm_enable();
+   if (sprn == SPRN_TFHAR)
+   mtspr(SPRN_TFHAR, spr_val);
+   else if (sprn == SPRN_TEXASR)
+   mtspr(SPRN_TEXASR, spr_val);
+   else
+   mtspr(SPRN_TFIAR, spr_val);
+   tm_disable();
+
break;
 #endif
 #endif
@@ -676,13 +702,25 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val
break;
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
case SPRN_TFHAR:
-   *spr_val = vcpu->arch.tfhar;
-   break;
case SPRN_TEXASR:
-   *spr_val = vcpu->arch.texasr;
-   break;
case SPRN_TFIAR:
-   *spr_val = vcpu->arch.tfiar;
+   if (!cpu_has_feature(CPU_FTR_TM))
+   break;
+
+   if (!(kvmppc_get_msr(vcpu) & MSR_TM)) {
+   kvmppc_trigger_fac_interrupt(vcpu, FSCR_TM_LG);
+   emulated = EMULATE_AGAIN;
+   break;
+   }
+
+   tm_enable();
+   if (sprn == SPRN_TFHAR)
+   *spr_val = mfspr(SPRN_TFHAR);
+   else if (sprn == SPRN_TEXASR)
+   *spr_val = mfspr(SPRN_TEXASR);
+   else if (sprn == SPRN_TFIAR)
+   *spr_val = mfspr(SPRN_TFIAR);
+   tm_disable();
break;
 #endif
 #endif
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 4b81b3c..e8e7f3a 100644
--- 

[PATCH v3 16/29] KVM: PPC: Book3S PR: add math support for PR KVM HTM

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

The math registers will be saved into vcpu->arch.fp/vr and the
corresponding vcpu->arch.fp_tm/vr_tm areas.

We flush or give up the math regs into vcpu->arch.fp/vr before saving
the transaction. After the transaction is restored, the math regs will
be loaded back into the registers.

If there is an FP/VEC/VSX unavailable exception during transaction
active state, the math checkpoint content might be incorrect and we
would need to do a treclaim./load correct checkpoint values/trechkpt.
sequence to retry the transaction. That would make our solution
complicated. To solve this issue, we always make the hardware guest
MSR math bits (shadow_msr) consistent with the MSR value which the
guest sees (kvmppc_get_msr()) when the guest MSR has TM enabled. Then
all FP/VEC/VSX unavailable exceptions can be delivered to the guest,
and the guest handles them by itself.

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/book3s_pr.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 226bae7..4b81b3c 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -308,6 +308,28 @@ static inline void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu)
tm_disable();
 }
 
+/* loadup math bits which is enabled at kvmppc_get_msr() but not enabled at
+ * hardware.
+ */
+static void kvmppc_handle_lost_math_exts(struct kvm_vcpu *vcpu)
+{
+   ulong exit_nr;
+   ulong ext_diff = (kvmppc_get_msr(vcpu) & ~vcpu->arch.guest_owned_ext) &
+   (MSR_FP | MSR_VEC | MSR_VSX);
+
+   if (!ext_diff)
+   return;
+
+   if (ext_diff == MSR_FP)
+   exit_nr = BOOK3S_INTERRUPT_FP_UNAVAIL;
+   else if (ext_diff == MSR_VEC)
+   exit_nr = BOOK3S_INTERRUPT_ALTIVEC;
+   else
+   exit_nr = BOOK3S_INTERRUPT_VSX;
+
+   kvmppc_handle_ext(vcpu, exit_nr, ext_diff);
+}
+
 void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu)
 {
	if (!(MSR_TM_ACTIVE(kvmppc_get_msr(vcpu)))) {
@@ -315,6 +337,8 @@ void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu)
return;
}
 
+   kvmppc_giveup_ext(vcpu, MSR_VSX);
+
preempt_disable();
_kvmppc_save_tm_pr(vcpu, mfmsr());
preempt_enable();
@@ -324,12 +348,18 @@ void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu)
 {
if (!MSR_TM_ACTIVE(kvmppc_get_msr(vcpu))) {
kvmppc_restore_tm_sprs(vcpu);
+   if (kvmppc_get_msr(vcpu) & MSR_TM)
+   kvmppc_handle_lost_math_exts(vcpu);
return;
}
 
preempt_disable();
_kvmppc_restore_tm_pr(vcpu, kvmppc_get_msr(vcpu));
preempt_enable();
+
+   if (kvmppc_get_msr(vcpu) & MSR_TM)
+   kvmppc_handle_lost_math_exts(vcpu);
+
 }
 #endif
 
@@ -468,6 +498,11 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
/* Preload FPU if it's enabled */
if (kvmppc_get_msr(vcpu) & MSR_FP)
kvmppc_handle_ext(vcpu, BOOK3S_INTERRUPT_FP_UNAVAIL, MSR_FP);
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   if (kvmppc_get_msr(vcpu) & MSR_TM)
+   kvmppc_handle_lost_math_exts(vcpu);
+#endif
 }
 
 void kvmppc_set_pvr_pr(struct kvm_vcpu *vcpu, u32 pvr)
-- 
1.8.3.1



[PATCH v3 15/29] KVM: PPC: Book3S PR: add transaction memory save/restore skeleton for PR KVM

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

The transactional memory checkpoint area save/restore behavior is
triggered when the VCPU qemu process is switched out of/into the CPU,
i.e. at kvmppc_core_vcpu_put_pr() and kvmppc_core_vcpu_load_pr().

MSR TM active state is determined by the TS bits:
active: 10 (transactional) or 01 (suspended)
inactive: 00 (non-transactional)

We don't "fake" TM functionality for the guest. We "sync" the guest
virtual MSR TM active state (10 or 01) with the shadow MSR. That is to
say, we don't emulate a transactional guest with a TM-inactive MSR.
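
The TS-bit tests used throughout this series are the standard MSR
helpers; a sketch of their definitions (as in
arch/powerpc/include/asm/reg.h):

	#define MSR_TM_TRANSACTIONAL(x)	(((x) & MSR_TS_MASK) == MSR_TS_T)
	#define MSR_TM_SUSPENDED(x)	(((x) & MSR_TS_MASK) == MSR_TS_S)
	#define MSR_TM_ACTIVE(x)	(((x) & MSR_TS_MASK) != 0)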

TM SPR support (TFHAR/TFIAR/TEXASR) has already been added by
commit 9916d57e64a4 ("KVM: PPC: Book3S PR: Expose TM registers").
Math register support (FPR/VMX/VSX) will be done in a subsequent
patch.

Whether the TM context needs to be saved/restored is determined by
the kvmppc_get_msr() TM active state:
* TM active - save/restore TM context
* TM inactive - no need to do so and only save/restore
TM SPRs.

Signed-off-by: Simon Guo 
Suggested-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s.h |  9 +
 arch/powerpc/include/asm/kvm_host.h   |  1 -
 arch/powerpc/kvm/book3s_pr.c  | 27 +++
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 20d3d5a..fc15ad9 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -257,6 +257,15 @@ extern void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr,
 extern int kvmppc_hcall_impl_hv_realmode(unsigned long cmd);
 extern void kvmppc_copy_to_svcpu(struct kvm_vcpu *vcpu);
 extern void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu);
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu);
+void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu);
+#else
+static inline void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu) {}
+static inline void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu) {}
+#endif
+
 extern int kvm_irq_bypass;
 
 static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 89f44ec..60325af 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -621,7 +621,6 @@ struct kvm_vcpu_arch {
 
struct thread_vr_state vr_tm;
u32 vrsave_tm; /* also USPRG0 */
-
 #endif
 
 #ifdef CONFIG_KVM_EXIT_TIMING
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 7d4905a..226bae7 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "book3s.h"
 
@@ -115,6 +116,8 @@ static void kvmppc_core_vcpu_load_pr(struct kvm_vcpu *vcpu, int cpu)
 
if (kvmppc_is_split_real(vcpu))
kvmppc_fixup_split_real(vcpu);
+
+   kvmppc_restore_tm_pr(vcpu);
 }
 
 static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
@@ -134,6 +137,7 @@ static void kvmppc_core_vcpu_put_pr(struct kvm_vcpu *vcpu)
 
kvmppc_giveup_ext(vcpu, MSR_FP | MSR_VEC | MSR_VSX);
kvmppc_giveup_fac(vcpu, FSCR_TAR_LG);
+   kvmppc_save_tm_pr(vcpu);
 
/* Enable AIL if supported */
if (cpu_has_feature(CPU_FTR_HVMODE) &&
@@ -304,6 +308,29 @@ static inline void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu)
tm_disable();
 }
 
+void kvmppc_save_tm_pr(struct kvm_vcpu *vcpu)
+{
+   if (!(MSR_TM_ACTIVE(kvmppc_get_msr(vcpu)))) {
+   kvmppc_save_tm_sprs(vcpu);
+   return;
+   }
+
+   preempt_disable();
+   _kvmppc_save_tm_pr(vcpu, mfmsr());
+   preempt_enable();
+}
+
+void kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu)
+{
+   if (!MSR_TM_ACTIVE(kvmppc_get_msr(vcpu))) {
+   kvmppc_restore_tm_sprs(vcpu);
+   return;
+   }
+
+   preempt_disable();
+   _kvmppc_restore_tm_pr(vcpu, kvmppc_get_msr(vcpu));
+   preempt_enable();
+}
 #endif
 
 static int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu)
-- 
1.8.3.1



[PATCH v3 14/29] KVM: PPC: Book3S PR: add kvmppc_save/restore_tm_sprs() APIs

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

This patch adds 2 new APIs, kvmppc_save_tm_sprs()/kvmppc_restore_tm_sprs(),
for the purpose of TEXASR/TFIAR/TFHAR save/restore.

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_pr.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index f2ae5a3..7d4905a 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "book3s.h"
 
@@ -284,6 +285,27 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu)
svcpu_put(svcpu);
 }
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+static inline void kvmppc_save_tm_sprs(struct kvm_vcpu *vcpu)
+{
+   tm_enable();
+   vcpu->arch.tfhar = mfspr(SPRN_TFHAR);
+   vcpu->arch.texasr = mfspr(SPRN_TEXASR);
+   vcpu->arch.tfiar = mfspr(SPRN_TFIAR);
+   tm_disable();
+}
+
+static inline void kvmppc_restore_tm_sprs(struct kvm_vcpu *vcpu)
+{
+   tm_enable();
+   mtspr(SPRN_TFHAR, vcpu->arch.tfhar);
+   mtspr(SPRN_TEXASR, vcpu->arch.texasr);
+   mtspr(SPRN_TFIAR, vcpu->arch.tfiar);
+   tm_disable();
+}
+
+#endif
+
 static int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu)
 {
int r = 1; /* Indicate we want to get back into the guest */
-- 
1.8.3.1



[PATCH v3 13/29] KVM: PPC: Book3S PR: adds new kvmppc_copyto_vcpu_tm/kvmppc_copyfrom_vcpu_tm API for PR KVM.

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

This patch adds 2 new APIs: kvmppc_copyto_vcpu_tm() and
kvmppc_copyfrom_vcpu_tm(). These 2 APIs will be used to copy TM data
between the vcpu TM and vcpu areas.

PR KVM will use these APIs for treclaim. or trechkpt. emulation.

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/book3s_emulate.c | 41 +++
 1 file changed, 41 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 2eb457b..f81a921 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -87,6 +87,47 @@ static bool spr_allowed(struct kvm_vcpu *vcpu, enum priv_level level)
return true;
 }
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+static inline void kvmppc_copyto_vcpu_tm(struct kvm_vcpu *vcpu)
+{
+   memcpy(&vcpu->arch.gpr_tm[0], &vcpu->arch.regs.gpr[0],
+   sizeof(vcpu->arch.gpr_tm));
+   memcpy(&vcpu->arch.fp_tm, &vcpu->arch.fp,
+   sizeof(struct thread_fp_state));
+   memcpy(&vcpu->arch.vr_tm, &vcpu->arch.vr,
+   sizeof(struct thread_vr_state));
+   vcpu->arch.ppr_tm = vcpu->arch.ppr;
+   vcpu->arch.dscr_tm = vcpu->arch.dscr;
+   vcpu->arch.amr_tm = vcpu->arch.amr;
+   vcpu->arch.ctr_tm = vcpu->arch.regs.ctr;
+   vcpu->arch.tar_tm = vcpu->arch.tar;
+   vcpu->arch.lr_tm = vcpu->arch.regs.link;
+   vcpu->arch.cr_tm = vcpu->arch.cr;
+   vcpu->arch.xer_tm = vcpu->arch.regs.xer;
+   vcpu->arch.vrsave_tm = vcpu->arch.vrsave;
+}
+
+static inline void kvmppc_copyfrom_vcpu_tm(struct kvm_vcpu *vcpu)
+{
+   memcpy(&vcpu->arch.regs.gpr[0], &vcpu->arch.gpr_tm[0],
+   sizeof(vcpu->arch.regs.gpr));
+   memcpy(&vcpu->arch.fp, &vcpu->arch.fp_tm,
+   sizeof(struct thread_fp_state));
+   memcpy(&vcpu->arch.vr, &vcpu->arch.vr_tm,
+   sizeof(struct thread_vr_state));
+   vcpu->arch.ppr = vcpu->arch.ppr_tm;
+   vcpu->arch.dscr = vcpu->arch.dscr_tm;
+   vcpu->arch.amr = vcpu->arch.amr_tm;
+   vcpu->arch.regs.ctr = vcpu->arch.ctr_tm;
+   vcpu->arch.tar = vcpu->arch.tar_tm;
+   vcpu->arch.regs.link = vcpu->arch.lr_tm;
+   vcpu->arch.cr = vcpu->arch.cr_tm;
+   vcpu->arch.regs.xer = vcpu->arch.xer_tm;
+   vcpu->arch.vrsave = vcpu->arch.vrsave_tm;
+}
+
+#endif
+
 int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
  unsigned int inst, int *advance)
 {
-- 
1.8.3.1



[PATCH v3 12/29] KVM: PPC: Book3S PR: prevent TS bits change in kvmppc_interrupt_pr()

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

A PR KVM host usually runs with TM enabled in its host MSR value, and
with a non-transactional TS value.

When a guest with TM active traps into the PR KVM host, the rfid at
the tail of kvmppc_interrupt_pr() will try to switch the TS bits from
S0 (Suspended & TM disabled) to N1 (Non-transactional & TM enabled).

That will lead to a TM Bad Thing interrupt.

This patch manually keeps the target TS bits unchanged to avoid this
exception.
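
What the added two-instruction sequence computes (a commented sketch;
the real code is in the hunk below):

	mfmsr	r7			/* r7 = current MSR */
	rldicl	r7, r7, 64 - MSR_TS_S_LG, 62
					/* extract the 2-bit TS field
					 * into the low bits of r7 */
	rldimi	r6, r7, MSR_TS_S_LG, 63 - MSR_TS_T_LG
					/* insert it back into the target
					 * MSR in r6, so the rfi leaves
					 * MSR[TS] unchanged */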

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_segment.S | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index 93a180c..98ccc7e 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -383,6 +383,19 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 */
 
PPC_LL  r6, HSTATE_HOST_MSR(r13)
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   /*
+* We don't want to change MSR[TS] bits via rfi here.
+* The actual TM handling logic will be in host with
+* recovered DR/IR bits after HSTATE_VMHANDLER.
+* And MSR_TM can be enabled in HOST_MSR so rfid may
+* not suppress this change and can lead to exception.
+* Manually set MSR to prevent TS state change here.
+*/
+   mfmsr   r7
+   rldicl  r7, r7, 64 - MSR_TS_S_LG, 62
+   rldimi  r6, r7, MSR_TS_S_LG, 63 - MSR_TS_T_LG
+#endif
PPC_LL  r8, HSTATE_VMHANDLER(r13)
 
 #ifdef CONFIG_PPC64
-- 
1.8.3.1



[PATCH v3 11/29] KVM: PPC: Book3S PR: implement RFID TM behavior to suppress change from S0 to N0

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

According to the ISA specification for RFID, in MSR TM disabled and
TS suspended state (S0), if the target MSR is TM disabled and the TS
state is inactive (N0), rfid should suppress this update.

This patch makes the RFID emulation of PR KVM consistent with this.

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_emulate.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 68d6898..2eb457b 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -117,11 +117,28 @@ int kvmppc_core_emulate_op_pr(struct kvm_run *run, struct kvm_vcpu *vcpu,
case 19:
switch (get_xop(inst)) {
case OP_19_XOP_RFID:
-   case OP_19_XOP_RFI:
+   case OP_19_XOP_RFI: {
+   unsigned long srr1 = kvmppc_get_srr1(vcpu);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   unsigned long cur_msr = kvmppc_get_msr(vcpu);
+
+   /*
+* add rules to fit in ISA specification regarding TM
+* state transition in TM disable/Suspended state,
+* and target TM state is TM inactive(00) state. (the
+* change should be suppressed).
+*/
+   if (((cur_msr & MSR_TM) == 0) &&
+   ((srr1 & MSR_TM) == 0) &&
+   MSR_TM_SUSPENDED(cur_msr) &&
+   !MSR_TM_ACTIVE(srr1))
+   srr1 |= MSR_TS_S;
+#endif
kvmppc_set_pc(vcpu, kvmppc_get_srr0(vcpu));
-   kvmppc_set_msr(vcpu, kvmppc_get_srr1(vcpu));
+   kvmppc_set_msr(vcpu, srr1);
*advance = 0;
break;
+   }
 
default:
emulated = EMULATE_FAIL;
-- 
1.8.3.1



[PATCH v3 10/29] KVM: PPC: Book3S PR: Sync TM bits to shadow msr for problem state guest

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

MSR TS bits can be modified by non-privileged instructions like
tbegin./tend. That means the guest can change the MSR value "silently"
without notifying the host.

It is necessary to sync the TM bits to the host so that the host can
calculate the shadow MSR correctly.

Note that a privileged guest will always fail transactions, so we only
take care of problem state guests.

The logic is put into kvmppc_copy_from_svcpu() so that
kvmppc_handle_exit_pr() can use the correct MSR TM bits even under
preemption.

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/book3s_pr.c | 73 ++--
 1 file changed, 50 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index d3237f5..f2ae5a3 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -182,10 +182,36 @@ void kvmppc_copy_to_svcpu(struct kvm_vcpu *vcpu)
svcpu_put(svcpu);
 }
 
+static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
+{
+   ulong guest_msr = kvmppc_get_msr(vcpu);
+   ulong smsr = guest_msr;
+
+   /* Guest MSR values */
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE |
+   MSR_TM | MSR_TS_MASK;
+#else
+   smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE;
+#endif
+   /* Process MSR values */
+   smsr |= MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_PR | MSR_EE;
+   /* External providers the guest reserved */
+   smsr |= (guest_msr & vcpu->arch.guest_owned_ext);
+   /* 64-bit Process MSR values */
+#ifdef CONFIG_PPC_BOOK3S_64
+   smsr |= MSR_ISF | MSR_HV;
+#endif
+   vcpu->arch.shadow_msr = smsr;
+}
+
 /* Copy data touched by real-mode code from shadow vcpu back to vcpu */
 void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu)
 {
struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   ulong old_msr;
+#endif
 
/*
 * Maybe we were already preempted and synced the svcpu from
@@ -228,6 +254,30 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu)
to_book3s(vcpu)->vtb += get_vtb() - vcpu->arch.entry_vtb;
if (cpu_has_feature(CPU_FTR_ARCH_207S))
vcpu->arch.ic += mfspr(SPRN_IC) - vcpu->arch.entry_ic;
+
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   /*
+* Unlike other MSR bits, MSR[TS] bits can be changed at guest without
+* notifying host:
+*  modified by unprivileged instructions like "tbegin"/"tend"/
+* "tresume"/"tsuspend" in PR KVM guest.
+*
+* It is necessary to sync here to calculate a correct shadow_msr.
+*
+* privileged guest's tbegin will be failed at present. So we
+* only take care of problem state guest.
+*/
+   old_msr = kvmppc_get_msr(vcpu);
+   if (unlikely((old_msr & MSR_PR) &&
+   (vcpu->arch.shadow_srr1 & (MSR_TS_MASK)) !=
+   (old_msr & (MSR_TS_MASK)))) {
+   old_msr &= ~(MSR_TS_MASK);
+   old_msr |= (vcpu->arch.shadow_srr1 & (MSR_TS_MASK));
+   kvmppc_set_msr_fast(vcpu, old_msr);
+   kvmppc_recalc_shadow_msr(vcpu);
+   }
+#endif
+
svcpu->in_use = false;
 
 out:
@@ -306,29 +356,6 @@ static void kvm_set_spte_hva_pr(struct kvm *kvm, unsigned long hva, pte_t pte)
 
 /*/
 
-static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
-{
-   ulong guest_msr = kvmppc_get_msr(vcpu);
-   ulong smsr = guest_msr;
-
-   /* Guest MSR values */
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-   smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE |
-   MSR_TM | MSR_TS_MASK;
-#else
-   smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE;
-#endif
-   /* Process MSR values */
-   smsr |= MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_PR | MSR_EE;
-   /* External providers the guest reserved */
-   smsr |= (guest_msr & vcpu->arch.guest_owned_ext);
-   /* 64-bit Process MSR values */
-#ifdef CONFIG_PPC_BOOK3S_64
-   smsr |= MSR_ISF | MSR_HV;
-#endif
-   vcpu->arch.shadow_msr = smsr;
-}
-
 static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
 {
ulong old_msr = kvmppc_get_msr(vcpu);
-- 
1.8.3.1



[PATCH v3 09/29] KVM: PPC: Book3S PR: PR KVM pass through MSR TM/TS bits to shadow_msr.

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

PowerPC TM functionality needs MSR TM/TS bit support at the hardware
level. Guest TM functionality cannot be emulated with "fake" MSR (the
msr in the magic page) TS bits.

This patch syncs the TM/TS bits in shadow_msr with the MSR value in
the magic page, so that the MSR TS value which the guest sees is
consistent with the actual MSR bits running in the guest.

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_pr.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 67061d3..d3237f5 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -312,7 +312,12 @@ static void kvmppc_recalc_shadow_msr(struct kvm_vcpu *vcpu)
ulong smsr = guest_msr;
 
/* Guest MSR values */
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE |
+   MSR_TM | MSR_TS_MASK;
+#else
smsr &= MSR_FE0 | MSR_FE1 | MSR_SF | MSR_SE | MSR_BE | MSR_LE;
+#endif
/* Process MSR values */
smsr |= MSR_ME | MSR_RI | MSR_IR | MSR_DR | MSR_PR | MSR_EE;
/* External providers the guest reserved */
-- 
1.8.3.1



[PATCH v3 08/29] KVM: PPC: Book3S PR: In PR KVM suspends Transactional state when inject an interrupt.

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

This patch simulates interrupt behavior per the Power ISA while
injecting an interrupt in PR KVM:
- When an interrupt happens, the transactional state should be
  suspended.

kvmppc_mmu_book3s_64_reset_msr() will be invoked when injecting an
interrupt. This patch performs this ISA logic in
kvmppc_mmu_book3s_64_reset_msr().

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_64_mmu.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index a93d719..cf9d686 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -38,7 +38,16 @@
 
 static void kvmppc_mmu_book3s_64_reset_msr(struct kvm_vcpu *vcpu)
 {
-   kvmppc_set_msr(vcpu, vcpu->arch.intr_msr);
+   unsigned long msr = vcpu->arch.intr_msr;
+   unsigned long cur_msr = kvmppc_get_msr(vcpu);
+
+   /* If transactional, change to suspend mode on IRQ delivery */
+   if (MSR_TM_TRANSACTIONAL(cur_msr))
+   msr |= MSR_TS_S;
+   else
+   msr |= cur_msr & MSR_TS_MASK;
+
+   kvmppc_set_msr(vcpu, msr);
 }
 
 static struct kvmppc_slb *kvmppc_mmu_book3s_64_find_slbe(
-- 
1.8.3.1



[PATCH v3 07/29] KVM: PPC: Book3S PR: add C function wrapper for _kvmppc_save/restore_tm()

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Currently the kvmppc_save_tm()/kvmppc_restore_tm() helpers can only be
invoked from assembly. This patch adds C function wrappers
(_kvmppc_save_tm_pr()/_kvmppc_restore_tm_pr()) so that they can be
safely called from C code.
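
A sketch of the intended call pattern from PR KVM C code (this is how
later patches in this series use the wrappers):

	preempt_disable();
	_kvmppc_save_tm_pr(vcpu, mfmsr());
	preempt_enable();

	preempt_disable();
	_kvmppc_restore_tm_pr(vcpu, kvmppc_get_msr(vcpu));
	preempt_enable();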

Signed-off-by: Simon Guo 
---
 arch/powerpc/include/asm/asm-prototypes.h |  6 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  8 +--
 arch/powerpc/kvm/tm.S | 94 ++-
 3 files changed, 102 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index dfdcb23..5da683b 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -141,7 +141,13 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4,
 void pnv_power9_force_smt4_catch(void);
 void pnv_power9_force_smt4_release(void);
 
+/* Transaction memory related */
 void tm_enable(void);
 void tm_disable(void);
 void tm_abort(uint8_t cause);
+
+struct kvm_vcpu;
+void _kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr);
+void _kvmppc_save_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr);
+
 #endif /* _ASM_POWERPC_ASM_PROTOTYPES_H */
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6445d29..980df5f 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -795,7 +795,7 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
 */
mr  r3, r4
ld  r4, VCPU_MSR(r3)
-   bl  kvmppc_restore_tm
+   bl  __kvmppc_restore_tm
ld  r4, HSTATE_KVM_VCPU(r13)
 91:
 END_FTR_SECTION_IFSET(CPU_FTR_TM)
@@ -1783,7 +1783,7 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
 */
mr  r3, r9
ld  r4, VCPU_MSR(r3)
-   bl  kvmppc_save_tm
+   bl  __kvmppc_save_tm
ld  r9, HSTATE_KVM_VCPU(r13)
 91:
 #endif
@@ -2689,7 +2689,7 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
 */
ld  r3, HSTATE_KVM_VCPU(r13)
ld  r4, VCPU_MSR(r3)
-   bl  kvmppc_save_tm
+   bl  __kvmppc_save_tm
 91:
 #endif
 
@@ -2809,7 +2809,7 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
 */
mr  r3, r4
ld  r4, VCPU_MSR(r3)
-   bl  kvmppc_restore_tm
+   bl  __kvmppc_restore_tm
ld  r4, HSTATE_KVM_VCPU(r13)
 91:
 #endif
diff --git a/arch/powerpc/kvm/tm.S b/arch/powerpc/kvm/tm.S
index b7057d5..42a7cd8 100644
--- a/arch/powerpc/kvm/tm.S
+++ b/arch/powerpc/kvm/tm.S
@@ -33,7 +33,7 @@
  * This can modify all checkpointed registers, but
  * restores r1, r2 before exit.
  */
-_GLOBAL(kvmppc_save_tm)
+_GLOBAL(__kvmppc_save_tm)
	mflr	r0
	std	r0, PPC_LR_STKOFF(r1)
	stdu	r1, -PPC_MIN_STKFRM(r1)
@@ -210,6 +210,52 @@ END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
blr
 
 /*
+ * _kvmppc_save_tm_pr() is a wrapper around __kvmppc_save_tm(), so that it can
+ * be invoked from C function by PR KVM only.
+ */
+_GLOBAL(_kvmppc_save_tm_pr)
+	mflr	r5
+	std	r5, PPC_LR_STKOFF(r1)
+	stdu	r1, -SWITCH_FRAME_SIZE(r1)
+	SAVE_NVGPRS(r1)
+
+	/* save MSR since TM/math bits might be impacted
+	 * by __kvmppc_save_tm().
+	 */
+	mfmsr	r5
+	SAVE_GPR(5, r1)
+
+	/* also save DSCR/CR so that it can be recovered later */
+	mfspr	r6, SPRN_DSCR
+	SAVE_GPR(6, r1)
+
+	mfcr	r7
+	stw	r7, _CCR(r1)
+
+	bl	__kvmppc_save_tm
+
+	ld	r7, _CCR(r1)
+	mtcr	r7
+
+	REST_GPR(6, r1)
+	mtspr	SPRN_DSCR, r6
+
+	/* need to preserve current MSR's MSR_TS bits */
+	REST_GPR(5, r1)
+	mfmsr	r6
+	rldicl	r6, r6, 64 - MSR_TS_S_LG, 62
+	rldimi	r5, r6, MSR_TS_S_LG, 63 - MSR_TS_T_LG
+	mtmsrd	r5
+
+	REST_NVGPRS(r1)
+	addi	r1, r1, SWITCH_FRAME_SIZE
+	ld	r5, PPC_LR_STKOFF(r1)
+	mtlr	r5
+	blr
+
+EXPORT_SYMBOL_GPL(_kvmppc_save_tm_pr);
+
+/*
  * Restore transactional state and TM-related registers.
  * Called with:
  *  - r3 pointing to the vcpu struct.
@@ -219,7 +265,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
  * This potentially modifies all checkpointed registers.
  * It restores r1, r2 from the PACA.
  */
-_GLOBAL(kvmppc_restore_tm)
+_GLOBAL(__kvmppc_restore_tm)
	mflr	r0
std r0, PPC_LR_STKOFF(r1)
 
@@ -362,4 +408,48 @@ END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
	addi	r1, r1, PPC_MIN_STKFRM
b   9b
 #endif
+
+/*
+ * _kvmppc_restore_tm_pr() is a wrapper around __kvmppc_restore_tm(), so that it
+ * can be invoked from C function by PR KVM only.
+ */
+_GLOBAL(_kvmppc_restore_tm_pr)
+	mflr	r5
+	std	r5, PPC_LR_STKOFF(r1)
+	stdu	r1, -SWITCH_FRAME_SIZE(r1)
+	SAVE_NVGPRS(r1)
+
+  

[PATCH v3 06/29] KVM: PPC: Book3S PR: turn on FP/VSX/VMX MSR bits in kvmppc_save_tm()

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

kvmppc_save_tm() invokes store_fp_state()/store_vr_state(). So it is
mandatory to turn on the FP/VSX/VMX MSR bits for its execution, just
like kvmppc_restore_tm() does.

Previously HV KVM turned these bits on outside of kvmppc_save_tm().
Now we include this bit change in kvmppc_save_tm() so that the logic
is cleaner, and PR KVM can reuse it later.

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kvm/tm.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/tm.S b/arch/powerpc/kvm/tm.S
index cbe608a..b7057d5 100644
--- a/arch/powerpc/kvm/tm.S
+++ b/arch/powerpc/kvm/tm.S
@@ -42,6 +42,8 @@ _GLOBAL(kvmppc_save_tm)
	mfmsr	r8
	li	r0, 1
	rldimi	r8, r0, MSR_TM_LG, 63-MSR_TM_LG
+	ori	r8, r8, MSR_FP
+	oris	r8, r8, (MSR_VEC | MSR_VSX)@h
	mtmsrd	r8
 
rldicl. r4, r4, 64 - MSR_TS_S_LG, 62
-- 
1.8.3.1



[PATCH v3 05/29] KVM: PPC: Book3S PR: add new parameter (guest MSR) for kvmppc_save_tm()/kvmppc_restore_tm()

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

HV KVM and PR KVM need different MSR sources to indicate whether
treclaim. or trecheckpoint. is necessary.

This patch adds a new parameter (guest MSR) to these kvmppc_save_tm()/
kvmppc_restore_tm() APIs:
- For HV KVM, it is VCPU_MSR
- For PR KVM, it is the current host MSR or VCPU_SHADOW_SRR1

This enhancement enables these 2 APIs to be reused by PR KVM later,
and the patch keeps the HV KVM logic unchanged.

This patch also reworks kvmppc_save_tm()/kvmppc_restore_tm() to
have a clean ABI: r3 for vcpu and r4 for guest_msr.

During kvmppc_save_tm()/kvmppc_restore_tm(), r1 needs to be saved
and restored. Currently r1 is saved into HSTATE_HOST_R1. In PR
KVM, we are going to add a C function wrapper for
kvmppc_save_tm()/kvmppc_restore_tm() where r1 will be incremented
by an added stack frame and saved into HSTATE_HOST_R1. There are
several places in HV KVM that load HSTATE_HOST_R1 as r1, and we
don't want the TM code to bring risk or confusion to them.

This patch uses HSTATE_SCRATCH2 to save/restore r1 in
kvmppc_save_tm()/kvmppc_restore_tm() to avoid future confusion,
since r1 is really a temporary/scratch value being saved/restored.
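
Seen from C, the reworked ABI corresponds to the following prototypes
(a sketch for clarity; in this patch the functions are still only
reachable from assembly):

	/* r3 = vcpu, r4 = guest MSR whose TS bits decide whether
	 * treclaim./trecheckpoint. is necessary */
	void kvmppc_save_tm(struct kvm_vcpu *vcpu, u64 guest_msr);
	void kvmppc_restore_tm(struct kvm_vcpu *vcpu, u64 guest_msr);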

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 13 +-
 arch/powerpc/kvm/tm.S   | 74 -
 2 files changed, 49 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 4db2b10..6445d29 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -793,8 +793,12 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
/*
 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 */
+   mr  r3, r4
+   ld  r4, VCPU_MSR(r3)
bl  kvmppc_restore_tm
+   ld  r4, HSTATE_KVM_VCPU(r13)
 91:
+END_FTR_SECTION_IFSET(CPU_FTR_TM)
 #endif
 
/* Load guest PMU registers */
@@ -1777,7 +1781,10 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
/*
 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 */
+   mr  r3, r9
+   ld  r4, VCPU_MSR(r3)
bl  kvmppc_save_tm
+   ld  r9, HSTATE_KVM_VCPU(r13)
 91:
 #endif
 
@@ -2680,7 +2687,8 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
/*
 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 */
-   ld  r9, HSTATE_KVM_VCPU(r13)
+   ld  r3, HSTATE_KVM_VCPU(r13)
+   ld  r4, VCPU_MSR(r3)
bl  kvmppc_save_tm
 91:
 #endif
@@ -2799,7 +2807,10 @@ END_FTR_SECTION(CPU_FTR_TM | CPU_FTR_P9_TM_HV_ASSIST, 0)
/*
 * NOTE THAT THIS TRASHES ALL NON-VOLATILE REGISTERS INCLUDING CR
 */
+   mr  r3, r4
+   ld  r4, VCPU_MSR(r3)
bl  kvmppc_restore_tm
+   ld  r4, HSTATE_KVM_VCPU(r13)
 91:
 #endif
 
diff --git a/arch/powerpc/kvm/tm.S b/arch/powerpc/kvm/tm.S
index e79b373..cbe608a 100644
--- a/arch/powerpc/kvm/tm.S
+++ b/arch/powerpc/kvm/tm.S
@@ -26,9 +26,12 @@
 
 /*
  * Save transactional state and TM-related registers.
- * Called with r9 pointing to the vcpu struct.
+ * Called with:
+ * - r3 pointing to the vcpu struct
+ * - r4 points to the MSR with current TS bits:
+ * (For HV KVM, it is VCPU_MSR ; For PR KVM, it is host MSR).
  * This can modify all checkpointed registers, but
- * restores r1, r2 and r9 (vcpu pointer) before exit.
+ * restores r1, r2 before exit.
  */
 _GLOBAL(kvmppc_save_tm)
	mflr	r0
@@ -41,14 +44,11 @@ _GLOBAL(kvmppc_save_tm)
rldimi  r8, r0, MSR_TM_LG, 63-MSR_TM_LG
mtmsrd  r8
 
-#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
-   ld  r5, VCPU_MSR(r9)
-   rldicl. r5, r5, 64 - MSR_TS_S_LG, 62
+   rldicl. r4, r4, 64 - MSR_TS_S_LG, 62
beq 1f  /* TM not active in guest. */
-#endif
 
-   std r1, HSTATE_HOST_R1(r13)
-   li  r3, TM_CAUSE_KVM_RESCHED
+   std r1, HSTATE_SCRATCH2(r13)
+   std r3, HSTATE_SCRATCH1(r13)
 
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
 BEGIN_FTR_SECTION
@@ -65,7 +65,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_P9_TM_XER_SO_BUG, CPU_FTR_P9_TM_XER_SO_BUG, 96)
 3:
/* Emulation of the treclaim instruction needs TEXASR before treclaim */
mfspr   r6, SPRN_TEXASR
-   std r6, VCPU_ORIG_TEXASR(r9)
+   std r6, VCPU_ORIG_TEXASR(r3)
 6:
 END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
 #endif
@@ -74,6 +74,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
li  r5, 0
mtmsrd  r5, 1
 
+   li  r3, TM_CAUSE_KVM_RESCHED
+
/* All GPRs are volatile at this point. */
TRECLAIM(R3)
 
@@ -94,7 +96,7 @@ BEGIN_FTR_SECTION
 * we already have it), therefore we can now use any volatile GPR.
 */
/* Reload stack pointer and TOC. */
-   ld  r1, HSTATE_HOST_R1(r13)
+

[PATCH v3 04/29] KVM: PPC: Book3S PR: Move kvmppc_save_tm/kvmppc_restore_tm to separate file

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

It is a simple patch, just moving the kvmppc_save_tm()/kvmppc_restore_tm()
functionality to tm.S. There is no logic change. The restructuring of
those APIs will be done in later patches to improve readability.

This is in preparation for reusing those APIs in both HV and PR PPC KVM.

Some slight changes made while moving the functions include:
- surrounding some HV KVM specific code with CONFIG_KVM_BOOK3S_HV_POSSIBLE
for compilation.
- using _GLOBAL() to define kvmppc_save_tm()/kvmppc_restore_tm()

Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/Makefile   |   3 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 322 
 arch/powerpc/kvm/tm.S   | 363 
 3 files changed, 366 insertions(+), 322 deletions(-)
 create mode 100644 arch/powerpc/kvm/tm.S

diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 4b19da8..f872c04 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -63,6 +63,9 @@ kvm-pr-y := \
book3s_64_mmu.o \
book3s_32_mmu.o
 
+kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
+   tm.o
+
 ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
book3s_rmhandlers.o
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 5e6e493..4db2b10 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -39,8 +39,6 @@ BEGIN_FTR_SECTION;\
extsw   reg, reg;   \
 END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
 
-#define VCPU_GPRS_TM(reg) (((reg) * ULONG_SIZE) + VCPU_GPR_TM)
-
 /* Values in HSTATE_NAPPING(r13) */
 #define NAPPING_CEDE   1
 #define NAPPING_NOVCPU 2
@@ -3119,326 +3117,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
mr  r4,r31
blr
 
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-/*
- * Save transactional state and TM-related registers.
- * Called with r9 pointing to the vcpu struct.
- * This can modify all checkpointed registers, but
- * restores r1, r2 and r9 (vcpu pointer) before exit.
- */
-kvmppc_save_tm:
-	mflr	r0
-	std	r0, PPC_LR_STKOFF(r1)
-	stdu	r1, -PPC_MIN_STKFRM(r1)
-
-   /* Turn on TM. */
-   mfmsr   r8
-   li  r0, 1
-   rldimi  r8, r0, MSR_TM_LG, 63-MSR_TM_LG
-   mtmsrd  r8
-
-   ld  r5, VCPU_MSR(r9)
-   rldicl. r5, r5, 64 - MSR_TS_S_LG, 62
-   beq 1f  /* TM not active in guest. */
-
-   std r1, HSTATE_HOST_R1(r13)
-   li  r3, TM_CAUSE_KVM_RESCHED
-
-BEGIN_FTR_SECTION
-   lbz r0, HSTATE_FAKE_SUSPEND(r13) /* Were we fake suspended? */
-   cmpwi   r0, 0
-   beq 3f
-   rldicl. r8, r8, 64 - MSR_TS_S_LG, 62 /* Did we actually hrfid? */
-   beq 4f
-BEGIN_FTR_SECTION_NESTED(96)
-   bl  pnv_power9_force_smt4_catch
-END_FTR_SECTION_NESTED(CPU_FTR_P9_TM_XER_SO_BUG, CPU_FTR_P9_TM_XER_SO_BUG, 96)
-   nop
-   b   6f
-3:
-   /* Emulation of the treclaim instruction needs TEXASR before treclaim */
-   mfspr   r6, SPRN_TEXASR
-   std r6, VCPU_ORIG_TEXASR(r9)
-6:
-END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
-
-   /* Clear the MSR RI since r1, r13 are all going to be foobar. */
-   li  r5, 0
-   mtmsrd  r5, 1
-
-   /* All GPRs are volatile at this point. */
-   TRECLAIM(R3)
-
-   /* Temporarily store r13 and r9 so we have some regs to play with */
-   SET_SCRATCH0(r13)
-   GET_PACA(r13)
-   std r9, PACATMSCRATCH(r13)
-
-   /* If doing TM emulation on POWER9 DD2.2, check for fake suspend mode */
-BEGIN_FTR_SECTION
-   lbz r9, HSTATE_FAKE_SUSPEND(r13)
-   cmpwi   r9, 0
-   beq 2f
-   /*
-* We were in fake suspend, so we are not going to save the
-* register state as the guest checkpointed state (since
-* we already have it), therefore we can now use any volatile GPR.
-*/
-   /* Reload stack pointer and TOC. */
-   ld  r1, HSTATE_HOST_R1(r13)
-   ld  r2, PACATOC(r13)
-   /* Set MSR RI now we have r1 and r13 back. */
-   li  r5, MSR_RI
-   mtmsrd  r5, 1
-   HMT_MEDIUM
-   ld  r6, HSTATE_DSCR(r13)
-   mtspr   SPRN_DSCR, r6
-BEGIN_FTR_SECTION_NESTED(96)
-   bl  pnv_power9_force_smt4_release
-END_FTR_SECTION_NESTED(CPU_FTR_P9_TM_XER_SO_BUG, CPU_FTR_P9_TM_XER_SO_BUG, 96)
-   nop
-
-4:
-   mfspr   r3, SPRN_PSSCR
-   /* PSSCR_FAKE_SUSPEND is a write-only bit, but clear it anyway */
-   li  r0, PSSCR_FAKE_SUSPEND
-	andc	r3, r3, r0
-   mtspr   SPRN_PSSCR, r3
-   ld  r9, HSTATE_KVM_VCPU(r13)
-   /* Don't save TEXASR, use value from last exit in real suspend state */
-   b   11f
-2:
-END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_HV_ASSIST)
-
-   ld  r9, 

[PATCH v3 03/29] powerpc: export tm_enable()/tm_disable/tm_abort() APIs

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

This patch exports the tm_enable()/tm_disable()/tm_abort() APIs, which
will be used by the PR KVM transactional memory logic.

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/include/asm/asm-prototypes.h |  3 +++
 arch/powerpc/include/asm/tm.h |  2 --
 arch/powerpc/kernel/tm.S  | 12 
 arch/powerpc/mm/hash_utils_64.c   |  1 +
 4 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index d9713ad..dfdcb23 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -141,4 +141,7 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4,
 void pnv_power9_force_smt4_catch(void);
 void pnv_power9_force_smt4_release(void);
 
+void tm_enable(void);
+void tm_disable(void);
+void tm_abort(uint8_t cause);
 #endif /* _ASM_POWERPC_ASM_PROTOTYPES_H */
diff --git a/arch/powerpc/include/asm/tm.h b/arch/powerpc/include/asm/tm.h
index b1658c9..e94f6db 100644
--- a/arch/powerpc/include/asm/tm.h
+++ b/arch/powerpc/include/asm/tm.h
@@ -10,12 +10,10 @@
 
 #ifndef __ASSEMBLY__
 
-extern void tm_enable(void);
 extern void tm_reclaim(struct thread_struct *thread,
   uint8_t cause);
 extern void tm_reclaim_current(uint8_t cause);
 extern void tm_recheckpoint(struct thread_struct *thread);
-extern void tm_abort(uint8_t cause);
 extern void tm_save_sprs(struct thread_struct *thread);
 extern void tm_restore_sprs(struct thread_struct *thread);
 
diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
index b92ac8e..ff12f47 100644
--- a/arch/powerpc/kernel/tm.S
+++ b/arch/powerpc/kernel/tm.S
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_VSX
 /* See fpu.S, this is borrowed from there */
@@ -55,6 +56,16 @@ _GLOBAL(tm_enable)
or  r4, r4, r3
mtmsrd  r4
 1: blr
+EXPORT_SYMBOL_GPL(tm_enable);
+
+_GLOBAL(tm_disable)
+   mfmsr   r4
+   li  r3, MSR_TM >> 32
+	sldi	r3, r3, 32
+	andc	r4, r4, r3
+   mtmsrd  r4
+   blr
+EXPORT_SYMBOL_GPL(tm_disable);
 
 _GLOBAL(tm_save_sprs)
mfspr   r0, SPRN_TFHAR
@@ -78,6 +89,7 @@ _GLOBAL(tm_restore_sprs)
 _GLOBAL(tm_abort)
TABORT(R3)
blr
+EXPORT_SYMBOL_GPL(tm_abort);
 
 /* void tm_reclaim(struct thread_struct *thread,
 *		  uint8_t cause)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0bd3790..1bd8b4c1 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -64,6 +64,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef DEBUG
 #define DBG(fmt...) udbg_printf(fmt)
-- 
1.8.3.1



[PATCH v3 02/29] powerpc: add TEXASR related macros

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

This patch adds some macros for CR0/TEXASR bits so that the PR KVM TM
logic (tbegin./treclaim./tabort.) can make use of them later.
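
As a rough sketch of the intended use (the 'regs' failure-recording
context is an assumption here; the emulation code itself comes in later
patches), the macros compose like this:

	/* Record a failure summary and report tbegin. failure in CR0 */
	u64 texasr = mfspr(SPRN_TEXASR);

	if (!(texasr & TEXASR_FS))		/* no failure recorded yet */
		mtspr(SPRN_TEXASR, texasr | TEXASR_FS);

	regs->ccr &= ~(CR0_MASK << CR0_SHIFT);
	regs->ccr |= CR0_TBEGIN_FAILURE;	/* 0b0010 in the CR0 field */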

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/include/asm/reg.h  | 32 +++--
 arch/powerpc/platforms/powernv/copy-paste.h |  3 +--
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 44b2be4..5625684 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -146,6 +146,12 @@
 #define MSR_64BIT  0
 #endif
 
+/* Condition Register related */
+#define CR0_SHIFT  28
+#define CR0_MASK   0xF
+#define CR0_TBEGIN_FAILURE (0x2 << 28) /* 0b0010 */
+
+
 /* Power Management - Processor Stop Status and Control Register Fields */
 #define PSSCR_RL_MASK  0x000F /* Requested Level */
 #define PSSCR_MTL_MASK 0x00F0 /* Maximum Transition Level */
@@ -239,13 +245,27 @@
 #define SPRN_TFIAR	0x81	/* Transaction Failure Inst Addr   */
 #define SPRN_TEXASR	0x82	/* Transaction EXception & Summary */
 #define SPRN_TEXASRU	0x83	/* ''	   ''	   ''	 Upper 32  */
-#define   TEXASR_ABORT __MASK(63-31) /* terminated by tabort or treclaim */
-#define   TEXASR_SUSP  __MASK(63-32) /* tx failed in suspended state */
-#define   TEXASR_HV__MASK(63-34) /* MSR[HV] when failure occurred */
-#define   TEXASR_PR__MASK(63-35) /* MSR[PR] when failure occurred */
-#define   TEXASR_FS__MASK(63-36) /* TEXASR Failure Summary */
-#define   TEXASR_EXACT __MASK(63-37) /* TFIAR value is exact */
+
+#define TEXASR_FC_LG   (63 - 7)/* Failure Code */
+#define TEXASR_AB_LG   (63 - 31)   /* Abort */
+#define TEXASR_SU_LG   (63 - 32)   /* Suspend */
+#define TEXASR_HV_LG   (63 - 34)   /* Hypervisor state*/
+#define TEXASR_PR_LG   (63 - 35)   /* Privilege level */
+#define TEXASR_FS_LG   (63 - 36)   /* failure summary */
+#define TEXASR_EX_LG   (63 - 37)   /* TFIAR exact bit */
+#define TEXASR_ROT_LG  (63 - 38)   /* ROT bit */
+
+#define   TEXASR_ABORT	__MASK(TEXASR_AB_LG) /* terminated by tabort or treclaim */
+#define   TEXASR_SUSP  __MASK(TEXASR_SU_LG) /* tx failed in suspended state */
+#define   TEXASR_HV__MASK(TEXASR_HV_LG) /* MSR[HV] when failure occurred */
+#define   TEXASR_PR__MASK(TEXASR_PR_LG) /* MSR[PR] when failure occurred */
+#define   TEXASR_FS__MASK(TEXASR_FS_LG) /* TEXASR Failure Summary */
+#define   TEXASR_EXACT __MASK(TEXASR_EX_LG) /* TFIAR value is exact */
+#define   TEXASR_ROT   __MASK(TEXASR_ROT_LG)
+#define   TEXASR_FC	(ASM_CONST(0xFF) << TEXASR_FC_LG)
+
 #define SPRN_TFHAR	0x80	/* Transaction Failure Handler Addr */
+
 #define SPRN_TIDR  144 /* Thread ID register */
 #define SPRN_CTRLF 0x088
 #define SPRN_CTRLT 0x098
diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
index c9a5036..3fa62de 100644
--- a/arch/powerpc/platforms/powernv/copy-paste.h
+++ b/arch/powerpc/platforms/powernv/copy-paste.h
@@ -7,9 +7,8 @@
  * 2 of the License, or (at your option) any later version.
  */
 #include 
+#include 
 
-#define CR0_SHIFT  28
-#define CR0_MASK   0xF
 /*
  * Copy/paste instructions:
  *
-- 
1.8.3.1



[PATCH v3 01/29] powerpc: export symbol msr_check_and_set().

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

PR KVM will need to reuse msr_check_and_set().
This patch exports this API for reuse.
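
A minimal example of the intended reuse (the call site below is an
assumption for illustration; only the export itself is in this patch):

	/* Make FP/VMX/VSX usable before touching guest math state */
	unsigned long oldmsr = msr_check_and_set(MSR_FP | MSR_VEC | MSR_VSX);

	/* ... save/restore checkpointed FP/VMX/VSX registers ... */

	if (!(oldmsr & MSR_VSX))	/* drop bits we turned on ourselves */
		__msr_check_and_clear(MSR_VSX);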

Signed-off-by: Simon Guo 
Reviewed-by: Paul Mackerras 
---
 arch/powerpc/kernel/process.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1237f13..25db000 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -154,6 +154,7 @@ unsigned long msr_check_and_set(unsigned long bits)
 
return newmsr;
 }
+EXPORT_SYMBOL_GPL(msr_check_and_set);
 
 void __msr_check_and_clear(unsigned long bits)
 {
-- 
1.8.3.1



[PATCH v3 00/29] KVM: PPC: Book3S PR: Transaction memory support on PR KVM

2018-05-20 Thread wei . guo . simon
From: Simon Guo 

Nowadays, many OS distributions make use of transactional
memory functionality. On PowerPC, HV KVM supports TM, but PR KVM
does not.

The driver for transactional memory support in PR KVM is the
OpenStack Continuous Integration testing - it runs an HV (hypervisor)
KVM (as level 1) and then runs PR KVM (as level 2) on top of that.

This patch set adds transactional memory support to PR KVM.

v2 -> v3 changes:
1) rebase onto Paul's kvm-ppc-next branch, which required reworking the
KVM_CHECK_EXTENSION ioctl (patch #25) a little bit.
2) allow mtspr TFHAR in TM suspend state
3) remove patch: 
  "KVM: PPC: add KVM_SET_ONE_REG/KVM_GET_ONE_REG to async ioctl"
4) some minor rework per comments

v1 -> v2 changes:
1. Correct a bug in trechkpt emulation: the TM SPRs need to be
flushed to the vcpu before trechkpt.
2. add PR KVM ioctl functionalities for TM.
3. removed save_msr_tm and use kvmppc_get_msr() to determine
whether a transaction state needs to be restored.
4. Remove the "KVM: PPC: Book3S PR: set MSR HV bit accordingly
for PPC970 and others." patch.
It would prevent PR KVM from starting as an L1 hypervisor. If
we set the HV bit to 0 when rfid'ing to the guest (which is supposed
to run at HV=1 && PR=1), the guest will not be able to access
its original memory.
The original code always sets HV bits for shadow_msr; this is
benign since:
HV bits can only be altered by the sc instruction; they can only
be set to 0 by the rfid/hrfid instructions.
We return to the guest with rfid. So:
* if KVM is running as an L1 hypervisor, the guest physical MSR
expects HV=1.
* if KVM is running as an L2 hypervisor, rfid cannot update
HV to 1, so HV is still 0.
5. add XER register implementation to
kvmppc_copyto_vcpu_tm()/kvmppc_copyfrom_vcpu_tm()
6. remove unnecessary stack frame in _kvmppc_save/restore_tm().
7. move MSR bits sync into kvmppc_copy_from_svcpu() so that
we never see an inconsistent shadow_msr/kvmppc_get_msr() pair,
even under preemption.
8. do failure recording in treclaim emulation when TEXASR_FS
is 0.


Test cases performed:
linux/tools/testing/selftests/powerpc/tm/tm-syscall
linux/tools/testing/selftests/powerpc/tm/tm-fork
linux/tools/testing/selftests/powerpc/tm/tm-vmx-unavail
linux/tools/testing/selftests/powerpc/tm/tm-tmspr
linux/tools/testing/selftests/powerpc/tm/tm-signal-msr-resv
linux/tools/testing/selftests/powerpc/math/vsx_preempt
linux/tools/testing/selftests/powerpc/math/fpu_signal
linux/tools/testing/selftests/powerpc/math/vmx_preempt
linux/tools/testing/selftests/powerpc/math/fpu_syscall
linux/tools/testing/selftests/powerpc/math/vmx_syscall
linux/tools/testing/selftests/powerpc/math/fpu_preempt
linux/tools/testing/selftests/powerpc/math/vmx_signal
linux/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr
linux/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-gpr
linux/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx
linux/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr
linux/tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx
https://github.com/justdoitqd/publicFiles/blob/master/test_tbegin_pr.c
https://github.com/justdoitqd/publicFiles/blob/master/test_tabort.c
https://github.com/justdoitqd/publicFiles/blob/master/test_kvm_htm_cap.c
https://github.com/justdoitqd/publicFiles/blob/master/test-tm-mig.c

Simon Guo (29):
  powerpc: export symbol msr_check_and_set().
  powerpc: add TEXASR related macros
  powerpc: export tm_enable()/tm_disable/tm_abort() APIs
  KVM: PPC: Book3S PR: Move kvmppc_save_tm/kvmppc_restore_tm to separate
file
  KVM: PPC: Book3S PR: add new parameter (guest MSR) for
kvmppc_save_tm()/kvmppc_restore_tm()
  KVM: PPC: Book3S PR: turn on FP/VSX/VMX MSR bits in kvmppc_save_tm()
  KVM: PPC: Book3S PR: add C function wrapper for
_kvmppc_save/restore_tm()
  KVM: PPC: Book3S PR: suspend Transactional state when injecting an
interrupt.
  KVM: PPC: Book3S PR: PR KVM pass through MSR TM/TS bits to shadow_msr.
  KVM: PPC: Book3S PR: Sync TM bits to shadow msr for problem state
guest
  KVM: PPC: Book3S PR: implement RFID TM behavior to suppress change
from S0 to N0
  KVM: PPC: Book3S PR: prevent TS bits change in kvmppc_interrupt_pr()
  KVM: PPC: Book3S PR: adds new
kvmppc_copyto_vcpu_tm/kvmppc_copyfrom_vcpu_tm API for PR KVM.
  KVM: PPC: Book3S PR: add kvmppc_save/restore_tm_sprs() APIs
  KVM: PPC: Book3S PR: add transaction memory save/restore skeleton for
PR KVM
  KVM: PPC: Book3S PR: add math support for PR KVM HTM
  KVM: PPC: Book3S PR: make mtspr/mfspr emulation behavior based on
active TM SPRs
  KVM: PPC: Book3S PR: always fail transaction in guest privilege state
  KVM: PPC: Book3S PR: enable NV reg restore for reading TM SPR at guest
privilege state
  KVM: PPC: Book3S PR: adds emulation for treclaim.
  KVM: PPC: Book3S PR: add emulation for trechkpt in PR KVM.
  KVM: PPC: Book3S PR: add emulation for tabort. for privilege guest
  KVM: PPC: Book3S PR: add guard code to prevent returning to guest with
PR=0 and 

[PATCH 3/3] powerpc/sstep: Fix emulate_step test if VSX not present

2018-05-20 Thread Ravi Bangoria
emulate_step() tests are failing if VSX is not supported or disabled.

  emulate_step_test: lxvd2x : FAIL
  emulate_step_test: stxvd2x: FAIL

If !CPU_FTR_VSX, emulate_step() failure is expected and the testcase
should PASS with a valid justification. After the patch:

  emulate_step_test: lxvd2x : PASS (!CPU_FTR_VSX)
  emulate_step_test: stxvd2x: PASS (!CPU_FTR_VSX)

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/test_emulate_step.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
index 2534c1447554..6c47daa61614 100644
--- a/arch/powerpc/lib/test_emulate_step.c
+++ b/arch/powerpc/lib/test_emulate_step.c
@@ -387,10 +387,14 @@ static void __init test_lxvd2x_stxvd2x(void)
/* lxvd2x vsr39, r3, r4 */
	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
 
-   if (stepped == 1)
+   if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
show_result("lxvd2x", "PASS");
-   else
-   show_result("lxvd2x", "FAIL");
+   } else {
+   if (!cpu_has_feature(CPU_FTR_VSX))
+   show_result("lxvd2x", "PASS (!CPU_FTR_VSX)");
+   else
+   show_result("lxvd2x", "FAIL");
+   }
 
 
/*** stxvd2x ***/
@@ -404,10 +408,15 @@ static void __init test_lxvd2x_stxvd2x(void)
	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
 
if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
-   cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
+   cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
+   cpu_has_feature(CPU_FTR_VSX)) {
show_result("stxvd2x", "PASS");
-   else
-   show_result("stxvd2x", "FAIL");
+   } else {
+   if (!cpu_has_feature(CPU_FTR_VSX))
+   show_result("stxvd2x", "PASS (!CPU_FTR_VSX)");
+   else
+   show_result("stxvd2x", "FAIL");
+   }
 }
 #else
 static void __init test_lxvd2x_stxvd2x(void)
-- 
2.16.2



[PATCH 2/3] powerpc/sstep: Fix kernel crash if VSX is not present

2018-05-20 Thread Ravi Bangoria
emulate_step() does not check the runtime VSX feature flag before
emulating an instruction. This causes a kernel crash when the kernel is
compiled with CONFIG_VSX=y but runs on a machine where VSX is not
supported or is disabled. For example, while running the emulate_step
tests on a P6 machine:

  Oops: Exception in kernel mode, sig: 4 [#1]
  NIP [c0095c24] .load_vsrn+0x28/0x54
  LR [c0094bdc] .emulate_loadstore+0x167c/0x17b0
  Call Trace:
0x40fe240c7ae147ae (unreliable)
.emulate_loadstore+0x167c/0x17b0
.emulate_step+0x25c/0x5bc
.test_lxvd2x_stxvd2x+0x64/0x154
.test_emulate_step+0x38/0x4c
.do_one_initcall+0x5c/0x2c0
.kernel_init_freeable+0x314/0x4cc
.kernel_init+0x24/0x160
.ret_from_kernel_thread+0x58/0xb4

With fix:
  emulate_step_test: lxvd2x : FAIL
  emulate_step_test: stxvd2x: FAIL

Fixes: https://github.com/linuxppc/linux/issues/148

Reported-by: Michael Ellerman 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/sstep.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index db6bba259d91..23b7ddf04521 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2544,6 +2544,15 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 #endif /* __powerpc64__ */
 
}
+
+#ifdef CONFIG_VSX
+   if ((GETTYPE(op->type) == LOAD_VSX ||
+GETTYPE(op->type) == STORE_VSX) &&
+   !cpu_has_feature(CPU_FTR_VSX)) {
+   return -1;
+   }
+#endif /* CONFIG_VSX */
+
return 0;
 
  logical_done:
-- 
2.16.2



[PATCH 1/3] powerpc/sstep: Introduce GETTYPE macro

2018-05-20 Thread Ravi Bangoria
Replace the 'op->type & INSTR_TYPE_MASK' expression with the
GETTYPE(op->type) macro.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/sstep.h | 2 ++
 arch/powerpc/kernel/align.c  | 2 +-
 arch/powerpc/lib/sstep.c | 6 +++---
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
index ab9d849644d0..9a2dfee26f9f 100644
--- a/arch/powerpc/include/asm/sstep.h
+++ b/arch/powerpc/include/asm/sstep.h
@@ -97,6 +97,8 @@ enum instruction_type {
 #define SIZE(n)	((n) << 12)
 #define GETSIZE(w) ((w) >> 12)
 
+#define GETTYPE(t) ((t) & INSTR_TYPE_MASK)
+
 #define MKOP(t, f, s)  ((t) | (f) | SIZE(s))
 
 struct instruction_op {
diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
index 3e6c0744c174..11550a3d1ac2 100644
--- a/arch/powerpc/kernel/align.c
+++ b/arch/powerpc/kernel/align.c
@@ -339,7 +339,7 @@ int fix_alignment(struct pt_regs *regs)
if (r < 0)
return -EINVAL;
 
-   type = op.type & INSTR_TYPE_MASK;
+   type = GETTYPE(op.type);
if (!OP_IS_LOAD_STORE(type)) {
if (op.type != CACHEOP + DCBZ)
return -EINVAL;
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 34d68f1b1b40..db6bba259d91 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2641,7 +2641,7 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
unsigned long next_pc;
 
next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
-   switch (op->type & INSTR_TYPE_MASK) {
+   switch (GETTYPE(op->type)) {
case COMPUTE:
if (op->type & SETREG)
regs->gpr[op->reg] = op->val;
@@ -2739,7 +2739,7 @@ int emulate_loadstore(struct pt_regs *regs, struct instruction_op *op)
 
err = 0;
size = GETSIZE(op->type);
-   type = op->type & INSTR_TYPE_MASK;
+   type = GETTYPE(op->type);
cross_endian = (regs->msr & MSR_LE) != (MSR_KERNEL & MSR_LE);
ea = truncate_if_32bit(regs->msr, op->ea);
 
@@ -3001,7 +3001,7 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
}
 
err = 0;
-   type = op.type & INSTR_TYPE_MASK;
+   type = GETTYPE(op.type);
 
if (OP_IS_LOAD_STORE(type)) {
	err = emulate_loadstore(regs, &op);
-- 
2.16.2



[PATCH 0/3] powerpc/sstep: Fix kernel crash if VSX is not present

2018-05-20 Thread Ravi Bangoria
This is the next version of the RFC patch:
  https://lkml.org/lkml/2018/5/16/36

The kbuild test robot reported the following build failure with the RFC.

   error: unused variable 'type' [-Werror=unused-variable]
 int type;
 ^~~~

I've fixed it along with the following changes.

1st patch introduces a new macro to simplify the code a bit.
2nd patch fixes the kernel crash when VSX is not supported
or disabled.
3rd patch fixes the emulate_step() tests.

Ravi Bangoria (3):
  powerpc/sstep: Introduce GETTYPE macro
  powerpc/sstep: Fix kernel crash if VSX is not present
  powerpc/sstep: Fix emulate_step test if VSX not present

 arch/powerpc/include/asm/sstep.h |  2 ++
 arch/powerpc/kernel/align.c  |  2 +-
 arch/powerpc/lib/sstep.c | 15 ---
 arch/powerpc/lib/test_emulate_step.c | 21 +++--
 4 files changed, 30 insertions(+), 10 deletions(-)

-- 
2.16.2



Re: [PATCH 2/2] powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG

2018-05-20 Thread Michael Neuling
On Fri, 2018-05-18 at 22:56 +1000, Michael Ellerman wrote:
> Michael Neuling  writes:
> > In this change:
> >   e2a800beac powerpc/hw_brk: Fix off by one error when validating DAWR
> > region end
> > 
> > We fixed setting the DAWR end point to its max value via
> > PPC_PTRACE_SETHWDEBUG. Unfortunately we broke PTRACE_SET_DEBUGREG when
> > setting a 512 byte aligned breakpoint.
> > 
> > PTRACE_SET_DEBUGREG currently sets the length of the breakpoint to
> > zero (memset() in hw_breakpoint_init()).  This worked with
> > arch_validate_hwbkpt_settings() before the above patch was applied but
> > is now broken if the breakpoint is 512byte aligned.
> > 
> > This sets the length of the breakpoint to 8 bytes when using
> > PTRACE_SET_DEBUGREG.
> > 
> > Signed-off-by: Michael Neuling 
> > Cc: sta...@vger.kernel.org # 3.10+
> 
> If this is "fixing" e2a800beac then I think v3.11 is right for the
> stable tag?
> 
> $ git describe --contains --long e2a800beaca1
> v3.11-rc1~94^2~4

You're right. I think I read the output of gitk incorrectly.

Thanks.
Mikey


Re: pkeys on POWER: Access rights not reset on execve

2018-05-20 Thread Ram Pai
On Sat, May 19, 2018 at 11:06:20PM -0700, Andy Lutomirski wrote:
> On Sat, May 19, 2018 at 11:04 PM Ram Pai  wrote:
> 
> > On Sat, May 19, 2018 at 04:47:23PM -0700, Andy Lutomirski wrote:
> > > On Sat, May 19, 2018 at 1:28 PM Ram Pai  wrote:
>
> > ...snip...
> > >
> > > So is it possible for two threads to each call pkey_alloc() and end up
> > > with the same key?  If so, it seems entirely broken.
>
> > No. Two threads cannot allocate the same key; just like x86.
>
> > > If not, then how do you
> > > intend for a multithreaded application to usefully allocate a new key?
> > > Regardless, it seems like the current behavior on POWER is very
> > > difficult to work with.  Can you give an example of a use case for
> > > which POWER's behavior makes sense?
> > >
> > > For the use cases I've imagined, POWER's behavior does not make sense.
> > > x86's is not ideal but is still better.  Here are my two example use
> > > cases:
> > >
> > > 1. A crypto library.  Suppose I'm writing a TLS-terminating server, and
> > > I want it to be resistant to Heartbleed-like bugs.  I could store my
> > > private keys protected by mprotect_key() and arrange for all threads
> > > and signal handlers to have PKRU/AMR values that prevent any access to
> > > the memory.  When an explicit call is made to sign with the key, I
> > > would temporarily change PKRU/AMR to allow access, compute the
> > > signature, and change PKRU/AMR back.  On x86 right now, this works
> > > nicely.  On POWER, it doesn't, because any thread started before my
> > > pkey_alloc() call can access the protected memory, as can any signal
> > > handler.
> > >
> > > 2. A database using mmap() (with persistent memory or otherwise).  It
> > > would be nice to be resistant to accidental corruption due to stray
> > > writes.  I would do more or less the same thing as (1), except that I
> > > would want threads that are not actively writing to the database to be
> > > able to read the protected memory.  On x86, I need to manually convince
> > > threads that may have been started before my pkey_alloc() call as well
> > > as signal handlers to update their PKRU values.  On POWER, as in
> > > example (1), the error goes the other direction -- if I fail to
> > > propagate the AMR bits to all threads, writes are not blocked.
>
> > I see the problem from an application's point of view, on powerpc.  If
> > the key allocated in one thread is not activated on all threads
> > (existing ones and future ones), then other threads will not be able
> > to modify the key's permissions. Hence they will not be able to control
> > access/write to pages with which the key is associated.
>
> > As Florian suggested, I should enable the key's bit in the UAMOR value
> > corresponding to existing threads when a new key is allocated.
>
> > Now, looking at the implementation for x86, I see that sys_pkey_alloc()
> > makes no attempt to modify anything of any other thread. How
> > does it manage to activate the key on any other thread? Is this
> > magic done by the hardware?
>
> x86 has no equivalent concept to UAMOR.  There are 16 keys no matter what.


Florian,

Does the following patch fix the problem for you?  Just like x86,
I am enabling all keys in the UAMOR register during
initialization itself. Hence any key created by any thread at
any time will be activated on all threads, so any thread
can change the permissions on that key. Smoke tested it
with your test program.
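
For reference, here is a minimal userspace sketch of the cross-thread
scenario discussed above (glibc pkey wrappers assumed; synchronization
and error handling are omitted for brevity):

#define _GNU_SOURCE
#include <pthread.h>
#include <sys/mman.h>

static char *page;

static void *worker(void *arg)
{
	/* Thread created before pkey_alloc(): with the UAMOR fix, the
	 * key's write-disable also applies here on POWER. */
	page[0] = 1;	/* should fault once the key denies writes */
	return NULL;
}

int main(void)
{
	pthread_t t;

	page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	pthread_create(&t, NULL, worker, NULL);	/* predates the key */

	int key = pkey_alloc(0, PKEY_DISABLE_WRITE);
	pkey_mprotect(page, 4096, PROT_READ | PROT_WRITE, key);

	pthread_join(t, NULL);
	pkey_free(key);
	return 0;
}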


Signed-off-by: Ram Pai 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 0eafdf01..ab4519a 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -15,8 +15,9 @@
 int  pkeys_total;  /* Total pkeys as per device tree */
 bool pkeys_devtree_defined;/* pkey property exported by device tree */
 u32  initial_allocation_mask;  /* Bits set for reserved keys */
-u64  pkey_amr_uamor_mask;  /* Bits in AMR/UMOR not to be touched */
+u64  pkey_amr_mask;	/* Bits in AMR not to be touched */
 u64  pkey_iamr_mask;   /* Bits in AMR not to be touched */
+u64  pkey_uamor_mask;	/* Bits in UAMOR not to be touched */
 
 #define AMR_BITS_PER_PKEY 2
 #define AMR_RD_BIT 0x1UL
@@ -24,6 +25,7 @@
 #define IAMR_EX_BIT 0x1UL
 #define PKEY_REG_BITS (sizeof(u64)*8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
+#define switch_off(bitmask, i) (bitmask &= ~(0x3ul << pkeyshift(i)))
 
 static void scan_pkey_feature(void)
 {
@@ -120,19 +122,31 @@ int pkey_initialize(void)
os_reserved = 0;
 #endif
initial_allocation_mask = ~0x0;
-   pkey_amr_uamor_mask = ~0x0ul;
+
+   /* register mask is in BE format */
+   pkey_amr_mask = ~0x0ul;
pkey_iamr_mask = ~0x0ul;
-   /*
-* key 0, 1 are reserved.
-* key 0 is the default key, which allows read/write/execute.
-* key 1 is recommended not to 

Re: [PATCH 07/14] powerpc: Add support for restartable sequences

2018-05-20 Thread Boqun Feng
On Fri, May 18, 2018 at 02:17:17PM -0400, Mathieu Desnoyers wrote:
> - On May 17, 2018, at 7:50 PM, Boqun Feng boqun.f...@gmail.com wrote:
> [...]
> >> > I think you're right. So we have to introduce callsite to rseq_syscall()
> >> > in syscall path, something like:
> >> > 
> >> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> >> > index 51695608c68b..a25734a96640 100644
> >> > --- a/arch/powerpc/kernel/entry_64.S
> >> > +++ b/arch/powerpc/kernel/entry_64.S
> >> > @@ -222,6 +222,9 @@ system_call_exit:
> >> >  mtmsrd  r11,1
> >> > #endif /* CONFIG_PPC_BOOK3E */
> >> > 
> >> > +	addi	r3,r1,STACK_FRAME_OVERHEAD
> >> > +	bl	rseq_syscall
> >> > +
> >> >  ld  r9,TI_FLAGS(r12)
> >> >  li  r11,-MAX_ERRNO
> >> >  andi.
> >> >  
> >> > r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
> >> > 
> 
> By the way, I think this is not the right spot to call rseq_syscall, because
> interrupts are disabled. I think we should move this hunk right after 
> system_call_exit.
> 

Good point.

> Would you like to implement and test an updated patch adding those calls for 
> ppc 32 and 64 ?
> 

I'd like to help, but I don't have a handy ppc environment for testing...
So I made the patch below, which has only been build-tested; I hope it
is somewhat helpful.

Regards,
Boqun

->8
Subject: [PATCH] powerpc: Add syscall detection for restartable sequences

Syscalls are not allowed inside restartable sequences, so add a call to
rseq_syscall() at the very beginning of the system call exit path for
CONFIG_DEBUG_RSEQ=y kernels. This helps detect whether a syscall was
issued inside a restartable sequence.
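
In outline, the hook behaves like the paraphrased sketch below (see
kernel/rseq.c for the real implementation; the helper names here are
simplified):

	void rseq_syscall(struct pt_regs *regs)
	{
		struct task_struct *t = current;
		struct rseq_cs rseq_cs;

		if (!t->rseq)
			return;		/* task not registered for rseq */
		/* A syscall inside an rseq critical section is a bug. */
		if (rseq_get_rseq_cs(t, &rseq_cs) ||
		    in_rseq_cs(instruction_pointer(regs), &rseq_cs))
			force_sig(SIGSEGV, t);
	}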

Signed-off-by: Boqun Feng 
---
 arch/powerpc/kernel/entry_32.S | 5 +
 arch/powerpc/kernel/entry_64.S | 5 +
 2 files changed, 10 insertions(+)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index eb8d01bae8c6..2f134eebe7ed 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -365,6 +365,11 @@ syscall_dotrace_cont:
	blrl		/* Call handler */
.globl  ret_from_syscall
 ret_from_syscall:
+#ifdef CONFIG_DEBUG_RSEQ
+   /* Check whether the syscall is issued inside a restartable sequence */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+   bl  rseq_syscall
+#endif
mr  r6,r3
CURRENT_THREAD_INFO(r12, r1)
/* disable interrupts so current_thread_info()->flags can't change */
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2cb5109a7ea3..2e2d59bb45d0 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -204,6 +204,11 @@ system_call:		/* label this so stack traces look sane */
  * This is blacklisted from kprobes further below with _ASM_NOKPROBE_SYMBOL().
  */
 system_call_exit:
+#ifdef CONFIG_DEBUG_RSEQ
+   /* Check whether the syscall is issued inside a restartable sequence */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+   bl  rseq_syscall
+#endif
/*
 * Disable interrupts so current_thread_info()->flags can't change,
 * and so that we don't get interrupted after loading SRR0/1.
-- 
2.16.2



[PATCH 7/7 v5] arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc

2018-05-20 Thread Nipun Gupta
The fsl-mc bus supports the new iommu-map property. Comply with this
binding for the fsl_mc bus.

Signed-off-by: Nipun Gupta 
Reviewed-by: Laurentiu Tudor 
---
 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
index 137ef4d..6010505 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
@@ -184,6 +184,7 @@
#address-cells = <2>;
#size-cells = <2>;
ranges;
+   dma-ranges = <0x0 0x0 0x0 0x0 0x1 0x>;
 
clockgen: clocking@130 {
compatible = "fsl,ls2080a-clockgen";
@@ -357,6 +358,8 @@
		reg = <0x0008 0x0c00 0 0x40>,	/* MC portal base */
		      <0x 0x0834 0 0x4>;	/* MC control reg */
msi-parent = <>;
+		iommu-map = <0 &smmu 0 0>;	/* This is fixed-up by u-boot */
+   dma-coherent;
#address-cells = <3>;
#size-cells = <1>;
 
@@ -460,6 +463,8 @@
compatible = "arm,mmu-500";
reg = <0 0x500 0 0x80>;
#global-interrupts = <12>;
+   #iommu-cells = <1>;
+   stream-match-mask = <0x7C00>;
interrupts = <0 13 4>, /* global secure fault */
 <0 14 4>, /* combined secure interrupt */
 <0 15 4>, /* global non-secure fault */
@@ -502,7 +507,6 @@
 <0 204 4>, <0 205 4>,
 <0 206 4>, <0 207 4>,
 <0 208 4>, <0 209 4>;
-		mmu-masters = <&fsl_mc 0x300 0>;
};
 
dspi: dspi@210 {
-- 
1.9.1



[PATCH 6/7 v5] bus: fsl-mc: set coherent dma mask for devices on fsl-mc bus

2018-05-20 Thread Nipun Gupta
The of_dma_configure() API expects coherent_dma_mask to be correctly
set in the devices. This patch sets it for fsl-mc devices.

Signed-off-by: Nipun Gupta 
---
 drivers/bus/fsl-mc/fsl-mc-bus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index fa43c7d..624828b 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -627,6 +627,7 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc,
mc_dev->icid = parent_mc_dev->icid;
mc_dev->dma_mask = FSL_MC_DEFAULT_DMA_MASK;
	mc_dev->dev.dma_mask = &mc_dev->dma_mask;
+   mc_dev->dev.coherent_dma_mask = mc_dev->dma_mask;
	dev_set_msi_domain(&mc_dev->dev,
			   dev_get_msi_domain(&parent_mc_dev->dev));
}
-- 
1.9.1



[PATCH 5/7 v5] bus: fsl-mc: support dma configure for devices on fsl-mc bus

2018-05-20 Thread Nipun Gupta
This patch adds support for DMA configuration of devices on the fsl-mc
bus, using the 'dma_configure' callback for busses. Also, the direct
call to arch_setup_dma_ops() is removed from the fsl-mc bus.

Signed-off-by: Nipun Gupta 
Reviewed-by: Laurentiu Tudor 
---
 drivers/bus/fsl-mc/fsl-mc-bus.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 5d8266c..fa43c7d 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -127,6 +127,16 @@ static int fsl_mc_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
return 0;
 }
 
+static int fsl_mc_dma_configure(struct device *dev)
+{
+   struct device *dma_dev = dev;
+
+   while (dev_is_fsl_mc(dma_dev))
+   dma_dev = dma_dev->parent;
+
+   return of_dma_configure(dev, dma_dev->of_node, 0);
+}
+
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
 char *buf)
 {
@@ -148,6 +158,7 @@ struct bus_type fsl_mc_bus_type = {
.name = "fsl-mc",
.match = fsl_mc_bus_match,
.uevent = fsl_mc_bus_uevent,
+   .dma_configure  = fsl_mc_dma_configure,
.dev_groups = fsl_mc_dev_groups,
 };
 EXPORT_SYMBOL_GPL(fsl_mc_bus_type);
@@ -633,10 +644,6 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc,
goto error_cleanup_dev;
}
 
-   /* Objects are coherent, unless 'no shareability' flag set. */
-   if (!(obj_desc->flags & FSL_MC_OBJ_FLAG_NO_MEM_SHAREABILITY))
-   arch_setup_dma_ops(_dev->dev, 0, 0, NULL, true);
-
/*
 * The device-specific probe callback will get invoked by device_add()
 */
-- 
1.9.1



[PATCH 4/7 v5] iommu: arm-smmu: Add support for the fsl-mc bus

2018-05-20 Thread Nipun Gupta
Implement bus specific support for the fsl-mc bus including
registering arm_smmu_ops and bus specific device add operations.

Signed-off-by: Nipun Gupta 
---
 drivers/iommu/arm-smmu.c |  7 +++
 drivers/iommu/iommu.c| 21 +
 include/linux/fsl/mc.h   |  8 
 include/linux/iommu.h|  2 ++
 4 files changed, 38 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 69e7c60..e1d5090 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -52,6 +52,7 @@
 #include 
 
 #include 
+#include 
 
 #include "io-pgtable.h"
 #include "arm-smmu-regs.h"
@@ -1459,6 +1460,8 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
 
if (dev_is_pci(dev))
group = pci_device_group(dev);
+   else if (dev_is_fsl_mc(dev))
+   group = fsl_mc_device_group(dev);
else
group = generic_device_group(dev);
 
@@ -2037,6 +2040,10 @@ static void arm_smmu_bus_init(void)
		bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
}
 #endif
+#ifdef CONFIG_FSL_MC_BUS
+	if (!iommu_present(&fsl_mc_bus_type))
+		bus_set_iommu(&fsl_mc_bus_type, &arm_smmu_ops);
+#endif
 }
 
 static int arm_smmu_device_probe(struct platform_device *pdev)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d2aa2320..6d4ce35 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static struct kset *iommu_group_kset;
@@ -987,6 +988,26 @@ struct iommu_group *pci_device_group(struct device *dev)
return iommu_group_alloc();
 }
 
+/* Get the IOMMU group for device on fsl-mc bus */
+struct iommu_group *fsl_mc_device_group(struct device *dev)
+{
+   struct device *cont_dev = fsl_mc_cont_dev(dev);
+   struct iommu_group *group;
+
+   /* Container device is responsible for creating the iommu group */
+   if (fsl_mc_is_cont_dev(dev)) {
+   group = iommu_group_alloc();
+   if (IS_ERR(group))
+   return NULL;
+   } else {
+   get_device(cont_dev);
+   group = iommu_group_get(cont_dev);
+   put_device(cont_dev);
+   }
+
+   return group;
+}
+
 /**
  * iommu_group_get_for_dev - Find or create the IOMMU group for a device
  * @dev: target device
diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index f27cb14..dddaca1 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -351,6 +351,14 @@ struct fsl_mc_io {
 #define dev_is_fsl_mc(_dev) (0)
 #endif
 
+/* Macro to check if a device is a container device */
+#define fsl_mc_is_cont_dev(_dev) (to_fsl_mc_device(_dev)->flags & \
+   FSL_MC_IS_DPRC)
+
+/* Macro to get the container device of a MC device */
+#define fsl_mc_cont_dev(_dev) (fsl_mc_is_cont_dev(_dev) ? \
+   (_dev) : (_dev)->parent)
+
 /*
  * module_fsl_mc_driver() - Helper macro for drivers that don't do
  * anything special in module init/exit.  This eliminates a lot of
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 19938ee..2981200 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -389,6 +389,8 @@ static inline size_t iommu_map_sg(struct iommu_domain *domain,
 extern struct iommu_group *pci_device_group(struct device *dev);
 /* Generic device grouping function */
 extern struct iommu_group *generic_device_group(struct device *dev);
+/* FSL-MC device grouping function */
+struct iommu_group *fsl_mc_device_group(struct device *dev);
 
 /**
  * struct iommu_fwspec - per-device IOMMU instance data
-- 
1.9.1



[PATCH 3/7 v5] iommu: support iommu configuration for fsl-mc devices

2018-05-20 Thread Nipun Gupta
With of_map_rid() available for all busses, use the function for IOMMU
configuration of devices on the fsl-mc bus.

Signed-off-by: Nipun Gupta 
---
 drivers/iommu/of_iommu.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 811e160..284474d 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define NO_IOMMU   1
 
@@ -159,6 +160,23 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
return err;
 }
 
+static int of_fsl_mc_iommu_init(struct fsl_mc_device *mc_dev,
+   struct device_node *master_np)
+{
+   struct of_phandle_args iommu_spec = { .args_count = 1 };
+   int err;
+
+   err = of_map_rid(master_np, mc_dev->icid, "iommu-map",
+"iommu-map-mask", _spec.np,
+iommu_spec.args);
+   if (err)
+   return err == -ENODEV ? NO_IOMMU : err;
+
+	err = of_iommu_xlate(&mc_dev->dev, &iommu_spec);
+   of_node_put(iommu_spec.np);
+   return err;
+}
+
 const struct iommu_ops *of_iommu_configure(struct device *dev,
   struct device_node *master_np)
 {
@@ -190,6 +208,8 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 
err = pci_for_each_dma_alias(to_pci_dev(dev),
				     of_pci_iommu_init, &info);
+   } else if (dev_is_fsl_mc(dev)) {
+   err = of_fsl_mc_iommu_init(to_fsl_mc_device(dev), master_np);
} else {
struct of_phandle_args iommu_spec;
int idx = 0;
-- 
1.9.1



[PATCH 2/7 v5] iommu: of: make of_pci_map_rid() available for other devices too

2018-05-20 Thread Nipun Gupta
The iommu-map property is also used by devices on the fsl-mc bus. This
patch moves of_pci_map_rid() to a generic location, so that it can be
used by other busses too.

'of_pci_map_rid' is renamed here to 'of_map_rid' and there is no
functional change in the API.
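
For example, an fsl-mc caller can translate an ICID to an SMMU stream ID
as sketched below ('mc_node' and 'mc_dev' are illustrative names; the
real fsl-mc usage is added later in this series):

	struct device_node *iommu_np = NULL;
	u32 stream_id;
	int err;

	/* Map the device's ICID through the parent's "iommu-map" */
	err = of_map_rid(mc_node, mc_dev->icid, "iommu-map",
			 "iommu-map-mask", &iommu_np, &stream_id);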

Signed-off-by: Nipun Gupta 
Reviewed-by: Rob Herring 
Acked-by: Bjorn Helgaas 
---
 drivers/iommu/of_iommu.c |   5 +--
 drivers/of/base.c| 102 +++
 drivers/of/irq.c |   5 +--
 drivers/pci/of.c | 101 --
 include/linux/of.h   |  11 +
 include/linux/of_pci.h   |  10 -
 6 files changed, 117 insertions(+), 117 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 5c36a8b..811e160 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -149,9 +149,8 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
struct of_phandle_args iommu_spec = { .args_count = 1 };
int err;
 
-   err = of_pci_map_rid(info->np, alias, "iommu-map",
-"iommu-map-mask", _spec.np,
-iommu_spec.args);
+   err = of_map_rid(info->np, alias, "iommu-map", "iommu-map-mask",
+			 &iommu_spec.np, iommu_spec.args);
if (err)
return err == -ENODEV ? NO_IOMMU : err;
 
diff --git a/drivers/of/base.c b/drivers/of/base.c
index 848f549..c7aac81 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1995,3 +1995,105 @@ int of_find_last_cache_level(unsigned int cpu)
 
return cache_level;
 }
+
+/**
+ * of_map_rid - Translate a requester ID through a downstream mapping.
+ * @np: root complex device node.
+ * @rid: device requester ID to map.
+ * @map_name: property name of the map to use.
+ * @map_mask_name: optional property name of the mask to use.
+ * @target: optional pointer to a target device node.
+ * @id_out: optional pointer to receive the translated ID.
+ *
+ * Given a device requester ID, look up the appropriate implementation-defined
+ * platform ID and/or the target device which receives transactions on that
+ * ID, as per the "iommu-map" and "msi-map" bindings. Either of @target or
+ * @id_out may be NULL if only the other is required. If @target points to
+ * a non-NULL device node pointer, only entries targeting that node will be
+ * matched; if it points to a NULL value, it will receive the device node of
+ * the first matching target phandle, with a reference held.
+ *
+ * Return: 0 on success or a standard error code on failure.
+ */
+int of_map_rid(struct device_node *np, u32 rid,
+  const char *map_name, const char *map_mask_name,
+  struct device_node **target, u32 *id_out)
+{
+   u32 map_mask, masked_rid;
+   int map_len;
+   const __be32 *map = NULL;
+
+   if (!np || !map_name || (!target && !id_out))
+   return -EINVAL;
+
+	map = of_get_property(np, map_name, &map_len);
+   if (!map) {
+   if (target)
+   return -ENODEV;
+   /* Otherwise, no map implies no translation */
+   *id_out = rid;
+   return 0;
+   }
+
+   if (!map_len || map_len % (4 * sizeof(*map))) {
+   pr_err("%pOF: Error: Bad %s length: %d\n", np,
+   map_name, map_len);
+   return -EINVAL;
+   }
+
+   /* The default is to select all bits. */
+   map_mask = 0x;
+
+   /*
+* Can be overridden by "{iommu,msi}-map-mask" property.
+* If of_property_read_u32() fails, the default is used.
+*/
+   if (map_mask_name)
+		of_property_read_u32(np, map_mask_name, &map_mask);
+
+   masked_rid = map_mask & rid;
+   for ( ; map_len > 0; map_len -= 4 * sizeof(*map), map += 4) {
+   struct device_node *phandle_node;
+   u32 rid_base = be32_to_cpup(map + 0);
+   u32 phandle = be32_to_cpup(map + 1);
+   u32 out_base = be32_to_cpup(map + 2);
+   u32 rid_len = be32_to_cpup(map + 3);
+
+   if (rid_base & ~map_mask) {
+   pr_err("%pOF: Invalid %s translation - %s-mask (0x%x) 
ignores rid-base (0x%x)\n",
+   np, map_name, map_name,
+   map_mask, rid_base);
+   return -EFAULT;
+   }
+
+   if (masked_rid < rid_base || masked_rid >= rid_base + rid_len)
+   continue;
+
+   phandle_node = of_find_node_by_phandle(phandle);
+   if (!phandle_node)
+   return -ENODEV;
+
+   if (target) {
+   if (*target)
+   of_node_put(phandle_node);
+   else
+   *target = phandle_node;
+
+  

[PATCH 1/7 v5] Docs: dt: add fsl-mc iommu-map device-tree binding

2018-05-20 Thread Nipun Gupta
The existing IOMMU bindings cannot be used to specify the relationship
between fsl-mc devices and IOMMUs. This patch adds a generic binding for
mapping fsl-mc devices to IOMMUs, using the iommu-map property.

Signed-off-by: Nipun Gupta 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/misc/fsl,qoriq-mc.txt  | 39 ++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
index 6611a7c..8cbed4f 100644
--- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
+++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
@@ -9,6 +9,25 @@ blocks that can be used to create functional hardware objects/devices
 such as network interfaces, crypto accelerator instances, L2 switches,
 etc.
 
+For an overview of the DPAA2 architecture and fsl-mc bus see:
+drivers/staging/fsl-mc/README.txt
+
+As described in the above overview, all DPAA2 objects in a DPRC share the
+same hardware "isolation context" and a 10-bit value called an ICID
+(isolation context id) is expressed by the hardware to identify
+the requester.
+
+The generic 'iommus' property is insufficient to describe the relationship
+between ICIDs and IOMMUs, so an iommu-map property is used to define
+the set of possible ICIDs under a root DPRC and how they map to
+an IOMMU.
+
+For generic IOMMU bindings, see
+Documentation/devicetree/bindings/iommu/iommu.txt.
+
+For arm-smmu binding, see:
+Documentation/devicetree/bindings/iommu/arm,smmu.txt.
+
 Required properties:
 
 - compatible
@@ -88,14 +107,34 @@ Sub-nodes:
   Value type: 
   Definition: Specifies the phandle to the PHY device node
   associated with this dpmac.
+Optional properties:
+
+- iommu-map: Maps an ICID to an IOMMU and associated iommu-specifier
+  data.
+
+  The property is an arbitrary number of tuples of
+  (icid-base,iommu,iommu-base,length).
+
+  Any ICID i in the interval [icid-base, icid-base + length) is
+  associated with the listed IOMMU, with the iommu-specifier
+  (i - icid-base + iommu-base).
 
 Example:
 
+smmu: iommu@500 {
+   compatible = "arm,mmu-500";
+   #iommu-cells = <2>;
+   stream-match-mask = <0x7C00>;
+   ...
+};
+
 fsl_mc: fsl-mc@80c00 {
 compatible = "fsl,qoriq-mc";
 reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */
   <0x 0x0834 0 0x4>; /* MC control reg */
 msi-parent = <>;
+/* define map for ICIDs 23-64 */
+iommu-map = <23 &smmu 23 41>;
 #address-cells = <3>;
 #size-cells = <1>;
 
-- 
1.9.1



[PATCH 0/7 v5] Support for fsl-mc bus and its devices in SMMU

2018-05-20 Thread Nipun Gupta
This patchset defines an IOMMU DT binding for the fsl-mc bus and adds
support in the SMMU for the fsl-mc bus.

The patch series is based on top of dma-mapping tree (for-next branch):
http://git.infradead.org/users/hch/dma-mapping.git

These patches
  - Define property 'iommu-map' for fsl-mc bus (patch 1)
  - Integrates the fsl-mc bus with the SMMU using this
IOMMU binding (patch 2,3,4)
  - Adds the dma configuration support for fsl-mc bus (patch 5, 6)
  - Updates the fsl-mc device node with iommu/dma related changes (patch 7)

Changes in v2:
  - use iommu-map property for fsl-mc bus
  - rebase over patchset https://patchwork.kernel.org/patch/10317337/
and make corresponding changes for dma configuration of devices on
fsl-mc bus

Changes in v3:
  - move of_map_rid in drivers/of/address.c

Changes in v4:
  - move of_map_rid in drivers/of/base.c

Changes in v5:
  - break patch 5 in two separate patches (now patch 5/7 and patch 6/7)
  - add changelog text in patch 3/7 and patch 5/7
  - typo fix

Nipun Gupta (7):
  Docs: dt: add fsl-mc iommu-map device-tree binding
  iommu: of: make of_pci_map_rid() available for other devices too
  iommu: support iommu configuration for fsl-mc devices
  iommu: arm-smmu: Add support for the fsl-mc bus
  bus: fsl-mc: support dma configure for devices on fsl-mc bus
  bus: fsl-mc: set coherent dma mask for devices on fsl-mc bus
  arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc

 .../devicetree/bindings/misc/fsl,qoriq-mc.txt  |  39 
 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi |   6 +-
 drivers/bus/fsl-mc/fsl-mc-bus.c|  16 +++-
 drivers/iommu/arm-smmu.c   |   7 ++
 drivers/iommu/iommu.c  |  21 +
 drivers/iommu/of_iommu.c   |  25 -
 drivers/of/base.c  | 102 +
 drivers/of/irq.c   |   5 +-
 drivers/pci/of.c   | 101 
 include/linux/fsl/mc.h |   8 ++
 include/linux/iommu.h  |   2 +
 include/linux/of.h |  11 +++
 include/linux/of_pci.h |  10 --
 13 files changed, 231 insertions(+), 122 deletions(-)

-- 
1.9.1



[PATCH v2] powerpc/64s/radix: do not flush TLB when relaxing access

2018-05-20 Thread Nicholas Piggin
Radix flushes the TLB when updating ptes to increase permissiveness
of protection (increase access authority). Book3S does not require
TLB flushing in this case, and it is not done on hash. This patch
avoids the flush for radix.

From Power ISA v3.0B, p.1090:

Setting a Reference or Change Bit or Upgrading Access Authority
(PTE Subject to Atomic Hardware Updates)

If the only change being made to a valid PTE that is subject to
atomic hardware updates is to set the Reference or Change bit to 1
or to add access authorities, a simpler sequence suffices because
the translation hardware will refetch the PTE if an access is
attempted for which the only problems were reference and/or change
bits needing to be set or insufficient access authority.

The nest MMU on POWER9 does not re-fetch the PTE after such an access
attempt before faulting, so address spaces with a coprocessor
attached will continue to flush in these cases.

This reduces tlbies for a kernel compile workload from 1.28M to 0.95M,
and tlbiels from 20.17M to 19.68M.

fork --fork --exec benchmark improved 2.77% (12000->12300).

Signed-off-by: Nicholas Piggin 
---
Oops I missed this patch, it's supposed to go as the first patch in
the "Various TLB and PTE improvements" series.

 arch/powerpc/mm/pgtable-book3s64.c | 10 +++---
 arch/powerpc/mm/pgtable.c  | 29 ++---
 2 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 518518fb7c45..994492453f0e 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -31,16 +31,20 @@ int (*register_process_table)(unsigned long base, unsigned long page_size,
 int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
  pmd_t *pmdp, pmd_t entry, int dirty)
 {
+   struct mm_struct *mm = vma->vm_mm;
int changed;
 #ifdef CONFIG_DEBUG_VM
WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
-   assert_spin_locked(&vma->vm_mm->page_table_lock);
+   assert_spin_locked(&mm->page_table_lock);
 #endif
changed = !pmd_same(*(pmdp), entry);
if (changed) {
-   __ptep_set_access_flags(vma->vm_mm, pmdp_ptep(pmdp),
+   __ptep_set_access_flags(mm, pmdp_ptep(pmdp),
pmd_pte(entry), address);
-   flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+   /* See ptep_set_access_flags comments */
+   if (atomic_read(&mm->context.copros) > 0)
+   flush_pmd_tlb_range(vma, address,
+   address + HPAGE_PMD_SIZE);
}
return changed;
 }
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 9f361ae571e9..525ec4656a55 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -217,14 +217,37 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
  pte_t *ptep, pte_t entry, int dirty)
 {
+   struct mm_struct *mm = vma->vm_mm;
int changed;
+
entry = set_access_flags_filter(entry, vma, dirty);
changed = !pte_same(*(ptep), entry);
if (changed) {
if (!is_vm_hugetlb_page(vma))
-   assert_pte_locked(vma->vm_mm, address);
-   __ptep_set_access_flags(vma->vm_mm, ptep, entry, address);
-   flush_tlb_page(vma, address);
+   assert_pte_locked(mm, address);
+   __ptep_set_access_flags(mm, ptep, entry, address);
+   if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
+   /*
+* Book3S does not require a TLB flush when relaxing
+* access restrictions because the core MMU will reload
+* the pte after taking an access fault. However the
+* NMMU on POWER9 does not re-load the pte, so flush
+* if we have a coprocessor attached to this address
+* space.
+*
+* This could be further refined and pushed out to
+* NMMU drivers so TLBIEs are only done for NMMU
+* faults, but this is a more minimal fix. The NMMU
+* fault handler does a get_user_pages_remote or
+* similar to bring the page tables in, and this
+* flush_tlb_page will do a global TLBIE because the
+* coprocessor is attached to the address space.
+*/
+   if (atomic_read(&mm->context.copros) > 0)
+   flush_tlb_page(vma, address);
+   } else {
+ 
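
Taken in isolation, the flush policy this patch implements reduces to
the following stand-alone sketch (simplified types and an invented
helper name, not the kernel code):

#include <stdbool.h>

/*
 * Whether relaxing access authority on a valid PTE needs a TLB flush.
 * Book3S 64 core MMUs refetch the PTE after an authority fault (ISA
 * v3.0B), so no flush is needed there unless a POWER9 nest MMU user
 * (coprocessor) is attached, since the NMMU does not refetch the PTE.
 */
static bool need_flush_on_access_relax(bool book3s_64, int copros_attached)
{
	if (!book3s_64)
		return true;		/* other platforms keep the flush */
	return copros_attached > 0;	/* flush only for nest MMU users */
}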

Re: [PATCH v4 4/4] powerpc/kbuild: move -mprofile-kernel check to Kconfig

2018-05-20 Thread Nicholas Piggin
On Thu, 17 May 2018 00:14:58 +1000
Nicholas Piggin  wrote:

> This eliminates the workaround that requires disabling
> -mprofile-kernel by default in Kconfig.
> 
> [ Note: this depends on 
> https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git 
> kconfig-shell-v3 ]
> 
> Signed-off-by: Nicholas Piggin 

Here is an incremental patch that brings this up to v4.

https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git 
kconfig-shell-v4

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c7cc482cb660..60b83398b498 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -462,7 +462,7 @@ config LD_HEAD_STUB_CATCH
 
 config MPROFILE_KERNEL
depends on PPC64 && CPU_LITTLE_ENDIAN
-   def_bool $(success $(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC) -I$(srctree)/include -D__KERNEL__)
+   def_bool $(success,$(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC) -I$(srctree)/include -D__KERNEL__)
 
 config IOMMU_HELPER
def_bool PPC64
-- 
2.17.0



[PATCH 2/2] i2c: opal: don't check number of messages in the driver

2018-05-20 Thread Wolfram Sang
Since commit 1eace8344c02 ("i2c: add param sanity check to
i2c_transfer()") and b7f625840267 ("i2c: add quirk checks to core"), the
I2C core does this check now. We can remove it here.

Signed-off-by: Wolfram Sang 
---

Only build tested.

 drivers/i2c/busses/i2c-opal.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/i2c/busses/i2c-opal.c b/drivers/i2c/busses/i2c-opal.c
index 0aabb7eca0c5..dc2a23f4fb52 100644
--- a/drivers/i2c/busses/i2c-opal.c
+++ b/drivers/i2c/busses/i2c-opal.c
@@ -94,8 +94,6 @@ static int i2c_opal_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
 */
memset(&req, 0, sizeof(req));
switch(num) {
-   case 0:
-   return 0;
case 1:
req.type = (msgs[0].flags & I2C_M_RD) ?
OPAL_I2C_RAW_READ : OPAL_I2C_RAW_WRITE;
@@ -114,8 +112,6 @@ static int i2c_opal_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
req.size = cpu_to_be32(msgs[1].len);
req.buffer_ra = cpu_to_be64(__pa(msgs[1].buf));
break;
-   default:
-   return -EOPNOTSUPP;
}
 
rc = i2c_opal_send_request(opal_id, &req);
-- 
2.11.0
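
For context, the core-side validation that supersedes the removed
branches looks roughly like this; the sketch is modeled on commit
1eace8344c02 and the quirk infrastructure from commit b7f625840267, not
copied verbatim from i2c-core, and it assumes the adapter declares a
quirk that bounds the message count (as i2c-opal's combined-transfer
quirks do).

#include <stddef.h>
#include <stdio.h>

/* Stand-in for the kernel's struct i2c_msg; illustrative only. */
struct i2c_msg {
	unsigned short addr;
	unsigned short flags;
	unsigned short len;
	unsigned char *buf;
};

/*
 * The core rejects empty transfers before any driver's master_xfer
 * callback runs, and quirk checks can cap the number of messages, so
 * the driver's old "case 0" and "default" branches can never trigger.
 */
static int i2c_check_transfer_args(const struct i2c_msg *msgs, int num,
				   int max_num_msgs)
{
	if (msgs == NULL || num < 1)
		return -1;	/* core returns -EINVAL for empty transfers */
	if (max_num_msgs && num > max_num_msgs)
		return -2;	/* quirk check: adapter limit exceeded */
	return 0;
}

int main(void)
{
	struct i2c_msg msgs[3] = { {0} };

	/* Neither of these ever reaches the bus driver. */
	printf("%d\n", i2c_check_transfer_args(NULL, 0, 2));	/* -1 */
	printf("%d\n", i2c_check_transfer_args(msgs, 3, 2));	/* -2 */
	return 0;
}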



[PATCH 0/2] don't check number of I2C messages in drivers

2018-05-20 Thread Wolfram Sang
The core does it now, we can simplify drivers.

Wolfram Sang (2):
  i2c: ibm_iic: don't check number of messages in the driver
  i2c: opal: don't check number of messages in the driver

 drivers/i2c/busses/i2c-ibm_iic.c | 3 ---
 drivers/i2c/busses/i2c-opal.c| 4 
 2 files changed, 7 deletions(-)

-- 
2.11.0



[powerpc:topic/kbuild 4/4] arch/powerpc/Kconfig:466: syntax error

2018-05-20 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git topic/kbuild
head:   023eff0e59052c52bc4f077e66e82d68be486d7f
commit: 023eff0e59052c52bc4f077e66e82d68be486d7f [4/4] powerpc/kbuild: Move -mprofile-kernel check to Kconfig
config: powerpc-asp8347_defconfig
compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 023eff0e59052c52bc4f077e66e82d68be486d7f
        make.cross ARCH=powerpc 83xx/asp8347_defconfig
        make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
>> arch/powerpc/Kconfig:466: syntax error
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
>> arch/powerpc/Kconfig:465: invalid option
   make[2]: *** [83xx/asp8347_defconfig] Error 1
   make[1]: *** [83xx/asp8347_defconfig] Error 2
   make: *** [sub-make] Error 2
--
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
>> arch/powerpc/Kconfig:466: syntax error
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
>> arch/powerpc/Kconfig:465: invalid option
   make[2]: *** [oldconfig] Error 1
   make[1]: *** [oldconfig] Error 2
   make: *** [sub-make] Error 2
--
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
>> arch/powerpc/Kconfig:466: syntax error
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
   arch/powerpc/Kconfig:465:warning: ignoring unsupported character '$'
>> arch/powerpc/Kconfig:465: invalid option
   make[2]: *** [olddefconfig] Error 1
   make[2]: Target 'oldnoconfig' not remade because of errors.
   make[1]: *** [oldnoconfig] Error 2
   make: *** [sub-make] Error 2

vim +466 arch/powerpc/Kconfig

e05c0e81 Kevin Hao       2013-07-16  441  
3d72bbc4 Michael Neuling 2013-02-13  442  config PPC_TRANSACTIONAL_MEM
3d72bbc4 Michael Neuling 2013-02-13  443  	bool "Transactional Memory support for POWERPC"
3d72bbc4 Michael Neuling 2013-02-13  444  	depends on PPC_BOOK3S_64
3d72bbc4 Michael Neuling 2013-02-13  445  	depends on SMP
7b37a123 Michael Neuling 2014-01-08  446  	select ALTIVEC
7b37a123 Michael Neuling 2014-01-08  447  	select VSX
3d72bbc4 Michael Neuling 2013-02-13  448  	default n
3d72bbc4 Michael Neuling 2013-02-13  449  	---help---
3d72bbc4 Michael Neuling 2013-02-13  450  	  Support user-mode Transactional Memory on POWERPC.
3d72bbc4 Michael Neuling 2013-02-13  451  
951eedeb Nicholas Piggin 2017-05-29  452  config LD_HEAD_STUB_CATCH
951eedeb Nicholas Piggin 2017-05-29  453  	bool "Reserve 256 bytes to cope with linker stubs in HEAD text" if EXPERT
951eedeb Nicholas Piggin 2017-05-29  454  	depends on PPC64
951eedeb Nicholas Piggin 2017-05-29  455  	default n
951eedeb Nicholas Piggin 2017-05-29  456  	help
951eedeb Nicholas Piggin 2017-05-29  457  	  Very large kernels can cause linker branch stubs to be generated by
951eedeb Nicholas Piggin 2017-05-29  458  	  code in head_64.S, which moves the head text sections out of their
951eedeb Nicholas Piggin 2017-05-29  459  	  specified location. This option can work around the problem.
951eedeb Nicholas Piggin 2017-05-29  460  
951eedeb Nicholas Piggin 2017-05-29  461  	  If unsure, say "N".
951eedeb Nicholas Piggin 2017-05-29  462  
8c50b72a Torsten Duwe    2016-03-03  463  config MPROFILE_KERNEL
8c50b72a Torsten Duwe    2016-03-03  464  	depends on PPC64 && CPU_LITTLE_ENDIAN
023eff0e Nicholas Piggin 2018-05-17 @465  	def_bool $(success $(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC) -I$(srctree)/include -D__KERNEL__)
8c50b72a Torsten Duwe    2016-03-03 @466  

:: The code at line 466 was first introduced by commit
:: 8c50b72a3b4f1f7cdfdfebd233b1cbd121262e65 powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel

:: TO: Torsten Duwe 
:: CC: Michael Ellerman 

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


Re: pkeys on POWER: Access rights not reset on execve

2018-05-20 Thread Andy Lutomirski
On Sat, May 19, 2018 at 11:04 PM Ram Pai  wrote:

> On Sat, May 19, 2018 at 04:47:23PM -0700, Andy Lutomirski wrote:
> > On Sat, May 19, 2018 at 1:28 PM Ram Pai  wrote:
>
> ...snip...
> >
> > So is it possible for two threads to each call pkey_alloc() and end up
> > with the same key?  If so, it seems entirely broken.
>
> No. Two threads cannot allocate the same key; just like x86.
>
> > If not, then how do you
> > intend for a multithreaded application to usefully allocate a new key?
> > Regardless, it seems like the current behavior on POWER is very
> > difficult to work with.  Can you give an example of a use case for
> > which POWER's behavior makes sense?
> >
> > For the use cases I've imagined, POWER's behavior does not make sense.
> > x86's is not ideal but is still better.  Here are my two example use
> > cases:
> >
> > 1. A crypto library.  Suppose I'm writing a TLS-terminating server,
> > and I want it to be resistant to Heartbleed-like bugs.  I could store
> > my private keys protected by mprotect_key() and arrange for all
> > threads and signal handlers to have PKRU/AMR values that prevent any
> > access to the memory.  When an explicit call is made to sign with the
> > key, I would temporarily change PKRU/AMR to allow access, compute the
> > signature, and change PKRU/AMR back.  On x86 right now, this works
> > nicely.  On POWER, it doesn't, because any thread started before my
> > pkey_alloc() call can access the protected memory, as can any signal
> > handler.
> >
> > 2. A database using mmap() (with persistent memory or otherwise).  It
> > would be nice to be resistant to accidental corruption due to stray
> > writes.  I would do more or less the same thing as (1), except that I
> > would want threads that are not actively writing to the database to be
> > able to read the protected memory.  On x86, I need to manually
> > convince threads that may have been started before my pkey_alloc()
> > call as well as signal handlers to update their PKRU values.  On
> > POWER, as in example (1), the error goes the other direction -- if I
> > fail to propagate the AMR bits to all threads, writes are not blocked.
>
> I see the problem from an application's point of view, on powerpc.  If
> the key allocated in one thread is not activated on all threads
> (existing ones and future ones), then other threads will not be able
> to modify the key's permissions. Hence they will not be able to control
> access/write to pages to which the key is associated.
>
> As Florian suggested, I should enable the key's bit in the UAMOR value
> corresponding to existing threads, when a new key is allocated.
>
> Now, looking at the implementation for x86, I see that sys_pkey_alloc()
> makes no attempt to modify anything of any other thread. How
> does it manage to activate the key on any other thread? Is this
> magic done by the hardware?

x86 has no equivalent concept to UAMOR.  There are 16 keys no matter what.

--Andy
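
Use case (1) in code form, for reference: a minimal sketch using the
Linux pkey API from pkeys(7) (pkey_alloc(), pkey_mprotect() and the
glibc pkey_set() wrapper). The do_sign() helper is hypothetical and
error handling is elided; on x86 the key's default state already blocks
other threads, while the propagation behavior under discussion is
exactly what differs on POWER for threads started before pkey_alloc().

#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

static char *secret;	/* page-aligned buffer holding the private key */
static int pkey;

/* Allocate a key that denies all access by default and tag the page. */
static void protect_secret(void)
{
	secret = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS);
	pkey_mprotect(secret, 4096, PROT_READ | PROT_WRITE, pkey);
}

/* Open a per-thread access window only around the signing operation. */
static void sign(const void *msg, size_t len)
{
	(void)msg; (void)len;
	pkey_set(pkey, 0);			/* grant this thread access */
	/* do_sign(secret, msg, len); */	/* hypothetical signing step */
	pkey_set(pkey, PKEY_DISABLE_ACCESS);	/* close the window again */
}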


Re: pkeys on POWER: Access rights not reset on execve

2018-05-20 Thread Ram Pai
On Sat, May 19, 2018 at 04:47:23PM -0700, Andy Lutomirski wrote:
> On Sat, May 19, 2018 at 1:28 PM Ram Pai  wrote:

...snip...
> 
> So is it possible for two threads to each call pkey_alloc() and end up with
> the same key?  If so, it seems entirely broken. 

No. Two threads cannot allocate the same key; just like x86. 

> If not, then how do you
> intend for a multithreaded application to usefully allocate a new key?
> Regardless, it seems like the current behavior on POWER is very difficult
> to work with.  Can you give an example of a use case for which POWER's
> behavior makes sense?
> 
> For the use cases I've imagined, POWER's behavior does not make sense.
>   x86's is not ideal but is still better.  Here are my two example use cases:
> 
> 1. A crypto library.  Suppose I'm writing a TLS-terminating server, and I
> want it to be resistant to Heartbleed-like bugs.  I could store my private
> keys protected by mprotect_key() and arrange for all threads and signal
> handlers to have PKRU/AMR values that prevent any access to the memory.
> When an explicit call is made to sign with the key, I would temporarily
> change PKRU/AMR to allow access, compute the signature, and change PKRU/AMR
> back.  On x86 right now, this works nicely.  On POWER, it doesn't, because
> any thread started before my pkey_alloc() call can access the protected
> memory, as can any signal handler.
> 
> 2. A database using mmap() (with persistent memory or otherwise).  It would
> be nice to be resistant to accidental corruption due to stray writes.  I
> would do more or less the same thing as (1), except that I would want
> threads that are not actively writing to the database to be able to
> read the protected memory.  On x86, I need to manually convince threads that may
> have been started before my pkey_alloc() call as well as signal handlers to
> update their PKRU values.  On POWER, as in example (1), the error goes the
> other direction -- if I fail to propagate the AMR bits to all threads,
> writes are not blocked.

I see the problem from an application's point of view, on powerpc.  If
the key allocated in one thread is not activated on all threads
(existing ones and future ones), then other threads will not be able
to modify the key's permissions. Hence they will not be able to control
access/write to pages to which the key is associated.

As Florian suggested, I should enable the key's bit in the UAMOR value
corresponding to existing threads, when a new key is allocated.

Now, looking at the implementation for x86, I see that sys_pkey_alloc()
makes no attempt to modify anything of any other thread. How
does it manage to activate the key on any other thread? Is this
magic done by the hardware?

RP
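
For reference, the x86 mechanism being asked about is not per-thread
activation at all: PKRU carries two bits per key, and the kernel starts
every thread with a PKRU value that already denies access to every key
except key 0, so a freshly allocated key is inert in threads that never
wrote PKRU. A sketch of that layout (bit assignments per the Intel SDM;
the default is init_pkru_value in arch/x86/mm/pkeys.c):

#include <stdint.h>

/* Two bits per key: bit 2k = access-disable (AD), bit 2k+1 =
 * write-disable (WD). 0x55555554 sets AD for keys 1-15, leaving only
 * key 0 (the default key) accessible. */
#define PKRU_INIT	0x55555554u

static inline int pkru_allows_read(uint32_t pkru, int pkey)
{
	return !(pkru & (1u << (2 * pkey)));	/* AD bit clear? */
}

static inline int pkru_allows_write(uint32_t pkru, int pkey)
{
	return !(pkru & (3u << (2 * pkey)));	/* AD and WD both clear */
}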