[PATCH 1/6 v5] powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN
For booke3e _PAGE_ENDIAN is not defined. Infact what is defined is _PAGE_LENDIAN which is wrong and that should be _PAGE_ENDIAN. There are no compilation errors as arch/powerpc/include/asm/pte-common.h defines _PAGE_ENDIAN to 0 as it is not defined anywhere. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v1-v5 - no change arch/powerpc/include/asm/pte-book3e.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/pte-book3e.h index 0156702..576ad88 100644 --- a/arch/powerpc/include/asm/pte-book3e.h +++ b/arch/powerpc/include/asm/pte-book3e.h @@ -40,7 +40,7 @@ #define _PAGE_U1 0x01 #define _PAGE_U0 0x02 #define _PAGE_ACCESSED 0x04 -#define _PAGE_LENDIAN 0x08 +#define _PAGE_ENDIAN 0x08 #define _PAGE_GUARDED 0x10 #define _PAGE_COHERENT 0x20 /* M: enforce memory coherence */ #define _PAGE_NO_CACHE 0x40 /* I: cache inhibit */ -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6 v5] kvm: powerpc: allow guest control E attribute in mas2
E bit in MAS2 bit indicates whether the page is accessed in Little-Endian or Big-Endian byte order. There is no reason to stop guest setting E, so allow him. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v1-v5 - no change arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6 v5] kvm: powerpc: use cache attributes from linux pte
From: Bharat Bhushan bharat.bhus...@freescale.com First patch is a typo fix where book3e define _PAGE_LENDIAN while it should be defined as _PAGE_ENDIAN. This seems to show that this is never exercised :-) Second and third patch is to allow guest controlling G-Guarded and E-Endian TLB attributes respectively. Fourth patch is moving functions/logic in common code so they can be used on booke also. Fifth and Sixth patch is actually setting caching attributes (TLB.WIMGE) using corresponding Linux pte. v3-v5 - Fix tlb-reference-flag clearing issue (patch 4/6) - There was a patch (4/6 powerpc: move linux pte/hugepte search to more generic file) in the last series of this patchset which was moving pte/hugepte search functions to generic file. That patch is no more needed as some other patch is already applied to fix that :) v2-v3 - now lookup_linux_pte() only have pte search logic and it does not set any access flags in pte. There is already a function for setting access flag which will be called explicitly where needed. On booke we only need to search for pte to get WIMG. v1-v2 - Earlier caching attributes (WIMGE) were set based of page is RAM or not But now we get these attributes from corresponding Linux PTE. Bharat Bhushan (6): powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN kvm: powerpc: allow guest control E attribute in mas2 kvm: powerpc: allow guest control G attribute in mas2 kvm: powerpc: keep only pte search logic in lookup_linux_pte kvm: booke: clear host tlb reference flag on guest tlb invalidation kvm: powerpc: use caching attributes as per linux pte arch/powerpc/include/asm/kvm_host.h |2 +- arch/powerpc/include/asm/pgtable.h| 24 arch/powerpc/include/asm/pte-book3e.h |2 +- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 36 arch/powerpc/kvm/booke.c |2 +- arch/powerpc/kvm/e500.h | 10 -- arch/powerpc/kvm/e500_mmu_host.c | 50 +++-- 7 files changed, 74 insertions(+), 52 deletions(-) -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/6 v5] kvm: powerpc: allow guest control G attribute in mas2
G bit in MAS2 indicates whether the page is Guarded. There is no reason to stop guest setting G, so allow him. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v1-v5 - no change arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index 277cb18..4fd9650 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1 | MAS2_E) + (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/6 v5] kvm: powerpc: keep only pte search logic in lookup_linux_pte
lookup_linux_pte() was searching for a pte and also sets access flags is writable. This function now searches only pte while access flag setting is done explicitly. This pte lookup is not kvm specific, so moved to common code (asm/pgtable.h) My Followup patch will use this on booke. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v4-v5 - No change arch/powerpc/include/asm/pgtable.h | 24 +++ arch/powerpc/kvm/book3s_hv_rm_mmu.c | 36 +++--- 2 files changed, 36 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 7d6eacf..3a5de5c 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -223,6 +223,30 @@ extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, #endif pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea, unsigned *shift); + +static inline pte_t *lookup_linux_pte(pgd_t *pgdir, unsigned long hva, +unsigned long *pte_sizep) +{ + pte_t *ptep; + unsigned long ps = *pte_sizep; + unsigned int shift; + + ptep = find_linux_pte_or_hugepte(pgdir, hva, shift); + if (!ptep) + return __pte(0); + if (shift) + *pte_sizep = 1ul shift; + else + *pte_sizep = PAGE_SIZE; + + if (ps *pte_sizep) + return __pte(0); + + if (!pte_present(*ptep)) + return __pte(0); + + return ptep; +} #endif /* __ASSEMBLY__ */ #endif /* __KERNEL__ */ diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 45e30d6..74fa7f8 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -134,25 +134,6 @@ static void remove_revmap_chain(struct kvm *kvm, long pte_index, unlock_rmap(rmap); } -static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva, - int writing, unsigned long *pte_sizep) -{ - pte_t *ptep; - unsigned long ps = *pte_sizep; - unsigned int hugepage_shift; - - ptep = find_linux_pte_or_hugepte(pgdir, hva, hugepage_shift); - if (!ptep) - return __pte(0); - if (hugepage_shift) - *pte_sizep = 1ul hugepage_shift; - else - *pte_sizep = PAGE_SIZE; - if (ps *pte_sizep) - return __pte(0); - return kvmppc_read_update_linux_pte(ptep, writing, hugepage_shift); -} - static inline void unlock_hpte(unsigned long *hpte, unsigned long hpte_v) { asm volatile(PPC_RELEASE_BARRIER : : : memory); @@ -173,6 +154,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, unsigned long is_io; unsigned long *rmap; pte_t pte; + pte_t *ptep; unsigned int writing; unsigned long mmu_seq; unsigned long rcbits; @@ -231,8 +213,9 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, /* Look up the Linux PTE for the backing page */ pte_size = psize; - pte = lookup_linux_pte(pgdir, hva, writing, pte_size); - if (pte_present(pte)) { + ptep = lookup_linux_pte(pgdir, hva, pte_size); + if (pte_present(pte_val(*ptep))) { + pte = kvmppc_read_update_linux_pte(ptep, writing); if (writing !pte_write(pte)) /* make the actual HPTE be read-only */ ptel = hpte_make_readonly(ptel); @@ -661,15 +644,20 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags, struct kvm_memory_slot *memslot; pgd_t *pgdir = vcpu-arch.pgdir; pte_t pte; + pte_t *ptep; psize = hpte_page_size(v, r); gfn = ((r HPTE_R_RPN) ~(psize - 1)) PAGE_SHIFT; memslot = __gfn_to_memslot(kvm_memslots(kvm), gfn); if (memslot) { hva = __gfn_to_hva_memslot(memslot, gfn); - pte = lookup_linux_pte(pgdir, hva, 1, psize); - if (pte_present(pte) !pte_write(pte)) - r = hpte_make_readonly(r); + ptep = lookup_linux_pte(pgdir, hva, psize); + if (pte_present(pte_val(*ptep))) { + pte = kvmppc_read_update_linux_pte(ptep, + 1); + if (pte_present(pte) !pte_write(pte)) + r = hpte_make_readonly(r); + }
[PATCH 6/6 v5] kvm: powerpc: use caching attributes as per linux pte
KVM uses same WIM tlb attributes as the corresponding qemu pte. For this we now search the linux pte for the requested page and get these cache caching/coherency attributes from pte. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v4-v5 - No change arch/powerpc/include/asm/kvm_host.h |2 +- arch/powerpc/kvm/booke.c|2 +- arch/powerpc/kvm/e500.h |8 -- arch/powerpc/kvm/e500_mmu_host.c| 38 -- 4 files changed, 29 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 9741bf0..775f0e8 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -538,6 +538,7 @@ struct kvm_vcpu_arch { #endif gpa_t paddr_accessed; gva_t vaddr_accessed; + pgd_t *pgdir; u8 io_gpr; /* GPR used as IO source/target */ u8 mmio_is_bigendian; @@ -595,7 +596,6 @@ struct kvm_vcpu_arch { struct list_head run_list; struct task_struct *run_task; struct kvm_run *kvm_run; - pgd_t *pgdir; spinlock_t vpa_update_lock; struct kvmppc_vpa vpa; diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 17722d8..4171c7d 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -695,7 +695,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) kvmppc_load_guest_fp(vcpu); #endif - + vcpu-arch.pgdir = current-mm-pgd; kvmppc_fix_ee_before_entry(); ret = __kvmppc_vcpu_run(kvm_run, vcpu); diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index 4fd9650..a326178 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -31,11 +31,13 @@ enum vcpu_ftr { #define E500_TLB_NUM 2 /* entry is mapped somewhere in host TLB */ -#define E500_TLB_VALID (1 0) +#define E500_TLB_VALID (1 31) /* TLB1 entry is mapped by host TLB1, tracked by bitmaps */ -#define E500_TLB_BITMAP(1 1) +#define E500_TLB_BITMAP(1 30) /* TLB1 entry is mapped by host TLB0 */ -#define E500_TLB_TLB0 (1 2) +#define E500_TLB_TLB0 (1 29) +/* bits [6-5] MAS2_X1 and MAS2_X0 and [4-0] bits for WIMGE */ +#define E500_TLB_MAS2_ATTR (0x7f) struct tlbe_ref { pfn_t pfn; /* valid only for TLB0, except briefly */ diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 60f5a3c..654c368 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,15 +64,6 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) -{ -#ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; -#endif -} - /* * writing shadow tlb entry to host TLB */ @@ -250,10 +241,12 @@ static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe) static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref, struct kvm_book3e_206_tlb_entry *gtlbe, -pfn_t pfn) +pfn_t pfn, unsigned int wimg) { ref-pfn = pfn; ref-flags = E500_TLB_VALID; + /* Use guest supplied MAS2_G and MAS2_E */ + ref-flags |= (gtlbe-mas2 MAS2_ATTRIB_MASK) | wimg; if (tlbe_is_writable(gtlbe)) kvm_set_pfn_dirty(pfn); @@ -314,8 +307,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; - stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + stlbe-mas2 = (gvaddr MAS2_EPN) | (ref-flags E500_TLB_MAS2_ATTR); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); @@ -334,6 +326,10 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, unsigned long hva; int pfnmap = 0; int tsize = BOOK3E_PAGESZ_4K; + unsigned long tsize_pages = 0; + pte_t *ptep; + unsigned int wimg = 0; + pgd_t *pgdir; /* * Translate guest physical to true physical, acquiring @@ -396,7 +392,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, */ for (; tsize BOOK3E_PAGESZ_4K; tsize -= 2) { - unsigned long gfn_start, gfn_end, tsize_pages; + unsigned long gfn_start, gfn_end; tsize_pages = 1 (tsize - 2); gfn_start = gfn ~(tsize_pages - 1); @@ -438,9 +434,10 @@ static inline int
[PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
On booke, struct tlbe_ref contains host tlb mapping information (pfn: for guest-pfn to pfn, flags: attribute associated with this mapping) for a guest tlb entry. So when a guest creates a TLB entry then struct tlbe_ref is set to point to valid pfn and set attributes in flags field of the above said structure. When a guest TLB entry is invalidated then flags field of corresponding struct tlbe_ref is updated to point that this is no more valid, also we selectively clear some other attribute bits, example: if E500_TLB_BITMAP was set then we clear E500_TLB_BITMAP, if E500_TLB_TLB0 is set then we clear this. Ideally we should clear complete flags as this entry is invalid and does not have anything to re-used. The other part of the problem is that when we use the same entry again then also we do not clear (started doing or-ing etc). So far it was working because the selectively clearing mentioned above actually clears flags what was set during TLB mapping. But the problem starts coming when we add more attributes to this then we need to selectively clear them and which is not needed. This patch we do both - Clear flags when invalidating; - Clear flags when reusing same entry later Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v3- v5 - New patch (found this issue when doing vfio-pci development) arch/powerpc/kvm/e500_mmu_host.c | 12 +++- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..60f5a3c 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -217,7 +217,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, } mb(); vcpu_e500-g2h_tlb1_map[esel] = 0; - ref-flags = ~(E500_TLB_BITMAP | E500_TLB_VALID); + /* Clear flags as TLB is not backed by the host anymore */ + ref-flags = 0; local_irq_restore(flags); } @@ -227,7 +228,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, * rarely and is not worth optimizing. Invalidate everything. */ kvmppc_e500_tlbil_all(vcpu_e500); - ref-flags = ~(E500_TLB_TLB0 | E500_TLB_VALID); + /* Clear flags as TLB is not backed by the host anymore */ + ref-flags = 0; } /* Already invalidated in between */ @@ -237,8 +239,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, /* Guest tlbe is backed by at most one host tlbe per shadow pid. */ kvmppc_e500_tlbil_one(vcpu_e500, gtlbe); - /* Mark the TLB as not backed by the host anymore */ - ref-flags = ~E500_TLB_VALID; + /* Clear flags as TLB is not backed by the host anymore */ + ref-flags = 0; } static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe) @@ -251,7 +253,7 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref, pfn_t pfn) { ref-pfn = pfn; - ref-flags |= E500_TLB_VALID; + ref-flags = E500_TLB_VALID; if (tlbe_is_writable(gtlbe)) kvm_set_pfn_dirty(pfn); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 19/23] KVM: PPC: Book3S: Select PR vs HV separately for each guest
Am 18.09.2013 um 07:05 schrieb Paul Mackerras pau...@samba.org: On Thu, Sep 12, 2013 at 11:17:11PM -0500, Alexander Graf wrote: It means you can only choose between HV and PR machine wide, while with this patch set you give the user the flexibility to have HV and PR guests run in parallel. I know that Anthony doesn't believe it's a valid use case, but I like the flexible solution better. It does however male sense to enable a sysadmin to remove any PR functionality from the system by blocking that module. Can't we have both? So, one suggestion (from Aneesh) is to use the 'type' argument to kvm_arch_init_vm() to indicate whether we want a specific type of KVM (PR or HV), or just the default. Zero would mean default (fastest available) whereas other values would indicate a specific choice of PR or HV. Then, if we build separate kvm_pr and kvm_hv modules when KVM is configured to be a module, the sysadmin can control the default choice by loading and unloading modules. How does that sound? Or would you prefer to stick with a single module and have a module option to control the default choice? I think keeping 2 modules makes a lot of sense, but I'm not sure a parameter to init_vm works well with the way we model machines in QEMU. IIRC we only know that we force anything in the machine model initialization which happens way past the vm init. Alex Paul. -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
On Thu, 2013-09-19 at 11:32 +0530, Bharat Bhushan wrote: On booke, struct tlbe_ref contains host tlb mapping information (pfn: for guest-pfn to pfn, flags: attribute associated with this mapping) for a guest tlb entry. So when a guest creates a TLB entry then struct tlbe_ref is set to point to valid pfn and set attributes in flags field of the above said structure. When a guest TLB entry is invalidated then flags field of corresponding struct tlbe_ref is updated to point that this is no more valid, also we selectively clear some other attribute bits, example: if E500_TLB_BITMAP was set then we clear E500_TLB_BITMAP, if E500_TLB_TLB0 is set then we clear this. Ideally we should clear complete flags as this entry is invalid and does not have anything to re-used. The other part of the problem is that when we use the same entry again then also we do not clear (started doing or-ing etc). So far it was working because the selectively clearing mentioned above actually clears flags what was set during TLB mapping. But the problem starts coming when we add more attributes to this then we need to selectively clear them and which is not needed. This patch we do both - Clear flags when invalidating; - Clear flags when reusing same entry later Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v3- v5 - New patch (found this issue when doing vfio-pci development) arch/powerpc/kvm/e500_mmu_host.c | 12 +++- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..60f5a3c 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -217,7 +217,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, } mb(); vcpu_e500-g2h_tlb1_map[esel] = 0; - ref-flags = ~(E500_TLB_BITMAP | E500_TLB_VALID); + /* Clear flags as TLB is not backed by the host anymore */ + ref-flags = 0; local_irq_restore(flags); } This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set. Instead, just convert the final E500_TLB_VALID clearing at the end into ref-flags = 0, and convert the early return a few lines earlier into conditional execution of the tlbil_one(). -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/11] KVM: PPC: Book3S HV: Add support for guest Program Priority Register
On 16.09.2013, at 22:29, Benjamin Herrenschmidt wrote: On Fri, 2013-09-06 at 13:22 +1000, Paul Mackerras wrote: POWER7 and later IBM server processors have a register called the Program Priority Register (PPR), which controls the priority of each hardware CPU SMT thread, and affects how fast it runs compared to other SMT threads. This priority can be controlled by writing to the PPR or by use of a set of instructions of the form or rN,rN,rN which are otherwise no-ops but have been defined to set the priority to particular levels. This adds code to context switch the PPR when entering and exiting guests and to make the PPR value accessible through the SET/GET_ONE_REG interface. When entering the guest, we set the PPR as late as possible, because if we are setting a low thread priority it will make the code run slowly from that point on. Similarly, the first-level interrupt handlers save the PPR value in the PACA very early on, and set the thread priority to the medium level, so that the interrupt handling code runs at a reasonable speed. Signed-off-by: Paul Mackerras pau...@samba.org Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org Alex, can you take this via your tree ? Yes, on the next respin :). Or is this one urgent? Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation
-Original Message- From: Wood Scott-B07421 Sent: Friday, September 20, 2013 2:38 AM To: Bhushan Bharat-R65777 Cc: b...@kernel.crashing.org; ag...@suse.de; pau...@samba.org; k...@vger.kernel.org; kvm-ppc@vger.kernel.org; linuxppc-...@lists.ozlabs.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation On Thu, 2013-09-19 at 11:32 +0530, Bharat Bhushan wrote: On booke, struct tlbe_ref contains host tlb mapping information (pfn: for guest-pfn to pfn, flags: attribute associated with this mapping) for a guest tlb entry. So when a guest creates a TLB entry then struct tlbe_ref is set to point to valid pfn and set attributes in flags field of the above said structure. When a guest TLB entry is invalidated then flags field of corresponding struct tlbe_ref is updated to point that this is no more valid, also we selectively clear some other attribute bits, example: if E500_TLB_BITMAP was set then we clear E500_TLB_BITMAP, if E500_TLB_TLB0 is set then we clear this. Ideally we should clear complete flags as this entry is invalid and does not have anything to re-used. The other part of the problem is that when we use the same entry again then also we do not clear (started doing or-ing etc). So far it was working because the selectively clearing mentioned above actually clears flags what was set during TLB mapping. But the problem starts coming when we add more attributes to this then we need to selectively clear them and which is not needed. This patch we do both - Clear flags when invalidating; - Clear flags when reusing same entry later Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v3- v5 - New patch (found this issue when doing vfio-pci development) arch/powerpc/kvm/e500_mmu_host.c | 12 +++- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..60f5a3c 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -217,7 +217,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, } mb(); vcpu_e500-g2h_tlb1_map[esel] = 0; - ref-flags = ~(E500_TLB_BITMAP | E500_TLB_VALID); + /* Clear flags as TLB is not backed by the host anymore */ + ref-flags = 0; local_irq_restore(flags); } This breaks when you have both E500_TLB_BITMAP and E500_TLB_TLB0 set. I do not see any case where we set both E500_TLB_BITMAP and E500_TLB_TLB0. Also we have not optimized that yet (keeping track of multiple shadow TLB0 entries for one guest TLB1 entry) We uses these bit flags only for TLB1 and if size of stlbe is 4K then we set E500_TLB_TLB0 otherwise we set E500_TLB_BITMAP. Although I think that E500_TLB_BITMAP should be set only if stlbe size is less than gtlbe size. Instead, just convert the final E500_TLB_VALID clearing at the end into ref-flags = 0, and convert the early return a few lines earlier into conditional execution of the tlbil_one(). This looks better, will send the patch shortly. Thanks -Bharat -Scott N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
[PATCH 5/6 v6] kvm: booke: clear host tlb reference flag on guest tlb invalidation
On booke, struct tlbe_ref contains host tlb mapping information (pfn: for guest-pfn to pfn, flags: attribute associated with this mapping) for a guest tlb entry. So when a guest creates a TLB entry then struct tlbe_ref is set to point to valid pfn and set attributes in flags field of the above said structure. When a guest TLB entry is invalidated then flags field of corresponding struct tlbe_ref is updated to point that this is no more valid, also we selectively clear some other attribute bits, example: if E500_TLB_BITMAP was set then we clear E500_TLB_BITMAP, if E500_TLB_TLB0 is set then we clear this. Ideally we should clear complete flags as this entry is invalid and does not have anything to re-used. The other part of the problem is that when we use the same entry again then also we do not clear (started doing or-ing etc). So far it was working because the selectively clearing mentioned above actually clears flags what was set during TLB mapping. But the problem starts coming when we add more attributes to this then we need to selectively clear them and which is not needed. This patch we do both - Clear flags when invalidating; - Clear flags when reusing same entry later Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v5-v6 - Fix flag clearing comment arch/powerpc/kvm/e500_mmu_host.c | 16 1 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..7370e1c 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -230,15 +230,15 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, ref-flags = ~(E500_TLB_TLB0 | E500_TLB_VALID); } - /* Already invalidated in between */ - if (!(ref-flags E500_TLB_VALID)) - return; - - /* Guest tlbe is backed by at most one host tlbe per shadow pid. */ - kvmppc_e500_tlbil_one(vcpu_e500, gtlbe); + /* +* Check whether TLB entry is already invalidated in between +* Guest tlbe is backed by at most one host tlbe per shadow pid. +*/ + if (ref-flags E500_TLB_VALID) + kvmppc_e500_tlbil_one(vcpu_e500, gtlbe); /* Mark the TLB as not backed by the host anymore */ - ref-flags = ~E500_TLB_VALID; + ref-flags = 0; } static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe) @@ -251,7 +251,7 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref, pfn_t pfn) { ref-pfn = pfn; - ref-flags |= E500_TLB_VALID; + ref-flags = E500_TLB_VALID; if (tlbe_is_writable(gtlbe)) kvm_set_pfn_dirty(pfn); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/18] KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu
Currently PR-style KVM keeps the volatile guest register values (R0 - R13, CR, LR, CTR, XER, PC) in a shadow_vcpu struct rather than the main kvm_vcpu struct. For 64-bit, the shadow_vcpu exists in two places, a kmalloc'd struct and in the PACA, and it gets copied back and forth in kvmppc_core_vcpu_load/put(), because the real-mode code can't rely on being able to access the kmalloc'd struct. This changes the code to copy the volatile values into the shadow_vcpu as one of the last things done before entering the guest. Similarly the values are copied back out of the shadow_vcpu to the kvm_vcpu immediately after exiting the guest. We arrange for interrupts to be still disabled at this point so that we can't get preempted on 64-bit and end up copying values from the wrong PACA. This means that the accessor functions in kvm_book3s.h for these registers are greatly simplified, and are same between PR and HV KVM. In places where accesses to shadow_vcpu fields are now replaced by accesses to the kvm_vcpu, we can also remove the svcpu_get/put pairs. Finally, on 64-bit, we don't need the kmalloc'd struct at all any more. With this, the time to read the PVR one million times in a loop went from 567.7ms to 575.5ms (averages of 6 values), an increase of about 1.4% for this worse-case test for guest entries and exits. The standard deviation of the measurements is about 11ms, so the difference is only marginally significant statistically. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_book3s.h | 220 +- arch/powerpc/include/asm/kvm_book3s_asm.h | 6 +- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/kernel/asm-offsets.c | 4 +- arch/powerpc/kvm/book3s_emulate.c | 8 +- arch/powerpc/kvm/book3s_interrupts.S | 27 +++- arch/powerpc/kvm/book3s_pr.c | 122 - arch/powerpc/kvm/book3s_rmhandlers.S | 6 +- arch/powerpc/kvm/trace.h | 7 +- 9 files changed, 162 insertions(+), 239 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 14a4741..40f22d9 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -200,140 +200,76 @@ extern void kvm_return_point(void); #include asm/kvm_book3s_64.h #endif -#ifdef CONFIG_KVM_BOOK3S_PR - -static inline unsigned long kvmppc_interrupt_offset(struct kvm_vcpu *vcpu) -{ - return to_book3s(vcpu)-hior; -} - -static inline void kvmppc_update_int_pending(struct kvm_vcpu *vcpu, - unsigned long pending_now, unsigned long old_pending) -{ - if (pending_now) - vcpu-arch.shared-int_pending = 1; - else if (old_pending) - vcpu-arch.shared-int_pending = 0; -} - static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val) { - if ( num 14 ) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu); - svcpu-gpr[num] = val; - svcpu_put(svcpu); - to_book3s(vcpu)-shadow_vcpu-gpr[num] = val; - } else - vcpu-arch.gpr[num] = val; + vcpu-arch.gpr[num] = val; } static inline ulong kvmppc_get_gpr(struct kvm_vcpu *vcpu, int num) { - if ( num 14 ) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu); - ulong r = svcpu-gpr[num]; - svcpu_put(svcpu); - return r; - } else - return vcpu-arch.gpr[num]; + return vcpu-arch.gpr[num]; } static inline void kvmppc_set_cr(struct kvm_vcpu *vcpu, u32 val) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu); - svcpu-cr = val; - svcpu_put(svcpu); - to_book3s(vcpu)-shadow_vcpu-cr = val; + vcpu-arch.cr = val; } static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu); - u32 r; - r = svcpu-cr; - svcpu_put(svcpu); - return r; + return vcpu-arch.cr; } static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu); - svcpu-xer = val; - to_book3s(vcpu)-shadow_vcpu-xer = val; - svcpu_put(svcpu); + vcpu-arch.xer = val; } static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu); - u32 r; - r = svcpu-xer; - svcpu_put(svcpu); - return r; + return vcpu-arch.xer; } static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu); - svcpu-ctr = val; - svcpu_put(svcpu); + vcpu-arch.ctr = val; } static inline ulong kvmppc_get_ctr(struct kvm_vcpu *vcpu) { - struct kvmppc_book3s_shadow_vcpu *svcpu = svcpu_get(vcpu);
[PATCH 12/18] KVM: PPC: Book3S PR: Make HPT accesses and updates SMP-safe
This adds a per-VM mutex to provide mutual exclusion between vcpus for accesses to and updates of the guest hashed page table (HPT). This also makes the code use single-byte writes to the HPT entry when updating of the reference (R) and change (C) bits. The reason for doing this, rather than writing back the whole HPTE, is that on non-PAPR virtual machines, the guest OS might be writing to the HPTE concurrently, and writing back the whole HPTE might conflict with that. Also, real hardware does single-byte writes to update R and C. The new mutex is taken in kvmppc_mmu_book3s_64_xlate() when reading the HPT and updating R and/or C, and in the PAPR HPT update hcalls (H_ENTER, H_REMOVE, etc.). Having the mutex means that we don't need to use a hypervisor lock bit in the HPT update hcalls, and we don't need to be careful about the order in which the bytes of the HPTE are updated by those hcalls. The other change here is to make emulated TLB invalidations (tlbie) effective across all vcpus. To do this we call kvmppc_mmu_pte_vflush for all vcpus in kvmppc_ppc_book3s_64_tlbie(). For 32-bit, this makes the setting of the accessed and dirty bits use single-byte writes, and makes tlbie invalidate shadow HPTEs for all vcpus. With this, PR KVM can successfully run SMP guests. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_host.h | 3 +++ arch/powerpc/kvm/book3s_32_mmu.c| 36 ++-- arch/powerpc/kvm/book3s_64_mmu.c| 33 +++-- arch/powerpc/kvm/book3s_pr.c| 1 + arch/powerpc/kvm/book3s_pr_papr.c | 33 +++-- 5 files changed, 72 insertions(+), 34 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 2c7963b..1f7349d 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -259,6 +259,9 @@ struct kvm_arch { struct kvmppc_vcore *vcores[KVM_MAX_VCORES]; int hpt_cma_alloc; #endif /* CONFIG_KVM_BOOK3S_64_HV */ +#ifdef CONFIG_KVM_BOOK3S_PR + struct mutex hpt_mutex; +#endif #ifdef CONFIG_PPC_BOOK3S_64 struct list_head spapr_tce_tables; struct list_head rtas_tokens; diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c index af04553..856af98 100644 --- a/arch/powerpc/kvm/book3s_32_mmu.c +++ b/arch/powerpc/kvm/book3s_32_mmu.c @@ -271,19 +271,22 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu *vcpu, gva_t eaddr, /* Update PTE C and A bits, so the guest's swapper knows we used the page */ if (found) { - u32 oldpte = pteg[i+1]; - - if (pte-may_read) - pteg[i+1] |= PTEG_FLAG_ACCESSED; - if (pte-may_write) - pteg[i+1] |= PTEG_FLAG_DIRTY; - else - dprintk_pte(KVM: Mapping read-only page!\n); - - /* Write back into the PTEG */ - if (pteg[i+1] != oldpte) - copy_to_user((void __user *)ptegp, pteg, sizeof(pteg)); - + u32 pte_r = pteg[i+1]; + char __user *addr = (char __user *) pteg[i+1]; + + /* +* Use single-byte writes to update the HPTE, to +* conform to what real hardware does. +*/ + if (pte-may_read !(pte_r PTEG_FLAG_ACCESSED)) { + pte_r |= PTEG_FLAG_ACCESSED; + put_user(pte_r 8, addr + 2); + } + if (pte-may_write !(pte_r PTEG_FLAG_DIRTY)) { + /* XXX should only set this for stores */ + pte_r |= PTEG_FLAG_DIRTY; + put_user(pte_r, addr + 3); + } return 0; } @@ -348,7 +351,12 @@ static void kvmppc_mmu_book3s_32_mtsrin(struct kvm_vcpu *vcpu, u32 srnum, static void kvmppc_mmu_book3s_32_tlbie(struct kvm_vcpu *vcpu, ulong ea, bool large) { - kvmppc_mmu_pte_flush(vcpu, ea, 0x0000); + int i; + struct kvm_vcpu *v; + + /* flush this VA on all cpus */ + kvm_for_each_vcpu(i, v, vcpu-kvm) + kvmppc_mmu_pte_flush(v, ea, 0x0000); } static int kvmppc_mmu_book3s_32_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid, diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c index 6aded53..2d2e88b 100644 --- a/arch/powerpc/kvm/book3s_64_mmu.c +++ b/arch/powerpc/kvm/book3s_64_mmu.c @@ -257,6 +257,8 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr, pgsize = slbe-large ? MMU_PAGE_16M : MMU_PAGE_4K; + mutex_lock(vcpu-kvm-arch.hpt_mutex); + do_second: ptegp = kvmppc_mmu_book3s_64_get_pteg(vcpu_book3s, slbe, eaddr, second); if (kvm_is_error_hva(ptegp)) @@ -332,30 +334,37 @@ do_second: /* Update PTE R and C bits, so the
[PATCH 00/18] KVM: PPC: Fixes for PR and preparation for POWER8
This patch series contains updated versions of patches that have been posted before, plus one new compilation fix (for PR KVM without CONFIG_ALTIVEC), plus a patch to allow the guest VRSAVE register to be accessed with the ONE_REG interface on Book E. The first few patches are preparation for POWER8 support. Following that there are several patches that improve PR KVM's MMU emulation and prepare for being able to compile both HV and PR KVM in the one kernel. The series stops short of allowing them to coexist, though, since the details of how that should best be done are still being discussed. Please apply. Paul. --- Documentation/virtual/kvm/api.txt | 3 + arch/powerpc/include/asm/exception-64s.h | 8 + arch/powerpc/include/asm/kvm_asm.h| 2 + arch/powerpc/include/asm/kvm_book3s.h | 246 ++ arch/powerpc/include/asm/kvm_book3s_32.h | 2 +- arch/powerpc/include/asm/kvm_book3s_asm.h | 7 +- arch/powerpc/include/asm/kvm_host.h | 22 ++- arch/powerpc/include/asm/reg.h| 14 ++ arch/powerpc/include/uapi/asm/kvm.h | 5 + arch/powerpc/kernel/asm-offsets.c | 8 +- arch/powerpc/kernel/exceptions-64s.S | 26 +++ arch/powerpc/kvm/book3s.c | 15 +- arch/powerpc/kvm/book3s_32_mmu.c | 73 arch/powerpc/kvm/book3s_32_mmu_host.c | 14 +- arch/powerpc/kvm/book3s_64_mmu.c | 181 +++ arch/powerpc/kvm/book3s_64_mmu_host.c | 102 --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 7 +- arch/powerpc/kvm/book3s_emulate.c | 8 +- arch/powerpc/kvm/book3s_hv.c | 116 ++-- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 57 +++--- arch/powerpc/kvm/book3s_interrupts.S | 27 ++- arch/powerpc/kvm/book3s_mmu_hpte.c| 64 ++- arch/powerpc/kvm/book3s_pr.c | 282 +++--- arch/powerpc/kvm/book3s_pr_papr.c | 52 -- arch/powerpc/kvm/book3s_rmhandlers.S | 32 +--- arch/powerpc/kvm/booke.c | 6 + arch/powerpc/kvm/trace.h | 7 +- 27 files changed, 920 insertions(+), 466 deletions(-) -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/18] KVM: PPC: Book3S PR: Allow guest to use 64k pages
This adds the code to interpret 64k HPTEs in the guest hashed page table (HPT), 64k SLB entries, and to tell the guest about 64k pages in kvm_vm_ioctl_get_smmu_info(). Guest 64k pages are still shadowed by 4k pages. This also adds another hash table to the four we have already in book3s_mmu_hpte.c to allow us to find all the PTEs that we have instantiated that match a given 64k guest page. The tlbie instruction changed starting with POWER6 to use a bit in the RB operand to indicate large page invalidations, and to use other RB bits to indicate the base and actual page sizes and the segment size. 64k pages came in slightly earlier, with POWER5++. We use one bit in vcpu-arch.hflags to indicate that the emulated cpu supports 64k pages, and another to indicate that it has the new tlbie definition. The KVM_PPC_GET_SMMU_INFO ioctl presents a bit of a problem, because the MMU capabilities depend on which CPU model we're emulating, but it is a VM ioctl not a VCPU ioctl and therefore doesn't get passed a VCPU fd. In addition, commonly-used userspace (QEMU) calls it before setting the PVR for any VCPU. Therefore, as a best effort we look at the first vcpu in the VM and return 64k pages or not depending on its capabilities. We also make the PVR default to the host PVR on recent CPUs that support 1TB segments (and therefore multiple page sizes as well) so that KVM_PPC_GET_SMMU_INFO will include 64k page and 1TB segment support on those CPUs. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_asm.h| 2 + arch/powerpc/include/asm/kvm_book3s.h | 6 +++ arch/powerpc/include/asm/kvm_host.h | 4 ++ arch/powerpc/kvm/book3s_64_mmu.c | 92 +++ arch/powerpc/kvm/book3s_mmu_hpte.c| 50 +++ arch/powerpc/kvm/book3s_pr.c | 58 +++--- 6 files changed, 197 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h index 851bac7..e2d4d46 100644 --- a/arch/powerpc/include/asm/kvm_asm.h +++ b/arch/powerpc/include/asm/kvm_asm.h @@ -123,6 +123,8 @@ #define BOOK3S_HFLAG_SLB 0x2 #define BOOK3S_HFLAG_PAIRED_SINGLE 0x4 #define BOOK3S_HFLAG_NATIVE_PS 0x8 +#define BOOK3S_HFLAG_MULTI_PGSIZE 0x10 +#define BOOK3S_HFLAG_NEW_TLBIE 0x20 #define RESUME_FLAG_NV (10) /* Reload guest nonvolatile state? */ #define RESUME_FLAG_HOST(11) /* Resume host? */ diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 40f22d9..1d4a120 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -58,6 +58,9 @@ struct hpte_cache { struct hlist_node list_pte_long; struct hlist_node list_vpte; struct hlist_node list_vpte_long; +#ifdef CONFIG_PPC_BOOK3S_64 + struct hlist_node list_vpte_64k; +#endif struct rcu_head rcu_head; u64 host_vpn; u64 pfn; @@ -99,6 +102,9 @@ struct kvmppc_vcpu_book3s { struct hlist_head hpte_hash_pte_long[HPTEG_HASH_NUM_PTE_LONG]; struct hlist_head hpte_hash_vpte[HPTEG_HASH_NUM_VPTE]; struct hlist_head hpte_hash_vpte_long[HPTEG_HASH_NUM_VPTE_LONG]; +#ifdef CONFIG_PPC_BOOK3S_64 + struct hlist_head hpte_hash_vpte_64k[HPTEG_HASH_NUM_VPTE_64K]; +#endif int hpte_cache_count; spinlock_t mmu_lock; }; diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 933ae29..2c7963b 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -73,10 +73,12 @@ extern void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte); #define HPTEG_HASH_BITS_PTE_LONG 12 #define HPTEG_HASH_BITS_VPTE 13 #define HPTEG_HASH_BITS_VPTE_LONG 5 +#define HPTEG_HASH_BITS_VPTE_64K 11 #define HPTEG_HASH_NUM_PTE (1 HPTEG_HASH_BITS_PTE) #define HPTEG_HASH_NUM_PTE_LONG(1 HPTEG_HASH_BITS_PTE_LONG) #define HPTEG_HASH_NUM_VPTE(1 HPTEG_HASH_BITS_VPTE) #define HPTEG_HASH_NUM_VPTE_LONG (1 HPTEG_HASH_BITS_VPTE_LONG) +#define HPTEG_HASH_NUM_VPTE_64K(1 HPTEG_HASH_BITS_VPTE_64K) /* Physical Address Mask - allowed range of real mode RAM access */ #define KVM_PAM0x0fffULL @@ -332,6 +334,7 @@ struct kvmppc_pte { bool may_read : 1; bool may_write : 1; bool may_execute: 1; + u8 page_size; /* MMU_PAGE_xxx */ }; struct kvmppc_mmu { @@ -364,6 +367,7 @@ struct kvmppc_slb { bool large : 1;/* PTEs are 16MB */ bool tb : 1;/* 1TB segment */ bool class : 1; + u8 base_page_size; /* MMU_PAGE_xxx */ }; # ifdef CONFIG_PPC_FSL_BOOK3E diff --git a/arch/powerpc/kvm/book3s_64_mmu.c
[PATCH 14/18] KVM: PPC: Book3S: Move skip-interrupt handlers to common code
Both PR and HV KVM have separate, identical copies of the kvmppc_skip_interrupt and kvmppc_skip_Hinterrupt handlers that are used for the situation where an interrupt happens when loading the instruction that caused an exit from the guest. To eliminate this duplication and make it easier to compile in both PR and HV KVM, this moves this code to arch/powerpc/kernel/exceptions-64s.S along with other kernel interrupt handler code. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kernel/exceptions-64s.S| 26 ++ arch/powerpc/kvm/book3s_hv_rmhandlers.S | 24 arch/powerpc/kvm/book3s_rmhandlers.S| 26 -- 3 files changed, 26 insertions(+), 50 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 4e00d22..580d97a 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -645,6 +645,32 @@ slb_miss_user_pseries: b . /* prevent spec. execution */ #endif /* __DISABLED__ */ +#ifdef CONFIG_KVM_BOOK3S_64_HANDLER +kvmppc_skip_interrupt: + /* +* Here all GPRs are unchanged from when the interrupt happened +* except for r13, which is saved in SPRG_SCRATCH0. +*/ + mfspr r13, SPRN_SRR0 + addir13, r13, 4 + mtspr SPRN_SRR0, r13 + GET_SCRATCH0(r13) + rfid + b . + +kvmppc_skip_Hinterrupt: + /* +* Here all GPRs are unchanged from when the interrupt happened +* except for r13, which is saved in SPRG_SCRATCH0. +*/ + mfspr r13, SPRN_HSRR0 + addir13, r13, 4 + mtspr SPRN_HSRR0, r13 + GET_SCRATCH0(r13) + hrfid + b . +#endif + /* * Code from here down to __end_handlers is invoked from the * exception prologs above. Because the prologs assemble the diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 6eb252a..8e0f28f 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -29,30 +29,6 @@ #include asm/kvm_book3s_asm.h #include asm/mmu-hash64.h -/* - * * - *Real Mode handlers that need to be in the linear mapping * - * * - / - - .globl kvmppc_skip_interrupt -kvmppc_skip_interrupt: - mfspr r13,SPRN_SRR0 - addir13,r13,4 - mtspr SPRN_SRR0,r13 - GET_SCRATCH0(r13) - rfid - b . - - .globl kvmppc_skip_Hinterrupt -kvmppc_skip_Hinterrupt: - mfspr r13,SPRN_HSRR0 - addir13,r13,4 - mtspr SPRN_HSRR0,r13 - GET_SCRATCH0(r13) - hrfid - b . - /* * Call kvmppc_hv_entry in real mode. * Must be called with interrupts hard-disabled. diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S index cd59a3a..a38c4c9 100644 --- a/arch/powerpc/kvm/book3s_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_rmhandlers.S @@ -38,32 +38,6 @@ #define FUNC(name) GLUE(.,name) - .globl kvmppc_skip_interrupt -kvmppc_skip_interrupt: - /* -* Here all GPRs are unchanged from when the interrupt happened -* except for r13, which is saved in SPRG_SCRATCH0. -*/ - mfspr r13, SPRN_SRR0 - addir13, r13, 4 - mtspr SPRN_SRR0, r13 - GET_SCRATCH0(r13) - rfid - b . - - .globl kvmppc_skip_Hinterrupt -kvmppc_skip_Hinterrupt: - /* -* Here all GPRs are unchanged from when the interrupt happened -* except for r13, which is saved in SPRG_SCRATCH0. -*/ - mfspr r13, SPRN_HSRR0 - addir13, r13, 4 - mtspr SPRN_HSRR0, r13 - GET_SCRATCH0(r13) - hrfid - b . - #elif defined(CONFIG_PPC_BOOK3S_32) #define FUNC(name) name -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/18] KVM: PPC: Book3S PR: Allocate kvm_vcpu structs from kvm_vcpu_cache
This makes PR KVM allocate its kvm_vcpu structs from the kvm_vcpu_cache rather than having them embedded in the kvmppc_vcpu_book3s struct, which is allocated with vzalloc. The reason is to reduce the differences between PR and HV KVM in order to make is easier to have them coexist in one kernel binary. With this, the kvm_vcpu struct has a pointer to the kvmppc_vcpu_book3s struct. The pointer to the kvmppc_book3s_shadow_vcpu struct has moved from the kvmppc_vcpu_book3s struct to the kvm_vcpu struct, and is only present for 32-bit, since it is only used for 32-bit. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_book3s.h| 4 +--- arch/powerpc/include/asm/kvm_book3s_32.h | 2 +- arch/powerpc/include/asm/kvm_host.h | 7 +++ arch/powerpc/kvm/book3s_32_mmu.c | 8 arch/powerpc/kvm/book3s_64_mmu.c | 11 +-- arch/powerpc/kvm/book3s_pr.c | 31 --- 6 files changed, 38 insertions(+), 25 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 6bf20b4..603fba4 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -70,8 +70,6 @@ struct hpte_cache { }; struct kvmppc_vcpu_book3s { - struct kvm_vcpu vcpu; - struct kvmppc_book3s_shadow_vcpu *shadow_vcpu; struct kvmppc_sid_map sid_map[SID_MAP_NUM]; struct { u64 esid; @@ -194,7 +192,7 @@ extern int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd); static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu) { - return container_of(vcpu, struct kvmppc_vcpu_book3s, vcpu); + return vcpu-arch.book3s; } extern void kvm_return_point(void); diff --git a/arch/powerpc/include/asm/kvm_book3s_32.h b/arch/powerpc/include/asm/kvm_book3s_32.h index ce0ef6c..c720e0b 100644 --- a/arch/powerpc/include/asm/kvm_book3s_32.h +++ b/arch/powerpc/include/asm/kvm_book3s_32.h @@ -22,7 +22,7 @@ static inline struct kvmppc_book3s_shadow_vcpu *svcpu_get(struct kvm_vcpu *vcpu) { - return to_book3s(vcpu)-shadow_vcpu; + return vcpu-arch.shadow_vcpu; } static inline void svcpu_put(struct kvmppc_book3s_shadow_vcpu *svcpu) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 1f7349d..f482594 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -91,6 +91,9 @@ struct lppaca; struct slb_shadow; struct dtl_entry; +struct kvmppc_vcpu_book3s; +struct kvmppc_book3s_shadow_vcpu; + struct kvm_vm_stat { u32 remote_tlb_flush; }; @@ -413,6 +416,10 @@ struct kvm_vcpu_arch { int slb_max;/* 1 + index of last valid entry in slb[] */ int slb_nr; /* total number of entries in SLB */ struct kvmppc_mmu mmu; + struct kvmppc_vcpu_book3s *book3s; +#endif +#ifdef CONFIG_PPC_BOOK3S_32 + struct kvmppc_book3s_shadow_vcpu *shadow_vcpu; #endif ulong gpr[32]; diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c index 856af98..b14af6d 100644 --- a/arch/powerpc/kvm/book3s_32_mmu.c +++ b/arch/powerpc/kvm/book3s_32_mmu.c @@ -111,10 +111,11 @@ static void kvmppc_mmu_book3s_32_reset_msr(struct kvm_vcpu *vcpu) kvmppc_set_msr(vcpu, 0); } -static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3s, +static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvm_vcpu *vcpu, u32 sre, gva_t eaddr, bool primary) { + struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu); u32 page, hash, pteg, htabmask; hva_t r; @@ -132,7 +133,7 @@ static hva_t kvmppc_mmu_book3s_32_get_pteg(struct kvmppc_vcpu_book3s *vcpu_book3 kvmppc_get_pc(vcpu_book3s-vcpu), eaddr, vcpu_book3s-sdr1, pteg, sr_vsid(sre)); - r = gfn_to_hva(vcpu_book3s-vcpu.kvm, pteg PAGE_SHIFT); + r = gfn_to_hva(vcpu-kvm, pteg PAGE_SHIFT); if (kvm_is_error_hva(r)) return r; return r | (pteg ~PAGE_MASK); @@ -203,7 +204,6 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data, bool primary) { - struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu); u32 sre; hva_t ptegp; u32 pteg[16]; @@ -218,7 +218,7 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu *vcpu, gva_t eaddr, pte-vpage = kvmppc_mmu_book3s_32_ea_to_vp(vcpu, eaddr, data); - ptegp = kvmppc_mmu_book3s_32_get_pteg(vcpu_book3s, sre, eaddr, primary); + ptegp = kvmppc_mmu_book3s_32_get_pteg(vcpu, sre, eaddr, primary); if (kvm_is_error_hva(ptegp)) { printk(KERN_INFO KVM: Invalid PTEG!\n);
[PATCH 15/18] KVM: PPC: Book3S PR: Better handling of host-side read-only pages
Currently we request write access to all pages that get mapped into the guest, even if the guest is only loading from the page. This reduces the effectiveness of KSM because it means that we unshare every page we access. Also, we always set the changed (C) bit in the guest HPTE if it allows writing, even for a guest load. This fixes both these problems. We pass an 'iswrite' flag to the mmu.xlate() functions and to kvmppc_mmu_map_page() to indicate whether the access is a load or a store. The mmu.xlate() functions now only set C for stores. kvmppc_gfn_to_pfn() now calls gfn_to_pfn_prot() instead of gfn_to_pfn() so that it can indicate whether we need write access to the page, and get back a 'writable' flag to indicate whether the page is writable or not. If that 'writable' flag is clear, we then make the host HPTE read-only even if the guest HPTE allowed writing. This means that we can get a protection fault when the guest writes to a page that it has mapped read-write but which is read-only on the host side (perhaps due to KSM having merged the page). Thus we now call kvmppc_handle_pagefault() for protection faults as well as HPTE not found faults. In kvmppc_handle_pagefault(), if the access was allowed by the guest HPTE and we thus need to install a new host HPTE, we then need to remove the old host HPTE if there is one. This is done with a new function, kvmppc_mmu_unmap_page(), which uses kvmppc_mmu_pte_vflush() to find and remove the old host HPTE. Since the memslot-related functions require the KVM SRCU read lock to be held, this adds srcu_read_lock/unlock pairs around the calls to kvmppc_handle_pagefault(). Finally, this changes kvmppc_mmu_book3s_32_xlate_pte() to not ignore guest HPTEs that don't permit access, and to return -EPERM for accesses that are not permitted by the page protections. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_book3s.h | 7 +-- arch/powerpc/include/asm/kvm_host.h | 3 ++- arch/powerpc/kvm/book3s.c | 15 +-- arch/powerpc/kvm/book3s_32_mmu.c | 32 +--- arch/powerpc/kvm/book3s_32_mmu_host.c | 14 +++--- arch/powerpc/kvm/book3s_64_mmu.c | 9 + arch/powerpc/kvm/book3s_64_mmu_host.c | 20 +--- arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_pr.c | 29 - 9 files changed, 91 insertions(+), 40 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 603fba4..a07bd7e 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -128,7 +128,9 @@ extern void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr); extern void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu); extern void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu); extern void kvmppc_mmu_book3s_hv_init(struct kvm_vcpu *vcpu); -extern int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte); +extern int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte, + bool iswrite); +extern void kvmppc_mmu_unmap_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte); extern int kvmppc_mmu_map_segment(struct kvm_vcpu *vcpu, ulong eaddr); extern void kvmppc_mmu_flush_segment(struct kvm_vcpu *vcpu, ulong eaddr, ulong seg_size); extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu); @@ -157,7 +159,8 @@ extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat, bool upper, u32 val); extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr); extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu); -extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn); +extern pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool writing, + bool *writable); extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev, unsigned long *rmap, long pte_index, int realmode); extern void kvmppc_invalidate_hpte(struct kvm *kvm, unsigned long *hptep, diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index f482594..802984e 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -353,7 +353,8 @@ struct kvmppc_mmu { /* book3s */ void (*mtsrin)(struct kvm_vcpu *vcpu, u32 srnum, ulong value); u32 (*mfsrin)(struct kvm_vcpu *vcpu, u32 srnum); - int (*xlate)(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data); + int (*xlate)(struct kvm_vcpu *vcpu, gva_t eaddr, + struct kvmppc_pte *pte, bool data, bool iswrite); void (*reset_msr)(struct kvm_vcpu *vcpu); void (*tlbie)(struct kvm_vcpu *vcpu, ulong addr, bool large); int (*esid_to_vsid)(struct kvm_vcpu *vcpu, ulong esid, u64
[PATCH 17/18] KVM: PPC: Book3S PR: Mark pages accessed, and dirty if being written
The mark_page_dirty() function, despite what its name might suggest, doesn't actually mark the page as dirty as far as the MM subsystem is concerned. It merely sets a bit in KVM's map of dirty pages, if userspace has requested dirty tracking for the relevant memslot. To tell the MM subsystem that the page is dirty, we have to call kvm_set_pfn_dirty() (or an equivalent such as SetPageDirty()). This adds a call to kvm_set_pfn_dirty(), and while we are here, also adds a call to kvm_set_pfn_accessed() to tell the MM subsystem that the page has been accessed. Since we are now using the pfn in several places, this adds a 'pfn' variable to store it and changes the places that used hpaddr PAGE_SHIFT to use pfn instead, which is the same thing. This also changes a use of HPTE_R_PP to PP_RXRX. Both are 3, but PP_RXRX is more informative as being the read-only page permission bit setting. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kvm/book3s_64_mmu_host.c | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c index 307e6e8..e2ab8a7 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_host.c +++ b/arch/powerpc/kvm/book3s_64_mmu_host.c @@ -96,20 +96,21 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte, unsigned long mmu_seq; struct kvm *kvm = vcpu-kvm; struct hpte_cache *cpte; + unsigned long gfn = orig_pte-raddr PAGE_SHIFT; + unsigned long pfn; /* used to check for invalidations in progress */ mmu_seq = kvm-mmu_notifier_seq; smp_rmb(); /* Get host physical address for gpa */ - hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte-raddr PAGE_SHIFT, - iswrite, writable); - if (is_error_noslot_pfn(hpaddr)) { - printk(KERN_INFO Couldn't get guest page for gfn %lx!\n, orig_pte-eaddr); + pfn = kvmppc_gfn_to_pfn(vcpu, gfn, iswrite, writable); + if (is_error_noslot_pfn(pfn)) { + printk(KERN_INFO Couldn't get guest page for gfn %lx!\n, gfn); r = -EINVAL; goto out; } - hpaddr = PAGE_SHIFT; + hpaddr = pfn PAGE_SHIFT; /* and write the mapping ea - hpa into the pt */ vcpu-arch.mmu.esid_to_vsid(vcpu, orig_pte-eaddr SID_SHIFT, vsid); @@ -129,15 +130,18 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte, vpn = hpt_vpn(orig_pte-eaddr, map-host_vsid, MMU_SEGSIZE_256M); + kvm_set_pfn_accessed(pfn); if (!orig_pte-may_write || !writable) - rflags |= HPTE_R_PP; - else - mark_page_dirty(vcpu-kvm, orig_pte-raddr PAGE_SHIFT); + rflags |= PP_RXRX; + else { + mark_page_dirty(vcpu-kvm, gfn); + kvm_set_pfn_dirty(pfn); + } if (!orig_pte-may_execute) rflags |= HPTE_R_N; else - kvmppc_mmu_flush_icache(hpaddr PAGE_SHIFT); + kvmppc_mmu_flush_icache(pfn); /* * Use 64K pages if possible; otherwise, on 64K page kernels, @@ -191,7 +195,7 @@ map_again: cpte-slot = hpteg + (ret 7); cpte-host_vpn = vpn; cpte-pte = *orig_pte; - cpte-pfn = hpaddr PAGE_SHIFT; + cpte-pfn = pfn; cpte-pagesize = hpsize; kvmppc_mmu_hpte_cache_map(vcpu, cpte); @@ -200,7 +204,7 @@ map_again: out_unlock: spin_unlock(kvm-mmu_lock); - kvm_release_pfn_clean(hpaddr PAGE_SHIFT); + kvm_release_pfn_clean(pfn); if (cpte) kvmppc_mmu_hpte_cache_free(cpte); -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 16/18] KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page()
When the MM code is invalidating a range of pages, it calls the KVM kvm_mmu_notifier_invalidate_range_start() notifier function, which calls kvm_unmap_hva_range(), which arranges to flush all the existing host HPTEs for guest pages. However, the Linux PTEs for the range being flushed are still valid at that point. We are not supposed to establish any new references to pages in the range until the ...range_end() notifier gets called. The PPC-specific KVM code doesn't get any explicit notification of that; instead, we are supposed to use mmu_notifier_retry() to test whether we are or have been inside a range flush notifier pair while we have been getting a page and instantiating a host HPTE for the page. This therefore adds a call to mmu_notifier_retry inside kvmppc_mmu_map_page(). This call is inside a region locked with kvm-mmu_lock, which is the same lock that is called by the KVM MMU notifier functions, thus ensuring that no new notification can proceed while we are in the locked region. Inside this region we also create the host HPTE and link the corresponding hpte_cache structure into the lists used to find it later. We cannot allocate the hpte_cache structure inside this locked region because that can lead to deadlock, so we allocate it outside the region and free it if we end up not using it. This also moves the updates of vcpu3s-hpte_cache_count inside the regions locked with vcpu3s-mmu_lock, and does the increment in kvmppc_mmu_hpte_cache_map() when the pte is added to the cache rather than when it is allocated, in order that the hpte_cache_count is accurate. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_book3s.h | 1 + arch/powerpc/kvm/book3s_64_mmu_host.c | 37 ++- arch/powerpc/kvm/book3s_mmu_hpte.c| 14 + 3 files changed, 39 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index a07bd7e..0ec00f4 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -142,6 +142,7 @@ extern long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr, extern void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct hpte_cache *pte); extern struct hpte_cache *kvmppc_mmu_hpte_cache_next(struct kvm_vcpu *vcpu); +extern void kvmppc_mmu_hpte_cache_free(struct hpte_cache *pte); extern void kvmppc_mmu_hpte_destroy(struct kvm_vcpu *vcpu); extern int kvmppc_mmu_hpte_init(struct kvm_vcpu *vcpu); extern void kvmppc_mmu_invalidate_pte(struct kvm_vcpu *vcpu, struct hpte_cache *pte); diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c index cc9fb89..307e6e8 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_host.c +++ b/arch/powerpc/kvm/book3s_64_mmu_host.c @@ -93,6 +93,13 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte, int r = 0; int hpsize = MMU_PAGE_4K; bool writable; + unsigned long mmu_seq; + struct kvm *kvm = vcpu-kvm; + struct hpte_cache *cpte; + + /* used to check for invalidations in progress */ + mmu_seq = kvm-mmu_notifier_seq; + smp_rmb(); /* Get host physical address for gpa */ hpaddr = kvmppc_gfn_to_pfn(vcpu, orig_pte-raddr PAGE_SHIFT, @@ -143,6 +150,14 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte, hash = hpt_hash(vpn, mmu_psize_defs[hpsize].shift, MMU_SEGSIZE_256M); + cpte = kvmppc_mmu_hpte_cache_next(vcpu); + + spin_lock(kvm-mmu_lock); + if (!cpte || mmu_notifier_retry(kvm, mmu_seq)) { + r = -EAGAIN; + goto out_unlock; + } + map_again: hpteg = ((hash htab_hash_mask) * HPTES_PER_GROUP); @@ -150,7 +165,7 @@ map_again: if (attempt 1) if (ppc_md.hpte_remove(hpteg) 0) { r = -1; - goto out; + goto out_unlock; } ret = ppc_md.hpte_insert(hpteg, vpn, hpaddr, rflags, vflags, @@ -163,8 +178,6 @@ map_again: attempt++; goto map_again; } else { - struct hpte_cache *pte = kvmppc_mmu_hpte_cache_next(vcpu); - trace_kvm_book3s_64_mmu_map(rflags, hpteg, vpn, hpaddr, orig_pte); @@ -175,15 +188,21 @@ map_again: hpteg = ((hash htab_hash_mask) * HPTES_PER_GROUP); } - pte-slot = hpteg + (ret 7); - pte-host_vpn = vpn; - pte-pte = *orig_pte; - pte-pfn = hpaddr PAGE_SHIFT; - pte-pagesize = hpsize; + cpte-slot = hpteg + (ret 7); + cpte-host_vpn = vpn; + cpte-pte = *orig_pte; + cpte-pfn = hpaddr PAGE_SHIFT; + cpte-pagesize = hpsize; -
[PATCH 01/18] KVM: PPC: BookE: Add GET/SET_ONE_REG interface for VRSAVE
This makes the VRSAVE register value for a vcpu accessible through the GET/SET_ONE_REG interface on Book E systems (in addition to the existing GET/SET_SREGS interface), for consistency with Book 3S. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kvm/booke.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 17722d8..65fa775 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -1479,6 +1479,9 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) case KVM_REG_PPC_DEBUG_INST: val = get_reg_val(reg-id, KVMPPC_INST_EHPRIV); break; + case KVM_REG_PPC_VRSAVE: + val = get_reg_val(reg-id, vcpu-arch.vrsave); + break; default: r = kvmppc_get_one_reg(vcpu, reg-id, val); break; @@ -1552,6 +1555,9 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) kvmppc_set_tcr(vcpu, tcr); break; } + case KVM_REG_PPC_VRSAVE: + vcpu-arch.vrsave = set_reg_val(reg-id, val); + break; default: r = kvmppc_set_one_reg(vcpu, reg-id, val); break; -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 18/18] KVM: PPC: Book3S PR: Reduce number of shadow PTEs invalidated by MMU notifiers
Currently, whenever any of the MMU notifier callbacks get called, we invalidate all the shadow PTEs. This is inefficient because it means that we typically then get a lot of DSIs and ISIs in the guest to fault the shadow PTEs back in. We do this even if the address range being notified doesn't correspond to guest memory. This commit adds code to scan the memslot array to find out what range(s) of guest physical addresses corresponds to the host virtual address range being affected. For each such range we flush only the shadow PTEs for the range, on all cpus. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kvm/book3s_pr.c | 40 1 file changed, 32 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 928f5fd..8941885 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -150,24 +150,48 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu) } /* MMU Notifiers */ +static void do_kvm_unmap_hva(struct kvm *kvm, unsigned long start, +unsigned long end) +{ + long i; + struct kvm_vcpu *vcpu; + struct kvm_memslots *slots; + struct kvm_memory_slot *memslot; + + slots = kvm_memslots(kvm); + kvm_for_each_memslot(memslot, slots) { + unsigned long hva_start, hva_end; + gfn_t gfn, gfn_end; + + hva_start = max(start, memslot-userspace_addr); + hva_end = min(end, memslot-userspace_addr + + (memslot-npages PAGE_SHIFT)); + if (hva_start = hva_end) + continue; + /* +* {gfn(page) | page intersects with [hva_start, hva_end)} = +* {gfn, gfn+1, ..., gfn_end-1}. +*/ + gfn = hva_to_gfn_memslot(hva_start, memslot); + gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot); + kvm_for_each_vcpu(i, vcpu, kvm) + kvmppc_mmu_pte_pflush(vcpu, gfn PAGE_SHIFT, + gfn_end PAGE_SHIFT); + } +} int kvm_unmap_hva(struct kvm *kvm, unsigned long hva) { trace_kvm_unmap_hva(hva); - /* -* Flush all shadow tlb entries everywhere. This is slow, but -* we are 100% sure that we catch the to be unmapped page -*/ - kvm_flush_remote_tlbs(kvm); + do_kvm_unmap_hva(kvm, hva, hva + PAGE_SIZE); return 0; } int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end) { - /* kvm_unmap_hva flushes everything anyways */ - kvm_unmap_hva(kvm, start); + do_kvm_unmap_hva(kvm, start, end); return 0; } @@ -187,7 +211,7 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva) void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) { /* The page will get remapped properly on its next fault */ - kvm_unmap_hva(kvm, hva); + do_kvm_unmap_hva(kvm, hva, hva + PAGE_SIZE); } /*/ -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/18] KVM: PPC: Book3S PR: Fix compilation without CONFIG_ALTIVEC
Commit 9d1ffdd8f3 (KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX) added a call to kvmppc_load_up_altivec() that isn't guarded by CONFIG_ALTIVEC, causing a link failure when building a kernel without CONFIG_ALTIVEC set. This adds an #ifdef to fix this. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kvm/book3s_pr.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 27db1e6..8d45f18 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -619,8 +619,10 @@ static void kvmppc_handle_lost_ext(struct kvm_vcpu *vcpu) if (lost_ext MSR_FP) kvmppc_load_up_fpu(); +#ifdef CONFIG_ALTIVEC if (lost_ext MSR_VEC) kvmppc_load_up_altivec(); +#endif current-thread.regs-msr |= lost_ext; } -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/18] KVM: PPC: Book3S HV: Support POWER6 compatibility mode on POWER7
This enables us to use the Processor Compatibility Register (PCR) on POWER7 to put the processor into architecture 2.05 compatibility mode when running a guest. In this mode the new instructions and registers that were introduced on POWER7 are disabled in user mode. This includes all the VSX facilities plus several other instructions such as ldbrx, stdbrx, popcntw, popcntd, etc. To select this mode, we have a new register accessible through the set/get_one_reg interface, called KVM_REG_PPC_ARCH_COMPAT. Setting this to zero gives the full set of capabilities of the processor. Setting it to one of the logical PVR values defined in PAPR puts the vcpu into the compatibility mode for the corresponding architecture level. The supported values are: 0x0f02 Architecture 2.05 (POWER6) 0x0f03 Architecture 2.06 (POWER7) 0x0f13 Architecture 2.06+ (POWER7+) Since the PCR is per-core, the architecture compatibility level and the corresponding PCR value are stored in the struct kvmppc_vcore, and are therefore shared between all vcpus in a virtual core. Signed-off-by: Paul Mackerras pau...@samba.org --- Documentation/virtual/kvm/api.txt | 1 + arch/powerpc/include/asm/kvm_host.h | 2 ++ arch/powerpc/include/asm/reg.h | 11 +++ arch/powerpc/include/uapi/asm/kvm.h | 3 +++ arch/powerpc/kernel/asm-offsets.c | 1 + arch/powerpc/kvm/book3s_hv.c| 35 + arch/powerpc/kvm/book3s_hv_rmhandlers.S | 16 +-- 7 files changed, 67 insertions(+), 2 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 34a32b6..f1f300f 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1837,6 +1837,7 @@ registers, find a list below: PPC | KVM_REG_PPC_VRSAVE | 32 PPC | KVM_REG_PPC_LPCR | 64 PPC | KVM_REG_PPC_PPR | 64 + PPC | KVM_REG_PPC_ARCH_COMPAT | 32 PPC | KVM_REG_PPC_TM_GPR0 | 64 ... PPC | KVM_REG_PPC_TM_GPR31 | 64 diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 8bd730c..82daa12 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -296,6 +296,8 @@ struct kvmppc_vcore { struct kvm_vcpu *runner; u64 tb_offset; /* guest timebase - host timebase */ ulong lpcr; + u32 arch_compat; + ulong pcr; }; #define VCORE_ENTRY_COUNT(vc) ((vc)-entry_exit_count 0xff) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index ed98ebf..1afa20c 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -305,6 +305,10 @@ #define LPID_RSVD0x3ff /* Reserved LPID for partn switching */ #defineSPRN_HMER 0x150 /* Hardware m? error recovery */ #defineSPRN_HMEER 0x151 /* Hardware m? enable error recovery */ +#define SPRN_PCR 0x152 /* Processor compatibility register */ +#define PCR_VEC_DIS (1ul (63-0)) /* Vec. disable (pre POWER8) */ +#define PCR_VSX_DIS (1ul (63-1)) /* VSX disable (pre POWER8) */ +#define PCR_ARCH_205 0x2 /* Architecture 2.05 */ #defineSPRN_HEIR 0x153 /* Hypervisor Emulated Instruction Register */ #define SPRN_TLBINDEXR 0x154 /* P7 TLB control register */ #define SPRN_TLBVPNR 0x155 /* P7 TLB control register */ @@ -1096,6 +1100,13 @@ #define PVR_BE 0x0070 #define PVR_PA6T 0x0090 +/* Logical PVR values defined in PAPR, representing architecture levels */ +#define PVR_ARCH_204 0x0f01 +#define PVR_ARCH_205 0x0f02 +#define PVR_ARCH_206 0x0f03 +#define PVR_ARCH_206p 0x0f13 +#define PVR_ARCH_207 0x0f04 + /* Macros for setting and retrieving special purpose registers */ #ifndef __ASSEMBLY__ #define mfmsr()({unsigned long rval; \ diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h index fab6bc1..62c4323 100644 --- a/arch/powerpc/include/uapi/asm/kvm.h +++ b/arch/powerpc/include/uapi/asm/kvm.h @@ -536,6 +536,9 @@ struct kvm_get_htab_header { #define KVM_REG_PPC_LPCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb5) #define KVM_REG_PPC_PPR(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb6) +/* Architecture compatibility level */ +#define KVM_REG_PPC_ARCH_COMPAT(KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb6) + /* Transactional Memory checkpointed state: * This is all GPRs, all VSX regs and a subset of SPRs */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 830193b..7f717f2 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -523,6 +523,7 @@ int main(void) DEFINE(VCORE_NAPPING_THREADS, offsetof(struct kvmppc_vcore, napping_threads)); DEFINE(VCORE_TB_OFFSET, offsetof(struct kvmppc_vcore, tb_offset));
[PATCH 09/18] KVM: PPC: Book3S PR: Use 64k host pages where possible
Currently, PR KVM uses 4k pages for the host-side mappings of guest memory, regardless of the host page size. When the host page size is 64kB, we might as well use 64k host page mappings for guest mappings of 64kB and larger pages and for guest real-mode mappings. However, the magic page has to remain a 4k page. To implement this, we first add another flag bit to the guest VSID values we use, to indicate that this segment is one where host pages should be mapped using 64k pages. For segments with this bit set we set the bits in the shadow SLB entry to indicate a 64k base page size. When faulting in host HPTEs for this segment, we make them 64k HPTEs instead of 4k. We record the pagesize in struct hpte_cache for use when invalidating the HPTE. For now we restrict the segment containing the magic page (if any) to 4k pages. It should be possible to lift this restriction in future by ensuring that the magic 4k page is appropriately positioned within a host 64k page. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/include/asm/kvm_book3s.h | 6 -- arch/powerpc/kvm/book3s_32_mmu.c | 1 + arch/powerpc/kvm/book3s_64_mmu.c | 35 ++- arch/powerpc/kvm/book3s_64_mmu_host.c | 27 +-- arch/powerpc/kvm/book3s_pr.c | 1 + 5 files changed, 57 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 1d4a120..6bf20b4 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -66,6 +66,7 @@ struct hpte_cache { u64 pfn; ulong slot; struct kvmppc_pte pte; + int pagesize; }; struct kvmppc_vcpu_book3s { @@ -113,8 +114,9 @@ struct kvmppc_vcpu_book3s { #define CONTEXT_GUEST 1 #define CONTEXT_GUEST_END 2 -#define VSID_REAL 0x0fc0ULL -#define VSID_BAT 0x0fb0ULL +#define VSID_REAL 0x07c0ULL +#define VSID_BAT 0x07b0ULL +#define VSID_64K 0x0800ULL #define VSID_1T0x1000ULL #define VSID_REAL_DR 0x2000ULL #define VSID_REAL_IR 0x4000ULL diff --git a/arch/powerpc/kvm/book3s_32_mmu.c b/arch/powerpc/kvm/book3s_32_mmu.c index c8cefdd..af04553 100644 --- a/arch/powerpc/kvm/book3s_32_mmu.c +++ b/arch/powerpc/kvm/book3s_32_mmu.c @@ -308,6 +308,7 @@ static int kvmppc_mmu_book3s_32_xlate(struct kvm_vcpu *vcpu, gva_t eaddr, ulong mp_ea = vcpu-arch.magic_page_ea; pte-eaddr = eaddr; + pte-page_size = MMU_PAGE_4K; /* Magic page override */ if (unlikely(mp_ea) diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c index ee1cfe2..50506ed 100644 --- a/arch/powerpc/kvm/book3s_64_mmu.c +++ b/arch/powerpc/kvm/book3s_64_mmu.c @@ -542,6 +542,16 @@ static void kvmppc_mmu_book3s_64_tlbie(struct kvm_vcpu *vcpu, ulong va, kvmppc_mmu_pte_vflush(vcpu, va 12, mask); } +#ifdef CONFIG_PPC_64K_PAGES +static int segment_contains_magic_page(struct kvm_vcpu *vcpu, ulong esid) +{ + ulong mp_ea = vcpu-arch.magic_page_ea; + + return mp_ea !(vcpu-arch.shared-msr MSR_PR) + (mp_ea SID_SHIFT) == esid; +} +#endif + static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid, u64 *vsid) { @@ -549,11 +559,13 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid, struct kvmppc_slb *slb; u64 gvsid = esid; ulong mp_ea = vcpu-arch.magic_page_ea; + int pagesize = MMU_PAGE_64K; if (vcpu-arch.shared-msr (MSR_DR|MSR_IR)) { slb = kvmppc_mmu_book3s_64_find_slbe(vcpu, ea); if (slb) { gvsid = slb-vsid; + pagesize = slb-base_page_size; if (slb-tb) { gvsid = SID_SHIFT_1T - SID_SHIFT; gvsid |= esid ((1ul (SID_SHIFT_1T - SID_SHIFT)) - 1); @@ -564,28 +576,41 @@ static int kvmppc_mmu_book3s_64_esid_to_vsid(struct kvm_vcpu *vcpu, ulong esid, switch (vcpu-arch.shared-msr (MSR_DR|MSR_IR)) { case 0: - *vsid = VSID_REAL | esid; + gvsid = VSID_REAL | esid; break; case MSR_IR: - *vsid = VSID_REAL_IR | gvsid; + gvsid |= VSID_REAL_IR; break; case MSR_DR: - *vsid = VSID_REAL_DR | gvsid; + gvsid |= VSID_REAL_DR; break; case MSR_DR|MSR_IR: if (!slb) goto no_slb; - *vsid = gvsid; break; default: BUG(); break; } +#ifdef CONFIG_PPC_64K_PAGES + /* +* Mark this as a 64k segment if the host
[PATCH 02/18] KVM: PPC: Book3S HV: Store LPCR value for each virtual core
This adds the ability to have a separate LPCR (Logical Partitioning Control Register) value relating to a guest for each virtual core, rather than only having a single value for the whole VM. This corresponds to what real POWER hardware does, where there is a LPCR per CPU thread but most of the fields are required to have the same value on all active threads in a core. The per-virtual-core LPCR can be read and written using the GET/SET_ONE_REG interface. Userspace can can only modify the following fields of the LPCR value: DPFDDefault prefetch depth ILE Interrupt little-endian TC Translation control (secondary HPT hash group search disable) We still maintain a per-VM default LPCR value in kvm-arch.lpcr, which contains bits relating to memory management, i.e. the Virtualized Partition Memory (VPM) bits and the bits relating to guest real mode. When this default value is updated, the update needs to be propagated to the per-vcore values, so we add a kvmppc_update_lpcr() helper to do that. Signed-off-by: Paul Mackerras pau...@samba.org --- Documentation/virtual/kvm/api.txt | 1 + arch/powerpc/include/asm/kvm_book3s.h | 2 + arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/include/asm/reg.h | 3 ++ arch/powerpc/include/uapi/asm/kvm.h | 1 + arch/powerpc/kernel/asm-offsets.c | 1 + arch/powerpc/kvm/book3s_64_mmu_hv.c | 5 +-- arch/powerpc/kvm/book3s_hv.c| 73 +++-- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 ++- 9 files changed, 75 insertions(+), 17 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index c36ff9af..1030ac9 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1835,6 +1835,7 @@ registers, find a list below: PPC | KVM_REG_PPC_PID | 64 PPC | KVM_REG_PPC_ACOP | 64 PPC | KVM_REG_PPC_VRSAVE | 32 + PPC | KVM_REG_PPC_LPCR | 64 PPC | KVM_REG_PPC_TM_GPR0 | 64 ... PPC | KVM_REG_PPC_TM_GPR31 | 64 diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index fa19e2f..14a4741 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -172,6 +172,8 @@ extern long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags, unsigned long *hpret); extern long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot, unsigned long *map); +extern void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr, + unsigned long mask); extern void kvmppc_entry_trampoline(void); extern void kvmppc_hv_entry_trampoline(void); diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 9741bf0..788930a 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -295,6 +295,7 @@ struct kvmppc_vcore { u64 preempt_tb; struct kvm_vcpu *runner; u64 tb_offset; /* guest timebase - host timebase */ + ulong lpcr; }; #define VCORE_ENTRY_COUNT(vc) ((vc)-entry_exit_count 0xff) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 342e4ea..ed98ebf 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -275,6 +275,7 @@ #define LPCR_ISL (1ul (63-2)) #define LPCR_VC_SH (63-2) #define LPCR_DPFD_SH (63-11) +#define LPCR_DPFD(7ul LPCR_DPFD_SH) #define LPCR_VRMASD (0x1ful (63-16)) #define LPCR_VRMA_L (1ul (63-12)) #define LPCR_VRMA_LP0(1ul (63-15)) @@ -291,6 +292,7 @@ #define LPCR_PECE2 0x1000 /* machine check etc can cause exit */ #define LPCR_MER 0x0800 /* Mediated External Exception */ #define LPCR_MER_SH 11 +#define LPCR_TC 0x0200 /* Translation control */ #define LPCR_LPES0x000c #define LPCR_LPES0 0x0008 /* LPAR Env selector 0 */ #define LPCR_LPES1 0x0004 /* LPAR Env selector 1 */ @@ -412,6 +414,7 @@ #define HID4_RMLS2_SH (63 - 2) /* Real mode limit bottom 2 bits */ #define HID4_LPID5_SH (63 - 6) /* partition ID bottom 4 bits */ #define HID4_RMOR_SH(63 - 22) /* real mode offset (16 bits) */ +#define HID4_RMOR (0xul HID4_RMOR_SH) #define HID4_LPES1 (1 (63-57)) /* LPAR env. sel. bit 1 */ #define HID4_RMLS0_SH (63 - 58) /* Real mode limit top bit */ #define HID4_LPID1_SH 0 /* partition ID top 2 bits */ diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h index b98bf3f..e42127d 100644 --- a/arch/powerpc/include/uapi/asm/kvm.h +++ b/arch/powerpc/include/uapi/asm/kvm.h @@ -533,6 +533,7 @@ struct kvm_get_htab_header { #define KVM_REG_PPC_ACOP (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xb3)
[PATCH 11/18] KVM: PPC: Book3S PR: Correct errors in H_ENTER implementation
The implementation of H_ENTER in PR KVM has some errors: * With H_EXACT not set, if the HPTEG is full, we return H_PTEG_FULL as the return value of kvmppc_h_pr_enter, but the caller is expecting one of the EMULATE_* values. The H_PTEG_FULL needs to go in the guest's R3 instead. * With H_EXACT set, if the selected HPTE is already valid, the H_ENTER call should return a H_PTEG_FULL error. This fixes these errors and also makes it write only the selected HPTE, not the whole group, since only the selected HPTE has been modified. This also micro-optimizes the calculations involving pte_index and i. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kvm/book3s_pr_papr.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c index da0e0bc..38f1899 100644 --- a/arch/powerpc/kvm/book3s_pr_papr.c +++ b/arch/powerpc/kvm/book3s_pr_papr.c @@ -21,6 +21,8 @@ #include asm/kvm_ppc.h #include asm/kvm_book3s.h +#define HPTE_SIZE 16 /* bytes per HPT entry */ + static unsigned long get_pteg_addr(struct kvm_vcpu *vcpu, long pte_index) { struct kvmppc_vcpu_book3s *vcpu_book3s = to_book3s(vcpu); @@ -40,32 +42,39 @@ static int kvmppc_h_pr_enter(struct kvm_vcpu *vcpu) long pte_index = kvmppc_get_gpr(vcpu, 5); unsigned long pteg[2 * 8]; unsigned long pteg_addr, i, *hpte; + long int ret; + i = pte_index 7; pte_index = ~7UL; pteg_addr = get_pteg_addr(vcpu, pte_index); copy_from_user(pteg, (void __user *)pteg_addr, sizeof(pteg)); hpte = pteg; + ret = H_PTEG_FULL; if (likely((flags H_EXACT) == 0)) { - pte_index = ~7UL; for (i = 0; ; ++i) { if (i == 8) - return H_PTEG_FULL; + goto done; if ((*hpte HPTE_V_VALID) == 0) break; hpte += 2; } } else { - i = kvmppc_get_gpr(vcpu, 5) 7UL; hpte += i * 2; + if (*hpte HPTE_V_VALID) + goto done; } hpte[0] = kvmppc_get_gpr(vcpu, 6); hpte[1] = kvmppc_get_gpr(vcpu, 7); - copy_to_user((void __user *)pteg_addr, pteg, sizeof(pteg)); - kvmppc_set_gpr(vcpu, 3, H_SUCCESS); + pteg_addr += i * HPTE_SIZE; + copy_to_user((void __user *)pteg_addr, hpte, HPTE_SIZE); kvmppc_set_gpr(vcpu, 4, pte_index | i); + ret = H_SUCCESS; + + done: + kvmppc_set_gpr(vcpu, 3, ret); return EMULATE_DONE; } -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/18] KVM: PPC: Book3S PR: Handle PP0 page-protection bit in guest HPTEs
64-bit POWER processors have a three-bit field for page protection in the hashed page table entry (HPTE). Currently we only interpret the two bits that were present in older versions of the architecture. The only defined combination that has the new bit set is 110, meaning read-only for supervisor and no access for user mode. This adds code to kvmppc_mmu_book3s_64_xlate() to interpret the extra bit appropriately. Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kvm/book3s_64_mmu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c index 50506ed..6aded53 100644 --- a/arch/powerpc/kvm/book3s_64_mmu.c +++ b/arch/powerpc/kvm/book3s_64_mmu.c @@ -298,6 +298,8 @@ do_second: v = pteg[i]; r = pteg[i+1]; pp = (r HPTE_R_PP) | key; + if (r HPTE_R_PP0) + pp |= 8; gpte-eaddr = eaddr; gpte-vpage = kvmppc_mmu_book3s_64_ea_to_vp(vcpu, eaddr, data); @@ -319,6 +321,7 @@ do_second: case 3: case 5: case 7: + case 10: gpte-may_read = true; break; } -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/18] KVM: PPC: Book3S HV: Add support for guest Program Priority Register
POWER7 and later IBM server processors have a register called the Program Priority Register (PPR), which controls the priority of each hardware CPU SMT thread, and affects how fast it runs compared to other SMT threads. This priority can be controlled by writing to the PPR or by use of a set of instructions of the form or rN,rN,rN which are otherwise no-ops but have been defined to set the priority to particular levels. This adds code to context switch the PPR when entering and exiting guests and to make the PPR value accessible through the SET/GET_ONE_REG interface. When entering the guest, we set the PPR as late as possible, because if we are setting a low thread priority it will make the code run slowly from that point on. Similarly, the first-level interrupt handlers save the PPR value in the PACA very early on, and set the thread priority to the medium level, so that the interrupt handling code runs at a reasonable speed. Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org Signed-off-by: Paul Mackerras pau...@samba.org --- Documentation/virtual/kvm/api.txt | 1 + arch/powerpc/include/asm/exception-64s.h | 8 arch/powerpc/include/asm/kvm_book3s_asm.h | 1 + arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/include/uapi/asm/kvm.h | 1 + arch/powerpc/kernel/asm-offsets.c | 2 ++ arch/powerpc/kvm/book3s_hv.c | 6 ++ arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 +++- 8 files changed, 31 insertions(+), 1 deletion(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 1030ac9..34a32b6 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1836,6 +1836,7 @@ registers, find a list below: PPC | KVM_REG_PPC_ACOP | 64 PPC | KVM_REG_PPC_VRSAVE | 32 PPC | KVM_REG_PPC_LPCR | 64 + PPC | KVM_REG_PPC_PPR | 64 PPC | KVM_REG_PPC_TM_GPR0 | 64 ... PPC | KVM_REG_PPC_TM_GPR31 | 64 diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 07ca627..b86c4db 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -203,6 +203,10 @@ do_kvm_##n: \ ld r10,area+EX_CFAR(r13); \ std r10,HSTATE_CFAR(r13); \ END_FTR_SECTION_NESTED(CPU_FTR_CFAR,CPU_FTR_CFAR,947); \ + BEGIN_FTR_SECTION_NESTED(948) \ + ld r10,area+EX_PPR(r13); \ + std r10,HSTATE_PPR(r13);\ + END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);\ ld r10,area+EX_R10(r13); \ stw r9,HSTATE_SCRATCH1(r13);\ ld r9,area+EX_R9(r13); \ @@ -216,6 +220,10 @@ do_kvm_##n: \ ld r10,area+EX_R10(r13); \ beq 89f;\ stw r9,HSTATE_SCRATCH1(r13);\ + BEGIN_FTR_SECTION_NESTED(948) \ + ld r9,area+EX_PPR(r13);\ + std r9,HSTATE_PPR(r13); \ + END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);\ ld r9,area+EX_R9(r13); \ std r12,HSTATE_SCRATCH0(r13); \ li r12,n; \ diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h index 9039d3c..22f4606 100644 --- a/arch/powerpc/include/asm/kvm_book3s_asm.h +++ b/arch/powerpc/include/asm/kvm_book3s_asm.h @@ -101,6 +101,7 @@ struct kvmppc_host_state { #endif #ifdef CONFIG_PPC_BOOK3S_64 u64 cfar; + u64 ppr; #endif }; diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 788930a..8bd730c 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -465,6 +465,7 @@ struct kvm_vcpu_arch { u32 ctrl; ulong dabr; ulong cfar; + ulong ppr; #endif u32 vrsave; /* also USPRG0 */ u32 mmucr; diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h index e42127d..fab6bc1 100644 --- a/arch/powerpc/include/uapi/asm/kvm.h +++ b/arch/powerpc/include/uapi/asm/kvm.h @@ -534,6 +534,7 @@ struct kvm_get_htab_header { #define KVM_REG_PPC_VRSAVE (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xb4) #define KVM_REG_PPC_LPCR
[PATCH 05/18] KVM: PPC: Book3S HV: Don't crash host on unknown guest interrupt
If we come out of a guest with an interrupt that we don't know about, instead of crashing the host with a BUG(), we now return to userspace with the exit reason set to KVM_EXIT_UNKNOWN and the trap vector in the hw.hardware_exit_reason field of the kvm_run structure, as is done on x86. Note that run-exit_reason is already set to KVM_EXIT_UNKNOWN at the beginning of kvmppc_handle_exit(). Signed-off-by: Paul Mackerras pau...@samba.org --- arch/powerpc/kvm/book3s_hv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 373e202..cce2c20 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -709,8 +709,8 @@ static int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, printk(KERN_EMERG trap=0x%x | pc=0x%lx | msr=0x%llx\n, vcpu-arch.trap, kvmppc_get_pc(vcpu), vcpu-arch.shregs.msr); + run-hw.hardware_exit_reason = vcpu-arch.trap; r = RESUME_HOST; - BUG(); break; } -- 1.8.4.rc3 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html