Re: [PATCH v5 10/10] arm64: mm: set the contiguous bit for kernel mappings where appropriate
On 9 March 2017 at 20:33, Mark Rutland wrote: > On Thu, Mar 09, 2017 at 09:25:12AM +0100, Ard Biesheuvel wrote: >> +static inline u64 pte_cont_addr_end(u64 addr, u64 end) >> +{ >> + return min((addr + CONT_PTE_SIZE) & CONT_PTE_MASK, end); >> +} >> + >> +static inline u64 pmd_cont_addr_end(u64 addr, u64 end) >> +{ >> + return min((addr + CONT_PMD_SIZE) & CONT_PMD_MASK, end); >> +} > > These differ structurally from the usual p??_addr_end() macros defined > in include/asm-generic/pgtable.h. I agree the asm-generic macros aren't > pretty, but it would be nice to be consistent. > > I don't think the above handle a partial contiguous span at the end of > the address space (e.g. where end is just PAGE_SIZE away from 2^64), > whereas the asm-generic form does, AFAICT. > > Can we please use: > > #define pte_cont_addr_end(addr, end) \ > ({ unsigned long __boundary = ((addr) + CONT_PTE_SIZE) & CONT_PTE_MASK; \ > (__boundary - 1 < (end) - 1)? __boundary: (end); \ > }) > > #define pmd_cont_addr_end(addr, end) \ > ({ unsigned long __boundary = ((addr) + CONT_PMD_SIZE) & CONT_PMD_MASK; \ > (__boundary - 1 < (end) - 1)? __boundary: (end); \ > }) > > ... instead? > OK, so that's what the -1 is for. Either version is fine by me. > [...] > >> +static void init_pte(pte_t *pte, unsigned long addr, unsigned long end, >> + phys_addr_t phys, pgprot_t prot) >> { >> + do { >> + pte_t old_pte = *pte; >> + >> + set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); >> + >> + /* >> + * After the PTE entry has been populated once, we >> + * only allow updates to the permission attributes. 
>> + */ >> + BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), >> pte_val(*pte))); >> + >> + } while (pte++, addr += PAGE_SIZE, phys += PAGE_SIZE, addr != end); >> +} >> + >> +static void alloc_init_cont_pte(pmd_t *pmd, unsigned long addr, >> + unsigned long end, phys_addr_t phys, >> + pgprot_t prot, >> + phys_addr_t (*pgtable_alloc)(void), >> + int flags) >> +{ >> + unsigned long next; >> pte_t *pte; >> >> BUG_ON(pmd_sect(*pmd)); >> @@ -136,45 +156,30 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long >> addr, >> >> pte = pte_set_fixmap_offset(pmd, addr); >> do { >> - pte_t old_pte = *pte; >> + pgprot_t __prot = prot; >> >> - set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); >> - phys += PAGE_SIZE; >> + next = pte_cont_addr_end(addr, end); >> >> - /* >> - * After the PTE entry has been populated once, we >> - * only allow updates to the permission attributes. >> - */ >> - BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), >> pte_val(*pte))); >> + /* use a contiguous mapping if the range is suitably aligned */ >> + if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) && >> + (flags & NO_CONT_MAPPINGS) == 0) >> + __prot = __pgprot(pgprot_val(prot) | PTE_CONT); >> >> - } while (pte++, addr += PAGE_SIZE, addr != end); >> + init_pte(pte, addr, next, phys, __prot); >> + >> + phys += next - addr; >> + pte += (next - addr) / PAGE_SIZE; >> + } while (addr = next, addr != end); >> >> pte_clear_fixmap(); >> } > > I think it would be preferable to pass the pmd down into > alloc_init_pte(), so that we don't have to mess with the pte in both > alloc_init_cont_pte() and alloc_init_pte(). > > Likewise for alloc_init_cont_pmd() and alloc_init_pmd(), regarding the > pmd. > > I realise we'll redundantly map/unmap the PTE for each contiguous span, > but I doubt there's a case where it has a noticeable impact. > OK > With lots of memory we'll use blocks at a higher level, and for > debug_pagealloc we'll pass the whole pte down to init_pte() as we > currently do. > > [...] 
> >> + if (pud_none(*pud)) { >> + phys_addr_t pmd_phys; >> + BUG_ON(!pgtable_alloc); >> + pmd_phys = pgtable_alloc(); >> + pmd = pmd_set_fixmap(pmd_phys); >> + __pud_populate(pud, pmd_phys, PUD_TYPE_TABLE); >> + pmd_clear_fixmap(); >> + } > > It looks like when the splitting logic was removed, we forgot to remove > the fixmapping here (and for the pmd_none() case). The __p?d_populate > functions don't touch the next level table, so there's no reason to > fixmap it. > > Would you mind spinning a patch to rip those out? > Ah right, pmd is not even referenced in the __pud_populate invocation. Yes, I will add a patch before this one to remove
Re: [PATCH v5 10/10] arm64: mm: set the contiguous bit for kernel mappings where appropriate
On Thu, Mar 09, 2017 at 09:25:12AM +0100, Ard Biesheuvel wrote: > +static inline u64 pte_cont_addr_end(u64 addr, u64 end) > +{ > + return min((addr + CONT_PTE_SIZE) & CONT_PTE_MASK, end); > +} > + > +static inline u64 pmd_cont_addr_end(u64 addr, u64 end) > +{ > + return min((addr + CONT_PMD_SIZE) & CONT_PMD_MASK, end); > +} These differ structurally from the usual p??_addr_end() macros defined in include/asm-generic/pgtable.h. I agree the asm-generic macros aren't pretty, but it would be nice to be consistent. I don't think the above handle a partial contiguous span at the end of the address space (e.g. where end is just PAGE_SIZE away from 2^64), whereas the asm-generic form does, AFAICT. Can we please use: #define pte_cont_addr_end(addr, end) \ ({ unsigned long __boundary = ((addr) + CONT_PTE_SIZE) & CONT_PTE_MASK; \ (__boundary - 1 < (end) - 1)? __boundary: (end); \ }) #define pmd_cont_addr_end(addr, end) \ ({ unsigned long __boundary = ((addr) + CONT_PMD_SIZE) & CONT_PMD_MASK; \ (__boundary - 1 < (end) - 1)? __boundary: (end); \ }) ... instead? [...] > +static void init_pte(pte_t *pte, unsigned long addr, unsigned long end, > + phys_addr_t phys, pgprot_t prot) > { > + do { > + pte_t old_pte = *pte; > + > + set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); > + > + /* > + * After the PTE entry has been populated once, we > + * only allow updates to the permission attributes. 
> + */ > + BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), pte_val(*pte))); > + > + } while (pte++, addr += PAGE_SIZE, phys += PAGE_SIZE, addr != end); > +} > + > +static void alloc_init_cont_pte(pmd_t *pmd, unsigned long addr, > + unsigned long end, phys_addr_t phys, > + pgprot_t prot, > + phys_addr_t (*pgtable_alloc)(void), > + int flags) > +{ > + unsigned long next; > pte_t *pte; > > BUG_ON(pmd_sect(*pmd)); > @@ -136,45 +156,30 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long > addr, > > pte = pte_set_fixmap_offset(pmd, addr); > do { > - pte_t old_pte = *pte; > + pgprot_t __prot = prot; > > - set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); > - phys += PAGE_SIZE; > + next = pte_cont_addr_end(addr, end); > > - /* > - * After the PTE entry has been populated once, we > - * only allow updates to the permission attributes. > - */ > - BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), pte_val(*pte))); > + /* use a contiguous mapping if the range is suitably aligned */ > + if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) && > + (flags & NO_CONT_MAPPINGS) == 0) > + __prot = __pgprot(pgprot_val(prot) | PTE_CONT); > > - } while (pte++, addr += PAGE_SIZE, addr != end); > + init_pte(pte, addr, next, phys, __prot); > + > + phys += next - addr; > + pte += (next - addr) / PAGE_SIZE; > + } while (addr = next, addr != end); > > pte_clear_fixmap(); > } I think it would be preferable to pass the pmd down into alloc_init_pte(), so that we don't have to mess with the pte in both alloc_init_cont_pte() and alloc_init_pte(). Likewise for alloc_init_cont_pmd() and alloc_init_pmd(), regarding the pmd. I realise we'll redundantly map/unmap the PTE for each contiguous span, but I doubt there's a case where it has a noticeable impact. With lots of memory we'll use blocks at a higher level, and for debug_pagealloc we'll pass the whole pte down to init_pte() as we currently do. [...] 
> + if (pud_none(*pud)) { > + phys_addr_t pmd_phys; > + BUG_ON(!pgtable_alloc); > + pmd_phys = pgtable_alloc(); > + pmd = pmd_set_fixmap(pmd_phys); > + __pud_populate(pud, pmd_phys, PUD_TYPE_TABLE); > + pmd_clear_fixmap(); > + } It looks like when the splitting logic was removed, we forgot to remove the fixmapping here (and for the pmd_none() case). The __p?d_populate functions don't touch the next level table, so there's no reason to fixmap it. Would you mind spinning a patch to rip those out? [...] > void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, > unsigned long virt, phys_addr_t size, > pgprot_t prot, bool page_mappings_only) > { > - int flags; > + int flags = NO_CONT_MAPPINGS; > > BUG_ON(mm == &init_mm); > > if (page_mappings_only) > - flags = NO_BLOCK_MAPPINGS; > + flags |=
Re: [PATCH v5 09/10] arm64/mmu: replace 'page_mappings_only' parameter with flags argument
On Thu, Mar 09, 2017 at 09:25:11AM +0100, Ard Biesheuvel wrote: > In preparation of extending the policy for manipulating kernel mappings > with whether or not contiguous hints may be used in the page tables, > replace the bool 'page_mappings_only' with a flags field and a flag > NO_BLOCK_MAPPINGS. > > Signed-off-by: Ard Biesheuvel Thanks for attacking this. I was going to comment on the name change, but I see that the next patch introduces and uses NO_CONT_MAPPINGS, so that's fine by me. > void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, > unsigned long virt, phys_addr_t size, > pgprot_t prot, bool page_mappings_only) > { > + int flags; > + > BUG_ON(mm == &init_mm); > > + if (page_mappings_only) > + flags = NO_BLOCK_MAPPINGS; > + > __create_pgd_mapping(mm->pgd, phys, virt, size, prot, > - pgd_pgtable_alloc, page_mappings_only); > + pgd_pgtable_alloc, flags); > } Given we can't pass the flags in to create_pgd_mapping() without exposing those more generally, this also looks fine. FWIW: Reviewed-by: Mark Rutland Mark. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v5 08/10] arm64/mmu: add contiguous bit to sanity bug check
On Thu, Mar 09, 2017 at 09:25:10AM +0100, Ard Biesheuvel wrote: > A mapping with the contiguous bit cannot be safely manipulated while > live, regardless of whether the bit changes between the old and new > mapping. So take this into account when deciding whether the change > is safe. > > Signed-off-by: Ard Biesheuvel Reviewed-by: Mark Rutland Strictly speaking, I think this is marginally more stringent than what the ARM ARM describes. My reading is that the "Misprogramming of the Contiguous bit" rules only apply when there are multiple valid entries, and hence if you had a contiguous span with only a single valid entry (and TLBs up-to-date), you could modify that in-place so long as you followed the usual BBM rules. However, I don't see us ever (deliberately) doing that, given it would require more work, and there's no guarantee that the CPU would consider the whole span as being mapped. It's also possible my reading of the ARM ARM is flawed. Thanks, Mark. > --- > arch/arm64/mm/mmu.c | 10 +- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index d3fecd20a136..a6d7a86dd2b8 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -103,7 +103,15 @@ static bool pgattr_change_is_safe(u64 old, u64 new) >*/ > static const pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE; > > - return old == 0 || new == 0 || ((old ^ new) & ~mask) == 0; > + /* creating or taking down mappings is always safe */ > + if (old == 0 || new == 0) > + return true; > + > + /* live contiguous mappings may not be manipulated at all */ > + if ((old | new) & PTE_CONT) > + return false; > + > + return ((old ^ new) & ~mask) == 0; > } > > static void alloc_init_pte(pmd_t *pmd, unsigned long addr, > -- > 2.7.4 >
Re: [PATCH v5 07/10] arm64/mmu: ignore debug_pagealloc for kernel segments
On Thu, Mar 09, 2017 at 09:25:09AM +0100, Ard Biesheuvel wrote: > The debug_pagealloc facility manipulates kernel mappings in the linear > region at page granularity to detect out of bounds or use-after-free > accesses. Since the kernel segments are not allocated dynamically, > there is no point in taking the debug_pagealloc_enabled flag into > account for them, and we can use block mappings unconditionally. > > Note that this applies equally to the linear alias of text/rodata: > we will never have dynamic allocations there given that the same > memory is statically in use by the kernel image. > > Signed-off-by: Ard Biesheuvel This makes sense to me, and I haven't found anything this breaks. It may be worth noting that a similar reasoning already applies to the FDT mapping, where we use create_mapping_noalloc(), and never mandate page mappings. Regardless: Reviewed-by: Mark Rutland Mark. > --- > arch/arm64/mm/mmu.c | 7 +++ > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index c3963c592ec3..d3fecd20a136 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -328,8 +328,7 @@ static void update_mapping_prot(phys_addr_t phys, > unsigned long virt, > return; > } > > - __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, > - NULL, debug_pagealloc_enabled()); > + __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, false); > > /* flush the TLBs after updating live kernel mappings */ > flush_tlb_kernel_range(virt, virt + size); > @@ -381,7 +380,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t > start, phys_addr_t end >*/ > __create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start), >kernel_end - kernel_start, PAGE_KERNEL, > - early_pgtable_alloc, false); > + early_pgtable_alloc, false); > } > > void __init mark_linear_text_alias_ro(void) > @@ -437,7 +436,7 @@ static void __init map_kernel_segment(pgd_t *pgd, void > *va_start, void *va_end, 
> BUG_ON(!PAGE_ALIGNED(size)); > > __create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot, > - early_pgtable_alloc, debug_pagealloc_enabled()); > + early_pgtable_alloc, false); > > vma->addr = va_start; > vma->phys_addr = pa_start; > -- > 2.7.4 >
[PATCH 02/15] arm64: sysreg: add debug system registers
This patch adds sysreg definitions for system registers in the debug and trace system register encoding space. Subsequent patches will make use of these definitions. The encodings were taken from ARM DDI 0487A.k_iss10775, Table C5-5. Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Suzuki K Poulose Cc: Will Deacon --- arch/arm64/include/asm/sysreg.h | 23 +++ 1 file changed, 23 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index e6498ac..b54f8a4 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -89,6 +89,29 @@ #define SET_PSTATE_UAO(x) __emit_inst(0xd500 | REG_PSTATE_UAO_IMM | \ (!!x)<<8 | 0x1f) +#define SYS_OSDTRRX_EL1 sys_reg(2, 0, 0, 0, 2) +#define SYS_MDCCINT_EL1 sys_reg(2, 0, 0, 2, 0) +#define SYS_MDSCR_EL1 sys_reg(2, 0, 0, 2, 2) +#define SYS_OSDTRTX_EL1 sys_reg(2, 0, 0, 3, 2) +#define SYS_OSECCR_EL1 sys_reg(2, 0, 0, 6, 2) +#define SYS_DBGBVRn_EL1(n) sys_reg(2, 0, 0, n, 4) +#define SYS_DBGBCRn_EL1(n) sys_reg(2, 0, 0, n, 5) +#define SYS_DBGWVRn_EL1(n) sys_reg(2, 0, 0, n, 6) +#define SYS_DBGWCRn_EL1(n) sys_reg(2, 0, 0, n, 7) +#define SYS_MDRAR_EL1 sys_reg(2, 0, 1, 0, 0) +#define SYS_OSLAR_EL1 sys_reg(2, 0, 1, 0, 4) +#define SYS_OSLSR_EL1 sys_reg(2, 0, 1, 1, 4) +#define SYS_OSDLR_EL1 sys_reg(2, 0, 1, 3, 4) +#define SYS_DBGPRCR_EL1 sys_reg(2, 0, 1, 4, 4) +#define SYS_DBGCLAIMSET_EL1 sys_reg(2, 0, 7, 8, 6) +#define SYS_DBGCLAIMCLR_EL1 sys_reg(2, 0, 7, 9, 6) +#define SYS_DBGAUTHSTATUS_EL1 sys_reg(2, 0, 7, 14, 6) +#define SYS_MDCCSR_EL0 sys_reg(2, 3, 0, 1, 0) +#define SYS_DBGDTR_EL0 sys_reg(2, 3, 0, 4, 0) +#define SYS_DBGDTRRX_EL0 sys_reg(2, 3, 0, 5, 0) +#define SYS_DBGDTRTX_EL0 sys_reg(2, 3, 0, 5, 0) +#define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0) + #define SYS_MIDR_EL1 sys_reg(3, 0, 0, 0, 0) #define SYS_MPIDR_EL1 sys_reg(3, 0, 0, 0, 5) #define SYS_REVIDR_EL1 sys_reg(3, 0, 0, 0, 6) -- 1.9.1
[PATCH 10/15] KVM: arm64: Use common performance monitor sysreg definitions
Now that we have common definitions for the performance monitor register encodings, make the KVM code use these, simplifying the sys_reg_descs table. The comments for PMUSERENR_EL0 and PMCCFILTR_EL0 are kept, as these describe non-obvious details regarding the registers. However, a slight fixup is applied to bring these into line with the usual comment style. Signed-off-by: Mark RutlandCc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.c | 78 +-- 1 file changed, 22 insertions(+), 56 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 5fa23fd..63b0785 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -804,16 +804,12 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, /* Macro to expand the PMEVCNTRn_EL0 register */ #define PMU_PMEVCNTR_EL0(n)\ - /* PMEVCNTRn_EL0 */ \ - { Op0(0b11), Op1(0b011), CRn(0b1110), \ - CRm((0b1000 | (((n) >> 3) & 0x3))), Op2(((n) & 0x7)), \ + { SYS_DESC(SYS_PMEVCNTRn_EL0(n)), \ access_pmu_evcntr, reset_unknown, (PMEVCNTR0_EL0 + n), } /* Macro to expand the PMEVTYPERn_EL0 register */ #define PMU_PMEVTYPER_EL0(n) \ - /* PMEVTYPERn_EL0 */\ - { Op0(0b11), Op1(0b011), CRn(0b1110), \ - CRm((0b1100 | (((n) >> 3) & 0x3))), Op2(((n) & 0x7)), \ + { SYS_DESC(SYS_PMEVTYPERn_EL0(n)), \ access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), } static bool access_cntp_tval(struct kvm_vcpu *vcpu, @@ -963,12 +959,8 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, { Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000), NULL, reset_unknown, PAR_EL1 }, - /* PMINTENSET_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b001), - access_pminten, reset_unknown, PMINTENSET_EL1 }, - /* PMINTENCLR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1001), CRm(0b1110), Op2(0b010), - access_pminten, NULL, PMINTENSET_EL1 }, + { SYS_DESC(SYS_PMINTENSET_EL1), access_pminten, reset_unknown, PMINTENSET_EL1 }, + { 
SYS_DESC(SYS_PMINTENCLR_EL1), access_pminten, NULL, PMINTENSET_EL1 }, /* MAIR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1010), CRm(0b0010), Op2(0b000), @@ -1003,48 +995,23 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, { Op0(0b11), Op1(0b010), CRn(0b), CRm(0b), Op2(0b000), NULL, reset_unknown, CSSELR_EL1 }, - /* PMCR_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b000), - access_pmcr, reset_pmcr, }, - /* PMCNTENSET_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b001), - access_pmcnten, reset_unknown, PMCNTENSET_EL0 }, - /* PMCNTENCLR_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b010), - access_pmcnten, NULL, PMCNTENSET_EL0 }, - /* PMOVSCLR_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b011), - access_pmovs, NULL, PMOVSSET_EL0 }, - /* PMSWINC_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b100), - access_pmswinc, reset_unknown, PMSWINC_EL0 }, - /* PMSELR_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b101), - access_pmselr, reset_unknown, PMSELR_EL0 }, - /* PMCEID0_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b110), - access_pmceid }, - /* PMCEID1_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1100), Op2(0b111), - access_pmceid }, - /* PMCCNTR_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b000), - access_pmu_evcntr, reset_unknown, PMCCNTR_EL0 }, - /* PMXEVTYPER_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b001), - access_pmu_evtyper }, - /* PMXEVCNTR_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1001), CRm(0b1101), Op2(0b010), - access_pmu_evcntr }, - /* PMUSERENR_EL0 -* This register resets as unknown in 64bit mode while it resets as zero + { SYS_DESC(SYS_PMCR_EL0), access_pmcr, reset_pmcr, }, + { SYS_DESC(SYS_PMCNTENSET_EL0), access_pmcnten, reset_unknown, PMCNTENSET_EL0 }, + { SYS_DESC(SYS_PMCNTENCLR_EL0), access_pmcnten, NULL, PMCNTENSET_EL0 }, + { SYS_DESC(SYS_PMOVSCLR_EL0), access_pmovs, 
NULL, PMOVSSET_EL0 }, + { SYS_DESC(SYS_PMSWINC_EL0), access_pmswinc, reset_unknown, PMSWINC_EL0 }, +
[PATCH 06/15] arm64: sysreg: add register encodings used by KVM
This patch adds sysreg definitions for registers which KVM needs the encodings for, which are not currently described in <asm/sysreg.h>. Subsequent patches will make use of these definitions. The encodings were taken from ARM DDI 0487A.k_iss10775, Table C5-6, but this is not an exhaustive addition. Additions are only made for registers used today by KVM. Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Suzuki K Poulose Cc: Will Deacon --- arch/arm64/include/asm/sysreg.h | 37 + 1 file changed, 37 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 3e281b1..f623320 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -119,6 +119,7 @@ #define SYS_ID_PFR0_EL1 sys_reg(3, 0, 0, 1, 0) #define SYS_ID_PFR1_EL1 sys_reg(3, 0, 0, 1, 1) #define SYS_ID_DFR0_EL1 sys_reg(3, 0, 0, 1, 2) +#define SYS_ID_AFR0_EL1 sys_reg(3, 0, 0, 1, 3) #define SYS_ID_MMFR0_EL1 sys_reg(3, 0, 0, 1, 4) #define SYS_ID_MMFR1_EL1 sys_reg(3, 0, 0, 1, 5) #define SYS_ID_MMFR2_EL1 sys_reg(3, 0, 0, 1, 6) @@ -149,11 +150,30 @@ #define SYS_ID_AA64MMFR1_EL1 sys_reg(3, 0, 0, 7, 1) #define SYS_ID_AA64MMFR2_EL1 sys_reg(3, 0, 0, 7, 2) +#define SYS_SCTLR_EL1 sys_reg(3, 0, 1, 0, 0) +#define SYS_ACTLR_EL1 sys_reg(3, 0, 1, 0, 1) +#define SYS_CPACR_EL1 sys_reg(3, 0, 1, 0, 2) + +#define SYS_TTBR0_EL1 sys_reg(3, 0, 2, 0, 0) +#define SYS_TTBR1_EL1 sys_reg(3, 0, 2, 0, 1) +#define SYS_TCR_EL1 sys_reg(3, 0, 2, 0, 2) + #define SYS_ICC_PMR_EL1 sys_reg(3, 0, 4, 6, 0) +#define SYS_AFSR0_EL1 sys_reg(3, 0, 5, 1, 0) +#define SYS_AFSR1_EL1 sys_reg(3, 0, 5, 1, 1) +#define SYS_ESR_EL1 sys_reg(3, 0, 5, 2, 0) +#define SYS_FAR_EL1 sys_reg(3, 0, 6, 0, 0) +#define SYS_PAR_EL1 sys_reg(3, 0, 7, 4, 0) + #define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1) #define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2) +#define SYS_MAIR_EL1 sys_reg(3, 0, 10, 2, 0) +#define SYS_AMAIR_EL1 sys_reg(3, 0, 10, 3, 0) + +#define SYS_VBAR_EL1 sys_reg(3, 0, 12, 0, 0) + #define 
SYS_ICC_DIR_EL1 sys_reg(3, 0, 12, 11, 1) #define SYS_ICC_SGI1R_EL1 sys_reg(3, 0, 12, 11, 5) #define SYS_ICC_IAR1_EL1 sys_reg(3, 0, 12, 12, 0) @@ -163,6 +183,16 @@ #define SYS_ICC_SRE_EL1 sys_reg(3, 0, 12, 12, 5) #define SYS_ICC_GRPEN1_EL1 sys_reg(3, 0, 12, 12, 7) +#define SYS_CONTEXTIDR_EL1 sys_reg(3, 0, 13, 0, 1) +#define SYS_TPIDR_EL1 sys_reg(3, 0, 13, 0, 4) + +#define SYS_CNTKCTL_EL1 sys_reg(3, 0, 14, 1, 0) + +#define SYS_CLIDR_EL1 sys_reg(3, 1, 0, 0, 1) +#define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7) + +#define SYS_CSSELR_EL1 sys_reg(3, 2, 0, 0, 0) + #define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1) #define SYS_DCZID_EL0 sys_reg(3, 3, 0, 0, 7) @@ -180,6 +210,9 @@ #define SYS_PMUSERENR_EL0 sys_reg(3, 3, 9, 14, 0) #define SYS_PMOVSSET_EL0 sys_reg(3, 3, 9, 14, 3) +#define SYS_TPIDR_EL0 sys_reg(3, 3, 13, 0, 2) +#define SYS_TPIDRRO_EL0 sys_reg(3, 3, 13, 0, 3) + #define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) #define SYS_CNTP_TVAL_EL0 sys_reg(3, 3, 14, 2, 0) @@ -194,6 +227,10 @@ #define SYS_PMCCFILTR_EL0 sys_reg (3, 3, 14, 15, 7) +#define SYS_DACR32_EL2 sys_reg(3, 4, 3, 0, 0) +#define SYS_IFSR32_EL2 sys_reg(3, 4, 5, 0, 1) +#define SYS_FPEXC32_EL2 sys_reg(3, 4, 5, 3, 0) + #define __SYS__AP0Rx_EL2(x) sys_reg(3, 4, 12, 8, x) #define SYS_ICH_AP0R0_EL2 __SYS__AP0Rx_EL2(0) #define SYS_ICH_AP0R1_EL2 __SYS__AP0Rx_EL2(1) -- 1.9.1
[PATCH 15/15] KVM: arm64: Use common Set/Way sys definitions
Now that we have common definitions for the encoding of Set/Way cache maintenance operations, make the KVM code use these, simplifying the sys_reg_descs table. Signed-off-by: Mark Rutland Cc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index effa5ce..0e6c477 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -879,15 +879,9 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, * more demanding guest... */ static const struct sys_reg_desc sys_reg_descs[] = { - /* DC ISW */ - { Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b0110), Op2(0b010), - access_dcsw }, - /* DC CSW */ - { Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1010), Op2(0b010), - access_dcsw }, - /* DC CISW */ - { Op0(0b01), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b010), - access_dcsw }, + { SYS_DESC(SYS_DC_ISW), access_dcsw }, + { SYS_DESC(SYS_DC_CSW), access_dcsw }, + { SYS_DESC(SYS_DC_CISW), access_dcsw }, DBG_BCR_BVR_WCR_WVR_EL1(0), DBG_BCR_BVR_WCR_WVR_EL1(1), -- 1.9.1
[PATCH 03/15] arm64: sysreg: add performance monitor registers
This patch adds sysreg definitions for system registers which are part of the performance monitors extension. Subsequent patches will make use of these definitions. The set of registers is described in ARM DDI 0487A.k_iss10775, Table D5-9. The encodings were taken from Table C5-6 in the same document. Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Suzuki K Poulose Cc: Will Deacon --- arch/arm64/include/asm/sysreg.h | 25 + 1 file changed, 25 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index b54f8a4..3498d02 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -149,11 +149,36 @@ #define SYS_ID_AA64MMFR1_EL1 sys_reg(3, 0, 0, 7, 1) #define SYS_ID_AA64MMFR2_EL1 sys_reg(3, 0, 0, 7, 2) +#define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1) +#define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2) + #define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1) #define SYS_DCZID_EL0 sys_reg(3, 3, 0, 0, 7) +#define SYS_PMCR_EL0 sys_reg(3, 3, 9, 12, 0) +#define SYS_PMCNTENSET_EL0 sys_reg(3, 3, 9, 12, 1) +#define SYS_PMCNTENCLR_EL0 sys_reg(3, 3, 9, 12, 2) +#define SYS_PMOVSCLR_EL0 sys_reg(3, 3, 9, 12, 3) +#define SYS_PMSWINC_EL0 sys_reg(3, 3, 9, 12, 4) +#define SYS_PMSELR_EL0 sys_reg(3, 3, 9, 12, 5) +#define SYS_PMCEID0_EL0 sys_reg(3, 3, 9, 12, 6) +#define SYS_PMCEID1_EL0 sys_reg(3, 3, 9, 12, 7) +#define SYS_PMCCNTR_EL0 sys_reg(3, 3, 9, 13, 0) +#define SYS_PMXEVTYPER_EL0 sys_reg(3, 3, 9, 13, 1) +#define SYS_PMXEVCNTR_EL0 sys_reg(3, 3, 9, 13, 2) +#define SYS_PMUSERENR_EL0 sys_reg(3, 3, 9, 14, 0) +#define SYS_PMOVSSET_EL0 sys_reg(3, 3, 9, 14, 3) + #define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) +#define __PMEV_op2(n) ((n) & 0x7) +#define __CNTR_CRm(n) (0x8 | (((n) >> 3) & 0x3)) +#define SYS_PMEVCNTRn_EL0(n) sys_reg(3, 3, 14, __CNTR_CRm(n), __PMEV_op2(n)) +#define __TYPER_CRm(n) (0xc | (((n) >> 3) & 0x3)) +#define SYS_PMEVTYPERn_EL0(n) sys_reg(3, 3, 14, __TYPER_CRm(n), __PMEV_op2(n)) + +#define SYS_PMCCFILTR_EL0 
sys_reg (3, 3, 14, 15, 7) + /* Common SCTLR_ELx flags. */ #define SCTLR_ELx_EE (1 << 25) #define SCTLR_ELx_I (1 << 12) -- 1.9.1
[PATCH 12/15] KVM: arm64: Use common physical timer sysreg definitions
Now that we have common definitions for the physical timer control registers, make the KVM code use these, simplifying the sys_reg_descs table. Signed-off-by: Mark Rutland Cc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 1f3062b..860707f 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1016,15 +1016,9 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, { Op0(0b11), Op1(0b011), CRn(0b1101), CRm(0b0000), Op2(0b011), NULL, reset_unknown, TPIDRRO_EL0 }, - /* CNTP_TVAL_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b000), - access_cntp_tval }, - /* CNTP_CTL_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b001), - access_cntp_ctl }, - /* CNTP_CVAL_EL0 */ - { Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b010), - access_cntp_cval }, + { SYS_DESC(SYS_CNTP_TVAL_EL0), access_cntp_tval }, + { SYS_DESC(SYS_CNTP_CTL_EL0), access_cntp_ctl }, + { SYS_DESC(SYS_CNTP_CVAL_EL0), access_cntp_cval }, /* PMEVCNTRn_EL0 */ PMU_PMEVCNTR_EL0(0), -- 1.9.1
[PATCH 14/15] KVM: arm64: Use common sysreg definitions
Now that we have common definitions for the remaining register encodings required by KVM, make the KVM code use these, simplifying the sys_reg_descs table and the genericv8_sys_regs table. Signed-off-by: Mark RutlandCc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.c| 94 +--- arch/arm64/kvm/sys_regs_generic_v8.c | 4 +- 2 files changed, 25 insertions(+), 73 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index e637e1d..effa5ce 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -924,72 +924,36 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, { SYS_DESC(SYS_DBGVCR32_EL2), NULL, reset_val, DBGVCR32_EL2, 0 }, - /* MPIDR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b), CRm(0b), Op2(0b101), - NULL, reset_mpidr, MPIDR_EL1 }, - /* SCTLR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0001), CRm(0b), Op2(0b000), - access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 }, - /* CPACR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0001), CRm(0b), Op2(0b010), - NULL, reset_val, CPACR_EL1, 0 }, - /* TTBR0_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b), Op2(0b000), - access_vm_reg, reset_unknown, TTBR0_EL1 }, - /* TTBR1_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b), Op2(0b001), - access_vm_reg, reset_unknown, TTBR1_EL1 }, - /* TCR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0010), CRm(0b), Op2(0b010), - access_vm_reg, reset_val, TCR_EL1, 0 }, - - /* AFSR0_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0001), Op2(0b000), - access_vm_reg, reset_unknown, AFSR0_EL1 }, - /* AFSR1_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0001), Op2(0b001), - access_vm_reg, reset_unknown, AFSR1_EL1 }, - /* ESR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0010), Op2(0b000), - access_vm_reg, reset_unknown, ESR_EL1 }, - /* FAR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b), Op2(0b000), - access_vm_reg, reset_unknown, FAR_EL1 }, - /* PAR_EL1 */ - { Op0(0b11), Op1(0b000), 
CRn(0b0111), CRm(0b0100), Op2(0b000), - NULL, reset_unknown, PAR_EL1 }, + { SYS_DESC(SYS_MPIDR_EL1), NULL, reset_mpidr, MPIDR_EL1 }, + { SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 }, + { SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 }, + { SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 }, + { SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 }, + { SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 }, + + { SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 }, + { SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 }, + { SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 }, + { SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 }, + { SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 }, { SYS_DESC(SYS_PMINTENSET_EL1), access_pminten, reset_unknown, PMINTENSET_EL1 }, { SYS_DESC(SYS_PMINTENCLR_EL1), access_pminten, NULL, PMINTENSET_EL1 }, - /* MAIR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1010), CRm(0b0010), Op2(0b000), - access_vm_reg, reset_unknown, MAIR_EL1 }, - /* AMAIR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1010), CRm(0b0011), Op2(0b000), - access_vm_reg, reset_amair_el1, AMAIR_EL1 }, + { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 }, + { SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 }, - /* VBAR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b), Op2(0b000), - NULL, reset_val, VBAR_EL1, 0 }, + { SYS_DESC(SYS_VBAR_EL1), NULL, reset_val, VBAR_EL1, 0 }, { SYS_DESC(SYS_ICC_SGI1R_EL1), access_gic_sgi }, { SYS_DESC(SYS_ICC_SRE_EL1), access_gic_sre }, - /* CONTEXTIDR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b), Op2(0b001), - access_vm_reg, reset_val, CONTEXTIDR_EL1, 0 }, - /* TPIDR_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b), Op2(0b100), - NULL, reset_unknown, TPIDR_EL1 }, + { SYS_DESC(SYS_CONTEXTIDR_EL1), access_vm_reg, reset_val, CONTEXTIDR_EL1, 0 }, + { 
SYS_DESC(SYS_TPIDR_EL1), NULL, reset_unknown, TPIDR_EL1 }, - /* CNTKCTL_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1110), CRm(0b0001), Op2(0b000), - NULL, reset_val, CNTKCTL_EL1, 0}, + { SYS_DESC(SYS_CNTKCTL_EL1), NULL, reset_val, CNTKCTL_EL1, 0}, - /* CSSELR_EL1 */ - { Op0(0b11), Op1(0b010), CRn(0b0000), CRm(0b0000), Op2(0b000), -
[PATCH 08/15] KVM: arm64: add SYS_DESC()
This patch adds a macro enabling us to initialise sys_reg_desc structures based on common sysreg encoding definitions in . Subsequent patches will use this to simplify the KVM code. Signed-off-by: Mark Rutland Cc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.h | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h index 9c6ffd0..66859a5 100644 --- a/arch/arm64/kvm/sys_regs.h +++ b/arch/arm64/kvm/sys_regs.h @@ -147,4 +147,9 @@ const struct sys_reg_desc *find_reg_by_id(u64 id, #define CRm(_x) .CRm = _x #define Op2(_x) .Op2 = _x +#define SYS_DESC(reg) \ + Op0(sys_reg_Op0(reg)), Op1(sys_reg_Op1(reg)), \ + CRn(sys_reg_CRn(reg)), CRm(sys_reg_CRm(reg)), \ + Op2(sys_reg_Op2(reg)) + #endif /* __ARM64_KVM_SYS_REGS_LOCAL_H__ */ -- 1.9.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
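As a sketch of what SYS_DESC() buys, here is a hypothetical, userspace re-creation of the encoding helpers and the designated-initializer expansion. The field shifts/masks mirror those in arch/arm64/include/asm/sysreg.h of this era (assumed here, not copied from the patch), and the struct is a cut-down stand-in for KVM's real sys_reg_desc, which also carries trap/reset handler pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of a sysreg encoding (assumed to match <asm/sysreg.h>). */
#define Op0_shift 19
#define Op1_shift 16
#define CRn_shift 12
#define CRm_shift 8
#define Op2_shift 5

#define sys_reg(op0, op1, crn, crm, op2) \
	((uint32_t)(((op0) << Op0_shift) | ((op1) << Op1_shift) | \
		    ((crn) << CRn_shift) | ((crm) << CRm_shift) | \
		    ((op2) << Op2_shift)))

#define sys_reg_Op0(id) (((id) >> Op0_shift) & 0x3)
#define sys_reg_Op1(id) (((id) >> Op1_shift) & 0x7)
#define sys_reg_CRn(id) (((id) >> CRn_shift) & 0xf)
#define sys_reg_CRm(id) (((id) >> CRm_shift) & 0xf)
#define sys_reg_Op2(id) (((id) >> Op2_shift) & 0x7)

/* Cut-down stand-in for KVM's struct sys_reg_desc. */
struct sys_reg_desc {
	uint8_t Op0, Op1, CRn, CRm, Op2;
};

/* The existing per-field designated-initializer helpers. */
#define Op0(_x) .Op0 = _x
#define Op1(_x) .Op1 = _x
#define CRn(_x) .CRn = _x
#define CRm(_x) .CRm = _x
#define Op2(_x) .Op2 = _x

/* The macro from the patch: split one packed encoding back into the
 * five designated initializers at compile time. */
#define SYS_DESC(reg) \
	Op0(sys_reg_Op0(reg)), Op1(sys_reg_Op1(reg)), \
	CRn(sys_reg_CRn(reg)), CRm(sys_reg_CRm(reg)), \
	Op2(sys_reg_Op2(reg))

/* MDSCR_EL1 is op0=2, op1=0, CRn=0, CRm=2, op2=2. */
#define SYS_MDSCR_EL1 sys_reg(2, 0, 0, 2, 2)

static const struct sys_reg_desc mdscr_desc = { SYS_DESC(SYS_MDSCR_EL1) };
```

With this in place, an entry like `{ SYS_DESC(SYS_MDSCR_EL1), trap_debug_regs, ... }` is field-for-field identical to the old hand-written `Op0(0b10), Op1(0b000), ...` form, but the encoding lives in exactly one header.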
[PATCH 13/15] KVM: arm64: use common invariant sysreg definitions
Now that we have common definitions for the register encodings used by KVM, make the KVM code use these for invariant sysreg definitions. This makes said definitions a reasonable amount shorter, especially as many comments are rendered redundant and can be removed. Signed-off-by: Mark Rutland Cc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.c | 57 --- 1 file changed, 19 insertions(+), 38 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 860707f..e637e1d 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1857,44 +1857,25 @@ static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu, /* ->val is filled in by kvm_sys_reg_table_init() */ static struct sys_reg_desc invariant_sys_regs[] = { - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0000), Op2(0b000), - NULL, get_midr_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0000), Op2(0b110), - NULL, get_revidr_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b000), - NULL, get_id_pfr0_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b001), - NULL, get_id_pfr1_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b010), - NULL, get_id_dfr0_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b011), - NULL, get_id_afr0_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b100), - NULL, get_id_mmfr0_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b101), - NULL, get_id_mmfr1_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b110), - NULL, get_id_mmfr2_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b111), - NULL, get_id_mmfr3_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000), - NULL, get_id_isar0_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b001), - NULL, get_id_isar1_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010), - NULL, get_id_isar2_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000),
CRm(0b0010), Op2(0b011), - NULL, get_id_isar3_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b100), - NULL, get_id_isar4_el1 }, - { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b101), - NULL, get_id_isar5_el1 }, - { Op0(0b11), Op1(0b001), CRn(0b0000), CRm(0b0000), Op2(0b001), - NULL, get_clidr_el1 }, - { Op0(0b11), Op1(0b001), CRn(0b0000), CRm(0b0000), Op2(0b111), - NULL, get_aidr_el1 }, - { Op0(0b11), Op1(0b011), CRn(0b0000), CRm(0b0000), Op2(0b001), - NULL, get_ctr_el0 }, + { SYS_DESC(SYS_MIDR_EL1), NULL, get_midr_el1 }, + { SYS_DESC(SYS_REVIDR_EL1), NULL, get_revidr_el1 }, + { SYS_DESC(SYS_ID_PFR0_EL1), NULL, get_id_pfr0_el1 }, + { SYS_DESC(SYS_ID_PFR1_EL1), NULL, get_id_pfr1_el1 }, + { SYS_DESC(SYS_ID_DFR0_EL1), NULL, get_id_dfr0_el1 }, + { SYS_DESC(SYS_ID_AFR0_EL1), NULL, get_id_afr0_el1 }, + { SYS_DESC(SYS_ID_MMFR0_EL1), NULL, get_id_mmfr0_el1 }, + { SYS_DESC(SYS_ID_MMFR1_EL1), NULL, get_id_mmfr1_el1 }, + { SYS_DESC(SYS_ID_MMFR2_EL1), NULL, get_id_mmfr2_el1 }, + { SYS_DESC(SYS_ID_MMFR3_EL1), NULL, get_id_mmfr3_el1 }, + { SYS_DESC(SYS_ID_ISAR0_EL1), NULL, get_id_isar0_el1 }, + { SYS_DESC(SYS_ID_ISAR1_EL1), NULL, get_id_isar1_el1 }, + { SYS_DESC(SYS_ID_ISAR2_EL1), NULL, get_id_isar2_el1 }, + { SYS_DESC(SYS_ID_ISAR3_EL1), NULL, get_id_isar3_el1 }, + { SYS_DESC(SYS_ID_ISAR4_EL1), NULL, get_id_isar4_el1 }, + { SYS_DESC(SYS_ID_ISAR5_EL1), NULL, get_id_isar5_el1 }, + { SYS_DESC(SYS_CLIDR_EL1), NULL, get_clidr_el1 }, + { SYS_DESC(SYS_AIDR_EL1), NULL, get_aidr_el1 }, + { SYS_DESC(SYS_CTR_EL0), NULL, get_ctr_el0 }, }; static int reg_from_user(u64 *val, const void __user *uaddr, u64 id) -- 1.9.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 09/15] KVM: arm64: Use common debug sysreg definitions
Now that we have common definitions for the debug register encodings, make the KVM code use these, simplifying the sys_reg_descs table. The table previously erroneously referred to MDCCSR_EL0 as MDCCSR_EL1. This is corrected (as is necessary in order to use the common sysreg definition). Signed-off-by: Mark Rutland Cc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.c | 73 ++- 1 file changed, 21 insertions(+), 52 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 0e26f8c..5fa23fd 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -793,17 +793,13 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */ #define DBG_BCR_BVR_WCR_WVR_EL1(n) \ - /* DBGBVRn_EL1 */ \ - { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b100), \ + { SYS_DESC(SYS_DBGBVRn_EL1(n)), \ trap_bvr, reset_bvr, n, 0, get_bvr, set_bvr },\ - /* DBGBCRn_EL1 */ \ - { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b101), \ + { SYS_DESC(SYS_DBGBCRn_EL1(n)), \ trap_bcr, reset_bcr, n, 0, get_bcr, set_bcr },\ - /* DBGWVRn_EL1 */ \ - { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b110), \ + { SYS_DESC(SYS_DBGWVRn_EL1(n)), \ trap_wvr, reset_wvr, n, 0, get_wvr, set_wvr }, \ - /* DBGWCRn_EL1 */ \ - { Op0(0b10), Op1(0b000), CRn(0b0000), CRm((n)), Op2(0b111), \ + { SYS_DESC(SYS_DBGWCRn_EL1(n)), \ trap_wcr, reset_wcr, n, 0, get_wcr, set_wcr } /* Macro to expand the PMEVCNTRn_EL0 register */ @@ -899,12 +895,8 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, DBG_BCR_BVR_WCR_WVR_EL1(0), DBG_BCR_BVR_WCR_WVR_EL1(1), - /* MDCCINT_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000), - trap_debug_regs, reset_val, MDCCINT_EL1, 0 }, - /* MDSCR_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010), - trap_debug_regs, reset_val, MDSCR_EL1, 0 }, + { SYS_DESC(SYS_MDCCINT_EL1), trap_debug_regs,
reset_val, MDCCINT_EL1, 0 }, + { SYS_DESC(SYS_MDSCR_EL1), trap_debug_regs, reset_val, MDSCR_EL1, 0 }, DBG_BCR_BVR_WCR_WVR_EL1(2), DBG_BCR_BVR_WCR_WVR_EL1(3), DBG_BCR_BVR_WCR_WVR_EL1(4), @@ -920,44 +912,21 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, DBG_BCR_BVR_WCR_WVR_EL1(14), DBG_BCR_BVR_WCR_WVR_EL1(15), - /* MDRAR_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b000), - trap_raz_wi }, - /* OSLAR_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0000), Op2(0b100), - trap_raz_wi }, - /* OSLSR_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0001), Op2(0b100), - trap_oslsr_el1 }, - /* OSDLR_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0011), Op2(0b100), - trap_raz_wi }, - /* DBGPRCR_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0001), CRm(0b0100), Op2(0b100), - trap_raz_wi }, - /* DBGCLAIMSET_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1000), Op2(0b110), - trap_raz_wi }, - /* DBGCLAIMCLR_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1001), Op2(0b110), - trap_raz_wi }, - /* DBGAUTHSTATUS_EL1 */ - { Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110), - trap_dbgauthstatus_el1 }, - - /* MDCCSR_EL1 */ - { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0001), Op2(0b000), - trap_raz_wi }, - /* DBGDTR_EL0 */ - { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0100), Op2(0b000), - trap_raz_wi }, - /* DBGDTR[TR]X_EL0 */ - { Op0(0b10), Op1(0b011), CRn(0b0000), CRm(0b0101), Op2(0b000), - trap_raz_wi }, - - /* DBGVCR32_EL2 */ - { Op0(0b10), Op1(0b100), CRn(0b0000), CRm(0b0111), Op2(0b000), - NULL, reset_val, DBGVCR32_EL2, 0 }, + { SYS_DESC(SYS_MDRAR_EL1), trap_raz_wi }, + { SYS_DESC(SYS_OSLAR_EL1), trap_raz_wi }, + { SYS_DESC(SYS_OSLSR_EL1), trap_oslsr_el1 }, + { SYS_DESC(SYS_OSDLR_EL1), trap_raz_wi }, + { SYS_DESC(SYS_DBGPRCR_EL1), trap_raz_wi }, + { SYS_DESC(SYS_DBGCLAIMSET_EL1), trap_raz_wi }, + { SYS_DESC(SYS_DBGCLAIMCLR_EL1), trap_raz_wi }, + {
[PATCH 07/15] arm64: sysreg: add Set/Way sys encodings
Cache maintenance ops fall in the SYS instruction class, and KVM needs to handle them. So as to keep all SYS encodings in one place, this patch adds them to sysreg.h. The encodings were taken from ARM DDI 0487A.k_iss10775, Table C5-2. To make it clear that these are instructions rather than registers, and to allow us to change the way these are handled in future, a new sys_insn() alias for sys_reg() is added and used for these new definitions. Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Suzuki K Poulose Cc: Will Deacon --- arch/arm64/include/asm/sysreg.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index f623320..128eae8 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -48,6 +48,8 @@ ((crn) << CRn_shift) | ((crm) << CRm_shift) | \ ((op2) << Op2_shift)) +#define sys_insn sys_reg + #define sys_reg_Op0(id) (((id) >> Op0_shift) & Op0_mask) #define sys_reg_Op1(id) (((id) >> Op1_shift) & Op1_mask) #define sys_reg_CRn(id) (((id) >> CRn_shift) & CRn_mask) @@ -89,6 +91,10 @@ #define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM | \ (!!x)<<8 | 0x1f) +#define SYS_DC_ISW sys_insn(1, 0, 7, 6, 2) +#define SYS_DC_CSW sys_insn(1, 0, 7, 10, 2) +#define SYS_DC_CISW sys_insn(1, 0, 7, 14, 2) + #define SYS_OSDTRRX_EL1 sys_reg(2, 0, 0, 0, 2) #define SYS_MDCCINT_EL1 sys_reg(2, 0, 0, 2, 0) #define SYS_MDSCR_EL1 sys_reg(2, 0, 0, 2, 2) -- 1.9.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
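Since sys_insn() is just an alias for sys_reg(), the Set/Way encodings above can be round-tripped through the ordinary field extractors. The sketch below assumes the sysreg.h field shifts of this era and checks that the three DC encodings decode back to the CRn/CRm/op2 operands of the `DC ISW/CSW/CISW` instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout, mirroring <asm/sysreg.h>. */
#define Op0_shift 19
#define Op1_shift 16
#define CRn_shift 12
#define CRm_shift 8
#define Op2_shift 5

#define sys_reg(op0, op1, crn, crm, op2) \
	((uint32_t)(((op0) << Op0_shift) | ((op1) << Op1_shift) | \
		    ((crn) << CRn_shift) | ((crm) << CRm_shift) | \
		    ((op2) << Op2_shift)))

/* Instructions share the encoding space with registers, hence the alias. */
#define sys_insn sys_reg

#define sys_reg_CRn(id) (((id) >> CRn_shift) & 0xf)
#define sys_reg_CRm(id) (((id) >> CRm_shift) & 0xf)
#define sys_reg_Op2(id) (((id) >> Op2_shift) & 0x7)

/* The Set/Way cache maintenance encodings from the patch. */
#define SYS_DC_ISW	sys_insn(1, 0, 7, 6, 2)
#define SYS_DC_CSW	sys_insn(1, 0, 7, 10, 2)
#define SYS_DC_CISW	sys_insn(1, 0, 7, 14, 2)
```

All three ops live in CRn=7 with op2=2; only CRm distinguishes invalidate (6), clean (10), and clean+invalidate (14), which is what a trap handler would switch on.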
[PATCH 11/15] KVM: arm64: Use common GICv3 sysreg definitions
Now that we have common definitions for the GICv3 register encodings, make the KVM code use these, simplifying the sys_reg_descs table. Signed-off-by: Mark Rutland Cc: Christoffer Dall Cc: Marc Zyngier Cc: kvmarm@lists.cs.columbia.edu --- arch/arm64/kvm/sys_regs.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 63b0785..1f3062b 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -973,12 +973,8 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu, { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b0000), Op2(0b000), NULL, reset_val, VBAR_EL1, 0 }, - /* ICC_SGI1R_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1011), Op2(0b101), - access_gic_sgi }, - /* ICC_SRE_EL1 */ - { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1100), Op2(0b101), - access_gic_sre }, + { SYS_DESC(SYS_ICC_SGI1R_EL1), access_gic_sgi }, + { SYS_DESC(SYS_ICC_SRE_EL1), access_gic_sre }, /* CONTEXTIDR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b0000), Op2(0b001), -- 1.9.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 01/15] arm64: sysreg: sort by encoding
Our sysreg definitions are largely (but not entirely) in ascending order of op0:op1:CRn:CRm:op2. It would be preferable to enforce this sort, as this makes it easier to verify the set of encodings against documentation, and provides an obvious location for each addition in future, minimising conflicts. This patch enforces this order, by moving the few items that break it. There should be no functional change. Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Suzuki K Poulose Cc: Will Deacon --- arch/arm64/include/asm/sysreg.h | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index ac24b6e..e6498ac 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -81,6 +81,14 @@ #endif /* CONFIG_BROKEN_GAS_INST */ +#define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4) +#define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3) + +#define SET_PSTATE_PAN(x) __emit_inst(0xd5000000 | REG_PSTATE_PAN_IMM | \ + (!!x)<<8 | 0x1f) +#define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM | \ + (!!x)<<8 | 0x1f) + #define SYS_MIDR_EL1 sys_reg(3, 0, 0, 0, 0) #define SYS_MPIDR_EL1 sys_reg(3, 0, 0, 0, 5) #define SYS_REVIDR_EL1 sys_reg(3, 0, 0, 0, 6) @@ -118,17 +126,10 @@ #define SYS_ID_AA64MMFR1_EL1 sys_reg(3, 0, 0, 7, 1) #define SYS_ID_AA64MMFR2_EL1 sys_reg(3, 0, 0, 7, 2) -#define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) #define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1) #define SYS_DCZID_EL0 sys_reg(3, 3, 0, 0, 7) -#define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4) -#define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3) - -#define SET_PSTATE_PAN(x) __emit_inst(0xd5000000 | REG_PSTATE_PAN_IMM | \ - (!!x)<<8 | 0x1f) -#define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM | \ - (!!x)<<8 | 0x1f) +#define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) /* Common SCTLR_ELx flags.
*/ #define SCTLR_ELx_EE (1 << 25) -- 1.9.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 05/15] arm64: sysreg: add physical timer registers
This patch adds sysreg definitions for system registers used to control the architected physical timer. Subsequent patches will make use of these definitions. The encodings were taken from ARM DDI 0487A.k_iss10775, Table C5-6. Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Suzuki K Poulose Cc: Will Deacon --- arch/arm64/include/asm/sysreg.h | 4 1 file changed, 4 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 9dc30bc..3e281b1 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -182,6 +182,10 @@ #define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) +#define SYS_CNTP_TVAL_EL0 sys_reg(3, 3, 14, 2, 0) +#define SYS_CNTP_CTL_EL0 sys_reg(3, 3, 14, 2, 1) +#define SYS_CNTP_CVAL_EL0 sys_reg(3, 3, 14, 2, 2) + #define __PMEV_op2(n) ((n) & 0x7) #define __CNTR_CRm(n) (0x8 | (((n) >> 3) & 0x3)) #define SYS_PMEVCNTRn_EL0(n) sys_reg(3, 3, 14, __CNTR_CRm(n), __PMEV_op2(n)) -- 1.9.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 04/15] arm64: sysreg: subsume GICv3 sysreg definitions
Unlike most sysreg definitions, the GICv3 definitions don't have a SYS_ prefix, and they don't live in . Additionally, some definitions are duplicated elsewhere (e.g. in the KVM save/restore code). For consistency, and to make it possible to share a common definition for these sysregs, this patch moves the definitions to , adding a SYS_ prefix, and sorting the registers per their encoding. Existing users of the definitions are fixed up so that this change is not problematic. Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Marc Zyngier Cc: Suzuki K Poulose Cc: Will Deacon --- arch/arm64/include/asm/arch_gicv3.h | 81 ++--- arch/arm64/include/asm/sysreg.h | 52 arch/arm64/kernel/head.S | 8 ++-- 3 files changed, 69 insertions(+), 72 deletions(-) diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h index f37e3a2..1a98bc8 100644 --- a/arch/arm64/include/asm/arch_gicv3.h +++ b/arch/arm64/include/asm/arch_gicv3.h @@ -20,69 +20,14 @@ #include -#define ICC_EOIR1_EL1 sys_reg(3, 0, 12, 12, 1) -#define ICC_DIR_EL1 sys_reg(3, 0, 12, 11, 1) -#define ICC_IAR1_EL1 sys_reg(3, 0, 12, 12, 0) -#define ICC_SGI1R_EL1 sys_reg(3, 0, 12, 11, 5) -#define ICC_PMR_EL1 sys_reg(3, 0, 4, 6, 0) -#define ICC_CTLR_EL1 sys_reg(3, 0, 12, 12, 4) -#define ICC_SRE_EL1 sys_reg(3, 0, 12, 12, 5) -#define ICC_GRPEN1_EL1 sys_reg(3, 0, 12, 12, 7) -#define ICC_BPR1_EL1 sys_reg(3, 0, 12, 12, 3) - -#define ICC_SRE_EL2 sys_reg(3, 4, 12, 9, 5) - -/* - * System register definitions - */ -#define ICH_VSEIR_EL2 sys_reg(3, 4, 12, 9, 4) -#define ICH_HCR_EL2 sys_reg(3, 4, 12, 11, 0) -#define ICH_VTR_EL2 sys_reg(3, 4, 12, 11, 1) -#define ICH_MISR_EL2 sys_reg(3, 4, 12, 11, 2) -#define ICH_EISR_EL2 sys_reg(3, 4, 12, 11, 3) -#define ICH_ELSR_EL2 sys_reg(3, 4, 12, 11, 5) -#define ICH_VMCR_EL2 sys_reg(3, 4, 12, 11, 7) - -#define __LR0_EL2(x) sys_reg(3, 4, 12, 12, x) -#define __LR8_EL2(x) sys_reg(3, 4, 12, 13, x) - -#define ICH_LR0_EL2 __LR0_EL2(0) -#define ICH_LR1_EL2 __LR0_EL2(1) -#define
ICH_LR2_EL2 __LR0_EL2(2) -#define ICH_LR3_EL2 __LR0_EL2(3) -#define ICH_LR4_EL2 __LR0_EL2(4) -#define ICH_LR5_EL2 __LR0_EL2(5) -#define ICH_LR6_EL2 __LR0_EL2(6) -#define ICH_LR7_EL2 __LR0_EL2(7) -#define ICH_LR8_EL2 __LR8_EL2(0) -#define ICH_LR9_EL2 __LR8_EL2(1) -#define ICH_LR10_EL2 __LR8_EL2(2) -#define ICH_LR11_EL2 __LR8_EL2(3) -#define ICH_LR12_EL2 __LR8_EL2(4) -#define ICH_LR13_EL2 __LR8_EL2(5) -#define ICH_LR14_EL2 __LR8_EL2(6) -#define ICH_LR15_EL2 __LR8_EL2(7) - -#define __AP0Rx_EL2(x) sys_reg(3, 4, 12, 8, x) -#define ICH_AP0R0_EL2 __AP0Rx_EL2(0) -#define ICH_AP0R1_EL2 __AP0Rx_EL2(1) -#define ICH_AP0R2_EL2 __AP0Rx_EL2(2) -#define ICH_AP0R3_EL2 __AP0Rx_EL2(3) - -#define __AP1Rx_EL2(x) sys_reg(3, 4, 12, 9, x) -#define ICH_AP1R0_EL2 __AP1Rx_EL2(0) -#define ICH_AP1R1_EL2 __AP1Rx_EL2(1) -#define ICH_AP1R2_EL2 __AP1Rx_EL2(2) -#define ICH_AP1R3_EL2 __AP1Rx_EL2(3) - #ifndef __ASSEMBLY__ #include #include #include -#define read_gicreg read_sysreg_s -#define write_gicreg write_sysreg_s +#define read_gicreg(r) read_sysreg_s(SYS_ ## r) +#define write_gicreg(v, r) write_sysreg_s(v, SYS_ ## r) /* * Low-level accessors @@ -93,13 +38,13 @@ static inline void gic_write_eoir(u32 irq) { - write_sysreg_s(irq, ICC_EOIR1_EL1); + write_sysreg_s(irq, SYS_ICC_EOIR1_EL1); isb(); } static inline void gic_write_dir(u32 irq) { - write_sysreg_s(irq, ICC_DIR_EL1); + write_sysreg_s(irq, SYS_ICC_DIR_EL1); isb(); } @@ -107,7 +52,7 @@ static inline u64 gic_read_iar_common(void) { u64 irqstat; - irqstat = read_sysreg_s(ICC_IAR1_EL1); + irqstat = read_sysreg_s(SYS_ICC_IAR1_EL1); dsb(sy); return irqstat; } @@ -124,7 +69,7 @@ static inline u64 gic_read_iar_cavium_thunderx(void) u64 irqstat; nops(8); - irqstat = read_sysreg_s(ICC_IAR1_EL1); + irqstat =
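The key trick in the arch_gicv3.h hunk is the token-pasting change to read_gicreg()/write_gicreg(): by pasting `SYS_` onto the caller-supplied mnemonic, existing `read_gicreg(ICC_SRE_EL1)` call sites transparently resolve to the new common `SYS_ICC_SRE_EL1` define. A minimal userspace sketch of that plumbing, with an identity function standing in for the kernel's `read_sysreg_s()` (which is an mrs_s accessor, not a function), looks like:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout, mirroring <asm/sysreg.h>. */
#define Op0_shift 19
#define Op1_shift 16
#define CRn_shift 12
#define CRm_shift 8
#define Op2_shift 5

#define sys_reg(op0, op1, crn, crm, op2) \
	((uint32_t)(((op0) << Op0_shift) | ((op1) << Op1_shift) | \
		    ((crn) << CRn_shift) | ((crm) << CRm_shift) | \
		    ((op2) << Op2_shift)))

/* The common, SYS_-prefixed definition this series introduces. */
#define SYS_ICC_SRE_EL1	sys_reg(3, 0, 12, 12, 5)

/* Stand-in for read_sysreg_s(): just echo the encoding back so the
 * macro plumbing can be exercised in userspace. */
static uint32_t read_sysreg_s(uint32_t enc)
{
	return enc;
}

/* The patch's trick: paste SYS_ onto the caller's mnemonic. */
#define read_gicreg(r)	read_sysreg_s(SYS_ ## r)
```

With this, `read_gicreg(ICC_SRE_EL1)` expands to `read_sysreg_s(SYS_ICC_SRE_EL1)`, so callers keep the short GIC mnemonic while the encoding itself lives only in sysreg.h.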
[PATCH 00/15] arm64/kvm: use common sysreg definitions
Currently we duplicate effort in maintaining system register encodings across arm64's , KVM's sysreg tables, and other places. This redundancy is unfortunate, and as encodings are encoded in-place without any mnemonic, this ends up more painful to read than necessary. This series ameliorates this by making the canonical location for (architected) system register encodings, with other users building atop of this, e.g. with KVM deriving its sysreg table values from the common mnemonics. I've only attacked AArch64-native SYS encodings, and ignored CP{15,14} registers for now, but these could be handled similarly. Largely, I've stuck to only what KVM needs, though for the debug and perfmon groups it was easier to take the whole group from the ARM ARM than to filter them to only what KVM needed today. To verify that I haven't accidentally broken KVM, I've diffed sys_regs.o and sys_regs_generic_v8.o on a section-by-section basis before and after the series is applied. The .text, .data, and .rodata sections (and most others) are identical. The __bug_table section, and some .debug* sections differ, and this appears to be due to line numbers changing due to removed lines. One thing I wasn't sure how to address was banks of registers such as PMEVCNTR_EL0. We currently enumerate all cases for our GICv3 definitions, but it seemed painful to expand ~30 cases for PMEVCNTR_EL0 and friends, and for these I've made the macros take an 'n' parameter. It would be nice to be consistent either way, and I'm happy to expand those cases. I've pushed this series out to a branch [1] based on v4.11-rc1. It looks like git rebase is also happy to apply the patches atop of the kvm-arm-for-4.11-rc2 tag. Thanks, Mark. Since RFC [2]: * Rebase to v4.11-rc1, solving a trivial conflict. * Handle the physical counter registers. * Verified section differences again. Thanks, Mark.
[1] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/common-sysreg [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-January/484693.html Mark Rutland (15): arm64: sysreg: sort by encoding arm64: sysreg: add debug system registers arm64: sysreg: add performance monitor registers arm64: sysreg: subsume GICv3 sysreg definitions arm64: sysreg: add physical timer registers arm64: sysreg: add register encodings used by KVM arm64: sysreg: add Set/Way sys encodings KVM: arm64: add SYS_DESC() KVM: arm64: Use common debug sysreg definitions KVM: arm64: Use common performance monitor sysreg definitions KVM: arm64: Use common GICv3 sysreg definitions KVM: arm64: Use common physical timer sysreg definitions KVM: arm64: use common invariant sysreg definitions KVM: arm64: Use common sysreg definitions KVM: arm64: Use common Set/Way sys definitions arch/arm64/include/asm/arch_gicv3.h | 81 ++-- arch/arm64/include/asm/sysreg.h | 162 +++- arch/arm64/kernel/head.S | 8 +- arch/arm64/kvm/sys_regs.c| 358 +++ arch/arm64/kvm/sys_regs.h| 5 + arch/arm64/kvm/sys_regs_generic_v8.c | 4 +- 6 files changed, 284 insertions(+), 334 deletions(-) -- 1.9.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v5 06/10] arm64/mmu: align alloc_init_pte prototype with pmd/pud versions
On Thu, Mar 09, 2017 at 09:25:08AM +0100, Ard Biesheuvel wrote: > Align the function prototype of alloc_init_pte() with its pmd and pud > counterparts by replacing the pfn parameter with the equivalent physical > address. > > Signed-off-by: Ard Biesheuvel > --- > arch/arm64/mm/mmu.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 75e21c33caff..c3963c592ec3 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -107,7 +107,7 @@ static bool pgattr_change_is_safe(u64 old, u64 new) > } > > static void alloc_init_pte(pmd_t *pmd, unsigned long addr, > - unsigned long end, unsigned long pfn, > + unsigned long end, phys_addr_t phys, > pgprot_t prot, > phys_addr_t (*pgtable_alloc)(void)) > { > @@ -128,8 +128,8 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr, > do { > pte_t old_pte = *pte; > > - set_pte(pte, pfn_pte(pfn, prot)); > - pfn++; > + set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); > + phys += PAGE_SIZE; Minor nit: so as to align the structure of the loop with the other functions, it'd be nice to have this on the final line of the loop body. Either way: Reviewed-by: Mark Rutland Mark. > > /* >* After the PTE entry has been populated once, we > @@ -182,7 +182,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long > addr, unsigned long end, > BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd), > pmd_val(*pmd))); > } else { > - alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys), > + alloc_init_pte(pmd, addr, next, phys, > prot, pgtable_alloc); > > BUG_ON(pmd_val(old_pmd) != 0 && > -- > 2.7.4 > ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
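The prototype change above is behavior-preserving because, for page-aligned ranges, carrying a pfn and incrementing it is equivalent to carrying a physical address, advancing it by PAGE_SIZE, and converting on each iteration. A small self-contained sketch (with assumed 4K-page constants; `__phys_to_pfn()` is just a right shift) checks that equivalence:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4K-page constants for the sketch. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1ull << PAGE_SHIFT)
#define __phys_to_pfn(p)	((p) >> PAGE_SHIFT)

/* Walk [phys, end) one page at a time, checking that the old style
 * (carry a pfn, pfn++ per page) and the new style (carry phys,
 * convert with __phys_to_pfn() each iteration) yield the same
 * frame number for every pte slot. */
static int styles_agree(uint64_t phys, uint64_t end)
{
	uint64_t pfn = __phys_to_pfn(phys);
	uint64_t p;

	for (p = phys; p < end; p += PAGE_SIZE, pfn++) {
		if (__phys_to_pfn(p) != pfn)
			return 0;
	}
	return 1;
}
```

This is also why the follow-up contiguous-mapping patch can advance `phys += next - addr` across a whole span without re-deriving a pfn.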
Re: [PATCH 00/10] KVM/ARM fixes for 4.11-rc2
On Thu, Mar 09 2017 at 3:16:52 pm GMT, Radim Krčmář wrote: > 2017-03-09 09:55+0000, Marc Zyngier: >> Paolo, Radim, >> >> Here's the KVM/ARM updates for 4.11-rc2. The usual bag of vgic >> updates, making the memslot array large enough to handle guests with >> tons of devices assigned to them, a tidying up of exception handling, >> and a rather annoying TLB handling issue on VHE systems. > > Pulled, but what made you change the GPG key into a revoked one? > > # gpg: Signature made Thu 09 Mar 2017 10:36:07 AM CET > # gpg: using RSA key AB309C74B93B1EA1 > # gpg: Good signature from "Marc Zyngier " > # gpg: WARNING: This key has been revoked by its owner! > # gpg: This could mean that the signature is forged. > # gpg: reason for revocation: Key has been compromised > # gpg: revocation comment: Revoked after kernel.org hacking > # Primary key fingerprint: 6958 C9F2 233C 5E9D 6CCA 818A AB30 9C74 B93B 1EA1 Gahhh... New laptop, keyring restored from a backup containing the revoked key, gpg-agent picking the revoked key as a default, idiot sitting behind the keyboard not paying attention :-(. Now fixed. Sorry about that. M. -- Jazz is not dead, it just smell funny. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH RFC 6/7] ARM64: KVM: Support heterogeneous system
On 28/01/17 14:55, Andrew Jones wrote: On Mon, Jan 16, 2017 at 05:33:33PM +0800, Shannon Zhao wrote: From: Shannon Zhao When initializing KVM, check whether physical hardware is a heterogeneous system through the MIDR values. If so, force userspace to set the KVM_ARM_VCPU_CROSS feature bit. Otherwise, it should fail to initialize VCPUs. Signed-off-by: Shannon Zhao --- arch/arm/kvm/arm.c | 26 ++ include/uapi/linux/kvm.h | 1 + 2 files changed, 27 insertions(+) diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index bdceb19..21ec070 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -46,6 +46,7 @@ #include #include #include +#include #ifdef REQUIRES_VIRT __asm__(".arch_extension virt"); @@ -65,6 +66,7 @@ static unsigned int kvm_vmid_bits __read_mostly; static DEFINE_SPINLOCK(kvm_vmid_lock); static bool vgic_present; +static bool heterogeneous_system; static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled); @@ -210,6 +212,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_ARM_CROSS_VCPU: r = 1; break; + case KVM_CAP_ARM_HETEROGENEOUS: + r = heterogeneous_system; + break; What's this for? When/why would userspace check it? case KVM_CAP_COALESCED_MMIO: r = KVM_COALESCED_MMIO_PAGE_OFFSET; break; @@ -812,6 +817,12 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, int phys_target = kvm_target_cpu(); bool cross_vcpu = kvm_vcpu_has_feature_cross_cpu(init); + if (heterogeneous_system && !cross_vcpu) { + kvm_err("%s:Host is a heterogeneous system, set KVM_ARM_VCPU_CROSS bit\n", + __func__); + return -EINVAL; + } Instead of forcing userspace to set a bit, why not just confirm the target selected will work? E.g.
if only generic works on a heterogeneous system then just if (heterogeneous_system && init->target != GENERIC) return -EINVAL should work + if (!cross_vcpu && init->target != phys_target) return -EINVAL; @@ -1397,6 +1408,11 @@ static void check_kvm_target_cpu(void *ret) *(int *)ret = kvm_target_cpu(); } +static void get_physical_cpu_midr(void *midr) +{ + *(u32 *)midr = read_cpuid_id(); +} + struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr) { struct kvm_vcpu *vcpu; @@ -1417,6 +1433,7 @@ int kvm_arch_init(void *opaque) { int err; int ret, cpu; + u32 current_midr, midr; if (!is_hyp_mode_available()) { kvm_err("HYP mode not available\n"); @@ -1431,6 +1448,15 @@ int kvm_arch_init(void *opaque) } } + current_midr = read_cpuid_id(); + for_each_online_cpu(cpu) { + smp_call_function_single(cpu, get_physical_cpu_midr, &midr, 1); + if (current_midr != midr) { + heterogeneous_system = true; + break; + } + } Is there no core kernel API that provides this? On arm64, there is a per-cpu cpuinfo structure kept for each online CPU, which keeps cached values of the ID registers. Maybe we could use that here. See arch/arm64/kernel/cpuinfo.c::__cpuinfo_store_cpu() Suzuki ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
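The core of the patch under review is just "are all online CPUs' MIDR values equal?". A minimal userspace sketch of that predicate, operating on an array of cached per-CPU MIDR values (standing in for the per-cpu cpuinfo structure Suzuki points at; the example MIDR constants below are illustrative values for Cortex-A53/A72-class parts, not taken from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if any CPU's MIDR differs from CPU 0's, i.e. the
 * system is heterogeneous (big.LITTLE and similar). */
static int system_is_heterogeneous(const uint32_t *midr, int ncpus)
{
	int cpu;

	for (cpu = 1; cpu < ncpus; cpu++) {
		if (midr[cpu] != midr[0])
			return 1;
	}
	return 0;
}
```

Whether the comparison should use the raw MIDR or mask out the variant/revision fields is a policy question the thread leaves open; the sketch compares the whole register, as the patch does.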
Re: [PATCH 00/10] KVM/ARM fixes for 4.11-rc2
2017-03-09 09:55+0000, Marc Zyngier: > Paolo, Radim, > > Here's the KVM/ARM updates for 4.11-rc2. The usual bag of vgic > updates, making the memslot array large enough to handle guests with > tons of devices assigned to them, a tidying up of exception handling, > and a rather annoying TLB handling issue on VHE systems. Pulled, but what made you change the GPG key into a revoked one? # gpg: Signature made Thu 09 Mar 2017 10:36:07 AM CET # gpg: using RSA key AB309C74B93B1EA1 # gpg: Good signature from "Marc Zyngier" # gpg: WARNING: This key has been revoked by its owner! # gpg: This could mean that the signature is forged. # gpg: reason for revocation: Key has been compromised # gpg: revocation comment: Revoked after kernel.org hacking # Primary key fingerprint: 6958 C9F2 233C 5E9D 6CCA 818A AB30 9C74 B93B 1EA1 Thanks. > Please pull, > > Thanks, > > M. > > The following changes since commit c1ae3cfa0e89fa1a7ecc4c99031f5e9ae99d9201: > > Linux 4.11-rc1 (2017-03-05 12:59:56 -0800) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git > tags/kvm-arm-for-4.11-rc2 > > for you to fetch changes up to 955a3fc6d2a1c11d6d00bce4f3816100ce0530cf: > > KVM: arm64: Increase number of user memslots to 512 (2017-03-09 09:13:50 +0000) > > > KVM/ARM updates for v4.11-rc2 > > vgic updates: > - Honour disabling the ITS > - Don't deadlock when deactivating own interrupts via MMIO > - Correctly expose the lack of IRQ/FIQ bypass on GICv3 > > I/O virtualization: > - Make KVM_CAP_NR_MEMSLOTS big enough for large guests with > many PCIe devices > > General bug fixes: > - Gracefully handle exception generated with syndroms that > the host doesn't understand > - Properly invalidate TLBs on VHE systems > > > Andre Przywara (1): > KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled > > Jintack Lim (1): > KVM: arm/arm64: Let vcpu thread modify its own active state > > Linu Cherian (4): > KVM: Add documentation for
KVM_CAP_NR_MEMSLOTS > KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64 > KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused > KVM: arm64: Increase number of user memslots to 512 > > Marc Zyngier (2): > arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs > KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypass > > Mark Rutland (2): > arm: KVM: Survive unknown traps from guests > arm64: KVM: Survive unknown traps from guests > > Documentation/virtual/kvm/api.txt | 4 ++ > arch/arm/include/asm/kvm_arm.h | 1 + > arch/arm/include/asm/kvm_host.h| 1 - > arch/arm/kvm/arm.c | 3 + > arch/arm/kvm/handle_exit.c | 19 --- > arch/arm64/include/asm/kvm_host.h | 3 +- > arch/arm64/kvm/handle_exit.c | 19 --- > arch/arm64/kvm/hyp/tlb.c | 64 +++--- > include/linux/irqchip/arm-gic-v3.h | 2 + > virt/kvm/arm/vgic/vgic-its.c | 109 > ++--- > virt/kvm/arm/vgic/vgic-mmio.c | 32 --- > virt/kvm/arm/vgic/vgic-v3.c| 5 +- > 12 files changed, 183 insertions(+), 79 deletions(-) ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH RFC 7/7] ARM64: KVM: Add user set handler for id_aa64mmfr0_el1
On Thu, Mar 09, 2017 at 04:52:18AM -0800, Christoffer Dall wrote: > On Mon, Jan 16, 2017 at 05:33:34PM +0800, Shannon Zhao wrote: > > From: Shannon Zhao > > > > Check if the configuration is fine. > > This commit message really needs some love and attention. > > > > > Signed-off-by: Shannon Zhao > > --- > > arch/arm64/kvm/sys_regs.c | 32 +++- > > 1 file changed, 31 insertions(+), 1 deletion(-) > > > > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c > > index f613e29..9763b79 100644 > > --- a/arch/arm64/kvm/sys_regs.c > > +++ b/arch/arm64/kvm/sys_regs.c > > @@ -1493,6 +1493,35 @@ static bool access_id_reg(struct kvm_vcpu *vcpu, > > return true; > > } > > > > +static int set_id_aa64mmfr0_el1(struct kvm_vcpu *vcpu, > > + const struct sys_reg_desc *rd, > > + const struct kvm_one_reg *reg, > > + void __user *uaddr) > > +{ > > + u64 val, id_aa64mmfr0; > > + > > + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)) != 0) > > + return -EFAULT; > > + > > + asm volatile("mrs %0, id_aa64mmfr0_el1\n" : "=r" (id_aa64mmfr0)); > > Doesn't the kernel have an abstraction for this already or a cached > value? Certainly we shouldn't be using a raw mrs instruction. We have read_sysreg() or read_cpuid() for that. The cpufeature code has a cached, system-wide safe value cached for each system register. The cpuid_feature_extract_field() helper uses that. > > + if ((val & GENMASK(3, 0)) > (id_aa64mmfr0 & GENMASK(3, 0)) || > > + (val & GENMASK(7, 4)) > (id_aa64mmfr0 & GENMASK(7, 4)) || > > + (val & GENMASK(11, 8)) > (id_aa64mmfr0 & GENMASK(11, 8)) || > > + (val & GENMASK(15, 12)) > (id_aa64mmfr0 & GENMASK(15, 12)) || > > + (val & GENMASK(19, 16)) > (id_aa64mmfr0 & GENMASK(19, 16)) || > > + (val & GENMASK(23, 20)) > (id_aa64mmfr0 & GENMASK(23, 20)) || > > + (val & GENMASK(27, 24)) < (id_aa64mmfr0 & GENMASK(27, 24)) || > > + (val & GENMASK(31, 28)) < (id_aa64mmfr0 & GENMASK(31, 28))) { Please use mnemonics. For example, we have ID_AA64MMFR0_TGRAN*_SHIFT defined in .
We also have extraction helpers, see cpuid_feature_extract_unsigned_field(), as used in id_aa64mmfr0_mixed_endian_el0(). Thanks, Mark. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
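The review's suggestion — extract and compare each 4-bit ID register field by name instead of open-coding GENMASK() literals — can be sketched outside the kernel as follows. The shift names mirror the ID_AA64MMFR0_EL1 layout, but the helper is a simplified stand-in for the kernel's cpuid_feature_extract_unsigned_field(), not the real implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Field shifts follow the ID_AA64MMFR0_EL1 layout (4 bits per field). */
#define ID_AA64MMFR0_PARANGE_SHIFT  0
#define ID_AA64MMFR0_ASID_SHIFT     4
#define ID_AA64MMFR0_TGRAN4_SHIFT   28

/* Simplified stand-in for cpuid_feature_extract_unsigned_field(). */
static inline unsigned int extract_unsigned_field(uint64_t reg,
                                                  unsigned int shift)
{
    return (unsigned int)((reg >> shift) & 0xfULL);
}

/* Userspace may only narrow a feature, never claim more than the host
 * supports: returns 0 when the requested field value is acceptable. */
static int check_field_le_host(uint64_t user, uint64_t host,
                               unsigned int shift)
{
    return extract_unsigned_field(user, shift) <=
           extract_unsigned_field(host, shift) ? 0 : -1;
}
```

Note that the TGRAN fields (the `<` comparisons in the patch, bits 24-31) have inverted semantics, so a real check would special-case them rather than reuse `check_field_le_host()` unchanged.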
Re: [PATCH RFC 7/7] ARM64: KVM: Add user set handler for id_aa64mmfr0_el1
On Mon, Jan 16, 2017 at 05:33:34PM +0800, Shannon Zhao wrote: > From: Shannon Zhao > > Check if the configuration is fine. This commit message really needs some love and attention. > > Signed-off-by: Shannon Zhao > --- > arch/arm64/kvm/sys_regs.c | 32 +++- > 1 file changed, 31 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c > index f613e29..9763b79 100644 > --- a/arch/arm64/kvm/sys_regs.c > +++ b/arch/arm64/kvm/sys_regs.c > @@ -1493,6 +1493,35 @@ static bool access_id_reg(struct kvm_vcpu *vcpu, > return true; > } > > +static int set_id_aa64mmfr0_el1(struct kvm_vcpu *vcpu, > + const struct sys_reg_desc *rd, > + const struct kvm_one_reg *reg, > + void __user *uaddr) > +{ > + u64 val, id_aa64mmfr0; > + > + if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)) != 0) > + return -EFAULT; > + > + asm volatile("mrs %0, id_aa64mmfr0_el1\n" : "=r" (id_aa64mmfr0)); Doesn't the kernel have an abstraction for this already or a cached value? > + > + if ((val & GENMASK(3, 0)) > (id_aa64mmfr0 & GENMASK(3, 0)) || > + (val & GENMASK(7, 4)) > (id_aa64mmfr0 & GENMASK(7, 4)) || > + (val & GENMASK(11, 8)) > (id_aa64mmfr0 & GENMASK(11, 8)) || > + (val & GENMASK(15, 12)) > (id_aa64mmfr0 & GENMASK(15, 12)) || > + (val & GENMASK(19, 16)) > (id_aa64mmfr0 & GENMASK(19, 16)) || > + (val & GENMASK(23, 20)) > (id_aa64mmfr0 & GENMASK(23, 20)) || > + (val & GENMASK(27, 24)) < (id_aa64mmfr0 & GENMASK(27, 24)) || > + (val & GENMASK(31, 28)) < (id_aa64mmfr0 & GENMASK(31, 28))) { > + kvm_err("Wrong memory translation granule size/Physical Address > range\n"); > + return -EINVAL; > + } This really needs some explanation as to what it's checking and what the logic is. 
> + > + vcpu_id_sys_reg(vcpu, rd->reg) = val & GENMASK(31, 0); > + > + return 0; > +} > + > static struct sys_reg_desc invariant_sys_regs[] = { > { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0000), Op2(0b000), > access_id_reg, get_midr_el1, MIDR_EL1 }, > @@ -1549,7 +1578,8 @@ static struct sys_reg_desc invariant_sys_regs[] = { > { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0110), Op2(0b001), > access_id_reg, get_id_aa64isar1_el1, ID_AA64ISAR1_EL1 }, > { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0111), Op2(0b000), > - access_id_reg, get_id_aa64mmfr0_el1, ID_AA64MMFR0_EL1 }, > + access_id_reg, get_id_aa64mmfr0_el1, ID_AA64MMFR0_EL1, > + 0, NULL, set_id_aa64mmfr0_el1 }, > { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0111), Op2(0b001), > access_id_reg, get_id_aa64mmfr1_el1, ID_AA64MMFR1_EL1 }, > { Op0(0b11), Op1(0b001), CRn(0b0000), CRm(0b0000), Op2(0b001), > -- > 2.0.4 > > Thanks, -Christoffer
Re: [PATCH RFC 5/7] ARM64: KVM: Support cross type vCPU
On Sat, Jan 28, 2017 at 03:47:54PM +0100, Andrew Jones wrote: > On Mon, Jan 16, 2017 at 05:33:32PM +0800, Shannon Zhao wrote: > > From: Shannon Zhao > > > > Add a capability to tell userspace that KVM supports cross type vCPU. > > Add a cpu feature for userspace to set when it doesn't use host type > > vCPU, and make kvm_vcpu_preferred_target return the host MIDR register value > > so that userspace can check whether its requested vCPU type matches the > > one of the physical CPU; if so, KVM will not trap ID registers even > > though userspace doesn't specify -cpu host. > > The guest accesses MIDR through VPIDR_EL2, so we save/restore it whether or not > > it's a cross type vCPU. > > > > Signed-off-by: Shannon Zhao > > --- > > arch/arm/kvm/arm.c | 10 -- > > arch/arm64/include/asm/kvm_emulate.h | 3 +++ > > arch/arm64/include/asm/kvm_host.h| 3 ++- > > arch/arm64/include/uapi/asm/kvm.h| 1 + > > arch/arm64/kvm/guest.c | 17 - > > arch/arm64/kvm/hyp/sysreg-sr.c | 2 ++ > > include/uapi/linux/kvm.h | 1 + > > 7 files changed, 33 insertions(+), 4 deletions(-) > > > > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > > index 1167678..bdceb19 100644 > > --- a/arch/arm/kvm/arm.c > > +++ b/arch/arm/kvm/arm.c > > @@ -207,6 +207,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long > > ext) > > case KVM_CAP_ARM_PSCI_0_2: > > case KVM_CAP_READONLY_MEM: > > case KVM_CAP_MP_STATE: > > + case KVM_CAP_ARM_CROSS_VCPU: > > r = 1; > > break; > > case KVM_CAP_COALESCED_MMIO: > > @@ -809,8 +810,9 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, > > { > > unsigned int i; > > int phys_target = kvm_target_cpu(); > > + bool cross_vcpu = kvm_vcpu_has_feature_cross_cpu(init); > > > > - if (init->target != phys_target) > > + if (!cross_vcpu && init->target != phys_target) > > return -EINVAL; > > I'm not sure we need the vcpu feature bit. I think qemu should be > allowed to try any target (if using -cpu host it will try the > kvm preferred target). 
kvm should check that the input target is > a known target and that it is compatible with the phys_target, > otherwise -EINVAL. > I agree. I think we just need to advertise the capability to user space instead. Thanks, -Christoffer
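The check described above — accept any *known* target that is compatible with the physical CPU, instead of requiring an exact match — can be sketched as follows. The target IDs and the compatibility table are invented for illustration; they are not the kernel's:

```c
#include <assert.h>

#define EINVAL 22

/* Hypothetical vCPU target IDs, for illustration only. */
enum {
    TARGET_GENERIC_V8 = 0,
    TARGET_CPU_A      = 1,
    TARGET_CPU_B      = 2,
    NR_TARGETS        = 3,
};

/* compat[requested][physical]: nonzero when the requested vCPU type can
 * run on that physical CPU. A generic target is assumed to run anywhere. */
static const int compat[NR_TARGETS][NR_TARGETS] = {
    [TARGET_GENERIC_V8] = { 1, 1, 1 },
    [TARGET_CPU_A]      = { 0, 1, 0 },
    [TARGET_CPU_B]      = { 0, 0, 1 },
};

/* Mirrors the suggested kvm_vcpu_set_target() policy: unknown target or
 * an incompatible target/CPU pair both yield -EINVAL. */
static int check_target(int requested, int phys)
{
    if (requested < 0 || requested >= NR_TARGETS)   /* unknown target */
        return -EINVAL;
    if (!compat[requested][phys])                   /* incompatible */
        return -EINVAL;
    return 0;
}
```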
Re: [PATCH RFC 4/7] ARM64: KVM: emulate accessing ID registers
On Mon, Jan 16, 2017 at 05:33:31PM +0800, Shannon Zhao wrote: > From: Shannon Zhao Please provide a commit message. Thanks, -Christoffer
Re: [PATCH RFC 2/7] ARM64: KVM: Add reset handlers for all ID registers
On Mon, Jan 16, 2017 at 05:33:29PM +0800, Shannon Zhao wrote: > From: Shannon Zhao > > Move invariant_sys_regs before emulate_sys_reg so that it can be used > later. > > Signed-off-by: Shannon Zhao > --- > arch/arm64/kvm/sys_regs.c | 193 > -- > 1 file changed, 116 insertions(+), 77 deletions(-) > > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c > index 87e7e66..bf71eb4 100644 > --- a/arch/arm64/kvm/sys_regs.c > +++ b/arch/arm64/kvm/sys_regs.c > @@ -1432,6 +1432,122 @@ static const struct sys_reg_desc cp15_64_regs[] = { > { Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 }, > }; > > +/* > + * These are the invariant sys_reg registers: we let the guest see the > + * host versions of these, so they're part of the guest state. > + * > + * A future CPU may provide a mechanism to present different values to > + * the guest, or a future kvm may trap them. > + */ > + > +#define FUNCTION_INVARIANT(reg) > \ > + static void get_##reg(struct kvm_vcpu *v, \ > + const struct sys_reg_desc *r) \ > + { \ > + ((struct sys_reg_desc *)r)->val = read_sysreg(reg); \ > + } > + > +FUNCTION_INVARIANT(midr_el1) > +FUNCTION_INVARIANT(ctr_el0) > +FUNCTION_INVARIANT(revidr_el1) > +FUNCTION_INVARIANT(id_pfr0_el1) > +FUNCTION_INVARIANT(id_pfr1_el1) > +FUNCTION_INVARIANT(id_dfr0_el1) > +FUNCTION_INVARIANT(id_afr0_el1) > +FUNCTION_INVARIANT(id_mmfr0_el1) > +FUNCTION_INVARIANT(id_mmfr1_el1) > +FUNCTION_INVARIANT(id_mmfr2_el1) > +FUNCTION_INVARIANT(id_mmfr3_el1) > +FUNCTION_INVARIANT(id_isar0_el1) > +FUNCTION_INVARIANT(id_isar1_el1) > +FUNCTION_INVARIANT(id_isar2_el1) > +FUNCTION_INVARIANT(id_isar3_el1) > +FUNCTION_INVARIANT(id_isar4_el1) > +FUNCTION_INVARIANT(id_isar5_el1) > +FUNCTION_INVARIANT(mvfr0_el1) > +FUNCTION_INVARIANT(mvfr1_el1) > +FUNCTION_INVARIANT(mvfr2_el1) > +FUNCTION_INVARIANT(id_aa64pfr0_el1) > +FUNCTION_INVARIANT(id_aa64pfr1_el1) > +FUNCTION_INVARIANT(id_aa64dfr0_el1) > +FUNCTION_INVARIANT(id_aa64dfr1_el1) > +FUNCTION_INVARIANT(id_aa64afr0_el1)
> +FUNCTION_INVARIANT(id_aa64afr1_el1) > +FUNCTION_INVARIANT(id_aa64isar0_el1) > +FUNCTION_INVARIANT(id_aa64isar1_el1) > +FUNCTION_INVARIANT(id_aa64mmfr0_el1) > +FUNCTION_INVARIANT(id_aa64mmfr1_el1) > +FUNCTION_INVARIANT(clidr_el1) > +FUNCTION_INVARIANT(aidr_el1) > + > +/* ->val is filled in by kvm_sys_reg_table_init() */ > +static struct sys_reg_desc invariant_sys_regs[] = { > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0000), Op2(0b000), > + NULL, get_midr_el1, MIDR_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0000), Op2(0b110), > + NULL, get_revidr_el1, REVIDR_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b000), > + NULL, get_id_pfr0_el1, ID_PFR0_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b001), > + NULL, get_id_pfr1_el1, ID_PFR1_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b010), > + NULL, get_id_dfr0_el1, ID_DFR0_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b011), > + NULL, get_id_afr0_el1, ID_AFR0_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b100), > + NULL, get_id_mmfr0_el1, ID_MMFR0_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b101), > + NULL, get_id_mmfr1_el1, ID_MMFR1_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b110), > + NULL, get_id_mmfr2_el1, ID_MMFR2_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0001), Op2(0b111), > + NULL, get_id_mmfr3_el1, ID_MMFR3_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b000), > + NULL, get_id_isar0_el1, ID_ISAR0_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b001), > + NULL, get_id_isar1_el1, ID_ISAR1_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b010), > + NULL, get_id_isar2_el1, ID_ISAR2_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b011), > + NULL, get_id_isar3_el1, ID_ISAR3_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0010), Op2(0b100), > + NULL, get_id_isar4_el1, ID_ISAR4_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), 
CRm(0b0010), Op2(0b101), > + NULL, get_id_isar5_el1, ID_ISAR5_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0011), Op2(0b000), > + NULL, get_mvfr0_el1, MVFR0_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0011), Op2(0b001), > + NULL, get_mvfr1_el1, MVFR1_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000), CRm(0b0011), Op2(0b010), > + NULL, get_mvfr2_el1, MVFR2_EL1 }, > + { Op0(0b11), Op1(0b000), CRn(0b0000),
[PATCH 10/10] KVM: arm64: Increase number of user memslots to 512
From: Linu Cherian Having only 32 memslots is a real constraint on the maximum number of PCI devices that can be assigned to a single guest. Assuming each PCI device/virtual function has two memory BAR regions, we could assign only 15 devices/virtual functions to a guest. Hence increase KVM_USER_MEM_SLOTS to 512, as done in other archs like powerpc. Reviewed-by: Christoffer Dall Signed-off-by: Linu Cherian Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_host.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 6ac17ee887c9..e7705e7bb07b 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -30,7 +30,7 @@ #define __KVM_HAVE_ARCH_INTC_INITIALIZED -#define KVM_USER_MEM_SLOTS 32 +#define KVM_USER_MEM_SLOTS 512 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1 #define KVM_HALT_POLL_NS_DEFAULT 500000 -- 2.11.0
[PATCH 07/10] KVM: Add documentation for KVM_CAP_NR_MEMSLOTS
From: Linu Cherian Add documentation for the KVM_CAP_NR_MEMSLOTS capability. Reviewed-by: Christoffer Dall Signed-off-by: Linu Cherian Signed-off-by: Marc Zyngier --- Documentation/virtual/kvm/api.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 069450938b79..3c248f772ae6 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -951,6 +951,10 @@ This ioctl allows the user to create or modify a guest physical memory slot. When changing an existing slot, it may be moved in the guest physical memory space, or its flags may be modified. It may not be resized. Slots may not overlap in guest physical address space. +Bits 0-15 of "slot" specifies the slot id and this value should be +less than the maximum number of user memory slots supported per VM. +The maximum allowed slots can be queried using KVM_CAP_NR_MEMSLOTS, +if this capability is supported by the architecture. If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot" specifies the address space which is being modified. They must be -- 2.11.0
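The split the documentation describes — bits 0-15 of "slot" carry the slot id, bits 16-31 the address space id — can be illustrated with plain bit manipulation (the helper names are ours, not KVM's):

```c
#include <assert.h>
#include <stdint.h>

/* Encode/decode the "slot" field of struct kvm_userspace_memory_region:
 * bits 0-15 hold the slot id, bits 16-31 the address space id (when
 * KVM_CAP_MULTI_ADDRESS_SPACE is available). */
static inline uint32_t make_slot(uint16_t as_id, uint16_t slot_id)
{
    return ((uint32_t)as_id << 16) | slot_id;
}

static inline uint16_t slot_id_of(uint32_t slot)
{
    return (uint16_t)(slot & 0xffffu);   /* bits 0-15 */
}

static inline uint16_t as_id_of(uint32_t slot)
{
    return (uint16_t)(slot >> 16);       /* bits 16-31 */
}
```

A well-behaved userspace would additionally query KVM_CHECK_EXTENSION with KVM_CAP_NR_MEMSLOTS and refuse slot ids at or above the returned value.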
[PATCH 08/10] KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64
From: Linu Cherian Return KVM_USER_MEM_SLOTS for a userspace capability query on NR_MEMSLOTS. Reviewed-by: Christoffer Dall Signed-off-by: Linu Cherian Signed-off-by: Marc Zyngier --- arch/arm/kvm/arm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index c9a2103faeb9..96dba7cd8be7 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -221,6 +221,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_MAX_VCPUS: r = KVM_MAX_VCPUS; break; + case KVM_CAP_NR_MEMSLOTS: + r = KVM_USER_MEM_SLOTS; + break; case KVM_CAP_MSI_DEVID: if (!kvm) r = -EINVAL; -- 2.11.0
[PATCH 09/10] KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused
From: Linu Cherian The arm/arm64 architecture doesn't use private memslots, hence remove the KVM_PRIVATE_MEM_SLOTS macro definition. Reviewed-by: Christoffer Dall Signed-off-by: Linu Cherian Signed-off-by: Marc Zyngier --- arch/arm/include/asm/kvm_host.h | 1 - arch/arm64/include/asm/kvm_host.h | 1 - 2 files changed, 2 deletions(-) diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index cc495d799c67..31ee468ce667 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -30,7 +30,6 @@ #define __KVM_HAVE_ARCH_INTC_INITIALIZED #define KVM_USER_MEM_SLOTS 32 -#define KVM_PRIVATE_MEM_SLOTS 4 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1 #define KVM_HAVE_ONE_REG #define KVM_HALT_POLL_NS_DEFAULT 500000 diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index f21fd3894370..6ac17ee887c9 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -31,7 +31,6 @@ #define __KVM_HAVE_ARCH_INTC_INITIALIZED #define KVM_USER_MEM_SLOTS 32 -#define KVM_PRIVATE_MEM_SLOTS 4 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1 #define KVM_HALT_POLL_NS_DEFAULT 500000 -- 2.11.0
[PATCH 06/10] KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled
From: Andre Przywara The ITS spec says that ITS commands are only processed when the ITS is enabled (section 8.19.4, Enabled, bit[0]). Our emulation was not taking this into account. Fix this by checking the enabled state before handling CWRITER writes. On the other hand that means that CWRITER could advance while the ITS is disabled, and enabling it would need those commands to be processed. Fix this case as well by refactoring the actual command processing and calling it from both the GITS_CWRITER and GITS_CTLR handlers. Reviewed-by: Eric Auger Reviewed-by: Christoffer Dall Signed-off-by: Andre Przywara Signed-off-by: Marc Zyngier --- virt/kvm/arm/vgic/vgic-its.c | 109 ++- 1 file changed, 65 insertions(+), 44 deletions(-) diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c index 571b64a01c50..8d1da1af4b09 100644 --- a/virt/kvm/arm/vgic/vgic-its.c +++ b/virt/kvm/arm/vgic/vgic-its.c @@ -360,29 +360,6 @@ static int its_sync_lpi_pending_table(struct kvm_vcpu *vcpu) return ret; } -static unsigned long vgic_mmio_read_its_ctlr(struct kvm *vcpu, -struct vgic_its *its, -gpa_t addr, unsigned int len) -{ - u32 reg = 0; - - mutex_lock(&its->cmd_lock); - if (its->creadr == its->cwriter) - reg |= GITS_CTLR_QUIESCENT; - if (its->enabled) - reg |= GITS_CTLR_ENABLE; - mutex_unlock(&its->cmd_lock); - - return reg; -} - -static void vgic_mmio_write_its_ctlr(struct kvm *kvm, struct vgic_its *its, -gpa_t addr, unsigned int len, -unsigned long val) -{ - its->enabled = !!(val & GITS_CTLR_ENABLE); -} - static unsigned long vgic_mmio_read_its_typer(struct kvm *kvm, struct vgic_its *its, gpa_t addr, unsigned int len) @@ -1161,33 +1138,16 @@ static void vgic_mmio_write_its_cbaser(struct kvm *kvm, struct vgic_its *its, #define ITS_CMD_SIZE 32 #define ITS_CMD_OFFSET(reg)((reg) & GENMASK(19, 5)) -/* - * By writing to CWRITER the guest announces new commands to be processed. 
- * To avoid any races in the first place, we take the its_cmd lock, which - * protects our ring buffer variables, so that there is only one user - * per ITS handling commands at a given time. - */ -static void vgic_mmio_write_its_cwriter(struct kvm *kvm, struct vgic_its *its, - gpa_t addr, unsigned int len, - unsigned long val) +/* Must be called with the cmd_lock held. */ +static void vgic_its_process_commands(struct kvm *kvm, struct vgic_its *its) { gpa_t cbaser; u64 cmd_buf[4]; - u32 reg; - if (!its) - return; - - mutex_lock(&its->cmd_lock); - - reg = update_64bit_reg(its->cwriter, addr & 7, len, val); - reg = ITS_CMD_OFFSET(reg); - if (reg >= ITS_CMD_BUFFER_SIZE(its->cbaser)) { - mutex_unlock(&its->cmd_lock); + /* Commands are only processed when the ITS is enabled. */ + if (!its->enabled) return; - } - its->cwriter = reg; cbaser = CBASER_ADDRESS(its->cbaser); while (its->cwriter != its->creadr) { @@ -1207,6 +1167,34 @@ static void vgic_mmio_write_its_cwriter(struct kvm *kvm, struct vgic_its *its, if (its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser)) its->creadr = 0; } +} + +/* + * By writing to CWRITER the guest announces new commands to be processed. + * To avoid any races in the first place, we take the its_cmd lock, which + * protects our ring buffer variables, so that there is only one user + * per ITS handling commands at a given time. 
+ */ +static void vgic_mmio_write_its_cwriter(struct kvm *kvm, struct vgic_its *its, + gpa_t addr, unsigned int len, + unsigned long val) +{ + u64 reg; + + if (!its) + return; + + mutex_lock(&its->cmd_lock); + + reg = update_64bit_reg(its->cwriter, addr & 7, len, val); + reg = ITS_CMD_OFFSET(reg); + if (reg >= ITS_CMD_BUFFER_SIZE(its->cbaser)) { + mutex_unlock(&its->cmd_lock); + return; + } + its->cwriter = reg; + + vgic_its_process_commands(kvm, its); mutex_unlock(&its->cmd_lock); } @@ -1287,6 +1275,39 @@ static void vgic_mmio_write_its_baser(struct kvm *kvm, *regptr = reg; } +static unsigned long vgic_mmio_read_its_ctlr(struct kvm *vcpu, +struct vgic_its *its, +gpa_t addr,
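The refactoring above boils down to: all command processing lives in one function that is a no-op while the ITS is disabled and otherwise drains the ring from creadr to cwriter, and both the CWRITER write handler and the CTLR enable path call it under the same lock. A user-space model of that control flow (the structure and names are simplified stand-ins, not the kernel's):

```c
#include <assert.h>

/* Simplified model of the ITS command queue: commands are consumed only
 * while 'enabled' is set; otherwise cwriter may run ahead of creadr and
 * the backlog is drained later, when the ITS is enabled. */
struct its_model {
    int enabled;
    unsigned int creadr;    /* next command to consume */
    unsigned int cwriter;   /* one past the last command queued */
    unsigned int qsize;     /* ring size, in commands */
    unsigned int processed; /* commands handled so far */
};

/* Must be called with the (elided here) command lock held. */
static void process_commands(struct its_model *its)
{
    if (!its->enabled)      /* spec 8.19.4: only process when enabled */
        return;
    while (its->creadr != its->cwriter) {
        its->processed++;   /* stand-in for handling one command */
        its->creadr = (its->creadr + 1) % its->qsize;
    }
}

static void write_cwriter(struct its_model *its, unsigned int reg)
{
    if (reg >= its->qsize)  /* out-of-range offset: ignore the write */
        return;
    its->cwriter = reg;
    process_commands(its);  /* no-op while the ITS is disabled */
}

static void write_ctlr_enable(struct its_model *its)
{
    its->enabled = 1;
    process_commands(its);  /* drain any backlog queued while disabled */
}
```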
[PATCH 03/10] KVM: arm/arm64: Let vcpu thread modify its own active state
From: Jintack Lim Currently, if a vcpu thread tries to change the active state of an interrupt which is already on the same vcpu's AP list, it will loop forever. Since the VGIC mmio handler is called after a vcpu has already synced back the LR state to the struct vgic_irq, we can just let it proceed safely. Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier Signed-off-by: Jintack Lim Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- virt/kvm/arm/vgic/vgic-mmio.c | 32 1 file changed, 24 insertions(+), 8 deletions(-) diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c index 3654b4c835ef..2a5db1352722 100644 --- a/virt/kvm/arm/vgic/vgic-mmio.c +++ b/virt/kvm/arm/vgic/vgic-mmio.c @@ -180,21 +180,37 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu, static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq, bool new_active_state) { + struct kvm_vcpu *requester_vcpu; spin_lock(&irq->irq_lock); + + /* +* The vcpu parameter here can mean multiple things depending on how +* this function is called; when handling a trap from the kernel it +* depends on the GIC version, and these functions are also called as +* part of save/restore from userspace. +* +* Therefore, we have to figure out the requester in a reliable way. +* +* When accessing VGIC state from user space, the requester_vcpu is +* NULL, which is fine, because we guarantee that no VCPUs are running +* when accessing VGIC state from user space so irq->vcpu->cpu is +* always -1. +*/ + requester_vcpu = kvm_arm_get_running_vcpu(); + /* * If this virtual IRQ was written into a list register, we * have to make sure the CPU that runs the VCPU thread has -* synced back LR state to the struct vgic_irq. We can only -* know this for sure, when either this irq is not assigned to -* anyone's AP list anymore, or the VCPU thread is not -* running on any CPUs. +* synced back the LR state to the struct vgic_irq. 
* -* In the opposite case, we know the VCPU thread may be on its -* way back from the guest and still has to sync back this -* IRQ, so we release and re-acquire the spin_lock to let the -* other thread sync back the IRQ. +* As long as the conditions below are true, we know the VCPU thread +* may be on its way back from the guest (we kicked the VCPU thread in +* vgic_change_active_prepare) and still has to sync back this IRQ, +* so we release and re-acquire the spin_lock to let the other thread +* sync back the IRQ. */ while (irq->vcpu && /* IRQ may have state in an LR somewhere */ + irq->vcpu != requester_vcpu && /* Current thread is not the VCPU thread */ irq->vcpu->cpu != -1) /* VCPU thread is running */ cond_resched_lock(&irq->irq_lock); -- 2.11.0
[PATCH 05/10] arm64: KVM: Survive unknown traps from guests
From: Mark Rutland Currently we BUG() if we see an ESR_EL2.EC value we don't recognise. As configurable disables/enables are added to the architecture (controlled by RES1/RES0 bits respectively), with associated synchronous exceptions, it may be possible for a guest to trigger exceptions with classes that we don't recognise. While we can't service these exceptions in a manner useful to the guest, we can avoid bringing down the host. Per ARM DDI 0487A.k_iss10775, page D7-1937, EC values within the range 0x00 - 0x2c are reserved for future use with synchronous exceptions, and EC values within the range 0x2d - 0x3f may be used for either synchronous or asynchronous exceptions. The patch makes KVM handle any unknown EC by injecting an UNDEFINED exception into the guest, with a corresponding (ratelimited) warning in the host dmesg. We could later improve on this with a new (opt-in) exit to the host userspace. Cc: Dave Martin Cc: Suzuki K Poulose Reviewed-by: Christoffer Dall Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier --- arch/arm64/kvm/handle_exit.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 1bfe30dfbfe7..fa1b18e364fc 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -135,7 +135,19 @@ static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu, struct kvm_run *run) return ret; } +static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + u32 hsr = kvm_vcpu_get_hsr(vcpu); + + kvm_pr_unimpl("Unknown exception class: hsr: %#08x -- %s\n", + hsr, esr_get_class_string(hsr)); + + kvm_inject_undefined(vcpu); + return 1; +} + static exit_handle_fn arm_exit_handlers[] = { + [0 ... 
ESR_ELx_EC_MAX] = kvm_handle_unknown_ec, [ESR_ELx_EC_WFx]= kvm_handle_wfx, [ESR_ELx_EC_CP15_32]= kvm_handle_cp15_32, [ESR_ELx_EC_CP15_64]= kvm_handle_cp15_64, @@ -162,13 +174,6 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu) u32 hsr = kvm_vcpu_get_hsr(vcpu); u8 hsr_ec = ESR_ELx_EC(hsr); - if (hsr_ec >= ARRAY_SIZE(arm_exit_handlers) || - !arm_exit_handlers[hsr_ec]) { - kvm_err("Unknown exception class: hsr: %#08x -- %s\n", - hsr, esr_get_class_string(hsr)); - BUG(); - } - return arm_exit_handlers[hsr_ec]; } -- 2.11.0
[PATCH 04/10] arm: KVM: Survive unknown traps from guests
From: Mark Rutland Currently we BUG() if we see a HSR.EC value we don't recognise. As configurable disables/enables are added to the architecture (controlled by RES1/RES0 bits respectively), with associated synchronous exceptions, it may be possible for a guest to trigger exceptions with classes that we don't recognise. While we can't service these exceptions in a manner useful to the guest, we can avoid bringing down the host. Per ARM DDI 0406C.c, all currently unallocated HSR EC encodings are reserved, and per ARM DDI 0487A.k_iss10775, page G6-4395, EC values within the range 0x00 - 0x2c are reserved for future use with synchronous exceptions, and EC values within the range 0x2d - 0x3f may be used for either synchronous or asynchronous exceptions. The patch makes KVM handle any unknown EC by injecting an UNDEFINED exception into the guest, with a corresponding (ratelimited) warning in the host dmesg. We could later improve on this with a new (opt-in) exit to the host userspace. Cc: Dave Martin Cc: Suzuki K Poulose Reviewed-by: Christoffer Dall Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier --- arch/arm/include/asm/kvm_arm.h | 1 + arch/arm/kvm/handle_exit.c | 19 --- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h index e22089fb44dc..a3f0b3d50089 100644 --- a/arch/arm/include/asm/kvm_arm.h +++ b/arch/arm/include/asm/kvm_arm.h @@ -209,6 +209,7 @@ #define HSR_EC_IABT_HYP(0x21) #define HSR_EC_DABT(0x24) #define HSR_EC_DABT_HYP(0x25) +#define HSR_EC_MAX (0x3f) #define HSR_WFI_IS_WFE (_AC(1, UL) << 0) diff --git a/arch/arm/kvm/handle_exit.c b/arch/arm/kvm/handle_exit.c index 4e40d1955e35..96af65a30d78 100644 --- a/arch/arm/kvm/handle_exit.c +++ b/arch/arm/kvm/handle_exit.c @@ -79,7 +79,19 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run) return 1; } +static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + u32 hsr = 
kvm_vcpu_get_hsr(vcpu); + + kvm_pr_unimpl("Unknown exception class: hsr: %#08x\n", + hsr); + + kvm_inject_undefined(vcpu); + return 1; +} + static exit_handle_fn arm_exit_handlers[] = { + [0 ... HSR_EC_MAX] = kvm_handle_unknown_ec, [HSR_EC_WFI]= kvm_handle_wfx, [HSR_EC_CP15_32]= kvm_handle_cp15_32, [HSR_EC_CP15_64]= kvm_handle_cp15_64, @@ -98,13 +110,6 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu) { u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu); - if (hsr_ec >= ARRAY_SIZE(arm_exit_handlers) || - !arm_exit_handlers[hsr_ec]) { - kvm_err("Unknown exception class: hsr: %#08x\n", - (unsigned int)kvm_vcpu_get_hsr(vcpu)); - BUG(); - } - return arm_exit_handlers[hsr_ec]; } -- 2.11.0
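Both patches rely on the same idiom: the GCC/Clang range designated initializer `[first ... last] = value` fills every table slot with the fallback handler first, and the named entries then override individual slots, so a lookup can never hit a NULL or out-of-range entry. A standalone illustration (handler names and return values invented; note the `...` range designator is a GNU C extension, not ISO C):

```c
#include <assert.h>

/* Every slot defaults to the fallback, then specific entries override
 * it. This mirrors the arm_exit_handlers[] idiom in the patches. */
#define EC_MAX 0x3f

typedef int (*handler_fn)(void);

static int handle_unknown(void) { return -1; }  /* fallback handler */
static int handle_wfx(void)     { return  1; }  /* a "known" class */

static handler_fn handlers[] = {
    [0 ... EC_MAX] = handle_unknown,   /* GNU range designator */
    [0x01]         = handle_wfx,       /* override one known slot */
};

static handler_fn get_handler(unsigned int ec)
{
    /* No bounds/NULL check needed: the table is fully populated for
     * every EC value up to EC_MAX. */
    return handlers[ec];
}
```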
[PATCH 02/10] KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypass
Our GICv3 emulation always presents ICC_SRE_EL1 with DIB/DFB set to zero, which implies that there is a way to bypass the GIC and inject raw IRQ/FIQ by driving the CPU pins. Of course, we don't allow that when the GIC is configured, but we fail to indicate that to the guest. The obvious fix is to set these bits (and never let them be changed again). Reported-by: Peter Maydell Acked-by: Christoffer Dall Reviewed-by: Eric Auger Signed-off-by: Marc Zyngier --- include/linux/irqchip/arm-gic-v3.h | 2 ++ virt/kvm/arm/vgic/vgic-v3.c| 5 - 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h index 672cfef72fc8..97cbca19430d 100644 --- a/include/linux/irqchip/arm-gic-v3.h +++ b/include/linux/irqchip/arm-gic-v3.h @@ -373,6 +373,8 @@ #define ICC_IGRPEN0_EL1_MASK (1 << ICC_IGRPEN0_EL1_SHIFT) #define ICC_IGRPEN1_EL1_SHIFT 0 #define ICC_IGRPEN1_EL1_MASK (1 << ICC_IGRPEN1_EL1_SHIFT) +#define ICC_SRE_EL1_DIB(1U << 2) +#define ICC_SRE_EL1_DFB(1U << 1) #define ICC_SRE_EL1_SRE(1U << 0) /* diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c index edc6ee2dc852..be0f4c3e0142 100644 --- a/virt/kvm/arm/vgic/vgic-v3.c +++ b/virt/kvm/arm/vgic/vgic-v3.c @@ -229,10 +229,13 @@ void vgic_v3_enable(struct kvm_vcpu *vcpu) /* * If we are emulating a GICv3, we do it in an non-GICv2-compatible * way, so we force SRE to 1 to demonstrate this to the guest. +* Also, we don't support any form of IRQ/FIQ bypass. * This goes with the spec allowing the value to be RAO/WI. */ if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) { - vgic_v3->vgic_sre = ICC_SRE_EL1_SRE; + vgic_v3->vgic_sre = (ICC_SRE_EL1_DIB | +ICC_SRE_EL1_DFB | +ICC_SRE_EL1_SRE); vcpu->arch.vgic_cpu.pendbaser = INITIAL_PENDBASER_VALUE; } else { vgic_v3->vgic_sre = 0; -- 2.11.0
[PATCH 00/10] KVM/ARM fixes for 4.11-rc2
Paolo, Radim, Here's the KVM/ARM updates for 4.11-rc2. The usual bag of vgic updates, making the memslot array large enough to handle guests with tons of devices assigned to them, a tidying up of exception handling, and a rather annoying TLB handling issue on VHE systems. Please pull, Thanks, M. The following changes since commit c1ae3cfa0e89fa1a7ecc4c99031f5e9ae99d9201: Linux 4.11-rc1 (2017-03-05 12:59:56 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-4.11-rc2 for you to fetch changes up to 955a3fc6d2a1c11d6d00bce4f3816100ce0530cf: KVM: arm64: Increase number of user memslots to 512 (2017-03-09 09:13:50 +0000) KVM/ARM updates for v4.11-rc2 vgic updates: - Honour disabling the ITS - Don't deadlock when deactivating own interrupts via MMIO - Correctly expose the lack of IRQ/FIQ bypass on GICv3 I/O virtualization: - Make KVM_CAP_NR_MEMSLOTS big enough for large guests with many PCIe devices General bug fixes: - Gracefully handle exceptions generated with syndromes that the host doesn't understand - Properly invalidate TLBs on VHE systems Andre Przywara (1): KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled Jintack Lim (1): KVM: arm/arm64: Let vcpu thread modify its own active state Linu Cherian (4): KVM: Add documentation for KVM_CAP_NR_MEMSLOTS KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64 KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused KVM: arm64: Increase number of user memslots to 512 Marc Zyngier (2): arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypass Mark Rutland (2): arm: KVM: Survive unknown traps from guests arm64: KVM: Survive unknown traps from guests Documentation/virtual/kvm/api.txt | 4 ++ arch/arm/include/asm/kvm_arm.h | 1 + arch/arm/include/asm/kvm_host.h| 1 - arch/arm/kvm/arm.c | 3 + arch/arm/kvm/handle_exit.c | 19 --- arch/arm64/include/asm/kvm_host.h | 
3 +- arch/arm64/kvm/handle_exit.c | 19 --- arch/arm64/kvm/hyp/tlb.c | 64 +++--- include/linux/irqchip/arm-gic-v3.h | 2 + virt/kvm/arm/vgic/vgic-its.c | 109 ++--- virt/kvm/arm/vgic/vgic-mmio.c | 32 --- virt/kvm/arm/vgic/vgic-v3.c| 5 +- 12 files changed, 183 insertions(+), 79 deletions(-)
[PATCH 01/10] arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs
When invalidating guest TLBs, special care must be taken to actually shoot the guest TLBs and not the host ones if we're running on a VHE system. This is controlled by the HCR_EL2.TGE bit, which we forget to clear before invalidating TLBs. Address the issue by introducing two wrappers (__tlb_switch_to_guest and __tlb_switch_to_host) that take care of both the VTTBR_EL2 and HCR_EL2.TGE switching. Reported-by: Tomasz Nowicki Tested-by: Tomasz Nowicki Reviewed-by: Christoffer Dall Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier --- arch/arm64/kvm/hyp/tlb.c | 64 +--- 1 file changed, 55 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c index e8e7ba2bc11f..9e1d2b75eecd 100644 --- a/arch/arm64/kvm/hyp/tlb.c +++ b/arch/arm64/kvm/hyp/tlb.c @@ -18,14 +18,62 @@ #include <asm/kvm_hyp.h> #include <asm/tlbflush.h> +static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm) +{ + u64 val; + + /* +* With VHE enabled, we have HCR_EL2.{E2H,TGE} = {1,1}, and +* most TLB operations target EL2/EL0. In order to affect the +* guest TLBs (EL1/EL0), we need to change one of these two +* bits. Changing E2H is impossible (goodbye TTBR1_EL2), so +* let's flip TGE before executing the TLB operation. +*/ + write_sysreg(kvm->arch.vttbr, vttbr_el2); + val = read_sysreg(hcr_el2); + val &= ~HCR_TGE; + write_sysreg(val, hcr_el2); + isb(); +} + +static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm) +{ + write_sysreg(kvm->arch.vttbr, vttbr_el2); + isb(); +} + +static hyp_alternate_select(__tlb_switch_to_guest, + __tlb_switch_to_guest_nvhe, + __tlb_switch_to_guest_vhe, + ARM64_HAS_VIRT_HOST_EXTN); + +static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm) +{ + /* +* We're done with the TLB operation, let's restore the host's +* view of HCR_EL2. 
+*/ + write_sysreg(0, vttbr_el2); + write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); +} + +static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm) +{ + write_sysreg(0, vttbr_el2); +} + +static hyp_alternate_select(__tlb_switch_to_host, + __tlb_switch_to_host_nvhe, + __tlb_switch_to_host_vhe, + ARM64_HAS_VIRT_HOST_EXTN); + void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) { dsb(ishst); /* Switch to requested VMID */ kvm = kern_hyp_va(kvm); - write_sysreg(kvm->arch.vttbr, vttbr_el2); - isb(); + __tlb_switch_to_guest()(kvm); /* * We could do so much better if we had the VA as well. @@ -46,7 +94,7 @@ void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) dsb(ish); isb(); - write_sysreg(0, vttbr_el2); + __tlb_switch_to_host()(kvm); } void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm) @@ -55,14 +103,13 @@ void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm) /* Switch to requested VMID */ kvm = kern_hyp_va(kvm); - write_sysreg(kvm->arch.vttbr, vttbr_el2); - isb(); + __tlb_switch_to_guest()(kvm); __tlbi(vmalls12e1is); dsb(ish); isb(); - write_sysreg(0, vttbr_el2); + __tlb_switch_to_host()(kvm); } void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu) @@ -70,14 +117,13 @@ void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu) struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm); /* Switch to requested VMID */ - write_sysreg(kvm->arch.vttbr, vttbr_el2); - isb(); + __tlb_switch_to_guest()(kvm); __tlbi(vmalle1); dsb(nsh); isb(); - write_sysreg(0, vttbr_el2); + __tlb_switch_to_host()(kvm); } void __hyp_text __kvm_flush_vm_context(void) -- 2.11.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH V12 09/10] trace, ras: add ARM processor error trace event
Hi Tyler Baicar, On 2017/3/7 4:45, Tyler Baicar wrote: > Currently there are trace events for the various RAS > errors with the exception of ARM processor type errors. > Add a new trace event for such errors so that the user > will know when they occur. These trace events are > consistent with the ARM processor error section type > defined in UEFI 2.6 spec section N.2.4.4. > > Signed-off-by: Tyler Baicar> Acked-by: Steven Rostedt > --- > drivers/acpi/apei/ghes.c| 8 +++- > drivers/firmware/efi/cper.c | 1 + > drivers/ras/ras.c | 1 + > include/ras/ras_event.h | 34 ++ > 4 files changed, 43 insertions(+), 1 deletion(-) > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c > index 842c0cc..81d7b79 100644 > --- a/drivers/acpi/apei/ghes.c > +++ b/drivers/acpi/apei/ghes.c > @@ -514,7 +514,13 @@ static void ghes_do_proc(struct ghes *ghes, > } > #endif > #ifdef CONFIG_RAS > - else if (trace_unknown_sec_event_enabled()) { > + else if (!uuid_le_cmp(sec_type, CPER_SEC_PROC_ARM) && > + trace_arm_event_enabled()) { > + struct cper_sec_proc_arm *arm_err; > + > + arm_err = acpi_hest_generic_data_payload(gdata); > + trace_arm_event(arm_err); > + } else if (trace_unknown_sec_event_enabled()) { > void *unknown_err = > acpi_hest_generic_data_payload(gdata); > trace_unknown_sec_event(_type, > fru_id, fru_text, sec_sev, > diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c > index 545a6c2..e9fb56a 100644 > --- a/drivers/firmware/efi/cper.c > +++ b/drivers/firmware/efi/cper.c > @@ -35,6 +35,7 @@ > #include > #include > #include > +#include > > #define INDENT_SP" " > > diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c > index fb2500b..8ba5a94 100644 > --- a/drivers/ras/ras.c > +++ b/drivers/ras/ras.c > @@ -28,3 +28,4 @@ static int __init ras_init(void) > #endif > EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event); > EXPORT_TRACEPOINT_SYMBOL_GPL(unknown_sec_event); > +EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event); > diff --git a/include/ras/ras_event.h 
b/include/ras/ras_event.h > index 5861b6f..b36db48 100644 > --- a/include/ras/ras_event.h > +++ b/include/ras/ras_event.h > @@ -162,6 +162,40 @@ > ); > > /* > + * ARM Processor Events Report > + * > + * This event is generated when hardware detects an ARM processor error > + * has occurred. UEFI 2.6 spec section N.2.4.4. > + */ > +TRACE_EVENT(arm_event, > + > + TP_PROTO(const struct cper_sec_proc_arm *proc), > + > + TP_ARGS(proc), > + > + TP_STRUCT__entry( > + __field(u64, mpidr) > + __field(u64, midr) > + __field(u32, running_state) > + __field(u32, psci_state) > + __field(u8, affinity) > + ), > + > + TP_fast_assign( > + __entry->affinity = proc->affinity_level; > + __entry->mpidr = proc->mpidr; > + __entry->midr = proc->midr; > + __entry->running_state = proc->running_state; > + __entry->psci_state = proc->psci_state; > + ), > + > + TP_printk("affinity level: %d; MPIDR: %016llx; MIDR: %016llx; " > + "running state: %d; PSCI state: %d", > + __entry->affinity, __entry->mpidr, __entry->midr, > + __entry->running_state, __entry->psci_state) > +); > + I think these fields are not enough, we need also export arm processor error information (UEFI 2.6 spec section N.2.4.4.1), or at least the error type, address, etc. So that the userspace (such as rasdaemon tool) could know what error occurred. Thanks, Xie XiuQi > +/* > * Unknown Section Report > * > * This event is generated when hardware detected a hardware > ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2] arm64: kvm: Use has_vhe() instead of hyp_alternate_select()
Hi Shanker, On Sun, Mar 05, 2017 at 08:33:18PM -0600, Shanker Donthineni wrote: > Now all the cpu_hwcaps features have their own static keys. We don't > need a separate function hyp_alternate_select() to patch the vhe/nvhe > code. We can achieve the same functionality by using has_vhe(). It > improves the code readability, uses the jump label instructions, and > also compiler generates the better code with a fewer instructions. > > Signed-off-by: Shanker DonthineniI have no objections against this patch as such, but I have a number of more substantial changes which will get rid of most of the hyp_alternate_select later, and since there's no immediate need to merge this patch, and there's the risk that it may slow down some things on certain platforms with older compilers, I'd like to hold off on merging this patch until the next merge window and revisit this issue at that point. Thanks, -Christoffer > --- > v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit > > arch/arm64/kvm/hyp/debug-sr.c | 12 ++ > arch/arm64/kvm/hyp/switch.c| 50 > +++--- > arch/arm64/kvm/hyp/sysreg-sr.c | 23 +-- > 3 files changed, 43 insertions(+), 42 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c > index f5154ed..e5642c2 100644 > --- a/arch/arm64/kvm/hyp/debug-sr.c > +++ b/arch/arm64/kvm/hyp/debug-sr.c > @@ -109,9 +109,13 @@ static void __hyp_text __debug_save_spe_nvhe(u64 > *pmscr_el1) > dsb(nsh); > } > > -static hyp_alternate_select(__debug_save_spe, > - __debug_save_spe_nvhe, __debug_save_spe_vhe, > - ARM64_HAS_VIRT_HOST_EXTN); > +static void __hyp_text __debug_save_spe(u64 *pmscr_el1) > +{ > + if (has_vhe()) > + __debug_save_spe_vhe(pmscr_el1); > + else > + __debug_save_spe_nvhe(pmscr_el1); > +} > > static void __hyp_text __debug_restore_spe(u64 pmscr_el1) > { > @@ -180,7 +184,7 @@ void __hyp_text __debug_cond_save_host_state(struct > kvm_vcpu *vcpu) > > __debug_save_state(vcpu, >arch.host_debug_state.regs, > 
kern_hyp_va(vcpu->arch.host_cpu_context)); > - __debug_save_spe()(>arch.host_debug_state.pmscr_el1); > + __debug_save_spe(>arch.host_debug_state.pmscr_el1); > } > > void __hyp_text __debug_cond_restore_host_state(struct kvm_vcpu *vcpu) > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c > index aede165..c5c77b8 100644 > --- a/arch/arm64/kvm/hyp/switch.c > +++ b/arch/arm64/kvm/hyp/switch.c > @@ -33,13 +33,9 @@ static bool __hyp_text __fpsimd_enabled_vhe(void) > return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN); > } > > -static hyp_alternate_select(__fpsimd_is_enabled, > - __fpsimd_enabled_nvhe, __fpsimd_enabled_vhe, > - ARM64_HAS_VIRT_HOST_EXTN); > - > bool __hyp_text __fpsimd_enabled(void) > { > - return __fpsimd_is_enabled()(); > + return has_vhe() ? __fpsimd_enabled_vhe() : __fpsimd_enabled_nvhe(); > } > > static void __hyp_text __activate_traps_vhe(void) > @@ -63,9 +59,10 @@ static void __hyp_text __activate_traps_nvhe(void) > write_sysreg(val, cptr_el2); > } > > -static hyp_alternate_select(__activate_traps_arch, > - __activate_traps_nvhe, __activate_traps_vhe, > - ARM64_HAS_VIRT_HOST_EXTN); > +static void __hyp_text __activate_traps_arch(void) > +{ > + has_vhe() ? 
__activate_traps_vhe() : __activate_traps_nvhe(); > +} > > static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) > { > @@ -97,7 +94,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu > *vcpu) > write_sysreg(0, pmselr_el0); > write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0); > write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2); > - __activate_traps_arch()(); > + __activate_traps_arch(); > } > > static void __hyp_text __deactivate_traps_vhe(void) > @@ -127,9 +124,10 @@ static void __hyp_text __deactivate_traps_nvhe(void) > write_sysreg(CPTR_EL2_DEFAULT, cptr_el2); > } > > -static hyp_alternate_select(__deactivate_traps_arch, > - __deactivate_traps_nvhe, __deactivate_traps_vhe, > - ARM64_HAS_VIRT_HOST_EXTN); > +static void __hyp_text __deactivate_traps_arch(void) > +{ > + has_vhe() ? __deactivate_traps_vhe() : __deactivate_traps_nvhe(); > +} > > static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) > { > @@ -142,7 +140,7 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu > *vcpu) > if (vcpu->arch.hcr_el2 & HCR_VSE) > vcpu->arch.hcr_el2 = read_sysreg(hcr_el2); > > - __deactivate_traps_arch()(); > + __deactivate_traps_arch(); > write_sysreg(0, hstr_el2); > write_sysreg(0, pmuserenr_el0); > } > @@ -183,20 +181,14 @@ static void
Re: [PATCH v4 0/4] KVM: arm64: Increase number of user memslots
On Thu, Mar 09, 2017 at 08:01:56AM +, Marc Zyngier wrote: > On Wed, Mar 08 2017 at 6:08:31 am GMT, linucher...@gmail.com wrote: > > From: Linu Cherian> > > > v3 -> v4: > > - Add missing commit messages in patches 1 and 3 > > > > v2 -> v3: > > - Added documentation for KVM_CAP_NR_MEMSLOTS > > - Removed KVM_PRIVATE_MEM_SLOTS which is unused > > - KVM_USER_MEM_SLOTS changed to 512 from 508 > > > > v1 -> v2: > > - Enabling KVM_CAP_NR_MEMSLOTS for arm/arm64 moved to separate patch. > > - Updated commit message so that what is reported to userspace is explicit. > > > > > > Linu Cherian (4): > > KVM: Add documentation for KVM_CAP_NR_MEMSLOTS > > KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64 > > KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are > > unused > > KVM: arm64: Increase number of user memslots to 512 > > > > Documentation/virtual/kvm/api.txt | 4 > > arch/arm/include/asm/kvm_host.h | 1 - > > arch/arm/kvm/arm.c| 3 +++ > > arch/arm64/include/asm/kvm_host.h | 3 +-- > > 4 files changed, 8 insertions(+), 3 deletions(-) > > For the whole series: > > Acked-by: Marc Zyngier > > Christoffer: if you're happy with this series, I'll take it as part of > the next batch of fixes. > Yes: Reviewed-by: Christoffer Dall ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

[PATCH v5 09/10] arm64/mmu: replace 'page_mappings_only' parameter with flags argument
In preparation of extending the policy for manipulating kernel mappings with whether or not contiguous hints may be used in the page tables, replace the bool 'page_mappings_only' with a flags field and a flag NO_BLOCK_MAPPINGS. Signed-off-by: Ard Biesheuvel--- arch/arm64/mm/mmu.c | 45 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index a6d7a86dd2b8..9babafa253cf 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -43,6 +43,8 @@ #include #include +#define NO_BLOCK_MAPPINGS BIT(0) + u64 idmap_t0sz = TCR_T0SZ(VA_BITS); u64 kimage_voffset __ro_after_init; @@ -153,7 +155,7 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr, static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end, phys_addr_t phys, pgprot_t prot, phys_addr_t (*pgtable_alloc)(void), - bool page_mappings_only) + int flags) { pmd_t *pmd; unsigned long next; @@ -180,7 +182,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end, /* try section mapping first */ if (((addr | next | phys) & ~SECTION_MASK) == 0 && - !page_mappings_only) { + (flags & NO_BLOCK_MAPPINGS) == 0) { pmd_set_huge(pmd, phys, prot); /* @@ -217,7 +219,7 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next, static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, phys_addr_t phys, pgprot_t prot, phys_addr_t (*pgtable_alloc)(void), - bool page_mappings_only) + int flags) { pud_t *pud; unsigned long next; @@ -239,7 +241,8 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, /* * For 4K granule only, attempt to put down a 1GB block */ - if (use_1G_block(addr, next, phys) && !page_mappings_only) { + if (use_1G_block(addr, next, phys) && + (flags & NO_BLOCK_MAPPINGS) == 0) { pud_set_huge(pud, phys, prot); /* @@ -250,7 +253,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, pud_val(*pud))); } else { alloc_init_pmd(pud, 
addr, next, phys, prot, - pgtable_alloc, page_mappings_only); + pgtable_alloc, flags); BUG_ON(pud_val(old_pud) != 0 && pud_val(old_pud) != pud_val(*pud)); @@ -265,7 +268,7 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys, unsigned long virt, phys_addr_t size, pgprot_t prot, phys_addr_t (*pgtable_alloc)(void), -bool page_mappings_only) +int flags) { unsigned long addr, length, end, next; pgd_t *pgd = pgd_offset_raw(pgdir, virt); @@ -285,7 +288,7 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys, do { next = pgd_addr_end(addr, end); alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc, - page_mappings_only); + flags); phys += next - addr; } while (pgd++, addr = next, addr != end); } @@ -314,17 +317,22 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt, &phys, virt); return; } - __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, false); + __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, 0); } void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, unsigned long virt, phys_addr_t size, pgprot_t prot, bool page_mappings_only) { + int flags = 0; + BUG_ON(mm == &init_mm); + if (page_mappings_only) + flags = NO_BLOCK_MAPPINGS; + __create_pgd_mapping(mm->pgd, phys, virt, size, prot, -pgd_pgtable_alloc, page_mappings_only); +pgd_pgtable_alloc, flags); } static void update_mapping_prot(phys_addr_t phys, unsigned long virt, @@ -336,7 +344,7 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt, return; } -
[PATCH v5 10/10] arm64: mm: set the contiguous bit for kernel mappings where appropriate
This is the third attempt at enabling the use of contiguous hints for kernel mappings. The most recent attempt 0bfc445dec9d was reverted after it turned out that updating permission attributes on live contiguous ranges may result in TLB conflicts. So this time, the contiguous hint is not set for .rodata or for the linear alias of .text/.rodata, both of which are mapped read-write initially, and remapped read-only at a later stage. (Note that the latter region could also be unmapped and remapped again with updated permission attributes, given that the region, while live, is only mapped for the convenience of the hibernation code, but that also means the TLB footprint is negligible anyway, so why bother) This enables the following contiguous range sizes for the virtual mapping of the kernel image, and for the linear mapping: granule size | cont PTE | cont PMD | -+++ 4 KB|64 KB | 32 MB| 16 KB| 2 MB |1 GB* | 64 KB| 2 MB | 16 GB* | * Only when built for 3 or more levels of translation. This is due to the fact that a 2 level configuration only consists of PGDs and PTEs, and the added complexity of dealing with folded PMDs is not justified considering that 16 GB contiguous ranges are likely to be ignored by the hardware (and 16k/2 levels is a niche configuration) Signed-off-by: Ard Biesheuvel--- arch/arm64/include/asm/pgtable.h | 10 ++ arch/arm64/mm/mmu.c | 154 +--- 2 files changed, 114 insertions(+), 50 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 0eef6064bf3b..f10a7bf81849 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -74,6 +74,16 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]; #define pte_user_exec(pte) (!(pte_val(pte) & PTE_UXN)) #define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT)) +static inline u64 pte_cont_addr_end(u64 addr, u64 end) +{ + return min((addr + CONT_PTE_SIZE) & CONT_PTE_MASK, end); +} + +static inline u64 pmd_cont_addr_end(u64 
addr, u64 end) +{ + return min((addr + CONT_PMD_SIZE) & CONT_PMD_MASK, end); +} + #ifdef CONFIG_ARM64_HW_AFDBM #define pte_hw_dirty(pte) (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY)) #else diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 9babafa253cf..e2ffab56c1a6 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -44,6 +44,7 @@ #include #define NO_BLOCK_MAPPINGS BIT(0) +#define NO_CONT_MAPPINGS BIT(1) u64 idmap_t0sz = TCR_T0SZ(VA_BITS); @@ -116,11 +117,30 @@ static bool pgattr_change_is_safe(u64 old, u64 new) return ((old ^ new) & ~mask) == 0; } -static void alloc_init_pte(pmd_t *pmd, unsigned long addr, - unsigned long end, phys_addr_t phys, - pgprot_t prot, - phys_addr_t (*pgtable_alloc)(void)) +static void init_pte(pte_t *pte, unsigned long addr, unsigned long end, +phys_addr_t phys, pgprot_t prot) { + do { + pte_t old_pte = *pte; + + set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); + + /* +* After the PTE entry has been populated once, we +* only allow updates to the permission attributes. +*/ + BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), pte_val(*pte))); + + } while (pte++, addr += PAGE_SIZE, phys += PAGE_SIZE, addr != end); +} + +static void alloc_init_cont_pte(pmd_t *pmd, unsigned long addr, + unsigned long end, phys_addr_t phys, + pgprot_t prot, + phys_addr_t (*pgtable_alloc)(void), + int flags) +{ + unsigned long next; pte_t *pte; BUG_ON(pmd_sect(*pmd)); @@ -136,45 +156,30 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr, pte = pte_set_fixmap_offset(pmd, addr); do { - pte_t old_pte = *pte; + pgprot_t __prot = prot; - set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); - phys += PAGE_SIZE; + next = pte_cont_addr_end(addr, end); - /* -* After the PTE entry has been populated once, we -* only allow updates to the permission attributes. 
-*/ - BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), pte_val(*pte))); + /* use a contiguous mapping if the range is suitably aligned */ + if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) && + (flags & NO_CONT_MAPPINGS) == 0) + __prot = __pgprot(pgprot_val(prot) | PTE_CONT); -
[PATCH v5 08/10] arm64/mmu: add contiguous bit to sanity bug check
A mapping with the contiguous bit cannot be safely manipulated while live, regardless of whether the bit changes between the old and new mapping. So take this into account when deciding whether the change is safe. Signed-off-by: Ard Biesheuvel--- arch/arm64/mm/mmu.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index d3fecd20a136..a6d7a86dd2b8 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -103,7 +103,15 @@ static bool pgattr_change_is_safe(u64 old, u64 new) */ static const pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE; - return old == 0 || new == 0 || ((old ^ new) & ~mask) == 0; + /* creating or taking down mappings is always safe */ + if (old == 0 || new == 0) + return true; + + /* live contiguous mappings may not be manipulated at all */ + if ((old | new) & PTE_CONT) + return false; + + return ((old ^ new) & ~mask) == 0; } static void alloc_init_pte(pmd_t *pmd, unsigned long addr, -- 2.7.4 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v5 04/10] arm64: mmu: map .text as read-only from the outset
Now that alternatives patching code no longer relies on the primary mapping of .text being writable, we can remove the code that removes the writable permissions post-init time, and map it read-only from the outset. To preserve the existing behavior under rodata=off, which is relied upon by external debuggers to manage software breakpoints (as pointed out by Mark), add an early_param() check for rodata=, and use RWX permissions if it set to 'off'. Reviewed-by: Laura AbbottReviewed-by: Kees Cook Reviewed-by: Mark Rutland Signed-off-by: Ard Biesheuvel --- arch/arm64/mm/mmu.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index df377fbe464e..300e98e8cd63 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -416,9 +416,6 @@ void mark_rodata_ro(void) { unsigned long section_size; - section_size = (unsigned long)_etext - (unsigned long)_text; - update_mapping_prot(__pa_symbol(_text), (unsigned long)_text, - section_size, PAGE_KERNEL_ROX); /* * mark .rodata as read only. Use __init_begin rather than __end_rodata * to cover NOTES and EXCEPTION_TABLE. @@ -451,6 +448,12 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end, vm_area_add_early(vma); } +static int __init parse_rodata(char *arg) +{ + return strtobool(arg, _enabled); +} +early_param("rodata", parse_rodata); + /* * Create fine-grained mappings for the kernel. */ @@ -458,7 +461,14 @@ static void __init map_kernel(pgd_t *pgd) { static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data; - map_kernel_segment(pgd, _text, _etext, PAGE_KERNEL_EXEC, _text); + /* +* External debuggers may need to write directly to the text +* mapping to install SW breakpoints. Allow this (only) when +* explicitly requested with rodata=off. +*/ + pgprot_t text_prot = rodata_enabled ? 
PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC; + + map_kernel_segment(pgd, _text, _etext, text_prot, _text); map_kernel_segment(pgd, __start_rodata, __init_begin, PAGE_KERNEL, _rodata); map_kernel_segment(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC, _init); -- 2.7.4 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v5 07/10] arm64/mmu: ignore debug_pagealloc for kernel segments
The debug_pagealloc facility manipulates kernel mappings in the linear region at page granularity to detect out of bounds or use-after-free accesses. Since the kernel segments are not allocated dynamically, there is no point in taking the debug_pagealloc_enabled flag into account for them, and we can use block mappings unconditionally. Note that this applies equally to the linear alias of text/rodata: we will never have dynamic allocations there given that the same memory is statically in use by the kernel image. Signed-off-by: Ard Biesheuvel--- arch/arm64/mm/mmu.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index c3963c592ec3..d3fecd20a136 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -328,8 +328,7 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt, return; } - __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, -NULL, debug_pagealloc_enabled()); + __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, false); /* flush the TLBs after updating live kernel mappings */ flush_tlb_kernel_range(virt, virt + size); @@ -381,7 +380,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end */ __create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start), kernel_end - kernel_start, PAGE_KERNEL, -early_pgtable_alloc, debug_pagealloc_enabled()); +early_pgtable_alloc, false); } void __init mark_linear_text_alias_ro(void) @@ -437,7 +436,7 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end, BUG_ON(!PAGE_ALIGNED(size)); __create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot, -early_pgtable_alloc, debug_pagealloc_enabled()); +early_pgtable_alloc, false); vma->addr = va_start; vma->phys_addr = pa_start; -- 2.7.4 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v5 05/10] arm64: mmu: apply strict permissions to .init.text and .init.data
To avoid having mappings that are writable and executable at the same time, split the init region into a .init.text region that is mapped read-only, and a .init.data region that is mapped non-executable. This is possible now that the alternative patching occurs via the linear mapping, and the linear alias of the init region is always mapped writable (but never executable). Since the alternatives descriptions themselves are read-only data, move those into the .init.text region. Reviewed-by: Laura AbbottReviewed-by: Mark Rutland Signed-off-by: Ard Biesheuvel --- arch/arm64/include/asm/sections.h | 2 ++ arch/arm64/kernel/vmlinux.lds.S | 25 +--- arch/arm64/mm/mmu.c | 12 ++ 3 files changed, 26 insertions(+), 13 deletions(-) diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h index 4e7e7067afdb..941267caa39c 100644 --- a/arch/arm64/include/asm/sections.h +++ b/arch/arm64/include/asm/sections.h @@ -24,6 +24,8 @@ extern char __hibernate_exit_text_start[], __hibernate_exit_text_end[]; extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[]; extern char __hyp_text_start[], __hyp_text_end[]; extern char __idmap_text_start[], __idmap_text_end[]; +extern char __initdata_begin[], __initdata_end[]; +extern char __inittext_begin[], __inittext_end[]; extern char __irqentry_text_start[], __irqentry_text_end[]; extern char __mmuoff_data_start[], __mmuoff_data_end[]; diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index b8deffa9e1bf..2c93d259046c 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -143,12 +143,27 @@ SECTIONS . = ALIGN(SEGMENT_ALIGN); __init_begin = .; + __inittext_begin = .; INIT_TEXT_SECTION(8) .exit.text : { ARM_EXIT_KEEP(EXIT_TEXT) } + . = ALIGN(4); + .altinstructions : { + __alt_instructions = .; + *(.altinstructions) + __alt_instructions_end = .; + } + .altinstr_replacement : { + *(.altinstr_replacement) + } + + . 
= ALIGN(PAGE_SIZE); + __inittext_end = .; + __initdata_begin = .; + .init.data : { INIT_DATA INIT_SETUP(16) @@ -164,15 +179,6 @@ SECTIONS PERCPU_SECTION(L1_CACHE_BYTES) - . = ALIGN(4); - .altinstructions : { - __alt_instructions = .; - *(.altinstructions) - __alt_instructions_end = .; - } - .altinstr_replacement : { - *(.altinstr_replacement) - } .rela : ALIGN(8) { *(.rela .rela*) } @@ -181,6 +187,7 @@ SECTIONS __rela_size = SIZEOF(.rela); . = ALIGN(SEGMENT_ALIGN); + __initdata_end = .; __init_end = .; _data = .; diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 300e98e8cd63..75e21c33caff 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -459,7 +459,8 @@ early_param("rodata", parse_rodata); */ static void __init map_kernel(pgd_t *pgd) { - static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_init, vmlinux_data; + static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_inittext, + vmlinux_initdata, vmlinux_data; /* * External debuggers may need to write directly to the text @@ -469,9 +470,12 @@ static void __init map_kernel(pgd_t *pgd) pgprot_t text_prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC; map_kernel_segment(pgd, _text, _etext, text_prot, _text); - map_kernel_segment(pgd, __start_rodata, __init_begin, PAGE_KERNEL, _rodata); - map_kernel_segment(pgd, __init_begin, __init_end, PAGE_KERNEL_EXEC, - _init); + map_kernel_segment(pgd, __start_rodata, __inittext_begin, PAGE_KERNEL, + _rodata); + map_kernel_segment(pgd, __inittext_begin, __inittext_end, text_prot, + _inittext); + map_kernel_segment(pgd, __initdata_begin, __initdata_end, PAGE_KERNEL, + _initdata); map_kernel_segment(pgd, _data, _end, PAGE_KERNEL, _data); if (!pgd_val(*pgd_offset_raw(pgd, FIXADDR_START))) { -- 2.7.4 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v5 06/10] arm64/mmu: align alloc_init_pte prototype with pmd/pud versions
Align the function prototype of alloc_init_pte() with its pmd and pud counterparts by replacing the pfn parameter with the equivalent physical address. Signed-off-by: Ard Biesheuvel--- arch/arm64/mm/mmu.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 75e21c33caff..c3963c592ec3 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -107,7 +107,7 @@ static bool pgattr_change_is_safe(u64 old, u64 new) } static void alloc_init_pte(pmd_t *pmd, unsigned long addr, - unsigned long end, unsigned long pfn, + unsigned long end, phys_addr_t phys, pgprot_t prot, phys_addr_t (*pgtable_alloc)(void)) { @@ -128,8 +128,8 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr, do { pte_t old_pte = *pte; - set_pte(pte, pfn_pte(pfn, prot)); - pfn++; + set_pte(pte, pfn_pte(__phys_to_pfn(phys), prot)); + phys += PAGE_SIZE; /* * After the PTE entry has been populated once, we @@ -182,7 +182,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end, BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd), pmd_val(*pmd))); } else { - alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys), + alloc_init_pte(pmd, addr, next, phys, prot, pgtable_alloc); BUG_ON(pmd_val(old_pmd) != 0 && -- 2.7.4 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v5 02/10] arm64: mmu: move TLB maintenance from callers to create_mapping_late()
In preparation of refactoring the kernel mapping logic so that text regions
are never mapped writable, which would require adding explicit TLB
maintenance to new call sites of create_mapping_late() (which is currently
invoked twice from the same function), move the TLB maintenance from the
call site into create_mapping_late() itself, and change it from a full TLB
flush into a flush by VA, which is more appropriate here.

Also, given that create_mapping_late() has evolved into a routine that only
updates protection bits on existing mappings, rename it to
update_mapping_prot().

Reviewed-by: Mark Rutland
Tested-by: Mark Rutland
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/mm/mmu.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d28dbcf596b6..6cafd8723d1a 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -319,17 +319,20 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			     pgd_pgtable_alloc, page_mappings_only);
 }

-static void create_mapping_late(phys_addr_t phys, unsigned long virt,
-				phys_addr_t size, pgprot_t prot)
+static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
+				phys_addr_t size, pgprot_t prot)
 {
 	if (virt < VMALLOC_START) {
-		pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
+		pr_warn("BUG: not updating mapping for %pa at 0x%016lx - outside kernel range\n",
 			&phys, virt);
 		return;
 	}

 	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
 			     debug_pagealloc_enabled());
+
+	/* flush the TLBs after updating live kernel mappings */
+	flush_tlb_kernel_range(virt, virt + size);
 }

 static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end)
@@ -402,19 +405,16 @@ void mark_rodata_ro(void)
 	unsigned long section_size;

 	section_size = (unsigned long)_etext - (unsigned long)_text;
-	create_mapping_late(__pa_symbol(_text), (unsigned long)_text,
+	update_mapping_prot(__pa_symbol(_text), (unsigned long)_text,
 			    section_size, PAGE_KERNEL_ROX);

 	/*
 	 * mark .rodata as read only. Use __init_begin rather than __end_rodata
 	 * to cover NOTES and EXCEPTION_TABLE.
 	 */
 	section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata;
-	create_mapping_late(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
+	update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
 			    section_size, PAGE_KERNEL_RO);

-	/* flush the TLBs after updating live kernel mappings */
-	flush_tlb_all();
-
 	debug_checkwx();
 }
--
2.7.4

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v5 03/10] arm64: alternatives: apply boot time fixups via the linear mapping
One important rule of thumb when designing a secure software system is
that memory should never be writable and executable at the same time. We
mostly adhere to this rule in the kernel, except at boot time, when regions
may be mapped RWX until after we are done applying alternatives or making
other one-off changes.

For the alternative patching, we can improve the situation by applying the
fixups via the linear mapping, which is never mapped with executable
permissions. So map the linear alias of .text with RW- permissions
initially, and remove the write permissions as soon as alternative patching
has completed.

Reviewed-by: Laura Abbott
Reviewed-by: Mark Rutland
Tested-by: Mark Rutland
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/include/asm/mmu.h    |  1 +
 arch/arm64/kernel/alternative.c | 11 +-
 arch/arm64/kernel/smp.c         |  1 +
 arch/arm64/mm/mmu.c             | 22 +++-
 4 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 47619411f0ff..5468c834b072 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -37,5 +37,6 @@ extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 			       unsigned long virt, phys_addr_t size,
 			       pgprot_t prot, bool page_mappings_only);
 extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
+extern void mark_linear_text_alias_ro(void);

 #endif
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 06d650f61da7..8840c109c5d6 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -105,11 +105,11 @@ static u32 get_alt_insn(struct alt_instr *alt, u32 *insnptr, u32 *altinsnptr)
 	return insn;
 }

-static void __apply_alternatives(void *alt_region)
+static void __apply_alternatives(void *alt_region, bool use_linear_alias)
 {
 	struct alt_instr *alt;
 	struct alt_region *region = alt_region;
-	u32 *origptr, *replptr;
+	u32 *origptr, *replptr, *updptr;

 	for (alt = region->begin; alt < region->end; alt++) {
 		u32 insn;
@@ -124,11 +124,12 @@ static void __apply_alternatives(void *alt_region)

 		origptr = ALT_ORIG_PTR(alt);
 		replptr = ALT_REPL_PTR(alt);
+		updptr = use_linear_alias ? (u32 *)lm_alias(origptr) : origptr;
 		nr_inst = alt->alt_len / sizeof(insn);

 		for (i = 0; i < nr_inst; i++) {
 			insn = get_alt_insn(alt, origptr + i, replptr + i);
-			*(origptr + i) = cpu_to_le32(insn);
+			updptr[i] = cpu_to_le32(insn);
 		}

 		flush_icache_range((uintptr_t)origptr,
@@ -155,7 +156,7 @@ static int __apply_alternatives_multi_stop(void *unused)
 		isb();
 	} else {
 		BUG_ON(patched);
-		__apply_alternatives(&region);
+		__apply_alternatives(&region, true);
 		/* Barriers provided by the cache flushing */
 		WRITE_ONCE(patched, 1);
 	}
@@ -176,5 +177,5 @@ void apply_alternatives(void *start, size_t length)
 		.end	= start + length,
 	};

-	__apply_alternatives(&region);
+	__apply_alternatives(&region, false);
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ef1caae02110..d4739552da28 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -434,6 +434,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	setup_cpu_features();
 	hyp_mode_check();
 	apply_alternatives_all();
+	mark_linear_text_alias_ro();
 }

 void __init smp_prepare_boot_cpu(void)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6cafd8723d1a..df377fbe464e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -372,16 +372,28 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end
 			     debug_pagealloc_enabled());

 	/*
-	 * Map the linear alias of the [_text, __init_begin) interval as
-	 * read-only/non-executable. This makes the contents of the
-	 * region accessible to subsystems such as hibernate, but
-	 * protects it from inadvertent modification or execution.
+	 * Map the linear alias of the [_text, __init_begin) interval
+	 * as non-executable now, and remove the write permission in
+	 * mark_linear_text_alias_ro() below (which will be called after
+	 * alternative patching has completed). This makes the contents
+	 * of the region accessible to subsystems such as hibernate,
+	 * but protects it from inadvertent modification or execution.
	 */
 	__create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start),
-
[PATCH v5 01/10] arm: kvm: move kvm_vgic_global_state out of .text section
The kvm_vgic_global_state struct contains a static key which is written to
by jump_label_init() at boot time. So in preparation of making .text
regions truly (well, almost truly) read-only, mark kvm_vgic_global_state
__ro_after_init so it moves to the .rodata section instead.

Acked-by: Marc Zyngier
Reviewed-by: Laura Abbott
Reviewed-by: Mark Rutland
Tested-by: Mark Rutland
Signed-off-by: Ard Biesheuvel
---
 virt/kvm/arm/vgic/vgic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 654dfd40e449..7713d96e85b7 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -29,7 +29,9 @@
 #define DEBUG_SPINLOCK_BUG_ON(p)
 #endif

-struct vgic_global __section(.hyp.text) kvm_vgic_global_state = {.gicv3_cpuif = STATIC_KEY_FALSE_INIT,};
+struct vgic_global kvm_vgic_global_state __ro_after_init = {
+	.gicv3_cpuif = STATIC_KEY_FALSE_INIT,
+};

 /*
  * Locking order is always:
--
2.7.4
[PATCH v5 00/10] arm64: mmu: avoid W+X mappings and re-enable PTE_CONT for kernel
Having memory that is writable and executable at the same time is a
security hazard, and so we tend to avoid those when we can. However, at
boot time, we keep .text mapped writable during the entire init phase, and
the init region itself is mapped rwx as well.

Let's improve the situation by:
- making the alternatives patching use the linear mapping
- splitting the init region into separate text and data regions

This removes all RWX mappings except the really early one created in
head.S (which we could perhaps fix in the future as well).

Changes since v4:
- the PTE_CONT patch has now spawned four more preparatory patches that
  clean up some of the page table creation code before reintroducing the
  contiguous attribute management
- add Mark's R-b to #4 and #5

Changes since v3:
- use linear alias only when patching the core kernel, and not for modules
- add patch to reintroduce the use of PTE_CONT for kernel mappings, except
  for regions that are remapped read-only later on (i.e., .rodata and the
  linear alias of .text+.rodata)

Changes since v2:
- ensure that text mappings remain writable under rodata=off
- rename create_mapping_late() to update_mapping_prot()
- clarify commit log of #2
- add acks

Ard Biesheuvel (10):
  arm: kvm: move kvm_vgic_global_state out of .text section
  arm64: mmu: move TLB maintenance from callers to create_mapping_late()
  arm64: alternatives: apply boot time fixups via the linear mapping
  arm64: mmu: map .text as read-only from the outset
  arm64: mmu: apply strict permissions to .init.text and .init.data
  arm64/mmu: align alloc_init_pte prototype with pmd/pud versions
  arm64/mmu: ignore debug_pagealloc for kernel segments
  arm64/mmu: add contiguous bit to sanity bug check
  arm64/mmu: replace 'page_mappings_only' parameter with flags argument
  arm64: mm: set the contiguous bit for kernel mappings where appropriate

 arch/arm64/include/asm/mmu.h      |   1 +
 arch/arm64/include/asm/pgtable.h  |  10 +
 arch/arm64/include/asm/sections.h |   2 +
 arch/arm64/kernel/alternative.c   |  11 +-
 arch/arm64/kernel/smp.c           |   1 +
 arch/arm64/kernel/vmlinux.lds.S   |  25 +-
 arch/arm64/mm/mmu.c               | 250 ++--
 virt/kvm/arm/vgic/vgic.c          |   4 +-
 8 files changed, 212 insertions(+), 92 deletions(-)
--
2.7.4
Re: [PATCH v4 0/4] KVM: arm64: Increase number of user memslots
On Wed, Mar 08 2017 at 6:08:31 am GMT, linucher...@gmail.com wrote:
> From: Linu Cherian
>
> v3 -> v4:
> - Add missing commit messages in patches 1 and 3
>
> v2 -> v3:
> - Added documentation for KVM_CAP_NR_MEMSLOTS
> - Removed KVM_PRIVATE_MEM_SLOTS which is unused
> - KVM_USER_MEM_SLOTS changed to 512 from 508
>
> v1 -> v2:
> - Enabling KVM_CAP_NR_MEMSLOTS for arm/arm64 moved to separate patch.
> - Updated commit message so that what is reported to userspace is explicit.
>
> Linu Cherian (4):
>   KVM: Add documentation for KVM_CAP_NR_MEMSLOTS
>   KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64
>   KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused
>   KVM: arm64: Increase number of user memslots to 512
>
>  Documentation/virtual/kvm/api.txt | 4
>  arch/arm/include/asm/kvm_host.h   | 1 -
>  arch/arm/kvm/arm.c                | 3 +++
>  arch/arm64/include/asm/kvm_host.h | 3 +--
>  4 files changed, 8 insertions(+), 3 deletions(-)

For the whole series:

Acked-by: Marc Zyngier

Christoffer: if you're happy with this series, I'll take it as part of the
next batch of fixes.

Thanks,

	M.
--
Jazz is not dead, it just smell funny.