Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
On Wed, Oct 9, 2019 at 4:14 AM Palmer Dabbelt wrote: > > On Mon, 23 Sep 2019 04:12:17 PDT (-0700), pbonz...@redhat.com wrote: > > On 04/09/19 18:15, Anup Patel wrote: > >> +unsigned long guest_sstatus = > >> +vcpu->arch.guest_context.sstatus | SR_MXR; > >> +unsigned long guest_hstatus = > >> +vcpu->arch.guest_context.hstatus | HSTATUS_SPRV; > >> +unsigned long guest_vsstatus, old_stvec, tmp; > >> + > >> +guest_sstatus = csr_swap(CSR_SSTATUS, guest_sstatus); > >> +old_stvec = csr_swap(CSR_STVEC, (ulong)&__kvm_riscv_unpriv_trap); > >> + > >> +if (read_insn) { > >> +guest_vsstatus = csr_read_set(CSR_VSSTATUS, SR_MXR); > > > > Is this needed? IIUC SSTATUS.MXR encompasses a wider set of permissions: > > > > The HS-level MXR bit makes any executable page readable. {\tt > > vsstatus}.MXR makes readable those pages marked executable at the VS > > translation level, but only if readable at the guest-physical > > translation level. > > > > So it should be enough to set SSTATUS.MXR=1 I think. But you also > > shouldn't set SSTATUS.MXR=1 in the !read_insn case. > > > > Also, you can drop the irq save/restore (which is already a save/restore > > of SSTATUS) since you already write 0 to SSTATUS.SIE in your csr_swap. > > Perhaps add a BUG_ON(guest_sstatus & SR_SIE) before the csr_swap? > > > >> +asm volatile ("\n" > >> +"csrrw %[hstatus], " STR(CSR_HSTATUS) ", %[hstatus]\n" > >> +"li %[tilen], 4\n" > >> +"li %[tscause], 0\n" > >> +"lhu %[val], (%[addr])\n" > >> +"andi %[tmp], %[val], 3\n" > >> +"addi %[tmp], %[tmp], -3\n" > >> +"bne %[tmp], zero, 2f\n" > >> +"lhu %[tmp], 2(%[addr])\n" > >> +"sll %[tmp], %[tmp], 16\n" > >> +"add %[val], %[val], %[tmp]\n" > >> +"2: csrw " STR(CSR_HSTATUS) ", %[hstatus]" > >> +: [hstatus] "+"(guest_hstatus), [val] "=" (val), > >> + [tmp] "=" (tmp), [tilen] "+" (tilen), > >> + [tscause] "+" (tscause) > >> +: [addr] "r" (addr)); > >> +csr_write(CSR_VSSTATUS, guest_vsstatus); > > > >> > >> +#ifndef CONFIG_RISCV_ISA_C > >> +"li %[tilen], 4\n" > >> +#else > >> +"li %[tilen], 2\n" > >> +#endif > > > > Can you use an assembler directive to force using a non-compressed > > format for ld and lw? This would get rid of tilen, which is costing 6 > > bytes (if I did the RVC math right) in order to save two. :) > > > > Paolo > > > >> +"li %[tscause], 0\n" > >> +#ifdef CONFIG_64BIT > >> +"ld %[val], (%[addr])\n" > >> +#else > >> +"lw %[val], (%[addr])\n" > >> +#endif > To: a...@brainfault.org > CC: pbonz...@redhat.com > CC: Anup Patel > CC: Paul Walmsley > CC: rkrc...@redhat.com > CC: daniel.lezc...@linaro.org > CC: t...@linutronix.de > CC: g...@amazon.com > CC: Atish Patra > CC: Alistair Francis > CC: Damien Le Moal > CC: Christoph Hellwig > CC: k...@vger.kernel.org > CC: linux-ri...@lists.infradead.org > CC: linux-kernel@vger.kernel.org > Subject: Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU > In-Reply-To: > > > On Mon, 23 Sep 2019 06:09:43 PDT (-0700), a...@brainfault.org wrote: > > On Mon, Sep 23, 2019 at 4:42 PM Paolo Bonzini wrote: > >> > >> On 04/09/19 18:15, Anup Patel wrote: > >> > + unsigned long guest_sstatus = > >> > + vcpu->arch.guest_context.sstatus | SR_MXR; > >> > + unsigned long guest_hstatus = > >> > + vcpu->arch.guest_context.hstatus | HSTATUS_SPRV; > >> > + unsigned long guest_vsstatus, old_stvec, tmp; > >> > + > >> > + guest_sstatus = csr_swap(CSR_SSTATUS, gu
Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
On Mon, 23 Sep 2019 04:12:17 PDT (-0700), pbonz...@redhat.com wrote: On 04/09/19 18:15, Anup Patel wrote: + unsigned long guest_sstatus = + vcpu->arch.guest_context.sstatus | SR_MXR; + unsigned long guest_hstatus = + vcpu->arch.guest_context.hstatus | HSTATUS_SPRV; + unsigned long guest_vsstatus, old_stvec, tmp; + + guest_sstatus = csr_swap(CSR_SSTATUS, guest_sstatus); + old_stvec = csr_swap(CSR_STVEC, (ulong)&__kvm_riscv_unpriv_trap); + + if (read_insn) { + guest_vsstatus = csr_read_set(CSR_VSSTATUS, SR_MXR); Is this needed? IIUC SSTATUS.MXR encompasses a wider set of permissions: The HS-level MXR bit makes any executable page readable. {\tt vsstatus}.MXR makes readable those pages marked executable at the VS translation level, but only if readable at the guest-physical translation level. So it should be enough to set SSTATUS.MXR=1 I think. But you also shouldn't set SSTATUS.MXR=1 in the !read_insn case. Also, you can drop the irq save/restore (which is already a save/restore of SSTATUS) since you already write 0 to SSTATUS.SIE in your csr_swap. Perhaps add a BUG_ON(guest_sstatus & SR_SIE) before the csr_swap? + asm volatile ("\n" + "csrrw %[hstatus], " STR(CSR_HSTATUS) ", %[hstatus]\n" + "li %[tilen], 4\n" + "li %[tscause], 0\n" + "lhu %[val], (%[addr])\n" + "andi %[tmp], %[val], 3\n" + "addi %[tmp], %[tmp], -3\n" + "bne %[tmp], zero, 2f\n" + "lhu %[tmp], 2(%[addr])\n" + "sll %[tmp], %[tmp], 16\n" + "add %[val], %[val], %[tmp]\n" + "2: csrw " STR(CSR_HSTATUS) ", %[hstatus]" + : [hstatus] "+"(guest_hstatus), [val] "=" (val), + [tmp] "=" (tmp), [tilen] "+" (tilen), + [tscause] "+" (tscause) + : [addr] "r" (addr)); + csr_write(CSR_VSSTATUS, guest_vsstatus); +#ifndef CONFIG_RISCV_ISA_C + "li %[tilen], 4\n" +#else + "li %[tilen], 2\n" +#endif Can you use an assembler directive to force using a non-compressed format for ld and lw? This would get rid of tilen, which is costing 6 bytes (if I did the RVC math right) in order to save two. :) Paolo + "li %[tscause], 0\n" +#ifdef CONFIG_64BIT + "ld %[val], (%[addr])\n" +#else + "lw %[val], (%[addr])\n" +#endif To: a...@brainfault.org CC: pbonz...@redhat.com CC: Anup Patel CC: Paul Walmsley CC: rkrc...@redhat.com CC: daniel.lezc...@linaro.org CC: t...@linutronix.de CC: g...@amazon.com CC: Atish Patra CC: Alistair Francis CC: Damien Le Moal CC: Christoph Hellwig CC: k...@vger.kernel.org CC: linux-ri...@lists.infradead.org CC: linux-kernel@vger.kernel.org Subject: Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU In-Reply-To: On Mon, 23 Sep 2019 06:09:43 PDT (-0700), a...@brainfault.org wrote: On Mon, Sep 23, 2019 at 4:42 PM Paolo Bonzini wrote: On 04/09/19 18:15, Anup Patel wrote: > + unsigned long guest_sstatus = > + vcpu->arch.guest_context.sstatus | SR_MXR; > + unsigned long guest_hstatus = > + vcpu->arch.guest_context.hstatus | HSTATUS_SPRV; > + unsigned long guest_vsstatus, old_stvec, tmp; > + > + guest_sstatus = csr_swap(CSR_SSTATUS, guest_sstatus); > + old_stvec = csr_swap(CSR_STVEC, (ulong)&__kvm_riscv_unpriv_trap); > + > + if (read_insn) { > + guest_vsstatus = csr_read_set(CSR_VSSTATUS, SR_MXR); Is this needed? IIUC SSTATUS.MXR encompasses a wider set of permissions: The HS-level MXR bit makes any executable page readable. {\tt vsstatus}.MXR makes readable those pages marked executable at the VS translation level, but only if readable at the guest-physical translation level. So it should be enough to set SSTATUS.MXR=1 I think. But you also shouldn't set SSTATUS.MXR=1 in the !read_insn case. I was being overly cautious here. Initially, I thought SSTATUS.MXR applies only to Stage2 and VSSTATUS.MXR applies only to Stage1. I agree with you. The HS-mode should only need to set SSTATUS.MXR. Also, you can drop the irq save/restore (which is already a save/restore of SSTATUS) since you already write 0 to SSTATUS.SIE in your csr_swap. Perhaps add a BU
Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
On Mon, Sep 23, 2019 at 7:03 PM Paolo Bonzini wrote: > > On 23/09/19 15:09, Anup Patel wrote: > >>> +#ifndef CONFIG_RISCV_ISA_C > >>> + "li %[tilen], 4\n" > >>> +#else > >>> + "li %[tilen], 2\n" > >>> +#endif > >> > >> Can you use an assembler directive to force using a non-compressed > >> format for ld and lw? This would get rid of tilen, which is costing 6 > >> bytes (if I did the RVC math right) in order to save two. :) > > > > I tried looking for it but could not find any assembler directive > > to selectively turn-off instruction compression. > > ".option norvc"? Thanks for the hint. I will try ".option norvc" Regards, Anup > > Paolo
Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
On 23/09/19 15:09, Anup Patel wrote: >>> +#ifndef CONFIG_RISCV_ISA_C >>> + "li %[tilen], 4\n" >>> +#else >>> + "li %[tilen], 2\n" >>> +#endif >> >> Can you use an assembler directive to force using a non-compressed >> format for ld and lw? This would get rid of tilen, which is costing 6 >> bytes (if I did the RVC math right) in order to save two. :) > > I tried looking for it but could not find any assembler directive > to selectively turn-off instruction compression. ".option norvc"? Paolo
Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
On Mon, Sep 23, 2019 at 4:42 PM Paolo Bonzini wrote: > > On 04/09/19 18:15, Anup Patel wrote: > > + unsigned long guest_sstatus = > > + vcpu->arch.guest_context.sstatus | SR_MXR; > > + unsigned long guest_hstatus = > > + vcpu->arch.guest_context.hstatus | HSTATUS_SPRV; > > + unsigned long guest_vsstatus, old_stvec, tmp; > > + > > + guest_sstatus = csr_swap(CSR_SSTATUS, guest_sstatus); > > + old_stvec = csr_swap(CSR_STVEC, (ulong)&__kvm_riscv_unpriv_trap); > > + > > + if (read_insn) { > > + guest_vsstatus = csr_read_set(CSR_VSSTATUS, SR_MXR); > > Is this needed? IIUC SSTATUS.MXR encompasses a wider set of permissions: > > The HS-level MXR bit makes any executable page readable. {\tt > vsstatus}.MXR makes readable those pages marked executable at the VS > translation level, but only if readable at the guest-physical > translation level. > > So it should be enough to set SSTATUS.MXR=1 I think. But you also > shouldn't set SSTATUS.MXR=1 in the !read_insn case. I was being overly cautious here. Initially, I thought SSTATUS.MXR applies only to Stage2 and VSSTATUS.MXR applies only to Stage1. I agree with you. The HS-mode should only need to set SSTATUS.MXR. > > Also, you can drop the irq save/restore (which is already a save/restore > of SSTATUS) since you already write 0 to SSTATUS.SIE in your csr_swap. > Perhaps add a BUG_ON(guest_sstatus & SR_SIE) before the csr_swap? I had already dropped irq save/restore in v7 series and having BUG_ON() on guest_sstatus here would be better. > > > + asm volatile ("\n" > > + "csrrw %[hstatus], " STR(CSR_HSTATUS) ", %[hstatus]\n" > > + "li %[tilen], 4\n" > > + "li %[tscause], 0\n" > > + "lhu %[val], (%[addr])\n" > > + "andi %[tmp], %[val], 3\n" > > + "addi %[tmp], %[tmp], -3\n" > > + "bne %[tmp], zero, 2f\n" > > + "lhu %[tmp], 2(%[addr])\n" > > + "sll %[tmp], %[tmp], 16\n" > > + "add %[val], %[val], %[tmp]\n" > > + "2: csrw " STR(CSR_HSTATUS) ", %[hstatus]" > > + : [hstatus] "+"(guest_hstatus), [val] "=" (val), > > + [tmp] "=" (tmp), [tilen] "+" (tilen), > > + [tscause] "+" (tscause) > > + : [addr] "r" (addr)); > > + csr_write(CSR_VSSTATUS, guest_vsstatus); > > > > > +#ifndef CONFIG_RISCV_ISA_C > > + "li %[tilen], 4\n" > > +#else > > + "li %[tilen], 2\n" > > +#endif > > Can you use an assembler directive to force using a non-compressed > format for ld and lw? This would get rid of tilen, which is costing 6 > bytes (if I did the RVC math right) in order to save two. :) I tried looking for it but could not find any assembler directive to selectively turn-off instruction compression. > > Paolo > > > + "li %[tscause], 0\n" > > +#ifdef CONFIG_64BIT > > + "ld %[val], (%[addr])\n" > > +#else > > + "lw %[val], (%[addr])\n" > > +#endif Regards, Anup
Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
On 04/09/19 18:15, Anup Patel wrote: > + unsigned long guest_sstatus = > + vcpu->arch.guest_context.sstatus | SR_MXR; > + unsigned long guest_hstatus = > + vcpu->arch.guest_context.hstatus | HSTATUS_SPRV; > + unsigned long guest_vsstatus, old_stvec, tmp; > + > + guest_sstatus = csr_swap(CSR_SSTATUS, guest_sstatus); > + old_stvec = csr_swap(CSR_STVEC, (ulong)&__kvm_riscv_unpriv_trap); > + > + if (read_insn) { > + guest_vsstatus = csr_read_set(CSR_VSSTATUS, SR_MXR); Is this needed? IIUC SSTATUS.MXR encompasses a wider set of permissions: The HS-level MXR bit makes any executable page readable. {\tt vsstatus}.MXR makes readable those pages marked executable at the VS translation level, but only if readable at the guest-physical translation level. So it should be enough to set SSTATUS.MXR=1 I think. But you also shouldn't set SSTATUS.MXR=1 in the !read_insn case. Also, you can drop the irq save/restore (which is already a save/restore of SSTATUS) since you already write 0 to SSTATUS.SIE in your csr_swap. Perhaps add a BUG_ON(guest_sstatus & SR_SIE) before the csr_swap? > + asm volatile ("\n" > + "csrrw %[hstatus], " STR(CSR_HSTATUS) ", %[hstatus]\n" > + "li %[tilen], 4\n" > + "li %[tscause], 0\n" > + "lhu %[val], (%[addr])\n" > + "andi %[tmp], %[val], 3\n" > + "addi %[tmp], %[tmp], -3\n" > + "bne %[tmp], zero, 2f\n" > + "lhu %[tmp], 2(%[addr])\n" > + "sll %[tmp], %[tmp], 16\n" > + "add %[val], %[val], %[tmp]\n" > + "2: csrw " STR(CSR_HSTATUS) ", %[hstatus]" > + : [hstatus] "+"(guest_hstatus), [val] "=" (val), > + [tmp] "=" (tmp), [tilen] "+" (tilen), > + [tscause] "+" (tscause) > + : [addr] "r" (addr)); > + csr_write(CSR_VSSTATUS, guest_vsstatus); > > +#ifndef CONFIG_RISCV_ISA_C > + "li %[tilen], 4\n" > +#else > + "li %[tilen], 2\n" > +#endif Can you use an assembler directive to force using a non-compressed format for ld and lw? This would get rid of tilen, which is costing 6 bytes (if I did the RVC math right) in order to save two. :) Paolo > + "li %[tscause], 0\n" > +#ifdef CONFIG_64BIT > + "ld %[val], (%[addr])\n" > +#else > + "lw %[val], (%[addr])\n" > +#endif
Re: [PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
On 04.09.19 18:15, Anup Patel wrote: We will get stage2 page faults whenever Guest/VM access SW emulated MMIO device or unmapped Guest RAM. This patch implements MMIO read/write emulation by extracting MMIO details from the trapped load/store instruction and forwarding the MMIO read/write to user-space. The actual MMIO emulation will happen in user-space and KVM kernel module will only take care of register updates before resuming the trapped VCPU. The handling for stage2 page faults for unmapped Guest RAM will be implemeted by a separate patch later. Signed-off-by: Anup Patel Acked-by: Paolo Bonzini Reviewed-by: Paolo Bonzini This version is indeed much better. I would not mind a bit more documentation when it comes to implicit register value assumptions (a0, a1 in the trap handler), but the code is small enough that someone who cares can figure it out quickly enough. Reviewed-by: Alexander Graf Alex Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879
[PATCH v7 10/21] RISC-V: KVM: Handle MMIO exits for VCPU
We will get stage2 page faults whenever Guest/VM access SW emulated MMIO device or unmapped Guest RAM. This patch implements MMIO read/write emulation by extracting MMIO details from the trapped load/store instruction and forwarding the MMIO read/write to user-space. The actual MMIO emulation will happen in user-space and KVM kernel module will only take care of register updates before resuming the trapped VCPU. The handling for stage2 page faults for unmapped Guest RAM will be implemeted by a separate patch later. Signed-off-by: Anup Patel Acked-by: Paolo Bonzini Reviewed-by: Paolo Bonzini --- arch/riscv/include/asm/kvm_host.h | 20 ++ arch/riscv/kvm/mmu.c | 7 + arch/riscv/kvm/vcpu_exit.c| 512 +- arch/riscv/kvm/vcpu_switch.S | 8 + 4 files changed, 544 insertions(+), 3 deletions(-) diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index 18f1097f1d8d..2a5209fff68d 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -53,6 +53,13 @@ struct kvm_arch { phys_addr_t pgd_phys; }; +struct kvm_mmio_decode { + unsigned long insn; + int len; + int shift; + int return_handled; +}; + struct kvm_cpu_context { unsigned long zero; unsigned long ra; @@ -141,6 +148,9 @@ struct kvm_vcpu_arch { unsigned long irqs_pending; unsigned long irqs_pending_mask; + /* MMIO instruction details */ + struct kvm_mmio_decode mmio_decode; + /* VCPU power-off state */ bool power_off; @@ -160,11 +170,21 @@ static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {} int kvm_riscv_setup_vsip(void); void kvm_riscv_cleanup_vsip(void); +int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long hva, +bool is_write); void kvm_riscv_stage2_flush_cache(struct kvm_vcpu *vcpu); int kvm_riscv_stage2_alloc_pgd(struct kvm *kvm); void kvm_riscv_stage2_free_pgd(struct kvm *kvm); void kvm_riscv_stage2_update_hgatp(struct kvm_vcpu *vcpu); +void __kvm_riscv_unpriv_trap(void); + +unsigned long kvm_riscv_vcpu_unpriv_read(struct kvm_vcpu *vcpu, +bool read_insn, +unsigned long guest_addr, +unsigned long *trap_scause); +void kvm_riscv_vcpu_trap_redirect(struct kvm_vcpu *vcpu, + unsigned long scause, unsigned long stval); int kvm_riscv_vcpu_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run); int kvm_riscv_vcpu_exit(struct kvm_vcpu *vcpu, struct kvm_run *run, unsigned long scause, unsigned long stval); diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index 04dd089b86ff..2b965f9aac07 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -61,6 +61,13 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, return 0; } +int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long hva, +bool is_write) +{ + /* TODO: */ + return 0; +} + void kvm_riscv_stage2_flush_cache(struct kvm_vcpu *vcpu) { /* TODO: */ diff --git a/arch/riscv/kvm/vcpu_exit.c b/arch/riscv/kvm/vcpu_exit.c index e4d7c8f0807a..d75a6c35b6c7 100644 --- a/arch/riscv/kvm/vcpu_exit.c +++ b/arch/riscv/kvm/vcpu_exit.c @@ -6,9 +6,437 @@ * Anup Patel */ +#include #include #include #include +#include + +#define INSN_MATCH_LB 0x3 +#define INSN_MASK_LB 0x707f +#define INSN_MATCH_LH 0x1003 +#define INSN_MASK_LH 0x707f +#define INSN_MATCH_LW 0x2003 +#define INSN_MASK_LW 0x707f +#define INSN_MATCH_LD 0x3003 +#define INSN_MASK_LD 0x707f +#define INSN_MATCH_LBU 0x4003 +#define INSN_MASK_LBU 0x707f +#define INSN_MATCH_LHU 0x5003 +#define INSN_MASK_LHU 0x707f +#define INSN_MATCH_LWU 0x6003 +#define INSN_MASK_LWU 0x707f +#define INSN_MATCH_SB 0x23 +#define INSN_MASK_SB 0x707f +#define INSN_MATCH_SH 0x1023 +#define INSN_MASK_SH 0x707f +#define INSN_MATCH_SW 0x2023 +#define INSN_MASK_SW 0x707f +#define INSN_MATCH_SD 0x3023 +#define INSN_MASK_SD 0x707f + +#define INSN_MATCH_C_LD0x6000 +#define INSN_MASK_C_LD 0xe003 +#define INSN_MATCH_C_SD0xe000 +#define INSN_MASK_C_SD 0xe003 +#define INSN_MATCH_C_LW0x4000 +#define INSN_MASK_C_LW 0xe003 +#define INSN_MATCH_C_SW0xc000 +#define INSN_MASK_C_SW 0xe003 +#define INSN_MATCH_C_LDSP 0x6002 +#define INSN_MASK_C_LDSP 0xe003 +#define INSN_MATCH_C_SDSP 0xe002 +#define INSN_MASK_C_SDSP 0xe003 +#define INSN_MATCH_C_LWSP 0x4002 +#define INSN_MASK_C_LWSP 0xe003 +#define INSN_MATCH_C_SWSP