Re: [Qemu-devel] [RFC QEMU 1/2] arm/virt: Initialize generic timer scale factor dynamically

2018-11-09 Thread Bijan Mottahedeh

On 11/8/2018 6:21 AM, Richard Henderson wrote:

On 11/7/18 7:48 PM, Bijan Mottahedeh wrote:
  
+static void set_system_clock_scale(void)

+{
+unsigned long cntfrq_el0;
+
+asm volatile("mrs %0, cntfrq_el0" : "=r"(cntfrq_el0));
+
+if (cntfrq_el0 == 0) {
+cntfrq_el0 = GTIMER_SCALE_DEF;
+}
+
+system_clock_scale = NANOSECONDS_PER_SECOND / (int)cntfrq_el0;
+}

This only works for kvm.

For TCG you need to use the default always.  In particular, it won't even
compile for an x86 host.


r~
Is it ok to ifdef the asm statement with CONFIG_KVM, or what would be 
the correct way to account for TCG vs. KVM?


Thanks.

--bijan
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 2/2] kvm/arm: consistently advance singlestep when emulating instructions

2018-11-09 Thread Alex Bennée

Mark Rutland  writes:

> When we emulate a guest instruction, we don't advance the hardware
> singlestep state machine, and thus the guest will receive a software
> step exception after a next instruction which is not emulated by the
> host.
>
> We bodge around this in an ad-hoc fashion. Sometimes we explicitly check
> whether userspace requested a single step, and fake a debug exception
> from within the kernel. Other times, we advance the HW singlestep state
> rely on the HW to generate the exception for us. Thus, the observed step
> behaviour differs for host and guest.
>
> Let's make this simpler and consistent by always advancing the HW
> singlestep state machine when we skip an instruction. Thus we can rely
> on the hardware to generate the singlestep exception for us, and never
> need to explicitly check for an active-pending step, nor do we need to
> fake a debug exception from the guest.
>
> Signed-off-by: Mark Rutland 
> Cc: Alex Bennée 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: Peter Maydell 

Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 

For reference I'm leaving this kernel boot soaking overnight. In theory
there may be a branch to self but we shall see:

  https://gist.github.com/stsquad/ddfb00787cb133b4b658756cb6c47f63

> ---
>  arch/arm/include/asm/kvm_host.h  |  5 
>  arch/arm64/include/asm/kvm_emulate.h | 35 --
>  arch/arm64/include/asm/kvm_host.h|  1 -
>  arch/arm64/kvm/debug.c   | 21 
>  arch/arm64/kvm/handle_exit.c | 14 +--
>  arch/arm64/kvm/hyp/switch.c  | 43 
> +++-
>  arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 12 ++---
>  virt/kvm/arm/arm.c   |  2 --
>  virt/kvm/arm/hyp/vgic-v3-sr.c|  6 -
>  9 files changed, 46 insertions(+), 93 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 5ca5d9af0c26..c5634c6ffcea 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -296,11 +296,6 @@ static inline void kvm_arm_init_debug(void) {}
>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
> -static inline bool kvm_arm_handle_step_debug(struct kvm_vcpu *vcpu,
> -  struct kvm_run *run)
> -{
> - return false;
> -}
>
>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>  struct kvm_device_attr *attr);
> diff --git a/arch/arm64/include/asm/kvm_emulate.h 
> b/arch/arm64/include/asm/kvm_emulate.h
> index 21247870def7..506386a3edde 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -24,6 +24,7 @@
>
>  #include 
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -147,14 +148,6 @@ static inline bool kvm_condition_valid(const struct 
> kvm_vcpu *vcpu)
>   return true;
>  }
>
> -static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
> -{
> - if (vcpu_mode_is_32bit(vcpu))
> - kvm_skip_instr32(vcpu, is_wide_instr);
> - else
> - *vcpu_pc(vcpu) += 4;
> -}
> -
>  static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
>  {
>   *vcpu_cpsr(vcpu) |= PSR_AA32_T_BIT;
> @@ -424,4 +417,30 @@ static inline unsigned long 
> vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
>   return data;/* Leave LE untouched */
>  }
>
> +static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
> +{
> + if (vcpu_mode_is_32bit(vcpu))
> + kvm_skip_instr32(vcpu, is_wide_instr);
> + else
> + *vcpu_pc(vcpu) += 4;
> +
> + /* advance the singlestep state machine */
> + *vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
> +}
> +
> +/*
> + * Skip an instruction which has been emulated at hyp while most guest 
> sysregs
> + * are live.
> + */
> +static inline void __hyp_text __kvm_skip_instr(struct kvm_vcpu *vcpu)
> +{
> + *vcpu_pc(vcpu) = read_sysreg_el2(elr);
> + vcpu->arch.ctxt.gp_regs.regs.pstate = read_sysreg_el2(spsr);
> +
> + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> +
> + write_sysreg_el2(vcpu->arch.ctxt.gp_regs.regs.pstate, spsr);
> + write_sysreg_el2(*vcpu_pc(vcpu), elr);
> +}
> +
>  #endif /* __ARM64_KVM_EMULATE_H__ */
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 52fbc823ff8c..7a5035f9c5c3 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -445,7 +445,6 @@ void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
> -bool kvm_arm_handle_step_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);
>  int 

Re: [PATCH 1/2] kvm/arm: skip MMIO insn after emulation

2018-11-09 Thread Alex Bennée

Mark Rutland  writes:

> When we emulate an MMIO instruction, we advance the CPU state within
> decode_hsr(), before emulating the instruction effects.
>
> Having this logic in decode_hsr() is opaque, and advancing the state
> before emulation is problematic. It gets in the way of applying
> consistent single-step logic, and it prevents us from being able to fail
> an MMIO instruction with a synchronous exception.
>
> Clean this up by only advancing the CPU state *after* the effects of the
> instruction are emulated.
>
> Signed-off-by: Mark Rutland 
> Cc: Alex Bennée 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: Peter Maydell 

Reviewed-by: Alex Bennée 

> ---
>  virt/kvm/arm/mmio.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
> index dac7ceb1a677..08443a15e6be 100644
> --- a/virt/kvm/arm/mmio.c
> +++ b/virt/kvm/arm/mmio.c
> @@ -117,6 +117,12 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct 
> kvm_run *run)
>   vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
>   }
>
> + /*
> +  * The MMIO instruction is emulated and should not be re-executed
> +  * in the guest.
> +  */
> + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> +
>   return 0;
>  }
>
> @@ -144,11 +150,6 @@ static int decode_hsr(struct kvm_vcpu *vcpu, bool 
> *is_write, int *len)
>   vcpu->arch.mmio_decode.sign_extend = sign_extend;
>   vcpu->arch.mmio_decode.rt = rt;
>
> - /*
> -  * The MMIO instruction is emulated and should not be re-executed
> -  * in the guest.
> -  */
> - kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>   return 0;
>  }


--
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 0/2] kvm/arm: make singlestep behaviour consistent

2018-11-09 Thread Mark Rutland
We don't consistently advance the singlestep state machine when emulating
instructions. We attempt to bodge around this when the host is stepping a
guest, and fake a debug exception, but we don't always get this right.
Additionally, we don't try to fix this up at all when a guest is stepping
itself, so guests cannot single-step emulated instructions reliably.

In both cases we're usually reliant on the HW singlestep state machine, so
let's have our instruction emulation consistently advance that. Thus, when we
return to a guest after emulating an instruction, the HW will generate the step
exception for us, routed to host or guest appropriately.

So far I have only compile-tested these patches. YMMV.

These patches do not ensure that guest-stepping is reliable in the presence of
host-stepping. We might need to say that it's userspace's responsibility to
virtualize the guest singlestep state machine when stepping the guest.
Otherwise, it's not clear to me if we can shadow this correctly within the
kernel.

Thanks,
Mark.

Mark Rutland (2):
  kvm/arm: skip MMIO insn after emulation
  kvm/arm: consistently advance singlestep when emulating instructions

 arch/arm/include/asm/kvm_host.h  |  5 
 arch/arm64/include/asm/kvm_emulate.h | 35 --
 arch/arm64/include/asm/kvm_host.h|  1 -
 arch/arm64/kvm/debug.c   | 21 
 arch/arm64/kvm/handle_exit.c | 14 +--
 arch/arm64/kvm/hyp/switch.c  | 43 +++-
 arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 12 ++---
 virt/kvm/arm/arm.c   |  2 --
 virt/kvm/arm/hyp/vgic-v3-sr.c|  6 -
 virt/kvm/arm/mmio.c  | 11 
 10 files changed, 52 insertions(+), 98 deletions(-)

-- 
2.11.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 1/2] kvm/arm: skip MMIO insn after emulation

2018-11-09 Thread Mark Rutland
When we emulate an MMIO instruction, we advance the CPU state within
decode_hsr(), before emulating the instruction effects.

Having this logic in decode_hsr() is opaque, and advancing the state
before emulation is problematic. It gets in the way of applying
consistent single-step logic, and it prevents us from being able to fail
an MMIO instruction with a synchronous exception.

Clean this up by only advancing the CPU state *after* the effects of the
instruction are emulated.

Signed-off-by: Mark Rutland 
Cc: Alex Bennée 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Peter Maydell 
---
 virt/kvm/arm/mmio.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
index dac7ceb1a677..08443a15e6be 100644
--- a/virt/kvm/arm/mmio.c
+++ b/virt/kvm/arm/mmio.c
@@ -117,6 +117,12 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
}
 
+   /*
+* The MMIO instruction is emulated and should not be re-executed
+* in the guest.
+*/
+   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+
return 0;
 }
 
@@ -144,11 +150,6 @@ static int decode_hsr(struct kvm_vcpu *vcpu, bool 
*is_write, int *len)
vcpu->arch.mmio_decode.sign_extend = sign_extend;
vcpu->arch.mmio_decode.rt = rt;
 
-   /*
-* The MMIO instruction is emulated and should not be re-executed
-* in the guest.
-*/
-   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
return 0;
 }
 
-- 
2.11.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 2/2] kvm/arm: consistently advance singlestep when emulating instructions

2018-11-09 Thread Mark Rutland
When we emulate a guest instruction, we don't advance the hardware
singlestep state machine, and thus the guest will receive a software
step exception after a next instruction which is not emulated by the
host.

We bodge around this in an ad-hoc fashion. Sometimes we explicitly check
whether userspace requested a single step, and fake a debug exception
from within the kernel. Other times, we advance the HW singlestep state
rely on the HW to generate the exception for us. Thus, the observed step
behaviour differs for host and guest.

Let's make this simpler and consistent by always advancing the HW
singlestep state machine when we skip an instruction. Thus we can rely
on the hardware to generate the singlestep exception for us, and never
need to explicitly check for an active-pending step, nor do we need to
fake a debug exception from the guest.

Signed-off-by: Mark Rutland 
Cc: Alex Bennée 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Peter Maydell 
---
 arch/arm/include/asm/kvm_host.h  |  5 
 arch/arm64/include/asm/kvm_emulate.h | 35 --
 arch/arm64/include/asm/kvm_host.h|  1 -
 arch/arm64/kvm/debug.c   | 21 
 arch/arm64/kvm/handle_exit.c | 14 +--
 arch/arm64/kvm/hyp/switch.c  | 43 +++-
 arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 12 ++---
 virt/kvm/arm/arm.c   |  2 --
 virt/kvm/arm/hyp/vgic-v3-sr.c|  6 -
 9 files changed, 46 insertions(+), 93 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 5ca5d9af0c26..c5634c6ffcea 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -296,11 +296,6 @@ static inline void kvm_arm_init_debug(void) {}
 static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
-static inline bool kvm_arm_handle_step_debug(struct kvm_vcpu *vcpu,
-struct kvm_run *run)
-{
-   return false;
-}
 
 int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
diff --git a/arch/arm64/include/asm/kvm_emulate.h 
b/arch/arm64/include/asm/kvm_emulate.h
index 21247870def7..506386a3edde 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -24,6 +24,7 @@
 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -147,14 +148,6 @@ static inline bool kvm_condition_valid(const struct 
kvm_vcpu *vcpu)
return true;
 }
 
-static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
-{
-   if (vcpu_mode_is_32bit(vcpu))
-   kvm_skip_instr32(vcpu, is_wide_instr);
-   else
-   *vcpu_pc(vcpu) += 4;
-}
-
 static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
 {
*vcpu_cpsr(vcpu) |= PSR_AA32_T_BIT;
@@ -424,4 +417,30 @@ static inline unsigned long vcpu_data_host_to_guest(struct 
kvm_vcpu *vcpu,
return data;/* Leave LE untouched */
 }
 
+static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
+{
+   if (vcpu_mode_is_32bit(vcpu))
+   kvm_skip_instr32(vcpu, is_wide_instr);
+   else
+   *vcpu_pc(vcpu) += 4;
+
+   /* advance the singlestep state machine */
+   *vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
+}
+
+/*
+ * Skip an instruction which has been emulated at hyp while most guest sysregs
+ * are live.
+ */
+static inline void __hyp_text __kvm_skip_instr(struct kvm_vcpu *vcpu)
+{
+   *vcpu_pc(vcpu) = read_sysreg_el2(elr);
+   vcpu->arch.ctxt.gp_regs.regs.pstate = read_sysreg_el2(spsr);
+
+   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+
+   write_sysreg_el2(vcpu->arch.ctxt.gp_regs.regs.pstate, spsr);
+   write_sysreg_el2(*vcpu_pc(vcpu), elr);
+}
+
 #endif /* __ARM64_KVM_EMULATE_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 52fbc823ff8c..7a5035f9c5c3 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -445,7 +445,6 @@ void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
-bool kvm_arm_handle_step_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
 int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 00d422336a45..f39801e4136c 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -236,24 +236,3 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
}
}
 }
-
-
-/*
- * After successfully em

Re: [RFC PATCH] KVM: arm64: don't single-step for non-emulated faults

2018-11-09 Thread Mark Rutland
On Fri, Nov 09, 2018 at 12:56:54PM +, Peter Maydell wrote:
> On 9 November 2018 at 12:49, Mark Rutland  wrote:
> > I'm not saying anything about *decisions*. I'm saying that we can make
> > the state consistent by advancing the singlestep state in the same way
> > that HW does, at the instant it advances the PC.
> >
> > i.e. do that in kvm_skip_instr(), as I've done in my local tree.
> >
> > That mirrors the HW, and we don't need to special-case any handling for
> > emulated vs non-emulated instructions.
> 
> You also need to do it in the "set PC because we're making the guest
> take an exception" code path, which doesn't go through kvm_skip_instr().

Sure.

> This corresponds to the two kinds of "step completed" in hardware as
> noted in DDI0487D.a D2.12.3 fig D2-3 footnote b:
>  * executing the instruction to be stepped without taking an exception
>  * taking an exception to an exception level that debug exceptions
>are enabled from [ie guest EL1 in our case]

Thanks for the pointer!

Mark.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH] KVM: arm64: don't single-step for non-emulated faults

2018-11-09 Thread Peter Maydell
On 9 November 2018 at 12:49, Mark Rutland  wrote:
> I'm not saying anything about *decisions*. I'm saying that we can make
> the state consistent by advancing the singlestep state in the same way
> that HW does, at the instant it advances the PC.
>
> i.e. do that in kvm_skip_instr(), as I've done in my local tree.
>
> That mirrors the HW, and we don't need to special-case any handling for
> emulated vs non-emulated instructions.

You also need to do it in the "set PC because we're making the guest
take an exception" code path, which doesn't go through kvm_skip_instr().
This corresponds to the two kinds of "step completed" in hardware as
noted in DDI0487D.a D2.12.3 fig D2-3 footnote b:
 * executing the instruction to be stepped without taking an exception
 * taking an exception to an exception level that debug exceptions
   are enabled from [ie guest EL1 in our case]

thanks
-- PMM
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH] KVM: arm64: don't single-step for non-emulated faults

2018-11-09 Thread Mark Rutland
On Fri, Nov 09, 2018 at 12:24:41PM +, Alex Bennée wrote:
> Mark Rutland  writes:
> > On Thu, Nov 08, 2018 at 02:38:43PM +, Peter Maydell wrote:
> >> On 8 November 2018 at 14:28, Alex Bennée  wrote:
> >> >
> >> > Mark Rutland  writes:
> >> >> One problem is that I couldn't spot when we advance the PC for an MMIO
> >> >> trap. I presume we do that in the kernel, *after* the MMIO trap, but I
> >> >> can't see where that happens.
> >> >
> >> > Nope it gets done before during decode_hsr in mmio.c:
> >> >
> >> > /*
> >> >  * The MMIO instruction is emulated and should not be re-executed
> >> >  * in the guest.
> >> >  */
> >> > kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> >>
> >> I think that this attempt to do the PC-advance early is
> >> probably an underlying problem that is not helping the
> >> code structure here.
> >>
> >> An enhancement that's been floated previously is that the
> >> MMIO emulation in userspace should be able to report back
> >> to KVM "nope, that access should generate a guest synchronous
> >> external abort (with ESR_EL1.EA = 0/1)".
> >> If we have that, then we definitely need to not advance the
> >> PC until after userspace has done the emulation and told
> >> us whether the memory access succeeded or not...
> >
> > Yup.
> >
> > I think that we absolutely want to do all the CPU state advancement (PC,
> > SS bit, etc) at the point we apply the effects of the instruction. Not
> > before we emulate the instruction, and not higher/lower in the call
> > stack.
> 
> There is certainly an argument to abstract some of the advancement logic
> so we can make debug related decisions in one place. I don't know how
> much churn we would need to get there.

I'm not saying anything about *decisions*. I'm saying that we can make
the state consistent by advancing the singlestep state in the same way
that HW does, at the instant it advances the PC.

i.e. do that in kvm_skip_instr(), as I've done in my local tree.

That mirrors the HW, and we don't need to special-case any handling for
emulated vs non-emulated instructions.

> Currently most of the guest debug decisions are made in one place as we
> head into the guest run. Generally I don't think the emulation code want
> to know or care about the SS bit or what debug is currently happening
> although I guess the presence of the SS bit could be used to decide on
> exactly what exit type you are going to do - back to guest or out to
> user space. Currently kvm_arm_handle_step_debug on cares about host
> directed debug but we could expand it to raise the appropriate guest
> exception if required.

So long as we consistently advance the singlestep state when we emulate
an instruction, we shouldn't need kvm_arm_handle_step_debug() at all. If
we emulate an instruction, we'll return to the guest with PSTATE.SS
clear, and the HW will generate the exception for us.

This is *vastly* simpler to reason about.

I have local patches which I intend to post shortly.

> > We have a big problem in that guest-directed singlestep and
> > host-directed singlestep don't compose, and given that host-directed
> > singlestep doesn't work reliably today I'd be tempted to rip that out
> > until we've fixed guest-directed singlestep.
> 
> Getting host and guest debug to run at the same time is tricky given we
> end up subverting guest state when the host debug is in control. It did
> make my head spin when I worked on it originally which led to the
> acceptance that guest debug couldn't be expected to work transparently
> while host directed debug was in effect. If we can support it without
> adding complexity then that would be great but it's a pretty niche use
> case.

At the very least we need to define whether the kernel or userspace is
responsible for this. If we say it's userspace's responsibility to
virtualize this when it takes control of guest debug, but QEMU doesn't
do so, that's fine by me.

> I'd be loathed to rip out the current single step support as it does
> actually work pretty well - it's just corner cases with emulated MMIO
> and first step that are failing. Last I looked these were cases x86
> didn't even get right and no one has called to remove it's single step
> support AFAIK.
> 
> > We should have a story for how host-directed debug is handled
> > transparently from the PoV of a guest using guest-directed debug.
> 
> What's the use case for this apart from having a cleaner abstraction?

Providing the architecturally mandated behaviour to the guest.

> Do users really spend time running multiple gdbs at different levels
> in the stack?

It's not just about GDB. Things like kprobes, live patching, etc in a
guest may use singlestep, and this may be *critical* to the correct
operation of a given guest.

People *will* want to debug such features under a hypervisor. I know
that I personally want to be able to do so.

In general, a debugger shouldn't silently corrupt guest state or
behaviour, as KVM_GUE

Re: [RFC PATCH] KVM: arm64: don't single-step for non-emulated faults

2018-11-09 Thread Alex Bennée

Mark Rutland  writes:

> On Thu, Nov 08, 2018 at 02:38:43PM +, Peter Maydell wrote:
>> On 8 November 2018 at 14:28, Alex Bennée  wrote:
>> >
>> > Mark Rutland  writes:
>> >> One problem is that I couldn't spot when we advance the PC for an MMIO
>> >> trap. I presume we do that in the kernel, *after* the MMIO trap, but I
>> >> can't see where that happens.
>> >
>> > Nope it gets done before during decode_hsr in mmio.c:
>> >
>> > /*
>> >  * The MMIO instruction is emulated and should not be re-executed
>> >  * in the guest.
>> >  */
>> > kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>>
>> I think that this attempt to do the PC-advance early is
>> probably an underlying problem that is not helping the
>> code structure here.
>>
>> An enhancement that's been floated previously is that the
>> MMIO emulation in userspace should be able to report back
>> to KVM "nope, that access should generate a guest synchronous
>> external abort (with ESR_EL1.EA = 0/1)".
>> If we have that, then we definitely need to not advance the
>> PC until after userspace has done the emulation and told
>> us whether the memory access succeeded or not...
>
> Yup.
>
> I think that we absolutely want to do all the CPU state advancement (PC,
> SS bit, etc) at the point we apply the effects of the instruction. Not
> before we emulate the instruction, and not higher/lower in the call
> stack.

There is certainly an argument to abstract some of the advancement logic
so we can make debug related decisions in one place. I don't know how
much churn we would need to get there.

Currently most of the guest debug decisions are made in one place as we
head into the guest run. Generally I don't think the emulation code want
to know or care about the SS bit or what debug is currently happening
although I guess the presence of the SS bit could be used to decide on
exactly what exit type you are going to do - back to guest or out to
user space. Currently kvm_arm_handle_step_debug on cares about host
directed debug but we could expand it to raise the appropriate guest
exception if required.

> We have a big problem in that guest-directed singlestep and
> host-directed singlestep don't compose, and given that host-directed
> singlestep doesn't work reliably today I'd be tempted to rip that out
> until we've fixed guest-directed singlestep.

Getting host and guest debug to run at the same time is tricky given we
end up subverting guest state when the host debug is in control. It did
make my head spin when I worked on it originally which led to the
acceptance that guest debug couldn't be expected to work transparently
while host directed debug was in effect. If we can support it without
adding complexity then that would be great but it's a pretty niche use
case.

I'd be loathed to rip out the current single step support as it does
actually work pretty well - it's just corner cases with emulated MMIO
and first step that are failing. Last I looked these were cases x86
didn't even get right and no one has called to remove it's single step
support AFAIK.

> We should have a story for how host-directed debug is handled
> transparently from the PoV of a guest using guest-directed debug.

What's the use case for this apart from having a cleaner abstraction? Do
users really spend time running multiple gdbs at different levels in
the stack?

>
> Thanks,
> Mark.


--
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH] KVM: arm64: don't single-step for non-emulated faults

2018-11-09 Thread Mark Rutland
On Thu, Nov 08, 2018 at 02:38:43PM +, Peter Maydell wrote:
> On 8 November 2018 at 14:28, Alex Bennée  wrote:
> >
> > Mark Rutland  writes:
> >> One problem is that I couldn't spot when we advance the PC for an MMIO
> >> trap. I presume we do that in the kernel, *after* the MMIO trap, but I
> >> can't see where that happens.
> >
> > Nope it gets done before during decode_hsr in mmio.c:
> >
> > /*
> >  * The MMIO instruction is emulated and should not be re-executed
> >  * in the guest.
> >  */
> > kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> 
> I think that this attempt to do the PC-advance early is
> probably an underlying problem that is not helping the
> code structure here.
> 
> An enhancement that's been floated previously is that the
> MMIO emulation in userspace should be able to report back
> to KVM "nope, that access should generate a guest synchronous
> external abort (with ESR_EL1.EA = 0/1)".
> If we have that, then we definitely need to not advance the
> PC until after userspace has done the emulation and told
> us whether the memory access succeeded or not...

Yup.

I think that we absolutely want to do all the CPU state advancement (PC,
SS bit, etc) at the point we apply the effects of the instruction. Not
before we emulate the instruction, and not higher/lower in the call
stack.

We have a big problem in that guest-directed singlestep and
host-directed singlestep don't compose, and given that host-directed
singlestep doesn't work reliably today I'd be tempted to rip that out
until we've fixed guest-directed singlestep.

We should have a story for how host-directed debug is handled
transparently from the PoV of a guest using guest-directed debug.

Thanks,
Mark.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 4/4] arm64: KVM: Implement workaround for Cortex-A76 erratum 1165522

2018-11-09 Thread James Morse

Hi Marc,

On 08/11/2018 17:18, Marc Zyngier wrote:

On 05/11/18 18:34, James Morse wrote:

On 05/11/2018 14:36, Marc Zyngier wrote:

Early versions of Cortex-A76 can end-up with corrupt TLBs if they
speculate an AT instruction in during a guest switch while the


   (in during?)


S1/S2 system registers are in an inconsistent state.

Work around it by:
- Mandating VHE
- Make sure that S1 and S2 system registers are consistent before
clearing HCR_EL2.TGE, which allows AT to target the EL1 translation
regime

These two things together ensure that we cannot hit this erratum.




diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 51d5d966d9e5..322109183853 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -143,6 +143,13 @@ static void deactivate_traps_vhe(void)
   {
extern char vectors[];  /* kernel exception vectors */
write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+
+   /*
+* ARM erratum 1165522 requires the actual execution of the
+* above before we can switch to the host translation regime.
+*/
+   asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_1165522));
+


Host regime too ... does __tlb_switch_to_host_vhe() need the same
treatment? It writes vttbr_el2 and hcr_el2 back to back.


It turns out that our VHE TLB invalidation are a tiny bit broken, and
that's before we work around this very erratum.

You're perfectly right that we're mitting an ISB in
__tlb_switch_to_host_vhe().


I thought that would only be needed for this erratum workaround, in the 
regular case don't we rely on the isb hiding in __vhe_hyp_call()?




We also have the problem that we can
perfectly take an interrupt here, and maybe schedule another process
from there (very unlikely, but I couldn't fully convince myself that it
couldn't happen).



What I'm planning to do is to make these TLB invalidation sequence
atomic by disabling interrupts. Yes, this is quite a hammer, but that'
no different from !VHE, and that's a very rare event anyway.


Anything running on the back of an IRQ should not be allowed to see 
HCR_EL2.TGE being clear. I think this is the right thing to do.
One example is TGE changes PANs behaviour, which could become 
inexplicably-trappy if we're also using the LDRT instructions in the 
uaccess helpers from an IRQ. (who would do such a thing? perf).



Thanks,

James
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm