Hi Dongjiu,

On Tue, Jul 04, 2017 at 12:46:23PM +0800, gengdongjiu wrote:
> Hi Christoffer,
>   thanks for the review.
> 
> 
> On 2017/7/3 16:39, Christoffer Dall wrote:
> > Hi Dongjiu,
> > 
> > On Mon, Jun 26, 2017 at 08:46:39PM +0800, Dongjiu Geng wrote:
> >> when SError happen, kvm notifies user space to record the CPER,
> >> user space specifies and passes the contents of ESR_EL1 on taking
> >> a virtual SError interrupt to KVM, KVM enables virtual system
> >> error or asynchronous abort with this specifies syndrome. This
> >> patch modify the world-switch to restore VSESR_EL2, VSESR_EL2
> >> saves the virtual SError syndrome, it becomes the ESR_EL1 value when
> >> HCR_EL2.VSE injects an SError. This register is added by the
> >> RAS Extensions.
> > 
> > This commit message is confusing and doesn't help me understand the
> > patch.
> (1) what is the rationale for the guest OS SError interrupt(SEI) handling in 
> the RAS solution?
>   you can refer to document: "RAS_Extension_PRD03-PRDC-010953-32-0, 6.5.3 
> Example software sequences"
>   a). In the firmware-first RAS solution, when guest OS happen a SError 
> interrupt (SEI), it will firstly trap to EL3(SCR_EL3.EA = 1);
>   b). The firmware logs, triages, and delegates the error exception to the 
> hypervisor. As the error came from guest OS  EL1, firmware
>       does by faking an SError interrupt exception entry to EL2.
>   c). Control transfers to the hypervisor's delegated error recovery 
> agent.Because HCR_EL2.AMO is set to 1, the hypervisor can use a
>       Virtual SError interrupt to delegate an asynchronous abort to EL1, by 
> setting HCR_EL2.VSE to 1 and using VESR_EL2 to pass syndrome.
> 
> (2) what is this patch mainly do?
>   As mentioned above, the hypervisor needs to enable virtual SError and pass 
> the virtual syndrome to the guest OS.
> 
>   a). when Control transfers to the hypervisor from firmware by faking an 
> SError interrupt, the hypervisor delivered the syndrome_info(esr_el2) and
>       host VA address( Qemu translate this VA address to the virtual machine 
> physical address(IPA)) using below new added "serror_intr" struct.
>               /* KVM_EXIT_SERROR_INTR */
>               struct {
>                       __u32 syndrome_info;
>                       __u64 address;
>               } serror_intr;
> 
>   b). Qemu gets the address(host VA) delivered by KVM, translate this host VA 
> address to virtual machine physical address(IPA), and runtime record this 
> virtual
>      machine physical address(IPA) to the guest OS's APEI table.
> 
>   c). Qemu gets the syndrome_info delivered by KVM, it refers to this 
> syndrome value(but can be different from it) to specify the virtual SError 
> interrupt's syndrome through setting VESR_EL2.
> 
>         the vsesr_el2 is armv8.2 register, its explanation can be found in 
> "RAS_Extension_PRD03-PRDC-010953-33-0, 5.6.18 VSESR_EL2, Virtual SError 
> Exception Syndrome Register"
> 
>       >>The VSESR_EL2 characteristics are:
>       >>Purpose:
>       >>Provides the syndrome value reported to software on taking a virtual 
> SError interrupt exception:
>       >>      — If the virtual SError interrupt is taken to EL1 using AArch64 
> then VSESR_EL2 provides the
>       >>syndrome value reported in ESR_EL1.
>       >>      — If the virtual SError interrupt is taken to EL1 using AArch32 
> then VSESR_EL2 provides the
>       >>        syndrome values reported in DFSR.{AET, ExT} and the remainder 
> of the DFSR is set as
>       >>       defined by VMSAv8-32.
> 
>      so in the KVM, I added a new IOCTL(#define KVM_ARM_SEI  _IO(KVMIO,  
> 0xb8)) to pass the virtual SError syndrome value specified by Qemu and enable 
> a virtual System Error.
> 
> 
>  d). when world switch to guest OS, guest OS will happen virtual SError(this 
> virtual SError can not be route to EL3 firmware), guest OS uses the specified 
> syndrome value to do the recovery and
>      parses the guest OS CPER which is dynamically recorded by the Qemu in 
> the APEI table .
> 
> 

Please format the text nicely, and try to simplify the message when
resubmitting these patches.  I hope it will be easier for you to write
a meaningful commit message if you split these patches into multiple
ones.

One specific thing currently lacking from the commit message is a
discussion of why this API is needed in the first place - why can't we
reuse KVM_SET_ONE_REG, for example.

> 
> > 
> > I think this patch is trying to do too many things.  I suggest you split
> > the patch into (at least) one patch that captures exception information
> > from the world-switch path, one patch that deals with the new exit
> > reason, and finally a patch with the new ioctl.  That way you can write
> > a commit message for each patch describing first what the patch does,
> > and then why this is a good idea.
>   Ok, thanks for the good suggestion.
> 
> > 
> > Neverthess, I added some random comments below.
> > 
> >>
> >> Changes since v3:
> >> (1) Move restore VSESR_EL2 value logic to a helper method
> >> (2) In the world-switch, not save VSESR_EL2, because no one cares the
> >>     old VSESR_EL2 value
> >> (3) Add a new KVM_ARM_SEI ioctl to set the VSESR_EL2 value and pend
> >>     a virtual system error
> >>
> >> Signed-off-by: Dongjiu Geng <gengdong...@huawei.com>
> >> Signed-off-by: Quanming Wu <wuquanm...@huawei.com>
> >> ---
> >>  Documentation/virtual/kvm/api.txt    | 10 ++++++++++
> >>  arch/arm/include/asm/kvm_host.h      |  1 +
> >>  arch/arm/kvm/arm.c                   |  7 +++++++
> >>  arch/arm/kvm/guest.c                 |  5 +++++
> >>  arch/arm64/include/asm/esr.h         |  2 ++
> >>  arch/arm64/include/asm/kvm_emulate.h | 10 ++++++++++
> >>  arch/arm64/include/asm/kvm_host.h    |  2 ++
> >>  arch/arm64/include/asm/sysreg.h      |  3 +++
> >>  arch/arm64/kvm/guest.c               | 14 ++++++++++++++
> >>  arch/arm64/kvm/handle_exit.c         | 25 +++++++++++++++++++------
> >>  arch/arm64/kvm/hyp/switch.c          | 14 ++++++++++++++
> >>  include/uapi/linux/kvm.h             |  8 ++++++++
> >>  12 files changed, 95 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/Documentation/virtual/kvm/api.txt 
> >> b/Documentation/virtual/kvm/api.txt
> >> index 3c248f7..852ac55 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -3377,6 +3377,16 @@ struct kvm_ppc_resize_hpt {
> >>    __u32 pad;
> >>  };
> >>  
> >> +4.104 KVM_ARM_SEI
> >> +
> >> +Capability: KVM_EXIT_SERROR_INTR
> >> +Architectures: arm/arm64
> >> +Type: vcpu ioctl
> >> +Parameters: u64 (syndrome)
> >> +Returns: 0 in case of success
> >> +
> >> +Pend an virtual system error or asynchronous abort with user space 
> >> specified.
> >> +
> > 
> > I don't understand enough about what this ioctl does by just reading
> > this text.  Did you mean to say something like "Record that userspace
> > wishes to inject a pending virtual system error to the VCPU.  Next time
> > the VCPU is run, th
> Christoffer, sorry to confuse you.
> This IOCTL mainly do two things:
> 
> (1) set the VCPU struct's vsesr_el2 to pass the virtual SError's syndrome 
> value that Qemu specified.
> (2) enable virtual SError
> 
>  +int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome)
>  +{
>  +    u64 reg = *syndrome;
>  +
>  +    if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && reg)
>  +            /* set vsesr_el2[24:0] with value that user space specified */
>  +            kvm_vcpu_set_vsesr(vcpu, reg & VSESR_ELx_IDS_ISS_MASK);
> 
>  +
>  +    /* inject virtual system Error or asynchronous abort */
>  +    kvm_inject_vabt(vcpu);
>  +
>  +    return 0;
>  +}
> 

Dongjiu, I can read the code, of course.  My comment was to tell you
that the text you add to the api.txt file should be meaningful and
explanatory on its own; that's the whole point of having such a file.

> > 
> >>  5. The kvm_run structure
> >>  ------------------------
> >>  
> >> diff --git a/arch/arm/include/asm/kvm_host.h 
> >> b/arch/arm/include/asm/kvm_host.h
> >> index 31ee468..566292a 100644
> >> --- a/arch/arm/include/asm/kvm_host.h
> >> +++ b/arch/arm/include/asm/kvm_host.h
> >> @@ -244,6 +244,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, 
> >> const struct kvm_one_reg *);
> >>  
> >>  int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
> >>            int exception_index);
> >> +int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome);
> >>  
> >>  static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
> >>                                   unsigned long hyp_stack_ptr,
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index 96dba7c..583294f 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -987,6 +987,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> >>                    return -EFAULT;
> >>            return kvm_arm_vcpu_has_attr(vcpu, &attr);
> >>    }
> >> +  case KVM_ARM_SEI: {
> >> +          u64 syndrome;
> >> +
> >> +          if (copy_from_user(&syndrome, argp, sizeof(syndrome)))
> >> +                  return -EFAULT;
> >> +          return kvm_vcpu_ioctl_sei(vcpu, &syndrome);
> >> +  }
> >>    default:
> >>            return -EINVAL;
> >>    }
> >> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
> >> index fa6182a..72505bf 100644
> >> --- a/arch/arm/kvm/guest.c
> >> +++ b/arch/arm/kvm/guest.c
> >> @@ -248,6 +248,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu 
> >> *vcpu,
> >>    return -EINVAL;
> >>  }
> >>  
> >> +int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome)
> >> +{
> >> +  return 0;
> >> +}
> >> +
> >>  int __attribute_const__ kvm_target_cpu(void)
> >>  {
> >>    switch (read_cpuid_part()) {
> >> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
> >> index 22f9c90..d009c99 100644
> >> --- a/arch/arm64/include/asm/esr.h
> >> +++ b/arch/arm64/include/asm/esr.h
> >> @@ -127,6 +127,8 @@
> >>  #define ESR_ELx_WFx_ISS_WFE       (UL(1) << 0)
> >>  #define ESR_ELx_xVC_IMM_MASK      ((1UL << 16) - 1)
> >>  
> >> +#define VSESR_ELx_IDS_ISS_MASK    ((1UL << 25) - 1)
> >> +
> >>  /* ESR value templates for specific events */
> >>  
> >>  /* BRK instruction trap from AArch64 state */
> >> diff --git a/arch/arm64/include/asm/kvm_emulate.h 
> >> b/arch/arm64/include/asm/kvm_emulate.h
> >> index 5f64ab2..93dc3d1 100644
> >> --- a/arch/arm64/include/asm/kvm_emulate.h
> >> +++ b/arch/arm64/include/asm/kvm_emulate.h
> >> @@ -155,6 +155,16 @@ static inline u32 kvm_vcpu_get_hsr(const struct 
> >> kvm_vcpu *vcpu)
> >>    return vcpu->arch.fault.esr_el2;
> >>  }
> >>  
> >> +static inline u32 kvm_vcpu_get_vsesr(const struct kvm_vcpu *vcpu)
> >> +{
> >> +          return vcpu->arch.fault.vsesr_el2;
> >> +}
> >> +
> >> +static inline void kvm_vcpu_set_vsesr(struct kvm_vcpu *vcpu, unsigned 
> >> long val)
> >> +{
> >> +          vcpu->arch.fault.vsesr_el2 = val;
> >> +}
> >> +
> >>  static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
> >>  {
> >>    u32 esr = kvm_vcpu_get_hsr(vcpu);
> >> diff --git a/arch/arm64/include/asm/kvm_host.h 
> >> b/arch/arm64/include/asm/kvm_host.h
> >> index e7705e7..b6242fb 100644
> >> --- a/arch/arm64/include/asm/kvm_host.h
> >> +++ b/arch/arm64/include/asm/kvm_host.h
> >> @@ -86,6 +86,7 @@ struct kvm_vcpu_fault_info {
> >>    u32 esr_el2;            /* Hyp Syndrom Register */
> >>    u64 far_el2;            /* Hyp Fault Address Register */
> >>    u64 hpfar_el2;          /* Hyp IPA Fault Address Register */
> >> +  u32 vsesr_el2;          /* Virtual SError Exception Syndrome Register */
> >>  };
> >>  
> >>  /*
> >> @@ -385,6 +386,7 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
> >>                           struct kvm_device_attr *attr);
> >>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >>                           struct kvm_device_attr *attr);
> >> +int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome);
> >>  
> >>  static inline void __cpu_init_stage2(void)
> >>  {
> >> diff --git a/arch/arm64/include/asm/sysreg.h 
> >> b/arch/arm64/include/asm/sysreg.h
> >> index 32964c7..3af6907 100644
> >> --- a/arch/arm64/include/asm/sysreg.h
> >> +++ b/arch/arm64/include/asm/sysreg.h
> >> @@ -125,6 +125,9 @@
> >>  #define REG_PSTATE_PAN_IMM                sys_reg(0, 0, 4, 0, 4)
> >>  #define REG_PSTATE_UAO_IMM                sys_reg(0, 0, 4, 0, 3)
> >>  
> >> +/* virtual SError exception syndrome register in armv8.2 */
> >> +#define REG_VSESR_EL2                     sys_reg(3, 4, 5, 2, 3)
> >> +
> >>  #define SET_PSTATE_PAN(x) __emit_inst(0xd5000000 | REG_PSTATE_PAN_IMM |   
> >> \
> >>                                  (!!x)<<8 | 0x1f)
> >>  #define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM |   
> >> \
> >> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> >> index b37446a..21c20b0 100644
> >> --- a/arch/arm64/kvm/guest.c
> >> +++ b/arch/arm64/kvm/guest.c
> >> @@ -277,6 +277,20 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu 
> >> *vcpu,
> >>    return -EINVAL;
> >>  }
> >>  
> >> +int kvm_vcpu_ioctl_sei(struct kvm_vcpu *vcpu, u64 *syndrome)
> >> +{
> >> +  u64 reg = *syndrome;
> >> +
> >> +  if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && reg)
> >> +          /* set vsesr_el2[24:0] with value that user space specified */
> >> +          kvm_vcpu_set_vsesr(vcpu, reg & VSESR_ELx_IDS_ISS_MASK);
> > 
> > Are you really supposed to fail silently here if trying to do this on a
> > system that doesn't have RAS ?
> This register vsesr_el2 is only introduced by armv8.2.
> so if system does not have RAS, the setting will be failed because this 
> register does not exist. I need to "return -EINVAL"
> for this case. thank you for pointing that.
> 
> 
> > 
> > Why can you not set reg to 0 here?  That doesn't seem to be documented
> > anywhere, and shouldn't the function return -EINVAL in this case?
> Because the default value is 0. if want to set to 0, it can not call this API.
> zero is meaningless syndrome value. no one care the zero value.
> 

What if userspace wants to drive a CPU reset of some form.  Is there any
mechanism for resetting the register then?

> 
> > 
> > Also, if you do this, but don't run the VCPU, then migrate the VM, and
> > run the VCPU on the new destination, isn't this information lost?
> > 
> > 

Please consider the migration point for your next version as well.

Thanks,
-Christoffer

Reply via email to