Re: [PATCH v2 8/8] arm64: kvm: Check support for AArch32 for 32bit guests
On 25/02/16 09:52, Suzuki K Poulose wrote:
> Add a check to make sure the system supports AArch32 state
> before initialising a 32bit guest.
>
> Cc: Christoffer Dall
> Cc: Marc Zyngier
> Cc: kvmarm@lists.cs.columbia.edu
> Signed-off-by: Suzuki K Poulose
>
> ---
>
> I really wanted to pass kvm_vcpu down to the helpers. But then, I can't
> define the arch specific helper in asm/kvm_host.h due to lack of kvm_vcpu's
> definition yet:
>
> In file included from include/linux/kvm_host.h:35:0,
>                  from arch/arm64/kernel/asm-offsets.c:24:
> ./arch/arm64/include/asm/kvm_host.h: In function ‘kvm_arch_vcpu_validate_features’:
> ./arch/arm64/include/asm/kvm_host.h:344:48: error: dereferencing pointer to incomplete type
>     return !test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features) ||

Why don't you just have the prototype in kvm_host.h, and move the actual
implementation to something like guest.c?

But I think there is a better approach, see below.

> ---
>  arch/arm/include/asm/kvm_host.h   | 5 +
>  arch/arm/kvm/arm.c                | 3 +++
>  arch/arm64/include/asm/kvm_host.h | 8
>  3 files changed, 16 insertions(+)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index f9f2779..945c23a 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -238,6 +238,11 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>
> +static inline bool kvm_arch_vcpu_validate_features(struct kvm_vcpu_arch *arch_vcpu)
> +{
> +	return true;
> +}
> +
>  static inline void kvm_arm_init_debug(void) {}
>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index dda1959..fc4ea37 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -787,6 +787,9 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
>  		set_bit(i, vcpu->arch.features);
>  	}
>
> +	if (!kvm_arch_vcpu_validate_features(&vcpu->arch))
> +		return -EINVAL;
> +
>  	vcpu->arch.target = phys_target;
>
>  	/* Now we know what it is, we can reset it. */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 689d4c9..9d60a6c 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -24,6 +24,8 @@
>
>  #include
>  #include
> +#include
> +#include
>  #include
>  #include
>
> @@ -338,6 +340,12 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>
> +static inline bool kvm_arch_vcpu_validate_features(struct kvm_vcpu_arch *arch_vcpu)
> +{
> +	return !test_bit(KVM_ARM_VCPU_EL1_32BIT, arch_vcpu->features) ||
> +		system_supports_32bit_el0();
> +}
> +

This is really convoluted (it took me 5 minutes staring at the expression
and remembering that AArch32 EL1 implies AArch32 EL0 to get it).

Now, we already have kvm_reset_vcpu() that validates AArch32 support. It
would probably be better to move things there.

Thoughts?

>  void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
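The incomplete-type error and the suggested fix (keep only the prototype in the header, put the implementation in a file that sees the full struct) can be sketched in plain C. This is an illustrative toy under assumed names, not the kernel code: `el1_32bit` and `system_supports_32bit_el0_stub` are made-up stand-ins for the real features bitmap and capability check.

```c
#include <assert.h>
#include <stdbool.h>

/* "Header" side: kvm_vcpu is only forward-declared here, so its members
 * cannot be dereferenced — but a prototype taking a pointer to the
 * incomplete type is perfectly fine. */
struct kvm_vcpu;
bool kvm_arch_vcpu_validate_features(struct kvm_vcpu *vcpu);

/* "guest.c" side: the full definition is visible, so the implementation
 * can dereference the vcpu. */
struct kvm_vcpu {
    bool el1_32bit;     /* stand-in for the KVM_ARM_VCPU_EL1_32BIT feature bit */
};

static bool system_supports_32bit_el0_stub(void)
{
    return false;       /* pretend the system has no AArch32 state */
}

bool kvm_arch_vcpu_validate_features(struct kvm_vcpu *vcpu)
{
    /* A 32bit guest is only valid if the system supports AArch32. */
    return !vcpu->el1_32bit || system_supports_32bit_el0_stub();
}
```

The point is purely structural: callers include only the prototype, and the single translation unit that defines the function is the only one that needs the complete type.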
[PULL] KVM/ARM updates for 4.5-rc7
Hi Paolo,

I really thought that the previous PR was the last for this release,
but Michael rightly decided to prove me wrong. Oh well.

Please pull!

	M.

The following changes since commit fd451b90e78c4178bcfc5072f2b2b637500c109a:

  arm64: KVM: vgic-v3: Restore ICH_APR0Rn_EL2 before ICH_APR1Rn_EL2 (2016-02-24 17:25:58 +)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-4.5-rc7

for you to fetch changes up to 4cad67fca3fc952d6f2ed9e799621f07666a560f:

  arm/arm64: KVM: Fix ioctl error handling (2016-02-29 09:56:40 +)

KVM/ARM fixes for 4.5-rc7

- Fix ioctl error handling on the timer path

Michael S. Tsirkin (1):
      arm/arm64: KVM: Fix ioctl error handling

 arch/arm/kvm/guest.c   | 2 +-
 arch/arm64/kvm/guest.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
[PATCH] arm/arm64: KVM: Fix ioctl error handling
From: "Michael S. Tsirkin"

Calling return copy_to_user(...) in an ioctl will not do the right thing
if there's a pagefault: copy_to_user returns the number of bytes not
copied in this case.

Fix up kvm to do

	return copy_to_user(...) ? -EFAULT : 0;

everywhere.

Cc: sta...@vger.kernel.org
Acked-by: Christoffer Dall
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marc Zyngier
---
 arch/arm/kvm/guest.c   | 2 +-
 arch/arm64/kvm/guest.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 5fa69d7..99361f1 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -161,7 +161,7 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	u64 val;
 
 	val = kvm_arm_timer_get_reg(vcpu, reg->id);
-	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id));
+	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
 }
 
 static unsigned long num_core_regs(void)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index fcb7788..9e54ad7 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -194,7 +194,7 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	u64 val;
 
 	val = kvm_arm_timer_get_reg(vcpu, reg->id);
-	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id));
+	return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
 }
 
 /**
-- 
2.1.4
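The bug the patch fixes can be reproduced in userspace with a mock of copy_to_user's return convention (number of bytes *not* copied, never a -errno). The names below are illustrative stand-ins, not kernel code:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Mock of the kernel's copy_to_user(): returns the number of bytes that
 * could NOT be copied (0 on full success). `faulting_bytes` simulates a
 * pagefault part-way through the copy. */
static unsigned long mock_copy_to_user(void *to, const void *from,
                                       unsigned long n,
                                       unsigned long faulting_bytes)
{
    memcpy(to, from, n - faulting_bytes);
    return faulting_bytes;              /* bytes left uncopied */
}

/* Buggy pattern: leaks the "bytes not copied" count as a positive return
 * value, which ioctl callers misinterpret as a result code. */
static long get_reg_buggy(void *uaddr, const void *val, unsigned long size,
                          unsigned long faulting_bytes)
{
    return mock_copy_to_user(uaddr, val, size, faulting_bytes);
}

/* Fixed pattern from the patch: any partial copy collapses to -EFAULT. */
static long get_reg_fixed(void *uaddr, const void *val, unsigned long size,
                          unsigned long faulting_bytes)
{
    return mock_copy_to_user(uaddr, val, size, faulting_bytes) ? -EFAULT : 0;
}
```

On a fault, the buggy version returns a positive byte count instead of an error, which is why the one-character-per-call `? -EFAULT : 0` fix is needed at every such return site.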
Re: [PULL] KVM/ARM updates for 4.5-rc7
On 02/03/2016 10:27, Marc Zyngier wrote:
> Hi Paolo,
>
> I really thought that the previous PR was the last for this release,
> but Michael rightly decided to prove me wrong. Oh well.
>
> Please pull!
>
>	M.
>
> The following changes since commit fd451b90e78c4178bcfc5072f2b2b637500c109a:
>
>   arm64: KVM: vgic-v3: Restore ICH_APR0Rn_EL2 before ICH_APR1Rn_EL2 (2016-02-24 17:25:58 +)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-4.5-rc7
>
> for you to fetch changes up to 4cad67fca3fc952d6f2ed9e799621f07666a560f:
>
>   arm/arm64: KVM: Fix ioctl error handling (2016-02-29 09:56:40 +)
>
> KVM/ARM fixes for 4.5-rc7
>
> - Fix ioctl error handling on the timer path
>
> Michael S. Tsirkin (1):
>       arm/arm64: KVM: Fix ioctl error handling
>
>  arch/arm/kvm/guest.c   | 2 +-
>  arch/arm64/kvm/guest.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Pulled, thanks.

Paolo
Re: [RFC v5 00/17] KVM PCIe/MSI passthrough on ARM/ARM64
>> From: Eric Auger
>> Sent: Tuesday, March 1, 2016 11:57 PM
>> To: eric.au...@st.com; eric.au...@linaro.org; robin.mur...@arm.com;
>> alex.william...@redhat.com; will.dea...@arm.com; j...@8bytes.org;
>> t...@linutronix.de; ja...@lakedaemon.net; marc.zyng...@arm.com;
>> christoffer.d...@linaro.org; linux-arm-ker...@lists.infradead.org;
>> kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org
>> Cc: suravee.suthikulpa...@amd.com; patc...@linaro.org;
>> linux-ker...@vger.kernel.org; Jaggi, Manish; bharat.bhus...@freescale.com;
>> pranav.sawargaon...@gmail.com; p.fe...@samsung.com;
>> io...@lists.linux-foundation.org
>> Subject: [RFC v5 00/17] KVM PCIe/MSI passthrough on ARM/ARM64
>>
>> This series addresses KVM PCIe passthrough with MSI enabled on ARM/ARM64.
>> It pursues the efforts done on [1], [2], [3]. It also aims at covering the
>> same need on PowerPC platforms although the same kind of integration
>> should be carried out.

[snip]

>> - Not tested: ARM GICv3 ITS

[snip]

>> QEMU Integration:
>> [RFC v2 0/8] KVM PCI/MSI passthrough with mach-virt
>> (http://lists.gnu.org/archive/html/qemu-arm/2016-01/msg00444.html)
>> https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-pci-passthrough-rfc-v2

For gicv3 its, I believe, the below series for qemu and kernel is
required for gicv3-its:

[RFC PATCH v3 0/5] vITS support
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05197.html

and in kernel CONFIG_HAVE_KVM_MSI must be enabled so that qemu sees the
MSI capability KVM_CAP_SIGNAL_MSI.

This has a dependency on gsi routing support:

KVM: arm/arm64: gsi routing support
https://lkml.org/lkml/2015/6/29/290

I had both the above series in 4.2 in my local 4.2 tree.

BR
-Manish
Re: [PATCH v2 8/8] arm64: kvm: Check support for AArch32 for 32bit guests
On 02/03/16 09:08, Marc Zyngier wrote:
> On 25/02/16 09:52, Suzuki K Poulose wrote:
>> I really wanted to pass kvm_vcpu down to the helpers. But then, I can't
>> define the arch specific helper in asm/kvm_host.h due to lack of kvm_vcpu's
>> definition yet:
>>
>> In file included from include/linux/kvm_host.h:35:0,
>>                  from arch/arm64/kernel/asm-offsets.c:24:
>> ./arch/arm64/include/asm/kvm_host.h: In function ‘kvm_arch_vcpu_validate_features’:
>> ./arch/arm64/include/asm/kvm_host.h:344:48: error: dereferencing pointer to incomplete type
>>     return !test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features) ||
>
> Why don't you just have the prototype in kvm_host.h, and move the actual
> implementation to something like guest.c?
>
> But I think there is a better approach, see below.

I thought it would better be a static inline. But, the GCC can do that,
silly me :)

> This is really convoluted (it took me 5 minutes staring at the expression
> and remembering that AArch32 EL1 implies AArch32 EL0 to get it).
>
> Now, we already have kvm_reset_vcpu() that validates AArch32 support. It
> would probably be better to move things there.
>
> Thoughts?

Definitely. I overlooked the function name to do something specific to
resetting the CPU than doing some checks :(. I will respin it.

Cheers
Suzuki
Re: [RFC v5 00/17] KVM PCIe/MSI passthrough on ARM/ARM64
Hi Manish,

On 03/02/2016 09:11 AM, Jaggi, Manish wrote:
>>> From: Eric Auger
>>> Sent: Tuesday, March 1, 2016 11:57 PM
>>> To: eric.au...@st.com; eric.au...@linaro.org; robin.mur...@arm.com;
>>> alex.william...@redhat.com; will.dea...@arm.com; j...@8bytes.org;
>>> t...@linutronix.de; ja...@lakedaemon.net; marc.zyng...@arm.com;
>>> christoffer.d...@linaro.org; linux-arm-ker...@lists.infradead.org;
>>> kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org
>>> Cc: suravee.suthikulpa...@amd.com; patc...@linaro.org;
>>> linux-ker...@vger.kernel.org; Jaggi, Manish; bharat.bhus...@freescale.com;
>>> pranav.sawargaon...@gmail.com; p.fe...@samsung.com;
>>> io...@lists.linux-foundation.org
>>> Subject: [RFC v5 00/17] KVM PCIe/MSI passthrough on ARM/ARM64
>>>
>>> This series addresses KVM PCIe passthrough with MSI enabled on ARM/ARM64.
>>> It pursues the efforts done on [1], [2], [3]. It also aims at covering the
>>> same need on PowerPC platforms although the same kind of integration
>>> should be carried out.
>
> [snip]
>>> - Not tested: ARM GICv3 ITS
>
> [snip]
>>> QEMU Integration:
>>> [RFC v2 0/8] KVM PCI/MSI passthrough with mach-virt
>>> (http://lists.gnu.org/archive/html/qemu-arm/2016-01/msg00444.html)
>>> https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-pci-passthrough-rfc-v2
>
> For gicv3 its, I believe, the below series for qemu and kernel is
> required for gicv3-its:
>
> [RFC PATCH v3 0/5] vITS support
> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg05197.html
>
> and in kernel CONFIG_HAVE_KVM_MSI must be enabled so that qemu sees the
> MSI capability KVM_CAP_SIGNAL_MSI.
>
> This has a dependency on gsi routing support:
>
> KVM: arm/arm64: gsi routing support
> https://lkml.org/lkml/2015/6/29/290

which has a dependency on Andre's ITS emulation series too. The kernel
series will be resent soon on top of the new vgic design.

> I had both the above series in 4.2 in my local 4.2 tree.

Did you have a chance to test with GICv3 ITS already?

Best Regards

Eric

> BR
> -Manish
Re: [PATCH v7 1/6] KVM: arm/arm64: Add VGICv3 save/restore API documentation
On Fri, Feb 26, 2016 at 03:01:56PM +, Peter Maydell wrote:
> On 7 December 2015 at 12:29, Pavel Fedin wrote:
>> From: Christoffer Dall
>>
>> Factor out the GICv3-specific documentation into a separate
>> documentation file. Add description for how to access distributor,
>> redistributor, and CPU interface registers for GICv3 in this new file.
>>
>> Acked-by: Peter Maydell
>> Acked-by: Marc Zyngier
>> Signed-off-by: Christoffer Dall
>> Signed-off-by: Pavel Fedin
>
> I was rereading this API doc this week, and I realised we missed
> something when we wrote it:
>
>> +  KVM_DEV_ARM_VGIC_GRP_DIST_REGS
>> +  KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
>> +  Attributes:
>> +    The attr field of kvm_device_attr encodes two values:
>> +    bits:   | 63 ... 32 | 31 ... 0 |
>> +    values: |   mpidr   |  offset  |
>> +
>> +  All distributor regs are (rw, 64-bit).
>
> It's a bit odd to claim that distributor (or redistributor)
> registers are all 64-bit, because in the hardware most of them
> are really 32-bit, and at 32-bit offsets from each other.
> We didn't mean to imply that you could do a 64-bit access
> at offset 0 of the distributor and get both of GICD_CTLR and
> GICD_TYPER at once, for instance.

no we didn't, and I think this was just an oversight on my side.

Perhaps what I meant was the userspace should just provide a pointer
to a 64-bit value.

> I'm not quite sure what we want to say in the documentation,
> though I think our general intent was clear:
>
> * you access a particular register at its relevant offset,
>   and you only get that register
> * no support for reading half a register
> * if the register (as documented in the GICv3 architecture
>   specification) is less than 64 bits wide then the returned
>   result is zero-extended to 64 bits on read, and high bits
>   are ignored on write
>   (Only a couple of registers are really 64-bits, notably
>   GICD_IROUTER. The rest are 32-bits. We could probably explicitly
>   list all the 64-bit regs in this doc if we didn't want to defer
>   to the arch spec.)
>
> Do we want to forbid accesses to registers which the architecture
> says can be byte-accessed, like GICD_IPRIORITYR? I think we have
> to, because the kernel API we have here doesn't have any way to
> specify access width, and it would be weird for addresses X+1,
> X+2, X+3 to give you 8 bits of data when X+0 gave you 32.

yes, for the gicv2 API we only allow accesses on 32-bit aligned
boundaries and always assume a 32-bit access.

> What do we mean by the "rw" part? Some registers really are
> architecturally RO, so does this mean "writes to architecturally
> RO registers will succeed but be ignored rather than returning an
> errno" ?

I think my intention was that this would work like the invariant
sysregs, where if you write anything else than what the kernel has
defined, then you get an error. Not sure if this is something we'd
want to do here though. Seems like the GICv2 implementation just
ignores writes in line with the architecture.

>> +
>> +KVM_DEV_ARM_VGIC_GRP_DIST_REGS accesses the main distributor registers.
>> +KVM_DEV_ARM_VGIC_GRP_REDIST_REGS accesses the redistributor of the CPU
>> +specified by the mpidr.
>> +
>> +The offset is relative to the "[Re]Distributor base address" as defined
>> +in the GICv3/4 specs. Getting or setting such a register has the same
>> +effect as reading or writing the register on real hardware, and the mpidr
>> +field is used to specify which redistributor is accessed. The mpidr is
>> +ignored for the distributor.
>> +
>> +The mpidr encoding is based on the affinity information in the
>> +architecture defined MPIDR, and the field is encoded as follows:
>> +  | 63 ... 56 | 55 ... 48 | 47 ... 40 | 39 ... 32 |
>> +  |   Aff3    |   Aff2    |   Aff1    |   Aff0    |
>> +
>> +Note that distributor fields are not banked, but return the same value
>> +regardless of the mpidr used to access the register.
>> +  Limitations:
>> +    - Priorities are not implemented, and registers are RAZ/WI
>> +  Errors:
>> +    -ENXIO: Getting or setting this register is not yet supported
>> +    -EBUSY: One or more VCPUs are running
>> +
>> +
>> +  KVM_DEV_ARM_VGIC_CPU_SYSREGS
>
> In contrast it's fine for sysregs to be all 64-bit, because they correspond
> to CPU system registers which are architecturally 64-bits.

Right, perhaps this was my confusion.

Thanks,
-Christoffer
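The access semantics Peter describes — a sub-64-bit register read through the 64-bit userspace attribute is zero-extended, and the high bits are ignored on write — can be sketched as a couple of helpers. This is a hypothetical illustration of the convention under discussion, not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Reading a 32-bit distributor register through the 64-bit attribute:
 * the value is zero-extended, so bits [63:32] are always zero. */
static uint64_t reg32_get(uint32_t hw_reg)
{
    return (uint64_t)hw_reg;
}

/* Writing a 32-bit register: only the low 32 bits of the user-supplied
 * 64-bit value reach the register; the high bits are ignored. */
static uint32_t reg32_set(uint64_t user_val)
{
    return (uint32_t)(user_val & 0xffffffffu);
}
```

A true 64-bit register such as GICD_IROUTER would instead pass the full 64-bit value through unchanged.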
Intermittent guest kernel crashes with v4.5-rc6.
For some reason the v4.5-rc6 kernel is not stable for guest machines on
Qualcomm server platforms. We are getting IABT translation faults while
booting the guest kernel. The problem disappears with the following code
snippet (insert a "dsb ish" instruction just before switching to the EL1
guest). I am using the v4.5-rc6 kernel for both host and guest machines.

Please let me know if you have any thoughts or ideas for tracing this
problem.

--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -88,6 +88,7 @@ ENTRY(__guest_enter)
 	ldp	x0, x1, [sp], #16
 
 	// Do not touch any register after this!
+	dsb	ish
 	eret
ENDPROC(__guest_enter)

Using the below QEMU command for launching the guest machine:

qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \
  -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
  -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
  -append 'earlycon=earlycon=pl011,0x0900 \
  console=ttyAMA0,115200 root=/dev/ram'

Guest machine crash log messages:

[0.00] Booting Linux on physical CPU 0x0
[0.00] Boot CPU: AArch64 Processor [510f2811]
[0.00] Bad mode in Synchronous Abort handler detected, code 0x860f -- IABT (current EL)
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
[0.00] task: ffc000d52200 ti: ffc000d44000 task.ti: ffc000d44000
[0.00] PC is at early_init_dt_scan_root+0x28/0x94
[0.00] LR is at of_scan_flat_dt+0x9c/0xd0
[0.00] pc : [] lr : [] pstate: 83c5
[0.00] sp : ffc000d47e80
[0.00] x29: ffc000d47e80 x28:

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 02/03/16 13:56, Shanker Donthineni wrote:
> For some reason the v4.5-rc6 kernel is not stable for guest machines on
> Qualcomm server platforms. We are getting IABT translation faults while
> booting the guest kernel. The problem disappears with the following code
> snippet (insert a "dsb ish" instruction just before switching to the EL1
> guest). I am using the v4.5-rc6 kernel for both host and guest machines.
>
> Please let me know if you have any thoughts or ideas for tracing this
> problem.
>
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -88,6 +88,7 @@ ENTRY(__guest_enter)
>  	ldp	x0, x1, [sp], #16
>
>  	// Do not touch any register after this!
> +	dsb	ish
>  	eret
> ENDPROC(__guest_enter)
>
> Using the below QEMU command for launching the guest machine:
>
> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \
>   -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
>   -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
>   -append 'earlycon=earlycon=pl011,0x0900 \
>   console=ttyAMA0,115200 root=/dev/ram'
>
> Guest machine crash log messages:
>
> [0.00] Booting Linux on physical CPU 0x0
> [0.00] Boot CPU: AArch64 Processor [510f2811]
> [0.00] Bad mode in Synchronous Abort handler detected, code 0x860f -- IABT (current EL)
> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
> [0.00] task: ffc000d52200 ti: ffc000d44000 task.ti: ffc000d44000
> [0.00] PC is at early_init_dt_scan_root+0x28/0x94
> [0.00] LR is at of_scan_flat_dt+0x9c/0xd0
> [0.00] pc : [] lr : [] pstate: 83c5
> [0.00] sp : ffc000d47e80
> [0.00] x29: ffc000d47e80 x28:

If you're getting a prefetch abort, it would be interesting to find out
what instruction is there, whether the page is mapped at stage-2 or not,
what are the stage-2 permissions... Basically, a full description of the
memory state.

Also, does it work if you do a "dsb ishst" instead?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 02/03/16 13:56, Shanker Donthineni wrote:
> For some reason the v4.5-rc6 kernel is not stable for guest machines on
> Qualcomm server platforms. We are getting IABT translation faults while
> booting the guest kernel. The problem disappears with the following code
> snippet (insert a "dsb ish" instruction just before switching to the EL1
> guest). I am using the v4.5-rc6 kernel for both host and guest machines.
>
> Please let me know if you have any thoughts or ideas for tracing this
> problem.

Another thing you can try is to find out how far up you can move that
dsb, up to the point where it starts crashing again.

Also, how reproducible is it? Every time? Randomly?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
Re: Intermittent guest kernel crashes with v4.5-rc6.
Hi Marc,

Thanks for your quick reply.

On 03/02/2016 08:16 AM, Marc Zyngier wrote:
> On 02/03/16 13:56, Shanker Donthineni wrote:
>> For some reason the v4.5-rc6 kernel is not stable for guest machines on
>> Qualcomm server platforms. We are getting IABT translation faults while
>> booting the guest kernel. The problem disappears with the following code
>> snippet (insert a "dsb ish" instruction just before switching to the EL1
>> guest). I am using the v4.5-rc6 kernel for both host and guest machines.
>>
>> Please let me know if you have any thoughts or ideas for tracing this
>> problem.
>>
>> --- a/arch/arm64/kvm/hyp/entry.S
>> +++ b/arch/arm64/kvm/hyp/entry.S
>> @@ -88,6 +88,7 @@ ENTRY(__guest_enter)
>>  	ldp	x0, x1, [sp], #16
>>
>>  	// Do not touch any register after this!
>> +	dsb	ish
>>  	eret
>> ENDPROC(__guest_enter)
>>
>> Using the below QEMU command for launching the guest machine:
>>
>> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \
>>   -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
>>   -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
>>   -append 'earlycon=earlycon=pl011,0x0900 \
>>   console=ttyAMA0,115200 root=/dev/ram'
>>
>> Guest machine crash log messages:
>>
>> [0.00] Booting Linux on physical CPU 0x0
>> [0.00] Boot CPU: AArch64 Processor [510f2811]
>> [0.00] Bad mode in Synchronous Abort handler detected, code 0x860f -- IABT (current EL)
>> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
>> [0.00] task: ffc000d52200 ti: ffc000d44000 task.ti: ffc000d44000
>> [0.00] PC is at early_init_dt_scan_root+0x28/0x94
>> [0.00] LR is at of_scan_flat_dt+0x9c/0xd0
>> [0.00] pc : [] lr : [] pstate: 83c5
>> [0.00] sp : ffc000d47e80
>> [0.00] x29: ffc000d47e80 x28:
>
> If you're getting a prefetch abort, it would be interesting to find out
> what instruction is there, whether the page is mapped at stage-2 or not,
> what are the stage-2 permissions... Basically, a full description of the
> memory state.
>
> Also, does it work if you do a "dsb ishst" instead?
>
> Thanks,
>
> 	M.

Most of the time it is faulting at ldr/str instructions. I have verified
the stage-1 page and the corresponding stage-2 page attributes (SH, AP,
PERM), PA etc. after the IABT; everything perfectly matches. I am very
confident that the stage-1/stage-2 MMU page tables are correct.

Instruction "dsb ishst" also fixes the problem.

One more interesting observation: if I retry an instruction fetch that
caused the IABT, the second fetch is successful and I don't see the IABT.
I used the below experimental code to test.

--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -346,6 +346,7 @@ el1_sync:
 	b.eq	el1_undef
 	cmp	x24, #ESR_ELx_EC_BREAKPT_CUR	// debug exception in EL1
 	b.ge	el1_dbg
+	kernel_exit 1
 	b	el1_inv
 el1_da:

-- 
Shanker Donthineni
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Re: [PATCH v2 0/8] arm64: Support for systems without AArch32 state
On Wed, Mar 02, 2016 at 12:19:23AM +0300, Yury Norov wrote:
> On Thu, Feb 25, 2016 at 09:52:40AM +, Suzuki K Poulose wrote:
>> This series add checks to make sure that the AArch32 state is
>> supported before we process the 32bit ID registers. Also
>> checks the same for COMPAT binary execution.
>>
>> (Painfully) applies on top of 4.5-rc5 + [1] + [2].
>>
>> Or it is available here :
>> git://linux-arm.org/linux-skp.git noaarch32/v2-4.5-rc5
>>
>> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/410556.html
>> [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401913.html
>>
>> Changes since V1:
>>  - Prevent changing the personality to PER_LINUX32 by adding
>>    wrapper for personality() syscall.
>>  - Add the check to KVM before initialising a AArch32 vcpu
>>  - Tested on hardware.
>>
>> Btw, linux32 doesn't complain when the personality() syscall fails to change
>> to PER_LINUX32. You can verify the personality by running
>>  $ cat /proc/cpuinfo
>> which would still list the 64bit features for the CPUs.
>
> Hi Suzuki,
>
> I have some troubles with access to appropriate hardware to test
> it, but I didn't forget.
>
> Yury.

Hi Suzuki,

ubuntu@arm64:~$ uname -a
Linux arm64 4.5.0-rc5-00019-g3e330b9 #76 SMP PREEMPT Wed Mar 2 17:46:57 MSK 2016 aarch64 aarch64 aarch64 GNU/Linux

ubuntu@arm64:~$ cat /proc/cpuinfo
processor	: 0-47
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

ubuntu@arm64:~$ file readdir
readdir: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, for GNU/Linux 2.6.32, BuildID[sha1]=aeebc12494450b55a2ab0d39ebd2e121e9085d5c, not stripped

W/o 32_EL0:

ubuntu@arm64:~$ ./readdir
-bash: ./readdir: cannot execute binary file: Exec format error

With 32_EL0 but w/o your patchset: the kernel just hangs (on 4.2 it
printed errors, but that was another machine).

With 32_EL0 and with your patchset:

ubuntu@arm64:~$ ./readdir
-bash: ./readdir: cannot execute binary file: Exec format error

So, everything is looking OK.

Tested-by: Yury Norov

Yury.
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 02/03/16 14:59, Shanker Donthineni wrote:
> Hi Marc,
>
> Thanks for your quick reply.
>
> On 03/02/2016 08:16 AM, Marc Zyngier wrote:
>> On 02/03/16 13:56, Shanker Donthineni wrote:
>>> For some reason the v4.5-rc6 kernel is not stable for guest machines on
>>> Qualcomm server platforms. We are getting IABT translation faults while
>>> booting the guest kernel. The problem disappears with the following code
>>> snippet (insert a "dsb ish" instruction just before switching to the EL1
>>> guest). I am using the v4.5-rc6 kernel for both host and guest machines.
>>>
>>> Please let me know if you have any thoughts or ideas for tracing this
>>> problem.
>>>
>>> --- a/arch/arm64/kvm/hyp/entry.S
>>> +++ b/arch/arm64/kvm/hyp/entry.S
>>> @@ -88,6 +88,7 @@ ENTRY(__guest_enter)
>>>  	ldp	x0, x1, [sp], #16
>>>
>>>  	// Do not touch any register after this!
>>> +	dsb	ish
>>>  	eret
>>> ENDPROC(__guest_enter)
>>>
>>> Using the below QEMU command for launching the guest machine:
>>>
>>> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \
>>>   -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \
>>>   -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \
>>>   -append 'earlycon=earlycon=pl011,0x0900 \
>>>   console=ttyAMA0,115200 root=/dev/ram'
>>>
>>> Guest machine crash log messages:
>>>
>>> [0.00] Booting Linux on physical CPU 0x0
>>> [0.00] Boot CPU: AArch64 Processor [510f2811]
>>> [0.00] Bad mode in Synchronous Abort handler detected, code 0x860f -- IABT (current EL)
>>> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+
>>> [0.00] task: ffc000d52200 ti: ffc000d44000 task.ti: ffc000d44000
>>> [0.00] PC is at early_init_dt_scan_root+0x28/0x94
>>> [0.00] LR is at of_scan_flat_dt+0x9c/0xd0
>>> [0.00] pc : [] lr : [] pstate: 83c5
>>> [0.00] sp : ffc000d47e80
>>> [0.00] x29: ffc000d47e80 x28:
>>
>> If you're getting a prefetch abort, it would be interesting to find out
>> what instruction is there, whether the page is mapped at stage-2 or not,
>> what are the stage-2 permissions... Basically, a full description of the
>> memory state.
>>
>> Also, does it work if you do a "dsb ishst" instead?
>
> Most of the time it is faulting at ldr/str instructions. I have verified
> the stage-1 page and the corresponding stage-2 page attributes (SH, AP,
> PERM), PA etc. after the IABT; everything perfectly matches. I am very
> confident that the stage-1/stage-2 MMU page tables are correct.
>
> Instruction "dsb ishst" also fixes the problem.
>
> One more interesting observation: if I retry an instruction fetch that
> caused the IABT, the second fetch is successful and I don't see the IABT.
> I used the below experimental code to test.
>
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -346,6 +346,7 @@ el1_sync:
>  	b.eq	el1_undef
>  	cmp	x24, #ESR_ELx_EC_BREAKPT_CUR	// debug exception in EL1
>  	b.ge	el1_dbg
> +	kernel_exit 1
>  	b	el1_inv
> el1_da:

OK, that's pretty scary, specially considering that we don't have a DSB
on that path.

Do you ever see it exploding at EL0?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH v2 0/8] arm64: Support for systems without AArch32 state
On Wed, Mar 02, 2016 at 06:07:21PM +0300, Yury Norov wrote:
> ubuntu@arm64:~$ uname -a
> Linux arm64 4.5.0-rc5-00019-g3e330b9 #76 SMP PREEMPT Wed Mar 2 17:46:57 MSK 2016 aarch64 aarch64 aarch64 GNU/Linux
>
> ubuntu@arm64:~$ cat /proc/cpuinfo
> processor	: 0-47
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0

As an aside, please do not hack up custom /proc/cpuinfo formats. It is a
compatibility nightmare for everyone.

Mark.
Re: [PATCH v2 0/8] arm64: Support for systems without AArch32 state
On 02/03/16 15:07, Yury Norov wrote:
> On Wed, Mar 02, 2016 at 12:19:23AM +0300, Yury Norov wrote:
>> On Thu, Feb 25, 2016 at 09:52:40AM +, Suzuki K Poulose wrote:
>>> This series add checks to make sure that the AArch32 state is
>>> supported before we process the 32bit ID registers. Also
>>> checks the same for COMPAT binary execution.
>
> ...
>
> So, everything is looking OK.
>
> Tested-by: Yury Norov

Thanks Yury.

Suzuki
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 03/02/2016 09:09 AM, Marc Zyngier wrote: > On 02/03/16 14:59, Shanker Donthineni wrote: >> Hi Marc, >> >> Thanks for your quick reply. >> >> On 03/02/2016 08:16 AM, Marc Zyngier wrote: >>> On 02/03/16 13:56, Shanker Donthineni wrote: For some reason v4.5-rc6 kernel is not stable for guest machines on Qualcomm server platforms. We are getting IABT translation faults while booting the guest kernel. The problem disappears with the following code snippet (insert "dsb ish" instruction just before switching to EL1 guest). I am using v4.5-rc6 kernel for both host and guest machines. Please let me know if you have any thoughts or ideas for tracing this problem. --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -88,6 +88,7 @@ ENTRY(__guest_enter) ldp x0, x1, [sp], #16 // Do not touch any register after this! + dsb ish eret ENDPROC(__guest_enter) Using below QEMU command for launching guest machine: qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \ -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \ -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \ -append 'earlycon=earlycon=pl011,0x0900 \ console=ttyAMA0,115200 root=/dev/ram' Guest machine crash log messages: [0.00] Booting Linux on physical CPU 0x0 [0.00] Boot CPU: AArch64 Processor [510f2811] [0.00] Bad mode in Synchronous Abort handler detected, code 0x860f -- IABT (current EL) [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+ [0.00] task: ffc000d52200 ti: ffc000d44000 task.ti: ffc000d44000 [0.00] PC is at early_init_dt_scan_root+0x28/0x94 [0.00] LR is at of_scan_flat_dt+0x9c/0xd0 [0.00] pc : [] lr : [] pstate: 83c5 [0.00] sp : ffc000d47e80 [0.00] x29: ffc000d47e80 x28: >>> If you're getting a prefetch abort, it would be interesting to find out >>> what instruction is there, whether the page is mapped at stage-2 or not, >>> what are the stage-2 permissions... Basically, a full description of the >>> memory state. >>> >>> Also, does it work if you do a "dsb ishst" instead? 
>>> >>> Thanks, >>> >>> M. >> Most of the times it is faulting at ldr/str instructions. I have >> verified stage-1 page and the >> the corresponding stage-2 page attributes (SH, AP, PERM), PA etc. after >> IABT, everything >> perfectly matches. I am very confident that stage-1/stage-2 MMU page >> tables are correct. >> >> Instruction "dsb ishst" also fixing the problem. >> >> One more Interesting observation, if retry an instruction fetch that >> caused IABT, second >> time fetch is successful and I don't see IABT. I used below >> experimental code to test. >> >> --- a/arch/arm64/kernel/entry.S >> +++ b/arch/arm64/kernel/entry.S >> @@ -346,6 +346,7 @@ el1_sync: >> b.eqel1_undef >> cmp x24, #ESR_ELx_EC_BREAKPT_CUR// debug exception in EL1 >> b.geel1_dbg >> + kernel_exit 1 >> b el1_inv >> el1_da: >> >> > OK, that's pretty scary, specially considering that we don't have a DSB > on that path. Do you ever see it exploding at EL0? > > Thanks, > > M. We haven't started running heavy workloads in VMs. So far we have noticed this random nature behavior only during guest kernel boot (at EL1). We didn't see this problem on 4.3 kernel. Do you think it is related to TLB conflicts? -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 02/03/16 15:48, Shanker Donthineni wrote:
> We haven't started running heavy workloads in VMs. So far we
> have noticed this random nature behavior only during guest
> kernel boot (at EL1).
>
> We didn't see this problem on 4.3 kernel. Do you think it is
> related to TLB conflicts?

I cannot imagine why a DSB would solve a TLB conflict. But the fact that you didn't see it crashing on 4.3 is a good indication that something else is at play. In 4.5, we've rewritten a large part of KVM in C, which has changed the ordering of the various accesses a lot. It could be that a latent problem is now exposed more widely.

Can you try moving this DSB around and find out what is the earliest point where it solves this problem? Some sort of bisection?

Thanks,

M.

--
Jazz is not dead. It just smells funny...
Re: [PATCH 2/3] MSI-X: update GSI routing after changed MSI-X configuration
Hi, On 02/03/16 01:16, Will Deacon wrote: > On Tue, Mar 01, 2016 at 04:49:37PM +, Andre Przywara wrote: >> When we set up GSI routing to map MSIs to KVM's GSI numbers, we >> write the current device's MSI setup into the kernel routing table. >> However the device driver in the guest can use PCI configuration space >> accesses to change the MSI configuration (address and/or payload data). >> Whenever this happens after we have setup the routing table already, >> we must amend the previously sent data. >> So when MSI-X PCI config space accesses write address or payload, >> find the associated GSI number and the matching routing table entry >> and update the kernel routing table (only if the data has changed). >> >> This fixes vhost-net, where the queue's IRQFD was setup before the >> MSI vectors. >> >> Signed-off-by: Andre Przywara >> --- >> include/kvm/irq.h | 1 + >> irq.c | 31 +++ >> virtio/pci.c | 36 +--- >> 3 files changed, 65 insertions(+), 3 deletions(-) >> >> diff --git a/include/kvm/irq.h b/include/kvm/irq.h >> index bb71521..f35eb7e 100644 >> --- a/include/kvm/irq.h >> +++ b/include/kvm/irq.h >> @@ -21,5 +21,6 @@ int irq__exit(struct kvm *kvm); >> >> int irq__allocate_routing_entry(void); >> int irq__add_msix_route(struct kvm *kvm, struct msi_msg *msg); >> +void irq__update_msix_route(struct kvm *kvm, u32 gsi, struct msi_msg *msg); >> >> #endif >> diff --git a/irq.c b/irq.c >> index 1aee478..25ac8d7 100644 >> --- a/irq.c >> +++ b/irq.c >> @@ -89,6 +89,37 @@ int irq__add_msix_route(struct kvm *kvm, struct msi_msg >> *msg) >> return next_gsi++; >> } >> >> +static bool update_data(u32 *ptr, u32 newdata) >> +{ >> +if (*ptr == newdata) >> +return false; >> + >> +*ptr = newdata; >> +return true; >> +} >> + >> +void irq__update_msix_route(struct kvm *kvm, u32 gsi, struct msi_msg *msg) >> +{ >> +struct kvm_irq_routing_msi *entry; >> +unsigned int i; >> +bool changed; >> + >> +for (i = 0; i < irq_routing->nr; i++) >> +if (gsi == irq_routing->entries[i].gsi) >> 
+break; >> +if (i == irq_routing->nr) >> +return; >> + >> +entry = &irq_routing->entries[i].u.msi; >> + >> +changed = update_data(&entry->address_hi, msg->address_hi); >> +changed |= update_data(&entry->address_lo, msg->address_lo); >> +changed |= update_data(&entry->data, msg->data); >> + >> +if (changed) >> +ioctl(kvm->vm_fd, KVM_SET_GSI_ROUTING, irq_routing); >> +}
>
> What goes wrong if you just call the ioctl every time? Is this actually
> a fast path in practice?

I guess nothing, it's just a lot of needless churn in the kernel. We trap on every word access to the MSI data region and I have seen so many non-updates in there. For instance if the guest updates the payload, it writes the unchanged address parts also and we trap that.

Also please note that this ioctl updates the _whole table_ every time. If you now look at what virt/kvm/kvm_main.c actually does (kmalloc, copy_from_user, kmalloc again, update each entry (with kmallocs), RCU switch over to the new table, free the old table, free, free), I hope you agree that this little extra code in userland is surely worth the effort. I had debug messages in the kernel to chase the bug and the output was huge every time for actually no change at all most of the time.

Cheers,
Andre.
Re: [PATCH v2 09/17] arm64: KVM: vgic-v2: Move GICH_ELRSR saving to its own function
On Wed, Feb 17, 2016 at 04:40:41PM +, Marc Zyngier wrote: > In order to make the saving path slightly more readable and > prepare for some more optimizations, let's more the GICH_ELRSR s/more/move/ > saving to its own function. > > No functional change. > > Signed-off-by: Marc Zyngier > --- > arch/arm64/kvm/hyp/vgic-v2-sr.c | 36 +--- > 1 file changed, 21 insertions(+), 15 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c > index 1bda5ce..c281374 100644 > --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c > +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c > @@ -66,6 +66,25 @@ static void __hyp_text save_maint_int_state(struct > kvm_vcpu *vcpu, > #endif > } > > +static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base) > +{ > + struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; > + int nr_lr = vcpu->arch.vgic_cpu.nr_lr; > + u32 elrsr0, elrsr1; > + > + elrsr0 = readl_relaxed(base + GICH_ELRSR0); > + if (unlikely(nr_lr > 32)) > + elrsr1 = readl_relaxed(base + GICH_ELRSR1); > + else > + elrsr1 = 0; > + > +#ifdef CONFIG_CPU_BIG_ENDIAN > + cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1; > +#else > + cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0; > +#endif > +} > + > /* vcpu is already in the HYP VA space */ > void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu) > { > @@ -73,7 +92,6 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu) > struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; > struct vgic_dist *vgic = &kvm->arch.vgic; > void __iomem *base = kern_hyp_va(vgic->vctrl_base); > - u32 elrsr0, elrsr1; > int i, nr_lr; > > if (!base) > @@ -83,22 +101,10 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu > *vcpu) > cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR); > > if (vcpu->arch.vgic_cpu.live_lrs) { > - elrsr0 = readl_relaxed(base + GICH_ELRSR0); > - cpu_if->vgic_apr= readl_relaxed(base + GICH_APR); > - > - if (unlikely(nr_lr > 32)) { > - elrsr1 = readl_relaxed(base + 
GICH_ELRSR1); > - } else { > - elrsr1 = 0; > - } > - > -#ifdef CONFIG_CPU_BIG_ENDIAN > - cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1; > -#else > - cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0; > -#endif > + cpu_if->vgic_apr = readl_relaxed(base + GICH_APR); > > save_maint_int_state(vcpu, base); > + save_elrsr(vcpu, base); > > for (i = 0; i < nr_lr; i++) > if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i)) > -- > 2.1.4 > Reviewed-by: Christoffer Dall ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2 07/17] arm64: KVM: vgic-v2: Avoid accessing GICH registers
On Wed, Feb 17, 2016 at 04:40:39PM +, Marc Zyngier wrote: > GICv2 registers are *slow*. As in "terrifyingly slow". Which is bad. > But we're equaly bad, as we make a point in accessing them even if > we don't have any interrupt in flight. > > A good solution is to first find out if we have anything useful to > write into the GIC, and if we don't, to simply not do it. This > involves tracking which LRs actually have something valid there. > > Reviewed-by: Christoffer Dall > Signed-off-by: Marc Zyngier nice find with the APR in this one, my review-by is still valid here. Thanks, -Christoffer > --- > arch/arm64/kvm/hyp/vgic-v2-sr.c | 72 > - > include/kvm/arm_vgic.h | 2 ++ > 2 files changed, 52 insertions(+), 22 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c > index e717612..5ab8d63 100644 > --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c > +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c > @@ -38,28 +38,41 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu > *vcpu) > > nr_lr = vcpu->arch.vgic_cpu.nr_lr; > cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR); > - cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR); > - eisr0 = readl_relaxed(base + GICH_EISR0); > - elrsr0 = readl_relaxed(base + GICH_ELRSR0); > - if (unlikely(nr_lr > 32)) { > - eisr1 = readl_relaxed(base + GICH_EISR1); > - elrsr1 = readl_relaxed(base + GICH_ELRSR1); > - } else { > - eisr1 = elrsr1 = 0; > - } > + > + if (vcpu->arch.vgic_cpu.live_lrs) { > + eisr0 = readl_relaxed(base + GICH_EISR0); > + elrsr0 = readl_relaxed(base + GICH_ELRSR0); > + cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR); > + cpu_if->vgic_apr= readl_relaxed(base + GICH_APR); > + > + if (unlikely(nr_lr > 32)) { > + eisr1 = readl_relaxed(base + GICH_EISR1); > + elrsr1 = readl_relaxed(base + GICH_ELRSR1); > + } else { > + eisr1 = elrsr1 = 0; > + } > + > #ifdef CONFIG_CPU_BIG_ENDIAN > - cpu_if->vgic_eisr = ((u64)eisr0 << 32) | eisr1; > - cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1; > + 
cpu_if->vgic_eisr = ((u64)eisr0 << 32) | eisr1; > + cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1; > #else > - cpu_if->vgic_eisr = ((u64)eisr1 << 32) | eisr0; > - cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0; > + cpu_if->vgic_eisr = ((u64)eisr1 << 32) | eisr0; > + cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0; > #endif > - cpu_if->vgic_apr= readl_relaxed(base + GICH_APR); > > - writel_relaxed(0, base + GICH_HCR); > + for (i = 0; i < nr_lr; i++) > + if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i)) > + cpu_if->vgic_lr[i] = readl_relaxed(base + > GICH_LR0 + (i * 4)); > > - for (i = 0; i < nr_lr; i++) > - cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4)); > + writel_relaxed(0, base + GICH_HCR); > + > + vcpu->arch.vgic_cpu.live_lrs = 0; > + } else { > + cpu_if->vgic_eisr = 0; > + cpu_if->vgic_elrsr = ~0UL; > + cpu_if->vgic_misr = 0; > + cpu_if->vgic_apr = 0; > + } > } > > /* vcpu is already in the HYP VA space */ > @@ -70,15 +83,30 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu > *vcpu) > struct vgic_dist *vgic = &kvm->arch.vgic; > void __iomem *base = kern_hyp_va(vgic->vctrl_base); > int i, nr_lr; > + u64 live_lrs = 0; > > if (!base) > return; > > - writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR); > - writel_relaxed(cpu_if->vgic_vmcr, base + GICH_VMCR); > - writel_relaxed(cpu_if->vgic_apr, base + GICH_APR); > - > nr_lr = vcpu->arch.vgic_cpu.nr_lr; > + > for (i = 0; i < nr_lr; i++) > - writel_relaxed(cpu_if->vgic_lr[i], base + GICH_LR0 + (i * 4)); > + if (cpu_if->vgic_lr[i] & GICH_LR_STATE) > + live_lrs |= 1UL << i; > + > + if (live_lrs) { > + writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR); > + writel_relaxed(cpu_if->vgic_apr, base + GICH_APR); > + for (i = 0; i < nr_lr; i++) { > + u32 val = 0; > + > + if (live_lrs & (1UL << i)) > + val = cpu_if->vgic_lr[i]; > + > + writel_relaxed(val, base + GICH_LR0 + (i * 4)); > + } > + } > + > + writel_relaxed(cpu_if->vgic_vmcr, base + GICH_VMCR); > + vcpu->arch.vgic_cpu.live_lrs = live_lrs; > 
} > diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h > index 13a3d53..f473fd6 100644 > --- a/include/kvm/arm_vgic.h > +++ b/include/kvm/arm_vgic.h > @@ -321,6 +321,8 @@ struct vgic_c
Re: [PATCH v2 10/17] arm64: KVM: vgic-v2: Do not save an LR known to be empty
On Wed, Feb 17, 2016 at 04:40:42PM +, Marc Zyngier wrote: > On exit, any empty LR will be signaled in GICH_ELRSR*. Which > means that we do not have to save it, and we can just clear > its state in the in-memory copy. > > Take this opportunity to move the LR saving code into its > own function. > > Signed-off-by: Marc Zyngier > --- > arch/arm64/kvm/hyp/vgic-v2-sr.c | 26 -- > 1 file changed, 20 insertions(+), 6 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c > index c281374..3dbbc6b 100644 > --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c > +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c > @@ -85,6 +85,25 @@ static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, > void __iomem *base) > #endif > } > > +static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base) > +{ > + struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; > + int nr_lr = vcpu->arch.vgic_cpu.nr_lr; > + int i; > + > + for (i = 0; i < nr_lr; i++) { > + if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i))) > + continue; > + > + if (cpu_if->vgic_elrsr & (1UL << i)) { > + cpu_if->vgic_lr[i] &= ~GICH_LR_STATE; > + continue; > + } > + > + cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4)); > + } > +} > + > /* vcpu is already in the HYP VA space */ > void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu) > { > @@ -92,12 +111,10 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu > *vcpu) > struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; > struct vgic_dist *vgic = &kvm->arch.vgic; > void __iomem *base = kern_hyp_va(vgic->vctrl_base); > - int i, nr_lr; > > if (!base) > return; > > - nr_lr = vcpu->arch.vgic_cpu.nr_lr; > cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR); > > if (vcpu->arch.vgic_cpu.live_lrs) { > @@ -105,10 +122,7 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu > *vcpu) > > save_maint_int_state(vcpu, base); > save_elrsr(vcpu, base); > - > - for (i = 0; i < nr_lr; i++) > - if (vcpu->arch.vgic_cpu.live_lrs 
& (1UL << i)) > - cpu_if->vgic_lr[i] = readl_relaxed(base + > GICH_LR0 + (i * 4)); > + save_lrs(vcpu, base); > > writel_relaxed(0, base + GICH_HCR); > > -- > 2.1.4 > Reviewed-by: Christoffer Dall ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
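The trimmed save path of this patch — skip LRs never programmed, and for LRs the hardware already flags empty in GICH_ELRSR just clear the state bits in the shadow copy instead of doing a slow MMIO read — can be modelled in plain C. This is a hedged sketch, not the kernel code: the MMIO access is stubbed out with a `read_lr` callback, and only `GICH_LR_STATE` mirrors the kernel's bit definition.

```c
#include <stdint.h>

#define GICH_LR_STATE (3U << 28) /* pending/active bits, as in the kernel headers */

/* read_lr stands in for readl_relaxed(base + GICH_LR0 + (i * 4)). */
static void save_lrs_sketch(uint32_t *shadow_lr, uint64_t live_lrs,
			    uint64_t elrsr, int nr_lr, uint32_t (*read_lr)(int))
{
	for (int i = 0; i < nr_lr; i++) {
		if (!(live_lrs & (1ULL << i)))
			continue;               /* never programmed on this run */

		if (elrsr & (1ULL << i)) {
			/* ELRSR already told us this LR is empty: clear the
			 * state bits in memory and skip the expensive read. */
			shadow_lr[i] &= ~GICH_LR_STATE;
			continue;
		}

		shadow_lr[i] = read_lr(i);      /* still live: one MMIO read */
	}
}
```

With a GICv2 where every register access is "terrifyingly slow", each read avoided this way is a direct win on the exit path.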
Re: [PATCH v2 08/17] arm64: KVM: vgic-v2: Save maintenance interrupt state only if required
On Wed, Feb 17, 2016 at 04:40:40PM +, Marc Zyngier wrote: > Next on our list of useless accesses is the maintenance interrupt > status registers (GICH_MISR, GICH_EISR{0,1}). > > It is pointless to save them if we haven't asked for a maintenance > interrupt the first place, which can only happen for two reasons: > - Underflow: GICH_HCR_UIE will be set, > - EOI: GICH_LR_EOI will be set. > > These conditions can be checked on the in-memory copies of the regs. > Should any of these two condition be valid, we must read GICH_MISR. > We can then check for GICH_MISR_EOI, and only when set read > GICH_EISR*. > > This means that in most case, we don't have to save them at all. > > Signed-off-by: Marc Zyngier > --- > arch/arm64/kvm/hyp/vgic-v2-sr.c | 54 > +++-- > 1 file changed, 47 insertions(+), 7 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c > index 5ab8d63..1bda5ce 100644 > --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c > +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c > @@ -23,6 +23,49 @@ > > #include "hyp.h" > > +static void __hyp_text save_maint_int_state(struct kvm_vcpu *vcpu, > + void __iomem *base) > +{ > + struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; > + int nr_lr = vcpu->arch.vgic_cpu.nr_lr; > + u32 eisr0, eisr1; > + int i; > + bool expect_mi; > + > + expect_mi = !!(cpu_if->vgic_hcr & GICH_HCR_UIE); > + > + for (i = 0; i < nr_lr; i++) { > + if (!(vcpu->arch.vgic_cpu.live_lrs & (1UL << i))) > + continue; > + > + expect_mi |= (!(cpu_if->vgic_lr[i] & GICH_LR_HW) && > + (cpu_if->vgic_lr[i] & GICH_LR_EOI)); > + } Just eye balling the code it really feels crazy that this is faster than reading two registers, but I believe that may just be the case given the speed of the GIC. 
As an alternative to this loop, you could keep a counter for the number of requested EOI MIs and whenever we program an LR with the EOI bit set, then we increment the counter, and whenever we clear such an LR, we decrement the counter, and then you can just check here if it's non-zero. What do you think? Is it worth it? Does it make the code even worse? I can also write that on top of this patch if you'd like. In any case, for this functionality: Reviewed-by: Christoffer Dall > + > + if (expect_mi) { > + cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR); > + > + if (cpu_if->vgic_misr & GICH_MISR_EOI) { > + eisr0 = readl_relaxed(base + GICH_EISR0); > + if (unlikely(nr_lr > 32)) > + eisr1 = readl_relaxed(base + GICH_EISR1); > + else > + eisr1 = 0; > + } else { > + eisr0 = eisr1 = 0; > + } > + } else { > + cpu_if->vgic_misr = 0; > + eisr0 = eisr1 = 0; > + } > + > +#ifdef CONFIG_CPU_BIG_ENDIAN > + cpu_if->vgic_eisr = ((u64)eisr0 << 32) | eisr1; > +#else > + cpu_if->vgic_eisr = ((u64)eisr1 << 32) | eisr0; > +#endif > +} > + > /* vcpu is already in the HYP VA space */ > void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu) > { > @@ -30,7 +73,7 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu) > struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2; > struct vgic_dist *vgic = &kvm->arch.vgic; > void __iomem *base = kern_hyp_va(vgic->vctrl_base); > - u32 eisr0, eisr1, elrsr0, elrsr1; > + u32 elrsr0, elrsr1; > int i, nr_lr; > > if (!base) > @@ -40,26 +83,23 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu > *vcpu) > cpu_if->vgic_vmcr = readl_relaxed(base + GICH_VMCR); > > if (vcpu->arch.vgic_cpu.live_lrs) { > - eisr0 = readl_relaxed(base + GICH_EISR0); > elrsr0 = readl_relaxed(base + GICH_ELRSR0); > - cpu_if->vgic_misr = readl_relaxed(base + GICH_MISR); > cpu_if->vgic_apr= readl_relaxed(base + GICH_APR); > > if (unlikely(nr_lr > 32)) { > - eisr1 = readl_relaxed(base + GICH_EISR1); > elrsr1 = readl_relaxed(base + GICH_ELRSR1); > } else { > 
- eisr1 = elrsr1 = 0; > + elrsr1 = 0; > } > > #ifdef CONFIG_CPU_BIG_ENDIAN > - cpu_if->vgic_eisr = ((u64)eisr0 << 32) | eisr1; > cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1; > #else > - cpu_if->vgic_eisr = ((u64)eisr1 << 32) | eisr0; > cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0; > #endif > > + save_maint_int_state(vcpu, base); > + > for (i = 0; i < nr_lr; i++) > if (vcpu->arch.vgic_cpu.live_lrs & (1UL << i)) > c
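The counter alternative suggested in the review above — increment when an LR is programmed with the EOI bit set (and HW clear), decrement when such an LR is cleared, then test the counter instead of scanning all live LRs — could look like the following. This is a hedged sketch rather than kernel code; only the bit values mirror the kernel's GICH definitions, and the structure is invented for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit values as in include/linux/irqchip/arm-gic.h */
#define GICH_LR_HW   (1U << 31)
#define GICH_LR_EOI  (1U << 19)
#define GICH_HCR_UIE (1U << 1)

struct vgic_cpu_sketch {
	uint32_t vgic_lr[64];
	uint32_t vgic_hcr;
	unsigned int eoi_mi_count; /* LRs currently requesting an EOI maintenance interrupt */
};

/* Single choke point for LR updates, so the counter can never drift. */
static void set_lr(struct vgic_cpu_sketch *c, int lr, uint32_t val)
{
	uint32_t old = c->vgic_lr[lr];

	if (!(old & GICH_LR_HW) && (old & GICH_LR_EOI))
		c->eoi_mi_count--;
	if (!(val & GICH_LR_HW) && (val & GICH_LR_EOI))
		c->eoi_mi_count++;
	c->vgic_lr[lr] = val;
}

/* Replaces the save-path loop: O(1) instead of scanning all live LRs. */
static bool expect_maint_int(const struct vgic_cpu_sketch *c)
{
	return (c->vgic_hcr & GICH_HCR_UIE) || c->eoi_mi_count;
}
```

The trade-off is exactly the one raised in the review: the loop goes away from the save path, at the cost of routing every LR write through one helper so the counter stays consistent.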
Re: [PATCH v2 11/17] arm64: KVM: vgic-v2: Only wipe LRs on vcpu exit
On Wed, Feb 17, 2016 at 04:40:43PM +, Marc Zyngier wrote:
> So far, we're always writing all possible LRs, setting the empty
> ones with a zero value. This is obvious doing a low of work for

s/low/lot/

> nothing, and we're better off clearing those we've actually
> dirtied on the exit path (it is very rare to inject more than one
> interrupt at a time anyway).
>
> Signed-off-by: Marc Zyngier
> --- > arch/arm64/kvm/hyp/vgic-v2-sr.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c > index 3dbbc6b..e53f131 100644 > --- a/arch/arm64/kvm/hyp/vgic-v2-sr.c > +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c > @@ -101,6 +101,7 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, > void __iomem *base) > } > > cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4)); > + writel_relaxed(0, base + GICH_LR0 + (i * 4)); > } > } > > @@ -158,12 +159,11 @@ void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu > *vcpu) > writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR); > writel_relaxed(cpu_if->vgic_apr, base + GICH_APR); > for (i = 0; i < nr_lr; i++) { > - u32 val = 0; > - > - if (live_lrs & (1UL << i)) > - val = cpu_if->vgic_lr[i]; > + if (!(live_lrs & (1UL << i))) > + continue;

how can we be sure that the LRs are clear when we launch our first VM on a given physical CPU? Don't we need to flush the LRs during VGIC init time?

> > - writel_relaxed(val, base + GICH_LR0 + (i * 4)); > + writel_relaxed(cpu_if->vgic_lr[i], > +base + GICH_LR0 + (i * 4)); > } > } > > -- > 2.1.4 >

otherwise LGTM.

Thanks,
-Christoffer
Re: [PATCH v2 12/17] arm64: KVM: vgic-v2: Make GICD_SGIR quicker to hit
On Wed, Feb 17, 2016 at 04:40:44PM +, Marc Zyngier wrote: > The GICD_SGIR register lives a long way from the beginning of > the handler array, which is searched linearly. As this is hit > pretty often, let's move it up. This saves us some precious > cycles when the guest is generating IPIs. > > Signed-off-by: Marc Zyngier Acked-by: Christoffer Dall > --- > virt/kvm/arm/vgic-v2-emul.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c > index 1390797..1b0bee0 100644 > --- a/virt/kvm/arm/vgic-v2-emul.c > +++ b/virt/kvm/arm/vgic-v2-emul.c > @@ -321,6 +321,11 @@ static bool handle_mmio_sgi_clear(struct kvm_vcpu *vcpu, > > static const struct vgic_io_range vgic_dist_ranges[] = { > { > + .base = GIC_DIST_SOFTINT, > + .len= 4, > + .handle_mmio= handle_mmio_sgi_reg, > + }, > + { > .base = GIC_DIST_CTRL, > .len= 12, > .bits_per_irq = 0, > @@ -387,11 +392,6 @@ static const struct vgic_io_range vgic_dist_ranges[] = { > .handle_mmio= handle_mmio_cfg_reg, > }, > { > - .base = GIC_DIST_SOFTINT, > - .len= 4, > - .handle_mmio= handle_mmio_sgi_reg, > - }, > - { > .base = GIC_DIST_SGI_PENDING_CLEAR, > .len= VGIC_NR_SGIS, > .handle_mmio= handle_mmio_sgi_clear, > -- > 2.1.4 > ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 4/4] pre_init: add ARM implementations
Hi,

On 02/03/16 03:00, Will Deacon wrote:
> On Wed, Feb 24, 2016 at 03:33:08PM +, Andre Przywara wrote:
>> The pre_init stub consists of two syscalls mouting the host's FS
>> via 9pfs and then calling the actual init binary, which can now
>> use normal dynamic linking.
>> Based on the x86 code provide an ARM and ARM64 implementation of
>> that. Beside removing the need for static linkage it reduces the
>> size of the kvmtool binary by quite a lot (numbers for aarch64):
>>
>> -rwxr-xr-x 1 root root    9952 Nov 16 14:37 guest/init
>> -rwxr-xr-x 1 root root     512 Nov 16 14:37 guest/pre_init
>> -rwxr-xr-x 2 root root 1284704 Nov 16 14:37 lkvm
>> vs. the old version:
>> -rwxr-xr-x 1 root root  776024 Nov 16 14:38 guest/init
>> -rwxr-xr-x 2 root root 2050112 Nov 16 14:38 lkvm
>>
>> Tested on Midway and Juno.
>
> Hmm, I'm not super keen on switching behaviour like this on arm, where
> it's not uncommon to build a static lkvm and transfer it to a remote
> target and expect init to work.

So are you concerned about a fully static root file system on the host, which does not provide libc.so and/or ld-linux.so at all? Is that really a use case? I had the impression that people use a statically linked kvmtool to avoid dependencies like to libfdt.so.

In this case I am wondering if we should provide some switch to build a static lkvm with a static init if people are concerned, or we should ship a guest/init binary statically linked against musl libc, for instance: this is only 29K compared to the above multi-100 KB gcc version. Or is there some trick to build small static binaries linked against glibc?

Actually by just looking at init.c: Should we code the whole of it in assembly? Apart from printf it only consists of syscalls.

> Perhaps we could only do this when building a dynamic executable?

This is of course an option as well.

Cheers,
Andre
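The "two syscalls" in question — mount the virtio-9p share, then exec the real init — amount to very little C. The following is a hedged sketch, not kvmtool's actual pre_init: the mount tag ("hostfs"), mount point ("/") and init path ("/virt/init") are illustrative assumptions, and the syscalls are taken as function-pointer parameters so the logic can be exercised without root.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/mount.h>

typedef int (*mount_fn)(const char *src, const char *tgt, const char *type,
			unsigned long flags, const void *data);
typedef int (*exec_fn)(const char *path, char *const argv[]);

static int do_pre_init(mount_fn do_mount, exec_fn do_exec)
{
	char *argv[] = { (char *)"init", NULL };

	/* Syscall 1: mount the host file system shared via virtio-9p. */
	if (do_mount("hostfs", "/", "9p", 0, "trans=virtio,version=9p2000.L"))
		return 1;

	/* Syscall 2: hand over to the dynamically linked init on that FS. */
	do_exec("/virt/init", argv);
	return 1; /* only reached if the exec failed */
}
```

A real stub would simply call `do_pre_init(mount, execv)` from `main`. At this size, hand-coding it in assembly, as floated above, buys little unless even the libc startup code is unwanted.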