Re: [RFC PATCH 0/5] Removing support for 32bit KVM/arm host
On 19.02.20 16:09, Arnd Bergmann wrote:
> On Mon, Feb 10, 2020 at 3:13 PM Marc Zyngier wrote:
>> KVM/arm was merged just over 7 years ago, and has lived a very quiet
>> life so far. It mostly works if you're prepared to deal with its
>> limitations, it has been a good prototype for the arm64 version,
>> but it suffers a few problems:
>>
>> - It is incomplete (no debug support, no PMU)
>> - It hasn't followed any of the architectural evolutions
>> - It has zero users (I don't count myself here)
>> - It is more and more getting in the way of new arm64 developments
>>
>> So here it is: unless someone screams and shows that they rely on
>> KVM/arm to be maintained upstream, I'll remove 32bit host support
>> from the tree. One of the reasons that makes me confident nobody is
>> using it is that I never receive *any* bug report. Yes, it is perfect.
>> But if you depend on KVM/arm being available in mainline, please shout.
>>
>> To reiterate: 32bit guest support for arm64 stays, of course. Only
>> 32bit host goes. Once this is merged, I plan to move virt/kvm/arm to
>> arm64, and cleanup all the now unnecessary abstractions.
>>
>> The patches have been generated with the -D option to avoid spamming
>> everyone with huge diffs, and there is a kvm-arm/goodbye branch in
>> my kernel.org repository.
>
> Just one more thought before it's gone: is there any shared code
> (header files?) that is used by the jailhouse hypervisor?
>
> If there is, are there any plans to merge that into the mainline
> kernel for arm32 in the near future?
>
> I'm guessing the answer to at least one of those questions is 'no',
> so we don't need to worry about it, but it seems better to ask.

Good that you mention it: There is one thing we share on ARM (and ARM64),
and that is the hypervisor enabling stub, to install our own vectors. If
that was to be removed as well, we would have to patch it back downstream.
So far, we only carry a few EXPORT_SYMBOL patches for essential enabling.

That said, I was also starting to think about how long we will continue to
support Jailhouse on 32-bit ARM. We currently have no supported SoC there
that comes with an SMMU, and I doubt we will see one still showing up. So,
Jailhouse on ARM is really just a testing/demo case, maybe useful (but I
didn't get concrete feedback) for cleaner collaborative AMP for real-time
purposes, without security concerns. I assume 32-bit ARM will never be part
of what would be proposed of Jailhouse for upstream.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH] kvm: arm: Promote KVM_ARM_TARGET_CORTEX_A7 to generic V7 core
On 30.06.19 17:19, Jan Kiszka wrote:
> From: Jan Kiszka
>
> The only difference between the currently supported A15 and A7 target
> cores is the reset state of bit 11 in SCTLR. This bit is RES1 or RAO/WI
> in other ARM cores, including ARMv8 ones. By promoting A7 to a generic
> default target, this allows to use yet unsupported core types. E.g.,
> this enables KVM on the A72 of the RPi4.
>
> Signed-off-by: Jan Kiszka
> ---
>  arch/arm/include/uapi/asm/kvm.h                |  1 +
>  arch/arm/kvm/Makefile                          |  2 +-
>  arch/arm/kvm/{coproc_a7.c => coproc_generic.c} | 18 +-
>  arch/arm/kvm/guest.c                           |  4 +---
>  arch/arm/kvm/reset.c                           |  5 +
>  5 files changed, 13 insertions(+), 17 deletions(-)
>  rename arch/arm/kvm/{coproc_a7.c => coproc_generic.c} (70%)
>
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 4602464ebdfb..e0c5bbec3d3d 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -70,6 +70,7 @@ struct kvm_regs {
>  /* Supported Processor Types */
>  #define KVM_ARM_TARGET_CORTEX_A15	0
>  #define KVM_ARM_TARGET_CORTEX_A7	1
> +#define KVM_ARM_TARGET_GENERIC_V7	KVM_ARM_TARGET_CORTEX_A7
>  #define KVM_ARM_NUM_TARGETS		2
>
>  /* KVM_ARM_SET_DEVICE_ADDR ioctl id encoding */
> diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
> index 531e59f5be9c..d959f89135d6 100644
> --- a/arch/arm/kvm/Makefile
> +++ b/arch/arm/kvm/Makefile
> @@ -21,7 +21,7 @@ obj-$(CONFIG_KVM_ARM_HOST) += hyp/
>
>  obj-y += kvm-arm.o init.o interrupts.o
>  obj-y += handle_exit.o guest.o emulate.o reset.o
> -obj-y += coproc.o coproc_a15.o coproc_a7.o vgic-v3-coproc.o
> +obj-y += coproc.o coproc_a15.o coproc_generic.o vgic-v3-coproc.o
>  obj-y += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
>  obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
>  obj-y += $(KVM)/arm/aarch32.o
> diff --git a/arch/arm/kvm/coproc_a7.c b/arch/arm/kvm/coproc_generic.c
> similarity index 70%
> rename from arch/arm/kvm/coproc_a7.c
> rename to arch/arm/kvm/coproc_generic.c
> index 40f643e1e05c..b32a541ad7bf 100644
> --- a/arch/arm/kvm/coproc_a7.c
> +++ b/arch/arm/kvm/coproc_generic.c
> @@ -15,28 +15,28 @@
>  #include "coproc.h"
>
>  /*
> - * Cortex-A7 specific CP15 registers.
> + * Generic CP15 registers.
>   * CRn denotes the primary register number, but is copied to the CRm in the
>   * user space API for 64-bit register access in line with the terminology used
>   * in the ARM ARM.
>   * Important: Must be sorted ascending by CRn, CRM, Op1, Op2 and with 64-bit
>   *            registers preceding 32-bit ones.
>   */
> -static const struct coproc_reg a7_regs[] = {
> +static const struct coproc_reg generic_regs[] = {
>  	/* SCTLR: swapped by interrupt.S. */
>  	{ CRn( 1), CRm( 0), Op1( 0), Op2( 0), is32,
>  			access_vm_reg, reset_val, c1_SCTLR, 0x00C50878 },
>  };
>
> -static struct kvm_coproc_target_table a7_target_table = {
> -	.target = KVM_ARM_TARGET_CORTEX_A7,
> -	.table = a7_regs,
> -	.num = ARRAY_SIZE(a7_regs),
> +static struct kvm_coproc_target_table generic_target_table = {
> +	.target = KVM_ARM_TARGET_GENERIC_V7,
> +	.table = generic_regs,
> +	.num = ARRAY_SIZE(generic_regs),
>  };
>
> -static int __init coproc_a7_init(void)
> +static int __init coproc_generic_init(void)
>  {
> -	kvm_register_target_coproc_table(&a7_target_table);
> +	kvm_register_target_coproc_table(&generic_target_table);
>  	return 0;
>  }
> -late_initcall(coproc_a7_init);
> +late_initcall(coproc_generic_init);
> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
> index 684cf64b4033..d33a24e70f49 100644
> --- a/arch/arm/kvm/guest.c
> +++ b/arch/arm/kvm/guest.c
> @@ -275,12 +275,10 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
>  int __attribute_const__ kvm_target_cpu(void)
>  {
>  	switch (read_cpuid_part()) {
> -	case ARM_CPU_PART_CORTEX_A7:
> -		return KVM_ARM_TARGET_CORTEX_A7;
>  	case ARM_CPU_PART_CORTEX_A15:
>  		return KVM_ARM_TARGET_CORTEX_A15;
>  	default:
> -		return -EINVAL;
> +		return KVM_ARM_TARGET_GENERIC_V7;
>  	}
>  }
>
> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
> index eb4174f6ebbd..d6e07500bab4 100644
> --- a/arch/arm/kvm/reset.c
> +++ b/arch/arm/kvm/reset.c
> @@ -43,13 +43,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  	struct kvm_regs *reset_regs;
>
>  	switch (vcpu->arch.target) {
> -	case KVM_ARM_TARGET_CORTEX
[PATCH] kvm: arm: Promote KVM_ARM_TARGET_CORTEX_A7 to generic V7 core
From: Jan Kiszka

The only difference between the currently supported A15 and A7 target
cores is the reset state of bit 11 in SCTLR. This bit is RES1 or RAO/WI
in other ARM cores, including ARMv8 ones. By promoting A7 to a generic
default target, this allows to use yet unsupported core types. E.g.,
this enables KVM on the A72 of the RPi4.

Signed-off-by: Jan Kiszka
---
 arch/arm/include/uapi/asm/kvm.h                |  1 +
 arch/arm/kvm/Makefile                          |  2 +-
 arch/arm/kvm/{coproc_a7.c => coproc_generic.c} | 18 +-
 arch/arm/kvm/guest.c                           |  4 +---
 arch/arm/kvm/reset.c                           |  5 +
 5 files changed, 13 insertions(+), 17 deletions(-)
 rename arch/arm/kvm/{coproc_a7.c => coproc_generic.c} (70%)

diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 4602464ebdfb..e0c5bbec3d3d 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -70,6 +70,7 @@ struct kvm_regs {
 /* Supported Processor Types */
 #define KVM_ARM_TARGET_CORTEX_A15	0
 #define KVM_ARM_TARGET_CORTEX_A7	1
+#define KVM_ARM_TARGET_GENERIC_V7	KVM_ARM_TARGET_CORTEX_A7
 #define KVM_ARM_NUM_TARGETS		2

 /* KVM_ARM_SET_DEVICE_ADDR ioctl id encoding */
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 531e59f5be9c..d959f89135d6 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -21,7 +21,7 @@ obj-$(CONFIG_KVM_ARM_HOST) += hyp/

 obj-y += kvm-arm.o init.o interrupts.o
 obj-y += handle_exit.o guest.o emulate.o reset.o
-obj-y += coproc.o coproc_a15.o coproc_a7.o vgic-v3-coproc.o
+obj-y += coproc.o coproc_a15.o coproc_generic.o vgic-v3-coproc.o
 obj-y += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
 obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
 obj-y += $(KVM)/arm/aarch32.o
diff --git a/arch/arm/kvm/coproc_a7.c b/arch/arm/kvm/coproc_generic.c
similarity index 70%
rename from arch/arm/kvm/coproc_a7.c
rename to arch/arm/kvm/coproc_generic.c
index 40f643e1e05c..b32a541ad7bf 100644
--- a/arch/arm/kvm/coproc_a7.c
+++ b/arch/arm/kvm/coproc_generic.c
@@ -15,28 +15,28 @@
 #include "coproc.h"

 /*
- * Cortex-A7 specific CP15 registers.
+ * Generic CP15 registers.
  * CRn denotes the primary register number, but is copied to the CRm in the
  * user space API for 64-bit register access in line with the terminology used
  * in the ARM ARM.
  * Important: Must be sorted ascending by CRn, CRM, Op1, Op2 and with 64-bit
  *            registers preceding 32-bit ones.
  */
-static const struct coproc_reg a7_regs[] = {
+static const struct coproc_reg generic_regs[] = {
 	/* SCTLR: swapped by interrupt.S. */
 	{ CRn( 1), CRm( 0), Op1( 0), Op2( 0), is32,
 			access_vm_reg, reset_val, c1_SCTLR, 0x00C50878 },
 };

-static struct kvm_coproc_target_table a7_target_table = {
-	.target = KVM_ARM_TARGET_CORTEX_A7,
-	.table = a7_regs,
-	.num = ARRAY_SIZE(a7_regs),
+static struct kvm_coproc_target_table generic_target_table = {
+	.target = KVM_ARM_TARGET_GENERIC_V7,
+	.table = generic_regs,
+	.num = ARRAY_SIZE(generic_regs),
 };

-static int __init coproc_a7_init(void)
+static int __init coproc_generic_init(void)
 {
-	kvm_register_target_coproc_table(&a7_target_table);
+	kvm_register_target_coproc_table(&generic_target_table);
 	return 0;
 }
-late_initcall(coproc_a7_init);
+late_initcall(coproc_generic_init);
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 684cf64b4033..d33a24e70f49 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -275,12 +275,10 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 int __attribute_const__ kvm_target_cpu(void)
 {
 	switch (read_cpuid_part()) {
-	case ARM_CPU_PART_CORTEX_A7:
-		return KVM_ARM_TARGET_CORTEX_A7;
 	case ARM_CPU_PART_CORTEX_A15:
 		return KVM_ARM_TARGET_CORTEX_A15;
 	default:
-		return -EINVAL;
+		return KVM_ARM_TARGET_GENERIC_V7;
 	}
 }

diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
index eb4174f6ebbd..d6e07500bab4 100644
--- a/arch/arm/kvm/reset.c
+++ b/arch/arm/kvm/reset.c
@@ -43,13 +43,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	struct kvm_regs *reset_regs;

 	switch (vcpu->arch.target) {
-	case KVM_ARM_TARGET_CORTEX_A7:
-	case KVM_ARM_TARGET_CORTEX_A15:
+	default:
 		reset_regs = &cortexa_regs_reset;
 		vcpu->arch.midr = read_cpuid_id();
 		break;
-	default:
-		return -ENODEV;
 	}

 	/* Reset core registers */
-- 
2.16.4
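The effect of the guest.c hunk can be tried outside the kernel with a small stand-alone sketch. It is not the kernel function: `target_for_part()` is a hypothetical stand-in for `kvm_target_cpu()` that takes the part number as a parameter instead of reading it from the CPU, and the `PART_*` values are illustrative constants assumed for this example rather than copied from a kernel header. Only the three target defines come from the patch itself.

```c
#include <stdint.h>

/* Illustrative part numbers, assumed for this sketch only. */
#define PART_CORTEX_A7   0x4100c070u
#define PART_CORTEX_A15  0x4100c0f0u
#define PART_CORTEX_A72  0x4100d080u

/* Target ids as they appear in the patch's uapi hunk. */
#define KVM_ARM_TARGET_CORTEX_A15  0
#define KVM_ARM_TARGET_CORTEX_A7   1
#define KVM_ARM_TARGET_GENERIC_V7  KVM_ARM_TARGET_CORTEX_A7

/*
 * Hypothetical stand-in for kvm_target_cpu() after the patch:
 * only the A15 keeps a dedicated target, every other core falls
 * through to the generic one instead of failing with -EINVAL.
 */
static int target_for_part(uint32_t part)
{
	switch (part) {
	case PART_CORTEX_A15:
		return KVM_ARM_TARGET_CORTEX_A15;
	default:
		return KVM_ARM_TARGET_GENERIC_V7;
	}
}
```

Because `KVM_ARM_TARGET_GENERIC_V7` is an alias for the existing A7 id, userspace that already passes the A7 target value keeps working, and an A72 running in 32-bit mode now resolves to the same id.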
Re: KVM works on RPi4
On 30.06.19 12:18, Jan Kiszka wrote:
> On 30.06.19 11:34, Jan Kiszka wrote:
>> On 30.06.19 00:42, Marc Zyngier wrote:
>>> On Sat, 29 Jun 2019 19:09:37 +0200 Jan Kiszka wrote:
>>>> However, as the Raspberry kernel is not yet ready for 64-bit (and
>>>> upstream is not in sight), I had to use legacy 32-bit mode. And there we
>>>> stumble over the core detection. This little patch made it work, though:
>>>>
>>>> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
>>>> index 2b8de885b2bf..01606aad73cc 100644
>>>> --- a/arch/arm/kvm/guest.c
>>>> +++ b/arch/arm/kvm/guest.c
>>>> @@ -290,6 +290,7 @@ int __attribute_const__ kvm_target_cpu(void)
>>>>  	case ARM_CPU_PART_CORTEX_A7:
>>>>  		return KVM_ARM_TARGET_CORTEX_A7;
>>>>  	case ARM_CPU_PART_CORTEX_A15:
>>>> +	case ARM_CPU_PART_CORTEX_A72:
>>>>  		return KVM_ARM_TARGET_CORTEX_A15;
>>>>  	default:
>>>>  		return -EINVAL;
>>>>
>>>> That raises the question if this is a hack or a valid change and if there
>>>> is general interest in mapping 64-bit cores on 32-bit if they happen to
>>>> run in 32-bit mode.
>>>
>>> The real thing to do here would be to move to a generic target, much
>>> like we did on the 64bit side. Could you investigate that instead? It
>>> would also allow KVM to be used on other 32bit cores such as
>>> A12/A17/A32.
>>
>> You mean something like KVM_ARM_TARGET_GENERIC_V8? Need to study that...
>
> Hmm, looking at what KVM_ARM_TARGET_CORTEX_A7 and ..._A15 differentiates, I
> found nothing so far:
>
> kvm_reset_vcpu:
> 	switch (vcpu->arch.target) {
> 	case KVM_ARM_TARGET_CORTEX_A7:
> 	case KVM_ARM_TARGET_CORTEX_A15:
> 		reset_regs = &cortexa_regs_reset;
> 		vcpu->arch.midr = read_cpuid_id();
> 		break;
>
> And arch/arm/kvm/coproc_a15.c looks like a copy of coproc_a7.c, just with some
> symbols renamed.

OK, found it: The reset values of SCTLR differ, in one bit. A15 starts with
branch prediction (11) off, A7 with that feature enabled. Quite some
boilerplate code for managing a single bit.

For a generic target, can we simply assume A15 reset behaviour?

Jan
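The single-bit difference is easy to check by hand. A minimal sketch, assuming only what the thread states: bit 11 of SCTLR is the one that differs, and 0x00C50878 is the reset value carried in coproc_a7.c (later coproc_generic.c) in the posted patch. The helper names are hypothetical, and the "bit 11 clear" value is derived here by masking, not quoted from coproc_a15.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 11 of SCTLR, the bit the thread identifies as the only difference. */
#define SCTLR_BIT11           (1u << 11)

/* Reset value from the A7/generic coproc table in the posted patch. */
#define SCTLR_RESET_GENERIC   0x00C50878u

/* True if the given SCTLR value has bit 11 set. */
static bool sctlr_bit11_set(uint32_t sctlr)
{
	return (sctlr & SCTLR_BIT11) != 0;
}

/* Same value with bit 11 cleared, i.e. the "A15-like" reset state. */
static uint32_t sctlr_clear_bit11(uint32_t sctlr)
{
	return sctlr & ~SCTLR_BIT11;
}
```

With these helpers the A7/generic value has bit 11 set, and clearing it changes nothing else in the word, which is why a single table with one constant is enough for a generic target.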
Re: KVM works on RPi4
On 30.06.19 11:34, Jan Kiszka wrote:
> On 30.06.19 00:42, Marc Zyngier wrote:
>> On Sat, 29 Jun 2019 19:09:37 +0200 Jan Kiszka wrote:
>>> However, as the Raspberry kernel is not yet ready for 64-bit (and
>>> upstream is not in sight), I had to use legacy 32-bit mode. And there we
>>> stumble over the core detection. This little patch made it work, though:
>>>
>>> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
>>> index 2b8de885b2bf..01606aad73cc 100644
>>> --- a/arch/arm/kvm/guest.c
>>> +++ b/arch/arm/kvm/guest.c
>>> @@ -290,6 +290,7 @@ int __attribute_const__ kvm_target_cpu(void)
>>>  	case ARM_CPU_PART_CORTEX_A7:
>>>  		return KVM_ARM_TARGET_CORTEX_A7;
>>>  	case ARM_CPU_PART_CORTEX_A15:
>>> +	case ARM_CPU_PART_CORTEX_A72:
>>>  		return KVM_ARM_TARGET_CORTEX_A15;
>>>  	default:
>>>  		return -EINVAL;
>>>
>>> That raises the question if this is a hack or a valid change and if there
>>> is general interest in mapping 64-bit cores on 32-bit if they happen to
>>> run in 32-bit mode.
>>
>> The real thing to do here would be to move to a generic target, much
>> like we did on the 64bit side. Could you investigate that instead? It
>> would also allow KVM to be used on other 32bit cores such as
>> A12/A17/A32.
>
> You mean something like KVM_ARM_TARGET_GENERIC_V8? Need to study that...

Hmm, looking at what KVM_ARM_TARGET_CORTEX_A7 and ..._A15 differentiates, I
found nothing so far:

kvm_reset_vcpu:
	switch (vcpu->arch.target) {
	case KVM_ARM_TARGET_CORTEX_A7:
	case KVM_ARM_TARGET_CORTEX_A15:
		reset_regs = &cortexa_regs_reset;
		vcpu->arch.midr = read_cpuid_id();
		break;

And arch/arm/kvm/coproc_a15.c looks like a copy of coproc_a7.c, just with some
symbols renamed.

What's the purpose of all that? Planned for something bigger but never
implemented? From that perspective, there seems to be no need for arch.target
and kvm_coproc_target_table at all.

Jan
Re: KVM works on RPi4
On 30.06.19 00:42, Marc Zyngier wrote:
> On Sat, 29 Jun 2019 19:09:37 +0200 Jan Kiszka wrote:
>
> Hi Jan,
>
>> Hi all,
>>
>> just got KVM running on the Raspberry Pi4. Seems they now embedded all
>> required logic into that new SoC.
>
> Yeah, someone saw the light and decided to enter the 21st century by
> attaching a GICv2 to the thing. Who knows, they may plug a GICv3 and a
> SMMU in 2050 at that rate! ;-)

Optimistic.

>> However, as the Raspberry kernel is not yet ready for 64-bit (and
>> upstream is not in sight), I had to use legacy 32-bit mode. And there we
>> stumble over the core detection. This little patch made it work, though:
>>
>> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
>> index 2b8de885b2bf..01606aad73cc 100644
>> --- a/arch/arm/kvm/guest.c
>> +++ b/arch/arm/kvm/guest.c
>> @@ -290,6 +290,7 @@ int __attribute_const__ kvm_target_cpu(void)
>>  	case ARM_CPU_PART_CORTEX_A7:
>>  		return KVM_ARM_TARGET_CORTEX_A7;
>>  	case ARM_CPU_PART_CORTEX_A15:
>> +	case ARM_CPU_PART_CORTEX_A72:
>>  		return KVM_ARM_TARGET_CORTEX_A15;
>>  	default:
>>  		return -EINVAL;
>>
>> That raises the question if this is a hack or a valid change and if there
>> is general interest in mapping 64-bit cores on 32-bit if they happen to
>> run in 32-bit mode.
>
> The real thing to do here would be to move to a generic target, much
> like we did on the 64bit side. Could you investigate that instead? It
> would also allow KVM to be used on other 32bit cores such as
> A12/A17/A32.

You mean something like KVM_ARM_TARGET_GENERIC_V8? Need to study that...

> Although some would argue that the *real* real thing to do would be
> "rm -rf arch/arm/kvm" and be done with it, but that's a discussion for
> next week... ;-)
>
>> Jan
>>
>> PS: The RPi device tree lacks description of the GICH maintenance
>> interrupts. Seems KVM is fine without that - because it has the
>> information hard-coded or because it can live without that interrupt?
>
> Nah, it really should have an interrupt here. You can end up in a
> situation where new virtual interrupts are delayed until the next
> natural exit if you don't get a maintenance interrupt.

Feels like a bug. Probably just in their DT. How can I check if the
maintenance IRQ is working?

> Anyway, if you know of any effort to get a 64bit kernel on that thing,
> I'm interested in helping. I bought one on Monday, but didn't get a
> chance to do any hacking on it just yet...

I played with compiling the rpi kernel for 64-bit. Lots of pieces from the
graphic drivers are falling from the truck, but you can make it build at
least. Not that it boots so far or gives any early messages. Probably that
is the reason: https://github.com/raspberrypi/linux/issues/3032

Jan
KVM works on RPi4
Hi all,

just got KVM running on the Raspberry Pi4. Seems they now embedded all
required logic into that new SoC.

However, as the Raspberry kernel is not yet ready for 64-bit (and
upstream is not in sight), I had to use legacy 32-bit mode. And there we
stumble over the core detection. This little patch made it work, though:

diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 2b8de885b2bf..01606aad73cc 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -290,6 +290,7 @@ int __attribute_const__ kvm_target_cpu(void)
 	case ARM_CPU_PART_CORTEX_A7:
 		return KVM_ARM_TARGET_CORTEX_A7;
 	case ARM_CPU_PART_CORTEX_A15:
+	case ARM_CPU_PART_CORTEX_A72:
 		return KVM_ARM_TARGET_CORTEX_A15;
 	default:
 		return -EINVAL;

That raises the question if this is a hack or a valid change and if there
is general interest in mapping 64-bit cores on 32-bit if they happen to
run in 32-bit mode.

Jan

PS: The RPi device tree lacks description of the GICH maintenance
interrupts. Seems KVM is fine without that - because it has the
information hard-coded or because it can live without that interrupt?
Re: [PATCH 00/13] arm64: Virtualization Host Extension support
On 2015-08-26 11:28, Antonios Motakis wrote:
>
>
> On 26-Aug-15 11:21, Jan Kiszka wrote:
>> On 2015-08-26 11:12, Antonios Motakis wrote:
>>> Hello Marc,
>>>
>>> On 08-Jul-15 18:19, Marc Zyngier wrote:
>>>> ARMv8.1 comes with the "Virtualization Host Extension" (VHE for
>>>> short), which enables simpler support of Type-2 hypervisors.
>>>>
>>>> This extension allows the kernel to directly run at EL2, and
>>>> significantly reduces the number of system registers shared between
>>>> host and guest, reducing the overhead of virtualization.
>>>>
>>>> In order to have the same kernel binary running on all versions of the
>>>> architecture, this series makes heavy use of runtime code patching.
>>>>
>>>> The first ten patches massage the KVM code to deal with VHE and enable
>>>> Linux to run at EL2.
>>>
>>> I am currently working on getting the Jailhouse hypervisor to work on
>>> AArch64.
>>>
>>> I've been looking at your patches, trying to figure out the implications
>>> for Jailhouse. It seems there are a few :)
>>>
>>> Jailhouse likes to be loaded by Linux into memory, and then to inject
>>> itself at a higher level than Linux (demoting Linux into being the "root
>>> cell"). This works on x86 and ARM (AArch32 and eventually AArch64 without
>>> VHE). What this means in ARM, is that Jailhouse hooks into the HVC stub
>>> exposed by Linux, and happily installs itself in EL2.
>>>
>>> With Linux running in EL2 though, that won't be as straightforward. It
>>> looks like we can't just demote Linux to EL1 without breaking something.
>>> Obviously it's OK for us that KVM won't work, but it looks like at least
>>> the timer code will break horribly if we try to do something like that.
>>>
>>> Any comments on this? One work around would be to just remap the incoming
>>> interrupt from the timer, so Linux never really realizes it's not running
>>> in EL2 anymore. Then we would also have to deal with the intricacies of
>>> removing and re-adding vCPUs to the Linux root cell, so we would have to
>>> maintain the illusion of running in EL2 for each one of them.
>>
>> Without knowing any of the details, I would say there are two strategies
>> regarding this:
>>
>> - Disable KVM support in the Linux kernel - then we shouldn't boot into
>>   EL2 in the first place, should we?
>
> We would have to ask the user to patch the kernel, to ignore VHE and keep all
> the hyp stub magic that we rely on currently. It is an option of course.

Patch or reconfigure? CONFIG_KVM isn't mandatory for arm64, is it?

Jan

>
>>
>> - Emulate what Linux is missing after take-over by Jailhouse (we do
>>   this on x86 with VT-d interrupt remapping which cannot be disabled
>>   anymore for Linux once it started with it, and we cannot boot without
>>   it when we want to use the x2APIC).
>
> Essentially what I described above; let's call it nested virtualization
> without the virtualization parts? :)
>
>>
>> Jan
>>
>

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
Re: [PATCH 00/13] arm64: Virtualization Host Extension support
On 2015-08-26 11:12, Antonios Motakis wrote:
> Hello Marc,
>
> On 08-Jul-15 18:19, Marc Zyngier wrote:
>> ARMv8.1 comes with the "Virtualization Host Extension" (VHE for
>> short), which enables simpler support of Type-2 hypervisors.
>>
>> This extension allows the kernel to directly run at EL2, and
>> significantly reduces the number of system registers shared between
>> host and guest, reducing the overhead of virtualization.
>>
>> In order to have the same kernel binary running on all versions of the
>> architecture, this series makes heavy use of runtime code patching.
>>
>> The first ten patches massage the KVM code to deal with VHE and enable
>> Linux to run at EL2.
>
> I am currently working on getting the Jailhouse hypervisor to work on AArch64.
>
> I've been looking at your patches, trying to figure out the implications for
> Jailhouse. It seems there are a few :)
>
> Jailhouse likes to be loaded by Linux into memory, and then to inject itself
> at a higher level than Linux (demoting Linux into being the "root cell").
> This works on x86 and ARM (AArch32 and eventually AArch64 without VHE). What
> this means in ARM, is that Jailhouse hooks into the HVC stub exposed by
> Linux, and happily installs itself in EL2.
>
> With Linux running in EL2 though, that won't be as straightforward. It looks
> like we can't just demote Linux to EL1 without breaking something. Obviously
> it's OK for us that KVM won't work, but it looks like at least the timer code
> will break horribly if we try to do something like that.
>
> Any comments on this? One work around would be to just remap the incoming
> interrupt from the timer, so Linux never really realizes it's not running in
> EL2 anymore. Then we would also have to deal with the intricacies of removing
> and re-adding vCPUs to the Linux root cell, so we would have to maintain the
> illusion of running in EL2 for each one of them.

Without knowing any of the details, I would say there are two strategies
regarding this:

- Disable KVM support in the Linux kernel - then we shouldn't boot into
  EL2 in the first place, should we?

- Emulate what Linux is missing after take-over by Jailhouse (we do
  this on x86 with VT-d interrupt remapping which cannot be disabled
  anymore for Linux once it started with it, and we cannot boot without
  it when we want to use the x2APIC).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 2015-07-09 12:22, Christoffer Dall wrote:
> Hi Peter and Marc,
>
> [cc'ing Paolo for his input on x86 timekeeping]
>
> On Wed, Jul 08, 2015 at 08:13:59PM +0100, Peter Maydell wrote:
>> On 8 July 2015 at 17:37, Marc Zyngier wrote:
>>> On 08/07/15 17:06, Peter Maydell wrote:
>>>> I'd prefer it if somebody could investigate to see why QEMU is
>>>> actually doing this -- so far we just have speculation.
>>>
>>> I'd prefer that too, but so far people seem to be more comfortable
>>> waiting for the issue to fix itself. In the meantime, VMs are broken in
>>> weird and wonderful ways, and I don't think the current status-quo helps
>>> anyone.
>>
>> Putting in a patch which might not be the right fix isn't
>> necessarily a good plan either...
>>
>> Does has_run_once get cleared if we do a re-VCPU_INIT
>> of a CPU that's run before? (We need to allow rewriting
>> of guest state at that point so that "reset VM and
>> load migration state" behaves correctly.)
>
> no, it does not, has_run_once is set the first time a VCPU is run and is
> currently *never* cleared.
>
>>
>> I suspect Jan is right and we really need to distinguish
>> the KVM_PUT_*_STATE levels in ARM QEMU. This probably
>> implies some kind of whitelist/override mechanism, since
>> by and large we neither know nor want to know the
>> semantics for system registers, we leave that up to the
>> kernel.
>>
>> Q: if you have a running VM, and you pause it for
>> an hour, what should the CNTVCT register do? Presumably
>> it should not advance, but how do we arrange for that
>> to happen?
>>
>
> I think the CNTVCT should not advance when the VM is not scheduled, so
> if we pause the VM or starve all the VCPUs for enough time, the guest
> should not see time progressing, since otherwise the guest scheduler
> cannot maintain fairness and you're bound to see spurious RCU stalls
> etc.
>
> That's exactly why a guest can read both a virtual and physical counter
> and it is an area where you simply want some level of
> paravirtualization. I haven't studied how/if Linux deals with this at
> all.
>
> So I think adjusting CNTVOFF should be managed by the kernel for the
> pause/starvation scenario (which I think Avi once told me x86 does too -
> does anyone know the current state of the art?).
>
> So the only situation where I think userspace should adjust the CNTVOFF
> value is for migration where we are talking about a brand new VM with
> has_run_once clear.
>
> Thus, if we were designing this from scratch now, the API should
> be to return an error when trying to set KVM_REG_ARM_TIMER_CNT after the
> VM has run once, but it's too late for that as we would break userspace.
> The best alternative IMHO would be to merge Marc's patch and fix CNTVOFF
> in the kernel side as well, and finally also fix QEMU so that it doesn't
> try to do the thing that future kernels will ignore.

Fixing QEMU to only write on KVM_PUT_FULL_STATE - yes, that should be
done, but I don't think the approach for the kernel is generally right.

The kernel should not do any policing on user space requests to change
the VCPU or VM state unless

- security is affected
- userspace lacks information, thus cannot decide correctly
- legacy userspace has a bug, we can detect it and want to fix that up
  without affecting future userspace that has a reason to do it
  differently

Regarding CNTVOFF, the first two criteria do not apply for sure. Maybe
the last one, don't know.

Just think of the hypothetical scenario that a userspace VM debugger
wants to inject certain register manipulations. If you block this by
some hidden VM state like proposed, that feature would no longer be
implementable easily.

Jan
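The userspace-side fix mentioned above, writing the counter only on a full state write-back, can be sketched in a few lines. This is not QEMU code: the enum reuses the level names from the thread, but the numeric ordering is an assumption of this example, and `should_write_timer_cnt()` is a hypothetical helper.

```c
#include <stdbool.h>

/*
 * Write-back levels named in the thread. The ordering (runtime lowest,
 * full highest) is assumed here for illustration.
 */
enum put_level {
	KVM_PUT_RUNTIME_STATE = 1,
	KVM_PUT_RESET_STATE   = 2,
	KVM_PUT_FULL_STATE    = 3,
};

/*
 * Hypothetical policy helper: the virtual counter is only written back
 * when the whole VM state is being restored (e.g. after migration),
 * never on the frequent runtime synchronisations.
 */
static bool should_write_timer_cnt(enum put_level level)
{
	return level >= KVM_PUT_FULL_STATE;
}
```

Keeping the decision in userspace matches the argument in the reply: the kernel keeps accepting the write, and a debugger or migration path that really wants to set the counter can still request a full write-back.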
Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running
On 2015-06-25 11:25, Claudio Fontana wrote:
> On 25.06.2015 11:10, Peter Maydell wrote:
>> On 25 June 2015 at 09:59, Claudio Fontana wrote:
>>> Once the VM is created, I think QEMU should not request kvm to
>>> change the virtual offset of the VM anymore: maybe an unexpected
>>> consequence of QEMU's target-arm/kvm64.c::kvm_arch_put_registers ?
>>
>> Hmm. In general we assume that we can:
>> * stop the VM
>> * read all the guest system registers
>> * write those values back again
>> * restart the VM
>>
>> if we need to. Is that what's happening here, or are we doing
>> something odder?
>>
>> -- PMM
>>
>
> What I guess could be happening by looking at the code in linux
>
> virt/kvm/arm/arch_timer.c::kvm_arm_timer_set_reg
>
> is that QEMU tries to set the KVM_REG_ARM_TIMER_CNT register from exactly the
> previous value,
> but just because of the fact that the set function is called, cntvoff is
> updated,
> since the value provided by the user is apparently assumed to be _relative_
> to the physical timer.
>
> This is apparent to me in the code in that function which says:
>
> case KVM_REG_ARM_TIMER_CNT: {
> 	/* ... */
> 	u64 cntvoff = kvm_phys_timer_read() - value;
> 	/* ... */
> }
>
> And this is matched by the corresponding get function kvm_arm_timer_get_reg
> where it says:
>
> case KVM_REG_ARM_TIMER_CNT:
> 	return kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
>
> The time difference between when the GET is issued by QEMU and when the PUT
> is issued then would account for the difference.

QEMU has the concept of write-back levels: KVM_PUT_RUNTIME_STATE,
KVM_PUT_RESET_STATE and KVM_PUT_FULL_STATE. I suspect this register is
just sorted into the wrong category, thus written as part of the
RUNTIME_STATE. We had such bug patterns during the x86 maturing phase as
well.

Jan
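The drift described in the quoted analysis follows directly from the two expressions. A small user-space model, with the physical counter passed in as a plain number, makes it concrete. The function names are hypothetical; only the arithmetic (`phys - cntvoff` on get, `phys - value` on set) is taken from the quoted kernel snippets.

```c
#include <stdint.h>

/* Model of the GET path: virtual count = physical count - offset. */
static uint64_t timer_cnt_get(uint64_t phys_now, uint64_t cntvoff)
{
	return phys_now - cntvoff;
}

/* Model of the SET path: new offset = physical count - requested value. */
static uint64_t timer_cnt_set(uint64_t phys_now, uint64_t value)
{
	return phys_now - value;
}

/*
 * Read the counter at time t_get and write the same value back at
 * time t_put; returns the offset that results from the round trip.
 */
static uint64_t cntvoff_after_roundtrip(uint64_t cntvoff,
					uint64_t t_get, uint64_t t_put)
{
	uint64_t value = timer_cnt_get(t_get, cntvoff);

	return timer_cnt_set(t_put, value);
}
```

If the read and the write happen at the same physical time the offset is unchanged; otherwise it grows by exactly `t_put - t_get`, so the guest's virtual counter appears to lose the time spent between the two calls.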
Re: [PATCH v4 02/12] KVM: define common KVM_GUESTDBG_USE_SW/HW_BP bits
On 2015-05-15 19:33, David Hildenbrand wrote:
>> On 15.05.2015 at 16:27, Alex Bennée wrote:
>>> +++ b/arch/s390/include/uapi/asm/kvm.h
>>> @@ -114,8 +114,6 @@ struct kvm_fpu {
>>> 	__u64 fprs[16];
>>> };
>>>
>>> -#define KVM_GUESTDBG_USE_HW_BP 0x0001
>> [...]
>>> +++ b/include/uapi/linux/kvm.h
>> [...]
>>> +#define KVM_GUESTDBG_USE_SW_BP (1 << 16)
>>> +#define KVM_GUESTDBG_USE_HW_BP (1 << 17)
>>
>> This is an ABI break for s390, no?
>>
>> David, do you remember why we do not use KVM_GUESTDBG_USE_SW_BP?
>>
>
> We never had to tell the kernel about software breakpoints as this is all
> handled via 4 byte DIAG instructions until now. We don't have to turn this
> mechanism on. QEMU can directly insert the desired DIAG instructions and gets
> notified when they are about to get executed.
>
> (But we still have 2 byte breakpoint support todo - still tbd how exactly this
> will be realized - could be turned on via such a mechanism)
>
> The problem is, that these bits are arch specific, now Alex wants to unify
> them for all archs.
>
> So yes, this is an ABI break for us and breaks hardware breakpoints. (I think
> the first version of this patch didn't contain this ABI break when I had a
> look)
>
> I wonder if it wouldn't make more sense to
>
> - introduce new bits in the arch-unspecific section
> - rework the existing implementers to accept both bits
>
> Or to simply leave stuff as it is and handle it via arch specific bits.

With one arch proving the "all need this" theory wrong, just drop this
patch. Even quicker when it breaks an ABI.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
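Why moving the bit is an ABI break, and what "accept both bits" would mean, can be shown with a short sketch. The two constants use the values exactly as they appear in the quoted hunks; the `OLD_`/`NEW_` names and both helper functions are hypothetical and only model how a kernel might test the control word passed in by userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value removed from the s390 header, as shown in the quoted hunk. */
#define OLD_S390_USE_HW_BP    0x0001u
/* Value proposed for the common header, as shown in the quoted hunk. */
#define NEW_COMMON_USE_HW_BP  (1u << 17)

/* A kernel that only knows the new common bit. */
static bool hw_bp_requested_new_only(uint32_t control)
{
	return (control & NEW_COMMON_USE_HW_BP) != 0;
}

/* A kernel reworked to accept both the old and the new bit. */
static bool hw_bp_requested_both(uint32_t control)
{
	return (control & (OLD_S390_USE_HW_BP | NEW_COMMON_USE_HW_BP)) != 0;
}
```

A userspace binary built against the old header sets only the old bit, so the first check silently ignores its request for hardware breakpoints, while the second keeps old binaries working at the cost of reserving the old bit forever.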
Re: "info cpus" issue
On 2015-03-16 15:35, Diana Craciun wrote:
> Hi,
>
> I have played the last couple of days with info CPUs command in qemu and
> discovered two issues with it:
>
> 1. One core is displayed as halted, but the core is actually running ok.
>
> (qemu) info cpus
> * CPU #0: thread_id=400
>   CPU #1: (halted) thread_id=401
>
> Looking a little bit into the qemu code, it seems to be relatively
> benign. qemu displays "halted" on info cpus command depending on the
> value of the halted variable, but this variable does not seem to be
> updated in case of qemu + KVM.
>
> 2. When issuing "info cpus" while the guest is booting bad things
> happen. I saw 3 different behaviours:
> - the guest just freezes during boot
> - the guest crashes (see the crash log below)
> - the host/qemu is displaying this message and the guest freezes:
>
> (qemu) [16777.503115] kvm [400]: load/store instruction decoding not
> implemented
> error: kvm run failed Function not implemented
>
> I did not get the chance to dig into it, but wanted to let you know
> about this, perhaps it is an already known issue?

Can't comment if it's known but, from x86 experience, such a pattern is
usually related to an inconsistency between "get kvm state" and "put kvm
state" in QEMU or the related kernel interfaces: QEMU obtains the
in-kernel CPU state when you issue "info cpus", marks it as "dirty" (in
case other QEMU functions will manipulate it - won't happen in this
case) and then writes it back to the kernel once the guest is resumed on
that vcpu. If the state you get does not fully reflect what you will
write back, you corrupt the guest.

If you want to debug, follow qmp_query_cpus -> cpu_synchronize_state and
kvm_arch_get_registers (triggered by do_kvm_cpu_synchronize_state) vs.
kvm_arch_put_registers (triggered in kvm_cpu_exec).

Jan
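The failure mode described in this reply (get and put not covering the same state) can be illustrated with a toy model. Everything here is hypothetical: the two structs and the three functions are invented for the sketch and do not mirror QEMU or KVM types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-register vcpu; not a real QEMU or KVM structure. */
struct kernel_vcpu { uint32_t pc; uint32_t mode; };
struct user_vcpu   { uint32_t pc; uint32_t mode; bool dirty; };

/* Buggy "get": copies pc, forgets mode, and marks the copy dirty. */
static void get_state_buggy(const struct kernel_vcpu *k, struct user_vcpu *u)
{
    u->pc = k->pc;
    u->dirty = true;
}

/* "put": writes back every field it knows about if the copy is dirty. */
static void put_state(struct kernel_vcpu *k, struct user_vcpu *u)
{
    if (!u->dirty)
        return;
    k->pc = u->pc;
    k->mode = u->mode; /* stale: never refreshed by the buggy get */
    u->dirty = false;
}

/* True if a get/put round trip without any modification is a no-op. */
static bool roundtrip_preserves(struct kernel_vcpu k)
{
    struct kernel_vcpu before = k;
    struct user_vcpu u = { 0 };

    get_state_buggy(&k, &u);
    put_state(&k, &u);

    return k.pc == before.pc && k.mode == before.mode;
}
```

A purely informational query such as "info cpus" then changes guest state as soon as the forgotten field holds anything other than the stale user-space value.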
Re: vexpress: Framebuffer broken with KVM enabled
On 2015-02-16 10:34, Alexander Spyridakis wrote:
> On Mon, Feb 16, 2015 at 2:43 PM, Jan Kiszka wrote:
>> Hi,
>>
>> next issue related to KVM/QEMU on the TK1: The guest image I'm running
>> gives proper framebuffer output when in emulation mode. Once KVM is
>> enabled, the screen is - at best - only initially updated. Sometimes I
>> see the famous tux images and a bit of the console texts, but usually it
>> stays black. Explanations?
>
> Hello Jan,
>
> If you want to force rendering, you can do something similar with the
> following hack:
> https://github.com/virtualopensystems/qemu/commit/64dd1b3e3a2353433edb9c63d00271f515bd06fb
>
> Of course expect performance to not be up to par.

Yep, confirmed - both that it works and that it's slow (better not try
this with SDL, exported via X...).

Thanks,
Jan
Re: vexpress: Framebuffer broken with KVM enabled
On 2015-02-16 10:20, Anup Patel wrote:
> On Mon, Feb 16, 2015 at 2:43 PM, Jan Kiszka wrote:
>> Hi,
>>
>> next issue related to KVM/QEMU on the TK1: The guest image I'm running
>> gives proper framebuffer output when in emulation mode. Once KVM is
>> enabled, the screen is - at best - only initially updated. Sometimes I
>> see the famous tux images and a bit of the console texts, but usually it
>> stays black. Explanations?
>
> The QEMU accesses Guest Video RAM (or any portion of Guest RAM) as
> cacheable user space memory. The Guest Kernel might access Guest Video
> RAM as non-cacheable to maintain coherency with video device. If this is
> the case then all updates by Guest kernel to Guest Video RAM will not
> be visible to QEMU.

On x86, we manage such RAM as a coalesced MMIO region, sync'ing it
periodically or on specific register accesses into the video card model.
I suppose there is nothing like this for the pl111 yet, right?

Jan
vexpress: Framebuffer broken with KVM enabled
Hi,

next issue related to KVM/QEMU on the TK1: The guest image I'm running
gives proper framebuffer output when in emulation mode. Once KVM is
enabled, the screen is - at best - only initially updated. Sometimes I
see the famous tux images and a bit of the console texts, but usually it
stays black. Explanations?

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-16 09:57, Marc Zyngier wrote:
> On 15/02/15 19:03, Jan Kiszka wrote:
>> On 2015-02-15 19:01, Jan Kiszka wrote:
>>> On 2015-02-15 16:30, Marc Zyngier wrote:
>>>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>>>> I'm now throwing trace_printk at my broken KVM. Already
>>>>>>>>> found out that I get ARM_EXCEPTION_IRQ every few 10 µs.
>>>>>>>>> Not seeing any irq_* traces, though. Weird.
>>>>>>>>
>>>>>>>> This very much looks like a screaming interrupt. At such
>>>>>>>> a rate, no wonder your VM doesn't make much progress. Can
>>>>>>>> you find out which interrupt is screaming like this? Looking
>>>>>>>> at GICC_HPPIR should help, but you'll have to map the CPU
>>>>>>>> interface in HYP before being able to access it there.
>>>>>>>
>>>>>>> OK... let me figure this out. I had this suspicion as well -
>>>>>>> the host gets a VM exit for each injected guest IRQ?
>>>>>>
>>>>>> Not exactly. There is a VM exit for each physical interrupt
>>>>>> that fires while the guest is running. Injecting an interrupt
>>>>>> also causes a VM exit, as we force the vcpu to reload its
>>>>>> context.
>>>>>
>>>>> Ah, GICC != GICV - you are referring to host-side pending IRQs.
>>>>> Any hints on how to get access to that register would
>>>>> accelerate the analysis (ARM KVM code is still new to me).
>>>>
>>>> Map the GICC region in HYP using create_hyp_io_mapping (see
>>>> vgic_v2_probe for an example of how we map GICH), and stash the
>>>> read of GICC_HPPIR before leaving HYP mode (and before saving the
>>>> guest timer).
>>>
>>> Hacked on it until it started to work. The results delivered
>>> initially are 0x002 or 0x01e. Then, when the guest gets stuck, I
>>> have 0x01b most of the time (a few 0x01e arrive when there is a
>>> real host irq). The virtual timer on speed?
>>>
>>> Wait, there is also early printk for ARM, but it was off in my
>>> guest! Turning it on confirms we have some problems here:
>>>
>>> Architected timer frequency not available
>>> Division by zero in kernel.
>>>
>>> When in emulation mode, I get:
>>>
>>> Architected cp15 timer(s) running at 62.50MHz (virt).
>>>
>>> Digging deeper.
>>
>> U-Boot didn't initialize CNTFRQ on cores 1..3. Fixing this, the guest
>> passes early boot reliably, now hangs much later (RCU stalls are
>> detected by the guest).
>
> Right, that explains a lot of things. Can you describe a bit more what
> you're seeing now?

Sorry, should have updated this thread:

http://thread.gmane.org/gmane.comp.emulators.kvm.arm.devel/17

This issue is no longer KVM-related.

What might be KVM-related, or also a QEMU issue, is broken framebuffer
support once KVM is enabled in QEMU. Not yet reported, will do soon on
qemu-devel.

Jan
vexpress: Horribly slow MMC emulation on ARM host
Hi,

this basically concludes my problems of getting KVM running on the
Jetson TK1 board with QEMU: all fine now, provided I switch from

    qemu-system-arm -machine vexpress-a15 -sd disk.img ...

to

    qemu-system-arm -machine vexpress-a15 \
        -drive file=disk.img,if=none,id=disk \
        -device virtio-blk-device,drive=disk ...

This applies to both emulated and KVM accelerated mode. If I run the
same image (and guest kernel) emulated on my x86 box, there is still a
difference between both disk modes, but it's not that excessive. On ARM
the system requires minutes to boot from MMC - if it doesn't run into
timeouts earlier. It's seconds with virtio.

Known problem?

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 19:01, Jan Kiszka wrote:
> On 2015-02-15 16:30, Marc Zyngier wrote:
>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>> I'm now throwing trace_printk at my broken KVM. Already
>>>>>>> found out that I get ARM_EXCEPTION_IRQ every few 10 µs.
>>>>>>> Not seeing any irq_* traces, though. Weird.
>>>>>>
>>>>>> This very much looks like a screaming interrupt. At such
>>>>>> a rate, no wonder your VM doesn't make much progress. Can
>>>>>> you find out which interrupt is screaming like this? Looking
>>>>>> at GICC_HPPIR should help, but you'll have to map the CPU
>>>>>> interface in HYP before being able to access it there.
>>>>>
>>>>> OK... let me figure this out. I had this suspicion as well -
>>>>> the host gets a VM exit for each injected guest IRQ?
>>>>
>>>> Not exactly. There is a VM exit for each physical interrupt
>>>> that fires while the guest is running. Injecting an interrupt
>>>> also causes a VM exit, as we force the vcpu to reload its
>>>> context.
>>>
>>> Ah, GICC != GICV - you are referring to host-side pending IRQs.
>>> Any hints on how to get access to that register would
>>> accelerate the analysis (ARM KVM code is still new to me).
>>
>> Map the GICC region in HYP using create_hyp_io_mapping (see
>> vgic_v2_probe for an example of how we map GICH), and stash the
>> read of GICC_HPPIR before leaving HYP mode (and before saving the
>> guest timer).
>
> Hacked on it until it started to work. The results delivered
> initially are 0x002 or 0x01e. Then, when the guest gets stuck, I
> have 0x01b most of the time (a few 0x01e arrive when there is a
> real host irq). The virtual timer on speed?
>
> Wait, there is also early printk for ARM, but it was off in my
> guest! Turning it on confirms we have some problems here:
>
> Architected timer frequency not available
> Division by zero in kernel.
>
> When in emulation mode, I get:
>
> Architected cp15 timer(s) running at 62.50MHz (virt).
>
> Digging deeper.

U-Boot didn't initialize CNTFRQ on cores 1..3. Fixing this, the guest
passes early boot reliably, now hangs much later (RCU stalls are
detected by the guest).

Jan
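Why an uninitialized CNTFRQ ends in "Division by zero" can be seen from any tick-to-time conversion. The helper below is a hypothetical illustration, not the guest kernel's code; it assumes the frequency register simply reads as zero on the affected cores.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert counter ticks to microseconds for a given counter frequency.
 * Returns 0 on success and -1 if the frequency reads as zero, which is
 * the case that would otherwise trap as a division by zero.
 */
static int ticks_to_us(uint64_t ticks, uint32_t cntfrq_hz, uint64_t *us)
{
    if (cntfrq_hz == 0)
        return -1;

    *us = ticks * 1000000ull / cntfrq_hz;
    return 0;
}
```

At the 62.50MHz reported in emulation mode, 625 ticks correspond to 10 µs; with a frequency of zero there is no meaningful answer, so code that skips the check faults instead.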
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 16:30, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out that I
>>>>>> get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_* traces,
>>>>>> though. Weird.
>>>>>
>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>> access it there.
>>>>
>>>> OK... let me figure this out. I had this suspicion as well - the host gets
>>>> a VM exit for each injected guest IRQ?
>>>
>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>> while the guest is running. Injecting an interrupt also causes a VM
>>> exit, as we force the vcpu to reload its context.
>>
>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>> hints on how to get access to that register would accelerate the
>> analysis (ARM KVM code is still new to me).
>
> Map the GICC region in HYP using create_hyp_io_mapping (see
> vgic_v2_probe for an example of how we map GICH), and stash the read of
> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).

Hacked on it until it started to work. The results delivered initially
are 0x002 or 0x01e. Then, when the guest gets stuck, I have 0x01b most
of the time (a few 0x01e arrive when there is a real host irq). The
virtual timer on speed?

Wait, there is also early printk for ARM, but it was off in my guest!
Turning it on confirms we have some problems here:

    Architected timer frequency not available
    Division by zero in kernel.

When in emulation mode, I get:

    Architected cp15 timer(s) running at 62.50MHz (virt).

Digging deeper.

Jan
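The raw GICC_HPPIR values quoted in this message can be decoded with a small helper. The field layout (interrupt ID in bits [9:0], source CPU in bits [12:10]) and the ID ranges follow the GICv2 architecture; the function and enum names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

enum irq_class { IRQ_SGI, IRQ_PPI, IRQ_SPI, IRQ_SPECIAL };

/* Pending interrupt ID: bits [9:0] of GICC_HPPIR. */
static uint32_t hppir_intid(uint32_t hppir)
{
    return hppir & 0x3ff;
}

/* Source CPU (only meaningful for SGIs): bits [12:10]. */
static uint32_t hppir_cpuid(uint32_t hppir)
{
    return (hppir >> 10) & 0x7;
}

/* GICv2 ID ranges: 0-15 SGI, 16-31 PPI, 32-1019 SPI, 1020-1023 special. */
static enum irq_class classify_intid(uint32_t id)
{
    if (id < 16)
        return IRQ_SGI;
    if (id < 32)
        return IRQ_PPI;
    if (id < 1020)
        return IRQ_SPI;
    return IRQ_SPECIAL;
}
```

Decoded this way, 0x002 is SGI 2, while 0x01e and 0x01b are the per-CPU interrupts 30 and 27. In the conventional generic-timer assignment those are the non-secure physical and the virtual timer respectively, which fits the "virtual timer" guess above, although the platform's device tree is the authority on that mapping.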
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 16:59, Christoffer Dall wrote:
> On Sun, Feb 15, 2015 at 04:35:14PM +0100, Jan Kiszka wrote:
>> On 2015-02-15 16:30, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any
>>>>>>>> irq_* traces, though. Weird.
>>>>>>>
>>>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>>>> access it there.
>>>>>>
>>>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>>>> gets a VM exit for each injected guest IRQ?
>>>>>
>>>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>>>> while the guest is running. Injecting an interrupt also causes a VM
>>>>> exit, as we force the vcpu to reload its context.
>>>>
>>>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>>>> hints on how to get access to that register would accelerate the
>>>> analysis (ARM KVM code is still new to me).
>>>
>>> Map the GICC region in HYP using create_hyp_io_mapping (see
>>> vgic_v2_probe for an example of how we map GICH), and stash the read of
>>> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).
>>
>> OK.
>>
>>>
>>> BTW, when you look at /proc/interrupts on the host, don't you see an
>>> interrupt that's a bit too eager to fire?
>>
>> No - but that makes sense given that we do not enter any interrupt
>> handler according to ftrace, thus there can't be any counter incrementation.
>>
>>>
>>>>>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>>>>>> but I guess that's pointless. Linux seems to be stuck on a
>>>>>> non-functional architectural timer then, right?
>>>>>
>>>>> Yes. Useful for bringup, but nothing more.
>>>>
>>>> Maybe we should perform a feature check and issue a warning from QEMU?
>>>
>>> I'd assume this is already in place (but I almost never run QEMU, so I
>>> could be wrong here).
>>
>> Nope, QEMU starts up fine, just lets the guest starve while waiting for
>> jiffies to increase.
>>
>
> you should be able to turn the in-kernel irqchip off with a QEMU
> command-line option and that should prevent the kernel from adding
> an arch-timer. This would only work on the vexpress guest model though,
> since the virt-board doesn't provide an emulated timer as a replacement.

I'm running vexpress, but I only tried legacy -no-kvm-irqchip so far
which was refused. -machine vexpress-a15,kernel_irqchip=off has an
effect: host practically locks up, dmesg - when I'm still able to start
on a different console - gives endless

    "Unexpected interrupt 19 on vcpu ecd39670".

Well, a different smell, but still very fishy.

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 16:30, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out that I
>>>>>> get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_* traces,
>>>>>> though. Weird.
>>>>>
>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>> access it there.
>>>>
>>>> OK... let me figure this out. I had this suspicion as well - the host gets
>>>> a VM exit for each injected guest IRQ?
>>>
>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>> while the guest is running. Injecting an interrupt also causes a VM
>>> exit, as we force the vcpu to reload its context.
>>
>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>> hints on how to get access to that register would accelerate the
>> analysis (ARM KVM code is still new to me).
>
> Map the GICC region in HYP using create_hyp_io_mapping (see
> vgic_v2_probe for an example of how we map GICH), and stash the read of
> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).

OK.

>
> BTW, when you look at /proc/interrupts on the host, don't you see an
> interrupt that's a bit too eager to fire?

No - but that makes sense given that we do not enter any interrupt
handler according to ftrace, thus there can't be any counter incrementation.

>
>>>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>>>> but I guess that's pointless. Linux seems to be stuck on a
>>>> non-functional architectural timer then, right?
>>>
>>> Yes. Useful for bringup, but nothing more.
>>
>> Maybe we should perform a feature check and issue a warning from QEMU?
>
> I'd assume this is already in place (but I almost never run QEMU, so I
> could be wrong here).

Nope, QEMU starts up fine, just lets the guest starve while waiting for
jiffies to increase.

>
>>> I still wonder if the 4+1 design on the K1 is not playing tricks behind
>>> our back. Having talked to Ian Campbell earlier this week, he also can't
>>> manage to run guests in Xen on this platform, so there's something
>>> rather fishy here.
>>
>> Interesting. The announcements of his PSCI patches [1] sounded more
>> promising. Maybe it was only referring to getting the hypervisor itself
>> running...
>
> This is my understanding so far.
>
>> My current (still limited) understanding of that platform is that this
>> little core is parked after power-up of the main APs. And as we do not
>> power them down, there is no reason to perform a cluster switch or
>> anything similarly nasty, no?
>
> I can't see why this would happen, but I've learned not to assume
> anything when it comes to braindead creativity on the HW side...

True.

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 15:59, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>> I'm now throwing trace_printk at my broken KVM. Already found out that I
>>>> get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_* traces,
>>>> though. Weird.
>>>
>>> This very much looks like a screaming interrupt. At such a rate, no
>>> wonder your VM doesn't make much progress. Can you find out which
>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>> but you'll have to map the CPU interface in HYP before being able to
>>> access it there.
>>
>> OK... let me figure this out. I had this suspicion as well - the host gets
>> a VM exit for each injected guest IRQ?
>
> Not exactly. There is a VM exit for each physical interrupt that fires
> while the guest is running. Injecting an interrupt also causes a VM
> exit, as we force the vcpu to reload its context.

Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
hints on how to get access to that register would accelerate the
analysis (ARM KVM code is still new to me).

>
>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>> but I guess that's pointless. Linux seems to be stuck on a
>> non-functional architectural timer then, right?
>
> Yes. Useful for bringup, but nothing more.

Maybe we should perform a feature check and issue a warning from QEMU?

>
>>>
>>> Do you have any form of power-management on this system?
>>
>> Just killed every config that has PM or FREQ in its name, but that
>> makes no difference.
>
> I still wonder if the 4+1 design on the K1 is not playing tricks behind
> our back. Having talked to Ian Campbell earlier this week, he also can't
> manage to run guests in Xen on this platform, so there's something
> rather fishy here.

Interesting. The announcements of his PSCI patches [1] sounded more
promising. Maybe it was only referring to getting the hypervisor itself
running...

My current (still limited) understanding of that platform is that this
little core is parked after power-up of the main APs. And as we do not
power them down, there is no reason to perform a cluster switch or
anything similarly nasty, no?

Jan

PS: For those with such a board in reach, newer U-Boot patches are
available at [2] now.

[1] http://permalink.gmane.org/gmane.comp.boot-loaders.u-boot/208034
[2] https://github.com/siemens/u-boot/commits/jetson-tk1-v2
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 14:37, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>> I'm now throwing trace_printk at my broken KVM. Already found out that I
>> get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_* traces,
>> though. Weird.
>
> This very much looks like a screaming interrupt. At such a rate, no
> wonder your VM doesn't make much progress. Can you find out which
> interrupt is screaming like this? Looking at GICC_HPPIR should help,
> but you'll have to map the CPU interface in HYP before being able to
> access it there.

OK... let me figure this out. I had this suspicion as well - the host
gets a VM exit for each injected guest IRQ?

BTW, I also tried with in-kernel GIC disabled (in the kernel config),
but I guess that's pointless. Linux seems to be stuck on a
non-functional architectural timer then, right?

>
> Do you have any form of power-management on this system?

Just killed every config that has PM or FREQ in its name, but that
makes no difference.

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-13 07:53, Alex Bennée wrote:
>
> Alex Bennée writes:
>
>> Christoffer Dall writes:
>>
>>> On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
>>>
>>>> BTW, KVM tracing support on ARM seems like it requires some care. E.g.:
>>>> kvm_exit does not report an exit reason. The in-kernel vgic also seems
>>>> to lack instrumentation. Unfortunate. Tracing is usually the first stop
>>>> when KVM is stuck on a guest.
>>>
>>> I know, the exit reason is on my todo list, and Alex B is sitting on
>>> trace patches for the gic. Coming soon to a git repo near you.
>>
>> For the impatient the raw patches are in:
>>
>> git.linaro.org/people/alex.bennee/linux.git
>> migration/v3.19-rc7-improve-tracing
>
> OK try tracing/kvm-exit-entry for something cleaner.

Doesn't build for ARM (vcpu_sys_reg is ARM64-only so far). But the
values traced seem useful.

Wei Huang's patch in kvm.git queue traces the exception class, but
unfortunately nothing else. When would we need that class? Do we need it
at all?

In any case, please add symbolic printing of the magic values whenever
possible, just like on x86.

I'm now throwing trace_printk at my broken KVM. Already found out that I
get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_* traces,
though. Weird.

Thanks,
Jan