Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-16 09:57, Marc Zyngier wrote:
> On 15/02/15 19:03, Jan Kiszka wrote:
>> On 2015-02-15 19:01, Jan Kiszka wrote:
>>> On 2015-02-15 16:30, Marc Zyngier wrote:
>>>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>>>>>> traces, though. Weird.
>>>>>>>>
>>>>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>>>>> access it there.
>>>>>>>
>>>>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>>>>> gets a VM exit for each injected guest IRQ?
>>>>>>
>>>>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>>>>> while the guest is running. Injecting an interrupt also causes a VM
>>>>>> exit, as we force the vcpu to reload its context.
>>>>>
>>>>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>>>>> hints on how to get access to that register would accelerate the
>>>>> analysis (ARM KVM code is still new to me).
>>>>
>>>> Map the GICC region in HYP using create_hyp_io_mapping (see
>>>> vgic_v2_probe for an example of how we map GICH), and stash the read of
>>>> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).
>>>
>>> Hacked on it until it started to work. The results delivered initially
>>> are 0x002 or 0x01e. Then, when the guest gets stuck, I have 0x01b most
>>> of the time (a few 0x01e arrive when there is a real host irq). The
>>> virtual timer on speed?
>>>
>>> Wait, there is also early printk for ARM, but it was off in my guest!
>>> Turning it on confirms we have some problems here:
>>>
>>> Architected timer frequency not available
>>> Division by zero in kernel.
>>>
>>> When in emulation mode, I get:
>>>
>>> Architected cp15 timer(s) running at 62.50MHz (virt).
>>>
>>> Digging deeper.
>>
>> U-Boot didn't initialize CNTFRQ on cores 1..3. Fixing this, the guest
>> passes early boot reliably, now hangs much later (RCU stalls are
>> detected by the guest).
>
> Right, that explains a lot of things. Can you describe a bit more what
> you're seeing now?

Sorry, should have updated this thread:

http://thread.gmane.org/gmane.comp.emulators.kvm.arm.devel/17

This issue is no longer KVM-related. What might be KVM-related, or also
a QEMU issue, is broken framebuffer support once KVM is enabled in QEMU.
Not yet reported, will do soon on qemu-devel.

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 15/02/15 19:03, Jan Kiszka wrote:
> On 2015-02-15 19:01, Jan Kiszka wrote:
>> On 2015-02-15 16:30, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>>>>> traces, though. Weird.
>>>>>>>
>>>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>>>> access it there.
>>>>>>
>>>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>>>> gets a VM exit for each injected guest IRQ?
>>>>>
>>>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>>>> while the guest is running. Injecting an interrupt also causes a VM
>>>>> exit, as we force the vcpu to reload its context.
>>>>
>>>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>>>> hints on how to get access to that register would accelerate the
>>>> analysis (ARM KVM code is still new to me).
>>>
>>> Map the GICC region in HYP using create_hyp_io_mapping (see
>>> vgic_v2_probe for an example of how we map GICH), and stash the read of
>>> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).
>>
>> Hacked on it until it started to work. The results delivered initially
>> are 0x002 or 0x01e. Then, when the guest gets stuck, I have 0x01b most
>> of the time (a few 0x01e arrive when there is a real host irq). The
>> virtual timer on speed?
>>
>> Wait, there is also early printk for ARM, but it was off in my guest!
>> Turning it on confirms we have some problems here:
>>
>> Architected timer frequency not available
>> Division by zero in kernel.
>>
>> When in emulation mode, I get:
>>
>> Architected cp15 timer(s) running at 62.50MHz (virt).
>>
>> Digging deeper.
>
> U-Boot didn't initialize CNTFRQ on cores 1..3. Fixing this, the guest
> passes early boot reliably, now hangs much later (RCU stalls are
> detected by the guest).

Right, that explains a lot of things. Can you describe a bit more what
you're seeing now?

thanks,

M.
--
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 19:01, Jan Kiszka wrote:
> On 2015-02-15 16:30, Marc Zyngier wrote:
>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>>>> traces, though. Weird.
>>>>>>
>>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>>> access it there.
>>>>>
>>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>>> gets a VM exit for each injected guest IRQ?
>>>>
>>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>>> while the guest is running. Injecting an interrupt also causes a VM
>>>> exit, as we force the vcpu to reload its context.
>>>
>>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>>> hints on how to get access to that register would accelerate the
>>> analysis (ARM KVM code is still new to me).
>>
>> Map the GICC region in HYP using create_hyp_io_mapping (see
>> vgic_v2_probe for an example of how we map GICH), and stash the read of
>> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).
>
> Hacked on it until it started to work. The results delivered initially
> are 0x002 or 0x01e. Then, when the guest gets stuck, I have 0x01b most
> of the time (a few 0x01e arrive when there is a real host irq). The
> virtual timer on speed?
>
> Wait, there is also early printk for ARM, but it was off in my guest!
> Turning it on confirms we have some problems here:
>
> Architected timer frequency not available
> Division by zero in kernel.
>
> When in emulation mode, I get:
>
> Architected cp15 timer(s) running at 62.50MHz (virt).
>
> Digging deeper.

U-Boot didn't initialize CNTFRQ on cores 1..3. Fixing this, the guest
passes early boot reliably, now hangs much later (RCU stalls are
detected by the guest).

Jan
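The "Division by zero in kernel" symptom above follows naturally from a counter frequency of 0 on the cores where CNTFRQ was never programmed. A minimal user-space sketch of the arithmetic, not the kernel's actual clocksource code: `ticks_to_ns` is a hypothetical helper, and the 62.5 MHz value in the usage note is simply the frequency reported in emulation mode.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/*
 * Hypothetical helper: convert architected-timer ticks to nanoseconds.
 * Returns false when the frequency is 0, which is what a core reports
 * if the boot firmware never wrote its CNTFRQ. Dividing by that value
 * unguarded is the kind of operation that ends in a division by zero.
 */
static bool ticks_to_ns(uint64_t ticks, uint32_t cntfrq_hz, uint64_t *ns)
{
    if (cntfrq_hz == 0)
        return false;   /* frequency not available */

    /* Split into whole seconds and remainder to avoid 64-bit overflow. */
    uint64_t secs = ticks / cntfrq_hz;
    uint64_t rem  = ticks % cntfrq_hz;

    *ns = secs * NSEC_PER_SEC + (rem * NSEC_PER_SEC) / cntfrq_hz;
    return true;
}
```

With a 62.5 MHz counter, `ticks_to_ns(1, 62500000, &ns)` yields 16 ns per tick; with a frequency of 0 the helper refuses instead of dividing.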
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 16:30, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>>> traces, though. Weird.
>>>>>
>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>> access it there.
>>>>
>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>> gets a VM exit for each injected guest IRQ?
>>>
>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>> while the guest is running. Injecting an interrupt also causes a VM
>>> exit, as we force the vcpu to reload its context.
>>
>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>> hints on how to get access to that register would accelerate the
>> analysis (ARM KVM code is still new to me).
>
> Map the GICC region in HYP using create_hyp_io_mapping (see
> vgic_v2_probe for an example of how we map GICH), and stash the read of
> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).

Hacked on it until it started to work. The results delivered initially
are 0x002 or 0x01e. Then, when the guest gets stuck, I have 0x01b most
of the time (a few 0x01e arrive when there is a real host irq). The
virtual timer on speed?

Wait, there is also early printk for ARM, but it was off in my guest!
Turning it on confirms we have some problems here:

Architected timer frequency not available
Division by zero in kernel.

When in emulation mode, I get:

Architected cp15 timer(s) running at 62.50MHz (virt).

Digging deeper.

Jan
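For readers following the GICC_HPPIR values reported above (0x002, 0x01e, 0x01b), here is a small user-space sketch that decodes them. The field layout follows the GICv2 architecture; the timer PPI numbers are an assumption based on the usual Cortex-A15 assignment (26 hypervisor, 27 virtual, 29 secure physical, 30 non-secure physical) and should be checked against the platform's device tree. The helper names are hypothetical, and this is not code from the thread.

```c
#include <stdint.h>

/* GICv2 GICC_HPPIR fields: bits [9:0] hold the interrupt ID,
 * bits [12:10] the requesting CPU (meaningful for SGIs only). */
#define HPPIR_ID_MASK   0x3ffu
#define HPPIR_CPU_SHIFT 10
#define HPPIR_CPU_MASK  0x7u

enum gic_irq_class { GIC_SGI, GIC_PPI, GIC_SPI, GIC_SPECIAL };

struct hppir_info {
    unsigned int id;        /* interrupt ID */
    unsigned int src_cpu;   /* source CPU, SGIs only */
    enum gic_irq_class cls; /* ID range the interrupt falls into */
};

static struct hppir_info decode_hppir(uint32_t hppir)
{
    struct hppir_info info;

    info.id = hppir & HPPIR_ID_MASK;
    info.src_cpu = (hppir >> HPPIR_CPU_SHIFT) & HPPIR_CPU_MASK;

    if (info.id < 16)
        info.cls = GIC_SGI;      /* software-generated interrupt */
    else if (info.id < 32)
        info.cls = GIC_PPI;      /* private peripheral interrupt */
    else if (info.id < 1020)
        info.cls = GIC_SPI;      /* shared peripheral interrupt */
    else
        info.cls = GIC_SPECIAL;  /* 1020-1023; 1023 = nothing pending */

    return info;
}

/* Assumed Cortex-A15-style timer PPI assignment; verify per platform. */
static const char *describe_timer_ppi(unsigned int id)
{
    switch (id) {
    case 26: return "hypervisor timer";
    case 27: return "virtual timer";
    case 29: return "secure physical timer";
    case 30: return "non-secure physical timer";
    default: return "not a timer PPI";
    }
}
```

Under those assumptions, 0x002 decodes to SGI 2, 0x01e to ID 30 (non-secure physical timer) and 0x01b to ID 27 (virtual timer), which matches the "virtual timer on speed" suspicion.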
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 16:59, Christoffer Dall wrote:
> On Sun, Feb 15, 2015 at 04:35:14PM +0100, Jan Kiszka wrote:
>> On 2015-02-15 16:30, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>>>>> traces, though. Weird.
>>>>>>>
>>>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>>>> access it there.
>>>>>>
>>>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>>>> gets a VM exit for each injected guest IRQ?
>>>>>
>>>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>>>> while the guest is running. Injecting an interrupt also causes a VM
>>>>> exit, as we force the vcpu to reload its context.
>>>>
>>>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>>>> hints on how to get access to that register would accelerate the
>>>> analysis (ARM KVM code is still new to me).
>>>
>>> Map the GICC region in HYP using create_hyp_io_mapping (see
>>> vgic_v2_probe for an example of how we map GICH), and stash the read of
>>> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).
>>
>> OK.
>>
>>> BTW, when you look at /proc/interrupts on the host, don't you see an
>>> interrupt that's a bit too eager to fire?
>>
>> No - but that makes sense given that we do not enter any interrupt
>> handler according to ftrace, thus there can't be any counter
>> incrementation.
>>
>>>>>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>>>>>> but I guess that's pointless. Linux seems to be stuck on a
>>>>>> non-functional architectural timer then, right?
>>>>>
>>>>> Yes. Useful for bringup, but nothing more.
>>>>
>>>> Maybe we should perform a feature check and issue a warning from QEMU?
>>>
>>> I'd assume this is already in place (but I almost never run QEMU, so I
>>> could be wrong here).
>>
>> Nope, QEMU starts up fine, just lets the guest starve while waiting for
>> jiffies to increase.
>
> You should be able to turn the in-kernel irqchip off with a QEMU
> command-line option, and that should prevent the kernel from adding
> an arch-timer. This would only work on the vexpress guest model though,
> since the virt-board doesn't provide an emulated timer as a replacement.

I'm running vexpress, but I only tried legacy -no-kvm-irqchip so far,
which was refused. -machine vexpress-a15,kernel_irqchip=off has an
effect: host practically locks up, dmesg - when I'm still able to start
on a different console - gives endless "Unexpected interrupt 19 on vcpu
ecd39670".

Well, a different smell, but still very fishy.

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On Sun, Feb 15, 2015 at 04:35:14PM +0100, Jan Kiszka wrote:
> On 2015-02-15 16:30, Marc Zyngier wrote:
>> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>>>> traces, though. Weird.
>>>>>>
>>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>>> access it there.
>>>>>
>>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>>> gets a VM exit for each injected guest IRQ?
>>>>
>>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>>> while the guest is running. Injecting an interrupt also causes a VM
>>>> exit, as we force the vcpu to reload its context.
>>>
>>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>>> hints on how to get access to that register would accelerate the
>>> analysis (ARM KVM code is still new to me).
>>
>> Map the GICC region in HYP using create_hyp_io_mapping (see
>> vgic_v2_probe for an example of how we map GICH), and stash the read of
>> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).
>
> OK.
>
>> BTW, when you look at /proc/interrupts on the host, don't you see an
>> interrupt that's a bit too eager to fire?
>
> No - but that makes sense given that we do not enter any interrupt
> handler according to ftrace, thus there can't be any counter
> incrementation.
>
>>>>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>>>>> but I guess that's pointless. Linux seems to be stuck on a
>>>>> non-functional architectural timer then, right?
>>>>
>>>> Yes. Useful for bringup, but nothing more.
>>>
>>> Maybe we should perform a feature check and issue a warning from QEMU?
>>
>> I'd assume this is already in place (but I almost never run QEMU, so I
>> could be wrong here).
>
> Nope, QEMU starts up fine, just lets the guest starve while waiting for
> jiffies to increase.

You should be able to turn the in-kernel irqchip off with a QEMU
command-line option, and that should prevent the kernel from adding
an arch-timer. This would only work on the vexpress guest model though,
since the virt-board doesn't provide an emulated timer as a replacement.

-Christoffer
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 16:30, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
>> On 2015-02-15 15:59, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>>> traces, though. Weird.
>>>>>
>>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>>> wonder your VM doesn't make much progress. Can you find out which
>>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>>> but you'll have to map the CPU interface in HYP before being able to
>>>>> access it there.
>>>>
>>>> OK... let me figure this out. I had this suspicion as well - the host
>>>> gets a VM exit for each injected guest IRQ?
>>>
>>> Not exactly. There is a VM exit for each physical interrupt that fires
>>> while the guest is running. Injecting an interrupt also causes a VM
>>> exit, as we force the vcpu to reload its context.
>>
>> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
>> hints on how to get access to that register would accelerate the
>> analysis (ARM KVM code is still new to me).
>
> Map the GICC region in HYP using create_hyp_io_mapping (see
> vgic_v2_probe for an example of how we map GICH), and stash the read of
> GICC_HPPIR before leaving HYP mode (and before saving the guest timer).

OK.

> BTW, when you look at /proc/interrupts on the host, don't you see an
> interrupt that's a bit too eager to fire?

No - but that makes sense given that we do not enter any interrupt
handler according to ftrace, thus there can't be any counter
incrementation.

>>>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>>>> but I guess that's pointless. Linux seems to be stuck on a
>>>> non-functional architectural timer then, right?
>>>
>>> Yes. Useful for bringup, but nothing more.
>>
>> Maybe we should perform a feature check and issue a warning from QEMU?
>
> I'd assume this is already in place (but I almost never run QEMU, so I
> could be wrong here).

Nope, QEMU starts up fine, just lets the guest starve while waiting for
jiffies to increase.

>>> I still wonder if the 4+1 design on the K1 is not playing tricks behind
>>> our back. Having talked to Ian Campbell earlier this week, he also can't
>>> manage to run guests in Xen on this platform, so there's something
>>> rather fishy here.
>>
>> Interesting. The announcements of his PSCI patches [1] sounded more
>> promising. Maybe it was only referring to getting the hypervisor itself
>> running...
>
> This is my understanding so far.
>
>> My current (still limited) understanding of that platform would say
>> that this little core is parked after power-up of the main APs. And as
>> we do not power them down, there is no reason to perform a cluster
>> switch or anything similarly nasty, no?
>
> I can't see why this would happen, but I've learned not to assume
> anything when it comes to braindead creativity on the HW side...

True.

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On Sun, Feb 15 2015 at 3:07:50 pm GMT, Jan Kiszka wrote:
> On 2015-02-15 15:59, Marc Zyngier wrote:
>> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>>> traces, though. Weird.
>>>>
>>>> This very much looks like a screaming interrupt. At such a rate, no
>>>> wonder your VM doesn't make much progress. Can you find out which
>>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>>> but you'll have to map the CPU interface in HYP before being able to
>>>> access it there.
>>>
>>> OK... let me figure this out. I had this suspicion as well - the host
>>> gets a VM exit for each injected guest IRQ?
>>
>> Not exactly. There is a VM exit for each physical interrupt that fires
>> while the guest is running. Injecting an interrupt also causes a VM
>> exit, as we force the vcpu to reload its context.
>
> Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
> hints on how to get access to that register would accelerate the
> analysis (ARM KVM code is still new to me).

Map the GICC region in HYP using create_hyp_io_mapping (see
vgic_v2_probe for an example of how we map GICH), and stash the read of
GICC_HPPIR before leaving HYP mode (and before saving the guest timer).

BTW, when you look at /proc/interrupts on the host, don't you see an
interrupt that's a bit too eager to fire?

>>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>>> but I guess that's pointless. Linux seems to be stuck on a
>>> non-functional architectural timer then, right?
>>
>> Yes. Useful for bringup, but nothing more.
>
> Maybe we should perform a feature check and issue a warning from QEMU?

I'd assume this is already in place (but I almost never run QEMU, so I
could be wrong here).

>> I still wonder if the 4+1 design on the K1 is not playing tricks behind
>> our back. Having talked to Ian Campbell earlier this week, he also can't
>> manage to run guests in Xen on this platform, so there's something
>> rather fishy here.
>
> Interesting. The announcements of his PSCI patches [1] sounded more
> promising. Maybe it was only referring to getting the hypervisor itself
> running...

This is my understanding so far.

> My current (still limited) understanding of that platform would say
> that this little core is parked after power-up of the main APs. And as
> we do not power them down, there is no reason to perform a cluster
> switch or anything similarly nasty, no?

I can't see why this would happen, but I've learned not to assume
anything when it comes to braindead creativity on the HW side...

M.
--
Without deviation from the norm, progress is not possible.
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 15:59, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
>> On 2015-02-15 14:37, Marc Zyngier wrote:
>>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>>> traces, though. Weird.
>>>
>>> This very much looks like a screaming interrupt. At such a rate, no
>>> wonder your VM doesn't make much progress. Can you find out which
>>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>>> but you'll have to map the CPU interface in HYP before being able to
>>> access it there.
>>
>> OK... let me figure this out. I had this suspicion as well - the host
>> gets a VM exit for each injected guest IRQ?
>
> Not exactly. There is a VM exit for each physical interrupt that fires
> while the guest is running. Injecting an interrupt also causes a VM
> exit, as we force the vcpu to reload its context.

Ah, GICC != GICV - you are referring to host-side pending IRQs. Any
hints on how to get access to that register would accelerate the
analysis (ARM KVM code is still new to me).

>> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
>> but I guess that's pointless. Linux seems to be stuck on a
>> non-functional architectural timer then, right?
>
> Yes. Useful for bringup, but nothing more.

Maybe we should perform a feature check and issue a warning from QEMU?

>>> Do you have any form of power-management on this system?
>>
>> Just killed every config that has PM or FREQ in its name, but that
>> makes no difference.
>
> I still wonder if the 4+1 design on the K1 is not playing tricks behind
> our back. Having talked to Ian Campbell earlier this week, he also can't
> manage to run guests in Xen on this platform, so there's something
> rather fishy here.

Interesting. The announcements of his PSCI patches [1] sounded more
promising. Maybe it was only referring to getting the hypervisor itself
running...

My current (still limited) understanding of that platform would say
that this little core is parked after power-up of the main APs. And as
we do not power them down, there is no reason to perform a cluster
switch or anything similarly nasty, no?

Jan

PS: For those with such a board in reach, newer U-Boot patches are
available at [2] now.

[1] http://permalink.gmane.org/gmane.comp.boot-loaders.u-boot/208034
[2] https://github.com/siemens/u-boot/commits/jetson-tk1-v2
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On Sun, Feb 15 2015 at 2:40:40 pm GMT, Jan Kiszka wrote:
> On 2015-02-15 14:37, Marc Zyngier wrote:
>> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>>> I'm now throwing trace_printk at my broken KVM. Already found out
>>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>>> traces, though. Weird.
>>
>> This very much looks like a screaming interrupt. At such a rate, no
>> wonder your VM doesn't make much progress. Can you find out which
>> interrupt is screaming like this? Looking at GICC_HPPIR should help,
>> but you'll have to map the CPU interface in HYP before being able to
>> access it there.
>
> OK... let me figure this out. I had this suspicion as well - the host
> gets a VM exit for each injected guest IRQ?

Not exactly. There is a VM exit for each physical interrupt that fires
while the guest is running. Injecting an interrupt also causes a VM
exit, as we force the vcpu to reload its context.

> BTW, I also tried with in-kernel GIC disabled (in the kernel config),
> but I guess that's pointless. Linux seems to be stuck on a
> non-functional architectural timer then, right?

Yes. Useful for bringup, but nothing more.

>> Do you have any form of power-management on this system?
>
> Just killed every config that has PM or FREQ in its name, but that
> makes no difference.

I still wonder if the 4+1 design on the K1 is not playing tricks behind
our back. Having talked to Ian Campbell earlier this week, he also can't
manage to run guests in Xen on this platform, so there's something
rather fishy here.

M.
--
Without deviation from the norm, progress is not possible.
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-15 14:37, Marc Zyngier wrote:
> On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
>> I'm now throwing trace_printk at my broken KVM. Already found out
>> that I get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_*
>> traces, though. Weird.
>
> This very much looks like a screaming interrupt. At such a rate, no
> wonder your VM doesn't make much progress. Can you find out which
> interrupt is screaming like this? Looking at GICC_HPPIR should help,
> but you'll have to map the CPU interface in HYP before being able to
> access it there.

OK... let me figure this out. I had this suspicion as well - the host
gets a VM exit for each injected guest IRQ?

BTW, I also tried with in-kernel GIC disabled (in the kernel config),
but I guess that's pointless. Linux seems to be stuck on a
non-functional architectural timer then, right?

> Do you have any form of power-management on this system?

Just killed every config that has PM or FREQ in its name, but that
makes no difference.

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On Sun, Feb 15 2015 at 8:53:30 am GMT, Jan Kiszka wrote:
> On 2015-02-13 07:53, Alex Bennée wrote:
>> Alex Bennée writes:
>>> Christoffer Dall writes:
>>>> On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
>>>>> BTW, KVM tracing support on ARM seems like it requires some care.
>>>>> E.g.: kvm_exit does not report an exit reason. The in-kernel vgic
>>>>> also seems to lack instrumentation. Unfortunate. Tracing is usually
>>>>> the first stop when KVM is stuck on a guest.
>>>>
>>>> I know, the exit reason is on my todo list, and Alex B is sitting on
>>>> trace patches for the gic. Coming soon to a git repo near you.
>>>
>>> For the impatient the raw patches are in:
>>>
>>> git.linaro.org/people/alex.bennee/linux.git
>>> migration/v3.19-rc7-improve-tracing
>>
>> OK try tracing/kvm-exit-entry for something cleaner.
>
> Doesn't build for ARM (vcpu_sys_reg is ARM64-only so far).
>
> But the values traced seem useful. Wei Huang's patch in kvm.git queue
> traces the exception class, but unfortunately nothing else. When would
> we need that class? Do we need it at all?
>
> In any case, please add symbolic printing of the magic values whenever
> possible, just like on x86.
>
> I'm now throwing trace_printk at my broken KVM. Already found out that I
> get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_* traces,
> though. Weird.

This very much looks like a screaming interrupt. At such a rate, no
wonder your VM doesn't make much progress. Can you find out which
interrupt is screaming like this? Looking at GICC_HPPIR should help,
but you'll have to map the CPU interface in HYP before being able to
access it there.

Do you have any form of power-management on this system?

Thanks,

M.
--
Without deviation from the norm, progress is not possible.
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On 2015-02-13 07:53, Alex Bennée wrote:
> Alex Bennée writes:
>> Christoffer Dall writes:
>>> On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
>>>> BTW, KVM tracing support on ARM seems like it requires some care.
>>>> E.g.: kvm_exit does not report an exit reason. The in-kernel vgic
>>>> also seems to lack instrumentation. Unfortunate. Tracing is usually
>>>> the first stop when KVM is stuck on a guest.
>>>
>>> I know, the exit reason is on my todo list, and Alex B is sitting on
>>> trace patches for the gic. Coming soon to a git repo near you.
>>
>> For the impatient the raw patches are in:
>>
>> git.linaro.org/people/alex.bennee/linux.git
>> migration/v3.19-rc7-improve-tracing
>
> OK try tracing/kvm-exit-entry for something cleaner.

Doesn't build for ARM (vcpu_sys_reg is ARM64-only so far).

But the values traced seem useful. Wei Huang's patch in kvm.git queue
traces the exception class, but unfortunately nothing else. When would
we need that class? Do we need it at all?

In any case, please add symbolic printing of the magic values whenever
possible, just like on x86.

I'm now throwing trace_printk at my broken KVM. Already found out that I
get ARM_EXCEPTION_IRQ every few 10 µs. Not seeing any irq_* traces,
though. Weird.

Thanks,
Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
Alex Bennée writes:
> Christoffer Dall writes:
>> On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
>>> BTW, KVM tracing support on ARM seems like it requires some care.
>>> E.g.: kvm_exit does not report an exit reason. The in-kernel vgic
>>> also seems to lack instrumentation. Unfortunate. Tracing is usually
>>> the first stop when KVM is stuck on a guest.
>>
>> I know, the exit reason is on my todo list, and Alex B is sitting on
>> trace patches for the gic. Coming soon to a git repo near you.
>
> For the impatient the raw patches are in:
>
> git.linaro.org/people/alex.bennee/linux.git
> migration/v3.19-rc7-improve-tracing

OK try tracing/kvm-exit-entry for something cleaner.

--
Alex Bennée
Re: arm: warning at virt/kvm/arm/vgic.c:1468
Christoffer Dall writes:

> Hi Jan,
>
> On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
>> Hi,
>>
>> after fixing the VM_BUG_ON, my QEMU guest on the Jetson TK1 generally
>> refuses to boot. Once in a while it does, but quickly gets stuck again.
>> In one case I found this in the kernel log (never happened again so
>> far):
>>
>> [ 762.022874] WARNING: CPU: 1 PID: 972 at
>> ../arch/arm/kvm/../../../virt/kvm/arm/vgic.c:1468
>> kvm_vgic_sync_hwstate+0x314/0x344()
>> [ 762.022884] Modules linked in:
>> [ 762.022902] CPU: 1 PID: 972 Comm: qemu-system-arm Not tainted
>> 3.19.0-rc7-00221-gfd7a168-dirty #13
>> [ 762.022911] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>> [ 762.022937] [] (unwind_backtrace) from []
>> (show_stack+0x10/0x14)
>> [ 762.022958] [] (show_stack) from []
>> (dump_stack+0x98/0xd8)
>> [ 762.022976] [] (dump_stack) from []
>> (warn_slowpath_common+0x80/0xb0)
>> [ 762.022991] [] (warn_slowpath_common) from []
>> (warn_slowpath_null+0x1c/0x24)
>> [ 762.023007] [] (warn_slowpath_null) from []
>> (kvm_vgic_sync_hwstate+0x314/0x344)
>> [ 762.023024] [] (kvm_vgic_sync_hwstate) from []
>> (kvm_arch_vcpu_ioctl_run+0x210/0x400)
>> [ 762.023041] [] (kvm_arch_vcpu_ioctl_run) from []
>> (kvm_vcpu_ioctl+0x2e4/0x6ec)
>> [ 762.023059] [] (kvm_vcpu_ioctl) from []
>> (do_vfs_ioctl+0x40c/0x600)
>> [ 762.023076] [] (do_vfs_ioctl) from []
>> (SyS_ioctl+0x34/0x5c)
>> [ 762.023091] [] (SyS_ioctl) from []
>> (ret_fast_syscall+0x0/0x34)
>
> so this means your guest caused a maintenance interrupt and the bit is
> set in the GICH_EISR for the LR in question but the list register state
> is not 0, which is in direct violation of the GIC spec. H.
>
> You're not doing any IRQ forwarding stuff or device passthrough here are
> you?
>
>>
>>
>> BTW, KVM tracing support on ARM seems like it requires some care. E.g.:
>> kvm_exit does not report an exit reason. The in-kernel vgic also seems
>> to lack instrumentation. Unfortunate. Tracing is usually the first stop
>> when KVM is stuck on a guest.
>
> I know, the exit reason is on my todo list, and Alex B is sitting on
> trace patches for the gic. Coming soon to a git repo near you.

For the impatient the raw patches are in:

git.linaro.org/people/alex.bennee/linux.git
migration/v3.19-rc7-improve-tracing

But I'll be cleaning the tracing ones up and separating them from the
rest over the next few days.

>
> -Christoffer

--
Alex Bennée
Re: arm: warning at virt/kvm/arm/vgic.c:1468
On Fri, Feb 13, 2015 at 07:21:20AM +0100, Jan Kiszka wrote:
> Hi Christoffer,
>
> On 2015-02-13 05:46, Christoffer Dall wrote:
> > Hi Jan,
> >
> > On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
> >> Hi,
> >>
> >> after fixing the VM_BUG_ON, my QEMU guest on the Jetson TK1 generally
> >> refuses to boot. Once in a while it does, but quickly gets stuck again.
> >> In one case I found this in the kernel log (never happened again so
> >> far):
> >>
> >> [ 762.022874] WARNING: CPU: 1 PID: 972 at
> >> ../arch/arm/kvm/../../../virt/kvm/arm/vgic.c:1468
> >> kvm_vgic_sync_hwstate+0x314/0x344()
> >> [ 762.022884] Modules linked in:
> >> [ 762.022902] CPU: 1 PID: 972 Comm: qemu-system-arm Not tainted
> >> 3.19.0-rc7-00221-gfd7a168-dirty #13
> >> [ 762.022911] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> >> [ 762.022937] [] (unwind_backtrace) from []
> >> (show_stack+0x10/0x14)
> >> [ 762.022958] [] (show_stack) from []
> >> (dump_stack+0x98/0xd8)
> >> [ 762.022976] [] (dump_stack) from []
> >> (warn_slowpath_common+0x80/0xb0)
> >> [ 762.022991] [] (warn_slowpath_common) from []
> >> (warn_slowpath_null+0x1c/0x24)
> >> [ 762.023007] [] (warn_slowpath_null) from []
> >> (kvm_vgic_sync_hwstate+0x314/0x344)
> >> [ 762.023024] [] (kvm_vgic_sync_hwstate) from []
> >> (kvm_arch_vcpu_ioctl_run+0x210/0x400)
> >> [ 762.023041] [] (kvm_arch_vcpu_ioctl_run) from []
> >> (kvm_vcpu_ioctl+0x2e4/0x6ec)
> >> [ 762.023059] [] (kvm_vcpu_ioctl) from []
> >> (do_vfs_ioctl+0x40c/0x600)
> >> [ 762.023076] [] (do_vfs_ioctl) from []
> >> (SyS_ioctl+0x34/0x5c)
> >> [ 762.023091] [] (SyS_ioctl) from []
> >> (ret_fast_syscall+0x0/0x34)
> >
> > so this means your guest caused a maintenance interrupt and the bit is
> > set in the GICH_EISR for the LR in question but the list register state
> > is not 0, which is in direct violation of the GIC spec. H.
> >
> > You're not doing any IRQ forwarding stuff or device passthrough here are
> > you?
>
> No, just boring emulation. The command line is
>
> qemu-system-arm -machine vexpress-a15 -kernel zImage -serial mon:stdio
> -append 'console=ttyAMA0 root=/dev/mmcblk0 rw' -snapshot -sd
> OpenSuse13-1_arm.img -dtb vexpress-v2p-ca15-tc1.dtb -s -enable-kvm
>
> >
> >>
> >>
> >> BTW, KVM tracing support on ARM seems like it requires some care. E.g.:
> >> kvm_exit does not report an exit reason. The in-kernel vgic also seems
> >> to lack instrumentation. Unfortunate. Tracing is usually the first stop
> >> when KVM is stuck on a guest.
> >
> > I know, the exit reason is on my todo list, and Alex B is sitting on
> > trace patches for the gic. Coming soon to a git repo near you.
>
> Cool, looking forward.
>
> Next thing I noticed is that guest debugging via qemu causes trouble in
> kvm mode. For some reason, qemu is unable to write soft-breakpoints,
> thus not even a single-step works. Also known?
>

Yes, Alex Bennee is working on this.

-Christoffer
Re: arm: warning at virt/kvm/arm/vgic.c:1468
Hi Christoffer,

On 2015-02-13 05:46, Christoffer Dall wrote:
> Hi Jan,
>
> On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
>> Hi,
>>
>> after fixing the VM_BUG_ON, my QEMU guest on the Jetson TK1 generally
>> refuses to boot. Once in a while it does, but quickly gets stuck again.
>> In one case I found this in the kernel log (never happened again so
>> far):
>>
>> [ 762.022874] WARNING: CPU: 1 PID: 972 at
>> ../arch/arm/kvm/../../../virt/kvm/arm/vgic.c:1468
>> kvm_vgic_sync_hwstate+0x314/0x344()
>> [ 762.022884] Modules linked in:
>> [ 762.022902] CPU: 1 PID: 972 Comm: qemu-system-arm Not tainted
>> 3.19.0-rc7-00221-gfd7a168-dirty #13
>> [ 762.022911] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>> [ 762.022937] [] (unwind_backtrace) from []
>> (show_stack+0x10/0x14)
>> [ 762.022958] [] (show_stack) from []
>> (dump_stack+0x98/0xd8)
>> [ 762.022976] [] (dump_stack) from []
>> (warn_slowpath_common+0x80/0xb0)
>> [ 762.022991] [] (warn_slowpath_common) from []
>> (warn_slowpath_null+0x1c/0x24)
>> [ 762.023007] [] (warn_slowpath_null) from []
>> (kvm_vgic_sync_hwstate+0x314/0x344)
>> [ 762.023024] [] (kvm_vgic_sync_hwstate) from []
>> (kvm_arch_vcpu_ioctl_run+0x210/0x400)
>> [ 762.023041] [] (kvm_arch_vcpu_ioctl_run) from []
>> (kvm_vcpu_ioctl+0x2e4/0x6ec)
>> [ 762.023059] [] (kvm_vcpu_ioctl) from []
>> (do_vfs_ioctl+0x40c/0x600)
>> [ 762.023076] [] (do_vfs_ioctl) from []
>> (SyS_ioctl+0x34/0x5c)
>> [ 762.023091] [] (SyS_ioctl) from []
>> (ret_fast_syscall+0x0/0x34)
>
> so this means your guest caused a maintenance interrupt and the bit is
> set in the GICH_EISR for the LR in question but the list register state
> is not 0, which is in direct violation of the GIC spec. H.
>
> You're not doing any IRQ forwarding stuff or device passthrough here are
> you?

No, just boring emulation. The command line is

qemu-system-arm -machine vexpress-a15 -kernel zImage -serial mon:stdio
-append 'console=ttyAMA0 root=/dev/mmcblk0 rw' -snapshot -sd
OpenSuse13-1_arm.img -dtb vexpress-v2p-ca15-tc1.dtb -s -enable-kvm

>
>>
>>
>> BTW, KVM tracing support on ARM seems like it requires some care. E.g.:
>> kvm_exit does not report an exit reason. The in-kernel vgic also seems
>> to lack instrumentation. Unfortunate. Tracing is usually the first stop
>> when KVM is stuck on a guest.
>
> I know, the exit reason is on my todo list, and Alex B is sitting on
> trace patches for the gic. Coming soon to a git repo near you.

Cool, looking forward.

Next thing I noticed is that guest debugging via qemu causes trouble in
kvm mode. For some reason, qemu is unable to write soft-breakpoints,
thus not even a single-step works. Also known?

Jan
Re: arm: warning at virt/kvm/arm/vgic.c:1468
Hi Jan,

On Sun, Feb 08, 2015 at 08:48:09AM +0100, Jan Kiszka wrote:
> Hi,
>
> after fixing the VM_BUG_ON, my QEMU guest on the Jetson TK1 generally
> refuses to boot. Once in a while it does, but quickly gets stuck again.
> In one case I found this in the kernel log (never happened again so
> far):
>
> [ 762.022874] WARNING: CPU: 1 PID: 972 at
> ../arch/arm/kvm/../../../virt/kvm/arm/vgic.c:1468
> kvm_vgic_sync_hwstate+0x314/0x344()
> [ 762.022884] Modules linked in:
> [ 762.022902] CPU: 1 PID: 972 Comm: qemu-system-arm Not tainted
> 3.19.0-rc7-00221-gfd7a168-dirty #13
> [ 762.022911] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> [ 762.022937] [] (unwind_backtrace) from []
> (show_stack+0x10/0x14)
> [ 762.022958] [] (show_stack) from []
> (dump_stack+0x98/0xd8)
> [ 762.022976] [] (dump_stack) from []
> (warn_slowpath_common+0x80/0xb0)
> [ 762.022991] [] (warn_slowpath_common) from []
> (warn_slowpath_null+0x1c/0x24)
> [ 762.023007] [] (warn_slowpath_null) from []
> (kvm_vgic_sync_hwstate+0x314/0x344)
> [ 762.023024] [] (kvm_vgic_sync_hwstate) from []
> (kvm_arch_vcpu_ioctl_run+0x210/0x400)
> [ 762.023041] [] (kvm_arch_vcpu_ioctl_run) from []
> (kvm_vcpu_ioctl+0x2e4/0x6ec)
> [ 762.023059] [] (kvm_vcpu_ioctl) from []
> (do_vfs_ioctl+0x40c/0x600)
> [ 762.023076] [] (do_vfs_ioctl) from []
> (SyS_ioctl+0x34/0x5c)
> [ 762.023091] [] (SyS_ioctl) from []
> (ret_fast_syscall+0x0/0x34)

so this means your guest caused a maintenance interrupt and the bit is
set in the GICH_EISR for the LR in question but the list register state
is not 0, which is in direct violation of the GIC spec. H.

You're not doing any IRQ forwarding stuff or device passthrough here are
you?

>
>
> BTW, KVM tracing support on ARM seems like it requires some care. E.g.:
> kvm_exit does not report an exit reason. The in-kernel vgic also seems
> to lack instrumentation. Unfortunate. Tracing is usually the first stop
> when KVM is stuck on a guest.

I know, the exit reason is on my todo list, and Alex B is sitting on
trace patches for the gic. Coming soon to a git repo near you.

-Christoffer
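The condition described in this message (a bit set in GICH_EISR for a list register whose state field is not 0) can be sketched as a small standalone check. This is a userspace illustration only, not the code in vgic.c. It assumes the GICv2 layout in which the state field occupies bits [29:28] of a GICH_LR value and in which GICH_EISR0/1 together provide one bit per list register; `find_eisr_violations` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 list register state field, bits [29:28]; 0 means "invalid" (empty). */
#define GICH_LR_STATE_SHIFT 28
#define GICH_LR_STATE_MASK  (UINT32_C(3) << GICH_LR_STATE_SHIFT)

/*
 * Hypothetical helper: given a snapshot of the EISR bits and the list
 * registers, return a bitmap of LRs that are flagged in EISR although
 * their state field is still pending and/or active.
 */
static uint64_t find_eisr_violations(uint64_t eisr, const uint32_t *lr,
                                     unsigned int nr_lr)
{
    uint64_t bad = 0;

    assert(nr_lr <= 64);    /* two 32-bit EISR registers cover 64 LRs */

    for (unsigned int i = 0; i < nr_lr; i++) {
        if (!(eisr & (UINT64_C(1) << i)))
            continue;       /* no EOI maintenance event for this LR */
        if (lr[i] & GICH_LR_STATE_MASK)
            bad |= UINT64_C(1) << i;    /* state != 0: unexpected */
    }
    return bad;
}
```

A non-zero return value corresponds to the situation that triggered the warning in the log: the hardware reported an EOI for a list register that still claims to hold a pending or active interrupt.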