On 18 December 2024 18:16:13 CET, Thomas Huth <[email protected]> wrote:
>On 18/12/2024 17.19, Thomas Huth wrote:
>> On 18/12/2024 15.11, David Woodhouse wrote:
>>> On Wed, 2024-12-18 at 14:38 +0100, Thomas Huth wrote:
>> ...
>>>> But FWIW, there seems to be another issue with this test. While running it
>>>> multiple times, I sometimes see test_kvm_xen_guest_novector_noapic hanging.
>>>> According to the console output, the guest waits in vain for a device:
>>>>
>>>> 2024-12-18 14:32:58,606: Initializing XFRM netlink socket
>>>> 2024-12-18 14:32:58,607: NET: Registered PF_INET6 protocol family
>>>> 2024-12-18 14:32:58,609: Segment Routing with IPv6
>>>> 2024-12-18 14:32:58,609: In-situ OAM (IOAM) with IPv6
>>>> 2024-12-18 14:32:58,610: NET: Registered PF_PACKET protocol family
>>>> 2024-12-18 14:32:58,610: 8021q: 802.1Q VLAN Support v1.8
>>>> 2024-12-18 14:32:58,611: 9pnet: Installing 9P2000 support
>>>> 2024-12-18 14:32:58,613: NET: Registered PF_VSOCK protocol family
>>>> 2024-12-18 14:32:58,614: IPI shorthand broadcast: enabled
>>>> 2024-12-18 14:32:58,619: sched_clock: Marking stable (551147059,
>>>> -6778955)->(590359530, -45991426)
>>>> 2024-12-18 14:32:59,507: tsc: Refined TSC clocksource calibration:
>>>> 2495.952 MHz
>>>> 2024-12-18 14:32:59,508: clocksource: tsc: mask: 0xffffffffffffffff
>>>> max_cycles: 0x23fa49fc138, max_idle_ns: 440795295059 ns
>>>> 2024-12-18 14:32:59,509: clocksource: Switched to clocksource tsc
>>>> 2024-12-18 14:33:28,667: xenbus_probe_frontend: Waiting for devices to
>>>> initialise: 25s...20s...15s...10s...5s...0s...
>>>>
>>>> Have you seen this problem before?
>>>
>>> That seems like event channel interrupts aren't being routed to the
>>> legacy i8259 PIC. I've certainly seen that kind of thing before,
>>> especially when asserted level-triggered interrupts weren't correctly
>>> being asserted. But I don't expect that of QEMU. I'll see if I can
>>> reproduce; thanks.
>>>
>>> How often does it happen?
>>
>> With the new functional test, it happens maybe 2 times out of 100 test runs.
>>
>> I wasn't able to reproduce it with the avocado version yet, but that also
>> runs 10x slower, so it takes a longer time to get to that many runs...
>
>Ok, FWIW, I've now also seen the problem with the old avocado version of the
>test, so it's nothing that has been introduced by my patch. I just had to
>downgrade to Avocado v88 again since the current version v103 does not seem to
>correctly output the console anymore :-/ (which is another good indicator that
>we really need to get the stuff moved over to the functional framework now).
>
> Thomas
>
I have reproduced it, will look into it. I'm fairly sure this was all working reliably at the time the Xen support was merged; that's why I wrote these test cases after all.
