On 16 June 2015 at 12:11, Alex Züpke <alexander.zue...@hs-rm.de> wrote:
> But the startup is not my problem, it's the later parts.

But it was my problem because it meant your test case wasn't
functional :-)

> I added the WFE to the initial lock. Here are two new tests, both are now
> 3178 bytes in size:
> http://www.cs.hs-rm.de/~zuepke/qemu/ipi.elf
> http://www.cs.hs-rm.de/~zuepke/qemu/ipi_yield.elf
>
> Both start on my machine. The IPI ping-pong starts after the
> first timer interrupt after 1s. The problem is that IPIs are
> delivered only once a second after the timer interrupts QEMU's
> main loop.

Thanks. These test cases work for me, and I can repro the same
behaviour you see. I intend to investigate why we're not at least
timeslicing between the two CPUs at a faster rate than "when there's
another timer interrupt".

> Something else: Existing ARM CPUs so far do not use hyper-threading,
> but have real physical cores. In contrast, QEMU is an extremely
> coarse-grained hyper-threading architecture, so existing legacy
> code that was written with physical cores in mind will trigger
> timing bugs in synchronization primitives then, especially code
> originally written for ARM11 MPCore like mine, which lacks WFE/SEV.
> If we consider QEMU as a platform to run legacy code, doesn't it
> make sense to address these issues?

In general QEMU's approach is more "run correct code reasonably fast"
rather than "run buggy code the same way the hardware would" or
"identify bugs in buggy code". There's certainly scope for heuristics
for making our timeslicing approach less obtrusive, but we need to
understand the underlying behaviour first (and check it doesn't
accidentally slow down other common workloads in the process).
In particular I think the 'do cpu_exit if one CPU triggers an
interrupt on another' approach is probably good, but I need to
investigate why it isn't working on your test programs without
that extra 'level &&' condition first...

thanks
-- PMM
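
For readers following the thread, the effect of the 'do cpu_exit if one CPU triggers an interrupt on another' idea can be illustrated with a toy model. This is only a sketch and not QEMU code: it assumes two emulated CPUs that share one host thread and are run round-robin in fixed timeslices, and the function name `simulate` and its parameters are hypothetical. A CPU with no pending IPI busy-waits, as code without WFE would.

```python
def simulate(deliveries_wanted, timeslice, exit_on_ipi):
    """Toy model: return the simulated ticks needed to deliver
    `deliveries_wanted` IPIs in a ping-pong between two vCPUs.

    Hypothetical illustration only; nothing here is a QEMU API.
    """
    pending = [True, False]  # pending[i]: an IPI is waiting for vCPU i
    ticks = 0
    delivered = 0
    current = 0              # index of the vCPU that currently runs

    while delivered < deliveries_wanted:
        for _ in range(timeslice):
            ticks += 1
            if pending[current]:
                # Handle the IPI and send one back to the other vCPU.
                pending[current] = False
                pending[1 - current] = True
                delivered += 1
                # With exit_on_ipi the sender gives up the rest of its
                # timeslice, so the target can run straight away.
                if exit_on_ipi or delivered == deliveries_wanted:
                    break
            # Otherwise: busy-wait for an IPI, burning the tick.
        current = 1 - current  # round-robin switch to the other vCPU

    return ticks


if __name__ == "__main__":
    print(simulate(4, 1000, exit_on_ipi=False))  # 3001
    print(simulate(4, 1000, exit_on_ipi=True))   # 4
```

In this model, without the early exit each IPI is seen only once the sender's timeslice has run out, which mirrors the "delivered only once a second" symptom if the timeslice is bounded by the next timer interrupt. The model says nothing about why the extra 'level &&' condition matters in the real code.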