Hi,

I work on the Yocto Project, and we use qemu to test-boot our Linux images and run tests against them. We've been noticing some instability on ppc where the images sometimes hang, usually around udevd bring-up time, i.e. just after booting into userspace.
To cut a long story short, I've tracked down what I think is the problem. I believe decrementer timer interrupts stop being delivered, so tasks in our images hang indefinitely since the timer has stopped. It can be summed up with this line of debug:

    ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100 req 00000004

It should normally read:

    ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100 req 00000002

The question is why CPU_INTERRUPT_EXITTB ends up being set when the lines above this log message clearly set CPU_INTERRUPT_HARD (via cpu_interrupt()).

I note in cpu.h, for struct CPUState:

    /* updates protected by BQL */
    uint32_t interrupt_request;

The ppc code does "cs->interrupt_request |= CPU_INTERRUPT_EXITTB" in 5 places, 3 in excp_helper.c and 2 in helper_regs.h. In all of these, g_assert(qemu_mutex_iothread_locked()) fails. If I do something like:

    if (!qemu_mutex_iothread_locked()) {
        qemu_mutex_lock_iothread();
        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
        qemu_mutex_unlock_iothread();
    } else {
        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
    }

at these call sites, then I can no longer lock qemu up with my test case. I suspect the _HARD setting gets overwritten, which stops the decrementer interrupts being delivered.

I don't know whether taking this lock in these situations is going to be bad for performance, or whether such a patch would be right or wrong. At this point I therefore wanted to seek advice on what the real issue is here and how to fix it!

Cheers,

Richard