David Gibson <da...@gibson.dropbear.id.au> writes:

> On Mon, Jul 17, 2017 at 09:46:39AM +0530, Nikunj A Dadhania wrote:
>> Rebooting a SMP TCG guest is broken for both single/multi threaded TCG.
>>
>> When reset happens, all the CPUs are in halted state. First CPU is brought out
>> of reset and secondary CPUs would be initialized by the guest kernel using a
>> rtas call start-cpu.
>>
>> However, in case of TCG, decrementer interrupts keep on coming and waking the
>> secondary CPUs up.
>>
>> These secondary CPUs would see the decrementer interrupt pending, which makes
>> cpu::has_work() to bring them out of wait loop and start executing
>> tcg_exec_cpu().
>>
>> The problem with this is all the CPUs wake up and start booting SLOF image,
>> causing the following exception(4 CPUs TCG VM):
>
> Ok, I'm still trying to understand why the behaviour on reboot is
> different from the first boot.

During first boot, the CPU is in the stopped state, so
cpus.c:cpu_thread_is_idle() returns true and the CPU remains in the halted
state until rtas start-cpu. Therefore, we never check cpu_has_work().

In the case of reboot, all CPUs are resumed afterwards. So we reach the next
condition in cpu_thread_is_idle(), cpu_has_work(), where we see a pending
DECR interrupt and take the CPU out of the halted state because it has work.

> AFAICT on initial boot, the LPCR will
> have DEE / PECE3 enabled.  So why aren't we getting the same problem
> then?

Regards
Nikunj
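
For reference, the order of checks described above can be modelled roughly
as follows. This is only a sketch, not the actual code in cpus.c: struct
model_cpu and the model_* functions are hypothetical stand-ins, and the real
cpu_thread_is_idle() looks at more state than the three flags shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the per-CPU state. */
struct model_cpu {
    bool stopped;       /* vCPU has not been started/resumed yet */
    bool halted;        /* vCPU is sitting in the halted state */
    bool decr_pending;  /* a decrementer interrupt is pending */
};

/* Assumption: the only "work" modelled here is a pending DECR interrupt. */
static bool model_cpu_has_work(const struct model_cpu *cpu)
{
    assert(cpu != NULL);
    return cpu->decr_pending;
}

/* Simplified model of the idle check discussed in this thread. */
static bool model_cpu_thread_is_idle(const struct model_cpu *cpu)
{
    assert(cpu != NULL);

    /* First boot: a stopped CPU is idle, has_work is never consulted. */
    if (cpu->stopped) {
        return true;
    }

    /* Reboot: the CPU was resumed, so a pending DECR counts as work. */
    if (!cpu->halted || model_cpu_has_work(cpu)) {
        return false;
    }

    return true;
}
```

With stopped = true (first boot) the model reports the CPU as idle even with
a DECR pending. With stopped = false and halted = true (reboot) the same
pending DECR makes it non-idle, which matches the wake-up seen on the
secondary CPUs.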