* Preeti U Murthy <pre...@linux.vnet.ibm.com> wrote:

> On 04/02/2015 04:12 PM, Ingo Molnar wrote:
> >
> > * Preeti U Murthy <pre...@linux.vnet.ibm.com> wrote:
> >
> >> It was found when doing a hotplug stress test on POWER that the
> >> machine either hit softlockups or rcu_sched stall warnings. The
> >> issue was traced to commit 7cba160ad789a ("powernv/cpuidle: Redesign
> >> idle states management"), which exposed a CPU-down race with the
> >> hrtimer based broadcast mode (introduced by commit 5d1638acb9f6,
> >> "tick: Introduce hrtimer based broadcast"). This is explained below.
> >>
> >> Assume CPU1 is the CPU which holds the hrtimer broadcasting duty
> >> before it is taken down.
> >>
> >> CPU0					CPU1
> >>
> >> cpu_down()				take_cpu_down()
> >> 					disable_interrupts()
> >>
> >> cpu_die()
> >>
> >> while (CPU1 != CPU_DEAD) {
> >> 	msleep(100);
> >> 	switch_to_idle();
> >> 	stop_cpu_timer();
> >> 	schedule_broadcast();
> >> }
> >>
> >> tick_cleanup_cpu_dead()
> >> 	take_over_broadcast()
> >>
> >> So after CPU1 has disabled interrupts it can no longer handle the
> >> broadcast hrtimer, and CPU0 will be stuck forever.
> >>
> >> Fix this by explicitly taking over broadcast duty before cpu_die().
> >> This is a temporary workaround. What we really want is a callback in
> >> the clockevent device which allows us to do that from the dying CPU
> >> by pushing the hrtimer onto a different CPU. That might involve an
> >> IPI and is definitely more complex than this immediate fix.
> >
> > So why not use a suitable CPU_DOWN* notifier for this, instead of
> > open coding it all into a random place in the hotplug machinery?
>
> This is because each of them is unsuitable for a reason:
>
> 1. CPU_DOWN_PREPARE stage allows for a fail. The CPU in question may
> not successfully go down, so we may pull the hrtimer unnecessarily.
Failure is really rare - and as long as things will continue to work
afterwards, it's not a problem to pull the hrtimer to this CPU. Right?

Thanks,

	Ingo