On Sat, Aug 25, 2012 at 03:04:23PM +1000, Paul Mackerras wrote:
> We have been observing hangs, both in KVM guest vcpus and more
> generally, where a process that is woken doesn't properly wake up and
> continue to run, but instead sticks in TASK_WAKING state.  This
> happens because the update of rq->wake_list in ttwu_queue_remote()
> is not ordered with the update of ipi_message in
> smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
> scheduler_ipi() is not ordered with the reading of ipi_message in
> smp_ipi_demux().  Thus it is possible for the IPI receiver not to see
> the updated rq->wake_list and therefore conclude that there is nothing
> for it to do.
>
> In order to make sure that anything done before calling
> smp_send_reschedule() is ordered before anything done in the resulting
> call to scheduler_ipi(), this adds barriers in
> smp_muxed_ipi_message_pass() and smp_ipi_demux().  The barrier in
> smp_muxed_ipi_message_pass() is a full barrier to ensure that there is
> a full ordering between the smp_send_reschedule() caller and
> scheduler_ipi().  In smp_ipi_demux(), we use xchg() rather than
> xchg_local() because xchg() includes release and acquire barriers.
> Using xchg() rather than xchg_local() makes sense given that
> ipi_message is not just accessed locally.
>
> These changes made no measurable difference to the speed of IPIs as
> measured using a simple ping-pong latency test across two CPUs on
> different cores of a POWER7 machine.
>
> The analysis of the reason why processes were not waking up properly
> is due to Milton Miller.
>
> Cc: sta...@vger.kernel.org
> Reported-by: Milton Miller <milt...@bga.com>
> Signed-off-by: Paul Mackerras <pau...@samba.org>

Reviewed-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>

> ---
>  arch/powerpc/kernel/smp.c |   11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index e292ff2..ca1040a 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -197,8 +197,15 @@ void smp_muxed_ipi_message_pass(int cpu, int msg)
>  	struct cpu_messages *info = &per_cpu(ipi_message, cpu);
>  	char *message = (char *)&info->messages;
>
> +	/*
> +	 * Order previous accesses before accesses in the IPI handler.
> +	 */
> +	smp_mb();
>  	message[msg] = 1;
> -	mb();
> +	/*
> +	 * Order setting of message before IPI.
> +	 */
> +	smp_wmb();
>  	smp_ops->cause_ipi(cpu, info->data);
>  }
>
> @@ -210,7 +217,7 @@ irqreturn_t smp_ipi_demux(void)
>  	mb();	/* order any irq clear */
>
>  	do {
> -		all = xchg_local(&info->messages, 0);
> +		all = xchg(&info->messages, 0);
>
>  #ifdef __BIG_ENDIAN
>  		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNCTION)))
> --
> 1.7.10.rc3.219.g53414
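
For anyone who wants to play with the ordering outside the kernel, here is a rough userspace sketch of the pattern the patch enforces, written with C11 atomics and pthreads. It is only an analogy and not the kernel code: wake_list, messages, ipi_raised, send_message(), ipi_handler() and run_demo() are all made-up names, the flag spin stands in for interrupt delivery, and the C11 fences only approximate what smp_mb(), smp_wmb() and xchg() do on powerpc.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; none of these are kernel symbols. */
static int wake_list;          /* plain data, plays the role of rq->wake_list */
static atomic_uint messages;   /* plays the role of info->messages */
static atomic_int ipi_raised;  /* models the interrupt reaching the target CPU */
static int observed;           /* what the handler saw in wake_list */

/* Sender side, loosely mirroring smp_muxed_ipi_message_pass(). */
static void send_message(unsigned int msg)
{
	wake_list = 42;	/* work queued before asking for the IPI */

	/* Order previous accesses before the message store (role of smp_mb()). */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_or_explicit(&messages, 1u << msg, memory_order_relaxed);

	/* Order setting of the message before the "IPI" (role of smp_wmb()). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ipi_raised, 1, memory_order_relaxed);
}

/* Receiver side, loosely mirroring smp_ipi_demux(). */
static void *ipi_handler(void *unused)
{
	unsigned int all;

	(void)unused;

	/* Wait for the "interrupt"; the acquire load models handler entry. */
	while (!atomic_load_explicit(&ipi_raised, memory_order_acquire))
		;

	/* Like xchg(): fetch and clear the messages with acquire/release ordering. */
	all = atomic_exchange_explicit(&messages, 0, memory_order_acq_rel);
	if (all & 1u)
		observed = wake_list;	/* ordered after the sender's store */
	return NULL;
}

/* Runs one send/receive round; returns the value the handler observed. */
int run_demo(void)
{
	pthread_t thread;

	wake_list = 0;
	observed = -1;
	atomic_store(&messages, 0);
	atomic_store(&ipi_raised, 0);

	if (pthread_create(&thread, NULL, ipi_handler, NULL) != 0)
		return -1;
	send_message(0);
	pthread_join(thread, NULL);
	return observed;
}
```

Build with -pthread and call run_demo() from a small main(). With the fences in place the handler is guaranteed to see the wake_list store once it sees the message bit. Dropping the first fence and relaxing the exchange gives the same shape of bug as the original mb()/xchg_local() code, although on strongly ordered hardware it may never actually misbehave.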