Chuck, Ingo, thanks for the responses.

> > The pattern that emerges is that on CPU0 we have an interrupt, which 
> > is trying to acquire the rq lock, but can't.
> > 
> > On CPU1 we have strace which is doing wait_task_inactive(), which sort 
> > of spins acquiring and releasing the rq lock.  I've checked some of 
> > the traces and it is just before acquiring the rq lock, or just after 
> > releasing it, but is not actually holding it.
> > 
> > So is it possible that wait_task_inactive() could be starving the 
> > other waiters of the rq spinlock?  Any ideas?
> 
> hm, this is really interesting, and indeed a smoking gun. The T60 has a 
> Core2Duo and i've _never_ seen MESI starvation happen on dual-core 
> single-socket CPUs! (The only known serious MESI starvation i know about 
> is on multi-socket Opterons: there the trylock loop of spinlock 
> debugging is known to starve some CPUs out of those locks that are being 
> polled, so we had to turn off that aspect of spinlock debugging.)
> 
> wait_task_inactive(), although it busy-loops, is pretty robust: it does 
> a proper spin-lock/spin-unlock sequence and has a cpu_relax() inbetween. 
> Furthermore, the rep_nop() that cpu_relax() is based on is 
> unconditional, so it's not like we could somehow end up not having the 
> REP; NOP sequence there (which should make the lock polling even more 
> fair)
> 
> could you try the quick hack below, ontop of cfs-v17? It adds two things 
> to wait_task_inactive():
> 
> - a cond_resched() [in case you are running !PREEMPT]
> 
> - use MONITOR+MWAIT to monitor memory transactions to the rq->curr 
>   cacheline. This should make the polling loop definitely fair.

Is it not possible for the mwait to get stuck?

> If this solves the problem on your box then i'll do a proper fix and 
> introduce a cpu_relax_memory_change(*addr) type of API to around 
> monitor/mwait. This patch boots fine on my T60 - but i never saw your 
> problem.

Yes, the patch does make the pauses go away.  In fact just the
smp_mb() seems to suffice.

> [ btw., utrace IIRC fixes ptrace to get rid of wait_task_interactive(). ]

I looked at the utrace patch, and it still has wait_task_inactive(),
and I can still reproduce the freezes with the utrace patch applied.

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to