On Wed, Jul 26, 2017 at 09:41:28AM +0200, Peter Zijlstra wrote:
> On Tue, Jul 25, 2017 at 04:59:36PM -0700, Paul E. McKenney wrote:
> > On Tue, Jul 25, 2017 at 11:55:10PM +0200, Peter Zijlstra wrote:
> 
> > > People always do crazy stuff, but what surprised me is that such a patch
> > > got merged in urcu even though it's known broken for a number of
> > > architectures.
> > 
> > It did not get merged into urcu.  It is instead used directly by a
> > number of people for a number of concurrent algorithms.
> 
> Yah, Mathieu also already pointed that out. It seems I really cannot
> deal with github well -- that website always terminally confuses me.
> 
> > > > But it would not be hard for userspace code to force IPIs by repeatedly
> > > > awakening higher-priority threads that sleep immediately after being
> > > > awakened, right?
> > > 
> > > RT tasks are not readily available to !root, and the user might have
> > > been constrained to a subset of available CPUs.
> > 
> > So non-idle non-nohz CPUs never get IPIed for wakeups of SCHED_OTHER
> > threads?
> 
> Sure, but SCHED_OTHER auto throttles in that if there's anything else to
> run, you get to wait. So you can't generate an IPI storm with it. Also,
> again, we can be limited to a subset of CPUs.

OK, what is its auto-throttle policy?  One round of IPIs per jiffy or
some such?

Does this auto-throttling also apply if the user is running a CPU-bound
SCHED_BATCH or SCHED_IDLE task on each CPU, and periodically waking up
one of a large group of SCHED_OTHER tasks, where the SCHED_OTHER tasks
immediately sleep upon being awakened?

> > > My thinking was that if we observe '!= mm' that CPU will have to do a
> > > context switch in order to make it true. That context switch will
> > > provide the ordering we're after so all is well.
> > > 
> > > Quite possible there's a hole in it, but since I'm running on fumes someone
> > > needs to spell it out for me :-)
> > 
> > This would be the https://marc.info/?l=linux-kernel&m=126349766324224&w=2
> > URL below.
> > 
> > Which might or might not still be applicable.
> 
> I think we actually have those two smp_mb()'s around the rq->curr
> assignment.
> 
> we have smp_mb__before_spinlock(), which per the argument here:
> 
>   https://lkml.kernel.org/r/20170607162013.755917...@infradead.org
> 
> is actually a full MB, irrespective of that weird smp_wmb() definition
> we have now. And we have switch_mm() on the other side.

OK, and the rq->curr assignment is in common code, correct?  Does this
allow the IPI-only-requesting-process approach to live entirely within
common code?
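To make sure we are talking about the same thing, here is a rough sketch of
the IPI-only-requesting-process approach as I understand it.  This is
pseudocode, not a real patch, and it assumes the full barriers around the
rq->curr assignment discussed above; the helper names are made up:

```
/*
 * Pseudocode sketch only: IPI just those CPUs currently running a
 * thread of the calling process.  CPUs observed with a different ->mm
 * must context-switch before running one of our threads, and that
 * switch is assumed to supply the needed ordering.
 */
sys_membarrier():
	smp_mb();			/* order prior accesses before scan */
	for_each_online_cpu(cpu):
		if (cpu_rq(cpu)->curr->mm == current->mm)
			send IPI to cpu;	/* IPI handler does smp_mb() */
	smp_mb();			/* order scan before later accesses */
```

The question is whether the ->curr->mm read here is safe without holding the
runqueue lock, or whether the ordering argument closes that hole.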

The 2010 email thread ended up with sys_membarrier() acquiring the
runqueue lock for each CPU, because doing otherwise meant adding code
to the scheduler fastpath.  Don't we still need to do this?

https://marc.info/?l=linux-kernel&m=126341138408407&w=2
https://marc.info/?l=linux-kernel&m=126349766324224&w=2
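For comparison, the shape the 2010 thread converged on was roughly the
following (again pseudocode from memory, not the actual patch): serialize
against each CPU's runqueue lock so that no barrier need be added to the
scheduler fastpath:

```
/*
 * Pseudocode sketch of the 2010 runqueue-lock approach: the lock
 * acquire/release pair orders the ->curr->mm read against the
 * scheduler without touching the fastpath.
 */
sys_membarrier():
	for_each_online_cpu(cpu):
		raw_spin_lock_irq(&cpu_rq(cpu)->lock);
		mm = cpu_rq(cpu)->curr->mm;
		raw_spin_unlock_irq(&cpu_rq(cpu)->lock);
		if (mm == current->mm)
			add cpu to IPI mask;
	send IPIs to mask;		/* IPI handler does smp_mb() */
```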

> > > > I was intending to base this on the last few versions of a 2010 patch,
> > > > but maybe things have changed:
> > > > 
> > > >         https://marc.info/?l=linux-kernel&m=126358017229620&w=2
> > > >         https://marc.info/?l=linux-kernel&m=126436996014016&w=2
> > > >         https://marc.info/?l=linux-kernel&m=126601479802978&w=2
> > > >         https://marc.info/?l=linux-kernel&m=126970692903302&w=2
> > > > 
> > > > Discussion here:
> > > > 
> > > >         https://marc.info/?l=linux-kernel&m=126349766324224&w=2
> > > > 
> > > > The discussion led to acquiring the runqueue locks, as there was
> > > > otherwise a need to add code to the scheduler fastpaths.
> > > 
> > > TL;DR..  that's far too much to trawl through.
> > 
> > So we re-derive it from first principles instead?  ;-)
> 
> Yep, that's what I usually do anyway, who knows what kind of crazy our
> younger selves were up to ;-)

In my experience, it ends up being a type of crazy worth ignoring only
if I don't ignore it.  ;-)

                                                        Thanx, Paul
