On Sun, Aug 04, 2019 at 09:19:01PM -0700, Paul E. McKenney wrote:
> On Sun, Aug 04, 2019 at 01:24:46PM -0700, Paul E. McKenney wrote:
> > For whatever it is worth, the things on my list include using 25 rounds
> > of resched_cpu() on each CPU with ten-jiffy wait between each (instead of
> > merely 10 rounds), using waitqueues or some such to actually force a
> > meaningful context switch on the other CPUs, etc.

That really should not be needed. What are those other CPUs doing?

> Which appears to have reduced the bug rate by about a factor of two.
> (But statistics and all that.)

Which is just weird..

> I am now trying the same test, but with CONFIG_PREEMPT=y and without
> quite so much hammering on the scheduler. This is keying off Peter's
> earlier mention of preemption. If this turns out to be solid, perhaps
> we outlaw CONFIG_PREEMPT=n && CONFIG_NO_HZ_FULL=y?

CONFIG_PREEMPT=n should work just fine, _something_ is off.