On Mon, Jul 09, 2018 at 03:02:27PM +0200, Peter Zijlstra wrote: > On Mon, Jul 09, 2018 at 02:55:16PM +0200, Peter Zijlstra wrote: > > On Mon, Jul 09, 2018 at 05:34:57AM -0700, Paul E. McKenney wrote: > > > But KVM defeats this by checking need_resched() before invoking > > > cond_resched(). > > > > That's not wrong or even uncommon I think. > > In fact, I think we recently put that pattern in crypto code in order to > break up very long kernel_fpu sections.
OK, so here are our options: 1. Add the RCU conditional to need_resched(), as David suggests. Peter has concerns about overhead. 2. Create a new need_resched_rcu_qs() that is to be used when deciding whether or not to do cond_resched(). This would exact the overhead only where it is needed, but is one more thing for people to get wrong. 3. Revert my changes to de-emphasize cond_resched_rcu_qs(), and go back to sprinkling cond_resched_rcu_qs() throughout the code. This also is one more thing for people to get wrong, and might well eventually convert all cond_resched() calls to cond_resched_rcu_qs(), which sure seems like a failure mode to me. 4. Others? > Note that you also 'broke' cond_resched_lock() as that no longer matches > cond_resched(). Given that cond_resched_lock() was there first, I believe that you can just say "broke" without the quote marks. :-/ Given that this code is releasing and acquiring a lock, I believe that the patch below should cure this, aside from also needing to check whether RCU needs a quiescent state. Any other similar gotchas out there? Thanx, Paul ------------------------------------------------------------------------ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 537bced8f4bc..b559b556f464 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5017,6 +5017,7 @@ int __cond_resched_lock(spinlock_t *lock) preempt_schedule_common(); else cpu_relax(); + rcu_all_qs(); ret = 1; spin_lock(lock); }