Re: cond_resched() and RCU CPU stall warnings

2014-03-18 Thread Paul E. McKenney
On Tue, Mar 18, 2014 at 02:45:34PM +0100, Peter Zijlstra wrote:
> > sched,rcu: Make cond_resched() report RCU quiescent states
> > 
> > Given a CPU running a loop containing cond_resched(), with no
> > other tasks runnable on that CPU, RCU will eventually report RCU
> > CPU stall warnings due to lack of quiescent states.  Fortunately,
> > every call to cond_resched() is a perfectly good quiescent state.
> > Unfortunately, invoking rcu_note_context_switch() is a bit heavyweight
> > for cond_resched(), especially given the need to disable preemption,
> > and, for RCU-preempt, interrupts as well.
> > 
> > This commit therefore maintains a per-CPU counter that causes
> > cond_resched(), cond_resched_lock(), and cond_resched_softirq() to call
> > rcu_note_context_switch(), but only about once per 256 invocations.
> > This ratio was chosen in keeping with the relative time constants of
> > RCU grace periods.
> > 
> > Signed-off-by: Paul E. McKenney 
> > 
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 3cea28c64ebe..8d64878111ea 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -44,6 +44,7 @@
> >  #include <linux/debugobjects.h>
> >  #include <linux/bug.h>
> >  #include <linux/compiler.h>
> > +#include <linux/percpu.h>
> >  #include <asm/barrier.h>
> >  
> >  extern int rcu_expedited; /* for sysctl */
> > @@ -287,6 +288,41 @@ bool __rcu_is_watching(void);
> >  #endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */
> >  
> >  /*
> > + * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
> > + */
> > +
> > +#define RCU_COND_RESCHED_LIM 256   /* ms vs. 100s of ms. */
> > +DECLARE_PER_CPU(int, rcu_cond_resched_count);
> > +void rcu_resched(void);
> > +
> > +/*
> > + * Is it time to report RCU quiescent states?
> > + *
> > + * Note unsynchronized access to rcu_cond_resched_count.  Yes, we might
> > + * increment some random CPU's count, and possibly also load the result from
> > + * yet another CPU's count.  We might even clobber some other CPU's attempt
> > + * to zero its counter.  This is all OK because the goal is not precision,
> > + * but rather reasonable amortization of rcu_note_context_switch() overhead
> > + * and extremely high probability of avoiding RCU CPU stall warnings.
> > + * Note that this function has to be preempted in just the wrong place,
> > + * many thousands of times in a row, for anything bad to happen.
> > + */
> > +static inline bool rcu_should_resched(void)
> > +{
> > +   return __this_cpu_inc_return(rcu_cond_resched_count) >=
> > +  RCU_COND_RESCHED_LIM;
> > +}
> > +
> > +/*
> > + * Report quiescent states to RCU if it is time to do so.
> > + */
> > +static inline void rcu_cond_resched(void)
> > +{
> > +   if (unlikely(rcu_should_resched()))
> > +   rcu_resched();
> > +}
> > +
> > +/*
> >   * Infrastructure to implement the synchronize_() primitives in
> >   * TREE_RCU and rcu_barrier_() primitives in TINY_RCU.
> >   */
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index 4c0a9b0af469..ed7a0d72562c 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -338,3 +338,21 @@ static int __init check_cpu_stall_init(void)
> >  early_initcall(check_cpu_stall_init);
> >  
> >  #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
> > +
> > +/*
> > + * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
> > + */
> > +
> > +DEFINE_PER_CPU(int, rcu_cond_resched_count);
> > +
> > +/*
> > + * Report a set of RCU quiescent states, for use by cond_resched()
> > + * and friends.  Out of line due to being called infrequently.
> > + */
> > +void rcu_resched(void)
> > +{
> > +   preempt_disable();
> > +   __this_cpu_write(rcu_cond_resched_count, 0);
> 
> Somehow I've got the urge to write:
> 
>   __this_cpu_add(rcu_cond_resched_count, -RCU_COND_RESCHED_LIM);
> 
> But the immediate write is safer, and I suppose it doesn't matter at
> all.

My concern is that an unfortunate series of preemptions might cause a
victim CPU's counter to be driven to a large negative number, which
could then result in RCU CPU stall warnings given a subsequent time
with no schedules.  In contrast, a series of preemptions that drives the
counter to a large positive number also produces context switches along
the way, which keeps the stall warnings at bay.
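
To make the hazard concrete, here is a toy user-space model of the two
reset styles under repeated unlucky preempt-and-migrate events (an
illustration only; the loop count and starting value are made up):

#include <stdio.h>

#define RCU_COND_RESCHED_LIM 256
#define NR_UNLUCKY_MIGRATIONS 1000

static int victim_count;	/* stands in for the victim CPU's counter */

int main(void)
{
	int i;

	/* Immediate write, as in the patch: the victim's counter can
	 * only be pulled back to zero, never driven below it. */
	victim_count = 100;	/* arbitrary prior value */
	for (i = 0; i < NR_UNLUCKY_MIGRATIONS; i++)
		victim_count = 0;
	printf("write(0):  victim counter = %d\n", victim_count);

	/* The subtraction variant: each unlucky reset drives the victim
	 * further negative, deferring its next quiescent-state report. */
	victim_count = 0;
	for (i = 0; i < NR_UNLUCKY_MIGRATIONS; i++)
		victim_count -= RCU_COND_RESCHED_LIM;
	printf("add(-LIM): victim counter = %d (%d increments to report)\n",
	       victim_count, RCU_COND_RESCHED_LIM - victim_count);
	return 0;
}

With the write, the victim ends at zero; with the subtraction, it ends
at -256000 and would need 256256 increments before its next report.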

> > +   rcu_note_context_switch(smp_processor_id());
> > +   preempt_enable();
> > +}
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index b46131ef6aab..1f53e8871dd2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4071,6 +4071,7 @@ static void __cond_resched(void)
> >  
> >  int __sched _cond_resched(void)
> >  {
> > +   rcu_cond_resched();
> > if (should_resched()) {
> > __cond_resched();
> > return 1;
> > @@ -4089,6 +4090,7 @@ EXPORT_SYMBOL(_cond_resched);
> >   */
> >  int __cond_resched_lock(spinlock_t *lock)
> >  {
> > +   bool need_rcu_resched = rcu_should_resched();
> > int resched = should_resched();

Re: cond_resched() and RCU CPU stall warnings

2014-03-18 Thread Peter Zijlstra
> sched,rcu: Make cond_resched() report RCU quiescent states
> 
> Given a CPU running a loop containing cond_resched(), with no
> other tasks runnable on that CPU, RCU will eventually report RCU
> CPU stall warnings due to lack of quiescent states.  Fortunately,
> every call to cond_resched() is a perfectly good quiescent state.
> Unfortunately, invoking rcu_note_context_switch() is a bit heavyweight
> for cond_resched(), especially given the need to disable preemption,
> and, for RCU-preempt, interrupts as well.
> 
> This commit therefore maintains a per-CPU counter that causes
> cond_resched(), cond_resched_lock(), and cond_resched_softirq() to call
> rcu_note_context_switch(), but only about once per 256 invocations.
> This ratio was chosen in keeping with the relative time constants of
> RCU grace periods.
> 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 3cea28c64ebe..8d64878111ea 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -44,6 +44,7 @@
>  #include <linux/debugobjects.h>
>  #include <linux/bug.h>
>  #include <linux/compiler.h>
> +#include <linux/percpu.h>
>  #include <asm/barrier.h>
>  
>  extern int rcu_expedited; /* for sysctl */
> @@ -287,6 +288,41 @@ bool __rcu_is_watching(void);
>  #endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */
>  
>  /*
> + * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
> + */
> +
> +#define RCU_COND_RESCHED_LIM 256 /* ms vs. 100s of ms. */
> +DECLARE_PER_CPU(int, rcu_cond_resched_count);
> +void rcu_resched(void);
> +
> +/*
> + * Is it time to report RCU quiescent states?
> + *
> + * Note unsynchronized access to rcu_cond_resched_count.  Yes, we might
> + * increment some random CPU's count, and possibly also load the result from
> + * yet another CPU's count.  We might even clobber some other CPU's attempt
> + * to zero its counter.  This is all OK because the goal is not precision,
> + * but rather reasonable amortization of rcu_note_context_switch() overhead
> + * and extremely high probability of avoiding RCU CPU stall warnings.
> + * Note that this function has to be preempted in just the wrong place,
> + * many thousands of times in a row, for anything bad to happen.
> + */
> +static inline bool rcu_should_resched(void)
> +{
> + return __this_cpu_inc_return(rcu_cond_resched_count) >=
> +RCU_COND_RESCHED_LIM;
> +}
> +
> +/*
> + * Report quiescent states to RCU if it is time to do so.
> + */
> +static inline void rcu_cond_resched(void)
> +{
> + if (unlikely(rcu_should_resched()))
> + rcu_resched();
> +}
> +
> +/*
>   * Infrastructure to implement the synchronize_() primitives in
>   * TREE_RCU and rcu_barrier_() primitives in TINY_RCU.
>   */
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 4c0a9b0af469..ed7a0d72562c 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -338,3 +338,21 @@ static int __init check_cpu_stall_init(void)
>  early_initcall(check_cpu_stall_init);
>  
>  #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
> +
> +/*
> + * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
> + */
> +
> +DEFINE_PER_CPU(int, rcu_cond_resched_count);
> +
> +/*
> + * Report a set of RCU quiescent states, for use by cond_resched()
> + * and friends.  Out of line due to being called infrequently.
> + */
> +void rcu_resched(void)
> +{
> + preempt_disable();
> + __this_cpu_write(rcu_cond_resched_count, 0);

Somehow I've got the urge to write:

  __this_cpu_add(rcu_cond_resched_count, -RCU_COND_RESCHED_LIM);

But the immediate write is safer, and I suppose it doesn't matter at
all.

> + rcu_note_context_switch(smp_processor_id());
> + preempt_enable();
> +}
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b46131ef6aab..1f53e8871dd2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4071,6 +4071,7 @@ static void __cond_resched(void)
>  
>  int __sched _cond_resched(void)
>  {
> + rcu_cond_resched();
>   if (should_resched()) {
>   __cond_resched();
>   return 1;
> @@ -4089,6 +4090,7 @@ EXPORT_SYMBOL(_cond_resched);
>   */
>  int __cond_resched_lock(spinlock_t *lock)
>  {
> + bool need_rcu_resched = rcu_should_resched();
>   int resched = should_resched();
>   int ret = 0;
>  

I suppose you also need:

-   if (spin_needbreak(lock) || resched)
+   if (spin_needbreak(lock) || resched || need_rcu_resched)

> @@ -4098,6 +4100,8 @@ int __cond_resched_lock(spinlock_t *lock)
>   spin_unlock(lock);
>   if (resched)
>   __cond_resched();
> + else if (unlikely(need_rcu_resched))
> + rcu_resched();
>   else
>   cpu_relax();
>   ret = 1;
> @@ -4111,6 +4115,7 @@ int __sched __cond_resched_softirq(void)
>  {
>   BUG_ON(!in_softirq());
>  
> + rcu_cond_resched();  /* BH disabled OK, just recording QSes. */

Re: cond_resched() and RCU CPU stall warnings

2014-03-18 Thread Paul E. McKenney
On Tue, Mar 18, 2014 at 09:51:18AM +0100, Peter Zijlstra wrote:
> On Mon, Mar 17, 2014 at 07:17:06PM -0700, Paul E. McKenney wrote:
> > On Mon, Mar 17, 2014 at 06:14:34PM +0100, Peter Zijlstra wrote:
> > > On Mon, Mar 17, 2014 at 09:58:07AM -0700, Paul E. McKenney wrote:
> > > > @@ -4111,6 +4115,7 @@ int __sched __cond_resched_softirq(void)
> > > >  {
> > > > BUG_ON(!in_softirq());
> > > >  
> > > > +   rcu_cond_resched();
> > > 
> > > Don't you have to enable BHs before doing that? And if not, that needs a
> > > comment for sure! :-)
> > 
> > No need to enable BHs, just RCU marking quiescent states.  But yes,
> > the name does look a bit scary in that context, doesn't it?  Added
> > a comment, please see below for updated patch.
> 
> The below seemed like an unrelated patch entirely :-)

Well, it certainly avoids issues with BH...

> > 
> > 
> > torture: Better summary diagnostics for build failures
> > 
> > The reaction of kvm-recheck.sh to build failures is obscure at best, and
> > easy to miss completely.  This commit therefore prints "BUG: Build failed"
> > in the summary at the end of a run.
> > 
> > Signed-off-by: Paul E. McKenney 

How about this one instead?  ;-)

Thanx, Paul



sched,rcu: Make cond_resched() report RCU quiescent states

Given a CPU running a loop containing cond_resched(), with no
other tasks runnable on that CPU, RCU will eventually report RCU
CPU stall warnings due to lack of quiescent states.  Fortunately,
every call to cond_resched() is a perfectly good quiescent state.
Unfortunately, invoking rcu_note_context_switch() is a bit heavyweight
for cond_resched(), especially given the need to disable preemption,
and, for RCU-preempt, interrupts as well.

This commit therefore maintains a per-CPU counter that causes
cond_resched(), cond_resched_lock(), and cond_resched_softirq() to call
rcu_note_context_switch(), but only about once per 256 invocations.
This ratio was chosen in keeping with the relative time constants of
RCU grace periods.

Signed-off-by: Paul E. McKenney 

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 3cea28c64ebe..8d64878111ea 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -44,6 +44,7 @@
 #include <linux/debugobjects.h>
 #include <linux/bug.h>
 #include <linux/compiler.h>
+#include <linux/percpu.h>
 #include <asm/barrier.h>
 
 extern int rcu_expedited; /* for sysctl */
@@ -287,6 +288,41 @@ bool __rcu_is_watching(void);
 #endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */
 
 /*
+ * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
+ */
+
+#define RCU_COND_RESCHED_LIM 256   /* ms vs. 100s of ms. */
+DECLARE_PER_CPU(int, rcu_cond_resched_count);
+void rcu_resched(void);
+
+/*
+ * Is it time to report RCU quiescent states?
+ *
+ * Note unsynchronized access to rcu_cond_resched_count.  Yes, we might
+ * increment some random CPU's count, and possibly also load the result from
+ * yet another CPU's count.  We might even clobber some other CPU's attempt
+ * to zero its counter.  This is all OK because the goal is not precision,
+ * but rather reasonable amortization of rcu_note_context_switch() overhead
+ * and extremely high probability of avoiding RCU CPU stall warnings.
+ * Note that this function has to be preempted in just the wrong place,
+ * many thousands of times in a row, for anything bad to happen.
+ */
+static inline bool rcu_should_resched(void)
+{
+   return __this_cpu_inc_return(rcu_cond_resched_count) >=
+  RCU_COND_RESCHED_LIM;
+}
+
+/*
+ * Report quiescent states to RCU if it is time to do so.
+ */
+static inline void rcu_cond_resched(void)
+{
+   if (unlikely(rcu_should_resched()))
+   rcu_resched();
+}
+
+/*
  * Infrastructure to implement the synchronize_() primitives in
  * TREE_RCU and rcu_barrier_() primitives in TINY_RCU.
  */
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 4c0a9b0af469..ed7a0d72562c 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -338,3 +338,21 @@ static int __init check_cpu_stall_init(void)
 early_initcall(check_cpu_stall_init);
 
 #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
+
+/*
+ * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
+ */
+
+DEFINE_PER_CPU(int, rcu_cond_resched_count);
+
+/*
+ * Report a set of RCU quiescent states, for use by cond_resched()
+ * and friends.  Out of line due to being called infrequently.
+ */
+void rcu_resched(void)
+{
+   preempt_disable();
+   __this_cpu_write(rcu_cond_resched_count, 0);
+   rcu_note_context_switch(smp_processor_id());
+   preempt_enable();
+}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b46131ef6aab..1f53e8871dd2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4071,6 +4071,7 @@ static void __cond_resched(void)
 
int __sched _cond_resched(void)

Re: cond_resched() and RCU CPU stall warnings

2014-03-18 Thread Peter Zijlstra
On Mon, Mar 17, 2014 at 07:17:06PM -0700, Paul E. McKenney wrote:
> On Mon, Mar 17, 2014 at 06:14:34PM +0100, Peter Zijlstra wrote:
> > On Mon, Mar 17, 2014 at 09:58:07AM -0700, Paul E. McKenney wrote:
> > > @@ -4111,6 +4115,7 @@ int __sched __cond_resched_softirq(void)
> > >  {
> > >   BUG_ON(!in_softirq());
> > >  
> > > + rcu_cond_resched();
> > 
> > Don't you have to enable BHs before doing that? And if not, that needs a
> > comment for sure! :-)
> 
> No need to enable BHs, just RCU marking quiescent states.  But yes,
> the name does look a bit scary in that context, doesn't it?  Added
> a comment, please see below for updated patch.

The below seemed like an unrelated patch entirely :-)

> 
> 
> torture: Better summary diagnostics for build failures
> 
> The reaction of kvm-recheck.sh to build failures is obscure at best, and
> easy to miss completely.  This commit therefore prints "BUG: Build failed"
> in the summary at the end of a run.
> 
> Signed-off-by: Paul E. McKenney 


Re: cond_resched() and RCU CPU stall warnings

2014-03-17 Thread Paul E. McKenney
On Mon, Mar 17, 2014 at 06:14:34PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 17, 2014 at 09:58:07AM -0700, Paul E. McKenney wrote:
> > @@ -4111,6 +4115,7 @@ int __sched __cond_resched_softirq(void)
> >  {
> > BUG_ON(!in_softirq());
> >  
> > +   rcu_cond_resched();
> 
> Don't you have to enable BHs before doing that? And if not, that needs a
> comment for sure! :-)

No need to enable BHs, just RCU marking quiescent states.  But yes,
the name does look a bit scary in that context, doesn't it?  Added
a comment, please see below for updated patch.

> > if (should_resched()) {
> > local_bh_enable();
> > __cond_resched();
> > 
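
For reference, a sketch of how the commented call would sit in
__cond_resched_softirq() (the exact comment wording is an assumption;
the actual updated patch did not make it into this message):

int __sched __cond_resched_softirq(void)
{
	BUG_ON(!in_softirq());

	rcu_cond_resched();  /* BH disabled OK, just recording QSes. */
	if (should_resched()) {
		local_bh_enable();
		__cond_resched();
		local_bh_disable();
		return 1;
	}
	return 0;
}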



torture: Better summary diagnostics for build failures

The reaction of kvm-recheck.sh to build failures is obscure at best, and
easy to miss completely.  This commit therefore prints "BUG: Build failed"
in the summary at the end of a run.

Signed-off-by: Paul E. McKenney 

diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
index 829186e19eb1..7f1ff1a8fc4b 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
@@ -35,7 +35,7 @@ configfile=`echo $i | sed -e 's/^.*\///'`
 ncs=`grep "Writes:  Total:" $i/console.log 2> /dev/null | tail -1 | sed -e 's/^.* Total: //' -e 's/ .*$//'`
 if test -z "$ncs"
 then
-   echo $configfile
+   echo "$configfile ---"
 else
title="$configfile --- $ncs acquisitions/releases"
	dur=`sed -e 's/^.* locktorture.shutdown_secs=//' -e 's/ .*$//' < $i/qemu-cmd 2> /dev/null`
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
index d75b1dc5ae53..307c4b95f325 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
@@ -35,7 +35,7 @@ configfile=`echo $i | sed -e 's/^.*\///'`
 ngps=`grep ver: $i/console.log 2> /dev/null | tail -1 | sed -e 's/^.* ver: //' -e 's/ .*$//'`
 if test -z "$ngps"
 then
-   echo $configfile
+   echo "$configfile ---"
 else
title="$configfile --- $ngps grace periods"
	dur=`sed -e 's/^.* rcutorture.shutdown_secs=//' -e 's/ .*$//' < $i/qemu-cmd 2> /dev/null`
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
index 26d78b7eaccf..ee1f6cae3d70 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
@@ -25,6 +25,7 @@
 # Authors: Paul E. McKenney 
 
 PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
+. tools/testing/selftests/rcutorture/bin/functions.sh
 for rd in "$@"
 do
firsttime=1
@@ -39,13 +40,24 @@ do
fi
TORTURE_SUITE="`cat $i/../TORTURE_SUITE`"
kvm-recheck-${TORTURE_SUITE}.sh $i
-   configcheck.sh $i/.config $i/ConfigFragment
-   parse-build.sh $i/Make.out $configfile
-   parse-torture.sh $i/console.log $configfile
-   parse-console.sh $i/console.log $configfile
-   if test -r $i/Warnings
+   if test -f "$i/console.log"
then
-   cat $i/Warnings
+   configcheck.sh $i/.config $i/ConfigFragment
+   parse-build.sh $i/Make.out $configfile
+   parse-torture.sh $i/console.log $configfile
+   parse-console.sh $i/console.log $configfile
+   if test -r $i/Warnings
+   then
+   cat $i/Warnings
+   fi
+   else
+   if test -f "$i/qemu-cmd"
+   then
+   print_bug qemu failed
+   else
+   print_bug Build failed
+   fi
+   echo "   $i"
fi
done
 done
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
index 7a95f86cc85a..51c34a91a375 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
@@ -106,6 +106,7 @@ then
fi
 else
cp $builddir/Make*.out $resdir
+   cp $builddir/.config $resdir || :
echo Build failed, not running KVM, see $resdir.
if test -f $builddir.wait
then



Re: cond_resched() and RCU CPU stall warnings

2014-03-17 Thread Peter Zijlstra
On Mon, Mar 17, 2014 at 09:58:07AM -0700, Paul E. McKenney wrote:
> @@ -4111,6 +4115,7 @@ int __sched __cond_resched_softirq(void)
>  {
>   BUG_ON(!in_softirq());
>  
> + rcu_cond_resched();

Don't you have to enable BHs before doing that? And if not, that needs a
comment for sure! :-)

>   if (should_resched()) {
>   local_bh_enable();
>   __cond_resched();
> 


Re: cond_resched() and RCU CPU stall warnings

2014-03-17 Thread Paul E. McKenney
On Mon, Mar 17, 2014 at 11:13:00AM +0100, Peter Zijlstra wrote:
> On Sat, Mar 15, 2014 at 06:59:14PM -0700, Paul E. McKenney wrote:
> > So I have been tightening up rcutorture a bit over the past year.
> > The other day, I came across what looked like a great opportunity for
> > further tightening, namely the schedule() in rcu_torture_reader().
> > Why not turn this into a cond_resched(), speeding up the readers a bit
> > and placing more stress on RCU?
> > 
> > And boy does it increase stress!
> > 
> > Unfortunately, this increased stress sometimes shows up in the form of
> > lots of RCU CPU stall warnings.  These can appear when an instance of
> > rcu_torture_reader() gets a CPU to itself, in which case it won't ever
> > enter the scheduler, and RCU will never see a quiescent state from that
> > CPU, which means the grace period never ends.
> > 
> > So I am taking a more measured approach to cond_resched() in
> > rcu_torture_reader() for the moment.
> > 
> > But longer term, should cond_resched() imply a set of RCU
> > quiescent states?  One way to do this would be to add calls to
> > rcu_note_context_switch() in each of the various cond_resched() functions.
> > Easy change, but of course adds some overhead.  On the other hand,
> > there might be more than a few of the 500+ calls to cond_resched() that
> > expect that RCU CPU stalls will be prevented (to say nothing of
> > might_sleep() and cond_resched_lock()).
> > 
> > Thoughts?
> 
> I share Mike's concern. Some of those functions might be too expensive
> to do in the loops where we have the cond_resched()s. And while it's only
> strictly required when nr_running==1, keying off of that seems
> unfortunate in that it makes things behave differently with a single
> running task.
> 
> I suppose your proposed per-cpu counter is the best option; even though
> it's still an extra cacheline hit in cond_resched().
> 
> As to the other cond_resched() variants, they might be a little more
> tricky; e.g., cond_resched_lock() would have you drop the lock in order to
> note the QS, etc.
> 
> So one thing that might make sense is to have something like
> rcu_should_qs(), which will indicate RCU's need for a grace-period end.
> Then we can augment the various should_resched()/spin_needbreak() etc.
> with that condition.
> 
> That also gets rid of the counter (or at least hides it in the
> implementation if RCU really can't do anything better).

I did code up a version having a per-CPU bitmask indicating
which flavors of RCU needed quiescent states, and only invoking
rcu_note_context_switch() if at least one of the flavors needed
a quiescent state.  This implementation ended up being more
complex, but worse, slowed down the fast path.  Hard to beat
__this_cpu_inc_return()...  Might be able to break even with a bit more
work, but on most real systems there is pretty much always a grace period
in flight anyway, so it does not appear to be worth it.

So how about the following?  Passes moderate rcutorture testing, no RCU
CPU stall warnings despite cond_resched() in rcu_torture_reader().

Thanx, Paul



sched,rcu: Make cond_resched() report RCU quiescent states

Given a CPU running a loop containing cond_resched(), with no
other tasks runnable on that CPU, RCU will eventually report RCU
CPU stall warnings due to lack of quiescent states.  Fortunately,
every call to cond_resched() is a perfectly good quiescent state.
Unfortunately, invoking rcu_note_context_switch() is a bit heavyweight
for cond_resched(), especially given the need to disable preemption,
and, for RCU-preempt, interrupts as well.

This commit therefore maintains a per-CPU counter that causes
cond_resched(), cond_resched_lock(), and cond_resched_softirq() to call
rcu_note_context_switch(), but only about once per 256 invocations.
This ratio was chosen in keeping with the relative time constants of
RCU grace periods.

Signed-off-by: Paul E. McKenney 

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 3cea28c64ebe..8d64878111ea 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -44,6 +44,7 @@
 #include <linux/debugobjects.h>
 #include <linux/bug.h>
 #include <linux/compiler.h>
+#include <linux/percpu.h>
 #include <asm/barrier.h>
 
 extern int rcu_expedited; /* for sysctl */
@@ -287,6 +288,41 @@ bool __rcu_is_watching(void);
 #endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */
 
 /*
+ * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings.
+ */
+
+#define RCU_COND_RESCHED_LIM 256   /* ms vs. 100s of ms. */
+DECLARE_PER_CPU(int, rcu_cond_resched_count);
+void rcu_resched(void);
+
+/*
+ * Is it time to report RCU quiescent states?
+ *
+ * Note unsynchronized access to rcu_cond_resched_count.  Yes, we might
+ * increment some random CPU's count, and possibly also load the result from
+ * yet another CPU's count.  We might even clobber some other CPU's attempt
+ * to zero its counter.  This is all OK because the goal is not precision,

Re: cond_resched() and RCU CPU stall warnings

2014-03-17 Thread Peter Zijlstra
On Sat, Mar 15, 2014 at 06:59:14PM -0700, Paul E. McKenney wrote:
> So I have been tightening up rcutorture a bit over the past year.
> The other day, I came across what looked like a great opportunity for
> further tightening, namely the schedule() in rcu_torture_reader().
> Why not turn this into a cond_resched(), speeding up the readers a bit
> and placing more stress on RCU?
> 
> And boy does it increase stress!
> 
> Unfortunately, this increased stress sometimes shows up in the form of
> lots of RCU CPU stall warnings.  These can appear when an instance of
> rcu_torture_reader() gets a CPU to itself, in which case it won't ever
> enter the scheduler, and RCU will never see a quiescent state from that
> CPU, which means the grace period never ends.
> 
> So I am taking a more measured approach to cond_resched() in
> rcu_torture_reader() for the moment.
> 
> But longer term, should cond_resched() imply a set of RCU
> quiescent states?  One way to do this would be to add calls to
> rcu_note_context_switch() in each of the various cond_resched() functions.
> Easy change, but of course adds some overhead.  On the other hand,
> there might be more than a few of the 500+ calls to cond_resched() that
> expect that RCU CPU stalls will be prevented (to say nothing of
> might_sleep() and cond_resched_lock()).
> 
> Thoughts?

I share Mike's concern. Some of those functions might be too expensive
to do in the loops where we have the cond_resched()s. And while it's only
strictly required when nr_running==1, keying off of that seems
unfortunate in that it makes things behave differently with a single
running task.

I suppose your proposed per-cpu counter is the best option; even though
it's still an extra cacheline hit in cond_resched().

As to the other cond_resched() variants, they might be a little more
tricky; e.g., cond_resched_lock() would have you drop the lock in order to
note the QS, etc.

So one thing that might make sense is to have something like
rcu_should_qs(), which will indicate RCU's need for a grace-period end.
Then we can augment the various should_resched()/spin_needbreak() etc.
with that condition.

That also gets rid of the counter (or at least hides it in the
implementation if RCU really can't do anything better).
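
A rough sketch of that idea (rcu_should_qs() is hypothetical and was
never posted; this only illustrates folding the RCU condition into the
existing tests):

bool rcu_should_qs(void);	/* hypothetical: does this CPU owe a QS? */

int __cond_resched_lock(spinlock_t *lock)	/* sketch only */
{
	int resched = should_resched();
	int ret = 0;

	if (spin_needbreak(lock) || resched || rcu_should_qs()) {
		spin_unlock(lock);
		if (resched) {
			__cond_resched();	/* schedule() notes the QS */
		} else if (rcu_should_qs()) {
			preempt_disable();
			rcu_note_context_switch(smp_processor_id());
			preempt_enable();
		} else {
			cpu_relax();
		}
		ret = 1;
		spin_lock(lock);
	}
	return ret;
}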



Re: cond_resched() and RCU CPU stall warnings

2014-03-16 Thread Mike Galbraith
On Sat, 2014-03-15 at 23:25 -0700, Paul E. McKenney wrote: 
> On Sun, Mar 16, 2014 at 07:09:42AM +0100, Mike Galbraith wrote:

> > Hm.  Since you only care about the case where your task is solo, how
> > about doing racy checks?  100% accuracy isn't required, is it?  Seems you
> > wouldn't want to unconditionally do that in tight loops.
> 
> And indeed, my current workaround unconditionally does schedule() one
> out of 256 loops.  I would do something similar here, perhaps based
> on per-CPU counters, perhaps even with racy accesses to avoid always
> doing preempt_disable()/preempt_enable().
> 
> Or did you have something else in mind?

Exactly what I meant: take a racy peek or two first.

-Mike



Re: cond_resched() and RCU CPU stall warnings

2014-03-15 Thread Paul E. McKenney
On Sun, Mar 16, 2014 at 07:14:15AM +0100, Mike Galbraith wrote:
> On Sun, 2014-03-16 at 07:09 +0100, Mike Galbraith wrote: 
> > On Sat, 2014-03-15 at 18:59 -0700, Paul E. McKenney wrote: 
> 
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index b46131ef6aab..994d2b0fd0b2 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -4075,6 +4075,9 @@ int __sched _cond_resched(void)
> > >   __cond_resched();
> > >   return 1;
> > >   }
> > > + preempt_disable();
> > > + rcu_note_context_switch(smp_processor_id());
> > > + preempt_enable();
> > >   return 0;
> > >  }
> > >  EXPORT_SYMBOL(_cond_resched);
> > 
> > Hm.  Since you only care about the case where your task is solo, how
> > about do racy checks, 100% accuracy isn't required is it?  Seems you
> > wouldn't want to unconditionally do that in tight loops.
> 
> (or do that in scheduler_tick()?)

There are checks in scheduler_tick(), but they have to assume that
if they interrupt kernel execution, they might have also interrupted
an RCU read-side critical section.  And in fact, the point of having
cond_resched() (sometimes) do a quiescent state is to communicate with
the existing checks invoked from scheduler_tick().

Thanx, Paul



Re: cond_resched() and RCU CPU stall warnings

2014-03-15 Thread Paul E. McKenney
On Sun, Mar 16, 2014 at 07:09:42AM +0100, Mike Galbraith wrote:
> On Sat, 2014-03-15 at 18:59 -0700, Paul E. McKenney wrote: 
> > So I have been tightening up rcutorture a bit over the past year.
> > The other day, I came across what looked like a great opportunity for
> > further tightening, namely the schedule() in rcu_torture_reader().
> > Why not turn this into a cond_resched(), speeding up the readers a bit
> > and placing more stress on RCU?
> > 
> > And boy does it increase stress!
> > 
> > Unfortunately, this increased stress sometimes shows up in the form of
> > lots of RCU CPU stall warnings.  These can appear when an instance of
> > rcu_torture_reader() gets a CPU to itself, in which case it won't ever
> > enter the scheduler, and RCU will never see a quiescent state from that
> > CPU, which means the grace period never ends.
> > 
> > So I am taking a more measured approach to cond_resched() in
> > rcu_torture_reader() for the moment.
> > 
> > But longer term, should cond_resched() imply a set of RCU
> > quiescent states?  One way to do this would be to add calls to
> > rcu_note_context_switch() in each of the various cond_resched() functions.
> > Easy change, but of course adds some overhead.  On the other hand,
> > there might be more than a few of the 500+ calls to cond_resched() that
> > expect that RCU CPU stalls will be prevented (to say nothing of
> > might_sleep() and cond_resched_lock()).
> > 
> > Thoughts?
> > 
> > (Untested patch below, FWIW.)
> > 
> > Thanx, Paul
> > 
> > 
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index b46131ef6aab..994d2b0fd0b2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4075,6 +4075,9 @@ int __sched _cond_resched(void)
> > __cond_resched();
> > return 1;
> > }
> > +   preempt_disable();
> > +   rcu_note_context_switch(smp_processor_id());
> > +   preempt_enable();
> > return 0;
> >  }
> >  EXPORT_SYMBOL(_cond_resched);
> 
> Hm.  Since you only care about the case where your task is solo, how
> about doing racy checks?  100% accuracy isn't required, is it?  Seems you
> wouldn't want to unconditionally do that in tight loops.

And indeed, my current workaround unconditionally does schedule() once
per 256 loops.  I would do something similar here, perhaps based
on per-CPU counters, perhaps even with racy accesses to avoid always
doing preempt_disable()/preempt_enable().
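
As a rough model of that 1-in-256 amortization (a user-space toy, not
kernel code; the names and call count are placeholders):

#include <stdio.h>

#define RESCHED_LIM 256		/* one heavyweight report per 256 calls */

static int count;	/* stands in for a racy per-CPU counter */
static int reports;	/* stands in for rcu_note_context_switch() calls */

int main(void)
{
	int i, n = 1000000;

	for (i = 0; i < n; i++) {
		if (++count >= RESCHED_LIM) {	/* cheap common case */
			count = 0;
			reports++;		/* rare, heavyweight path */
		}
	}
	printf("%d calls -> %d reports (1 per %d)\n", n, reports, n / reports);
	return 0;
}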

Or did you have something else in mind?

Thanx, Paul



Re: cond_resched() and RCU CPU stall warnings

2014-03-15 Thread Mike Galbraith
On Sun, 2014-03-16 at 07:09 +0100, Mike Galbraith wrote: 
> On Sat, 2014-03-15 at 18:59 -0700, Paul E. McKenney wrote: 

> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index b46131ef6aab..994d2b0fd0b2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4075,6 +4075,9 @@ int __sched _cond_resched(void)
> > __cond_resched();
> > return 1;
> > }
> > +   preempt_disable();
> > +   rcu_note_context_switch(smp_processor_id());
> > +   preempt_enable();
> > return 0;
> >  }
> >  EXPORT_SYMBOL(_cond_resched);
> 
> Hm.  Since you only care about the case where your task is solo, how
> about doing racy checks?  100% accuracy isn't required, is it?  Seems you
> wouldn't want to unconditionally do that in tight loops.

(or do that in scheduler_tick()?)



Re: cond_resched() and RCU CPU stall warnings

2014-03-15 Thread Mike Galbraith
On Sat, 2014-03-15 at 18:59 -0700, Paul E. McKenney wrote: 
> So I have been tightening up rcutorture a bit over the past year.
> The other day, I came across what looked like a great opportunity for
> further tightening, namely the schedule() in rcu_torture_reader().
> Why not turn this into a cond_resched(), speeding up the readers a bit
> and placing more stress on RCU?
> 
> And boy does it increase stress!
> 
> Unfortunately, this increased stress sometimes shows up in the form of
> lots of RCU CPU stall warnings.  These can appear when an instance of
> rcu_torture_reader() gets a CPU to itself, in which case it won't ever
> enter the scheduler, and RCU will never see a quiescent state from that
> CPU, which means the grace period never ends.
> 
> So I am taking a more measured approach to cond_resched() in
> rcu_torture_reader() for the moment.
> 
> But longer term, should cond_resched() imply a set of RCU
> quiescent states?  One way to do this would be to add calls to
> rcu_note_context_switch() in each of the various cond_resched() functions.
> Easy change, but of course adds some overhead.  On the other hand,
> there might be more than a few of the 500+ calls to cond_resched() that
> expect that RCU CPU stalls will be prevented (to say nothing of
> might_sleep() and cond_resched_lock()).
> 
> Thoughts?
> 
> (Untested patch below, FWIW.)
> 
>   Thanx, Paul
> 
> 
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b46131ef6aab..994d2b0fd0b2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4075,6 +4075,9 @@ int __sched _cond_resched(void)
>   __cond_resched();
>   return 1;
>   }
> + preempt_disable();
> + rcu_note_context_switch(smp_processor_id());
> + preempt_enable();
>   return 0;
>  }
>  EXPORT_SYMBOL(_cond_resched);

Hm.  Since you only care about the case where your task is solo, how
about doing racy checks?  100% accuracy isn't required, is it?  Seems you
wouldn't want to unconditionally do that in tight loops.

-Mike



cond_resched() and RCU CPU stall warnings

2014-03-15 Thread Paul E. McKenney
So I have been tightening up rcutorture a bit over the past year.
The other day, I came across what looked like a great opportunity for
further tightening, namely the schedule() in rcu_torture_reader().
Why not turn this into a cond_resched(), speeding up the readers a bit
and placing more stress on RCU?

And boy does it increase stress!

Unfortunately, this increased stress sometimes shows up in the form of
lots of RCU CPU stall warnings.  These can appear when an instance of
rcu_torture_reader() gets a CPU to itself, in which case it won't ever
enter the scheduler, and RCU will never see a quiescent state from that
CPU, which means the grace period never ends.
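
The troublesome pattern looks roughly like this (a sketch with the
reader work elided; not the actual rcu_torture_reader()):

#include <linux/kthread.h>
#include <linux/rcupdate.h>
#include <linux/sched.h>

static int busy_reader(void *unused)
{
	while (!kthread_should_stop()) {
		rcu_read_lock();
		/* ... brief read-side critical section ... */
		rcu_read_unlock();
		cond_resched();	/* a no-op when nr_running == 1 */
	}
	return 0;
}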

So I am taking a more measured approach to cond_resched() in
rcu_torture_reader() for the moment.

But longer term, should cond_resched() imply a set of RCU
quiescent states?  One way to do this would be to add calls to
rcu_note_context_switch() in each of the various cond_resched() functions.
Easy change, but of course adds some overhead.  On the other hand,
there might be more than a few of the 500+ calls to cond_resched() that
expect that RCU CPU stalls will be prevented (to say nothing of
might_sleep() and cond_resched_lock()).

Thoughts?

(Untested patch below, FWIW.)

Thanx, Paul



diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b46131ef6aab..994d2b0fd0b2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4075,6 +4075,9 @@ int __sched _cond_resched(void)
__cond_resched();
return 1;
}
+   preempt_disable();
+   rcu_note_context_switch(smp_processor_id());
+   preempt_enable();
return 0;
 }
 EXPORT_SYMBOL(_cond_resched);
