The rcu_start_future_gp() function uses a sloppy check for a grace
period being in progress, which works today because there are a number
of code sequences that resolve the resulting races.  However, some of
these race-resolution code sequences must acquire the root rcu_node
structure's ->lock, and contention on that lock has started manifesting.
This commit therefore makes the rcu_start_future_gp() check more precise,
eliminating the sloppy lockless check of the rcu_state structure's ->gpnum
and ->completed fields.  The effect is that rcu_start_future_gp() will
sometimes unnecessarily attempt to start a new grace period, but this
overhead will be reduced later using funnel locking.

Reported-by: Nicholas Piggin <npig...@gmail.com>
Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>
---
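For readers without the surrounding source, the simplified check can be
sketched as a small userspace model.  This is only an illustration under
stated assumptions, not the kernel code: struct model_rcu_node and
model_record_future_gp() are hypothetical names, locking and tracing are
omitted, and only the three fields the check touches are modeled.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the kernel's rcu_node. */
struct model_rcu_node {
	unsigned long gpnum;      /* most recently started grace period */
	unsigned long completed;  /* most recently completed grace period */
	int need_future_gp[2];    /* requests, indexed by low bit of GP number */
};

/*
 * Model of the simplified check: consult only this node's own fields
 * (the real code does so while holding rnp->lock) and never the root's.
 * Returns true if the request for grace period "c" was recorded to be
 * noticed at the end of the current grace period, false if the caller
 * must go on to attempt to start a grace period itself.
 */
static bool model_record_future_gp(struct model_rcu_node *rnp, unsigned long c)
{
	if (rnp->gpnum != rnp->completed) {
		rnp->need_future_gp[c & 0x1]++;
		return true;
	}
	return false;
}
```

In this model, a node that sees no grace period in flight always takes
the "attempt to start one" path, even if the root already has one under
way, which corresponds to the extra work described above.
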
 kernel/rcu/tree.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f5ca72f2ed43..4bbba17422cd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1705,20 +1705,12 @@ rcu_start_future_gp(struct rcu_node *rnp, struct rcu_data *rdp,
        }
 
        /*
-        * If either this rcu_node structure or the root rcu_node structure
-        * believe that a grace period is in progress, then we must wait
-        * for the one following, which is in "c".  Because our request
-        * will be noticed at the end of the current grace period, we don't
-        * need to explicitly start one.  We only do the lockless check
-        * of rnp_root's fields if the current rcu_node structure thinks
-        * there is no grace period in flight, and because we hold rnp->lock,
-        * the only possible change is when rnp_root's two fields are
-        * equal, in which case rnp_root->gpnum might be concurrently
-        * incremented.  But that is OK, as it will just result in our
-        * doing some extra useless work.
+        * If this rcu_node structure believes that a grace period is in
+        * progress, then we must wait for the one following, which is in
+        * "c".  Because our request will be noticed at the end of the
+        * current grace period, we don't need to explicitly start one.
         */
-       if (rnp->gpnum != rnp->completed ||
-           READ_ONCE(rnp_root->gpnum) != READ_ONCE(rnp_root->completed)) {
+       if (rnp->gpnum != rnp->completed) {
                rnp->need_future_gp[c & 0x1]++;
                trace_rcu_future_gp(rnp, rdp, c, TPS("Startedleaf"));
                goto out;
-- 
2.5.2
