On Thu, 2012-10-18 at 04:58 -0700, Paul E. McKenney wrote: > On Wed, Oct 17, 2012 at 09:14:12AM +0200, Eric Dumazet wrote: > > From: Eric Dumazet <[email protected]> > > > > Commit 29c00b4a1d9e27 (rcu: Add event-tracing for RCU callback > > invocation) added a regression in rcu_do_batch() > > > > Under stress, RCU is supposed to allow to process all items in queue, > > instead of a batch of 10 items (blimit), but an integer overflow makes > > the effective limit being 1. > > > > So RCU cannot recover and machine eventually crash because of OOM. > > > > Using long instead of int is not really needed, convert everything > > to integers. > > <facepalm> > > Good catch!!! > > The reason for favoring long over int is that there are a few systems out > there with terabytes of main memory. In addition, there have been a few > bugs over the past few years that could result in RCU CPU stalls of more > than a minute. This makes it impossible to rule out the possibility of > a billion callbacks appearing on one CPU. > > So, does the following patch fix things, or a I still confused? > > Thanx, Paul > > ------------------------------------------------------------------------ > > rcu: Fix batch-limit size problem > > Commit 29c00b4a1d9e27 (rcu: Add event-tracing for RCU callback > invocation) added a regression in rcu_do_batch() > > Under stress, RCU is supposed to allow to process all items in queue, > instead of a batch of 10 items (blimit), but an integer overflow makes > the effective limit being 1. So, unless there is frequent idle periods > (during which RCU ignores batch limits), RCU can be forced into a > state where it cannot keep up with the callback-generation rate, > eventually resulting in OOM. > > This commit therefore converts a few variables in rcu_do_batch() from > int to long to fix this problem. > > Reported-by: Eric Dumazet <[email protected]> > Signed-off-by: Paul E. McKenney <[email protected]> > > diff --git a/kernel/rcutree.c b/kernel/rcutree.c > index e36d085..e056e1e 100644 > --- a/kernel/rcutree.c > +++ b/kernel/rcutree.c > @@ -1823,7 +1823,8 @@ static void rcu_do_batch(struct rcu_state *rsp, struct > rcu_data *rdp) > { > unsigned long flags; > struct rcu_head *next, *list, **tail; > - int bl, count, count_lazy, i; > + long bl, count, count_lazy; > + int i; > > /* If no callbacks are ready, just return.*/ > if (!cpu_has_callbacks_ready_to_invoke(rdp)) { >
Yes, why not, but global "int blimit" being an int, I really dont see the point having a long in struct rcu_data. Having 2 billions callbacks on one cpu would be problematic, I really hope nobody relies on this ;) I guess the 10/infinity switch should be smarter. something like the following maybe : rdp->blimit = max(blimit, rdp->qlen >> 6); (if queue is big, dont wait to hit 10000 before allowing more items to be handled per round) Signed-off-by: Eric Dumazet <[email protected]> Please dont forget stable teams. (3.2 + ) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

