On Wed, Jun 17, 2020 at 04:46:09PM -0700, Paul E. McKenney wrote:
> > +   // Handle the first two channels.
> > +   for (i = 0; i < FREE_N_CHANNELS; i++) {
> > +           for (; bkvhead[i]; bkvhead[i] = bnext) {
> > +                   bnext = bkvhead[i]->next;
> > +                   debug_rcu_bhead_unqueue(bkvhead[i]);
> > +
> > +                   rcu_lock_acquire(&rcu_callback_map);
> > +                   if (i == 0) { // kmalloc() / kfree().
> > +                           trace_rcu_invoke_kfree_bulk_callback(
> > +                                   rcu_state.name, bkvhead[i]->nr_records,
> > +                                   bkvhead[i]->records);
> > +
> > +                           kfree_bulk(bkvhead[i]->nr_records,
> > +                                   bkvhead[i]->records);
> > +                   } else { // vmalloc() / vfree().
> > +                           for (j = 0; j < bkvhead[i]->nr_records; j++) {
> > +                                   trace_rcu_invoke_kfree_callback(
> > +                                           rcu_state.name,
> > +                                           bkvhead[i]->records[j], 0);
> > +
> > +                                   vfree(bkvhead[i]->records[j]);
> > +                           }
> > +                   }
> > +                   rcu_lock_release(&rcu_callback_map);
> 
> Not an emergency, but did you look into replacing this "if" statement
> with an array of pointers to functions implementing the legs of the
> "if" statement?  If nothing else, this would greatly reduce indentation.

I don't think replacing direct function calls with indirect function
calls is a great suggestion given the current state of play around branch
prediction: with retpoline-style Spectre mitigations in place, every
indirect call pays a significant penalty, so the if/else is cheaper.

I'd suggest:

                        rcu_lock_acquire(&rcu_callback_map);
                        trace_rcu_invoke_kfree_bulk_callback(rcu_state.name,
                                bkvhead[i]->nr_records, bkvhead[i]->records);
                        if (i == 0) {
                                kfree_bulk(bkvhead[i]->nr_records,
                                        bkvhead[i]->records);
                        } else {
                                for (j = 0; j < bkvhead[i]->nr_records; j++) {
                                        vfree(bkvhead[i]->records[j]);
                                }
                        }
                        rcu_lock_release(&rcu_callback_map);

But I'd also suggest adding a vfree_bulk().  There are a few things
which would be better done in bulk as part of the vfree process
(we batch them up already, but I'm sure we could do better).
