Quoting Antoine Tenart (2021-03-22 18:41:30)
> Quoting Matthew Wilcox (2021-03-22 17:54:39)
> > On Mon, Mar 22, 2021 at 04:43:29PM +0100, Antoine Tenart wrote:
> > > xps_queue_show is mostly made of an RCU read-side critical section and
> > > calls bitmap_zalloc with GFP_KERNEL in the middle of it. That is not
> > > allowed as this call may sleep and such behaviours aren't allowed in RCU
> > > read-side critical sections. Fix this by using GFP_NOWAIT instead.
> >
> > This would be another way of fixing the problem that is slightly less
> > complex than my initial proposal, but does allow for using GFP_KERNEL
> > for fewer failures:
> >
> > @@ -1366,11 +1366,10 @@ static ssize_t xps_queue_show(struct net_device
> > *dev, unsigned int index,
> > {
> > struct xps_dev_maps *dev_maps;
> > unsigned long *mask;
> > - unsigned int nr_ids;
> > + unsigned int nr_ids, new_nr_ids;
> > int j, len;
> >
> > - rcu_read_lock();
> > - dev_maps = rcu_dereference(dev->xps_maps[type]);
> > + dev_maps = READ_ONCE(dev->xps_maps[type]);
>
> Couldn't dev_maps be freed between here and the read of dev_maps->nr_ids
> as we're not in an RCU read-side critical section?
* The first read of dev_maps->nr_ids, happening before rcu_read_lock,
not the one shown below.
> > /* Default to nr_cpu_ids/dev->num_rx_queues and do not just return 0
> > * when dev_maps hasn't been allocated yet, to be backward
> > compatible.
> > @@ -1379,10 +1378,18 @@ static ssize_t xps_queue_show(struct net_device
> > *dev, unsigned int index,
> > (type == XPS_CPUS ? nr_cpu_ids : dev->num_rx_queues);
> >
> > mask = bitmap_zalloc(nr_ids, GFP_KERNEL);
> > - if (!mask) {
> > - rcu_read_unlock();
> > + if (!mask)
> > return -ENOMEM;
> > - }
> > +
> > + rcu_read_lock();
> > + dev_maps = rcu_dereference(dev->xps_maps[type]);
> > + /* if nr_ids shrank in the meantime, do not overrun array.
> > + * if it increased, we just won't show the new ones
> > + */
> > + new_nr_ids = dev_maps ? dev_maps->nr_ids :
> > + (type == XPS_CPUS ? nr_cpu_ids :
> > dev->num_rx_queues);
> > + if (new_nr_ids < nr_ids)
> > + nr_ids = new_nr_ids;