> From: Jason Gunthorpe <j...@nvidia.com>
> Sent: Tuesday, April 6, 2021 9:17 PM
> 
> On Mon, Apr 05, 2021 at 08:49:56AM +0300, Leon Romanovsky wrote:
> > @@ -2293,6 +2295,17 @@ static void ib_sa_event(struct ib_event_handler
> *handler,
> >     }
> >  }
> >
> > +static bool ib_sa_client_supported(struct ib_device *device) {
> > +   unsigned int i;
> > +
> > +   rdma_for_each_port(device, i) {
> > +           if (rdma_cap_ib_sa(device, i))
> > +                   return true;
> > +   }
> > +   return false;
> > +}
> 
> This is already done though:
It is, but ib_sa_device() allocates an ib_sa_device worth of struct for each
port without checking rdma_cap_ib_sa().
This results in allocating 40 * 512 = 20480 bytes, rounded up to the next
power of 2 (32K bytes), for an rdma device with 512 ports.
Other modules waste memory in a similar way.
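
As a rough check of that arithmetic, here is a small userspace sketch. The
40-byte per-port size and the 512-port count are the figures quoted above;
round_up_pow2() is a hypothetical helper that only approximates a
power-of-two allocator size class. It is not a kernel API.

```c
#include <stddef.h>

/*
 * Hypothetical helper: round n up to the next power of two, as a
 * stand-in for an allocator that serves requests from power-of-two
 * size classes.
 */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes actually consumed for a per-port array under that assumption. */
static size_t per_port_alloc_bytes(size_t per_port, size_t ports)
{
	return round_up_pow2(per_port * ports);
}
```

With these assumptions, per_port_alloc_bytes(40, 512) gives 32768, which
matches the 32K figure.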

> 
>       for (i = 0; i <= e - s; ++i) {
>               spin_lock_init(&sa_dev->port[i].ah_lock);
>               if (!rdma_cap_ib_sa(device, i + 1))
>                       continue;
> [..]
> 
>       if (!count) {
>               ret = -EOPNOTSUPP;
>               goto free;
> 
> Why does it need to be duplicated? The other patches are all basically like
> that too.
> 
> The add_one function should return -EOPNOTSUPP if it doesn't want to run
> on this device and any supported checks should just be at the front - this is
> how things work right now
> 
I am OK with folding this check into the beginning of the add callback.
When 512 to 1K RoCE devices are used, they do not have the SA, CM, CMA etc.
caps set, and every client has to go through refcnt + xa + sem and then
unroll them.
The is_supported() routine helps to cut all of that out. I have not measured
how many usec it saves.
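
To make the folded variant concrete, here is a minimal userspace sketch of a
capability check at the front of the add callback. struct mock_device,
mock_client_supported() and mock_add_one() are hypothetical stand-ins for
illustration only; they are not the kernel's ib_device or the actual patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct ib_device: a port count plus one
 * SA capability flag per port (sa_cap[i] describes port i + 1).
 */
struct mock_device {
	unsigned int num_ports;
	const bool *sa_cap;
};

/* True if at least one port of the device has the SA capability. */
static bool mock_client_supported(const struct mock_device *dev)
{
	unsigned int i;

	for (i = 0; i < dev->num_ports; i++) {
		if (dev->sa_cap[i])
			return true;
	}
	return false;
}

/*
 * Add callback with the supported check at the front: bail out with
 * -EOPNOTSUPP before any per-port allocation or setup would happen.
 */
static int mock_add_one(const struct mock_device *dev)
{
	if (!mock_client_supported(dev))
		return -EOPNOTSUPP;

	/* Per-port allocation and initialization would follow here. */
	return 0;
}
```

In this sketch a device with no SA-capable port returns before anything is
allocated, so there is nothing to unroll on that path.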

Please let me know.
