Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-02-01 Thread Bernard King-Smith
> - Message from "Or Gerlitz" <[EMAIL PROTECTED]> on Thu, 01 Feb 2007 11:17:53 +0200 -
> 
> Dotan Barak wrote:
> > I think that now, when implementation of IPoIB CM is available and SRQ 
> > is being used, one may
> > need to use a SRQ with more than 16K WRs.
> 
> IPoIB UD uses SRQ by nature (since RX from all peers consume buffers 
> from the --only-- RQ) and lives fine with 32 buffers (or 64 you can look 
> in the code). Moreover, my assumption is that
> 
> pps(RC) <= pps(UC) <= pps(UD)
> 
> this means that what ever number of RX buffer for UD/2K MTU which is 
> "enough" to have no (or close to zero) packet loss under some traffic 
> pattern, the same pattern can be served with IPoIB CM using SRQ of the 
> same size.

I would expect that you will need more than 32 or 64 buffers when using RC 
and SRQ. With larger packets, receive processing takes longer per packet 
under RC: checksumming and copying to the socket take more time for up to 
60K of data than for 2K, so the residency time on the receive queue will be 
longer. In a traffic pattern where one adapter receives from many adapters 
over the fabric, there will be a larger imbalance between the sending rate 
and the rate at which the receiver drains the queue. Given a TCP send and 
receive window large enough for a single socket to reach peak bandwidth, 
multiple sockets will have more packets in flight for a single destination 
at the same time in this pattern.
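
One way to weigh the packet-rate argument against the residency-time argument is Little's law (occupancy = arrival rate * residency time). The sketch below is only an illustration of that framing, not something proposed in the thread; the function name and every input value are hypothetical and real numbers would have to be measured.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Little's law: average occupancy = arrival rate * average residency time.
 *   pps          - aggregate packets per second arriving from all senders
 *   residency_us - average time one receive buffer stays in use, in
 *                  microseconds
 * Both inputs are hypothetical; the result is a rough average, not a
 * worst-case bound.
 */
static uint64_t rx_buffers_in_use(uint64_t pps, uint64_t residency_us)
{
        /* Refuse inputs whose product would overflow 64 bits. */
        assert(residency_us == 0 ||
               pps <= (UINT64_MAX - 999999u) / residency_us);

        /* Round up so that a fractional buffer counts as a whole one. */
        return (pps * residency_us + 999999u) / 1000000u;
}
```

With made-up inputs, 1,000,000 pps at 32 us residency gives 32 buffers, while 50,000 pps at 2,000 us residency gives 100. Under this model a lower packet rate does not by itself imply a smaller queue.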

> 
> Or.
> 
> 
> 

Bernie King-Smith 
IBM Corporation
Server Group
Cluster System Performance 
[EMAIL PROTECTED] (845)433-8483
Tie. 293-8483 or wombat2 on NOTES 

"We are not responsible for the world we are born into, only for the world 
we leave when we die.
So we have to accept what has gone before us and work to change the only 
thing we can,
-- The Future." William Shatner
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-02-01 Thread Or Gerlitz
Michael S. Tsirkin wrote:
>> As for the user space sharing of the same limitation, how about adding 
>> to the --kernel-- struct ib_device_attr "for user space" buddy fields to 
>> max_qp_wr max_srq_wr and max_cqe such that each hw driver set both 
>> values: for the "user space" field the actual hw limitation and for 
>> "kernel space" field a value which would pass kmalloc.

> We could do that I guess but no one so far used query in kernel,
> and userspace values are currently good.

srp calls ib_query_device() but does not care about these fields. As for 
IPoIB CM, if you see things as in my other email, I guess you don't need 
to query either.

However, since this is a fairly easy change to implement, one that does not 
break the user/kernel ABI and that lets kernel consumers rely on the query 
results they get from the hw driver, I think that in the longer term we do 
want to have it done.

Or.









Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-02-01 Thread Michael S. Tsirkin
> As for the user space sharing of the same limitation, how about adding 
> to the --kernel-- struct ib_device_attr "for user space" buddy fields to 
> max_qp_wr max_srq_wr and max_cqe such that each hw driver set both 
> values: for the "user space" field the actual hw limitation and for 
> "kernel space" field a value which would pass kmalloc.

We could do that I guess but no one so far used query in kernel,
and userspace values are currently good.

-- 
MST




Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-02-01 Thread Or Gerlitz
Roland Dreier wrote:
>  > anyway, the solution that comes into my mind is to disable creating a
>  > QP/SRQ for which > 128KB allocations are needed. So
>  > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
>  > to values whose derived size still allows to use kmalloc.
> 
> But that will limit the size of the queues that userspace can create
> too.  I guess we could allocate kernel wrid arrays with vmalloc(), but
> I wonder if anyone actually cares about this limit...

mmm, I would avoid vmalloc if possible. Allocating up to 128K bytes for a 
kernel resource sounds fine.

As for user space sharing the same limitation, how about adding 
"for user space" buddy fields for max_qp_wr, max_srq_wr and max_cqe to 
the --kernel-- struct ib_device_attr, such that each hw driver sets both 
values: the actual hw limitation in the "user space" field, and a value 
which would pass kmalloc in the "kernel space" field.

Kernel ULPs calling ib_query_device() would use the original fields, so 
there is no need to patch them. Likewise, user space ULPs need no patching.

However, when the call is made from user space, ib_uverbs_query_device() 
copies the "user space" attr to the resp struct.
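
The proposal above could look roughly like the following user-space model. It is a sketch under stated assumptions only: `struct sketch_device_attr` is a cut-down stand-in for struct ib_device_attr, the `user_max_srq_wr` field and all `sketch_*` names are hypothetical, and the 128 KB cut-off is the figure used in this thread rather than a kernel constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cut-off from the thread: largest wrid array kmalloc should serve. */
#define SKETCH_KMALLOC_LIMIT (128u * 1024u)

/* Cut-down stand-in for struct ib_device_attr (hypothetical layout). */
struct sketch_device_attr {
        int max_srq_wr;      /* original field: value that passes kmalloc */
        int user_max_srq_wr; /* hypothetical buddy field: real hw limit   */
};

/* What a hw driver's query method might fill in under the proposal. */
static void sketch_fill_attr(struct sketch_device_attr *attr,
                             int hw_max_srq_wr)
{
        int kmalloc_max = (int)(SKETCH_KMALLOC_LIMIT / sizeof(uint64_t));

        assert(attr != NULL && hw_max_srq_wr >= 0);
        attr->user_max_srq_wr = hw_max_srq_wr;
        attr->max_srq_wr = hw_max_srq_wr < kmalloc_max ?
                           hw_max_srq_wr : kmalloc_max;
}

/* Model of the uverbs path: user space is handed the "user space" value. */
static int sketch_uverbs_max_srq_wr(const struct sketch_device_attr *attr)
{
        return attr->user_max_srq_wr;
}
```

In this model a kernel ULP reading `max_srq_wr` needs no change, and a user space caller still sees the unclamped hardware value.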

Or.






Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-02-01 Thread Or Gerlitz
Dotan Barak wrote:
> I think that now, when implementation of IPoIB CM is available and SRQ 
> is being used, one may
> need to use a SRQ with more than 16K WRs.

IPoIB UD uses SRQ by nature (since RX from all peers consumes buffers 
from the --only-- RQ) and lives fine with 32 buffers (or 64; you can look 
in the code). Moreover, my assumption is that

pps(RC) <= pps(UC) <= pps(UD)

This means that whatever number of RX buffers for UD/2K MTU is 
"enough" to have no (or close to zero) packet loss under some traffic 
pattern, the same pattern can be served with IPoIB CM using a SRQ of the 
same size.

Or.





Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-01-31 Thread Dotan Barak
Michael S. Tsirkin wrote:
>> I think that now, when implementation of IPoIB CM is available and SRQ 
>> is being used, one may need to use a SRQ with more than 16K WRs.
>> 
>
> Not really: IPoIB CM uses a common CQ for all recv completions, so
> it does not make sense for IPoIB CM to create a SRQ bigger than
> the max CQ size.
>
>   

In many HCAs, the maximum CQ size is 128K entries.
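
To put a number on this, the sketch below computes the wrid array size for a SRQ that is sized to match its CQ. It assumes one 64-bit wrid entry per WR, as in the mthca code quoted elsewhere in this thread; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a wrid array with one 64-bit entry per WR. */
static size_t wrid_array_bytes(size_t nr_wr)
{
        assert(nr_wr <= SIZE_MAX / sizeof(uint64_t));
        return nr_wr * sizeof(uint64_t);
}

/* A SRQ feeding a single CQ gains nothing from more WRs than CQ entries. */
static size_t srq_wr_bounded_by_cq(size_t wanted_wr, size_t max_cqe)
{
        return wanted_wr < max_cqe ? wanted_wr : max_cqe;
}
```

A SRQ bounded by a 128K-entry CQ would need 128K * 8 = 1M bytes for its wrid array, eight times the 128K bytes that 16K WRs require.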


Dotan




Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-01-31 Thread Michael S. Tsirkin
> Quoting Dotan Barak <[EMAIL PROTECTED]>:
> Subject: Re: [mthca] Creation of a SRQ with many WR (> 16K) in kernel level 
> fails
> 
> Roland Dreier wrote:
> >  > anyway, the solution that comes into my mind is to disable creating a
> >  > QP/SRQ for which > 128KB allocations are needed. So
> >  > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
> >  > to values whose derived size still allows to use kmalloc.
> >
> > But that will limit the size of the queues that userspace can create
> > too.  I guess we could allocate kernel wrid arrays with vmalloc(), but
> > I wonder if anyone actually cares about this limit...
>
> I think that now, when implementation of IPoIB CM is available and SRQ 
> is being used, one may need to use a SRQ with more than 16K WRs.

Not really: IPoIB CM uses a common CQ for all recv completions, so
it does not make sense for IPoIB CM to create a SRQ bigger than
the max CQ size.

-- 
MST




Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-01-31 Thread Dotan Barak
Roland Dreier wrote:
>  > anyway, the solution that comes into my mind is to disable creating a
>  > QP/SRQ for which > 128KB allocations are needed. So
>  > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
>  > to values whose derived size still allows to use kmalloc.
>
> But that will limit the size of the queues that userspace can create
> too.  I guess we could allocate kernel wrid arrays with vmalloc(), but
> I wonder if anyone actually cares about this limit...
>   
I think that now, when implementation of IPoIB CM is available and SRQ 
is being used, one may
need to use a SRQ with more than 16K WRs.

thanks
Dotan




Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-01-30 Thread Roland Dreier
 > anyway, the solution that comes into my mind is to disable creating a
 > QP/SRQ for which > 128KB allocations are needed. So
 > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
 > to values whose derived size still allows to use kmalloc.

But that will limit the size of the queues that userspace can create
too.  I guess we could allocate kernel wrid arrays with vmalloc(), but
I wonder if anyone actually cares about this limit...
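
For illustration, the vmalloc() idea might be modelled in user space as below. This is not driver code: `sketch_kmalloc()` and `sketch_vmalloc()` are hypothetical stand-ins built on malloc(), and the 128 KB cut-off is simply the figure discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_KMALLOC_LIMIT (128u * 1024u) /* assumed cut-off */

struct sketch_wrid {
        uint64_t *buf;
        int from_vmalloc; /* records which free routine would be needed */
};

/* Stand-in for kmalloc(): refuses requests above the assumed limit. */
static void *sketch_kmalloc(size_t size)
{
        return size <= SKETCH_KMALLOC_LIMIT ? malloc(size) : NULL;
}

/* Stand-in for vmalloc(): no size limit in this model. */
static void *sketch_vmalloc(size_t size)
{
        return malloc(size);
}

/* Try the kmalloc stand-in first, fall back to the vmalloc stand-in. */
static int sketch_alloc_wrid(struct sketch_wrid *w, size_t nr_wr)
{
        size_t bytes;

        assert(w != NULL);
        if (nr_wr == 0 || nr_wr > SIZE_MAX / sizeof(uint64_t))
                return -1;
        bytes = nr_wr * sizeof(uint64_t);

        w->from_vmalloc = 0;
        w->buf = sketch_kmalloc(bytes);
        if (!w->buf) {
                w->buf = sketch_vmalloc(bytes);
                w->from_vmalloc = 1;
        }
        return w->buf ? 0 : -1;
}

static void sketch_free_wrid(struct sketch_wrid *w)
{
        /* Kernel code would choose kfree() or vfree() via from_vmalloc. */
        free(w->buf);
        w->buf = NULL;
}
```

The point of the flag is that the two allocators need different free routines, so the driver would have to remember which one served each array.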

 - R.




Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-01-30 Thread Or Gerlitz
Dotan Barak wrote:
> When one tries to create a SRQ with many WR (> 16K WR), creation of the SRQ
> fails.

> static int mthca_alloc_srq_buf(struct mthca_dev *dev, struct mthca_pd *pd,
>                                struct mthca_srq *srq)
>         srq->wrid = kmalloc(srq->max * sizeof (u64), GFP_KERNEL);
>         if (!srq->wrid)
>                 return -ENOMEM;
> which means that creating a SRQ with 16K WRs (or more), the driver will try to
> allocate 16K*8=128K bytes using kmalloc. This is a very high amount of memory
> to be allocated using kmalloc.

mthca_alloc_wqe_buf has the same problem, as it does

    qp->wrid = kmalloc((qp->rq.max + qp->sq.max) * sizeof (u64), GFP_KERNEL);

Anyway, the solution that comes to my mind is to disallow creating a 
QP/SRQ for which > 128KB allocations are needed. So mthca_query_device() 
will set the max_qp_wr and max_srq_wr attributes to values whose derived 
size still allows kmalloc to be used.
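
As a rough sketch of how such limits could be derived (not the actual mthca_query_device() code), assuming a 128 KB cut-off and 64-bit wrid entries as in the snippets above. All `sketch_*` names are hypothetical, and splitting the QP budget evenly between the send and receive queues is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_KMALLOC_LIMIT (128u * 1024u) /* assumed cut-off */

/* Largest number of 64-bit wrid entries that fits in the assumed limit. */
static unsigned int sketch_max_wrid_entries(void)
{
        return (unsigned int)(SKETCH_KMALLOC_LIMIT / sizeof(uint64_t));
}

/* SRQ: one wrid entry per WR, so the budget applies directly. */
static unsigned int sketch_max_srq_wr(unsigned int hw_max_srq_wr)
{
        unsigned int limit = sketch_max_wrid_entries();

        return hw_max_srq_wr < limit ? hw_max_srq_wr : limit;
}

/*
 * QP: the wrid array covers rq.max + sq.max entries, so each queue gets
 * half of the budget here (an even split is assumed for illustration).
 */
static unsigned int sketch_max_qp_wr(unsigned int hw_max_qp_wr)
{
        unsigned int limit = sketch_max_wrid_entries() / 2;

        return hw_max_qp_wr < limit ? hw_max_qp_wr : limit;
}
```

Under these assumptions the reported SRQ limit would be 16384 WRs, and the per-queue QP limit 8192 WRs.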

Or.






[openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails

2007-01-30 Thread Dotan Barak
Hi Roland.

I opened bug 331 in the bugzilla with the following content:


When one tries to create a SRQ with many WR (> 16K WR), creation of the SRQ
fails.

The problem appears to be in the file: mthca_srq.c.

Here is the problematic code:

static int mthca_alloc_srq_buf(struct mthca_dev *dev, struct mthca_pd *pd,
                               struct mthca_srq *srq)
{
        struct mthca_data_seg *scatter;
        void *wqe;
        int err;
        int i;

        if (pd->ibpd.uobject)
                return 0;

        srq->wrid = kmalloc(srq->max * sizeof (u64), GFP_KERNEL);
        if (!srq->wrid)
                return -ENOMEM;


This means that when creating a SRQ with 16K WRs (or more), the driver will
try to allocate 16K*8=128K bytes using kmalloc. This is a very large amount
of memory to allocate using kmalloc.


A possible fix is to replace this kmalloc with a different type of memory 
allocation.


Thanks
Dotan

