Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
> - Message from "Or Gerlitz" <[EMAIL PROTECTED]> on Thu, 01 Feb 2007 11:17:53 +0200 -
>
> Dotan Barak wrote:
> > I think that now, when implementation of IPoIB CM is available and SRQ
> > is being used, one may need to use a SRQ with more than 16K WRs.
>
> IPoIB UD uses SRQ by nature (since RX from all peers consume buffers
> from the --only-- RQ) and lives fine with 32 buffers (or 64 you can look
> in the code). Moreover, my assumption is that
>
> pps(RC) <= pps(UC) <= pps(UD)
>
> this means that whatever number of RX buffers for UD/2K MTU which is
> "enough" to have no (or close to zero) packet loss under some traffic
> pattern, the same pattern can be served with IPoIB CM using SRQ of the
> same size.

I would expect that you will need more than 32 or 64 buffers using RC
and SRQ. With larger packets it takes longer to do receive processing on
each packet under RC: the checksum and the copy to the socket take more
time because there is up to 60K of data per packet vs. 2K. The residency
time on the receive queue will therefore be longer.

In the traffic pattern where one adapter is receiving from many adapters
over the fabric, there will be a larger imbalance between the rate at
which senders fill the queue and the rate at which the receiver drains
it. Given a large enough TCP send and receive window for a single socket
to get peak bandwidth, multiple sockets will have more packets in flight
for a single destination at the same time in this pattern.

> Or.

Bernie King-Smith
IBM Corporation
Server Group
Cluster System Performance
[EMAIL PROTECTED] (845)433-8483 Tie. 293-8483 or wombat2 on NOTES

"We are not responsible for the world we are born into, only for the
world we leave when we die. So we have to accept what has gone before us
and work to change the only thing we can, -- The Future." William Shatner

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
Michael S. Tsirkin wrote:
>> As for the user space sharing of the same limitation, how about adding
>> to the --kernel-- struct ib_device_attr "for user space" buddy fields to
>> max_qp_wr max_srq_wr and max_cqe such that each hw driver set both
>> values: for the "user space" field the actual hw limitation and for
>> "kernel space" field a value which would pass kmalloc.

> We could do that I guess but no one so far used query in kernel,
> and userspace values are currently good.

srp calls ib_query_device but does not care about these fields. As for
IPoIB CM, if you see things as in my other email, I guess you don't need
to query either.

However, this change is fairly easy to implement, does not break the
user/kernel ABI, and allows kernel consumers to count on the query
results they get from the hw driver, so longer term I think we do want
to have it done.

Or.
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
> As for the user space sharing of the same limitation, how about adding
> to the --kernel-- struct ib_device_attr "for user space" buddy fields to
> max_qp_wr max_srq_wr and max_cqe such that each hw driver set both
> values: for the "user space" field the actual hw limitation and for
> "kernel space" field a value which would pass kmalloc.

We could do that, I guess, but no one so far has used the query in the
kernel, and the userspace values are currently good.

--
MST
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
Roland Dreier wrote:
> > anyway, the solution that comes into my mind is to disable creating a
> > QP/SRQ for which > 128KB allocations are needed. So
> > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
> > to values whose derived size still allows to use kmalloc.
>
> But that will limit the size of the queues that userspace can create
> too. I guess we could allocate kernel wrid arrays with vmalloc(), but
> I wonder if anyone actually cares about this limit...

mmm, I would avoid vmalloc if possible. Allocating up to 128K bytes for a
kernel resource sounds fine.

As for the user space sharing of the same limitation, how about adding
to the --kernel-- struct ib_device_attr "for user space" buddy fields to
max_qp_wr, max_srq_wr and max_cqe, such that each hw driver sets both
values: for the "user space" field the actual hw limitation, and for the
"kernel space" field a value which would pass kmalloc.

Kernel ULPs calling ib_query_device would use the original fields, so
there is no need to patch them. User space ULPs need no patching either:
when the call is made from user space, ib_uverbs_query_device copies the
"user space" attr to the resp struct.

Or.
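The buddy-field proposal can be sketched in a few lines of plain C. This is only an illustration of the idea, not kernel code: the struct names, the `user_` prefixed fields and `sketch_fill_user_resp()` are all hypothetical and do not exist in `struct ib_device_attr` or in the uverbs code.

```c
/* Hypothetical subset of the device attributes. The "user_" fields are
 * invented for this sketch to stand for the proposed buddy fields. */
struct sketch_device_attr {
    int max_qp_wr;        /* kernel-space limit: small enough for kmalloc */
    int max_srq_wr;
    int max_cqe;
    int user_max_qp_wr;   /* buddy fields: the actual hardware limits */
    int user_max_srq_wr;
    int user_max_cqe;
};

/* Hypothetical response struct returned to a user-space caller. */
struct sketch_query_resp {
    int max_qp_wr;
    int max_srq_wr;
    int max_cqe;
};

/* On the user-space query path, copy the buddy fields into the response,
 * so user space keeps seeing the hardware limits while kernel callers
 * read the original (kmalloc-safe) fields directly. */
static void sketch_fill_user_resp(const struct sketch_device_attr *attr,
                                  struct sketch_query_resp *resp)
{
    resp->max_qp_wr  = attr->user_max_qp_wr;
    resp->max_srq_wr = attr->user_max_srq_wr;
    resp->max_cqe    = attr->user_max_cqe;
}
```

With this layout neither kernel nor user-space consumers need source changes; only the hw driver (which fills both sets of fields) and the user-space query path would change.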
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
Dotan Barak wrote:
> I think that now, when implementation of IPoIB CM is available and SRQ
> is being used, one may need to use a SRQ with more than 16K WRs.

IPoIB UD uses SRQ by nature (since RX from all peers consumes buffers
from the --only-- RQ) and lives fine with 32 buffers (or 64, you can look
in the code). Moreover, my assumption is that

pps(RC) <= pps(UC) <= pps(UD)

This means that whatever number of RX buffers for UD/2K MTU is "enough"
to have no (or close to zero) packet loss under some traffic pattern,
the same pattern can be served with IPoIB CM using an SRQ of the same
size.

Or.
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
Michael S. Tsirkin wrote:
>> I think that now, when implementation of IPoIB CM is available and SRQ
>> is being used, one may need to use a SRQ with more than 16K WRs.
>
> Not really: IPoIB CM uses a common CQ for all recv completions, so
> it does not make sense for IPoIB CM to create a SRQ bigger than
> the max CQ size.

In many HCAs, the maximum CQ size is 128K entries.

Dotan
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
> Quoting Dotan Barak <[EMAIL PROTECTED]>:
> Subject: Re: [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
>
> Roland Dreier wrote:
> > > anyway, the solution that comes into my mind is to disable creating a
> > > QP/SRQ for which > 128KB allocations are needed. So
> > > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
> > > to values whose derived size still allows to use kmalloc.
> >
> > But that will limit the size of the queues that userspace can create
> > too. I guess we could allocate kernel wrid arrays with vmalloc(), but
> > I wonder if anyone actually cares about this limit...
>
> I think that now, when implementation of IPoIB CM is available and SRQ
> is being used, one may need to use a SRQ with more than 16K WRs.

Not really: IPoIB CM uses a common CQ for all recv completions, so
it does not make sense for IPoIB CM to create a SRQ bigger than
the max CQ size.

--
MST
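The bound described above can be written as a small helper. This is a minimal sketch in plain C rather than driver code; `useful_srq_size()` and its parameter names are hypothetical, and the limits would in practice come from the device query.

```c
#include <stdint.h>

/* Smaller of two unsigned values. */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* If every receive completion is reported on one shared CQ, posting more
 * receives than the CQ has entries buys nothing, so the useful SRQ depth
 * is capped by both the SRQ limit and the CQ limit.
 * All three inputs are assumptions supplied by the caller. */
static uint32_t useful_srq_size(uint32_t wanted_wr,
                                uint32_t max_srq_wr,
                                uint32_t max_cqe)
{
    return min_u32(wanted_wr, min_u32(max_srq_wr, max_cqe));
}
```

For example, a request for 256K WRs against a device that reports a 128K-entry maximum CQ would be capped at 128K, whatever the SRQ limit is above that.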
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
Roland Dreier wrote:
> > anyway, the solution that comes into my mind is to disable creating a
> > QP/SRQ for which > 128KB allocations are needed. So
> > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
> > to values whose derived size still allows to use kmalloc.
>
> But that will limit the size of the queues that userspace can create
> too. I guess we could allocate kernel wrid arrays with vmalloc(), but
> I wonder if anyone actually cares about this limit...

I think that now, when implementation of IPoIB CM is available and SRQ
is being used, one may need to use a SRQ with more than 16K WRs.

thanks
Dotan
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
> anyway, the solution that comes into my mind is to disable creating a
> QP/SRQ for which > 128KB allocations are needed. So
> mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
> to values whose derived size still allows to use kmalloc.

But that will limit the size of the queues that userspace can create
too. I guess we could allocate kernel wrid arrays with vmalloc(), but
I wonder if anyone actually cares about this limit...

 - R.
Re: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
Dotan Barak wrote:
> When one tries to create a SRQ with many WR (> 16K WR), creation of the SRQ
> fails.
> static int mthca_alloc_srq_buf(struct mthca_dev *dev, struct mthca_pd *pd,
>                                struct mthca_srq *srq)
>         srq->wrid = kmalloc(srq->max * sizeof (u64), GFP_KERNEL);
>         if (!srq->wrid)
>                 return -ENOMEM;
> which means that creating a SRQ with 16K WRs (or more), the driver will try to
> allocate 16K*8=128K bytes using kmalloc. This is a very high amount of memory
> to be allocated using kmalloc.

mthca_alloc_wqe_buf has the same problem, as it does

        qp->wrid = kmalloc((qp->rq.max + qp->sq.max) * sizeof (u64),
                           GFP_KERNEL);

Anyway, the solution that comes to my mind is to disallow creating a
QP/SRQ for which allocations of more than 128KB are needed. So
mthca_query_device() would set the max_qp_wr and max_srq_wr attributes
to values whose derived size still allows kmalloc to be used.

Or.
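The clamping idea can be sketched as follows. This is an illustration in plain C, not a patch to mthca_query_device(): the 128KB figure is taken from the discussion and treated as an assumption, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption from the thread: a single kmalloc() should stay at or
 * below 128 KB. The real limit depends on the kernel in use. */
#define ASSUMED_KMALLOC_MAX ((size_t)128 * 1024)

/* SRQ: the wrid array holds one 64-bit entry per WR, so the largest
 * depth that still fits is ASSUMED_KMALLOC_MAX / 8 entries. */
static uint32_t clamp_max_srq_wr(uint32_t hw_max_srq_wr)
{
    uint32_t fit = (uint32_t)(ASSUMED_KMALLOC_MAX / sizeof(uint64_t));

    return hw_max_srq_wr < fit ? hw_max_srq_wr : fit;
}

/* QP: one wrid array covers both queues (rq.max + sq.max entries).
 * Taking the worst case where both queues use the full per-queue
 * limit, each queue gets half of the budget. */
static uint32_t clamp_max_qp_wr(uint32_t hw_max_qp_wr)
{
    uint32_t fit = (uint32_t)(ASSUMED_KMALLOC_MAX / sizeof(uint64_t) / 2);

    return hw_max_qp_wr < fit ? hw_max_qp_wr : fit;
}
```

Under these assumptions a device that advertises 65535 WRs would be reported as 16384 for SRQs and 8192 per queue for QPs; a real driver would also have to account for any rounding it applies to the requested depth.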
[openib-general] [mthca] Creation of a SRQ with many WR (> 16K) in kernel level fails
Hi Roland.

I opened bug 331 in the bugzilla with the following content:

When one tries to create a SRQ with many WR (> 16K WR), creation of the
SRQ fails.

The problem appears to be in the file mthca_srq.c. Here is the
problematic code:

static int mthca_alloc_srq_buf(struct mthca_dev *dev, struct mthca_pd *pd,
                               struct mthca_srq *srq)
{
        struct mthca_data_seg *scatter;
        void *wqe;
        int err;
        int i;

        if (pd->ibpd.uobject)
                return 0;

        srq->wrid = kmalloc(srq->max * sizeof (u64), GFP_KERNEL);
        if (!srq->wrid)
                return -ENOMEM;

This means that when creating a SRQ with 16K WRs (or more), the driver
will try to allocate 16K*8=128K bytes (or more) using kmalloc. This is a
very large amount of memory to allocate with kmalloc.

The fix can be to replace this kmalloc with a different type of memory
allocation.

Thanks
Dotan
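The arithmetic in the report can be checked with a small stand-alone C sketch. It is not driver code: `ASSUMED_KMALLOC_MAX` and both helper names are assumptions made for illustration, and the driver may round the requested depth before sizing the array, which would move the exact boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound for a single kmalloc() request (128 KB here).
 * The real value depends on kernel version and configuration. */
#define ASSUMED_KMALLOC_MAX ((size_t)128 * 1024)

/* Bytes needed for the wrid array: one 64-bit work request ID per WR,
 * mirroring srq->max * sizeof (u64) in the quoted code. */
static size_t srq_wrid_bytes(size_t max_wr)
{
    return max_wr * sizeof(uint64_t);
}

/* Returns 1 if the array fits within the assumed limit, 0 otherwise. */
static int srq_wrid_fits(size_t max_wr)
{
    return srq_wrid_bytes(max_wr) <= ASSUMED_KMALLOC_MAX;
}
```

With these assumptions, 16K WRs need exactly 128K bytes, and any larger depth no longer fits, which matches the "> 16K" in the subject line.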