Re: [openib-general] Re: Mellanox HCAs: outstanding RDMAs

Michael Krause Fri, 09 Jun 2006 07:08:38 -0700

Whether iWARP or IB, there is a fixed number of RDMA Read Requests allowed to be outstanding at any given time. If one posts more RDMA Read requests than that number, the transmit queue stalls. This is documented in both technology specifications. It is something that all ULPs should be aware of, and some go so far as to communicate the limit as part of the Hello / login exchange. This allows the ULP implementation to determine whether it wants to stall or wants to wait until Read Responses complete before posting another request. This isn't something silent and it isn't something new; it is up to the ULP implementation to decide how to deal with the issue.
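To make the "wait until Read Responses complete" option concrete, here is a minimal sketch of a ULP-side gate, written in Python only for brevity. Everything in it is an assumption for illustration: the ReadGate class, the max_outstanding value (presumed to have been learned during the Hello / login exchange), and the post_read callable that stands in for whatever actually posts the work request. It is not taken from any ULP or verbs API.

```python
from collections import deque


class ReadGate:
    """Hypothetical ULP-side limiter for outstanding RDMA Reads."""

    def __init__(self, max_outstanding, post_read):
        if max_outstanding < 1:
            raise ValueError("max_outstanding must be at least 1")
        # Assumed to come from the Hello / login exchange with the peer.
        self.max_outstanding = max_outstanding
        # Assumed callable that posts one RDMA Read to the send queue.
        self.post_read = post_read
        self.in_flight = 0
        self.pending = deque()  # Reads held back inside the ULP

    def submit(self, request):
        """Post the Read now if a slot is free; otherwise hold it in the ULP."""
        if self.in_flight < self.max_outstanding:
            self.in_flight += 1
            self.post_read(request)
            return True
        self.pending.append(request)
        return False

    def on_read_complete(self):
        """Call once per Read completion; frees a slot and posts the next held Read."""
        if self.in_flight == 0:
            raise RuntimeError("completion without an outstanding Read")
        self.in_flight -= 1
        if self.pending:
            self.in_flight += 1
            self.post_read(self.pending.popleft())
```

With a gate like this the excess Reads wait in the ULP's own list, so Sends and Writes posted in the meantime are not queued behind a Read that cannot yet be issued.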

BTW, this is part of the hardware and associated specifications, so it is up to software to deal with the limited hardware resources and the associated consequences. Please keep in mind that there are a limited number of RDMA Read Request / Atomic resource "slots" at the receiving HCA / RNIC. These are kept in hardware, thus one must know the exact limit to avoid creating protocol problems. A ULP transmitter may post more requests to the transmit queue than the allotted slots, but the transmitting (source) HCA / RNIC must not issue the excess to the remote. These requests do cause the source to stall. This is a well understood problem, and if people give the iSCSI / iSER, DA, or SDP specs a good read they can see that this issue is comprehended. I agree with people that ULP designers / implementers must pay close attention to this constraint, as it is in the iWARP / IB specifications for a very good reason, and these semantics must be preserved to maintain the ordering requirements that are used by the overall RDMA protocols themselves.
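The stall described above can be pictured with a toy model, again in Python and again purely illustrative. The issue_in_order function, the ("read" / "send", name) tuples and the read_limit parameter are all made up for this sketch; real HCAs / RNICs implement this in hardware and the details differ. The only point is that a Read which would exceed the limit blocks everything posted behind it.

```python
from collections import deque


def issue_in_order(send_queue, read_limit, reads_outstanding=0):
    """Toy model of a source HCA / RNIC draining its send queue in order.

    send_queue holds (kind, name) tuples, kind being "read" or "send".
    Returns the work requests issued on this pass and the new count of
    outstanding Reads. Anything left in send_queue is stalled.
    """
    issued = []
    while send_queue:
        kind, _name = send_queue[0]
        if kind == "read":
            if reads_outstanding >= read_limit:
                # Head of the queue cannot go out; nothing behind it may pass.
                break
            reads_outstanding += 1
        issued.append(send_queue.popleft())
    return issued, reads_outstanding


# Three Reads and a Send posted against an assumed limit of two.
sq = deque([("read", "r1"), ("read", "r2"), ("read", "r3"), ("send", "s1")])
first_pass, outstanding = issue_in_order(sq, read_limit=2)

# One Read Response arrives, freeing a slot, and the queue moves again.
second_pass, outstanding = issue_in_order(sq, 2, outstanding - 1)
```

In this model the first pass issues only r1 and r2; s1 stays stuck behind r3 until a Read Response frees a slot, which is the ordering-preserving stall a ULP has to plan around.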

Mike

At 05:24 AM 6/6/2006, Talpey, Thomas wrote:

At 03:43 AM 6/6/2006, Michael S. Tsirkin wrote:
>Quoting r. Talpey, Thomas <[EMAIL PROTECTED]>:
>> Semantically, the provider is not required to provide any such flow control
>> behavior by the way. The Mellanox one apparently does, but it is not
>> a requirement of the verbs, it's a requirement on the upper layer. If more
>> RDMA Reads are posted than the remote peer supports, the connection
>> may break.
>
>This does not sound right. Isn't this the meaning of this field:
>"Initiator Depth: Number of RDMA Reads & atomic operations
>outstanding at any time"? Shouldn't any provider enforce this limit?

The core spec does not require it. An implementation *may* enforce it,
but is not *required* to do so. And as pointed out in the other message,
there are repercussions of doing so.

I believe the silent queue stalling is a bit of a time bomb for upper layers,
whose implementers are quite likely unaware of the danger. I greatly
prefer an implementation which simply sends the RDMA Read request,
resulting in a failed (but unblocked!) connection. Silence is a very
dangerous thing, no matter how helpful the intent.

Tom.

_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
