Re: [openib-general] design for communication established affiliated asynchronous event handling
Sean Hefty wrote:
> Rimmer, Todd wrote:
>> The CM would open the CA, provide its async event callback routine and
>> perform a special register_cm() verbs call. Of course most CM traffic
>> would occur on the GSI QP, so this open CA instance was only for this
>> purpose. This special verb was only available in kernel space (avoiding
>> security issue of application stealing CM interface and because our CM
>> was in the kernel anyway).
>
> Thanks for the info. I'm considering this sort of approach.

OK, so you opt for a change that keeps the whole solution inside the IB stack core (hw driver / core / CM): the CM gets an async event, which makes it synthesize an RTU and act on it. So we went down from CMA-level handling to CM-level handling, and it would work for both user and kernel consumers. The price is having to change the verbs access layer so that the CM can register for QP async events.

Again, with this solution too the ULP has to be aware of CQ completions related to a QP for which the ESTABLISHED event has not yet been delivered on the associated CMA ID.

Sounds good. In fact our gen1 stack used this solution as well, relying on the VAPI driver feature of delivering affiliated async events to all the kernel consumers (the async event ***handler*** was not associated with a specific QP).

Or.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] design for communication established affiliated asynchronous event handling
> -----Original Message-----
> From: Michael S. Tsirkin
> Sent: Thursday, June 29, 2006 1:45 AM
>
> Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> > Subject: Re: design for communication established affiliated
> > asynchronous event handling
> >
> > >I suggest the following design: the CMA would replace the event handler
> > >provided with the qp_init_attr struct with a callback of its own and
> > >keep the original handler/context on a private structure.
> >
> > This is probably fine. There is one further situation where the
> > connection needs to be established, beyond RTU and the communication
> > established async event. Namely, if a receive completion is polled.
> > Since async events are, well, asynchronous, there's no guarantee that
> > the communication established event will be reported any time soon...
>
> How about user taking this into account and not arming the CQ /
> not polling it until the established event?

If the ULP is properly designed, the asynchronous nature of the event (or of the RTU, for that matter) should not be an issue.

Per the IBTA CM state machine, the passive side, upon sending the REP, should move its endpoint state (the QP and the ULP's state machine) to Ready to Receive. QPs in RTR can have send WQEs posted to them, however they will not be sent until the QP is moved to RTS.

This means the ULP, while in RTR, can perform its normal receive completion handling and even build and post send requests in response to such received messages. Such sends will be queued until the QP later moves to RTS.

Most ULPs have some sort of application level flow control. This may be simply RNR NAK, or it could be a credit system (such as SRP) or an additional application initialization protocol (such as SDP). Hence the active side will generally perform limited sends (typically one) to the passive side until it gets a response from the passive side (which won't happen until the QP is in RTS).

Hence for a good ULP protocol, there is no risk of overflowing the send Q while waiting to move to RTS.

The only thing the passive side ULP should not do until it is in RTS is send any sort of "periodic status messages which don't require active side acknowledgement". Since the RTS state could be delayed, the ULP should not risk overflowing its send Q with such messages. Most of the standard ULP protocols (SDP, etc.) do not have such messages, or they require ULP level protocol negotiation before they are activated.

Hence if this is all properly handled, the passive side's RTU/Async Event handling sequence will merely move the QP to RTS and notify the ULP. The ULP will likely do very limited work for this notification (perhaps just a state transition), as all the real work should have been done before sending the REP. The movement to RTS will enable the QP to start processing its Send Q and everything will be good.

Taking this approach keeps the CM/CMA and ULP simpler in design and merely allows the RTS/RTU/Async Event handling to be another event in a state machine.

Todd Rimmer
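The queuing behavior Todd describes can be sketched as a toy model. This is only an illustration of the claim made in this message (sends posted in RTR wait until the QP reaches RTS); elsewhere in the thread C10-29 of the IBTA spec is cited as requiring an error for such posts, so real hardware and drivers may behave differently. `ToyQP`, `post_send`, and `move_to_rts` are hypothetical names, not verbs calls.

```python
from collections import deque
from enum import Enum


class QPState(Enum):
    RTR = "ready-to-receive"
    RTS = "ready-to-send"


class ToyQP:
    """Toy model of a queue pair; not a real verbs object."""

    def __init__(self, send_queue_depth):
        self.state = QPState.RTR
        self.send_queue_depth = send_queue_depth
        self.send_queue = deque()  # posted but not yet transmitted
        self.wire = []             # messages actually transmitted

    def post_send(self, msg):
        # In this model posting is allowed in RTR; the request simply waits.
        if len(self.send_queue) >= self.send_queue_depth:
            raise OverflowError("send queue full")
        self.send_queue.append(msg)
        self._process_send_queue()

    def move_to_rts(self):
        # Stands in for the RTU / communication established handling.
        self.state = QPState.RTS
        self._process_send_queue()

    def _process_send_queue(self):
        # Nothing leaves the queue until the QP is in RTS.
        if self.state is not QPState.RTS:
            return
        while self.send_queue:
            self.wire.append(self.send_queue.popleft())


qp = ToyQP(send_queue_depth=4)
qp.post_send("reply-to-first-recv")      # ULP answers a receive while in RTR
queued_before_rts = list(qp.send_queue)
sent_before_rts = list(qp.wire)
qp.move_to_rts()                         # the queued reply is sent now
```

The bounded `send_queue_depth` is why unacknowledged periodic messages are risky before RTS: in this model they would pile up until `post_send` raises.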
Re: [openib-general] design for communication established affiliated asynchronous event handling
> -----Original Message-----
> From: openib Sean Hefty
> Sent: Wednesday, June 28, 2006 7:24 PM
>
> Roland Dreier wrote:
> >>I suggest the following design: the CMA would replace the event handler
> >>provided with the qp_init_attr struct with a callback of its own and
> >>keep the original handler/context on a private structure.
>
> I should also point out that the proposed design will not work for
> userspace. I'm hesitant to make this change until a solution for
> userspace can also be found, in the hope that a common fix can be shared.
>
> - Sean

The approach we took in our proprietary stack was to provide a verbs driver interface for the CM to register itself with the verbs driver.

The CM would open the CA, provide its async event callback routine and perform a special register_cm() verbs call. Of course most CM traffic would occur on the GSI QP, so this open CA instance was only for this purpose. This special verb was only available in kernel space (avoiding the security issue of an application stealing the CM interface, and because our CM was in the kernel anyway).

When the CA got an Async Event for a Communication Established event, it would deliver it to both the CM (regardless of which QP it was for) and to the open instance owning the QP. All other async events were only delivered to the appropriate open instance.

This put the handling in the kernel and at a low level where it would not impact handling of other async events, and it avoided the complications of user vs kernel async event filters.

Depending on the design of APM, the CM might also be interested in APM related Async Events (in our design the application had an opportunity to select a new alternate path, so it was more appropriate to let the ULP handle these events directly).

Todd Rimmer
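The delivery rule described above can be sketched as follows. This is a toy model under assumptions, not the proprietary stack's code: only the name `register_cm` comes from the message, while `ToyCA`, `create_qp`, `raise_async_event`, and the event strings are hypothetical.

```python
COMM_EST = "COMM_EST"
PATH_MIG = "PATH_MIG"  # stands in for "any other async event"


class ToyCA:
    """Toy channel adapter that fans out affiliated async events."""

    def __init__(self):
        self.cm_callback = None
        self.qp_owner = {}  # qp_num -> callback of the open instance owning the QP

    def register_cm(self, callback):
        # Kernel-only in the message above; no such restriction is modeled here.
        self.cm_callback = callback

    def create_qp(self, qp_num, owner_callback):
        self.qp_owner[qp_num] = owner_callback

    def raise_async_event(self, event, qp_num):
        # Communication established goes to the CM regardless of the QP...
        if event == COMM_EST and self.cm_callback is not None:
            self.cm_callback(event, qp_num)
        # ...and every event goes to the open instance owning the QP.
        self.qp_owner[qp_num](event, qp_num)


cm_seen, owner_seen = [], []
ca = ToyCA()
ca.register_cm(lambda ev, qpn: cm_seen.append((ev, qpn)))
ca.create_qp(7, lambda ev, qpn: owner_seen.append((ev, qpn)))
ca.raise_async_event(COMM_EST, 7)
ca.raise_async_event(PATH_MIG, 7)
```

After these calls the CM has seen only the communication established event, while the QP owner has seen both events.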
Re: [openib-general] design for communication established affiliated asynchronous event handling
Rimmer, Todd wrote:
> The CM would open the CA, provide its async event callback routine and
> perform a special register_cm() verbs call. Of course most CM traffic
> would occur on the GSI QP, so this open CA instance was only for this
> purpose. This special verb was only available in kernel space (avoiding
> security issue of application stealing CM interface and because our CM
> was in the kernel anyway).

Thanks for the info. I'm considering this sort of approach.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
>How about user taking this into account and not arming the CQ /
>not polling it until the established event?

The CQ could be in use by other QPs.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: design for communication established affiliated asynchronous
> event handling
>
> >I suggest the following design: the CMA would replace the event handler
> >provided with the qp_init_attr struct with a callback of its own and
> >keep the original handler/context on a private structure.
>
> This is probably fine. There is one further situation where the
> connection needs to be established, beyond RTU and the communication
> established async event. Namely, if a receive completion is polled.
> Since async events are, well, asynchronous, there's no guarantee that
> the communication established event will be reported any time soon...

How about user taking this into account and not arming the CQ / not polling it until the established event?

--
MST
Re: [openib-general] design for communication established affiliated asynchronous event handling
Roland Dreier wrote:
>>I suggest the following design: the CMA would replace the event handler
>>provided with the qp_init_attr struct with a callback of its own and
>>keep the original handler/context on a private structure.
>
> This is probably fine. There is one further situation where the
> connection needs to be established, beyond RTU and the communication
> established async event. Namely, if a receive completion is polled.
> Since async events are, well, asynchronous, there's no guarantee that
> the communication established event will be reported any time soon...

This brings up a good point. Even if a user gets a communication established event, the IB CM could have already timed out and failed the connection. I don't think that we can do anything about this.

I should also point out that the proposed design will not work for userspace. I'm hesitant to make this change until a solution for userspace can also be found, in the hope that a common fix can be shared.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
Rimmer, Todd wrote:
> CM - have a hook so the CM can get the Async Events for all CAs. On
> getting the Async Event for packet first packet received while in RTR
> (Communication established), the CM should treat this exactly like an
> RTU (with no private data). The CM will need to cross reference the
> CA/QP this event was reported for to identify the applicable connection
> endpoint. If you check the IBTA spec and the CM state machines you will
> see the CM is supposed to handle this event. Also if the RTU does
> arrive later, the CM state machine also handles that correctly by
> discarding the RTU as if it was a duplicate. Note: this is why
> applications should not depend on private data in the RTU.

The IB CM has this capability, and behaves as indicated. The missing piece is for the RDMA CM to handle this situation. I believe that Or's approach of replacing the user's QP handler with the CMA's will fix this.

> has completed its processing. In general IB allows for this situation
> quite nicely. The ULP can process the inbound data normally and queue
> it to the Send Q. Putting data on a Send Q is permitted in RTR, but the

This is a good point, which indicates to me that nothing more is needed than handling the communication established event by the RDMA CM.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
> -----Original Message-----
> From: Or Gerlitz; openib-general
> > In most cases, I would expect that the IB CM will eventually receive the
> > RTU, which will generate an event to the RDMA CM to transition the QP
> > into RTS.
>
> But we want an IB stack and set of ULPs which would work in production so
> they need to handle also irregular cases... eg when the RTU is lost over
> and over.

Agreed. The missing RTU case must be handled for a few reasons:

1. The RTU could honestly be lost (GSI QPs are UD, they could overflow, the fabric could lose the packet, etc.)
2. The RC send could beat the processing of the RTU (packets on the wire may be out of order if there are different SLs/VLs involved with GSI vs application QP). Also it's possible the CM is slower getting to its queue of packets (such as when bombarded by many connections) while the application/ULP gets its RC send quickly. [I have observed this situation in various real world stress tests.]

This problem is quite simple to handle (I did it a few years ago in the SilverStorm stack) and the IB spec completely covers this issue:

CM - have a hook so the CM can get the Async Events for all CAs. On getting the Async Event for the first packet received while in RTR (Communication established), the CM should treat this exactly like an RTU (with no private data). The CM will need to cross reference the CA/QP this event was reported for to identify the applicable connection endpoint. If you check the IBTA spec and the CM state machines you will see the CM is supposed to handle this event. Also, if the RTU does arrive later, the CM state machine handles that correctly by discarding the RTU as if it were a duplicate. Note: this is why applications should not depend on private data in the RTU.

ULPs - all ULPs should be written so they are fully ready to process inbound data before they tell the CM to send the REP. It is very likely the ULP will get a CQ completion for the inbound RQ data before the CM has completed its processing.

In general IB allows for this situation quite nicely. The ULP can process the inbound data normally and queue it to the Send Q. Putting data on a Send Q is permitted in RTR, but the QP will not initiate sending until moved to RTS. As such the ULP can let the CM RTU processing (which will race with the RQ data completion) do its normal thing and move the QP to RTS.

Todd Rimmer
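The CM-side handling Todd outlines (treat the communication established event as an RTU with no private data, look up the endpoint by CA/QP, and discard a later RTU as a duplicate) can be sketched as a toy state machine. All names here (`ToyCM`, `send_rep`, `on_async_comm_est`, `on_rtu`, the state strings) are hypothetical and only model the behavior described in the message; they are not the IB CM API.

```python
class ToyEndpoint:
    """Toy passive-side connection endpoint."""

    def __init__(self):
        self.state = "REP_SENT"
        self.ulp_notifications = []  # private data handed to the ULP

    def establish(self, private_data):
        if self.state == "ESTABLISHED":
            return False             # duplicate: silently discarded
        self.state = "ESTABLISHED"
        self.ulp_notifications.append(private_data)
        return True


class ToyCM:
    def __init__(self):
        self.endpoints = {}          # (ca_name, qp_num) -> ToyEndpoint

    def send_rep(self, ca_name, qp_num):
        endpoint = ToyEndpoint()
        self.endpoints[(ca_name, qp_num)] = endpoint
        return endpoint

    def on_async_comm_est(self, ca_name, qp_num):
        # Cross reference the CA/QP, then act as if an RTU had arrived
        # carrying no private data.
        return self.endpoints[(ca_name, qp_num)].establish(None)

    def on_rtu(self, ca_name, qp_num, private_data):
        return self.endpoints[(ca_name, qp_num)].establish(private_data)


cm = ToyCM()
ep = cm.send_rep("ca0", 42)
first = cm.on_async_comm_est("ca0", 42)               # event wins the race
second = cm.on_rtu("ca0", 42, private_data=b"late")   # RTU arrives afterwards
```

In this run the ULP is notified once, with no private data, which illustrates why the message advises applications not to depend on private data in the RTU.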
Re: [openib-general] design for communication established affiliated asynchronous event handling
Hal Rosenstock wrote:
>>This moves the QP state to RTS, as opposed to the CEP state to connected.
>>So I don't believe that it violates the spec.
>
> Isn't the CEP the QP (see p. 689 line 7) ?

Hmm... I was viewing the CEP as moving through the states described in 12.9.5 and 12.9.6. (Idle, REQ sent, REP wait, etc.) I see what you're saying now.

> It sounds like I may have been looking at the wrong state but
> nonetheless the CEP/QP states are defined there and this would be
> different from what is in the spec. I wasn't saying it couldn't be made
> to work though. I haven't looked at it enough to know. If it does work,
> maybe the spec should get updated to cover this option too.

What I'd like to find is a way that a user, upon receiving a message, can send a response. Today, a user cannot send the response until after they get a connection established event from the IB CM, and then RDMA CM. So, it sounds like even the RDMA CM needs some sort of rdma_establish() call to finish connecting a QP. I don't think that iWarp would run into this issue.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
On Fri, 2006-06-16 at 12:31, Sean Hefty wrote:
> Hal Rosenstock wrote:
> > IMO, it would violate the CM state machine and the passive CM transition
> > specification in 12.9.7.2 and have the effect of circumventing the
> > retransmission of REP on lost RTU. Data can't fly until either the RTU
> > or the first data message is received from the other direction.
>
> This moves the QP state to RTS, as opposed to the CEP state to connected.
> So I don't believe that it violates the spec.

Isn't the CEP the QP (see p. 689 line 7) ?

> A drawback to moving the QP to RTS is that the communication established
> event will not be generated. This forces us to wait for the RTU to move
> the CEP to connected, or we need to do it upon receiving the first
> completion. The RDMA CM has no knowledge when the latter occurs, so would
> need user input.

It sounds like I may have been looking at the wrong state but nonetheless the CEP/QP states are defined there and this would be different from what is in the spec. I wasn't saying it couldn't be made to work though. I haven't looked at it enough to know. If it does work, maybe the spec should get updated to cover this option too.

-- Hal

> - Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
>I suggest the following design: the CMA would replace the event handler
>provided with the qp_init_attr struct with a callback of its own and
>keep the original handler/context on a private structure.

This is probably fine. There is one further situation where the connection needs to be established, beyond RTU and the communication established async event. Namely, if a receive completion is polled. Since async events are, well, asynchronous, there's no guarantee that the communication established event will be reported any time soon...
Re: [openib-general] design for communication established affiliated asynchronous event handling
Hal Rosenstock wrote:
> IMO, it would violate the CM state machine and the passive CM transition
> specification in 12.9.7.2 and have the effect of circumventing the
> retransmission of REP on lost RTU. Data can't fly until either the RTU
> or the first data message is received from the other direction.

This moves the QP state to RTS, as opposed to the CEP state to connected. So I don't believe that it violates the spec.

A drawback to moving the QP to RTS is that the communication established event will not be generated. This forces us to wait for the RTU to move the CEP to connected, or we need to do it upon receiving the first completion. The RDMA CM has no knowledge when the latter occurs, so would need user input.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
Or Gerlitz wrote:
> This is what i was suspecting, Sean can you confirm that? if it does
> not emulate RTU reception, than what it does do?

Both receiving an RTU and getting a connection established event move the connection into the established state. They generate different events to the user of the IB CM because RTUs carry private data.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
On Fri, 2006-06-16 at 11:15, James Lentini wrote:

[snip...]

> > As an alternative, I don't think that there's any reason why the QP
> > can't be transition to RTS when the CM REP is sent.
>
> I like this idea. It simplifies how ULPs handle this issue. Are there
> any spec. compliance issues with this?

IMO, it would violate the CM state machine and the passive CM transition specification in 12.9.7.2 and have the effect of circumventing the retransmission of REP on lost RTU. Data can't fly until either the RTU or the first data message is received from the other direction.

-- Hal
Re: [openib-general] design for communication established affiliated asynchronous event handling
James Lentini wrote:
>>As an alternative, I don't think that there's any reason why the QP
>>can't be transition to RTS when the CM REP is sent.
>
> I like this idea. It simplifies how ULPs handle this issue. Are there
> any spec. compliance issues with this?

There are no spec compliance issues that I can readily find. I will make a note to fix this, as well as handle the connection established event as Or suggested, but it will be a couple of weeks before I get to this. (I will be attending the workshop next week.)

> If the passive side CM doesn't receive an RTU, the passive side CM
> should retransmit the REP. At least that is how I read 12.9.8.6
> "Timeouts and Retries" in the IBTA spec. I can't find where this
> happens in the code. Did I miss it?

The MAD layer retries the CM messages, typically until the CM cancels the operation.

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
On Fri, 16 Jun 2006, Or Gerlitz wrote:
> On 6/15/06, James Lentini <[EMAIL PROTECTED]> wrote:
> > ib_cm_establish() doesn't emulate an RTU reception. It generates an
> > IB_CM_USER_ESTABLISHED event (not an IB_CM_RTU_RECEIVED event). The
> > CMA's cma_ib_handler() doesn't recognize a IB_CM_USER_ESTABLISHED
> > event. The QP's state will not be moved to RTS.
>
> This is what i was suspecting, Sean can you confirm that? if it does
> not emulate RTU reception, than what it does do?
>
> > Consumers don't actually have to queue the completions, they have to
> > defer posting sends (either in response to the recvs or otherwise)
> > until the QP moves to RTS. Could the implementations queue up the
> > requests for the consumers?
>
> nope the CM/CMA are not in charge of the consumer CQ, so there is no
> way for them to queue those completions and anyway, i think its

I was referring to requests, not completions. In any event, I prefer Sean's idea of moving the QP to RTS when a REP is sent.

> wrong for lower layer to queue completions, this "race" exists by
> IB's nature (since the RTU goes to QP1 and the data to the user's QP
> and the two QPs are totally unrelated) so if you want to have
> production with IB you need to handle this case in your code, as
> others do.

Agreed.

> > Strictly speaking, IB requires an error to be generated (C10-29 in
> > the IBTA spec. vol 1, page 456). Still, it would be nice if
> > consumers didn't have to be worry about this issue.
>
> What do you mean by error, this async event happens all the time,
> you can't error the establishment just b/c it happend. I don't have
> access now to the spec, so i can't say what i understand from the
> section you have pointed to.

Again, I was referring to requests, not completions.
Re: [openib-general] design for communication established affiliated asynchronous event handling
On Thu, 15 Jun 2006, Sean Hefty wrote:
> >The cma/verbs consumer can't just ignore the event since its qp state is
> >still RTR which means an attempt to tx replying the rx would fail.
>
> In most cases, I would expect that the IB CM will eventually receive the
> RTU, which will generate an event to the RDMA CM to transition the QP
> into RTS. This is why I think that the event can safely be ignored. It
> does however mean that a user cannot send on the QP until the user sees
> RDMA_CM_EVENT_ESTABLISHED.
>
> >I suggest the following design: the CMA would replace the event handler
> >provided with the qp_init_attr struct with a callback of its own and
> >keep the original handler/context on a private structure.
>
> This sounds like it would work. I don't think that there are any events
> where the additional delay would matter.
>
> As an alternative, I don't think that there's any reason why the QP
> can't be transition to RTS when the CM REP is sent.

I like this idea. It simplifies how ULPs handle this issue. Are there any spec. compliance issues with this?

> A user just can't post to the send queue until either an
> RDMA_CM_EVENT_ESTABLISHED, IB_EVENT_COMM_EST, or a completion occurs
> on the QP. (This doesn't change the fact that the IB CM still needs
> to know that the connection has been established, or it risks
> putting the connection into an error state if an RTU is never
> received.)

If the passive side CM doesn't receive an RTU, the passive side CM should retransmit the REP. At least that is how I read 12.9.8.6 "Timeouts and Retries" in the IBTA spec. I can't find where this happens in the code. Did I miss it?
Re: [openib-general] design for communication established affiliated asynchronous event handling
On 6/16/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
>>The cma/verbs consumer can't just ignore the event since its qp state is
>>still RTR which means an attempt to tx replying the rx would fail.
>
> In most cases, I would expect that the IB CM will eventually receive the
> RTU, which will generate an event to the RDMA CM to transition the QP
> into RTS.

But we want an IB stack and a set of ULPs which would work in production, so they also need to handle irregular cases, e.g. when the RTU is lost over and over.

Or
Re: [openib-general] design for communication established affiliated asynchronous event handling
On 6/15/06, James Lentini <[EMAIL PROTECTED]> wrote:
> ib_cm_establish() doesn't emulate an RTU reception. It generates an
> IB_CM_USER_ESTABLISHED event (not an IB_CM_RTU_RECEIVED event). The
> CMA's cma_ib_handler() doesn't recognize a IB_CM_USER_ESTABLISHED
> event. The QP's state will not be moved to RTS.

This is what I was suspecting. Sean, can you confirm that? If it does not emulate RTU reception, then what does it do?

> Consumers don't actually have to queue the completions, they have to
> defer posting sends (either in response to the recvs or otherwise)
> until the QP moves to RTS. Could the implementations queue up the
> requests for the consumers?

Nope, the CM/CMA are not in charge of the consumer CQ, so there is no way for them to queue those completions. Anyway, I think it's wrong for a lower layer to queue completions. This "race" exists by IB's nature (since the RTU goes to QP1 and the data to the user's QP, and the two QPs are totally unrelated), so if you want to run in production with IB you need to handle this case in your code, as others do.

> Strictly speaking, IB requires an error to be generated (C10-29 in the
> IBTA spec. vol 1, page 456). Still, it would be nice if consumers
> didn't have to be worry about this issue.

What do you mean by error? This async event happens all the time; you can't fail the establishment just because it happened. I don't have access to the spec right now, so I can't say what I understand from the section you have pointed to.

Or.
Re: [openib-general] design for communication established affiliated asynchronous event handling
>The cma/verbs consumer can't just ignore the event since its qp state is
>still RTR which means an attempt to tx replying the rx would fail.

In most cases, I would expect that the IB CM will eventually receive the RTU, which will generate an event to the RDMA CM to transition the QP into RTS. This is why I think that the event can safely be ignored. It does however mean that a user cannot send on the QP until the user sees RDMA_CM_EVENT_ESTABLISHED.

>I suggest the following design: the CMA would replace the event handler
>provided with the qp_init_attr struct with a callback of its own and
>keep the original handler/context on a private structure.

This sounds like it would work. I don't think that there are any events where the additional delay would matter.

As an alternative, I don't think that there's any reason why the QP can't be transitioned to RTS when the CM REP is sent. A user just can't post to the send queue until either an RDMA_CM_EVENT_ESTABLISHED, IB_EVENT_COMM_EST, or a completion occurs on the QP. (This doesn't change the fact that the IB CM still needs to know that the connection has been established, or it risks putting the connection into an error state if an RTU is never received.)

- Sean
Re: [openib-general] design for communication established affiliated asynchronous event handling
On Thu, 15 Jun 2006, Or Gerlitz wrote:

> Sean Hefty wrote:
> > James Lentini wrote:
> >> The IBTA spec (volume 1, version 1.2) describes a communication
> >> established affiliated asynchronous event.
> >> We've seen this event delivered to our NFS-RDMA server and aren't sure
> >> what to do with it.
> >
> > This event is delivered to the verbs consumer, since it occurs on
> > the QP. It's expected that the consumer will call
> > ib_cm_establish. Although, I would guess that you can probably
> > ignore the event, under the assumption that the RTU will
> > eventually be received by the local CM.
>
> Sean,
>
> The cma/verbs consumer can't just ignore the event since its qp
> state is still RTR which means an attempt to tx replying the rx
> would fail.

Good point.

> On the other hand it can't call ib_cm_establish since the CMA does
> not expose an API for that,

This is a problem.

> nor the CM can register a cb to get this event and emulate an RTU
> reception since the CMA is the one to create the QP and the CMA
> consumer providing the qp_init_attr along with event handler...
>
> I suggest the following design: the CMA would replace the event
> handler provided with the qp_init_attr struct with a callback of its
> own and keep the original handler/context on a private structure.
>
> On the delivery of IB_EVENT_COMM_EST event, the CMA would call down
> the CM to emulate RTU reception (ib_cm_establish) and then call up

ib_cm_establish() doesn't emulate an RTU reception. It generates an IB_CM_USER_ESTABLISHED event (not an IB_CM_RTU_RECEIVED event). The CMA's cma_ib_handler() doesn't recognize an IB_CM_USER_ESTABLISHED event, so the QP's state will not be moved to RTS.

> the consumer original handler, typical CMA consumers would just
> ignore this event, i think.
>
> The CM should be able to allow ib_cm_established to be called in the
> context over which the event handler is called (or jump the
> treatment to higher context). The CM must also ignore the actual RTU
> if it arrives later/in parallel to when ib_cm_establish was called.
>
> By this design the verbs consumer is guaranteed to always get
> RDMA_CM_EVENT_ESTABLISHED no matter if the RTU is just late or never
> arrives

The CMA's cma_ib_handler() needs to be modified for this to be true.

> but it still can get a CQ RX completion(s) before getting the CMA
> established event; in that case it can queue these completion
> elements for the short time window before the established event
> arrives and then process them.

Consumers don't actually have to queue the completions; they have to defer posting sends (either in response to the recvs or otherwise) until the QP moves to RTS. Could the implementations queue up the requests for the consumers? Strictly speaking, IB requires an error to be generated (C10-29 in the IBTA spec, vol. 1, page 456). Still, it would be nice if consumers didn't have to worry about this issue.

> A design similar to that was implemented at the Voltaire gen1 stack
> and it works in production with iSER target and VIBNAL (CFS Lustre
> NAL for voltaire gen1 ib) server side.
>
> Does anyone know on what context (hard_irq, soft_irq, thread) are
> the event handlers being called?
>
> Or.
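For illustration only, deferring sends until the QP reaches RTS might look roughly like the userspace sketch below. Everything in it is hypothetical: `struct conn`, `post_send()`, `conn_send()` and `conn_established()` are mock names invented for this example, not verbs or CMA calls, and the "QP" is just an array that records what was posted. It assumes the consumer (or a layer acting for it) parks sends while the QP is still in RTR and flushes them in order once the established event arrives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real verbs or CMA types. */
enum qp_state { QP_RTR, QP_RTS };

#define MAX_SENDS 16

struct conn {
    enum qp_state state;
    int deferred[MAX_SENDS];  /* ids of sends parked while in RTR */
    size_t n_deferred;
    int posted[MAX_SENDS];    /* ids handed to the mock QP, in order */
    size_t n_posted;
};

/* Mock of posting a send to the QP; refuses unless the QP is in RTS. */
static int post_send(struct conn *c, int id)
{
    assert(c != NULL);
    if (c->state != QP_RTS || c->n_posted == MAX_SENDS)
        return -1;
    c->posted[c->n_posted++] = id;
    return 0;
}

/* Post immediately in RTS, otherwise park the send for later. */
static int conn_send(struct conn *c, int id)
{
    assert(c != NULL);
    if (c->state == QP_RTS)
        return post_send(c, id);
    if (c->n_deferred == MAX_SENDS)
        return -1;
    c->deferred[c->n_deferred++] = id;
    return 0;
}

/* Called on the established event: move to RTS, flush parked sends. */
static int conn_established(struct conn *c)
{
    int ret = 0;

    assert(c != NULL);
    c->state = QP_RTS;
    for (size_t i = 0; i < c->n_deferred; i++)
        if (post_send(c, c->deferred[i]) != 0)
            ret = -1;
    c->n_deferred = 0;
    return ret;
}
```

Receive completions can still be processed as they arrive; only the sends they trigger go through `conn_send()`. A real implementation would also need locking between the completion path and the event path, which this sketch leaves out.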
Re: [openib-general] design for communication established affiliated asynchronous event handling
Or Gerlitz wrote:

> I suggest the following design: the CMA would replace the event handler
> provided with the qp_init_attr struct with a callback of its own and
> keep the original handler/context on a private structure.
>
> On the delivery of IB_EVENT_COMM_EST event, the CMA would call down the
> CM to emulate RTU reception (ib_cm_establish) and then call up the
> consumer original handler, typical CMA consumers would just ignore this
> event, i think.

...and on other QP-affiliated events the CMA would just call up the consumer callback. This proxying of QP events can also help us add support for path migration in the CMA down the road.

Or.
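A minimal userspace sketch of the proxying idea, in C, might look like the following. It is not CMA code: `struct cma_id_priv`, `struct qp_init_attr`, `cma_wrap_handler()`, `cma_qp_event_handler()`, `mock_cm_establish()` and the `EVENT_*` codes are all hypothetical names made up for this example, only loosely modeled on the names used in the thread, and the call down to the CM is replaced by a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes; they only mirror the names in the thread. */
enum qp_event { EVENT_COMM_EST, EVENT_PATH_MIG, EVENT_QP_FATAL };

typedef void (*qp_event_handler)(enum qp_event ev, void *context);

/* Mock of the QP creation attributes supplied by the consumer. */
struct qp_init_attr {
    qp_event_handler event_handler;
    void *qp_context;
};

/* Private state: the consumer's original handler and context. */
struct cma_id_priv {
    qp_event_handler consumer_handler;
    void *consumer_context;
    int establish_calls;  /* how often the mock CM was called */
};

/* Mock stand-in for calling down to the CM. */
static void mock_cm_establish(struct cma_id_priv *priv)
{
    priv->establish_calls++;
}

/* Handler installed on the QP in place of the consumer's handler. */
static void cma_qp_event_handler(enum qp_event ev, void *context)
{
    struct cma_id_priv *priv = context;

    assert(priv != NULL);
    if (ev == EVENT_COMM_EST)
        mock_cm_establish(priv);

    /* Every event, COMM_EST included, is passed up to the consumer. */
    if (priv->consumer_handler)
        priv->consumer_handler(ev, priv->consumer_context);
}

/* Save the consumer's handler/context and substitute the proxy. */
static void cma_wrap_handler(struct cma_id_priv *priv,
                             struct qp_init_attr *attr)
{
    priv->consumer_handler = attr->event_handler;
    priv->consumer_context = attr->qp_context;
    priv->establish_calls = 0;
    attr->event_handler = cma_qp_event_handler;
    attr->qp_context = priv;
}

/* Example consumer handler: counts the events it sees. */
static void count_events(enum qp_event ev, void *context)
{
    (void)ev;
    ++*(int *)context;
}
```

A consumer that fills `attr` with `count_events` and a pointer to an `int`, then passes it through `cma_wrap_handler()`, still sees every event, while `EVENT_COMM_EST` additionally triggers the call down to the (mock) CM. Whether that call is safe in the context where the handler runs is the open question raised in the thread; the sketch does not address it.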
Re: [openib-general] design for communication established affiliated asynchronous event handling
Sean Hefty wrote:

> James Lentini wrote:
>> The IBTA spec (volume 1, version 1.2) describes a communication
>> established affiliated asynchronous event.
>> We've seen this event delivered to our NFS-RDMA server and aren't sure
>> what to do with it.
>
> This event is delivered to the verbs consumer, since it occurs on the QP.
> It's expected that the consumer will call ib_cm_establish. Although, I
> would guess that you can probably ignore the event, under the assumption
> that the RTU will eventually be received by the local CM.

Sean,

The cma/verbs consumer can't just ignore the event, since its QP state is still RTR, which means an attempt to transmit a reply to the received message would fail. On the other hand, it can't call ib_cm_establish, since the CMA does not expose an API for that. Nor can the CM register a callback to get this event and emulate an RTU reception, since the CMA is the one that creates the QP, and the CMA consumer provides the qp_init_attr along with the event handler...

I suggest the following design: the CMA would replace the event handler provided with the qp_init_attr struct with a callback of its own and keep the original handler/context in a private structure.

On delivery of the IB_EVENT_COMM_EST event, the CMA would call down to the CM to emulate RTU reception (ib_cm_establish) and then call up the consumer's original handler. Typical CMA consumers would just ignore this event, I think.

The CM should allow ib_cm_establish to be called in the context in which the event handler is called (or defer the treatment to a higher context). The CM must also ignore the actual RTU if it arrives later than, or in parallel with, the call to ib_cm_establish.

With this design the verbs consumer is guaranteed to always get RDMA_CM_EVENT_ESTABLISHED, whether the RTU is just late or never arrives. It can still get CQ RX completion(s) before getting the CMA established event; in that case it can queue these completion elements for the short time window before the established event arrives and then process them.

A similar design was implemented in the Voltaire gen1 stack, and it works in production with the iSER target and the VIBNAL (CFS Lustre NAL for Voltaire gen1 IB) server side.

Does anyone know in what context (hard_irq, soft_irq, thread) the event handlers are called?

Or.
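The requirement that the CM ignore the real RTU once the connection has already been established locally can be pictured as a tiny state machine. The C sketch below is an assumption-laden illustration, not CM code: `struct cm_conn`, `cm_handle()` and the `CM_*` names are hypothetical, and it treats a local establish request and a received RTU as equivalent inputs, which is the behaviour proposed in this message rather than anything the existing CM or CMA is known to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical passive-side states and inputs for this sketch. */
enum cm_state { CM_REP_SENT, CM_ESTABLISHED };
enum cm_input { CM_IN_RTU, CM_IN_USER_ESTABLISH };

struct cm_conn {
    enum cm_state state;
    int established_events;  /* established events delivered upward */
};

/*
 * Whichever input arrives first while in REP_SENT establishes the
 * connection and produces exactly one established event. Any later
 * input is ignored. Returns 1 if this input caused the transition,
 * 0 if it was ignored.
 */
static int cm_handle(struct cm_conn *c, enum cm_input in)
{
    assert(c != NULL);
    (void)in;  /* both inputs are treated the same way here */

    if (c->state != CM_REP_SENT)
        return 0;

    c->state = CM_ESTABLISHED;
    c->established_events++;
    return 1;
}
```

The "in parallel" case is where this gets harder in practice: the check and the state change would have to be serialized (for example under a per-connection lock), which this single-threaded sketch does not show.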