We are running into connection reject issues (IB_CM_REJ_STALE_CONN) with our application under heavy load and lots of connections.
We occasionally get a reject because the QP is still in the timewait state left over from a prior connection. It appears that the CM tracks QPs in timewait state on both sides of the connection, independently of the verbs layer, even after the QP has been destroyed at the verbs level. I can create a new QP via verbs and it can still be on the CM timewait queue, waiting for the timer to pop and remove it. If this is the case, my attempts to connect using this QP will fail with a reject.

How can a consumer know for sure that the new QP will not be in a timewait state according to the CM? Does it make sense to push the timewait functionality down into verbs? If not, is there a way for the CM to hold a reference to the QP until the timewait expires?

-arlin
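To make the scenario concrete, here is a toy model of the behavior described above. It is only a sketch and not the actual CM code: the names `tw_table`, `tw_add`, `tw_reap`, and `tw_would_reject` are hypothetical, time is an abstract tick counter, and it assumes the timewait entries are keyed by QP number, so a freshly created QP that is handed a recycled number collides with the old entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TW_SLOTS 16  /* arbitrary size for this toy table */

/* One timewait record, kept independently of the QP object itself. */
struct tw_entry {
    uint32_t qpn;         /* QP number that entered timewait */
    uint64_t expires_at;  /* tick at which the entry may be dropped */
    bool     in_use;
};

struct tw_table {
    struct tw_entry slot[TW_SLOTS];
};

/* Record that `qpn` entered timewait at tick `now` for `duration` ticks.
 * Returns false if the table is full. */
static bool tw_add(struct tw_table *t, uint32_t qpn,
                   uint64_t now, uint64_t duration)
{
    assert(t != NULL);
    for (size_t i = 0; i < TW_SLOTS; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].qpn = qpn;
            t->slot[i].expires_at = now + duration;
            t->slot[i].in_use = true;
            return true;
        }
    }
    return false;
}

/* Drop every entry whose timer has expired (the "timer pop"). */
static void tw_reap(struct tw_table *t, uint64_t now)
{
    for (size_t i = 0; i < TW_SLOTS; i++) {
        if (t->slot[i].in_use && t->slot[i].expires_at <= now)
            t->slot[i].in_use = false;
    }
}

/* True if a connect attempt using `qpn` at tick `now` would hit a
 * leftover timewait entry, i.e. the stale-connection reject case. */
static bool tw_would_reject(struct tw_table *t, uint32_t qpn, uint64_t now)
{
    tw_reap(t, now);
    for (size_t i = 0; i < TW_SLOTS; i++) {
        if (t->slot[i].in_use && t->slot[i].qpn == qpn)
            return true;
    }
    return false;
}
```

In this model, destroying the QP at the verbs level does nothing to the table, which is the gap the questions above are about: a connect with a recycled QP number is rejected until the entry's timer expires, and the consumer has no way to query the table beforehand.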
