Sean Hefty wrote:
Or Gerlitz wrote:
Conceptually, do we agree that it would be better not to expose IB
reject codes to the CMA consumers? That would be in the spirit of the CMA
being a framework for doing connection management in an RDMA
transport-independent fashion, etc.
My concern is that I do not want to mask the
Sean Hefty wrote:
Or Gerlitz wrote:
Is it correct that with the gen2 code, the remote **CM** will
reconnect in that case?
I don't think so. The QP needs to move into timewait, so a new
connection request is needed with a different QPN.
Just to make sure, you replaced CM id with QP and
Or Gerlitz wrote:
0) ones that are of no interest to the CMA or to the ULP above it, but
only to the local CM (are there any?)
1) ones that *must* be handled internally by the CMA (are there any?)
2) ones that *can* be handled internally by the CMA (e.g., stale-conn)
3) ones that
Sean Hefty wrote:
I agree. This sounds like an issue where the CM is treating the REQ as
an old REQ for the established connection, versus a REQ for a new
connection.
The desired behavior in this situation would be to reject the new
request, and force the remote side to disconnect.
Sean,
Or Gerlitz wrote:
Is it correct that with the gen2 code, the remote **CM** will reconnect
in that case?
I don't think so. The QP needs to move into timewait, so a new connection
request is needed with a different QPN.
I see in cm.c :: cm_rej_handler() that when the state is IB_CM_REQ_SENT
Eric Barton wrote:
I've had a report of rdma_connect() failing with a callback event type of
RDMA_CM_EVENT_UNREACHABLE and status -ETIMEDOUT, although the peer node was
up and running at the time.
It seems this can be reproduced as follows...
1. Establish a connection between nodes A and B
2. Reboot node A
3.
Or Gerlitz wrote:
My guess is that this is related to the CM, not the SM.
I think there is a chance that the CM on node B does not treat the REQ
sent by A after the reboot as a stale-connection situation and hence
just **silently** drops it, that is, no REJ is sent.
I agree. This sounds like an