RE: [openib-general] [PATCH] Initial CM implementation

Sean Hefty Mon, 17 Jan 2005 14:49:19 -0800

>Just a quick read through, more comments later:
>
>       int     local_cm_response_timeout:5;
>       int     flow_control:1;
>
>These should be "unsigned," not "int."  A signed 1-bit int doesn't
>make much sense, and I think you'll probably run into sign trouble if
>someone passes a local_cm_response_timeout of 20 or something for the
>5-bit field.


I'll change these to unsigned.
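
For reference, a small sketch of the signedness problem.  The struct
and helper names are made up for this example; only the two field
names come from the patch.  Storing an out-of-range value in a signed
bit-field is implementation-defined in C, so the results noted below
are what GCC and Clang produce, not something the standard promises.

```c
/* Hypothetical structs; only the field names come from the patch. */
struct cm_param_signed {
	signed int local_cm_response_timeout:5;	/* holds -16..15 */
	signed int flow_control:1;		/* holds -1..0 */
};

struct cm_param_unsigned {
	unsigned int local_cm_response_timeout:5;	/* holds 0..31 */
	unsigned int flow_control:1;			/* holds 0..1 */
};

/* Store v in the signed 5-bit field and read it back. */
int signed_timeout(int v)
{
	struct cm_param_signed p = { 0 };

	p.local_cm_response_timeout = v;
	return p.local_cm_response_timeout;
}

/* Store v in the unsigned 5-bit field and read it back. */
int unsigned_timeout(int v)
{
	struct cm_param_unsigned p = { 0 };

	p.local_cm_response_timeout = v;
	return p.local_cm_response_timeout;
}

/* Store v in the signed 1-bit field and read it back. */
int signed_flow_control(int v)
{
	struct cm_param_signed p = { 0 };

	p.flow_control = v;
	return p.flow_control;
}

/* Store v in the unsigned 1-bit field and read it back. */
int unsigned_flow_control(int v)
{
	struct cm_param_unsigned p = { 0 };

	p.flow_control = v;
	return p.flow_control;
}
```

With GCC or Clang, signed_timeout(20) reads back as -12 and
signed_flow_control(1) reads back as -1, while the unsigned versions
return 20 and 1.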

>In cm_send_handler(), you have:
>
>       case IB_WC_RESP_TIMEOUT_ERR:
>               cm_process_send_timeout(msg);
>               break;
>
>but can this ever happen?  I thought that the MAD layer always treated
>CM MADs as unsolicited which means that responses are not matched with
>requests so timeouts don't happen.  Or am I misunderstanding the MAD
>layer semantics?

My intent is to use the MAD layer's timeout and retry code.  For
example, the CM sends a REQ and the MAD layer retries it on the CM's
behalf.  Once a REP is received, the CM cancels the REQ.  So the CM
does the response matching, but the timeouts and retries are handled
by the MAD code.
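
The division of labor described above can be sketched with a toy
model.  This is not the MAD layer API; every name below (toy_send,
toy_post_send, toy_timeout_tick, toy_cancel) is hypothetical and only
illustrates who does what: the lower layer counts retries, and the CM
cancels the send once it has matched a REP.

```c
#include <stdbool.h>

/* Toy model, not the MAD layer API: all names here are hypothetical. */
struct toy_send {
	int retries_left;	/* resends the lower layer may still do */
	bool active;		/* false once canceled or timed out */
	bool timed_out;		/* set when the retries are exhausted */
	int sends;		/* how many times the REQ was sent */
};

/* CM: post a REQ and leave retry handling to the lower layer. */
void toy_post_send(struct toy_send *s, int retries)
{
	s->retries_left = retries;
	s->active = true;
	s->timed_out = false;
	s->sends = 1;
}

/* Lower layer: called each time the response timeout expires. */
void toy_timeout_tick(struct toy_send *s)
{
	if (!s->active)
		return;

	if (s->retries_left > 0) {
		s->retries_left--;
		s->sends++;		/* resend the REQ */
	} else {
		s->active = false;
		s->timed_out = true;	/* reported to the CM as a timeout */
	}
}

/* CM: a REP was matched to this REQ, so stop further retries. */
void toy_cancel(struct toy_send *s)
{
	s->active = false;
}
```

In this model a send that is canceled after the first retry never
reports a timeout, while one that is left alone reports a timeout
after its last retry expires.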

There's a potential race between receiving a response and receiving a
timeout.  I have a way to handle it, but I need to go back and see if
I set the right fields to do so.
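
One common way to cope with this kind of race, not necessarily the
approach intended here, is to let whichever event arrives first change
the connection state and to ignore the later one as stale.  The types
and function names below are hypothetical, and real code would make
the state check and update under a lock.

```c
/* Hypothetical connection states for this sketch only. */
enum toy_conn_state {
	TOY_REQ_SENT,
	TOY_REP_RCVD,
	TOY_TIMED_OUT
};

struct toy_conn {
	enum toy_conn_state state;
};

/*
 * A REP arrived.  Returns 1 if it was accepted, 0 if it was ignored
 * because the REQ had already timed out (or a REP was already seen).
 */
int toy_handle_rep(struct toy_conn *c)
{
	if (c->state != TOY_REQ_SENT)
		return 0;
	c->state = TOY_REP_RCVD;
	return 1;
}

/*
 * The send timed out.  Returns 1 if the timeout was acted on, 0 if it
 * was ignored because a REP had already been processed.
 */
int toy_handle_send_timeout(struct toy_conn *c)
{
	if (c->state != TOY_REQ_SENT)
		return 0;
	c->state = TOY_TIMED_OUT;
	return 1;
}
```

Whichever handler runs first wins; the other sees that the state is
no longer TOY_REQ_SENT and does nothing.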

>I see a lot of setting of state to TIMEWAIT, but I don't see where the
>TIMEWAIT timeout happens.

TIMEWAIT is a big todo.  My intent is to use the same work queue that
receive handling uses.
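
As a rough illustration of sharing one queue between receive handling
and TIMEWAIT expiry, here is a user-space toy.  It is only a sketch
under assumed names (toy_work, toy_queue, and so on are hypothetical)
and does not reflect the kernel work queue interface or the eventual
CM code.

```c
#include <stddef.h>

enum toy_work_type {
	TOY_WORK_RECV,		/* process a received MAD */
	TOY_WORK_TIMEWAIT_EXIT	/* leave TIMEWAIT once the timer expires */
};

struct toy_work {
	enum toy_work_type type;
	unsigned long due;	/* tick at which the item may run */
	int done;		/* nonzero once the item has run */
};

#define TOY_QUEUE_MAX 8

struct toy_queue {
	struct toy_work items[TOY_QUEUE_MAX];
	size_t count;
};

void toy_queue_init(struct toy_queue *q)
{
	q->count = 0;
}

/* Queue an item; returns 0 on success, -1 if the queue is full. */
int toy_queue_work(struct toy_queue *q, enum toy_work_type type,
		   unsigned long due)
{
	struct toy_work *w;

	if (q->count == TOY_QUEUE_MAX)
		return -1;

	w = &q->items[q->count++];
	w->type = type;
	w->due = due;
	w->done = 0;
	return 0;
}

/* Run every pending item that is due at 'now'; returns how many ran. */
int toy_run_queue(struct toy_queue *q, unsigned long now)
{
	size_t i;
	int ran = 0;

	for (i = 0; i < q->count; i++) {
		struct toy_work *w = &q->items[i];

		if (!w->done && w->due <= now) {
			w->done = 1;	/* real code would dispatch on w->type */
			ran++;
		}
	}
	return ran;
}
```

Receive work would be queued with a due time of "now", while entering
TIMEWAIT would queue a TOY_WORK_TIMEWAIT_EXIT item due after the
TIMEWAIT period, so both kinds of work are drained by the same loop.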

Thanks for the comments.

- Sean
