Just a quick read through, more comments later:

	int local_cm_response_timeout:5;
	int flow_control:1;

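These should be "unsigned," not "int." A signed 1-bit int doesn't
make much sense (it can only hold 0 and -1), and I think you'll run
into sign trouble with the 5-bit field too: a signed 5-bit field tops
out at 15, so if someone passes a local_cm_response_timeout of 20 or
so, it will most likely read back as a negative number.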
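
To make the sign problem concrete, here is a minimal user-space sketch. The struct names and helper functions are made up for illustration; only the two field names come from the patch. I've written "signed int" explicitly because whether a plain "int" bit-field is signed is implementation-defined (gcc treats it as signed).

```c
/* Hypothetical structs: only the field names are taken from the patch. */
struct params_signed {
	signed int local_cm_response_timeout:5;   /* holds -16..15 */
	signed int flow_control:1;                /* holds -1..0 */
};

struct params_unsigned {
	unsigned int local_cm_response_timeout:5; /* holds 0..31 */
	unsigned int flow_control:1;              /* holds 0..1 */
};

/* Store v in the signed 5-bit field and read it back. */
int signed_timeout(int v)
{
	struct params_signed p = {0};

	p.local_cm_response_timeout = v;
	return p.local_cm_response_timeout;
}

/* Store v in the signed 1-bit field and read it back. */
int signed_flow(int v)
{
	struct params_signed p = {0};

	p.flow_control = v;
	return p.flow_control;
}

/* Same round trip through the unsigned 5-bit field. */
int unsigned_timeout(int v)
{
	struct params_unsigned p = {0};

	p.local_cm_response_timeout = v;
	return p.local_cm_response_timeout;
}

/* Same round trip through the unsigned 1-bit field. */
int unsigned_flow(int v)
{
	struct params_unsigned p = {0};

	p.flow_control = v;
	return p.flow_control;
}
```

With gcc on a two's-complement machine I'd expect signed_timeout(20) to come back as -12 and signed_flow(1) as -1, while the unsigned versions return 20 and 1. The out-of-range conversion is implementation-defined, so treat the exact negative values as typical rather than guaranteed.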

In cm_send_handler(), you have:

	case IB_WC_RESP_TIMEOUT_ERR:
		cm_process_send_timeout(msg);
		break;

but can this ever happen? I thought that the MAD layer always treated
CM MADs as unsolicited, which means that responses are not matched with
requests, so timeouts don't happen. Or am I misunderstanding the MAD
layer semantics?

I see the state being set to TIMEWAIT in a lot of places, but I don't
see where the TIMEWAIT timeout actually gets handled.

TIMEWAIT handling should probably be done using the same cm.wq
workqueue that receive handling goes through -- this eliminates the
problem of connection setup starving timewait reaping that I mentioned
earlier.
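
As a rough illustration of why a shared queue helps, here is a small user-space model, not kernel code and not the real workqueue API. All the names (work_item, work_queue, queue_work_item and so on) are hypothetical. The point is only that when receive work and timewait work go through one FIFO, a timewait item is serviced in arrival order and cannot be pushed back indefinitely by later receives.

```c
#include <stddef.h>

enum work_type { WORK_RECV, WORK_TIMEWAIT };

/* Hypothetical work item; stands in for whatever gets queued on cm.wq. */
struct work_item {
	enum work_type type;
	int id;
	struct work_item *next;
};

/* A single FIFO shared by receive and timewait work. */
struct work_queue {
	struct work_item *head;
	struct work_item *tail;
};

/* Append an item at the tail of the queue. */
void queue_work_item(struct work_queue *wq, struct work_item *item)
{
	item->next = NULL;
	if (wq->tail)
		wq->tail->next = item;
	else
		wq->head = item;
	wq->tail = item;
}

/* Remove and return the item at the head, or NULL if the queue is empty. */
struct work_item *dequeue_work_item(struct work_queue *wq)
{
	struct work_item *item = wq->head;

	if (item) {
		wq->head = item->next;
		if (!wq->head)
			wq->tail = NULL;
	}
	return item;
}

/*
 * Drain the queue until the first timewait item is reached and return
 * how many receive items were processed ahead of it (-1 if there was
 * no timewait item at all).
 */
int recv_before_first_timewait(struct work_queue *wq)
{
	struct work_item *item;
	int n = 0;

	while ((item = dequeue_work_item(wq)) != NULL) {
		if (item->type == WORK_TIMEWAIT)
			return n;
		n++;
	}
	return -1;
}
```

If you queue a receive, then a timewait, then another receive, only the first receive runs ahead of the timewait item. With separate handling paths there is no such ordering guarantee.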
- R.
_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general