On Sun, May 31, 2009 at 09:41:54AM +0300, Or Gerlitz wrote:
> [email protected] wrote @ 
> http://lists.openfabrics.org/pipermail/general/2009-May/059730.html
> > What would prevent a race between a tx completion (with an 
> > error) and the cleanup of a neighbour? 
> 
> Okay, so maybe this code/design of using the stashed ipoib_neighbour in the tx
> completion code is the root cause of all these troubles?! 
> 
> From a quick look at the code and two patches that touched this area
> (f56bcd801... "Use separate CQ for UD send completions" and 57ce41d1... "Fix
> transmit queue stalling forever"), I see that the original tx cq handler,
> ipoib_ib_handle_tx_wc(), doesn't touch the neighbour but today is called only
> from the drain timer & dev-stop flows. Now, ipoib_cm_handle_tx_wc() is
> called for the "normal" flow both for datagram and connected modes, and this
> function touches the neighbour.

Or, I don't follow you. ipoib_cm_handle_tx_wc() has called
ipoib_neigh_free() since the first commit. Also, please note the
following designation of CQs:
recv_cq: used for all receives and for CM send
send_cq: used for UD send

Thus, since ipoib_poll() polls "recv_cq", any non-receive completion
seen there must be a CM mode send.
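
To illustrate the CQ designation above, here is a minimal user-space sketch,
not the driver code. The flag bits, enum names and sketch_classify() are
hypothetical and exist only for this example; the assumption is that receive
work requests are tagged in wr_id so that they can be told apart from sends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wr_id tag bits; the values are illustrative only. */
#define SKETCH_OP_RECV (1ULL << 31)   /* completion belongs to a receive */
#define SKETCH_OP_CM   (1ULL << 30)   /* receive was posted by CM mode   */

/* Which completion queue the work completion was polled from. */
enum sketch_cq { SKETCH_RECV_CQ, SKETCH_SEND_CQ };

/* Which handler a completion would be routed to. */
enum sketch_handler {
    SKETCH_UD_RX,   /* datagram receive        */
    SKETCH_CM_RX,   /* connected-mode receive  */
    SKETCH_CM_TX,   /* connected-mode send     */
    SKETCH_UD_TX    /* datagram send           */
};

/* Stand-in for a work completion; only wr_id matters here. */
struct sketch_wc {
    uint64_t wr_id;
};

/*
 * recv_cq carries all receives plus CM sends; send_cq carries UD sends.
 * So on recv_cq, anything not tagged as a receive must be a CM send.
 */
static enum sketch_handler sketch_classify(enum sketch_cq cq,
                                           const struct sketch_wc *wc)
{
    assert(wc != NULL);

    if (cq == SKETCH_SEND_CQ)
        return SKETCH_UD_TX;

    if (wc->wr_id & SKETCH_OP_RECV)
        return (wc->wr_id & SKETCH_OP_CM) ? SKETCH_CM_RX : SKETCH_UD_RX;

    return SKETCH_CM_TX;
}
```

With this split, a completion polled from recv_cq whose wr_id carries no
receive tag is classified as a CM send, which is the point made above; a UD
send completion only ever shows up on send_cq.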

> 
> I am not sure why commit f56bcd801... made UD completions go through
> ipoib_cm_handle_tx_wc(), nor why this function must use the neighbour to access
> the data structure it needs; maybe Eli can comment on that?
> 
> Or.
> 
> _______________________________________________
> general mailing list
> [email protected]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
