[email protected] wrote @ 
http://lists.openfabrics.org/pipermail/general/2009-May/059730.html
> What would prevent a race between a tx completion (with an 
> error) and the cleanup of a neighbour? 

Okay, so maybe the design itself, where the tx completion code uses the
stashed ipoib_neighbour, is the root cause of all these troubles?

From a quick look at the code and the two patches that touched this area
(f56bcd801... "Use separate CQ for UD send completions" and 57ce41d1... "Fix
transmit queue stalling forever"), I see that the original tx CQ handler,
ipoib_ib_handle_tx_wc(), doesn't touch the neighbour, but today it is called
only from the drain timer and dev-stop flows. Now ipoib_cm_handle_tx_wc() is
called in the "normal" flow for both datagram and connected modes, and this
function touches the neighbour.

I am not sure why commit f56bcd801... made UD completions go through
ipoib_cm_handle_tx_wc(), nor why this function must use the neighbour to reach
the data structures it needs. Maybe Eli can comment on that?
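
To make the concern concrete, here is a minimal user-space sketch of one way a
completion handler could safely touch a stashed neighbour: the in-flight send
holds its own reference, so cleanup cannot free the entry underneath it. This
is not the driver code. Every name here (demo_neigh, demo_tx_req,
demo_post_send, demo_handle_tx_wc and so on) is hypothetical, and the plain int
refcount is single-threaded only; a real driver would need an atomic count or
locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a neighbour entry; not the driver's struct. */
struct demo_neigh {
	int refcnt;     /* owner's reference plus one per in-flight send */
	int tx_errors;  /* errors recorded by the completion handler */
};

/* Hypothetical stand-in for a tx ring slot that stashes the neighbour. */
struct demo_tx_req {
	struct demo_neigh *neigh;
};

static int demo_live_neighs;    /* entries currently allocated */

int demo_live_neigh_count(void)
{
	return demo_live_neighs;
}

struct demo_neigh *demo_neigh_alloc(void)
{
	struct demo_neigh *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->refcnt = 1;          /* the owner's reference */
	demo_live_neighs++;
	return n;
}

/* Drop one reference; the entry is freed when the last one goes away. */
void demo_neigh_put(struct demo_neigh *n)
{
	assert(n->refcnt > 0);
	if (--n->refcnt == 0) {
		demo_live_neighs--;
		free(n);
	}
}

/* Post a send: stash the neighbour and pin it for the in-flight request. */
void demo_post_send(struct demo_tx_req *req, struct demo_neigh *n)
{
	n->refcnt++;
	req->neigh = n;
}

/*
 * Completion handler: touching the neighbour is safe only because the
 * request still holds a reference. Returns the neighbour's error count.
 */
int demo_handle_tx_wc(struct demo_tx_req *req, int status_is_error)
{
	struct demo_neigh *n = req->neigh;
	int errors;

	if (status_is_error)
		n->tx_errors++;
	errors = n->tx_errors;
	req->neigh = NULL;
	demo_neigh_put(n);      /* release the in-flight reference */
	return errors;
}
```

In this sketch, if cleanup calls demo_neigh_put() while a send is still
outstanding, the entry survives until demo_handle_tx_wc() drops the last
reference. Without the extra reference taken in demo_post_send(), the same
ordering would leave the handler dereferencing freed memory.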

Or.

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
