Re: how to safely tear down a cm_id?

2010-09-15 Thread Zach Brown
> rdma_destroy_id will block until all CM callbacks complete. rdma_disconnect does not. It merely issues or responds to a disconnect request. If it issues a disconnect request, then a disconnect callback will eventually follow, possibly before rdma_disconnect returns.
Thanks, that's jus

how to safely tear down a cm_id?

2010-09-15 Thread Zach Brown
Hi gang, We're chasing some bugs in RDS. In trying to explore possible causes I found that I don't really understand the sequence of events needed to safely tear down a cm_id. I'm worried that we have cm event callbacks being processed in the ib_cm thread racing with our krds thread which is tea

Re: [rds-devel] net-next pull request: RDS

2010-09-14 Thread Zach Brown
> Zach Brown's "RDS/IB: print IB event strings as well as their number" - commit 1bde04a63d532c2540d6fdee0a661530a62b1686 in net-next-2.6 looks perfect to reside as a helper function in the core IB stack which can be in use by other rdma drivers (e.g. ipoib, iser, srp, etc).
Yeah, if those o

ipoib neighbour free race?

2010-08-19 Thread Zach Brown
Hi gang, We're chasing a bug that we can hit when we pull IB cables with CONFIG_DEBUG_PAGE_ALLOC enabled. It appears as though the to_ipoib_neigh() in ipoib_neigh_free() under ipoib_mcast_free() is referencing a freed neighbour struct. The invariant here, as far as I can tell, is that cleanup_n