On Wed, Feb 16, 2011 at 4:51 PM, Pradeep Satyanarayana <prade...@linux.vnet.ibm.com> wrote: > The list_move_tail() in ipoib_mcast_restart_task() is conditional. So it > feasible that mcast->list is not moved to the remove_list, but the > ipoib_neigh structure is freed. A subsequent call to ipoib_mcast_free() from > context of dev_close() tries to free the same ipoib_neigh() structure which > might cause the crash.
I'm missing something. ipoib_neigh_free() is only called in ipoib_mcast_free(). ipoib_mcast_free() is only called on entries in remove_list. So if an mcast structure is not moved to the remove_list, I don't see how an ipoib_neigh struct on its list could be freed. We do all manipulation of the mcast list in ipoib_mcast_restart_task() and ipoib_mcast_dev_flush() with priv->lock held, so I don't see how we could free the same mcast struct twice, or free an ipoib_neigh struct without freeing its corresponding mcast struct. Indeed it's not clear to me why we need to take the lock in ipoib_mcast_free() or what it could be protecting against -- that seems like it must be wrong and unnecessary (and the locking predates git history). Sorry if I'm being slow but if this patch is fixing some test, it really seems that it must be doing that by making a race window smaller or something like that, rather than actually being the right fix for the underlying problem. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html