On 24/07/2012 18:14, Yishai Hadas wrote:
Just encountered a kernel oops in IPoIB on upstream kernel 3.5 [...] oops happened in ipoib_mcast_join_task.

Roland,

I have now reviewed the issue Yishai raised and looked at a few related commits in that area. As you wrote in a77a57a1a "IPoIB: Fix deadlock on RTNL in ipoib_stop()", commit c8c2afe3 "IPoIB: Use rtnl lock/unlock when changing device flags" added a call to rtnl_lock() in ipoib_mcast_join_task(), which runs from the ipoib_workqueue, and hence we can't flush
the workqueue from the context in which ipoib_stop() is called.

HOWEVER, that very same ipoib_stop() context, which doesn't flush the workqueue, calls ipoib_mcast_dev_flush(), which deletes all the multicast entries, and this flow now takes place without any synchronization with possibly running instances of ipoib_mcast_join_task() that relate to the SAME ipoib device. Yishai's test stepped on the broadcast pointer being NULL, but
this race can hold for any group this device has joined.

What would you suggest here? Changing the ipoib_stop() flow to flush the workqueue, and using rtnl_trylock() instead of rtnl_lock() in ipoib_mcast_join_task(), doesn't seem to be applicable, since we can't know who took the lock: an arbitrary context that now wants to apply changes to the device, or the ipoib_stop() one.

I see that this code is executed unconditionally whenever the mcast task is run:

        priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));

        if (!ipoib_cm_admin_enabled(dev)) {
                rtnl_lock();
                dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu));
                rtnl_unlock();
        }

Maybe being smarter and running it only after actually joining the broadcast group, rather than each time, could help with solving the race? And/or moving the code that does
the dev_set_mtu() call so that it is executed in another context?
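
To make the first idea concrete, here is a rough userspace sketch of the control flow only. It is not a patch: the model_* names, the just_joined flag and the counter are all made up for illustration, the IPOIB_UD_MTU() conversion is skipped, and no locking is modeled, so the NULL check by itself would not close the race with ipoib_mcast_dev_flush().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the real ipoib structures. */
struct model_mcast {
	int mtu;                        /* MTU reported by the joined group */
};

struct model_priv {
	struct model_mcast *broadcast;  /* NULL once a flush has removed the entry */
	int mcast_mtu;
	int admin_mtu;
	int dev_mtu;                    /* what dev_set_mtu() would have applied */
	int set_mtu_calls;              /* how many times the "rtnl" section ran */
};

static int model_min(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Models the tail of the join task. just_joined is assumed to be true only
 * on the pass in which the broadcast join actually completed.
 * Returns true if the MTU was updated on this pass.
 */
static bool model_join_task_tail(struct model_priv *priv, bool just_joined)
{
	assert(priv != NULL);

	/* Skip the MTU update (and so the rtnl section) on every other pass. */
	if (!just_joined)
		return false;

	/* Do not dereference a broadcast entry that has already been flushed. */
	if (priv->broadcast == NULL)
		return false;

	priv->mcast_mtu = priv->broadcast->mtu;

	/* In the driver, rtnl_lock()/dev_set_mtu()/rtnl_unlock() would sit here. */
	priv->dev_mtu = model_min(priv->mcast_mtu, priv->admin_mtu);
	priv->set_mtu_calls++;
	return true;
}
```

With this shape the rtnl section is entered once per successful broadcast join instead of on every run of the task, which shrinks the window but still leaves the question of how to synchronize with the flush.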

Or.

  • IPoIB oops Yishai Hadas
    • Re: ipoib race in multicast flow (was: IPoIB oops) Or Gerlitz
