Yossi Etigin wrote:
> I think it comes from unicast_arp_send.
>
> Consider this scenario:
> - paths are flushed (opensm up/down).
> - unicast_arp_send() is called with a path in priv->path_list;
>   path->valid is 0.
> - path_rec_start() fails with -EAGAIN (-11) because alloc_mad() fails:
>   no SM AH (yet); see the prints just before the panic.
> - unicast_arp_send() calls path_free().
> - path memory is overwritten.
> - __ipoib_dev_flush() is called again.
> - mark_paths_invalid() tries to iterate over priv->path_list and gets a
>   kernel panic because path->list became invalid.
>
> --Yossi
>
I agree with Yossi's analysis.
Isn't the fix just as simple as this?
 void ipoib_mark_paths_invalid(struct net_device *dev)
 {
         struct ipoib_dev_priv *priv = netdev_priv(dev);
         struct ipoib_path *path, *tp;

         spin_lock_irq(&priv->lock);

         list_for_each_entry_safe(path, tp, &priv->path_list, list) {
                 ipoib_dbg(priv, "mark path LID 0x%04x GID " IPOIB_GID_FMT " invalid\n",
                           be16_to_cpu(path->pathrec.dlid),
                           IPOIB_GID_ARG(path->pathrec.dgid));
-                path->valid = 0;
+                if (path)
+                        path->valid = 0;
         }

         spin_unlock_irq(&priv->lock);
 }
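
For illustration only, here is a small userspace sketch of the lifetime rule behind the scenario above: an entry is unlinked from the list before it is freed, so a later "mark all paths invalid" walk never follows a link into freed memory. Everything named demo_* is hypothetical and is not driver code. The hand-rolled circular list only stands in for the kernel's struct list_head, and locking is left out entirely. Note that in a circular list like this the loop cursor is derived from the links and is never NULL, so the sketch relies on unlinking rather than on a NULL test. Whether this matches the driver's real lifetime rules is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ipoib_path; not the driver's real layout. */
struct demo_path {
        struct demo_path *prev, *next;  /* circular doubly linked list links */
        int valid;
};

/* Initialise an empty circular list; the head acts as a sentinel node. */
void demo_list_init(struct demo_path *head)
{
        head->prev = head->next = head;
        head->valid = 0;
}

/* Allocate a path, mark it valid, and append it at the tail of the list. */
struct demo_path *demo_path_add(struct demo_path *head)
{
        struct demo_path *p = malloc(sizeof(*p));

        assert(p != NULL);
        p->valid = 1;
        p->prev = head->prev;
        p->next = head;
        head->prev->next = p;
        head->prev = p;
        return p;
}

/* Unlink first, then free, so no neighbour keeps a pointer to freed memory. */
void demo_path_remove(struct demo_path *p)
{
        p->prev->next = p->next;
        p->next->prev = p->prev;
        free(p);
}

/* Model of the invalidation walk: clears valid on every entry still linked
 * and returns how many entries were visited. */
int demo_mark_paths_invalid(struct demo_path *head)
{
        int visited = 0;

        for (struct demo_path *p = head->next; p != head; p = p->next) {
                p->valid = 0;
                visited++;
        }
        return visited;
}
```

With three entries added and the middle one removed through demo_path_remove(), the walk visits only the two entries that are still linked.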