> The race is between the list entry being taken off the reap_list for
> destroy because of the timeout, and being re-added to the reap_list from
> this condition. So we need a condition check here.
I'm sorry, I still don't see what the bug is. What timeout removes
things from the rx_reap_list? And how could any context get added to
the rx_reap_list more than once? The code you are changing is:
>  		if (!--p->recv_count) {
>  			spin_lock_irqsave(&priv->lock, flags);
> -			list_move(&p->list, &priv->cm.rx_reap_list);
> -			spin_unlock_irqrestore(&priv->lock, flags);
> -			queue_work(ipoib_workqueue, &priv->cm.rx_reap_task);
> +			if (p->state == IPOIB_CM_RX_LIVE) {
> +				list_move(&p->list, &priv->cm.rx_reap_list);
> +				spin_unlock_irqrestore(&priv->lock, flags);
> +				queue_work(ipoib_workqueue,
> +					   &priv->cm.rx_reap_task);
> +			} else
> +				spin_unlock_irqrestore(&priv->lock, flags);
and I don't see how recv_count could ever be 0 twice: we never increment it.
In any case, I don't see how testing against the LIVE state could be
correct, since, for example, we want to reap stale QPs that we moved to
IPOIB_CM_RX_ERROR after all the pending receives have completed.
- R.
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general