Jack Morgenstein has discovered the following race condition in
libmthca. We are actually hitting it in testing.

Thread A destroys QP A at the kernel side by calling ibv_cmd_destroy_qp, but its
time-slice is over before removing it from the user-space qp_table removal.

Thread B allocates QP B, receiving a QP number that matches the just-destroyed
QP A in the low 16 bits.  Thread B will now over-write the slot in qp_table
which was used for QP A.

Thread A wakes up and clears qp_table slot, in effect removing QP B from
qp_table.

As a solution, remove the QP from qp_table before calling ibv_cmd_destroy_qp.
This also makes sense since operations are performed in the reverse order
in create_qp.

---

Race condition fix: keep qp_table in userspace and in kernel consistent
by performing destruction in the reverse order of construction.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>

Index: openib/src/userspace/libmthca/src/verbs.c
===================================================================
--- openib.orig/src/userspace/libmthca/src/verbs.c      2005-12-14 
23:29:54.000000000 +0200
+++ openib/src/userspace/libmthca/src/verbs.c   2006-01-05 14:54:29.000000000 
+0200
@@ -529,10 +529,6 @@ int mthca_destroy_qp(struct ibv_qp *qp)
 {
        int ret;
 
-       ret = ibv_cmd_destroy_qp(qp);
-       if (ret)
-               return ret;
-
        mthca_cq_clean(to_mcq(qp->recv_cq), qp->qp_num,
                       qp->srq ? to_msrq(qp->srq) : NULL);
        if (qp->send_cq != qp->recv_cq)
@@ -546,6 +542,18 @@ int mthca_destroy_qp(struct ibv_qp *qp)
                pthread_spin_unlock(&to_mcq(qp->recv_cq)->lock);
        pthread_spin_unlock(&to_mcq(qp->send_cq)->lock);
 
+       ret = ibv_cmd_destroy_qp(qp);
+       if (ret) {
+               pthread_spin_lock(&to_mcq(qp->send_cq)->lock);
+               if (qp->send_cq != qp->recv_cq)
+                       pthread_spin_lock(&to_mcq(qp->recv_cq)->lock);
+               mthca_store_qp(to_mctx(qp->context), qp->qp_num, to_mqp(qp));
+               if (qp->send_cq != qp->recv_cq)
+                       pthread_spin_unlock(&to_mcq(qp->recv_cq)->lock);
+               pthread_spin_unlock(&to_mcq(qp->send_cq)->lock);
+               return ret;
+       }
+
        if (mthca_is_memfree(qp->context)) {
                mthca_free_db(to_mctx(qp->context)->db_tab, MTHCA_DB_TYPE_RQ,
                              to_mqp(qp)->rq.db_index);

-- 
MST
_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to