The way I understand the sequence of events w/o the patch is: ib_req_notify_cq(IB_CQ_NEXT_COMP) CQE 1 added to queue callback scheduled via tasklet future callbacks disarmed callback function calls ib_req_notify_cq(IB_CQ_NEXT_COMP) callback function calls ib_poll_cq() and gets CQE 1 callback function calls ib_poll_cq() and gets none CQE 2 added to queue via IRQ callback scheduled via tasklet future callbacks disarmed callback function returns some time later, tasklet runs and calls CQ callback function. callback function calls ib_req_notify_cq(IB_CQ_NEXT_COMP) callback function calls ib_poll_cq() and gets CQE 2
Since a tasklet or workqueue can be scheduled in the callback function, the second CQE isn't "missed" but there is a scheduling delay before the callback happens and sees CQE 2. I guess it is a minor optimization since either CQE 2 will be seen in the first callback while looping in ib_poll_cq() and then getting a callback later with ib_poll_cq()==0 or seen in the second callback. I'm willing to withdraw the 1-3 patches. I still don't understand why the timing difference matters to RDS. On Wed, 2010-03-31 at 11:17 -0700, Roland Dreier wrote: > > ib_req_notify_cq(IB_CQ_NEXT_COMP) is not guaranteed to generate > > a callback for the next completion entered since there is a race > > between arming the callback and another CQE being added to the queue. > > The IB_CQ_REPORT_MISSED_EVENTS flag was added to detect this > > race and allow the verbs consumer to call ib_poll_cq() and > > ib_req_notify_cq() again to avoid delays in processing the CQE. > > I'm not sure I understand the race you're fixing here... the existing > code does the rearm before polling: > > > + int ret; > > > > port_priv = container_of(work, struct ib_mad_port_private, work); > > - ib_req_notify_cq(port_priv->cq, IB_CQ_NEXT_COMP); > > > > +again: > > while (ib_poll_cq(port_priv->cq, 1, &wc) == 1) { > > if (wc.status == IB_WC_SUCCESS) { > > switch (wc.opcode) { > > @@ -2246,6 +2247,10 @@ static void ib_mad_completion_handler(struct > work_struct *work) > > } else > > mad_error_handler(port_priv, &wc); > > } > > + ret = ib_req_notify_cq(port_priv->cq, IB_CQ_NEXT_COMP | > > + IB_CQ_REPORT_MISSED_EVENTS); > > + if (ret > 0) > > + goto again; > > The only issue with the existing code is that it may trigger extra > events that will find the CQ empty when polling. > > So this may be a valid optimization but I don't see it fixing any missed > events. Am I missing something? > > Also for all these fixes, I think you only want to rearm the CQ once and > go back and poll if you get a missed events warning; the next time the > CQ is empty, then you know another event will happen. > > - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html