I think it has to stay the way I wrote it.  Your version:

+            if (empty) {
+                        ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP, &missed_event);
+                        if (unlikely(missed_event) && netif_rx_reschedule(dev, 0))
+                                    goto repoll;
+                        netif_rx_complete(dev);
+
+                        return 0;
+            }

has a race: suppose missed_event is 0 but an event _is_ generated
right before the call to netif_rx_complete().  Then the interrupt
handler might run before the call to netif_rx_complete(), try to
schedule the NAPI poll, but end up doing nothing because the poll
routine is still running.  Then the poll routine will call
netif_rx_complete() and return 0, so it won't get called again ever
(because the CQ event has already fired).  And so the interface will
hang and never make any more progress.

I would really like to understand why ehca does worse with NAPI.  In
my tests both mthca and ipath exhibit various degrees of improvement
depending on the test -- but I've never seen performance get worse.
This is the main thing holding back merging NAPI.

Does the NAPI patch help mthca on pSeries?  I wonder if it's not ehca,
but rather that there's some ppc64 quirk that makes NAPI a lot more
expensive.

 - R.

_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
