[Felix Marti] In addition, is arming the CQ really in the performance
path? Don't apps poll the CQ as long as there are pending CQEs, and
only arm the CQ for notification once there is nothing left to do? If
that is the case, it would mean we only waste a few 'idle' cycles.
On Thu, 2007-01-04 at 07:07 +0200, Michael S. Tsirkin wrote:
If you think I should not add the udata parameter to the req_notify_cq()
provider verb, then I can rework the chelsio driver:
1) at cq creation time, pass the virtual address of the u32 used by the
library to track the current consumer index
@@ -1373,7 +1374,7 @@ int ib_peek_cq(struct ib_cq *cq, int wc_
 static inline int ib_req_notify_cq(struct ib_cq *cq,
 				   enum ib_cq_notify cq_notify)
 {
-	return cq->device->req_notify_cq(cq, cq_notify);
+	return cq->device->req_notify_cq(cq, cq_notify,
It seems all Chelsio needs is to pass in a consumer index - so, how
about a new entry point? Something like
void set_cq_udata(struct ib_cq *cq, struct ib_udata *udata)?
Adding a new entry point would hurt chelsio's user mode performance if
it then requires 2 kernel transitions.
No, it won't need 2 transitions - just an extra function call,
so it won't hurt performance - it would improve performance.
ib_uverbs_req_notify_cq would call it:

ib_uverbs_req_notify_cq()
{
	ib_set_cq_udata(cq, udata);
	...
}
I've run this code with mthca and didn't notice any performance
degradation, but I wasn't specifically measuring cq_poll overhead in a
tight loop...
We were speaking about ib_req_notify_cq here, actually, not cq poll.
So what was tested?
--
MST
Sorry, I meant req_notify. I didn't
On Wed, 2007-01-03 at 21:33 +0200, Michael S. Tsirkin wrote:
Without extra param (1000 iterations in cycles):
ave 101.283 min 91 max 247
With extra param (1000 iterations in cycles):
ave 103.311 min 91 max 221
A 2% hit, then. Not huge, but not 0 either.
Can you convert cycles to ns?
diff --git a/drivers/infiniband/hw/mthca/mthca_cq.c
b/drivers/infiniband/hw/mthca/mthca_cq.c
index 283d50b..15cbd49 100644
--- a/drivers/infiniband/hw/mthca/mthca_cq.c
+++ b/drivers/infiniband/hw/mthca/mthca_cq.c
@@ -722,7 +722,8 @@ repoll:
 	return err == 0 || err == -EAGAIN ? npolled : err;