date:20060823

Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> If you look at Woody's backport patches, I believe that he moves the RDMA CM
> files to /sys/class/infiniband/rdma_cm and updates the librdmacm to read the
> abi_version from there.

Maybe the librdmacm part should be merged to svn?
So librdmacm could try to read from misc, then from
/sys/class/infiniband/rdma_cm, and then assume latest.
It's good to have userspace code portable across distros ...
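
A minimal sketch of that fallback order, for illustration only. The helper name read_abi_version() and its signature are hypothetical, not the actual librdmacm code; the two sysfs locations are the ones named in this thread, and the "latest" value is left to the caller.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not the actual librdmacm code): try each candidate
 * file in order and return the first ABI version that parses. If none can
 * be read, return the caller-supplied fallback ("assume latest").
 */
int read_abi_version(const char *const *paths, int npaths, int fallback)
{
    int i;

    for (i = 0; i < npaths; i++) {
        FILE *f = fopen(paths[i], "r");
        int ver;

        if (!f)
            continue;   /* not present on this kernel, try the next one */
        if (fscanf(f, "%d", &ver) == 1) {
            fclose(f);
            return ver;
        }
        fclose(f);
    }
    return fallback;
}

/* Candidate locations from this thread, in the order suggested above. */
const char *const rdma_cm_abi_paths[] = {
    "/sys/class/misc/rdma_cm/abi_version",
    "/sys/class/infiniband/rdma_cm/abi_version",
};
```

A caller would pass rdma_cm_abi_paths, a count of 2, and whatever ABI version the library was built against as the fallback.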

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> And even with these proposed changes, there's a race condition where the CM
> can timeout a connection after data is received over it, but before this event
> can be processed.

Hmm. And what happens then?

-- 
MST


Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-23 Thread Sean Hefty

>I have some rdma_cm test code and when I run with the OFED 1.1 code (running on
>2.6.9 U3 based kernel) I got the following error.
>
>librdmacm: couldn't read ABI version.
>librdmacm: assuming: 2

The RDMA CM places the abi_version file in /sys/class/misc/rdma_cm.  The misc
class didn't exist in 2.6.9, which is why it was removed from the OFED code.

>The code seems to run (as it really does nothing) fine but I was wondering if I
>could fix this just to clean up the output.  I found that the following patch
>removes the code which creates the abi_version file.

If you look at Woody's backport patches, I believe that he moves the RDMA CM
files to /sys/class/infiniband/rdma_cm and updates the librdmacm to read the
abi_version from there.

Or you could just remove the prints from the library.

>How bad is it that the user space is assuming version 2 of the interface and
>the modules are at version 1?

It should work fine for apps using RC QPs.

Version 1 assumed that the port space was RDMA TCP for RC QPs.  Version 2 added
support for UD QPs through the RDMA's UDP port space.  The port space
information was added to the end of a structure, so if an older kernel is used,
it simply won't read in the port space data, and will assume TCP.

An application that was expecting to use UD QP will simply get an error on some
operation, likely when it tries to actually connect to a remote UD QP.
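
To illustrate why appending a field keeps older kernels working, here is a small self-contained model. The struct layouts, field names, and PS_* values are made up for the example and are not the real ucma ABI; the point is only that a version 1 reader copies the version 1 sized prefix and never sees the trailing field.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative port-space tags; the numeric values are assumptions. */
enum { PS_TCP = 1, PS_UDP = 2 };

/* Hypothetical ABI v1 command: no port space field. */
struct create_id_v1 {
    uint64_t uid;
    uint64_t response;
};

/* Hypothetical ABI v2 command: same layout with the port space appended. */
struct create_id_v2 {
    uint64_t uid;
    uint64_t response;
    uint16_t ps;
    uint8_t  reserved[6];
};

/*
 * Stand-in for a v1 kernel: it copies only the v1-sized prefix of whatever
 * userspace wrote, so a trailing 'ps' field is never read and the port
 * space is implicitly TCP. Returns the port space used, or -1 if the
 * buffer is too short.
 */
int v1_kernel_create_id(const void *buf, size_t len, uint64_t *uid_out)
{
    struct create_id_v1 cmd;

    if (len < sizeof cmd)
        return -1;
    memcpy(&cmd, buf, sizeof cmd);
    *uid_out = cmd.uid;
    return PS_TCP;
}
```

A v2 library asking for PS_UDP would get a TCP-space ID back from such a kernel, which matches the "error on some later operation" behavior described above.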

- Sean


[openib-general] [PATCH] IB/libipathverbs - Fix compatibility with old ib_ipath kernel drivers

2006-08-23 Thread Ralph Campbell

This patch makes libipathverbs backward compatible with old ib_ipath kernel 
drivers.

Signed-off-by: Ralph Campbell <[EMAIL PROTECTED]>

Index: src/userspace/libipathverbs/src/verbs.c
===
--- src/userspace/libipathverbs/src/verbs.c (revision 9095)
+++ src/userspace/libipathverbs/src/verbs.c (working copy)
@@ -177,6 +177,29 @@ struct ibv_cq *ipath_create_cq(struct ib
        return &cq->ibv_cq;
 }
 
+struct ibv_cq *ipath_create_cq_v1(struct ibv_context *context, int cqe,
+                                  struct ibv_comp_channel *channel,
+                                  int comp_vector)
+{
+       struct ibv_cq              *cq;
+       struct ibv_create_cq        cmd;
+       struct ibv_create_cq_resp   resp;
+       int                         ret;
+
+       cq = malloc(sizeof *cq);
+       if (!cq)
+               return NULL;
+
+       ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector,
+                               cq, &cmd, sizeof cmd, &resp, sizeof resp);
+       if (ret) {
+               free(cq);
+               return NULL;
+       }
+
+       return cq;
+}
+
 int ipath_resize_cq(struct ibv_cq *ibcq, int cqe)
 {
        struct ipath_cq            *cq = to_icq(ibcq);
@@ -207,6 +230,15 @@ int ipath_resize_cq(struct ibv_cq *ibcq,
        return 0;
 }
 
+int ipath_resize_cq_v1(struct ibv_cq *ibcq, int cqe)
+{
+       struct ibv_resize_cq        cmd;
+       struct ibv_resize_cq_resp   resp;
+
+       return ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd,
+                                &resp, sizeof resp);
+}
+
 int ipath_destroy_cq(struct ibv_cq *ibcq)
 {
        struct ipath_cq *cq = to_icq(ibcq);
@@ -222,6 +254,16 @@ int ipath_destroy_cq(struct ibv_cq *ibcq
        return 0;
 }
 
+int ipath_destroy_cq_v1(struct ibv_cq *ibcq)
+{
+       int ret;
+
+       ret = ibv_cmd_destroy_cq(ibcq);
+       if (!ret)
+               free(ibcq);
+       return ret;
+}
+
 int ipath_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
 {
        struct ipath_cq *cq = to_icq(ibcq);
@@ -290,6 +332,28 @@ struct ibv_qp *ipath_create_qp(struct ib
        return &qp->ibv_qp;
 }
 
+struct ibv_qp *ipath_create_qp_v1(struct ibv_pd *pd,
+                                  struct ibv_qp_init_attr *attr)
+{
+       struct ibv_create_qp         cmd;
+       struct ibv_create_qp_resp    resp;
+       struct ibv_qp               *qp;
+       int                          ret;
+
+       qp = malloc(sizeof *qp);
+       if (!qp)
+               return NULL;
+
+       ret = ibv_cmd_create_qp(pd, qp, attr, &cmd, sizeof cmd,
+                               &resp, sizeof resp);
+       if (ret) {
+               free(qp);
+               return NULL;
+       }
+
+       return qp;
+}
+
 int ipath_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
   enum ibv_qp_attr_mask attr_mask,
   struct ibv_qp_init_attr *init_attr)
@@ -330,6 +394,16 @@ int ipath_destroy_qp(struct ibv_qp *ibqp
        return 0;
 }
 
+int ipath_destroy_qp_v1(struct ibv_qp *ibqp)
+{
+       int ret;
+
+       ret = ibv_cmd_destroy_qp(ibqp);
+       if (!ret)
+               free(ibqp);
+       return ret;
+}
+
 static int post_recv(struct ipath_rq *rq, struct ibv_recv_wr *wr,
 struct ibv_recv_wr **bad_wr)
 {
@@ -412,6 +486,28 @@ struct ibv_srq *ipath_create_srq(struct 
        return &srq->ibv_srq;
 }
 
+struct ibv_srq *ipath_create_srq_v1(struct ibv_pd *pd,
+                                    struct ibv_srq_init_attr *attr)
+{
+       struct ibv_srq *srq;
+       struct ibv_create_srq cmd;
+       struct ibv_create_srq_resp resp;
+       int ret;
+
+       srq = malloc(sizeof *srq);
+       if (srq == NULL)
+               return NULL;
+
+       ret = ibv_cmd_create_srq(pd, srq, attr, &cmd, sizeof cmd,
+                                &resp, sizeof resp);
+       if (ret) {
+               free(srq);
+               return NULL;
+       }
+
+       return srq;
+}
+
 int ipath_modify_srq(struct ibv_srq *ibsrq,
 struct ibv_srq_attr *attr, 
 enum ibv_srq_attr_mask attr_mask)
@@ -456,6 +552,16 @@ int ipath_modify_srq(struct ibv_srq *ibs
        return 0;
 }
 
+int ipath_modify_srq_v1(struct ibv_srq *ibsrq,
+                        struct ibv_srq_attr *attr,
+                        enum ibv_srq_attr_mask attr_mask)
+{
+       struct ibv_modify_srq cmd;
+
+       return ibv_cmd_modify_srq(ibsrq, attr, attr_mask,
+                                 &cmd, sizeof cmd);
+}
+
 int ipath_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr)
 {
struct ibv_query_srq cmd;
@@ -481,6 +587,16 @@ int ipath_destroy_srq(struct ibv_srq *ib
        return 0;
 }
 
+int ipath_destroy_srq_v1(struct ibv_srq *ibsrq)
+{
+       int ret;
+
+       ret = ibv_cmd_destroy_srq(ibsrq);
+       if (!ret)
+               free(ibsrq);
+       return ret;
+}
+
 int ipath_post_srq_recv(struct ibv_srq *ibsrq

Re: [openib-general] [GIT PULL] please pull infiniband.git

2006-08-23 Thread Greg KH

On Wed, Aug 23, 2006 at 04:25:38PM -0700, Roland Dreier wrote:
> Greg, please pull from
> 
> master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus
> 
> This tree is also available from kernel.org mirrors at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

Pulled from, and pushed out.

thanks,

greg k-h


[openib-general] librdmacm ABI issues with OFED 1.1

2006-08-23 Thread Ira Weiny

I have some rdma_cm test code and when I run with the OFED 1.1 code (running on
2.6.9 U3 based kernel) I got the following error.

librdmacm: couldn't read ABI version.
librdmacm: assuming: 2

The code seems to run (as it really does nothing) fine but I was wondering if I
could fix this just to clean up the output.  I found that the following patch
removes the code which creates the abi_version file.

./backport/2.6.9_U3/ucma_6607_to_2_6_9.patch

So my question is:

How bad is it that the user space is assuming version 2 of the interface and
the modules are at version 1?

Thanks,
Ira



Re: [openib-general] basic IB doubt

2006-08-23 Thread Sean Hefty

>Actually, that leads me to a question: does the vendor of that adaptor
>say that this is actually safe?

I believe so.

>most of the time doesn't mean it does it all of the time. So is it
>really smart to write non-standard-conforming programs unless the
>vendor stands behind that behavior?

I'm not saying whether I consider this good computer science or not, but some
applications do rely on this feature, and hardware that wants to work best with
those applications will have it.

- Sean


Re: [openib-general] basic IB doubt

2006-08-23 Thread Roland Dreier

Greg> Actually, that leads me to a question: does the vendor of
Greg> that adaptor say that this is actually safe? Just because
Greg> something behaves one way most of the time doesn't mean it
Greg> does it all of the time. So is it really smart to write
Greg> non-standard-conforming programs unless the vendor stands
Greg> behind that behavior?

Yes, Mellanox documents that it is safe to rely on the last byte of an
RDMA being written last.
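
For reference, the polling pattern being discussed looks roughly like the sketch below. It assumes the adapter really does write the last byte last, which per this thread is vendor-documented behavior rather than something the IB architecture guarantees. The function name, the marker convention, and the spin limit are all made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Spin on the last byte of a receive buffer until it holds 'marker', or
 * until 'max_spins' reads have been made. Returns 1 if the marker was
 * seen, 0 on give-up. The last byte must not already hold the marker
 * before the remote side starts its RDMA write, and 'len' must be
 * non-zero.
 */
int poll_last_byte(const volatile uint8_t *buf, size_t len,
                   uint8_t marker, unsigned long max_spins)
{
    const volatile uint8_t *last = buf + len - 1;
    unsigned long i;

    for (i = 0; i < max_spins; i++) {
        if (*last == marker)
            return 1;   /* by assumption, the earlier bytes landed first */
    }
    return 0;
}
```

On hardware without that ordering property, a hit on the last byte says nothing about the rest of the buffer, so a portable program would wait for a completion instead.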

 - R.


Re: [openib-general] basic IB doubt

2006-08-23 Thread Greg Lindahl

On Wed, Aug 23, 2006 at 09:29:18AM -0700, Sean Hefty wrote:

> I don't believe that there is any ordering guarantee by the architecture.
> However, specific adapters may behave this way, and I've seen applications
> make use of this by polling the last memory byte for a completion, for example.

Actually, that leads me to a question: does the vendor of that adaptor
say that this is actually safe? Just because something behaves one way
most of the time doesn't mean it does it all of the time. So is it
really smart to write non-standard-conforming programs unless the
vendor stands behind that behavior?

-- greg


[openib-general] [GIT PULL] please pull infiniband.git

2006-08-23 Thread Roland Dreier

Greg, please pull from

master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

to get a few fixes for the 2.6.18 tree:

Jack Morgenstein:
  IB/core: Fix SM LID/LID change with client reregister set

Michael S. Tsirkin:
  IB/mthca: Make fence flag work for send work requests
  IB/mthca: Update HCA firmware revisions

Roland Dreier:
  IB/mthca: Fix potential AB-BA deadlock with CQ locks
  IB/mthca: No userspace SRQs if HCA doesn't have SRQ support

 drivers/infiniband/core/cache.c  |3 +
 drivers/infiniband/core/sa_query.c   |3 +
 drivers/infiniband/hw/mthca/mthca_main.c |6 +--
 drivers/infiniband/hw/mthca/mthca_provider.c |   11 +++--
 drivers/infiniband/hw/mthca/mthca_provider.h |4 +-
 drivers/infiniband/hw/mthca/mthca_qp.c   |   54 +++---
 6 files changed, 55 insertions(+), 26 deletions(-)


diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index e05ca2c..75313ad 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -301,7 +301,8 @@ static void ib_cache_event(struct ib_eve
event->event == IB_EVENT_PORT_ACTIVE ||
event->event == IB_EVENT_LID_CHANGE  ||
event->event == IB_EVENT_PKEY_CHANGE ||
-   event->event == IB_EVENT_SM_CHANGE) {
+   event->event == IB_EVENT_SM_CHANGE   ||
+   event->event == IB_EVENT_CLIENT_REREGISTER) {
work = kmalloc(sizeof *work, GFP_ATOMIC);
if (work) {
INIT_WORK(&work->work, ib_cache_task, work);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index aeda484..d6b8422 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -405,7 +405,8 @@ static void ib_sa_event(struct ib_event_
event->event == IB_EVENT_PORT_ACTIVE ||
event->event == IB_EVENT_LID_CHANGE  ||
event->event == IB_EVENT_PKEY_CHANGE ||
-   event->event == IB_EVENT_SM_CHANGE) {
+   event->event == IB_EVENT_SM_CHANGE   ||
+   event->event == IB_EVENT_CLIENT_REREGISTER) {
struct ib_sa_device *sa_dev;
sa_dev = container_of(handler, typeof(*sa_dev), event_handler);
 
diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
index 557cde3..7b82c19 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -967,12 +967,12 @@ static struct {
 } mthca_hca_table[] = {
[TAVOR]= { .latest_fw = MTHCA_FW_VER(3, 4, 0),
   .flags = 0 },
-   [ARBEL_COMPAT] = { .latest_fw = MTHCA_FW_VER(4, 7, 400),
+   [ARBEL_COMPAT] = { .latest_fw = MTHCA_FW_VER(4, 7, 600),
   .flags = MTHCA_FLAG_PCIE },
-   [ARBEL_NATIVE] = { .latest_fw = MTHCA_FW_VER(5, 1, 0),
+   [ARBEL_NATIVE] = { .latest_fw = MTHCA_FW_VER(5, 1, 400),
   .flags = MTHCA_FLAG_MEMFREE |
MTHCA_FLAG_PCIE },
-   [SINAI]= { .latest_fw = MTHCA_FW_VER(1, 0, 800),
+   [SINAI]= { .latest_fw = MTHCA_FW_VER(1, 1, 0),
   .flags = MTHCA_FLAG_MEMFREE |
MTHCA_FLAG_PCIE|
MTHCA_FLAG_SINAI_OPT }
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 230ae21..265b1d1 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -1287,11 +1287,7 @@ int mthca_register_device(struct mthca_d
(1ull << IB_USER_VERBS_CMD_MODIFY_QP)   |
(1ull << IB_USER_VERBS_CMD_DESTROY_QP)  |
(1ull << IB_USER_VERBS_CMD_ATTACH_MCAST)|
-   (1ull << IB_USER_VERBS_CMD_DETACH_MCAST)|
-   (1ull << IB_USER_VERBS_CMD_CREATE_SRQ)  |
-   (1ull << IB_USER_VERBS_CMD_MODIFY_SRQ)  |
-   (1ull << IB_USER_VERBS_CMD_QUERY_SRQ)   |
-   (1ull << IB_USER_VERBS_CMD_DESTROY_SRQ);
+   (1ull << IB_USER_VERBS_CMD_DETACH_MCAST);
dev->ib_dev.node_type= IB_NODE_CA;
dev->ib_dev.phys_port_cnt= dev->limits.num_ports;
dev->ib_dev.dma_device   = &dev->pdev->dev;
@@ -1316,6 +1312,11 @@ int mthca_register_device(struct mthca_d
dev->ib_dev.modify_srq   = mthca_modify_srq;
dev->ib_dev.query_srq= mthca_query_srq;
dev->ib_dev.destroy_srq  = mthca_destroy_srq;
+   dev->ib_dev.uverbs_cmd_mask |=
+

Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Sean Hefty

>Dunno. I'm just speaking on the general principle that we should deny by
>default, not allow by default.  Which queries do you want to perform?

At a minimum, I would expect the following queries:

PathRecord
MultiPathRecord
MCMemberRecord
ServiceRecord

Support for ServiceRecord set/delete and InformInfo are likely to be added once
kernel support is in place.

Is it a reasonable approach to export two devices with different permissions,
one that allows limited sends, and another that permits unfiltered sends?

- Sean


Re: [openib-general] [PATCH 3/7] IB/ipath - performance improvements via mmap of queues

2006-08-23 Thread Roland Dreier

Ralph> Not quite. If the kernel driver is old, libipathverbs is
Ralph> using the old functions which make system calls instead of
Ralph> doing the newer mmap stuff. libipathverbs doesn't need to
Ralph> attempt calling mmap() if it knows the driver doesn't
Ralph> support it.

You lost me -- I only see one version of ipath_resize_cq(), and it
seems to do munmap()/mmap() without testing an ABI version.

 - R.


Re: [openib-general] [PATCH 3/7] IB/ipath - performance improvements via mmap of queues

2006-08-23 Thread Ralph Campbell

On Wed, 2006-08-23 at 14:50 -0700, Roland Dreier wrote:
> I applied this, but I'm wondering if this:
> 
>  > +int ipath_resize_cq(struct ibv_cq *ibcq, int cqe)
>  >  {
>  > +       struct ipath_cq                *cq = to_icq(ibcq);
>  > +       struct ibv_resize_cq            cmd;
>  > +       struct ipath_resize_cq_resp     resp;
>  > +       size_t                          size;
>  > +       int                             ret;
>  > +
>  > +       pthread_spin_lock(&cq->lock);
>  > +       /* Save the old size so we can munmap the queue. */
>  > +       size = sizeof(struct ipath_cq_wc) +
>  > +               (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
>  > +       ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd,
>  > +                               &resp.ibv_resp, sizeof resp);
>  > +       if (ret) {
>  > +               pthread_spin_unlock(&cq->lock);
>  > +               return ret;
>  > +       }
>  > +       (void) munmap(cq->queue, size);
>  > +       size = sizeof(struct ipath_cq_wc) +
>  > +               (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
>  > +       cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
>  > +                        ibcq->context->cmd_fd, resp.offset);
>  > +       ret = errno;
>  > +       pthread_spin_unlock(&cq->lock);
>  > +       if ((void *) cq->queue == MAP_FAILED)
>  > +               return ret;
>  > +       return 0;
>  > +}
> 
> works against an old kernel driver.  It seems you do have this:
> 
>  > +       if (dev->abi_version == 1) {
>  > +               context->ibv_ctx.ops.poll_cq       = ibv_cmd_poll_cq;
>  > +               context->ibv_ctx.ops.post_srq_recv = ibv_cmd_post_srq_recv;
>  > +               context->ibv_ctx.ops.post_recv     = ibv_cmd_post_recv;
>  > +       }
> 
> so I guess you're just ignoring the failure of mmap() or something?
> 
>  - R.

Not quite. If the kernel driver is old, libipathverbs is using the
old functions which make system calls instead of doing the newer
mmap stuff. libipathverbs doesn't need to attempt calling mmap()
if it knows the driver doesn't support it.
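
A rough, self-contained model of that idea: choose the function pointers once, based on the driver's ABI version, so the mmap paths are never entered against an old driver. The struct, the function names, and the PATH_* constants are hypothetical stand-ins, not the libipathverbs code.

```c
/* Hypothetical, trimmed-down ops table for illustration. */
struct cq_ops {
    int (*poll_cq)(void);
    int (*resize_cq)(void);
};

enum { PATH_SYSCALL = 1, PATH_MMAP = 2 };

static int poll_cq_syscall(void)   { return PATH_SYSCALL; }
static int resize_cq_syscall(void) { return PATH_SYSCALL; }
static int poll_cq_mmap(void)      { return PATH_MMAP; }
static int resize_cq_mmap(void)    { return PATH_MMAP; }

/*
 * Pick the implementation once, when the context is set up: an ABI
 * version 1 driver gets the system-call paths, anything newer gets the
 * mmap-based ones.
 */
void select_cq_ops(struct cq_ops *ops, int abi_version)
{
    if (abi_version == 1) {
        ops->poll_cq   = poll_cq_syscall;
        ops->resize_cq = resize_cq_syscall;
    } else {
        ops->poll_cq   = poll_cq_mmap;
        ops->resize_cq = resize_cq_mmap;
    }
}
```

The scheme only holds if every mmap-dependent entry point is switched, resize included; an op left on the new path would still call mmap() against a driver that cannot serve it.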



Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-23 Thread Sean Hefty

Roland Dreier wrote:
> It's unfortunate that we have to add a special-case event hook for the
> CM, but I guess the iWARP CM changes are so ugly anyway it doesn't
> matter much.  So I think committing this is OK.

We also have the alternative of pushing the responsibility of notifying the CM
of the event to the ULPs, which is what's done today.  The problem there is that
it gets pushed up to all userspace applications as well.

And even with these proposed changes, there's a race condition where the CM can
time out a connection after data is received over it, but before this event can
be processed.  This makes me wonder if any of this is worth it...

- Sean


Re: [openib-general] [PATCH 1/7] IB/ipath - performance improvements via mmap of queues

2006-08-23 Thread Roland Dreier

Thanks, applied.


Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] libibsa: userspace SA query and multicast support
> 
> > Yea I had the same question. Shouldn't interface expose
> > just the specific queries that we need?
> 
> I don't know what queries a user will want, and I'd rather not change the
> kernel ABI with every new query, but that is a possibility.  Which queries are
> of concern?

Dunno. I'm just speaking on the general principle that we should deny by
default, not allow by default.  Which queries do you want to perform?

-- 
MST


Re: [openib-general] [PATCH 3/7] IB/ipath - performance improvements via mmap of queues

2006-08-23 Thread Roland Dreier

I applied this, but I'm wondering if this:

 > +int ipath_resize_cq(struct ibv_cq *ibcq, int cqe)
 >  {
 > +       struct ipath_cq                *cq = to_icq(ibcq);
 > +       struct ibv_resize_cq            cmd;
 > +       struct ipath_resize_cq_resp     resp;
 > +       size_t                          size;
 > +       int                             ret;
 > +
 > +       pthread_spin_lock(&cq->lock);
 > +       /* Save the old size so we can munmap the queue. */
 > +       size = sizeof(struct ipath_cq_wc) +
 > +               (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
 > +       ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd,
 > +                               &resp.ibv_resp, sizeof resp);
 > +       if (ret) {
 > +               pthread_spin_unlock(&cq->lock);
 > +               return ret;
 > +       }
 > +       (void) munmap(cq->queue, size);
 > +       size = sizeof(struct ipath_cq_wc) +
 > +               (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
 > +       cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
 > +                        ibcq->context->cmd_fd, resp.offset);
 > +       ret = errno;
 > +       pthread_spin_unlock(&cq->lock);
 > +       if ((void *) cq->queue == MAP_FAILED)
 > +               return ret;
 > +       return 0;
 > +}

works against an old kernel driver.  It seems you do have this:

 > +       if (dev->abi_version == 1) {
 > +               context->ibv_ctx.ops.poll_cq       = ibv_cmd_poll_cq;
 > +               context->ibv_ctx.ops.post_srq_recv = ibv_cmd_post_srq_recv;
 > +               context->ibv_ctx.ops.post_recv     = ibv_cmd_post_recv;
 > +       }

so I guess you're just ignoring the failure of mmap() or something?

 - R.


Re: [openib-general] [PATCH 2/7] IB/ipath - performance improvements via mmap of queues

2006-08-23 Thread Roland Dreier

Thanks, applied.


Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] mthca: various bug fixes for mthca_query_qp
> 
> Michael> The other four bullets make sense however, do they not?
> 
> I guess so, although I wonder if anyone will ever care about the
> sq_sig_type() field.

It's not in IB spec query QP, so that might be unlikely.
However, libibverbs seems to be looking at this field:
init_attr->sq_sig_all   = resp.sq_sig_all;

so it only seems consistent to fill this in.

What do you think? Maybe it's better to remove it from libibverbs?

What about other stuff?

-- 
MST


Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Sean Hefty

> Yea I had the same question. Shouldn't interface expose
> just the specific queries that we need?

I don't know what queries a user will want, and I'd rather not change the
kernel ABI with every new query, but that is a possibility.  Which queries are
of concern?

- Sean


Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Greg Lindahl <[EMAIL PROTECTED]>:
> Subject: Re: Rollup patch for ipath and OFED
> 
> On Wed, Aug 23, 2006 at 06:01:32PM +0300, Michael S. Tsirkin wrote:
> 
> > So this seems to be ripping out chunks of upstream code (ipath_ht400)
> > replacing them with something else (ipath_iba6110, ipath_iba6120.o)
> 
> To answer this piece of the question, we were acquired last April, and
> of course we have to rename all our devices. Since patch doesn't have
> a rename feature, this looks much worse than it really is.

Fine, but I wonder why this cosmetic change is being rushed for OFED 1.1.
Anyway, is it right that this is basically just the mmap enhancement plus
a lot of file renames?

-- 
MST


Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp

2006-08-23 Thread Roland Dreier

Michael> The other four bullets make sense however, do they not?

I guess so, although I wonder if anyone will ever care about the
sq_sig_type() field.

 - R.


Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Roland Dreier

Sean> The ibv_sa_send_mad() routine can only be used to issue the
Sean> following methods:

Sean> GET, SEND, GET_TABLE, GET_MULTI, and GET_TRACE_TABLE

OK, I missed that -- it's kind of hidden inside is_send_req().

 - R.


Re: [openib-general] [PATCHv2] IB/ipath - performance improvements via mmap of queues

2006-08-23 Thread Roland Dreier

Thanks, queued for 2.6.19


Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] libibsa: userspace SA query and multicast support
> 
> What's the plan for how this would be used?  We can't let unprivileged
> userspace processes talk to the SA, because they could cause problems
> like deleting someone else's multicast group membership.  And I don't
> think we want to try to do some elaborate filtering in the kernel, do we?

Yea I had the same question. Shouldn't interface expose
just the specific queries that we need?

-- 
MST


Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Sean Hefty

Roland Dreier wrote:
> What's the plan for how this would be used?  We can't let unprivileged
> userspace processes talk to the SA, because they could cause problems
> like deleting someone else's multicast group membership.  And I don't
> think we want to try to do some elaborate filtering in the kernel, do we?

The ibv_sa_send_mad() routine can only be used to issue the following methods:

GET, SEND, GET_TABLE, GET_MULTI, and GET_TRACE_TABLE

I do check for this in the kernel, but that is the extent of any filtering
that's done.  Multicast operations must go through the multicast join / free
calls, which drop into the kernel to interface with the ib_multicast module.
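
As a sketch of what such a method filter amounts to (this is not the actual is_send_req() code): an allow-list that rejects anything not explicitly named. The METHOD_* names are invented for the example, and the numeric codes are the values as I understand them from the IB management definitions, so treat them as assumptions and check them against the kernel headers.

```c
#include <stdint.h>

/*
 * Method codes as I understand them; the numeric values are assumptions
 * to be checked against the kernel headers before relying on them.
 */
enum {
    METHOD_GET             = 0x01,
    METHOD_SET             = 0x02,
    METHOD_SEND            = 0x03,
    METHOD_GET_TABLE       = 0x12,
    METHOD_GET_TRACE_TABLE = 0x13,
    METHOD_GET_MULTI       = 0x14,
    METHOD_DELETE          = 0x15
};

/* Deny by default: only the listed query-style methods may be sent. */
int method_allowed(uint8_t method)
{
    switch (method) {
    case METHOD_GET:
    case METHOD_SEND:
    case METHOD_GET_TABLE:
    case METHOD_GET_MULTI:
    case METHOD_GET_TRACE_TABLE:
        return 1;
    default:
        return 0;   /* SET, DELETE, and anything unknown are rejected */
    }
}
```

Note that this filters on method only; it says nothing about which attributes a GET may ask for, which is the finer-grained filtering being debated in this thread.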

I would expect that other SET / DELETE type operations would be treated
similarly to how multicast is handled.

I'm expecting that the labs will use at least the multicast interfaces based on 
e-mail conversations, but without path record query support, the userspace CM 
interface isn't all that useful.

- Sean


Re: [openib-general] [PATCH 4/7] IB/ipath - performance improvements via mmap of queues

2006-08-23 Thread Roland Dreier

Thanks, applied and queued for 2.6.19.


Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] mthca: various bug fixes for mthca_query_qp
> 
>  > 5. Return the send_cq, receive cq and srq handles.
> 
> I really disagree with this change. 

The other four bullets make sense however, do they not?

-- 
MST


Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Roland Dreier

What's the plan for how this would be used?  We can't let unprivileged
userspace processes talk to the SA, because they could cause problems
like deleting someone else's multicast group membership.  And I don't
think we want to try to do some elaborate filtering in the kernel, do we?

 - R.


Re: [openib-general] [PATCH] IB/mthca: update latest firmware revisions

2006-08-23 Thread Roland Dreier

 > Please consider the following for 2.6.18 - hopefully this will reduce
 > the number of support requests from people with old Sinai firmware.

Makes sense -- this is the sort of thing we want as current as
possible.  Applied to svn and for-2.6.18
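
For anyone wondering how a "latest firmware" check like this can work, here is a small sketch. The packing mirrors how I remember the driver's MTHCA_FW_VER() macro, but treat the exact bit layout as an assumption; FW_VER() and fw_is_outdated() are names made up for the example.

```c
#include <stdint.h>

/*
 * Pack major.minor.subminor into one 64-bit value so that versions
 * compare as plain integers. The bit layout is an assumption based on my
 * recollection of the driver's macro.
 */
#define FW_VER(major, minor, subminor) \
    (((uint64_t)(major) << 32) | ((uint64_t)(minor) << 16) | (uint64_t)(subminor))

/* Returns 1 if the running firmware is older than the latest known one. */
int fw_is_outdated(uint64_t running, uint64_t latest)
{
    return running < latest;
}
```

With the Sinai numbers from the 2.6.18 pull request, firmware 1.0.800 compares as older than 1.1.0, so a user running it would get the "please upgrade" hint.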


Re: [openib-general] question: ib_umem page_size

2006-08-23 Thread Roland Dreier

 > > It gives the page size for the user memory described by the struct.
 > > The idea was that if/when someone tries to optimize for huge pages,
 > > then the low-level driver can know that a region is using huge pages
 > > without having to walk through the page list and search for the
 > > minimum physically contiguous size.
 > 
 > Hmm, mthca_reg_user_mr seems to do:
 > 
 > len = sg_dma_len(&chunk->page_list[j]) >> shift
 > 
 > which means that dma_len must be a multiple of page size.
 > 
 > Is this intentional?

Yes, it's intentional I think.  I'm probably missing something, but
the upper layer has just told mthca_reg_user_mr() that the page size
for this region is (1

Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp

2006-08-23 Thread Roland Dreier

 > 5. Return the send_cq, receive cq and srq handles.  ib_query_qp() needs them
 >    (required by IB Spec). ibv_query_qp() overwrites these values in
 >    user-space with appropriate user-space values.

 > +qp_init_attr->send_cq   = ibqp->send_cq;
 > +qp_init_attr->recv_cq   = ibqp->recv_cq;
 > +qp_init_attr->srq   = ibqp->srq;

I really disagree with this change.  It's silly to do this copying
since the consumer already has the ibqp pointer.  And it's especially
silly to put this in a low-level driver, since there's nothing
device-specific about it.

 - R.

Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-23 Thread Roland Dreier

Sean> This patch set appears to be the preferred approach.  Any
Sean> objection to committing this?

It's unfortunate that we have to add a special-case event hook for the
CM, but I guess the iWARP CM changes are so ugly anyway it doesn't
matter much.  So I think committing this is OK.

 - R.

Re: [openib-general] basic IB doubt

2006-08-23 Thread Ralph Campbell

On Wed, 2006-08-23 at 14:00 -0500, Tang, Changqing wrote:
> Thanks for all your replies. So my general question is: why can only
> 4 bytes of immediate data generate a completion event on B's side?
> Why does an RDMA write of any data size not generate a completion
> event on B's side? Basically they are the same thing; the only
> difference is that one copies 4 bytes to the completion structure,
> and the other copies all the data to the provided dest buffer.
> 
> --CQ

This is just the way IB was defined.
Both RDMA write and RDMA write with immediate transfer
up to 2 Gigabytes of data to the destination address.
The latter just signals node B that the operation
is complete and the former doesn't.  Node B can use
the immediate data as a hint to which RDMA operation
just completed.

Re: [openib-general] [PATCH] opensm: option to limit size of OpenSM log file

2006-08-23 Thread Sasha Khapyorsky

Hi Hal,

On 01:34 Wed 23 Aug , Sasha Khapyorsky wrote:
> On 18:26 Tue 22 Aug , Hal Rosenstock wrote:
> > On Tue, 2006-08-22 at 18:22, Sasha Khapyorsky wrote:
> > > On 17:54 Tue 22 Aug , Hal Rosenstock wrote:
> > > > Hi Sasha,
> > > > 
> > > > On Tue, 2006-08-22 at 17:18, Sasha Khapyorsky wrote:
> > > > > Hi Hal,
> > > > > 
> > > > > There is new option which specified max size of OpenSM log file. The
> > > > > default is '0' (not-limited). Please note osm_log_init() has new
> > > > > parameter now.
> > > > 
> > > > So libopensm.ver needs to be bumped (and this is not backward
> > > > compatible).
> > > 
> > > We may. I'm not sure it is necessary - in this patch I've changed all
> > > occurrences of osm_log_init() under osm/ (in opensm and osmtest). So
> > > this can be important only if there are osm_log "external" users.
> > 
> > There may be so I will do this.
> 
> Ok. Thanks.

Found one more osm_log_init() caller. Here it is.

Sasha


Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

diff --git a/diags/src/saquery.c b/diags/src/saquery.c
index f0d1299..0bb46be 100644
--- a/diags/src/saquery.c
+++ b/diags/src/saquery.c
@@ -443,7 +443,7 @@ get_bind_handle(void)
 
osm_log_construct(&log_osm);
if ((status = osm_log_init( &log_osm, TRUE,
-   0x0001, NULL, TRUE )) != IB_SUCCESS) {
+   0x0001, NULL, 0, TRUE )) != IB_SUCCESS) {
fprintf(stderr, "Failed to init osm_log: %s\n",
ib_get_err_str(status));
exit (-1);

> 
> Sasha
> 
> > 
> > -- Hal
> >  
> > 
> > 
> 

Re: [openib-general] basic IB doubt

2006-08-23 Thread Tang, Changqing


Thanks for all your replies. So my general question is: why can only
4 bytes of immediate data generate a completion event on B's side?
Why does an RDMA write of any data size not generate a completion
event on B's side? Basically they are the same thing; the only
difference is that one copies 4 bytes to the completion structure,
and the other copies all the data to the provided dest buffer.

--CQ


>-Original Message-
>From: Ralph Campbell [mailto:[EMAIL PROTECTED] 
>Sent: Wednesday, August 23, 2006 1:16 PM
>To: Tang, Changqing
>Cc: Caitlin Bestler; openib-general@openib.org
>Subject: RE: [openib-general] basic IB doubt
>
>On Wed, 2006-08-23 at 12:28 -0500, Tang, Changqing wrote:
>>  >
>> >Actually, A knows the data is in B's memory when A gets the 
>> >completion notice.  B can't rely on anything unless A uses the RDMA
>> >write with immediate which puts a completion event in B's CQ.
>> 
>> Ralph:
>> 
>> Can you give a few more words on 'immediate', I know A will have A 
>> completion event in its CQ, Does B receive a CQ event on the 
>Same RDMA 
>> operation as well ?
>> 
>> --CQ Tang
>
>B doesn't get a completion event for a RDMA write initiated 
>from A unless A does something like the following:
>
> struct ib_send_wr wr;
>
> wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM;
> wr.imm_data = cpu_to_be32(value);
> ...
> ib_post_send(qp, &wr, NULL);
>
>B will get a completion event with the IB_WC_WITH_IMM flag set 
>in struct ib_wc.wc_flags and ib_wc.imm_data set to the value 
>that A sent.
>
>

Re: [openib-general] basic IB doubt

2006-08-23 Thread Ralph Campbell

On Wed, 2006-08-23 at 12:28 -0500, Tang, Changqing wrote:
>  >
> >Actually, A knows the data is in B's memory when A gets the 
> >completion notice.  B can't rely on anything unless A uses the 
> >RDMA write with immediate which puts a completion event in B's CQ.
> 
> Ralph:
> 
> Can you give a few more words on 'immediate', I know A will have
> A completion event in its CQ, Does B receive a CQ event on the 
> Same RDMA operation as well ?
> 
> --CQ Tang

B doesn't get a completion event for a RDMA write initiated from A
unless A does something like the following:

 struct ib_send_wr wr;

 wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM;
 wr.imm_data = cpu_to_be32(value);
 ...
 ib_post_send(qp, &wr, NULL);

B will get a completion event with the IB_WC_WITH_IMM
flag set in struct ib_wc.wc_flags and ib_wc.imm_data set
to the value that A sent.
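
For B's side of this, a minimal self-contained sketch follows. It is only an
illustration: `struct mock_wc` and `MOCK_WC_WITH_IMM` are stand-ins invented
here that mirror the `wc_flags` and `imm_data` fields named above; they are
not the real `struct ib_wc`. The value is assumed to travel in network byte
order, matching the `cpu_to_be32()` call on A's side.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for IB_WC_WITH_IMM and struct ib_wc. */
#define MOCK_WC_WITH_IMM 0x1u

struct mock_wc {
	unsigned int wc_flags;  /* MOCK_WC_WITH_IMM set when immediate data is present */
	uint32_t     imm_data;  /* big-endian, as posted by A */
};

/*
 * Returns 1 and stores the host-order immediate value in *value if the
 * completion carries immediate data; returns 0 and leaves *value alone
 * otherwise.
 */
int mock_wc_get_imm(const struct mock_wc *wc, uint32_t *value)
{
	assert(wc != NULL && value != NULL);
	if (!(wc->wc_flags & MOCK_WC_WITH_IMM))
		return 0;
	*value = ntohl(wc->imm_data);
	return 1;
}
```

B would run a check like this on each completion it polls and could use the
decoded value as the hint for which RDMA operation just finished.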


Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Sean Hefty

Bryan O'Sullivan wrote:
> SVN is not a high priority for me personally.  Fixing things so that I
> can send meaningful patches upstream in a timely manner is.

Why not remove your code from SVN?

- Sean

Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Bryan O'Sullivan

On Wed, 2006-08-23 at 10:58 -0700, Woodruff, Robert J wrote:

> I hate to tell you I told you so, but this is exactly why you guys
> should not be off working behind closed doors and then submit some
> mongo patch.

That's precisely what I'm working to avoid.  It's not as if I didn't
know this.

>  If you would just submit the code as you go in smaller
> patches and check in the smaller patches daily to SVN, we would not 
> have such an integration mess at the end.

SVN is not a high priority for me personally.  Fixing things so that I
can send meaningful patches upstream in a timely manner is.

Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Woodruff, Robert J

Bryan wrote, 
>On Wed, 2006-08-23 at 18:01 +0300, Michael S. Tsirkin wrote:
>> Guys, I just looked at ipath-fixes.patch in ofed.  With 36 files changed, 4623
>> insertions, 4774 deletions, it's quite a biggie with no description what it does
>> whatsoever.  Can't this be split to smaller chunks doing one thing at a time,
>> please?  You'll have to do this for upstream inclusion anyway, so why not for
>> OFED?

>We were in a rush to get a working patch out, is all.  I've been
>splitting that monster up into sensibly-sized chunks in the usual way.

Re: [openib-general] basic IB doubt

2006-08-23 Thread Sean Hefty

Tang, Changqing wrote:
> Can you give a few more words on 'immediate', I know A will have
> A completion event in its CQ, Does B receive a CQ event on the 
> Same RDMA operation as well ?

He means an RDMA write with immediate data.  B will see a completion event for 
that operation.

- Sean

Re: [openib-general] basic IB doubt

2006-08-23 Thread Tang, Changqing

 
>
>Actually, A knows the data is in B's memory when A gets the 
>completion notice.  B can't rely on anything unless A uses the 
>RDMA write with immediate which puts a completion event in B's CQ.

Ralph:

Can you give a few more words on 'immediate', I know A will have
A completion event in its CQ, Does B receive a CQ event on the 
Same RDMA operation as well ?

--CQ Tang



>Most applications on B ignore this requirement and test for 
>the last memory location being modified which usually works 
>but doesn't guarantee that all the data is in memory.
>
>

Re: [openib-general] [openfabrics-ewg] OFED 1.1-rc2 is ready

2006-08-23 Thread Woodruff, Robert J

Woody wrote,
>Ok, I was able to install the RC2 on EL4-U3 and get intel MPI working on uDAPL.
>Did have one issue with the install that maybe you could fix for the next
>RC. It appears that the rdma_ucm and rdma_cm are not being loaded at startup
>time and I had to manually modprobe rdma_ucm, after that, Intel MPI and uDAPL
>seemed to work fine with the initial tests I have done so far, will continue
>to stress test it and let you know if we see any other issues.

>woody

It appears that what is happening on this one is that the
/etc/init.d/openibd script is failing because the ipath driver
is not loading, which is expected in rc2 as their latest patches are not
in rc2. If I disable loading of the ipath driver in
/etc/infiniband/openib.conf,
the script continues and loads rdma_cm and rdma_ucm.

The question is, if a driver like the ipath driver fails to load,
should the script go on anyway and load the other drivers like
rdma_cm/rdma_ucm, or is it best to leave it as it is? Anyway, once the
ipath driver is fixed, this issue should go away.

woody

Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-23 Thread Sean Hefty

Sean Hefty wrote:
> The following set of patches forwards communication related events to the IB CM
> for processing.  Communication events of interest are communication established
> and path migration, with only the former currently handled by the IB CM.
> 
> This removes the need for users to trap for these events and pass the
> information onto IB CM.  Communication established events can be handled by the
> ib_cm_establish() routine, but no mechanism exists to notify the IB CM of path
> migration.  This adds the framework for doing so.
> 
> Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>

Based on feedback from Or and Michael:

http://openib.org/pipermail/openib-general/2006-August/025320.html
http://openib.org/pipermail/openib-general/2006-August/025306.html

This patch set appears to be the preferred approach.  Any objection to 
committing this?

- Sean

Re: [openib-general] basic IB doubt

2006-08-23 Thread Ralph Campbell

On Wed, 2006-08-23 at 09:47 -0700, Caitlin Bestler wrote:
> [EMAIL PROTECTED] wrote:
> > Quoting r. john t <[EMAIL PROTECTED]>:
> >> Subject: basic IB doubt
> >> 
> >> Hi
> >> 
> >> I have a very basic doubt. Suppose Host A is doing RDMA write (say 8
> >> MB) to Host B. When data is copied into Host B's local
> > buffer, is it guaranteed that data will be copied starting
> > from the first location (first buffer address) to the last
> > location (last buffer address)? or it could be in any order?
> > 
> > Once B gets a completion (e.g. of a subsequent send), data in
> > its buffer matches that of A, byte for byte.
> 
> An excellent and concise answer. That is exactly what the application
> should rely upon, and nothing else. With iWARP this is very explicit,
> because portions of the message not only MAY be placed out of 
> order, they SHOULD be when packets have been re-ordered by the
> network. But for *any* RDMA adapter there is no guarantee on
> what order the adapter flushes things to host memory or particularly
> when old contents that may be cached are invalidated or updated.
> The role of the completion is to limit the frequency with which
> the RDMA adapter MUST guarantee coherency with application visible
> buffers. The completion not only indicates that the entire message
> was received, but that it has been entirely delivered to host memory.

Actually, A knows the data is in B's memory when A gets the completion
notice.  B can't rely on anything unless A uses the RDMA write with
immediate which puts a completion event in B's CQ.
Most applications on B ignore this requirement and test for the last
memory location being modified which usually works but doesn't
guarantee that all the data is in memory.
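
To make the hazard concrete, here is a toy simulation in plain C. No real RDMA
is involved: `place_chunk()` is a hypothetical model of an adapter placing
fragments of one message in arbitrary order, and `CHUNK` is an arbitrary
fragment size chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* bytes per simulated fragment (arbitrary) */

/* Hypothetical model of placement: copy fragment `idx` of `src` into `dst`. */
void place_chunk(unsigned char *dst, const unsigned char *src, size_t idx)
{
	memcpy(dst + idx * CHUNK, src + idx * CHUNK, CHUNK);
}

/* What many applications do: treat "last byte changed" as "all data arrived". */
int last_byte_arrived(const unsigned char *dst, size_t len, unsigned char expect)
{
	assert(dst != NULL && len > 0);
	return dst[len - 1] == expect;
}

/* The condition the application actually cares about: every byte matches. */
int all_data_arrived(const unsigned char *dst, const unsigned char *src, size_t len)
{
	return memcmp(dst, src, len) == 0;
}
```

If the final fragment happens to be placed first, last_byte_arrived() reports
success while all_data_arrived() is still false, which is the window that only
a completion on B closes.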


Re: [openib-general] [PATCH] osm: fix memory leak in vendor ibumad

2006-08-23 Thread Hal Rosenstock

Hi Eitan,

> Who is freeing the request MAD?
> If it is NULL then the flow aborts earlier. 

Sorry; My bad :-(

-- Hal


Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Sean Hefty

Or Gerlitz wrote:
> I have tested the patch against an iser target based on our Gen1 CM -
> it works as expected.

This has been committed in 9088.

- Sean

Re: [openib-general] basic IB doubt

2006-08-23 Thread Caitlin Bestler

[EMAIL PROTECTED] wrote:
> Quoting r. john t <[EMAIL PROTECTED]>:
>> Subject: basic IB doubt
>> 
>> Hi
>> 
>> I have a very basic doubt. Suppose Host A is doing RDMA write (say 8
>> MB) to Host B. When data is copied into Host B's local
> buffer, is it guaranteed that data will be copied starting
> from the first location (first buffer address) to the last
> location (last buffer address)? or it could be in any order?
> 
> Once B gets a completion (e.g. of a subsequent send), data in
> its buffer matches that of A, byte for byte.

An excellent and concise answer. That is exactly what the application
should rely upon, and nothing else. With iWARP this is very explicit,
because portions of the message not only MAY be placed out of 
order, they SHOULD be when packets have been re-ordered by the
network. But for *any* RDMA adapter there is no guarantee on
what order the adapter flushes things to host memory or particularly
when old contents that may be cached are invalidated or updated.
The role of the completion is to limit the frequency with which
the RDMA adapter MUST guarantee coherency with application visible
buffers. The completion not only indicates that the entire message
was received, but that it has been entirely delivered to host memory.

Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Sean Hefty

Michael S. Tsirkin wrote:
> so I am wondering why is it not sufficient to wait for
> the window of time as described above to expire?
> Is something broken in CM that this patch is papering over?

Yes.  There are a couple of issues.  The CM doesn't time when a REQ was 
received, and the local comm ID's need rework.  This is a fix that should avoid 
the issues most of the time.

There's also an issue that a user can allocate a QP that's likely to be in time 
wait.

- Sean

Re: [openib-general] InfiniBand merge plans for 2.6.19

2006-08-23 Thread Sean Hefty

Or Gerlitz wrote:
> OK.  Now, if this (RC, UD, MCAST) turns to be too much for your
> schedule before 2.6.19 opens up, does it make sense for you to push a
> char device which supports only the CMA RC functionality and the UD
> and multicast in the future?

Yes - and the fact that I can pull the OFED code for this helps.

- Sean

Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Greg Lindahl

On Wed, Aug 23, 2006 at 06:01:32PM +0300, Michael S. Tsirkin wrote:

> So this seems to be ripping out chunks of upstream code (ipath_ht400)
> replacing them with something else (ipath_iba6110, ipath_iba6120.o)

To answer this piece of the question, we were acquired last April, and
of course we have to rename all our devices. Since patch doesn't have
a rename feature, this looks much worse than it really is.

-- greg

Re: [openib-general] basic IB doubt

2006-08-23 Thread Sean Hefty

john t wrote:
> I have a very basic doubt. Suppose Host A is doing RDMA write (say 8 
> MB) to Host B. When data is copied into Host B's local buffer, is it 
> guaranteed that data will be copied starting from the first location 
> (first buffer address) to the last location (last buffer address)? or it 
> could be in any order?

I don't believe that there is any ordering guarantee by the architecture. 
However, specific adapters may behave this way, and I've seen applications make 
use of this by polling the last memory byte for a completion, for example.

> Also does this transfer involve DMA engine (residing on HCAs) of both 
> the hosts or just one host?

If I'm understanding the question correctly, the DMA engines of both HCAs are 
used.

- Sean

[openib-general] OFED 1.1-rc2 bug - Could not read ABI version

2006-08-23 Thread Woodruff, Robert J

With the OFED1.1-rc2 when I run the RDMA CM on RedHat EL4 - Update 3
I get the following warning. The application seems to work OK, 
but the warnings are concerning. 


librdmacm: couldn't read ABI version.
librdmacm: assuming: 1

It appears that the backport of the kernel module does not create the
abi_version in /sys/class/misc as the userspace code expects.
I suggest either fixing the backport patch to create the abi_version file or,
to avoid confusion, removing the check or the warning message from the
userspace code if the ABI actually matches the kernel code.

woody
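
One way the userspace side could cope with both layouts is to try several
candidate files in order and only then fall back to an assumed version. The
sketch below is not librdmacm code: `read_abi_version()` is a hypothetical
helper, and the second path assumes the backport keeps the file name
`abi_version` under /sys/class/infiniband/rdma_cm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Try each candidate file in order and return the first integer that
 * parses.  If no file can be opened and parsed, return `fallback`.
 */
int read_abi_version(const char *const *paths, size_t npaths, int fallback)
{
	size_t i;

	assert(paths != NULL || npaths == 0);
	for (i = 0; i < npaths; i++) {
		FILE *f = fopen(paths[i], "r");
		int version;
		int ok;

		if (!f)
			continue;  /* missing here, try the next location */
		ok = (fscanf(f, "%d", &version) == 1);
		fclose(f);
		if (ok)
			return version;
	}
	return fallback;
}

/* Candidate locations; the second file name is an assumption. */
const char *const rdma_cm_abi_paths[] = {
	"/sys/class/misc/rdma_cm/abi_version",
	"/sys/class/infiniband/rdma_cm/abi_version",
};
```

A caller would pass rdma_cm_abi_paths, its element count, and whatever version
the library was built against as the fallback, and could print the warning
only when the fallback is actually used.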

[openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp

2006-08-23 Thread Jack Morgenstein

Fixed various bugs in mthca_query_qp:
1. correct port_num was not being returned for unconnected QPs.
2. Incorrect number of bits was taken for static_rate field.
3. When default static rate is returned for Tavor, forgot to translate it
   to an ib rate value.
4. Return sq signalling attribute in query-qp.
5. Return the send_cq, receive cq and srq handles.  ib_query_qp() needs them
   (required by IB Spec). ibv_query_qp() overwrites these values in user-space
   with appropriate user-space values.
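
Items 1 and 2 come down to shift-and-mask field extraction. As a standalone
illustration (not driver code), the sketch below assumes only the layout
implied by the masks in the patch that follows: port number taken from bits
25:24 of the big-endian port_pkey word, pkey index from its low 7 bits, and a
static rate mask widened from 0x7 to 0xf. The function names are hypothetical.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a big-endian port_pkey word into port number and pkey index. */
void decode_port_pkey(uint32_t port_pkey_be,
		      unsigned int *port_num, unsigned int *pkey_index)
{
	uint32_t v = ntohl(port_pkey_be);

	assert(port_num != NULL && pkey_index != NULL);
	*port_num   = (v >> 24) & 0x3;
	*pkey_index = v & 0x7f;
}

/* Apply a mask to the raw static_rate byte; 0x7 keeps 3 bits, 0xf keeps 4. */
unsigned int decode_static_rate(uint8_t static_rate, uint8_t mask)
{
	return static_rate & mask;
}
```

With a raw rate byte of 0x0b, the 3-bit mask yields 0x3 while the 4-bit mask
yields 0xb, which is the kind of truncation item 2 describes.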

Signed-off-by: Jack Morgenstein <[EMAIL PROTECTED]>

Index: ofed_1_1/drivers/infiniband/hw/mthca/mthca_qp.c
===
--- ofed_1_1.orig/drivers/infiniband/hw/mthca/mthca_qp.c 2006-08-23 10:33:04.0 +0300
+++ ofed_1_1/drivers/infiniband/hw/mthca/mthca_qp.c 2006-08-23 18:46:08.330885000 +0300
@@ -391,6 +391,12 @@ static int to_ib_qp_access_flags(int mth
return ib_flags;
 }
 
+static enum ib_sig_type to_ib_qp_sq_signal(int params1)
+{
+   return (params1 & MTHCA_QP_BIT_SSC) ?
+   IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR;
+}
+
 static void to_ib_ah_attr(struct mthca_dev *dev, struct ib_ah_attr *ib_ah_attr,
struct mthca_qp_path *path)
 {
@@ -404,7 +410,7 @@ static void to_ib_ah_attr(struct mthca_d
ib_ah_attr->sl= be32_to_cpu(path->sl_tclass_flowlabel) >> 28;
ib_ah_attr->src_path_bits = path->g_mylmc & 0x7f;
ib_ah_attr->static_rate   = mthca_rate_to_ib(dev,
-path->static_rate & 0x7,
+path->static_rate & 0xf,
 ib_ah_attr->port_num);
ib_ah_attr->ah_flags  = (path->g_mylmc & (1 << 7)) ? IB_AH_GRH : 0;
if (ib_ah_attr->ah_flags) {
@@ -468,10 +474,14 @@ int mthca_query_qp(struct ib_qp *ibqp, s
if (qp->transport == RC || qp->transport == UC) {
to_ib_ah_attr(dev, &qp_attr->ah_attr, &context->pri_path);
to_ib_ah_attr(dev, &qp_attr->alt_ah_attr, &context->alt_path);
+   qp_attr->alt_pkey_index =
+   be32_to_cpu(context->alt_path.port_pkey) & 0x7f;
+   qp_attr->alt_port_num   = qp_attr->alt_ah_attr.port_num;
}
 
-   qp_attr->pkey_index = be32_to_cpu(context->pri_path.port_pkey) & 0x7f;
-   qp_attr->alt_pkey_index = be32_to_cpu(context->alt_path.port_pkey) & 0x7f;
+   qp_attr->pkey_index = be32_to_cpu(context->pri_path.port_pkey) & 0x7f;
+   qp_attr->port_num   =
+   (be32_to_cpu(context->pri_path.port_pkey) >> 24) & 0x3;
 
/* qp_attr->en_sqd_async_notify is only applicable in modify qp */
qp_attr->sq_draining = mthca_state == MTHCA_QP_STATE_DRAINING;
@@ -482,13 +492,16 @@ int mthca_query_qp(struct ib_qp *ibqp, s
1 << ((be32_to_cpu(context->params2) >> 21) & 0x7);
qp_attr->min_rnr_timer  =
(be32_to_cpu(context->rnr_nextrecvpsn) >> 24) & 0x1f;
-   qp_attr->port_num   = qp_attr->ah_attr.port_num;
qp_attr->timeout= context->pri_path.ackto >> 3;
qp_attr->retry_cnt  = (be32_to_cpu(context->params1) >> 16) & 0x7;
qp_attr->rnr_retry  = context->pri_path.rnr_retry >> 5;
-   qp_attr->alt_port_num   = qp_attr->alt_ah_attr.port_num;
qp_attr->alt_timeout= context->alt_path.ackto >> 3;
qp_init_attr->cap   = qp_attr->cap;
+   qp_init_attr->sq_sig_type   =
+   to_ib_qp_sq_signal(be32_to_cpu(context->params1));
+   qp_init_attr->send_cq   = ibqp->send_cq;
+   qp_init_attr->recv_cq   = ibqp->recv_cq;
+   qp_init_attr->srq   = ibqp->srq;
 
 out:
mthca_free_mailbox(dev, mailbox);
Index: ofed_1_1/drivers/infiniband/hw/mthca/mthca_av.c
===
--- ofed_1_1.orig/drivers/infiniband/hw/mthca/mthca_av.c 2006-08-03 14:30:21.0 +0300
+++ ofed_1_1/drivers/infiniband/hw/mthca/mthca_av.c 2006-08-23 17:53:01.22722 +0300
@@ -90,7 +90,7 @@ static enum ib_rate tavor_rate_to_ib(u8 
case MTHCA_RATE_TAVOR_1X: return IB_RATE_2_5_GBPS;
case MTHCA_RATE_TAVOR_1X_DDR: return IB_RATE_5_GBPS;
case MTHCA_RATE_TAVOR_4X: return IB_RATE_10_GBPS;
-   default:  return port_rate;
+   default:  return mult_to_ib_rate(port_rate);
}
 }
 

Re: [openib-general] [PATCH] osm: fix memory leak in vendor ibumad

2006-08-23 Thread Eitan Zahavi

Who is freeing the request MAD?
If it is NULL then the flow aborts earlier. 

> -Original Message-
> From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 23, 2006 6:35 PM
> To: Eitan Zahavi
> Cc: openib-general@openib.org
> Subject: Re: [PATCH] osm: fix memory leak in vendor ibumad
> 
> Hi Eitan,
> 
> I'm not getting openib-general email today but got this off a web site:
> 
> Index: libvendor/osm_vendor_ibumad_sa.c
> 
> ===
> --- libvendor/osm_vendor_ibumad_sa.c  (revision 9087)
> +++ libvendor/osm_vendor_ibumad_sa.c  (working copy)
> @@ -180,6 +180,11 @@ __osmv_sa_mad_rcv_cb(
> 
>/* free the copied query request if found */
>if (p_query_req_copy) free(p_query_req_copy);
> +
> +  /* put back the request madw */
> +  if (p_req_madw)
> +osm_mad_pool_put(p_bind->p_mad_pool, p_req_madw);
> +
>OSM_LOG_EXIT( p_bind->p_log );
>  }
> 
> There's an additional minor change needed to this routine as there is a case
> where the request MAD is already free'd.
> 
> I'm in the process of looking at the osm_vendor_ibumad.c change too.
> 
> -- Hal

Re: [openib-general] [PATCH] osm: fix memory leak in vendor ibumad

2006-08-23 Thread Hal Rosenstock

Hi Eitan,

I'm not getting openib-general email today but got this off a web site:

Index: libvendor/osm_vendor_ibumad_sa.c
===
--- libvendor/osm_vendor_ibumad_sa.c(revision 9087)
+++ libvendor/osm_vendor_ibumad_sa.c(working copy)
@@ -180,6 +180,11 @@ __osmv_sa_mad_rcv_cb(
 
   /* free the copied query request if found */
   if (p_query_req_copy) free(p_query_req_copy);
+
+  /* put back the request madw */
+  if (p_req_madw)
+osm_mad_pool_put(p_bind->p_mad_pool, p_req_madw);
+
   OSM_LOG_EXIT( p_bind->p_log );
 }

There's an additional minor change needed to this routine as there is a
case where the request MAD is already free'd.

I'm in the process of looking at the osm_vendor_ibumad.c change too.

-- Hal


Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Bryan O'Sullivan

On Wed, 2006-08-23 at 18:01 +0300, Michael S. Tsirkin wrote:
> Guys, I just looked at ipath-fixes.patch in ofed.  With 36 files changed, 4623
> insertions, 4774 deletions, it's quite a biggie with no description what it 
> does
> whatsoever.  Can't this be split to smaller chunks doing one thing at a time,
> please?  You'll have to do this for upstream inclusion anyway, so why not for
> OFED?

We were in a rush to get a working patch out, is all.  I've been
splitting that monster up into sensibly-sized chunks in the usual way.

[openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Michael S. Tsirkin

Guys, I just looked at ipath-fixes.patch in ofed.  With 36 files changed, 4623
insertions, 4774 deletions, it's quite a biggie with no description what it does
whatsoever.  Can't this be split to smaller chunks doing one thing at a time,
please?  You'll have to do this for upstream inclusion anyway, so why not for
OFED?

Oh well. However, this baby also does for example:

diff -r d2661c9eff49 -r 198ed6310295 drivers/infiniband/hw/ipath/Makefile
--- a/drivers/infiniband/hw/ipath/Makefile  Thu Aug 10 16:19:45 2006 -0700
+++ b/drivers/infiniband/hw/ipath/Makefile  Wed Aug 16 11:01:29 2006 -0700
@@ -1,36 +1,34 @@ EXTRA_CFLAGS += -DIPATH_IDSTR='"QLogic k
 EXTRA_CFLAGS += -DIPATH_IDSTR='"QLogic kernel.org driver"' \
-DIPATH_KERN_TYPE=0
 
-obj-$(CONFIG_IPATH_CORE) += ipath_core.o
 obj-$(CONFIG_INFINIBAND_IPATH) += ib_ipath.o
 
-ipath_core-y := \
+ib_ipath-y := \
+   ipath_cq.o \
ipath_diag.o \
ipath_driver.o \
ipath_eeprom.o \
ipath_file_ops.o \
ipath_fs.o \
-   ipath_ht400.o \
+   ipath_iba6110.o \
+   ipath_iba6120.o \
ipath_init_chip.o \
ipath_intr.o \
+   ipath_keys.o \
ipath_layer.o \
-   ipath_pe800.o \
-   ipath_stats.o \
-   ipath_sysfs.o \
-   ipath_user_pages.o
-
-ipath_core-$(CONFIG_X86_64) += ipath_wc_x86_64.o
-
-ib_ipath-y := \
-   ipath_cq.o \
-   ipath_keys.o \
ipath_mad.o \
ipath_mr.o \
ipath_qp.o \
ipath_rc.o \
ipath_ruc.o \
ipath_srq.o \
+   ipath_stats.o \
+   ipath_sysfs.o \
ipath_uc.o \
ipath_ud.o \
-   ipath_verbs.o \
-   ipath_verbs_mcast.o
+   ipath_user_pages.o \
+   ipath_verbs_mcast.o \
+   ipath_verbs.o
+
+ib_ipath-$(CONFIG_X86_64) += ipath_wc_x86_64.o
+ib_ipath-$(CONFIG_PPC64) += ipath_wc_ppc64.o

So this seems to be ripping out chunks of upstream code (ipath_ht400)
replacing them with something else (ipath_iba6110, ipath_iba6120.o)

This might be a good change for all I know. But I'd like to ask
 What exactly does this fixes patch, fix?
 Can there be a list of things it provides at the top?
 How about a Signed-off-by line?
 Was this posted on openib-general even once?

There's a single unmerged ipath patch posted on openib-general:
 mmap()ed userspace work queues for ipath.

So where does the rest come from?
Googling for ipath_iba6110 gets no hits.

Thanks,

-- 
MST

Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] ib_cm: randomize starting local comm IDs
> 
> On 8/23/06, Michael S. Tsirkin <[EMAIL PROTECTED]> wrote:
> > Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
> 
> > > So the CM at the target side rejects the first REQ after the client
> > > reboot with STALE reason (and delivers a disconnect event to the
> > > ULP). The second REQ is processed fine and a new connection is
> > > established.
> 
> > Hmm. Might this still be a concern for users such as SDP
> > which don't retry connections?
> 
> I don't know if "this" in your email refers to the quote above

I am speaking more or less about this quote from your message:
> > Without the patch, since the REQ had  as this of an
> > existing connection, it was just silently dropped and a target reboot
> > was a must to let the initiator reconnect !

the spec says:
> > If a CM receives a REQ/REP as described above, if the REQ/REP has the
> > same Local Communication ID and Remote Communication ID as are present
> > in the existing connection and if the REQ/REP arrives within the window
> > of time during which the active/passive side could be legally
> > retransmitting REQ/REP, the CM should treat the REQ/REP as a retry and
> > not initiate stale connection processing as described above.

so I am wondering why it is not sufficient to wait for
the window of time as described above to expire?
Is something broken in CM that this patch is papering over?
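
The rule quoted from the spec can be read as a small decision function. What follows is only a minimal C sketch under stated assumptions: `struct conn_record`, `classify_req`, and the millisecond timestamps are hypothetical names invented for illustration and are not taken from ib_cm. It also assumes the incoming REQ/REP has already been matched to an existing connection "as described above".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of an existing connection; not the ib_cm layout. */
struct conn_record {
    uint32_t local_comm_id;   /* Local Communication ID seen in the REQ/REP */
    uint32_t remote_comm_id;  /* Remote Communication ID seen in the REQ/REP */
    uint64_t first_seen_ms;   /* when the original REQ/REP arrived */
    uint64_t retry_window_ms; /* time during which retransmission is legal */
};

enum req_verdict { REQ_RETRY, REQ_STALE };

/*
 * Treat the message as a retry only if both IDs match the existing
 * connection AND it arrives inside the retransmission window.
 * Anything else triggers stale connection processing.
 */
static enum req_verdict classify_req(const struct conn_record *c,
                                     uint32_t local_id, uint32_t remote_id,
                                     uint64_t now_ms)
{
    assert(now_ms >= c->first_seen_ms);

    bool same_ids = c->local_comm_id == local_id &&
                    c->remote_comm_id == remote_id;
    bool in_window = now_ms - c->first_seen_ms <= c->retry_window_ms;

    return (same_ids && in_window) ? REQ_RETRY : REQ_STALE;
}
```

In this reading, a rebooted client that happens to reuse the same IDs is only mistaken for a retransmission while the window is still open; once it expires, the same message is classified as stale.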

> but what
> I describe there is what is stated in the IB spec ch. 12 re stale connections.

I know. I am just wondering aloud whether this is relevant for SDP.
Why won't the window expire as described above?

> So you need to either rely on the SDP consumer to reconnect or when
> getting a STALE reject attempt to reconnect from SDP.
 
I'm not sure SDP needs to do anything - the port is busy, after all.
Retrying seems to be against the SDP spec.

-- 
MST


Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Or Gerlitz

On 8/23/06, Michael S. Tsirkin <[EMAIL PROTECTED]> wrote:
> Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:

> > So the CM at the target side rejects the first REQ after the client
> > reboot with STALE reason (and delivers a disconnect event to the
> > ULP). The second REQ is processed fine and a new connection is
> > established.

> Hmm. Might this still be a concern for users such as SDP
> which don't retry connections?

I don't know if "this" in your email refers to the quote above, but what
I describe there is what is stated in the IB spec ch. 12 re stale connections.

So you need to either rely on the SDP consumer to reconnect or when
getting a STALE reject attempt to reconnect from SDP.

Or.


Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Or Gerlitz

Sean Hefty wrote:
> Randomize the starting local comm ID to avoid getting a rejected connection
> due to a stale connection after a system reboot or reloading of the ib_cm.

Hi Sean,

As I wrote you, with the CM patched it works fine and the problem I could 
reproduce with our iser target is solved.

However, we wonder what your opinion is (and, if positive, your work 
estimate...) on making the CM get "GID OUT" traps, which are generated 
by the SM when a node IB restarts (e.g. a node reboot). Once the CM gets 
the trap, it can scan its internal data structures and emulate a 
disconnect for all relevant (*) connections.

(*) There are some technical issues here: the GID OUT is on a PORT GID 
and the CM uses NODE GUIDs. Also, does the openib stack have the means for a 
client to register for **traps**?

Other than the CM there might be more potential consumers of this trap, 
and also of "GID IN".

Or.


Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] ib_cm: randomize starting local comm IDs
> 
> On 8/22/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > Randomize the starting local comm ID to avoid getting a rejected connection
> > due to a stale connection after a system reboot or reloading of the ib_cm.
> 
> Hi Sean,
> 
> I have tested the patch against an iser target based on our Gen1 CM -
> it works as expected.
> 
> So the CM at the target side rejects the first REQ after the client
> reboot with STALE reason (and delivers a disconnect event to the
> ULP). The second REQ is processed fine and a new connection is
> established.
> 
> Without the patch, since the REQ had  as this of an
> existing connection, it was just silently dropped and a target reboot
> was a must to let the initiator reconnect !
> 
> Or.

Hmm. Might this still be a concern for users such as SDP
which don't retry connections?

-- 
MST


[openib-general] [PATCH] osm: fix memory leak in vendor ibumad

2006-08-23 Thread Eitan Zahavi

Hi Hal

These are two trivial fixes for memory leaks in the 
ibumad vendor.

Thanks

Eitan

Signed-off-by:  Eitan Zahavi <[EMAIL PROTECTED]>

Index: libvendor/osm_vendor_ibumad_sa.c
===
--- libvendor/osm_vendor_ibumad_sa.c(revision 9087)
+++ libvendor/osm_vendor_ibumad_sa.c(working copy)
@@ -180,6 +180,11 @@ __osmv_sa_mad_rcv_cb(
 
   /* free the copied query request if found */
   if (p_query_req_copy) free(p_query_req_copy);
+
+  /* put back the request madw */
+  if (p_req_madw)
+osm_mad_pool_put(p_bind->p_mad_pool, p_req_madw);
+
   OSM_LOG_EXIT( p_bind->p_log );
 }
 
Index: libvendor/osm_vendor_ibumad.c
===
--- libvendor/osm_vendor_ibumad.c   (revision 9087)
+++ libvendor/osm_vendor_ibumad.c   (working copy)
@@ -617,6 +617,7 @@ osm_vendor_get_all_port_attr(
*p_lid = ca.ports[j]->base_lid; 
*p_linkstates = ca.ports[j]->state; 
*p_portnum = ca.ports[j]->portnum;
+   free(ca.ports[j]);
}
p_lid++;
p_linkstates++;



Re: [openib-general] InfiniBand merge plans for 2.6.19

2006-08-23 Thread Or Gerlitz

On 8/22/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> >What about pushing the char device to support user space CMA, i recall
> >that you have mentioned the API was not mature enough when the 2.6.18
> >feature merge window was open.
>
> I will look at doing this.  I need to verify what functionality (RC, UD,
> multicast) of the kernel RDMA CM we want merged upstream for 2.6.19 and
> create a patch for exposing that to userspace.

OK.  Now, if this (RC, UD, MCAST) turns out to be too much for your
schedule before 2.6.19 opens up, does it make sense for you to push a
char device which supports only the CMA RC functionality and add UD
and multicast in the future?

Or.


Re: [openib-general] IB CM and the case of the lost RTU: was a bunch of other topics...

2006-08-23 Thread Or Gerlitz

On 8/23/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> To recap (since it's been a couple of weeks), we have two general solutions
> for how to support the passive/server/target side of a connection:
>
> 1. One method requires that the passive side queue send WRs until they get a
> connection establish event.
>
> 2. An alternative allows sending immediately after receiving a response, but
> may require the user to manually transition the connection to established.
> Failure to do so will cause the connection to tear down if the RTU is never
> received (even after retries).

The Voltaire iSER target implementation follows a variant of the first
method, namely it defers RX processing till getting a connection
established event.

This is ensured in the current product by the Gen1 Voltaire CM riding
on the IB comm_established async event and synthesizing an RTU, and
it would be the same in the iser target we code over the Gen2 stack (CMA)
if the first method is implemented.

As typically there is some ULP level handshake when a connection
starts, there would be very little to queue (e.g. in iSER it's the
login-request).
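
The first method from the recap (hold work until the connection established event arrives) can be sketched roughly as below. This is a hedged illustration only: `struct passive_conn`, `conn_send`, `conn_on_established`, and the fixed queue depth are hypothetical and do not correspond to any CMA or iSER API. Posting is modelled by a counter rather than a real verbs call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8  /* assumed small: little is queued before the ULP handshake */

/* Hypothetical passive-side connection state. */
struct passive_conn {
    bool   established;             /* set once the established event is seen */
    int    deferred[MAX_DEFERRED];  /* ids of work requests held back */
    size_t n_deferred;
    size_t n_posted;                /* stands in for actually posting to the QP */
};

/* Post immediately if established, otherwise queue. Returns false if the queue is full. */
static bool conn_send(struct passive_conn *c, int wr_id)
{
    if (c->established) {
        c->n_posted++;
        return true;
    }
    if (c->n_deferred == MAX_DEFERRED)
        return false;
    c->deferred[c->n_deferred++] = wr_id;
    return true;
}

/* Called on the connection established event: flush everything that was held back. */
static void conn_on_established(struct passive_conn *c)
{
    assert(!c->established);
    c->established = true;
    c->n_posted += c->n_deferred;
    c->n_deferred = 0;
}
```

Because a ULP-level handshake usually starts the connection, the deferred queue in such a scheme would rarely hold more than one entry.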

Or.


Re: [openib-general] CONFIG_INFINIBAND_ADDR_TRANS

2006-08-23 Thread Or Gerlitz

On 8/23/06, john t <[EMAIL PROTECTED]> wrote:

> What does the config option
> "CONFIG_INFINIBAND_ADDR_TRANS=y" indicate? The code does
> not seem to use this.

No, the code does use it to build and install the rdma_cm (CMA)
and the ib_addr modules, see linux-2.6.18-rc4/drivers/infiniband/core/Makefile

infiniband-$(CONFIG_INFINIBAND_ADDR_TRANS)  := ib_addr.o rdma_cm.o

However, I find the config name quite confusing...

Or.


Re: [openib-general] CONFIG_INFINIBAND_ADDR_TRANS

2006-08-23 Thread Michael S. Tsirkin

Quoting r. john t <[EMAIL PROTECTED]>:
> Subject: CONFIG_INFINIBAND_ADDR_TRANS
> 
> Hi,
>  
> What does the config option "CONFIG_INFINIBAND_ADDR_TRANS=y" indicate? The 
> code does not seem to use this.
>  
> Regards,
> John T.

Builds CMA modules.

-- 
MST


Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Or Gerlitz

On 8/22/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> Randomize the starting local comm ID to avoid getting a rejected connection
> due to a stale connection after a system reboot or reloading of the ib_cm.

Hi Sean,

I have tested the patch against an iser target based on our Gen1 CM -
it works as expected.

So the CM at the target side rejects the first REQ after the client
reboot with STALE reason (and delivers a disconnect event to the
ULP). The second REQ is processed fine and a new connection is
established.

Without the patch, since the REQ had  as this of an
existing connection, it was just silently dropped and a target reboot
was a must to let the initiator reconnect !
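
To illustrate why a randomized starting point helps, here is a minimal C sketch of an ID allocator whose first value is supplied by the caller. It is an assumption-laden illustration, not the patch itself: `struct comm_id_alloc` and both functions are hypothetical, and the actual ib_cm change may derive its IDs differently. In the kernel the starting value would presumably come from a random source such as `get_random_bytes()`; here it is passed in so the behaviour is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocator for local communication IDs. */
struct comm_id_alloc {
    uint32_t next;  /* next ID to hand out */
};

/* Start counting from a caller-supplied (ideally random) value instead of 0. */
static void comm_id_alloc_init(struct comm_id_alloc *a, uint32_t random_start)
{
    assert(a != NULL);
    a->next = random_start;
}

/* Hand out IDs sequentially; unsigned wraparound is well defined in C. */
static uint32_t comm_id_alloc_get(struct comm_id_alloc *a)
{
    return a->next++;
}
```

With a fixed start of 0, the first connection after every reboot reuses the same local comm ID as the first connection before the reboot, which is the collision with the target's existing connection described above. A different start per boot makes that collision unlikely.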

Or.


[openib-general] CONFIG_INFINIBAND_ADDR_TRANS

2006-08-23 Thread john t

Hi,
 
What does the config option "CONFIG_INFINIBAND_ADDR_TRANS=y" indicate? The code does not seem to use this.
 
Regards,
John T.

Re: [openib-general] question: ib_umem page_size

2006-08-23 Thread Michael S. Tsirkin

Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: question: ib_umem page_size
> 
> Michael> Roland, could you please clarify what the page_size
> Michael> field in struct ib_umem does?
> 
> It gives the page size for the user memory described by the struct.
> The idea was that if/when someone tries to optimize for huge pages,
> then the low-level driver can know that a region is using huge pages
> without having to walk through the page list and search for the
> minimum physically contiguous size.

Hmm, mthca_reg_user_mr seems to do:

len = sg_dma_len(&chunk->page_list[j]) >> shift

which means that dma_len must be a multiple of page size.

Is this intentional?
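
The concern can be shown with plain arithmetic. The sketch below is a standalone C illustration, not mthca code: `pages_by_shift` mirrors the quoted expression, and `pages_rounded_up` is a hypothetical alternative shown only for contrast.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors "len = dma_len >> shift": any partial trailing page is silently dropped. */
static size_t pages_by_shift(size_t dma_len, unsigned shift)
{
    assert(shift < sizeof(size_t) * 8);
    return dma_len >> shift;
}

/* Hypothetical contrast: count a partial trailing page as a full page. */
static size_t pages_rounded_up(size_t dma_len, unsigned shift)
{
    assert(shift < sizeof(size_t) * 8);
    size_t page_size = (size_t)1 << shift;
    return (dma_len + page_size - 1) >> shift;
}
```

For a 4 KB page (shift of 12), a DMA length of 8192 gives 2 pages either way, while a length of 6144 gives 1 with the plain shift and 2 when rounded up. The plain shift is therefore only exact when the DMA length is a multiple of the page size.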

-- 
MST


Re: [openib-general] [PATCH] huge pages support

2006-08-23 Thread Eli Cohen

On Fri, 2006-08-18 at 15:23 +0200, Robert Rex wrote:
> Hello,
> 
> I've also worked on the same topic. Here is what I've done so far; I 
> successfully tested it on mthca and ehca. I'd appreciate comments and 
> suggestions.
>  
> + down_read(&current->mm->mmap_sem);
> + if (is_vm_hugetlb_page(find_vma(current->mm, (unsigned long) addr))) {
> + use_hugepages   = 1;
> + region_page_mask= HPAGE_MASK;
> + region_page_size= HPAGE_SIZE;

This might cause a kernel oops if the address passed by the user does
not belong to the process's address space. In that case find_vma() might
return NULL and is_vm_hugetlb_page() will crash.
And even if find_vma() returns a non-NULL value, that still does not
guarantee that the vma returned is the one that contains that address.
You need to check that the address is greater than or equal to
vma->vm_start.
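
Both checks can be illustrated in userspace. The following C sketch uses a mock, not kernel code: `struct mock_vma`, `mock_find_vma`, and `addr_is_hugetlb` are hypothetical stand-ins, with `mock_find_vma` imitating the find_vma() behaviour of returning the first area whose end lies above the address, or NULL if there is none.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of a memory area: covers [vm_start, vm_end). */
struct mock_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    int is_hugetlb;
};

/*
 * Imitates find_vma(): return the first area (sorted by address) with
 * addr < vm_end, or NULL. The result may start ABOVE addr.
 */
static const struct mock_vma *mock_find_vma(const struct mock_vma *v,
                                            size_t n, unsigned long addr)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (addr < v[i].vm_end)
            return &v[i];
    return NULL;
}

/* Apply both checks before trusting the returned area. */
static int addr_is_hugetlb(const struct mock_vma *v, size_t n,
                           unsigned long addr)
{
    const struct mock_vma *vma = mock_find_vma(v, n, addr);

    if (!vma || addr < vma->vm_start)
        return 0;  /* address is not mapped at all */
    return vma->is_hugetlb;
}
```

An address that falls in a hole below a huge-page mapping is the interesting case: the lookup returns the huge-page area, and only the vm_start comparison prevents it from being treated as a huge-page address.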


[openib-general] [Bug 203] Memory corruption in mad processing

2006-08-23 Thread bugzilla-daemon

http://openib.org/bugzilla/show_bug.cgi?id=203


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|major                       |critical
            Summary|Crash on shutdown, timer    |Memory corruption in mad
                   |callback, build 459         |processing




--- Comment #2 from [EMAIL PROTECTED]  2006-08-23 03:06 ---
A fix seems to be replacing the null test of p_list_item in al_mad.c (two
places) with something like below. A text pattern scan of the sources for any
other occurrences might be appropriate too.

if( p_list_item == cl_qlist_end( &h_mad_svc->send_list ) )
// bad  if( !p_list_item )

I'll report back in a day or two on how this affects shutdown stability.
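
Why a null test never fires on a sentinel-based list can be seen in a small mock. The C sketch below is not complib: `struct item`, `struct qlist`, and the `qlist_*` functions are hypothetical names that only imitate a circular list whose "end" is the list header itself.

```c
#include <assert.h>
#include <stddef.h>

/* Mock circular doubly linked list with a sentinel header. */
struct item {
    struct item *next;
    struct item *prev;
};

struct qlist {
    struct item end;  /* sentinel: an empty list points back at itself */
};

static void qlist_init(struct qlist *l)
{
    assert(l != NULL);
    l->end.next = &l->end;
    l->end.prev = &l->end;
}

static struct item *qlist_end(struct qlist *l)
{
    return &l->end;
}

static void qlist_insert_tail(struct qlist *l, struct item *it)
{
    it->prev = l->end.prev;
    it->next = &l->end;
    l->end.prev->next = it;
    l->end.prev = it;
}

/* Search from the head. On a miss this returns the sentinel, never NULL. */
static struct item *qlist_find(struct qlist *l, const struct item *wanted)
{
    struct item *it = l->end.next;

    while (it != &l->end && it != wanted)
        it = it->next;
    return it;
}
```

With a list like this, `if (!found)` is always false, so a miss falls through and the sentinel is treated as a real element. Comparing against `qlist_end()` is the test that actually detects "not found".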





Re: [openib-general] why sdp connections cost so much memory

2006-08-23 Thread Michael S. Tsirkin

Yes, I have reproduced the connection refusal problem and I am looking into it.
Thanks!

MST

Quoting r. zhu shi song <[EMAIL PROTECTED]>:
Subject: Re: why sdp connections cost so much memory

I haven't seen kernel crashes using rc2.  But connections are
always refused when max concurrent connections is set above 200.
Everything works when max concurrent connections is set below 200.
(Running the same test over TCP shows no problem.)
(1)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 257 requests completed
(2)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 256 requests completed
[EMAIL PROTECTED] squid.test]#

zhu





-- 
MST


[openib-general] [Bug 203] Crash on shutdown, timer callback, build 459

2006-08-23 Thread bugzilla-daemon

http://openib.org/bugzilla/show_bug.cgi?id=203


[EMAIL PROTECTED] changed:

   What|Removed |Added

 CC||[EMAIL PROTECTED]




--- Comment #1 from [EMAIL PROTECTED]  2006-08-23 01:55 ---
I've trapped a write of 0x1 to the dpc context field of a mad data structure.

The stack looks like this just after the write:

f797ab00 ba9de265 ibbus!ib_cancel_mad+0x6c0
[k:\windows-openib\src\winib-459\core\al\al_mad.c @ 1831]
f797ab14 ba984d68 ibbus!al_cancel_sa_req+0x25
[k:\windows-openib\src\winib-459\core\al\al_query.h @ 140]
f797ab28 ba82ec4c ibbus!ib_cancel_query+0x328
[k:\windows-openib\src\winib-459\core\al\al.c @ 429]
f797ac00 ba7fe269 ipoib!ipoib_port_down+0x13c
[k:\windows-openib\src\winib-459\ulp\ipoib\kernel\ipoib_port.c @ 5066]
f797ac74 ba991da1 ipoib!__ipoib_pnp_cb+0xe89
[k:\windows-openib\src\winib-459\ulp\ipoib\kernel\ipoib_adapter.c @ 690]
f797acdc ba994f92 ibbus!__pnp_notify_user+0x561
[k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 523]
f797ad04 ba994cb1 ibbus!__pnp_process_port_forward+0x172
[k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 1230]
f797ad48 ba99479a ibbus!__pnp_check_ports+0x411
[k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 1433]
f797ad70 ba950884 ibbus!__pnp_check_events+0x19a
[k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 1510]
f797ad8c ba956b54 ibbus!__cl_async_proc_worker+0x94
[k:\windows-openib\src\winib-459\core\complib\cl_async_proc.c @ 153]
f797ada0 ba958c0c ibbus!__cl_thread_pool_routine+0x54
[k:\windows-openib\src\winib-459\core\complib\cl_threadpool.c @ 67]
f797adac 80a07678 ibbus!__thread_callback+0x2c
[k:\windows-openib\src\winib-459\core\complib\kernel\cl_thread.c @ 49]
f797addc 80781346 nt!PspSystemThreadStartup+0x2e
  nt!KiThreadStartup+0x16

This seems to be canceling an outstanding mad query when the port goes down. An
event that would happen at shutdown, and at irregular other times.

The code that causes the dpc corruption is core\al\al_mad.c about line 1826:

if( !p_list_item )
{
  cl_spinlock_release( &h_mad_svc->obj.lock );
  AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_MAD_SVC, ("mad not found\n") );
return IB_NOT_FOUND;
}

/* Mark the MAD as having been canceled. */
h_send = PARENT_STRUCT( p_list_item, al_mad_send_t, pool_item );
h_send->canceled = TRUE;

The local pointer h_send seems to not be pointing at the right thing, and the
assignment of TRUE to the cancel field is actually corrupting the dpc context
field.

A structure dump of p_list_item says:

1: kd> dt p_list_item
Local var @ 0xf797aafc Type _cl_list_item*
0x88e76f10 
   +0x000 p_next   : 0x88e76f10 _cl_list_item
   +0x004 p_prev   : 0x88e76f10 _cl_list_item
   +0x008 p_list   : 0x88e76f10 _cl_qlist

This address, 0x88e76f10, is the same as the address of the send_list field in
the local h_mad_svc, and I believe it represents an empty list header. This
suggests the test for null is an incorrect test for the list being empty.

There is also another case that looks like an incorrect list test in the same
source file.





Re: [openib-general] why sdp connections cost so much memory

2006-08-23 Thread zhu shi song

I haven't seen kernel crashes using rc2.  But connections are
always refused when max concurrent connections is set above 200.
Everything works when max concurrent connections is set below 200.
(Running the same test over TCP shows no problem.)
(1)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 257 requests completed
(2)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 256 requests completed
[EMAIL PROTECTED] squid.test]#

zhu




--- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:

> Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > --- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:
> > 
> > > Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > > > (3) one time linux kernel on the client crashed. I
> > > > copy the output from the screen.
> > > > Process sdp (pid:4059, threadinfo 010036384000
> > > > task 01003ea10030)
> > > > Call Trace:{:ib_sdp:sdp_destroy_workto}
> > > >  {:ib_sdp:sdp_destroy_qp+77}
> > > > {:ib_sdp:sdp_destruct+279}{sk_free+28}
> > > > {worker_thread+419}{default_wake_function+0}
> > > > {default_wake_function+0}{keventd_create_kthread+0}
> > > > {worker_thread+0}{keventd_create_kthread+0}
> > > > {kthread+200}{child_rip+8}
> > > > {keventd_create_kthread+0}{kthread+0}{child_rip+0}
> > > > Code:8b 40 04 41 39 c6 89 44 24 0c 7d 3b 45 31 ff 45
> > > > 31 ed 4c 89
> > > > RIP:{:ib_sdp:sdp_recv_completion+127}RSP<010036385dc8>
> > > > CR2:0004
> > > > <0>kernel panic-not syncing:Oops
> > > > 
> > > > zhu
> > > 
> > > Hmm, the stack dump does not match my sources. Is
> > > this OFED rc1?
> > > Could you send me the sdp_main.o and sdp_main.c
> > > files from your system please?
> 
> ---
> 
> > Subject: Re: why sdp connections cost so much memory
> > 
> > please see the attachment.
> > zhu
> 
> Ugh, so its crashing inside sdp_bcopy ...
> 
> By the way, could you please re-test with OFED rc2?
> We've solved a couple of bugs there ...
> 
> If this still crashes, could you please post the whole
> sdp directory, with .o and .c files?
> 
> Thanks,
> 
> -- 
> MST
> 

