Re: [openib-general] librdmacm ABI issues with OFED 1.1
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> If you look at Woody's backport patches, I believe that he moves the RDMA CM
> files to /sys/class/infiniband/rdma_cm and updates the librdmacm to read the
> abi_version from there.

Maybe the librdmacm part should be merged to svn? So librdmacm could try to read from misc, then from /sys/class/infiniband/rdma_cm, and then assume latest. It's good to have userspace code portable across distros ...

--
MST
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
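The lookup order suggested here (misc class first, then the backport location, then assume the latest ABI) could be sketched roughly as follows. This is not the librdmacm code: the function names, the `SKETCH_LATEST_ABI` constant and the exact file names under each sysfs directory are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical: the newest ABI version this library was built against. */
#define SKETCH_LATEST_ABI 2

/*
 * Try each path in order and return the first ABI version that parses.
 * Returns `fallback` when none of the files can be read.
 */
static int read_abi_version(const char *const *paths, size_t npaths, int fallback)
{
    assert(paths != NULL || npaths == 0);

    for (size_t i = 0; i < npaths; i++) {
        FILE *f = fopen(paths[i], "r");
        int version;

        if (!f)
            continue;               /* not present on this kernel, try the next one */
        if (fscanf(f, "%d", &version) == 1) {
            fclose(f);
            return version;
        }
        fclose(f);
    }
    return fallback;
}

/* misc class first, then the backport location, then assume latest. */
int rdma_cm_abi_version(void)
{
    static const char *const paths[] = {
        "/sys/class/misc/rdma_cm/abi_version",
        "/sys/class/infiniband/rdma_cm/abi_version",
    };

    return read_abi_version(paths, sizeof paths / sizeof paths[0],
                            SKETCH_LATEST_ABI);
}
```

Keeping the path list in one table means a further distro-specific location only needs one more entry.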
Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> And even with these proposed changes, there's a race condition where the CM
> can timeout a connection after data is received over it, but before this event
> can be processed.

Hmm. And what happens then?

--
MST
Re: [openib-general] librdmacm ABI issues with OFED 1.1
>I have some rdma_cm test code and when I run with the OFED 1.1 code (running on
>2.6.9 U3 based kernel) I got the following error.
>
>librdmacm: couldn't read ABI version.
>librdmacm: assuming: 2

The RDMA CM places the abi_version file in /sys/class/misc/rdma_cm. The misc class didn't exist in 2.6.9, which is why it was removed from the OFED code.

>The code seems to run (as it really does nothing) fine but I was wondering if I
>could fix this just to clean up the output. I found that the following patch
>removes the code which creates the abi_version file.

If you look at Woody's backport patches, I believe that he moves the RDMA CM files to /sys/class/infiniband/rdma_cm and updates the librdmacm to read the abi_version from there. Or you could just remove the prints from the library.

>How bad is it that the user space is assuming version 2 of the interface and
>the modules are at version 1?

It should work fine for apps using RC QPs. Version 1 assumed that the port space was RDMA TCP for RC QPs. Version 2 added support for UD QPs through the RDMA CM's UDP port space. The port space information was added to the end of a structure, so if an older kernel is used, it simply won't read in the port space data, and will assume TCP. An application that was expecting to use a UD QP will simply get an error on some operation, likely when it tries to actually connect to a remote UD QP.

- Sean
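The reason a version 2 library still works against a version 1 module can be illustrated with a small sketch. The structure layouts, field names and port-space values below are hypothetical, not the real ucma ABI; the only point carried over from the explanation above is that the new field sits after the old ones, so an older reader never looks at it and falls back to TCP.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical port-space values and command layouts, for illustration only. */
enum sketch_port_space { SKETCH_PS_TCP = 1, SKETCH_PS_UDP = 2 };

struct sketch_create_id_v1 {
    uint64_t uid;
    uint64_t response;
};

struct sketch_create_id_v2 {
    uint64_t uid;
    uint64_t response;
    uint16_t ps;                /* new in v2, appended after the v1 fields */
    uint8_t  reserved[6];
};

/*
 * A version 1 "kernel": it copies only the bytes it knows about, so the
 * trailing port-space field is never read, and it assumes TCP.
 */
static enum sketch_port_space v1_kernel_parse(const void *buf, size_t len,
                                              struct sketch_create_id_v1 *out)
{
    assert(len >= sizeof *out);
    memcpy(out, buf, sizeof *out);
    return SKETCH_PS_TCP;
}
```

A caller that asked for UDP in this sketch gets a TCP identifier back without any error at creation time, which matches the observation that the failure only shows up later, on some UD-specific operation.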
[openib-general] [PATCH] IB/libipathverbs - Fix compatibility with old ib_ipath kernel drivers
This patch makes libipathverbs backward compatible with old ib_ipath kernel drivers.

Signed-off-by: Ralph Campbell <[EMAIL PROTECTED]>

Index: src/userspace/libipathverbs/src/verbs.c
===================================================================
--- src/userspace/libipathverbs/src/verbs.c (revision 9095)
+++ src/userspace/libipathverbs/src/verbs.c (working copy)
@@ -177,6 +177,29 @@ struct ibv_cq *ipath_create_cq(struct ib
         return &cq->ibv_cq;
 }
 
+struct ibv_cq *ipath_create_cq_v1(struct ibv_context *context, int cqe,
+                                  struct ibv_comp_channel *channel,
+                                  int comp_vector)
+{
+        struct ibv_cq *cq;
+        struct ibv_create_cq cmd;
+        struct ibv_create_cq_resp resp;
+        int ret;
+
+        cq = malloc(sizeof *cq);
+        if (!cq)
+                return NULL;
+
+        ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector,
+                                cq, &cmd, sizeof cmd, &resp, sizeof resp);
+        if (ret) {
+                free(cq);
+                return NULL;
+        }
+
+        return cq;
+}
+
 int ipath_resize_cq(struct ibv_cq *ibcq, int cqe)
 {
         struct ipath_cq *cq = to_icq(ibcq);
@@ -207,6 +230,15 @@ int ipath_resize_cq(struct ibv_cq *ibcq,
         return 0;
 }
 
+int ipath_resize_cq_v1(struct ibv_cq *ibcq, int cqe)
+{
+        struct ibv_resize_cq cmd;
+        struct ibv_resize_cq_resp resp;
+
+        return ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd,
+                                 &resp, sizeof resp);
+}
+
 int ipath_destroy_cq(struct ibv_cq *ibcq)
 {
         struct ipath_cq *cq = to_icq(ibcq);
@@ -222,6 +254,16 @@ int ipath_destroy_cq(struct ibv_cq *ibcq
         return 0;
 }
 
+int ipath_destroy_cq_v1(struct ibv_cq *ibcq)
+{
+        int ret;
+
+        ret = ibv_cmd_destroy_cq(ibcq);
+        if (!ret)
+                free(ibcq);
+        return ret;
+}
+
 int ipath_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
 {
         struct ipath_cq *cq = to_icq(ibcq);
@@ -290,6 +332,28 @@ struct ibv_qp *ipath_create_qp(struct ib
         return &qp->ibv_qp;
 }
 
+struct ibv_qp *ipath_create_qp_v1(struct ibv_pd *pd,
+                                  struct ibv_qp_init_attr *attr)
+{
+        struct ibv_create_qp cmd;
+        struct ibv_create_qp_resp resp;
+        struct ibv_qp *qp;
+        int ret;
+
+        qp = malloc(sizeof *qp);
+        if (!qp)
+                return NULL;
+
+        ret = ibv_cmd_create_qp(pd, qp, attr, &cmd, sizeof cmd,
+                                &resp, sizeof resp);
+        if (ret) {
+                free(qp);
+                return NULL;
+        }
+
+        return qp;
+}
+
 int ipath_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
                    enum ibv_qp_attr_mask attr_mask,
                    struct ibv_qp_init_attr *init_attr)
@@ -330,6 +394,16 @@ int ipath_destroy_qp(struct ibv_qp *ibqp
         return 0;
 }
 
+int ipath_destroy_qp_v1(struct ibv_qp *ibqp)
+{
+        int ret;
+
+        ret = ibv_cmd_destroy_qp(ibqp);
+        if (!ret)
+                free(ibqp);
+        return ret;
+}
+
 static int post_recv(struct ipath_rq *rq, struct ibv_recv_wr *wr,
                      struct ibv_recv_wr **bad_wr)
 {
@@ -412,6 +486,28 @@ struct ibv_srq *ipath_create_srq(struct
         return &srq->ibv_srq;
 }
 
+struct ibv_srq *ipath_create_srq_v1(struct ibv_pd *pd,
+                                    struct ibv_srq_init_attr *attr)
+{
+        struct ibv_srq *srq;
+        struct ibv_create_srq cmd;
+        struct ibv_create_srq_resp resp;
+        int ret;
+
+        srq = malloc(sizeof *srq);
+        if (srq == NULL)
+                return NULL;
+
+        ret = ibv_cmd_create_srq(pd, srq, attr, &cmd, sizeof cmd,
+                                 &resp, sizeof resp);
+        if (ret) {
+                free(srq);
+                return NULL;
+        }
+
+        return srq;
+}
+
 int ipath_modify_srq(struct ibv_srq *ibsrq,
                      struct ibv_srq_attr *attr,
                      enum ibv_srq_attr_mask attr_mask)
@@ -456,6 +552,16 @@ int ipath_modify_srq(struct ibv_srq *ibs
         return 0;
 }
 
+int ipath_modify_srq_v1(struct ibv_srq *ibsrq,
+                        struct ibv_srq_attr *attr,
+                        enum ibv_srq_attr_mask attr_mask)
+{
+        struct ibv_modify_srq cmd;
+
+        return ibv_cmd_modify_srq(ibsrq, attr, attr_mask,
+                                  &cmd, sizeof cmd);
+}
+
 int ipath_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr)
 {
         struct ibv_query_srq cmd;
@@ -481,6 +587,16 @@ int ipath_destroy_srq(struct ibv_srq *ib
         return 0;
 }
 
+int ipath_destroy_srq_v1(struct ibv_srq *ibsrq)
+{
+        int ret;
+
+        ret = ibv_cmd_destroy_srq(ibsrq);
+        if (!ret)
+                free(ibsrq);
+        return ret;
+}
+
 int ipath_post_srq_recv(struct ibv_srq *ibsrq
Re: [openib-general] [GIT PULL] please pull infiniband.git
On Wed, Aug 23, 2006 at 04:25:38PM -0700, Roland Dreier wrote:
> Greg, please pull from
>
>     master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus
>
> This tree is also available from kernel.org mirrors at:
>
>     git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

Pulled from, and pushed out.

thanks,

greg k-h
[openib-general] librdmacm ABI issues with OFED 1.1
I have some rdma_cm test code, and when I run it with the OFED 1.1 code (running on a 2.6.9 U3 based kernel) I get the following error.

librdmacm: couldn't read ABI version.
librdmacm: assuming: 2

The code seems to run fine (as it really does nothing), but I was wondering if I could fix this just to clean up the output. I found that the following patch removes the code which creates the abi_version file.

./backport/2.6.9_U3/ucma_6607_to_2_6_9.patch

So my question is: How bad is it that the user space is assuming version 2 of the interface and the modules are at version 1?

Thanks,
Ira
Re: [openib-general] basic IB doubt
>Actually, that leads me to a question: does the vendor of that adaptor
>say that this is actually safe?

I believe so.

>most of the time doesn't mean it does it all of the time. So is it
>really smart to write non-standard-conforming programs unless the
>vendor stands behind that behavior?

I'm not saying whether I consider this good computer science or not, but some applications do rely on this feature, and hardware that wants to work best with those applications will have it.

- Sean
Re: [openib-general] basic IB doubt
    Greg> Actually, that leads me to a question: does the vendor of
    Greg> that adaptor say that this is actually safe? Just because
    Greg> something behaves one way most of the time doesn't mean it
    Greg> does it all of the time. So is it really smart to write
    Greg> non-standard-conforming programs unless the vendor stands
    Greg> behind that behavior?

Yes, Mellanox documents that it is safe to rely on the last byte of an RDMA being written last.

 - R.
Re: [openib-general] basic IB doubt
On Wed, Aug 23, 2006 at 09:29:18AM -0700, Sean Hefty wrote:
> I don't believe that there is any ordering guarantee by the architecture.
> However, specific adapters may behave this way, and I've seen applications make
> use of this by polling the last memory byte for a completion, for example.

Actually, that leads me to a question: does the vendor of that adaptor say that this is actually safe? Just because something behaves one way most of the time doesn't mean it does it all of the time. So is it really smart to write non-standard-conforming programs unless the vendor stands behind that behavior?

-- greg
[openib-general] [GIT PULL] please pull infiniband.git
Greg, please pull from

    master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

to get a few fixes for the 2.6.18 tree:

Jack Morgenstein:
      IB/core: Fix SM LID/LID change with client reregister set

Michael S. Tsirkin:
      IB/mthca: Make fence flag work for send work requests
      IB/mthca: Update HCA firmware revisions

Roland Dreier:
      IB/mthca: Fix potential AB-BA deadlock with CQ locks
      IB/mthca: No userspace SRQs if HCA doesn't have SRQ support

 drivers/infiniband/core/cache.c              |    3 +
 drivers/infiniband/core/sa_query.c           |    3 +
 drivers/infiniband/hw/mthca/mthca_main.c     |    6 +--
 drivers/infiniband/hw/mthca/mthca_provider.c |   11 +++--
 drivers/infiniband/hw/mthca/mthca_provider.h |    4 +-
 drivers/infiniband/hw/mthca/mthca_qp.c       |   54 +++--
 6 files changed, 55 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index e05ca2c..75313ad 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -301,7 +301,8 @@ static void ib_cache_event(struct ib_eve
             event->event == IB_EVENT_PORT_ACTIVE ||
             event->event == IB_EVENT_LID_CHANGE  ||
             event->event == IB_EVENT_PKEY_CHANGE ||
-            event->event == IB_EVENT_SM_CHANGE) {
+            event->event == IB_EVENT_SM_CHANGE   ||
+            event->event == IB_EVENT_CLIENT_REREGISTER) {
                 work = kmalloc(sizeof *work, GFP_ATOMIC);
                 if (work) {
                         INIT_WORK(&work->work, ib_cache_task, work);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index aeda484..d6b8422 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -405,7 +405,8 @@ static void ib_sa_event(struct ib_event_
             event->event == IB_EVENT_PORT_ACTIVE ||
             event->event == IB_EVENT_LID_CHANGE  ||
             event->event == IB_EVENT_PKEY_CHANGE ||
-            event->event == IB_EVENT_SM_CHANGE) {
+            event->event == IB_EVENT_SM_CHANGE   ||
+            event->event == IB_EVENT_CLIENT_REREGISTER) {
                 struct ib_sa_device *sa_dev;
                 sa_dev = container_of(handler, typeof(*sa_dev), event_handler);
diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
index 557cde3..7b82c19 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -967,12 +967,12 @@ static struct {
 } mthca_hca_table[] = {
         [TAVOR]        = { .latest_fw = MTHCA_FW_VER(3, 4, 0),
                            .flags     = 0 },
-        [ARBEL_COMPAT] = { .latest_fw = MTHCA_FW_VER(4, 7, 400),
+        [ARBEL_COMPAT] = { .latest_fw = MTHCA_FW_VER(4, 7, 600),
                            .flags     = MTHCA_FLAG_PCIE },
-        [ARBEL_NATIVE] = { .latest_fw = MTHCA_FW_VER(5, 1, 0),
+        [ARBEL_NATIVE] = { .latest_fw = MTHCA_FW_VER(5, 1, 400),
                            .flags     = MTHCA_FLAG_MEMFREE |
                                         MTHCA_FLAG_PCIE },
-        [SINAI]        = { .latest_fw = MTHCA_FW_VER(1, 0, 800),
+        [SINAI]        = { .latest_fw = MTHCA_FW_VER(1, 1, 0),
                            .flags     = MTHCA_FLAG_MEMFREE |
                                         MTHCA_FLAG_PCIE    |
                                         MTHCA_FLAG_SINAI_OPT }
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 230ae21..265b1d1 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -1287,11 +1287,7 @@ int mthca_register_device(struct mthca_d
                 (1ull << IB_USER_VERBS_CMD_MODIFY_QP)    |
                 (1ull << IB_USER_VERBS_CMD_DESTROY_QP)   |
                 (1ull << IB_USER_VERBS_CMD_ATTACH_MCAST) |
-                (1ull << IB_USER_VERBS_CMD_DETACH_MCAST) |
-                (1ull << IB_USER_VERBS_CMD_CREATE_SRQ)   |
-                (1ull << IB_USER_VERBS_CMD_MODIFY_SRQ)   |
-                (1ull << IB_USER_VERBS_CMD_QUERY_SRQ)    |
-                (1ull << IB_USER_VERBS_CMD_DESTROY_SRQ);
+                (1ull << IB_USER_VERBS_CMD_DETACH_MCAST);
         dev->ib_dev.node_type            = IB_NODE_CA;
         dev->ib_dev.phys_port_cnt        = dev->limits.num_ports;
         dev->ib_dev.dma_device           = &dev->pdev->dev;
@@ -1316,6 +1312,11 @@ int mthca_register_device(struct mthca_d
                 dev->ib_dev.modify_srq   = mthca_modify_srq;
                 dev->ib_dev.query_srq    = mthca_query_srq;
                 dev->ib_dev.destroy_srq  = mthca_destroy_srq;
+                dev->ib_dev.uverbs_cmd_mask |=
+
Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support
>Donnu. I'm just speaking on the general principle that we should deny by
>default, not allow by default. Which queries do you want to perform?

At a minimum, I would expect the following queries:

PathRecord
MultiPathRecord
MCMemberRecord
ServiceRecord

Support for ServiceRecord set/delete and InformInfo is likely to be added once kernel support is in place.

Is it a reasonable approach to export two devices with different permissions, one that allows limited sends, and another that permits unfiltered sends?

- Sean
Re: [openib-general] [PATCH 3/7] IB/ipath - performance improvements via mmap of queues
    Ralph> Not quite. If the kernel driver is old, libipathverbs is
    Ralph> using the old functions which make system calls instead of
    Ralph> doing the newer mmap stuff. libipathverbs doesn't need to
    Ralph> attempt calling mmap() if it knows the driver doesn't
    Ralph> support it.

You lost me -- I only see one version of ipath_resize_cq(), and it seems to do munmap()/mmap() without testing an ABI version.

 - R.
Re: [openib-general] [PATCH 3/7] IB/ipath - performance improvements via mmap of queues
On Wed, 2006-08-23 at 14:50 -0700, Roland Dreier wrote:
> I applied this, but I'm wondering if this:
>
> > +int ipath_resize_cq(struct ibv_cq *ibcq, int cqe)
> >  {
> > +        struct ipath_cq *cq = to_icq(ibcq);
> > +        struct ibv_resize_cq cmd;
> > +        struct ipath_resize_cq_resp resp;
> > +        size_t size;
> > +        int ret;
> > +
> > +        pthread_spin_lock(&cq->lock);
> > +        /* Save the old size so we can unmmap the queue. */
> > +        size = sizeof(struct ipath_cq_wc) +
> > +                (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
> > +        ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd,
> > +                                &resp.ibv_resp, sizeof resp);
> > +        if (ret) {
> > +                pthread_spin_unlock(&cq->lock);
> > +                return ret;
> > +        }
> > +        (void) munmap(cq->queue, size);
> > +        size = sizeof(struct ipath_cq_wc) +
> > +                (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
> > +        cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
> > +                         ibcq->context->cmd_fd, resp.offset);
> > +        ret = errno;
> > +        pthread_spin_unlock(&cq->lock);
> > +        if ((void *) cq->queue == MAP_FAILED)
> > +                return ret;
> > +        return 0;
> > +}
>
> works against an old kernel driver. It seems you do have this:
>
> > +        if (dev->abi_version == 1) {
> > +                context->ibv_ctx.ops.poll_cq = ibv_cmd_poll_cq;
> > +                context->ibv_ctx.ops.post_srq_recv = ibv_cmd_post_srq_recv;
> > +                context->ibv_ctx.ops.post_recv = ibv_cmd_post_recv;
> > +        }
>
> so I guess you're just ignoring the failure of mmap() or something?
>
> - R.

Not quite. If the kernel driver is old, libipathverbs is using the old functions which make system calls instead of doing the newer mmap stuff. libipathverbs doesn't need to attempt calling mmap() if it knows the driver doesn't support it.
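The idea in this reply, picking the system-call or the mmap variant of an operation once, based on the driver's ABI version, can be sketched as a small function-pointer table. This is not the libipathverbs code: every name below is hypothetical, and the two "implementations" only report which path was taken.

```c
#include <assert.h>
#include <stddef.h>

/* Which code path an operation took; stands in for the real work. */
enum sketch_path { SKETCH_PATH_SYSCALL, SKETCH_PATH_MMAP };

typedef enum sketch_path (*sketch_resize_fn)(int cqe);

/* Old driver: everything goes through a system call, no mmap involved. */
static enum sketch_path resize_cq_v1(int cqe)
{
    (void)cqe;
    return SKETCH_PATH_SYSCALL;
}

/* New driver: the queue is shared with the kernel and remapped on resize. */
static enum sketch_path resize_cq_mmap(int cqe)
{
    (void)cqe;
    return SKETCH_PATH_MMAP;
}

struct sketch_ops {
    sketch_resize_fn resize_cq;
};

/* Choose the implementation once, when the device context is created. */
static void sketch_init_ops(struct sketch_ops *ops, int abi_version)
{
    assert(ops != NULL);
    ops->resize_cq = (abi_version == 1) ? resize_cq_v1 : resize_cq_mmap;
}
```

Selecting at context-creation time keeps the version test out of the fast path, and it means the mmap variant is never entered against a driver that cannot back it.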
Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM
Roland Dreier wrote:
> It's unfortunate that we have to add a special-case event hook for the
> CM, but I guess the iWARP CM changes are so ugly anyway it doesn't
> matter much. So I think committing this is OK.

We also have the alternative of pushing the responsibility of notifying the CM of the event to the ULPs, which is what's done today. The problem there is that it gets pushed up to all userspace applications as well.

And even with these proposed changes, there's a race condition where the CM can time out a connection after data is received over it, but before this event can be processed.

This makes me wonder if any of this is worth it...

- Sean
Re: [openib-general] [PATCH 1/7] IB/ipath - performance improvements via mmap of queues
Thanks, applied.
Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] libibsa: userspace SA query and multicast support
>
> > Yea I had the same question. Shouldn't interface expose
> > just the specific queries that we need?
>
> I don't know what queries a user will want, and I'd rather not change the
> kernel ABI with every new query, but that is a possibility. Which queries are
> of concern?

Donnu. I'm just speaking on the general principle that we should deny by default, not allow by default. Which queries do you want to perform?

--
MST
Re: [openib-general] [PATCH 3/7] IB/ipath - performance improvements via mmap of queues
I applied this, but I'm wondering if this:

> +int ipath_resize_cq(struct ibv_cq *ibcq, int cqe)
>  {
> +        struct ipath_cq *cq = to_icq(ibcq);
> +        struct ibv_resize_cq cmd;
> +        struct ipath_resize_cq_resp resp;
> +        size_t size;
> +        int ret;
> +
> +        pthread_spin_lock(&cq->lock);
> +        /* Save the old size so we can unmmap the queue. */
> +        size = sizeof(struct ipath_cq_wc) +
> +                (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
> +        ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd,
> +                                &resp.ibv_resp, sizeof resp);
> +        if (ret) {
> +                pthread_spin_unlock(&cq->lock);
> +                return ret;
> +        }
> +        (void) munmap(cq->queue, size);
> +        size = sizeof(struct ipath_cq_wc) +
> +                (sizeof(struct ipath_wc) * cq->ibv_cq.cqe);
> +        cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
> +                         ibcq->context->cmd_fd, resp.offset);
> +        ret = errno;
> +        pthread_spin_unlock(&cq->lock);
> +        if ((void *) cq->queue == MAP_FAILED)
> +                return ret;
> +        return 0;
> +}

works against an old kernel driver. It seems you do have this:

> +        if (dev->abi_version == 1) {
> +                context->ibv_ctx.ops.poll_cq = ibv_cmd_poll_cq;
> +                context->ibv_ctx.ops.post_srq_recv = ibv_cmd_post_srq_recv;
> +                context->ibv_ctx.ops.post_recv = ibv_cmd_post_recv;
> +        }

so I guess you're just ignoring the failure of mmap() or something?

 - R.
Re: [openib-general] [PATCH 2/7] IB/ipath - performance improvements via mmap of queues
Thanks, applied.
Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] mthca: various bug fixes for mthca_query_qp
>
>     Michael> The other four bullets make sense however, do they not?
>
> I guess so, although I wonder if anyone will ever care about the
> sq_sig_type() field.

It's not in the IB spec's query QP, so that might be unlikely. However, libibverbs seems to be looking at this field:

        init_attr->sq_sig_all = resp.sq_sig_all;

so it only seems consistent to fill this in. What do you think? Maybe it's better to remove it from libibverbs?

What about other stuff?

--
MST
Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support
> Yea I had the same question. Shouldn't interface expose
> just the specific queries that we need?

I don't know what queries a user will want, and I'd rather not change the kernel ABI with every new query, but that is a possibility. Which queries are of concern?

- Sean
Re: [openib-general] Rollup patch for ipath and OFED
Quoting r. Greg Lindahl <[EMAIL PROTECTED]>:
> Subject: Re: Rollup patch for ipath and OFED
>
> On Wed, Aug 23, 2006 at 06:01:32PM +0300, Michael S. Tsirkin wrote:
>
> > So this seems to be ripping out chunks of upstream code (ipath_ht400)
> > replacing them with something else (ipath_iba6110, ipath_iba6120.o)
>
> To answer this piece of the question, we were acquired last April, and
> of course we have to rename all our devices. Since patch doesn't have
> a rename feature, this looks much worse than it really is.

Fine, but I wonder why rush this cosmetic change for OFED 1.1? Anyway, is it right that there's basically just the mmap enhancement plus a lot of file renames?

--
MST
Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp
    Michael> The other four bullets make sense however, do they not?

I guess so, although I wonder if anyone will ever care about the sq_sig_type() field.

 - R.
Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support
    Sean> The ibv_sa_send_mad() routine can only be used to issue the
    Sean> following methods:
    Sean> GET, SEND, GET_TABLE, GET_MULTI, and GET_TRACE_TABLE

OK, I missed that -- it's kind of hidden inside is_send_req().

 - R.
Re: [openib-general] [PATCHv2] IB/ipath - performance improvements via mmap of queues
Thanks, queued for 2.6.19
Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] libibsa: userspace SA query and multicast support
>
> What's the plan for how this would be used? We can't let unprivileged
> userspace processes talk to the SA, because they could cause problems
> like deleting someone else's multicast group membership. And I don't
> think we want to try to do some elaborate filtering in the kernel, do we?

Yea I had the same question. Shouldn't interface expose just the specific queries that we need?

--
MST
Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support
Roland Dreier wrote:
> What's the plan for how this would be used? We can't let unprivileged
> userspace processes talk to the SA, because they could cause problems
> like deleting someone else's multicast group membership. And I don't
> think we want to try to do some elaborate filtering in the kernel, do we?

The ibv_sa_send_mad() routine can only be used to issue the following methods:

GET, SEND, GET_TABLE, GET_MULTI, and GET_TRACE_TABLE

I do check for this in the kernel, but that is the extent of any filtering that's done. Multicast operations must go through the multicast join / free calls, which drop into the kernel to interface with the ib_multicast module. I would expect that other SET / DELETE type operations would be treated similar to how multicast is handled.

I'm expecting that the labs will use at least the multicast interfaces based on e-mail conversations, but without path record query support, the userspace CM interface isn't all that useful.

- Sean
Re: [openib-general] [PATCH 4/7] IB/ipath - performance improvements via mmap of queues
Thanks, applied and queued for 2.6.19.
Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] mthca: various bug fixes for mthca_query_qp
>
> > 5. Return the send_cq, receive cq and srq handles.
>
> I really disagree with this change.

The other four bullets make sense however, do they not?

--
MST
Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support
What's the plan for how this would be used? We can't let unprivileged userspace processes talk to the SA, because they could cause problems like deleting someone else's multicast group membership. And I don't think we want to try to do some elaborate filtering in the kernel, do we?

 - R.
Re: [openib-general] [PATCH] IB/mthca: update latest firmware revisions
> Please consider the following for 2.6.18 - hopefully this will reduce
> the number of support requests from people with old Sinai firmware.

Makes sense -- this is the sort of thing we want as current as possible. Applied to svn and for-2.6.18
Re: [openib-general] question: ib_umem page_size
> > It gives the page size for the user memory described by the struct. > > The idea was that if/when someone tries to optimize for huge pages, > > then the low-level driver can know that a region is using huge pages > > without having to walk through the page list and search for the > > minimum physically contiguous size. > > Hmm, mthca_reg_user_mr seems to do: > > len = sg_dma_len(&chunk->page_list[j]) >> shift > > which means that dma_len must be a multiple of page size. > > Is this intentional? Yes, it's intentional I think. I'm probably missing something, but the upper layer has just told mthca_reg_user_mr() that the page size for this region is (1
Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp
> 5. Return the send_cq, receive cq and srq handles. ib_query_qp() needs them
>    (required by IB Spec). ibv_query_qp() overwrites these values in user-space
>    with appropriate user-space values.

> +        qp_init_attr->send_cq = ibqp->send_cq;
> +        qp_init_attr->recv_cq = ibqp->recv_cq;
> +        qp_init_attr->srq     = ibqp->srq;

I really disagree with this change. It's silly to do this copying since the consumer already has the ibqp pointer. And it's especially silly to put this in a low-level driver, since there's nothing device-specific about it.

 - R.
Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM
    Sean> This patch set appears to be the preferred approach. Any
    Sean> objection to committing this?

It's unfortunate that we have to add a special-case event hook for the CM, but I guess the iWARP CM changes are so ugly anyway it doesn't matter much. So I think committing this is OK.

 - R.
Re: [openib-general] basic IB doubt
On Wed, 2006-08-23 at 14:00 -0500, Tang, Changqing wrote:
> Thanks for all your replies. So my general question is, why only 4bytes
> immediate data can
> Generate completion event on B side, Why RDMA-write with any data size
> does not generate
> A completion event on B side? basic there are the same thing, the only
> different is, one
> Copy 4bytes to completion structure, the other copy all data to provided
> dest buffer.
>
> --CQ

This is just the way IB was defined. Both RDMA write and RDMA write with immediate transfer up to 2 Gigabytes of data to the destination address. The latter just signals node B that the operation is complete and the former doesn't. Node B can use the immediate data as a hint to which RDMA operation just completed.
Re: [openib-general] [PATCH] opensm: option to limit size of OpenSM log file
Hi Hal, On 01:34 Wed 23 Aug , Sasha Khapyorsky wrote: > On 18:26 Tue 22 Aug , Hal Rosenstock wrote: > > On Tue, 2006-08-22 at 18:22, Sasha Khapyorsky wrote: > > > On 17:54 Tue 22 Aug , Hal Rosenstock wrote: > > > > Hi Sasha, > > > > > > > > On Tue, 2006-08-22 at 17:18, Sasha Khapyorsky wrote: > > > > > Hi Hal, > > > > > > > > > > There is new option which specified max size of OpenSM log file. The > > > > > default is '0' (not-limited). Please note osm_log_init() has new > > > > > parameter now. > > > > > > > > So libopensm.ver needs to be bumped (and this is not backward > > > > compatible). > > > > > > We may. I'm not sure it is necessary - in this patch I've changed all > > > occurrences of osm_log_init() under osm/ (in opensm and osmtest). So > > > this can be important only if there are osm_log "external" users. > > > > There may be so I will do this. > > Ok. Thanks. Found one more osm_log_init(). There is. Sasha Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]> diff --git a/diags/src/saquery.c b/diags/src/saquery.c index f0d1299..0bb46be 100644 --- a/diags/src/saquery.c +++ b/diags/src/saquery.c @@ -443,7 +443,7 @@ get_bind_handle(void) osm_log_construct(&log_osm); if ((status = osm_log_init( &log_osm, TRUE, - 0x0001, NULL, TRUE )) != IB_SUCCESS) { + 0x0001, NULL, 0, TRUE )) != IB_SUCCESS) { fprintf(stderr, "Failed to init osm_log: %s\n", ib_get_err_str(status)); exit (-1); > > Sasha > > > > > -- Hal > > > > > > > > ___ > openib-general mailing list > openib-general@openib.org > http://openib.org/mailman/listinfo/openib-general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] basic IB doubt
Thanks for all your replies. So my general question is: why can only 4 bytes of immediate data generate a completion event on B's side? Why does an RDMA write of any data size not generate a completion event on B's side? Basically they are the same thing; the only difference is that one copies 4 bytes to the completion structure and the other copies all the data to the provided destination buffer. --CQ >-Original Message- >From: Ralph Campbell [mailto:[EMAIL PROTECTED] >Sent: Wednesday, August 23, 2006 1:16 PM >To: Tang, Changqing >Cc: Caitlin Bestler; openib-general@openib.org >Subject: RE: [openib-general] basic IB doubt > >On Wed, 2006-08-23 at 12:28 -0500, Tang, Changqing wrote: >> > >> >Actually, A knows the data is in B's memory when A gets the >> >completion notice. B can't rely on anything unless A uses the RDMA >> >write with immediate which puts a completion event in B's CQ. >> >> Ralph: >> >> Can you give a few more words on 'immediate', I know A will have A >> completion event in its CQ, Does B receive a CQ event on the >Same RDMA >> operation as well ? >> >> --CQ Tang > >B doesn't get a completion event for a RDMA write initiated >from A unless A does something like the following: > > struct ib_send_wr wr; > > wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM; > wr.imm_data = cpu_to_be32(value); > ... > ib_post_send(qp, &wr, NULL); > >B will get a completion event with the IB_WC_WITH_IMM flag set >in struct ib_wc.wc_flags and ib_wc.imm_data set to the value >that A sent.
Re: [openib-general] basic IB doubt
On Wed, 2006-08-23 at 12:28 -0500, Tang, Changqing wrote: > > > >Actually, A knows the data is in B's memory when A gets the > >completion notice. B can't rely on anything unless A uses the > >RDMA write with immediate which puts a completion event in B's CQ. > > Ralph: > > Can you give a few more words on 'immediate', I know A will have > A completion event in its CQ, Does B receive a CQ event on the > Same RDMA operation as well ? > > --CQ Tang B doesn't get a completion event for a RDMA write initiated from A unless A does something like the following: struct ib_send_wr wr; wr.opcode = IB_WR_RDMA_WRITE_WITH_IMM; wr.imm_data = cpu_to_be32(value); ... ib_post_send(qp, &wr, NULL); B will get a completion event with the IB_WC_WITH_IMM flag set in struct ib_wc.wc_flags and ib_wc.imm_data set to the value that A sent. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Rollup patch for ipath and OFED
Bryan O'Sullivan wrote: > SVN is not a high priority for me personally. Fixing things so that I > can send meaningful patches upstream in a timely manner is. Why not remove your code from SVN? - Sean
Re: [openib-general] Rollup patch for ipath and OFED
On Wed, 2006-08-23 at 10:58 -0700, Woodruff, Robert J wrote: > I hate to tell you I told you so, but this is exactly why you guys > should not be off working behind closed doors and then submit some > mongo patch. That's precisely what I'm working to avoid. It's not as if I didn't know this. > If you would just submit the code as you go in smaller > patches and check in the smaller patches daily to SVN, we would not > have such an integration mess at the end. SVN is not a high priority for me personally. Fixing things so that I can send meaningful patches upstream in a timely manner is.
Re: [openib-general] Rollup patch for ipath and OFED
Bryan wrote, >On Wed, 2006-08-23 at 18:01 +0300, Michael S. Tsirkin wrote: >> Guys, I just looked at ipath-fixes.patch in ofed. With 36 files changed, 4623 >> insertions, 4774 deletions, it's quite a biggie with no description what it does >> whatsoever. Can't this be split to smaller chunks doing one thing at a time, >> please? You'll have to do this for upstream inclusion anyway, so why not for >> OFED? >We were in a rush to get a working patch out, is all. I've been >splitting that monster up into sensibly-sized chunks in the usual way. >
Re: [openib-general] basic IB doubt
Tang, Changqing wrote: > Can you give a few more words on 'immediate', I know A will have > A completion event in its CQ, Does B receive a CQ event on the > Same RDMA operation as well ? He means an RDMA write with immediate data. B will see a completion event for that operation. - Sean
Re: [openib-general] basic IB doubt
> >Actually, A knows the data is in B's memory when A gets the >completion notice. B can't rely on anything unless A uses the >RDMA write with immediate which puts a completion event in B's CQ. Ralph: Can you give a few more words on 'immediate', I know A will have A completion event in its CQ, Does B receive a CQ event on the Same RDMA operation as well ? --CQ Tang >Most applications on B ignore this requirement and test for >the last memory location being modified which usually works >but doesn't guarantee that all the data is in memory. > > >___ >openib-general mailing list >openib-general@openib.org >http://openib.org/mailman/listinfo/openib-general > >To unsubscribe, please visit >http://openib.org/mailman/listinfo/openib-general > > ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] OFED 1.1-rc2 is ready
Woody wrote, >Ok, I was able to install the RC2 on EL4-U3 and get Intel MPI working on uDAPL. >Did have one issue with the install that maybe you could fix for the next >RC. It appears that the rdma_ucm and rdma_cm are not being loaded at startup >time and I had to manually modprobe rdma_ucm, after that, Intel MPI and uDAPL >seemed to work fine with the initial tests I have done so far, will continue >to stress test it and let you know if we see any other issues. >woody It appears that what is happening on this one is that the /etc/init.d/openibd script is failing because the ipath driver is not loading, which is expected in rc2 as their latest patches are not in rc2. If I disable loading of the ipath driver in /etc/infiniband/openib.conf, the script continues and loads rdma_cm and rdma_ucm. The question is: if a driver like the ipath driver fails to load, should the script go on anyway and load the other drivers like rdma_cm/rdma_ucm, or is it best to leave it as it is? Anyway, once the ipath driver is fixed, this issue should go away. woody
Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM
Sean Hefty wrote: > The following set of patches forwards communication related events to the IB > CM > for processing. Communication events of interest are communication > established > and path migration, with only the former currently handled by the IB CM. > > This removes the need for users to trap for these events and pass the > information onto IB CM. Communication established events can be handled by > the > ib_cm_establish() routine, but no mechanism exists to notify the IB CM of path > migration. This adds the framework for doing so. > > Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> Based on feedback from Or and Michael: http://openib.org/pipermail/openib-general/2006-August/025320.html http://openib.org/pipermail/openib-general/2006-August/025306.html This patch set appears to be the preferred approach. Any objection to committing this? - Sean
Re: [openib-general] basic IB doubt
On Wed, 2006-08-23 at 09:47 -0700, Caitlin Bestler wrote: > [EMAIL PROTECTED] wrote: > > Quoting r. john t <[EMAIL PROTECTED]>: > >> Subject: basic IB doubt > >> > >> Hi > >> > >> I have a very basic doubt. Suppose Host A is doing RDMA write (say 8 > >> MB) to Host B. When data is copied into Host B's local > > buffer, is it guaranteed that data will be copied starting > > from the first location (first buffer address) to the last > > location (last buffer address)? or it could be in any order? > > > > Once B gets a completion (e.g. of a subsequent send), data in > > its buffer matches that of A, byte for byte. > > An excellent and concise answer. That is exactly what the application > should rely upon, and nothing else. With iWARP this is very explicit, > because portions of the message not only MAY be placed out of > order, they SHOULD be when packets have been re-ordered by the > network. But for *any* RDMA adapter there is no guarantee on > what order the adapter flushes things to host memory or particularly > when old contents that may be cached are invalidated or updated. > The role of the completion is to limit the frequency with which > the RDMA adapter MUST guarantee coherency with application visible > buffers. The completion not only indicates that the entire message > was received, but that it has been entirely delivered to host memory. Actually, A knows the data is in B's memory when A gets the completion notice. B can't rely on anything unless A uses the RDMA write with immediate which puts a completion event in B's CQ. Most applications on B ignore this requirement and test for the last memory location being modified which usually works but doesn't guarantee that all the data is in memory. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] osm: fix memory leak in vendor ibumad
Hi Eitan, > Who is freeing the request MAD? > If it is NULL then the flow aborts earlier. Sorry; My bad :-( -- Hal ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs
Or Gerlitz wrote: > I have tested the patch against an iser target based on our Gen1 CM - > it works as expected. This has been committed in 9088. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] basic IB doubt
[EMAIL PROTECTED] wrote: > Quoting r. john t <[EMAIL PROTECTED]>: >> Subject: basic IB doubt >> >> Hi >> >> I have a very basic doubt. Suppose Host A is doing RDMA write (say 8 >> MB) to Host B. When data is copied into Host B's local > buffer, is it guaranteed that data will be copied starting > from the first location (first buffer address) to the last > location (last buffer address)? or it could be in any order? > > Once B gets a completion (e.g. of a subsequent send), data in > its buffer matches that of A, byte for byte. An excellent and concise answer. That is exactly what the application should rely upon, and nothing else. With iWARP this is very explicit, because portions of the message not only MAY be placed out of order, they SHOULD be when packets have been re-ordered by the network. But for *any* RDMA adapter there is no guarantee on what order the adapter flushes things to host memory or particularly when old contents that may be cached are invalidated or updated. The role of the completion is to limit the frequency with which the RDMA adapter MUST guarantee coherency with application visible buffers. The completion not only indicates that the entire message was received, but that it has been entirely delivered to host memory. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs
Michael S. Tsirkin wrote: > so I am wondering why is it not sufficient to wait for > the window of time as described above to expire? > Is something broken in CM that this patch is papering over? Yes. There are a couple of issues. The CM doesn't time when a REQ was received, and the local comm IDs need rework. This is a fix that should avoid the issues most of the time. There's also an issue that a user can allocate a QP that's likely to be in time wait. - Sean
Re: [openib-general] InfiniBand merge plans for 2.6.19
Or Gerlitz wrote: > OK. Now, if this (RC, UD, MCAST) turns to be too much for your > schedule before 2.6.19 opens up, does it make sense for you to push a > char device which supports only the CMA RC functionality and the UD > and multicast in the future? Yes - and the fact that I can pull the OFED code for this helps. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Rollup patch for ipath and OFED
On Wed, Aug 23, 2006 at 06:01:32PM +0300, Michael S. Tsirkin wrote: > So this seems to be ripping out chunks of upstream code (ipath_ht400) > replacing them with something else (ipath_iba6110, ipath_iba6120.o) To answer this piece of the question, we were acquired last April, and of course we have to rename all our devices. Since patch doesn't have a rename feature, this looks much worse than it really is. -- greg ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] basic IB doubt
john t wrote: > I have a very basic doubt. Suppose Host A is doing RDMA write (say 8 > MB) to Host B. When data is copied into Host B's local buffer, is it > guaranteed that data will be copied starting from the first location > (first buffer address) to the last location (last buffer address)? or it > could be in any order? I don't believe that there is any ordering guarantee by the architecture. However, specific adapters may behave this way, and I've seen applications make use of this by polling the last memory byte for a completion, for example. > Also does this transfer involve DMA engine (residing on HCAs) of both > the hosts or just one host? If I'm understanding the question correctly, the DMA engines of both HCAs are used. - Sean
[openib-general] OFED 1.1-rc2 bug - Could not read ABI version
With the OFED1.1-rc2 when I run the RDMA CM on RedHat EL4 - Update 3 I get the following warning. The application seems to work OK, but the warnings are concerning. librdmacm: couldn't read ABI version. librdmacm: assuming: 1 It appears that the backport of the kernel module does not create the abi_version in /sys/class/misc as the userspace code expects. I suggest either fixing the backport patch to create the abi_version or, to avoid confusion, removing the check or the warning message from the userspace code if the ABI actually matches the kernel code. woody
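For illustration only, the lookup behaviour discussed here (try to read abi_version, fall back to an assumed value when it cannot be read) could be sketched as below. This is not the librdmacm source: the function name, the idea of passing a list of candidate paths, and the fallback parameter are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the first ABI version that can be read and
 * parsed from the candidate sysfs paths, or `fallback` if none of them
 * yields an integer. The caller decides which paths to try and in what
 * order, and whether to print a warning when the fallback is used.
 */
static int read_abi_version(const char *const *paths, size_t npaths,
                            int fallback)
{
    for (size_t i = 0; i < npaths; i++) {
        FILE *f = fopen(paths[i], "r");
        int version;
        int ok;

        if (!f)
            continue;               /* file missing: try the next path */
        ok = (fscanf(f, "%d", &version) == 1);
        fclose(f);
        if (ok)
            return version;
    }
    return fallback;                /* nothing readable: assume a version */
}
```

A caller would pass whatever locations the kernel side might use on a given distro, so the same userspace binary can cope with a backport that moved the file.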
[openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp
Fixed various bugs in mthca_query_qp: 1. correct port_num was not being returned for unconnected QPs. 2. Incorrect number of bits was taken for static_rate field. 3. When default static rate is returned for Tavor, forgot to translate it to an ib rate value. 4. Return sq signalling attribute in query-qp. 5. Return the send_cq, receive cq and srq handles. ib_query_qp() needs them (required by IB Spec). ibv_query_qp() overwrites these values in user-space with appropriate user-space values. Signed-off-by: Jack Morgenstein <[EMAIL PROTECTED]> Index: ofed_1_1/drivers/infiniband/hw/mthca/mthca_qp.c === --- ofed_1_1.orig/drivers/infiniband/hw/mthca/mthca_qp.c2006-08-23 10:33:04.0 +0300 +++ ofed_1_1/drivers/infiniband/hw/mthca/mthca_qp.c 2006-08-23 18:46:08.330885000 +0300 @@ -391,6 +391,12 @@ static int to_ib_qp_access_flags(int mth return ib_flags; } +static enum ib_sig_type to_ib_qp_sq_signal(int params1) +{ + return (params1 & MTHCA_QP_BIT_SSC) ? + IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; +} + static void to_ib_ah_attr(struct mthca_dev *dev, struct ib_ah_attr *ib_ah_attr, struct mthca_qp_path *path) { @@ -404,7 +410,7 @@ static void to_ib_ah_attr(struct mthca_d ib_ah_attr->sl= be32_to_cpu(path->sl_tclass_flowlabel) >> 28; ib_ah_attr->src_path_bits = path->g_mylmc & 0x7f; ib_ah_attr->static_rate = mthca_rate_to_ib(dev, -path->static_rate & 0x7, +path->static_rate & 0xf, ib_ah_attr->port_num); ib_ah_attr->ah_flags = (path->g_mylmc & (1 << 7)) ? 
IB_AH_GRH : 0; if (ib_ah_attr->ah_flags) { @@ -468,10 +474,14 @@ int mthca_query_qp(struct ib_qp *ibqp, s if (qp->transport == RC || qp->transport == UC) { to_ib_ah_attr(dev, &qp_attr->ah_attr, &context->pri_path); to_ib_ah_attr(dev, &qp_attr->alt_ah_attr, &context->alt_path); + qp_attr->alt_pkey_index = + be32_to_cpu(context->alt_path.port_pkey) & 0x7f; + qp_attr->alt_port_num = qp_attr->alt_ah_attr.port_num; } - qp_attr->pkey_index = be32_to_cpu(context->pri_path.port_pkey) & 0x7f; - qp_attr->alt_pkey_index = be32_to_cpu(context->alt_path.port_pkey) & 0x7f; + qp_attr->pkey_index = be32_to_cpu(context->pri_path.port_pkey) & 0x7f; + qp_attr->port_num = + (be32_to_cpu(context->pri_path.port_pkey) >> 24) & 0x3; /* qp_attr->en_sqd_async_notify is only applicable in modify qp */ qp_attr->sq_draining = mthca_state == MTHCA_QP_STATE_DRAINING; @@ -482,13 +492,16 @@ int mthca_query_qp(struct ib_qp *ibqp, s 1 << ((be32_to_cpu(context->params2) >> 21) & 0x7); qp_attr->min_rnr_timer = (be32_to_cpu(context->rnr_nextrecvpsn) >> 24) & 0x1f; - qp_attr->port_num = qp_attr->ah_attr.port_num; qp_attr->timeout= context->pri_path.ackto >> 3; qp_attr->retry_cnt = (be32_to_cpu(context->params1) >> 16) & 0x7; qp_attr->rnr_retry = context->pri_path.rnr_retry >> 5; - qp_attr->alt_port_num = qp_attr->alt_ah_attr.port_num; qp_attr->alt_timeout= context->alt_path.ackto >> 3; qp_init_attr->cap = qp_attr->cap; + qp_init_attr->sq_sig_type = + to_ib_qp_sq_signal(be32_to_cpu(context->params1)); + qp_init_attr->send_cq = ibqp->send_cq; + qp_init_attr->recv_cq = ibqp->recv_cq; + qp_init_attr->srq = ibqp->srq; out: mthca_free_mailbox(dev, mailbox); Index: ofed_1_1/drivers/infiniband/hw/mthca/mthca_av.c === --- ofed_1_1.orig/drivers/infiniband/hw/mthca/mthca_av.c2006-08-03 14:30:21.0 +0300 +++ ofed_1_1/drivers/infiniband/hw/mthca/mthca_av.c 2006-08-23 17:53:01.22722 +0300 @@ -90,7 +90,7 @@ static enum ib_rate tavor_rate_to_ib(u8 case MTHCA_RATE_TAVOR_1X: return IB_RATE_2_5_GBPS; case 
MTHCA_RATE_TAVOR_1X_DDR: return IB_RATE_5_GBPS; case MTHCA_RATE_TAVOR_4X: return IB_RATE_10_GBPS; - default: return port_rate; + default: return mult_to_ib_rate(port_rate); } } ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
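For anyone following the masks in the patch above, here is a small standalone sketch (not driver code) of the field extraction it performs. The helper names are hypothetical; the masks and the shift are the ones that appear in the patch, applied to values that have already been converted to host byte order.

```c
#include <stdint.h>

/* Low 7 bits of the port_pkey word carry the P_Key index. */
static unsigned int pkey_index_of(uint32_t port_pkey)
{
    return port_pkey & 0x7f;
}

/* The port number sits in bits 25..24 of the same word. */
static unsigned int port_num_of(uint32_t port_pkey)
{
    return (port_pkey >> 24) & 0x3;
}

/*
 * static_rate is taken as a 4-bit field after the fix. With the old
 * 3-bit mask (0x7) a raw value of 0x9 would have been read as 0x1.
 */
static unsigned int static_rate_of(uint8_t raw)
{
    return raw & 0xf;
}
```

Reading port_num straight from the context word, instead of copying it from ah_attr, is what lets the value come back correctly for QPs that are not RC or UC.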
Re: [openib-general] [PATCH] osm: fix memory leak in vendor ibumad
Who is freeing the request MAD? If it is NULL then the flow aborts earlier. > -Original Message- > From: Hal Rosenstock [mailto:[EMAIL PROTECTED] > Sent: Wednesday, August 23, 2006 6:35 PM > To: Eitan Zahavi > Cc: openib-general@openib.org > Subject: Re: [PATCH] osm: fix memory leak in vendor ibumad > > Hi Eitan, > > I'm not getting openib-general email today but got this off a web site: > > Index: libvendor/osm_vendor_ibumad_sa.c > > === > --- libvendor/osm_vendor_ibumad_sa.c (revision 9087) > +++ libvendor/osm_vendor_ibumad_sa.c (working copy) > @@ -180,6 +180,11 @@ __osmv_sa_mad_rcv_cb( > >/* free the copied query request if found */ >if (p_query_req_copy) free(p_query_req_copy); > + > + /* put back the request madw */ > + if (p_req_madw) > +osm_mad_pool_put(p_bind->p_mad_pool, p_req_madw); > + >OSM_LOG_EXIT( p_bind->p_log ); > } > > There's an additional minor change needed to this routine as there is a case > where the request MAD is already free'd. > > I'm in the process of looking at the osm_vendor_ibumad.c change too. > > -- Hal ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] osm: fix memory leak in vendor ibumad
Hi Eitan, I'm not getting openib-general email today but got this off a web site: Index: libvendor/osm_vendor_ibumad_sa.c === --- libvendor/osm_vendor_ibumad_sa.c(revision 9087) +++ libvendor/osm_vendor_ibumad_sa.c(working copy) @@ -180,6 +180,11 @@ __osmv_sa_mad_rcv_cb( /* free the copied query request if found */ if (p_query_req_copy) free(p_query_req_copy); + + /* put back the request madw */ + if (p_req_madw) +osm_mad_pool_put(p_bind->p_mad_pool, p_req_madw); + OSM_LOG_EXIT( p_bind->p_log ); } There's an additional minor change needed to this routine as there is a case where the request MAD is already free'd. I'm in the process of looking at the osm_vendor_ibumad.c change too. -- Hal ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Rollup patch for ipath and OFED
On Wed, 2006-08-23 at 18:01 +0300, Michael S. Tsirkin wrote: > Guys, I just looked at ipath-fixes.patch in ofed. With 36 files changed, 4623 > insertions, 4774 deletions, it's quite a biggie with no description what it > does > whatsoever. Can't this be split to smaller chunks doing one thing at a time, > please? You'll have to do this for upstream inclusion anyway, so why not for > OFED? We were in a rush to get a working patch out, is all. I've been splitting that monster up into sensibly-sized chunks in the usual way.
[openib-general] Rollup patch for ipath and OFED
Guys, I just looked at ipath-fixes.patch in ofed. With 36 files changed, 4623 insertions, 4774 deletions, it's quite a biggie with no description what it does whatsoever. Can't this be split to smaller chunks doing one thing at a time, please? You'll have to do this for upstream inclusion anyway, so why not for OFED? Oh well. However, this baby also does for example: diff -r d2661c9eff49 -r 198ed6310295 drivers/infiniband/hw/ipath/Makefile --- a/drivers/infiniband/hw/ipath/Makefile Thu Aug 10 16:19:45 2006 -0700 +++ b/drivers/infiniband/hw/ipath/Makefile Wed Aug 16 11:01:29 2006 -0700 @@ -1,36 +1,34 @@ EXTRA_CFLAGS += -DIPATH_IDSTR='"QLogic k EXTRA_CFLAGS += -DIPATH_IDSTR='"QLogic kernel.org driver"' \ -DIPATH_KERN_TYPE=0 -obj-$(CONFIG_IPATH_CORE) += ipath_core.o obj-$(CONFIG_INFINIBAND_IPATH) += ib_ipath.o -ipath_core-y := \ +ib_ipath-y := \ + ipath_cq.o \ ipath_diag.o \ ipath_driver.o \ ipath_eeprom.o \ ipath_file_ops.o \ ipath_fs.o \ - ipath_ht400.o \ + ipath_iba6110.o \ + ipath_iba6120.o \ ipath_init_chip.o \ ipath_intr.o \ + ipath_keys.o \ ipath_layer.o \ - ipath_pe800.o \ - ipath_stats.o \ - ipath_sysfs.o \ - ipath_user_pages.o - -ipath_core-$(CONFIG_X86_64) += ipath_wc_x86_64.o - -ib_ipath-y := \ - ipath_cq.o \ - ipath_keys.o \ ipath_mad.o \ ipath_mr.o \ ipath_qp.o \ ipath_rc.o \ ipath_ruc.o \ ipath_srq.o \ + ipath_stats.o \ + ipath_sysfs.o \ ipath_uc.o \ ipath_ud.o \ - ipath_verbs.o \ - ipath_verbs_mcast.o + ipath_user_pages.o \ + ipath_verbs_mcast.o \ + ipath_verbs.o + +ib_ipath-$(CONFIG_X86_64) += ipath_wc_x86_64.o +ib_ipath-$(CONFIG_PPC64) += ipath_wc_ppc64.o So this seems to be ripping out chunks of upstream code (ipath_ht400) replacing them with something else (ipath_iba6110, ipath_iba6120.o) This might be a good change for all I know. But I'd like to ask What exactly does this fixes patch, fix? Can there be a list of things it provides at the top? How about a Signed-off-by line? Was this posted on openib-general even once? 
There's a single unmerged ipath patch posted on openib-general: mmap()ed userspace work queues for ipath. So where does the rest come from? Googling for ipath_iba6110 gets no hits. Thanks, -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs
Quoting r. Or Gerlitz <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] ib_cm: randomize starting local comm IDs > > On 8/23/06, Michael S. Tsirkin <[EMAIL PROTECTED]> wrote: > > Quoting r. Or Gerlitz <[EMAIL PROTECTED]>: > > > > So the CM at the target side rejects the first REQ after the client > > > reboot with STALE reason (and delivers a disconnect event to the > > > ULP). The second REQ is processed fine and a new connection is > > > established. > > > Hmm. Might this still be a concern for users such as SDP > > which don't retry connections? > > I don't know if "this" in your email refers to the quote above I am speaking more or less about this quote from your message: > > Without the patch, since the REQ had the same local comm ID as that of an > > existing connection, it was just silently dropped and a target reboot > > was a must to let the initiator reconnect ! the spec says: > > If a CM receives a REQ/REP as described above, if the REQ/REP has the > > same Local Communication ID and Remote Communication ID as are present > > in the existing connection and if the REQ/REP arrives within the window > > of time during which the active/passive side could be legally > > retransmitting REQ/REP, the CM should treat the REQ/REP as a retry and > > not initiate stale connection processing as described above. so I am wondering why is it not sufficient to wait for the window of time as described above to expire? Is something broken in CM that this patch is papering over? > but what > I describe there is what is stated in the IB spec ch. 12 re stale connections. I know. I am just wondering aloud whether this is relevant for SDP. Why won't the window expire as described above? > So you need to either rely on the SDP consumer to reconnect or when > getting a STALE reject attempt to reconnect from SDP. I'm not sure SDP needs to do anything - the port is busy, after all. Retrying seems to be against the SDP spec. -- MST
Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs
On 8/23/06, Michael S. Tsirkin <[EMAIL PROTECTED]> wrote: > Quoting r. Or Gerlitz <[EMAIL PROTECTED]>: > > So the CM at the target side rejects the first REQ after the client > > reboot with STALE reason (and delivers a disconnect event to the > > ULP). The second REQ is processed fine and a new connection is > > established. > Hmm. Might this still be a concern for users such as SDP > which don't retry connections? I don't know if "this" in your email refers to the quote above, but what I describe there is what is stated in the IB spec ch. 12 re stale connections. So you need to either rely on the SDP consumer to reconnect or, when getting a STALE reject, attempt to reconnect from SDP. Or.
Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs
Sean Hefty wrote: > Randomize the starting local comm ID to avoid getting a rejected connection > due to a stale connection after a system reboot or reloading of the ib_cm. Hi Sean, As I wrote you, with the CM patched it works fine and the problem I could reproduce with our iser target is solved. However, we wonder what your opinion is (and, if positive, your work estimate...) on making the CM get "GID OUT" traps, which are generated by the SM when a node's IB restarts (e.g. a node reboot). Once the CM gets the trap, it can scan the internal data structures and emulate a disconnect for all relevant (*) connections ??? (*) there are some technical issues here: the GID OUT is on a PORT GID and the CM uses NODE GUIDS; also, does the openib stack have the means for a client to register for **traps** ??? Other than the CM there might be more potential consumers of this trap, and also of "GID IN". Or.
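As a rough illustration of the idea in the patch description, one simple way to randomize the starting local comm ID is to seed the ID counter with a random value instead of zero, so that a node that has just rebooted is unlikely to hand out the same IDs it used before. The sketch below uses hypothetical names and is not the ib_cm code; the caller is assumed to obtain `random_start` from a real entropy source, and a real CM would additionally have to skip IDs that are still in use.

```c
#include <stdint.h>

/* Hypothetical allocator state: just the next ID to hand out. */
struct comm_id_alloc {
    uint32_t next;
};

/* Seed the allocator. `random_start` should come from an entropy source. */
static void comm_id_init(struct comm_id_alloc *a, uint32_t random_start)
{
    a->next = random_start;
}

/*
 * Hand out consecutive IDs from the random starting point. Unsigned
 * arithmetic wraps around at 2^32, which is well defined in C.
 */
static uint32_t comm_id_get(struct comm_id_alloc *a)
{
    return a->next++;
}
```

Two boots that start from different random values produce disjoint runs of IDs for a long time, which is what keeps a fresh REQ from matching an entry the peer still holds for the old connection.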
Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs
Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] ib_cm: randomize starting local comm IDs
>
> On 8/22/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > Randomize the starting local comm ID to avoid getting a rejected connection
> > due to a stale connection after a system reboot or reloading of the ib_cm.
>
> Hi Sean,
>
> I have tested the patch against an iser target based on our Gen1 CM -
> it works as expected.
>
> So the CM at the target side rejects the first REQ after the client
> reboot with STALE reason (and delivers a disconnect event to the
> ULP). The second REQ is processed fine and a new connection is
> established.
>
> Without the patch, since the REQ had the same comm ID as that of an
> existing connection, it was just silently dropped and a target reboot
> was a must to let the initiator reconnect!
>
> Or.

Hmm. Might this still be a concern for users such as SDP which don't retry connections?

--
MST
[openib-general] [PATCH] osm: fix memory leak in vendor ibumad
Hi Hal

These are two trivial fixes for memory leaks in the ibumad vendor.

Thanks

Eitan

Signed-off-by: Eitan Zahavi <[EMAIL PROTECTED]>

Index: libvendor/osm_vendor_ibumad_sa.c
===
--- libvendor/osm_vendor_ibumad_sa.c (revision 9087)
+++ libvendor/osm_vendor_ibumad_sa.c (working copy)
@@ -180,6 +180,11 @@ __osmv_sa_mad_rcv_cb(
   /* free the copied query request if found */
   if (p_query_req_copy)
     free(p_query_req_copy);
+
+  /* put back the request madw */
+  if (p_req_madw)
+    osm_mad_pool_put(p_bind->p_mad_pool, p_req_madw);
+
   OSM_LOG_EXIT( p_bind->p_log );
 }
Index: libvendor/osm_vendor_ibumad.c
===
--- libvendor/osm_vendor_ibumad.c (revision 9087)
+++ libvendor/osm_vendor_ibumad.c (working copy)
@@ -617,6 +617,7 @@ osm_vendor_get_all_port_attr(
       *p_lid = ca.ports[j]->base_lid;
       *p_linkstates = ca.ports[j]->state;
       *p_portnum = ca.ports[j]->portnum;
+      free(ca.ports[j]);
     }
     p_lid++;
     p_linkstates++;
Re: [openib-general] InfiniBand merge plans for 2.6.19
On 8/22/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> >What about pushing the char device to support user space CMA, i recall
> >that you have mentioned the API was not mature enough when the 2.6.18
> >feature merge window was open.
>
> I will look at doing this. I need to verify what functionality (RC, UD,
> multicast) of the kernel RDMA CM we want merged upstream for 2.6.19 and
> create a patch for exposing that to userspace.

OK. Now, if this (RC, UD, MCAST) turns out to be too much for your schedule before 2.6.19 opens up, does it make sense for you to push a char device which supports only the CMA RC functionality, and add UD and multicast in the future?

Or.
Re: [openib-general] IB CM and the case of the lost RTU: was a bunch of other topics...
On 8/23/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> To recap (since it's been a couple of weeks), we have two general solutions
> for how to support the passive/server/target side of a connection:
>
> 1. One method requires that the passive side queue send WRs until they get a
> connection establish event.
>
> 2. An alternative allows sending immediately after receiving a response, but
> may require the user to manually transition the connection to established.
> Failure to do so will cause the connection to tear down if the RTU is never
> received (even after retries).

The Voltaire iSER target implementation follows a variant of the first method, namely it defers RX processing until it gets a connection established event. In the current product this is ensured by the Gen1 Voltaire CM riding on the IB comm_established async event and synthesizing an RTU. It would be the same in the iser target we code over the Gen2 stack (CMA) if the first method is implemented.

As there is typically some ULP-level handshake when a connection starts, there would be very little to queue (e.g. in iSER it's the login-request).

Or.
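A minimal user-space sketch of the first method from the recap (queue send WRs on the passive side until the connection established event arrives), under stated assumptions. The type `passive_conn`, the functions `conn_send` and `conn_on_established`, and the fixed queue depth are all hypothetical and exist only for illustration; this is not the CM or CMA API, and work requests are reduced to plain integer IDs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WRS 8   /* hypothetical queue depth, kept small on purpose */

/* Hypothetical model of the passive side of one connection. */
struct passive_conn {
    int    established;        /* set once the established event is seen */
    int    pending[MAX_WRS];   /* send WR IDs deferred before that event */
    size_t n_pending;
    int    posted[MAX_WRS];    /* send WR IDs actually handed to the QP */
    size_t n_posted;
};

static void conn_init(struct passive_conn *c)
{
    assert(c != NULL);
    c->established = 0;
    c->n_pending = 0;
    c->n_posted = 0;
}

/* Stand-in for posting a send WR to the QP. */
static int post_now(struct passive_conn *c, int wr_id)
{
    if (c->n_posted == MAX_WRS)
        return -1;
    c->posted[c->n_posted++] = wr_id;
    return 0;
}

/* Post immediately if established, otherwise defer. */
static int conn_send(struct passive_conn *c, int wr_id)
{
    assert(c != NULL);
    if (c->established)
        return post_now(c, wr_id);
    if (c->n_pending == MAX_WRS)
        return -1;              /* queue full: caller must back off */
    c->pending[c->n_pending++] = wr_id;
    return 0;
}

/* Called on the connection established event: flush in FIFO order. */
static int conn_on_established(struct passive_conn *c)
{
    size_t i;

    assert(c != NULL);
    c->established = 1;
    for (i = 0; i < c->n_pending; i++)
        if (post_now(c, c->pending[i]) != 0)
            return -1;
    c->n_pending = 0;
    return 0;
}
```

With a ULP-level handshake at connection start, the deferred queue would usually hold very few entries, which is why a small fixed depth is used in the sketch.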
Re: [openib-general] CONFIG_INFINIBAND_ADDR_TRANS
On 8/23/06, john t <[EMAIL PROTECTED]> wrote:
> What does the config option
> "CONFIG_INFINIBAND_ADDR_TRANS=y" indicate? The code does
> not seem to use this.

No, the code does use it to build and install the rdma_cm (CMA) and the ib_addr modules; see linux-2.6.18-rc4/drivers/infiniband/core/Makefile:

infiniband-$(CONFIG_INFINIBAND_ADDR_TRANS) := ib_addr.o rdma_cm.o

However, I find the config name quite confusing...

Or.
Re: [openib-general] CONFIG_INFINIBAND_ADDR_TRANS
Quoting r. john t <[EMAIL PROTECTED]>:
> Subject: CONFIG_INFINIBAND_ADDR_TRANS
>
> Hi,
>
> What does the config option "CONFIG_INFINIBAND_ADDR_TRANS=y" indicate? The
> code does not seem to use this.
>
> Regards,
> John T.

Builds CMA modules.

--
MST
Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs
On 8/22/06, Sean Hefty <[EMAIL PROTECTED]> wrote:
> Randomize the starting local comm ID to avoid getting a rejected connection
> due to a stale connection after a system reboot or reloading of the ib_cm.

Hi Sean,

I have tested the patch against an iser target based on our Gen1 CM - it works as expected.

So the CM at the target side rejects the first REQ after the client reboot with STALE reason (and delivers a disconnect event to the ULP). The second REQ is processed fine and a new connection is established.

Without the patch, since the REQ had the same comm ID as that of an existing connection, it was just silently dropped and a target reboot was a must to let the initiator reconnect!

Or.
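To illustrate why a randomized starting point helps, here is a small user-space sketch, not the actual ib_cm patch. It assumes one possible scheme: a per-boot random operand XORed onto a simple counter. The names `comm_id_alloc`, `comm_id_alloc_init` and `comm_id_next` are hypothetical, and the caller is assumed to supply the random value from whatever source is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocator for local comm IDs. */
struct comm_id_alloc {
    uint32_t next;            /* plain counter, restarts at 0 on each boot */
    uint32_t random_operand;  /* chosen once per boot or module load */
};

static void comm_id_alloc_init(struct comm_id_alloc *a, uint32_t random_operand)
{
    assert(a != NULL);
    a->next = 0;
    a->random_operand = random_operand;
}

/*
 * XOR with a fixed operand is a one-to-one mapping, so IDs stay unique
 * within one boot while the first ID handed out differs between boots
 * whenever the operands differ.
 */
static uint32_t comm_id_next(struct comm_id_alloc *a)
{
    assert(a != NULL);
    return (a->next++) ^ a->random_operand;
}
```

With an operand of 0 the allocator degenerates to the counter-only behaviour, where the first REQ after a reboot reuses the ID of the connection that was open before the reboot.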
[openib-general] CONFIG_INFINIBAND_ADDR_TRANS
Hi,

What does the config option "CONFIG_INFINIBAND_ADDR_TRANS=y" indicate? The code does not seem to use this.

Regards,
John T.
Re: [openib-general] question: ib_umem page_size
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: question: ib_umem page_size
>
> Michael> Roland, could you please clarify what does the page_size
> Michael> field in struct ib_umem do?
>
> It gives the page size for the user memory described by the struct.
> The idea was that if/when someone tries to optimize for huge pages,
> then the low-level driver can know that a region is using huge pages
> without having to walk through the page list and search for the
> minimum physically contiguous size.

Hmm, mthca_reg_user_mr seems to do:

len = sg_dma_len(&chunk->page_list[j]) >> shift

which means that dma_len must be a multiple of page size. Is this intentional?

--
MST
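The concern about the shift can be shown with plain arithmetic. The sketch below is user-space C with hypothetical helper names (`pages_by_shift`, `bytes_dropped`); it only models the `dma_len >> shift` expression quoted above and makes no claim about the rest of mthca_reg_user_mr.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole pages in dma_len, computed the same way as `len >> shift`. */
static size_t pages_by_shift(size_t dma_len, unsigned shift)
{
    assert(shift < sizeof(size_t) * 8);
    return dma_len >> shift;
}

/* Bytes silently lost by the shift when dma_len is not a page multiple. */
static size_t bytes_dropped(size_t dma_len, unsigned shift)
{
    assert(shift < sizeof(size_t) * 8);
    return dma_len & (((size_t)1 << shift) - 1);
}
```

For a 4 KB page (shift of 12), a DMA length of 8192 gives 2 pages with nothing dropped, while a length of 6144 gives 1 page and leaves 2048 bytes unaccounted for. That is the case the question is about.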
Re: [openib-general] [PATCH] huge pages support
On Fri, 2006-08-18 at 15:23 +0200, Robert Rex wrote:
> Hello,
>
> I've also worked on the same topic. Here is what I've done so far as I
> successfully tested it on mthca and ehca. I'd appreciate for comments and
> suggestions.
>
> + down_read(&current->mm->mmap_sem);
> + if (is_vm_hugetlb_page(find_vma(current->mm, (unsigned long) addr))) {
> + use_hugepages = 1;
> + region_page_mask= HPAGE_MASK;
> + region_page_size= HPAGE_SIZE;

This might cause a kernel oops if the address passed by the user does not belong to the process's address space. In that case find_vma() might return NULL and is_vm_hugetlb_page() will crash. And even if find_vma() returns a non-NULL value, that still does not guarantee that the returned vma is the one that contains that address. You need to check that the address is greater than or equal to vma->vm_start.
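The two checks asked for above can be sketched in user space as follows. This is a model, not kernel code: `vma_model`, `find_vma_model` and `addr_is_hugepage` are hypothetical names, and `find_vma_model` only imitates the find_vma() behaviour described in the review (first area ending above the address, or NULL).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel VMA. */
struct vma_model {
    unsigned long vm_start;    /* first address inside the area */
    unsigned long vm_end;      /* first address past the area */
    int           is_hugetlb;  /* stand-in for the huge page flag test */
};

/*
 * Return the first area whose vm_end is above addr, or NULL if none.
 * The returned area may start above addr. Areas are sorted by address.
 */
static const struct vma_model *
find_vma_model(const struct vma_model *areas, size_t n, unsigned long addr)
{
    size_t i;

    assert(areas != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (addr < areas[i].vm_end)
            return &areas[i];
    return NULL;
}

/* Returns 1 only if addr really lies inside a huge-page-backed area. */
static int
addr_is_hugepage(const struct vma_model *areas, size_t n, unsigned long addr)
{
    const struct vma_model *vma = find_vma_model(areas, n, addr);

    if (vma == NULL)            /* addr is above every mapping */
        return 0;
    if (addr < vma->vm_start)   /* addr falls in a hole below this area */
        return 0;
    return vma->is_hugetlb;
}
```

The second check matters for an address that sits in a gap below a huge page mapping: the lookup still returns that mapping, so testing only for NULL would wrongly report huge pages.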
[openib-general] [Bug 203] Memory corruption in mad processing
http://openib.org/bugzilla/show_bug.cgi?id=203

[EMAIL PROTECTED] changed:

What    |Removed                  |Added
Severity|major                    |critical
Summary |Crash on shutdown, timer |Memory corruption in mad
        |callback, build 459      |processing

--- Comment #2 from [EMAIL PROTECTED] 2006-08-23 03:06 ---
A fix seems to be replacing the null test of p_list_item in al_mad.c (two places) with something like the below. A text pattern scan of the sources for any other occurrences might be appropriate too.

if( p_list_item == cl_qlist_end( &h_mad_svc->send_list ) )
// bad if( !p_list_item )

I'll report back in a day or two on how this affects shutdown stability.
Re: [openib-general] why sdp connections cost so much memory
Yes, I have reproduced the connection refusal problem and I am looking into it.

Thanks!
MST

Quoting r. zhu shi song <[EMAIL PROTECTED]>:
Subject: Re: why sdp connections cost so much memory

I haven't met kernel crashes using rc2. But there always occurred connection refusal when max concurrent connections set above 200. All is right when max concurrent connections is set to below 200. ( If using TCP to take the same test, there is no problem.)

(1)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/
Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 257 requests completed

(2)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/
Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 256 requests completed
[EMAIL PROTECTED] squid.test]#

zhu

--- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:
> Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > --- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:
> >
> > > Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > > > (3) one time linux kernel on the client crashed. I
> > > > copy the output from the screen.
> > > > Process sdp (pid:4059, threadinfo 010036384000
> > > > task 01003ea10030)
> > > > Call
> > > > Trace:{:ib_sdp:sdp_destroy_workto}
> > > > {:ib_sdp:sdp_destroy_qp+77}
> > > > {:ib_sdp:sdp_destruct+279}{sk_free+28}
> > > > {worker_thread+419}{default_wake_function+0}
> > > > {default_wake_function+0}{keventd_create_kthread+0}
> > > > {worker_thread+0}{keventd_create_kthread+0}
> > > > {kthread+200}{child_rip+8}
> > > > {keventd_create_kthread+0}{kthread+0}{child_rip+0}
> > > > Code:8b 40 04 41 39 c6 89 44 24 0c 7d 3b 45 31 ff 45
> > > > 31 ed 4c 89
> > > > RIP:{:ib_sdp:sdp_recv_completion+127}RSP<010036385dc8>
> > > > CR2:0004
> > > > <0>kernel panic-not syncing:Oops
> > > >
> > > > zhu
> > >
> > > Hmm, the stack dump does not match my sources. Is
> > > this OFED rc1?
> > > Could you send me the sdp_main.o and sdp_main.c
> > > files from your system please?
> > ---
> > > Subject: Re: why sdp connections cost so much memory
> >
> > please see the attachment.
> > zhu
>
> Ugh, so its crashing inside sdp_bcopy ...
>
> By the way, could you please re-test with OFED rc2?
> We've solved a couple of bugs there ...
>
> If this still crashes, could you please post the whole
> sdp directory, with .o and .c files?
>
> Thanks,
>
> --
> MST

--
MST
[openib-general] [Bug 203] Crash on shutdown, timer callback, build 459
http://openib.org/bugzilla/show_bug.cgi?id=203

[EMAIL PROTECTED] changed:

What    |Removed                  |Added
CC      |                         |[EMAIL PROTECTED]

--- Comment #1 from [EMAIL PROTECTED] 2006-08-23 01:55 ---
I've trapped a write of 0x1 to the dpc context field of a mad data structure. The stack looks like this just after the write:

f797ab00 ba9de265 ibbus!ib_cancel_mad+0x6c0 [k:\windows-openib\src\winib-459\core\al\al_mad.c @ 1831]
f797ab14 ba984d68 ibbus!al_cancel_sa_req+0x25 [k:\windows-openib\src\winib-459\core\al\al_query.h @ 140]
f797ab28 ba82ec4c ibbus!ib_cancel_query+0x328 [k:\windows-openib\src\winib-459\core\al\al.c @ 429]
f797ac00 ba7fe269 ipoib!ipoib_port_down+0x13c [k:\windows-openib\src\winib-459\ulp\ipoib\kernel\ipoib_port.c @ 5066]
f797ac74 ba991da1 ipoib!__ipoib_pnp_cb+0xe89 [k:\windows-openib\src\winib-459\ulp\ipoib\kernel\ipoib_adapter.c @ 690]
f797acdc ba994f92 ibbus!__pnp_notify_user+0x561 [k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 523]
f797ad04 ba994cb1 ibbus!__pnp_process_port_forward+0x172 [k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 1230]
f797ad48 ba99479a ibbus!__pnp_check_ports+0x411 [k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 1433]
f797ad70 ba950884 ibbus!__pnp_check_events+0x19a [k:\windows-openib\src\winib-459\core\al\kernel\al_pnp.c @ 1510]
f797ad8c ba956b54 ibbus!__cl_async_proc_worker+0x94 [k:\windows-openib\src\winib-459\core\complib\cl_async_proc.c @ 153]
f797ada0 ba958c0c ibbus!__cl_thread_pool_routine+0x54 [k:\windows-openib\src\winib-459\core\complib\cl_threadpool.c @ 67]
f797adac 80a07678 ibbus!__thread_callback+0x2c [k:\windows-openib\src\winib-459\core\complib\kernel\cl_thread.c @ 49]
f797addc 80781346 nt!PspSystemThreadStartup+0x2e
nt!KiThreadStartup+0x16

This seems to be canceling an outstanding mad query when the port goes down, an event that would happen at shutdown and at other irregular times.

The code that causes the dpc corruption is core\al\al_mad.c about line 1826:

if( !p_list_item )
{
    cl_spinlock_release( &h_mad_svc->obj.lock );
    AL_PRINT( TRACE_LEVEL_INFORMATION, AL_DBG_MAD_SVC, ("mad not found\n") );
    return IB_NOT_FOUND;
}

/* Mark the MAD as having been canceled. */
h_send = PARENT_STRUCT( p_list_item, al_mad_send_t, pool_item );
h_send->canceled = TRUE;

The local pointer h_send seems to not be pointing at the right thing, and the assignment of TRUE to the canceled field is actually corrupting the dpc context field. A structure dump of p_list_item says:

1: kd> dt p_list_item
Local var @ 0xf797aafc Type _cl_list_item*
0x88e76f10
   +0x000 p_next : 0x88e76f10 _cl_list_item
   +0x004 p_prev : 0x88e76f10 _cl_list_item
   +0x008 p_list : 0x88e76f10 _cl_qlist

The address 0x88e76f10 is the same address as the send_list field in the local h_mad_svc, and I believe it represents an empty list header. This suggests the test for null is an incorrect test for the list being empty. There is also another case that looks like an incorrect list test in the same source file.
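The failure mode described in this comment can be reproduced with a small user-space model of a sentinel-based list. This is a sketch under assumptions: the names `qlist`, `qlist_find`, `send_req` and `cancel_send` are hypothetical and are not the complib API. It assumes, as the structure dump suggests, that a failed search returns the list's own end marker rather than NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly linked list with an embedded sentinel. */
struct list_item { struct list_item *next, *prev; };
struct qlist     { struct list_item end; };

static void qlist_init(struct qlist *l)
{
    assert(l != NULL);
    l->end.next = l->end.prev = &l->end;   /* empty list points at itself */
}

static struct list_item *qlist_end(struct qlist *l) { return &l->end; }

static void qlist_insert_tail(struct qlist *l, struct list_item *it)
{
    it->prev = l->end.prev;
    it->next = &l->end;
    l->end.prev->next = it;
    l->end.prev = it;
}

/* Returns the matching item, or the sentinel (never NULL) if not found. */
static struct list_item *qlist_find(struct qlist *l, const struct list_item *wanted)
{
    struct list_item *it;

    for (it = l->end.next; it != &l->end; it = it->next)
        if (it == wanted)
            return it;
    return &l->end;
}

/* Hypothetical send request that embeds its list linkage. */
struct send_req { int canceled; struct list_item list_item; };

#define SEND_FROM_ITEM(p) \
    ((struct send_req *)((char *)(p) - offsetof(struct send_req, list_item)))

/* Correct test: compare against the sentinel, not against NULL. */
static int cancel_send(struct qlist *l, struct send_req *req)
{
    struct list_item *it = qlist_find(l, &req->list_item);

    if (it == qlist_end(l))
        return -1;                       /* not found: touch nothing */
    SEND_FROM_ITEM(it)->canceled = 1;    /* safe: it is a real request */
    return 0;
}
```

With a NULL test in place of the sentinel comparison, the not-found case would fall through and the container macro would be applied to the sentinel itself, writing the flag into whatever memory sits in front of the list header.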
Re: [openib-general] why sdp connections cost so much memory
I haven't met kernel crashes using rc2. But there always occurred connection refusal when max concurrent connections set above 200. All is right when max concurrent connections is set to below 200. ( If using TCP to take the same test, there is no problem.)

(1)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/
Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 257 requests completed

(2)
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/
Benchmarking www.google.com [through 193.12.10.14:3129] (be patient)
Completed 100 requests
Completed 200 requests
apr_recv: Connection refused (111)
Total of 256 requests completed
[EMAIL PROTECTED] squid.test]#

zhu

--- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:
> Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > --- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:
> >
> > > Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > > > (3) one time linux kernel on the client crashed. I
> > > > copy the output from the screen.
> > > > Process sdp (pid:4059, threadinfo 010036384000
> > > > task 01003ea10030)
> > > > Call
> > > > Trace:{:ib_sdp:sdp_destroy_workto}
> > > > {:ib_sdp:sdp_destroy_qp+77}
> > > > {:ib_sdp:sdp_destruct+279}{sk_free+28}
> > > > {worker_thread+419}{default_wake_function+0}
> > > > {default_wake_function+0}{keventd_create_kthread+0}
> > > > {worker_thread+0}{keventd_create_kthread+0}
> > > > {kthread+200}{child_rip+8}
> > > > {keventd_create_kthread+0}{kthread+0}{child_rip+0}
> > > > Code:8b 40 04 41 39 c6 89 44 24 0c 7d 3b 45 31 ff 45
> > > > 31 ed 4c 89
> > > > RIP:{:ib_sdp:sdp_recv_completion+127}RSP<010036385dc8>
> > > > CR2:0004
> > > > <0>kernel panic-not syncing:Oops
> > > >
> > > > zhu
> > >
> > > Hmm, the stack dump does not match my sources. Is
> > > this OFED rc1?
> > > Could you send me the sdp_main.o and sdp_main.c
> > > files from your system please?
> > ---
> > > Subject: Re: why sdp connections cost so much memory
> >
> > please see the attachment.
> > zhu
>
> Ugh, so its crashing inside sdp_bcopy ...
>
> By the way, could you please re-test with OFED rc2?
> We've solved a couple of bugs there ...
>
> If this still crashes, could you please post the whole
> sdp directory, with .o and .c files?
>
> Thanks,
>
> --
> MST