Re: [PATCH libibverbs] init.c: increase sysfs read buffer size to 16
Any further comments on this? Doug -- does it look ok to you? > On Dec 7, 2015, at 5:27 AM, Haggai Eran wrote: > > On 12/04/2015 01:09 AM, Jeff Squyres wrote: >> The default value of 8 is too small to read >> /sys/class/infiniband/usnic_x/node_type, which contains "6: usNIC >> UDP". Per a7a73a8c1b39362f1701256bc772d82847832f9c, the too-small >> buffer causes a stderr warning to be emitted from ibv_devinfo when >> reading usNIC devices. >> >> This commit therefore increases the buffer size to 16, which is long >> enough to read the usNIC node_type value. > > Reviewed-by: Haggai Eran -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] libibverbs init.c: conditionally emit warning if no userspace driver found
On Jun 17, 2015, at 10:25 AM, Doug Ledford wrote: > > The patch is accepted, I just haven’t pushed it out yet. Is there a timeline for when this patch will be available in the upstream git repo and released in a new version of libibverbs? I ask because we'd like to see this patch get into upstream distro libibverbs releases. Once that happens, we can start planning the end of the horrible hackarounds we had to put into place (e.g., in Open MPI) to suppress the misleading libibverbs output. Thanks! -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ N�r��yb�X��ǧv�^�){.n�+{��ٚ�{ay�ʇڙ�,j��f���h���z��w��� ���j:+v���w�j�mzZ+�ݢj"��!�i
Re: [PATCH v3] libibverbs init.c: conditionally emit warning if no userspace driver found
Ping. This is just a periodic query to see if there has been any progress on accepting this patch into libibverbs. > On Jun 3, 2015, at 12:50 PM, Doug Ledford wrote: > > On Mon, 2015-06-01 at 22:02 +0000, Jeff Squyres (jsquyres) wrote: >> On May 22, 2015, at 9:44 AM, Doug Ledford wrote: >>> >>>> Did that happen yet? >>> >>> I don't think so. I didn't file a specific ticket for it at k.o yet >>> (the k.o tickets take a while to process, so I didn't want to file it >>> until after the comment period here on list). >> >> Ping. >> >> This is just a periodic query to see if there has been any progress on >> accepting this patch into libibverbs. >> > > I have a ticket with kernel.org helpdesk to change the permissions on > the libibverbs.git repo, and they are waiting on Roland to ACK the > change. Until then, I can't do much. > > -- > Doug Ledford > GPG KeyID: 0E572FDD > -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] libibverbs init.c: conditionally emit warning if no userspace driver found
On May 22, 2015, at 9:44 AM, Doug Ledford wrote: > >> Did that happen yet? > > I don't think so. I didn't file a specific ticket for it at k.o yet > (the k.o tickets take a while to process, so I didn't want to file it > until after the comment period here on list). Ping. This is just a periodic query to see if there has been any progress on accepting this patch into libibverbs. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] libibverbs init.c: conditionally emit warning if no userspace driver found
On May 20, 2015, at 1:11 PM, Doug Ledford wrote: > > The location of the upstream sources and tarballs would not change. > Neither the git repo nor the tarball repo were like the kernel. The > upstream kernel.org git repo Roland had, had his name in the repo. So > it had to change. But the libibverbs repo is in a generic location. > There is no need to change it, only to change the permissions on the git > repo to allow the new maintainer to push directly into it. Did that happen yet? > Ditto with > the upload/download space on openfabrics.org/downloads/verbs. It looks like someone did part of this on flatbed -- you own the download directory but none of the files, and they are all 644. So I took the liberty of chown'ing them all to you: $ hostname; pwd; ls -la flatbed.openfabrics.org /var/www/html/downloads/verbs total 6096 drwxr-xr-x. 2 dledford ofed 4096 May 7 2014 . drwxrwxr-x. 55 apache ofed 4096 Feb 13 07:31 .. -rw-r--r--. 1 dledford ofed 347508 Mar 14 2006 libibverbs-1.0.2.tar.gz -rw-r--r--. 1 dledford ofed 349439 May 2 2006 libibverbs-1.0.3.tar.gz -rw-r--r--. 1 dledford ofed 360410 Oct 31 2006 libibverbs-1.0.4.tar.gz -rw-r--r--. 1 dledford ofed 359902 Jun 18 2007 libibverbs-1.0.5.tar.gz -rw-r--r--. 1 dledford ofed 321835 Aug 29 2005 libibverbs-1.0-rc1.tar.gz -rw-r--r--. 1 dledford ofed 338537 Oct 2 2005 libibverbs-1.0-rc3.tar.gz -rw-r--r--. 1 dledford ofed 341792 Oct 28 2005 libibverbs-1.0-rc4.tar.gz -rw-r--r--. 1 dledford ofed 347699 Feb 17 2006 libibverbs-1.0-rc7.tar.gz -rw-r--r--. 1 dledford ofed 384743 Jun 18 2007 libibverbs-1.1.1.tar.gz -rw-r--r--. 1 dledford ofed 394618 Apr 18 2008 libibverbs-1.1.2.tar.gz -rw-r--r--. 1 dledford ofed 359331 Oct 29 2009 libibverbs-1.1.3.tar.gz -rw-r--r--. 1 dledford ofed 362475 Jun 3 2010 libibverbs-1.1.4.tar.gz -rw-r--r--. 1 dledford ofed 364219 Jun 28 2011 libibverbs-1.1.5.tar.gz -rw-r--r--. 1 dledford ofed 387794 Dec 21 2011 libibverbs-1.1.6.tar.gz -rw-r--r--. 1 dledford ofed 391812 May 28 2013 libibverbs-1.1.7.tar.gz -rw-r--r-- 1 dledford ofed 406548 May 5 2014 libibverbs-1.1.8.tar.gz -rw-r--r--. 1 dledford ofed 384656 Apr 24 2007 libibverbs-1.1.tar.gz -rw-r--r-- 1 dledford ofed 3957 May 7 2014 README.html -rw-r--r--. 1 dledford ofed 60 Mar 12 2008 WEB_README - -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs init.c: remove stderr warnings if no userspace driver found
On May 9, 2015, at 8:04 AM, Yann Droneaud wrote: > > Le vendredi 08 mai 2015 à 11:21 -0700, Jeff Squyres a écrit : >> Signed-off-by: Jeff Squyres > > This is a little short for an explanation: what was the issue with the > error messages ? Cisco has stopped shipping its libibverbs usnic driver, although we are still using the kernel driver in the /sys/class/infiniband space (since it's the only way to be upstream). Specifically: instead of using libibverbs for userspace access, we are now using libfabric. That is: it's not a warning or an error if libibverbs cannot find a userspace driver for kernel devices. Indeed, returning a num_devices of 0 is sufficient -- the middleware shouldn't be unconditionally printing out stderr message; let the upper layer application do that (if it wants to). FWIW, Sean just removed a similar set of stderr warnings from librdmacm: http://git.openfabrics.org/?p=~shefty/librdmacm.git;a=commitdiff;h=2b2aad809afc56fa3157f5cf99036f92b9c90f16 >> -free(sysfs_dev); > > I believe this free() was necessary to not leak some memory. Ah -- I mis-read the loop. I'll re-submit with the loop still there, but just removing the fprintf block. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ N�r��yb�X��ǧv�^�){.n�+{��ٚ�{ay�ʇڙ�,j��f���h���z��w��� ���j:+v���w�j�mzZ+�ݢj"��!�i
Re: [PATCH libibverbs V2] Add new verb: uv_query_port_max_datagram()
Bump. This is V2 of the patch, which removes the ABI issue: libibverbs directly calls the command in the kernel (without going through the provider plugin). On Aug 21, 2013, at 5:22 PM, Jeff Squyres wrote: > Per lengthy discussion on the linux-rdma list, add a new verb to get > max datagram size (in bytes) since the methods for retrieving MTU > values are limited to a finite enum set, and are difficult to change > for backwards compatibility reasons. > > Also add corresponding command: uv_cmd_query_port_max_datagram(). > Since this is a new verb, there was no need to add a _V2 enum for the > command macro, which required adding a UB_INIT_CMD_RESP() macro. > > If the kernel does not support the new QUERY_PORT_MAX_DATAGRAM > command, fall back to returning the int-ized MTU enum from > ibv_cmd_query_port(). > > Note that the name for this verb was chosen with the following > rationale: > > * After discussion with Roland, use the prefix "uv" instead of "ibv", > since this verb is generic to both Ethernet, InfiniBand, and > whatever other transports are underneath. > * "query" was used (vs. "get") because it invokes a command (vs. a > struct lookup) > > If the community likes this approach, I'll send the corresponding > kernel patch. > > Difference from V1 > == > Do not add this verb to the devops struct (because that would break ABI). > Instead, just have uv_query_port_max_datagram() directly invoke > uv_cmd_query_port_max_datagram(). > > Signed-off-by: Jeff Squyres > --- > Makefile.am | 3 +- > examples/devinfo.c | 7 + > include/infiniband/driver.h | 4 +++ > include/infiniband/kern-abi.h| 17 +++- > include/infiniband/verbs.h | 6 > man/uv_query_port_max_datagram.3 | 59 > src/cmd.c| 54 > src/ibverbs.h| 8 ++ > src/libibverbs.map | 2 ++ > src/verbs.c | 13 + > 10 files changed, 171 insertions(+), 2 deletions(-) > create mode 100644 man/uv_query_port_max_datagram.3 > > diff --git a/Makefile.am b/Makefile.am > index 40e83be..51fe5d5 100644 > --- a/Makefile.am > +++ b/Makefile.am > @@ -54,7 +54,8 @@ man_MANS = man/ibv_asyncwatch.1 man/ibv_devices.1 > man/ibv_devinfo.1 \ > man/ibv_post_srq_recv.3 man/ibv_query_device.3 man/ibv_query_gid.3 > \ > man/ibv_query_pkey.3 man/ibv_query_port.3 man/ibv_query_qp.3 \ > man/ibv_query_srq.3 man/ibv_rate_to_mult.3 man/ibv_reg_mr.3 > \ > -man/ibv_req_notify_cq.3 man/ibv_resize_cq.3 man/ibv_rate_to_mbps.3 > +man/ibv_req_notify_cq.3 man/ibv_resize_cq.3 man/ibv_rate_to_mbps.3 > \ > +man/uv_query_port_max_datagram.3 > > DEBIAN = debian/changelog debian/compat debian/control debian/copyright \ > debian/ibverbs-utils.install debian/libibverbs1.install \ > diff --git a/examples/devinfo.c b/examples/devinfo.c > index ff078e4..f51620b 100644 > --- a/examples/devinfo.c > +++ b/examples/devinfo.c > @@ -209,6 +209,7 @@ static int print_hca_cap(struct ibv_device *ib_dev, > uint8_t ib_port) > struct ibv_port_attr port_attr; > int rc = 0; > uint8_t port; > + uint32_t max_datagram; > char buf[256]; > > ctx = ibv_open_device(ib_dev); > @@ -298,6 +299,11 @@ static int print_hca_cap(struct ibv_device *ib_dev, > uint8_t ib_port) > fprintf(stderr, "Failed to query port %u props\n", > port); > goto cleanup; > } > + rc = uv_query_port_max_datagram(ctx, port, &max_datagram); > + if (rc) { > + fprintf(stderr, "Failed to query port %u max datagram > size\n", port); > + goto cleanup; > + } > printf("\t\tport:\t%d\n", port); > printf("\t\t\tstate:\t\t\t%s (%d)\n", > port_state_str(port_attr.state), port_attr.state); > @@ -305,6 +311,7 @@ static int print_hca_cap(struct ibv_device *ib_dev, > uint8_t ib_port) > mtu_str(port_attr.max_mtu), port_attr.max_mtu); > printf("\t\t\tactive_mtu:\t\t%s (%d)\n", > mtu_str(port_attr.active_mtu), port_attr.active_mtu); > + printf("\t\t\tmax_datagram_size:\t%u\n", max_datagram); > printf("\t\t\tsm_lid:\t\t\t%d\n", port_attr.sm_lid); > printf("\t\t\tport_lid:\t\t%d\n", port_attr.lid); > printf("\t\t\tport_lmc:\t\t0x%02x\n", port_attr.lmc); > diff --git a/include/infiniband/driver.h b/include/infiniband/driver.h > index 9a81416..6e1236c 100644 > --- a/include/infiniband/driver.h > +++ b/include/infiniband/driver.h > @@ -67,6 +67,10 @@ int ibv_cmd_query_device(struct ibv_context *context, > int ibv_cmd_query_port(struct ibv_context *context, uint8_t port_num, > struct ibv_port_attr *port_attr, >
Re: [PATCH] Add new verb: uv_query_port_max_datagram()
On Aug 19, 2013, at 8:59 PM, "Hefty, Sean" wrote: >> Any suggestions on how one adds a new driver call without breaking ABI? > > It could be built on the verbs extension mechanism. Where is the documentation for this? Multiple people have referred to it, but I don't see any mention of it in libibverbs.git. > Is it necessary to call into a provider library, versus simply dropping into > the kernel? I don't think I have much of an opinion here, other than: it would seem weird to not call the provider library, given that all other verbs do that. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add new verb: uv_query_port_max_datagram()
On Aug 19, 2013, at 6:36 PM, "Hefty, Sean" wrote: > This breaks the libibverbs ABI. You can't modify ibv_context_ops because it > changes struct ibv_context. Any suggestions on how one adds a new driver call without breaking ABI? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add new verb: uv_query_port_max_datagram()
On Aug 19, 2013, at 6:07 PM, "Hefty, Sean" wrote: >> It doesn't *break* the ABI, but it does add a new downcall into the kernel. >> That requires bumping the ABI version to 7, no? > > No - adding a new command is fine. Older kernels will return ENOSYS if that > command is not supported. In that case, you can handle things like Jason > suggested. Gotcha. I'll adjust the patch. Any other feedback? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add new verb: uv_query_port_max_datagram()
On Aug 19, 2013, at 5:18 PM, "Hefty, Sean" wrote: >> Bumped the ABI version to 7 (the new verb will return -ENOSYS if >> abi_verb is < 7). > > How does this break the ABI? It doesn't *break* the ABI, but it does add a new downcall into the kernel. That requires bumping the ABI version to 7, no? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH libibverbs] Add new verb: uv_query_port_max_datagram()
On Aug 19, 2013, at 4:19 PM, Jason Gunthorpe wrote: > What about doing query port in this case and returning that value, > decoded to an enum? Otherwise apps have to include that logic anyhow. > > I'm assuming the kernel will do basically the same? > > Bascially, the only failure for this call should be due to a bad port > number.. Sure, can do. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 30, 2013, at 12:44 PM, Christoph Lameter wrote: > What in the world does that mean? I am an oldtimer I guess. Seems that > this is something that can be done in the newfangled forum? How does this > affect mailing lists? I'm not sure what you're asking me; please see the prior posts on this thread that describes the MTU issue and why we still need a solution. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 23, 2013, at 9:26 AM, Jeff Squyres (jsquyres) wrote: >> .. and UD is the least abstracted transport, so existing apps won't >> support Jeff's new NIC anyhow, MTU is the least of their problems. >> >> Existing apps with existing transports see the same old values. Bump. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: Add the use of IBV_SEND_INLINE to example pingpong programs
4th bump... On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote: > If the send size is less than the cap.max_inline_data reported by the > qp, use the IBV_SEND_INLINE flag. This now only shows the example of > using ibv_query_qp(), it also reduces the latency time shown by the > pingpong programs when the sends can be inlined. > > Signed-off-by: Jeff Squyres > --- > examples/rc_pingpong.c | 18 +- > examples/srq_pingpong.c | 19 +-- > examples/uc_pingpong.c | 17 - > examples/ud_pingpong.c | 18 +- > 4 files changed, 51 insertions(+), 21 deletions(-) > > diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c > index 15494a1..a8637a5 100644 > --- a/examples/rc_pingpong.c > +++ b/examples/rc_pingpong.c > @@ -65,6 +65,7 @@ struct pingpong_context { > struct ibv_qp *qp; > void*buf; > int size; > + int send_flags; > int rx_depth; > int pending; > struct ibv_port_attr portinfo; > @@ -319,8 +320,9 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -367,7 +369,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .cap = { > @@ -379,11 +382,16 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp = ibv_create_qp(ctx->pd, &attr); > + ctx->qp = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp) { > fprintf(stderr, "Couldn't create QP\n"); > goto clean_cq; > } > + > + ibv_query_qp(ctx->qp, &attr, IBV_QP_CAP, &init_attr); > + if (init_attr.cap.max_inline_data >= size) { > + ctx->send_flags |= IBV_SEND_INLINE; > + } > } > > { > @@ -508,7 +516,7 @@ static int pp_post_send(struct pingpong_context *ctx) > .sg_list= &list, > .num_sge= 1, > .opcode = IBV_WR_SEND, > - .send_flags = IBV_SEND_SIGNALED, > + .send_flags = ctx->send_flags, > }; > struct ibv_send_wr *bad_wr; > > diff --git a/examples/srq_pingpong.c b/examples/srq_pingpong.c > index 6e00f8c..552a144 100644 > --- a/examples/srq_pingpong.c > +++ b/examples/srq_pingpong.c > @@ -68,6 +68,7 @@ struct pingpong_context { > struct ibv_qp *qp[MAX_QP]; > void*buf; > int size; > + int send_flags; > int num_qp; > int rx_depth; > int pending[MAX_QP]; > @@ -350,9 +351,10 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->num_qp = num_qp; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->num_qp = num_qp; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -413,7 +415,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > for (i = 0; i < num_qp; ++i) { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .srq = ctx->srq, > @@ -424,11 +427,15 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp[i] = ibv_create_qp(ctx->pd, &attr); > + ctx->qp[i] = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp[i]) { > fprintf(stderr, "Couldn't create QP[%d]\n", i); > goto clean_qps; > } > + ibv_query_qp(ctx->qp[i], &attr, IBV_QP_CAP, &init_attr); > + if (init_attr.cap.max_inline_data >= size) { > + ctx->send_flags
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 18, 2013, at 12:50 PM, Jason Gunthorpe wrote: >> We need it for UD for our upcoming device, however, because the MTU >> is the only way to get the max message size. > > .. and UD is the least abstracted transport, so existing apps won't > support Jeff's new NIC anyhow, MTU is the least of their problems. > > Existing apps with existing transports see the same old values. ...so how do we move forward? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: Add the use of IBV_SEND_INLINE to example pingpong programs
Bump bump bump. I know this isn't a huge / important patch, but it is a small thing that does decrease the latency reported by these example programs. On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote: > If the send size is less than the cap.max_inline_data reported by the > qp, use the IBV_SEND_INLINE flag. This not only shows the example of > using ibv_query_qp(), it also reduces the latency time shown by the > pingpong programs when the sends can be inlined. > > Signed-off-by: Jeff Squyres > --- > examples/rc_pingpong.c | 18 +- > examples/srq_pingpong.c | 19 +-- > examples/uc_pingpong.c | 17 - > examples/ud_pingpong.c | 18 +- > 4 files changed, 51 insertions(+), 21 deletions(-) > > diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c > index 15494a1..a8637a5 100644 > --- a/examples/rc_pingpong.c > +++ b/examples/rc_pingpong.c > @@ -65,6 +65,7 @@ struct pingpong_context { > struct ibv_qp *qp; > void*buf; > int size; > + int send_flags; > int rx_depth; > int pending; > struct ibv_port_attr portinfo; > @@ -319,8 +320,9 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -367,7 +369,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .cap = { > @@ -379,11 +382,16 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp = ibv_create_qp(ctx->pd, &attr); > + ctx->qp = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp) { > fprintf(stderr, "Couldn't create QP\n"); > goto clean_cq; > } > + > + ibv_query_qp(ctx->qp, &attr, IBV_QP_CAP, &init_attr); > + if (init_attr.cap.max_inline_data >= size) { > + ctx->send_flags |= IBV_SEND_INLINE; > + } > } > > { > @@ -508,7 +516,7 @@ static int pp_post_send(struct pingpong_context *ctx) > .sg_list= &list, > .num_sge= 1, > .opcode = IBV_WR_SEND, > - .send_flags = IBV_SEND_SIGNALED, > + .send_flags = ctx->send_flags, > }; > struct ibv_send_wr *bad_wr; > > diff --git a/examples/srq_pingpong.c b/examples/srq_pingpong.c > index 6e00f8c..552a144 100644 > --- a/examples/srq_pingpong.c > +++ b/examples/srq_pingpong.c > @@ -68,6 +68,7 @@ struct pingpong_context { > struct ibv_qp *qp[MAX_QP]; > void*buf; > int size; > + int send_flags; > int num_qp; > int rx_depth; > int pending[MAX_QP]; > @@ -350,9 +351,10 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->num_qp = num_qp; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->num_qp = num_qp; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -413,7 +415,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > for (i = 0; i < num_qp; ++i) { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .srq = ctx->srq, > @@ -424,11 +427,15 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp[i] = ibv_create_qp(ctx->pd, &attr); > + ctx->qp[i] = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp[i]) { > fprintf(stderr, "Couldn't create QP[%d]\n", i); > goto clean_qps; > } > + ibv_query_qp(ctx
Re: [PATCH] libibverbs: Add the use of IBV_SEND_INLINE to example pingpong programs
Bump bump. On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote: > If the send size is less than the cap.max_inline_data reported by the > qp, use the IBV_SEND_INLINE flag. This now only shows the example of > using ibv_query_qp(), it also reduces the latency time shown by the > pingpong programs when the sends can be inlined. > > Signed-off-by: Jeff Squyres > --- > examples/rc_pingpong.c | 18 +- > examples/srq_pingpong.c | 19 +-- > examples/uc_pingpong.c | 17 - > examples/ud_pingpong.c | 18 +- > 4 files changed, 51 insertions(+), 21 deletions(-) > > diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c > index 15494a1..a8637a5 100644 > --- a/examples/rc_pingpong.c > +++ b/examples/rc_pingpong.c > @@ -65,6 +65,7 @@ struct pingpong_context { > struct ibv_qp *qp; > void*buf; > int size; > + int send_flags; > int rx_depth; > int pending; > struct ibv_port_attr portinfo; > @@ -319,8 +320,9 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -367,7 +369,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .cap = { > @@ -379,11 +382,16 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp = ibv_create_qp(ctx->pd, &attr); > + ctx->qp = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp) { > fprintf(stderr, "Couldn't create QP\n"); > goto clean_cq; > } > + > + ibv_query_qp(ctx->qp, &attr, IBV_QP_CAP, &init_attr); > + if (init_attr.cap.max_inline_data >= size) { > + ctx->send_flags |= IBV_SEND_INLINE; > + } > } > > { > @@ -508,7 +516,7 @@ static int pp_post_send(struct pingpong_context *ctx) > .sg_list= &list, > .num_sge= 1, > .opcode = IBV_WR_SEND, > - .send_flags = IBV_SEND_SIGNALED, > + .send_flags = ctx->send_flags, > }; > struct ibv_send_wr *bad_wr; > > diff --git a/examples/srq_pingpong.c b/examples/srq_pingpong.c > index 6e00f8c..552a144 100644 > --- a/examples/srq_pingpong.c > +++ b/examples/srq_pingpong.c > @@ -68,6 +68,7 @@ struct pingpong_context { > struct ibv_qp *qp[MAX_QP]; > void*buf; > int size; > + int send_flags; > int num_qp; > int rx_depth; > int pending[MAX_QP]; > @@ -350,9 +351,10 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->num_qp = num_qp; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->num_qp = num_qp; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -413,7 +415,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > for (i = 0; i < num_qp; ++i) { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .srq = ctx->srq, > @@ -424,11 +427,15 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp[i] = ibv_create_qp(ctx->pd, &attr); > + ctx->qp[i] = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp[i]) { > fprintf(stderr, "Couldn't create QP[%d]\n", i); > goto clean_qps; > } > + ibv_query_qp(ctx->qp[i], &attr, IBV_QP_CAP, &init_attr); > + if (init_attr.cap.max_inline_data >= size) { > + ctx->send_flags |
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 17, 2013, at 5:44 PM, Steve Wise wrote: > The iwarp drivers just report the nearest mtu enum. Apps don't need it for > iwarp like they do for ib. For RC, it doesn't matter much. So the fact that RoCE and iWARP lie about their MTU isn't a huge deal. It's wrong, but it doesn't matter much. We need it for UD for our upcoming device, however, because the MTU is the only way to get the max message size. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 17, 2013, at 12:06 AM, "Hefty, Sean" wrote: > I don't remember. Is it known how the mtu is communicated with the kernel? I hadn't looked at the kernel side yet; I was waiting for the userspace side to sort itself out first. > Looking at kern-abi.h, the mtu fields are: > > struct ibv_query_port_resp { > __u8 max_mtu; > __u8 active_mtu; > > struct ibv_kern_qp_attr { > __u32 path_mtu; > > struct ibv_query_qp_resp { > __u8 path_mtu; > > struct ibv_modify_qp { > __u8 path_mtu; > > In most cases, we only have 8 bits available to/from the kernel. (There are > at least 16 bits of reserved space in these structures.) Hmm. 16 bits is probably enough for the MTU values, but still, changing kern-abi.h will be problematic from an ABI perspective. Do people care about the kernel ABI, or is that mainly a userspace issue? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 16, 2013, at 10:47 AM, Jason Gunthorpe wrote: > A source change is completely unvaoidable. Supporting the new MTU > values requires updated source. I don't really care one way or the other; I'll submit whatever patch people want. :-) But FWIW, I tend to believe the Doug/Jason position: - MTU really needs to be a plain integer (not an enum) - forcing application source change/adaptation is the safest way to move forward - doing it this way preserves ABI, so existing binaries are safe -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: Add the use of IBV_SEND_INLINE to example pingpong programs
Bump. On Jul 10, 2013, at 4:32 PM, Jeff Squyres wrote: > If the send size is less than the cap.max_inline_data reported by the > qp, use the IBV_SEND_INLINE flag. This now only shows the example of > using ibv_query_qp(), it also reduces the latency time shown by the > pingpong programs when the sends can be inlined. > > Signed-off-by: Jeff Squyres > --- > examples/rc_pingpong.c | 18 +- > examples/srq_pingpong.c | 19 +-- > examples/uc_pingpong.c | 17 - > examples/ud_pingpong.c | 18 +- > 4 files changed, 51 insertions(+), 21 deletions(-) > > diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c > index 15494a1..a8637a5 100644 > --- a/examples/rc_pingpong.c > +++ b/examples/rc_pingpong.c > @@ -65,6 +65,7 @@ struct pingpong_context { > struct ibv_qp *qp; > void*buf; > int size; > + int send_flags; > int rx_depth; > int pending; > struct ibv_port_attr portinfo; > @@ -319,8 +320,9 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -367,7 +369,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .cap = { > @@ -379,11 +382,16 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp = ibv_create_qp(ctx->pd, &attr); > + ctx->qp = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp) { > fprintf(stderr, "Couldn't create QP\n"); > goto clean_cq; > } > + > + ibv_query_qp(ctx->qp, &attr, IBV_QP_CAP, &init_attr); > + if (init_attr.cap.max_inline_data >= size) { > + ctx->send_flags |= IBV_SEND_INLINE; > + } > } > > { > @@ -508,7 +516,7 @@ static int pp_post_send(struct pingpong_context *ctx) > .sg_list= &list, > .num_sge= 1, > .opcode = IBV_WR_SEND, > - .send_flags = IBV_SEND_SIGNALED, > + .send_flags = ctx->send_flags, > }; > struct ibv_send_wr *bad_wr; > > diff --git a/examples/srq_pingpong.c b/examples/srq_pingpong.c > index 6e00f8c..552a144 100644 > --- a/examples/srq_pingpong.c > +++ b/examples/srq_pingpong.c > @@ -68,6 +68,7 @@ struct pingpong_context { > struct ibv_qp *qp[MAX_QP]; > void*buf; > int size; > + int send_flags; > int num_qp; > int rx_depth; > int pending[MAX_QP]; > @@ -350,9 +351,10 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > if (!ctx) > return NULL; > > - ctx->size = size; > - ctx->num_qp = num_qp; > - ctx->rx_depth = rx_depth; > + ctx->size = size; > + ctx->send_flags = IBV_SEND_SIGNALED; > + ctx->num_qp = num_qp; > + ctx->rx_depth = rx_depth; > > ctx->buf = memalign(page_size, size); > if (!ctx->buf) { > @@ -413,7 +415,8 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > } > > for (i = 0; i < num_qp; ++i) { > - struct ibv_qp_init_attr attr = { > + struct ibv_qp_attr attr; > + struct ibv_qp_init_attr init_attr = { > .send_cq = ctx->cq, > .recv_cq = ctx->cq, > .srq = ctx->srq, > @@ -424,11 +427,15 @@ static struct pingpong_context *pp_init_ctx(struct > ibv_device *ib_dev, int size, > .qp_type = IBV_QPT_RC > }; > > - ctx->qp[i] = ibv_create_qp(ctx->pd, &attr); > + ctx->qp[i] = ibv_create_qp(ctx->pd, &init_attr); > if (!ctx->qp[i]) { > fprintf(stderr, "Couldn't create QP[%d]\n", i); > goto clean_qps; > } > + ibv_query_qp(ctx->qp[i], &attr, IBV_QP_CAP, &init_attr); > + if (init_attr.cap.max_inline_data >= size) { > + ctx->send_flags |= IBV
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
Bump. On Jul 10, 2013, at 8:14 AM, Jeff Squyres (jsquyres) wrote: > On Jul 8, 2013, at 1:26 PM, Jason Gunthorpe > wrote: > >> Jeff's patch doesn't break old binaries, old binaries, running with >> normal IB MTUs work fine. The structure layouts all stay the same, >> etc. > > > FWIW, I did a simple test to confirm this. I installed a stock git HEAD > libibverbs into $HOME/libibverbs-HEAD and a libibverbs with the MTU patch in > $HOME/libibverbs-mtu-patch. The mlx4 driver was installed into both trees (I > used some fairly old Mellanox HCAs+Dell servers for this test). > > This is the base case: > > - > [5:06] dell012:~ ❯❯❯ cd libibverbs-HEAD > [5:07] dell012:~/libibverbs-HEAD ❯❯❯ ldd bin/ibv_rc_pingpong > linux-vdso.so.1 => (0x2aacb000) > libibverbs.so.1 => /home/jsquyres/libibverbs-HEAD/lib/libibverbs.so.1 > (0x2accd000) > libpthread.so.0 => /lib64/libpthread.so.0 (0x2aeec000) > libdl.so.2 => /lib64/libdl.so.2 (0x2b109000) > libc.so.6 => /lib64/libc.so.6 (0x2b30e000) > /lib64/ld-linux-x86-64.so.2 (0x2aaab000) > [5:07] dell012:~/libibverbs-HEAD ❯❯❯ ./bin/ibv_rc_pingpong dell011 > local address: LID 0x0004, QPN 0x04004a, PSN 0xc08742, GID :: > remote address: LID 0x0019, QPN 0x20004a, PSN 0x44c48e, GID :: > 8192000 bytes in 0.02 seconds = 4170.28 Mbit/sec > 1000 iters in 0.02 seconds = 15.72 usec/iter > - > > Works fine. Now let's use the same libibverbs-HEAD rc pingpong binary, but > with the MTU-patched libibverbs.so: > > - > [5:07] dell012:~/libibverbs-HEAD ❯❯❯ mv lib/libibverbs.so.1 > lib/libibverbs.so.1-bogus > [5:07] dell012:~/libibverbs-HEAD ❯❯❯ export > LD_LIBRARY_PATH=$HOME/libibverbs-mtu-patch/lib > [5:08] dell012:~/libibverbs-HEAD ❯❯❯ ldd bin/ibv_rc_pingpong > linux-vdso.so.1 => (0x2aacb000) > libibverbs.so.1 => > /home/jsquyres/libibverbs-mtu-patch/lib/libibverbs.so.1 (0x2accd000) > libpthread.so.0 => /lib64/libpthread.so.0 (0x2aeed000) > libdl.so.2 => /lib64/libdl.so.2 (0x2b10a000) > libc.so.6 => /lib64/libc.so.6 (0x2b30e000) > /lib64/ld-linux-x86-64.so.2 (0x2aaab000) > [5:08] dell012:~/libibverbs-HEAD ❯❯❯ ./bin/ibv_rc_pingpong dell011 > local address: LID 0x0004, QPN 0x08004a, PSN 0x65391c, GID :: > remote address: LID 0x0019, QPN 0x24004a, PSN 0x7d137e, GID :: > 8192000 bytes in 0.02 seconds = 4163.39 Mbit/sec > 1000 iters in 0.02 seconds = 15.74 usec/iter > - > > Still works fine. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 8, 2013, at 1:26 PM, Jason Gunthorpe wrote: > Jeff's patch doesn't break old binaries, old binaries, running with > normal IB MTUs work fine. The structure layouts all stay the same, > etc. FWIW, I did a simple test to confirm this. I installed a stock git HEAD libibverbs into $HOME/libibverbs-HEAD and a libibverbs with the MTU patch in $HOME/libibverbs-mtu-patch. The mlx4 driver was installed into both trees (I used some fairly old Mellanox HCAs+Dell servers for this test). This is the base case: - [5:06] dell012:~ ❯❯❯ cd libibverbs-HEAD [5:07] dell012:~/libibverbs-HEAD ❯❯❯ ldd bin/ibv_rc_pingpong linux-vdso.so.1 => (0x2aacb000) libibverbs.so.1 => /home/jsquyres/libibverbs-HEAD/lib/libibverbs.so.1 (0x2accd000) libpthread.so.0 => /lib64/libpthread.so.0 (0x2aeec000) libdl.so.2 => /lib64/libdl.so.2 (0x2b109000) libc.so.6 => /lib64/libc.so.6 (0x2b30e000) /lib64/ld-linux-x86-64.so.2 (0x2aaab000) [5:07] dell012:~/libibverbs-HEAD ❯❯❯ ./bin/ibv_rc_pingpong dell011 local address: LID 0x0004, QPN 0x04004a, PSN 0xc08742, GID :: remote address: LID 0x0019, QPN 0x20004a, PSN 0x44c48e, GID :: 8192000 bytes in 0.02 seconds = 4170.28 Mbit/sec 1000 iters in 0.02 seconds = 15.72 usec/iter - Works fine. Now let's use the same libibverbs-HEAD rc pingpong binary, but with the MTU-patched libibverbs.so: - [5:07] dell012:~/libibverbs-HEAD ❯❯❯ mv lib/libibverbs.so.1 lib/libibverbs.so.1-bogus [5:07] dell012:~/libibverbs-HEAD ❯❯❯ export LD_LIBRARY_PATH=$HOME/libibverbs-mtu-patch/lib [5:08] dell012:~/libibverbs-HEAD ❯❯❯ ldd bin/ibv_rc_pingpong linux-vdso.so.1 => (0x2aacb000) libibverbs.so.1 => /home/jsquyres/libibverbs-mtu-patch/lib/libibverbs.so.1 (0x2accd000) libpthread.so.0 => /lib64/libpthread.so.0 (0x2aeed000) libdl.so.2 => /lib64/libdl.so.2 (0x2b10a000) libc.so.6 => /lib64/libc.so.6 (0x2b30e000) /lib64/ld-linux-x86-64.so.2 (0x2aaab000) [5:08] dell012:~/libibverbs-HEAD ❯❯❯ ./bin/ibv_rc_pingpong dell011 local address: LID 0x0004, QPN 0x08004a, PSN 0x65391c, GID :: remote address: LID 0x0019, QPN 0x24004a, PSN 0x7d137e, GID :: 8192000 bytes in 0.02 seconds = 4163.39 Mbit/sec 1000 iters in 0.02 seconds = 15.74 usec/iter - Still works fine. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ N�r��yb�X��ǧv�^�){.n�+{��ٚ�{ay�ʇڙ�,j��f���h���z��w��� ���j:+v���w�j�mzZ+�ݢj"��!�i
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 5, 2013, at 3:11 PM, Roland Dreier wrote: > So what happens if I have an old application binary, and I run against > a new libibverbs without recompiling? > > Also it seems that I'm forced to change my source code to be able to > compile against new libibverbs? I previously sent an ABI-preserving version of this patch, but it was hated by Doug Ledford and (eventually) Jason Gunthorpe. After long discussion (see thread starting here: http://www.spinics.net/lists/linux-rdma/msg15951.html), they decided that they wanted a clean break that forces both source code and ABI changes, which resulted in this patch. I personally don't care which way this goes; I just want the ability to have non-enum MTU values. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
Bump. On Jul 2, 2013, at 8:31 AM, Jeff Squyres wrote: > (Previous patch did not include updates for the man pages) > > Keep IBV_MTU_* enums values as they are, but pass MTU values around as > a struct containing a single int. > > Per lengthy discusson on the linux-rdma list, this patch introdces a > source code incompatibility. Although legacy applications can > continue to use the enum values, they will need to be updated to use > the struct. Newer applications are encouraged to use arbitrary int > values, not the MTU enums (e.g., 1024, 1500, 9000). > > Signed-off-by: Jeff Squyres > --- > Makefile.am| 3 +- > examples/devinfo.c | 20 +++-- > examples/pingpong.c| 12 > examples/pingpong.h| 1 - > examples/rc_pingpong.c | 10 +++ > examples/srq_pingpong.c| 10 +++ > examples/uc_pingpong.c | 10 +++ > examples/ud_pingpong.c | 2 +- > include/infiniband/verbs.h | 61 +-- > man/ibv_modify_qp.3| 2 +- > man/ibv_mtu_to_num.3 | 71 ++ > man/ibv_query_port.3 | 4 +-- > man/ibv_query_qp.3 | 2 +- > src/cmd.c | 8 +++--- > src/marshall.c | 2 +- > 15 files changed, 160 insertions(+), 58 deletions(-) > create mode 100644 man/ibv_mtu_to_num.3 > > diff --git a/Makefile.am b/Makefile.am > index 40e83be..1159e55 100644 > --- a/Makefile.am > +++ b/Makefile.am > @@ -54,7 +54,8 @@ man_MANS = man/ibv_asyncwatch.1 man/ibv_devices.1 > man/ibv_devinfo.1 \ > man/ibv_post_srq_recv.3 man/ibv_query_device.3 man/ibv_query_gid.3 > \ > man/ibv_query_pkey.3 man/ibv_query_port.3 man/ibv_query_qp.3 \ > man/ibv_query_srq.3 man/ibv_rate_to_mult.3 man/ibv_reg_mr.3 > \ > -man/ibv_req_notify_cq.3 man/ibv_resize_cq.3 man/ibv_rate_to_mbps.3 > +man/ibv_req_notify_cq.3 man/ibv_resize_cq.3 man/ibv_rate_to_mbps.3 \ > +man/ibv_mtu_to_num.3 > > DEBIAN = debian/changelog debian/compat debian/control debian/copyright \ > debian/ibverbs-utils.install debian/libibverbs1.install \ > diff --git a/examples/devinfo.c b/examples/devinfo.c > index ff078e4..e8fb27e 100644 > --- a/examples/devinfo.c > +++ b/examples/devinfo.c > @@ -111,18 +111,6 @@ static const char *atomic_cap_str(enum ibv_atomic_cap > atom_cap) > } > } > > -static const char *mtu_str(enum ibv_mtu max_mtu) > -{ > - switch (max_mtu) { > - case IBV_MTU_256: return "256"; > - case IBV_MTU_512: return "512"; > - case IBV_MTU_1024: return "1024"; > - case IBV_MTU_2048: return "2048"; > - case IBV_MTU_4096: return "4096"; > - default: return "invalid MTU"; > - } > -} > - > static const char *width_str(uint8_t width) > { > switch (width) { > @@ -301,10 +289,10 @@ static int print_hca_cap(struct ibv_device *ib_dev, > uint8_t ib_port) > printf("\t\tport:\t%d\n", port); > printf("\t\t\tstate:\t\t\t%s (%d)\n", > port_state_str(port_attr.state), port_attr.state); > - printf("\t\t\tmax_mtu:\t\t%s (%d)\n", > -mtu_str(port_attr.max_mtu), port_attr.max_mtu); > - printf("\t\t\tactive_mtu:\t\t%s (%d)\n", > -mtu_str(port_attr.active_mtu), port_attr.active_mtu); > + printf("\t\t\tmax_mtu:\t\t%d (%d)\n", > +ibv_mtu_to_num(port_attr.max_mtu), > port_attr.max_mtu.mtu); > + printf("\t\t\tactive_mtu:\t\t%d (%d)\n", > + ibv_mtu_to_num(port_attr.active_mtu), > port_attr.active_mtu.mtu); > printf("\t\t\tsm_lid:\t\t\t%d\n", port_attr.sm_lid); > printf("\t\t\tport_lid:\t\t%d\n", port_attr.lid); > printf("\t\t\tport_lmc:\t\t0x%02x\n", port_attr.lmc); > diff --git a/examples/pingpong.c b/examples/pingpong.c > index 90732ef..d1c22c9 100644 > --- a/examples/pingpong.c > +++ b/examples/pingpong.c > @@ -36,18 +36,6 @@ > #include > #include > > -enum ibv_mtu pp_mtu_to_enum(int mtu) > -{ > - switch (mtu) { > - case 256: return IBV_MTU_256; > - case 512: return IBV_MTU_512; > - case 1024: return IBV_MTU_1024; > - case 2048: return IBV_MTU_2048; > - case 4096: return IBV_MTU_4096; > - default: return -1; > - } > -} > - > uint16_t pp_get_local_lid(struct ibv_context *context, int port) > { > struct ibv_port_attr attr; > diff --git a/examples/pingpong.h b/examples/pingpong.h > index 9cdc03e..91d217b 100644 > --- a/examples/pingpong.h > +++ b/examples/pingpong.h > @@ -35,7 +35,6 @@ > > #include > > -enum ibv_mtu pp_mtu_to_enum(int mtu); > uint16_t pp_get_local_lid(struct ibv_context *context, int port); > int pp_get_port_info(struct ibv_context *context, int port, >struct ibv_port_attr *attr); > diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c > index 15494a1..a7e1836 100644 > --- a/examples/rc_pingpong
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU.
On Jun 21, 2013, at 5:20 PM, Jason Gunthorpe wrote: > Jeff: If you are still reading - I am still reading, just didn't have much to contribute until now. :-) > one concrete suggestion, I think, is > to ensure compile-time failure when the new-format MTU variable is > touched. This is trivially done by wrapping it in a struct: > > struct ibv_mtu_t {int __mtu;}; Sure, I can work up a patch that does this. Do others agree? Roland? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: Allow arbitrary int values for MTU
On Jun 20, 2013, at 4:40 PM, Doug Ledford wrote: >> { >> static char str[16]; >> snprintf(str, sizeof(str), "%d", ibv_mtu_to_num(max_mtu)); >>return str; >> } > > That is not, however, multi-thread safe nor advisable unless you clearly > indicate in the man page to the function that subsequent calls to the > function wipe out the result of previous calls. It's not even single > thread safe if you have more than one interface and don't know that > later calls wipe this buffer out. Best to avoid library routines such > as this. This is in the devinfo.c program (which is single-threaded), not in the library itself. But regardless, this whole function went away in V2 of the patch. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: Allow arbitrary int values for MTU
On Jun 20, 2013, at 1:09 PM, "Hefty, Sean" wrote: >> int ibv_rate_to_mult(enum ibv_rate rate); >> enum ibv_rate mult_to_ibv_rate(int mult); >> >> int ibv_rate_to_mbps(enum ibv_rate rate); >> enum ibv_rate mbps_to_ibv_rate(int mbps); > > libibverbs uses the "ibv_" prefix for pretty much everything. ...except for those 2 functions above (mbps_to_ibv_rate and mult_to_ibv_rate). See: https://git.kernel.org/cgit/libs/infiniband/libibverbs.git/tree/include/infiniband/verbs.h#n392 and https://git.kernel.org/cgit/libs/infiniband/libibverbs.git/tree/include/infiniband/verbs.h#n379 respectively. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: Allow arbitrary int values for MTU
On Jun 18, 2013, at 2:49 PM, Jason Gunthorpe wrote: >> +int num_to_ibv_mtu(int num); > > Probably should be ibv_num_to_mtu() to keep with the naming pattern.. New patch coming momentarily, but I wanted to comment on this one: I used the name "num_to_ibv_mtu" because it is in the spirit of the other enum-to-int/int-to-enum function pair naming conventions: int ibv_rate_to_mult(enum ibv_rate rate); enum ibv_rate mult_to_ibv_rate(int mult); int ibv_rate_to_mbps(enum ibv_rate rate); enum ibv_rate mbps_to_ibv_rate(int mbps); -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 12, 2013, at 5:47 PM, Jason Gunthorpe wrote: > Someone has to finish the ummunotify rewrite Roland > started. Realistically MPI is going to be the only user, can someone > from the MPI world do this? 1. I tried to ask what needed to be done at the beginning of this thread and didn't get much of an answer. 2. We've (all) been asking for this functionality *for years*; I even helped with the first implementation. Can't the verbs community finish it? :-) MPI is probably your biggest customer, after all... >> ...but this is not how people write applications. Real apps use >> malloc (and some direct mmap, and perhaps even some shared memory). > > *shrug* I used MAP_FIXED for some RDMA regions in my IB verbs apps, > specifically to create specalized high-performance memory > structures. But you're not a chemist writing Fortran code to effect n-body simulations. The target audience for MPI is scientists and engineers who are not (and should not be) network / systems developers. They're focusing on their formulae and applications -- as they should be. > It isn't a general purpose technique for non-RDMA apps - but > especially when combined with ODP it is useful in some places. I have no doubt that ODP solves problems for someone. It just doesn't seem to solve the very-long-standing MPI issues with verbs and registration caches. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 12, 2013, at 5:17 PM, Jason Gunthorpe wrote: > Yes, it can, via MAP_FIXED. There are lots of fun tricks you can play > using that. You're missing the point. Normal users (i.e., MPI users) don't do that. They call malloc() and they get what they get. The whole point of upper-layer APIs is that they hide all the network stuff from the application programmer. Verbs is *hard* for the mere mortal to program. MPI can do a great deal to hide the complexities of verbs from app developers, but one major concession that MPI (intentionally) made is that the *application provides the buffer*, not MPI. Hence, we're stuck with what buffers the user passes in. This is the root of the whole "MPI has a registration cache" issue. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 10, 2013, at 1:26 PM, Jason Gunthorpe wrote: >> I agree that pushing all registration issues out of the application >> and (somewhere) into the verbs stack would be a nice solution. > > Well, it creates a mess in another sense, because now you've lost > context. When your MPI goes to do a 1byte send the kernel may well > prefetch a few megabytes of page tables, whereas an implementation in > userspace still has the context and can say, no I don't need that.. It seems like there are Big Problems on either side of this problem (userspace and kernel). I thought that ummunotify was a good balance between the two -- MPI kept its registration caches (which are annoying, but we have long-since understood that *someone* has to maintain them), but it gets a bulletproof way to keep them coherent. That is what is missing in today's solutions: bulletproofness (plus we have to use the horrid glibc malloc hooks, which are deprecated and are going away). >> That being said, everyone I've talked to about ODP finds it very, >> very strange that the kernel would keep memory registrations around >> for memory that is no longer part of a process. Not only does it > > MRs are badly named. They are not 'memory registrations'. They are > 'address registrations'. Don't conflat address === memory in your > head, then it seems weird :) > > The memory the address space points to is flexible. > > The address space is tied to the lifetime of the process. > > It doesn't matter if there is no memory mapped to the address space, > the address space is still there. > > Liran had a good example. You can register address space and then use > mmap/munmap/MAP_FIXED to mess around with where it points to ...but this is not how people write applications. Real apps use malloc (and some direct mmap, and perhaps even some shared memory). They don't pay attention to the contiguiousness (is that a word?) of memory/addresses in the large scale. To be clear: the most tightly bound codes *do* actually care about cache hits and locality, but that's in the small scale -- not in the large scale. I would find it hard to believe that a real code would pay attention to where in its address range a given malloc() returns, for example. *That's* what makes this whole concept weird. It seems like this is a perfect kernel space concept, but is quite foreign to userspace developers. > A practical example of using this would be to avoid the need to send > scatter buffer pointers to the remote. The remote writes into a memory > ring and the ring is made 'endless' by clever use of remapping. I don't understand -- please explain your example a bit more...? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 10, 2013, at 11:56 AM, Liran Liss wrote: >> "Register all address space" is the moral equivalent of not having userspace >> registration, so let's talk about it in those terms. Specifically, there's >> a subtle >> difference between: >> >> a) telling verbs to register (0...2^64) >> --> Which is weird because it tells verbs to register memory that isn't in >> my >> address space > > Another way to look at it is "specify IO access permissions" for address > space ranges. > This could be useful to implement a buffer pool to be used for a specific MR > only, yet still map/unmap memory within this pool on the fly to optimize > physical memory utilization. > In this case, you would provide smaller ranges than 2^64... Hmm; I'm not sure I understand. Userspace doesn't control what virtual addresses it gets back from mmap/etc. So how is what you're talking about different than regular/reactive memory registration? (vs. pre-emptively registering a whole pile of memory that doesn't exist yet) Specifically: I'm confused because you said you could (preemptively) register some small regions (that assumedly don't yet exist in your virtual memory address space) and use them as memory pools. But given that userspace doesn't control its virtual address ranges, I'm not sure how that's useful. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 7, 2013, at 4:57 PM, Jason Gunthorpe wrote: >> We talked about this at the MPI Forum this week; it doesn't seem >> like ODP fixes any MPI problems. > > ODP without 'register all address space' changes the nature of the > problem, and fixes only one problem. I agree that pushing all registration issues out of the application and (somewhere) into the verbs stack would be a nice solution. > You do need to cache registrations, and all the tuning parameters (how > much do I cache, how long do I hold it for, etc, etc) all still apply. > > What goes away (is fixed) is the need for intercepts and the need to > purge address space from the cache because the backing registration > has become non-coherent/invalid. Registrations are always > coherent/valid with ODP. > This cache, and the associated optimization problem, can never go > away. With a 'register all of memory' semantic the cache can move into > the kernel, but the performance implication and overheads are all > still present, just migrated. Good summary; and you corrected some of my mistakes -- thanks. That being said, everyone I've talked to about ODP finds it very, very strange that the kernel would keep memory registrations around for memory that is no longer part of a process. Not only does it lead to the "new memory is magically already registered" semantic that I find weird, it's just plain *odd* for the kernel to maintain state for something that doesn't exist any more. It feels dirty. Sidenote: I was just informed today that the current way MPI implementations implement registration cache coherence (glibc malloc hooks) has been deprecated and will be removed from glibc (http://sourceware.org/ml/libc-alpha/2011-05/msg00103.html). This really puts on the pressure to find a new / proper solution. >> What MPI wants is: >> >> 1. verbs for ummunotify-like functionality >> 2. non-blocking memory registration verbs; poll the cq to know when it has >> completed > > To me, ODP with an additional 'register all address space' semantic, plus > an asynchronous prefetch does both of these for you. > > 1. ummunotify functionality and caching is now in the kernel, under > ODP. RDMA access to an 'all of memory' registration always does the > right thing. "Register all address space" is the moral equivalent of not having userspace registration, so let's talk about it in those terms. Specifically, there's a subtle difference between: a) telling verbs to register (0...2^64) --> Which is weird because it tells verbs to register memory that isn't in my address space b) telling verbs that the app doesn't want to handle registration --> How that gets implemented is not important (from userspace's point of view) -- if the kernel chooses to implement that by registering non-existent memory, that's the kernel's problem I guess I'm arguing that registering non-existent memory is not the Right Thing. Regardless of what solution is devised for registered memory management (ummunotify, ODP, or something else), a non-blocking verb for registering memory would still be a Very Useful Thing. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 6, 2013, at 4:33 PM, Jeff Squyres (jsquyres) wrote: > I don't think this covers other memory regions, like those added via mmap, > right? We talked about this at the MPI Forum this week; it doesn't seem like ODP fixes any MPI problems. 1. MPI still has to have a memory registration cache, because ibv_reg_mr(0...sbrk()) doesn't cover the stack or mmap'ed memory, etc. 2. MPI still has to intercept (at least) munmap(). 3. Having mmap/malloc/etc. return "new" memory that may already be registered because of a prior memory registration and subsequent munmap/free/etc. is just plain weird. Worse, if we re-register it, ref counts could go such that the actual registration will never actually expire until the process dies (which could lead to processes with abnormally large memory footprints, because they never actually let go of memory because it's still registered). 4. Even if MPI checks the value of sbrk() and re-registers (0...sbrk()) when sbrk() increases, this would seem to create a lot of work for the kernel -- which is both slow and synchronous. Example: a = malloc(5GB); MPI_Send(a, 1, MPI_CHAR, ...); // MPI sends 1 byte Then the MPI_Send of 1 byte will have to pay the cost of registering 5GB of new memory. - Unless we understand this wrong (and there's definitely a chance that we do!), it doesn't sound like ODP solves anything for MPI. Especially since HPC applications almost never swap (in fact, swap is usually disabled in HPC environments). What MPI wants is: 1. verbs for ummunotify-like functionality 2. non-blocking memory registration verbs; poll the cq to know when it has completed -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 5, 2013, at 10:52 PM, Haggai Eran wrote: >> Haggai: A verb to resize a registration would probably be a helpful >> step. MPI could maintain one registration that covers the sbrk >> region and one registration that covers the heap, much easier than >> searching tables and things. > > That's a nice idea. Even without this verb, I think it is possible to > develop a registration cache that covers those regions though. When you > find out you have some part of your region not registered, you can > register a new, larger region that covers everything you need. For new > operations you only use the newer region. Once the previous, smaller > region is not used, you de-register it. I'm not sure what you mean. Are you saying I should do something like this: MPI_Init() { // the first MPI function invoked mpi_sbrk_save = sbrk(); ibv_reg_mr(..., 0, mpi_sbrk_save, ...); ... } MPI_Send(buffer, ...) { if (mpi_sbrk_save != sbrk()) mpi_sbrk_save = sbrk(); ibv_rereg_mr(..., 0, mpi_sbrk_save, ...); ... } I don't think this covers other memory regions, like those added via mmap, right? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 5, 2013, at 12:05 PM, Jason Gunthorpe wrote: >> It does seem quite odd, abstractly speaking, that a registration >> would survive a free/re-malloc (which is arguably a "different" >> buffer). > > Not at all: the purpose of the registration is to allow access via > RDMA to a portion of the process's address space. The address space > doesn't change, but what it is mapped to can vary. I still think it's really weird. When I do this: a = malloc(N); ibv_reg_mr(..., a, N, ...); free(a); b = malloc(M); If b just happens to be partially or wholly registered by some quirk of the malloc() system (i.e., some/all of the virtual address space in b happens to have been covered by a prior malloc/ibv_reg_mr)... that's just weird. >> If ibv_reg_mr(..., >> 0, 2^64, ...) was supported, that would obviate the entire need for >> registration caches. That would be wonderful. > > Yes, except that this shifts around where the registration overhead > ends up. Basically the HCA driver now has the registration cache you > had in MPI, and all the same overheads still exist. There's fewer verbs drivers than applications, right? > Haggai: A verb to resize a registration would probably be a helpful > step. MPI could maintain one registration that covers the sbrk > region and one registration that covers the heap, much easier than > searching tables and things. If we still have to register buffers piecemeal, a non-blocking registration verb would be quite helpful. > Also bear in mind that all RDMA access protections will be disabled if > you register the entire process VM, the remote(s) can scribble/read > everything.. No problem for MPI/HPC... :-) -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 5, 2013, at 11:18 AM, Jason Gunthorpe wrote: >> Are you saying that the 2nd malloc will magically be registered >> (with the new physical address)? > > Yes, that is the whole point. Interesting. > ODP fundamentally fixes the *bug* where the HCA's view of process > memory can become inconsistent with the kernel's view. Hum. I was under the impression that with today's code (i.e., not ODP), if you a = malloc(N); ibv_reg_mr(..., a, N, ...); free(a); (assuming that the memory actually left the process at free) Then the relevant kernel verbs driver was notified, and would unregister that device. ...but I'm an MPI guy, not a kernel guy -- it seems like you're saying that my impression was wrong (which doesn't currently matter because we intercept free/sbrk and unregister such memory, anyway). > 'magically be registered' is the wrong way to think about it - the > registration of VA=0x100 is simply kept, and any change to the > underlying physical mapping of the VA is synchronized with the HCA. What happens if you: a = malloc(N * page_size); ibv_reg_mr(..., a, N * page_size, ...); free(a); // incoming RDMA arrives targeted at buffer a Or if you: a = malloc(N * page_size); ibv_reg_mr(..., a, N * page_size, ...); free(a); a = malloc(N / 2 * page_size); // incoming RDMA arrives targeted at buffer a that is of length (N*page_size) It does seem quite odd, abstractly speaking, that a registration would survive a free/re-malloc (which is arguably a "different" buffer). That being said, it still seems like MPI needs a registration cache. It is several good steps forward if we don't need to intercept free/sbrk/whatever, but when MPI_Send(buf, ...) is invoked, we still have to check that the entire buf is registered. If ibv_reg_mr(..., 0, 2^64, ...) was supported, that would obviate the entire need for registration caches. That would be wonderful. > Right, this was discussed at the Enterprise Summit a few weeks > ago. I'm sure Roland would welcome patches... That's why I asked at the beginning of this thread. He didn't provide any details about what still needs to be done, though. :-) -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: A possible solution for allowing arbitrary MTU values.
On Jun 5, 2013, at 11:11 AM, Jason Gunthorpe wrote: > I won't say never, but this is what people want. Bumping the soname is > seen as too difficult now. Gotcha. Ok, so my patch is a non-starter. >>> Thoughts: >>> - 1024 and 3 both mean 1024, the library must accept both values, >>> it should only ever return 3 though. >> >> Why? If the caller can pass in 1024, it seems like 1024 should be >> able to be passed out, too. > > If the caller passes in 1024 then it is probably OK to return 1024, > but you have to keep track of that specially. That seems more complex > than just always returning 3. 3 is guarenteed compatible with all > users. > > Old users will test directly against 3. > New users will call ibv_from_mtu which tests against 3 as well. Ok. I'll take a to-do to work up a new patch -- probably not until next week. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 5, 2013, at 10:14 AM, Jason Gunthorpe wrote: >> a = malloc(x);// a gets (va=0x100, pa=0x12345) back from malloc >> MPI_Send(a, ...); // MPI registers 0x100 for len=x, and saves (0x100,x) in >> reg cache >> free(a); >> a = malloc(x);// a gets (va=0x100, pa=0x98765) back from malloc >> MPI_Send(a, ...); // MPI sees a=0x100 and things that it is already >> registered >> // ...kaboom > > ODP is supposed to completely solve this problem. The HCA's view and > Kernels view of virtual to physical mapping becomes 100% synchronized, > and there is no 'kaboom'. The kernel updates the HCA after the free, > and after the 2nd malloc to 100% match the current virtual memory map > in the process. Are you saying that the 2nd malloc will magically be registered (with the new physical address)? > AFAIK the ummunotify user space API was nak'd by the core kernel > guys. It was NAK'ed by Linus, saying "fix your own network stack; this is not needed in the general purpose part of the kernel" (remember that Roland initially developed this as a standalone, non-IB-related kernel module). > I got the impression people thought it would be acceptable as a > rdma API, not a general API. So it is waiting on someone to recast the > function within verbs to make progress... 'zactly. Roland has this ummunot branch in his git tree, where he is in the middle of incorporating this functionality from the original ummunotify standalone kernel module into libibverbs and ibcore. I started this thread asking the status of that branch. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: A possible solution for allowing arbitrary MTU values.
On Jun 5, 2013, at 10:19 AM, Jason Gunthorpe wrote: > The concept of a libibverbs 2.0 has been NAK's by pretty much everyone > involved. This is why we are suffering with the complex extension > mechanism. Are you saying that libibverbs must always always always be backwards compatible, and there will never be an ABI break at any version in the future? > The mixed approach that was brought up, where values like 1500 were > passed as 1500, and values like 1024 were passed as 3 seemed doable to > me. Did you see a problem with it for your use? It just seems overly complex in terms of implementation. > Thoughts: > - 1024 and 3 both mean 1024, the library must accept both values, > it should only ever return 3 though. Why? If the caller can pass in 1024, it seems like 1024 should be able to be passed out, too. > - 1500/etc means 1500, the libray can return that. > - Make a ibv_from/to_mtu inline function to translate from bytes to > the encoded MTU value. > - Switch ibv_mtu from a enum to a typedef int ibv_mtu That also breaks ABI, doesn't it? > Jason -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libibverbs: A possible solution for allowing arbitrary MTU values.
On Jun 5, 2013, at 9:46 AM, Jason Gunthorpe wrote: > No, this too big of an ABI break, and silent at that.. > > The IBA values have to continue to be accepted and exported in all > cases so the ABI stays the same, which is what I thought was agreed > on?? Can this go to a libibverbs 2.0, where it would be palatable to have an ABI break? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 5, 2013, at 6:39 AM, Haggai Eran wrote: > Perhaps I'm missing something, but I believe ODP deals with the first > two problems in the list (slide 8), even if it doesn't solve them > completely. Unfortunately, it does not. If we could register(0 ... 2^64) and never have to worry about registered memory, that might be cool (depending on how that actually works) -- more below. See this blog post that describes the freed registered memory issue: http://blogs.cisco.com/performance/registered-memory-rma-rdma-and-mpi-implementations/ and consider the following valid user code: a = malloc(x);// a gets (va=0x100, pa=0x12345) back from malloc MPI_Send(a, ...); // MPI registers 0x100 for len=x, and saves (0x100,x) in reg cache free(a); a = malloc(x);// a gets (va=0x100, pa=0x98765) back from malloc MPI_Send(a, ...); // MPI sees a=0x100 and things that it is already registered // ...kaboom In short, MPI has to intercept free/sbrk/whatever so that it can update its registration cache. > In the future we want to implement an implicit memory region covering > the entire process address space, thus eliminating the need for memory > registration almost completely (you might still want memory > registration, or memory windows, in order to control permissions of > remote operations). This would be great, as long as it's fast, transparent, and has no subtle implementation effects (like causing additional RNR NAKs for pages that are still in memory, which, according to your descriptions, it sounds like it won't). > We can also allow fork to work with our implementation. Copy-on-write > will work with ODP regions by invalidating the HCA's page tables before > modifying the pages to be read-only. A page fault from the HCA can then > refill the pages, or even break COW in case of a write. That would be cool, too. fork() has been a continuing problem -- solving that problem would be wonderful. If this ODP stuff becomes a new verb, it would be good: - if these fork-fixing / register-infinite capabilities can be queried at run time (maybe on ibv_device_cap_flags?) so that ULPs can know to use this functionality - if driver owners can get a heads up so that they can know to implement it >> Why don't we have something like ummunotify yet? > I think that the problem we are trying to solve is better handled inside > the kernel. If you are going to change the HCA's memory mappings, you'd > have to go through the kernel anyway. If/when you allow registering all memory, then I think you're right -- the MPI-must-intercept-free/sbrk-whatever issue may go away (that's why I started this thread asking about register(0 .. 2^64)). But without that, unless I'm missing something, I don't think it solves the MPI-must-catch-free-sbrk-etc. issues...? And therefore, having some kind of ummunotify-like functionality as a verb would be a Very Good Thing. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 5, 2013, at 12:14 AM, Haggai Eran wrote: >> Hmm; I'm confused. How does this fix the >> MPI-needs-to-intercept-freed-memory problem? > Well, there is no problem if an application frees registered memory (in > an on-demand paging memory region) and that memory is returned to the > OS. The OS will invalidate these pages, and the HCA will no longer be > able to use them. This means that the registration cache doesn't have to > de-register memory immediately when it is freed. (must... resist... urge... to... throw... furniture...) This is why features should not be introduced to solve MPI problems without an understanding of what the MPI problems are. :-) Please go talk to the Mellanox MPI team. Forgive me for being frustrated; memory registration and all the pain that it entails was highlighted as ***the #1 problem*** by *5 major MPI implementations* at the Sonoma 2009 workshop (see https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/301-mpi-update-and-requirements-panel-all-presentations.html, starting at slide 7 in the "openmpi" slide deck). Why don't we have something like ummunotify yet? Why don't we have non-blocking memory registration yet? ...etc. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 4, 2013, at 4:50 AM, Haggai Eran wrote: >> Does this mean that an MPI implementation still has to register memory upon >> usage, and maintain its own registered memory cache? > Yes. However, since registration doesn't pin memory, you can leave > registered memory regions in the cache for longer periods, and you can > register larger memory regions without needing to back them with > physical memory. Hmm; I'm confused. How does this fix the MPI-needs-to-intercept-freed-memory problem? >> >>> We chose to support only 2 concurrent page faults per QP since this >>> allows us to maintain order between the QP's operations and the >>> user-space code using it. >> >> >> I talked to someone who was at the OpenFabrics workshop and saw the ODP >> presentation in person; he tells me that a fault will be incurred when a >> page is not in the HCA's TLB cache (vs. when a registered page is not in >> memory and must be swapped back in), and that this will trigger an RNR NAK. >> >> Is this correct? > > Our HCAs use their own page tables, in addition to a TLB cache. A miss > in the TLB cache that can be filled from the HCA's page tables will not > cause an RNR NAK, since the HCA can fill it relatively fast without the > help of the operating system. If the page is missing from the HCA's page > table though it will trigger a page fault and ask the OS to bring that > page. Since this might take longer, in these cases we send an RNR NAK. Ok. But the primary use case I care about is fixing the MPI-needs-to-intercept-freed-memory problem, and it doesn't sounds like ODP fixes this. >> He was very concerned about what the size of the TLB on the HCA, and >> therefore what the actual run-time behavior would be for sending around >> large messages via MPI -- i.e., would RDMA'ing 1GB messages now incur this >> HCA-must-reload-its-TLB-and-therefore-incur-RNR-NAKs behavior? >> > We have a mechanism to prefetch the pages needed for a large message > upon the first page fault, which can also help amortizing the cost of > the page fault for larger messages. Ok, thanks. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On Jun 4, 2013, at 2:54 AM, Haggai Eran wrote: > We wish to get there eventually. In our current implementation you still > have to register an on-demand memory region explicitly. The difference > between a regular memory region is that the pages in the region aren't > pinned. Does this mean that an MPI implementation still has to register memory upon usage, and maintain its own registered memory cache? > We chose to support only 2 concurrent page faults per QP since this > allows us to maintain order between the QP's operations and the > user-space code using it. I talked to someone who was at the OpenFabrics workshop and saw the ODP presentation in person; he tells me that a fault will be incurred when a page is not in the HCA's TLB cache (vs. when a registered page is not in memory and must be swapped back in), and that this will trigger an RNR NAK. Is this correct? He was very concerned about what the size of the TLB on the HCA, and therefore what the actual run-time behavior would be for sending around large messages via MPI -- i.e., would RDMA'ing 1GB messages now incur this HCA-must-reload-its-TLB-and-therefore-incur-RNR-NAKs behavior? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On May 29, 2013, at 1:53 AM, Or Gerlitz wrote: > Have you looked on ODP? see > https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html Is the idea behind ODP that, at the beginning of time, you register the entire memory space (i.e., NULL to 2^64) and then never worry about registered memory? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On May 30, 2013, at 1:09 AM, Or Gerlitz wrote: >> Has this been run by the MPI implementor community? > > The team that works on this here isn't ready for submission, so community > runs were not made yet If this is a solution to an MPI problem, it would seem like a good idea to run the specifics of this proposal to the MPI *implementor* community first (not *users*). I say this because Mellanox also proposed the concept of a "shared send queue" as a solution to MPI RC scalability problems a while ago (around about the time XRC first debuted, IIRC?), and the MPI community universally hated it. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On May 29, 2013, at 4:53 AM, Or Gerlitz wrote: > Have you looked on ODP? see > https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html Is this upstream? Has this been run by the MPI implementor community? The limitation of a max of 2 concurrent page faults seems fairly significant. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of "ummunot" branch?
On May 28, 2013, at 1:52 PM, Roland Dreier wrote: > Haven't touched it in quite a while except to keep it building. Needs > work to finish up. What kinds of things still need to be done? (I don't know if we could work on this or not; just asking to scope out what would need to be done at this point) Has anything been done on the userspace side? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Status of "ummunot" branch?
Roland -- I see a ummunot branch on your kernel tree at git.kernel.org (https://git.kernel.org/cgit/linux/kernel/git/roland/infiniband.git/log/?h=ummunot). Just curious -- what's the status of this tree? I ask because, as an MPI guy, I would *love* to see this stuff integrated into the kernel and libibverbs. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] libiberbs: .gitignore updates and rename configure.in->.ac
Bump. FYI: Automake just released a new beta version, which included this in the release notes (http://lwn.net/Articles/531373/): - Automake 2.0 will drop support for the long-deprecated 'configure.in' name for the Autoconf input file. You are advised to start using the recommended name 'configure.ac' instead, ASAP. These two patches I have submitted are fairly trivial, backwards compatible with older versions of the GNU Autotools, and are going to be necessary once distros start upgrading to the newer versions of the GNU Autotools. No one seems to disagree with these patches -- can they get applied to libibverbs? On Apr 22, 2013, at 1:41 PM, Jeff Squyres wrote: > Added some entries to config/.gitignore for newer versions of the GNU > Autotools. Also renamed configure.in -> configure.ac to accomodate > newer GNU Autotools > > (http://lists.gnu.org/archive/html/autotools-announce/2012-11/msg0.html > announced the intent to drop support for "configure.in" in future > versions of Autoconf). > > Signed-off-by: Jeff Squyres > --- > .gitignore | 6 + > configure.ac | 74 > configure.in | 74 > 3 files changed, 80 insertions(+), 74 deletions(-) > create mode 100644 configure.ac > delete mode 100644 configure.in > > diff --git a/.gitignore b/.gitignore > index 78effef..d198dd1 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -6,6 +6,7 @@ autom4te.cache > aclocal.m4 > stamp-h.in > config.h.in > +config.h.in~ > config.log > config.h > .libs > @@ -15,3 +16,8 @@ Makefile > config.status > stamp-h1 > libtool > +config/libtool.m4 > +config/ltoptions.m4 > +config/ltsugar.m4 > +config/ltversion.m4 > +config/lt~obsolete.m4 > diff --git a/configure.ac b/configure.ac > new file mode 100644 > index 000..efdc5ac > --- /dev/null > +++ b/configure.ac > @@ -0,0 +1,74 @@ > +dnl Process this file with autoconf to produce a configure script. > + > +AC_PREREQ(2.57) > +AC_INIT(libibverbs, 1.1.6, linux-rdma@vger.kernel.org) > +AC_CONFIG_SRCDIR([src/ibverbs.h]) > +AC_CONFIG_AUX_DIR(config) > +AC_CONFIG_MACRO_DIR(config) > +AC_CONFIG_HEADER(config.h) > +AM_INIT_AUTOMAKE([foreign]) > +m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])]) > + > +dnl Checks for programs > +AC_PROG_CC > +AC_GNU_SOURCE > +AC_PROG_LN_S > +AC_PROG_LIBTOOL > + > +LT_INIT > + > +AC_ARG_WITH([valgrind], > +AC_HELP_STRING([--with-valgrind], > +[Enable Valgrind annotations (small runtime overhead, default NO)])) > +if test x$with_valgrind = x || test x$with_valgrind = xno; then > +want_valgrind=no > +AC_DEFINE([NVALGRIND], 1, [Define to 1 to disable Valgrind annotations.]) > +else > +want_valgrind=yes > +if test -d $with_valgrind; then > +CPPFLAGS="$CPPFLAGS -I$with_valgrind/include" > +fi > +fi > + > +dnl Checks for libraries > +AC_CHECK_LIB(dl, dlsym, [], > +AC_MSG_ERROR([dlsym() not found. libibverbs requires libdl.])) > +AC_CHECK_LIB(pthread, pthread_mutex_init, [], > +AC_MSG_ERROR([pthread_mutex_init() not found. libibverbs requires > libpthread.])) > + > +dnl Checks for header files. > +AC_HEADER_STDC > +AC_CHECK_HEADER(valgrind/memcheck.h, > +[AC_DEFINE(HAVE_VALGRIND_MEMCHECK_H, 1, > +[Define to 1 if you have the header file.])], > +[if test $want_valgrind = yes; then > +AC_MSG_ERROR([Valgrind memcheck support requested, but > not found.]) > +fi]) > + > +dnl Checks for typedefs, structures, and compiler characteristics. > +AC_C_CONST > + > +AC_CACHE_CHECK(whether ld accepts --version-script, ac_cv_version_script, > +[if test -n "`$LD --help < /dev/null 2>/dev/null | grep > version-script`"; then > + ac_cv_version_script=yes > +else > + ac_cv_version_script=no > +fi]) > + > +if test $ac_cv_version_script = yes; then > + > LIBIBVERBS_VERSION_SCRIPT='-Wl,--version-script=$(srcdir)/src/libibverbs.map' > +else > +LIBIBVERBS_VERSION_SCRIPT= > +fi > +AC_SUBST(LIBIBVERBS_VERSION_SCRIPT) > + > +AC_CACHE_CHECK(for .symver assembler support, ac_cv_asm_symver_support, > +[AC_TRY_COMPILE(, [asm("symbol:\n.symver symbol, api@ABI\n");], > +ac_cv_asm_symver_support=yes, > +ac_cv_asm_symver_support=no)]) > +if test $ac_cv_asm_symver_support = yes; then > +AC_DEFINE([HAVE_SYMVER_SUPPORT], 1, [assembler has .symver support]) > +fi > + > +AC_CONFIG_FILES([Makefile libibverbs.spec]) > +AC_OUTPUT > diff --git a/configure.in b/configure.in > deleted file mode 100644 > index efdc5ac..000 > --- a/configure.in > +++ /dev/null > @@ -1,74 +0,0 @@ > -dnl Process this file with autoconf to produce a configure script. > - > -AC_PREREQ(2.57) > -AC_INIT(libibverbs, 1.1.6, linux-rdma@vger.kernel.org) > -AC_CONFIG_SRCDIR([src/ibverbs.h]) > -AC_CONFIG_AUX_DIR(config) > -AC_CONFIG_MACRO_DIR(config) > -AC_CONFIG_HEADER(config.h) > -AM_INIT_AUTOMAKE([foreign]) > -m4_ifdef([AM_SILENT_RULES], [AM_SILENT
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 22, 2013, at 4:00 PM, Doug Ledford wrote: >> 2. Change all instances of ib_mtu/ibv_mtu to an int. Code such as >> "switch(mtu) case IBV_MTU_1024: ..." will need to be updated to >> "switch(mtu) case 1024: ...". > > I was actually thinking that an ibverbs API version 2.0 might be an > interesting way to go. The proliferation of non-IB link layers > providing the verbs API make some of the original assumptions of IB link > layer in the original API obsolete. But, if we were to do that, I'd > take some time to really think the issue over and try to catch all of > the needed updates in one go. In addition to the MTU, another obvious issue is the active_speed attribute on the ibv_port_attr. On the kernel side, it's an enum (IB_SPEED_SDR through IB_SPEED_EDR), but there's no corresponding enum names in libibverbs. It would be good to make this value a non-enum-int, too. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] libibverbs: Use autoreconf in autogen.sh
On May 1, 2013, at 11:30 AM, Doug Ledford wrote: > This is fine with me, however, I think you also need to bump the > autotools version to the latest upstream. The automated checkers in our > build environment is spitting out errors about a number of upstream > packages where the autotools used to configure the package does not > include proper arm support. The latest autotools bring in all of the > forthcoming arm variants. So I would like to see both of these things done. Are you referring to the version of Autotools that Roland uses to create his tarballs? Because I have no control over that. :-) >> On Apr 25, 2013, at 11:38 AM, Jeff Squyres (jsquyres) >> wrote: >> >>> Bump. >>> >>> On Apr 22, 2013, at 1:41 PM, Jeff Squyres wrote: >>> >>>> The old sequence of Autotools commands listed in autogen.sh is no >>>> longer correct. Instead, just use the single "autoreconf" command, >>>> which will invoke all the Right Autotools commands in the correct >>>> order. >>>> >>>> Signed-off-by: Jeff Squyres >>>> --- >>>> autogen.sh | 6 +- >>>> 1 file changed, 1 insertion(+), 5 deletions(-) >>>> >>>> diff --git a/autogen.sh b/autogen.sh >>>> index fd47839..6c9233e 100755 >>>> --- a/autogen.sh >>>> +++ b/autogen.sh >>>> @@ -1,8 +1,4 @@ >>>> #! /bin/sh >>>> >>>> set -x >>>> -aclocal -I config >>>> -libtoolize --force --copy >>>> -autoheader >>>> -automake --foreign --add-missing --copy >>>> -autoconf >>>> +autoreconf -ifv -I config >>>> -- >>>> 1.8.1.1 >>>> >>> >>> >>> -- >>> Jeff Squyres >>> jsquy...@cisco.com >>> For corporate legal information go to: >>> http://www.cisco.com/web/about/doing_business/legal/cri/ >>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>> the body of a message to majord...@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] libibverbs: Use autoreconf in autogen.sh
Bump bump. :-) On Apr 25, 2013, at 11:38 AM, Jeff Squyres (jsquyres) wrote: > Bump. > > On Apr 22, 2013, at 1:41 PM, Jeff Squyres wrote: > >> The old sequence of Autotools commands listed in autogen.sh is no >> longer correct. Instead, just use the single "autoreconf" command, >> which will invoke all the Right Autotools commands in the correct >> order. >> >> Signed-off-by: Jeff Squyres >> --- >> autogen.sh | 6 +- >> 1 file changed, 1 insertion(+), 5 deletions(-) >> >> diff --git a/autogen.sh b/autogen.sh >> index fd47839..6c9233e 100755 >> --- a/autogen.sh >> +++ b/autogen.sh >> @@ -1,8 +1,4 @@ >> #! /bin/sh >> >> set -x >> -aclocal -I config >> -libtoolize --force --copy >> -autoheader >> -automake --foreign --add-missing --copy >> -autoconf >> +autoreconf -ifv -I config >> -- >> 1.8.1.1 >> > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] libibverbs: Use autoreconf in autogen.sh
Bump. On Apr 22, 2013, at 1:41 PM, Jeff Squyres wrote: > The old sequence of Autotools commands listed in autogen.sh is no > longer correct. Instead, just use the single "autoreconf" command, > which will invoke all the Right Autotools commands in the correct > order. > > Signed-off-by: Jeff Squyres > --- > autogen.sh | 6 +- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/autogen.sh b/autogen.sh > index fd47839..6c9233e 100755 > --- a/autogen.sh > +++ b/autogen.sh > @@ -1,8 +1,4 @@ > #! /bin/sh > > set -x > -aclocal -I config > -libtoolize --force --copy > -autoheader > -automake --foreign --add-missing --copy > -autoconf > +autoreconf -ifv -I config > -- > 1.8.1.1 > -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 22, 2013, at 1:30 PM, Doug Ledford wrote: > However, for some reason I had it > in my mind when I was reading the patch that it was against libibverbs. > That's what I get for staying up late and reviewing when I'm tired :-/ There were other patches against libibverbs that were submitted at the same time. That being said, I see two obvious ways to go forward, both of which have pros/cons: 1. Extend the enum ib_mtu to include new enum values for 1500 and 9000 -- probably with a different prefix to indicate that they're not IBTA-sanctioned values (note that this will also require corresponding changes in libibverbs, since MTU values get passed up from kernel to userspace). PRO: fixes the immediate problem PRO: probably the lowest impact solution; just adding some more enum values CON: weird naming (IB_ and RDMA_ prefixes in the same ib_mtu enum; probably something similar in userspace) CON: doesn't do anything to address other MTU values (e.g., what if someone has an MTU of 1498?) 2. Change all instances of ib_mtu/ibv_mtu to an int. Code such as "switch(mtu) case IBV_MTU_1024: ..." will need to be updated to "switch(mtu) case 1024: ...". PRO: solves the problem for all MTU values PRO: eliminates the enum-to-int translation functions CON: much driver code will need to be updated per above, and also update logic checking for out-of-bounds MTU calues CON: similarly, userspace apps will need to be updated; it might be worthwhile to bump libibverbs to 2.x, and then intentionally change the MTU field names in ibv_port_attr and ibv_qp_attr so that apps using those fields will fail to compile with libibverbs 2.x (and therefore forcibly realize they need to adapt to the new int MTU values) -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Use autoreconf in autogen.sh
On Apr 19, 2013, at 8:19 PM, "Hefty, Sean" wrote: > It may help if you identify the library this patch is against. :) 3rd time sending will be the charm... :-) -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Use autoreconf in autogen.sh
Bump. Any thoughts on these two patches? They're pretty trivial, enable use with modern versions of Autotools, and now feature the proper Signed-off-by line. On Apr 13, 2013, at 8:15 AM, Jeff Squyres wrote: > The old sequence of Autotools commands listed in autogen.sh is no > longer correct. Instead, just use the single "autoreconf" command, > which will invoke all the Right Autotools commands in the correct > order. > > Signed-off-by: Jeff Squyres > --- > autogen.sh | 6 +- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/autogen.sh b/autogen.sh > index fd47839..6c9233e 100755 > --- a/autogen.sh > +++ b/autogen.sh > @@ -1,8 +1,4 @@ > #! /bin/sh > > set -x > -aclocal -I config > -libtoolize --force --copy > -autoheader > -automake --foreign --add-missing --copy > -autoconf > +autoreconf -ifv -I config > -- > 1.8.1.1 > -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 12, 2013, at 11:40 AM, Jeff Squyres (jsquyres) wrote: >> As an aside I like the use of RDMA_MTU_* for these values. Again to >> distinguish them from the IBTA values. But I know that is poor form. > > So what's the right way to move forward on this? Is it this: > > enum ib_mtu { > IB_MTU_256 = 1, > IB_MTU_512 = 2, > IB_MTU_1024 = 3, > IB_MTU_2048 = 4, > IB_MTU_4096 = 5, > RDMA_MTU_1500 = 1500, > RDMA_MTU_9000 = 9000 > }; Bump. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 9, 2013, at 10:44 PM, "Weiny, Ira" wrote: > As an aside I like the use of RDMA_MTU_* for these values. Again to > distinguish them from the IBTA values. But I know that is poor form. So what's the right way to move forward on this? Is it this: enum ib_mtu { IB_MTU_256 = 1, IB_MTU_512 = 2, IB_MTU_1024 = 3, IB_MTU_2048 = 4, IB_MTU_4096 = 5, RDMA_MTU_1500 = 1500, RDMA_MTU_9000 = 9000 }; -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 9, 2013, at 4:10 PM, "Weiny, Ira" wrote: >> Just to re-state: our issue is that there does not seem to be any other way >> to >> get the max UD message size without knowing the actual MTU (are we >> incorrect about that?). Hence, using the IB-defined values is not really >> sufficient. > > I guess I am confused. Is this patch trying to support RoCE or a VNIC? Both, actually. The RoCE driver lies about its MTU (IIRC, it claims IB_MTU_1024, even if the MTU is actually 1500). So AFAIK, there's no way to know what the UD max message size is on RoCE, because the max message size attribute on port refers to RC, not UD. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 8, 2013, at 6:16 PM, "Hefty, Sean" wrote: > Why can't IB_MTU_1500 = 1500? It certainly could. Additionally, since Roland was a little concerned about the "IB" prefix (since 1500 and 9000 are not IBTA-sanctioned MTUs), they could have a different prefix -- perhaps RDMA_MTU_1500. Although I admit that it would be weird to have an enum that contains values with different prefixes: enum ib_mtu { IB_MTU_256 = 1, IB_MTU_512 = 2, IB_MTU_1024 = 3, IB_MTU_2048 = 4, IB_MTU_4096 = 5, RDMA_MTU_1500 = 1500, RDMA_MTU_9000 = 9000 }; -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/4] Use autoreconf in autogen.sh
Roland -- If there are no objections, can this patch (and patch 4 of this set: https://patchwork.kernel.org/patch/2387321/) be committed? Neither should not have any real impact other than the modernization of the libibverbs build system. On Apr 3, 2013, at 9:06 AM, Jeff Squyres wrote: > The old sequence of Autotools commands listed in autogen.sh is no > longer correct. Instead, just use the single "autoreconf" command, > which will invoke all the Right Autotools commands in the correct > order. > > --- > autogen.sh | 6 +- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/autogen.sh b/autogen.sh > index fd47839..6c9233e 100755 > --- a/autogen.sh > +++ b/autogen.sh > @@ -1,8 +1,4 @@ > #! /bin/sh > > set -x > -aclocal -I config > -libtoolize --force --copy > -autoheader > -automake --foreign --add-missing --copy > -autoconf > +autoreconf -ifv -I config > -- > 1.8.1.1 > -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 4, 2013, at 1:57 PM, "Weiny, Ira" wrote: >> In hindsight, the user space API never should have exposed the mtu as an >> enum... >> >> Since an enum is an int, and we're never going to have anything with an mtu >> <= 5 bytes, couldn't we just store all new mtu values directly as their byte >> value? > > That seems like a pretty good idea. Agreed, but changing to an int would seem to have some fairly serious backwards compatibility issues. What is the right way to move forward here? Just to re-state: our issue is that there does not seem to be any other way to get the max UD message size without knowing the actual MTU (are we incorrect about that?). Hence, using the IB-defined values is not really sufficient. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] Add IBV_*_USNIC enums for the Cisco Ethernet Virtual NIC.
On Apr 5, 2013, at 4:40 PM, Roland Dreier wrote: > I think the idea is that without context, it's hard to know if adding > these enums makes sense or not. And I'm sorry but I'm not that > sympathetic to "my code isn't ready but you have to take this > out-of-context patch so I can meet Red Hat's arbitrary schedule." Ok, fair enough. It'll be a few weeks before we can submit usnic.ko, so I'll re-bring up the IBV_NODE_VENDOR/related patches then. I think the MTU discussion is still relevant, however -- there seems to be a larger design issue there. I'll go reply separately on that thread. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] Add IBV_*_USNIC enums for the Cisco Ethernet Virtual NIC.
Per my previous email, forgive my top reply... RDMA_NODE_VENDOR would be great, actually. Should I work up a patch for that? Sent from my phone. No type good. On Apr 4, 2013, at 10:32 AM, "Hefty, Sean" wrote: >> The reason we're asking for these IBV_*_USNIC enums now -- before we've >> submitted the driver -- is because we're targeting getting our driver >> included >> in RHEL 6.5. There's a bit of a chicken-and-egg issue here: they'll accept >> our >> patches for a new hardware driver while that driver is being worked upstream. >> But they (rightfully) won't accept patches to IB core and libibverbs until >> they've been vetted by the community. Hence, even though our driver is >> slowly >> working its way through QA and not available yet, we wanted to submit these >> new >> enums upstream for community approval so that they can be included in RHEL >> 6.5. > > I understand the issue. > > In the end, these are kernel changes with no actual users of those changes... > But then they are also just small changes to a framework... > > Just thinking aloud here, but what if we added 'RDMA_NODE_VENDOR' instead? > Then other fields, such as transport, become vendor specific. > > - Sean -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] Add IBV_*_USNIC enums for the Cisco Ethernet Virtual NIC.
Forgive the top reply; I'm actually on vacation this week and currently only have email access on my phone... I'm not sure what you're asking me to do. Are you asking us to submit our known-buggy-and-not-yet-complete driver just to get two enums approved? Sent from my phone. No type good. On Apr 4, 2013, at 5:27 PM, "Or Gerlitz" wrote: > Jeff Squyres (jsquyres) wrote: > >> Sure. For a little background, the 2nd-generation Cisco VIC has been >> available >> since last year (IIRC): http://www.cisco.com/en/US/products/ps10277 >> /prod_module_series_home.html. It's a converged 10G Ethernet adapter >> available > in a variety of form factors (e.g., 2x10G on PCIe and Mezz). > >> After some off-list discussion with Roland, we chose to create new >> IBV_*_USNIC >> enums because none of the current enums were accurate for our device. It's >> an >> Ethernet NIC, but it's not an RNIC. It's an Ethernet-based transport, but >> it's not >> iWARP. > >> >> The reason we're asking for these IBV_*_USNIC enums now -- before we've >> submitted the driver -- is because we're targeting getting our driver >> included in RHEL 6.5. There's a bit of a chicken-and-egg issue here: >> they'll accept our patches for a new hardware driver while that driver is >> being worked upstream. But they (rightfully) won't accept patches to IB >> core and libibverbs until they've been vetted by the community. Hence, even >> though our driver is slowly working its way through QA and not available >> yet, we wanted to submit these new enums upstream for community approval so >> that they can be included in RHEL 6.5. > >> Does that help? > > yes it does, but I still think we need to see the driver code in order > to conduct proper /better review and maybe even accept the proposed > changes to the IB core. You can submit it as RFC which means "you can > look on it, and give me comments, but don't pick it up yet" > > Or. > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Ad IB_MTU_1500|9000 enums.
On Apr 3, 2013, at 12:52 PM, Roland Dreier wrote: > I don't think we can blithely do this... I think the IB enum values > are defined to match the values used in the IB spec (PathRecord etc). Gotcha. I inserted the enums in their proper numerical order to make the range comparisons simpler in ib_addr.h. But the 1500/9000 values could be tacked at the end of the current values (e.g., 6 and 7, respectively) -- it would just necessitate some different changes in ib_addr.h. > Even if we change it so 1500 and 9000 are outside of the range used by > the IB spec, I don't understand the motivation for this change. What > does this buy us? Our impression was that a userspace application cannot know the max message size it can send across a UD QP without having an accurate MTU enum. Specifically: the ibv_port_attr.max_msg_size value seems to be a higher-level value. E.g., on Mellanox devices, .max_msg_size is the max size of RC QP messages. Is there another way to determine max UD QP message size that we missed? > How is iWARP working today without this change? They lie about the actual/underlying MTU. But they don't have UD QPs, so .max_msg_size is sufficient for their RC QPs. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] Add IBV_*_USNIC enums for the Cisco Ethernet Virtual NIC.
On Apr 3, 2013, at 2:45 PM, Or Gerlitz wrote: > Jeff, I agree with Sean, there's not much point to review/discuss > these general/pre-step patches without seeing some actual device > specific kernel (if there are such or user space code if there aren't > any kernel ones) code. e.g you can submit the two kernel pre-step > patches as the two first pieces in a series that has the driver code. Unfortunately, not yet. I just sent another mail that explained our rationale: our kernel driver and libibverbs plugin code are working their way through QA. It'll take a little time before we can submit good patches for these. The main driving factor for submitting these new enums is so that they can be included in RHEL 6.5. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] Add IBV_*_USNIC enums for the Cisco Ethernet Virtual NIC.
On Apr 3, 2013, at 10:49 AM, "Hefty, Sean" wrote: > Can we get a better patch description? > > Maybe mention something about the NIC? Does it support all verbs? Is it for > kernel users or just user space? Does this simply export a raw ethernet > interface? Sure. For a little background, the 2nd-generation Cisco VIC has been available since last year (IIRC): http://www.cisco.com/en/US/products/ps10277/prod_module_series_home.html. It's a converged 10G Ethernet adapter available in a variety of form factors (e.g., 2x10G on PCIe and Mezz). We'll be providing a UD verbs kernel driver and libibverbs plugin for OS bypass. It is currently going through QA and debugging; it'll probably take a bit more time before we can submit good patches. The main intended use for this driver is userspace/libibverbs applications, but I suppose it could be used by kernel applications, too. The wire protocol transport that is uses underneath will initially be a very simple L2-Ethernet based frame (DMAC, SMAC, ET, QP num, etc.). We are not exposing a RAW interface at this time; the libibverbs plugin will provide UD QP functionality. After some off-list discussion with Roland, we chose to create new IBV_*_USNIC enums because none of the current enums were accurate for our device. It's an Ethernet NIC, but it's not an RNIC. It's an Ethernet-based transport, but it's not iWARP. The reason we're asking for these IBV_*_USNIC enums now -- before we've submitted the driver -- is because we're targeting getting our driver included in RHEL 6.5. There's a bit of a chicken-and-egg issue here: they'll accept our patches for a new hardware driver while that driver is being worked upstream. But they (rightfully) won't accept patches to IB core and libibverbs until they've been vetted by the community. Hence, even though our driver is slowly working its way through QA and not available yet, we wanted to submit these new enums upstream for community approval so that they can be included in RHEL 6.5. Does that help? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
New patches
I'm about to send some patches for libibverbs and Roland's infiniband kernel git tree. The patches fit into two general categories: 1. Add enums for Cisco's Ethernet Virtual NIC (it's not an RNIC and therefore doesn't fit the RNIC/IWARP enums). Also add enums for 1500 and 9000 MTUs. 2. Minor modernization of the GNU Autotools usage in libibverbs. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html