Re: [PATCH] libibverbs: Add the use of IBV_SEND_INLINE to example pingpong programs
4th bump... On Jul 10, 2013, at 4:32 PM, Jeff Squyres jsquy...@cisco.com wrote: If the send size is less than the cap.max_inline_data reported by the qp, use the IBV_SEND_INLINE flag. This now only shows the example of using ibv_query_qp(), it also reduces the latency time shown by the pingpong programs when the sends can be inlined. Signed-off-by: Jeff Squyres jsquy...@cisco.com --- examples/rc_pingpong.c | 18 +- examples/srq_pingpong.c | 19 +-- examples/uc_pingpong.c | 17 - examples/ud_pingpong.c | 18 +- 4 files changed, 51 insertions(+), 21 deletions(-) diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c index 15494a1..a8637a5 100644 --- a/examples/rc_pingpong.c +++ b/examples/rc_pingpong.c @@ -65,6 +65,7 @@ struct pingpong_context { struct ibv_qp *qp; void*buf; int size; + int send_flags; int rx_depth; int pending; struct ibv_port_attr portinfo; @@ -319,8 +320,9 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, if (!ctx) return NULL; - ctx-size = size; - ctx-rx_depth = rx_depth; + ctx-size = size; + ctx-send_flags = IBV_SEND_SIGNALED; + ctx-rx_depth = rx_depth; ctx-buf = memalign(page_size, size); if (!ctx-buf) { @@ -367,7 +369,8 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, } { - struct ibv_qp_init_attr attr = { + struct ibv_qp_attr attr; + struct ibv_qp_init_attr init_attr = { .send_cq = ctx-cq, .recv_cq = ctx-cq, .cap = { @@ -379,11 +382,16 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, .qp_type = IBV_QPT_RC }; - ctx-qp = ibv_create_qp(ctx-pd, attr); + ctx-qp = ibv_create_qp(ctx-pd, init_attr); if (!ctx-qp) { fprintf(stderr, Couldn't create QP\n); goto clean_cq; } + + ibv_query_qp(ctx-qp, attr, IBV_QP_CAP, init_attr); + if (init_attr.cap.max_inline_data = size) { + ctx-send_flags |= IBV_SEND_INLINE; + } } { @@ -508,7 +516,7 @@ static int pp_post_send(struct pingpong_context *ctx) .sg_list= list, .num_sge= 1, .opcode = IBV_WR_SEND, - .send_flags = IBV_SEND_SIGNALED, + .send_flags = ctx-send_flags, }; struct ibv_send_wr *bad_wr; diff --git a/examples/srq_pingpong.c b/examples/srq_pingpong.c index 6e00f8c..552a144 100644 --- a/examples/srq_pingpong.c +++ b/examples/srq_pingpong.c @@ -68,6 +68,7 @@ struct pingpong_context { struct ibv_qp *qp[MAX_QP]; void*buf; int size; + int send_flags; int num_qp; int rx_depth; int pending[MAX_QP]; @@ -350,9 +351,10 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, if (!ctx) return NULL; - ctx-size = size; - ctx-num_qp = num_qp; - ctx-rx_depth = rx_depth; + ctx-size = size; + ctx-send_flags = IBV_SEND_SIGNALED; + ctx-num_qp = num_qp; + ctx-rx_depth = rx_depth; ctx-buf = memalign(page_size, size); if (!ctx-buf) { @@ -413,7 +415,8 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, } for (i = 0; i num_qp; ++i) { - struct ibv_qp_init_attr attr = { + struct ibv_qp_attr attr; + struct ibv_qp_init_attr init_attr = { .send_cq = ctx-cq, .recv_cq = ctx-cq, .srq = ctx-srq, @@ -424,11 +427,15 @@ static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, .qp_type = IBV_QPT_RC }; - ctx-qp[i] = ibv_create_qp(ctx-pd, attr); + ctx-qp[i] = ibv_create_qp(ctx-pd, init_attr); if (!ctx-qp[i]) { fprintf(stderr, Couldn't create QP[%d]\n, i); goto clean_qps; } + ibv_query_qp(ctx-qp[i], attr, IBV_QP_CAP, init_attr); + if (init_attr.cap.max_inline_data = size) { + ctx-send_flags |= IBV_SEND_INLINE; + } } for (i = 0; i num_qp; ++i) { @@ -568,7 +575,7 @@ static int pp_post_send(struct
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 23, 2013, at 9:26 AM, Jeff Squyres (jsquyres) jsquy...@cisco.com wrote: .. and UD is the least abstracted transport, so existing apps won't support Jeff's new NIC anyhow, MTU is the least of their problems. Existing apps with existing transports see the same old values. Bump. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: rtnl_lock deadlock on 3.10
On 7/29/2013 6:02 PM, Shawn Bohrer wrote: On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: On 03/07/2013 20:22, Shawn Bohrer wrote: On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: On Tue, Jul 02, 2013 at 01:38:26PM +, Cong Wang wrote: On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa han...@stressinduktion.org wrote: On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: I've managed to hit a deadlock at boot a couple times while testing the 3.10 rc kernels. It seems to always happen when my network devices are initializing. This morning I updated to v3.10 and made a few config tweaks and so far I've hit it 4 out of 5 reboots. It looks like most processes are getting stuck on rtnl_lock. Below is a boot log with the soft lockup prints. Please let know if there is any other information I can provide: Could you try a build with CONFIG_LOCKDEP enabled? The problem is clear: ib_register_device() is called with rtnl_lock, but itself needs device_mutex, however, ib_register_client() first acquires device_mutex, then indirectly calls register_netdev() which takes rtnl_lock. Deadlock! One possible fix is always taking rtnl_lock before taking device_mutex, something like below: diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 18c1ece..890870b 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) { struct ib_device *device; + rtnl_lock(); mutex_lock(device_mutex); list_add_tail(client-list, client_list); @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) client-add(device); mutex_unlock(device_mutex); + rtnl_unlock(); return 0; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index b6e049a..5a7a048 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, goto event_failed; } - result = register_netdev(priv-dev); + result = register_netdevice(priv-dev); if (result) { printk(KERN_WARNING %s: couldn't register ipoib port %d; error %d\n, hca-name, port, result); Looks good to me. Shawn, could you test this patch? ib_unregister_device/ib_unregister_client would need the same change, too. I have not checked the other -add() and -remove() functions. Also cc'ed linux-rdma@vger.kernel.org, Roland Dreier. Cong's patch is missing the #include linux/rtnetlink.h but otherwise I've had 34 successful reboots with no deadlocks which is a good sign. It sounds like there are more paths that need to be audited and a proper patch submitted. I can do more testing later if needed. Thanks, Shawn Guys, I was a bit busy today looking into that, but I don't think we want the IB core layer (core/device.c) to use rtnl locking which is something that belongs to the network stack. Has anymore thought been put into a proper fix for this issue? I'm no expert in this area but I'm having a hard time seeing a different solution than the one Cong suggested. Just to be clear the deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd Steve Wise in case he has a better solution from the Chelsio side. I don't know of another way to resolve this. The rtnl lock is used in ipoib and mlx4 already. I think we should go forward with the proposed patch. Steve. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH opensm] osm_sa_path_record.c: Fix rate setting issue in SA PR handling
For switch egress ports, fix rate setting for an extended speed port Signed-off-by: Hal Rosenstock h...@mellanox.com --- opensm/osm_sa_path_record.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/osm_sa_path_record.c b/opensm/osm_sa_path_record.c index 02aeb9f..e7ac607 100644 --- a/opensm/osm_sa_path_record.c +++ b/opensm/osm_sa_path_record.c @@ -385,7 +385,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * sa, ib_port_info_compute_rate(p_pi, p_pi0-capability_mask IB_PORT_CAP_HAS_EXT_SPEEDS)) 0) rate = ib_port_info_compute_rate(p_pi, -p_pi-capability_mask IB_PORT_CAP_HAS_EXT_SPEEDS); +p_pi0-capability_mask IB_PORT_CAP_HAS_EXT_SPEEDS); if (sa-p_subn-opt.qos) { /* -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 30, 2013, at 12:44 PM, Christoph Lameter c...@gentwo.org wrote: What in the world does that mean? I am an oldtimer I guess. Seems that this is something that can be done in the newfangled forum? How does this affect mailing lists? I'm not sure what you're asking me; please see the prior posts on this thread that describes the MTU issue and why we still need a solution. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Tue, 30 Jul 2013, Jeff Squyres (jsquyres) wrote: On Jul 30, 2013, at 12:44 PM, Christoph Lameter c...@gentwo.org wrote: What in the world does that mean? I am an oldtimer I guess. Seems that this is something that can be done in the newfangled forum? How does this affect mailing lists? I'm not sure what you're asking me; please see the prior posts on this thread that describes the MTU issue and why we still need a solution. What does bump mean? You keep sending replies that just says bump. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On 07/30/2013 02:45 PM, Atchley, Scott wrote: http://en.wikipedia.org/wiki/Internet_forum#Thread When a member posts in a thread it will jump to the top since it is the latest updated thread. Similarly, other threads will jump in front of it when they receive posts. When a member posts in a thread for no reason but to have it go to the top, it is referred to as a bump or bumping. He is trying to bring it back to everyone's attention. Exactly. On a mailing list like this, it's shorthand for You seem to have dropped this on the floor, let me resend it to you. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU
On Jul 30, 2013, at 2:31 PM, Christoph Lameter c...@gentwo.org wrote: On Tue, 30 Jul 2013, Jeff Squyres (jsquyres) wrote: On Jul 30, 2013, at 12:44 PM, Christoph Lameter c...@gentwo.org wrote: What in the world does that mean? I am an oldtimer I guess. Seems that this is something that can be done in the newfangled forum? How does this affect mailing lists? I'm not sure what you're asking me; please see the prior posts on this thread that describes the MTU issue and why we still need a solution. What does bump mean? You keep sending replies that just says bump. http://en.wikipedia.org/wiki/Internet_forum#Thread When a member posts in a thread it will jump to the top since it is the latest updated thread. Similarly, other threads will jump in front of it when they receive posts. When a member posts in a thread for no reason but to have it go to the top, it is referred to as a bump or bumping. He is trying to bring it back to everyone's attention. Scott -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mlx5: Implement new initialization sequence
On Thu, Jul 18, 2013 at 3:31 PM, Eli Cohen e...@dev.mellanox.co.il wrote: Introduce enbale_hca and disable_hca commands to signify when the driver starts or ceases to operate on the device. In addition the driver will use boot and init pages count; boot pages is required to allow firmware to complete boot commands and the other to complete init hca. Command interface revision is bumped to 4 to enforce using supported firmware. Roland, haven't seen a response from you for about 2 weeks. Its critical that the first upstream release that includes the mlx5 driver will use this initilization sequence. Its mlx5_core patch, do you want it to go through netdev/Dave? Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V8 libibverbs 1/7] Infrastructure to support verbs extensions
On Tue, Jul 30, 2013 at 11:27:15PM +0300, Yishai Hadas wrote: Can accept it, however as it's a const pointer may need some casting later which is not fully clean. Drop the const from the definition then. Your suggestion for verbs_set_ctx_op can't not work as it calls internally to verbs_get_ctx_op and will may fail as at that step function pointer was not set and *(void **)((uint8_t *)vctx + off) will be NULL. Yes, that is too bad in that case, but the old macro is still flawed: +#define verbs_set_ctx_op(vctx, op, ptr) ({ \ + if (vctx (vctx-sz = sizeof(*vctx) - offsetof(struct verbs_context, op))) \ + vctx-op = ptr; }) - Missing () on all vctx usages - Missing type enforcement on vctx Something like this: #define verbs_set_ctx_op(_vctx, op, ptr) ({ \ struct verbs_context *vctx = _vctx; \ if (vctx (vctx-sz = sizeof(*vctx) - offsetof(struct verbs_context, op))) \ vctx-op = ptr; }) In addition changing to use const as part of returning from __verbs_get_ctx_op causes some necessary casting to non const in some places which is not fully clean. (e.g. free((void *)context_ex); as part of __ibv_close_device, verbs_ctx-has_comp_mask = VERBS_CONTEXT_XRCD | VERBS_CONTEXT_SRQ | VERBS_CONTEXT_QP; as part of mlx4_init_context) It is virtually impossible to do const-correctness fully and transparently in C, since the language has no feature to silently propogate the const. If you want to be 100% clean then provide a non-const version -- verbs_get_ctx_nc that takes non-const input. Functions that silently discard const are bad since it silently defeats static analysis around const. I recommend staying with V8 suggestion for both macros, in case you think there is any problem with missing () for the set operation please point on and may handle. Using the inline to help the -Os case is definitely desirable, and the fix to the () at least. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 4/4] Declare 'server_port' as an unsigned variable
thanks - applied all 4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mlx5: fix error return code in mlx5_alloc_uuars()
From: Wei Yongjun weiyj...@gmail.com Date: Tue, 30 Jul 2013 07:57:06 +0800 From: Wei Yongjun yongjun_...@trendmicro.com.cn Fix to return -ENOMEM from the ioremap error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun yongjun_...@trendmicro.com.cn Applied, thanks. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html