Re: [PATCH] libibverbs: Add the use of IBV_SEND_INLINE to example pingpong programs

2013-07-30 Thread Jeff Squyres (jsquyres)
4th bump...

On Jul 10, 2013, at 4:32 PM, Jeff Squyres jsquy...@cisco.com wrote:

 If the send size is less than the cap.max_inline_data reported by the
 qp, use the IBV_SEND_INLINE flag.  This now only shows the example of
 using ibv_query_qp(), it also reduces the latency time shown by the
 pingpong programs when the sends can be inlined.
 
 Signed-off-by: Jeff Squyres jsquy...@cisco.com
 ---
 examples/rc_pingpong.c  | 18 +-
 examples/srq_pingpong.c | 19 +--
 examples/uc_pingpong.c  | 17 -
 examples/ud_pingpong.c  | 18 +-
 4 files changed, 51 insertions(+), 21 deletions(-)
 
 diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c
 index 15494a1..a8637a5 100644
 --- a/examples/rc_pingpong.c
 +++ b/examples/rc_pingpong.c
 @@ -65,6 +65,7 @@ struct pingpong_context {
   struct ibv_qp   *qp;
   void*buf;
   int  size;
 + int  send_flags;
   int  rx_depth;
   int  pending;
   struct ibv_port_attr portinfo;
 @@ -319,8 +320,9 @@ static struct pingpong_context *pp_init_ctx(struct 
 ibv_device *ib_dev, int size,
   if (!ctx)
   return NULL;
 
 - ctx-size = size;
 - ctx-rx_depth = rx_depth;
 + ctx-size   = size;
 + ctx-send_flags = IBV_SEND_SIGNALED;
 + ctx-rx_depth   = rx_depth;
 
   ctx-buf = memalign(page_size, size);
   if (!ctx-buf) {
 @@ -367,7 +369,8 @@ static struct pingpong_context *pp_init_ctx(struct 
 ibv_device *ib_dev, int size,
   }
 
   {
 - struct ibv_qp_init_attr attr = {
 + struct ibv_qp_attr attr;
 + struct ibv_qp_init_attr init_attr = {
   .send_cq = ctx-cq,
   .recv_cq = ctx-cq,
   .cap = {
 @@ -379,11 +382,16 @@ static struct pingpong_context *pp_init_ctx(struct 
 ibv_device *ib_dev, int size,
   .qp_type = IBV_QPT_RC
   };
 
 - ctx-qp = ibv_create_qp(ctx-pd, attr);
 + ctx-qp = ibv_create_qp(ctx-pd, init_attr);
   if (!ctx-qp)  {
   fprintf(stderr, Couldn't create QP\n);
   goto clean_cq;
   }
 +
 + ibv_query_qp(ctx-qp, attr, IBV_QP_CAP, init_attr);
 + if (init_attr.cap.max_inline_data = size) {
 + ctx-send_flags |= IBV_SEND_INLINE;
 + }
   }
 
   {
 @@ -508,7 +516,7 @@ static int pp_post_send(struct pingpong_context *ctx)
   .sg_list= list,
   .num_sge= 1,
   .opcode = IBV_WR_SEND,
 - .send_flags = IBV_SEND_SIGNALED,
 + .send_flags = ctx-send_flags,
   };
   struct ibv_send_wr *bad_wr;
 
 diff --git a/examples/srq_pingpong.c b/examples/srq_pingpong.c
 index 6e00f8c..552a144 100644
 --- a/examples/srq_pingpong.c
 +++ b/examples/srq_pingpong.c
 @@ -68,6 +68,7 @@ struct pingpong_context {
   struct ibv_qp   *qp[MAX_QP];
   void*buf;
   int  size;
 + int  send_flags;
   int  num_qp;
   int  rx_depth;
   int  pending[MAX_QP];
 @@ -350,9 +351,10 @@ static struct pingpong_context *pp_init_ctx(struct 
 ibv_device *ib_dev, int size,
   if (!ctx)
   return NULL;
 
 - ctx-size = size;
 - ctx-num_qp   = num_qp;
 - ctx-rx_depth = rx_depth;
 + ctx-size   = size;
 + ctx-send_flags = IBV_SEND_SIGNALED;
 + ctx-num_qp = num_qp;
 + ctx-rx_depth   = rx_depth;
 
   ctx-buf = memalign(page_size, size);
   if (!ctx-buf) {
 @@ -413,7 +415,8 @@ static struct pingpong_context *pp_init_ctx(struct 
 ibv_device *ib_dev, int size,
   }
 
   for (i = 0; i  num_qp; ++i) {
 - struct ibv_qp_init_attr attr = {
 + struct ibv_qp_attr attr;
 + struct ibv_qp_init_attr init_attr = {
   .send_cq = ctx-cq,
   .recv_cq = ctx-cq,
   .srq = ctx-srq,
 @@ -424,11 +427,15 @@ static struct pingpong_context *pp_init_ctx(struct 
 ibv_device *ib_dev, int size,
   .qp_type = IBV_QPT_RC
   };
 
 - ctx-qp[i] = ibv_create_qp(ctx-pd, attr);
 + ctx-qp[i] = ibv_create_qp(ctx-pd, init_attr);
   if (!ctx-qp[i])  {
   fprintf(stderr, Couldn't create QP[%d]\n, i);
   goto clean_qps;
   }
 + ibv_query_qp(ctx-qp[i], attr, IBV_QP_CAP, init_attr);
 + if (init_attr.cap.max_inline_data = size) {
 + ctx-send_flags |= IBV_SEND_INLINE;
 + }
   }
 
   for (i = 0; i  num_qp; ++i) {
 @@ -568,7 +575,7 @@ static int pp_post_send(struct 

Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-30 Thread Jeff Squyres (jsquyres)
On Jul 23, 2013, at 9:26 AM, Jeff Squyres (jsquyres) jsquy...@cisco.com wrote:

 .. and UD is the least abstracted transport, so existing apps won't
 support Jeff's new NIC anyhow, MTU is the least of their problems.
 
 Existing apps with existing transports see the same old values.


Bump.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: rtnl_lock deadlock on 3.10

2013-07-30 Thread Steve Wise

On 7/29/2013 6:02 PM, Shawn Bohrer wrote:

On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote:

On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote:

On 03/07/2013 20:22, Shawn Bohrer wrote:

On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote:

On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote:

On Tue, Jul 02, 2013 at 01:38:26PM +, Cong Wang wrote:

On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa 
han...@stressinduktion.org wrote:

On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote:

I've managed to hit a deadlock at boot a couple times while testing
the 3.10 rc kernels.  It seems to always happen when my network
devices are initializing.  This morning I updated to v3.10 and made a
few config tweaks and so far I've hit it 4 out of 5 reboots.  It looks
like most processes are getting stuck on rtnl_lock.  Below is a boot
log with the soft lockup prints.  Please let know if there is any
other information I can provide:

Could you try a build with CONFIG_LOCKDEP enabled?


The problem is clear: ib_register_device() is called with rtnl_lock,
but itself needs device_mutex, however, ib_register_client() first
acquires device_mutex, then indirectly calls register_netdev() which
takes rtnl_lock. Deadlock!

One possible fix is always taking rtnl_lock before taking
device_mutex, something like below:

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 18c1ece..890870b 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client)
  {
struct ib_device *device;
+   rtnl_lock();
mutex_lock(device_mutex);
list_add_tail(client-list, client_list);
@@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client)
client-add(device);
mutex_unlock(device_mutex);
+   rtnl_unlock();
return 0;
  }
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index b6e049a..5a7a048 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char 
*format,
goto event_failed;
}
-   result = register_netdev(priv-dev);
+   result = register_netdevice(priv-dev);
if (result) {
printk(KERN_WARNING %s: couldn't register ipoib port %d; error 
%d\n,
   hca-name, port, result);

Looks good to me. Shawn, could you test this patch?

ib_unregister_device/ib_unregister_client would need the same change,
too. I have not checked the other -add() and -remove() functions. Also
cc'ed linux-rdma@vger.kernel.org, Roland Dreier.

Cong's patch is missing the #include linux/rtnetlink.h but otherwise
I've had 34 successful reboots with no deadlocks which is a good sign.
It sounds like there are more paths that need to be audited and a
proper patch submitted.  I can do more testing later if needed.

Thanks,
Shawn


Guys, I was a bit busy today looking into that, but I don't think we
want the IB core layer  (core/device.c) to
use rtnl locking which is something that belongs to the network stack.

Has anymore thought been put into a proper fix for this issue?

I'm no expert in this area but I'm having a hard time seeing a
different solution than the one Cong suggested.  Just to be clear the
deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd
Steve Wise in case he has a better solution from the Chelsio side.


I don't know of another way to resolve this.   The rtnl lock is used in 
ipoib and mlx4 already.  I think we should go forward with the proposed 
patch.


Steve.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH opensm] osm_sa_path_record.c: Fix rate setting issue in SA PR handling

2013-07-30 Thread Hal Rosenstock

For switch egress ports, fix rate setting for an extended speed port

Signed-off-by: Hal Rosenstock h...@mellanox.com
---
 opensm/osm_sa_path_record.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/osm_sa_path_record.c b/opensm/osm_sa_path_record.c
index 02aeb9f..e7ac607 100644
--- a/opensm/osm_sa_path_record.c
+++ b/opensm/osm_sa_path_record.c
@@ -385,7 +385,7 @@ static ib_api_status_t pr_rcv_get_path_parms(IN osm_sa_t * 
sa,
  ib_port_info_compute_rate(p_pi,

p_pi0-capability_mask  IB_PORT_CAP_HAS_EXT_SPEEDS))  0)
rate = ib_port_info_compute_rate(p_pi,
-p_pi-capability_mask 
 IB_PORT_CAP_HAS_EXT_SPEEDS);
+p_pi0-capability_mask 
 IB_PORT_CAP_HAS_EXT_SPEEDS);
 
if (sa-p_subn-opt.qos) {
/*
-- 
1.7.8.2

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-30 Thread Jeff Squyres (jsquyres)
On Jul 30, 2013, at 12:44 PM, Christoph Lameter c...@gentwo.org wrote:

 What in the world does that mean? I am an oldtimer I guess. Seems that
 this is something that can be done in the newfangled forum? How does this
 affect mailing lists?


I'm not sure what you're asking me; please see the prior posts on this thread 
that describes the MTU issue and why we still need a solution.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-30 Thread Christoph Lameter
On Tue, 30 Jul 2013, Jeff Squyres (jsquyres) wrote:

 On Jul 30, 2013, at 12:44 PM, Christoph Lameter c...@gentwo.org wrote:

  What in the world does that mean? I am an oldtimer I guess. Seems that
  this is something that can be done in the newfangled forum? How does this
  affect mailing lists?


 I'm not sure what you're asking me; please see the prior posts on this
 thread that describes the MTU issue and why we still need a solution.

What does bump mean? You keep sending replies that just says bump.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-30 Thread Doug Ledford
On 07/30/2013 02:45 PM, Atchley, Scott wrote:
 http://en.wikipedia.org/wiki/Internet_forum#Thread
 
 When a member posts in a thread it will jump to the top since it is
 the latest updated thread. Similarly, other threads will jump in
 front of it when they receive posts. When a member posts in a thread
 for no reason but to have it go to the top, it is referred to as a
 bump or bumping.
 
 He is trying to bring it back to everyone's attention.

Exactly.  On a mailing list like this, it's shorthand for You seem to
have dropped this on the floor, let me resend it to you.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-30 Thread Atchley, Scott
On Jul 30, 2013, at 2:31 PM, Christoph Lameter c...@gentwo.org wrote:

 On Tue, 30 Jul 2013, Jeff Squyres (jsquyres) wrote:
 
 On Jul 30, 2013, at 12:44 PM, Christoph Lameter c...@gentwo.org wrote:
 
 What in the world does that mean? I am an oldtimer I guess. Seems that
 this is something that can be done in the newfangled forum? How does this
 affect mailing lists?
 
 
 I'm not sure what you're asking me; please see the prior posts on this
 thread that describes the MTU issue and why we still need a solution.
 
 What does bump mean? You keep sending replies that just says bump.

http://en.wikipedia.org/wiki/Internet_forum#Thread

When a member posts in a thread it will jump to the top since it is the latest 
updated thread. Similarly, other threads will jump in front of it when they 
receive posts. When a member posts in a thread for no reason but to have it go 
to the top, it is referred to as a bump or bumping.

He is trying to bring it back to everyone's attention.

Scott

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mlx5: Implement new initialization sequence

2013-07-30 Thread Or Gerlitz
On Thu, Jul 18, 2013 at 3:31 PM, Eli Cohen e...@dev.mellanox.co.il wrote:
 Introduce enbale_hca and disable_hca commands to signify when the driver 
 starts
 or ceases to operate on the device.
 In addition the driver will use boot and init pages count; boot pages is
 required to allow firmware to complete boot commands and the other to complete
 init hca. Command interface revision is bumped to 4 to enforce using supported
 firmware.

Roland, haven't seen a response from you for about 2 weeks. Its
critical that the first upstream release that includes the mlx5 driver
will use this initilization sequence. Its mlx5_core patch, do you want
it to go through netdev/Dave?

Or.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V8 libibverbs 1/7] Infrastructure to support verbs extensions

2013-07-30 Thread Jason Gunthorpe
On Tue, Jul 30, 2013 at 11:27:15PM +0300, Yishai Hadas wrote:

 Can accept it, however as it's a const pointer may need some casting later
 which is not fully clean.

Drop the const from the definition then.

 Your suggestion for verbs_set_ctx_op can't not work as it calls internally
 to verbs_get_ctx_op and will may fail as
 at that step function pointer was not set and *(void **)((uint8_t *)vctx +
 off) will be NULL.

Yes, that is too bad in that case, but the old macro is still flawed:

+#define verbs_set_ctx_op(vctx, op, ptr) ({ \
+   if (vctx  (vctx-sz = sizeof(*vctx) - offsetof(struct verbs_context, 
op))) \
+   vctx-op = ptr; })

- Missing () on all vctx usages
- Missing type enforcement on vctx

Something like this:

#define verbs_set_ctx_op(_vctx, op, ptr) ({ \
   struct verbs_context *vctx = _vctx; \
   if (vctx  (vctx-sz = sizeof(*vctx) - offsetof(struct verbs_context, 
op))) \
   vctx-op = ptr; })

 In addition changing to use const as part of returning from
 __verbs_get_ctx_op causes some necessary casting to non const in some places
 which
 is not fully clean. (e.g. free((void *)context_ex); as part of
 __ibv_close_device, verbs_ctx-has_comp_mask = VERBS_CONTEXT_XRCD |
 VERBS_CONTEXT_SRQ |
 VERBS_CONTEXT_QP; as part of mlx4_init_context)

It is virtually impossible to do const-correctness fully and
transparently in C, since the language has no feature to silently
propogate the const.

If you want to be 100% clean then provide a non-const version --
verbs_get_ctx_nc that takes non-const input.

Functions that silently discard const are bad since it silently
defeats static analysis around const.

 I recommend staying with V8 suggestion for both macros, in case
 you think there is any problem with missing () for the set operation
 please point on and may handle.

Using the inline to help the -Os case is definitely desirable, and the
fix to the () at least.
 
Jason
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 4/4] Declare 'server_port' as an unsigned variable

2013-07-30 Thread Hefty, Sean
thanks - applied all 4
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mlx5: fix error return code in mlx5_alloc_uuars()

2013-07-30 Thread David Miller
From: Wei Yongjun weiyj...@gmail.com
Date: Tue, 30 Jul 2013 07:57:06 +0800

 From: Wei Yongjun yongjun_...@trendmicro.com.cn
 
 Fix to return -ENOMEM from the ioremap error handling
 case instead of 0, as done elsewhere in this function.
 
 Signed-off-by: Wei Yongjun yongjun_...@trendmicro.com.cn

Applied, thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html