[PATCH] IB/core: Fix race condition in ib_uverbs_open_qp

2014-09-23 Thread Or Gerlitz
From: Jack Morgenstein ja...@dev.mellanox.co.il

In ib_uverbs_open_qp, the sharable xrc target qp is created as a pseudo qp
and added to a list of qp's sharing the same physical QP. This is done before
the pseudo qp is assigned a uobject.

There is a race condition here if an async event arrives at the physical qp.
If the event is handled after the pseudo qp is added to the list, but before
it is assigned a uobject, the kernel crashes in ib_uverbs_qp_event_handler,
due to trying to dereference a NULL uobject pointer.

Note that simply checking for non-NULL is not enough, due to error flows
in ib_uverbs_open_qp. If the failure is after assigning the uobject, but
before the qp has fully been created, we still have a problem.

Thus, in ib_uverbs_qp_event_handler, we test that the uobject is present,
and also that it is live.

Reported-by: Matthew Finlay m...@mellanox.com
Signed-off-by: Jack Morgenstein ja...@dev.mellanox.co.il
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/core/uverbs_main.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c 
b/drivers/infiniband/core/uverbs_main.c
index c73b22a..bb6fea1 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -502,6 +502,10 @@ void ib_uverbs_qp_event_handler(struct ib_event *event, 
void *context_ptr)
 {
struct ib_uevent_object *uobj;
 
+   /* for XRC target qp's, check that qp is live */
+   if (!event-element.qp-uobject || !event-element.qp-uobject-live)
+   return;
+
uobj = container_of(event-element.qp-uobject,
struct ib_uevent_object, uobject);
 
-- 
1.7.8.2

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 0/2] RDMA/ocrdma: ocrdma bug fixes

2014-09-23 Thread Devesh Sharma
Hi Roland,

Please include this series as well in your ongoing merge activity.

-Regards
 Devesh

 -Original Message-
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Devesh Sharma
 Sent: Friday, August 22, 2014 4:57 PM
 To: linux-rdma@vger.kernel.org
 Cc: Devesh Sharma
 Subject: [PATCH 0/2] RDMA/ocrdma: ocrdma bug fixes
 
 This series has bug fixes for 3.17
 
 Devesh Sharma (1):
   RDMA/ocrdma: add default gid at index 0
 
 Selvin Xavier (1):
   RDMA/ocrdma: get vlan tag from ib_qp_attrs
 
  drivers/infiniband/hw/ocrdma/ocrdma_hw.c   |9 +
  drivers/infiniband/hw/ocrdma/ocrdma_main.c |   10 ++
  2 files changed, 15 insertions(+), 4 deletions(-)
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] IB/core: Fix race condition in ib_uverbs_open_qp

2014-09-23 Thread Hefty, Sean
 --- a/drivers/infiniband/core/uverbs_main.c
 +++ b/drivers/infiniband/core/uverbs_main.c
 @@ -502,6 +502,10 @@ void ib_uverbs_qp_event_handler(struct ib_event
 *event, void *context_ptr)
  {
   struct ib_uevent_object *uobj;
 
 + /* for XRC target qp's, check that qp is live */
 + if (!event-element.qp-uobject || !event-element.qp-uobject-live)
 + return;

Isn't checking 'live' sufficient?

- Sean
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] IB/srp: Separate target and channel variables

2014-09-23 Thread Sagi Grimberg

On 9/19/2014 3:59 PM, Bart Van Assche wrote:

Changes in this patch:
- Move channel variables into a new structure (struct srp_rdma_ch).
- cm_id and completion handler context pointer are now of type
   srp_rdma_ch * insteoad of srp_target_port *.

No functionality is changed.

Signed-off-by: Bart Van Assche bvanass...@acm.org
---
  drivers/infiniband/ulp/srp/ib_srp.c | 707 +++-
  drivers/infiniband/ulp/srp/ib_srp.h |  59 +--
  2 files changed, 417 insertions(+), 349 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index fd88fb8..9feeea1 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -125,8 +125,8 @@ MODULE_PARM_DESC(dev_loss_tmo,

  static void srp_add_one(struct ib_device *device);
  static void srp_remove_one(struct ib_device *device);
-static void srp_recv_completion(struct ib_cq *cq, void *target_ptr);
-static void srp_send_completion(struct ib_cq *cq, void *target_ptr);
+static void srp_recv_completion(struct ib_cq *cq, void *ch_ptr);
+static void srp_send_completion(struct ib_cq *cq, void *ch_ptr);
  static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event);

  static struct scsi_transport_template *ib_srp_transport_template;
@@ -262,7 +262,7 @@ static int srp_init_qp(struct srp_target_port *target,

ret = ib_find_pkey(target-srp_host-srp_dev-dev,
   target-srp_host-port,
-  be16_to_cpu(target-path.pkey),
+  be16_to_cpu(target-pkey),
   attr-pkey_index);
if (ret)
goto out;
@@ -283,18 +283,23 @@ out:
return ret;
  }

-static int srp_new_cm_id(struct srp_target_port *target)
+static int srp_new_cm_id(struct srp_rdma_ch *ch)
  {
+   struct srp_target_port *target = ch-target;
struct ib_cm_id *new_cm_id;

new_cm_id = ib_create_cm_id(target-srp_host-srp_dev-dev,
-   srp_cm_handler, target);
+   srp_cm_handler, ch);
if (IS_ERR(new_cm_id))
return PTR_ERR(new_cm_id);

-   if (target-cm_id)
-   ib_destroy_cm_id(target-cm_id);
-   target-cm_id = new_cm_id;
+   if (ch-cm_id)
+   ib_destroy_cm_id(ch-cm_id);
+   ch-cm_id = new_cm_id;
+   ch-path.sgid = target-sgid;
+   ch-path.dgid = target-orig_dgid;
+   ch-path.pkey = target-pkey;
+   ch-path.service_id = target-service_id;

return 0;
  }
@@ -443,8 +448,9 @@ static struct srp_fr_pool *srp_alloc_fr_pool(struct 
srp_target_port *target)
  dev-max_pages_per_mr);
  }

-static int srp_create_target_ib(struct srp_target_port *target)
+static int srp_create_ch_ib(struct srp_rdma_ch *ch)
  {
+   struct srp_target_port *target = ch-target;
struct srp_device *dev = target-srp_host-srp_dev;
struct ib_qp_init_attr *init_attr;
struct ib_cq *recv_cq, *send_cq;
@@ -458,15 +464,15 @@ static int srp_create_target_ib(struct srp_target_port 
*target)
if (!init_attr)
return -ENOMEM;

-   recv_cq = ib_create_cq(dev-dev, srp_recv_completion, NULL, target,
-  target-queue_size, target-comp_vector);
+   recv_cq = ib_create_cq(dev-dev, srp_recv_completion, NULL, ch,
+  target-queue_size, ch-comp_vector);
if (IS_ERR(recv_cq)) {
ret = PTR_ERR(recv_cq);
goto err;
}

-   send_cq = ib_create_cq(dev-dev, srp_send_completion, NULL, target,
-  m * target-queue_size, target-comp_vector);
+   send_cq = ib_create_cq(dev-dev, srp_send_completion, NULL, ch,
+  m * target-queue_size, ch-comp_vector);
if (IS_ERR(send_cq)) {
ret = PTR_ERR(send_cq);
goto err_recv_cq;
@@ -502,9 +508,9 @@ static int srp_create_target_ib(struct srp_target_port 
*target)
 FR pool allocation failed (%d)\n, ret);
goto err_qp;
}
-   if (target-fr_pool)
-   srp_destroy_fr_pool(target-fr_pool);
-   target-fr_pool = fr_pool;
+   if (ch-fr_pool)
+   srp_destroy_fr_pool(ch-fr_pool);
+   ch-fr_pool = fr_pool;
} else if (!dev-use_fast_reg  dev-has_fmr) {
fmr_pool = srp_alloc_fmr_pool(target);
if (IS_ERR(fmr_pool)) {
@@ -513,21 +519,21 @@ static int srp_create_target_ib(struct srp_target_port 
*target)
 FMR pool allocation failed (%d)\n, ret);
goto err_qp;
}
-   if (target-fmr_pool)
-   ib_destroy_fmr_pool(target-fmr_pool);
-   target-fmr_pool = fmr_pool;
+ 

Re: [PATCH 8/8] IB/srp: Add multichannel support

2014-09-23 Thread Sagi Grimberg

On 9/19/2014 4:00 PM, Bart Van Assche wrote:

Improve performance by using multiple RDMA/RC channels per SCSI host
for communicating with an SRP target.



Hey Bart,

Since you don't seem to negotiate/declare multichannel with the target,
did you test this code with some target implementations other than SCST
that happen to be out there?

Overall, I think this patch would be easier to review if you also 
provide a list of logical changes (which obviously are introduced in 
this patch). Patch 7/8 can use some more information of target-channel

relations as well.


Signed-off-by: Bart Van Assche bvanass...@acm.org
---
  Documentation/ABI/stable/sysfs-driver-ib_srp |  25 +-
  drivers/infiniband/ulp/srp/ib_srp.c  | 337 ---
  drivers/infiniband/ulp/srp/ib_srp.h  |  20 +-
  3 files changed, 287 insertions(+), 95 deletions(-)

diff --git a/Documentation/ABI/stable/sysfs-driver-ib_srp 
b/Documentation/ABI/stable/sysfs-driver-ib_srp
index b9688de..d5a459e 100644
--- a/Documentation/ABI/stable/sysfs-driver-ib_srp
+++ b/Documentation/ABI/stable/sysfs-driver-ib_srp
@@ -55,12 +55,12 @@ Description:Interface for making ib_srp connect to 
a new target.
  only safe with partial memory descriptor list support enabled
  (allow_ext_sg=1).
* comp_vector, a number in the range 0..n-1 specifying the
- MSI-X completion vector. Some HCA's allocate multiple (n)
- MSI-X vectors per HCA port. If the IRQ affinity masks of
- these interrupts have been configured such that each MSI-X
- interrupt is handled by a different CPU then the comp_vector
- parameter can be used to spread the SRP completion workload
- over multiple CPU's.
+ MSI-X completion vector of the first RDMA channel. Some
+ HCA's allocate multiple (n) MSI-X vectors per HCA port. If
+ the IRQ affinity masks of these interrupts have been
+ configured such that each MSI-X interrupt is handled by a
+ different CPU then the comp_vector parameter can be used to
+ spread the SRP completion workload over multiple CPU's.



Why do you want the first channel vector placement? Why can't you start
with obvious 0?



* tl_retry_count, a number in the range 2..7 specifying the
  IB RC retry count.
* queue_size, the maximum number of commands that the
@@ -88,6 +88,13 @@ Description: Whether ib_srp is allowed to include a partial 
memory
descriptor list in an SRP_CMD when communicating with an SRP
target.

+What:  /sys/class/scsi_host/hostn/ch_count
+Date:  November 1, 2014
+KernelVersion: 3.18
+Contact:   linux-rdma@vger.kernel.org
+Description:   Number of RDMA channels used for communication with the SRP
+   target.
+
  What: /sys/class/scsi_host/hostn/cmd_sg_entries
  Date: May 19, 2011
  KernelVersion:2.6.39
@@ -95,6 +102,12 @@ Contact:linux-rdma@vger.kernel.org
  Description:  Maximum number of data buffer descriptors that may be sent to
the target in a single SRP_CMD request.

+What:  /sys/class/scsi_host/hostn/comp_vector
+Date:  September 2, 2013
+KernelVersion: 3.11
+Contact:   linux-rdma@vger.kernel.org
+Description:   Completion vector used for the first RDMA channel.
+
  What: /sys/class/scsi_host/hostn/dgid
  Date: June 17, 2006
  KernelVersion:2.6.17
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
b/drivers/infiniband/ulp/srp/ib_srp.c
index 9feeea1..58ca618 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -123,6 +123,16 @@ MODULE_PARM_DESC(dev_loss_tmo,
  if fast_io_fail_tmo has not been set. \off\ means that
  this functionality is disabled.);

+static unsigned ch_count;
+module_param(ch_count, uint, 0444);
+MODULE_PARM_DESC(ch_count,
+Number of RDMA channels to use for communication with an SRP
+ target. Using more than one channel improves performance
+ if the HCA supports multiple completion vectors. The
+ default value is the minimum of four times the number of
+ online CPU sockets and the number of completion vectors
+ supported by the HCA.);
+


Can you explain the default math? how did you end-up with 4*numa_nodes?
wouldn't per-cpu be a better fit?

Moreover, while using multiple channels you don't suffice for less
requests of less FMRs/FRs. I'm a bit concerned here about scalability
of multi-channel.

Should we take care of cases where the user will want lots of channels
to lots of targets and might run out of resources?


  static void srp_add_one(struct ib_device *device);
  static void srp_remove_one(struct 

Re: [PATCH 8/8] IB/srp: Add multichannel support

2014-09-23 Thread Bart Van Assche

On 23/09/2014 10:32, Sagi Grimberg wrote:

On 9/19/2014 4:00 PM, Bart Van Assche wrote:

Improve performance by using multiple RDMA/RC channels per SCSI host
for communicating with an SRP target.



Hey Bart,

Since you don't seem to negotiate/declare multichannel with the target,
did you test this code with some target implementations other than SCST
that happen to be out there?

Overall, I think this patch would be easier to review if you also 
provide a list of logical changes (which obviously are introduced in 
this patch). Patch 7/8 can use some more information of target-channel

relations as well.


Hello Sagi,

That's a good question. So far this patch series has only been tested 
against the SCST SRP target driver. However, as you probably noticed, if 
setting up a second or later RDMA channel fails SRP login is not failed 
but communication proceeds with the number of channels that have been 
established. This mechanism should retain backwards compatibility with 
SRP target systems that do not support multichannel communication. 
However, if the new code for SRP login turns out to be triggering bugs 
in existing SRP target implementations we can still add a blacklist for 
these implementations.


I will provide a more detailed list of logical changes in the second 
version of this patch series.


Bart.
Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2] svcrdma: Advertise the correct max payload

2014-09-23 Thread Steve Wise

  diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h 
  b/net/sunrpc/xprtrdma/xprt_rdma.h
  index c419498..a9cf5c3 100644
  --- a/net/sunrpc/xprtrdma/xprt_rdma.h
  +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
  @@ -392,4 +392,8 @@ extern struct kmem_cache *svc_rdma_ctxt_cachep;
  /* Workqueue created in svc_rdma.c */
  extern struct workqueue_struct *svc_rdma_wq;
 
  +#define RPCSVC_MAXPAYLOAD_RDMA \
  +   (RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT) ? \
  +RPCSVC_MAXPAYLOAD : (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT))
  +
 
 Couldn't you use:
 
 #if RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
 #define RPCSVC_MAXPAYLOAD_RDMA RPC_MAXPAYLOAD
 #else
 #define RPCSVC_MAXPAYLOAD_RDMA (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
 #endif
 
 That seems more idiomatic.

Sure.  That makes it easier to read in my opinion too.

I'll send out V3 with this change.

Thanks,

Steve.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] svcrdma: Advertise the correct max payload

2014-09-23 Thread 'J. Bruce Fields'
On Tue, Sep 23, 2014 at 02:42:34PM -0500, Steve Wise wrote:
 
   diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h 
   b/net/sunrpc/xprtrdma/xprt_rdma.h
   index c419498..a9cf5c3 100644
   --- a/net/sunrpc/xprtrdma/xprt_rdma.h
   +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
   @@ -392,4 +392,8 @@ extern struct kmem_cache *svc_rdma_ctxt_cachep;
   /* Workqueue created in svc_rdma.c */
   extern struct workqueue_struct *svc_rdma_wq;
  
   +#define RPCSVC_MAXPAYLOAD_RDMA \
   + (RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT) ? \
   +  RPCSVC_MAXPAYLOAD : (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT))
   +
  
  Couldn't you use:
  
  #if RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
  #define RPCSVC_MAXPAYLOAD_RDMA RPC_MAXPAYLOAD
  #else
  #define RPCSVC_MAXPAYLOAD_RDMA (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
  #endif
  
  That seems more idiomatic.
 
 Sure.  That makes it easier to read in my opinion too.
 
 I'll send out V3 with this change.

While we're bikeshedding, why not use min()?

--b.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2] svcrdma: Advertise the correct max payload

2014-09-23 Thread Steve Wise


 -Original Message-
 From: 'J. Bruce Fields' [mailto:bfie...@fieldses.org]
 Sent: Tuesday, September 23, 2014 2:48 PM
 To: Steve Wise
 Cc: 'Chuck Lever'; linux-...@vger.kernel.org; linux-rdma@vger.kernel.org
 Subject: Re: [PATCH V2] svcrdma: Advertise the correct max payload
 
 On Tue, Sep 23, 2014 at 02:42:34PM -0500, Steve Wise wrote:
 
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h 
b/net/sunrpc/xprtrdma/xprt_rdma.h
index c419498..a9cf5c3 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -392,4 +392,8 @@ extern struct kmem_cache *svc_rdma_ctxt_cachep;
/* Workqueue created in svc_rdma.c */
extern struct workqueue_struct *svc_rdma_wq;
   
+#define RPCSVC_MAXPAYLOAD_RDMA \
+   (RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT) ? \
+RPCSVC_MAXPAYLOAD : (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT))
+
  
   Couldn't you use:
  
   #if RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
   #define RPCSVC_MAXPAYLOAD_RDMA RPC_MAXPAYLOAD
   #else
   #define RPCSVC_MAXPAYLOAD_RDMA (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
   #endif
  
   That seems more idiomatic.
 
  Sure.  That makes it easier to read in my opinion too.
 
  I'll send out V3 with this change.
 
 While we're bikeshedding, why not use min()?
 
 --b.

I tried that initially.  But min() and min_t() don't work because of the way we 
use the #define.  With it defined thusly:

#define RPCSVC_MAXPAYLOAD_RDMA min(RPC_MAXPAYLOAD, RPCRDMA_MAX_DATA_SEGS  
PAGE_SHIFT)

I see this error:

  CC [M]  net/sunrpc/xprtrdma/svc_rdma_transport.o
net/sunrpc/xprtrdma/svc_rdma_transport.c:94: error: braced-group within 
expression allowed only inside a function
make[3]: *** [net/sunrpc/xprtrdma/svc_rdma_transport.o] Error 1
make[2]: *** [net/sunrpc/xprtrdma] Error 2
make[1]: *** [net/sunrpc] Error 2
make: *** [net] Error 2

min() looks like this:

#define min(x, y) ({\
typeof(x) _min1 = (x);  \
typeof(y) _min2 = (y);  \
(void) (_min1 == _min2);  \
_min1  _min2 ? _min1 : _min2; })

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2] svcrdma: Advertise the correct max payload

2014-09-23 Thread 'J. Bruce Fields'
On Tue, Sep 23, 2014 at 02:53:28PM -0500, Steve Wise wrote:
 
 
  -Original Message-
  From: 'J. Bruce Fields' [mailto:bfie...@fieldses.org]
  Sent: Tuesday, September 23, 2014 2:48 PM
  To: Steve Wise
  Cc: 'Chuck Lever'; linux-...@vger.kernel.org; linux-rdma@vger.kernel.org
  Subject: Re: [PATCH V2] svcrdma: Advertise the correct max payload
  
  On Tue, Sep 23, 2014 at 02:42:34PM -0500, Steve Wise wrote:
  
 diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h 
 b/net/sunrpc/xprtrdma/xprt_rdma.h
 index c419498..a9cf5c3 100644
 --- a/net/sunrpc/xprtrdma/xprt_rdma.h
 +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
 @@ -392,4 +392,8 @@ extern struct kmem_cache *svc_rdma_ctxt_cachep;
 /* Workqueue created in svc_rdma.c */
 extern struct workqueue_struct *svc_rdma_wq;

 +#define RPCSVC_MAXPAYLOAD_RDMA \
 + (RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT) ? \
 +  RPCSVC_MAXPAYLOAD : (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT))
 +
   
Couldn't you use:
   
#if RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
#define RPCSVC_MAXPAYLOAD_RDMA RPC_MAXPAYLOAD
#else
#define RPCSVC_MAXPAYLOAD_RDMA (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
#endif
   
That seems more idiomatic.
  
   Sure.  That makes it easier to read in my opinion too.
  
   I'll send out V3 with this change.
  
  While we're bikeshedding, why not use min()?
  
  --b.
 
 I tried that initially.  But min() and min_t() don't work because of the way 
 we use the #define.  With it defined thusly:

OK, OK.  Whatever works.--b.

 
 #define RPCSVC_MAXPAYLOAD_RDMA min(RPC_MAXPAYLOAD, RPCRDMA_MAX_DATA_SEGS  
 PAGE_SHIFT)
 
 I see this error:
 
   CC [M]  net/sunrpc/xprtrdma/svc_rdma_transport.o
 net/sunrpc/xprtrdma/svc_rdma_transport.c:94: error: braced-group within 
 expression allowed only inside a function
 make[3]: *** [net/sunrpc/xprtrdma/svc_rdma_transport.o] Error 1
 make[2]: *** [net/sunrpc/xprtrdma] Error 2
 make[1]: *** [net/sunrpc] Error 2
 make: *** [net] Error 2
 
 min() looks like this:
 
 #define min(x, y) ({\
 typeof(x) _min1 = (x);  \
 typeof(y) _min2 = (y);  \
 (void) (_min1 == _min2);  \
 _min1  _min2 ? _min1 : _min2; })
 
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/8] IB/srp: Separate target and channel variables

2014-09-23 Thread Bart Van Assche

On 23/09/2014 10:07, Sagi Grimberg wrote:

On 9/19/2014 3:59 PM, Bart Van Assche wrote:
@@ -3047,24 +3098,28 @@ static ssize_t srp_create_target(struct 
device *dev,

  INIT_WORK(target-tl_err_work, srp_tl_err_work);
  INIT_WORK(target-remove_work, srp_remove_work);
  spin_lock_init(target-lock);


Hey Bart,

You keep this target-lock around. I don't see in this patch any usage
of it left. Can you document what each of target-lock and ch-lock
protects?

So I would imagine that target-lock manages the target-wide variables
such as state (what else?) while ch-lock protect channel specific free
requests pool which can be moved to blk-mq preallocs and avoid
maintaining it (not sure how req_lim would be maintained though and if
it is still needed?).


Hello Sagi,

The use of these locks is very traditional - each lock is used to 
protect those members of the data structure it is a member of and that 
need to be protected against concurrent access.


Even with this patch applied target-lock is still used, e.g. to 
protect target port state modifications.


I will convert the free request pool in the next version of this patch 
series. The code for managing req_lim can probably be converted into 
something based on the blk-mq reserved tag infrastructure.


Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[GIT PULL] please pull infiniband.git

2014-09-23 Thread Roland Dreier
Hi Linus,

Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git 
tags/rdma-for-linus

This is later and bigger than I would like, and the blame is all on
me: I got very busy with other stuff for a few weeks during the 3.17
cycle, and didn't prepare this tree as soon as I should have.  However
I don't think there's anything risky here, and no one really cares if
we break InfiniBand in 3.17 anyway...


Last late set of InfiniBand/RDMA fixes for 3.17:

 - Fixes for the new memory region re-registration support
 - iSER initiator error path fixes
 - Grab bag of small fixes for the qib and ocrdma hardware drivers
 - Larger set of fixes for mlx4, especially in RoCE mode


Alex Estrin (1):
  IPoIB: Remove unnecessary port query

Devesh Sharma (2):
  RDMA/ocrdma: Report correct value of max_fast_reg_page_list_len
  RDMA/ocrdma: Do not skip setting deferred_arm

Jack Morgenstein (6):
  IB/mlx4: Fix lockdep splat for the iboe lock
  mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
  IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
  IB/mlx4: Don't update QP1 in native mode
  IB/mlx4: Do not allow APM under RoCE
  IB/mlx4: Fix VF mac handling in RoCE

Markus Stockhausen (1):
  IB/mlx4: Disable TSO for Connect-X rev. A0 HCAs

Matan Barak (2):
  mlx4: Correct error flows in rereg_mr
  IB/core: When marshaling uverbs path, clear unused fields

Mike Marciniszyn (3):
  IB/ipath: Change get_user_pages() usage to always NULL vmas
  IB/qib: Change get_user_pages() usage to always NULL vmas
  IB/qib: Correct reference counting in debugfs qp_stats

Moni Shoua (5):
  IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
  IB/mlx4: Don't duplicate the default RoCE GID
  IB/mlx4: Reorder steps in RoCE GID table initialization
  IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
  IB/mlx4: Avoid executing gid task when device is being removed

Or Gerlitz (1):
  IB/iser: Bump version to 1.4.1

Roi Dayan (1):
  IB/iser: Fix RX/TX CQ resource leak on error flow

Roland Dreier (1):
  Merge branches 'core', 'ipoib', 'iser', 'mlx4', 'ocrdma' and 'qib' into 
for-next

Sagi Grimberg (1):
  IB/iser: Allow bind only when connection state is UP

Shawn Bohrer (1):
  IB: ib_umem_release() should decrement mm-pinned_vm from ib_umem_get

devesh.sha...@emulex.com (2):
  RDMA/ocrdma: Resolve L2 address when creating user AH
  RDMA/ocrdma: Use right macro in query AH

 drivers/infiniband/core/umem.c |  19 ++-
 drivers/infiniband/core/uverbs_marshall.c  |   4 +
 drivers/infiniband/hw/ipath/ipath_user_pages.c |   6 +-
 drivers/infiniband/hw/mlx4/main.c  | 169 +
 drivers/infiniband/hw/mlx4/mlx4_ib.h   |   1 +
 drivers/infiniband/hw/mlx4/mr.c|   7 +-
 drivers/infiniband/hw/mlx4/qp.c|  60 +
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c   |  43 +--
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c|   6 +-
 drivers/infiniband/hw/qib/qib_debugfs.c|   3 +-
 drivers/infiniband/hw/qib/qib_qp.c |   8 --
 drivers/infiniband/hw/qib/qib_user_pages.c |   6 +-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |  10 +-
 drivers/infiniband/ulp/iser/iscsi_iser.c   |  19 ++-
 drivers/infiniband/ulp/iser/iscsi_iser.h   |   2 +-
 drivers/infiniband/ulp/iser/iser_verbs.c   |  24 ++--
 drivers/net/ethernet/mellanox/mlx4/mr.c|  33 +++--
 drivers/net/ethernet/mellanox/mlx4/port.c  |  11 +-
 include/rdma/ib_umem.h |   1 +
 19 files changed, 277 insertions(+), 155 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V3] svcrdma: advertise the correct max payload

2014-09-23 Thread Steve Wise
Svcrdma currently advertises 1MB, which is too large.  The correct value
is the minimum of RPCSVC_MAXPAYLOAD and the max scatter-gather allowed
in an NFSRDMA IO chunk * the host page size. This bug is usually benign
because the Linux X64 NFSRDMA client correctly limits the payload size to
the correct value (64*4096 = 256KB).  But if the Linux client is PPC64
with a 64KB page size, then the client will indeed use a payload size
that will overflow the server.

Signed-off-by: Steve Wise sw...@opengridcomputing.com
---

 net/sunrpc/xprtrdma/svc_rdma_transport.c |2 +-
 net/sunrpc/xprtrdma/xprt_rdma.h  |7 +++
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c 
b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index 374feb4..4e61880 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -91,7 +91,7 @@ struct svc_xprt_class svc_rdma_class = {
.xcl_name = rdma,
.xcl_owner = THIS_MODULE,
.xcl_ops = svc_rdma_ops,
-   .xcl_max_payload = RPCSVC_MAXPAYLOAD_TCP,
+   .xcl_max_payload = RPCSVC_MAXPAYLOAD_RDMA,
.xcl_ident = XPRT_TRANSPORT_RDMA,
 };
 
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index c419498..ac7fc9a 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -51,6 +51,7 @@
 #include linux/sunrpc/clnt.h /* rpc_xprt */
 #include linux/sunrpc/rpc_rdma.h /* RPC/RDMA protocol */
 #include linux/sunrpc/xprtrdma.h /* xprt parameters */
+#include linux/sunrpc/svc.h  /* RPCSVC_MAXPAYLOAD */
 
 #define RDMA_RESOLVE_TIMEOUT   (5000)  /* 5 seconds */
 #define RDMA_CONNECT_RETRY_MAX (2) /* retries if no listener backlog */
@@ -392,4 +393,10 @@ extern struct kmem_cache *svc_rdma_ctxt_cachep;
 /* Workqueue created in svc_rdma.c */
 extern struct workqueue_struct *svc_rdma_wq;
 
+#if RPCSVC_MAXPAYLOAD  (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
+#define RPCSVC_MAXPAYLOAD_RDMA RPCSVC_MAXPAYLOAD
+#else
+#define RPCSVC_MAXPAYLOAD_RDMA (RPCRDMA_MAX_DATA_SEGS  PAGE_SHIFT)
+#endif
+
 #endif /* _LINUX_SUNRPC_XPRT_RDMA_H */

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] Add missing neigh_release, fix ntuple calculation for IPv6

2014-09-23 Thread Hariprasad Shenai
This patch series adds missing neigh release in find_route function and also
takes IPv6 into consideration in best_mtu and set_emss function. It also fixes
ntuple calculation for IPv6 for T5 adapter.

The patches series is created against 'infiniband' tree.
And includes patches on iw_cxgb4 driver.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.

Thanks

Hariprasad Shenai (3):
  RDMA/cxgb4: Take IPv6 into account for best_mtu and set_emss
  RDMA/cxgb4: Add missing neigh_release in find_route
  RDMA/cxgb4: Fix ntuple calculation for ipv6 and remove duplicate line

 drivers/infiniband/hw/cxgb4/cm.c |   32 
 1 files changed, 20 insertions(+), 12 deletions(-)

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] RDMA/cxgb4: Add missing neigh_release in find_route

2014-09-23 Thread Hariprasad Shenai
Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cm.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index 2ca9ec8..2ee9892 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -417,6 +417,7 @@ static struct dst_entry *find_route(struct c4iw_dev *dev, 
__be32 local_ip,
return NULL;
if (!our_interface(dev, n-dev) 
!(n-dev-flags  IFF_LOOPBACK)) {
+   neigh_release(n);
dst_release(rt-dst);
return NULL;
}
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html