RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask

2015-02-04 Thread Weiny, Ira
> 
> On Thu, Jan 29, 2015 at 09:50:38PM +0100, Yann Droneaud wrote:
> 
> > Anyway, I recognize that the uverbs way of abusing the write() syscall
> > is borderline (at best) compared with other Linux subsystems and the
> > Unix paradigm in general. But that's no reason to screw it up more.
> 
> Then we must return the correct output size explicitly in the struct.

I was thinking this very same thing as I read through this thread.

I too would like to avoid the use of comp_masks if at all possible.  The query 
call seems to be a verb where the structure size alone is enough to tell you 
the set of values returned.

As Jason says, other calls may require the comp_mask where 0 is not sufficient 
to indicate "missing".
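Ira's size-based extensibility point can be sketched in user space: the consumer compares the response length the kernel reports against the offset of each newer field, instead of testing comp_mask bits. The struct and field names below are hypothetical, a minimal illustration rather than the real uverbs ABI:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical extensible query response: a field is valid only if it
 * falls entirely within the response_length the kernel reports back. */
struct query_resp {
	unsigned int response_length;	/* bytes of this struct actually filled in */
	unsigned int max_qp;		/* original fields */
	unsigned int max_cq;
	unsigned int timestamp_mask;	/* field added in a later ABI revision */
};

/* An older kernel fills only the original fields and reports a short length. */
static int has_timestamp_mask(const struct query_resp *resp)
{
	return resp->response_length >=
	       offsetof(struct query_resp, timestamp_mask) +
	       sizeof(resp->timestamp_mask);
}
```

No bit needs to be reserved per field; one returned length covers every future extension, which is why a plain query verb can get away without a comp_mask.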

Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: rdma_connect() and IBV_QPT_UC

2015-02-04 Thread Hefty, Sean
> I have a fairly complex system that I've built over the last few years
> as part of my PhD that uses reliable connections for message-passing
> and RDMA. After reading a recent paper, I decided to compare the
> performance of message-passing over unreliable connections for the
> sake of completeness, but unfortunately, I've repeatedly run into
> obstacles trying to switch to using the UC mode. I even wrote a tiny
> client and server that simply opens a connection, sends a message, and
> sends a response, and when I switch from RC to UC, the rdma_accept()
> at the server always fails with EINVAL when trying to accept the
> client's rdma_connect(). I'm afraid documentation, source code, and
> experimentation have left me stymied. I have tried using RDMA_PS_IB
> and RDMA_PS_UDP instead of RDMA_PS_TCP in rdma_create_id() to no
> avail, I've experimented with qp_attr fields, and my MWE shows the
> same problems as the larger system. Any assistance would be greatly
> appreciated, and I'd be happy to provide more details about my setup
> of the event channel, queue pair, buffers, and so on if necessary.

I believe you can get UC support by using the calls rdma_getaddrinfo and 
rdma_create_ep, which allow you to specify the qp_type as IBV_QPT_UC.  For the 
port space, you will need RDMA_PS_IB.  The TCP and UDP port spaces map to QP 
types RC and UD, respectively.
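Putting that together, a client-side sketch might look like the following. It assumes librdmacm headers and RDMA hardware (so it is an untested sketch, not a verified implementation); "server" and "port" are placeholders and error handling is abbreviated:

```c
#include <string.h>
#include <rdma/rdma_cma.h>

/* Resolve the address with the IB port space and a UC QP type, then let
 * rdma_create_ep build the cm_id and QP accordingly. */
static struct rdma_cm_id *create_uc_ep(const char *server, const char *port)
{
	struct rdma_addrinfo hints, *res;
	struct ibv_qp_init_attr attr;
	struct rdma_cm_id *id;

	memset(&hints, 0, sizeof(hints));
	hints.ai_port_space = RDMA_PS_IB;	/* TCP->RC, UDP->UD; IB is needed for UC */
	hints.ai_qp_type    = IBV_QPT_UC;
	if (rdma_getaddrinfo(server, port, &hints, &res))
		return NULL;

	memset(&attr, 0, sizeof(attr));
	attr.qp_type = IBV_QPT_UC;
	attr.cap.max_send_wr = attr.cap.max_recv_wr = 4;
	attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
	if (rdma_create_ep(&id, res, NULL, &attr)) {
		rdma_freeaddrinfo(res);
		return NULL;
	}
	rdma_freeaddrinfo(res);
	return id;	/* client then calls rdma_connect(id, NULL) */
}
```

The server side would set RAI_PASSIVE in hints.ai_flags and use rdma_listen()/rdma_get_request() before rdma_accept().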


rdma_connect() and IBV_QPT_UC

2015-02-04 Thread Christopher Mitchell
Hey there,

I have a fairly complex system that I've built over the last few years
as part of my PhD that uses reliable connections for message-passing
and RDMA. After reading a recent paper, I decided to compare the
performance of message-passing over unreliable connections for the
sake of completeness, but unfortunately, I've repeatedly run into
obstacles trying to switch to using the UC mode. I even wrote a tiny
client and server that simply opens a connection, sends a message, and
sends a response, and when I switch from RC to UC, the rdma_accept()
at the server always fails with EINVAL when trying to accept the
client's rdma_connect(). I'm afraid documentation, source code, and
experimentation have left me stymied. I have tried using RDMA_PS_IB
and RDMA_PS_UDP instead of RDMA_PS_TCP in rdma_create_id() to no
avail, I've experimented with qp_attr fields, and my MWE shows the
same problems as the larger system. Any assistance would be greatly
appreciated, and I'd be happy to provide more details about my setup
of the event channel, queue pair, buffers, and so on if necessary.

Cheers,
Christopher


RE: [PATCH V1 for-next 3/4] IB/core: Make sure that the PSN does not overflow

2015-02-04 Thread Hefty, Sean
> > >>@@ -860,6 +860,12 @@ int ib_modify_qp_is_ok(enum ib_qp_state
> cur_state, enum ib_qp_state next_state,
> > >>  if (mask & ~(req_param | opt_param | IB_QP_STATE))
> > >>  return 0;
> > >>+ if ((mask & IB_QP_SQ_PSN) && (attr->sq_psn & 0xff000000))
> > >>+ return 0;
> > >>+
> > >>+ if ((mask & IB_QP_RQ_PSN) && (attr->rq_psn & 0xff000000))
> > >>+ return 0;
> > >>+
> 
> > >And since rdmacm has had this longstanding bug of generating > 24
> > >bit PSNs, this change seems really scary - very likely to break
> > >working systems.
> 
> > By IBTA the HW can only use 24 bits, also the IB CM also makes sure
> > to only encode/decode 24 PSN bits to/from the wire (see the PSN
> > related helpers in drivers/infiniband/core/cm_msgs.h), so in that
> > respect, I don't see what other bits which are not 24 bits out of
> > the 32 generated ones could be of some use to existing applications,
> > please clarify.
> 
> Maybe you can explain why this check is suddenly important now? It
> seems risky with no rationale.

To add to this, there's a difference between the code ignoring the upper 
8 bits versus mandating that they be 0.
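The distinction can be made concrete: masking quietly accepts a greater-than-24-bit PSN (as the CM wire encoding effectively does today), while the proposed check rejects it outright. A minimal user-space sketch, using the 24-bit PSN width from IBTA:

```c
#include <assert.h>
#include <stdint.h>

#define IB_PSN_MASK 0xffffffu	/* IBTA PSNs are 24 bits on the wire */

/* Option 1: silently truncate, as the IB CM encode/decode helpers
 * effectively do when putting the PSN on the wire. */
static uint32_t psn_truncate(uint32_t psn)
{
	return psn & IB_PSN_MASK;
}

/* Option 2: reject, as the proposed ib_modify_qp_is_ok() check would. */
static int psn_is_valid(uint32_t psn)
{
	return (psn & ~(uint32_t)IB_PSN_MASK) == 0;
}
```

A caller that has always generated 32-bit PSNs keeps working under option 1 but starts failing under option 2, which is the compatibility risk being raised.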


[PATCH v4 18/19] IB/mad: Implement Intel Omni-Path Architecture SMP processing

2015-02-04 Thread ira . weiny
From: Ira Weiny 

Define the new OPA SMP format and create support functions for it, using
the previously defined helper functions as appropriate.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/mad_priv.h |   2 +
 drivers/infiniband/core/opa_smi.h  |  78 +++
 drivers/infiniband/core/smi.c  |  54 +++
 drivers/infiniband/core/smi.h  |   6 +++
 include/rdma/opa_smi.h | 106 +
 5 files changed, 246 insertions(+)
 create mode 100644 drivers/infiniband/core/opa_smi.h
 create mode 100644 include/rdma/opa_smi.h

diff --git a/drivers/infiniband/core/mad_priv.h 
b/drivers/infiniband/core/mad_priv.h
index 10df80d..141b05a 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define IB_MAD_QPS_CORE2 /* Always QP0 and QP1 as a minimum */
 
@@ -82,6 +83,7 @@ struct ib_mad_private {
struct ib_smp smp;
struct jumbo_mad jumbo_mad;
struct jumbo_rmpp_mad jumbo_rmpp_mad;
+   struct opa_smp opa_smp;
} mad;
 } __attribute__ ((packed));
 
diff --git a/drivers/infiniband/core/opa_smi.h 
b/drivers/infiniband/core/opa_smi.h
new file mode 100644
index 000..d180179
--- /dev/null
+++ b/drivers/infiniband/core/opa_smi.h
@@ -0,0 +1,78 @@
+/*
+ * Copyright (c) 2014 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef __OPA_SMI_H_
+#define __OPA_SMI_H_
+
+#include 
+#include 
+
+#include "smi.h"
+
+enum smi_action opa_smi_handle_dr_smp_recv(struct opa_smp *smp, u8 node_type,
+  int port_num, int phys_port_cnt);
+int opa_smi_get_fwd_port(struct opa_smp *smp);
+extern enum smi_forward_action opa_smi_check_forward_dr_smp(struct opa_smp 
*smp);
+extern enum smi_action opa_smi_handle_dr_smp_send(struct opa_smp *smp,
+ u8 node_type, int port_num);
+
+/*
+ * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM
+ * via process_mad
+ */
+static inline enum smi_action opa_smi_check_local_smp(struct opa_smp *smp,
+ struct ib_device *device)
+{
+   /* C14-9:3 -- We're at the end of the DR segment of path */
+   /* C14-9:4 -- Hop Pointer = Hop Count + 1 -> give to SMA/SM */
+   return (device->process_mad &&
+   !opa_get_smp_direction(smp) &&
+   (smp->hop_ptr == smp->hop_cnt + 1)) ?
+   IB_SMI_HANDLE : IB_SMI_DISCARD;
+}
+
+/*
+ * Return IB_SMI_HANDLE if the SMP should be handled by the local SMA/SM
+ * via process_mad
+ */
+static inline enum smi_action opa_smi_check_local_returning_smp(struct opa_smp 
*smp,
+  struct ib_device *device)
+{
+   /* C14-13:3 -- We're at the end of the DR segment of path */
+   /* C14-13:4 -- Hop Pointer == 0 -> give to SM */
+   return (device->process_mad &&
+   opa_get_smp_direction(smp) &&
+   !smp->hop_ptr) ? IB_SMI_HANDLE : IB_SMI_DISCARD;
+}
+
+#endif /* __OPA_SMI_H_ */
diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 8a5fb1d..a38ccb4 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -5,6 +5,7 @@
  * Copyright (c) 2004, 2005 Topspin Corporation.  All rights reserved.
  * Copyright (c) 2004-2007 Voltaire Corporation.  All rights reserved.
  * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright 

[PATCH v4 19/19] IB/mad: Implement Intel Omni-Path Architecture MAD processing

2015-02-04 Thread ira . weiny
From: Ira Weiny 

For devices which support OPA MADs:

OPA SMP packets must carry a valid pkey; process the wc.pkey_index returned
by agents for the response.

Handle variable length OPA MADs by:

* Adjusting the 'fake' WC for locally routed SMPs to represent the
  proper incoming byte_len
* Using out_mad_size from the local HCA agents:
	1) when sending agent responses on the wire
	2) when passing responses through the local_completions function

NOTE: wc.byte_len includes the GRH length and therefore is different from the
  in_mad_size specified to the local HCA agents.  out_mad_size should _not_
  include the GRH length as it is added by the verbs layer and is not part
  of MAD processing.
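The size relationship in the note can be sketched numerically. The constants below are illustrative stand-ins for sizeof(struct ib_grh) (the 40-byte global routing header) and IB_MGMT_MAD_SIZE:

```c
#include <assert.h>
#include <stddef.h>

#define IB_GRH_BYTES     40	/* stand-in for sizeof(struct ib_grh) */
#define IB_MGMT_MAD_SIZE 256	/* classic IB MAD size */

/* wc.byte_len as seen by a local agent includes the GRH... */
static size_t wc_byte_len(size_t mad_size)
{
	return IB_GRH_BYTES + mad_size;
}

/* ...while out_mad_size covers only the MAD itself; the verbs layer
 * accounts for the GRH separately. */
static size_t out_mad_size_from_wc(size_t byte_len)
{
	return byte_len - IB_GRH_BYTES;
}
```

So a classic 256-byte MAD arrives with wc.byte_len of 296, but agents must report out_mad_size of 256.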

Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/agent.c|  27 +++-
 drivers/infiniband/core/agent.h|   3 +-
 drivers/infiniband/core/mad.c  | 251 ++---
 drivers/infiniband/core/mad_priv.h |   1 +
 drivers/infiniband/core/mad_rmpp.c |  32 +++--
 drivers/infiniband/core/user_mad.c |  35 +++---
 include/rdma/ib_mad.h  |   2 +
 7 files changed, 276 insertions(+), 75 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index b6bd305..18275a5 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -80,13 +80,17 @@ ib_get_agent_port(struct ib_device *device, int port_num)
 
 void agent_send_response(struct ib_mad *mad, struct ib_grh *grh,
 struct ib_wc *wc, struct ib_device *device,
-int port_num, int qpn)
+int port_num, int qpn, u32 resp_mad_len,
+int opa)
 {
struct ib_agent_port_private *port_priv;
struct ib_mad_agent *agent;
struct ib_mad_send_buf *send_buf;
struct ib_ah *ah;
+   size_t data_len;
+   size_t hdr_len;
struct ib_mad_send_wr_private *mad_send_wr;
+   u8 base_version;
 
if (device->node_type == RDMA_NODE_IB_SWITCH)
port_priv = ib_get_agent_port(device, 0);
@@ -106,16 +110,29 @@ void agent_send_response(struct ib_mad *mad, struct 
ib_grh *grh,
return;
}
 
+   /* base version determines MAD size */
+   base_version = mad->mad_hdr.base_version;
+   if (opa && base_version == OPA_MGMT_BASE_VERSION) {
+   data_len = resp_mad_len - JUMBO_MGMT_MAD_HDR;
+   hdr_len = JUMBO_MGMT_MAD_HDR;
+   } else {
+   data_len = IB_MGMT_MAD_DATA;
+   hdr_len = IB_MGMT_MAD_HDR;
+   }
+
send_buf = ib_create_send_mad(agent, wc->src_qp, wc->pkey_index, 0,
- IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
- GFP_KERNEL,
- IB_MGMT_BASE_VERSION);
+ hdr_len, data_len, GFP_KERNEL,
+ base_version);
if (IS_ERR(send_buf)) {
dev_err(&device->dev, "ib_create_send_mad error\n");
goto err1;
}
 
-   memcpy(send_buf->mad, mad, sizeof *mad);
+   if (opa && base_version == OPA_MGMT_BASE_VERSION)
+   memcpy(send_buf->mad, mad, JUMBO_MGMT_MAD_HDR + data_len);
+   else
+   memcpy(send_buf->mad, mad, sizeof(*mad));
+
send_buf->ah = ah;
 
if (device->node_type == RDMA_NODE_IB_SWITCH) {
diff --git a/drivers/infiniband/core/agent.h b/drivers/infiniband/core/agent.h
index 6669287..1dee837 100644
--- a/drivers/infiniband/core/agent.h
+++ b/drivers/infiniband/core/agent.h
@@ -46,6 +46,7 @@ extern int ib_agent_port_close(struct ib_device *device, int 
port_num);
 
 extern void agent_send_response(struct ib_mad *mad, struct ib_grh *grh,
struct ib_wc *wc, struct ib_device *device,
-   int port_num, int qpn);
+   int port_num, int qpn, u32 resp_mad_len,
+   int opa);
 
 #endif /* __AGENT_H_ */
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 5aefe4c..9b7dc36 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2005 Intel Corporation.  All rights reserved.
  * Copyright (c) 2005 Mellanox Technologies Ltd.  All rights reserved.
  * Copyright (c) 2009 HNR Consulting. All rights reserved.
+ * Copyright (c) 2014 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -44,6 +45,7 @@
 #include "mad_priv.h"
 #include "mad_rmpp.h"
 #include "smi.h"
+#include "opa_smi.h"
 #include "agent.h"
 
 MODULE_LICENSE("Dual BSD/GPL");
@@ -733,6 +735,7 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_pri

[PATCH v4 07/19] IB/mad: Convert ib_mad_private allocations from kmem_cache to kmalloc

2015-02-04 Thread ira . weiny
From: Ira Weiny 

Use the new max_mad_size specified by devices for the allocations and DMA maps.

kmalloc is more flexible in supporting devices with different MAD sizes, and
research and testing showed that the current use of kmem_cache provides no
performance benefit over kmalloc.

Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/mad.c | 73 ++-
 1 file changed, 30 insertions(+), 43 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index a6a33cf..cc0a3ad 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -59,8 +59,6 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in 
number of work requests
 module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
 MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work 
requests");
 
-static struct kmem_cache *ib_mad_cache;
-
 static struct list_head ib_mad_port_list;
 static u32 ib_mad_client_id = 0;
 
@@ -717,6 +715,13 @@ static void build_smp_wc(struct ib_qp *qp,
wc->port_num = port_num;
 }
 
+static struct ib_mad_private *alloc_mad_priv(struct ib_device *dev)
+{
+   return (kmalloc(sizeof(struct ib_mad_private_header) +
+   sizeof(struct ib_grh) +
+   dev->cached_dev_attrs.max_mad_size, GFP_ATOMIC));
+}
+
 /*
  * Return 0 if SMP is to be sent
  * Return 1 if SMP was consumed locally (whether or not solicited)
@@ -771,7 +776,8 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_priv,
}
local->mad_priv = NULL;
local->recv_mad_agent = NULL;
-   mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_ATOMIC);
+
+   mad_priv = alloc_mad_priv(mad_agent_priv->agent.device);
if (!mad_priv) {
ret = -ENOMEM;
dev_err(&device->dev, "No memory for local response MAD\n");
@@ -801,10 +807,10 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_priv,
 */
atomic_inc(&mad_agent_priv->refcount);
} else
-   kmem_cache_free(ib_mad_cache, mad_priv);
+   kfree(mad_priv);
break;
case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED:
-   kmem_cache_free(ib_mad_cache, mad_priv);
+   kfree(mad_priv);
break;
case IB_MAD_RESULT_SUCCESS:
/* Treat like an incoming receive MAD */
@@ -820,14 +826,14 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_priv,
 * No receiving agent so drop packet and
 * generate send completion.
 */
-   kmem_cache_free(ib_mad_cache, mad_priv);
+   kfree(mad_priv);
break;
}
local->mad_priv = mad_priv;
local->recv_mad_agent = recv_mad_agent;
break;
default:
-   kmem_cache_free(ib_mad_cache, mad_priv);
+   kfree(mad_priv);
kfree(local);
ret = -EINVAL;
goto out;
@@ -1237,7 +1243,7 @@ void ib_free_recv_mad(struct ib_mad_recv_wc *mad_recv_wc)
recv_wc);
priv = container_of(mad_priv_hdr, struct ib_mad_private,
header);
-   kmem_cache_free(ib_mad_cache, priv);
+   kfree(priv);
}
 }
 EXPORT_SYMBOL(ib_free_recv_mad);
@@ -1924,6 +1930,11 @@ static void ib_mad_complete_recv(struct 
ib_mad_agent_private *mad_agent_priv,
}
 }
 
+static size_t mad_recv_buf_size(struct ib_device *dev)
+{
+   return(sizeof(struct ib_grh) + dev->cached_dev_attrs.max_mad_size);
+}
+
 static bool generate_unmatched_resp(struct ib_mad_private *recv,
struct ib_mad_private *response)
 {
@@ -1964,8 +1975,7 @@ static void ib_mad_recv_done_handler(struct 
ib_mad_port_private *port_priv,
recv = container_of(mad_priv_hdr, struct ib_mad_private, header);
ib_dma_unmap_single(port_priv->device,
recv->header.mapping,
-   sizeof(struct ib_mad_private) -
- sizeof(struct ib_mad_private_header),
+   mad_recv_buf_size(port_priv->device),
DMA_FROM_DEVICE);
 
/* Setup MAD receive work completion from "normal" work completion */
@@ -1982,7 +1992,7 @@ static void ib_mad_recv_done_handler(struct 
ib_mad_port_private *port_priv,
if (!validate_mad(&recv->mad.mad.mad_hdr, qp_info->qp->qp_num))
goto out;
 
-   response = kmem_cache_alloc(ib_mad_cache, GFP_KERNEL);
+   response = alloc_mad_priv(port_priv->device);
if (!response) {
dev_err(&port_priv->devi

[PATCH v4 06/19] IB/core: Add max_mad_size to ib_device_attr

2015-02-04 Thread ira . weiny
From: Ira Weiny 

Change all IB drivers to report the max MAD size.
Add a check to verify that all devices support at least IB_MGMT_MAD_SIZE.

Signed-off-by: Ira Weiny 

---

Changes from V3:
Fix ehca compile found with 0-day build

 drivers/infiniband/core/mad.c| 6 ++
 drivers/infiniband/hw/amso1100/c2_rnic.c | 1 +
 drivers/infiniband/hw/cxgb3/iwch_provider.c  | 1 +
 drivers/infiniband/hw/cxgb4/provider.c   | 1 +
 drivers/infiniband/hw/ehca/ehca_hca.c| 3 +++
 drivers/infiniband/hw/ipath/ipath_verbs.c| 1 +
 drivers/infiniband/hw/mlx4/main.c| 1 +
 drivers/infiniband/hw/mlx5/main.c| 1 +
 drivers/infiniband/hw/mthca/mthca_provider.c | 2 ++
 drivers/infiniband/hw/nes/nes_verbs.c| 1 +
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  | 1 +
 drivers/infiniband/hw/qib/qib_verbs.c| 1 +
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 ++
 include/rdma/ib_mad.h| 1 +
 include/rdma/ib_verbs.h  | 1 +
 15 files changed, 24 insertions(+)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 819b794..a6a33cf 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2924,6 +2924,12 @@ static int ib_mad_port_open(struct ib_device *device,
char name[sizeof "ib_mad123"];
int has_smi;
 
+   if (device->cached_dev_attrs.max_mad_size < IB_MGMT_MAD_SIZE) {
+   dev_err(&device->dev, "Min MAD size for device is %u\n",
+   IB_MGMT_MAD_SIZE);
+   return -EFAULT;
+   }
+
/* Create new device info */
port_priv = kzalloc(sizeof *port_priv, GFP_KERNEL);
if (!port_priv) {
diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c 
b/drivers/infiniband/hw/amso1100/c2_rnic.c
index d2a6d96..63322c0 100644
--- a/drivers/infiniband/hw/amso1100/c2_rnic.c
+++ b/drivers/infiniband/hw/amso1100/c2_rnic.c
@@ -197,6 +197,7 @@ static int c2_rnic_query(struct c2_dev *c2dev, struct 
ib_device_attr *props)
props->max_srq_sge = 0;
props->max_pkeys   = 0;
props->local_ca_ack_delay  = 0;
+   props->max_mad_size= IB_MGMT_MAD_SIZE;
 
  bail2:
vq_repbuf_free(c2dev, reply);
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 811b24a..b8a80aa0 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1174,6 +1174,7 @@ static int iwch_query_device(struct ib_device *ibdev,
props->max_pd = dev->attr.max_pds;
props->local_ca_ack_delay = 0;
props->max_fast_reg_page_list_len = T3_MAX_FASTREG_DEPTH;
+   props->max_mad_size = IB_MGMT_MAD_SIZE;
 
return 0;
 }
diff --git a/drivers/infiniband/hw/cxgb4/provider.c 
b/drivers/infiniband/hw/cxgb4/provider.c
index 66bd6a2..299c70c 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -332,6 +332,7 @@ static int c4iw_query_device(struct ib_device *ibdev,
props->max_pd = T4_MAX_NUM_PD;
props->local_ca_ack_delay = 0;
props->max_fast_reg_page_list_len = t4_max_fr_depth(use_dsgl);
+   props->max_mad_size = IB_MGMT_MAD_SIZE;
 
return 0;
 }
diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c 
b/drivers/infiniband/hw/ehca/ehca_hca.c
index 9ed4d25..6166146 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -40,6 +40,7 @@
  */
 
 #include 
+#include 
 
 #include "ehca_tools.h"
 #include "ehca_iverbs.h"
@@ -133,6 +134,8 @@ int ehca_query_device(struct ib_device *ibdev, struct 
ib_device_attr *props)
if (rblock->hca_cap_indicators & cap_mapping[i + 1])
props->device_cap_flags |= cap_mapping[i];
 
+   props->max_mad_size = IB_MGMT_MAD_SIZE;
+
 query_device1:
ehca_free_fw_ctrlblock(rblock);
 
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c 
b/drivers/infiniband/hw/ipath/ipath_verbs.c
index 44ea939..4c6474c 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -1538,6 +1538,7 @@ static int ipath_query_device(struct ib_device *ibdev,
props->max_mcast_qp_attach = ib_ipath_max_mcast_qp_attached;
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
props->max_mcast_grp;
+   props->max_mad_size = IB_MGMT_MAD_SIZE;
 
return 0;
 }
diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 57ecc5b..88326a7 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -229,6 +229,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
   props->max_mcast_grp;
props->max_map_per_fmr = dev->dev->caps.max_fmr_maps;
+

[PATCH v4 00/19] IB/mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.

2015-02-04 Thread ira . weiny
From: Ira Weiny 

The following patch series modifies the kernel MAD processing (ib_mad/ib_umad)
and related interfaces to send and receive Intel Omni-Path Architecture MADs on
devices which support them.

In addition to supporting some IBTA management classes, OPA devices use MADs
with lengths up to 2K.  These "jumbo" MADs increase the performance of
management traffic.

To distinguish IBTA MADs from OPA MADs a new Base Version is introduced.  The
new format shares the same common header with IBTA MADs which allows us to
share most of the MAD processing code when dealing with the new Base Version.
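The shared-header dispatch this describes can be sketched as follows. The struct is a simplified stand-in for the common MAD header (base_version is its first byte), and the 0x80 base version and 2K size reflect what this series introduces for OPA; treat both as illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IB_MGMT_BASE_VERSION  1
#define OPA_MGMT_BASE_VERSION 0x80	/* value used by this series */

/* Simplified common MAD header: base_version comes first, so a receiver
 * can size the MAD before parsing anything format-specific. */
struct mad_hdr {
	uint8_t base_version;
	uint8_t mgmt_class;
	uint8_t class_version;
	uint8_t method;
	/* ... remaining common header fields ... */
};

static size_t mad_size(const struct mad_hdr *hdr)
{
	return hdr->base_version == OPA_MGMT_BASE_VERSION ? 2048 : 256;
}
```

Because the common header is identical up front, the same validation and routing code can run on both formats and only the size/payload handling needs to branch.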


The patch series is broken into 4 main areas.

1) Add the ability for devices to indicate MAD size, and modify the MAD
   code to use this size.

2) Enhance the interface to the device agents to support larger and variable
   length MADs.

3) Add capability bit to indicate support for OPA MADs

4) Add support for creating and processing OPA MADs


Changes for V4:

Rebased to Roland's latest for-next branch (3.19-rc4).
Fixed compile issue in ehca driver found with 0-day build.


Ira Weiny (19):
  IB/mad: Rename is_data_mad to is_rmpp_data_mad
  IB/core: Cache device attributes for use by upper level drivers
  IB/mad: Change validate_mad signature to take ib_mad_hdr rather than
ib_mad
  IB/mad: Change ib_response_mad signature to take ib_mad_hdr rather
than ib_mad
  IB/mad: Change cast in rcv_has_same_class
  IB/core: Add max_mad_size to ib_device_attr
  IB/mad: Convert ib_mad_private allocations from kmem_cache to kmalloc
  IB/mad: Add helper function for smi_handle_dr_smp_send
  IB/mad: Add helper function for smi_handle_dr_smp_recv
  IB/mad: Add helper function for smi_check_forward_dr_smp
  IB/mad: Add helper function for SMI processing
  IB/mad: Add MAD size parameters to process_mad
  IB/mad: Add base version parameter to ib_create_send_mad
  IB/core: Add IB_DEVICE_OPA_MAD_SUPPORT device cap flag
  IB/mad: Create jumbo_mad data structures
  IB/mad: Add Intel Omni-Path Architecture defines
  IB/mad: Implement support for Intel Omni-Path Architecture base
version MADs in ib_create_send_mad
  IB/mad: Implement Intel Omni-Path Architecture SMP processing
  IB/mad: Implement Intel Omni-Path Architecture MAD processing

 drivers/infiniband/core/agent.c  |  26 +-
 drivers/infiniband/core/agent.h  |   3 +-
 drivers/infiniband/core/cm.c |   6 +-
 drivers/infiniband/core/device.c |   2 +
 drivers/infiniband/core/mad.c| 519 ++-
 drivers/infiniband/core/mad_priv.h   |   7 +-
 drivers/infiniband/core/mad_rmpp.c   | 144 
 drivers/infiniband/core/opa_smi.h|  78 
 drivers/infiniband/core/sa_query.c   |   3 +-
 drivers/infiniband/core/smi.c| 231 
 drivers/infiniband/core/smi.h|   6 +
 drivers/infiniband/core/sysfs.c  |   5 +-
 drivers/infiniband/core/user_mad.c   |  38 +-
 drivers/infiniband/hw/amso1100/c2_provider.c |   5 +-
 drivers/infiniband/hw/amso1100/c2_rnic.c |   1 +
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |   6 +-
 drivers/infiniband/hw/cxgb4/provider.c   |   8 +-
 drivers/infiniband/hw/ehca/ehca_hca.c|   3 +
 drivers/infiniband/hw/ehca/ehca_iverbs.h |   4 +-
 drivers/infiniband/hw/ehca/ehca_sqp.c|   8 +-
 drivers/infiniband/hw/ipath/ipath_mad.c  |   8 +-
 drivers/infiniband/hw/ipath/ipath_verbs.c|   1 +
 drivers/infiniband/hw/ipath/ipath_verbs.h|   3 +-
 drivers/infiniband/hw/mlx4/mad.c |  12 +-
 drivers/infiniband/hw/mlx4/main.c|   1 +
 drivers/infiniband/hw/mlx4/mlx4_ib.h |   3 +-
 drivers/infiniband/hw/mlx5/mad.c |   8 +-
 drivers/infiniband/hw/mlx5/main.c|   1 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   3 +-
 drivers/infiniband/hw/mthca/mthca_dev.h  |   4 +-
 drivers/infiniband/hw/mthca/mthca_mad.c  |  12 +-
 drivers/infiniband/hw/mthca/mthca_provider.c |   2 +
 drivers/infiniband/hw/nes/nes_verbs.c|   4 +-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c |   3 +-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.h |   3 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  |   1 +
 drivers/infiniband/hw/qib/qib_iba7322.c  |   3 +-
 drivers/infiniband/hw/qib/qib_mad.c  |  11 +-
 drivers/infiniband/hw/qib/qib_verbs.c|   1 +
 drivers/infiniband/hw/qib/qib_verbs.h|   3 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c |   2 +
 drivers/infiniband/ulp/srpt/ib_srpt.c|   3 +-
 include/rdma/ib_mad.h|  40 ++-
 include/rdma/ib_verbs.h  |  15 +-
 include/rdma/opa_smi.h   | 106 ++
 45 files changed, 999 insertions(+), 357 deletions(-)
 create mode 100644 drivers/infiniband/core/opa_smi.h
 create mode 100644 include/rdma/opa_smi.h

-- 
1.8.2


[PATCH v4 15/19] IB/mad: Create jumbo_mad data structures

2015-02-04 Thread ira . weiny
From: Ira Weiny 

Define jumbo_mad and jumbo_rmpp_mad.

Jumbo MAD structures are 2K versions of ib_mad and ib_rmpp_mad structures.
Currently only OPA base version MADs are of this type.

Create an RMPP base header shared between ib_rmpp_mad and jumbo_rmpp_mad.

Update existing code to use the new structures.
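The layout pattern can be sketched with simplified stand-ins. The real ib_mad_hdr and ib_rmpp_hdr are 24 and 12 bytes; byte arrays stand in for them here, since only the shared-base pattern matters:

```c
#include <assert.h>
#include <stdint.h>

/* Shared base: common MAD header plus RMPP header (simplified layout). */
struct rmpp_base {
	uint8_t mad_hdr[24];	/* stand-in for struct ib_mad_hdr */
	uint8_t rmpp_hdr[12];	/* stand-in for struct ib_rmpp_hdr */
} __attribute__((packed));

/* Classic IB RMPP MAD: 256 bytes total. */
struct ib_rmpp_mad_x {
	struct rmpp_base base;
	uint8_t data[256 - sizeof(struct rmpp_base)];
} __attribute__((packed));

/* Jumbo (OPA) RMPP MAD: 2048 bytes total, same base. */
struct jumbo_rmpp_mad_x {
	struct rmpp_base base;
	uint8_t data[2048 - sizeof(struct rmpp_base)];
} __attribute__((packed));
```

Code that only touches the base (as the converted is_rmpp_data_mad() and format_ack() do) can take a struct rmpp_base pointer and work on either MAD size unchanged.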

Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/mad.c  |  18 +++---
 drivers/infiniband/core/mad_priv.h |   2 +
 drivers/infiniband/core/mad_rmpp.c | 120 ++---
 drivers/infiniband/core/user_mad.c |  16 ++---
 include/rdma/ib_mad.h  |  26 +++-
 5 files changed, 103 insertions(+), 79 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 2145294..316b4b2 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -883,7 +883,7 @@ static int alloc_send_rmpp_list(struct 
ib_mad_send_wr_private *send_wr,
gfp_t gfp_mask)
 {
struct ib_mad_send_buf *send_buf = &send_wr->send_buf;
-   struct ib_rmpp_mad *rmpp_mad = send_buf->mad;
+   struct ib_rmpp_base *rmpp_base = send_buf->mad;
struct ib_rmpp_segment *seg = NULL;
int left, seg_size, pad;
 
@@ -909,10 +909,10 @@ static int alloc_send_rmpp_list(struct 
ib_mad_send_wr_private *send_wr,
if (pad)
memset(seg->data + seg_size - pad, 0, pad);
 
-   rmpp_mad->rmpp_hdr.rmpp_version = send_wr->mad_agent_priv->
+   rmpp_base->rmpp_hdr.rmpp_version = send_wr->mad_agent_priv->
  agent.rmpp_version;
-   rmpp_mad->rmpp_hdr.rmpp_type = IB_MGMT_RMPP_TYPE_DATA;
-   ib_set_rmpp_flags(&rmpp_mad->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
+   rmpp_base->rmpp_hdr.rmpp_type = IB_MGMT_RMPP_TYPE_DATA;
+   ib_set_rmpp_flags(&rmpp_base->rmpp_hdr, IB_MGMT_RMPP_FLAG_ACTIVE);
 
send_wr->cur_seg = container_of(send_wr->rmpp_list.next,
struct ib_rmpp_segment, list);
@@ -1748,14 +1748,14 @@ out:
 static int is_rmpp_data_mad(struct ib_mad_agent_private *mad_agent_priv,
   struct ib_mad_hdr *mad_hdr)
 {
-   struct ib_rmpp_mad *rmpp_mad;
+   struct ib_rmpp_base *rmpp_base;
 
-   rmpp_mad = (struct ib_rmpp_mad *)mad_hdr;
+   rmpp_base = (struct ib_rmpp_base *)mad_hdr;
return !mad_agent_priv->agent.rmpp_version ||
!ib_mad_kernel_rmpp_agent(&mad_agent_priv->agent) ||
-   !(ib_get_rmpp_flags(&rmpp_mad->rmpp_hdr) &
+   !(ib_get_rmpp_flags(&rmpp_base->rmpp_hdr) &
IB_MGMT_RMPP_FLAG_ACTIVE) ||
-   (rmpp_mad->rmpp_hdr.rmpp_type == IB_MGMT_RMPP_TYPE_DATA);
+   (rmpp_base->rmpp_hdr.rmpp_type == IB_MGMT_RMPP_TYPE_DATA);
 }
 
 static inline int rcv_has_same_class(struct ib_mad_send_wr_private *wr,
@@ -1897,7 +1897,7 @@ static void ib_mad_complete_recv(struct 
ib_mad_agent_private *mad_agent_priv,
spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
if (!ib_mad_kernel_rmpp_agent(&mad_agent_priv->agent)
   && 
ib_is_mad_class_rmpp(mad_recv_wc->recv_buf.mad->mad_hdr.mgmt_class)
-  && (ib_get_rmpp_flags(&((struct ib_rmpp_mad 
*)mad_recv_wc->recv_buf.mad)->rmpp_hdr)
+  && (ib_get_rmpp_flags(&((struct ib_rmpp_base 
*)mad_recv_wc->recv_buf.mad)->rmpp_hdr)
& IB_MGMT_RMPP_FLAG_ACTIVE)) {
/* user rmpp is in effect
 * and this is an active RMPP MAD
diff --git a/drivers/infiniband/core/mad_priv.h 
b/drivers/infiniband/core/mad_priv.h
index d1a0b0e..d71ddcc 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -80,6 +80,8 @@ struct ib_mad_private {
struct ib_mad mad;
struct ib_rmpp_mad rmpp_mad;
struct ib_smp smp;
+   struct jumbo_mad jumbo_mad;
+   struct jumbo_rmpp_mad jumbo_rmpp_mad;
} mad;
 } __attribute__ ((packed));
 
diff --git a/drivers/infiniband/core/mad_rmpp.c 
b/drivers/infiniband/core/mad_rmpp.c
index 2379e2d..7184530 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -111,10 +111,10 @@ void ib_cancel_rmpp_recvs(struct ib_mad_agent_private 
*agent)
 }
 
 static void format_ack(struct ib_mad_send_buf *msg,
-  struct ib_rmpp_mad *data,
+  struct ib_rmpp_base *data,
   struct mad_rmpp_recv *rmpp_recv)
 {
-   struct ib_rmpp_mad *ack = msg->mad;
+   struct ib_rmpp_base *ack = msg->mad;
unsigned long flags;
 
memcpy(ack, &data->mad_hdr, msg->hdr_len);
@@ -144,7 +144,7 @@ static void ack_recv(struct mad_rmpp_recv *rmpp_recv,
if (IS_ERR(msg))
return;
 
-   format_ack(msg, (st

[PATCH v4 11/19] IB/mad: Add helper function for SMI processing

2015-02-04 Thread ira . weiny
From: Ira Weiny 

This helper function will be used for processing both IB and OPA SMPs.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/mad.c | 85 +--
 1 file changed, 49 insertions(+), 36 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index cc0a3ad..2ffeace 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1930,6 +1930,52 @@ static void ib_mad_complete_recv(struct 
ib_mad_agent_private *mad_agent_priv,
}
 }
 
+enum smi_action handle_ib_smi(struct ib_mad_port_private *port_priv,
+ struct ib_mad_qp_info *qp_info,
+ struct ib_wc *wc,
+ int port_num,
+ struct ib_mad_private *recv,
+ struct ib_mad_private *response)
+{
+   enum smi_forward_action retsmi;
+
+   if (smi_handle_dr_smp_recv(&recv->mad.smp,
+  port_priv->device->node_type,
+  port_num,
+  port_priv->device->phys_port_cnt) ==
+  IB_SMI_DISCARD)
+   return IB_SMI_DISCARD;
+
+   retsmi = smi_check_forward_dr_smp(&recv->mad.smp);
+   if (retsmi == IB_SMI_LOCAL)
+   return IB_SMI_HANDLE;
+
+   if (retsmi == IB_SMI_SEND) { /* don't forward */
+   if (smi_handle_dr_smp_send(&recv->mad.smp,
+  port_priv->device->node_type,
+  port_num) == IB_SMI_DISCARD)
+   return IB_SMI_DISCARD;
+
+   if (smi_check_local_smp(&recv->mad.smp, port_priv->device) == 
IB_SMI_DISCARD)
+   return IB_SMI_DISCARD;
+   } else if (port_priv->device->node_type == RDMA_NODE_IB_SWITCH) {
+   /* forward case for switches */
+   memcpy(response, recv, sizeof(*response));
+   response->header.recv_wc.wc = &response->header.wc;
+   response->header.recv_wc.recv_buf.mad = &response->mad.mad;
+   response->header.recv_wc.recv_buf.grh = &response->grh;
+
+   agent_send_response(&response->mad.mad,
+   &response->grh, wc,
+   port_priv->device,
+   smi_get_fwd_port(&recv->mad.smp),
+   qp_info->qp->qp_num);
+
+   return IB_SMI_DISCARD;
+   }
+   return IB_SMI_HANDLE;
+}
+
 static size_t mad_recv_buf_size(struct ib_device *dev)
 {
return(sizeof(struct ib_grh) + dev->cached_dev_attrs.max_mad_size);
@@ -2006,45 +2052,12 @@ static void ib_mad_recv_done_handler(struct 
ib_mad_port_private *port_priv,
 
if (recv->mad.mad.mad_hdr.mgmt_class ==
IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) {
-   enum smi_forward_action retsmi;
-
-   if (smi_handle_dr_smp_recv(&recv->mad.smp,
-  port_priv->device->node_type,
-  port_num,
-  port_priv->device->phys_port_cnt) ==
-  IB_SMI_DISCARD)
+   if (handle_ib_smi(port_priv, qp_info, wc, port_num, recv,
+ response)
+   == IB_SMI_DISCARD)
goto out;
-
-   retsmi = smi_check_forward_dr_smp(&recv->mad.smp);
-   if (retsmi == IB_SMI_LOCAL)
-   goto local;
-
-   if (retsmi == IB_SMI_SEND) { /* don't forward */
-   if (smi_handle_dr_smp_send(&recv->mad.smp,
-  port_priv->device->node_type,
-  port_num) == IB_SMI_DISCARD)
-   goto out;
-
-   if (smi_check_local_smp(&recv->mad.smp, 
port_priv->device) == IB_SMI_DISCARD)
-   goto out;
-   } else if (port_priv->device->node_type == RDMA_NODE_IB_SWITCH) 
{
-   /* forward case for switches */
-   memcpy(response, recv, sizeof(*response));
-   response->header.recv_wc.wc = &response->header.wc;
-   response->header.recv_wc.recv_buf.mad = 
&response->mad.mad;
-   response->header.recv_wc.recv_buf.grh = &response->grh;
-
-   agent_send_response(&response->mad.mad,
-   &response->grh, wc,
-   port_priv->device,
-   smi_get_fwd_port(&recv->mad.smp),
-   qp_info->qp->qp_num);
-
-   goto out;
-   }

[PATCH v4 14/19] IB/core: Add IB_DEVICE_OPA_MAD_SUPPORT device cap flag

2015-02-04 Thread ira . weiny
From: Ira Weiny 

OPA MADs share a common header with IBTA MADs but with a different base version
and an extended length.  These "jumbo" MADs increase the performance of
management traffic.

Sharing a common header with IBTA MADs allows us to share most of the MAD
processing code when dealing with OPA MADs in addition to supporting some IBTA
MADs on OPA devices.

Add a device capability flag to indicate OPA MAD support on the device.

Signed-off-by: Ira Weiny 

---
 include/rdma/ib_verbs.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 3ab4033..2614233 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -128,6 +128,10 @@ enum ib_device_cap_flags {
IB_DEVICE_ON_DEMAND_PAGING  = (1<<31),
 };
 
+enum ib_device_cap_flags2 {
+   IB_DEVICE_OPA_MAD_SUPPORT   = 1
+};
+
 enum ib_signature_prot_cap {
IB_PROT_T10DIF_TYPE_1 = 1,
IB_PROT_T10DIF_TYPE_2 = 1 << 1,
@@ -210,6 +214,7 @@ struct ib_device_attr {
int sig_prot_cap;
int sig_guard_cap;
struct ib_odp_caps  odp_caps;
+   u64 device_cap_flags2;
u32 max_mad_size;
 };
 
-- 
1.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 08/19] IB/mad: Add helper function for smi_handle_dr_smp_send

2015-02-04 Thread ira . weiny
From: Ira Weiny 

This helper function will be used for processing both IB and OPA SMP sends.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/smi.c | 81 +--
 1 file changed, 47 insertions(+), 34 deletions(-)

diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 5855e44..3bac6e6 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -39,84 +39,81 @@
 #include 
 #include "smi.h"
 
-/*
- * Fixup a directed route SMP for sending
- * Return 0 if the SMP should be discarded
- */
-enum smi_action smi_handle_dr_smp_send(struct ib_smp *smp,
-  u8 node_type, int port_num)
+static inline
+enum smi_action __smi_handle_dr_smp_send(u8 node_type, int port_num,
+u8 *hop_ptr, u8 hop_cnt,
+u8 *initial_path,
+u8 *return_path,
+u8 direction,
+int dr_dlid_is_permissive,
+int dr_slid_is_permissive)
 {
-   u8 hop_ptr, hop_cnt;
-
-   hop_ptr = smp->hop_ptr;
-   hop_cnt = smp->hop_cnt;
-
/* See section 14.2.2.2, Vol 1 IB spec */
/* C14-6 -- valid hop_cnt values are from 0 to 63 */
if (hop_cnt >= IB_SMP_MAX_PATH_HOPS)
return IB_SMI_DISCARD;
 
-   if (!ib_get_smp_direction(smp)) {
+   if (!direction) {
/* C14-9:1 */
-   if (hop_cnt && hop_ptr == 0) {
-   smp->hop_ptr++;
-   return (smp->initial_path[smp->hop_ptr] ==
+   if (hop_cnt && *hop_ptr == 0) {
+   (*hop_ptr)++;
+   return (initial_path[*hop_ptr] ==
port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-9:2 */
-   if (hop_ptr && hop_ptr < hop_cnt) {
+   if (*hop_ptr && *hop_ptr < hop_cnt) {
if (node_type != RDMA_NODE_IB_SWITCH)
return IB_SMI_DISCARD;
 
-   /* smp->return_path set when received */
-   smp->hop_ptr++;
-   return (smp->initial_path[smp->hop_ptr] ==
+   /* return_path set when received */
+   (*hop_ptr)++;
+   return (initial_path[*hop_ptr] ==
port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-9:3 -- We're at the end of the DR segment of path */
-   if (hop_ptr == hop_cnt) {
-   /* smp->return_path set when received */
-   smp->hop_ptr++;
+   if (*hop_ptr == hop_cnt) {
+   /* return_path set when received */
+   (*hop_ptr)++;
return (node_type == RDMA_NODE_IB_SWITCH ||
-   smp->dr_dlid == IB_LID_PERMISSIVE ?
+   dr_dlid_is_permissive ?
IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */
/* C14-9:5 -- Fail unreasonable hop pointer */
-   return (hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : 
IB_SMI_DISCARD);
+   return (*hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : 
IB_SMI_DISCARD);
 
} else {
/* C14-13:1 */
-   if (hop_cnt && hop_ptr == hop_cnt + 1) {
-   smp->hop_ptr--;
-   return (smp->return_path[smp->hop_ptr] ==
+   if (hop_cnt && *hop_ptr == hop_cnt + 1) {
+   (*hop_ptr)--;
+   return (return_path[*hop_ptr] ==
port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-13:2 */
-   if (2 <= hop_ptr && hop_ptr <= hop_cnt) {
+   if (2 <= *hop_ptr && *hop_ptr <= hop_cnt) {
if (node_type != RDMA_NODE_IB_SWITCH)
return IB_SMI_DISCARD;
 
-   smp->hop_ptr--;
-   return (smp->return_path[smp->hop_ptr] ==
+   (*hop_ptr)--;
+   return (return_path[*hop_ptr] ==
port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-13:3 -- at the end of the DR segment of path */
-   if (hop_ptr == 1) {
-   smp->hop_ptr--;
+   if (*hop_ptr == 1) {
+   (*hop_ptr)--;
/* C14-13:3 -- SMPs destined for SM shouldn't be here */
return (node_type == RDMA_NODE_IB_SWITCH ||
- 

[PATCH v4 16/19] IB/mad: Add Intel Omni-Path Architecture defines

2015-02-04 Thread ira . weiny
From: Ira Weiny 

OPA_SMP_CLASS_VERSION -- Defined at 0x80
OPA_MGMT_BASE_VERSION -- Defined at 0x80

Increase max management version to accommodate OPA

Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/mad_priv.h | 2 +-
 include/rdma/ib_mad.h  | 5 -
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/mad_priv.h 
b/drivers/infiniband/core/mad_priv.h
index d71ddcc..10df80d 100644
--- a/drivers/infiniband/core/mad_priv.h
+++ b/drivers/infiniband/core/mad_priv.h
@@ -56,7 +56,7 @@
 
 /* Registration table sizes */
 #define MAX_MGMT_CLASS 80
-#define MAX_MGMT_VERSION   8
+#define MAX_MGMT_VERSION   0x83
 #define MAX_MGMT_OUI   8
 #define MAX_MGMT_VENDOR_RANGE2 (IB_MGMT_CLASS_VENDOR_RANGE2_END - \
IB_MGMT_CLASS_VENDOR_RANGE2_START + 1)
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 80e7cf4..8938f1e 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -42,8 +42,11 @@
 #include 
 #include 
 
-/* Management base version */
+/* Management base versions */
 #define IB_MGMT_BASE_VERSION   1
+#define OPA_MGMT_BASE_VERSION  0x80
+
+#define OPA_SMP_CLASS_VERSION  0x80
 
 /* Management classes */
 #define IB_MGMT_CLASS_SUBN_LID_ROUTED  0x01
-- 
1.8.2



[PATCH v4 01/19] IB/mad: Rename is_data_mad to is_rmpp_data_mad

2015-02-04 Thread ira . weiny
From: Ira Weiny 

is_rmpp_data_mad is more descriptive for this function.

Reviewed-by: Sean Hefty 
Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/mad.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..4673262 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1734,7 +1734,7 @@ out:
return valid;
 }
 
-static int is_data_mad(struct ib_mad_agent_private *mad_agent_priv,
+static int is_rmpp_data_mad(struct ib_mad_agent_private *mad_agent_priv,
   struct ib_mad_hdr *mad_hdr)
 {
struct ib_rmpp_mad *rmpp_mad;
@@ -1836,7 +1836,7 @@ ib_find_send_mad(struct ib_mad_agent_private 
*mad_agent_priv,
 * been notified that the send has completed
 */
list_for_each_entry(wr, &mad_agent_priv->send_list, agent_list) {
-   if (is_data_mad(mad_agent_priv, wr->send_buf.mad) &&
+   if (is_rmpp_data_mad(mad_agent_priv, wr->send_buf.mad) &&
wr->tid == mad->mad_hdr.tid &&
wr->timeout &&
rcv_has_same_class(wr, wc) &&
@@ -2411,7 +2411,8 @@ find_send_wr(struct ib_mad_agent_private *mad_agent_priv,
 
list_for_each_entry(mad_send_wr, &mad_agent_priv->send_list,
agent_list) {
-   if (is_data_mad(mad_agent_priv, mad_send_wr->send_buf.mad) &&
+   if (is_rmpp_data_mad(mad_agent_priv,
+mad_send_wr->send_buf.mad) &&
&mad_send_wr->send_buf == send_buf)
return mad_send_wr;
}
-- 
1.8.2



[PATCH v4 10/19] IB/mad: Add helper function for smi_check_forward_dr_smp

2015-02-04 Thread ira . weiny
From: Ira Weiny 

This helper function will be used for processing both IB and OPA SMPs.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/smi.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 24670de..8a5fb1d 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -236,21 +236,20 @@ enum smi_action smi_handle_dr_smp_recv(struct ib_smp 
*smp, u8 node_type,
smp->dr_slid == IB_LID_PERMISSIVE);
 }
 
-enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
+static inline
+enum smi_forward_action __smi_check_forward_dr_smp(u8 hop_ptr, u8 hop_cnt,
+  u8 direction,
+  int dr_dlid_is_permissive,
+  int dr_slid_is_permissive)
 {
-   u8 hop_ptr, hop_cnt;
-
-   hop_ptr = smp->hop_ptr;
-   hop_cnt = smp->hop_cnt;
-
-   if (!ib_get_smp_direction(smp)) {
+   if (!direction) {
/* C14-9:2 -- intermediate hop */
if (hop_ptr && hop_ptr < hop_cnt)
return IB_SMI_FORWARD;
 
/* C14-9:3 -- at the end of the DR segment of path */
if (hop_ptr == hop_cnt)
-   return (smp->dr_dlid == IB_LID_PERMISSIVE ?
+   return (dr_dlid_is_permissive ?
IB_SMI_SEND : IB_SMI_LOCAL);
 
/* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */
@@ -263,10 +262,19 @@ enum smi_forward_action smi_check_forward_dr_smp(struct 
ib_smp *smp)
 
/* C14-13:3 -- at the end of the DR segment of path */
if (hop_ptr == 1)
-   return (smp->dr_slid != IB_LID_PERMISSIVE ?
+   return (dr_slid_is_permissive ?
IB_SMI_SEND : IB_SMI_LOCAL);
}
return IB_SMI_LOCAL;
+
+}
+
+enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
+{
+   return __smi_check_forward_dr_smp(smp->hop_ptr, smp->hop_cnt,
+ ib_get_smp_direction(smp),
+ smp->dr_dlid == IB_LID_PERMISSIVE,
+ smp->dr_slid != IB_LID_PERMISSIVE);
 }
 
 /*
-- 
1.8.2



[PATCH v4 17/19] IB/mad: Implement support for Intel Omni-Path Architecture base version MADs in ib_create_send_mad

2015-02-04 Thread ira . weiny
From: Ira Weiny 

If the device supports OPA MADs, process the OPA base version.
Set MAD size and sg lengths as appropriate.
Split RMPP MADs as appropriate.


Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/mad.c | 38 --
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 316b4b2..5aefe4c 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -857,11 +857,11 @@ out:
return ret;
 }
 
-static int get_pad_size(int hdr_len, int data_len)
+static int get_pad_size(int hdr_len, int data_len, size_t mad_size)
 {
int seg_size, pad;
 
-   seg_size = sizeof(struct ib_mad) - hdr_len;
+   seg_size = mad_size - hdr_len;
if (data_len && seg_size) {
pad = seg_size - data_len % seg_size;
return pad == seg_size ? 0 : pad;
@@ -880,14 +880,14 @@ static void free_send_rmpp_list(struct 
ib_mad_send_wr_private *mad_send_wr)
 }
 
 static int alloc_send_rmpp_list(struct ib_mad_send_wr_private *send_wr,
-   gfp_t gfp_mask)
+   size_t mad_size, gfp_t gfp_mask)
 {
struct ib_mad_send_buf *send_buf = &send_wr->send_buf;
struct ib_rmpp_base *rmpp_base = send_buf->mad;
struct ib_rmpp_segment *seg = NULL;
int left, seg_size, pad;
 
-   send_buf->seg_size = sizeof (struct ib_mad) - send_buf->hdr_len;
+   send_buf->seg_size = mad_size - send_buf->hdr_len;
seg_size = send_buf->seg_size;
pad = send_wr->pad;
 
@@ -937,20 +937,31 @@ struct ib_mad_send_buf * ib_create_send_mad(struct 
ib_mad_agent *mad_agent,
struct ib_mad_send_wr_private *mad_send_wr;
int pad, message_size, ret, size;
void *buf;
+   size_t mad_size;
+   int opa;
 
mad_agent_priv = container_of(mad_agent, struct ib_mad_agent_private,
  agent);
-   pad = get_pad_size(hdr_len, data_len);
+
+   opa = mad_agent_priv->agent.device->cached_dev_attrs.device_cap_flags2 &
+ IB_DEVICE_OPA_MAD_SUPPORT;
+
+   if (opa && base_version == OPA_MGMT_BASE_VERSION)
+   mad_size = sizeof(struct jumbo_mad);
+   else
+   mad_size = sizeof(struct ib_mad);
+
+   pad = get_pad_size(hdr_len, data_len, mad_size);
message_size = hdr_len + data_len + pad;
 
if (ib_mad_kernel_rmpp_agent(mad_agent)) {
-   if (!rmpp_active && message_size > sizeof(struct ib_mad))
+   if (!rmpp_active && message_size > mad_size)
return ERR_PTR(-EINVAL);
} else
-   if (rmpp_active || message_size > sizeof(struct ib_mad))
+   if (rmpp_active || message_size > mad_size)
return ERR_PTR(-EINVAL);
 
-   size = rmpp_active ? hdr_len : sizeof(struct ib_mad);
+   size = rmpp_active ? hdr_len : mad_size;
buf = kzalloc(sizeof *mad_send_wr + size, gfp_mask);
if (!buf)
return ERR_PTR(-ENOMEM);
@@ -965,7 +976,14 @@ struct ib_mad_send_buf * ib_create_send_mad(struct 
ib_mad_agent *mad_agent,
mad_send_wr->mad_agent_priv = mad_agent_priv;
mad_send_wr->sg_list[0].length = hdr_len;
mad_send_wr->sg_list[0].lkey = mad_agent->mr->lkey;
-   mad_send_wr->sg_list[1].length = sizeof(struct ib_mad) - hdr_len;
+
+   /* OPA MADs don't have to be the full 2048 bytes */
+   if (opa && base_version == OPA_MGMT_BASE_VERSION &&
+   data_len < mad_size - hdr_len)
+   mad_send_wr->sg_list[1].length = data_len;
+   else
+   mad_send_wr->sg_list[1].length = mad_size - hdr_len;
+
mad_send_wr->sg_list[1].lkey = mad_agent->mr->lkey;
 
mad_send_wr->send_wr.wr_id = (unsigned long) mad_send_wr;
@@ -978,7 +996,7 @@ struct ib_mad_send_buf * ib_create_send_mad(struct 
ib_mad_agent *mad_agent,
mad_send_wr->send_wr.wr.ud.pkey_index = pkey_index;
 
if (rmpp_active) {
-   ret = alloc_send_rmpp_list(mad_send_wr, gfp_mask);
+   ret = alloc_send_rmpp_list(mad_send_wr, mad_size, gfp_mask);
if (ret) {
kfree(buf);
return ERR_PTR(ret);
-- 
1.8.2



[PATCH v4 13/19] IB/mad: Add base version parameter to ib_create_send_mad

2015-02-04 Thread ira . weiny
From: Ira Weiny 

In preparation to support the new OPA MAD Base version, add a base version
parameter to ib_create_send_mad and set it to IB_MGMT_BASE_VERSION for current
users.

Definition of the new base version and its processing will occur in later
patches.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/agent.c | 3 ++-
 drivers/infiniband/core/cm.c| 6 --
 drivers/infiniband/core/mad.c   | 3 ++-
 drivers/infiniband/core/mad_rmpp.c  | 6 --
 drivers/infiniband/core/sa_query.c  | 3 ++-
 drivers/infiniband/core/user_mad.c  | 3 ++-
 drivers/infiniband/hw/mlx4/mad.c| 3 ++-
 drivers/infiniband/hw/mthca/mthca_mad.c | 3 ++-
 drivers/infiniband/hw/qib/qib_iba7322.c | 3 ++-
 drivers/infiniband/hw/qib/qib_mad.c | 3 ++-
 drivers/infiniband/ulp/srpt/ib_srpt.c   | 3 ++-
 include/rdma/ib_mad.h   | 4 +++-
 12 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..b6bd305 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -108,7 +108,8 @@ void agent_send_response(struct ib_mad *mad, struct ib_grh 
*grh,
 
send_buf = ib_create_send_mad(agent, wc->src_qp, wc->pkey_index, 0,
  IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
- GFP_KERNEL);
+ GFP_KERNEL,
+ IB_MGMT_BASE_VERSION);
if (IS_ERR(send_buf)) {
dev_err(&device->dev, "ib_create_send_mad error\n");
goto err1;
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..5767781 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -267,7 +267,8 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
m = ib_create_send_mad(mad_agent, cm_id_priv->id.remote_cm_qpn,
   cm_id_priv->av.pkey_index,
   0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
-  GFP_ATOMIC);
+  GFP_ATOMIC,
+  IB_MGMT_BASE_VERSION);
if (IS_ERR(m)) {
ib_destroy_ah(ah);
return PTR_ERR(m);
@@ -297,7 +298,8 @@ static int cm_alloc_response_msg(struct cm_port *port,
 
m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
   0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
-  GFP_ATOMIC);
+  GFP_ATOMIC,
+  IB_MGMT_BASE_VERSION);
if (IS_ERR(m)) {
ib_destroy_ah(ah);
return PTR_ERR(m);
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 4d93ad2..2145294 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -930,7 +930,8 @@ struct ib_mad_send_buf * ib_create_send_mad(struct 
ib_mad_agent *mad_agent,
u32 remote_qpn, u16 pkey_index,
int rmpp_active,
int hdr_len, int data_len,
-   gfp_t gfp_mask)
+   gfp_t gfp_mask,
+   u8 base_version)
 {
struct ib_mad_agent_private *mad_agent_priv;
struct ib_mad_send_wr_private *mad_send_wr;
diff --git a/drivers/infiniband/core/mad_rmpp.c 
b/drivers/infiniband/core/mad_rmpp.c
index f37878c..2379e2d 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -139,7 +139,8 @@ static void ack_recv(struct mad_rmpp_recv *rmpp_recv,
hdr_len = 
ib_get_mad_data_offset(recv_wc->recv_buf.mad->mad_hdr.mgmt_class);
msg = ib_create_send_mad(&rmpp_recv->agent->agent, recv_wc->wc->src_qp,
 recv_wc->wc->pkey_index, 1, hdr_len,
-0, GFP_KERNEL);
+0, GFP_KERNEL,
+IB_MGMT_BASE_VERSION);
if (IS_ERR(msg))
return;
 
@@ -165,7 +166,8 @@ static struct ib_mad_send_buf *alloc_response_msg(struct 
ib_mad_agent *agent,
hdr_len = 
ib_get_mad_data_offset(recv_wc->recv_buf.mad->mad_hdr.mgmt_class);
msg = ib_create_send_mad(agent, recv_wc->wc->src_qp,
 recv_wc->wc->pkey_index, 1,
-hdr_len, 0, GFP_KERNEL);
+hdr_len, 0, GFP_KERNEL,
+IB_MGMT_BASE_VERSION);
if (IS_ERR(msg))
ib_destroy_ah(ah);
else {
diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index c38f030..32c3fe6 100644
--- a/drivers/infiniband/core/sa_query.c
++

[PATCH v4 09/19] IB/mad: Add helper function for smi_handle_dr_smp_recv

2015-02-04 Thread ira . weiny
From: Ira Weiny 

This helper function will be used for processing both IB and OPA SMP recvs.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/smi.c | 80 +--
 1 file changed, 47 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/core/smi.c b/drivers/infiniband/core/smi.c
index 3bac6e6..24670de 100644
--- a/drivers/infiniband/core/smi.c
+++ b/drivers/infiniband/core/smi.c
@@ -137,91 +137,105 @@ enum smi_action smi_handle_dr_smp_send(struct ib_smp 
*smp,
smp->dr_slid == IB_LID_PERMISSIVE);
 }
 
-/*
- * Adjust information for a received SMP
- * Return 0 if the SMP should be dropped
- */
-enum smi_action smi_handle_dr_smp_recv(struct ib_smp *smp, u8 node_type,
-  int port_num, int phys_port_cnt)
+static inline
+enum smi_action __smi_handle_dr_smp_recv(u8 node_type, int port_num,
+int phys_port_cnt,
+u8 *hop_ptr, u8 hop_cnt,
+u8 *initial_path,
+u8 *return_path,
+u8 direction,
+int dr_dlid_is_permissive,
+int dr_slid_is_permissive)
 {
-   u8 hop_ptr, hop_cnt;
-
-   hop_ptr = smp->hop_ptr;
-   hop_cnt = smp->hop_cnt;
-
/* See section 14.2.2.2, Vol 1 IB spec */
/* C14-6 -- valid hop_cnt values are from 0 to 63 */
if (hop_cnt >= IB_SMP_MAX_PATH_HOPS)
return IB_SMI_DISCARD;
 
-   if (!ib_get_smp_direction(smp)) {
+   if (!direction) {
/* C14-9:1 -- sender should have incremented hop_ptr */
-   if (hop_cnt && hop_ptr == 0)
+   if (hop_cnt && *hop_ptr == 0)
return IB_SMI_DISCARD;
 
/* C14-9:2 -- intermediate hop */
-   if (hop_ptr && hop_ptr < hop_cnt) {
+   if (*hop_ptr && *hop_ptr < hop_cnt) {
if (node_type != RDMA_NODE_IB_SWITCH)
return IB_SMI_DISCARD;
 
-   smp->return_path[hop_ptr] = port_num;
-   /* smp->hop_ptr updated when sending */
-   return (smp->initial_path[hop_ptr+1] <= phys_port_cnt ?
+   return_path[*hop_ptr] = port_num;
+   /* hop_ptr updated when sending */
+   return (initial_path[*hop_ptr+1] <= phys_port_cnt ?
IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-9:3 -- We're at the end of the DR segment of path */
-   if (hop_ptr == hop_cnt) {
+   if (*hop_ptr == hop_cnt) {
if (hop_cnt)
-   smp->return_path[hop_ptr] = port_num;
-   /* smp->hop_ptr updated when sending */
+   return_path[*hop_ptr] = port_num;
+   /* hop_ptr updated when sending */
 
return (node_type == RDMA_NODE_IB_SWITCH ||
-   smp->dr_dlid == IB_LID_PERMISSIVE ?
+   dr_dlid_is_permissive ?
IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */
/* C14-9:5 -- fail unreasonable hop pointer */
-   return (hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : 
IB_SMI_DISCARD);
+   return (*hop_ptr == hop_cnt + 1 ? IB_SMI_HANDLE : 
IB_SMI_DISCARD);
 
} else {
 
/* C14-13:1 */
-   if (hop_cnt && hop_ptr == hop_cnt + 1) {
-   smp->hop_ptr--;
-   return (smp->return_path[smp->hop_ptr] ==
+   if (hop_cnt && *hop_ptr == hop_cnt + 1) {
+   (*hop_ptr)--;
+   return (return_path[*hop_ptr] ==
port_num ? IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-13:2 */
-   if (2 <= hop_ptr && hop_ptr <= hop_cnt) {
+   if (2 <= *hop_ptr && *hop_ptr <= hop_cnt) {
if (node_type != RDMA_NODE_IB_SWITCH)
return IB_SMI_DISCARD;
 
-   /* smp->hop_ptr updated when sending */
-   return (smp->return_path[hop_ptr-1] <= phys_port_cnt ?
+   /* hop_ptr updated when sending */
+   return (return_path[*hop_ptr-1] <= phys_port_cnt ?
IB_SMI_HANDLE : IB_SMI_DISCARD);
}
 
/* C14-13:3 -- We're at the end of the DR segment of path */
-   if (hop_ptr == 1) {
-   if (smp->dr_slid == IB_LID_PERMISSIVE

[PATCH v4 02/19] IB/core: Cache device attributes for use by upper level drivers

2015-02-04 Thread ira . weiny
From: Ira Weiny 

Upper level drivers can access these cached device attributes rather than
caching them on their own.

Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/device.c | 2 ++
 include/rdma/ib_verbs.h  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 18c1ece..30d9d09 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -294,6 +294,8 @@ int ib_register_device(struct ib_device *device,
spin_lock_init(&device->event_handler_lock);
spin_lock_init(&device->client_data_lock);
 
+   device->query_device(device, &device->cached_dev_attrs);
+
ret = read_port_table_lengths(device);
if (ret) {
printk(KERN_WARNING "Couldn't create table lengths cache for 
device %s\n",
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 0d74f1d..0116e4b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1675,6 +1675,7 @@ struct ib_device {
u32  local_dma_lkey;
u8   node_type;
u8   phys_port_cnt;
+   struct ib_device_attrcached_dev_attrs;
 };
 
 struct ib_client {
-- 
1.8.2



[PATCH v4 05/19] IB/mad: Change cast in rcv_has_same_class

2015-02-04 Thread ira . weiny
From: Ira Weiny 

rcv_has_same_class only needs access to the MAD header and can be used for both
IB and Jumbo MADs.

Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/mad.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 66b3940..819b794 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1750,7 +1750,7 @@ static int is_rmpp_data_mad(struct ib_mad_agent_private 
*mad_agent_priv,
 static inline int rcv_has_same_class(struct ib_mad_send_wr_private *wr,
 struct ib_mad_recv_wc *rwc)
 {
-   return ((struct ib_mad *)(wr->send_buf.mad))->mad_hdr.mgmt_class ==
+   return ((struct ib_mad_hdr *)(wr->send_buf.mad))->mgmt_class ==
rwc->recv_buf.mad->mad_hdr.mgmt_class;
 }
 
-- 
1.8.2



[PATCH v4 12/19] IB/mad: Add MAD size parameters to process_mad

2015-02-04 Thread ira . weiny
From: Ira Weiny 

In support of variable-length MADs, add in and out MAD size parameters to the
process_mad call.

The out MAD size parameter is passed by reference such that it can be updated
by the agent to indicate the proper response length to be sent by the MAD
stack.

The in and out MAD parameters are made generic by specifying them as
ib_mad_hdr.

Drivers are modified as needed.

Signed-off-by: Ira Weiny 

---
 drivers/infiniband/core/mad.c| 30 ++--
 drivers/infiniband/core/sysfs.c  |  5 -
 drivers/infiniband/hw/amso1100/c2_provider.c |  5 -
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |  5 -
 drivers/infiniband/hw/cxgb4/provider.c   |  7 +--
 drivers/infiniband/hw/ehca/ehca_iverbs.h |  4 ++--
 drivers/infiniband/hw/ehca/ehca_sqp.c|  8 +++-
 drivers/infiniband/hw/ipath/ipath_mad.c  |  8 +++-
 drivers/infiniband/hw/ipath/ipath_verbs.h|  3 ++-
 drivers/infiniband/hw/mlx4/mad.c |  9 -
 drivers/infiniband/hw/mlx4/mlx4_ib.h |  3 ++-
 drivers/infiniband/hw/mlx5/mad.c |  8 +++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  3 ++-
 drivers/infiniband/hw/mthca/mthca_dev.h  |  4 ++--
 drivers/infiniband/hw/mthca/mthca_mad.c  |  9 +++--
 drivers/infiniband/hw/nes/nes_verbs.c|  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c |  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.h |  3 ++-
 drivers/infiniband/hw/qib/qib_mad.c  |  8 +++-
 drivers/infiniband/hw/qib/qib_verbs.h|  3 ++-
 include/rdma/ib_verbs.h  |  8 +---
 21 files changed, 103 insertions(+), 36 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 2ffeace..4d93ad2 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -715,11 +715,12 @@ static void build_smp_wc(struct ib_qp *qp,
wc->port_num = port_num;
 }
 
-static struct ib_mad_private *alloc_mad_priv(struct ib_device *dev)
+static struct ib_mad_private *alloc_mad_priv(struct ib_device *dev,
+size_t *mad_size)
 {
+   *mad_size = dev->cached_dev_attrs.max_mad_size;
return (kmalloc(sizeof(struct ib_mad_private_header) +
-   sizeof(struct ib_grh) +
-   dev->cached_dev_attrs.max_mad_size, GFP_ATOMIC));
+   sizeof(struct ib_grh) + *mad_size, GFP_ATOMIC));
 }
 
 /*
@@ -741,6 +742,8 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_priv,
u8 port_num;
struct ib_wc mad_wc;
struct ib_send_wr *send_wr = &mad_send_wr->send_wr;
+   size_t in_mad_size = 
mad_agent_priv->agent.device->cached_dev_attrs.max_mad_size;
+   size_t out_mad_size;
 
if (device->node_type == RDMA_NODE_IB_SWITCH &&
smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
@@ -777,7 +780,7 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_priv,
local->mad_priv = NULL;
local->recv_mad_agent = NULL;
 
-   mad_priv = alloc_mad_priv(mad_agent_priv->agent.device);
+   mad_priv = alloc_mad_priv(mad_agent_priv->agent.device, &out_mad_size);
if (!mad_priv) {
ret = -ENOMEM;
dev_err(&device->dev, "No memory for local response MAD\n");
@@ -792,8 +795,9 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_priv,
 
/* No GRH for DR SMP */
ret = device->process_mad(device, 0, port_num, &mad_wc, NULL,
- (struct ib_mad *)smp,
- (struct ib_mad *)&mad_priv->mad);
+ (struct ib_mad_hdr *)smp, in_mad_size,
+ (struct ib_mad_hdr *)&mad_priv->mad,
+ &out_mad_size);
switch (ret)
{
case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY:
@@ -2011,6 +2015,7 @@ static void ib_mad_recv_done_handler(struct 
ib_mad_port_private *port_priv,
struct ib_mad_agent_private *mad_agent;
int port_num;
int ret = IB_MAD_RESULT_SUCCESS;
+   size_t resp_mad_size;
 
mad_list = (struct ib_mad_list_head *)(unsigned long)wc->wr_id;
qp_info = mad_list->mad_queue->qp_info;
@@ -2038,7 +2043,7 @@ static void ib_mad_recv_done_handler(struct 
ib_mad_port_private *port_priv,
if (!validate_mad(&recv->mad.mad.mad_hdr, qp_info->qp->qp_num))
goto out;
 
-   response = alloc_mad_priv(port_priv->device);
+   response = alloc_mad_priv(port_priv->device, &resp_mad_size);
if (!response) {
dev_err(&port_priv->device->dev,
"ib_mad_recv_done_handler no memory for response 
buffer\n");
@@ -2063,8 +2068,10 @@ static void ib_mad_recv_done_handler(struct 
ib_mad_port_private *port_priv,
ret = port_p

[PATCH v4 04/19] IB/mad: Change ib_response_mad signature to take ib_mad_hdr rather than ib_mad

2015-02-04 Thread ira . weiny
From: Ira Weiny 

ib_response_mad only needs access to the MAD header and can be used for both IB
and Jumbo MADs.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/mad.c  | 20 ++--
 drivers/infiniband/core/user_mad.c |  6 +++---
 include/rdma/ib_mad.h  |  2 +-
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 99b..66b3940 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -179,12 +179,12 @@ static int is_vendor_method_in_use(
return 0;
 }
 
-int ib_response_mad(struct ib_mad *mad)
+int ib_response_mad(struct ib_mad_hdr *hdr)
 {
-   return ((mad->mad_hdr.method & IB_MGMT_METHOD_RESP) ||
-   (mad->mad_hdr.method == IB_MGMT_METHOD_TRAP_REPRESS) ||
-   ((mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_BM) &&
-(mad->mad_hdr.attr_mod & IB_BM_ATTR_MOD_RESP)));
+   return ((hdr->method & IB_MGMT_METHOD_RESP) ||
+   (hdr->method == IB_MGMT_METHOD_TRAP_REPRESS) ||
+   ((hdr->mgmt_class == IB_MGMT_CLASS_BM) &&
+(hdr->attr_mod & IB_BM_ATTR_MOD_RESP)));
 }
 EXPORT_SYMBOL(ib_response_mad);
 
@@ -791,7 +791,7 @@ static int handle_outgoing_dr_smp(struct 
ib_mad_agent_private *mad_agent_priv,
switch (ret)
{
case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY:
-   if (ib_response_mad(&mad_priv->mad.mad) &&
+   if (ib_response_mad(&mad_priv->mad.mad.mad_hdr) &&
mad_agent_priv->agent.recv_handler) {
local->mad_priv = mad_priv;
local->recv_mad_agent = mad_agent_priv;
@@ -1628,7 +1628,7 @@ find_mad_agent(struct ib_mad_port_private *port_priv,
unsigned long flags;
 
spin_lock_irqsave(&port_priv->reg_lock, flags);
-   if (ib_response_mad(mad)) {
+   if (ib_response_mad(&mad->mad_hdr)) {
u32 hi_tid;
struct ib_mad_agent_private *entry;
 
@@ -1765,8 +1765,8 @@ static inline int rcv_has_same_gid(struct 
ib_mad_agent_private *mad_agent_priv,
u8 port_num = mad_agent_priv->agent.port_num;
u8 lmc;
 
-   send_resp = ib_response_mad((struct ib_mad *)wr->send_buf.mad);
-   rcv_resp = ib_response_mad(rwc->recv_buf.mad);
+   send_resp = ib_response_mad((struct ib_mad_hdr *)wr->send_buf.mad);
+   rcv_resp = ib_response_mad(&rwc->recv_buf.mad->mad_hdr);
 
if (send_resp == rcv_resp)
/* both requests, or both responses. GIDs different */
@@ -1879,7 +1879,7 @@ static void ib_mad_complete_recv(struct 
ib_mad_agent_private *mad_agent_priv,
}
 
/* Complete corresponding request */
-   if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
+   if (ib_response_mad(&mad_recv_wc->recv_buf.mad->mad_hdr)) {
spin_lock_irqsave(&mad_agent_priv->lock, flags);
mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
if (!mad_send_wr) {
diff --git a/drivers/infiniband/core/user_mad.c 
b/drivers/infiniband/core/user_mad.c
index 928cdd2..66b5217 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -426,11 +426,11 @@ static int is_duplicate(struct ib_umad_file *file,
 * the same TID, reject the second as a duplicate.  This is more
 * restrictive than required by the spec.
 */
-   if (!ib_response_mad((struct ib_mad *) hdr)) {
-   if (!ib_response_mad((struct ib_mad *) sent_hdr))
+   if (!ib_response_mad(hdr)) {
+   if (!ib_response_mad(sent_hdr))
return 1;
continue;
-   } else if (!ib_response_mad((struct ib_mad *) sent_hdr))
+   } else if (!ib_response_mad(sent_hdr))
continue;
 
if (same_destination(&packet->mad.hdr, &sent_packet->mad.hdr))
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 9bb99e9..9c89939 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -263,7 +263,7 @@ struct ib_mad_send_buf {
  * ib_response_mad - Returns if the specified MAD has been generated in
  *   response to a sent request or trap.
  */
-int ib_response_mad(struct ib_mad *mad);
+int ib_response_mad(struct ib_mad_hdr *hdr);
 
 /**
  * ib_get_rmpp_resptime - Returns the RMPP response time.
-- 
1.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 03/19] IB/mad: Change validate_mad signature to take ib_mad_hdr rather than ib_mad

2015-02-04 Thread ira . weiny
From: Ira Weiny 

validate_mad only needs access to the MAD header and can be used for both IB
and Jumbo MADs.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/core/mad.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 4673262..99b 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1708,20 +1708,20 @@ out:
return mad_agent;
 }
 
-static int validate_mad(struct ib_mad *mad, u32 qp_num)
+static int validate_mad(struct ib_mad_hdr *mad_hdr, u32 qp_num)
 {
int valid = 0;
 
/* Make sure MAD base version is understood */
-   if (mad->mad_hdr.base_version != IB_MGMT_BASE_VERSION) {
+   if (mad_hdr->base_version != IB_MGMT_BASE_VERSION) {
pr_err("MAD received with unsupported base version %d\n",
-   mad->mad_hdr.base_version);
+   mad_hdr->base_version);
goto out;
}
 
/* Filter SMI packets sent to other than QP0 */
-   if ((mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED) ||
-   (mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)) {
+   if ((mad_hdr->mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED) ||
+   (mad_hdr->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)) {
if (qp_num == 0)
valid = 1;
} else {
@@ -1979,7 +1979,7 @@ static void ib_mad_recv_done_handler(struct 
ib_mad_port_private *port_priv,
snoop_recv(qp_info, &recv->header.recv_wc, IB_MAD_SNOOP_RECVS);
 
/* Validate MAD */
-   if (!validate_mad(&recv->mad.mad, qp_info->qp->qp_num))
+   if (!validate_mad(&recv->mad.mad.mad_hdr, qp_info->qp->qp_num))
goto out;
 
response = kmem_cache_alloc(ib_mad_cache, GFP_KERNEL);
-- 
1.8.2



Re: [PATCH V1 for-next 3/4] IB/core: Make sure that the PSN does not overflow

2015-02-04 Thread Jason Gunthorpe
On Sun, Feb 01, 2015 at 01:33:30PM +0200, Or Gerlitz wrote:

> >>@@ -860,6 +860,12 @@ int ib_modify_qp_is_ok(enum ib_qp_state cur_state, 
> >>enum ib_qp_state next_state,
> >>if (mask & ~(req_param | opt_param | IB_QP_STATE))
> >>return 0;
> >>+   if ((mask & IB_QP_SQ_PSN) && (attr->sq_psn & 0xff000000))
> >>+   return 0;
> >>+
> >>+   if ((mask & IB_QP_RQ_PSN) && (attr->rq_psn & 0xff000000))
> >>+   return 0;
> >>+

> >And since rdmacm has had this longstanding bug of generating > 24
> >bit PSNs, this change seems really scary - very likely to break
> >working systems.
 
> By IBTA the HW can only use 24 bits, also the IB CM also makes sure
> to only encode/decode 24 PSN bits to/from the wire (see the PSN
> related helpers in drivers/infiniband/core/cm_msgs.h), so in that
> respect, I don't see what other bits which are not 24 bits out of
> the 32 generated ones could be of some use to existing applications,
> please clarify.

Maybe you can explain why this check is suddenly important now? It
seems risky with no rational?

Jason


Re: [PATCH] IB/core: Temporarily disable ex_query_device uverb

2015-02-04 Thread Yann Droneaud
Hi,

Le lundi 02 février 2015 à 14:10 +0100, Yann Droneaud a écrit :
> Le dimanche 01 février 2015 à 15:35 +0200, Haggai Eran a écrit :
> > Commit 5a77abf9a97a ("IB/core: Add support for extended query device caps")
> > added a new extended verb to query the capabilities of RDMA devices, but the
> > semantics of this verb are still under debate [1].
> > 
> > Block access to this verb from user-space until the new semantics are in
> > place.
> > 
> > Cc: Yann Droneaud 
> > Cc: Jason Gunthorpe 
> > Cc: Eli Cohen 
> > 
> > [1] [PATCH v1 0/5] IB/core: extended query device caps cleanup for v3.19
> > http://www.spinics.net/lists/linux-rdma/msg22904.html
> > 
> > Signed-off-by: Haggai Eran 
> 
> Reviewed-by: Yann Droneaud 

> >  drivers/infiniband/core/uverbs_main.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/infiniband/core/uverbs_main.c 
> > b/drivers/infiniband/core/uverbs_main.c
> > index e6c23b9eab33..5db1a8cc388d 100644
> > --- a/drivers/infiniband/core/uverbs_main.c
> > +++ b/drivers/infiniband/core/uverbs_main.c
> > @@ -123,7 +123,6 @@ static int (*uverbs_ex_cmd_table[])(struct 
> > ib_uverbs_file *file,
> > struct ib_udata *uhw) = {
> > [IB_USER_VERBS_EX_CMD_CREATE_FLOW]  = ib_uverbs_ex_create_flow,
> > [IB_USER_VERBS_EX_CMD_DESTROY_FLOW] = ib_uverbs_ex_destroy_flow,
> > -   [IB_USER_VERBS_EX_CMD_QUERY_DEVICE] = ib_uverbs_ex_query_device
> >  };
> >  
> >  static void ib_uverbs_add_one(struct ib_device *device);
> 
> That's the smallest (and smartest) patch to be applied instead of
> reverting.
> 

Unfortunately, I've missed the issue I was complaining about in the
first place [1], and I feel a bit guilty for having missed it.

The present patch is fine as it fully disables the new extended 
QUERY_DEVICE uverb, but it doesn't disable the broken logic added in
ib_copy_to_udata() by commit 5a77abf9a97a ('IB/core: Add support for
extended query device caps'):

  diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
  index 470a011d6fa4..97a999f9e4d8 100644
  --- a/include/rdma/ib_verbs.h
  +++ b/include/rdma/ib_verbs.h
  @@ -1662,7 +1662,10 @@ static inline int ib_copy_from_udata(void *dest, 
struct ib_udata *udata, size_t
   
   static inline int ib_copy_to_udata(struct ib_udata *udata, void *src, size_t 
len)
   {
  - return copy_to_user(udata->outbuf, src, len) ? -EFAULT : 0;
  + size_t copy_sz;
  +
  + copy_sz = min_t(size_t, len, udata->outlen);
  + return copy_to_user(udata->outbuf, src, copy_sz) ? -EFAULT : 0;
   }

That part of commit 5a77abf9a97a should be reverted as I'm not sure
it doesn't introduce regressions, especially ones that are difficult to notice.

Regards.

[1] Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps
http://mid.gmane.org/1418733236.2779.26.ca...@opteya.com

-- 
Yann Droneaud
OPTEYA




RE: CUDA not working with ib_write_bw

2015-02-04 Thread Steve Wise
I want to RDMA to/from gpu memory from a remote peer.  So the data arrives at 
the RDMA device from the wire, and is DMA'd directly
to GPU memory of the peer adapter in the system.



> -Original Message-
> From: Zhangfengwei [mailto:fngw.zh...@huawei.com]
> Sent: Wednesday, February 04, 2015 12:42 AM
> To: Steve Wise; g...@dev.mellanox.co.il
> Cc: linux-rdma@vger.kernel.org; Stephen Bates
> Subject: Re: CUDA not working with ib_write_bw
> 
> Hi Steve,
> 
> Do you want to transfer the data to the GPU buffer directly? I guess the DMA
> doesn't do this all by itself.
> 
> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org 
> [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Steve Wise
> Sent: February 2, 2015 23:59
> To: g...@dev.mellanox.co.il
> Cc: linux-rdma@vger.kernel.org; Stephen Bates
> Subject: CUDA not working with ib_write_bw
> 
> Hey Gil,
> 
> I'm trying to test iWARP RDMA <-> GPU memory and I compiled CUDA into the 
> top-o-tree perftest repo.  My Nvidia setup is working
> because I have verified it with another gpu rdma package (donard from pmc).  
> But when using ib_write_bw the server gets an error
> registering the gpu memory with the device.  Below is the output from 
> ib_write_bw.  I instrumented the kernel registration path
and I find
> that get_user_pages() is returning -14 (-EFAULT) when called by ib_umem_get().
> 
> Q:  Is this supposed to work with the upstream RDMA drivers?   I'm using a 
> 3.16.3 kernel.org kernel.
> 
> Thanks,
> 
> Steve
> ---
> 
> [root@stevo1 perftest]# ./ib_write_bw -R --use_cuda
> 
> 
> * Waiting for client to connect... *
> 
> ---
> RDMA_Write BW Test
>  Dual-port   : OFF  Device : cxgb4_1
>  Number of qps   : 1Transport type : IW
>  Connection type : RC   Using SRQ  : OFF
>  CQ Moderation   : 100
>  Mtu : 1024[B]
>  Link type   : Ethernet
>  Gid index   : 0
>  Max inline data : 0[B]
>  rdma_cm QPs : ON
>  Data ex. method : rdma_cm
> ---
>  Waiting for client rdma_cm QP to connect
>  Please run the same command with the IB/RoCE interface IP
> ---
> initializing CUDA
> There is 1 device supporting CUDA
> [pid = 14124, dev = 0] device name = [Tesla K20Xm]
> creating CUDA Ctx
> making it the current CUDA Ctx
> cuMemAlloc() of a 131072 bytes GPU buffer
> allocated GPU buffer address at 00130426 pointer=0x130426
> Couldn't allocate MR
> Unable to create the resources needed by comm struct
> Unable to perform rdma_client function
> [root@stevo1 perftest]#
> 



Re: [PATCH] net: Mellanox: Delete unnecessary checks before the function call "vunmap"

2015-02-04 Thread Eli Cohen
Acked-by: Eli Cohen 

On Wed, Feb 04, 2015 at 03:22:33PM +0100, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Wed, 4 Feb 2015 15:17:00 +0100
> 
> The vunmap() function also performs input parameter validation.
> Thus the test around the call is not needed.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 


[PATCH] net: Mellanox: Delete unnecessary checks before the function call "vunmap"

2015-02-04 Thread SF Markus Elfring
From: Markus Elfring 
Date: Wed, 4 Feb 2015 15:17:00 +0100

The vunmap() function also performs input parameter validation.
Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/net/ethernet/mellanox/mlx4/alloc.c  | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/alloc.c 
b/drivers/net/ethernet/mellanox/mlx4/alloc.c
index 963dd7e..06faa51 100644
--- a/drivers/net/ethernet/mellanox/mlx4/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx4/alloc.c
@@ -660,7 +660,7 @@ void mlx4_buf_free(struct mlx4_dev *dev, int size, struct 
mlx4_buf *buf)
dma_free_coherent(&dev->pdev->dev, size, buf->direct.buf,
  buf->direct.map);
else {
-   if (BITS_PER_LONG == 64 && buf->direct.buf)
+   if (BITS_PER_LONG == 64)
vunmap(buf->direct.buf);
 
for (i = 0; i < buf->nbufs; ++i)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c 
b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index 56779c1..201ca6d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -121,7 +121,7 @@ void mlx5_buf_free(struct mlx5_core_dev *dev, struct 
mlx5_buf *buf)
dma_free_coherent(&dev->pdev->dev, buf->size, buf->direct.buf,
  buf->direct.map);
else {
-   if (BITS_PER_LONG == 64 && buf->direct.buf)
+   if (BITS_PER_LONG == 64)
vunmap(buf->direct.buf);
 
for (i = 0; i < buf->nbufs; i++)
-- 
2.2.2



[PATCH 1/1] IB/mthca: remove deprecated use of pci api

2015-02-04 Thread Quentin Lambert
Replace occurrences of the PCI API with appropriate calls to the DMA API.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr)

@deprecated@
idexpression id;
position p;
@@

(
  pci_dma_supported@p ( id, ...)
|
  pci_alloc_consistent@p ( id, ...)
)

@bad1@
idexpression id;
position deprecated.p;
@@
...when != &id->dev
   when != pci_get_drvdata ( id )
   when != pci_enable_device ( id )
(
  pci_dma_supported@p ( id, ...)
|
  pci_alloc_consistent@p ( id, ...)
)

@depends on !bad1@
idexpression id;
expression direction;
position deprecated.p;
@@

(
- pci_dma_supported@p ( id,
+ dma_supported ( &id->dev,
...
+ , GFP_ATOMIC
  )
|
- pci_alloc_consistent@p ( id,
+ dma_alloc_coherent ( &id->dev,
...
+ , GFP_ATOMIC
  )
)

Signed-off-by: Quentin Lambert 
---
 drivers/infiniband/hw/mthca/mthca_eq.c  | 20 
 drivers/infiniband/hw/mthca/mthca_main.c|  8 
 drivers/infiniband/hw/mthca/mthca_memfree.c | 23 ++-
 3 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_eq.c 
b/drivers/infiniband/hw/mthca/mthca_eq.c
index 6902017..cd99e8c 100644
--- a/drivers/infiniband/hw/mthca/mthca_eq.c
+++ b/drivers/infiniband/hw/mthca/mthca_eq.c
@@ -617,7 +617,7 @@ static void mthca_free_eq(struct mthca_dev *dev,
 
mthca_free_mr(dev, &eq->mr);
for (i = 0; i < npages; ++i)
-   pci_free_consistent(dev->pdev, PAGE_SIZE,
+   dma_free_coherent(&dev->pdev->dev, PAGE_SIZE,
eq->page_list[i].buf,
dma_unmap_addr(&eq->page_list[i], mapping));
 
@@ -739,17 +739,21 @@ int mthca_map_eq_icm(struct mthca_dev *dev, u64 icm_virt)
dev->eq_table.icm_page = alloc_page(GFP_HIGHUSER);
if (!dev->eq_table.icm_page)
return -ENOMEM;
-   dev->eq_table.icm_dma  = pci_map_page(dev->pdev, 
dev->eq_table.icm_page, 0,
- PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-   if (pci_dma_mapping_error(dev->pdev, dev->eq_table.icm_dma)) {
+   dev->eq_table.icm_dma  = dma_map_page(&dev->pdev->dev,
+ dev->eq_table.icm_page, 0,
+ PAGE_SIZE,
+ (enum 
dma_data_direction)PCI_DMA_BIDIRECTIONAL);
+
+   if (dma_mapping_error(&dev->pdev->dev, dev->eq_table.icm_dma)) {
__free_page(dev->eq_table.icm_page);
return -ENOMEM;
}
 
ret = mthca_MAP_ICM_page(dev, dev->eq_table.icm_dma, icm_virt);
if (ret) {
-   pci_unmap_page(dev->pdev, dev->eq_table.icm_dma, PAGE_SIZE,
-  PCI_DMA_BIDIRECTIONAL);
+   dma_unmap_page(&dev->pdev->dev, dev->eq_table.icm_dma,
+  PAGE_SIZE,
+  (enum dma_data_direction)PCI_DMA_BIDIRECTIONAL);
__free_page(dev->eq_table.icm_page);
}
 
@@ -759,8 +763,8 @@ int mthca_map_eq_icm(struct mthca_dev *dev, u64 icm_virt)
 void mthca_unmap_eq_icm(struct mthca_dev *dev)
 {
mthca_UNMAP_ICM(dev, dev->eq_table.icm_virt, 1);
-   pci_unmap_page(dev->pdev, dev->eq_table.icm_dma, PAGE_SIZE,
-  PCI_DMA_BIDIRECTIONAL);
+   dma_unmap_page(&dev->pdev->dev, dev->eq_table.icm_dma, PAGE_SIZE,
+  (enum dma_data_direction)PCI_DMA_BIDIRECTIONAL);
__free_page(dev->eq_table.icm_page);
 }
 
diff --git a/drivers/infiniband/hw/mthca/mthca_main.c 
b/drivers/infiniband/hw/mthca/mthca_main.c
index ded76c1..2b52416 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -940,20 +940,20 @@ static int __mthca_init_one(struct pci_dev *pdev, int 
hca_type)
 
pci_set_master(pdev);
 
-   err = pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
+   err = dma_set_mask(&pdev->dev, DMA_BIT_MASK(64));
if (err) {
dev_warn(&pdev->dev, "Warning: couldn't set 64-bit PCI DMA 
mask.\n");
-   err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+   err = dma_set_mask(&pdev->dev, DMA_BIT_MASK(32));
if (err) {
dev_err(&pdev->dev, "Can't set PCI DMA mask, 
aborting.\n");
goto err_free_res;
}
}
-   err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+   err = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
if (err) {
dev_warn(&pdev->dev, "Warning: couldn't set 64-bit "
 "consistent PCI DMA mask.\n");
-   err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+   err = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
if (err) {
dev_err(&pdev->dev, "Can't set consistent PCI DMA mask, 
"