Re: [ANNOUNCE] libibverbs 1.1.7 is released

2013-05-29 Thread Or Gerlitz

On 29/05/2013 02:10, Roland Dreier wrote:

> libibverbs is a library that allows programs to use RDMA verbs for
> direct access to RDMA (currently InfiniBand and iWARP) hardware from userspace.

Hey, so there's RoCE out there too...

> The new stable release, 1.1.7, is available from


Is a libmlx4 release coming too?

Or.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[regression] Re: [RFC][PATCH] mm: Fix RLIMIT_MEMLOCK

2013-05-29 Thread Ingo Molnar

* Christoph Lameter c...@linux.com wrote:

> On Mon, 27 May 2013, Peter Zijlstra wrote:
>
> > Before your patch pinned was included in locked and thus RLIMIT_MEMLOCK
> > had a single resource counter. After your patch RLIMIT_MEMLOCK is
> > applied separately to both -- more or less.
>
> Before the patch the count was doubled since a single page was counted
> twice: Once because it was mlocked (marked with PG_mlock) and then again
> because it was also pinned (the refcount was increased). Two different
> things.

Christoph, why are you *STILL* arguing??

You caused a *regression* in a userspace ABI, plain and simple, and a
security relevant one. Furthermore you modified kernel/events/core.c yet
you never even Cc:-ed the parties involved ...

All your excuses, obfuscation and attempts to redefine the universe to
your liking won't change reality: it worked before, it does not now. Take
responsibility for your action for christ's sake and move forward towards
a resolution, okay?

When can we expect a fix from you for the breakage you caused? Or at least 
a word that acknowledges that you broke a user ABI carelessly?

Thanks,

Ingo


Re: Status of ummunot branch?

2013-05-29 Thread Or Gerlitz
On Tue, May 28, 2013 at 8:51 PM, Jeff Squyres (jsquyres)
jsquy...@cisco.com wrote:

> I ask because, as an MPI guy, I would *love* to see this stuff integrated
> into the kernel and libibverbs.


Hi Jeff,

Have you looked at ODP? See
https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html

Or.


[PATCH v5 00/28] rdma/cm: Add support for native IB addressing

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

This patch series adds support for native IB addressing to the
rdma cm.  The full patch series is available from:

git://git.openfabrics.org/~shefty/rdma-dev.git for-next

The rdma cm is the only widely usable interface for establishing
communication over InfiniBand.  Other interfaces are either privileged
(e.g. umad) or incomplete in that they do not contact the IB SA
for necessary data (e.g. ucm).  The rdma cm is the only interface
which provides support for path record queries and multicast joins.
However, users of the rdma cm interface are restricted to using IP
addresses, which must be translated into IB addresses.

Allowing the use of native IB addresses removes the requirement
for IPoIB, which in turn allows us to offload name and/or address
translation services to a user space daemon.  The primary motivation
is to support large scale fabrics, with address and name services
either cached or bypassed completely.  For example, IB GIDs are
known or the information is exchanged out of band by an MPI process
manager.  However, another use case involves load balancing software.
Currently the rdma cm cannot establish rdma connections through
a load balancer, since the IP <-> GID mapping is not well defined.
An out of band mechanism could be used in such situations to
determine the correct mapping, with the rdma cm still managing
the connection.

The patch set introduces af_ib and sockaddr_ib.  The kernel
rdma_cm is updated accordingly, mainly to make its handling of
addresses more generic.  However, since sockaddr_ib is larger
than sockaddr_in6, the rdma_ucm requires changes to its user to
kernel interface.  To provide backwards compatibility, the userspace
ABI is extended to support the larger address size.

Note that this series only touches the main networking stack to
define AF_IB.

Signed-off-by: Sean Hefty sean.he...@intel.com

Changes from v4:
Updated to newer base kernel.
Removed unused SDP header definitions and related code.
Exposed private data to servers bound to AF_IB addresses.
Fixed issue on client side accessing NULL private data.

Sean Hefty (28):
  rdma/cm: define native IB address
  rdma/cm: Allow enabling reuseaddr in any state
  rdma/cm: Include AF_IB in loopback and any address checks
  ib/addr: Add AF_IB support to ip_addr_size
  rdma/cm: Update port reservation to support AF_IB
  rdma/cm: Allow user to specify AF_IB when binding
  rdma/cm: Do not modify sa_family when setting loopback address
  rdma/cm: Add helper functions to return id address information
  rdma/cm: Restrict AF_IB loopback to binding to IB devices only
  rdma/cm: Verify that source and dest sa_family are the same
  rdma/cm: Add support for AF_IB to rdma_resolve_addr
  rdma/cm: Add support for AF_IB to rdma_resolve_route
  rdma/cm: Add support for AF_IB to cma_get_service_id
  rdma/cm: Remove unused SDP related code
  rdma/cm: Merge cma_get/save_net_info
  rdma/cm: Expose private data when using AF_IB
  rdma/cm: Set qkey for AF_IB
  rdma/cm: Only listen on IB devices when using AF_IB
  rdma/ucm: Support querying for AF_IB addresses
  ib/sa: Export function to pack a path record into wire format
  rdma/ucm: Support querying when IB paths are not reversible
  rdma/cm: Export cma_get_service_id
  rdma/ucm: Add ability to query GID addresses
  rdma/ucm: Name changes to indicate only IP addresses supported
  rdma/ucm: Allow user space to bind to AF_IB
  rdma/ucm: Allow user space to pass AF_IB into resolve
  rdma/ucm: Allow user space to specify AF_IB when joining multicast
  rdma/cm: Export AF_IB statistics

 drivers/infiniband/core/addr.c |   20 +-
 drivers/infiniband/core/cma.c  |  906 +---
 drivers/infiniband/core/sa_query.c |6 +
 drivers/infiniband/core/ucma.c |  321 +++--
 include/linux/socket.h |2 +
 include/rdma/ib.h  |   89 
 include/rdma/ib_addr.h |6 +-
 include/rdma/ib_sa.h   |7 +
 include/rdma/rdma_cm.h |   13 +
 include/uapi/rdma/rdma_user_cm.h   |   73 +++-
 10 files changed, 1001 insertions(+), 442 deletions(-)
 create mode 100644 include/rdma/ib.h

-- 
1.7.3



[PATCH v5 01/28] rdma/cm: define native IB address

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Define AF_IB and sockaddr_ib to allow the rdma_cm to use native IB
addressing.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 include/linux/socket.h |2 +
 include/rdma/ib.h  |   89 
 2 files changed, 91 insertions(+), 0 deletions(-)
 create mode 100644 include/rdma/ib.h

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 2b9f74b..68f7120 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -167,6 +167,7 @@ struct ucred {
 #define AF_PPPOX	24	/* PPPoX sockets		*/
 #define AF_WANPIPE	25	/* Wanpipe API Sockets		*/
 #define AF_LLC		26	/* Linux LLC			*/
+#define AF_IB		27	/* Native InfiniBand address	*/
 #define AF_CAN		29	/* Controller Area Network	*/
 #define AF_TIPC		30	/* TIPC sockets			*/
 #define AF_BLUETOOTH	31	/* Bluetooth sockets		*/
@@ -211,6 +212,7 @@ struct ucred {
 #define PF_PPPOX	AF_PPPOX
 #define PF_WANPIPE	AF_WANPIPE
 #define PF_LLC		AF_LLC
+#define PF_IB		AF_IB
 #define PF_CAN		AF_CAN
 #define PF_TIPC		AF_TIPC
 #define PF_BLUETOOTH	AF_BLUETOOTH
diff --git a/include/rdma/ib.h b/include/rdma/ib.h
new file mode 100644
index 000..cf8f9e7
--- /dev/null
+++ b/include/rdma/ib.h
@@ -0,0 +1,89 @@
+/*
+ * Copyright (c) 2010 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#if !defined(_RDMA_IB_H)
+#define _RDMA_IB_H
+
+#include <linux/types.h>
+
+struct ib_addr {
+	union {
+		__u8		uib_addr8[16];
+		__be16		uib_addr16[8];
+		__be32		uib_addr32[4];
+		__be64		uib_addr64[2];
+	} ib_u;
+#define sib_addr8		ib_u.uib_addr8
+#define sib_addr16		ib_u.uib_addr16
+#define sib_addr32		ib_u.uib_addr32
+#define sib_addr64		ib_u.uib_addr64
+#define sib_raw			ib_u.uib_addr8
+#define sib_subnet_prefix	ib_u.uib_addr64[0]
+#define sib_interface_id	ib_u.uib_addr64[1]
+};
+
+static inline int ib_addr_any(const struct ib_addr *a)
+{
+	return ((a->sib_addr64[0] | a->sib_addr64[1]) == 0);
+}
+
+static inline int ib_addr_loopback(const struct ib_addr *a)
+{
+	return ((a->sib_addr32[0] | a->sib_addr32[1] |
+		 a->sib_addr32[2] | (a->sib_addr32[3] ^ htonl(1))) == 0);
+}
+
+static inline void ib_addr_set(struct ib_addr *addr,
+			       __be32 w1, __be32 w2, __be32 w3, __be32 w4)
+{
+	addr->sib_addr32[0] = w1;
+	addr->sib_addr32[1] = w2;
+	addr->sib_addr32[2] = w3;
+	addr->sib_addr32[3] = w4;
+}
+
+static inline int ib_addr_cmp(const struct ib_addr *a1, const struct ib_addr *a2)
+{
+	return memcmp(a1, a2, sizeof(struct ib_addr));
+}
+
+struct sockaddr_ib {
+	unsigned short int	sib_family;	/* AF_IB */
+	__be16			sib_pkey;
+	__be32			sib_flowinfo;
+	struct ib_addr		sib_addr;
+	__be64			sib_sid;
+	__be64			sib_sid_mask;
+	__u64			sib_scope_id;
+};
+};
+
+#endif /* _RDMA_IB_H */
-- 
1.7.3
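
For readers who want to try the address layout outside the kernel, the
helpers from this patch can be approximated in userspace.  This is only a
sketch, not the kernel header: the demo_ names, the <stdint.h> types used
in place of __be32/__be64, and the size-check helper are assumptions made
for illustration.

```c
#include <assert.h>
#include <arpa/inet.h>   /* htonl */
#include <netinet/in.h>  /* struct sockaddr_in6 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of struct ib_addr: a 128-bit GID viewed
 * as bytes, 32-bit words, or 64-bit halves (network byte order). */
struct demo_ib_addr {
	union {
		uint8_t  addr8[16];
		uint32_t addr32[4];
		uint64_t addr64[2];
	} u;
};

/* Hypothetical mirror of struct sockaddr_ib from the patch. */
struct demo_sockaddr_ib {
	unsigned short      sib_family;
	uint16_t            sib_pkey;
	uint32_t            sib_flowinfo;
	struct demo_ib_addr sib_addr;
	uint64_t            sib_sid;
	uint64_t            sib_sid_mask;
	uint64_t            sib_scope_id;
};

/* Non-zero when every bit of the address is zero (the "any" address). */
static int demo_ib_addr_any(const struct demo_ib_addr *a)
{
	assert(a != NULL);
	return (a->u.addr64[0] | a->u.addr64[1]) == 0;
}

/* Non-zero when the address is ::1, following the IPv6 convention. */
static int demo_ib_addr_loopback(const struct demo_ib_addr *a)
{
	assert(a != NULL);
	return (a->u.addr32[0] | a->u.addr32[1] |
		a->u.addr32[2] | (a->u.addr32[3] ^ htonl(1))) == 0;
}

/* Non-zero when the IB socket address does not fit in a sockaddr_in6. */
static int demo_ib_sockaddr_is_larger(void)
{
	return sizeof(struct demo_sockaddr_ib) > sizeof(struct sockaddr_in6);
}
```

The last helper illustrates the point made in the cover letter: because
sockaddr_ib is larger than sockaddr_in6, any interface that assumed
sockaddr_in6 was the largest address needs to grow.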



[PATCH v5 02/28] rdma/cm: Allow enabling reuseaddr in any state

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

The rdma_cm only allows setting reuseaddr if the corresponding
rdma_cm_id is in the idle state.  Allow setting this value in
other states.  This brings the behavior more in line with
sockets.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 71c2c71..f971a50 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2097,7 +2097,7 @@ int rdma_set_reuseaddr(struct rdma_cm_id *id, int reuse)
 
 	id_priv = container_of(id, struct rdma_id_private, id);
 	spin_lock_irqsave(&id_priv->lock, flags);
-	if (id_priv->state == RDMA_CM_IDLE) {
+	if (reuse || id_priv->state == RDMA_CM_IDLE) {
 		id_priv->reuseaddr = reuse;
 		ret = 0;
 	} else {
-- 
1.7.3
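
The effect of the one-line change can be modelled in isolation.  The
sketch below is not the kernel code: the demo_ types are hypothetical
stand-ins, and returning -EINVAL on rejection is an assumption, since the
else branch is not visible in the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rdma_cm state and private id. */
enum demo_state { DEMO_IDLE, DEMO_ADDR_BOUND, DEMO_LISTEN };

struct demo_id {
	enum demo_state state;
	int reuseaddr;
};

/*
 * Model of the patched check: enabling reuseaddr is accepted in any
 * state, while clearing it is still only accepted in the idle state.
 * The -EINVAL return value is an assumption (see the note above).
 */
static int demo_set_reuseaddr(struct demo_id *id, int reuse)
{
	assert(id != NULL);

	if (reuse || id->state == DEMO_IDLE) {
		id->reuseaddr = reuse;
		return 0;
	}
	return -EINVAL;
}
```

Note that the condition as written only relaxes the enabling direction:
a request to clear reuseaddr on an id that has left the idle state is
still rejected.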



[PATCH v5 03/28] rdma/cm: Include AF_IB in loopback and any address checks

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Enhance checks for loopback and any address to support AF_IB
in addition to AF_INET and AF_INET6.  This will allow future
patches to use AF_IB when binding and resolving addresses.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |   40 
 1 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index f971a50..1cd35d1 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -50,6 +50,7 @@
 #include <rdma/rdma_cm.h>
 #include <rdma/rdma_cm_ib.h>
 #include <rdma/rdma_netlink.h>
+#include <rdma/ib.h>
 #include <rdma/ib_cache.h>
 #include <rdma/ib_cm.h>
 #include <rdma/ib_sa.h>
@@ -679,26 +680,30 @@ EXPORT_SYMBOL(rdma_init_qp_attr);
 
 static inline int cma_zero_addr(struct sockaddr *addr)
 {
-	struct in6_addr *ip6;
-
-	if (addr->sa_family == AF_INET)
-		return ipv4_is_zeronet(
-			((struct sockaddr_in *)addr)->sin_addr.s_addr);
-	else {
-		ip6 = &((struct sockaddr_in6 *) addr)->sin6_addr;
-		return (ip6->s6_addr32[0] | ip6->s6_addr32[1] |
-			ip6->s6_addr32[2] | ip6->s6_addr32[3]) == 0;
+	switch (addr->sa_family) {
+	case AF_INET:
+		return ipv4_is_zeronet(((struct sockaddr_in *)addr)->sin_addr.s_addr);
+	case AF_INET6:
+		return ipv6_addr_any(&((struct sockaddr_in6 *) addr)->sin6_addr);
+	case AF_IB:
+		return ib_addr_any(&((struct sockaddr_ib *) addr)->sib_addr);
+	default:
+		return 0;
 	}
 }
 
 static inline int cma_loopback_addr(struct sockaddr *addr)
 {
-	if (addr->sa_family == AF_INET)
-		return ipv4_is_loopback(
-			((struct sockaddr_in *) addr)->sin_addr.s_addr);
-	else
-		return ipv6_addr_loopback(
-			&((struct sockaddr_in6 *) addr)->sin6_addr);
+	switch (addr->sa_family) {
+	case AF_INET:
+		return ipv4_is_loopback(((struct sockaddr_in *) addr)->sin_addr.s_addr);
+	case AF_INET6:
+		return ipv6_addr_loopback(&((struct sockaddr_in6 *) addr)->sin6_addr);
+	case AF_IB:
+		return ib_addr_loopback(&((struct sockaddr_ib *) addr)->sib_addr);
+	default:
+		return 0;
+	}
 }
 
 static inline int cma_any_addr(struct sockaddr *addr)
@@ -715,9 +720,12 @@ static int cma_addr_cmp(struct sockaddr *src, struct sockaddr *dst)
 	case AF_INET:
 		return ((struct sockaddr_in *) src)->sin_addr.s_addr !=
 		       ((struct sockaddr_in *) dst)->sin_addr.s_addr;
-	default:
+	case AF_INET6:
 		return ipv6_addr_cmp(&((struct sockaddr_in6 *) src)->sin6_addr,
 				     &((struct sockaddr_in6 *) dst)->sin6_addr);
+	default:
+		return ib_addr_cmp(&((struct sockaddr_ib *) src)->sib_addr,
+				   &((struct sockaddr_ib *) dst)->sib_addr);
 	}
 }
 
-- 
1.7.3
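
The per-family dispatch that this patch introduces can be tried in
userspace with standard headers.  The following is a rough sketch under
stated assumptions: demo_zero_addr is a hypothetical name, only the two
IP families are handled (AF_IB would need the sockaddr_ib definition from
this series), and the IPv4 test is written out by hand to mirror what
ipv4_is_zeronet() checks.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Userspace sketch of the switch used by cma_zero_addr() after this
 * patch.  Unknown families report "not zero", which matches the default
 * branch added in the patch.
 */
static int demo_zero_addr(const struct sockaddr *addr)
{
	assert(addr != NULL);

	switch (addr->sa_family) {
	case AF_INET:
		/* Zero network: the most significant octet is 0. */
		return (((const struct sockaddr_in *)addr)->sin_addr.s_addr &
			htonl(0xff000000)) == 0;
	case AF_INET6:
		return IN6_IS_ADDR_UNSPECIFIED(
			&((const struct sockaddr_in6 *)addr)->sin6_addr) != 0;
	default:
		return 0;
	}
}
```

Using a switch with an explicit default, rather than an if/else that
treats everything non-IPv4 as IPv6, is what makes room for a third
address family without misreading its bytes.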



[PATCH v5 05/28] rdma/cm: Update port reservation to support AF_IB

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

AF_IB uses a 64-bit service ID (SID), which the
user can control through the use of a mask.  The rdma_cm
will assign values to the unmasked portions of the SID
based on the selected port space and port number.

Because the IB spec divides the SID range into several regions,
a SID/mask combination may fall into one of the existing
port space ranges as defined by the RDMA CM IP Annex.  Map
the AF_IB SID to the correct RDMA port space.
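
The SID arithmetic described above can be sketched in host byte order.
This is an illustration only: struct demo_sid and the demo_ functions are
hypothetical, and the kernel code additionally converts to and from
big-endian with be64_to_cpu()/cpu_to_be64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-byte-order view of sib_sid and sib_sid_mask. */
struct demo_sid {
	uint64_t sid;
	uint64_t mask;
};

/* AF_IB case of cma_port(): the port is the low 16 bits of the part of
 * the SID that the user marked as significant. */
static uint16_t demo_sid_port(const struct demo_sid *s)
{
	assert(s != NULL);
	return (uint16_t)(s->sid & s->mask);
}

/* AF_IB case of cma_bind_port(): keep the user-controlled bits, fill in
 * the port chosen by the rdma_cm, then mark the whole SID significant. */
static void demo_bind_port(struct demo_sid *s, uint16_t port)
{
	assert(s != NULL);
	s->sid = (s->sid & s->mask) | (uint64_t)port;
	s->mask = ~(uint64_t)0;
}
```

With a mask that leaves the low 16 bits unset, demo_sid_port() reports 0,
which is how an "any port" request looks for AF_IB; once a port has been
bound, the mask covers the full SID and the same helper returns that
port.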

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |  107 +
 include/rdma/rdma_cm.h|5 ++
 2 files changed, 91 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8ffecf0..f56c62a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -729,12 +729,22 @@ static int cma_addr_cmp(struct sockaddr *src, struct sockaddr *dst)
 	}
 }
 
-static inline __be16 cma_port(struct sockaddr *addr)
+static __be16 cma_port(struct sockaddr *addr)
 {
-	if (addr->sa_family == AF_INET)
+	struct sockaddr_ib *sib;
+
+	switch (addr->sa_family) {
+	case AF_INET:
 		return ((struct sockaddr_in *) addr)->sin_port;
-	else
+	case AF_INET6:
 		return ((struct sockaddr_in6 *) addr)->sin6_port;
+	case AF_IB:
+		sib = (struct sockaddr_ib *) addr;
+		return htons((u16) (be64_to_cpu(sib->sib_sid) &
+				    be64_to_cpu(sib->sib_sid_mask)));
+	default:
+		return 0;
+	}
 }
 
 static inline int cma_any_port(struct sockaddr *addr)
@@ -2139,10 +2149,29 @@ EXPORT_SYMBOL(rdma_set_afonly);
 static void cma_bind_port(struct rdma_bind_list *bind_list,
 			  struct rdma_id_private *id_priv)
 {
-	struct sockaddr_in *sin;
+	struct sockaddr *addr;
+	struct sockaddr_ib *sib;
+	u64 sid, mask;
+	__be16 port;
 
-	sin = (struct sockaddr_in *) &id_priv->id.route.addr.src_addr;
-	sin->sin_port = htons(bind_list->port);
+	addr = (struct sockaddr *) &id_priv->id.route.addr.src_addr;
+	port = htons(bind_list->port);
+
+	switch (addr->sa_family) {
+	case AF_INET:
+		((struct sockaddr_in *) addr)->sin_port = port;
+		break;
+	case AF_INET6:
+		((struct sockaddr_in6 *) addr)->sin6_port = port;
+		break;
+	case AF_IB:
+		sib = (struct sockaddr_ib *) addr;
+		sid = be64_to_cpu(sib->sib_sid);
+		mask = be64_to_cpu(sib->sib_sid_mask);
+		sib->sib_sid = cpu_to_be64((sid & mask) | (u64) ntohs(port));
+		sib->sib_sid_mask = cpu_to_be64(~0ULL);
+		break;
+	}
 	id_priv->bind_list = bind_list;
 	hlist_add_head(&id_priv->node, &bind_list->owners);
 }
@@ -2269,31 +2298,67 @@ static int cma_bind_listen(struct rdma_id_private *id_priv)
 	return ret;
 }
 
-static int cma_get_port(struct rdma_id_private *id_priv)
+static struct idr *cma_select_inet_ps(struct rdma_id_private *id_priv)
 {
-	struct idr *ps;
-	int ret;
-
 	switch (id_priv->id.ps) {
 	case RDMA_PS_SDP:
-		ps = &sdp_ps;
-		break;
+		return &sdp_ps;
 	case RDMA_PS_TCP:
-		ps = &tcp_ps;
-		break;
+		return &tcp_ps;
 	case RDMA_PS_UDP:
-		ps = &udp_ps;
-		break;
+		return &udp_ps;
 	case RDMA_PS_IPOIB:
-		ps = &ipoib_ps;
-		break;
+		return &ipoib_ps;
 	case RDMA_PS_IB:
-		ps = &ib_ps;
-		break;
+		return &ib_ps;
 	default:
-		return -EPROTONOSUPPORT;
+		return NULL;
+	}
+}
+
+static struct idr *cma_select_ib_ps(struct rdma_id_private *id_priv)
+{
+	struct idr *ps = NULL;
+	struct sockaddr_ib *sib;
+	u64 sid_ps, mask, sid;
+
+	sib = (struct sockaddr_ib *) &id_priv->id.route.addr.src_addr;
+	mask = be64_to_cpu(sib->sib_sid_mask) & RDMA_IB_IP_PS_MASK;
+	sid = be64_to_cpu(sib->sib_sid) & mask;
+
+	if ((id_priv->id.ps == RDMA_PS_IB) && (sid == (RDMA_IB_IP_PS_IB & mask))) {
+		sid_ps = RDMA_IB_IP_PS_IB;
+		ps = &ib_ps;
+	} else if (((id_priv->id.ps == RDMA_PS_IB) || (id_priv->id.ps == RDMA_PS_TCP)) &&
+		   (sid == (RDMA_IB_IP_PS_TCP & mask))) {
+		sid_ps = RDMA_IB_IP_PS_TCP;
+		ps = &tcp_ps;
+	} else if (((id_priv->id.ps == RDMA_PS_IB) || (id_priv->id.ps == RDMA_PS_UDP)) &&
+		   (sid == (RDMA_IB_IP_PS_UDP & mask))) {
+		sid_ps = RDMA_IB_IP_PS_UDP;
+		ps = &udp_ps;
 	}
 
+	if (ps) {
+		sib->sib_sid = cpu_to_be64(sid_ps | ntohs(cma_port((struct sockaddr *) sib)));
+   

[PATCH v5 04/28] ib/addr: Add AF_IB support to ip_addr_size

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Add support for AF_IB to ip_addr_size, and rename the function
to account for the change.  Give the compiler more control over
whether the call should be inline or not by moving the definition
into the .c file, removing the static inline, and exporting it.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/addr.c |   20 ++--
 drivers/infiniband/core/cma.c  |   12 ++--
 include/rdma/ib_addr.h |6 +-
 3 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index eaec8d7..e90f2b2 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -45,6 +45,7 @@
 #include <net/addrconf.h>
 #include <net/ip6_route.h>
 #include <rdma/ib_addr.h>
+#include <rdma/ib.h>
 
 MODULE_AUTHOR("Sean Hefty");
 MODULE_DESCRIPTION("IB Address Translation");
@@ -70,6 +71,21 @@ static LIST_HEAD(req_list);
 static DECLARE_DELAYED_WORK(work, process_req);
 static struct workqueue_struct *addr_wq;
 
+int rdma_addr_size(struct sockaddr *addr)
+{
+	switch (addr->sa_family) {
+	case AF_INET:
+		return sizeof(struct sockaddr_in);
+	case AF_INET6:
+		return sizeof(struct sockaddr_in6);
+	case AF_IB:
+		return sizeof(struct sockaddr_ib);
+	default:
+		return 0;
+	}
+}
+EXPORT_SYMBOL(rdma_addr_size);
+
 void rdma_addr_register_client(struct rdma_addr_client *client)
 {
 	atomic_set(&client->refcount, 1);
@@ -369,12 +385,12 @@ int rdma_resolve_ip(struct rdma_addr_client *client,
 			goto err;
 		}
 
-		memcpy(src_in, src_addr, ip_addr_size(src_addr));
+		memcpy(src_in, src_addr, rdma_addr_size(src_addr));
 	} else {
 		src_in->sa_family = dst_addr->sa_family;
 	}
 
-	memcpy(dst_in, dst_addr, ip_addr_size(dst_addr));
+	memcpy(dst_in, dst_addr, rdma_addr_size(dst_addr));
 	req->addr = addr;
 	req->callback = callback;
 	req->context = context;
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 1cd35d1..8ffecf0 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1584,7 +1584,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
 
 	dev_id_priv->state = RDMA_CM_ADDR_BOUND;
 	memcpy(&id->route.addr.src_addr, &id_priv->id.route.addr.src_addr,
-	       ip_addr_size((struct sockaddr *) &id_priv->id.route.addr.src_addr));
+	       rdma_addr_size((struct sockaddr *) &id_priv->id.route.addr.src_addr));
 
 	cma_attach_to_dev(dev_id_priv, cma_dev);
 	list_add_tail(&dev_id_priv->listen_list, &id_priv->listen_list);
@@ -1989,7 +1989,7 @@ static void addr_handler(int status, struct sockaddr *src_addr,
 		event.status = status;
 	} else {
 		memcpy(&id_priv->id.route.addr.src_addr, src_addr,
-		       ip_addr_size(src_addr));
+		       rdma_addr_size(src_addr));
 		event.event = RDMA_CM_EVENT_ADDR_RESOLVED;
 	}
 
@@ -2079,7 +2079,7 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
 		return -EINVAL;
 
 	atomic_inc(&id_priv->refcount);
-	memcpy(&id->route.addr.dst_addr, dst_addr, ip_addr_size(dst_addr));
+	memcpy(&id->route.addr.dst_addr, dst_addr, rdma_addr_size(dst_addr));
 	if (cma_any_addr(dst_addr))
 		ret = cma_resolve_loopback(id_priv);
 	else
@@ -2399,7 +2399,7 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 		goto err1;
 	}
 
-	memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
+	memcpy(&id->route.addr.src_addr, addr, rdma_addr_size(addr));
 	if (!(id_priv->options & (1 << CMA_OPTION_AFONLY))) {
 		if (addr->sa_family == AF_INET)
 			id_priv->afonly = 1;
@@ -3178,7 +3178,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
 	if (!mc)
 		return -ENOMEM;
 
-	memcpy(&mc->addr, addr, ip_addr_size(addr));
+	memcpy(&mc->addr, addr, rdma_addr_size(addr));
 	mc->context = context;
 	mc->id_priv = id_priv;
 
@@ -3223,7 +3223,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
 	id_priv = container_of(id, struct rdma_id_private, id);
 	spin_lock_irq(&id_priv->lock);
 	list_for_each_entry(mc, &id_priv->mc_list, list) {
-		if (!memcmp(&mc->addr, addr, ip_addr_size(addr))) {
+		if (!memcmp(&mc->addr, addr, rdma_addr_size(addr))) {
 			list_del(&mc->list);
 			spin_unlock_irq(&id_priv->lock);
 
diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h
index 9996539..f3ac0f2 100644
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -102,11 +102,7 @@ void 

[PATCH v5 07/28] rdma/cm: Do not modify sa_family when setting loopback address

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

cma_resolve_loopback is called after an rdma_cm_id has been
bound to a specific sa_family and port.  Once the
source sa_family for the id has been set, do not modify it.
Only the actual IP address portion of the source address
needs to be set.

As part of this fix, we can simplify setting the source address
by moving the loopback address assignment from cma_resolve_loopback
to cma_bind_loopback.  cma_bind_loopback is only invoked when
the source address is the loopback address.

Finally, add loopback support for AF_IB as part of the change.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |   31 ++-
 1 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 92dcdfe..46a9b5c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1952,6 +1952,23 @@ err:
 }
 EXPORT_SYMBOL(rdma_resolve_route);
 
+static void cma_set_loopback(struct sockaddr *addr)
+{
+	switch (addr->sa_family) {
+	case AF_INET:
+		((struct sockaddr_in *) addr)->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+		break;
+	case AF_INET6:
+		ipv6_addr_set(&((struct sockaddr_in6 *) addr)->sin6_addr,
+			      0, 0, 0, htonl(1));
+		break;
+	default:
+		ib_addr_set(&((struct sockaddr_ib *) addr)->sib_addr,
+			    0, 0, 0, htonl(1));
+		break;
+	}
+}
+
 static int cma_bind_loopback(struct rdma_id_private *id_priv)
 {
 	struct cma_device *cma_dev;
@@ -1992,6 +2009,7 @@ port_found:
 	ib_addr_set_pkey(&id_priv->id.route.addr.dev_addr, pkey);
 	id_priv->id.port_num = p;
 	cma_attach_to_dev(id_priv, cma_dev);
+	cma_set_loopback((struct sockaddr *) &id_priv->id.route.addr.src_addr);
 out:
 	mutex_unlock(&lock);
 	return ret;
@@ -2039,7 +2057,6 @@ out:
 static int cma_resolve_loopback(struct rdma_id_private *id_priv)
 {
 	struct cma_work *work;
-	struct sockaddr *src, *dst;
 	union ib_gid gid;
 	int ret;
 
@@ -2056,18 +2073,6 @@ static int cma_resolve_loopback(struct rdma_id_private *id_priv)
 	rdma_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid);
 	rdma_addr_set_dgid(&id_priv->id.route.addr.dev_addr, &gid);
 
-	src = (struct sockaddr *) &id_priv->id.route.addr.src_addr;
-	if (cma_zero_addr(src)) {
-		dst = (struct sockaddr *) &id_priv->id.route.addr.dst_addr;
-		if ((src->sa_family = dst->sa_family) == AF_INET) {
-			((struct sockaddr_in *)src)->sin_addr =
-				((struct sockaddr_in *)dst)->sin_addr;
-		} else {
-			((struct sockaddr_in6 *)src)->sin6_addr =
-				((struct sockaddr_in6 *)dst)->sin6_addr;
-		}
-	}
-
 	work->id = id_priv;
 	INIT_WORK(&work->work, cma_work_handler);
 	work->old_state = RDMA_CM_ADDR_QUERY;
-- 
1.7.3
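
The idea behind cma_set_loopback, writing only the address portion and
leaving sa_family and the port alone, can be shown with the standard
socket headers.  This is a sketch, not the kernel function:
demo_set_loopback is a hypothetical name, and unlike the patch it does
not treat unknown families as AF_IB because sockaddr_ib is not assumed to
be available in userspace.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Set the loopback address that matches the family the caller already
 * chose.  The family and port fields are never written.
 */
static void demo_set_loopback(struct sockaddr *addr)
{
	assert(addr != NULL);

	switch (addr->sa_family) {
	case AF_INET:
		((struct sockaddr_in *)addr)->sin_addr.s_addr =
			htonl(INADDR_LOOPBACK);
		break;
	case AF_INET6:
		((struct sockaddr_in6 *)addr)->sin6_addr = in6addr_loopback;
		break;
	default:
		break;	/* unknown family: leave the address untouched */
	}
}
```

Compare this with the code removed from cma_resolve_loopback, which
assigned src->sa_family from the destination as a side effect of the
comparison; the sketch, like the patch, never writes the family.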



[PATCH v5 06/28] rdma/cm: Allow user to specify AF_IB when binding

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Modify rdma_bind_addr to allow the user to specify AF_IB when
binding to a device.  AF_IB indicates that the user is not
mapping an IP address to the native IB addressing.  (The mapping
may have already been done, or is not needed.)

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |   34 --
 1 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index f56c62a..92dcdfe 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -359,6 +359,27 @@ static int find_gid_port(struct ib_device *device, union ib_gid *gid, u8 port_nu
 	return -EADDRNOTAVAIL;
 }
 
+static void cma_translate_ib(struct sockaddr_ib *sib, struct rdma_dev_addr *dev_addr)
+{
+	dev_addr->dev_type = ARPHRD_INFINIBAND;
+	rdma_addr_set_sgid(dev_addr, (union ib_gid *) &sib->sib_addr);
+	ib_addr_set_pkey(dev_addr, ntohs(sib->sib_pkey));
+}
+
+static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
+{
+	int ret;
+
+	if (addr->sa_family != AF_IB) {
+		ret = rdma_translate_ip(addr, dev_addr);
+	} else {
+		cma_translate_ib((struct sockaddr_ib *) addr, dev_addr);
+		ret = 0;
+	}
+
+	return ret;
+}
+
 static int cma_acquire_dev(struct rdma_id_private *id_priv)
 {
 	struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
@@ -1136,8 +1157,8 @@ static struct rdma_id_private *cma_new_conn_id(struct rdma_cm_id *listen_id,
 		rdma_addr_set_sgid(&rt->addr.dev_addr, &rt->path_rec[0].sgid);
 		ib_addr_set_pkey(&rt->addr.dev_addr, be16_to_cpu(rt->path_rec[0].pkey));
 	} else {
-		ret = rdma_translate_ip((struct sockaddr *) &rt->addr.src_addr,
-					&rt->addr.dev_addr);
+		ret = cma_translate_addr((struct sockaddr *) &rt->addr.src_addr,
+					 &rt->addr.dev_addr);
 		if (ret)
 			goto err;
 	}
@@ -1176,8 +1197,8 @@ static struct rdma_id_private *cma_new_udp_id(struct rdma_cm_id *listen_id,
 			  ip_ver, port, src, dst);
 
 	if (!cma_any_addr((struct sockaddr *) &id->route.addr.src_addr)) {
-		ret = rdma_translate_ip((struct sockaddr *) &id->route.addr.src_addr,
-					&id->route.addr.dev_addr);
+		ret = cma_translate_addr((struct sockaddr *) &id->route.addr.src_addr,
+					 &id->route.addr.dev_addr);
 		if (ret)
 			goto err;
 	}
@@ -2443,7 +2464,8 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 	struct rdma_id_private *id_priv;
 	int ret;
 
-	if (addr->sa_family != AF_INET && addr->sa_family != AF_INET6)
+	if (addr->sa_family != AF_INET && addr->sa_family != AF_INET6 &&
+	    addr->sa_family != AF_IB)
 		return -EAFNOSUPPORT;
 
 	id_priv = container_of(id, struct rdma_id_private, id);
@@ -2455,7 +2477,7 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 		goto err1;
 
 	if (!cma_any_addr(addr)) {
-		ret = cma_translate_ip(addr, &id->route.addr.dev_addr);
+		ret = cma_translate_addr(addr, &id->route.addr.dev_addr);
 		if (ret)
 			goto err1;
 
-- 
1.7.3



[PATCH v5 11/28] rdma/cm: Add support for AF_IB to rdma_resolve_addr

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Allow the user to specify the remote address using AF_IB format.
When AF_IB is used, the remote address simply needs to be recorded,
and no resolution using ARP is done.  The local address may still
need to be matched with a local IB device.
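
The matching of a local device to the destination, done by
cma_resolve_ib_dev in the diff of this patch, follows a simple
preference order that can be sketched on its own.  The code below is an
illustration under assumptions: struct demo_gid and demo_pick_source_gid
are hypothetical, the candidate GIDs are given as a flat array, and the
P_Key and transport-type filtering done by the kernel loop is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit GID split into its two 64-bit halves. */
struct demo_gid {
	uint64_t subnet_prefix;
	uint64_t interface_id;
};

/*
 * Return the index of the local GID to use as the source for 'dst'.
 * An exact match wins immediately; otherwise the first GID on the same
 * subnet is used; -1 means nothing suitable was found.
 */
static int demo_pick_source_gid(const struct demo_gid *local, size_t n,
				const struct demo_gid *dst)
{
	int fallback = -1;
	size_t i;

	assert(local != NULL && dst != NULL);

	for (i = 0; i < n; i++) {
		if (local[i].subnet_prefix == dst->subnet_prefix &&
		    local[i].interface_id == dst->interface_id)
			return (int)i;	/* destination is one of our own GIDs */

		if (fallback < 0 &&
		    local[i].subnet_prefix == dst->subnet_prefix)
			fallback = (int)i;	/* first same-subnet GID */
	}
	return fallback;
}
```

The early return corresponds to the "goto found" in the patch, and the
-1 result to the -ENODEV return when no device shares the destination's
subnet prefix.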

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |  106 ++--
 1 files changed, 100 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 67aaadc..2b5dfee 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -438,6 +438,61 @@ out:
return ret;
 }
 
+/*
+ * Select the source IB device and address to reach the destination IB address.
+ */
+static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
+{
+	struct cma_device *cma_dev, *cur_dev;
+	struct sockaddr_ib *addr;
+	union ib_gid gid, sgid, *dgid;
+	u16 pkey, index;
+	u8 port, p;
+	int i;
+
+	cma_dev = NULL;
+	addr = (struct sockaddr_ib *) cma_dst_addr(id_priv);
+	dgid = (union ib_gid *) &addr->sib_addr;
+	pkey = ntohs(addr->sib_pkey);
+
+	list_for_each_entry(cur_dev, &dev_list, list) {
+		if (rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
+			continue;
+
+		for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+			if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
+				continue;
+
+			for (i = 0; !ib_get_cached_gid(cur_dev->device, p, i, &gid); i++) {
+				if (!memcmp(&gid, dgid, sizeof(gid))) {
+					cma_dev = cur_dev;
+					sgid = gid;
+					port = p;
+					goto found;
+				}
+
+				if (!cma_dev && (gid.global.subnet_prefix ==
+						 dgid->global.subnet_prefix)) {
+					cma_dev = cur_dev;
+					sgid = gid;
+					port = p;
+				}
+			}
+		}
+	}
+
+	if (!cma_dev)
+		return -ENODEV;
+
+found:
+	cma_attach_to_dev(id_priv, cma_dev);
+	id_priv->id.port_num = port;
+	addr = (struct sockaddr_ib *) cma_src_addr(id_priv);
+	memcpy(&addr->sib_addr, &sgid, sizeof sgid);
+	cma_translate_ib(addr, &id_priv->id.route.addr.dev_addr);
+	return 0;
+}
+
 static void cma_deref_id(struct rdma_id_private *id_priv)
 {
 	if (atomic_dec_and_test(&id_priv->refcount))
@@ -2101,14 +2156,48 @@ err:
 	return ret;
 }
 
+static int cma_resolve_ib_addr(struct rdma_id_private *id_priv)
+{
+	struct cma_work *work;
+	int ret;
+
+	work = kzalloc(sizeof *work, GFP_KERNEL);
+	if (!work)
+		return -ENOMEM;
+
+	if (!id_priv->cma_dev) {
+		ret = cma_resolve_ib_dev(id_priv);
+		if (ret)
+			goto err;
+	}
+
+	rdma_addr_set_dgid(&id_priv->id.route.addr.dev_addr, (union ib_gid *)
+		&(((struct sockaddr_ib *) &id_priv->id.route.addr.dst_addr)->sib_addr));
+
+	work->id = id_priv;
+	INIT_WORK(&work->work, cma_work_handler);
+	work->old_state = RDMA_CM_ADDR_QUERY;
+	work->new_state = RDMA_CM_ADDR_RESOLVED;
+	work->event.event = RDMA_CM_EVENT_ADDR_RESOLVED;
+	queue_work(cma_wq, &work->work);
+	return 0;
+err:
+	kfree(work);
+	return ret;
+}
+
 static int cma_bind_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
 			 struct sockaddr *dst_addr)
 {
 	if (!src_addr || !src_addr->sa_family) {
 		src_addr = (struct sockaddr *) &id->route.addr.src_addr;
-		if ((src_addr->sa_family = dst_addr->sa_family) == AF_INET6) {
+		src_addr->sa_family = dst_addr->sa_family;
+		if (dst_addr->sa_family == AF_INET6) {
 			((struct sockaddr_in6 *) src_addr)->sin6_scope_id =
 				((struct sockaddr_in6 *) dst_addr)->sin6_scope_id;
+		} else if (dst_addr->sa_family == AF_IB) {
+			((struct sockaddr_ib *) src_addr)->sib_pkey =
+				((struct sockaddr_ib *) dst_addr)->sib_pkey;
 		}
 	}
 	return rdma_bind_addr(id, src_addr);
@@ -2135,12 +2224,17 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
 
 	atomic_inc(&id_priv->refcount);
 	memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr));
-	if (cma_any_addr(dst_addr))
+	if (cma_any_addr(dst_addr)) {
 		ret = cma_resolve_loopback(id_priv);
-   
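
The device selection in cma_resolve_ib_dev() above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct demo_gid, struct demo_port and demo_select_port() are hypothetical stand-ins for union ib_gid, the cached per-port pkey/GID tables and the kernel function, and devices are flattened into a list of ports. The rule it shows is the one in the patch: a port must carry the requested pkey; an exact GID match wins at once; otherwise the first GID on the destination's subnet is kept as a fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for union ib_gid. */
struct demo_gid {
	uint64_t subnet_prefix;
	uint64_t interface_id;
};

/* Hypothetical stand-in for one port's cached pkey and GID tables. */
struct demo_port {
	uint16_t pkeys[4];
	size_t npkeys;
	struct demo_gid gids[4];
	size_t ngids;
};

static int demo_port_has_pkey(const struct demo_port *port, uint16_t pkey)
{
	size_t i;

	for (i = 0; i < port->npkeys; i++)
		if (port->pkeys[i] == pkey)
			return 1;
	return 0;
}

/*
 * Returns the index of the chosen port and stores the chosen source GID
 * in *sgid, or returns -1 if no port qualifies.
 */
static int demo_select_port(const struct demo_port *ports, size_t nports,
			    uint16_t pkey, const struct demo_gid *dgid,
			    struct demo_gid *sgid)
{
	int best = -1;
	size_t p, i;

	for (p = 0; p < nports; p++) {
		if (!demo_port_has_pkey(&ports[p], pkey))
			continue;

		for (i = 0; i < ports[p].ngids; i++) {
			const struct demo_gid *gid = &ports[p].gids[i];

			/* An exact GID match ends the search immediately. */
			if (!memcmp(gid, dgid, sizeof(*gid))) {
				*sgid = *gid;
				return (int)p;
			}

			/* Otherwise remember the first GID on the same subnet. */
			if (best < 0 && gid->subnet_prefix == dgid->subnet_prefix) {
				*sgid = *gid;
				best = (int)p;
			}
		}
	}
	return best;
}
```

A caller would pass the destination GID and pkey taken from the AF_IB address and use the returned port index and source GID to fill in the local address.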

[PATCH v5 10/28] rdma/cm: Verify that source and dest sa_family are the same

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |8 +++-
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c6cc3a4..67aaadc 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1849,14 +1849,9 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
 	struct rdma_addr *addr = &route->addr;
 	struct cma_work *work;
 	int ret;
-	struct sockaddr_in *src_addr = (struct sockaddr_in *)&route->addr.src_addr;
-	struct sockaddr_in *dst_addr = (struct sockaddr_in *)&route->addr.dst_addr;
 	struct net_device *ndev = NULL;
 	u16 vid;
 
-	if (src_addr->sin_family != dst_addr->sin_family)
-		return -EINVAL;
-
 	work = kzalloc(sizeof *work, GFP_KERNEL);
 	if (!work)
 		return -ENOMEM;
@@ -2132,6 +2127,9 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
 		return ret;
 	}
 
+	if (cma_family(id_priv) != dst_addr->sa_family)
+		return -EINVAL;
+
 	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY))
 		return -EINVAL;
 
-- 
1.7.3



[PATCH v5 09/28] rdma/cm: Restrict AF_IB loopback to binding to IB devices only

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

If a user specifies AF_IB as the source address for a loopback
connection, limit the resolution to IB devices only.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |   28 
 1 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 61454d3e..c6cc3a4 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1981,26 +1981,38 @@ static void cma_set_loopback(struct sockaddr *addr)
 
 static int cma_bind_loopback(struct rdma_id_private *id_priv)
 {
-	struct cma_device *cma_dev;
+	struct cma_device *cma_dev, *cur_dev;
 	struct ib_port_attr port_attr;
 	union ib_gid gid;
 	u16 pkey;
 	int ret;
 	u8 p;
 
+	cma_dev = NULL;
 	mutex_lock(&lock);
-	if (list_empty(&dev_list)) {
+	list_for_each_entry(cur_dev, &dev_list, list) {
+		if (cma_family(id_priv) == AF_IB &&
+		    rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
+			continue;
+
+		if (!cma_dev)
+			cma_dev = cur_dev;
+
+		for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+			if (!ib_query_port(cur_dev->device, p, &port_attr) &&
+			    port_attr.state == IB_PORT_ACTIVE) {
+				cma_dev = cur_dev;
+				goto port_found;
+			}
+		}
+	}
+
+	if (!cma_dev) {
 		ret = -ENODEV;
 		goto out;
 	}
-	list_for_each_entry(cma_dev, &dev_list, list)
-		for (p = 1; p <= cma_dev->device->phys_port_cnt; ++p)
-			if (!ib_query_port(cma_dev->device, p, &port_attr) &&
-			    port_attr.state == IB_PORT_ACTIVE)
-				goto port_found;
 
 	p = 1;
-	cma_dev = list_entry(dev_list.next, struct cma_device, list);
 
 port_found:
 	ret = ib_get_cached_gid(cma_dev->device, p, 0, &gid);
-- 
1.7.3
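
The loopback selection that this patch introduces can be illustrated with a small userspace sketch. Everything here is hypothetical (struct demo_device, demo_pick_loopback(), the fixed two-port layout); it only mirrors the control flow of the patched cma_bind_loopback(): skip non-IB devices when the id is AF_IB, prefer the first active port found, and otherwise fall back to port 1 of the first eligible device.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NPORTS 2

enum demo_transport { DEMO_TRANSPORT_IB, DEMO_TRANSPORT_IWARP };

/* Hypothetical device model: a transport type and per-port active flags. */
struct demo_device {
	enum demo_transport transport;
	int port_active[DEMO_NPORTS];	/* index 0 is port 1 */
};

/*
 * Stores the chosen device index and 1-based port number and returns 0,
 * or returns -1 (leaving the outputs untouched) if no device is eligible.
 */
static int demo_pick_loopback(const struct demo_device *devs, size_t ndevs,
			      int ib_only, size_t *dev, int *port)
{
	int have_fallback = 0;
	size_t d;
	int p;

	for (d = 0; d < ndevs; d++) {
		if (ib_only && devs[d].transport != DEMO_TRANSPORT_IB)
			continue;

		/* The first eligible device is the fallback, on port 1. */
		if (!have_fallback) {
			*dev = d;
			*port = 1;
			have_fallback = 1;
		}

		/* An active port on any eligible device takes priority. */
		for (p = 1; p <= DEMO_NPORTS; p++) {
			if (devs[d].port_active[p - 1]) {
				*dev = d;
				*port = p;
				return 0;
			}
		}
	}
	return have_fallback ? 0 : -1;
}
```

With ib_only set, an iWARP device with an active port is ignored even if it is first in the list, which is the behaviour the commit message describes for AF_IB.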



[PATCH v5 27/28] rdma/ucm: Allow user space to specify AF_IB when joining multicast

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Allow user space applications to join multicast groups using MGIDs
directly.  MGIDs may be passed using AF_IB addresses.  Since the
current multicast join command only supports addresses as large as
sockaddr_in6, define a new structure for joining addresses specified
using sockaddr_ib.

Since AF_IB allows the user to specify the qkey when
resolving a remote UD QP address, when joining the multicast
group use the qkey value, if one has been assigned.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c|9 +-
 drivers/infiniband/core/ucma.c   |   55 ++---
 include/uapi/rdma/rdma_user_cm.h |   12 +++-
 3 files changed, 62 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 7034c84..6445650 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -3149,6 +3149,8 @@ static void cma_set_mgid(struct rdma_id_private *id_priv,
 								 0xFF10A01B)) {
 		/* IPv6 address is an SA assigned MGID. */
 		memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
+	} else if (addr->sa_family == AF_IB) {
+		memcpy(mgid, &((struct sockaddr_ib *) addr)->sib_addr, sizeof *mgid);
 	} else if ((addr->sa_family == AF_INET6)) {
 		ipv6_ib_mc_map(&sin6->sin6_addr, dev_addr->broadcast, mc_map);
 		if (id_priv->id.ps == RDMA_PS_UDP)
@@ -3176,9 +3178,12 @@ static int cma_join_ib_multicast(struct rdma_id_private *id_priv,
 	if (ret)
 		return ret;
 
+	ret = cma_set_qkey(id_priv, 0);
+	if (ret)
+		return ret;
+
 	cma_set_mgid(id_priv, (struct sockaddr *) &mc->addr, &rec.mgid);
-	if (id_priv->id.ps == RDMA_PS_UDP)
-		rec.qkey = cpu_to_be32(RDMA_UDP_QKEY);
+	rec.qkey = cpu_to_be32(id_priv->qkey);
 	rdma_addr_get_sgid(dev_addr, &rec.port_gid);
 	rec.pkey = cpu_to_be16(ib_addr_get_pkey(dev_addr));
 	rec.join_state = 1;
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 00ce990..b0f189b 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1229,23 +1229,23 @@ static ssize_t ucma_notify(struct ucma_file *file, const char __user *inbuf,
 	return ret;
 }
 
-static ssize_t ucma_join_ip_multicast(struct ucma_file *file,
-				      const char __user *inbuf,
-				      int in_len, int out_len)
+static ssize_t ucma_process_join(struct ucma_file *file,
+				 struct rdma_ucm_join_mcast *cmd,  int out_len)
 {
-	struct rdma_ucm_join_ip_mcast cmd;
 	struct rdma_ucm_create_id_resp resp;
 	struct ucma_context *ctx;
 	struct ucma_multicast *mc;
+	struct sockaddr *addr;
 	int ret;
 
 	if (out_len < sizeof(resp))
 		return -ENOSPC;
 
-	if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
-		return -EFAULT;
+	addr = (struct sockaddr *) &cmd->addr;
+	if (cmd->reserved || !cmd->addr_size || (cmd->addr_size != rdma_addr_size(addr)))
+		return -EINVAL;
 
-	ctx = ucma_get_ctx(file, cmd.id);
+	ctx = ucma_get_ctx(file, cmd->id);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
@@ -1256,14 +1256,14 @@ static ssize_t ucma_join_ip_multicast(struct ucma_file *file,
 		goto err1;
 	}
 
-	mc->uid = cmd.uid;
-	memcpy(&mc->addr, &cmd.addr, sizeof cmd.addr);
+	mc->uid = cmd->uid;
+	memcpy(&mc->addr, addr, cmd->addr_size);
 	ret = rdma_join_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr, mc);
 	if (ret)
 		goto err2;
 
 	resp.id = mc->id;
-	if (copy_to_user((void __user *)(unsigned long)cmd.response,
+	if (copy_to_user((void __user *)(unsigned long) cmd->response,
 			 &resp, sizeof(resp))) {
 		ret = -EFAULT;
 		goto err3;
@@ -1288,6 +1288,38 @@ err1:
 	return ret;
 }
 
+static ssize_t ucma_join_ip_multicast(struct ucma_file *file,
+				      const char __user *inbuf,
+				      int in_len, int out_len)
+{
+	struct rdma_ucm_join_ip_mcast cmd;
+	struct rdma_ucm_join_mcast join_cmd;
+
+	if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+		return -EFAULT;
+
+	join_cmd.response = cmd.response;
+	join_cmd.uid = cmd.uid;
+	join_cmd.id = cmd.id;
+	join_cmd.addr_size = rdma_addr_size((struct sockaddr *) &cmd.addr);
+	join_cmd.reserved = 0;
+	memcpy(&join_cmd.addr, &cmd.addr, join_cmd.addr_size);
+
+	return ucma_process_join(file, &join_cmd, out_len);
+}
+
+static ssize_t ucma_join_multicast(struct ucma_file *file,
+				   const char __user *inbuf,
+ 

[PATCH v5 13/28] rdma/cm: Add support for AF_IB to cma_get_service_id

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

cma_get_service_id forms the service ID based on the port space
and port number of the rdma_cm_id.  Extend the call to support
AF_IB, which contains the service ID directly.  This will
be needed to support any arbitrary SID.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index f54169b..d307c94 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1379,6 +1379,9 @@ err1:
 
 static __be64 cma_get_service_id(enum rdma_port_space ps, struct sockaddr *addr)
 {
+	if (addr->sa_family == AF_IB)
+		return ((struct sockaddr_ib *) addr)->sib_sid;
+
 	return cpu_to_be64(((u64)ps << 16) + be16_to_cpu(cma_port(addr)));
 }
 
-- 
1.7.3
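
As a rough worked example of the service ID rule in cma_get_service_id(), the sketch below computes the value in host byte order. The types and names (struct demo_addr, demo_service_id(), DEMO_AF_IP, DEMO_AF_IB) are hypothetical; the kernel function works on big-endian values and real sockaddr structures, which this sketch leaves out. The port-space argument 0x0106 in the usage note is only a sample value.

```c
#include <assert.h>
#include <stdint.h>

enum demo_family { DEMO_AF_IP, DEMO_AF_IB };

/* Hypothetical, flattened view of the address passed to the helper. */
struct demo_addr {
	enum demo_family family;
	uint16_t port;	/* IP families: port number, host byte order */
	uint64_t sid;	/* AF_IB: service ID carried in the address */
};

/* Host-order service ID: taken from the address for AF_IB, else derived. */
static uint64_t demo_service_id(uint16_t port_space, const struct demo_addr *addr)
{
	if (addr->family == DEMO_AF_IB)
		return addr->sid;

	/* Port space in bits 16 and up, port number in the low 16 bits. */
	return ((uint64_t)port_space << 16) + addr->port;
}
```

For an IP address with port 7471 (0x1d2f) and port space 0x0106, the derived ID is 0x01061d2f; for an AF_IB address the stored ID is returned unchanged, which is what lets callers use an arbitrary SID.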



[PATCH v5 08/28] rdma/cm: Add helper functions to return id address information

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Provide inline helpers to extract source and destination address
data from the rdma_cm_id.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |  138 +
 1 files changed, 71 insertions(+), 67 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 46a9b5c..61454d3e 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -311,6 +311,21 @@ static void cma_release_dev(struct rdma_id_private *id_priv)
 	mutex_unlock(&lock);
 }
 
+static inline struct sockaddr *cma_src_addr(struct rdma_id_private *id_priv)
+{
+	return (struct sockaddr *) &id_priv->id.route.addr.src_addr;
+}
+
+static inline struct sockaddr *cma_dst_addr(struct rdma_id_private *id_priv)
+{
+	return (struct sockaddr *) &id_priv->id.route.addr.dst_addr;
+}
+
+static inline unsigned short cma_family(struct rdma_id_private *id_priv)
+{
+	return id_priv->id.route.addr.src_addr.ss_family;
+}
+
 static int cma_set_qkey(struct rdma_id_private *id_priv)
 {
 	struct ib_sa_mcmember_rec rec;
@@ -900,8 +915,7 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv,
 		cma_cancel_route(id_priv);
 		break;
 	case RDMA_CM_LISTEN:
-		if (cma_any_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr)
-				&& !id_priv->cma_dev)
+		if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)
 			cma_cancel_listens(id_priv);
 		break;
 	default:
@@ -1138,6 +1152,7 @@ static struct rdma_id_private *cma_new_conn_id(struct rdma_cm_id *listen_id,
 	if (IS_ERR(id))
 		return NULL;
 
+	id_priv = container_of(id, struct rdma_id_private, id);
 	cma_save_net_info(&id->route.addr, &listen_id->route.addr,
 			  ip_ver, port, src, dst);
 
@@ -1152,19 +1167,17 @@ static struct rdma_id_private *cma_new_conn_id(struct rdma_cm_id *listen_id,
 	if (rt->num_paths == 2)
 		rt->path_rec[1] = *ib_event->param.req_rcvd.alternate_path;
 
-	if (cma_any_addr((struct sockaddr *) &rt->addr.src_addr)) {
+	if (cma_any_addr(cma_src_addr(id_priv))) {
 		rt->addr.dev_addr.dev_type = ARPHRD_INFINIBAND;
 		rdma_addr_set_sgid(&rt->addr.dev_addr, &rt->path_rec[0].sgid);
 		ib_addr_set_pkey(&rt->addr.dev_addr, be16_to_cpu(rt->path_rec[0].pkey));
 	} else {
-		ret = cma_translate_addr((struct sockaddr *) &rt->addr.src_addr,
-					 &rt->addr.dev_addr);
+		ret = cma_translate_addr(cma_src_addr(id_priv), &rt->addr.dev_addr);
 		if (ret)
 			goto err;
 	}
 	rdma_addr_set_dgid(&rt->addr.dev_addr, &rt->path_rec[0].dgid);
 
-	id_priv = container_of(id, struct rdma_id_private, id);
 	id_priv->state = RDMA_CM_CONNECT;
 	return id_priv;
 
@@ -1188,7 +1201,7 @@ static struct rdma_id_private *cma_new_udp_id(struct rdma_cm_id *listen_id,
 	if (IS_ERR(id))
 		return NULL;
 
-
+	id_priv = container_of(id, struct rdma_id_private, id);
 	if (cma_get_net_info(ib_event->private_data, listen_id->ps,
 			     &ip_ver, &port, &src, &dst))
 		goto err;
@@ -1197,13 +1210,11 @@ static struct rdma_id_private *cma_new_udp_id(struct rdma_cm_id *listen_id,
 			  ip_ver, port, src, dst);
 
 	if (!cma_any_addr((struct sockaddr *) &id->route.addr.src_addr)) {
-		ret = cma_translate_addr((struct sockaddr *) &id->route.addr.src_addr,
-					 &id->route.addr.dev_addr);
+		ret = cma_translate_addr(cma_src_addr(id_priv), &id->route.addr.dev_addr);
 		if (ret)
 			goto err;
 	}
 
-	id_priv = container_of(id, struct rdma_id_private, id);
 	id_priv->state = RDMA_CM_CONNECT;
 	return id_priv;
 err:
@@ -1386,9 +1397,9 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
 		event.event = RDMA_CM_EVENT_DISCONNECTED;
 		break;
 	case IW_CM_EVENT_CONNECT_REPLY:
-		sin = (struct sockaddr_in *) &id_priv->id.route.addr.src_addr;
+		sin = (struct sockaddr_in *) cma_src_addr(id_priv);
 		*sin = iw_event->local_addr;
-		sin = (struct sockaddr_in *) &id_priv->id.route.addr.dst_addr;
+		sin = (struct sockaddr_in *) cma_dst_addr(id_priv);
 		*sin = iw_event->remote_addr;
 		switch (iw_event->status) {
 		case 0:
@@ -1486,9 +1497,9 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
 	cm_id->context = conn_id;
 	cm_id->cm_handler = cma_iw_handler;
 
-	sin = (struct sockaddr_in *) &new_cm_id->route.addr.src_addr;
+   sin = (struct 

[PATCH v5 28/28] rdma/cm: Export AF_IB statistics

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Report AF_IB source and destination addresses through the
netlink interface.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |   37 ++---
 1 files changed, 10 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 6445650..d1e4d09 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -3586,33 +3586,16 @@ static int cma_get_id_stats(struct sk_buff *skb, struct netlink_callback *cb)
 			id_stats->bound_dev_if =
 				id->route.addr.dev_addr.bound_dev_if;
 
-			if (cma_family(id_priv) == AF_INET) {
-				if (ibnl_put_attr(skb, nlh,
-						  sizeof(struct sockaddr_in),
-						  cma_src_addr(id_priv),
-						  RDMA_NL_RDMA_CM_ATTR_SRC_ADDR)) {
-					goto out;
-				}
-				if (ibnl_put_attr(skb, nlh,
-						  sizeof(struct sockaddr_in),
-						  cma_dst_addr(id_priv),
-						  RDMA_NL_RDMA_CM_ATTR_DST_ADDR)) {
-					goto out;
-				}
-			} else if (cma_family(id_priv) == AF_INET6) {
-				if (ibnl_put_attr(skb, nlh,
-						  sizeof(struct sockaddr_in6),
-						  cma_src_addr(id_priv),
-						  RDMA_NL_RDMA_CM_ATTR_SRC_ADDR)) {
-					goto out;
-				}
-				if (ibnl_put_attr(skb, nlh,
-						  sizeof(struct sockaddr_in6),
-						  cma_dst_addr(id_priv),
-						  RDMA_NL_RDMA_CM_ATTR_DST_ADDR)) {
-					goto out;
-				}
-			}
+			if (ibnl_put_attr(skb, nlh,
+					  rdma_addr_size(cma_src_addr(id_priv)),
+					  cma_src_addr(id_priv),
+					  RDMA_NL_RDMA_CM_ATTR_SRC_ADDR))
+				goto out;
+			if (ibnl_put_attr(skb, nlh,
+					  rdma_addr_size(cma_src_addr(id_priv)),
+					  cma_dst_addr(id_priv),
+					  RDMA_NL_RDMA_CM_ATTR_DST_ADDR))
+				goto out;
 
 			id_stats->pid		= id_priv->owner;
 			id_stats->port_space	= id->ps;
-- 
1.7.3



[PATCH v5 26/28] rdma/ucm: Allow user space to pass AF_IB into resolve

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Allow user space applications to call resolve_addr using
AF_IB.  To support sockaddr_ib, we need to define a new
structure capable of handling the larger address size.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/ucma.c   |   30 +-
 include/uapi/rdma/rdma_user_cm.h |   13 -
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 22ed97e..00ce990 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -577,6 +577,33 @@ static ssize_t ucma_resolve_ip(struct ucma_file *file,
 	return ret;
 }
 
+static ssize_t ucma_resolve_addr(struct ucma_file *file,
+				 const char __user *inbuf,
+				 int in_len, int out_len)
+{
+	struct rdma_ucm_resolve_addr cmd;
+	struct sockaddr *src, *dst;
+	struct ucma_context *ctx;
+	int ret;
+
+	if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+		return -EFAULT;
+
+	src = (struct sockaddr *) &cmd.src_addr;
+	dst = (struct sockaddr *) &cmd.dst_addr;
+	if (cmd.reserved || (cmd.src_size && (cmd.src_size != rdma_addr_size(src))) ||
+	    !cmd.dst_size || (cmd.dst_size != rdma_addr_size(dst)))
+		return -EINVAL;
+
+	ctx = ucma_get_ctx(file, cmd.id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	ret = rdma_resolve_addr(ctx->cm_id, src, dst, cmd.timeout_ms);
+	ucma_put_ctx(ctx);
+	return ret;
+}
+
 static ssize_t ucma_resolve_route(struct ucma_file *file,
 				  const char __user *inbuf,
 				  int in_len, int out_len)
@@ -1423,7 +1450,8 @@ static ssize_t (*ucma_cmd_table[])(struct ucma_file *file,
[RDMA_USER_CM_CMD_LEAVE_MCAST]   = ucma_leave_multicast,
[RDMA_USER_CM_CMD_MIGRATE_ID]= ucma_migrate_id,
[RDMA_USER_CM_CMD_QUERY] = ucma_query,
-   [RDMA_USER_CM_CMD_BIND]  = ucma_bind
+   [RDMA_USER_CM_CMD_BIND]  = ucma_bind,
+   [RDMA_USER_CM_CMD_RESOLVE_ADDR]  = ucma_resolve_addr
 };
 
 static ssize_t ucma_write(struct file *filp, const char __user *buf,
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 895a427..6d03f9c 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -63,7 +63,8 @@ enum {
RDMA_USER_CM_CMD_LEAVE_MCAST,
RDMA_USER_CM_CMD_MIGRATE_ID,
RDMA_USER_CM_CMD_QUERY,
-   RDMA_USER_CM_CMD_BIND
+   RDMA_USER_CM_CMD_BIND,
+   RDMA_USER_CM_CMD_RESOLVE_ADDR
 };
 
 /*
@@ -117,6 +118,16 @@ struct rdma_ucm_resolve_ip {
__u32 timeout_ms;
 };
 
+struct rdma_ucm_resolve_addr {
+   __u32 id;
+   __u32 timeout_ms;
+   __u16 src_size;
+   __u16 dst_size;
+   __u32 reserved;
+   struct sockaddr_storage src_addr;
+   struct sockaddr_storage dst_addr;
+};
+
 struct rdma_ucm_resolve_route {
__u32 id;
__u32 timeout_ms;
-- 
1.7.3
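
The size checks that ucma_resolve_addr() applies to the new command can be sketched in userspace as follows. This is an assumption-laden illustration, not the kernel code: demo_addr_size() and demo_validate_resolve() are hypothetical names, struct demo_resolve_cmd copies only the fields relevant to validation, and AF_IB is left out because struct sockaddr_ib is not available in standard userspace headers. The rule shown is that the source address is optional (size 0) while the destination is mandatory, and any size given must match the size implied by the address family.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Returns the expected sockaddr size for a family, or 0 if unsupported. */
static size_t demo_addr_size(const struct sockaddr *addr)
{
	switch (addr->sa_family) {
	case AF_INET:
		return sizeof(struct sockaddr_in);
	case AF_INET6:
		return sizeof(struct sockaddr_in6);
	default:
		return 0;
	}
}

/* Hypothetical subset of the resolve command relevant to validation. */
struct demo_resolve_cmd {
	uint16_t src_size;
	uint16_t dst_size;
	uint32_t reserved;
	struct sockaddr_storage src_addr;
	struct sockaddr_storage dst_addr;
};

/* Returns 0 if the command is well formed, -EINVAL otherwise. */
static int demo_validate_resolve(const struct demo_resolve_cmd *cmd)
{
	const struct sockaddr *src = (const struct sockaddr *)&cmd->src_addr;
	const struct sockaddr *dst = (const struct sockaddr *)&cmd->dst_addr;

	/* Reserved fields must be zero so they can gain meaning later. */
	if (cmd->reserved)
		return -EINVAL;

	/* The source is optional, but if present its size must match. */
	if (cmd->src_size && cmd->src_size != demo_addr_size(src))
		return -EINVAL;

	/* The destination is mandatory and its size must match. */
	if (!cmd->dst_size || cmd->dst_size != demo_addr_size(dst))
		return -EINVAL;

	return 0;
}
```

Carrying an explicit size next to a sockaddr_storage field is what allows one command layout to serve address families of different lengths.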



[PATCH v5 25/28] rdma/ucm: Allow user space to bind to AF_IB

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Support user space binding to addresses using AF_IB.  Since
sockaddr_ib is larger than sockaddr_in6, we need to define
a larger structure when binding using AF_IB.  This time we
use sockaddr_storage to cover future cases.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/ucma.c   |   27 ++-
 include/uapi/rdma/rdma_user_cm.h |   10 +-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 82fb1e6..22ed97e 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -531,6 +531,30 @@ static ssize_t ucma_bind_ip(struct ucma_file *file, const char __user *inbuf,
 	return ret;
 }
 
+static ssize_t ucma_bind(struct ucma_file *file, const char __user *inbuf,
+			 int in_len, int out_len)
+{
+	struct rdma_ucm_bind cmd;
+	struct sockaddr *addr;
+	struct ucma_context *ctx;
+	int ret;
+
+	if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+		return -EFAULT;
+
+	addr = (struct sockaddr *) &cmd.addr;
+	if (cmd.reserved || !cmd.addr_size || (cmd.addr_size != rdma_addr_size(addr)))
+		return -EINVAL;
+
+	ctx = ucma_get_ctx(file, cmd.id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	ret = rdma_bind_addr(ctx->cm_id, addr);
+	ucma_put_ctx(ctx);
+	return ret;
+}
+
 static ssize_t ucma_resolve_ip(struct ucma_file *file,
 			       const char __user *inbuf,
 			       int in_len, int out_len)
@@ -1398,7 +1422,8 @@ static ssize_t (*ucma_cmd_table[])(struct ucma_file *file,
[RDMA_USER_CM_CMD_JOIN_IP_MCAST] = ucma_join_ip_multicast,
[RDMA_USER_CM_CMD_LEAVE_MCAST]   = ucma_leave_multicast,
[RDMA_USER_CM_CMD_MIGRATE_ID]= ucma_migrate_id,
-   [RDMA_USER_CM_CMD_QUERY] = ucma_query
+   [RDMA_USER_CM_CMD_QUERY] = ucma_query,
+   [RDMA_USER_CM_CMD_BIND]  = ucma_bind
 };
 
 static ssize_t ucma_write(struct file *filp, const char __user *buf,
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 79f68f7..895a427 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -62,7 +62,8 @@ enum {
RDMA_USER_CM_CMD_JOIN_IP_MCAST,
RDMA_USER_CM_CMD_LEAVE_MCAST,
RDMA_USER_CM_CMD_MIGRATE_ID,
-   RDMA_USER_CM_CMD_QUERY
+   RDMA_USER_CM_CMD_QUERY,
+   RDMA_USER_CM_CMD_BIND
 };
 
 /*
@@ -102,6 +103,13 @@ struct rdma_ucm_bind_ip {
__u32 id;
 };
 
+struct rdma_ucm_bind {
+   __u32 id;
+   __u16 addr_size;
+   __u16 reserved;
+   struct sockaddr_storage addr;
+};
+
 struct rdma_ucm_resolve_ip {
struct sockaddr_in6 src_addr;
struct sockaddr_in6 dst_addr;
-- 
1.7.3



[PATCH v5 18/28] rdma/cm: Only listen on IB devices when using AF_IB

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

If an rdma_cm_id is bound to AF_IB, with a wild card address,
only listen on IB devices.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 6ea5ce7..797fb05 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1616,6 +1616,10 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
 	struct rdma_cm_id *id;
 	int ret;
 
+	if (cma_family(id_priv) == AF_IB &&
+	    rdma_node_get_transport(cma_dev->device->node_type) != RDMA_TRANSPORT_IB)
+		return;
+
 	id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
 			    id_priv->id.qp_type);
 	if (IS_ERR(id))
-- 
1.7.3



[PATCH v5 19/28] rdma/ucm: Support querying for AF_IB addresses

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

The sockaddr structure for AF_IB is larger than sockaddr_in6.
The rdma cm user space ABI uses the latter to exchange address
information between user space and the kernel.

To support querying for larger addresses, define a new query
command that exchanges data using sockaddr_storage, rather
than sockaddr_in6.  Unlike the existing query_route command,
the new command only returns address information.  Route
(i.e. path record) data is separated.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/ucma.c   |   76 +-
 include/uapi/rdma/rdma_user_cm.h |   22 +-
 2 files changed, 93 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index e813774..18bdccc 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -47,6 +47,7 @@
 #include <rdma/ib_marshall.h>
 #include <rdma/rdma_cm.h>
 #include <rdma/rdma_cm_ib.h>
+#include <rdma/ib_addr.h>
 
 MODULE_AUTHOR("Sean Hefty");
 MODULE_DESCRIPTION("RDMA Userspace Connection Manager Access");
@@ -649,7 +650,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
 				const char __user *inbuf,
 				int in_len, int out_len)
 {
-	struct rdma_ucm_query_route cmd;
+	struct rdma_ucm_query cmd;
 	struct rdma_ucm_query_route_resp resp;
 	struct ucma_context *ctx;
 	struct sockaddr *addr;
@@ -709,6 +710,76 @@ out:
 	return ret;
 }
 
+static void ucma_query_device_addr(struct rdma_cm_id *cm_id,
+				   struct rdma_ucm_query_addr_resp *resp)
+{
+	if (!cm_id->device)
+		return;
+
+	resp->node_guid = (__force __u64) cm_id->device->node_guid;
+	resp->port_num = cm_id->port_num;
+	resp->pkey = (__force __u16) cpu_to_be16(
+		     ib_addr_get_pkey(&cm_id->route.addr.dev_addr));
+}
+
+static ssize_t ucma_query_addr(struct ucma_context *ctx,
+			       void __user *response, int out_len)
+{
+	struct rdma_ucm_query_addr_resp resp;
+	struct sockaddr *addr;
+	int ret = 0;
+
+	if (out_len < sizeof(resp))
+		return -ENOSPC;
+
+	memset(&resp, 0, sizeof resp);
+
+	addr = (struct sockaddr *) &ctx->cm_id->route.addr.src_addr;
+	resp.src_size = rdma_addr_size(addr);
+	memcpy(&resp.src_addr, addr, resp.src_size);
+
+	addr = (struct sockaddr *) &ctx->cm_id->route.addr.dst_addr;
+	resp.dst_size = rdma_addr_size(addr);
+	memcpy(&resp.dst_addr, addr, resp.dst_size);
+
+	ucma_query_device_addr(ctx->cm_id, &resp);
+
+	if (copy_to_user(response, &resp, sizeof(resp)))
+		ret = -EFAULT;
+
+	return ret;
+}
+
+static ssize_t ucma_query(struct ucma_file *file,
+			  const char __user *inbuf,
+			  int in_len, int out_len)
+{
+	struct rdma_ucm_query cmd;
+	struct ucma_context *ctx;
+	void __user *response;
+	int ret;
+
+	if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+		return -EFAULT;
+
+	response = (void __user *)(unsigned long) cmd.response;
+	ctx = ucma_get_ctx(file, cmd.id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	switch (cmd.option) {
+	case RDMA_USER_CM_QUERY_ADDR:
+		ret = ucma_query_addr(ctx, response, out_len);
+		break;
+	default:
+		ret = -ENOSYS;
+		break;
+	}
+
+	ucma_put_ctx(ctx);
+	return ret;
+}
+
 static void ucma_copy_conn_param(struct rdma_cm_id *id,
 struct rdma_conn_param *dst,
 struct rdma_ucm_conn_param *src)
@@ -1241,7 +1312,8 @@ static ssize_t (*ucma_cmd_table[])(struct ucma_file *file,
[RDMA_USER_CM_CMD_NOTIFY]   = ucma_notify,
[RDMA_USER_CM_CMD_JOIN_MCAST]   = ucma_join_multicast,
[RDMA_USER_CM_CMD_LEAVE_MCAST]  = ucma_leave_multicast,
-   [RDMA_USER_CM_CMD_MIGRATE_ID]   = ucma_migrate_id
+   [RDMA_USER_CM_CMD_MIGRATE_ID]   = ucma_migrate_id,
+   [RDMA_USER_CM_CMD_QUERY]= ucma_query
 };
 
 static ssize_t ucma_write(struct file *filp, const char __user *buf,
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 29de08f..3ea7e7a 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -61,7 +61,8 @@ enum {
RDMA_USER_CM_CMD_NOTIFY,
RDMA_USER_CM_CMD_JOIN_MCAST,
RDMA_USER_CM_CMD_LEAVE_MCAST,
-   RDMA_USER_CM_CMD_MIGRATE_ID
+   RDMA_USER_CM_CMD_MIGRATE_ID,
+   RDMA_USER_CM_CMD_QUERY
 };
 
 /*
@@ -113,10 +114,14 @@ struct rdma_ucm_resolve_route {
__u32 timeout_ms;
 };
 
-struct rdma_ucm_query_route {
+enum {
+   RDMA_USER_CM_QUERY_ADDR
+};
+
+struct rdma_ucm_query {
__u64 response;
__u32 id;
-   

[PATCH v5 20/28] ib/sa: Export function to pack a path record into wire format

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Allow converting from struct ib_sa_path_rec to the IB defined
SA path record wire format.  This will be used to report path
data from the rdma cm into user space.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/sa_query.c |6 ++
 include/rdma/ib_sa.h   |7 +++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 934f45e..9838ca4 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -652,6 +652,12 @@ void ib_sa_unpack_path(void *attribute, struct ib_sa_path_rec *rec)
 }
 EXPORT_SYMBOL(ib_sa_unpack_path);
 
+void ib_sa_pack_path(struct ib_sa_path_rec *rec, void *attribute)
+{
+   ib_pack(path_rec_table, ARRAY_SIZE(path_rec_table), rec, attribute);
+}
+EXPORT_SYMBOL(ib_sa_pack_path);
+
 static void ib_sa_path_rec_callback(struct ib_sa_query *sa_query,
int status,
struct ib_sa_mad *mad)
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index 8275e53..125f871 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -402,6 +402,12 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
 struct ib_ah_attr *ah_attr);
 
 /**
+ * ib_sa_pack_path - Convert a path record from struct ib_sa_path_rec
+ * to IB MAD wire format.
+ */
+void ib_sa_pack_path(struct ib_sa_path_rec *rec, void *attribute);
+
+/**
  * ib_sa_unpack_path - Convert a path record from MAD format to struct
  * ib_sa_path_rec.
  */
@@ -418,4 +424,5 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
   void *context),
  void *context,
  struct ib_sa_query **sa_query);
+
 #endif /* IB_SA_H */
-- 
1.7.3
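
To give a feel for what "packing into wire format" means here, the following sketch walks a field table and writes each structure member as big-endian bytes at a fixed wire offset. It is a deliberately simplified, hypothetical model: struct demo_path, struct demo_field and demo_pack() are invented for illustration, only byte-aligned 2- and 4-byte fields are handled, and the real path record layout and the kernel's table format are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side record with three sample fields. */
struct demo_path {
	uint16_t dlid;
	uint16_t slid;
	uint32_t flow_label;
};

/* One table entry: where a field lives in the struct and on the wire. */
struct demo_field {
	size_t struct_offset;
	size_t wire_offset;
	size_t size;		/* 2 or 4 bytes in this sketch */
};

static const struct demo_field demo_path_table[] = {
	{ offsetof(struct demo_path, dlid), 0, 2 },
	{ offsetof(struct demo_path, slid), 2, 2 },
	{ offsetof(struct demo_path, flow_label), 4, 4 },
};

/* Write every table field of rec into wire as big-endian bytes. */
static void demo_pack(const struct demo_field *table, size_t nfields,
		      const void *rec, uint8_t *wire)
{
	size_t f, b;

	for (f = 0; f < nfields; f++) {
		const uint8_t *src = (const uint8_t *)rec + table[f].struct_offset;
		uint32_t value = 0;

		/* Load the field at its own width to stay endian-neutral. */
		if (table[f].size == 2) {
			uint16_t v16;

			memcpy(&v16, src, sizeof(v16));
			value = v16;
		} else {
			memcpy(&value, src, sizeof(value));
		}

		/* Emit the most significant byte first. */
		for (b = 0; b < table[f].size; b++)
			wire[table[f].wire_offset + b] =
				(uint8_t)(value >> (8 * (table[f].size - 1 - b)));
	}
}
```

Driving the conversion from a table keeps packing and unpacking symmetric: the same description of the record can serve both directions.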



[PATCH v5 21/28] rdma/ucm: Support querying when IB paths are not reversible

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

The current query_route call can return up to two path records,
on the assumption that one is the primary path and the other an
optional alternate path.  In both cases, the paths are assumed to
be reversible and are used to send CM MADs.

With the ability to manually set IB path data, the rdma cm
can eventually be capable of using up to 6 paths per
connection:

    forward primary, reverse primary,
    forward alternate, reverse alternate,
    reversible primary path for CM MADs,
    reversible alternate path for CM MADs.

(It is unclear at this time if IB routing will complicate this.)
In order to handle more flexible routing topologies, add a new
command to report any number of paths.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/ucma.c   |   35 +++
 include/uapi/rdma/rdma_user_cm.h |9 -
 2 files changed, 43 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 18bdccc..722f2ff 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -750,6 +750,38 @@ static ssize_t ucma_query_addr(struct ucma_context *ctx,
return ret;
 }
 
+static ssize_t ucma_query_path(struct ucma_context *ctx,
+			       void __user *response, int out_len)
+{
+	struct rdma_ucm_query_path_resp *resp;
+	int i, ret = 0;
+
+	if (out_len < sizeof(*resp))
+		return -ENOSPC;
+
+	resp = kzalloc(out_len, GFP_KERNEL);
+	if (!resp)
+		return -ENOMEM;
+
+	resp->num_paths = ctx->cm_id->route.num_paths;
+	for (i = 0, out_len -= sizeof(*resp);
+	     i < resp->num_paths && out_len > sizeof(struct ib_path_rec_data);
+	     i++, out_len -= sizeof(struct ib_path_rec_data)) {
+
+		resp->path_data[i].flags = IB_PATH_GMP | IB_PATH_PRIMARY |
+					   IB_PATH_BIDIRECTIONAL;
+		ib_sa_pack_path(&ctx->cm_id->route.path_rec[i],
+				&resp->path_data[i].path_rec);
+	}
+
+	if (copy_to_user(response, resp,
+			 sizeof(*resp) + (i * sizeof(struct ib_path_rec_data))))
+		ret = -EFAULT;
+
+	kfree(resp);
+	return ret;
+}
+
 static ssize_t ucma_query(struct ucma_file *file,
  const char __user *inbuf,
  int in_len, int out_len)
@@ -771,6 +803,9 @@ static ssize_t ucma_query(struct ucma_file *file,
case RDMA_USER_CM_QUERY_ADDR:
ret = ucma_query_addr(ctx, response, out_len);
break;
+   case RDMA_USER_CM_QUERY_PATH:
+   ret = ucma_query_path(ctx, response, out_len);
+   break;
default:
ret = -ENOSYS;
break;
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 3ea7e7a..07eb6cf 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -115,7 +115,8 @@ struct rdma_ucm_resolve_route {
 };
 
 enum {
-   RDMA_USER_CM_QUERY_ADDR
+   RDMA_USER_CM_QUERY_ADDR,
+   RDMA_USER_CM_QUERY_PATH
 };
 
 struct rdma_ucm_query {
@@ -145,6 +146,12 @@ struct rdma_ucm_query_addr_resp {
struct sockaddr_storage dst_addr;
 };
 
+struct rdma_ucm_query_path_resp {
+   __u32 num_paths;
+   __u32 reserved;
+   struct ib_path_rec_data path_data[0];
+};
+
 struct rdma_ucm_conn_param {
__u32 qp_num;
__u32 qkey;
-- 
1.7.3
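
For readers following the variable-length response format, here is a
minimal user-space sketch of the "fixed header plus as many records as
fit" pattern that ucma_query_path() uses. The struct and function names
(path_rec_data, query_path_resp, fill_path_resp) are hypothetical
stand-ins, not the kernel definitions, and the record size is only
illustrative. The sketch also uses >= for the room check, which is a
choice made here and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one packed path record plus its flags. */
struct path_rec_data {
	uint32_t flags;
	uint32_t reserved;
	uint32_t path_rec[16];
};

/* Hypothetical stand-in for the response: header + flexible array. */
struct query_path_resp {
	uint32_t num_paths;
	uint32_t reserved;
	struct path_rec_data path_data[];
};

/*
 * Fill buf (out_len bytes, suitably aligned) with the header and as many
 * records as fit. Returns the number of bytes used, or 0 if even the
 * header does not fit.
 */
static size_t fill_path_resp(void *buf, size_t out_len,
			     const struct path_rec_data *paths,
			     uint32_t num_paths)
{
	struct query_path_resp *resp = buf;
	size_t room;
	uint32_t i;

	assert(buf != NULL);
	if (out_len < sizeof(*resp))
		return 0;

	memset(buf, 0, out_len);
	/* Report the total available, even if fewer records get copied. */
	resp->num_paths = num_paths;
	room = out_len - sizeof(*resp);

	for (i = 0; i < num_paths && room >= sizeof(paths[0]); i++) {
		resp->path_data[i] = paths[i];
		room -= sizeof(paths[0]);
	}

	return sizeof(*resp) + i * sizeof(paths[0]);
}
```

A caller can compare num_paths in the header with the number of bytes
returned to tell whether the buffer was too small for all records.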



[PATCH v5 22/28] rdma/cm: Export cma_get_service_id

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Allow the rdma_ucm to query the IB service ID formed or
allocated by the rdma_cm by exporting the cma_get_service_id
functionality.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |   13 +++--
 include/rdma/rdma_cm.h|7 +++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 797fb05..7034c84 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1336,13 +1336,14 @@ err1:
 	return ret;
 }
 
-static __be64 cma_get_service_id(enum rdma_port_space ps, struct sockaddr *addr)
+__be64 rdma_get_service_id(struct rdma_cm_id *id, struct sockaddr *addr)
 {
 	if (addr->sa_family == AF_IB)
 		return ((struct sockaddr_ib *) addr)->sib_sid;
 
-	return cpu_to_be64(((u64)ps << 16) + be16_to_cpu(cma_port(addr)));
+	return cpu_to_be64(((u64)id->ps << 16) + be16_to_cpu(cma_port(addr)));
 }
+EXPORT_SYMBOL(rdma_get_service_id);
 
 static void cma_set_compare_data(enum rdma_port_space ps, struct sockaddr *addr,
 				 struct ib_cm_compare_data *compare)
@@ -1556,7 +1557,7 @@ static int cma_ib_listen(struct rdma_id_private *id_priv)
 	id_priv->cm_id.ib = id;
 
 	addr = cma_src_addr(id_priv);
-	svc_id = cma_get_service_id(id_priv->id.ps, addr);
+	svc_id = rdma_get_service_id(&id_priv->id, addr);
 	if (cma_any_addr(addr) && !id_priv->afonly)
 		ret = ib_cm_listen(id_priv->cm_id.ib, svc_id, 0, NULL);
 	else {
@@ -1699,7 +1700,7 @@ static int cma_query_ib_route(struct rdma_id_private *id_priv, int timeout_ms,
 	path_rec.pkey = cpu_to_be16(ib_addr_get_pkey(dev_addr));
 	path_rec.numb_path = 1;
 	path_rec.reversible = 1;
-	path_rec.service_id = cma_get_service_id(id_priv->id.ps, cma_dst_addr(id_priv));
+	path_rec.service_id = rdma_get_service_id(&id_priv->id, cma_dst_addr(id_priv));
 
 	comp_mask = IB_SA_PATH_REC_DGID | IB_SA_PATH_REC_SGID |
 		    IB_SA_PATH_REC_PKEY | IB_SA_PATH_REC_NUMB_PATH |
@@ -2710,7 +2711,7 @@ static int cma_resolve_ib_udp(struct rdma_id_private *id_priv,
 	id_priv->cm_id.ib = id;
 
 	req.path = id_priv->id.route.path_rec;
-	req.service_id = cma_get_service_id(id_priv->id.ps, cma_dst_addr(id_priv));
+	req.service_id = rdma_get_service_id(&id_priv->id, cma_dst_addr(id_priv));
 	req.timeout_ms = 1 << (CMA_CM_RESPONSE_TIMEOUT - 8);
 	req.max_cm_retries = CMA_MAX_CM_RETRIES;
 
@@ -2770,7 +2771,7 @@ static int cma_connect_ib(struct rdma_id_private *id_priv,
 	if (route->num_paths == 2)
 		req.alternate_path = &route->path_rec[1];
 
-	req.service_id = cma_get_service_id(id_priv->id.ps, cma_dst_addr(id_priv));
+	req.service_id = rdma_get_service_id(&id_priv->id, cma_dst_addr(id_priv));
 	req.qp_num = id_priv->qp_num;
 	req.qp_type = id_priv->id.qp_type;
 	req.starting_psn = id_priv->seq_num;
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 966f90b..1ed2088 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -373,4 +373,11 @@ int rdma_set_reuseaddr(struct rdma_cm_id *id, int reuse);
  */
 int rdma_set_afonly(struct rdma_cm_id *id, int afonly);
 
+ /**
+ * rdma_get_service_id - Return the IB service ID for a specified address.
+ * @id: Communication identifier associated with the address.
+ * @addr: Address for the service ID.
+ */
+__be64 rdma_get_service_id(struct rdma_cm_id *id, struct sockaddr *addr);
+
 #endif /* RDMA_CM_H */
-- 
1.7.3
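
As an illustration of the non-AF_IB branch of rdma_get_service_id(),
the following user-space sketch builds the service ID from a port
space and a port number, then serializes it to big-endian byte order.
The helper names are hypothetical; the value 0x0106 is RDMA_PS_TCP as
defined in include/rdma/rdma_cm.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PS_TCP 0x0106	/* same value as RDMA_PS_TCP */

/* Host-order service ID: port space in bits 16..31, port in bits 0..15. */
static uint64_t service_id_host(uint16_t port_space, uint16_t port_host)
{
	return ((uint64_t)port_space << 16) + port_host;
}

/* Write the ID in big-endian order, the in-memory layout of a __be64. */
static void service_id_to_wire(uint64_t sid, uint8_t out[8])
{
	int i;

	assert(out != NULL);
	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(sid >> (56 - 8 * i));
}
```

For example, port 7471 (0x1D2F) in the TCP port space gives the
host-order value 0x01061D2F, which goes on the wire as the bytes
00 00 00 00 01 06 1D 2F.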



[PATCH v5 23/28] rdma/ucm: Add ability to query GID addresses

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

Part of address resolution is mapping IP addresses to IB GIDs.
With the changes to support querying larger addresses and more
path records, also provide a way to query IB GIDs after
resolution completes.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/ucma.c   |   50 ++
 include/uapi/rdma/rdma_user_cm.h |3 +-
 2 files changed, 52 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 722f2ff..45bb052 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -48,6 +48,7 @@
 #include <rdma/rdma_cm.h>
 #include <rdma/rdma_cm_ib.h>
 #include <rdma/ib_addr.h>
+#include <rdma/ib.h>
 
 MODULE_AUTHOR("Sean Hefty");
 MODULE_DESCRIPTION("RDMA Userspace Connection Manager Access");
@@ -782,6 +783,52 @@ static ssize_t ucma_query_path(struct ucma_context *ctx,
 	return ret;
 }
 
+static ssize_t ucma_query_gid(struct ucma_context *ctx,
+			      void __user *response, int out_len)
+{
+	struct rdma_ucm_query_addr_resp resp;
+	struct sockaddr_ib *addr;
+	int ret = 0;
+
+	if (out_len < sizeof(resp))
+		return -ENOSPC;
+
+	memset(&resp, 0, sizeof resp);
+
+	ucma_query_device_addr(ctx->cm_id, &resp);
+
+	addr = (struct sockaddr_ib *) &resp.src_addr;
+	resp.src_size = sizeof(*addr);
+	if (ctx->cm_id->route.addr.src_addr.ss_family == AF_IB) {
+		memcpy(addr, &ctx->cm_id->route.addr.src_addr, resp.src_size);
+	} else {
+		addr->sib_family = AF_IB;
+		addr->sib_pkey = (__force __be16) resp.pkey;
+		rdma_addr_get_sgid(&ctx->cm_id->route.addr.dev_addr,
+				   (union ib_gid *) &addr->sib_addr);
+		addr->sib_sid = rdma_get_service_id(ctx->cm_id, (struct sockaddr *)
+						    &ctx->cm_id->route.addr.src_addr);
+	}
+
+	addr = (struct sockaddr_ib *) &resp.dst_addr;
+	resp.dst_size = sizeof(*addr);
+	if (ctx->cm_id->route.addr.dst_addr.ss_family == AF_IB) {
+		memcpy(addr, &ctx->cm_id->route.addr.dst_addr, resp.dst_size);
+	} else {
+		addr->sib_family = AF_IB;
+		addr->sib_pkey = (__force __be16) resp.pkey;
+		rdma_addr_get_dgid(&ctx->cm_id->route.addr.dev_addr,
+				   (union ib_gid *) &addr->sib_addr);
+		addr->sib_sid = rdma_get_service_id(ctx->cm_id, (struct sockaddr *)
+						    &ctx->cm_id->route.addr.dst_addr);
+	}
+
+	if (copy_to_user(response, &resp, sizeof(resp)))
+		ret = -EFAULT;
+
+	return ret;
+}
+
 static ssize_t ucma_query(struct ucma_file *file,
  const char __user *inbuf,
  int in_len, int out_len)
@@ -806,6 +853,9 @@ static ssize_t ucma_query(struct ucma_file *file,
case RDMA_USER_CM_QUERY_PATH:
ret = ucma_query_path(ctx, response, out_len);
break;
+   case RDMA_USER_CM_QUERY_GID:
+   ret = ucma_query_gid(ctx, response, out_len);
+   break;
default:
ret = -ENOSYS;
break;
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 07eb6cf..ea79253 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -116,7 +116,8 @@ struct rdma_ucm_resolve_route {
 
 enum {
RDMA_USER_CM_QUERY_ADDR,
-   RDMA_USER_CM_QUERY_PATH
+   RDMA_USER_CM_QUERY_PATH,
+   RDMA_USER_CM_QUERY_GID
 };
 
 struct rdma_ucm_query {
-- 
1.7.3



[PATCH v5 16/28] rdma/cm: Expose private data when using AF_IB

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

If the source or destination address is AF_IB, then do not
reserve a portion of the private data in the IB CM REQ or SIDR
REQ messages for the cma header.  Instead, all private data
should be exported to the user.  When AF_IB is used, the
rdma cm does not have sufficient information to fill in the
cma header.  Additionally, this will be necessary to support
any IB connection through the rdma cm interface.
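
The buffer handling described above can be sketched in user space as
follows. This is only an illustration under stated assumptions: the
enum, the 36-byte hdr_standin type, and build_private_data() are
hypothetical stand-ins, not the kernel's types or functions. It shows
the three points the patch relies on: the user data offset is 0 for
AF_IB, the 8-bit length can wrap and must be checked, and a zero total
length means no buffer is allocated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the address family of the connection. */
enum addr_family { FAMILY_IP, FAMILY_IB };

/* Hypothetical stand-in for the header placed before the user data. */
struct hdr_standin { uint8_t bytes[36]; };

/* No header is reserved when the address family is IB. */
static uint8_t user_data_offset(enum addr_family family)
{
	return family == FAMILY_IB ? 0 : (uint8_t)sizeof(struct hdr_standin);
}

/*
 * Returns 0 on success, -1 on length overflow or allocation failure.
 * On success *out is NULL when there is nothing to send, otherwise a
 * zeroed heap buffer with the user data copied in at the offset.
 */
static int build_private_data(enum addr_family family,
			      const void *user, uint8_t user_len,
			      uint8_t **out, uint8_t *out_len)
{
	uint8_t offset = user_data_offset(family);
	uint8_t total = (uint8_t)(offset + user_len);	/* may wrap */

	assert(out != NULL && out_len != NULL);
	*out = NULL;
	*out_len = 0;

	if (total < user_len)	/* 8-bit length wrapped around */
		return -1;

	if (total) {
		*out = calloc(1, total);
		if (!*out)
			return -1;
		if (user && user_len)
			memcpy(*out + offset, user, user_len);
	}

	*out_len = total;
	return 0;
}
```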

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |   56 +
 1 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 69217f5..a807b18 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -893,9 +893,9 @@ static int cma_save_net_info(struct rdma_cm_id *id, struct rdma_cm_id *listen_id
 	return 0;
 }
 
-static inline int cma_user_data_offset(enum rdma_port_space ps)
+static inline int cma_user_data_offset(struct rdma_id_private *id_priv)
 {
-	return sizeof(struct cma_hdr);
+	return cma_family(id_priv) == AF_IB ? 0 : sizeof(struct cma_hdr);
 }
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
@@ -1265,7 +1265,7 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
 		return -ECONNABORTED;
 
 	memset(&event, 0, sizeof event);
-	offset = cma_user_data_offset(listen_id->id.ps);
+	offset = cma_user_data_offset(listen_id);
 	event.event = RDMA_CM_EVENT_CONNECT_REQUEST;
 	if (ib_event->event == IB_CM_SIDR_REQ_RECEIVED) {
 		conn_id = cma_new_udp_id(&listen_id->id, ib_event);
@@ -2585,7 +2585,7 @@ static int cma_format_hdr(void *hdr, struct rdma_id_private *id_priv)
 		cma_hdr->src_addr.ip4.addr = src4->sin_addr.s_addr;
 		cma_hdr->dst_addr.ip4.addr = dst4->sin_addr.s_addr;
 		cma_hdr->port = src4->sin_port;
-	} else {
+	} else if (cma_family(id_priv) == AF_INET6) {
 		struct sockaddr_in6 *src6, *dst6;
 
 		src6 = (struct sockaddr_in6 *) cma_src_addr(id_priv);
@@ -2668,24 +2668,30 @@ static int cma_resolve_ib_udp(struct rdma_id_private *id_priv,
 {
 	struct ib_cm_sidr_req_param req;
 	struct ib_cm_id *id;
-	int ret;
+	int offset, ret;
 
-	req.private_data_len = sizeof(struct cma_hdr) +
-			       conn_param->private_data_len;
+	offset = cma_user_data_offset(id_priv);
+	req.private_data_len = offset + conn_param->private_data_len;
 	if (req.private_data_len < conn_param->private_data_len)
 		return -EINVAL;
 
-	req.private_data = kzalloc(req.private_data_len, GFP_ATOMIC);
-	if (!req.private_data)
-		return -ENOMEM;
+	if (req.private_data_len) {
+		req.private_data = kzalloc(req.private_data_len, GFP_ATOMIC);
+		if (!req.private_data)
+			return -ENOMEM;
+	} else {
+		req.private_data = NULL;
+	}
 
 	if (conn_param->private_data && conn_param->private_data_len)
-		memcpy((void *) req.private_data + sizeof(struct cma_hdr),
+		memcpy((void *) req.private_data + offset,
 		       conn_param->private_data, conn_param->private_data_len);
 
-	ret = cma_format_hdr((void *) req.private_data, id_priv);
-	if (ret)
-		goto out;
+	if (req.private_data) {
+		ret = cma_format_hdr((void *) req.private_data, id_priv);
+		if (ret)
+			goto out;
+	}
 
 	id = ib_create_cm_id(id_priv->id.device, cma_sidr_rep_handler,
 			     id_priv);
@@ -2720,14 +2726,18 @@ static int cma_connect_ib(struct rdma_id_private *id_priv,
 	int offset, ret;
 
 	memset(&req, 0, sizeof req);
-	offset = cma_user_data_offset(id_priv->id.ps);
+	offset = cma_user_data_offset(id_priv);
 	req.private_data_len = offset + conn_param->private_data_len;
 	if (req.private_data_len < conn_param->private_data_len)
 		return -EINVAL;
 
-	private_data = kzalloc(req.private_data_len, GFP_ATOMIC);
-	if (!private_data)
-		return -ENOMEM;
+	if (req.private_data_len) {
+		private_data = kzalloc(req.private_data_len, GFP_ATOMIC);
+		if (!private_data)
+			return -ENOMEM;
+	} else {
+		private_data = NULL;
+	}
 
 	if (conn_param->private_data && conn_param->private_data_len)
 		memcpy(private_data + offset, conn_param->private_data,
@@ -2741,10 +2751,12 @@ static int cma_connect_ib(struct rdma_id_private *id_priv,
 	id_priv->cm_id.ib = id;
 
 	route = &id_priv->id.route;
-	ret = cma_format_hdr(private_data, id_priv);
-	if (ret)
-		goto out;
-   req.private_data = 

[PATCH v5 14/28] rdma/cm: Remove unused SDP related code

2013-05-29 Thread sean . hefty
From: Sean Hefty sean.he...@intel.com

The SDP protocol was never merged upstream.  Remove unused
SDP related code from the RDMA CM.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
 drivers/infiniband/core/cma.c |  176 +++-
 1 files changed, 31 insertions(+), 145 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d307c94..d53d056 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -80,7 +80,6 @@ static LIST_HEAD(dev_list);
 static LIST_HEAD(listen_any_list);
 static DEFINE_MUTEX(lock);
 static struct workqueue_struct *cma_wq;
-static DEFINE_IDR(sdp_ps);
 static DEFINE_IDR(tcp_ps);
 static DEFINE_IDR(udp_ps);
 static DEFINE_IDR(ipoib_ps);
@@ -196,24 +195,7 @@ struct cma_hdr {
union cma_ip_addr dst_addr;
 };
 
-struct sdp_hh {
-   u8 bsdh[16];
-   u8 sdp_version; /* Major version: 7:4 */
-   u8 ip_version;  /* IP version: 7:4 */
-   u8 sdp_specific1[10];
-   __be16 port;
-   __be16 sdp_specific2;
-   union cma_ip_addr src_addr;
-   union cma_ip_addr dst_addr;
-};
-
-struct sdp_hah {
-   u8 bsdh[16];
-   u8 sdp_version;
-};
-
 #define CMA_VERSION 0x00
-#define SDP_MAJ_VERSION 0x2
 
 static int cma_comp(struct rdma_id_private *id_priv, enum rdma_cm_state comp)
 {
@@ -262,21 +244,6 @@ static inline void cma_set_ip_ver(struct cma_hdr *hdr, u8 ip_ver)
 	hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF);
 }
 
-static inline u8 sdp_get_majv(u8 sdp_version)
-{
-	return sdp_version >> 4;
-}
-
-static inline u8 sdp_get_ip_ver(struct sdp_hh *hh)
-{
-	return hh->ip_version >> 4;
-}
-
-static inline void sdp_set_ip_ver(struct sdp_hh *hh, u8 ip_ver)
-{
-	hh->ip_version = (ip_ver << 4) | (hh->ip_version & 0xF);
-}
-
 static void cma_attach_to_dev(struct rdma_id_private *id_priv,
 			      struct cma_device *cma_dev)
 {
@@ -847,27 +814,13 @@ static int cma_get_net_info(void *hdr, enum rdma_port_space ps,
 			    u8 *ip_ver, __be16 *port,
 			    union cma_ip_addr **src, union cma_ip_addr **dst)
 {
-	switch (ps) {
-	case RDMA_PS_SDP:
-		if (sdp_get_majv(((struct sdp_hh *) hdr)->sdp_version) !=
-		    SDP_MAJ_VERSION)
-			return -EINVAL;
-
-		*ip_ver = sdp_get_ip_ver(hdr);
-		*port   = ((struct sdp_hh *) hdr)->port;
-		*src    = &((struct sdp_hh *) hdr)->src_addr;
-		*dst    = &((struct sdp_hh *) hdr)->dst_addr;
-		break;
-	default:
-		if (((struct cma_hdr *) hdr)->cma_version != CMA_VERSION)
-			return -EINVAL;
+	if (((struct cma_hdr *) hdr)->cma_version != CMA_VERSION)
+		return -EINVAL;
 
-		*ip_ver = cma_get_ip_ver(hdr);
-		*port   = ((struct cma_hdr *) hdr)->port;
-		*src    = &((struct cma_hdr *) hdr)->src_addr;
-		*dst    = &((struct cma_hdr *) hdr)->dst_addr;
-		break;
-	}
+	*ip_ver = cma_get_ip_ver(hdr);
+	*port   = ((struct cma_hdr *) hdr)->port;
+	*src    = &((struct cma_hdr *) hdr)->src_addr;
+	*dst    = &((struct cma_hdr *) hdr)->dst_addr;
 
 	if (*ip_ver != 4 && *ip_ver != 6)
 		return -EINVAL;
@@ -914,12 +867,7 @@ static void cma_save_net_info(struct rdma_addr *addr,
 
 static inline int cma_user_data_offset(enum rdma_port_space ps)
 {
-	switch (ps) {
-	case RDMA_PS_SDP:
-		return 0;
-	default:
-		return sizeof(struct cma_hdr);
-	}
+	return sizeof(struct cma_hdr);
 }
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
@@ -1085,16 +1033,6 @@ reject:
 	return ret;
 }
 
-static int cma_verify_rep(struct rdma_id_private *id_priv, void *data)
-{
-	if (id_priv->id.ps == RDMA_PS_SDP &&
-	    sdp_get_majv(((struct sdp_hah *) data)->sdp_version) !=
-	    SDP_MAJ_VERSION)
-		return -EINVAL;
-
-	return 0;
-}
-
 static void cma_set_rep_event_data(struct rdma_cm_event *event,
 				   struct ib_cm_rep_event_param *rep_data,
 				   void *private_data)
@@ -1129,15 +1067,13 @@ static int cma_ib_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
 		event.status = -ETIMEDOUT;
 		break;
 	case IB_CM_REP_RECEIVED:
-		event.status = cma_verify_rep(id_priv, ib_event->private_data);
-		if (event.status)
-			event.event = RDMA_CM_EVENT_CONNECT_ERROR;
-		else if (id_priv->id.qp && id_priv->id.ps != RDMA_PS_SDP) {
+		if (id_priv->id.qp) {
 			event.status = cma_rep_recv(id_priv);
 			event.event = event.status ? RDMA_CM_EVENT_CONNECT_ERROR :
 						     RDMA_CM_EVENT_ESTABLISHED;
-  

[PATCH] osm_ucast_dfsssp.c: Fix memory leak in dfsssp_do_dijkstra_routing

2013-05-29 Thread Hal Rosenstock
From 5c06cdf5029ca79a8c2e684a23b1b7e89a93c3a6 Mon Sep 17 00:00:00 2001
From: Dan Ben Yosef da...@mellanox.com
Date: Wed, 29 May 2013 15:41:41 +0300
Subject: [PATCH] osm_ucast_dfsssp.c: Fix memory leak in dfsssp_do_dijkstra_routing

Variable sw_list going out of scope leaks the storage it points to.

Signed-off-by: Dan Ben Yosef da...@mellanox.com
---
 opensm/osm_ucast_dfsssp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/opensm/osm_ucast_dfsssp.c b/opensm/osm_ucast_dfsssp.c
index 95e1627..3972656 100644
--- a/opensm/osm_ucast_dfsssp.c
+++ b/opensm/osm_ucast_dfsssp.c
@@ -2181,6 +2181,7 @@ static int dfsssp_do_dijkstra_routing(void *context)
 		} else {
 			OSM_LOG(p_mgr->p_log, OSM_LOG_ERROR,
 				"ERR AD31: corrupted sw_list array in dfsssp_do_dijkstra_routing\n");
+			free(sw_list);
 			return 1;
 		}
 	}
-- 
1.7.11.3
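
The patch above frees sw_list directly before the early return. For
comparison, here is a minimal sketch of an alternative style that
routes every exit through one cleanup label, so a later early return
cannot forget the free. This is not what the patch does;
build_switch_list() and its corrupt_at parameter are hypothetical and
exist only to trigger the error path.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical routine that allocates a temporary list and may detect
 * corruption part-way through. corrupt_at < 0 means no error occurs.
 * Returns 0 on success and 1 on error, like the routing function.
 */
static int build_switch_list(size_t n, long corrupt_at)
{
	size_t i;
	int ret = 0;
	int *sw_list;

	assert(n > 0);
	sw_list = malloc(n * sizeof(*sw_list));
	if (!sw_list)
		return 1;

	for (i = 0; i < n; i++) {
		if ((long)i == corrupt_at) {
			ret = 1;	/* error: jump to the shared cleanup */
			goto out;
		}
		sw_list[i] = (int)i;
	}

out:
	free(sw_list);	/* runs on both the success and the error path */
	return ret;
}
```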



[RFC][PATCH] mm: Fix RLIMIT_MEMLOCK

2013-05-29 Thread KOSAKI Motohiro

Hi

I'm unhappy that you guys are using offensive words so much. Please
cool down, all of you. :-/
In fact, neither the behavior before Christoph's patch nor the behavior
after it has clean semantics. And PeterZ proposed making a new, cleaner
one rather than reverting. There is no need to fight.

I'm 100% sure the -rt people need a stronger mlock API. Please join the
discussion to make a better API.
In my humble opinion, we should add a new mlock3(addr, len, flags)
syscall (*) and support the -rt requirement directly. The current
strange IB usage of RLIMIT_MEMLOCK should then gradually migrate to it.
(*) Alternatively, enhancing mbind() is an option, because I expect
apps will need to pin pages near specific NUMA nodes in many cases.

As you know, the current IB pinning implementation doesn't guarantee
the absence of minor faults when fork is used. That's OK for IB; they
use madvise(MADV_DONTFORK) too. But I'm not sure *all* rt applications
are satisfied with this. We might need to implement copy-on-fork, or we
might not. I'd like to hear other people's opinions.

Also, all developers should know that this pinning breaks when memory
hot-plug happens, just as CPU binding by sched_setaffinity() may break
on CPU hot-remove.



Re: Status of ummunot branch?

2013-05-29 Thread Jeff Squyres (jsquyres)
On May 29, 2013, at 4:53 AM, Or Gerlitz or.gerl...@gmail.com wrote:

> Have you looked on ODP? see
> https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html


Is this upstream?

Has this been run by the MPI implementor community?

The limitation of a max of 2 concurrent page faults seems fairly significant.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[PATCH 2/2] OpenSM: DFSSSP - workaround for better VL balancing

2013-05-29 Thread Jens Domke
Currently, DFSSSP maps the src/dest paths statically to certain VLs.
Especially for deadlock-free topologies this can result in an
unfair balancing. Some VLs within one link might be overused,
which results in lower bandwidth for some src/dest pairs.

The fix changes the VL assignment in two ways: first we balance the
number of paths per VL; and second we randomly assign the VL
as long as this doesn't violate the deadlock-freedom.

1) The balancing splits the paths across available free VLs, so that
the maximal number of paths per VL is minimized. We save the number
of VLs for each deadlock-free channel dependency graph (CDG). E.g. for
8 VLs, paths per CDG: {14,5,1} => balanced VLs: {{3,3,3,3,2},{3,2},{1}}
we have 5 VLs to choose from for CDG(0), two for CDG(1) and
one for CDG(2).

2) get_dfsssp_sl(...) will use the information of (1) to randomly
assign the VL for one src/dest pair within the possible number of
VLs. E.g. for a src/dest pair of CDG(0) we have 5 VLs to choose from,
therefore VL := baseVL + rand()%5
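
The two steps can be sketched as below. This is only an illustration:
the greedy rule (hand each free VL to the CDG with the highest average
number of paths per VL) is an assumption inferred from the example in
(1) and from the most_avg_paths variable in the diff, and split_vls()
and pick_vl() are hypothetical names, not OpenSM functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Step 1: decide how many VLs each CDG gets. Every CDG starts with one
 * VL; each remaining free VL goes to the CDG whose current average of
 * paths per VL is largest.
 */
static void split_vls(const uint32_t *paths_per_cdg, uint8_t *split_count,
		      uint8_t vl_needed, uint8_t vl_avail)
{
	uint8_t i, free_vl, best;

	assert(vl_needed > 0 && vl_needed <= vl_avail);
	for (i = 0; i < vl_needed; i++)
		split_count[i] = 1;

	for (free_vl = vl_needed; free_vl < vl_avail; free_vl++) {
		double most_avg = -1.0;

		best = 0;
		for (i = 0; i < vl_needed; i++) {
			double avg = (double)paths_per_cdg[i] / split_count[i];

			if (avg > most_avg) {
				most_avg = avg;
				best = i;
			}
		}
		split_count[best]++;
	}
}

/*
 * Step 2: pick a VL for one src/dest pair of the given CDG. The base VL
 * is the sum of the VL counts of all lower-numbered CDGs.
 */
static uint8_t pick_vl(const uint8_t *split_count, uint8_t cdg)
{
	uint8_t base = 0, i;

	for (i = 0; i < cdg; i++)
		base += split_count[i];

	return (uint8_t)(base + rand() % split_count[cdg]);
}
```

With 8 VLs and paths per CDG {14,5,1}, split_vls() yields the VL counts
{5,2,1} from the example, so CDG(0) picks from VLs 0-4, CDG(1) from
VLs 5-6, and CDG(2) always gets VL 7.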

Signed-off-by: Jens Domke domke.j...@m.titech.ac.jp
---
 opensm/osm_ucast_dfsssp.c |  131 +++--
 1 files changed, 90 insertions(+), 41 deletions(-)

diff --git a/opensm/osm_ucast_dfsssp.c b/opensm/osm_ucast_dfsssp.c
index 98c3f7c..7aecc24 100644
--- a/opensm/osm_ucast_dfsssp.c
+++ b/opensm/osm_ucast_dfsssp.c
@@ -133,6 +133,7 @@ typedef struct dfsssp_context {
vertex_t *adj_list;
uint32_t adj_list_size;
vltable_t *srcdest2vl_table;
+   uint8_t *vl_split_count;
 } dfsssp_context_t;
 
 / set initial values for structs **
@@ -1722,8 +1723,9 @@ static int dfsssp_remove_deadlocks(dfsssp_context_t * dfsssp_ctx)
 	cl_map_item_t *item1 = NULL, *item2 = NULL;
 	osm_port_t *src_port = NULL, *dest_port = NULL;
 
-	uint32_t i = 0, err = 0;
-	uint8_t test_vl = 0, vl_avail = 0, vl_needed = 1;
+	uint32_t i = 0, j = 0, err = 0;
+	uint8_t vl = 0, test_vl = 0, vl_avail = 0, vl_needed = 1;
+	double most_avg_paths = 0.0;
 	cdg_node_t **cdg = NULL, *start_here = NULL, *cycle = NULL;
 	cdg_link_t *weakest_link = NULL;
 	uint32_t srcdest = 0;
@@ -2004,43 +2006,56 @@ static int dfsssp_remove_deadlocks(dfsssp_context_t * dfsssp_ctx)
 	OSM_LOG(p_mgr->p_log, OSM_LOG_VERBOSE,
 		"Balancing the paths on the available Virtual Lanes\n");
 
-	/* balancing virtual lanes, but avoid additional cycle check - balancing suboptimal;
+	/* optimal balancing virtual lanes, under condition: no additional cycle checks;
 	   sl/vl != 0 might be assigned to loopback packets (i.e. slid/dlid on the
 	   same port for lmc>0), but thats no problem, see IBAS 10.2.2.3
 	 */
-	if (vl_needed == 1) {
-		from = 0;
-		count = paths_per_vl[0] / vl_avail;
-		for (to = 1; to < vl_avail; to++) {
-			vltable_change_vl(srcdest2vl_table, from, to, count);
-			paths_per_vl[from] -= count;
-			paths_per_vl[to] += count;
-		}
-	} else if (vl_needed < vl_avail) {
-		split_count = (uint8_t *) malloc(vl_needed * sizeof(uint8_t));
-		if (!split_count) {
-			OSM_LOG(p_mgr->p_log, OSM_LOG_ERROR,
-				"ERR AD24: cannot allocate memory for split_count, skip balancing\n");
-		} else {
-			memset(split_count, 0, vl_needed * sizeof(uint8_t));
-			for (i = vl_needed; i < vl_avail; i++)
-				split_count[(i - vl_needed) % vl_needed]++;
-
-			to = vl_needed;
-			for (from = 0; from < vl_needed; from++) {
-				count =
-				    paths_per_vl[from] / (split_count[from] +
-							  1);
-				for (i = 0; i < split_count[from]; i++) {
-					vltable_change_vl(srcdest2vl_table,
-							  from, to, count);
-					paths_per_vl[from] -= count;
-					paths_per_vl[to] += count;
-					to++;
+	split_count = (uint8_t *) calloc(vl_avail, sizeof(uint8_t));
+	if (!split_count) {
+		OSM_LOG(p_mgr->p_log, OSM_LOG_ERROR,
+			"ERR AD24: cannot allocate memory for split_count, skip balancing\n");
+		goto ERROR;
+	}
+	/* initial state: paths for VLs won't be separated */
+	for (i = 0; i < ((vl_needed < vl_avail) ? vl_needed : vl_avail); i++)
+		split_count[i] = 1;
+	dfsssp_ctx->vl_split_count = split_count;
+	/* balancing is necessary if we have empty VLs */
+	if (vl_needed < vl_avail) {
+

Re: Status of ummunot branch?

2013-05-29 Thread Or Gerlitz

On 30/05/2013 01:56, Jeff Squyres (jsquyres) wrote:
> On May 29, 2013, at 4:53 AM, Or Gerlitz or.gerl...@gmail.com wrote:
>
>> Have you looked on ODP? see
>> https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html
>
> Is this upstream?

No

> Has this been run by the MPI implementor community?

The team that works on this here isn't ready for submission, so
community runs have not been made yet.

> The limitation of a max of 2 concurrent page faults seems fairly significant.

Let me check.