RE: [PATCH librdmacm 3/8] autogen.sh: Use autoreconf in autogen.sh

2013-07-17 Thread Yann Droneaud

Hi,

On 17.07.2013 06:22, Hefty, Sean wrote:

> Thanks - I pulled in these patches, but see below:

Thanks.

>> diff --git a/autogen.sh b/autogen.sh
>> index f433312..6c9233e 100755
>> --- a/autogen.sh
>> +++ b/autogen.sh
>> @@ -1,9 +1,4 @@
>>  #! /bin/sh
>>
>>  set -x
>> -test -d ./config || mkdir ./config
>
> Without the above line, the build fails.  I added it back in.  If
> there's some other way of ensuring that this directory exists, please
> let me know.



Sorry for the inconvenience.

I've checked libibverbs: it has a .gitignore in ./config, so git keeps
the otherwise empty directory.

It's a different solution to the same problem; I can't say whether
it's a better one.


Regards.

--
Yann Droneaud
OPTEYA

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH librdmacm 3/8] autogen.sh: Use autoreconf in autogen.sh

2013-07-17 Thread Or Gerlitz

On 17/07/2013 07:22, Hefty, Sean wrote:

> Thanks - I pulled in these patches, but see below:


Hi Sean,

If you are doing this housecleaning, could you also address the build
warnings below? I see them when I build the rpm from the 1.0.17 tarball,
but not when doing a plain make on the latest git, probably because the
build through the spec file uses additional warning flags.

Or.


configure: creating ./config.status
config.status: creating Makefile
config.status: creating librdmacm.spec
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
+ make -j2
make  all-am
make[1]: Entering directory `/usr/src/redhat/BUILD/librdmacm-1.0.17'
  CC src_librdmacm_la-cma.lo
  CC src_librdmacm_la-addrinfo.lo
src/addrinfo.c: In function 'ucma_convert_to_rai':
src/addrinfo.c:193: warning: dereferencing type-punned pointer will break strict-aliasing rules
src/addrinfo.c:210: warning: dereferencing type-punned pointer will break strict-aliasing rules
  CC src_librdmacm_la-acm.lo
  CC src_librdmacm_la-rsocket.lo
src/rsocket.c: In function 'rs_modify_svcs':
src/rsocket.c:403: warning: ignoring return value of 'write', declared with attribute warn_unused_result
src/rsocket.c:404: warning: ignoring return value of 'read', declared with attribute warn_unused_result
src/rsocket.c: In function 'rs_configure':
src/rsocket.c:460: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result
src/rsocket.c:465: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result
src/rsocket.c:473: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result
src/rsocket.c:478: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result
src/rsocket.c:483: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result
src/rsocket.c:491: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result
src/rsocket.c:498: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result
src/rsocket.c: In function 'rs_svc_process_sock':
src/rsocket.c:3623: warning: ignoring return value of 'read', declared with attribute warn_unused_result
src/rsocket.c:3632: warning: ignoring return value of 'write', declared with attribute warn_unused_result
src/rsocket.c: In function 'rs_svc_run':
src/rsocket.c:3805: warning: ignoring return value of 'write', declared with attribute warn_unused_result
src/rsocket.c: In function 'ds_get_dest':
src/rsocket.c:1451: warning: 'qp' may be used uninitialized in this function
src/rsocket.c: In function 'rs_send_iomaps':
src/rsocket.c:2305: warning: 'ret' may be used uninitialized in this function

  CC src_librdmacm_la-indexer.lo
  CC src_librspreload_la-preload.lo
src/preload.c: In function 'dup2':
src/preload.c:1020: warning: value computed is not used
  CC src_librspreload_la-indexer.lo
  CC cmatose.o
  CC common.o
  CC rping.o
  CC udaddy.o
  CC mckey.o
  CC rdma_client.o
  CC rdma_server.o
  CC rdma_xclient.o
  CC rdma_xserver.o
  CC rstream.o
  CC rcopy.o
  CC riostream.o
  CC udpong.o
  CCLD   src/librdmacm.la
  CCLD   src/librspreload.la
  CCLD   examples/ucmatose
  CCLD   examples/rping
  CCLD   examples/udaddy
  CCLD   examples/mckey
  CCLD   examples/rdma_client
  CCLD   examples/rdma_server
  CCLD   examples/rdma_xclient
  CCLD   examples/rdma_xserver
  CCLD   examples/rstream
  CCLD   examples/rcopy
  CCLD   examples/riostream
  CCLD   examples/udpong
make[1]: Leaving directory `/usr/src/redhat/BUILD/librdmacm-1.0.17'
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.10929
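
Most of the warnings in the log above are warn_unused_result complaints
about fscanf(), read() and write(). As a rough illustration of how such a
call site can be made warning-free, here is a minimal sketch. It is not the
librdmacm code: read_uint_setting() is a hypothetical helper, and the idea
of keeping the previous value on a parse failure is an assumption.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of librdmacm): read one unsigned value
 * from an already opened stream. On failure *value is left untouched,
 * so a caller can keep its built-in default.
 * Returns 0 on success, -1 if no number could be parsed.
 */
int read_uint_setting(FILE *f, unsigned int *value)
{
	unsigned int tmp;

	/* fscanf() returns the number of converted items, or EOF. */
	if (fscanf(f, "%u", &tmp) != 1)
		return -1;

	*value = tmp;
	return 0;
}
```

Checking the result this way both silences the warning and avoids using a
value that was never read.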



[PATCH v2] RDMA/cma: silence GCC warning

2013-07-17 Thread Paul Bolle
Building cma.o triggers this GCC warning:
drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’:
drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here

This is a false positive, as "port" will always be initialized if we're
at "found:". But if we assign to "id_priv->id.port_num" directly, we can
drop "port". That will, obviously, silence GCC.

Signed-off-by: Paul Bolle pebo...@tiscali.nl
---
0) v2: assign to id_priv->id.port_num directly, instead of
initializing port to 0, as discussed with Sean.

1) Still only compile tested.

 drivers/infiniband/core/cma.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index f1c279f..84487a2 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -423,7 +423,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
 	struct sockaddr_ib *addr;
 	union ib_gid gid, sgid, *dgid;
 	u16 pkey, index;
-	u8 port, p;
+	u8 p;
 	int i;
 
 	cma_dev = NULL;
@@ -443,7 +443,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
 				if (!memcmp(&gid, dgid, sizeof(gid))) {
 					cma_dev = cur_dev;
 					sgid = gid;
-					port = p;
+					id_priv->id.port_num = p;
 					goto found;
 				}
 
@@ -451,7 +451,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
 						 dgid->global.subnet_prefix)) {
 					cma_dev = cur_dev;
 					sgid = gid;
-					port = p;
+					id_priv->id.port_num = p;
 				}
 			}
 		}
@@ -462,7 +462,6 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
 
 found:
 	cma_attach_to_dev(id_priv, cma_dev);
-	id_priv->id.port_num = port;
 	addr = (struct sockaddr_ib *) cma_src_addr(id_priv);
 	memcpy(&addr->sib_addr, &sgid, sizeof sgid);
 	cma_translate_ib(addr, &id_priv->id.route.addr.dev_addr);
-- 
1.8.1.4
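
The idea behind this patch can be shown outside the kernel. The following
is only a sketch under stated assumptions: struct id_sketch and
resolve_port() are hypothetical names, and the search is reduced to a
single loop. It shows the shape of the fix, which is to store the match
directly in its final destination so that no temporary reaches the "found"
label on a path the compiler cannot prove initialized.

```c
#include <stddef.h>

/* Hypothetical stand-in for the object that finally holds the result. */
struct id_sketch {
	unsigned char port_num;
};

/*
 * Search ports[0..n-1] for `wanted`. A match is written straight into
 * id->port_num before jumping to `found`, so there is no local variable
 * that could look uninitialized at the label.
 * Returns 0 on success, -1 if nothing matched (id is left untouched).
 */
int resolve_port(struct id_sketch *id, const unsigned char *ports,
		 size_t n, unsigned char wanted)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (ports[i] == wanted) {
			id->port_num = ports[i];
			goto found;
		}
	}
	return -1;

found:
	return 0;
}
```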



Re: [PATCH v3 11/13] IB/srp: Make HCA completion vector configurable

2013-07-17 Thread Sagi Grimberg

On 7/16/2013 6:11 PM, Bart Van Assche wrote:

> On 14/07/2013 3:43, Sagi Grimberg wrote:
>> Just wrote a small patch to allow srp_daemon to spread connections
>> across the HCA's completion vectors.
>
> Hello Sagi,
>
> How about the following approach:
> - Add support for reading the completion vector from srp_daemon.conf,
>   similar to how several other parameters are already read from that
>   file.

Here we need to take into account that this changes what
srp_daemon.conf is used for. Instead of simply allowing or disallowing
targets with specific attributes, it would also define configuration
attributes of the allowed targets. It may be inconvenient for the user
to have to write N target strings in srp_daemon.conf just to assign
completion vectors.

Perhaps srp_daemon.conf could contain a comma-separated list of reserved
completion vectors among which srp_daemon spreads the CQs. If that line
is absent, srp_daemon would spread the assignment across all of the
HCA's completion vectors.

> - If the completion vector parameter has not been set in
>   srp_daemon.conf, let srp_daemon assign a completion vector such that
>   IB interrupts for different SRP hosts use different completion
>   vectors.
>
> Bart.
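
The comma-separated list idea discussed in this message could look roughly
like the sketch below. Everything in it is an assumption made for
illustration: the list syntax ("0,2,5"), the names comp_vector_pool,
pool_init() and pool_next(), and the round-robin policy are not taken from
srp_daemon.

```c
#include <stdlib.h>

#define MAX_VECTORS 64	/* arbitrary limit for this sketch */

/* Hypothetical state: the vectors that may be used, plus a cursor. */
struct comp_vector_pool {
	int vectors[MAX_VECTORS];
	size_t count;
	size_t next;
};

/*
 * Fill the pool from a comma-separated list such as "0,2,5".
 * If list is NULL or empty, fall back to all vectors of the HCA,
 * i.e. 0..num_hca_vectors-1.
 * Returns 0 on success, -1 on a malformed or out-of-range entry.
 */
int pool_init(struct comp_vector_pool *pool, const char *list,
	      int num_hca_vectors)
{
	const char *p = list;
	char *end;
	long v;
	int i;

	pool->count = 0;
	pool->next = 0;

	if (num_hca_vectors < 1 || num_hca_vectors > MAX_VECTORS)
		return -1;

	if (!list || !*list) {
		for (i = 0; i < num_hca_vectors; i++)
			pool->vectors[pool->count++] = i;
		return 0;
	}

	while (*p) {
		v = strtol(p, &end, 10);
		if (end == p || v < 0 || v >= num_hca_vectors ||
		    pool->count == MAX_VECTORS)
			return -1;
		pool->vectors[pool->count++] = (int)v;
		if (*end == ',')
			end++;		/* skip the separator */
		else if (*end != '\0')
			return -1;	/* unexpected character */
		p = end;
	}
	return 0;
}

/* Hand out the pooled vectors round-robin, one per new connection. */
int pool_next(struct comp_vector_pool *pool)
{
	int v = pool->vectors[pool->next];

	pool->next = (pool->next + 1) % pool->count;
	return v;
}
```

pool_next() must only be called after pool_init() has returned 0, which
guarantees that the pool holds at least one vector.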




[PATCH FIXES for-3.11 4/4] IB/ipoib: Fix pkey-change flow for Virtualization environments

2013-07-17 Thread Or Gerlitz
From: Erez Shitrit ere...@mellanox.com

IPoIB's required behaviour w.r.t. the pkey used by the device is the
following:

- For parent interfaces (e.g. ib0, ib1, etc.), which are created automatically
  as a result of hot-plug events from the IB core, the driver needs to take
  whatever pkey value it finds in index 0, and stick to that index.

- For child interfaces (e.g. ib0.8001, etc.) created by admin directive, the
  driver needs to use and stick to the value provided during its creation.

In an SR-IOV environment it's possible for the VF probe to take place before
the cloud management software provisions the suitable pkey for the VF in the
paravirtualized PKEY table index 0. When this is the case, the VF IB stack
will find in index 0 an invalid pkey, which is all zeros.

Moreover, the cloud management can assign the pkey value at index 0 at any
time of the guest life cycle.

The correct behavior for IPoIB to address these requirements for parent
interfaces is to use the PKEY_CHANGE event as a trigger to optionally re-init
the device pkey value and re-create all the relevant resources accordingly, if
the value of the pkey in index 0 has changed (from invalid to valid or from
valid value X to invalid value Y).

This patch enhances the heavy flushing code, which is triggered by the pkey
change event, to behave correctly for parent devices. For child devices, the
code remains the same, namely it chases the pkey value and not the index.

Signed-off-by: Erez Shitrit ere...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c|   68 +++-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   12 -
 2 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 2cfa76f..a2db524 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -932,12 +932,39 @@ int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port)
 	return 0;
 }
 
+/*
+ * Takes whatever value which is in pkey index 0 and updates priv->pkey
+ * returns 0 if the pkey value was changed.
+ */
+static inline int update_parent_pkey_index(struct ipoib_dev_priv *priv)
+{
+	int result;
+	u16 prev_pkey;
+
+	prev_pkey = priv->pkey;
+	result = ib_query_pkey(priv->ca, priv->port, 0, &priv->pkey);
+	if (result) {
+		ipoib_warn(priv, "ib_query_pkey port %d failed (ret = %d)\n",
+			   priv->port, result);
+		return result;
+	}
+
+	if (prev_pkey != priv->pkey) {
+		ipoib_dbg(priv, "pkey changed from 0x%x to 0x%x\n",
+			  prev_pkey, priv->pkey);
+		return 0;
+	}
+
+	return 1;
+}
+
 static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 				enum ipoib_flush_level level)
 {
 	struct ipoib_dev_priv *cpriv;
 	struct net_device *dev = priv->dev;
 	u16 new_index;
+	int result;
 
 	mutex_lock(&priv->vlan_mutex);
 
@@ -951,6 +978,10 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 	mutex_unlock(&priv->vlan_mutex);
 
 	if (!test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags)) {
+		/* for non-child devices must check/update the pkey value here */
+		if (level == IPOIB_FLUSH_HEAVY &&
+		    !test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags))
+			update_parent_pkey_index(priv);
 		ipoib_dbg(priv, "Not flushing - IPOIB_FLAG_INITIALIZED not set.\n");
 		return;
 	}
@@ -961,21 +992,32 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 	}
 
 	if (level == IPOIB_FLUSH_HEAVY) {
-		if (ib_find_pkey(priv->ca, priv->port, priv->pkey, &new_index)) {
-			clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
-			ipoib_ib_dev_down(dev, 0);
-			ipoib_ib_dev_stop(dev, 0);
-			if (ipoib_pkey_dev_delay_open(dev))
+		/* child devices chase their origin pkey value, while non-child
+		 * (parent) devices should always takes what present in pkey index 0
+		 */
+		if (test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) {
+			if (ib_find_pkey(priv->ca, priv->port, priv->pkey, &new_index)) {
+				clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
+				ipoib_ib_dev_down(dev, 0);
+				ipoib_ib_dev_stop(dev, 0);
+				if (ipoib_pkey_dev_delay_open(dev))
+					return;
+			}
+			/* restart QP only if P_Key index is changed */
+			if (test_and_set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags) &&
+			    new_index ==

[PATCH FIXES for-3.11 1/4] IB/core: Create QP1 using the pkey index which contains the default pkey

2013-07-17 Thread Or Gerlitz
From: Jack Morgenstein ja...@dev.mellanox.co.il

Currently, QP1 is created using pkey index 0. This patch simply looks for
the index containing the default pkey, rather than hard-coding pkey index 0.

This change will have no effect in Native mode, since QP0 and QP1 are created
before the SM configures the port, so the pkey table will still be the default
table defined by the IB Spec, in C10-123: "If non-volatile storage is not used
to hold P_Key Table contents, then if a PM (Partition Manager) is not present,
and prior to PM initialization of the P_Key Table, the P_Key Table must act as
if it contains a single valid entry, at P_Key_ix = 0, containing the default
partition key. All other entries in the P_Key Table must be invalid."

Thus, in the native mode case, the driver will find the default pkey
at index 0 (so it will be no different than the hard-coding).

However, in SRIOV mode, for VFs, the pkey table may be paravirtualized, so
that the VF's pkey index zero may not necessarily be mapped to the real pkey
index 0. For VFs, therefore, it is important to find the virtual index which
maps to the real default pkey.

This commit does the following for QP1 creation:

1. Find the pkey index containing the default pkey, and use that index if found.
   ib_find_pkey() returns the index of the limited-membership default pkey
   (0x7FFF) if the full-member default pkey is not in the table.

2. If neither form of the default pkey is found, use pkey index 0 (previous
   behavior).
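
The two steps above can be modelled in plain C. This is a sketch, not the
kernel's ib_find_pkey(): qp1_pkey_index() is a hypothetical function that
works on an in-memory copy of the pkey table, and the explicit preference
for the full-member entry over the limited-member one is an assumption made
for the illustration.

```c
#include <stdint.h>
#include <stddef.h>

#define DEFAULT_PKEY_FULL    0xFFFF
#define DEFAULT_PKEY_LIMITED 0x7FFF

/*
 * Standalone model of the selection described above: prefer the
 * full-member default pkey, then the limited-member one, and fall
 * back to index 0 (the previous behavior) if neither is present.
 */
uint16_t qp1_pkey_index(const uint16_t *table, size_t len)
{
	size_t i;
	int limited = -1;

	for (i = 0; i < len; i++) {
		if (table[i] == DEFAULT_PKEY_FULL)
			return (uint16_t)i;
		if (table[i] == DEFAULT_PKEY_LIMITED && limited < 0)
			limited = (int)i;
	}
	return limited >= 0 ? (uint16_t)limited : 0;
}
```

For the default table required by C10-123 this returns 0, so Native mode
behaves as before. Only a table where the default pkey sits elsewhere, as
can happen for a VF, yields a different index.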

Signed-off-by: Jack Morgenstein ja...@dev.mellanox.co.il
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/core/mad.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index dc3fd1e..9be6754 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2663,6 +2663,7 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 	int ret, i;
 	struct ib_qp_attr *attr;
 	struct ib_qp *qp;
+	u16 pkey_index = 0;
 
 	attr = kmalloc(sizeof *attr, GFP_KERNEL);
 	if (!attr) {
@@ -2670,6 +2671,11 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 		return -ENOMEM;
 	}
 
+	ret = ib_find_pkey(port_priv->device, port_priv->port_num,
+			   IB_DEFAULT_PKEY_FULL, &pkey_index);
+	if (ret)
+		pkey_index = 0;
+
 	for (i = 0; i < IB_MAD_QPS_CORE; i++) {
 		qp = port_priv->qp_info[i].qp;
 		if (!qp)
@@ -2680,7 +2686,7 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 		 * one is needed for the Reset to Init transition
 		 */
 		attr->qp_state = IB_QPS_INIT;
-		attr->pkey_index = 0;
+		attr->pkey_index = pkey_index;
 		attr->qkey = (qp->qp_num == 0) ? 0 : IB_QP1_QKEY;
 		ret = ib_modify_qp(qp, attr, IB_QP_STATE |
 					     IB_QP_PKEY_INDEX | IB_QP_QKEY);
-- 
1.7.1



[PATCH FIXES for-3.11 3/4] IB/ipoib: Make sure child devices use valid/proper pkeys

2013-07-17 Thread Or Gerlitz
Make sure that the IB invalid pkey (0x0000 or 0x8000) isn't used for child
devices.

Also, make sure to always set the full membership bit for the pkey of devices
created by rtnl link ops.

Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c|2 +-
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c |9 +
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index b6e049a..c6f71a8 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1461,7 +1461,7 @@ static ssize_t create_child(struct device *dev,
 	if (sscanf(buf, "%i", &pkey) != 1)
 		return -EINVAL;
 
-	if (pkey < 0 || pkey > 0xffff)
+	if (pkey <= 0 || pkey > 0xffff || pkey == 0x8000)
 		return -EINVAL;
 
 	/*
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
index 7468593..f81abe1 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
@@ -119,6 +119,15 @@ static int ipoib_new_child_link(struct net *src_net, struct net_device *dev,
 	} else
 		child_pkey  = nla_get_u16(data[IFLA_IPOIB_PKEY]);
 
+	if (child_pkey == 0 || child_pkey == 0x8000)
+		return -EINVAL;
+
+	/*
+	 * Set the full membership bit, so that we join the right
+	 * broadcast group, etc.
+	 */
+	child_pkey |= 0x8000;
+
 	err = __ipoib_vlan_add(ppriv, netdev_priv(dev), child_pkey, IPOIB_RTNL_CHILD);
 
 	if (!err && data)
-- 
1.7.1
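
The rule this patch enforces for the rtnl path is small enough to show on
its own. The sketch below is a userspace model only: normalize_child_pkey()
is a hypothetical name, and it returns -1 where the driver returns -EINVAL.

```c
#include <stdint.h>

/*
 * Model of the child pkey check described above; illustrative only.
 * Rejects the invalid pkeys 0x0000 and 0x8000. Otherwise stores the
 * requested pkey in *out with the full-membership bit (0x8000) set.
 * Returns 0 on success, -1 for an invalid pkey (*out is untouched).
 */
int normalize_child_pkey(uint16_t requested, uint16_t *out)
{
	if (requested == 0x0000 || requested == 0x8000)
		return -1;

	*out = (uint16_t)(requested | 0x8000);
	return 0;
}
```

A request for 0x0001 and a request for 0x8001 therefore both end up as
0x8001, which matches the stated aim of joining the right broadcast group.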



[PATCH FIXES for-3.11 2/4] IB/mlx4: Use default pkey when creating tunnel QPs

2013-07-17 Thread Or Gerlitz
From: Jack Morgenstein ja...@dev.mellanox.co.il

When creating tunnel QPs for special QP tunneling, look for the default pkey
in the slave's virtual pkey table. If it is present, use the real pkey index
where the default pkey is located.

If the default pkey is not found in the pkey table, use the real pkey index
which is stored at index 0 in the slave's virtual pkey table (this is the
current behavior).

This change is required to support cloud computing, where the paravirtualized
index of the default pkey is moved to index 1 or higher. The pkey at
paravirtualized index 0 is used for the default IPoIB interface created by the
VF.

It's possible for the pkey value at paravirtualized index 0 to be invalid
(zero) at VF probe time (pkey index 0 is mapped to real pkey index 127, which
contains pkey = 0).

At some point after the VF probe, the cloud computing interface at the
Hypervisor maps virtual index 0 for the VF to the pkey index containing the
pkey that IPoIB will use in its operation.  However, when the tunnel QP is
created, the pkey at the slave's virtual index 0 is still mapped to the invalid
pkey index, so tunnel QP creation fails.

This commit causes the Hypervisor to search for the default pkey in the slave's
pkey table -- and this pkey is present in the table (at index > 0) at tunnel QP
creation time, so that the tunnel QP creation will succeed.

Signed-off-by: Jack Morgenstein ja...@dev.mellanox.co.il
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/hw/mlx4/mad.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 4d599ce..f2a3f48 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -1511,8 +1511,14 @@ static int create_pv_sqp(struct mlx4_ib_demux_pv_ctx *ctx,
 
 	memset(&attr, 0, sizeof attr);
 	attr.qp_state = IB_QPS_INIT;
-	attr.pkey_index =
-		to_mdev(ctx->ib_dev)->pkeys.virt2phys_pkey[ctx->slave][ctx->port - 1][0];
+	ret = 0;
+	if (create_tun)
+		ret = find_slave_port_pkey_ix(to_mdev(ctx->ib_dev), ctx->slave,
+					      ctx->port, IB_DEFAULT_PKEY_FULL,
+					      &attr.pkey_index);
+	if (ret || !create_tun)
+		attr.pkey_index =
+			to_mdev(ctx->ib_dev)->pkeys.virt2phys_pkey[ctx->slave][ctx->port - 1][0];
 	attr.qkey = IB_QP1_QKEY;
 	attr.port_num = ctx->port;
 	ret = ib_modify_qp(tun_qp->qp, &attr, qp_attr_mask_INIT);
-- 
1.7.1
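
The lookup this patch introduces for tunnel QPs can be pictured with a toy
model. The sketch below is not the mlx4 code: tunnel_qp_pkey_index() and
the table sizes are hypothetical, and the per-slave, per-port dimensions of
the real virt2phys_pkey table are collapsed into a single array.

```c
#include <stdint.h>
#include <stddef.h>

#define NUM_VIRT_PKEYS 4	/* illustrative table sizes only */
#define NUM_PHYS_PKEYS 8

/*
 * Toy model of the lookup described above.
 * virt2phys[v] is the real index behind the slave's virtual index v,
 * phys_table[] is the real pkey table of the port.
 * Returns the real index of the default pkey (0xFFFF) if one of the
 * slave's virtual entries maps to it, otherwise the real index behind
 * virtual index 0 (the previous behavior).
 */
uint8_t tunnel_qp_pkey_index(const uint8_t virt2phys[NUM_VIRT_PKEYS],
			     const uint16_t phys_table[NUM_PHYS_PKEYS])
{
	size_t v;

	for (v = 0; v < NUM_VIRT_PKEYS; v++) {
		uint8_t real = virt2phys[v];

		if (real < NUM_PHYS_PKEYS && phys_table[real] == 0xFFFF)
			return real;
	}
	return virt2phys[0];
}
```

With virtual index 0 still pointing at an entry that holds pkey 0, the
model picks the default pkey through another virtual index, which is the
situation the commit message describes at VF probe time.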



[PATCH FIXES for-3.11 0/4] Pkey fixes for IB core and IPoIB

2013-07-17 Thread Or Gerlitz
Hi Roland,

This set of fixes is critical for virtualization environments where the VM's
para-virtualized PKEY table isn't fully configured at the time the VF is
probed, or where the management pkey is provisioned at a non-zero index in
the VF pkey table.

The first three patches are pretty much few-liners (two of them with somewhat
long change logs...). The fourth one is a bit larger. I would be happy to see
them all go to -stable, either by you adding a Cc: sta...@vger.kernel.org when
you push them, or I can send them to Greg after they spend some time upstream.

Or.

Erez Shitrit (1):
  IB/ipoib: Fix pkey-change flow for Virtualization environments

Jack Morgenstein (2):
  IB/core: Create QP1 using the pkey index which contains the default pkey
  IB/mlx4: Use default pkey when creating tunnel QPs

Or Gerlitz (1):
  IB/ipoib: Make sure child devices use valid/proper pkeys

 drivers/infiniband/core/mad.c  |8 +++-
 drivers/infiniband/hw/mlx4/mad.c   |   10 +++-
 drivers/infiniband/ulp/ipoib/ipoib_ib.c|   68 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |2 +-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   12 -
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c   |9 +++
 6 files changed, 91 insertions(+), 18 deletions(-)

This is the sequence of events when the IPoIB patch is applied at the guest:

-- the VM pkey table contains 0x0000 in index 0, and hence the mgid has 0x8000
as the pkey

[root@xena017-3 infiniband]# cat /sys/class/infiniband/mlx4_0/ports/1/pkeys/0

0x0000

[root@xena017-3 ~]# ip addr show ib0
22: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc pfifo_fast state DOWN qlen 256
link/infiniband 80:00:05:8b:fe:80:00:00:00:00:00:00:00:14:05:00:00:00:04:f9 brd 00:ff:ff:ff:ff:12:40:1b:80:00:00:00:00:00:00:00:ff:ff:ff:ff
inet 192.168.20.199/24 brd 192.168.20.255 scope global ib0

-- the hypervisor changed the pkey value in index 0 of the VM pkey table to
contain 0x8001

[root@xena017-3 infiniband]# cat /sys/class/infiniband/mlx4_0/ports/1/pkeys/0

0x8001

[root@xena017-3 ~]# dmesg
ib0: bringing up interface
IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
ib0: multicast join failed for ff12:401b:8000:0000:0000:0000:ffff:ffff, status -22
ib0: multicast join failed for ff12:401b:8000:0000:0000:0000:ffff:ffff, status -22
[...]
ib0: Event 12 on device mlx4_0 port 1
ib0: pkey changed from 0x8000 to 0x8001
ib0: downing ib_dev
ib0: All sends and receives done.
ib0: Created ah 88011a11f4e0
IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
ib0: Created ah 88011a11f660
ib0: Created ah 88011a11f780

-- mgid changed to use 0x8001

[root@xena017-3 ~]# ip addr show ib0
22: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP qlen 256
link/infiniband 80:00:05:8b:fe:80:00:00:00:00:00:00:00:14:05:00:00:00:04:f9 brd 00:ff:ff:ff:ff:12:40:1b:80:01:00:00:00:00:00:00:ff:ff:ff:ff
inet 192.168.20.199/24 brd 192.168.20.255 scope global ib0


ping works etc

-- when the mlx4 patch isn't applied on the host we see these errors

mlx4_ib create_pv_sqp: Couldn't change tunnel qp state to INIT (-22)
mlx4_ib create_pv_resources: Couldn't create tunnel for QP1 (-22)

-- this is the sequence of events when the IPoIB patch is not applied at the
guest:

-- the VM pkey table contains 0x0000 in index 0, and hence the mgid has 0x8000
as the pkey

[root@xena017-3 infiniband]# cat /sys/class/infiniband/mlx4_0/ports/1/pkeys/0

0x0000

[root@xena017-3 infiniband]# modprobe ib_ipoib debug_level=1

[root@xena017-3 infiniband]# ip a s ib0

30: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc pfifo_fast state DOWN qlen 256
link/infiniband 80:00:05:93:fe:80:00:00:00:00:00:00:00:14:05:00:00:00:04:f9 brd 00:ff:ff:ff:ff:12:40:1b:80:00:00:00:00:00:00:00:ff:ff:ff:ff

[root@xena017-3 infiniband]# cat /sys/class/infiniband/mlx4_0/ports/1/pkeys/0

0x8001

-- ipoib got the event, but nothing changed

[root@xena017-3 infiniband]# ip a s ib0

30: ib0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc pfifo_fast state DOWN qlen 256
link/infiniband 80:00:05:93:fe:80:00:00:00:00:00:00:00:14:05:00:00:00:04:f9 brd 00:ff:ff:ff:ff:12:40:1b:80:00:00:00:00:00:00:00:ff:ff:ff:ff
inet 192.168.20.199/24 brd 192.168.20.255 scope global ib0

[root@xena017-3 infiniband]# dmesg
ib%d: max_srq_sge=31
ib%d: max_cm_mtu = 0xfff0, num_frags=16
ib%d: max_srq_sge=31
ib%d: max_cm_mtu = 0xfff0, num_frags=16
ib0: bringing up interface
IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
ib0: multicast join failed for ff12:401b:8000:0000:0000:0000:ffff:ffff, status -22
ib0: multicast join failed for ff12:401b:8000:0000:0000:0000:ffff:ffff, status -22
ib0: multicast join failed for ff12:401b:8000:0000:0000:0000:ffff:ffff, status -22
ib0: multicast join failed for ff12:401b:8000:0000:0000:0000:ffff:ffff, status -22
ib0: Event 12 on device mlx4_0 port 1
ib0: downing ib_dev
ib0: All sends and receives done.



[PATCH REPOST FIXES for-3.11 1/4] IB/core: Create QP1 using the pkey index which contains the default pkey

2013-07-17 Thread Or Gerlitz
From: Jack Morgenstein ja...@dev.mellanox.co.il

Currently, QP1 is created using pkey index 0. This patch simply looks for
the index containing the default pkey, rather than hard-coding pkey index 0.

This change will have no effect in Native mode, since QP0 and QP1 are created
before the SM configures the port, so the pkey table will still be the default
table defined by the IB Spec, in C10-123: "If non-volatile storage is not used
to hold P_Key Table contents, then if a PM (Partition Manager) is not present,
and prior to PM initialization of the P_Key Table, the P_Key Table must act as
if it contains a single valid entry, at P_Key_ix = 0, containing the default
partition key. All other entries in the P_Key Table must be invalid."

Thus, in the native mode case, the driver will find the default pkey
at index 0 (so it will be no different than the hard-coding).

However, in SRIOV mode, for VFs, the pkey table may be paravirtualized, so
that the VF's pkey index zero may not necessarily be mapped to the real pkey
index 0. For VFs, therefore, it is important to find the virtual index which
maps to the real default pkey.

This commit does the following for QP1 creation:

1. Find the pkey index containing the default pkey, and use that index if found.
   ib_find_pkey() returns the index of the limited-membership default pkey
   (0x7FFF) if the full-member default pkey is not in the table.

2. If neither form of the default pkey is found, use pkey index 0 (previous
   behavior).

Signed-off-by: Jack Morgenstein ja...@dev.mellanox.co.il
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/core/mad.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index dc3fd1e..9be6754 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2663,6 +2663,7 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 	int ret, i;
 	struct ib_qp_attr *attr;
 	struct ib_qp *qp;
+	u16 pkey_index = 0;
 
 	attr = kmalloc(sizeof *attr, GFP_KERNEL);
 	if (!attr) {
@@ -2670,6 +2671,11 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 		return -ENOMEM;
 	}
 
+	ret = ib_find_pkey(port_priv->device, port_priv->port_num,
+			   IB_DEFAULT_PKEY_FULL, &pkey_index);
+	if (ret)
+		pkey_index = 0;
+
 	for (i = 0; i < IB_MAD_QPS_CORE; i++) {
 		qp = port_priv->qp_info[i].qp;
 		if (!qp)
@@ -2680,7 +2686,7 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 		 * one is needed for the Reset to Init transition
 		 */
 		attr->qp_state = IB_QPS_INIT;
-		attr->pkey_index = 0;
+		attr->pkey_index = pkey_index;
 		attr->qkey = (qp->qp_num == 0) ? 0 : IB_QP1_QKEY;
 		ret = ib_modify_qp(qp, attr, IB_QP_STATE |
 					     IB_QP_PKEY_INDEX | IB_QP_QKEY);
-- 
1.7.1



[PATCH REPOST FIXES for-3.11 4/4] IB/ipoib: Fix pkey-change flow for Virtualization environments

2013-07-17 Thread Or Gerlitz
From: Erez Shitrit ere...@mellanox.com

IPoIB's required behaviour w.r.t. the pkey used by the device is the
following:

- For parent interfaces (e.g. ib0, ib1, etc.), which are created automatically
  as a result of hot-plug events from the IB core, the driver needs to take
  whatever pkey value it finds in index 0, and stick to that index.

- For child interfaces (e.g. ib0.8001, etc.) created by admin directive, the
  driver needs to use and stick to the value provided during its creation.

In an SR-IOV environment it's possible for the VF probe to take place before
the cloud management software provisions the suitable pkey for the VF in the
paravirtualized PKEY table index 0. When this is the case, the VF IB stack
will find in index 0 an invalid pkey, which is all zeros.

Moreover, the cloud management can assign the pkey value at index 0 at any
time of the guest life cycle.

The correct behavior for IPoIB to address these requirements for parent
interfaces is to use the PKEY_CHANGE event as a trigger to optionally re-init
the device pkey value and re-create all the relevant resources accordingly, if
the value of the pkey in index 0 has changed (from invalid to valid or from
valid value X to invalid value Y).

This patch enhances the heavy flushing code, which is triggered by the pkey
change event, to behave correctly for parent devices. For child devices, the
code remains the same, namely it chases the pkey value and not the index.
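
The parent-interface handling described above can be modelled in userspace.
The sketch below rests on assumptions: refresh_parent_pkey() is a
hypothetical function, the pkey found at index 0 is passed in as a plain
argument instead of being queried from the device, and the return
convention (1 when the pkey changed) is chosen for readability and is the
opposite of the helper in the patch.

```c
#include <stdint.h>

#define IPOIB_HW_ADDR_LEN 20	/* 4 bytes flags/QPN + 16 bytes GID */

/*
 * Model of the parent pkey refresh; illustrative names only.
 * Takes the pkey found at index 0, sets the full-membership bit, and
 * if the result differs from *pkey stores it and mirrors it into
 * bytes 8 and 9 of the broadcast address.
 * Returns 1 if the pkey changed, 0 otherwise.
 */
int refresh_parent_pkey(uint16_t *pkey, uint8_t bcast[IPOIB_HW_ADDR_LEN],
			uint16_t pkey_at_index0)
{
	uint16_t new_pkey = (uint16_t)(pkey_at_index0 | 0x8000);

	if (new_pkey == *pkey)
		return 0;

	*pkey = new_pkey;
	bcast[8] = (uint8_t)(new_pkey >> 8);
	bcast[9] = (uint8_t)(new_pkey & 0xff);
	return 1;
}
```

Starting from a broadcast address whose bytes 8 and 9 are 80:00 and a new
index-0 value of 0x8001, the model yields 80:01 in those bytes, the same
change the ip addr output in the cover letter shows for ib0.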

Signed-off-by: Erez Shitrit ere...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c |   76 +-
 1 files changed, 63 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 2cfa76f..196b1d1 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -932,12 +932,47 @@ int ipoib_ib_dev_init(struct net_device *dev, struct 
ib_device *ca, int port)
return 0;
 }
 
+/*
+ * Takes whatever value which is in pkey index 0 and updates priv-pkey
+ * returns 0 if the pkey value was changed.
+ */
+static inline int update_parent_pkey(struct ipoib_dev_priv *priv)
+{
+   int result;
+   u16 prev_pkey;
+
+   prev_pkey = priv-pkey;
+   result = ib_query_pkey(priv-ca, priv-port, 0, priv-pkey);
+   if (result) {
+   ipoib_warn(priv, ib_query_pkey port %d failed (ret = %d)\n,
+  priv-port, result);
+   return result;
+   }
+
+   priv-pkey |= 0x8000;
+
+   if (prev_pkey != priv-pkey) {
+   ipoib_dbg(priv, pkey changed from 0x%x to 0x%x\n,
+ prev_pkey, priv-pkey);
+   /*
+* Update the pkey in the broadcast address, while making sure 
to set
+* the full membership bit, so that we join the right broadcast 
group.
+*/
+   priv-dev-broadcast[8] = priv-pkey  8;
+   priv-dev-broadcast[9] = priv-pkey  0xff;
+   return 0;
+   }
+
+   return 1;
+}
+
 static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 				enum ipoib_flush_level level)
 {
 	struct ipoib_dev_priv *cpriv;
 	struct net_device *dev = priv->dev;
 	u16 new_index;
+	int result;
 
 	mutex_lock(&priv->vlan_mutex);
 
@@ -951,6 +986,10 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 	mutex_unlock(&priv->vlan_mutex);
 
 	if (!test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags)) {
+		/* for non-child devices must check/update the pkey value here */
+		if (level == IPOIB_FLUSH_HEAVY &&
+		    !test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags))
+			update_parent_pkey(priv);
 		ipoib_dbg(priv, "Not flushing - IPOIB_FLAG_INITIALIZED not set.\n");
 		return;
 	}
@@ -961,21 +1000,32 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 	}
 
 	if (level == IPOIB_FLUSH_HEAVY) {
-		if (ib_find_pkey(priv->ca, priv->port, priv->pkey, &new_index)) {
-			clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
-			ipoib_ib_dev_down(dev, 0);
-			ipoib_ib_dev_stop(dev, 0);
-			if (ipoib_pkey_dev_delay_open(dev))
+		/* child devices chase their origin pkey value, while non-child
+		 * (parent) devices should always takes what present in pkey index 0
+		 */
+		if (test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) {
+			if (ib_find_pkey(priv->ca, priv->port, priv->pkey, &new_index)) {
+				clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
+				ipoib_ib_dev_down(dev, 0);
+				ipoib_ib_dev_stop(dev, 0);
+				if (ipoib_pkey_dev_delay_open(dev))
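
For illustration, the pkey arithmetic in update_parent_pkey() can be sketched in plain userspace C. This is only a sketch, not the driver code: the helper names pkey_full_member() and bcast_set_pkey() and the 20-byte buffer length are assumptions made for the example. Only the 0x8000 full-membership bit and the use of bytes 8 and 9 of the broadcast address come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER_BIT 0x8000u	/* top bit of the 16-bit pkey */
#define BCAST_LEN 20			/* assumed broadcast address length */

/* Return the pkey with the full-membership bit set. */
static uint16_t pkey_full_member(uint16_t pkey)
{
	return (uint16_t)(pkey | PKEY_FULL_MEMBER_BIT);
}

/* Store the pkey, high byte first, in bytes 8 and 9 of the address. */
static void bcast_set_pkey(uint8_t bcast[BCAST_LEN], uint16_t pkey)
{
	assert(bcast != NULL);
	bcast[8] = (uint8_t)(pkey >> 8);
	bcast[9] = (uint8_t)(pkey & 0xff);
}
```

With a limited-membership pkey of 0x0001 at index 0, the parent would end up with 0x8001, and bytes 8 and 9 of the broadcast address would become 0x80 and 0x01.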

Re: [PATCH -next] IB/mlx5: use module_pci_driver to simplify the code

2013-07-17 Thread Eli Cohen
Looks to me like a convenience that we may need to give up later
should we need to put any code in the init or cleanup functions.

On Wed, Jul 17, 2013 at 09:56:41AM +0800, Wei Yongjun wrote:
 From: Wei Yongjun yongjun_...@trendmicro.com.cn
 
 Use the module_pci_driver() macro to make the code simpler
 by eliminating module_init and module_exit calls.
 
 Signed-off-by: Wei Yongjun yongjun_...@trendmicro.com.cn
 ---
  drivers/infiniband/hw/mlx5/main.c | 13 +
  1 file changed, 1 insertion(+), 12 deletions(-)
 
 diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
 index 8000fff..0cdc185 100644
 --- a/drivers/infiniband/hw/mlx5/main.c
 +++ b/drivers/infiniband/hw/mlx5/main.c
 @@ -1490,15 +1490,4 @@ static struct pci_driver mlx5_ib_driver = {
   .remove = remove_one
  };
  
 -static int __init mlx5_ib_init(void)
 -{
 - return pci_register_driver(&mlx5_ib_driver);
 -}
 -
 -static void __exit mlx5_ib_cleanup(void)
 -{
 - pci_unregister_driver(&mlx5_ib_driver);
 -}
 -
 -module_init(mlx5_ib_init);
 -module_exit(mlx5_ib_cleanup);
 +module_pci_driver(mlx5_ib_driver);
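
To make the trade-off concrete, here is a userspace mock of what a helper macro like module_pci_driver() does: it generates the init/exit pair that the patch deletes. Nothing below is kernel API. struct fake_pci_driver, fake_register(), fake_unregister() and FAKE_MODULE_DRIVER() are hypothetical stand-ins chosen so the sketch compiles outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PCI driver type and registry. */
struct fake_pci_driver {
	const char *name;
};

static int registered_count;

static int fake_register(struct fake_pci_driver *drv)
{
	assert(drv != NULL);
	registered_count++;
	return 0;
}

static void fake_unregister(struct fake_pci_driver *drv)
{
	assert(drv != NULL);
	registered_count--;
}

/* Generates the init/exit pair that would otherwise be written by hand. */
#define FAKE_MODULE_DRIVER(drv)						\
	static int drv##_init(void) { return fake_register(&(drv)); }	\
	static void drv##_exit(void) { fake_unregister(&(drv)); }

static struct fake_pci_driver demo_driver = { .name = "demo" };
FAKE_MODULE_DRIVER(demo_driver)
```

Because the macro owns both function bodies, any extra setup or teardown code means going back to hand-written init and exit functions, which is the cost pointed out in the reply.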
 


Re: [PATCH] mlx5: qp: variable may be used uninitialized

2013-07-17 Thread Eli Cohen
Acked-by: Eli Cohen e...@mellanox.com

On Tue, Jul 16, 2013 at 03:35:01PM +0200, Andi Shyti wrote:
 in the sq_overhead() function, if qp_type is equal to IB_QPT_RC,
 size will be used uninitialized.
 
 Signed-off-by: Andi Shyti a...@etezian.org
 ---
  drivers/infiniband/hw/mlx5/qp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
 index 16ac54c..045f8cd 100644
 --- a/drivers/infiniband/hw/mlx5/qp.c
 +++ b/drivers/infiniband/hw/mlx5/qp.c
 @@ -199,7 +199,7 @@ static int set_rq_size(struct mlx5_ib_dev *dev, struct ib_qp_cap *cap,
  
  static int sq_overhead(enum ib_qp_type qp_type)
  {
 - int size;
 + int size = 0;
  
   switch (qp_type) {
   case IB_QPT_XRC_INI:
 -- 
 1.8.3.2
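
A small standalone sketch of the pattern behind this kind of warning. It assumes the switch accumulates into size through a fall-through case, which the quoted hunk does not show; the enum, the function name and the byte counts are all hypothetical.

```c
#include <assert.h>

enum qp_kind { QP_KIND_XRC_INI, QP_KIND_RC, QP_KIND_OTHER };

#define XRC_SEG_BYTES  16	/* hypothetical segment sizes */
#define CTRL_SEG_BYTES 48

static int overhead_sketch(enum qp_kind kind)
{
	int size = 0;	/* without this, the QP_KIND_RC path adds to garbage */

	switch (kind) {
	case QP_KIND_XRC_INI:
		size = XRC_SEG_BYTES;
		/* fall through */
	case QP_KIND_RC:
		size += CTRL_SEG_BYTES;
		break;
	default:
		return -1;
	}
	assert(size > 0);
	return size;
}
```

Entering at QP_KIND_RC skips the plain assignment, so the += is the first use of size on that path; initializing at the declaration covers it.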
 


Re: [patch -next] mlx5: return -EFAULT instead of -EPERM

2013-07-17 Thread Eli Cohen
Acked-by: Eli Cohen e...@mellanox.com

On Wed, Jul 10, 2013 at 01:58:59PM +0300, Dan Carpenter wrote:
 For copy_to/from_user() failure, the correct error code is -EFAULT not
 -EPERM.
 
 Signed-off-by: Dan Carpenter dan.carpen...@oracle.com
 
 diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
 index e2daa8f..bd41df9 100644
 --- a/drivers/infiniband/hw/mlx5/mr.c
 +++ b/drivers/infiniband/hw/mlx5/mr.c
 @@ -171,7 +171,7 @@ static ssize_t size_write(struct file *filp, const char __user *buf,
   int c;
  
   if (copy_from_user(lbuf, buf, sizeof(lbuf)))
 - return -EPERM;
 + return -EFAULT;
  
   c = order2idx(dev, ent->order);
   lbuf[sizeof(lbuf) - 1] = 0;
 @@ -208,7 +208,7 @@ static ssize_t size_read(struct file *filp, char __user *buf, size_t count,
   return err;
  
   if (copy_to_user(buf, lbuf, err))
 - return -EPERM;
 + return -EFAULT;
  
   *pos += err;
  
 @@ -233,7 +233,7 @@ static ssize_t limit_write(struct file *filp, const char __user *buf,
   int c;
  
   if (copy_from_user(lbuf, buf, sizeof(lbuf)))
 - return -EPERM;
 + return -EFAULT;
  
   c = order2idx(dev, ent->order);
   lbuf[sizeof(lbuf) - 1] = 0;
 @@ -270,7 +270,7 @@ static ssize_t limit_read(struct file *filp, char __user *buf, size_t count,
   return err;
  
   if (copy_to_user(buf, lbuf, err))
 - return -EPERM;
 + return -EFAULT;
  
   *pos += err;
  
 diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
 index c1c0eef..205753a 100644
 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
 +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
 @@ -693,7 +693,7 @@ static ssize_t dbg_write(struct file *filp, const char __user *buf,
   return -ENOMEM;
  
   if (copy_from_user(lbuf, buf, sizeof(lbuf)))
 - return -EPERM;
 + return -EFAULT;
  
   lbuf[sizeof(lbuf) - 1] = 0;
  
 @@ -889,7 +889,7 @@ static ssize_t data_write(struct file *filp, const char __user *buf,
   return -ENOMEM;
  
   if (copy_from_user(ptr, buf, count)) {
 - err = -EPERM;
 + err = -EFAULT;
   goto out;
   }
   dbg->in_msg = ptr;
 @@ -919,7 +919,7 @@ static ssize_t data_read(struct file *filp, char __user *buf, size_t count,
  
   copy = min_t(int, count, dbg->outlen);
   if (copy_to_user(buf, dbg->out_msg, copy))
 - return -EPERM;
 + return -EFAULT;
  
   *pos += copy;
  
 @@ -949,7 +949,7 @@ static ssize_t outlen_read(struct file *filp, char __user *buf, size_t count,
   return err;
  
   if (copy_to_user(buf, outlen, err))
 - return -EPERM;
 + return -EFAULT;
  
   *pos += err;
  
 @@ -974,7 +974,7 @@ static ssize_t outlen_write(struct file *filp, const char __user *buf,
   dbg->outlen = 0;
  
   if (copy_from_user(outlen_str, buf, count))
 - return -EPERM;
 + return -EFAULT;
  
   outlen_str[7] = 0;
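
The error-code convention in this patch can be sketched in userspace as follows. mock_copy_from_user() is a hypothetical stand-in that mimics the one property that matters here: like the kernel's copy_from_user(), it returns the number of bytes it could not copy, so any nonzero return means a bad user address. write_handler_sketch() and the buffer size are likewise made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_from_user(): returns bytes NOT copied. */
static size_t mock_copy_from_user(void *dst, const void *src, size_t n)
{
	if (src == NULL)
		return n;	/* simulate a faulting user pointer */
	memcpy(dst, src, n);
	return 0;
}

/* Copy a short user string; report -EFAULT if the copy faults. */
static long write_handler_sketch(const char *ubuf, size_t count)
{
	char lbuf[20];

	assert(count > 0);
	if (count > sizeof(lbuf) - 1)
		count = sizeof(lbuf) - 1;
	if (mock_copy_from_user(lbuf, ubuf, count))
		return -EFAULT;	/* bad address, not a permission problem */
	lbuf[count] = '\0';
	return (long)count;
}
```

-EPERM would tell the caller that the operation was not permitted, which is misleading when the real problem is an inaccessible buffer.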
  


RE: [PATCH REPOST FIXES for-3.11 1/4] IB/core: Create QP1 using the pkey index which contains the default pkey

2013-07-17 Thread Hefty, Sean
 diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
 index dc3fd1e..9be6754 100644
 --- a/drivers/infiniband/core/mad.c
 +++ b/drivers/infiniband/core/mad.c
 @@ -2663,6 +2663,7 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
   int ret, i;
   struct ib_qp_attr *attr;
   struct ib_qp *qp;
 + u16 pkey_index = 0;

This shouldn't need to be initialized, as it is always set further down in the 
patch.

Reviewed-by: Sean Hefty sean.he...@intel.com


Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-17 Thread Doug Ledford
On 07/16/2013 08:16 PM, Roland Dreier wrote:
 On Tue, Jul 16, 2013 at 10:11 AM, Jeff Squyres (jsquyres)
 jsquy...@cisco.com wrote:
 - doing it this way preserves ABI, so existing binaries are safe
 
 I still don't get this.  Wouldn't an existing binary be pretty
 surprised to get a value wildly out of range of the enum?

Yes, but there's no way around that without simply lying about the MTU.
The argument made in the thread was that, historically, applications have
had to be modified when moved to a new link layer (iWARP meant IB apps had
to be slightly modified for connection reasons, RoCE again required some
slight app modifications, etc.).  So this was seen as a case where the app
will work on fabrics it already knows about and will only get confused if
moved to this new fabric.  In that case the app needs to be modified
anyway, so that's acceptable breakage for keeping the apps working the
rest of the time.  That was the argument anyway.



RE: [PATCH v2] RDMA/cma: silence GCC warning

2013-07-17 Thread Hefty, Sean
 Building cma.o triggers this GCC warning:
 drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’:
 drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used
 uninitialized in this function [-Wmaybe-uninitialized]
 drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here
 
 This is a false positive, as port will always be initialized if we're
 at found. But if we assign to id_priv->id.port_num directly, we can
 drop port. That will, obviously, silence GCC.
 
 Signed-off-by: Paul Bolle pebo...@tiscali.nl

Acked-by: Sean Hefty sean.he...@intel.com

 ---
 0) v2: assign to id_priv->id.port_num directly, instead of
 initializing port to 0, as discussed with Sean.
 
 1) Still only compile tested.

tested - thanks

RE: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-17 Thread Hefty, Sean
 I hadn't looked at the kernel side yet; I was waiting for the userspace side to
 sort itself out first.

I think it makes sense to start with how user space can get the data.  Without 
eating up reserved fields, we're starting with 8 bit values.

 Hmm.  16 bits is probably enough for the MTU values, but still, changing
 kern-abi.h will be problematic from an ABI perspective.  Do people care about the
 kernel ABI, or is that mainly a userspace issue?

Well, we definitely care about the kernel to user ABI.

I can't imagine that we're dealing with more than a handful of actual MTU 
values.  Maybe the simplest thing is to extend the mtu enum to include what new 
values are needed, plus add a function to convert it.  (Can we call mulligan?)

I don't know how iwarp handles this.  Does it just report the wrong mtu, since 
it doesn't necessarily matter?  Steve - any idea here?

- Sean
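
A rough sketch of the "extend the enum plus add a conversion function" idea. The first five values mirror the existing enum ibv_mtu (256 through 4096 mapped to 1 through 5); the two extra entries and every name below are hypothetical, picked only to show that new values would still fit in an 8-bit field.

```c
#include <assert.h>

/* Values 1..5 mirror enum ibv_mtu; 6 and 7 are hypothetical additions. */
enum mtu_sketch {
	MTU_SKETCH_256  = 1,
	MTU_SKETCH_512  = 2,
	MTU_SKETCH_1024 = 3,
	MTU_SKETCH_2048 = 4,
	MTU_SKETCH_4096 = 5,
	MTU_SKETCH_1500 = 6,	/* hypothetical */
	MTU_SKETCH_9000 = 7,	/* hypothetical */
};

/* Convert an enum value to bytes; -1 for values this table does not know. */
static int mtu_sketch_to_bytes(int mtu)
{
	static const int bytes[] = { 0, 256, 512, 1024, 2048, 4096, 1500, 9000 };
	const int n = (int)(sizeof(bytes) / sizeof(bytes[0]));

	if (mtu < 1 || mtu >= n)
		return -1;
	assert(bytes[mtu] > 0);
	return bytes[mtu];
}
```

An application built against the old enum would still see values it does not recognize on a new fabric, so this does not answer the earlier objection about out-of-range values; it only keeps the field width unchanged.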


[PATCH] RDMA/nes: Reversing commit bca1935ccdec to silence allmodconfig build warning

2013-07-17 Thread Tatyana Nikolova
Reverting commit "Fix compilation error when nes_debug is enabled", which
removed the variables nes_tcp_state_str and nes_iwarp_state_str from the
debug print on the assumption that they aren't defined.
However, they are defined within an #ifdef NES_DEBUG block, so when NES_DEBUG
is enabled and the variables are no longer used, the compiler emits a
"defined but not used" warning.

Signed-off-by: Tatyana Nikolova tatyana.e.nikol...@intel.com
Reported-by: Stephen Rothwell s...@canb.auug.org.au
---
 drivers/infiniband/hw/nes/nes_hw.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 418004c..9020024 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -3570,10 +3570,10 @@ static void nes_process_iwarp_aeqe(struct nes_device *nesdev,
 	tcp_state = (aeq_info & NES_AEQE_TCP_STATE_MASK) >> NES_AEQE_TCP_STATE_SHIFT;
 	iwarp_state = (aeq_info & NES_AEQE_IWARP_STATE_MASK) >> NES_AEQE_IWARP_STATE_SHIFT;
 	nes_debug(NES_DBG_AEQ, "aeid = 0x%04X, qp-cq id = %d, aeqe = %p,"
-			" Tcp state = %d, iWARP state = %d\n",
+			" Tcp state = %s, iWARP state = %s\n",
 			async_event_id,
 			le32_to_cpu(aeqe->aeqe_words[NES_AEQE_COMP_QP_CQ_ID_IDX]), aeqe,
-			tcp_state, iwarp_state);
+			nes_tcp_state_str[tcp_state], nes_iwarp_state_str[iwarp_state]);
 
 	aeqe_cq_id = le32_to_cpu(aeqe->aeqe_words[NES_AEQE_COMP_QP_CQ_ID_IDX]);
 	if (aeq_info & NES_AEQE_QP) {
-- 
1.7.1
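
For readers unfamiliar with the pattern, the debug print extracts a state field with a mask and shift and then uses it to index a table of names. A standalone sketch, with a hypothetical mask, shift and table (the real NES_AEQE_* values and state names are not shown in this patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STATE_MASK  0x0000f000u	/* hypothetical field position */
#define SKETCH_STATE_SHIFT 12

/* Hypothetical name table; the driver keeps its tables under NES_DEBUG. */
static const char *const sketch_state_str[] = { "S0", "S1", "S2", "S3" };

/* Extract the state field and map it to a printable name. */
static const char *sketch_state_name(uint32_t info)
{
	uint32_t state = (info & SKETCH_STATE_MASK) >> SKETCH_STATE_SHIFT;
	size_t n = sizeof(sketch_state_str) / sizeof(sketch_state_str[0]);

	assert(state <= (SKETCH_STATE_MASK >> SKETCH_STATE_SHIFT));
	if (state >= n)
		return "unknown";	/* bounds check added by this sketch */
	return sketch_state_str[state];
}
```

The patch itself indexes the tables directly; the bounds check is an addition in this sketch, since the masked field can hold values beyond a short table.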



Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-17 Thread Steve Wise

On 7/17/2013 4:41 PM, Hefty, Sean wrote:

I hadn't looked at the kernel side yet; I was waiting for the userspace side to
sort itself out first.

I think it makes sense to start with how user space can get the data.  Without 
eating up reserved fields, we're starting with 8 bit values.


Hmm.  16 bits is probably enough for the MTU values, but still, changing
kern-abi.h will be problematic from an ABI perspective.  Do people care about the
kernel ABI, or is that mainly a userspace issue?

Well, we definitely care about the kernel to user ABI.

I can't imagine that we're dealing with more than a handful of actual MTU 
values.  Maybe the simplest thing is to extend the mtu enum to include what new 
values are needed, plus add a function to convert it.  (Can we call mulligan?)

I don't know how iwarp handles this.  Does it just report the wrong mtu, since 
it doesn't necessarily matter?  Steve - any idea here?


The iwarp drivers just report the nearest mtu enum.  Apps don't need it 
for iwarp like they do for ib.
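
As a sketch of what "report the nearest mtu enum" could look like, assuming "nearest" means the largest enum value that does not exceed the real MTU (actual drivers may round differently). The enum mirrors the values of enum ibv_mtu; the function name is made up.

```c
#include <assert.h>

/* Same numeric values as enum ibv_mtu: 256 -> 1 ... 4096 -> 5. */
enum mtu_enum_sketch { M256 = 1, M512, M1024, M2048, M4096 };

/* Map a byte MTU to the largest enum value not exceeding it (min M256). */
static int mtu_bytes_to_enum_floor(int bytes)
{
	static const int sizes[] = { 256, 512, 1024, 2048, 4096 };
	int best = M256;
	int i;

	assert(bytes > 0);
	for (i = 0; i < 5; i++)
		if (sizes[i] <= bytes)
			best = i + 1;
	return best;
}
```

Under this assumption a 1500-byte Ethernet MTU is reported as 1024 and a 9000-byte jumbo MTU as 4096, which is the kind of mismatch the thread describes as harmless for RC but a problem for UD.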





RE: [PATCH V3 for-next 3/4] IB/core: Export ib_create/destroy_flow through uverbs

2013-07-17 Thread Hefty, Sean
 +ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
 +   const char __user *buf, int in_len,
 +   int out_len)
 +{
 + struct ib_uverbs_create_flow  cmd;
 + struct ib_uverbs_create_flow_resp resp;
 + struct ib_uobject *uobj;
 + struct ib_flow*flow_id;
 + struct ib_kern_flow_attr  *kern_flow_attr;
 + struct ib_flow_attr   *flow_attr;
 + struct ib_qp  *qp;
 + int err = 0;
 + void *kern_spec;
 + void *ib_spec;
 + int i;
 +
 + if (out_len < sizeof(resp))
 + return -ENOSPC;
 +
 + if (copy_from_user(&cmd, buf, sizeof(cmd)))
 + return -EFAULT;
 +
 + if ((cmd.flow_attr.type == IB_FLOW_ATTR_SNIFFER &&
 +  !capable(CAP_NET_ADMIN)) || !capable(CAP_NET_RAW))
 + return -EPERM;
 +
 + if (cmd.flow_attr.num_of_specs) {
 + kern_flow_attr = kmalloc(cmd.flow_attr.size, GFP_KERNEL);
 + if (!kern_flow_attr)
 + return -ENOMEM;
 +
 + memcpy(kern_flow_attr, &cmd.flow_attr, sizeof(*kern_flow_attr));
 + if (copy_from_user(kern_flow_attr + 1, buf + sizeof(cmd),
 +cmd.flow_attr.size - sizeof(cmd))) {
 + err = -EFAULT;
 + goto err_free_attr;
 + }
 + } else {
 + kern_flow_attr = &cmd.flow_attr;
 + }
 +
 + uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
 + if (!uobj) {
 + err = -ENOMEM;
 + goto err_free_attr;
 + }
 + init_uobj(uobj, 0, file->ucontext, &rule_lock_class);
 + down_write(&uobj->mutex);
 +
 + qp = idr_read_qp(cmd.qp_handle, file->ucontext);
 + if (!qp) {
 + err = -EINVAL;
 + goto err_uobj;
 + }
 +
 + flow_attr = kmalloc(cmd.flow_attr.size, GFP_KERNEL);
 + if (!flow_attr) {
 + err = -ENOMEM;
 + goto err_put;
 + }
 +
 + flow_attr->type = kern_flow_attr->type;
 + flow_attr->priority = kern_flow_attr->priority;
 + flow_attr->num_of_specs = kern_flow_attr->num_of_specs;
 + flow_attr->port = kern_flow_attr->port;
 + flow_attr->flags = kern_flow_attr->flags;
 + flow_attr->size = sizeof(*flow_attr);
 +
 + kern_spec = kern_flow_attr + 1;
 + ib_spec = flow_attr + 1;
 + for (i = 0; i < flow_attr->num_of_specs; i++) {
 + err = kern_spec_to_ib_spec(kern_spec, ib_spec);
 + if (err)
 + goto err_free;
 + flow_attr->size +=
 + ((struct _ib_flow_spec *)ib_spec)->size;
 + kern_spec += ((struct ib_kern_spec *)kern_spec)->size;
 + ib_spec += ((struct _ib_flow_spec *)ib_spec)->size;

I didn't see where the ib_kern_spec size field was validated.  Maybe add this 
check to kern_spec_to_ib_spec?
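
To spell out the concern: when a loop advances through user-supplied records using a size field stored in each record, a size of zero never advances and an oversized one walks past the end of the buffer. A minimal userspace sketch of the kind of check meant here follows; struct rec_hdr and walk_records() are hypothetical and are not the uverbs structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; size covers the header plus its payload. */
struct rec_hdr {
	uint16_t type;
	uint16_t size;
};

/* Walk nrecs records in buf; return nrecs, or -1 on a bogus size field. */
static int walk_records(const uint8_t *buf, size_t len, unsigned int nrecs)
{
	size_t off = 0;
	unsigned int i;

	assert(buf != NULL);
	for (i = 0; i < nrecs; i++) {
		struct rec_hdr hdr;

		if (len - off < sizeof(hdr))
			return -1;	/* header would run past the buffer */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return -1;	/* no forward progress, or overrun */
		off += hdr.size;
	}
	return (int)nrecs;
}
```

Whether the check belongs in the loop or inside the per-record conversion helper is a design choice; putting it in the helper keeps the loop body short.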

 + }
 + flow_id = ib_create_flow(qp, flow_attr, IB_FLOW_DOMAIN_USER);
 + if (IS_ERR(flow_id)) {
 + err = PTR_ERR(flow_id);
 + goto err_free;
 + }
 + flow_id->qp = qp;
 + flow_id->uobject = uobj;
 + uobj->object = flow_id;
 +
 + err = idr_add_uobj(&ib_uverbs_rule_idr, uobj);
 + if (err)
 + goto destroy_flow;
 +
 + memset(&resp, 0, sizeof(resp));
 + resp.flow_handle = uobj->id;
 +
 + if (copy_to_user((void __user *)(unsigned long) cmd.response,
 +  &resp, sizeof(resp))) {
 + err = -EFAULT;
 + goto err_copy;
 + }
 +
 + put_qp_read(qp);
 + mutex_lock(&file->mutex);
 + list_add_tail(&uobj->list, &file->ucontext->rule_list);
 + mutex_unlock(&file->mutex);
 +
 + uobj->live = 1;
 +
 + up_write(&uobj->mutex);
 + kfree(flow_attr);
 + if (cmd.flow_attr.num_of_specs)
 + kfree(kern_flow_attr);
 + return in_len;
 +err_copy:
 + idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
 +destroy_flow:
 + ib_destroy_flow(flow_id);
 +err_free:
 + kfree(flow_attr);
 +err_put:
 + put_qp_read(qp);
 +err_uobj:
 + put_uobj_write(uobj);
 +err_free_attr:
 + if (cmd.flow_attr.num_of_specs)
 + kfree(kern_flow_attr);
 + return err;
 +}
 +
 +ssize_t ib_uverbs_destroy_flow(struct ib_uverbs_file *file,
 +const char __user *buf, int in_len,
 +int out_len) {
 + struct ib_uverbs_destroy_flow   cmd;
 + struct ib_flow  *flow_id;
 + struct ib_uobject   *uobj;
 + int ret;
 +
 + if (copy_from_user(&cmd, buf, sizeof(cmd)))
 + return -EFAULT;
 +
 + uobj = idr_write_uobj(&ib_uverbs_rule_idr, cmd.flow_handle,
 +   file->ucontext);
 + if (!uobj)
 + return -EINVAL;
 + flow_id = uobj->object;
 +
 + ret = ib_destroy_flow(flow_id);
 + if (!ret)
 + uobj->live = 0;
 +
 + put_uobj_write(uobj);
 +
 + 

Re: [PATCH V2] libibverbs: Allow arbitrary int values for MTU

2013-07-17 Thread Jeff Squyres (jsquyres)
On Jul 17, 2013, at 5:44 PM, Steve Wise sw...@opengridcomputing.com wrote:

 The iwarp drivers just report the nearest mtu enum.  Apps don't need it for 
 iwarp like they do for ib.


For RC, it doesn't matter much.  So the fact that RoCE and iWARP lie about 
their MTU isn't a huge deal.  It's wrong, but it doesn't matter much.

We need it for UD for our upcoming device, however, because the MTU is the only 
way to get the max message size.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
