mlx5 patches for 3.13

2013-11-13 Thread Eli Cohen
Hi Roland,

I see that 3.12 is released already but I could not see the patches I
sent in your for-next branch. Are you going to include them?

http://www.spinics.net/lists/linux-rdma/msg17719.html
http://www.spinics.net/lists/linux-rdma/msg17609.html

Thanks,
Eli
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [EXTERNAL] [PATCH opensm] Remove unused lid matrix calculation in Torus_2Qos routing

2013-11-13 Thread Jim Schutt
On 11/12/2013 04:18 AM, Hal Rosenstock wrote:

Acked-by: Jim Schutt jasc...@sandia.gov

 
 From: Vladimir Koushnir vladim...@mellanox.com
 
 Signed-off-by: Vladimir Koushnir vladim...@mellanox.com
 Signed-off-by: Hal Rosenstock h...@mellanox.com
 ---
 diff --git a/include/opensm/osm_ucast_mgr.h b/include/opensm/osm_ucast_mgr.h
 index c534b7e..b9c1ca1 100644
 --- a/include/opensm/osm_ucast_mgr.h
 +++ b/include/opensm/osm_ucast_mgr.h
 @@ -296,5 +296,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr);
  * SEE ALSO
  *Unicast Manager, Node Info Response Controller
  */
 +
 +int ucast_dummy_build_lid_matrices(void *context);
  END_C_DECLS
  #endif   /* _OSM_UCAST_MGR_H_ */
 diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
 index 8139f1a..71753cf 100644
 --- a/opensm/osm_torus.c
 +++ b/opensm/osm_torus.c
 @@ -9550,6 +9550,7 @@ int osm_ucast_torus2QoS_setup(struct osm_routing_engine *r,
  
   r->context = ctx;
   r->ucast_build_fwd_tables = torus_build_lfts;
 + r->build_lid_matrices = ucast_dummy_build_lid_matrices;
   r->update_sl2vl = torus_update_osm_sl2vl;
   r->update_vlarb = torus_update_osm_vlarb;
   r->path_sl = torus_path_sl;
 diff --git a/opensm/osm_ucast_mgr.c b/opensm/osm_ucast_mgr.c
 index 6384362..9ef7947 100644
 --- a/opensm/osm_ucast_mgr.c
 +++ b/opensm/osm_ucast_mgr.c
 @@ -1182,3 +1182,8 @@ int osm_ucast_dor_setup(struct osm_routing_engine *r, osm_opensm_t * osm)
   r->ucast_build_fwd_tables = ucast_dor_build_lfts;
   return 0;
  }
 +
 +int ucast_dummy_build_lid_matrices(void *context)
 +{
 + return 0;
 +}


[PATCH opensm] Implement atomic update operation for sa_db_file

2013-11-13 Thread Hal Rosenstock

From: Vladimir Koushnir vladim...@mellanox.com

Signed-off-by: Vladimir Koushnir vladim...@mellanox.com
---
 opensm/osm_sa.c |   20 ++++++++++++++++----
 1 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/opensm/osm_sa.c b/opensm/osm_sa.c
index 8c5ef5d..d5c1275 100644
--- a/opensm/osm_sa.c
+++ b/opensm/osm_sa.c
@@ -525,25 +525,37 @@ opensm_dump_to_file(osm_opensm_t * p_osm, const char *file_name,
                    void (*dump_func) (osm_opensm_t * p_osm, FILE * file))
 {
    char path[1024];
+   char path_tmp[1032];
    FILE *file;
+   int status = 0;
 
    snprintf(path, sizeof(path), "%s/%s",
         p_osm->subn.opt.dump_files_dir, file_name);
 
-   file = fopen(path, "w");
+   snprintf(path_tmp, sizeof(path_tmp), "%s.tmp", path);
+
+   file = fopen(path_tmp, "w");
    if (!file) {
        OSM_LOG(&p_osm->log, OSM_LOG_ERROR, "ERR 4C01: "
            "cannot open file \'%s\': %s\n",
-           file_name, strerror(errno));
+           path_tmp, strerror(errno));
        return -1;
    }
 
-   chmod(path, S_IRUSR | S_IWUSR);
+   chmod(path_tmp, S_IRUSR | S_IWUSR);
 
    dump_func(p_osm, file);
 
    fclose(file);
-   return 0;
+
+   status = rename(path_tmp, path);
+   if (status) {
+       OSM_LOG(&p_osm->log, OSM_LOG_ERROR, "ERR 4C0B: "
+           "Failed to rename file:%s (err:%s)\n",
+           path_tmp, strerror(errno));
+   }
+
+   return status;
 }
 
 static void mcast_mgr_dump_one_port(cl_map_item_t * p_map_item, void *cxt)
-- 
1.7.8.2



[PATCH opensm 1/2] Redundant remove() function call during db file generation

2013-11-13 Thread Hal Rosenstock

From: Vladimir Koushnir vladim...@mellanox.com

Signed-off-by: Vladimir Koushnir vladim...@mellanox.com
---
 opensm/osm_db_files.c |   17 +++++------------
 1 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/opensm/osm_db_files.c b/opensm/osm_db_files.c
index 75b58cd..348385f 100644
--- a/opensm/osm_db_files.c
+++ b/opensm/osm_db_files.c
@@ -45,6 +45,7 @@
 
 #include <sys/stat.h>
 #include <sys/types.h>
+#include <errno.h>
 #include <stdlib.h>
 #include <string.h>
 #include <opensm/osm_file_ids.h>
@@ -480,8 +481,8 @@ int osm_db_store(IN osm_db_domain_t * p_domain)
    p_file = fopen(p_tmp_file_name, "w");
    if (!p_file) {
        OSM_LOG(p_log, OSM_LOG_ERROR, "ERR 6107: "
-           "Failed to open the db file:%s for writing\n",
-           p_domain_imp->file_name);
+           "Failed to open the db file:%s for writing: err:%s\n",
+           p_domain_imp->file_name, strerror(errno));
        status = 1;
        goto Exit;
    }
@@ -489,19 +490,11 @@ int osm_db_store(IN osm_db_domain_t * p_domain)
    st_foreach(p_domain_imp->p_hash, dump_tbl_entry, (st_data_t) p_file);
    fclose(p_file);
 
-   /* move the domain file */
-   status = remove(p_domain_imp->file_name);
-   if (status) {
-       OSM_LOG(p_log, OSM_LOG_ERROR, "ERR 6109: "
-           "Failed to remove file:%s (err:%u)\n",
-           p_domain_imp->file_name, status);
-   }
-
    status = rename(p_tmp_file_name, p_domain_imp->file_name);
    if (status) {
        OSM_LOG(p_log, OSM_LOG_ERROR, "ERR 6108: "
-           "Failed to rename the db file to:%s (err:%u)\n",
-           p_domain_imp->file_name, status);
+           "Failed to rename the db file to:%s (err:%s)\n",
+           p_domain_imp->file_name, strerror(errno));
    }
 Exit:
    cl_spinlock_release(&p_domain_imp->lock);
-- 
1.7.8.2
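
For readers following the patch above: on POSIX systems rename() replaces an
existing destination by itself, which is why the preceding remove() call is
redundant, and the failure reason lives in errno rather than in the return
value. The following is only a minimal sketch of that idea; replace_file() is
a name made up for illustration and is not part of opensm.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: move `src` over `dst` without calling remove() first.
 * On failure, writes a human-readable reason into `err` and returns -1. */
static int replace_file(const char *src, const char *dst,
			char *err, size_t err_len)
{
	if (rename(src, dst) != 0) {
		/* rename() returns -1 on failure; the cause is in errno */
		int saved = errno;

		snprintf(err, err_len, "Failed to rename %s to %s (err:%s)",
			 src, dst, strerror(saved));
		errno = saved;
		return -1;
	}
	if (err_len > 0)
		err[0] = '\0';
	return 0;
}
```

Logging strerror(errno) instead of the raw return value follows the same
reasoning as the changed log format in the patch: the return value of rename()
only says that the call failed, not why.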



[PATCH opensm 2/2] Only rewrite db files during heavy sweep when there is a real change

2013-11-13 Thread Hal Rosenstock

From: Vladimir Koushnir vladim...@mellanox.com

Signed-off-by: Vladimir Koushnir vladim...@mellanox.com
---
 opensm/osm_db_files.c |   19 +++++++++++++++++--
 1 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/opensm/osm_db_files.c b/opensm/osm_db_files.c
index 348385f..aaed986 100644
--- a/opensm/osm_db_files.c
+++ b/opensm/osm_db_files.c
@@ -92,6 +92,7 @@ typedef struct osm_db_domain_imp {
    char *file_name;
    st_table *p_hash;
    cl_spinlock_t lock;
+   boolean_t dirty;
 } osm_db_domain_imp_t;
 /*
  * FIELDS
@@ -268,6 +269,7 @@ osm_db_domain_t *osm_db_domain_init(IN osm_db_t * p_db, IN char *domain_name)
    /* initialize the hash table object */
    p_domain_imp->p_hash = st_init_strtable();
    CL_ASSERT(p_domain_imp->p_hash != NULL);
+   p_domain_imp->dirty = FALSE;
 
    p_domain->p_db = p_db;
    cl_list_insert_tail(&p_db->domains, p_domain);
@@ -463,13 +465,17 @@ int osm_db_store(IN osm_db_domain_t * p_domain)
 {
    osm_log_t *p_log = p_domain->p_db->p_log;
    osm_db_domain_imp_t *p_domain_imp;
-   FILE *p_file;
+   FILE *p_file = NULL;
    int status = 0;
-   char *p_tmp_file_name;
+   char *p_tmp_file_name = NULL;
 
    OSM_LOG_ENTER(p_log);
 
    p_domain_imp = (osm_db_domain_imp_t *) p_domain->p_domain_imp;
+
+   if (p_domain_imp->dirty == FALSE)
+       goto Exit;
+
    p_tmp_file_name = malloc(sizeof(char) *
                 (strlen(p_domain_imp->file_name) + 8));
    strcpy(p_tmp_file_name, p_domain_imp->file_name);
@@ -495,7 +501,9 @@ int osm_db_store(IN osm_db_domain_t * p_domain)
        OSM_LOG(p_log, OSM_LOG_ERROR, "ERR 6108: "
            "Failed to rename the db file to:%s (err:%s)\n",
            p_domain_imp->file_name, strerror(errno));
+       goto Exit;
    }
+   p_domain_imp->dirty = FALSE;
 Exit:
    cl_spinlock_release(&p_domain_imp->lock);
    free(p_tmp_file_name);
@@ -579,6 +587,9 @@ int osm_db_update(IN osm_db_domain_t * p_domain, IN char *p_key, IN char *p_val)
            "Key:%s previously exists in:%s with value:%s\n",
            p_key, p_domain_imp->file_name, p_prev_val);
        p_new_key = p_key;
+       /* same key, same value - nothing to update */
+       if (p_prev_val && !strcmp(p_val, p_prev_val))
+           goto Exit;
    } else {
        /* need to allocate the key */
        p_new_key = malloc(sizeof(char) * (strlen(p_key) + 1));
@@ -595,6 +606,9 @@ int osm_db_update(IN osm_db_domain_t * p_domain, IN char *p_key, IN char *p_val)
    if (p_prev_val)
        free(p_prev_val);
 
+   p_domain_imp->dirty = TRUE;
+
+Exit:
    cl_spinlock_release(&p_domain_imp->lock);
 
    return 0;
@@ -622,6 +636,7 @@ int osm_db_delete(IN osm_db_domain_t * p_domain, IN char *p_key)
        } else {
            free(p_key);
            free(p_prev_val);
+           p_domain_imp->dirty = TRUE;
            res = 0;
        }
    } else {
-- 
1.7.8.2
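
The dirty-flag idea in the patch above can be reduced to a small standalone
sketch: updates mark the domain dirty only when a value really changes, and
the store step is skipped while the flag is clear. Everything below (struct
kv_domain, kv_update(), kv_store(), the single fixed-size value) is invented
for illustration; opensm's real domain is a hash table protected by a
spinlock.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical single-entry "domain"; stands in for opensm's hash table. */
struct kv_domain {
	char value[64];
	bool dirty;		/* true when memory differs from the file */
	int store_count;	/* number of real writes, for illustration */
};

static void kv_init(struct kv_domain *d)
{
	d->value[0] = '\0';
	d->dirty = false;
	d->store_count = 0;
}

/* Mark the domain dirty only when the value actually changes. */
static void kv_update(struct kv_domain *d, const char *val)
{
	if (strcmp(d->value, val) == 0)
		return;		/* same value: nothing to update */
	snprintf(d->value, sizeof(d->value), "%s", val);
	d->dirty = true;
}

/* Write the domain out only if something changed since the last store.
 * Returns 1 if a write happened, 0 if it was skipped. */
static int kv_store(struct kv_domain *d)
{
	if (!d->dirty)
		return 0;
	d->store_count++;	/* stand-in for the real file write */
	d->dirty = false;
	return 1;
}
```

With this shape, a periodic sweep that keeps calling kv_store() costs nothing
as long as no update changed a value in between.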



Re: [PATCH opensm] Implement atomic update operation for sa_db_file

2013-11-13 Thread Bart Van Assche

On 11/13/13 17:27, Hal Rosenstock wrote:


From: Vladimir Koushnir vladim...@mellanox.com

Signed-off-by: Vladimir Koushnir vladim...@mellanox.com
---
  opensm/osm_sa.c |   20 
  1 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/opensm/osm_sa.c b/opensm/osm_sa.c
index 8c5ef5d..d5c1275 100644
--- a/opensm/osm_sa.c
+++ b/opensm/osm_sa.c
@@ -525,25 +525,37 @@ opensm_dump_to_file(osm_opensm_t * p_osm, const char *file_name,
                    void (*dump_func) (osm_opensm_t * p_osm, FILE * file))
 {
    char path[1024];
+   char path_tmp[1032];
    FILE *file;
+   int status = 0;
 
    snprintf(path, sizeof(path), "%s/%s",
         p_osm->subn.opt.dump_files_dir, file_name);
 
-   file = fopen(path, "w");
+   snprintf(path_tmp, sizeof(path_tmp), "%s.tmp", path);
+
+   file = fopen(path_tmp, "w");
    if (!file) {
        OSM_LOG(&p_osm->log, OSM_LOG_ERROR, "ERR 4C01: "
            "cannot open file \'%s\': %s\n",
-           file_name, strerror(errno));
+           path_tmp, strerror(errno));
        return -1;
    }
 
-   chmod(path, S_IRUSR | S_IWUSR);
+   chmod(path_tmp, S_IRUSR | S_IWUSR);
 
    dump_func(p_osm, file);
 
    fclose(file);
-   return 0;
+
+   status = rename(path_tmp, path);
+   if (status) {
+       OSM_LOG(&p_osm->log, OSM_LOG_ERROR, "ERR 4C0B: "
+           "Failed to rename file:%s (err:%s)\n",
+           path_tmp, strerror(errno));
+   }
+
+   return status;
  }

  static void mcast_mgr_dump_one_port(cl_map_item_t * p_map_item, void *cxt)


Isn't an fdatasync() call missing after dump_func() and before fclose()?
According to Theodore Ts'o, calling fdatasync() or fsync() before
fclose() is essential during an atomic update. See also
http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ for more
information.


Bart.
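
A minimal sketch of the sequence suggested in this reply, assuming a POSIX
system: write to a temporary file, flush the stdio buffer, fsync() the
descriptor, close, and only then rename() over the target. The helper name
atomic_write_file() and the ".tmp" suffix handling are illustrative
assumptions; this is not the opensm code and not a proposed patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: atomically replace `path` with the text in `data`.
 * Returns 0 on success, -1 on failure. */
static int atomic_write_file(const char *path, const char *data)
{
	char tmp[1032];
	FILE *f;
	int n;

	n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;	/* path too long for the temporary name */

	f = fopen(tmp, "w");
	if (!f)
		return -1;

	if (fputs(data, f) == EOF ||
	    fflush(f) != 0 ||		/* push stdio buffers to the kernel */
	    fsync(fileno(f)) != 0) {	/* ask the kernel to reach storage */
		fclose(f);
		remove(tmp);
		return -1;
	}

	if (fclose(f) != 0) {
		remove(tmp);
		return -1;
	}

	/* rename() replaces an existing target in one step */
	if (rename(tmp, path) != 0) {
		remove(tmp);
		return -1;
	}
	return 0;
}
```

Note that fflush() has to come before fsync(): data still sitting in the
stdio buffer has not reached the kernel yet, so syncing the descriptor alone
would not cover it.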



Re: RDMA and memory ordering

2013-11-13 Thread Jason Gunthorpe
On Wed, Nov 13, 2013 at 02:55:53AM -0400, Anuj Kalia wrote:

 I don't know what you meant by burst writes: do you mean several RDMA
 writes or one large write? I'm concerned with the order in which data

An RDMA write will be split up by the HCA into a burst of PCI MemoryWr
operations.

 I guess now is the time I run lots of micro experiments. Thanks a lot
 for the help everyone.

Careful, experiments can't prove that ordering is guaranteed to be
present; they can only show that it certainly isn't.

Intel hardware is very good at hiding ordering issues 99% of the time,
but in many cases there can be a stressed condition that will show a
different result.

Jason


Re: RDMA and memory ordering

2013-11-13 Thread Gabriele Svelto

On 12/11/2013 11:31, Anuj Kalia wrote:

I believe the atomic operations would be a lot more expensive than
reads/writes. I'm targeting maximum performance so I don't want to
look that way yet.


This sounds like premature optimization to me, which as you know is the
root of all evil :)


Try using the atomic primitives; they have been designed specifically
for this kind of scenario. Then measure their performance in the
real world before spending time on optimizing something that might just
be fast enough for your purposes (and far more robust). If you're
already polling your CQs those operations will be *very* fast.


 Gabriele


RE: failure to get gid with rdma_bind_addr with >= 3.10 kernels

2013-11-13 Thread Hefty, Sean
 
 Sean, how do we continue here? The patch worked for Christoph, so are you
 going to merge it into librdmacm, or does it need more work?

I will merge it, since it fixes the issue.


[PATCH V5 8/8] mlx4_en: Avoid setting netdevice dev_id to port number

2013-11-13 Thread Or Gerlitz
From: Moni Shoua mo...@mellanox.co.il

The port number should not be stored in dev_id.

The netdevice dev_id field is intended to differentiate between multiple
devices which share the same MAC address. Moreover, storing the port number
there makes the kernel assign the wrong link-local IPv6 address to mlx4_en
netdevices.

Signed-off-by: Narendra K narendr...@dell.com
Signed-off-by: Moni Shoua mo...@mellanox.co.il
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index fa37b7a..b8dbb1a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2191,7 +2191,6 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
    netif_set_real_num_rx_queues(dev, prof->rx_ring_num);
 
    SET_NETDEV_DEV(dev, &mdev->dev->pdev->dev);
-   dev->dev_id =  port - 1;
 
/*
 * Initialize driver private data
-- 
1.7.1
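
To illustrate why a nonzero dev_id can change the IPv6 link-local address,
here is a user-space sketch of how an interface identifier may be derived
from a MAC address. It is written from memory of the kernel's EUI-48 handling
and should be read as an assumption, not as the kernel source; make_ifid() is
a name invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* User-space sketch (assumed to mirror the kernel's behaviour): derive the
 * 8-byte IPv6 interface identifier from a 6-byte MAC and a dev_id. */
static void make_ifid(uint8_t eui[8], const uint8_t mac[6], uint16_t dev_id)
{
	memcpy(eui, mac, 3);
	memcpy(eui + 5, mac + 3, 3);

	if (dev_id) {
		/* a nonzero dev_id replaces the usual ff:fe filler */
		eui[3] = (dev_id >> 8) & 0xff;
		eui[4] = dev_id & 0xff;
	} else {
		/* modified EUI-64: insert ff:fe and flip the U/L bit */
		eui[3] = 0xff;
		eui[4] = 0xfe;
		eui[0] ^= 2;
	}
}
```

Under this assumption, setting dev_id to port - 1 gives the second port a
dev_id of 1, so its interface identifier no longer has the standard
modified EUI-64 form that peers would expect for that MAC.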



[PATCH V5 6/8] IB/ocrdma: Populate GID table with IP based gids

2013-11-13 Thread Or Gerlitz
From: Moni Shoua mo...@mellanox.com

This patch is similar in spirit to the "IB/mlx4: Use IBoE (RoCE) IP based GIDs
in the port GID table" patch.

Changes to the host's inet4 and inet6 addresses are monitored, and if an
address is associated with an ocrdma device then a gid is added to or deleted
from the device's gid table. The gid format is either an IPv4-mapped IPv6
address or the IPv6 address itself.
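
As a rough illustration of the gid format described above, the sketch below
turns a textual IP address into 16 gid bytes: an IPv4 address becomes an
IPv4-mapped IPv6 address (::ffff:a.b.c.d), an IPv6 address is copied
unchanged. ip_to_gid() is a hypothetical user-space helper built on
inet_pton(); it is not the driver code.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a 16-byte GID from a textual IP address.
 * Returns 0 on success, -1 if `ip` is neither valid IPv4 nor IPv6. */
static int ip_to_gid(const char *ip, uint8_t gid[16])
{
	struct in_addr v4;
	struct in6_addr v6;

	if (inet_pton(AF_INET, ip, &v4) == 1) {
		memset(gid, 0, 10);		/* 80 zero bits */
		gid[10] = 0xff;			/* then 16 one bits */
		gid[11] = 0xff;
		memcpy(gid + 12, &v4.s_addr, 4);	/* network byte order */
		return 0;
	}
	if (inet_pton(AF_INET6, ip, &v6) == 1) {
		memcpy(gid, v6.s6_addr, 16);
		return 0;
	}
	return -1;
}
```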

Cc: Naresh Gottumukkala bgottumukk...@emulex.com
Signed-off-by: Moni Shoua mo...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/hw/ocrdma/ocrdma_main.c  |  138 ---
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c |2 +-
 2 files changed, 41 insertions(+), 99 deletions(-)

diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 91443bc..47187bf 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -67,46 +67,24 @@ void ocrdma_get_guid(struct ocrdma_dev *dev, u8 *guid)
    guid[7] = mac_addr[5];
 }
 
-static void ocrdma_build_sgid_mac(union ib_gid *sgid, unsigned char *mac_addr,
-                 bool is_vlan, u16 vlan_id)
-{
-   sgid->global.subnet_prefix = cpu_to_be64(0xfe80000000000000LL);
-   sgid->raw[8] = mac_addr[0] ^ 2;
-   sgid->raw[9] = mac_addr[1];
-   sgid->raw[10] = mac_addr[2];
-   if (is_vlan) {
-       sgid->raw[11] = vlan_id >> 8;
-       sgid->raw[12] = vlan_id & 0xff;
-   } else {
-       sgid->raw[11] = 0xff;
-       sgid->raw[12] = 0xfe;
-   }
-   sgid->raw[13] = mac_addr[3];
-   sgid->raw[14] = mac_addr[4];
-   sgid->raw[15] = mac_addr[5];
-}
-
-static bool ocrdma_add_sgid(struct ocrdma_dev *dev, unsigned char *mac_addr,
-               bool is_vlan, u16 vlan_id)
+static bool ocrdma_add_sgid(struct ocrdma_dev *dev, union ib_gid *new_sgid)
 {
    int i;
-   union ib_gid new_sgid;
    unsigned long flags;
 
    memset(&ocrdma_zero_sgid, 0, sizeof(union ib_gid));
 
-   ocrdma_build_sgid_mac(&new_sgid, mac_addr, is_vlan, vlan_id);
 
    spin_lock_irqsave(&dev->sgid_lock, flags);
    for (i = 0; i < OCRDMA_MAX_SGID; i++) {
        if (!memcmp(&dev->sgid_tbl[i], &ocrdma_zero_sgid,
                sizeof(union ib_gid))) {
            /* found free entry */
-           memcpy(&dev->sgid_tbl[i], &new_sgid,
+           memcpy(&dev->sgid_tbl[i], new_sgid,
                   sizeof(union ib_gid));
            spin_unlock_irqrestore(&dev->sgid_lock, flags);
            return true;
-       } else if (!memcmp(&dev->sgid_tbl[i], &new_sgid,
+       } else if (!memcmp(&dev->sgid_tbl[i], new_sgid,
                   sizeof(union ib_gid))) {
            /* entry already present, no addition is required. */
            spin_unlock_irqrestore(&dev->sgid_lock, flags);
@@ -117,20 +95,17 @@ static bool ocrdma_add_sgid(struct ocrdma_dev *dev, unsigned char *mac_addr,
    return false;
 }
 
-static bool ocrdma_del_sgid(struct ocrdma_dev *dev, unsigned char *mac_addr,
-               bool is_vlan, u16 vlan_id)
+static bool ocrdma_del_sgid(struct ocrdma_dev *dev, union ib_gid *sgid)
 {
    int found = false;
    int i;
-   union ib_gid sgid;
    unsigned long flags;
 
-   ocrdma_build_sgid_mac(&sgid, mac_addr, is_vlan, vlan_id);
 
    spin_lock_irqsave(&dev->sgid_lock, flags);
    /* first is default sgid, which cannot be deleted. */
    for (i = 1; i < OCRDMA_MAX_SGID; i++) {
-       if (!memcmp(&dev->sgid_tbl[i], &sgid, sizeof(union ib_gid))) {
+       if (!memcmp(&dev->sgid_tbl[i], sgid, sizeof(union ib_gid))) {
            /* found matching entry */
            memset(&dev->sgid_tbl[i], 0, sizeof(union ib_gid));
            found = true;
@@ -141,75 +116,18 @@ static bool ocrdma_del_sgid(struct ocrdma_dev *dev, unsigned char *mac_addr,
    return found;
 }
 
-static void ocrdma_add_default_sgid(struct ocrdma_dev *dev)
-{
-   /* GID Index 0 - Invariant manufacturer-assigned EUI-64 */
-   union ib_gid *sgid = &dev->sgid_tbl[0];
-
-   sgid->global.subnet_prefix = cpu_to_be64(0xfe80000000000000LL);
-   ocrdma_get_guid(dev, &sgid->raw[8]);
-}
-
-#if IS_ENABLED(CONFIG_VLAN_8021Q)
-static void ocrdma_add_vlan_sgids(struct ocrdma_dev *dev)
-{
-   struct net_device *netdev, *tmp;
-   u16 vlan_id;
-   bool is_vlan;
-
-   netdev = dev->nic_info.netdev;
-
-   rcu_read_lock();
-   for_each_netdev_rcu(&init_net, tmp) {
-       if (netdev == tmp || vlan_dev_real_dev(tmp) == netdev) {
-           if (!netif_running(tmp) || !netif_oper_up(tmp))
-               continue;
-           if (netdev != tmp) {
-               vlan_id = vlan_dev_vlan_id(tmp);
- 

[PATCH V5 4/8] IB/mlx4: Handle Ethernet L2 parameters for IP based GID addressing

2013-11-13 Thread Or Gerlitz
From: Moni Shoua mo...@mellanox.com

IP based RoCE gids don't store the Ethernet L2 parameters, MAC and VLAN.

Hence, we now need to extract them from the CQE and place them in struct
ib_wc (to be used in cases where they were previously taken from the gid).

Also, when modifying a QP or building an address handle, instead of
parsing the dgid to get the MAC and VLAN, take them from the
address handle attributes.
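
The VLAN handling in this patch combines a 12-bit VLAN ID with a 3-bit
priority taken from the SL, and splits the two again on completion. A
standalone sketch of that packing follows; the function names are invented
here, and the 0x0fff mask is an assumption about what the VID mask in the
driver amounts to.

```c
#include <stdint.h>

/* Combine a VLAN ID with a 3-bit priority taken from the SL.
 * Values of 0x1000 and above mean "no VLAN" and are passed through. */
static uint16_t make_vlan_tag(uint16_t vlan_id, uint8_t sl)
{
	uint16_t tag = vlan_id;

	if (tag < 0x1000)		/* only valid VLAN IDs get a priority */
		tag |= (uint16_t)((sl & 7) << 13);
	return tag;
}

/* Low 12 bits: the VLAN ID (mask value assumed). */
static uint16_t tag_vlan_id(uint16_t tag)
{
	return tag & 0x0fff;
}

/* Top 3 bits: the priority. */
static uint8_t tag_priority(uint16_t tag)
{
	return (uint8_t)(tag >> 13);
}
```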

Signed-off-by: Moni Shoua mo...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/hw/mlx4/ah.c   |   40 +++-
 drivers/infiniband/hw/mlx4/cq.c   |9 +++
 drivers/infiniband/hw/mlx4/mlx4_ib.h  |3 -
 drivers/infiniband/hw/mlx4/qp.c   |  105 ++---
 drivers/net/ethernet/mellanox/mlx4/port.c |   20 ++
 include/linux/mlx4/cq.h   |   15 +++-
 6 files changed, 130 insertions(+), 62 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/ah.c b/drivers/infiniband/hw/mlx4/ah.c
index a251bec..170dca6 100644
--- a/drivers/infiniband/hw/mlx4/ah.c
+++ b/drivers/infiniband/hw/mlx4/ah.c
@@ -39,25 +39,6 @@
 
 #include "mlx4_ib.h"
 
-int mlx4_ib_resolve_grh(struct mlx4_ib_dev *dev, const struct ib_ah_attr *ah_attr,
-           u8 *mac, int *is_mcast, u8 port)
-{
-   struct in6_addr in6;
-
-   *is_mcast = 0;
-
-   memcpy(&in6, ah_attr->grh.dgid.raw, sizeof in6);
-   if (rdma_link_local_addr(&in6))
-       rdma_get_ll_mac(&in6, mac);
-   else if (rdma_is_multicast_addr(&in6)) {
-       rdma_get_mcast_mac(&in6, mac);
-       *is_mcast = 1;
-   } else
-       return -EINVAL;
-
-   return 0;
-}
-
 static struct ib_ah *create_ib_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr,
                  struct mlx4_ib_ah *ah)
 {
@@ -92,21 +73,18 @@ static struct ib_ah *create_iboe_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr
 {
    struct mlx4_ib_dev *ibdev = to_mdev(pd->device);
    struct mlx4_dev *dev = ibdev->dev;
-   union ib_gid sgid;
-   u8 mac[6];
-   int err;
    int is_mcast;
+   struct in6_addr in6;
    u16 vlan_tag;
 
-   err = mlx4_ib_resolve_grh(ibdev, ah_attr, mac, &is_mcast, ah_attr->port_num);
-   if (err)
-       return ERR_PTR(err);
-
-   memcpy(ah->av.eth.mac, mac, 6);
-   err = ib_get_cached_gid(pd->device, ah_attr->port_num, ah_attr->grh.sgid_index, &sgid);
-   if (err)
-       return ERR_PTR(err);
-   vlan_tag = rdma_get_vlan_id(&sgid);
+   memcpy(&in6, ah_attr->grh.dgid.raw, sizeof(in6));
+   if (rdma_is_multicast_addr(&in6)) {
+       is_mcast = 1;
+       rdma_get_mcast_mac(&in6, ah->av.eth.mac);
+   } else {
+       memcpy(ah->av.eth.mac, ah_attr->dmac, ETH_ALEN);
+   }
+   vlan_tag = ah_attr->vlan_id;
    if (vlan_tag < 0x1000)
        vlan_tag |= (ah_attr->sl & 7) << 13;
    ah->av.eth.port_pd = cpu_to_be32(to_mpd(pd)->pdn | (ah_attr->port_num << 24));
diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index d5e60f4..5f6113b 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -793,6 +793,15 @@ repoll:
            wc->sl  = be16_to_cpu(cqe->sl_vid) >> 13;
        else
            wc->sl  = be16_to_cpu(cqe->sl_vid) >> 12;
+       if (be32_to_cpu(cqe->vlan_my_qpn) & MLX4_CQE_VLAN_PRESENT_MASK) {
+           wc->vlan_id = be16_to_cpu(cqe->sl_vid) &
+               MLX4_CQE_VID_MASK;
+       } else {
+           wc->vlan_id = 0xffff;
+       }
+       wc->wc_flags |= IB_WC_WITH_VLAN;
+       memcpy(wc->smac, cqe->smac, ETH_ALEN);
+       wc->wc_flags |= IB_WC_WITH_SMAC;
    }
 
    return 0;
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 133f41f..c06f571 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -678,9 +678,6 @@ int __mlx4_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
 int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
            union ib_gid *gid, int netw_view);
 
-int mlx4_ib_resolve_grh(struct mlx4_ib_dev *dev, const struct ib_ah_attr *ah_attr,
-           u8 *mac, int *is_mcast, u8 port);
-
 static inline bool mlx4_ib_ah_grh_present(struct mlx4_ib_ah *ah)
 {
    u8 port = be32_to_cpu(ah->av.ib.port_pd) >> 24 & 3;
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index da6f5fa..e0c2186 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -90,6 +90,21 @@ enum {
    MLX4_RAW_QP_MSGMAX  = 31,
 };
 
+#ifndef ETH_ALEN
+#define ETH_ALEN        6
+#endif
+static inline u64 mlx4_mac_to_u64(u8 *addr)
+{
+   u64 mac = 0;
+   int i;
+
+   for (i = 0; i < ETH_ALEN; i++) {
+       mac <<= 8;
+   mac |= 

[PATCH V5 7/8] IB/uverbs: Resolve Ethernet L2 addresses when modifying QP

2013-11-13 Thread Or Gerlitz
From: Moni Shoua mo...@mellanox.co.il

Existing user space applications provide only IBoE L3 address attributes
to the kernel when they issue QP modify. To stay compatible with them and let
such apps keep working transparently under the IBoE GID IP addressing changes,
add Eth L2 address resolution in the user-kernel linking piece, uverbs.
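
For the link-local case, the L2 attributes can be read straight back out of a
MAC-based GID. The sketch below shows that extraction in user space; it
follows the byte layout of the MAC/VLAN GIDs that this series moves away
from (MAC in bytes 8-10 and 13-15, VLAN ID or ff:fe in bytes 11-12). The
helper names are invented and this is not the kernel's implementation.

```c
#include <stdint.h>
#include <string.h>

/* Recover the MAC from a MAC-based link-local GID (bytes 8-10 and 13-15). */
static void gid_to_mac(const uint8_t gid[16], uint8_t mac[6])
{
	memcpy(mac, gid + 8, 3);
	memcpy(mac + 3, gid + 13, 3);
	mac[0] ^= 2;	/* undo the universal/local bit flip */
}

/* Recover the VLAN ID from bytes 11-12; 0xffff means "no VLAN". */
static uint16_t gid_to_vlan(const uint8_t gid[16])
{
	uint16_t vid = (uint16_t)(gid[11] << 8 | gid[12]);

	return vid < 0x1000 ? vid : 0xffff;
}
```

An IP based GID carries none of this information, which is why the non
link-local branch in the patch has to resolve the MAC and VLAN separately.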

Signed-off-by: Moni Shoua mo...@mellanox.co.il
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/core/uverbs_cmd.c |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 5bb2a82..74242b9 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -36,8 +36,10 @@
 #include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/slab.h>
+#include <linux/in6.h>
 
 #include <asm/uaccess.h>
+#include <rdma/ib_addr.h>
 
 #include "uverbs.h"
 
@@ -1911,6 +1913,7 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
    struct ib_qp              *qp;
    struct ib_qp_attr         *attr;
    int                        ret;
+   union ib_gid               sgid;
 
    if (copy_from_user(&cmd, buf, sizeof cmd))
        return -EFAULT;
@@ -1974,6 +1977,30 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
    attr->alt_ah_attr.ah_flags  = cmd.alt_dest.is_global ? IB_AH_GRH : 0;
    attr->alt_ah_attr.port_num  = cmd.alt_dest.port_num;
 
+   if ((cmd.attr_mask & IB_QP_AV) &&
+       (rdma_port_get_link_layer(qp->device, attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
+       ret = ib_query_gid(qp->device, attr->ah_attr.port_num,
+                  attr->ah_attr.grh.sgid_index, &sgid);
+       if (ret)
+           goto out;
+       if (rdma_link_local_addr((struct in6_addr *)attr->ah_attr.grh.dgid.raw)) {
+           rdma_get_ll_mac((struct in6_addr *)attr->ah_attr.grh.dgid.raw, attr->ah_attr.dmac);
+           rdma_get_ll_mac((struct in6_addr *)sgid.raw, attr->smac);
+           attr->vlan_id = rdma_get_vlan_id(&sgid);
+       } else {
+           ret = rdma_addr_find_dmac_by_grh(&sgid, &attr->ah_attr.grh.dgid,
+                   attr->ah_attr.dmac, &attr->vlan_id);
+           if (ret)
+               goto out;
+           ret = rdma_addr_find_smac_by_sgid(&sgid, attr->smac, NULL);
+           if (ret)
+               goto out;
+       }
+       cmd.attr_mask |= IB_QP_SMAC;
+       if (attr->vlan_id < 0xFFFF)
+           cmd.attr_mask |= IB_QP_VID;
+   }
+
    if (qp->real_qp == qp) {
        ret = qp->device->modify_qp(qp, attr,
            modify_qp_mask(qp->qp_type, cmd.attr_mask), &udata);
-- 
1.7.1



[PATCH V5 1/8] IB/core: Ethernet L2 attributes in verbs/cm structures

2013-11-13 Thread Or Gerlitz
From: Matan Barak mat...@mellanox.com

This patch adds support for Ethernet L2 attributes in the
verbs/cm/cma structures.

When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in the same way that the IB L2 (and the L4 PKEY) attributes are used.

Thus, those attributes were added to the following structures:

* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id

For the path record structure, extra care was taken to avoid the new fields when
packing it into wire format, so we don't break the IB CM and SA wire protocol.

On the active side, the CM fills its internal structures from the path provided
by the ULP; code was added there to take the ETH L2 attributes and place them
into the CM Address Handle (struct cm_av).

On the passive side, the CM fills its internal structures from the WC associated
with the REQ message; code was added there to take the ETH L2 attributes from
the WC.

When the HW driver provides the required ETH L2 attributes in the WC, it
sets the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core code checks
for the presence of these flags, and in their absence does address
resolution from the ib_init_ah_from_wc() helper function.
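
The flag check described above might be pictured roughly as follows. This is
a hedged user-space sketch only: the struct layouts, the flag values, the
requirement that both flags be present, and the resolve_l2_by_lookup() stub
are all assumptions made for illustration and do not reproduce the kernel's
definitions.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative flag values; the real IB_WC_* values may differ. */
enum { WC_WITH_SMAC = 1 << 0, WC_WITH_VLAN = 1 << 1 };

struct wc_sketch {
	unsigned int flags;
	uint8_t smac[6];
	uint16_t vlan_id;
};

struct l2_attrs {
	uint8_t smac[6];
	uint16_t vlan_id;
	int resolved_by_lookup;	/* 1 if the fallback path was used */
};

/* Stand-in for real address resolution; returns fixed dummy values. */
static void resolve_l2_by_lookup(struct l2_attrs *out)
{
	memset(out->smac, 0xaa, sizeof(out->smac));
	out->vlan_id = 0xffff;
	out->resolved_by_lookup = 1;
}

/* Use the driver-provided L2 attributes when flagged, else fall back. */
static void l2_from_wc(const struct wc_sketch *wc, struct l2_attrs *out)
{
	if ((wc->flags & WC_WITH_SMAC) && (wc->flags & WC_WITH_VLAN)) {
		memcpy(out->smac, wc->smac, sizeof(out->smac));
		out->vlan_id = wc->vlan_id;
		out->resolved_by_lookup = 0;
	} else {
		resolve_l2_by_lookup(out);
	}
}
```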

ib_modify_qp_is_ok is also updated to consider the link layer. Some parameters
are mandatory for Ethernet link layer, while they are irrelevant for IB.
Vendor drivers are modified to support the new function signature.

Signed-off-by: Matan Barak mat...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/core/addr.c  |   97 ++-
 drivers/infiniband/core/cm.c|   50 ++
 drivers/infiniband/core/cma.c   |   60 +++--
 drivers/infiniband/core/sa_query.c  |   12 +++-
 drivers/infiniband/core/verbs.c |   43 +++-
 drivers/infiniband/hw/ehca/ehca_qp.c|2 +-
 drivers/infiniband/hw/ipath/ipath_qp.c  |2 +-
 drivers/infiniband/hw/mlx4/qp.c |9 ++-
 drivers/infiniband/hw/mlx5/qp.c |3 +-
 drivers/infiniband/hw/mthca/mthca_qp.c  |3 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c |3 +-
 drivers/infiniband/hw/qib/qib_qp.c  |2 +-
 include/linux/mlx4/device.h |1 +
 include/rdma/ib_addr.h  |   42 +++-
 include/rdma/ib_cm.h|1 +
 include/rdma/ib_pack.h  |1 +
 include/rdma/ib_sa.h|3 +
 include/rdma/ib_verbs.h |   21 +-
 18 files changed, 331 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index e90f2b2..8172d37 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -86,6 +86,8 @@ int rdma_addr_size(struct sockaddr *addr)
 }
 EXPORT_SYMBOL(rdma_addr_size);
 
+static struct rdma_addr_client self;
+
 void rdma_addr_register_client(struct rdma_addr_client *client)
 {
    atomic_set(&client->refcount, 1);
@@ -119,7 +121,8 @@ int rdma_copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev,
 }
 EXPORT_SYMBOL(rdma_copy_addr);
 
-int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
+int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr,
+             u16 *vlan_id)
 {
    struct net_device *dev;
    int ret = -EADDRNOTAVAIL;
@@ -142,6 +145,8 @@ int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
            return ret;
 
        ret = rdma_copy_addr(dev_addr, dev, NULL);
+       if (vlan_id)
+           *vlan_id = rdma_vlan_dev_vlan_id(dev);
        dev_put(dev);
        break;
 
@@ -153,6 +158,8 @@ int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
                      &((struct sockaddr_in6 *) addr)->sin6_addr,
                      dev, 1)) {
                ret = rdma_copy_addr(dev_addr, dev, NULL);
+               if (vlan_id)
+                   *vlan_id = rdma_vlan_dev_vlan_id(dev);
                break;
            }
        }
@@ -238,7 +245,7 @@ static int addr4_resolve(struct sockaddr_in *src_in,
    src_in->sin_addr.s_addr = fl4.saddr;
 
    if (rt->dst.dev->flags & IFF_LOOPBACK) {
-       ret = rdma_translate_ip((struct sockaddr *) dst_in, addr);
+       ret = rdma_translate_ip((struct sockaddr *)dst_in, addr, NULL);
        if (!ret)
            memcpy(addr->dst_dev_addr, addr->src_dev_addr, MAX_ADDR_LEN);
        goto put;
@@ -286,7 +293,7 @@ static int addr6_resolve(struct sockaddr_in6 *src_in,
    }
 
 

[PATCH V5 0/8] IP based RoCE GID Addressing

2013-11-13 Thread Or Gerlitz
changes from V4:

 - addressed feedback regarding the need to be compatible with unmodified user
   space applications/libraries, by adding code in uverbs which does address
   resolution when dealing with Ethernet ports. This is patch #7.

 - removed the patches that deal with uverbs extended commands; they will be
   added later on, such that new applications/libraries can be coded to them.

 - added a patch fixing mlx4_en to have the correct IPv6 link local address.

See below full listing of change-history.

Currently, the IB stack (core + drivers) handles RoCE (IBoE) gids as
encoding the MAC address of the related Ethernet net-device interface and
possibly a VLAN id.
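
For reference, the MAC/VLAN encoding mentioned above can be sketched in user
space as follows. The byte layout follows the ocrdma_build_sgid_mac() helper
that patch 6/8 of this series removes; build_mac_gid() itself is a name
invented for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Build a link-local GID that encodes a MAC address and, optionally,
 * a VLAN ID (layout as in the removed ocrdma helper). */
static void build_mac_gid(uint8_t gid[16], const uint8_t mac[6],
			  bool is_vlan, uint16_t vlan_id)
{
	memset(gid, 0, 16);
	gid[0] = 0xfe;			/* fe80::/64 link-local prefix */
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 2;		/* flip the universal/local bit */
	gid[9] = mac[1];
	gid[10] = mac[2];
	if (is_vlan) {
		gid[11] = (uint8_t)(vlan_id >> 8);
		gid[12] = (uint8_t)(vlan_id & 0xff);
	} else {
		gid[11] = 0xff;		/* plain EUI-64 filler */
		gid[12] = 0xfe;
	}
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}
```

Because the VLAN ID is part of the GID bytes, two nodes that disagree about
whether traffic is vlan-tagged will compute different GIDs for the same port.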

This series changes RoCE GIDs to encode IP addresses (IPv4 + IPv6)
of the that Ethernet interface, under the following reasoning:

1. There are environments where the compute entity that runs the RoCE 
stack is not aware that its traffic is vlan-tagged. This results with that 
node to create/assume wrong GIDs from the view point of a peer node which 
is aware to vlans. 

Note that node here can be physical node connected to Ethernet switch acting 
in 
access mode talking to another node which does vlan insertion/stripping by 
itself.

Or another example is SRIOV Virtual Function which is configured to work in 
VST 
mode (Virtual-Switch-Tagging) such that the hypervisor configures the HW 
eSWitch 
to do vlan insertion for the vPORT representing that function.

2. When RoCE traffic is inspected (mirrored/trapped) in Ethernet switches for 
monitoring and security purposes. It is much more natural for both humans and 
automated utilities (...) to observe IP addresses in a certain offset into RoCE 
frames L3 header vs. MAC/VLANs (which are there anyway in the L2 header of that 
frame, so they are not gone by this change).

3. Some advanced Bonding/Teaming modes, such as balance-alb and balance-tlb,
use multiple underlying devices in parallel, and hence packets always
carry the bond IP address but different streams have different source MACs.
The approach brought by this series is part of what would allow
supporting that for RoCE traffic too.
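
To make the encoding change concrete, here is a minimal user-space sketch of
turning an IP address into a 16-byte GID. It assumes IPv4 addresses are
carried in IPv4-mapped IPv6 form (::ffff:a.b.c.d) and IPv6 addresses are
copied as-is; the series text does not spell out the layout, so treat that as
an assumption. The name ip_to_gid is hypothetical and is not the kernel helper.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical user-space helper (not kernel code): encode an IPv4 or
 * IPv6 socket address as a 16-byte GID. Assumption: IPv4 is stored in
 * IPv4-mapped IPv6 form, i.e. 80 zero bits, 16 one bits, then the address.
 * Returns 0 on success, -1 for an unsupported address family.
 */
static int ip_to_gid(const struct sockaddr *sa, uint8_t gid[16])
{
	switch (sa->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;

		memset(gid, 0, 10);
		gid[10] = 0xff;
		gid[11] = 0xff;
		/* s_addr is already in network byte order */
		memcpy(gid + 12, &sin->sin_addr.s_addr, 4);
		return 0;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;

		memcpy(gid, sin6->sin6_addr.s6_addr, 16);
		return 0;
	}
	default:
		return -1;
	}
}
```

Note that nothing in such a GID depends on the MAC or the VLAN id, which is
why the L2 attributes have to be carried separately.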

The 1st patch adds explicit handling of Ethernet L2 attributes, source/dest
mac and vlan_id, to the kernel IB core, in data-structures and CMA/CM code.
Previously, with MAC/VLAN based addressing, they were encoded in the GIDs,
whereas now they have to be resolved and placed separately from the IP based
GIDs.

The 2nd patch modifies the CMA to cope with IP based GIDs, the 3rd/4th ones do
that for the mlx4_ib driver, and the 5th/6th patches do so for the ocrdma
driver.

The 7th patch adds address resolution to user space applications for RoCE
ports such that these applications keep working unmodified.

The 8th/last patch fixes the mlx4_en driver such that it has a correct IPv6
link local address.

Or.

Full listing of change-history:

changes from V4:

 - addressed feedback re the need to be compatible with non modified user
   space applications/libraries, by adding code in uverbs which does address
   resolution when dealing with Ethernet ports.

 - removed the patches that deal with uverbs extended commands; they will be
   added later on, such that new applications/libraries can be coded to them.
  
changes from V3:

 - dropped the uverbs Infrastructure patch for extensions which is now upstream:
   400dbc9 IB/core: Infrastructure for extensible uverbs commands

 - added ocrdma patch to handle Ethernet L2 parameters, similar to the mlx4
   patch.

 - removed the assumption that the low level driver can provide the source mac
   and vlan in the struct ib_wc returned by ib_poll_cq, and adjusted the
   ib_init_ah_from_wc helper of the IB core accordingly.

  - fixed some vlan related issues in the mlx4 driver

changes from V2:

 - added handling of IP based GIDs in the ocrdma driver - patch #5,
   as a result patches #5-8 of V1 became patches #6-9
  
changes from V1:

 - rebased the series against the latest kernel bits, which include Sean's 
   AF_IB changes to the rdma-cm
 
 - fixed bug in mlx4_ib where reset of the gid table was done for IB ports too
 
 - fixed build warnings and issues pointed by sparse

 - introduced patch #1 which does the explicit handling of Ethernet L2
   attributes, source/dest mac and vlan_id in the kernel data-structures and
   CMA/CM code.

 - use smac when modifying a QP -- find smac in passive side + additional
   fields to address structures

 - add support for the new QP attr in ib_modify_qp_is_ok(), special for
   ll = ETH, and modified all low-level drivers to keep working after that
   change

 -- changes around uverbs:
 - use ah_ext as pointer in qp_attr passed from user space, so this 
   field by itself can be extended in the future
 - for kernel to user command responses comp_mask is moved into the 
   right place which is after the non-extended command response fields
 - fixed bug in copy_qp_attr_ex under which some fields were copied to
   wrong locations
 - use new structure 

[PATCH V5 2/8] IB/CMA: IBoE (RoCE) IP based GID addressing

2013-11-13 Thread Or Gerlitz
From: Moni Shoua mo...@mellanox.com

Currently, the IB core and specifically the RDMA-CM assume that
IBoE (RoCE) gids encode the related Ethernet netdevice interface
MAC address and possibly VLAN id.

Change gids to be treated as encoding the interface IP address.

Since Ethernet layer 2 address parameters are no longer encoded
within gids, we had to extend the InfiniBand address structures (e.g.
ib_ah_attr) with layer 2 address parameters, namely mac and vlan.

Signed-off-by: Moni Shoua mo...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
Signed-off-by: Moni Shoua mo...@mellanox.co.il
---
 drivers/infiniband/core/cma.c  |   22 --
 drivers/infiniband/core/ucma.c |   18 --
 include/rdma/ib_addr.h |   35 ---
 3 files changed, 28 insertions(+), 47 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 45a4010..86adf07 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -365,7 +365,9 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
return -EINVAL;
 
mutex_lock(lock);
-   iboe_addr_get_sgid(dev_addr, iboe_gid);
+   rdma_ip2gid((struct sockaddr *)id_priv-id.route.addr.src_addr,
+   iboe_gid);
+
memcpy(gid, dev_addr-src_dev_addr +
   rdma_addr_gid_offset(dev_addr), sizeof gid);
if (listen_id_priv 
@@ -1923,10 +1925,10 @@ static int cma_resolve_iboe_route(struct 
rdma_id_private *id_priv)
memcpy(route-path_rec-dmac, addr-dev_addr.dst_dev_addr, ETH_ALEN);
memcpy(route-path_rec-smac, ndev-dev_addr, ndev-addr_len);
 
-   iboe_mac_vlan_to_ll(route-path_rec-sgid, addr-dev_addr.src_dev_addr,
-   route-path_rec-vlan_id);
-   iboe_mac_vlan_to_ll(route-path_rec-dgid, addr-dev_addr.dst_dev_addr,
-   route-path_rec-vlan_id);
+   rdma_ip2gid((struct sockaddr *)id_priv-id.route.addr.src_addr,
+   route-path_rec-sgid);
+   rdma_ip2gid((struct sockaddr *)id_priv-id.route.addr.dst_addr,
+   route-path_rec-dgid);
 
route-path_rec-hop_limit = 1;
route-path_rec-reversible = 1;
@@ -2093,6 +2095,7 @@ static void addr_handler(int status, struct sockaddr 
*src_addr,
   RDMA_CM_ADDR_RESOLVED))
goto out;
 
+   memcpy(cma_src_addr(id_priv), src_addr, rdma_addr_size(src_addr));
if (!status  !id_priv-cma_dev)
status = cma_acquire_dev(id_priv, NULL);
 
@@ -2102,10 +2105,8 @@ static void addr_handler(int status, struct sockaddr 
*src_addr,
goto out;
event.event = RDMA_CM_EVENT_ADDR_ERROR;
event.status = status;
-   } else {
-   memcpy(cma_src_addr(id_priv), src_addr, 
rdma_addr_size(src_addr));
+   } else
event.event = RDMA_CM_EVENT_ADDR_RESOLVED;
-   }
 
if (id_priv-id.event_handler(id_priv-id, event)) {
cma_exch(id_priv, RDMA_CM_DESTROYING);
@@ -2586,6 +2587,7 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr 
*addr)
if (ret)
goto err1;
 
+   memcpy(cma_src_addr(id_priv), addr, rdma_addr_size(addr));
if (!cma_any_addr(addr)) {
ret = cma_translate_addr(addr, id-route.addr.dev_addr);
if (ret)
@@ -2596,7 +2598,6 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr 
*addr)
goto err1;
}
 
-   memcpy(cma_src_addr(id_priv), addr, rdma_addr_size(addr));
if (!(id_priv-options  (1  CMA_OPTION_AFONLY))) {
if (addr-sa_family == AF_INET)
id_priv-afonly = 1;
@@ -3325,7 +3326,8 @@ static int cma_iboe_join_multicast(struct rdma_id_private 
*id_priv,
err = -EINVAL;
goto out2;
}
-   iboe_addr_get_sgid(dev_addr, mc-multicast.ib-rec.port_gid);
+   rdma_ip2gid((struct sockaddr *)id_priv-id.route.addr.src_addr,
+   mc-multicast.ib-rec.port_gid);
work-id = id_priv;
work-mc = mc;
INIT_WORK(work-work, iboe_mcast_work_handler);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 826016b..5443d33 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -655,24 +655,14 @@ static void ucma_copy_ib_route(struct 
rdma_ucm_query_route_resp *resp,
 static void ucma_copy_iboe_route(struct rdma_ucm_query_route_resp *resp,
 struct rdma_route *route)
 {
-   struct rdma_dev_addr *dev_addr;
-   struct net_device *dev;
-   u16 vid = 0;
 
resp-num_paths = route-num_paths;
switch (route-num_paths) {
case 0:
-   dev_addr = route-addr.dev_addr;
-   dev = dev_get_by_index(init_net, dev_addr-bound_dev_if);
-   if 

[PATCH V5 5/8] IB/ocrdma: Handle Ethernet L2 parameters for IP based GID addressing

2013-11-13 Thread Or Gerlitz
From: Moni Shoua mo...@mellanox.com

This patch is similar in spirit to the "IB/mlx4: Handle Ethernet L2
parameters for IP based GID addressing" patch. It handles the fact that
IP based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

When building an address handle, instead of parsing the dgid to
get the MAC and VLAN, take them from the address handle attributes.
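
As an illustration of the rule described above, the following stand-alone
sketch picks the destination MAC for an address handle. The struct and
function names (ah_l2_attr, resolve_dmac, MAC_LEN) are hypothetical stand-ins
and not the driver's types. For a multicast dgid it assumes the usual IPv6
multicast to Ethernet mapping (33:33 followed by the low four bytes of the
group address); for unicast it simply copies the MAC supplied in the
attributes.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for the L2 fields carried next to the GID. */
struct ah_l2_attr {
	uint8_t dgid[16];      /* IP based destination GID */
	uint8_t dmac[MAC_LEN]; /* destination MAC resolved by the caller */
	uint16_t vlan_id;      /* VLAN id resolved by the caller */
};

/*
 * Choose the destination MAC without parsing it out of the GID:
 *  - multicast GID (first byte 0xff): derive the multicast MAC
 *  - otherwise: use the MAC that came with the attributes
 */
static void resolve_dmac(const struct ah_l2_attr *attr, uint8_t mac[MAC_LEN])
{
	if (attr->dgid[0] == 0xff) {
		mac[0] = 0x33;
		mac[1] = 0x33;
		memcpy(mac + 2, attr->dgid + 12, 4);
	} else {
		memcpy(mac, attr->dmac, MAC_LEN);
	}
}
```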

Cc: Naresh Gottumukkala bgottumukk...@emulex.com
Signed-off-by: Moni Shoua mo...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/hw/ocrdma/ocrdma.h|   12 
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c |5 +++--
 drivers/infiniband/hw/ocrdma/ocrdma_hw.c |   21 ++---
 drivers/infiniband/hw/ocrdma/ocrdma_hw.h |1 -
 4 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/hw/ocrdma/ocrdma.h 
b/drivers/infiniband/hw/ocrdma/ocrdma.h
index 294dd27..7c001b9 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma.h
@@ -423,5 +423,17 @@ static inline int is_cqe_wr_imm(struct ocrdma_cqe *cqe)
OCRDMA_CQE_WRITE_IMM) ? 1 : 0;
 }
 
+static inline int ocrdma_resolve_dmac(struct ocrdma_dev *dev,
+		struct ib_ah_attr *ah_attr, u8 *mac_addr)
+{
+	struct in6_addr in6;
+
+	memcpy(&in6, ah_attr->grh.dgid.raw, sizeof(in6));
+	if (rdma_is_multicast_addr(&in6))
+		rdma_get_mcast_mac(&in6, mac_addr);
+	else
+		memcpy(mac_addr, ah_attr->dmac, ETH_ALEN);
+	return 0;
+}
 
 #endif
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c 
b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
index ee499d9..bbb7962 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
@@ -49,7 +49,7 @@ static inline int set_av_attr(struct ocrdma_dev *dev, struct 
ocrdma_ah *ah,
 
ah-sgid_index = attr-grh.sgid_index;
 
-   vlan_tag = rdma_get_vlan_id(attr-grh.dgid);
+   vlan_tag = attr-vlan_id;
if (!vlan_tag || (vlan_tag  0xFFF))
vlan_tag = dev-pvid;
if (vlan_tag  (vlan_tag  0x1000)) {
@@ -64,7 +64,8 @@ static inline int set_av_attr(struct ocrdma_dev *dev, struct 
ocrdma_ah *ah,
eth_sz = sizeof(struct ocrdma_eth_basic);
}
memcpy(eth.smac[0], dev-nic_info.mac_addr[0], ETH_ALEN);
-   status = ocrdma_resolve_dgid(dev, attr-grh.dgid, eth.dmac[0]);
+   memcpy(eth.dmac[0], attr-dmac, ETH_ALEN);
+   status = ocrdma_resolve_dmac(dev, attr, eth.dmac[0]);
if (status)
return status;
status = ocrdma_query_gid(dev-ibdev, 1, attr-grh.sgid_index,
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_hw.c 
b/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
index 56bf32f..1664d64 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
@@ -2076,23 +2076,6 @@ mbx_err:
return status;
 }
 
-int ocrdma_resolve_dgid(struct ocrdma_dev *dev, union ib_gid *dgid,
-   u8 *mac_addr)
-{
-   struct in6_addr in6;
-
-   memcpy(in6, dgid, sizeof in6);
-   if (rdma_is_multicast_addr(in6)) {
-   rdma_get_mcast_mac(in6, mac_addr);
-   } else if (rdma_link_local_addr(in6)) {
-   rdma_get_ll_mac(in6, mac_addr);
-   } else {
-   pr_err(%s() fail to resolve mac_addr.\n, __func__);
-   return -EINVAL;
-   }
-   return 0;
-}
-
 static int ocrdma_set_av_params(struct ocrdma_qp *qp,
struct ocrdma_modify_qp *cmd,
struct ib_qp_attr *attrs)
@@ -2126,14 +2109,14 @@ static int ocrdma_set_av_params(struct ocrdma_qp *qp,
 
qp-sgid_idx = ah_attr-grh.sgid_index;
memcpy(cmd-params.sgid[0], sgid.raw[0], sizeof(cmd-params.sgid));
-   ocrdma_resolve_dgid(qp-dev, ah_attr-grh.dgid, mac_addr[0]);
+   ocrdma_resolve_dmac(qp-dev, ah_attr, mac_addr[0]);
cmd-params.dmac_b0_to_b3 = mac_addr[0] | (mac_addr[1]  8) |
(mac_addr[2]  16) | (mac_addr[3]  24);
/* convert them to LE format. */
ocrdma_cpu_to_le32(cmd-params.dgid[0], sizeof(cmd-params.dgid));
ocrdma_cpu_to_le32(cmd-params.sgid[0], sizeof(cmd-params.sgid));
cmd-params.vlan_dmac_b4_to_b5 = mac_addr[4] | (mac_addr[5]  8);
-   vlan_id = rdma_get_vlan_id(sgid);
+   vlan_id = ah_attr-vlan_id;
if (vlan_id  (vlan_id  0x1000)) {
cmd-params.vlan_dmac_b4_to_b5 |=
vlan_id  OCRDMA_QP_PARAMS_VLAN_SHIFT;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_hw.h 
b/drivers/infiniband/hw/ocrdma/ocrdma_hw.h
index f2a89d4..82fe332 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_hw.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_hw.h
@@ -94,7 +94,6 @@ void ocrdma_ring_cq_db(struct ocrdma_dev *, u16 cq_id, bool 
armed,
 int ocrdma_mbx_get_link_speed(struct ocrdma_dev *dev, u8 *lnk_speed);
 int 

[PATCH V5 3/8] IB/mlx4: Use IBoE (RoCE) IP based GIDs in the port GID table

2013-11-13 Thread Or Gerlitz
From: Moni Shoua mo...@mellanox.com

Currently, the mlx4 driver sets IBoE (RoCE) gids to encode the related
Ethernet netdevice interface MAC address and possibly VLAN id.

Change this scheme such that gids encode interface IP addresses
(both IPv4 and IPv6).

This requires learning which IP addresses are in use by a netdevice
associated with the HCA port, formatting them as gids
and adding them to the port gid table. Further, add and delete
address events are caught to maintain the gid table accordingly.

Associated IP addresses may belong to a master of an Ethernet netdevice
on top of that port so this should be considered when building and
maintaining the gid table.
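
The bookkeeping described above can be pictured with a small stand-alone
sketch of a per-port gid table that reacts to address add and delete events.
Everything here (gid_table, GID_TABLE_LEN, the function names, the
"all-zero gid means free slot" convention) is hypothetical and chosen for
illustration; it is not the mlx4 data structure.

```c
#include <stdint.h>
#include <string.h>

#define GID_TABLE_LEN 8 /* hypothetical table size */

struct gid_table {
	uint8_t gid[GID_TABLE_LEN][16]; /* all-zero entry == free slot */
};

static int gid_is_zero(const uint8_t g[16])
{
	static const uint8_t zero[16];

	return memcmp(g, zero, 16) == 0;
}

/* Return the index holding g, or -1 if it is absent (or g is all-zero). */
static int gid_table_find(const struct gid_table *t, const uint8_t g[16])
{
	if (gid_is_zero(g))
		return -1;
	for (int i = 0; i < GID_TABLE_LEN; i++)
		if (memcmp(t->gid[i], g, 16) == 0)
			return i;
	return -1;
}

/* "Address added" event: reuse an existing entry or take the first free slot. */
static int gid_table_add(struct gid_table *t, const uint8_t g[16])
{
	int idx;

	if (gid_is_zero(g))
		return -1;
	idx = gid_table_find(t, g);
	if (idx >= 0)
		return idx;
	for (int i = 0; i < GID_TABLE_LEN; i++) {
		if (gid_is_zero(t->gid[i])) {
			memcpy(t->gid[i], g, 16);
			return i;
		}
	}
	return -1; /* table full */
}

/* "Address deleted" event: clear the slot so it can be reused. */
static int gid_table_del(struct gid_table *t, const uint8_t g[16])
{
	int idx = gid_table_find(t, g);

	if (idx < 0)
		return -1;
	memset(t->gid[idx], 0, 16);
	return 0;
}
```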

Signed-off-by: Moni Shoua mo...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/infiniband/hw/mlx4/main.c|  474 --
 drivers/infiniband/hw/mlx4/mlx4_ib.h |3 +
 2 files changed, 334 insertions(+), 143 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index f061264..c5ecec2 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -39,6 +39,8 @@
 #include linux/inetdevice.h
 #include linux/rtnetlink.h
 #include linux/if_vlan.h
+#include net/ipv6.h
+#include net/addrconf.h
 
 #include rdma/ib_smi.h
 #include rdma/ib_user_verbs.h
@@ -790,7 +792,6 @@ static int add_gid_entry(struct ib_qp *ibqp, union ib_gid 
*gid)
 int mlx4_ib_add_mc(struct mlx4_ib_dev *mdev, struct mlx4_ib_qp *mqp,
   union ib_gid *gid)
 {
-   u8 mac[6];
struct net_device *ndev;
int ret = 0;
 
@@ -804,11 +805,7 @@ int mlx4_ib_add_mc(struct mlx4_ib_dev *mdev, struct 
mlx4_ib_qp *mqp,
spin_unlock(mdev-iboe.lock);
 
if (ndev) {
-   rdma_get_mcast_mac((struct in6_addr *)gid, mac);
-   rtnl_lock();
-   dev_mc_add(mdev-iboe.netdevs[mqp-port - 1], mac);
ret = 1;
-   rtnl_unlock();
dev_put(ndev);
}
 
@@ -1031,6 +1028,8 @@ static int mlx4_ib_mcg_attach(struct ib_qp *ibqp, union 
ib_gid *gid, u16 lid)
struct mlx4_ib_qp *mqp = to_mqp(ibqp);
u64 reg_id;
struct mlx4_ib_steering *ib_steering = NULL;
+   enum mlx4_protocol prot = (gid-raw[1] == 0x0e) ?
+   MLX4_PROT_IB_IPV4 : MLX4_PROT_IB_IPV6;
 
if (mdev-dev-caps.steering_mode ==
MLX4_STEERING_MODE_DEVICE_MANAGED) {
@@ -1042,7 +1041,7 @@ static int mlx4_ib_mcg_attach(struct ib_qp *ibqp, union 
ib_gid *gid, u16 lid)
err = mlx4_multicast_attach(mdev-dev, mqp-mqp, gid-raw, mqp-port,
!!(mqp-flags 
   MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK),
-   MLX4_PROT_IB_IPV6, reg_id);
+   prot, reg_id);
if (err)
goto err_malloc;
 
@@ -1061,7 +1060,7 @@ static int mlx4_ib_mcg_attach(struct ib_qp *ibqp, union 
ib_gid *gid, u16 lid)
 
 err_add:
mlx4_multicast_detach(mdev-dev, mqp-mqp, gid-raw,
- MLX4_PROT_IB_IPV6, reg_id);
+ prot, reg_id);
 err_malloc:
kfree(ib_steering);
 
@@ -1089,10 +1088,11 @@ static int mlx4_ib_mcg_detach(struct ib_qp *ibqp, union 
ib_gid *gid, u16 lid)
int err;
struct mlx4_ib_dev *mdev = to_mdev(ibqp-device);
struct mlx4_ib_qp *mqp = to_mqp(ibqp);
-   u8 mac[6];
struct net_device *ndev;
struct mlx4_ib_gid_entry *ge;
u64 reg_id = 0;
+   enum mlx4_protocol prot = (gid-raw[1] == 0x0e) ?
+   MLX4_PROT_IB_IPV4 : MLX4_PROT_IB_IPV6;
 
if (mdev-dev-caps.steering_mode ==
MLX4_STEERING_MODE_DEVICE_MANAGED) {
@@ -1115,7 +1115,7 @@ static int mlx4_ib_mcg_detach(struct ib_qp *ibqp, union 
ib_gid *gid, u16 lid)
}
 
err = mlx4_multicast_detach(mdev-dev, mqp-mqp, gid-raw,
-   MLX4_PROT_IB_IPV6, reg_id);
+   prot, reg_id);
if (err)
return err;
 
@@ -1127,13 +1127,8 @@ static int mlx4_ib_mcg_detach(struct ib_qp *ibqp, union 
ib_gid *gid, u16 lid)
if (ndev)
dev_hold(ndev);
spin_unlock(mdev-iboe.lock);
-   rdma_get_mcast_mac((struct in6_addr *)gid, mac);
-   if (ndev) {
-   rtnl_lock();
-   dev_mc_del(mdev-iboe.netdevs[ge-port - 1], mac);
-   rtnl_unlock();
+   if (ndev)
dev_put(ndev);
-   }
list_del(ge-list);
kfree(ge);
} else
@@ -1229,20 +1224,6 @@ static struct device_attribute *mlx4_class_attributes[] 
= {
dev_attr_board_id
 };
 
-static void mlx4_addrconf_ifid_eui48(u8 *eui, u16 vlan_id, struct net_device 
*dev)
-{
-   memcpy(eui, dev-dev_addr, 3);

Re: failure to get gid with rdma_bind_addr with = 3.10 kernels

2013-11-13 Thread Or Gerlitz
On Wed, Nov 13, 2013 at 10:15 PM, Hefty, Sean sean.he...@intel.com wrote:

>> Sean, how do we continue here? the patch worked for Christoph, so are
>> you going to merge it into librdmacm or does it need more work?
>
> I will merge it, since it fixes the issue.

cool


RE: [PATCH v3 00/10] Introduce Signature feature

2013-11-13 Thread Hefty, Sean
> The patch series is around for couple of weeks already and went
> through the review of Sean and Bart, with all their feedback being
> applied. Also Sagi and Co enhanced krping to fully cover (and test...)
> the proposed API and driver implementation @
> git://beany.openfabrics.org/~sgrimberg/krping.git

Somewhat separate from this specific patch, this is my concern.

There are continual requests to modify the kernel verbs interfaces.  These 
requests boil down to exposing proprietary capabilities to the latest version 
of some vendor's hardware.  In turn, these hardware specific knobs bleed into 
the kernel clients.

At the very least, it seems that there should be some sort of discussion if 
this is a desirable property of the kernel verbs interface, and if this is the 
architecture that the kernel should continue to pursue.  Or, is there an 
alternative way of providing the same ability of coding ULPs to specific HW 
features, versus plugging every new feature into 'post send'?

- Sean


Re: RDMA and memory ordering

2013-11-13 Thread Anuj Kalia
On Wed, Nov 13, 2013 at 2:23 PM, Gabriele Svelto
gabriele.sve...@gmail.com wrote:
> On 12/11/2013 11:31, Anuj Kalia wrote:
>>
>> I believe the atomic operations would be a lot more expensive than
>> reads/writes. I'm targeting maximum performance so I don't want to
>> look that way yet.
>
> This sounds like premature optimization to me which as you know is the root
> of all evil :)
> Try using the atomic primitives, they have been designed specifically for
> this kind of scenario, and then measure their performance in the real world
> before spending time on optimizing something that might just be fast enough
> for your purposes (and far more robust). If you're already polling your CQs
> those operations will be *very* fast.


I'm working on a project where I'm trying to extract the maximum IOPS
from a server for an application. If atomic operations are even 2-X
slower than RDMA writes (which I'd expect because they involve a read
and a write), I can't use them. However, it would be interesting to
find their performance. I'll try that.

Thanks!
  Gabriele


Re: RDMA and memory ordering

2013-11-13 Thread Anuj Kalia
Jason,

Thanks again :).

I found another similar thread:
http://www.spinics.net/lists/linux-rdma/msg02709.html. The conclusion
there was that although Infiniband specs don't specify any ordering of
writes, many people assume left-to-right ordering anyway. There is no
mention of reads though.

So I did the micro experiments and I found that although writes follow
the left-right ordering, reads do not. More details follow:

1. Write ordering experiment:
1.a. In the nth iteration, a client writes a buffer containing C ~
1024 integers (each equal to 'n') to the server. The client sleeps for
2000 us between iterations.
1.b. The server busily polls for a change to the Cth integer. When the
Cth integer changes from i to i+1, it checks if the entire buffer is
equal to i+1. The check always passes (I've tried over 15 million
checks). The test fails if the polled integer is not the rightmost
integer.

2. Read ordering experiment:
2.a. In the nth iteration, the server writes 'n' to C ~ 1024 integers
in a local buffer. The server does the write in reverse order
(starting from index C-1). It then sleeps for 2000 us.
2.b. The client continuously reads the buffer. When the Cth integer in
the read sink changes from i to i+1, it checks if all the integers in
the buffer are i+1. This check fails (although rarely).

This shows that reads are NOT ordered left to right. The read pattern
that I'd expect is ... (where H corresponds to i+1). However,
I can see patterns like HH..L...HH (L corresponds to i). This is
wrong because we don't expect i's to be lingering around after the
first integer has become i+1 (under the false assumption that reads
happen left-to-right).

Curiously, whenever there are stale i's, they are always such that the
contiguous chunk of i's would fit inside a cacheline. I'm seeing 16
i's and 48 i's usually.
2.c. The check always succeeds if C is 16 (the buffer fits inside a
cacheline). I've done 15 million checks, will do much more tonight.
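
The consistency check used in both experiments can be sketched as below. This
is only the buffer inspection, not the RDMA code: the function names are
hypothetical, and the caller is assumed to have already noticed that the
polled integer changed to the new value. The second helper reports the longest
contiguous run of stale integers, which is the quantity compared against a
cacheline above (16 four-byte integers per 64-byte line, assuming those
sizes).

```c
#include <stddef.h>

/* Number of entries that do not yet hold the new value 'fresh'. */
static size_t count_stale(const int *buf, size_t n, int fresh)
{
	size_t stale = 0;

	for (size_t i = 0; i < n; i++)
		if (buf[i] != fresh)
			stale++;
	return stale;
}

/* Length of the longest contiguous run of entries that differ from 'fresh'. */
static size_t longest_stale_run(const int *buf, size_t n, int fresh)
{
	size_t best = 0, run = 0;

	for (size_t i = 0; i < n; i++) {
		if (buf[i] != fresh) {
			run++;
			if (run > best)
				best = run;
		} else {
			run = 0;
		}
	}
	return best;
}
```

A check passes when count_stale() returns 0 for the snapshot taken right
after the polled integer changed.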

So, another question: why are the reads unordered while the writes are
ordered? I think by now we can assume write ordering (my experiments +
MVAPICH uses it). Can the PCI reorder the reads issued by the HCA?

On Wed, Nov 13, 2013 at 2:09 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
> On Wed, Nov 13, 2013 at 02:55:53AM -0400, Anuj Kalia wrote:
>
>> I don't know what you meant by burst writes: do you mean several RDMA
>> writes or one large write? I'm concerned with the order in which data
>
> A RDMA write will be split up by the HCA into a burst of PCI MemoryWr
> operations.
>
>> I guess now is the time I run lots of micro experiments. Thanks a lot
>> for the help everyone.
>
> Careful, experiments can't prove that order is guaranteed to be
> present, they can only show if it certainly isn't.

Aah, unfortunately that's true. However, I ran experiments anyway. If
people have been assuming an ordering on writes, I guess I can check
if reads are ordered too.

> Intel hardware is very good at hiding ordering issues 99% of the time,
> but in many cases there can be a stress'd condition that will show a
> different result.

Hmm.. I'm willing to run billions of iterations of the test. That
should give some confidence.

> Jason


[PATCH] IB/CMA: Fix to handle global/non-linklocal IPv6 address in cma_check_linklocal()

2013-11-13 Thread Somnath Kotur
Even if the addr is not a linklocal address, the code treats it as such
and assigns the bound dev addr to the scope id of the address which is invalid.
Fix by checking if it's a link local address first and return 0 if not.
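
The intended decision order can be sketched in user space as follows. This is
an illustration only: check_linklocal is a hypothetical name, it uses the
POSIX IN6_IS_ADDR_LINKLOCAL macro instead of the kernel's ipv6_addr_type(),
and it leaves out whatever the real function does with the bound device
afterwards.

```c
#include <errno.h>
#include <netinet/in.h>

/*
 * Sketch of the check order:
 *  - not link-local: nothing to validate, return 0
 *  - link-local without a scope id: reject with -EINVAL
 *  - link-local with a scope id: accept
 */
static int check_linklocal(const struct sockaddr_in6 *sin6)
{
	if (!IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
		return 0;
	if (!sin6->sin6_scope_id)
		return -EINVAL;
	return 0;
}
```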

Signed-off-by: Somnath Kotur somnath.ko...@emulex.com
---
 drivers/infiniband/core/cma.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index dab4b41..5c9f1ad 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2466,6 +2466,9 @@ static int cma_check_linklocal(struct rdma_dev_addr *dev_addr,
 		return 0;
 
 	sin6 = (struct sockaddr_in6 *) addr;
+
+	if (!(ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_LINKLOCAL))
+		return 0;
 	if ((ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_LINKLOCAL) &&
 	    !sin6->sin6_scope_id)
 			return -EINVAL;
-- 
1.7.1



[PATCH] IB/CMA: Initialize hop_limit to a known default macro instead of 1 as is done currently.

2013-11-13 Thread Somnath Kotur
Signed-off-by: Somnath Kotur somnath.ko...@emulex.com
---
 drivers/infiniband/core/cma.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 5c9f1ad..2c03db2 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1884,7 +1884,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private 
*id_priv)
iboe_mac_vlan_to_ll(route-path_rec-sgid, 
addr-dev_addr.src_dev_addr, vid);
iboe_mac_vlan_to_ll(route-path_rec-dgid, 
addr-dev_addr.dst_dev_addr, vid);
 
-   route-path_rec-hop_limit = 1;
+   route-path_rec-hop_limit = IPV6_DEFAULT_HOPLIMIT;
route-path_rec-reversible = 1;
route-path_rec-pkey = cpu_to_be16(0x);
route-path_rec-mtu_selector = IB_SA_EQ;
-- 
1.7.1



Re: [PATCH v3 00/10] Introduce Signature feature

2013-11-13 Thread Or Gerlitz

On 14/11/2013 02:19, Hefty, Sean wrote:

>> The patch series is around for couple of weeks already and went through the
>> review of Sean and Bart, with all their feedback being applied. Also Sagi and
>> Co enhanced krping to fully cover (and test...) the proposed API and driver
>> implementation
>
> Somewhat separate from this specific patch, this is my concern.
>
> There are continual requests to modify the kernel verbs interfaces.  These
> requests boil down to exposing proprietary capabilities to the latest version
> of some vendor's hardware.  In turn, these hardware specific knobs bleed into
> the kernel clients.
>
> At the very least, it seems that there should be some sort of discussion if
> this is a desirable property of the kernel verbs interface, and if this is the
> architecture that the kernel should continue to pursue.  Or, is there an
> alternative way of providing the same ability of coding ULPs to specific HW
> features, versus plugging every new feature into 'post send'?


Sean,

Being concrete + re-iterating and expanding what I wrote you earlier on
the V1 thread @ http://marc.info/?l=linux-rdma&m=138314853203389&w=2 when
you said


Sean> Maybe we should rethink the approach of exposing low-level
Sean> hardware constructs to every distinct feature of every vendor's
Sean> latest hardware directly to the kernel ULPs.


To begin with, T10 DIF **is** an industry standard, which is to be used in
production storage systems. The feature here is T10 DIF acceleration for
upstream kernel storage drivers such as iSER/SRP/FCoE initiators/targets
that use RDMA and are included in commercial distributions which are
used by customers. Note that this or a similar feature is supported by some
FC cards too, so we want RDMA to be competitive.


This work is part of larger efforts which are under way nowadays in other
parts of the kernel, such as the block layer, the upstream kernel target
and more, to support T10; it's just the RDMA part.


Sagi and team made a great effort to expose an API which isn't tied to a
specific HW/Firmware API. In that respect, the verbs API is coupled
with industry standards and by no means with specific HW features. Just
as a quick example, the specific driver/card (mlx5 / ConnectIB) for which
the new verbs are implemented uses three objects for its T10 support,
named BSF, KLM and PSV. You can be sure, and please check us on this, that
there is no sign of them in the verbs API; they only live within the
mlx5 driver.


If you see a vendor specific feature/construct that appears in the 
proposed verbs API changes, let us know.


> [...] versus plugging every new feature into 'post send'?

It's a new feature indeed, but it's a feature which comes into play when
submitting RDMA work-requests to the HCA and for performance reasons must
be subject to pipe-lining in the form of batched posting, and hence is a
very good fit as a sub operation of post-send.

Sean> There are continual requests to modify the kernel verbs
Sean> interfaces. These requests boil down to exposing proprietary
Sean> capabilities to the latest version of some vendor's hardware. In
Sean> turn, these hardware specific knobs bleed into the kernel clients.


non-T10 examples (please) ?!

Or.