Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-21 Thread Christoph Hellwig
On Mon, Dec 21, 2015 at 08:37:26AM +0200, Leon Romanovsky wrote:
> You are right and it is a preferred way for me too, however the
> downside of such change will be one of two:
> 1. Change this structure only => we will have style mix of BITs and
> shifts in the same file. IMHO it looks awful.
> 2. Change the whole file => the work with "git blame" will be less
> straightforward.

Honestly, the BIT macros are horrible, and anyone who thinks they're useful
really should read a book on computer architecture and one on C.

Also the capabilities are used by userspace, so they will need to move
to a uapi header sooner or later, where this stupid macro isn't even
available.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/10] iSER support for remote invalidate

2015-12-21 Thread Or Gerlitz
On Mon, Dec 21, 2015 at 6:20 AM, Nicholas A. Bellinger
 wrote:

> Applied to target-pending/for-next as v4.5-rc1 material, along with
> Reviewed-by tags from HCH.

Hi Nic, thanks for stepping in and picking that up.

Sagi, are you going to spin an increment in the initiator version?

Or.


Re: [PATCH v2 00/10] iSER support for remote invalidate

2015-12-21 Thread Sagi Grimberg

Hey Nic,


Applied to target-pending/for-next as v4.5-rc1 material, along with
Reviewed-by tags from HCH.


Thanks for picking this up!

Note that I expect this patchset will conflict with Doug's pull request
in case he's taking Christoph's device_attr and CQ abstraction patches
for 4.5.

Doug, would you work with Nic on this?

Sagi.


Re: [PATCH v2 00/10] iSER support for remote invalidate

2015-12-21 Thread Sagi Grimberg



Sagi, are you going to spin an increment in the initiator version?


I don't know if it's worth a driver version update?

In case we do increment, I can send an incremental patch instead of
re-spinning the entire series.


[PATCH 0/3] IB core: 64 bit counter support V3

2015-12-21 Thread Christoph Lameter
V2->V3
  - Also add support for NOIETF counter mode where we have 64 bit
counters but not the multicast/unicast counters.
  - Add Reviewed-by's from Hal.

V1->V2
  - Add detection of the capability for 64 bit counter support
  - Lots of improvements as a result of suggestions by Hal Rosenstock.

Currently we only use 32 bits for the packet and byte counters. Extended
counters have been available for some time, but we have no support for
them yet upstream. We keep having issues with 32-bit counters wrapping;
the byte counter in particular can wrap frequently (as in multiple times
per minute).

This patchset adds 4 new counters (for full extended mode) and updates 4 of
the 32-bit counters to use 64-bit sizes (for NOIETF and full extended mode)
so that they no longer wrap.

If the device does not support 64-bit counters, only the original 32-bit
counters will be visible.

This patchset can be pulled from my git repo on kernel.org

git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/rdma.git counter_64bit

Thanks to Hal Rosenstock and Ira Weiny for reviewing this patchset.



[PATCH 3/3] Display extended counter set if available

2015-12-21 Thread Christoph Lameter
V2->V3: Add a check for NOIETF mode and create a special table
  for that case.

Check whether the extended counters are available and, if so,
create the proper extended and additional counters.

Reviewed-by: Hal Rosenstock 
Signed-off-by: Christoph Lameter 
---
 drivers/infiniband/core/sysfs.c | 104 +++-
 include/rdma/ib_pma.h   |   1 +
 2 files changed, 104 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 34dcc23..b179fca 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -320,6 +320,13 @@ struct port_table_attribute port_pma_attr_##_name = {  \
.attr_id = IB_PMA_PORT_COUNTERS ,   \
 }
 
+#define PORT_PMA_ATTR_EXT(_name, _width, _offset)  \
+struct port_table_attribute port_pma_attr_ext_##_name = {  \
+   .attr  = __ATTR(_name, S_IRUGO, show_pma_counter, NULL),\
+   .index = (_offset) | ((_width) << 16),  \
+   .attr_id = IB_PMA_PORT_COUNTERS_EXT ,   \
+}
+
 /*
  * Get a Perfmgmt MAD block of data.
  * Returns error code or the number of bytes retrieved.
@@ -400,6 +407,11 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
ret = sprintf(buf, "%u\n",
  be32_to_cpup((__be32 *)data));
break;
+   case 64:
+   ret = sprintf(buf, "%llu\n",
+   be64_to_cpup((__be64 *)data));
+   break;
+
default:
ret = 0;
}
@@ -424,6 +436,18 @@ static PORT_PMA_ATTR(port_rcv_data , 13, 32, 224);
 static PORT_PMA_ATTR(port_xmit_packets , 14, 32, 256);
 static PORT_PMA_ATTR(port_rcv_packets  , 15, 32, 288);
 
+/*
+ * Counters added by extended set
+ */
+static PORT_PMA_ATTR_EXT(port_xmit_data, 64,  64);
+static PORT_PMA_ATTR_EXT(port_rcv_data , 64, 128);
+static PORT_PMA_ATTR_EXT(port_xmit_packets , 64, 192);
+static PORT_PMA_ATTR_EXT(port_rcv_packets  , 64, 256);
+static PORT_PMA_ATTR_EXT(unicast_xmit_packets  , 64, 320);
+static PORT_PMA_ATTR_EXT(unicast_rcv_packets   , 64, 384);
+static PORT_PMA_ATTR_EXT(multicast_xmit_packets, 64, 448);
+static PORT_PMA_ATTR_EXT(multicast_rcv_packets , 64, 512);
+
 static struct attribute *pma_attrs[] = {
&port_pma_attr_symbol_error.attr.attr,
&port_pma_attr_link_error_recovery.attr.attr,
@@ -444,11 +468,65 @@ static struct attribute *pma_attrs[] = {
NULL
 };
 
+static struct attribute *pma_attrs_ext[] = {
+   &port_pma_attr_symbol_error.attr.attr,
+   &port_pma_attr_link_error_recovery.attr.attr,
+   &port_pma_attr_link_downed.attr.attr,
+   &port_pma_attr_port_rcv_errors.attr.attr,
+   &port_pma_attr_port_rcv_remote_physical_errors.attr.attr,
+   &port_pma_attr_port_rcv_switch_relay_errors.attr.attr,
+   &port_pma_attr_port_xmit_discards.attr.attr,
+   &port_pma_attr_port_xmit_constraint_errors.attr.attr,
+   &port_pma_attr_port_rcv_constraint_errors.attr.attr,
+   &port_pma_attr_local_link_integrity_errors.attr.attr,
+   &port_pma_attr_excessive_buffer_overrun_errors.attr.attr,
+   &port_pma_attr_VL15_dropped.attr.attr,
+   &port_pma_attr_ext_port_xmit_data.attr.attr,
+   &port_pma_attr_ext_port_rcv_data.attr.attr,
+   &port_pma_attr_ext_port_xmit_packets.attr.attr,
+   &port_pma_attr_ext_port_rcv_packets.attr.attr,
+   &port_pma_attr_ext_unicast_rcv_packets.attr.attr,
+   &port_pma_attr_ext_unicast_xmit_packets.attr.attr,
+   &port_pma_attr_ext_multicast_rcv_packets.attr.attr,
+   &port_pma_attr_ext_multicast_xmit_packets.attr.attr,
+   NULL
+};
+
+static struct attribute *pma_attrs_noietf[] = {
+   &port_pma_attr_symbol_error.attr.attr,
+   &port_pma_attr_link_error_recovery.attr.attr,
+   &port_pma_attr_link_downed.attr.attr,
+   &port_pma_attr_port_rcv_errors.attr.attr,
+   &port_pma_attr_port_rcv_remote_physical_errors.attr.attr,
+   &port_pma_attr_port_rcv_switch_relay_errors.attr.attr,
+   &port_pma_attr_port_xmit_discards.attr.attr,
+   &port_pma_attr_port_xmit_constraint_errors.attr.attr,
+   &port_pma_attr_port_rcv_constraint_errors.attr.attr,
+   &port_pma_attr_local_link_integrity_errors.attr.attr,
+   &port_pma_attr_excessive_buffer_overrun_errors.attr.attr,
+   &port_pma_attr_VL15_dropped.attr.attr,
+   &port_pma_attr_ext_port_xmit_data.attr.attr,
+   &port_pma_attr_ext_port_rcv_data.attr.attr,
+   &port_pma_attr_ext_port_xmit_packets.attr.attr,
+   &port_pma_attr_ext_port_rcv_packets.attr.attr,
+   NULL
+};
+
 static struct attribute_group pma_group = {
.name  = "counters",
.attrs  = pma_attrs
 };
 
+static struct attr

[PATCH 2/3] Specify attribute_id in port_table_attribute

2015-12-21 Thread Christoph Lameter
Add an attr_id field to port_table_attribute, since we will soon need
a different attr_id for the extended counter attributes.

Reviewed-by: Hal Rosenstock 
Signed-off-by: Christoph Lameter 
---
 drivers/infiniband/core/sysfs.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index acefe85..34dcc23 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -39,6 +39,7 @@
 #include 
 
 #include 
+#include 
 
 struct ib_port {
struct kobject kobj;
@@ -65,6 +66,7 @@ struct port_table_attribute {
struct port_attribute   attr;
charname[8];
int index;
+   int attr_id;
 };
 
 static ssize_t port_attr_show(struct kobject *kobj,
@@ -314,7 +316,8 @@ static ssize_t show_port_pkey(struct ib_port *p, struct port_attribute *attr,
 #define PORT_PMA_ATTR(_name, _counter, _width, _offset)\
 struct port_table_attribute port_pma_attr_##_name = {  \
.attr  = __ATTR(_name, S_IRUGO, show_pma_counter, NULL),\
-   .index = (_offset) | ((_width) << 16) | ((_counter) << 24)  \
+   .index = (_offset) | ((_width) << 16) | ((_counter) << 24), \
+   .attr_id = IB_PMA_PORT_COUNTERS ,   \
 }
 
 /*
@@ -376,7 +379,7 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
ssize_t ret;
u8 data[8];
 
-   ret = get_perf_mad(p->ibdev, p->port_num, cpu_to_be16(0x12), &data,
+   ret = get_perf_mad(p->ibdev, p->port_num, tab_attr->attr_id, &data,
40 + offset / 8, sizeof(data));
if (ret < 0)
return sprintf(buf, "N/A (no PMA)\n");
-- 
2.5.0




[PATCH 1/3] Create get_perf_mad function in sysfs.c

2015-12-21 Thread Christoph Lameter
Create a new get_perf_mad() function to retrieve performance management
data, factored out of the existing code in show_pma_counter().

Reviewed-by: Hal Rosenstock 
Signed-off-by: Christoph Lameter 
---
 drivers/infiniband/core/sysfs.c | 62 ++---
 1 file changed, 40 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index b1f37d4..acefe85 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -317,21 +317,21 @@ struct port_table_attribute port_pma_attr_##_name = { \
.index = (_offset) | ((_width) << 16) | ((_counter) << 24)  \
 }
 
-static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
-   char *buf)
+/*
+ * Get a Perfmgmt MAD block of data.
+ * Returns error code or the number of bytes retrieved.
+ */
+static int get_perf_mad(struct ib_device *dev, int port_num, int attr,
+   void *data, int offset, size_t size)
 {
-   struct port_table_attribute *tab_attr =
-   container_of(attr, struct port_table_attribute, attr);
-   int offset = tab_attr->index & 0xffff;
-   int width  = (tab_attr->index >> 16) & 0xff;
-   struct ib_mad *in_mad  = NULL;
-   struct ib_mad *out_mad = NULL;
+   struct ib_mad *in_mad;
+   struct ib_mad *out_mad;
size_t mad_size = sizeof(*out_mad);
u16 out_mad_pkey_index = 0;
ssize_t ret;
 
-   if (!p->ibdev->process_mad)
-   return sprintf(buf, "N/A (no PMA)\n");
+   if (!dev->process_mad)
+   return -ENOSYS;
 
in_mad  = kzalloc(sizeof *in_mad, GFP_KERNEL);
out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL);
@@ -344,12 +344,12 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
in_mad->mad_hdr.mgmt_class= IB_MGMT_CLASS_PERF_MGMT;
in_mad->mad_hdr.class_version = 1;
in_mad->mad_hdr.method= IB_MGMT_METHOD_GET;
-   in_mad->mad_hdr.attr_id   = cpu_to_be16(0x12); /* PortCounters */
+   in_mad->mad_hdr.attr_id   = attr;
 
-   in_mad->data[41] = p->port_num; /* PortSelect field */
+   in_mad->data[41] = port_num;/* PortSelect field */
 
-   if ((p->ibdev->process_mad(p->ibdev, IB_MAD_IGNORE_MKEY,
-p->port_num, NULL, NULL,
+   if ((dev->process_mad(dev, IB_MAD_IGNORE_MKEY,
+port_num, NULL, NULL,
 (const struct ib_mad_hdr *)in_mad, mad_size,
 (struct ib_mad_hdr *)out_mad, &mad_size,
 &out_mad_pkey_index) &
@@ -358,31 +358,49 @@ static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
ret = -EINVAL;
goto out;
}
+   memcpy(data, out_mad->data + offset, size);
+   ret = size;
+out:
+   kfree(in_mad);
+   kfree(out_mad);
+   return ret;
+}
+
+static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr,
+   char *buf)
+{
+   struct port_table_attribute *tab_attr =
+   container_of(attr, struct port_table_attribute, attr);
+   int offset = tab_attr->index & 0xffff;
+   int width  = (tab_attr->index >> 16) & 0xff;
+   ssize_t ret;
+   u8 data[8];
+
+   ret = get_perf_mad(p->ibdev, p->port_num, cpu_to_be16(0x12), &data,
+   40 + offset / 8, sizeof(data));
+   if (ret < 0)
+   return sprintf(buf, "N/A (no PMA)\n");
 
switch (width) {
case 4:
-   ret = sprintf(buf, "%u\n", (out_mad->data[40 + offset / 8] >>
+   ret = sprintf(buf, "%u\n", (*data >>
(4 - (offset % 8))) & 0xf);
break;
case 8:
-   ret = sprintf(buf, "%u\n", out_mad->data[40 + offset / 8]);
+   ret = sprintf(buf, "%u\n", *data);
break;
case 16:
ret = sprintf(buf, "%u\n",
- be16_to_cpup((__be16 *)(out_mad->data + 40 + offset / 8)));
+ be16_to_cpup((__be16 *)data));
break;
case 32:
ret = sprintf(buf, "%u\n",
- be32_to_cpup((__be32 *)(out_mad->data + 40 + offset / 8)));
+ be32_to_cpup((__be32 *)data));
break;
default:
ret = 0;
}
 
-out:
-   kfree(in_mad);
-   kfree(out_mad);
-
return ret;
 }
 
-- 
2.5.0




[PATCH 1/2] Isolate common list remove code

2015-12-21 Thread Christoph Lameter
Code cleanup to remove multicast specific code from ipoib_main.c

The removal of a list of multicast groups occurs in three places.
Create a new function ipoib_mcast_remove_list(). Use this new
function in ipoib_main.c too.
That in turn allows the dropping of two functions that were
exported from ipoib_multicast.c for expiration of mc groups.

Reviewed-by: Iraq Weiny 
Signed-off-by: Christoph Lameter 
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |  3 +--
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |  7 ++-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 24 ++--
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 3ede103..989c409 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -495,7 +495,6 @@ void ipoib_dev_cleanup(struct net_device *dev);
 void ipoib_mcast_join_task(struct work_struct *work);
 void ipoib_mcast_carrier_on_task(struct work_struct *work);
 void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb);
-void ipoib_mcast_free(struct ipoib_mcast *mc);
 
 void ipoib_mcast_restart_task(struct work_struct *work);
 int ipoib_mcast_start_thread(struct net_device *dev);
@@ -549,7 +548,7 @@ void ipoib_path_iter_read(struct ipoib_path_iter *iter,
 
 int ipoib_mcast_attach(struct net_device *dev, u16 mlid,
   union ib_gid *mgid, int set_qkey);
-int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast);
+void ipoib_mcast_remove_list(struct net_device *dev, struct list_head *remove_list);
 struct ipoib_mcast *__ipoib_mcast_find(struct net_device *dev, void *mgid);
 
 int ipoib_init_qp(struct net_device *dev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7d32818..483ff20 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1150,7 +1150,7 @@ static void __ipoib_reap_neigh(struct ipoib_dev_priv *priv)
unsigned long flags;
int i;
LIST_HEAD(remove_list);
-   struct ipoib_mcast *mcast, *tmcast;
+   struct ipoib_mcast *mcast;
struct net_device *dev = priv->dev;
 
if (test_bit(IPOIB_STOP_NEIGH_GC, &priv->flags))
@@ -1207,10 +1207,7 @@ static void __ipoib_reap_neigh(struct ipoib_dev_priv *priv)
 
 out_unlock:
spin_unlock_irqrestore(&priv->lock, flags);
-   list_for_each_entry_safe(mcast, tmcast, &remove_list, list) {
-   ipoib_mcast_leave(dev, mcast);
-   ipoib_mcast_free(mcast);
-   }
+   ipoib_mcast_remove_list(dev, &remove_list);
 }
 
 static void ipoib_reap_neigh(struct work_struct *work)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index f357ca6..8acb420a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -106,7 +106,7 @@ static void __ipoib_mcast_schedule_join_thread(struct ipoib_dev_priv *priv,
queue_delayed_work(priv->wq, &priv->mcast_task, 0);
 }
 
-void ipoib_mcast_free(struct ipoib_mcast *mcast)
+static void ipoib_mcast_free(struct ipoib_mcast *mcast)
 {
struct net_device *dev = mcast->dev;
int tx_dropped = 0;
@@ -677,7 +677,7 @@ int ipoib_mcast_stop_thread(struct net_device *dev)
return 0;
 }
 
-int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast)
+static int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
int ret = 0;
@@ -704,6 +704,16 @@ int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast)
return 0;
 }
 
+void ipoib_mcast_remove_list(struct net_device *dev, struct list_head *remove_list)
+{
+   struct ipoib_mcast *mcast, *tmcast;
+
+   list_for_each_entry_safe(mcast, tmcast, remove_list, list) {
+   ipoib_mcast_leave(dev, mcast);
+   ipoib_mcast_free(mcast);
+   }
+}
+
 void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
@@ -810,10 +820,7 @@ void ipoib_mcast_dev_flush(struct net_device *dev)
if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
wait_for_completion(&mcast->done);
 
-   list_for_each_entry_safe(mcast, tmcast, &remove_list, list) {
-   ipoib_mcast_leave(dev, mcast);
-   ipoib_mcast_free(mcast);
-   }
+   ipoib_mcast_remove_list(dev, &remove_list);
 }
 
 static int ipoib_mcast_addr_is_valid(const u8 *addr, const u8 *broadcast)
@@ -939,10 +946,7 @@ void ipoib_mcast_restart_task(struct work_struct *work)
if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
wait_for_completion(&mcast->done);
 
-   list_for_each_entry_sa

[PATCH 2/2] Move multicast specific code out of ipoib_main.c

2015-12-21 Thread Christoph Lameter
V1->V2:
- Rename function as requested by Ira

Code cleanup to move multicast specific code that checks for
a sendonly join to ipoib_multicast.c. This allows the removal
of the export of __ipoib_mcast_find().

Signed-off-by: Christoph Lameter 
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |  3 ++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  | 13 +
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 21 -
 3 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 989c409..a924933 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -549,7 +549,8 @@ void ipoib_path_iter_read(struct ipoib_path_iter *iter,
 int ipoib_mcast_attach(struct net_device *dev, u16 mlid,
   union ib_gid *mgid, int set_qkey);
 void ipoib_mcast_remove_list(struct net_device *dev, struct list_head *remove_list);
-struct ipoib_mcast *__ipoib_mcast_find(struct net_device *dev, void *mgid);
+void ipoib_check_and_add_mcast_sendonly(struct ipoib_dev_priv *priv, u8 *mgid,
+   struct list_head *remove_list);
 
 int ipoib_init_qp(struct net_device *dev);
 int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 483ff20..620d9ca 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1150,7 +1150,6 @@ static void __ipoib_reap_neigh(struct ipoib_dev_priv *priv)
unsigned long flags;
int i;
LIST_HEAD(remove_list);
-   struct ipoib_mcast *mcast;
struct net_device *dev = priv->dev;
 
if (test_bit(IPOIB_STOP_NEIGH_GC, &priv->flags))
@@ -1179,18 +1178,8 @@ static void __ipoib_reap_neigh(struct ipoib_dev_priv *priv)
   lockdep_is_held(&priv->lock))) != NULL) {
/* was the neigh idle for two GC periods */
if (time_after(neigh_obsolete, neigh->alive)) {
-   u8 *mgid = neigh->daddr + 4;
 
-   /* Is this multicast ? */
-   if (*mgid == 0xff) {
-   mcast = __ipoib_mcast_find(dev, mgid);
-
-   if (mcast && test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) {
-   list_del(&mcast->list);
-   rb_erase(&mcast->rb_node, &priv->multicast_tree);
-   list_add_tail(&mcast->list, &remove_list);
-   }
-   }
+   ipoib_check_and_add_mcast_sendonly(priv, neigh->daddr + 4, &remove_list);
 
rcu_assign_pointer(*np,
   
rcu_dereference_protected(neigh->hnext,
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 8acb420a..ab79b87 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -153,7 +153,7 @@ static struct ipoib_mcast *ipoib_mcast_alloc(struct net_device *dev,
return mcast;
 }
 
-struct ipoib_mcast *__ipoib_mcast_find(struct net_device *dev, void *mgid)
+static struct ipoib_mcast *__ipoib_mcast_find(struct net_device *dev, void *mgid)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
struct rb_node *n = priv->multicast_tree.rb_node;
@@ -704,6 +704,25 @@ static int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast)
return 0;
 }
 
+/*
+ * Check if the multicast group is sendonly. If so remove it from the maps
+ * and add to the remove list
+ */
+void ipoib_check_and_add_mcast_sendonly(struct ipoib_dev_priv *priv, u8 *mgid,
+   struct list_head *remove_list)
+{
+   /* Is this multicast ? */
+   if (*mgid == 0xff) {
+   struct ipoib_mcast *mcast = __ipoib_mcast_find(priv->dev, mgid);
+
+   if (mcast && test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) {
+   list_del(&mcast->list);
+   rb_erase(&mcast->rb_node, &priv->multicast_tree);
+   list_add_tail(&mcast->list, remove_list);
+   }
+   }
+}
+
 void ipoib_mcast_remove_list(struct net_device *dev, struct list_head *remove_list)
 {
struct ipoib_mcast *mcast, *tmcast;
-- 
2.5.0




[PATCH 0/2] IB multicast cleanup patches V2

2015-12-21 Thread Christoph Lameter
V1->V2
 - Add Reviewed by's for first patch from Ira Weiny
 - Change name of ipoib_check_mcast_sendonly() to
ipoib_check_and_add_mcast_sendonly() as requested by Ira

This patchset cleans up the code a bit after the last round of multicast
patches related to the sendonly join logic. Some of the bits of code
landed in ipoib_main.c instead of ipoib_multicast.c.

- Move the multicast bits into that file so that everything is neatly together
- Reduce the number of functions exported from ipoib_multicast.c

This patchset can be retrieved from a git repo on kernel.org via

git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/rdma.git cleanup



Re: [PATCH 1/2] Isolate common list remove code

2015-12-21 Thread Leon Romanovsky
On Mon, Dec 21, 2015 at 08:42:53AM -0600, Christoph Lameter wrote:
> Code cleanup to remove multicast specific code from ipoib_main.c
> 
> The removal of a list of multicast groups occurs in three places.
> Create a new function ipoib_mcast_remove_list(). Use this new
> function in ipoib_main.c too.
> That in turn allows the dropping of two functions that were
> exported from ipoib_multicast.c for expiration of mc groups.
> 
> Reviewed-by: Iraq Weiny 
Iraq Weiny --> Ira Weiny


> +void ipoib_mcast_remove_list(struct net_device *dev, struct list_head *remove_list)
Will it be beneficial to inline this function?
> +{
> + struct ipoib_mcast *mcast, *tmcast;
> +
> + list_for_each_entry_safe(mcast, tmcast, remove_list, list) {
> + ipoib_mcast_leave(dev, mcast);
> + ipoib_mcast_free(mcast);
> + }
> +}
> +


[PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-21 Thread Matan Barak
Previously, cma_match_net_dev called cma_protocol_roce, which
tried to verify that the IB device uses the RoCE protocol. However,
if the rdma_id didn't have a bound port, it used the first port
of the device.

In VPI systems, the first port might be an IB port while the second
one could be an Ethernet port. This made requests for unbound rdma_ids
that come from the Ethernet port fail.
Fix this by passing the port of the request and checking that port
of the device.

Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE')
Signed-off-by: Matan Barak 
---
Hi Doug,

This patch fixes a bug in VPI systems, where the first port is configured
as IB and the second one is configured as Ethernet.
In this case, if the rdma_id isn't bound to a port, cma_match_net_dev
will try to verify that the first port is a RoCE port and fail.
This is fixed by passing the port of the incoming request.

Regards,
Matan

 drivers/infiniband/core/cma.c |   16 +---
 1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d2d5d00..c8a265c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1265,15 +1265,17 @@ static bool cma_protocol_roce(const struct rdma_cm_id *id)
return cma_protocol_roce_dev_port(device, port_num);
 }
 
-static bool cma_match_net_dev(const struct rdma_id_private *id_priv,
- const struct net_device *net_dev)
+static bool cma_match_net_dev(const struct rdma_cm_id *id,
+ const struct net_device *net_dev,
+ u8 port_num)
 {
-   const struct rdma_addr *addr = &id_priv->id.route.addr;
+   const struct rdma_addr *addr = &id->route.addr;
 
if (!net_dev)
/* This request is an AF_IB request or a RoCE request */
-   return addr->src_addr.ss_family == AF_IB ||
-  cma_protocol_roce(&id_priv->id);
+   return (!id->port_num || id->port_num == port_num) &&
+  (addr->src_addr.ss_family == AF_IB ||
+   cma_protocol_roce_dev_port(id->device, port_num));
 
return !addr->dev_addr.bound_dev_if ||
   (net_eq(dev_net(net_dev), addr->dev_addr.net) &&
@@ -1295,13 +1297,13 @@ static struct rdma_id_private *cma_find_listener(
hlist_for_each_entry(id_priv, &bind_list->owners, node) {
if (cma_match_private_data(id_priv, ib_event->private_data)) {
if (id_priv->id.device == cm_id->device &&
-   cma_match_net_dev(id_priv, net_dev))
+   cma_match_net_dev(&id_priv->id, net_dev, req->port))
return id_priv;
list_for_each_entry(id_priv_dev,
&id_priv->listen_list,
listen_list) {
if (id_priv_dev->id.device == cm_id->device &&
-   cma_match_net_dev(id_priv_dev, net_dev))
+   cma_match_net_dev(&id_priv_dev->id, net_dev, req->port))
return id_priv_dev;
}
}
-- 
1.7.1



Re: [PATCH 1/2] Isolate common list remove code

2015-12-21 Thread Christoph Lameter
On Mon, 21 Dec 2015, Leon Romanovsky wrote:

> On Mon, Dec 21, 2015 at 08:42:53AM -0600, Christoph Lameter wrote:
> > Code cleanup to remove multicast specific code from ipoib_main.c
> >
> > The removal of a list of multicast groups occurs in three places.
> > Create a new function ipoib_mcast_remove_list(). Use this new
> > function in ipoib_main.c too.
> > That in turn allows the dropping of two functions that were
> > exported from ipoib_multicast.c for expiration of mc groups.
> >
> > Reviewed-by: Iraq Weiny 
> Iraq Weiny --> Ira Weiny

Ohh.. Bad typo.

> > +void ipoib_mcast_remove_list(struct net_device *dev, struct list_head *remove_list)
> Will it be beneficial to inline this function?

As far as I know it is not run in a latency critical context and the code
is too heavy for that. In particular we are calling other functions that
are not inlined.

> > +{
> > +   struct ipoib_mcast *mcast, *tmcast;
> > +
> > +   list_for_each_entry_safe(mcast, tmcast, remove_list, list) {
> > +   ipoib_mcast_leave(dev, mcast);
> > +   ipoib_mcast_free(mcast);
> > +   }
> > +}
> > +
>


Re: RoCE passive side failures on 4.4-rc5

2015-12-21 Thread Matan Barak
On Sun, Dec 20, 2015 at 9:29 AM, Or Gerlitz  wrote:
> On 12/17/2015 3:58 PM, Or Gerlitz wrote:
>>
>> Using 4.4-rc5+ [1] and **not** applying any of the patches I sent today,
>> I noted that RoCE passive side isn't working (rdma-cm, ibv_rc_pingpong
>> works).
>>
>> I have two nodes in ConnectX3 VPI config (port1 IB and port2 Eth), the one
>> with the 4.4-rc5 kernel can act as both (rping) client/server for IB links
>> but only (rping) client for RoCE.
>>
>> I tried both inter-node and loopback runs, in all cases, the client side
>> gets CM reject with reason 28, see [2], tried both iser and rping. Eth (ICMP, TCP)
>> works OK.
>
>
>> OK, small progress: when I force the Eth link type on my IB port (using mlx4
>> sysfs), things work.
>
> You should be able to reproduce it on your non-VPI systems the other way
> around, by
> forcing IB link type on one of the Eth ports and see the failure.
>
> I saw the same behavior with both 4.4-rc2 and 4.4-rc5.
>
> Or.

I've posted a patch that fixes that, please take a look at [1].

Regards,
Matan

[1] https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg30777.html



Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-21 Thread ira.weiny
On Mon, Dec 21, 2015 at 12:03:46AM -0800, Christoph Hellwig wrote:
> On Mon, Dec 21, 2015 at 08:37:26AM +0200, Leon Romanovsky wrote:
> > You are right and it is a preferred way for me too, however the
> > downside of such change will be one of two:
> > 1. Change this structure only => we will have style mix of BITs and
> > shifts in the same file. IMHO it looks awful.
> > 2. Change the whole file => the work with "git blame" will be less
> > straightforward.
> 
> Honestly, the BIT macros are horrible, and anyone who thinks they're useful
> really should read a book on computer architecture and one on C.

It would be nice if we did not have to do this for staging then.  Also
perhaps it should be removed from checkpatch --strict?

I'm not a big fan of everything checkpatch does, this check being one of them,
but Leon was trying to do the right thing here.

Where are the guidelines for when one can ignore checkpatch and when one
cannot?  It would be nice to know when we can "be developers" vs "be robots to
some tool".

I await Doug's guidance.

Ira



Re: RoCE passive side failures on 4.4-rc5

2015-12-21 Thread Doug Ledford
On 12/21/2015 10:08 AM, Matan Barak wrote:
> On Sun, Dec 20, 2015 at 9:29 AM, Or Gerlitz  wrote:
>> On 12/17/2015 3:58 PM, Or Gerlitz wrote:
>>>
>>> Using 4.4-rc5+ [1] and **not** applying any of the patches I sent today,
>>> I noted that RoCE passive side isn't working (rdma-cm, ibv_rc_pingpong
>>> works).
>>>
>>> I have two nodes in ConnectX3 VPI config (port1 IB and port2 Eth), the one
>>> with the 4.4-rc5 kernel can act as both (rping) client/server for IB links
>>> but only (rping) client for RoCE.
>>>
>>> I tried both inter-node and loopback runs, in all cases, the client side
>>> gets CM
>>> reject with reason 28, see [2], tried both iser and rping. Eth (ICMP, TCP)
>>> works OK.
>>
>>
>> OK, small progress: when I force the Eth link type on my IB port (using mlx4
>> sysfs), things work.
>>
>> You should be able to reproduce it on your non-VPI systems the other way
>> around, by forcing IB link type on one of the Eth ports and seeing the failure.
>>
>> I saw the same behavior with both 4.4-rc2 and 4.4-rc5.
>>
>> Or.
> 
> I've posted a patch that fixes that, please take a look at [1].
> 
> Regards,
> Matan
> 
> [1] https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg30777.html

I've been seeing this too in my 4.4-rc testing, so I'll have test
results today.


-- 
Doug Ledford 
  GPG KeyID: 0E572FDD






Re: [PATCH 1/3] Create get_perf_mad function in sysfs.c

2015-12-21 Thread ira.weiny
On Mon, Dec 21, 2015 at 08:20:27AM -0600, Christoph Lameter wrote:
> Create a new function to retrieve performance management
> data from the existing code in get_pma_counter().
> 
> Reviewed-by: Hal Rosenstock 

Reviewed-by: Ira Weiny 

> Signed-off-by: Christoph Lameter 
> ---
>  drivers/infiniband/core/sysfs.c | 62 
> ++---
>  1 file changed, 40 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index b1f37d4..acefe85 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -317,21 +317,21 @@ struct port_table_attribute port_pma_attr_##_name = {   
> \
>   .index = (_offset) | ((_width) << 16) | ((_counter) << 24)  \
>  }
>  
> -static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute 
> *attr,
> - char *buf)
> +/*
> + * Get a Perfmgmt MAD block of data.
> + * Returns error code or the number of bytes retrieved.
> + */
> +static int get_perf_mad(struct ib_device *dev, int port_num, int attr,
> + void *data, int offset, size_t size)
>  {
> - struct port_table_attribute *tab_attr =
> - container_of(attr, struct port_table_attribute, attr);
> - int offset = tab_attr->index & 0x;
> - int width  = (tab_attr->index >> 16) & 0xff;
> - struct ib_mad *in_mad  = NULL;
> - struct ib_mad *out_mad = NULL;
> + struct ib_mad *in_mad;
> + struct ib_mad *out_mad;
>   size_t mad_size = sizeof(*out_mad);
>   u16 out_mad_pkey_index = 0;
>   ssize_t ret;
>  
> - if (!p->ibdev->process_mad)
> - return sprintf(buf, "N/A (no PMA)\n");
> + if (!dev->process_mad)
> + return -ENOSYS;
>  
>   in_mad  = kzalloc(sizeof *in_mad, GFP_KERNEL);
>   out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL);
> @@ -344,12 +344,12 @@ static ssize_t show_pma_counter(struct ib_port *p, 
> struct port_attribute *attr,
>   in_mad->mad_hdr.mgmt_class= IB_MGMT_CLASS_PERF_MGMT;
>   in_mad->mad_hdr.class_version = 1;
>   in_mad->mad_hdr.method= IB_MGMT_METHOD_GET;
> - in_mad->mad_hdr.attr_id   = cpu_to_be16(0x12); /* PortCounters */
> + in_mad->mad_hdr.attr_id   = attr;
>  
> - in_mad->data[41] = p->port_num; /* PortSelect field */
> + in_mad->data[41] = port_num;/* PortSelect field */
>  
> - if ((p->ibdev->process_mad(p->ibdev, IB_MAD_IGNORE_MKEY,
> -  p->port_num, NULL, NULL,
> + if ((dev->process_mad(dev, IB_MAD_IGNORE_MKEY,
> +  port_num, NULL, NULL,
>(const struct ib_mad_hdr *)in_mad, mad_size,
>(struct ib_mad_hdr *)out_mad, &mad_size,
>&out_mad_pkey_index) &
> @@ -358,31 +358,49 @@ static ssize_t show_pma_counter(struct ib_port *p, 
> struct port_attribute *attr,
>   ret = -EINVAL;
>   goto out;
>   }
> + memcpy(data, out_mad->data + offset, size);
> + ret = size;
> +out:
> + kfree(in_mad);
> + kfree(out_mad);
> + return ret;
> +}
> +
> +static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute 
> *attr,
> + char *buf)
> +{
> + struct port_table_attribute *tab_attr =
> + container_of(attr, struct port_table_attribute, attr);
> + int offset = tab_attr->index & 0x;
> + int width  = (tab_attr->index >> 16) & 0xff;
> + ssize_t ret;
> + u8 data[8];
> +
> + ret = get_perf_mad(p->ibdev, p->port_num, cpu_to_be16(0x12), &data,
> + 40 + offset / 8, sizeof(data));
> + if (ret < 0)
> + return sprintf(buf, "N/A (no PMA)\n");
>  
>   switch (width) {
>   case 4:
> - ret = sprintf(buf, "%u\n", (out_mad->data[40 + offset / 8] >>
> + ret = sprintf(buf, "%u\n", (*data >>
>   (4 - (offset % 8))) & 0xf);
>   break;
>   case 8:
> - ret = sprintf(buf, "%u\n", out_mad->data[40 + offset / 8]);
> + ret = sprintf(buf, "%u\n", *data);
>   break;
>   case 16:
>   ret = sprintf(buf, "%u\n",
> -   be16_to_cpup((__be16 *)(out_mad->data + 40 + 
> offset / 8)));
> +   be16_to_cpup((__be16 *)data));
>   break;
>   case 32:
>   ret = sprintf(buf, "%u\n",
> -   be32_to_cpup((__be32 *)(out_mad->data + 40 + 
> offset / 8)));
> +   be32_to_cpup((__be32 *)data));
>   break;
>   default:
>   ret = 0;
>   }
>  
> -out:
> - kfree(in_mad);
> - kfree(out_mad);
> -
>   return ret;
>  }
>  
> -- 
> 2.5.0
> 
> 

Re: [PATCH 2/3] Specify attribute_id in port_table_attribute

2015-12-21 Thread ira.weiny
On Mon, Dec 21, 2015 at 08:20:28AM -0600, Christoph Lameter wrote:
> Add the attr_id on port_table_attribute since we will have to add
> a different port_table_attribute for the extended attribute soon.
> 
> Reviewed-by: Hal Rosenstock 

Reviewed-by: Ira Weiny 

> Signed-off-by: Christoph Lameter 
> ---
>  drivers/infiniband/core/sysfs.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index acefe85..34dcc23 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -39,6 +39,7 @@
>  #include 
>  
>  #include 
> +#include 
>  
>  struct ib_port {
>   struct kobject kobj;
> @@ -65,6 +66,7 @@ struct port_table_attribute {
>   struct port_attribute   attr;
>   charname[8];
>   int index;
> + int attr_id;
>  };
>  
>  static ssize_t port_attr_show(struct kobject *kobj,
> @@ -314,7 +316,8 @@ static ssize_t show_port_pkey(struct ib_port *p, struct 
> port_attribute *attr,
>  #define PORT_PMA_ATTR(_name, _counter, _width, _offset)  
> \
>  struct port_table_attribute port_pma_attr_##_name = {
> \
>   .attr  = __ATTR(_name, S_IRUGO, show_pma_counter, NULL),\
> - .index = (_offset) | ((_width) << 16) | ((_counter) << 24)  \
> + .index = (_offset) | ((_width) << 16) | ((_counter) << 24), \
> + .attr_id = IB_PMA_PORT_COUNTERS ,   \
>  }
>  
>  /*
> @@ -376,7 +379,7 @@ static ssize_t show_pma_counter(struct ib_port *p, struct 
> port_attribute *attr,
>   ssize_t ret;
>   u8 data[8];
>  
> - ret = get_perf_mad(p->ibdev, p->port_num, cpu_to_be16(0x12), &data,
> + ret = get_perf_mad(p->ibdev, p->port_num, tab_attr->attr_id, &data,
>   40 + offset / 8, sizeof(data));
>   if (ret < 0)
>   return sprintf(buf, "N/A (no PMA)\n");
> -- 
> 2.5.0
> 
> 


Re: [PATCH 3/3] Display extended counter set if available

2015-12-21 Thread ira.weiny
On Mon, Dec 21, 2015 at 08:20:29AM -0600, Christoph Lameter wrote:
> V2->V3: Add check for NOIETF mode and create special table
>   for that case.
> 
> Check if the extended counters are available and if so
> create the proper extended and additional counters.
> 
> Reviewed-by: Hal Rosenstock 
> Signed-off-by: Christoph Lameter 
> ---
>  drivers/infiniband/core/sysfs.c | 104 
> +++-
>  include/rdma/ib_pma.h   |   1 +
>  2 files changed, 104 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index 34dcc23..b179fca 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -320,6 +320,13 @@ struct port_table_attribute port_pma_attr_##_name = {
> \
>   .attr_id = IB_PMA_PORT_COUNTERS ,   \
>  }
>  
> +#define PORT_PMA_ATTR_EXT(_name, _width, _offset)\
> +struct port_table_attribute port_pma_attr_ext_##_name = {\
> + .attr  = __ATTR(_name, S_IRUGO, show_pma_counter, NULL),\
> + .index = (_offset) | ((_width) << 16),  \
> + .attr_id = IB_PMA_PORT_COUNTERS_EXT ,   \
> +}
> +
>  /*
>   * Get a Perfmgmt MAD block of data.
>   * Returns error code or the number of bytes retrieved.
> @@ -400,6 +407,11 @@ static ssize_t show_pma_counter(struct ib_port *p, 
> struct port_attribute *attr,
>   ret = sprintf(buf, "%u\n",
> be32_to_cpup((__be32 *)data));
>   break;
> + case 64:
> + ret = sprintf(buf, "%llu\n",
> + be64_to_cpup((__be64 *)data));
> + break;
> +
>   default:
>   ret = 0;
>   }
> @@ -424,6 +436,18 @@ static PORT_PMA_ATTR(port_rcv_data   , 
> 13, 32, 224);
>  static PORT_PMA_ATTR(port_xmit_packets   , 14, 32, 256);
>  static PORT_PMA_ATTR(port_rcv_packets, 15, 32, 288);
>  
> +/*
> + * Counters added by extended set
> + */
> +static PORT_PMA_ATTR_EXT(port_xmit_data  , 64,  64);
> +static PORT_PMA_ATTR_EXT(port_rcv_data   , 64, 128);
> +static PORT_PMA_ATTR_EXT(port_xmit_packets   , 64, 192);
> +static PORT_PMA_ATTR_EXT(port_rcv_packets, 64, 256);
> +static PORT_PMA_ATTR_EXT(unicast_xmit_packets, 64, 320);
> +static PORT_PMA_ATTR_EXT(unicast_rcv_packets , 64, 384);
> +static PORT_PMA_ATTR_EXT(multicast_xmit_packets  , 64, 448);
> +static PORT_PMA_ATTR_EXT(multicast_rcv_packets   , 64, 512);
> +
>  static struct attribute *pma_attrs[] = {
>   &port_pma_attr_symbol_error.attr.attr,
>   &port_pma_attr_link_error_recovery.attr.attr,
> @@ -444,11 +468,65 @@ static struct attribute *pma_attrs[] = {
>   NULL
>  };
>  
> +static struct attribute *pma_attrs_ext[] = {
> + &port_pma_attr_symbol_error.attr.attr,
> + &port_pma_attr_link_error_recovery.attr.attr,
> + &port_pma_attr_link_downed.attr.attr,
> + &port_pma_attr_port_rcv_errors.attr.attr,
> + &port_pma_attr_port_rcv_remote_physical_errors.attr.attr,
> + &port_pma_attr_port_rcv_switch_relay_errors.attr.attr,
> + &port_pma_attr_port_xmit_discards.attr.attr,
> + &port_pma_attr_port_xmit_constraint_errors.attr.attr,
> + &port_pma_attr_port_rcv_constraint_errors.attr.attr,
> + &port_pma_attr_local_link_integrity_errors.attr.attr,
> + &port_pma_attr_excessive_buffer_overrun_errors.attr.attr,
> + &port_pma_attr_VL15_dropped.attr.attr,
> + &port_pma_attr_ext_port_xmit_data.attr.attr,
> + &port_pma_attr_ext_port_rcv_data.attr.attr,
> + &port_pma_attr_ext_port_xmit_packets.attr.attr,
> + &port_pma_attr_ext_port_rcv_packets.attr.attr,
> + &port_pma_attr_ext_unicast_rcv_packets.attr.attr,
> + &port_pma_attr_ext_unicast_xmit_packets.attr.attr,
> + &port_pma_attr_ext_multicast_rcv_packets.attr.attr,
> + &port_pma_attr_ext_multicast_xmit_packets.attr.attr,
> + NULL
> +};
> +
> +static struct attribute *pma_attrs_noietf[] = {
> + &port_pma_attr_symbol_error.attr.attr,
> + &port_pma_attr_link_error_recovery.attr.attr,
> + &port_pma_attr_link_downed.attr.attr,
> + &port_pma_attr_port_rcv_errors.attr.attr,
> + &port_pma_attr_port_rcv_remote_physical_errors.attr.attr,
> + &port_pma_attr_port_rcv_switch_relay_errors.attr.attr,
> + &port_pma_attr_port_xmit_discards.attr.attr,
> + &port_pma_attr_port_xmit_constraint_errors.attr.attr,
> + &port_pma_attr_port_rcv_constraint_errors.attr.attr,
> + &port_pma_attr_local_link_integrity_errors.attr.attr,
> + &port_pma_attr_excessive_buffer_overrun_errors.attr.attr,
> + &port_pma_attr_VL15_dropped.attr.attr,
> + &port_pma_attr_ext_port_xmit_data.attr.attr,
> + &port_pma_attr_ext_port_rcv_data.attr.attr,
> + &port_pma_attr_ext_port_xmit_packets.attr.attr,

Re: [PATCH 3/3] Display extended counter set if available

2015-12-21 Thread Hal Rosenstock
On 12/21/2015 12:53 PM, ira.weiny wrote:
> On Mon, Dec 21, 2015 at 08:20:29AM -0600, Christoph Lameter wrote:
>> V2->V3: Add check for NOIETF mode and create special table
>>   for that case.
>>
>> Check if the extended counters are available and if so
>> create the proper extended and additional counters.
>>
>> Reviewed-by: Hal Rosenstock 
>> Signed-off-by: Christoph Lameter 
>> ---
>>  drivers/infiniband/core/sysfs.c | 104 
>> +++-
>>  include/rdma/ib_pma.h   |   1 +
>>  2 files changed, 104 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/core/sysfs.c 
>> b/drivers/infiniband/core/sysfs.c
>> index 34dcc23..b179fca 100644
>> --- a/drivers/infiniband/core/sysfs.c
>> +++ b/drivers/infiniband/core/sysfs.c
>> @@ -320,6 +320,13 @@ struct port_table_attribute port_pma_attr_##_name = {   
>> \
>>  .attr_id = IB_PMA_PORT_COUNTERS ,   \
>>  }
>>  
>> +#define PORT_PMA_ATTR_EXT(_name, _width, _offset)   \
>> +struct port_table_attribute port_pma_attr_ext_##_name = {   \
>> +.attr  = __ATTR(_name, S_IRUGO, show_pma_counter, NULL),\
>> +.index = (_offset) | ((_width) << 16),  \
>> +.attr_id = IB_PMA_PORT_COUNTERS_EXT ,   \
>> +}
>> +
>>  /*
>>   * Get a Perfmgmt MAD block of data.
>>   * Returns error code or the number of bytes retrieved.
>> @@ -400,6 +407,11 @@ static ssize_t show_pma_counter(struct ib_port *p, 
>> struct port_attribute *attr,
>>  ret = sprintf(buf, "%u\n",
>>be32_to_cpup((__be32 *)data));
>>  break;
>> +case 64:
>> +ret = sprintf(buf, "%llu\n",
>> +be64_to_cpup((__be64 *)data));
>> +break;
>> +
>>  default:
>>  ret = 0;
>>  }
>> @@ -424,6 +436,18 @@ static PORT_PMA_ATTR(port_rcv_data  , 
>> 13, 32, 224);
>>  static PORT_PMA_ATTR(port_xmit_packets  , 14, 32, 256);
>>  static PORT_PMA_ATTR(port_rcv_packets   , 15, 32, 288);
>>  
>> +/*
>> + * Counters added by extended set
>> + */
>> +static PORT_PMA_ATTR_EXT(port_xmit_data , 64,  64);
>> +static PORT_PMA_ATTR_EXT(port_rcv_data  , 64, 128);
>> +static PORT_PMA_ATTR_EXT(port_xmit_packets  , 64, 192);
>> +static PORT_PMA_ATTR_EXT(port_rcv_packets   , 64, 256);
>> +static PORT_PMA_ATTR_EXT(unicast_xmit_packets   , 64, 320);
>> +static PORT_PMA_ATTR_EXT(unicast_rcv_packets, 64, 384);
>> +static PORT_PMA_ATTR_EXT(multicast_xmit_packets , 64, 448);
>> +static PORT_PMA_ATTR_EXT(multicast_rcv_packets  , 64, 512);
>> +
>>  static struct attribute *pma_attrs[] = {
>>  &port_pma_attr_symbol_error.attr.attr,
>>  &port_pma_attr_link_error_recovery.attr.attr,
>> @@ -444,11 +468,65 @@ static struct attribute *pma_attrs[] = {
>>  NULL
>>  };
>>  
>> +static struct attribute *pma_attrs_ext[] = {
>> +&port_pma_attr_symbol_error.attr.attr,
>> +&port_pma_attr_link_error_recovery.attr.attr,
>> +&port_pma_attr_link_downed.attr.attr,
>> +&port_pma_attr_port_rcv_errors.attr.attr,
>> +&port_pma_attr_port_rcv_remote_physical_errors.attr.attr,
>> +&port_pma_attr_port_rcv_switch_relay_errors.attr.attr,
>> +&port_pma_attr_port_xmit_discards.attr.attr,
>> +&port_pma_attr_port_xmit_constraint_errors.attr.attr,
>> +&port_pma_attr_port_rcv_constraint_errors.attr.attr,
>> +&port_pma_attr_local_link_integrity_errors.attr.attr,
>> +&port_pma_attr_excessive_buffer_overrun_errors.attr.attr,
>> +&port_pma_attr_VL15_dropped.attr.attr,
>> +&port_pma_attr_ext_port_xmit_data.attr.attr,
>> +&port_pma_attr_ext_port_rcv_data.attr.attr,
>> +&port_pma_attr_ext_port_xmit_packets.attr.attr,
>> +&port_pma_attr_ext_port_rcv_packets.attr.attr,
>> +&port_pma_attr_ext_unicast_rcv_packets.attr.attr,
>> +&port_pma_attr_ext_unicast_xmit_packets.attr.attr,
>> +&port_pma_attr_ext_multicast_rcv_packets.attr.attr,
>> +&port_pma_attr_ext_multicast_xmit_packets.attr.attr,
>> +NULL
>> +};
>> +
>> +static struct attribute *pma_attrs_noietf[] = {
>> +&port_pma_attr_symbol_error.attr.attr,
>> +&port_pma_attr_link_error_recovery.attr.attr,
>> +&port_pma_attr_link_downed.attr.attr,
>> +&port_pma_attr_port_rcv_errors.attr.attr,
>> +&port_pma_attr_port_rcv_remote_physical_errors.attr.attr,
>> +&port_pma_attr_port_rcv_switch_relay_errors.attr.attr,
>> +&port_pma_attr_port_xmit_discards.attr.attr,
>> +&port_pma_attr_port_xmit_constraint_errors.attr.attr,
>> +&port_pma_attr_port_rcv_constraint_errors.attr.attr,
>> +&port_pma_attr_local_link_integrity_errors.attr.attr,
>> +&port_pma_attr_excessive_buffer_overrun_errors.attr.attr,
>> +&port_pma_attr_VL15_dropped.attr.attr,
>> +&port_pma_attr_ext_port_xmit_data.attr.attr,
>> +&por

Re: [PATCH 3/3] Display extended counter set if available

2015-12-21 Thread Christoph Lameter
On Mon, 21 Dec 2015, Hal Rosenstock wrote:

> > Don't we need to change all the sysfs_remove_groups to use 
> > get_counter_table as
> > well?
>
> Looks like it to me too. Good catch.

Fix follows:

From: Christoph Lameter 
Subject: Fix sysfs entry removal by storing the table format in  pma_table

Store the table being used in the ib_port structure and use it when sysfs
entries have to be removed.

Signed-off-by: Christoph Lameter 

Index: linux/drivers/infiniband/core/sysfs.c
===
--- linux.orig/drivers/infiniband/core/sysfs.c
+++ linux/drivers/infiniband/core/sysfs.c
@@ -47,6 +47,7 @@ struct ib_port {
struct attribute_group gid_group;
struct attribute_group pkey_group;
u8 port_num;
+   struct attribute_group *pma_table;
 };

 struct port_attribute {
@@ -651,7 +652,8 @@ static int add_port(struct ib_device *de
return ret;
}

-   ret = sysfs_create_group(&p->kobj, get_counter_table(device));
+   p->pma_table = get_counter_table(device);
+   ret = sysfs_create_group(&p->kobj, p->pma_table);
if (ret)
goto err_put;

@@ -710,7 +712,7 @@ err_free_gid:
p->gid_group.attrs = NULL;

 err_remove_pma:
-   sysfs_remove_group(&p->kobj, &pma_group);
+   sysfs_remove_group(&p->kobj, p->pma_table);

 err_put:
kobject_put(&p->kobj);
@@ -923,7 +925,7 @@ static void free_port_list_attributes(st
list_for_each_entry_safe(p, t, &device->port_list, entry) {
struct ib_port *port = container_of(p, struct ib_port, kobj);
list_del(&p->entry);
-   sysfs_remove_group(p, &pma_group);
+   sysfs_remove_group(p, port->pma_table);
sysfs_remove_group(p, &port->pkey_group);
sysfs_remove_group(p, &port->gid_group);
kobject_put(p);


Re: [PATCH 3/3] Display extended counter set if available

2015-12-21 Thread ira.weiny
On Mon, Dec 21, 2015 at 01:31:31PM -0600, Christoph Lameter wrote:
> On Mon, 21 Dec 2015, Hal Rosenstock wrote:
> 
> > > Don't we need to change all the sysfs_remove_groups to use 
> > > get_counter_table as
> > > well?
> >
> > Looks like it to me too. Good catch.
> 
> Fix follows:
> 
> From: Christoph Lameter 
> Subject: Fix sysfs entry removal by storing the table format in  pma_table
> 
> Store the table being used in the ib_port structure and use it when sysfs
> entries have to be removed.
> 
> Signed-off-by: Christoph Lameter 

Reviewed-by: Ira Weiny 

> 
> Index: linux/drivers/infiniband/core/sysfs.c
> ===
> --- linux.orig/drivers/infiniband/core/sysfs.c
> +++ linux/drivers/infiniband/core/sysfs.c
> @@ -47,6 +47,7 @@ struct ib_port {
>   struct attribute_group gid_group;
>   struct attribute_group pkey_group;
>   u8 port_num;
> + struct attribute_group *pma_table;
>  };
> 
>  struct port_attribute {
> @@ -651,7 +652,8 @@ static int add_port(struct ib_device *de
>   return ret;
>   }
> 
> - ret = sysfs_create_group(&p->kobj, get_counter_table(device));
> + p->pma_table = get_counter_table(device);
> + ret = sysfs_create_group(&p->kobj, p->pma_table);
>   if (ret)
>   goto err_put;
> 
> @@ -710,7 +712,7 @@ err_free_gid:
>   p->gid_group.attrs = NULL;
> 
>  err_remove_pma:
> - sysfs_remove_group(&p->kobj, &pma_group);
> + sysfs_remove_group(&p->kobj, p->pma_table);
> 
>  err_put:
>   kobject_put(&p->kobj);
> @@ -923,7 +925,7 @@ static void free_port_list_attributes(st
>   list_for_each_entry_safe(p, t, &device->port_list, entry) {
>   struct ib_port *port = container_of(p, struct ib_port, kobj);
>   list_del(&p->entry);
> - sysfs_remove_group(p, &pma_group);
> + sysfs_remove_group(p, port->pma_table);
>   sysfs_remove_group(p, &port->pkey_group);
>   sysfs_remove_group(p, &port->gid_group);
>   kobject_put(p);


Re: [PATCH 3/3] Display extended counter set if available

2015-12-21 Thread Hal Rosenstock
On 12/21/2015 2:47 PM, ira.weiny wrote:
> On Mon, Dec 21, 2015 at 01:31:31PM -0600, Christoph Lameter wrote:
>> On Mon, 21 Dec 2015, Hal Rosenstock wrote:
>>
 Don't we need to change all the sysfs_remove_groups to use 
 get_counter_table as
 well?
>>>
>>> Looks like it to me too. Good catch.
>>
>> Fix follows:
>>
>> From: Christoph Lameter 
>> Subject: Fix sysfs entry removal by storing the table format in  pma_table
>>
>> Store the table being used in the ib_port structure and use it when sysfs
>> entries have to be removed.
>>
>> Signed-off-by: Christoph Lameter 
> 
> Reviewed-by: Ira Weiny 

Reviewed-by: Hal Rosenstock 

> 
>>
>> Index: linux/drivers/infiniband/core/sysfs.c
>> ===
>> --- linux.orig/drivers/infiniband/core/sysfs.c
>> +++ linux/drivers/infiniband/core/sysfs.c
>> @@ -47,6 +47,7 @@ struct ib_port {
>>  struct attribute_group gid_group;
>>  struct attribute_group pkey_group;
>>  u8 port_num;
>> +struct attribute_group *pma_table;
>>  };
>>
>>  struct port_attribute {
>> @@ -651,7 +652,8 @@ static int add_port(struct ib_device *de
>>  return ret;
>>  }
>>
>> -ret = sysfs_create_group(&p->kobj, get_counter_table(device));
>> +p->pma_table = get_counter_table(device);
>> +ret = sysfs_create_group(&p->kobj, p->pma_table);
>>  if (ret)
>>  goto err_put;
>>
>> @@ -710,7 +712,7 @@ err_free_gid:
>>  p->gid_group.attrs = NULL;
>>
>>  err_remove_pma:
>> -sysfs_remove_group(&p->kobj, &pma_group);
>> +sysfs_remove_group(&p->kobj, p->pma_table);
>>
>>  err_put:
>>  kobject_put(&p->kobj);
>> @@ -923,7 +925,7 @@ static void free_port_list_attributes(st
>>  list_for_each_entry_safe(p, t, &device->port_list, entry) {
>>  struct ib_port *port = container_of(p, struct ib_port, kobj);
>>  list_del(&p->entry);
>> -sysfs_remove_group(p, &pma_group);
>> +sysfs_remove_group(p, port->pma_table);
>>  sysfs_remove_group(p, &port->pkey_group);
>>  sysfs_remove_group(p, &port->gid_group);
>>  kobject_put(p);
> 


Re: [PATCH V1 1/3] IB/core: Align coding style of ib_device_cap_flags structure

2015-12-21 Thread Christoph Hellwig
On Mon, Dec 21, 2015 at 11:36:03AM -0500, ira.weiny wrote:
> It would be nice if we were not having to do this for staging then.  Also
> perhaps it should be removed from checkpatch --strict?

Don't use checkpatch --strict ever.  It's full of weird items that
definitely don't apply to the majority of the kernel code base.

> Where are the guidelines for when one can ignore checkpatch and when they can
> not?  It would be nice to know when we can "be developers" vs "being robots to
> some tool".

I think checkpatch is generally useful, and for the errors without
--strict I haven't found any false positives.

The warnings are about 90% useful, but some are just weird.  For
--strict all bets are off.



Re: [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk

2015-12-21 Thread J. Bruce Fields
On Mon, Dec 14, 2015 at 04:30:09PM -0500, Chuck Lever wrote:
> Minor optimization: when dealing with write chunk XDR roundup, do
> not post a Write WR for the zero bytes in the pad. Simply update
> the write segment in the RPC-over-RDMA header to reflect the extra
> pad bytes.
> 
> The Reply chunk is also a write chunk, but the server does not use
> send_write_chunks() to send the Reply chunk. That's OK in this case:
> the server Upper Layer typically marshals the Reply chunk contents
> in a single contiguous buffer, without a separate tail for the XDR
> pad.
> 
> The comments and the variable naming refer to "chunks" but what is
> really meant is "segments." The existing code sends only one
> xdr_write_chunk per RPC reply.
> 
> The fix assumes this as well. When the XDR pad in the first write
> chunk is reached, the assumption is the Write list is complete and
> send_write_chunks() returns.
> 
> That will remain a valid assumption until the server Upper Layer can
> support multiple bulk payload results per RPC.
> 
> Signed-off-by: Chuck Lever 
> ---
>  net/sunrpc/xprtrdma/svc_rdma_sendto.c |7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c 
> b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> index 969a1ab..bad5eaa 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> @@ -342,6 +342,13 @@ static int send_write_chunks(struct svcxprt_rdma *xprt,
>   arg_ch->rs_handle,
>   arg_ch->rs_offset,
>   write_len);
> +
> + /* Do not send XDR pad bytes */
> + if (chunk_no && write_len < 4) {
> + chunk_no++;
> + break;

I'm pretty lost in this code.  Why does (chunk_no && write_len < 4) mean
this is xdr padding?

> + }
> +
>   chunk_off = 0;
>   while (write_len) {
>   ret = send_write(xprt, rqstp,


Re: [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk

2015-12-21 Thread Chuck Lever

> On Dec 21, 2015, at 4:07 PM, J. Bruce Fields  wrote:
> 
> On Mon, Dec 14, 2015 at 04:30:09PM -0500, Chuck Lever wrote:
>> Minor optimization: when dealing with write chunk XDR roundup, do
>> not post a Write WR for the zero bytes in the pad. Simply update
>> the write segment in the RPC-over-RDMA header to reflect the extra
>> pad bytes.
>> 
>> The Reply chunk is also a write chunk, but the server does not use
>> send_write_chunks() to send the Reply chunk. That's OK in this case:
>> the server Upper Layer typically marshals the Reply chunk contents
>> in a single contiguous buffer, without a separate tail for the XDR
>> pad.
>> 
>> The comments and the variable naming refer to "chunks" but what is
>> really meant is "segments." The existing code sends only one
>> xdr_write_chunk per RPC reply.
>> 
>> The fix assumes this as well. When the XDR pad in the first write
>> chunk is reached, the assumption is the Write list is complete and
>> send_write_chunks() returns.
>> 
>> That will remain a valid assumption until the server Upper Layer can
>> support multiple bulk payload results per RPC.
>> 
>> Signed-off-by: Chuck Lever 
>> ---
>> net/sunrpc/xprtrdma/svc_rdma_sendto.c |7 +++
>> 1 file changed, 7 insertions(+)
>> 
>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c 
>> b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>> index 969a1ab..bad5eaa 100644
>> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>> @@ -342,6 +342,13 @@ static int send_write_chunks(struct svcxprt_rdma *xprt,
>>  arg_ch->rs_handle,
>>  arg_ch->rs_offset,
>>  write_len);
>> +
>> +/* Do not send XDR pad bytes */
>> +if (chunk_no && write_len < 4) {
>> +chunk_no++;
>> +break;
> 
> I'm pretty lost in this code.  Why does (chunk_no && write_len < 4) mean
> this is xdr padding?

Chunk zero is always data. Padding is always going to be
after the first chunk. Any chunk after chunk zero that is
shorter than XDR quad alignment is going to be a pad.

Probably too clever. Is there a better way to detect
the XDR pad?


>> +}
>> +
>>  chunk_off = 0;
>>  while (write_len) {
>>  ret = send_write(xprt, rqstp,

--
Chuck Lever






Re: [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk

2015-12-21 Thread J. Bruce Fields
On Mon, Dec 21, 2015 at 04:15:23PM -0500, Chuck Lever wrote:
> 
> > On Dec 21, 2015, at 4:07 PM, J. Bruce Fields  wrote:
> > 
> > On Mon, Dec 14, 2015 at 04:30:09PM -0500, Chuck Lever wrote:
> >> Minor optimization: when dealing with write chunk XDR roundup, do
> >> not post a Write WR for the zero bytes in the pad. Simply update
> >> the write segment in the RPC-over-RDMA header to reflect the extra
> >> pad bytes.
> >> 
> >> The Reply chunk is also a write chunk, but the server does not use
> >> send_write_chunks() to send the Reply chunk. That's OK in this case:
> >> the server Upper Layer typically marshals the Reply chunk contents
> >> in a single contiguous buffer, without a separate tail for the XDR
> >> pad.
> >> 
> >> The comments and the variable naming refer to "chunks" but what is
> >> really meant is "segments." The existing code sends only one
> >> xdr_write_chunk per RPC reply.
> >> 
> >> The fix assumes this as well. When the XDR pad in the first write
> >> chunk is reached, the assumption is the Write list is complete and
> >> send_write_chunks() returns.
> >> 
> >> That will remain a valid assumption until the server Upper Layer can
> >> support multiple bulk payload results per RPC.
> >> 
> >> Signed-off-by: Chuck Lever 
> >> ---
> >> net/sunrpc/xprtrdma/svc_rdma_sendto.c |7 +++
> >> 1 file changed, 7 insertions(+)
> >> 
> >> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c 
> >> b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >> index 969a1ab..bad5eaa 100644
> >> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >> @@ -342,6 +342,13 @@ static int send_write_chunks(struct svcxprt_rdma 
> >> *xprt,
> >>arg_ch->rs_handle,
> >>arg_ch->rs_offset,
> >>write_len);
> >> +
> >> +  /* Do not send XDR pad bytes */
> >> +  if (chunk_no && write_len < 4) {
> >> +  chunk_no++;
> >> +  break;
> > 
> > I'm pretty lost in this code.  Why does (chunk_no && write_len < 4) mean
> > this is xdr padding?
> 
> Chunk zero is always data. Padding is always going to be
> after the first chunk. Any chunk after chunk zero that is
> shorter than XDR quad alignment is going to be a pad.

I don't really know what a chunk is.  Looking at the code:

write_len = min(xfer_len, be32_to_cpu(arg_ch->rs_length));

so I guess the assumption is just that those rs_length's are always a
multiple of four?

--b.

> 
> Probably too clever. Is there a better way to detect
> the XDR pad?
> 
> 
> >> +  }
> >> +
> >>chunk_off = 0;
> >>while (write_len) {
> >>ret = send_write(xprt, rqstp,
> 
> --
> Chuck Lever
> 
> 
> 


Re: [PATCH v2 01/13] staging/rdma/hfi1: Use BIT macro

2015-12-21 Thread Greg KH
On Mon, Nov 16, 2015 at 09:59:23PM -0500, Jubin John wrote:
> This patch fixes the checkpatch issue:
> CHECK: Prefer using the BIT macro
> 
> Use of BIT macro for HDRQ_INCREMENT in chip.h causes a change in
> format specifier for error message in init.c in order to avoid a
> build warning.
> 
> Reviewed-by: Dean Luick 
> Reviewed-by: Ira Weiny 
> Reviewed-by: Mike Marciniszyn 
> Signed-off-by: Jubin John 

This patch, and a few others in this series did not apply.  Please fix
them up and resend the ones I missed.

thanks,

greg k-h


Re: [PATCH v4 2/2] staging/rdma/hfi1: set Gen3 half-swing for integrated devices

2015-12-21 Thread Greg KH
On Tue, Dec 01, 2015 at 02:47:57PM -0500, ira.we...@intel.com wrote:
> From: Dean Luick 
> 
> Correctly set half-swing for integrated devices.  A0 needs all fields set for
> CcePcieCtrl.  B0 and later only need a few fields set.
> 
> Reviewed-by: Stuart Summers 
> Signed-off-by: Dean Luick 
> Signed-off-by: Ira Weiny 
> 
> ---
> Changes from V1:
>   Add comments concerning the very long names.
> 
> Changes from V2:
>   Remove PC Macro and define short names to be used in the code.
> 
> Changes from V3:
>   Use newly defined dd_dev_dbg rather than dd_dev_info
> 
>  drivers/staging/rdma/hfi1/chip_registers.h | 11 
>  drivers/staging/rdma/hfi1/pcie.c   | 82 
> --
>  2 files changed, 89 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/rdma/hfi1/chip_registers.h 
> b/drivers/staging/rdma/hfi1/chip_registers.h
> index bf45de29d8bd..d0deb2278635 100644
> --- a/drivers/staging/rdma/hfi1/chip_registers.h
> +++ b/drivers/staging/rdma/hfi1/chip_registers.h
> @@ -549,6 +549,17 @@
>  #define CCE_MSIX_TABLE_UPPER (CCE + 0x0018)
>  #define CCE_MSIX_TABLE_UPPER_RESETCSR 0x0001ull
>  #define CCE_MSIX_VEC_CLR_WITHOUT_INT (CCE + 0x00110400)
> +#define CCE_PCIE_CTRL (CCE + 0x00C0)
> +#define CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_MASK 0x3ull
> +#define CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_SHIFT 0
> +#define CCE_PCIE_CTRL_PCIE_LANE_DELAY_MASK 0xFull
> +#define CCE_PCIE_CTRL_PCIE_LANE_DELAY_SHIFT 2
> +#define CCE_PCIE_CTRL_XMT_MARGIN_OVERWRITE_ENABLE_SHIFT 8
> +#define CCE_PCIE_CTRL_XMT_MARGIN_SHIFT 9
> +#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_MASK 0x1ull
> +#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_SHIFT 12
> +#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_MASK 0x7ull
> +#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_SHIFT 13
>  #define CCE_REVISION (CCE + 0x)
>  #define CCE_REVISION2 (CCE + 0x0008)
>  #define CCE_REVISION2_HFI_ID_MASK 0x1ull
> diff --git a/drivers/staging/rdma/hfi1/pcie.c 
> b/drivers/staging/rdma/hfi1/pcie.c
> index 0b7eafb0fc70..eb3e2159ad41 100644
> --- a/drivers/staging/rdma/hfi1/pcie.c
> +++ b/drivers/staging/rdma/hfi1/pcie.c
> @@ -867,6 +867,83 @@ static void arm_gasket_logic(struct hfi1_devdata *dd)
>  }
>  
>  /*
> + * CCE_PCIE_CTRL long name helpers
> + * We redefine these shorter macros to use in the code while leaving
> + * chip_registers.h to be autogenerated from the hardware spec.
> + */
> +#define LANE_BUNDLE_MASK  CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_MASK
> +#define LANE_BUNDLE_SHIFT CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_SHIFT
> +#define LANE_DELAY_MASK   CCE_PCIE_CTRL_PCIE_LANE_DELAY_MASK
> +#define LANE_DELAY_SHIFT  CCE_PCIE_CTRL_PCIE_LANE_DELAY_SHIFT
> +#define MARGIN_OVERWRITE_ENABLE_SHIFT CCE_PCIE_CTRL_XMT_MARGIN_OVERWRITE_ENABLE_SHIFT
> +#define MARGIN_SHIFT  CCE_PCIE_CTRL_XMT_MARGIN_SHIFT
> +#define MARGIN_G1_G2_OVERWRITE_MASK   CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_MASK
> +#define MARGIN_G1_G2_OVERWRITE_SHIFT  CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_SHIFT
> +#define MARGIN_GEN1_GEN2_MASK CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_MASK
> +#define MARGIN_GEN1_GEN2_SHIFT CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_SHIFT
> +
> + /*
> +  * Write xmt_margin for full-swing (WFR-B) or half-swing (WFR-C).
> +  */
> +static void write_xmt_margin(struct hfi1_devdata *dd, const char *fname)
> +{
> + u64 pcie_ctrl;
> + u64 xmt_margin;
> + u64 xmt_margin_oe;
> + u64 lane_delay;
> + u64 lane_bundle;
> +
> + pcie_ctrl = read_csr(dd, CCE_PCIE_CTRL);
> +
> + /*
> +  * For Discrete, use full-swing.
> +  *  - PCIe TX defaults to full-swing.
> +  *Leave this register as default.
> +  * For Integrated, use half-swing
> +  *  - Copy xmt_margin and xmt_margin_oe
> +  *from Gen1/Gen2 to Gen3.
> +  */
> + if (dd->pcidev->device == PCI_DEVICE_ID_INTEL1) { /* integrated */
> + /* extract initial fields */
> + xmt_margin = (pcie_ctrl >> MARGIN_GEN1_GEN2_SHIFT)
> +   & MARGIN_GEN1_GEN2_MASK;
> + xmt_margin_oe = (pcie_ctrl >> MARGIN_G1_G2_OVERWRITE_SHIFT)
> +  & MARGIN_G1_G2_OVERWRITE_MASK;
> + lane_delay = (pcie_ctrl >> LANE_DELAY_SHIFT) & LANE_DELAY_MASK;
> + lane_bundle = (pcie_ctrl >> LANE_BUNDLE_SHIFT)
> +& LANE_BUNDLE_MASK;
> +
> + /*
> +  * For A0, EFUSE values are not set.  Override with the
> +  * correct values.
> +  */
> + if (is_a0(dd)) {

This line causes a build error, please be more careful and test your
patches before you send them out :(


Re: [PATCH v2 05/17] staging/rdma/hfi1: Clean up comments

2015-12-21 Thread Greg KH
On Tue, Dec 01, 2015 at 03:38:14PM -0500, Jubin John wrote:
> From: Edward Mascarenhas 
> 
> Clean up comments by deleting numbering and terms internal to Intel.
> 
> The information on the actual bugs is not deleted.
> 
> Reviewed-by: Mike Marciniszyn 
> Signed-off-by: Edward Mascarenhas 
> Signed-off-by: Jubin John 
> ---
> Changes in v2:
>   - Added more information in commit message

This patch, and some others in this series did not apply, please rebase
them and resend.

thanks,

greg k-h


Re: [RFC PATCH 00/15] staging/rdma/hfi1: Initial patches to add rdmavt support in HFI1

2015-12-21 Thread gre...@linuxfoundation.org
On Mon, Dec 21, 2015 at 01:12:14AM -0500, ira.weiny wrote:
> Greg, Doug,
> 
> As mentioned below, these patches depend on the new rdmavt library submitted 
> to
> Doug on linux-rdma.
> 
> We continue to identify (and rework) patches by our other developers which can
> be submitted without conflicts with this series.  Furthermore, We have, as 
> much
> as possible, placed fixes directly into rdmavt such that those changes can be
> dropped from hfi1.  But at this point, we need to know if and where these are
> going to land so that we can start reworking as appropriate.
> 
> Therefore, I would like to discuss plans to get hfi1 under the same maintainer
> to work through this transitional period.
> 
> Basically, at what point should we stop submitting patches to Greg and start
> submitting to Doug?
> 
> Should we consider the merge window itself as the swap over point and submit
> changes to Doug at that point?  If so, should we continue to submit what we 
> can
> to Greg until then (and continue rebase'ing the series below on that work)?  
> Or
> given Gregs backlog, should we stop submitting to Greg sometime prior to the
> merge window?
> 
> That brings up my final question, at the point of swap over I assume anything
> not accepted by Greg should be considered rejected and we need to resubmit to
> Doug?

If Doug accepts the library changes, let me know that public git commit
and I can pull it into the staging-next branch and you can continue to
send me staging patches that way.

That's the easiest thing to do usually.

thanks,

greg k-h


Re: [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk

2015-12-21 Thread Chuck Lever

> On Dec 21, 2015, at 4:29 PM, J. Bruce Fields  wrote:
> 
> On Mon, Dec 21, 2015 at 04:15:23PM -0500, Chuck Lever wrote:
>> 
>>> On Dec 21, 2015, at 4:07 PM, J. Bruce Fields  wrote:
>>> 
>>> On Mon, Dec 14, 2015 at 04:30:09PM -0500, Chuck Lever wrote:
 Minor optimization: when dealing with write chunk XDR roundup, do
 not post a Write WR for the zero bytes in the pad. Simply update
 the write segment in the RPC-over-RDMA header to reflect the extra
 pad bytes.
 
 The Reply chunk is also a write chunk, but the server does not use
 send_write_chunks() to send the Reply chunk. That's OK in this case:
 the server Upper Layer typically marshals the Reply chunk contents
 in a single contiguous buffer, without a separate tail for the XDR
 pad.
 
 The comments and the variable naming refer to "chunks" but what is
 really meant is "segments." The existing code sends only one
 xdr_write_chunk per RPC reply.
 
 The fix assumes this as well. When the XDR pad in the first write
 chunk is reached, the assumption is the Write list is complete and
 send_write_chunks() returns.
 
 That will remain a valid assumption until the server Upper Layer can
 support multiple bulk payload results per RPC.
 
 Signed-off-by: Chuck Lever 
 ---
 net/sunrpc/xprtrdma/svc_rdma_sendto.c |7 +++
 1 file changed, 7 insertions(+)
 
 diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c 
 b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
 index 969a1ab..bad5eaa 100644
 --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
 +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
 @@ -342,6 +342,13 @@ static int send_write_chunks(struct svcxprt_rdma 
 *xprt,
arg_ch->rs_handle,
arg_ch->rs_offset,
write_len);
 +
 +  /* Do not send XDR pad bytes */
 +  if (chunk_no && write_len < 4) {
 +  chunk_no++;
 +  break;
>>> 
>>> I'm pretty lost in this code.  Why does (chunk_no && write_len < 4) mean
>>> this is xdr padding?
>> 
>> Chunk zero is always data. Padding is always going to be
>> after the first chunk. Any chunk after chunk zero that is
>> shorter than XDR quad alignment is going to be a pad.
> 
> I don't really know what a chunk is.  Looking at the code:
> 
>   write_len = min(xfer_len, be32_to_cpu(arg_ch->rs_length));
> 
> so I guess the assumption is just that those rs_length's are always a
> multiple of four?

The example you recently gave was a two-byte NFS READ
that crosses a page boundary.

In that case, the NFSD would pass down an xdr_buf that
has one byte in a page, one byte in another page, and
a two-byte XDR pad. The logic introduced by this
optimization would be fooled, and neither the second
byte nor the XDR pad would be written to the client.

Unless you can think of a way to recognize an XDR pad
in the xdr_buf 100% of the time, you should drop this
patch.

As far as I know, none of the other patches in this
series depend on this optimization, so please merge
them if you can.


> --b.
> 
>> 
>> Probably too clever. Is there a better way to detect
>> the XDR pad?
>> 
>> 
 +  }
 +
chunk_off = 0;
while (write_len) {
ret = send_write(xprt, rqstp,
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 

--
Chuck Lever






[PATCH V1 00/16] add Intel(R) X722 iWARP driver

2015-12-21 Thread Faisal Latif

This (V1) series contains the addition of the i40iw.ko driver after
incorporating the feedback from Christoph Hellwig and Joe Perches for the
initial series.

This driver provides iWARP RDMA functionality for the Intel(R) X722 Ethernet
controller for PCI Physical Functions. It also has support for the Virtual
Function driver (i40iwvf.ko), which will be part of a separate patch
series.

It cooperates with the Intel(R) X722 base driver (i40e.ko) to allocate
resources and program the controller.

This series include 1 patch to i40e.ko to provide interface support to
i40iw.ko. The interface provides a driver registration mechanism, resource
allocations, and device reset coordination mechanisms.

This patch series is based on Doug Ledford's k.o/for-4.5.


Anjali Singhai Jain (1):
net/ethernet/intel/i40e: Add support for client interface for IWARP driver

Faisal Latif (15):
infiniband/hw/i40iw: add main, hdr, status
infiniband/hw/i40iw: add connection management code
infiniband/hw/i40iw: add puda code
infiniband/hw/i40iw: add pble resource files
infiniband/hw/i40iw: add hmc resource files
infiniband/hw/i40iw: add hw and utils files
infiniband/hw/i40iw: add files for iwarp interface
infiniband/hw/i40iw: add file to handle cqp calls
infiniband/hw/i40iw: add hardware related header files
infiniband/hw/i40iw: add X722 register file
infiniband/hw/i40iw: user kernel shared files
infiniband/hw/i40iw: virtual channel handling files
infiniband/hw/i40iw: Kconfig and Kbuild for iwarp module
infiniband/hw/i40iw: Add entry for I40IW rdma_netlink.h
infiniband/hw/i40iw: changes for build of i40iw module

Changes from the initial version to V1 are as follows.

Feedback received from Christoph Hellwig
*Remove pointless braces - improved after code review and changes
*kmap()/kunmap() - made it very short lived
*fewer casts - improved
*Remove unused routine stubs - done
*no initializing to 0 or NULL when struct fields were already zeroed - done
*define UNREFERENCED_PARAMETER not needed -done
*remove define I40eE_MASK  -done
*rd32(), wr32() make them inline -done
*readq() use magic in linux/io-64-nonatomic-lo-hi.h -done
*SLEEP() define -done by removing it
*entry in rdma_netlink.h for I40IW should be in proper location
and separate patch -done

Feedback received from Joe Perches
*series re-spun against next - done with
Doug Ledford's k.o/for-4.5
*Change to i40e client patch regarding mailing list - this is consistent
with other i40e files.
*Removed error from i40iw_pr_err() -done
*cqp_request() change from bitfields to bool -done


[PATCH V1 15/16] i40iw: add entry in rdma_netlink

2015-12-21 Thread Faisal Latif
Add entry for port mapper services.

Signed-off-by: Faisal Latif 
---
 include/uapi/rdma/rdma_netlink.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h
index c19a5dc..4fa418d 100644
--- a/include/uapi/rdma/rdma_netlink.h
+++ b/include/uapi/rdma/rdma_netlink.h
@@ -8,6 +8,7 @@ enum {
RDMA_NL_NES,
RDMA_NL_C4IW,
RDMA_NL_LS, /* RDMA Local Services */
+   RDMA_NL_I40IW,
RDMA_NL_NUM_CLIENTS
 };
 
-- 
2.5.3



[PATCH V1 01/16] i40e: Add support for client interface for IWARP driver

2015-12-21 Thread Faisal Latif
From: Anjali Singhai Jain 

This patch adds a Client interface for i40iw driver
support. Also expands the Virtchannel to support messages
from i40evf driver on behalf of i40iwvf driver.

This client API is used by the i40iw and i40iwvf driver
to access the core driver resources brokered by the i40e driver.

Signed-off-by: Anjali Singhai Jain 
---
 drivers/net/ethernet/intel/i40e/Makefile   |1 +
 drivers/net/ethernet/intel/i40e/i40e.h |   22 +
 drivers/net/ethernet/intel/i40e/i40e_client.c  | 1012 
 drivers/net/ethernet/intel/i40e/i40e_client.h  |  232 +
 drivers/net/ethernet/intel/i40e/i40e_main.c|  115 ++-
 drivers/net/ethernet/intel/i40e/i40e_type.h|3 +-
 drivers/net/ethernet/intel/i40e/i40e_virtchnl.h|   34 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |  247 -
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h |4 +
 9 files changed, 1657 insertions(+), 13 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/i40e/i40e_client.c
 create mode 100644 drivers/net/ethernet/intel/i40e/i40e_client.h

diff --git a/drivers/net/ethernet/intel/i40e/Makefile 
b/drivers/net/ethernet/intel/i40e/Makefile
index b4729ba..3b3c63e 100644
--- a/drivers/net/ethernet/intel/i40e/Makefile
+++ b/drivers/net/ethernet/intel/i40e/Makefile
@@ -41,6 +41,7 @@ i40e-objs := i40e_main.o \
i40e_diag.o \
i40e_txrx.o \
i40e_ptp.o  \
+   i40e_client.o   \
i40e_virtchnl_pf.o
 
 i40e-$(CONFIG_I40E_DCB) += i40e_dcb.o i40e_dcb_nl.o
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h 
b/drivers/net/ethernet/intel/i40e/i40e.h
index 4dd3e26..1417ae8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -59,6 +59,7 @@
 #ifdef I40E_FCOE
 #include "i40e_fcoe.h"
 #endif
+#include "i40e_client.h"
 #include "i40e_virtchnl.h"
 #include "i40e_virtchnl_pf.h"
 #include "i40e_txrx.h"
@@ -178,6 +179,7 @@ struct i40e_lump_tracking {
u16 search_hint;
u16 list[0];
 #define I40E_PILE_VALID_BIT  0x8000
+#define I40E_IWARP_IRQ_PILE_ID  (I40E_PILE_VALID_BIT - 2)
 };
 
 #define I40E_DEFAULT_ATR_SAMPLE_RATE   20
@@ -264,6 +266,8 @@ struct i40e_pf {
 #endif /* I40E_FCOE */
u16 num_lan_qps;   /* num lan queues this PF has set up */
u16 num_lan_msix;  /* num queue vectors for the base PF vsi */
+   u16 num_iwarp_msix;/* num of iwarp vectors for this PF */
+   int iwarp_base_vector;
int queues_left;   /* queues left unclaimed */
u16 rss_size;  /* num queues in the RSS array */
u16 rss_size_max;  /* HW defined max RSS queues */
@@ -313,6 +317,7 @@ struct i40e_pf {
 #define I40E_FLAG_16BYTE_RX_DESC_ENABLED   BIT_ULL(13)
 #define I40E_FLAG_CLEAN_ADMINQ BIT_ULL(14)
 #define I40E_FLAG_FILTER_SYNC  BIT_ULL(15)
+#define I40E_FLAG_SERVICE_CLIENT_REQUESTED BIT_ULL(16)
 #define I40E_FLAG_PROCESS_MDD_EVENTBIT_ULL(17)
 #define I40E_FLAG_PROCESS_VFLR_EVENT   BIT_ULL(18)
 #define I40E_FLAG_SRIOV_ENABLEDBIT_ULL(19)
@@ -550,6 +555,8 @@ struct i40e_vsi {
struct kobject *kobj;  /* sysfs object */
bool current_isup; /* Sync 'link up' logging */
 
+   void *priv; /* client driver data reference. */
+
/* VSI specific handlers */
irqreturn_t (*irq_handler)(int irq, void *data);
 
@@ -702,6 +709,10 @@ void i40e_vsi_setup_queue_map(struct i40e_vsi *vsi,
  struct i40e_vsi_context *ctxt,
  u8 enabled_tc, bool is_add);
 #endif
+void i40e_service_event_schedule(struct i40e_pf *pf);
+void i40e_notify_client_of_vf_msg(struct i40e_vsi *vsi, u32 vf_id,
+ u8 *msg, u16 len);
+
 int i40e_vsi_control_rings(struct i40e_vsi *vsi, bool enable);
 int i40e_reconfig_rss_queues(struct i40e_pf *pf, int queue_count);
 struct i40e_veb *i40e_veb_setup(struct i40e_pf *pf, u16 flags, u16 uplink_seid,
@@ -724,6 +735,17 @@ static inline void i40e_dbg_pf_exit(struct i40e_pf *pf) {}
 static inline void i40e_dbg_init(void) {}
 static inline void i40e_dbg_exit(void) {}
 #endif /* CONFIG_DEBUG_FS*/
+/* needed by client drivers */
+int i40e_lan_add_device(struct i40e_pf *pf);
+int i40e_lan_del_device(struct i40e_pf *pf);
+void i40e_client_subtask(struct i40e_pf *pf);
+void i40e_notify_client_of_l2_param_changes(struct i40e_vsi *vsi);
+void i40e_notify_client_of_netdev_open(struct i40e_vsi *vsi);
+void i40e_notify_client_of_netdev_close(struct i40e_vsi *vsi, bool reset);
+void i40e_notify_client_of_vf_enable(struct i40e_pf *pf, u32 num_vfs);
+void i40e_notify_client_of_vf_reset(struct i40e_pf *pf, u32 vf_id);
+int i40e_vf_client_capable(struct i40e_pf *pf, u32 vf_id,
+  enum i40e_client_type type);
 /**
  * i40e_irq_dynamic_enable - Enable default interrupt generation settings
  * @vsi:

[PATCH V1 05/16] i40iw: add pble resource files

2015-12-21 Thread Faisal Latif
i40iw_pble.[ch] to manage pble resource for iwarp clients.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_pble.c | 618 +++
 drivers/infiniband/hw/i40iw/i40iw_pble.h | 131 +++
 2 files changed, 749 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_pble.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_pble.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_pble.c 
b/drivers/infiniband/hw/i40iw/i40iw_pble.c
new file mode 100644
index 000..eb32cc7
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_pble.c
@@ -0,0 +1,618 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include "i40iw_status.h"
+#include "i40iw_osdep.h"
+#include "i40iw_register.h"
+#include "i40iw_hmc.h"
+
+#include "i40iw_d.h"
+#include "i40iw_type.h"
+#include "i40iw_p.h"
+
+#include 
+#include 
+#include 
+#include "i40iw_pble.h"
+#include "i40iw.h"
+
+struct i40iw_device;
+static enum i40iw_status_code add_pble_pool(struct i40iw_sc_dev *dev,
+   struct i40iw_hmc_pble_rsrc 
*pble_rsrc);
+static void i40iw_free_vmalloc_mem(struct i40iw_hw *hw, struct i40iw_chunk 
*chunk);
+
+/**
+ * i40iw_destroy_pble_pool - destroy pool during module unload
+ * @pble_rsrc: pble resources
+ */
+void i40iw_destroy_pble_pool(struct i40iw_sc_dev *dev, struct 
i40iw_hmc_pble_rsrc *pble_rsrc)
+{
+   struct list_head *clist;
+   struct list_head *tlist;
+   struct i40iw_chunk *chunk;
+   struct i40iw_pble_pool *pinfo = &pble_rsrc->pinfo;
+
+   if (pinfo->pool) {
+   list_for_each_safe(clist, tlist, &pinfo->clist) {
+   chunk = list_entry(clist, struct i40iw_chunk, list);
+   if (chunk->type == I40IW_VMALLOC)
+   i40iw_free_vmalloc_mem(dev->hw, chunk);
+   kfree(chunk);
+   }
+   gen_pool_destroy(pinfo->pool);
+   }
+}
+
+/**
+ * i40iw_hmc_init_pble - Initialize pble resources during module load
+ * @dev: i40iw_sc_dev struct
+ * @pble_rsrc: pble resources
+ */
+enum i40iw_status_code i40iw_hmc_init_pble(struct i40iw_sc_dev *dev,
+  struct i40iw_hmc_pble_rsrc 
*pble_rsrc)
+{
+   struct i40iw_hmc_info *hmc_info;
+   u32 fpm_idx = 0;
+
+   hmc_info = dev->hmc_info;
+   pble_rsrc->fpm_base_addr = hmc_info->hmc_obj[I40IW_HMC_IW_PBLE].base;
+   /* Now start the pble' on 4k boundary */
+   if (pble_rsrc->fpm_base_addr & 0xfff)
+   fpm_idx = (PAGE_SIZE - (pble_rsrc->fpm_base_addr & 0xfff)) >> 3;
+
+   pble_rsrc->unallocated_pble =
+   hmc_info->hmc_obj[I40IW_HMC_IW_PBLE].cnt - fpm_idx;
+   pble_rsrc->next_fpm_addr = pble_rsrc->fpm_base_addr + (fpm_idx << 3);
+
+   pble_rsrc->pinfo.pool_shift = POOL_SHIFT;
+   pble_rsrc->pinfo.pool = gen_pool_create(pble_rsrc->pinfo.pool_shift, 
-1);
+   INIT_LIST_HEAD(&pble_rsrc->pinfo.clist);
+   if (!pble_rsrc->pinfo.pool)
+   goto error;
+
+   if (add_pble_pool(dev, pble_rsrc))
+   goto error;
+
+   return 0;
+
+ error:i40iw_destroy_pble_pool(dev, pble_rsrc);
+   return I40IW_ERR_NO_MEMORY;
+}
+
+/**
+ * get_sd_pd_idx -  Returns sd index, pd index and rel_pd_idx from fpm address
+ * @ pble_rsrc:structure containing fpm address
+ * @ idx: where to return indexe

[PATCH V1 13/16] i40iw: virtual channel handling files

2015-12-21 Thread Faisal Latif
i40iw_vf.[ch] and i40iw_virtchnl[ch] are used for virtual
channel support for iWARP VF module.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_vf.c   |  85 +++
 drivers/infiniband/hw/i40iw/i40iw_vf.h   |  62 +++
 drivers/infiniband/hw/i40iw/i40iw_virtchnl.c | 748 +++
 drivers/infiniband/hw/i40iw/i40iw_virtchnl.h | 124 +
 4 files changed, 1019 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_vf.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_vf.h
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_virtchnl.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_virtchnl.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_vf.c 
b/drivers/infiniband/hw/i40iw/i40iw_vf.c
new file mode 100644
index 000..b23f3c4
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_vf.c
@@ -0,0 +1,85 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include "i40iw_osdep.h"
+#include "i40iw_register.h"
+#include "i40iw_status.h"
+#include "i40iw_hmc.h"
+#include "i40iw_d.h"
+#include "i40iw_type.h"
+#include "i40iw_p.h"
+#include "i40iw_vf.h"
+
+/**
+ * i40iw_manage_vf_pble_bp - manage vf pble
+ * @cqp: cqp for cqp' sq wqe
+ * @info: pble info
+ * @scratch: pointer for completion
+ * @post_sq: to post and ring
+ */
+enum i40iw_status_code i40iw_manage_vf_pble_bp(struct i40iw_sc_cqp *cqp,
+  struct i40iw_manage_vf_pble_info 
*info,
+  u64 scratch,
+  bool post_sq)
+{
+   u64 *wqe;
+   u64 temp, header, pd_pl_pba = 0;
+
+   wqe = i40iw_sc_cqp_get_next_send_wqe(cqp, scratch);
+   if (!wqe)
+   return I40IW_ERR_RING_FULL;
+
+   temp = LS_64(info->pd_entry_cnt, I40IW_CQPSQ_MVPBP_PD_ENTRY_CNT) |
+   LS_64(info->first_pd_index, I40IW_CQPSQ_MVPBP_FIRST_PD_INX) |
+   LS_64(info->sd_index, I40IW_CQPSQ_MVPBP_SD_INX);
+   set_64bit_val(wqe, 16, temp);
+
+   header = LS_64((info->inv_pd_ent ? 1 : 0), 
I40IW_CQPSQ_MVPBP_INV_PD_ENT) |
+   LS_64(I40IW_CQP_OP_MANAGE_VF_PBLE_BP, I40IW_CQPSQ_OPCODE) |
+   LS_64(cqp->polarity, I40IW_CQPSQ_WQEVALID);
+   set_64bit_val(wqe, 24, header);
+
+   pd_pl_pba = LS_64(info->pd_pl_pba >> 3, I40IW_CQPSQ_MVPBP_PD_PLPBA);
+   set_64bit_val(wqe, 32, pd_pl_pba);
+
+   i40iw_debug_buf(cqp->dev, I40IW_DEBUG_WQE, "MANAGE VF_PBLE_BP WQE", 
wqe, I40IW_CQP_WQE_SIZE * 8);
+
+   if (post_sq)
+   i40iw_sc_cqp_post_sq(cqp);
+   return 0;
+}
+
+struct i40iw_vf_cqp_ops iw_vf_cqp_ops = {
+   i40iw_manage_vf_pble_bp
+};
diff --git a/drivers/infiniband/hw/i40iw/i40iw_vf.h 
b/drivers/infiniband/hw/i40iw/i40iw_vf.h
new file mode 100644
index 000..cfe112d
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_vf.h
@@ -0,0 +1,62 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*

[PATCH V1 14/16] i40iw: Kconfig and Kbuild for iwarp module

2015-12-21 Thread Faisal Latif
Kconfig and Kbuild needed to build iwarp module.

Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/Kbuild  | 43 +
 drivers/infiniband/hw/i40iw/Kconfig |  7 ++
 2 files changed, 50 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/Kbuild
 create mode 100644 drivers/infiniband/hw/i40iw/Kconfig

diff --git a/drivers/infiniband/hw/i40iw/Kbuild 
b/drivers/infiniband/hw/i40iw/Kbuild
new file mode 100644
index 000..ba84a78
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/Kbuild
@@ -0,0 +1,43 @@
+
+#
+# * Copyright (c) 2015 Intel Corporation.  All rights reserved.
+# *
+# * This software is available to you under a choice of one of two
+# * licenses.  You may choose to be licensed under the terms of the GNU
+# * General Public License (GPL) Version 2, available from the file
+# * COPYING in the main directory of this source tree, or the
+# * OpenFabrics.org BSD license below:
+# *
+# *   Redistribution and use in source and binary forms, with or
+# *   without modification, are permitted provided that the following
+# *   conditions are met:
+# *
+# *- Redistributions of source code must retain the above
+# *copyright notice, this list of conditions and the following
+# *disclaimer.
+# *
+# *- Redistributions in binary form must reproduce the above
+# *copyright notice, this list of conditions and the following
+# *disclaimer in the documentation and/or other materials
+# *provided with the distribution.
+# *
+# * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+# * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+# * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+# * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+# * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+# * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# * SOFTWARE.
+#
+
+
+ccflags-y :=  -Idrivers/net/ethernet/intel/i40e
+
+obj-m += i40iw.o
+
+i40iw-objs :=\
+   i40iw_cm.o i40iw_ctrl.o \
+   i40iw_hmc.o i40iw_hw.o i40iw_main.o  \
+   i40iw_pble.o i40iw_puda.o i40iw_uk.o i40iw_utils.o \
+   i40iw_verbs.o i40iw_virtchnl.o i40iw_vf.o
diff --git a/drivers/infiniband/hw/i40iw/Kconfig 
b/drivers/infiniband/hw/i40iw/Kconfig
new file mode 100644
index 000..6e7d27a
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/Kconfig
@@ -0,0 +1,7 @@
+config INFINIBAND_I40IW
+   tristate "Intel(R) Ethernet X722 iWARP Driver"
+   depends on INET && I40E
+   select GENERIC_ALLOCATOR
+   ---help---
+   Intel(R) Ethernet X722 iWARP Driver
+   INET && I40IW && INFINIBAND && I40E
-- 
2.5.3



[PATCH V1 02/16] i40iw: add main, hdr, status

2015-12-21 Thread Faisal Latif
i40iw_main.c contains routines for i40e <=> i40iw interface and setup.
i40iw.h is header file for main device data structures.
i40iw_status.h is for return status codes.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw.h|  568 +
 drivers/infiniband/hw/i40iw/i40iw_main.c   | 1907 
 drivers/infiniband/hw/i40iw/i40iw_status.h |  100 ++
 3 files changed, 2575 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw.h
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_main.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_status.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw.h 
b/drivers/infiniband/hw/i40iw/i40iw.h
new file mode 100644
index 000..8740ea4
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw.h
@@ -0,0 +1,568 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#ifndef I40IW_IW_H
+#define I40IW_IW_H
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "i40iw_status.h"
+#include "i40iw_osdep.h"
+#include "i40iw_d.h"
+#include "i40iw_hmc.h"
+
+#include 
+#include "i40iw_type.h"
+#include "i40iw_p.h"
+#include "i40iw_ucontext.h"
+#include "i40iw_pble.h"
+#include "i40iw_verbs.h"
+#include "i40iw_cm.h"
+#include "i40iw_user.h"
+#include "i40iw_puda.h"
+
+#define I40IW_FW_VERSION  2
+#define I40IW_HW_VERSION  2
+
+#define I40IW_ARP_ADD 1
+#define I40IW_ARP_DELETE  2
+#define I40IW_ARP_RESOLVE 3
+
+#define I40IW_MACIP_ADD 1
+#define I40IW_MACIP_DELETE  2
+
+#define IW_CCQ_SIZE (I40IW_CQP_SW_SQSIZE_2048 + 1)
+#define IW_CEQ_SIZE 2048
+#define IW_AEQ_SIZE 2048
+
+#define RX_BUF_SIZE(1536 + 8)
+#define IW_REG0_SIZE   (4 * 1024)
+#define IW_TX_TIMEOUT  (6 * HZ)
+#define IW_FIRST_QPN   1
+#define IW_SW_CONTEXT_ALIGN1024
+
+#define MAX_DPC_ITERATIONS 128
+
+#define I40IW_EVENT_TIMEOUT10
+#define I40IW_VCHNL_EVENT_TIMEOUT  10
+
+#define I40IW_NO_VLAN   0x
+#define I40IW_NO_QSET   0x
+
+/* access to mcast filter list */
+#define IW_ADD_MCAST false
+#define IW_DEL_MCAST true
+
+#define I40IW_DRV_OPT_ENABLE_MPA_VER_0 0x0001
+#define I40IW_DRV_OPT_DISABLE_MPA_CRC  0x0002
+#define I40IW_DRV_OPT_DISABLE_FIRST_WRITE  0x0004
+#define I40IW_DRV_OPT_DISABLE_INTF 0x0008
+#define I40IW_DRV_OPT_ENABLE_MSI   0x0010
+#define I40IW_DRV_OPT_DUAL_LOGICAL_PORT0x0020
+#define I40IW_DRV_OPT_NO_INLINE_DATA   0x0080
+#define I40IW_DRV_OPT_DISABLE_INT_MOD  0x0100
+#define I40IW_DRV_OPT_DISABLE_VIRT_WQ  0x0200
+#define I40IW_DRV_OPT_ENABLE_PAU   0x0400
+#define I40IW_DRV_OPT_MCAST_LOGPORT_MAP0x0800
+
+#define IW_HMC_OBJ_TYPE_NUM ARRAY_SIZE(iw_hmc_obj_types)
+#define IW_CFG_FPM_QP_COUNT32768
+
+#define I40IW_MTU_TO_MSS   40
+#define I40IW_DEFAULT_MSS  1460
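The two MSS defines above appear to encode the usual TCP/IPv4 arithmetic: the MSS is the interface MTU minus 40 bytes of IPv4 and TCP headers, so a 1500-byte MTU yields the default MSS of 1460. A user-space sketch (the helper name is invented for illustration, not part of the driver):

```c
#include <assert.h>

#define I40IW_MTU_TO_MSS   40	/* 20-byte IPv4 header + 20-byte TCP header */
#define I40IW_DEFAULT_MSS  1460

/* Hypothetical helper: derive the TCP MSS from an interface MTU. */
static unsigned int mtu_to_mss(unsigned int mtu)
{
	return mtu - I40IW_MTU_TO_MSS;
}
```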
+
+struct i40iw_cqp_compl_info {
+   u32 op_ret_val;
+   u16 maj_err_code;
+   u16 min_err_code;
+   bool error;
+   u8 op_code;
+};
+
+#define i40iw_pr_err(fmt, args ...) pr_err("%s: "fmt, __func__, ## args)
+
+#define i40iw_pr_info(fmt, args ...) pr_info("%s: " fmt, __func

[PATCH V1 03/16] i40iw: add connection management code

2015-12-21 Thread Faisal Latif
i40iw_cm.c and i40iw_cm.h are used for connection management.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_cm.c |  
 drivers/infiniband/hw/i40iw/i40iw_cm.h |  456 
 2 files changed, 4900 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_cm.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_cm.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c 
b/drivers/infiniband/hw/i40iw/i40iw_cm.c
new file mode 100644
index 000..e559e1c
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_cm.c
@@ -0,0 +1, @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "i40iw.h"
+
+static void i40iw_rem_ref_cm_node(struct i40iw_cm_node *);
+static void i40iw_cm_post_event(struct i40iw_cm_event *event);
+static void i40iw_disconnect_worker(struct work_struct *work);
+
+/**
+ * i40iw_free_sqbuf - put back puda buffer if refcount = 0
+ * @dev: FPK device
+ * @bufp: puda buffer to free
+ */
+void i40iw_free_sqbuf(struct i40iw_sc_dev *dev, void *bufp)
+{
+   struct i40iw_puda_buf *buf = (struct i40iw_puda_buf *)bufp;
+   struct i40iw_puda_rsrc *ilq = dev->ilq;
+
+   if (!atomic_dec_return(&buf->refcount))
+   i40iw_puda_ret_bufpool(ilq, buf);
+}
+
+/**
+ * i40iw_derive_hw_ird_setting - Calculate IRD
+ *
+ * @cm_ird: IRD of connection's node
+ *
+ * The ird from the connection is rounded to a supported HW
+ * setting (2,8,32,64) and then encoded for ird_size field of
+ * qp_ctx
+ */
+static u8 i40iw_derive_hw_ird_setting(u16 cm_ird)
+{
+   u8 encoded_ird_size;
+   u8 pof2_cm_ird = 1;
+
+   /* round-off to next powerof2 */
+   while (pof2_cm_ird < cm_ird)
+   pof2_cm_ird *= 2;
+
+   /* ird_size field is encoded in qp_ctx */
+   switch (pof2_cm_ird) {
+   case I40IW_HW_IRD_SETTING_64:
+   encoded_ird_size = 3;
+   break;
+   case I40IW_HW_IRD_SETTING_32:
+   case I40IW_HW_IRD_SETTING_16:
+   encoded_ird_size = 2;
+   break;
+   case I40IW_HW_IRD_SETTING_8:
+   case I40IW_HW_IRD_SETTING_4:
+   encoded_ird_size = 1;
+   break;
+   case I40IW_HW_IRD_SETTING_2:
+   default:
+   encoded_ird_size = 0;
+   break;
+   }
+   return encoded_ird_size;
+}
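The round-up-then-encode logic above can be checked in isolation. This sketch mirrors the function in plain C, assuming the I40IW_HW_IRD_SETTING_* constants carry their face values (2, 4, 8, 16, 32, 64), which the case grouping implies but this patch does not show:

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of i40iw_derive_hw_ird_setting(): round the requested IRD up to
 * the next power of two, then encode the supported sizes for the qp_ctx
 * ird_size field (64 -> 3, 32/16 -> 2, 8/4 -> 1, 2 or anything else -> 0). */
static uint8_t derive_hw_ird_setting(uint16_t cm_ird)
{
	uint16_t pof2_cm_ird = 1;

	while (pof2_cm_ird < cm_ird)	/* round up to a power of two */
		pof2_cm_ird *= 2;

	switch (pof2_cm_ird) {
	case 64:
		return 3;
	case 32:
	case 16:
		return 2;
	case 8:
	case 4:
		return 1;
	case 2:
	default:
		return 0;
	}
}
```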
+
+/**
+ * i40iw_record_ird_ord - Record IRD/ORD passed in
+ * @cm_node: connection's node
+ * @conn_ird: connection IRD
+ * @conn_ord: connection ORD
+ */
+static void i40iw_record_ird_ord(struct i40iw_cm_node *cm_node, u16 conn_ird, 
u16 conn_ord)
+{
+   if (conn_ird > I40IW_MAX_IRD_SIZE)
+   conn_ird = I40IW_MAX_IRD_SIZE;
+
+   if (conn_ord > I40IW_MAX_ORD_SIZE)
+   conn_ord = I40IW_MAX_ORD_SIZE;
+
+   cm_node->ird_size = conn_ird;
+   cm_node->ord_size = conn_ord;
+}
+
+/**
+ * i40iw_copy_ip_ntohl - change network to host ip
+ * @dst: host ip
+ * @src: big endian
+ */
+void i40iw_copy_ip_ntohl(u32 *dst, __be32 *src)
+{
+   *dst++ = ntohl(*src++);
+   *dst++ = ntohl(*src++);
+   *dst++ = ntohl(*src++);
+  

[PATCH V1 12/16] i40iw: user kernel shared files

2015-12-21 Thread Faisal Latif
i40iw_user.h and i40iw_uk.c are shared between the user-space library and
kernel requests.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_uk.c   | 1209 ++
 drivers/infiniband/hw/i40iw/i40iw_user.h |  442 +++
 2 files changed, 1651 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_uk.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_user.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c 
b/drivers/infiniband/hw/i40iw/i40iw_uk.c
new file mode 100644
index 000..9f2a6e2
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
@@ -0,0 +1,1209 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include "i40iw_osdep.h"
+#include "i40iw_status.h"
+#include "i40iw_d.h"
+#include "i40iw_user.h"
+#include "i40iw_register.h"
+
+static u32 nop_signature = 0x;
+
+/**
+ * i40iw_nop_1 - insert a nop wqe and move head. no post work
+ * @qp: hw qp ptr
+ */
+static enum i40iw_status_code i40iw_nop_1(struct i40iw_qp_uk *qp)
+{
+   u64 header, *wqe;
+   u64 *wqe_0 = NULL;
+   u32 wqe_idx, peek_head;
+   bool signaled = false;
+
+   if (!qp->sq_ring.head)
+   return I40IW_ERR_PARAM;
+
+   wqe_idx = I40IW_RING_GETCURRENT_HEAD(qp->sq_ring);
+   wqe = &qp->sq_base[wqe_idx << 2];
+   peek_head = (qp->sq_ring.head + 1) % qp->sq_ring.size;
+   wqe_0 = &qp->sq_base[peek_head << 2];
+   if (peek_head)
+   wqe_0[3] = LS_64(!qp->swqe_polarity, I40IWQPSQ_VALID);
+   else
+   wqe_0[3] = LS_64(qp->swqe_polarity, I40IWQPSQ_VALID);
+
+   set_64bit_val(wqe, 0, 0);
+   set_64bit_val(wqe, 8, 0);
+   set_64bit_val(wqe, 16, 0);
+
+   header = LS_64(I40IWQP_OP_NOP, I40IWQPSQ_OPCODE) |
+   LS_64(signaled, I40IWQPSQ_SIGCOMPL) |
+   LS_64(qp->swqe_polarity, I40IWQPSQ_VALID) | nop_signature++;
+
+   wmb();  /* Memory barrier to ensure data is written before valid bit is 
set */
+
+   set_64bit_val(wqe, 24, header);
+   return 0;
+}
+
+/**
+ * i40iw_qp_post_wr - post wr to hardware
+ * @qp: hw qp ptr
+ */
+void i40iw_qp_post_wr(struct i40iw_qp_uk *qp)
+{
+   u64 temp;
+   u32 hw_sq_tail;
+   u32 sw_sq_head;
+
+   wmb(); /* make sure valid bit is written */
+
+   /* read the doorbell shadow area */
+   get_64bit_val(qp->shadow_area, 0, &temp);
+
+   rmb(); /* make sure read is finished */
+
+   hw_sq_tail = (u32)RS_64(temp, I40IW_QP_DBSA_HW_SQ_TAIL);
+   sw_sq_head = I40IW_RING_GETCURRENT_HEAD(qp->sq_ring);
+   if (sw_sq_head != hw_sq_tail) {
+   if (sw_sq_head > qp->initial_ring.head) {
+   if ((hw_sq_tail >= qp->initial_ring.head) &&
+   (hw_sq_tail < sw_sq_head)) {
+   writel(qp->qp_id, qp->wqe_alloc_reg);
+   }
+   } else if (sw_sq_head != qp->initial_ring.head) {
+   if ((hw_sq_tail >= qp->initial_ring.head) ||
+   (hw_sq_tail < sw_sq_head)) {
+   writel(qp->qp_id, qp->wqe_alloc_reg);
+   }
+   }
+   }
+
+   qp->initial_ring.head = qp->sq_ring.head;
+}
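The nested comparisons above test whether the hardware tail falls inside the half-open window between the last-doorbelled head and the current software head, with and without ring wraparound. A standalone version of just that predicate (the extraction and naming are mine, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extraction of the wraparound check in i40iw_qp_post_wr():
 * ring the doorbell only when the hardware tail lies between the head we
 * last rang for (initial_head) and the current software head. */
static bool sq_needs_doorbell(uint32_t hw_sq_tail, uint32_t sw_sq_head,
			      uint32_t initial_head)
{
	if (sw_sq_head == hw_sq_tail)
		return false;			/* hardware has caught up */

	if (sw_sq_head > initial_head)		/* window does not wrap */
		return hw_sq_tail >= initial_head && hw_sq_tail < sw_sq_head;

	if (sw_sq_head != initial_head)		/* window wraps past ring end */
		return hw_sq_tail >= initial_head || hw_sq_tail < sw_sq_head;

	return false;
}
```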
+
+/**
+ * i40iw_qp_ring_push_db -  ring qp doorbell
+ * @qp: hw qp ptr
+ * @wqe_idx: wqe index
+ */
+static void i40iw_qp_ring_push_db(struct i40iw_qp_uk *qp, u32

[PATCH V1 08/16] i40iw: add files for iwarp interface

2015-12-21 Thread Faisal Latif
i40iw_verbs.[ch] handle the iWARP verbs interface.

Added feedback provided by Christoph Hellwig

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_ucontext.h |  110 ++
 drivers/infiniband/hw/i40iw/i40iw_verbs.c| 2406 ++
 drivers/infiniband/hw/i40iw/i40iw_verbs.h|  173 ++
 3 files changed, 2689 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_ucontext.h
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_verbs.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_verbs.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_ucontext.h 
b/drivers/infiniband/hw/i40iw/i40iw_ucontext.h
new file mode 100644
index 000..5c65c25
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_ucontext.h
@@ -0,0 +1,110 @@
+/*
+ * Copyright (c) 2006 - 2015 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2005 Topspin Communications.  All rights reserved.
+ * Copyright (c) 2005 Cisco Systems.  All rights reserved.
+ * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef I40IW_USER_CONTEXT_H
+#define I40IW_USER_CONTEXT_H
+
+#include 
+
+#define I40IW_ABI_USERSPACE_VER 4
+#define I40IW_ABI_KERNEL_VER4
+struct i40iw_alloc_ucontext_req {
+   __u32 reserved32;
+   __u8 userspace_ver;
+   __u8 reserved8[3];
+};
+
+struct i40iw_alloc_ucontext_resp {
+   __u32 max_pds;  /* maximum pds allowed for this user process */
+   __u32 max_qps;  /* maximum qps allowed for this user process */
+   __u32 wq_size;  /* size of the WQs (sq+rq) allocated to the 
mmaped area */
+   __u8 kernel_ver;
+   __u8 reserved[3];
+};
+
+struct i40iw_alloc_pd_resp {
+   __u32 pd_id;
+   __u8 reserved[4];
+};
+
+struct i40iw_create_cq_req {
+   __u64 user_cq_buffer;
+   __u64 user_shadow_area;
+};
+
+struct i40iw_create_qp_req {
+   __u64 user_wqe_buffers;
+   __u64 user_compl_ctx;
+
+   /* UDA QP PHB */
+   __u64 user_sq_phb;  /* place for VA of the sq phb buff */
+   __u64 user_rq_phb;  /* place for VA of the rq phb buff */
+};
+
+enum i40iw_memreg_type {
+   IW_MEMREG_TYPE_MEM = 0x,
+   IW_MEMREG_TYPE_QP = 0x0001,
+   IW_MEMREG_TYPE_CQ = 0x0002,
+   IW_MEMREG_TYPE_MW = 0x0003,
+   IW_MEMREG_TYPE_FMR = 0x0004,
+   IW_MEMREG_TYPE_FMEM = 0x0005,
+};
+
+struct i40iw_mem_reg_req {
+   __u16 reg_type; /* Memory, QP or CQ */
+   __u16 cq_pages;
+   __u16 rq_pages;
+   __u16 sq_pages;
+};
+
+struct i40iw_create_cq_resp {
+   __u32 cq_id;
+   __u32 cq_size;
+   __u32 mmap_db_index;
+   __u32 reserved;
+};
+
+struct i40iw_create_qp_resp {
+   __u32 qp_id;
+   __u32 actual_sq_size;
+   __u32 actual_rq_size;
+   __u32 i40iw_drv_opt;
+   __u16 push_idx;
+   __u8  lsmm;
+   __u8  rsvd2;
+};
+
+#endif
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c 
b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
new file mode 100644
index 000..accc3dc
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
@@ -0,0 +1,2406 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD 

[PATCH V1 10/16] i40iw: add hardware related header files

2015-12-21 Thread Faisal Latif
Header files for hardware access.

Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_d.h| 1713 ++
 drivers/infiniband/hw/i40iw/i40iw_p.h|  106 ++
 drivers/infiniband/hw/i40iw/i40iw_type.h | 1307 +++
 3 files changed, 3126 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_d.h
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_p.h
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_type.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_d.h 
b/drivers/infiniband/hw/i40iw/i40iw_d.h
new file mode 100644
index 000..f6668d7
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_d.h
@@ -0,0 +1,1713 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#ifndef I40IW_D_H
+#define I40IW_D_H
+
+#define I40IW_DB_ADDR_OFFSET(4 * 1024 * 1024 - 64 * 1024)
+#define I40IW_VF_DB_ADDR_OFFSET (64 * 1024)
+
+#define I40IW_PUSH_OFFSET   (4 * 1024 * 1024)
+#define I40IW_PF_FIRST_PUSH_PAGE_INDEX 16
+#define I40IW_VF_PUSH_OFFSET((8 + 64) * 1024)
+#define I40IW_VF_FIRST_PUSH_PAGE_INDEX 2
+
+#define I40IW_PE_DB_SIZE_4M 1
+#define I40IW_PE_DB_SIZE_8M 2
+
+#define I40IW_DDP_VER 1
+#define I40IW_RDMAP_VER 1
+
+#define I40IW_RDMA_MODE_RDMAC 0
+#define I40IW_RDMA_MODE_IETF  1
+
+#define I40IW_QP_STATE_INVALID 0
+#define I40IW_QP_STATE_IDLE 1
+#define I40IW_QP_STATE_RTS 2
+#define I40IW_QP_STATE_CLOSING 3
+#define I40IW_QP_STATE_RESERVED 4
+#define I40IW_QP_STATE_TERMINATE 5
+#define I40IW_QP_STATE_ERROR 6
+
+#define I40IW_STAG_STATE_INVALID 0
+#define I40IW_STAG_STATE_VALID 1
+
+#define I40IW_STAG_TYPE_SHARED 0
+#define I40IW_STAG_TYPE_NONSHARED 1
+
+#define I40IW_MAX_USER_PRIORITY 8
+
+#define LS_64_1(val, bits)  ((u64)(uintptr_t)val << bits)
+#define RS_64_1(val, bits)  ((u64)(uintptr_t)val >> bits)
+#define LS_32_1(val, bits)  (u32)(val << bits)
+#define RS_32_1(val, bits)  (u32)(val >> bits)
+#define I40E_HI_DWORD(x)  ((u32)((((x) >> 16) >> 16) & 0xFFFF))
+
+#define LS_64(val, field) (((u64)val << field ## _SHIFT) & (field ## _MASK))
+
+#define RS_64(val, field) ((u64)(val & field ## _MASK) >> field ## _SHIFT)
+#define LS_32(val, field) ((val << field ## _SHIFT) & (field ## _MASK))
+#define RS_32(val, field) ((val & field ## _MASK) >> field ## _SHIFT)
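The LS_64/RS_64 macros rely on each field being described by a pair of token-pasted constants, FIELD_SHIFT and FIELD_MASK, so a single macro can pack or extract any field of a 64-bit descriptor word. A self-contained illustration, where DEMO_FIELD is an invented 6-bit field at bit 32 (not a real driver field):

```c
#include <assert.h>
#include <stdint.h>

/* Same pattern as the driver's LS_64/RS_64; DEMO_FIELD is an invented
 * example field occupying bits 32..37 of a 64-bit WQE word. */
#define DEMO_FIELD_SHIFT 32
#define DEMO_FIELD_MASK  (0x3FULL << DEMO_FIELD_SHIFT)

/* Pack val into a field: shift into place, then mask off any overflow. */
#define LS_64(val, field) (((uint64_t)(val) << field ## _SHIFT) & (field ## _MASK))
/* Extract a field: mask the word down, then shift back to bit 0. */
#define RS_64(val, field) (((uint64_t)(val) & field ## _MASK) >> field ## _SHIFT)
```

Because the mask is applied after the shift on the pack side, an out-of-range value is silently truncated to the field width rather than corrupting neighboring fields.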
+
+#define TERM_DDP_LEN_TAGGED 14
+#define TERM_DDP_LEN_UNTAGGED   18
+#define TERM_RDMA_LEN   28
+#define RDMA_OPCODE_MASK0x0f
+#define RDMA_READ_REQ_OPCODE1
+#define Q2_BAD_FRAME_OFFSET 72
+#define CQE_MAJOR_DRV   0x8000
+
+#define I40IW_TERM_SENT 0x01
+#define I40IW_TERM_RCVD 0x02
+#define I40IW_TERM_DONE 0x04
+#define I40IW_MAC_HLEN  14
+
+#define I40IW_INVALID_WQE_INDEX 0x
+
+#define I40IW_CQP_WAIT_POLL_REGS 1
+#define I40IW_CQP_WAIT_POLL_CQ 2
+#define I40IW_CQP_WAIT_EVENT 3
+
+#define I40IW_CQP_INIT_WQE(wqe) memset(wqe, 0, 64)
+
+#define I40IW_GET_CURRENT_CQ_ELEMENT(_cq) \
+   ( \
+   &((_cq)->cq_base[I40IW_RING_GETCURRENT_HEAD((_cq)->cq_ring)])  \
+   )
+#define I40IW_GET_CURRENT_EXTENDED_CQ_ELEMENT(_cq) \
+   ( \
+   &(((struct i40iw_extended_cqe *)\
+  
((_cq)->cq_base))[I40IW_RING_GETCURRENT_HEAD((_cq)->cq_ring)]) \
+   )
+
+#define I40IW_GET_CURRENT_AEQ_ELEMENT(_aeq) \
+   ( \
+   &_aeq->aeqe_base[I40IW_RING_GETCURRENT_TAIL(_aeq->aeq_ring)]   \
+   )
+
+#define I40IW_GET_CURRENT_CEQ_ELEMENT(_ceq) \
+  

[PATCH V1 07/16] i40iw: add hw and utils files

2015-12-21 Thread Faisal Latif
i40iw_hw.c, i40iw_utils.c, and i40iw_osdep.h handle interrupts and
processing.

Added feedback from Christoph Hellwig.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_hw.c|  730 +
 drivers/infiniband/hw/i40iw/i40iw_osdep.h |  215 +
 drivers/infiniband/hw/i40iw/i40iw_utils.c | 1256 +
 3 files changed, 2201 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_hw.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_osdep.h
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_utils.c

diff --git a/drivers/infiniband/hw/i40iw/i40iw_hw.c 
b/drivers/infiniband/hw/i40iw/i40iw_hw.c
new file mode 100644
index 000..93ef834
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_hw.c
@@ -0,0 +1,730 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "i40iw.h"
+
+/**
+ * i40iw_initialize_hw_resources - initialize hw resource during open
+ * @iwdev: iwarp device
+ */
+u32 i40iw_initialize_hw_resources(struct i40iw_device *iwdev)
+{
+   unsigned long num_pds;
+   u32 resources_size;
+   u32 max_mr;
+   u32 max_qp;
+   u32 max_cq;
+   u32 arp_table_size;
+   u32 mrdrvbits;
+   void *resource_ptr;
+
+   max_qp = iwdev->sc_dev.hmc_info->hmc_obj[I40IW_HMC_IW_QP].cnt;
+   max_cq = iwdev->sc_dev.hmc_info->hmc_obj[I40IW_HMC_IW_CQ].cnt;
+   max_mr = iwdev->sc_dev.hmc_info->hmc_obj[I40IW_HMC_IW_MR].cnt;
+   arp_table_size = iwdev->sc_dev.hmc_info->hmc_obj[I40IW_HMC_IW_ARP].cnt;
+   iwdev->max_cqe = 0xF;
+   num_pds = max_qp * 4;
+   resources_size = sizeof(struct i40iw_arp_entry) * arp_table_size;
+   resources_size += sizeof(unsigned long) * BITS_TO_LONGS(max_qp);
+   resources_size += sizeof(unsigned long) * BITS_TO_LONGS(max_mr);
+   resources_size += sizeof(unsigned long) * BITS_TO_LONGS(max_cq);
+   resources_size += sizeof(unsigned long) * BITS_TO_LONGS(num_pds);
+   resources_size += sizeof(unsigned long) * BITS_TO_LONGS(arp_table_size);
+   resources_size += sizeof(struct i40iw_qp **) * max_qp;
+   iwdev->mem_resources = kzalloc(resources_size, GFP_KERNEL);
+
+   if (!iwdev->mem_resources)
+   return -ENOMEM;
+
+   iwdev->max_qp = max_qp;
+   iwdev->max_mr = max_mr;
+   iwdev->max_cq = max_cq;
+   iwdev->max_pd = num_pds;
+   iwdev->arp_table_size = arp_table_size;
+   iwdev->arp_table = (struct i40iw_arp_entry *)iwdev->mem_resources;
+   resource_ptr = iwdev->mem_resources + (sizeof(struct i40iw_arp_entry) * 
arp_table_size);
+
+   iwdev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY |
+   IB_DEVICE_MEM_WINDOW | IB_DEVICE_MEM_MGT_EXTENSIONS;
+
+   iwdev->allocated_qps = resource_ptr;
+   iwdev->allocated_cqs = &iwdev->allocated_qps[BITS_TO_LONGS(max_qp)];
+   iwdev->allocated_mrs = &iwdev->allocated_cqs[BITS_TO_LONGS(max_cq)];
+   iwdev->allocated_pds = &iwdev->allocated_mrs[BITS_TO_LONGS(max_mr)];
+   iwdev->allocated_arps = &iwdev->allocated_pds[BITS_TO_LONGS(num_pds)];
+   iwdev->qp_table = (struct i40iw_qp 
**)(&iwdev->allocated_arps[BITS_TO_LONGS(arp_table_size)]);
+   set_bit(0, iwdev->allocated_mrs);
+   set_bit(0, iwdev->allocated_qps);
+   set_bit(0, iwdev->allocated_cqs);
+   set_bi
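i40iw_initialize_hw_resources() above sizes a single kzalloc'd region with BITS_TO_LONGS for each object type and then carves it into back-to-back bitmaps by pointer arithmetic. The offset arithmetic alone can be sketched with invented helpers (these functions are mine, for illustration only):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG    (CHAR_BIT * sizeof(long))
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical helpers: offsets (in longs) of the CQ and MR bitmaps when
 * they follow the QP bitmap inside one allocation, as in the driver's
 * layout of allocated_qps / allocated_cqs / allocated_mrs. */
static size_t cq_map_offset(size_t max_qp)
{
	return BITS_TO_LONGS(max_qp);
}

static size_t mr_map_offset(size_t max_qp, size_t max_cq)
{
	return BITS_TO_LONGS(max_qp) + BITS_TO_LONGS(max_cq);
}
```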

[PATCH V1 04/16] i40iw: add puda code

2015-12-21 Thread Faisal Latif
i40iw_puda.[ch] handle iWARP connection packets as well as exception
packets over multiple privileged-mode UDA queues.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_puda.c | 1437 ++
 drivers/infiniband/hw/i40iw/i40iw_puda.h |  183 
 2 files changed, 1620 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_puda.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_puda.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_puda.c 
b/drivers/infiniband/hw/i40iw/i40iw_puda.c
new file mode 100644
index 000..cfbef59
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_puda.c
@@ -0,0 +1,1437 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include "i40iw_osdep.h"
+#include "i40iw_register.h"
+#include "i40iw_status.h"
+#include "i40iw_hmc.h"
+
+#include "i40iw_d.h"
+#include "i40iw_type.h"
+#include "i40iw_p.h"
+#include "i40iw_puda.h"
+
+static void i40iw_ieq_receive(struct i40iw_sc_dev *dev,
+ struct i40iw_puda_buf *buf);
+static void i40iw_ieq_tx_compl(struct i40iw_sc_dev *dev, void *sqwrid);
+static void i40iw_ilq_putback_rcvbuf(struct i40iw_sc_qp *qp, u32 wqe_idx);
+static enum i40iw_status_code i40iw_puda_replenish_rq(struct i40iw_puda_rsrc
+ *rsrc, bool initial);
+/**
+ * i40iw_puda_get_listbuf - get buffer from puda list
+ * @list: list to use for buffers (ILQ or IEQ)
+ */
+static struct i40iw_puda_buf *i40iw_puda_get_listbuf(struct list_head *list)
+{
+   struct i40iw_puda_buf *buf = NULL;
+
+   if (!list_empty(list)) {
+   buf = (struct i40iw_puda_buf *)list->next;
+   list_del((struct list_head *)&buf->list);
+   }
+   return buf;
+}
+
+/**
+ * i40iw_puda_get_bufpool - return buffer from resource
+ * @rsrc: resource to use for buffer
+ */
+struct i40iw_puda_buf *i40iw_puda_get_bufpool(struct i40iw_puda_rsrc *rsrc)
+{
+   struct i40iw_puda_buf *buf = NULL;
+   struct list_head *list = &rsrc->bufpool;
+   unsigned long   flags;
+
+   spin_lock_irqsave(&rsrc->bufpool_lock, flags);
+   buf = i40iw_puda_get_listbuf(list);
+   if (buf)
+   rsrc->avail_buf_count--;
+   else
+   rsrc->stats_buf_alloc_fail++;
+   spin_unlock_irqrestore(&rsrc->bufpool_lock, flags);
+   return buf;
+}
+
+/**
+ * i40iw_puda_ret_bufpool - return buffer to rsrc list
+ * @rsrc: resource to use for buffer
+ * @buf: buffer to return to resource
+ */
+void i40iw_puda_ret_bufpool(struct i40iw_puda_rsrc *rsrc,
+   struct i40iw_puda_buf *buf)
+{
+   unsigned long   flags;
+
+   spin_lock_irqsave(&rsrc->bufpool_lock, flags);
+   list_add(&buf->list, &rsrc->bufpool);
+   spin_unlock_irqrestore(&rsrc->bufpool_lock, flags);
+   rsrc->avail_buf_count++;
+}
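The get/ret pair above is a lock-protected free list over preallocated buffers. A user-space sketch of the same pattern, with a pthread mutex standing in for the spinlock (all names here are mine, not the driver's):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal pool mirroring i40iw_puda_get_bufpool()/i40iw_puda_ret_bufpool():
 * pop the head under the lock on get, push back under the lock on return
 * (LIFO), counting failures when the pool is empty. */
struct demo_buf {
	struct demo_buf *next;
};

struct demo_pool {
	pthread_mutex_t lock;
	struct demo_buf *head;
	int avail_buf_count;
	int stats_buf_alloc_fail;
};

static struct demo_buf *pool_get(struct demo_pool *p)
{
	struct demo_buf *buf;

	pthread_mutex_lock(&p->lock);
	buf = p->head;
	if (buf) {
		p->head = buf->next;
		p->avail_buf_count--;
	} else {
		p->stats_buf_alloc_fail++;	/* mirrors stats_buf_alloc_fail */
	}
	pthread_mutex_unlock(&p->lock);
	return buf;
}

static void pool_ret(struct demo_pool *p, struct demo_buf *buf)
{
	pthread_mutex_lock(&p->lock);
	buf->next = p->head;
	p->head = buf;
	p->avail_buf_count++;
	pthread_mutex_unlock(&p->lock);
}

/* Exercise the pool: two buffers in, LIFO order out, empty pool counted. */
static int pool_demo(void)
{
	struct demo_pool p = { PTHREAD_MUTEX_INITIALIZER, NULL, 0, 0 };
	struct demo_buf a, b;

	pool_ret(&p, &a);
	pool_ret(&p, &b);
	if (pool_get(&p) != &b)
		return 1;
	if (pool_get(&p) != &a)
		return 2;
	if (pool_get(&p) != NULL)
		return 3;
	return p.stats_buf_alloc_fail == 1 ? 0 : 4;
}
```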
+
+/**
+ * i40iw_puda_post_recvbuf - set wqe for rcv buffer
+ * @rsrc: resource ptr
+ * @wqe_idx: wqe index to use
+ * @buf: puda buffer for rcv q
+ * @initial: flag if during init time
+ */
+static void i40iw_puda_post_recvbuf(struct i40iw_puda_rsrc *rsrc, u32 wqe_idx,
+   struct i40iw_puda_buf *buf, bool initial)
+{
+   u64 *wqe;
+   struct i40iw_sc_qp *qp = &rsrc->qp;
+   u64 offset24 = 0;
+
+   qp->qp_uk.rq_wrid_array[wqe_idx] = (uintptr_t)buf;
+   wqe = &qp->qp_uk.rq_base[

[PATCH V1 16/16] i40iw: changes for build of i40iw module

2015-12-21 Thread Faisal Latif
Update MAINTAINERS, Kconfig, and Makefile to build the i40iw module.

Signed-off-by: Faisal Latif 
---
 MAINTAINERS| 10 ++
 drivers/infiniband/Kconfig |  1 +
 drivers/infiniband/hw/Makefile |  1 +
 3 files changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9bff63c..fc85ad3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5601,6 +5601,16 @@ F:   Documentation/networking/i40evf.txt
 F: drivers/net/ethernet/intel/
 F: drivers/net/ethernet/intel/*/
 
+INTEL RDMA RNIC DRIVER
+M: Faisal Latif 
+R: Chien Tin Tung 
+R: Mustafa Ismail 
+R: Shiraz Saleem 
+R: Tatyana Nikolova 
+L: linux-rdma@vger.kernel.org
+S: Supported
+F: drivers/infiniband/hw/i40iw/
+
 INTEL-MID GPIO DRIVER
 M: David Cohen 
 L: linux-g...@vger.kernel.org
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 282ec0b..cd8fa5c 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -59,6 +59,7 @@ source "drivers/infiniband/hw/mthca/Kconfig"
 source "drivers/infiniband/hw/qib/Kconfig"
 source "drivers/infiniband/hw/cxgb3/Kconfig"
 source "drivers/infiniband/hw/cxgb4/Kconfig"
+source "drivers/infiniband/hw/i40iw/Kconfig"
 source "drivers/infiniband/hw/mlx4/Kconfig"
 source "drivers/infiniband/hw/mlx5/Kconfig"
 source "drivers/infiniband/hw/nes/Kconfig"
diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile
index aded2a5..8860057 100644
--- a/drivers/infiniband/hw/Makefile
+++ b/drivers/infiniband/hw/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_INFINIBAND_MTHCA)  += mthca/
 obj-$(CONFIG_INFINIBAND_QIB)   += qib/
 obj-$(CONFIG_INFINIBAND_CXGB3) += cxgb3/
 obj-$(CONFIG_INFINIBAND_CXGB4) += cxgb4/
+obj-$(CONFIG_INFINIBAND_I40IW) += i40iw/
 obj-$(CONFIG_MLX4_INFINIBAND)  += mlx4/
 obj-$(CONFIG_MLX5_INFINIBAND)  += mlx5/
 obj-$(CONFIG_INFINIBAND_NES)   += nes/
-- 
2.5.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
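
The Kconfig hunk above sources drivers/infiniband/hw/i40iw/Kconfig, but that
file is not part of this patch. For readers following along, the sourced entry
would look roughly like this (the prompt text and dependency list are
assumptions, not taken from the series):

```kconfig
config INFINIBAND_I40IW
	tristate "Intel(R) Ethernet X722 iWARP Driver"
	depends on INET
	---help---
	  Intel(R) Ethernet X722 iWARP Driver support.
```

With such an entry in place, the `obj-$(CONFIG_INFINIBAND_I40IW) += i40iw/`
line in the Makefile hunk picks the driver up when the option is enabled.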


[PATCH V1 09/16] i40iw: add file to handle cqp calls

2015-12-21 Thread Faisal Latif
i40iw_ctrl.c provides hardware wqe support and cqp handling.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_ctrl.c | 4740 ++
 1 file changed, 4740 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_ctrl.c

diff --git a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
new file mode 100644
index 000..dba742a
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
@@ -0,0 +1,4740 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include "i40iw_osdep.h"
+#include "i40iw_register.h"
+#include "i40iw_status.h"
+#include "i40iw_hmc.h"
+
+#include "i40iw_d.h"
+#include "i40iw_type.h"
+#include "i40iw_p.h"
+#include "i40iw_vf.h"
+#include "i40iw_virtchnl.h"
+
+/**
+ * i40iw_insert_wqe_hdr - write wqe header
+ * @wqe: cqp wqe for header
+ * @header: header for the cqp wqe
+ */
+static inline void i40iw_insert_wqe_hdr(u64 *wqe, u64 header)
+{
+   wmb(); /* make sure WQE is populated before valid bit is set */
+   set_64bit_val(wqe, 24, header);
+}
+
+/**
+ * i40iw_get_cqp_reg_info - get head and tail for cqp using registers
+ * @cqp: struct for cqp hw
+ * @val: cqp tail register value
+ * @tail: wqtail register value
+ * @error: cqp processing err
+ */
+static inline void i40iw_get_cqp_reg_info(struct i40iw_sc_cqp *cqp,
+ u32 *val,
+ u32 *tail,
+ u32 *error)
+{
+   if (cqp->dev->is_pf) {
+   *val = i40iw_rd32(cqp->dev->hw, I40E_PFPE_CQPTAIL);
+   *tail = RS_32(*val, I40E_PFPE_CQPTAIL_WQTAIL);
+   *error = RS_32(*val, I40E_PFPE_CQPTAIL_CQP_OP_ERR);
+   } else {
+   *val = i40iw_rd32(cqp->dev->hw, I40E_VFPE_CQPTAIL1);
+   *tail = RS_32(*val, I40E_VFPE_CQPTAIL_WQTAIL);
+   *error = RS_32(*val, I40E_VFPE_CQPTAIL_CQP_OP_ERR);
+   }
+}
+
+/**
+ * i40iw_cqp_poll_registers - poll cqp registers
+ * @cqp: struct for cqp hw
+ * @tail: wqtail register value
+ * @count: how many times to try for completion
+ */
+static enum i40iw_status_code i40iw_cqp_poll_registers(
+   struct i40iw_sc_cqp *cqp,
+   u32 tail,
+   u32 count)
+{
+   u32 i = 0;
+   u32 newtail, error, val;
+
+   while (i < count) {
+   i++;
+   i40iw_get_cqp_reg_info(cqp, &val, &newtail, &error);
+   if (error) {
+   error = (cqp->dev->is_pf) ?
+i40iw_rd32(cqp->dev->hw, I40E_PFPE_CQPERRCODES) :
+i40iw_rd32(cqp->dev->hw, I40E_VFPE_CQPERRCODES1);
+   return I40IW_ERR_CQP_COMPL_ERROR;
+   }
+   if (newtail != tail) {
+   /* SUCCESS */
+   I40IW_RING_MOVE_TAIL(cqp->sq_ring);
+   return 0;
+   }
+   udelay(I40IW_SLEEP_COUNT);
+   }
+   return I40IW_ERR_TIMEOUT;
+}
+
+/**
+ * i40iw_sc_parse_fpm_commit_buf - parse fpm commit buffer
+ * @buf: ptr to fpm commit buffer
+ * @info: ptr to i40iw_hmc_obj_info struct
+ *
+ * parses fpm commit info and copies base values
+ * of hmc objects in hmc_info
+ *

[PATCH V1 11/16] i40iw: add X722 register file

2015-12-21 Thread Faisal Latif
X722 hardware register defines for the iWARP component.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_register.h | 1027 ++
 1 file changed, 1027 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_register.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_register.h b/drivers/infiniband/hw/i40iw/i40iw_register.h
new file mode 100644
index 000..1bcac4f
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_register.h
@@ -0,0 +1,1027 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#ifndef I40IW_REGISTER_H
+#define I40IW_REGISTER_H
+
+#define I40E_GLGEN_STAT   0x000B612C /* Reset: POR */
+
+#define I40E_PFHMC_PDINV   0x000C0300 /* Reset: PFR */
+#define I40E_PFHMC_PDINV_PMSDIDX_SHIFT 0
+#define I40E_PFHMC_PDINV_PMSDIDX_MASK  (0xFFF <<  I40E_PFHMC_PDINV_PMSDIDX_SHIFT)
+#define I40E_PFHMC_PDINV_PMPDIDX_SHIFT 16
+#define I40E_PFHMC_PDINV_PMPDIDX_MASK  (0x1FF <<  I40E_PFHMC_PDINV_PMPDIDX_SHIFT)
+#define I40E_PFHMC_SDCMD_PMSDWR_SHIFT  31
+#define I40E_PFHMC_SDCMD_PMSDWR_MASK   (0x1 <<  I40E_PFHMC_SDCMD_PMSDWR_SHIFT)
+#define I40E_PFHMC_SDDATALOW_PMSDVALID_SHIFT   0
+#define I40E_PFHMC_SDDATALOW_PMSDVALID_MASK    (0x1 <<  I40E_PFHMC_SDDATALOW_PMSDVALID_SHIFT)
+#define I40E_PFHMC_SDDATALOW_PMSDTYPE_SHIFT    1
+#define I40E_PFHMC_SDDATALOW_PMSDTYPE_MASK (0x1 <<  I40E_PFHMC_SDDATALOW_PMSDTYPE_SHIFT)
+#define I40E_PFHMC_SDDATALOW_PMSDBPCOUNT_SHIFT 2
+#define I40E_PFHMC_SDDATALOW_PMSDBPCOUNT_MASK  (0x3FF <<  I40E_PFHMC_SDDATALOW_PMSDBPCOUNT_SHIFT)
+
+#define I40E_PFINT_DYN_CTLN(_INTPF) (0x00034800 + ((_INTPF) * 4)) /* _i=0...511 */ /* Reset: PFR */
+#define I40E_PFINT_DYN_CTLN_INTENA_SHIFT  0
+#define I40E_PFINT_DYN_CTLN_INTENA_MASK   (0x1 <<  I40E_PFINT_DYN_CTLN_INTENA_SHIFT)
+#define I40E_PFINT_DYN_CTLN_CLEARPBA_SHIFT    1
+#define I40E_PFINT_DYN_CTLN_CLEARPBA_MASK (0x1 <<  I40E_PFINT_DYN_CTLN_CLEARPBA_SHIFT)
+#define I40E_PFINT_DYN_CTLN_ITR_INDX_SHIFT    3
+#define I40E_PFINT_DYN_CTLN_ITR_INDX_MASK (0x3 <<  I40E_PFINT_DYN_CTLN_ITR_INDX_SHIFT)
+
+#define I40E_VFINT_DYN_CTLN1(_INTVF)   (0x3800 + ((_INTVF) * 4)) /* _i=0...15 */ /* Reset: VFR */
+#define I40E_GLHMC_VFPDINV(_i)   (0x000C8300 + ((_i) * 4)) /* _i=0...31 */ /* Reset: CORER */
+
+#define I40E_PFHMC_PDINV_PMSDPARTSEL_SHIFT 15
+#define I40E_PFHMC_PDINV_PMSDPARTSEL_MASK  (0x1 <<  I40E_PFHMC_PDINV_PMSDPARTSEL_SHIFT)
+#define I40E_GLPCI_LBARCTRL    0x000BE484 /* Reset: POR */
+#define I40E_GLPCI_LBARCTRL_PE_DB_SIZE_SHIFT    4
+#define I40E_GLPCI_LBARCTRL_PE_DB_SIZE_MASK (0x3 <<  I40E_GLPCI_LBARCTRL_PE_DB_SIZE_SHIFT)
+
+#define I40E_PFPE_AEQALLOC   0x00131180 /* Reset: PFR */
+#define I40E_PFPE_AEQALLOC_AECOUNT_SHIFT 0
+#define I40E_PFPE_AEQALLOC_AECOUNT_MASK  (0x <<  I40E_PFPE_AEQALLOC_AECOUNT_SHIFT)
+#define I40E_PFPE_CCQPHIGH  0x8200 /* Reset: PFR */
+#define I40E_PFPE_CCQPHIGH_PECCQPHIGH_SHIFT 0
+#define I40E_PFPE_CCQPHIGH_PECCQPHIGH_MASK  (0x <<  I40E_PFPE_CCQPHIGH_PECCQPHIGH_SHIFT)
+#define I40E_PFPE_CCQPLOW 0x8180 /* Reset: PFR */
+#define I40E_PFPE_CCQPLOW_PECCQPLOW_SHIFT 0
+#define I40E_PFPE_CCQPLOW_PECCQPLOW_MASK  (0x <<  I40E_PFPE_CCQPLOW_PECCQPLOW_SHIFT)
+#define I40E_PFPE_CCQPSTATUS   0x8100 /* Reset

[PATCH V1 06/16] i40iw: add hmc resource files

2015-12-21 Thread Faisal Latif
i40iw_hmc.[ch] manage the hmc for the device.

Acked-by: Anjali Singhai Jain 
Acked-by: Shannon Nelson 
Signed-off-by: Faisal Latif 
---
 drivers/infiniband/hw/i40iw/i40iw_hmc.c | 821 
 drivers/infiniband/hw/i40iw/i40iw_hmc.h | 241 ++
 2 files changed, 1062 insertions(+)
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_hmc.c
 create mode 100644 drivers/infiniband/hw/i40iw/i40iw_hmc.h

diff --git a/drivers/infiniband/hw/i40iw/i40iw_hmc.c b/drivers/infiniband/hw/i40iw/i40iw_hmc.c
new file mode 100644
index 000..96bec54
--- /dev/null
+++ b/drivers/infiniband/hw/i40iw/i40iw_hmc.c
@@ -0,0 +1,821 @@
+/***
+*
+* Copyright (c) 2015 Intel Corporation.  All rights reserved.
+*
+* This software is available to you under a choice of one of two
+* licenses.  You may choose to be licensed under the terms of the GNU
+* General Public License (GPL) Version 2, available from the file
+* COPYING in the main directory of this source tree, or the
+* OpenFabrics.org BSD license below:
+*
+*   Redistribution and use in source and binary forms, with or
+*   without modification, are permitted provided that the following
+*   conditions are met:
+*
+*- Redistributions of source code must retain the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer.
+*
+*- Redistributions in binary form must reproduce the above
+*  copyright notice, this list of conditions and the following
+*  disclaimer in the documentation and/or other materials
+*  provided with the distribution.
+*
+* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+* SOFTWARE.
+*
+***/
+
+#include "i40iw_osdep.h"
+#include "i40iw_register.h"
+#include "i40iw_status.h"
+#include "i40iw_hmc.h"
+#include "i40iw_d.h"
+#include "i40iw_type.h"
+#include "i40iw_p.h"
+#include "i40iw_vf.h"
+#include "i40iw_virtchnl.h"
+
+/**
+ * i40iw_find_sd_index_limit - finds segment descriptor index limit
+ * @hmc_info: pointer to the HMC configuration information structure
+ * @type: type of HMC resources we're searching
+ * @index: starting index for the object
+ * @cnt: number of objects we're trying to create
+ * @sd_idx: pointer to return index of the segment descriptor in question
+ * @sd_limit: pointer to return the maximum number of segment descriptors
+ *
+ * This function calculates the segment descriptor index and index limit
+ * for the resource defined by i40iw_hmc_rsrc_type.
+ */
+
+static inline void i40iw_find_sd_index_limit(struct i40iw_hmc_info *hmc_info,
+u32 type,
+u32 idx,
+u32 cnt,
+u32 *sd_idx,
+u32 *sd_limit)
+{
+   u64 fpm_addr, fpm_limit;
+
+   fpm_addr = hmc_info->hmc_obj[(type)].base +
+   hmc_info->hmc_obj[type].size * idx;
+   fpm_limit = fpm_addr + hmc_info->hmc_obj[type].size * cnt;
+   *sd_idx = (u32)(fpm_addr / I40IW_HMC_DIRECT_BP_SIZE);
+   *sd_limit = (u32)((fpm_limit - 1) / I40IW_HMC_DIRECT_BP_SIZE);
+   *sd_limit += 1;
+}
+
+/**
+ * i40iw_find_pd_index_limit - finds page descriptor index limit
+ * @hmc_info: pointer to the HMC configuration information struct
+ * @type: HMC resource type we're examining
+ * @idx: starting index for the object
+ * @cnt: number of objects we're trying to create
+ * @pd_index: pointer to return page descriptor index
+ * @pd_limit: pointer to return page descriptor index limit
+ *
+ * Calculates the page descriptor index and index limit for the resource
+ * defined by i40iw_hmc_rsrc_type.
+ */
+
+static inline void i40iw_find_pd_index_limit(struct i40iw_hmc_info *hmc_info,
+u32 type,
+u32 idx,
+u32 cnt,
+u32 *pd_idx,
+u32 *pd_limit)
+{
+   u64 fpm_adr, fpm_limit;
+
+   fpm_adr = hmc_info->hmc_obj[type].base +
+   hmc_info->hmc_obj[type].size * idx;
+   fpm_limit = fpm_adr + hmc_info->hmc_obj[type].size * cnt;
+   *pd_idx = (u32)(fpm_adr / I40IW_HMC_PAGED_BP_SIZE);
+   *pd_limit = (u32)((fpm_limit - 1) / I40IW_HMC_PAGED_BP_SIZE);

Re: [PATCH v2 0/5] Clean up SDMA engine code

2015-12-21 Thread ira.weiny
On Tue, Dec 08, 2015 at 05:10:08PM -0500, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> Various improvements to the SDMA engine code.

Greg,

Thanks for reviewing and accepting our patches to staging-testing.  I apologize
for the conflicts we had between the 3 of us submitting.  However, in
attempting to rework an internal branch to ensure this does not happen again I
believe there were more conflicts than their should have been due to patches
being accepted out of order.

For example, I found the following error in your staging tree below.

This series you applied in the following order which causes a build failure on
the middle commit -- a0d4069.

483119a staging/rdma/hfi1: Unconditionally clean-up SDMA queues
def8228 staging/rdma/hfi1: Convert to use get_user_pages_fast
a0d4069 staging/rdma/hfi1: Add page lock limit check for SDMA requests
faa98b8 staging/rdma/hfi1: Clean-up unnecessary goto statements
6a5464f staging/rdma/hfi1: Detect SDMA transmission error early

The order as submitted was:

staging/rdma/hfi1: Convert to use get_user_pages_fast
staging/rdma/hfi1: Unconditionally clean-up SDMA queues
staging/rdma/hfi1: Clean-up unnecessary goto statements
staging/rdma/hfi1: Detect SDMA transmission error early
staging/rdma/hfi1: Add page lock limit check for SDMA requests



Do I need to resolve this somehow?  Or is this something you resolve while the
patches are in staging-testing?

Is there something we need to do in the cover letter of a patch series to
ensure order?  Perhaps my cover letter implied these were not ordered?  If so,
I again apologize.

Thanks,
Ira

> 
> ---
> Changes from V1:
>   Fix off by one error in the last patch
> 
> Mitko Haralanov (5):
>   staging/rdma/hfi1: Convert to use get_user_pages_fast
>   staging/rdma/hfi1: Unconditionally clean-up SDMA queues
>   staging/rdma/hfi1: Clean-up unnecessary goto statements
>   staging/rdma/hfi1: Detect SDMA transmission error early
>   staging/rdma/hfi1: Add page lock limit check for SDMA requests
> 
>  drivers/staging/rdma/hfi1/file_ops.c   |  11 +-
>  drivers/staging/rdma/hfi1/hfi.h|   4 +-
>  drivers/staging/rdma/hfi1/user_pages.c |  97 +++---
>  drivers/staging/rdma/hfi1/user_sdma.c  | 319 
> +++--
>  drivers/staging/rdma/hfi1/user_sdma.h  |   2 +
>  5 files changed, 222 insertions(+), 211 deletions(-)
> 
> -- 
> 1.8.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/5] Clean up SDMA engine code

2015-12-21 Thread Greg KH
On Mon, Dec 21, 2015 at 06:48:03PM -0500, ira.weiny wrote:
> On Tue, Dec 08, 2015 at 05:10:08PM -0500, ira.we...@intel.com wrote:
> > From: Ira Weiny 
> > 
> > Various improvements to the SDMA engine code.
> 
> Greg,
> 
> Thanks for reviewing and accepting our patches to staging-testing.  I 
> apologize
> for the conflicts we had between the 3 of us submitting.  However, in
> attempting to rework an internal branch to ensure this does not happen again I
> believe there were more conflicts than their should have been due to patches
> being accepted out of order.
> 
> For example, I found the following error in your staging tree below.
> 
> This series you applied in the following order which causes a build failure on
> the middle commit -- a0d4069.
> 
> 483119a staging/rdma/hfi1: Unconditionally clean-up SDMA queues
> def8228 staging/rdma/hfi1: Convert to use get_user_pages_fast
> a0d4069 staging/rdma/hfi1: Add page lock limit check for SDMA requests
> faa98b8 staging/rdma/hfi1: Clean-up unnecessary goto statements
> 6a5464f staging/rdma/hfi1: Detect SDMA transmission error early
> 
> The order as submitted was:
> 
> staging/rdma/hfi1: Convert to use get_user_pages_fast
> staging/rdma/hfi1: Unconditionally clean-up SDMA queues
> staging/rdma/hfi1: Clean-up unnecessary goto statements
> staging/rdma/hfi1: Detect SDMA transmission error early
> staging/rdma/hfi1: Add page lock limit check for SDMA requests
> 
> 
> 
> Do I need to resolve this somehow?  Or is this something you resolve while the
> patches are in staging-testing?
> 
> Is there something we need to do in the cover letter of a patch series to
> ensure order?  Perhaps my cover letter implied these were not ordered?  If so,
> I again apologize.

Did you number your patches?  That's the only way to ensure that they
are applied in the correct order, that's what I sort on to apply them.
If you don't order them, I randomly guess, or just reject them...

All seems to build now, right?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
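
As the reply notes, the numeric prefix in the subject ([PATCH v2 3/5]) is
what keeps a series ordered when it is applied, and git generates those
numbers itself. A throwaway-repo sketch of producing a numbered, re-rolled
series with a cover letter (paths and commit messages are illustrative):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "You"
git commit -qm "initial" --allow-empty
for n in 1 2 3; do
	echo "change $n" > file.txt
	git add file.txt
	git commit -qm "hfi1: example change $n"
done
# --cover-letter adds patch 0/3; -v2 marks a re-roll as [PATCH v2 n/3]
git format-patch -v2 --cover-letter -o outgoing/ HEAD~3
ls outgoing/
```

git send-email then preserves the numbering, so a maintainer sorting on the
subject applies the series in the intended order.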


Re: [RFC PATCH 00/15] staging/rdma/hfi1: Initial patches to add rdmavt support in HFI1

2015-12-21 Thread ira.weiny
On Mon, Dec 21, 2015 at 02:02:35PM -0800, gre...@linuxfoundation.org wrote:
> On Mon, Dec 21, 2015 at 01:12:14AM -0500, ira.weiny wrote:
> > Greg, Doug,
> > 
> > As mentioned below, these patches depend on the new rdmavt library 
> > submitted to
> > Doug on linux-rdma.
> > 
> > We continue to identify (and rework) patches by our other developers which 
> > can
> > be submitted without conflicts with this series.  Furthermore, We have, as 
> > much
> > as possible, placed fixes directly into rdmavt such that those changes can 
> > be
> > dropped from hfi1.  But at this point, we need to know if and where these 
> > are
> > going to land so that we can start reworking as appropriate.
> > 
> > Therefore, I would like to discuss plans to get hfi1 under the same 
> > maintainer
> > to work through this transitional period.
> > 
> > Basically, At what point should we stop submitting patches to Greg and start
> > submitting to Doug?
> > 
> > Should we consider the merge window itself as the swap over point and submit
> > changes to Doug at that point?  If so, should we continue to submit what we 
> > can
> > to Greg until then (and continue rebase'ing the series below on that work)? 
> >  Or
> > given Gregs backlog, should we stop submitting to Greg sometime prior to the
> > merge window?
> > 
> > That brings up my final question, at the point of swap over I assume 
> > anything
> > not accepted by Greg should be considered rejected and we need to resubmit 
> > to
> > Doug?
> 
> If Doug accepts the library changes, let me know that public git commit
> and I can pull it into the staging-next branch and you can continue to
> send me staging patches that way.

Won't this cause a conflict during the merge window?

How do we handle changes which affect both qib and hfi1?

Ira

> 
> That's the easiest thing to do usually.
> 
> thanks,
> 
> greg k-h
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/5] Clean up SDMA engine code

2015-12-21 Thread ira.weiny
On Mon, Dec 21, 2015 at 04:13:49PM -0800, gre...@linuxfoundation.org wrote:
> On Mon, Dec 21, 2015 at 06:48:03PM -0500, ira.weiny wrote:
> > On Tue, Dec 08, 2015 at 05:10:08PM -0500, ira.we...@intel.com wrote:
> > > From: Ira Weiny 
> > > 
> > > Various improvements to the SDMA engine code.
> > 
> > Greg,
> > 
> > Thanks for reviewing and accepting our patches to staging-testing.  I 
> > apologize
> > for the conflicts we had between the 3 of us submitting.  However, in
> > attempting to rework an internal branch to ensure this does not happen 
> > again I
> > believe there were more conflicts than their should have been due to patches
> > being accepted out of order.
> > 
> > For example, I found the following error in your staging tree below.
> > 
> > This series you applied in the following order which causes a build failure 
> > on
> > the middle commit -- a0d4069.
> > 
> > 483119a staging/rdma/hfi1: Unconditionally clean-up SDMA queues
> > def8228 staging/rdma/hfi1: Convert to use get_user_pages_fast
> > a0d4069 staging/rdma/hfi1: Add page lock limit check for SDMA requests
> > faa98b8 staging/rdma/hfi1: Clean-up unnecessary goto statements
> > 6a5464f staging/rdma/hfi1: Detect SDMA transmission error early
> > 
> > The order as submitted was:
> > 
> > staging/rdma/hfi1: Convert to use get_user_pages_fast
> > staging/rdma/hfi1: Unconditionally clean-up SDMA queues
> > staging/rdma/hfi1: Clean-up unnecessary goto statements
> > staging/rdma/hfi1: Detect SDMA transmission error early
> > staging/rdma/hfi1: Add page lock limit check for SDMA requests
> > 
> > 
> > 
> > Do I need to resolve this somehow?  Or is this something you resolve while 
> > the
> > patches are in staging-testing?
> > 
> > Is there something we need to do in the cover letter of a patch series to
> > ensure order?  Perhaps my cover letter implied these were not ordered?  If 
> > so,
> > I again apologize.
> 
> Did you number your patches?

Yes, sent with git-send-email.

http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2015-December/thread.html#82509

> That's the only way to ensure that they
> are applied in the correct order, that's what I sort on to apply them.
> If you don't order them, I randomly guess, or just reject them...
> 
> All seems to build now, right?

Yes, all builds now.  I just did not know whether, as part of testing, an
incremental build check would then reject the patch.

Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
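
The build failure at an intermediate commit discussed above can be caught
before posting: git can replay a series and run the build after every step.
A throwaway-repo sketch of the idea (the trivial Makefile and commit messages
are stand-ins for a real tree):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "You"
printf 'all:\n\ttrue\n' > Makefile
git add Makefile
git commit -qm "build: add trivial makefile"
for n in 1 2; do
	echo "# rev $n" >> Makefile
	git add Makefile
	git commit -qm "example change $n"
done
# Replay the last two commits, running the build after each one;
# the rebase stops at the first commit whose build fails.
git rebase -x "make --quiet" HEAD~2
```

In a kernel tree the `-x` command would be the relevant `make` invocation for
the touched directory, so a mid-series breakage shows up locally rather than
in the maintainer's tree.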


Re: [PATCH V1 16/16] i40iw: changes for build of i40iw module

2015-12-21 Thread kbuild test robot
Hi Faisal,

[auto build test ERROR on net/master]
[also build test ERROR on v4.4-rc6 next-20151221]
[cannot apply to net-next/master]

url:
https://github.com/0day-ci/linux/commits/Faisal-Latif/add-Intel-R-X722-iWARP-driver/20151222-071852
config: x86_64-allmodconfig (attached as .config)
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/infiniband/hw/i40iw/i40iw_verbs.c: In function 
'i40iw_init_rdma_device':
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2280:39: error: 'struct ib_device' 
>> has no member named 'sys_image_guid'
 ether_addr_copy((u8 *)&iwibdev->ibdev.sys_image_guid, 
iwdev->netdev->dev_addr);
  ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2281:16: error: 'struct ib_device' 
>> has no member named 'fw_ver'
 iwibdev->ibdev.fw_ver = I40IW_FW_VERSION;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2282:16: error: 'struct ib_device' 
>> has no member named 'device_cap_flags'
 iwibdev->ibdev.device_cap_flags = iwdev->device_cap_flags;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2283:16: error: 'struct ib_device' 
>> has no member named 'vendor_id'
 iwibdev->ibdev.vendor_id = iwdev->vendor_id;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2284:16: error: 'struct ib_device' 
>> has no member named 'vendor_part_id'
 iwibdev->ibdev.vendor_part_id = iwdev->vendor_part_id;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2285:16: error: 'struct ib_device' 
>> has no member named 'hw_ver'
 iwibdev->ibdev.hw_ver = I40IW_HW_VERSION;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2286:16: error: 'struct ib_device' 
>> has no member named 'max_mr_size'
 iwibdev->ibdev.max_mr_size = I40IW_MAX_OUTBOUND_MESSAGE_SIZE;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2287:16: error: 'struct ib_device' 
>> has no member named 'max_qp'
 iwibdev->ibdev.max_qp = iwdev->max_qp;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2288:16: error: 'struct ib_device' 
>> has no member named 'max_qp_wr'
 iwibdev->ibdev.max_qp_wr = (I40IW_MAX_WQ_ENTRIES >> 2) - 1;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2289:16: error: 'struct ib_device' 
>> has no member named 'max_sge'
 iwibdev->ibdev.max_sge = I40IW_MAX_WQ_FRAGMENT_COUNT;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2290:16: error: 'struct ib_device' 
>> has no member named 'max_cq'
 iwibdev->ibdev.max_cq = iwdev->max_cq;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2291:16: error: 'struct ib_device' 
>> has no member named 'max_cqe'
 iwibdev->ibdev.max_cqe = iwdev->max_cqe;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2293:16: error: 'struct ib_device' 
>> has no member named 'max_mr'
 iwibdev->ibdev.max_mr = iwdev->max_mr;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2294:16: error: 'struct ib_device' 
>> has no member named 'max_mw'
 iwibdev->ibdev.max_mw = iwdev->max_mr;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2295:16: error: 'struct ib_device' 
>> has no member named 'max_pd'
 iwibdev->ibdev.max_pd = iwdev->max_pd;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2296:16: error: 'struct ib_device' 
>> has no member named 'max_qp_rd_atom'
 iwibdev->ibdev.max_qp_rd_atom = I40IW_MAX_IRD_SIZE;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2297:16: error: 'struct ib_device' 
>> has no member named 'max_sge_rd'
 iwibdev->ibdev.max_sge_rd = 1;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2299:16: error: 'struct ib_device' 
>> has no member named 'max_qp_init_rd_atom'
 iwibdev->ibdev.max_qp_init_rd_atom = I40IW_MAX_IRD_SIZE;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2300:16: error: 'struct ib_device' 
>> has no member named 'atomic_cap'
 iwibdev->ibdev.atomic_cap = IB_ATOMIC_NONE;
   ^
>> drivers/infiniband/hw/i40iw/i40iw_verbs.c:2301:16: error: 'struct ib_device' 
>> ha

Re: [RFC PATCH 00/15] staging/rdma/hfi1: Initial patches to add rdmavt support in HFI1

2015-12-21 Thread gre...@linuxfoundation.org
On Mon, Dec 21, 2015 at 07:19:43PM -0500, ira.weiny wrote:
> On Mon, Dec 21, 2015 at 02:02:35PM -0800, gre...@linuxfoundation.org wrote:
> > On Mon, Dec 21, 2015 at 01:12:14AM -0500, ira.weiny wrote:
> > > Greg, Doug,
> > > 
> > > As mentioned below, these patches depend on the new rdmavt library 
> > > submitted to
> > > Doug on linux-rdma.
> > > 
> > > We continue to identify (and rework) patches by our other developers 
> > > which can
> > > be submitted without conflicts with this series.  Furthermore, We have, 
> > > as much
> > > as possible, placed fixes directly into rdmavt such that those changes 
> > > can be
> > > dropped from hfi1.  But at this point, we need to know if and where these 
> > > are
> > > going to land so that we can start reworking as appropriate.
> > > 
> > > Therefore, I would like to discuss plans to get hfi1 under the same 
> > > maintainer
> > > to work through this transitional period.
> > > 
> > > Basically, At what point should we stop submitting patches to Greg and 
> > > start
> > > submitting to Doug?
> > > 
> > > Should we consider the merge window itself as the swap over point and 
> > > submit
> > > changes to Doug at that point?  If so, should we continue to submit what 
> > > we can
> > > to Greg until then (and continue rebase'ing the series below on that 
> > > work)?  Or
> > > given Gregs backlog, should we stop submitting to Greg sometime prior to 
> > > the
> > > merge window?
> > > 
> > > That brings up my final question, at the point of swap over I assume 
> > > anything
> > > not accepted by Greg should be considered rejected and we need to 
> > > resubmit to
> > > Doug?
> > 
> > If Doug accepts the library changes, let me know that public git commit
> > and I can pull it into the staging-next branch and you can continue to
> > send me staging patches that way.
> 
> Won't this cause a conflict during the merge window?

No, git is good :)

> How do we handle changes which affect both qib and hfi1?

I don't know, now this gets messy...

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V5 0/2] staging/rdma/hfi1: set Gen 3 half-swing for integrated devices.

2015-12-21 Thread ira . weiny
From: Ira Weiny 

This was a single patch before.  The change to dev_dbg required a precursor
patch to add dd_dev_dbg, which is consistent with the other dd_* macros
that automatically use struct hfi1_devdata.

Changes from V4:
Fix build error which arose from other patches being accepted on the
list before this one.


Dean Luick (1):
  staging/rdma/hfi1: set Gen3 half-swing for integrated devices

Ira Weiny (1):
  staging/rdma/hfi1: add dd_dev_dbg

 drivers/staging/rdma/hfi1/chip_registers.h | 11 
 drivers/staging/rdma/hfi1/hfi.h|  4 ++
 drivers/staging/rdma/hfi1/pcie.c   | 82 --
 3 files changed, 93 insertions(+), 4 deletions(-)

-- 
1.8.2



[PATCH V5 1/2] staging/rdma/hfi1: add dd_dev_dbg

2015-12-21 Thread ira . weiny
From: Ira Weiny 

Add dd_dev_dbg, to be used in future patches.  The dd_* macros properly
decode the hfi1_devdata structure used throughout the driver.

Signed-off-by: Ira Weiny 
---
 drivers/staging/rdma/hfi1/hfi.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h
index 62157cc34727..52dcc87689f1 100644
--- a/drivers/staging/rdma/hfi1/hfi.h
+++ b/drivers/staging/rdma/hfi1/hfi.h
@@ -1804,6 +1804,10 @@ static inline u64 hfi1_pkt_base_sdma_integrity(struct hfi1_devdata *dd)
dev_info(&(dd)->pcidev->dev, "%s: " fmt, \
get_unit_name((dd)->unit), ##__VA_ARGS__)
 
+#define dd_dev_dbg(dd, fmt, ...) \
+   dev_dbg(&(dd)->pcidev->dev, "%s: " fmt, \
+   get_unit_name((dd)->unit), ##__VA_ARGS__)
+
 #define hfi1_dev_porterr(dd, port, fmt, ...) \
dev_err(&(dd)->pcidev->dev, "%s: IB%u:%u " fmt, \
get_unit_name((dd)->unit), (dd)->unit, (port), \
-- 
1.8.2



[PATCH V5 2/2] staging/rdma/hfi1: set Gen3 half-swing for integrated devices

2015-12-21 Thread ira . weiny
From: Dean Luick 

Correctly set half-swing for integrated devices.  A0 needs all fields set for
CcePcieCtrl.  B0 and later only need a few fields set.

Reviewed-by: Stuart Summers 
Signed-off-by: Dean Luick 
Signed-off-by: Ira Weiny 

---
Changes from V1:
Add comments concerning the very long names.

Changes from V2:
Remove PC Macro and define short names to be used in the code.

Changes from V3:
Use newly defined dd_dev_dbg rather than dd_dev_info

Changes from V4:
Fix build error which arose from other patches being accepted on the
list before this one.

 drivers/staging/rdma/hfi1/chip_registers.h | 11 
 drivers/staging/rdma/hfi1/pcie.c   | 82 --
 2 files changed, 89 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/chip_registers.h b/drivers/staging/rdma/hfi1/chip_registers.h
index 701e9e1012a6..014d7a609ea0 100644
--- a/drivers/staging/rdma/hfi1/chip_registers.h
+++ b/drivers/staging/rdma/hfi1/chip_registers.h
@@ -551,6 +551,17 @@
 #define CCE_MSIX_TABLE_UPPER (CCE + 0x0018)
 #define CCE_MSIX_TABLE_UPPER_RESETCSR 0x0001ull
 #define CCE_MSIX_VEC_CLR_WITHOUT_INT (CCE + 0x00110400)
+#define CCE_PCIE_CTRL (CCE + 0x00C0)
+#define CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_MASK 0x3ull
+#define CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_SHIFT 0
+#define CCE_PCIE_CTRL_PCIE_LANE_DELAY_MASK 0xFull
+#define CCE_PCIE_CTRL_PCIE_LANE_DELAY_SHIFT 2
+#define CCE_PCIE_CTRL_XMT_MARGIN_OVERWRITE_ENABLE_SHIFT 8
+#define CCE_PCIE_CTRL_XMT_MARGIN_SHIFT 9
+#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_MASK 0x1ull
+#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_SHIFT 12
+#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_MASK 0x7ull
+#define CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_SHIFT 13
 #define CCE_REVISION (CCE + 0x)
 #define CCE_REVISION2 (CCE + 0x0008)
 #define CCE_REVISION2_HFI_ID_MASK 0x1ull
diff --git a/drivers/staging/rdma/hfi1/pcie.c b/drivers/staging/rdma/hfi1/pcie.c
index 8317b07d722a..9917faff823c 100644
--- a/drivers/staging/rdma/hfi1/pcie.c
+++ b/drivers/staging/rdma/hfi1/pcie.c
@@ -867,6 +867,83 @@ static void arm_gasket_logic(struct hfi1_devdata *dd)
 }
 
 /*
+ * CCE_PCIE_CTRL long name helpers
+ * We redefine these shorter macros to use in the code while leaving
+ * chip_registers.h to be autogenerated from the hardware spec.
+ */
+#define LANE_BUNDLE_MASK  CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_MASK
+#define LANE_BUNDLE_SHIFT CCE_PCIE_CTRL_PCIE_LANE_BUNDLE_SHIFT
+#define LANE_DELAY_MASK   CCE_PCIE_CTRL_PCIE_LANE_DELAY_MASK
+#define LANE_DELAY_SHIFT  CCE_PCIE_CTRL_PCIE_LANE_DELAY_SHIFT
+#define MARGIN_OVERWRITE_ENABLE_SHIFT CCE_PCIE_CTRL_XMT_MARGIN_OVERWRITE_ENABLE_SHIFT
+#define MARGIN_SHIFT  CCE_PCIE_CTRL_XMT_MARGIN_SHIFT
+#define MARGIN_G1_G2_OVERWRITE_MASK   CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_MASK
+#define MARGIN_G1_G2_OVERWRITE_SHIFT  CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_OVERWRITE_ENABLE_SHIFT
+#define MARGIN_GEN1_GEN2_MASK CCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_MASK
+#define MARGIN_GEN1_GEN2_SHIFTCCE_PCIE_CTRL_XMT_MARGIN_GEN1_GEN2_SHIFT
+
+ /*
+  * Write xmt_margin for full-swing (WFR-B) or half-swing (WFR-C).
+  */
+static void write_xmt_margin(struct hfi1_devdata *dd, const char *fname)
+{
+   u64 pcie_ctrl;
+   u64 xmt_margin;
+   u64 xmt_margin_oe;
+   u64 lane_delay;
+   u64 lane_bundle;
+
+   pcie_ctrl = read_csr(dd, CCE_PCIE_CTRL);
+
+   /*
+* For Discrete, use full-swing.
+*  - PCIe TX defaults to full-swing.
+*Leave this register as default.
+* For Integrated, use half-swing
+*  - Copy xmt_margin and xmt_margin_oe
+*from Gen1/Gen2 to Gen3.
+*/
+   if (dd->pcidev->device == PCI_DEVICE_ID_INTEL1) { /* integrated */
+   /* extract initial fields */
+   xmt_margin = (pcie_ctrl >> MARGIN_GEN1_GEN2_SHIFT)
+ & MARGIN_GEN1_GEN2_MASK;
+   xmt_margin_oe = (pcie_ctrl >> MARGIN_G1_G2_OVERWRITE_SHIFT)
+& MARGIN_G1_G2_OVERWRITE_MASK;
+   lane_delay = (pcie_ctrl >> LANE_DELAY_SHIFT) & LANE_DELAY_MASK;
+   lane_bundle = (pcie_ctrl >> LANE_BUNDLE_SHIFT)
+  & LANE_BUNDLE_MASK;
+
+   /*
+* For A0, EFUSE values are not set.  Override with the
+* correct values.
+*/
+   if (is_ax(dd)) {
+   /*
+* xmt_margin and OverwriteEnable should be the
+* same for Gen1/Gen2 and Gen3
+*/
+   xmt_margin = 0x5;
+   xmt_margin_oe = 0x1;
+   lane_delay = 0xF; /* Delay 240ns. */
+   lane_bundle = 0x0; /* Set to 1 lane. */
+   

[PATCH v3 0/6] staging/rdma/hfi1: Driver cleanup and misc fixes

2015-12-21 Thread Jubin John
These patches were part of patch series:
http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2015-November/081566.html
but did not apply cleanly to the staging-testing branch.

Refreshed the remaining 6 patches against the latest staging-testing.

Changes in v2:
- 01/13: Updated commit message with more information about changes in patch
- 04/13: Updated patch subject with "hfi1" instead of "hfi"
- 07/13: Refreshed patch based on new hfi patches in staging-next
- 12/13: Changed logic based on Dan's suggestions in
 http://marc.info/?l=linux-driver-devel&m=144723149431368&w=2
Changes in v3:
- Refreshed remaining 6 patches against latest staging-testing

Dean Luick (1):
  staging/rdma/hfi1: Remove unneeded variable index

Harish Chegondi (1):
  staging/rdma/hfi1: Move s_sde to the read mostly portion of the
hfi1_qp structure

Jubin John (2):
  staging/rdma/hfi1: Use BIT macro
  staging/rdma/hfi1: Change default krcvqs

Mark F. Brown (1):
  staging/rdma/hfi1: change krcvqs module parameter type from byte to
uint

Vennila Megavannan (1):
  staging/rdma/hfi1: adding per SDMA engine stats to hfistats

 drivers/staging/rdma/hfi1/chip.c   |  127 +---
 drivers/staging/rdma/hfi1/chip.h   |   53 +++-
 drivers/staging/rdma/hfi1/chip_registers.h |1 +
 drivers/staging/rdma/hfi1/common.h |4 +-
 drivers/staging/rdma/hfi1/hfi.h|   25 +++---
 drivers/staging/rdma/hfi1/init.c   |6 +-
 drivers/staging/rdma/hfi1/mad.c|4 +-
 drivers/staging/rdma/hfi1/qp.h |2 +-
 drivers/staging/rdma/hfi1/qsfp.h   |   10 +-
 drivers/staging/rdma/hfi1/sdma.c   |   17 +++-
 drivers/staging/rdma/hfi1/sdma.h   |7 ++
 drivers/staging/rdma/hfi1/verbs.h  |2 +-
 12 files changed, 189 insertions(+), 69 deletions(-)



[PATCH v3 1/6] staging/rdma/hfi1: Use BIT macro

2015-12-21 Thread Jubin John
This patch fixes the checkpatch issue:
CHECK: Prefer using the BIT macro

Use of BIT macro for HDRQ_INCREMENT in chip.h causes a change in
format specifier for error message in init.c in order to avoid a
build warning.

Reviewed-by: Dean Luick 
Reviewed-by: Ira Weiny 
Reviewed-by: Mike Marciniszyn 
Signed-off-by: Jubin John 
---
Changes in v2:
- Updated commit message with more information about changes in patch

Changes in v3:
- Refreshed patch against latest staging-testing

 drivers/staging/rdma/hfi1/chip.h   |   48 ++--
 drivers/staging/rdma/hfi1/common.h |4 +-
 drivers/staging/rdma/hfi1/hfi.h|   22 
 drivers/staging/rdma/hfi1/init.c   |2 +-
 drivers/staging/rdma/hfi1/mad.c|4 +-
 drivers/staging/rdma/hfi1/qp.h |2 +-
 drivers/staging/rdma/hfi1/qsfp.h   |   10 +++---
 drivers/staging/rdma/hfi1/sdma.c   |8 +++---
 8 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/chip.h b/drivers/staging/rdma/hfi1/chip.h
index 5b375dd..1368a44 100644
--- a/drivers/staging/rdma/hfi1/chip.h
+++ b/drivers/staging/rdma/hfi1/chip.h
@@ -242,18 +242,18 @@
 #define HCMD_SUCCESS 2
 
 /* DC_DC8051_DBG_ERR_INFO_SET_BY_8051.ERROR - error flags */
-#define SPICO_ROM_FAILED   (1 <<  0)
-#define UNKNOWN_FRAME  (1 <<  1)
-#define TARGET_BER_NOT_MET (1 <<  2)
-#define FAILED_SERDES_INTERNAL_LOOPBACK (1 <<  3)
-#define FAILED_SERDES_INIT (1 <<  4)
-#define FAILED_LNI_POLLING (1 <<  5)
-#define FAILED_LNI_DEBOUNCE(1 <<  6)
-#define FAILED_LNI_ESTBCOMM(1 <<  7)
-#define FAILED_LNI_OPTEQ   (1 <<  8)
-#define FAILED_LNI_VERIFY_CAP1 (1 <<  9)
-#define FAILED_LNI_VERIFY_CAP2 (1 << 10)
-#define FAILED_LNI_CONFIGLT(1 << 11)
+#define SPICO_ROM_FAILED   BIT(0)
+#define UNKNOWN_FRAME  BIT(1)
+#define TARGET_BER_NOT_MET BIT(2)
+#define FAILED_SERDES_INTERNAL_LOOPBACKBIT(3)
+#define FAILED_SERDES_INIT BIT(4)
+#define FAILED_LNI_POLLING BIT(5)
+#define FAILED_LNI_DEBOUNCEBIT(6)
+#define FAILED_LNI_ESTBCOMMBIT(7)
+#define FAILED_LNI_OPTEQ   BIT(8)
+#define FAILED_LNI_VERIFY_CAP1 BIT(9)
+#define FAILED_LNI_VERIFY_CAP2 BIT(10)
+#define FAILED_LNI_CONFIGLTBIT(11)
 
 #define FAILED_LNI (FAILED_LNI_POLLING | FAILED_LNI_DEBOUNCE \
| FAILED_LNI_ESTBCOMM | FAILED_LNI_OPTEQ \
@@ -262,16 +262,16 @@
| FAILED_LNI_CONFIGLT)
 
 /* DC_DC8051_DBG_ERR_INFO_SET_BY_8051.HOST_MSG - host message flags */
-#define HOST_REQ_DONE (1 << 0)
-#define BC_PWR_MGM_MSG(1 << 1)
-#define BC_SMA_MSG(1 << 2)
-#define BC_BCC_UNKOWN_MSG (1 << 3)
-#define BC_IDLE_UNKNOWN_MSG   (1 << 4)
-#define EXT_DEVICE_CFG_REQ(1 << 5)
-#define VERIFY_CAP_FRAME  (1 << 6)
-#define LINKUP_ACHIEVED   (1 << 7)
-#define LINK_GOING_DOWN   (1 << 8)
-#define LINK_WIDTH_DOWNGRADED  (1 << 9)
+#define HOST_REQ_DONE  BIT(0)
+#define BC_PWR_MGM_MSG BIT(1)
+#define BC_SMA_MSG BIT(2)
+#define BC_BCC_UNKNOWN_MSG BIT(3)
+#define BC_IDLE_UNKNOWN_MSGBIT(4)
+#define EXT_DEVICE_CFG_REQ BIT(5)
+#define VERIFY_CAP_FRAME   BIT(6)
+#define LINKUP_ACHIEVEDBIT(7)
+#define LINK_GOING_DOWNBIT(8)
+#define LINK_WIDTH_DOWNGRADED  BIT(9)
 
 /* DC_DC8051_CFG_EXT_DEV_1.REQ_TYPE - 8051 host requests */
 #define HREQ_LOAD_CONFIG   0x01
@@ -335,14 +335,14 @@
  * the CSR fields hold multiples of this value.
  */
 #define RCV_SHIFT 3
-#define RCV_INCREMENT (1 << RCV_SHIFT)
+#define RCV_INCREMENT BIT(RCV_SHIFT)
 
 /*
  * Receive header queue entry increment - the CSR holds multiples of
  * this value.
  */
 #define HDRQ_SIZE_SHIFT 5
-#define HDRQ_INCREMENT (1 << HDRQ_SIZE_SHIFT)
+#define HDRQ_INCREMENT BIT(HDRQ_SIZE_SHIFT)
 
 /*
  * Freeze handling flags
diff --git a/drivers/staging/rdma/hfi1/common.h b/drivers/staging/rdma/hfi1/common.h
index 5dd9272..e4b1dc6 100644
--- a/drivers/staging/rdma/hfi1/common.h
+++ b/drivers/staging/rdma/hfi1/common.h
@@ -349,10 +349,10 @@ struct hfi1_message_header {
 #define HFI1_QPN_MASK 0xFF
 #define HFI1_FECN_SHIFT 31
 #define HFI1_FECN_MASK 1
-#define HFI1_FECN_SMASK (1 << HFI1_FECN_SHIFT)
+#define HFI1_FECN_SMASK BIT(HFI1_FECN_SHIFT)
 #define HFI1_BECN_SHIFT 30
 #define HFI1_BECN_MASK 1
-#define HFI1_BECN_SMASK (1 << HFI1_BECN_SHIFT)
+#define HFI1_BECN_SMASK BIT(HFI1_BECN_SHIFT)
 #define HFI1_MULTICAST_LID_BASE 0xC000
 
 static inline __u64 rhf_to_cpu(const __le32 *rbuf)
diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h
index 2611bb2..2dfd402 100644
--- a/drivers/staging/rdma/hfi1/hfi.h
+++ b/drivers/staging/rdma/hfi1/hfi.h
@@ -424,17 +424,17 @@ struct hfi1_sge_state;
 #d

[PATCH v3 4/6] staging/rdma/hfi1: Change default krcvqs

2015-12-21 Thread Jubin John
Change the default number of krcvqs to the number of NUMA nodes + 1,
based on the performance data collected.

Reviewed-by: Mike Marciniszyn 
Signed-off-by: Jubin John 
---
Changes in v2:
- None

Changes in v3:
- Refreshed patch against latest staging-testing

 drivers/staging/rdma/hfi1/chip.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/staging/rdma/hfi1/chip.c
index bbe5ad8..503bfca 100644
--- a/drivers/staging/rdma/hfi1/chip.c
+++ b/drivers/staging/rdma/hfi1/chip.c
@@ -12445,7 +12445,7 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
 */
num_kernel_contexts = n_krcvqs + MIN_KERNEL_KCTXTS - 1;
else
-   num_kernel_contexts = num_online_nodes();
+   num_kernel_contexts = num_online_nodes() + 1;
num_kernel_contexts =
max_t(int, MIN_KERNEL_KCTXTS, num_kernel_contexts);
/*
-- 
1.7.1



[PATCH v3 2/6] staging/rdma/hfi1: Move s_sde to the read mostly portion of the hfi1_qp structure

2015-12-21 Thread Jubin John
From: Harish Chegondi 

This reduces L2 cache misses on s_sde in the _hfi1_schedule_send
function when it is invoked from post_send, thereby improving the
performance of post_send.

Reviewed-by: Mike Marciniszyn 
Signed-off-by: Harish Chegondi 
Signed-off-by: Jubin John 
---
Changes in v2:
- None

Changes in v3:
- Refreshed patch against latest staging-testing

 drivers/staging/rdma/hfi1/verbs.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/verbs.h b/drivers/staging/rdma/hfi1/verbs.h
index 72106e5..d22db39 100644
--- a/drivers/staging/rdma/hfi1/verbs.h
+++ b/drivers/staging/rdma/hfi1/verbs.h
@@ -441,6 +441,7 @@ struct hfi1_qp {
struct hfi1_swqe *s_wq;  /* send work queue */
struct hfi1_mmap_info *ip;
struct ahg_ib_header *s_hdr; /* next packet header to send */
+   struct sdma_engine *s_sde; /* current sde */
/* sc for UC/RC QPs - based on ah for UD */
u8 s_sc;
unsigned long timeout_jiffies;  /* computed from timeout */
@@ -506,7 +507,6 @@ struct hfi1_qp {
struct hfi1_swqe *s_wqe;
struct hfi1_sge_state s_sge; /* current send request data */
struct hfi1_mregion *s_rdma_mr;
-   struct sdma_engine *s_sde; /* current sde */
u32 s_cur_size; /* size of send packet in bytes */
u32 s_len;  /* total length of s_sge */
u32 s_rdma_read_len;/* total length of s_rdma_read_sge */
-- 
1.7.1



[PATCH v3 5/6] staging/rdma/hfi1: adding per SDMA engine stats to hfistats

2015-12-21 Thread Jubin John
From: Vennila Megavannan 

Added the following per sdma engine stats:
  - SendDmaDescFetchedCnt
  - software maintained count of SDMA interrupts
 (SDmaInt, SDmaIdleInt, SDmaProgressInt)
  - software maintained counts of SDMA error cases

Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
Signed-off-by: Vennila Megavannan 
Signed-off-by: Jubin John 
---
Changes in v2:
- None

Changes in v3:
- Refreshed patch against latest staging-testing

 drivers/staging/rdma/hfi1/chip.c   |  110 +++-
 drivers/staging/rdma/hfi1/chip.h   |5 +
 drivers/staging/rdma/hfi1/chip_registers.h |1 +
 drivers/staging/rdma/hfi1/hfi.h|1 +
 drivers/staging/rdma/hfi1/sdma.c   |9 ++-
 drivers/staging/rdma/hfi1/sdma.h   |7 ++
 6 files changed, 129 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/staging/rdma/hfi1/chip.c
index 503bfca..f4f720d 100644
--- a/drivers/staging/rdma/hfi1/chip.c
+++ b/drivers/staging/rdma/hfi1/chip.c
@@ -1297,10 +1297,58 @@ static u64 dev_access_u32_csr(const struct cntr_entry *entry,
void *context, int vl, int mode, u64 data)
 {
struct hfi1_devdata *dd = context;
+   u64 csr = entry->csr;
 
-   if (vl != CNTR_INVALID_VL)
-   return 0;
-   return read_write_csr(dd, entry->csr, mode, data);
+   if (entry->flags & CNTR_SDMA) {
+   if (vl == CNTR_INVALID_VL)
+   return 0;
+   csr += 0x100 * vl;
+   } else {
+   if (vl != CNTR_INVALID_VL)
+   return 0;
+   }
+   return read_write_csr(dd, csr, mode, data);
+}
+
+static u64 access_sde_err_cnt(const struct cntr_entry *entry,
+ void *context, int idx, int mode, u64 data)
+{
+   struct hfi1_devdata *dd = (struct hfi1_devdata *)context;
+
+   if (dd->per_sdma && idx < dd->num_sdma)
+   return dd->per_sdma[idx].err_cnt;
+   return 0;
+}
+
+static u64 access_sde_int_cnt(const struct cntr_entry *entry,
+ void *context, int idx, int mode, u64 data)
+{
+   struct hfi1_devdata *dd = (struct hfi1_devdata *)context;
+
+   if (dd->per_sdma && idx < dd->num_sdma)
+   return dd->per_sdma[idx].sdma_int_cnt;
+   return 0;
+}
+
+static u64 access_sde_idle_int_cnt(const struct cntr_entry *entry,
+  void *context, int idx, int mode, u64 data)
+{
+   struct hfi1_devdata *dd = (struct hfi1_devdata *)context;
+
+   if (dd->per_sdma && idx < dd->num_sdma)
+   return dd->per_sdma[idx].idle_int_cnt;
+   return 0;
+}
+
+static u64 access_sde_progress_int_cnt(const struct cntr_entry *entry,
+  void *context, int idx, int mode,
+  u64 data)
+{
+   struct hfi1_devdata *dd = (struct hfi1_devdata *)context;
+
+   if (dd->per_sdma && idx < dd->num_sdma)
+   return dd->per_sdma[idx].progress_int_cnt;
+   return 0;
 }
 
 static u64 dev_access_u64_csr(const struct cntr_entry *entry, void *context,
@@ -4070,6 +4118,22 @@ static struct cntr_entry dev_cntrs[DEV_CNTR_LAST] = {
access_sw_kmem_wait),
 [C_SW_SEND_SCHED] = CNTR_ELEM("SendSched", 0, 0, CNTR_NORMAL,
access_sw_send_schedule),
+[C_SDMA_DESC_FETCHED_CNT] = CNTR_ELEM("SDEDscFdCn",
+ SEND_DMA_DESC_FETCHED_CNT, 0,
+ CNTR_NORMAL | CNTR_32BIT | CNTR_SDMA,
+ dev_access_u32_csr),
+[C_SDMA_INT_CNT] = CNTR_ELEM("SDMAInt", 0, 0,
+CNTR_NORMAL | CNTR_32BIT | CNTR_SDMA,
+access_sde_int_cnt),
+[C_SDMA_ERR_CNT] = CNTR_ELEM("SDMAErrCt", 0, 0,
+CNTR_NORMAL | CNTR_32BIT | CNTR_SDMA,
+access_sde_err_cnt),
+[C_SDMA_IDLE_INT_CNT] = CNTR_ELEM("SDMAIdInt", 0, 0,
+ CNTR_NORMAL | CNTR_32BIT | CNTR_SDMA,
+ access_sde_idle_int_cnt),
+[C_SDMA_PROGRESS_INT_CNT] = CNTR_ELEM("SDMAPrIntCn", 0, 0,
+ CNTR_NORMAL | CNTR_32BIT | CNTR_SDMA,
+ access_sde_progress_int_cnt),
 /* MISC_ERR_STATUS */
 [C_MISC_PLL_LOCK_FAIL_ERR] = CNTR_ELEM("MISC_PLL_LOCK_FAIL_ERR", 0, 0,
CNTR_NORMAL,
@@ -5707,6 +5771,7 @@ static void handle_sdma_eng_err(struct hfi1_devdata *dd,
dd_dev_err(sde->dd, "CONFIG SDMA(%u) source: %u status 0x%llx\n",
   sde->this_idx, source, (unsigned long long)status);
 #endif
+   sde->err_cnt++;
sdma_engine_error(sde, status);
 
/*
@@ -11150,6 +11215,20 @@ u32 hfi1_read_cntrs(struct hfi1_devdata *dd, loff_t pos, char **namep,

[PATCH v3 3/6] staging/rdma/hfi1: change krcvqs module parameter type from byte to uint

2015-12-21 Thread Jubin John
From: Mark F. Brown 

The krcvqs parameter is displayed incorrectly in sysfs.
The workaround is to set the param type as uint.

Reviewed-by: Mike Marciniszyn 
Reviewed-by: Mitko Haralanov 
Signed-off-by: Mark F. Brown 
Signed-off-by: Jubin John 
---
Changes in v2:
- None

Changes in v3:
- Refreshed patch against latest staging-testing

 drivers/staging/rdma/hfi1/hfi.h  |2 +-
 drivers/staging/rdma/hfi1/init.c |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h
index 2dfd402..d4a859f 100644
--- a/drivers/staging/rdma/hfi1/hfi.h
+++ b/drivers/staging/rdma/hfi1/hfi.h
@@ -1667,7 +1667,7 @@ extern unsigned int hfi1_cu;
 extern unsigned int user_credit_return_threshold;
 extern int num_user_contexts;
 extern unsigned n_krcvqs;
-extern u8 krcvqs[];
+extern uint krcvqs[];
 extern int krcvqsset;
 extern uint kdeth_qp;
 extern uint loopback;
diff --git a/drivers/staging/rdma/hfi1/init.c b/drivers/staging/rdma/hfi1/init.c
index 35b5e41..dbdc631 100644
--- a/drivers/staging/rdma/hfi1/init.c
+++ b/drivers/staging/rdma/hfi1/init.c
@@ -87,9 +87,9 @@ module_param_named(num_user_contexts, num_user_contexts, uint, S_IRUGO);
 MODULE_PARM_DESC(
num_user_contexts, "Set max number of user contexts to use");
 
-u8 krcvqs[RXE_NUM_DATA_VL];
+uint krcvqs[RXE_NUM_DATA_VL];
 int krcvqsset;
-module_param_array(krcvqs, byte, &krcvqsset, S_IRUGO);
+module_param_array(krcvqs, uint, &krcvqsset, S_IRUGO);
 MODULE_PARM_DESC(krcvqs, "Array of the number of non-control kernel receive queues by VL");
 
 /* computed based on above array */
-- 
1.7.1



[PATCH v3 6/6] staging/rdma/hfi1: Remove unneeded variable index

2015-12-21 Thread Jubin John
From: Dean Luick 

The variable "index" increments the same as dd->ndevcntrs.
Just use the later.  Remove uneeded usage of "index" in the
fill loop - it is not used there or later in the function.

Reviewed-by: Dennis Dalessandro 
Signed-off-by: Dean Luick 
Signed-off-by: Jubin John 
---
Changes in v2:
- None

Changes in v3:
- Refreshed patch against latest staging-testing

 drivers/staging/rdma/hfi1/chip.c |   19 +++
 1 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/staging/rdma/hfi1/chip.c
index f4f720d..1109049 100644
--- a/drivers/staging/rdma/hfi1/chip.c
+++ b/drivers/staging/rdma/hfi1/chip.c
@@ -11592,7 +11592,7 @@ mod_timer(&dd->synth_stats_timer, jiffies + HZ * SYNTH_CNT_TIME);
 #define C_MAX_NAME 13 /* 12 chars + one for /0 */
 static int init_cntrs(struct hfi1_devdata *dd)
 {
-   int i, rcv_ctxts, index, j;
+   int i, rcv_ctxts, j;
size_t sz;
char *p;
char name[C_MAX_NAME];
@@ -11609,7 +11609,6 @@ static int init_cntrs(struct hfi1_devdata *dd)
/* size names and determine how many we have*/
dd->ndevcntrs = 0;
sz = 0;
-   index = 0;
 
for (i = 0; i < DEV_CNTR_LAST; i++) {
hfi1_dbg_early("Init cntr %s\n", dev_cntrs[i].name);
@@ -11620,7 +11619,7 @@ static int init_cntrs(struct hfi1_devdata *dd)
 
if (dev_cntrs[i].flags & CNTR_VL) {
hfi1_dbg_early("\tProcessing VL cntr\n");
-   dev_cntrs[i].offset = index;
+   dev_cntrs[i].offset = dd->ndevcntrs;
for (j = 0; j < C_VL_COUNT; j++) {
memset(name, '\0', C_MAX_NAME);
snprintf(name, C_MAX_NAME, "%s%d",
@@ -11630,13 +11629,12 @@ static int init_cntrs(struct hfi1_devdata *dd)
sz++;
hfi1_dbg_early("\t\t%s\n", name);
dd->ndevcntrs++;
-   index++;
}
} else if (dev_cntrs[i].flags & CNTR_SDMA) {
hfi1_dbg_early(
   "\tProcessing per SDE counters chip 
enginers %u\n",
   dd->chip_sdma_engines);
-   dev_cntrs[i].offset = index;
+   dev_cntrs[i].offset = dd->ndevcntrs;
for (j = 0; j < dd->chip_sdma_engines; j++) {
memset(name, '\0', C_MAX_NAME);
snprintf(name, C_MAX_NAME, "%s%d",
@@ -11645,24 +11643,22 @@ static int init_cntrs(struct hfi1_devdata *dd)
sz++;
hfi1_dbg_early("\t\t%s\n", name);
dd->ndevcntrs++;
-   index++;
}
} else {
/* +1 for newline  */
sz += strlen(dev_cntrs[i].name) + 1;
+   dev_cntrs[i].offset = dd->ndevcntrs;
dd->ndevcntrs++;
-   dev_cntrs[i].offset = index;
-   index++;
hfi1_dbg_early("\tAdding %s\n", dev_cntrs[i].name);
}
}
 
/* allocate space for the counter values */
-   dd->cntrs = kcalloc(index, sizeof(u64), GFP_KERNEL);
+   dd->cntrs = kcalloc(dd->ndevcntrs, sizeof(u64), GFP_KERNEL);
if (!dd->cntrs)
goto bail;
 
-   dd->scntrs = kcalloc(index, sizeof(u64), GFP_KERNEL);
+   dd->scntrs = kcalloc(dd->ndevcntrs, sizeof(u64), GFP_KERNEL);
if (!dd->scntrs)
goto bail;
 
@@ -11674,7 +11670,7 @@ static int init_cntrs(struct hfi1_devdata *dd)
goto bail;
 
/* fill in the names */
-   for (p = dd->cntrnames, i = 0, index = 0; i < DEV_CNTR_LAST; i++) {
+   for (p = dd->cntrnames, i = 0; i < DEV_CNTR_LAST; i++) {
if (dev_cntrs[i].flags & CNTR_DISABLED) {
/* Nothing */
} else {
@@ -11704,7 +11700,6 @@ static int init_cntrs(struct hfi1_devdata *dd)
p += strlen(dev_cntrs[i].name);
*p++ = '\n';
}
-   index++;
}
}
 
-- 
1.7.1



[PATCH v3 0/2] Driver cleanup and misc fixes series 3

2015-12-21 Thread Jubin John
These patches were part of series:
http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2015-December/082248.html
but did not apply cleanly to the staging-testing branch.

Refreshed the remaining 2 patches against the latest staging-testing.

Changes in v2:
- Added more information in commit messages of patches 01, 02
  03, 04, 05, 07 and 09
- Fixed driver name to hfi1 in subject line of patch 06
- Refreshed patch 10 on top of staging-next
- Dropped patch 18 (staging/rdma/hfi1: Workaround
  CONFIG_SDMA_VERBOSITY timing issue) from series

Changes in v3:
- Refreshed remaining 2 patches against latest staging-testing

Edward Mascarenhas (1):
  staging/rdma/hfi1: Clean up comments

Ira Weiny (1):
  staging/rdma/hfi1: Fix Xmit Wait calculation

 drivers/staging/rdma/hfi1/chip.c   |1 -
 drivers/staging/rdma/hfi1/driver.c |2 +-
 drivers/staging/rdma/hfi1/hfi.h|4 ++--
 drivers/staging/rdma/hfi1/mad.c|   33 -
 drivers/staging/rdma/hfi1/pcie.c   |2 +-
 drivers/staging/rdma/hfi1/ud.c |2 +-
 6 files changed, 25 insertions(+), 19 deletions(-)



[PATCH v3 2/2] staging/rdma/hfi1: Fix Xmit Wait calculation

2015-12-21 Thread Jubin John
From: Ira Weiny 

Total XMIT wait needs to sum the xmit wait values of all the VLs, not just
those requested in the query.  Also, make the algorithm used for both
PortStatus and PortDataCounters the same.

Reviewed-by: Arthur Kepner 
Reviewed-by: Breyer, Scott J 
Signed-off-by: Ira Weiny 
Signed-off-by: Jubin John 
---
Changes in v2:
- No changes

Changes in v3:
- Refreshed patches against latest staging-testing

 drivers/staging/rdma/hfi1/mad.c |   33 -
 1 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/mad.c b/drivers/staging/rdma/hfi1/mad.c
index 4f5dbd1..bee1c0e 100644
--- a/drivers/staging/rdma/hfi1/mad.c
+++ b/drivers/staging/rdma/hfi1/mad.c
@@ -2279,17 +2279,23 @@ static void a0_portstatus(struct hfi1_pportdata *ppd,
 {
if (!is_bx(ppd->dd)) {
unsigned long vl;
-   u64 max_vl_xmit_wait = 0, tmp;
+   u64 sum_vl_xmit_wait = 0;
u32 vl_all_mask = VL_MASK_ALL;
 
for_each_set_bit(vl, (unsigned long *)&(vl_all_mask),
 8 * sizeof(vl_all_mask)) {
-   tmp = read_port_cntr(ppd, C_TX_WAIT_VL,
-idx_from_vl(vl));
-   if (tmp > max_vl_xmit_wait)
-   max_vl_xmit_wait = tmp;
+   u64 tmp = sum_vl_xmit_wait +
+ read_port_cntr(ppd, C_TX_WAIT_VL,
+idx_from_vl(vl));
+   if (tmp < sum_vl_xmit_wait) {
+   /* we wrapped */
+   sum_vl_xmit_wait = (u64)~0;
+   break;
+   }
+   sum_vl_xmit_wait = tmp;
}
-   rsp->port_xmit_wait = cpu_to_be64(max_vl_xmit_wait);
+   if (be64_to_cpu(rsp->port_xmit_wait) > sum_vl_xmit_wait)
+   rsp->port_xmit_wait = cpu_to_be64(sum_vl_xmit_wait);
}
 }
 
@@ -2491,18 +2497,19 @@ static u64 get_error_counter_summary(struct ib_device *ibdev, u8 port,
return error_counter_summary;
 }
 
-static void a0_datacounters(struct hfi1_devdata *dd, struct _port_dctrs *rsp,
+static void a0_datacounters(struct hfi1_pportdata *ppd, struct _port_dctrs 
*rsp,
u32 vl_select_mask)
 {
-   if (!is_bx(dd)) {
+   if (!is_bx(ppd->dd)) {
unsigned long vl;
-   int vfi = 0;
u64 sum_vl_xmit_wait = 0;
+   u32 vl_all_mask = VL_MASK_ALL;
 
-   for_each_set_bit(vl, (unsigned long *)&(vl_select_mask),
-   8 * sizeof(vl_select_mask)) {
+   for_each_set_bit(vl, (unsigned long *)&(vl_all_mask),
+8 * sizeof(vl_all_mask)) {
u64 tmp = sum_vl_xmit_wait +
-   be64_to_cpu(rsp->vls[vfi++].port_vl_xmit_wait);
+ read_port_cntr(ppd, C_TX_WAIT_VL,
+idx_from_vl(vl));
if (tmp < sum_vl_xmit_wait) {
/* we wrapped */
sum_vl_xmit_wait = (u64) ~0;
@@ -2665,7 +2672,7 @@ static int pma_get_opa_datacounters(struct opa_pma_mad *pmp,
vfi++;
}
 
-   a0_datacounters(dd, rsp, vl_select_mask);
+   a0_datacounters(ppd, rsp, vl_select_mask);
 
if (resp_len)
*resp_len += response_data_size;
-- 
1.7.1



[PATCH v3 1/2] staging/rdma/hfi1: Clean up comments

2015-12-21 Thread Jubin John
From: Edward Mascarenhas 

Clean up comments by deleting numbering and terms internal to Intel.

The information on the actual bugs is not deleted.

Reviewed-by: Mike Marciniszyn 
Signed-off-by: Edward Mascarenhas 
Signed-off-by: Jubin John 
---
Changes in v2:
- Added more information in commit message

Changes in v3:
- Refreshed patch against latest staging-testing

 drivers/staging/rdma/hfi1/chip.c   |1 -
 drivers/staging/rdma/hfi1/driver.c |2 +-
 drivers/staging/rdma/hfi1/hfi.h|4 ++--
 drivers/staging/rdma/hfi1/pcie.c   |2 +-
 drivers/staging/rdma/hfi1/ud.c |2 +-
 5 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/staging/rdma/hfi1/chip.c
index bbe5ad8..02ba78f 100644
--- a/drivers/staging/rdma/hfi1/chip.c
+++ b/drivers/staging/rdma/hfi1/chip.c
@@ -13537,7 +13537,6 @@ int hfi1_set_ctxt_jkey(struct hfi1_devdata *dd, unsigned ctxt, u16 jkey)
write_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_JOB_KEY, reg);
/*
 * Enable send-side J_KEY integrity check, unless this is A0 h/w
-* (due to A0 erratum).
 */
if (!is_ax(dd)) {
reg = read_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_ENABLE);
diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/staging/rdma/hfi1/driver.c
index 8485de1..3218520 100644
--- a/drivers/staging/rdma/hfi1/driver.c
+++ b/drivers/staging/rdma/hfi1/driver.c
@@ -368,7 +368,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
if (opcode == IB_OPCODE_CNP) {
/*
 * Only in pre-B0 h/w is the CNP_OPCODE handled
-* via this code path (errata 291394).
+* via this code path.
 */
struct hfi1_qp *qp = NULL;
u32 lqpn, rqpn;
diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h
index 2611bb2..9785a22 100644
--- a/drivers/staging/rdma/hfi1/hfi.h
+++ b/drivers/staging/rdma/hfi1/hfi.h
@@ -1730,7 +1730,7 @@ static inline u64 hfi1_pkt_default_send_ctxt_mask(struct hfi1_devdata *dd,
base_sc_integrity |= HFI1_PKT_KERNEL_SC_INTEGRITY;
 
if (is_ax(dd))
-   /* turn off send-side job key checks - A0 erratum */
+   /* turn off send-side job key checks - A0 */
return base_sc_integrity &
   ~SEND_CTXT_CHECK_ENABLE_CHECK_JOB_KEY_SMASK;
return base_sc_integrity;
@@ -1757,7 +1757,7 @@ static inline u64 hfi1_pkt_base_sdma_integrity(struct hfi1_devdata *dd)
| SEND_DMA_CHECK_ENABLE_CHECK_ENABLE_SMASK;
 
if (is_ax(dd))
-   /* turn off send-side job key checks - A0 erratum */
+   /* turn off send-side job key checks - A0 */
return base_sdma_integrity &
   ~SEND_DMA_CHECK_ENABLE_CHECK_JOB_KEY_SMASK;
return base_sdma_integrity;
diff --git a/drivers/staging/rdma/hfi1/pcie.c b/drivers/staging/rdma/hfi1/pcie.c
index 8317b07..6745c82 100644
--- a/drivers/staging/rdma/hfi1/pcie.c
+++ b/drivers/staging/rdma/hfi1/pcie.c
@@ -986,7 +986,7 @@ retry:
 * PcieCfgRegPl100 - Gen3 Control
 *
 * turn off PcieCfgRegPl100.Gen3ZRxDcNonCompl
-* turn on PcieCfgRegPl100.EqEieosCnt (erratum)
+* turn on PcieCfgRegPl100.EqEieosCnt
 * Everything else zero.
 */
reg32 = PCIE_CFG_REG_PL100_EQ_EIEOS_CNT_SMASK;
diff --git a/drivers/staging/rdma/hfi1/ud.c b/drivers/staging/rdma/hfi1/ud.c
index bd1b402..25e6053 100644
--- a/drivers/staging/rdma/hfi1/ud.c
+++ b/drivers/staging/rdma/hfi1/ud.c
@@ -671,7 +671,7 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
if (unlikely(bth1 & HFI1_BECN_SMASK)) {
/*
 * In pre-B0 h/w the CNP_OPCODE is handled via an
-* error path (errata 291394).
+* error path.
 */
struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
u32 lqpn =  be32_to_cpu(ohdr->bth[1]) & HFI1_QPN_MASK;
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-21 Thread Or Gerlitz

On 12/21/2015 5:01 PM, Matan Barak wrote:

Previously, cma_match_net_dev called cma_protocol_roce which
tried to verify that the IB device uses RoCE protocol. However,
if rdma_id didn't have a bounded port, it used the first port
of the device.


maybe prefer higher-than-code-speak language, e.g. "if the rdma id
didn't have"; also below, "unbounded rdma ids"




In VPI systems, the first port might be an IB port while the second
one could be an Ethernet port. This made requests for unbounded rdma_ids
that come from the Ethernet port fail.


add "to" --> "Ethernet port to fail"


Fixing this by passing the port of the request and checking this port
of the device.


OK, so this fix will work for both ib/eth and eth/ib configs, right? good.




Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE')


Reported-by: Or Gerlitz 



Signed-off-by: Matan Barak 


Doug, the patch fixes a bug introduced by a commit from 4.3; let's fix
it in 4.4 and later we will send it to -stable as well. So for 4.4
there's this one and the kvfree fix [1]


Or.

[1] https://patchwork.kernel.org/patch/7868481/



---
Hi Doug,

This patch fixes a bug in VPI systems, where the first port is configured
as IB and the second one is configured as Ethernet.
In this case, if the rdma_id isn't bounded to a port, cma_match_net_dev
will try to verify that the first port is a RoCE port and fail.
This is fixed by passing the port of the incoming request.

Regards,
Matan

  drivers/infiniband/core/cma.c |   16 +---
  1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d2d5d00..c8a265c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1265,15 +1265,17 @@ static bool cma_protocol_roce(const struct rdma_cm_id *id)
return cma_protocol_roce_dev_port(device, port_num);
  }
  
-static bool cma_match_net_dev(const struct rdma_id_private *id_priv,
-			      const struct net_device *net_dev)
+static bool cma_match_net_dev(const struct rdma_cm_id *id,
+ const struct net_device *net_dev,
+ u8 port_num)
  {
-   const struct rdma_addr *addr = &id_priv->id.route.addr;
+   const struct rdma_addr *addr = &id->route.addr;
  
 	if (!net_dev)
 		/* This request is an AF_IB request or a RoCE request */
-   return addr->src_addr.ss_family == AF_IB ||
-  cma_protocol_roce(&id_priv->id);
+   return (!id->port_num || id->port_num == port_num) &&
+  (addr->src_addr.ss_family == AF_IB ||
+   cma_protocol_roce_dev_port(id->device, port_num));
  
 	return !addr->dev_addr.bound_dev_if ||
 	       (net_eq(dev_net(net_dev), addr->dev_addr.net) &&
@@ -1295,13 +1297,13 @@ static struct rdma_id_private *cma_find_listener(
hlist_for_each_entry(id_priv, &bind_list->owners, node) {
if (cma_match_private_data(id_priv, ib_event->private_data)) {
if (id_priv->id.device == cm_id->device &&
-   cma_match_net_dev(id_priv, net_dev))
+   cma_match_net_dev(&id_priv->id, net_dev, req->port))
return id_priv;
list_for_each_entry(id_priv_dev,
&id_priv->listen_list,
listen_list) {
if (id_priv_dev->id.device == cm_id->device &&
-   cma_match_net_dev(id_priv_dev, net_dev))
+			    cma_match_net_dev(&id_priv_dev->id, net_dev, req->port))
return id_priv_dev;
}
}




Re: [PATCH] IB/cma: cma_match_net_dev needs to take into account port_num

2015-12-21 Thread Or Gerlitz

On 12/21/2015 5:01 PM, Matan Barak wrote:

This patch fixes a bug in VPI systems, where the first port is configured
as IB and the second one is configured as Ethernet. In this case, if the 
rdma_id isn't bounded to a port, cma_match_net_dev will try to verify that the 
first port is a RoCE port and fail. This is fixed by passing the port of the 
incoming request.


OK -- we have another bug down there: cma loopback doesn't work, same
reject reason (below). This happens in both VPI and non-VPI configurations.


Works well with 4.2.3

Or.


$ rping -d -v -c -a 127.0.0.1 -C 1
verbose
client
count 1
created cm_id 0x6087d0
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x6087d0 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x6087d0 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x60e5f0
created channel 0x608250
created cq 0x608a20
created qp 0x6082e0
rping_setup_buffers called on cb 0x606010
allocated & registered buffers...
cq_thread started.
wait for CONNECTED state 10
cma_event type RDMA_CM_EVENT_REJECTED cma_id 0x6087d0 (parent)
cma event RDMA_CM_EVENT_REJECTED, error 28
connect error -1
rping_free_buffers called on cb 0x606010
destroy cm_id 0x6087d0




Re: device attr cleanup

2015-12-21 Thread Or Gerlitz
On Wed, Dec 16, 2015 at 7:53 AM, Or Gerlitz  wrote:
> On 12/15/2015 9:03 PM, Doug Ledford wrote:

>> Or, you specifically asked me to wait until this week.  I made my
>> initial impressions clear (I don't necessarily like the removal of the
>> attr struct, but I like the removal of all of the query calls, and I'm
>> inclined to take the patch in spite of not liking the removal of the
>> struct).  Do you have anything to add or have we beat this horse to death?

> Hi Doug,
> Lets stop beating, both horses and people.
> I do understand that
> 1. you don't like the removal of the attr
> 2. you do like the removal of all the query calls
>
> I am proposing to take the path of a patch that
> does exactly #2 while avoiding #1.

Doug,

Did you look at my v1 post and the related discussion there w.r.t. udata?

You didn't make any comment on my response here nor on the proposed patches.

Since we are really short on time w.r.t. the EOY holidays and we have
the udata matter open (see [1]), could we move finalizing this
discussion to the 4.6 time-frame?

If you do have the time, I think it would be fair to see a response
from you on the discussion before you pick either of the two patch
sets - so??

Or.

[1] Christoph's patch doesn't remove the query_device callback from
mlx4 since we report their values to libmlx4 through the udata
mechanism. The query_device callback will need to be present in
current/future drivers if they decide to use udata as well


> What's wrong with that? I haven't heard any reasoning for why it's
> so good to stash ~50 new fields on the IB device structure except
> for the author saying that other subsystems do that and other people
> saying they are in favor of this approach while not providing any
> reasoning, except for maybe something on bikes.
>
> Why you or anyone else has to be from now and ever the cache line police
> making sure that people don't add new attributes in random locations
> over the IB device structure?
>
> What's wrong with putting fifty attributes in a structure which is a
> field of the device struct and have people go there to see what are
> the different attrs and add new ones there?
>
> This will make the 4.5 merge window extremely complex or even totally
> threatened w.r.t. the RDMA subsystem and related drivers by a 3.3K
> LOC patch.
>
> Sorry, but, I still don't get it.
>
> Or.