RE: [v11, 5/8] soc: fsl: add GUTS driver for QorIQ platforms

2016-09-11 Thread Y.B. Lu
Hi Scott,

Thanks for your review :)
See my comment inline.

> -----Original Message-----
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Friday, September 09, 2016 11:47 AM
> To: Y.B. Lu; linux-...@vger.kernel.org; ulf.hans...@linaro.org; Arnd
> Bergmann
> Cc: linuxppc-...@lists.ozlabs.org; devicet...@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linux-ker...@vger.kernel.org; linux-
> c...@vger.kernel.org; linux-...@vger.kernel.org; iommu@lists.linux-
> foundation.org; netdev@vger.kernel.org; Mark Rutland; Rob Herring;
> Russell King; Jochen Friedrich; Joerg Roedel; Claudiu Manoil; Bhupesh
> Sharma; Qiang Zhao; Kumar Gala; Santosh Shilimkar; Leo Li; X.B. Xie
> Subject: Re: [v11, 5/8] soc: fsl: add GUTS driver for QorIQ platforms
> 
> On Tue, 2016-09-06 at 16:28 +0800, Yangbo Lu wrote:
> > The global utilities block controls power management, I/O device
> > enabling, power-on reset (POR) configuration monitoring, alternate
> > function selection for multiplexed signals, and clock control.
> >
> > This patch adds a driver to manage and access global utilities block.
> > Initially only reading SVR and registering soc device are supported.
> > Other guts accesses, such as reading RCW, should eventually be moved
> > into this driver as well.
> >
> > Signed-off-by: Yangbo Lu 
> > Signed-off-by: Scott Wood 
> 
> Don't put my signoff on patches that I didn't put it on
> myself.  Definitely don't put mine *after* yours on patches that were
> last modified by you.
> 
> If you want to mention that the soc_id encoding was my suggestion, then
> do so explicitly.
> 

[Lu Yangbo-B47093] I found your sign-off on this patch at the link below.
http://patchwork.ozlabs.org/patch/649211/

So shall I just change the order in the next version?
Signed-off-by: Scott Wood 
Signed-off-by: Yangbo Lu 

> > +/* SoC attribute definition for QorIQ platform */
> > +static const struct soc_device_attribute qoriq_soc[] = {
> > +#ifdef CONFIG_PPC
> > +   /*
> > +    * Power Architecture-based SoCs T Series
> > +    */
> > +
> > +   /* SoC: T1024/T1014/T1023/T1013 Rev: 1.0 */
> > +   { .soc_id   = "svr:0x85400010,name:T1024,die:T1024",
> > +     .revision = "1.0",
> > +   },
> > +   { .soc_id   = "svr:0x85480010,name:T1024E,die:T1024",
> > +     .revision = "1.0",
> > +   },
> 
> Revision could be computed from the low 8 bits of SVR (just as you do for
> unknown SVRs).
>
 
[Lu Yangbo-B47093] Yes, you're right. Will remove it here.
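
For reference, a minimal sketch of deriving the revision string from the low 8 bits of SVR. It assumes the high nibble holds the major revision and the low nibble the minor one, which is consistent with the 0x...10 entries listed above as "1.0". svr_to_revision() is a hypothetical helper name, not something taken from the patch.

```c
#include <stdio.h>

/*
 * Format the revision held in the low 8 bits of an SVR value as
 * "major.minor". Assumption: bits 7..4 are the major revision and
 * bits 3..0 the minor revision.
 */
static int svr_to_revision(unsigned int svr, char *buf, size_t len)
{
	unsigned int major = (svr >> 4) & 0xf;
	unsigned int minor = svr & 0xf;

	return snprintf(buf, len, "%u.%u", major, minor);
}
```

With this, the table would no longer need a .revision string per entry; the driver could fill it in once after reading SVR.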

> We could move the die name into .family:
> 
>   {
>   .soc_id = "svr:0x85490010,name:T1023E,",
>   .family = "QorIQ T1024",
>   }
> 
> I see you dropped svre (and the trailing comma), though I guess the vast
> majority of potential users will be looking at .family.  In which case do
> we even need name?  If we just make the soc_id be "svr:0x" then
> we could shrink the table to an svr+mask that identifies each die.  I'd
> still want to keep the "svr:" even if we're giving up on the general
> tagging system, to make it clear what the number refers to, and to
> provide some defense against users who match only against soc_id rather
> than soc_id+family.  Or we could go further and format soc_id as "QorIQ
> SVR 0x" so that soc_id-only matches are fully acceptable rather
> than just less dangerous.

[Lu Yangbo-B47093] I think it's a good idea to move the die name into .family.
In my opinion, it's better to keep svr and name in soc_id, as in your
suggestion above:
>   {
>   .soc_id = "svr:0x85490010,name:T1023E,",
>   .family = "QorIQ T1024",
>   }
Users probably don't want to learn the SVR value. They just want to match
the SoC they use, and name+rev is a convenient way for them to do that.

Regarding shrinking the table, I think it's hard to use svr+mask, because I
find that many platforms use different masks.
We cannot derive the mask from the SVR value.
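
To make the svr+mask idea concrete, here is a rough sketch in which each table entry carries its own mask, so that platforms with different masks can coexist in one table. The structure, the helper name and the 0xfff00000 mask are all assumptions for illustration; the mask was picked only so that the three T1024-family SVRs quoted in this thread land in one entry, and real values would have to come from the reference manuals.

```c
#include <stddef.h>

/* Hypothetical per-die entry: an SVR value plus the mask to compare under. */
struct die_match {
	unsigned int svr;
	unsigned int mask;
	const char *family;
};

/* Illustrative table only; the mask value is an assumption. */
static const struct die_match die_table[] = {
	{ 0x85400000, 0xfff00000, "QorIQ T1024" },
	{ 0, 0, NULL },	/* sentinel */
};

/* Return the family string for an SVR, or NULL if no entry matches. */
static const char *svr_to_family(unsigned int svr)
{
	const struct die_match *m;

	for (m = die_table; m->family; m++)
		if ((svr & m->mask) == m->svr)
			return m->family;
	return NULL;
}
```

The mask is not derived from the SVR here; it is stored next to it, which is the part that has to be filled in by hand for every die.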

> 
> > +static const struct soc_device_attribute *fsl_soc_device_match(
> > +   unsigned int svr, const struct soc_device_attribute *matches)
> > +{
> > +   char svr_match[50];
> > +   int n;
> > +
> > +   n = sprintf(svr_match, "*%08x*", svr);
> 
> n = sprintf(svr_match, "svr:0x%08x,*", svr);
> 
> (according to the current encoding)
> 

[Lu Yangbo-B47093] Ok. Will do that.
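
A userspace sketch of the anchored pattern, for illustration. POSIX fnmatch() stands in for the kernel's glob_match() here, and soc_id_matches_svr() is a hypothetical helper name. Unlike "*%08x*", the pattern "svr:0x%08x,*" only matches when the number appears in the svr tag at the start of soc_id.

```c
#include <fnmatch.h>
#include <stdio.h>

/*
 * Return 1 if soc_id begins with the "svr:0x<svr>," tag, 0 otherwise.
 * fnmatch() is used as a userspace substitute for glob_match().
 */
static int soc_id_matches_svr(unsigned int svr, const char *soc_id)
{
	char pattern[50];

	snprintf(pattern, sizeof(pattern), "svr:0x%08x,*", svr);
	return fnmatch(pattern, soc_id, 0) == 0;
}
```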

> > +
> > +   do {
> > +           if (!matches->soc_id)
> > +                   return NULL;
> > +           if (glob_match(svr_match, matches->soc_id))
> > +                   break;
> > +   } while (matches++);
> 
> Are you expecting "matches++" to ever evaluate as false?

[Lu Yangbo-B47093] Yes, this is used to search the qoriq_soc array until the
SoC in use is matched.
We need the name and die information defined in that array.
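
On the loop condition itself: "matches++" used as a condition yields the pointer value before the increment, which is non-NULL for every element of a valid array, so the loop can only end through the return or the break. A sketch of the same search with the sentinel check as the loop condition follows. struct soc_attr is a minimal stand-in for struct soc_device_attribute, soc_attr_find() is a hypothetical name, and exact string comparison replaces the glob match for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct soc_device_attribute. */
struct soc_attr {
	const char *soc_id;
	const char *family;
};

/*
 * Walk a table terminated by an entry whose soc_id is NULL and return
 * the first entry with a matching soc_id, or NULL if there is none.
 */
static const struct soc_attr *soc_attr_find(const struct soc_attr *matches,
					    const char *soc_id)
{
	for (; matches->soc_id; matches++)
		if (strcmp(matches->soc_id, soc_id) == 0)
			return matches;
	return NULL;
}
```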

> 
> > +   /* Register soc device */
> > +   soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
> > +   if (!soc_dev_attr) {
> > +           ret = -ENOMEM;
> > +           goto out_unmap;
> > +   }
> 
> Couldn't this be statically allocated?

[Lu Yang

Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-11 Thread Cong Wang
On Fri, Sep 9, 2016 at 8:52 AM, John Fastabend wrote:
> On 16-09-08 10:26 PM, Cong Wang wrote:
>> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet wrote:
>>> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>>>
>>>> Works for me. FWIW I find this plenty straightforward and don't really
>>>> see the need to make the hash table itself rcu friendly.
>>>>
>>>> Acked-by: John Fastabend

>>>
>>> Yes, it seems this hash table is used in control path, with RTNL held
>>> anyway.
>>
>> Seriously? You never read hashtable in fast path?? I think you need
>> to wake up.
>>
>
> But the actions use refcnt'ing and should never be decremented to zero
> as long as they can still be referenced by an active filter. If each
> action handles its parameters like mirred/gact then I don't see why it's
> necessary.

This is correct. By "read" I meant "dereference": the tc actions
are now permanently stored in the hashtable directly, so "reading"
a tc action is reading from the hashtable.

Sorry if this wasn't clear.


[PATCH] net: inet: diag: Fix an error handling

2016-09-11 Thread Christophe JAILLET
If 'inet_diag_lock_handler()' returns an error, we should not call
'inet_diag_unlock_handler()' on it.
'handler' is not a valid mutex in this case.

This has been spotted with the following coccinelle script:

@@
expression x;
identifier f;
@@

*   if (IS_ERR(x))
{
   ...
*  f(<+... x ...+>);
   ...
}

Signed-off-by: Christophe JAILLET 
---
 net/ipv4/inet_diag.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index abfbe492ebfe..795af25cf84c 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -1134,7 +1134,6 @@ int inet_diag_handler_get_info(struct sk_buff *skb, struct sock *sk)
 
    handler = inet_diag_lock_handler(sk->sk_protocol);
    if (IS_ERR(handler)) {
-       inet_diag_unlock_handler(handler);
        nlmsg_cancel(skb, nlh);
        return PTR_ERR(handler);
    }
-- 
2.7.4
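
The general pattern the coccinelle script looks for can be sketched in userspace as follows. ERR_PTR(), IS_ERR() and PTR_ERR() are re-created here for illustration, and lock_handler(), unlock_handler() and get_info() are hypothetical stand-ins. The sketch assumes a lock function that acquires nothing when it fails; whether that assumption holds for inet_diag_lock_handler() depends on its implementation, which is not shown in this message.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel helpers, for illustration only. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct handler { int locked; };	/* hypothetical stand-in */

static struct handler the_handler;

/* Return a locked handler, or an encoded errno if proto is unknown. */
static struct handler *lock_handler(int proto)
{
	if (proto != 6)		/* pretend only protocol 6 is registered */
		return ERR_PTR(-ENOENT);
	the_handler.locked = 1;
	return &the_handler;
}

static void unlock_handler(struct handler *h) { h->locked = 0; }

static long get_info(int proto)
{
	struct handler *h = lock_handler(proto);

	if (IS_ERR(h))
		return PTR_ERR(h);	/* nothing was locked: do not unlock */
	/* ... use h ... */
	unlock_handler(h);
	return 0;
}
```

In this model, passing the error pointer to unlock_handler() would dereference an invalid address, which is the kind of misuse the script flags.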



Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-11 Thread Cong Wang
On Fri, Sep 9, 2016 at 5:23 AM, Eric Dumazet wrote:
> On Thu, 2016-09-08 at 22:24 -0700, Cong Wang wrote:
>> On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet wrote:
>> > From: Eric Dumazet 
>> >
>> > As reported by Cong Wang, I was lazy when I did initial RCU conversion
>> > of tc_mirred, as I thought I could avoid allocation/freeing of a
>> > parameter block.
>>
>> Quote from Eric Dumazet:
>>
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html
>>
>> 
>> Well, I added a READ_ONCE() to read tcf_action once.
>>
>> Adding rcu here would mean adding a pointer and extra cache line, to
>> deref the values.
>>
>> IMHO the race here has no effect. You either read the old or new value.
>> 
>>
>> Me with facepalm... ;-)
>
>
> Point is still valid. Show me a real case where it was a serious
> problem, instead of simply theoretical.
>
> tc_mirred + ifb patches allowed us to reach a milestone, removing the
> last contended spinlocks, and you are catching up with this one year
> later.
>
> I won't backport this fix in Google prod kernels, because there is
> absolutely no way we need it, and the extra memory cache line might hurt
> latencies.
>
> Since you did not write a fix on your side since June 17th, I presume
> you do not care that much.

Sounds as if I were the author of this patch. Why are you questioning
your own patch? Did I ask you to care about it? ;-)

Please drop this patch.

Thanks.


Re: [PATCH 07/26] net/mlx4_core: constify local structures

2016-09-11 Thread Leon Romanovsky
On Sun, Sep 11, 2016 at 03:05:49PM +0200, Julia Lawall wrote:
> For structure types defined in the same file or local header files, find
> top-level static structure declarations that have the following
> properties:
> 1. Never reassigned.
> 2. Address never taken
> 3. Not passed to a top-level macro call
> 4. No pointer or array-typed field passed to a function or stored in a
> variable.
> Declare structures having all of these properties as const.
>
> Done using Coccinelle.
> Based on a suggestion by Joe Perches.
>
> Signed-off-by: Julia Lawall 

Thanks,
Reviewed-by: Leon Romanovsky 
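
As a small illustration of the criteria quoted above, a structure of the following shape would qualify. struct cmd_info and its users are hypothetical and not taken from mlx4_core.

```c
#include <stddef.h>

struct cmd_info {	/* hypothetical structure type */
	int opcode;
	const char *name;
};

/*
 * Never reassigned, address never taken, not passed to a macro, and no
 * pointer-typed field handed to a function: eligible for const.
 */
static const struct cmd_info query_cmd = {
	.opcode = 1,
	.name   = "query",
};

/* Reads a scalar field only, so the const qualifier changes nothing here. */
static int cmd_opcode(void)
{
	return query_cmd.opcode;
}
```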



[PATCH v4 05/16] IB/pvrdma: Add functions for Verbs support

2016-09-11 Thread Adit Ranadive
This patch implements the remaining Verbs functions registered with the
core RDMA stack.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Renamed priviledged -> privileged.
 - Added error numbers for command errors.
 - Removed unnecessary goto in modify_device.
 - Moved pd allocation to after command execution.
 - Removed an incorrect atomic_dec.
---
 drivers/infiniband/hw/pvrdma/pvrdma_verbs.c | 611 
 drivers/infiniband/hw/pvrdma/pvrdma_verbs.h | 108 +
 2 files changed, 719 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_verbs.c
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_verbs.h

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_verbs.c 
b/drivers/infiniband/hw/pvrdma/pvrdma_verbs.c
new file mode 100644
index 000..3805611
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_verbs.c
@@ -0,0 +1,611 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pvrdma.h"
+#include "pvrdma_user.h"
+
+/**
+ * pvrdma_query_device - query device
+ * @ibdev: the device to query
+ * @props: the device properties
+ * @uhw: user data
+ *
+ * @return: 0 on success, otherwise negative errno
+ */
+int pvrdma_query_device(struct ib_device *ibdev,
+   struct ib_device_attr *props,
+   struct ib_udata *uhw)
+{
+   struct pvrdma_dev *dev = to_vdev(ibdev);
+
+   if (uhw->inlen || uhw->outlen)
+   return -EINVAL;
+
+   memset(props, 0, sizeof(*props));
+
+   props->fw_ver = dev->dsr->caps.fw_ver;
+   props->sys_image_guid = dev->dsr->caps.sys_image_guid;
+   props->max_mr_size = dev->dsr->caps.max_mr_size;
+   props->page_size_cap = dev->dsr->caps.page_size_cap;
+   props->vendor_id = dev->dsr->caps.vendor_id;
+   props->vendor_part_id = dev->pdev->device;
+   props->hw_ver = dev->dsr->caps.hw_ver;
+   props->max_qp = dev->dsr->caps.max_qp;
+   props->max_qp_wr = dev->dsr->caps.max_qp_wr;
+   props->device_cap_flags = dev->dsr->caps.device_cap_flags;
+   props->max_sge = dev->dsr->caps.max_sge;
+   props->max_sge_rd = dev->dsr->caps.max_sge_rd;
+   props->max_cq = dev->dsr->caps.max_cq;
+   props->max_cqe = dev->dsr->caps.max_cqe;
+   props->max_mr = dev->dsr->caps.max_mr;
+   props->max_pd = dev->dsr->caps.max_pd;
+   props->max_qp_rd_atom = dev->dsr->caps.max_qp_rd_atom;
+   props->max_ee_rd_atom = dev->dsr->caps.max_ee_rd_atom;
+   props->max_res_rd_atom = dev->dsr->caps.max_res_rd_atom;
+   props->max_qp_init_rd_atom = dev->dsr->caps.max_qp_init_rd_atom;
+   props->max_ee_init_rd_atom = d

[PATCH v4 02/16] IB/pvrdma: Add user-level shared functions

2016-09-11 Thread Adit Ranadive
We share some common structures with the user-level driver. This patch
adds those structures and shared functions to traverse the QP/CQ rings.

Reviewed-by: Yuval Shaia 
Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Moved pvrdma_sge into pvrdma_uapi.h
---
 drivers/infiniband/hw/pvrdma/pvrdma_uapi.h | 255 +
 drivers/infiniband/hw/pvrdma/pvrdma_user.h |  99 +++
 2 files changed, 354 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_uapi.h
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_user.h

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_uapi.h 
b/drivers/infiniband/hw/pvrdma/pvrdma_uapi.h
new file mode 100644
index 000..430d8a5
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_uapi.h
@@ -0,0 +1,255 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_UAPI_H__
+#define __PVRDMA_UAPI_H__
+
+#include 
+
+#define PVRDMA_VERSION 17
+
+#define PVRDMA_UAR_HANDLE_MASK 0x00FFFFFF  /* Bottom 24 bits. */
+#define PVRDMA_UAR_QP_OFFSET   0   /* Offset of QP doorbell. */
+#define PVRDMA_UAR_QP_SEND BIT(30) /* Send bit. */
+#define PVRDMA_UAR_QP_RECV BIT(31) /* Recv bit. */
+#define PVRDMA_UAR_CQ_OFFSET   4   /* Offset of CQ doorbell. */
+#define PVRDMA_UAR_CQ_ARM_SOL  BIT(29) /* Arm solicited bit. */
+#define PVRDMA_UAR_CQ_ARM  BIT(30) /* Arm bit. */
+#define PVRDMA_UAR_CQ_POLL BIT(31) /* Poll bit. */
+#define PVRDMA_INVALID_IDX -1  /* Invalid index. */
+
+/* PVRDMA atomic compare and swap */
+struct pvrdma_exp_cmp_swap {
+   __u64 swap_val;
+   __u64 compare_val;
+   __u64 swap_mask;
+   __u64 compare_mask;
+};
+
+/* PVRDMA atomic fetch and add */
+struct pvrdma_exp_fetch_add {
+   __u64 add_val;
+   __u64 field_boundary;
+};
+
+/* PVRDMA address vector. */
+struct pvrdma_av {
+   __u32 port_pd;
+   __u32 sl_tclass_flowlabel;
+   __u8 dgid[16];
+   __u8 src_path_bits;
+   __u8 gid_index;
+   __u8 stat_rate;
+   __u8 hop_limit;
+   __u8 dmac[6];
+   __u8 reserved[6];
+};
+
+/* PVRDMA scatter/gather entry */
+struct pvrdma_sge {
+   __u64   addr;
+   __u32   length;
+   __u32   lkey;
+};
+
+/* PVRDMA receive queue work request */
+struct pvrdma_rq_wqe_hdr {
+   __u64 wr_id;        /* wr id */
+   __u32 num_sge;      /* size of s/g array */
+   __u32 total_len;    /* reserved */
+};
+/* Use pvrdma_sge (ib_sge) for receive queue s/g array elements. */
+
+/* PVRDMA send queue work request */
+struct pvrdma_sq_wqe_hdr {
+   __u64 wr_id;        /* wr id */
+   __u32 num_sge;
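
The UAR constants defined in this header suggest a doorbell word made of a 24-bit handle plus flag bits in the upper byte. A sketch of one way to compose such a value follows. The mask value is taken from the "Bottom 24 bits" comment, qp_doorbell() is a hypothetical helper, and how the driver actually builds the value is not visible in the part of the patch shown here.

```c
#include <stdint.h>

#define BIT(n)			(1u << (n))
/* Assumed from the "Bottom 24 bits" comment in the header. */
#define PVRDMA_UAR_HANDLE_MASK	0x00FFFFFFu
#define PVRDMA_UAR_QP_SEND	BIT(30)
#define PVRDMA_UAR_QP_RECV	BIT(31)

/* Hypothetical helper: combine a QP handle with send/recv flag bits. */
static uint32_t qp_doorbell(uint32_t qp_handle, uint32_t flags)
{
	return (qp_handle & PVRDMA_UAR_HANDLE_MASK) | flags;
}
```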

[PATCH v4 13/16] IB/pvrdma: Add the main driver module for PVRDMA

2016-09-11 Thread Adit Ranadive
This patch adds the support to register a RDMA device with the kernel RDMA
stack as well as a kernel module. This also initializes the underlying
virtual PCI device.

Reviewed-by: Yuval Shaia 
Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Fixed some checkpatch warnings.
 - Added support for new get_dev_fw_str API.
 - Added event workqueue for netdevice events.
 - Restructured the pvrdma_pci_remove function a little bit.

Changes v2->v3:
 - Removed boolean in pvrdma_cmd_post.

Changes v1->v2:
 - Addressed 32-bit build errors
 - Cosmetic change to avoid if else in intr0_handler
 - Removed unnecessary return assignment.
---
 drivers/infiniband/hw/pvrdma/pvrdma_main.c | 1222 
 1 file changed, 1222 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_main.c

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_main.c 
b/drivers/infiniband/hw/pvrdma/pvrdma_main.c
new file mode 100644
index 000..d61f3b2
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_main.c
@@ -0,0 +1,1222 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pvrdma.h"
+#include "pvrdma_user.h"
+
+#define DRV_NAME   "pvrdma"
+#define DRV_VERSION    "1.0"
+#define DRV_RELDATE    "January 1, 2013"
+
+static const char pvrdma_version[] =
+   DRV_NAME ": PVRDMA InfiniBand driver v"
+   DRV_VERSION " (" DRV_RELDATE ")\n";
+
+static DEFINE_MUTEX(pvrdma_device_list_lock);
+static LIST_HEAD(pvrdma_device_list);
+static struct workqueue_struct *event_wq;
+
+static int pvrdma_add_gid(struct ib_device *ibdev,
+ u8 port_num,
+ unsigned int index,
+ const union ib_gid *gid,
+ const struct ib_gid_attr *attr,
+ void **context);
+static int pvrdma_del_gid(struct ib_device *ibdev,
+ u8 port_num,
+ unsigned int index,
+ void **context);
+
+
+static ssize_t show_hca(struct device *device, struct device_attribute *attr,
+   char *buf)
+{
+   return sprintf(buf, "PVRDMA%s\n", DRV_VERSION);
+}
+
+static ssize_t show_rev(struct device *device, struct device_attribute *attr,
+   char *buf)
+{
+   return sprintf(buf, "%d\n", PVRDMA_REV_ID);
+}
+
+static ssize_t show_board(struct device *device, struct device_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", PVRDMA_BOARD_ID);
+}
+
+static DEVICE_ATTR(hw_rev,   S_IRUGO, show_rev,   NULL);
+static DEVICE_ATTR(hca_ty

[PATCH v4 06/16] IB/pvrdma: Add paravirtual rdma device

2016-09-11 Thread Adit Ranadive
This patch adds the main device-level structures and functions to be used
to provide RDMA functionality. Also, we define conversion functions from
the IB core stack structures to the device-specific ones.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we hold a lock
 to call it.
 - Added wrapper functions for writing to UARs for CQ/QP.
 - The conversion functions are updated as func_name(dst, src) format.
 - Renamed max_gs to max_sg.
 - Added work struct for net device events.
 - priviledged -> privileged.

Changes v2->v3:
 - Removed VMware vendor id redefinition.
 - Removed the boolean in pvrdma_cmd_post.
---
 drivers/infiniband/hw/pvrdma/pvrdma.h | 473 ++
 1 file changed, 473 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma.h

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma.h 
b/drivers/infiniband/hw/pvrdma/pvrdma.h
new file mode 100644
index 000..fedd7cb
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma.h
@@ -0,0 +1,473 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_H__
+#define __PVRDMA_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pvrdma_defs.h"
+#include "pvrdma_dev_api.h"
+#include "pvrdma_verbs.h"
+
+/* NOT the same as BIT_MASK(). */
+#define PVRDMA_MASK(n) ((n << 1) - 1)
+
+/*
+ * VMware PVRDMA PCI device id.
+ */
+#define PCI_DEVICE_ID_VMWARE_PVRDMA    0x0820
+
+struct pvrdma_dev;
+
+struct pvrdma_page_dir {
+   dma_addr_t dir_dma;
+   u64 *dir;
+   int ntables;
+   u64 **tables;
+   u64 npages;
+   void **pages;
+};
+
+struct pvrdma_cq {
+   struct ib_cq ibcq;
+   int offset;
+   spinlock_t cq_lock; /* Poll lock. */
+   struct pvrdma_uar_map *uar;
+   struct ib_umem *umem;
+   struct pvrdma_ring_state *ring_state;
+   struct pvrdma_page_dir pdir;
+   u32 cq_handle;
+   bool is_kernel;
+   atomic_t refcnt;
+   wait_queue_head_t wait;
+};
+
+struct pvrdma_id_table {
+   u32 last;
+   u32 top;
+   u32 max;
+   u32 mask;
+   spinlock_t lock; /* Table lock. */
+   unsigned long *table;
+};
+
+struct pvrdma_uar_map {
+   unsigned long pfn;
+   void __iomem *map;
+   int index;
+};
+
+struct pvrdma_uar_table {
+   struct pvrdma_id_table tbl;
+   int size;
+};
+
+struct pvrdma_ucontext {
+   struct ib_ucontext ibucontext;
+   struct pvrdma_dev *dev;
+   struct pvrdma_uar_map uar;
+   u64 ctx_handle;
+};
+
+struct pvrdma_pd {
+   struct ib_pd ibpd;
+   u32 pdn;
+   u32 pd_handle;
+   int privileged;
+};
+
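
A note on the PVRDMA_MASK() macro defined earlier in this header: given a single flag bit n, ((n << 1) - 1) yields a mask of every bit up to and including n. The sketch below assumes the macro is meant to receive the highest defined flag value and uses it to reject unknown flags. The flag names and has_unknown_flags() are hypothetical, and the argument is parenthesised here so that expressions can be passed safely.

```c
/* Parenthesised variant of the macro; the argument is the highest flag bit. */
#define PVRDMA_MASK(n)	((((unsigned int)(n)) << 1) - 1)

/* Hypothetical flag set, for illustration only. */
enum {
	FLAG_A   = 1 << 0,
	FLAG_B   = 1 << 1,
	FLAG_C   = 1 << 2,
	FLAG_MAX = FLAG_C,
};

/* Return nonzero if flags contains bits outside the defined set. */
static int has_unknown_flags(unsigned int flags)
{
	return (flags & ~PVRDMA_MASK(FLAG_MAX)) != 0;
}
```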

[PATCH v4 11/16] IB/pvrdma: Add support for memory regions

2016-09-11 Thread Adit Ranadive
This patch adds support for creating and destroying memory regions. The
PVRDMA device supports User MRs, DMA MRs (no Remote Read/Write support),
Fast Register MRs.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Changed access flag check for DMA MR to using bit operation.
 - Removed some local variables.

Changes v2->v3:
 - Removed boolean in pvrdma_cmd_post.
---
 drivers/infiniband/hw/pvrdma/pvrdma_mr.c | 332 +++
 1 file changed, 332 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_mr.c

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_mr.c 
b/drivers/infiniband/hw/pvrdma/pvrdma_mr.c
new file mode 100644
index 000..6163f17
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_mr.c
@@ -0,0 +1,332 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+
+#include "pvrdma.h"
+
+/**
+ * pvrdma_get_dma_mr - get a DMA memory region
+ * @pd: protection domain
+ * @acc: access flags
+ *
+ * @return: ib_mr pointer on success, otherwise returns an errno.
+ */
+struct ib_mr *pvrdma_get_dma_mr(struct ib_pd *pd, int acc)
+{
+   struct pvrdma_dev *dev = to_vdev(pd->device);
+   struct pvrdma_user_mr *mr;
+   union pvrdma_cmd_req req;
+   union pvrdma_cmd_resp rsp;
+   struct pvrdma_cmd_create_mr *cmd = &req.create_mr;
+   struct pvrdma_cmd_create_mr_resp *resp = &rsp.create_mr_resp;
+   int ret;
+
+   if (!(acc & IB_ACCESS_LOCAL_WRITE)) {
+   dev_warn(&dev->pdev->dev,
+"unsupported dma mr access flags %#x\n", acc);
+   return ERR_PTR(-EOPNOTSUPP);
+   }
+
+   mr = kzalloc(sizeof(*mr), GFP_KERNEL);
+   if (!mr)
+   return ERR_PTR(-ENOMEM);
+
+   memset(cmd, 0, sizeof(*cmd));
+   cmd->hdr.cmd = PVRDMA_CMD_CREATE_MR;
+   cmd->pd_handle = to_vpd(pd)->pd_handle;
+   cmd->access_flags = acc;
+   cmd->flags = PVRDMA_MR_FLAG_DMA;
+
+   ret = pvrdma_cmd_post(dev, &req, &rsp);
+   if (ret < 0) {
+   dev_warn(&dev->pdev->dev, "could not get DMA mem region\n");
+   kfree(mr);
+   return ERR_PTR(ret);
+   }
+
+   mr->mmr.mr_handle = resp->mr_handle;
+   mr->ibmr.lkey = resp->lkey;
+   mr->ibmr.rkey = resp->rkey;
+
+   return &mr->ibmr;
+}
+
+/**
+ * pvrdma_reg_user_mr - register a userspace memory region
+ * @pd: protection domain
+ * @start: starting address
+ * @length: length of region
+ * @virt_addr: I/O virtual address
+ * @access_flags: access flags for memory region
+ * @udata: user data
+ *
+ * @return: ib_mr pointer on success, otherwise returns an errno.
+ */
+struct ib_mr *pvrdma_reg_user_mr(struct ib_pd *pd, u64 

[PATCH v4 12/16] IB/pvrdma: Add Queue Pair support

2016-09-11 Thread Adit Ranadive
This patch adds the ability to create, modify, query and destroy QPs. The
PVRDMA device supports RC, UD and GSI QPs.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Removed an unnecessary switch case.
 - Unified the returns in pvrdma_create_qp to use one exit point.
 - Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we need a lock to
 be held when calling this.
 - Updated to use wrapper for UAR write for QP.
 - Updated conversion function to func_name(dst, src) format.
 - Renamed max_gs to max_sg.
 - Renamed cap variable to req_cap in pvrdma_set_sq/rq_size.
 - Changed dev_warn to dev_warn_ratelimited in pvrdma_post_send/recv.
 - Added nested locking for flushing CQs when destroying/resetting a QP.
 - Added missing ret value.

Changes v2->v3:
 - Removed boolean in pvrdma_cmd_post.
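
The nested locking mentioned in the v4 changelog relies on always taking the two CQ locks in a fixed order, which pvrdma_lock_cqs() in the diff below derives from cq_handle. The following is only a userspace sketch of that ordering rule, not the driver code: struct fake_cq and cq_lock_order() are hypothetical names made up for illustration, and no real spinlocks are involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CQ; only the handle matters for ordering. */
struct fake_cq {
	unsigned int handle;
};

/*
 * Decide the order in which the locks of two CQs should be taken.
 * Fills order[] and returns how many distinct locks are needed (1 or 2).
 * Locking the lower handle first means two threads that pass the same
 * pair in opposite roles still agree on the order, which avoids an
 * ABBA deadlock.
 */
static size_t cq_lock_order(struct fake_cq *scq, struct fake_cq *rcq,
			    struct fake_cq *order[2])
{
	assert(scq != NULL && rcq != NULL);

	if (scq == rcq) {
		/* Same CQ for send and receive: one lock only. */
		order[0] = scq;
		order[1] = NULL;
		return 1;
	}

	if (scq->handle < rcq->handle) {
		order[0] = scq;
		order[1] = rcq;
	} else {
		order[0] = rcq;
		order[1] = scq;
	}
	return 2;
}
```

In the driver the second acquisition uses spin_lock_irqsave_nested() with SINGLE_DEPTH_NESTING, presumably so that lockdep accepts two locks of the same class being held at once.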
---
 drivers/infiniband/hw/pvrdma/pvrdma_qp.c | 980 +++
 1 file changed, 980 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_qp.c

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_qp.c b/drivers/infiniband/hw/pvrdma/pvrdma_qp.c
new file mode 100644
index 000..4163186
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_qp.c
@@ -0,0 +1,980 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pvrdma.h"
+#include "pvrdma_user.h"
+
+static inline void get_cqs(struct pvrdma_qp *qp, struct pvrdma_cq **send_cq,
+  struct pvrdma_cq **recv_cq)
+{
+   *send_cq = to_vcq(qp->ibqp.send_cq);
+   *recv_cq = to_vcq(qp->ibqp.recv_cq);
+}
+
+static void pvrdma_lock_cqs(struct pvrdma_cq *scq, struct pvrdma_cq *rcq,
+   unsigned long *scq_flags,
+   unsigned long *rcq_flags)
+   __acquires(scq->cq_lock) __acquires(rcq->cq_lock)
+{
+   if (scq == rcq) {
+   spin_lock_irqsave(&scq->cq_lock, *scq_flags);
+   __acquire(rcq->cq_lock);
+   } else if (scq->cq_handle < rcq->cq_handle) {
+   spin_lock_irqsave(&scq->cq_lock, *scq_flags);
+   spin_lock_irqsave_nested(&rcq->cq_lock, *rcq_flags,
+SINGLE_DEPTH_NESTING);
+   } else {
+   spin_lock_irqsave(&rcq->cq_lock, *rcq_flags);
+   spin_lock_irqsave_nested(&scq->cq_lock, *scq_flags,
+SINGLE_DEPTH_NESTING);
+   }
+}
+
+static void pvrdma_unlock_cqs(struct pvrdma_cq *scq, struct pvrdma_cq *rcq,
+ unsigned long *scq_flags,
+ unsigned long *rcq_flags)
+   __releases(scq->cq_lock) __releases(rcq->cq_lock)
+{
+   if (

[PATCH v4 04/16] IB/pvrdma: Add the paravirtual RDMA device specification

2016-09-11 Thread Adit Ranadive
This patch describes the main specification of the underlying virtual RDMA
device. The pvrdma_dev_api header file defines the Verbs commands and
their parameters that can be issued to the device backend.

Reviewed-by: Yuval Shaia 
Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Removed explicit enum values.

Changes v2->v3:
 - Defined 9 and 18 for page directory.
 - Stripped spaces in some comments.
---
 drivers/infiniband/hw/pvrdma/pvrdma_defs.h| 302 +++
 drivers/infiniband/hw/pvrdma/pvrdma_dev_api.h | 342 ++
 2 files changed, 644 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_defs.h
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_dev_api.h

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_defs.h b/drivers/infiniband/hw/pvrdma/pvrdma_defs.h
new file mode 100644
index 000..de7d1fb
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_defs.h
@@ -0,0 +1,302 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_DEFS_H__
+#define __PVRDMA_DEFS_H__
+
+#include 
+#include "pvrdma_ib_verbs.h"
+#include "pvrdma_uapi.h"
+
+/*
+ * Masks and accessors for page directory, which is a two-level lookup:
+ * page directory -> page table -> page. Only one directory for now, but we
+ * could expand that easily. 9 bits for tables, 9 bits for pages, gives one
+ * gigabyte for memory regions and so forth.
+ */
+
+#define PVRDMA_PDIR_SHIFT          18
+#define PVRDMA_PTABLE_SHIFT        9
+#define PVRDMA_PAGE_DIR_DIR(x)     (((x) >> PVRDMA_PDIR_SHIFT) & 0x1)
+#define PVRDMA_PAGE_DIR_TABLE(x)   (((x) >> PVRDMA_PTABLE_SHIFT) & 0x1ff)
+#define PVRDMA_PAGE_DIR_PAGE(x)    ((x) & 0x1ff)
+#define PVRDMA_PAGE_DIR_MAX_PAGES  (1 * 512 * 512)
+#define PVRDMA_MAX_FAST_REG_PAGES  128
+
+/*
+ * Max MSI-X vectors.
+ */
+
+#define PVRDMA_MAX_INTERRUPTS  3
+
+/* Register offsets within PCI resource on BAR1. */
+#define PVRDMA_REG_VERSION 0x00    /* R: Version of device. */
+#define PVRDMA_REG_DSRLOW  0x04    /* W: Device shared region low PA. */
+#define PVRDMA_REG_DSRHIGH 0x08    /* W: Device shared region high PA. */
+#define PVRDMA_REG_CTL     0x0c    /* W: PVRDMA_DEVICE_CTL */
+#define PVRDMA_REG_REQUEST 0x10    /* W: Indicate device request. */
+#define PVRDMA_REG_ERR     0x14    /* R: Device error. */
+#define PVRDMA_REG_ICR     0x18    /* R: Interrupt cause. */
+#define PVRDMA_REG_IMR     0x1c    /* R/W: Interrupt mask. */
+#define PVRDMA_REG_MACL    0x20    /* R/W: MAC address low. */
+#define PVRDMA_REG_MACH    0x24    /* R/W: MAC address high. */
+
+/* Object flags. */
+#define PVRDMA_CQ_FLAG_AR

[PATCH v4 15/16] IB: Add PVRDMA driver

2016-09-11 Thread Adit Ranadive
This patch updates the InfiniBand subsystem to build the PVRDMA driver.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
 drivers/infiniband/Kconfig | 1 +
 drivers/infiniband/hw/Makefile | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 19a418a..dff4bcf 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -88,5 +88,6 @@ source "drivers/infiniband/sw/rdmavt/Kconfig"
 source "drivers/infiniband/sw/rxe/Kconfig"
 
 source "drivers/infiniband/hw/hfi1/Kconfig"
+source "drivers/infiniband/hw/pvrdma/Kconfig"
 
 endif # INFINIBAND
diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile
index 21fe401..c8a7a36 100644
--- a/drivers/infiniband/hw/Makefile
+++ b/drivers/infiniband/hw/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_INFINIBAND_OCRDMA)   += ocrdma/
 obj-$(CONFIG_INFINIBAND_USNIC) += usnic/
 obj-$(CONFIG_INFINIBAND_HFI1)  += hfi1/
 obj-$(CONFIG_INFINIBAND_HNS)   += hns/
+obj-$(CONFIG_INFINIBAND_PVRDMA)        += pvrdma/
-- 
2.7.4



[PATCH v4 16/16] MAINTAINERS: Update for PVRDMA driver

2016-09-11 Thread Adit Ranadive
Add maintainer info for the PVRDMA driver.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 87e23cd..fee8f1d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12615,6 +12615,13 @@ S: Maintained
 F: drivers/scsi/vmw_pvscsi.c
 F: drivers/scsi/vmw_pvscsi.h
 
+VMWARE PVRDMA DRIVER
+M: Adit Ranadive 
+M: VMware PV-Drivers 
+L: linux-r...@vger.kernel.org
+S: Maintained
+F: drivers/infiniband/hw/pvrdma
+
 VOLTAGE AND CURRENT REGULATOR FRAMEWORK
 M: Liam Girdwood 
 M: Mark Brown 
-- 
2.7.4



[PATCH v4 03/16] IB/pvrdma: Add virtual device RDMA structures

2016-09-11 Thread Adit Ranadive
This patch adds the various Verbs structures that we support in the
virtual RDMA device. We have re-mapped the ones from the RDMA core stack
to make sure we can maintain compatibility with our backend.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Moved the pvrdma_sge struct to pvrdma_uapi.h
---
 drivers/infiniband/hw/pvrdma/pvrdma_ib_verbs.h | 444 +
 1 file changed, 444 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_ib_verbs.h

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_ib_verbs.h b/drivers/infiniband/hw/pvrdma/pvrdma_ib_verbs.h
new file mode 100644
index 000..105c6ab
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_ib_verbs.h
@@ -0,0 +1,444 @@
+/*
+ * [PLEASE NOTE:  VMWARE, INC. ELECTS TO USE AND DISTRIBUTE THIS COMPONENT
+ * UNDER THE TERMS OF THE OpenIB.org BSD license.  THE ORIGINAL LICENSE TERMS
+ * ARE REPRODUCED BELOW ONLY AS A REFERENCE.]
+ *
+ * Copyright (c) 2004 Mellanox Technologies Ltd.  All rights reserved.
+ * Copyright (c) 2004 Infinicon Corporation.  All rights reserved.
+ * Copyright (c) 2004 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2004 Topspin Corporation.  All rights reserved.
+ * Copyright (c) 2004 Voltaire Corporation.  All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems.  All rights reserved.
+ * Copyright (c) 2015-2016 VMware, Inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __PVRDMA_IB_VERBS_H__
+#define __PVRDMA_IB_VERBS_H__
+
+#include 
+
+union pvrdma_gid {
+   __u8    raw[16];
+   struct {
+   __be64  subnet_prefix;
+   __be64  interface_id;
+   } global;
+};
+
+enum pvrdma_link_layer {
+   PVRDMA_LINK_LAYER_UNSPECIFIED,
+   PVRDMA_LINK_LAYER_INFINIBAND,
+   PVRDMA_LINK_LAYER_ETHERNET,
+};
+
+enum pvrdma_mtu {
+   PVRDMA_MTU_256  = 1,
+   PVRDMA_MTU_512  = 2,
+   PVRDMA_MTU_1024 = 3,
+   PVRDMA_MTU_2048 = 4,
+   PVRDMA_MTU_4096 = 5,
+};
+
+static inline int pvrdma_mtu_enum_to_int(enum pvrdma_mtu mtu)
+{
+   switch (mtu) {
+   case PVRDMA_MTU_256:    return  256;
+   case PVRDMA_MTU_512:    return  512;
+   case PVRDMA_MTU_1024:   return 1024;
+   case PVRDMA_MTU_2048:   return 2048;
+   case PVRDMA_MTU_4096:   return 4096;
+   default:                return   -1;
+   }
+}
+
+static inline enum pvrdma_mtu pvrdma_mtu_int_to_enum(int mtu)
+{
+   switch (mtu) {
+   case 256:   return PVRDMA_MTU_256;
+   case 512:   return PVRDMA_MTU_512;
+   case 1024:  return PVRDMA_MTU_1024;
+   case 2048:  return PVRDMA_MTU_2048;
+   case 4096:
+   default:    return PVRDMA_MTU_4096;
+   }
+}
+
+enum pvrdma_port_state {
+   PVRDMA_PORT_NOP             = 0,
+   PVRDMA_PORT_DOWN            = 1,
+   PVRDMA_PORT_INIT            = 2,
+   PVRDMA_PORT_ARMED           = 3,
+   PVRDMA_PORT_ACTIVE          = 4,
+   PVRDMA_PORT_ACTIVE_DEFER    = 5,
+};
+
+enum pvrdma_port_cap_flags {
+   PVRDMA_PORT_SM              = 1 <<  1,
+   PVRDMA_PORT_NOTICE_SUP      = 1 <<  2,
+   PVRDMA_PORT_TRAP_SUP        = 1 <<  3,
+   PVRDMA_PORT_OPT_IPD_SUP     = 1 <<  4,
+   PVRDMA_PORT_AUTO_MIGR_SUP   = 1 <<  5,
+   PVRDMA_PORT_SL_MAP_SUP      = 1 <<  6,
+   PVRDMA_PORT_MKEY_NVRAM  

[PATCH v4 10/16] IB/pvrdma: Add UAR support

2016-09-11 Thread Adit Ranadive
This patch adds the UAR support for the paravirtual RDMA device. The UAR
pages are MMIO pages from the virtual PCI space. We define offsets within
this page to provide the fast data-path operations.
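
The UAR index handed out by pvrdma_uar_alloc() later in this patch comes from a bitmap in which index 0 is reserved for the device and the table size must be a power of two. Below is a simplified userspace sketch of that idea, assuming a table of at most 64 entries kept in a single word. The names id_table, id_table_init, id_table_alloc and id_table_free are hypothetical and not part of the driver; the driver's spinlock and its top/mask bookkeeping are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define ID_TABLE_MAX 64 /* assumption: the whole bitmap fits in one word */

struct id_table {
	uint64_t bits; /* bit i set means index i is in use */
	uint32_t max;  /* number of valid indices, a power of two */
	uint32_t last; /* index handed out most recently, used as a search hint */
};

/* Returns 0 on success, -1 if num is not a usable power of two. */
static int id_table_init(struct id_table *t, uint32_t num)
{
	if (num == 0 || num > ID_TABLE_MAX || (num & (num - 1)) != 0)
		return -1;

	t->bits = 1; /* index 0 is reserved, as in pvrdma_uar_table_init() */
	t->max = num;
	t->last = 0;
	return 0;
}

/* Returns a free index, or -1 if every index is taken. */
static int id_table_alloc(struct id_table *t)
{
	uint32_t i;

	/* Scan from the hint and wrap around using the power-of-two mask. */
	for (i = 0; i < t->max; i++) {
		uint32_t idx = (t->last + i) & (t->max - 1);

		if (!(t->bits & (UINT64_C(1) << idx))) {
			t->bits |= UINT64_C(1) << idx;
			t->last = idx;
			return (int)idx;
		}
	}
	return -1;
}

static void id_table_free(struct id_table *t, uint32_t idx)
{
	assert(idx > 0 && idx < t->max); /* index 0 is never released */
	t->bits &= ~(UINT64_C(1) << idx);
}
```

In the driver, the allocated index is then added to the page frame number of the UAR PCI resource to obtain the page that gets mapped for the caller.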

Reviewed-by: Yuval Shaia 
Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Removed an unnecessary comment.

Changes v2->v3:
 - Used is_power_of_2 function.
 - Simplify pvrdma_uar_alloc function.
---
 drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c | 127 +
 1 file changed, 127 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c b/drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c
new file mode 100644
index 000..bf51357
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_doorbell.c
@@ -0,0 +1,127 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+
+#include "pvrdma.h"
+
+int pvrdma_uar_table_init(struct pvrdma_dev *dev)
+{
+   u32 num = dev->dsr->caps.max_uar;
+   u32 mask = num - 1;
+   struct pvrdma_id_table *tbl = &dev->uar_table.tbl;
+
+   if (!is_power_of_2(num))
+   return -EINVAL;
+
+   tbl->last = 0;
+   tbl->top = 0;
+   tbl->max = num;
+   tbl->mask = mask;
+   spin_lock_init(&tbl->lock);
+   tbl->table = kcalloc(BITS_TO_LONGS(num), sizeof(long), GFP_KERNEL);
+   if (!tbl->table)
+   return -ENOMEM;
+
+   /* 0th UAR is taken by the device. */
+   set_bit(0, tbl->table);
+
+   return 0;
+}
+
+void pvrdma_uar_table_cleanup(struct pvrdma_dev *dev)
+{
+   struct pvrdma_id_table *tbl = &dev->uar_table.tbl;
+
+   kfree(tbl->table);
+}
+
+int pvrdma_uar_alloc(struct pvrdma_dev *dev, struct pvrdma_uar_map *uar)
+{
+   struct pvrdma_id_table *tbl;
+   unsigned long flags;
+   u32 obj;
+
+   tbl = &dev->uar_table.tbl;
+
+   spin_lock_irqsave(&tbl->lock, flags);
+   obj = find_next_zero_bit(tbl->table, tbl->max, tbl->last);
+   if (obj >= tbl->max) {
+   tbl->top = (tbl->top + tbl->max) & tbl->mask;
+   obj = find_first_zero_bit(tbl->table, tbl->max);
+   }
+
+   if (obj >= tbl->max) {
+   spin_unlock_irqrestore(&tbl->lock, flags);
+   return -ENOMEM;
+   }
+
+   set_bit(obj, tbl->table);
+   obj |= tbl->top;
+
+   spin_unlock_irqrestore(&tbl->lock, flags);
+
+   uar->index = obj;
+   uar->pfn = (pci_resource_start(dev->pdev, PVRDMA_PCI_RESOURCE_UAR) >>
+   PAGE_SHIFT) + uar->index;
+
+   return 0;
+}
+
+void pvrdma_uar_free(struct pvrdma_dev *dev, struct pvrdma_uar_map *uar)
+{
+   struct pvrdma_id

[PATCH v4 08/16] IB/pvrdma: Add device command support

2016-09-11 Thread Adit Ranadive
This patch enables posting Verbs requests to, and receiving responses from,
the backend PVRDMA emulation layer.

Reviewed-by: Yuval Shaia 
Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Removed the min check and added a BUILD_BUG_ON check for size.

Changes v2->v3:
 - Converted pvrdma_cmd_recv to inline.
 - Added a min check in the memcpy to cmd_slot.
 - Removed the boolean from pvrdma_cmd_post.
---
 drivers/infiniband/hw/pvrdma/pvrdma_cmd.c | 108 ++
 1 file changed, 108 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_cmd.c

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_cmd.c b/drivers/infiniband/hw/pvrdma/pvrdma_cmd.c
new file mode 100644
index 000..827e3ff
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_cmd.c
@@ -0,0 +1,108 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include "pvrdma.h"
+
+#define PVRDMA_CMD_TIMEOUT 1 /* ms */
+
+static inline int pvrdma_cmd_recv(struct pvrdma_dev *dev,
+ union pvrdma_cmd_resp *resp)
+{
+   dev_dbg(&dev->pdev->dev, "receive response from device\n");
+
+   spin_lock(&dev->cmd_lock);
+   memcpy(resp, dev->resp_slot, sizeof(*resp));
+   spin_unlock(&dev->cmd_lock);
+
+   return 0;
+}
+
+int
+pvrdma_cmd_post(struct pvrdma_dev *dev, union pvrdma_cmd_req *req,
+   union pvrdma_cmd_resp *resp)
+{
+   int err;
+
+   dev_dbg(&dev->pdev->dev, "post request to device\n");
+
+   /* Serialization */
+   down(&dev->cmd_sema);
+
+   BUILD_BUG_ON(sizeof(union pvrdma_cmd_req) !=
+sizeof(struct pvrdma_cmd_modify_qp));
+
+   spin_lock(&dev->cmd_lock);
+   memcpy(dev->cmd_slot, req, sizeof(*req));
+   spin_unlock(&dev->cmd_lock);
+
+   init_completion(&dev->cmd_done);
+   pvrdma_write_reg(dev, PVRDMA_REG_REQUEST, 0);
+
+   /* Make sure the request is written before reading status. */
+   mb();
+   err = pvrdma_read_reg(dev, PVRDMA_REG_ERR);
+   if (err == 0) {
+   if (resp != NULL) {
+   err = wait_for_completion_interruptible_timeout(
+   &dev->cmd_done,
+   msecs_to_jiffies(PVRDMA_CMD_TIMEOUT));
+   if (err == 0 || err == -ERESTARTSYS) {
+   dev_err(&dev->pdev->dev,
+   "completion timeout or interrupted\n");
+   err = -ETIMEDOUT;
+   } else {
+   err = pvrdma_cmd_recv(dev, resp);
+   }
+   }
+   } else {
+   dev_warn(

[PATCH v4 14/16] IB/pvrdma: Add Kconfig and Makefile

2016-09-11 Thread Adit Ranadive
This patch adds a Kconfig and Makefile for the PVRDMA driver.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Enforced dependency on VMXNet3
---
 drivers/infiniband/hw/pvrdma/Kconfig  | 7 +++
 drivers/infiniband/hw/pvrdma/Makefile | 3 +++
 2 files changed, 10 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/Kconfig
 create mode 100644 drivers/infiniband/hw/pvrdma/Makefile

diff --git a/drivers/infiniband/hw/pvrdma/Kconfig b/drivers/infiniband/hw/pvrdma/Kconfig
new file mode 100644
index 000..b345679
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/Kconfig
@@ -0,0 +1,7 @@
+config INFINIBAND_PVRDMA
+   tristate "VMware Paravirtualized RDMA Driver"
+   depends on NETDEVICES && ETHERNET && PCI && INET && VMXNET3
+   ---help---
+ This driver provides low-level support for the VMware Paravirtual
+ RDMA adapter. It interacts with the VMXNet3 driver to provide
+ Ethernet capabilities.
diff --git a/drivers/infiniband/hw/pvrdma/Makefile b/drivers/infiniband/hw/pvrdma/Makefile
new file mode 100644
index 000..e6f078b
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_INFINIBAND_PVRDMA) += pvrdma.o
+
+pvrdma-y := pvrdma_cmd.o pvrdma_cq.o pvrdma_doorbell.o pvrdma_main.o pvrdma_misc.o pvrdma_mr.o pvrdma_qp.o pvrdma_verbs.o
-- 
2.7.4



[PATCH v4 09/16] IB/pvrdma: Add support for Completion Queues

2016-09-11 Thread Adit Ranadive
This patch adds support for creating and destroying completion queues
on the paravirtual RDMA device.

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Added a pvrdma_destroy_cq in the error path.
 - Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we need a lock to
 be held while calling this.
 - Updated to use wrapper for UAR write for CQ.
 - Ensure that poll_cq does not return error values.

Changes v2->v3:
 - Removed boolean from pvrdma_cmd_post.
 - Return -EAGAIN if qp retrieved from CQE is bogus.
 - Check for invalid index of ring.
---
 drivers/infiniband/hw/pvrdma/pvrdma_cq.c | 428 +++
 1 file changed, 428 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_cq.c

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/pvrdma/pvrdma_cq.c
new file mode 100644
index 000..8144b20
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_cq.c
@@ -0,0 +1,428 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pvrdma.h"
+#include "pvrdma_user.h"
+
+/**
+ * pvrdma_req_notify_cq - request notification for a completion queue
+ * @ibcq: the completion queue
+ * @notify_flags: notification flags
+ *
+ * @return: 0 for success.
+ */
+int pvrdma_req_notify_cq(struct ib_cq *ibcq,
+enum ib_cq_notify_flags notify_flags)
+{
+   struct pvrdma_dev *dev = to_vdev(ibcq->device);
+   struct pvrdma_cq *cq = to_vcq(ibcq);
+   u32 val = cq->cq_handle;
+
+   val |= (notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
+   PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM;
+
+   pvrdma_write_uar_cq(dev, val);
+
+   return 0;
+}
+
+/**
+ * pvrdma_create_cq - create completion queue
+ * @ibdev: the device
+ * @attr: completion queue attributes
+ * @context: user context
+ * @udata: user data
+ *
+ * @return: ib_cq completion queue pointer on success,
+ *  otherwise returns negative errno.
+ */
+struct ib_cq *pvrdma_create_cq(struct ib_device *ibdev,
+  const struct ib_cq_init_attr *attr,
+  struct ib_ucontext *context,
+  struct ib_udata *udata)
+{
+   int entries = attr->cqe;
+   struct pvrdma_dev *dev = to_vdev(ibdev);
+   struct pvrdma_cq *cq;
+   int ret;
+   int npages;
+   unsigned long flags;
+   union pvrdma_cmd_req req;
+   union pvrdma_cmd_resp rsp;
+   struct pvrdma_cmd_create_cq *cmd = &req.create_cq;
+   struct pvrdma_cmd_create_cq_resp *resp = &rsp.create_cq_resp;
+   struct pvrdma_create_cq ucmd;
+
+   BUILD_BUG_ON(siz

[PATCH v4 07/16] IB/pvrdma: Add helper functions

2016-09-11 Thread Adit Ranadive
This patch adds helper functions to store guest page addresses in a page
directory structure. The page directory pointer is passed down to the
backend which then maps the entire memory for the RDMA object by
traversing the directory. We add some more helper functions for converting
to/from RDMA stack address handles from/to PVRDMA ones.
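
As an illustration only (not part of the patch), the indexing behind such a page directory can be sketched as below. The geometry is an assumption: 4 KiB pages holding 8-byte addresses, so 512 slots per table and 512 tables per directory page. The names SLOTS_PER_PAGE, table_index(), slot_index() and tables_needed() are hypothetical; tables_needed() mirrors the "TABLE(npages - 1) + 1" computation in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages holding 8-byte addresses -> 512 slots. */
#define SLOTS_PER_PAGE 512u
#define MAX_PAGES ((uint64_t)SLOTS_PER_PAGE * SLOTS_PER_PAGE)

/* Index of the page table that holds page number idx. */
static uint64_t table_index(uint64_t idx)
{
	return (idx / SLOTS_PER_PAGE) % SLOTS_PER_PAGE;
}

/* Slot inside that table for page number idx. */
static uint64_t slot_index(uint64_t idx)
{
	return idx % SLOTS_PER_PAGE;
}

/* Number of tables needed to describe npages pages (npages >= 1). */
static uint64_t tables_needed(uint64_t npages)
{
	assert(npages >= 1 && npages <= MAX_PAGES);
	return table_index(npages - 1) + 1;
}
```

Note that the sketch rejects npages == 0, since "npages - 1" would wrap around for an unsigned count.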

Reviewed-by: Jorgen Hansen 
Reviewed-by: George Zhang 
Reviewed-by: Aditya Sarwade 
Reviewed-by: Bryan Tan 
Signed-off-by: Adit Ranadive 
---
Changes v3->v4:
 - Updated conversion functions to func_name(dst, src) format.
 - Removed unneeded local variables.
---
 drivers/infiniband/hw/pvrdma/pvrdma_misc.c | 303 +
 1 file changed, 303 insertions(+)
 create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_misc.c

diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_misc.c b/drivers/infiniband/hw/pvrdma/pvrdma_misc.c
new file mode 100644
index 000..1f12cd6
--- /dev/null
+++ b/drivers/infiniband/hw/pvrdma/pvrdma_misc.c
@@ -0,0 +1,303 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+
+#include "pvrdma.h"
+
+int pvrdma_page_dir_init(struct pvrdma_dev *dev, struct pvrdma_page_dir *pdir,
+u64 npages, bool alloc_pages)
+{
+   u64 i;
+
+   if (npages > PVRDMA_PAGE_DIR_MAX_PAGES)
+   return -EINVAL;
+
+   memset(pdir, 0, sizeof(*pdir));
+
+   pdir->dir = dma_alloc_coherent(&dev->pdev->dev, PAGE_SIZE,
+  &pdir->dir_dma, GFP_KERNEL);
+   if (!pdir->dir)
+   goto err;
+
+   pdir->ntables = PVRDMA_PAGE_DIR_TABLE(npages - 1) + 1;
+   pdir->tables = kcalloc(pdir->ntables, sizeof(*pdir->tables),
+  GFP_KERNEL);
+   if (!pdir->tables)
+   goto err;
+
+   for (i = 0; i < pdir->ntables; i++) {
+   pdir->tables[i] = dma_alloc_coherent(&dev->pdev->dev, PAGE_SIZE,
+&pdir->dir[i], GFP_KERNEL);
+   if (!pdir->tables[i])
+   goto err;
+   }
+
+   pdir->npages = npages;
+
+   if (alloc_pages) {
+   pdir->pages = kcalloc(npages, sizeof(*pdir->pages),
+ GFP_KERNEL);
+   if (!pdir->pages)
+   goto err;
+
+   for (i = 0; i < pdir->npages; i++) {
+   dma_addr_t page_dma;
+
+   pdir->pages[i] = dma_alloc_coherent(&dev->pdev->dev,
+   PAGE_SIZE,
+   &page_dma,
+   GFP_KERNEL);
+   if (!pdir->pages[i])
+   

[PATCH v4 01/16] vmxnet3: Move PCI Id to pci_ids.h

2016-09-11 Thread Adit Ranadive
The VMXNet3 PCI Id will be shared with our paravirtual RDMA driver.
Moved it to the shared location in pci_ids.h.

Suggested-by: Leon Romanovsky 
Signed-off-by: Adit Ranadive 
---
---
 drivers/net/vmxnet3/vmxnet3_int.h | 3 +--
 include/linux/pci_ids.h   | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h
index 74fc030..2bd6bf8 100644
--- a/drivers/net/vmxnet3/vmxnet3_int.h
+++ b/drivers/net/vmxnet3/vmxnet3_int.h
@@ -119,9 +119,8 @@ enum {
 };
 
 /*
- * PCI vendor and device IDs.
+ * Maximum devices supported.
  */
-#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07B0
 #define MAX_ETHERNET_CARDS 10
 #define MAX_PCI_PASSTHRU_DEVICE 6
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index c58752f..98bb455 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2251,6 +2251,7 @@
 #define PCI_DEVICE_ID_RASTEL_2PORT 0x2000
 
 #define PCI_VENDOR_ID_VMWARE   0x15ad
+#define PCI_DEVICE_ID_VMWARE_VMXNET3   0x07b0
 
 #define PCI_VENDOR_ID_ZOLTRIX  0x15b0
 #define PCI_DEVICE_ID_ZOLTRIX_2BD0 0x2bd0
-- 
2.7.4



[PATCH v4 00/16] Add Paravirtual RDMA Driver

2016-09-11 Thread Adit Ranadive
Hi Doug, others,

This patch series adds a driver for a paravirtual RDMA device. The device
is developed for VMware's Virtual Machines and allows existing RDMA
applications to continue to use existing Verbs API when deployed in VMs on
ESXi. We recently did a presentation in the OFA Workshop [1] regarding this
device.

Description and RDMA Support
============================
The virtual device is exposed as a dual function PCIe device. One part is
a virtual network device (VMXNet3) which provides networking properties
like MAC, IP addresses to the RDMA part of the device. The networking
properties are used to register GIDs required by RDMA applications to
communicate.

These patches add support and all the required infrastructure for letting
applications use such a device. We support the mandatory Verbs API as well
as the base memory management extensions (Local Inv, Send with Inv and Fast
Register Work Requests). We currently support both Reliable Connected and
Unreliable Datagram QPs but do not support Shared Receive Queues (SRQs).
Also, we support the following types of Work Requests:
 o Send/Receive (with or without Immediate Data)
 o RDMA Write (with or without Immediate Data)
 o RDMA Read
 o Local Invalidate
 o Send with Invalidate
 o Fast Register Work Requests

This version only adds support for version 1 of RoCE. We will add RoCEv2
support in a future patch. We do support registration of both MAC-based and
IP-based GIDs. I have also created a git tree for our user-level driver [2].
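
For readers unfamiliar with MAC-based GIDs, here is a minimal sketch of one common derivation: the link-local prefix fe80::/64 followed by the modified EUI-64 form of the MAC address. This is an assumption about the general scheme, not a copy of what this driver does, and mac_to_link_local_gid() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill gid[16] with fe80::/64 plus the modified EUI-64 of a 6-byte MAC. */
static void mac_to_link_local_gid(uint8_t gid[16], const uint8_t mac[6])
{
	assert(gid != NULL && mac != NULL);
	memset(gid, 0, 16);
	gid[0] = 0xfe;
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 0x02;	/* flip the universal/local bit */
	gid[9] = mac[1];
	gid[10] = mac[2];
	gid[11] = 0xff;		/* ff:fe is inserted in the middle */
	gid[12] = 0xfe;
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}
```

IP-based GIDs are not covered by this sketch.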

Testing
===
We have tested this internally for various types of Guest OS - Red Hat,
CentOS, Ubuntu 12.04/14.04/16.04, Oracle Enterprise Linux, SLES 12
using backported versions of this driver. The tests included several runs
of the performance tests (included with OFED), Intel MPI PingPong benchmark
on OpenMPI, krping for FRWRs. Mellanox has been kind enough to test the
backported version of the driver internally on their hardware using a
VMware provided ESX build. I have also applied and tested this with Doug's
k.o/for-4.9 branch (commit 64278fe). Note that this patch series should be
applied all together. I split out the commits so that it may be easier to
review.

PVRDMA Resources
================
[1] OFA Workshop Presentation - https://goo.gl/pHOXJ8
[2] Libpvrdma User-level library - http://git.openfabrics.org/?p=~aditr/libpvrdma.git;a=summary
---
Changes v3->v4:
 - Rebased on for-4.9 branch - commit 64278fe89b729
   ("Merge branch 'hns-roce' into k.o/for-4.9")
 - PATCH [01/16]
 - New in v4 - Moved vmxnet3 id to pci_ids.h
 - PATCH [02,03/16]
 - pvrdma_sge was moved into pvrdma_uapi.h
 - PATCH [04/16]
 - Removed explicit enum values.
 - PATCH [05/16]
 - Renamed priviledged -> privileged.
 - Added error numbers for command errors.
 - Removed unnecessary goto in modify_device.
 - Moved pd allocation to after command execution.
 - Removed an incorrect atomic_dec.
 - PATCH [06/16]
 - Renamed priviledged -> privileged.
 - Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we hold a lock
 to call it.
 - Added wrapper functions for writing to UARs for CQ/QP.
 - The conversion functions are updated as func_name(dst, src) format.
 - Renamed max_gs to max_sg.
 - Added work struct for net device events.
 - PATCH [07/16]
 - Updated conversion functions to func_name(dst, src) format.
 - Removed unneeded local variables.
 - PATCH [08/16]
 - Removed the min check and added a BUILD_BUG_ON check for size.
 - PATCH [09/16]
 - Added a pvrdma_destroy_cq in the error path.
 - Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we need a lock to
 be held while calling this.
 - Updated to use wrapper for UAR write for CQ.
 - Ensure that poll_cq does not return error values.
 - PATCH [10/16]
 - Removed an unnecessary comment.
 - PATCH [11/16]
 - Changed access flag check for DMA MR to using bit operation.
 - Removed some local variables.
 - PATCH [12/16]
 - Removed an unnecessary switch case.
 - Unified the returns in pvrdma_create_qp to use one exit point.
 - Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we need a lock to
 be held when calling this.
 - Updated to use wrapper for UAR write for QP.
 - Updated conversion function to func_name(dst, src) format.
 - Renamed max_gs to max_sg.
 - Renamed cap variable to req_cap in pvrdma_set_sq/rq_size.
 - Changed dev_warn to dev_warn_ratelimited in pvrdma_post_send/recv.
 - Added nesting locking for flushing CQs when destroying/resetting a QP.
 - Added missing ret value.
 - PATCH [13/16]
 - Fixed some checkpatch warnings.
 - Added support for new get_dev_fw_str API.
 - Added event workqueue for netdevice events.
 - Restructured the pvrdma_pci_remove function a little bit.
 - PATCH [14/16]
 - Enforced dependency on VMXNet3 module.

Changes v2->v3:
 - I reordered the patches so that the definitions of enums, structures 

Re: [net-next PATCH v2 2/2] e1000: bundle xdp xmit routines

2016-09-11 Thread Alexei Starovoitov
On Sun, Sep 11, 2016 at 08:15:28PM -0700, John Fastabend wrote:
> 
> >>>>>>> But what is the action for XDP_TX if the queue is stopped? There is no
> >>>>>>> qdisc to back pressure in the XDP path. Would we just start dropping
> >>>>>>> packets then?
> >>>>>>
> >>>>>> Yep, that is what the patch does: if there is any sort of error, packets
> >>>>>> get dropped on the floor. I don't think there is anything else that
> >>>>>> can be done.
> >>>>>>
> >>>>> That probably means that the stack will always win out under load.
> >>>>> Trying to use the same queue where half of the packets are well
> >>>>> managed by a qdisc and half aren't is going to leave someone unhappy.
> >>>>> Maybe in this case where we have to share the qdisc we can
> >>>>> allocate the skb on returning XDP_TX and send through the normal
> >>>>> qdisc for the device.
> >>>>
> >>>> I wouldn't go to such extremes for e1k.
> >>>> The only reason to have xdp in e1k is to use it for testing
> >>>> of xdp programs. Nothing else. e1k is, best case, 1Gbps adapter.
> >>>
> >>> I imagine someone may want this for the non-forwarding use cases like
> >>> early drop for DOS mitigation. Regardless of the use case, I don't
> >>> think we can break the fundamental assumptions made for qdiscs or the
> >>> rest of the transmit path. If XDP must transmit on a queue shared with
> >>> the stack we need to abide by the stack's rules for transmitting on
> >>> the queue-- which would mean alloc skbuff and go through qdisc (which
> >>
> >> If we require XDP_TX to go up to qdisc layer its best not to implement
> >> it at all and just handle it in normal ingress path. That said I think
> >> users have to expect that XDP will interfere with qdisc schemes. Even
> >> with its own tx queue its going to interfere at the hardware level with
> >> bandwidth as the hardware round robins through the queues or uses
> >> whatever hardware strategy it is configured to use. Additionally it
> >> will bypass things like BQL, etc.
> >>
> > Right, but not all use cases involve XDP_TX (like DOS mitigation as I
> > pointed out). Since you've already done 95% of the work, can you take
> > a look at creating the skbuff and injecting into the stack for XDP_TX
> > so we can evaluate the performance and impact of that :-)
> > 
> > With separate TX queues it's explicit which queues are managed by the
> > stack. This is no different than what kernel bypass gives us, we are
> > relying on HW to do something reasonable in scheduling MQ.
> > 
> 
> How about instead of dropping packets on xdp errors we make the
> behavior to send the packet to the stack by default. Then the stack can
> decide what to do with it. This is easier from the driver's perspective
> and avoids creating a qdisc inject path for XDP. We could set the mark
> field if the stack wants to handle XDP exceptions somehow differently.
> 
> If we really want XDP to have an inject path I think we should add
> another action XDP_QDISC_INJECT. And add some way for XDP to run
> programs on exceptions. Perhaps via an exception map.

Nack for any new features just for e1k.
I don't like where this discussion is going.
I've been hacking on xdp support for e1k only to be able to debug
xdp programs in kvm instead of messing with physical hosts, where
every bpf program mistake kills the ssh connection.
Please stop this overdesign. I'd rather not have xdp for e1k at all
than go into these crazy new action codes and random punts to the
stack. If there is a conflict between stack and xdp, just
drop the packet. e1k is _not_ an example for any other drivers.
Only when a high-performance NIC has such tx ring sharing issues
will we need to come up with a solution. Currently that's not
the case, so there is no need to come up with anything but
the simplest approach.
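
To make "just drop the packet" concrete, here is a minimal user-space sketch. It is not e1000 code: struct toy_tx_ring, RING_SIZE and toy_xdp_tx() are hypothetical stand-ins for a single tx ring shared with the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4u	/* hypothetical ring depth for the sketch */

struct toy_tx_ring {
	unsigned int used;		/* descriptors currently in flight */
	bool stopped;			/* set when the stack has stopped the queue */
	unsigned long xdp_sent;
	unsigned long xdp_dropped;
};

/* Returns true if the frame was queued, false if it was dropped. */
static bool toy_xdp_tx(struct toy_tx_ring *ring)
{
	assert(ring != NULL);
	if (ring->stopped || ring->used >= RING_SIZE) {
		ring->xdp_dropped++;	/* no retry, no punt to the stack */
		return false;
	}
	ring->used++;
	ring->xdp_sent++;
	return true;
}
```

The only state the XDP side needs in this sketch is a drop counter; there is no new action code and no second path into the stack.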



Re: [net-next PATCH v2 2/2] e1000: bundle xdp xmit routines

2016-09-11 Thread John Fastabend
On 16-09-09 09:13 PM, Tom Herbert wrote:
> On Fri, Sep 9, 2016 at 8:26 PM, John Fastabend wrote:
>> On 16-09-09 08:12 PM, Tom Herbert wrote:
>>> On Fri, Sep 9, 2016 at 6:40 PM, Alexei Starovoitov wrote:
>>>> On Fri, Sep 09, 2016 at 06:19:56PM -0700, Tom Herbert wrote:
>>>>> On Fri, Sep 9, 2016 at 6:12 PM, John Fastabend wrote:
>>>>>> On 16-09-09 06:04 PM, Tom Herbert wrote:
>>>>>>> On Fri, Sep 9, 2016 at 5:01 PM, John Fastabend wrote:
>>>>>>>> On 16-09-09 04:44 PM, Tom Herbert wrote:
>>>>>>>>> On Fri, Sep 9, 2016 at 2:29 PM, John Fastabend wrote:
>>>>>>>>>> e1000 supports a single TX queue so it is being shared with the stack
>>>>>>>>>> when XDP runs XDP_TX action. This requires taking the xmit lock to
>>>>>>>>>> ensure we don't corrupt the tx ring. To avoid taking and dropping the
>>>>>>>>>> lock per packet this patch adds a bundling implementation to submit
>>>>>>>>>> a bundle of packets to the xmit routine.
>>>>>>>>>>
>>>>>>>>>> I tested this patch running e1000 in a VM using KVM over a tap
>>>>>>>>>> device using pktgen to generate traffic along with 'ping -f -l 100'.
>>>>>>>>>>
>>>>>>>>> Hi John,
>>>>>>>>>
>>>>>>>>> How does this interact with BQL on e1000?
>>>>>>>>>
>>>>>>>>> Tom
>>>>>>>>>
>>>>>>>>
>>>>>>>> Let me check if I have the API correct. When we enqueue a packet to
>>>>>>>> be sent we must issue a netdev_sent_queue() call and then on actual
>>>>>>>> transmission issue a netdev_completed_queue().
>>>>>>>>
>>>>>>>> The patch attached here missed a few things though.
>>>>>>>>
>>>>>>>> But it looks like I just need to call netdev_sent_queue() from the
>>>>>>>> e1000_xmit_raw_frame() routine and then let the tx completion logic
>>>>>>>> kick in which will call netdev_completed_queue() correctly.
>>>>>>>>
>>>>>>>> I'll need to add a check for the queue state as well. So if I do these
>>>>>>>> three things,
>>>>>>>>
>>>>>>>> check __QUEUE_STATE_XOFF before sending
>>>>>>>> netdev_sent_queue() -> on XDP_TX
>>>>>>>> netdev_completed_queue()
>>>>>>>>
>>>>>>>> It should work, agree? Now should we do this even when XDP owns the
>>>>>>>> queue? Or is this purely an issue with sharing the queue between
>>>>>>>> XDP and stack.
>>>>>>>>
>>>>>>> But what is the action for XDP_TX if the queue is stopped? There is no
>>>>>>> qdisc to back pressure in the XDP path. Would we just start dropping
>>>>>>> packets then?
>>>>>>
>>>>>> Yep, that is what the patch does: if there is any sort of error, packets
>>>>>> get dropped on the floor. I don't think there is anything else that
>>>>>> can be done.
>>>>>>
>>>>> That probably means that the stack will always win out under load.
>>>>> Trying to use the same queue where half of the packets are well
>>>>> managed by a qdisc and half aren't is going to leave someone unhappy.
>>>>> Maybe in this case where we have to share the qdisc we can
>>>>> allocate the skb on returning XDP_TX and send through the normal
>>>>> qdisc for the device.
>>>>
>>>> I wouldn't go to such extremes for e1k.
>>>> The only reason to have xdp in e1k is to use it for testing
>>>> of xdp programs. Nothing else. e1k is, best case, 1Gbps adapter.
>>>
>>> I imagine someone may want this for the non-forwarding use cases like
>>> early drop for DOS mitigation. Regardless of the use case, I don't
>>> think we can break the fundamental assumptions made for qdiscs or the
>>> rest of the transmit path. If XDP must transmit on a queue shared with
>>> the stack we need to abide by the stack's rules for transmitting on
>>> the queue-- which would mean alloc skbuff and go through qdisc (which
>>
>> If we require XDP_TX to go up to qdisc layer its best not to implement
>> it at all and just handle it in normal ingress path. That said I think
>> users have to expect that XDP will interfere with qdisc schemes. Even
>> with its own tx queue its going to interfere at the hardware level with
>> bandwidth as the hardware round robins through the queues or uses
>> whatever hardware strategy it is configured to use. Additionally it
>> will bypass things like BQL, etc.
>>
> Right, but not all use cases involve XDP_TX (like DOS mitigation as I
> pointed out). Since you've already done 95% of the work, can you take
> a look at creating the skbuff and injecting into the stack for XDP_TX
> so we can evaluate the performance and impact of that :-)
> 
> With separate TX queues it's explicit which queues are managed by the
> stack. This is no different than what kernel bypass gives us, we are
> relying on HW to do something reasonable in scheduling MQ.
> 

How about instead of dropping packets on xdp errors we make the
behavior to send the packet to the stack by default. Then the stack can
decide what to do with it. This is easier from the driver's perspective
and avoids creating a qdisc inject path for XDP. We could set the mark
field if the stack wants to handle XDP exceptions somehow differently.

If we really want XDP to have an inject path I think we should

[GIT] Networking

2016-09-11 Thread David Miller

Mostly small sets of driver fixes scattered all over the place.

1) Mediatek driver fixes from Sean Wang.  Forward port not written
   correctly during TX map, missed handling of EPROBE_DEFER, and
   mistaken use of put_page() instead of skb_free_frag().

2) Fix socket double-free in KCM code, from WANG Cong.

3) QED driver fixes from Sudarsana Reddy Kalluru, including a fix
   for using the dcbx buffers before initializing them.

4) Mellanox Switch driver fixes from Jiri Pirko, including a fix for
   double fib removals and an error handling fix in
   mlxsw_sp_module_init().

5) Fix kernel panic when enabling LLDP in i40e driver, from Dave
   Ertman.

6) Fix padding of TSO packets in thunderx driver, from Sunil Goutham.

7) TCP's rcv_wup not initialized properly when using fastopen, from
   Neal Cardwell.

8) Don't use uninitialized flow keys in flow dissector, from Gao
   Feng.

9) Use after free in l2tp module unload, from Sabrina Dubroca.

10) Fix interrupt registry ordering issues in smsc911x driver, from
    Jeremy Linton.

11) Fix crashes in bonding having to do with enslaving and rx_handler,
    from Mahesh Bandewar.

12) AF_UNIX deadlock fixes from Linus.

13) In mlx5 driver, don't read skb->xmit_mode after it might have been
    freed from the TX reclaim path.  From Tariq Toukan.

14) Fix a bug from 2015 in TCP Yeah where the congestion window does
    not increase, from Artem Germanov.

15) Don't pad frames on receive in NFP driver, from Jakub Kicinski.

16) Fix chunk fragmenting in SCTP wrt. GSO, from Marcelo Ricardo
    Leitner.

17) Fix deletion of VRF routes, from Mark Tomlinson.

18) Fix device refcount leak when DAD fails in ipv6, from Wei Yongjun.

Please pull, thanks a lot!

The following changes since commit e4e98c460ad38c78498622a164fd5ef09a2dc9cb:

  Merge tag 'hwmon-for-linus-v4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging (2016-08-29 19:12:35 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 

for you to fetch changes up to 373df3131aa83bd3e0ea7cd15be92d942d75fc72:

  Merge branch 'mlx4-fixes' (2016-09-11 19:40:26 -0700)


Alexey Kodanev (1):
  net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key

Andy Gospodarek (1):
  MAINTAINERS: update to working email address

Arend Van Spriel (1):
  brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()

Arik Nemtsov (1):
  mac80211: TDLS: don't require beaconing for AP BW

Artem Germanov (1):
  tcp: cwnd does not increase in TCP YeAH

Ashok Raj Nagarajan (1):
  ath10k: fix get rx_status from htt context

Bodong Wang (1):
  net/mlx5e: Move an_disable_cap bit to a new position

Cathy Luo (1):
  mwifiex: fix large amsdu packets causing firmware hang

Chris Brandt (1):
  net: ethernet: renesas: sh_eth: add POST registers for rz

Dave Ertman (1):
  i40e: Fix kernel panic on enable/disable LLDP

Dave Jones (1):
  ipv6: release dst in ping_v6_sendmsg

David Ahern (1):
  xfrm: Only add l3mdev oif to dst lookups

David S. Miller (15):
  Merge tag 'mac80211-for-davem-2016-08-30' of git://git.kernel.org/.../jberg/mac80211
  Merge git://git.kernel.org/.../pablo/nf
  Merge branch 'mediatek-fixes'
  Merge branch 'qed-fixes'
  Merge branch 'mlxsw-fixes'
  Merge tag 'wireless-drivers-for-davem-2016-08-29' of git://git.kernel.org/.../kvalo/wireless-drivers
  Merge branch 'thunderx-fixes'
  Merge branch 'smsc911x-fixes'
  Merge branch 'vxlan-fixes'
  Merge branch 'master' of git://git.kernel.org/.../klassert/ipsec
  Merge branch 'mlx5-fixes'
  Merge branch 'nfp-fixes'
  Merge branch 'mlxsw-fixes'
  Merge tag 'wireless-drivers-for-davem-2016-09-08' of git://git.kernel.org/.../kvalo/wireless-drivers
  Merge branch 'mlx4-fixes'

Davide Caratti (1):
  bridge: re-introduce 'fix parsing of MLDv2 reports'

Eli Cooper (1):
  ipv6: Don't unset flowi6_proto in ipxip6_tnl_xmit()

Emmanuel Grumbach (2):
  iwlwifi: mvm: consider P2p device type for firmware dump triggers
  iwlwifi: mvm: don't use ret when not initialised

Eric Dumazet (1):
  tcp: fastopen: avoid negative sk_forward_alloc

Felix Fietkau (2):
  ath9k: fix client mode beacon configuration
  ath9k: fix using sta->drv_priv before initializing it

Florian Fainelli (1):
  MAINTAINERS: Update CPMAC email address

Gal Pressman (3):
  net/mlx5e: Prevent casting overflow
  net/mlx5e: Fix global PFC counters replication
  net/mlx5e: Fix parsing of vlan packets when updating lro header

Gao Feng (1):
  rps: flow_dissector: Fix uninitialized flow_keys used in __skb_get_hash possibly

Giedrius Statkevičius (1):
  ath9k: bring back direction setting in ath9k_{start_stop}

Guilherme G. Piccoli (1):
  bnx2x: don't reset chip on cleanup if PCI function is offline

Helmut Buchsba

Re: [net-next PATCH v2 2/2] e1000: bundle xdp xmit routines

2016-09-11 Thread John Fastabend
On 16-09-10 08:36 AM, Tom Herbert wrote:
> On Fri, Sep 9, 2016 at 2:29 PM, John Fastabend wrote:
>> e1000 supports a single TX queue so it is being shared with the stack
>> when XDP runs XDP_TX action. This requires taking the xmit lock to
>> ensure we don't corrupt the tx ring. To avoid taking and dropping the
>> lock per packet this patch adds a bundling implementation to submit
>> a bundle of packets to the xmit routine.
>>
>> I tested this patch running e1000 in a VM using KVM over a tap
>> device using pktgen to generate traffic along with 'ping -f -l 100'.
>>
>> Suggested-by: Jesper Dangaard Brouer 
>> Signed-off-by: John Fastabend 
>> ---

[...]

>> diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> index 91d5c87..b985271 100644
>> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
>> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> @@ -1738,10 +1738,18 @@ static int e1000_setup_rx_resources(struct e1000_adapter *adapter,
>> struct pci_dev *pdev = adapter->pdev;
>> int size, desc_len;
>>
>> +   size = sizeof(struct e1000_rx_buffer_bundle) *
>> +   E1000_XDP_XMIT_BUNDLE_MAX;
>> +   rxdr->xdp_buffer = vzalloc(size);
>> +   if (!rxdr->xdp_buffer)
>> +   return -ENOMEM;
>> +
>> size = sizeof(struct e1000_rx_buffer) * rxdr->count;
>> rxdr->buffer_info = vzalloc(size);
>> -   if (!rxdr->buffer_info)
>> +   if (!rxdr->buffer_info) {
>> +   vfree(rxdr->xdp_buffer);
> 
> This could be deferred until an XDP program is added.

Yep, that would be best to avoid overhead in the normal non-XDP case.
Also, I'll move the xdp prog pointer into the rx ring per Jesper's comment,
which I missed in this rev.

[...]

>> +
>> +static void e1000_xdp_xmit_bundle(struct e1000_rx_buffer_bundle *buffer_info,
>> + struct net_device *netdev,
>> + struct e1000_adapter *adapter)
>> +{
>> +   struct netdev_queue *txq = netdev_get_tx_queue(netdev, 0);
>> +   struct e1000_tx_ring *tx_ring = adapter->tx_ring;
>> +   struct e1000_hw *hw = &adapter->hw;
>> +   int i = 0;
>> +
>> /* e1000 only support a single txq at the moment so the queue is being
>>  * shared with stack. To support this requires locking to ensure the
>>  * stack and XDP are not running at the same time. Devices with
>>  * multiple queues should allocate a separate queue space.
>> +*
>> +* To amortize the locking cost e1000 bundles the xmits and sends as
>> +* many as possible until either running out of descriptors or failing.
> 
> Up to E1000_XDP_XMIT_BUNDLE_MAX  at least...

Yep will fix comment.
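
For readers following along, the amortization idea in that comment can be sketched in user space as below. This is not the driver code: struct toy_txq, toy_lock(), toy_xmit_bundle() and the BUNDLE_MAX value are all hypothetical, and the "lock" only counts acquisitions so the saving is visible.

```c
#include <assert.h>
#include <stddef.h>

#define BUNDLE_MAX 32u	/* stand-in for E1000_XDP_XMIT_BUNDLE_MAX; value assumed */

struct toy_txq {
	unsigned long lock_acquisitions;	/* how often the xmit lock was taken */
	unsigned int free_desc;			/* free tx descriptors */
	unsigned long sent;
};

static void toy_lock(struct toy_txq *q) { q->lock_acquisitions++; }
static void toy_unlock(struct toy_txq *q) { (void)q; }

/* Send up to n frames under a single lock; returns how many were sent. */
static unsigned int toy_xmit_bundle(struct toy_txq *q, unsigned int n)
{
	unsigned int i;

	assert(q != NULL && n <= BUNDLE_MAX);
	toy_lock(q);
	for (i = 0; i < n && q->free_desc > 0; i++) {
		q->free_desc--;
		q->sent++;
	}
	toy_unlock(q);
	return i;	/* frames beyond this count would be dropped */
}
```

With per-packet locking, eight frames would cost eight lock round trips; here they cost one, and the bundle stops early once descriptors run out.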

[...]

>>
>> /* use prefetched values */
>> @@ -4498,8 +4536,11 @@ next_desc:
>> rx_ring->next_to_clean = i;
>>
>> cleaned_count = E1000_DESC_UNUSED(rx_ring);
>> -   if (cleaned_count)
>> +   if (cleaned_count) {
>> +   if (xdp_xmit)
>> +   e1000_xdp_xmit_bundle(xdp_bundle, netdev, adapter);
>> adapter->alloc_rx_buf(adapter, rx_ring, cleaned_count);
>> +   }
> 
> Looks good for XDP path. Is this something we can abstract out into a
> library for use by other drivers?
> 

I'm not really sure it can be abstracted much; it's a bit intertwined with
the normal rx receive path. But it should probably be a pattern that
gets copied so we avoid unnecessary tx work.

> 
>>
>> adapter->total_rx_packets += total_rx_packets;
>> adapter->total_rx_bytes += total_rx_bytes;
>>



Re: Minimum MTU Mess

2016-09-11 Thread YOSHIFUJI Hideaki


Jarod Wilson wrote:
> On Tue, Sep 06, 2016 at 04:55:29PM -0700, David Miller wrote:
>> From: Jarod Wilson 
>> Date: Fri, 2 Sep 2016 13:07:42 -0400
>>
>>> In any case, the number of "mtu < 68" and "#define FOO_MIN_MTU 68", or
>>> variations thereof, under drivers/net/ is kind of crazy.
>>
>> Agreed, we can have a default and let the different cases provide
>> overrides.
>>
>> Mostly what to do here is a function of the hardware though.
> 
> So I've been tinkering with this some, and it looks like having both
> centralized min and max checking could be useful here. I'm hacking away at
> drivers now, but the basis of all this would potentially look about like
> the patch below, and each device would have to set dev->m{in,ax}_mtu one
> way or another. Drivers using alloc_etherdev and/or ether_setup would get
> the "default" values, and then they can be overridden. Probably need
> something to make sure dev->max_mtu isn't set to 0 though...
> 
> Possibly on the right track here, or might there be a better way to
> approach this?
> 
> diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
> index 117d02e..864d6f2 100644
> --- a/include/uapi/linux/if_ether.h
> +++ b/include/uapi/linux/if_ether.h
> @@ -35,6 +35,8 @@
>  #define ETH_FRAME_LEN  1514  /* Max. octets in frame sans FCS */
>  #define ETH_FCS_LEN  4   /* Octets in the FCS */
>  
> +#define ETH_MIN_MTU  68  /* Min IPv4 MTU per RFC791  */
> +
>  /*
>   *   These are the defined Ethernet Protocol ID's.
>   */

Why don't we disable IPv4 if the MTU is lower than this value
as we do for IPv6?
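
For reference, the centralized min/max check being discussed could look roughly like the sketch below. It is a user-space illustration, not the proposed kernel patch: struct toy_netdev, toy_ether_setup() and toy_set_mtu() are hypothetical, 1500 is assumed as the default upper bound, and treating max_mtu == 0 as "no upper bound" is just one possible answer to the max_mtu concern raised in the quoted mail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_ETH_MIN_MTU 68		/* from the quoted patch: min IPv4 MTU */
#define SKETCH_ETH_DATA_LEN 1500	/* assumed default upper bound */

struct toy_netdev {
	unsigned int mtu;
	unsigned int min_mtu;
	unsigned int max_mtu;	/* 0 is treated here as "no upper bound" */
};

/* Defaults a driver would inherit and may override afterwards. */
static void toy_ether_setup(struct toy_netdev *dev)
{
	assert(dev != NULL);
	dev->mtu = SKETCH_ETH_DATA_LEN;
	dev->min_mtu = SKETCH_ETH_MIN_MTU;
	dev->max_mtu = SKETCH_ETH_DATA_LEN;
}

/* Central range check: returns 0 on success or -EINVAL. */
static int toy_set_mtu(struct toy_netdev *dev, int new_mtu)
{
	assert(dev != NULL);
	if (new_mtu < 0 || (unsigned int)new_mtu < dev->min_mtu)
		return -EINVAL;
	if (dev->max_mtu > 0 && (unsigned int)new_mtu > dev->max_mtu)
		return -EINVAL;
	dev->mtu = (unsigned int)new_mtu;
	return 0;
}
```

A driver that supports jumbo frames would raise max_mtu after the setup call; one that forgets to do so would start rejecting MTUs it used to accept.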

-- 
Hideaki Yoshifuji 
Technical Division, MIRACLE LINUX CORPORATION


Re: [PATCH 25/26] pch_gbe: constify local structures

2016-09-11 Thread David Miller

Julia, I went over the networking driver patches in this series and
I have to say that I'd rather see these changes be more durable
and self-checking.

By this I mean that I want you to also make the driver private pointer
that holds these structures be const too.

Then if there are really any assignments to the objects being marked
const, it will show immediately.

Thank you.
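
A minimal sketch of the pattern being asked for, with hypothetical names (toy_ops, toy_priv, toy_query_speed) rather than anything from pch_gbe: once both the table and the private pointer to it are const, any write through that pointer fails at compile time.

```c
#include <assert.h>
#include <stddef.h>

struct toy_ops {
	int (*get_speed)(void);
};

static int toy_speed_1g(void)
{
	return 1000;
}

/* Constified table: the compiler may place it in read-only data. */
static const struct toy_ops toy_ops_1g = {
	.get_speed = toy_speed_1g,
};

struct toy_priv {
	/* const here turns any write through ops into a compile error */
	const struct toy_ops *ops;
};

static int toy_query_speed(const struct toy_priv *priv)
{
	assert(priv != NULL && priv->ops != NULL);
	/* priv->ops->get_speed = NULL; would be rejected by the compiler */
	return priv->ops->get_speed();
}
```

If the private pointer were left non-const, assigning the const table to it would need a cast or draw a warning, and a stray write could go unnoticed.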


Re: Minimum MTU Mess

2016-09-11 Thread Andrew Lunn
> Actually breaking this up into easily digestible/mergeable chunks is going
> to be kind of entertaining... Suggestions welcomed on that. First up is
> obviously the core change, which touches just net/ethernet/eth.c,
> net/core/dev.c, include/linux/netdevice.h and
> include/uapi/linux/if_ether.h, and should let existing code continue to
> Just Work(tm), though devices using ether_setup() that had no MTU range
> checking (or one or the other missing) will wind up with new bounds.

Hi Jarod

Did you find any drivers which support jumbo packets, but don't have
checks? These drivers, if there are any, need handling first, before
this core change is made. Otherwise you introduce regressions.

 Andrew


Re: [PATCH net V2 0/4] mlx4 fixes

2016-09-11 Thread David Miller
From: Tariq Toukan 
Date: Sun, 11 Sep 2016 10:56:16 +0300

> This patchset contains several bug fixes from the team to the
> mlx4 Eth driver.
> 
> Series generated against net commit:
> c2f57fb97da5 "drivers: net: phy: mdio-xgene: Add hardware dependency"
 ...
> v2:
> * excluded some cleanup patches.

Series applied, thanks.


Re: [PATCH net-next] net: dsa: bcm_sf2: Get VLAN_PORT_MASK from b53_device

2016-09-11 Thread David Miller
From: Florian Fainelli 
Date: Sat, 10 Sep 2016 12:39:03 -0700

> While migrating the bcm_sf2 driver to use b53_common, we left a small
> piece untouched where we kept our local copy of the per-port
> port_vlan_ctl bitmask value. This value is now maintained by b53_device
> so we need to use it instead of our local (and now stale) copy of it.
> 
> Fixes: f458995b9ad8 ("net: dsa: bcm_sf2: Utilize core B53 driver when possible")
> Signed-off-by: Florian Fainelli 

Applied.


[PATCH] net: VRF: Pass original iif to ip_route_input()

2016-09-11 Thread Mark Tomlinson
The function ip_rcv_finish() calls l3mdev_ip_rcv(). On any VRF except
the global VRF, this replaces skb->dev with the VRF master interface.
When calling ip_route_input_noref() from here, the checks for forwarding
look at this master device instead of the initial ingress interface.
This will allow packets to be routed which normally would be dropped.
For example, an interface that is not assigned an IP address should
drop packets, but because the checking is against the master device, the
packet will be forwarded.

The fix here is to still call l3mdev_ip_rcv(), but remember the initial
net_device. This is passed to the other functions within ip_rcv_finish,
so they still see the original interface.
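
The idea can be sketched outside the kernel as follows. All types and
functions here are hypothetical, heavily simplified stand-ins (the real
forwarding checks are more involved than a single has_ip_addr flag); the
sketch only shows why remembering the ingress device before the L3 master
handler runs changes the outcome.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for kernel types. */
struct fake_dev {
	const char *name;
	int has_ip_addr;		/* 1 if an address is configured */
	struct fake_dev *master;	/* L3 master (VRF) device, or NULL */
};

struct fake_skb {
	struct fake_dev *dev;
};

/* Mimics the effect described above: skb->dev becomes the master device. */
static void fake_l3mdev_rcv(struct fake_skb *skb)
{
	assert(skb != NULL && skb->dev != NULL);
	if (skb->dev->master)
		skb->dev = skb->dev->master;
}

/* Forwarding check whose result depends on the device it is handed. */
static int fake_may_forward(const struct fake_dev *dev)
{
	return dev->has_ip_addr;
}

/* Returns 1 if the packet would be forwarded, 0 if it would be dropped. */
static int fake_rcv_finish(struct fake_skb *skb, int use_original)
{
	struct fake_dev *orig = skb->dev;	/* remember the ingress device */

	fake_l3mdev_rcv(skb);
	return fake_may_forward(use_original ? orig : skb->dev);
}
```

With an address-less interface enslaved to a master that has an address,
checking skb->dev after the handler forwards the packet, while checking the
remembered device drops it.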

Please note that while this patch fixes my issue, I am not entirely sure
why the skb->dev is changed to the master device, so I am not sure this
is the right fix.

Signed-off-by: Mark Tomlinson 
---
 net/ipv4/ip_input.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 4b351af..d6feabb 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -312,6 +312,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
const struct iphdr *iph = ip_hdr(skb);
struct rtable *rt;
+   struct net_device *dev = skb->dev;
 
/* if ingress device is enslaved to an L3 master device pass the
 * skb to its handler for processing
@@ -341,7 +342,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 */
if (!skb_valid_dst(skb)) {
int err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
-  iph->tos, skb->dev);
+  iph->tos, dev);
if (unlikely(err)) {
if (err == -EXDEV)
__NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);
@@ -370,7 +371,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
__IP_UPD_PO_STATS(net, IPSTATS_MIB_INBCAST, skb->len);
} else if (skb->pkt_type == PACKET_BROADCAST ||
   skb->pkt_type == PACKET_MULTICAST) {
-   struct in_device *in_dev = __in_dev_get_rcu(skb->dev);
+   struct in_device *in_dev = __in_dev_get_rcu(dev);
 
/* RFC 1122 3.3.6:
 *
-- 
2.9.3



linux-next: manual merge of the net-next tree with the net tree

2016-09-11 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/phy/Kconfig

between commit:

  c2f57fb97da5 ("drivers: net: phy: mdio-xgene: Add hardware dependency")

from the net tree and commit:

  d75b4a22b255 ("net: phy: Sort Makefile and Kconfig")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

These "sort the Kconfig/Makefile" patches are much better done just
before or after -rc1 so that the conflicting changes are already merged
and new development can be based on top of them.
-- 
Cheers,
Stephen Rothwell

138c337cf0c2436790877f87e91154b3c3294346
diff --cc drivers/net/phy/Kconfig
index b4863e4e522b,87b566f54cc1..2651c8d8de2f
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@@ -15,88 -15,156 +15,157 @@@ if PHYLI
  config SWPHY
bool
  
- comment "MII PHY device drivers"
- 
- config AQUANTIA_PHY
- tristate "Drivers for the Aquantia PHYs"
- ---help---
-   Currently supports the Aquantia AQ1202, AQ2104, AQR105, AQR405
+ comment "MDIO bus device drivers"
  
- config AT803X_PHY
-   tristate "Drivers for Atheros AT803X PHYs"
-   ---help---
- Currently supports the AT8030 and AT8035 model
+ config MDIO_BCM_IPROC
+   tristate "Broadcom iProc MDIO bus controller"
+   depends on ARCH_BCM_IPROC || COMPILE_TEST
+   depends on HAS_IOMEM && OF_MDIO
+   help
+ This module provides a driver for the MDIO busses found in the
+ Broadcom iProc SoC's.
  
- config AMD_PHY
-   tristate "Drivers for the AMD PHYs"
-   ---help---
- Currently supports the am79c874
+ config MDIO_BCM_UNIMAC
+   tristate "Broadcom UniMAC MDIO bus controller"
+   depends on HAS_IOMEM
+   help
+ This module provides a driver for the Broadcom UniMAC MDIO busses.
+ This hardware can be found in the Broadcom GENET Ethernet MAC
+ controllers as well as some Broadcom Ethernet switches such as the
+ Starfighter 2 switches.
  
- config MARVELL_PHY
-   tristate "Drivers for Marvell PHYs"
-   ---help---
- Currently has a driver for the 88E1011S
-   
- config DAVICOM_PHY
-   tristate "Drivers for Davicom PHYs"
-   ---help---
- Currently supports dm9161e and dm9131
+ config MDIO_BITBANG
+   tristate "Bitbanged MDIO buses"
+   help
+ This module implements the MDIO bus protocol in software,
+ for use by low level drivers that export the ability to
+ drive the relevant pins.
  
- config QSEMI_PHY
-   tristate "Drivers for Quality Semiconductor PHYs"
-   ---help---
- Currently supports the qs6612
+ If in doubt, say N.
  
- config LXT_PHY
-   tristate "Drivers for the Intel LXT PHYs"
-   ---help---
- Currently supports the lxt970, lxt971
+ config MDIO_BUS_MUX
+   tristate
+   depends on OF_MDIO
+   help
+ This module provides a driver framework for MDIO bus
+ multiplexers which connect one of several child MDIO busses
+ to a parent bus.  Switching between child busses is done by
+ device specific drivers.
  
- config CICADA_PHY
-   tristate "Drivers for the Cicada PHYs"
-   ---help---
- Currently supports the cis8204
+ config MDIO_BUS_MUX_BCM_IPROC
+   tristate "Broadcom iProc based MDIO bus multiplexers"
+   depends on OF && OF_MDIO && (ARCH_BCM_IPROC || COMPILE_TEST)
+   select MDIO_BUS_MUX
+   default ARCH_BCM_IPROC
+   help
+ This module provides a driver for MDIO bus multiplexers found in
+ iProc based Broadcom SoCs. This multiplexer connects one of several
+ child MDIO bus to a parent bus. Buses could be internal as well as
+ external and selection logic lies inside the same multiplexer.
  
- config VITESSE_PHY
- tristate "Drivers for the Vitesse PHYs"
- ---help---
-   Currently supports the vsc8244
+ config MDIO_BUS_MUX_GPIO
+   tristate "GPIO controlled MDIO bus multiplexers"
+   depends on OF_GPIO && OF_MDIO
+   select MDIO_BUS_MUX
+   help
+ This module provides a driver for MDIO bus multiplexers that
+ are controlled via GPIO lines.  The multiplexer connects one of
+ several child MDIO busses to a parent bus.  Child bus
+ selection is under the control of GPIO lines.
  
- config TERANETICS_PHY
- tristate "Drivers for the Teranetics PHYs"
- ---help---
-   Currently supports the Teranetics TN2020
+ config MDIO_BUS_MUX_MMIOREG
+   tristate "MMIO device-controlled MDIO bus multiplexers"
+   depends on OF_MDIO && HAS_IOMEM

stmmac/RTL8211F/Meson GXBB: TX throughput problems

2016-09-11 Thread Martin Blumenstingl
Hello,

I have a device with a Meson GXBB SoC with an stmmac IP block.
Gbit ethernet on my device is provided by a Realtek RTL8211F RGMII PHY.
Similar issues were reported in #linux-amlogic by a user with an
Odroid C2 board (= similar hardware).

The symptoms are:
Receiving data is plenty fast (I can max out my internet connection
easily, and with iperf3 I get ~900Mbit/s).
Transmitting data from the device is unfortunately very slow, traffic
sometimes even stalls completely.

I have attached the iperf results and the output of
/sys/kernel/debug/stmmaceth/eth0/descriptors_status.
Below you can find the ifconfig, netstat and stmmac dma_cap info
(*after* I ran all tests).

The "involved parties" are:
- Meson GXBB specific network configuration registers (I have
double-checked them with the reference drivers: everything seems fine
here)
- stmmac: it seems that nobody else has reported this kind of issue
so far, however I'd still like to hear where I should enable some
debugging bits to rule out any stmmac bug
- RTL8211F PHY driver: unfortunately there are no public datasheets
available, so this is hard to debug. I'm guessing that TX delay
could cause similar issues, so this may be the cause as well.


Thanks for any input in advance!
Regards,
Martin


[root@alarm ~]# ifconfig eth0
eth0: flags=4163  mtu 1500
inet 192.168.1.235  netmask 255.255.255.0  broadcast 192.168.1.255
ether e2:aa:53:fc:f5:c5  txqueuelen 1000  (Ethernet)
RX packets 1967602  bytes 2968750265 (2.7 GiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 101875  bytes 8548285 (8.1 MiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
device interrupt 18

[root@alarm ~]# netstat -i
Kernel Interface table
Iface   MTU   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500 1967801      0      0      0  101934      0      0      0 BMRU

[root@alarm ~]# cat /sys/kernel/debug/stmmaceth/eth0/dma_cap
==
DMA HW features
==
10/100 Mbps Y
1000 Mbps Y
Half duple Y
Hash Filter: Y
Multiple MAC address registers: Y
PCS (TBI/SGMII/RTBI PHY interfatces): N
SMA (MDIO) Interface: Y
PMT Remote wake up: Y
PMT Magic Frame: Y
RMON module: Y
IEEE 1588-2002 Time Stamp: N
IEEE 1588-2008 Advanced Time Stamp:N
802.3az - Energy-Efficient Ethernet (EEE) Y
AV features: N
Checksum Offload in TX: Y
IP Checksum Offload (type1) in RX: N
IP Checksum Offload (type2) in RX: Y
RXFIFO > 2048bytes: Y
Number of Additional RX channel: 0
Number of Additional TX channel: 0
Enhanced descriptors: N

UDP test (iperf3):
[root@alarm ~]# iperf3 --client 192.168.1.100 -u
Connecting to host 192.168.1.100, port 5201
[  4] local 192.168.1.235 port 38931 connected to 192.168.1.100 port 5201
[ ID] Interval   Transfer Bandwidth   Total Datagrams
[  4]   0.00-1.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   1.00-2.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   2.00-3.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   3.00-4.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   4.00-5.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   5.00-6.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   6.00-7.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   7.00-8.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   8.00-9.00   sec   128 KBytes  1.05 Mbits/sec  16  
[  4]   9.00-10.00  sec   128 KBytes  1.05 Mbits/sec  16  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval   Transfer Bandwidth   Jitter    Lost/Total Datagrams
[  4]   0.00-10.00  sec  1.25 MBytes  1.05 Mbits/sec  12526562.925 ms  113/159 (71%)
[  4] Sent 159 datagrams

iperf Done.
[root@alarm ~]# iperf3 --client 192.168.1.100 -u -R
Connecting to host 192.168.1.100, port 5201
Reverse mode, remote host 192.168.1.100 is sending
[  4] local 192.168.1.235 port 45128 connected to 192.168.1.100 port 5201
[ ID] Interval   Transfer Bandwidth   Jitter    Lost/Total Datagrams
[  4]   0.00-1.00   sec   136 KBytes  1.11 Mbits/sec  81407591.898 ms  0/17 (0%)
[  4]   1.00-2.00   sec   128 KBytes  1.05 Mbits/sec  28987137.507 ms  0/16 (0%)
[  4]   2.00-3.00   sec   128 KBytes  1.05 Mbits/sec  10321569.793 ms  0/16 (0%)
[  4]   3.00-4.00   sec   128 KBytes  1.05 Mbits/sec  3675244.000 ms  0/16 (0%)
[  4]   4.00-5.00   sec   128 KBytes  1.05 Mbits/sec  1308659.322 ms  0/16 (0%)
[  4]   5.00-6.00   sec   128 KBytes  1.05 Mbits/sec  465979.740 ms  0/16 (0%)  
[  4]   6.00-7.00   sec   128 KBytes  1.05 Mbits/sec  165923.341 ms  0/16 (0%)  
[  4]   7.00-8.00   sec   128 KBytes  1.05 Mbits/sec  59081.019 ms  0/16 (0%)  
[  4]   8.00-9.00   sec   128 KBytes  1.05 Mbits/sec  21037.233 ms  0/16 (0%)  
[  4]   9.00-10.00  sec   128 KBytes  1.05 Mb

[PATCH] drivers: net: phy: xgene: Fix 'remove' function

2016-09-11 Thread Christophe JAILLET
If 'IS_ERR(pdata->clk)' is true, then 'clk_disable_unprepare(pdata->clk)'
will do nothing.

It is likely that 'if (!IS_ERR(pdata->clk))' was expected here.
In fact, the test can even be removed because 'clk_disable_unprepare'
already handles such cases.
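
The inverted test can be reproduced with a small userspace sketch. Every
name below is a hypothetical stand-in: fake_is_err() imitates the kernel's
error-pointer convention in simplified form, and
fake_clk_disable_unprepare() is assumed to ignore NULL and error handles, as
the description above says the real helper does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_MAX_ERRNO 4095

/* Hypothetical clock handle: just counts outstanding enables. */
struct fake_clk {
	int enable_count;
};

/* Simplified analogue of IS_ERR(): the top 4095 addresses encode errors. */
static int fake_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-FAKE_MAX_ERRNO;
}

/* Simplified analogue of ERR_PTR(): encode a negative errno as a pointer. */
static void *fake_err_ptr(long err)
{
	assert(err < 0 && err >= -FAKE_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Assumed behaviour: silently ignore NULL and error handles. */
static void fake_clk_disable_unprepare(struct fake_clk *clk)
{
	if (clk == NULL || fake_is_err(clk))
		return;
	clk->enable_count--;
}

/* Old logic: the helper is only reached for error handles, so it never acts. */
static void fake_remove_old(struct fake_clk *clk)
{
	if (fake_is_err(clk))
		fake_clk_disable_unprepare(clk);
}

/* New logic: call unconditionally and let the helper filter bad handles. */
static void fake_remove_new(struct fake_clk *clk)
{
	fake_clk_disable_unprepare(clk);
}
```

With a valid handle, the old variant leaves the enable count untouched while
the new one drops it; with an error handle, both variants are no-ops.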

Signed-off-by: Christophe JAILLET 
---
 drivers/net/phy/mdio-xgene.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/mdio-xgene.c b/drivers/net/phy/mdio-xgene.c
index 775674808249..92af182951be 100644
--- a/drivers/net/phy/mdio-xgene.c
+++ b/drivers/net/phy/mdio-xgene.c
@@ -424,10 +424,8 @@ static int xgene_mdio_remove(struct platform_device *pdev)
mdiobus_unregister(mdio_bus);
mdiobus_free(mdio_bus);
 
-   if (dev->of_node) {
-   if (IS_ERR(pdata->clk))
-   clk_disable_unprepare(pdata->clk);
-   }
+   if (dev->of_node)
+   clk_disable_unprepare(pdata->clk);
 
return 0;
 }
-- 
2.7.4



[PATCH v3] net: ip, diag -- Add diag interface for raw sockets

2016-09-11 Thread Cyrill Gorcunov
On Sat, Sep 10, 2016 at 04:28:40PM -0600, David Ahern wrote:
> On 9/10/16 4:05 PM, Cyrill Gorcunov wrote:
> > On Sat, Sep 10, 2016 at 10:31:35AM -0600, David Ahern wrote:
> >>
> >> Would you mind adding the destroy capability as well? The udp version
> >> should be close to what is needed for raw sockets. See udp_diag_destroy
> >> and udp_abort.
> > 
> > Should be something like below. I haven't tested it yet, so it is for
> > review only. I will test it on Monday.
> 
> doesn't compile:
> - raw_abort needs to be in a header for ipv6, and
> - inet_sk_diag_fill args have changed due to a recent commit

Thanks for the review, David. I updated the patch against net-next.
---
From: Cyrill Gorcunov 
Subject: [PATCH v3] net: ip, diag -- Add diag interface for raw sockets

In criu we actively use the diag interface to collect the sockets
present in the system when dumping applications. While it works as
expected for unix, tcp, udp[lite], packet and netlink sockets, raw
sockets have no such interface. Thus add one.

v2:
 - add missing sock_put calls in raw_diag_dump_one (by eric.dumazet@)
 - implement @destroy for diag requests (by dsa@)

v3:
 - add export of raw_abort for IPv6 (by dsa@)
 - pass net-admin flag into inet_sk_diag_fill due to
   changes in net-next branch (by dsa@)

CC: David S. Miller 
CC: Eric Dumazet 
CC: David Ahern 
CC: Alexey Kuznetsov 
CC: James Morris 
CC: Hideaki YOSHIFUJI 
CC: Patrick McHardy 
CC: Andrey Vagin 
CC: Stephen Hemminger 
Signed-off-by: Cyrill Gorcunov 
---

 include/net/raw.h   |5 +
 include/net/rawv6.h |7 +
 net/ipv4/Kconfig|8 +
 net/ipv4/Makefile   |1 
 net/ipv4/raw.c  |   21 
 net/ipv4/raw_diag.c |  226 
 net/ipv6/raw.c  |7 +
 7 files changed, 271 insertions(+), 4 deletions(-)

Index: linux-ml.git/include/net/raw.h
===
--- linux-ml.git.orig/include/net/raw.h
+++ linux-ml.git/include/net/raw.h
@@ -23,6 +23,11 @@
 
 extern struct proto raw_prot;
 
+extern struct raw_hashinfo raw_v4_hashinfo;
+struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,
+unsigned short num, __be32 raddr,
+__be32 laddr, int dif);
+
 void raw_icmp_error(struct sk_buff *, int, u32);
 int raw_local_deliver(struct sk_buff *, int);
 
Index: linux-ml.git/include/net/rawv6.h
===
--- linux-ml.git.orig/include/net/rawv6.h
+++ linux-ml.git/include/net/rawv6.h
@@ -3,6 +3,13 @@
 
 #include 
 
+extern struct raw_hashinfo raw_v6_hashinfo;
+struct sock *__raw_v6_lookup(struct net *net, struct sock *sk,
+unsigned short num, const struct in6_addr *loc_addr,
+const struct in6_addr *rmt_addr, int dif);
+
+int raw_abort(struct sock *sk, int err);
+
 void raw6_icmp_error(struct sk_buff *, int nexthdr,
u8 type, u8 code, int inner_offset, __be32);
 bool raw6_local_deliver(struct sk_buff *, int);
Index: linux-ml.git/net/ipv4/Kconfig
===
--- linux-ml.git.orig/net/ipv4/Kconfig
+++ linux-ml.git/net/ipv4/Kconfig
@@ -430,6 +430,14 @@ config INET_UDP_DIAG
  Support for UDP socket monitoring interface used by the ss tool.
  If unsure, say Y.
 
+config INET_RAW_DIAG
+   tristate "RAW: socket monitoring interface"
+   depends on INET_DIAG && (IPV6 || IPV6=n)
+   default n
+   ---help---
+ Support for RAW socket monitoring interface used by the ss tool.
+ If unsure, say Y.
+
 config INET_DIAG_DESTROY
bool "INET: allow privileged process to administratively close sockets"
depends on INET_DIAG
Index: linux-ml.git/net/ipv4/Makefile
===
--- linux-ml.git.orig/net/ipv4/Makefile
+++ linux-ml.git/net/ipv4/Makefile
@@ -40,6 +40,7 @@ obj-$(CONFIG_NETFILTER)   += netfilter.o n
 obj-$(CONFIG_INET_DIAG) += inet_diag.o 
 obj-$(CONFIG_INET_TCP_DIAG) += tcp_diag.o
 obj-$(CONFIG_INET_UDP_DIAG) += udp_diag.o
+obj-$(CONFIG_INET_RAW_DIAG) += raw_diag.o
 obj-$(CONFIG_NET_TCPPROBE) += tcp_probe.o
 obj-$(CONFIG_TCP_CONG_BIC) += tcp_bic.o
 obj-$(CONFIG_TCP_CONG_CDG) += tcp_cdg.o
Index: linux-ml.git/net/ipv4/raw.c
===
--- linux-ml.git.orig/net/ipv4/raw.c
+++ linux-ml.git/net/ipv4/raw.c
@@ -89,9 +89,10 @@ struct raw_frag_vec {
int hlen;
 };
 
-static struct raw_hashinfo raw_v4_hashinfo = {
+struct raw_hashinfo raw_v4_hashinfo = {
.lock = __RW_LOCK_UNLOCKED(raw_v4_hashinfo.lock),
 };
+EXPORT_SYMBOL_GPL(raw_v4_hashinfo);
 
 int raw_hash_sk(struct sock *sk)
 {
@@ -120,7 +121,7 @@ void raw_unhash_sk(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(raw_unhash_sk);
 
-static struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,
+struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,

Re: [PATCH 00/26] constify local structures

2016-09-11 Thread Julia Lawall

On Sun, 11 Sep 2016, Joe Perches wrote:

> On Sun, 2016-09-11 at 15:05 +0200, Julia Lawall wrote:
> > Constify local structures.
>
> Thanks Julia.
>
> A few suggestions & questions:
>
> Perhaps the script should go into scripts/coccinelle/
> so that future cases could be caught by the robot
> and commit message referenced by the patch instances.

OK.

> Can you please compile the files modified using the
> appropriate defconfig/allyesconfig and show the

I currently send patches for this issue only for files that compile using
the x86 allyesconfig.

> movement from data to const by using
>   $ size .new/old
> and include that in the changelogs (maybe next time)?

OK, thanks for the suggestion.

> Is it possible for a rule to trace the instances where
> an address of a struct or struct member is taken by
> locally defined and declared function call where the
> callee does not modify any dereferenced object?
>
> ie:
>
> struct foo {
>   int bar;
>   char *baz;
> };
>
> struct foo qux[] = {
>   { 1, "description 1" },
>   { 2, "description 2" },
>   [ n, "etc" ]...,
> };
>
> void message(struct foo *msg)
> {
>   printk("%d %s\n", msg->bar, msg->baz);
> }
>
> where some code uses
>
>   message(&qux[index]);
>
> So could a coccinelle script change:
>
> struct foo qux[] = { to const struct foo qux[] = {
>
> and
>
> void message(struct foo *msg) to void message(const struct foo *msg)

Yes, this could be possible too.

Thanks for the feedback.

julia


Re: [PATCH net 1/1] net sched actions: fix GETing actions

2016-09-11 Thread Cong Wang
On Sun, Sep 11, 2016 at 9:30 AM, Jamal Hadi Salim  wrote:
>
> What do you want the commit message to say? It shows an example because
> the functionality broke.

I expect it to explain why we need to increase that refcnt for GET
and how we missed it. Also need to find a right commit to blame. ;)

Thanks.


Re: [PATCH 23/26] sh_eth: constify local structures

2016-09-11 Thread Sergei Shtylyov

On 09/11/2016 04:06 PM, Julia Lawall wrote:

For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.


  Really?


2. Address never taken


  Really?


3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 


   NAK, see sh_eth_set_default_cpu_data().

MBR, Sergei



Re: [PATCH 00/26] constify local structures

2016-09-11 Thread Joe Perches
On Sun, 2016-09-11 at 15:05 +0200, Julia Lawall wrote:
> Constify local structures.

Thanks Julia.

A few suggestions & questions:

Perhaps the script should go into scripts/coccinelle/
so that future cases could be caught by the robot
and commit message referenced by the patch instances.

Can you please compile the files modified using the
appropriate defconfig/allyesconfig and show the
movement from data to const by using
$ size .new/old
and include that in the changelogs (maybe next time)?

Is it possible for a rule to trace the instances where
an address of a struct or struct member is taken by
locally defined and declared function call where the
callee does not modify any dereferenced object?

ie:

struct foo {
int bar;
char *baz;
};

struct foo qux[] = {
{ 1, "description 1" },
{ 2, "description 2" },
[ n, "etc" ]...,
};

void message(struct foo *msg)
{
printk("%d %s\n", msg->bar, msg->baz);
}

where some code uses

message(&qux[index]);

So could a coccinelle script change:

struct foo qux[] = { to const struct foo qux[] = {

and

void message(struct foo *msg) to void message(const struct foo *msg)
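
For reference, the end state being asked about would look roughly like the
following standalone sketch. It is an illustration only: snprintf() into a
caller-supplied buffer replaces printk() so that it builds in userspace, and
the extra buf/len parameters are an assumption made for that reason.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct foo {
	int bar;
	const char *baz;
};

/* Read-only table: nothing writes to it, so it can be const. */
static const struct foo qux[] = {
	{ 1, "description 1" },
	{ 2, "description 2" },
};

/* The callee only reads through msg, so the parameter can be const too. */
static int message(char *buf, size_t len, const struct foo *msg)
{
	assert(buf != NULL && msg != NULL);
	return snprintf(buf, len, "%d %s\n", msg->bar, msg->baz);
}
```

Both changes have to land together: once qux is const, passing &qux[index]
to a function taking a non-const pointer discards the qualifier and the
compiler will complain.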



Re: [PATCH 00/26] constify local structures

2016-09-11 Thread Jarkko Sakkinen
On Sun, Sep 11, 2016 at 03:05:42PM +0200, Julia Lawall wrote:
> Constify local structures.
> 
> The semantic patch that makes this change is as follows:
> (http://coccinelle.lip6.fr/)

Just my two cents but:

1. You *can* use a static analysis tool to find bugs or other issues.
2. However, you should manually do the commits and proper commit
   messages to subsystems based on your findings. And I generally think
   that if one contributes code one should also at least smoke test changes
   somehow.

I don't know if I'm alone with my opinion. I just think that one should
also do the analysis part and not blindly create and submit patches.

Anyway, I'll apply the TPM change at some point. As I said, the changes
were for the better. Thanks.

/Jarkko

> // 
> // The first rule ignores some cases that posed problems
> @r disable optional_qualifier@
> identifier s != {peri_clk_data,threshold_attr,tracer_flags,tracer};
> identifier i != {s5k5baf_cis_rect,smtcfb_fix};
> position p;
> @@
> static struct s i@p = { ... };
> 
> @lstruct@
> identifier r.s;
> @@
> struct s { ... };
> 
> @used depends on lstruct@
> identifier r.i;
> @@
> i
> 
> @bad1@
> expression e;
> identifier r.i;
> assignment operator a;
> @@
>  (<+...i...+>) a e
> 
> @bad2@
> identifier r.i;
> @@
>  &(<+...i...+>)
> 
> @bad3@
> identifier r.i;
> declarer d;
> @@
>  d(...,<+...i...+>,...);
> 
> @bad4@
> identifier r.i;
> type T;
> T[] e;
> identifier f;
> position p;
> @@
> 
> f@p(...,
> (
>   (<+...i...+>)
> &
>   e
> )
> ,...)
> 
> @bad4a@
> identifier r.i;
> type T;
> T *e;
> identifier f;
> position p;
> @@
> 
> f@p(...,
> (
>   (<+...i...+>)
> &
>   e
> )
> ,...)
> 
> @ok5@
> expression *e;
> identifier r.i;
> position p;
> @@
> e =@p i
> 
> @bad5@
> expression *e;
> identifier r.i;
> position p != ok5.p;
> @@
> e =@p (<+...i...+>)
> 
> @rr depends on used && !bad1 && !bad2 && !bad3 && !bad4 && !bad4a && !bad5@
> identifier s,r.i;
> position r.p;
> @@
> 
> static
> +const
>  struct s i@p = { ... };
> 
> @depends on used && !bad1 && !bad2 && !bad3 && !bad4 && !bad4a && !bad5
>  disable optional_qualifier@
> identifier rr.s,r.i;
> @@
> 
> static
> +const
>  struct s i;
> // 
> 
> ---
> 
>  drivers/acpi/acpi_apd.c  |8 +++---
>  drivers/char/tpm/tpm-interface.c |   10 
>  drivers/char/tpm/tpm-sysfs.c |2 -
>  drivers/cpufreq/intel_pstate.c   |8 +++---
>  drivers/infiniband/hw/i40iw/i40iw_uk.c   |6 ++---
>  drivers/media/i2c/tvp514x.c  |2 -
>  drivers/media/pci/ddbridge/ddbridge-core.c   |   18 +++
>  drivers/media/pci/ngene/ngene-cards.c|   14 ++--
>  drivers/media/pci/smipcie/smipcie-main.c |8 +++---
>  drivers/misc/sgi-xp/xpc_uv.c |2 -
>  drivers/net/arcnet/com20020-pci.c|   10 
>  drivers/net/can/c_can/c_can_pci.c|4 +--
>  drivers/net/can/sja1000/plx_pci.c|   20 -
>  drivers/net/ethernet/mellanox/mlx4/main.c|4 +--
>  drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c |2 -
>  drivers/net/ethernet/renesas/sh_eth.c|   14 ++--
>  drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c |2 -
>  drivers/net/wireless/ath/dfs_pattern_detector.c  |2 -
>  drivers/net/wireless/intel/iwlegacy/3945.c   |4 +--
>  drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c  |2 -
>  drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c  |2 -
>  drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c  |2 -
>  drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c  |2 -
>  drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c  |2 -
>  drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c  |2 -
>  drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c  |2 -
>  drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c  |2 -
>  drivers/platform/chrome/chromeos_laptop.c|   22 
> +--
>  drivers/platform/x86/intel_scu_ipc.c |6 ++---
>  drivers/platform/x86/intel_telemetry_debugfs.c   |2 -
>  drivers/scsi/esas2r/esas2r_flash.c   |2 -
>  drivers/scsi/hptiop.c|6 ++---
>  drivers/spi/spi-dw-pci.c |4 +--
>  drivers/staging/rtl8192e/rtl8192e/rtl_core.c |2 -
>  drivers/usb/misc/ezusb.c |2 -
>  drivers/video/fbdev/matrox/matroxfb_g450.c   |2 -
>  lib/crc64_ecma.c |2 -
>  sound/pci/ctxfi/ctatc.c  |2 -
>  sound/pci/hda/patch_ca0132.c |   10 
>  sound/pci/riptide/riptide.c  |2 -
>  40 files changed, 110 insertions(+), 110 deletions(-)


Re: [PATCH net 1/1] net sched actions: fix GETing actions

2016-09-11 Thread Jamal Hadi Salim

On 16-09-08 01:12 PM, Cong Wang wrote:


+}
+
 static int
 tca_action_gd(struct net *net, struct nlattr *nla, struct nlmsghdr *n,
  u32 portid, int event)
@@ -883,6 +894,7 @@ tca_action_gd(struct net *net, struct nlattr *nla, struct nlmsghdr *n,
goto err;
}
act->order = i;
+   act->tcfa_refcnt+=1;



Maybe we only need to fixup the refcnt without touching list,
from my quick observation to your bug report (again, you didn't explain).



Yes, you are correct. Clearing the list is unnecessary since it is
temporary. When I get the chance I will fix it up, test and resubmit.

What do you want the commit message to say? It shows an example because
the functionality broke.

cheers,
jamal


[PATCH 2/2] net: ethernet: apm: xgene: use new api ethtool_{get|set}_link_ksettings

2016-09-11 Thread Philippe Reynes
The ethtool API {get|set}_settings is deprecated.
Move this driver to the new API {get|set}_link_ksettings.
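
The main mechanical difference is that the legacy 32-bit supported and
advertising masks become bitmaps. A rough userspace sketch of such a
conversion follows; the function names, the 64-bit bitmap width and the bit
numbers used with it are assumptions for illustration and do not reproduce
the kernel helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Assumed bitmap width for this sketch; the kernel's value differs. */
#define FAKE_LINK_MODE_NBITS	64
#define FAKE_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define FAKE_NLONGS \
	((FAKE_LINK_MODE_NBITS + FAKE_BITS_PER_LONG - 1) / FAKE_BITS_PER_LONG)

/* Clear the whole bitmap, then place the legacy mask in its low 32 bits. */
static void fake_legacy_u32_to_link_mode(unsigned long *dst, uint32_t legacy)
{
	assert(dst != NULL);
	memset(dst, 0, FAKE_NLONGS * sizeof(unsigned long));
	dst[0] = legacy;
}

/* Test a single bit in the bitmap; returns 0 or 1. */
static int fake_test_bit(const unsigned long *map, unsigned int bit)
{
	return (map[bit / FAKE_BITS_PER_LONG] >> (bit % FAKE_BITS_PER_LONG)) & 1UL;
}
```

Every bit set in the legacy mask keeps its position in the bitmap, and
positions above bit 31 start out cleared.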

Signed-off-by: Philippe Reynes 
---
 .../net/ethernet/apm/xgene/xgene_enet_ethtool.c|   61 
 1 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
index e1f44ae..d372d42 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
@@ -54,46 +54,59 @@ static void xgene_get_drvinfo(struct net_device *ndev,
sprintf(info->bus_info, "%s", pdev->name);
 }
 
-static int xgene_get_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
+static int xgene_get_link_ksettings(struct net_device *ndev,
+   struct ethtool_link_ksettings *cmd)
 {
struct xgene_enet_pdata *pdata = netdev_priv(ndev);
struct phy_device *phydev = ndev->phydev;
+   u32 supported;
 
if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) {
if (phydev == NULL)
return -ENODEV;
 
-   return phy_ethtool_gset(phydev, cmd);
+   return phy_ethtool_ksettings_get(phydev, cmd);
} else if (pdata->phy_mode == PHY_INTERFACE_MODE_SGMII) {
if (pdata->mdio_driver) {
if (!phydev)
return -ENODEV;
 
-   return phy_ethtool_gset(phydev, cmd);
+   return phy_ethtool_ksettings_get(phydev, cmd);
}
 
-   cmd->supported = SUPPORTED_1000baseT_Full | SUPPORTED_Autoneg |
-SUPPORTED_MII;
-   cmd->advertising = cmd->supported;
-   ethtool_cmd_speed_set(cmd, SPEED_1000);
-   cmd->duplex = DUPLEX_FULL;
-   cmd->port = PORT_MII;
-   cmd->transceiver = XCVR_INTERNAL;
-   cmd->autoneg = AUTONEG_ENABLE;
+   supported = SUPPORTED_1000baseT_Full | SUPPORTED_Autoneg |
+   SUPPORTED_MII;
+   ethtool_convert_legacy_u32_to_link_mode(
+   cmd->link_modes.supported,
+   supported);
+   ethtool_convert_legacy_u32_to_link_mode(
+   cmd->link_modes.advertising,
+   supported);
+
+   cmd->base.speed = SPEED_1000;
+   cmd->base.duplex = DUPLEX_FULL;
+   cmd->base.port = PORT_MII;
+   cmd->base.autoneg = AUTONEG_ENABLE;
} else {
-   cmd->supported = SUPPORTED_10000baseT_Full | SUPPORTED_FIBRE;
-   cmd->advertising = cmd->supported;
-   ethtool_cmd_speed_set(cmd, SPEED_10000);
-   cmd->duplex = DUPLEX_FULL;
-   cmd->port = PORT_FIBRE;
-   cmd->transceiver = XCVR_INTERNAL;
-   cmd->autoneg = AUTONEG_DISABLE;
+   supported = SUPPORTED_10000baseT_Full | SUPPORTED_FIBRE;
+   ethtool_convert_legacy_u32_to_link_mode(
+   cmd->link_modes.supported,
+   supported);
+   ethtool_convert_legacy_u32_to_link_mode(
+   cmd->link_modes.advertising,
+   supported);
+
+   cmd->base.speed = SPEED_10000;
+   cmd->base.duplex = DUPLEX_FULL;
+   cmd->base.port = PORT_FIBRE;
+   cmd->base.autoneg = AUTONEG_DISABLE;
}
 
return 0;
 }
 
-static int xgene_set_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
+static int xgene_set_link_ksettings(struct net_device *ndev,
+   const struct ethtool_link_ksettings *cmd)
 {
struct xgene_enet_pdata *pdata = netdev_priv(ndev);
struct phy_device *phydev = ndev->phydev;
@@ -102,7 +115,7 @@ static int xgene_set_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
if (!phydev)
return -ENODEV;
 
-   return phy_ethtool_sset(phydev, cmd);
+   return phy_ethtool_ksettings_set(phydev, cmd);
}
 
if (pdata->phy_mode == PHY_INTERFACE_MODE_SGMII) {
@@ -110,7 +123,7 @@ static int xgene_set_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
if (!phydev)
return -ENODEV;
 
-   return phy_ethtool_sset(phydev, cmd);
+   return phy_ethtool_ksettings_set(phydev, cmd);
}
}
 
@@ -152,12 +165,12 @@ static void xgene_get_ethtool_stats(struct net_device *ndev,
 
 static const struct ethtool_ops xgene_ethtool_ops = {
.get_drvinfo = xgene_get_drvinfo,
-   .get_settings = xgene_get_settings,
-   .set_settings = xgene_set_settings,
.get_link = ethtool_op_get_link,
.get_strings = xgene

[PATCH 1/2] net: ethernet: apm: xgene: use phydev from struct net_device

2016-09-11 Thread Philippe Reynes
The private structure contains a pointer to phydev, but struct
net_device already contains such a pointer. So we can remove the
phy_dev pointer from the private structure and update the driver to use
the one contained in struct net_device.

Signed-off-by: Philippe Reynes 
---
 .../net/ethernet/apm/xgene/xgene_enet_ethtool.c|4 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c |   24 ++--
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c   |8 +++---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h   |1 -
 4 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
index 22a7b26..e1f44ae 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
@@ -57,7 +57,7 @@ static void xgene_get_drvinfo(struct net_device *ndev,
 static int xgene_get_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
 {
struct xgene_enet_pdata *pdata = netdev_priv(ndev);
-   struct phy_device *phydev = pdata->phy_dev;
+   struct phy_device *phydev = ndev->phydev;
 
if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) {
if (phydev == NULL)
@@ -96,7 +96,7 @@ static int xgene_get_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
 static int xgene_set_settings(struct net_device *ndev, struct ethtool_cmd *cmd)
 {
struct xgene_enet_pdata *pdata = netdev_priv(ndev);
-   struct phy_device *phydev = pdata->phy_dev;
+   struct phy_device *phydev = ndev->phydev;
 
if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) {
if (!phydev)
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index da413c8..c481f10 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -713,7 +713,7 @@ static void xgene_enet_adjust_link(struct net_device *ndev)
 {
struct xgene_enet_pdata *pdata = netdev_priv(ndev);
const struct xgene_mac_ops *mac_ops = pdata->mac_ops;
-   struct phy_device *phydev = pdata->phy_dev;
+   struct phy_device *phydev = ndev->phydev;
 
if (phydev->link) {
if (pdata->phy_speed != phydev->speed) {
@@ -773,15 +773,13 @@ int xgene_enet_phy_connect(struct net_device *ndev)
netdev_err(ndev, "Could not connect to PHY\n");
return -ENODEV;
}
-
-   pdata->phy_dev = phy_dev;
} else {
 #ifdef CONFIG_ACPI
struct acpi_device *adev = acpi_phy_find_device(dev);
if (adev)
-   pdata->phy_dev =  adev->driver_data;
-
-   phy_dev = pdata->phy_dev;
+   phy_dev = adev->driver_data;
+   else
+   phy_dev = NULL;
 
if (!phy_dev ||
phy_connect_direct(ndev, phy_dev, &xgene_enet_adjust_link,
@@ -849,8 +847,6 @@ static int xgene_mdiobus_register(struct xgene_enet_pdata *pdata,
if (!phy)
return -EIO;
 
-   pdata->phy_dev = phy;
-
return ret;
 }
 
@@ -890,14 +886,18 @@ int xgene_enet_mdio_config(struct xgene_enet_pdata *pdata)
 
 void xgene_enet_phy_disconnect(struct xgene_enet_pdata *pdata)
 {
-   if (pdata->phy_dev)
-   phy_disconnect(pdata->phy_dev);
+   struct net_device *ndev = pdata->ndev;
+
+   if (ndev->phydev)
+   phy_disconnect(ndev->phydev);
 }
 
 void xgene_enet_mdio_remove(struct xgene_enet_pdata *pdata)
 {
-   if (pdata->phy_dev)
-   phy_disconnect(pdata->phy_dev);
+   struct net_device *ndev = pdata->ndev;
+
+   if (ndev->phydev)
+   phy_disconnect(ndev->phydev);
 
mdiobus_unregister(pdata->mdio_bus);
mdiobus_free(pdata->mdio_bus);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index b8b9495..522ba92 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -748,8 +748,8 @@ static int xgene_enet_open(struct net_device *ndev)
if (ret)
return ret;
 
-   if (pdata->phy_dev) {
-   phy_start(pdata->phy_dev);
+   if (ndev->phydev) {
+   phy_start(ndev->phydev);
} else {
schedule_delayed_work(&pdata->link_work, PHY_POLL_LINK_OFF);
netif_carrier_off(ndev);
@@ -772,8 +772,8 @@ static int xgene_enet_close(struct net_device *ndev)
mac_ops->tx_disable(pdata);
mac_ops->rx_disable(pdata);
 
-   if (pdata->phy_dev)
-   phy_stop(pdata->phy_dev);
+   if (ndev->phydev)
+   phy_stop(ndev->phydev);
else
cancel_delayed_work_sync(&pdata->link_work);
 
diff --git a/drivers/net/et

[PATCH 06/26] ath: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/wireless/ath/dfs_pattern_detector.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/dfs_pattern_detector.c b/drivers/net/wireless/ath/dfs_pattern_detector.c
index 2f8136d..4100ffd 100644
--- a/drivers/net/wireless/ath/dfs_pattern_detector.c
+++ b/drivers/net/wireless/ath/dfs_pattern_detector.c
@@ -338,7 +338,7 @@ static bool dpd_set_domain(struct dfs_pattern_detector *dpd,
return true;
 }
 
-static struct dfs_pattern_detector default_dpd = {
+static const struct dfs_pattern_detector default_dpd = {
.exit   = dpd_exit,
.set_dfs_domain = dpd_set_domain,
.add_pulse  = dpd_add_pulse,
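
As a rough illustration of why a structure such as default_dpd can take the qualifier: a static template that is only ever copied into a freshly allocated instance is never written itself. The sketch below is standalone userspace C; struct detector, detector_init() and the field names are hypothetical stand-ins, not the ath driver's code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a driver template/ops structure. */
struct detector {
	int region;
	int (*add_pulse)(struct detector *d, int width);
	int pulses;
};

static int detector_add_pulse(struct detector *d, int width)
{
	if (width <= 0)
		return 0;
	d->pulses++;
	return 1;
}

/* Read-only template: never reassigned, address never taken. */
static const struct detector default_detector = {
	.region    = 0,
	.add_pulse = detector_add_pulse,
};

/* Each instance is a mutable copy; the template itself stays untouched. */
struct detector *detector_init(int region)
{
	struct detector *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	*d = default_detector;	/* struct copy from the const template */
	d->region = region;
	return d;
}
```

Only the copy is modified, so the compiler is free to place the template in read-only data.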



[PATCH 08/26] iwlegacy: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/wireless/intel/iwlegacy/3945.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlegacy/3945.c b/drivers/net/wireless/intel/iwlegacy/3945.c
index 209dc99..4db327a 100644
--- a/drivers/net/wireless/intel/iwlegacy/3945.c
+++ b/drivers/net/wireless/intel/iwlegacy/3945.c
@@ -2671,7 +2671,7 @@ const struct il_ops il3945_ops = {
.send_led_cmd = il3945_send_led_cmd,
 };
 
-static struct il_cfg il3945_bg_cfg = {
+static const struct il_cfg il3945_bg_cfg = {
.name = "3945BG",
.fw_name_pre = IL3945_FW_PRE,
.ucode_api_max = IL3945_UCODE_API_MAX,
@@ -2700,7 +2700,7 @@ static struct il_cfg il3945_bg_cfg = {
},
 };
 
-static struct il_cfg il3945_abg_cfg = {
+static const struct il_cfg il3945_abg_cfg = {
.name = "3945ABG",
.fw_name_pre = IL3945_FW_PRE,
.ucode_api_max = IL3945_UCODE_API_MAX,



[PATCH 07/26] net/mlx4_core: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/ethernet/mellanox/mlx4/main.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 75dd2e3..9a3c359 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -120,7 +120,7 @@ static char mlx4_version[] =
DRV_NAME ": Mellanox ConnectX core driver v"
DRV_VERSION " (" DRV_RELDATE ")\n";
 
-static struct mlx4_profile default_profile = {
+static const struct mlx4_profile default_profile = {
.num_qp = 1 << 18,
.num_srq= 1 << 16,
.rdmarc_per_qp  = 1 << 4,
@@ -130,7 +130,7 @@ static struct mlx4_profile default_profile = {
.num_mtt= 1 << 20, /* It is really num mtt segements */
 };
 
-static struct mlx4_profile low_mem_profile = {
+static const struct mlx4_profile low_mem_profile = {
.num_qp = 1 << 17,
.num_srq= 1 << 6,
.rdmarc_per_qp  = 1 << 4,



[PATCH 11/26] can: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/can/c_can/c_can_pci.c |4 ++--
 drivers/net/can/sja1000/plx_pci.c |   20 ++--
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/can/c_can/c_can_pci.c b/drivers/net/can/c_can/c_can_pci.c
index 7be393c..4bc345d 100644
--- a/drivers/net/can/c_can/c_can_pci.c
+++ b/drivers/net/can/c_can/c_can_pci.c
@@ -251,14 +251,14 @@ static void c_can_pci_remove(struct pci_dev *pdev)
pci_disable_device(pdev);
 }
 
-static struct c_can_pci_data c_can_sta2x11= {
+static const struct c_can_pci_data c_can_sta2x11 = {
.type = BOSCH_C_CAN,
.reg_align = C_CAN_REG_ALIGN_32,
.freq = 5200, /* 52 Mhz */
.bar = 0,
 };
 
-static struct c_can_pci_data c_can_pch = {
+static const struct c_can_pci_data c_can_pch = {
.type = BOSCH_C_CAN,
.reg_align = C_CAN_REG_32,
.freq = 5000, /* 50 MHz */
diff --git a/drivers/net/can/sja1000/plx_pci.c b/drivers/net/can/sja1000/plx_pci.c
index 3eb7430..59bc378 100644
--- a/drivers/net/can/sja1000/plx_pci.c
+++ b/drivers/net/can/sja1000/plx_pci.c
@@ -170,7 +170,7 @@ struct plx_pci_card_info {
void (*reset_func)(struct pci_dev *pdev);
 };
 
-static struct plx_pci_card_info plx_pci_card_info_adlink = {
+static const struct plx_pci_card_info plx_pci_card_info_adlink = {
"Adlink PCI-7841/cPCI-7841", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{1, 0x00, 0x00}, { {2, 0x00, 0x80}, {2, 0x80, 0x80} },
@@ -178,7 +178,7 @@ static struct plx_pci_card_info plx_pci_card_info_adlink = {
/* based on PLX9052 */
 };
 
-static struct plx_pci_card_info plx_pci_card_info_adlink_se = {
+static const struct plx_pci_card_info plx_pci_card_info_adlink_se = {
"Adlink PCI-7841/cPCI-7841 SE", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{0, 0x00, 0x00}, { {2, 0x00, 0x80}, {2, 0x80, 0x80} },
@@ -186,7 +186,7 @@ static struct plx_pci_card_info plx_pci_card_info_adlink_se = {
/* based on PLX9052 */
 };
 
-static struct plx_pci_card_info plx_pci_card_info_esd200 = {
+static const struct plx_pci_card_info plx_pci_card_info_esd200 = {
"esd CAN-PCI/CPCI/PCI104/200", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{0, 0x00, 0x00}, { {2, 0x00, 0x80}, {2, 0x100, 0x80} },
@@ -194,7 +194,7 @@ static struct plx_pci_card_info plx_pci_card_info_esd200 = {
/* based on PLX9030/9050 */
 };
 
-static struct plx_pci_card_info plx_pci_card_info_esd266 = {
+static const struct plx_pci_card_info plx_pci_card_info_esd266 = {
"esd CAN-PCI/PMC/266", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{0, 0x00, 0x00}, { {2, 0x00, 0x80}, {2, 0x100, 0x80} },
@@ -202,7 +202,7 @@ static struct plx_pci_card_info plx_pci_card_info_esd266 = {
/* based on PLX9056 */
 };
 
-static struct plx_pci_card_info plx_pci_card_info_esd2000 = {
+static const struct plx_pci_card_info plx_pci_card_info_esd2000 = {
"esd CAN-PCIe/2000", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{0, 0x00, 0x00}, { {2, 0x00, 0x80}, {2, 0x100, 0x80} },
@@ -210,7 +210,7 @@ static struct plx_pci_card_info plx_pci_card_info_esd2000 = {
/* based on PEX8311 */
 };
 
-static struct plx_pci_card_info plx_pci_card_info_ixxat = {
+static const struct plx_pci_card_info plx_pci_card_info_ixxat = {
"IXXAT PC-I 04/PCI", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{0, 0x00, 0x00}, { {2, 0x00, 0x80}, {2, 0x200, 0x80} },
@@ -218,7 +218,7 @@ static struct plx_pci_card_info plx_pci_card_info_ixxat = {
/* based on PLX9050 */
 };
 
-static struct plx_pci_card_info plx_pci_card_info_marathon_pci = {
+static const struct plx_pci_card_info plx_pci_card_info_marathon_pci = {
"Marathon CAN-bus-PCI", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{0, 0x00, 0x00}, { {2, 0x00, 0x00}, {4, 0x00, 0x00} },
@@ -234,7 +234,7 @@ static struct plx_pci_card_info plx_pci_card_info_marathon_pcie = {
/* based on PEX8311 */
 };
 
-static struct plx_pci_card_info plx_pci_card_info_tews = {
+static const struct plx_pci_card_info plx_pci_card_info_tews = {
"TEWS TECHNOLOGIES TPMC810", 2,
PLX_PCI_CAN_CLOCK, PLX_PCI_OCR, PLX_PCI_CDR,
{0, 0x00, 0x00}, { {2, 0x000, 0x80}, {2, 0x100, 0x80} },
@@ -242,7 +242,7 @@ static struct plx_pci_card_info plx_pci_card_info_tews = {
/* based on PLX9030 */
 };
 
-st

[PATCH 05/26] ARCNET: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/arcnet/com20020-pci.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/arcnet/com20020-pci.c b/drivers/net/arcnet/com20020-pci.c
index 239de38..32b8406 100644
--- a/drivers/net/arcnet/com20020-pci.c
+++ b/drivers/net/arcnet/com20020-pci.c
@@ -264,7 +264,7 @@ static void com20020pci_remove(struct pci_dev *pdev)
}
 }
 
-static struct com20020_pci_card_info card_info_10mbit = {
+static const struct com20020_pci_card_info card_info_10mbit = {
.name = "ARC-PCI",
.devcount = 1,
.chan_map_tbl = {
@@ -277,7 +277,7 @@ static struct com20020_pci_card_info card_info_10mbit = {
.flags = ARC_CAN_10MBIT,
 };
 
-static struct com20020_pci_card_info card_info_5mbit = {
+static const struct com20020_pci_card_info card_info_5mbit = {
.name = "ARC-PCI",
.devcount = 1,
.chan_map_tbl = {
@@ -290,7 +290,7 @@ static struct com20020_pci_card_info card_info_5mbit = {
.flags = ARC_IS_5MBIT,
 };
 
-static struct com20020_pci_card_info card_info_sohard = {
+static const struct com20020_pci_card_info card_info_sohard = {
.name = "PLX-PCI",
.devcount = 1,
/* SOHARD needs PCI base addr 4 */
@@ -304,7 +304,7 @@ static struct com20020_pci_card_info card_info_sohard = {
.flags = ARC_CAN_10MBIT,
 };
 
-static struct com20020_pci_card_info card_info_eae_arc1 = {
+static const struct com20020_pci_card_info card_info_eae_arc1 = {
.name = "EAE PLX-PCI ARC1",
.devcount = 1,
.chan_map_tbl = {
@@ -329,7 +329,7 @@ static struct com20020_pci_card_info card_info_eae_arc1 = {
.flags = ARC_CAN_10MBIT,
 };
 
-static struct com20020_pci_card_info card_info_eae_ma1 = {
+static const struct com20020_pci_card_info card_info_eae_ma1 = {
.name = "EAE PLX-PCI MA1",
.devcount = 2,
.chan_map_tbl = {



[PATCH 21/26] rtlwifi: rtl818x: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c |2 +-
 drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c |2 +-
 drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c |2 +-
 drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c |2 +-
 drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c |2 +-
 drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c |2 +-
 drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c |2 +-
 drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c |2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c
index 47e32cb..e7b11b4 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c
@@ -280,7 +280,7 @@ static struct rtl_mod_params rtl88ee_mod_params = {
.debug = DBG_EMERG,
 };
 
-static struct rtl_hal_cfg rtl88ee_hal_cfg = {
+static const struct rtl_hal_cfg rtl88ee_hal_cfg = {
.bar_id = 2,
.write_readback = true,
.name = "rtl88e_pci",
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c
index 4780bdc..87aa209 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c
@@ -258,7 +258,7 @@ static struct rtl_mod_params rtl92ce_mod_params = {
.debug = DBG_EMERG,
 };
 
-static struct rtl_hal_cfg rtl92ce_hal_cfg = {
+static const struct rtl_hal_cfg rtl92ce_hal_cfg = {
.bar_id = 2,
.write_readback = true,
.name = "rtl92c_pci",
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c
index c6e09a1..0538a4d 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c
@@ -262,7 +262,7 @@ static struct rtl_mod_params rtl92de_mod_params = {
.debug = DBG_EMERG,
 };
 
-static struct rtl_hal_cfg rtl92de_hal_cfg = {
+static const struct rtl_hal_cfg rtl92de_hal_cfg = {
.bar_id = 2,
.write_readback = true,
.name = "rtl8192de",
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c
index c31c6bf..ac299cb 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c
@@ -262,7 +262,7 @@ static struct rtl_mod_params rtl92ee_mod_params = {
.debug = DBG_EMERG,
 };
 
-static struct rtl_hal_cfg rtl92ee_hal_cfg = {
+static const struct rtl_hal_cfg rtl92ee_hal_cfg = {
.bar_id = 2,
.write_readback = true,
.name = "rtl92ee_pci",
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c
index 31baca41..5e8e02d 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c
@@ -306,7 +306,7 @@ static struct rtl_mod_params rtl92se_mod_params = {
 
 /* Because memory R/W bursting will cause system hang/crash
  * for 92se, so we don't read back after every write action */
-static struct rtl_hal_cfg rtl92se_hal_cfg = {
+static const struct rtl_hal_cfg rtl92se_hal_cfg = {
.bar_id = 1,
.write_readback = false,
.name = "rtl92s_pci",
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c
index ff49a8c..89c828a 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c
@@ -276,7 +276,7 @@ static struct rtl_mod_params rtl8723e_mod_params = {
.disable_watchdog = false,
 };
 
-static struct rtl_hal_cfg rtl8723e_hal_cfg = {
+static const struct rtl_hal_cfg rtl8723e_hal_cfg = {
.bar_id = 2,
.write_readback = true,
.name = "rtl8723e_pci",
diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c
index 2101793..20b53f0 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c
@@ -276,7 +276,7 @@ static struct rtl_mod_params rtl8723be_mod_params = {
.ant_sel = 0,
 };
 
-static struct rtl_hal_cfg rtl8723be_hal_cfg = {
+static const struct rtl_hal_cfg rtl8

[PATCH 20/26] stmmac: pci: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
index 56c8a23..5c612c3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c
@@ -141,7 +141,7 @@ static struct stmmac_pci_dmi_data quark_pci_dmi_data[] = {
{}
 };
 
-static struct stmmac_pci_info quark_pci_info = {
+static const struct stmmac_pci_info quark_pci_info = {
.setup = quark_default_data,
.dmi = quark_pci_dmi_data,
 };



[PATCH 00/26] constify local structures

2016-09-11 Thread Julia Lawall
Constify local structures.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
// The first rule ignores some cases that posed problems
@r disable optional_qualifier@
identifier s != {peri_clk_data,threshold_attr,tracer_flags,tracer};
identifier i != {s5k5baf_cis_rect,smtcfb_fix};
position p;
@@
static struct s i@p = { ... };

@lstruct@
identifier r.s;
@@
struct s { ... };

@used depends on lstruct@
identifier r.i;
@@
i

@bad1@
expression e;
identifier r.i;
assignment operator a;
@@
 (<+...i...+>) a e

@bad2@
identifier r.i;
@@
 &(<+...i...+>)

@bad3@
identifier r.i;
declarer d;
@@
 d(...,<+...i...+>,...);

@bad4@
identifier r.i;
type T;
T[] e;
identifier f;
position p;
@@

f@p(...,
(
  (<+...i...+>)
&
  e
)
,...)

@bad4a@
identifier r.i;
type T;
T *e;
identifier f;
position p;
@@

f@p(...,
(
  (<+...i...+>)
&
  e
)
,...)

@ok5@
expression *e;
identifier r.i;
position p;
@@
e =@p i

@bad5@
expression *e;
identifier r.i;
position p != ok5.p;
@@
e =@p (<+...i...+>)

@rr depends on used && !bad1 && !bad2 && !bad3 && !bad4 && !bad4a && !bad5@
identifier s,r.i;
position r.p;
@@

static
+const
 struct s i@p = { ... };

@depends on used && !bad1 && !bad2 && !bad3 && !bad4 && !bad4a && !bad5
 disable optional_qualifier@
identifier rr.s,r.i;
@@

static
+const
 struct s i;
// </smpl>

---

 drivers/acpi/acpi_apd.c  |8 +++---
 drivers/char/tpm/tpm-interface.c |   10 
 drivers/char/tpm/tpm-sysfs.c |2 -
 drivers/cpufreq/intel_pstate.c   |8 +++---
 drivers/infiniband/hw/i40iw/i40iw_uk.c   |6 ++---
 drivers/media/i2c/tvp514x.c  |2 -
 drivers/media/pci/ddbridge/ddbridge-core.c   |   18 +++
 drivers/media/pci/ngene/ngene-cards.c|   14 ++--
 drivers/media/pci/smipcie/smipcie-main.c |8 +++---
 drivers/misc/sgi-xp/xpc_uv.c |2 -
 drivers/net/arcnet/com20020-pci.c|   10 
 drivers/net/can/c_can/c_can_pci.c|4 +--
 drivers/net/can/sja1000/plx_pci.c|   20 -
 drivers/net/ethernet/mellanox/mlx4/main.c|4 +--
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c |2 -
 drivers/net/ethernet/renesas/sh_eth.c|   14 ++--
 drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c |2 -
 drivers/net/wireless/ath/dfs_pattern_detector.c  |2 -
 drivers/net/wireless/intel/iwlegacy/3945.c   |4 +--
 drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c  |2 -
 drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c  |2 -
 drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c  |2 -
 drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c  |2 -
 drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c  |2 -
 drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c  |2 -
 drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c  |2 -
 drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c  |2 -
 drivers/platform/chrome/chromeos_laptop.c|   22 +--
 drivers/platform/x86/intel_scu_ipc.c |6 ++---
 drivers/platform/x86/intel_telemetry_debugfs.c   |2 -
 drivers/scsi/esas2r/esas2r_flash.c   |2 -
 drivers/scsi/hptiop.c|6 ++---
 drivers/spi/spi-dw-pci.c |4 +--
 drivers/staging/rtl8192e/rtl8192e/rtl_core.c |2 -
 drivers/usb/misc/ezusb.c |2 -
 drivers/video/fbdev/matrox/matroxfb_g450.c   |2 -
 lib/crc64_ecma.c |2 -
 sound/pci/ctxfi/ctatc.c  |2 -
 sound/pci/hda/patch_ca0132.c |   10 
 sound/pci/riptide/riptide.c  |2 -
 40 files changed, 110 insertions(+), 110 deletions(-)
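
For readers who would rather not trace the SmPL rules, the standalone C sketch below shows the kind of declarations the rules are meant to tell apart. All names are hypothetical, and the mapping of each case to a rule name (bad1, bad2) is an assumption about intent rather than a statement of exactly what Coccinelle matches.

```c
/* Hypothetical configuration structure defined in the same file. */
struct cfg {
	int speed;
	int flags;
};

/* Candidate: only its fields are read, so 'const' can be added. */
static const struct cfg fixed_cfg = { .speed = 100, .flags = 1 };

/* Not a candidate (cf. bad1): a field is assigned at run time. */
static struct cfg tunable_cfg = { .speed = 10, .flags = 0 };

/* Not a candidate (cf. bad2): its address is taken and handed out. */
static struct cfg shared_cfg = { .speed = 1000, .flags = 0 };

static void set_flag(struct cfg *c)
{
	c->flags |= 2;
}

int effective_speed(int boost)
{
	if (boost)
		tunable_cfg.speed = 20;	/* write: must stay non-const */
	set_flag(&shared_cfg);		/* address taken: must stay non-const */
	return fixed_cfg.speed + tunable_cfg.speed + shared_cfg.flags;
}
```

Adding const to tunable_cfg or shared_cfg would make the compiler reject the write or warn about the discarded qualifier, which is the situation the bad* rules try to exclude up front.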


[PATCH 23/26] sh_eth: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/ethernet/renesas/sh_eth.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 1f8240a..d2ed57f 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -654,7 +654,7 @@ static void sh_eth_set_rate_sh7724(struct net_device *ndev)
 }
 
 /* SH7724 */
-static struct sh_eth_cpu_data sh7724_data = {
+static const struct sh_eth_cpu_data sh7724_data = {
.set_duplex = sh_eth_set_duplex,
.set_rate   = sh_eth_set_rate_sh7724,
 
@@ -692,7 +692,7 @@ static void sh_eth_set_rate_sh7757(struct net_device *ndev)
 }
 
 /* SH7757 */
-static struct sh_eth_cpu_data sh7757_data = {
+static const struct sh_eth_cpu_data sh7757_data = {
.set_duplex = sh_eth_set_duplex,
.set_rate   = sh_eth_set_rate_sh7757,
 
@@ -757,7 +757,7 @@ static void sh_eth_set_rate_giga(struct net_device *ndev)
 }
 
 /* SH7757(GETHERC) */
-static struct sh_eth_cpu_data sh7757_data_giga = {
+static const struct sh_eth_cpu_data sh7757_data_giga = {
.chip_reset = sh_eth_chip_reset_giga,
.set_duplex = sh_eth_set_duplex,
.set_rate   = sh_eth_set_rate_giga,
@@ -788,7 +788,7 @@ static struct sh_eth_cpu_data sh7757_data_giga = {
 };
 
 /* SH7734 */
-static struct sh_eth_cpu_data sh7734_data = {
+static const struct sh_eth_cpu_data sh7734_data = {
.chip_reset = sh_eth_chip_reset,
.set_duplex = sh_eth_set_duplex,
.set_rate   = sh_eth_set_rate_gether,
@@ -817,7 +817,7 @@ static struct sh_eth_cpu_data sh7734_data = {
 };
 
 /* SH7763 */
-static struct sh_eth_cpu_data sh7763_data = {
+static const struct sh_eth_cpu_data sh7763_data = {
.chip_reset = sh_eth_chip_reset,
.set_duplex = sh_eth_set_duplex,
.set_rate   = sh_eth_set_rate_gether,
@@ -844,7 +844,7 @@ static struct sh_eth_cpu_data sh7763_data = {
.irq_flags  = IRQF_SHARED,
 };
 
-static struct sh_eth_cpu_data sh7619_data = {
+static const struct sh_eth_cpu_data sh7619_data = {
.register_type  = SH_ETH_REG_FAST_SH3_SH2,
 
.eesipr_value   = DMAC_M_RFRMER | DMAC_M_ECI | 0x003f,
@@ -855,7 +855,7 @@ static struct sh_eth_cpu_data sh7619_data = {
.hw_swap= 1,
 };
 
-static struct sh_eth_cpu_data sh771x_data = {
+static const struct sh_eth_cpu_data sh771x_data = {
.register_type  = SH_ETH_REG_FAST_SH3_SH2,
 
.eesipr_value   = DMAC_M_RFRMER | DMAC_M_ECI | 0x003f,



[PATCH 25/26] pch_gbe: constify local structures

2016-09-11 Thread Julia Lawall
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches.

Signed-off-by: Julia Lawall 

---
The semantic patch seems too long for a commit log, but is in the cover
letter.

 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 3cd87a4..6f33258 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -2729,7 +2729,7 @@ static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
return ret;
 }
 
-static struct pch_gbe_privdata pch_gbe_minnow_privdata = {
+static const struct pch_gbe_privdata pch_gbe_minnow_privdata = {
.phy_tx_clk_delay = true,
.phy_disable_hibernate = true,
.platform_init = pch_gbe_minnow_platform_init,



RE: [PATCH v6 2/8] thunderbolt: Updating the register definitions

2016-09-11 Thread Levy, Amir (Jer)
On Sun, Sep 11 2016, 03:02 AM, Andreas Noever wrote:
> On Mon, Aug 1, 2016 at 2:23 PM, Amir Levy  wrote:
> > Adding more Thunderbolt(TM) register definitions and some helper
> > macros.
> 
> Thinking about this again I would prefer it if you would put your definitions
> into a separate file under icm/ (even if there is some duplication). The style
> (bitfields vs. genmask) is different between the drivers and for a reader it is
> difficult to find out what is actually supposed to be used by the two drivers
> (ring_desc vs tbt_buf_desc or the ring RING_INT_EN/DISABLE macros in the
> header file vs. ring_interrupt_active in nhi.c).
> 
> This would also completely separate the two drivers.
> 
> Andreas
> 

I'm also in favor of completely separating the drivers, but is that the right
thing to do with the register definitions when the underlying register layout
is exactly the same?

Note that bitfields are not recommended when the format/order of bits matters,
as it does in the ring descriptor.

Amir
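
To make the point about bit order concrete, here is a small userspace sketch. It assumes a hypothetical 32-bit descriptor word with a 12-bit length in bits 0-11 and a 4-bit EOF field in bits 12-15; these positions are illustrative and are not taken from the Thunderbolt register map. Explicit masks and shifts pin the layout down, whereas the placement of C bitfield members within a word is implementation-defined.

```c
#include <stdint.h>

/* Hypothetical field layout of one 32-bit descriptor word. */
#define DESC_LEN_SHIFT	0
#define DESC_LEN_MASK	(0xfffu << DESC_LEN_SHIFT)
#define DESC_EOF_SHIFT	12
#define DESC_EOF_MASK	(0xfu << DESC_EOF_SHIFT)

/* Build a descriptor word; the bit positions are fixed by the macros. */
static inline uint32_t desc_pack(uint32_t len, uint32_t eof)
{
	return ((len << DESC_LEN_SHIFT) & DESC_LEN_MASK) |
	       ((eof << DESC_EOF_SHIFT) & DESC_EOF_MASK);
}

/* Extract the length field from a descriptor word. */
static inline uint32_t desc_len(uint32_t word)
{
	return (word & DESC_LEN_MASK) >> DESC_LEN_SHIFT;
}

/* Extract the EOF field from a descriptor word. */
static inline uint32_t desc_eof(uint32_t word)
{
	return (word & DESC_EOF_MASK) >> DESC_EOF_SHIFT;
}
```

With accessors like these, the same header can serve two drivers without either depending on how a particular compiler orders bitfield members.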


RE: [PATCH net-next 2/5] liquidio CN23XX: sriov enable

2016-09-11 Thread Yuval Mintz
> - dev_dbg(&oct->pci_dev->dev, "%s[%llx] : 0x%llx\n",
> - "CN23XX_WIN_WR_MASK_REG",

> + pr_devel("%s[%llx] : 0x%llx\n",
> +  "CN23XX_WIN_WR_MASK_REG",
It looks like at least half of this patch [and I think it's also true for other
patches in this series] merely changes debug prints.
Why not extract all of those to a single patch?

> +static unsigned int num_vfs[2] = { 0, 0 };
> +module_param_array(num_vfs, uint, NULL, 0444);
> +MODULE_PARM_DESC(num_vfs, "two comma-separated unsigned integers that
> +specify number of VFs for PF0 (left of the comma) and PF1 (right of the
> +comma); for 23xx only");

I believe we're way past the days where it's acceptable to enable IOV
via module parameters; you have sysfs to dynamically enable VFs.

BTW, I glanced at pci-iov-howto.txt and noticed it still lists having a
module parameter as a control node for activating this feature,
but I believe it's been years since this has been considered a valid
method for new drivers [I recall this was forbidden when we added
bnx2x IOV support, and that was more than 3.5 years ago].
Perhaps it would be better to rephrase it so it would be obvious this
is a legacy sort of configuration [and not merely 'less preferable']?

> +static unsigned int num_queues_per_pf[2] = { 0, 0 };
> +module_param_array(num_queues_per_pf, uint, NULL, 0444);
> +MODULE_PARM_DESC(num_queues_per_pf, "two comma-separated unsigned
> +integers that specify number of queues per PF0 (left of the comma) and
> +PF1 (right of the comma); for 23xx only");
> +
> +static unsigned int num_queues_per_vf[2] = { 0, 0 };
> +module_param_array(num_queues_per_vf, uint, NULL, 0444);
> +MODULE_PARM_DESC(num_queues_per_vf, "two comma-separated unsigned
> +integers that specify number of queues per VFs for PF0 (left of the
> +comma) and PF1 (right of the comma); for 23xx only");

I don't believe this is a suitable solution as this is a generic problem -
how to split resources between PFs and their various VFs.
I don't believe there's good infrastructure for it today, though.
[Besides, you're introducing new module parameters...]
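
For context, the sysfs route mentioned above is the per-device sriov_numvfs attribute under /sys/bus/pci/devices/. The userspace sketch below only builds that path and writes a count to it. The helper names and the device address used in the usage note are hypothetical, and the driver-side sriov_configure callback that services the write is not shown.

```c
#include <stdio.h>

/* Build "<root>/bus/pci/devices/<bdf>/sriov_numvfs" into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
int sriov_numvfs_path(char *buf, size_t len, const char *root, const char *bdf)
{
	int n = snprintf(buf, len, "%s/bus/pci/devices/%s/sriov_numvfs",
			 root, bdf);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write the requested VF count to the attribute. This needs root
 * privileges and an SR-IOV capable device whose driver supports it. */
int sriov_set_numvfs(const char *root, const char *bdf, unsigned int num_vfs)
{
	char path[256];
	FILE *f;
	int ok;

	if (sriov_numvfs_path(path, sizeof(path), root, bdf))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", num_vfs) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A call such as sriov_set_numvfs("/sys", "0000:03:00.0", 4) would request four VFs at run time, with no module reload and no per-PF module parameter.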



[PATCH net V2 2/4] net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_state()

2016-09-11 Thread Tariq Toukan
From: Kamal Heib 

mlx4_en_dcbnl_set_state() returns u8, but the return value from
mlx4_en_setup_tc() can be negative in case of failure. Return 1 on
failure and 0 on success instead.

Fixes: af7d51852631 ("net/mlx4_en: Add DCB PFC support through CEE netlink commands")
Signed-off-by: Kamal Heib 
Signed-off-by: Tariq Toukan 
---
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index 97081e5bafd1..316a70714434 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -239,7 +239,10 @@ static u8 mlx4_en_dcbnl_set_state(struct net_device *dev, u8 state)
priv->flags &= ~MLX4_EN_FLAG_DCB_ENABLED;
}
 
-   return mlx4_en_setup_tc(dev, num_tcs);
+   if (mlx4_en_setup_tc(dev, num_tcs))
+   return 1;
+
+   return 0;
 }
 
 /* On success returns a non-zero 802.1p user priority bitmap
-- 
1.8.3.1
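
The truncation this patch guards against can be reproduced in userspace. In the sketch below, setup_tc() is a hypothetical stand-in that fails with a negative errno; it is not the real mlx4_en_setup_tc(), and the limit of 8 is made up for the example.

```c
#include <errno.h>
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical stand-in for a setup call that fails with a negative errno. */
static int setup_tc(int num_tcs)
{
	return num_tcs > 8 ? -EINVAL : 0;
}

/* Before: the negative int is silently converted to u8. */
u8 set_state_old(int num_tcs)
{
	return setup_tc(num_tcs);	/* -EINVAL becomes 256 - EINVAL */
}

/* After: any failure is reported as 1, success as 0. */
u8 set_state_new(int num_tcs)
{
	if (setup_tc(num_tcs))
		return 1;

	return 0;
}
```

The old variant still returns a non-zero value on failure, but the value is an artifact of integer conversion rather than a deliberate status code, which is what the patch tidies up.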



[PATCH net V2 4/4] net/mlx4_en: Fix panic on xmit while port is down

2016-09-11 Thread Tariq Toukan
From: Moshe Shemesh 

When the port is down, the tx drop counter does not need to be updated.
Updating the counter in this case can cause a kernel panic, because
the ring can be NULL while the port is down.

Fixes: 63a664b7e92b ("net/mlx4_en: fix tx_dropped bug")
Signed-off-by: Moshe Shemesh 
Signed-off-by: Tariq Toukan 
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 9df87ca0515a..e2509bba3e7c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -818,7 +818,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
real_size = get_real_size(skb, shinfo, dev, &lso_header_size,
  &inline_ok, &fragptr);
if (unlikely(!real_size))
-   goto tx_drop;
+   goto tx_drop_count;
 
/* Align descriptor to TXBB size */
desc_size = ALIGN(real_size, TXBB_SIZE);
@@ -826,7 +826,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
if (unlikely(nr_txbb > MAX_DESC_TXBBS)) {
if (netif_msg_tx_err(priv))
en_warn(priv, "Oversized header or SG list\n");
-   goto tx_drop;
+   goto tx_drop_count;
}
 
bf_ok = ring->bf_enabled;
@@ -1071,9 +1071,10 @@ tx_drop_unmap:
   PCI_DMA_TODEVICE);
}
 
+tx_drop_count:
+   ring->tx_dropped++;
 tx_drop:
dev_kfree_skb_any(skb);
-   ring->tx_dropped++;
return NETDEV_TX_OK;
 }
 
@@ -1106,7 +1107,7 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_alloc *frame,
goto tx_drop;
 
if (mlx4_en_is_tx_ring_full(ring))
-   goto tx_drop;
+   goto tx_drop_count;
 
/* fetch ring->cons far ahead before needing it to avoid stall */
ring_cons = READ_ONCE(ring->cons);
@@ -1176,7 +1177,8 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_alloc *frame,
 
return NETDEV_TX_OK;
 
-tx_drop:
+tx_drop_count:
ring->tx_dropped++;
+tx_drop:
return NETDEV_TX_BUSY;
 }
-- 
1.8.3.1



[PATCH net V2 3/4] net/mlx4_en: Fixes for DCBX

2016-09-11 Thread Tariq Toukan
This patch adds a capability check before enabling DCBX.
In addition, it re-organizes the relevant data structures,
and fixes a typo in a define.

Fixes: af7d51852631 ("net/mlx4_en: Add DCB PFC support through CEE netlink commands")
Signed-off-by: Tariq Toukan 
---
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c | 31 +-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 21 +++--
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   | 15 +++--
 drivers/net/ethernet/mellanox/mlx4/port.c  |  4 ++--
 4 files changed, 28 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index 316a70714434..b04760a5034b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -94,7 +94,7 @@ static u8 mlx4_en_dcbnl_getcap(struct net_device *dev, int capid, u8 *cap)
*cap = true;
break;
case DCB_CAP_ATTR_DCBX:
-   *cap = priv->cee_params.dcbx_cap;
+   *cap = priv->dcbx_cap;
break;
case DCB_CAP_ATTR_PFC_TCS:
*cap = 1 <<  mlx4_max_tc(priv->mdev->dev);
@@ -111,14 +111,14 @@ static u8 mlx4_en_dcbnl_getpfcstate(struct net_device *netdev)
 {
struct mlx4_en_priv *priv = netdev_priv(netdev);
 
-   return priv->cee_params.dcb_cfg.pfc_state;
+   return priv->cee_config.pfc_state;
 }
 
 static void mlx4_en_dcbnl_setpfcstate(struct net_device *netdev, u8 state)
 {
struct mlx4_en_priv *priv = netdev_priv(netdev);
 
-   priv->cee_params.dcb_cfg.pfc_state = state;
+   priv->cee_config.pfc_state = state;
 }
 
 static void mlx4_en_dcbnl_get_pfc_cfg(struct net_device *netdev, int priority,
@@ -126,7 +126,7 @@ static void mlx4_en_dcbnl_get_pfc_cfg(struct net_device *netdev, int priority,
 {
struct mlx4_en_priv *priv = netdev_priv(netdev);
 
-   *setting = priv->cee_params.dcb_cfg.tc_config[priority].dcb_pfc;
+   *setting = priv->cee_config.dcb_pfc[priority];
 }
 
 static void mlx4_en_dcbnl_set_pfc_cfg(struct net_device *netdev, int priority,
@@ -134,8 +134,8 @@ static void mlx4_en_dcbnl_set_pfc_cfg(struct net_device *netdev, int priority,
 {
struct mlx4_en_priv *priv = netdev_priv(netdev);
 
-   priv->cee_params.dcb_cfg.tc_config[priority].dcb_pfc = setting;
-   priv->cee_params.dcb_cfg.pfc_state = true;
+   priv->cee_config.dcb_pfc[priority] = setting;
+   priv->cee_config.pfc_state = true;
 }
 
 static int mlx4_en_dcbnl_getnumtcs(struct net_device *netdev, int tcid, u8 *num)
@@ -157,12 +157,11 @@ static u8 mlx4_en_dcbnl_set_all(struct net_device *netdev)
 {
struct mlx4_en_priv *priv = netdev_priv(netdev);
struct mlx4_en_dev *mdev = priv->mdev;
-   struct mlx4_en_cee_config *dcb_cfg = &priv->cee_params.dcb_cfg;
 
-   if (!(priv->cee_params.dcbx_cap & DCB_CAP_DCBX_VER_CEE))
+   if (!(priv->dcbx_cap & DCB_CAP_DCBX_VER_CEE))
return 1;
 
-   if (dcb_cfg->pfc_state) {
+   if (priv->cee_config.pfc_state) {
int tc;
 
priv->prof->rx_pause = 0;
@@ -170,7 +169,7 @@ static u8 mlx4_en_dcbnl_set_all(struct net_device *netdev)
for (tc = 0; tc < CEE_DCBX_MAX_PRIO; tc++) {
u8 tc_mask = 1 << tc;
 
-   switch (dcb_cfg->tc_config[tc].dcb_pfc) {
+   switch (priv->cee_config.dcb_pfc[tc]) {
case pfc_disabled:
priv->prof->tx_ppp &= ~tc_mask;
priv->prof->rx_ppp &= ~tc_mask;
@@ -226,7 +225,7 @@ static u8 mlx4_en_dcbnl_set_state(struct net_device *dev, u8 state)
struct mlx4_en_priv *priv = netdev_priv(dev);
int num_tcs = 0;
 
-   if (!(priv->cee_params.dcbx_cap & DCB_CAP_DCBX_VER_CEE))
+   if (!(priv->dcbx_cap & DCB_CAP_DCBX_VER_CEE))
return 1;
 
if (!!(state) == !!(priv->flags & MLX4_EN_FLAG_DCB_ENABLED))
@@ -256,7 +255,7 @@ static int mlx4_en_dcbnl_getapp(struct net_device *netdev, u8 idtype, u16 id)
.selector = idtype,
.protocol = id,
 };
-   if (!(priv->cee_params.dcbx_cap & DCB_CAP_DCBX_VER_CEE))
+   if (!(priv->dcbx_cap & DCB_CAP_DCBX_VER_CEE))
return 0;
 
return dcb_getapp(netdev, &app);
@@ -268,7 +267,7 @@ static int mlx4_en_dcbnl_setapp(struct net_device *netdev, u8 idtype,
struct mlx4_en_priv *priv = netdev_priv(netdev);
struct dcb_app app;
 
-   if (!(priv->cee_params.dcbx_cap & DCB_CAP_DCBX_VER_CEE))
+   if (!(priv->dcbx_cap & DCB_CAP_DCBX_VER_CEE))
return -EINVAL;
 
memset(&app, 0, sizeof(struct dcb_app));
@@ -437,7 +436,7 @@ static u8 mlx4_en_dcbnl_getdcbx(struct net_device *dev)
 {
struct mlx4_en_priv *priv = netdev_pri

[PATCH net V2 0/4] mlx4 fixes

2016-09-11 Thread Tariq Toukan
Hi Dave,

This patchset contains several bug fixes from the team to the
mlx4 Eth driver.

Series generated against net commit:
c2f57fb97da5 "drivers: net: phy: mdio-xgene: Add hardware dependency"

Thanks,
Tariq.

v2:
* excluded some cleanup patches.

Kamal Heib (2):
  net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_all()
  net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_state()

Moshe Shemesh (1):
  net/mlx4_en: Fix panic on xmit while port is down

Tariq Toukan (1):
  net/mlx4_en: Fixes for DCBX

 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c | 57 ++
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 21 --
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 12 +++---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   | 15 ++-
 drivers/net/ethernet/mellanox/mlx4/port.c  |  4 +-
 5 files changed, 50 insertions(+), 59 deletions(-)

-- 
1.8.3.1



[PATCH net V2 1/4] net/mlx4_en: Fix the return value of mlx4_en_dcbnl_set_all()

2016-09-11 Thread Tariq Toukan
From: Kamal Heib 

mlx4_en_dcbnl_set_all() returns u8, so return value can't be negative in
case of failure.

Fixes: af7d51852631 ("net/mlx4_en: Add DCB PFC support through CEE netlink commands")
Signed-off-by: Kamal Heib 
Signed-off-by: Rana Shahout 
Reported-by: Dan Carpenter 
Signed-off-by: Tariq Toukan 
---
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index 99c6bbdff501..97081e5bafd1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -158,10 +158,9 @@ static u8 mlx4_en_dcbnl_set_all(struct net_device *netdev)
struct mlx4_en_priv *priv = netdev_priv(netdev);
struct mlx4_en_dev *mdev = priv->mdev;
struct mlx4_en_cee_config *dcb_cfg = &priv->cee_params.dcb_cfg;
-   int err = 0;
 
if (!(priv->cee_params.dcbx_cap & DCB_CAP_DCBX_VER_CEE))
-   return -EINVAL;
+   return 1;
 
if (dcb_cfg->pfc_state) {
int tc;
@@ -199,15 +198,17 @@ static u8 mlx4_en_dcbnl_set_all(struct net_device *netdev)
en_dbg(DRV, priv, "Set pfc off\n");
}
 
-   err = mlx4_SET_PORT_general(mdev->dev, priv->port,
-   priv->rx_skb_size + ETH_FCS_LEN,
-   priv->prof->tx_pause,
-   priv->prof->tx_ppp,
-   priv->prof->rx_pause,
-   priv->prof->rx_ppp);
-   if (err)
+   if (mlx4_SET_PORT_general(mdev->dev, priv->port,
+ priv->rx_skb_size + ETH_FCS_LEN,
+ priv->prof->tx_pause,
+ priv->prof->tx_ppp,
+ priv->prof->rx_pause,
+ priv->prof->rx_ppp)) {
en_err(priv, "Failed setting pause params\n");
-   return err;
+   return 1;
+   }
+
+   return 0;
 }
 
 static u8 mlx4_en_dcbnl_get_state(struct net_device *dev)
-- 
1.8.3.1



enable/disable temporary IPv6 per prefix

2016-09-11 Thread Oliver Mangold

Hi,

I have a question as a relatively new user to IPv6. I am wondering if it 
is currently possible to enable/disable the usage of temporary addresses 
on a per-prefix basis. My current understanding is that the feature is 
enabled by the 'use_tempaddr' sysctl attribute, which is 
per-interface. What I would like to do is disable temp addresses for ULA 
prefixes. Did I miss something and this can already be done, or is it a 
feature planned for the future, maybe? RFC4941 seems to agree that this 
is a valid use case:



> Additionally, sites might wish to selectively enable or disable the use
> of temporary addresses for some prefixes.  For example, a site might
> wish to disable temporary address generation for "Unique local" [ULA]
> prefixes while still generating temporary addresses for all other global
> prefixes.  Another site might wish to enable temporary address
> generation only for the prefixes 2001::/16 and 2002::/16, while
> disabling it for all other prefixes. To support this behavior,
> implementations SHOULD provide a way to enable and disable generation of
> temporary addresses for specific prefix subranges.  This per-prefix
> setting SHOULD override the global settings on the node with respect to
> the specified prefix subranges.  Note that the pre-prefix setting can be
> applied at any granularity, and not necessarily on a per-subnet basis.



Best regards,

Oliver
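
[Editor's note] The per-prefix policy asked about above can be pictured as a prefix test that overrides the per-interface setting. The following is a minimal userspace sketch of that idea only. It is not a kernel interface and does not answer whether the kernel supports such a setting; is_ula() and tempaddr_policy_sketch() are hypothetical names, and the only facts assumed are that ULA addresses fall in fc00::/7 and that inet_pton() parses IPv6 text.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* ULA addresses are those in fc00::/7, i.e. the top seven bits of the
 * first byte are 1111110. */
static int is_ula(const struct in6_addr *addr)
{
	return (addr->s6_addr[0] & 0xfe) == 0xfc;
}

/* Hypothetical policy: return the temporary-address setting to apply
 * for an address in the given prefix. ULA prefixes are forced to 0
 * (disabled); everything else keeps the per-interface value. Returns
 * -1 if the text is not a valid IPv6 address. */
static int tempaddr_policy_sketch(const char *text, int iface_use_tempaddr)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;

	if (is_ula(&addr))
		return 0;

	return iface_use_tempaddr;
}
```

For example, with a per-interface value of 2, tempaddr_policy_sketch("fd12:3456::1", 2) returns 0, while tempaddr_policy_sketch("2001:db8::1", 2) returns 2.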



Re: [PATCH net 6/9] net/mlx4_core: Use RCU to perform radix tree lookup for SRQ

2016-09-11 Thread Tariq Toukan

Hi Dave,

On 08/09/2016 11:36 PM, David Miller wrote:
> From: Tariq Toukan
> Date: Thu,  8 Sep 2016 11:51:58 +0300
>
>> From: Leon Romanovsky
>>
>> Radix tree lookup can be performed without locking.
>>
>> Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
>> Suggested-by: Sagi Grimberg
>> Signed-off-by: Leon Romanovsky
>> Signed-off-by: Tariq Toukan
>
> Unless this fixes a bug, it isn't appropriate for 'net'.

I see, I will exclude these patches and re-submit the others.

> If it does fix a bug, you have to explain what that bug is and
> how this fixes it.

Thanks,
Tariq