date:20170127

Re: [PATCH net-next 1/2] qed: Add infrastructure for PTP support.

2017-01-27 Thread kbuild test robot

Hi Sudarsana,

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Sudarsana-Kalluru/qed-Add-infrastructure-for-PTP-support/20170127-233853
config: i386-randconfig-c0-01281020 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `qed_ptp_hw_adjfreq':
>> qed_ptp.c:(.text+0x19f3c3): undefined reference to `__divdi3'
   qed_ptp.c:(.text+0x19f45a): undefined reference to `__divdi3'
   qed_ptp.c:(.text+0x19f498): undefined reference to `__divdi3'
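
For context: `__divdi3` is the libgcc helper GCC emits for signed 64-bit division on 32-bit targets such as i386. The kernel does not link against libgcc, so a plain `/` on 64-bit signed operands in `qed_ptp_hw_adjfreq` leaves the symbol unresolved. The usual remedy is one of the helpers from `<linux/math64.h>`, such as `div_s64()` or `div64_s64()`. The sketch below is a userspace stand-in and not the qed code: `sketch_div_s64` and `sketch_scale_ppb` are hypothetical names, and the scaling formula is an assumption chosen only to show the call shape.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's div_s64():
 * signed 64-bit dividend, signed 32-bit divisor. Userspace links
 * libgcc, so plain division is fine here; in the kernel the real
 * helper is what avoids the __divdi3 reference. */
static int64_t sketch_div_s64(int64_t dividend, int32_t divisor)
{
	return dividend / divisor;
}

/* Illustrative only: scale a period by a parts-per-billion value.
 * This formula is an assumption and is not taken from qed_ptp.c. */
static int64_t sketch_scale_ppb(int64_t period_ns, int32_t ppb)
{
	return sketch_div_s64(period_ns * ppb, 1000000000);
}
```

In kernel code the equivalent change would be replacing each `a / b` on `s64` operands with `div_s64(a, b)` when the divisor fits in 32 bits, or `div64_s64(a, b)` otherwise.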

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

Re: [net-next v2] openvswitch: Simplify do_execute_actions().

2017-01-27 Thread Pravin Shelar

On Fri, Jan 27, 2017 at 1:45 PM, Andy Zhou  wrote:
> do_execute_actions() implements a worthwhile optimization: in case
> an output action is the last action in an action list, skb_clone()
> can be avoided by outputting the current skb. However, the
> implementation is more complicated than necessary.  This patch
> simplifies this logic.
>
> Signed-off-by: Andy Zhou 
> ---
> v1->v2:  drop skb NULL check in do_output()
>

Acked-by: Pravin B Shelar 

Thanks.
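
The optimization described in the quoted commit message can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the openvswitch code: `sketch_action`, `sketch_result` and `sketch_execute` are hypothetical names, and a counter stands in for `skb_clone()`.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_action { SKETCH_ACT_OUTPUT, SKETCH_ACT_OTHER };

struct sketch_result {
	int clones;     /* how many times the packet had to be copied */
	int outputs;    /* how many packets were sent */
	bool consumed;  /* true once the original packet was handed off */
};

/* Walk the action list. An output action normally needs a copy of the
 * packet because later actions still use the original. If the output
 * is the last action, the original can be sent directly. */
static struct sketch_result
sketch_execute(const enum sketch_action *acts, size_t n)
{
	struct sketch_result r = { 0, 0, false };

	for (size_t i = 0; i < n; i++) {
		bool last = (i + 1 == n);

		if (acts[i] != SKETCH_ACT_OUTPUT)
			continue;

		if (last)
			r.consumed = true; /* send the original, no copy */
		else
			r.clones++;        /* stands in for skb_clone() */
		r.outputs++;
	}
	return r;
}
```

With the list output, other, output, the model performs one clone for the first output and consumes the original packet for the second.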

Re: [PATCH net-next 1/2] qed: Add infrastructure for PTP support.

2017-01-27 Thread kbuild test robot

Hi Sudarsana,

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Sudarsana-Kalluru/qed-Add-infrastructure-for-PTP-support/20170127-233853
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

>> ERROR: "__divdi3" [drivers/net/ethernet/qlogic/qed/qed.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

Re: net: suspicious RCU usage in nf_hook

2017-01-27 Thread Eric Dumazet

On Fri, 2017-01-27 at 17:00 -0800, Cong Wang wrote:
> On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet  wrote:
> > Oh well, I forgot to submit the official patch I think, Jan 9th.
> >
> > https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
> >
> 
> Hmm, but why do only fragments need skb_orphan()? It seems like
> any kfree_skb() inside a nf hook needs to have a preceding
> skb_orphan().
>
> Also, I am not convinced it is similar to commit 8282f27449bf15548
> which is on the RX path.

Well, we clearly see IPv6 reassembly being part of the equation in both
cases.

I was replying to the first part of the splat [1], which was already
diagnosed and had an unofficial patch.

Use-after-free is also a bug, regardless of whether a jump label is
used.

I still do not really understand this nf_hook issue; I thought we were
disabling BH in netfilter.

So the in_interrupt() check in net_disable_timestamp() should trigger;
that was the whole point of netstamp_needed_deferred.

Not sure if we can test for rcu_read_lock() as well.

[1]
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sock_wfree+0xae/0x120 net/core/sock.c:1645
 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put include/net/inet_frag.h:133 [inline]
 nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook include/linux/netfilter.h:212 [inline]
 __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170

[PATCH net-next v2 1/4] net: dsa: Add plumbing for port mirroring

2017-01-27 Thread Florian Fainelli

Add necessary plumbing at the slave network device level to have switch
drivers implement ndo_setup_tc() and most particularly the cls_matchall
classifier. We add support for two switch operations:

port_mirror_add() and port_mirror_del(), which configure, on a per-port
basis, the mirror parameters requested from the cls_matchall classifier.

Code is largely borrowed from the Mellanox Spectrum switch driver.
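
The per-port bookkeeping this plumbing relies on, finding a matchall entry by the cookie that tc hands to the driver, can be sketched in userspace as follows. This is a simplified model, not the patch itself: it uses a hand-rolled singly linked list instead of the kernel's `struct list_head`, and `sketch_mall_entry` and `sketch_mall_find` are hypothetical names.

```c
#include <stddef.h>

/* Simplified stand-in for struct dsa_mall_tc_entry. */
struct sketch_mall_entry {
	struct sketch_mall_entry *next;
	unsigned long cookie;         /* identifies the tc filter */
	unsigned char to_local_port;  /* capture port */
	int ingress;                  /* non-zero for ingress mirroring */
};

/* Return the entry whose cookie matches, or NULL if there is none.
 * The delete path would use this to locate what to tear down. */
static struct sketch_mall_entry *
sketch_mall_find(struct sketch_mall_entry *head, unsigned long cookie)
{
	for (struct sketch_mall_entry *e = head; e; e = e->next)
		if (e->cookie == cookie)
			return e;
	return NULL;
}
```

The cookie is an opaque value chosen by the classifier, so the lookup only compares it for equality.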

Signed-off-by: Florian Fainelli 
---
 include/net/dsa.h  |  36 ++
 net/dsa/dsa_priv.h |   3 ++
 net/dsa/slave.c| 143 -
 3 files changed, 181 insertions(+), 1 deletion(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 92fd795e9573..831a789594f1 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -20,6 +20,8 @@
 #include 
 #include 
 
+struct tc_action;
+
 enum dsa_tag_protocol {
DSA_TAG_PROTO_NONE = 0,
DSA_TAG_PROTO_DSA,
@@ -139,6 +141,31 @@ struct dsa_switch_tree {
const struct dsa_device_ops *tag_ops;
 };
 
+enum dsa_port_mall_action_type {
+   DSA_PORT_MALL_MIRROR,
+};
+
+/*
+ * Mirroring TC entry
+ */
+struct dsa_mall_mirror_tc_entry {
+   u8 to_local_port;
+   bool ingress;
+};
+
+/*
+ * TC matchall entry
+ */
+struct dsa_mall_tc_entry {
+   struct list_head list;
+   unsigned long cookie;
+   enum dsa_port_mall_action_type type;
+   union {
+   struct dsa_mall_mirror_tc_entry mirror;
+   };
+};
+
+
 struct dsa_port {
struct net_device   *netdev;
struct device_node  *dn;
@@ -370,6 +397,15 @@ struct dsa_switch_ops {
int (*port_mdb_dump)(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_mdb *mdb,
 int (*cb)(struct switchdev_obj *obj));
+
+   /*
+* TC integration
+*/
+   int (*port_mirror_add)(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror,
+  bool ingress);
+   void(*port_mirror_del)(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror);
 };
 
 struct dsa_switch_driver {
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 16194a4bb2fe..b10b03028b24 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -46,6 +46,9 @@ struct dsa_slave_priv {
 #ifdef CONFIG_NET_POLL_CONTROLLER
struct netpoll  *netpoll;
 #endif
+
+   /* TC context */
+   struct list_headmall_tc_list;
 };
 
 /* dsa.c */
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b8e58689a9a1..a5f9f1ebca2e 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -16,12 +16,17 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include "dsa_priv.h"
 
+static bool dsa_slave_dev_check(struct net_device *dev);
+
 /* slave mii_bus handling ***/
 static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg)
 {
@@ -994,6 +999,139 @@ static int dsa_slave_get_phys_port_name(struct net_device *dev,
return 0;
 }
 
+static struct dsa_mall_tc_entry *
+dsa_slave_mall_tc_entry_find(struct dsa_slave_priv *p,
+unsigned long cookie)
+{
+   struct dsa_mall_tc_entry *mall_tc_entry;
+
+   list_for_each_entry(mall_tc_entry, &p->mall_tc_list, list)
+   if (mall_tc_entry->cookie == cookie)
+   return mall_tc_entry;
+
+   return NULL;
+}
+
+static int dsa_slave_add_cls_matchall(struct net_device *dev,
+ __be16 protocol,
+ struct tc_cls_matchall_offload *cls,
+ bool ingress)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_mall_tc_entry *mall_tc_entry;
+   struct dsa_switch *ds = p->parent;
+   struct net *net = dev_net(dev);
+   struct dsa_slave_priv *to_p;
+   struct net_device *to_dev;
+   const struct tc_action *a;
+   int err = -EOPNOTSUPP;
+   LIST_HEAD(actions);
+   int ifindex;
+
+   if (!ds->ops->port_mirror_add)
+   return err;
+
+   if (!tc_single_action(cls->exts)) {
+   netdev_err(dev, "only singular actions are supported\n");
+   return err;
+   }
+
+   mall_tc_entry = kzalloc(sizeof(*mall_tc_entry), GFP_KERNEL);
+   if (!mall_tc_entry)
+   return -ENOMEM;
+   mall_tc_entry->cookie = cls->cookie;
+
+   tcf_exts_to_list(cls->exts, &actions);
+   a = list_first_entry(&actions, struct tc_action, list);
+
+   if (is_tcf_mirred_egress_mirror(a) && protocol == htons(ETH_P_ALL)) {
+   struct dsa_mall_mirror_tc_entry *mirror;
+
+   mall_tc_entry->type = DSA_PORT_MALL_MIRROR;
+   mirror = &mall_tc_entry->mirror;
+

[PATCH net-next v2 2/4] net: dsa: b53: Add mirror capture register definitions

2017-01-27 Thread Florian Fainelli

Add definitions for the different Roboswitch registers relevant for
ingress and egress mirroring.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_regs.h | 32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_regs.h b/drivers/net/dsa/b53/b53_regs.h
index dac0af4e2cd0..9fd24c418fa4 100644
--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -206,6 +206,38 @@
 #define   BRCM_HDR_P8_EN   BIT(0) /* Enable tagging on port 8 */
 #define   BRCM_HDR_P5_EN   BIT(1) /* Enable tagging on port 5 */
 
+/* Mirror capture control register (16 bit) */
+#define B53_MIR_CAP_CTL0x10
+#define  CAP_PORT_MASK 0xf
+#define  BLK_NOT_MIR   BIT(14)
+#define  MIRROR_EN BIT(15)
+
+/* Ingress mirror control register (16 bit) */
+#define B53_IG_MIR_CTL 0x12
+#define  MIRROR_MASK   0x1ff
+#define  DIV_ENBIT(13)
+#define  MIRROR_FILTER_MASK0x3
+#define  MIRROR_FILTER_SHIFT   14
+#define  MIRROR_ALL0
+#define  MIRROR_DA 1
+#define  MIRROR_SA 2
+
+/* Ingress mirror divider register (16 bit) */
+#define B53_IG_MIR_DIV 0x14
+#define  IN_MIRROR_DIV_MASK0x3ff
+
+/* Ingress mirror MAC address register (48 bit) */
+#define B53_IG_MIR_MAC 0x16
+
+/* Egress mirror control register (16 bit) */
+#define B53_EG_MIR_CTL 0x1C
+
+/* Egress mirror divider register (16 bit) */
+#define B53_EG_MIR_DIV 0x1E
+
+/* Egress mirror MAC address register (48 bit) */
+#define B53_EG_MIR_MAC 0x20
+
 /* Device ID register (8 or 32 bit) */
 #define B53_DEVICE_ID  0x30
 
-- 
2.9.3

[PATCH net-next v2 0/4] net: dsa: Port mirroring support

2017-01-27 Thread Florian Fainelli

Hi all,

This patch series adds support for port mirroring in the two
Broadcom switch drivers. The major part of the functionality is actually
in the plumbing between tc and the drivers.

David, this will most likely conflict a little bit with my other series:
 net: dsa: bcm_sf2: CFP support, so just let me know if that happens, and
I will provide a rebased version. Thanks!

Changes in v2:

- fixed filter removal logic to disable the ingress or egress mirroring
  when there are no longer ports being monitored in ingress or egress

- removed a stray list_head in dsa_port structure that is not used
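
The v2 removal logic from the first item above can be sketched in userspace as follows. This is a model under stated assumptions, not the driver code: plain struct fields stand in for the switch registers, all `sketch_` names are hypothetical, and the capture-port field is deliberately left untouched.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MIRROR_MASK 0x1ffu      /* per-port bits, as MIRROR_MASK */
#define SKETCH_MIRROR_EN   (1u << 15)  /* global enable, as MIRROR_EN */

struct sketch_regs {
	uint16_t ig_ctl;   /* stands in for B53_IG_MIR_CTL */
	uint16_t eg_ctl;   /* stands in for B53_EG_MIR_CTL */
	uint16_t cap_ctl;  /* stands in for B53_MIR_CAP_CTL */
};

/* Stop mirroring one port in one direction. The global enable bit is
 * cleared only when neither direction has any port left, which is the
 * behaviour the v2 changelog describes. */
static void sketch_mirror_del(struct sketch_regs *r, int port, bool ingress)
{
	uint16_t *loc = ingress ? &r->ig_ctl : &r->eg_ctl;
	uint16_t other = ingress ? r->eg_ctl : r->ig_ctl;

	*loc &= (uint16_t)~(1u << port);

	if (!(*loc & SKETCH_MIRROR_MASK) && !(other & SKETCH_MIRROR_MASK))
		r->cap_ctl &= (uint16_t)~SKETCH_MIRROR_EN;
}
```

Removing the ingress filter while an egress filter is still installed leaves the enable bit set; removing the egress filter afterwards clears it.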

Tested using the two iproute2 examples:

# ingress
  tc qdisc  add dev eth1 handle ffff: ingress
  tc filter add dev eth1 parent ffff:   \
   matchall skip_sw  \
   action mirred egress mirror   \
   dev eth2
# egress
  tc qdisc add dev eth1 handle 1: root prio
  tc filter add dev eth1 parent 1:   \
   matchall skip_sw   \
   action mirred egress mirror\
   dev eth2

Florian Fainelli (4):
  net: dsa: Add plumbing for port mirroring
  net: dsa: b53: Add mirror capture register definitions
  net: dsa: b53: Add support for port mirroring
  net: dsa: bcm_sf2: Add support for port mirroring

 drivers/net/dsa/b53/b53_common.c |  67 ++
 drivers/net/dsa/b53/b53_priv.h   |   4 ++
 drivers/net/dsa/b53/b53_regs.h   |  32 +
 drivers/net/dsa/bcm_sf2.c|   2 +
 include/net/dsa.h|  36 ++
 net/dsa/dsa_priv.h   |   3 +
 net/dsa/slave.c  | 143 ++-
 7 files changed, 286 insertions(+), 1 deletion(-)

-- 
2.9.3

[PATCH net-next v2 3/4] net: dsa: b53: Add support for port mirroring

2017-01-27 Thread Florian Fainelli

Add support for configuring port mirroring through the cls_matchall
classifier. We do a full ingress or egress capture towards the capture
port. Future improvements could include leveraging the divider to allow
fewer frames to be captured, as well as matching specific MAC DA/SA.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_common.c | 67 
 drivers/net/dsa/b53/b53_priv.h   |  4 +++
 2 files changed, 71 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index bb210b12ad1b..052ff4c22667 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1453,6 +1453,71 @@ static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds)
return DSA_TAG_PROTO_NONE;
 }
 
+int b53_mirror_add(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror, bool ingress)
+{
+   struct b53_device *dev = ds->priv;
+   u16 reg, loc;
+
+   if (ingress)
+   loc = B53_IG_MIR_CTL;
+   else
+   loc = B53_EG_MIR_CTL;
+
+   b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+   reg &= ~MIRROR_MASK;
+   reg |= BIT(port);
+   b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+   b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+   reg &= ~CAP_PORT_MASK;
+   reg |= mirror->to_local_port;
+   reg |= MIRROR_EN;
+   b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+
+   return 0;
+}
+EXPORT_SYMBOL(b53_mirror_add);
+
+void b53_mirror_del(struct dsa_switch *ds, int port,
+   struct dsa_mall_mirror_tc_entry *mirror)
+{
+   struct b53_device *dev = ds->priv;
+   bool loc_disable = false, other_loc_disable = false;
+   u16 reg, loc;
+
+   if (mirror->ingress)
+   loc = B53_IG_MIR_CTL;
+   else
+   loc = B53_EG_MIR_CTL;
+
+   /* Update the desired ingress/egress register */
+   b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+   reg &= ~BIT(port);
+   if (!(reg & MIRROR_MASK))
+   loc_disable = true;
+   b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+   /* Now look at the other one to know if we can disable mirroring
+* entirely
+*/
+   if (mirror->ingress)
+   b53_read16(dev, B53_MGMT_PAGE, B53_EG_MIR_CTL, &reg);
+   else
+   b53_read16(dev, B53_MGMT_PAGE, B53_IG_MIR_CTL, &reg);
+   if (!(reg & MIRROR_MASK))
+   other_loc_disable = true;
+
+   b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+   /* Both no longer have ports, let's disable mirroring */
+   if (loc_disable && other_loc_disable) {
+   reg &= ~MIRROR_EN;
+   reg &= ~mirror->to_local_port;
+   }
+   b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+}
+EXPORT_SYMBOL(b53_mirror_del);
+
 static const struct dsa_switch_ops b53_switch_ops = {
.get_tag_protocol   = b53_get_tag_protocol,
.setup  = b53_setup,
@@ -1477,6 +1542,8 @@ static const struct dsa_switch_ops b53_switch_ops = {
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
+   .port_mirror_add= b53_mirror_add,
+   .port_mirror_del= b53_mirror_del,
 };
 
 struct b53_chip_data {
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index a8031b382c55..28ffe255276f 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -408,5 +408,9 @@ int b53_fdb_del(struct dsa_switch *ds, int port,
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_fdb *fdb,
 int (*cb)(struct switchdev_obj *obj));
+int b53_mirror_add(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
+void b53_mirror_del(struct dsa_switch *ds, int port,
+   struct dsa_mall_mirror_tc_entry *mirror);
 
 #endif
-- 
2.9.3

[PATCH net-next v2 4/4] net: dsa: bcm_sf2: Add support for port mirroring

2017-01-27 Thread Florian Fainelli

We can use b53_mirror_add and b53_mirror_del because the Starfighter 2
is register compatible in that specific case.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 8eecfd227e06..3e514d7af218 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1036,6 +1036,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
+   .port_mirror_add= b53_mirror_add,
+   .port_mirror_del= b53_mirror_del,
 };
 
 struct bcm_sf2_of_data {
-- 
2.9.3

Re: net: suspicious RCU usage in nf_hook

2017-01-27 Thread Cong Wang

On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet  wrote:
> Oh well, I forgot to submit the official patch I think, Jan 9th.
>
> https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
>

Hmm, but why do only fragments need skb_orphan()? It seems like
any kfree_skb() inside a nf hook needs to have a preceding
skb_orphan().

Also, I am not convinced it is similar to commit 8282f27449bf15548
which is on the RX path.

Re: [PATCH net-next 0/4] net: dsa: Port mirroring support

2017-01-27 Thread Florian Fainelli

On 01/27/2017 04:40 PM, Florian Fainelli wrote:
> Hi all,
> 
> This patch series adds support for port mirroring in the two
> Broadcom switch drivers. The major part of the functionality is actually
> in the plumbing between tc and the drivers.

Meh, there are two issues that need fixing:

- left a stray list_head in the dsa_port structure which we do not need
- if we remove either the ingress or the egress filter, we end up
  disabling the mirroring entirely, so the b53_mirror_del logic needs
  a bit of rework

Will re-submit shortly.

> 
> David, this will most likely conflict a little bit with my other series:
>  net: dsa: bcm_sf2: CFP support, so just let me know if that happens, and
> I will provide a rebased version. Thanks!
> 
> Tested using the two iproute2 examples:
> 
> # ingress
>   tc qdisc  add dev eth1 handle ffff: ingress
>   tc filter add dev eth1 parent ffff:   \
>matchall skip_sw  \
>action mirred egress mirror   \
>dev eth2
> # egress
>   tc qdisc add dev eth1 handle 1: root prio
>   tc filter add dev eth1 parent 1:   \
>matchall skip_sw   \
>action mirred egress mirror\
>dev eth2
> 
> 
> Florian Fainelli (4):
>   net: dsa: Add plumbing for port mirroring
>   net: dsa: b53: Add mirror capture register definitions
>   net: dsa: b53: Add support for port mirroring
>   net: dsa: bcm_sf2: Add support for port mirroring
> 
>  drivers/net/dsa/b53/b53_common.c |  54 +++
>  drivers/net/dsa/b53/b53_priv.h   |   4 ++
>  drivers/net/dsa/b53/b53_regs.h   |  32 +
>  drivers/net/dsa/bcm_sf2.c|   2 +
>  include/net/dsa.h|  41 +++
>  net/dsa/dsa_priv.h   |   3 +
>  net/dsa/slave.c  | 143 ++-
>  7 files changed, 278 insertions(+), 1 deletion(-)
> 


-- 
Florian

[PATCH net-next 4/4] net: dsa: bcm_sf2: Add support for port mirroring

2017-01-27 Thread Florian Fainelli

We can use b53_mirror_add and b53_mirror_del because the Starfighter 2
is register compatible in that specific case.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 8eecfd227e06..3e514d7af218 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1036,6 +1036,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
+   .port_mirror_add= b53_mirror_add,
+   .port_mirror_del= b53_mirror_del,
 };
 
 struct bcm_sf2_of_data {
-- 
2.9.3

[PATCH net-next 3/4] net: dsa: b53: Add support for port mirroring

2017-01-27 Thread Florian Fainelli

Add support for configuring port mirroring through the cls_matchall
classifier. We do a full ingress or egress capture towards the capture
port. Future improvements could include leveraging the divider to allow
fewer frames to be captured, as well as matching specific MAC DA/SA.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_common.c | 54 
 drivers/net/dsa/b53/b53_priv.h   |  4 +++
 2 files changed, 58 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index bb210b12ad1b..5c9dc4bf7b22 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1453,6 +1453,58 @@ static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds)
return DSA_TAG_PROTO_NONE;
 }
 
+int b53_mirror_add(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror, bool ingress)
+{
+   struct b53_device *dev = ds->priv;
+   u16 reg, loc;
+
+   if (ingress)
+   loc = B53_IG_MIR_CTL;
+   else
+   loc = B53_EG_MIR_CTL;
+
+   b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+   reg &= ~MIRROR_MASK;
+   reg |= BIT(port);
+   b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+   b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+   reg &= ~CAP_PORT_MASK;
+   reg |= mirror->to_local_port;
+   reg |= MIRROR_EN;
+   b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+
+   return 0;
+}
+EXPORT_SYMBOL(b53_mirror_add);
+
+void b53_mirror_del(struct dsa_switch *ds, int port,
+   struct dsa_mall_mirror_tc_entry *mirror)
+{
+   struct b53_device *dev = ds->priv;
+   bool disable_mirror = false;
+   u16 reg, loc;
+
+   if (mirror->ingress)
+   loc = B53_IG_MIR_CTL;
+   else
+   loc = B53_EG_MIR_CTL;
+
+   b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+   reg &= ~BIT(port);
+   if (!(reg & MIRROR_MASK))
+   disable_mirror = true;
+   b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+   b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+   if (disable_mirror)
+   reg &= ~MIRROR_EN;
+   reg &= ~mirror->to_local_port;
+   b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+}
+EXPORT_SYMBOL(b53_mirror_del);
+
 static const struct dsa_switch_ops b53_switch_ops = {
.get_tag_protocol   = b53_get_tag_protocol,
.setup  = b53_setup,
@@ -1477,6 +1529,8 @@ static const struct dsa_switch_ops b53_switch_ops = {
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
+   .port_mirror_add= b53_mirror_add,
+   .port_mirror_del= b53_mirror_del,
 };
 
 struct b53_chip_data {
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index a8031b382c55..28ffe255276f 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -408,5 +408,9 @@ int b53_fdb_del(struct dsa_switch *ds, int port,
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_fdb *fdb,
 int (*cb)(struct switchdev_obj *obj));
+int b53_mirror_add(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
+void b53_mirror_del(struct dsa_switch *ds, int port,
+   struct dsa_mall_mirror_tc_entry *mirror);
 
 #endif
-- 
2.9.3

[PATCH net-next 2/4] net: dsa: b53: Add mirror capture register definitions

2017-01-27 Thread Florian Fainelli

Add definitions for the different Roboswitch registers relevant for
ingress and egress mirroring.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_regs.h | 32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_regs.h b/drivers/net/dsa/b53/b53_regs.h
index dac0af4e2cd0..9fd24c418fa4 100644
--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -206,6 +206,38 @@
 #define   BRCM_HDR_P8_EN   BIT(0) /* Enable tagging on port 8 */
 #define   BRCM_HDR_P5_EN   BIT(1) /* Enable tagging on port 5 */
 
+/* Mirror capture control register (16 bit) */
+#define B53_MIR_CAP_CTL0x10
+#define  CAP_PORT_MASK 0xf
+#define  BLK_NOT_MIR   BIT(14)
+#define  MIRROR_EN BIT(15)
+
+/* Ingress mirror control register (16 bit) */
+#define B53_IG_MIR_CTL 0x12
+#define  MIRROR_MASK   0x1ff
+#define  DIV_ENBIT(13)
+#define  MIRROR_FILTER_MASK0x3
+#define  MIRROR_FILTER_SHIFT   14
+#define  MIRROR_ALL0
+#define  MIRROR_DA 1
+#define  MIRROR_SA 2
+
+/* Ingress mirror divider register (16 bit) */
+#define B53_IG_MIR_DIV 0x14
+#define  IN_MIRROR_DIV_MASK0x3ff
+
+/* Ingress mirror MAC address register (48 bit) */
+#define B53_IG_MIR_MAC 0x16
+
+/* Egress mirror control register (16 bit) */
+#define B53_EG_MIR_CTL 0x1C
+
+/* Egress mirror divider register (16 bit) */
+#define B53_EG_MIR_DIV 0x1E
+
+/* Egress mirror MAC address register (48 bit) */
+#define B53_EG_MIR_MAC 0x20
+
 /* Device ID register (8 or 32 bit) */
 #define B53_DEVICE_ID  0x30
 
-- 
2.9.3

[PATCH net-next 1/4] net: dsa: Add plumbing for port mirroring

2017-01-27 Thread Florian Fainelli

Add necessary plumbing at the slave network device level to have switch
drivers implement ndo_setup_tc() and most particularly the cls_matchall
classifier. We add support for two switch operations:

port_mirror_add() and port_mirror_del(), which configure, on a per-port
basis, the mirror parameters requested from the cls_matchall classifier.

Code is largely borrowed from the Mellanox Spectrum switch driver.

Signed-off-by: Florian Fainelli 
---
 include/net/dsa.h  |  41 +++
 net/dsa/dsa_priv.h |   3 ++
 net/dsa/slave.c| 143 -
 3 files changed, 186 insertions(+), 1 deletion(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 92fd795e9573..7a867c57f463 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -20,6 +20,8 @@
 #include 
 #include 
 
+struct tc_action;
+
 enum dsa_tag_protocol {
DSA_TAG_PROTO_NONE = 0,
DSA_TAG_PROTO_DSA,
@@ -139,11 +141,41 @@ struct dsa_switch_tree {
const struct dsa_device_ops *tag_ops;
 };
 
+enum dsa_port_mall_action_type {
+   DSA_PORT_MALL_MIRROR,
+};
+
+/*
+ * Mirroring TC entry
+ */
+struct dsa_mall_mirror_tc_entry {
+   u8 to_local_port;
+   bool ingress;
+};
+
+/*
+ * TC matchall entry
+ */
+struct dsa_mall_tc_entry {
+   struct list_head list;
+   unsigned long cookie;
+   enum dsa_port_mall_action_type type;
+   union {
+   struct dsa_mall_mirror_tc_entry mirror;
+   };
+};
+
+
 struct dsa_port {
struct net_device   *netdev;
struct device_node  *dn;
unsigned intageing_time;
u8  stp_state;
+
+   /*
+* TC context
+*/
+   struct list_headmall_tc_list;
 };
 
 struct dsa_switch {
@@ -370,6 +402,15 @@ struct dsa_switch_ops {
int (*port_mdb_dump)(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_mdb *mdb,
 int (*cb)(struct switchdev_obj *obj));
+
+   /*
+* TC integration
+*/
+   int (*port_mirror_add)(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror,
+  bool ingress);
+   void(*port_mirror_del)(struct dsa_switch *ds, int port,
+  struct dsa_mall_mirror_tc_entry *mirror);
 };
 
 struct dsa_switch_driver {
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 16194a4bb2fe..b10b03028b24 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -46,6 +46,9 @@ struct dsa_slave_priv {
 #ifdef CONFIG_NET_POLL_CONTROLLER
struct netpoll  *netpoll;
 #endif
+
+   /* TC context */
+   struct list_headmall_tc_list;
 };
 
 /* dsa.c */
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b8e58689a9a1..a5f9f1ebca2e 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -16,12 +16,17 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include "dsa_priv.h"
 
+static bool dsa_slave_dev_check(struct net_device *dev);
+
 /* slave mii_bus handling ***/
 static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg)
 {
@@ -994,6 +999,139 @@ static int dsa_slave_get_phys_port_name(struct net_device *dev,
return 0;
 }
 
+static struct dsa_mall_tc_entry *
+dsa_slave_mall_tc_entry_find(struct dsa_slave_priv *p,
+unsigned long cookie)
+{
+   struct dsa_mall_tc_entry *mall_tc_entry;
+
+   list_for_each_entry(mall_tc_entry, &p->mall_tc_list, list)
+   if (mall_tc_entry->cookie == cookie)
+   return mall_tc_entry;
+
+   return NULL;
+}
+
+static int dsa_slave_add_cls_matchall(struct net_device *dev,
+ __be16 protocol,
+ struct tc_cls_matchall_offload *cls,
+ bool ingress)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_mall_tc_entry *mall_tc_entry;
+   struct dsa_switch *ds = p->parent;
+   struct net *net = dev_net(dev);
+   struct dsa_slave_priv *to_p;
+   struct net_device *to_dev;
+   const struct tc_action *a;
+   int err = -EOPNOTSUPP;
+   LIST_HEAD(actions);
+   int ifindex;
+
+   if (!ds->ops->port_mirror_add)
+   return err;
+
+   if (!tc_single_action(cls->exts)) {
+   netdev_err(dev, "only singular actions are supported\n");
+   return err;
+   }
+
+   mall_tc_entry = kzalloc(sizeof(*mall_tc_entry), GFP_KERNEL);
+   if (!mall_tc_entry)
+   return -ENOMEM;
+   mall_tc_entry->cookie = cls->cookie;
+
+   tcf_exts_to_list(cls->exts, &actions);
+   a = list_first_entry(&actions, struct tc_action, list);
+
+   if

[PATCH net-next 0/4] net: dsa: Port mirroring support

2017-01-27 Thread Florian Fainelli

Hi all,

This patch series adds support for port mirroring in the two
Broadcom switch drivers. The major part of the functionality is actually
in the plumbing between tc and the drivers.

David, this will most likely conflict a little bit with my other series:
 net: dsa: bcm_sf2: CFP support, so just let me know if that happens, and
I will provide a rebased version. Thanks!

Tested using the two iproute2 examples:

# ingress
  tc qdisc  add dev eth1 handle ffff: ingress
  tc filter add dev eth1 parent ffff:   \
   matchall skip_sw  \
   action mirred egress mirror   \
   dev eth2
# egress
  tc qdisc add dev eth1 handle 1: root prio
  tc filter add dev eth1 parent 1:   \
   matchall skip_sw   \
   action mirred egress mirror\
   dev eth2


Florian Fainelli (4):
  net: dsa: Add plumbing for port mirroring
  net: dsa: b53: Add mirror capture register definitions
  net: dsa: b53: Add support for port mirroring
  net: dsa: bcm_sf2: Add support for port mirroring

 drivers/net/dsa/b53/b53_common.c |  54 +++
 drivers/net/dsa/b53/b53_priv.h   |   4 ++
 drivers/net/dsa/b53/b53_regs.h   |  32 +
 drivers/net/dsa/bcm_sf2.c|   2 +
 include/net/dsa.h|  41 +++
 net/dsa/dsa_priv.h   |   3 +
 net/dsa/slave.c  | 143 ++-
 7 files changed, 278 insertions(+), 1 deletion(-)

-- 
2.9.3

Re: [PATCH net-next 1/4] mlx5: Make building eswitch configurable

2017-01-27 Thread Alexei Starovoitov


On 1/27/17 1:15 PM, Saeed Mahameed wrote:

> It is only mandatory for configurations that need eswitch, where the
> driver has no way to know about them. For a good old bare metal box,
> eswitch is not needed.
>
> We can do some work to strip the l2 table logic - needed for PFs to
> work on multi-host - out of eswitch, but again that would further
> complicate the driver code since eswitch will still need to update l2
> tables for VFs.


Saeed,
for multi-host setups, each host in the multi-host bundle doesn't
actually see the eswitch, no? Otherwise a broken driver on one machine
could affect the other hosts in the same bundle. Please double check,
since this is an absolutely critical HW requirement.

[PATCH net-next 2/2] tcp: include locally failed retries in retransmission stats

2017-01-27 Thread Yuchung Cheng

Currently the retransmission stats are not incremented if the
retransmit fails locally. But we always increment the other packet
counters that track total packets/bytes sent. Awkwardly, while we
don't count these failed retransmits in RETRANSSEGS, we do count
them in FAILEDRETRANS.

If the qdisc is dropping many packets, this could substantially
under-estimate the TCP retransmission rate in both SNMP and per-socket
TCP_INFO stats. This patch changes this by always incrementing
retransmission stats on retransmission attempts and failures.

Another motivation is to properly track retransmits in
SCM_TIMESTAMPING_OPT_STATS. Since SCM_TSTAMP_SCHED collection is
triggered in tcp_transmit_skb(), if tp->total_retrans is incremented
after the function, we'll always mis-count by the amount of the
latest retransmission.
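
The accounting order this commit message describes can be sketched in userspace as follows. This is a simplified model, not the TCP stack: `sketch_tcp_stats` and `sketch_retransmit` are hypothetical names, the `xmit_err` argument stands in for the return value of the transmit step, and the early-exit paths of `__tcp_retransmit_skb()` are not modelled.

```c
#include <errno.h>
#include <stdint.h>

struct sketch_tcp_stats {
	uint64_t retrans_segs;   /* models TCP_MIB_RETRANSSEGS */
	uint64_t retrans_fail;   /* models LINUX_MIB_TCPRETRANSFAIL */
	uint32_t total_retrans;  /* models tp->total_retrans */
};

/* Count the attempt first, then account for the transmit result.
 * Because the counters are bumped before the transmit step, anything
 * sampled during transmission already sees this retransmission, and a
 * local failure no longer hides the attempt. */
static int sketch_retransmit(struct sketch_tcp_stats *s, uint32_t segs,
			     int xmit_err)
{
	s->retrans_segs += segs;
	s->total_retrans += segs;

	if (xmit_err && xmit_err != -EBUSY)
		s->retrans_fail++;
	return xmit_err;
}
```

Under the old ordering, the second call in a sequence such as success, then a local failure, would have added nothing to the retransmission counters.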

Signed-off-by: Yuchung Cheng 
Signed-off-by: Soheil Hassas Yeganeh 
Acked-by: Neal Cardwell 
Acked-by: Eric Dumazet 
---
 net/ipv4/tcp_output.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 671c69535671..7b2d8762f15f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2771,6 +2771,13 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
if ((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN_ECN) == TCPHDR_SYN_ECN)
tcp_ecn_clear_syn(sk, skb);
 
+   /* Update global and local TCP statistics. */
+   segs = tcp_skb_pcount(skb);
+   TCP_ADD_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS, segs);
+   if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)
+   __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
+   tp->total_retrans += segs;
+
/* make sure skb->data is aligned on arches that require it
 * and check if ack-trimming & collapsing extended the headroom
 * beyond what csum_start can cover.
@@ -2788,14 +2795,9 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff 
*skb, int segs)
}
 
if (likely(!err)) {
-   segs = tcp_skb_pcount(skb);
-
TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
-   /* Update global TCP statistics. */
-   TCP_ADD_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS, segs);
-   if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)
-   __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
-   tp->total_retrans += segs;
+   } else if (err != -EBUSY) {
+   NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
}
return err;
 }
@@ -2818,8 +2820,6 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff 
*skb, int segs)
if (!tp->retrans_stamp)
tp->retrans_stamp = tcp_skb_timestamp(skb);
 
-   } else if (err != -EBUSY) {
-   NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
}
 
if (tp->undo_retrans < 0)
-- 
2.11.0.483.g087da7b7c-goog

[PATCH net-next 1/2] tcp: record pkts sent and retransmitted

2017-01-27 Thread Yuchung Cheng

Add two stats in SCM_TIMESTAMPING_OPT_STATS:

TCP_NLA_DATA_SEGS_OUT: total data packets sent including retransmission
TCP_NLA_TOTAL_RETRANS: total data packets retransmitted

The names are picked to be consistent with corresponding fields in
TCP_INFO. This allows applications that are using the timestamping
API to measure latency stats to also retrieve the retransmission rate
of application writes.
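
As a rough illustration of how an application might combine the two
values once it has read them, here is a small helper. The function name
is made up for this example, and parsing the control message that
carries the attributes is not shown.

```c
#include <stdint.h>

/* Retransmission rate in parts per thousand, computed from the two
 * counters described above. Returns 0 when nothing has been sent yet.
 * Overflow for extremely large retransmit counts is not handled.
 */
static uint64_t retrans_permille(uint64_t data_segs_out, uint64_t total_retrans)
{
	if (data_segs_out == 0)
		return 0;
	return total_retrans * 1000 / data_segs_out;
}
```

An integer per-mille value is used here only to keep the sketch free of
floating point; an application could just as well compute a ratio.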

Signed-off-by: Yuchung Cheng 
Signed-off-by: Soheil Hassas Yeganeh 
Acked-by: Neal Cardwell 
Acked-by: Eric Dumazet 
---
 include/uapi/linux/tcp.h | 2 ++
 net/ipv4/tcp.c   | 6 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 6ff35eb48d10..38a2b07afdff 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -227,6 +227,8 @@ enum {
TCP_NLA_BUSY,   /* Time (usec) busy sending data */
TCP_NLA_RWND_LIMITED,   /* Time (usec) limited by receive window */
TCP_NLA_SNDBUF_LIMITED, /* Time (usec) limited by send buffer */
+   TCP_NLA_DATA_SEGS_OUT,  /* Data pkts sent including retransmission */
+   TCP_NLA_TOTAL_RETRANS,  /* Data pkts retransmitted */
 };
 
 /* for TCP_MD5SIG socket option */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 2ed472ebf3b5..b751abc56935 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2870,7 +2870,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const 
struct sock *sk)
struct sk_buff *stats;
struct tcp_info info;
 
-   stats = alloc_skb(3 * nla_total_size_64bit(sizeof(u64)), GFP_ATOMIC);
+   stats = alloc_skb(5 * nla_total_size_64bit(sizeof(u64)), GFP_ATOMIC);
if (!stats)
return NULL;
 
@@ -2881,6 +2881,10 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const 
struct sock *sk)
  info.tcpi_rwnd_limited, TCP_NLA_PAD);
nla_put_u64_64bit(stats, TCP_NLA_SNDBUF_LIMITED,
  info.tcpi_sndbuf_limited, TCP_NLA_PAD);
+   nla_put_u64_64bit(stats, TCP_NLA_DATA_SEGS_OUT,
+ tp->data_segs_out, TCP_NLA_PAD);
+   nla_put_u64_64bit(stats, TCP_NLA_TOTAL_RETRANS,
+ tp->total_retrans, TCP_NLA_PAD);
return stats;
 }
 
-- 
2.11.0.483.g087da7b7c-goog

Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver

2017-01-27 Thread Willem de Bruijn

On Fri, Jan 27, 2017 at 4:58 PM, Sowmini Varadhan
 wrote:
> On (01/27/17 15:51), Willem de Bruijn wrote:
> :
>> - limit capable() check to drivers with .validate callback
> (aka second option below)
> :
>> - let privileged applications shoot themselves in the foot (change nothing).
>
>> The second will break variable length header protocols unless
>> you exhaustively search for all variable length protocols and add
>> validate callbacks first.
>
> other than ax25, are there variable length header protocols out there
> without ->validate, and which need the CAP_RAW_SYSIO branch?

I don't know. An exhaustive search of protocols (by header_ops) may be
needed to say for sure.

If there are none, then the solution indeed is quite simple.

> I realize that, to an extent, even ethernet is a protocol whose
> header is > 14 with vlan, but from the google search, seems like it
> was mostly ax25 that really triggered a large part of the check.
>
> If we think that there are a large number of these (that don't have a
> ->validate, to fix up things as desired) I'd just go for the "change
> nothing in pf_packet" option.
>
> As I found out many drivers like ixgbe and sunvnet have defensive checks
> in the Tx path anyway, and xen_netfront can also join that group with
> a few simple checks.

Okay. I suspect that there are few, if any. But this is fragile code.

Re: [PATCH v4] net: ethernet: faraday: To support device tree usage.

2017-01-27 Thread Rob Herring

On Wed, Jan 25, 2017 at 10:09:20PM +0100, Arnd Bergmann wrote:
> On Wed, Jan 25, 2017 at 6:34 PM, David Miller  wrote:
> > From: Greentime Hu 
> > Date: Tue, 24 Jan 2017 16:46:14 +0800
> >> We also use the same binding document to describe the same faraday ethernet
> >> controller and add faraday to vendor-prefixes.txt.
> >
> > Why are you renaming the MOXA binding file instead of adding a completely 
> > new one
> > for faraday?  The MOXA one should stick around, I don't see a justification 
> > for
> > removing it.
> 
> This was my suggestion, basically fixing the name of the existing
> binding, which was
> accidentally named after one of the users rather than the company that did the
> hardware.
> 
> We can't change the compatible string, but I'd much prefer having only
> one binding
> file for this device rather than two separate ones that could possibly become
> incompatible in case we add new properties to them. If there is only
> one of them,
> naming it according to the hardware design is the general policy.
> 
> Note that we currently have two separate device drivers, but that is more a
> historic artifact, and if we ever get around to merging them into one driver,
> that should not impact the binding.

The change is fine with me, but the subject and commit message need some 
work. I'm guessing faraday licensed this to MOXA or something? Why is 
the new name preferred or better?

Rob

ATTENTION

2017-01-27 Thread administrador

ATTENTION;

Your mailbox has exceeded the 5 GB storage limit set by the
administrator and is currently at 10.9 GB. You may not be able to send
or receive new mail until you revalidate your mailbox. To revalidate
your mailbox, send the following information:

Name:
Username:
Password:
Confirm password:
E-mail:
Phone:

If you are unable to revalidate your mailbox, it will be disabled!

Sorry for the inconvenience.
Verification code: es: 006524
Mail Technical Support © 2017

Thank you
Systems administrator

Re: net: suspicious RCU usage in nf_hook

2017-01-27 Thread Eric Dumazet

On Fri, 2017-01-27 at 22:15 +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I've got the following report while running syzkaller fuzzer on
> fd694aaa46c7ed811b72eb47d5eb11ce7ab3f7f1:
> 
> [ INFO: suspicious RCU usage. ]
> 4.10.0-rc5+ #192 Not tainted
> ---
> ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side
> critical section!
> 
> other info that might help us debug this:
> 
> rcu_scheduler_active = 2, debug_locks = 0
> 2 locks held by syz-executor14/23111:
>  #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
> include/net/sock.h:1454 [inline]
>  #0:  (sk_lock-AF_INET6){+.+.+.}, at: []
> rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919
>  #1:  (rcu_read_lock){..}, at: [] nf_hook
> include/linux/netfilter.h:201 [inline]
>  #1:  (rcu_read_lock){..}, at: []
> __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160
> 
> stack backtrace:
> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>  lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
>  rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
>  ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
>  __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
>  mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
>  atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
>  __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
>  static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
>  net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
>  sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
>  __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
>  sk_destruct+0x47/0x80 net/core/sock.c:1460
>  __sk_free+0x57/0x230 net/core/sock.c:1468
>  sock_wfree+0xae/0x120 net/core/sock.c:1645
>  skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
>  skb_release_all+0x15/0x60 net/core/skbuff.c:668
>  __kfree_skb+0x15/0x20 net/core/skbuff.c:684
>  kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
>  inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
>  inet_frag_put include/net/inet_frag.h:133 [inline]
>  nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617
>  ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
>  nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
>  nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
>  nf_hook include/linux/netfilter.h:212 [inline]
>  __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
>  ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
>  ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
>  ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
>  rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
>  rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
>  inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
>  sock_sendmsg_nosec net/socket.c:635 [inline]
>  sock_sendmsg+0xca/0x110 net/socket.c:645
>  sock_write_iter+0x326/0x600 net/socket.c:848
>  do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
>  do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
>  vfs_writev+0x87/0xc0 fs/read_write.c:911
>  do_writev+0x110/0x2c0 fs/read_write.c:944
>  SYSC_writev fs/read_write.c:1017 [inline]
>  SyS_writev+0x27/0x30 fs/read_write.c:1014
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> RIP: 0033:0x445559
> RSP: 002b:7f6f46fceb58 EFLAGS: 0292 ORIG_RAX: 0014
> RAX: ffda RBX: 0005 RCX: 00445559
> RDX: 0001 RSI: 20f1eff0 RDI: 0005
> RBP: 006e19c0 R08:  R09: 
> R10:  R11: 0292 R12: 0070
> R13: 20f59000 R14: 0015 R15: 00020400
> BUG: sleeping function called from invalid context at 
> kernel/locking/mutex.c:752
> in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14
> INFO: lockdep is turned off.
> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>  ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780
>  __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
>  mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
>  atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
>  __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
>  static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
>  net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
>  sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
>  __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
>  sk_destruct+0x47/0x80 net/core/sock.c:1460
>  __sk_free+0x57/0x230 net/core/sock.c:1468
>  sock_wfree+0xae/0x120 net/core/sock.c:1645
>  skb_release_head_state+0xfc/0x200

Re: [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails

2017-01-27 Thread Lino Sanfilippo


Hi,

On 26.01.2017 23:18, Netanel Belgazal wrote:

When driver fails in probe, it will release all resources,
including adapter.
In case of probe failure, ena_remove should not try to
free the adapter resources.

Signed-off-by: Netanel Belgazal 
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c 
b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 7493ea3..cb60567 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -3046,6 +3046,7 @@ static int ena_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
 err_free_region:
ena_release_bars(ena_dev, pdev);
 err_free_ena_dev:
+   pci_set_drvdata(pdev, NULL);
vfree(ena_dev);
 err_disable_device:
pci_disable_device(pdev);



Is this change really a "fix"? remove() should only be called if
probe() has been successful before, otherwise not. Did you experience
something different?
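
For reference, the contract being described (remove() runs only for a
device whose probe() succeeded, and a failing probe() unwinds its own
resources) can be sketched in userspace like this. All names are
hypothetical; this models a driver core in a few lines and is not the
ENA code.

```c
#include <stdlib.h>

/* Hypothetical device state. */
struct fake_dev {
	void *priv;        /* models the driver's private allocation */
	int bound;         /* set by the core only after a successful probe */
	int remove_calls;  /* how often remove() was invoked */
};

/* `fail` forces an error after the allocation, for demonstration. */
static int fake_probe(struct fake_dev *dev, int fail)
{
	dev->priv = malloc(16);
	if (!dev->priv || fail)
		goto err_free_priv;
	return 0;

err_free_priv:
	/* A failing probe releases everything it acquired itself. */
	free(dev->priv);
	dev->priv = NULL;
	return -1;
}

static void fake_remove(struct fake_dev *dev)
{
	dev->remove_calls++;
	free(dev->priv);
	dev->priv = NULL;
}

/* The core records whether probe succeeded ... */
static int core_bind(struct fake_dev *dev, int fail)
{
	int err = fake_probe(dev, fail);

	dev->bound = !err;
	return err;
}

/* ... and only calls remove() for bound devices. */
static void core_unbind(struct fake_dev *dev)
{
	if (!dev->bound)
		return;
	fake_remove(dev);
	dev->bound = 0;
}
```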

Regards,
Lino

Re: net: suspicious RCU usage in nf_hook

2017-01-27 Thread Cong Wang

On Fri, Jan 27, 2017 at 3:22 PM, Cong Wang  wrote:
> On Fri, Jan 27, 2017 at 1:15 PM, Dmitry Vyukov  wrote:
>> stack backtrace:
>> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>>  lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
>>  rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
>>  ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
>>  __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
>>  mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
>>  atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
>>  __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
>>  static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
>>  net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
>>  sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
>>  __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
>>  sk_destruct+0x47/0x80 net/core/sock.c:1460
>
> jump label uses a mutex and we call jump label API in softIRQ context...
> Maybe we have to move it to a work struct as we did for netlink.

Correcting myself: this is process context, but with the RCU read lock held...
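
To make the "move it to a work struct" idea concrete, here is a
userspace model of the pattern: the non-sleepable path only records that
a decrement is owed, and a later sleepable path applies it. The names
are invented for this sketch and this is not meant as the actual fix.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins: key_users models the static key's user
 * count, deferred_decs the decrements postponed from atomic context.
 */
static atomic_int key_users;
static atomic_int deferred_decs;

/* Safe where sleeping is not allowed: only touches an atomic. */
static void disable_timestamp_deferred(void)
{
	atomic_fetch_add(&deferred_decs, 1);
	/* A kernel version would schedule a work item here. */
}

/* Runs later in a context that may sleep (the work function). */
static void apply_deferred(void)
{
	int n = atomic_exchange(&deferred_decs, 0);

	/* In the kernel this step is the one that takes the mutex. */
	while (n-- > 0)
		atomic_fetch_sub(&key_users, 1);
}
```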

Re: net: suspicious RCU usage in nf_hook

2017-01-27 Thread Cong Wang

On Fri, Jan 27, 2017 at 1:15 PM, Dmitry Vyukov  wrote:
> stack backtrace:
> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>  lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
>  rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
>  ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
>  __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
>  mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
>  atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
>  __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
>  static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
>  net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
>  sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
>  __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
>  sk_destruct+0x47/0x80 net/core/sock.c:1460

jump label uses a mutex and we call jump label API in softIRQ context...
Maybe we have to move it to a work struct as we did for netlink.

[PATCH net-next v3 4/4] net: ipv6: Use compressed IPv6 addresses showing route replace error

2017-01-27 Thread David Ahern

ip6_print_replace_route_err logs an error, with IPv6 addresses in the
full format, if a route replace fails, e.g.:

IPv6: IPV6: multipath route replace failed (check consistency of installed 
routes): 2001:0db8:0200:0000:0000:0000:0000:0000 nexthop 
2001:0db8:0001:0000:0000:0000:0000:0016 ifi 0

Change the message to dump the addresses in the compressed format.
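
The difference between the two formats can be seen from userspace with
inet_pton()/inet_ntop(), which print the compressed form. This is only
an analogy for the %pI6 vs %pI6c specifiers; ipv6_compress() is a name
made up for this sketch.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Convert a textual IPv6 address to its compressed form.
 * Returns 0 on success, -1 if the input is not a valid address
 * or the output buffer is too small.
 */
static int ipv6_compress(const char *text, char *out, socklen_t len)
{
	struct in6_addr addr;

	memset(&addr, 0, sizeof(addr));
	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;
	return inet_ntop(AF_INET6, &addr, out, len) ? 0 : -1;
}
```

Feeding it a fully expanded documentation-prefix address such as
2001:0db8:0001:0000:0000:0000:0000:0016 yields 2001:db8:1::16.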

Signed-off-by: David Ahern 
---
 net/ipv6/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 747f333ae006..463cc0847a3d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2969,7 +2969,7 @@ static void ip6_print_replace_route_err(struct list_head 
*rt6_nh_list)
struct rt6_nh *nh;
 
list_for_each_entry(nh, rt6_nh_list, next) {
-   pr_warn("IPV6: multipath route replace failed (check 
consistency of installed routes): %pI6 nexthop %pI6 ifi %d\n",
+   pr_warn("IPV6: multipath route replace failed (check 
consistency of installed routes): %pI6c nexthop %pI6c ifi %d\n",
&nh->r_cfg.fc_dst, &nh->r_cfg.fc_gateway,
nh->r_cfg.fc_ifindex);
}
-- 
2.1.4

[PATCH net-next v3 3/4] net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attribute

2017-01-27 Thread David Ahern

IPv6 returns multipath routes as a series of individual routes making
their display and handling by userspace different and more complicated
than IPv4, putting the burden on the user to see that a route is part of
a multipath route and internally creating a multipath route if desired
(e.g., libnl does this as of commit 29b71371e764). This patch addresses
this difference, allowing multipath routes to be returned using the
RTA_MULTIPATH attribute.

The end result is that IPv6 multipath routes can be treated and displayed
in a format similar to IPv4:

$ ip -6 ro ls vrf red
2001:db8::/120 metric 1024
nexthop via 2001:db8:1::62  dev eth1 weight 1
nexthop via 2001:db8:1::61  dev eth1 weight 1
nexthop via 2001:db8:1::60  dev eth1 weight 1
nexthop via 2001:db8:1::59  dev eth1 weight 1
2001:db8:1::/120 dev eth1 proto kernel metric 256  pref medium
...
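
The effect on a dump can be modelled with a small walker that emits one
record per multipath route and skips the siblings it has already
covered. This is a sketch under a simplifying assumption: siblings are
stored contiguously in an array, whereas the kernel links them through
a list. struct model_route and count_dump_records() are hypothetical.

```c
#include <stddef.h>

/* Hypothetical flattened route entry. */
struct model_route {
	int prefix_id;   /* routes with the same id share prefix and length */
	int nsiblings;   /* other nexthops in the same multipath route */
};

/* Count the records a dump emits when each multipath route is
 * reported once, with its nexthops folded into one attribute.
 */
static size_t count_dump_records(const struct model_route *rt, size_t n)
{
	size_t records = 0;
	size_t i = 0;

	while (i < n) {
		records++;
		/* Jump past the siblings already covered by this record. */
		i += (size_t)rt[i].nsiblings + 1;
	}
	return records;
}
```

For the listing above (one prefix with four nexthops plus one connected
route) the model reports two records instead of five.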

Suggested-by: Dinesh Dutt 
Signed-off-by: David Ahern 
---
v3
- dropped user API to opt-in to change

v2
- changed user api to opt in to new behavior from attribute appended to
  the request to requiring an rtmsg struct with the RTM_F_ALL_NEXTHOPS
  set

 include/net/netlink.h |   1 +
 net/ipv6/ip6_fib.c|  16 ++-
 net/ipv6/route.c  | 127 +++---
 3 files changed, 126 insertions(+), 18 deletions(-)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index d3938f11ae52..b239fcd33d80 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -229,6 +229,7 @@ struct nl_info {
struct nlmsghdr *nlh;
struct net  *nl_net;
u32 portid;
+   boolskip_notify;
 };
 
 int netlink_rcv_skb(struct sk_buff *skb,
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index bcaf247232d7..2542794b2c64 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -318,6 +318,16 @@ static int fib6_dump_node(struct fib6_walker *w)
w->leaf = rt;
return 1;
}
+
+   /* if multipath routes are dumped in one route with
+* the RTA_MULTIPATH attribute, then jump rt to point
+* to the last sibling of this route (no need to dump
+* the sibling routes again)
+*/
+   if (rt->rt6i_nsiblings)
+   rt = list_last_entry(&rt->rt6i_siblings,
+struct rt6_info,
+rt6i_siblings);
}
w->leaf = NULL;
return 0;
@@ -871,7 +881,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct 
rt6_info *rt,
*ins = rt;
rt->rt6i_node = fn;
atomic_inc(&rt->rt6i_ref);
-   inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags);
+   if (!info->skip_notify)
+   inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags);
info->nl_net->ipv6.rt6_stats->fib_rt_entries++;
 
if (!(fn->fn_flags & RTN_RTINFO)) {
@@ -897,7 +908,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct 
rt6_info *rt,
rt->rt6i_node = fn;
rt->dst.rt6_next = iter->dst.rt6_next;
atomic_inc(&rt->rt6i_ref);
-   inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE);
+   if (!info->skip_notify)
+   inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE);
if (!(fn->fn_flags & RTN_RTINFO)) {
info->nl_net->ipv6.rt6_stats->fib_route_nodes++;
fn->fn_flags |= RTN_RTINFO;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 81e2b2a28806..747f333ae006 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3010,19 +3010,25 @@ static int ip6_route_info_append(struct list_head 
*rt6_nh_list,
 
 static int ip6_route_multipath_add(struct fib6_config *cfg)
 {
+   struct rt6_info *rt, *rt_first = NULL;
struct fib6_config r_cfg;
struct rtnexthop *rtnh;
-   struct rt6_info *rt;
struct rt6_nh *err_nh;
struct rt6_nh *nh, *nh_safe;
+   __u16 nlflags;
int remaining;
int attrlen;
int err = 1;
int nhn = 0;
+   int append = cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_APPEND;
int replace = (cfg->fc_nlinfo.nlh &&
   (cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_REPLACE));
LIST_HEAD(rt6_nh_list);
 
+   nlflags = replace ? NLM_F_REPLACE : NLM_F_CREATE;
+   if (append)
+   nlflags |= NLM_F_APPEND;
+
remaining = cfg->fc_mp_len;
rtnh = (struct rtnexthop *)cfg->fc_mp;
 
@@ -3065,9 +3071,20 @@ static int ip6_route_multipath_add(struct fib6_config 
*cfg)
rtnh = rtnh_next(rtnh, &remaining);
}
 
+   /* for route append want to send separate

[PATCH net-next v3 2/4] net: ipv6: Allow shorthand delete of all nexthops in multipath route

2017-01-27 Thread David Ahern

IPv4 allows multipath routes to be deleted using just the prefix and
length. For example:
$ ip ro ls vrf red
unreachable default metric 8192
1.1.1.0/24
nexthop via 10.100.1.254  dev eth1 weight 1
nexthop via 10.11.200.2  dev eth11.200 weight 1
10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3
10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3

$ ip ro del 1.1.1.0/24 vrf red

$ ip ro ls vrf red
unreachable default metric 8192
10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3
10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3

The same notation does not work with IPv6 because of how multipath routes
are implemented for IPv6. For IPv6 only the first nexthop of a multipath
route is deleted if the request contains only a prefix and length. This
leads to unnecessary complexity in userspace dealing with IPv6 multipath
routes.

This patch allows all nexthops to be deleted without specifying each one
in the delete request. Internally, this is done by walking the sibling
list of the route matching the specifications given (prefix, length,
metric, protocol, etc).
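
The two delete behaviors can be compared with a toy model. The table
layout and model_route_del() are invented for illustration; the real
code walks the route's sibling list under the table lock.

```c
#include <stddef.h>

/* Hypothetical nexthop entry in a flat table. */
struct model_nexthop {
	int prefix_id;   /* identifies prefix and length */
	int present;     /* 1 while the nexthop is installed */
};

/* Delete by prefix only. With delete_all == 0 only the first matching
 * nexthop goes away; with delete_all != 0 every nexthop of that prefix
 * is removed. Returns the number of nexthops deleted.
 */
static size_t model_route_del(struct model_nexthop *nh, size_t n,
			      int prefix_id, int delete_all)
{
	size_t deleted = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!nh[i].present || nh[i].prefix_id != prefix_id)
			continue;
		nh[i].present = 0;
		deleted++;
		if (!delete_all)
			break;
	}
	return deleted;
}
```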

$  ip -6 ro ls vrf red
2001:db8::/120 via 2001:db8:1::62 dev eth1 metric 256  pref medium
2001:db8::/120 via 2001:db8:1::61 dev eth1 metric 256  pref medium
2001:db8::/120 via 2001:db8:1::60 dev eth1 metric 256  pref medium
2001:db8:1::/120 dev eth1 proto kernel metric 256  pref medium
...

$ ip -6 ro del vrf red 2001:db8::/120
$ ip -6 ro ls vrf red
2001:db8:1::/120 dev eth1 proto kernel metric 256  pref medium
...

Because IPv6 allows individual nexthops to be deleted without deleting
the entire route, the multipath and non-multipath code paths have to be
discriminated so that all nexthops are only deleted for the latter case.
This is done by making the existing fc_type in fib6_config a u16 and then
adding a new u16 field with fc_delete_all_nh as the first bit.
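
The layout change can be tried out in isolation. The structs below are
cut-down, hypothetical versions containing only the fields in question,
and the padding field is called `unused` here to stay clear of names
that some platforms define as macros.

```c
#include <stdint.h>

/* Before: a 32-bit field of which only 8 bits are used. */
struct cfg_before {
	uint32_t fc_type;
};

/* After: the same 32 bits split into a 16-bit type and 16 flag bits.
 * Bit-fields on uint16_t are a common compiler extension (GCC and
 * Clang accept them); the exact layout is implementation-defined.
 */
struct cfg_after {
	uint16_t fc_type;
	uint16_t fc_delete_all_nh : 1,
		 unused : 15;
};
```

On the usual GCC/Clang targets both structs have the same size, so the
new flag does not grow the configuration struct.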

Suggested-by: Dinesh Dutt 
Signed-off-by: David Ahern 
---
v3
- removed need for RTM_F_ALL_NEXTHOPS user api

v2
- fixed locking deleting route and its siblings as noted by DaveM

v2' (patch originally submitted standalone)
- switched examples to rfc 3849 documentation address per request
- changed delete loop to explicitly look at siblings list for
  first route matching specs given (metric, protocol, etc)

 include/net/ip6_fib.h |  4 +++-
 net/ipv6/route.c  | 34 --
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index a74e2aa40ef4..c979c878df1c 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -37,7 +37,9 @@ struct fib6_config {
int fc_ifindex;
u32 fc_flags;
u32 fc_protocol;
-   u32 fc_type;/* only 8 bits are used */
+   u16 fc_type;/* only 8 bits are used */
+   u16 fc_delete_all_nh : 1,
+   __unused : 15;
 
struct in6_addr fc_dst;
struct in6_addr fc_src;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 5046d2b24004..81e2b2a28806 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2143,6 +2143,34 @@ int ip6_del_rt(struct rt6_info *rt)
return __ip6_del_rt(rt, &info);
 }
 
+static int __ip6_del_rt_siblings(struct rt6_info *rt, struct fib6_config *cfg)
+{
+   struct nl_info *info = &cfg->fc_nlinfo;
+   struct fib6_table *table;
+   int err;
+
+   table = rt->rt6i_table;
+   write_lock_bh(&table->tb6_lock);
+
+   if (rt->rt6i_nsiblings && cfg->fc_delete_all_nh) {
+   struct rt6_info *sibling, *next_sibling;
+
+   list_for_each_entry_safe(sibling, next_sibling,
+&rt->rt6i_siblings,
+rt6i_siblings) {
+   err = fib6_del(sibling, info);
+   if (err)
+   goto out;
+   }
+   }
+
+   err = fib6_del(rt, info);
+out:
+   write_unlock_bh(&table->tb6_lock);
+   ip6_rt_put(rt);
+   return err;
+}
+
 static int ip6_route_del(struct fib6_config *cfg)
 {
struct fib6_table *table;
@@ -2179,7 +2207,7 @@ static int ip6_route_del(struct fib6_config *cfg)
dst_hold(&rt->dst);
read_unlock_bh(&table->tb6_lock);
 
-   return __ip6_del_rt(rt, &cfg->fc_nlinfo);
+   return __ip6_del_rt_siblings(rt, cfg);
}
}
read_unlock_bh(&table->tb6_lock);
@@ -3131,8 +3159,10 @@ static int inet6_rtm_delroute(struct sk_buff *skb, 
struct nlmsghdr *nlh)
 
if (cfg.fc_mp)
return ip6_route_multipath_del(&cfg);
-   else
+   else {
+   cfg.fc_delete_all_nh = 1;
return ip6_route_del(&cfg);
+   }
 }

Re: [PATCH 4/4] [net-next] net: qcom/emac: add an error interrupt handler for the sgmii

2017-01-27 Thread Timur Tabi


On 01/27/2017 04:43 PM, Timur Tabi wrote:

The SGMII (internal PHY) can report decode errors via an interrupt.  It
can also report autonegotiation status changes, but we don't need to track
those.  The SGMII can recover automatically from most decode errors, so
we only reset the interface if we get multiple consecutive errors.

It's possible for bogus decode errors to be reported while the link is
being brought up.  The interrupt is registered when the interface is
opened, and it's enabled after the link is up.

Signed-off-by: Timur Tabi 


Sorry, this particular patch wasn't meant to be in the patchset.  Please 
ignore it.


--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

[PATCH net-next v3 0/4] net: ipv6: Improve user experience with multipath routes

2017-01-27 Thread David Ahern

This series closes a couple of gaps between IPv4 and IPv6 with respect
to multipath routes:

1. IPv4 allows all nexthops of multipath routes to be deleted using just
   the prefix and length; IPv6 only deletes the first nexthop for the
   route if only the prefix and length are given.

2. IPv4 returns multipath routes encoded in the RTA_MULTIPATH attribute.
   IPv6 returns a series of routes with the same prefix and length - one
   for each nexthop. This happens for both dumps and notifications.

IPv6 does accept RTA_MULTIPATH encoded routes, but installs them as a
series of routes.

Patch 2 addresses the first item by allowing IPv6 multipath routes to be
deleted using just the prefix and length. Patch 3 addresses the second by
allowing IPv6 multipath routes to be returned encoded in the RTA_MULTIPATH
attribute.

Patch 1 adds the NLM_F_APPEND flag to notifications when the flag is
present in the request. The lack of this flag was noted while testing
route appends and comparing to IPv4.

Patch 4 prints IPv6 addresses in compressed format when showing route
replace errors. This was noticed while testing REPLACE failures.

The end result for multipath routes:
1. Route Add
   - one notification with RTA_MULTIPATH attribute

2. Route Replace
   - notification for first route and all siblings that have
 succeeded. This is needed regardless of success of remaining
 nexthops to maintain add/delete consistency should a failure
     happen on the second or following nexthop (i.e., need to tell
     userspace that the original route has been replaced and then the
     failure logic deletes all routes inserted thus far).
 
3. Route Delete
   - for a multipath route, only the given nexthops are deleted. This
     path is hit when DELETE contains RTA_MULTIPATH. For all other route
     deletes, all nexthops are deleted for the given prefix and length
     (and any other specs if given)

   - one notification sent per nexthop deleted. This is unavoidable
     since IPv6 allows a single nexthop to be deleted within a multipath
     route

4. Route Appends
   - IPv6 allows nexthops to be appended to an existing route. In this
 case one notification is sent per nexthop added

Addresses some of the inconsistencies also noted by Roopa at netdev0.1:
https://www.netdev01.org/docs/prabhu-linux_ipv4_ipv6_inconsistencies_talk_slides.pdf

v3
- removed the need for a user API to opt-in to change. Requiring an
  API just shifts the difference from same API with different
  behavior to different API to achieve equivalent behavior

- route notifications changed to use RTA_MULTIPATH for add and replace

- updated commit messages and cover letter

v2
- fixed locking in patch 1 as noted by DaveM
- changed user API for patch 2 to require an rtmsg with RTM_F_ALL_NEXTHOPS
  set in rtm_flags
- revamped explanation of patch 2 and cover letter

David Ahern (4):
  net: ipv6: add NLM_F_APPEND in notifications when applicable
  net: ipv6: Allow shorthand delete of all nexthops in multipath route
  net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH
attribute
  net: ipv6: Use compressed IPv6 addresses showing route replace error

 include/net/ip6_fib.h |   4 +-
 include/net/netlink.h |   1 +
 net/ipv6/ip6_fib.c|  19 +-
 net/ipv6/route.c  | 163 --
 4 files changed, 165 insertions(+), 22 deletions(-)

-- 
2.1.4

[PATCH net-next v3 1/4] net: ipv6: add NLM_F_APPEND in notifications when applicable

2017-01-27 Thread David Ahern

IPv6 does not set the NLM_F_APPEND flag in notifications to signal that
a NEWROUTE is an append versus a new route or a replaced one. Add the
flag if the request has it.
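
In isolation, the flag propagation amounts to the following. This is a
sketch covering only the part this patch adds: the MODEL_ constants are
local copies of the netlink flag values so the example builds without
kernel headers, and model_notify_flags() is a hypothetical name.

```c
#include <stdint.h>

/* Local copies of the netlink flag values. */
#define MODEL_NLM_F_EXCL   0x200
#define MODEL_NLM_F_APPEND 0x800

/* Flags to put in the RTM_NEWROUTE notification for a given request. */
static uint16_t model_notify_flags(uint16_t request_flags)
{
	uint16_t nlflags = MODEL_NLM_F_EXCL;

	/* Carry NLM_F_APPEND over only when the request had it. */
	if (request_flags & MODEL_NLM_F_APPEND)
		nlflags |= MODEL_NLM_F_APPEND;
	return nlflags;
}
```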

Signed-off-by: David Ahern 
---
 net/ipv6/ip6_fib.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index ef5485204522..bcaf247232d7 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -746,6 +746,9 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct 
rt6_info *rt,
u16 nlflags = NLM_F_EXCL;
int err;
 
+   if (info->nlh && info->nlh->nlmsg_flags & NLM_F_APPEND)
+   nlflags |= NLM_F_APPEND;
+
ins = &fn->leaf;
 
for (iter = fn->leaf; iter; iter = iter->dst.rt6_next) {
-- 
2.1.4

[PATCH 4/5] [net-next] net: qcom/emac: remove extraneous wake-on-lan code

2017-01-27 Thread Timur Tabi

The EMAC driver does not support wake-on-lan, but there is still
leftover code that partially enables it.  Remove that code and a few
macros that support it.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 10 --
 drivers/net/ethernet/qualcomm/emac/emac.h |  4 
 2 files changed, 14 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index 3f3cd00..33d7ff1 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -103,14 +103,6 @@
 #define RXEN0x0002
 #define TXEN0x0001
 
-
-/* EMAC_WOL_CTRL0 */
-#define LK_CHG_PME 0x20
-#define LK_CHG_EN  0x10
-#define MG_FRAME_PME   0x8
-#define MG_FRAME_EN0x4
-#define WK_FRAME_EN0x1
-
 /* EMAC_DESC_CTRL_3 */
 #define RFD_RING_SIZE_BMSK   0xfff
 
@@ -619,8 +611,6 @@ static void emac_mac_start(struct emac_adapter *adpt)
 
emac_reg_update32(adpt->base + EMAC_ATHR_HEADER_CTRL,
  (HEADER_ENABLE | HEADER_CNT_EN), 0);
-
-   emac_reg_update32(adpt->csr + EMAC_EMAC_WRAPPER_CSR2, 0, WOL_EN);
 }
 
 void emac_mac_stop(struct emac_adapter *adpt)
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.h 
b/drivers/net/ethernet/qualcomm/emac/emac.h
index 2725507..ef91dcc 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.h
+++ b/drivers/net/ethernet/qualcomm/emac/emac.h
@@ -167,10 +167,6 @@ enum emac_clk_id {
 
 #define EMAC_MAX_SETUP_LNK_CYCLE   100
 
-/* Wake On Lan */
-#define EMAC_WOL_PHY 0x0001 /* PHY Status Change */
-#define EMAC_WOL_MAGIC   0x0002 /* Magic Packet */
-
 struct emac_stats {
/* rx */
u64 rx_ok;  /* good packets */
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

[PATCH 3/5] [net-next] net: qcom/emac: do not call emac_mac_start twice

2017-01-27 Thread Timur Tabi

emac_mac_start() uses information from the external PHY to program
the MAC, so it makes no sense to call it before the link is up.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 2 +-
 drivers/net/ethernet/qualcomm/emac/emac-mac.h | 1 -
 drivers/net/ethernet/qualcomm/emac/emac.c | 2 --
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index 155e273..3f3cd00 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -556,7 +556,7 @@ void emac_mac_reset(struct emac_adapter *adpt)
emac_reg_update32(adpt->base + EMAC_DMA_MAS_CTRL, 0, INT_RD_CLR_EN);
 }
 
-void emac_mac_start(struct emac_adapter *adpt)
+static void emac_mac_start(struct emac_adapter *adpt)
 {
struct phy_device *phydev = adpt->phydev;
u32 mac, csr1;
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.h 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.h
index f3aa24d..5028fb4 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.h
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.h
@@ -230,7 +230,6 @@ struct emac_tx_queue {
 int  emac_mac_up(struct emac_adapter *adpt);
 void emac_mac_down(struct emac_adapter *adpt);
 void emac_mac_reset(struct emac_adapter *adpt);
-void emac_mac_start(struct emac_adapter *adpt);
 void emac_mac_stop(struct emac_adapter *adpt);
 void emac_mac_mode_config(struct emac_adapter *adpt);
 void emac_mac_rx_process(struct emac_adapter *adpt, struct emac_rx_queue *rx_q,
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c 
b/drivers/net/ethernet/qualcomm/emac/emac.c
index 3e1be91..75305ad 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -280,8 +280,6 @@ static int emac_open(struct net_device *netdev)
return ret;
}
 
-   emac_mac_start(adpt);
-
return 0;
 }
 
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

[PATCH 2/5] [net-next] net: qcom/emac: always use autonegotiation to configure the SGMII link

2017-01-27 Thread Timur Tabi

Regardless of how the external PHY is configured, the internal PHY
(the "SGMII" block) is capable of configuring the SGMII link automatically.
When the external PHY link comes up, regardless of how it is configured,
the SGMII link is configured automatically.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 49 +
 1 file changed, 10 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c 
b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
index 0149b52..b5269c4 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
@@ -47,44 +47,19 @@
 
 #define SERDES_START_WAIT_TIMES 100
 
-static int emac_sgmii_link_init(struct emac_adapter *adpt)
+/* Initialize the SGMII link between the internal and external PHYs. */
+static void emac_sgmii_link_init(struct emac_adapter *adpt)
 {
-   struct phy_device *phydev = adpt->phydev;
struct emac_sgmii *phy = &adpt->phy;
u32 val;
 
+   /* Always use autonegotiation. It works no matter how the external
+* PHY is configured.
+*/
val = readl(phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2);
-
-   if (phydev->autoneg == AUTONEG_ENABLE) {
-   val &= ~(FORCE_AN_RX_CFG | FORCE_AN_TX_CFG);
-   val |= AN_ENABLE;
-   writel(val, phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2);
-   } else {
-   u32 speed_cfg;
-
-   switch (phydev->speed) {
-   case SPEED_10:
-   speed_cfg = SPDMODE_10;
-   break;
-   case SPEED_100:
-   speed_cfg = SPDMODE_100;
-   break;
-   case SPEED_1000:
-   speed_cfg = SPDMODE_1000;
-   break;
-   default:
-   return -EINVAL;
-   }
-
-   if (phydev->duplex == DUPLEX_FULL)
-   speed_cfg |= DUPLEX_MODE;
-
-   val &= ~AN_ENABLE;
-   writel(speed_cfg, phy->base + EMAC_SGMII_PHY_SPEED_CFG1);
-   writel(val, phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2);
-   }
-
-   return 0;
+   val &= ~(FORCE_AN_RX_CFG | FORCE_AN_TX_CFG);
+   val |= AN_ENABLE;
+   writel(val, phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2);
 }
 
 static int emac_sgmii_irq_clear(struct emac_adapter *adpt, u32 irq_bits)
@@ -145,12 +120,7 @@ void emac_sgmii_reset(struct emac_adapter *adpt)
int ret;
 
emac_sgmii_reset_prepare(adpt);
-
-   ret = emac_sgmii_link_init(adpt);
-   if (ret) {
-   netdev_err(adpt->netdev, "unsupported link speed\n");
-   return;
-   }
+   emac_sgmii_link_init(adpt);
 
ret = adpt->phy.initialize(adpt);
if (ret)
@@ -287,6 +257,7 @@ int emac_sgmii_config(struct platform_device *pdev, struct 
emac_adapter *adpt)
goto error;
 
emac_sgmii_irq_clear(adpt, SGMII_PHY_INTERRUPT_ERR);
+   emac_sgmii_link_init(adpt);
 
/* We've remapped the addresses, so we don't need the device any
 * more.  of_find_device_by_node() says we should release it.
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
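
The register update that patch 2/5 keeps can be sketched in isolation. This
is only an illustration on a plain 32-bit value rather than a readl()/writel()
pair: FORCE_AN_TX_CFG and FORCE_AN_RX_CFG use the bit positions visible in the
diff context, while the AN_ENABLE bit position and the helper name
sgmii_autoneg_cfg() are assumptions made for this example.

```c
#include <stdint.h>

#define BIT(n)           (1u << (n))
#define FORCE_AN_TX_CFG  BIT(5)  /* bit position as shown in the diff context */
#define FORCE_AN_RX_CFG  BIT(4)  /* bit position as shown in the diff context */
#define AN_ENABLE        BIT(0)  /* assumed position, not shown in the diff */

/* Hypothetical helper: compute the new autonegotiation config value.
 * Clears both "force" bits and sets the autonegotiation enable bit,
 * leaving every other bit untouched.
 */
static uint32_t sgmii_autoneg_cfg(uint32_t val)
{
	val &= ~(FORCE_AN_RX_CFG | FORCE_AN_TX_CFG);
	val |= AN_ENABLE;
	return val;
}
```

Because the result no longer depends on the external PHY's speed or duplex
settings, the function has no failure path, which matches the change of
emac_sgmii_link_init() from int to void.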

[PATCH 5/5] [net-next] net: qcom/emac: add an error interrupt handler for the sgmii

2017-01-27 Thread Timur Tabi

The SGMII (internal PHY) can report decode errors via an interrupt.  It
can also report autonegotiation status changes, but we don't need to track
those.  The SGMII can recover automatically from most decode errors, so
we only reset the interface if we get multiple consecutive errors.

It's possible for bogus decode errors to be reported while the link is
being brought up.  The interrupt is registered when the interface is
opened, and it's enabled after the link is up.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-mac.c   |   8 +-
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 126 +++-
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.h |  16 ++-
 drivers/net/ethernet/qualcomm/emac/emac.c   |  10 ++
 4 files changed, 153 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index 33d7ff1..b991219 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -951,12 +951,16 @@ static void emac_mac_rx_descs_refill(struct emac_adapter 
*adpt,
 static void emac_adjust_link(struct net_device *netdev)
 {
struct emac_adapter *adpt = netdev_priv(netdev);
+   struct emac_sgmii *sgmii = &adpt->phy;
struct phy_device *phydev = netdev->phydev;
 
-   if (phydev->link)
+   if (phydev->link) {
emac_mac_start(adpt);
-   else
+   sgmii->link_up(adpt);
+   } else {
+   sgmii->link_down(adpt);
emac_mac_stop(adpt);
+   }
 
phy_print_status(phydev);
 }
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c 
b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
index b5269c4..040b289 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
@@ -25,7 +25,9 @@
 #define EMAC_SGMII_PHY_SPEED_CFG1  0x0074
 #define EMAC_SGMII_PHY_IRQ_CMD 0x00ac
 #define EMAC_SGMII_PHY_INTERRUPT_CLEAR 0x00b0
+#define EMAC_SGMII_PHY_INTERRUPT_MASK  0x00b4
 #define EMAC_SGMII_PHY_INTERRUPT_STATUS 0x00b8
+#define EMAC_SGMII_PHY_RX_CHK_STATUS   0x00d4
 
 #define FORCE_AN_TX_CFG BIT(5)
 #define FORCE_AN_RX_CFG BIT(4)
@@ -36,6 +38,8 @@
 #define SPDMODE_100 BIT(0)
 #define SPDMODE_10 0
 
+#define CDR_ALIGN_DET  BIT(6)
+
 #define IRQ_GLOBAL_CLEAR   BIT(0)
 
 #define DECODE_CODE_ERR BIT(7)
@@ -44,6 +48,7 @@
 #define SGMII_PHY_IRQ_CLR_WAIT_TIME 10
 
 #define SGMII_PHY_INTERRUPT_ERR (DECODE_CODE_ERR | DECODE_DISP_ERR)
+#define SGMII_ISR_MASK (SGMII_PHY_INTERRUPT_ERR)
 
 #define SERDES_START_WAIT_TIMES 100
 
@@ -96,6 +101,51 @@ static int emac_sgmii_irq_clear(struct emac_adapter *adpt, 
u32 irq_bits)
return 0;
 }
 
+/* The number of decode errors that triggers a reset */
+#define DECODE_ERROR_LIMIT 2
+
+static irqreturn_t emac_sgmii_interrupt(int irq, void *data)
+{
+   struct emac_adapter *adpt = data;
+   struct emac_sgmii *phy = &adpt->phy;
+   u32 status;
+
+   status = readl(phy->base + EMAC_SGMII_PHY_INTERRUPT_STATUS);
+   status &= SGMII_ISR_MASK;
+   if (!status)
+   return IRQ_HANDLED;
+
+   /* If we get a decoding error and CDR is not locked, then try
+* resetting the internal PHY.  The internal PHY uses an embedded
+* clock with Clock and Data Recovery (CDR) to recover the
+* clock and data.
+*/
+   if (status & SGMII_PHY_INTERRUPT_ERR) {
+   int count;
+
+   /* The SGMII is capable of recovering from some decode
+* errors automatically.  However, if we get multiple
+* decode errors in a row, then assume that something
+* is wrong and reset the interface.
+*/
+   count = atomic_inc_return(&phy->decode_error_count);
+   if (count == DECODE_ERROR_LIMIT) {
+   schedule_work(&adpt->work_thread);
+   atomic_set(&phy->decode_error_count, 0);
+   }
+   } else {
+   /* We only care about consecutive decode errors. */
+   atomic_set(&phy->decode_error_count, 0);
+   }
+
+   if (emac_sgmii_irq_clear(adpt, status)) {
+   netdev_warn(adpt->netdev, "failed to clear SGMII interrupt\n");
+   schedule_work(&adpt->work_thread);
+   }
+
+   return IRQ_HANDLED;
+}
+
 static void emac_sgmii_reset_prepare(struct emac_adapter *adpt)
 {
struct emac_sgmii *phy = &adpt->phy;
@@ -129,6 +179,68 @@ void emac_sgmii_reset(struct emac_adapter *adpt)
   ret);
 }
 
+static int
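
The "reset only after multiple consecutive decode errors" policy from the
commit message can be modelled in user space. The following is a rough
sketch, not the driver code: it uses C11 atomics in place of the kernel's
atomic_t, a counter in place of schedule_work(), and the names sgmii_sim and
sgmii_sim_irq() are made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Same threshold as in the patch */
#define DECODE_ERROR_LIMIT 2

/* Hypothetical stand-in for the SGMII state kept by the driver */
struct sgmii_sim {
	atomic_int decode_error_count;
	int resets_scheduled;	/* stands in for schedule_work() */
};

/* Model one interrupt.  Returns true if it scheduled a reset. */
static bool sgmii_sim_irq(struct sgmii_sim *s, bool decode_error)
{
	int count;

	if (!decode_error) {
		/* Only consecutive decode errors count. */
		atomic_store(&s->decode_error_count, 0);
		return false;
	}

	/* atomic_fetch_add() returns the old value, so add one to get
	 * the equivalent of the kernel's atomic_inc_return().
	 */
	count = atomic_fetch_add(&s->decode_error_count, 1) + 1;
	if (count == DECODE_ERROR_LIMIT) {
		s->resets_scheduled++;
		atomic_store(&s->decode_error_count, 0);
		return true;
	}

	return false;
}
```

Note that in the patch as posted, SGMII_ISR_MASK equals
SGMII_PHY_INTERRUPT_ERR, so after the early return on a zero status the
handler appears to reach only the error branch; the sketch takes the
error/non-error distinction as an explicit parameter instead.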

[PATCH 0/5] [net-next] net: qcom/emac:

2017-01-27 Thread Timur Tabi

Although not related, these patches affect the same files, so they should
be applied in order.

The first patch cleans up logging of when the PHY driver is attached.

The second patch always configures the SGMII to use autonegotiation mode.

The third patch removes a redundant call to emac_mac_start().

The fourth patch removes some extraneous non-functioning WOL code.

The fifth patch adds an error handler for the SGMII block.

Timur Tabi (5):
  [net-next] net: qcom/emac: display the phy driver info after we
connect
  [net-next] net: qcom/emac: always use autonegotiation to configure the
SGMII link
  [net-next] net: qcom/emac: do not call emac_mac_start twice
  [net-next] net: qcom/emac: remove extraneous wake-on-lan code
  [net-next] net: qcom/emac: add an error interrupt handler for the
sgmii

 drivers/net/ethernet/qualcomm/emac/emac-mac.c   |  24 ++--
 drivers/net/ethernet/qualcomm/emac/emac-mac.h   |   1 -
 drivers/net/ethernet/qualcomm/emac/emac-phy.c   |   3 -
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 175 ++--
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.h |  16 ++-
 drivers/net/ethernet/qualcomm/emac/emac.c   |  10 +-
 drivers/net/ethernet/qualcomm/emac/emac.h   |   4 -
 7 files changed, 166 insertions(+), 67 deletions(-)

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

[PATCH 4/4] [net-next] net: qcom/emac: add an error interrupt handler for the sgmii

2017-01-27 Thread Timur Tabi

The SGMII (internal PHY) can report decode errors via an interrupt.  It
can also report autonegotiation status changes, but we don't need to track
those.  The SGMII can recover automatically from most decode errors, so
we only reset the interface if we get multiple consecutive errors.

It's possible for bogus decode errors to be reported while the link is
being brought up.  The interrupt is registered when the interface is
opened, and it's enabled after the link is up.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-mac.c   |   8 +-
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 126 +++-
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.h |  16 ++-
 drivers/net/ethernet/qualcomm/emac/emac.c   |  10 ++
 4 files changed, 153 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index 3f3cd00..a0bc8a85 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -961,12 +961,16 @@ static void emac_mac_rx_descs_refill(struct emac_adapter 
*adpt,
 static void emac_adjust_link(struct net_device *netdev)
 {
struct emac_adapter *adpt = netdev_priv(netdev);
+   struct emac_sgmii *sgmii = &adpt->phy;
struct phy_device *phydev = netdev->phydev;
 
-   if (phydev->link)
+   if (phydev->link) {
emac_mac_start(adpt);
-   else
+   sgmii->link_up(adpt);
+   } else {
+   sgmii->link_down(adpt);
emac_mac_stop(adpt);
+   }
 
phy_print_status(phydev);
 }
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c 
b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
index b5269c4..040b289 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
@@ -25,7 +25,9 @@
 #define EMAC_SGMII_PHY_SPEED_CFG1  0x0074
 #define EMAC_SGMII_PHY_IRQ_CMD 0x00ac
 #define EMAC_SGMII_PHY_INTERRUPT_CLEAR 0x00b0
+#define EMAC_SGMII_PHY_INTERRUPT_MASK  0x00b4
 #define EMAC_SGMII_PHY_INTERRUPT_STATUS 0x00b8
+#define EMAC_SGMII_PHY_RX_CHK_STATUS   0x00d4
 
 #define FORCE_AN_TX_CFG BIT(5)
 #define FORCE_AN_RX_CFG BIT(4)
@@ -36,6 +38,8 @@
 #define SPDMODE_100 BIT(0)
 #define SPDMODE_10 0
 
+#define CDR_ALIGN_DET  BIT(6)
+
 #define IRQ_GLOBAL_CLEAR   BIT(0)
 
 #define DECODE_CODE_ERR BIT(7)
@@ -44,6 +48,7 @@
 #define SGMII_PHY_IRQ_CLR_WAIT_TIME 10
 
 #define SGMII_PHY_INTERRUPT_ERR (DECODE_CODE_ERR | DECODE_DISP_ERR)
+#define SGMII_ISR_MASK (SGMII_PHY_INTERRUPT_ERR)
 
 #define SERDES_START_WAIT_TIMES 100
 
@@ -96,6 +101,51 @@ static int emac_sgmii_irq_clear(struct emac_adapter *adpt, 
u32 irq_bits)
return 0;
 }
 
+/* The number of decode errors that triggers a reset */
+#define DECODE_ERROR_LIMIT 2
+
+static irqreturn_t emac_sgmii_interrupt(int irq, void *data)
+{
+   struct emac_adapter *adpt = data;
+   struct emac_sgmii *phy = &adpt->phy;
+   u32 status;
+
+   status = readl(phy->base + EMAC_SGMII_PHY_INTERRUPT_STATUS);
+   status &= SGMII_ISR_MASK;
+   if (!status)
+   return IRQ_HANDLED;
+
+   /* If we get a decoding error and CDR is not locked, then try
+* resetting the internal PHY.  The internal PHY uses an embedded
+* clock with Clock and Data Recovery (CDR) to recover the
+* clock and data.
+*/
+   if (status & SGMII_PHY_INTERRUPT_ERR) {
+   int count;
+
+   /* The SGMII is capable of recovering from some decode
+* errors automatically.  However, if we get multiple
+* decode errors in a row, then assume that something
+* is wrong and reset the interface.
+*/
+   count = atomic_inc_return(&phy->decode_error_count);
+   if (count == DECODE_ERROR_LIMIT) {
+   schedule_work(&adpt->work_thread);
+   atomic_set(&phy->decode_error_count, 0);
+   }
+   } else {
+   /* We only care about consecutive decode errors. */
+   atomic_set(&phy->decode_error_count, 0);
+   }
+
+   if (emac_sgmii_irq_clear(adpt, status)) {
+   netdev_warn(adpt->netdev, "failed to clear SGMII interrupt\n");
+   schedule_work(&adpt->work_thread);
+   }
+
+   return IRQ_HANDLED;
+}
+
 static void emac_sgmii_reset_prepare(struct emac_adapter *adpt)
 {
struct emac_sgmii *phy = &adpt->phy;
@@ -129,6 +179,68 @@ void emac_sgmii_reset(struct emac_adapter *adpt)
   ret);
 }
 
+static int

[PATCH 1/5] [net-next] net: qcom/emac: display the phy driver info after we connect

2017-01-27 Thread Timur Tabi

The PHY driver is attached only when the driver calls
phy_connect_direct().  Calling phy_attached_print() to display
information about the PHY driver prior to that point is meaningless.
The interface can be brought down, a new PHY driver can be loaded,
and the interface then brought back up.  This is the correct time
to display information about the attached driver.

Since phy_attached_print() also prints information about the
interrupt, that needs to be set as well.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 4 +++-
 drivers/net/ethernet/qualcomm/emac/emac-phy.c | 3 ---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index e4793d7..155e273 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -981,6 +981,7 @@ int emac_mac_up(struct emac_adapter *adpt)
emac_mac_config(adpt);
emac_mac_rx_descs_refill(adpt, &adpt->rx_q);
 
+   adpt->phydev->irq = PHY_IGNORE_INTERRUPT;
ret = phy_connect_direct(netdev, adpt->phydev, emac_adjust_link,
 PHY_INTERFACE_MODE_SGMII);
if (ret) {
@@ -988,11 +989,12 @@ int emac_mac_up(struct emac_adapter *adpt)
return ret;
}
 
+   phy_attached_print(adpt->phydev, NULL);
+
/* enable mac irq */
writel((u32)~DIS_INT, adpt->base + EMAC_INT_STATUS);
writel(adpt->irq.mask, adpt->base + EMAC_INT_MASK);
 
-   adpt->phydev->irq = PHY_IGNORE_INTERRUPT;
phy_start(adpt->phydev);
 
napi_enable(&adpt->rx_q.napi);
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c 
b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index 1d7852f..441c1936 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -226,8 +226,5 @@ int emac_phy_config(struct platform_device *pdev, struct 
emac_adapter *adpt)
return -ENODEV;
}
 
-   if (adpt->phydev->drv)
-   phy_attached_print(adpt->phydev, NULL);
-
return 0;
 }
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

Re: [PATCH 1/2] Documentation: devicetree: change the mediatek ethernet compatible string

2017-01-27 Thread Rob Herring

On Wed, Jan 25, 2017 at 09:20:54AM +0100, John Crispin wrote:
> When the binding was defined, I was not aware that mt2701 was an earlier
> version of the SoC. For sake of consistency, the ethernet driver should
> use mt2701 inside the compat string as this is the earliest SoC with the
> ethernet core.
> 
> The ethernet driver is currently of no real use until we finish and
> upstream the DSA driver. There are no users of this binding yet. It should
> be safe to fix this now before it is too late and we need to provide
> backward compatibility for the mt7623-eth compat string.

Thanks for the explanation.

> Reported-by: Sean Wang 
> Signed-off-by: John Crispin 
> ---
>  Documentation/devicetree/bindings/net/mediatek-net.txt |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/mediatek-net.txt 
> b/Documentation/devicetree/bindings/net/mediatek-net.txt
> index c010faf..c7194e8 100644
> --- a/Documentation/devicetree/bindings/net/mediatek-net.txt
> +++ b/Documentation/devicetree/bindings/net/mediatek-net.txt
> @@ -7,7 +7,7 @@ have dual GMAC each represented by a child node..
>  * Ethernet controller node
>  
>  Required properties:
> -- compatible: Should be "mediatek,mt7623-eth"
> +- compatible: Should be "mediatek,mt2701-eth"

You should have both strings with 2701 being last. That way if you ever 
find a difference in the 7623, you don't need a DT update to fix it.

>  - reg: Address and length of the register set for the device
>  - interrupts: Should contain the three frame engines interrupts in numeric
>   order. These are fe_int0, fe_int1 and fe_int2.
> -- 
> 1.7.10.4
>
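
The point about listing both strings, most specific first, can be illustrated
with a simplified model of compatible matching. This is only a sketch: real
matching is done by the kernel's device-tree code, and match_compatible() is a
hypothetical helper that only models "the earliest entry in the node's list
that the driver knows wins".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the index (within the node's compatible
 * list) of the first string that also appears in the driver's table,
 * or -1 if none matches.  A lower index means a more specific match.
 */
static int match_compatible(const char *const *node_compat, size_t n_node,
			    const char *const *drv_table, size_t n_drv)
{
	size_t i, j;

	for (i = 0; i < n_node; i++)
		for (j = 0; j < n_drv; j++)
			if (strcmp(node_compat[i], drv_table[j]) == 0)
				return (int)i;

	return -1;
}
```

With a node listing "mediatek,mt7623-eth" followed by "mediatek,mt2701-eth",
a driver that only knows the mt2701 string still binds through the fallback,
and a later driver that adds an mt7623-specific entry matches the more
specific string without any change to the device tree.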

[PATCH v2] net: adaptec: starfire: add checks for dma mapping errors

2017-01-27 Thread Alexey Khoroshilov

init_ring(), refill_rx_ring() and start_tx() do not check
whether mapping DMA memory succeeds.
This patch adds the checks and the failure handling.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov 
---
 drivers/net/ethernet/adaptec/starfire.c | 45 +++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/adaptec/starfire.c 
b/drivers/net/ethernet/adaptec/starfire.c
index c12d2618eebf..3872ab96b80a 100644
--- a/drivers/net/ethernet/adaptec/starfire.c
+++ b/drivers/net/ethernet/adaptec/starfire.c
@@ -1152,6 +1152,12 @@ static void init_ring(struct net_device *dev)
if (skb == NULL)
break;
np->rx_info[i].mapping = pci_map_single(np->pci_dev, skb->data, 
np->rx_buf_sz, PCI_DMA_FROMDEVICE);
+   if (pci_dma_mapping_error(np->pci_dev,
+ np->rx_info[i].mapping)) {
+   dev_kfree_skb(skb);
+   np->rx_info[i].skb = NULL;
+   break;
+   }
/* Grrr, we cannot offset to correctly align the IP header. */
np->rx_ring[i].rxaddr = cpu_to_dma(np->rx_info[i].mapping | 
RxDescValid);
}
@@ -1182,8 +1188,9 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct 
net_device *dev)
 {
struct netdev_private *np = netdev_priv(dev);
unsigned int entry;
+   unsigned int prev_tx;
u32 status;
-   int i;
+   int i, j;
 
/*
 * be cautious here, wrapping the queue has weird semantics
@@ -1201,6 +1208,7 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct 
net_device *dev)
}
 #endif /* ZEROCOPY && HAS_BROKEN_FIRMWARE */
 
+   prev_tx = np->cur_tx;
entry = np->cur_tx % TX_RING_SIZE;
for (i = 0; i < skb_num_frags(skb); i++) {
int wrap_ring = 0;
@@ -1234,6 +1242,11 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct 
net_device *dev)
   skb_frag_size(this_frag),
   PCI_DMA_TODEVICE);
}
+   if (pci_dma_mapping_error(np->pci_dev,
+ np->tx_info[entry].mapping)) {
+   dev->stats.tx_dropped++;
+   goto err_out;
+   }
 
np->tx_ring[entry].addr = 
cpu_to_dma(np->tx_info[entry].mapping);
np->tx_ring[entry].status = cpu_to_le32(status);
@@ -1268,8 +1281,30 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct 
net_device *dev)
netif_stop_queue(dev);
 
return NETDEV_TX_OK;
-}
 
+err_out:
+   entry = prev_tx % TX_RING_SIZE;
+   np->tx_info[entry].skb = NULL;
+   if (i > 0) {
+   pci_unmap_single(np->pci_dev,
+np->tx_info[entry].mapping,
+skb_first_frag_len(skb),
+PCI_DMA_TODEVICE);
+   np->tx_info[entry].mapping = 0;
+   entry = (entry + np->tx_info[entry].used_slots) % TX_RING_SIZE;
+   for (j = 1; j < i; j++) {
+   pci_unmap_single(np->pci_dev,
+np->tx_info[entry].mapping,
+skb_frag_size(
+   &skb_shinfo(skb)->frags[j-1]),
+PCI_DMA_TODEVICE);
+   entry++;
+   }
+   }
+   dev_kfree_skb_any(skb);
+   np->cur_tx = prev_tx;
+   return NETDEV_TX_OK;
+}
 
 /* The interrupt handler does all of the Rx thread work and cleans up
after the Tx thread. */
@@ -1569,6 +1604,12 @@ static void refill_rx_ring(struct net_device *dev)
break;  /* Better luck next round. */
np->rx_info[entry].mapping =
pci_map_single(np->pci_dev, skb->data, 
np->rx_buf_sz, PCI_DMA_FROMDEVICE);
+   if (pci_dma_mapping_error(np->pci_dev,
+   np->rx_info[entry].mapping)) {
+   dev_kfree_skb(skb);
+   np->rx_info[entry].skb = NULL;
+   break;
+   }
np->rx_ring[entry].rxaddr =
cpu_to_dma(np->rx_info[entry].mapping | 
RxDescValid);
}
-- 
2.7.4
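
The unwind-on-mapping-failure pattern that the patch adds to start_tx() can be
modelled without hardware. The following is a simplified user-space sketch,
not the driver code: fake_ring, fake_map(), fake_unmap() and fake_start_tx()
are hypothetical names, a zero address stands in for a DMA mapping error, and
each fragment is assumed to occupy exactly one ring slot.

```c
#include <stdint.h>

#define RING_SIZE        8
#define MAP_FAILED_ADDR  0	/* hypothetical "mapping error" cookie */

struct fake_ring {
	uintptr_t mapping[RING_SIZE];
	unsigned int cur_tx;	/* producer index */
	int live_mappings;	/* mappings currently held */
	int fail_at;		/* fragment index that fails, -1 = never */
};

static uintptr_t fake_map(struct fake_ring *r, int frag)
{
	if (frag == r->fail_at)
		return MAP_FAILED_ADDR;
	r->live_mappings++;
	return (uintptr_t)0x1000 + (uintptr_t)frag;
}

static void fake_unmap(struct fake_ring *r, uintptr_t addr)
{
	(void)addr;
	r->live_mappings--;
}

/* Returns 0 if all fragments were queued, -1 if the packet was dropped. */
static int fake_start_tx(struct fake_ring *r, int nfrags)
{
	unsigned int prev_tx = r->cur_tx;	/* remember where we started */
	unsigned int entry;
	int i, j;

	for (i = 0; i < nfrags; i++) {
		entry = r->cur_tx % RING_SIZE;
		r->mapping[entry] = fake_map(r, i);
		if (r->mapping[entry] == MAP_FAILED_ADDR)
			goto err_out;
		r->cur_tx++;
	}
	return 0;

err_out:
	/* Undo the i mappings that succeeded, then restore the producer. */
	for (j = 0; j < i; j++) {
		entry = (prev_tx + j) % RING_SIZE;
		fake_unmap(r, r->mapping[entry]);
		r->mapping[entry] = 0;
	}
	r->cur_tx = prev_tx;
	return -1;
}
```

The sketch recomputes each slot index from prev_tx modulo the ring size, so
the unwind also works when the packet straddles the end of the ring. Unlike
the patch, which frees the skb and returns NETDEV_TX_OK on a drop, the sketch
returns -1 so that a caller can observe the failure.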

Re: [PATCH net-next 0/4] net: dsa: bcm_sf2: CFP support

2017-01-27 Thread Florian Fainelli

Hi Chris,

On 01/27/2017 01:24 PM, Chris Healy wrote:
> Hi Florian,
> 
> In saying the below, I may just be showing my naivety but here goes:
> 
> If I understand this correctly, what you are using is similar to the
> TCAM hardware present in the newer Marvell switches.  I think Pablo is
> doing some work with nftables and HW offload using TCAM HW.  Is there
> overlap here?  It seems that one or the other API should be used but
> not both.

Well, the problem is that there is overlap with 3 different unrelated
subsystems accessing the same HW here: tc, ethtool, and netfilter, all
(two at least) with different ways of formatting input parameters, as I
pointed out a while back in this thread:
https://www.mail-archive.com/netdev@vger.kernel.org/msg126321.html

My angle on this submission is the following, purely based on pragmatism:

- I have real users behind this feature who are currently very happy
with how this works using ethtool, switching them to netlink, tc,
netfilter is not trivial, but could be done in the long run, not just
now. At the very least, this serves as reference code for people who are
curious to see how Broadcom's CFP works

- cls_flower was looked at, it is missing a critical feature IMHO which
is the ability to specify a rule index, and the amount of code necessary
to validate input parameters is just totally insane, just like the fact
that there is not a common intermediate input representation (ala
ethtool_rx_flow_spec) makes it impractical

- I have heard about the work Pablo is doing, but until it is publicly
submitted and reviewed, it's hard to project what it is going to look like

Thanks!
--
Florian

Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver

2017-01-27 Thread Sowmini Varadhan

On (01/27/17 15:51), Willem de Bruijn wrote:
:
> - limit capable() check to drivers with a .validate callback
(aka second option below)
:
> - let privileged applications shoot themselves in the foot (change nothing).

> The second will break variable length header protocols unless
> you exhaustively search for all variable length protocols and add
> validate callbacks first.

other than ax25, are there variable length header protocols out there
without ->validate, and which need the CAP_SYS_RAWIO branch?

I realize that, to an extent, even ethernet is a protocol whose
header is > 14 with vlan, but from the google search, seems like it
was mostly ax25 that really triggered a large part of the check.

If we think that there are a large number of these (that don't have a 
->validate, to fix up things as desired) I'd just go for the "change
nothing in pf_packet" option.

As I found out many drivers like ixgbe and sunvnet have defensive checks
in the Tx path anyway, and xen_netfront can also join that group with
a few simple checks.

Re: [net 7/8] net/mlx5e: Fix update of hash function/key via ethtool

2017-01-27 Thread Tom Herbert

On Fri, Jan 27, 2017 at 12:38 PM, Saeed Mahameed  wrote:
> From: Gal Pressman 
>
> Modifying TIR hash should change selected fields bitmask in addition to
> the function and key.
> Formerly, we would not set this field resulting in zeroing of its value,
> which means no packet fields are used for RX RSS hash calculation thus
> causing all traffic to arrive in RQ[0].
>
This commit log is rather scant on details. Does this mean that RSS is
somehow broken in mlx5? What is the exact test that demonstrates the bad
behavior? Did you verify that this doesn't break IPv4 or IPv6?

> Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change")
> Signed-off-by: Gal Pressman 
> Signed-off-by: Saeed Mahameed 
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en.h   |   3 +-
>  .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  13 +-
>  drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 198 ++---
>  3 files changed, 109 insertions(+), 105 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
> b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> index 1619147a63e8..d5ecb8f53fd4 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> @@ -791,7 +791,8 @@ void mlx5e_disable_vlan_filter(struct mlx5e_priv *priv);
>  int mlx5e_modify_rqs_vsd(struct mlx5e_priv *priv, bool vsd);
>
>  int mlx5e_redirect_rqt(struct mlx5e_priv *priv, u32 rqtn, int sz, int ix);
> -void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv);
> +void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc,
> +   enum mlx5e_traffic_types tt);
>
>  int mlx5e_open_locked(struct net_device *netdev);
>  int mlx5e_close_locked(struct net_device *netdev);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
> index 6f4eb34259f0..bb67863aa361 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
> @@ -980,15 +980,18 @@ static int mlx5e_get_rxfh(struct net_device *netdev, 
> u32 *indir, u8 *key,
>
>  static void mlx5e_modify_tirs_hash(struct mlx5e_priv *priv, void *in, int 
> inlen)
>  {
> -   struct mlx5_core_dev *mdev = priv->mdev;
> void *tirc = MLX5_ADDR_OF(modify_tir_in, in, ctx);
> -   int i;
> +   struct mlx5_core_dev *mdev = priv->mdev;
> +   int ctxlen = MLX5_ST_SZ_BYTES(tirc);
> +   int tt;
>
> MLX5_SET(modify_tir_in, in, bitmask.hash, 1);
> -   mlx5e_build_tir_ctx_hash(tirc, priv);
>
> -   for (i = 0; i < MLX5E_NUM_INDIR_TIRS; i++)
> -   mlx5_core_modify_tir(mdev, priv->indir_tir[i].tirn, in, 
> inlen);
> +   for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
> +   memset(tirc, 0, ctxlen);
> +   mlx5e_build_indir_tir_ctx_hash(priv, tirc, tt);
> +   mlx5_core_modify_tir(mdev, priv->indir_tir[tt].tirn, in, 
> inlen);
> +   }
>  }
>
>  static int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir,
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index 948351ae5bd2..f14ca3385fdd 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -2022,8 +2022,23 @@ static void mlx5e_build_tir_ctx_lro(void *tirc, struct 
> mlx5e_priv *priv)
> MLX5_SET(tirc, tirc, lro_timeout_period_usecs, 
> priv->params.lro_timeout);
>  }
>
> -void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv)
> +void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc,
> +   enum mlx5e_traffic_types tt)
>  {
> +   void *hfso = MLX5_ADDR_OF(tirc, tirc, rx_hash_field_selector_outer);
> +
> +#define MLX5_HASH_IP (MLX5_HASH_FIELD_SEL_SRC_IP   |\
> +MLX5_HASH_FIELD_SEL_DST_IP)
> +
> +#define MLX5_HASH_IP_L4PORTS (MLX5_HASH_FIELD_SEL_SRC_IP   |\
> +MLX5_HASH_FIELD_SEL_DST_IP   |\
> +MLX5_HASH_FIELD_SEL_L4_SPORT |\
> +MLX5_HASH_FIELD_SEL_L4_DPORT)
> +
> +#define MLX5_HASH_IP_IPSEC_SPI  (MLX5_HASH_FIELD_SEL_SRC_IP   |\
> +MLX5_HASH_FIELD_SEL_DST_IP   |\
> +MLX5_HASH_FIELD_SEL_IPSEC_SPI)
> +
> MLX5_SET(tirc, tirc, rx_hash_fn,
>  mlx5e_rx_hash_fn(priv->params.rss_hfunc));
> if (priv->params.rss_hfunc == ETH_RSS_HASH_TOP) {
> @@ -2035,6 +2050,88 @@ void mlx5e_build_tir_ctx_hash(void *tirc, struct 
> mlx5e_priv *priv)
> MLX5_SET(tirc, tirc, rx_hash_symmetric, 1);
> memcpy(rss_key, priv->params.toeplitz_hash_key, len);
> }
> +
> +
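
The quoted commit message says that the selected-fields bitmask has to be
rewritten together with the hash function and key, otherwise it ends up zero
and no packet fields feed the RSS hash. A rough sketch of a per-traffic-type
selector is shown below. It mirrors the three macro groups in the quoted diff,
but the bit positions, the traffic_type enum, the mapping of traffic types to
groups and the helper hash_fields_for() are all assumptions for illustration,
since the quoted diff is cut off before that part.

```c
#include <stdint.h>

/* Placeholder bit positions; the real MLX5_HASH_FIELD_SEL_* values come
 * from the mlx5 headers and are not shown in the quoted diff.
 */
#define SEL_SRC_IP     (1u << 0)
#define SEL_DST_IP     (1u << 1)
#define SEL_L4_SPORT   (1u << 2)
#define SEL_L4_DPORT   (1u << 3)
#define SEL_IPSEC_SPI  (1u << 4)

/* Same grouping as MLX5_HASH_IP, MLX5_HASH_IP_L4PORTS and
 * MLX5_HASH_IP_IPSEC_SPI in the quoted patch.
 */
#define HASH_IP            (SEL_SRC_IP | SEL_DST_IP)
#define HASH_IP_L4PORTS    (HASH_IP | SEL_L4_SPORT | SEL_L4_DPORT)
#define HASH_IP_IPSEC_SPI  (HASH_IP | SEL_IPSEC_SPI)

/* Hypothetical, reduced set of traffic types */
enum traffic_type { TT_TCP, TT_UDP, TT_IPSEC_ESP, TT_IP_ONLY };

/* Assumed mapping: pick the hash fields for one traffic type. */
static uint32_t hash_fields_for(enum traffic_type tt)
{
	switch (tt) {
	case TT_TCP:
	case TT_UDP:
		return HASH_IP_L4PORTS;
	case TT_IPSEC_ESP:
		return HASH_IP_IPSEC_SPI;
	case TT_IP_ONLY:
	default:
		return HASH_IP;
	}
}
```

In this model no traffic type yields an empty mask, which is the property the
fix is described as restoring: a zero mask would mean every packet hashes to
the same value and lands in the same receive queue.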

net: suspicious RCU usage in nf_hook

2017-01-27 Thread Dmitry Vyukov

Hello,

I've got the following report while running syzkaller fuzzer on
fd694aaa46c7ed811b72eb47d5eb11ce7ab3f7f1:

[ INFO: suspicious RCU usage. ]
4.10.0-rc5+ #192 Not tainted
---
./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side
critical section!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 0
2 locks held by syz-executor14/23111:
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
include/net/sock.h:1454 [inline]
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: []
rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919
 #1:  (rcu_read_lock){..}, at: [] nf_hook
include/linux/netfilter.h:201 [inline]
 #1:  (rcu_read_lock){..}, at: []
__ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160

stack backtrace:
CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
 rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
 ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sock_wfree+0xae/0x120 net/core/sock.c:1645
 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put include/net/inet_frag.h:133 [inline]
 nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook include/linux/netfilter.h:212 [inline]
 __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
 rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
 rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x600 net/socket.c:848
 do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
 do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
 vfs_writev+0x87/0xc0 fs/read_write.c:911
 do_writev+0x110/0x2c0 fs/read_write.c:944
 SYSC_writev fs/read_write.c:1017 [inline]
 SyS_writev+0x27/0x30 fs/read_write.c:1014
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x445559
RSP: 002b:7f6f46fceb58 EFLAGS: 0292 ORIG_RAX: 0014
RAX: ffda RBX: 0005 RCX: 00445559
RDX: 0001 RSI: 20f1eff0 RDI: 0005
RBP: 006e19c0 R08:  R09: 
R10:  R11: 0292 R12: 0070
R13: 20f59000 R14: 0015 R15: 00020400
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:752
in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14
INFO: lockdep is turned off.
CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780
 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sock_wfree+0xae/0x120 net/core/sock.c:1645
 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put include/net/inet_frag.h:133
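
Read bottom-up, both traces show the same pattern: nf_hook() holds rcu_read_lock() while the skb free path reaches static_key_slow_dec(), which takes a mutex and may therefore sleep. The following is only a user-space toy model of that check, not kernel code; every toy_* name is a hypothetical stand-in.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model only: tracks nesting of read-side critical sections. */
static int toy_rcu_depth;
static int toy_violations;

static void toy_rcu_read_lock(void)
{
	toy_rcu_depth++;
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_rcu_depth > 0);
	toy_rcu_depth--;
}

/* Stand-in for might_sleep(): sleeping inside a read-side section is a bug. */
static void toy_might_sleep(const char *who)
{
	if (toy_rcu_depth > 0) {
		toy_violations++;
		fprintf(stderr, "toy: %s may sleep in read-side section\n", who);
	}
}

/* Stand-in for mutex_lock(), which is allowed to sleep. */
static void toy_mutex_lock(void)
{
	toy_might_sleep("mutex_lock");
}

/* Stand-in for the free path that ends in static_key_slow_dec(). */
static void toy_free_last_skb_ref(void)
{
	toy_mutex_lock();
}

/* Mirrors the order of events in the trace; returns violations seen. */
static int toy_send_path(void)
{
	toy_violations = 0;
	toy_rcu_read_lock();      /* nf_hook() in the trace */
	toy_free_last_skb_ref();  /* kfree_skb() -> ... -> mutex_lock() */
	toy_rcu_read_unlock();
	return toy_violations;
}
```

In this model the same free path is harmless when it runs outside the read-side section; only the nesting triggers the report.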

Re: [PATCH net-next 1/4] mlx5: Make building eswitch configurable

2017-01-27 Thread Saeed Mahameed

On Fri, Jan 27, 2017 at 8:42 PM, Tom Herbert  wrote:
> On Fri, Jan 27, 2017 at 10:28 AM, Saeed Mahameed
>  wrote:
>> On Fri, Jan 27, 2017 at 8:16 PM, Tom Herbert  wrote:
>>> On Fri, Jan 27, 2017 at 10:05 AM, Saeed Mahameed
>>>  wrote:
>>>> On Fri, Jan 27, 2017 at 7:50 PM, Tom Herbert  wrote:
>>>>> On Fri, Jan 27, 2017 at 9:38 AM, Saeed Mahameed
>>>>>  wrote:
>>>>>> On Fri, Jan 27, 2017 at 7:34 AM, Or Gerlitz  wrote:
>>>>>>> On Fri, Jan 27, 2017 at 1:32 AM, Tom Herbert  wrote:
>>>>>>>> Add a configuration option (CONFIG_MLX5_CORE_ESWITCH) for controlling
>>>>>>>> whether the eswitch code is built. Change Kconfig and Makefile
>>>>>>>> accordingly.
>>>>>>>
>>>>>>> Tom, FWIW, please note that the basic e-switch functionality is needed
>>>>>>> also when SRIOV isn't of use, this is for a multi host configuration.
>>>>>>>
>>>>>>
>>>>>> Right, set_l2_table_entry@eswitch.c needs to be called by PF for any UC
>>>>>> MAC address wanted by VF or PF.
>>>>>> To keep one flow in the code, the implementation is done as part of
>>>>>> eswitch.
>>>>>>
>>>>>> so in multi-host configuration (where there are 4 PFs) each PF should
>>>>>> invoke set_l2_table_entry_cmd for each one of its own UC MACs.
>>>>>>
>>>>>> populating the l2 table is done using the whole eswitch event driven
>>>>>> mechanisms, it is not easy and IMHO not right to separate eswitch
>>>>>> tables from l2 table (same management logic, different tables).
>>>>>>
>>>>>> Anyways as Or stated this is just an FYI, eswitch needs to be enabled
>>>>>> on Multi-host configuration.
>>>>>>
>>>>> What indicates a multi-host configuration?
>>>>
>>>> nothing in the driver, it is transparent.
>>>>
>>> So then we always need the eswitch code to be built even if someone
>>> never uses any of it?
>>>
>>
>> yes.
>> but for your convenience all you need is to compile eswitch.c.
>> eswitch_offloads.c and en_rep.c can be compiled out for basic ethernet.
>>
> Well eswitch.c is 2200 LOC. en_rep.c and eswitch_offloads.c are 1600
> LOC. If we _must_ have eswitch.c then there's probably not much point
> then. But I am still finding it hard to fathom that eswitch has now
> become a mandatory component of Ethernet drivers/devices.
>

It is only mandatory for configurations that need eswitch, where the
driver has no way to know about them, for a good old bare metal box,
eswitch is not needed.

we can do some work to strip the l2 table logic - needed for PFs to
work on multi-host - out of eswitch but again that would further
complicate the driver code since eswitch will still need to update l2
tables for VFs.


> Tom
>
>
>>> Or.
>>>
>>> My WW (and same for the rest of the IL team..) has ended so I will be
>>> able to look further into this series and comment on Sunday.

[net-next v2] openvswitch: Simplify do_execute_actions().

2017-01-27 Thread Andy Zhou

do_execute_actions() implements a worthwhile optimization: in case
an output action is the last action in an action list, skb_clone()
can be avoided by outputting the current skb. However, the
implementation is more complicated than necessary.  This patch
simplifies this logic.

Signed-off-by: Andy Zhou 
---
v1->v2:  drop skb NULL check in do_output()
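
For readers skimming the diff, the control flow after this change can be sketched in user space as below. This is an illustration only: the toy_* names are hypothetical, buffers are replaced by counters, and the case where the clone allocation fails is left out.

```c
#include <assert.h>
#include <stddef.h>

enum toy_action { TOY_OUTPUT, TOY_OTHER };

struct toy_stats {
	int clones;   /* buffers duplicated for a non-final output */
	int outputs;  /* buffers handed to an output port */
	int consumed; /* original buffer freed without being sent */
};

/* Walk the action list; only a non-final output needs a clone. */
static struct toy_stats toy_execute(const enum toy_action *acts, size_t n)
{
	struct toy_stats st = { 0, 0, 0 };
	size_t i;

	assert(acts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (acts[i] != TOY_OUTPUT)
			continue;

		if (i == n - 1) {
			/* Last action: send the original buffer itself. */
			st.outputs++;
			return st;
		}

		/* More actions follow: send a clone, keep the original. */
		st.clones++;
		st.outputs++;
	}

	/* The original was never used for output, so free it. */
	st.consumed = 1;
	return st;
}
```

An action list of output, other, output needs one clone and two outputs, and the original buffer is sent rather than freed.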


---
 net/openvswitch/actions.c | 42 --
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 514f7bc..efa9a88 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1141,12 +1141,6 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
  struct sw_flow_key *key,
  const struct nlattr *attr, int len)
 {
-   /* Every output action needs a separate clone of 'skb', but the common
-* case is just a single output action, so that doing a clone and
-* then freeing the original skbuff is wasteful.  So the following code
-* is slightly obscure just to avoid that.
-*/
-   int prev_port = -1;
const struct nlattr *a;
int rem;
 
@@ -1154,20 +1148,28 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 a = nla_next(a, &rem)) {
int err = 0;
 
-   if (unlikely(prev_port != -1)) {
-   struct sk_buff *out_skb = skb_clone(skb, GFP_ATOMIC);
-
-   if (out_skb)
-   do_output(dp, out_skb, prev_port, key);
+   switch (nla_type(a)) {
+   case OVS_ACTION_ATTR_OUTPUT: {
+   int port = nla_get_u32(a);
+   struct sk_buff *clone;
+
+   /* Every output action needs a separate clone
+* of 'skb'. In case the output action is the
+* last action, cloning can be avoided.
+*/
+   if (nla_is_last(a, rem)) {
+   do_output(dp, skb, port, key);
+   /* 'skb' has been used for output.
+*/
+   return 0;
+   }
 
+   clone = skb_clone(skb, GFP_ATOMIC);
+   if (clone)
+   do_output(dp, clone, port, key);
OVS_CB(skb)->cutlen = 0;
-   prev_port = -1;
-   }
-
-   switch (nla_type(a)) {
-   case OVS_ACTION_ATTR_OUTPUT:
-   prev_port = nla_get_u32(a);
break;
+   }
 
case OVS_ACTION_ATTR_TRUNC: {
struct ovs_action_trunc *trunc = nla_data(a);
@@ -1257,11 +1259,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
}
}
 
-   if (prev_port != -1)
-   do_output(dp, skb, prev_port, key);
-   else
-   consume_skb(skb);
-
+   consume_skb(skb);
return 0;
 }
 
-- 
1.8.3.1

[RFC PATCH 2/2] ixgbe: add af_packet direct copy support

2017-01-27 Thread John Fastabend

This implements the ndo ops for direct dma socket option. This is
to start looking at the interface and driver work needed to enable
it.

Note error paths are not handled and I'm aware of a few bugs.
For example interface must be up before attaching socket or else
it will fail silently. TBD fix all these things.

Signed-off-by: John Fastabend 
---
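
The refill check in this patch relies on the tpacket2 header sitting a fixed, cache-aligned distance in front of the packet data. A minimal user-space sketch of that layout follows, assuming a 64-byte cache line (the kernel's L1_CACHE_BYTES is not visible from user space) and using hypothetical helper names; it mirrors the patch's `page_address(bi->page) + bi->page_offset - hdrlen` expression without being driver code.

```c
#include <assert.h>
#include <string.h>
#include <linux/if_packet.h>

/* Assumption: 64-byte cache lines stand in for L1_CACHE_BYTES. */
#define ASSUMED_L1_CACHE_BYTES 64u
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Header room reserved in front of each packet, as computed in the patch. */
static unsigned int ddma_hdrlen(void)
{
	return ALIGN_UP((unsigned int)TPACKET_ALIGN(TPACKET2_HDRLEN),
			ASSUMED_L1_CACHE_BYTES);
}

/*
 * Returns 1 while user space still owns the slot whose packet data
 * starts at pkt_data, 0 once it has been handed back to the kernel.
 */
static int slot_owned_by_user(const unsigned char *pkt_data)
{
	struct tpacket2_hdr hdr;

	assert(pkt_data != NULL);
	memcpy(&hdr, pkt_data - ddma_hdrlen(), sizeof(hdr));
	return (hdr.tp_status & TP_STATUS_USER) != 0;
}
```

The driver stops refilling as soon as it meets a slot for which this check is true, since the hardware must not DMA into memory the application is still reading.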
 drivers/net/ethernet/intel/ixgbe/ixgbe.h  |3 
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  255 +
 2 files changed, 256 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h 
b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index ef81c3d..198f90c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -266,6 +266,7 @@ struct ixgbe_ring {
struct ixgbe_tx_buffer *tx_buffer_info;
struct ixgbe_rx_buffer *rx_buffer_info;
};
+   unsigned int buffer_size;
unsigned long state;
u8 __iomem *tail;
dma_addr_t dma; /* phys. address of descriptor ring */
@@ -299,6 +300,8 @@ struct ixgbe_ring {
struct ixgbe_tx_queue_stats tx_stats;
struct ixgbe_rx_queue_stats rx_stats;
};
+   bool ddma;/* ring data buffers mapped to userspace */
+   struct sock *rx_kick; /* rx kick userspace */
 } cacheline_internodealigned_in_smp;
 
 enum ixgbe_ring_f_enum {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 1e2f39e..c5ab44a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1608,8 +1608,19 @@ void ixgbe_alloc_rx_buffers(struct ixgbe_ring *rx_ring, u16 cleaned_count)
i -= rx_ring->count;
 
do {
-   if (!ixgbe_alloc_mapped_page(rx_ring, bi))
-   break;
+   /* If direct dma has not released packet yet stop */
+   if (rx_ring->ddma) {
+   int align = TPACKET_ALIGN(TPACKET2_HDRLEN);
+   int hdrlen = ALIGN(align, L1_CACHE_BYTES);
+   struct tpacket2_hdr *hdr;
+
+   hdr = page_address(bi->page) + bi->page_offset - hdrlen;
+   if (unlikely(TP_STATUS_USER & hdr->tp_status))
+   break;
+   } else {
+   if (!ixgbe_alloc_mapped_page(rx_ring, bi))
+   break;
+   }
 
/*
 * Refresh the desc even if buffer_addrs didn't change
@@ -2005,6 +2016,97 @@ static bool ixgbe_add_rx_frag(struct ixgbe_ring *rx_ring,
return true;
 }
 
+/* ixgbe_do_ddma - direct dma routine to populate PACKET_RX_RING mmap region
+ *
+ * The packet socket interface builds a shared memory region using mmap after
+ * it is specified by the PACKET_RX_RING socket option. This will create a
+ * circular ring of memory slots. Typical software usage case copies the skb
+ * into these pages via tpacket_rcv() routine.
+ *
+ * Here we do direct DMA from the hardware (82599 in this case) into the
+ * mmap regions and populate the uhdr (think user space descriptor). This
+ * requires the hardware to support Scatter Gather and HighDMA which should
+ * be standard on most (all?) 10/40 Gbps devices.
+ *
+ * The buffer mapping should have already been done so that rx_buffer pages
+ * are handed to the driver from the mmap setup done at the socket layer.
+ *
+ * See ./include/uapi/linux/if_packet.h for details on packet layout here
+ * we can only use tpacket2_hdr type. v3 of the header type introduced bulk
+ * polling modes which do not work correctly with hardware DMA engine. The
+ * primary issue is we can not stop a DMA transaction from occurring after it
+ * has been configured. What results is the software timer advances the
+ * ring ahead of the hardware and the ring state is lost. Maybe there is
+ * a clever way to resolve this but I haven't thought it up yet.
+ */
+static int ixgbe_do_ddma(struct ixgbe_ring *rx_ring,
+union ixgbe_adv_rx_desc *rx_desc)
+{
+   int hdrlen = ALIGN(TPACKET_ALIGN(TPACKET2_HDRLEN), L1_CACHE_BYTES);
+   struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev);
+   struct ixgbe_rx_buffer *rx_buffer;
+   struct tpacket2_hdr *h2; /* userspace descriptor */
+   struct sockaddr_ll *sll;
+   struct ethhdr *eth;
+   int len = 0;
+   u64 ns = 0;
+   s32 rem;
+
+   rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+   if (!rx_buffer->dma)
+   return -EBUSY;
+
+   prefetchw(rx_buffer->page);
+
+   /* test for any known error cases */
+   WARN_ON(ixgbe_test_staterr(rx_desc,
+  IXGBE_RXDADV_ERR_FRAME_ERR_MASK) &&
+  !(rx_ring->netdev->features &

[RFC PATCH 1/2] af_packet: direct dma for packet ineterface

2017-01-27 Thread John Fastabend

This adds ndo ops for upper layer objects to request direct DMA from
the network interface into memory "slots". The slots must be DMA'able
memory given by a page/offset/size vector in a packet_ring_buffer
structure.

The PF_PACKET socket interface can use these ndo_ops to do zerocopy
RX from the network device into memory mapped userspace memory. For
this to work drivers encode the correct descriptor blocks and headers
so that existing PF_PACKET applications work without any modification.
This only supports the V2 header formats for now, and works by mapping
a ring of the network device to these slots. Originally I used V2
header formats but this does complicate the driver a bit.

V3 header formats added bulk polling via socket calls and timers
used in the polling interface to return every n milliseconds. Currently,
I don't see any way to support this in hardware because we can't
know if the hardware is in the middle of a DMA operation or not
on a slot. So when a timer fires I don't know how to advance the
descriptor ring leaving empty descriptors similar to how the software
ring works. The easiest (best?) route is to simply not support this.

It might be worth creating a new v4 header that is simple for drivers
to support direct DMA ops with. I can imagine using the xdp_buff
structure as a header for example. Thoughts?

The ndo operations and new socket option PACKET_RX_DIRECT work by
giving a queue_index to run the direct dma operations over. Once
setsockopt returns successfully the indicated queue is mapped
directly to the requesting application and can not be used for
other purposes. Also any kernel layers such as tc will be bypassed
and need to be implemented in the hardware via some other mechanism
such as tc offload or other offload interfaces.

Users steer traffic to the selected queue using flow director,
tc offload infrastructure or via macvlan offload.

The new socket option added to PF_PACKET is called PACKET_RX_DIRECT.
It takes a single unsigned int value specifying the queue index,

 setsockopt(sock, SOL_PACKET, PACKET_RX_DIRECT,
            &queue_index, sizeof(queue_index));
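
A slightly fuller sketch of that call with error handling is shown below. The helper name is hypothetical, and the numeric value of PACKET_RX_DIRECT is deliberately not hardcoded: it is passed in by the caller, who is assumed to take it from the patched include/uapi/linux/if_packet.h.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263 /* Linux value; normally provided by the libc headers */
#endif

/*
 * Hypothetical helper: bind an af_packet socket to one hardware queue.
 * 'rx_direct_opt' must be the PACKET_RX_DIRECT value from the patched
 * uapi header. Returns 0 on success or a negative errno value.
 */
static int packet_enable_rx_direct(int sock, int rx_direct_opt,
				   unsigned int queue_index)
{
	if (setsockopt(sock, SOL_PACKET, rx_direct_opt,
		       &queue_index, sizeof(queue_index)) < 0)
		return -errno;

	return 0;
}
```

Returning the negative errno lets the caller tell an unsupported option apart from a bad socket descriptor.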

Implementing busy_poll support will allow userspace to kick the
driver's receive routine if needed. This work is TBD.

To test this I hacked a hardcoded test into the tool psock_tpacket
in the selftests kernel directory here:

 ./tools/testing/selftests/net/psock_tpacket.c

Running this tool opens a socket and listens for packets over
the PACKET_RX_DIRECT enabled socket. Obviously it needs to be
reworked to enable all the older tests and not hardcode my
interface before it actually gets released.

In general this is a rough patch to explore the interface and
put something concrete up for debate. The patch does not handle
all the error cases correctly and needs to be cleaned up.

Known Limitations (TBD):

 (1) Users are required to match the number of rx ring
 slots with ethtool to the number requested by the
 setsockopt PF_PACKET layout. In the future we could
 possibly do this automatically.

 (2) Users need to configure Flow director or setup_tc
 to steer traffic to the correct queues. I don't believe
 this needs to be changed it seems to be a good mechanism
 for driving directed dma.

 (3) Not supporting timestamps or priv space yet, pushing
 a v4 packet header would resolve this nicely.

 (4) Only RX supported so far. TX already supports direct DMA
 interface but uses skbs which is really not needed. In
 the TX_RING case we can optimize this path as well.

To support TX case we can do a similar "slots" mechanism and
kick operation. The kick could be a busy_poll like operation
but on the TX side. The flow would be user space loads up
n number of slots with packets, kicks tx busy poll bit, the
driver sends packets, and finally when xmit is complete
clears header bits to give slots back. When we have qdisc
bypass set today we already bypass the entire stack so no
particular reason to use skb's in this case. Using xdp_buff
as a v4 packet header would also allow us to consolidate
driver code.
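
The TX flow described above (load slots, kick, transmit, hand slots back) can be modelled in user space as follows. This is a toy sketch only: the slot states are hypothetical and loosely modelled on the TP_STATUS_* TX flags, and "transmit" is reduced to flipping the status back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot states, loosely modelled on the TP_STATUS_* TX flags. */
enum toy_slot_state { TOY_AVAILABLE = 0, TOY_SEND_REQUEST = 1 };

#define TOY_RING_SLOTS 8

struct toy_tx_ring {
	enum toy_slot_state status[TOY_RING_SLOTS];
	size_t user_head; /* next slot user space fills */
	size_t drv_head;  /* next slot the driver transmits */
};

/* User space: queue one packet; returns 0, or -1 if the ring is full. */
static int toy_user_load(struct toy_tx_ring *r)
{
	assert(r != NULL);

	if (r->status[r->user_head] != TOY_AVAILABLE)
		return -1;

	r->status[r->user_head] = TOY_SEND_REQUEST;
	r->user_head = (r->user_head + 1) % TOY_RING_SLOTS;
	return 0;
}

/* Driver: runs on a kick, sends queued slots and gives them back. */
static int toy_driver_kick(struct toy_tx_ring *r)
{
	int sent = 0;

	assert(r != NULL);

	while (r->status[r->drv_head] == TOY_SEND_REQUEST) {
		/* xmit complete: clear the header bits to return the slot */
		r->status[r->drv_head] = TOY_AVAILABLE;
		r->drv_head = (r->drv_head + 1) % TOY_RING_SLOTS;
		sent++;
	}

	return sent;
}
```

Ownership is carried entirely by the per-slot status word, so neither side needs a shared lock.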

To be done:

 (1) More testing and performance analysis
 (2) Busy polling sockets
 (3) Implement v4 xdp_buff headers for analysis
 (4) performance testing :/ hopefully it looks good.

Signed-off-by: John Fastabend 
---
 include/linux/netdevice.h   |8 +++
 include/net/af_packet.h |   64 +++
 include/uapi/linux/if_packet.h  |1 
 net/packet/af_packet.c  |   37 
 net/packet/internal.h   |   60 -
 tools/testing/selftests/net/psock_tpacket.c |   51 +++---
 6 files changed, 154 insertions(+), 67 deletions(-)
 create mode 100644 include/net/af_packet.h

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index

Re: [PATCH net-next 0/4] net: dsa: bcm_sf2: CFP support

2017-01-27 Thread Chris Healy

Hi Florian,

In saying the below, I may just be showing my naivety but here goes:

If I understand this correctly, what you are using is similar to the
TCAM hardware present in the newer Marvell switches.  I think Pablo is
doing some work with nftables and HW offload using TCAM HW.  Is there
overlap here?  It seems that one or the other API should be used but
not both.

Regards,

Chris

On Fri, Jan 27, 2017 at 1:05 PM, Florian Fainelli  wrote:
> Hi all,
>
> This patch series adds support for the Broadcom Compact Field Processor (CFP)
> which is a classification and matching engine built into most Broadcom
> switches.
>
> We support that using ethtool::rxnfc because it allows all known use cases
> from the users I support to work, and more importantly, it allows the
> selection of a target rule index, which is later used by e.g: offloading
> hardware, this is an essential feature that I could not find being supported
> with cls_* for instance.
>
> Thanks
>
> Florian Fainelli (4):
>   net: dsa: Hook {get,set}_rxnfc ethtool operations
>   net: dsa: bcm_sf2: Configure traffic classes to queue mapping
>   net: dsa: bcm_sf2: Add CFP registers definitions
>   net: dsa: bcm_sf2: Add support for ethtool::rxnfc
>
>  drivers/net/dsa/Makefile   |   2 +-
>  drivers/net/dsa/bcm_sf2.c  |  23 ++
>  drivers/net/dsa/bcm_sf2.h  |  17 ++
>  drivers/net/dsa/bcm_sf2_cfp.c  | 613 +
>  drivers/net/dsa/bcm_sf2_regs.h | 150 ++
>  include/net/dsa.h  |   8 +
>  net/dsa/slave.c|  26 ++
>  7 files changed, 838 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c
>
> --
> 2.9.3
>

[RFC PATCH 0/2] rx zero copy interface for af_packet

2017-01-27 Thread John Fastabend

This is an experimental implementation of rx zero copy for af_packet.
It's a bit rough and likely has errors but the plan is to clean it up
over the next few months.

And seeing I said I would post it in another thread a few days back
here it is.

Comments welcome and use at your own risk.

Thanks,
John


---

John Fastabend (2):
  af_packet: direct dma for packet ineterface
  ixgbe: add af_packet direct copy support


 drivers/net/ethernet/intel/ixgbe/ixgbe.h  |3 
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  255 +
 include/linux/netdevice.h |8 +
 include/net/af_packet.h   |   64 ++
 include/uapi/linux/if_packet.h|1 
 net/packet/af_packet.c|   37 
 net/packet/internal.h |   60 --
 tools/testing/selftests/net/psock_tpacket.c   |   51 -
 8 files changed, 410 insertions(+), 69 deletions(-)
 create mode 100644 include/net/af_packet.h


Re: [net-next] openvswitch: Simplify do_execute_actions().

2017-01-27 Thread Andy Zhou

On Fri, Jan 27, 2017 at 12:42 PM, Pravin Shelar  wrote:
> On Wed, Jan 25, 2017 at 9:24 PM, Andy Zhou  wrote:
>> do_execute_actions() implements a worthwhile optimization: in case
>> an output action is the last action in an action list, skb_clone()
>> can be avoided by outputting the current skb. However, the
>> implementation is more complicated than necessary.  This patch
>> simplifies this logic.
>>
>> Signed-off-by: Andy Zhou 
>> ---
>>  net/openvswitch/actions.c | 40 +++-
>>  1 file changed, 19 insertions(+), 21 deletions(-)
>>
>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>> index 514f7bc..3866608 100644
>> --- a/net/openvswitch/actions.c
>> +++ b/net/openvswitch/actions.c
>> @@ -830,6 +830,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
>>  {
>> struct vport *vport = ovs_vport_rcu(dp, out_port);
>>
>> +   if (unlikely(!skb))
>> +   return;
>> +
> Patch looks good to me. But I wanted to know if you considered moving
> this check to do_execute_actions() in case skb-clone is done? This way
> we can avoid this unlikely check from likely case :)
>
Good point.  O.K. I will repost a version without this check.  Thanks
for the review and comment.

[pull request][net 0/8] Mellanox mlx5 fixes 2017-01-27

2017-01-27 Thread Saeed Mahameed

Hi Dave,

This pull request includes some mlx5 fixes for net, please see details
below.

Please pull and let me know if there's any problem.

For -stable:
  net/mlx5e: Modify TIRs hash only when it's needed
  net/mlx5e: Fix update of hash function/key via ethtool

Thanks,
Saeed.

---

The following changes since commit 214767faa2f31285f92754393c036f13b55474a6:

  Merge tag 'batadv-net-for-davem-20170125' of git://git.open-mesh.org/linux-merge (2017-01-25 23:11:13 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2017-01-27

for you to fetch changes up to 4f24229d2f73be75c2d362113588e30a1695dcb1:

  net/mlx5e: Check ets capability before ets query FW command (2017-01-27 00:05:47 +0200)


mlx5-fixes-2017-01-27

A couple of mlx5 core and ethernet driver fixes.

From Or, a couple of error return values and error handling fixes.
From Hadar, Support TC encapsulation offloads even when the mlx5e uplink
device is stacked under an upper device.
From Gal, two patches to fix RSS hash modifications via ethtool.
From Moshe, Added a needed ets capability check.


Gal Pressman (2):
  net/mlx5e: Modify TIRs hash only when it's needed
  net/mlx5e: Fix update of hash function/key via ethtool

Hadar Hen Zion (1):
  net/mlx5e: Support TC encapsulation offloads with upper devices

Moshe Shemesh (1):
  net/mlx5e: Check ets capability before ets query FW command

Or Gerlitz (4):
  net/mlx5: Change ENOTSUPP to EOPNOTSUPP
  net/mlx5: Return EOPNOTSUPP when failing to get steering name-space
  net/mlx5: E-Switch, Err when retrieving steering name-space fails
  net/mlx5: E-Switch, Re-enable RoCE on mode change only after FDB destroy

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h   |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c |  11 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  41 +++--
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c|   2 +-
 .../ethernet/mellanox/mlx5/core/en_fs_ethtool.c|   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 202 ++---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c|  13 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  10 +-
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |  36 ++--
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/main.c |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/port.c |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/vport.c|   2 +-
 15 files changed, 181 insertions(+), 157 deletions(-)

Re: [PATCH] cfg80211 debugfs: Cleanup some checkpatch issues

2017-01-27 Thread Joe Perches

On Fri, 2017-01-27 at 22:00 +0100, Johannes Berg wrote:
> On Fri, 2017-01-27 at 22:26 +0300, Pichugin Dmitry wrote:
> > This fixes the checkpatch.pl warnings:
> > * Macros should not use a trailing semicolon.
> > * Spaces required around that '='.
> > * Symbolic permissions 'S_IRUGO' are not preferred.
> > * Macro argument reuse 'buflen' - possible side-effects
> 
> I really see no point in any of this.

Look at the uses of DEBUGFS_READONLY_FILE and
see if they are consistent before and after.

 DEBUGFS_READONLY_FILE(rts_threshold, 20, "%d",
- wiphy->rts_threshold)
+ wiphy->rts_threshold);
 DEBUGFS_READONLY_FILE(fragmentation_threshold, 20, "%d",
  wiphy->frag_threshold);
 DEBUGFS_READONLY_FILE(short_retry_limit, 20, "%d",
- wiphy->retry_short)
+ wiphy->retry_short);
 DEBUGFS_READONLY_FILE(long_retry_limit, 20, "%d",
  wiphy->retry_long);
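
To make the consistency point concrete, here is a small stand-alone sketch of a macro that ends without a semicolon, so that every use site supplies its own. TOY_READONLY_FILE and the toy_* types are hypothetical stand-ins, not the cfg80211 macro.

```c
#include <assert.h>
#include <stdio.h>

struct toy_wiphy {
	int rts_threshold;
	int retry_short;
};

struct toy_file_ops {
	int (*read)(const struct toy_wiphy *wiphy, char *buf, size_t len);
};

/*
 * Hypothetical stand-in for DEBUGFS_READONLY_FILE. The body ends with a
 * declaration that has no trailing semicolon, so each use below must be
 * terminated by the caller and all use sites look alike.
 */
#define TOY_READONLY_FILE(name, fmt, value)				\
static int name##_read(const struct toy_wiphy *wiphy,			\
		       char *buf, size_t len)				\
{									\
	return snprintf(buf, len, fmt "\n", value);			\
}									\
static const struct toy_file_ops name##_ops = {			\
	.read = name##_read,						\
}

TOY_READONLY_FILE(rts_threshold, "%d", wiphy->rts_threshold);
TOY_READONLY_FILE(short_retry_limit, "%d", wiphy->retry_short);
```

Had the macro body carried its own semicolon, a use site could be written with or without one, which is the mixed style visible in the hunk above.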

[net 2/8] net/mlx5: Return EOPNOTSUPP when failing to get steering name-space

2017-01-27 Thread Saeed Mahameed

From: Or Gerlitz 

When we fail to retrieve a hardware steering name-space, the returned error
code should say that this operation is not supported. Align the various
places in the driver where this call is made to this convention.

Also, make sure to warn when we fail to retrieve a SW (ANCHOR) name-space.

Signed-off-by: Or Gerlitz 
Reviewed-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
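
The convention being aligned here can be sketched outside the kernel as below. The toy_* names are hypothetical and stand in for mlx5_get_flow_namespace() and its callers; the only point is which error code a missing namespace maps to.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_flow_ns {
	int id;
};

/* Hypothetical lookup: returns NULL when the device lacks the namespace. */
static struct toy_flow_ns *toy_get_flow_namespace(struct toy_flow_ns *table,
						  int supported)
{
	return supported ? table : NULL;
}

/* Caller following the convention: a missing namespace is "not supported". */
static int toy_create_fdb_table(struct toy_flow_ns *table, int supported)
{
	struct toy_flow_ns *ns = toy_get_flow_namespace(table, supported);

	if (!ns)
		return -EOPNOTSUPP; /* nothing was allocated or read, so not -ENOMEM/-EIO */

	assert(ns == table);
	return 0;
}
```

With a single code for this case, callers can distinguish a missing hardware feature from a real allocation or I/O failure.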
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c| 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  | 6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
index 1fe80de5d68f..a0e5a69402b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -1089,7 +1089,7 @@ int mlx5e_create_flow_steering(struct mlx5e_priv *priv)
   MLX5_FLOW_NAMESPACE_KERNEL);
 
if (!priv->fs.ns)
-   return -EINVAL;
+   return -EOPNOTSUPP;
 
err = mlx5e_arfs_create_tables(priv);
if (err) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index bb712139b36e..d0c8bf014453 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -353,7 +353,7 @@ static int esw_create_legacy_fdb_table(struct mlx5_eswitch *esw, int nvports)
root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB);
if (!root_ns) {
esw_warn(dev, "Failed to get FDB flow namespace\n");
-   return -ENOMEM;
+   return -EOPNOTSUPP;
}
 
flow_group_in = mlx5_vzalloc(inlen);
@@ -962,7 +962,7 @@ static int esw_vport_enable_egress_acl(struct mlx5_eswitch *esw,
root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_EGRESS);
if (!root_ns) {
esw_warn(dev, "Failed to get E-Switch egress flow namespace\n");
-   return -EIO;
+   return -EOPNOTSUPP;
}
 
flow_group_in = mlx5_vzalloc(inlen);
@@ -1079,7 +1079,7 @@ static int esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw,
root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_INGRESS);
if (!root_ns) {
esw_warn(dev, "Failed to get E-Switch ingress flow namespace\n");
-   return -EIO;
+   return -EOPNOTSUPP;
}
 
flow_group_in = mlx5_vzalloc(inlen);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 657d319fc4c6..5803216157cf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -535,7 +535,7 @@ static int esw_create_offloads_table(struct mlx5_eswitch *esw)
ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_OFFLOADS);
if (!ns) {
esw_warn(esw->dev, "Failed to get offloads flow namespace\n");
-   return -ENOMEM;
+   return -EOPNOTSUPP;
}
 
ft_offloads = mlx5_create_flow_table(ns, 0, dev->priv.sriov.num_vfs + 2, 0, 0);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c 
b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 0ac7a2fc916c..6346a8f5883b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -1822,7 +1822,7 @@ static int create_anchor_flow_table(struct mlx5_flow_steering *steering)
struct mlx5_flow_table *ft;
 
ns = mlx5_get_flow_namespace(steering->dev, MLX5_FLOW_NAMESPACE_ANCHOR);
-   if (!ns)
+   if (WARN_ON(!ns))
return -EINVAL;
ft = mlx5_create_flow_table(ns, ANCHOR_PRIO, ANCHOR_SIZE, ANCHOR_LEVEL, 0);
if (IS_ERR(ft)) {
-- 
2.11.0

[PATCH net-next 3/3] net: dsa: bcm_sf2: Add support for ethtool::rxnfc

2017-01-27 Thread Florian Fainelli

Add support for configuring classification rules using the
ethtool::rxnfc API.  This is useful to program the switch's CFP/TCAM to
redirect specific packets to specific ports/queues for instance. For
now, we allow any kind of IPv4 5-tuple matching.

Signed-off-by: Florian Fainelli 
---
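
As a rough illustration of the rule bookkeeping this patch introduces (a bitmap of used rule indices with rule #0 permanently reserved), here is a stand-alone sketch. It assumes 256 rules, since the real CFP_NUM_RULES value is not shown in this excerpt, and the toy_* names and the error codes chosen are hypothetical rather than taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>

#define TOY_NUM_RULES 256 /* assumption; real CFP_NUM_RULES not shown here */
#define TOY_ANY_LOCATION (-1)
#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

struct toy_cfp {
	unsigned long used[(TOY_NUM_RULES + TOY_BITS_PER_WORD - 1) /
			   TOY_BITS_PER_WORD];
};

static int toy_test_bit(const struct toy_cfp *c, int i)
{
	return (c->used[i / TOY_BITS_PER_WORD] >> (i % TOY_BITS_PER_WORD)) & 1UL;
}

static void toy_set_bit(struct toy_cfp *c, int i)
{
	c->used[i / TOY_BITS_PER_WORD] |= 1UL << (i % TOY_BITS_PER_WORD);
}

static void toy_cfp_init(struct toy_cfp *c)
{
	memset(c, 0, sizeof(*c));
	toy_set_bit(c, 0); /* rule #0 is flagged as permanently used */
}

/* Returns the rule index taken, or a negative errno value. */
static int toy_cfp_rule_add(struct toy_cfp *c, int location)
{
	int i;

	assert(c != NULL);

	if (location == TOY_ANY_LOCATION) {
		/* Caller has no preference: take the first free index. */
		for (i = 1; i < TOY_NUM_RULES; i++)
			if (!toy_test_bit(c, i))
				break;
		if (i == TOY_NUM_RULES)
			return -ENOSPC;
		location = i;
	} else if (location < 0 || location >= TOY_NUM_RULES) {
		return -EINVAL;
	} else if (toy_test_bit(c, location)) {
		return -EBUSY;
	}

	toy_set_bit(c, location);
	return location;
}
```

Letting the caller name a target index is the property of ethtool::rxnfc that the cover letter singles out.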
 drivers/net/dsa/Makefile  |   2 +-
 drivers/net/dsa/bcm_sf2.c |  14 +
 drivers/net/dsa/bcm_sf2.h |  17 ++
 drivers/net/dsa/bcm_sf2_cfp.c | 613 ++
 4 files changed, 645 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c

diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index 8346e4f9737a..e69f3683f52f 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_NET_DSA_MV88E6060) += mv88e6060.o
-obj-$(CONFIG_NET_DSA_BCM_SF2)  += bcm_sf2.o
+obj-$(CONFIG_NET_DSA_BCM_SF2)  += bcm_sf2.o bcm_sf2_cfp.o
 obj-$(CONFIG_NET_DSA_QCA8K)+= qca8k.o
 
 obj-y  += b53/
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 8eecfd227e06..74cf18798655 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1036,6 +1036,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
+   .get_rxnfc  = bcm_sf2_get_rxnfc,
+   .set_rxnfc  = bcm_sf2_set_rxnfc,
 };
 
 struct bcm_sf2_of_data {
@@ -1159,6 +1161,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
 
spin_lock_init(&priv->indir_lock);
mutex_init(&priv->stats_mutex);
+   mutex_init(&priv->cfp.lock);
+
+   /* CFP rule #0 cannot be used for specific classifications, flag it as
+* permanently used
+*/
+   set_bit(0, priv->cfp.used);
 
bcm_sf2_identify_ports(priv, dn->child);
 
@@ -1188,6 +1196,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
return ret;
}
 
+   ret = bcm_sf2_cfp_rst(priv);
+   if (ret) {
+   pr_err("failed to reset CFP\n");
+   goto out_mdio;
+   }
+
/* Disable all interrupts and request them */
bcm_sf2_intr_disable(priv);
 
diff --git a/drivers/net/dsa/bcm_sf2.h b/drivers/net/dsa/bcm_sf2.h
index 6e1f74e4d471..7d3030e04f11 100644
--- a/drivers/net/dsa/bcm_sf2.h
+++ b/drivers/net/dsa/bcm_sf2.h
@@ -52,6 +52,13 @@ struct bcm_sf2_port_status {
struct ethtool_eee eee;
 };
 
+struct bcm_sf2_cfp_priv {
+   /* Mutex protecting concurrent accesses to the CFP registers */
+   struct mutex lock;
+   DECLARE_BITMAP(used, CFP_NUM_RULES);
+   unsigned int rules_cnt;
+};
+
 struct bcm_sf2_priv {
/* Base registers, keep those in order with BCM_SF2_REGS_NAME */
void __iomem*core;
@@ -103,6 +110,9 @@ struct bcm_sf2_priv {
 
/* Bitmask of ports needing BRCM tags */
unsigned intbrcm_tag_mask;
+
+   /* CFP rules context */
+   struct bcm_sf2_cfp_priv cfp;
 };
 
 static inline struct bcm_sf2_priv *bcm_sf2_to_priv(struct dsa_switch *ds)
@@ -197,4 +207,11 @@ SF2_IO_MACRO(acb);
 SWITCH_INTR_L2(0);
 SWITCH_INTR_L2(1);
 
+/* RXNFC */
+int bcm_sf2_get_rxnfc(struct dsa_switch *ds, int port,
+ struct ethtool_rxnfc *nfc, u32 *rule_locs);
+int bcm_sf2_set_rxnfc(struct dsa_switch *ds, int port,
+ struct ethtool_rxnfc *nfc);
+int bcm_sf2_cfp_rst(struct bcm_sf2_priv *priv);
+
 #endif /* __BCM_SF2_H */
diff --git a/drivers/net/dsa/bcm_sf2_cfp.c b/drivers/net/dsa/bcm_sf2_cfp.c
new file mode 100644
index ..c71be3e0dc2d
--- /dev/null
+++ b/drivers/net/dsa/bcm_sf2_cfp.c
@@ -0,0 +1,613 @@
+/*
+ * Broadcom Starfighter 2 DSA switch CFP support
+ *
+ * Copyright (C) 2016, Broadcom
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "bcm_sf2.h"
+#include "bcm_sf2_regs.h"
+
+struct cfp_udf_layout {
+   u8 slices[UDF_NUM_SLICES];
+   u32 mask_value;
+
+};
+
+/* UDF slices layout for a TCPv4/UDPv4 specification */
+static const struct cfp_udf_layout udf_tcpip4_layout = {
+   .slices = {
+   /* End of L2, byte offset 12, src IP[0:15] */
+   CFG_UDF_EOL2 | 6,
+   /* End of L2, byte offset 14, src IP[16:31] */
+   CFG_UDF_EOL2 | 7,
+   /* End of L2, byte offset 16, dst IP[0:15] */
+   CFG_UDF_EOL2 | 8,
+   /* End of L2, byte offset 18, dst IP[16:31] */
+   CFG_UDF_EOL2 | 9,
+   /* End of L3, byte offset 0, src port */
+

[net 5/8] net/mlx5e: Support TC encapsulation offloads with upper devices

2017-01-27 Thread Saeed Mahameed

From: Hadar Hen Zion 

When tunneling is used, some virtualization systems set the (mlx5e) uplink
device to be stacked under upper devices such as a bridge or an ovs internal
port, where the VTEP IP address used for the encapsulation is set on
that upper device.

In order to support such use-cases, we also deal with a setup where the
egress mirred device doesn't represent a port on the HW e-switch to which
the ingress device belongs. We use an eswitch service function which returns
the uplink and set it as the egress device of the tc encap rule.

Fixes: a54e20b4fcae ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads")
Signed-off-by: Hadar Hen Zion 
Reviewed-by: Or Gerlitz 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 46bef6a26a8c..c5282b6aba8b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -663,6 +663,7 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
   __be32 *saddr,
   int *out_ttl)
 {
+   struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct rtable *rt;
struct neighbour *n = NULL;
int ttl;
@@ -677,12 +678,11 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv 
*priv,
 #else
return -EOPNOTSUPP;
 #endif
-
-   if (!switchdev_port_same_parent_id(priv->netdev, rt->dst.dev)) {
-   pr_warn("%s: can't offload, devices not on same HW e-switch\n", __func__);
-   ip_rt_put(rt);
-   return -EOPNOTSUPP;
-   }
+   /* if the egress device isn't on the same HW e-switch, we use the uplink */
+   if (!switchdev_port_same_parent_id(priv->netdev, rt->dst.dev))
+   *out_dev = mlx5_eswitch_get_uplink_netdev(esw);
+   else
+   *out_dev = rt->dst.dev;
 
ttl = ip4_dst_hoplimit(&rt->dst);
n = dst_neigh_lookup(&rt->dst, &fl4->daddr);
@@ -693,7 +693,6 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
*out_n = n;
*saddr = fl4->saddr;
*out_ttl = ttl;
-   *out_dev = rt->dst.dev;
 
return 0;
 }
-- 
2.11.0

[net 1/8] net/mlx5: Change ENOTSUPP to EOPNOTSUPP

2017-01-27 Thread Saeed Mahameed

From: Or Gerlitz 

As ENOTSUPP is specific to NFS, change the return error value to
EOPNOTSUPP in various places in the mlx5 driver.

Signed-off-by: Or Gerlitz 
Suggested-by: Yotam Gigi 
Reviewed-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c  |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h   |  4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c |  6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c   | 10 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c|  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c   |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/main.c |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/port.c |  4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/vport.c|  2 +-
 12 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c 
b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 3797cc7c1288..caa837e5e2b9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1728,7 +1728,7 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
if (cmd->cmdif_rev > CMD_IF_REV) {
dev_err(&dev->pdev->dev, "driver does not support command interface version. driver %d, firmware %d\n",
CMD_IF_REV, cmd->cmdif_rev);
-   err = -ENOTSUPP;
+   err = -EOPNOTSUPP;
goto err_free_page;
}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 951dbd58594d..1619147a63e8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -863,12 +863,12 @@ static inline void mlx5e_arfs_destroy_tables(struct 
mlx5e_priv *priv) {}
 
 static inline int mlx5e_arfs_enable(struct mlx5e_priv *priv)
 {
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 }
 
 static inline int mlx5e_arfs_disable(struct mlx5e_priv *priv)
 {
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 }
 #else
 int mlx5e_arfs_create_tables(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index f0b460f47f29..35f9ae037ba0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -89,7 +89,7 @@ static int mlx5e_dcbnl_ieee_getets(struct net_device *netdev,
int i;
 
if (!MLX5_CAP_GEN(priv->mdev, ets))
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
ets->ets_cap = mlx5_max_tc(priv->mdev) + 1;
for (i = 0; i < ets->ets_cap; i++) {
@@ -236,7 +236,7 @@ static int mlx5e_dcbnl_ieee_setets(struct net_device 
*netdev,
int err;
 
if (!MLX5_CAP_GEN(priv->mdev, ets))
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
err = mlx5e_dbcnl_validate_ets(netdev, ets);
if (err)
@@ -402,7 +402,7 @@ static u8 mlx5e_dcbnl_setall(struct net_device *netdev)
struct mlx5_core_dev *mdev = priv->mdev;
struct ieee_ets ets;
struct ieee_pfc pfc;
-   int err = -ENOTSUPP;
+   int err = -EOPNOTSUPP;
int i;
 
if (!MLX5_CAP_GEN(mdev, ets))
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 5197817e4b2f..ffbdf9ee5a9b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -595,7 +595,7 @@ static int mlx5e_get_coalesce(struct net_device *netdev,
struct mlx5e_priv *priv = netdev_priv(netdev);
 
if (!MLX5_CAP_GEN(priv->mdev, cq_moderation))
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
coal->rx_coalesce_usecs   = priv->params.rx_cq_moderation.usec;
coal->rx_max_coalesced_frames = priv->params.rx_cq_moderation.pkts;
@@ -620,7 +620,7 @@ static int mlx5e_set_coalesce(struct net_device *netdev,
int i;
 
if (!MLX5_CAP_GEN(mdev, cq_moderation))
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
mutex_lock(&priv->state_lock);
 
@@ -1296,7 +1296,7 @@ static int mlx5e_set_wol(struct net_device *netdev, 
struct ethtool_wolinfo *wol)
u32 mlx5_wol_mode;
 
if (!wol_supported)
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
if (wol->wolopts & ~wol_supported)
return -EINVAL;
@@ -1426,7 +1426,7 @@ static

[PATCH net-next 4/4] net: dsa: bcm_sf2: Add support for ethtool::rxnfc

2017-01-27 Thread Florian Fainelli

Add support for configuring classification rules using the
ethtool::rxnfc API.  This is useful to program the switch's CFP/TCAM to
redirect specific packets to specific ports/queues for instance. For
now, we allow any kind of IPv4 5-tuple matching.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/Makefile  |   2 +-
 drivers/net/dsa/bcm_sf2.c |  14 +
 drivers/net/dsa/bcm_sf2.h |  17 ++
 drivers/net/dsa/bcm_sf2_cfp.c | 613 ++
 4 files changed, 645 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c

diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index 8346e4f9737a..e69f3683f52f 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_NET_DSA_MV88E6060) += mv88e6060.o
-obj-$(CONFIG_NET_DSA_BCM_SF2)  += bcm_sf2.o
+obj-$(CONFIG_NET_DSA_BCM_SF2)  += bcm_sf2.o bcm_sf2_cfp.o
 obj-$(CONFIG_NET_DSA_QCA8K)    += qca8k.o
 
 obj-y  += b53/
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 637072da3acf..be282b430c50 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1045,6 +1045,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
+   .get_rxnfc  = bcm_sf2_get_rxnfc,
+   .set_rxnfc  = bcm_sf2_set_rxnfc,
 };
 
 struct bcm_sf2_of_data {
@@ -1168,6 +1170,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
 
spin_lock_init(&priv->indir_lock);
mutex_init(&priv->stats_mutex);
+   mutex_init(&priv->cfp.lock);
+
+   /* CFP rule #0 cannot be used for specific classifications, flag it as
+* permanently used
+*/
+   set_bit(0, priv->cfp.used);
 
bcm_sf2_identify_ports(priv, dn->child);
 
@@ -1197,6 +1205,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
return ret;
}
 
+   ret = bcm_sf2_cfp_rst(priv);
+   if (ret) {
+   pr_err("failed to reset CFP\n");
+   goto out_mdio;
+   }
+
/* Disable all interrupts and request them */
bcm_sf2_intr_disable(priv);
 
diff --git a/drivers/net/dsa/bcm_sf2.h b/drivers/net/dsa/bcm_sf2.h
index 6e1f74e4d471..7d3030e04f11 100644
--- a/drivers/net/dsa/bcm_sf2.h
+++ b/drivers/net/dsa/bcm_sf2.h
@@ -52,6 +52,13 @@ struct bcm_sf2_port_status {
struct ethtool_eee eee;
 };
 
+struct bcm_sf2_cfp_priv {
+   /* Mutex protecting concurrent accesses to the CFP registers */
+   struct mutex lock;
+   DECLARE_BITMAP(used, CFP_NUM_RULES);
+   unsigned int rules_cnt;
+};
+
 struct bcm_sf2_priv {
/* Base registers, keep those in order with BCM_SF2_REGS_NAME */
void __iomem *core;
@@ -103,6 +110,9 @@ struct bcm_sf2_priv {
 
/* Bitmask of ports needing BRCM tags */
unsigned int brcm_tag_mask;
+
+   /* CFP rules context */
+   struct bcm_sf2_cfp_priv cfp;
 };
 
 static inline struct bcm_sf2_priv *bcm_sf2_to_priv(struct dsa_switch *ds)
@@ -197,4 +207,11 @@ SF2_IO_MACRO(acb);
 SWITCH_INTR_L2(0);
 SWITCH_INTR_L2(1);
 
+/* RXNFC */
+int bcm_sf2_get_rxnfc(struct dsa_switch *ds, int port,
+ struct ethtool_rxnfc *nfc, u32 *rule_locs);
+int bcm_sf2_set_rxnfc(struct dsa_switch *ds, int port,
+ struct ethtool_rxnfc *nfc);
+int bcm_sf2_cfp_rst(struct bcm_sf2_priv *priv);
+
 #endif /* __BCM_SF2_H */
diff --git a/drivers/net/dsa/bcm_sf2_cfp.c b/drivers/net/dsa/bcm_sf2_cfp.c
new file mode 100644
index ..c71be3e0dc2d
--- /dev/null
+++ b/drivers/net/dsa/bcm_sf2_cfp.c
@@ -0,0 +1,613 @@
+/*
+ * Broadcom Starfighter 2 DSA switch CFP support
+ *
+ * Copyright (C) 2016, Broadcom
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "bcm_sf2.h"
+#include "bcm_sf2_regs.h"
+
+struct cfp_udf_layout {
+   u8 slices[UDF_NUM_SLICES];
+   u32 mask_value;
+
+};
+
+/* UDF slices layout for a TCPv4/UDPv4 specification */
+static const struct cfp_udf_layout udf_tcpip4_layout = {
+   .slices = {
+   /* End of L2, byte offset 12, src IP[0:15] */
+   CFG_UDF_EOL2 | 6,
+   /* End of L2, byte offset 14, src IP[16:31] */
+   CFG_UDF_EOL2 | 7,
+   /* End of L2, byte offset 16, dst IP[0:15] */
+   CFG_UDF_EOL2 | 8,
+   /* End of L2, byte offset 18, dst IP[16:31] */
+   CFG_UDF_EOL2 | 9,
+   /* End of L3, byte offset 0, src port */
+

[PATCH net-next 1/4] net: dsa: Hook {get,set}_rxnfc ethtool operations

2017-01-27 Thread Florian Fainelli

In preparation for adding support for CFP/TCAM in the bcm_sf2 driver, add the
plumbing to call into driver-specific {get,set}_rxnfc operations.

Signed-off-by: Florian Fainelli 
---
 include/net/dsa.h |  8 
 net/dsa/slave.c   | 26 ++
 2 files changed, 34 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 92fd795e9573..bcad7cc906d9 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -370,6 +370,14 @@ struct dsa_switch_ops {
int (*port_mdb_dump)(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_mdb *mdb,
 int (*cb)(struct switchdev_obj *obj));
+
+   /*
+* RXNFC
+*/
+   int (*get_rxnfc)(struct dsa_switch *ds, int port,
+struct ethtool_rxnfc *nfc, u32 *rule_locs);
+   int (*set_rxnfc)(struct dsa_switch *ds, int port,
+struct ethtool_rxnfc *nfc);
 };
 
 struct dsa_switch_driver {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b8e58689a9a1..d30a98db004c 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1001,6 +1001,30 @@ void dsa_cpu_port_ethtool_init(struct ethtool_ops *ops)
ops->get_strings = dsa_cpu_port_get_strings;
 }
 
+static int dsa_slave_get_rxnfc(struct net_device *dev,
+  struct ethtool_rxnfc *nfc, u32 *rule_locs)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_switch *ds = p->parent;
+
+   if (!ds->ops->get_rxnfc)
+   return -EOPNOTSUPP;
+
+   return ds->ops->get_rxnfc(ds, p->port, nfc, rule_locs);
+}
+
+static int dsa_slave_set_rxnfc(struct net_device *dev,
+  struct ethtool_rxnfc *nfc)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_switch *ds = p->parent;
+
+   if (!ds->ops->set_rxnfc)
+   return -EOPNOTSUPP;
+
+   return ds->ops->set_rxnfc(ds, p->port, nfc);
+}
+
 static const struct ethtool_ops dsa_slave_ethtool_ops = {
.get_drvinfo= dsa_slave_get_drvinfo,
.get_regs_len   = dsa_slave_get_regs_len,
@@ -1019,6 +1043,8 @@ static const struct ethtool_ops dsa_slave_ethtool_ops = {
.get_eee= dsa_slave_get_eee,
.get_link_ksettings = dsa_slave_get_link_ksettings,
.set_link_ksettings = dsa_slave_set_link_ksettings,
+   .get_rxnfc  = dsa_slave_get_rxnfc,
+   .set_rxnfc  = dsa_slave_set_rxnfc,
 };
 
 static const struct net_device_ops dsa_slave_netdev_ops = {
-- 
2.9.3
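
The dispatch pattern used by dsa_slave_get_rxnfc() and dsa_slave_set_rxnfc() in the patch above (call the driver callback if it is set, otherwise return -EOPNOTSUPP) can be sketched in plain userspace C. This is only an illustration: the demo_* types and functions are hypothetical stand-ins, not the DSA structures, and the callback signature is simplified.

```c
#include <errno.h>
#include <stddef.h>

struct demo_switch;

/* Hypothetical stand-in for struct dsa_switch_ops: callbacks are optional. */
struct demo_switch_ops {
	int (*get_rxnfc)(struct demo_switch *sw, int port, unsigned int *count);
};

/* Hypothetical stand-in for struct dsa_switch. */
struct demo_switch {
	const struct demo_switch_ops *ops;
};

/* Core-side wrapper: forward to the driver only when it provides the hook. */
static int demo_get_rxnfc(struct demo_switch *sw, int port, unsigned int *count)
{
	if (!sw->ops->get_rxnfc)
		return -EOPNOTSUPP;

	return sw->ops->get_rxnfc(sw, port, count);
}

/* Example driver callback (made up): reports port + 1 rules. */
static int demo_driver_get_rxnfc(struct demo_switch *sw, int port,
				 unsigned int *count)
{
	(void)sw;
	*count = (unsigned int)port + 1;
	return 0;
}
```

A switch whose ops table leaves get_rxnfc unset gets -EOPNOTSUPP from the wrapper, so drivers that do not implement the hook need no changes.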

[net 7/8] net/mlx5e: Fix update of hash function/key via ethtool

2017-01-27 Thread Saeed Mahameed

From: Gal Pressman 

Modifying TIR hash should change selected fields bitmask in addition to
the function and key.
Formerly, we would not set this field resulting in zeroing of its value,
which means no packet fields are used for RX RSS hash calculation thus
causing all traffic to arrive in RQ[0].

Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Gal Pressman 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h   |   3 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  13 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 198 ++---
 3 files changed, 109 insertions(+), 105 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 1619147a63e8..d5ecb8f53fd4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -791,7 +791,8 @@ void mlx5e_disable_vlan_filter(struct mlx5e_priv *priv);
 int mlx5e_modify_rqs_vsd(struct mlx5e_priv *priv, bool vsd);
 
 int mlx5e_redirect_rqt(struct mlx5e_priv *priv, u32 rqtn, int sz, int ix);
-void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv);
+void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc,
+   enum mlx5e_traffic_types tt);
 
 int mlx5e_open_locked(struct net_device *netdev);
 int mlx5e_close_locked(struct net_device *netdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 6f4eb34259f0..bb67863aa361 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -980,15 +980,18 @@ static int mlx5e_get_rxfh(struct net_device *netdev, u32 
*indir, u8 *key,
 
 static void mlx5e_modify_tirs_hash(struct mlx5e_priv *priv, void *in, int 
inlen)
 {
-   struct mlx5_core_dev *mdev = priv->mdev;
void *tirc = MLX5_ADDR_OF(modify_tir_in, in, ctx);
-   int i;
+   struct mlx5_core_dev *mdev = priv->mdev;
+   int ctxlen = MLX5_ST_SZ_BYTES(tirc);
+   int tt;
 
MLX5_SET(modify_tir_in, in, bitmask.hash, 1);
-   mlx5e_build_tir_ctx_hash(tirc, priv);
 
-   for (i = 0; i < MLX5E_NUM_INDIR_TIRS; i++)
-   mlx5_core_modify_tir(mdev, priv->indir_tir[i].tirn, in, inlen);
+   for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
+   memset(tirc, 0, ctxlen);
+   mlx5e_build_indir_tir_ctx_hash(priv, tirc, tt);
+   mlx5_core_modify_tir(mdev, priv->indir_tir[tt].tirn, in, inlen);
+   }
 }
 
 static int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 948351ae5bd2..f14ca3385fdd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2022,8 +2022,23 @@ static void mlx5e_build_tir_ctx_lro(void *tirc, struct 
mlx5e_priv *priv)
MLX5_SET(tirc, tirc, lro_timeout_period_usecs, 
priv->params.lro_timeout);
 }
 
-void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv)
+void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc,
+   enum mlx5e_traffic_types tt)
 {
+   void *hfso = MLX5_ADDR_OF(tirc, tirc, rx_hash_field_selector_outer);
+
+#define MLX5_HASH_IP            (MLX5_HASH_FIELD_SEL_SRC_IP   |\
+                                MLX5_HASH_FIELD_SEL_DST_IP)
+
+#define MLX5_HASH_IP_L4PORTS    (MLX5_HASH_FIELD_SEL_SRC_IP   |\
+                                MLX5_HASH_FIELD_SEL_DST_IP   |\
+                                MLX5_HASH_FIELD_SEL_L4_SPORT |\
+                                MLX5_HASH_FIELD_SEL_L4_DPORT)
+
+#define MLX5_HASH_IP_IPSEC_SPI  (MLX5_HASH_FIELD_SEL_SRC_IP   |\
+                                MLX5_HASH_FIELD_SEL_DST_IP   |\
+                                MLX5_HASH_FIELD_SEL_IPSEC_SPI)
+
MLX5_SET(tirc, tirc, rx_hash_fn,
 mlx5e_rx_hash_fn(priv->params.rss_hfunc));
if (priv->params.rss_hfunc == ETH_RSS_HASH_TOP) {
@@ -2035,6 +2050,88 @@ void mlx5e_build_tir_ctx_hash(void *tirc, struct 
mlx5e_priv *priv)
MLX5_SET(tirc, tirc, rx_hash_symmetric, 1);
memcpy(rss_key, priv->params.toeplitz_hash_key, len);
}
+
+   switch (tt) {
+   case MLX5E_TT_IPV4_TCP:
+   MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+MLX5_L3_PROT_TYPE_IPV4);
+   MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
+MLX5_L4_PROT_TYPE_TCP);
+   MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+MLX5_HASH_IP_L4PORTS);
+   break;
+
+   case MLX5E_TT_IPV6_TCP:
+

[PATCH net-next 2/4] net: dsa: bcm_sf2: Configure traffic classes to queue mapping

2017-01-27 Thread Florian Fainelli

By default, all traffic goes to queue 0. Reconfigure the traffic class to
quality of service mapping so that priority X maps to queue X, where X
ranges from 0 through 7.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c  | 9 +
 drivers/net/dsa/bcm_sf2_regs.h | 4 
 2 files changed, 13 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 8eecfd227e06..637072da3acf 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -229,6 +229,7 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int 
port,
 {
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
s8 cpu_port = ds->dst[ds->index].cpu_port;
+   unsigned int i;
u32 reg;
 
/* Clear the memory power down */
@@ -240,6 +241,14 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int 
port,
if (priv->brcm_tag_mask & BIT(port))
bcm_sf2_brcm_hdr_setup(priv, port);
 
+   /* Configure Traffic Class to QoS mapping, allow each priority to map
+* to a different queue number
+*/
+   reg = core_readl(priv, CORE_PORT_TC2_QOS_MAP_PORT(port));
+   for (i = 0; i < 8; i++)
+   reg |= i << (PRT_TO_QID_SHIFT * i);
+   core_writel(priv, reg, CORE_PORT_TC2_QOS_MAP_PORT(port));
+
/* Clear the Rx and Tx disable bits and set to no spanning tree */
core_writel(priv, 0, CORE_G_PCTL_PORT(port));
 
diff --git a/drivers/net/dsa/bcm_sf2_regs.h b/drivers/net/dsa/bcm_sf2_regs.h
index 3b33b8010cc8..6b63c00928ba 100644
--- a/drivers/net/dsa/bcm_sf2_regs.h
+++ b/drivers/net/dsa/bcm_sf2_regs.h
@@ -238,6 +238,10 @@ enum bcm_sf2_reg_offs {
 #define  P_TXQ_PSM_VDD(x)  (P_TXQ_PSM_VDD_MASK << \
((x) * P_TXQ_PSM_VDD_SHIFT))
 
+#define CORE_PORT_TC2_QOS_MAP_PORT(x)  (0xc1c0 + ((x) * 0x10))
+#define  PRT_TO_QID_MASK   0x3
+#define  PRT_TO_QID_SHIFT  3
+
 #define CORE_PORT_VLAN_CTL_PORT(x) (0xc400 + ((x) * 0x8))
 #define  PORT_VLAN_CTRL_MASK   0x1ff
 
-- 
2.9.3
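
As a rough illustration of the mapping programmed by patch 2/4 above, the loop can be reproduced in userspace C. PRT_TO_QID_SHIFT comes from the patch; the helper names are made up, the 0x7 field mask used for extraction is my assumption for a 3-bit field (the patch defines PRT_TO_QID_MASK as 0x3), and the sketch starts from 0 instead of the current register contents.

```c
#include <stdint.h>

/* Value taken from the patch: each priority owns a 3-bit field. */
#define PRT_TO_QID_SHIFT 3

/* Build the value that maps priority i to queue i for i = 0..7.
 * Unlike the patch, this starts from 0 rather than from the value read
 * back from the register, so it shows the mapping bits only.
 */
static uint32_t tc2qos_identity_map(void)
{
	uint32_t reg = 0;
	unsigned int i;

	for (i = 0; i < 8; i++)
		reg |= (uint32_t)i << (PRT_TO_QID_SHIFT * i);

	return reg;
}

/* Extract the queue assigned to a priority (0x7 mask is an assumption). */
static unsigned int tc2qos_queue_for(uint32_t reg, unsigned int prio)
{
	return (reg >> (PRT_TO_QID_SHIFT * prio)) & 0x7;
}
```

With these definitions, tc2qos_identity_map() yields 0xFAC688 (octal 076543210), and tc2qos_queue_for() returns X for every priority X.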

[net 4/8] net/mlx5: E-Switch, Re-enable RoCE on mode change only after FDB destroy

2017-01-27 Thread Saeed Mahameed

From: Or Gerlitz 

We must re-enable RoCE on the e-switch management port (PF) only after
destroying the FDB in its switchdev/offloaded mode. Otherwise, when
encapsulation is supported, this re-enablement will fail.

Also, it's more natural and symmetric to disable RoCE on the PF before we
create the FDB under switchdev mode, so do that as well, and revert it if we
hit an error later during the mode change.

Fixes: 9da34cd34e85 ('net/mlx5: Disable RoCE on the e-switch management [..]')
Signed-off-by: Or Gerlitz 
Reviewed-by: Hadar Hen Zion 
Signed-off-by: Saeed Mahameed 
---
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 29 ++
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index c61bca138e65..595f7c7383b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -675,9 +675,14 @@ int esw_offloads_init(struct mlx5_eswitch *esw, int 
nvports)
int vport;
int err;
 
+   /* disable PF RoCE so missed packets don't go through RoCE steering */
+   mlx5_dev_list_lock();
+   mlx5_remove_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
+   mlx5_dev_list_unlock();
+
err = esw_create_offloads_fdb_table(esw, nvports);
if (err)
-   return err;
+   goto create_fdb_err;
 
err = esw_create_offloads_table(esw);
if (err)
@@ -697,11 +702,6 @@ int esw_offloads_init(struct mlx5_eswitch *esw, int 
nvports)
goto err_reps;
}
 
-   /* disable PF RoCE so missed packets don't go through RoCE steering */
-   mlx5_dev_list_lock();
-   mlx5_remove_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
-   mlx5_dev_list_unlock();
-
return 0;
 
 err_reps:
@@ -718,6 +718,13 @@ int esw_offloads_init(struct mlx5_eswitch *esw, int 
nvports)
 
 create_ft_err:
esw_destroy_offloads_fdb_table(esw);
+
+create_fdb_err:
+   /* enable back PF RoCE */
+   mlx5_dev_list_lock();
+   mlx5_add_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
+   mlx5_dev_list_unlock();
+
return err;
 }
 
@@ -725,11 +732,6 @@ static int esw_offloads_stop(struct mlx5_eswitch *esw)
 {
int err, err1, num_vfs = esw->dev->priv.sriov.num_vfs;
 
-   /* enable back PF RoCE */
-   mlx5_dev_list_lock();
-   mlx5_add_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
-   mlx5_dev_list_unlock();
-
mlx5_eswitch_disable_sriov(esw);
err = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
if (err) {
@@ -739,6 +741,11 @@ static int esw_offloads_stop(struct mlx5_eswitch *esw)
esw_warn(esw->dev, "Failed setting eswitch back to 
offloads, err %d\n", err);
}
 
+   /* enable back PF RoCE */
+   mlx5_dev_list_lock();
+   mlx5_add_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
+   mlx5_dev_list_unlock();
+
return err;
 }
 
-- 
2.11.0
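
The ordering change in esw_offloads_init() above (disable RoCE first, then undo each step in reverse order on failure) follows the usual goto-unwind idiom. A minimal userspace sketch of that control flow is below; the demo_* names and the step log are hypothetical, and the real function performs more steps than are modelled here.

```c
#include <string.h>

/* Records the order of the simulated steps, e.g. "roce_off;fdb_create;". */
static char demo_log[128];

static void demo_step(const char *name)
{
	strcat(demo_log, name);
	strcat(demo_log, ";");
}

/* fail_fdb / fail_table inject a failure at the corresponding step. */
static int demo_offloads_init(int fail_fdb, int fail_table)
{
	int err;

	demo_log[0] = '\0';

	/* Disable RoCE before anything else, as the patch does. */
	demo_step("roce_off");

	err = fail_fdb ? -1 : 0;
	if (err)
		goto create_fdb_err;
	demo_step("fdb_create");

	err = fail_table ? -1 : 0;
	if (err)
		goto create_ft_err;
	demo_step("table_create");

	return 0;

create_ft_err:
	demo_step("fdb_destroy");
create_fdb_err:
	/* Re-enable RoCE only after the FDB is gone. */
	demo_step("roce_on");
	return err;
}
```

When the table step fails, the log reads "roce_off;fdb_create;fdb_destroy;roce_on;", so RoCE comes back only after the FDB has been destroyed.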

[PATCH net-next 0/4] net: dsa: bcm_sf2: CFP support

2017-01-27 Thread Florian Fainelli

Hi all,

This patch series adds support for the Broadcom Compact Field Processor (CFP)
which is a classification and matching engine built into most Broadcom switches.

We support that using ethtool::rxnfc because it allows all known use cases from
the users I support to work and, more importantly, it allows the selection of a
target rule index, which is later used by e.g. offloading hardware. This is an
essential feature that I could not find being supported with cls_* for instance.

Thanks

Florian Fainelli (4):
  net: dsa: Hook {get,set}_rxnfc ethtool operations
  net: dsa: bcm_sf2: Configure traffic classes to queue mapping
  net: dsa: bcm_sf2: Add CFP registers definitions
  net: dsa: bcm_sf2: Add support for ethtool::rxnfc

 drivers/net/dsa/Makefile   |   2 +-
 drivers/net/dsa/bcm_sf2.c  |  23 ++
 drivers/net/dsa/bcm_sf2.h  |  17 ++
 drivers/net/dsa/bcm_sf2_cfp.c  | 613 +
 drivers/net/dsa/bcm_sf2_regs.h | 150 ++
 include/net/dsa.h  |   8 +
 net/dsa/slave.c|  26 ++
 7 files changed, 838 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c

-- 
2.9.3

[net 6/8] net/mlx5e: Modify TIRs hash only when it's needed

2017-01-27 Thread Saeed Mahameed

From: Gal Pressman 

We don't need to modify our TIRs unless the user requested a change in
the hash function/key, for example when changing indirection only.

Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Gal Pressman 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index ffbdf9ee5a9b..6f4eb34259f0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -996,6 +996,7 @@ static int mlx5e_set_rxfh(struct net_device *dev, const u32 
*indir,
 {
struct mlx5e_priv *priv = netdev_priv(dev);
int inlen = MLX5_ST_SZ_BYTES(modify_tir_in);
+   bool hash_changed = false;
void *in;
 
if ((hfunc != ETH_RSS_HASH_NO_CHANGE) &&
@@ -1017,14 +1018,21 @@ static int mlx5e_set_rxfh(struct net_device *dev, const 
u32 *indir,
mlx5e_redirect_rqt(priv, rqtn, MLX5E_INDIR_RQT_SIZE, 0);
}
 
-   if (key)
+   if (hfunc != ETH_RSS_HASH_NO_CHANGE &&
+   hfunc != priv->params.rss_hfunc) {
+   priv->params.rss_hfunc = hfunc;
+   hash_changed = true;
+   }
+
+   if (key) {
memcpy(priv->params.toeplitz_hash_key, key,
   sizeof(priv->params.toeplitz_hash_key));
+   hash_changed = hash_changed ||
+  priv->params.rss_hfunc == ETH_RSS_HASH_TOP;
+   }
 
-   if (hfunc != ETH_RSS_HASH_NO_CHANGE)
-   priv->params.rss_hfunc = hfunc;
-
-   mlx5e_modify_tirs_hash(priv, in, inlen);
+   if (hash_changed)
+   mlx5e_modify_tirs_hash(priv, in, inlen);
 
mutex_unlock(&priv->state_lock);
 
-- 
2.11.0
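
The decision logic added to mlx5e_set_rxfh() above can be isolated into a small userspace sketch to see when the TIRs would be modified. The DEMO_* constants, the demo_params struct and the 40-byte key size are hypothetical stand-ins for the ethtool and mlx5e definitions, not the real ones.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for ETH_RSS_HASH_NO_CHANGE / _TOP / _XOR. */
enum { DEMO_HASH_NO_CHANGE = 0, DEMO_HASH_TOP = 1, DEMO_HASH_XOR = 2 };

/* Hypothetical subset of the driver parameters touched by the patch. */
struct demo_params {
	int rss_hfunc;
	unsigned char key[40];
};

/* Returns true when the hash configuration would need to be reprogrammed. */
static bool demo_update_hash(struct demo_params *p, const unsigned char *key,
			     int hfunc)
{
	bool hash_changed = false;

	/* A new hash function counts only if it differs from the current one. */
	if (hfunc != DEMO_HASH_NO_CHANGE && hfunc != p->rss_hfunc) {
		p->rss_hfunc = hfunc;
		hash_changed = true;
	}

	/* A new key matters only while the Toeplitz function is in use. */
	if (key) {
		memcpy(p->key, key, sizeof(p->key));
		hash_changed = hash_changed || p->rss_hfunc == DEMO_HASH_TOP;
	}

	return hash_changed;
}
```

Passing neither a key nor a new function (the "indirection only" case from the commit message) returns false, so no TIR update would be issued.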

[PATCH net-next 3/4] net: dsa: bcm_sf2: Add CFP registers definitions

2017-01-27 Thread Florian Fainelli

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2_regs.h | 146 +
 1 file changed, 146 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2_regs.h b/drivers/net/dsa/bcm_sf2_regs.h
index 6b63c00928ba..26052450091e 100644
--- a/drivers/net/dsa/bcm_sf2_regs.h
+++ b/drivers/net/dsa/bcm_sf2_regs.h
@@ -255,4 +255,150 @@ enum bcm_sf2_reg_offs {
 #define CORE_EEE_EN_CTRL   0x24800
 #define CORE_EEE_LPI_INDICATE  0x24810
 
+#define CORE_CFP_ACC   0x28000
+#define  OP_STR_DONE   (1 << 0)
+#define  OP_SEL_SHIFT  1
+#define  OP_SEL_READ   (1 << OP_SEL_SHIFT)
+#define  OP_SEL_WRITE  (2 << OP_SEL_SHIFT)
+#define  OP_SEL_SEARCH (4 << OP_SEL_SHIFT)
+#define  OP_SEL_MASK   (7 << OP_SEL_SHIFT)
+#define  CFP_RAM_CLEAR (1 << 4)
+#define  RAM_SEL_SHIFT 10
+#define  TCAM_SEL  (1 << RAM_SEL_SHIFT)
+#define  ACT_POL_RAM   (2 << RAM_SEL_SHIFT)
+#define  RATE_METER_RAM        (4 << RAM_SEL_SHIFT)
+#define  GREEN_STAT_RAM        (8 << RAM_SEL_SHIFT)
+#define  YELLOW_STAT_RAM   (16 << RAM_SEL_SHIFT)
+#define  RED_STAT_RAM  (24 << RAM_SEL_SHIFT)
+#define  RAM_SEL_MASK  (0x1f << RAM_SEL_SHIFT)
+#define  TCAM_RESET            (1 << 15)
+#define  XCESS_ADDR_SHIFT  16
+#define  XCESS_ADDR_MASK   0xff
+#define  SEARCH_STS            (1 << 27)
+#define  RD_STS_SHIFT  28
+#define  RD_STS_TCAM   (1 << RD_STS_SHIFT)
+#define  RD_STS_ACT_POL_RAM    (2 << RD_STS_SHIFT)
+#define  RD_STS_RATE_METER_RAM (4 << RD_STS_SHIFT)
+#define  RD_STS_STAT_RAM   (8 << RD_STS_SHIFT)
+
+#define CORE_CFP_RATE_METER_GLOBAL_CTL 0x28010
+
+#define CORE_CFP_DATA_PORT_0   0x28040
+#define CORE_CFP_DATA_PORT(x)  (CORE_CFP_DATA_PORT_0 + \
+   (x) * 0x10)
+
+/* UDF_DATA7 */
+#define L3_FRAMING_SHIFT   24
+#define L3_FRAMING_MASK        (0x3 << L3_FRAMING_SHIFT)
+#define IPPROTO_SHIFT  8
+#define IPPROTO_MASK   (0xff << IPPROTO_SHIFT)
+#define IP_FRAG                (1 << 7)
+
+/* UDF_DATA0 */
+#define  SLICE_VALID   3
+#define  SLICE_NUM_SHIFT   2
+#define  SLICE_NUM(x)  ((x) << SLICE_NUM_SHIFT)
+
+#define CORE_CFP_MASK_PORT_0   0x280c0
+
+#define CORE_CFP_MASK_PORT(x)  (CORE_CFP_MASK_PORT_0 + \
+   (x) * 0x10)
+
+#define CORE_ACT_POL_DATA0 0x28140
+#define  VLAN_BYP  (1 << 0)
+#define  EAP_BYP   (1 << 1)
+#define  STP_BYP   (1 << 2)
+#define  REASON_CODE_SHIFT 3
+#define  REASON_CODE_MASK  0x3f
+#define  LOOP_BK_EN            (1 << 9)
+#define  NEW_TC_SHIFT  10
+#define  NEW_TC_MASK   0x7
+#define  CHANGE_TC (1 << 13)
+#define  DST_MAP_IB_SHIFT  14
+#define  DST_MAP_IB_MASK   0x1ff
+#define  CHANGE_FWRD_MAP_IB_SHIFT  24
+#define  CHANGE_FWRD_MAP_IB_MASK   0x3
+#define  CHANGE_FWRD_MAP_IB_NO_DEST    (0 << CHANGE_FWRD_MAP_IB_SHIFT)
+#define  CHANGE_FWRD_MAP_IB_REM_ARL    (1 << CHANGE_FWRD_MAP_IB_SHIFT)
+#define  CHANGE_FWRD_MAP_IB_REP_ARL    (2 << CHANGE_FWRD_MAP_IB_SHIFT)
+#define  CHANGE_FWRD_MAP_IB_ADD_DST    (3 << CHANGE_FWRD_MAP_IB_SHIFT)
+#define  NEW_DSCP_IB_SHIFT 26
+#define  NEW_DSCP_IB_MASK  0x3f
+
+#define CORE_ACT_POL_DATA1 0x28150
+#define  CHANGE_DSCP_IB        (1 << 0)
+#define  DST_MAP_OB_SHIFT  1
+#define  DST_MAP_OB_MASK   0x3ff
+#define  CHANGE_FWRD_MAP_OB_SHIT   11
+#define  CHANGE_FWRD_MAP_OB_MASK   0x3
+#define  NEW_DSCP_OB_SHIFT 13
+#define  NEW_DSCP_OB_MASK  0x3f
+#define  CHANGE_DSCP_OB        (1 << 19)
+#define  CHAIN_ID_SHIFT        20
+#define  CHAIN_ID_MASK 0xff
+#define  CHANGE_COLOR  (1 << 28)
+#define  NEW_COLOR_SHIFT   29
+#define  NEW_COLOR_MASK        0x3
+#define  NEW_COLOR_GREEN   (0 << NEW_COLOR_SHIFT)
+#define  NEW_COLOR_YELLOW  (1 << NEW_COLOR_SHIFT)
+#define  NEW_COLOR_RED (2 << NEW_COLOR_SHIFT)
+#define  RED_DEFAULT   (1 << 31)
+
+#define CORE_ACT_POL_DATA2 0x28160
+#define  MAC_LIMIT_BYPASS  (1 << 0)
+#define  CHANGE_TC_O   (1 << 1)
+#define  NEW_TC_O_SHIFT        2
+#define  NEW_TC_O_MASK 0x7
+#define  SPCP_RMK_DISABLE  (1 << 5)
+#define

[net 3/8] net/mlx5: E-Switch, Err when retrieving steering name-space fails

2017-01-27 Thread Saeed Mahameed

From: Or Gerlitz 

Make sure to return an error when we fail to retrieve the FDB steering
name space. Also, while at it, correctly print the error when the mode
change revert fails in the warning message.

Signed-off-by: Or Gerlitz 
Reported-by: Leon Romanovsky 
Reviewed-by: Roi Dayan 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 5803216157cf..c61bca138e65 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -424,6 +424,7 @@ static int esw_create_offloads_fdb_table(struct mlx5_eswitch *esw, int nvports)
root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB);
if (!root_ns) {
esw_warn(dev, "Failed to get FDB flow namespace\n");
+   err = -EOPNOTSUPP;
goto ns_err;
}
 
@@ -655,7 +656,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw)
esw_warn(esw->dev, "Failed setting eswitch to offloads, err %d\n", err);
err1 = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
if (err1)
-   esw_warn(esw->dev, "Failed setting eswitch back to legacy, err %d\n", err);
+   esw_warn(esw->dev, "Failed setting eswitch back to legacy, err %d\n", err1);
}
if (esw->offloads.inline_mode == MLX5_INLINE_MODE_NONE) {
if (mlx5_eswitch_inline_mode_get(esw,
-- 
2.11.0
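
The first hunk above fixes a common goto-cleanup pitfall: jumping to the error label without assigning the error code, so the caller sees success. A minimal userspace sketch of that pattern follows; `lookup_namespace()` and `create_table()` are hypothetical stand-ins, not the mlx5 functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a namespace lookup that may return NULL. */
static const char *lookup_namespace(int available)
{
	return available ? "fdb" : NULL;
}

/* Returns 0 on success or a negative errno value on failure. */
static int create_table(int ns_available)
{
	int err = 0;
	const char *ns = lookup_namespace(ns_available);

	if (!ns) {
		fprintf(stderr, "Failed to get namespace\n");
		err = -EOPNOTSUPP;	/* without this, the caller would see 0 */
		goto ns_err;
	}
	/* ... table setup would go here ... */
ns_err:
	return err;
}
```

Calling `create_table(0)` in this sketch yields `-EOPNOTSUPP` rather than a misleading 0. The second hunk is the same kind of slip in a log message: the revert path has its own result variable (`err1`), and that is the value worth printing.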

[ANNOUNCE] iptables 1.6.1 release

2017-01-27 Thread Pablo Neira Ayuso

Hi!

The Netfilter project proudly presents:

iptables 1.6.1

iptables is the userspace command line program used to configure the
Linux 2.4.x and later packet filtering ruleset. It is targeted towards
system administrators.

This update contains accumulated bugfixes, several new extensions and
lots of translations via iptables-translate to ease migration to
nftables.

See ChangeLog that comes attached to this email for more details.

You can download it from:

http://www.netfilter.org/projects/iptables/downloads.html
ftp://ftp.netfilter.org/pub/iptables/

Have fun!

Ana Rey (1):
  extensions: libxt_udp: add translation to nft

Arpan Kapoor (1):
  libxtables: Replace gethostbyname() with getaddrinfo()

Arturo Borrero (3):
  extensions/libxt_rpfilter.man: fix typo, specifiy vs specify
  iptables/xtables-arp.c: fix typo, wierd vs weird
  extensions/libxt_tcp: fix nftables translate flags value, 'none' vs '0x0'

Arturo Borrero Gonzalez (1):
  extensions: update Arturo Borrero email address

Brian Haley (1):
  iptables-restore: add missing arguments to usage message

Florian Westphal (5):
  iptables.8: mention iptables-save in -L documentation
  iptables.8: nat table has four builtin chains
  extensions: NETMAP: add ' to:' prefix when printing NETMAP target
  extensions: NETMAP: fix iptables-save output
  connlabel: clarify default config path

George Burgess IV (1):
  libxt_multiport: remove an unused variable

Giuseppe Longo (1):
  configure: make libmnl and libnftnl hard requirements

Guruswamy Basavaiah (4):
  iptables: extensions: iptables-translate prints extra "nft" after printing any error
  iptables-translate: translate iptables --flush
  iptables-translate: Printing the table name before chain name.
  iptables-translate: Don't print "nft" in iptables-restore-translate command

Gustavo Zacarias (1):
  iptables: add xtables-config-parser.h to BUILT_SOURCES

Janani Ravichandran (1):
  extensions: libip6t_rt.c: Add translation to nft

Jordan Yelloz (1):
  extensions: added AR substitution

Keno Fischer (1):
  build: Fix two compile errors during out-of-tree build

Laura Garcia Liebana (12):
  extensions: libip6t_icmp6: Add translation to nft
  extensions: libipt_LOG: Avoid to print the default log level in the translation
  extensions: libipt_icmp: Add translation to nft
  extensions: libipt_REJECT: Avoid to print the default reject with value in the translation
  extensions: libip6t_REJECT: Avoid to print the default reject with value in the translation
  extensions: libxt_ipcomp: Add translation to nft
  extensions: libip6t_hbh: Add translation to nft
  extensions: libxt_multiport: Add translation to nft
  extensions: libxt_dscp: Add translation to nft
  extensions: libip6t_frag: Add translation to nft
  extensions: libxt_cgroup: Add translation to nft
  extensions: libxt_conntrack: Add translation to nft

Liping Zhang (27):
  extensions: libxt_limit: fix a wrong translation to nft rule
  extensions: libxt_mark: fix a wrong translation to nft when mask is specified
  extensions: libxt_TRACE: Add translation to nft
  extensions: libipt_realm: fix order of mask and id when do nft translation
  extensions: libxt_connlabel: fix crash when connlabel.conf is empty
  extensions: libxt_connlabel: Add translation to nft
  extensions: libxt_NFLOG: display nflog-size even if it is zero
  extensions: libxt_NFLOG: translate to nft log snaplen if nflog-size is specified
  extensions: libxt_NFLOG: add unit test to cover nflog-size with zero
  extensions: libxt_connlabel: add unit test
  iptables-translate: add in/out ifname wildcard match translation to nft
  extensions: libxt_CLASSIFY: Add translation to nft
  extensions: libipt_DNAT/SNAT: fix "OOM" when do translation to nft
  extensions: libip[6]t_SNAT/DNAT: use the new nft syntax when do xlate
  extensions: libip[6]t_REDIRECT: use new nft syntax when do xlate
  extensions: libip6t_SNAT/DNAT: add square bracket in xlat output when port is specified
  extensions: libipt_realm: add a missing space in translation
  extensions: libxt_iprange: rename "ip saddr" to "ip6 saddr" in ip6tables-xlate
  extensions: libxt_iprange: handle the invert flag properly in translation
  extensions: libxt_devgroup: handle the invert flag properly in translation
  extensions: libxt_ipcomp: add range support in translation
  extensions: libxt_quota: add translation to nft
  extensions: libxt_DSCP: add translation to nft
  extensions: libxt_statistic: add translation to nft
  extensions: LOG: add log flags translation to nft
  extensions: libxt_connbytes: Add translation to nft
  extensions: libxt_rpfilter: add translation to nft

Loganaden Velvindron (1):
  libxt_TCPOPTSTRIP: Fix musl compatibility

Pablo M. Bermudo Garay (11):

Re: [PATCH] cfg80211 debugfs: Cleanup some checkpatch issues

2017-01-27 Thread Johannes Berg

On Fri, 2017-01-27 at 22:26 +0300, Pichugin Dmitry wrote:
> This fixes the checkpatch.pl warnings:
> * Macros should not use a trailing semicolon.
> * Spaces required around that '='.
> * Symbolic permissions 'S_IRUGO' are not preferred.
> * Macro argument reuse 'buflen' - possible side-effects

I really see no point in any of this.

johannes

Re: [PATCH net-next 1/4] mlx5: Make building eswitch configurable

2017-01-27 Thread Saeed Mahameed

On Fri, Jan 27, 2017 at 8:33 PM, Tom Herbert  wrote:
> On Fri, Jan 27, 2017 at 10:19 AM, Saeed Mahameed
>  wrote:
>> On Fri, Jan 27, 2017 at 1:32 AM, Tom Herbert  wrote:
>>> Add a configuration option (CONFIG_MLX5_CORE_EN_ESWITCH) for controlling
>>> whether the eswitch code is built. Change Kconfig and Makefile
>>> accordingly.
>>>
>>> Signed-off-by: Tom Herbert 
>>> ---
>>>  drivers/net/ethernet/mellanox/mlx5/core/Kconfig   | 11 +++
>>>  drivers/net/ethernet/mellanox/mlx5/core/Makefile  |  6 +-
>>>  drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 92 
>>> +--
>>>  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c   | 39 +++---
>>>  drivers/net/ethernet/mellanox/mlx5/core/eq.c  |  4 +-
>>>  drivers/net/ethernet/mellanox/mlx5/core/main.c| 16 ++--
>>>  drivers/net/ethernet/mellanox/mlx5/core/sriov.c   |  6 +-
>>>  7 files changed, 125 insertions(+), 49 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
>>> index ddb4ca4..27aae58 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
>>> @@ -30,3 +30,14 @@ config MLX5_CORE_EN_DCB
>>>   This flag is depended on the kernel's DCB support.
>>>
>>>   If unsure, set to Y
>>> +
>>> +config MLX5_CORE_EN_ESWITCH
>>> +   bool "Ethernet switch"
>>> +   default y
>>> +   depends on MLX5_CORE_EN
>>> +   ---help---
>>> + Say Y here if you want to use Ethernet switch (E-switch). E-Switch
>>> + is the software entity that represents and manages ConnectX4
>>> + inter-HCA ethernet l2 switching.
>>> +
>>> + If unsure, set to Y
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>>> index 9f43beb..17025d8 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
>>> @@ -5,9 +5,11 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
>>> mad.o transobj.o vport.o sriov.o fs_cmd.o fs_core.o \
>>> fs_counters.o rl.o lag.o dev.o
>>>
>>> -mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o eswitch.o eswitch_offloads.o \
>>> +mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o \
>>> en_main.o en_common.o en_fs.o en_ethtool.o en_tx.o \
>>> en_rx.o en_rx_am.o en_txrx.o en_clock.o vxlan.o \
>>> -   en_tc.o en_arfs.o en_rep.o en_fs_ethtool.o en_selftest.o
>>> +   en_tc.o en_arfs.o en_fs_ethtool.o en_selftest.o
>>>
>>>  mlx5_core-$(CONFIG_MLX5_CORE_EN_DCB) +=  en_dcbnl.o
>>> +
>>> +mlx5_core-$(CONFIG_MLX5_CORE_EN_ESWITCH) += eswitch.o eswitch_offloads.o en_rep.o
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> index e829143..1db4d98 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> @@ -38,7 +38,9 @@
>>>  #include 
>>>  #include "en.h"
>>>  #include "en_tc.h"
>>> +#ifdef CONFIG_MLX5_CORE_EN_ESWITCH
>>>  #include "eswitch.h"
>>> +#endif
>>
>> Wouldn't it be cleaner to leave the main code (en_main.c) untouched and
>> keep this #include "eswitch.h"? Instead of filling the main flow code with
>> #ifdefs, eswitch.h can provide mock eswitch API functions when
>> CONFIG_MLX5_CORE_EN_ESWITCH is not enabled. The main flow will then be
>> free of ifdefs and will compile against the mock functions.
>>
>> We did this with the accelerated RFS feature. Look for "#ifndef
>> CONFIG_RFS_ACCEL" in en.h.
>>
> There are still occurrences of CONFIG_RFS_ACCEL in en_main.c and
> main.c. For eswitch it's a header problem: not everything related to
> eswitch was extracted out of main through backend functions. There is a
> lot of eswitch-related code that is intertwined with the core
> code.
>

Interesting. I just had a quick look and it seems to me that all eswitch
logic in en_main.c can be kept untouched if we have the right mock
functions. On the other hand, there seem to be a lot of eswitch
functions to mock, so I am no longer sure it is a good idea.
Let's leave it as is for now.
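
For readers unfamiliar with the "mock functions in the header" approach discussed above, here is a minimal sketch. `CONFIG_EXAMPLE_FEATURE`, `example_feature_init()` and `main_flow_init()` are hypothetical names chosen for illustration; they are not mlx5 symbols.

```c
/* Header-style section, inlined here so the sketch is self-contained. */
#ifdef CONFIG_EXAMPLE_FEATURE	/* hypothetical option, not a real Kconfig symbol */
int example_feature_init(int nports);
#else
/* Stub used when the feature is compiled out: callers need no #ifdef. */
static inline int example_feature_init(int nports)
{
	(void)nports;
	return 0;
}
#endif

/* "Main flow" code calls the API unconditionally. */
static int main_flow_init(int nports)
{
	return example_feature_init(nports);
}
```

The trade-off raised in the thread still applies: every function the main flow calls needs a stub, so the header grows with the size of the API being compiled out.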

Re: [iproute PATCH] man: tc-csum.8: Fix example

2017-01-27 Thread Guillaume Nault

On Fri, Jan 27, 2017 at 12:15:01PM +0100, Phil Sutter wrote:
> +# tc filter add dev eth0 prio 1 protocol ip parent : \\
>   u32 match ip src 192.168.1.100/32 flowid :1 \\
> - action pedit munge ip dst set 0x12345678 pipe \\
> + action pedit munge ip dst set 1.2.3.4 pipe \\
> 
Just nitpicking here, but IMHO examples like this should better use IP
addresses reserved for documentation (192.0.2.0/24, 198.51.100.0/24 or
203.0.113.0/24).
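
The three ranges listed above are the IPv4 documentation blocks from RFC 5737. As a small illustration, the following sketch checks whether an address falls inside one of them; the helper names `ipv4()` and `is_documentation_addr()` are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* True if addr falls in one of the three /24 documentation blocks. */
static bool is_documentation_addr(uint32_t addr)
{
	const uint32_t mask = 0xffffff00u;	/* /24 */
	const uint32_t net = addr & mask;

	return net == ipv4(192, 0, 2, 0) ||
	       net == ipv4(198, 51, 100, 0) ||
	       net == ipv4(203, 0, 113, 0);
}
```

With this check, both 1.2.3.4 and 192.168.1.100 from the man page example are rejected, while 192.0.2.1 is accepted.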

[net 8/8] net/mlx5e: Check ets capability before ets query FW command

2017-01-27 Thread Saeed Mahameed

From: Moshe Shemesh 

On dcbnl callback getpgtccfgtx, the driver should check the ets
capability before ets query command is sent to firmware.
It is valid to return from this void function without changing in/out
parameters, as these parameters are initialized to
DCB_ATTR_VALUE_UNDEFINED.

Signed-off-by: Moshe Shemesh 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index 35f9ae037ba0..0523ed47f597 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -511,6 +511,11 @@ static void mlx5e_dcbnl_getpgtccfgtx(struct net_device *netdev,
struct mlx5e_priv *priv = netdev_priv(netdev);
struct mlx5_core_dev *mdev = priv->mdev;
 
+   if (!MLX5_CAP_GEN(priv->mdev, ets)) {
+   netdev_err(netdev, "%s, ets is not supported\n", __func__);
+   return;
+   }
+
if (priority >= CEE_DCBX_MAX_PRIO) {
netdev_err(netdev,
   "%s, priority is out of range\n", __func__);
-- 
2.11.0
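
The commit message relies on the caller pre-initializing the output parameters, so an early return from the void callback is safe. A minimal userspace sketch of that contract follows; `struct fake_dev`, `get_pg_tc_cfg()`, `ATTR_UNDEFINED` and `MAX_PRIO` are illustrative stand-ins, not the driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATTR_UNDEFINED 0xff	/* illustrative stand-in for DCB_ATTR_VALUE_UNDEFINED */
#define MAX_PRIO 8		/* illustrative stand-in for CEE_DCBX_MAX_PRIO */

struct fake_dev {
	bool ets_supported;
	uint8_t prio_to_tc[MAX_PRIO];
};

/* Fills *pgid only when the capability exists and priority is in range. */
static void get_pg_tc_cfg(const struct fake_dev *dev, int priority, uint8_t *pgid)
{
	if (!dev->ets_supported)
		return;		/* *pgid keeps the caller's initial value */
	if (priority < 0 || priority >= MAX_PRIO)
		return;
	*pgid = dev->prio_to_tc[priority];
}
```

A caller that sets `pgid` to the undefined marker before the call can tell afterwards whether the device actually answered.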

Re: [PATCH] net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause

2017-01-27 Thread Florian Fainelli

On 01/27/2017 12:39 PM, Sean Nyekjaer wrote:
> As per commit "net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause",
> this PHY driver should not set these feature bits.
> 
> Signed-off-by: Sean Nyekjaer 
> Fixes: 9d162ed69f51 ("net: phy: micrel: add support for KSZ8795")

Reviewed-by: Florian Fainelli 
-- 
Florian

Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver

2017-01-27 Thread Willem de Bruijn

On Fri, Jan 27, 2017 at 3:06 PM, Sowmini Varadhan
 wrote:
> On (01/27/17 14:29), Willem de Bruijn wrote:
>>
>> As your patch states, the contract is that any packet delivered to a
>> driver has the entire L2 in its linear section. Drivers are not required
>> to be robust against shorter packets, so there is no reason to test
>> those.
>>
>> One option is to limit your fix to known fixed-header protocols.
>> In these cases hard_header_len is the minimum, so anything
>> smaller must be dropped.
>
> yes, but how would you know that this is a fixed-header protocol
> or a var-hdrlen protocol? AIUI the hard_header_len itself will not
> tell you this info: it will be 77 for ax25, 14 for ethernet,
> but that does not tell me that ax25 is the "robust-er" driver
> with a min requirement of 21 for the hdrlen.

Right. Identifying the outliers is the hard part.

> That's why I was thinking of a IFF_L2_VARHDRLEN in the priv_flags
> of the net_device.
>
>> For protocols with variable header length it is fine to send packets
>> shorter than hard_header_len, even with corrupted content (i.e.,
>> even if they would fail that protocol's validate callback), as long as
>> they exceed the minimum length. ax25 already has a min length
>> check through its protocol-specific validate callback.
>
> Another option that comes to mind.. the real thorn-in-the-flesh
> here is the CAP_SYS_RAWIO check. Would it be a better idea to ask
> the test-suites (since they seem to be the major consumer of
> that path) to use a special PF_PACKET socket option instead, that

Introducing a sysctl has the same effect. It is not possible to
identify all callers dependent on the current ABI.

I see these options:
- make capable() check conditional on sysctl (or interface flag, ..)
- limit capable() check to drivers with a .validate callback
- hardcode a list of known fixed length protocols that must fail
- let privileged applications shoot themselves in the foot (change nothing).

The first will break tests. Though with a runtime fix: flip the flag.

The second will break variable length header protocols unless
you exhaustively search for all variable length protocols and add
validate callbacks first.
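
To make the fixed versus variable header distinction concrete, here is a rough sketch of a length check that defers to a protocol's validate callback when one exists and otherwise treats hard_header_len as the minimum. It is not the af_packet code; `struct l2_proto`, `ax25_like_validate()` and `l2_header_ok()` are hypothetical, and the numbers 14, 77 and 21 are the ones quoted in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

struct l2_proto {
	size_t hard_header_len;	/* fixed length, or maximum for variable headers */
	/* NULL for fixed-length headers */
	bool (*validate)(const unsigned char *hdr, size_t len);
};

/* Variable-length protocol: accept anything at or above its own minimum. */
static bool ax25_like_validate(const unsigned char *hdr, size_t len)
{
	(void)hdr;
	return len >= 21;	/* minimum quoted in the thread for ax25 */
}

static bool l2_header_ok(const struct l2_proto *p, const unsigned char *hdr, size_t len)
{
	if (p->validate)
		return p->validate(hdr, len);
	return len >= p->hard_header_len;	/* fixed header: this is the minimum */
}
```

The weakness named in the thread shows up directly: a variable-length protocol that has not yet been given a validate callback would be held to its maximum header length and break.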

pull-request: can 2017-01-27

2017-01-27 Thread Marc Kleine-Budde

Hello David,

this is a pull request for net/master.

It consists of a single patch by Eric Dumazet that fixes a kernel panic at
security_sock_rcv_skb.

regards,
Marc

---

The following changes since commit 950eabbd6ddedc1b08350b9169a6a51b130ebaaf:

  ISDN: eicon: silence misleading array-bounds warning (2017-01-27 11:27:34 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git tags/linux-can-fixes-for-4.10-20170127

for you to fetch changes up to f30dc84e2d5ef45a715de546529e7693733b11fb:

  can: Fix kernel panic at security_sock_rcv_skb (2017-01-27 21:37:51 +0100)


linux-can-fixes-for-4.10-20170127


Eric Dumazet (1):
  can: Fix kernel panic at security_sock_rcv_skb

 include/linux/can/core.h |  7 +++
 net/can/af_can.c | 12 ++--
 net/can/af_can.h |  3 ++-
 net/can/bcm.c|  4 ++--
 net/can/gw.c |  2 +-
 net/can/raw.c|  4 ++--
 6 files changed, 20 insertions(+), 12 deletions(-)

[PATCH] can: Fix kernel panic at security_sock_rcv_skb

2017-01-27 Thread Marc Kleine-Budde

From: Eric Dumazet 

Zhang Yanmin reported crashes [1] and provided a patch adding a
synchronize_rcu() call in can_rx_unregister().

The main problem seems to be that the sockets themselves are not RCU
protected.

If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.

Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
ease stable backports with the following fix instead.

[1]
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0

Call Trace:
 
 [] security_sock_rcv_skb+0x4c/0x60
 [] sk_filter+0x41/0x210
 [] sock_queue_rcv_skb+0x53/0x3a0
 [] raw_rcv+0x2a3/0x3c0
 [] can_rcv_filter+0x12b/0x370
 [] can_receive+0xd9/0x120
 [] can_rcv+0xab/0x100
 [] __netif_receive_skb_core+0xd8c/0x11f0
 [] __netif_receive_skb+0x24/0xb0
 [] process_backlog+0x127/0x280
 [] net_rx_action+0x33b/0x4f0
 [] __do_softirq+0x184/0x440
 [] do_softirq_own_stack+0x1c/0x30
 
 [] do_softirq.part.18+0x3b/0x40
 [] do_softirq+0x1d/0x20
 [] netif_rx_ni+0xe5/0x110
 [] slcan_receive_buf+0x507/0x520
 [] flush_to_ldisc+0x21c/0x230
 [] process_one_work+0x24f/0x670
 [] worker_thread+0x9d/0x6f0
 [] ? rescuer_thread+0x480/0x480
 [] kthread+0x12c/0x150
 [] ret_from_fork+0x3f/0x70

Reported-by: Zhang Yanmin 
Signed-off-by: Eric Dumazet 
Acked-by: Oliver Hartkopp 
Cc: linux-stable 
Signed-off-by: Marc Kleine-Budde 
---
 include/linux/can/core.h |  7 +++
 net/can/af_can.c | 12 ++--
 net/can/af_can.h |  3 ++-
 net/can/bcm.c|  4 ++--
 net/can/gw.c |  2 +-
 net/can/raw.c|  4 ++--
 6 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/include/linux/can/core.h b/include/linux/can/core.h
index a0875001b13c..df08a41d5be5 100644
--- a/include/linux/can/core.h
+++ b/include/linux/can/core.h
@@ -45,10 +45,9 @@ struct can_proto {
 extern int  can_proto_register(const struct can_proto *cp);
 extern void can_proto_unregister(const struct can_proto *cp);
 
-extern int  can_rx_register(struct net_device *dev, canid_t can_id,
-   canid_t mask,
-   void (*func)(struct sk_buff *, void *),
-   void *data, char *ident);
+int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
+   void (*func)(struct sk_buff *, void *),
+   void *data, char *ident, struct sock *sk);
 
 extern void can_rx_unregister(struct net_device *dev, canid_t can_id,
  canid_t mask,
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1108079d934f..5488e4a6ccd0 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -445,6 +445,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, canid_t *mask,
  * @func: callback function on filter match
  * @data: returned parameter for callback function
  * @ident: string for calling module identification
+ * @sk: socket pointer (might be NULL)
  *
  * Description:
  *  Invokes the callback function with the received sk_buff and the given
@@ -468,7 +469,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, canid_t *mask,
  */
 int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
void (*func)(struct sk_buff *, void *), void *data,
-   char *ident)
+   char *ident, struct sock *sk)
 {
struct receiver *r;
struct hlist_head *rl;
@@ -496,6 +497,7 @@ int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
r->func= func;
r->data= data;
r->ident   = ident;
+   r->sk  = sk;
 
hlist_add_head_rcu(&r->list, rl);
d->entries++;
@@ -520,8 +522,11 @@ EXPORT_SYMBOL(can_rx_register);
 static void can_rx_delete_receiver(struct rcu_head *rp)
 {
struct receiver *r = container_of(rp, struct receiver, rcu);
+   struct sock *sk = r->sk;
 
kmem_cache_free(rcv_cache, r);
+   if (sk)
+   sock_put(sk);
 }
 
 /**
@@ -596,8 +601,11 @@ void can_rx_unregister(struct net_device *dev, canid_t can_id, canid_t mask,
spin_unlock(&can_rcvlists_lock);
 
/* schedule the receiver item for deletion */
-   if (r)
+   if (r) {
+   if (r->sk)
+   sock_hold(r->sk);
call_rcu(&r->rcu, can_rx_delete_receiver);
+   }
 }
 EXPORT_SYMBOL(can_rx_unregister);
 
diff --git a/net/can/af_can.h b/net/can/af_can.h
index fca0fe9fc45a..b86f5129e838 100644
--- a/net/can/af_can.h
+++ b/net/can/af_can.h
@@ -50,13 +50,14 @@
 
 struct receiver {
struct hlist_node list;
-   struct rcu_head rcu;
canid_t can_id;
canid_t mask;
unsigned long matches;
void (*func)(struct sk_buff *, void *);
void *data;
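
The lifetime rule in this patch (take a socket reference when the receiver is queued for deferred deletion, drop it in the deferred callback) can be modelled in plain C. The sketch below is a userspace simulation under that assumption: there is no real RCU here, and all `*_like` names are hypothetical.

```c
#include <stdlib.h>

struct sock_like { int refcnt; int freed; };

static void sock_like_hold(struct sock_like *sk) { sk->refcnt++; }

static void sock_like_put(struct sock_like *sk)
{
	if (--sk->refcnt == 0)
		sk->freed = 1;	/* real code would release the object here */
}

struct receiver_like { struct sock_like *sk; };

/* Unregister: pin the socket, then hand the receiver to the deferred step. */
static struct receiver_like *rx_unregister(struct receiver_like *r)
{
	if (r->sk)
		sock_like_hold(r->sk);
	return r;	/* stands in for call_rcu(&r->rcu, ...) */
}

/* Deferred step, run after the grace period. */
static void rx_delete_receiver(struct receiver_like *r)
{
	struct sock_like *sk = r->sk;	/* read before freeing r */

	free(r);
	if (sk)
		sock_like_put(sk);
}
```

Even if the socket owner drops its last reference right after unregistering, the socket in this model stays alive until the deferred step has run, which is the property the fix is after.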

Re: [net-next] openvswitch: Simplify do_execute_actions().

2017-01-27 Thread Pravin Shelar

On Wed, Jan 25, 2017 at 9:24 PM, Andy Zhou  wrote:
> do_execute_actions() implements a worthwhile optimization: in case
> an output action is the last action in an action list, skb_clone()
> can be avoided by outputting the current skb. However, the
> implementation is more complicated than necessary.  This patch
> simplifies this logic.
>
> Signed-off-by: Andy Zhou 
> ---
>  net/openvswitch/actions.c | 40 +++-
>  1 file changed, 19 insertions(+), 21 deletions(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 514f7bc..3866608 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -830,6 +830,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
>  {
> struct vport *vport = ovs_vport_rcu(dp, out_port);
>
> +   if (unlikely(!skb))
> +   return;
> +
Patch looks good to me. But did you consider moving this check to
do_execute_actions(), into the path where skb_clone() is done? That way
we can keep this unlikely check out of the likely case :)


> if (likely(vport)) {
> u16 mru = OVS_CB(skb)->mru;
> u32 cutlen = OVS_CB(skb)->cutlen;
> @@ -1141,12 +1144,6 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>   struct sw_flow_key *key,
>   const struct nlattr *attr, int len)
>  {
> -   /* Every output action needs a separate clone of 'skb', but the common
> -* case is just a single output action, so that doing a clone and
> -* then freeing the original skbuff is wasteful.  So the following code
> -* is slightly obscure just to avoid that.
> -*/
> -   int prev_port = -1;
> const struct nlattr *a;
> int rem;
>
> @@ -1154,20 +1151,25 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  a = nla_next(a, &rem)) {
> int err = 0;
>
> -   if (unlikely(prev_port != -1)) {
> -   struct sk_buff *out_skb = skb_clone(skb, GFP_ATOMIC);
> +   switch (nla_type(a)) {
> +   case OVS_ACTION_ATTR_OUTPUT: {
> +   int port = nla_get_u32(a);
>
> -   if (out_skb)
> -   do_output(dp, out_skb, prev_port, key);
> +   /* Every output action needs a separate clone
> +* of 'skb', In case the output action is the
> +* last action, cloning can be avoided.
> +*/
> +   if (nla_is_last(a, rem)) {
> +   do_output(dp, skb, port, key);
> +   /* 'skb' has been used for output.
> +*/
> +   return 0;
> +   }
>
> +   do_output(dp, skb_clone(skb, GFP_ATOMIC), port, key);
> OVS_CB(skb)->cutlen = 0;
> -   prev_port = -1;
> -   }
> -
> -   switch (nla_type(a)) {
> -   case OVS_ACTION_ATTR_OUTPUT:
> -   prev_port = nla_get_u32(a);
> break;
> +   }
>
> case OVS_ACTION_ATTR_TRUNC: {
> struct ovs_action_trunc *trunc = nla_data(a);
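
The optimization under review can be summarized as "clone for every output action except a final one". The following sketch models only that counting logic in userspace; `struct action`, `struct stats` and `execute_actions()` are hypothetical and no sk_buff handling is involved.

```c
#include <stddef.h>

enum action_type { ACT_OUTPUT, ACT_OTHER };

struct action { enum action_type type; int port; };

struct stats { int clones; int outputs; };

/* Walk the action list; clone the packet for every output except a final one. */
static void execute_actions(const struct action *acts, size_t n, struct stats *st)
{
	for (size_t i = 0; i < n; i++) {
		if (acts[i].type != ACT_OUTPUT)
			continue;
		if (i + 1 == n) {
			st->outputs++;	/* last action: send the original packet */
			return;
		}
		st->clones++;		/* stands in for skb_clone() */
		st->outputs++;
	}
}
```

For a list of output, other, output, this model performs one clone and two outputs. An output that is followed by any other action still needs a clone, because later actions may modify the original packet.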

Re: [PATCH 2/6] wl1251: Use request_firmware_prefer_user() for loading NVS calibration data

2017-01-27 Thread Arend Van Spriel

On 27-1-2017 8:33, Kalle Valo wrote:
> Pali Rohár  writes:
> 
>> NVS calibration data for wl1251 are model specific. Every device with a
>> wl1251 chip has its own data, calibrated in the factory.
>>
>> Not all wl1251 chips have their own EEPROM where the calibration data are
>> stored, and in that case there is no "standard" place. Every device stores
>> them in a different place (some in a rootfs file, some in a dedicated nand
>> partition, some in another proprietary structure).
>>
>> The kernel wl1251 driver cannot support every storage scheme chosen by a
>> device manufacturer, so it will use the request_firmware_prefer_user() call
>> for loading NVS calibration data, and a userspace helper will be responsible
>> for preparing the correct data.
>>
>> In case the userspace helper fails, request_firmware_prefer_user() still
>> tries to load the data file directly from VFS as a fallback mechanism.
>>
>> On the Nokia N900 device, which has a wl1251 chip, NVS calibration data are
>> stored in the CAL nand partition. CAL is a proprietary Nokia key/value format
>> for nand devices.
>>
>> With this patch it is finally possible to load correct model specific NVS
>> calibration data for Nokia N900.
>>
>> Signed-off-by: Pali Rohár 
>> ---
>>  drivers/net/wireless/ti/wl1251/Kconfig |1 +
>>  drivers/net/wireless/ti/wl1251/main.c  |2 +-
>>  2 files changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/wireless/ti/wl1251/Kconfig b/drivers/net/wireless/ti/wl1251/Kconfig
>> index 7142ccf..affe154 100644
>> --- a/drivers/net/wireless/ti/wl1251/Kconfig
>> +++ b/drivers/net/wireless/ti/wl1251/Kconfig
>> @@ -2,6 +2,7 @@ config WL1251
>>  tristate "TI wl1251 driver support"
>>  depends on MAC80211
>>  select FW_LOADER
>> +select FW_LOADER_USER_HELPER
>>  select CRC7
>>  ---help---
>>This will enable TI wl1251 driver support. The drivers make
>> diff --git a/drivers/net/wireless/ti/wl1251/main.c b/drivers/net/wireless/ti/wl1251/main.c
>> index 208f062..24f8866 100644
>> --- a/drivers/net/wireless/ti/wl1251/main.c
>> +++ b/drivers/net/wireless/ti/wl1251/main.c
>> @@ -110,7 +110,7 @@ static int wl1251_fetch_nvs(struct wl1251 *wl)
>>  struct device *dev = wiphy_dev(wl->hw->wiphy);
>>  int ret;
>>  
>> -ret = request_firmware(&fw, WL1251_NVS_NAME, dev);
>> +ret = request_firmware_prefer_user(&fw, WL1251_NVS_NAME, dev);
> 
> I don't see the need for this. Just remove the default nvs file from the
> filesystem and the fallback user helper will always be used, right?

Indeed. The only remaining issue would be that an error message is
logged. Also note the fallback is only used if selected in Kconfig.

> Like we discussed earlier, the default nvs file should not be used by
> normal users.

Yup.

Regards,
Arend

[PATCH] net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause

2017-01-27 Thread Sean Nyekjaer

As per commit "net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause",
this PHY driver should not set these feature bits.

Signed-off-by: Sean Nyekjaer 
Fixes: 9d162ed69f51 ("net: phy: micrel: add support for KSZ8795")
---
 drivers/net/phy/micrel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index e55809c5beb7..6742070ca676 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -1012,7 +1012,7 @@ static struct phy_driver ksphy_driver[] = {
.phy_id = PHY_ID_KSZ8795,
.phy_id_mask= MICREL_PHY_ID_MASK,
.name   = "Micrel KSZ8795",
-   .features   = (SUPPORTED_Pause | SUPPORTED_Asym_Pause),
+   .features   = PHY_BASIC_FEATURES,
.flags  = PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
.config_init= kszphy_config_init,
.config_aneg= ksz8873mll_config_aneg,
-- 
2.11.0

[PATCH net-next 0/9] net: dsa: preparatory patches for multi-chip

2017-01-27 Thread Vivien Didelot

In order to introduce support for multi-chip configuration, we need to
do a few enhancements. This patchset makes the number of ports in a
switch dynamic (instead of capping to DSA_MAX_PORTS), stores the switch
and index of a port in the dsa_port structure, uses it in the slave
private structure, and exposes the bridge device a port belongs to.

Vivien Didelot (9):
  net: dsa: variable number of ports
  net: dsa: use ds->num_ports when possible
  net: dsa: add ds and index to dsa_port
  net: dsa: store a dsa_port in dsa_slave_priv
  net: dsa: move bridge device in dsa_port
  net: dsa: pass bridge device when a port leaves
  net: dsa: mv88e6xxx: use dsa_port's bridge pointer
  net: dsa: qca8k: use dsa_port's bridge pointer
  net: dsa: b53: use dsa_port's bridge pointer

 drivers/net/dsa/b53/b53_common.c  |  18 ++--
 drivers/net/dsa/b53/b53_priv.h|   3 +-
 drivers/net/dsa/mv88e6xxx/chip.c  |  33 +++---
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |   6 --
 drivers/net/dsa/qca8k.c   |  17 ++--
 drivers/net/dsa/qca8k.h   |   1 -
 include/net/dsa.h |  12 ++-
 net/dsa/dsa.c |  21 ++--
 net/dsa/dsa2.c|  34 +--
 net/dsa/dsa_priv.h|   9 +-
 net/dsa/slave.c   | 182 +-
 net/dsa/tag_brcm.c|   6 +-
 net/dsa/tag_dsa.c |  10 +-
 net/dsa/tag_edsa.c|  10 +-
 net/dsa/tag_qca.c |   2 +-
 net/dsa/tag_trailer.c |   4 +-
 16 files changed, 186 insertions(+), 182 deletions(-)

-- 
2.11.0
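
As a rough illustration of the data layout this series describes (a per-switch port count fixed at allocation time, and each port carrying a back-pointer to its switch plus its own index), consider the sketch below. The struct and function names are hypothetical, and the flexible array member is an assumption made for the example, not necessarily how the patches implement it.

```c
#include <stdlib.h>

struct sw;	/* forward declaration */

struct port {
	struct sw *ds;		/* owning switch */
	unsigned int index;	/* position within ds->ports */
	const void *bridge_dev;	/* bridge this port belongs to, or NULL */
};

struct sw {
	size_t num_ports;
	struct port ports[];	/* flexible array sized at allocation time */
};

/* Allocate a switch with n ports and fill in each port's back-pointers. */
static struct sw *sw_alloc(size_t n)
{
	struct sw *ds = calloc(1, sizeof(*ds) + n * sizeof(ds->ports[0]));

	if (!ds)
		return NULL;
	ds->num_ports = n;
	for (size_t i = 0; i < n; i++) {
		ds->ports[i].ds = ds;
		ds->ports[i].index = (unsigned int)i;
	}
	return ds;
}

struct slave_priv {
	struct port *dp;	/* replaces a separate (switch, port index) pair */
};
```

With a single `dp` pointer, slave code reaches the switch as `p->dp->ds` and the port number as `p->dp->index`, and a driver can compare `ds->ports[i].bridge_dev` directly instead of caching the bridge pointer itself.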

[PATCH net-next 9/9] net: dsa: b53: use dsa_port's bridge pointer

2017-01-27 Thread Vivien Didelot

Now that DSA exposes the bridge device pointer to which a port belongs,
use it when programming the port based VLANs and thus remove the cache.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/b53/b53_common.c | 9 +++--
 drivers/net/dsa/b53/b53_priv.h   | 1 -
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 32fdcf5570c8..3a7d16b6c3eb 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1308,7 +1308,7 @@ int b53_fdb_dump(struct dsa_switch *ds, int port,
 }
 EXPORT_SYMBOL(b53_fdb_dump);
 
-int b53_br_join(struct dsa_switch *ds, int port, struct net_device *bridge)
+int b53_br_join(struct dsa_switch *ds, int port, struct net_device *br)
 {
struct b53_device *dev = ds->priv;
s8 cpu_port = ds->dst->cpu_port;
@@ -1326,11 +1326,10 @@ int b53_br_join(struct dsa_switch *ds, int port, struct net_device *bridge)
b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg);
}
 
-   dev->ports[port].bridge_dev = bridge;
b53_read16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), &pvlan);
 
b53_for_each_port(dev, i) {
-   if (dev->ports[i].bridge_dev != bridge)
+   if (ds->ports[i].bridge_dev != br)
continue;
 
/* Add this local port to the remote port VLAN control
@@ -1357,7 +1356,6 @@ EXPORT_SYMBOL(b53_br_join);
 void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br)
 {
struct b53_device *dev = ds->priv;
-   struct net_device *bridge = dev->ports[port].bridge_dev;
struct b53_vlan *vl = &dev->vlans[0];
s8 cpu_port = ds->dst->cpu_port;
unsigned int i;
@@ -1367,7 +1365,7 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br)
 
b53_for_each_port(dev, i) {
/* Don't touch the remaining ports */
-   if (dev->ports[i].bridge_dev != bridge)
+   if (ds->ports[i].bridge_dev != br)
continue;
 
b53_read16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(i), );
@@ -1382,7 +1380,6 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br)
 
b53_write16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), pvlan);
dev->ports[port].vlan_ctl_mask = pvlan;
-   dev->ports[port].bridge_dev = NULL;
 
if (is5325(dev) || is5365(dev))
pvid = 1;
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 5dafb70e75fc..9d87889728ac 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -70,7 +70,6 @@ enum {
 
 struct b53_port {
u16 vlan_ctl_mask;
-   struct net_device *bridge_dev;
 };
 
 struct b53_vlan {
-- 
2.11.0

[PATCH net-next 4/9] net: dsa: store a dsa_port in dsa_slave_priv

2017-01-27 Thread Vivien Didelot

Store a pointer to the dsa_port structure in the dsa_slave_priv
structure, instead of the switch/port index.

This will allow storing more information, such as the bridge device,
needed in DSA drivers for multi-chip configuration.

Signed-off-by: Vivien Didelot 
---
 net/dsa/dsa_priv.h|   8 +--
 net/dsa/slave.c   | 164 +-
 net/dsa/tag_brcm.c|   4 +-
 net/dsa/tag_dsa.c |   8 +--
 net/dsa/tag_edsa.c|   8 +--
 net/dsa/tag_qca.c |   2 +-
 net/dsa/tag_trailer.c |   2 +-
 7 files changed, 96 insertions(+), 100 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 16194a4bb2fe..c519bd0e9206 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -25,12 +25,8 @@ struct dsa_slave_priv {
struct sk_buff *(*xmit)(struct sk_buff *skb,
struct net_device *dev);
 
-   /*
-* Which switch this port is a part of, and the port index
-* for this port.
-*/
-   struct dsa_switch   *parent;
-   u8  port;
+   /* DSA port data, such as switch, port index, etc. */
+   struct dsa_port *dp;
 
/*
 * The phylib phy_device pointer for the PHY connected
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 759824ba5545..2ea220bc4bba 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -61,7 +61,7 @@ static int dsa_slave_get_iflink(const struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
 
-   return p->parent->dst->master_netdev->ifindex;
+   return p->dp->ds->dst->master_netdev->ifindex;
 }
 
 static inline bool dsa_port_is_bridged(struct dsa_slave_priv *p)
@@ -96,8 +96,8 @@ static void dsa_port_set_stp_state(struct dsa_switch *ds, int 
port, u8 state)
 static int dsa_slave_open(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
-   struct net_device *master = p->parent->dst->master_netdev;
-   struct dsa_switch *ds = p->parent;
+   struct net_device *master = p->dp->ds->dst->master_netdev;
+   struct dsa_switch *ds = p->dp->ds;
u8 stp_state = dsa_port_is_bridged(p) ?
BR_STATE_BLOCKING : BR_STATE_FORWARDING;
int err;
@@ -123,12 +123,12 @@ static int dsa_slave_open(struct net_device *dev)
}
 
if (ds->ops->port_enable) {
-   err = ds->ops->port_enable(ds, p->port, p->phy);
+   err = ds->ops->port_enable(ds, p->dp->index, p->phy);
if (err)
goto clear_promisc;
}
 
-   dsa_port_set_stp_state(ds, p->port, stp_state);
+   dsa_port_set_stp_state(ds, p->dp->index, stp_state);
 
if (p->phy)
phy_start(p->phy);
@@ -151,8 +151,8 @@ static int dsa_slave_open(struct net_device *dev)
 static int dsa_slave_close(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
-   struct net_device *master = p->parent->dst->master_netdev;
-   struct dsa_switch *ds = p->parent;
+   struct net_device *master = p->dp->ds->dst->master_netdev;
+   struct dsa_switch *ds = p->dp->ds;
 
if (p->phy)
phy_stop(p->phy);
@@ -168,9 +168,9 @@ static int dsa_slave_close(struct net_device *dev)
dev_uc_del(master, dev->dev_addr);
 
if (ds->ops->port_disable)
-   ds->ops->port_disable(ds, p->port, p->phy);
+   ds->ops->port_disable(ds, p->dp->index, p->phy);
 
-   dsa_port_set_stp_state(ds, p->port, BR_STATE_DISABLED);
+   dsa_port_set_stp_state(ds, p->dp->index, BR_STATE_DISABLED);
 
return 0;
 }
@@ -178,7 +178,7 @@ static int dsa_slave_close(struct net_device *dev)
 static void dsa_slave_change_rx_flags(struct net_device *dev, int change)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
-   struct net_device *master = p->parent->dst->master_netdev;
+   struct net_device *master = p->dp->ds->dst->master_netdev;
 
if (change & IFF_ALLMULTI)
dev_set_allmulti(master, dev->flags & IFF_ALLMULTI ? 1 : -1);
@@ -189,7 +189,7 @@ static void dsa_slave_change_rx_flags(struct net_device 
*dev, int change)
 static void dsa_slave_set_rx_mode(struct net_device *dev)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
-   struct net_device *master = p->parent->dst->master_netdev;
+   struct net_device *master = p->dp->ds->dst->master_netdev;
 
dev_mc_sync(master, dev);
dev_uc_sync(master, dev);
@@ -198,7 +198,7 @@ static void dsa_slave_set_rx_mode(struct net_device *dev)
 static int dsa_slave_set_mac_address(struct net_device *dev, void *a)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
-   struct net_device *master = p->parent->dst->master_netdev;
+   struct net_device *master = p->dp->ds->dst->master_netdev;
struct sockaddr *addr = a;
int err;
 
@@ -228,16

[PATCH net-next 8/9] net: dsa: qca8k: use dsa_port's bridge pointer

2017-01-27 Thread Vivien Didelot

Now that DSA exposes the bridge device pointer to which a port belongs,
use it when programming the port based VLANs and thus remove the cache.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/qca8k.c | 12 
 drivers/net/dsa/qca8k.h |  1 -
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index c85b187aa3d9..a4fd4ccf7b67 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -746,17 +746,14 @@ qca8k_port_stp_state_set(struct dsa_switch *ds, int port, 
u8 state)
 }
 
 static int
-qca8k_port_bridge_join(struct dsa_switch *ds, int port,
-  struct net_device *bridge)
+qca8k_port_bridge_join(struct dsa_switch *ds, int port, struct net_device *br)
 {
struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
int port_mask = BIT(QCA8K_CPU_PORT);
int i;
 
-   priv->port_sts[port].bridge_dev = bridge;
-
for (i = 1; i < QCA8K_NUM_PORTS; i++) {
-   if (priv->port_sts[i].bridge_dev != bridge)
+   if (ds->ports[i].bridge_dev != br)
continue;
/* Add this port to the portvlan mask of the other ports
 * in the bridge
@@ -781,8 +778,7 @@ qca8k_port_bridge_leave(struct dsa_switch *ds, int port, 
struct net_device *br)
int i;
 
for (i = 1; i < QCA8K_NUM_PORTS; i++) {
-   if (priv->port_sts[i].bridge_dev !=
-   priv->port_sts[port].bridge_dev)
+   if (ds->ports[i].bridge_dev != br)
continue;
/* Remove this port to the portvlan mask of the other ports
 * in the bridge
@@ -791,7 +787,7 @@ qca8k_port_bridge_leave(struct dsa_switch *ds, int port, 
struct net_device *br)
QCA8K_PORT_LOOKUP_CTRL(i),
BIT(port));
}
-   priv->port_sts[port].bridge_dev = NULL;
+
/* Set the cpu port to be the only one in the portvlan mask of
 * this port
 */
diff --git a/drivers/net/dsa/qca8k.h b/drivers/net/dsa/qca8k.h
index 201464719531..1ed4fac6cd6d 100644
--- a/drivers/net/dsa/qca8k.h
+++ b/drivers/net/dsa/qca8k.h
@@ -157,7 +157,6 @@ enum qca8k_fdb_cmd {
 
 struct ar8xxx_port_status {
struct ethtool_eee eee;
-   struct net_device *bridge_dev;
int enabled;
 };
 
-- 
2.11.0
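
For readers following the qca8k change above, the join logic can be modelled in userspace as a minimal sketch. The model_* names, the port count of 7 and the CPU port at index 0 are assumptions made for illustration only; the real driver writes the resulting mask to hardware registers instead of returning it.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_PORTS 7u	/* assumed port count */
#define MODEL_CPU_PORT  0u	/* assumed CPU port index */

/* Stand-in for struct net_device; only pointer identity matters here. */
struct model_bridge {
	int id;
};

/* Stand-in for struct dsa_port; only the bridge pointer is modelled. */
struct model_port {
	const struct model_bridge *bridge_dev;	/* NULL when not bridged */
};

/*
 * Return the set of ports that `port` may forward to once it is a
 * member of `br`: the CPU port plus every other user port whose
 * bridge_dev points at the same bridge.
 */
static uint32_t model_join_mask(const struct model_port *ports,
				unsigned int port,
				const struct model_bridge *br)
{
	uint32_t mask = 1u << MODEL_CPU_PORT;
	unsigned int i;

	for (i = 1; i < MODEL_NUM_PORTS; i++) {
		if (ports[i].bridge_dev != br)
			continue;
		if (i != port)
			mask |= 1u << i;
	}

	return mask;
}
```

Because the membership test reads the shared per-port pointer, the driver no longer needs a private copy that could drift from the core's view.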

[PATCH net-next 3/9] net: dsa: add ds and index to dsa_port

2017-01-27 Thread Vivien Didelot

Add to the dsa_port structure the physical switch instance and the port
index that a DSA port belongs to.

This can be used later to retrieve information about a physical port
when configuring a switch fabric, or to slim down struct dsa_slave_priv.

Signed-off-by: Vivien Didelot 
---
 include/net/dsa.h | 2 ++
 net/dsa/dsa2.c| 6 ++
 2 files changed, 8 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 24e1d935ae68..6bd1f8b05dbd 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -140,6 +140,8 @@ struct dsa_switch_tree {
 };
 
 struct dsa_port {
+   struct dsa_switch   *ds;
+   unsigned int        index;
struct net_device   *netdev;
struct device_node  *dn;
unsigned int        ageing_time;
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 6e7b3e88b778..9f8cc26be9ea 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -670,6 +670,7 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, 
size_t n)
 {
size_t size = sizeof(struct dsa_switch) + n * sizeof(struct dsa_port);
struct dsa_switch *ds;
+   int i;
 
ds = devm_kzalloc(dev, size, GFP_KERNEL);
if (!ds)
@@ -678,6 +679,11 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, 
size_t n)
ds->dev = dev;
ds->num_ports = n;
 
+   for (i = 0; i < ds->num_ports; ++i) {
+   ds->ports[i].index = i;
+   ds->ports[i].ds = ds;
+   }
+
return ds;
 }
 EXPORT_SYMBOL_GPL(dsa_switch_alloc);
-- 
2.11.0
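
The allocation pattern used by dsa_switch_alloc() can be tried outside the kernel with a minimal sketch like the one below. The model_* types are hypothetical stand-ins that carry only the fields touched here, and calloc() is assumed as a substitute for devm_kzalloc().

```c
#include <stdlib.h>

struct model_switch;

/* Stand-in for struct dsa_port. */
struct model_port {
	struct model_switch *ds;	/* back-pointer to the owning switch */
	unsigned int index;		/* position of this port in ds->ports[] */
};

/* Stand-in for struct dsa_switch. */
struct model_switch {
	size_t num_ports;
	struct model_port ports[];	/* flexible array member, keep last */
};

/*
 * One zeroed allocation holds the switch and its n ports. Every port
 * then gets its index and a pointer back to the switch.
 */
static struct model_switch *model_switch_alloc(size_t n)
{
	size_t size = sizeof(struct model_switch) + n * sizeof(struct model_port);
	struct model_switch *ds = calloc(1, size);
	size_t i;

	if (!ds)
		return NULL;

	ds->num_ports = n;

	for (i = 0; i < ds->num_ports; ++i) {
		ds->ports[i].index = (unsigned int)i;
		ds->ports[i].ds = ds;
	}

	return ds;
}
```

With the back-pointer in place, code that holds only a port pointer can reach both the switch and the port number, which is what later patches in the series rely on.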

[PATCH net-next 1/9] net: dsa: variable number of ports

2017-01-27 Thread Vivien Didelot

Replace the ports[DSA_MAX_PORTS] array of the dsa_switch structure with a
zero-length array, allocated at the same time as the dsa_switch
structure itself. A dsa_switch_alloc() helper is provided for that.

This commit brings no functional change yet, since we pass DSA_MAX_PORTS
as the number of ports for the moment. Future patches can update the DSA
drivers separately to support a dynamic number of ports.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/b53/b53_common.c |  7 ---
 drivers/net/dsa/mv88e6xxx/chip.c |  3 +--
 drivers/net/dsa/qca8k.c  |  3 +--
 include/net/dsa.h|  6 +-
 net/dsa/dsa.c|  5 ++---
 net/dsa/dsa2.c   | 16 
 6 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index bb210b12ad1b..31afc4d4b68b 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1790,14 +1790,15 @@ struct b53_device *b53_switch_alloc(struct device *base,
struct dsa_switch *ds;
struct b53_device *dev;
 
-   ds = devm_kzalloc(base, sizeof(*ds) + sizeof(*dev), GFP_KERNEL);
+   ds = dsa_switch_alloc(base, DSA_MAX_PORTS);
if (!ds)
return NULL;
 
-   dev = (struct b53_device *)(ds + 1);
+   dev = devm_kzalloc(base, sizeof(*dev), GFP_KERNEL);
+   if (!dev)
+   return NULL;
 
ds->priv = dev;
-   ds->dev = base;
dev->dev = base;
 
dev->ds = ds;
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 921e53351786..cb7b24748336 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -4361,11 +4361,10 @@ static int mv88e6xxx_register_switch(struct 
mv88e6xxx_chip *chip)
struct device *dev = chip->dev;
struct dsa_switch *ds;
 
-   ds = devm_kzalloc(dev, sizeof(*ds), GFP_KERNEL);
+   ds = dsa_switch_alloc(dev, DSA_MAX_PORTS);
if (!ds)
return -ENOMEM;
 
-   ds->dev = dev;
ds->priv = chip;
ds->ops = &mv88e6xxx_switch_ops;
 
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index c084aa484d2b..f67c6a3cebff 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -954,12 +954,11 @@ qca8k_sw_probe(struct mdio_device *mdiodev)
if (id != QCA8K_ID_QCA8337)
return -ENODEV;
 
-   priv->ds = devm_kzalloc(&mdiodev->dev, sizeof(*priv->ds), GFP_KERNEL);
+   priv->ds = dsa_switch_alloc(&mdiodev->dev, DSA_MAX_PORTS);
if (!priv->ds)
return -ENOMEM;
 
priv->ds->priv = priv;
-   priv->ds->dev = &mdiodev->dev;
priv->ds->ops = &qca8k_switch_ops;
mutex_init(&priv->reg_mutex);
dev_set_drvdata(&mdiodev->dev, priv);
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 92fd795e9573..24e1d935ae68 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -190,8 +190,11 @@ struct dsa_switch {
u32 cpu_port_mask;
u32 enabled_port_mask;
u32 phys_mii_mask;
-   struct dsa_port ports[DSA_MAX_PORTS];
struct mii_bus  *slave_mii_bus;
+
+   /* Dynamically allocated ports, keep last */
+   size_t num_ports;
+   struct dsa_port ports[];
 };
 
 static inline bool dsa_is_cpu_port(struct dsa_switch *ds, int p)
@@ -386,6 +389,7 @@ static inline bool dsa_uses_tagged_protocol(struct 
dsa_switch_tree *dst)
return dst->rcv != NULL;
 }
 
+struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n);
 void dsa_unregister_switch(struct dsa_switch *ds);
 int dsa_register_switch(struct dsa_switch *ds, struct device *dev);
 #ifdef CONFIG_PM_SLEEP
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 07e863369e04..de3ffb421ee4 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -347,8 +347,8 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
/*
 * Allocate and initialise switch state.
 */
-   ds = devm_kzalloc(parent, sizeof(*ds), GFP_KERNEL);
-   if (ds == NULL)
+   ds = dsa_switch_alloc(parent, DSA_MAX_PORTS);
+   if (!ds)
return ERR_PTR(-ENOMEM);
 
ds->dst = dst;
@@ -356,7 +356,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
ds->cd = cd;
ds->ops = ops;
ds->priv = priv;
-   ds->dev = parent;
 
ret = dsa_switch_setup_one(ds, parent);
if (ret)
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 75f5d1f8554b..4b3a44bec5c8 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -666,6 +666,22 @@ static int _dsa_register_switch(struct dsa_switch *ds, 
struct device *dev)
return err;
 }
 
+struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n)
+{
+   size_t size = sizeof(struct dsa_switch) + n * sizeof(struct dsa_port);
+   struct dsa_switch *ds;
+
+   ds = devm_kzalloc(dev, size,

[PATCH net-next 2/9] net: dsa: use ds->num_ports when possible

2017-01-27 Thread Vivien Didelot

The dsa_switch structure contains the number of ports. Use it where the
structure is valid instead of the DSA_MAX_PORTS value.

Signed-off-by: Vivien Didelot 
---
 net/dsa/dsa.c | 16 
 net/dsa/dsa2.c| 12 ++--
 net/dsa/slave.c   |  2 +-
 net/dsa/tag_brcm.c|  2 +-
 net/dsa/tag_dsa.c |  2 +-
 net/dsa/tag_edsa.c|  2 +-
 net/dsa/tag_trailer.c |  2 +-
 7 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index de3ffb421ee4..619e57a44d1d 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -145,7 +145,7 @@ static int dsa_cpu_dsa_setups(struct dsa_switch *ds, struct 
device *dev)
struct dsa_port *dport;
int ret, port;
 
-   for (port = 0; port < DSA_MAX_PORTS; port++) {
+   for (port = 0; port < ds->num_ports; port++) {
if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
continue;
 
@@ -218,7 +218,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, 
struct device *parent)
/*
 * Validate supplied switch configuration.
 */
-   for (i = 0; i < DSA_MAX_PORTS; i++) {
+   for (i = 0; i < ds->num_ports; i++) {
char *name;
 
name = cd->port_names[i];
@@ -242,7 +242,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, 
struct device *parent)
valid_name_found = true;
}
 
-   if (!valid_name_found && i == DSA_MAX_PORTS)
+   if (!valid_name_found && i == ds->num_ports)
return -EINVAL;
 
/* Make the built-in MII bus mask match the number of ports,
@@ -295,7 +295,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, 
struct device *parent)
/*
 * Create network devices for physical switch ports.
 */
-   for (i = 0; i < DSA_MAX_PORTS; i++) {
+   for (i = 0; i < ds->num_ports; i++) {
ds->ports[i].dn = cd->port_dn[i];
 
if (!(ds->enabled_port_mask & (1 << i)))
@@ -377,7 +377,7 @@ static void dsa_switch_destroy(struct dsa_switch *ds)
int port;
 
/* Destroy network devices for physical switch ports. */
-   for (port = 0; port < DSA_MAX_PORTS; port++) {
+   for (port = 0; port < ds->num_ports; port++) {
if (!(ds->enabled_port_mask & (1 << port)))
continue;
 
@@ -388,7 +388,7 @@ static void dsa_switch_destroy(struct dsa_switch *ds)
}
 
/* Disable configuration of the CPU and DSA ports */
-   for (port = 0; port < DSA_MAX_PORTS; port++) {
+   for (port = 0; port < ds->num_ports; port++) {
if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
continue;
dsa_cpu_dsa_destroy(&ds->ports[port]);
@@ -408,7 +408,7 @@ int dsa_switch_suspend(struct dsa_switch *ds)
int i, ret = 0;
 
/* Suspend slave network devices */
-   for (i = 0; i < DSA_MAX_PORTS; i++) {
+   for (i = 0; i < ds->num_ports; i++) {
if (!dsa_is_port_initialized(ds, i))
continue;
 
@@ -435,7 +435,7 @@ int dsa_switch_resume(struct dsa_switch *ds)
return ret;
 
/* Resume slave network devices */
-   for (i = 0; i < DSA_MAX_PORTS; i++) {
+   for (i = 0; i < ds->num_ports; i++) {
if (!dsa_is_port_initialized(ds, i))
continue;
 
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 4b3a44bec5c8..6e7b3e88b778 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -98,7 +98,7 @@ static bool dsa_ds_find_port_dn(struct dsa_switch *ds,
 {
u32 index;
 
-   for (index = 0; index < DSA_MAX_PORTS; index++)
+   for (index = 0; index < ds->num_ports; index++)
if (ds->ports[index].dn == port)
return true;
return false;
@@ -159,7 +159,7 @@ static int dsa_ds_complete(struct dsa_switch_tree *dst, 
struct dsa_switch *ds)
u32 index;
int err;
 
-   for (index = 0; index < DSA_MAX_PORTS; index++) {
+   for (index = 0; index < ds->num_ports; index++) {
port = &ds->ports[index];
if (!dsa_port_is_valid(port))
continue;
@@ -312,7 +312,7 @@ static int dsa_ds_apply(struct dsa_switch_tree *dst, struct 
dsa_switch *ds)
return err;
}
 
-   for (index = 0; index < DSA_MAX_PORTS; index++) {
+   for (index = 0; index < ds->num_ports; index++) {
port = &ds->ports[index];
if (!dsa_port_is_valid(port))
continue;
@@ -344,7 +344,7 @@ static void dsa_ds_unapply(struct dsa_switch_tree *dst, 
struct dsa_switch *ds)
struct dsa_port *port;
u32 index;
 
-   for (index = 0; index < DSA_MAX_PORTS; index++) {
+   for (index = 0; index < ds->num_ports; index++) {
port = &ds->ports[index];

[PATCH net-next 5/9] net: dsa: move bridge device in dsa_port

2017-01-27 Thread Vivien Didelot

Move the bridge_dev pointer from dsa_slave_priv to dsa_port so that DSA
drivers can access this information and remove the need to cache it.

Signed-off-by: Vivien Didelot 
---
 include/net/dsa.h  |  1 +
 net/dsa/dsa_priv.h |  1 -
 net/dsa/slave.c| 10 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 6bd1f8b05dbd..924533fd4425 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -146,6 +146,7 @@ struct dsa_port {
struct device_node  *dn;
unsigned int        ageing_time;
u8  stp_state;
+   struct net_device   *bridge_dev;
 };
 
 struct dsa_switch {
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index c519bd0e9206..3022f2e42cdc 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -38,7 +38,6 @@ struct dsa_slave_priv {
int old_pause;
int old_duplex;
 
-   struct net_device   *bridge_dev;
 #ifdef CONFIG_NET_POLL_CONTROLLER
struct netpoll  *netpoll;
 #endif
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 2ea220bc4bba..3a7c28d64bd5 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -64,9 +64,9 @@ static int dsa_slave_get_iflink(const struct net_device *dev)
return p->dp->ds->dst->master_netdev->ifindex;
 }
 
-static inline bool dsa_port_is_bridged(struct dsa_slave_priv *p)
+static inline bool dsa_port_is_bridged(struct dsa_port *dp)
 {
-   return !!p->bridge_dev;
+   return !!dp->bridge_dev;
 }
 
 static void dsa_port_set_stp_state(struct dsa_switch *ds, int port, u8 state)
@@ -98,7 +98,7 @@ static int dsa_slave_open(struct net_device *dev)
struct dsa_slave_priv *p = netdev_priv(dev);
struct net_device *master = p->dp->ds->dst->master_netdev;
struct dsa_switch *ds = p->dp->ds;
-   u8 stp_state = dsa_port_is_bridged(p) ?
+   u8 stp_state = dsa_port_is_bridged(p->dp) ?
BR_STATE_BLOCKING : BR_STATE_FORWARDING;
int err;
 
@@ -557,7 +557,7 @@ static int dsa_slave_bridge_port_join(struct net_device 
*dev,
struct dsa_switch *ds = p->dp->ds;
int ret = -EOPNOTSUPP;
 
-   p->bridge_dev = br;
+   p->dp->bridge_dev = br;
 
if (ds->ops->port_bridge_join)
ret = ds->ops->port_bridge_join(ds, p->dp->index, br);
@@ -574,7 +574,7 @@ static void dsa_slave_bridge_port_leave(struct net_device 
*dev)
if (ds->ops->port_bridge_leave)
ds->ops->port_bridge_leave(ds, p->dp->index);
 
-   p->bridge_dev = NULL;
+   p->dp->bridge_dev = NULL;
 
/* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
 * so allow it to be in BR_STATE_FORWARDING to be kept functional
-- 
2.11.0
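
As a rough illustration of the pointer chain after this move, the sketch below models how the open path picks the initial STP state from the port rather than from the slave private data. The model_* names and the two-value enum are assumptions for the example; the kernel uses the BR_STATE_* constants.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for BR_STATE_BLOCKING and BR_STATE_FORWARDING. */
enum model_stp_state {
	MODEL_STATE_BLOCKING,
	MODEL_STATE_FORWARDING,
};

struct model_bridge {
	int id;
};

/* Stand-in for struct dsa_port: the bridge pointer now lives here. */
struct model_port {
	struct model_bridge *bridge_dev;	/* NULL when not bridged */
};

/* Stand-in for struct dsa_slave_priv: it only points at the port. */
struct model_slave_priv {
	struct model_port *dp;
};

static bool model_port_is_bridged(const struct model_port *dp)
{
	return dp->bridge_dev != NULL;
}

/* A bridged port starts blocked; a standalone port forwards at once. */
static enum model_stp_state
model_open_stp_state(const struct model_slave_priv *p)
{
	return model_port_is_bridged(p->dp) ?
		MODEL_STATE_BLOCKING : MODEL_STATE_FORWARDING;
}
```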

[PATCH net-next 6/9] net: dsa: pass bridge device when a port leaves

2017-01-27 Thread Vivien Didelot

Upon reception of the NETDEV_CHANGEUPPER event, a leaving port is already
unbridged, so reflect this by setting the port's bridge_dev pointer to
NULL before calling the port_bridge_leave DSA driver operation.

Now that the bridge_dev pointer is exposed to the drivers, reflecting
the current state of the DSA switch fabric is necessary for the drivers
to adjust their port based VLANs correctly.

Pass the bridge device pointer to the port_bridge_leave operation so
that drivers have all information to re-program their chips properly,
and do not need to cache it anymore.
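
The ordering described above can be sketched in userspace as follows. This is an illustrative model only: the model_* names, the four-port array and the peer_mask field are assumptions, not the kernel API.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_PORTS 4u	/* assumed port count */

struct model_bridge {
	int id;
};

struct model_port {
	const struct model_bridge *bridge_dev;	/* NULL when not bridged */
	uint32_t peer_mask;			/* ports this port may forward to */
};

/*
 * Driver-side hook: the leaving port no longer points at br, so the
 * driver finds the remaining members through the br argument.
 */
static void model_driver_leave(struct model_port *ports, unsigned int port,
			       const struct model_bridge *br)
{
	unsigned int i;

	for (i = 0; i < MODEL_NUM_PORTS; i++) {
		if (ports[i].bridge_dev != br)
			continue;
		ports[i].peer_mask &= ~(1u << port);
	}

	ports[port].peer_mask = 0;
}

/* Core-side path: clear the pointer first, then notify the driver. */
static void model_core_leave(struct model_port *ports, unsigned int port,
			     const struct model_bridge *br)
{
	ports[port].bridge_dev = NULL;
	model_driver_leave(ports, port, br);
}
```

Without the br argument the driver would have nothing to compare against, since the leaving port's own pointer has already been cleared.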

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/b53/b53_common.c |  2 +-
 drivers/net/dsa/b53/b53_priv.h   |  2 +-
 drivers/net/dsa/mv88e6xxx/chip.c |  3 ++-
 drivers/net/dsa/qca8k.c  |  2 +-
 include/net/dsa.h|  3 ++-
 net/dsa/slave.c  | 10 +-
 6 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 31afc4d4b68b..32fdcf5570c8 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1354,7 +1354,7 @@ int b53_br_join(struct dsa_switch *ds, int port, struct 
net_device *bridge)
 }
 EXPORT_SYMBOL(b53_br_join);
 
-void b53_br_leave(struct dsa_switch *ds, int port)
+void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br)
 {
struct b53_device *dev = ds->priv;
struct net_device *bridge = dev->ports[port].bridge_dev;
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index a8031b382c55..5dafb70e75fc 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -382,7 +382,7 @@ void b53_get_strings(struct dsa_switch *ds, int port, 
uint8_t *data);
 void b53_get_ethtool_stats(struct dsa_switch *ds, int port, uint64_t *data);
 int b53_get_sset_count(struct dsa_switch *ds);
 int b53_br_join(struct dsa_switch *ds, int port, struct net_device *bridge);
-void b53_br_leave(struct dsa_switch *ds, int port);
+void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *bridge);
 void b53_br_set_stp_state(struct dsa_switch *ds, int port, u8 state);
 void b53_br_fast_age(struct dsa_switch *ds, int port);
 int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering);
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index cb7b24748336..8eb0dc063f4e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2343,7 +2343,8 @@ static int mv88e6xxx_port_bridge_join(struct dsa_switch 
*ds, int port,
return err;
 }
 
-static void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port)
+static void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port,
+   struct net_device *br)
 {
struct mv88e6xxx_chip *chip = ds->priv;
struct net_device *bridge = chip->ports[port].bridge_dev;
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index f67c6a3cebff..c85b187aa3d9 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -775,7 +775,7 @@ qca8k_port_bridge_join(struct dsa_switch *ds, int port,
 }
 
 static void
-qca8k_port_bridge_leave(struct dsa_switch *ds, int port)
+qca8k_port_bridge_leave(struct dsa_switch *ds, int port, struct net_device *br)
 {
struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
int i;
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 924533fd4425..b951e2ebda75 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -325,7 +325,8 @@ struct dsa_switch_ops {
int (*set_ageing_time)(struct dsa_switch *ds, unsigned int msecs);
int (*port_bridge_join)(struct dsa_switch *ds, int port,
struct net_device *bridge);
-   void    (*port_bridge_leave)(struct dsa_switch *ds, int port);
+   void    (*port_bridge_leave)(struct dsa_switch *ds, int port,
+                                struct net_device *bridge);
void    (*port_stp_state_set)(struct dsa_switch *ds, int port,
                              u8 state);
void    (*port_fast_age)(struct dsa_switch *ds, int port);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 3a7c28d64bd5..23ff53aeae50 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -565,16 +565,16 @@ static int dsa_slave_bridge_port_join(struct net_device 
*dev,
return ret == -EOPNOTSUPP ? 0 : ret;
 }
 
-static void dsa_slave_bridge_port_leave(struct net_device *dev)
+static void dsa_slave_bridge_port_leave(struct net_device *dev,
+   struct net_device *br)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
struct dsa_switch *ds = p->dp->ds;
 
+   p->dp->bridge_dev = NULL;
 
if (ds->ops->port_bridge_leave)
-   ds->ops->port_bridge_leave(ds, p->dp->index);
-
-   p->dp->bridge_dev = NULL;

[PATCH net-next 7/9] net: dsa: mv88e6xxx: use dsa_port's bridge pointer

2017-01-27 Thread Vivien Didelot

Now that DSA exposes the bridge device pointer to which a port belongs,
use it when programming the port based VLANs and thus remove the cache.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 27 +++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  6 --
 2 files changed, 11 insertions(+), 22 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 8eb0dc063f4e..84cba32443de 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1247,8 +1247,8 @@ static int _mv88e6xxx_atu_remove(struct mv88e6xxx_chip 
*chip, u16 fid,
 
 static int _mv88e6xxx_port_based_vlan_map(struct mv88e6xxx_chip *chip, int 
port)
 {
-   struct net_device *bridge = chip->ports[port].bridge_dev;
struct dsa_switch *ds = chip->ds;
+   struct net_device *bridge = ds->ports[port].bridge_dev;
u16 output_ports = 0;
int i;
 
@@ -1258,7 +1258,7 @@ static int _mv88e6xxx_port_based_vlan_map(struct 
mv88e6xxx_chip *chip, int port)
} else {
for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
/* allow sending frames to every group member */
-   if (bridge && chip->ports[i].bridge_dev == bridge)
+   if (bridge && ds->ports[i].bridge_dev == bridge)
output_ports |= BIT(i);
 
/* allow sending frames to CPU port and DSA link(s) */
@@ -1820,17 +1820,17 @@ static int mv88e6xxx_port_check_hw_vlan(struct 
dsa_switch *ds, int port,
GLOBAL_VTU_DATA_MEMBER_TAG_NON_MEMBER)
continue;
 
-   if (chip->ports[i].bridge_dev ==
-   chip->ports[port].bridge_dev)
+   if (ds->ports[i].bridge_dev ==
+   ds->ports[port].bridge_dev)
break; /* same bridge, check next VLAN */
 
-   if (!chip->ports[i].bridge_dev)
+   if (!ds->ports[i].bridge_dev)
continue;
 
netdev_warn(ds->ports[port].netdev,
"hardware VLAN %d already used by %s\n",
vlan.vid,
-   netdev_name(chip->ports[i].bridge_dev));
+   netdev_name(ds->ports[i].bridge_dev));
err = -EOPNOTSUPP;
goto unlock;
}
@@ -2320,18 +2320,16 @@ static int mv88e6xxx_port_fdb_dump(struct dsa_switch 
*ds, int port,
 }
 
 static int mv88e6xxx_port_bridge_join(struct dsa_switch *ds, int port,
- struct net_device *bridge)
+ struct net_device *br)
 {
struct mv88e6xxx_chip *chip = ds->priv;
int i, err = 0;
 
mutex_lock(&chip->reg_lock);
 
-   /* Assign the bridge and remap each port's VLANTable */
-   chip->ports[port].bridge_dev = bridge;
-
+   /* Remap each port's VLANTable */
for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
-   if (chip->ports[i].bridge_dev == bridge) {
+   if (ds->ports[i].bridge_dev == br) {
err = _mv88e6xxx_port_based_vlan_map(chip, i);
if (err)
break;
@@ -2347,16 +2345,13 @@ static void mv88e6xxx_port_bridge_leave(struct 
dsa_switch *ds, int port,
struct net_device *br)
 {
struct mv88e6xxx_chip *chip = ds->priv;
-   struct net_device *bridge = chip->ports[port].bridge_dev;
int i;
 
mutex_lock(&chip->reg_lock);
 
-   /* Unassign the bridge and remap each port's VLANTable */
-   chip->ports[port].bridge_dev = NULL;
-
+   /* Remap each port's VLANTable */
for (i = 0; i < mv88e6xxx_num_ports(chip); ++i)
-   if (i == port || chip->ports[i].bridge_dev == bridge)
+   if (i == port || ds->ports[i].bridge_dev == br)
if (_mv88e6xxx_port_based_vlan_map(chip, i))
netdev_warn(ds->ports[i].netdev,
"failed to remap\n");
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index 572d585dc1e2..e126ed00937b 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -676,10 +676,6 @@ struct mv88e6xxx_vtu_entry {
 
 struct mv88e6xxx_bus_ops;
 
-struct mv88e6xxx_priv_port {
-   struct net_device *bridge_dev;
-};
-
 struct mv88e6xxx_irq {
u16 masked;
struct irq_chip chip;
@@ -720,8 +716,6 @@ struct mv88e6xxx_chip {
 */
struct mutex    stats_mutex;
 
-   struct mv88e6xxx_priv_port  ports[DSA_MAX_PORTS];
-

Re: [PATCH v3] can: Fix kernel panic at security_sock_rcv_skb

2017-01-27 Thread Oliver Hartkopp




On 01/27/2017 05:11 PM, Eric Dumazet wrote:

From: Eric Dumazet 

Zhang Yanmin reported crashes [1] and provided a patch adding a
synchronize_rcu() call in can_rx_unregister().

The main problem seems to be that the sockets themselves are not RCU
protected.

If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.

Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
ease stable backports with the following fix instead.
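
The idea behind the fix can be illustrated with a small userspace model, given here as a sketch under stated assumptions: a plain counter stands in for the socket refcount, a `pending` flag stands in for call_rcu(), and all model_* names are hypothetical. The extra reference taken at unregister time keeps the socket alive until the deferred callback has run.

```c
#include <stddef.h>

/* Stand-in for struct sock with a bare reference counter. */
struct model_sock {
	int refcnt;
	int freed;	/* set once the last reference is dropped */
};

static void model_sock_hold(struct model_sock *sk)
{
	sk->refcnt++;
}

static void model_sock_put(struct model_sock *sk)
{
	if (--sk->refcnt == 0)
		sk->freed = 1;	/* the real code would free the socket here */
}

/* Stand-in for struct receiver. */
struct model_receiver {
	struct model_sock *sk;	/* might be NULL */
	int pending;		/* deferred deletion has been queued */
};

/* Analogue of can_rx_unregister(): pin the socket, then defer. */
static void model_unregister(struct model_receiver *r)
{
	if (r->sk)
		model_sock_hold(r->sk);
	r->pending = 1;		/* stands in for call_rcu() */
}

/* Analogue of can_rx_delete_receiver(), run after the grace period. */
static void model_grace_period_elapsed(struct model_receiver *r)
{
	struct model_sock *sk = r->sk;

	r->pending = 0;
	r->sk = NULL;		/* stands in for freeing the receiver */
	if (sk)
		model_sock_put(sk);
}
```

In this model, a close that drops the owner's reference between the two calls does not free the socket, because the receiver still holds its own reference.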

[1]
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0

Call Trace:
 
 [] security_sock_rcv_skb+0x4c/0x60
 [] sk_filter+0x41/0x210
 [] sock_queue_rcv_skb+0x53/0x3a0
 [] raw_rcv+0x2a3/0x3c0
 [] can_rcv_filter+0x12b/0x370
 [] can_receive+0xd9/0x120
 [] can_rcv+0xab/0x100
 [] __netif_receive_skb_core+0xd8c/0x11f0
 [] __netif_receive_skb+0x24/0xb0
 [] process_backlog+0x127/0x280
 [] net_rx_action+0x33b/0x4f0
 [] __do_softirq+0x184/0x440
 [] do_softirq_own_stack+0x1c/0x30
 
 [] do_softirq.part.18+0x3b/0x40
 [] do_softirq+0x1d/0x20
 [] netif_rx_ni+0xe5/0x110
 [] slcan_receive_buf+0x507/0x520
 [] flush_to_ldisc+0x21c/0x230
 [] process_one_work+0x24f/0x670
 [] worker_thread+0x9d/0x6f0
 [] ? rescuer_thread+0x480/0x480
 [] kthread+0x12c/0x150
 [] ret_from_fork+0x3f/0x70

Reported-by: Zhang Yanmin 
Signed-off-by: Eric Dumazet 


Acked-by: Oliver Hartkopp 

Thanks Eric!

BR
Oliver


---
 include/linux/can/core.h |7 +++
 net/can/af_can.c |   14 +++---
 net/can/af_can.h |3 ++-
 net/can/bcm.c|4 ++--
 net/can/gw.c |2 +-
 net/can/raw.c|4 ++--
 6 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/include/linux/can/core.h b/include/linux/can/core.h
index a0875001b13c84ad70a9b2909654e9ffb6824c58..df08a41d5be5f26cfa4cdc74935f5eae7fa51385 100644
--- a/include/linux/can/core.h
+++ b/include/linux/can/core.h
@@ -45,10 +45,9 @@ struct can_proto {
 extern int  can_proto_register(const struct can_proto *cp);
 extern void can_proto_unregister(const struct can_proto *cp);

-extern int  can_rx_register(struct net_device *dev, canid_t can_id,
-   canid_t mask,
-   void (*func)(struct sk_buff *, void *),
-   void *data, char *ident);
+int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
+   void (*func)(struct sk_buff *, void *),
+   void *data, char *ident, struct sock *sk);

 extern void can_rx_unregister(struct net_device *dev, canid_t can_id,
  canid_t mask,
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1108079d934f8383a599d7997b08100fca0465e9..d2b0638284b9a71aaba9cc433822329baf82a34e 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -445,6 +445,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, 
canid_t *mask,
  * @func: callback function on filter match
  * @data: returned parameter for callback function
  * @ident: string for calling module identification
+ * @sk: socket pointer (might be NULL)
  *
  * Description:
  *  Invokes the callback function with the received sk_buff and the given
@@ -468,7 +469,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, 
canid_t *mask,
  */
 int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
void (*func)(struct sk_buff *, void *), void *data,
-   char *ident)
+   char *ident, struct sock *sk)
 {
struct receiver *r;
struct hlist_head *rl;
@@ -496,6 +497,7 @@ int can_rx_register(struct net_device *dev, canid_t can_id, 
canid_t mask,
r->func= func;
r->data= data;
r->ident   = ident;
+   r->sk  = sk;

hlist_add_head_rcu(&r->list, rl);
d->entries++;
@@ -520,8 +522,11 @@ EXPORT_SYMBOL(can_rx_register);
 static void can_rx_delete_receiver(struct rcu_head *rp)
 {
struct receiver *r = container_of(rp, struct receiver, rcu);
-
+   struct sock *sk = r->sk;
+   
kmem_cache_free(rcv_cache, r);
+   if (sk)
+   sock_put(sk);
 }

 /**
@@ -596,8 +601,11 @@ void can_rx_unregister(struct net_device *dev, canid_t 
can_id, canid_t mask,
spin_unlock(&can_rcvlists_lock);

/* schedule the receiver item for deletion */
-   if (r)
+   if (r) {
+   if (r->sk)
+   sock_hold(r->sk);
call_rcu(&r->rcu, can_rx_delete_receiver);
+   }
 }
 EXPORT_SYMBOL(can_rx_unregister);

diff --git a/net/can/af_can.h b/net/can/af_can.h
index fca0fe9fc45a497cdf3da82d5414e846e7cc61b7..b86f5129e8385fe84ef671bb914e8e05c2977ca0 100644
--- a/net/can/af_can.h
+++ b/net/can/af_can.h
@@ -50,13 +50,14 @@

 struct receiver {
struct hlist_node list;
-

Re: [PATCH v2] net: phy: micrel: add support for KSZ8795

2017-01-27 Thread Florian Fainelli

On 01/27/2017 11:52 AM, Sean Nyekjær wrote:
> 
> 
> On 2017-01-27 19:55, Florian Fainelli wrote:
>> On 01/26/2017 11:46 PM, Sean Nyekjaer wrote:
>>> This is adds support for the PHYs in the KSZ8795 5port managed switch.
>>>
>>> It will allow to detect the link between the switch and the soc
>>> and uses the same read_status functions as the KSZ8873MLL switch.
>>>
>>> Signed-off-by: Sean Nyekjaer 
>>> ---
>>> Changes in v2:
>>>   - Removed "switch" name
>>>
>>>   drivers/net/phy/micrel.c   | 14 ++
>>>   include/linux/micrel_phy.h |  2 ++
>>>   2 files changed, 16 insertions(+)
>>>
>>> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
>>> index ea92d524d5a8..fab56c9350cf 100644
>>> --- a/drivers/net/phy/micrel.c
>>> +++ b/drivers/net/phy/micrel.c
>>> @@ -1014,6 +1014,20 @@ static struct phy_driver ksphy_driver[] = {
>>>   .get_stats= kszphy_get_stats,
>>>   .suspend= genphy_suspend,
>>>   .resume= genphy_resume,
>>> +}, {
>>> +.phy_id= PHY_ID_KSZ8795,
>>> +.phy_id_mask= MICREL_PHY_ID_MASK,
>>> +.name= "Micrel KSZ8795",
>>> +.features= (SUPPORTED_Pause | SUPPORTED_Asym_Pause),
>> This is wrong, it should be PHY_GBIT_FEATURES or PHY_BASIC_FEATURES.
>> Including the Pause/AsymPause feature bits is no longer necessary, the
>> PHY library takes care of adding these automatically to let your MAC do
>> flow control auto-negotiation later on.
>>
>> Please submit an incremental fix to that.
> By this you mean a v3 or a new commit?

Incremental means a new commit here, on top of this patch right here.
-- 
Florian

Re: [PATCH 0/6 v3] kvmalloc

2017-01-27 Thread Daniel Borkmann


On 01/27/2017 11:05 AM, Michal Hocko wrote:

On Thu 26-01-17 21:34:04, Daniel Borkmann wrote:

On 01/26/2017 02:40 PM, Michal Hocko wrote:

[...]

But realistically, how big is this problem really? Is it really worth
it? You said this is an admin only interface and admin can kill the
machine by OOM and other means already.

Moreover and I should probably mention it explicitly, your d407bd25a204b
reduced the likelihood of OOM for another reason. kmalloc used GFP_USER
previously and with order > 0 && order <= PAGE_ALLOC_COSTLY_ORDER this
could indeed hit the OOM e.g. due to memory fragmentation. It would be
much harder to hit the OOM killer from vmalloc which doesn't issue
higher order allocation requests. Or have you ever seen the OOM killer
pointing to the vmalloc fallback path?


The case I was concerned about was from vmalloc() path, not kmalloc().
That was where the stack trace indicating OOM pointed to. As an example,
there could be really large allocation requests for maps where the map
has pre-allocated memory for its elements. Thus, if we get to the point
where we need to kill others due to shortage of mem for satisfying this,
I'd much much rather prefer to just not let vmalloc() work really hard
and fail early on instead.


I see, but as already mentioned, chances are that by the time you get
close to the OOM somebody else will hit the OOM before the vmalloc path
manages to free the allocated memory.


In my (crafted) test case, I was connected
via ssh and it each time reliably killed my connection, which is really
suboptimal.

F.e., I could also imagine a buggy or miscalculated map definition for
a prog that is provisioned to multiple places, which then accidentally
triggers this. Or if large on purpose, but we crossed the line, it
could be handled more gracefully, f.e. I could imagine an option to
fall back to a non-pre-allocated map flavor from the application
loading the program. Trade-off for sure, but still allowing it to
operate up to a certain extent. Granted, if vmalloc() succeeded without
trying hard and we then OOM elsewhere, too bad, but we don't have much
control over that one anyway, only about our own request. Reason I
asked above was whether having __GFP_NORETRY in would be fatal
somewhere down the path, but seems not as you say.

So to answer your second email with the bpf and netfilter hunks, why
not replacing them with kvmalloc() and __GFP_NORETRY flag and add that
big fat FIXME comment above there, saying explicitly that __GFP_NORETRY
is not harmful though has only /partial/ effect right now and that full
support needs to be implemented in future. That would still be better
than not having it, imo, and the FIXME would make expectations clear
to anyone reading that code.


Well, we can do that, I just would like to prevent from this (ab)use
if there is no _real_ and _sensible_ usecase for it. Having a real bug


Understandable.


report or a fallback mechanism you are mentioning above would justify
the (ab)use IMHO. But that abuse would be documented properly and have a
real reason to exist. That sounds like a better approach to me.

But if you absolutely _insist_ I can change that.


Yeah, please do (with a big FIXME comment as mentioned), this originally
came from a real bug report. Anyway, feel free to add my Acked-by then.

Thanks again,
Daniel

Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver

2017-01-27 Thread Sowmini Varadhan

On (01/27/17 14:29), Willem de Bruijn wrote:
> 
> As your patch states, the contract is that any packet delivered to a
> driver has the entire L2 in its linear section. Drivers are not required
> to be robust against shorter packets, so there is no reason to test
> those.
> 
> One option is to limit your fix to known fixed-header protocols.
> In these cases hard_header_len is the minimum, so anything
> smaller must be dropped.

yes, but how would you know that this is a fixed-header protocol
or a var-hdrlen protocol? AIUI the hard_header_len itself will not
tell you this info: it will be 77 for ax25, 14 for ethernet, 
but that does not tell me that ax25 is the "robust-er" driver
with a min requirement of 21 for the hdrlen.

That's why I was thinking of a IFF_L2_VARHDRLEN in the priv_flags
of the net_device.

> For protocols with variable header length it is fine to send packets
> shorter than hard_header_len, even with corrupted content (i.e.,
> even if they would fail that protocol's validate callback), as long as
> they exceed the minimum length. ax25 already has a min length
> check through its protocol-specific validate callback.

Another option that comes to mind.. the real thorn-in-the-flesh
here is the CAP_SYS_RAWIO check. Would it be a better idea to ask 
the test-suites (since they seem to be the major consumer of
that path) to use a special PF_PACKET socket option instead, that 
indicates "I'm testing robustness of the header, so let this one
slip past dev_validate_header at all times"?

It would mean the test suites would have to change slightly.

--Sowmini
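
The fixed-header versus variable-header distinction discussed above can be sketched in user-space C. This is a minimal illustration, not kernel code: `struct l2_dev`, its `var_hdrlen` field (a stand-in for the proposed `IFF_L2_VARHDRLEN` flag) and `l2_header_ok()` are hypothetical names, and the 14/77/21 byte values are the ones quoted in the mail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fields discussed in the thread;
 * this is not the kernel's struct net_device.
 */
struct l2_dev {
	size_t hard_header_len;	/* 14 for ethernet, 77 for ax25 per the mail */
	size_t min_header_len;	/* only consulted when var_hdrlen is set */
	bool var_hdrlen;	/* stand-in for the proposed IFF_L2_VARHDRLEN */
};

/* Decide whether a packet with linear_len bytes in its linear area
 * may be passed to the driver.
 */
static bool l2_header_ok(const struct l2_dev *dev, size_t linear_len)
{
	/* Fixed-header protocol: hard_header_len is the minimum. */
	if (!dev->var_hdrlen)
		return linear_len >= dev->hard_header_len;

	/* Variable-header protocol: only the protocol minimum is enforced. */
	return linear_len >= dev->min_header_len;
}
```

Under these assumptions an ethernet-like device rejects anything shorter than 14 bytes, while an ax25-like device accepts a 21-byte header even though its hard_header_len is 77.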

Re: [PATCH 1/3] net: bgmac: allocate struct bgmac just once & don't copy it

2017-01-27 Thread Rafał Miłecki

On 27 January 2017 at 17:14, David Miller  wrote:
> From: Felix Fietkau 
> Date: Fri, 27 Jan 2017 17:02:33 +0100
>
>> On 2017-01-27 10:20, Rafał Miłecki wrote:
>>> From: Rafał Miłecki 
>>>
>>> To share as much code as possible in bgmac we call alloc_etherdev from
>>> bgmac.c which is used by both: platform and bcma code. The easiest
>>> solution was to use it for allocating whole struct bgmac but it doesn't
>>> work well as we already get early-filled struct bgmac as an argument.
>>>
>>> So far we were solving this by copying received struct into newly
>>> allocated one. The problem is it means storing 2 allocated structs,
>>> using only 1 of them and non-shared code not having access to it.
>>>
>>> This patch solves it by using alloc_etherdev to allocate *pointer* for
>>> the already allocated struct. The only downside of this is we have to be
>>> careful when using netdev_priv.
>>>
>>> Another solution was to call alloc_etherdev in platform/bcma specific
>>> code but Jon advised against it due to sharing less code that way.
>> How does that lead to sharing less code?
>> I find this pointer indirection rather ugly and uncommon, and I think it
>> would be much cleaner to split the probe into bgmac_enet_alloc and
>> bgmac_enet_probe (with bgmac_enet_alloc calling alloc_etherdev and doing
>> basic setup).
>
> I agree, it would be so much better if bgmac_probe() and friends
> initialized a real bgmac object which was the private of a netdev
> struct, then passed that down into bgmac_enet_probe().

I'll work on V2, thanks.

-- 
Rafał

Re: [PATCH v2] net: phy: micrel: add support for KSZ8795

2017-01-27 Thread Sean Nyekjær




On 2017-01-27 19:55, Florian Fainelli wrote:

On 01/26/2017 11:46 PM, Sean Nyekjaer wrote:

This adds support for the PHYs in the KSZ8795 5port managed switch.

It will allow to detect the link between the switch and the soc
and uses the same read_status functions as the KSZ8873MLL switch.

Signed-off-by: Sean Nyekjaer 
---
Changes in v2:
  - Removed "switch" name

  drivers/net/phy/micrel.c   | 14 ++
  include/linux/micrel_phy.h |  2 ++
  2 files changed, 16 insertions(+)

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index ea92d524d5a8..fab56c9350cf 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -1014,6 +1014,20 @@ static struct phy_driver ksphy_driver[] = {
.get_stats  = kszphy_get_stats,
.suspend= genphy_suspend,
.resume = genphy_resume,
+}, {
+   .phy_id = PHY_ID_KSZ8795,
+   .phy_id_mask= MICREL_PHY_ID_MASK,
+   .name   = "Micrel KSZ8795",
+   .features   = (SUPPORTED_Pause | SUPPORTED_Asym_Pause),

This is wrong, it should be PHY_GBIT_FEATURES or PHY_BASIC_FEATURES.
Including the Pause/AsymPause feature bits is no longer necessary, the
PHY library takes care of adding these automatically to let your MAC do
flow control auto-negotiation later on.

Please submit an incremental fix to that.

By this you mean a v3 or a new commit?
I'm checking with hardware now...



+   .flags  = PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
+   .config_init= kszphy_config_init,
+   .config_aneg= ksz8873mll_config_aneg,
+   .read_status= ksz8873mll_read_status,
+   .get_sset_count = kszphy_get_sset_count,
+   .get_strings= kszphy_get_strings,
+   .get_stats  = kszphy_get_stats,
+   .suspend= genphy_suspend,
+   .resume = genphy_resume,
  } };
  
  module_phy_driver(ksphy_driver);

diff --git a/include/linux/micrel_phy.h b/include/linux/micrel_phy.h
index 257173e0095e..f541da68d1e7 100644
--- a/include/linux/micrel_phy.h
+++ b/include/linux/micrel_phy.h
@@ -35,6 +35,8 @@
  #define PHY_ID_KSZ886X	0x00221430
  #define PHY_ID_KSZ8863	0x00221435
  
+#define PHY_ID_KSZ8795		0x00221550

+
  /* struct phy_device dev_flags definitions */
  #define MICREL_PHY_50MHZ_CLK  0x0001
  #define MICREL_PHY_FXEN   0x0002




/Sean

Re: [PATCH 2/6] wl1251: Use request_firmware_prefer_user() for loading NVS calibration data

2017-01-27 Thread Pavel Machek

On Fri 2017-01-27 17:23:07, Kalle Valo wrote:
> Pali Rohár  writes:
> 
> > On Friday 27 January 2017 14:26:22 Kalle Valo wrote:
> >> Pali Rohár  writes:
> >> 
> >> > 2) It was already tested that example NVS data can be used for N900 e.g.
> >> > for SSH connection. If real correct data are not available it is better
> >> > to use at least those example (and probably log warning message) so user
> >> > can connect via SSH and start investigating where is problem.
> >> 
> >> I disagree. Allowing default calibration data to be used can be
> >> unnoticed by user and left her wondering why wifi works so badly.
> >
> > So there are only two options:
> >
> > 1) Disallow it and so these users will have non-working wifi.
> >
> > 2) Allow those data to be used as fallback mechanism.
> >
> > And personally I'm against 1) because it will break wifi support for
> > *all* Nokia N900 devices right now.
> 
> All two of them? :)

Umm. You clearly want a flock of angry penguins at your doorsteps :-).

> But not working is exactly my point, if correct calibration data is not
> available wifi should not work. And it's not only about functionality
> problems, there's also the regulatory aspect.

If you break existing configuration that's called "regression".

> >> > 3) If we do rename *now* we will totally break wifi support on Nokia
> >> > N900.
> >> 
> >> Then the distro should fix that when updating the linux-firmware
> >> packages. Can you provide details about the setup, what distro etc?
> >
> > Debian stable, Ubuntu LTSs 14.04, 16.04. 
> 
> You can run these out of box on N900?

Debian stable does.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature

[PATCH v3 net-next 2/2] ravb: Support 1Gbps on R-Car H3 ES1.1+ and R-Car M3-W

2017-01-27 Thread Simon Horman

From: Geert Uytterhoeven 

The limitation to 10/100Mbit speeds on R-Car Gen3 is valid for R-Car H3
ES1.0 only. Check for the exact SoC model to allow 1Gbps on newer
revisions of R-Car H3, and on R-Car M3-W.

Signed-off-by: Geert Uytterhoeven 
Signed-off-by: Simon Horman 
Acked-by: Sergei Shtylyov 
---
 drivers/net/ethernet/renesas/ravb_main.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 732cdea7800b..615a3cb6f18c 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -973,6 +974,11 @@ static void ravb_adjust_link(struct net_device *ndev)
phy_print_status(phydev);
 }
 
+static const struct soc_device_attribute r8a7795es10[] = {
+   { .soc_id = "r8a7795", .revision = "ES1.0", },
+   { /* sentinel */ }
+};
+
 /* PHY init function */
 static int ravb_phy_init(struct net_device *ndev)
 {
@@ -1008,10 +1014,10 @@ static int ravb_phy_init(struct net_device *ndev)
goto err_deregister_fixed_link;
}
 
-   /* This driver only support 10/100Mbit speeds on Gen3
+   /* This driver only support 10/100Mbit speeds on R-Car H3 ES1.0
 * at this time.
 */
-   if (priv->chip_id == RCAR_GEN3) {
+   if (soc_device_match(r8a7795es10)) {
err = phy_set_max_speed(phydev, SPEED_100);
if (err) {
netdev_err(ndev, "failed to limit PHY to 100Mbit/s\n");
-- 
2.7.0.rc3.207.g0ac5344
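
The sentinel-terminated table lookup that the patch above relies on can be sketched outside the kernel as follows. This is a simplified illustration under stated assumptions: `struct soc_attr`, `match_soc()` and `max_speed_mbit()` are hypothetical user-space stand-ins, and the comparison is an exact string match, which is not necessarily how the kernel's `soc_device_match()` compares attributes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct soc_device_attribute. */
struct soc_attr {
	const char *soc_id;
	const char *revision;
};

/* Same shape as r8a7795es10[] in the patch: one entry plus a sentinel. */
static const struct soc_attr h3_es10[] = {
	{ "r8a7795", "ES1.0" },
	{ NULL, NULL }	/* sentinel */
};

/* A NULL field in a table entry acts as a wildcard. */
static int field_matches(const char *want, const char *have)
{
	return !want || (have && strcmp(want, have) == 0);
}

/* Return the first matching entry, or NULL; stop at the all-NULL sentinel. */
static const struct soc_attr *match_soc(const struct soc_attr *table,
					const struct soc_attr *running)
{
	for (; table->soc_id || table->revision; table++) {
		if (field_matches(table->soc_id, running->soc_id) &&
		    field_matches(table->revision, running->revision))
			return table;
	}
	return NULL;
}

/* Mirrors the decision in ravb_phy_init(): cap at 100 Mbit/s on H3 ES1.0 only. */
static int max_speed_mbit(const struct soc_attr *running)
{
	return match_soc(h3_es10, running) ? 100 : 1000;
}
```

With this table only the exact pair r8a7795 / ES1.0 is limited to 100 Mbit/s; a later H3 revision or an M3-W (r8a7796) falls through to the sentinel and keeps gigabit.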

Re: [PATCH] cfg80211 debugfs: Cleanup some checkpatch issues

2017-01-27 Thread Joe Perches

On Fri, 2017-01-27 at 22:26 +0300, Pichugin Dmitry wrote:
> This fixes the checkpatch.pl warnings:
> * Macros should not use a trailing semicolon.
> * Spaces required around that '='.
> * Symbolic permissions 'S_IRUGO' are not preferred.

OK

> * Macro argument reuse 'buflen' - possible side-effects

Not all checkpatch messages need fixing.
This is one of them.

> diff --git a/net/wireless/debugfs.c b/net/wireless/debugfs.c
[]
> @@ -17,11 +17,12 @@
>  static ssize_t name## _read(struct file *file, char __user *userbuf, \
>   size_t count, loff_t *ppos) \
>  {\
> - struct wiphy *wiphy= file->private_data;\
> - char buf[buflen];   \
> + struct wiphy *wiphy = file->private_data;   \
> + int __buflen = __builtin_constant_p(buflen) ? buflen : -1;  \
> + char buf[__buflen]; \

That's rather an odd change too
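
For readers unfamiliar with the "Macro argument reuse" warning, here is a small sketch of the hazard checkpatch is pointing at. `DOUBLE_UP()` and the helper functions are hypothetical and are not taken from net/wireless/debugfs.c. The warning only matters when the argument has side effects; it is harmless when the argument is a compile-time constant, which is presumably why it can be ignored here.

```c
/* Hypothetical macro that uses its argument twice. */
#define DOUBLE_UP(x) ((x) + (x))

static int calls;

/* Helper with a visible side effect: counts how often it runs. */
static int next_value(void)
{
	calls++;
	return 10;
}

/* Returns how many times next_value() ran for a single DOUBLE_UP() use. */
static int calls_for_one_use(void)
{
	calls = 0;
	(void)DOUBLE_UP(next_value());	/* expands to two calls */
	return calls;
}

/* With a constant argument there is no side effect to repeat. */
static int constant_use(void)
{
	return DOUBLE_UP(32);
}
```

A single use of the macro with a function-call argument runs that function twice, whereas a literal such as 32 is simply substituted twice with no observable difference.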

[PATCH v3 net-next 1/2] ravb: Add tx and rx clock internal delays mode of APSR

2017-01-27 Thread Simon Horman

From: Kazuya Mizuguchi 

This patch enables tx and rx clock internal delay modes (TDM and RDM).

This is to address a failure in the case of 1Gbps communication using the
salvator-x board with the KSZ9031RNX phy. This has been reported to
occur with both the r8a7795 (H3) and r8a7796 (M3-W) SoCs.

With this change APSR internal delay modes are enabled for
"rgmii-id", "rgmii-rxid" and "rgmii-txid" phy modes as follows:

phy mode   | APSR delay mode
-----------+----------------
rgmii-id   | TDM and RDM
rgmii-rxid | RDM
rgmii-txid | TDM

Signed-off-by: Kazuya Mizuguchi 
Signed-off-by: Simon Horman 
Acked-by: Sergei Shtylyov 

---
v3 [Simon Horman]
* Move comment to above ravb_set_delay_mode()

v2 [Simon Horman]
* As suggested by Sergei Shtylyov
  - Add a comment to indicate that APSR_DM appears to be undocumented.
  - Move chip_id check outside of ravb_set_delay_mode for consistency
  - Call ravb_modify() once in ravb_set_delay_mode()
* Enhance comment before calls to ravb_set_delay_mode()

v1 [Simon Horman]
- Combined patches
- Reworded changelog

v0 [Kazuya Mizuguchi]
---
 drivers/net/ethernet/renesas/ravb.h  | 10 ++
 drivers/net/ethernet/renesas/ravb_main.c | 23 +++
 2 files changed, 33 insertions(+)

diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
index f1109661a533..0525bd696d5d 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -76,6 +76,7 @@ enum ravb_reg {
CDAR20  = 0x0060,
CDAR21  = 0x0064,
ESR = 0x0088,
+   APSR= 0x008C,   /* R-Car Gen3 only */
RCR = 0x0090,
RQC0= 0x0094,
RQC1= 0x0098,
@@ -248,6 +249,15 @@ enum ESR_BIT {
ESR_EIL = 0x1000,
 };
 
+/* APSR */
+enum APSR_BIT {
+   APSR_MEMS   = 0x0002,
+   APSR_CMSW   = 0x0010,
+   APSR_DM = 0x6000,   /* Undocumented? */
+   APSR_DM_RDM = 0x2000,
+   APSR_DM_TDM = 0x4000,
+};
+
 /* RCR */
 enum RCR_BIT {
RCR_EFFS= 0x0001,
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 89ac1e3f6175..732cdea7800b 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1904,6 +1904,23 @@ static void ravb_set_config_mode(struct net_device *ndev)
}
 }
 
+/* Set tx and rx clock internal delay modes */
+static void ravb_set_delay_mode(struct net_device *ndev)
+{
+   struct ravb_private *priv = netdev_priv(ndev);
+   int set = 0;
+
+   if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
+   priv->phy_interface == PHY_INTERFACE_MODE_RGMII_RXID)
+   set |= APSR_DM_RDM;
+
+   if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
+   priv->phy_interface == PHY_INTERFACE_MODE_RGMII_TXID)
+   set |= APSR_DM_TDM;
+
+   ravb_modify(ndev, APSR, APSR_DM, set);
+}
+
 static int ravb_probe(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
@@ -2016,6 +2033,9 @@ static int ravb_probe(struct platform_device *pdev)
/* Request GTI loading */
ravb_modify(ndev, GCCR, GCCR_LTI, GCCR_LTI);
 
+   if (priv->chip_id != RCAR_GEN2)
+   ravb_set_delay_mode(ndev);
+
/* Allocate descriptor base address table */
priv->desc_bat_size = sizeof(struct ravb_desc) * DBAT_ENTRY_NUM;
priv->desc_bat = dma_alloc_coherent(ndev->dev.parent, 
priv->desc_bat_size,
@@ -2152,6 +2172,9 @@ static int __maybe_unused ravb_resume(struct device *dev)
/* Request GTI loading */
ravb_modify(ndev, GCCR, GCCR_LTI, GCCR_LTI);
 
+   if (priv->chip_id != RCAR_GEN2)
+   ravb_set_delay_mode(ndev);
+
/* Restore descriptor base address table */
ravb_write(ndev, priv->desc_bat_dma, DBAT);
 
-- 
2.7.0.rc3.207.g0ac5344

[PATCH v3 net-next 0/2] ravb: Support 1Gbps on R-Car H3 ES1.1+ and R-Car M3-W

2017-01-27 Thread Simon Horman

Hi,

this series adds support for gigabit communication to the Renesas EthernetAVB
controller when used in conjunction with R-Car Gen3 H3 ES1.1+ and M3-W SoCs.
Gigabit is already supported with R-Car Gen 2 SoCs.

The patch from Geert was previously posted for inclusion in v4.10 and
acked by Dave for that purpose. It was, however, not accepted by the
ARM SoC maintainers.

The patch from Mizuguchi-san is to address timing problems observed with
gigabit transfers. I would like it considered although my own testing on
M3-W did not show any timing problems.

Changes since v1:
* Address various feedback for "APSR" patch as noted in its changelog

Geert Uytterhoeven (1):
  ravb: Support 1Gbps on R-Car H3 ES1.1+ and R-Car M3-W

Kazuya Mizuguchi (1):
  ravb: Add tx and rx clock internal delays mode of APSR

 drivers/net/ethernet/renesas/ravb.h  | 10 ++
 drivers/net/ethernet/renesas/ravb_main.c | 33 ++--
 2 files changed, 41 insertions(+), 2 deletions(-)

-- 
2.7.0.rc3.207.g0ac5344
