Re: [PATCH net-next 1/2] qed: Add infrastructure for PTP support.
Hi Sudarsana,

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Sudarsana-Kalluru/qed-Add-infrastructure-for-PTP-support/20170127-233853
config: i386-randconfig-c0-01281020 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `qed_ptp_hw_adjfreq':
>> qed_ptp.c:(.text+0x19f3c3): undefined reference to `__divdi3'
   qed_ptp.c:(.text+0x19f45a): undefined reference to `__divdi3'
   qed_ptp.c:(.text+0x19f498): undefined reference to `__divdi3'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
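
For context on the error above: `__divdi3` is the libgcc routine GCC calls for a signed 64-bit division on 32-bit x86, and the kernel is not linked against libgcc, so the symbol stays unresolved. The usual remedy is to route such divisions through the helpers in <linux/math64.h>, for example div_s64(). The following is only a userspace sketch under stated assumptions: scale_by_ppb() is a hypothetical calculation (the report does not show the body of qed_ptp_hw_adjfreq), and the local div_s64() is a stand-in with the same shape as the kernel helper.

```c
#include <stdint.h>

/* Userspace stand-in mirroring the shape of the kernel's div_s64()
 * (64-bit signed dividend, 32-bit signed divisor). In kernel code the
 * real helper comes from <linux/math64.h> and avoids the libgcc
 * __divdi3 call on 32-bit builds.
 */
static int64_t div_s64(int64_t dividend, int32_t divisor)
{
	return dividend / divisor;
}

/* Hypothetical example: adjust a period by a parts-per-billion offset.
 * The point is the call shape: div_s64(x, 1000000000) rather than a
 * plain "x / 1000000000" on a 64-bit value.
 */
static int64_t scale_by_ppb(int64_t period_ns, int32_t ppb)
{
	return period_ns + div_s64(period_ns * ppb, 1000000000);
}
```

With a 1 s period and ppb = 500, scale_by_ppb() returns the period plus 500 ns.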
Re: [net-next v2] openvswitch: Simplify do_execute_actions().
On Fri, Jan 27, 2017 at 1:45 PM, Andy Zhou wrote:
> do_execute_actions() implements a worthwhile optimization: in case
> an output action is the last action in an action list, skb_clone()
> can be avoided by outputting the current skb. However, the
> implementation is more complicated than necessary. This patch
> simplifies this logic.
>
> Signed-off-by: Andy Zhou
> ---
> v1->v2: drop skb NULL check in do_output()
>

Acked-by: Pravin B Shelar

Thanks.
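
The optimization described in the quoted commit message can be illustrated with a small userspace model. This is a sketch only, not the openvswitch code: struct out_stats and execute_outputs() are hypothetical names, and the counters merely stand in for skb_clone() and the actual port output.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters standing in for real packet operations */
struct out_stats {
	int clones;     /* how many times a clone would be needed */
	int outputs;    /* how many ports received the packet */
	int last_port;  /* port that received the original packet */
};

/* Walk a list of output ports. Every output except the last one needs
 * its own copy of the packet; the last output can consume the original.
 */
static void execute_outputs(const int *ports, size_t n, struct out_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		bool last = (i == n - 1);

		if (!last)
			st->clones++;              /* stands in for skb_clone() */
		else
			st->last_port = ports[i];  /* original packet goes here */
		st->outputs++;
	}
}
```

For three output ports this records two clones and three outputs, which is the saving the commit message refers to.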
Re: [PATCH net-next 1/2] qed: Add infrastructure for PTP support.
Hi Sudarsana,

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Sudarsana-Kalluru/qed-Add-infrastructure-for-PTP-support/20170127-233853
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

>> ERROR: "__divdi3" [drivers/net/ethernet/qlogic/qed/qed.ko] undefined!
Re: net: suspicious RCU usage in nf_hook
On Fri, 2017-01-27 at 17:00 -0800, Cong Wang wrote:
> On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet wrote:
> > Oh well, I forgot to submit the official patch I think, Jan 9th.
> >
> > https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
> >
>
> Hmm, but why only fragments need skb_orphan()? It seems like
> any kfree_skb() inside a nf hook needs to have a preceding
> skb_orphan().
>
> Also, I am not convinced it is similar to commit 8282f27449bf15548
> which is on RX path.

Well, we clearly see IPv6 reassembly being part of the equation in both
cases.

I was replying to first part of the splat [1], which was already
diagnosed and had a non official patch.

use after free is also a bug, regardless of jump label being used or
not.

I still do not really understand this nf_hook issue, I thought we were
disabling BH in netfilter.

So the in_interrupt() check in net_disable_timestamp() should trigger,
this was the intent of netstamp_needed_deferred existence.

Not sure if we can test for rcu_read_lock() as well.

[1]
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sock_wfree+0xae/0x120 net/core/sock.c:1645
 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put include/net/inet_frag.h:133 [inline]
 nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook include/linux/netfilter.h:212 [inline]
 __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
[PATCH net-next v2 1/4] net: dsa: Add plumbing for port mirroring
Add necessary plumbing at the slave network device level to have switch
drivers implement ndo_setup_tc() and most particularly the cls_matchall
classifier. We add support for two switch operations:

port_mirror_add() and port_mirror_del(), which configure, on a per-port
basis, the mirror parameters requested from the cls_matchall classifier.

Code is largely borrowed from the Mellanox Spectrum switch driver.

Signed-off-by: Florian Fainelli
---
 include/net/dsa.h  |  36 ++
 net/dsa/dsa_priv.h |   3 ++
 net/dsa/slave.c    | 143 -
 3 files changed, 181 insertions(+), 1 deletion(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 92fd795e9573..831a789594f1 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -20,6 +20,8 @@
 #include
 #include
 
+struct tc_action;
+
 enum dsa_tag_protocol {
 	DSA_TAG_PROTO_NONE = 0,
 	DSA_TAG_PROTO_DSA,
@@ -139,6 +141,31 @@ struct dsa_switch_tree {
 	const struct dsa_device_ops *tag_ops;
 };
 
+enum dsa_port_mall_action_type {
+	DSA_PORT_MALL_MIRROR,
+};
+
+/*
+ * Mirroring TC entry
+ */
+struct dsa_mall_mirror_tc_entry {
+	u8 to_local_port;
+	bool ingress;
+};
+
+/*
+ * TC matchall entry
+ */
+struct dsa_mall_tc_entry {
+	struct list_head list;
+	unsigned long cookie;
+	enum dsa_port_mall_action_type type;
+	union {
+		struct dsa_mall_mirror_tc_entry mirror;
+	};
+};
+
+
 struct dsa_port {
 	struct net_device	*netdev;
 	struct device_node	*dn;
@@ -370,6 +397,15 @@ struct dsa_switch_ops {
 	int	(*port_mdb_dump)(struct dsa_switch *ds, int port,
 				 struct switchdev_obj_port_mdb *mdb,
 				 int (*cb)(struct switchdev_obj *obj));
+
+	/*
+	 * TC integration
+	 */
+	int	(*port_mirror_add)(struct dsa_switch *ds, int port,
+				   struct dsa_mall_mirror_tc_entry *mirror,
+				   bool ingress);
+	void	(*port_mirror_del)(struct dsa_switch *ds, int port,
+				   struct dsa_mall_mirror_tc_entry *mirror);
 };
 
 struct dsa_switch_driver {
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 16194a4bb2fe..b10b03028b24 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -46,6 +46,9 @@ struct dsa_slave_priv {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll		*netpoll;
 #endif
+
+	/* TC context */
+	struct list_head	mall_tc_list;
 };
 
 /* dsa.c */
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b8e58689a9a1..a5f9f1ebca2e 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -16,12 +16,17 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
+#include
 #include
 #include
 #include "dsa_priv.h"
 
+static bool dsa_slave_dev_check(struct net_device *dev);
+
 /* slave mii_bus handling ***/
 static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg)
 {
@@ -994,6 +999,139 @@ static int dsa_slave_get_phys_port_name(struct net_device *dev,
 	return 0;
 }
 
+static struct dsa_mall_tc_entry *
+dsa_slave_mall_tc_entry_find(struct dsa_slave_priv *p,
+			     unsigned long cookie)
+{
+	struct dsa_mall_tc_entry *mall_tc_entry;
+
+	list_for_each_entry(mall_tc_entry, &p->mall_tc_list, list)
+		if (mall_tc_entry->cookie == cookie)
+			return mall_tc_entry;
+
+	return NULL;
+}
+
+static int dsa_slave_add_cls_matchall(struct net_device *dev,
+				      __be16 protocol,
+				      struct tc_cls_matchall_offload *cls,
+				      bool ingress)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_mall_tc_entry *mall_tc_entry;
+	struct dsa_switch *ds = p->parent;
+	struct net *net = dev_net(dev);
+	struct dsa_slave_priv *to_p;
+	struct net_device *to_dev;
+	const struct tc_action *a;
+	int err = -EOPNOTSUPP;
+	LIST_HEAD(actions);
+	int ifindex;
+
+	if (!ds->ops->port_mirror_add)
+		return err;
+
+	if (!tc_single_action(cls->exts)) {
+		netdev_err(dev, "only singular actions are supported\n");
+		return err;
+	}
+
+	mall_tc_entry = kzalloc(sizeof(*mall_tc_entry), GFP_KERNEL);
+	if (!mall_tc_entry)
+		return -ENOMEM;
+	mall_tc_entry->cookie = cls->cookie;
+
+	tcf_exts_to_list(cls->exts, &actions);
+	a = list_first_entry(&actions, struct tc_action, list);
+
+	if (is_tcf_mirred_egress_mirror(a) && protocol == htons(ETH_P_ALL)) {
+		struct dsa_mall_mirror_tc_entry *mirror;
+
+		mall_tc_entry->type = DSA_PORT_MALL_MIRROR;
+		mirror = &mall_tc_entry->mirror;
+
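
The cookie lookup in dsa_slave_mall_tc_entry_find() above can be modelled outside the kernel. The sketch below is an assumption-laden illustration: it replaces the kernel's struct list_head and list_for_each_entry() with a plain singly linked list, and struct mall_entry / mall_entry_find() are hypothetical names, not part of the patch.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dsa_mall_tc_entry:
 * only the cookie and a plain "next" pointer are kept.
 */
struct mall_entry {
	unsigned long cookie;
	struct mall_entry *next;
};

/* Return the entry whose cookie matches, or NULL if none does */
static struct mall_entry *mall_entry_find(struct mall_entry *head,
					  unsigned long cookie)
{
	for (struct mall_entry *e = head; e; e = e->next)
		if (e->cookie == cookie)
			return e;

	return NULL;
}
```

The cookie acts as the key that ties a later delete request back to the entry recorded when the filter was added.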
[PATCH net-next v2 2/4] net: dsa: b53: Add mirror capture register definitions
Add definitions for the different Roboswitch registers relevant for
ingress and egress mirroring.

Signed-off-by: Florian Fainelli
---
 drivers/net/dsa/b53/b53_regs.h | 32
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_regs.h b/drivers/net/dsa/b53/b53_regs.h
index dac0af4e2cd0..9fd24c418fa4 100644
--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -206,6 +206,38 @@
 #define   BRCM_HDR_P8_EN		BIT(0) /* Enable tagging on port 8 */
 #define   BRCM_HDR_P5_EN		BIT(1) /* Enable tagging on port 5 */
 
+/* Mirror capture control register (16 bit) */
+#define B53_MIR_CAP_CTL			0x10
+#define  CAP_PORT_MASK			0xf
+#define  BLK_NOT_MIR			BIT(14)
+#define  MIRROR_EN			BIT(15)
+
+/* Ingress mirror control register (16 bit) */
+#define B53_IG_MIR_CTL			0x12
+#define  MIRROR_MASK			0x1ff
+#define  DIV_EN				BIT(13)
+#define  MIRROR_FILTER_MASK		0x3
+#define  MIRROR_FILTER_SHIFT		14
+#define  MIRROR_ALL			0
+#define  MIRROR_DA			1
+#define  MIRROR_SA			2
+
+/* Ingress mirror divider register (16 bit) */
+#define B53_IG_MIR_DIV			0x14
+#define  IN_MIRROR_DIV_MASK		0x3ff
+
+/* Ingress mirror MAC address register (48 bit) */
+#define B53_IG_MIR_MAC			0x16
+
+/* Egress mirror control register (16 bit) */
+#define B53_EG_MIR_CTL			0x1C
+
+/* Egress mirror divider register (16 bit) */
+#define B53_EG_MIR_DIV			0x1E
+
+/* Egress mirror MAC address register (48 bit) */
+#define B53_EG_MIR_MAC			0x20
+
 /* Device ID register (8 or 32 bit) */
 #define B53_DEVICE_ID			0x30
 
-- 
2.9.3
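
To show how the masks and shifts above might combine into 16-bit register values, here is a small userspace sketch. It copies the constants from the patch; cap_ctl_value() and mir_ctl_value() are hypothetical helpers, and placing the filter field at (filter & MIRROR_FILTER_MASK) << MIRROR_FILTER_SHIFT is an assumption inferred from the names, since this series only uses MIRROR_MASK, CAP_PORT_MASK and MIRROR_EN.

```c
#include <stdint.h>

#define BIT(n)               (1u << (n))
#define CAP_PORT_MASK        0xf
#define MIRROR_EN            BIT(15)
#define MIRROR_MASK          0x1ff
#define MIRROR_FILTER_MASK   0x3
#define MIRROR_FILTER_SHIFT  14
#define MIRROR_DA            1

/* Capture control: replace the capture port field, then set the enable bit */
static uint16_t cap_ctl_value(uint16_t old, unsigned int capture_port)
{
	old &= ~CAP_PORT_MASK;
	old |= capture_port & CAP_PORT_MASK;
	return old | MIRROR_EN;
}

/* Ingress/egress control: port bitmap in the low bits; the filter field
 * position is an assumed layout derived from MIRROR_FILTER_SHIFT.
 */
static uint16_t mir_ctl_value(uint16_t port_bitmap, unsigned int filter)
{
	return (port_bitmap & MIRROR_MASK) |
	       ((filter & MIRROR_FILTER_MASK) << MIRROR_FILTER_SHIFT);
}
```

For instance, selecting capture port 8 on a register that previously held 0x4003 keeps BLK_NOT_MIR (bit 14), replaces the port field and sets MIRROR_EN.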
[PATCH net-next v2 0/4] net: dsa: Port mirroring support
Hi all,

This patch series adds support for port mirroring in the two
Broadcom switch drivers. Most of the functional changes are actually in
the plumbing between tc and the drivers.

David, this will most likely conflict a little bit with my other series:
net: dsa: bcm_sf2: CFP support, so just let me know if that happens, and
I will provide a rebased version. Thanks!

Changes in v2:

- fixed filter removal logic to disable the ingress or egress mirroring
  when there are no longer ports being monitored in ingress or egress

- removed a stray list_head in dsa_port structure that is not used

Tested using the two iproute2 examples:

# ingress
tc qdisc add dev eth1 handle : ingress
tc filter add dev eth1 parent : \
        matchall skip_sw \
        action mirred egress mirror \
        dev eth2
# egress
tc qdisc add dev eth1 handle 1: root prio
tc filter add dev eth1 parent 1: \
        matchall skip_sw \
        action mirred egress mirror \
        dev eth2

Florian Fainelli (4):
  net: dsa: Add plumbing for port mirroring
  net: dsa: b53: Add mirror capture register definitions
  net: dsa: b53: Add support for port mirroring
  net: dsa: bcm_sf2: Add support for port mirroring

 drivers/net/dsa/b53/b53_common.c |  67 ++
 drivers/net/dsa/b53/b53_priv.h   |   4 ++
 drivers/net/dsa/b53/b53_regs.h   |  32 +
 drivers/net/dsa/bcm_sf2.c        |   2 +
 include/net/dsa.h                |  36 ++
 net/dsa/dsa_priv.h               |   3 +
 net/dsa/slave.c                  | 143 ++-
 7 files changed, 286 insertions(+), 1 deletion(-)

-- 
2.9.3
[PATCH net-next v2 3/4] net: dsa: b53: Add support for port mirroring
Add support for configuring port mirroring through the cls_matchall
classifier. We do a full ingress or egress capture towards the capture
port. Future improvements could include leveraging the divider to allow
less frames to be captured, as well as matching specific MAC DA/SA.

Signed-off-by: Florian Fainelli
---
 drivers/net/dsa/b53/b53_common.c | 67
 drivers/net/dsa/b53/b53_priv.h   |  4 +++
 2 files changed, 71 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index bb210b12ad1b..052ff4c22667 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1453,6 +1453,71 @@ static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds)
 	return DSA_TAG_PROTO_NONE;
 }
 
+int b53_mirror_add(struct dsa_switch *ds, int port,
+		   struct dsa_mall_mirror_tc_entry *mirror, bool ingress)
+{
+	struct b53_device *dev = ds->priv;
+	u16 reg, loc;
+
+	if (ingress)
+		loc = B53_IG_MIR_CTL;
+	else
+		loc = B53_EG_MIR_CTL;
+
+	b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+	reg &= ~MIRROR_MASK;
+	reg |= BIT(port);
+	b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+	b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+	reg &= ~CAP_PORT_MASK;
+	reg |= mirror->to_local_port;
+	reg |= MIRROR_EN;
+	b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+
+	return 0;
+}
+EXPORT_SYMBOL(b53_mirror_add);
+
+void b53_mirror_del(struct dsa_switch *ds, int port,
+		    struct dsa_mall_mirror_tc_entry *mirror)
+{
+	struct b53_device *dev = ds->priv;
+	bool loc_disable = false, other_loc_disable = false;
+	u16 reg, loc;
+
+	if (mirror->ingress)
+		loc = B53_IG_MIR_CTL;
+	else
+		loc = B53_EG_MIR_CTL;
+
+	/* Update the desired ingress/egress register */
+	b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+	reg &= ~BIT(port);
+	if (!(reg & MIRROR_MASK))
+		loc_disable = true;
+	b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+	/* Now look at the other one to know if we can disable mirroring
+	 * entirely
+	 */
+	if (mirror->ingress)
+		b53_read16(dev, B53_MGMT_PAGE, B53_EG_MIR_CTL, &reg);
+	else
+		b53_read16(dev, B53_MGMT_PAGE, B53_IG_MIR_CTL, &reg);
+	if (!(reg & MIRROR_MASK))
+		other_loc_disable = true;
+
+	b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+	/* Both no longer have ports, let's disable mirroring */
+	if (loc_disable && other_loc_disable) {
+		reg &= ~MIRROR_EN;
+		reg &= ~mirror->to_local_port;
+	}
+	b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+}
+EXPORT_SYMBOL(b53_mirror_del);
+
 static const struct dsa_switch_ops b53_switch_ops = {
 	.get_tag_protocol	= b53_get_tag_protocol,
 	.setup			= b53_setup,
@@ -1477,6 +1542,8 @@ static const struct dsa_switch_ops b53_switch_ops = {
 	.port_fdb_dump		= b53_fdb_dump,
 	.port_fdb_add		= b53_fdb_add,
 	.port_fdb_del		= b53_fdb_del,
+	.port_mirror_add	= b53_mirror_add,
+	.port_mirror_del	= b53_mirror_del,
 };
 
 struct b53_chip_data {
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index a8031b382c55..28ffe255276f 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -408,5 +408,9 @@ int b53_fdb_del(struct dsa_switch *ds, int port,
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 		 struct switchdev_obj_port_fdb *fdb,
 		 int (*cb)(struct switchdev_obj *obj));
+int b53_mirror_add(struct dsa_switch *ds, int port,
+		   struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
+void b53_mirror_del(struct dsa_switch *ds, int port,
+		    struct dsa_mall_mirror_tc_entry *mirror);
 
 #endif
-- 
2.9.3
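
The removal logic in this version of b53_mirror_del() can be exercised with a userspace model. This is a sketch under stated assumptions: struct fake_regs and the model_* functions are hypothetical stand-ins for the three registers and for b53_read16()/b53_write16(), and the model leaves out the "reg &= ~mirror->to_local_port" step to focus on when MIRROR_EN is cleared.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)        (1u << (n))
#define MIRROR_MASK   0x1ff
#define MIRROR_EN     BIT(15)
#define CAP_PORT_MASK 0xf

/* Hypothetical stand-in for the three switch registers involved */
struct fake_regs {
	uint16_t ig_ctl;   /* models B53_IG_MIR_CTL */
	uint16_t eg_ctl;   /* models B53_EG_MIR_CTL */
	uint16_t cap_ctl;  /* models B53_MIR_CAP_CTL */
};

/* Mirrors the shape of b53_mirror_add(): set the port, enable capture */
static void model_mirror_add(struct fake_regs *r, int port,
			     int capture_port, bool ingress)
{
	uint16_t *ctl = ingress ? &r->ig_ctl : &r->eg_ctl;

	*ctl = (*ctl & ~MIRROR_MASK) | BIT(port);
	r->cap_ctl = (r->cap_ctl & ~CAP_PORT_MASK) | capture_port | MIRROR_EN;
}

/* Mirrors the shape of the v2 b53_mirror_del(): clear the port, then
 * disable capture only when neither direction has ports left.
 */
static void model_mirror_del(struct fake_regs *r, int port, bool ingress)
{
	uint16_t *ctl = ingress ? &r->ig_ctl : &r->eg_ctl;
	uint16_t other = ingress ? r->eg_ctl : r->ig_ctl;

	*ctl &= ~BIT(port);
	if (!(*ctl & MIRROR_MASK) && !(other & MIRROR_MASK))
		r->cap_ctl &= ~MIRROR_EN;
}
```

In this model, deleting the ingress filter while an egress filter is still installed leaves MIRROR_EN set; it is cleared only once the second filter is removed as well.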
[PATCH net-next v2 4/4] net: dsa: bcm_sf2: Add support for port mirroring
We can use b53_mirror_add and b53_mirror_del because the Starfighter 2
is register compatible in that specific case.

Signed-off-by: Florian Fainelli
---
 drivers/net/dsa/bcm_sf2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 8eecfd227e06..3e514d7af218 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1036,6 +1036,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
 	.port_fdb_dump		= b53_fdb_dump,
 	.port_fdb_add		= b53_fdb_add,
 	.port_fdb_del		= b53_fdb_del,
+	.port_mirror_add	= b53_mirror_add,
+	.port_mirror_del	= b53_mirror_del,
 };
 
 struct bcm_sf2_of_data {
-- 
2.9.3
Re: net: suspicious RCU usage in nf_hook
On Fri, Jan 27, 2017 at 3:35 PM, Eric Dumazet wrote:
> Oh well, I forgot to submit the official patch I think, Jan 9th.
>
> https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
>

Hmm, but why only fragments need skb_orphan()? It seems like
any kfree_skb() inside a nf hook needs to have a preceding
skb_orphan().

Also, I am not convinced it is similar to commit 8282f27449bf15548
which is on RX path.
Re: [PATCH net-next 0/4] net: dsa: Port mirroring support
On 01/27/2017 04:40 PM, Florian Fainelli wrote:
> Hi all,
>
> This patch series adds support for port mirroring in the two
> Broadcom switch drivers. The major part of the functional are actually with
> the plumbing between tc and the drivers.

Meh, there are two issues that need fixing:

- left a stray list_head in the dsa_port structure which we do not need
- if we remove either the ingress, or egress filter, we end-up disabling
  the mirroring entirely, so need to rework the b53_mirror_del logic a bit

Will re-submit shortly.

>
> David, this will most likely conflict a little bit with my other series:
> net: dsa: bcm_sf2: CFP support, so just let me know if that happens, and
> I will provide a rebased version. Thanks!
>
> Tested using the two iproute2 examples:
>
> # ingress
> tc qdisc add dev eth1 handle : ingress
> tc filter add dev eth1 parent : \
>         matchall skip_sw \
>         action mirred egress mirror \
>         dev eth2
> # egress
> tc qdisc add dev eth1 handle 1: root prio
> tc filter add dev eth1 parent 1: \
>         matchall skip_sw \
>         action mirred egress mirror \
>         dev eth2
>
>
> Florian Fainelli (4):
>   net: dsa: Add plumbing for port mirroring
>   net: dsa: b53: Add mirror capture register definitions
>   net: dsa: b53: Add support for port mirroring
>   net: dsa: bcm_sf2: Add support for port mirroring
>
>  drivers/net/dsa/b53/b53_common.c |  54 +++
>  drivers/net/dsa/b53/b53_priv.h   |   4 ++
>  drivers/net/dsa/b53/b53_regs.h   |  32 +
>  drivers/net/dsa/bcm_sf2.c        |   2 +
>  include/net/dsa.h                |  41 +++
>  net/dsa/dsa_priv.h               |   3 +
>  net/dsa/slave.c                  | 143 ++-
>  7 files changed, 278 insertions(+), 1 deletion(-)
>

-- 
Florian
[PATCH net-next 4/4] net: dsa: bcm_sf2: Add support for port mirroring
We can use b53_mirror_add and b53_mirror_del because the Starfighter 2
is register compatible in that specific case.

Signed-off-by: Florian Fainelli
---
 drivers/net/dsa/bcm_sf2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 8eecfd227e06..3e514d7af218 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1036,6 +1036,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
 	.port_fdb_dump		= b53_fdb_dump,
 	.port_fdb_add		= b53_fdb_add,
 	.port_fdb_del		= b53_fdb_del,
+	.port_mirror_add	= b53_mirror_add,
+	.port_mirror_del	= b53_mirror_del,
 };
 
 struct bcm_sf2_of_data {
-- 
2.9.3
[PATCH net-next 3/4] net: dsa: b53: Add support for port mirroring
Add support for configuring port mirroring through the cls_matchall
classifier. We do a full ingress or egress capture towards the capture
port. Future improvements could include leveraging the divider to allow
less frames to be captured, as well as matching specific MAC DA/SA.

Signed-off-by: Florian Fainelli
---
 drivers/net/dsa/b53/b53_common.c | 54
 drivers/net/dsa/b53/b53_priv.h   |  4 +++
 2 files changed, 58 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index bb210b12ad1b..5c9dc4bf7b22 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1453,6 +1453,58 @@ static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds)
 	return DSA_TAG_PROTO_NONE;
 }
 
+int b53_mirror_add(struct dsa_switch *ds, int port,
+		   struct dsa_mall_mirror_tc_entry *mirror, bool ingress)
+{
+	struct b53_device *dev = ds->priv;
+	u16 reg, loc;
+
+	if (ingress)
+		loc = B53_IG_MIR_CTL;
+	else
+		loc = B53_EG_MIR_CTL;
+
+	b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+	reg &= ~MIRROR_MASK;
+	reg |= BIT(port);
+	b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+	b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+	reg &= ~CAP_PORT_MASK;
+	reg |= mirror->to_local_port;
+	reg |= MIRROR_EN;
+	b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+
+	return 0;
+}
+EXPORT_SYMBOL(b53_mirror_add);
+
+void b53_mirror_del(struct dsa_switch *ds, int port,
+		    struct dsa_mall_mirror_tc_entry *mirror)
+{
+	struct b53_device *dev = ds->priv;
+	bool disable_mirror = false;
+	u16 reg, loc;
+
+	if (mirror->ingress)
+		loc = B53_IG_MIR_CTL;
+	else
+		loc = B53_EG_MIR_CTL;
+
+	b53_read16(dev, B53_MGMT_PAGE, loc, &reg);
+	reg &= ~BIT(port);
+	if (!(reg & MIRROR_MASK))
+		disable_mirror = true;
+	b53_write16(dev, B53_MGMT_PAGE, loc, reg);
+
+	b53_read16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, &reg);
+	if (disable_mirror)
+		reg &= ~MIRROR_EN;
+	reg &= ~mirror->to_local_port;
+	b53_write16(dev, B53_MGMT_PAGE, B53_MIR_CAP_CTL, reg);
+}
+EXPORT_SYMBOL(b53_mirror_del);
+
 static const struct dsa_switch_ops b53_switch_ops = {
 	.get_tag_protocol	= b53_get_tag_protocol,
 	.setup			= b53_setup,
@@ -1477,6 +1529,8 @@ static const struct dsa_switch_ops b53_switch_ops = {
 	.port_fdb_dump		= b53_fdb_dump,
 	.port_fdb_add		= b53_fdb_add,
 	.port_fdb_del		= b53_fdb_del,
+	.port_mirror_add	= b53_mirror_add,
+	.port_mirror_del	= b53_mirror_del,
 };
 
 struct b53_chip_data {
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index a8031b382c55..28ffe255276f 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -408,5 +408,9 @@ int b53_fdb_del(struct dsa_switch *ds, int port,
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 		 struct switchdev_obj_port_fdb *fdb,
 		 int (*cb)(struct switchdev_obj *obj));
+int b53_mirror_add(struct dsa_switch *ds, int port,
+		   struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
+void b53_mirror_del(struct dsa_switch *ds, int port,
+		    struct dsa_mall_mirror_tc_entry *mirror);
 
 #endif
-- 
2.9.3
[PATCH net-next 2/4] net: dsa: b53: Add mirror capture register definitions
Add definitions for the different Roboswitch registers relevant for
ingress and egress mirroring.

Signed-off-by: Florian Fainelli
---
 drivers/net/dsa/b53/b53_regs.h | 32
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/dsa/b53/b53_regs.h b/drivers/net/dsa/b53/b53_regs.h
index dac0af4e2cd0..9fd24c418fa4 100644
--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -206,6 +206,38 @@
 #define   BRCM_HDR_P8_EN		BIT(0) /* Enable tagging on port 8 */
 #define   BRCM_HDR_P5_EN		BIT(1) /* Enable tagging on port 5 */
 
+/* Mirror capture control register (16 bit) */
+#define B53_MIR_CAP_CTL			0x10
+#define  CAP_PORT_MASK			0xf
+#define  BLK_NOT_MIR			BIT(14)
+#define  MIRROR_EN			BIT(15)
+
+/* Ingress mirror control register (16 bit) */
+#define B53_IG_MIR_CTL			0x12
+#define  MIRROR_MASK			0x1ff
+#define  DIV_EN				BIT(13)
+#define  MIRROR_FILTER_MASK		0x3
+#define  MIRROR_FILTER_SHIFT		14
+#define  MIRROR_ALL			0
+#define  MIRROR_DA			1
+#define  MIRROR_SA			2
+
+/* Ingress mirror divider register (16 bit) */
+#define B53_IG_MIR_DIV			0x14
+#define  IN_MIRROR_DIV_MASK		0x3ff
+
+/* Ingress mirror MAC address register (48 bit) */
+#define B53_IG_MIR_MAC			0x16
+
+/* Egress mirror control register (16 bit) */
+#define B53_EG_MIR_CTL			0x1C
+
+/* Egress mirror divider register (16 bit) */
+#define B53_EG_MIR_DIV			0x1E
+
+/* Egress mirror MAC address register (48 bit) */
+#define B53_EG_MIR_MAC			0x20
+
 /* Device ID register (8 or 32 bit) */
 #define B53_DEVICE_ID			0x30
 
-- 
2.9.3
[PATCH net-next 1/4] net: dsa: Add plumbing for port mirroring
Add necessary plumbing at the slave network device level to have switch
drivers implement ndo_setup_tc() and most particularly the cls_matchall
classifier. We add support for two switch operations:

port_mirror_add() and port_mirror_del(), which configure, on a per-port
basis, the mirror parameters requested from the cls_matchall classifier.

Code is largely borrowed from the Mellanox Spectrum switch driver.

Signed-off-by: Florian Fainelli
---
 include/net/dsa.h  |  41 +++
 net/dsa/dsa_priv.h |   3 ++
 net/dsa/slave.c    | 143 -
 3 files changed, 186 insertions(+), 1 deletion(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 92fd795e9573..7a867c57f463 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -20,6 +20,8 @@
 #include
 #include
 
+struct tc_action;
+
 enum dsa_tag_protocol {
 	DSA_TAG_PROTO_NONE = 0,
 	DSA_TAG_PROTO_DSA,
@@ -139,11 +141,41 @@ struct dsa_switch_tree {
 	const struct dsa_device_ops *tag_ops;
 };
 
+enum dsa_port_mall_action_type {
+	DSA_PORT_MALL_MIRROR,
+};
+
+/*
+ * Mirroring TC entry
+ */
+struct dsa_mall_mirror_tc_entry {
+	u8 to_local_port;
+	bool ingress;
+};
+
+/*
+ * TC matchall entry
+ */
+struct dsa_mall_tc_entry {
+	struct list_head list;
+	unsigned long cookie;
+	enum dsa_port_mall_action_type type;
+	union {
+		struct dsa_mall_mirror_tc_entry mirror;
+	};
+};
+
+
 struct dsa_port {
 	struct net_device	*netdev;
 	struct device_node	*dn;
 	unsigned int		ageing_time;
 	u8			stp_state;
+
+	/*
+	 * TC context
+	 */
+	struct list_head	mall_tc_list;
 };
 
 struct dsa_switch {
@@ -370,6 +402,15 @@ struct dsa_switch_ops {
 	int	(*port_mdb_dump)(struct dsa_switch *ds, int port,
 				 struct switchdev_obj_port_mdb *mdb,
 				 int (*cb)(struct switchdev_obj *obj));
+
+	/*
+	 * TC integration
+	 */
+	int	(*port_mirror_add)(struct dsa_switch *ds, int port,
+				   struct dsa_mall_mirror_tc_entry *mirror,
+				   bool ingress);
+	void	(*port_mirror_del)(struct dsa_switch *ds, int port,
+				   struct dsa_mall_mirror_tc_entry *mirror);
 };
 
 struct dsa_switch_driver {
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 16194a4bb2fe..b10b03028b24 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -46,6 +46,9 @@ struct dsa_slave_priv {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll		*netpoll;
 #endif
+
+	/* TC context */
+	struct list_head	mall_tc_list;
 };
 
 /* dsa.c */
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b8e58689a9a1..a5f9f1ebca2e 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -16,12 +16,17 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
+#include
 #include
 #include
 #include "dsa_priv.h"
 
+static bool dsa_slave_dev_check(struct net_device *dev);
+
 /* slave mii_bus handling ***/
 static int dsa_slave_phy_read(struct mii_bus *bus, int addr, int reg)
 {
@@ -994,6 +999,139 @@ static int dsa_slave_get_phys_port_name(struct net_device *dev,
 	return 0;
 }
 
+static struct dsa_mall_tc_entry *
+dsa_slave_mall_tc_entry_find(struct dsa_slave_priv *p,
+			     unsigned long cookie)
+{
+	struct dsa_mall_tc_entry *mall_tc_entry;
+
+	list_for_each_entry(mall_tc_entry, &p->mall_tc_list, list)
+		if (mall_tc_entry->cookie == cookie)
+			return mall_tc_entry;
+
+	return NULL;
+}
+
+static int dsa_slave_add_cls_matchall(struct net_device *dev,
+				      __be16 protocol,
+				      struct tc_cls_matchall_offload *cls,
+				      bool ingress)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_mall_tc_entry *mall_tc_entry;
+	struct dsa_switch *ds = p->parent;
+	struct net *net = dev_net(dev);
+	struct dsa_slave_priv *to_p;
+	struct net_device *to_dev;
+	const struct tc_action *a;
+	int err = -EOPNOTSUPP;
+	LIST_HEAD(actions);
+	int ifindex;
+
+	if (!ds->ops->port_mirror_add)
+		return err;
+
+	if (!tc_single_action(cls->exts)) {
+		netdev_err(dev, "only singular actions are supported\n");
+		return err;
+	}
+
+	mall_tc_entry = kzalloc(sizeof(*mall_tc_entry), GFP_KERNEL);
+	if (!mall_tc_entry)
+		return -ENOMEM;
+	mall_tc_entry->cookie = cls->cookie;
+
+	tcf_exts_to_list(cls->exts, &actions);
+	a = list_first_entry(&actions, struct tc_action, list);
+
+	if
[PATCH net-next 0/4] net: dsa: Port mirroring support
Hi all,

This patch series adds support for port mirroring in the two
Broadcom switch drivers. Most of the functional changes are actually in
the plumbing between tc and the drivers.

David, this will most likely conflict a little bit with my other series:
net: dsa: bcm_sf2: CFP support, so just let me know if that happens, and
I will provide a rebased version. Thanks!

Tested using the two iproute2 examples:

# ingress
tc qdisc add dev eth1 handle : ingress
tc filter add dev eth1 parent : \
        matchall skip_sw \
        action mirred egress mirror \
        dev eth2
# egress
tc qdisc add dev eth1 handle 1: root prio
tc filter add dev eth1 parent 1: \
        matchall skip_sw \
        action mirred egress mirror \
        dev eth2

Florian Fainelli (4):
  net: dsa: Add plumbing for port mirroring
  net: dsa: b53: Add mirror capture register definitions
  net: dsa: b53: Add support for port mirroring
  net: dsa: bcm_sf2: Add support for port mirroring

 drivers/net/dsa/b53/b53_common.c |  54 +++
 drivers/net/dsa/b53/b53_priv.h   |   4 ++
 drivers/net/dsa/b53/b53_regs.h   |  32 +
 drivers/net/dsa/bcm_sf2.c        |   2 +
 include/net/dsa.h                |  41 +++
 net/dsa/dsa_priv.h               |   3 +
 net/dsa/slave.c                  | 143 ++-
 7 files changed, 278 insertions(+), 1 deletion(-)

-- 
2.9.3
Re: [PATCH net-next 1/4] mlx5: Make building eswitch configurable
On 1/27/17 1:15 PM, Saeed Mahameed wrote:
> It is only mandatory for configurations that needs eswitch, where the
> driver has no way to know about them, for a good old bare metal box,
> eswitch is not needed.
>
> we can do some work to strip the l2 table logic - needed for PFs to
> work on multi-host - out of eswitch but again that would further
> complicate the driver code since eswitch will still need to update l2
> tables for VFs.

Saeed,
for multi-host setups every host in that multi-host doesn't
actually see the eswitch, no? Otherwise broken driver on one machine
can affect the other hosts in the same bundle? Please double check,
since this is absolutely critical HW requirement.
[PATCH net-next 2/2] tcp: include locally failed retries in retransmission stats
Currently the retransmission stats are not incremented if the
retransmit fails locally. But we always increment the other packet
counters that track total packet/bytes sent. Awkwardly while we
don't count these failed retransmits in RETRANSSEGS, we do count
them in FAILEDRETRANS.

If the qdisc is dropping many packets this could under-estimate
TCP retransmission rate substantially from both SNMP and per-socket
TCP_INFO stats. This patch changes this by always incrementing
retransmission stats on retransmission attempts and failures.

Another motivation is to properly track retransmits in
SCM_TIMESTAMPING_OPT_STATS. Since SCM_TSTAMP_SCHED collection is
triggered in tcp_transmit_skb(), if tp->total_retrans is incremented
after the function, we'll always mis-count by the amount of the
latest retransmission.

Signed-off-by: Yuchung Cheng
Signed-off-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Acked-by: Eric Dumazet
---
 net/ipv4/tcp_output.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 671c69535671..7b2d8762f15f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2771,6 +2771,13 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 	if ((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN_ECN) == TCPHDR_SYN_ECN)
 		tcp_ecn_clear_syn(sk, skb);
 
+	/* Update global and local TCP statistics. */
+	segs = tcp_skb_pcount(skb);
+	TCP_ADD_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS, segs);
+	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)
+		__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
+	tp->total_retrans += segs;
+
 	/* make sure skb->data is aligned on arches that require it
 	 * and check if ack-trimming & collapsing extended the headroom
 	 * beyond what csum_start can cover.
@@ -2788,14 +2795,9 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 	}
 
 	if (likely(!err)) {
-		segs = tcp_skb_pcount(skb);
-
 		TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
-		/* Update global TCP statistics. */
-		TCP_ADD_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS, segs);
-		if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)
-			__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
-		tp->total_retrans += segs;
+	} else if (err != -EBUSY) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
 	}
 	return err;
 }
@@ -2818,8 +2820,6 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 		if (!tp->retrans_stamp)
 			tp->retrans_stamp = tcp_skb_timestamp(skb);
 
-	} else if (err != -EBUSY) {
-		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
 	}
 
 	if (tp->undo_retrans < 0)
-- 
2.11.0.483.g087da7b7c-goog
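
The accounting change in this patch can be summarised with a userspace model. It is a sketch, not the kernel code: struct retrans_stats, count_before() and count_after() are hypothetical names, and the two counters stand in for tp->total_retrans / RETRANSSEGS and for the TCPRETRANSFAIL MIB counter.

```c
#include <errno.h>

/* Hypothetical counters standing in for the kernel statistics */
struct retrans_stats {
	unsigned int total_retrans;  /* models tp->total_retrans / RETRANSSEGS */
	unsigned int retrans_fail;   /* models LINUX_MIB_TCPRETRANSFAIL */
};

/* Before the patch: segments are counted only when the transmit succeeded */
static void count_before(struct retrans_stats *s, int segs, int err)
{
	if (!err)
		s->total_retrans += segs;
	else if (err != -EBUSY)
		s->retrans_fail++;
}

/* After the patch: every attempt is counted up front; a local failure
 * (other than -EBUSY) is additionally recorded as a failed retransmit.
 */
static void count_after(struct retrans_stats *s, int segs, int err)
{
	s->total_retrans += segs;
	if (err && err != -EBUSY)
		s->retrans_fail++;
}
```

With a locally dropped two-segment retransmit (err = -ENOBUFS), the old scheme leaves total_retrans untouched while the new one adds 2; both record one failure.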
[PATCH net-next 1/2] tcp: record pkts sent and retransmitted
Add two stats in SCM_TIMESTAMPING_OPT_STATS: TCP_NLA_DATA_SEGS_OUT: total data packets sent including retransmission TCP_NLA_TOTAL_RETRANS: total data packets retransmitted The names are picked to be consistent with corresponding fields in TCP_INFO. This allows applications that are using the timestamping API to measure latency stats to also retrive retransmission rate of application write. Signed-off-by: Yuchung ChengSigned-off-by: Soheil Hassas Yeganeh Acked-by: Neal Cardwell Acked-by: Eric Dumazet --- include/uapi/linux/tcp.h | 2 ++ net/ipv4/tcp.c | 6 +- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 6ff35eb48d10..38a2b07afdff 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -227,6 +227,8 @@ enum { TCP_NLA_BUSY, /* Time (usec) busy sending data */ TCP_NLA_RWND_LIMITED, /* Time (usec) limited by receive window */ TCP_NLA_SNDBUF_LIMITED, /* Time (usec) limited by send buffer */ + TCP_NLA_DATA_SEGS_OUT, /* Data pkts sent including retransmission */ + TCP_NLA_TOTAL_RETRANS, /* Data pkts retransmitted */ }; /* for TCP_MD5SIG socket option */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 2ed472ebf3b5..b751abc56935 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2870,7 +2870,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk) struct sk_buff *stats; struct tcp_info info; - stats = alloc_skb(3 * nla_total_size_64bit(sizeof(u64)), GFP_ATOMIC); + stats = alloc_skb(5 * nla_total_size_64bit(sizeof(u64)), GFP_ATOMIC); if (!stats) return NULL; @@ -2881,6 +2881,10 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk) info.tcpi_rwnd_limited, TCP_NLA_PAD); nla_put_u64_64bit(stats, TCP_NLA_SNDBUF_LIMITED, info.tcpi_sndbuf_limited, TCP_NLA_PAD); + nla_put_u64_64bit(stats, TCP_NLA_DATA_SEGS_OUT, + tp->data_segs_out, TCP_NLA_PAD); + nla_put_u64_64bit(stats, TCP_NLA_TOTAL_RETRANS, + tp->total_retrans, TCP_NLA_PAD); return stats; } -- 
2.11.0.483.g087da7b7c-goog
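To illustrate how an application might use the two new attributes to get a retransmission rate, here is a minimal userspace sketch. It is not part of the patch. It assumes the SCM_TIMESTAMPING_OPT_STATS payload is a flat run of netlink attributes in host byte order, each carrying a u64 as the nla_put_u64_64bit() calls above suggest. The names attr_hdr, OPT_STATS_*, retrans_stats, parse_opt_stats() and retrans_rate() are hypothetical; they only mirror the kernel's struct nlattr and the TCP_NLA_* enum order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of the kernel's struct nlattr header. */
struct attr_hdr {
	uint16_t len;	/* header plus payload, without trailing padding */
	uint16_t type;
};

/* Netlink attributes are padded to a 4-byte boundary. */
#define ATTR_ALIGN(n)	(((n) + 3u) & ~(size_t)3u)

/* Assumed to follow the order of the TCP_NLA_* enum after this patch. */
enum {
	OPT_STATS_PAD,
	OPT_STATS_BUSY,
	OPT_STATS_RWND_LIMITED,
	OPT_STATS_SNDBUF_LIMITED,
	OPT_STATS_DATA_SEGS_OUT,
	OPT_STATS_TOTAL_RETRANS,
};

struct retrans_stats {
	uint64_t data_segs_out;
	uint64_t total_retrans;
};

/* Walk the attribute stream; return 0 only if both counters were found. */
static int parse_opt_stats(const unsigned char *buf, size_t buflen,
			   struct retrans_stats *out)
{
	size_t off = 0;
	int found = 0;

	while (buflen - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr hdr;
		size_t step;
		uint64_t val;

		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > buflen - off)
			return -1;	/* malformed attribute */

		/* Only u64 payloads are of interest here. */
		if (hdr.len - sizeof(hdr) == sizeof(val)) {
			memcpy(&val, buf + off + sizeof(hdr), sizeof(val));
			if (hdr.type == OPT_STATS_DATA_SEGS_OUT) {
				out->data_segs_out = val;
				found |= 1;
			} else if (hdr.type == OPT_STATS_TOTAL_RETRANS) {
				out->total_retrans = val;
				found |= 2;
			}
		}

		step = ATTR_ALIGN((size_t)hdr.len);
		if (step > buflen - off)
			break;		/* last attribute without padding */
		off += step;
	}
	return found == 3 ? 0 : -1;
}

/* Fraction of sent data segments that were retransmissions. */
static double retrans_rate(const struct retrans_stats *s)
{
	return s->data_segs_out ?
	       (double)s->total_retrans / (double)s->data_segs_out : 0.0;
}
```

A caller would pass the control-message payload received on the error queue to parse_opt_stats() and then call retrans_rate() on the result. Unknown attribute types are skipped, so the walk should tolerate further additions to the enum.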
Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver
On Fri, Jan 27, 2017 at 4:58 PM, Sowmini Varadhan wrote:
> On (01/27/17 15:51), Willem de Bruijn wrote:
>   :
>> - limit capable() check to drivers with with .validate callback
>   (aka second option below)
>   :
>> - let privileged applications shoot themselves in the foot (change nothing).
>
>> The second will break variable length header protocols unless
>> you exhaustively search for all variable length protocols and add
>> validate callbacks first.
>
> other than ax25, are there variable length header protocols out there
> without ->validate, and which need the CAP_RAW_SYSIO branch?

I don't know. An exhaustive search of protocols (by header_ops) may be
needed to say for sure. If there are none, then the solution indeed is
quite simple.

> I realize that, to an extent, even ethernet is a protocol whose
> header is > 14 with vlan, but from the google search, seems like it
> was mostly ax25 that really triggered a large part of the check.
>
> If we think that there are a large number of these (that dont have a
> ->validate, to fix up things as desired) I'd just go for the "change
> nothing in pf_packet" option.
>
> As I found out many drivers like ixgbe and sunvnet have defensive checks
> in the Tx path anyway, and xen_netfront can also join that group with
> a few simple checks.

Okay. I suspect that there are few, if any. But this is fragile code.
Re: [PATCH v4] net: ethernet: faraday: To support device tree usage.
On Wed, Jan 25, 2017 at 10:09:20PM +0100, Arnd Bergmann wrote:
> On Wed, Jan 25, 2017 at 6:34 PM, David Miller wrote:
> > From: Greentime Hu
> > Date: Tue, 24 Jan 2017 16:46:14 +0800
> >> We also use the same binding document to describe the same faraday ethernet
> >> controller and add faraday to vendor-prefixes.txt.
> >
> > Why are you renaming the MOXA binding file instead of adding a completely
> > new one for faraday? The MOXA one should stick around, I don't see a
> > justification for removing it.
>
> This was my suggestion, basically fixing the name of the existing
> binding, which was accidentally named after one of the users rather than
> the company that did the hardware.
>
> We can't change the compatible string, but I'd much prefer having only
> one binding file for this device rather than two separate ones that could
> possibly become incompatible in case we add new properties to them. If
> there is only one of them, naming it according to the hardware design is
> the general policy.
>
> Note that we currently have two separate device drivers, but that is more a
> historic artifact, and if we ever get around to merging them into one driver,
> that should not impact the binding.

The change is fine with me, but the subject and commit message need some
work. I'm guessing faraday licensed this to MOXA or something? Why is
the new name preferred or better?

Rob
ATTENTION
ATTENTION; Your mailbox has exceeded the storage limit of 5 GB set by the administrator. It is currently at 10.9 GB, and you may not be able to send or receive new mail until you revalidate your mailbox. To revalidate your mailbox, send the following information: Name: Username: Password: Confirm password: E-mail: Phone: If you do not revalidate your mailbox, it will be disabled! Sorry for the inconvenience. Verification code: es: 006524 Mail Technical Support © 2017 Thank you, System Administrator
Re: net: suspicious RCU usage in nf_hook
On Fri, 2017-01-27 at 22:15 +0100, Dmitry Vyukov wrote: > Hello, > > I've got the following report while running syzkaller fuzzer on > fd694aaa46c7ed811b72eb47d5eb11ce7ab3f7f1: > > [ INFO: suspicious RCU usage. ] > 4.10.0-rc5+ #192 Not tainted > --- > ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side > critical section! > > other info that might help us debug this: > > rcu_scheduler_active = 2, debug_locks = 0 > 2 locks held by syz-executor14/23111: > #0: (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock > include/net/sock.h:1454 [inline] > #0: (sk_lock-AF_INET6){+.+.+.}, at: [] > rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919 > #1: (rcu_read_lock){..}, at: [] nf_hook > include/linux/netfilter.h:201 [inline] > #1: (rcu_read_lock){..}, at: [] > __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160 > > stack backtrace: > CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:15 [inline] > dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 > lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452 > rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline] > ___might_sleep+0x560/0x650 kernel/sched/core.c:7748 > __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 > mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 > atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 > __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 > static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 > net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 > sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 > __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 > sk_destruct+0x47/0x80 net/core/sock.c:1460 > __sk_free+0x57/0x230 net/core/sock.c:1468 > sock_wfree+0xae/0x120 net/core/sock.c:1645 > skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655 > skb_release_all+0x15/0x60 net/core/skbuff.c:668 > 
__kfree_skb+0x15/0x20 net/core/skbuff.c:684 > kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705 > inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 > inet_frag_put include/net/inet_frag.h:133 [inline] > nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617 > ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 > nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline] > nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 > nf_hook include/linux/netfilter.h:212 [inline] > __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160 > ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 > ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 > ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 > rawv6_push_pending_frames net/ipv6/raw.c:613 [inline] > rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927 > inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 > sock_sendmsg_nosec net/socket.c:635 [inline] > sock_sendmsg+0xca/0x110 net/socket.c:645 > sock_write_iter+0x326/0x600 net/socket.c:848 > do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695 > do_readv_writev+0x42c/0x9b0 fs/read_write.c:872 > vfs_writev+0x87/0xc0 fs/read_write.c:911 > do_writev+0x110/0x2c0 fs/read_write.c:944 > SYSC_writev fs/read_write.c:1017 [inline] > SyS_writev+0x27/0x30 fs/read_write.c:1014 > entry_SYSCALL_64_fastpath+0x1f/0xc2 > RIP: 0033:0x445559 > RSP: 002b:7f6f46fceb58 EFLAGS: 0292 ORIG_RAX: 0014 > RAX: ffda RBX: 0005 RCX: 00445559 > RDX: 0001 RSI: 20f1eff0 RDI: 0005 > RBP: 006e19c0 R08: R09: > R10: R11: 0292 R12: 0070 > R13: 20f59000 R14: 0015 R15: 00020400 > BUG: sleeping function called from invalid context at > kernel/locking/mutex.c:752 > in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14 > INFO: lockdep is turned off. 
> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:15 [inline] > dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 > ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780 > __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 > mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 > atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 > __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 > static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 > net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 > sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 > __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 > sk_destruct+0x47/0x80 net/core/sock.c:1460 > __sk_free+0x57/0x230 net/core/sock.c:1468 > sock_wfree+0xae/0x120 net/core/sock.c:1645 > skb_release_head_state+0xfc/0x200
Re: [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails
Hi,

On 26.01.2017 23:18, Netanel Belgazal wrote:
> When driver fails in probe, it will release all resources, including
> adapter.
> In case of probe failure, ena_remove should not try to free the adapter
> resources.
>
> Signed-off-by: Netanel Belgazal
> ---
>  drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index 7493ea3..cb60567 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -3046,6 +3046,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  err_free_region:
>  	ena_release_bars(ena_dev, pdev);
>  err_free_ena_dev:
> +	pci_set_drvdata(pdev, NULL);
>  	vfree(ena_dev);
>  err_disable_device:
>  	pci_disable_device(pdev);

Is this change really a "fix"? remove() should only be called if probe()
has been successful before, otherwise not. Did you experience something
different?

Regards,
Lino
Re: net: suspicious RCU usage in nf_hook
On Fri, Jan 27, 2017 at 3:22 PM, Cong Wang wrote:
> On Fri, Jan 27, 2017 at 1:15 PM, Dmitry Vyukov wrote:
>> stack backtrace:
>> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>>  lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
>>  rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
>>  ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
>>  __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
>>  mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
>>  atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
>>  __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
>>  static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
>>  net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
>>  sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
>>  __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
>>  sk_destruct+0x47/0x80 net/core/sock.c:1460
>
> jump label uses a mutex and we call jump label API in softIRQ context...
> Maybe we have to move it to a work struct as what we did for netlink.

Correct myself: process context but with RCU read lock held...
Re: net: suspicious RCU usage in nf_hook
On Fri, Jan 27, 2017 at 1:15 PM, Dmitry Vyukov wrote:
> stack backtrace:
> CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:15 [inline]
>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>  lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
>  rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
>  ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
>  __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
>  mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
>  atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
>  __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
>  static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
>  net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
>  sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
>  __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
>  sk_destruct+0x47/0x80 net/core/sock.c:1460

jump label uses a mutex and we call jump label API in softIRQ context...
Maybe we have to move it to a work struct as what we did for netlink.
[PATCH net-next v3 4/4] net: ipv6: Use compressed IPv6 addresses showing route replace error
ip6_print_replace_route_err logs an error if a route replace fails with
IPv6 addresses in the full format. e.g.:

IPv6: IPV6: multipath route replace failed (check consistency of installed routes): 2001:0db8:0200:0000:0000:0000:0000:0000 nexthop 2001:0db8:0001:0000:0000:0000:0000:0016 ifi 0

Change the message to dump the addresses in the compressed format.

Signed-off-by: David Ahern
---
 net/ipv6/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 747f333ae006..463cc0847a3d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2969,7 +2969,7 @@ static void ip6_print_replace_route_err(struct list_head *rt6_nh_list)
 	struct rt6_nh *nh;
 
 	list_for_each_entry(nh, rt6_nh_list, next) {
-		pr_warn("IPV6: multipath route replace failed (check consistency of installed routes): %pI6 nexthop %pI6 ifi %d\n",
+		pr_warn("IPV6: multipath route replace failed (check consistency of installed routes): %pI6c nexthop %pI6c ifi %d\n",
 			&nh->r_cfg.fc_dst, &nh->r_cfg.fc_gateway,
 			nh->r_cfg.fc_ifindex);
 	}
-- 
2.1.4
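The difference between the two formats can be seen from userspace with a small sketch. This is not kernel code: format_full() and format_compressed() are hypothetical helpers, and inet_ntop() is used only as a userspace analogue of the compressed form that %pI6c prints (the two are not guaranteed to agree for every address).

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>

/*
 * Expand an address to a fixed-width form similar to what %pI6 prints:
 * eight 4-digit hex groups, 39 characters plus the terminating NUL.
 */
static void format_full(const struct in6_addr *addr, char out[40])
{
	int i;

	for (i = 0; i < 8; i++)
		snprintf(out + i * 5, (size_t)(40 - i * 5), "%02x%02x%s",
			 addr->s6_addr[2 * i], addr->s6_addr[2 * i + 1],
			 i < 7 ? ":" : "");
}

/* Compressed form as produced by inet_ntop(); NULL on failure. */
static const char *format_compressed(const struct in6_addr *addr,
				     char *out, socklen_t len)
{
	return inet_ntop(AF_INET6, addr, out, len);
}
```

For an address parsed from "2001:db8:1::16", format_full() yields the long zero-padded string seen in the old log message, while format_compressed() yields the short form that is easier to match against `ip -6 route` output.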
[PATCH net-next v3 3/4] net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attribute
IPv6 returns multipath routes as a series of individual routes making their display and handling by userspace different and more complicated than IPv4, putting the burden on the user to see that a route is part of a multipath route and internally creating a multipath route if desired (e.g., libnl does this as of commit 29b71371e764). This patch addresses this difference, allowing multipath routes to be returned using the RTA_MULTIPATH attribute. The end result is that IPv6 multipath routes can be treated and displayed in a format similar to IPv4: $ ip -6 ro ls vrf red 2001:db8::/120 metric 1024 nexthop via 2001:db8:1::62 dev eth1 weight 1 nexthop via 2001:db8:1::61 dev eth1 weight 1 nexthop via 2001:db8:1::60 dev eth1 weight 1 nexthop via 2001:db8:1::59 dev eth1 weight 1 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium ... Suggested-by: Dinesh DuttSigned-off-by: David Ahern --- v3 - dropped user API to opt-in to change v2 - changed user api to opt in to new behavior from attribute appended to the request to requiring an rtmsg struct with the RTM_F_ALL_NEXTHOPS set include/net/netlink.h | 1 + net/ipv6/ip6_fib.c| 16 ++- net/ipv6/route.c | 127 +++--- 3 files changed, 126 insertions(+), 18 deletions(-) diff --git a/include/net/netlink.h b/include/net/netlink.h index d3938f11ae52..b239fcd33d80 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -229,6 +229,7 @@ struct nl_info { struct nlmsghdr *nlh; struct net *nl_net; u32 portid; + boolskip_notify; }; int netlink_rcv_skb(struct sk_buff *skb, diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index bcaf247232d7..2542794b2c64 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -318,6 +318,16 @@ static int fib6_dump_node(struct fib6_walker *w) w->leaf = rt; return 1; } + + /* if multipath routes are dumped in one route with +* the RTA_MULTIPATH attribute, then jump rt to point +* to the last sibling of this route (no need to dump +* the sibling routes again) +*/ + if 
(rt->rt6i_nsiblings) + rt = list_last_entry(>rt6i_siblings, +struct rt6_info, +rt6i_siblings); } w->leaf = NULL; return 0; @@ -871,7 +881,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt, *ins = rt; rt->rt6i_node = fn; atomic_inc(>rt6i_ref); - inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags); + if (!info->skip_notify) + inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags); info->nl_net->ipv6.rt6_stats->fib_rt_entries++; if (!(fn->fn_flags & RTN_RTINFO)) { @@ -897,7 +908,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt, rt->rt6i_node = fn; rt->dst.rt6_next = iter->dst.rt6_next; atomic_inc(>rt6i_ref); - inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE); + if (!info->skip_notify) + inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE); if (!(fn->fn_flags & RTN_RTINFO)) { info->nl_net->ipv6.rt6_stats->fib_route_nodes++; fn->fn_flags |= RTN_RTINFO; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 81e2b2a28806..747f333ae006 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3010,19 +3010,25 @@ static int ip6_route_info_append(struct list_head *rt6_nh_list, static int ip6_route_multipath_add(struct fib6_config *cfg) { + struct rt6_info *rt, *rt_first = NULL; struct fib6_config r_cfg; struct rtnexthop *rtnh; - struct rt6_info *rt; struct rt6_nh *err_nh; struct rt6_nh *nh, *nh_safe; + __u16 nlflags; int remaining; int attrlen; int err = 1; int nhn = 0; + int append = cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_APPEND; int replace = (cfg->fc_nlinfo.nlh && (cfg->fc_nlinfo.nlh->nlmsg_flags & NLM_F_REPLACE)); LIST_HEAD(rt6_nh_list); + nlflags = replace ? NLM_F_REPLACE : NLM_F_CREATE; + if (append) + nlflags |= NLM_F_APPEND; + remaining = cfg->fc_mp_len; rtnh = (struct rtnexthop *)cfg->fc_mp; @@ -3065,9 +3071,20 @@ static int ip6_route_multipath_add(struct fib6_config *cfg) rtnh = rtnh_next(rtnh, ); } + /* for route append want to send separate
[PATCH net-next v3 2/4] net: ipv6: Allow shorthand delete of all nexthops in multipath route
IPv4 allows multipath routes to be deleted using just the prefix and
length. For example:

    $ ip ro ls vrf red
    unreachable default metric 8192
    1.1.1.0/24
        nexthop via 10.100.1.254 dev eth1 weight 1
        nexthop via 10.11.200.2 dev eth11.200 weight 1
    10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3
    10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3

    $ ip ro del 1.1.1.0/24 vrf red

    $ ip ro ls vrf red
    unreachable default metric 8192
    10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3
    10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3

The same notation does not work with IPv6 because of how multipath routes
are implemented for IPv6. For IPv6 only the first nexthop of a multipath
route is deleted if the request contains only a prefix and length. This
leads to unnecessary complexity in userspace dealing with IPv6 multipath
routes.

This patch allows all nexthops to be deleted without specifying each one
in the delete request. Internally, this is done by walking the sibling
list of the route matching the specifications given (prefix, length,
metric, protocol, etc).

    $ ip -6 ro ls vrf red
    2001:db8::/120 via 2001:db8:1::62 dev eth1 metric 256 pref medium
    2001:db8::/120 via 2001:db8:1::61 dev eth1 metric 256 pref medium
    2001:db8::/120 via 2001:db8:1::60 dev eth1 metric 256 pref medium
    2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium
    ...

    $ ip -6 ro del vrf red ::1/120

    $ ip -6 ro ls vrf red
    2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium
    ...

Because IPv6 allows individual nexthops to be deleted without deleting
the entire route, the multipath and non-multipath code paths have to be
distinguished so that all nexthops are only deleted for the latter case.
This is done by making the existing fc_type in fib6_config a u16 and then
adding a new u16 field with fc_delete_all_nh as the first bit.
Suggested-by: Dinesh Dutt
Signed-off-by: David Ahern
---
v3
- removed need for RTM_F_ALL_NEXTHOPS user api

v2
- fixed locking deleting route and its siblings as noted by DaveM

v2' (patch originally submitted standalone)
- switched examples to rfc 3849 documentation address per request
- changed delete loop to explicitly look at siblings list for first
  route matching specs given (metric, protocol, etc)

 include/net/ip6_fib.h |  4 +++-
 net/ipv6/route.c      | 34 ++++++++++++++++++++++++++++++++--
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index a74e2aa40ef4..c979c878df1c 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -37,7 +37,9 @@ struct fib6_config {
 	int		fc_ifindex;
 	u32		fc_flags;
 	u32		fc_protocol;
-	u32		fc_type;	/* only 8 bits are used */
+	u16		fc_type;	/* only 8 bits are used */
+	u16		fc_delete_all_nh : 1,
+			__unused : 15;
 
 	struct in6_addr	fc_dst;
 	struct in6_addr	fc_src;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 5046d2b24004..81e2b2a28806 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2143,6 +2143,34 @@ int ip6_del_rt(struct rt6_info *rt)
 	return __ip6_del_rt(rt, &info);
 }
 
+static int __ip6_del_rt_siblings(struct rt6_info *rt, struct fib6_config *cfg)
+{
+	struct nl_info *info = &cfg->fc_nlinfo;
+	struct fib6_table *table;
+	int err;
+
+	table = rt->rt6i_table;
+	write_lock_bh(&table->tb6_lock);
+
+	if (rt->rt6i_nsiblings && cfg->fc_delete_all_nh) {
+		struct rt6_info *sibling, *next_sibling;
+
+		list_for_each_entry_safe(sibling, next_sibling,
+					 &rt->rt6i_siblings,
+					 rt6i_siblings) {
+			err = fib6_del(sibling, info);
+			if (err)
+				goto out;
+		}
+	}
+
+	err = fib6_del(rt, info);
+out:
+	write_unlock_bh(&table->tb6_lock);
+	ip6_rt_put(rt);
+	return err;
+}
+
 static int ip6_route_del(struct fib6_config *cfg)
 {
 	struct fib6_table *table;
@@ -2179,7 +2207,7 @@ static int ip6_route_del(struct fib6_config *cfg)
 			dst_hold(&rt->dst);
 			read_unlock_bh(&table->tb6_lock);
 
-			return __ip6_del_rt(rt, &cfg->fc_nlinfo);
+			return __ip6_del_rt_siblings(rt, cfg);
 		}
 	}
 	read_unlock_bh(&table->tb6_lock);
@@ -3131,8 +3159,10 @@ static int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr *nlh)
 	if (cfg.fc_mp)
 		return ip6_route_multipath_del(&cfg);
-	else
+	else {
+		cfg.fc_delete_all_nh = 1;
 		return ip6_route_del(&cfg);
+	}
 }
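The fc_type change described in this patch (a u32 split into a u16 plus a u16 bit-field) can be checked in isolation with a small userspace sketch. The struct names cfg_before and cfg_after and the helper make_delete_request() are hypothetical stand-ins for the relevant part of struct fib6_config; the point is only that carving the flag out of the unused upper half should not change the size of the field.

```c
#include <stdint.h>

/* Hypothetical stand-in for the field before the patch. */
struct cfg_before {
	uint32_t fc_type;		/* only 8 bits are used */
};

/* Hypothetical stand-in for the field after the patch. */
struct cfg_after {
	uint16_t fc_type;		/* only 8 bits are used */
	/*
	 * Bit-fields on a 16-bit type are implementation-defined in
	 * ISO C; GCC and Clang accept them.
	 */
	uint16_t fc_delete_all_nh : 1,
		 unused : 15;
};

/* The split is expected to keep the overall size at four bytes. */
_Static_assert(sizeof(struct cfg_after) == sizeof(struct cfg_before),
	       "splitting the u32 must not grow the struct");

/* Build a delete request; all_nexthops selects the shorthand delete. */
static struct cfg_after make_delete_request(uint8_t type, int all_nexthops)
{
	struct cfg_after cfg = { 0 };

	cfg.fc_type = type;
	cfg.fc_delete_all_nh = all_nexthops ? 1 : 0;
	return cfg;
}
```

In the patch itself the flag is set only on the non-multipath delete path, so a request carrying RTA_MULTIPATH still deletes just the nexthops it names.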
Re: [PATCH 4/4] [net-next] net: qcom/emac: add an error interrupt handler for the sgmii
On 01/27/2017 04:43 PM, Timur Tabi wrote:
> The SGMII (internal PHY) can report decode errors via an interrupt. It
> can also report autonegotiation status changes, but we don't need to track
> those. The SGMII can recover automatically from most decode errors, so
> we only reset the interface if we get multiple consecutive errors.
>
> It's possible for bogus decode errors to be reported while the link is
> being brought up. The interrupt is registered when the interface is
> opened, and it's enabled after the link is up.
>
> Signed-off-by: Timur Tabi

Sorry, this particular patch wasn't meant to be in the patchset. Please
ignore it.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH net-next v3 0/4] net: ipv6: Improve user experience with multipath routes
This series closes a couple of gaps between IPv4 and IPv6 with respect
to multipath routes:

1. IPv4 allows all nexthops of multipath routes to be deleted using just
   the prefix and length; IPv6 only deletes the first nexthop for the
   route if only the prefix and length are given.

2. IPv4 returns multipath routes encoded in the RTA_MULTIPATH attribute.
   IPv6 returns a series of routes with the same prefix and length, one
   for each nexthop. This happens for both dumps and notifications.
   IPv6 does accept RTA_MULTIPATH encoded routes, but installs them as a
   series of routes.

Patch 2 addresses the first item by allowing IPv6 multipath routes to be
deleted using just the prefix and length.

Patch 3 addresses the second by allowing IPv6 multipath routes to be
returned encoded in the RTA_MULTIPATH attribute.

Patch 1 adds the NLM_F_APPEND flag to notifications when the flag is
present in the request. The lack of this flag was noted while testing
route appends and comparing to IPv4.

Patch 4 prints IPv6 addresses in compressed format when showing route
replace errors. This was noticed while testing REPLACE failures.

The end result for multipath routes:

1. Route Add - one notification with the RTA_MULTIPATH attribute.

2. Route Replace - notification for the first route and all siblings
   that have succeeded. This is needed regardless of the success of the
   remaining nexthops to maintain add/delete consistency should a
   failure happen on the second or a following nexthop (i.e., need to
   tell userspace that the original route has been replaced, and then
   the failure logic deletes all routes inserted thus far).

3. Route Delete - for a multipath route, only the given nexthops are
   deleted. This path is hit when the DELETE contains RTA_MULTIPATH.
   For all other route deletes, all nexthops are deleted for the given
   prefix and length (and any other specs if given), with one
   notification sent per nexthop deleted. This is unavoidable since
   IPv6 allows a single nexthop to be deleted within a multipath route.

4. Route Appends - IPv6 allows nexthops to be appended to an existing
   route. In this case one notification is sent per nexthop added.

Addresses some of the inconsistencies also noted by Roopa at netdev0.1:
https://www.netdev01.org/docs/prabhu-linux_ipv4_ipv6_inconsistencies_talk_slides.pdf

v3
- removed the need for a user API to opt-in to change. Requiring an API
  just shifts the difference from same API with different behavior to
  different API to achieve equivalent behavior
- route notifications changed to use RTA_MULTIPATH for add and replace
- updated commit messages and cover letter

v2
- fixed locking in patch 1 as noted by DaveM
- changed user API for patch 2 to require an rtmsg with RTM_F_ALL_NEXTHOPS
  set in rtm_flags
- revamped explanation of patch 2 and cover letter

David Ahern (4):
  net: ipv6: add NLM_F_APPEND in notifications when applicable
  net: ipv6: Allow shorthand delete of all nexthops in multipath route
  net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attribute
  net: ipv6: Use compressed IPv6 addresses showing route replace error

 include/net/ip6_fib.h |   4 +-
 include/net/netlink.h |   1 +
 net/ipv6/ip6_fib.c    |  19 +-
 net/ipv6/route.c      | 163 --
 4 files changed, 165 insertions(+), 22 deletions(-)

-- 
2.1.4
[PATCH net-next v3 1/4] net: ipv6: add NLM_F_APPEND in notifications when applicable
IPv6 does not set the NLM_F_APPEND flag in notifications to signal that
a NEWROUTE is an append versus a new route or a replaced one. Add the
flag if the request has it.

Signed-off-by: David Ahern
---
 net/ipv6/ip6_fib.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index ef5485204522..bcaf247232d7 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -746,6 +746,9 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 	u16 nlflags = NLM_F_EXCL;
 	int err;
 
+	if (info->nlh && info->nlh->nlmsg_flags & NLM_F_APPEND)
+		nlflags |= NLM_F_APPEND;
+
 	ins = &fn->leaf;
 
 	for (iter = fn->leaf; iter; iter = iter->dst.rt6_next) {
-- 
2.1.4
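On the receiving side, a listener could use the flag roughly as in the sketch below. This is an illustration only: route_event and classify_newroute() are hypothetical names, and the flag values are defined locally (mirroring include/uapi/linux/netlink.h) so that the snippet builds without kernel headers.

```c
#include <stdint.h>

/* Values as found in include/uapi/linux/netlink.h. */
#ifndef NLM_F_REPLACE
#define NLM_F_REPLACE	0x100
#endif
#ifndef NLM_F_APPEND
#define NLM_F_APPEND	0x800
#endif

/* Hypothetical classification of an RTM_NEWROUTE notification. */
enum route_event {
	ROUTE_NEW,
	ROUTE_REPLACED,
	ROUTE_APPENDED,
};

/*
 * Decide what a NEWROUTE notification means from its nlmsg_flags.
 * NLM_F_APPEND is tested first because an append may also carry
 * other flags such as NLM_F_CREATE.
 */
static enum route_event classify_newroute(uint16_t nlmsg_flags)
{
	if (nlmsg_flags & NLM_F_APPEND)
		return ROUTE_APPENDED;
	if (nlmsg_flags & NLM_F_REPLACE)
		return ROUTE_REPLACED;
	return ROUTE_NEW;
}
```

Without the flag in the notification, a listener has no way to tell an appended nexthop from a freshly created route, which is the gap this patch closes.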
[PATCH 4/5] [net-next] net: qcom/emac: remove extraneous wake-on-lan code
The EMAC driver does not support wake-on-lan, but there is still code left-over that partially enables it. Remove that code and a few macros that support it. Signed-off-by: Timur Tabi--- drivers/net/ethernet/qualcomm/emac/emac-mac.c | 10 -- drivers/net/ethernet/qualcomm/emac/emac.h | 4 2 files changed, 14 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c index 3f3cd00..33d7ff1 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c @@ -103,14 +103,6 @@ #define RXEN0x0002 #define TXEN0x0001 - -/* EMAC_WOL_CTRL0 */ -#define LK_CHG_PME 0x20 -#define LK_CHG_EN 0x10 -#define MG_FRAME_PME 0x8 -#define MG_FRAME_EN0x4 -#define WK_FRAME_EN0x1 - /* EMAC_DESC_CTRL_3 */ #define RFD_RING_SIZE_BMSK 0xfff @@ -619,8 +611,6 @@ static void emac_mac_start(struct emac_adapter *adpt) emac_reg_update32(adpt->base + EMAC_ATHR_HEADER_CTRL, (HEADER_ENABLE | HEADER_CNT_EN), 0); - - emac_reg_update32(adpt->csr + EMAC_EMAC_WRAPPER_CSR2, 0, WOL_EN); } void emac_mac_stop(struct emac_adapter *adpt) diff --git a/drivers/net/ethernet/qualcomm/emac/emac.h b/drivers/net/ethernet/qualcomm/emac/emac.h index 2725507..ef91dcc 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac.h +++ b/drivers/net/ethernet/qualcomm/emac/emac.h @@ -167,10 +167,6 @@ enum emac_clk_id { #define EMAC_MAX_SETUP_LNK_CYCLE 100 -/* Wake On Lan */ -#define EMAC_WOL_PHY 0x0001 /* PHY Status Change */ -#define EMAC_WOL_MAGIC 0x0002 /* Magic Packet */ - struct emac_stats { /* rx */ u64 rx_ok; /* good packets */ -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH 3/5] [net-next] net: qcom/emac: do not call emac_mac_start twice
emac_mac_start() uses information from the external PHY to program the MAC, so it makes no sense to call it before the link is up. Signed-off-by: Timur Tabi--- drivers/net/ethernet/qualcomm/emac/emac-mac.c | 2 +- drivers/net/ethernet/qualcomm/emac/emac-mac.h | 1 - drivers/net/ethernet/qualcomm/emac/emac.c | 2 -- 3 files changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c index 155e273..3f3cd00 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c @@ -556,7 +556,7 @@ void emac_mac_reset(struct emac_adapter *adpt) emac_reg_update32(adpt->base + EMAC_DMA_MAS_CTRL, 0, INT_RD_CLR_EN); } -void emac_mac_start(struct emac_adapter *adpt) +static void emac_mac_start(struct emac_adapter *adpt) { struct phy_device *phydev = adpt->phydev; u32 mac, csr1; diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.h b/drivers/net/ethernet/qualcomm/emac/emac-mac.h index f3aa24d..5028fb4 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.h +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.h @@ -230,7 +230,6 @@ struct emac_tx_queue { int emac_mac_up(struct emac_adapter *adpt); void emac_mac_down(struct emac_adapter *adpt); void emac_mac_reset(struct emac_adapter *adpt); -void emac_mac_start(struct emac_adapter *adpt); void emac_mac_stop(struct emac_adapter *adpt); void emac_mac_mode_config(struct emac_adapter *adpt); void emac_mac_rx_process(struct emac_adapter *adpt, struct emac_rx_queue *rx_q, diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c b/drivers/net/ethernet/qualcomm/emac/emac.c index 3e1be91..75305ad 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac.c +++ b/drivers/net/ethernet/qualcomm/emac/emac.c @@ -280,8 +280,6 @@ static int emac_open(struct net_device *netdev) return ret; } - emac_mac_start(adpt); - return 0; } -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. 
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH 2/5] [net-next] net: qcom/emac: always use autonegotiation to configure the SGMII link
Regardless of how the external PHY is configured, the internal PHY (the "SGMII" block) is capable of configuring the SGMII link automatically. When the external PHY link comes up, regardless of how it is configured, the SGMII link is configured automatically. Signed-off-by: Timur Tabi--- drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 49 + 1 file changed, 10 insertions(+), 39 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c index 0149b52..b5269c4 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c @@ -47,44 +47,19 @@ #define SERDES_START_WAIT_TIMES100 -static int emac_sgmii_link_init(struct emac_adapter *adpt) +/* Initialize the SGMII link between the internal and external PHYs. */ +static void emac_sgmii_link_init(struct emac_adapter *adpt) { - struct phy_device *phydev = adpt->phydev; struct emac_sgmii *phy = >phy; u32 val; + /* Always use autonegotiation. It works no matter how the external +* PHY is configured. 
+*/ val = readl(phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2); - - if (phydev->autoneg == AUTONEG_ENABLE) { - val &= ~(FORCE_AN_RX_CFG | FORCE_AN_TX_CFG); - val |= AN_ENABLE; - writel(val, phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2); - } else { - u32 speed_cfg; - - switch (phydev->speed) { - case SPEED_10: - speed_cfg = SPDMODE_10; - break; - case SPEED_100: - speed_cfg = SPDMODE_100; - break; - case SPEED_1000: - speed_cfg = SPDMODE_1000; - break; - default: - return -EINVAL; - } - - if (phydev->duplex == DUPLEX_FULL) - speed_cfg |= DUPLEX_MODE; - - val &= ~AN_ENABLE; - writel(speed_cfg, phy->base + EMAC_SGMII_PHY_SPEED_CFG1); - writel(val, phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2); - } - - return 0; + val &= ~(FORCE_AN_RX_CFG | FORCE_AN_TX_CFG); + val |= AN_ENABLE; + writel(val, phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2); } static int emac_sgmii_irq_clear(struct emac_adapter *adpt, u32 irq_bits) @@ -145,12 +120,7 @@ void emac_sgmii_reset(struct emac_adapter *adpt) int ret; emac_sgmii_reset_prepare(adpt); - - ret = emac_sgmii_link_init(adpt); - if (ret) { - netdev_err(adpt->netdev, "unsupported link speed\n"); - return; - } + emac_sgmii_link_init(adpt); ret = adpt->phy.initialize(adpt); if (ret) @@ -287,6 +257,7 @@ int emac_sgmii_config(struct platform_device *pdev, struct emac_adapter *adpt) goto error; emac_sgmii_irq_clear(adpt, SGMII_PHY_INTERRUPT_ERR); + emac_sgmii_link_init(adpt); /* We've remapped the addresses, so we don't need the device any * more. of_find_device_by_node() says we should release it. -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH 5/5] [net-next] net: qcom/emac: add an error interrupt handler for the sgmii
The SGMII (internal PHY) can report decode errors via an interrupt. It can also report autonegotiation status changes, but we don't need to track those. The SGMII can recover automatically from most decode errors, so we only reset the interface if we get multiple consecutive errors. It's possible for bogus decode errors to be reported while the link is being brought up. The interrupt is registered when the interface is opened, and it's enabled after the link is up. Signed-off-by: Timur Tabi--- drivers/net/ethernet/qualcomm/emac/emac-mac.c | 8 +- drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 126 +++- drivers/net/ethernet/qualcomm/emac/emac-sgmii.h | 16 ++- drivers/net/ethernet/qualcomm/emac/emac.c | 10 ++ 4 files changed, 153 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c index 33d7ff1..b991219 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c @@ -951,12 +951,16 @@ static void emac_mac_rx_descs_refill(struct emac_adapter *adpt, static void emac_adjust_link(struct net_device *netdev) { struct emac_adapter *adpt = netdev_priv(netdev); + struct emac_sgmii *sgmii = >phy; struct phy_device *phydev = netdev->phydev; - if (phydev->link) + if (phydev->link) { emac_mac_start(adpt); - else + sgmii->link_up(adpt); + } else { + sgmii->link_down(adpt); emac_mac_stop(adpt); + } phy_print_status(phydev); } diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c index b5269c4..040b289 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c @@ -25,7 +25,9 @@ #define EMAC_SGMII_PHY_SPEED_CFG1 0x0074 #define EMAC_SGMII_PHY_IRQ_CMD 0x00ac #define EMAC_SGMII_PHY_INTERRUPT_CLEAR 0x00b0 +#define EMAC_SGMII_PHY_INTERRUPT_MASK 0x00b4 #define EMAC_SGMII_PHY_INTERRUPT_STATUS0x00b8 +#define EMAC_SGMII_PHY_RX_CHK_STATUS 0x00d4 
#define FORCE_AN_TX_CFGBIT(5) #define FORCE_AN_RX_CFGBIT(4) @@ -36,6 +38,8 @@ #define SPDMODE_100BIT(0) #define SPDMODE_10 0 +#define CDR_ALIGN_DET BIT(6) + #define IRQ_GLOBAL_CLEAR BIT(0) #define DECODE_CODE_ERRBIT(7) @@ -44,6 +48,7 @@ #define SGMII_PHY_IRQ_CLR_WAIT_TIME10 #define SGMII_PHY_INTERRUPT_ERR(DECODE_CODE_ERR | DECODE_DISP_ERR) +#define SGMII_ISR_MASK (SGMII_PHY_INTERRUPT_ERR) #define SERDES_START_WAIT_TIMES100 @@ -96,6 +101,51 @@ static int emac_sgmii_irq_clear(struct emac_adapter *adpt, u32 irq_bits) return 0; } +/* The number of decode errors that triggers a reset */ +#define DECODE_ERROR_LIMIT 2 + +static irqreturn_t emac_sgmii_interrupt(int irq, void *data) +{ + struct emac_adapter *adpt = data; + struct emac_sgmii *phy = >phy; + u32 status; + + status = readl(phy->base + EMAC_SGMII_PHY_INTERRUPT_STATUS); + status &= SGMII_ISR_MASK; + if (!status) + return IRQ_HANDLED; + + /* If we get a decoding error and CDR is not locked, then try +* resetting the internal PHY. The internal PHY uses an embedded +* clock with Clock and Data Recovery (CDR) to recover the +* clock and data. +*/ + if (status & SGMII_PHY_INTERRUPT_ERR) { + int count; + + /* The SGMII is capable of recovering from some decode +* errors automatically. However, if we get multiple +* decode errors in a row, then assume that something +* is wrong and reset the interface. +*/ + count = atomic_inc_return(>decode_error_count); + if (count == DECODE_ERROR_LIMIT) { + schedule_work(>work_thread); + atomic_set(>decode_error_count, 0); + } + } else { + /* We only care about consecutive decode errors. 
*/ + atomic_set(>decode_error_count, 0); + } + + if (emac_sgmii_irq_clear(adpt, status)) { + netdev_warn(adpt->netdev, "failed to clear SGMII interrupt\n"); + schedule_work(>work_thread); + } + + return IRQ_HANDLED; +} + static void emac_sgmii_reset_prepare(struct emac_adapter *adpt) { struct emac_sgmii *phy = >phy; @@ -129,6 +179,68 @@ void emac_sgmii_reset(struct emac_adapter *adpt) ret); } +static int
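The "reset only after consecutive decode errors" policy described in the commit message can be sketched in userspace C. This is an illustration under stated assumptions, not the driver's handler: C11 atomics stand in for the kernel's atomic_t, the function name sgmii_note_irq() is hypothetical, the bit position of DECODE_DISP_ERR is assumed, and a return value of true marks the point where the real handler would call schedule_work(). Unlike the posted handler, the sketch is handed the unmasked status, so a status without error bits is what resets the counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DECODE_CODE_ERR		(1u << 7)	/* BIT(7), as in the driver */
#define DECODE_DISP_ERR		(1u << 6)	/* assumed position */
#define SGMII_PHY_INTERRUPT_ERR	(DECODE_CODE_ERR | DECODE_DISP_ERR)

/* The number of consecutive decode errors that triggers a reset */
#define DECODE_ERROR_LIMIT	2

/* Hypothetical helper: record one interrupt and report whether the
 * caller should schedule a reset of the interface.
 */
static bool sgmii_note_irq(atomic_int *count, uint32_t status)
{
	assert(count != NULL);

	if (status & SGMII_PHY_INTERRUPT_ERR) {
		/* atomic_fetch_add() returns the old value, so add one to
		 * get the equivalent of the kernel's atomic_inc_return().
		 */
		int seen = atomic_fetch_add(count, 1) + 1;

		if (seen == DECODE_ERROR_LIMIT) {
			atomic_store(count, 0);
			return true;
		}
	} else {
		/* Only consecutive decode errors count. */
		atomic_store(count, 0);
	}

	return false;
}
```

With a limit of 2, a single isolated decode error never triggers a reset; two in a row do, and the counter then starts over.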
[PATCH 0/5] [net-next] net: qcom/emac:
Although not related, these patches affect the same files, so they
should be applied in order.

The first patch cleans up the logging of when the PHY driver is
attached.

The second patch always configures the SGMII to use autonegotiation
mode.

The third patch removes a redundant call to emac_mac_start().

The fourth patch removes some extraneous non-functioning WOL code.

The fifth patch adds an error handler for the SGMII block.

Timur Tabi (5):
  [net-next] net: qcom/emac: display the phy driver info after we connect
  [net-next] net: qcom/emac: always use autonegotiation to configure the SGMII link
  [net-next] net: qcom/emac: do not call emac_mac_start twice
  [net-next] net: qcom/emac: remove extraneous wake-on-lan code
  [net-next] net: qcom/emac: add an error interrupt handler for the sgmii

 drivers/net/ethernet/qualcomm/emac/emac-mac.c   |  24 ++--
 drivers/net/ethernet/qualcomm/emac/emac-mac.h   |   1 -
 drivers/net/ethernet/qualcomm/emac/emac-phy.c   |   3 -
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 175 ++--
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.h |  16 ++-
 drivers/net/ethernet/qualcomm/emac/emac.c       |  10 +-
 drivers/net/ethernet/qualcomm/emac/emac.h       |   4 -
 7 files changed, 166 insertions(+), 67 deletions(-)

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH 4/4] [net-next] net: qcom/emac: add an error interrupt handler for the sgmii
The SGMII (internal PHY) can report decode errors via an interrupt. It can also report autonegotiation status changes, but we don't need to track those. The SGMII can recover automatically from most decode errors, so we only reset the interface if we get multiple consecutive errors. It's possible for bogus decode errors to be reported while the link is being brought up. The interrupt is registered when the interface is opened, and it's enabled after the link is up. Signed-off-by: Timur Tabi--- drivers/net/ethernet/qualcomm/emac/emac-mac.c | 8 +- drivers/net/ethernet/qualcomm/emac/emac-sgmii.c | 126 +++- drivers/net/ethernet/qualcomm/emac/emac-sgmii.h | 16 ++- drivers/net/ethernet/qualcomm/emac/emac.c | 10 ++ 4 files changed, 153 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c index 3f3cd00..a0bc8a85 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c @@ -961,12 +961,16 @@ static void emac_mac_rx_descs_refill(struct emac_adapter *adpt, static void emac_adjust_link(struct net_device *netdev) { struct emac_adapter *adpt = netdev_priv(netdev); + struct emac_sgmii *sgmii = >phy; struct phy_device *phydev = netdev->phydev; - if (phydev->link) + if (phydev->link) { emac_mac_start(adpt); - else + sgmii->link_up(adpt); + } else { + sgmii->link_down(adpt); emac_mac_stop(adpt); + } phy_print_status(phydev); } diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c index b5269c4..040b289 100644 --- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c +++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c @@ -25,7 +25,9 @@ #define EMAC_SGMII_PHY_SPEED_CFG1 0x0074 #define EMAC_SGMII_PHY_IRQ_CMD 0x00ac #define EMAC_SGMII_PHY_INTERRUPT_CLEAR 0x00b0 +#define EMAC_SGMII_PHY_INTERRUPT_MASK 0x00b4 #define EMAC_SGMII_PHY_INTERRUPT_STATUS0x00b8 +#define EMAC_SGMII_PHY_RX_CHK_STATUS 0x00d4 
#define FORCE_AN_TX_CFGBIT(5) #define FORCE_AN_RX_CFGBIT(4) @@ -36,6 +38,8 @@ #define SPDMODE_100BIT(0) #define SPDMODE_10 0 +#define CDR_ALIGN_DET BIT(6) + #define IRQ_GLOBAL_CLEAR BIT(0) #define DECODE_CODE_ERRBIT(7) @@ -44,6 +48,7 @@ #define SGMII_PHY_IRQ_CLR_WAIT_TIME10 #define SGMII_PHY_INTERRUPT_ERR(DECODE_CODE_ERR | DECODE_DISP_ERR) +#define SGMII_ISR_MASK (SGMII_PHY_INTERRUPT_ERR) #define SERDES_START_WAIT_TIMES100 @@ -96,6 +101,51 @@ static int emac_sgmii_irq_clear(struct emac_adapter *adpt, u32 irq_bits) return 0; } +/* The number of decode errors that triggers a reset */ +#define DECODE_ERROR_LIMIT 2 + +static irqreturn_t emac_sgmii_interrupt(int irq, void *data) +{ + struct emac_adapter *adpt = data; + struct emac_sgmii *phy = >phy; + u32 status; + + status = readl(phy->base + EMAC_SGMII_PHY_INTERRUPT_STATUS); + status &= SGMII_ISR_MASK; + if (!status) + return IRQ_HANDLED; + + /* If we get a decoding error and CDR is not locked, then try +* resetting the internal PHY. The internal PHY uses an embedded +* clock with Clock and Data Recovery (CDR) to recover the +* clock and data. +*/ + if (status & SGMII_PHY_INTERRUPT_ERR) { + int count; + + /* The SGMII is capable of recovering from some decode +* errors automatically. However, if we get multiple +* decode errors in a row, then assume that something +* is wrong and reset the interface. +*/ + count = atomic_inc_return(>decode_error_count); + if (count == DECODE_ERROR_LIMIT) { + schedule_work(>work_thread); + atomic_set(>decode_error_count, 0); + } + } else { + /* We only care about consecutive decode errors. 
*/ + atomic_set(>decode_error_count, 0); + } + + if (emac_sgmii_irq_clear(adpt, status)) { + netdev_warn(adpt->netdev, "failed to clear SGMII interrupt\n"); + schedule_work(>work_thread); + } + + return IRQ_HANDLED; +} + static void emac_sgmii_reset_prepare(struct emac_adapter *adpt) { struct emac_sgmii *phy = >phy; @@ -129,6 +179,68 @@ void emac_sgmii_reset(struct emac_adapter *adpt) ret); } +static int
[PATCH 1/5] [net-next] net: qcom/emac: display the phy driver info after we connect
The PHY driver is attached only when the driver calls
phy_connect_direct(). Calling phy_attached_print() to display
information about the PHY driver prior to that point is meaningless.
The interface can be brought down, a new PHY driver can be loaded, and
the interface then brought back up. This is the correct time to display
information about the attached driver.

Since phy_attached_print() also prints information about the interrupt,
that needs to be set as well.

Signed-off-by: Timur Tabi
---
 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 4 +++-
 drivers/net/ethernet/qualcomm/emac/emac-phy.c | 3 ---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index e4793d7..155e273 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -981,6 +981,7 @@ int emac_mac_up(struct emac_adapter *adpt)
 	emac_mac_config(adpt);
 	emac_mac_rx_descs_refill(adpt, &adpt->rx_q);
 
+	adpt->phydev->irq = PHY_IGNORE_INTERRUPT;
 	ret = phy_connect_direct(netdev, adpt->phydev, emac_adjust_link,
 				 PHY_INTERFACE_MODE_SGMII);
 	if (ret) {
@@ -988,11 +989,12 @@ int emac_mac_up(struct emac_adapter *adpt)
 		return ret;
 	}
 
+	phy_attached_print(adpt->phydev, NULL);
+
 	/* enable mac irq */
 	writel((u32)~DIS_INT, adpt->base + EMAC_INT_STATUS);
 	writel(adpt->irq.mask, adpt->base + EMAC_INT_MASK);
 
-	adpt->phydev->irq = PHY_IGNORE_INTERRUPT;
 	phy_start(adpt->phydev);
 
 	napi_enable(&adpt->rx_q.napi);
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index 1d7852f..441c1936 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -226,8 +226,5 @@ int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
 		return -ENODEV;
 	}
 
-	if (adpt->phydev->drv)
-		phy_attached_print(adpt->phydev, NULL);
-
 	return 0;
 }
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH 1/2] Documentation: devicetree: change the mediatek ethernet compatible string
On Wed, Jan 25, 2017 at 09:20:54AM +0100, John Crispin wrote:
> When the binding was defined, I was not aware that mt2701 was an earlier
> version of the SoC. For sake of consistency, the ethernet driver should
> use mt2701 inside the compat string as this is the earliest SoC with the
> ethernet core.
>
> The ethernet driver is currently of no real use until we finish and
> upstream the DSA driver. There are no users of this binding yet. It should
> be safe to fix this now before it is too late and we need to provide
> backward compatibility for the mt7623-eth compat string.

Thanks for the explanation.

> Reported-by: Sean Wang
> Signed-off-by: John Crispin
> ---
>  Documentation/devicetree/bindings/net/mediatek-net.txt |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/net/mediatek-net.txt
> b/Documentation/devicetree/bindings/net/mediatek-net.txt
> index c010faf..c7194e8 100644
> --- a/Documentation/devicetree/bindings/net/mediatek-net.txt
> +++ b/Documentation/devicetree/bindings/net/mediatek-net.txt
> @@ -7,7 +7,7 @@ have dual GMAC each represented by a child node..
>  * Ethernet controller node
>
>  Required properties:
> -- compatible: Should be "mediatek,mt7623-eth"
> +- compatible: Should be "mediatek,mt2701-eth"

You should have both strings with 2701 being last. That way if you ever
find a difference in the 7623, you don't need a DT update to fix it.

>  - reg: Address and length of the register set for the device
>  - interrupts: Should contain the three frame engines interrupts in numeric
>    order. These are fe_int0, fe_int1 and fe_int2.
> --
> 1.7.10.4
>
[PATCH v2] net: adaptec: starfire: add checks for dma mapping errors
init_ring(), refill_rx_ring() and start_tx() don't check whether
mapping DMA memory succeeds. The patch adds the checks and failure
handling.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov
---
 drivers/net/ethernet/adaptec/starfire.c | 45 +++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/adaptec/starfire.c b/drivers/net/ethernet/adaptec/starfire.c
index c12d2618eebf..3872ab96b80a 100644
--- a/drivers/net/ethernet/adaptec/starfire.c
+++ b/drivers/net/ethernet/adaptec/starfire.c
@@ -1152,6 +1152,12 @@ static void init_ring(struct net_device *dev)
 		if (skb == NULL)
 			break;
 		np->rx_info[i].mapping = pci_map_single(np->pci_dev, skb->data, np->rx_buf_sz, PCI_DMA_FROMDEVICE);
+		if (pci_dma_mapping_error(np->pci_dev,
+					  np->rx_info[i].mapping)) {
+			dev_kfree_skb(skb);
+			np->rx_info[i].skb = NULL;
+			break;
+		}
 		/* Grrr, we cannot offset to correctly align the IP header. */
 		np->rx_ring[i].rxaddr = cpu_to_dma(np->rx_info[i].mapping | RxDescValid);
 	}
@@ -1182,8 +1188,9 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
 {
 	struct netdev_private *np = netdev_priv(dev);
 	unsigned int entry;
+	unsigned int prev_tx;
 	u32 status;
-	int i;
+	int i, j;
 
 	/*
 	 * be cautious here, wrapping the queue has weird semantics
@@ -1201,6 +1208,7 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
 	}
 #endif /* ZEROCOPY && HAS_BROKEN_FIRMWARE */
 
+	prev_tx = np->cur_tx;
 	entry = np->cur_tx % TX_RING_SIZE;
 	for (i = 0; i < skb_num_frags(skb); i++) {
 		int wrap_ring = 0;
@@ -1234,6 +1242,11 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
 						       skb_frag_size(this_frag),
 						       PCI_DMA_TODEVICE);
 		}
+		if (pci_dma_mapping_error(np->pci_dev,
+					  np->tx_info[entry].mapping)) {
+			dev->stats.tx_dropped++;
+			goto err_out;
+		}
 
 		np->tx_ring[entry].addr = cpu_to_dma(np->tx_info[entry].mapping);
 		np->tx_ring[entry].status = cpu_to_le32(status);
@@ -1268,8 +1281,30 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
 		netif_stop_queue(dev);
 
 	return NETDEV_TX_OK;
-}
 
+err_out:
+	entry = prev_tx % TX_RING_SIZE;
+	np->tx_info[entry].skb = NULL;
+	if (i > 0) {
+		pci_unmap_single(np->pci_dev,
+				 np->tx_info[entry].mapping,
+				 skb_first_frag_len(skb),
+				 PCI_DMA_TODEVICE);
+		np->tx_info[entry].mapping = 0;
+		entry = (entry + np->tx_info[entry].used_slots) % TX_RING_SIZE;
+		for (j = 1; j < i; j++) {
+			pci_unmap_single(np->pci_dev,
+					 np->tx_info[entry].mapping,
+					 skb_frag_size(
+						&skb_shinfo(skb)->frags[j-1]),
+					 PCI_DMA_TODEVICE);
+			entry++;
+		}
+	}
+	dev_kfree_skb_any(skb);
+	np->cur_tx = prev_tx;
+	return NETDEV_TX_OK;
+}
 
 /* The interrupt handler does all of the Rx thread work and cleans up
    after the Tx thread. */
@@ -1569,6 +1604,12 @@ static void refill_rx_ring(struct net_device *dev)
 				break;	/* Better luck next round. */
 			np->rx_info[entry].mapping =
 				pci_map_single(np->pci_dev, skb->data, np->rx_buf_sz, PCI_DMA_FROMDEVICE);
+			if (pci_dma_mapping_error(np->pci_dev,
+						  np->rx_info[entry].mapping)) {
+				dev_kfree_skb(skb);
+				np->rx_info[entry].skb = NULL;
+				break;
+			}
 			np->rx_ring[entry].rxaddr =
 				cpu_to_dma(np->rx_info[entry].mapping | RxDescValid);
 		}
-- 
2.7.4
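The failure handling added to start_tx() follows a common shape: map the buffers one by one, and if any mapping fails, undo the mappings already made before dropping the packet. The userspace sketch below shows only that shape. Everything in it is hypothetical: struct fake_dev, fake_map(), fake_unmap() and map_all_or_nothing() are invented stand-ins for the PCI DMA calls, and a zero cookie stands in for what pci_dma_mapping_error() would report. It does not model the driver's ring indices or used_slots bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_FAILED_ADDR	((uintptr_t)0)	/* hypothetical "mapping error" cookie */

/* Hypothetical device model: counts live mappings and can be told to
 * fail after a given number of successful maps (negative: never fail).
 */
struct fake_dev {
	int maps_until_failure;
	int live_mappings;
};

static uintptr_t fake_map(struct fake_dev *dev, const void *buf)
{
	if (dev->maps_until_failure == 0)
		return MAP_FAILED_ADDR;
	if (dev->maps_until_failure > 0)
		dev->maps_until_failure--;
	dev->live_mappings++;
	return (uintptr_t)buf;
}

static void fake_unmap(struct fake_dev *dev, uintptr_t addr)
{
	assert(addr != MAP_FAILED_ADDR);
	dev->live_mappings--;
}

/* Map all n fragments or none of them. Returns 0 on success and -1 on
 * failure; on failure every mapping made so far has been released.
 */
static int map_all_or_nothing(struct fake_dev *dev, const void *const frags[],
			      uintptr_t mapped[], size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		mapped[i] = fake_map(dev, frags[i]);
		if (mapped[i] == MAP_FAILED_ADDR)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* i is the index that failed; release entries i-1 down to 0. */
	while (i-- > 0) {
		fake_unmap(dev, mapped[i]);
		mapped[i] = MAP_FAILED_ADDR;
	}
	return -1;
}
```

The point of the unwind loop is that a failure on fragment k must not leave fragments 0..k-1 mapped, which is the job of the err_out path in the patch.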
Re: [PATCH net-next 0/4] net: dsa: bcm_sf2: CFP support
Hi Chris,

On 01/27/2017 01:24 PM, Chris Healy wrote:
> Hi Florian,
>
> In saying the below, I may just be showing my naivety but here goes:
>
> If I understand this correctly, what you are using is similar to the
> TCAM hardware present in the newer Marvell switches. I think Pablo is
> doing some work with nftables and HW offload using TCAM HW. Is there
> overlap here? It seems that one or the other API should be used but
> not both.

Well, the problem is that there is overlap with 3 different unrelated
subsystems accessing the same HW here: tc, ethtool, and netfilter, all
(two at least) with different ways of formatting input parameters, as I
pointed out a while back in this thread:

https://www.mail-archive.com/netdev@vger.kernel.org/msg126321.html

My angle on this submission is the following, purely based on
pragmatism:

- I have real users behind this feature who are currently very happy
  with how this works using ethtool, switching them to netlink, tc,
  netfilter is not trivial, but could be done in the long run, not just
  now. At the very least, this serves as reference code for people who
  are curious to see how Broadcom's CFP works

- cls_flower was looked at, it is missing a critical feature IMHO which
  is the ability to specify a rule index, and the amount of code
  necessary to validate input parameters is just totally insane, just
  like the fact that there is not a common intermediate input
  representation (ala ethtool_rx_flow_spec) makes it impractical

- I have heard about the work Pablo is doing, but until it is publicly
  submitted and reviewed, it's hard to project what it is going to look
  like

Thanks!
-- 
Florian
Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver
On (01/27/17 15:51), Willem de Bruijn wrote:
   :
> - limit capable() check to drivers with .validate callback (aka second option below)
   :
> - let privileged applications shoot themselves in the foot (change nothing).

> The second will break variable length header protocols unless
> you exhaustively search for all variable length protocols and add
> validate callbacks first.

Other than ax25, are there variable length header protocols out there
without ->validate, and which need the CAP_SYS_RAWIO branch?

I realize that, to an extent, even ethernet is a protocol whose header
is > 14 with vlan, but from the google search, seems like it was mostly
ax25 that really triggered a large part of the check.

If we think that there are a large number of these (that don't have a
->validate, to fix up things as desired) I'd just go for the "change
nothing in pf_packet" option. As I found out, many drivers like ixgbe
and sunvnet have defensive checks in the Tx path anyway, and
xen_netfront can also join that group with a few simple checks.
Re: [net 7/8] net/mlx5e: Fix update of hash function/key via ethtool
On Fri, Jan 27, 2017 at 12:38 PM, Saeed Mahameedwrote: > From: Gal Pressman > > Modifying TIR hash should change selected fields bitmask in addition to > the function and key. > Formerly, we would not set this field resulting in zeroing of its value, > which means no packet fields are used for RX RSS hash calculation thus > causing all traffic to arrive in RQ[0]. > This commit log is rather scant in details. Does this mean that RSS is somehow broken in mlx5? What is exact test that demonstrates bad behavior? Did you verify that this doesn't break IPv4 or IPv6? > Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration > change") > Signed-off-by: Gal Pressman > Signed-off-by: Saeed Mahameed > --- > drivers/net/ethernet/mellanox/mlx5/core/en.h | 3 +- > .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 13 +- > drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 198 > ++--- > 3 files changed, 109 insertions(+), 105 deletions(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h > b/drivers/net/ethernet/mellanox/mlx5/core/en.h > index 1619147a63e8..d5ecb8f53fd4 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h > @@ -791,7 +791,8 @@ void mlx5e_disable_vlan_filter(struct mlx5e_priv *priv); > int mlx5e_modify_rqs_vsd(struct mlx5e_priv *priv, bool vsd); > > int mlx5e_redirect_rqt(struct mlx5e_priv *priv, u32 rqtn, int sz, int ix); > -void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv); > +void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc, > + enum mlx5e_traffic_types tt); > > int mlx5e_open_locked(struct net_device *netdev); > int mlx5e_close_locked(struct net_device *netdev); > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c > b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c > index 6f4eb34259f0..bb67863aa361 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c > +++ 
b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c > @@ -980,15 +980,18 @@ static int mlx5e_get_rxfh(struct net_device *netdev, > u32 *indir, u8 *key, > > static void mlx5e_modify_tirs_hash(struct mlx5e_priv *priv, void *in, int > inlen) > { > - struct mlx5_core_dev *mdev = priv->mdev; > void *tirc = MLX5_ADDR_OF(modify_tir_in, in, ctx); > - int i; > + struct mlx5_core_dev *mdev = priv->mdev; > + int ctxlen = MLX5_ST_SZ_BYTES(tirc); > + int tt; > > MLX5_SET(modify_tir_in, in, bitmask.hash, 1); > - mlx5e_build_tir_ctx_hash(tirc, priv); > > - for (i = 0; i < MLX5E_NUM_INDIR_TIRS; i++) > - mlx5_core_modify_tir(mdev, priv->indir_tir[i].tirn, in, > inlen); > + for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) { > + memset(tirc, 0, ctxlen); > + mlx5e_build_indir_tir_ctx_hash(priv, tirc, tt); > + mlx5_core_modify_tir(mdev, priv->indir_tir[tt].tirn, in, > inlen); > + } > } > > static int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir, > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > index 948351ae5bd2..f14ca3385fdd 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > @@ -2022,8 +2022,23 @@ static void mlx5e_build_tir_ctx_lro(void *tirc, struct > mlx5e_priv *priv) > MLX5_SET(tirc, tirc, lro_timeout_period_usecs, > priv->params.lro_timeout); > } > > -void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv) > +void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc, > + enum mlx5e_traffic_types tt) > { > + void *hfso = MLX5_ADDR_OF(tirc, tirc, rx_hash_field_selector_outer); > + > +#define MLX5_HASH_IP(MLX5_HASH_FIELD_SEL_SRC_IP |\ > +MLX5_HASH_FIELD_SEL_DST_IP) > + > +#define MLX5_HASH_IP_L4PORTS(MLX5_HASH_FIELD_SEL_SRC_IP |\ > +MLX5_HASH_FIELD_SEL_DST_IP |\ > +MLX5_HASH_FIELD_SEL_L4_SPORT |\ > +MLX5_HASH_FIELD_SEL_L4_DPORT) > + > +#define MLX5_HASH_IP_IPSEC_SPI (MLX5_HASH_FIELD_SEL_SRC_IP |\ > 
+MLX5_HASH_FIELD_SEL_DST_IP |\ > +MLX5_HASH_FIELD_SEL_IPSEC_SPI) > + > MLX5_SET(tirc, tirc, rx_hash_fn, > mlx5e_rx_hash_fn(priv->params.rss_hfunc)); > if (priv->params.rss_hfunc == ETH_RSS_HASH_TOP) { > @@ -2035,6 +2050,88 @@ void mlx5e_build_tir_ctx_hash(void *tirc, struct > mlx5e_priv *priv) > MLX5_SET(tirc, tirc, rx_hash_symmetric, 1); > memcpy(rss_key, priv->params.toeplitz_hash_key, len); > } > + > +
net: suspicious RCU usage in nf_hook
Hello, I've got the following report while running syzkaller fuzzer on fd694aaa46c7ed811b72eb47d5eb11ce7ab3f7f1: [ INFO: suspicious RCU usage. ] 4.10.0-rc5+ #192 Not tainted --- ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 0 2 locks held by syz-executor14/23111: #0: (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock include/net/sock.h:1454 [inline] #0: (sk_lock-AF_INET6){+.+.+.}, at: [] rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919 #1: (rcu_read_lock){..}, at: [] nf_hook include/linux/netfilter.h:201 [inline] #1: (rcu_read_lock){..}, at: [] __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160 stack backtrace: CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452 rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline] ___might_sleep+0x560/0x650 kernel/sched/core.c:7748 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 sk_destruct+0x47/0x80 net/core/sock.c:1460 __sk_free+0x57/0x230 net/core/sock.c:1468 sock_wfree+0xae/0x120 net/core/sock.c:1645 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655 skb_release_all+0x15/0x60 net/core/skbuff.c:668 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 inet_frag_put 
include/net/inet_frag.h:133 [inline] nf_ct_frag6_gather+0x1106/0x3840 net/ipv6/netfilter/nf_conntrack_reasm.c:617 ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline] nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 nf_hook include/linux/netfilter.h:212 [inline] __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 rawv6_push_pending_frames net/ipv6/raw.c:613 [inline] rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 sock_sendmsg_nosec net/socket.c:635 [inline] sock_sendmsg+0xca/0x110 net/socket.c:645 sock_write_iter+0x326/0x600 net/socket.c:848 do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695 do_readv_writev+0x42c/0x9b0 fs/read_write.c:872 vfs_writev+0x87/0xc0 fs/read_write.c:911 do_writev+0x110/0x2c0 fs/read_write.c:944 SYSC_writev fs/read_write.c:1017 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1014 entry_SYSCALL_64_fastpath+0x1f/0xc2 RIP: 0033:0x445559 RSP: 002b:7f6f46fceb58 EFLAGS: 0292 ORIG_RAX: 0014 RAX: ffda RBX: 0005 RCX: 00445559 RDX: 0001 RSI: 20f1eff0 RDI: 0005 RBP: 006e19c0 R08: R09: R10: R11: 0292 R12: 0070 R13: 20f59000 R14: 0015 R15: 00020400 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:752 in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14 INFO: lockdep is turned off. 
CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780 __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739 mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752 atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060 __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149 static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174 net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728 sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403 __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441 sk_destruct+0x47/0x80 net/core/sock.c:1460 __sk_free+0x57/0x230 net/core/sock.c:1468 sock_wfree+0xae/0x120 net/core/sock.c:1645 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655 skb_release_all+0x15/0x60 net/core/skbuff.c:668 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 inet_frag_put include/net/inet_frag.h:133
Re: [PATCH net-next 1/4] mlx5: Make building eswitch configurable
On Fri, Jan 27, 2017 at 8:42 PM, Tom Herbertwrote: > On Fri, Jan 27, 2017 at 10:28 AM, Saeed Mahameed > wrote: >> On Fri, Jan 27, 2017 at 8:16 PM, Tom Herbert wrote: >>> On Fri, Jan 27, 2017 at 10:05 AM, Saeed Mahameed >>> wrote: On Fri, Jan 27, 2017 at 7:50 PM, Tom Herbert wrote: > On Fri, Jan 27, 2017 at 9:38 AM, Saeed Mahameed > wrote: >> On Fri, Jan 27, 2017 at 7:34 AM, Or Gerlitz wrote: >>> On Fri, Jan 27, 2017 at 1:32 AM, Tom Herbert >>> wrote: Add a configuration option (CONFIG_MLX5_CORE_ESWITCH) for controlling whether the eswitch code is built. Change Kconfig and Makefile accordingly. >>> >>> Tom, FWIW, please note that the basic e-switch functionality is needed >>> also when SRIOV isn't of use, this is for a multi host configuration. >>> >> >> Right, set_l2_table_entry@eswitch.c need to be called by PF for any UC >> MAC address wanted by VF or PF. >> To keep one flow in the code, the implementation is done as part of >> eswitch. >> >> so in multi-host configuration (where there are 4 PFs) each PF should >> invoke set_l2_table_entry_cmd for each one of its own UC MACs. >> >> populating the l2 table is done using the whole eswitch event driven >> mechanisms, it is not easy and IMH not right to separate eswitch >> tables from l2 table (same management logic, different tables). >> >> Anyways as Or stated this is just an FYI, eswitch needs to be enabled >> on Multi-host configuration. >> > What indicate a multi-host configuration? nothing in the driver, it is transparent. >>> So then we always need the eswitch code to be built even if someone >>> never uses any of it? >>> >> >> yes. >> but for your convenience all you need is to compile eswitch.c. >> esiwtch_offoalds.c and en_rep.c can be compiled out for basic ethernet. >> > Well eswitch.c is 2200 LOC. en_rep.c and eswitch_offloads.c are 1600 > LOC. If we _must_ have eswitch.c then there's probably not much point > then. 
But I am still finding it hard to fathom that eswitch has now > become a mandatory component of Ethernet drivers/devices. > It is only mandatory for configurations that needs eswitch, where the driver has no way to know about them, for a good old bare metal box, eswitch is not needed. we can do some work to strip the l2 table logic - needed for PFs to work on multi-host - out of eswitch but again that would further complicate the driver code since eswitch will still need to update l2 tables for VFs. > Tom > > >>> Or. >>> >>> My WW (and same for the rest of the IL team..) has ended so I will be >>> able to further look on this series and comment on Sunday.
[net-next v2] openvswitch: Simplify do_execute_actions().
do_execute_actions() implements a worthwhile optimization: in case
an output action is the last action in an action list, skb_clone()
can be avoided by outputting the current skb. However, the
implementation is more complicated than necessary. This patch
simplifies this logic.

Signed-off-by: Andy Zhou
---
v1->v2: drop skb NULL check in do_output()
---
 net/openvswitch/actions.c | 42 --
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 514f7bc..efa9a88 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1141,12 +1141,6 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 			      struct sw_flow_key *key,
 			      const struct nlattr *attr, int len)
 {
-	/* Every output action needs a separate clone of 'skb', but the common
-	 * case is just a single output action, so that doing a clone and
-	 * then freeing the original skbuff is wasteful. So the following code
-	 * is slightly obscure just to avoid that.
-	 */
-	int prev_port = -1;
 	const struct nlattr *a;
 	int rem;
 
@@ -1154,20 +1148,28 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 	     a = nla_next(a, &rem)) {
 		int err = 0;
 
-		if (unlikely(prev_port != -1)) {
-			struct sk_buff *out_skb = skb_clone(skb, GFP_ATOMIC);
-
-			if (out_skb)
-				do_output(dp, out_skb, prev_port, key);
+		switch (nla_type(a)) {
+		case OVS_ACTION_ATTR_OUTPUT: {
+			int port = nla_get_u32(a);
+			struct sk_buff *clone;
+
+			/* Every output action needs a separate clone
+			 * of 'skb', In case the output action is the
+			 * last action, cloning can be avoided.
+			 */
+			if (nla_is_last(a, rem)) {
+				do_output(dp, skb, port, key);
+				/* 'skb' has been used for output.
+				 */
+				return 0;
+			}
+			clone = skb_clone(skb, GFP_ATOMIC);
+			if (clone)
+				do_output(dp, clone, port, key);
 
 			OVS_CB(skb)->cutlen = 0;
-			prev_port = -1;
-		}
-
-		switch (nla_type(a)) {
-		case OVS_ACTION_ATTR_OUTPUT:
-			prev_port = nla_get_u32(a);
 			break;
+		}
 
 		case OVS_ACTION_ATTR_TRUNC: {
 			struct ovs_action_trunc *trunc = nla_data(a);
@@ -1257,11 +1259,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 		}
 	}
 
-	if (prev_port != -1)
-		do_output(dp, skb, prev_port, key);
-	else
-		consume_skb(skb);
-
+	consume_skb(skb);
 	return 0;
 }
 
-- 
1.8.3.1
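For readers who want the gist of the simplification outside the kernel tree, here is a minimal user-space sketch of the same ownership rule: every output gets its own clone, except that the last action may consume the original. It is only an illustration under stated assumptions. struct buf, buf_clone(), output() and execute_outputs() are hypothetical stand-ins for struct sk_buff, skb_clone(), do_output() and do_execute_actions(), and the list here holds only output actions, whereas the patch checks nla_is_last() inside a switch over all action types.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: a heap buffer we can clone. */
struct buf {
	size_t len;
	unsigned char data[64];
};

/* Counters so a caller can observe how many clones and outputs happened. */
static int clones_made;
static int outputs_done;

static struct buf *buf_clone(const struct buf *b)
{
	struct buf *c = malloc(sizeof(*c));

	if (c) {
		memcpy(c, b, sizeof(*c));
		clones_made++;
	}
	return c;
}

/* Consumes the buffer, like do_output() consumes the skb it is given. */
static void output(struct buf *b, int port)
{
	(void)port;
	outputs_done++;
	free(b);
}

/*
 * Run 'n' output actions over 'b'. Takes ownership of 'b'.
 * Every action but the last sends a clone; the last sends 'b' itself.
 */
static int execute_outputs(struct buf *b, const int *ports, size_t n)
{
	size_t i;

	assert(b != NULL);
	for (i = 0; i < n; i++) {
		struct buf *clone;

		if (i == n - 1) {
			output(b, ports[i]);	/* 'b' is consumed here */
			return 0;
		}
		clone = buf_clone(b);
		if (clone)
			output(clone, ports[i]);
	}
	free(b);	/* empty action list: just drop the buffer */
	return 0;
}
```

With three output ports this makes two clones and three outputs. With an empty list the buffer is simply freed, which corresponds to the final consume_skb() in the patch.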
[RFC PATCH 2/2] ixgbe: add af_packet direct copy support
This implements the ndo ops for direct dma socket option. This is to start looking at the interface and driver work needed to enable it. Note error paths are not handled and I'm aware of a few bugs. For example interface must be up before attaching socket or else it will fail silently. TBD fix all these things. Signed-off-by: John Fastabend--- drivers/net/ethernet/intel/ixgbe/ixgbe.h |3 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 255 + 2 files changed, 256 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h index ef81c3d..198f90c 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h @@ -266,6 +266,7 @@ struct ixgbe_ring { struct ixgbe_tx_buffer *tx_buffer_info; struct ixgbe_rx_buffer *rx_buffer_info; }; + unsigned int buffer_size; unsigned long state; u8 __iomem *tail; dma_addr_t dma; /* phys. address of descriptor ring */ @@ -299,6 +300,8 @@ struct ixgbe_ring { struct ixgbe_tx_queue_stats tx_stats; struct ixgbe_rx_queue_stats rx_stats; }; + bool ddma;/* ring data buffers mapped to userspace */ + struct sock *rx_kick; /* rx kick userspace */ } cacheline_internodealigned_in_smp; enum ixgbe_ring_f_enum { diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 1e2f39e..c5ab44a 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -1608,8 +1608,19 @@ void ixgbe_alloc_rx_buffers(struct ixgbe_ring *rx_ring, u16 cleaned_count) i -= rx_ring->count; do { - if (!ixgbe_alloc_mapped_page(rx_ring, bi)) - break; + /* If direct dma has not released packet yet stop */ + if (rx_ring->ddma) { + int align = TPACKET_ALIGN(TPACKET2_HDRLEN); + int hdrlen = ALIGN(align, L1_CACHE_BYTES); + struct tpacket2_hdr *hdr; + + hdr = page_address(bi->page) + bi->page_offset - hdrlen; + if (unlikely(TP_STATUS_USER & hdr->tp_status)) + break; + } else { + if 
(!ixgbe_alloc_mapped_page(rx_ring, bi)) + break; + } /* * Refresh the desc even if buffer_addrs didn't change @@ -2005,6 +2016,97 @@ static bool ixgbe_add_rx_frag(struct ixgbe_ring *rx_ring, return true; } +/* ixgbe_do_ddma - direct dma routine to populate PACKET_RX_RING mmap region + * + * The packet socket interface builds a shared memory region using mmap after + * it is specified by the PACKET_RX_RING socket option. This will create a + * circular ring of memory slots. Typical software usage case copies the skb + * into these pages via tpacket_rcv() routine. + * + * Here we do direct DMA from the hardware (82599 in this case) into the + * mmap regions and populate the uhdr (think user space descriptor). This + * requires the hardware to support Scatter Gather and HighDMA which should + * be standard on most (all?) 10/40 Gbps devices. + * + * The buffer mapping should have already been done so that rx_buffer pages + * are handed to the driver from the mmap setup done at the socket layer. + * + * See ./include/uapi/linux/if_packet.h for details on packet layout here + * we can only use tpacket2_hdr type. v3 of the header type introduced bulk + * polling modes which do not work correctly with hardware DMA engine. The + * primary issue is we can not stop a DMA transaction from occurring after it + * has been configured. What results is the software timer advances the + * ring ahead of the hardware and the ring state is lost. Maybe there is + * a clever way to resolve this by I haven't thought it up yet. 
+ */ +static int ixgbe_do_ddma(struct ixgbe_ring *rx_ring, +union ixgbe_adv_rx_desc *rx_desc) +{ + int hdrlen = ALIGN(TPACKET_ALIGN(TPACKET2_HDRLEN), L1_CACHE_BYTES); + struct ixgbe_adapter *adapter = netdev_priv(rx_ring->netdev); + struct ixgbe_rx_buffer *rx_buffer; + struct tpacket2_hdr *h2; /* userspace descriptor */ + struct sockaddr_ll *sll; + struct ethhdr *eth; + int len = 0; + u64 ns = 0; + s32 rem; + + rx_buffer = _ring->rx_buffer_info[rx_ring->next_to_clean]; + if (!rx_buffer->dma) + return -EBUSY; + + prefetchw(rx_buffer->page); + + /* test for any known error cases */ + WARN_ON(ixgbe_test_staterr(rx_desc, + IXGBE_RXDADV_ERR_FRAME_ERR_MASK) && + !(rx_ring->netdev->features &
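The slot handshake that the ddma path relies on can be sketched in plain C. This is a single-threaded illustration only, not the driver's code: struct slot_hdr is a hypothetical, cut-down header (the real one is struct tpacket2_hdr), the two status constants are local mirrors of TP_STATUS_KERNEL and TP_STATUS_USER, and a real implementation would also need memory barriers and cache-aligned headers. The producer stops when it meets a slot user space has not released, like the 'break' added to ixgbe_alloc_rx_buffers() above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of TP_STATUS_KERNEL / TP_STATUS_USER from if_packet.h. */
#define SLOT_STATUS_KERNEL 0u
#define SLOT_STATUS_USER   1u

#define RING_SLOTS 4

/* Hypothetical, simplified slot header. */
struct slot_hdr {
	uint32_t status;
	uint32_t len;
};

struct ring {
	struct slot_hdr slots[RING_SLOTS];
	unsigned int next_to_use;	/* producer (driver) index */
	unsigned int next_to_clean;	/* consumer (application) index */
};

/* Producer side: refuse to reuse a slot user space has not released. */
static int ring_produce(struct ring *r, uint32_t len)
{
	struct slot_hdr *h = &r->slots[r->next_to_use];

	if (h->status & SLOT_STATUS_USER)
		return -1;	/* slot still owned by user space: stop */
	h->len = len;
	h->status = SLOT_STATUS_USER;	/* hand the slot to user space */
	r->next_to_use = (r->next_to_use + 1) % RING_SLOTS;
	return 0;
}

/* Consumer side: read one slot and hand it back to the producer. */
static int ring_consume(struct ring *r, uint32_t *len)
{
	struct slot_hdr *h = &r->slots[r->next_to_clean];

	assert(len != NULL);
	if (!(h->status & SLOT_STATUS_USER))
		return -1;	/* nothing ready yet */
	*len = h->len;
	h->status = SLOT_STATUS_KERNEL;	/* release the slot */
	r->next_to_clean = (r->next_to_clean + 1) % RING_SLOTS;
	return 0;
}
```

Once all four slots are filled, ring_produce() fails until ring_consume() releases one. There is no timer that could advance the ring behind the producer's back, which is the property the v3 header format does not give the hardware.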
[RFC PATCH 1/2] af_packet: direct dma for packet interface
This adds ndo ops for upper layer objects to request direct DMA from the network interface into memory "slots". The slots must be DMA'able memory given by a page/offset/size vector in a packet_ring_buffer structure. The PF_PACKET socket interface can use these ndo_ops to do zerocopy RX from the network device into memory mapped userspace memory. For this to work drivers encode the correct descriptor blocks and headers so that existing PF_PACKET applications work without any modification. This only supports the V2 header formats for now. And works by mapping a ring of the network device to these slots. Originally I used V2 header formats but this does complicate the driver a bit. V3 header formats added bulk polling via socket calls and timers used in the polling interface to return every n milliseconds. Currently, I don't see any way to support this in hardware because we can't know if the hardware is in the middle of a DMA operation or not on a slot. So when a timer fires I don't know how to advance the descriptor ring leaving empty descriptors similar to how the software ring works. The easiest (best?) route is to simply not support this. It might be worth creating a new v4 header that is simple for drivers to support direct DMA ops with. I can imagine using the xdp_buff structure as a header for example. Thoughts? The ndo operations and new socket option PACKET_RX_DIRECT work by giving a queue_index to run the direct dma operations over. Once setsockopt returns successfully the indicated queue is mapped directly to the requesting application and can not be used for other purposes. Also any kernel layers such as tc will be bypassed and need to be implemented in the hardware via some other mechanism such as tc offload or other offload interfaces. Users steer traffic to the selected queue using flow director, tc offload infrastructure or via macvlan offload. The new socket option added to PF_PACKET is called PACKET_RX_DIRECT. 
It takes a single unsigned int value specifying the queue index,

	setsockopt(sock, SOL_PACKET, PACKET_RX_DIRECT,
		   &queue_index, sizeof(queue_index));

Implementing busy_poll support will allow userspace to kick the driver's receive routine if needed. This work is TBD.

To test this I hacked a hardcoded test into the tool psock_tpacket in the selftests kernel directory here:

	./tools/testing/selftests/net/psock_tpacket.c

Running this tool opens a socket and listens for packets over the PACKET_RX_DIRECT enabled socket. Obviously it needs to be reworked to enable all the older tests and not hardcode my interface before it actually gets released.

In general this is a rough patch to explore the interface and put something concrete up for debate. The patch does not handle all the error cases correctly and needs to be cleaned up.

Known Limitations (TBD):

(1) Users are required to match the number of rx ring slots with ethtool to the number requested by the setsockopt PF_PACKET layout. In the future we could possibly do this automatically.

(2) Users need to configure Flow director or setup_tc to steer traffic to the correct queues. I don't believe this needs to be changed; it seems to be a good mechanism for driving directed dma.

(3) Not supporting timestamps or priv space yet; pushing a v4 packet header would resolve this nicely.

(5) Only RX supported so far. TX already supports direct DMA interface but uses skbs, which is really not needed. In the TX_RING case we can optimize this path as well.

To support TX case we can do a similar "slots" mechanism and kick operation. The kick could be a busy_poll like operation but on the TX side. The flow would be user space loads up n number of slots with packets, kicks tx busy poll bit, the driver sends packets, and finally when xmit is complete clears header bits to give slots back. When we have qdisc bypass set today we already bypass the entire stack so no particular reason to use skb's in this case.
Using xdp_buff as a v4 packet header would also allow us to consolidate driver code. To be done: (1) More testing and performance analysis (2) Busy polling sockets (3) Implement v4 xdp_buff headers for analysis (4) performance testing :/ hopefully it looks good. Signed-off-by: John Fastabend--- include/linux/netdevice.h |8 +++ include/net/af_packet.h | 64 +++ include/uapi/linux/if_packet.h |1 net/packet/af_packet.c | 37 net/packet/internal.h | 60 - tools/testing/selftests/net/psock_tpacket.c | 51 +++--- 6 files changed, 154 insertions(+), 67 deletions(-) create mode 100644 include/net/af_packet.h diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index
Re: [PATCH net-next 0/4] net: dsa: bcm_sf2: CFP support
Hi Florian,

In saying the below, I may just be showing my naivety but here goes:

If I understand this correctly, what you are using is similar to the TCAM hardware present in the newer Marvell switches. I think Pablo is doing some work with nftables and HW offload using TCAM HW. Is there overlap here? It seems that one or the other API should be used but not both.

Regards,
Chris

On Fri, Jan 27, 2017 at 1:05 PM, Florian Fainelli wrote:
> Hi all,
>
> This patch series adds support for the Broadcom Compact Field Processor (CFP)
> which is a classification and matching engine built into most Broadcom
> switches.
>
> We support that using ethtool::rxnfc because it allows all known use cases
> from the users I support to work, and more importantly, it allows the
> selection of a target rule index, which is later used by e.g: offloading
> hardware, this is an essential feature that I could not find being supported
> with cls_* for instance.
>
> Thanks
>
> Florian Fainelli (4):
>   net: dsa: Hook {get,set}_rxnfc ethtool operations
>   net: dsa: bcm_sf2: Configure traffic classes to queue mapping
>   net: dsa: bcm_sf2: Add CFP registers definitions
>   net: dsa: bcm_sf2: Add support for ethtool::rxnfc
>
>  drivers/net/dsa/Makefile       |   2 +-
>  drivers/net/dsa/bcm_sf2.c      |  23 ++
>  drivers/net/dsa/bcm_sf2.h      |  17 ++
>  drivers/net/dsa/bcm_sf2_cfp.c  | 613 +
>  drivers/net/dsa/bcm_sf2_regs.h | 150 ++
>  include/net/dsa.h              |   8 +
>  net/dsa/slave.c                |  26 ++
>  7 files changed, 838 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c
>
> --
> 2.9.3
>
[RFC PATCH 0/2] rx zero copy interface for af_packet
This is an experimental implementation of rx zero copy for af_packet. It's a bit rough and likely has errors, but the plan is to clean it up over the next few months.

And seeing I said I would post it in another thread a few days back, here it is. Comments welcome and use at your own risk.

Thanks,
John

---

John Fastabend (2):
      af_packet: direct dma for packet interface
      ixgbe: add af_packet direct copy support

 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    3
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  255 +
 include/linux/netdevice.h                     |    8 +
 include/net/af_packet.h                       |   64 ++
 include/uapi/linux/if_packet.h                |    1
 net/packet/af_packet.c                        |   37
 net/packet/internal.h                         |   60 --
 tools/testing/selftests/net/psock_tpacket.c   |   51 -
 8 files changed, 410 insertions(+), 69 deletions(-)
 create mode 100644 include/net/af_packet.h

--
Signature
Re: [net-next] openvswitch: Simplify do_execute_actions().
On Fri, Jan 27, 2017 at 12:42 PM, Pravin Shelar wrote:
> On Wed, Jan 25, 2017 at 9:24 PM, Andy Zhou wrote:
>> do_execute_actions() implements a worthwhile optimization: in case
>> an output action is the last action in an action list, skb_clone()
>> can be avoided by outputting the current skb. However, the
>> implementation is more complicated than necessary. This patch
>> simplifies this logic.
>>
>> Signed-off-by: Andy Zhou
>> ---
>>  net/openvswitch/actions.c | 40 +++-
>>  1 file changed, 19 insertions(+), 21 deletions(-)
>>
>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>> index 514f7bc..3866608 100644
>> --- a/net/openvswitch/actions.c
>> +++ b/net/openvswitch/actions.c
>> @@ -830,6 +830,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
>>  {
>>         struct vport *vport = ovs_vport_rcu(dp, out_port);
>>
>> +       if (unlikely(!skb))
>> +               return;
>> +
> Patch looks good to me. But I wanted to know if you considered moving
> this check to do_execute_actions() in case skb-clone is done? This way
> we can avoid this unlikely check from likely case :)
>
Good point. O.K. I will repost a version without this check.

Thanks for the review and comment.
[pull request][net 0/8] Mellanox mlx5 fixes 2017-01-27
Hi Dave,

This pull request includes some mlx5 fixes for net, please see details below.

Please pull and let me know if there's any problem.

For -stable:
    net/mlx5e: Modify TIRs hash only when it's needed
    net/mlx5e: Fix update of hash function/key via ethtool

Thanks,
Saeed.

---

The following changes since commit 214767faa2f31285f92754393c036f13b55474a6:

  Merge tag 'batadv-net-for-davem-20170125' of git://git.open-mesh.org/linux-merge (2017-01-25 23:11:13 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2017-01-27

for you to fetch changes up to 4f24229d2f73be75c2d362113588e30a1695dcb1:

  net/mlx5e: Check ets capability before ets query FW command (2017-01-27 00:05:47 +0200)

mlx5-fixes-2017-01-27

A couple of mlx5 core and ethernet driver fixes.

From Or, a couple of error return values and error handling fixes.
From Hadar, Support TC encapsulation offloads even when the mlx5e uplink device is stacked under an upper device.
From Gal, two patches to fix RSS hash modifications via ethtool.
From Moshe, Added a needed ets capability check.

Gal Pressman (2):
      net/mlx5e: Modify TIRs hash only when it's needed
      net/mlx5e: Fix update of hash function/key via ethtool

Hadar Hen Zion (1):
      net/mlx5e: Support TC encapsulation offloads with upper devices

Moshe Shemesh (1):
      net/mlx5e: Check ets capability before ets query FW command

Or Gerlitz (4):
      net/mlx5: Change ENOTSUPP to EOPNOTSUPP
      net/mlx5: Return EOPNOTSUPP when failing to get steering name-space
      net/mlx5: E-Switch, Err when retrieving steering name-space fails
      net/mlx5: E-Switch, Re-enable RoCE on mode change only after FDB destroy

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c      |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c |  11 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  41 +++--
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c    |   2 +-
 .../ethernet/mellanox/mlx5/core/en_fs_ethtool.c    |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 202 ++---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    |  13 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  10 +-
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |  36 ++--
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/main.c     |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/port.c     |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/vport.c    |   2 +-
 15 files changed, 181 insertions(+), 157 deletions(-)
Re: [PATCH] cfg80211 debugfs: Cleanup some checkpatch issues
On Fri, 2017-01-27 at 22:00 +0100, Johannes Berg wrote:
> On Fri, 2017-01-27 at 22:26 +0300, Pichugin Dmitry wrote:
> > This fixes the checkpatch.pl warnings:
> > * Macros should not use a trailing semicolon.
> > * Spaces required around that '='.
> > * Symbolic permissions 'S_IRUGO' are not preferred.
> > * Macro argument reuse 'buflen' - possible side-effects
>
> I really see no point in any of this.

Look at the uses of DEBUGFS_READONLY_FILE and see if they are consistent before and after.

 DEBUGFS_READONLY_FILE(rts_threshold, 20, "%d",
-		      wiphy->rts_threshold)
+		      wiphy->rts_threshold);
 DEBUGFS_READONLY_FILE(fragmentation_threshold, 20, "%d",
 		      wiphy->frag_threshold);
 DEBUGFS_READONLY_FILE(short_retry_limit, 20, "%d",
-		      wiphy->retry_short)
+		      wiphy->retry_short);
 DEBUGFS_READONLY_FILE(long_retry_limit, 20, "%d",
 		      wiphy->retry_long);
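To make the consistency point concrete, here is a small stand-alone sketch of a definition-style macro whose expansion does not end in a semicolon, so every use site looks the same. It is not the cfg80211 macro: DEFINE_RO_FILE and struct ro_file are hypothetical names, and the body only formats an int into a caller buffer. It also expands 'buflen' exactly once, which is one way to avoid the "macro argument reuse" warning.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical descriptor for one read-only file. */
struct ro_file {
	const char *name;
	int (*show)(char *out, size_t outlen, int value);
};

/*
 * Expands to a show function plus a descriptor. The expansion ends without
 * a semicolon, so every use site supplies its own. 'buflen' is expanded
 * once; the rest of the body uses sizeof(buf) instead.
 */
#define DEFINE_RO_FILE(fname, buflen, fmt) \
static int fname##_show(char *out, size_t outlen, int value) \
{ \
	char buf[buflen]; \
	int n = snprintf(buf, sizeof(buf), fmt, value); \
 \
	assert(out != NULL); \
	if (n < 0 || (size_t)n >= sizeof(buf) || (size_t)n >= outlen) \
		return -1; \
	memcpy(out, buf, (size_t)n + 1); \
	return n; \
} \
static const struct ro_file fname##_file = { \
	.name = #fname, \
	.show = fname##_show, \
}

/* Both uses end in ';', with no exceptions to remember. */
DEFINE_RO_FILE(rts_threshold, 20, "%d");
DEFINE_RO_FILE(retry_short, 20, "%d");
```

If the macro carried its own trailing semicolon instead, some call sites would end up with one and some without, which is the mixed state visible in the hunk above.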
[net 2/8] net/mlx5: Return EOPNOTSUPP when failing to get steering name-space
From: Or Gerlitz

When we fail to retrieve a hardware steering name-space, the returned error code should say that this operation is not supported. Align the various places in the driver where this call is made to this convention.

Also, make sure to warn when we fail to retrieve a SW (ANCHOR) name-space.

Signed-off-by: Or Gerlitz
Reviewed-by: Matan Barak
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c            | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c          | 6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c          | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
index 1fe80de5d68f..a0e5a69402b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -1089,7 +1089,7 @@ int mlx5e_create_flow_steering(struct mlx5e_priv *priv)
 					       MLX5_FLOW_NAMESPACE_KERNEL);
 
 	if (!priv->fs.ns)
-		return -EINVAL;
+		return -EOPNOTSUPP;
 
 	err = mlx5e_arfs_create_tables(priv);
 	if (err) {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index bb712139b36e..d0c8bf014453 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -353,7 +353,7 @@ static int esw_create_legacy_fdb_table(struct mlx5_eswitch *esw, int nvports)
 	root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB);
 	if (!root_ns) {
 		esw_warn(dev, "Failed to get FDB flow namespace\n");
-		return -ENOMEM;
+		return -EOPNOTSUPP;
 	}
 
 	flow_group_in = mlx5_vzalloc(inlen);
@@ -962,7 +962,7 @@ static int esw_vport_enable_egress_acl(struct mlx5_eswitch *esw,
 	root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_EGRESS);
 	if (!root_ns) {
 		esw_warn(dev, "Failed to get E-Switch egress flow namespace\n");
-		return -EIO;
+		return -EOPNOTSUPP;
 	}
 
 	flow_group_in = mlx5_vzalloc(inlen);
@@ -1079,7 +1079,7 @@ static int esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw,
 	root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_INGRESS);
 	if (!root_ns) {
 		esw_warn(dev, "Failed to get E-Switch ingress flow namespace\n");
-		return -EIO;
+		return -EOPNOTSUPP;
 	}
 
 	flow_group_in = mlx5_vzalloc(inlen);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 657d319fc4c6..5803216157cf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -535,7 +535,7 @@ static int esw_create_offloads_table(struct mlx5_eswitch *esw)
 	ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_OFFLOADS);
 	if (!ns) {
 		esw_warn(esw->dev, "Failed to get offloads flow namespace\n");
-		return -ENOMEM;
+		return -EOPNOTSUPP;
 	}
 
 	ft_offloads = mlx5_create_flow_table(ns, 0, dev->priv.sriov.num_vfs + 2, 0, 0);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 0ac7a2fc916c..6346a8f5883b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -1822,7 +1822,7 @@ static int create_anchor_flow_table(struct mlx5_flow_steering *steering)
 	struct mlx5_flow_table *ft;
 
 	ns = mlx5_get_flow_namespace(steering->dev, MLX5_FLOW_NAMESPACE_ANCHOR);
-	if (!ns)
+	if (WARN_ON(!ns))
 		return -EINVAL;
 	ft = mlx5_create_flow_table(ns, ANCHOR_PRIO, ANCHOR_SIZE, ANCHOR_LEVEL, 0);
 	if (IS_ERR(ft)) {
-- 
2.11.0
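The convention this patch settles on can be shown in a few lines of stand-alone C. This is a sketch with made-up types, not mlx5 code: struct device_caps, get_flow_namespace() and create_fdb_table() are hypothetical, and the lookup is reduced to a capability flag. The point is only that a missing hardware name-space is reported as "operation not supported" rather than as an allocation or I/O failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical namespace types; the driver's are MLX5_FLOW_NAMESPACE_*. */
enum ns_type { NS_KERNEL, NS_FDB, NS_ANCHOR, NS_TYPE_MAX };

struct flow_ns {
	enum ns_type type;
};

struct device_caps {
	int supported[NS_TYPE_MAX];	/* nonzero when the namespace exists */
	struct flow_ns ns[NS_TYPE_MAX];
};

/* Returns NULL when the namespace is not available on this device. */
static struct flow_ns *get_flow_namespace(struct device_caps *dev,
					  enum ns_type type)
{
	assert(type < NS_TYPE_MAX);
	if (!dev->supported[type])
		return NULL;
	dev->ns[type].type = type;
	return &dev->ns[type];
}

/* A missing hardware namespace maps to -EOPNOTSUPP, not -ENOMEM or -EIO. */
static int create_fdb_table(struct device_caps *dev)
{
	struct flow_ns *ns = get_flow_namespace(dev, NS_FDB);

	if (!ns)
		return -EOPNOTSUPP;
	/* table creation would go here */
	return 0;
}
```

A caller can then tell "this device cannot do it" apart from a transient failure and, for example, skip the feature instead of retrying.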
[PATCH net-next 3/3] net: dsa: bcm_sf2: Add support for ethtool::rxnfc
Add support for configuring classification rules using the ethtool::rxnfc API. This is useful to program the switch's CFP/TCAM to redirect specific packets to specific ports/queues for instance. For now, we allow any kind of IPv4 5-tuple matching. Signed-off-by: Florian Fainelli--- drivers/net/dsa/Makefile | 2 +- drivers/net/dsa/bcm_sf2.c | 14 + drivers/net/dsa/bcm_sf2.h | 17 ++ drivers/net/dsa/bcm_sf2_cfp.c | 613 ++ 4 files changed, 645 insertions(+), 1 deletion(-) create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile index 8346e4f9737a..e69f3683f52f 100644 --- a/drivers/net/dsa/Makefile +++ b/drivers/net/dsa/Makefile @@ -1,5 +1,5 @@ obj-$(CONFIG_NET_DSA_MV88E6060) += mv88e6060.o -obj-$(CONFIG_NET_DSA_BCM_SF2) += bcm_sf2.o +obj-$(CONFIG_NET_DSA_BCM_SF2) += bcm_sf2.o bcm_sf2_cfp.o obj-$(CONFIG_NET_DSA_QCA8K)+= qca8k.o obj-y += b53/ diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c index 8eecfd227e06..74cf18798655 100644 --- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -1036,6 +1036,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = { .port_fdb_dump = b53_fdb_dump, .port_fdb_add = b53_fdb_add, .port_fdb_del = b53_fdb_del, + .get_rxnfc = bcm_sf2_get_rxnfc, + .set_rxnfc = bcm_sf2_set_rxnfc, }; struct bcm_sf2_of_data { @@ -1159,6 +1161,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev) spin_lock_init(>indir_lock); mutex_init(>stats_mutex); + mutex_init(>cfp.lock); + + /* CFP rule #0 cannot be used for specific classifications, flag it as +* permanently used +*/ + set_bit(0, priv->cfp.used); bcm_sf2_identify_ports(priv, dn->child); @@ -1188,6 +1196,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev) return ret; } + ret = bcm_sf2_cfp_rst(priv); + if (ret) { + pr_err("failed to reset CFP\n"); + goto out_mdio; + } + /* Disable all interrupts and request them */ bcm_sf2_intr_disable(priv); diff --git a/drivers/net/dsa/bcm_sf2.h 
b/drivers/net/dsa/bcm_sf2.h index 6e1f74e4d471..7d3030e04f11 100644 --- a/drivers/net/dsa/bcm_sf2.h +++ b/drivers/net/dsa/bcm_sf2.h @@ -52,6 +52,13 @@ struct bcm_sf2_port_status { struct ethtool_eee eee; }; +struct bcm_sf2_cfp_priv { + /* Mutex protecting concurrent accesses to the CFP registers */ + struct mutex lock; + DECLARE_BITMAP(used, CFP_NUM_RULES); + unsigned int rules_cnt; +}; + struct bcm_sf2_priv { /* Base registers, keep those in order with BCM_SF2_REGS_NAME */ void __iomem*core; @@ -103,6 +110,9 @@ struct bcm_sf2_priv { /* Bitmask of ports needing BRCM tags */ unsigned intbrcm_tag_mask; + + /* CFP rules context */ + struct bcm_sf2_cfp_priv cfp; }; static inline struct bcm_sf2_priv *bcm_sf2_to_priv(struct dsa_switch *ds) @@ -197,4 +207,11 @@ SF2_IO_MACRO(acb); SWITCH_INTR_L2(0); SWITCH_INTR_L2(1); +/* RXNFC */ +int bcm_sf2_get_rxnfc(struct dsa_switch *ds, int port, + struct ethtool_rxnfc *nfc, u32 *rule_locs); +int bcm_sf2_set_rxnfc(struct dsa_switch *ds, int port, + struct ethtool_rxnfc *nfc); +int bcm_sf2_cfp_rst(struct bcm_sf2_priv *priv); + #endif /* __BCM_SF2_H */ diff --git a/drivers/net/dsa/bcm_sf2_cfp.c b/drivers/net/dsa/bcm_sf2_cfp.c new file mode 100644 index ..c71be3e0dc2d --- /dev/null +++ b/drivers/net/dsa/bcm_sf2_cfp.c @@ -0,0 +1,613 @@ +/* + * Broadcom Starfighter 2 DSA switch CFP support + * + * Copyright (C) 2016, Broadcom + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include + +#include "bcm_sf2.h" +#include "bcm_sf2_regs.h" + +struct cfp_udf_layout { + u8 slices[UDF_NUM_SLICES]; + u32 mask_value; + +}; + +/* UDF slices layout for a TCPv4/UDPv4 specification */ +static const struct cfp_udf_layout udf_tcpip4_layout = { + .slices = { + /* End of L2, byte offset 12, src IP[0:15] */ + CFG_UDF_EOL2 | 6, + /* End of L2, byte offset 14, src IP[16:31] */ + CFG_UDF_EOL2 | 7, + /* End of L2, byte offset 16, dst IP[0:15] */ + CFG_UDF_EOL2 | 8, + /* End of L2, byte offset 18, dst IP[16:31] */ + CFG_UDF_EOL2 | 9, + /* End of L3, byte offset 0, src port */ +
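The rule bookkeeping in struct bcm_sf2_cfp_priv (a 'used' bitmap plus a rule count, with rule #0 flagged as permanently used) can be modelled in user space as below. This is a sketch under assumptions, not the driver's implementation: NUM_RULES is an arbitrary value standing in for CFP_NUM_RULES, the helper names are hypothetical, and there is no locking, whereas the driver protects its CFP state with a mutex.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define NUM_RULES 256	/* assumption; the driver uses CFP_NUM_RULES */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((NUM_RULES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct rule_table {
	unsigned long used[BITMAP_WORDS];
	unsigned int rules_cnt;
};

static int rule_test(const struct rule_table *t, unsigned int idx)
{
	return (t->used[idx / BITS_PER_WORD] >> (idx % BITS_PER_WORD)) & 1UL;
}

static void rule_set(struct rule_table *t, unsigned int idx)
{
	t->used[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

/* Rule 0 is flagged as permanently used, as set_bit(0, ...) does at probe. */
static void rule_table_init(struct rule_table *t)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
	rule_set(t, 0);
}

/* Claim a specific index; fails if it is out of range or already taken. */
static int rule_claim(struct rule_table *t, unsigned int idx)
{
	if (idx >= NUM_RULES || rule_test(t, idx))
		return -1;
	rule_set(t, idx);
	t->rules_cnt++;
	return 0;
}

/* Claim the lowest free index, or return -1 when the table is full. */
static int rule_claim_any(struct rule_table *t)
{
	unsigned int idx;

	for (idx = 0; idx < NUM_RULES; idx++) {
		if (rule_claim(t, idx) == 0)
			return (int)idx;
	}
	return -1;
}
```

Because index 0 is reserved at init time, a request for "any" location starts at 1, and an explicit request for location 0 is rejected like any other occupied slot.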
[net 5/8] net/mlx5e: Support TC encapsulation offloads with upper devices
From: Hadar Hen Zion

When tunneling is used, some virtualization systems set the (mlx5e) uplink device to be stacked under upper devices such as bridge or ovs internal port, where the VTEP IP address used for the encapsulation is set on that upper device.

In order to support such use-cases, we also deal with a setup where the egress mirred device isn't representing a port on the HW e-switch to where the ingress device belongs. We use eswitch service function which returns the uplink and set it as the egress device of the tc encap rule.

Fixes: a54e20b4fcae ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads")
Signed-off-by: Hadar Hen Zion
Reviewed-by: Or Gerlitz
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 46bef6a26a8c..c5282b6aba8b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -663,6 +663,7 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
 				   __be32 *saddr,
 				   int *out_ttl)
 {
+	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	struct rtable *rt;
 	struct neighbour *n = NULL;
 	int ttl;
@@ -677,12 +678,11 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
 #else
 	return -EOPNOTSUPP;
 #endif
-
-	if (!switchdev_port_same_parent_id(priv->netdev, rt->dst.dev)) {
-		pr_warn("%s: can't offload, devices not on same HW e-switch\n", __func__);
-		ip_rt_put(rt);
-		return -EOPNOTSUPP;
-	}
+	/* if the egress device isn't on the same HW e-switch, we use the uplink */
+	if (!switchdev_port_same_parent_id(priv->netdev, rt->dst.dev))
+		*out_dev = mlx5_eswitch_get_uplink_netdev(esw);
+	else
+		*out_dev = rt->dst.dev;
 
 	ttl = ip4_dst_hoplimit(&rt->dst);
 	n = dst_neigh_lookup(&rt->dst, &fl4->daddr);
@@ -693,7 +693,6 @@ static int mlx5e_route_lookup_ipv4(struct mlx5e_priv *priv,
 	*out_n = n;
 	*saddr = fl4->saddr;
 	*out_ttl = ttl;
-	*out_dev = rt->dst.dev;
 
 	return 0;
 }
-- 
2.11.0
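The egress selection this patch introduces boils down to a two-way choice, sketched here with toy types. None of these names come from the driver: struct netdev, struct eswitch, same_parent_id() and pick_egress_dev() are hypothetical, and an integer switch_id stands in for the switchdev parent ID that switchdev_port_same_parent_id() compares.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal model of a netdev that may belong to an e-switch. */
struct netdev {
	const char *name;
	int switch_id;	/* stands in for the switchdev parent ID; -1 = none */
};

struct eswitch {
	int switch_id;
	struct netdev *uplink;
};

static int same_parent_id(const struct netdev *a, const struct netdev *b)
{
	return a->switch_id >= 0 && a->switch_id == b->switch_id;
}

/*
 * Pick the egress device for an encap rule: the routed device if it sits on
 * the same e-switch as the ingress device, otherwise the e-switch uplink.
 */
static struct netdev *pick_egress_dev(const struct eswitch *esw,
				      const struct netdev *ingress,
				      struct netdev *route_dev)
{
	assert(esw && ingress && route_dev);
	if (!same_parent_id(ingress, route_dev))
		return esw->uplink;
	return route_dev;
}
```

So when the route lookup lands on an upper device such as a bridge that holds the VTEP address, the rule is still offloaded with the uplink as its egress, instead of failing with -EOPNOTSUPP as before.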
[net 1/8] net/mlx5: Change ENOTSUPP to EOPNOTSUPP
From: Or GerlitzAs ENOTSUPP is specific to NFS, change the return error value to EOPNOTSUPP in various places in the mlx5 driver. Signed-off-by: Or Gerlitz Suggested-by: Yotam Gigi Reviewed-by: Matan Barak Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en.h | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 6 +++--- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 10 +- drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c| 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/main.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/port.c | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/vport.c| 2 +- 12 files changed, 22 insertions(+), 22 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 3797cc7c1288..caa837e5e2b9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -1728,7 +1728,7 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev) if (cmd->cmdif_rev > CMD_IF_REV) { dev_err(>pdev->dev, "driver does not support command interface version. 
driver %d, firmware %d\n", CMD_IF_REV, cmd->cmdif_rev); - err = -ENOTSUPP; + err = -EOPNOTSUPP; goto err_free_page; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 951dbd58594d..1619147a63e8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -863,12 +863,12 @@ static inline void mlx5e_arfs_destroy_tables(struct mlx5e_priv *priv) {} static inline int mlx5e_arfs_enable(struct mlx5e_priv *priv) { - return -ENOTSUPP; + return -EOPNOTSUPP; } static inline int mlx5e_arfs_disable(struct mlx5e_priv *priv) { - return -ENOTSUPP; + return -EOPNOTSUPP; } #else int mlx5e_arfs_create_tables(struct mlx5e_priv *priv); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c index f0b460f47f29..35f9ae037ba0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c @@ -89,7 +89,7 @@ static int mlx5e_dcbnl_ieee_getets(struct net_device *netdev, int i; if (!MLX5_CAP_GEN(priv->mdev, ets)) - return -ENOTSUPP; + return -EOPNOTSUPP; ets->ets_cap = mlx5_max_tc(priv->mdev) + 1; for (i = 0; i < ets->ets_cap; i++) { @@ -236,7 +236,7 @@ static int mlx5e_dcbnl_ieee_setets(struct net_device *netdev, int err; if (!MLX5_CAP_GEN(priv->mdev, ets)) - return -ENOTSUPP; + return -EOPNOTSUPP; err = mlx5e_dbcnl_validate_ets(netdev, ets); if (err) @@ -402,7 +402,7 @@ static u8 mlx5e_dcbnl_setall(struct net_device *netdev) struct mlx5_core_dev *mdev = priv->mdev; struct ieee_ets ets; struct ieee_pfc pfc; - int err = -ENOTSUPP; + int err = -EOPNOTSUPP; int i; if (!MLX5_CAP_GEN(mdev, ets)) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 5197817e4b2f..ffbdf9ee5a9b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
@@ -595,7 +595,7 @@ static int mlx5e_get_coalesce(struct net_device *netdev, struct mlx5e_priv *priv = netdev_priv(netdev); if (!MLX5_CAP_GEN(priv->mdev, cq_moderation)) - return -ENOTSUPP; + return -EOPNOTSUPP; coal->rx_coalesce_usecs = priv->params.rx_cq_moderation.usec; coal->rx_max_coalesced_frames = priv->params.rx_cq_moderation.pkts; @@ -620,7 +620,7 @@ static int mlx5e_set_coalesce(struct net_device *netdev, int i; if (!MLX5_CAP_GEN(mdev, cq_moderation)) - return -ENOTSUPP; + return -EOPNOTSUPP; mutex_lock(>state_lock); @@ -1296,7 +1296,7 @@ static int mlx5e_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol) u32 mlx5_wol_mode; if (!wol_supported) - return -ENOTSUPP; + return -EOPNOTSUPP; if (wol->wolopts & ~wol_supported) return -EINVAL; @@ -1426,7 +1426,7 @@ static
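As background for why the constant matters: ENOTSUPP (524) is defined only in the kernel's internal include/linux/errno.h, in the block reserved for NFSv3, so user-space <errno.h> has no name for it, while EOPNOTSUPP is a regular POSIX errno. The patch itself adds no translation and simply returns the other constant. The helper below is a hypothetical illustration of what a boundary layer would otherwise have to do; KERNEL_ENOTSUPP and sanitize_errno() are names made up for this sketch.

```c
#include <assert.h>
#include <errno.h>

/*
 * Kernel-internal value of ENOTSUPP. User-space <errno.h> does not define
 * it, so it is spelled out locally for the sketch.
 */
#define KERNEL_ENOTSUPP 524

/* Translate a negative kernel-style return code into one user space knows. */
static int sanitize_errno(int err)
{
	assert(err <= 0);
	if (err == -KERNEL_ENOTSUPP)
		return -EOPNOTSUPP;
	return err;
}
```

Returning -EOPNOTSUPP at the source, as the patch does, makes such a translation step unnecessary for these paths.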
[PATCH net-next 4/4] net: dsa: bcm_sf2: Add support for ethtool::rxnfc
Add support for configuring classification rules using the ethtool::rxnfc API. This is useful to program the switch's CFP/TCAM to redirect specific packets to specific ports/queues for instance. For now, we allow any kind of IPv4 5-tuple matching. Signed-off-by: Florian Fainelli--- drivers/net/dsa/Makefile | 2 +- drivers/net/dsa/bcm_sf2.c | 14 + drivers/net/dsa/bcm_sf2.h | 17 ++ drivers/net/dsa/bcm_sf2_cfp.c | 613 ++ 4 files changed, 645 insertions(+), 1 deletion(-) create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile index 8346e4f9737a..e69f3683f52f 100644 --- a/drivers/net/dsa/Makefile +++ b/drivers/net/dsa/Makefile @@ -1,5 +1,5 @@ obj-$(CONFIG_NET_DSA_MV88E6060) += mv88e6060.o -obj-$(CONFIG_NET_DSA_BCM_SF2) += bcm_sf2.o +obj-$(CONFIG_NET_DSA_BCM_SF2) += bcm_sf2.o bcm_sf2_cfp.o obj-$(CONFIG_NET_DSA_QCA8K)+= qca8k.o obj-y += b53/ diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c index 637072da3acf..be282b430c50 100644 --- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -1045,6 +1045,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = { .port_fdb_dump = b53_fdb_dump, .port_fdb_add = b53_fdb_add, .port_fdb_del = b53_fdb_del, + .get_rxnfc = bcm_sf2_get_rxnfc, + .set_rxnfc = bcm_sf2_set_rxnfc, }; struct bcm_sf2_of_data { @@ -1168,6 +1170,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev) spin_lock_init(>indir_lock); mutex_init(>stats_mutex); + mutex_init(>cfp.lock); + + /* CFP rule #0 cannot be used for specific classifications, flag it as +* permanently used +*/ + set_bit(0, priv->cfp.used); bcm_sf2_identify_ports(priv, dn->child); @@ -1197,6 +1205,12 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev) return ret; } + ret = bcm_sf2_cfp_rst(priv); + if (ret) { + pr_err("failed to reset CFP\n"); + goto out_mdio; + } + /* Disable all interrupts and request them */ bcm_sf2_intr_disable(priv); diff --git a/drivers/net/dsa/bcm_sf2.h 
b/drivers/net/dsa/bcm_sf2.h index 6e1f74e4d471..7d3030e04f11 100644 --- a/drivers/net/dsa/bcm_sf2.h +++ b/drivers/net/dsa/bcm_sf2.h @@ -52,6 +52,13 @@ struct bcm_sf2_port_status { struct ethtool_eee eee; }; +struct bcm_sf2_cfp_priv { + /* Mutex protecting concurrent accesses to the CFP registers */ + struct mutex lock; + DECLARE_BITMAP(used, CFP_NUM_RULES); + unsigned int rules_cnt; +}; + struct bcm_sf2_priv { /* Base registers, keep those in order with BCM_SF2_REGS_NAME */ void __iomem*core; @@ -103,6 +110,9 @@ struct bcm_sf2_priv { /* Bitmask of ports needing BRCM tags */ unsigned intbrcm_tag_mask; + + /* CFP rules context */ + struct bcm_sf2_cfp_priv cfp; }; static inline struct bcm_sf2_priv *bcm_sf2_to_priv(struct dsa_switch *ds) @@ -197,4 +207,11 @@ SF2_IO_MACRO(acb); SWITCH_INTR_L2(0); SWITCH_INTR_L2(1); +/* RXNFC */ +int bcm_sf2_get_rxnfc(struct dsa_switch *ds, int port, + struct ethtool_rxnfc *nfc, u32 *rule_locs); +int bcm_sf2_set_rxnfc(struct dsa_switch *ds, int port, + struct ethtool_rxnfc *nfc); +int bcm_sf2_cfp_rst(struct bcm_sf2_priv *priv); + #endif /* __BCM_SF2_H */ diff --git a/drivers/net/dsa/bcm_sf2_cfp.c b/drivers/net/dsa/bcm_sf2_cfp.c new file mode 100644 index ..c71be3e0dc2d --- /dev/null +++ b/drivers/net/dsa/bcm_sf2_cfp.c @@ -0,0 +1,613 @@ +/* + * Broadcom Starfighter 2 DSA switch CFP support + * + * Copyright (C) 2016, Broadcom + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include + +#include "bcm_sf2.h" +#include "bcm_sf2_regs.h" + +struct cfp_udf_layout { + u8 slices[UDF_NUM_SLICES]; + u32 mask_value; + +}; + +/* UDF slices layout for a TCPv4/UDPv4 specification */ +static const struct cfp_udf_layout udf_tcpip4_layout = { + .slices = { + /* End of L2, byte offset 12, src IP[0:15] */ + CFG_UDF_EOL2 | 6, + /* End of L2, byte offset 14, src IP[16:31] */ + CFG_UDF_EOL2 | 7, + /* End of L2, byte offset 16, dst IP[0:15] */ + CFG_UDF_EOL2 | 8, + /* End of L2, byte offset 18, dst IP[16:31] */ + CFG_UDF_EOL2 | 9, + /* End of L3, byte offset 0, src port */ +
[PATCH net-next 1/4] net: dsa: Hook {get,set}_rxnfc ethtool operations
In preparation for adding support for CFP/TCAM in the bcm_sf2 driver, add the
plumbing to call into driver-specific {get,set}_rxnfc operations.

Signed-off-by: Florian Fainelli
---
 include/net/dsa.h |  8
 net/dsa/slave.c   | 26 ++
 2 files changed, 34 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 92fd795e9573..bcad7cc906d9 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -370,6 +370,14 @@ struct dsa_switch_ops {
 	int	(*port_mdb_dump)(struct dsa_switch *ds, int port,
 				 struct switchdev_obj_port_mdb *mdb,
 				 int (*cb)(struct switchdev_obj *obj));
+
+	/*
+	 * RXNFC
+	 */
+	int	(*get_rxnfc)(struct dsa_switch *ds, int port,
+			     struct ethtool_rxnfc *nfc, u32 *rule_locs);
+	int	(*set_rxnfc)(struct dsa_switch *ds, int port,
+			     struct ethtool_rxnfc *nfc);
 };
 
 struct dsa_switch_driver {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b8e58689a9a1..d30a98db004c 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1001,6 +1001,30 @@ void dsa_cpu_port_ethtool_init(struct ethtool_ops *ops)
 	ops->get_strings = dsa_cpu_port_get_strings;
 }
 
+static int dsa_slave_get_rxnfc(struct net_device *dev,
+			       struct ethtool_rxnfc *nfc, u32 *rule_locs)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+
+	if (!ds->ops->get_rxnfc)
+		return -EOPNOTSUPP;
+
+	return ds->ops->get_rxnfc(ds, p->port, nfc, rule_locs);
+}
+
+static int dsa_slave_set_rxnfc(struct net_device *dev,
+			       struct ethtool_rxnfc *nfc)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+
+	if (!ds->ops->set_rxnfc)
+		return -EOPNOTSUPP;
+
+	return ds->ops->set_rxnfc(ds, p->port, nfc);
+}
+
 static const struct ethtool_ops dsa_slave_ethtool_ops = {
 	.get_drvinfo		= dsa_slave_get_drvinfo,
 	.get_regs_len		= dsa_slave_get_regs_len,
@@ -1019,6 +1043,8 @@ static const struct ethtool_ops dsa_slave_ethtool_ops = {
 	.get_eee		= dsa_slave_get_eee,
 	.get_link_ksettings	= dsa_slave_get_link_ksettings,
 	.set_link_ksettings	= dsa_slave_set_link_ksettings,
+	.get_rxnfc		= dsa_slave_get_rxnfc,
+	.set_rxnfc		= dsa_slave_set_rxnfc,
 };
 
 static const struct net_device_ops dsa_slave_netdev_ops = {
-- 
2.9.3
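For readers following the plumbing in this patch, here is a minimal userspace sketch of the optional-callback dispatch it relies on: forward to the driver hook when one is registered, otherwise return -EOPNOTSUPP. The types and names (struct sw, struct sw_ops, struct rxnfc_cmd, slave_get_rxnfc, demo_get_rxnfc) are simplified stand-ins invented for illustration, not the kernel structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct rxnfc_cmd {
	unsigned int cmd;
};

struct sw;

struct sw_ops {
	/* Optional hook: may be NULL when the driver has no classifier. */
	int (*get_rxnfc)(struct sw *ds, int port, struct rxnfc_cmd *nfc);
};

struct sw {
	const struct sw_ops *ops;
};

/* Forward to the driver hook if present, else report "not supported". */
static int slave_get_rxnfc(struct sw *ds, int port, struct rxnfc_cmd *nfc)
{
	if (!ds->ops->get_rxnfc)
		return -EOPNOTSUPP;

	return ds->ops->get_rxnfc(ds, port, nfc);
}

/* Example driver hook: records the port in the command and succeeds. */
static int demo_get_rxnfc(struct sw *ds, int port, struct rxnfc_cmd *nfc)
{
	(void)ds;
	nfc->cmd = (unsigned int)port;
	return 0;
}
```

A switch driver that leaves the hook unset costs nothing: the caller simply sees -EOPNOTSUPP, which is what the two dsa_slave_*_rxnfc wrappers in the patch return.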
[net 7/8] net/mlx5e: Fix update of hash function/key via ethtool
From: Gal PressmanModifying TIR hash should change selected fields bitmask in addition to the function and key. Formerly, we would not set this field resulting in zeroing of its value, which means no packet fields are used for RX RSS hash calculation thus causing all traffic to arrive in RQ[0]. Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change") Signed-off-by: Gal Pressman Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 3 +- .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 13 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 198 ++--- 3 files changed, 109 insertions(+), 105 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 1619147a63e8..d5ecb8f53fd4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -791,7 +791,8 @@ void mlx5e_disable_vlan_filter(struct mlx5e_priv *priv); int mlx5e_modify_rqs_vsd(struct mlx5e_priv *priv, bool vsd); int mlx5e_redirect_rqt(struct mlx5e_priv *priv, u32 rqtn, int sz, int ix); -void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv); +void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc, + enum mlx5e_traffic_types tt); int mlx5e_open_locked(struct net_device *netdev); int mlx5e_close_locked(struct net_device *netdev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 6f4eb34259f0..bb67863aa361 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -980,15 +980,18 @@ static int mlx5e_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key, static void mlx5e_modify_tirs_hash(struct mlx5e_priv *priv, void *in, int inlen) { - struct mlx5_core_dev *mdev = priv->mdev; void *tirc = MLX5_ADDR_OF(modify_tir_in, in, ctx); - int i; + struct mlx5_core_dev *mdev = 
priv->mdev; + int ctxlen = MLX5_ST_SZ_BYTES(tirc); + int tt; MLX5_SET(modify_tir_in, in, bitmask.hash, 1); - mlx5e_build_tir_ctx_hash(tirc, priv); - for (i = 0; i < MLX5E_NUM_INDIR_TIRS; i++) - mlx5_core_modify_tir(mdev, priv->indir_tir[i].tirn, in, inlen); + for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) { + memset(tirc, 0, ctxlen); + mlx5e_build_indir_tir_ctx_hash(priv, tirc, tt); + mlx5_core_modify_tir(mdev, priv->indir_tir[tt].tirn, in, inlen); + } } static int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 948351ae5bd2..f14ca3385fdd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -2022,8 +2022,23 @@ static void mlx5e_build_tir_ctx_lro(void *tirc, struct mlx5e_priv *priv) MLX5_SET(tirc, tirc, lro_timeout_period_usecs, priv->params.lro_timeout); } -void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv) +void mlx5e_build_indir_tir_ctx_hash(struct mlx5e_priv *priv, void *tirc, + enum mlx5e_traffic_types tt) { + void *hfso = MLX5_ADDR_OF(tirc, tirc, rx_hash_field_selector_outer); + +#define MLX5_HASH_IP(MLX5_HASH_FIELD_SEL_SRC_IP |\ +MLX5_HASH_FIELD_SEL_DST_IP) + +#define MLX5_HASH_IP_L4PORTS(MLX5_HASH_FIELD_SEL_SRC_IP |\ +MLX5_HASH_FIELD_SEL_DST_IP |\ +MLX5_HASH_FIELD_SEL_L4_SPORT |\ +MLX5_HASH_FIELD_SEL_L4_DPORT) + +#define MLX5_HASH_IP_IPSEC_SPI (MLX5_HASH_FIELD_SEL_SRC_IP |\ +MLX5_HASH_FIELD_SEL_DST_IP |\ +MLX5_HASH_FIELD_SEL_IPSEC_SPI) + MLX5_SET(tirc, tirc, rx_hash_fn, mlx5e_rx_hash_fn(priv->params.rss_hfunc)); if (priv->params.rss_hfunc == ETH_RSS_HASH_TOP) { @@ -2035,6 +2050,88 @@ void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv) MLX5_SET(tirc, tirc, rx_hash_symmetric, 1); memcpy(rss_key, priv->params.toeplitz_hash_key, len); } + + switch (tt) { + case MLX5E_TT_IPV4_TCP: + MLX5_SET(rx_hash_field_select, hfso, l3_prot_type, 
+MLX5_L3_PROT_TYPE_IPV4); + MLX5_SET(rx_hash_field_select, hfso, l4_prot_type, +MLX5_L4_PROT_TYPE_TCP); + MLX5_SET(rx_hash_field_select, hfso, selected_fields, +MLX5_HASH_IP_L4PORTS); + break; + + case MLX5E_TT_IPV6_TCP: +
[PATCH net-next 2/4] net: dsa: bcm_sf2: Configure traffic classes to queue mapping
By default, all traffic goes to queue 0. Reconfigure the traffic class to
quality of service mapping such that priority X maps to queue X, where X
ranges from 0 through 7.

Signed-off-by: Florian Fainelli
---
 drivers/net/dsa/bcm_sf2.c      | 9 +
 drivers/net/dsa/bcm_sf2_regs.h | 4
 2 files changed, 13 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 8eecfd227e06..637072da3acf 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -229,6 +229,7 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int port,
 {
 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
 	s8 cpu_port = ds->dst[ds->index].cpu_port;
+	unsigned int i;
 	u32 reg;
 
 	/* Clear the memory power down */
@@ -240,6 +241,14 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int port,
 	if (priv->brcm_tag_mask & BIT(port))
 		bcm_sf2_brcm_hdr_setup(priv, port);
 
+	/* Configure Traffic Class to QoS mapping, allow each priority to map
+	 * to a different queue number
+	 */
+	reg = core_readl(priv, CORE_PORT_TC2_QOS_MAP_PORT(port));
+	for (i = 0; i < 8; i++)
+		reg |= i << (PRT_TO_QID_SHIFT * i);
+	core_writel(priv, reg, CORE_PORT_TC2_QOS_MAP_PORT(port));
+
 	/* Clear the Rx and Tx disable bits and set to no spanning tree */
 	core_writel(priv, 0, CORE_G_PCTL_PORT(port));
 
diff --git a/drivers/net/dsa/bcm_sf2_regs.h b/drivers/net/dsa/bcm_sf2_regs.h
index 3b33b8010cc8..6b63c00928ba 100644
--- a/drivers/net/dsa/bcm_sf2_regs.h
+++ b/drivers/net/dsa/bcm_sf2_regs.h
@@ -238,6 +238,10 @@ enum bcm_sf2_reg_offs {
 #define P_TXQ_PSM_VDD(x)		(P_TXQ_PSM_VDD_MASK << \
 					((x) * P_TXQ_PSM_VDD_SHIFT))
 
+#define CORE_PORT_TC2_QOS_MAP_PORT(x)	(0xc1c0 + ((x) * 0x10))
+#define PRT_TO_QID_MASK			0x3
+#define PRT_TO_QID_SHIFT		3
+
 #define CORE_PORT_VLAN_CTL_PORT(x)	(0xc400 + ((x) * 0x8))
 #define PORT_VLAN_CTRL_MASK		0x1ff
 
-- 
2.9.3
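To make the register arithmetic in the loop concrete, here is a small userspace sketch of the same computation. It assumes, as the patch's shift value suggests, one 3-bit field per priority. The helper names and the 0x7 extraction mask are illustrative assumptions (the patch itself defines PRT_TO_QID_MASK as 0x3 and does not use it in the loop).

```c
#include <stdint.h>

/* Shift value taken from the patch; the field layout is assumed from it. */
#define PRT_TO_QID_SHIFT	3

/* OR in a map where priority i selects queue i, 3 bits per priority. */
static uint32_t tc2qos_identity_map(uint32_t reg)
{
	unsigned int i;

	for (i = 0; i < 8; i++)
		reg |= (uint32_t)i << (PRT_TO_QID_SHIFT * i);

	return reg;
}

/* Extract the queue selected for a priority (0x7 mask is an assumption). */
static unsigned int tc2qos_queue(uint32_t reg, unsigned int prio)
{
	return (reg >> (PRT_TO_QID_SHIFT * prio)) & 0x7;
}
```

Starting from a register value of 0, the loop yields 0xFAC688. Because the loop only ORs bits in, the result is the identity map only if the fields read back as zero beforehand.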
[net 4/8] net/mlx5: E-Switch, Re-enable RoCE on mode change only after FDB destroy
From: Or Gerlitz

We must re-enable RoCE on the e-switch management port (PF) only after
destroying the FDB in its switchdev/offloaded mode. Otherwise, when
encapsulation is supported, this re-enablement will fail.

Also, it's more natural and symmetric to disable RoCE on the PF before we
create the FDB under switchdev mode, so do that as well, and revert if the
mode change fails later.

Fixes: 9da34cd34e85 ('net/mlx5: Disable RoCE on the e-switch management [..]')
Signed-off-by: Or Gerlitz
Reviewed-by: Hadar Hen Zion
Signed-off-by: Saeed Mahameed
---
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 29 ++
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index c61bca138e65..595f7c7383b3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -675,9 +675,14 @@ int esw_offloads_init(struct mlx5_eswitch *esw, int nvports)
 	int vport;
 	int err;
 
+	/* disable PF RoCE so missed packets don't go through RoCE steering */
+	mlx5_dev_list_lock();
+	mlx5_remove_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
+	mlx5_dev_list_unlock();
+
 	err = esw_create_offloads_fdb_table(esw, nvports);
 	if (err)
-		return err;
+		goto create_fdb_err;
 
 	err = esw_create_offloads_table(esw);
 	if (err)
@@ -697,11 +702,6 @@ int esw_offloads_init(struct mlx5_eswitch *esw, int nvports)
 		goto err_reps;
 	}
 
-	/* disable PF RoCE so missed packets don't go through RoCE steering */
-	mlx5_dev_list_lock();
-	mlx5_remove_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
-	mlx5_dev_list_unlock();
-
 	return 0;
 
 err_reps:
@@ -718,6 +718,13 @@ int esw_offloads_init(struct mlx5_eswitch *esw, int nvports)
 
 create_ft_err:
 	esw_destroy_offloads_fdb_table(esw);
+
+create_fdb_err:
+	/* enable back PF RoCE */
+	mlx5_dev_list_lock();
+	mlx5_add_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
+	mlx5_dev_list_unlock();
+
 	return err;
 }
 
@@ -725,11 +732,6 @@ static int esw_offloads_stop(struct mlx5_eswitch *esw)
 {
 	int err, err1, num_vfs = esw->dev->priv.sriov.num_vfs;
 
-	/* enable back PF RoCE */
-	mlx5_dev_list_lock();
-	mlx5_add_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
-	mlx5_dev_list_unlock();
-
 	mlx5_eswitch_disable_sriov(esw);
 	err = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
 	if (err) {
@@ -739,6 +741,11 @@ static int esw_offloads_stop(struct mlx5_eswitch *esw)
 			esw_warn(esw->dev, "Failed setting eswitch back to offloads, err %d\n", err);
 	}
 
+	/* enable back PF RoCE */
+	mlx5_dev_list_lock();
+	mlx5_add_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB);
+	mlx5_dev_list_unlock();
+
 	return err;
 }
 
-- 
2.11.0
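The ordering this patch establishes in esw_offloads_init() (disable RoCE first, and on failure undo the steps in reverse so RoCE is re-enabled last) can be sketched in userspace with a goto-based unwind. Everything below is a hypothetical model: the trace buffer, step(), create_fdb(), create_table() and offloads_init() are invented for illustration and are not mlx5 functions.

```c
#include <string.h>

/* Hypothetical trace of executed steps; none of this is an mlx5 API. */
static char trace[64];

static void step(const char *s)
{
	strcat(trace, s);
}

static int create_fdb(int fail)
{
	step("F");
	return fail ? -1 : 0;
}

static int create_table(int fail)
{
	step("T");
	return fail ? -1 : 0;
}

/* fail_at: 0 = success, 1 = FDB creation fails, 2 = table creation fails */
static int offloads_init(int fail_at)
{
	int err;

	trace[0] = '\0';
	step("d");		/* disable RoCE before creating the FDB */

	err = create_fdb(fail_at == 1);
	if (err)
		goto create_fdb_err;

	err = create_table(fail_at == 2);
	if (err)
		goto create_ft_err;

	return 0;

create_ft_err:
	step("f");		/* destroy the FDB */
create_fdb_err:
	step("e");		/* re-enable RoCE only after the FDB is gone */
	return err;
}
```

With this model, a failure in the second step leaves the trace "dFTfe": the FDB is destroyed before RoCE comes back, which is the invariant the commit message asks for.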
[PATCH net-next 0/4] net: dsa: bcm_sf2: CFP support
Hi all,

This patch series adds support for the Broadcom Compact Field Processor (CFP),
which is a classification and matching engine built into most Broadcom
switches.

We support it through ethtool::rxnfc because it covers all known use cases
from the users I support and, more importantly, it allows the selection of a
target rule index, which is later used by e.g. offloading hardware. This is an
essential feature that I could not find supported with cls_*, for instance.

Thanks

Florian Fainelli (4):
  net: dsa: Hook {get,set}_rxnfc ethtool operations
  net: dsa: bcm_sf2: Configure traffic classes to queue mapping
  net: dsa: bcm_sf2: Add CFP registers definitions
  net: dsa: bcm_sf2: Add support for ethtool::rxnfc

 drivers/net/dsa/Makefile       |   2 +-
 drivers/net/dsa/bcm_sf2.c      |  23 ++
 drivers/net/dsa/bcm_sf2.h      |  17 ++
 drivers/net/dsa/bcm_sf2_cfp.c  | 613 +
 drivers/net/dsa/bcm_sf2_regs.h | 150 ++
 include/net/dsa.h              |   8 +
 net/dsa/slave.c                |  26 ++
 7 files changed, 838 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/dsa/bcm_sf2_cfp.c

-- 
2.9.3
[net 6/8] net/mlx5e: Modify TIRs hash only when it's needed
From: Gal Pressman

We don't need to modify our TIRs unless the user requested a change in
the hash function/key, for example when changing indirection only.

Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Gal Pressman
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index ffbdf9ee5a9b..6f4eb34259f0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -996,6 +996,7 @@ static int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir,
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
 	int inlen = MLX5_ST_SZ_BYTES(modify_tir_in);
+	bool hash_changed = false;
 	void *in;
 
 	if ((hfunc != ETH_RSS_HASH_NO_CHANGE) &&
@@ -1017,14 +1018,21 @@ static int mlx5e_set_rxfh(struct net_device *dev, const u32 *indir,
 		mlx5e_redirect_rqt(priv, rqtn, MLX5E_INDIR_RQT_SIZE, 0);
 	}
 
-	if (key)
+	if (hfunc != ETH_RSS_HASH_NO_CHANGE &&
+	    hfunc != priv->params.rss_hfunc) {
+		priv->params.rss_hfunc = hfunc;
+		hash_changed = true;
+	}
+
+	if (key) {
 		memcpy(priv->params.toeplitz_hash_key, key,
 		       sizeof(priv->params.toeplitz_hash_key));
+		hash_changed = hash_changed ||
+			       priv->params.rss_hfunc == ETH_RSS_HASH_TOP;
+	}
 
-	if (hfunc != ETH_RSS_HASH_NO_CHANGE)
-		priv->params.rss_hfunc = hfunc;
-
-	mlx5e_modify_tirs_hash(priv, in, inlen);
+	if (hash_changed)
+		mlx5e_modify_tirs_hash(priv, in, inlen);
 
 	mutex_unlock(&priv->state_lock);
 
-- 
2.11.0
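The decision this patch adds to mlx5e_set_rxfh() can be modelled in userspace as a small pure function. This is only a sketch under stated assumptions: struct rss_params, rss_apply(), the HASH_* constants and KEY_LEN are hypothetical stand-ins, not the driver's or ethtool's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the ethtool constants and driver params. */
enum { HASH_NO_CHANGE = 0, HASH_TOP = 1, HASH_XOR = 2 };
#define KEY_LEN 40

struct rss_params {
	int hfunc;
	unsigned char key[KEY_LEN];
};

/* Apply a request; return true if the TIRs would need reprogramming. */
static bool rss_apply(struct rss_params *p, const unsigned char *key, int hfunc)
{
	bool hash_changed = false;

	/* A function change counts only if it differs from the current one. */
	if (hfunc != HASH_NO_CHANGE && hfunc != p->hfunc) {
		p->hfunc = hfunc;
		hash_changed = true;
	}

	if (key) {
		memcpy(p->key, key, sizeof(p->key));
		/* As in the patch, a new key matters only for Toeplitz. */
		hash_changed = hash_changed || p->hfunc == HASH_TOP;
	}

	return hash_changed;
}
```

An indirection-only request corresponds to calling rss_apply() with a NULL key and HASH_NO_CHANGE, which returns false, so the TIR update is skipped.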
[PATCH net-next 3/4] net: dsa: bcm_sf2: Add CFP registers definitions
Signed-off-by: Florian Fainelli--- drivers/net/dsa/bcm_sf2_regs.h | 146 + 1 file changed, 146 insertions(+) diff --git a/drivers/net/dsa/bcm_sf2_regs.h b/drivers/net/dsa/bcm_sf2_regs.h index 6b63c00928ba..26052450091e 100644 --- a/drivers/net/dsa/bcm_sf2_regs.h +++ b/drivers/net/dsa/bcm_sf2_regs.h @@ -255,4 +255,150 @@ enum bcm_sf2_reg_offs { #define CORE_EEE_EN_CTRL 0x24800 #define CORE_EEE_LPI_INDICATE 0x24810 +#define CORE_CFP_ACC 0x28000 +#define OP_STR_DONE (1 << 0) +#define OP_SEL_SHIFT 1 +#define OP_SEL_READ (1 << OP_SEL_SHIFT) +#define OP_SEL_WRITE (2 << OP_SEL_SHIFT) +#define OP_SEL_SEARCH (4 << OP_SEL_SHIFT) +#define OP_SEL_MASK (7 << OP_SEL_SHIFT) +#define CFP_RAM_CLEAR (1 << 4) +#define RAM_SEL_SHIFT 10 +#define TCAM_SEL (1 << RAM_SEL_SHIFT) +#define ACT_POL_RAM (2 << RAM_SEL_SHIFT) +#define RATE_METER_RAM(4 << RAM_SEL_SHIFT) +#define GREEN_STAT_RAM(8 << RAM_SEL_SHIFT) +#define YELLOW_STAT_RAM (16 << RAM_SEL_SHIFT) +#define RED_STAT_RAM (24 << RAM_SEL_SHIFT) +#define RAM_SEL_MASK (0x1f << RAM_SEL_SHIFT) +#define TCAM_RESET(1 << 15) +#define XCESS_ADDR_SHIFT 16 +#define XCESS_ADDR_MASK 0xff +#define SEARCH_STS(1 << 27) +#define RD_STS_SHIFT 28 +#define RD_STS_TCAM (1 << RD_STS_SHIFT) +#define RD_STS_ACT_POL_RAM(2 << RD_STS_SHIFT) +#define RD_STS_RATE_METER_RAM (4 << RD_STS_SHIFT) +#define RD_STS_STAT_RAM (8 << RD_STS_SHIFT) + +#define CORE_CFP_RATE_METER_GLOBAL_CTL 0x28010 + +#define CORE_CFP_DATA_PORT_0 0x28040 +#define CORE_CFP_DATA_PORT(x) (CORE_CFP_DATA_PORT_0 + \ + (x) * 0x10) + +/* UDF_DATA7 */ +#define L3_FRAMING_SHIFT 24 +#define L3_FRAMING_MASK(0x3 << L3_FRAMING_SHIFT) +#define IPPROTO_SHIFT 8 +#define IPPROTO_MASK (0xff << IPPROTO_SHIFT) +#define IP_FRAG(1 << 7) + +/* UDF_DATA0 */ +#define SLICE_VALID 3 +#define SLICE_NUM_SHIFT 2 +#define SLICE_NUM(x) ((x) << SLICE_NUM_SHIFT) + +#define CORE_CFP_MASK_PORT_0 0x280c0 + +#define CORE_CFP_MASK_PORT(x) (CORE_CFP_MASK_PORT_0 + \ + (x) * 0x10) + +#define CORE_ACT_POL_DATA0 0x28140 +#define VLAN_BYP (1 
<< 0) +#define EAP_BYP (1 << 1) +#define STP_BYP (1 << 2) +#define REASON_CODE_SHIFT 3 +#define REASON_CODE_MASK 0x3f +#define LOOP_BK_EN(1 << 9) +#define NEW_TC_SHIFT 10 +#define NEW_TC_MASK 0x7 +#define CHANGE_TC (1 << 13) +#define DST_MAP_IB_SHIFT 14 +#define DST_MAP_IB_MASK 0x1ff +#define CHANGE_FWRD_MAP_IB_SHIFT 24 +#define CHANGE_FWRD_MAP_IB_MASK 0x3 +#define CHANGE_FWRD_MAP_IB_NO_DEST(0 << CHANGE_FWRD_MAP_IB_SHIFT) +#define CHANGE_FWRD_MAP_IB_REM_ARL(1 << CHANGE_FWRD_MAP_IB_SHIFT) +#define CHANGE_FWRD_MAP_IB_REP_ARL(2 << CHANGE_FWRD_MAP_IB_SHIFT) +#define CHANGE_FWRD_MAP_IB_ADD_DST(3 << CHANGE_FWRD_MAP_IB_SHIFT) +#define NEW_DSCP_IB_SHIFT 26 +#define NEW_DSCP_IB_MASK 0x3f + +#define CORE_ACT_POL_DATA1 0x28150 +#define CHANGE_DSCP_IB(1 << 0) +#define DST_MAP_OB_SHIFT 1 +#define DST_MAP_OB_MASK 0x3ff +#define CHANGE_FWRD_MAP_OB_SHIT 11 +#define CHANGE_FWRD_MAP_OB_MASK 0x3 +#define NEW_DSCP_OB_SHIFT 13 +#define NEW_DSCP_OB_MASK 0x3f +#define CHANGE_DSCP_OB(1 << 19) +#define CHAIN_ID_SHIFT20 +#define CHAIN_ID_MASK 0xff +#define CHANGE_COLOR (1 << 28) +#define NEW_COLOR_SHIFT 29 +#define NEW_COLOR_MASK0x3 +#define NEW_COLOR_GREEN (0 << NEW_COLOR_SHIFT) +#define NEW_COLOR_YELLOW (1 << NEW_COLOR_SHIFT) +#define NEW_COLOR_RED (2 << NEW_COLOR_SHIFT) +#define RED_DEFAULT (1 << 31) + +#define CORE_ACT_POL_DATA2 0x28160 +#define MAC_LIMIT_BYPASS (1 << 0) +#define CHANGE_TC_O (1 << 1) +#define NEW_TC_O_SHIFT2 +#define NEW_TC_O_MASK 0x7 +#define SPCP_RMK_DISABLE (1 << 5) +#define
[net 3/8] net/mlx5: E-Switch, Err when retrieving steering name-space fails
From: Or Gerlitz

Make sure to return an error when we fail to retrieve the FDB steering
namespace. Also, while at it, print the correct error in the warning
message when reverting the mode change fails.

Signed-off-by: Or Gerlitz
Reported-by: Leon Romanovsky
Reviewed-by: Roi Dayan
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 5803216157cf..c61bca138e65 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -424,6 +424,7 @@ static int esw_create_offloads_fdb_table(struct mlx5_eswitch *esw, int nvports)
 	root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB);
 	if (!root_ns) {
 		esw_warn(dev, "Failed to get FDB flow namespace\n");
+		err = -EOPNOTSUPP;
 		goto ns_err;
 	}
 
@@ -655,7 +656,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw)
 		esw_warn(esw->dev, "Failed setting eswitch to offloads, err %d\n", err);
 		err1 = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
 		if (err1)
-			esw_warn(esw->dev, "Failed setting eswitch back to legacy, err %d\n", err);
+			esw_warn(esw->dev, "Failed setting eswitch back to legacy, err %d\n", err1);
 	}
 	if (esw->offloads.inline_mode == MLX5_INLINE_MODE_NONE) {
 		if (mlx5_eswitch_inline_mode_get(esw,
-- 
2.11.0
[ANNOUNCE] iptables 1.6.1 release
Hi! The Netfilter project proudly presents: iptables 1.6.1 iptables is the userspace command line program used to configure the Linux 2.4.x and later packet filtering ruleset. It is targeted towards system administrators. This update contains accumulated bugfixes, several new extensions and lots of translations via iptables-translate to ease migration to nftables. See ChangeLog that comes attached to this email for more details. You can download it from: http://www.netfilter.org/projects/iptables/downloads.html ftp://ftp.netfilter.org/pub/iptables/ Have fun! Ana Rey (1): extensions: libxt_udp: add translation to nft Arpan Kapoor (1): libxtables: Replace gethostbyname() with getaddrinfo() Arturo Borrero (3): extensions/libxt_rpfilter.man: fix typo, specifiy vs specify iptables/xtables-arp.c: fix typo, wierd vs weird extensions/libxt_tcp: fix nftables translate flags value, 'none' vs '0x0' Arturo Borrero Gonzalez (1): extensions: update Arturo Borrero email address Brian Haley (1): iptables-restore: add missing arguments to usage message Florian Westphal (5): iptables.8: mention iptables-save in -L documentation iptables.8: nat table has four builtin chains extensions: NETMAP: add ' to:' prefix when printing NETMAP target extensions: NETMAP: fix iptables-save output connlabel: clarify default config path George Burgess IV (1): libxt_multiport: remove an unused variable Giuseppe Longo (1): configure: make libmnl and libnftnl hard requirements Guruswamy Basavaiah (4): iptables: extensions: iptables-translate prints extra "nft" after printing any error iptables-translate: translate iptables --flush iptables-translate: Printing the table name before chain name. 
iptables-translate: Don't print "nft" in iptables-restore-translate command Gustavo Zacarias (1): iptables: add xtables-config-parser.h to BUILT_SOURCES Janani Ravichandran (1): extensions: libip6t_rt.c: Add translation to nft Jordan Yelloz (1): extensions: added AR substitution Keno Fischer (1): build: Fix two compile errors during out-of-tree build Laura Garcia Liebana (12): extensions: libip6t_icmp6: Add translation to nft extensions: libipt_LOG: Avoid to print the default log level in the translation extensions: libipt_icmp: Add translation to nft extensions: libipt_REJECT: Avoid to print the default reject with value in the translation extensions: libip6t_REJECT: Avoid to print the default reject with value in the translation extensions: libxt_ipcomp: Add translation to nft extensions: libip6t_hbh: Add translation to nft extensions: libxt_multiport: Add translation to nft extensions: libxt_dscp: Add translation to nft extensions: libip6t_frag: Add translation to nft extensions: libxt_cgroup: Add translation to nft extensions: libxt_conntrack: Add translation to nft Liping Zhang (27): extensions: libxt_limit: fix a wrong translation to nft rule extensions: libxt_mark: fix a wrong translation to nft when mask is specified extensions: libxt_TRACE: Add translation to nft extensions: libipt_realm: fix order of mask and id when do nft translation extensions: libxt_connlabel: fix crash when connlabel.conf is empty extensions: libxt_connlabel: Add translation to nft extensions: libxt_NFLOG: display nflog-size even if it is zero extensions: libxt_NFLOG: translate to nft log snaplen if nflog-size is specified extensions: libxt_NFLOG: add unit test to cover nflog-size with zero extensions: libxt_connlabel: add unit test iptables-translate: add in/out ifname wildcard match translation to nft extensions: libxt_CLASSIFY: Add translation to nft extensions: libipt_DNAT/SNAT: fix "OOM" when do translation to nft extensions: libip[6]t_SNAT/DNAT: use the new nft syntax when do 
xlate extensions: libip[6]t_REDIRECT: use new nft syntax when do xlate extensions: libip6t_SNAT/DNAT: add square bracket in xlat output when port is specified extensions: libipt_realm: add a missing space in translation extensions: libxt_iprange: rename "ip saddr" to "ip6 saddr" in ip6tables-xlate extensions: libxt_iprange: handle the invert flag properly in translation extensions: libxt_devgroup: handle the invert flag properly in translation extensions: libxt_ipcomp: add range support in translation extensions: libxt_quota: add translation to nft extensions: libxt_DSCP: add translation to nft extensions: libxt_statistic: add translation to nft extensions: LOG: add log flags translation to nft extensions: libxt_connbytes: Add translation to nft extensions: libxt_rpfilter: add translation to nft Loganaden Velvindron (1): libxt_TCPOPTSTRIP: Fix musl compatibility Pablo M. Bermudo Garay (11):
Re: [PATCH] cfg80211 debugfs: Cleanup some checkpatch issues
On Fri, 2017-01-27 at 22:26 +0300, Pichugin Dmitry wrote:
> This fixes the checkpatch.pl warnings:
> * Macros should not use a trailing semicolon.
> * Spaces required around that '='.
> * Symbolic permissions 'S_IRUGO' are not preferred.
> * Macro argument reuse 'buflen' - possible side-effects

I really see no point in any of this.

johannes
Re: [PATCH net-next 1/4] mlx5: Make building eswitch configurable
On Fri, Jan 27, 2017 at 8:33 PM, Tom Herbertwrote: > On Fri, Jan 27, 2017 at 10:19 AM, Saeed Mahameed > wrote: >> On Fri, Jan 27, 2017 at 1:32 AM, Tom Herbert wrote: >>> Add a configuration option (CONFIG_MLX5_CORE_ESWITCH) for controlling >>> whether the eswitch code is built. Change Kconfig and Makefile >>> accordingly. >>> >>> Signed-off-by: Tom Herbert >>> --- >>> drivers/net/ethernet/mellanox/mlx5/core/Kconfig | 11 +++ >>> drivers/net/ethernet/mellanox/mlx5/core/Makefile | 6 +- >>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 92 >>> +-- >>> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 39 +++--- >>> drivers/net/ethernet/mellanox/mlx5/core/eq.c | 4 +- >>> drivers/net/ethernet/mellanox/mlx5/core/main.c| 16 ++-- >>> drivers/net/ethernet/mellanox/mlx5/core/sriov.c | 6 +- >>> 7 files changed, 125 insertions(+), 49 deletions(-) >>> >>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig >>> b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig >>> index ddb4ca4..27aae58 100644 >>> --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig >>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig >>> @@ -30,3 +30,14 @@ config MLX5_CORE_EN_DCB >>> This flag is depended on the kernel's DCB support. >>> >>> If unsure, set to Y >>> + >>> +config MLX5_CORE_EN_ESWITCH >>> + bool "Ethernet switch" >>> + default y >>> + depends on MLX5_CORE_EN >>> + ---help--- >>> + Say Y here if you want to use Ethernet switch (E-switch). E-Switch >>> + is the software entity that represents and manages ConnectX4 >>> + inter-HCA ethernet l2 switching. 
>>> + >>> + If unsure, set to Y >>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile >>> b/drivers/net/ethernet/mellanox/mlx5/core/Makefile >>> index 9f43beb..17025d8 100644 >>> --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile >>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile >>> @@ -5,9 +5,11 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o >>> pagealloc.o \ >>> mad.o transobj.o vport.o sriov.o fs_cmd.o fs_core.o \ >>> fs_counters.o rl.o lag.o dev.o >>> >>> -mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o eswitch.o eswitch_offloads.o \ >>> +mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o \ >>> en_main.o en_common.o en_fs.o en_ethtool.o en_tx.o \ >>> en_rx.o en_rx_am.o en_txrx.o en_clock.o vxlan.o \ >>> - en_tc.o en_arfs.o en_rep.o en_fs_ethtool.o en_selftest.o >>> + en_tc.o en_arfs.o en_fs_ethtool.o en_selftest.o >>> >>> mlx5_core-$(CONFIG_MLX5_CORE_EN_DCB) += en_dcbnl.o >>> + >>> +mlx5_core-$(CONFIG_MLX5_CORE_EN_ESWITCH) += eswitch.o eswitch_offloads.o >>> en_rep.o >>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> index e829143..1db4d98 100644 >>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> @@ -38,7 +38,9 @@ >>> #include >>> #include "en.h" >>> #include "en_tc.h" >>> +#ifdef CONFIG_MLX5_CORE_EN_ESWITCH >>> #include "eswitch.h" >>> +#endif >> >> Wouldn't it be cleaner if we left the main code (en_main.c) untouched >> and kept this >> #include "eswitch.h" and instead of filling the main flow code with >> #ifdefs, in eswitch.h >> we can create eswitch mock API functions when >> CONFIG_MLX5_CORE_EN_ESWITCH is not enabled ? the main flow will be >> clean from ifdefs and will complie with mock functions. >> >> we did this with accelerated RFS feature. look for "#ifndef >> CONFIG_RFS_ACCEL" in en.h >> > There are still occurrences of CONFIG_RFS_ACCEL in en_main.c and > main.c. 
For eswitch it's a header problem, not everything related to
> eswitch was extracted out of main through backend functions. There is a
> lot of code related to eswitch that is intertwined with the core
> code.
>

Interesting, I just did a quick look and it seems to me all eswitch logic in
en_main.c can be kept untouched if we have the right mock functions. On the
other hand, it seems that there are a lot of eswitch functions to mock, so I
am not sure it is a good thing anymore. Let's leave it as is for now.
Re: [iproute PATCH] man: tc-csum.8: Fix example
On Fri, Jan 27, 2017 at 12:15:01PM +0100, Phil Sutter wrote:
> +# tc filter add dev eth0 prio 1 protocol ip parent : \\
> u32 match ip src 192.168.1.100/32 flowid :1 \\
> - action pedit munge ip dst set 0x12345678 pipe \\
> + action pedit munge ip dst set 1.2.3.4 pipe \\
>

Just nitpicking here, but IMHO examples like this should better use IP
addresses reserved for documentation (192.0.2.0/24, 198.51.100.0/24 or
203.0.113.0/24).
[net 8/8] net/mlx5e: Check ets capability before ets query FW command
From: Moshe Shemesh

On the dcbnl callback getpgtccfgtx, the driver should check the ets
capability before the ets query command is sent to firmware.

It is valid to return from this void function without changing the in/out
parameters, as these parameters are initialized to DCB_ATTR_VALUE_UNDEFINED.

Signed-off-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index 35f9ae037ba0..0523ed47f597 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -511,6 +511,11 @@ static void mlx5e_dcbnl_getpgtccfgtx(struct net_device *netdev,
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5_core_dev *mdev = priv->mdev;
 
+	if (!MLX5_CAP_GEN(priv->mdev, ets)) {
+		netdev_err(netdev, "%s, ets is not supported\n", __func__);
+		return;
+	}
+
 	if (priority >= CEE_DCBX_MAX_PRIO) {
 		netdev_err(netdev,
 			   "%s, priority is out of range\n", __func__);
-- 
2.11.0
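The commit message's argument, that an early return from a void getter is safe because the caller pre-initializes the outputs, can be illustrated with a short userspace sketch. The names (get_pg_cfg, query_pgid, ATTR_UNDEFINED) and the sentinel value 0xff are assumptions made for illustration; only the pattern is taken from the patch.

```c
#include <stdbool.h>

/* Assumed sentinel, standing in for DCB_ATTR_VALUE_UNDEFINED. */
#define ATTR_UNDEFINED 0xff

/* Hypothetical getter: fills *pgid only when the capability is present. */
static void get_pg_cfg(bool has_ets, int priority, unsigned char *pgid)
{
	if (!has_ets)
		return;		/* outputs keep the caller's sentinel */

	*pgid = (unsigned char)priority;
}

/* Caller side: pre-initialize the output, then call the void getter. */
static unsigned char query_pgid(bool has_ets, int priority)
{
	unsigned char pgid = ATTR_UNDEFINED;

	get_pg_cfg(has_ets, priority, &pgid);
	return pgid;
}
```

When the capability is missing the caller sees the sentinel rather than stale or uninitialized data, so no error code needs to travel back through the void interface.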
Re: [PATCH] net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause
On 01/27/2017 12:39 PM, Sean Nyekjaer wrote:
> As per commit "net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause"
> this phy driver should not set these feature bits.
>
> Signed-off-by: Sean Nyekjaer
> Fixes: 9d162ed69f51 ("net: phy: micrel: add support for KSZ8795")

Reviewed-by: Florian Fainelli
-- 
Florian
Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver
On Fri, Jan 27, 2017 at 3:06 PM, Sowmini Varadhan wrote:
> On (01/27/17 14:29), Willem de Bruijn wrote:
>>
>> As your patch states, the contract is that any packet delivered to a
>> driver has the entire L2 in its linear section. Drivers are not required
>> to be robust against shorter packets, so there is no reason to test
>> those.
>>
>> One option is to limit your fix to known fixed-header protocols.
>> In these cases hard_header_len is the minimum, so anything
>> smaller must be dropped.
>
> yes, but how would you know that this is a fixed-header protocol
> or a var-hdrlen protocol? AIUI the hard_header_len itself will not
> tell you this info: it will be 77 for ax25, 14 for ethernet,
> but that does not tell me that ax25 is the "robust-er" driver
> with a min requirement of 21 for the hdrlen.

Right. Identifying the outliers is the hard part.

> That's why I was thinking of a IFF_L2_VARHDRLEN in the priv_flags
> of the net_device.
>
>> For protocols with variable header length it is fine to send packets
>> shorter than hard_header_len, even with corrupted content (i.e.,
>> even if they would fail that protocol's validate callback), as long as
>> they exceed the minimum length. ax25 already has a min length
>> check through its protocol-specific validate callback.
>
> Another option that comes to mind.. the real thorn-in-the-flesh
> here is the CAP_SYS_RAWIO check. Would it be a better idea to ask
> the test-suites (since they seem to be the major consumer of
> that path) to use a special PF_PACKET socket option instead, that

Introducing a sysctl has the same effect. It is not possible to identify
all callers dependent on the current ABI.

I see these options

- make capable() check conditional on sysctl (or interface flag, ..)
- limit capable() check to drivers with .validate callback
- hardcode a list of known fixed length protocols that must fail
- let privileged applications shoot themselves in the foot (change nothing).

The first will break tests, though with a runtime fix: flip the flag.
The second will break variable length header protocols unless you
exhaustively search for all variable length protocols and add validate
callbacks first.
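The distinction drawn in this thread between fixed-header and variable-header protocols can be sketched in userspace C. This is only an illustration under stated assumptions: `struct demo_dev`, `demo_header_ok()` and `demo_validate_min21()` are hypothetical names, not kernel API, and the sketch deliberately ignores the CAP_SYS_RAWIO question. The values 14, 77 and 21 are the ones quoted above for ethernet and ax25.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits of struct net_device discussed above. */
struct demo_dev {
	unsigned int hard_header_len;
	/* Optional protocol-specific check; NULL models a fixed-header protocol. */
	bool (*validate)(const unsigned char *hdr, unsigned int len);
};

/* Models a variable-header protocol whose minimum header length is 21 bytes. */
static bool demo_validate_min21(const unsigned char *hdr, unsigned int len)
{
	(void)hdr; /* a real callback would also inspect the header bytes */
	return len >= 21;
}

/*
 * Accept a frame if it carries at least hard_header_len bytes. Shorter
 * frames are accepted only when the protocol supplies a validate callback
 * that approves them; without a callback, hard_header_len is treated as
 * the minimum and the frame is rejected.
 */
static bool demo_header_ok(const struct demo_dev *dev,
			   const unsigned char *hdr, unsigned int len)
{
	if (len >= dev->hard_header_len)
		return true;
	if (dev->validate)
		return dev->validate(hdr, len);
	return false;
}
```

With an ethernet-like device (`hard_header_len` 14, no callback) a 10-byte frame is rejected, while an ax25-like device (`hard_header_len` 77, callback present) still accepts a 30-byte frame.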
pull-request: can 2017-01-27
Hello David,

this is a pull request for net/master. It consists of a single patch by
Eric Dumazet, it fixes a kernel panic at security_sock_rcv_skb.

regards,
Marc

---

The following changes since commit 950eabbd6ddedc1b08350b9169a6a51b130ebaaf:

  ISDN: eicon: silence misleading array-bounds warning (2017-01-27 11:27:34 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git tags/linux-can-fixes-for-4.10-20170127

for you to fetch changes up to f30dc84e2d5ef45a715de546529e7693733b11fb:

  can: Fix kernel panic at security_sock_rcv_skb (2017-01-27 21:37:51 +0100)

linux-can-fixes-for-4.10-20170127

Eric Dumazet (1):
      can: Fix kernel panic at security_sock_rcv_skb

 include/linux/can/core.h | 7 +++
 net/can/af_can.c         | 12 ++--
 net/can/af_can.h         | 3 ++-
 net/can/bcm.c            | 4 ++--
 net/can/gw.c             | 2 +-
 net/can/raw.c            | 4 ++--
 6 files changed, 20 insertions(+), 12 deletions(-)
[PATCH] can: Fix kernel panic at security_sock_rcv_skb
From: Eric Dumazet

Zhang Yanmin reported crashes [1] and provided a patch adding a
synchronize_rcu() call in can_rx_unregister()

The main problem seems that the sockets themselves are not RCU
protected.

If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.

Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
ease stable backports with the following fix instead.

[1]
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0

Call Trace:
 [] security_sock_rcv_skb+0x4c/0x60
 [] sk_filter+0x41/0x210
 [] sock_queue_rcv_skb+0x53/0x3a0
 [] raw_rcv+0x2a3/0x3c0
 [] can_rcv_filter+0x12b/0x370
 [] can_receive+0xd9/0x120
 [] can_rcv+0xab/0x100
 [] __netif_receive_skb_core+0xd8c/0x11f0
 [] __netif_receive_skb+0x24/0xb0
 [] process_backlog+0x127/0x280
 [] net_rx_action+0x33b/0x4f0
 [] __do_softirq+0x184/0x440
 [] do_softirq_own_stack+0x1c/0x30
 [] do_softirq.part.18+0x3b/0x40
 [] do_softirq+0x1d/0x20
 [] netif_rx_ni+0xe5/0x110
 [] slcan_receive_buf+0x507/0x520
 [] flush_to_ldisc+0x21c/0x230
 [] process_one_work+0x24f/0x670
 [] worker_thread+0x9d/0x6f0
 [] ? rescuer_thread+0x480/0x480
 [] kthread+0x12c/0x150
 [] ret_from_fork+0x3f/0x70

Reported-by: Zhang Yanmin
Signed-off-by: Eric Dumazet
Acked-by: Oliver Hartkopp
Cc: linux-stable
Signed-off-by: Marc Kleine-Budde
---
 include/linux/can/core.h | 7 +++
 net/can/af_can.c         | 12 ++--
 net/can/af_can.h         | 3 ++-
 net/can/bcm.c            | 4 ++--
 net/can/gw.c             | 2 +-
 net/can/raw.c            | 4 ++--
 6 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/include/linux/can/core.h b/include/linux/can/core.h
index a0875001b13c..df08a41d5be5 100644
--- a/include/linux/can/core.h
+++ b/include/linux/can/core.h
@@ -45,10 +45,9 @@ struct can_proto {
 extern int can_proto_register(const struct can_proto *cp);
 extern void can_proto_unregister(const struct can_proto *cp);
 
-extern int can_rx_register(struct net_device *dev, canid_t can_id,
-			    canid_t mask,
-			    void (*func)(struct sk_buff *, void *),
-			    void *data, char *ident);
+int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
+		    void (*func)(struct sk_buff *, void *),
+		    void *data, char *ident, struct sock *sk);
 
 extern void can_rx_unregister(struct net_device *dev, canid_t can_id,
 			      canid_t mask,
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1108079d934f..5488e4a6ccd0 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -445,6 +445,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, canid_t *mask,
  * @func: callback function on filter match
  * @data: returned parameter for callback function
  * @ident: string for calling module identification
+ * @sk: socket pointer (might be NULL)
  *
  * Description:
  *  Invokes the callback function with the received sk_buff and the given
@@ -468,7 +469,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, canid_t *mask,
  */
 int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
 		    void (*func)(struct sk_buff *, void *), void *data,
-		    char *ident)
+		    char *ident, struct sock *sk)
 {
 	struct receiver *r;
 	struct hlist_head *rl;
@@ -496,6 +497,7 @@ int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
 	r->func    = func;
 	r->data    = data;
 	r->ident   = ident;
+	r->sk      = sk;
 
 	hlist_add_head_rcu(>list, rl);
 	d->entries++;
@@ -520,8 +522,11 @@ EXPORT_SYMBOL(can_rx_register);
 static void can_rx_delete_receiver(struct rcu_head *rp)
 {
 	struct receiver *r = container_of(rp, struct receiver, rcu);
+	struct sock *sk = r->sk;
 
 	kmem_cache_free(rcv_cache, r);
+	if (sk)
+		sock_put(sk);
 }
 
 /**
@@ -596,8 +601,11 @@ void can_rx_unregister(struct net_device *dev, canid_t can_id, canid_t mask,
 	spin_unlock(_rcvlists_lock);
 
 	/* schedule the receiver item for deletion */
-	if (r)
+	if (r) {
+		if (r->sk)
+			sock_hold(r->sk);
 		call_rcu(>rcu, can_rx_delete_receiver);
+	}
 }
 EXPORT_SYMBOL(can_rx_unregister);
 
diff --git a/net/can/af_can.h b/net/can/af_can.h
index fca0fe9fc45a..b86f5129e838 100644
--- a/net/can/af_can.h
+++ b/net/can/af_can.h
@@ -50,13 +50,14 @@
 
 struct receiver {
 	struct hlist_node list;
-	struct rcu_head rcu;
 	canid_t can_id;
 	canid_t mask;
 	unsigned long matches;
 	void (*func)(struct sk_buff *, void *);
 	void *data;
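The idea behind the fix, pinning the socket with an extra reference before the deferred free and dropping it in the callback, can be sketched in plain userspace C. This is a simplified model and not the kernel code: `demo_sock`, `demo_receiver` and the `demo_*` helpers are hypothetical stand-ins for `struct sock`, `struct receiver`, `sock_hold()`, `sock_put()` and the `call_rcu()` callback, and the grace period is modelled by the caller invoking the callback later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock: only the reference count matters here. */
struct demo_sock {
	int refcnt;
};

/* Hypothetical stand-in for struct receiver. */
struct demo_receiver {
	struct demo_sock *sk; /* may be NULL, as in the patch */
	bool freed;           /* set once the deferred callback has run */
};

static void demo_sock_hold(struct demo_sock *sk) { sk->refcnt++; }
static void demo_sock_put(struct demo_sock *sk)  { sk->refcnt--; }

/*
 * Unregister step: take an extra reference so the socket stays alive
 * while readers may still be delivering to it. The kernel patch follows
 * this with call_rcu(); the sketch leaves the callback to the caller.
 */
static void demo_rx_unregister(struct demo_receiver *r)
{
	if (r->sk)
		demo_sock_hold(r->sk);
}

/* Deferred callback: release the receiver, then drop the pinned reference. */
static void demo_rx_delete_receiver(struct demo_receiver *r)
{
	struct demo_sock *sk = r->sk; /* read before the receiver is released */

	r->freed = true;
	if (sk)
		demo_sock_put(sk);
}
```

The point of the ordering is that the socket's owner can drop its own reference between the two steps without the count reaching zero.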
Re: [net-next] openvswitch: Simplify do_execute_actions().
On Wed, Jan 25, 2017 at 9:24 PM, Andy Zhou wrote:
> do_execute_actions() implements a worthwhile optimization: in case
> an output action is the last action in an action list, skb_clone()
> can be avoided by outputing the current skb. However, the
> implementation is more complicated than necessary. This patch
> simplify this logic.
>
> Signed-off-by: Andy Zhou
> ---
>  net/openvswitch/actions.c | 40 +++-
>  1 file changed, 19 insertions(+), 21 deletions(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 514f7bc..3866608 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -830,6 +830,9 @@ static void do_output(struct datapath *dp, struct sk_buff
> *skb, int out_port,
>  {
>  	struct vport *vport = ovs_vport_rcu(dp, out_port);
>
> +	if (unlikely(!skb))
> +		return;
> +

Patch looks good to me. But I wanted to know if you considered moving
this check to do_execute_actions() in case skb-clone is done? This way
we can avoid this unlikely check from likely case :)

>  	if (likely(vport)) {
>  		u16 mru = OVS_CB(skb)->mru;
>  		u32 cutlen = OVS_CB(skb)->cutlen;
> @@ -1141,12 +1144,6 @@ static int do_execute_actions(struct datapath *dp,
> struct sk_buff *skb,
>  			      struct sw_flow_key *key,
>  			      const struct nlattr *attr, int len)
>  {
> -	/* Every output action needs a separate clone of 'skb', but the common
> -	 * case is just a single output action, so that doing a clone and
> -	 * then freeing the original skbuff is wasteful. So the following
> code
> -	 * is slightly obscure just to avoid that.
> -	 */
> -	int prev_port = -1;
>  	const struct nlattr *a;
>  	int rem;
>
> @@ -1154,20 +1151,25 @@ static int do_execute_actions(struct datapath *dp,
> struct sk_buff *skb,
>  	     a = nla_next(a, )) {
>  		int err = 0;
>
> -		if (unlikely(prev_port != -1)) {
> -			struct sk_buff *out_skb = skb_clone(skb, GFP_ATOMIC);
> +		switch (nla_type(a)) {
> +		case OVS_ACTION_ATTR_OUTPUT: {
> +			int port = nla_get_u32(a);
>
> -			if (out_skb)
> -				do_output(dp, out_skb, prev_port, key);
> +			/* Every output action needs a separate clone
> +			 * of 'skb', In case the output action is the
> +			 * last action, cloning can be avoided.
> +			 */
> +			if (nla_is_last(a, rem)) {
> +				do_output(dp, skb, port, key);
> +				/* 'skb' has been used for output.
> +				 */
> +				return 0;
> +			}
>
> +			do_output(dp, skb_clone(skb, GFP_ATOMIC), port, key);
>  			OVS_CB(skb)->cutlen = 0;
> -			prev_port = -1;
> -		}
> -
> -		switch (nla_type(a)) {
> -		case OVS_ACTION_ATTR_OUTPUT:
> -			prev_port = nla_get_u32(a);
>  			break;
> +		}
>
>  		case OVS_ACTION_ATTR_TRUNC: {
>  			struct ovs_action_trunc *trunc = nla_data(a);
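The control flow of the simplified loop can be modelled outside the kernel. The sketch below only counts how many clones and outputs a given action list would cause. `demo_action`, `demo_stats` and `demo_execute()` are hypothetical names, and clone failure (the NULL skb case discussed in the review) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action kinds: an output action and anything else. */
enum demo_action { DEMO_OUTPUT, DEMO_OTHER };

/* Counters standing in for the real work done on the packet. */
struct demo_stats {
	int clones;  /* number of clones that would be made */
	int outputs; /* number of packets handed to the output path */
};

/*
 * Walk the action list. An output action that is the last action sends
 * the original packet and stops; any earlier output action sends a clone.
 * Returns true if the original packet was consumed by an output.
 */
static bool demo_execute(const enum demo_action *acts, size_t n,
			 struct demo_stats *st)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (acts[i] != DEMO_OUTPUT)
			continue;

		if (i == n - 1) {
			st->outputs++; /* original packet is used directly */
			return true;
		}

		st->clones++;  /* not the last action: output a clone */
		st->outputs++;
	}
	return false;
}
```

For the list output, other, output this yields one clone and two outputs; for output, other it yields one clone, one output, and the original packet is left for the caller to free.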
Re: [PATCH 2/6] wl1251: Use request_firmware_prefer_user() for loading NVS calibration data
On 27-1-2017 8:33, Kalle Valo wrote: > Pali Rohárwrites: > >> NVS calibration data for wl1251 are model specific. Every one device with >> wl1251 chip has different and calibrated in factory. >> >> Not all wl1251 chips have own EEPROM where are calibration data stored. And >> in that case there is no "standard" place. Every device has stored them on >> different place (some in rootfs file, some in dedicated nand partition, >> some in another proprietary structure). >> >> Kernel wl1251 driver cannot support every one different storage decided by >> device manufacture so it will use request_firmware_prefer_user() call for >> loading NVS calibration data and userspace helper will be responsible to >> prepare correct data. >> >> In case userspace helper fails request_firmware_prefer_user() still try to >> load data file directly from VFS as fallback mechanism. >> >> On Nokia N900 device which has wl1251 chip, NVS calibration data are stored >> in CAL nand partition. CAL is proprietary Nokia key/value format for nand >> devices. >> >> With this patch it is finally possible to load correct model specific NVS >> calibration data for Nokia N900. >> >> Signed-off-by: Pali Rohár >> --- >> drivers/net/wireless/ti/wl1251/Kconfig |1 + >> drivers/net/wireless/ti/wl1251/main.c |2 +- >> 2 files changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/net/wireless/ti/wl1251/Kconfig >> b/drivers/net/wireless/ti/wl1251/Kconfig >> index 7142ccf..affe154 100644 >> --- a/drivers/net/wireless/ti/wl1251/Kconfig >> +++ b/drivers/net/wireless/ti/wl1251/Kconfig >> @@ -2,6 +2,7 @@ config WL1251 >> tristate "TI wl1251 driver support" >> depends on MAC80211 >> select FW_LOADER >> +select FW_LOADER_USER_HELPER >> select CRC7 >> ---help--- >>This will enable TI wl1251 driver support. 
The drivers make >> diff --git a/drivers/net/wireless/ti/wl1251/main.c >> b/drivers/net/wireless/ti/wl1251/main.c >> index 208f062..24f8866 100644 >> --- a/drivers/net/wireless/ti/wl1251/main.c >> +++ b/drivers/net/wireless/ti/wl1251/main.c >> @@ -110,7 +110,7 @@ static int wl1251_fetch_nvs(struct wl1251 *wl) >> struct device *dev = wiphy_dev(wl->hw->wiphy); >> int ret; >> >> -ret = request_firmware(, WL1251_NVS_NAME, dev); >> +ret = request_firmware_prefer_user(, WL1251_NVS_NAME, dev); > > I don't see the need for this. Just remove the default nvs file from > filesystem and the fallback user helper will be always used, right? Indeed. The only remaining issue would be that an error message is logged. Also note the fallback is only used if selected in Kconfig. > Like we discussed earlier, the default nvs file should not be used by > normal users. Yup. Regards, Arend
[PATCH] net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause
As per commit "net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause"
this phy driver should not set these feature bits.

Signed-off-by: Sean Nyekjaer
Fixes: 9d162ed69f51 ("net: phy: micrel: add support for KSZ8795")
---
 drivers/net/phy/micrel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index e55809c5beb7..6742070ca676 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -1012,7 +1012,7 @@ static struct phy_driver ksphy_driver[] = {
 	.phy_id		= PHY_ID_KSZ8795,
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.name		= "Micrel KSZ8795",
-	.features	= (SUPPORTED_Pause | SUPPORTED_Asym_Pause),
+	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
 	.config_aneg	= ksz8873mll_config_aneg,
-- 
2.11.0
[PATCH net-next 0/9] net: dsa: preparatory patches for multi-chip
In order to introduce support for multi-chip configuration, we need to
do a few enhancements. This patchset makes the number of ports in a
switch dynamic (instead of capping to DSA_MAX_PORTS), stores the switch
and index of a port in the dsa_port structure, uses it in the slave
private structure, and exposes the bridge device a port belongs to.

Vivien Didelot (9):
  net: dsa: variable number of ports
  net: dsa: use ds->num_ports when possible
  net: dsa: add ds and index to dsa_port
  net: dsa: store a dsa_port in dsa_slave_priv
  net: dsa: move bridge device in dsa_port
  net: dsa: pass bridge device when a port leaves
  net: dsa: mv88e6xxx: use dsa_port's bridge pointer
  net: dsa: qca8k: use dsa_port's bridge pointer
  net: dsa: b53: use dsa_port's bridge pointer

 drivers/net/dsa/b53/b53_common.c      | 18 ++--
 drivers/net/dsa/b53/b53_priv.h        | 3 +-
 drivers/net/dsa/mv88e6xxx/chip.c      | 33 +++---
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 6 --
 drivers/net/dsa/qca8k.c               | 17 ++--
 drivers/net/dsa/qca8k.h               | 1 -
 include/net/dsa.h                     | 12 ++-
 net/dsa/dsa.c                         | 21 ++--
 net/dsa/dsa2.c                        | 34 +--
 net/dsa/dsa_priv.h                    | 9 +-
 net/dsa/slave.c                       | 182 +-
 net/dsa/tag_brcm.c                    | 6 +-
 net/dsa/tag_dsa.c                     | 10 +-
 net/dsa/tag_edsa.c                    | 10 +-
 net/dsa/tag_qca.c                     | 2 +-
 net/dsa/tag_trailer.c                 | 4 +-
 16 files changed, 186 insertions(+), 182 deletions(-)

-- 
2.11.0
[PATCH net-next 9/9] net: dsa: b53: use dsa_port's bridge pointer
Now that DSA exposes the bridge device pointer to which a port belongs, use it when programming the port based VLANs and thus remove the cache. Signed-off-by: Vivien Didelot--- drivers/net/dsa/b53/b53_common.c | 9 +++-- drivers/net/dsa/b53/b53_priv.h | 1 - 2 files changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 32fdcf5570c8..3a7d16b6c3eb 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1308,7 +1308,7 @@ int b53_fdb_dump(struct dsa_switch *ds, int port, } EXPORT_SYMBOL(b53_fdb_dump); -int b53_br_join(struct dsa_switch *ds, int port, struct net_device *bridge) +int b53_br_join(struct dsa_switch *ds, int port, struct net_device *br) { struct b53_device *dev = ds->priv; s8 cpu_port = ds->dst->cpu_port; @@ -1326,11 +1326,10 @@ int b53_br_join(struct dsa_switch *ds, int port, struct net_device *bridge) b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg); } - dev->ports[port].bridge_dev = bridge; b53_read16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), ); b53_for_each_port(dev, i) { - if (dev->ports[i].bridge_dev != bridge) + if (ds->ports[i].bridge_dev != br) continue; /* Add this local port to the remote port VLAN control @@ -1357,7 +1356,6 @@ EXPORT_SYMBOL(b53_br_join); void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br) { struct b53_device *dev = ds->priv; - struct net_device *bridge = dev->ports[port].bridge_dev; struct b53_vlan *vl = >vlans[0]; s8 cpu_port = ds->dst->cpu_port; unsigned int i; @@ -1367,7 +1365,7 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br) b53_for_each_port(dev, i) { /* Don't touch the remaining ports */ - if (dev->ports[i].bridge_dev != bridge) + if (ds->ports[i].bridge_dev != br) continue; b53_read16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(i), ); @@ -1382,7 +1380,6 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br) b53_write16(dev, 
B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), pvlan); dev->ports[port].vlan_ctl_mask = pvlan; - dev->ports[port].bridge_dev = NULL; if (is5325(dev) || is5365(dev)) pvid = 1; diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h index 5dafb70e75fc..9d87889728ac 100644 --- a/drivers/net/dsa/b53/b53_priv.h +++ b/drivers/net/dsa/b53/b53_priv.h @@ -70,7 +70,6 @@ enum { struct b53_port { u16 vlan_ctl_mask; - struct net_device *bridge_dev; }; struct b53_vlan { -- 2.11.0
[PATCH net-next 4/9] net: dsa: store a dsa_port in dsa_slave_priv
Store a pointer to the dsa_port structure in the dsa_slave_priv structure, instead of the switch/port index. This will allow to store more information such as the bridge device, needed in DSA drivers for multi-chip configuration. Signed-off-by: Vivien Didelot--- net/dsa/dsa_priv.h| 8 +-- net/dsa/slave.c | 164 +- net/dsa/tag_brcm.c| 4 +- net/dsa/tag_dsa.c | 8 +-- net/dsa/tag_edsa.c| 8 +-- net/dsa/tag_qca.c | 2 +- net/dsa/tag_trailer.c | 2 +- 7 files changed, 96 insertions(+), 100 deletions(-) diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index 16194a4bb2fe..c519bd0e9206 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -25,12 +25,8 @@ struct dsa_slave_priv { struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev); - /* -* Which switch this port is a part of, and the port index -* for this port. -*/ - struct dsa_switch *parent; - u8 port; + /* DSA port data, such as switch, port index, etc. */ + struct dsa_port *dp; /* * The phylib phy_device pointer for the PHY connected diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 759824ba5545..2ea220bc4bba 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -61,7 +61,7 @@ static int dsa_slave_get_iflink(const struct net_device *dev) { struct dsa_slave_priv *p = netdev_priv(dev); - return p->parent->dst->master_netdev->ifindex; + return p->dp->ds->dst->master_netdev->ifindex; } static inline bool dsa_port_is_bridged(struct dsa_slave_priv *p) @@ -96,8 +96,8 @@ static void dsa_port_set_stp_state(struct dsa_switch *ds, int port, u8 state) static int dsa_slave_open(struct net_device *dev) { struct dsa_slave_priv *p = netdev_priv(dev); - struct net_device *master = p->parent->dst->master_netdev; - struct dsa_switch *ds = p->parent; + struct net_device *master = p->dp->ds->dst->master_netdev; + struct dsa_switch *ds = p->dp->ds; u8 stp_state = dsa_port_is_bridged(p) ? 
BR_STATE_BLOCKING : BR_STATE_FORWARDING; int err; @@ -123,12 +123,12 @@ static int dsa_slave_open(struct net_device *dev) } if (ds->ops->port_enable) { - err = ds->ops->port_enable(ds, p->port, p->phy); + err = ds->ops->port_enable(ds, p->dp->index, p->phy); if (err) goto clear_promisc; } - dsa_port_set_stp_state(ds, p->port, stp_state); + dsa_port_set_stp_state(ds, p->dp->index, stp_state); if (p->phy) phy_start(p->phy); @@ -151,8 +151,8 @@ static int dsa_slave_open(struct net_device *dev) static int dsa_slave_close(struct net_device *dev) { struct dsa_slave_priv *p = netdev_priv(dev); - struct net_device *master = p->parent->dst->master_netdev; - struct dsa_switch *ds = p->parent; + struct net_device *master = p->dp->ds->dst->master_netdev; + struct dsa_switch *ds = p->dp->ds; if (p->phy) phy_stop(p->phy); @@ -168,9 +168,9 @@ static int dsa_slave_close(struct net_device *dev) dev_uc_del(master, dev->dev_addr); if (ds->ops->port_disable) - ds->ops->port_disable(ds, p->port, p->phy); + ds->ops->port_disable(ds, p->dp->index, p->phy); - dsa_port_set_stp_state(ds, p->port, BR_STATE_DISABLED); + dsa_port_set_stp_state(ds, p->dp->index, BR_STATE_DISABLED); return 0; } @@ -178,7 +178,7 @@ static int dsa_slave_close(struct net_device *dev) static void dsa_slave_change_rx_flags(struct net_device *dev, int change) { struct dsa_slave_priv *p = netdev_priv(dev); - struct net_device *master = p->parent->dst->master_netdev; + struct net_device *master = p->dp->ds->dst->master_netdev; if (change & IFF_ALLMULTI) dev_set_allmulti(master, dev->flags & IFF_ALLMULTI ? 
1 : -1); @@ -189,7 +189,7 @@ static void dsa_slave_change_rx_flags(struct net_device *dev, int change) static void dsa_slave_set_rx_mode(struct net_device *dev) { struct dsa_slave_priv *p = netdev_priv(dev); - struct net_device *master = p->parent->dst->master_netdev; + struct net_device *master = p->dp->ds->dst->master_netdev; dev_mc_sync(master, dev); dev_uc_sync(master, dev); @@ -198,7 +198,7 @@ static void dsa_slave_set_rx_mode(struct net_device *dev) static int dsa_slave_set_mac_address(struct net_device *dev, void *a) { struct dsa_slave_priv *p = netdev_priv(dev); - struct net_device *master = p->parent->dst->master_netdev; + struct net_device *master = p->dp->ds->dst->master_netdev; struct sockaddr *addr = a; int err; @@ -228,16
[PATCH net-next 8/9] net: dsa: qca8k: use dsa_port's bridge pointer
Now that DSA exposes the bridge device pointer to which a port belongs, use it when programming the port based VLANs and thus remove the cache. Signed-off-by: Vivien Didelot--- drivers/net/dsa/qca8k.c | 12 drivers/net/dsa/qca8k.h | 1 - 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c index c85b187aa3d9..a4fd4ccf7b67 100644 --- a/drivers/net/dsa/qca8k.c +++ b/drivers/net/dsa/qca8k.c @@ -746,17 +746,14 @@ qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state) } static int -qca8k_port_bridge_join(struct dsa_switch *ds, int port, - struct net_device *bridge) +qca8k_port_bridge_join(struct dsa_switch *ds, int port, struct net_device *br) { struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv; int port_mask = BIT(QCA8K_CPU_PORT); int i; - priv->port_sts[port].bridge_dev = bridge; - for (i = 1; i < QCA8K_NUM_PORTS; i++) { - if (priv->port_sts[i].bridge_dev != bridge) + if (ds->ports[i].bridge_dev != br) continue; /* Add this port to the portvlan mask of the other ports * in the bridge @@ -781,8 +778,7 @@ qca8k_port_bridge_leave(struct dsa_switch *ds, int port, struct net_device *br) int i; for (i = 1; i < QCA8K_NUM_PORTS; i++) { - if (priv->port_sts[i].bridge_dev != - priv->port_sts[port].bridge_dev) + if (ds->ports[i].bridge_dev != br) continue; /* Remove this port to the portvlan mask of the other ports * in the bridge @@ -791,7 +787,7 @@ qca8k_port_bridge_leave(struct dsa_switch *ds, int port, struct net_device *br) QCA8K_PORT_LOOKUP_CTRL(i), BIT(port)); } - priv->port_sts[port].bridge_dev = NULL; + /* Set the cpu port to be the only one in the portvlan mask of * this port */ diff --git a/drivers/net/dsa/qca8k.h b/drivers/net/dsa/qca8k.h index 201464719531..1ed4fac6cd6d 100644 --- a/drivers/net/dsa/qca8k.h +++ b/drivers/net/dsa/qca8k.h @@ -157,7 +157,6 @@ enum qca8k_fdb_cmd { struct ar8xxx_port_status { struct ethtool_eee eee; - struct net_device *bridge_dev; int enabled; }; -- 2.11.0
[PATCH net-next 3/9] net: dsa: add ds and index to dsa_port
Add the physical switch instance and port index a DSA port belongs to to
the dsa_port structure. That can be used later to retrieve information
about a physical port when configuring a switch fabric, or lighten up
struct dsa_slave_priv.

Signed-off-by: Vivien Didelot
---
 include/net/dsa.h | 2 ++
 net/dsa/dsa2.c    | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 24e1d935ae68..6bd1f8b05dbd 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -140,6 +140,8 @@ struct dsa_switch_tree {
 };
 
 struct dsa_port {
+	struct dsa_switch	*ds;
+	unsigned int		index;
 	struct net_device	*netdev;
 	struct device_node	*dn;
 	unsigned int		ageing_time;
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 6e7b3e88b778..9f8cc26be9ea 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -670,6 +670,7 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n)
 {
 	size_t size = sizeof(struct dsa_switch) + n * sizeof(struct dsa_port);
 	struct dsa_switch *ds;
+	int i;
 
 	ds = devm_kzalloc(dev, size, GFP_KERNEL);
 	if (!ds)
@@ -678,6 +679,11 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n)
 	ds->dev = dev;
 	ds->num_ports = n;
 
+	for (i = 0; i < ds->num_ports; ++i) {
+		ds->ports[i].index = i;
+		ds->ports[i].ds = ds;
+	}
+
 	return ds;
 }
 EXPORT_SYMBOL_GPL(dsa_switch_alloc);
-- 
2.11.0
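The allocation pattern used by dsa_switch_alloc(), a structure ending in a flexible array member whose elements point back at their container, can be tried out in userspace. The following is a rough sketch that swaps devm_kzalloc() for calloc(); `demo_switch`, `demo_port` and `demo_switch_alloc()` are hypothetical names that mirror, but are not, the kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_switch;

/* Hypothetical stand-in for struct dsa_port. */
struct demo_port {
	struct demo_switch *ds; /* back-pointer to the owning switch */
	unsigned int index;     /* position of this port in ds->ports[] */
};

/* Hypothetical stand-in for struct dsa_switch. */
struct demo_switch {
	size_t num_ports;
	struct demo_port ports[]; /* flexible array member, must stay last */
};

/*
 * Allocate the switch and its n ports as one zeroed block, then fill in
 * each port's index and back-pointer. Returns NULL on allocation failure;
 * the caller releases the block with free().
 */
static struct demo_switch *demo_switch_alloc(size_t n)
{
	size_t size = sizeof(struct demo_switch) + n * sizeof(struct demo_port);
	struct demo_switch *ds = calloc(1, size);
	size_t i;

	if (!ds)
		return NULL;

	ds->num_ports = n;
	for (i = 0; i < ds->num_ports; i++) {
		ds->ports[i].index = (unsigned int)i;
		ds->ports[i].ds = ds;
	}

	return ds;
}
```

Because every port carries its own switch pointer and index, code holding only a port pointer can recover both, which is what lets later patches in the series replace the separate parent and port fields in dsa_slave_priv.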
[PATCH net-next 1/9] net: dsa: variable number of ports
Change the ports[DSA_MAX_PORTS] array of the dsa_switch structure for a zero-length array, allocated at the same time as the dsa_switch structure itself. A dsa_switch_alloc() helper is provided for that. This commit brings no functional change yet since we pass DSA_MAX_PORTS as the number of ports for the moment. Future patches can update the DSA drivers separately to support dynamic number of ports. Signed-off-by: Vivien Didelot--- drivers/net/dsa/b53/b53_common.c | 7 --- drivers/net/dsa/mv88e6xxx/chip.c | 3 +-- drivers/net/dsa/qca8k.c | 3 +-- include/net/dsa.h| 6 +- net/dsa/dsa.c| 5 ++--- net/dsa/dsa2.c | 16 6 files changed, 29 insertions(+), 11 deletions(-) diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index bb210b12ad1b..31afc4d4b68b 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1790,14 +1790,15 @@ struct b53_device *b53_switch_alloc(struct device *base, struct dsa_switch *ds; struct b53_device *dev; - ds = devm_kzalloc(base, sizeof(*ds) + sizeof(*dev), GFP_KERNEL); + ds = dsa_switch_alloc(base, DSA_MAX_PORTS); if (!ds) return NULL; - dev = (struct b53_device *)(ds + 1); + dev = devm_kzalloc(base, sizeof(*dev), GFP_KERNEL); + if (!dev) + return NULL; ds->priv = dev; - ds->dev = base; dev->dev = base; dev->ds = ds; diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 921e53351786..cb7b24748336 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -4361,11 +4361,10 @@ static int mv88e6xxx_register_switch(struct mv88e6xxx_chip *chip) struct device *dev = chip->dev; struct dsa_switch *ds; - ds = devm_kzalloc(dev, sizeof(*ds), GFP_KERNEL); + ds = dsa_switch_alloc(dev, DSA_MAX_PORTS); if (!ds) return -ENOMEM; - ds->dev = dev; ds->priv = chip; ds->ops = _switch_ops; diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c index c084aa484d2b..f67c6a3cebff 100644 --- a/drivers/net/dsa/qca8k.c +++ 
b/drivers/net/dsa/qca8k.c @@ -954,12 +954,11 @@ qca8k_sw_probe(struct mdio_device *mdiodev) if (id != QCA8K_ID_QCA8337) return -ENODEV; - priv->ds = devm_kzalloc(>dev, sizeof(*priv->ds), GFP_KERNEL); + priv->ds = dsa_switch_alloc(>dev, DSA_MAX_PORTS); if (!priv->ds) return -ENOMEM; priv->ds->priv = priv; - priv->ds->dev = >dev; priv->ds->ops = _switch_ops; mutex_init(>reg_mutex); dev_set_drvdata(>dev, priv); diff --git a/include/net/dsa.h b/include/net/dsa.h index 92fd795e9573..24e1d935ae68 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -190,8 +190,11 @@ struct dsa_switch { u32 cpu_port_mask; u32 enabled_port_mask; u32 phys_mii_mask; - struct dsa_port ports[DSA_MAX_PORTS]; struct mii_bus *slave_mii_bus; + + /* Dynamically allocated ports, keep last */ + size_t num_ports; + struct dsa_port ports[]; }; static inline bool dsa_is_cpu_port(struct dsa_switch *ds, int p) @@ -386,6 +389,7 @@ static inline bool dsa_uses_tagged_protocol(struct dsa_switch_tree *dst) return dst->rcv != NULL; } +struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n); void dsa_unregister_switch(struct dsa_switch *ds); int dsa_register_switch(struct dsa_switch *ds, struct device *dev); #ifdef CONFIG_PM_SLEEP diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index 07e863369e04..de3ffb421ee4 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -347,8 +347,8 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index, /* * Allocate and initialise switch state. 
*/ - ds = devm_kzalloc(parent, sizeof(*ds), GFP_KERNEL); - if (ds == NULL) + ds = dsa_switch_alloc(parent, DSA_MAX_PORTS); + if (!ds) return ERR_PTR(-ENOMEM); ds->dst = dst; @@ -356,7 +356,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index, ds->cd = cd; ds->ops = ops; ds->priv = priv; - ds->dev = parent; ret = dsa_switch_setup_one(ds, parent); if (ret) diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index 75f5d1f8554b..4b3a44bec5c8 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -666,6 +666,22 @@ static int _dsa_register_switch(struct dsa_switch *ds, struct device *dev) return err; } +struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n) +{ + size_t size = sizeof(struct dsa_switch) + n * sizeof(struct dsa_port); + struct dsa_switch *ds; + + ds = devm_kzalloc(dev, size,
[PATCH net-next 2/9] net: dsa: use ds->num_ports when possible
The dsa_switch structure contains the number of ports. Use it where the structure is valid instead of the DSA_MAX_PORTS value. Signed-off-by: Vivien Didelot--- net/dsa/dsa.c | 16 net/dsa/dsa2.c| 12 ++-- net/dsa/slave.c | 2 +- net/dsa/tag_brcm.c| 2 +- net/dsa/tag_dsa.c | 2 +- net/dsa/tag_edsa.c| 2 +- net/dsa/tag_trailer.c | 2 +- 7 files changed, 19 insertions(+), 19 deletions(-) diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index de3ffb421ee4..619e57a44d1d 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -145,7 +145,7 @@ static int dsa_cpu_dsa_setups(struct dsa_switch *ds, struct device *dev) struct dsa_port *dport; int ret, port; - for (port = 0; port < DSA_MAX_PORTS; port++) { + for (port = 0; port < ds->num_ports; port++) { if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))) continue; @@ -218,7 +218,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, struct device *parent) /* * Validate supplied switch configuration. */ - for (i = 0; i < DSA_MAX_PORTS; i++) { + for (i = 0; i < ds->num_ports; i++) { char *name; name = cd->port_names[i]; @@ -242,7 +242,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, struct device *parent) valid_name_found = true; } - if (!valid_name_found && i == DSA_MAX_PORTS) + if (!valid_name_found && i == ds->num_ports) return -EINVAL; /* Make the built-in MII bus mask match the number of ports, @@ -295,7 +295,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, struct device *parent) /* * Create network devices for physical switch ports. */ - for (i = 0; i < DSA_MAX_PORTS; i++) { + for (i = 0; i < ds->num_ports; i++) { ds->ports[i].dn = cd->port_dn[i]; if (!(ds->enabled_port_mask & (1 << i))) @@ -377,7 +377,7 @@ static void dsa_switch_destroy(struct dsa_switch *ds) int port; /* Destroy network devices for physical switch ports. 
*/ - for (port = 0; port < DSA_MAX_PORTS; port++) { + for (port = 0; port < ds->num_ports; port++) { if (!(ds->enabled_port_mask & (1 << port))) continue; @@ -388,7 +388,7 @@ static void dsa_switch_destroy(struct dsa_switch *ds) } /* Disable configuration of the CPU and DSA ports */ - for (port = 0; port < DSA_MAX_PORTS; port++) { + for (port = 0; port < ds->num_ports; port++) { if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))) continue; dsa_cpu_dsa_destroy(&ds->ports[port]); @@ -408,7 +408,7 @@ int dsa_switch_suspend(struct dsa_switch *ds) int i, ret = 0; /* Suspend slave network devices */ - for (i = 0; i < DSA_MAX_PORTS; i++) { + for (i = 0; i < ds->num_ports; i++) { if (!dsa_is_port_initialized(ds, i)) continue; @@ -435,7 +435,7 @@ int dsa_switch_resume(struct dsa_switch *ds) return ret; /* Resume slave network devices */ - for (i = 0; i < DSA_MAX_PORTS; i++) { + for (i = 0; i < ds->num_ports; i++) { if (!dsa_is_port_initialized(ds, i)) continue; diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index 4b3a44bec5c8..6e7b3e88b778 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -98,7 +98,7 @@ static bool dsa_ds_find_port_dn(struct dsa_switch *ds, { u32 index; - for (index = 0; index < DSA_MAX_PORTS; index++) + for (index = 0; index < ds->num_ports; index++) if (ds->ports[index].dn == port) return true; return false; @@ -159,7 +159,7 @@ static int dsa_ds_complete(struct dsa_switch_tree *dst, struct dsa_switch *ds) u32 index; int err; - for (index = 0; index < DSA_MAX_PORTS; index++) { + for (index = 0; index < ds->num_ports; index++) { port = &ds->ports[index]; if (!dsa_port_is_valid(port)) continue; @@ -312,7 +312,7 @@ static int dsa_ds_apply(struct dsa_switch_tree *dst, struct dsa_switch *ds) return err; } - for (index = 0; index < DSA_MAX_PORTS; index++) { + for (index = 0; index < ds->num_ports; index++) { port = &ds->ports[index]; if (!dsa_port_is_valid(port)) continue; @@ -344,7 +344,7 @@ static void dsa_ds_unapply(struct dsa_switch_tree *dst, struct 
dsa_switch *ds) struct dsa_port *port; u32 index; - for (index = 0; index < DSA_MAX_PORTS; index++) { + for (index = 0; index < ds->num_ports; index++) { port = &ds->ports[index];
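The loop-bound change in this patch can be illustrated outside the kernel. The following is only a userspace sketch: `demo_switch`, `demo_port`, `demo_switch_alloc()` and `demo_count_enabled()` are hypothetical names, `calloc()` stands in for `devm_kzalloc()`, and the flexible array member is an assumption suggested by the size computation in the `dsa_switch_alloc()` hunk quoted earlier, not something this excerpt shows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct dsa_port / struct dsa_switch. */
struct demo_port {
	int enabled;
};

struct demo_switch {
	size_t num_ports;
	struct demo_port ports[];	/* assumed layout: sized at allocation time */
};

/* Allocate the switch and its n ports in one zeroed block. */
static struct demo_switch *demo_switch_alloc(size_t n)
{
	struct demo_switch *ds;

	ds = calloc(1, sizeof(*ds) + n * sizeof(ds->ports[0]));
	if (!ds)
		return NULL;

	ds->num_ports = n;
	return ds;
}

/* Count enabled ports, bounded by num_ports rather than a global maximum. */
static size_t demo_count_enabled(const struct demo_switch *ds)
{
	size_t i, count = 0;

	assert(ds);
	for (i = 0; i < ds->num_ports; i++)
		if (ds->ports[i].enabled)
			count++;

	return count;
}
```

With a per-switch count, a loop never touches port slots that were not allocated, which is presumably why the series replaces DSA_MAX_PORTS wherever the structure is valid.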
[PATCH net-next 5/9] net: dsa: move bridge device in dsa_port
Move the bridge_dev pointer from dsa_slave_priv to dsa_port so that DSA drivers can access this information and remove the need to cache it. Signed-off-by: Vivien Didelot--- include/net/dsa.h | 1 + net/dsa/dsa_priv.h | 1 - net/dsa/slave.c| 10 +- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/include/net/dsa.h b/include/net/dsa.h index 6bd1f8b05dbd..924533fd4425 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -146,6 +146,7 @@ struct dsa_port { struct device_node *dn; unsigned intageing_time; u8 stp_state; + struct net_device *bridge_dev; }; struct dsa_switch { diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index c519bd0e9206..3022f2e42cdc 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -38,7 +38,6 @@ struct dsa_slave_priv { int old_pause; int old_duplex; - struct net_device *bridge_dev; #ifdef CONFIG_NET_POLL_CONTROLLER struct netpoll *netpoll; #endif diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 2ea220bc4bba..3a7c28d64bd5 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -64,9 +64,9 @@ static int dsa_slave_get_iflink(const struct net_device *dev) return p->dp->ds->dst->master_netdev->ifindex; } -static inline bool dsa_port_is_bridged(struct dsa_slave_priv *p) +static inline bool dsa_port_is_bridged(struct dsa_port *dp) { - return !!p->bridge_dev; + return !!dp->bridge_dev; } static void dsa_port_set_stp_state(struct dsa_switch *ds, int port, u8 state) @@ -98,7 +98,7 @@ static int dsa_slave_open(struct net_device *dev) struct dsa_slave_priv *p = netdev_priv(dev); struct net_device *master = p->dp->ds->dst->master_netdev; struct dsa_switch *ds = p->dp->ds; - u8 stp_state = dsa_port_is_bridged(p) ? + u8 stp_state = dsa_port_is_bridged(p->dp) ? 
BR_STATE_BLOCKING : BR_STATE_FORWARDING; int err; @@ -557,7 +557,7 @@ static int dsa_slave_bridge_port_join(struct net_device *dev, struct dsa_switch *ds = p->dp->ds; int ret = -EOPNOTSUPP; - p->bridge_dev = br; + p->dp->bridge_dev = br; if (ds->ops->port_bridge_join) ret = ds->ops->port_bridge_join(ds, p->dp->index, br); @@ -574,7 +574,7 @@ static void dsa_slave_bridge_port_leave(struct net_device *dev) if (ds->ops->port_bridge_leave) ds->ops->port_bridge_leave(ds, p->dp->index); - p->bridge_dev = NULL; + p->dp->bridge_dev = NULL; /* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer, * so allow it to be in BR_STATE_FORWARDING to be kept functional -- 2.11.0
[PATCH net-next 6/9] net: dsa: pass bridge device when a port leaves
Upon reception of the NETDEV_CHANGEUPPER, a leaving port is already unbridged, so reflect this by assigning the port's bridge_dev pointer to NULL before calling the port_bridge_leave DSA driver operation. Now that the bridge_dev pointer is exposed to the drivers, reflecting the current state of the DSA switch fabric is necessary for the drivers to adjust their port based VLANs correctly. Pass the bridge device pointer to the port_bridge_leave operation so that drivers have all information to re-program their chips properly, and do not need to cache it anymore. Signed-off-by: Vivien Didelot--- drivers/net/dsa/b53/b53_common.c | 2 +- drivers/net/dsa/b53/b53_priv.h | 2 +- drivers/net/dsa/mv88e6xxx/chip.c | 3 ++- drivers/net/dsa/qca8k.c | 2 +- include/net/dsa.h| 3 ++- net/dsa/slave.c | 10 +- 6 files changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 31afc4d4b68b..32fdcf5570c8 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1354,7 +1354,7 @@ int b53_br_join(struct dsa_switch *ds, int port, struct net_device *bridge) } EXPORT_SYMBOL(b53_br_join); -void b53_br_leave(struct dsa_switch *ds, int port) +void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br) { struct b53_device *dev = ds->priv; struct net_device *bridge = dev->ports[port].bridge_dev; diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h index a8031b382c55..5dafb70e75fc 100644 --- a/drivers/net/dsa/b53/b53_priv.h +++ b/drivers/net/dsa/b53/b53_priv.h @@ -382,7 +382,7 @@ void b53_get_strings(struct dsa_switch *ds, int port, uint8_t *data); void b53_get_ethtool_stats(struct dsa_switch *ds, int port, uint64_t *data); int b53_get_sset_count(struct dsa_switch *ds); int b53_br_join(struct dsa_switch *ds, int port, struct net_device *bridge); -void b53_br_leave(struct dsa_switch *ds, int port); +void b53_br_leave(struct dsa_switch *ds, int port, struct 
net_device *bridge); void b53_br_set_stp_state(struct dsa_switch *ds, int port, u8 state); void b53_br_fast_age(struct dsa_switch *ds, int port); int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering); diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index cb7b24748336..8eb0dc063f4e 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2343,7 +2343,8 @@ static int mv88e6xxx_port_bridge_join(struct dsa_switch *ds, int port, return err; } -static void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port) +static void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port, + struct net_device *br) { struct mv88e6xxx_chip *chip = ds->priv; struct net_device *bridge = chip->ports[port].bridge_dev; diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c index f67c6a3cebff..c85b187aa3d9 100644 --- a/drivers/net/dsa/qca8k.c +++ b/drivers/net/dsa/qca8k.c @@ -775,7 +775,7 @@ qca8k_port_bridge_join(struct dsa_switch *ds, int port, } static void -qca8k_port_bridge_leave(struct dsa_switch *ds, int port) +qca8k_port_bridge_leave(struct dsa_switch *ds, int port, struct net_device *br) { struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv; int i; diff --git a/include/net/dsa.h b/include/net/dsa.h index 924533fd4425..b951e2ebda75 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -325,7 +325,8 @@ struct dsa_switch_ops { int (*set_ageing_time)(struct dsa_switch *ds, unsigned int msecs); int (*port_bridge_join)(struct dsa_switch *ds, int port, struct net_device *bridge); - void(*port_bridge_leave)(struct dsa_switch *ds, int port); + void(*port_bridge_leave)(struct dsa_switch *ds, int port, +struct net_device *bridge); void(*port_stp_state_set)(struct dsa_switch *ds, int port, u8 state); void(*port_fast_age)(struct dsa_switch *ds, int port); diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 3a7c28d64bd5..23ff53aeae50 100644 --- a/net/dsa/slave.c +++ 
b/net/dsa/slave.c @@ -565,16 +565,16 @@ static int dsa_slave_bridge_port_join(struct net_device *dev, return ret == -EOPNOTSUPP ? 0 : ret; } -static void dsa_slave_bridge_port_leave(struct net_device *dev) +static void dsa_slave_bridge_port_leave(struct net_device *dev, + struct net_device *br) { struct dsa_slave_priv *p = netdev_priv(dev); struct dsa_switch *ds = p->dp->ds; + p->dp->bridge_dev = NULL; if (ds->ops->port_bridge_leave) - ds->ops->port_bridge_leave(ds, p->dp->index); - - p->dp->bridge_dev = NULL;
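The ordering described in the commit message (clear the port's bridge pointer first, then tell the driver which bridge was left) can be modelled in a few lines. This is a userspace sketch under stated assumptions: `demo_port`, `demo_leave_fn`, `demo_record_leave()` and `demo_port_leave()` are hypothetical names, and a `const void *` stands in for `struct net_device *`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dsa_port; only the field of interest. */
struct demo_port {
	const void *bridge_dev;		/* NULL when the port is not bridged */
};

/* Shape of a driver "leave" hook that is told which bridge was left. */
typedef void (*demo_leave_fn)(struct demo_port *dp, const void *br);

/* What a sample driver hook observed, recorded for inspection. */
static const void *demo_seen_bridge_dev;
static const void *demo_seen_br;

static void demo_record_leave(struct demo_port *dp, const void *br)
{
	demo_seen_bridge_dev = dp->bridge_dev;
	demo_seen_br = br;
}

/* Core side: mark the port unbridged first, then notify the driver. */
static void demo_port_leave(struct demo_port *dp, const void *br,
			    demo_leave_fn leave)
{
	assert(dp);

	dp->bridge_dev = NULL;
	if (leave)
		leave(dp, br);
}
```

In this model the hook sees the port as already unbridged and still learns which bridge it left from the extra argument, so it has no need to keep its own cached copy.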
[PATCH net-next 7/9] net: dsa: mv88e6xxx: use dsa_port's bridge pointer
Now that DSA exposes the bridge device pointer to which a port belongs, use it when programming the port based VLANs and thus remove the cache. Signed-off-by: Vivien Didelot--- drivers/net/dsa/mv88e6xxx/chip.c | 27 +++ drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 6 -- 2 files changed, 11 insertions(+), 22 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 8eb0dc063f4e..84cba32443de 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1247,8 +1247,8 @@ static int _mv88e6xxx_atu_remove(struct mv88e6xxx_chip *chip, u16 fid, static int _mv88e6xxx_port_based_vlan_map(struct mv88e6xxx_chip *chip, int port) { - struct net_device *bridge = chip->ports[port].bridge_dev; struct dsa_switch *ds = chip->ds; + struct net_device *bridge = ds->ports[port].bridge_dev; u16 output_ports = 0; int i; @@ -1258,7 +1258,7 @@ static int _mv88e6xxx_port_based_vlan_map(struct mv88e6xxx_chip *chip, int port) } else { for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) { /* allow sending frames to every group member */ - if (bridge && chip->ports[i].bridge_dev == bridge) + if (bridge && ds->ports[i].bridge_dev == bridge) output_ports |= BIT(i); /* allow sending frames to CPU port and DSA link(s) */ @@ -1820,17 +1820,17 @@ static int mv88e6xxx_port_check_hw_vlan(struct dsa_switch *ds, int port, GLOBAL_VTU_DATA_MEMBER_TAG_NON_MEMBER) continue; - if (chip->ports[i].bridge_dev == - chip->ports[port].bridge_dev) + if (ds->ports[i].bridge_dev == + ds->ports[port].bridge_dev) break; /* same bridge, check next VLAN */ - if (!chip->ports[i].bridge_dev) + if (!ds->ports[i].bridge_dev) continue; netdev_warn(ds->ports[port].netdev, "hardware VLAN %d already used by %s\n", vlan.vid, - netdev_name(chip->ports[i].bridge_dev)); + netdev_name(ds->ports[i].bridge_dev)); err = -EOPNOTSUPP; goto unlock; } @@ -2320,18 +2320,16 @@ static int mv88e6xxx_port_fdb_dump(struct dsa_switch *ds, int port, } static int 
mv88e6xxx_port_bridge_join(struct dsa_switch *ds, int port, - struct net_device *bridge) + struct net_device *br) { struct mv88e6xxx_chip *chip = ds->priv; int i, err = 0; mutex_lock(&chip->reg_lock); - /* Assign the bridge and remap each port's VLANTable */ - chip->ports[port].bridge_dev = bridge; - + /* Remap each port's VLANTable */ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) { - if (chip->ports[i].bridge_dev == bridge) { + if (ds->ports[i].bridge_dev == br) { err = _mv88e6xxx_port_based_vlan_map(chip, i); if (err) break; @@ -2347,16 +2345,13 @@ static void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port, struct net_device *br) { struct mv88e6xxx_chip *chip = ds->priv; - struct net_device *bridge = chip->ports[port].bridge_dev; int i; mutex_lock(&chip->reg_lock); - /* Unassign the bridge and remap each port's VLANTable */ - chip->ports[port].bridge_dev = NULL; - + /* Remap each port's VLANTable */ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) - if (i == port || chip->ports[i].bridge_dev == bridge) + if (i == port || ds->ports[i].bridge_dev == br) if (_mv88e6xxx_port_based_vlan_map(chip, i)) netdev_warn(ds->ports[i].netdev, "failed to remap\n"); diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h index 572d585dc1e2..e126ed00937b 100644 --- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h +++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h @@ -676,10 +676,6 @@ struct mv88e6xxx_vtu_entry { struct mv88e6xxx_bus_ops; -struct mv88e6xxx_priv_port { - struct net_device *bridge_dev; -}; - struct mv88e6xxx_irq { u16 masked; struct irq_chip chip; @@ -720,8 +716,6 @@ struct mv88e6xxx_chip { */ struct mutex stats_mutex; - struct mv88e6xxx_priv_port ports[DSA_MAX_PORTS]; -
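The port-based VLAN map computed in `_mv88e6xxx_port_based_vlan_map()` boils down to building a bitmask of permitted egress ports. The sketch below covers only the two conditions visible in the hunk (same bridge, or CPU/DSA port). `demo_port`, its `is_cpu_or_dsa` field and `demo_output_ports()` are hypothetical, and any further masking the real driver performs is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port state; a const void * stands in for the bridge netdev. */
struct demo_port {
	const void *bridge_dev;		/* NULL when the port is not bridged */
	int is_cpu_or_dsa;		/* non-zero for CPU ports and DSA links */
};

/* Build the mask of ports that frames entering `port` may be sent to. */
static uint16_t demo_output_ports(const struct demo_port *ports, size_t n,
				  size_t port)
{
	const void *bridge;
	uint16_t mask = 0;
	size_t i;

	assert(ports && n <= 16 && port < n);
	bridge = ports[port].bridge_dev;

	for (i = 0; i < n; i++) {
		/* allow sending frames to every member of the same bridge */
		if (bridge && ports[i].bridge_dev == bridge)
			mask |= (uint16_t)(1u << i);

		/* allow sending frames to CPU port and DSA link(s) */
		if (ports[i].is_cpu_or_dsa)
			mask |= (uint16_t)(1u << i);
	}

	return mask;
}
```

Because the membership test reads the bridge pointer straight from the shared port array, a driver written this way needs no private copy, which is the point of removing `struct mv88e6xxx_priv_port`.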
Re: [PATCH v3] can: Fix kernel panic at security_sock_rcv_skb
On 01/27/2017 05:11 PM, Eric Dumazet wrote:
From: Eric Dumazet

Zhang Yanmin reported crashes [1] and provided a patch adding a
synchronize_rcu() call in can_rx_unregister()

The main problem seems that the sockets themselves are not RCU
protected.

If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.

Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
ease stable backports with the following fix instead.

[1]
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: selinux_socket_sock_rcv_skb+0x65/0x2a0

Call Trace:
 security_sock_rcv_skb+0x4c/0x60
 sk_filter+0x41/0x210
 sock_queue_rcv_skb+0x53/0x3a0
 raw_rcv+0x2a3/0x3c0
 can_rcv_filter+0x12b/0x370
 can_receive+0xd9/0x120
 can_rcv+0xab/0x100
 __netif_receive_skb_core+0xd8c/0x11f0
 __netif_receive_skb+0x24/0xb0
 process_backlog+0x127/0x280
 net_rx_action+0x33b/0x4f0
 __do_softirq+0x184/0x440
 do_softirq_own_stack+0x1c/0x30
 do_softirq.part.18+0x3b/0x40
 do_softirq+0x1d/0x20
 netif_rx_ni+0xe5/0x110
 slcan_receive_buf+0x507/0x520
 flush_to_ldisc+0x21c/0x230
 process_one_work+0x24f/0x670
 worker_thread+0x9d/0x6f0
 ? rescuer_thread+0x480/0x480
 kthread+0x12c/0x150
 ret_from_fork+0x3f/0x70

Reported-by: Zhang Yanmin
Signed-off-by: Eric Dumazet

Acked-by: Oliver Hartkopp

Thanks Eric!
BR Oliver --- include/linux/can/core.h |7 +++ net/can/af_can.c | 14 +++--- net/can/af_can.h |3 ++- net/can/bcm.c|4 ++-- net/can/gw.c |2 +- net/can/raw.c|4 ++-- 6 files changed, 21 insertions(+), 13 deletions(-) diff --git a/include/linux/can/core.h b/include/linux/can/core.h index a0875001b13c84ad70a9b2909654e9ffb6824c58..df08a41d5be5f26cfa4cdc74935f5eae7fa51385 100644 --- a/include/linux/can/core.h +++ b/include/linux/can/core.h @@ -45,10 +45,9 @@ struct can_proto { extern int can_proto_register(const struct can_proto *cp); extern void can_proto_unregister(const struct can_proto *cp); -extern int can_rx_register(struct net_device *dev, canid_t can_id, - canid_t mask, - void (*func)(struct sk_buff *, void *), - void *data, char *ident); +int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask, + void (*func)(struct sk_buff *, void *), + void *data, char *ident, struct sock *sk); extern void can_rx_unregister(struct net_device *dev, canid_t can_id, canid_t mask, diff --git a/net/can/af_can.c b/net/can/af_can.c index 1108079d934f8383a599d7997b08100fca0465e9..d2b0638284b9a71aaba9cc433822329baf82a34e 100644 --- a/net/can/af_can.c +++ b/net/can/af_can.c @@ -445,6 +445,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, canid_t *mask, * @func: callback function on filter match * @data: returned parameter for callback function * @ident: string for calling module identification + * @sk: socket pointer (might be NULL) * * Description: * Invokes the callback function with the received sk_buff and the given @@ -468,7 +469,7 @@ static struct hlist_head *find_rcv_list(canid_t *can_id, canid_t *mask, */ int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask, void (*func)(struct sk_buff *, void *), void *data, - char *ident) + char *ident, struct sock *sk) { struct receiver *r; struct hlist_head *rl; @@ -496,6 +497,7 @@ int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask, r->func= func; r->data= data; r->ident = 
ident; + r->sk = sk; hlist_add_head_rcu(&r->list, rl); d->entries++; @@ -520,8 +522,11 @@ EXPORT_SYMBOL(can_rx_register); static void can_rx_delete_receiver(struct rcu_head *rp) { struct receiver *r = container_of(rp, struct receiver, rcu); - + struct sock *sk = r->sk; + kmem_cache_free(rcv_cache, r); + if (sk) + sock_put(sk); } /** @@ -596,8 +601,11 @@ void can_rx_unregister(struct net_device *dev, canid_t can_id, canid_t mask, spin_unlock(&can_rcvlists_lock); /* schedule the receiver item for deletion */ - if (r) + if (r) { + if (r->sk) + sock_hold(r->sk); call_rcu(&r->rcu, can_rx_delete_receiver); + } } EXPORT_SYMBOL(can_rx_unregister); diff --git a/net/can/af_can.h b/net/can/af_can.h index fca0fe9fc45a497cdf3da82d5414e846e7cc61b7..b86f5129e8385fe84ef671bb914e8e05c2977ca0 100644 --- a/net/can/af_can.h +++ b/net/can/af_can.h @@ -50,13 +50,14 @@ struct receiver { struct hlist_node list; -
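The lifetime rule the fix relies on (take a socket reference when the receiver is unregistered, drop it only in the deferred deletion callback) can be modelled single-threaded. The sketch below does not model RCU at all: the grace period is simply the gap between two function calls, and every `demo_*` name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical socket with a bare reference count. */
struct demo_sock {
	int refcnt;
	int freed;	/* set instead of freeing, so the outcome can be inspected */
};

/* Hypothetical receiver entry that remembers its owning socket. */
struct demo_receiver {
	struct demo_sock *sk;	/* may be NULL */
};

static void demo_sock_hold(struct demo_sock *sk)
{
	sk->refcnt++;
}

static void demo_sock_put(struct demo_sock *sk)
{
	assert(sk->refcnt > 0);
	if (--sk->refcnt == 0)
		sk->freed = 1;
}

static struct demo_receiver *demo_rx_register(struct demo_sock *sk)
{
	struct demo_receiver *r = calloc(1, sizeof(*r));

	if (r)
		r->sk = sk;
	return r;
}

/* Unregister: pin the socket until the deferred deletion has run. */
static void demo_rx_unregister(struct demo_receiver *r)
{
	assert(r);
	if (r->sk)
		demo_sock_hold(r->sk);
	/* the kernel would queue the deletion with call_rcu() here */
}

/* Deferred deletion: free the receiver, then release the pinned socket. */
static void demo_rx_delete_receiver(struct demo_receiver *r)
{
	struct demo_sock *sk = r->sk;

	free(r);
	if (sk)
		demo_sock_put(sk);
}
```

Even if the owner drops its own reference between the two calls, the socket survives until the deletion callback has run, which is the property readers still walking the receiver list depend on.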
Re: [PATCH v2] net: phy: micrel: add support for KSZ8795
On 01/27/2017 11:52 AM, Sean Nyekjær wrote:
>
>
> On 2017-01-27 19:55, Florian Fainelli wrote:
>> On 01/26/2017 11:46 PM, Sean Nyekjaer wrote:
>>> This adds support for the PHYs in the KSZ8795 5port managed switch.
>>>
>>> It allows detecting the link between the switch and the SoC
>>> and uses the same read_status functions as the KSZ8873MLL switch.
>>>
>>> Signed-off-by: Sean Nyekjaer
>>> ---
>>> Changes in v2:
>>> - Removed "switch" name
>>>
>>> drivers/net/phy/micrel.c | 14 ++
>>> include/linux/micrel_phy.h | 2 ++
>>> 2 files changed, 16 insertions(+)
>>>
>>> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
>>> index ea92d524d5a8..fab56c9350cf 100644
>>> --- a/drivers/net/phy/micrel.c
>>> +++ b/drivers/net/phy/micrel.c
>>> @@ -1014,6 +1014,20 @@ static struct phy_driver ksphy_driver[] = {
>>> .get_stats = kszphy_get_stats,
>>> .suspend = genphy_suspend,
>>> .resume = genphy_resume,
>>> +}, {
>>> +.phy_id = PHY_ID_KSZ8795,
>>> +.phy_id_mask = MICREL_PHY_ID_MASK,
>>> +.name = "Micrel KSZ8795",
>>> +.features = (SUPPORTED_Pause | SUPPORTED_Asym_Pause),
>> This is wrong, it should be PHY_GBIT_FEATURES or PHY_BASIC_FEATURES.
>> Including the Pause/AsymPause feature bits is no longer necessary, the
>> PHY library takes care of adding these automatically to let your MAC do
>> flow control auto-negotiation later on.
>>
>> Please submit an incremental fix to that.
> By this you mean a v3 or a new commit?

Incremental means a new commit here, on top of this patch right here.
--
Florian
Re: [PATCH 0/6 v3] kvmalloc
On 01/27/2017 11:05 AM, Michal Hocko wrote:
> On Thu 26-01-17 21:34:04, Daniel Borkmann wrote:
>> On 01/26/2017 02:40 PM, Michal Hocko wrote:
> [...]
>>> But realistically, how big is this problem really? Is it really worth
>>> it? You said this is an admin only interface and admin can kill the
>>> machine by OOM and other means already.
>>>
>>> Moreover and I should probably mention it explicitly, your d407bd25a204b
>>> reduced the likelihood of oom for other reason. kmalloc used GFP_USER
>>> previously and with order > 0 && order <= PAGE_ALLOC_COSTLY_ORDER this
>>> could indeed hit the OOM e.g. due to memory fragmentation. It would be
>>> much harder to hit the OOM killer from vmalloc which doesn't issue
>>> higher order allocation requests. Or have you ever seen the OOM killer
>>> pointing to the vmalloc fallback path?
>>
>> The case I was concerned about was from vmalloc() path, not kmalloc().
>> That was where the stack trace indicating OOM pointed to. As an example,
>> there could be really large allocation requests for maps where the map
>> has pre-allocated memory for its elements. Thus, if we get to the point
>> where we need to kill others due to shortage of mem for satisfying this,
>> I'd much much rather prefer to just not let vmalloc() work really hard
>> and fail early on instead.
>
> I see, but as already mentioned, chances are that by the time you get
> close to the OOM somebody else will hit the OOM before the vmalloc path
> manages to free the allocated memory.

In my (crafted) test case, I was connected via ssh and it each time
reliably killed my connection, which is really suboptimal.

F.e., I could also imagine a buggy or miscalculated map definition for
a prog that is provisioned to multiple places, which then accidentally
triggers this. Or if large on purpose, but we crossed the line, it could
be handled more gracefully, f.e. I could imagine an option to fall
back to a non-pre-allocated map flavor from the application loading the
program. Trade-off for sure, but still allowing it to operate up to a
certain extent. Granted, if vmalloc() succeeded without trying hard and
we then OOM elsewhere, too bad, but we don't have much control over that
one anyway, only about our own request.

>> Reason I asked above was whether having __GFP_NORETRY in would be fatal
>> somewhere down the path, but seems not as you say.
>>
>> So to answer your second email with the bpf and netfilter hunks, why
>> not replacing them with kvmalloc() and __GFP_NORETRY flag and add that
>> big fat FIXME comment above there, saying explicitly that __GFP_NORETRY
>> is not harmful though has only /partial/ effect right now and that full
>> support needs to be implemented in future. That would still be better
>> that not having it, imo, and the FIXME would make expectations clear
>> to anyone reading that code.
>
> Well, we can do that, I just would like to prevent from this (ab)use
> if there is no _real_ and _sensible_ usecase for it. Having a real bug

Understandable.

> report or a fallback mechanism you are mentioning above would justify
> the (ab)use IMHO. But that abuse would be documented properly and have a
> real reason to exist. That sounds like a better approach to me.
>
> But if you absolutely _insist_ I can change that.

Yeah, please do (with a big FIXME comment as mentioned), this originally
came from a real bug report. Anyway, feel free to add my Acked-by then.

Thanks again,
Daniel
Re: [PATCH RFC net-next] packet: always ensure that we pass hard_header_len bytes in skb_headlen() to the driver
On (01/27/17 14:29), Willem de Bruijn wrote:
>
> As your patch states, the contract is that any packet delivered to a
> driver has the entire L2 in its linear section. Drivers are not required
> to be robust against shorter packets, so there is no reason to test
> those.
>
> One option is to limit your fix to known fixed-header protocols.
> In these cases hard_header_len is the minimum, so anything
> smaller must be dropped.

yes, but how would you know that this is a fixed-header protocol
or a var-hdrlen protocol? AIUI the hard_header_len itself will not
tell you this info: it will be 77 for ax25, 14 for ethernet, but that does
not tell me that ax25 is the "robust-er" driver with a min requirement
of 21 for the hdrlen. That's why I was thinking of a IFF_L2_VARHDRLEN in
the priv_flags of the net_device.

> For protocols with variable header length it is fine to send packets
> shorter than hard_header_len, even with corrupted content (i.e.,
> even if they would fail that protocol's validate callback), as long as
> they exceed the minimum length. ax25 already has a min length
> check through its protocol-specific validate callback.

Another option that comes to mind.. the real thorn-in-the-flesh
here is the CAP_SYS_RAWIO check. Would it be a better idea to ask
the test-suites (since they seem to be the major consumer of that path)
to use a special PF_PACKET socket option instead, that indicates
"I'm testing robustness of the header, so let this one slip past
dev_validate_header at all times"? It would mean the test suites
would have to change slightly.

--Sowmini
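The flag-based idea floated in this message could look roughly like the check below. It is only a sketch of the proposal, not existing kernel code: `DEMO_IFF_L2_VARHDRLEN` mirrors the flag name suggested above and is hypothetical, as are `struct demo_netdev`, its `min_header_len` field (in the thread the minimum is enforced by a protocol-specific validate callback instead) and `demo_l2_len_ok()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag modelled on the IFF_L2_VARHDRLEN idea from the thread. */
#define DEMO_IFF_L2_VARHDRLEN	0x1u

/* Hypothetical device description; not the real struct net_device. */
struct demo_netdev {
	unsigned int hard_header_len;	/* largest L2 header the device uses */
	unsigned int min_header_len;	/* assumed field: smallest valid header */
	unsigned int priv_flags;
};

/* Decide whether `len` linear bytes are enough to hand to the driver. */
static bool demo_l2_len_ok(const struct demo_netdev *dev, unsigned int len)
{
	assert(dev);

	/* variable-length L2 headers: only the protocol minimum is required */
	if (dev->priv_flags & DEMO_IFF_L2_VARHDRLEN)
		return len >= dev->min_header_len;

	/* fixed-length L2 headers: hard_header_len is itself the minimum */
	return len >= dev->hard_header_len;
}
```

Using the numbers quoted in the thread, an Ethernet-like device (14) would reject a 13-byte frame, while an ax25-like device (77, minimum 21) would accept anything from 21 bytes up.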
Re: [PATCH 1/3] net: bgmac: allocate struct bgmac just once & don't copy it
On 27 January 2017 at 17:14, David Miller wrote:
> From: Felix Fietkau
> Date: Fri, 27 Jan 2017 17:02:33 +0100
>
>> On 2017-01-27 10:20, Rafał Miłecki wrote:
>>> From: Rafał Miłecki
>>>
>>> To share as much code as possible in bgmac we call alloc_etherdev from
>>> bgmac.c which is used by both: platform and bcma code. The easiest
>>> solution was to use it for allocating whole struct bgmac but it doesn't
>>> work well as we already get early-filled struct bgmac as an argument.
>>>
>>> So far we were solving this by copying received struct into newly
>>> allocated one. The problem is it means storing 2 allocated structs,
>>> using only 1 of them and non-shared code not having access to it.
>>>
>>> This patch solves it by using alloc_etherdev to allocate *pointer* for
>>> the already allocated struct. The only downside of this is we have to be
>>> careful when using netdev_priv.
>>>
>>> Another solution was to call alloc_etherdev in platform/bcma specific
>>> code but Jon advised against it due to sharing less code that way.
>> How does that lead to sharing less code?
>> I find this pointer indirection rather ugly and uncommon, and I think it
>> would be much cleaner to split the probe into bgmac_enet_alloc and
>> bgmac_enet_probe (with bgmac_enet_alloc calling alloc_etherdev and doing
>> basic setup).
>
> I agree, it would be so much better if bgmac_probe() and friends
> initialized a real bgmac object which was the private of a netdev
> struct, then passed that down into bgmac_enet_probe().

I'll work on V2, thanks.

--
Rafał
Re: [PATCH v2] net: phy: micrel: add support for KSZ8795
On 2017-01-27 19:55, Florian Fainelli wrote:
> On 01/26/2017 11:46 PM, Sean Nyekjaer wrote:
>> This adds support for the PHYs in the KSZ8795 5port managed switch.
>>
>> It allows detecting the link between the switch and the SoC
>> and uses the same read_status functions as the KSZ8873MLL switch.
>>
>> Signed-off-by: Sean Nyekjaer
>> ---
>> Changes in v2:
>> - Removed "switch" name
>>
>> drivers/net/phy/micrel.c | 14 ++
>> include/linux/micrel_phy.h | 2 ++
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
>> index ea92d524d5a8..fab56c9350cf 100644
>> --- a/drivers/net/phy/micrel.c
>> +++ b/drivers/net/phy/micrel.c
>> @@ -1014,6 +1014,20 @@ static struct phy_driver ksphy_driver[] = {
>> .get_stats = kszphy_get_stats,
>> .suspend = genphy_suspend,
>> .resume = genphy_resume,
>> +}, {
>> + .phy_id = PHY_ID_KSZ8795,
>> + .phy_id_mask = MICREL_PHY_ID_MASK,
>> + .name = "Micrel KSZ8795",
>> + .features = (SUPPORTED_Pause | SUPPORTED_Asym_Pause),
> This is wrong, it should be PHY_GBIT_FEATURES or PHY_BASIC_FEATURES.
> Including the Pause/AsymPause feature bits is no longer necessary, the
> PHY library takes care of adding these automatically to let your MAC do
> flow control auto-negotiation later on.
>
> Please submit an incremental fix to that.

By this you mean a v3 or a new commit?
I'm checking with hardware now...

>> + .flags = PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
>> + .config_init = kszphy_config_init,
>> + .config_aneg = ksz8873mll_config_aneg,
>> + .read_status = ksz8873mll_read_status,
>> + .get_sset_count = kszphy_get_sset_count,
>> + .get_strings = kszphy_get_strings,
>> + .get_stats = kszphy_get_stats,
>> + .suspend = genphy_suspend,
>> + .resume = genphy_resume,
>> } };
>> module_phy_driver(ksphy_driver);
>> diff --git a/include/linux/micrel_phy.h b/include/linux/micrel_phy.h
>> index 257173e0095e..f541da68d1e7 100644
>> --- a/include/linux/micrel_phy.h
>> +++ b/include/linux/micrel_phy.h
>> @@ -35,6 +35,8 @@
>> #define PHY_ID_KSZ886X 0x00221430
>> #define PHY_ID_KSZ8863 0x00221435
>> +#define PHY_ID_KSZ8795 0x00221550
>> +
>> /* struct phy_device dev_flags definitions */
>> #define MICREL_PHY_50MHZ_CLK 0x0001
>> #define MICREL_PHY_FXEN 0x0002

/Sean
Re: [PATCH 2/6] wl1251: Use request_firmware_prefer_user() for loading NVS calibration data
On Fri 2017-01-27 17:23:07, Kalle Valo wrote:
> Pali Rohár writes:
>
> > On Friday 27 January 2017 14:26:22 Kalle Valo wrote:
> >> Pali Rohár writes:
> >>
> >> > 2) It was already tested that example NVS data can be used for N900 e.g.
> >> > for SSH connection. If real correct data are not available it is better
> >> > to use at least those example (and probably log warning message) so user
> >> > can connect via SSH and start investigating where is problem.
> >>
> >> I disagree. Allowing default calibration data to be used can be
> >> unnoticed by user and left her wondering why wifi works so badly.
> >
> > So there are only two options:
> >
> > 1) Disallow it and so these users will have non-working wifi.
> >
> > 2) Allow those data to be used as fallback mechanism.
> >
> > And personally I'm against 1) because it will break wifi support for
> > *all* Nokia N900 devices right now.
>
> All two of them? :)

Umm. You clearly want a flock of angry penguins at your doorsteps :-).

> But not working is exactly my point, if correct calibration data is not
> available wifi should not work. And it's not only about functionality
> problems, there's also the regulatory aspect.

If you break existing configuration that's called "regression".

> >> > 3) If we do rename *now* we will totally break wifi support on Nokia
> >> > N900.
> >>
> >> Then the distro should fix that when updating the linux-firmware
> >> packages. Can you provide details about the setup, what distro etc?
> >
> > Debian stable, Ubuntu LTSs 14.04, 16.04.
>
> You can run these out of box on N900?

Debian stable does.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[PATCH v3 net-next 2/2] ravb: Support 1Gbps on R-Car H3 ES1.1+ and R-Car M3-W
From: Geert Uytterhoeven

The limitation to 10/100Mbit speeds on R-Car Gen3 is valid for R-Car H3 ES1.0 only. Check for the exact SoC model to allow 1Gbps on newer revisions of R-Car H3, and on R-Car M3-W. Signed-off-by: Geert Uytterhoeven Signed-off-by: Simon Horman Acked-by: Sergei Shtylyov --- drivers/net/ethernet/renesas/ravb_main.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 732cdea7800b..615a3cb6f18c 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -31,6 +31,7 @@ #include #include #include +#include #include @@ -973,6 +974,11 @@ static void ravb_adjust_link(struct net_device *ndev) phy_print_status(phydev); } +static const struct soc_device_attribute r8a7795es10[] = { + { .soc_id = "r8a7795", .revision = "ES1.0", }, + { /* sentinel */ } +}; + /* PHY init function */ static int ravb_phy_init(struct net_device *ndev) { @@ -1008,10 +1014,10 @@ static int ravb_phy_init(struct net_device *ndev) goto err_deregister_fixed_link; } - /* This driver only support 10/100Mbit speeds on Gen3 + /* This driver only support 10/100Mbit speeds on R-Car H3 ES1.0 * at this time. */ - if (priv->chip_id == RCAR_GEN3) { + if (soc_device_match(r8a7795es10)) { err = phy_set_max_speed(phydev, SPEED_100); if (err) { netdev_err(ndev, "failed to limit PHY to 100Mbit/s\n"); -- 2.7.0.rc3.207.g0ac5344
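The sentinel-terminated match table used by this patch follows a common pattern that is easy to model in userspace. The sketch below is an approximation under stated assumptions: it compares strings exactly, reads the running SoC from function arguments rather than from the system, and all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced form of a SoC attribute entry. */
struct demo_soc_attr {
	const char *soc_id;
	const char *revision;
};

/* Quirk table: the one SoC revision that stays limited to 100 Mbit/s. */
static const struct demo_soc_attr demo_r8a7795es10[] = {
	{ .soc_id = "r8a7795", .revision = "ES1.0" },
	{ NULL, NULL }	/* sentinel */
};

/* Return the first entry whose non-NULL fields all match, or NULL. */
static const struct demo_soc_attr *
demo_soc_match(const struct demo_soc_attr *table, const char *soc_id,
	       const char *revision)
{
	assert(table && soc_id && revision);

	for (; table->soc_id || table->revision; table++) {
		if (table->soc_id && strcmp(table->soc_id, soc_id) != 0)
			continue;
		if (table->revision && strcmp(table->revision, revision) != 0)
			continue;
		return table;
	}

	return NULL;
}

/* Speed cap in Mbit/s: 100 for the quirky revision, 1000 otherwise. */
static int demo_max_speed_mbit(const char *soc_id, const char *revision)
{
	return demo_soc_match(demo_r8a7795es10, soc_id, revision) ? 100 : 1000;
}
```

Keying the limit on an exact SoC and revision pair rather than on the whole Gen3 family means later revisions and sibling SoCs get 1Gbps without further driver changes.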
Re: [PATCH] cfg80211 debugfs: Cleanup some checkpatch issues
On Fri, 2017-01-27 at 22:26 +0300, Pichugin Dmitry wrote:
> This fixes the checkpatch.pl warnings:
> * Macros should not use a trailing semicolon.
> * Spaces required around that '='.
> * Symbolic permissions 'S_IRUGO' are not preferred.

OK

> * Macro argument reuse 'buflen' - possible side-effects

Not all checkpatch messages need fixing.
This is one of them.

> diff --git a/net/wireless/debugfs.c b/net/wireless/debugfs.c
[]
> @@ -17,11 +17,12 @@
> static ssize_t name## _read(struct file *file, char __user *userbuf, \
> size_t count, loff_t *ppos) \
> {\
> - struct wiphy *wiphy= file->private_data;\
> - char buf[buflen]; \
> + struct wiphy *wiphy = file->private_data; \
> + int __buflen = __builtin_constant_p(buflen) ? buflen : -1; \
> + char buf[__buflen]; \

That's rather an odd change too
[PATCH v3 net-next 1/2] ravb: Add tx and rx clock internal delays mode of APSR
From: Kazuya Mizuguchi

This patch enables tx and rx clock internal delay modes (TDM and RDM).

This is to address a failure in the case of 1Gbps communication using the
salvator-x board with the KSZ9031RNX phy. This has been reported to occur
with both the r8a7795 (H3) and r8a7796 (M3-W) SoCs.

With this change APSR internal delay modes are enabled for
"rgmii-id", "rgmii-rxid" and "rgmii-txid" phy modes as follows:

phy mode   | APSR delay mode
-----------+----------------
rgmii-id   | TDM and RDM
rgmii-rxid | RDM
rgmii-txid | TDM

Signed-off-by: Kazuya Mizuguchi
Signed-off-by: Simon Horman
Acked-by: Sergei Shtylyov
---
v3 [Simon Horman]
* Move comment to above ravb_set_delay_mode()

v2 [Simon Horman]
* As suggested by Sergei Shtylyov
  - Add a comment to indicate that APSR_DM appears to be undocumented.
  - Move chip_id check outside of ravb_set_delay_mode for consistency
  - Call ravb_modify() once in ravb_set_delay_mode()
* Enhance comment before calls to ravb_set_delay_mode()

v1 [Simon Horman]
- Combined patches
- Reworded changelog

v0 [Kazuya Mizuguchi]
---
 drivers/net/ethernet/renesas/ravb.h      | 10 ++
 drivers/net/ethernet/renesas/ravb_main.c | 23 +++
 2 files changed, 33 insertions(+)

diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
index f1109661a533..0525bd696d5d 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -76,6 +76,7 @@ enum ravb_reg {
 	CDAR20	= 0x0060,
 	CDAR21	= 0x0064,
 	ESR	= 0x0088,
+	APSR	= 0x008C,	/* R-Car Gen3 only */
 	RCR	= 0x0090,
 	RQC0	= 0x0094,
 	RQC1	= 0x0098,
@@ -248,6 +249,15 @@ enum ESR_BIT {
 	ESR_EIL		= 0x1000,
 };
 
+/* APSR */
+enum APSR_BIT {
+	APSR_MEMS	= 0x0002,
+	APSR_CMSW	= 0x0010,
+	APSR_DM		= 0x6000,	/* Undocumented? */
+	APSR_DM_RDM	= 0x2000,
+	APSR_DM_TDM	= 0x4000,
+};
+
 /* RCR */
 enum RCR_BIT {
 	RCR_EFFS	= 0x0001,
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 89ac1e3f6175..732cdea7800b 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1904,6 +1904,23 @@ static void ravb_set_config_mode(struct net_device *ndev)
 	}
 }
 
+/* Set tx and rx clock internal delay modes */
+static void ravb_set_delay_mode(struct net_device *ndev)
+{
+	struct ravb_private *priv = netdev_priv(ndev);
+	int set = 0;
+
+	if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
+	    priv->phy_interface == PHY_INTERFACE_MODE_RGMII_RXID)
+		set |= APSR_DM_RDM;
+
+	if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
+	    priv->phy_interface == PHY_INTERFACE_MODE_RGMII_TXID)
+		set |= APSR_DM_TDM;
+
+	ravb_modify(ndev, APSR, APSR_DM, set);
+}
+
 static int ravb_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node;
@@ -2016,6 +2033,9 @@ static int ravb_probe(struct platform_device *pdev)
 	/* Request GTI loading */
 	ravb_modify(ndev, GCCR, GCCR_LTI, GCCR_LTI);
 
+	if (priv->chip_id != RCAR_GEN2)
+		ravb_set_delay_mode(ndev);
+
 	/* Allocate descriptor base address table */
 	priv->desc_bat_size = sizeof(struct ravb_desc) * DBAT_ENTRY_NUM;
 	priv->desc_bat = dma_alloc_coherent(ndev->dev.parent, priv->desc_bat_size,
@@ -2152,6 +2172,9 @@ static int __maybe_unused ravb_resume(struct device *dev)
 	/* Request GTI loading */
 	ravb_modify(ndev, GCCR, GCCR_LTI, GCCR_LTI);
 
+	if (priv->chip_id != RCAR_GEN2)
+		ravb_set_delay_mode(ndev);
+
 	/* Restore descriptor base address table */
 	ravb_write(ndev, priv->desc_bat_dma, DBAT);
 
--
2.7.0.rc3.207.g0ac5344
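The phy-mode table in the changelog and the single ravb_modify() call can be checked with a small userspace model. This is a sketch only: the APSR_DM* values are copied from the patch, but enum phy_mode, delay_bits(), modify() and apsr_after() are hypothetical names, and modify() assumes that ravb_modify() performs a plain read-modify-write, (old & ~clear) | set, which the patch itself does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from the APSR_BIT enum in the patch. */
enum {
	APSR_DM     = 0x6000,	/* mask covering both delay bits */
	APSR_DM_RDM = 0x2000,	/* rx clock internal delay */
	APSR_DM_TDM = 0x4000,	/* tx clock internal delay */
};

/* Hypothetical stand-in for the kernel's phy_interface_t values. */
enum phy_mode {
	MODE_RGMII,
	MODE_RGMII_ID,
	MODE_RGMII_RXID,
	MODE_RGMII_TXID,
};

/* Mirrors the two if-statements in ravb_set_delay_mode(). */
static uint32_t delay_bits(enum phy_mode mode)
{
	uint32_t set = 0;

	if (mode == MODE_RGMII_ID || mode == MODE_RGMII_RXID)
		set |= APSR_DM_RDM;
	if (mode == MODE_RGMII_ID || mode == MODE_RGMII_TXID)
		set |= APSR_DM_TDM;
	return set;
}

/* Assumed semantics of ravb_modify(): clear the mask, then OR in 'set'. */
static uint32_t modify(uint32_t reg, uint32_t clear, uint32_t set)
{
	/* Sanity check added for this sketch: only touch masked bits. */
	assert((set & ~clear) == 0);
	return (reg & ~clear) | set;
}

/* APSR value after applying the delay mode for 'mode' to 'apsr'. */
static uint32_t apsr_after(uint32_t apsr, enum phy_mode mode)
{
	return modify(apsr, APSR_DM, delay_bits(mode));
}
```

Because the whole APSR_DM mask is cleared first, a plain "rgmii" mode also resets delay bits that were set earlier, while bits outside the mask (such as APSR_MEMS and APSR_CMSW) are left untouched.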
[PATCH v3 net-next 0/2] ravb: Support 1Gbps on R-Car H3 ES1.1+ and R-Car M3-W
Hi,

this series adds support for gigabit communication to the Renesas
EthernetAVB controller when used in conjunction with R-Car Gen3 H3 ES1.1+
and M3-W SoCs. Gigabit is already supported with R-Car Gen 2 SoCs.

The patch from Geert was previously posted for inclusion in v4.10 and acked
by Dave for that purpose. It was, however, not accepted by the ARM SoC
maintainers.

The patch from Mizuguchi-san is to address timing problems observed with
gigabit transfers. I would like it considered although my own testing on
M3-W did not show any timing problems.

Changes since v1:
* Address various feedback for "APSR" patch as noted in its changelog

Geert Uytterhoeven (1):
  ravb: Support 1Gbps on R-Car H3 ES1.1+ and R-Car M3-W

Kazuya Mizuguchi (1):
  ravb: Add tx and rx clock internal delays mode of APSR

 drivers/net/ethernet/renesas/ravb.h      | 10 ++
 drivers/net/ethernet/renesas/ravb_main.c | 33 ++--
 2 files changed, 41 insertions(+), 2 deletions(-)

--
2.7.0.rc3.207.g0ac5344