[PATCH] RDS: Simplify code
Calling 'list_splice' followed by 'INIT_LIST_HEAD' is equivalent to 'list_splice_init'. This has been spotted with the following coccinelle script: / @@ expression y,z; @@ - list_splice(y,z); - INIT_LIST_HEAD(y); + list_splice_init(y,z); Signed-off-by: Christophe JAILLET--- net/rds/loop.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/rds/loop.c b/net/rds/loop.c index f2bf78de5688..c3e6da4fdf97 100644 --- a/net/rds/loop.c +++ b/net/rds/loop.c @@ -167,8 +167,7 @@ void rds_loop_exit(void) /* avoid calling conn_destroy with irqs off */ spin_lock_irq(_conns_lock); - list_splice(_conns, _list); - INIT_LIST_HEAD(_conns); + list_splice_init(_conns, _list); spin_unlock_irq(_conns_lock); list_for_each_entry_safe(lc, _lc, _list, loop_node) { -- 2.7.4
Re: [BUG] af_packet FANOUT and device dismantle
From: Eric DumazetDate: Thu, 01 Sep 2016 17:04:43 -0700 > When you added fanout in commit dc99f600698dcac, > it seems a device dismantle is not properly handled. > > packet_notifier() does properly finds all sockets attached to the > device and we call __unregister_prot_hook() > > But the actual dev_remove_pack() is called when the last socket attached > to the FANOUT group is _closed_ , possibly minutes after the device > disappeared. > > So we trigger the BUG_ON(!list_empty(>ptype_all)); in > netdev_run_todo(), that came with linux-4.0 in > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7866a621043fbaca3d7389e9b9f69dd1a2e5a855 > > Fix would be to call dev_remote_pack() from __fanout_unlink() when > num_members becomes 0. > > Let me know if you agree with this, thanks. I completely agree with your analysis, good catch.
Re: [PATCH v3 0/4] net: smsc911x: Move phy and interrupt config
From: Jeremy LintonDate: Thu, 1 Sep 2016 15:15:05 -0500 > v2-v3: Move error handing into separate patch, replace a couple cases > of fixed errors with the errors being returned from the failing functions. > Hoist irq handler. > > The smsc911x driver is doing a number of things in its probe routine that > should be delayed until the interface is started. Because of this, the module > cannot be unloaded, the phy states are incorrect/stale if the interface isn't > running, open's unnecessarily fail causing network configuration problems, and > the /proc/irq nodes are incorrectly named. > > Clean up a number of these problems by moving the mdio and interrupt > configuration into the smsc911x_open routine. Series applied, thanks.
Re: [PATCH net-next V5 0/8] liquidio CN23XX support
From: Raghu VatsavayiDate: Thu, 1 Sep 2016 11:16:03 -0700 > I am posting the remaining half of patchset after the > acceptance of first half. With this patchset I am able > to completely submit the code of V3 patchset which you > earlier advised me to split into smaller ones. > > This V5 patch also addresses all the comments from previous > submission: > 1) Avoid busy loop while reading registers. > 2) Other minor comments about debug messages and constants. > > Please apply patches in following order as some of the > patches depend on earlier patches. Series applied.
Re: [PATCH net-next 0/3] tipc: improve broadcast NACK mechanism
From: Jon MaloyDate: Thu, 1 Sep 2016 13:52:48 -0400 > The broadcast protocol has turned out to not scale well beyond 70-80 > nodes, while it is now possible to build TIPC clusters of at least ten > times that size. This commit series improves the NACK/retransmission > mechanism of the broadcast protocol to make is at scalable as the rest > of TIPC. Series applied, thanks Jon.
Re: [PATCH v4 0/6] Support the rk3399 gmac and pd function
From: Caesar WangDate: Fri, 2 Sep 2016 01:49:58 +0800 > This patch have the following changes: > > 7edf13e net: stmmac: dwmac-rk: add rk3366 & rk3399 specific data > 26e004e net: stmmac: dwmac-rk: fixes the gmac resume after PD on/off > b216c2f net: stmmac: dwmac-rk: add pd_gmac support for rk3399 > 848bb71 arm64: dts: rockchip: add the gmac power domain on rk3399 > 508e41f arm64: dts: rockchip: add the gmac needed node for rk3399 > fb26795 arm64: dts: rockchip: enable the gmac for rk3399 evb board > > Hi David, > The patch 1,2,3 is related to the rockchip net/stammc driver, > > Hi Heiko, > The patch 4,5,6 is related to the dts changes. Patches 1, 2, and 3 applied to net-next, thanks.
[PATCH 0/2] hso: neatening
This seems to be the only code in the kernel that uses macro defines with a trailing underscore. Fix that. Joe Perches (2): hso: Use a more common logging style hso: Convert printk to pr_ drivers/net/usb/hso.c | 118 +++--- 1 file changed, 55 insertions(+), 63 deletions(-) -- 2.10.0.rc2.1.g053435c
[PATCH 2/2] hso: Convert printk to pr_
Use a more common logging style Miscellanea: o Add pr_fmt to prefix each output message o Realign arguments Signed-off-by: Joe Perches--- drivers/net/usb/hso.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c index 6c37512..e7b5163 100644 --- a/drivers/net/usb/hso.c +++ b/drivers/net/usb/hso.c @@ -50,6 +50,8 @@ * */ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + #include #include #include @@ -638,7 +640,7 @@ static int get_free_serial_index(void) } spin_unlock_irqrestore(_table_lock, flags); - printk(KERN_ERR "%s: no free serial devices in table\n", __func__); + pr_err("%s: no free serial devices in table\n", __func__); return -1; } @@ -1102,7 +1104,7 @@ static void _hso_serial_set_termios(struct tty_struct *tty, struct hso_serial *serial = tty->driver_data; if (!serial) { - printk(KERN_ERR "%s: no tty structures", __func__); + pr_err("%s: no tty structures", __func__); return; } @@ -1347,7 +1349,7 @@ static int hso_serial_write(struct tty_struct *tty, const unsigned char *buf, /* sanity check */ if (serial == NULL) { - printk(KERN_ERR "%s: serial is NULL\n", __func__); + pr_err("%s: serial is NULL\n", __func__); return -ENODEV; } @@ -1773,7 +1775,7 @@ static int mux_device_request(struct hso_serial *serial, u8 type, u16 port, /* Sanity check */ if (!serial || !ctrl_urb || !ctrl_req) { - printk(KERN_ERR "%s: Wrong arguments\n", __func__); + pr_err("%s: Wrong arguments\n", __func__); return -EINVAL; } @@ -3220,7 +3222,7 @@ static int __init hso_init(void) int result; /* put it in the log */ - printk(KERN_INFO "hso: %s\n", version); + pr_info("%s\n", version); /* Initialise the serial table semaphore and table */ spin_lock_init(_table_lock); @@ -3251,16 +3253,15 @@ static int __init hso_init(void) /* register the tty driver */ result = tty_register_driver(tty_drv); if (result) { - printk(KERN_ERR "%s - tty_register_driver failed(%d)\n", - __func__, result); + pr_err("%s - tty_register_driver failed(%d)\n", + __func__, result); goto err_free_tty; } /* register this module as an usb driver */ result = usb_register(_driver); if (result) { - printk(KERN_ERR "Could not register hso driver? error: %d\n", - result); + pr_err("Could not register hso driver - error: %d\n", result); goto err_unreg_tty; } @@ -3275,7 +3276,7 @@ err_free_tty: static void __exit hso_exit(void) { - printk(KERN_INFO "hso: unloaded\n"); + pr_info("unloaded\n"); tty_unregister_driver(tty_drv); put_tty_driver(tty_drv); -- 2.10.0.rc2.1.g053435c
[PATCH 1/2] hso: Use a more common logging style
Macros that end in an underscore are just odd. Add hso_dbg(level, fmt, ...) and use it everwhere instead. Several uses had additional unnecessary newlines as the macro added a newline. Remove the newline from the macro and add newlines to each use as appropriate. Remove now unused D macros. Signed-off-by: Joe Perches--- drivers/net/usb/hso.c | 97 +++ 1 file changed, 44 insertions(+), 53 deletions(-) diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c index c5544d3..6c37512 100644 --- a/drivers/net/usb/hso.c +++ b/drivers/net/usb/hso.c @@ -108,23 +108,12 @@ /*/ /* Debugging functions */ /*/ -#define D__(lvl_, fmt, arg...) \ - do {\ - printk(lvl_ "[%d:%s]: " fmt "\n", \ - __LINE__, __func__, ## arg); \ - } while (0) - -#define D_(lvl, args...) \ - do {\ - if (lvl & debug)\ - D__(KERN_INFO, args); \ - } while (0) - -#define D1(args...)D_(0x01, ##args) -#define D2(args...)D_(0x02, ##args) -#define D3(args...)D_(0x04, ##args) -#define D4(args...)D_(0x08, ##args) -#define D5(args...)D_(0x10, ##args) +#define hso_dbg(lvl, fmt, ...) \ +do { \ + if ((lvl) & debug) \ + pr_info("[%d:%s] " fmt, \ + __LINE__, __func__, ##__VA_ARGS__); \ +} while (0) /*/ /* Enumerators */ @@ -709,7 +698,8 @@ static void handle_usb_error(int status, const char *function, } /* log a meaningful explanation of an USB status */ - D1("%s: received USB status - %s (%d)", function, explanation, status); + hso_dbg(0x1, "%s: received USB status - %s (%d)\n", + function, explanation, status); } /* Network interface functions */ @@ -808,7 +798,7 @@ static netdev_tx_t hso_net_start_xmit(struct sk_buff *skb, DUMP1(skb->data, skb->len); /* Copy it from kernel memory to OUR memory */ memcpy(odev->mux_bulk_tx_buf, skb->data, skb->len); - D1("len: %d/%d", skb->len, MUX_BULK_TX_BUF_SIZE); + hso_dbg(0x1, "len: %d/%d\n", skb->len, MUX_BULK_TX_BUF_SIZE); /* Fill in the URB for shipping it out. */ usb_fill_bulk_urb(odev->mux_bulk_tx_urb, @@ -872,7 +862,7 @@ static void packetizeRx(struct hso_net *odev, unsigned char *ip_pkt, unsigned char *tmp_rx_buf; /* log if needed */ - D1("Rx %d bytes", count); + hso_dbg(0x1, "Rx %d bytes\n", count); DUMP(ip_pkt, min(128, (int)count)); while (count) { @@ -912,7 +902,7 @@ static void packetizeRx(struct hso_net *odev, unsigned char *ip_pkt, frame_len); if (!odev->skb_rx_buf) { /* We got no receive buffer. */ - D1("could not allocate memory"); + hso_dbg(0x1, "could not allocate memory\n"); odev->rx_parse_state = WAIT_SYNC; continue; } @@ -972,11 +962,11 @@ static void packetizeRx(struct hso_net *odev, unsigned char *ip_pkt, break; case WAIT_SYNC: - D1(" W_S"); + hso_dbg(0x1, " W_S\n"); count = 0; break; default: - D1(" "); + hso_dbg(0x1, "\n"); count--; break; } @@ -1020,7 +1010,7 @@ static void read_bulk_callback(struct urb *urb) /* Sanity check */ if (!odev || !test_bit(HSO_NET_RUNNING, >flags)) { - D1("BULK IN callback but driver is not active!"); + hso_dbg(0x1, "BULK IN callback but driver is not active!\n"); return; } usb_mark_last_busy(urb->dev); @@ -1116,7 +1106,7 @@ static void _hso_serial_set_termios(struct tty_struct *tty, return; } - D4("port %d", serial->minor); + hso_dbg(0x8, "port %d\n", serial->minor); /*
Re: [ovs-dev] [PATCH net-next v21 3/4] openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes
On Fri, Sep 2, 2016 at 2:42 PM, pravin shelarwrote: > On Thu, Sep 1, 2016 at 1:45 PM, Eric Garver wrote: >> Add support for 802.1ad including the ability to push and pop double >> tagged vlans. Add support for 802.1ad to netlink parsing and flow >> conversion. Uses double nested encap attributes to represent double >> tagged vlan. Inner TPID encoded along with ctci in nested attributes. >> >> This is based on Thomas F Herbert's original v20 patch. I made some >> small clean ups and bug fixes. >> >> Signed-off-by: Thomas F Herbert >> Signed-off-by: Eric Garver > checkpatch.pl is also not happy with the patch, It found bunch of issues. ./scripts/checkpatch.pl 0003-openvswitch-802.1AD-Flow-handling-actions-vlan-parsi.patch ... ... total: 6 errors, 9 warnings, 1 checks, 460 lines checked
Re: [ovs-dev] [PATCH net-next v21 4/4] openvswitch: report error on VLAN nlattr in ovs_key_from_nlattrs()
On Thu, Sep 1, 2016 at 1:45 PM, Eric Garverwrote: > With 802.1ad support these are parsed and set upfront by > parse_vlan_from_nlattrs() before ovs_key_from_nlattrs() is ever called. > As such we should never see a VLAN attribute in ovs_key_from_nlattrs(). > > Signed-off-by: Eric Garver > --- > net/openvswitch/flow_netlink.c | 19 +-- > 1 file changed, 5 insertions(+), 14 deletions(-) > > diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c > index fbe9e0e4792d..0f36aead744b 100644 > --- a/net/openvswitch/flow_netlink.c > +++ b/net/openvswitch/flow_netlink.c > @@ -923,20 +923,11 @@ static int ovs_key_from_nlattrs(struct net *net, struct > sw_flow_match *match, > } > > if (attrs & (1 << OVS_KEY_ATTR_VLAN)) { > - __be16 tci; > - > - tci = nla_get_be16(a[OVS_KEY_ATTR_VLAN]); > - if (!(tci & htons(VLAN_TAG_PRESENT))) { > - if (is_mask) > - OVS_NLERR(log, "VLAN TCI mask does not have > exact match for VLAN_TAG_PRESENT bit."); > - else > - OVS_NLERR(log, "VLAN TCI does not have > VLAN_TAG_PRESENT bit set."); > - > - return -EINVAL; > - } > - > - SW_FLOW_KEY_PUT(match, eth.vlan.tci, tci, is_mask); > - attrs &= ~(1 << OVS_KEY_ATTR_VLAN); > + /* VLAN attribute is always parsed before getting here since > it may > +* occur multiple times. > +*/ > + OVS_NLERR(log, "VLAN attribute unexpected."); > + return -EINVAL; > } > Is there reason for not merging this patch with earlier patch?
Re: [ovs-dev] [PATCH net-next v21 1/4] openvswitch: 802.1ad uapi changes.
On Thu, Sep 1, 2016 at 1:45 PM, Eric Garverwrote: > From: Thomas F Herbert > > openvswitch: Add support for 8021.AD > > Change the description of the VLAN tpid field. > > Signed-off-by: Thomas F Herbert Acked-by: Pravin B Shelar
Re: [ovs-dev] [PATCH net-next v21 2/4] vlan: Check for vlan ethernet types for 8021.q or 802.1ad
On Thu, Sep 1, 2016 at 1:45 PM, Eric Garverwrote: > This is to simplify using double tagged vlans. This function allows all > valid vlan ethertypes to be checked in a single function call. > Also replace some instances that check for both ETH_P_8021Q and > ETH_P_8021AD. > > Patch based on one originally by Thomas F Herbert. > > Signed-off-by: Thomas F Herbert > Signed-off-by: Eric Garver Acked-by: Pravin B Shelar
Re: [ovs-dev] [PATCH net-next v21 3/4] openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes
On Thu, Sep 1, 2016 at 1:45 PM, Eric Garverwrote: > Add support for 802.1ad including the ability to push and pop double > tagged vlans. Add support for 802.1ad to netlink parsing and flow > conversion. Uses double nested encap attributes to represent double > tagged vlan. Inner TPID encoded along with ctci in nested attributes. > > This is based on Thomas F Herbert's original v20 patch. I made some > small clean ups and bug fixes. > > Signed-off-by: Thomas F Herbert > Signed-off-by: Eric Garver Thanks for working on this. This version looks pretty clone to complete. > --- > net/openvswitch/actions.c | 16 +-- > net/openvswitch/flow.c | 64 > net/openvswitch/flow.h | 8 +- > net/openvswitch/flow_netlink.c | 227 > ++--- > net/openvswitch/vport.c| 7 +- > 5 files changed, 235 insertions(+), 87 deletions(-) > ... > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c > index 0ea128eeeab2..13f6ebdf379b 100644 > --- a/net/openvswitch/flow.c > +++ b/net/openvswitch/flow.c > @@ -302,24 +302,56 @@ static bool icmp6hdr_ok(struct sk_buff *skb) > sizeof(struct icmp6hdr)); > } > > -static int parse_vlan(struct sk_buff *skb, struct sw_flow_key *key) > +/** > + * Parse vlan tag from vlan header. > + * Returns ERROR on memory error. > + * Returns 0 if it encounters a non-vlan or incomplete packet. > + * Returns 1 after successfully parsing vlan tag. > + */ > +static int parse_vlan_tag(struct sk_buff *skb, struct vlan_head *vlan) > { > - struct qtag_prefix { > - __be16 eth_type; /* ETH_P_8021Q */ > - __be16 tci; > - }; > - struct qtag_prefix *qp; > + struct vlan_head *qp = (struct vlan_head *)skb->data; > > - if (unlikely(skb->len < sizeof(struct qtag_prefix) + sizeof(__be16))) > + if (likely(!eth_type_vlan(qp->tpid))) > return 0; > > - if (unlikely(!pskb_may_pull(skb, sizeof(struct qtag_prefix) + > -sizeof(__be16 > + if (unlikely(skb->len < sizeof(struct vlan_head) + sizeof(__be16))) > + return 0; > + > + if (unlikely(!pskb_may_pull(skb, sizeof(struct vlan_head) + > +sizeof(__be16 > return -ENOMEM; > pskb_may_pull() can change skb->data, so you need to refresh qp pointer. > - qp = (struct qtag_prefix *) skb->data; > - key->eth.tci = qp->tci | htons(VLAN_TAG_PRESENT); > - __skb_pull(skb, sizeof(struct qtag_prefix)); > + vlan->tci = qp->tci | htons(VLAN_TAG_PRESENT); > + vlan->tpid = qp->tpid; > + > + __skb_pull(skb, sizeof(struct vlan_head)); > + return 1; > +} > + ... ... > diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c > index c78a6a1476fb..fbe9e0e4792d 100644 > --- a/net/openvswitch/flow_netlink.c > +++ b/net/openvswitch/flow_netlink.c ... > +static int __parse_vlan_from_nlattrs(struct sw_flow_match *match, > +u64 *key_attrs, bool inner, > +const struct nlattr **a, bool is_mask, > +bool log) > +{ > + int err; > + const struct nlattr *encap; > + > + err = encode_vlan_from_nlattrs(match, a, is_mask, inner, log); > + if (err) > + return err; > + > + *key_attrs &= ~(1 << OVS_KEY_ATTR_ENCAP); > + > + /* Ensure that tci key attribute isn't > +* overwritten by encapsulated customer tci. > +* Ethertype is cleared because it is c_tpid. > +*/ > + *key_attrs &= ~(1 << OVS_KEY_ATTR_VLAN); > + *key_attrs &= ~(1 << OVS_KEY_ATTR_ETHERTYPE); > + > + encap = a[OVS_KEY_ATTR_ENCAP]; > + > + if (is_mask) > + err = parse_flow_mask_nlattrs(encap, a, key_attrs, log); > + else > + err = parse_flow_nlattrs(encap, a, key_attrs, log); > + > + return err; > +} > + > +static int parse_vlan_from_nlattrs(struct sw_flow_match *match, > + u64 *key_attrs, bool *ie_valid, > + const struct nlattr **a, bool is_mask, > + bool log) > +{ > + int err; > + > + err = __parse_vlan_from_nlattrs(match, key_attrs, > + false, a, is_mask, log); > + if (err) > + return err; > + > + if (!is_mask) { > + if ((*key_attrs & (1 << OVS_KEY_ATTR_ETHERTYPE)) && > + eth_type_vlan(nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]))) { > + > + if (!((*key_attrs & (1 << OVS_KEY_ATTR_VLAN)) && > + (*key_attrs & (1 << OVS_KEY_ATTR_ENCAP { > + OVS_NLERR(log, "Invalid Inner VLAN frame"); > +
Re: [PATCH net-next 2/3] smsc95xx: Add register define
On Fri, 2016-09-02 at 20:34 +, woojung@microchip.com wrote: > From: Woojung Huh> > Add STRAP_STATUS defines. [] > diff --git a/drivers/net/usb/smsc95xx.h b/drivers/net/usb/smsc95xx.h [] > @@ -144,6 +144,14 @@ > > #define BURST_CAP(0x38) > > +#define STRAP_STATUS(0x3C) > +#define STRAP_STATUS_PWR_SEL_ (0x0020) > +#define STRAP_STATUS_AMDIX_EN_ (0x0010) > +#define STRAP_STATUS_PORT_SWAP_ (0x0008) > +#define STRAP_STATUS_EEP_SIZE_ (0x0004) > +#define STRAP_STATUS_RMT_WKP_ (0x0002) > +#define STRAP_STATUS_EEP_DISABLE_ (0x0001) Using BIT would be more common. Ending the #defines with an underscore is just odd and unappealing.
[PATCH net-next 3/3] smsc95xx: Add mdix control via ethtool
From: Woojung HuhAdd mdix control through ethtool. Signed-off-by: Woojung Huh --- drivers/net/usb/smsc95xx.c | 109 +++-- 1 file changed, 106 insertions(+), 3 deletions(-) diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c index dc989a8..831aa33 100644 --- a/drivers/net/usb/smsc95xx.c +++ b/drivers/net/usb/smsc95xx.c @@ -33,7 +33,7 @@ #include "smsc95xx.h" #define SMSC_CHIPNAME "smsc95xx" -#define SMSC_DRIVER_VERSION"1.0.4" +#define SMSC_DRIVER_VERSION"1.0.5" #define HS_USB_PKT_SIZE(512) #define FS_USB_PKT_SIZE(64) #define DEFAULT_HS_BURST_CAP_SIZE (16 * 1024 + 5 * HS_USB_PKT_SIZE) @@ -64,6 +64,7 @@ #define CARRIER_CHECK_DELAY (2 * HZ) struct smsc95xx_priv { + u32 chip_id; u32 mac_cr; u32 hash_hi; u32 hash_lo; @@ -71,6 +72,7 @@ struct smsc95xx_priv { spinlock_t mac_cr_lock; u8 features; u8 suspend_flags; + u8 mdix_ctrl; bool link_ok; struct delayed_work carrier_check; struct usbnet *dev; @@ -782,14 +784,113 @@ static int smsc95xx_ethtool_set_wol(struct net_device *net, return ret; } +static int get_mdix_status(struct net_device *net) +{ + struct usbnet *dev = netdev_priv(net); + u32 val; + int buf; + + buf = smsc95xx_mdio_read(dev->net, dev->mii.phy_id, SPECIAL_CTRL_STS); + if (buf & SPECIAL_CTRL_STS_OVRRD_AMDIX_) { + if (buf & SPECIAL_CTRL_STS_AMDIX_ENABLE_) + return ETH_TP_MDI_AUTO; + else if (buf & SPECIAL_CTRL_STS_AMDIX_STATE_) + return ETH_TP_MDI_X; + } else { + buf = smsc95xx_read_reg(dev, STRAP_STATUS, ); + if (val & STRAP_STATUS_AMDIX_EN_) + return ETH_TP_MDI_AUTO; + } + + return ETH_TP_MDI; +} + +static void set_mdix_status(struct net_device *net, __u8 mdix_ctrl) +{ + struct usbnet *dev = netdev_priv(net); + struct smsc95xx_priv *pdata = (struct smsc95xx_priv *)(dev->data[0]); + int buf; + + if ((pdata->chip_id == ID_REV_CHIP_ID_9500A_) || + (pdata->chip_id == ID_REV_CHIP_ID_9530_) || + (pdata->chip_id == ID_REV_CHIP_ID_89530_) || + (pdata->chip_id == ID_REV_CHIP_ID_9730_)) { + /* Extend Manual AutoMDIX timer for 9500A/9500Ai */ + buf = smsc95xx_mdio_read(dev->net, dev->mii.phy_id, +PHY_EDPD_CONFIG); + buf |= PHY_EDPD_CONFIG_EXT_CROSSOVER_; + smsc95xx_mdio_write(dev->net, dev->mii.phy_id, + PHY_EDPD_CONFIG, buf); + } + + if (mdix_ctrl == ETH_TP_MDI) { + buf = smsc95xx_mdio_read(dev->net, dev->mii.phy_id, +SPECIAL_CTRL_STS); + buf |= SPECIAL_CTRL_STS_OVRRD_AMDIX_; + buf &= ~(SPECIAL_CTRL_STS_AMDIX_ENABLE_ | +SPECIAL_CTRL_STS_AMDIX_STATE_); + smsc95xx_mdio_write(dev->net, dev->mii.phy_id, + SPECIAL_CTRL_STS, buf); + } else if (mdix_ctrl == ETH_TP_MDI_X) { + buf = smsc95xx_mdio_read(dev->net, dev->mii.phy_id, +SPECIAL_CTRL_STS); + buf |= SPECIAL_CTRL_STS_OVRRD_AMDIX_; + buf &= ~(SPECIAL_CTRL_STS_AMDIX_ENABLE_ | +SPECIAL_CTRL_STS_AMDIX_STATE_); + buf |= SPECIAL_CTRL_STS_AMDIX_STATE_; + smsc95xx_mdio_write(dev->net, dev->mii.phy_id, + SPECIAL_CTRL_STS, buf); + } else if (mdix_ctrl == ETH_TP_MDI_AUTO) { + buf = smsc95xx_mdio_read(dev->net, dev->mii.phy_id, +SPECIAL_CTRL_STS); + buf &= ~SPECIAL_CTRL_STS_OVRRD_AMDIX_; + buf &= ~(SPECIAL_CTRL_STS_AMDIX_ENABLE_ | +SPECIAL_CTRL_STS_AMDIX_STATE_); + buf |= SPECIAL_CTRL_STS_AMDIX_ENABLE_; + smsc95xx_mdio_write(dev->net, dev->mii.phy_id, + SPECIAL_CTRL_STS, buf); + } + pdata->mdix_ctrl = mdix_ctrl; +} + +static int smsc95xx_get_settings(struct net_device *net, +struct ethtool_cmd *cmd) +{ + struct usbnet *dev = netdev_priv(net); + struct smsc95xx_priv *pdata = (struct smsc95xx_priv *)(dev->data[0]); + int retval; + + retval = usbnet_get_settings(net, cmd); + + cmd->eth_tp_mdix = pdata->mdix_ctrl; + cmd->eth_tp_mdix_ctrl = pdata->mdix_ctrl; + + return retval; +} + +static int smsc95xx_set_settings(struct net_device *net, +struct ethtool_cmd *cmd) +{ + struct usbnet *dev =
[PATCH net-next 1/3] smsc95xx: Add maintainer
From: Woojung HuhAdd Microchip Linux Driver Support as maintainer because this driver is maintaining by Microchip. Signed-off-by: Woojung Huh --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 58b5029..324e3b8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12279,6 +12279,7 @@ F: drivers/net/usb/smsc75xx.* USB SMSC95XX ETHERNET DRIVER M: Steve Glendinning +M: Microchip Linux Driver Support L: netdev@vger.kernel.org S: Maintained F: drivers/net/usb/smsc95xx.* -- 2.7.4
[PATCH net-next 2/3] smsc95xx: Add register define
From: Woojung HuhAdd STRAP_STATUS defines. Signed-off-by: Woojung Huh --- drivers/net/usb/smsc95xx.h | 8 1 file changed, 8 insertions(+) diff --git a/drivers/net/usb/smsc95xx.h b/drivers/net/usb/smsc95xx.h index 526faa0..29a4d9e 100644 --- a/drivers/net/usb/smsc95xx.h +++ b/drivers/net/usb/smsc95xx.h @@ -144,6 +144,14 @@ #define BURST_CAP (0x38) +#defineSTRAP_STATUS(0x3C) +#defineSTRAP_STATUS_PWR_SEL_ (0x0020) +#defineSTRAP_STATUS_AMDIX_EN_ (0x0010) +#defineSTRAP_STATUS_PORT_SWAP_ (0x0008) +#defineSTRAP_STATUS_EEP_SIZE_ (0x0004) +#defineSTRAP_STATUS_RMT_WKP_ (0x0002) +#defineSTRAP_STATUS_EEP_DISABLE_ (0x0001) + #define GPIO_WAKE (0x64) #define INT_EP_CTL (0x68) -- 2.7.4
[PATCH net-next 0/3] smsc95xx: patch serises
From: Woojung HuhWoojung Huh (3): Smsc95xx: Add maintainer smsc95xx: Add register define smsc95xx: Add mdix control via ethtool MAINTAINERS| 1 + drivers/net/usb/smsc95xx.c | 109 +++-- drivers/net/usb/smsc95xx.h | 8 3 files changed, 115 insertions(+), 3 deletions(-) -- 2.7.4
Re: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
On 02.09.2016 20:13, Linus Torvalds wrote: > > From: Linus Torvalds> Date: Thu, 1 Sep 2016 14:43:53 -0700 > Subject: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and > 'bindlock' > > Right now we use the 'readlock' both for protecting some of the af_unix > IO path and for making the bind be single-threaded. > > The two are independent, but using the same lock makes for a nasty > deadlock due to ordering with regards to filesystem locking. The bind > locking would want to nest outside the VSF pathname locking, but the IO > locking wants to nest inside some of those same locks. > > We tried to fix this earlier with commit c845acb324aa ("af_unix: Fix > splice-bind deadlock") which moved the readlock inside the vfs locks, > but that caused problems with overlayfs that will then call back into > filesystem routines that take the lock in the wrong order anyway. > > Splitting the locks means that we can go back to having the bind lock be > the outermost lock, and we don't have any deadlocks with lock ordering. > > Acked-by: Rainer Weikusat > Acked-by: Al Viro > Signed-off-by: Linus Torvalds > --- > > This patch is really trivial, and I've tried to be careful and look at the > locking, but somebody who really knows the AF_UNIX code should definitely > take a second look. > > Note that I did the revert (that re-introduces the original splice > deadlock) first, because that made the whole series much easier to > explain. Doing it in the other order made the revert nastier because this > patch obviously touches the same code that the revert in 1/2 does. > > So this way the series ends up being "go back to the original code with > the original deadlock, and then fix that original deadlock by splitting > the bind lock". Acked-by: Hannes Frederic Sowa
Re: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
On 02.09.2016 21:15, David Miller wrote: > From: Linus Torvalds> Date: Fri, 2 Sep 2016 11:17:18 -0700 > >> Oh, this was missing a >> >> Reported-and-tested-by: CAI Qian >> >> who found the new deadlock. >> >> There's now *another* lockdep deadlock report by him, but that one has >> nothing to do with networking. >> >> (And neither of these deadlocks will actually deadlock the machine in >> practice, but you can trigger the lockdep reports with some odd splice >> patterns and overlayfs use) > > I read over this and can't find any problems. > > The main thing I was concerned about was an I/O path that really > expects the socket's hash not to change for whatever reason, but even > all of the unix_find_other() calls are done outside of the mutex > already. Furthermore we don't allow rehashing, if the socket is "published" once, it can only be destroyed but can't change identity during its lifetime. IIRC this was another bug we had a while ago.
Re: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
On Fri, Sep 2, 2016 at 12:15 PM, David Millerwrote: > > The main thing I was concerned about was an I/O path that really > expects the socket's hash not to change for whatever reason, but even > all of the unix_find_other() calls are done outside of the mutex > already. That's my read of it too. I did actually go through the effort on trying to follow all the locking in there yesterday, so while I would want an AF_UNIX person to double-check me, I think it's fine. The real locking seem to be the u->lock spinlock (taken by unix_state_lock() and friends). The bindlock is only about serializing the binders, which do things that can sleep (notably the mknod of the unix socket node in the filesystem). The one thing I looked at and wasn't sure about was unix_stream_connect(), which creates a *new* unix domain socket and binds it to the old one without holding the bindlock. Note that it never did hold the lock, so that's not a new thing, but it worries me a bit. I *think* it's ok, because I think this all happens before that new socket is actually reachable, so it cannot race against somebody else doing a bind. And my patch doesn't actually change anything in this area, so I'm not making it worse. But it was the one thing I reacted to when going through the locking there. (Well, not the "one thing". The other thing I reacted to was the overall reaction that none of this is documented, and the locking was clearly not very clear). Linus
Re: [PATCH iproute2 net-next] nstat: add sctp snmp support
On Fri, 2 Sep 2016 12:46:49 -0700 Stephen Hemmingerwrote: > On Fri, 2 Sep 2016 15:12:38 +0800 > Hangbin Liu wrote: > > > SCTP module was not load by default. But this should be OK since we will not > > load table if fdopen() failed. > > > > Signed-off-by: Hangbin Liu > > This seems like a bad idea for the normal distro user. > If they run nstat command the SCTP kernel module would get loaded even > if never used. > > Makes more sense not to display SCTP statistics if it isn't loaded. > Never mind, opening the proc file won't load SCTP kernel module.
Re: [PATCH iproute2 net-next] nstat: add sctp snmp support
On Fri, 2 Sep 2016 15:12:38 +0800 Hangbin Liuwrote: > SCTP module was not load by default. But this should be OK since we will not > load table if fdopen() failed. > > Signed-off-by: Hangbin Liu This seems like a bad idea for the normal distro user. If they run nstat command the SCTP kernel module would get loaded even if never used. Makes more sense not to display SCTP statistics if it isn't loaded.
Re: [RFC Patch net-next 0/6] net_sched: really switch to RCU for tc actions
On Fri, Sep 2, 2016 at 12:09 AM, Jiri Pirkowrote: > > I wonder, do you happen to have a very tiny narrow screen? LOL, yeah, I should fix the indentation... ;)
Re: [net-next PATCH =v2] e1000: add initial XDP support
On 16-09-01 11:50 PM, Jesper Dangaard Brouer wrote: > On Thu, 01 Sep 2016 14:39:44 -0700 > John Fastabendwrote: > >> From: Alexei Starovoitov >> >> This patch adds initial support for XDP on e1000 driver. Note e1000 >> driver does not support page recycling in general which could be >> added as a further improvement. However XDP_DROP case will recycle. >> XDP_TX and XDP_PASS do not support recycling yet. >> >> This patch includes the rcu_read_lock/rcu_read_unlock pair noted by >> Brenden Blanco in another pending patch. >> >> net/mlx4_en: protect ring->xdp_prog with rcu_read_lock >> >> I tested this patch running e1000 in a VM using KVM over a tap >> device. >> >> CC: William Tu >> Signed-off-by: Alexei Starovoitov >> Signed-off-by: John Fastabend >> --- >> drivers/net/ethernet/intel/e1000/e1000.h |2 >> drivers/net/ethernet/intel/e1000/e1000_main.c | 170 >> + >> 2 files changed, 169 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/net/ethernet/intel/e1000/e1000.h >> b/drivers/net/ethernet/intel/e1000/e1000.h >> index d7bdea7..5cf8a0a 100644 >> --- a/drivers/net/ethernet/intel/e1000/e1000.h >> +++ b/drivers/net/ethernet/intel/e1000/e1000.h >> @@ -150,6 +150,7 @@ struct e1000_adapter; >> */ >> struct e1000_tx_buffer { >> struct sk_buff *skb; >> +struct page *page; >> dma_addr_t dma; >> unsigned long time_stamp; >> u16 length; >> @@ -279,6 +280,7 @@ struct e1000_adapter { >> struct e1000_rx_ring *rx_ring, >> int cleaned_count); >> struct e1000_rx_ring *rx_ring; /* One per active queue */ >> +struct bpf_prog *prog; > > The bpf_prog should be in the rx_ring structure. > ok sure it helps I guess if you use e1000 as a template for implementing XDP and logically makes a bit more sense. But it doesn't functionally matter here. >> struct napi_struct napi; >> >> int num_tx_queues; [...] >> +static void e1000_xmit_raw_frame(struct e1000_rx_buffer *rx_buffer_info, >> + unsigned int len, >> + struct net_device *netdev, >> + struct e1000_adapter *adapter) >> +{ >> +struct netdev_queue *txq = netdev_get_tx_queue(netdev, 0); >> +struct e1000_hw *hw = >hw; >> +struct e1000_tx_ring *tx_ring; >> + >> +if (len > E1000_MAX_DATA_PER_TXD) >> +return; >> + >> +/* e1000 only support a single txq at the moment so the queue is being >> + * shared with stack. To support this requires locking to ensure the >> + * stack and XDP are not running at the same time. Devices with >> + * multiple queues should allocate a separate queue space. >> + */ >> +HARD_TX_LOCK(netdev, txq, smp_processor_id()); >> + >> +tx_ring = adapter->tx_ring; >> + >> +if (E1000_DESC_UNUSED(tx_ring) < 2) >> +return; >> + >> +e1000_tx_map_rxpage(tx_ring, rx_buffer_info, len); >> + >> +e1000_tx_queue(adapter, tx_ring, 0/*tx_flags*/, 1); >> + >> +writel(tx_ring->next_to_use, hw->hw_addr + tx_ring->tdt); >> +mmiowb(); >> + >> +HARD_TX_UNLOCK(netdev, txq); >> +} > > Above is going to give really bad XDP_TX performance. Both locking and > a HW TX tailptr pointer per TX packet, that is as bad as it gets. > Yep. > You might say this is just for testing my eBPF-XDP program. BUT people > wanting to try XDP is going to start with this driver, and they will be > disappointed and never return (and no they will not read the comment in > the code). hmm perhaps we should look at a vhost_net implementation for performance setup. My gut feeling is vhost_net is a better target for performance. > > It should be fairly easy to introduce a bulking/bundling XDP_TX > facility into the TX-ring (taking HARD_TX_LOCK a single time), and then > flush the TX-ring at the end of the loop (in e1000_clean_jumbo_rx_irq). > All you need is an array/stack of RX *buffer_info ptrs being build up > in the XDP_TX case. (Experiments show minimum bulking/array size should > be 8). > > If you want to get fancy, and save space in the bulking structure, > then you can even just use the RX ring index "i" to describe which RX > packets need to be XDP_TX'ed. (as the driver code "owns" this part of > the ring, until updating rx_ring->next_to_clean). > Sure I'll add this seems easy enough.
Re: [PATCH net 0/2] net: thunderx: Fixes for TSO offload issues
From: sunil.kovv...@gmail.com Date: Tue, 30 Aug 2016 11:36:25 +0530 > This patch series fixes couple of issues w.r.t HW TSO offload Series applied, thanks.
Re: [PATCH] net/mlx4_en: protect ring->xdp_prog with rcu_read_lock
On Fri, Sep 2, 2016 at 11:13 AM, Brenden Blancowrote: > On Mon, Aug 29, 2016 at 10:46:38AM -0700, Tom Herbert wrote: > [...] >> Brenden, tracking down how the structure is freed needed a few steps, >> please make sure the RCU requirements are well documented. Also, I'm > Really? It's just bpf_prog_put->call_rcu(__bpf_prog_put_rcu). I suppose > what's missing is a general guideline for which functions new consumers > of bpf should use, but I wouldn't trust myself to write such holistic > documentation accurately (e.g. interacting with nmi probes and such). >> still not a fan of using xchg to set the program, seems that a lock >> could be used in that path. > Where would such a lock go? Everything in mlx4/en_netdev.c relies on > rtnl, which seems sufficient and obvious...adding some new field > specific lock would be distracting and unneeded. Just use the same mutex_lock that is already being taken in case when priv->xdp_ring_num != xdp_ring_num. Then use normal rcu functions to dereference and assign pointers. Tom >> >> Thanks, >> Tom
Re: [PATCH net v2 1/2] ipv6: add missing netconf notif when 'all' is updated
From: Nicolas DichtelDate: Tue, 30 Aug 2016 10:09:21 +0200 > The 'default' value was not advertised. > > Fixes: f3a1bfb11ccb ("rtnl/ipv6: use netconf msg to advertise forwarding > status") > Signed-off-by: Nicolas Dichtel Applied.
Re: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
From: Linus TorvaldsDate: Fri, 2 Sep 2016 11:17:18 -0700 > Oh, this was missing a > > Reported-and-tested-by: CAI Qian > > who found the new deadlock. > > There's now *another* lockdep deadlock report by him, but that one has > nothing to do with networking. > > (And neither of these deadlocks will actually deadlock the machine in > practice, but you can trigger the lockdep reports with some odd splice > patterns and overlayfs use) I read over this and can't find any problems. The main thing I was concerned about was an I/O path that really expects the socket's hash not to change for whatever reason, but even all of the unix_find_other() calls are done outside of the mutex already.
Re: [PATCH net v2 2/2] netconf: add a notif when settings are created
From: Nicolas DichtelDate: Tue, 30 Aug 2016 10:09:22 +0200 > All changes are notified, but the initial state was missing. > > Signed-off-by: Nicolas Dichtel Applied.
Re: Centralizing support for TCAM?
> I need to get reasonable operations per second on the TCAM tables. > Reasonable for me being somewhere in the 50k to 100k add/del/update > commands per second. I'm hesitant to create more abstractions then > are actually needed. Hi John That is an interesting requirement. Could you explain why? The Marvell TCAM is accessed using MDIO. Maybe it can do 50 add/del/updates per second. For a WiFi access point, or cable modem, even 50 per second seems ample, they get set at boot and never changed. Andrew
Re: [PATCH 0/4] SA11x0 Clocks and removal of Neponset SMC91x hack
On Thu, Sep 01, 2016 at 04:32:41PM -0700, David Miller wrote: > From: Russell King - ARM Linux> Date: Tue, 30 Aug 2016 11:51:17 +0100 > > > Please ack these changes; due to the dependencies, I wish to merge > > them through my tree. Thanks. > > Acked-by: David S. Miller Many thanks David, I'll add that only to the two SMC91x patches. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: Centralizing support for TCAM?
On 16-09-02 10:18 AM, Florian Fainelli wrote: > Hi all, > Hi Florian, > (apologies for the long CC list and the fact that I can't type correctly > email addresses) > My favorite topic ;) > While working on adding support for the Broadcom Ethernet switches > Compact Field Processor (which is essentially a TCAM, > action/policer/rate meter RAMs, 256 entries), I started working with the > ethtool::rxnfc API which is actually kind of nice in that it fits nicely > with my use simple case of being able to insert rules at a given or > driver selected location and has a pretty good flow representation for > common things you may match: TCP/UDP v4/v6 (not so much for non-IP, or > L2 stuff though you can use the extension flow representation). It lacks > support for more complex actions other than redirect to a particular > port/queue. When I was doing this for one of the products I work on I decided that extending ethtool was likely not a good approach and building a netlink interface would be a better choice. My reasons were mainly extending ethtool is a bit painful to keep structure compatibility across versions and I also had use cases that wanted to get notifications both made easier when using netlink. However my netlink port+extensions were not accepted and were called a "kernel bypass" and the general opinion was that it was not going to be accepted upstream. Hence the 'tc' effort. > > Now ethtool::rxnfc is one possible user, but tc and netfiler also are, > more powerful and extensible, but since this is a resource constrained > piece of hardware, and it would suck for people to have to implement > these 3 APIs if we could come up with a central one that satisfies the > superset offered by tc + netfilter. We can surely imagine an use case we My opinion is that tc and netfilter are sufficiently different that building a common layer is challenging and is actually more complex vs just implementing two interfaces. Always happy to review code though. There is also an already established packet flow through tc, netfilter, fdb, l3 in linux that folks want to maintain. At the moment I just don't see the need for a common layer IMO. Also adding another layer of abstraction so we end up doing multiple translations into and out of these layers adds overhead. Eventually I need to get reasonable operations per second on the TCAM tables. Reasonable for me being somewhere in the 50k to 100k add/del/update commands per second. I'm hesitant to create more abstractions then are actually needed. > centralize the whole matching + action into a Domain Specific Language > that we compiled into eBPF and then translate into whatever the HW > understands, although that raises the question of where do we put the > translation tool in user space or kernel space. The eBPF to HW translation I started to look at but gave up. The issue was the program space of eBPF is much larger than any traditional parser, table hardware implementation can support so most programs get rejected (obvious observation right?). I'm more inclined to build hardware that can support eBPF vs restricting eBPF to fit into a parser/table model. Surely something like P4 (DSL) -> ebpf -> HW can constrain the ebpf programs so they can be loaded without issues. This might be worth while but mapping it onto 'tc' classifiers like cls_{u32|flower} is a bit more straight forward. > > So what's everybody's take on this? Seems a good time to bring up my other issue. When I have a pipeline with multiple TCAM tables I was trying to figure out how to abstract that in Linux. Something like the following TCAM -> exact match -> TCAM -> exact match So for now I was thinking of lifting two netdevs into linux something like, ethx-frontend, ethx-backend. Where rules added to the frontend go into the front part of the pipeline and rules added to the backend go into the second half of the pipeline. It probably needs more thought. > > Thanks! > Not sure that helps but my suggestion is to see if the cls_u32/cls_flower implementation that exists today solves at least the TCAM entry problem. Note the "order" field in u32 allows you to place rules in user specific order. .John
[PATCH net-next v2 1/3] net: dsa: mv88e6xxx: fix module naming
Since the mv88e6xxx.c file has been renamed, the driver compiled as a module is called chip.ko instead of mv88e6xxx.ko. Fix this. Fixes: fad09c73c270 ("net: dsa: mv88e6xxx: rename single-chip support") Signed-off-by: Vivien Didelot--- drivers/net/dsa/mv88e6xxx/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index 6e29a75..e58b745 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -1 +1,2 @@ -obj-$(CONFIG_NET_DSA_MV88E6XXX) += chip.o +obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx.o +mv88e6xxx-objs := chip.o -- 2.9.3
[PATCH net-next v2 0/3] net: dsa: mv88e6xxx: isolate Global2 support
Registers of Marvell chips are organized in internal SMI devices. One of them at address 0x1C is called Global2. It provides an extended set of registers, used for interrupt control, EEPROM access, indirect PHY access (to bypass the PHY Polling Unit) and cross-chip setup. Most chips have it, but some others don't (older ones such as 6060). Now that its related code is isolated in mv88e6xxx_g2_* functions, move it to its own global2.c file, making most of its setup code static. Then make its compilation optional, which allows to reduce the size of the mv88e6xxx driver for devices such as home routers embedding Ethernet chips without Global2 support. It is present on most recent chips, thus enable its support by default. Changes in v2: fail probe if GLOBAL2 is required but not enabled. Vivien Didelot (3): net: dsa: mv88e6xxx: fix module naming net: dsa: mv88e6xxx: move Global2 code net: dsa: mv88e6xxx: make global2 code optional drivers/net/dsa/mv88e6xxx/Kconfig | 11 + drivers/net/dsa/mv88e6xxx/Makefile| 4 +- drivers/net/dsa/mv88e6xxx/chip.c | 467 ++--- drivers/net/dsa/mv88e6xxx/global2.c | 471 ++ drivers/net/dsa/mv88e6xxx/global2.h | 88 +++ drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 6 + 6 files changed, 596 insertions(+), 451 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/global2.c create mode 100644 drivers/net/dsa/mv88e6xxx/global2.h -- 2.9.3
[PATCH net-next v2 3/3] net: dsa: mv88e6xxx: make global2 code optional
Since not every chip has a Global2 set of registers, make its support optional, in which case the related functions will return -EOPNOTSUPP. This also allows to reduce the size of the mv88e6xxx driver for devices such as home routers embedding Ethernet chips without Global2 support. It is present on most recent chips, thus enable its support by default. Signed-off-by: Vivien Didelot--- drivers/net/dsa/mv88e6xxx/Kconfig | 11 +++ drivers/net/dsa/mv88e6xxx/Makefile | 2 +- drivers/net/dsa/mv88e6xxx/chip.c| 4 +++ drivers/net/dsa/mv88e6xxx/global2.h | 58 + 4 files changed, 74 insertions(+), 1 deletion(-) diff --git a/drivers/net/dsa/mv88e6xxx/Kconfig b/drivers/net/dsa/mv88e6xxx/Kconfig index ac77737..4866688 100644 --- a/drivers/net/dsa/mv88e6xxx/Kconfig +++ b/drivers/net/dsa/mv88e6xxx/Kconfig @@ -6,3 +6,14 @@ config NET_DSA_MV88E6XXX help This driver adds support for most of the Marvell 88E6xxx models of Ethernet switch chips, except 88E6060. + +config NET_DSA_MV88E6XXX_GLOBAL2 + bool "Switch Global 2 Registers support" + default y + depends on NET_DSA_MV88E6XXX + help + This registers set at internal SMI address 0x1C provides extended + features like EEPROM interface, trunking, cross-chip setup, etc. + + It is required on most chips. If the chip you compile the support for + doesn't have such registers set, say N here. In doubt, say Y. diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index 5a42066..6971039 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -1,3 +1,3 @@ obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx.o mv88e6xxx-objs := chip.o -mv88e6xxx-objs += global2.o +mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_GLOBAL2) += global2.o diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 745e158..70a812d 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3444,6 +3444,10 @@ static int mv88e6xxx_detect(struct mv88e6xxx_chip *chip) /* Update the compatible info with the probed one */ chip->info = info; + err = mv88e6xxx_g2_require(chip); + if (err) + return err; + dev_info(chip->dev, "switch 0x%x detected: %s, revision %u\n", chip->info->prod_num, chip->info->name, rev); diff --git a/drivers/net/dsa/mv88e6xxx/global2.h b/drivers/net/dsa/mv88e6xxx/global2.h index bee1bc9..c4bb903 100644 --- a/drivers/net/dsa/mv88e6xxx/global2.h +++ b/drivers/net/dsa/mv88e6xxx/global2.h @@ -16,6 +16,13 @@ #include "mv88e6xxx.h" +#ifdef CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 + +static inline int mv88e6xxx_g2_require(struct mv88e6xxx_chip *chip) +{ + return 0; +} + int mv88e6xxx_g2_smi_phy_read(struct mv88e6xxx_chip *chip, int addr, int reg, u16 *val); int mv88e6xxx_g2_smi_phy_write(struct mv88e6xxx_chip *chip, int addr, int reg, @@ -27,4 +34,55 @@ int mv88e6xxx_g2_set_eeprom16(struct mv88e6xxx_chip *chip, struct ethtool_eeprom *eeprom, u8 *data); int mv88e6xxx_g2_setup(struct mv88e6xxx_chip *chip); +#else /* !CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 */ + +static inline int mv88e6xxx_g2_require(struct mv88e6xxx_chip *chip) +{ + if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_GLOBAL2)) { + dev_err(chip->dev, "this chip requires CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 enabled\n"); + return -EOPNOTSUPP; + } + + return 0; +} + +static inline int mv88e6xxx_g2_smi_phy_read(struct mv88e6xxx_chip *chip, + int addr, int reg, u16 *val) +{ + return -EOPNOTSUPP; +} + +static inline int mv88e6xxx_g2_smi_phy_write(struct mv88e6xxx_chip *chip, +int addr, int reg, u16 val) +{ + return -EOPNOTSUPP; +} + +static inline int mv88e6xxx_g2_set_switch_mac(struct mv88e6xxx_chip *chip, + u8 *addr) +{ + return -EOPNOTSUPP; +} + +static inline int mv88e6xxx_g2_get_eeprom16(struct mv88e6xxx_chip *chip, + struct ethtool_eeprom *eeprom, + u8 *data) +{ + return -EOPNOTSUPP; +} + +static inline int mv88e6xxx_g2_set_eeprom16(struct mv88e6xxx_chip *chip, + struct ethtool_eeprom *eeprom, + u8 *data) +{ + return -EOPNOTSUPP; +} + +static inline int mv88e6xxx_g2_setup(struct mv88e6xxx_chip *chip) +{ + return -EOPNOTSUPP; +} + +#endif /* CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 */ + #endif /* _MV88E6XXX_GLOBAL2_H */ -- 2.9.3
[PATCH net-next v2 2/3] net: dsa: mv88e6xxx: move Global2 code
Marvell chips are composed of multiple SMI devices. One of them at address 0x1C is called Global2. It provides an extended set of registers, used for interrupt control, EEPROM access, indirect PHY access (to bypass the PHY Polling Unit) and cross-chip related setup. Most chips have it, but some others don't (older ones such as 6060). Now that its related code is isolated in mv88e6xxx_g2_* functions, move it to its own global2.c file, making most of its setup code static. Document each registers in the meantime. Its compilation can be later avoided for chips without such registers. Signed-off-by: Vivien Didelot--- drivers/net/dsa/mv88e6xxx/Makefile| 1 + drivers/net/dsa/mv88e6xxx/chip.c | 463 + drivers/net/dsa/mv88e6xxx/global2.c | 471 ++ drivers/net/dsa/mv88e6xxx/global2.h | 30 +++ drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 6 + 5 files changed, 521 insertions(+), 450 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/global2.c create mode 100644 drivers/net/dsa/mv88e6xxx/global2.h diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index e58b745..5a42066 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -1,2 +1,3 @@ obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx.o mv88e6xxx-objs := chip.o +mv88e6xxx-objs += global2.o diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index d6b0f78..745e158 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -29,7 +29,9 @@ #include #include #include + #include "mv88e6xxx.h" +#include "global2.h" static void assert_reg_lock(struct mv88e6xxx_chip *chip) { @@ -182,8 +184,7 @@ static const struct mv88e6xxx_ops mv88e6xxx_smi_multi_chip_ops = { .write = mv88e6xxx_smi_multi_chip_write, }; -static int mv88e6xxx_read(struct mv88e6xxx_chip *chip, - int addr, int reg, u16 *val) +int mv88e6xxx_read(struct mv88e6xxx_chip *chip, int addr, int reg, u16 *val) { int err; @@ -199,8 +200,7 @@ static int mv88e6xxx_read(struct mv88e6xxx_chip *chip, return 0; } -static int mv88e6xxx_write(struct mv88e6xxx_chip *chip, - int addr, int reg, u16 val) +int mv88e6xxx_write(struct mv88e6xxx_chip *chip, int addr, int reg, u16 val) { int err; @@ -306,8 +306,7 @@ static int mv88e6xxx_serdes_write(struct mv88e6xxx_chip *chip, int reg, u16 val) reg, val); } -static int mv88e6xxx_wait(struct mv88e6xxx_chip *chip, int addr, int reg, - u16 mask) +int mv88e6xxx_wait(struct mv88e6xxx_chip *chip, int addr, int reg, u16 mask) { int i; @@ -330,8 +329,7 @@ static int mv88e6xxx_wait(struct mv88e6xxx_chip *chip, int addr, int reg, } /* Indirect write to single pointer-data register with an Update bit */ -static int mv88e6xxx_update(struct mv88e6xxx_chip *chip, int addr, int reg, - u16 update) +int mv88e6xxx_update(struct mv88e6xxx_chip *chip, int addr, int reg, u16 update) { u16 val; int err; @@ -2878,330 +2876,6 @@ static int mv88e6xxx_g1_setup(struct mv88e6xxx_chip *chip) return 0; } -static int mv88e6xxx_g2_device_mapping_write(struct mv88e6xxx_chip *chip, -int target, int port) -{ - u16 val = (target << 8) | (port & 0xf); - - return mv88e6xxx_update(chip, REG_GLOBAL2, GLOBAL2_DEVICE_MAPPING, val); -} - -static int mv88e6xxx_g2_set_device_mapping(struct mv88e6xxx_chip *chip) -{ - int target, port; - int err; - - /* Initialize the routing port to the 32 possible target devices */ - for (target = 0; target < 32; ++target) { - port = 0xf; - - if (target < DSA_MAX_SWITCHES) { - port = chip->ds->rtable[target]; - if (port == DSA_RTABLE_NONE) - port = 0xf; - } - - err = mv88e6xxx_g2_device_mapping_write(chip, target, port); - if (err) - break; - } - - return err; -} - -static int mv88e6xxx_g2_trunk_mask_write(struct mv88e6xxx_chip *chip, int num, -bool hask, u16 mask) -{ - const u16 port_mask = BIT(chip->info->num_ports) - 1; - u16 val = (num << 12) | (mask & port_mask); - - if (hask) - val |= GLOBAL2_TRUNK_MASK_HASK; - - return mv88e6xxx_update(chip, REG_GLOBAL2, GLOBAL2_TRUNK_MASK, val); -} - -static int mv88e6xxx_g2_trunk_mapping_write(struct mv88e6xxx_chip *chip, int id, - u16 map) -{ - const u16 port_mask = BIT(chip->info->num_ports) - 1; - u16 val = (id << 11) | (map & port_mask); - - return
Re: [PATCH net v2] l2tp: fix use-after-free during module unload
From: Sabrina DubrocaDate: Fri, 2 Sep 2016 10:22:54 +0200 > Tunnel deletion is delayed by both a workqueue (l2tp_tunnel_delete -> wq > -> l2tp_tunnel_del_work) and RCU (sk_destruct -> RCU -> > l2tp_tunnel_destruct). > > By the time l2tp_tunnel_destruct() runs to destroy the tunnel and finish > destroying the socket, the private data reserved via the net_generic > mechanism has already been freed, but l2tp_tunnel_destruct() actually > uses this data. > > Make sure tunnel deletion for the netns has completed before returning > from l2tp_exit_net() by first flushing the tunnel removal workqueue, and > then waiting for RCU callbacks to complete. > > Fixes: 167eb17e0b17 ("l2tp: create tunnel sockets in the right namespace") > Signed-off-by: Sabrina Dubroca > --- > v2: fix typo in commit message, spotted by Sergei Shtylyov Applied.
ipv6: release dst in ping_v6_sendmsg
Neither the failure or success paths of ping_v6_sendmsg release the dst it acquires. This leads to a flood of warnings from "net/core/dst.c:288 dst_release" on older kernels that don't have 8bf4ada2e21378816b28205427ee6b0e1ca4c5f1 backported. That patch optimistically hoped this had been fixed post 3.10, but it seems at least one case wasn't, where I've seen this triggered a lot from machines doing unprivileged icmp sockets. Cc: Martin LauSigned-off-by: Dave Jones diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c index 0900352c924c..0e983b694ee8 100644 --- a/net/ipv6/ping.c +++ b/net/ipv6/ping.c @@ -126,8 +126,10 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) rt = (struct rt6_info *) dst; np = inet6_sk(sk); - if (!np) - return -EBADF; + if (!np) { + err = -EBADF; + goto dst_err_out; + } if (!fl6.flowi6_oif && ipv6_addr_is_multicast()) fl6.flowi6_oif = np->mcast_oif; @@ -163,6 +165,9 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) } release_sock(sk); +dst_err_out: + dst_release(dst); + if (err) return err;
Re: [PATCH v3 1/2] wcn36xx: Transition driver to SMD client
Bjorn Anderssonwrites: > On Fri 02 Sep 09:24 PDT 2016, Kalle Valo wrote: > >> Bjorn Andersson writes: >> >> > --- a/drivers/net/wireless/ath/wcn36xx/Kconfig >> > +++ b/drivers/net/wireless/ath/wcn36xx/Kconfig >> > @@ -1,6 +1,6 @@ >> > config WCN36XX >> >tristate "Qualcomm Atheros WCN3660/3680 support" >> > - depends on MAC80211 && HAS_DMA >> > + depends on MAC80211 && HAS_DMA && QCOM_SMD >> >> While I had this patch on my pending branch I noticed that I can't >> compile wcn36xx on x86 anymore. This is actually quite inconvenient, for >> example then it's easy to miss mac80211 API changes etc. Is there any >> way we could continue build testing wcn36xx on x86 (or all) platforms? >> > > Sorry about that, we should at least be able to COMPILE_TEST it in > non-qcom builds. Thanks for letting me know. Yeah, that would be better. Even though it's a bit shame that COMPILE_TEST disables DEBUG_INFO (I use the same build also for development) so I need to toggle it on and off whenever I need debug symbols. Oh well... > And the driver should also depend on QCOM_WCNSS_CTRL, in the normal > case. > > I'll respin this, unless you prefer an incremental patch for those > changes(?) Yes, please respin. >> Also what about older platforms? Earlier I used wcn36xx on my Nexus 4 >> with help of backports project. I can't do that anymore, right? >> > > This has been tested on 8064, 8974 and 8916. So your Nexus 4 is covered. > > Unfortunately I don't have a Nexus 4, but we currently have Nexus 7 > (the 8064 version), Xperia Z, Xperia Z1 and DB410c using this chip and > all four has been tested with this code. Actually I meant running wcn36xx on older kernels, where your QCOM_SMD is not yet supported. > I've staged the PIL/remoteproc (firmware "loader") driver for v4.9, so > with this patch the only thing missing to fill in the dts files is one > clock from the RPM. Nice. > JFYI, There seems to be some race in the removal path, which I will look > into. But I would prefer if we could merge this to make the driver > usable, and more accessible to work on. Sure, a race like that isn't that big of a deal compared to the benefits your work brings. But it's good to document knows regressions to the commit log anyway so that others can be prepared if they test it. -- Kalle Valo
Re: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
On Fri, Sep 2, 2016 at 11:13 AM, Linus Torvaldswrote: > > From: Linus Torvalds > Date: Thu, 1 Sep 2016 14:43:53 -0700 > Subject: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and > 'bindlock' > > Right now we use the 'readlock' both for protecting some of the af_unix > IO path and for making the bind be single-threaded. > > The two are independent, but using the same lock makes for a nasty > deadlock due to ordering with regards to filesystem locking. The bind > locking would want to nest outside the VSF pathname locking, but the IO > locking wants to nest inside some of those same locks. > > We tried to fix this earlier with commit c845acb324aa ("af_unix: Fix > splice-bind deadlock") which moved the readlock inside the vfs locks, > but that caused problems with overlayfs that will then call back into > filesystem routines that take the lock in the wrong order anyway. > > Splitting the locks means that we can go back to having the bind lock be > the outermost lock, and we don't have any deadlocks with lock ordering. > > Acked-by: Rainer Weikusat > Acked-by: Al Viro > Signed-off-by: Linus Torvalds Oh, this was missing a Reported-and-tested-by: CAI Qian who found the new deadlock. There's now *another* lockdep deadlock report by him, but that one has nothing to do with networking. (And neither of these deadlocks will actually deadlock the machine in practice, but you can trigger the lockdep reports with some odd splice patterns and overlayfs use) Linus
Re: [PATCH] net/mlx4_en: protect ring->xdp_prog with rcu_read_lock
On Mon, Aug 29, 2016 at 10:46:38AM -0700, Tom Herbert wrote: [...] > Brenden, tracking down how the structure is freed needed a few steps, > please make sure the RCU requirements are well documented. Also, I'm Really? It's just bpf_prog_put->call_rcu(__bpf_prog_put_rcu). I suppose what's missing is a general guideline for which functions new consumers of bpf should use, but I wouldn't trust myself to write such holistic documentation accurately (e.g. interacting with nmi probes and such). > still not a fan of using xchg to set the program, seems that a lock > could be used in that path. Where would such a lock go? Everything in mlx4/en_netdev.c relies on rtnl, which seems sufficient and obvious...adding some new field specific lock would be distracting and unneeded. > > Thanks, > Tom
[PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
From: Linus TorvaldsDate: Thu, 1 Sep 2016 14:43:53 -0700 Subject: [PATCH 2/2] af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' Right now we use the 'readlock' both for protecting some of the af_unix IO path and for making the bind be single-threaded. The two are independent, but using the same lock makes for a nasty deadlock due to ordering with regards to filesystem locking. The bind locking would want to nest outside the VSF pathname locking, but the IO locking wants to nest inside some of those same locks. We tried to fix this earlier with commit c845acb324aa ("af_unix: Fix splice-bind deadlock") which moved the readlock inside the vfs locks, but that caused problems with overlayfs that will then call back into filesystem routines that take the lock in the wrong order anyway. Splitting the locks means that we can go back to having the bind lock be the outermost lock, and we don't have any deadlocks with lock ordering. Acked-by: Rainer Weikusat Acked-by: Al Viro Signed-off-by: Linus Torvalds --- This patch is really trivial, and I've tried to be careful and look at the locking, but somebody who really knows the AF_UNIX code should definitely take a second look. Note that I did the revert (that re-introduces the original splice deadlock) first, because that made the whole series much easier to explain. Doing it in the other order made the revert nastier because this patch obviously touches the same code that the revert in 1/2 does. So this way the series ends up being "go back to the original code with the original deadlock, and then fix that original deadlock by splitting the bind lock". include/net/af_unix.h | 2 +- net/unix/af_unix.c| 45 +++-- 2 files changed, 24 insertions(+), 23 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 9b4c418bebd8..fd60eccb59a6 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -52,7 +52,7 @@ struct unix_sock { struct sock sk; struct unix_address *addr; struct path path; - struct mutexreadlock; + struct mutexiolock, bindlock; struct sock *peer; struct list_headlink; atomic_long_t inflight; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 433ae1bbef97..8309687a56b0 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -661,11 +661,11 @@ static int unix_set_peek_off(struct sock *sk, int val) { struct unix_sock *u = unix_sk(sk); - if (mutex_lock_interruptible(>readlock)) + if (mutex_lock_interruptible(>iolock)) return -EINTR; sk->sk_peek_off = val; - mutex_unlock(>readlock); + mutex_unlock(>iolock); return 0; } @@ -779,7 +779,8 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern) spin_lock_init(>lock); atomic_long_set(>inflight, 0); INIT_LIST_HEAD(>link); - mutex_init(>readlock); /* single task reading lock */ + mutex_init(>iolock); /* single task reading lock */ + mutex_init(>bindlock); /* single task binding lock */ init_waitqueue_head(>peer_wait); init_waitqueue_func_entry(>peer_wake, unix_dgram_peer_wake_relay); unix_insert_socket(unix_sockets_unbound(sk), sk); @@ -848,7 +849,7 @@ static int unix_autobind(struct socket *sock) int err; unsigned int retries = 0; - err = mutex_lock_interruptible(>readlock); + err = mutex_lock_interruptible(>bindlock); if (err) return err; @@ -895,7 +896,7 @@ retry: spin_unlock(_table_lock); err = 0; -out: mutex_unlock(>readlock); +out: mutex_unlock(>bindlock); return err; } @@ -1009,7 +1010,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) goto out; addr_len = err; - err = mutex_lock_interruptible(>readlock); + err = mutex_lock_interruptible(>bindlock); if (err) goto out; @@ -1063,7 +1064,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) out_unlock: spin_unlock(_table_lock); out_up: - mutex_unlock(>readlock); + mutex_unlock(>bindlock); out: return err; } @@ -1955,17 +1956,17 @@ static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page, if (false) { alloc_skb: unix_state_unlock(other); - mutex_unlock(_sk(other)->readlock); + mutex_unlock(_sk(other)->iolock); newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT, , 0); if (!newskb) goto err; }
[PATCH 1/2] Revert "af_unix: Fix splice-bind deadlock"
From: Linus TorvaldsDate: Thu, 1 Sep 2016 14:56:49 -0700 Subject: [PATCH 1/2] Revert "af_unix: Fix splice-bind deadlock" This reverts commit c845acb324aa85a39650a14e7696982ceea75dc1. It turns out that it just replaces one deadlock with another one: we can still get the wrong lock ordering with the readlock due to overlayfs calling back into the filesystem layer and still taking the vfs locks after the readlock. The proper solution ends up being to just split the readlock into two pieces: the bind lock (taken *outside* the vfs locks) and the IO lock (taken *inside* the filesystem locks). The two locks are independent anyway. Signed-off-by: Linus Torvalds --- This is not a completely clean revert, because other changes had happened in this area since that commit, but the conflicts were pretty trivial. The next patch actually fixes the problem as described above ("proper solution"). Also, David, if you'd prefer I just apply these directly, you can just tell me so. But I really wanted some AF_UNIX people to look at the next patch regardless. net/unix/af_unix.c | 66 +- 1 file changed, 26 insertions(+), 40 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index f1dffe84f0d5..433ae1bbef97 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -954,20 +954,32 @@ fail: return NULL; } -static int unix_mknod(struct dentry *dentry, const struct path *path, umode_t mode, - struct path *res) +static int unix_mknod(const char *sun_path, umode_t mode, struct path *res) { - int err; + struct dentry *dentry; + struct path path; + int err = 0; + /* +* Get the parent directory, calculate the hash for last +* component. +*/ + dentry = kern_path_create(AT_FDCWD, sun_path, , 0); + err = PTR_ERR(dentry); + if (IS_ERR(dentry)) + return err; - err = security_path_mknod(path, dentry, mode, 0); + /* +* All right, let's create it. +*/ + err = security_path_mknod(, dentry, mode, 0); if (!err) { - err = vfs_mknod(d_inode(path->dentry), dentry, mode, 0); + err = vfs_mknod(d_inode(path.dentry), dentry, mode, 0); if (!err) { - res->mnt = mntget(path->mnt); + res->mnt = mntget(path.mnt); res->dentry = dget(dentry); } } - + done_path_create(, dentry); return err; } @@ -978,12 +990,10 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) struct unix_sock *u = unix_sk(sk); struct sockaddr_un *sunaddr = (struct sockaddr_un *)uaddr; char *sun_path = sunaddr->sun_path; - int err, name_err; + int err; unsigned int hash; struct unix_address *addr; struct hlist_head *list; - struct path path; - struct dentry *dentry; err = -EINVAL; if (sunaddr->sun_family != AF_UNIX) @@ -999,34 +1009,14 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) goto out; addr_len = err; - name_err = 0; - dentry = NULL; - if (sun_path[0]) { - /* Get the parent directory, calculate the hash for last -* component. -*/ - dentry = kern_path_create(AT_FDCWD, sun_path, , 0); - - if (IS_ERR(dentry)) { - /* delay report until after 'already bound' check */ - name_err = PTR_ERR(dentry); - dentry = NULL; - } - } - err = mutex_lock_interruptible(>readlock); if (err) - goto out_path; + goto out; err = -EINVAL; if (u->addr) goto out_up; - if (name_err) { - err = name_err == -EEXIST ? -EADDRINUSE : name_err; - goto out_up; - } - err = -ENOMEM; addr = kmalloc(sizeof(*addr)+addr_len, GFP_KERNEL); if (!addr) @@ -1037,11 +1027,11 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) addr->hash = hash ^ sk->sk_type; atomic_set(>refcnt, 1); - if (dentry) { - struct path u_path; + if (sun_path[0]) { + struct path path; umode_t mode = S_IFSOCK | (SOCK_INODE(sock)->i_mode & ~current_umask()); - err = unix_mknod(dentry, , mode, _path); + err = unix_mknod(sun_path, mode, ); if (err) { if (err == -EEXIST) err = -EADDRINUSE; @@ -1049,9 +1039,9 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int
Re: [PATCH] net/mlx4_en: protect ring->xdp_prog with rcu_read_lock
On Fri, Sep 02, 2016 at 01:59:40AM +0300, Saeed Mahameed wrote: > On Wed, Aug 31, 2016 at 4:50 AM, Brenden Blancowrote: > > On Tue, Aug 30, 2016 at 12:35:58PM +0300, Saeed Mahameed wrote: [...] > >> Sorry folks I am with Tariq on this, you can't just add a single > >> instruction which is only valid/needed for 1% of the use cases > >> to the driver's general data path, even if it was as cheap as one cpu > >> cycle! > > How about 0? > > > > $ diff mlx4_en.ko.norcu.s mlx4_en.ko.rcu.s | wc -l > > 0 > > > > Well, If you put it this way, it seems OK then. > > Anyway I would add a friendly comment beside the rcu_read_lock that > "this is needed to protect > access to ring->xdp_prog". Thanks, I will go ahead with this then. > > >> > >> Let me try to suggest something: > >> instead of taking the rcu_read_lock for the whole > >> mlx4_en_process_rx_cq, we can minimize to XDP code path only > >> by double checking xdp_prog (non-protected check followed by a > >> protected check inside mlx4 XDP critical path). > >> > >> i.e instead of: > >> > >> rcu_read_lock(); > >> > >> xdp_prog = ring->xdp_prog; > >> > >> //__Do lots of non-XDP related stuff__ > >> > >> if (xdp_prog) { > >> //Do XDP magic .. > >> } > >> //__Do more of non-XDP related stuff__ > >> > >> rcu_read_unlock(); > >> > >> > >> We can minimize it to XDP critical path only: > >> > >> //Non protected xdp_prog dereference. > >> if (xdp_prog) { > >> rcu_read_lock(); > >> //Protected dereference to ring->xdp_prog > >> xdp_prog = ring->xdp_prog; > >> if(unlikely(!xdp_prg)) goto unlock; > > > > The addition of this branch and extra deref is now slowing down the xdp > > path compared to the current proposal. > > > > Yep, but this is an unlikely condition and the critical code here is > much smaller and it is more clear that the rcu_read_lock here meant to > protect the ring->xdp_prog under this small xdp critical section in > comparison to your patch where it is held across the whole RX > function. It's really an improper use of RCU though. RCU is meant to provide correctness without sacrificing any performance in the fastpath. It is designed to avoid having to double-dereference and other such tricks, so shouldn't we use it how it was designed? Having a larger scoped rcu_read_lock doesn't hurt anybody here, but the extra memory reads certainly _does_ impact the XDP path, which folks are going to start relying on to be performant. Let's not start chipping away at that.
Re: [PATCH net-next] switchdev: Fix return value of switchdev_port_fdb_dump().
From: Rami RosenDate: Fri, 2 Sep 2016 14:11:57 +0300 > This patch fixes the retun value of switchdev_port_fdb_dump() when > CONFIG_NET_SWITCHDEV is not set. This avoids getting "warning: return makes > integer from pointer without a cast [-Wint-conversion]" when building > when CONFIG_NET_SWITCHDEV is not set under several compiler versions. > This warning is due to commit d297653dd6f07afbe7e6c702a4bcd7615680002e > ("rtnetlink: fdb dump: optimize by saving last interface markers"). > > Signed-off-by: Rami Rosen Applied, thanks.
Re: [PATCH RESEND net-next 13/15] smc: receive data from RMBE
From: Ursula BraunDate: Fri, 2 Sep 2016 15:05:01 +0200 > Understood, I wrongly used xchg() for atomicity. I now realize that I > would need cursor locking for 32-bit architectures - something I would > like to defer. Thus I would like to come up with V2 of SMC-R with > builds restricted to 64-bit architectures only, and thus no usage of > xchg() anymore. Please don't restrict the driver build to 64-bit
Re: [PATCH v3 net-next 0/6] perf, bpf: add support for bpf in sw/hw perf_events
From: Alexei StarovoitovDate: Thu, 1 Sep 2016 18:37:20 -0700 > Hi Peter, Dave, > > this patch set is a follow up to the discussion: > https://lkml.kernel.org/r/20160804142853.GO6862%20()%20twins%20!%20programming%20!%20kicks-ass%20!%20net > It turned out to be simpler than what we discussed. > > Patches 1-3 is bpf-side prep for the main patch 4 > that adds bpf program as an overflow_handler to sw and hw perf_events. > > Patches 5 and 6 are examples from myself and Brendan. > > Peter, > to implement your suggestion to add ifdef CONFIG_BPF_SYSCALL > inside struct perf_event, I had to shuffle ifdefs in events/core.c > Please double check whether that is what you wanted to see. > > v2->v3: fixed few more minor issues > v1->v2: fixed issues spotted by Peter and Daniel. Series applied, thanks.
Re: possible circular locking dependency detected
Linus Torvaldswrites: > On Fri, Sep 2, 2016 at 10:02 AM, Al Viro wrote: >> >> It's very much _not_ just overlayfs being pathological - that it certainly >> is, >> but the problem is much wider. > > Al, can you take a look at my two patches, and see if you agree that > they fix it, though? The original deadlock occurred because of some code path locking the superblock followed by trying to acquire the af_unix readlock while unix_bind did the same in the opposite order (by doing kern_path_create with the readlock held). If unix_bind doesn't share a lock with the receive routines anymore, this obviously can't happen anymore. The other problem situation is one where a lock on someting can be acquired both by kern_path_create and a mknod operation and the readlock is taken in between. Because that sits in between the kern_path_create and the mknod, it can block the thread which got a certain lock via kern_path_create because the one which is about to try to acquire it via mknod got the readlock first. This obviously can't happen anymore the when the original 'acquire readlock (now bindlock) first' is restored.
Re: possible circular locking dependency detected
On Fri, Sep 02, 2016 at 06:40:59PM +0100, Rainer Weikusat wrote: > The original deadlock occurred because of some code path locking the > superblock followed by trying to acquire the af_unix readlock while Not even that - one code path takes ->readlock under pipe lock, while another takes pipe lock under sb_start_write...
Re: possible circular locking dependency detected
On Fri, Sep 02, 2016 at 10:12:08AM -0700, Linus Torvalds wrote: > On Fri, Sep 2, 2016 at 10:02 AM, Al Virowrote: > > > > It's very much _not_ just overlayfs being pathological - that it certainly > > is, > > but the problem is much wider. > > Al, can you take a look at my two patches, and see if you agree that > they fix it, though? AFAICS, they should. Locking is obviously saner that way and AFAICS the rest is absolutely straightforward. Acked-by: Al Viro > Of course, we now have *another* splice deadlock. That pipe inode is > nasty, it's very easy to deadlock on it in subtle ways. I'm still digging through iomap.c, but that's better taken to another branch of this thread...
Re: [PATCH] net/mlx4_en: protect ring->xdp_prog with rcu_read_lock
On Thu, Sep 01, 2016 at 04:30:28PM -0700, Tom Herbert wrote: [...] > > Yep, but this is an unlikely condition and the critical code here is > > much smaller and it is more clear that the rcu_read_lock here meant to > > protect the ring->xdp_prog under this small xdp critical section in > > comparison to your patch where it is held across the whole RX > > function. > > Note that there is already an rcu_read_lock potentially per packet > buried in the function, if the whole function is under rcu_read_lock > then that can be removed. Yes I was aware of that, I had left it as-is since: 1. it seemed to be in an exception path and less performance sensitive to nested calls, and 2. in case some future developer removed the top-level rcu_read_lock, the finer-grained one would have been unprotected if not code reviewed carefully. I'll instead add a note at the top pointing out the dual need for the lock, to address both yours and Saeed's comments. As a side note, when considering the idea of moving the rcu_read_lock to a more generic location (napi), I had toyed with the idea of benchmarking to see if removing the actually-fast-path use of rcu_read_lock in netif_receive_skb_internal could have any performance benefit for the universal use case (non-xdp). However, that seems completely out of scope at the moment, and only beneficial for non-standard (IMO) .configs, besides being much harder to review. It was showing up in perf at about 1-2% overhead in preempt=y kernels. > > Tom
[PATCH] qed: Remove OOM messages
These messages are unnecessary as OOM allocation failures already do a dump_stack() giving more or less the same information. $ size drivers/net/ethernet/qlogic/qed/built-in.o* (defconfig x86-64) textdata bss dec hex filename 126849 27968 32800 187617 2dce1 drivers/net/ethernet/qlogic/qed/built-in.o.new 131506 27968 32800 192274 2ef12 drivers/net/ethernet/qlogic/qed/built-in.o.old Miscellanea: o Change allocs to the generally preferred forms where possible. Signed-off-by: Joe Perches--- drivers/net/ethernet/qlogic/qed/qed_cxt.c | 20 +++-- drivers/net/ethernet/qlogic/qed/qed_dcbx.c | 13 ++ drivers/net/ethernet/qlogic/qed/qed_dev.c | 57 +++--- drivers/net/ethernet/qlogic/qed/qed_hw.c | 12 ++ drivers/net/ethernet/qlogic/qed/qed_init_ops.c | 4 +- drivers/net/ethernet/qlogic/qed/qed_int.c | 26 drivers/net/ethernet/qlogic/qed/qed_main.c | 4 +- drivers/net/ethernet/qlogic/qed/qed_mcp.c | 1 - drivers/net/ethernet/qlogic/qed/qed_spq.c | 31 -- drivers/net/ethernet/qlogic/qed/qed_sriov.c| 9 ++-- drivers/net/ethernet/qlogic/qed/qed_vf.c | 14 ++- 11 files changed, 49 insertions(+), 142 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_cxt.c b/drivers/net/ethernet/qlogic/qed/qed_cxt.c index 1c35f37..e61185a 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_cxt.c +++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.c @@ -795,10 +795,9 @@ static int qed_cxt_src_t2_alloc(struct qed_hwfn *p_hwfn) p_mngr->t2_num_pages = DIV_ROUND_UP(total_size, psz); /* allocate t2 */ - p_mngr->t2 = kzalloc(p_mngr->t2_num_pages * sizeof(struct qed_dma_mem), + p_mngr->t2 = kcalloc(p_mngr->t2_num_pages, sizeof(struct qed_dma_mem), GFP_KERNEL); if (!p_mngr->t2) { - DP_NOTICE(p_hwfn, "Failed to allocate t2 table\n"); rc = -ENOMEM; goto t2_fail; } @@ -963,7 +962,6 @@ static int qed_ilt_shadow_alloc(struct qed_hwfn *p_hwfn) p_mngr->ilt_shadow = kcalloc(size, sizeof(struct qed_dma_mem), GFP_KERNEL); if (!p_mngr->ilt_shadow) { - DP_NOTICE(p_hwfn, "Failed to allocate ilt shadow table\n"); rc = -ENOMEM; goto ilt_shadow_fail; } @@ -1056,10 +1054,8 @@ int qed_cxt_mngr_alloc(struct qed_hwfn *p_hwfn) u32 i; p_mngr = kzalloc(sizeof(*p_mngr), GFP_KERNEL); - if (!p_mngr) { - DP_NOTICE(p_hwfn, "Failed to allocate `struct qed_cxt_mngr'\n"); + if (!p_mngr) return -ENOMEM; - } /* Initialize ILT client registers */ clients = p_mngr->clients; @@ -,24 +1107,18 @@ int qed_cxt_tables_alloc(struct qed_hwfn *p_hwfn) /* Allocate the ILT shadow table */ rc = qed_ilt_shadow_alloc(p_hwfn); - if (rc) { - DP_NOTICE(p_hwfn, "Failed to allocate ilt memory\n"); + if (rc) goto tables_alloc_fail; - } /* Allocate the T2 table */ rc = qed_cxt_src_t2_alloc(p_hwfn); - if (rc) { - DP_NOTICE(p_hwfn, "Failed to allocate T2 memory\n"); + if (rc) goto tables_alloc_fail; - } /* Allocate and initialize the acquired cids bitmaps */ rc = qed_cid_map_alloc(p_hwfn); - if (rc) { - DP_NOTICE(p_hwfn, "Failed to allocate cid maps\n"); + if (rc) goto tables_alloc_fail; - } return 0; diff --git a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c index 3656d2f..54a7899 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c @@ -875,11 +875,8 @@ int qed_dcbx_info_alloc(struct qed_hwfn *p_hwfn) int rc = 0; p_hwfn->p_dcbx_info = kzalloc(sizeof(*p_hwfn->p_dcbx_info), GFP_KERNEL); - if (!p_hwfn->p_dcbx_info) { - DP_NOTICE(p_hwfn, - "Failed to allocate 'struct qed_dcbx_info'\n"); + if (!p_hwfn->p_dcbx_info) rc = -ENOMEM; - } return rc; } @@ -1190,10 +1187,8 @@ int qed_dcbx_get_config_params(struct qed_hwfn *p_hwfn, } dcbx_info = kzalloc(sizeof(*dcbx_info), GFP_KERNEL); - if (!dcbx_info) { - DP_ERR(p_hwfn, "Failed to allocate struct qed_dcbx_info\n"); + if (!dcbx_info) return -ENOMEM; - } rc = qed_dcbx_query_params(p_hwfn, dcbx_info, QED_DCBX_OPERATIONAL_MIB); if (rc) { @@ -1227,10 +1222,8 @@ static struct qed_dcbx_get *qed_dcbnl_get_dcbx(struct qed_hwfn *hwfn, struct qed_dcbx_get *dcbx_info; dcbx_info = kzalloc(sizeof(*dcbx_info), GFP_KERNEL); - if (!dcbx_info) { - DP_ERR(hwfn->cdev, "Failed to
Re: [PATCH] qed: fix kzalloc-simple.cocci warnings
On Thu, 2016-09-01 at 07:37 +, Yuval Mintz wrote: > > drivers/net/ethernet/qlogic/qed/qed_dcbx.c:1230:13-20: WARNING: kzalloc > > should be used for dcbx_info, instead of kmalloc/memset > > drivers/net/ethernet/qlogic/qed/qed_dcbx.c:1192:13-20: WARNING: kzalloc > > should be used for dcbx_info, instead of kmalloc/memset > > > > Use kzalloc rather than kmalloc followed by memset with 0 [] > One question the automated script - > Can't it [relative] easily be upgraded to also have 'Fixes:' as part > of its message? It's really not a "fix" as it has no real effect on behavior. The code is perfectly fine as is really. It is a code size reduction though. Another thing with a behavior change and one that would also reduce code size would be to remove the unnecessary OOM messages after the allocs. I'll send a patch for that.
Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: make global2 code optional
Hi Andrew, Andrew Lunnwrites: >> I agree. Does the following snippet looks OK? >> >> >> #ifndef CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 >> if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_GLOBAL2)) { >> dev_err(chip->dev, "Missing support for Global 2 >> registers\n"); > > I would include the name of the option which needs enabling. Also it > is not really missing. It has not been enabled. > > "The required compile time options needed to support this switch have > not been enabled. Please enable: CONFIG_NET_DSA_MV88E6XXX_GLOBAL2" That is much better indeed, respining. Thanks! Vivien
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
On Fri, Sep 2, 2016 at 9:39 AM, Cong Wangwrote: > On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtel > wrote: >> Le 02/09/2016 à 06:53, Cong Wang a écrit : >>> We never read or change netns id in hardirq context, >>> the only place we read netns id in softirq context >>> is in vxlan_xmit(). So, it should be enough to just >>> disable BH. >> >> Are you sure? Did you audit all part of the code? > > I did audit all the callers, and I didn't find any of them in IRQ context. > >> peernet2id() is called from netlink core system (do_one_broadcast()). Are you >> sure that no driver call this function from an hard irq context? > > I audit all callers of netlink_broadcast(), and I don't see any of > them in hardirq context. Note, you can rule out most of them by checking GFP_KERNEL, which indicates a process context. ;) For GFP_ATOMIC cases, I don't see any of them in hardirq context either, but I am definitely not familiar with drivers like infiniband.
Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: make global2 code optional
> I agree. Does the following snippet looks OK? > > > #ifndef CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 > if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_GLOBAL2)) { > dev_err(chip->dev, "Missing support for Global 2 > registers\n"); I would include the name of the option which needs enabling. Also it is not really missing. It has not been enabled. "The required compile time options needed to support this switch have not been enabled. Please enable: CONFIG_NET_DSA_MV88E6XXX_GLOBAL2" Andrew
Centralizing support for TCAM?
Hi all, (apologies for the long CC list and the fact that I can't type correctly email addresses) While working on adding support for the Broadcom Ethernet switches Compact Field Processor (which is essentially a TCAM, action/policer/rate meter RAMs, 256 entries), I started working with the ethtool::rxnfc API which is actually kind of nice in that it fits nicely with my use simple case of being able to insert rules at a given or driver selected location and has a pretty good flow representation for common things you may match: TCP/UDP v4/v6 (not so much for non-IP, or L2 stuff though you can use the extension flow representation). It lacks support for more complex actions other than redirect to a particular port/queue. Now ethtool::rxnfc is one possible user, but tc and netfiler also are, more powerful and extensible, but since this is a resource constrained piece of hardware, and it would suck for people to have to implement these 3 APIs if we could come up with a central one that satisfies the superset offered by tc + netfilter. We can surely imagine an use case we centralize the whole matching + action into a Domain Specific Language that we compiled into eBPF and then translate into whatever the HW understands, although that raises the question of where do we put the translation tool in user space or kernel space. So what's everybody's take on this? Thanks! -- Florian
Re: [PATCH net] bonding: Fix bonding crash
Fri, Sep 02, 2016 at 07:08:54PM CEST, mahe...@google.com wrote: >On Fri, Sep 2, 2016 at 9:37 AM, Eric Dumazetwrote: >> On Fri, 2016-09-02 at 18:25 +0200, Jiri Pirko wrote: >> >>> Understand the reason. All I say is we tried hard (I surely did) in >>> the past to remove bonding specific quirks and now we are adding one. >>> And the simple reason is that the code is such a mess. >>> >I would not qualify this as bonding specific change! The check of you can >really use the interface is simple enough, and can be used by other virtual >drivers (e.g. macvlan, ipvlan, vrf etc.). Waiting until >rx_handler_register() can be >called in your code and then try to do rollback is bad. Also doing >register() early >has it's own consequences. On contrary, I would argue that this is much >cleaner. Fair enough. Would be great to do this in all other rx_handler users so we are consistent. Thanks! > >Agreed that bonding-driver has become a monster but that doesn't mean we >should not fix issues. This argument would be valid when we are >dealing with last >couple installations / use cases of bonding and everyone else has moved to >alternatives. Unfortunately we are not there yet :( > >>> Just use team instead and you'll be fine. You can "google" it :) >> >> Sure, please join _our_ team and make all the needed changes in user >> land. >> >:) Cannot put it more eloquently than Eric did but we would (in >theory) love to move >to team-driver, but logistics don't support this (yet!). > >> The kernel part of it is epsilon compared to all the control part (ie >> talk to the various Google switches.), and monitoring. >> >> I am fine to leave the bug in upstream bonding driver, if you really >> want to force people out of bonding land ! >> >> Here an automated test simply used macvlan and did not remove the >> macvlan before enslaving the netdev again in a bonding, we already fixed >> the test, because it is faster than deploying new kernels anyway. >> >> >>
Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: make global2 code optional
Hi Andrew, Andrew Lunnwrites: >> What do you think? > > I think the probe() needs to fail with a very obvious error message > saying you need to recompile your kernel with option XYZ enabled in > order to support this switch, when the optional stuff is not > optional... I agree. Does the following snippet looks OK? #ifndef CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_GLOBAL2)) { dev_err(chip->dev, "Missing support for Global 2 registers\n"); return -EOPNOTSUPP; } #endif /* CONFIG_NET_DSA_MV88E6XXX_GLOBAL2 */ Thanks, Vivien
Re: possible circular locking dependency detected
On Fri, Sep 2, 2016 at 10:02 AM, Al Virowrote: > > It's very much _not_ just overlayfs being pathological - that it certainly is, > but the problem is much wider. Al, can you take a look at my two patches, and see if you agree that they fix it, though? Of course, we now have *another* splice deadlock. That pipe inode is nasty, it's very easy to deadlock on it in subtle ways. Linus
Re: possible circular locking dependency detected
On Fri, Sep 2, 2016 at 8:51 AM, CAI Qianwrote: > > Actually, I took it back, and now spice seems start to deadlock using the > reproducer, > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c This is a different deadlock, though. This is a deadlock due to mixed lock ordering between the pipe mutex, the XFS "mr_lock", and the inode mutex. If I read the lockdep trace correctly, we have: Normally we have write doing inode->i_mutex -> i_iolock.mr_lock fro the regular write path. But the normal splice "write()" case has pipe->mutex -> filesystem write lock (normally i_mutex) (in iter_file_splice_write() that calls vfs_iter_write() that calls ->write_iter()) and then the XFS splice case as mr_lock -> pipe->mutex in xfs_file_splice_read() calling splice_to_pipe(). So you end up with a A->B->C->A chain. I think it's new to the new iomap based buffered write path in 4.8. Dave, Christoph? Linus > [ 1749.956818] > [ 1749.958492] == > [ 1749.965386] [ INFO: possible circular locking dependency detected ] > [ 1749.972381] 4.8.0-rc4+ #34 Not tainted > [ 1749.976560] --- > [ 1749.983554] splice01/35921 is trying to acquire lock: > [ 1749.989188] (>s_type->i_mutex_key#14){+.+.+.}, at: > [] xfs_file_buffered_aio_write+0x127/0x840 [xfs] > [ 1750.001644] > [ 1750.001644] but task is already holding lock: > [ 1750.008151] (>mutex/1){+.+.+.}, at: [] > pipe_lock+0x51/0x60 > [ 1750.016753] > [ 1750.016753] which lock already depends on the new lock. > [ 1750.016753] > [ 1750.025880] > [ 1750.025880] the existing dependency chain (in reverse order) is: > [ 1750.034229] > -> #2 (>mutex/1){+.+.+.}: > [ 1750.039139][] lock_acquire+0x1fa/0x440 > [ 1750.045857][] mutex_lock_nested+0xdd/0x850 > [ 1750.052963][] pipe_lock+0x51/0x60 > [ 1750.059190][] splice_to_pipe+0x75/0x9e0 > [ 1750.066001][] > __generic_file_splice_read+0xa71/0xe90 > [ 1750.074071][] generic_file_splice_read+0xc1/0x1f0 > [ 1750.081849][] xfs_file_splice_read+0x368/0x7b0 > [xfs] > [ 1750.089940][] do_splice_to+0xee/0x150 > [ 1750.096555][] SyS_splice+0x1144/0x1c10 > [ 1750.103269][] do_syscall_64+0x1a6/0x500 > [ 1750.110084][] return_from_SYSCALL_64+0x0/0x7a > [ 1750.117479] > -> #1 (&(>i_iolock)->mr_lock#2){++}: > [ 1750.123649][] lock_acquire+0x1fa/0x440 > [ 1750.130362][] down_write_nested+0x5e/0xe0 > [ 1750.137371][] xfs_ilock+0x2fe/0x550 [xfs] > [ 1750.144397][] > xfs_file_buffered_aio_write+0x134/0x840 [xfs] > [ 1750.153175][] xfs_file_write_iter+0x26d/0x6d0 > [xfs] > [ 1750.161177][] __vfs_write+0x2be/0x640 > [ 1750.167799][] vfs_write+0x152/0x4b0 > [ 1750.174220][] SyS_write+0xdf/0x1d0 > [ 1750.180547][] entry_SYSCALL_64_fastpath+0x1f/0xbd > [ 1750.188328] > -> #0 (>s_type->i_mutex_key#14){+.+.+.}: > [ 1750.194508][] __lock_acquire+0x3043/0x3dd0 > [ 1750.201609][] lock_acquire+0x1fa/0x440 > [ 1750.208321][] down_write+0x5a/0xe0 > [ 1750.214645][] > xfs_file_buffered_aio_write+0x127/0x840 [xfs] > [ 1750.223421][] xfs_file_write_iter+0x26d/0x6d0 > [xfs] > [ 1750.231423][] vfs_iter_write+0x29e/0x550 > [ 1750.238330][] iter_file_splice_write+0x529/0xb70 > [ 1750.246012][] SyS_splice+0x724/0x1c10 > [ 1750.252627][] do_syscall_64+0x1a6/0x500 > [ 1750.259438][] return_from_SYSCALL_64+0x0/0x7a > [ 1750.266830] > [ 1750.266830] other info that might help us debug this: > [ 1750.266830] > [ 1750.275764] Chain exists of: > >s_type->i_mutex_key#14 --> &(>i_iolock)->mr_lock#2 --> > >mutex/1 > > [ 1750.287213] Possible unsafe locking scenario: > [ 1750.287213] > [ 1750.293817]CPU0CPU1 > [ 1750.298871] > [ 1750.303924] lock(>mutex/1); > [ 1750.307845] > lock(&(>i_iolock)->mr_lock#2); > [ 1750.315836]lock(>mutex/1); > [ 1750.322567] lock(>s_type->i_mutex_key#14); > [ 1750.327748] > [ 1750.327748] *** DEADLOCK *** > [ 1750.327748] > [ 1750.334355] 2 locks held by splice01/35921: > [ 1750.339019] #0: (sb_writers#8){.+.+.+}, at: [] > __sb_start_write+0xb4/0xf0 > [ 1750.348595] #1: (>mutex/1){+.+.+.}, at: [] > pipe_lock+0x51/0x60 > [ 1750.357686] > [ 1750.357686] stack backtrace: > [ 1750.362548] CPU: 50 PID: 35921 Comm: splice01 Not tainted 4.8.0-rc4+ #34 > [ 1750.370026] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS > GRNDSDP1.86B.0044.R00.1501191641 01/19/2015 > [ 1750.381382] 3bca9477 88044c4176e0 > 81a3d191 > [ 1750.389675] 84292880 842b9e30 88044c417730 > 812a6aa6 > [ 1750.397968]
Re: [PATCH net] bonding: Fix bonding crash
On Fri, Sep 2, 2016 at 9:37 AM, Eric Dumazetwrote: > On Fri, 2016-09-02 at 18:25 +0200, Jiri Pirko wrote: > >> Understand the reason. All I say is we tried hard (I surely did) in >> the past to remove bonding specific quirks and now we are adding one. >> And the simple reason is that the code is such a mess. >> I would not qualify this as bonding specific change! The check of you can really use the interface is simple enough, and can be used by other virtual drivers (e.g. macvlan, ipvlan, vrf etc.). Waiting until rx_handler_register() can be called in your code and then try to do rollback is bad. Also doing register() early has it's own consequences. On contrary, I would argue that this is much cleaner. Agreed that bonding-driver has become a monster but that doesn't mean we should not fix issues. This argument would be valid when we are dealing with last couple installations / use cases of bonding and everyone else has moved to alternatives. Unfortunately we are not there yet :( >> Just use team instead and you'll be fine. You can "google" it :) > > Sure, please join _our_ team and make all the needed changes in user > land. > :) Cannot put it more eloquently than Eric did but we would (in theory) love to move to team-driver, but logistics don't support this (yet!). > The kernel part of it is epsilon compared to all the control part (ie > talk to the various Google switches.), and monitoring. > > I am fine to leave the bug in upstream bonding driver, if you really > want to force people out of bonding land ! > > Here an automated test simply used macvlan and did not remove the > macvlan before enslaving the netdev again in a bonding, we already fixed > the test, because it is faster than deploying new kernels anyway. > > >
Minimum MTU Mess
So... I had a bug reported, about a NIC that ceased to work, if it's MTU was set to 0, then back to it's original value (1500). This got me thinking... What does an MTU of 0 even mean? Why should it be allowed? As it turns out, most (but not all) network drivers have a check in their ndo_change_mtu function that returns -EINVAL if new_mtu < $magic_number, which is often 68 (per page 25 of RFC 791), but is sometimes 64, or 60, or 46... Sometimes other manipulations are done. But the short version is that it seems there's an almost universal need to check for a minimum MTU. There's just some disagreement on what that minimum is. So, rather than having nearly every ndo_change_mut callback do the exact same thing, would it make sense to settle on one minimum MTU value, and check that unilaterally (at least for certain netdev types) in net/core/dev.c's dev_set_mtu()? Or is intentionally left vague, because it's really up to the hardware to care? Alternatively, perhaps each driver should set a netdev->min_mtu value, and net/core/dev.c could check against that. Could even have a centralized IP_MIN_MTU of 68 that all devices using ether_setup() and/or alloc_etherdev() used by default. In any case, the number of "mtu < 68" and "#define FOO_MIN_MTU 68", or variations thereof, under drivers/net/ is kind of crazy. In the interim, I think I'll just do a 3-line patch for this one driver that mirrors all the existing drivers to keep it from going splat... -- Jarod Wilson ja...@redhat.com
Re: possible circular locking dependency detected
On Fri, Sep 02, 2016 at 05:10:13PM +0100, Rainer Weikusat wrote: > > Bullshit. kern_path_create() *does* mean the same thing in all cases. > > Namely, find the parent, lock it and leave the final name component for > > the create-type operation. It sure as hell is not guaranteed to take > > *all* locks that are going to be taken in process of mknod/mkdir/etc. > > Never had been. > > This isn't about "all locks", it's about the lock in question. No other > mknod operation (I'm aware of) calls this with another superblock than > the one already acted upon by kern_path_create. This may be wrong (if > so, feel free to correct it) but it's not "bullshit" (intentional > deception in order to sell something to someone). > Never had been promised. And it's not just this lock - e.g. ->i_rwsem is taken on the parent by kern_path_create() and on parent in underlying filesystem by ecryptfs ->mknod() (as well as overlayfs one). bind/bind deadlock - one for a path to ecryptfs, another for that on the raw filesystem behind it (which can be mounted elsewhere/in another namespace/etc.) with those paths ending in the matching directories (the last components may be same or different - doesn't matter) A: lock parent in ecryptfs (via kern_path_create()) B: lock the directory behind it in underlying fs (ditto) A: grab ->readlock B: block on ->readlock A: call ecryptfs_mknod() and block trying to lock the directory held by B Deadlock. And while we are at it, ecryptfs probably ought to claim transient write access for the duration of modifications of the underlying one similar to overlayfs. The point is, it had never been promised that you can stick random locks just outside of ->mknod()/->mkdir()/etc. The same goes for e.g. NFS mount of something exported by localhost; knfsd must lock the parent directory on server before creating anything in it. Suppose you have /srv/nfs/foo exported and mounted on the same host at /mnt/bar. bind to /mnt/bar/barf/a vs. bind to /srv/nfs/foo/barf/b: A: lock /mnt/bar/barf B: lock /srv/nfs/foo/barf A: grab ->readlock B: block on ->readlock A: call nfs_mknod(), wait for reply knfsd: block trying to lock /srv/nfs/foo/barf It's very much _not_ just overlayfs being pathological - that it certainly is, but the problem is much wider. You could try to argue that kern_path_create() should've known to lock all relevant directories in case of overlayfs and ecryptfs, but it has absolutely no chance to do that in case of NFS - the protocol doesn't allow "lock this directory, one of my next requests will be to create something in it". Even leases are only for regular files...
Re: e1000e on Thinkpad x60: gigabit not available due to "SmartSpeed"
On Thu, Sep 01, 2016 at 02:58:13PM -0700, Greg wrote: > On Thu, 2016-09-01 at 22:14 +0200, Pavel Machek wrote: > > Hi! > > > > I have trouble getting 1000mbit out of my ethernet card. > > > > I tried direct connection between two PCs with different cables, and > > no luck. > > > > Today I tried connection to 1000mbit switch, and no luck, either. (Two > > cables, one was cat6, both short). > > > > My computer sees 1000mbit being advertised by the other side, but does > > not advertise 1000mbit, "Link Speed was downgraded by SmartSpeed". > > Check your cables? > > https://vmxp.wordpress.com/2015/01/06/1gbe-intel-nic-throttled-to-100mbit-by-smartspeed/ Of course if it isn't the cable, then it could even be a broken pin in the port. As far as I can tell, anything that causes one of the 3rd or 4th pairs of wires to not work will degrade to 100Mbit on just the first 2 pairs of wires and give that message. Some badly implemented switches can also cause it of course. -- Len Sorensen
[PATCH net-next v4 1/2] net: ethernet: mediatek: enhance RX path by reducing the frequency of the memory barrier used
From: Sean WangThe patch makes move wmb() to outside the loop that could help RX path handling more faster although that RX descriptors aren't freed for DMA to use as soon as possible, but based on my experiment and the result shows it still can reach about 943Mbpis without performance drop that is tested based on the setup with one port using Giga PHY and 256 RX descriptors for DMA to move. Signed-off-by: Sean Wang --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 048d763..33bb10f 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -951,13 +951,14 @@ release_desc: rxd->rxd2 = RX_DMA_PLEN0(ring->buf_size); ring->calc_idx = idx; - /* make sure that all changes to the dma ring are flushed before -* we continue -*/ - wmb(); - mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0); + done++; } + /* make sure that all changes to the dma ring are flushed before +* we continue +*/ + wmb(); + mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0); if (done < budget) mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS); -- 1.9.1
[PATCH net-next v4 2/2] net: ethernet: mediatek: enhance RX path by aggregating more SKBs into NAPI
From: Sean WangThe patch adds support for aggregating more SKBs feed into NAPI in order to get more benefits from generic receive offload (GRO) by peeking at the RX ring status and moving more packets right before returning from NAPI RX polling handler if NAPI budgets are still available and some packets already present in hardware. Signed-off-by: Sean Wang --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 33bb10f..44cab1b 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -960,9 +960,6 @@ release_desc: wmb(); mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0); - if (done < budget) - mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS); - return done; } @@ -1081,10 +1078,13 @@ static int mtk_napi_rx(struct napi_struct *napi, int budget) struct mtk_eth *eth = container_of(napi, struct mtk_eth, rx_napi); u32 status, mask; int rx_done = 0; + int remain_budget = budget; mtk_handle_status_irq(eth); + +poll_again: mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS); - rx_done = mtk_poll_rx(napi, budget, eth); + rx_done = mtk_poll_rx(napi, remain_budget, eth); if (unlikely(netif_msg_intr(eth))) { status = mtk_r32(eth, MTK_PDMA_INT_STATUS); @@ -1093,14 +1093,14 @@ static int mtk_napi_rx(struct napi_struct *napi, int budget) "done rx %d, intr 0x%08x/0x%x\n", rx_done, status, mask); } - - if (rx_done == budget) + if (rx_done == remain_budget) return budget; status = mtk_r32(eth, MTK_PDMA_INT_STATUS); - if (status & MTK_RX_DONE_INT) - return budget; - + if (status & MTK_RX_DONE_INT) { + remain_budget -= rx_done; + goto poll_again; + } napi_complete(napi); mtk_irq_enable(eth, MTK_PDMA_INT_MASK, MTK_RX_DONE_INT); -- 1.9.1
[PATCH net-next v4 0/2] net: ethernet: mediatek: add enhancements to RX path
From: Sean WangChanges since v1: - fix message typos and add coverletter Changes since v2: - split from the previous series for submitting add enhancements as a series targeting 'net-next' and add indents before comments. Changes since v3: - merge the patch using PDMA RX path - fixed the input of mtk_poll_rx is with the remaining budget Sean Wang (2): net: ethernet: mediatek: enhance RX path by reducing the frequency of the memory barrier used net: ethernet: mediatek: enhance RX path by aggregating more SKBs into NAPI drivers/net/ethernet/mediatek/mtk_eth_soc.c | 29 +++-- 1 file changed, 15 insertions(+), 14 deletions(-) -- 1.9.1
Re: [PATCH 4/4] net: smsc911x: Move interrupt allocation to open/stop
On Thu, Sep 01, 2016 at 03:15:09PM -0500, Jeremy Linton wrote: > The /proc/irq/xx information is incorrect for smsc911x because > the request_irq is happening before the register_netdev has the > proper device name. Moving it to the open also fixes the case > of when the device is renamed. > > Reported-by: Will Deacon> Signed-off-by: Jeremy Linton > --- > drivers/net/ethernet/smsc/smsc911x.c | 47 > ++-- > 1 file changed, 18 insertions(+), 29 deletions(-) FWIW, this fixes the interface naming under /proc//ethX for me: Tested-by: Will Deacon Will
Re: possible circular locking dependency detected
Linus Torvaldswrites: > On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds > wrote: >> On Thu, Sep 1, 2016 at 2:01 PM, Al Viro wrote: >>> >>> Outside as in "all fs activity in bind happens under it". Along with >>> assignment to ->u.addr, etc. IOW, make it the outermost lock there. >> >> Hah, yes. I misunderstood you. >> >> Yes. In fact that fixes the problem I mentioned, rather than introducing it. > > So the easiest approach would seem to be to revert commit c845acb324aa > ("af_unix: Fix splice-bind deadlock"), and then apply the lock split. > > Like the attached two patches. > > This is still *entirely* untested. As far as I can tell, this should work as I can't currently imagine why a fs operation might end up binding a unix socket despite the idea to make af_unix.c yet more complicated in order to work around irregular behaviour of (as far as I can tell) a single filesystem (for which kern_path_create doesn't really mean kern_path_create and it has to work around that once it gets control) goes against all instincts I have in this area. If filesystems need to do arbitrary stuff when __sb_start_write is called for 'their' superblock, they should be able to do so directly. At present, this is a theoretic concern as I can't (due to other work committments) put any non-cursory work into this before Sunday. There may also be other reasons why this idea is impractical or even unworkable.
Re: possible circular locking dependency detected
- Original Message - > From: "CAI Qian"> To: "Linus Torvalds" > Cc: "Al Viro" , "Miklos Szeredi" > , "Rainer Weikusat" > , "Hannes Frederic Sowa" > , "Rainer Weikusat" > , "Eric Sandeen" , > "Network Development" > > Sent: Friday, September 2, 2016 11:51:58 AM > Subject: Re: possible circular locking dependency detected > > > > - Original Message - > > From: "CAI Qian" > > To: "Linus Torvalds" > > Cc: "Al Viro" , "Miklos Szeredi" > > , "Rainer Weikusat" > > , "Hannes Frederic Sowa" > > , "Rainer Weikusat" > > , "Eric Sandeen" , > > "Network Development" > > > > Sent: Friday, September 2, 2016 10:43:20 AM > > Subject: Re: possible circular locking dependency detected > > > > > > > > - Original Message - > > > From: "Linus Torvalds" > > > To: "Al Viro" , "CAI Qian" > > > Cc: "Miklos Szeredi" , "Rainer Weikusat" > > > , "Hannes Frederic Sowa" > > > , "Rainer Weikusat" > > > , "Eric Sandeen" > > > , "Network Development" > > > Sent: Thursday, September 1, 2016 6:04:38 PM > > > Subject: Re: possible circular locking dependency detected > > > > > > On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds > > > wrote: > > > > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro > > > > wrote: > > > >> > > > >> Outside as in "all fs activity in bind happens under it". Along with > > > >> assignment to ->u.addr, etc. IOW, make it the outermost lock there. > > > > > > > > Hah, yes. I misunderstood you. > > > > > > > > Yes. In fact that fixes the problem I mentioned, rather than > > > > introducing > > > > it. > > > > > > So the easiest approach would seem to be to revert commit c845acb324aa > > > ("af_unix: Fix splice-bind deadlock"), and then apply the lock split. > > > > > > Like the attached two patches. > > > > > > This is still *entirely* untested. > > Tested-by: CAI Qian OK, this tag still stand. The below issue is also reproduced without those patches, so a separate problem most likely was introduced recently (after rc3 or rc4) by probably some xfs update. CAI Qian > Actually, I took it back, and now spice seems start to deadlock using the > reproducer, > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c > > [ 1749.956818] > [ 1749.958492] == > [ 1749.965386] [ INFO: possible circular locking dependency detected ] > [ 1749.972381] 4.8.0-rc4+ #34 Not tainted > [ 1749.976560] --- > [ 1749.983554] splice01/35921 is trying to acquire lock: > [ 1749.989188] (>s_type->i_mutex_key#14){+.+.+.}, at: > [] xfs_file_buffered_aio_write+0x127/0x840 [xfs] > [ 1750.001644] > [ 1750.001644] but task is already holding lock: > [ 1750.008151] (>mutex/1){+.+.+.}, at: [] > pipe_lock+0x51/0x60 > [ 1750.016753] > [ 1750.016753] which lock already depends on the new lock. > [ 1750.016753] > [ 1750.025880] > [ 1750.025880] the existing dependency chain (in reverse order) is: > [ 1750.034229] > -> #2 (>mutex/1){+.+.+.}: > [ 1750.039139][] lock_acquire+0x1fa/0x440 > [ 1750.045857][] mutex_lock_nested+0xdd/0x850 > [ 1750.052963][] pipe_lock+0x51/0x60 > [ 1750.059190][] splice_to_pipe+0x75/0x9e0 > [ 1750.066001][] > __generic_file_splice_read+0xa71/0xe90 > [ 1750.074071][] > generic_file_splice_read+0xc1/0x1f0 > [ 1750.081849][] xfs_file_splice_read+0x368/0x7b0 > [xfs] > [ 1750.089940][] do_splice_to+0xee/0x150 > [ 1750.096555][] SyS_splice+0x1144/0x1c10 > [ 1750.103269][] do_syscall_64+0x1a6/0x500 > [ 1750.110084][] return_from_SYSCALL_64+0x0/0x7a > [ 1750.117479] > -> #1 (&(>i_iolock)->mr_lock#2){++}: > [ 1750.123649][] lock_acquire+0x1fa/0x440 > [ 1750.130362][] down_write_nested+0x5e/0xe0 > [ 1750.137371][] xfs_ilock+0x2fe/0x550 [xfs] > [ 1750.144397][] > xfs_file_buffered_aio_write+0x134/0x840 [xfs] > [ 1750.153175][] xfs_file_write_iter+0x26d/0x6d0 > [xfs] > [ 1750.161177][] __vfs_write+0x2be/0x640 > [ 1750.167799][] vfs_write+0x152/0x4b0 > [ 1750.174220][] SyS_write+0xdf/0x1d0 > [ 1750.180547][] >
Re: [PATCH v3 1/2] wcn36xx: Transition driver to SMD client
On Fri 02 Sep 09:24 PDT 2016, Kalle Valo wrote: > Bjorn Anderssonwrites: > > > The wcn36xx wifi driver follows the life cycle of the WLAN_CTRL SMD > > channel, as such it should be a SMD client. This patch makes this > > transition, now that we have the necessary frameworks available. > > > > Signed-off-by: Bjorn Andersson > > [...] > > > --- a/drivers/net/wireless/ath/wcn36xx/Kconfig > > +++ b/drivers/net/wireless/ath/wcn36xx/Kconfig > > @@ -1,6 +1,6 @@ > > config WCN36XX > > tristate "Qualcomm Atheros WCN3660/3680 support" > > - depends on MAC80211 && HAS_DMA > > + depends on MAC80211 && HAS_DMA && QCOM_SMD > > While I had this patch on my pending branch I noticed that I can't > compile wcn36xx on x86 anymore. This is actually quite inconvenient, for > example then it's easy to miss mac80211 API changes etc. Is there any > way we could continue build testing wcn36xx on x86 (or all) platforms? > Sorry about that, we should at least be able to COMPILE_TEST it in non-qcom builds. Thanks for letting me know. And the driver should also depend on QCOM_WCNSS_CTRL, in the normal case. I'll respin this, unless you prefer an incremental patch for those changes(?) > Also what about older platforms? Earlier I used wcn36xx on my Nexus 4 > with help of backports project. I can't do that anymore, right? > This has been tested on 8064, 8974 and 8916. So your Nexus 4 is covered. Unfortunately I don't have a Nexus 4, but we currently have Nexus 7 (the 8064 version), Xperia Z, Xperia Z1 and DB410c using this chip and all four has been tested with this code. I've staged the PIL/remoteproc (firmware "loader") driver for v4.9, so with this patch the only thing missing to fill in the dts files is one clock from the RPM. JFYI, There seems to be some race in the removal path, which I will look into. But I would prefer if we could merge this to make the driver usable, and more accessible to work on. Regards, Bjorn
Re: [PATCH net 0/2] vxlan: fix error reporting
On Fri, 2 Sep 2016 09:03:11 -0700, Stephen Hemminger wrote: > On Fri, 2 Sep 2016 13:37:10 +0200 > Jiri Bencwrote: > > > This patchset improves checking for invalid configuration in VXLAN and fixes > > problems with duplicated and inappropriate error messages. > > > > Jiri Benc (2): > > vxlan: reject multicast destination without an interface > > vxlan: fix duplicated and wrong error messages > > > > drivers/net/vxlan.c | 38 -- > > 1 file changed, 12 insertions(+), 26 deletions(-) > > > > These should also be detected and rejected in iproute2 > errors in kernel log are user unfriendly api. I agree and I intend to send a patch once this is accepted. Jiri
Re: [Patch net-next 2/2] netns: avoid disabling irq for netns id
On Fri, Sep 2, 2016 at 1:12 AM, Nicolas Dichtelwrote: > Le 02/09/2016 à 06:53, Cong Wang a écrit : >> We never read or change netns id in hardirq context, >> the only place we read netns id in softirq context >> is in vxlan_xmit(). So, it should be enough to just >> disable BH. > > Are you sure? Did you audit all part of the code? I did audit all the callers, and I didn't find any of them in IRQ context. > peernet2id() is called from netlink core system (do_one_broadcast()). Are you > sure that no driver call this function from an hard irq context? I audit all callers of netlink_broadcast(), and I don't see any of them in hardirq context. > > I think that NETLINK_LISTEN_ALL_NSID is largely untested, so it will be hard > to > detect a bug introduced in this feature. This patch passed my netns stress tests, I have LOCKDEP enabled, and I don't get any warning or crash etc.
Re: [PATCH v3 1/2] wcn36xx: Transition driver to SMD client
Hi Kalle, >> The wcn36xx wifi driver follows the life cycle of the WLAN_CTRL SMD >> channel, as such it should be a SMD client. This patch makes this >> transition, now that we have the necessary frameworks available. >> >> Signed-off-by: Bjorn Andersson> > [...] > >> --- a/drivers/net/wireless/ath/wcn36xx/Kconfig >> +++ b/drivers/net/wireless/ath/wcn36xx/Kconfig >> @@ -1,6 +1,6 @@ >> config WCN36XX >> tristate "Qualcomm Atheros WCN3660/3680 support" >> -depends on MAC80211 && HAS_DMA >> +depends on MAC80211 && HAS_DMA && QCOM_SMD > > While I had this patch on my pending branch I noticed that I can't > compile wcn36xx on x86 anymore. This is actually quite inconvenient, for > example then it's easy to miss mac80211 API changes etc. Is there any > way we could continue build testing wcn36xx on x86 (or all) platforms? I would also like that for the btqcomsmd Bluetooth driver. Doing quick build tests on x86 would be great. Regards Marcel
Re: [PATCH net] bonding: Fix bonding crash
On Fri, 2016-09-02 at 18:25 +0200, Jiri Pirko wrote: > Understand the reason. All I say is we tried hard (I surely did) in > the past to remove bonding specific quirks and now we are adding one. > And the simple reason is that the code is such a mess. > > Just use team instead and you'll be fine. You can "google" it :) Sure, please join _our_ team and make all the needed changes in user land. The kernel part of it is epsilon compared to all the control part (ie talk to the various Google switches.), and monitoring. I am fine to leave the bug in upstream bonding driver, if you really want to force people out of bonding land ! Here an automated test simply used macvlan and did not remove the macvlan before enslaving the netdev again in a bonding, we already fixed the test, because it is faster than deploying new kernels anyway.
Re: [PATCH net] bonding: Fix bonding crash
Fri, Sep 02, 2016 at 03:33:20PM CEST, eric.duma...@gmail.com wrote: >On Fri, 2016-09-02 at 09:52 +0200, Jiri Pirko wrote: > >> >> No, please, don't make bonding a spacial citizen introducing this. >> Please handle the issue inside the bonding code, like we do for the rest >> of master devices (and how it was once done for bonding). Thanks. > >I do not feel this netdev_is_rx_handler_busy() use is special to >bonding. > >It makes sense to use it early, to avoid complex rollback, once various >events have been sent all over. > >bond_enslave() is 447 lines long already. > >You perfectly know how hard it is to 'handle the issue inside the >bonding code' as you chose to not fix bonding and write team instead. > >So Mahesh patch makes perfect sense to me. It exactly fixes a stupid >bond_enslave() behavior, trying to set rx_handler way too late. > >By doing sanity checks before any action, we simply do not have to add >complex rollback. Understand the reason. All I say is we tried hard (I surely did) in the past to remove bonding specific quirks and now we are adding one. And the simple reason is that the code is such a mess. Just use team instead and you'll be fine. You can "google" it :)
Re: [PATCH v3 1/2] wcn36xx: Transition driver to SMD client
Bjorn Anderssonwrites: > The wcn36xx wifi driver follows the life cycle of the WLAN_CTRL SMD > channel, as such it should be a SMD client. This patch makes this > transition, now that we have the necessary frameworks available. > > Signed-off-by: Bjorn Andersson [...] > --- a/drivers/net/wireless/ath/wcn36xx/Kconfig > +++ b/drivers/net/wireless/ath/wcn36xx/Kconfig > @@ -1,6 +1,6 @@ > config WCN36XX > tristate "Qualcomm Atheros WCN3660/3680 support" > - depends on MAC80211 && HAS_DMA > + depends on MAC80211 && HAS_DMA && QCOM_SMD While I had this patch on my pending branch I noticed that I can't compile wcn36xx on x86 anymore. This is actually quite inconvenient, for example then it's easy to miss mac80211 API changes etc. Is there any way we could continue build testing wcn36xx on x86 (or all) platforms? Also what about older platforms? Earlier I used wcn36xx on my Nexus 4 with help of backports project. I can't do that anymore, right? -- Kalle Valo
Re: possible circular locking dependency detected
Al Virowrites: > On Fri, Sep 02, 2016 at 04:18:04PM +0100, Rainer Weikusat wrote: > >> As far as I can tell, this should work as I can't currently imagine >> why a fs operation might end up binding a unix socket despite the >> idea to make af_unix.c yet more complicated in order to work around >> irregular behaviour of (as far as I can tell) a single filesystem >> (for which kern_path_create doesn't really mean kern_path_create and >> it has to work around that once it gets control) goes against all >> instincts I have in this area. If filesystems need to do arbitrary >> stuff when __sb_start_write is called for 'their' superblock, they >> should be able to do so directly. > > Bullshit. kern_path_create() *does* mean the same thing in all cases. > Namely, find the parent, lock it and leave the final name component for > the create-type operation. It sure as hell is not guaranteed to take > *all* locks that are going to be taken in process of mknod/mkdir/etc. > Never had been. This isn't about "all locks", it's about the lock in question. No other mknod operation (I'm aware of) calls this with another superblock than the one already acted upon by kern_path_create. This may be wrong (if so, feel free to correct it) but it's not "bullshit" (intentional deception in order to sell something to someone).
[PATCH net-next v3 0/2] net: ethernet: mediatek: add enhancements to RX path
From: Sean WangChanges since v1: - fix message typos and add coverletter Changes since v2: - split from the previous series for submitting add enhancements as a series targeting 'net-next' and add indents before comments. Changes since v3: - merge the patch using PDMA RX path - fixed the input of mtk_poll_rx is with the remaining budget Sean Wang (2): net: ethernet: mediatek: enhance RX path by reducing the frequency of the memory barrier used net: ethernet: mediatek: enhance RX path by aggregating more SKBs into NAPI drivers/net/ethernet/mediatek/mtk_eth_soc.c | 29 +++-- 1 file changed, 15 insertions(+), 14 deletions(-) -- 1.9.1
[PATCH net-next v3 1/2] net: ethernet: mediatek: enhance RX path by reducing the frequency of the memory barrier used
From: Sean WangThe patch makes move wmb() to outside the loop that could help RX path handling more faster although that RX descriptors aren't freed for DMA to use as soon as possible, but based on my experiment and the result shows it still can reach about 943Mbpis without performance drop that is tested based on the setup with one port using Giga PHY and 256 RX descriptors for DMA to move. Signed-off-by: Sean Wang --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 048d763..33bb10f 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -951,13 +951,14 @@ release_desc: rxd->rxd2 = RX_DMA_PLEN0(ring->buf_size); ring->calc_idx = idx; - /* make sure that all changes to the dma ring are flushed before -* we continue -*/ - wmb(); - mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0); + done++; } + /* make sure that all changes to the dma ring are flushed before +* we continue +*/ + wmb(); + mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0); if (done < budget) mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS); -- 1.9.1
[PATCH net-next v3 2/2] net: ethernet: mediatek: enhance RX path by aggregating more SKBs into NAPI
From: Sean WangThe patch adds support for aggregating more SKBs feed into NAPI in order to get more benefits from generic receive offload (GRO) by peeking at the RX ring status and moving more packets right before returning from NAPI RX polling handler if NAPI budgets are still available and some packets already present in hardware. Signed-off-by: Sean Wang --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 33bb10f..44cab1b 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -960,9 +960,6 @@ release_desc: wmb(); mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0); - if (done < budget) - mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS); - return done; } @@ -1081,10 +1078,13 @@ static int mtk_napi_rx(struct napi_struct *napi, int budget) struct mtk_eth *eth = container_of(napi, struct mtk_eth, rx_napi); u32 status, mask; int rx_done = 0; + int remain_budget = budget; mtk_handle_status_irq(eth); + +poll_again: mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS); - rx_done = mtk_poll_rx(napi, budget, eth); + rx_done = mtk_poll_rx(napi, remain_budget, eth); if (unlikely(netif_msg_intr(eth))) { status = mtk_r32(eth, MTK_PDMA_INT_STATUS); @@ -1093,14 +1093,14 @@ static int mtk_napi_rx(struct napi_struct *napi, int budget) "done rx %d, intr 0x%08x/0x%x\n", rx_done, status, mask); } - - if (rx_done == budget) + if (rx_done == remain_budget) return budget; status = mtk_r32(eth, MTK_PDMA_INT_STATUS); - if (status & MTK_RX_DONE_INT) - return budget; - + if (status & MTK_RX_DONE_INT) { + remain_budget -= rx_done; + goto poll_again; + } napi_complete(napi); mtk_irq_enable(eth, MTK_PDMA_INT_MASK, MTK_RX_DONE_INT); -- 1.9.1
Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: make global2 code optional
Hi Vivien > What do you think? I think the probe() needs to fail with a very obvious error message saying you need to recompile your kernel with option XYZ enabled in order to support this switch, when the optional stuff is not optional... Andrew
Re: [v2] ath9k: mark ath_fill_led_pin() static
Baoyou Xiewrote: > We get 1 warning about global functions without a declaration > in the ath9k gpio driver when building with W=1: > drivers/net/wireless/ath/ath9k/gpio.c:25:6: warning: no previous prototype > for 'ath_fill_led_pin' [-Wmissing-prototypes] > > In fact, this function is only used in the file in which it is declared > and don't need a declaration, but can be made static. > so this patch marks it 'static'. > > Signed-off-by: Baoyou Xie Thanks, 1 patch applied to ath-next branch of ath.git: c39265f72ae6 ath9k: mark ath_fill_led_pin() static -- Sent by pwcli https://patchwork.kernel.org/patch/9303795/
Re: [PATCH] net: Don't delete routes in different VRFs
On 9/1/16 11:26 PM, Mark Tomlinson wrote: > When deleting an IP address from an interface, there is a clean-up of > routes which refer to this local address. However, there was no check to > see that the VRF matched. This meant that deletion wasn't confined to > the VRF it should have been. > > To solve this, a new field has been added to fib_info to hold a table > id. When removing fib entries corresponding to a local ip address, this > table id is also used in the comparison. > > The table id is populated when the fib_info is created. This was already > done in some places, but not in ip_rt_ioctl(). This has now been fixed. > The best fixes tag is: Fixes: 021dd3b8a142 ("net: Add routes to the table associated with the device") > Signed-off-by: Mark Tomlinson> --- > include/net/ip_fib.h | 3 ++- > net/ipv4/fib_frontend.c | 3 ++- > net/ipv4/fib_semantics.c | 8 ++-- > 3 files changed, 10 insertions(+), 4 deletions(-) Acked-by: David Ahern Tested-by: David Ahern Mark: send a v2 with the Fixes tag and my acked-by and tested-by. Thanks for the patch.
Re: ath10k: fix spelling mistake "montior" -> "monitor"
Colin Ian Kingwrote: > From: Colin Ian King > > Trivial fix to spelling mistake in ath10k_warn message. > > Signed-off-by: Colin Ian King > Reviewed-by: Julian Calaby Thanks, 1 patch applied to ath-next branch of ath.git: 7f03d3069381 ath10k: fix spelling mistake "montior" -> "monitor" -- Sent by pwcli https://patchwork.kernel.org/patch/9301869/
Re: possible circular locking dependency detected
On Fri, Sep 02, 2016 at 04:18:04PM +0100, Rainer Weikusat wrote: > As far as I can tell, this should work as I can't currently imagine why > a fs operation might end up binding a unix socket despite the idea to > make af_unix.c yet more complicated in order to work around irregular > behaviour of (as far as I can tell) a single filesystem (for which > kern_path_create doesn't really mean kern_path_create Bullshit. kern_path_create() *does* mean the same thing in all cases. Namely, find the parent, lock it and leave the final name component for the create-type operation. It sure as hell is not guaranteed to take *all* locks that are going to be taken in process of mknod/mkdir/etc. Never had been. and it has to work > around that once it gets control) goes against all instincts I have in > this area. If filesystems need to do arbitrary stuff when > __sb_start_write is called for 'their' superblock, they should be able > to do so directly. > > At present, this is a theoretic concern as I can't (due to other work > committments) put any non-cursory work into this before Sunday. There > may also be other reasons why this idea is impractical or even > unworkable.
Re: ath10k: replace config_enabled() with IS_REACHABLE()
Masahiro Yamadawrote: > Commit 97f2645f358b ("tree-wide: replace config_enabled() with > IS_ENABLED()") mostly did away with config_enabled(). > > This is one of the postponed TODO items as config_enabled() is used > for a tristate option here. Theoretically, config_enabled() is > equivalent to IS_BUILTIN(), but I guess IS_REACHABLE() is the best > fit for this case because both CONFIG_HWMON and CONFIG_ATH10K are > tristate. > > Signed-off-by: Masahiro Yamada Thanks, 1 patch applied to ath-next branch of ath.git: 749bc03ae2cd ath10k: replace config_enabled() with IS_REACHABLE() -- Sent by pwcli https://patchwork.kernel.org/patch/9295945/
Re: [PATCH net 0/2] vxlan: fix error reporting
On Fri, 2 Sep 2016 13:37:10 +0200 Jiri Bencwrote: > This patchset improves checking for invalid configuration in VXLAN and fixes > problems with duplicated and inappropriate error messages. > > Jiri Benc (2): > vxlan: reject multicast destination without an interface > vxlan: fix duplicated and wrong error messages > > drivers/net/vxlan.c | 38 -- > 1 file changed, 12 insertions(+), 26 deletions(-) > These should also be detected and rejected in iproute2 errors in kernel log are user unfriendly api.
Re: possible circular locking dependency detected
- Original Message - > From: "CAI Qian"> To: "Linus Torvalds" > Cc: "Al Viro" , "Miklos Szeredi" > , "Rainer Weikusat" > , "Hannes Frederic Sowa" > , "Rainer Weikusat" > , "Eric Sandeen" , > "Network Development" > > Sent: Friday, September 2, 2016 10:43:20 AM > Subject: Re: possible circular locking dependency detected > > > > - Original Message - > > From: "Linus Torvalds" > > To: "Al Viro" , "CAI Qian" > > Cc: "Miklos Szeredi" , "Rainer Weikusat" > > , "Hannes Frederic Sowa" > > , "Rainer Weikusat" > > , "Eric Sandeen" > > , "Network Development" > > Sent: Thursday, September 1, 2016 6:04:38 PM > > Subject: Re: possible circular locking dependency detected > > > > On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds > > wrote: > > > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro wrote: > > >> > > >> Outside as in "all fs activity in bind happens under it". Along with > > >> assignment to ->u.addr, etc. IOW, make it the outermost lock there. > > > > > > Hah, yes. I misunderstood you. > > > > > > Yes. In fact that fixes the problem I mentioned, rather than introducing > > > it. > > > > So the easiest approach would seem to be to revert commit c845acb324aa > > ("af_unix: Fix splice-bind deadlock"), and then apply the lock split. > > > > Like the attached two patches. > > > > This is still *entirely* untested. > Tested-by: CAI Qian Actually, I took it back, and now spice seems start to deadlock using the reproducer, https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c [ 1749.956818] [ 1749.958492] == [ 1749.965386] [ INFO: possible circular locking dependency detected ] [ 1749.972381] 4.8.0-rc4+ #34 Not tainted [ 1749.976560] --- [ 1749.983554] splice01/35921 is trying to acquire lock: [ 1749.989188] (>s_type->i_mutex_key#14){+.+.+.}, at: [] xfs_file_buffered_aio_write+0x127/0x840 [xfs] [ 1750.001644] [ 1750.001644] but task is already holding lock: [ 1750.008151] (>mutex/1){+.+.+.}, at: [] pipe_lock+0x51/0x60 [ 1750.016753] [ 1750.016753] which lock already depends on the new lock. [ 1750.016753] [ 1750.025880] [ 1750.025880] the existing dependency chain (in reverse order) is: [ 1750.034229] -> #2 (>mutex/1){+.+.+.}: [ 1750.039139][] lock_acquire+0x1fa/0x440 [ 1750.045857][] mutex_lock_nested+0xdd/0x850 [ 1750.052963][] pipe_lock+0x51/0x60 [ 1750.059190][] splice_to_pipe+0x75/0x9e0 [ 1750.066001][] __generic_file_splice_read+0xa71/0xe90 [ 1750.074071][] generic_file_splice_read+0xc1/0x1f0 [ 1750.081849][] xfs_file_splice_read+0x368/0x7b0 [xfs] [ 1750.089940][] do_splice_to+0xee/0x150 [ 1750.096555][] SyS_splice+0x1144/0x1c10 [ 1750.103269][] do_syscall_64+0x1a6/0x500 [ 1750.110084][] return_from_SYSCALL_64+0x0/0x7a [ 1750.117479] -> #1 (&(>i_iolock)->mr_lock#2){++}: [ 1750.123649][] lock_acquire+0x1fa/0x440 [ 1750.130362][] down_write_nested+0x5e/0xe0 [ 1750.137371][] xfs_ilock+0x2fe/0x550 [xfs] [ 1750.144397][] xfs_file_buffered_aio_write+0x134/0x840 [xfs] [ 1750.153175][] xfs_file_write_iter+0x26d/0x6d0 [xfs] [ 1750.161177][] __vfs_write+0x2be/0x640 [ 1750.167799][] vfs_write+0x152/0x4b0 [ 1750.174220][] SyS_write+0xdf/0x1d0 [ 1750.180547][] entry_SYSCALL_64_fastpath+0x1f/0xbd [ 1750.188328] -> #0 (>s_type->i_mutex_key#14){+.+.+.}: [ 1750.194508][] __lock_acquire+0x3043/0x3dd0 [ 1750.201609][] lock_acquire+0x1fa/0x440 [ 1750.208321][] down_write+0x5a/0xe0 [ 1750.214645][] xfs_file_buffered_aio_write+0x127/0x840 [xfs] [ 1750.223421][] xfs_file_write_iter+0x26d/0x6d0 [xfs] [ 1750.231423][] vfs_iter_write+0x29e/0x550 [ 1750.238330][] iter_file_splice_write+0x529/0xb70 [ 1750.246012][] SyS_splice+0x724/0x1c10 [ 1750.252627][] do_syscall_64+0x1a6/0x500 [ 1750.259438][] return_from_SYSCALL_64+0x0/0x7a [ 1750.266830] [ 1750.266830] other info that might help us debug this: [ 1750.266830] [ 1750.275764] Chain exists of: >s_type->i_mutex_key#14 --> &(>i_iolock)->mr_lock#2 --> >mutex/1 [ 1750.287213] Possible unsafe locking scenario: [ 1750.287213] [ 1750.293817]CPU0CPU1 [ 1750.298871] [
Re: [2/2] ath10k: use complete() instead complete_all()
Daniel Wagnerwrote: > From: Daniel Wagner > > There is only one waiter for the completion, therefore there > is no need to use complete_all(). Let's make that clear by > using complete() instead of complete_all(). > > The usage pattern of the completion is: > > waiter context waker context > > scan.started > > > ath10k_start_scan() > lockdep_assert_held(conf_mutex) > auth10k_wmi_start_scan() > wait_for_completion_timeout(scan.started) > > ath10k_wmi_event_scan_start_failed() > complete(scan.started) > > ath10k_wmi_event_scan_started() > complete(scan.started) > > scan.completed > -- > > ath10k_scan_stop() > lockdep_assert_held(conf_mutex) > ath10k_wmi_stop_scan() > wait_for_completion_timeout(scan.completed) > > __ath10k_scan_finish() > complete(scan.completed) > > scan.on_channel > --- > > ath10k_remain_on_channel() > mutex_lock(conf_mutex) > ath10k_start_scan() > wait_for_completion_timeout(scan.on_channel) > > ath10k_wmi_event_scan_foreign_chan() > complete(scan.on_channel) > > offchan_tx_completed > > > ath10k_offchan_tx_work() > mutex_lock(conf_mutex) > reinit_completion(offchan_tx_completed) > wait_for_completion_timeout(offchan_tx_completed) > > ath10k_report_offchain_tx() > complete(offchan_tx_completed) > > install_key_done > > ath10k_install_key() > lockep_assert_held(conf_mutex) > reinit_completion(install_key_done) > wait_for_completion_timeout(install_key_done) > > ath10k_htt_t2h_msg_handler() > complete(install_key_done) > > vdev_setup_done > --- > > ath10k_monitor_vdev_start() > lockdep_assert_held(conf_mutex) >reinit_completion(vdev_setup_done) > ath10k_vdev_setup_sync() > wait_for_completion_timeout(vdev_setup_done) > > ath10k_wmi_event_vdev_start_resp() > complete(vdev_setup_done) > > ath10k_monitor_vdev_stop() > lockdep_assert_held(conf_mutex) > reinit_completion(vdev_setup_done() > ath10k_vdev_setup_sync() > wait_for_completion_timeout(vdev_setup_done) > > ath10k_wmi_event_vdev_stopped() >complete(vdev_setup_done) > > thermal.wmi_sync > > ath10k_thermal_show_temp() > mutex_lock(conf_mutex) > reinit_completion(thermal.wmi_sync) > wait_for_completion_timeout(thermal.wmi_sync) > > ath10k_thermal_event_temperature() > complete(thermal.wmi_sync) > > bss_survey_done > --- > ath10k_mac_update_bss_chan_survey > lockdep_assert_held(conf_mutex) > reinit_completion(bss_survey_done) > wait_for_completion_timeout(bss_survey_done) > > ath10k_wmi_event_pdev_bss_chan_info() > complete(bss_survey_done) > > All complete() calls happen while the conf_mutex is taken. That means > at max one waiter is possible. > > Signed-off-by: Daniel Wagner Thanks, 1 patch applied to ath-next branch of ath.git: 881ed54ecc13 ath10k: use complete() instead complete_all() -- Sent by pwcli https://patchwork.kernel.org/patch/9287731/
Re: [PATCH] ath10k: fix memory leak on caldata on error exit path
Colin Kingwrites: > From: Colin Ian King > > caldata is not being free'd on the error exit path, causing > a memory leak. kfree it to fix the leak. > > Signed-off-by: Colin Ian King > --- > drivers/net/wireless/ath/ath10k/pci.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/net/wireless/ath/ath10k/pci.c > b/drivers/net/wireless/ath/ath10k/pci.c > index 9a22c47..886337c 100644 > --- a/drivers/net/wireless/ath/ath10k/pci.c > +++ b/drivers/net/wireless/ath/ath10k/pci.c > @@ -2725,6 +2725,7 @@ static int ath10k_pci_hif_fetch_cal_eeprom(struct > ath10k *ar, void **data, > return 0; > > err_free: > + kfree(caldata); > kfree(data); > > return -EINVAL; I don't think we should free data at all: static int ath10k_download_cal_eeprom(struct ath10k *ar) { size_t data_len; void *data = NULL; int ret; ret = ath10k_hif_fetch_cal_eeprom(ar, , _len); Instead we should free only caldata, right? -- Kalle Valo
Re: [PATCH net] bonding: Fix bonding crash
On Fri, 2016-09-02 at 08:21 -0700, Eric Dumazet wrote: > Ideally the netdev_rx_handler_register() should be called only when no > further errors can be detected in the 'enslaving' process, otherwise > some live packets could come and be incorrectly processed/dropped by a > not fully initialized driver. > > So it is a chicken and egg problem, if you allow > netdev_rx_handler_register() to return an error, while it is so easy > to early check what could go wrong with it. One final point is that this patch is easy to backport to stable versions. bond_enslave() is a monster, and changing the rollback chain is likely to cause backport merge conflicts.
Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: make global2 code optional
Hi Andrew, Andrew Lunnwrites: > On Fri, Sep 02, 2016 at 08:08:19AM -0400, Vivien Didelot wrote: >> Since not every chip has a Global2 set of registers, make its support >> optional, in which case the related functions will return -EOPNOTSUPP. >> >> This also allows to reduce the size of the mv88e6xxx driver for devices >> such as home routers embedding Ethernet chips without Global2 support. > > I've no problems with splitting this out. Yes, I think it is important to split support for independent internal SMI devices (Globals, PHY, TCAM, ...). > However, making it optional? I don't think that is a good idea. > > 1) If you don't have this, and you should, it looks like the PHYs stop > working, and since you cannot set the switch MAC address, maybe Pause > frames are broken. > > 2) Distros won't build a kernel per target hardware. They probably > build a kernel per SoC vendor. That means, they will have this > optional code built all the time. Very few builds will leave it out. > > So the only way i think making this optional, is if it is a kernel > module, and it gets loaded if needed. Your points are correct. But unless I'm mistaken, I purposely default this suppor to 'y'. Distros will compile everything, which is good. Only people who know their chips and what they are doing will eventually go there and disable the support for some devices. I plan to merge mv88e6060.c into the mv88e6xxx driver, these chips don't have Global2. And we will soon add support for TCAM, which is huge. The majority of devices supported by mv88e6xxx don't have a TCAM device. So my point is that I'd like to disable support for unnecessary internal devices, even if it doesn't make much difference on our huge dev boards, but it does make a difference for the many embedded devices out there (think home routers) with some 88E6060, 88E6176 or 88E6185 chips. What do you think? Thanks, Vivien
Re: [PATCH net] bonding: Fix bonding crash
On Fri, 2016-09-02 at 08:57 -0600, David Ahern wrote: > I hit this same problem yesterday but with the bridge. I forgot I had > a macvlan device on an interface and tried to enslave it to a bridge. > It failed with EBUSY without crashing the kernel so it is one example > that handles the conflict, and the bridge also calls the register > before the enslaving is done. > Sure, some netdev_rx_handler_register() users are simpler than bonding, this was the point Jiri raised. Hey guys just use team instead of bonding ;) If you carefully look at bridge code, you'll find other kind of errors for sure ;) For example, should dev_disable_lro() be called before or after netdev_rx_handler_register() ? ;) Mahesh proposal allows for simplification, since one level can be removed from the error rollback chain. Ideally the netdev_rx_handler_register() should be called only when no further errors can be detected in the 'enslaving' process, otherwise some live packets could come and be incorrectly processed/dropped by a not fully initialized driver. So it is a chicken and egg problem, if you allow netdev_rx_handler_register() to return an error, while it is so easy to early check what could go wrong with it.
Re: [PATCH net] bonding: Fix bonding crash
On 9/2/16 8:45 AM, Eric Dumazet wrote: > On Fri, 2016-09-02 at 08:30 -0600, David Ahern wrote: > >> This check duplicates what netdev_rx_handler_register does. Why not >> move the call to netdev_rx_handler_register here and then call >> unregister on failure paths? > > As soon as you call netdev_rx_handler_register(), incoming packets will > hit your driver and we'll likely crash since the enslaving is not done > yet. > > Really think about RCU, we do rcu_assign_pointer() because we want all > prior changes being committed to memory before 'enabling' readers to see > the updated rcu protected pointer. > > There are 9 call sites where netdev_rx_handler_register() is used. > > We can get rid of the extra check in netdev_rx_handler_register() once > all of them are using netdev_is_rx_handler_busy() > > Since this patch takes care of bonding only, we need to keep the > existing check in netdev_rx_handler_register() > > Anyway, we are speaking of control function, an extra check is simply > safer, like all the ASSERT_RTNL() we do have... > I hit this same problem yesterday but with the bridge. I forgot I had a macvlan device on an interface and tried to enslave it to a bridge. It failed with EBUSY without crashing the kernel so it is one example that handles the conflict, and the bridge also calls the register before the enslaving is done.
Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: make global2 code optional
On Fri, Sep 02, 2016 at 08:08:19AM -0400, Vivien Didelot wrote: > Since not every chip has a Global2 set of registers, make its support > optional, in which case the related functions will return -EOPNOTSUPP. > > This also allows to reduce the size of the mv88e6xxx driver for devices > such as home routers embedding Ethernet chips without Global2 support. Hi Vivien I've no problems with splitting this out. However, making it optional? I don't think that is a good idea. 1) If you don't have this, and you should, it looks like the PHYs stop working, and since you cannot set the switch MAC address, maybe Pause frames are broken. 2) Distros won't build a kernel per target hardware. They probably build a kernel per SoC vendor. That means, they will have this optional code built all the time. Very few builds will leave it out. So the only way i think making this optional, is if it is a kernel module, and it gets loaded if needed. Andrew
Re: [PATCH net] bonding: Fix bonding crash
On Fri, 2016-09-02 at 08:30 -0600, David Ahern wrote: > This check duplicates what netdev_rx_handler_register does. Why not > move the call to netdev_rx_handler_register here and then call > unregister on failure paths? As soon as you call netdev_rx_handler_register(), incoming packets will hit your driver and we'll likely crash since the enslaving is not done yet. Really think about RCU, we do rcu_assign_pointer() because we want all prior changes being committed to memory before 'enabling' readers to see the updated rcu protected pointer. There are 9 call sites where netdev_rx_handler_register() is used. We can get rid of the extra check in netdev_rx_handler_register() once all of them are using netdev_is_rx_handler_busy() Since this patch takes care of bonding only, we need to keep the existing check in netdev_rx_handler_register() Anyway, we are speaking of control function, an extra check is simply safer, like all the ASSERT_RTNL() we do have...