Re: [RFC net-next] net: Build IPv6 into kernel by default
Hi,

David Miller wrote:
> From: Tom Herbert
> Date: Thu, 9 Jul 2015 13:42:29 -0700
>
>> This patch makes the default to build IPv6 into the kernel. IPv6
>> now has significant traction and any remaining vestiges of IPv6
>> not being provided parity with IPv4 should be swept away. IPv6 is now
>> core to the Internet and kernel.
>
> I guess I'm fine with this, just fix up the doc error Dave Jones
> pointed out.

I am deeply moved.

Acked-by: YOSHIFUJI Hideaki

--
Hideaki Yoshifuji
Technical Division, MIRACLE LINUX CORPORATION
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-next v2] ipv6: Do not iterate over all interfaces when finding source address on specific interface.
Hi,

Erik Kline wrote:
> Hmm, when I run a UML linux with this patch (which, I'm ashamed to
> say, I failed to do before) I get these kinds of errors:
>
> unregister_netdevice: waiting for to become free.
> Usage count = 1
> unregister_netdevice: waiting for to become free.
> Usage count = 1
>
> Perhaps they're unrelated... I'm still investigating.

Would you test attached patch please?

--yoshfuji

>
> On 11 July 2015 at 15:19, David Miller wrote:
>> From: YOSHIFUJI Hideaki/吉藤英明
>> Date: Fri, 10 Jul 2015 16:58:31 +0900
>>
>>> If outgoing interface is specified and the candidate address is
>>> restricted to the outgoing interface, it is enough to iterate
>>> over that given interface only.
>>>
>>> Signed-off-by: YOSHIFUJI Hideaki
>>> Acked-by: Erik Kline
>>
>> Applied, thanks!

--
Hideaki Yoshifuji
Technical Division, MIRACLE LINUX CORPORATION

From 38c5a10a5876ea47766ffc05b5a131a210d6e1aa Mon Sep 17 00:00:00 2001
From: YOSHIFUJI Hideaki
Date: Mon, 13 Jul 2015 15:23:02 +0900
Subject: [PATCH] ipv6: Avoid NULL pointer dereference in __ipv6_dev_get_saddr().

Commit 9131f3de2 ("ipv6: Do not iterate over all interfaces when
finding source address on specific interface.") introduced possible
NULL pointer dereference if outgoing device is specified.

Fixes: 9131f3de24db4dc12199aede7d931e6703e97f3b ("ipv6: Do not iterate over all interfaces when finding source address on specific interface.")
Signed-off-by: YOSHIFUJI Hideaki
---
 net/ipv6/addrconf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 4ab74d5..50ad476 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1480,7 +1480,8 @@ int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev,
 	}
 
 	if (use_oif_addr) {
-		__ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev, scores);
+		if (idev)
+			__ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev, scores);
 	} else {
 		for_each_netdev_rcu(net, dev) {
 			idev = __in6_dev_get(dev);
-- 
1.9.1
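The fix above comes down to one extra NULL check before the per-interface scan. Here is a minimal userspace sketch of that control flow, assuming simplified stand-in types. `struct idev_model`, `scan_one_interface()` and the `-1` "walk every device" sentinel are illustrative names of mine, not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inet6_dev: only what the sketch needs. */
struct idev_model {
	int naddrs;	/* number of candidate addresses on the interface */
};

/*
 * Stand-in for __ipv6_dev_get_saddr(): like the real helper, it
 * dereferences idev unconditionally, so the caller must not pass NULL.
 */
static int scan_one_interface(const struct idev_model *idev)
{
	return idev->naddrs;
}

/*
 * Model of the patched branch. Returns the number of candidates looked at
 * on the outgoing interface, 0 when that interface has no IPv6 state
 * (idev == NULL), or -1 (illustrative sentinel) when the caller should
 * fall back to walking every device.
 */
static int scan_outgoing_interface(int use_oif_addr,
				   const struct idev_model *idev)
{
	if (use_oif_addr) {
		if (idev)	/* the guard added by the patch */
			return scan_one_interface(idev);
		return 0;
	}
	return -1;
}
```

Without the inner `if (idev)`, the first call in a scenario like `scan_outgoing_interface(1, NULL)` would dereference a NULL pointer, which is the failure the commit message describes.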
[GIT] Networking
1) Missing list head init in bluetooth hidp session creation, from
   Tedd Ho-Jeong An.

2) Don't leak SKB in bridge netfilter error paths, from Florian
   Westphal.

3) ipv6 netdevice private leak in netfilter bridging, fixed by Julien
   Grall.

4) Fix regression in IP over hamradio bpq encapsulation, from Ralf
   Baechle.

5) Fix race between rhashtable resize events and table walks, from
   Phil Sutter.

6) Missing validation of IFLA_VF_INFO netlink attributes, fix from
   Daniel Borkmann.

7) Missing security layer socket state initialization in tipc code,
   from Stephen Smalley.

8) Fix shared IRQ handling in boomerang 3c59x interrupt handler, from
   Denys Vlasenko.

9) Missing minor_idr destroy on module unload on macvtap driver, from
   Johannes Thumshirn.

10) Various pktgen kernel thread races, from Oleg Nesterov.

11) Fix races that can cause packets to be processed in the backlog
    even after a device attached to that SKB has been fully
    unregistered. From Julian Anastasov.

12) bcmgenet driver doesn't account packet drops vs. errors properly,
    fix from Petri Gynther.

13) Array index validation and off by one fix in DSA layer from
    Florian Fainelli.

Please pull, thanks a lot!
The following changes since commit a611fb75d0517fce65f588cde94f80bb4052c6b2:

  Merge tag 'module-misc-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux (2015-07-02 11:07:27 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

for you to fetch changes up to cee9f6d0186a586c8023bc91c8a4cf8a088855e5:

  Merge tag 'linux-can-fixes-for-4.2-20150712' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can (2015-07-12 22:24:01 -0700)

Andrew Lunn (1):
      net: fec: Ensure clocks are enabled while using mdio bus

Andy Gospodarek (1):
      ipv4: add support for linkdown sysctl to netconf

Angga (1):
      ipv6: Make MLD packets to only be processed locally

Bernhard Thaler (1):
      netfilter: bridge: fix CONFIG_NF_DEFRAG_IPV4/6 related warnings/errors

Daniel Borkmann (1):
      rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver

Daniel Pieczko (3):
      sfc: refactor code in efx_ef10_set_mac_address()
      sfc: add legacy method for changing a PF's MAC address
      sfc: suppress handled MCDI failures when changing the MAC address

David S. Miller (7):
      Merge branch 'for-upstream' of git://git.kernel.org/.../bluetooth/bluetooth
      Merge branch 'sfc-set-mac'
      Merge git://git.kernel.org/.../pablo/nf
      Merge branch 'pktgen-races'
      Merge branch 'netdev_unregister_races'
      Merge branch 'dsa-of-parsing-fixes'
      Merge tag 'linux-can-fixes-for-4.2-20150712' of git://git.kernel.org/.../mkl/linux-can

Denys Vlasenko (1):
      3c59x: Fix shared IRQ handling

Enrico Mioso (2):
      cdc_ncm: Add support for moving NDP to end of NCM frame
      cdc_ncm: update specs URL

Eric Dumazet (3):
      net_sched: gen_estimator: extend pps limit
      net: graceful exit from netif_alloc_netdev_queues()
      bridge: fix potential crash in __netdev_pick_tx()

Eric W. Biederman (1):
      netfilter: nf_queue: Don't recompute the hook_list head

Florian Fainelli (2):
      net: dsa: Test array index before use
      net: dsa: Fix off-by-one in switch address parsing

Florian Westphal (2):
      netfilter: arptables: use percpu jumpstack
      netfilter: bridge: don't leak skb in error paths

Govindarajulu Varadarajan (1):
      enic: fix issues in enic_poll

Hariprasad Shenai (1):
      cxgb4: Fix incorrect sequence numbers shown in devlog

J.D. Schroeder (1):
      can: c_can: Fix default pinmux glitch at init

Johannes Thumshirn (1):
      macvtap: Destroy minor_idr on module_exit

Julian Anastasov (2):
      net: do not process device backlog during unregistration
      net: call rcu_read_lock early in process_backlog

Julien Grall (1):
      netfilter: bridge: Use __in6_dev_get rather than in6_dev_get in br_validate_ipv6

Krzysztof Kozlowski (1):
      net: axienet: Fix devm_ioremap_resource return value check

Lendacky, Thomas (1):
      amd-xgbe: Fix DMA API debug warning

Markus Elfring (3):
      net-ipv6: Delete an unnecessary check before the function call "free_percpu"
      net-RDS: Delete an unnecessary check before the function call "module_put"
      netlink: Delete an unnecessary check before the function call "module_put"

Masanari Iida (1):
      Doc: z8530book: Fix typo in API-z8530-sync-txdma-open.html

Mazhar Rana (1):
      bonding: "primary_reselect" with "failure" is not working properly

Mugunthan V N (2):
      drivers: net: cpsw: fix crash while accessing second slave ethernet interface
      drivers: net: cpsw: fix disabling of tx interrupt i
RE: [PATCH] bnx2x: Update to FW version 7.12.30
> > The new FW will allow us to utilize some new features in our driver,
> > mainly adding vlan filtering offload and vxlan offload support.
> >
> > In addition, this fixes several issues:
> > 1. Packets from a VF with pvid configured which were sent with a
> >    different vlan were transmitted instead of being discarded.
> >
> > 2. FCoE traffic might not recover after a failure while there's traffic
> >    to another function.
> >
> > Signed-off-by: Yuval Mintz
>
> Hi, any news about this one?
> Thanks, Yuval

Any updates? I sent this 3 weeks ago and haven't seen any reply.

Thanks,
Yuval
Re: pull-request: can 2015-07-12
From: Marc Kleine-Budde
Date: Sun, 12 Jul 2015 21:18:03 +0200

> this is a pull request of 8 patches for net/master.
>
> Sergei Shtylyov contributes 5 patches for the rcar_can driver, fixing the IRQ
> check and several info and error messages. There are two patches by J.D.
> Schroeder and Roger Quadros for the c_can driver and dra7x-evm device tree,
> which prevent a glitch in the DCAN1 pinmux. Oliver Hartkopp provides a better
> approach to make the CAN skbs unique, the timestamp is replaced by a counter.

Pulled, thanks Marc.
[RFC PATCH 0/2] net: macb: Add mdio driver for accessing multiple phy devices
This patch is to add support for the design that has multiple ethernet
mac controllers and a single mdio bus connected to multiple phy devices.
i.e. the mdio lines are connected to any one of the ethernet mac controllers
and all the phy devices will be accessed using the phy maintenance interface
in that mac controller.

 ______        _______
|      |      | PHY0  |
| MAC0 |------|       |
|______|   |  |_______|
           |
 ______    |   _______
|      |   |  |       |
| MAC1 |   |__| PHY1  |
|______|      |_______|

So, I came up with two implementations for addressing the above
configuration.

Implementation 1:
 Have a separate driver for the mdio bus
 Create a DT node for all the PHY devices connected to the mdio bus
 This driver will share the register space of the mac controller that
 has the mdio bus connected.

Implementation 2:
 Add a new property "has-mdio"; it should be 1 for the mac that has
 the mdio bus connected.
 Create the mdio bus only when the has-mdio property is 1

Please review the two implementations and suggest which one is better to
proceed further. In my opinion implementation 1 will be the ideal one.

Currently I have tested the patches with a single mac and single phy
configuration. I need to take care of a few more cases before releasing
the final patch, but before that I would like to have your opinion on the
above implementations and finalize one of them so that I can enhance it
further.

Punnaiah Choudary Kalluri (1):
  net: macb: Add mdio driver for accessing multiple phy devices
  net: macb: Add support for single mac managing more than one phy

 drivers/net/ethernet/cadence/Makefile    |   2 +-
 drivers/net/ethernet/cadence/macb.c      |  93 +-
 drivers/net/ethernet/cadence/macb.h      |   3 +-
 drivers/net/ethernet/cadence/macb_mdio.c | 204 ++
 4 files changed, 211 insertions(+), 91 deletions(-)
 create mode 100644 drivers/net/ethernet/cadence/macb_mdio.c

-- 
1.7.4
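The decision that implementation 2 hinges on can be written down as two small predicates. This is only a userspace sketch of how I read the RFC in patch 2/2 (`if (!bp->has_mdio && bp->mii_bus) goto mii_probe;` in macb_mii_init(), and the `if (bp->has_mdio)` guard in macb_remove()). The enum and function names are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: what a MAC does about the MDIO bus at init time. */
enum mdio_action {
	MDIO_CREATE_BUS,	/* allocate and register its own mii_bus */
	MDIO_REUSE_BUS		/* attach to the bus another MAC registered */
};

/*
 * has_mdio:       value of the proposed "has-mdio" DT property
 * bus_registered: whether a bus was already found for the PHY's parent
 *                 node (of_mdio_find_bus() in the RFC)
 */
static enum mdio_action mdio_action_for(bool has_mdio, bool bus_registered)
{
	if (!has_mdio && bus_registered)
		return MDIO_REUSE_BUS;
	return MDIO_CREATE_BUS;
}

/* In the RFC's remove path only a MAC with has-mdio set tears the bus down. */
static bool frees_bus_on_remove(bool has_mdio)
{
	return has_mdio;
}
```

One case may deserve a second look under these assumptions: a MAC without "has-mdio" that finds no registered bus falls through to creating one, yet the remove path would not free it.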
Re: Fighting out-of-order reception with RPS?
On Sun, 2015-07-12 at 21:15 +0200, Oliver Hartkopp wrote:

> E.g. with
>
> skb_set_hash(skb, dev->ifindex, PKT_HASH_TYPE_L2);
>
> and
>
> echo f > /sys/class/net/can0/queues/rx-0/rps_cpus
>
> I get properly ordered CAN frames - even with netif_rx() processed skbs. I
> just want to have this stuff to be enabled by default for CAN interfaces to
> kill the OOO frame issue.

I doubt your skb_set_hash() makes any difference.

RPS prefers a L4 hash anyway (skb_get_hash()), so flow dissection happens.
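To make the point about hash types concrete, here is a small userspace model of the two helpers as I understand them from include/linux/skbuff.h of that era. The struct, the enum and the `dissected_hash` field are stand-ins of mine, and the model ignores everything except the l4_hash/sw_hash flags, so treat it as a sketch and not as the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for enum pkt_hash_types. */
enum hash_type_model { HASH_TYPE_NONE, HASH_TYPE_L2, HASH_TYPE_L3, HASH_TYPE_L4 };

/* Stand-in for the hash-related fields of struct sk_buff. */
struct skb_model {
	uint32_t hash;
	unsigned int l4_hash : 1;
	unsigned int sw_hash : 1;
	/* Hypothetical: the value flow dissection would produce, 0 = nothing. */
	uint32_t dissected_hash;
};

/* Models skb_set_hash(): only an L4-typed hash sets the l4_hash flag. */
static void set_hash_model(struct skb_model *skb, uint32_t hash,
			   enum hash_type_model type)
{
	skb->l4_hash = (type == HASH_TYPE_L4);
	skb->sw_hash = 0;
	skb->hash = hash;
}

/*
 * Models skb_get_hash(): a stored hash is trusted only if it is an L4 hash
 * or was computed in software; otherwise flow dissection runs. In this
 * model a dissection result of 0 leaves the stored value untouched.
 */
static uint32_t get_hash_model(struct skb_model *skb)
{
	if (!skb->l4_hash && !skb->sw_hash && skb->dissected_hash != 0) {
		skb->hash = skb->dissected_hash;
		skb->sw_hash = 1;
	}
	return skb->hash;
}
```

In this model a hash stored with the L2 type does not stop the dissection attempt, while one stored with the L4 type is returned as is.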
[RFC PATCH 2/2] net: macb: Add support for single mac managing more than one phy
Added support for single mac managing more than one phy

Signed-off-by: Punnaiah Choudary Kalluri
---
 drivers/net/ethernet/cadence/macb.c | 25 -
 drivers/net/ethernet/cadence/macb.h |  4 +++-
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 4833ba1..6d36b76 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -171,6 +171,7 @@ static int macb_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 	struct macb *bp = bus->priv;
 	int value;
 
+	spin_lock(&bp->mdio_lock);
 	macb_writel(bp, MAN, (MACB_BF(SOF, MACB_MAN_SOF)
 			      | MACB_BF(RW, MACB_MAN_READ)
 			      | MACB_BF(PHYA, mii_id)
@@ -182,6 +183,7 @@ static int macb_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 		cpu_relax();
 
 	value = MACB_BFEXT(DATA, macb_readl(bp, MAN));
+	spin_unlock(&bp->mdio_lock);
 
 	return value;
 }
@@ -191,6 +193,7 @@ static int macb_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 {
 	struct macb *bp = bus->priv;
 
+	spin_lock(&bp->mdio_lock);
 	macb_writel(bp, MAN, (MACB_BF(SOF, MACB_MAN_SOF)
 			      | MACB_BF(RW, MACB_MAN_WRITE)
 			      | MACB_BF(PHYA, mii_id)
@@ -201,6 +204,7 @@ static int macb_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 	/* wait for end of transfer */
 	while (!MACB_BFEXT(IDLE, macb_readl(bp, NSR)))
 		cpu_relax();
+	spin_unlock(&bp->mdio_lock);
 
 	return 0;
 }
@@ -320,7 +324,7 @@ static int macb_mii_probe(struct net_device *dev)
 	int phy_irq;
 	int ret;
 
-	phydev = phy_find_first(bp->mii_bus);
+	phydev = of_phy_find_device(bp->phy_node);
 	if (!phydev) {
 		netdev_err(dev, "no PHY found\n");
 		return -ENXIO;
@@ -365,8 +369,14 @@ int macb_mii_init(struct macb *bp)
 	struct device_node *np;
 	int err = -ENXIO, i;
 
+	bp->phy_node = of_parse_phandle(bp->pdev->dev.of_node,
+					"phy-handle", 0);
+	np = of_get_parent(bp->phy_node);
 	/* Enable management port */
 	macb_writel(bp, NCR, MACB_BIT(MPE));
+	bp->mii_bus = of_mdio_find_bus(np);
+	if (!bp->has_mdio && bp->mii_bus)
+		goto mii_probe;
 
 	bp->mii_bus = mdiobus_alloc();
 	if (bp->mii_bus == NULL) {
@@ -425,6 +435,7 @@ int macb_mii_init(struct macb *bp)
 	if (err)
 		goto err_out_free_mdio_irq;
 
+mii_probe:
 	err = macb_mii_probe(bp->dev);
 	if (err)
 		goto err_out_unregister_bus;
@@ -2356,6 +2367,7 @@ static int macb_probe(struct platform_device *pdev)
 	bp->isjumbo = of_property_read_bool(pdev->dev.of_node,
 					    "jumbo-supported");
 	spin_lock_init(&bp->lock);
+	spin_lock_init(&bp->mdio_lock);
 
 	/* set the queue register mapping once for all: queue0 has a special
 	 * register mapping but we don't want to test the queue index then
@@ -2479,7 +2491,8 @@ static int macb_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "Cannot register net device, aborting.\n");
 		goto err_out_free_netdev;
 	}
-
+	err = of_property_read_u32(bp->pdev->dev.of_node, "has-mdio",
+				   &bp->has_mdio);
 	err = macb_mii_init(bp);
 	if (err)
 		goto err_out_unregister_netdev;
@@ -2524,9 +2537,11 @@ static int macb_remove(struct platform_device *pdev)
 		bp = netdev_priv(dev);
 		if (bp->phy_dev)
 			phy_disconnect(bp->phy_dev);
-		mdiobus_unregister(bp->mii_bus);
-		kfree(bp->mii_bus->irq);
-		mdiobus_free(bp->mii_bus);
+		if (bp->has_mdio) {
+			mdiobus_unregister(bp->mii_bus);
+			kfree(bp->mii_bus->irq);
+			mdiobus_free(bp->mii_bus);
+		}
 		unregister_netdev(dev);
 		if (!IS_ERR(bp->tx_clk))
 			clk_disable_unprepare(bp->tx_clk);
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index f0aa177..0f99f2a 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -825,7 +825,9 @@ struct macb {
 	unsigned int		rx_frm_len_mask;
 	unsigned int		jumbo_max_len;
 	bool			isjumbo;
-
+	unsigned int has_mdio;
+	spinlock_t mdio_lock;
+	struct device_node *phy_node;
 	u64 ethtool_stats[GEM_STATS_LEN];
 };
 
-- 
1.7.4
[RFC PATCH 1/2] net: macb: Add mdio driver for accessing multiple phy devices
This patch is to add support for the design that has multiple ethernet
mac controllers and single mdio bus connected to multiple phy devices.
i.e. mdio lines are connected to any of the ethernet mac controller and
all the phy devices will be accessed using the phy maintenance interface
in that mac controller.

Signed-off-by: Punnaiah Choudary Kalluri
---
 drivers/net/ethernet/cadence/Makefile    |   2 +-
 drivers/net/ethernet/cadence/macb.c      |  93 +-
 drivers/net/ethernet/cadence/macb.h      |   3 +-
 drivers/net/ethernet/cadence/macb_mdio.c | 204 ++
 4 files changed, 211 insertions(+), 91 deletions(-)
 create mode 100644 drivers/net/ethernet/cadence/macb_mdio.c

diff --git a/drivers/net/ethernet/cadence/Makefile b/drivers/net/ethernet/cadence/Makefile
index 9068b83..73504f4 100644
--- a/drivers/net/ethernet/cadence/Makefile
+++ b/drivers/net/ethernet/cadence/Makefile
@@ -3,4 +3,4 @@
 #
 
 obj-$(CONFIG_ARM_AT91_ETHER) += at91_ether.o
-obj-$(CONFIG_MACB) += macb.o
+obj-$(CONFIG_MACB) += macb.o macb_mdio.o
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 4833ba1..df1b928 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -320,7 +320,7 @@ static int macb_mii_probe(struct net_device *dev)
 	int phy_irq;
 	int ret;
 
-	phydev = phy_find_first(bp->mii_bus);
+	phydev = of_phy_find_device(bp->phy_node);
 	if (!phydev) {
 		netdev_err(dev, "no PHY found\n");
 		return -ENXIO;
@@ -359,89 +359,6 @@ static int macb_mii_probe(struct net_device *dev)
 	return 0;
 }
 
-int macb_mii_init(struct macb *bp)
-{
-	struct macb_platform_data *pdata;
-	struct device_node *np;
-	int err = -ENXIO, i;
-
-	/* Enable management port */
-	macb_writel(bp, NCR, MACB_BIT(MPE));
-
-	bp->mii_bus = mdiobus_alloc();
-	if (bp->mii_bus == NULL) {
-		err = -ENOMEM;
-		goto err_out;
-	}
-
-	bp->mii_bus->name = "MACB_mii_bus";
-	bp->mii_bus->read = &macb_mdio_read;
-	bp->mii_bus->write = &macb_mdio_write;
-	snprintf(bp->mii_bus->id, MII_BUS_ID_SIZE, "%s-%x",
-		bp->pdev->name, bp->pdev->id);
-	bp->mii_bus->priv = bp;
-	bp->mii_bus->parent = &bp->dev->dev;
-	pdata = dev_get_platdata(&bp->pdev->dev);
-
-	bp->mii_bus->irq = kmalloc(sizeof(int)*PHY_MAX_ADDR, GFP_KERNEL);
-	if (!bp->mii_bus->irq) {
-		err = -ENOMEM;
-		goto err_out_free_mdiobus;
-	}
-
-	dev_set_drvdata(&bp->dev->dev, bp->mii_bus);
-
-	np = bp->pdev->dev.of_node;
-	if (np) {
-		/* try dt phy registration */
-		err = of_mdiobus_register(bp->mii_bus, np);
-
-		/* fallback to standard phy registration if no phy were
-		   found during dt phy registration */
-		if (!err && !phy_find_first(bp->mii_bus)) {
-			for (i = 0; i < PHY_MAX_ADDR; i++) {
-				struct phy_device *phydev;
-
-				phydev = mdiobus_scan(bp->mii_bus, i);
-				if (IS_ERR(phydev)) {
-					err = PTR_ERR(phydev);
-					break;
-				}
-			}
-
-			if (err)
-				goto err_out_unregister_bus;
-		}
-	} else {
-		for (i = 0; i < PHY_MAX_ADDR; i++)
-			bp->mii_bus->irq[i] = PHY_POLL;
-
-		if (pdata)
-			bp->mii_bus->phy_mask = pdata->phy_mask;
-
-		err = mdiobus_register(bp->mii_bus);
-	}
-
-	if (err)
-		goto err_out_free_mdio_irq;
-
-	err = macb_mii_probe(bp->dev);
-	if (err)
-		goto err_out_unregister_bus;
-
-	return 0;
-
-err_out_unregister_bus:
-	mdiobus_unregister(bp->mii_bus);
-err_out_free_mdio_irq:
-	kfree(bp->mii_bus->irq);
-err_out_free_mdiobus:
-	mdiobus_free(bp->mii_bus);
-err_out:
-	return err;
-}
-EXPORT_SYMBOL_GPL(macb_mii_init);
-
 static void macb_update_stats(struct macb *bp)
 {
 	u32 __iomem *reg = bp->regs + MACB_PFR;
@@ -2480,7 +2397,10 @@ static int macb_probe(struct platform_device *pdev)
 		goto err_out_free_netdev;
 	}
 
-	err = macb_mii_init(bp);
+	bp->phy_node = of_parse_phandle(bp->pdev->dev.of_node,
+					"phy-handle", 0);
+
+	err = macb_mii_probe(bp->dev);
 	if (err)
 		goto err_out_unregister_netdev;
 
@@ -2524,9 +2444,6 @@ static int macb_remove(struct platform_device *pdev)
 		bp = netdev_priv(dev);
 		if (bp->phy_dev)
 			phy_disconnect(bp->phy_dev);
-		mdiobus_unr
RE: [RFC PATCH] net: macb: Add mdio driver for accessing multiple phy devices
Please ignore this patch series.

Regards,
Punnaiah

> -----Original Message-----
> From: Punnaiah Choudary Kalluri
> [mailto:punnaiah.choudary.kall...@xilinx.com]
> Sent: Monday, July 13, 2015 9:06 AM
> To: nicolas.fe...@atmel.com; Michal Simek; Anirudha Sarangi;
> da...@davemloft.net
> Cc: Harini Katakam; kpc...@gmail.com;
> kalluripunnaiahchoud...@gmail.com; netdev@vger.kernel.org; Punnaiah
> Choudary Kalluri
> Subject: [RFC PATCH] net: macb: Add mdio driver for accessing multiple phy
> devices
>
> This patch is to add spoort for the design that has multiple ethernet
> mac controllers and single mdio bus connected to multiple phy devices.
> i.e mdio lines are connected to any of the ethernet mac controller and
> all the phy devices will be accessed using the phy maintainance interface
> in that mac controller.
>
> Signed-off-by: Punnaiah Choudary Kalluri
> ---
>  drivers/net/ethernet/cadence/Makefile    |   2 +-
>  drivers/net/ethernet/cadence/macb.c      |  93 +-
>  drivers/net/ethernet/cadence/macb.h      |   3 +-
>  drivers/net/ethernet/cadence/macb_mdio.c | 204 ++
>  4 files changed, 211 insertions(+), 91 deletions(-)
>  create mode 100644 drivers/net/ethernet/cadence/macb_mdio.c
[RFC PATCH 4/4] vhost: Add cgroup-aware creation of worker threads
With the help of the cgroup function to compare groups introduced
in the previous patch, this changes worker creation policy.
If the new device belongs to different cgroups than any of the
devices we are currently serving, we end up creating a new worker
thread even if we haven't reached the devs_per_worker threshold.

Signed-off-by: Bandan Das
---
 drivers/vhost/vhost.c | 47 +++
 1 file changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 6a5d4c0..dc0fa37 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -261,12 +261,6 @@ static int vhost_worker(void *data)
 				use_mm(dev->mm);
 			}
 
-			/* TODO: Consider a more elegant solution */
-			if (worker->owner != dev->owner) {
-				/* Should check for return value */
-				cgroup_attach_task_all(dev->owner, current);
-				worker->owner = dev->owner;
-			}
 			work->fn(work);
 			if (need_resched())
 				schedule();
@@ -278,6 +272,36 @@ static int vhost_worker(void *data)
 	return 0;
 }
 
+struct vhost_attach_cgroups_struct {
+	struct vhost_work work;
+	struct task_struct *owner;
+	int ret;
+};
+
+static void vhost_attach_cgroups_work(struct vhost_work *work)
+{
+	struct vhost_attach_cgroups_struct *s;
+
+	s = container_of(work, struct vhost_attach_cgroups_struct, work);
+	s->ret = cgroup_attach_task_all(s->owner, current);
+}
+
+static void vhost_attach_cgroups(struct vhost_dev *dev,
+				 struct vhost_worker *worker)
+{
+	struct vhost_attach_cgroups_struct attach;
+
+	attach.owner = dev->owner;
+	vhost_work_init(dev, &attach.work, vhost_attach_cgroups_work);
+	vhost_work_queue(worker, &attach.work);
+	vhost_work_flush(worker, &attach.work);
+
+	if (!attach.ret)
+		worker->owner = dev->owner;
+
+	dev->err = attach.ret;
+}
+
 static void vhost_create_worker(struct vhost_dev *dev)
 {
 	struct vhost_worker *worker;
@@ -300,8 +324,14 @@ static void vhost_create_worker(struct vhost_dev *dev)
 
 	spin_lock_init(&worker->work_lock);
 	INIT_LIST_HEAD(&worker->work_list);
+
+	/* attach to the cgroups of the process that created us */
+	vhost_attach_cgroups(dev, worker);
+	if (dev->err)
+		goto therror;
+	worker->owner = dev->owner;
+
 	list_add(&worker->node, &pool->workers);
-	worker->owner = NULL;
 	worker->num_devices++;
 	total_vhost_workers++;
 	dev->worker = worker;
@@ -320,7 +350,8 @@ static int vhost_dev_assign_worker(struct vhost_dev *dev)
 
 	mutex_lock(&vhost_pool->pool_lock);
 	list_for_each_entry(worker, &vhost_pool->workers, node) {
-		if (worker->num_devices < devs_per_worker) {
+		if (worker->num_devices < devs_per_worker &&
+		    (!cgroup_match_groups(dev->owner, worker->owner))) {
 			dev->worker = worker;
 			dev->worker_assigned = true;
 			worker->num_devices++;
-- 
2.4.3
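The assignment policy in the last hunk can be read as: reuse the first worker that is below the device threshold and whose owner sits in the same cgroups as the new device, otherwise create a new worker. Below is a rough userspace sketch of that rule. `struct worker_model`, `pick_worker()` and the integer `owner_group` id (standing in for the cgroup_match_groups() comparison) are illustrative assumptions, not code from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, flattened view of a vhost worker for this sketch. */
struct worker_model {
	int num_devices;	/* devices currently served by this worker */
	int owner_group;	/* stand-in id for the owner's cgroup set */
};

/*
 * Returns the index of the first worker that can take the device, or -1
 * when a new worker has to be created. A worker qualifies only if it is
 * below devs_per_worker AND its owner's group matches the device's group,
 * mirroring the condition added in vhost_dev_assign_worker().
 */
static int pick_worker(const struct worker_model *workers, size_t n,
		       int devs_per_worker, int dev_group)
{
	for (size_t i = 0; i < n; i++) {
		if (workers[i].num_devices < devs_per_worker &&
		    workers[i].owner_group == dev_group)
			return (int)i;
	}
	return -1;
}
```

With a threshold of 7 (the default mentioned in patch 2/4), a full worker is skipped even when the groups match, and a device from a group no worker serves yet always gets a new worker.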
[RFC PATCH 3/4] cgroup: Introduce a function to compare cgroups
This function takes two tasks and iterates through all
hierarchies to check if they belong to the same cgroups.
It ignores the check for default hierarchies or for
hierarchies with no subsystems attached.

This function will be used by the next patch to add
rudimentary cgroup support with vhost workers.

Signed-off-by: Bandan Das
---
 include/linux/cgroup.h |  1 +
 kernel/cgroup.c        | 40
 2 files changed, 41 insertions(+)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b9cb94c..606fb5b 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -933,6 +933,7 @@ void css_task_iter_start(struct cgroup_subsys_state *css,
 struct task_struct *css_task_iter_next(struct css_task_iter *it);
 void css_task_iter_end(struct css_task_iter *it);
 
+int cgroup_match_groups(struct task_struct *tsk1, struct task_struct *tsk2);
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
 int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from);
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 469dd54..ba4121e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2465,6 +2465,46 @@ out_unlock_cgroup:
 }
 
 /**
+ * cgroup_match_groups - check if tsk1 and tsk2 belong to
+ * same cgroups in all hierarchies
+ * Returns 0 on success
+ */
+int cgroup_match_groups(struct task_struct *tsk1, struct task_struct *tsk2)
+{
+	struct cgroup_root *root;
+	int retval = 0;
+
+	WARN_ON(!tsk1 || !tsk2);
+
+	mutex_lock(&cgroup_mutex);
+	for_each_root(root) {
+		struct cgroup *cg_tsk1;
+		struct cgroup *cg_tsk2;
+
+		/* Default hierarchy */
+		if (root == &cgrp_dfl_root)
+			continue;
+		/* No subsystems attached */
+		if (!root->subsys_mask)
+			continue;
+
+		down_read(&css_set_rwsem);
+		cg_tsk1 = task_cgroup_from_root(tsk1, root);
+		cg_tsk2 = task_cgroup_from_root(tsk2, root);
+		up_read(&css_set_rwsem);
+
+		if (cg_tsk1 != cg_tsk2) {
+			retval = 1;
+			break;
+		}
+	}
+	mutex_unlock(&cgroup_mutex);
+
+	return retval;
+}
+EXPORT_SYMBOL_GPL(cgroup_match_groups);
+
+/**
  * cgroup_attach_task_all - attach task 'tsk' to all cgroups of task 'from'
  * @from: attach to all cgroups of a given task
  * @tsk: the task to be attached
-- 
2.4.3
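Since "Returns 0 on success" is easy to misread, here is a small userspace model of the comparison and its return convention: 0 when the two tasks sit in the same cgroup in every hierarchy that is checked, 1 on the first mismatch. `struct hierarchy_model` and its fields are assumptions made for the sketch. The kernel function walks `struct cgroup_root` under `cgroup_mutex` instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-hierarchy record used only by this sketch. */
struct hierarchy_model {
	bool is_default;		/* stands in for root == &cgrp_dfl_root */
	unsigned int subsys_mask;	/* 0 means no subsystems attached */
	int cgroup_of_task1;		/* id of task 1's cgroup in this hierarchy */
	int cgroup_of_task2;		/* id of task 2's cgroup in this hierarchy */
};

/*
 * Returns 0 when both tasks are in the same cgroup in every hierarchy that
 * is checked, and 1 as soon as one checked hierarchy differs. The default
 * hierarchy and hierarchies without subsystems are skipped, as in the patch.
 */
static int match_groups_model(const struct hierarchy_model *h, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (h[i].is_default || h[i].subsys_mask == 0)
			continue;
		if (h[i].cgroup_of_task1 != h[i].cgroup_of_task2)
			return 1;
	}
	return 0;
}
```

A caller therefore tests for a match with `!match_groups_model(...)`, the same way patch 4/4 uses `!cgroup_match_groups(dev->owner, worker->owner)`.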
[RFC PATCH 0/4] Shared vhost design
Hello,

There have been discussions on improving the current vhost design. The first
attempt, to my knowledge, was Shirley Ma's patch to create a dedicated vhost
worker per cgroup.

http://comments.gmane.org/gmane.linux.network/224730

Later, I posted a cmwq based approach for performance comparisons

http://comments.gmane.org/gmane.linux.network/286858

More recently was the Elvis work that was presented in KVM Forum 2013

http://www.linux-kvm.org/images/a/a3/Kvm-forum-2013-elvis.pdf

The Elvis patches rely on a common vhost thread design for scalability
along with polling for performance. Since there are two major changes
being proposed, we decided to split up the work. The first (this RFC)
proposes a re-design of the vhost threading model, and the second part
(not posted yet) will focus more on improving performance.

I am posting this with the hope that we can have a meaningful
discussion on the proposed new architecture. We have run some tests to show
that the new design is scalable and, in terms of performance, is comparable
to the current stable design.

Test Setup:
The testing is based on the setup described in the Elvis proposal.
The initial tests are just an aggregate of Netperf STREAM and MAERTS but
as we progress, I am happy to run more tests. The hosts are two identical
16 core Haswell systems with point to point network links. For the first 10
runs, with n=1 up to n=10 guests running in parallel, I booted the target
system with nr_cpus=8 and mem=12G. The purpose was to compare resource
utilization and how it affects performance. Finally, with the number of
guests set at 14, I didn't limit the number of CPUs booted on the host or
limit the memory seen by the kernel, but booted the kernel with
isolcpus=14,15; those two CPUs are used to run the vhost threads. The guests
are pinned to cpus 0-13 and, based on which cpu the guest is running on, the
corresponding I/O thread is either pinned to cpu 14 or 15.
Results
# X axis is number of guests
# Y axis is netperf number
# nr_cpus=8 and mem=12G
#Number of Guests    #Baseline    #ELVIS
1                    1119.3       .0
2                    1135.6       1130.2
3                    1135.5       1131.6
4                    1136.0       1127.1
5                    1118.6       1129.3
6                    1123.4       1129.8
7                    1128.7       1135.4
8                    1129.9       1137.5
9                    1130.6       1135.1
10                   1129.3       1138.9
14*                  1173.8       1216.9

#* Last run with the vCPU and I/O thread(s) pinned, no CPU/memory limit imposed.
#  I/O thread runs on CPU 14 or 15 depending on which guest it's serving

There's a simple graph at
http://people.redhat.com/~bdas/elvis/data/results.png
that shows how task affinity results in a jump and even without it,
as the number of guests increases, the shared vhost design performs
slightly better.

Observations:
1. In terms of "stock" performance, the results are comparable.
2. However, with a tuned setup, even without polling, we see an improvement
   with the new design.
3. Making the new design simulate old behavior would be a matter of setting
   the number of guests per vhost thread to 1.
4. Maybe setting a per guest limit on the work being done by a specific
   vhost thread is needed for it to be fair.
5. cgroup associations need to be figured out. I just slightly hacked the
   current cgroup association mechanism to work with the new model. Ccing
   cgroups for input/comments.

Many thanks to Razya Ladelsky and Eyal Moscovici, IBM for the initial
patches, the helpful testing suggestions and discussions.
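To put "slightly better" into numbers, the relative difference between the two columns can be computed per row. The helper below is only a sketch for reading the table (the function name is mine). Using the values above, the pinned 14-guest run comes out at roughly +3.7%, the 10-guest run at roughly +0.85%, and the 4-guest run is slightly negative.

```c
#include <assert.h>

/*
 * Relative change of the ELVIS column over the baseline column, in percent.
 * Positive means the shared vhost design scored higher in that row.
 */
static double relative_gain_percent(double baseline, double elvis)
{
	assert(baseline > 0.0);
	return (elvis - baseline) / baseline * 100.0;
}
```

For example, `relative_gain_percent(1173.8, 1216.9)` evaluates the 14-guest row.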
Bandan Das (4):
  vhost: Introduce a universal thread to serve all users
  vhost: Limit the number of devices served by a single worker thread
  cgroup: Introduce a function to compare cgroups
  vhost: Add cgroup-aware creation of worker threads

 drivers/vhost/net.c    |   6 +-
 drivers/vhost/scsi.c   |  18 ++--
 drivers/vhost/vhost.c  | 272 +++--
 drivers/vhost/vhost.h  |  32 +-
 include/linux/cgroup.h |   1 +
 kernel/cgroup.c        |  40
 6 files changed, 275 insertions(+), 94 deletions(-)

-- 
2.4.3
[RFC PATCH 2/4] vhost: Limit the number of devices served by a single worker thread
When the number of devices increases, the universal thread model
(introduced in the preceding patch) may end up being the bottleneck.
Moreover, a single worker thread also forces us to change cgroups
based on the device we are serving.

We introduce a worker pool struct that starts with one worker
thread and we keep adding more threads when the number of devs
reaches a certain threshold. The default value is set at 7 but is
not based on any empirical data. The value can also be changed by
the user with the devs_per_worker module parameter.

Note that this patch doesn't change how cgroups work. We still
keep moving around the worker thread to the cgroups of the
device we are serving at the moment.

Signed-off-by: Razya Ladelsky
Signed-off-by: Bandan Das
---
 drivers/vhost/net.c   |   6 +--
 drivers/vhost/scsi.c  |   3 +-
 drivers/vhost/vhost.c | 135 +-
 drivers/vhost/vhost.h |  13 -
 4 files changed, 128 insertions(+), 29 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 7d137a4..7bfa019 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -705,7 +705,8 @@ static int vhost_net_open(struct inode *inode, struct file *f)
 		n->vqs[i].vhost_hlen = 0;
 		n->vqs[i].sock_hlen = 0;
 	}
-	vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
+	if (vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX))
+		return dev->err;
 
 	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
 	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
@@ -801,9 +802,6 @@ static int vhost_net_release(struct inode *inode, struct file *f)
 		sockfd_put(rx_sock);
 	/* Make sure no callbacks are outstanding */
 	synchronize_rcu_bh();
-	/* We do an extra flush before freeing memory,
-	 * since jobs can re-queue themselves. */
-	vhost_net_flush(n);
 	kfree(n->dev.vqs);
 	kvfree(n);
 	return 0;
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 6c42936..97de2db 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1601,7 +1601,8 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 		vqs[i] = &vs->vqs[i].vq;
 		vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
 	}
-	vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ);
+	if (vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ))
+		return vs->dev.err;
 
 	vhost_scsi_init_inflight(vs, NULL);
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 951c96b..6a5d4c0 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -27,11 +27,19 @@
 #include
 #include
 #include
+#include
 
 #include "vhost.h"
 
-/* Just one worker thread to service all devices */
-static struct vhost_worker *worker;
+static int __read_mostly devs_per_worker = 7;
+module_param(devs_per_worker, int, S_IRUGO);
+MODULE_PARM_DESC(devs_per_worker, "Setup the number of devices being served by a worker thread");
+
+/* Only used to give a unique id to a vhost thread at the moment */
+static unsigned int total_vhost_workers;
+
+/* Pool of vhost threads */
+static struct vhost_pool *vhost_pool;
 
 enum {
 	VHOST_MEMORY_MAX_NREGIONS = 64,
@@ -270,6 +278,63 @@ static int vhost_worker(void *data)
 	return 0;
 }
 
+static void vhost_create_worker(struct vhost_dev *dev)
+{
+	struct vhost_worker *worker;
+	struct vhost_pool *pool = vhost_pool;
+
+	worker = kzalloc(sizeof(*worker), GFP_KERNEL);
+	if (!worker) {
+		dev->err = -ENOMEM;
+		return;
+	}
+
+	worker->thread = kthread_create(vhost_worker,
+					worker,
+					"vhost-%d",
+					total_vhost_workers);
+	if (IS_ERR(worker->thread)) {
+		dev->err = PTR_ERR(worker->thread);
+		goto therror;
+	}
+
+	spin_lock_init(&worker->work_lock);
+	INIT_LIST_HEAD(&worker->work_list);
+	list_add(&worker->node, &pool->workers);
+	worker->owner = NULL;
+	worker->num_devices++;
+	total_vhost_workers++;
+	dev->worker = worker;
+	dev->worker_assigned = true;
+	return;
+
+therror:
+	if (worker->thread)
+		kthread_stop(worker->thread);
+	kfree(worker);
+}
+
+static int vhost_dev_assign_worker(struct vhost_dev *dev)
+{
+	struct vhost_worker *worker;
+
+	mutex_lock(&vhost_pool->pool_lock);
+	list_for_each_entry(worker, &vhost_pool->workers, node) {
+		if (worker->num_devices < devs_per_worker) {
+			dev->worker = worker;
+			dev->worker_assigned = true;
+			worker->num_devices++;
+			break;
+		}
+	}
+	if (!dev->worker_assigned)
+
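
The assignment policy described in the commit message above can be sketched in plain userspace C. This is only an illustration, not the patch's code: struct pool, struct worker, MAX_WORKERS and assign_worker() are hypothetical stand-ins, and locking and kthread creation are left out. A new device goes to the first worker that serves fewer than devs_per_worker devices; a new worker is added only when all existing ones are full.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS 16		/* hypothetical pool capacity for this sketch */

struct worker {			/* stand-in for struct vhost_worker */
	int num_devices;
};

struct pool {			/* stand-in for struct vhost_pool */
	struct worker workers[MAX_WORKERS];
	size_t nr_workers;
};

/*
 * Return the index of the worker that will serve a new device, or -1 if
 * the pool cannot grow any further.  An existing worker is reused while it
 * serves fewer than devs_per_worker devices; otherwise a new one is added.
 */
int assign_worker(struct pool *p, int devs_per_worker)
{
	size_t i;

	for (i = 0; i < p->nr_workers; i++) {
		if (p->workers[i].num_devices < devs_per_worker) {
			p->workers[i].num_devices++;
			return (int)i;
		}
	}
	if (p->nr_workers == MAX_WORKERS)
		return -1;
	p->workers[p->nr_workers].num_devices = 1;
	return (int)p->nr_workers++;
}
```

With a limit of 1 every device ends up with its own worker, which is roughly how the old per-device behaviour could be approximated.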
[RFC PATCH] net: macb: Add mdio driver for accessing multiple phy devices
This patch is to add support for the design that has multiple ethernet
mac controllers and single mdio bus connected to multiple phy devices.
i.e mdio lines are connected to any of the ethernet mac controller and
all the phy devices will be accessed using the phy maintenance interface
in that mac controller.

Signed-off-by: Punnaiah Choudary Kalluri
---
 drivers/net/ethernet/cadence/Makefile    |    2 +-
 drivers/net/ethernet/cadence/macb.c      |   93 +-
 drivers/net/ethernet/cadence/macb.h      |    3 +-
 drivers/net/ethernet/cadence/macb_mdio.c |  204 ++
 4 files changed, 211 insertions(+), 91 deletions(-)
 create mode 100644 drivers/net/ethernet/cadence/macb_mdio.c

diff --git a/drivers/net/ethernet/cadence/Makefile b/drivers/net/ethernet/cadence/Makefile
index 9068b83..73504f4 100644
--- a/drivers/net/ethernet/cadence/Makefile
+++ b/drivers/net/ethernet/cadence/Makefile
@@ -3,4 +3,4 @@
 #
 
 obj-$(CONFIG_ARM_AT91_ETHER) += at91_ether.o
-obj-$(CONFIG_MACB) += macb.o
+obj-$(CONFIG_MACB) += macb.o macb_mdio.o
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 4833ba1..df1b928 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -320,7 +320,7 @@ static int macb_mii_probe(struct net_device *dev)
 	int phy_irq;
 	int ret;
 
-	phydev = phy_find_first(bp->mii_bus);
+	phydev = of_phy_find_device(bp->phy_node);
 	if (!phydev) {
 		netdev_err(dev, "no PHY found\n");
 		return -ENXIO;
@@ -359,89 +359,6 @@ static int macb_mii_probe(struct net_device *dev)
 	return 0;
 }
 
-int macb_mii_init(struct macb *bp)
-{
-	struct macb_platform_data *pdata;
-	struct device_node *np;
-	int err = -ENXIO, i;
-
-	/* Enable management port */
-	macb_writel(bp, NCR, MACB_BIT(MPE));
-
-	bp->mii_bus = mdiobus_alloc();
-	if (bp->mii_bus == NULL) {
-		err = -ENOMEM;
-		goto err_out;
-	}
-
-	bp->mii_bus->name = "MACB_mii_bus";
-	bp->mii_bus->read = &macb_mdio_read;
-	bp->mii_bus->write = &macb_mdio_write;
-	snprintf(bp->mii_bus->id, MII_BUS_ID_SIZE, "%s-%x",
-		 bp->pdev->name, bp->pdev->id);
-	bp->mii_bus->priv = bp;
-	bp->mii_bus->parent = &bp->dev->dev;
-	pdata = dev_get_platdata(&bp->pdev->dev);
-
-	bp->mii_bus->irq = kmalloc(sizeof(int)*PHY_MAX_ADDR, GFP_KERNEL);
-	if (!bp->mii_bus->irq) {
-		err = -ENOMEM;
-		goto err_out_free_mdiobus;
-	}
-
-	dev_set_drvdata(&bp->dev->dev, bp->mii_bus);
-
-	np = bp->pdev->dev.of_node;
-	if (np) {
-		/* try dt phy registration */
-		err = of_mdiobus_register(bp->mii_bus, np);
-
-		/* fallback to standard phy registration if no phy were
-		   found during dt phy registration */
-		if (!err && !phy_find_first(bp->mii_bus)) {
-			for (i = 0; i < PHY_MAX_ADDR; i++) {
-				struct phy_device *phydev;
-
-				phydev = mdiobus_scan(bp->mii_bus, i);
-				if (IS_ERR(phydev)) {
-					err = PTR_ERR(phydev);
-					break;
-				}
-			}
-
-			if (err)
-				goto err_out_unregister_bus;
-		}
-	} else {
-		for (i = 0; i < PHY_MAX_ADDR; i++)
-			bp->mii_bus->irq[i] = PHY_POLL;
-
-		if (pdata)
-			bp->mii_bus->phy_mask = pdata->phy_mask;
-
-		err = mdiobus_register(bp->mii_bus);
-	}
-
-	if (err)
-		goto err_out_free_mdio_irq;
-
-	err = macb_mii_probe(bp->dev);
-	if (err)
-		goto err_out_unregister_bus;
-
-	return 0;
-
-err_out_unregister_bus:
-	mdiobus_unregister(bp->mii_bus);
-err_out_free_mdio_irq:
-	kfree(bp->mii_bus->irq);
-err_out_free_mdiobus:
-	mdiobus_free(bp->mii_bus);
-err_out:
-	return err;
-}
-EXPORT_SYMBOL_GPL(macb_mii_init);
-
 static void macb_update_stats(struct macb *bp)
 {
 	u32 __iomem *reg = bp->regs + MACB_PFR;
@@ -2480,7 +2397,10 @@ static int macb_probe(struct platform_device *pdev)
 		goto err_out_free_netdev;
 	}
 
-	err = macb_mii_init(bp);
+	bp->phy_node = of_parse_phandle(bp->pdev->dev.of_node,
+					"phy-handle", 0);
+
+	err = macb_mii_probe(bp->dev);
 	if (err)
 		goto err_out_unregister_netdev;
 
@@ -2524,9 +2444,6 @@ static int macb_remove(struct platform_device *pdev)
 	bp = netdev_priv(dev);
 	if (bp->phy_dev)
 		phy_disconnect(bp->phy_dev);
-	mdiobus_unr
[RFC PATCH 1/4] vhost: Introduce a universal thread to serve all users
vhost threads are per-device, but in most cases a single thread
is enough. This change creates a single thread that is used to
serve all guests.

However, this complicates cgroups associations. The current policy
is to attach the per-device thread to all cgroups of the parent process
that the device is associated with. This is no longer possible if we
have a single thread. So, we end up moving the thread around to
cgroups of whichever device needs servicing. This is a very
inefficient protocol but seems to be the only way to integrate
cgroups support.

Signed-off-by: Razya Ladelsky
Signed-off-by: Bandan Das
---
 drivers/vhost/scsi.c  |  15 +++--
 drivers/vhost/vhost.c | 150 --
 drivers/vhost/vhost.h |  19 +--
 3 files changed, 97 insertions(+), 87 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index ea32b38..6c42936 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -535,7 +535,7 @@ static void vhost_scsi_complete_cmd(struct vhost_scsi_cmd *cmd)
 
 	llist_add(&cmd->tvc_completion_list, &vs->vs_completion_list);
 
-	vhost_work_queue(&vs->dev, &vs->vs_completion_work);
+	vhost_work_queue(vs->dev.worker, &vs->vs_completion_work);
 }
 
 static int vhost_scsi_queue_data_in(struct se_cmd *se_cmd)
@@ -1282,7 +1282,7 @@ vhost_scsi_send_evt(struct vhost_scsi *vs,
 	}
 
 	llist_add(&evt->list, &vs->vs_event_list);
-	vhost_work_queue(&vs->dev, &vs->vs_event_work);
+	vhost_work_queue(vs->dev.worker, &vs->vs_event_work);
 }
 
 static void vhost_scsi_evt_handle_kick(struct vhost_work *work)
@@ -1335,8 +1335,8 @@ static void vhost_scsi_flush(struct vhost_scsi *vs)
 	/* Flush both the vhost poll and vhost work */
 	for (i = 0; i < VHOST_SCSI_MAX_VQ; i++)
 		vhost_scsi_flush_vq(vs, i);
-	vhost_work_flush(&vs->dev, &vs->vs_completion_work);
-	vhost_work_flush(&vs->dev, &vs->vs_event_work);
+	vhost_work_flush(vs->dev.worker, &vs->vs_completion_work);
+	vhost_work_flush(vs->dev.worker, &vs->vs_event_work);
 
 	/* Wait for all reqs issued before the flush to be finished */
 	for (i = 0; i < VHOST_SCSI_MAX_VQ; i++)
@@ -1584,8 +1584,11 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 	if (!vqs)
 		goto err_vqs;
 
-	vhost_work_init(&vs->vs_completion_work, vhost_scsi_complete_cmd_work);
-	vhost_work_init(&vs->vs_event_work, vhost_scsi_evt_work);
+	vhost_work_init(&vs->dev, &vs->vs_completion_work,
+			vhost_scsi_complete_cmd_work);
+
+	vhost_work_init(&vs->dev, &vs->vs_event_work,
+			vhost_scsi_evt_work);
 
 	vs->vs_events_nr = 0;
 	vs->vs_events_missed = false;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2ee2826..951c96b 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -11,6 +11,8 @@
  * Generic code for virtio server in host kernel.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
@@ -28,6 +30,9 @@
 
 #include "vhost.h"
 
+/* Just one worker thread to service all devices */
+static struct vhost_worker *worker;
+
 enum {
 	VHOST_MEMORY_MAX_NREGIONS = 64,
 	VHOST_MEMORY_F_LOG = 0x1,
@@ -58,13 +63,15 @@ static int vhost_poll_wakeup(wait_queue_t *wait, unsigned mode, int sync,
 	return 0;
 }
 
-void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn)
+void vhost_work_init(struct vhost_dev *dev,
+		     struct vhost_work *work, vhost_work_fn_t fn)
 {
 	INIT_LIST_HEAD(&work->node);
 	work->fn = fn;
 	init_waitqueue_head(&work->done);
 	work->flushing = 0;
 	work->queue_seq = work->done_seq = 0;
+	work->dev = dev;
 }
 EXPORT_SYMBOL_GPL(vhost_work_init);
 
@@ -78,7 +85,7 @@ void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
 	poll->dev = dev;
 	poll->wqh = NULL;
 
-	vhost_work_init(&poll->work, fn);
+	vhost_work_init(dev, &poll->work, fn);
 }
 EXPORT_SYMBOL_GPL(vhost_poll_init);
 
@@ -116,30 +123,30 @@ void vhost_poll_stop(struct vhost_poll *poll)
 }
 EXPORT_SYMBOL_GPL(vhost_poll_stop);
 
-static bool vhost_work_seq_done(struct vhost_dev *dev, struct vhost_work *work,
-				unsigned seq)
+static bool vhost_work_seq_done(struct vhost_worker *worker,
+				struct vhost_work *work, unsigned seq)
 {
 	int left;
 
-	spin_lock_irq(&dev->work_lock);
+	spin_lock_irq(&worker->work_lock);
 	left = seq - work->done_seq;
-	spin_unlock_irq(&dev->work_lock);
+	spin_unlock_irq(&worker->work_lock);
 	return left <= 0;
 }
 
-void vhost_work_flush(struct vhost_dev *dev, struct vhost_work *work)
+void vhost_work_flush(struct vhost_worker *worker, struct vhost_work *work)
 {
 	unsigned seq;
 	int flushing;
 
-	spin_lock_irq(&dev->work_lock);
+
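
The "left = seq - work->done_seq; return left <= 0;" test in vhost_work_seq_done() keeps working when the sequence counters wrap around. A small standalone sketch of that idiom follows. seq_done() is a hypothetical name used only here, locking is omitted, and the conversion of the unsigned difference to int assumes the usual two's complement result (ISO C leaves it implementation-defined).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true once done_seq has reached or passed seq.
 *
 * The subtraction is performed on unsigned values, so it wraps modulo
 * 2^N instead of overflowing.  Reading the difference as a signed number
 * gives the distance between the two counters, which stays meaningful as
 * long as they are less than half the counter range apart.
 */
bool seq_done(unsigned int seq, unsigned int done_seq)
{
	int left = (int)(seq - done_seq);

	return left <= 0;
}
```

A plain "done_seq >= seq" comparison would give the wrong answer right after done_seq wraps past zero, which is the case the signed difference handles.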
[RFC PATCH 2/2] net: macb: Add support for single mac managing more than one phy
Added support for single mac managing more than one phy

Signed-off-by: Punnaiah Choudary Kalluri
---
 drivers/net/ethernet/cadence/macb.c |   25 -
 drivers/net/ethernet/cadence/macb.h |    4 +++-
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 4833ba1..6d36b76 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -171,6 +171,7 @@ static int macb_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 	struct macb *bp = bus->priv;
 	int value;
 
+	spin_lock(&bp->mdio_lock);
 	macb_writel(bp, MAN, (MACB_BF(SOF, MACB_MAN_SOF)
 			      | MACB_BF(RW, MACB_MAN_READ)
 			      | MACB_BF(PHYA, mii_id)
@@ -182,6 +183,7 @@ static int macb_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 		cpu_relax();
 
 	value = MACB_BFEXT(DATA, macb_readl(bp, MAN));
+	spin_unlock(&bp->mdio_lock);
 
 	return value;
 }
@@ -191,6 +193,7 @@ static int macb_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 {
 	struct macb *bp = bus->priv;
 
+	spin_lock(&bp->mdio_lock);
 	macb_writel(bp, MAN, (MACB_BF(SOF, MACB_MAN_SOF)
 			      | MACB_BF(RW, MACB_MAN_WRITE)
 			      | MACB_BF(PHYA, mii_id)
@@ -201,6 +204,7 @@ static int macb_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 	/* wait for end of transfer */
 	while (!MACB_BFEXT(IDLE, macb_readl(bp, NSR)))
 		cpu_relax();
+	spin_unlock(&bp->mdio_lock);
 
 	return 0;
 }
@@ -320,7 +324,7 @@ static int macb_mii_probe(struct net_device *dev)
 	int phy_irq;
 	int ret;
 
-	phydev = phy_find_first(bp->mii_bus);
+	phydev = of_phy_find_device(bp->phy_node);
 	if (!phydev) {
 		netdev_err(dev, "no PHY found\n");
 		return -ENXIO;
@@ -365,8 +369,14 @@ int macb_mii_init(struct macb *bp)
 	struct device_node *np;
 	int err = -ENXIO, i;
 
+	bp->phy_node = of_parse_phandle(bp->pdev->dev.of_node,
+					"phy-handle", 0);
+	np = of_get_parent(bp->phy_node);
 	/* Enable management port */
 	macb_writel(bp, NCR, MACB_BIT(MPE));
+	bp->mii_bus = of_mdio_find_bus(np);
+	if (!bp->has_mdio && bp->mii_bus)
+		goto mii_probe;
 
 	bp->mii_bus = mdiobus_alloc();
 	if (bp->mii_bus == NULL) {
@@ -425,6 +435,7 @@ int macb_mii_init(struct macb *bp)
 	if (err)
 		goto err_out_free_mdio_irq;
 
+mii_probe:
 	err = macb_mii_probe(bp->dev);
 	if (err)
 		goto err_out_unregister_bus;
@@ -2356,6 +2367,7 @@ static int macb_probe(struct platform_device *pdev)
 	bp->isjumbo = of_property_read_bool(pdev->dev.of_node,
 					    "jumbo-supported");
 	spin_lock_init(&bp->lock);
+	spin_lock_init(&bp->mdio_lock);
 
 	/* set the queue register mapping once for all: queue0 has a special
 	 * register mapping but we don't want to test the queue index then
@@ -2479,7 +2491,8 @@ static int macb_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "Cannot register net device, aborting.\n");
 		goto err_out_free_netdev;
 	}
-
+	err = of_property_read_u32(bp->pdev->dev.of_node, "has-mdio",
+				   &bp->has_mdio);
 	err = macb_mii_init(bp);
 	if (err)
 		goto err_out_unregister_netdev;
@@ -2524,9 +2537,11 @@ static int macb_remove(struct platform_device *pdev)
 	bp = netdev_priv(dev);
 	if (bp->phy_dev)
 		phy_disconnect(bp->phy_dev);
-	mdiobus_unregister(bp->mii_bus);
-	kfree(bp->mii_bus->irq);
-	mdiobus_free(bp->mii_bus);
+	if (bp->has_mdio) {
+		mdiobus_unregister(bp->mii_bus);
+		kfree(bp->mii_bus->irq);
+		mdiobus_free(bp->mii_bus);
+	}
 	unregister_netdev(dev);
 	if (!IS_ERR(bp->tx_clk))
 		clk_disable_unprepare(bp->tx_clk);
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index f0aa177..0f99f2a 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -825,7 +825,9 @@ struct macb {
 	unsigned int		rx_frm_len_mask;
 	unsigned int		jumbo_max_len;
 	bool			isjumbo;
-
+	unsigned int has_mdio;
+	spinlock_t mdio_lock;
+	struct device_node *phy_node;
 	u64			ethtool_stats[GEM_STATS_LEN];
 };
 
-- 
1.7.4
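
For readers unfamiliar with why one MDIO bus can reach several PHYs: every management frame carries a 5-bit PHY address, which is what the PHYA field written by macb_mdio_read() and macb_mdio_write() selects. The sketch below builds such a frame word. The bit positions are assumed from the IEEE 802.3 Clause 22 frame layout rather than taken from macb.h, and mdio_c22_frame() and the MDIO_OP_* names are hypothetical, not the driver's MACB_BF() macros.

```c
#include <assert.h>
#include <stdint.h>

/* Clause 22 opcodes (assumed values: 01 = write, 10 = read). */
enum { MDIO_OP_WRITE = 1, MDIO_OP_READ = 2 };

/*
 * Compose a 32-bit Clause 22 management frame:
 *   [31:30] start of frame (01)   [29:28] opcode
 *   [27:23] PHY address           [22:18] register address
 *   [17:16] turnaround (10)       [15:0]  data
 */
uint32_t mdio_c22_frame(unsigned int op, unsigned int phy, unsigned int reg,
			uint16_t data)
{
	return (UINT32_C(1) << 30)
	     | ((uint32_t)(op & 0x3) << 28)
	     | ((uint32_t)(phy & 0x1f) << 23)	/* selects one PHY on the bus */
	     | ((uint32_t)(reg & 0x1f) << 18)
	     | (UINT32_C(2) << 16)
	     | data;
}
```

Because the frame register and the idle polling are shared by every PHY on the bus, accesses from several MACs have to be serialized, which is the purpose of the mdio_lock added in this patch.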
[RFC PATCH 0/2] net: macb: Add mdio driver for accessing multiple phy devices
This patch is to add support for the design that has multiple ethernet
mac controllers and single mdio bus connected to multiple phy devices.
i.e mdio lines are connected to any of the ethernet mac controller and
all the phy devices will be accessed using the phy maintenance interface
in that mac controller.

 ________            _______
|        |          | PHY0  |
|  MAC0  |----------|       |
|________|     |    |_______|
               |
 ________      |     _______
|        |     |    |       |
|  MAC1  |     |____| PHY1  |
|________|          |_______|

So, I came up with two implementations for addressing the above configuration.

Implementation 1:
 Have separate driver for mdio bus
 Create a DT node for all the PHY devices connected to the mdio bus
 This driver will share the register space of the mac controller that has
 mdio bus connected.

Implementation 2:
 Add new property "has-mdio" and it should be 1 for the mac that has mdio bus
 connected.
 Create the mdio bus only when the has-mdio property is 1

Please review the two implementations and suggest which one is better to
proceed further. In my opinion implementation 1 will be the ideal one.

Currently I have tested the patches with single mac and single phy
configuration. I need to take care of a few more cases before releasing the
final patch, but before that I would like to have your opinion on the above
implementations and finalize one of them, so that I can enhance it further.

Punnaiah Choudary Kalluri (1):
  net: macb: Add mdio driver for accessing multiple phy devices
  net: macb: Add support for single mac managing more than one phy

 drivers/net/ethernet/cadence/Makefile    |    2 +-
 drivers/net/ethernet/cadence/macb.c      |   93 +-
 drivers/net/ethernet/cadence/macb.h      |    3 +-
 drivers/net/ethernet/cadence/macb_mdio.c |  204 ++
 4 files changed, 211 insertions(+), 91 deletions(-)
 create mode 100644 drivers/net/ethernet/cadence/macb_mdio.c

-- 
1.7.4
Re: [PATCH v2] net: dsa: mv88e6xxx: add write access to debugfs regs file
From: Vivien Didelot
Date: Sun, 12 Jul 2015 21:39:30 -0400 (EDT)

> I hardly see how this debug interface can be made generic to other
> DSA drivers, since the format of hardware tables or some registers
> seem very specific to the switch chip.

You feel this way because you are focusing on register values and not
what those values represent.

Ie. could you export the values in those registers in a generic format
that other devices could convert their register values to as well?

Stop focusing so tightly on the exact thing you've implemented and
consider things on a much higher level.
Re: Fighting out-of-order reception with RPS?
From: Oliver Hartkopp
Date: Sun, 12 Jul 2015 21:15:36 +0200

> Just some remarks about CAN and CAN frames as you suggest GRO which is
> completely pointless for CAN.

GRO may be pointless for CAN, but NAPI _definitely_ is useful for
every single network device, period.

So you should do NAPI for reasons outside of packet receive ordering,
and in return you'll have your packet ordering problem solved as well.

I really am stumped as to why you are avoiding NAPI so vehemently.
Re: [PATCH v2] net: dsa: mv88e6xxx: add write access to debugfs regs file
Hi David,

On Jul 11, 2015, at 10:08 PM, David da...@davemloft.net wrote:

> From: Vivien Didelot
> Date: Sat, 11 Jul 2015 14:36:12 -0400 (EDT)
>
>> In the meantime, this is really useful for development. i.e. ensuring a good
>> switchdev/DSA interaction without being able to read and write directly the
>> hardware VLAN table, is a bit a PITA. A dynamic debugfs looked appropriate.
>
> For "development" you can hack the driver, add tracepoints, or use
> another mechanism anyone hacking the kernel (which by definition
> someone doing "development" is doing) can do.
>
> I do not buy any of your arguments, and you really miss the grand
> opportunity to export the knobs and values in a way which are going
> to:
>
> 1) Be useful to users
>
> 2) Be usable by any similar DSA driver, not just _yours_

I hardly see how this debug interface can be made generic to other DSA drivers,
since the format of hardware tables or some registers seem very specific to the
switch chip.

> So please stop this myopic narrow thinking when you add facilities for
> development or export values. Think of the big picture and long term,
> not just your personal perceived immediate needs of today.

I understand. So it looks like the only reasonable solution here is to revert
this support for the debugfs interface.

Thanks,
-v
Re: [PATCH v2] add stealth mode
2015-07-08 15:32 GMT+02:00 Austin S Hemmelgarn :
> On 2015-07-06 15:44, Matteo Croce wrote:
> Just to name a few that I know of off the top of my head:
> 1. IP packets with any protocol number not supported by your current kernel
> (these return a special ICMP message).

Right, I'll handle them

> 2. SCTP INIT and COOKIE_ECHO chunks when you have SCTP enabled in the
> kernel.

Well, I've never played with SCTP before

> 3. Theoretically, some IGMP messages.
> 4. NDP messages.
> 5. ARP queries looking for the machine's IP addresses.

Yes I know, but it's unlikely to receive these packets from WAN, right?
My flag is intended to be used mostly on WAN interfaces;
machines in LAN should be easily discoverable IMHO

> 6. Certain odd flag combinations on single TCP packets (check the
> documentation for Nmap for more info regarding these), which I believe
> (although I may be reading the code wrong) you aren't accounting for.

I've tried many TCP flag combinations with hping3: NUL, SYN/ACK, ACK,
SYN/FIN, etc. They don't get any response when the flag is set

> 7. DAD queries.

Never looked at these packets; are they a subset of NDP?

> 8. ICMP address mask queries (which you also don't appear to account for).

It's deprecated and actually it doesn't get any response already

> This is by no means an exhaustive list, but all of them really should be
> addressed if you want to do this properly.
>
>

Thank you,

-- 
Matteo Croce
OpenWrt Developer
[PATCH net] rtnetlink: reject non-IFLA_VF_PORT attributes inside IFLA_VF_PORTS
Similarly as in commit 4f7d2cdfdde7 ("rtnetlink: verify IFLA_VF_INFO
attributes before passing them to driver"), we have a double nesting
of netlink attributes, i.e. IFLA_VF_PORTS only contains IFLA_VF_PORT
that is nested itself. While IFLA_VF_PORTS is a verified attribute
from ifla_policy[], we only check if the IFLA_VF_PORTS container has
IFLA_VF_PORT attributes and then pass the attribute's content itself
via nla_parse_nested(). It would be more correct to reject inner types
other than IFLA_VF_PORT instead of continuing parsing and also similarly
as in commit 4f7d2cdfdde7, to check for a minimum of NLA_HDRLEN.

Signed-off-by: Daniel Borkmann
Cc: Roopa Prabhu
Cc: Scott Feldman
Cc: Jason Gunthorpe
---
 ( This was still a follow-up I found while working on 4f7d2cdfdde7,
   could also go to net-next, at your preference. )

 net/core/rtnetlink.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9e433d5..dc004b1 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1804,10 +1804,13 @@ static int do_setlink(const struct sk_buff *skb,
 			goto errout;
 
 		nla_for_each_nested(attr, tb[IFLA_VF_PORTS], rem) {
-			if (nla_type(attr) != IFLA_VF_PORT)
-				continue;
-			err = nla_parse_nested(port, IFLA_PORT_MAX,
-				attr, ifla_port_policy);
+			if (nla_type(attr) != IFLA_VF_PORT ||
+			    nla_len(attr) < NLA_HDRLEN) {
+				err = -EINVAL;
+				goto errout;
+			}
+			err = nla_parse_nested(port, IFLA_PORT_MAX, attr,
+					       ifla_port_policy);
 			if (err < 0)
 				goto errout;
 			if (!port[IFLA_PORT_VF]) {
-- 
1.9.3
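
The stricter container check can be illustrated outside the kernel with a minimal sketch. It does not use the kernel's netlink helpers: struct attr_hdr, ATTR_HDRLEN, ATTR_ALIGN() and check_nested() are hypothetical names that only mirror the length/type layout of a netlink attribute. Every entry of the container must carry the expected type and a payload large enough to hold at least one nested attribute header; anything else makes the whole request fail instead of being skipped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATTR_HDRLEN	((size_t)4)			/* same size as NLA_HDRLEN */
#define ATTR_ALIGN(n)	(((n) + 3) & ~(size_t)3)	/* pad to 4 bytes */

struct attr_hdr {	/* mirrors the len/type layout of struct nlattr */
	uint16_t len;	/* header plus payload, excluding padding */
	uint16_t type;
};

/*
 * Walk the attributes packed in buf[0..size) and accept the container only
 * if each entry has type want_type and a payload of at least ATTR_HDRLEN
 * bytes.  Returns the number of entries, or -EINVAL on the first offender.
 */
int check_nested(const unsigned char *buf, size_t size, uint16_t want_type)
{
	size_t off = 0;
	int count = 0;

	while (size - off >= ATTR_HDRLEN) {
		struct attr_hdr h;
		size_t len, step;

		memcpy(&h, buf + off, sizeof(h));
		len = h.len;
		if (len < ATTR_HDRLEN || len > size - off)
			return -EINVAL;	/* malformed entry */
		if (h.type != want_type || len - ATTR_HDRLEN < ATTR_HDRLEN)
			return -EINVAL;	/* stray type or payload too short */
		count++;
		step = ATTR_ALIGN(len);
		off = (step > size - off) ? size : off + step;
	}
	return count;
}
```

Rejecting instead of skipping means a malformed request is reported to the sender rather than partially applied.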
[PATCH] net: mvneta: fix refilling for Rx DMA buffers
On some Armada 370-based NAS, I have experienced kernel bugs and crashes
when the mvneta Ethernet driver fails to refill Rx DMA buffers due to
memory shortage.

With the current code, if the memory allocation fails while refilling a Rx
DMA buffer, then the corresponding descriptor is left with the address of
an unmapped DMA buffer already passed to the network stack. Since the
driver still increments the non-occupied counter for Rx descriptor (if a
next packet is handled successfully), then the Ethernet controller is
allowed to reuse the unfilled Rx descriptor...

As a fix, this patch first refills a Rx descriptor before handling the
stored data and unmapping the associated Rx DMA buffer. Additionally the
occupied and non-occupied counters for the Rx descriptor queues are now
both updated with the rx_done value: the number of descriptors ready to be
returned to the networking controller. Moreover note that there is no
point in using different values for these counters because both the mvneta
driver and the Ethernet controller are unable to handle holes in the Rx
descriptor queues.

Signed-off-by: Simon Guinot
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: # v3.8+
Tested-by: Yoann Sculo
---
 drivers/net/ethernet/marvell/mvneta.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index ce5f7f9cff06..ac3da11e63a0 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1455,7 +1455,7 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,
 		     struct mvneta_rx_queue *rxq)
 {
 	struct net_device *dev = pp->dev;
-	int rx_done, rx_filled;
+	int rx_done;
 	u32 rcvd_pkts = 0;
 	u32 rcvd_bytes = 0;
 
@@ -1466,7 +1466,6 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,
 		rx_todo = rx_done;
 
 	rx_done = 0;
-	rx_filled = 0;
 
 	/* Fairness NAPI loop */
 	while (rx_done < rx_todo) {
@@ -1477,7 +1476,6 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,
 		int rx_bytes, err;
 
 		rx_done++;
-		rx_filled++;
 		rx_status = rx_desc->status;
 		rx_bytes = rx_desc->data_size - (ETH_FCS_LEN + MVNETA_MH_SIZE);
 		data = (unsigned char *)rx_desc->buf_cookie;
@@ -1517,6 +1515,14 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,
 			continue;
 		}
 
+		/* Refill processing */
+		err = mvneta_rx_refill(pp, rx_desc);
+		if (err) {
+			netdev_err(dev, "Linux processing - Can't refill\n");
+			rxq->missed++;
+			goto err_drop_frame;
+		}
+
 		skb = build_skb(data, pp->frag_size > PAGE_SIZE ? 0 : pp->frag_size);
 		if (!skb)
 			goto err_drop_frame;
@@ -1536,14 +1542,6 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,
 		mvneta_rx_csum(pp, rx_status, skb);
 
 		napi_gro_receive(&pp->napi, skb);
-
-		/* Refill processing */
-		err = mvneta_rx_refill(pp, rx_desc);
-		if (err) {
-			netdev_err(dev, "Linux processing - Can't refill\n");
-			rxq->missed++;
-			rx_filled--;
-		}
 	}
 
 	if (rcvd_pkts) {
@@ -1556,7 +1554,7 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,
 	}
 
 	/* Update rxq management counters */
-	mvneta_rxq_desc_num_update(pp, rxq, rx_done, rx_filled);
+	mvneta_rxq_desc_num_update(pp, rxq, rx_done, rx_done);
 
 	return rx_done;
 }
-- 
2.1.4
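
The ordering the fix relies on (obtain the replacement buffer first, and only then hand the received buffer to the stack) can be shown with a much-simplified userspace sketch. None of this is driver code: struct rx_desc, rx_one(), alloc_ok() and alloc_fail() are hypothetical names, and DMA mapping is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rx_desc {		/* hypothetical, much-simplified descriptor */
	void *buf;		/* buffer currently owned by the hardware */
};

/* Allocation hook, so that an out-of-memory condition can be simulated. */
typedef void *(*alloc_fn)(void);

static char spare[64];		/* pretend replacement buffer */

void *alloc_ok(void)   { return spare; }
void *alloc_fail(void) { return NULL; }

/*
 * Process one received descriptor.  The replacement buffer is allocated
 * first.  On success the filled buffer is passed up via *to_stack and the
 * descriptor gets the fresh one.  On failure the frame is dropped and the
 * descriptor keeps its old buffer, so it never points at memory that has
 * already been given away.
 */
bool rx_one(struct rx_desc *d, alloc_fn alloc, void **to_stack)
{
	void *fresh = alloc();

	if (!fresh) {
		*to_stack = NULL;	/* drop the frame, recycle d->buf */
		return false;
	}
	*to_stack = d->buf;
	d->buf = fresh;
	return true;
}
```

Since every processed descriptor is valid again in both cases, a single count of processed descriptors is enough when returning them to the hardware, which matches passing rx_done for both counters in the patch.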
[PATCH 5/8] can: rcar_can: unify error messages
From: Sergei Shtylyov

All the error messages in the driver but the ones from devm_clk_get() failures
use a similar format. Make those two messages consistent with the others.

Signed-off-by: Sergei Shtylyov
Signed-off-by: Marc Kleine-Budde
---
 drivers/net/can/rcar_can.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/rcar_can.c b/drivers/net/can/rcar_can.c
index 5e81afffc073..7bd54191f962 100644
--- a/drivers/net/can/rcar_can.c
+++ b/drivers/net/can/rcar_can.c
@@ -785,7 +785,8 @@ static int rcar_can_probe(struct platform_device *pdev)
 	priv->clk = devm_clk_get(&pdev->dev, "clkp1");
 	if (IS_ERR(priv->clk)) {
 		err = PTR_ERR(priv->clk);
-		dev_err(&pdev->dev, "cannot get peripheral clock: %d\n", err);
+		dev_err(&pdev->dev, "cannot get peripheral clock, error %d\n",
+			err);
 		goto fail_clk;
 	}
 
@@ -797,7 +798,7 @@ static int rcar_can_probe(struct platform_device *pdev)
 	priv->can_clk = devm_clk_get(&pdev->dev, clock_names[clock_select]);
 	if (IS_ERR(priv->can_clk)) {
 		err = PTR_ERR(priv->can_clk);
-		dev_err(&pdev->dev, "cannot get CAN clock: %d\n", err);
+		dev_err(&pdev->dev, "cannot get CAN clock, error %d\n", err);
 		goto fail_clk;
 	}
 
-- 
2.1.4
[PATCH 3/8] can: rcar_can: fix typo in error message
From: Sergei Shtylyov

Fix typo in the first error message printed by rcar_can_open().

Based on the original patch by Vladimir Barinov.

Fixes: 862e2b6af941 ("can: rcar_can: support all input clocks")
Reported-by: Vladimir Barinov
Signed-off-by: Sergei Shtylyov
Signed-off-by: Marc Kleine-Budde
---
 drivers/net/can/rcar_can.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/rcar_can.c b/drivers/net/can/rcar_can.c
index 2f9ebad4ff56..310a0cd20679 100644
--- a/drivers/net/can/rcar_can.c
+++ b/drivers/net/can/rcar_can.c
@@ -508,7 +508,8 @@ static int rcar_can_open(struct net_device *ndev)
 
 	err = clk_prepare_enable(priv->clk);
 	if (err) {
-		netdev_err(ndev, "failed to enable periperal clock, error %d\n",
+		netdev_err(ndev,
+			   "failed to enable peripheral clock, error %d\n",
 			   err);
 		goto out;
 	}
-- 
2.1.4
[PATCH 2/8] can: rcar_can: print signed IRQ #
From: Sergei Shtylyov

Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
'ndev->irq' is of type *int*, so the "%d" format needs to be used instead.

While fixing this, beautify the dev_info() message in rcar_can_probe() a bit.

Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
Signed-off-by: Sergei Shtylyov
Signed-off-by: Marc Kleine-Budde
---
 drivers/net/can/rcar_can.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/rcar_can.c b/drivers/net/can/rcar_can.c
index 93017c09cfc3..2f9ebad4ff56 100644
--- a/drivers/net/can/rcar_can.c
+++ b/drivers/net/can/rcar_can.c
@@ -526,7 +526,7 @@ static int rcar_can_open(struct net_device *ndev)
 	napi_enable(&priv->napi);
 	err = request_irq(ndev->irq, rcar_can_interrupt, 0, ndev->name, ndev);
 	if (err) {
-		netdev_err(ndev, "error requesting interrupt %x\n", ndev->irq);
+		netdev_err(ndev, "error requesting interrupt %d\n", ndev->irq);
 		goto out_close;
 	}
 	can_led_event(ndev, CAN_LED_EVENT_OPEN);
@@ -824,7 +824,7 @@ static int rcar_can_probe(struct platform_device *pdev)
 
 	devm_can_led_init(ndev);
 
-	dev_info(&pdev->dev, "device registered (reg_base=%p, irq=%u)\n",
+	dev_info(&pdev->dev, "device registered (regs @ %p, IRQ%d)\n",
 		 priv->regs, ndev->irq);
 
 	return 0;
-- 
2.1.4
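A small user-space sketch of why the conversion specifier matters for a signed value. This is not driver code: snprintf() stands in for netdev_err(), the helper names are made up for illustration, and the hexadecimal result assumes a 32-bit int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format an IRQ number as the fixed message does: as a signed int. */
static int format_irq_signed(char *buf, size_t len, int irq)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%d", irq);
}

/*
 * Format it as the old message did. "%x" expects an unsigned int, so
 * the cast makes the reinterpretation of a negative value explicit.
 */
static int format_irq_hex(char *buf, size_t len, int irq)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%x", (unsigned int)irq);
}
```

With a negative input such as -6, the signed variant yields a readable "-6", while the hexadecimal variant yields a large unsigned pattern that hides the fact that the value is an error code.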
[PATCH 4/8] can: rcar_can: print request_irq() error code
From: Sergei Shtylyov

Also print the error code when the request_irq() call fails in rcar_can_open(),
rewording the error message...

Signed-off-by: Sergei Shtylyov
Signed-off-by: Marc Kleine-Budde
---
 drivers/net/can/rcar_can.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/rcar_can.c b/drivers/net/can/rcar_can.c
index 310a0cd20679..5e81afffc073 100644
--- a/drivers/net/can/rcar_can.c
+++ b/drivers/net/can/rcar_can.c
@@ -527,7 +527,8 @@ static int rcar_can_open(struct net_device *ndev)
 	napi_enable(&priv->napi);
 	err = request_irq(ndev->irq, rcar_can_interrupt, 0, ndev->name, ndev);
 	if (err) {
-		netdev_err(ndev, "error requesting interrupt %d\n", ndev->irq);
+		netdev_err(ndev, "request_irq(%d) failed, error %d\n",
+			   ndev->irq, err);
 		goto out_close;
 	}
 	can_led_event(ndev, CAN_LED_EVENT_OPEN);
-- 
2.1.4
[PATCH 6/8] can: c_can: Fix default pinmux glitch at init
From: "J.D. Schroeder"

The previous change 3973c526ae9c (net: can: c_can: Disable pins when CAN
interface is down) causes a slight glitch on the pinctrl settings when used.
Since commit ab78029 (drivers/pinctrl: grab default handles from device core),
the device core will automatically set the default pins. This causes the pins
to be momentarily set to the default and then to the sleep state in
register_c_can_dev(). By adding an optional "enable" state, boards can set the
default pin state to be disabled and avoid the glitch when the switch from
default to sleep first occurs. If the "enable" state is not available
c_can_pinctrl_select_state() falls back to using the "default" pinctrl state.

[Roger Q] - Forward port to v4.2 and use pinctrl_get_select().

Signed-off-by: J.D. Schroeder
Signed-off-by: Roger Quadros
Reviewed-by: Grygorii Strashko
Cc: linux-stable
Signed-off-by: Marc Kleine-Budde
---
 drivers/net/can/c_can/c_can.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index 041525d2595c..5d214d135332 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -592,6 +592,7 @@ static int c_can_start(struct net_device *dev)
 {
 	struct c_can_priv *priv = netdev_priv(dev);
 	int err;
+	struct pinctrl *p;
 
 	/* basic c_can configuration */
 	err = c_can_chip_config(dev);
@@ -604,8 +605,13 @@ static int c_can_start(struct net_device *dev)
 
 	priv->can.state = CAN_STATE_ERROR_ACTIVE;
 
-	/* activate pins */
-	pinctrl_pm_select_default_state(dev->dev.parent);
+	/* Attempt to use "active" if available else use "default" */
+	p = pinctrl_get_select(priv->device, "active");
+	if (!IS_ERR(p))
+		pinctrl_put(p);
+	else
+		pinctrl_pm_select_default_state(priv->device);
+
 	return 0;
 }
 
-- 
2.1.4
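The fallback in c_can_start() can be modelled in user space roughly as follows. This is only a sketch: struct pin_states and both helpers are hypothetical stand-ins for a board's pinctrl-names list, not pinctrl API, and the state name "active" is taken from the diff (the commit message calls it "enable").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the pinctrl-names a board provides. */
struct pin_states {
	const char *const *names;
	size_t count;
};

/* Return 1 if the board defines a state called 'name', else 0. */
static int has_state(const struct pin_states *ps, const char *name)
{
	size_t i;

	assert(ps != NULL && name != NULL);
	for (i = 0; i < ps->count; i++)
		if (strcmp(ps->names[i], name) == 0)
			return 1;
	return 0;
}

/*
 * Pick the state to apply when the interface starts: prefer "active",
 * fall back to "default" for boards that do not define it.
 */
static const char *state_on_start(const struct pin_states *ps)
{
	return has_state(ps, "active") ? "active" : "default";
}
```

A board that still lists only "default" and "sleep" keeps its old behaviour, while a board that adds "active" can point "default" at the disabled pin configuration.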
[PATCH 1/8] can: rcar_can: fix IRQ check
From: Sergei Shtylyov

rcar_can_probe() regards 0 as a wrong IRQ #, even though platform_get_irq(),
which it calls, returns a negative error code in that case. This leads to the
following being printed to the console when attempting to open the device:

error requesting interrupt fffffffa

because rcar_can_open() calls request_irq() with a negative IRQ #, and that
function naturally fails with -EINVAL.

Check for the negative error codes instead and propagate them upstream instead
of just returning -ENODEV.

Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
Signed-off-by: Sergei Shtylyov
Cc: linux-stable
Signed-off-by: Marc Kleine-Budde
---
 drivers/net/can/rcar_can.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/rcar_can.c b/drivers/net/can/rcar_can.c
index 7deb80dcbe8c..93017c09cfc3 100644
--- a/drivers/net/can/rcar_can.c
+++ b/drivers/net/can/rcar_can.c
@@ -758,8 +758,9 @@ static int rcar_can_probe(struct platform_device *pdev)
 	}
 
 	irq = platform_get_irq(pdev, 0);
-	if (!irq) {
+	if (irq < 0) {
 		dev_err(&pdev->dev, "No IRQ resource\n");
+		err = irq;
 		goto fail;
 	}
 
-- 
2.1.4
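The difference between the two checks can be sketched in user space. Everything here is illustrative: fake_get_irq() is a hypothetical stand-in for platform_get_irq(), the IRQ number 42 is arbitrary, and -ENXIO is assumed as the failure code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for platform_get_irq(): returns the IRQ number,
 * or a negative errno value when the device has no IRQ resource.
 */
static int fake_get_irq(int have_irq)
{
	return have_irq ? 42 : -ENXIO;
}

/* Old check: only 0 counts as failure, so a negative code slips through. */
static int probe_old(int have_irq, int *irq_out)
{
	int irq = fake_get_irq(have_irq);

	assert(irq_out != NULL);
	if (!irq)
		return -ENODEV;
	*irq_out = irq;
	return 0;
}

/* Fixed check: any negative value is an error and is passed on as is. */
static int probe_new(int have_irq, int *irq_out)
{
	int irq = fake_get_irq(have_irq);

	assert(irq_out != NULL);
	if (irq < 0)
		return irq;
	*irq_out = irq;
	return 0;
}
```

In this model probe_old() reports success and stores the negative value as the IRQ number, which is what later reaches request_irq(); probe_new() fails early and hands the original error code to its caller.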
[PATCH 7/8] ARM: dts: dra7x-evm: Prevent glitch on DCAN1 pinmux
From: Roger Quadros

Driver core sets "default" pinmux on probe and CAN driver sets "sleep" pinmux
during register. This causes a small window where the CAN pins are in "default"
state with the DCAN module being disabled.

Change the "default" state to be like sleep so this glitch is avoided. Add a
new "active" state that is used by the driver when CAN is actually active.

Signed-off-by: Roger Quadros
Cc: linux-stable
Signed-off-by: Marc Kleine-Budde
---
 arch/arm/boot/dts/dra7-evm.dts  | 5 +++--
 arch/arm/boot/dts/dra72-evm.dts | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm/boot/dts/dra7-evm.dts b/arch/arm/boot/dts/dra7-evm.dts
index aa465904f6cc..096f68be99e2 100644
--- a/arch/arm/boot/dts/dra7-evm.dts
+++ b/arch/arm/boot/dts/dra7-evm.dts
@@ -686,7 +686,8 @@
 
 &dcan1 {
 	status = "ok";
-	pinctrl-names = "default", "sleep";
-	pinctrl-0 = <&dcan1_pins_default>;
+	pinctrl-names = "default", "sleep", "active";
+	pinctrl-0 = <&dcan1_pins_sleep>;
 	pinctrl-1 = <&dcan1_pins_sleep>;
+	pinctrl-2 = <&dcan1_pins_default>;
 };
diff --git a/arch/arm/boot/dts/dra72-evm.dts b/arch/arm/boot/dts/dra72-evm.dts
index 4e1b60581782..803738414086 100644
--- a/arch/arm/boot/dts/dra72-evm.dts
+++ b/arch/arm/boot/dts/dra72-evm.dts
@@ -587,9 +587,10 @@
 
 &dcan1 {
 	status = "ok";
-	pinctrl-names = "default", "sleep";
-	pinctrl-0 = <&dcan1_pins_default>;
+	pinctrl-names = "default", "sleep", "active";
+	pinctrl-0 = <&dcan1_pins_sleep>;
 	pinctrl-1 = <&dcan1_pins_sleep>;
+	pinctrl-2 = <&dcan1_pins_default>;
 };
 
 &qspi {
-- 
2.1.4
[PATCH 8/8] can: replace timestamp as unique skb attribute
From: Oliver Hartkopp Commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for overlapping CAN filters" requires the skb->tstamp to be set to check for identical CAN skbs. Without timestamping to be required by user space applications this timestamp was not generated which lead to commit 36c01245eb8 "can: fix loss of CAN frames in raw_rcv" - which forces the timestamp to be set in all CAN related skbuffs by introducing several __net_timestamp() calls. This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb() to add __net_timestamp() after skbuff creation to prevent the frame loss fixed in mainline Linux. This patch removes the timestamp dependency and uses an atomic counter to create an unique identifier together with the skbuff pointer. Btw: the new skbcnt element introduced in struct can_skb_priv has to be initialized with zero in out-of-tree drivers which are not using alloc_can{,fd}_skb() too. Signed-off-by: Oliver Hartkopp Cc: linux-stable Signed-off-by: Marc Kleine-Budde --- drivers/net/can/dev.c | 7 ++- drivers/net/can/slcan.c | 2 +- drivers/net/can/vcan.c | 3 --- include/linux/can/skb.h | 2 ++ net/can/af_can.c| 12 +++- net/can/bcm.c | 2 ++ net/can/raw.c | 7 --- 7 files changed, 18 insertions(+), 17 deletions(-) diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index e9b1810d319f..aede704605c6 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -440,9 +440,6 @@ unsigned int can_get_echo_skb(struct net_device *dev, unsigned int idx) struct can_frame *cf = (struct can_frame *)skb->data; u8 dlc = cf->can_dlc; - if (!(skb->tstamp.tv64)) - __net_timestamp(skb); - netif_rx(priv->echo_skb[idx]); priv->echo_skb[idx] = NULL; @@ -578,7 +575,6 @@ struct sk_buff *alloc_can_skb(struct net_device *dev, struct can_frame **cf) if (unlikely(!skb)) return NULL; - __net_timestamp(skb); skb->protocol = htons(ETH_P_CAN); skb->pkt_type = PACKET_BROADCAST; skb->ip_summed = CHECKSUM_UNNECESSARY; @@ -589,6 +585,7 @@ struct 
sk_buff *alloc_can_skb(struct net_device *dev, struct can_frame **cf) can_skb_reserve(skb); can_skb_prv(skb)->ifindex = dev->ifindex; + can_skb_prv(skb)->skbcnt = 0; *cf = (struct can_frame *)skb_put(skb, sizeof(struct can_frame)); memset(*cf, 0, sizeof(struct can_frame)); @@ -607,7 +604,6 @@ struct sk_buff *alloc_canfd_skb(struct net_device *dev, if (unlikely(!skb)) return NULL; - __net_timestamp(skb); skb->protocol = htons(ETH_P_CANFD); skb->pkt_type = PACKET_BROADCAST; skb->ip_summed = CHECKSUM_UNNECESSARY; @@ -618,6 +614,7 @@ struct sk_buff *alloc_canfd_skb(struct net_device *dev, can_skb_reserve(skb); can_skb_prv(skb)->ifindex = dev->ifindex; + can_skb_prv(skb)->skbcnt = 0; *cfd = (struct canfd_frame *)skb_put(skb, sizeof(struct canfd_frame)); memset(*cfd, 0, sizeof(struct canfd_frame)); diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c index f64f5290d6f8..a23a7af8eb9a 100644 --- a/drivers/net/can/slcan.c +++ b/drivers/net/can/slcan.c @@ -207,7 +207,6 @@ static void slc_bump(struct slcan *sl) if (!skb) return; - __net_timestamp(skb); skb->dev = sl->dev; skb->protocol = htons(ETH_P_CAN); skb->pkt_type = PACKET_BROADCAST; @@ -215,6 +214,7 @@ static void slc_bump(struct slcan *sl) can_skb_reserve(skb); can_skb_prv(skb)->ifindex = sl->dev->ifindex; + can_skb_prv(skb)->skbcnt = 0; memcpy(skb_put(skb, sizeof(struct can_frame)), &cf, sizeof(struct can_frame)); diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c index 0ce868de855d..674f367087c5 100644 --- a/drivers/net/can/vcan.c +++ b/drivers/net/can/vcan.c @@ -78,9 +78,6 @@ static void vcan_rx(struct sk_buff *skb, struct net_device *dev) skb->dev = dev; skb->ip_summed = CHECKSUM_UNNECESSARY; - if (!(skb->tstamp.tv64)) - __net_timestamp(skb); - netif_rx_ni(skb); } diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h index b6a52a4b457a..51bb6532785c 100644 --- a/include/linux/can/skb.h +++ b/include/linux/can/skb.h @@ -27,10 +27,12 @@ /** * struct can_skb_priv - private additional data 
inside CAN sk_buffs * @ifindex: ifindex of the first interface the CAN frame appeared on + * @skbcnt:atomic counter to have an unique id together with skb pointer * @cf:align to the following CAN frame at skb->data */ struct can_skb_priv { int ifindex; + int skbcnt; struct can_frame cf[0]; }; diff --git a/net/can/af_can.c b/net/can/af_can.c index 7933e62a7318..166d436196c1 100644 --- a/net/can/af_can.c +++
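The af_can.c and raw.c hunks are cut off in this archive, so the following is not the patch's code. It is only a user-space sketch of the idea stated in the commit message: a counter value combined with the skb pointer identifies a frame. All names are hypothetical, C11 atomics stand in for the kernel's atomic_t, and an unsigned counter is used (the patch declares skbcnt as int) so that wraparound is well defined.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a CAN skb and its private skbcnt field. */
struct fake_skb {
	unsigned int skbcnt;	/* 0 means "no id assigned yet" */
};

/* What a receiver remembers about the last frame it delivered. */
struct last_seen {
	const struct fake_skb *skb;
	unsigned int skbcnt;
};

static atomic_uint skb_counter;

/* Give the skb a non-zero id on first use; 0 is skipped on wraparound. */
static void assign_id(struct fake_skb *skb)
{
	assert(skb != NULL);
	while (skb->skbcnt == 0)
		skb->skbcnt = atomic_fetch_add(&skb_counter, 1u) + 1u;
}

/*
 * Return 1 if this skb was already delivered to the receiver, else
 * record it and return 0. Pointer and counter must both match, so a
 * reused pointer with a fresh counter is not mistaken for a duplicate.
 */
static int already_delivered(struct last_seen *ls, const struct fake_skb *skb)
{
	assert(ls != NULL && skb != NULL);
	if (ls->skb == skb && ls->skbcnt == skb->skbcnt)
		return 1;
	ls->skb = skb;
	ls->skbcnt = skb->skbcnt;
	return 0;
}
```

Because the id starts at zero and is only assigned lazily, code that allocates skbs by hand merely has to zero the field, which matches the note about out-of-tree drivers in the commit message.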
pull-request: can 2015-07-12
Hello David,

this is a pull request of 8 patches for net/master.

Sergei Shtylyov contributes 5 patches for the rcar_can driver, fixing the IRQ
check and several info and error messages. There are two patches by J.D.
Schroeder and Roger Quadros for the c_can driver and dra7x-evm device tree,
which prevent a glitch in the DCAN1 pinmux. Oliver Hartkopp provides a better
approach to make the CAN skbs unique: the timestamp is replaced by a counter.

regards,
Marc

---

The following changes since commit 2ee94014d9bd3868b1c0d17405f96d63bec83f28:

  net: switchdev: don't abort unsupported operations (2015-07-11 21:29:55 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git tags/linux-can-fixes-for-4.2-20150712

for you to fetch changes up to d3b58c47d330de8c29898fe9746f7530408f8a59:

  can: replace timestamp as unique skb attribute (2015-07-12 21:13:22 +0200)

linux-can-fixes-for-4.2-20150712

J.D. Schroeder (1):
      can: c_can: Fix default pinmux glitch at init

Oliver Hartkopp (1):
      can: replace timestamp as unique skb attribute

Roger Quadros (1):
      ARM: dts: dra7x-evm: Prevent glitch on DCAN1 pinmux

Sergei Shtylyov (5):
      can: rcar_can: fix IRQ check
      can: rcar_can: print signed IRQ #
      can: rcar_can: fix typo in error message
      can: rcar_can: print request_irq() error code
      can: rcar_can: unify error messages

 arch/arm/boot/dts/dra7-evm.dts  |  5 +++--
 arch/arm/boot/dts/dra72-evm.dts |  5 +++--
 drivers/net/can/c_can/c_can.c   | 10 ++++++++--
 drivers/net/can/dev.c           |  7 ++-----
 drivers/net/can/rcar_can.c      | 16 ++++++++++------
 drivers/net/can/slcan.c         |  2 +-
 drivers/net/can/vcan.c          |  3 ---
 include/linux/can/skb.h         |  2 ++
 net/can/af_can.c                | 12 +++++++-----
 net/can/bcm.c                   |  2 ++
 net/can/raw.c                   |  7 ++++---
 11 files changed, 42 insertions(+), 29 deletions(-)
Re: Fighting out-of-order reception with RPS?
Hello Eric, On 07/11/2015 06:35 AM, Eric Dumazet wrote: > On Fri, 2015-07-10 at 22:36 +0200, Oliver Hartkopp wrote: >> Hm. Doesn't sound like a good solution when there's a difference between NAPI >> and non-NAPI drivers in matters of OOO, right? > > Isn't OOO a problem for you ? Then you either have to : > > 1) Use a single CPU to handle IRQ from the device > 2) Use NAPI > See below ... >> What about checking in netif_rx() if the non-NAPI driver has set a hash (aka >> the driver is OOO sensitive)? >> And if so we could automatically set rps_cpus for this interface in a way >> that >> all CPUs are enabled to take skbs following the hash. > > Wow, netif_rx() is packet processing fast path, certainly not the place > to add controlling path decisions. My only requirement is to be able to pick CAN frames (contained in skbs) from the socket in the same order they have been received. > Please convert your driver to NAPI. You might then even benefit from > GRO. Just some remarks about CAN and CAN frames as you suggest GRO which is completely pointless for CAN. CAN frames have a 11 or 29 bit CAN Identifier (no MAC but content addressing) and 0 to 64 bytes of payload. Therefore the MTU for CAN interfaces is 16 or 72 byte (see struct can(fd)_frame). Each skbuff contains a single CAN frame. There are CAN controllers which have a FIFO for up to 32 CAN frames, e.g. flexcan.c which also implements NAPI. Others (e.g. sja1000.c) don't have any FIFO and the reading of the CAN frame from the memory mapped registers needs to be processed in the irq context instantly. So 'fast path' netif_rx() is reasonable, right? So why is it not possible to pass netif_rx() skbs from a specific CAN network interface to whatever queue where they are processed in order? E.g. with skb_set_hash(skb, dev->ifindex, PKT_HASH_TYPE_L2); and echo f > /sys/class/net/can0/queues/rx-0/rps_cpus I get properly ordered CAN frames - even with netif_rx() processed skbs. 
I just want to have this stuff to be enabled by default for CAN interfaces to kill the OOO frame issue. Regards, Oliver
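Why a per-interface hash keeps frames in order can be illustrated with a simplified model of the CPU choice. This is a sketch under assumptions, not the kernel's RPS implementation: the function names are hypothetical, and the selection is reduced to scaling the 32-bit hash into an index into the list of CPUs enabled through rps_cpus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a 32-bit hash onto the range [0, n) without a division. */
static uint32_t scale_hash(uint32_t hash, uint32_t n)
{
	return (uint32_t)(((uint64_t)hash * n) >> 32);
}

/*
 * Pick the CPU that processes a packet. The result depends only on the
 * hash, so every frame carrying the same hash lands on the same CPU.
 */
static int pick_cpu(uint32_t hash, const int *cpus, uint32_t ncpus)
{
	assert(cpus != NULL && ncpus > 0);
	return cpus[scale_hash(hash, ncpus)];
}
```

If the hash is derived from dev->ifindex, as in the skb_set_hash() call quoted above, all frames of one CAN interface map to a single CPU and are queued there in arrival order, while different interfaces can still be given different hashes.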
Re: [except_nonlink net-next]r8169:Add exception when missing link
Marian Corcodel : > Add exception when missing link because original function " > rtl8169_check_link_status" must be int instead of void. You explain why the patch is done this way but you don't explain why it is needed/useful. > commit 21d27973b264192a4ccd6488b1487f07293b11c8 > Author: Corcodel Marian > Date: Sat Jul 11 21:19:57 2015 +0300 > > Committer: Corcodel Marian > Add exception when nonexistent link occur because orig func > is void format instead of int. [...] > diff --git a/drivers/net/ethernet/realtek/r8169.c > b/drivers/net/ethernet/realtek/r8169.c > index 410c1ee..7465ec4 100644 > --- a/drivers/net/ethernet/realtek/r8169.c > +++ b/drivers/net/ethernet/realtek/r8169.c > @@ -7643,6 +7643,12 @@ static int rtl_open(struct net_device *dev) > > tp->saved_wolopts = 0; > pm_runtime_put_noidle(&pdev->dev); > + if (!tp->link_ok(ioaddr)) { > + netif_carrier_off(dev); > + netif_info(tp, ifdown, dev, "link down\n"); > + pm_schedule_suspend(&tp->pci_dev->dev, 200); (nit: &tp->pci_dev->dev == &pdev->dev from a few lines above) > + goto out; > + } > > rtl8169_check_link_status(dev, tp, ioaddr); > out: You're (partly) reverting the change below without any sensible explanation. You should elaborate which problem you are trying to address. Btw I can't help thinking that the style is terrible and the whole stuff ought to stay in rtl8169_check_link_status. commit e4fbce740f078bbc925ba5c86648d9c883968479 Author: Rafael J. Wysocki Date: Wed Dec 8 15:32:14 2010 + r8169: Fix runtime power management I noticed that one of the post-2.6.36 patches broke runtime PM of the r8169 on my MSI Wind test machine in such a way that the link was not brought up after reconnecting the network cable. In the process of debugging the issue I realized that we only should invoke the runtime PM functions in rtl8169_check_link_status() when link change is reported and if we do so, the problem goes away. Moreover, this allows rtl8169_runtime_idle() to be simplified quite a bit. 
Signed-off-by: Rafael J. Wysocki Acked-by: Francois Romieu Signed-off-by: David S. Miller diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c index 7d33ef4..53b13de 100644 --- a/drivers/net/r8169.c +++ b/drivers/net/r8169.c @@ -744,26 +744,36 @@ static void rtl8169_xmii_reset_enable(void __iomem *ioaddr) mdio_write(ioaddr, MII_BMCR, val & 0x); } -static void rtl8169_check_link_status(struct net_device *dev, +static void __rtl8169_check_link_status(struct net_device *dev, struct rtl8169_private *tp, - void __iomem *ioaddr) + void __iomem *ioaddr, + bool pm) { unsigned long flags; spin_lock_irqsave(&tp->lock, flags); if (tp->link_ok(ioaddr)) { /* This is to cancel a scheduled suspend if there's one. */ - pm_request_resume(&tp->pci_dev->dev); + if (pm) + pm_request_resume(&tp->pci_dev->dev); netif_carrier_on(dev); netif_info(tp, ifup, dev, "link up\n"); } else { netif_carrier_off(dev); netif_info(tp, ifdown, dev, "link down\n"); - pm_schedule_suspend(&tp->pci_dev->dev, 100); + if (pm) + pm_schedule_suspend(&tp->pci_dev->dev, 100); } spin_unlock_irqrestore(&tp->lock, flags); } +static void rtl8169_check_link_status(struct net_device *dev, + struct rtl8169_private *tp, + void __iomem *ioaddr) +{ + __rtl8169_check_link_status(dev, tp, ioaddr, false); +} + #define WAKE_ANY (WAKE_PHY | WAKE_MAGIC | WAKE_UCAST | WAKE_BCAST | WAKE_MCAST) static u32 __rtl8169_get_wol(struct rtl8169_private *tp) @@ -4600,7 +4610,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance) } if (status & LinkChg) - rtl8169_check_link_status(dev, tp, ioaddr); + __rtl8169_check_link_status(dev, tp, ioaddr, true); /* We need to see the lastest version of tp->intr_mask to * avoid ignoring an MSI interrupt and having to wait for @@ -4890,11 +4900,7 @@ static int rtl8169_runtime_idle(struct device *device) struct net_device *dev = pci_get_drvdata(pdev); struct rtl8169_private *tp = netdev_priv(dev); - if (!tp->TxDescArray) - return 0; - - rtl8169_check_link_status(dev, tp, 
tp->mmio_addr); - return -EBUSY; + return tp->TxDescArray ? -EBUSY : 0; } static const struct dev_pm_ops rtl8169_pm_ops = { -- To unsubscribe from this list: send the line "unsubscribe netdev" in