Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload
On Mon, Nov 30, 2015 at 1:42 PM, Singhai, Anjali wrote:
>
>
> -----Original Message-----
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Sunday, November 29, 2015 7:23 PM
> To: t...@herbertland.com
> Cc: Brandeburg, Jesse; Singhai, Anjali; je...@kernel.org;
> netdev@vger.kernel.org; Patil, Kiran
> Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload
>
> From: Tom Herbert
> Date: Tue, 24 Nov 2015 09:32:11 -0800
>
>>>
>>> FWIW, I've brought the issue to the attention of the architects here,
>>> and we will likely be able to make changes in this space. Intel
>>> hardware (as demonstrated by your patches) already is able to deal
>>> with this de-ossification on transmit. Receive is a whole different beast.
>>>
>> Please provide the specifics on why "Receive is a whole different
>> beast.". Generic receive checksum is already a subset of the
>> functionality that you must have implement to support the protocol
>> specific offloads. All the hardware needs to do is calculate the 1's
>> complement checksum of the packet and return the value on the to the
>> host with that packet. That's it. No parsing of headers, no worrying
>> about the pseudo header, no dealing with any encapsulation. Just do
>> the calculation, return the result to the host and the driver converts
>> this to CHECKSUM_COMPLETE. I find it very hard to believe that this is
>> any harder than specific support the next protocol du jour.
>
> The reason for receive being different than transmit is, on TX side driver
> can provide the meta data for where the checksum field is and what is the
> length that needs to be check summed to the HW on a per packet basis. On Rx
> the HW parser has to parse the packet to identify the tunnel type and based
> on that figure out the checksum locations and length in the packet, so
> definitely HW has to parse the packet and it can parse only based on next
> header type information or in case of udp tunnels based on udp port mapping
> to a particular protocol. I am not sure why you say it doesn't need to parse
> the packet, maybe I am misunderstanding something. Although it's not
> difficult to reduce protocol ossification on the RX side but it is certainly
> different and particularly in case of udp-tunnels it needs the port to
> protocol mapping.
>

Please look at how CHECKSUM_COMPLETE interface works. Description is
in sk_buff.h or
http://people.netfilter.org/pablo/netdev0.1/papers/UDP-Encapsulation-in-Linux.pdf.

Thanks,
Tom
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
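For readers following the CHECKSUM_COMPLETE argument in this thread, here is a minimal userspace sketch in C of the arithmetic involved. It is not kernel or driver code: the helper names (csum_fold, ones_complement_sum, csum_add16, csum_sub16) are made up for illustration, and it assumes the device simply reports a 16-bit ones' complement sum over the bytes it received. What it tries to show is that the host can remove the contribution of any header from that sum afterwards, so the device does not need to know where the headers are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator down to 16 bits with end-around carry. */
static uint16_t csum_fold(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Ones' complement sum of a buffer, read as big-endian 16-bit words.
 * An odd trailing byte is padded with a zero byte on the right.
 */
static uint16_t ones_complement_sum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(data != NULL || len == 0);

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	return csum_fold(sum);
}

/* Ones' complement addition of two 16-bit partial sums. */
static uint16_t csum_add16(uint16_t a, uint16_t b)
{
	return csum_fold((uint64_t)a + b);
}

/*
 * Remove the contribution of "part" from "total". This only matches the
 * sum of the remaining bytes when the removed region starts at an even
 * offset and has an even length.
 */
static uint16_t csum_sub16(uint16_t total, uint16_t part)
{
	return csum_add16(total, (uint16_t)~part);
}
```

With the 8-byte example from RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the whole-buffer sum folds to 0xddf2. Subtracting the sum of the first four bytes with csum_sub16() leaves the sum of the last four, which is the kind of adjustment the host can do when it pulls an outer header.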
[PATCH net-next v3 17/17] net: mlx4: use new ETHTOOL_G/SSETTINGS API
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 323
 drivers/net/ethernet/mellanox/mlx4/en_main.c    |   1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h    |   1 +
 3 files changed, 157 insertions(+), 168 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index dd84cab..0ccdc84 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -501,34 +501,30 @@ static u32 mlx4_en_autoneg_get(struct net_device *dev)
 	return autoneg;
 }
 
-static u32 ptys_get_supported_port(struct mlx4_ptys_reg *ptys_reg)
+static void ptys2ethtool_update_supported_port(ethtool_link_mode_mask_t *mask,
+					       struct mlx4_ptys_reg *ptys_reg)
 {
 	u32 eth_proto = be32_to_cpu(ptys_reg->eth_proto_cap);
 
 	if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_T)
 			 | MLX4_PROT_MASK(MLX4_1000BASE_T)
 			 | MLX4_PROT_MASK(MLX4_100BASE_TX))) {
-		return SUPPORTED_TP;
-	}
-
-	if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_CR)
+		ethtool_add_link_modes(mask, ETHTOOL_LINK_MODE_TP_BIT);
+	} else if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_CR)
 			 | MLX4_PROT_MASK(MLX4_10GBASE_SR)
 			 | MLX4_PROT_MASK(MLX4_56GBASE_SR4)
 			 | MLX4_PROT_MASK(MLX4_40GBASE_CR4)
 			 | MLX4_PROT_MASK(MLX4_40GBASE_SR4)
 			 | MLX4_PROT_MASK(MLX4_1000BASE_CX_SGMII))) {
-		return SUPPORTED_FIBRE;
-	}
-
-	if (eth_proto & (MLX4_PROT_MASK(MLX4_56GBASE_KR4)
+		ethtool_add_link_modes(mask, ETHTOOL_LINK_MODE_FIBRE_BIT);
+	} else if (eth_proto & (MLX4_PROT_MASK(MLX4_56GBASE_KR4)
 			 | MLX4_PROT_MASK(MLX4_40GBASE_KR4)
 			 | MLX4_PROT_MASK(MLX4_20GBASE_KR2)
 			 | MLX4_PROT_MASK(MLX4_10GBASE_KR)
 			 | MLX4_PROT_MASK(MLX4_10GBASE_KX4)
 			 | MLX4_PROT_MASK(MLX4_1000BASE_KX))) {
-		return SUPPORTED_Backplane;
+		ethtool_add_link_modes(mask, ETHTOOL_LINK_MODE_Backplane_BIT);
 	}
-	return 0;
 }
 
 static u32 ptys_get_active_port(struct mlx4_ptys_reg *ptys_reg)
@@ -574,122 +570,91 @@ static u32 ptys_get_active_port(struct mlx4_ptys_reg *ptys_reg)
 enum ethtool_report {
 	SUPPORTED = 0,
 	ADVERTISED = 1,
-	SPEED = 2
 };
 
+struct ptys2ethtool_config {
+	ethtool_link_mode_mask_t link_modes[2]; /* SUPPORTED/ADVERTISED */
+	u32 speed;
+};
+
+#define MLX4_BUILD_PTYS2ETHTOOL_CONFIG(reg_, speed_, ...)		\
+	({								\
+		struct ptys2ethtool_config *cfg;			\
+		cfg = &ptys2ethtool_map[reg_];				\
+		cfg->speed = speed_;					\
+		ethtool_build_link_mode(&cfg->link_modes[SUPPORTED],	\
+					__VA_ARGS__);			\
+		ethtool_build_link_mode(&cfg->link_modes[ADVERTISED],	\
+					__VA_ARGS__);			\
+	})
+
 /* Translates mlx4 link mode to equivalent ethtool Link modes/speed */
-static u32 ptys2ethtool_map[MLX4_LINK_MODES_SZ][3] = {
-	[MLX4_100BASE_TX] = {
-		SUPPORTED_100baseT_Full,
-		ADVERTISED_100baseT_Full,
-		SPEED_100
-		},
-
-	[MLX4_1000BASE_T] = {
-		SUPPORTED_1000baseT_Full,
-		ADVERTISED_1000baseT_Full,
-		SPEED_1000
-		},
-	[MLX4_1000BASE_CX_SGMII] = {
-		SUPPORTED_1000baseKX_Full,
-		ADVERTISED_1000baseKX_Full,
-		SPEED_1000
-		},
-	[MLX4_1000BASE_KX] = {
-		SUPPORTED_1000baseKX_Full,
-		ADVERTISED_1000baseKX_Full,
-		SPEED_1000
-		},
-
-	[MLX4_10GBASE_T] = {
-		SUPPORTED_10000baseT_Full,
-		ADVERTISED_10000baseT_Full,
-		SPEED_10000
-		},
-	[MLX4_10GBASE_CX4] = {
-		SUPPORTED_10000baseKX4_Full,
-		ADVERTISED_10000baseKX4_Full,
-		SPEED_10000
-		},
-	[MLX4_10GBASE_KX4] = {
-		SUPPORTED_10000baseKX4_Full,
-		ADVERTISED_10000baseKX4_Full,
-		SPEED_10000
-		},
-	[MLX4_10GBASE_KR] = {
-		SUPPORTED_10000baseKR_Full,
-		ADVERTISED_10000baseKR_Full,
-		SPEED_10000
-		},
-	[MLX4_10GBASE_CR] = {
-
[PATCH net-next v3 04/17] tx4939: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 arch/mips/txx9/generic/setup_tx4939.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/mips/txx9/generic/setup_tx4939.c b/arch/mips/txx9/generic/setup_tx4939.c
index e3733cd..4a3ebf6 100644
--- a/arch/mips/txx9/generic/setup_tx4939.c
+++ b/arch/mips/txx9/generic/setup_tx4939.c
@@ -320,11 +320,12 @@ void __init tx4939_sio_init(unsigned int sclk, unsigned int cts_mask)
 #if IS_ENABLED(CONFIG_TC35815)
 static u32 tx4939_get_eth_speed(struct net_device *dev)
 {
-	struct ethtool_cmd cmd;
-	if (__ethtool_get_settings(dev, &cmd))
+	struct ethtool_ksettings cmd;
+
+	if (__ethtool_get_ksettings(dev, &cmd))
 		return 100;	/* default 100Mbps */
 
-	return ethtool_cmd_speed(&cmd);
+	return cmd.parent.speed;
 }
 
 static int tx4939_netdev_event(struct notifier_block *this,
-- 
2.6.0.rc2.230.g3dd15c0
[PATCH] net: smc911x: convert pxa dma to dmaengine
Convert the dma transfers to be dmaengine based, now pxa has a dmaengine
slave driver. This makes this driver a bit more PXA agnostic.

The driver was only compile tested. The risk is quite small as no
current PXA platform I'm aware of is using smc911x driver.

Signed-off-by: Robert Jarzmik
---
 drivers/net/ethernet/smsc/smc911x.c | 85
 drivers/net/ethernet/smsc/smc911x.h | 63
 2 files changed, 82 insertions(+), 66 deletions(-)

diff --git a/drivers/net/ethernet/smsc/smc911x.c b/drivers/net/ethernet/smsc/smc911x.c
index bd64eb982e52..3f5711061432 100644
--- a/drivers/net/ethernet/smsc/smc911x.c
+++ b/drivers/net/ethernet/smsc/smc911x.c
@@ -73,6 +73,9 @@ static const char version[] =
 #include
 #include
 
+#include
+#include
+
 #include
 
 #include "smc911x.h"
@@ -1174,18 +1177,16 @@ static irqreturn_t smc911x_interrupt(int irq, void *dev_id)
 #ifdef SMC_USE_DMA
 static void
-smc911x_tx_dma_irq(int dma, void *data)
+smc911x_tx_dma_irq(void *data)
 {
-	struct net_device *dev = (struct net_device *)data;
-	struct smc911x_local *lp = netdev_priv(dev);
+	struct smc911x_local *lp = data;
+	struct net_device *dev = lp->netdev;
 	struct sk_buff *skb = lp->current_tx_skb;
 	unsigned long flags;
 
 	DBG(SMC_DEBUG_FUNC, dev, "--> %s\n", __func__);
 
 	DBG(SMC_DEBUG_TX | SMC_DEBUG_DMA, dev, "TX DMA irq handler\n");
-	/* Clear the DMA interrupt sources */
-	SMC_DMA_ACK_IRQ(dev, dma);
 	BUG_ON(skb == NULL);
 	dma_unmap_single(NULL, tx_dmabuf, tx_dmalen, DMA_TO_DEVICE);
 	dev->trans_start = jiffies;
@@ -1208,18 +1209,16 @@ smc911x_tx_dma_irq(int dma, void *data)
 		"TX DMA irq completed\n");
 }
 static void
-smc911x_rx_dma_irq(int dma, void *data)
+smc911x_rx_dma_irq(void *data)
 {
-	struct net_device *dev = (struct net_device *)data;
-	struct smc911x_local *lp = netdev_priv(dev);
+	struct smc911x_local *lp = data;
+	struct net_device *dev = lp->netdev;
 	struct sk_buff *skb = lp->current_rx_skb;
 	unsigned long flags;
 	unsigned int pkts;
 
 	DBG(SMC_DEBUG_FUNC, dev, "--> %s\n", __func__);
 	DBG(SMC_DEBUG_RX | SMC_DEBUG_DMA, dev, "RX DMA irq handler\n");
-	/* Clear the DMA interrupt sources */
-	SMC_DMA_ACK_IRQ(dev, dma);
 	dma_unmap_single(NULL, rx_dmabuf, rx_dmalen, DMA_FROM_DEVICE);
 	BUG_ON(skb == NULL);
 	lp->current_rx_skb = NULL;
@@ -1792,6 +1791,9 @@ static int smc911x_probe(struct net_device *dev)
 	unsigned int val, chip_id, revision;
 	const char *version_string;
 	unsigned long irq_flags;
+	struct dma_slave_config config;
+	dma_cap_mask_t mask;
+	struct pxad_param param;
 
 	DBG(SMC_DEBUG_FUNC, dev, "--> %s\n", __func__);
 
@@ -1963,11 +1965,40 @@ static int smc911x_probe(struct net_device *dev)
 		goto err_out;
 
 #ifdef SMC_USE_DMA
-	lp->rxdma = SMC_DMA_REQUEST(dev, smc911x_rx_dma_irq);
-	lp->txdma = SMC_DMA_REQUEST(dev, smc911x_tx_dma_irq);
+
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+	param.prio = PXAD_PRIO_LOWEST;
+	param.drcmr = -1UL;
+
+	lp->rxdma =
+		dma_request_slave_channel_compat(mask, pxad_filter_fn,
+						 &param, &dev->dev, "rx");
+	lp->txdma =
+		dma_request_slave_channel_compat(mask, pxad_filter_fn,
+						 &param, &dev->dev, "tx");
 	lp->rxdma_active = 0;
 	lp->txdma_active = 0;
-	dev->dma = lp->rxdma;
+
+	memset(&config, 0, sizeof(config));
+	config.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+	config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+	config.src_addr = lp->physaddr + RX_DATA_FIFO;
+	config.dst_addr = lp->physaddr + TX_DATA_FIFO;
+	config.src_maxburst = 32;
+	config.dst_maxburst = 32;
+	retval = dmaengine_slave_config(lp->rxdma, &config);
+	if (retval) {
+		dev_err(lp->dev, "dma rx channel configuration failed: %d\n",
+			retval);
+		goto err_out;
+	}
+	retval = dmaengine_slave_config(lp->txdma, &config);
+	if (retval) {
+		dev_err(lp->dev, "dma tx channel configuration failed: %d\n",
+			retval);
+		goto err_out;
+	}
 #endif
 
 	retval = register_netdev(dev);
@@ -1978,11 +2009,11 @@ static int smc911x_probe(struct net_device *dev)
 			dev->base_addr, dev->irq);
 
 #ifdef SMC_USE_DMA
-		if (lp->rxdma != -1)
-			pr_cont(" RXDMA %d", lp->rxdma);
+		if (lp->rxdma)
+			pr_cont(" RXDMA %p", lp->rxdma);
 
-		if (lp->txdma != -1)
-			pr_cont(" TXDMA %d", lp->txdma);
+		if (lp->txdma)
+
[PATCH v2] ravb: add R8A7791 support
Add support for yet another ARM member of the R-Car family, R-Car M2-W, also
known as R8A7791.

Signed-off-by: Sergei Shtylyov

---
The patch is against DaveM's 'net-next.git' repo but I wouldn't mind if it's
applied to 'net.git' instead. :-)

Changes in version 2:
- fixed the SoC name in the changelog.

 Documentation/devicetree/bindings/net/renesas,ravb.txt |    1 +
 drivers/net/ethernet/renesas/ravb_main.c               |    1 +
 2 files changed, 2 insertions(+)

Index: net-next/Documentation/devicetree/bindings/net/renesas,ravb.txt
===================================================================
--- net-next.orig/Documentation/devicetree/bindings/net/renesas,ravb.txt
+++ net-next/Documentation/devicetree/bindings/net/renesas,ravb.txt
@@ -5,6 +5,7 @@ interface contains.
 
 Required properties:
 - compatible: "renesas,etheravb-r8a7790" if the device is a part of R8A7790 SoC.
+	      "renesas,etheravb-r8a7791" if the device is a part of R8A7791 SoC.
 	      "renesas,etheravb-r8a7794" if the device is a part of R8A7794 SoC.
 	      "renesas,etheravb-r8a7795" if the device is a part of R8A7795 SoC.
 - reg: offset and length of (1) the register block and (2) the stream buffer.
Index: net-next/drivers/net/ethernet/renesas/ravb_main.c
===================================================================
--- net-next.orig/drivers/net/ethernet/renesas/ravb_main.c
+++ net-next/drivers/net/ethernet/renesas/ravb_main.c
@@ -1655,6 +1655,7 @@ static int ravb_mdio_release(struct ravb
 
 static const struct of_device_id ravb_match_table[] = {
 	{ .compatible = "renesas,etheravb-r8a7790", .data = (void *)RCAR_GEN2 },
+	{ .compatible = "renesas,etheravb-r8a7791", .data = (void *)RCAR_GEN2 },
 	{ .compatible = "renesas,etheravb-r8a7794", .data = (void *)RCAR_GEN2 },
 	{ .compatible = "renesas,etheravb-r8a7795", .data = (void *)RCAR_GEN3 },
 	{ }
[PATCH net-next v3 06/17] net: bonding: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/net/bonding/bond_main.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9e0f8a7..67d724d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -374,22 +374,20 @@ down:
 static void bond_update_speed_duplex(struct slave *slave)
 {
 	struct net_device *slave_dev = slave->dev;
-	struct ethtool_cmd ecmd;
-	u32 slave_speed;
+	struct ethtool_ksettings ecmd;
 	int res;
 
 	slave->speed = SPEED_UNKNOWN;
 	slave->duplex = DUPLEX_UNKNOWN;
 
-	res = __ethtool_get_settings(slave_dev, &ecmd);
+	res = __ethtool_get_ksettings(slave_dev, &ecmd);
 	if (res < 0)
 		return;
 
-	slave_speed = ethtool_cmd_speed(&ecmd);
-	if (slave_speed == 0 || slave_speed == ((__u32) -1))
+	if (ecmd.parent.speed == 0 || ecmd.parent.speed == ((__u32)-1))
 		return;
 
-	switch (ecmd.duplex) {
+	switch (ecmd.parent.duplex) {
 	case DUPLEX_FULL:
 	case DUPLEX_HALF:
 		break;
@@ -397,8 +395,8 @@ static void bond_update_speed_duplex(struct slave *slave)
 		return;
 	}
 
-	slave->speed = slave_speed;
-	slave->duplex = ecmd.duplex;
+	slave->speed = ecmd.parent.speed;
+	slave->duplex = ecmd.parent.duplex;
 
 	return;
 }
-- 
2.6.0.rc2.230.g3dd15c0
[PATCH net-next v3 03/17] net: ethtool: add new ETHTOOL_GSETTINGS/SSETTINGS API
From: David Decotigny

This patch defines a new ETHTOOL_GSETTINGS/SSETTINGS API, handled by
the new get_ksettings/set_ksettings callbacks. This API provides
support for most legacy ethtool_cmd fields, adds support for larger
link mode masks (up to 4064 bits, variable length), and removes
ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt).

This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides
the following backward compatibility properties:
 - legacy ethtool with legacy drivers: no change, still using the
   get_settings/set_settings callbacks.
 - legacy ethtool with new get/set_ksettings drivers: the new driver
   callbacks are used, data internally converted to legacy
   ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link
   mode mask. ETHTOOL_SSET will fail if user tries to set the
   ethtool_cmd deprecated fields to non-0
   (transceiver/maxrxpkt/maxtxpkt). A kernel warning is printed if
   driver exports higher bits or if user request changes in deprecated
   fields mentioned earlier.
 - future ethtool with legacy drivers: no change, still using the
   get_settings/set_settings callbacks, internally converted to new data
   structure. Note that "future" ethtool tool will not allow changes
   to deprecated fields (transceiver/maxrxpkt/maxtxpkt), as they cannot
   be expressed for the kernel.
 - future ethtool with new drivers: direct call to the new callbacks.

By "future" ethtool, what is meant is:
 - query: first try ETHTOOL_GSETTINGS, and revert to ETHTOOL_GSET if
   fails
 - set: query first and remember which of ETHTOOL_GSETTINGS or
   ETHTOOL_GSET was successful
   - if ETHTOOL_GSETTINGS was successful, then change config with
     ETHTOOL_SSETTINGS. A failure there is final (do not try
     ETHTOOL_SSET).
   - otherwise ETHTOOL_GSET was successful, change config with
     ETHTOOL_SSET. A failure there is final (do not try
     ETHTOOL_SSETTINGS).

The interaction user/kernel via the new API requires a small
ETHTOOL_GSETTINGS handshake first to agree on the length of the link
mode bitmaps. If kernel doesn't agree with user, it returns the bitmap
length it is expecting from user as a negative length (and cmd field
is 0). When kernel and user agree, kernel returns valid info in all
fields (ie. link mode length > 0 and cmd is ETHTOOL_GSETTINGS).

Data structure crossing user/kernel boundary is 32/64-bit
agnostic. Converted internally to a legal kernel bitmap.

The internal __ethtool_get_settings kernel helper will gradually be
replaced by __ethtool_get_ksettings by the time the first ksettings
drivers start to appear. So this patch doesn't change it, it will be
removed before it needs to be changed.

Signed-off-by: David Decotigny
---
 include/linux/ethtool.h      | 101
 include/uapi/linux/ethtool.h | 323
 net/core/ethtool.c           | 489
 3 files changed, 833 insertions(+), 80 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 653dc9c..6de122d 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -12,6 +12,7 @@
 #ifndef _LINUX_ETHTOOL_H
 #define _LINUX_ETHTOOL_H
 
+#include
 #include
 #include
 
@@ -40,9 +41,6 @@ struct compat_ethtool_rxnfc {
 
 #include
 
-extern int __ethtool_get_settings(struct net_device *dev,
-				  struct ethtool_cmd *cmd);
-
 /**
  * enum ethtool_phys_id_state - indicator state for physical identification
  * @ETHTOOL_ID_INACTIVE: Physical ID indicator should be deactivated
@@ -97,13 +95,85 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
 	return index % n_rx_rings;
 }
 
+#define __ETHTOOL_LINK_MODE_IS_VALID_BIT(indice) \
+	((indice) >= 0 && (indice) <= __ETHTOOL_LINK_MODE_LAST)
+
+/* number of link mode bits handled internally by kernel */
+#define __ETHTOOL_LINK_MODE_MASK_NBITS (__ETHTOOL_LINK_MODE_LAST+1)
+
+typedef struct {
+	unsigned long mask[BITS_TO_LONGS(__ETHTOOL_LINK_MODE_MASK_NBITS)];
+} ethtool_link_mode_mask_t;
+
+/* drivers must ignore parent.cmd and parent.link_mode_masks_nwords
+ * fields, but they are allowed to overwrite them (will be ignored).
+ */
+struct ethtool_ksettings {
+	struct ethtool_settings parent;
+	struct {
+		ethtool_link_mode_mask_t supported;
+		ethtool_link_mode_mask_t advertising;
+		ethtool_link_mode_mask_t lp_advertising;
+	} link_modes;
+};
+
+/* helper function for ethtool_build_link_mode and ethtool_add_link_modes */
+static inline int
+__ethtool_add_link_modes(ethtool_link_mode_mask_t *dst,
+			 unsigned nindices,
+			 const enum ethtool_link_mode_bit_indices *indices) {
+	unsigned i;
+	int rv = 0;
+
+	for (i = 0 ; i < nindices ; ++i) {
+		if (__ETHTOOL_LINK_MODE_IS_VALID_BIT(indices[i]))
+
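The length handshake described in the changelog of this patch can be modelled in userspace C roughly as follows. This is only a sketch under stated assumptions: the struct layout, the SKETCH_* names and values, and the simulated kernel function are all hypothetical and are not the uapi definitions from the patch. It only mirrors the described behaviour: on a length mismatch the kernel side clears cmd and returns the expected length negated, and the user side then retries with that length.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants for illustration; not the real uapi values. */
#define SKETCH_CMD_GSETTINGS 1   /* any non-zero command id works here */
#define SKETCH_KERNEL_NWORDS 2   /* 32-bit words per mask in the fake kernel */
#define SKETCH_MAX_NWORDS    127 /* 127 * 32 = 4064 bits, as in the changelog */

struct sketch_settings {
	uint32_t cmd;    /* set to 0 by the kernel when the length is rejected */
	int8_t   nwords; /* in: proposed length; out: negative expected length */
	uint32_t speed;
	uint32_t masks[3 * SKETCH_MAX_NWORDS]; /* supported, advertising, lp */
};

/* Simulated kernel side of the handshake. */
static int sketch_kernel_gsettings(struct sketch_settings *s)
{
	if (s->nwords != SKETCH_KERNEL_NWORDS) {
		s->cmd = 0;
		s->nwords = -SKETCH_KERNEL_NWORDS;
		return 0;
	}
	s->cmd = SKETCH_CMD_GSETTINGS;
	s->speed = 10000;
	memset(s->masks, 0, sizeof(s->masks));
	s->masks[0] = 1u << 12; /* arbitrary bit in the "supported" mask */
	return 0;
}

/* User side: returns 0 once both sides agree on the length, -1 otherwise. */
static int sketch_user_query(struct sketch_settings *s)
{
	assert(s != NULL);

	memset(s, 0, sizeof(*s));
	s->cmd = SKETCH_CMD_GSETTINGS;
	s->nwords = 0; /* first pass: propose nothing, ask for the length */
	if (sketch_kernel_gsettings(s) < 0)
		return -1;
	if (s->cmd == SKETCH_CMD_GSETTINGS && s->nwords > 0)
		return 0; /* kernel agreed on the first try */
	if (s->nwords >= 0)
		return -1; /* no agreement and no length hint */

	/* second pass: retry with the length the kernel asked for */
	s->cmd = SKETCH_CMD_GSETTINGS;
	s->nwords = (int8_t)-s->nwords;
	if (sketch_kernel_gsettings(s) < 0)
		return -1;
	return (s->cmd == SKETCH_CMD_GSETTINGS && s->nwords > 0) ? 0 : -1;
}
```

A real tool would additionally fall back to the legacy query when the new command itself is not supported, as the changelog describes for "future" ethtool; that part is left out here to keep the sketch short.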
[PATCH net-next v3 01/17] net: usnic: remove unused call to ethtool_ops::get_settings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index f8e3211..5b60579 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -269,7 +269,6 @@ int usnic_ib_query_device(struct ib_device *ibdev,
 	struct usnic_ib_dev *us_ibdev = to_usdev(ibdev);
 	union ib_gid gid;
 	struct ethtool_drvinfo info;
-	struct ethtool_cmd cmd;
 	int qp_per_vf;
 
 	usnic_dbg("\n");
@@ -278,7 +277,6 @@ int usnic_ib_query_device(struct ib_device *ibdev,
 
 	mutex_lock(&us_ibdev->usdev_lock);
 	us_ibdev->netdev->ethtool_ops->get_drvinfo(us_ibdev->netdev, &info);
-	us_ibdev->netdev->ethtool_ops->get_settings(us_ibdev->netdev, &cmd);
 	memset(props, 0, sizeof(*props));
 	usnic_mac_ip_to_gid(us_ibdev->ufdev->mac, us_ibdev->ufdev->inaddr,
 			&gid.raw[0]);
-- 
2.6.0.rc2.230.g3dd15c0
[PATCH net-next v3 05/17] net: usnic: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index e082170..e0d12d4 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -324,12 +324,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
 				struct ib_port_attr *props)
 {
 	struct usnic_ib_dev *us_ibdev = to_usdev(ibdev);
-	struct ethtool_cmd cmd;
+	struct ethtool_ksettings cmd;
 
 	usnic_dbg("\n");
 
 	mutex_lock(&us_ibdev->usdev_lock);
-	__ethtool_get_settings(us_ibdev->netdev, &cmd);
+	__ethtool_get_ksettings(us_ibdev->netdev, &cmd);
 	memset(props, 0, sizeof(*props));
 
 	props->lid = 0;
@@ -353,8 +353,8 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
 	props->pkey_tbl_len = 1;
 	props->bad_pkey_cntr = 0;
 	props->qkey_viol_cntr = 0;
-	eth_speed_to_ib_speed(cmd.speed, &props->active_speed,
-			      &props->active_width);
+	eth_speed_to_ib_speed(cmd.parent.speed, &props->active_speed,
+			      &props->active_width);
 	props->max_mtu = IB_MTU_4096;
 	props->active_mtu = iboe_get_mtu(us_ibdev->ufdev->mtu);
 	/* Userspace will adjust for hdrs */
-- 
2.6.0.rc2.230.g3dd15c0
[PATCH net-next v3 02/17] net: usnic: use __ethtool_get_settings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 5b60579..e082170 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -329,7 +329,7 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
 	usnic_dbg("\n");
 
 	mutex_lock(&us_ibdev->usdev_lock);
-	us_ibdev->netdev->ethtool_ops->get_settings(us_ibdev->netdev, &cmd);
+	__ethtool_get_settings(us_ibdev->netdev, &cmd);
 	memset(props, 0, sizeof(*props));
 
 	props->lid = 0;
-- 
2.6.0.rc2.230.g3dd15c0
Re: gigaset: freeing an active object
On Mon, 2015-11-30 at 19:30 +0100, Tilman Schmidt wrote:
> I wonder how that will behave if someone attaches two of the devices to
> different serial ports. Not likely, but not forbidden either.

I see. Perhaps I should respin and use a pointer to a struct
platform_device in struct ser_cardstate, use the two-step approach of
platform_device_alloc() and friends, etc. Only slightly more complicated.

How would attaching two devices work with GIGASET_MINORS hardcoded to 1?
Because I haven't yet stumbled on the mechanism with which ttyGS1 (and up)
would then be created.

(I do have a second M105 in a box somewhere, so I could check myself what
happens when a second USB device is added, for what that's worth.)

Thanks,

Paul Bolle
Re: [PATCH 09/13] mm: memcontrol: generalize the socket accounting jump label
Hi,

On 11/24/2015 04:52 PM, Johannes Weiner wrote:
> The unified hierarchy memory controller is going to use this jump
> label as well to control the networking callbacks. Move it to the
> memory controller code and give it a more generic name.
>
> Signed-off-by: Johannes Weiner
> Acked-by: Michal Hocko
> Reviewed-by: Vladimir Davydov
> ---
>  include/linux/memcontrol.h | 4 ++++
>  include/net/sock.h         | 7 -------
>  mm/memcontrol.c            | 3 +++
>  net/core/sock.c            | 5 -----
>  net/ipv4/tcp_memcontrol.c  | 4 ++--
>  5 files changed, 9 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index d99fefe..dad56ef 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -681,6 +681,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
>
>  #if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
>  struct sock;
> +extern struct static_key memcg_sockets_enabled_key;
> +#define mem_cgroup_sockets_enabled static_key_false(&memcg_sockets_enabled_key)

We're trying to move to the updated API, so this should be:

static_branch_unlikely(&memcg_sockets_enabled_key)

see: include/linux/jump_label.h for details.

>  void sock_update_memcg(struct sock *sk);
>  void sock_release_memcg(struct sock *sk);
>  bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
> @@ -689,6 +691,8 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
>  {
>  	return memcg->tcp_mem.memory_pressure;
>  }
> +#else
> +#define mem_cgroup_sockets_enabled 0
>  #endif /* CONFIG_INET && CONFIG_MEMCG_KMEM */
>
>  #ifdef CONFIG_MEMCG_KMEM
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 1a94b85..fcc9442 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1065,13 +1065,6 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
>  #define sk_refcnt_debug_release(sk) do { } while (0)
>  #endif /* SOCK_REFCNT_DEBUG */
>
> -#if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_NET)
> -extern struct static_key memcg_socket_limit_enabled;
> -#define mem_cgroup_sockets_enabled static_key_false(&memcg_socket_limit_enabled)
> -#else
> -#define mem_cgroup_sockets_enabled 0
> -#endif
> -
>  static inline bool sk_stream_memory_free(const struct sock *sk)
>  {
>  	if (sk->sk_wmem_queued >= sk->sk_sndbuf)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 68d67fc..0602bee 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -291,6 +291,9 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
>  /* Writing them here to avoid exposing memcg's inner layout */
>  #if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
>
> +struct static_key memcg_sockets_enabled_key;

And this would be:

static DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);

>  void sock_update_memcg(struct sock *sk)
>  {
>  	struct mem_cgroup *memcg;
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 6486b0d..c5435b5 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -201,11 +201,6 @@ EXPORT_SYMBOL(sk_net_capable);
>  static struct lock_class_key af_family_keys[AF_MAX];
>  static struct lock_class_key af_family_slock_keys[AF_MAX];
>
> -#if defined(CONFIG_MEMCG_KMEM)
> -struct static_key memcg_socket_limit_enabled;
> -EXPORT_SYMBOL(memcg_socket_limit_enabled);
> -#endif
> -
>  /*
>   * Make lock validator output more readable. (we pre-construct these
>   * strings build-time, so that runtime initialization of socket
> diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
> index e507825..9a22e2d 100644
> --- a/net/ipv4/tcp_memcontrol.c
> +++ b/net/ipv4/tcp_memcontrol.c
> @@ -34,7 +34,7 @@ void tcp_destroy_cgroup(struct mem_cgroup *memcg)
>  		return;
>
>  	if (memcg->tcp_mem.active)
> -		static_key_slow_dec(&memcg_socket_limit_enabled);
> +		static_key_slow_dec(&memcg_sockets_enabled_key);

static_branch_dec(&memcg_sockets_enabled_key);

>  }
>
>  static int tcp_update_limit(struct mem_cgroup *memcg, unsigned long nr_pages)
> @@ -65,7 +65,7 @@ static int tcp_update_limit(struct mem_cgroup *memcg, unsigned long nr_pages)
>  		 * because when this value change, the code to process it is not
>  		 * patched in yet.
>  		 */
> -		static_key_slow_inc(&memcg_socket_limit_enabled);
> +		static_key_slow_inc(&memcg_sockets_enabled_key);
>  		memcg->tcp_mem.active = true;
>  	}
>

static_branch_inc(&memcg_sockets_enabled_key);

Thanks,

-Jason
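The calls suggested in this review (static_branch_unlikely, static_branch_inc, static_branch_dec) only exist inside the kernel, where the key patches code at run time. As a rough aid for following the discussion, here is a userspace model in C of the reference-counting behaviour only. The model_* names are invented for this sketch and are not the kernel API; the assumption being modelled is that each "inc" takes one reference, each "dec" drops one, and the branch reads as enabled while at least one reference is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a static key that starts out disabled. */
struct model_static_key {
	int enabled_count; /* number of outstanding "inc" references */
};

#define MODEL_STATIC_KEY_FALSE_INIT { 0 }

/* Models the branch test: true while at least one user enabled the key. */
static inline bool model_branch_unlikely(const struct model_static_key *key)
{
	return key->enabled_count > 0;
}

/* Models taking a reference, e.g. when a cgroup activates a socket limit. */
static inline void model_branch_inc(struct model_static_key *key)
{
	key->enabled_count++;
}

/* Models dropping a reference, e.g. when that cgroup is destroyed. */
static inline void model_branch_dec(struct model_static_key *key)
{
	assert(key->enabled_count > 0); /* unbalanced dec is a caller bug */
	key->enabled_count--;
}
```

In this model, two cgroups activating limits and one being destroyed leaves the branch enabled, which is why the patch pairs every increment in tcp_update_limit() with a decrement in tcp_destroy_cgroup().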
RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload
-----Original Message-----
From: David Miller [mailto:da...@davemloft.net]
Sent: Sunday, November 29, 2015 7:22 PM
To: t...@herbertland.com
Cc: Singhai, Anjali; netdev@vger.kernel.org; je...@kernel.org; Patil, Kiran
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

From: Tom Herbert
Date: Mon, 23 Nov 2015 13:53:44 -0800

> The bad effect of this model is that it is encourages HW vendors to
> continue implement HW protocol specific support for encapsulations, we
> get so much more benefit if they implement protocol generic
> mechanisms.

Dave, at least Intel parts have a protocol generic model for tunneled packet
offloads and hence we are able to extend our support to newer tunnel types.
We do not have protocol specific support in the HW, but since the udp based
tunnels do not have a packet type for the tunnel header, the HW needs to
know which udp port should be mapped to which specific encapsulation.
Otherwise encapsulated types like NVGRE we can identify through packet type
and program the HW to account for the header.

The newer patches for sure reduce the protocol ossification since it
communalizes all the different tunnels into one interface so that any
further support to a newer udp tunnel type requires just a type definition
and, if the driver/HW can support it, minor driver changes to set the right
bits for HW. No interface change for sure. And I think that is definitely a
step in the right direction.

+1
RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload
-----Original Message-----
From: David Miller [mailto:da...@davemloft.net]
Sent: Sunday, November 29, 2015 7:23 PM
To: t...@herbertland.com
Cc: Brandeburg, Jesse; Singhai, Anjali; je...@kernel.org; netdev@vger.kernel.org; Patil, Kiran
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

From: Tom Herbert
Date: Tue, 24 Nov 2015 09:32:11 -0800

>>
>> FWIW, I've brought the issue to the attention of the architects here,
>> and we will likely be able to make changes in this space. Intel
>> hardware (as demonstrated by your patches) already is able to deal
>> with this de-ossification on transmit. Receive is a whole different beast.
>>
> Please provide the specifics on why "Receive is a whole different
> beast.". Generic receive checksum is already a subset of the
> functionality that you must have implement to support the protocol
> specific offloads. All the hardware needs to do is calculate the 1's
> complement checksum of the packet and return the value on the to the
> host with that packet. That's it. No parsing of headers, no worrying
> about the pseudo header, no dealing with any encapsulation. Just do
> the calculation, return the result to the host and the driver converts
> this to CHECKSUM_COMPLETE. I find it very hard to believe that this is
> any harder than specific support the next protocol du jour.

The reason receive differs from transmit is that on the TX side the
driver can give the HW per-packet metadata: where the checksum field is
and what length needs to be checksummed. On Rx the HW parser has to
parse the packet to identify the tunnel type and, based on that, figure
out the checksum locations and length in the packet. So the HW
definitely has to parse the packet, and it can parse only based on next
header type information or, in the case of udp tunnels, based on a udp
port mapping to a particular protocol. I am not sure why you say it
doesn't need to parse the packet; maybe I am misunderstanding something.
Reducing protocol ossification on the RX side is not difficult, but it
is certainly different, and for udp-tunnels in particular it needs the
port to protocol mapping.

+1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
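For readers following the checksum argument, here is a minimal userspace sketch of what "calculate the 1's complement checksum of the packet" amounts to. It is not taken from any driver or from the patches under discussion; the function names csum_fold_bytes() and csum_add() are made up for illustration, and the 32-bit accumulator assumes inputs no larger than a normal packet (under 64 KiB). The point it tries to show is that the sum is computed over raw bytes with no knowledge of headers or tunnels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum a byte range as big-endian 16-bit words in 1's complement
 * arithmetic (RFC 1071 style) and fold the carries back in.
 * Nothing here looks at protocol headers: the input is just bytes.
 * Hypothetical helper; assumes len is below 64 KiB so the 32-bit
 * accumulator cannot overflow.
 */
static uint16_t csum_fold_bytes(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(data != NULL || len == 0);

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8; /* pad odd byte with zero */

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}

/*
 * Combine two partial sums (end-around carry). This is what lets
 * software adjust a whole-packet sum for the parts it cares about,
 * provided the pieces are split at an even offset.
 */
static uint16_t csum_add(uint16_t a, uint16_t b)
{
	uint32_t sum = (uint32_t)a + b;

	return (uint16_t)((sum & 0xffff) + (sum >> 16));
}
```

With the sample bytes 00 01 f2 03 f4 f5 f6 f7 from RFC 1071, csum_fold_bytes() gives 0xddf2, and adding the sums of the two 4-byte halves with csum_add() gives the same value. That additive property is what the host stack relies on when it is handed a whole-packet sum.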
Re: [PATCH 09/13] mm: memcontrol: generalize the socket accounting jump label
On Mon, Nov 30, 2015 at 04:08:18PM -0500, Jason Baron wrote:
> We're trying to move to the updated API, so this should be:
> static_branch_unlikely(&memcg_sockets_enabled_key)
>
> see: include/linux/jump_label.h for details.

Good point. There is another struct static_key in there as well. How
about the following on top of this series?

---
From b784aa0323628d43272e13a67ead2a2ce0e93ea6 Mon Sep 17 00:00:00 2001
From: Johannes Weiner
Date: Mon, 30 Nov 2015 16:41:38 -0500
Subject: [PATCH] mm: memcontrol: switch to the updated jump-label API

According to <linux/jump_label.h> the direct use of struct static_key is
deprecated. Update the socket and slab accounting code accordingly.

Reported-by: Jason Baron
Signed-off-by: Johannes Weiner
---
 include/linux/memcontrol.h |  8 ++++----
 mm/memcontrol.c            | 12 ++++++------
 net/ipv4/tcp_memcontrol.c  |  4 ++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a8df46c..9a19590 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -704,8 +704,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
 
 #ifdef CONFIG_INET
 struct sock;
-extern struct static_key memcg_sockets_enabled_key;
-#define mem_cgroup_sockets_enabled static_key_false(&memcg_sockets_enabled_key)
+extern struct static_key_false memcg_sockets_enabled_key;
+#define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
 void sock_update_memcg(struct sock *sk);
 void sock_release_memcg(struct sock *sk);
 bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
@@ -727,7 +727,7 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 #endif /* CONFIG_INET */
 
 #ifdef CONFIG_MEMCG_KMEM
-extern struct static_key memcg_kmem_enabled_key;
+extern struct static_key_false memcg_kmem_enabled_key;
 
 extern int memcg_nr_cache_ids;
 void memcg_get_cache_ids(void);
@@ -743,7 +743,7 @@ void memcg_put_cache_ids(void);
 
 static inline bool memcg_kmem_enabled(void)
 {
-	return static_key_false(&memcg_kmem_enabled_key);
+	return static_branch_unlikely(&memcg_kmem_enabled_key);
 }
 
 static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a0da91f..5fe45d68 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -346,7 +346,7 @@ void memcg_put_cache_ids(void)
  * conditional to this static branch, we'll have to allow modules that does
  * kmem_cache_alloc and the such to see this symbol as well
  */
-struct static_key memcg_kmem_enabled_key;
+DEFINE_STATIC_KEY_FALSE(memcg_kmem_enabled_key);
 EXPORT_SYMBOL(memcg_kmem_enabled_key);
 
 #endif /* CONFIG_MEMCG_KMEM */
@@ -2883,7 +2883,7 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg,
 	err = page_counter_limit(&memcg->kmem, nr_pages);
 	VM_BUG_ON(err);
 
-	static_key_slow_inc(&memcg_kmem_enabled_key);
+	static_branch_inc(&memcg_kmem_enabled_key);
 	/*
 	 * A memory cgroup is considered kmem-active as soon as it gets
 	 * kmemcg_id. Setting the id after enabling static branching will
@@ -3622,7 +3622,7 @@ static void memcg_destroy_kmem(struct mem_cgroup *memcg)
 {
 	if (memcg->kmem_acct_activated) {
 		memcg_destroy_kmem_caches(memcg);
-		static_key_slow_dec(&memcg_kmem_enabled_key);
+		static_branch_dec(&memcg_kmem_enabled_key);
 		WARN_ON(page_counter_read(&memcg->kmem));
 	}
 	tcp_destroy_cgroup(memcg);
@@ -4258,7 +4258,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 
 #ifdef CONFIG_INET
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
-		static_key_slow_inc(&memcg_sockets_enabled_key);
+		static_branch_inc(&memcg_sockets_enabled_key);
 #endif
 
 	/*
@@ -4302,7 +4302,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 	memcg_destroy_kmem(memcg);
 #ifdef CONFIG_INET
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
-		static_key_slow_dec(&memcg_sockets_enabled_key);
+		static_branch_dec(&memcg_sockets_enabled_key);
 #endif
 	__mem_cgroup_free(memcg);
 }
@@ -5494,7 +5494,7 @@ void mem_cgroup_replace_page(struct page *oldpage, struct page *newpage)
 
 #ifdef CONFIG_INET
 
-struct static_key memcg_sockets_enabled_key;
+DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);
 EXPORT_SYMBOL(memcg_sockets_enabled_key);
 
 void sock_update_memcg(struct sock *sk)
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index 9a22e2d..18bc7f7 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -34,7 +34,7 @@ void tcp_destroy_cgroup(struct mem_cgroup *memcg)
 		return;
 
 	if (memcg->tcp_mem.active)
-		static_key_slow_dec(&memcg_sockets_enabled_key);
+		static_branch_dec(&memcg_sockets_enabled_key);
 }
 
 static int
[3.19.y-ckt stable] Patch "can: Use correct type in sizeof() in nla_put()" has been added to staging queue
This is a note to let you know that I have just added a patch titled

    can: Use correct type in sizeof() in nla_put()

to the linux-3.19.y-queue branch of the 3.19.y-ckt extended stable tree
which can be found at:

    http://kernel.ubuntu.com/git/ubuntu/linux.git/log/?h=linux-3.19.y-queue

This patch is scheduled to be released in version 3.19.8-ckt11.

If you, or anyone else, feels it should not be added to this tree, please
reply to this email.

For more information about the 3.19.y-ckt tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Kamal

--

From 6226f4073973c9a2945e3c68bf04b0f8cc0c0793 Mon Sep 17 00:00:00 2001
From: Marek Vasut
Date: Fri, 30 Oct 2015 13:48:19 +0100
Subject: can: Use correct type in sizeof() in nla_put()

commit 562b103a21974c2f9cd67514d110f918bb3e1796 upstream.

The sizeof() is invoked on an incorrect variable, likely due to some
copy-paste error, and this might result in memory corruption. Fix this.

Signed-off-by: Marek Vasut
Cc: Wolfgang Grandegger
Cc: netdev@vger.kernel.org
Signed-off-by: Marc Kleine-Budde
Signed-off-by: Kamal Mostafa
---
 drivers/net/can/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 62ca0e8..8202ab3 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -912,7 +912,7 @@ static int can_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	     nla_put(skb, IFLA_CAN_BITTIMING_CONST,
 		     sizeof(*priv->bittiming_const), priv->bittiming_const)) ||
 
-	    nla_put(skb, IFLA_CAN_CLOCK, sizeof(cm), &priv->clock) ||
+	    nla_put(skb, IFLA_CAN_CLOCK, sizeof(priv->clock), &priv->clock) ||
 	    nla_put_u32(skb, IFLA_CAN_STATE, state) ||
 	    nla_put(skb, IFLA_CAN_CTRLMODE, sizeof(cm), &cm) ||
 	    nla_put_u32(skb, IFLA_CAN_RESTART_MS, priv->restart_ms) ||
--
1.9.1
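To illustrate the class of bug this patch fixes, here is a small userspace sketch. Everything in it is hypothetical: struct demo_clock, struct demo_ctrlmode, demo_put() and DEMO_PUT_OBJ() are stand-ins invented for the example, not the CAN or netlink API. It shows why a length taken with sizeof() from one variable and a pointer taken from another can disagree, and one way to keep the two tied together.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for two attribute payloads of different size. */
struct demo_clock { unsigned int freq; };
struct demo_ctrlmode { unsigned int mask; unsigned int flags; };

/*
 * Toy "append attribute" helper: copies len bytes from data into buf.
 * Returns the number of bytes copied, or 0 if the payload does not fit.
 * It trusts the caller to pass the true size of *data, which is exactly
 * where a copy-pasted sizeof() of the wrong variable goes wrong.
 */
static size_t demo_put(void *buf, size_t room, const void *data, size_t len)
{
	assert(buf != NULL && data != NULL);
	if (len > room)
		return 0;
	memcpy(buf, data, len);
	return len;
}

/* Derive pointer and length from the same object so they cannot drift apart. */
#define DEMO_PUT_OBJ(buf, room, obj) \
	demo_put((buf), (room), &(obj), sizeof(obj))
```

In this sketch, calling demo_put(buf, room, &clk, sizeof(cm)) with a struct demo_ctrlmode cm would read past the end of clk, because the two structs differ in size. Writing DEMO_PUT_OBJ(buf, room, clk) avoids naming the object twice.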
[PATCH net-next v3 12/17] net: 8021q: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 net/8021q/vlan_dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index fded865..e607fee 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -620,12 +620,12 @@ static netdev_features_t vlan_dev_fix_features(struct net_device *dev,
 	return features;
 }
 
-static int vlan_ethtool_get_settings(struct net_device *dev,
-				     struct ethtool_cmd *cmd)
+static int vlan_ethtool_get_ksettings(struct net_device *dev,
+				      struct ethtool_ksettings *cmd)
 {
 	const struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
 
-	return __ethtool_get_settings(vlan->real_dev, cmd);
+	return __ethtool_get_ksettings(vlan->real_dev, cmd);
 }
 
 static void vlan_ethtool_get_drvinfo(struct net_device *dev,
@@ -740,7 +740,7 @@ static int vlan_dev_get_iflink(const struct net_device *dev)
 }
 
 static const struct ethtool_ops vlan_ethtool_ops = {
-	.get_settings	= vlan_ethtool_get_settings,
+	.get_ksettings	= vlan_ethtool_get_ksettings,
 	.get_drvinfo	= vlan_ethtool_get_drvinfo,
 	.get_link	= ethtool_op_get_link,
 	.get_ts_info	= vlan_ethtool_get_ts_info,
--
2.6.0.rc2.230.g3dd15c0
[PATCH net-next v3 07/17] net: ipvlan: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/net/ipvlan/ipvlan_main.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index a9268db..63b3aa5 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -346,12 +346,12 @@ static const struct header_ops ipvlan_header_ops = {
 	.cache_update	= eth_header_cache_update,
 };
 
-static int ipvlan_ethtool_get_settings(struct net_device *dev,
-				       struct ethtool_cmd *cmd)
+static int ipvlan_ethtool_get_ksettings(struct net_device *dev,
+					struct ethtool_ksettings *cmd)
 {
 	const struct ipvl_dev *ipvlan = netdev_priv(dev);
 
-	return __ethtool_get_settings(ipvlan->phy_dev, cmd);
+	return __ethtool_get_ksettings(ipvlan->phy_dev, cmd);
 }
 
 static void ipvlan_ethtool_get_drvinfo(struct net_device *dev,
@@ -377,7 +377,7 @@ static void ipvlan_ethtool_set_msglevel(struct net_device *dev, u32 value)
 
 static const struct ethtool_ops ipvlan_ethtool_ops = {
 	.get_link	= ethtool_op_get_link,
-	.get_settings	= ipvlan_ethtool_get_settings,
+	.get_ksettings	= ipvlan_ethtool_get_ksettings,
 	.get_drvinfo	= ipvlan_ethtool_get_drvinfo,
 	.get_msglevel	= ipvlan_ethtool_get_msglevel,
 	.set_msglevel	= ipvlan_ethtool_set_msglevel,
--
2.6.0.rc2.230.g3dd15c0
[PATCH] ip neigh: device is optional for proxy entries
Though dumping such entries crashes present kernels.

Signed-off-by: Konstantin Khlebnikov
---
 ip/ipneigh.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/ip/ipneigh.c b/ip/ipneigh.c
index 54655842ed38..92b7cd6f2a75 100644
--- a/ip/ipneigh.c
+++ b/ip/ipneigh.c
@@ -100,8 +100,9 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
 		struct ndmsg	ndm;
 		char		buf[256];
 	} req;
-	char *d = NULL;
+	char *dev = NULL;
 	int dst_ok = 0;
+	int dev_ok = 0;
 	int lladdr_ok = 0;
 	char * lla = NULL;
 	inet_prefix dst;
@@ -135,10 +136,12 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
 				duparg("address", *argv);
 			get_addr(&dst, *argv, preferred_family);
 			dst_ok = 1;
+			dev_ok = 1;
 			req.ndm.ndm_flags |= NTF_PROXY;
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
-			d = *argv;
+			dev = *argv;
+			dev_ok = 1;
 		} else {
 			if (strcmp(*argv, "to") == 0) {
 				NEXT_ARG();
@@ -153,7 +156,7 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
 		}
 		argc--; argv++;
 	}
-	if (d == NULL || !dst_ok || dst.family == AF_UNSPEC) {
+	if (!dev_ok || !dst_ok || dst.family == AF_UNSPEC) {
 		fprintf(stderr, "Device and destination are required arguments.\n");
 		exit(-1);
 	}
@@ -175,8 +178,8 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
 
 	ll_init_map();
 
-	if ((req.ndm.ndm_ifindex = ll_name_to_index(d)) == 0) {
-		fprintf(stderr, "Cannot find device \"%s\"\n", d);
+	if (dev && (req.ndm.ndm_ifindex = ll_name_to_index(dev)) == 0) {
+		fprintf(stderr, "Cannot find device \"%s\"\n", dev);
 		return -1;
 	}
 
[PATCH net-next v3 00/17] RFC: new ETHTOOL_GSETTINGS/SSETTINGS API
From: David Decotigny

History:
  v3
  - rebased v2 on top of latest net-next, minor checkpatch/printf %*pb
    updates
  v2
  - keep return 0 in get_settings when successful, instead of
    propagating positive result from driver's get_settings callback.
  v1
  - original submission

The main goal of this series is to support ethtool link mode masks
larger than 32 bits. It implements a new ioctl pair
(ETHTOOL_GSETTINGS/SSETTINGS), its associated callbacks
(get/set_settings) and a new struct ethtool_settings, which should
eventually replace legacy ethtool_cmd. Internally, the kernel uses
fixed length link mode masks defined at compilation time in ethtool.h
(for now: 31 bits), that can be increased by changing
__ETHTOOL_LINK_MODE_LAST in ethtool.h (absolute max is 4064 bits,
checked at compile time), and the user/kernel interface allows this
length to be arbitrary within 1..4064. This should allow some
flexibility without using too much malloc/stack space, at the cost of a
small kernel/user handshake for the user to determine the sizes of
those bitmaps.

Along the way, I chose to drop in the new structure the 3 ethtool_cmd
fields marked "deprecated" (transceiver/maxrxpkt/maxtxpkt). They are
still available for old drivers via the old ETHTOOL_GSET/SSET API, but
are not available to drivers that switch to new API. Of those 3 fields,
ethtool_cmd::transceiver seems to be still actively used by several
drivers, maybe we should not consider this field deprecated? The 2
other fields are basically not used. This transition requires some care
in the way old and new ethtool talk to the kernel.

More technical details provided in the description for main patch. In
particular details about backward compatibility properties.

Some questions to more experts than me:

 - the kernel/interface multiplexes the "tell me the bitmap length"
   handshake and the "give me the settings" inside the new
   ETHTOOL_GSETTINGS cmd. I was thinking of making this into 2 separate
   cmds: 1 cmd ETHTOOL_GKERNELPROPERTIES which would be kernel-wide
   rather than device-specific, would return properties like "length of
   the link mode bitmaps", and possibly others. And ETHTOOL_GSETTINGS
   would expect the proper bitmaps

 - the link mode bitmaps are piggybacked at tail of the new struct
   ethtool_settings. Since its user-visible definition does not assume
   specific bitmap width, I am using a 0-length array as the publicly
   visible placeholder. But then, the kernel needs to specialize it
   (struct ethtool_ksettings) to specify its current link mode masks.
   This means that kernel code is "littered" with
   "ksettings->parent.field" to access "field" inside ethtool_settings:
    + I don't like the field name "parent", any suggestion welcome
    + and/or: I could use ethtool_settings everywhere (instead of a new
      ethtool_ksettings) and an accessor to retrieve the link mode
      masks?
    + or: we could decide to make the link mode masks statically
      bounded again, ie. make their width public, but larger than
      current 32, and unchangeable forever. This would make everything
      straightforward, but we might hit limits later, or have an
      unneeded memory/stack usage for unused bits.
   any preference?

 - crossing user/kernel boundary requires conversion of the kernel
   bitmaps (unsigned long[]) to something more strict (in my case: u32)
   to accommodate for 32/64 compat. Maybe I should add a
   copy_bitmap_from_user/copy_bitmap_to_user API inside bitmap.h
   instead of defining my own in ethtool.c?

 - I am using a typedef struct (ethtool_link_mode_mask_t) to build and
   hold the new masks. Makes it handy to use in the drivers (see mlx4
   for an example). Not very nice.

 - I foresee bugs where people use the legacy/deprecated SUPPORTED_x
   macros instead of the new ETHTOOL_LINK_MODE_x_BIT enums in the new
   get/set__ksettings callbacks. Not sure how to prevent problems with
   this.

The only driver which was converted for now is mlx4. I am not
considering fcoe as fully converted, but I updated it a minima to be
able to remove __ethtool_get_settings, now known as
__ethtool_get_ksettings.

Tested with legacy and "future" ethtool on 64b x86 kernel and 32+64b
ethtool, and on a 32b x86 kernel + 32b ethtool.

# Patch Set Summary:

David Decotigny (17):
  net: usnic: remove unused call to ethtool_ops::get_settings
  net: usnic: use __ethtool_get_settings
  net: ethtool: add new ETHTOOL_GSETTINGS/SSETTINGS API
  tx4939: use __ethtool_get_ksettings
  net: usnic: use __ethtool_get_ksettings
  net: bonding: use __ethtool_get_ksettings
  net: ipvlan: use __ethtool_get_ksettings
  net: macvlan: use __ethtool_get_ksettings
  net: team: use __ethtool_get_ksettings
  net: fcoe: use __ethtool_get_ksettings
  net: rdma: use __ethtool_get_ksettings
  net: 8021q: use __ethtool_get_ksettings
  net: bridge: use __ethtool_get_ksettings
  net: core: use __ethtool_get_ksettings
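On the cover letter's question about converting kernel bitmaps (unsigned long[]) to u32 for 32/64 compat, a minimal userspace sketch of such a conversion could look like the following. This is only an illustration under stated assumptions: demo_bitmap_to_u32() and DEMO_BITS_PER_LONG are hypothetical names, it is not the helper from the series, and it ignores the copy to or from user memory entirely.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Copy the first nbits bits of an unsigned long bitmap into an array
 * of 32-bit words, so the result has the same layout whether long is
 * 32 or 64 bits wide. dst must hold (nbits + 31) / 32 words; bits past
 * nbits in the last word are cleared. Hypothetical helper.
 */
static void demo_bitmap_to_u32(uint32_t *dst, const unsigned long *src,
			       size_t nbits)
{
	size_t nwords = (nbits + 31) / 32;
	size_t i;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < nwords; i++) {
		size_t bit = i * 32;
		unsigned long w = src[bit / DEMO_BITS_PER_LONG];

		/* shift is 0 on 32-bit longs, 0 or 32 on 64-bit longs */
		dst[i] = (uint32_t)(w >> (bit % DEMO_BITS_PER_LONG));
	}

	if (nbits % 32)
		dst[nwords - 1] &= (UINT32_C(1) << (nbits % 32)) - 1;
}
```

The word-by-word copy keeps bit N of the source at bit N % 32 of word N / 32 in the destination, which is the property a 32-bit userspace and a 64-bit kernel need to agree on.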
RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload
-----Original Message-----
From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Monday, November 30, 2015 8:36 AM
To: Singhai, Anjali
Cc: Linux Kernel Network Developers; Jesse Gross; Patil, Kiran
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

On Mon, Nov 23, 2015 at 1:02 PM, Anjali Singhai Jain wrote:
> Replace add/del ndo ops for vxlan_port with tunnel_port so that all
> UDP based tunnels can use the same ndo op. Add a parameter to pass
> tunnel type to the ndo_op.
>
Please consider using RX ntuple filters for this instead of a new ndo
op. The vxlan ndo op essentially implements a limited filter with a
rule to match a destination UDP port and the action of processing the
packet as vxlan. ntuple filters generalizes that so that the filtering
becomes arbitrary. We'll need the ability to filter on 4-tuple when we
implement tunnels to go through firewalls or for offloading other UDP
protocols such SPUD or QUIC.

Tom

- Tom, I am not sure I agree with this suggestion. The easiest way to
let the hardware know about the port to protocol mapping for udp-based
tunnels is at the point where we add udp offloads (GRO etc.) for the
ports in the stack. This way the user gets the benefit of tunnel
offloads from HW that supports it without having to do any extra filter
setup from ethtool. It is just like ip/tcp/udp checksum and TSO
support: the user does not have to turn this ON specifically if they
plan to use those protocols (of course they can turn it off). Besides,
these are not true filters in that sense. They are not used to guide
packets to any particular destination in this case; they are used to
identify packets for checksum and TSO purposes.

I agree with your patch series that reduces protocol ossification of
the stack and driver interface. My point is that this set of patches
helps with that goal and does not really hurt: supporting any new
tunnel would mean no change in the interface, just a new type in the
enum, and drivers can then decide to do the magic setup in the HW based
on this new type without ever having to touch the interface. So please
explain to me why this causes protocol ossification, because I don't
believe it does. I think the ntuple interface should remain for filters
that are used to route packets or drop them, not for packet
identification and checksum offload support.

> Change all drivers to use the generalized udp tunnel offload
>
> Patch was compile tested with x86_64_defconfig.
>
> Signed-off-by: Kiran Patil
> Signed-off-by: Anjali Singhai Jain
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 15 ++---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c        | 13 +---
>  drivers/net/ethernet/emulex/benet/be_main.c      | 14 +---
>  drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 27
>  drivers/net/ethernet/intel/i40e/i40e_main.c      | 41 +---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 17 +++---
>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c   | 21
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 17 +++---
>  drivers/net/vxlan.c                              | 23 +++--
>  include/linux/netdevice.h                        | 34 ++--
>  include/net/udp_tunnel.h                         |  6
>  11 files changed, 157 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index 2273576..ad2782f 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -47,6 +47,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -10124,11 +10125,14 @@ static void __bnx2x_add_vxlan_port(struct bnx2x *bp, u16 port)
>  }
>
>  static void bnx2x_add_vxlan_port(struct net_device *netdev,
> -				 sa_family_t sa_family, __be16 port)
> +				 sa_family_t sa_family, __be16 port,
> +				 u32 type)
>  {
>  	struct bnx2x *bp = netdev_priv(netdev);
>  	u16 t_port = ntohs(port);
>
> +	if (type != UDP_TUNNEL_VXLAN)
> +		return;
>  	__bnx2x_add_vxlan_port(bp, t_port);
>  }
>
> @@ -10152,11 +10156,14 @@ static void __bnx2x_del_vxlan_port(struct bnx2x *bp, u16 port)
>  }
>
>  static void bnx2x_del_vxlan_port(struct net_device *netdev,
> -				 sa_family_t sa_family, __be16 port)
> +				 sa_family_t sa_family, __be16 port,
> +				 u32 type)
>  {
>  	struct bnx2x *bp = netdev_priv(netdev);
>  	u16 t_port = ntohs(port);
>
> +	if (type !=
[PATCH] net/neighbour: fix crash at dumping device-agnostic proxy entries
Proxy entries could have null pointer to net-device.

Signed-off-by: Konstantin Khlebnikov
Fixes: 84920c1420e2 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
Cc: # v3.4
---
 net/core/neighbour.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index e6af42da28d9..f18ae91b652e 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2215,7 +2215,7 @@ static int pneigh_fill_info(struct sk_buff *skb, struct pneigh_entry *pn,
 	ndm->ndm_pad2    = 0;
 	ndm->ndm_flags	 = pn->flags | NTF_PROXY;
 	ndm->ndm_type	 = RTN_UNICAST;
-	ndm->ndm_ifindex = pn->dev->ifindex;
+	ndm->ndm_ifindex = pn->dev ? pn->dev->ifindex : 0;
 	ndm->ndm_state	 = NUD_NONE;
 
 	if (nla_put(skb, NDA_DST, tbl->key_len, pn->key))
@@ -2333,7 +2333,7 @@ static int pneigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
 		if (h > s_h)
 			s_idx = 0;
 		for (n = tbl->phash_buckets[h], idx = 0; n; n = n->next) {
-			if (dev_net(n->dev) != net)
+			if (pneigh_net(n) != net)
 				continue;
 			if (idx < s_idx)
 				goto next;
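The first hunk's pattern can be shown in isolation with a tiny userspace sketch. The types struct demo_dev and struct demo_pneigh and the function demo_pneigh_ifindex() are hypothetical stand-ins, not the kernel structures. The sketch assumes, as the commit message says, that a proxy entry may carry a NULL device pointer, and reports ifindex 0 in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a net device and a proxy neighbour entry. */
struct demo_dev {
	int ifindex;
};

struct demo_pneigh {
	struct demo_dev *dev;	/* NULL for a device-agnostic entry */
};

/*
 * Return the interface index to report for a proxy entry, using 0
 * when the entry is not bound to any device instead of dereferencing
 * a NULL pointer.
 */
static int demo_pneigh_ifindex(const struct demo_pneigh *pn)
{
	assert(pn != NULL);
	return pn->dev ? pn->dev->ifindex : 0;
}
```

A caller that previously wrote pn->dev->ifindex directly would crash on a device-agnostic entry; routing the access through one guarded helper keeps the check in a single place.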
[PATCH net-next v3 11/17] net: rdma: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 include/rdma/ib_addr.h | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h
index 1152859..1820f26 100644
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -254,24 +254,22 @@ static inline enum ib_mtu iboe_get_mtu(int mtu)
 
 static inline int iboe_get_rate(struct net_device *dev)
 {
-	struct ethtool_cmd cmd;
-	u32 speed;
+	struct ethtool_ksettings cmd;
 	int err;
 
 	rtnl_lock();
-	err = __ethtool_get_settings(dev, &cmd);
+	err = __ethtool_get_ksettings(dev, &cmd);
 	rtnl_unlock();
 	if (err)
 		return IB_RATE_PORT_CURRENT;
 
-	speed = ethtool_cmd_speed(&cmd);
-	if (speed >= 40000)
+	if (cmd.parent.speed >= 40000)
 		return IB_RATE_40_GBPS;
-	else if (speed >= 30000)
+	else if (cmd.parent.speed >= 30000)
 		return IB_RATE_30_GBPS;
-	else if (speed >= 20000)
+	else if (cmd.parent.speed >= 20000)
 		return IB_RATE_20_GBPS;
-	else if (speed >= 10000)
+	else if (cmd.parent.speed >= 10000)
 		return IB_RATE_10_GBPS;
 	else
 		return IB_RATE_PORT_CURRENT;
--
2.6.0.rc2.230.g3dd15c0
[PATCH net-next v3 09/17] net: team: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/net/team/team.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 651d35e..288ca01 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -2776,12 +2776,12 @@ static void __team_port_change_send(struct team_port *port, bool linkup)
 	port->state.linkup = linkup;
 	team_refresh_port_linkup(port);
 	if (linkup) {
-		struct ethtool_cmd ecmd;
+		struct ethtool_ksettings ecmd;
 
-		err = __ethtool_get_settings(port->dev, &ecmd);
+		err = __ethtool_get_ksettings(port->dev, &ecmd);
 		if (!err) {
-			port->state.speed = ethtool_cmd_speed(&ecmd);
-			port->state.duplex = ecmd.duplex;
+			port->state.speed = ecmd.parent.speed;
+			port->state.duplex = ecmd.parent.duplex;
 			goto send_event;
 		}
 	}
--
2.6.0.rc2.230.g3dd15c0
Re: [PATCH net] bridge: Only call /sbin/bridge-stp for the initial network namespace
On Mon, 30 Nov 2015 15:38:15 -0600 ebied...@xmission.com (Eric W. Biederman) wrote:

>
> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the same
> network device could cause problems.
>
> Signed-off-by: "Eric W. Biederman"
> ---
>  net/bridge/br_stp_if.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 5396ff08af32..742fa89528ab 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -142,7 +142,9 @@ static void br_stp_start(struct net_bridge *br)
>  	char *envp[] = { NULL };
>  	struct net_bridge_port *p;
>
> -	r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
> +	r = -ENOENT;
> +	if (dev_net(br->dev) == &init_net)
> +		r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);

I don't think this will cause loud screams.
But it might break people that use containers to run virtual networks
for testing.

One coding nit: Why are you afraid of using an else?
[PATCH net] bridge: Only call /sbin/bridge-stp for the initial network namespace
There is no defined mechanism to pass network namespace information
into /sbin/bridge-stp therefore don't even try to invoke it except
for bridge devices in the initial network namespace.

It is possible for unprivileged users to cause /sbin/bridge-stp to be
invoked for any network device name which if /sbin/bridge-stp does not
guard against unreasonable arguments or being invoked twice on the same
network device could cause problems.

Signed-off-by: "Eric W. Biederman"
---
 net/bridge/br_stp_if.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 5396ff08af32..742fa89528ab 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -142,7 +142,9 @@ static void br_stp_start(struct net_bridge *br)
 	char *envp[] = { NULL };
 	struct net_bridge_port *p;
 
-	r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
+	r = -ENOENT;
+	if (dev_net(br->dev) == &init_net)
+		r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
 
 	spin_lock_bh(&br->lock);
 
--
2.2.1
[PATCH net-next v3 08/17] net: macvlan: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/net/macvlan.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 06c8bfe..a95b793 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -940,12 +940,12 @@ static void macvlan_ethtool_get_drvinfo(struct net_device *dev,
 	strlcpy(drvinfo->version, "0.1", sizeof(drvinfo->version));
 }
 
-static int macvlan_ethtool_get_settings(struct net_device *dev,
-					struct ethtool_cmd *cmd)
+static int macvlan_ethtool_get_ksettings(struct net_device *dev,
+					 struct ethtool_ksettings *cmd)
 {
 	const struct macvlan_dev *vlan = netdev_priv(dev);
 
-	return __ethtool_get_settings(vlan->lowerdev, cmd);
+	return __ethtool_get_ksettings(vlan->lowerdev, cmd);
 }
 
 static netdev_features_t macvlan_fix_features(struct net_device *dev,
@@ -1020,7 +1020,7 @@ static int macvlan_dev_get_iflink(const struct net_device *dev)
 
 static const struct ethtool_ops macvlan_ethtool_ops = {
 	.get_link		= ethtool_op_get_link,
-	.get_settings		= macvlan_ethtool_get_settings,
+	.get_ksettings		= macvlan_ethtool_get_ksettings,
 	.get_drvinfo		= macvlan_ethtool_get_drvinfo,
 };
 
--
2.6.0.rc2.230.g3dd15c0
[PATCH net-next v3 10/17] net: fcoe: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/scsi/fcoe/fcoe_transport.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/fcoe/fcoe_transport.c b/drivers/scsi/fcoe/fcoe_transport.c
index d7597c0..9049197 100644
--- a/drivers/scsi/fcoe/fcoe_transport.c
+++ b/drivers/scsi/fcoe/fcoe_transport.c
@@ -93,36 +93,40 @@ static struct notifier_block libfcoe_notifier = {
 int fcoe_link_speed_update(struct fc_lport *lport)
 {
 	struct net_device *netdev = fcoe_get_netdev(lport);
-	struct ethtool_cmd ecmd;
+	struct ethtool_ksettings ecmd;
 
-	if (!__ethtool_get_settings(netdev, &ecmd)) {
+	if (!__ethtool_get_ksettings(netdev, &ecmd)) {
 		lport->link_supported_speeds &= ~(FC_PORTSPEED_1GBIT  |
 						  FC_PORTSPEED_10GBIT |
 						  FC_PORTSPEED_20GBIT |
 						  FC_PORTSPEED_40GBIT);
 
-		if (ecmd.supported & (SUPPORTED_1000baseT_Half |
-				      SUPPORTED_1000baseT_Full |
-				      SUPPORTED_1000baseKX_Full))
+		if (ecmd.link_modes.supported.mask[0] & (
+			    SUPPORTED_1000baseT_Half |
+			    SUPPORTED_1000baseT_Full |
+			    SUPPORTED_1000baseKX_Full))
 			lport->link_supported_speeds |= FC_PORTSPEED_1GBIT;
 
-		if (ecmd.supported & (SUPPORTED_10000baseT_Full |
-				      SUPPORTED_10000baseKX4_Full |
-				      SUPPORTED_10000baseKR_Full |
-				      SUPPORTED_10000baseR_FEC))
+		if (ecmd.link_modes.supported.mask[0] & (
+			    SUPPORTED_10000baseT_Full |
+			    SUPPORTED_10000baseKX4_Full |
+			    SUPPORTED_10000baseKR_Full |
+			    SUPPORTED_10000baseR_FEC))
 			lport->link_supported_speeds |= FC_PORTSPEED_10GBIT;
 
-		if (ecmd.supported & (SUPPORTED_20000baseMLD2_Full |
-				      SUPPORTED_20000baseKR2_Full))
+		if (ecmd.link_modes.supported.mask[0] & (
+			    SUPPORTED_20000baseMLD2_Full |
+			    SUPPORTED_20000baseKR2_Full))
 			lport->link_supported_speeds |= FC_PORTSPEED_20GBIT;
 
-		if (ecmd.supported & (SUPPORTED_40000baseKR4_Full |
-				      SUPPORTED_40000baseCR4_Full |
-				      SUPPORTED_40000baseSR4_Full |
-				      SUPPORTED_40000baseLR4_Full))
+		if (ecmd.link_modes.supported.mask[0] & (
+			    SUPPORTED_40000baseKR4_Full |
+			    SUPPORTED_40000baseCR4_Full |
+			    SUPPORTED_40000baseSR4_Full |
+			    SUPPORTED_40000baseLR4_Full))
 			lport->link_supported_speeds |= FC_PORTSPEED_40GBIT;
 
-		switch (ethtool_cmd_speed(&ecmd)) {
+		switch (ecmd.parent.speed) {
 		case SPEED_1000:
 			lport->link_speed = FC_PORTSPEED_1GBIT;
 			break;
--
2.6.0.rc2.230.g3dd15c0
[PATCH net-next] sfc: use ALIGN macro for aligning frame sizes
Don't open-code it.

CC: Solarflare linux maintainers
CC: Shradha Shah
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson
---
 drivers/net/ethernet/sfc/net_driver.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index a8ddd12..746d591 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -1502,8 +1502,9 @@ static inline struct efx_rx_buffer *efx_rx_buffer(struct efx_rx_queue *rx_queue,
  * same cycle, the XMAC can miss the IPG altogether.  We work around
  * this by adding a further 16 bytes.
  */
+#define EFX_FRAME_PAD	16
 #define EFX_MAX_FRAME_LEN(mtu) \
-	((((mtu) + ETH_HLEN + VLAN_HLEN + 4/* FCS */ + 7) & ~7) + 16)
+	(ALIGN(((mtu) + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN + EFX_FRAME_PAD), 8))
 
 static inline bool efx_xmit_with_hwtstamp(struct sk_buff *skb)
 {
--
1.8.3.1
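Note that the patch moves the 16-byte pad inside the alignment instead of adding it afterwards. A minimal userspace sketch, under the assumption that the open-coded form is "round up to 8, then add 16" and that the header lengths are 14 (Ethernet), 4 (VLAN) and 4 (FCS), can check that the two forms agree. All DEMO_* names and demo_* functions below are invented for this check and are not kernel definitions.

```c
#include <assert.h>

/* Userspace stand-ins; the values are assumptions for this sketch. */
#define DEMO_ETH_HLEN		14
#define DEMO_VLAN_HLEN		4
#define DEMO_ETH_FCS_LEN	4
#define DEMO_FRAME_PAD		16

/* Round x up to a multiple of a; a must be a power of two. */
#define DEMO_ALIGN(x, a)	(((x) + ((a) - 1)) & ~((a) - 1))

/* Open-coded form: align to 8 first, then add the 16-byte pad. */
static unsigned int demo_frame_len_open_coded(unsigned int mtu)
{
	return ((mtu + DEMO_ETH_HLEN + DEMO_VLAN_HLEN + 4 /* FCS */ + 7) & ~7u)
		+ 16;
}

/* ALIGN-based form: add the pad first, then align the total to 8. */
static unsigned int demo_frame_len_aligned(unsigned int mtu)
{
	return DEMO_ALIGN(mtu + DEMO_ETH_HLEN + DEMO_VLAN_HLEN +
			  DEMO_ETH_FCS_LEN + DEMO_FRAME_PAD, 8u);
}

/* Assert that both forms agree for every MTU from 0 to max_mtu. */
static void demo_check_equivalence(unsigned int max_mtu)
{
	unsigned int mtu;

	for (mtu = 0; mtu <= max_mtu; mtu++)
		assert(demo_frame_len_open_coded(mtu) ==
		       demo_frame_len_aligned(mtu));
}
```

The two agree because the pad (16) is itself a multiple of the alignment (8), so adding it before or after rounding up gives the same result. For an MTU of 1500 both forms yield 1544.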
[PATCH net-next v3 15/17] net: ethtool: remove unused __ethtool_get_settings
From: David Decotignyreplaced by __ethtool_get_ksettings. Signed-off-by: David Decotigny --- include/linux/ethtool.h | 4 net/core/ethtool.c | 49 ++--- 2 files changed, 14 insertions(+), 39 deletions(-) diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 6de122d..7de2dc7 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -161,10 +161,6 @@ __ethtool_add_link_modes(ethtool_link_mode_mask_t *dst, extern int __ethtool_get_ksettings(struct net_device *dev, struct ethtool_ksettings *ksettings); -/* DEPRECATED, use __ethtool_get_ksettings */ -extern int __ethtool_get_settings(struct net_device *dev, - struct ethtool_cmd *cmd); - /** * struct ethtool_ops - optional netdev operations * @get_settings: DEPRECATED, use %get_ksettings/%set_ksettings diff --git a/net/core/ethtool.c b/net/core/ethtool.c index 4563f95..b67f079 100644 --- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -499,15 +499,16 @@ int __ethtool_get_ksettings(struct net_device *dev, return dev->ethtool_ops->get_ksettings(dev, ksettings); } - /* TODO: remove what follows when ethtool_ops::get_settings -* disappears internally -*/ - /* driver doesn't support %ethtool_ksettings API. revert to * legacy %ethtool_cmd API, unless it's not supported either. * TODO: remove when ethtool_ops::get_settings disappears internally */ - err = __ethtool_get_settings(dev, ); + if (!dev->ethtool_ops->get_settings) + return -EOPNOTSUPP; + + memset(, 0, sizeof(cmd)); + cmd.cmd = ETHTOOL_GSET; + err = dev->ethtool_ops->get_settings(dev, ); if (err < 0) return err; @@ -723,30 +724,6 @@ static int ethtool_set_ksettings(struct net_device *dev, void __user *useraddr) return dev->ethtool_ops->set_ksettings(dev, ); } -/* Internal kernel helper to query a device ethtool_cmd settings. 
- * - * Note about transition to ethtool_settings API: We do not need (or - * want) this function to support "dev" instances that implement the - * ethtool_settings API as we will update the drivers calling this - * function to call __ethtool_get_ksettings instead, before the first - * drivers implement ethtool_ops::get_ksettings. - * - * TODO 1: at least make this function static when no driver is using it - * TODO 2: remove when ethtool_ops::get_settings disappears internally - */ -int __ethtool_get_settings(struct net_device *dev, struct ethtool_cmd *cmd) -{ - ASSERT_RTNL(); - - if (!dev->ethtool_ops->get_settings) - return -EOPNOTSUPP; - - memset(cmd, 0, sizeof(struct ethtool_cmd)); - cmd->cmd = ETHTOOL_GSET; - return dev->ethtool_ops->get_settings(dev, cmd); -} -EXPORT_SYMBOL(__ethtool_get_settings); - /* Query device for its ethtool_cmd settings. * * Backward compatibility note: for compatibility with legacy ethtool, @@ -788,16 +765,18 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr) /* send a sensible cmd tag back to user */ cmd.cmd = ETHTOOL_GSET; } else { - int err; - /* TODO: return -EOPNOTSUPP when -* ethtool_ops::get_settings disappears internally -*/ - /* driver doesn't support %ethtool_ksettings * API. revert to legacy %ethtool_cmd API, unless it's * not supported either. */ - err = __ethtool_get_settings(dev, ); + int err; + + if (!dev->ethtool_ops->get_settings) + return -EOPNOTSUPP; + + memset(, 0, sizeof(cmd)); + cmd.cmd = ETHTOOL_GSET; + err = dev->ethtool_ops->get_settings(dev, ); if (err < 0) return err; } -- 2.6.0.rc2.230.g3dd15c0 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
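The fallback logic in the hunks above (prefer the new callback, otherwise use the legacy one, otherwise report that the operation is unsupported) can be sketched in userspace as follows. Every demo_ name here is hypothetical and the two structs are heavily simplified; this is only meant to illustrate the control flow, not the real ethtool types or the real conversion between them.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the two settings layouts. */
struct demo_legacy_cmd { unsigned int speed; unsigned char duplex; };
struct demo_ksettings  { unsigned int speed; unsigned char duplex; };

/* Hypothetical ops table: a driver may provide either callback, both, or none. */
struct demo_ops {
	int (*get_ksettings)(struct demo_ksettings *ks);
	int (*get_settings)(struct demo_legacy_cmd *cmd);
};

/*
 * Prefer the new callback. If it is missing, fall back to the legacy one
 * and convert the result. If neither exists, return -EOPNOTSUPP.
 */
static int demo_get_ksettings(const struct demo_ops *ops,
			      struct demo_ksettings *ks)
{
	struct demo_legacy_cmd cmd;
	int err;

	if (ops->get_ksettings)
		return ops->get_ksettings(ks);

	if (!ops->get_settings)
		return -EOPNOTSUPP;

	memset(&cmd, 0, sizeof(cmd));
	err = ops->get_settings(&cmd);
	if (err < 0)
		return err;

	ks->speed = cmd.speed;
	ks->duplex = cmd.duplex;
	return 0;
}

/* Example legacy-only driver callback: 1000 Mb/s, full duplex. */
static int demo_legacy_get(struct demo_legacy_cmd *cmd)
{
	cmd->speed = 1000;
	cmd->duplex = 1;
	return 0;
}
```

Folding the legacy check into the single caller, as the patch does, means the exported helper is no longer needed once every in-kernel user has been converted.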
[PATCH net-next v3 16/17] net: mlx4: convenience predicate for debug messages
From: David Decotigny

Signed-off-by: David Decotigny
---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 35de7d2..b04054d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -740,9 +740,11 @@ __printf(3, 4)
 void en_print(const char *level, const struct mlx4_en_priv *priv,
 	      const char *format, ...);
 
+#define en_dbg_enabled(mlevel, priv)			\
+	(NETIF_MSG_##mlevel & (priv)->msg_enable)
 #define en_dbg(mlevel, priv, format, ...)		\
 do {							\
-	if (NETIF_MSG_##mlevel & (priv)->msg_enable)	\
+	if (en_dbg_enabled(mlevel, priv))		\
 		en_print(KERN_DEBUG, priv, format, ##__VA_ARGS__); \
 } while (0)
 #define en_warn(priv, format, ...)			\
-- 
2.6.0.rc2.230.g3dd15c0
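The patch carries no description beyond its subject, so the following is only a guess at how such a predicate might be used: a caller can test the message level once and skip a whole block of debug-only work. All DEMO_/demo_ names are hypothetical userspace stand-ins; the bit values mirror the low NETIF_MSG_* bits.

```c
#include <stdio.h>

/* Stand-in message-level bits (hypothetical names). */
#define DEMO_MSG_DRV	0x0001
#define DEMO_MSG_PROBE	0x0002
#define DEMO_MSG_LINK	0x0004

struct demo_priv { unsigned int msg_enable; };

/* Predicate split out of the logging macro, in the style of the patch. */
#define demo_dbg_enabled(mlevel, priv) \
	(DEMO_MSG_##mlevel & (priv)->msg_enable)

/* Logging macro built on the predicate. */
#define demo_dbg(mlevel, priv, ...)			\
do {							\
	if (demo_dbg_enabled(mlevel, priv))		\
		fprintf(stderr, __VA_ARGS__);		\
} while (0)

/* Walks the table only when LINK debugging is on; returns entries visited. */
static int demo_dump_links(const struct demo_priv *priv,
			   const int *speeds, int n)
{
	int i, visited = 0;

	if (!demo_dbg_enabled(LINK, priv))
		return 0;	/* nothing would be printed, skip the loop */

	for (i = 0; i < n; i++) {
		demo_dbg(LINK, priv, "port %d: %d Mb/s\n", i, speeds[i]);
		visited++;
	}
	return visited;
}
```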
[PATCH net-next v3 13/17] net: bridge: use __ethtool_get_ksettings
From: David Decotigny

Signed-off-by: David Decotigny
---
 net/bridge/br_if.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index ec02f58..e6de008 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -36,10 +36,10 @@
  */
 static int port_cost(struct net_device *dev)
 {
-	struct ethtool_cmd ecmd;
+	struct ethtool_ksettings ecmd;
 
-	if (!__ethtool_get_settings(dev, &ecmd)) {
-		switch (ethtool_cmd_speed(&ecmd)) {
+	if (!__ethtool_get_ksettings(dev, &ecmd)) {
+		switch (ecmd.parent.speed) {
 		case SPEED_10000:
 			return 2;
 		case SPEED_1000:
-- 
2.6.0.rc2.230.g3dd15c0
[PATCH net-next v3 14/17] net: core: use __ethtool_get_ksettings
From: David DecotignySigned-off-by: David Decotigny --- net/core/net-sysfs.c | 15 +-- net/packet/af_packet.c | 11 +-- 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index f88a62a..3dd4bb1 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -199,9 +199,10 @@ static ssize_t speed_show(struct device *dev, return restart_syscall(); if (netif_running(netdev)) { - struct ethtool_cmd cmd; - if (!__ethtool_get_settings(netdev, )) - ret = sprintf(buf, fmt_dec, ethtool_cmd_speed()); + struct ethtool_ksettings cmd; + + if (!__ethtool_get_ksettings(netdev, )) + ret = sprintf(buf, fmt_dec, cmd.parent.speed); } rtnl_unlock(); return ret; @@ -218,10 +219,12 @@ static ssize_t duplex_show(struct device *dev, return restart_syscall(); if (netif_running(netdev)) { - struct ethtool_cmd cmd; - if (!__ethtool_get_settings(netdev, )) { + struct ethtool_ksettings cmd; + + if (!__ethtool_get_ksettings(netdev, )) { const char *duplex; - switch (cmd.duplex) { + + switch (cmd.parent.duplex) { case DUPLEX_HALF: duplex = "half"; break; diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 1cf928f..8847dad 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -557,9 +557,8 @@ static int prb_calc_retire_blk_tmo(struct packet_sock *po, { struct net_device *dev; unsigned int mbits = 0, msec = 0, div = 0, tmo = 0; - struct ethtool_cmd ecmd; + struct ethtool_ksettings ecmd; int err; - u32 speed; rtnl_lock(); dev = __dev_get_by_index(sock_net(>sk), po->ifindex); @@ -567,19 +566,19 @@ static int prb_calc_retire_blk_tmo(struct packet_sock *po, rtnl_unlock(); return DEFAULT_PRB_RETIRE_TOV; } - err = __ethtool_get_settings(dev, ); - speed = ethtool_cmd_speed(); + err = __ethtool_get_ksettings(dev, ); rtnl_unlock(); if (!err) { /* * If the link speed is so slow you don't really * need to worry about perf anyways */ - if (speed < SPEED_1000 || speed == SPEED_UNKNOWN) { + if (ecmd.parent.speed < SPEED_1000 || 
+ ecmd.parent.speed == SPEED_UNKNOWN) { return DEFAULT_PRB_RETIRE_TOV; } else { msec = 1; - div = speed / 1000; + div = ecmd.parent.speed / 1000; } } -- 2.6.0.rc2.230.g3dd15c0 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow
On 11/30/2015 07:13 PM, Alexei Starovoitov wrote:
On Mon, Nov 30, 2015 at 03:34:35PM +0100, Daniel Borkmann wrote:

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 3f4c99e06c6b..b1e53b79c586 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	    attr->value_size == 0)
 		return ERR_PTR(-EINVAL);
 
+	if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
+		/* if value_size is bigger, the user space won't be able to
+		 * access the elements.
+		 */
+		return ERR_PTR(-E2BIG);
+

Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN already and if that fails, we fall back to vzalloc(), it shouldn't trigger memory allocation warnings here ...

not quite, the above check is for kmalloc-s in syscall.c

Ok, I see. The check and comment is related to the fact that when we do bpf(2) syscall to lookup an element: We call map_lookup_elem(), which does kmalloc() on the value_size. So an individual entry lookup could fail with kmalloc() there, unrelated to an individual map implementation.

kmalloc with order >= MAX_ORDER warning can be seen in syscall for update/lookup commands regardless of map implementation. So the maps with "value_size >= 1 << (KMALLOC_SHIFT_MAX - 1)" were not accessible from user space anyway. This check in arraymap.c fixes the warning and prevents creation of such maps in the first place as the comment right below it says.

Yeah, right. Noticed that later on. It was a bit confusing at first as I didn't parse that clearly from the commit message itself.

Similar check in hashmap.c fixes warning, prevents abnormal map creation and fixes integer overflow which is the most dangerous of them all. The check in arraymap.c

-attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
+attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)

fixes potential integer overflow in map.pages computation. and similar check in hashtab.c:

(u64) htab->elem_size * htab->map.max_entries >= U32_MAX - PAGE_SIZE

fixes integer overflow in map.pages as well.

Yep, got that part.

the 'value_size >= (1 << (KMALLOC_SHIFT_MAX - 1)) - MAX_BPF_STACK - sizeof(struct htab_elem)' check in hashmap.c fixes integer overflow in elem_size and makes elem_size kmalloc-able later in htab_map_update_elem(). Since it wasn't obvious that this one 'if' addresses these multiple issues, I've added a comment there.

... and the MAX_BPF_STACK stands for the maximum key part here, okay. So, when creating a sufficiently large map where map->key_size + map->value_size would be > MAX_BPF_STACK (but map->key_size still <= MAX_BPF_STACK), we can only read the map from an eBPF program, but not update it. In such cases, updates could only happen from user space application.

Addition of __GFP_NOWARN only fixes OOM warning as commit log says.

That's obvious, too.

Hmm, seems this patch fixes many things at once, maybe makes sense to split it?

hmm I don't see a point of changing the same single line over multiple patches.

The split won't help backporting, but rather makes for more patches to deal with.
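To illustrate the integer-overflow point discussed above, here is a small userspace sketch of the division-based bound next to the naive 32-bit multiplication it guards against. The page size and the 64-byte header (standing in for sizeof(*array)) are assumptions made for this example, and the demo_ names are hypothetical.

```c
#include <stdint.h>

/* Assumptions for this sketch only. */
#define DEMO_PAGE_SIZE	4096u
#define DEMO_HDR_SIZE	64u	/* stand-in for sizeof(*array) */

/* Naive 32-bit size computation: silently wraps for large inputs. */
static uint32_t demo_naive_size(uint32_t max_entries, uint32_t elem_size)
{
	return DEMO_HDR_SIZE + max_entries * elem_size;
}

/*
 * Division-based bound in the style of the arraymap.c check: returns 1
 * only if header + entries, rounded up to a whole page, fits in 32 bits.
 */
static int demo_size_fits(uint32_t max_entries, uint32_t elem_size)
{
	if (elem_size == 0)
		return 0;
	return max_entries <=
	       (UINT32_MAX - DEMO_PAGE_SIZE - DEMO_HDR_SIZE) / elem_size;
}
```

With max_entries = 0x20000000 and elem_size = 8 the product is exactly 2^32, so the naive computation wraps and reports only the header size, while the bound check rejects the request. Dividing the limit instead of multiplying the inputs keeps every intermediate value within 32 bits.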
Re: [PATCH v2] ath6kl: Use vmalloc for loading firmware using api1 method and use kvfree
Kalle Valo writes:

> Brent Taylor writes:
>
>> Signed-off-by: Brent Taylor
>>
>> ath6kl: Use vmalloc for loading firmware using api1 method and free using
>> kvfree
>> ---
>> Changes v1 -> v2:
>>    - simplify memory allocation
>>    - use kvfree
>
> Why? The commit log should _always_ answer that. Are you fixing a bug
> (what bug exactly?), is this just cleanup or what?
>
> And the commit log is wrongly formatted anyway, the Signed-off-by line
> should be the last and there should be no "ath6kl:" string in the commit
> log (just in the title). Use 'git log' to find examples.

Fixing netdev address (kenrel -> kernel)

-- 
Kalle Valo
[PATCH v2 net-next 0/2] Basic support for Solarflare 8000 series NICs
The upcoming Solarflare 8000 series 10G/40G network card supports a similar
interface to the current 7000 series cards. This patch series provides basic
support for these cards, making no use of any new functionality.

v2: fix indenting in ef10.c in patch 1/2.

Bert Kenward (2):
  sfc: make TSO version a per-queue parameter
  sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC

 drivers/net/ethernet/sfc/ef10.c       | 13 ++---
 drivers/net/ethernet/sfc/efx.c        |  6 ++
 drivers/net/ethernet/sfc/net_driver.h |  2 ++
 drivers/net/ethernet/sfc/tx.c         |  8 ++--
 4 files changed, 20 insertions(+), 9 deletions(-)

-- 
2.4.3
[PATCH v2 net-next 2/2] sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC
Also add support for 7000 series 40G NIC VF.

Signed-off-by: Bert Kenward
---
 drivers/net/ethernet/sfc/efx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 4e82bcf..b405349 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -2784,6 +2784,12 @@ static const struct pci_device_id efx_pci_table[] = {
 	 .driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
 	{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x0923),	/* SFC9140 PF */
 	 .driver_data = (unsigned long) &efx_hunt_a0_nic_type},
+	{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x1923),	/* SFC9140 VF */
+	 .driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
+	{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x0a03),	/* SFC9220 PF */
+	 .driver_data = (unsigned long) &efx_hunt_a0_nic_type},
+	{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x1a03),	/* SFC9220 VF */
+	 .driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
 	{0}			/* end of list */
 };
 
-- 
2.4.3
[PATCH v2 net-next 1/2] sfc: make TSO version a per-queue parameter
The Solarflare 8000 series NIC will use a new TSO scheme. The current driver refuses to load if the current TSO scheme is not found. Remove that check and instead make the TSO version a per-queue parameter. Signed-off-by: Bert Kenward--- drivers/net/ethernet/sfc/ef10.c | 13 ++--- drivers/net/ethernet/sfc/net_driver.h | 2 ++ drivers/net/ethernet/sfc/tx.c | 8 ++-- 3 files changed, 14 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c index bc6d21b..425df3d 100644 --- a/drivers/net/ethernet/sfc/ef10.c +++ b/drivers/net/ethernet/sfc/ef10.c @@ -181,13 +181,6 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx) MCDI_WORD(outbuf, GET_CAPABILITIES_OUT_TX_DPCPU_FW_ID); if (!(nic_data->datapath_caps & - (1 << MC_CMD_GET_CAPABILITIES_OUT_TX_TSO_LBN))) { - netif_err(efx, drv, efx->net_dev, - "current firmware does not support TSO\n"); - return -ENODEV; - } - - if (!(nic_data->datapath_caps & (1 << MC_CMD_GET_CAPABILITIES_OUT_RX_PREFIX_LEN_14_LBN))) { netif_err(efx, probe, efx->net_dev, "current firmware does not support an RX prefix\n"); @@ -1797,6 +1790,12 @@ static void efx_ef10_tx_init(struct efx_tx_queue *tx_queue) ESF_DZ_TX_OPTION_UDP_TCP_CSUM, csum_offload, ESF_DZ_TX_OPTION_IP_CSUM, csum_offload); tx_queue->write_count = 1; + + if (nic_data->datapath_caps & + (1 << MC_CMD_GET_CAPABILITIES_OUT_TX_TSO_LBN)) { + tx_queue->tso_version = 1; + } + wmb(); efx_ef10_push_tx_desc(tx_queue, txd); diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h index a8ddd12..5c0d0ba 100644 --- a/drivers/net/ethernet/sfc/net_driver.h +++ b/drivers/net/ethernet/sfc/net_driver.h @@ -182,6 +182,7 @@ struct efx_tx_buffer { * * @efx: The associated Efx NIC * @queue: DMA queue number + * @tso_version: Version of TSO in use for this queue. 
* @channel: The associated channel * @core_txq: The networking core TX queue structure * @buffer: The software buffer ring @@ -228,6 +229,7 @@ struct efx_tx_queue { /* Members which don't change on the fast path */ struct efx_nic *efx cacheline_aligned_in_smp; unsigned queue; + unsigned int tso_version; struct efx_channel *channel; struct netdev_queue *core_txq; struct efx_tx_buffer *buffer; diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c index 67f6afa..f7a0ec1 100644 --- a/drivers/net/ethernet/sfc/tx.c +++ b/drivers/net/ethernet/sfc/tx.c @@ -1010,13 +1010,17 @@ static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue, /* Parse the SKB header and initialise state. */ static int tso_start(struct tso_state *st, struct efx_nic *efx, +struct efx_tx_queue *tx_queue, const struct sk_buff *skb) { - bool use_opt_desc = efx_nic_rev(efx) >= EFX_REV_HUNT_A0; struct device *dma_dev = >pci_dev->dev; unsigned int header_len, in_len; + bool use_opt_desc = false; dma_addr_t dma_addr; + if (tx_queue->tso_version == 1) + use_opt_desc = true; + st->ip_off = skb_network_header(skb) - skb->data; st->tcp_off = skb_transport_header(skb) - skb->data; header_len = st->tcp_off + (tcp_hdr(skb)->doff << 2u); @@ -1271,7 +1275,7 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue, /* Find the packet protocol and sanity-check it */ state.protocol = efx_tso_check_protocol(skb); - rc = tso_start(, efx, skb); + rc = tso_start(, efx, tx_queue, skb); if (rc) goto mem_err; -- 2.4.3 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] vhost: replace % with & on data path
Hi Michael, [auto build test ERROR on: v4.4-rc3] [also build test ERROR on: next-20151127] url: https://github.com/0day-ci/linux/commits/Michael-S-Tsirkin/vhost-replace-with-on-data-path/20151130-163704 config: i386-randconfig-s1-201548 (attached as .config) reproduce: # save the attached .config to linux build tree make ARCH=i386 All error/warnings (new ones prefixed by >>): drivers/vhost/vhost.c: In function 'vhost_get_vq_desc': >> drivers/vhost/vhost.c:1345:6: warning: unused variable 'ret' >> [-Wunused-variable] int ret; ^ >> drivers/vhost/vhost.c:1344:13: warning: unused variable 'ring_head' >> [-Wunused-variable] __virtio16 ring_head; ^ >> drivers/vhost/vhost.c:1341:24: warning: unused variable 'found' >> [-Wunused-variable] unsigned int i, head, found = 0; ^ >> drivers/vhost/vhost.c:1341:18: warning: unused variable 'head' >> [-Wunused-variable] unsigned int i, head, found = 0; ^ >> drivers/vhost/vhost.c:1341:15: warning: unused variable 'i' >> [-Wunused-variable] unsigned int i, head, found = 0; ^ >> drivers/vhost/vhost.c:1340:20: warning: unused variable 'desc' >> [-Wunused-variable] struct vring_desc desc; ^ drivers/vhost/vhost.c: At top level: >> drivers/vhost/vhost.c:1373:2: error: expected identifier or '(' before 'if' if (unlikely(__get_user(ring_head, ^ In file included from include/uapi/linux/stddef.h:1:0, from include/linux/stddef.h:4, from include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from include/uapi/asm-generic/fcntl.h:4, from arch/x86/include/uapi/asm/fcntl.h:1, from include/uapi/linux/fcntl.h:4, from include/linux/fcntl.h:4, from include/linux/eventfd.h:11, from drivers/vhost/vhost.c:14: >> arch/x86/include/asm/uaccess.h:414:2: error: expected identifier or '(' >> before ')' token }) ^ include/linux/compiler.h:137:45: note: in definition of macro 'unlikely' # define unlikely(x) (__builtin_constant_p(x) ? 
!!(x) : __branch_check__(x, 0)) ^ arch/x86/include/asm/uaccess.h:479:2: note: in expansion of macro '__get_user_nocheck' __get_user_nocheck((x), (ptr), sizeof(*(ptr))) ^ >> drivers/vhost/vhost.c:1373:15: note: in expansion of macro '__get_user' if (unlikely(__get_user(ring_head, ^ >> arch/x86/include/asm/uaccess.h:414:2: error: expected identifier or '(' >> before ')' token }) ^ include/linux/compiler.h:137:53: note: in definition of macro 'unlikely' # define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0)) ^ arch/x86/include/asm/uaccess.h:479:2: note: in expansion of macro '__get_user_nocheck' __get_user_nocheck((x), (ptr), sizeof(*(ptr))) ^ >> drivers/vhost/vhost.c:1373:15: note: in expansion of macro '__get_user' if (unlikely(__get_user(ring_head, ^ >> include/linux/compiler.h:126:4: error: expected identifier or '(' before ')' >> token }) ^ include/linux/compiler.h:137:58: note: in expansion of macro '__branch_check__' # define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0)) ^ >> drivers/vhost/vhost.c:1373:6: note: in expansion of macro 'unlikely' if (unlikely(__get_user(ring_head, ^ >> drivers/vhost/vhost.c:1381:2: warning: data definition has no type or >> storage class head = vhost16_to_cpu(vq, ring_head); ^ >> drivers/vhost/vhost.c:1381:2: error: type defaults to 'int' in declaration >> of 'head' [-Werror=implicit-int] >> drivers/vhost/vhost.c:1381:24: error: 'vq' undeclared here (not in a >> function) head = vhost16_to_cpu(vq, ring_head); ^ >> drivers/vhost/vhost.c:1381:28: error: 'ring_head' undeclared here (not in a >> function) head = vhost16_to_cpu(vq, ring_head); ^ drivers/vhost/vhost.c:1384:2: error: expected identifier or '(' before 'if' if (unlikely(head >= vq->num)) { ^ In file included from include/uapi/linux/stddef.h:1:0, from include/linux/stddef.h:4, from include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from include/u
Re: [PATCH] vhost: replace % with & on data path
On Mon, Nov 30, 2015 at 10:34:07AM +0200, Michael S. Tsirkin wrote:
> We know vring num is a power of 2, so use &
> to mask the high bits.
>
> Signed-off-by: Michael S. Tsirkin
> ---
>  drivers/vhost/vhost.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 080422f..85f0f0a 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1366,10 +1366,12 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>  	/* Only get avail ring entries after they have been exposed by guest. */
>  	smp_rmb();
>  
> +	}
> +

Oops. This sneaked in from an unrelated patch.
Pls ignore, will repost.

>  	/* Grab the next descriptor number they're advertising, and increment
>  	 * the index we've seen. */
>  	if (unlikely(__get_user(ring_head,
> -				&vq->avail->ring[last_avail_idx % vq->num]))) {
> +				&vq->avail->ring[last_avail_idx & (vq->num - 1)]))) {
>  		vq_err(vq, "Failed to read head: idx %d address %p\n",
>  		       last_avail_idx,
>  		       &vq->avail->ring[last_avail_idx % vq->num]);
> @@ -1489,7 +1491,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
>  	u16 old, new;
>  	int start;
>  
> -	start = vq->last_used_idx % vq->num;
> +	start = vq->last_used_idx & (vq->num - 1);
>  	used = vq->used->ring + start;
>  	if (count == 1) {
>  		if (__put_user(heads[0].id, &used->id)) {
> @@ -1531,7 +1533,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
>  {
>  	int start, n, r;
>  
> -	start = vq->last_used_idx % vq->num;
> +	start = vq->last_used_idx & (vq->num - 1);
>  	n = vq->num - start;
>  	if (n < count) {
>  		r = __vhost_add_used_n(vq, heads, n);
> -- 
> MST
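For reference, the identity the patch relies on can be checked in a few lines of userspace C: when num is a power of two, idx & (num - 1) selects the same slot as idx % num. The demo_ring_slot() name is made up for this sketch.

```c
#include <assert.h>

/*
 * Map a free-running index onto a ring of num slots with a mask.
 * Only valid when num is a non-zero power of two.
 */
static unsigned int demo_ring_slot(unsigned int idx, unsigned int num)
{
	assert(num != 0 && (num & (num - 1)) == 0);
	return idx & (num - 1);
}
```

The power-of-two precondition matters: for num = 6, 10 & 5 is 0 while 10 % 6 is 4, so the mask form cannot be substituted for the modulo on arbitrary ring sizes.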
[PATCH RFC v2] virtio: skip avail/used index reads
This adds a new vring feature bit: when enabled, host and guest poll the available/used ring directly instead of looking at the index field first. To guarantee it is possible to detect updates, the high bits (above vring.num - 1) in the ring head ID value are modified to match the index bits - these change on each wrap-around. Writer also XORs this with 0x8000 such that rings can be zero-initialized. Reader is modified to ignore these high bits when looking up descriptors. The point is to reduce the number of cacheline misses for both reads and writes. I see a performance improvement of about 20% on multithreaded benchmarks (e.g. virtio-test), but regression of about 2% on vring_bench. I think this has to do with the fact that complete_multi_user is implemented suboptimally. TODO: investigate single-threaded regression look at more aggressive ring layout changes better name for a feature flag split the patch to make it easier to review This is on top of the following patches in my tree: virtio_ring: Shadow available ring flags & index vhost: replace % with & on data path tools/virtio: fix byteswap logic tools/virtio: move list macro stubs Signed-off-by: Michael S. 
Tsirkin--- Changes from v1: add a missing chunk in vhost_get_vq_desc drivers/vhost/vhost.h| 3 +- include/linux/vringh.h | 3 + include/uapi/linux/virtio_ring.h | 3 + drivers/vhost/vhost.c| 104 ++ drivers/vhost/vringh.c | 153 +-- drivers/virtio/virtio_ring.c | 40 -- tools/virtio/virtio_test.c | 14 +++- 7 files changed, 256 insertions(+), 64 deletions(-) diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index d3f7674..aeeb15d 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -175,7 +175,8 @@ enum { (1ULL << VIRTIO_RING_F_EVENT_IDX) | (1ULL << VHOST_F_LOG_ALL) | (1ULL << VIRTIO_F_ANY_LAYOUT) | -(1ULL << VIRTIO_F_VERSION_1) +(1ULL << VIRTIO_F_VERSION_1) | +(1ULL << VIRTIO_RING_F_POLL) }; static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit) diff --git a/include/linux/vringh.h b/include/linux/vringh.h index bc6c28d..13a9e3e 100644 --- a/include/linux/vringh.h +++ b/include/linux/vringh.h @@ -40,6 +40,9 @@ struct vringh { /* Can we get away with weak barriers? */ bool weak_barriers; + /* Poll ring directly */ + bool poll; + /* Last available index we saw (ie. where we're up to). */ u16 last_avail_idx; diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h index c072959..bf3ca1d 100644 --- a/include/uapi/linux/virtio_ring.h +++ b/include/uapi/linux/virtio_ring.h @@ -62,6 +62,9 @@ * at the end of the used ring. Guest should ignore the used->flags field. */ #define VIRTIO_RING_F_EVENT_IDX29 +/* Support ring polling */ +#define VIRTIO_RING_F_POLL 33 + /* Virtio ring descriptors: 16 bytes. These can chain together via "next". */ struct vring_desc { /* Address (guest-physical). */ diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index ad2146a..cdbabf5 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -1346,25 +1346,27 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq, /* Check it isn't doing very strange things with descriptor numbers. 
*/ last_avail_idx = vq->last_avail_idx; - if (unlikely(__get_user(avail_idx, >avail->idx))) { - vq_err(vq, "Failed to access avail idx at %p\n", - >avail->idx); - return -EFAULT; - } - vq->avail_idx = vhost16_to_cpu(vq, avail_idx); + if (!vhost_has_feature(vq, VIRTIO_RING_F_POLL)) { + if (unlikely(__get_user(avail_idx, >avail->idx))) { + vq_err(vq, "Failed to access avail idx at %p\n", + >avail->idx); + return -EFAULT; + } + vq->avail_idx = vhost16_to_cpu(vq, avail_idx); - if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) { - vq_err(vq, "Guest moved used index from %u to %u", - last_avail_idx, vq->avail_idx); - return -EFAULT; - } + if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) { + vq_err(vq, "Guest moved used index from %u to %u", + last_avail_idx, vq->avail_idx); + return -EFAULT; + } - /* If there's nothing new since last we looked, return invalid. */ - if (vq->avail_idx == last_avail_idx) - return vq->num; + /* If there's
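The diff is cut off in this archive, so the sketch below is only one possible reading of the encoding described in the commit message (index high bits stored above vring.num - 1, XOR with 0x8000, reader masks the high bits off). The demo_ function names are hypothetical, and the real patch may differ in detail. It assumes num is a power of two no larger than 0x8000.

```c
#include <stdint.h>

#define DEMO_FLIP 0x8000u	/* XOR constant named in the description */

/*
 * Writer side: combine the descriptor head with the high bits of the
 * free-running index, then flip bit 15 so that an all-zero slot does not
 * look valid for index 0.
 */
static uint16_t demo_encode(uint16_t head, uint16_t idx, uint16_t num)
{
	uint16_t high = (uint16_t)(idx & ~(num - 1));

	return (uint16_t)(((head & (num - 1)) | high) ^ DEMO_FLIP);
}

/*
 * Reader side: the slot holds a new entry if its high bits match the high
 * bits of the index the reader expects next.
 */
static int demo_slot_ready(uint16_t entry, uint16_t idx, uint16_t num)
{
	uint16_t mask = (uint16_t)~(num - 1);
	uint16_t seen = (uint16_t)((entry ^ DEMO_FLIP) & mask);
	uint16_t want = (uint16_t)(idx & mask);

	return seen == want;
}

/* Reader side: drop the high bits to recover the descriptor head. */
static uint16_t demo_head(uint16_t entry, uint16_t num)
{
	return (uint16_t)((entry ^ DEMO_FLIP) & (num - 1));
}
```

Under this reading, a zero-initialized slot is not ready for index 0, an entry written on the previous lap is not ready for the next lap, and the reader never has to load the separate index field to find out.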
Re: [PATCH] vhost: replace % with & on data path
On Mon, Nov 30, 2015 at 12:42:49AM -0800, Joe Perches wrote:
> On Mon, 2015-11-30 at 10:34 +0200, Michael S. Tsirkin wrote:
> > We know vring num is a power of 2, so use &
> > to mask the high bits.
> []
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> []
> > @@ -1366,10 +1366,12 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> >  	/* Only get avail ring entries after they have been exposed by guest. */
> >  	smp_rmb();
> >  
> > +	}
>
> ?

Yes, I noticed this - I moved this chunk from the next patch
in my tree by mistake.

Will fix, thanks!

-- 
MST
[PATCH] netfilter: ipvs: avoid unused variable warning
When CONFIG_PROC_FS is disabled, the local 'net' variable in
ip_vs_app_net_init becomes unused, as gcc warns:

net/netfilter/ipvs/ip_vs_app.c: In function 'ip_vs_app_net_init':
net/netfilter/ipvs/ip_vs_app.c:608:14: warning: unused variable 'net' [-Wunused-variable]

This removes the line by moving the pointer dereference into the user
of the variable.

Signed-off-by: Arnd Bergmann

diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 0328f7250693..e5422d3db501 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -614,8 +614,6 @@ int __net_init ip_vs_app_net_init(struct netns_ipvs *ipvs)
 
 void __net_exit ip_vs_app_net_cleanup(struct netns_ipvs *ipvs)
 {
-	struct net *net = ipvs->net;
-
 	unregister_ip_vs_app(ipvs, NULL /* all */);
-	remove_proc_entry("ip_vs_app", net->proc_net);
+	remove_proc_entry("ip_vs_app", ipvs->net->proc_net);
 }
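A minimal userspace sketch of the pattern, assuming the helper is compiled away as a macro that drops its arguments when the subsystem is configured out. The demo_ types and the demo_remove_entry() stub are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

struct demo_net  { void *proc_net; };
struct demo_ipvs { struct demo_net *net; int apps; };

/*
 * Hypothetical stub for the "configured out" case: the arguments are
 * dropped without ever being evaluated.
 */
#define demo_remove_entry(name, parent) do { } while (0)

/* Ordinary function that really uses its argument. */
static void demo_unregister_all(struct demo_ipvs *ipvs)
{
	ipvs->apps = 0;
}

/*
 * A local such as
 *	struct demo_net *net = ipvs->net;
 * that is referenced only in the macro's arguments would be reported as
 * unused once the macro expands to nothing. Dereferencing in place, as
 * below, leaves no local behind for the compiler to complain about.
 */
static int demo_cleanup(struct demo_ipvs *ipvs)
{
	demo_unregister_all(ipvs);
	demo_remove_entry("ip_vs_app", ipvs->net->proc_net);
	return 0;
}
```

Because the stub never evaluates its arguments, the in-place dereference costs nothing in the configured-out build and compiles to the same access as before otherwise.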
Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test
On Mon, Nov 30, 2015 at 05:37:55PM +0800, Herbert Xu wrote:
> Phil Sutter wrote:
> > The following series aims to improve lib/test_rhashtable in different
> > situations:
> >
> > Patch 1 allows the kernel to reschedule so the test does not block too
> >         long on slow systems.
> > Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
> >         error case (-EBUSY).
> > Patch 3 auto-adjusts the upper table size limit according to the number
> >         of threads (in concurrency test). In fact, the current default is
> >         already too small.
> > Patch 4 makes it possible to retry inserts even in supposedly permanent
> >         error case (-ENOMEM) to expose rhashtable's remaining problem of
> >         -ENOMEM being not as permanent as it is expected to be.
>
> I'm sorry but this patch series is simply bogus.

The whole series?!

> If rhashtable is indeed returning such errors under normal
> conditions then rhashtable is broken and we must fix it instead
> of working around it in the test code!

You're stating the obvious. Remember, the reason I prepared patch 4 was
because you wanted to fix just that bug in rhashtable in the first
place.

Just to make this clear: Patches 1-3 are reasonable on their own, the
only connection to the bug is that patch 2 makes it visible (at least on
my system it wasn't before).

> FWIW I still haven't been able to reproduce this problem, perhaps
> because my machines have too few CPUs?

Did you try with my bogus patch series applied? How many CPUs does your
test system actually have?

> So can someone please help me reproduce this? Because just loading
> test_rhashtable isn't doing it.

As said, maybe you need to increase the number of spawned threads
(tcount=50 or so).

Cheers, Phil
Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test
On Mon, Nov 30, 2015 at 11:14:01AM +0100, Phil Sutter wrote:
> On Mon, Nov 30, 2015 at 05:37:55PM +0800, Herbert Xu wrote:
> > Phil Sutter wrote:
> > > The following series aims to improve lib/test_rhashtable in different
> > > situations:
> > >
> > > Patch 1 allows the kernel to reschedule so the test does not block too
> > >         long on slow systems.
> > > Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
> > >         error case (-EBUSY).
> > > Patch 3 auto-adjusts the upper table size limit according to the number
> > >         of threads (in concurrency test). In fact, the current default is
> > >         already too small.
> > > Patch 4 makes it possible to retry inserts even in supposedly permanent
> > >         error case (-ENOMEM) to expose rhashtable's remaining problem of
> > >         -ENOMEM being not as permanent as it is expected to be.
> >
> > I'm sorry but this patch series is simply bogus.
>
> The whole series?!

Well at least patch two and four seem clearly wrong because no
rhashtable user should need to retry insertions.

> Did you try with my bogus patch series applied? How many CPUs does your
> test system actually have?
>
> > So can someone please help me reproduce this? Because just loading
> > test_rhashtable isn't doing it.
>
> As said, maybe you need to increase the number of spawned threads
> (tcount=50 or so).

OK that's better. I think I see the problem. The test in
rhashtable_insert_rehash is racy and if two threads both try
to grow the table one of them may be tricked into doing a rehash
instead.

I'm working on a fix.

Thanks,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
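For readers following the disagreement, the retry pattern that patch 2 is described as adding (retry an insert while it fails with -EBUSY, stop on any other error) could be sketched in user space roughly as below. This is not the code from lib/test_rhashtable and it calls no rhashtable API: `fake_table`, `fake_insert`, `insert_retry` and the retry budget are hypothetical names made up for illustration. Herbert's point in the reply above is that a correct rhashtable should make such a loop unnecessary.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a table: the next 'busy_left' inserts fail
 * with -EBUSY, later ones succeed. Not an rhashtable structure. */
struct fake_table {
	int busy_left;
	int inserted;
};

static int fake_insert(struct fake_table *t)
{
	if (t->busy_left > 0) {
		t->busy_left--;
		return -EBUSY;
	}
	t->inserted++;
	return 0;
}

/* Retry only the transient error; any other result is returned at once.
 * At most max_retries + 1 attempts are made, so a table that stays busy
 * cannot make the caller spin forever. */
static int insert_retry(struct fake_table *t, int max_retries)
{
	int err;

	assert(max_retries >= 0);
	do {
		err = fake_insert(t);
	} while (err == -EBUSY && max_retries-- > 0);

	return err;
}
```

The bounded retry count is a design choice of this sketch only; the thread does not say how the test module limits its retries.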
[PATCH (net-next.git)] stmmac: support Reg_9 to get HW level information
For GMAC newer than 3.40a there is a new register (Reg_9) that provides
the status of all modules of the transmit and receive paths and FIFO
status. These can be exposed via ethtool.

Signed-off-by: Giuseppe Cavallaro
---
 drivers/net/ethernet/stmicro/stmmac/common.h       | 26 +++
 drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    | 42 +++
 .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   | 75
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   | 30
 4 files changed, 173 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 623c6ed..f4518bc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -137,6 +137,31 @@ struct stmmac_extra_stats {
 	unsigned long pcs_link;
 	unsigned long pcs_duplex;
 	unsigned long pcs_speed;
+	/* debug register */
+	unsigned long mtl_tx_status_fifo_full;
+	unsigned long mtl_tx_fifo_not_empty;
+	unsigned long mmtl_fifo_ctrl;
+	unsigned long mtl_tx_fifo_read_ctrl_write;
+	unsigned long mtl_tx_fifo_read_ctrl_wait;
+	unsigned long mtl_tx_fifo_read_ctrl_read;
+	unsigned long mtl_tx_fifo_read_ctrl_idle;
+	unsigned long mac_tx_in_pause;
+	unsigned long mac_tx_frame_ctrl_xfer;
+	unsigned long mac_tx_frame_ctrl_idle;
+	unsigned long mac_tx_frame_ctrl_wait;
+	unsigned long mac_tx_frame_ctrl_pause;
+	unsigned long mac_gmii_tx_proto_engine;
+	unsigned long mtl_rx_fifo_fill_level_full;
+	unsigned long mtl_rx_fifo_fill_above_thresh;
+	unsigned long mtl_rx_fifo_fill_below_thresh;
+	unsigned long mtl_rx_fifo_fill_level_empty;
+	unsigned long mtl_rx_fifo_read_ctrl_flush;
+	unsigned long mtl_rx_fifo_read_ctrl_read_data;
+	unsigned long mtl_rx_fifo_read_ctrl_status;
+	unsigned long mtl_rx_fifo_read_ctrl_idle;
+	unsigned long mtl_rx_fifo_ctrl_active;
+	unsigned long mac_rx_frame_ctrl_fifo;
+	unsigned long mac_gmii_rx_proto_engine;
 };

 /* CSR Frequency Access Defines*/
@@ -408,6 +433,7 @@ struct stmmac_ops {
 	void (*set_eee_pls)(struct mac_device_info *hw, int link);
 	void (*ctrl_ane)(struct mac_device_info *hw, bool restart);
 	void (*get_adv)(struct mac_device_info *hw, struct rgmii_adv *adv);
+	void (*debug)(void __iomem *ioaddr, struct stmmac_extra_stats *x);
 };

 /* PTP and HW Timer helpers */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index b3fe057..8831a05 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -34,6 +34,7 @@
 #define GMAC_FLOW_CTRL		0x0018	/* Flow Control */
 #define GMAC_VLAN_TAG		0x001c	/* VLAN Tag */
 #define GMAC_VERSION		0x0020	/* GMAC CORE Version */
+#define GMAC_DEBUG		0x0024	/* GMAC debug register */
 #define GMAC_WAKEUP_FILTER	0x0028	/* Wake-up Frame Filter */

 #define GMAC_INT_STATUS		0x0038	/* interrupt status register */
@@ -177,6 +178,47 @@ enum inter_frame_gap {
 #define GMAC_FLOW_CTRL_TFE	0x0002	/* Tx Flow Control Enable */
 #define GMAC_FLOW_CTRL_FCB_BPA	0x0001	/* Flow Control Busy ... */

+/* DEBUG Register defines */
+/* MTL TxStatus FIFO */
+#define GMAC_DEBUG_TXSTSFSTS	BIT(25)	/* MTL TxStatus FIFO Full Status */
+#define GMAC_DEBUG_TXFSTS	BIT(24) /* MTL Tx FIFO Not Empty Status */
+#define GMAC_DEBUG_TWCSTS	BIT(22) /* MTL Tx FIFO Write Controller */
+/* MTL Tx FIFO Read Controller Status */
+#define GMAC_DEBUG_TRCSTS_MASK	GENMASK(21, 20)
+#define GMAC_DEBUG_TRCSTS_SHIFT	20
+#define GMAC_DEBUG_TRCSTS_IDLE	0
+#define GMAC_DEBUG_TRCSTS_READ	1
+#define GMAC_DEBUG_TRCSTS_TXW	2
+#define GMAC_DEBUG_TRCSTS_WRITE	3
+#define GMAC_DEBUG_TXPAUSED	BIT(19) /* MAC Transmitter in PAUSE */
+/* MAC Transmit Frame Controller Status */
+#define GMAC_DEBUG_TFCSTS_MASK	GENMASK(18, 17)
+#define GMAC_DEBUG_TFCSTS_SHIFT	17
+#define GMAC_DEBUG_TFCSTS_IDLE	0
+#define GMAC_DEBUG_TFCSTS_WAIT	1
+#define GMAC_DEBUG_TFCSTS_GEN_PAUSE	2
+#define GMAC_DEBUG_TFCSTS_XFER	3
+/* MAC GMII or MII Transmit Protocol Engine Status */
+#define GMAC_DEBUG_TPESTS	BIT(16)
+#define GMAC_DEBUG_RXFSTS_MASK	GENMASK(9, 8) /* MTL Rx FIFO Fill-level */
+#define GMAC_DEBUG_RXFSTS_SHIFT	8
+#define GMAC_DEBUG_RXFSTS_EMPTY	0
+#define GMAC_DEBUG_RXFSTS_BT	1
+#define GMAC_DEBUG_RXFSTS_AT	2
+#define GMAC_DEBUG_RXFSTS_FULL	3
+#define GMAC_DEBUG_RRCSTS_MASK	GENMASK(6, 5) /* MTL Rx FIFO Read Controller */
+#define GMAC_DEBUG_RRCSTS_SHIFT	5
+#define GMAC_DEBUG_RRCSTS_IDLE	0
+#define GMAC_DEBUG_RRCSTS_RDATA	1
+#define GMAC_DEBUG_RRCSTS_RSTAT	2
+#define
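The dwmac1000_core.c part of this patch is cut off in the archive, so the driver's actual debug callback is not visible here. As a rough illustration of how MASK/SHIFT pairs like the ones defined above are typically consumed, the following user-space sketch decodes two fields from a register value. `BIT32`, `GENMASK32` and `gmac_debug_field` are local, hypothetical helpers standing in for the kernel's `BIT()`/`GENMASK()`; only the bit positions are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local user-space equivalents of the kernel's BIT()/GENMASK() helpers,
 * restricted to 32-bit values for this sketch. */
#define BIT32(n)	(UINT32_C(1) << (n))
#define GENMASK32(h, l)	((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Bit positions copied from the dwmac1000.h hunk of the patch. */
#define GMAC_DEBUG_TXSTSFSTS	BIT32(25)
#define GMAC_DEBUG_TRCSTS_MASK	GENMASK32(21, 20)
#define GMAC_DEBUG_TRCSTS_SHIFT	20
#define GMAC_DEBUG_TRCSTS_READ	1
#define GMAC_DEBUG_RXFSTS_MASK	GENMASK32(9, 8)
#define GMAC_DEBUG_RXFSTS_SHIFT	8
#define GMAC_DEBUG_RXFSTS_FULL	3

/* Hypothetical helper: isolate a multi-bit field and shift it down so
 * it can be compared against the small enumerated values (0..3). */
static uint32_t gmac_debug_field(uint32_t value, uint32_t mask,
				 unsigned int shift)
{
	assert(shift < 32);
	return (value & mask) >> shift;
}
```

A caller would compare, for example, `gmac_debug_field(reg, GMAC_DEBUG_RXFSTS_MASK, GMAC_DEBUG_RXFSTS_SHIFT)` against `GMAC_DEBUG_RXFSTS_FULL` and bump the matching counter in `struct stmmac_extra_stats`; that mapping is an assumption based on the field names, not something the truncated patch shows.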
[P.A. Semi] Does the ethernet interface work on your Electra, Chitra, Nemo, and Athena board?
FYI

On 30 November 2015 at 10:48 AM, Christian Zigotzky wrote:

Hi Denis,

Thank you for your answer. Sorry because of my description. Yes, the
driver probe function finds the device.

With kernel 4.4-rc3:

dmesg | grep -i eth0
[ 2.297473] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00

dhclient eth0
RTNETLINK answers: Cannot allocate memory

With kernel 4.1.13:

dmesg | grep -i eth0
[ 2.328115] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00
[ 37.130466] eth0: Link is up at 100 Mbps, full duplex.

Cheers,
Christian

On 30 November 2015 at 09:37 AM, Denis Kirjanov wrote:

On 11/29/15, Christian Zigotzky wrote:

Hi All,

Does the ethernet interface on your Electra, Chitra, Nemo, and Athena
board work with the release candidates of the kernel 4.4?

Unfortunately the P.A. Semi ethernet doesn't work on our Nemo boards
with the release candidates of the kernel 4.4.

We have set the following entries in the kernel config:

CONFIG_NET_VENDOR_PASEMI=y
CONFIG_PASEMI_MAC=y

Could you please test the P.A. Semi ethernet on your P.A. Semi boards?

It's not clear from your descriptions what is not working. Does the
driver probe function find a device? Does the interface show up in the
kernel? Can it send/receive packets? Also please CC netdev. Thanks.

Thanks in advance,
Christian

___
Linuxppc-dev mailing list
linuxppc-...@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH net-next 5/6] qede: Add support for nway_reset
From: Sudarsana Kalluru

Signed-off-by: Sudarsana Kalluru
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index b90d880..9b0bf12 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -322,6 +322,30 @@ static void qede_set_msglevel(struct net_device *ndev, u32 level)
 					 dp_module, dp_level);
 }

+static int qede_nway_reset(struct net_device *dev)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+	struct qed_link_output current_link;
+	struct qed_link_params link_params;
+
+	if (!netif_running(dev))
+		return 0;
+
+	memset(&current_link, 0, sizeof(current_link));
+	edev->ops->common->get_link(edev->cdev, &current_link);
+	if (!current_link.link_up)
+		return 0;
+
+	/* Toggle the link */
+	memset(&link_params, 0, sizeof(link_params));
+	link_params.link_up = false;
+	edev->ops->common->set_link(edev->cdev, &link_params);
+	link_params.link_up = true;
+	edev->ops->common->set_link(edev->cdev, &link_params);
+
+	return 0;
+}
+
 static u32 qede_get_link(struct net_device *dev)
 {
 	struct qede_dev *edev = netdev_priv(dev);
@@ -493,6 +517,7 @@ static const struct ethtool_ops qede_ethtool_ops = {
 	.get_drvinfo = qede_get_drvinfo,
 	.get_msglevel = qede_get_msglevel,
 	.set_msglevel = qede_set_msglevel,
+	.nway_reset = qede_nway_reset,
 	.get_link = qede_get_link,
 	.get_ringparam = qede_get_ringparam,
 	.set_ringparam = qede_set_ringparam,
--
1.9.3
[PATCH net-next 4/6] qede: Add support for set_phys_id
From: Sudarsana Kalluru

Signed-off-by: Sudarsana Kalluru
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 10d80ba..b90d880 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -459,6 +459,34 @@ static int qede_set_channels(struct net_device *dev,
 	return 0;
 }

+static int qede_set_phys_id(struct net_device *dev,
+			    enum ethtool_phys_id_state state)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+	u8 led_state = 0;
+
+	switch (state) {
+	case ETHTOOL_ID_ACTIVE:
+		return 1;	/* cycle on/off once per second */
+
+	case ETHTOOL_ID_ON:
+		led_state = QED_LED_MODE_ON;
+		break;
+
+	case ETHTOOL_ID_OFF:
+		led_state = QED_LED_MODE_OFF;
+		break;
+
+	case ETHTOOL_ID_INACTIVE:
+		led_state = QED_LED_MODE_RESTORE;
+		break;
+	}
+
+	edev->ops->common->set_led(edev->cdev, led_state);
+
+	return 0;
+}
+
 static const struct ethtool_ops qede_ethtool_ops = {
 	.get_settings = qede_get_settings,
 	.set_settings = qede_set_settings,
@@ -469,6 +497,7 @@ static const struct ethtool_ops qede_ethtool_ops = {
 	.get_ringparam = qede_get_ringparam,
 	.set_ringparam = qede_set_ringparam,
 	.get_strings = qede_get_strings,
+	.set_phys_id = qede_set_phys_id,
 	.get_ethtool_stats = qede_get_ethtool_stats,
 	.get_sset_count = qede_get_sset_count,
--
1.9.3
[PATCH net-next 3/6] qed: Add support for changing LED state
From: Sudarsana Kalluru

Physical LEDs are being controlled by the management FW.
This adds the qed functionality required to request management FW
to change the LED configuration, as well as the necessary APIs for this
functionality to later be used by the protocol drivers.

Signed-off-by: Sudarsana Kalluru
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qed/qed_hsi.h  |  6 ++
 drivers/net/ethernet/qlogic/qed/qed_main.c | 18 ++
 drivers/net/ethernet/qlogic/qed/qed_mcp.c  | 27 +++
 drivers/net/ethernet/qlogic/qed/qed_mcp.h  | 13 +
 include/linux/qed/qed_if.h                 | 17 +
 5 files changed, 81 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_hsi.h b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
index b2f8e85..264e954 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_hsi.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
@@ -3993,6 +3993,8 @@ struct public_drv_mb {
 #define DRV_MSG_CODE_PHY_CORE_WRITE	0x000e
 #define DRV_MSG_CODE_SET_VERSION	0x000f

+#define DRV_MSG_CODE_SET_LED_MODE	0x0020
+
 #define DRV_MSG_SEQ_NUMBER_MASK	0x

 	u32 drv_mb_param;
@@ -4044,6 +4046,10 @@ struct public_drv_mb {
 #define DRV_MB_PARAM_CFG_VF_MSIX_SB_NUM_SHIFT	8
 #define DRV_MB_PARAM_CFG_VF_MSIX_SB_NUM_MASK	0xFF00

+#define DRV_MB_PARAM_SET_LED_MODE_OPER	0x0
+#define DRV_MB_PARAM_SET_LED_MODE_ON	0x1
+#define DRV_MB_PARAM_SET_LED_MODE_OFF	0x2
+
 	u32 fw_mb_header;
 #define FW_MSG_CODE_MASK	0x
 #define FW_MSG_CODE_DRV_LOAD_ENGINE	0x1010
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 947c7af..6b02e11 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -1135,6 +1135,23 @@ static int qed_drain(struct qed_dev *cdev)
 	return 0;
 }

+static int qed_set_led(struct qed_dev *cdev, enum qed_led_mode mode)
+{
+	struct qed_hwfn *hwfn = QED_LEADING_HWFN(cdev);
+	struct qed_ptt *ptt;
+	int status = 0;
+
+	ptt = qed_ptt_acquire(hwfn);
+	if (!ptt)
+		return -EAGAIN;
+
+	status = qed_mcp_set_led(hwfn, ptt, mode);
+
+	qed_ptt_release(hwfn, ptt);
+
+	return status;
+}
+
 const struct qed_common_ops qed_common_ops_pass = {
 	.probe = &qed_probe,
 	.remove = &qed_remove,
@@ -1155,6 +1172,7 @@ const struct qed_common_ops qed_common_ops_pass = {
 	.update_msglvl = &qed_init_dp,
 	.chain_alloc = &qed_chain_alloc,
 	.chain_free = &qed_chain_free,
+	.set_led = &qed_set_led,
 };

 u32 qed_get_protocol_version(enum qed_protocol protocol)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 20d048c..ba1b1f1 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -858,3 +858,30 @@ qed_mcp_send_drv_version(struct qed_hwfn *p_hwfn,

 	return 0;
 }
+
+int qed_mcp_set_led(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt,
+		    enum qed_led_mode mode)
+{
+	u32 resp = 0, param = 0, drv_mb_param;
+	int rc;
+
+	switch (mode) {
+	case QED_LED_MODE_ON:
+		drv_mb_param = DRV_MB_PARAM_SET_LED_MODE_ON;
+		break;
+	case QED_LED_MODE_OFF:
+		drv_mb_param = DRV_MB_PARAM_SET_LED_MODE_OFF;
+		break;
+	case QED_LED_MODE_RESTORE:
+		drv_mb_param = DRV_MB_PARAM_SET_LED_MODE_OPER;
+		break;
+	default:
+		DP_NOTICE(p_hwfn, "Invalid LED mode %d\n", mode);
+		return -EINVAL;
+	}
+
+	rc = qed_mcp_cmd(p_hwfn, p_ptt, DRV_MSG_CODE_SET_LED_MODE,
+			 drv_mb_param, &resp, &param);
+
+	return rc;
+}
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.h b/drivers/net/ethernet/qlogic/qed/qed_mcp.h
index dbaae58..506197d 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.h
@@ -224,6 +224,19 @@ qed_mcp_send_drv_version(struct qed_hwfn *p_hwfn,
 			 struct qed_ptt *p_ptt,
 			 struct qed_mcp_drv_version *p_ver);

+/**
+ * @brief Set LED status
+ *
+ * @param p_hwfn
+ * @param p_ptt
+ * @param mode - LED mode
+ *
+ * @return int - 0 - operation was successful.
+ */
+int qed_mcp_set_led(struct qed_hwfn *p_hwfn,
+		    struct qed_ptt *p_ptt,
+		    enum qed_led_mode mode);
+
 /* Using hwfn number (and not pf_num) is required since in CMT mode,
  * same pf_num may be used by two different hwfn
  * TODO - this shouldn't really be in .h file, but until all fields
diff --git a/include/linux/qed/qed_if.h b/include/linux/qed/qed_if.h
index
[PATCH net-next 6/6] qede: Add support for {get, set}_pauseparam
From: Sudarsana Kalluru

Signed-off-by: Sudarsana Kalluru
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 60 +
 1 file changed, 60 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 9b0bf12..e442b85 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -399,6 +399,64 @@ static int qede_set_ringparam(struct net_device *dev,
 	return 0;
 }

+static void qede_get_pauseparam(struct net_device *dev,
+				struct ethtool_pauseparam *epause)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+	struct qed_link_output current_link;
+
+	memset(&current_link, 0, sizeof(current_link));
+	edev->ops->common->get_link(edev->cdev, &current_link);
+
+	if (current_link.pause_config & QED_LINK_PAUSE_AUTONEG_ENABLE)
+		epause->autoneg = true;
+	if (current_link.pause_config & QED_LINK_PAUSE_RX_ENABLE)
+		epause->rx_pause = true;
+	if (current_link.pause_config & QED_LINK_PAUSE_TX_ENABLE)
+		epause->tx_pause = true;
+
+	DP_VERBOSE(edev, QED_MSG_DEBUG,
+		   "ethtool_pauseparam: cmd %d autoneg %d rx_pause %d tx_pause %d\n",
+		   epause->cmd, epause->autoneg, epause->rx_pause,
+		   epause->tx_pause);
+}
+
+static int qede_set_pauseparam(struct net_device *dev,
+			       struct ethtool_pauseparam *epause)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+	struct qed_link_params params;
+	struct qed_link_output current_link;
+
+	if (!edev->dev_info.common.is_mf) {
+		DP_INFO(edev,
+			"Pause parameters can not be updated in non-default mode\n");
+		return -EOPNOTSUPP;
+	}
+
+	memset(&current_link, 0, sizeof(current_link));
+	edev->ops->common->get_link(edev->cdev, &current_link);
+
+	memset(&params, 0, sizeof(params));
+	params.override_flags |= QED_LINK_OVERRIDE_PAUSE_CONFIG;
+	if (epause->autoneg) {
+		if (!(current_link.supported_caps & SUPPORTED_Autoneg)) {
+			DP_INFO(edev, "autoneg not supported\n");
+			return -EINVAL;
+		}
+		params.pause_config |= QED_LINK_PAUSE_AUTONEG_ENABLE;
+	}
+	if (epause->rx_pause)
+		params.pause_config |= QED_LINK_PAUSE_RX_ENABLE;
+	if (epause->tx_pause)
+		params.pause_config |= QED_LINK_PAUSE_TX_ENABLE;
+
+	params.link_up = true;
+	edev->ops->common->set_link(edev->cdev, &params);
+
+	return 0;
+}
+
 static void qede_update_mtu(struct qede_dev *edev, union qede_reload_args *args)
 {
 	edev->ndev->mtu = args->mtu;
@@ -521,6 +579,8 @@ static const struct ethtool_ops qede_ethtool_ops = {
 	.get_link = qede_get_link,
 	.get_ringparam = qede_get_ringparam,
 	.set_ringparam = qede_set_ringparam,
+	.get_pauseparam = qede_get_pauseparam,
+	.set_pauseparam = qede_set_pauseparam,
 	.get_strings = qede_get_strings,
 	.set_phys_id = qede_set_phys_id,
 	.get_ethtool_stats = qede_get_ethtool_stats,
--
1.9.3
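The two callbacks in this patch translate between three ethtool booleans and a single pause_config flag word. A stripped-down user-space sketch of that round trip is given below. The flag values here are placeholders, since the real `QED_LINK_PAUSE_*` constants are defined in qed_if.h and do not appear in this excerpt; `pause_request`, `pause_to_flags` and `flags_to_pause` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this sketch only; the driver's real
 * QED_LINK_PAUSE_* constants are not shown in the patch excerpt. */
#define PAUSE_AUTONEG	(1u << 0)
#define PAUSE_RX	(1u << 1)
#define PAUSE_TX	(1u << 2)

struct pause_request {
	bool autoneg;
	bool rx_pause;
	bool tx_pause;
};

/* Booleans to flag word, the direction taken by set_pauseparam. */
static unsigned int pause_to_flags(const struct pause_request *req)
{
	unsigned int flags = 0;

	assert(req);
	if (req->autoneg)
		flags |= PAUSE_AUTONEG;
	if (req->rx_pause)
		flags |= PAUSE_RX;
	if (req->tx_pause)
		flags |= PAUSE_TX;
	return flags;
}

/* Flag word to booleans, the direction taken by get_pauseparam. Unlike
 * the posted callback, every field is written, so a set bit yields true
 * and a clear bit yields false regardless of the previous contents. */
static struct pause_request flags_to_pause(unsigned int flags)
{
	struct pause_request req = {
		.autoneg = (flags & PAUSE_AUTONEG) != 0,
		.rx_pause = (flags & PAUSE_RX) != 0,
		.tx_pause = (flags & PAUSE_TX) != 0,
	};

	return req;
}
```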
[PATCH net-next 1/6] qede: Add support for {get, set}_channels
From: Sudarsana Kalluru

Signed-off-by: Sudarsana Kalluru
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qede/qede.h         |  1 +
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 53 +
 drivers/net/ethernet/qlogic/qede/qede_main.c    |  7 +++-
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index ea00d5f..a65a9b2 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -116,6 +116,7 @@ struct qede_dev {
 				 (edev)->dev_info.num_tc)

 	struct qede_fastpath *fp_array;
+	u16 req_rss;
 	u16 num_rss;
 	u8 num_tc;
 #define QEDE_RSS_CNT(edev)	((edev)->num_rss)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 3a36247..ea2fda8 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -366,6 +366,57 @@ int qede_change_mtu(struct net_device *ndev, int new_mtu)
 	return 0;
 }

+static void qede_get_channels(struct net_device *dev,
+			      struct ethtool_channels *channels)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+
+	channels->max_combined = QEDE_MAX_RSS_CNT(edev);
+	channels->combined_count = QEDE_RSS_CNT(edev);
+}
+
+static int qede_set_channels(struct net_device *dev,
+			     struct ethtool_channels *channels)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+
+	DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+		   "set-channels command parameters: rx = %d, tx = %d, other = %d, combined = %d\n",
+		   channels->rx_count, channels->tx_count,
+		   channels->other_count, channels->combined_count);
+
+	/* We don't support separate rx / tx, nor `other' channels. */
+	if (channels->rx_count || channels->tx_count ||
+	    channels->other_count || (channels->combined_count == 0) ||
+	    (channels->combined_count > QEDE_MAX_RSS_CNT(edev))) {
+		DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+			   "command parameters not supported\n");
+		return -EINVAL;
+	}
+
+	/* Check if there was a change in the active parameters */
+	if (channels->combined_count == QEDE_RSS_CNT(edev)) {
+		DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+			   "No change in active parameters\n");
+		return 0;
+	}
+
+	/* We need the number of queues to be divisible between the hwfns */
+	if (channels->combined_count % edev->dev_info.common.num_hwfns) {
+		DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+			   "Number of channels must be divisable by %04x\n",
+			   edev->dev_info.common.num_hwfns);
+		return -EINVAL;
+	}
+
+	/* Set number of queues and reload if necessary */
+	edev->req_rss = channels->combined_count;
+	if (netif_running(dev))
+		qede_reload(edev, NULL, NULL);
+
+	return 0;
+}
+
 static const struct ethtool_ops qede_ethtool_ops = {
 	.get_settings = qede_get_settings,
 	.set_settings = qede_set_settings,
@@ -377,6 +428,8 @@ static const struct ethtool_ops qede_ethtool_ops = {
 	.get_ethtool_stats = qede_get_ethtool_stats,
 	.get_sset_count = qede_get_sset_count,
+	.get_channels = qede_get_channels,
+	.set_channels = qede_set_channels,
 };

 void qede_set_ethtool_ops(struct net_device *dev)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index f4657a2..6237f10 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1502,8 +1502,11 @@ static int qede_set_num_queues(struct qede_dev *edev)
 	u16 rss_num;

 	/* Setup queues according to possible resources*/
-	rss_num = netif_get_num_default_rss_queues() *
-		  edev->dev_info.common.num_hwfns;
+	if (edev->req_rss)
+		rss_num = edev->req_rss;
+	else
+		rss_num = netif_get_num_default_rss_queues() *
+			  edev->dev_info.common.num_hwfns;

 	rss_num = min_t(u16, QEDE_MAX_RSS_CNT(edev), rss_num);

--
1.9.3
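The acceptance rules that qede_set_channels() applies to the combined channel count can be condensed into a pure function, which makes them easy to check in isolation. The sketch below is a user-space restatement, not driver code: `check_combined_count` is a hypothetical name, and it leaves out the rejection of separate rx/tx/other counts and the early "no change" return that the patch also performs.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical pure-function version of the combined-count checks:
 * the count must be non-zero, must not exceed the device maximum, and
 * must divide evenly among the hardware functions.
 * Returns 0 if acceptable, -EINVAL otherwise. */
static int check_combined_count(unsigned int combined, unsigned int max_rss,
				unsigned int num_hwfns)
{
	assert(num_hwfns > 0);	/* guards the modulo below */

	if (combined == 0 || combined > max_rss)
		return -EINVAL;
	if (combined % num_hwfns)
		return -EINVAL;
	return 0;
}
```

With two hardware functions and a maximum of 8, for instance, 4 channels pass while 3 are rejected because they cannot be split evenly.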
[PATCH net-next 2/6] qede: Add support for {get, set}_ringparam
From: Sudarsana Kalluru

Signed-off-by: Sudarsana Kalluru
Signed-off-by: Yuval Mintz
---
 drivers/net/ethernet/qlogic/qede/qede.h         |  4 +--
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 44 +
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index a65a9b2..7c6caf7 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -270,13 +270,13 @@ int qede_change_mtu(struct net_device *dev, int new_mtu);
 void qede_fill_by_demand_stats(struct qede_dev *edev);

 #define RX_RING_SIZE_POW	13
-#define RX_RING_SIZE		BIT(RX_RING_SIZE_POW)
+#define RX_RING_SIZE		((u16)BIT(RX_RING_SIZE_POW))
 #define NUM_RX_BDS_MAX		(RX_RING_SIZE - 1)
 #define NUM_RX_BDS_MIN		128
 #define NUM_RX_BDS_DEF		NUM_RX_BDS_MAX

 #define TX_RING_SIZE_POW	13
-#define TX_RING_SIZE		BIT(TX_RING_SIZE_POW)
+#define TX_RING_SIZE		((u16)BIT(TX_RING_SIZE_POW))
 #define NUM_TX_BDS_MAX		(TX_RING_SIZE - 1)
 #define NUM_TX_BDS_MIN		128
 #define NUM_TX_BDS_DEF		NUM_TX_BDS_MAX
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index ea2fda8..10d80ba 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -333,6 +333,48 @@ static u32 qede_get_link(struct net_device *dev)
 	return current_link.link_up;
 }

+static void qede_get_ringparam(struct net_device *dev,
+			       struct ethtool_ringparam *ering)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+
+	ering->rx_max_pending = NUM_RX_BDS_MAX;
+	ering->rx_pending = edev->q_num_rx_buffers;
+	ering->tx_max_pending = NUM_TX_BDS_MAX;
+	ering->tx_pending = edev->q_num_tx_buffers;
+}
+
+static int qede_set_ringparam(struct net_device *dev,
+			      struct ethtool_ringparam *ering)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+
+	DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+		   "Set ring params command parameters: rx_pending = %d, tx_pending = %d\n",
+		   ering->rx_pending, ering->tx_pending);
+
+	/* Validate legality of configuration */
+	if (ering->rx_pending > NUM_RX_BDS_MAX ||
+	    ering->rx_pending < NUM_RX_BDS_MIN ||
+	    ering->tx_pending > NUM_TX_BDS_MAX ||
+	    ering->tx_pending < NUM_TX_BDS_MIN) {
+		DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+			   "Can only support Rx Buffer size [0%08x,...,0x%08x] and Tx Buffer size [0x%08x,...,0x%08x]\n",
+			   NUM_RX_BDS_MIN, NUM_RX_BDS_MAX,
+			   NUM_TX_BDS_MIN, NUM_TX_BDS_MAX);
+		return -EINVAL;
+	}
+
+	/* Change ring size and re-load */
+	edev->q_num_rx_buffers = ering->rx_pending;
+	edev->q_num_tx_buffers = ering->tx_pending;
+
+	if (netif_running(edev->ndev))
+		qede_reload(edev, NULL, NULL);
+
+	return 0;
+}
+
 static void qede_update_mtu(struct qede_dev *edev, union qede_reload_args *args)
 {
 	edev->ndev->mtu = args->mtu;
@@ -424,6 +466,8 @@ static const struct ethtool_ops qede_ethtool_ops = {
 	.get_msglevel = qede_get_msglevel,
 	.set_msglevel = qede_set_msglevel,
 	.get_link = qede_get_link,
+	.get_ringparam = qede_get_ringparam,
+	.set_ringparam = qede_set_ringparam,
 	.get_strings = qede_get_strings,
 	.get_ethtool_stats = qede_get_ethtool_stats,
 	.get_sset_count = qede_get_sset_count,
--
1.9.3
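To make the accepted range concrete: with a ring size of 2^13 the patch's limits work out to a minimum of 128 and a maximum of 8191 descriptors. The following user-space sketch mirrors that arithmetic and the bounds check; `RING_SIZE`, `ring_size_valid` and `ring_limits_selfcheck` are hypothetical names, and `uint16_t` stands in for the kernel's `u16`.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the RX/TX defines in qede.h, using stdint types. */
#define RING_SIZE_POW	13
#define RING_SIZE	((uint16_t)(1u << RING_SIZE_POW))
#define NUM_BDS_MAX	(RING_SIZE - 1)
#define NUM_BDS_MIN	128

/* Non-zero when 'pending' lies inside [NUM_BDS_MIN, NUM_BDS_MAX]. */
static int ring_size_valid(uint32_t pending)
{
	return pending >= NUM_BDS_MIN && pending <= NUM_BDS_MAX;
}

/* Sanity check of the derived constants themselves. */
static void ring_limits_selfcheck(void)
{
	assert(NUM_BDS_MIN < NUM_BDS_MAX);
	assert(NUM_BDS_MAX == 8191);
}
```

The driver applies this same check separately to rx_pending and tx_pending and returns -EINVAL if either falls outside the range.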
[PATCH net-next 0/6] qede/qed: Implement various ethtool operations
This series adds several new ethtool operations to qede:
  - {get, set}_channels
  - {get, set}_ringparam
  - set_phys_id
  - nway_reset
  - {get, set}_pauseparam

It also extends the qed APIs to support these commands.

Dave, please consider applying this series to `net-next'.

Thanks,
Yuval
[PATCH 21/28] net: pch_gbe: mark Minnow PHY reset GPIO active low
The Minnow PHY reset GPIO is set to 0 to enter reset & 1 to leave reset -
that is, it is an active low GPIO. In order to allow for the code to be
made more generic by further patches, indicate to the GPIO subsystem
that the GPIO is active low & invert the values it is set to such that
they reflect logically whether the device is being reset or not.

Signed-off-by: Paul Burton
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 3b98b263b..fde4c11 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -2717,7 +2717,8 @@ err_free_netdev:
  */
 static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
 {
-	unsigned long flags = GPIOF_DIR_OUT | GPIOF_INIT_HIGH | GPIOF_EXPORT;
+	unsigned long flags = GPIOF_DIR_OUT | GPIOF_INIT_LOW |
+			      GPIOF_EXPORT | GPIOF_ACTIVE_LOW;
 	unsigned gpio = MINNOW_PHY_RESET_GPIO;
 	int ret;

@@ -2729,10 +2730,10 @@ static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
 		return ret;
 	}

-	gpio_set_value(gpio, 0);
-	usleep_range(1250, 1500);
 	gpio_set_value(gpio, 1);
 	usleep_range(1250, 1500);
+	gpio_set_value(gpio, 0);
+	usleep_range(1250, 1500);

 	return ret;
 }
--
2.6.2
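The logical-versus-physical distinction this commit message relies on can be shown with a toy model: once a line is marked active low, writing logical 1 ("assert reset") drives the pin to 0, and writing logical 0 ("release reset") drives it to 1. The sketch below is a user-space illustration only. `model_gpio` and `model_gpio_set` are hypothetical and are not the kernel's GPIO API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a GPIO line; not the kernel's struct gpio_desc. */
struct model_gpio {
	bool active_low;	/* true: the signal is asserted by driving 0 */
	int physical;		/* level actually driven on the pin */
};

/* Logical set: 1 means "asserted", 0 means "deasserted". For an
 * active-low line the level driven on the pin is the inverse. */
static void model_gpio_set(struct model_gpio *g, int logical)
{
	assert(logical == 0 || logical == 1);
	g->physical = g->active_low ? !logical : logical;
}
```

Read this way, the reordered sequence in the diff (set 1, sleep, set 0, sleep) expresses "enter reset, wait, leave reset" in logical terms, which is what lets later patches share the code across boards.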
[PATCH 22/28] net: pch_gbe: pull PHY GPIO handling out of Minnow code
The MIPS Boston development board uses the Intel EG20T Platform
Controller Hub, including its gigabit ethernet controller, and requires
that its RTL8211E PHY be reset much like the Minnow platform. Pull the
PHY reset GPIO handling out of Minnow-specific code such that it can be
shared by later patches.

Signed-off-by: Paul Burton
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h    |  4 ++-
 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   | 33 +++---
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
index 2a55d6d..884f90b 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
@@ -582,15 +582,17 @@ struct pch_gbe_hw_stats {

 /**
  * struct pch_gbe_privdata - PCI Device ID driver data
+ * @phy_reset_gpio:		PHY reset GPIO descriptor.
  * @phy_tx_clk_delay:		Bool, configure the PHY TX delay in software
  * @phy_disable_hibernate:	Bool, disable PHY hibernation
  * @platform_init:		Platform initialization callback, called from
  *				probe, prior to PHY initialization.
  */
 struct pch_gbe_privdata {
+	struct gpio_desc *phy_reset_gpio;
 	bool phy_tx_clk_delay;
 	bool phy_disable_hibernate;
-	int (*platform_init)(struct pci_dev *pdev);
+	int (*platform_init)(struct pci_dev *, struct pch_gbe_privdata *);
 };

 /**
diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index fde4c11..23d28f0 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -360,6 +360,16 @@ static void pch_gbe_mac_mar_set(struct pch_gbe_hw *hw, u8 * addr, u32 index)
 	pch_gbe_wait_clr_bit(&hw->reg->ADDR_MASK, PCH_GBE_BUSY);
 }

+static void pch_gbe_phy_set_reset(struct pch_gbe_hw *hw, int value)
+{
+	struct pch_gbe_adapter *adapter = pch_gbe_hw_to_adapter(hw);
+
+	if (!adapter->pdata || !adapter->pdata->phy_reset_gpio)
+		return;
+
+	gpiod_set_value(adapter->pdata->phy_reset_gpio, value);
+}
+
 /**
  * pch_gbe_mac_reset_hw - Reset hardware
  * @hw:	Pointer to the HW structure
@@ -2627,7 +2637,14 @@ static int pch_gbe_probe(struct pci_dev *pdev,
 	adapter->hw.reg = pcim_iomap_table(pdev)[PCH_GBE_PCI_BAR];
 	adapter->pdata = (struct pch_gbe_privdata *)pci_id->driver_data;
 	if (adapter->pdata && adapter->pdata->platform_init)
-		adapter->pdata->platform_init(pdev);
+		adapter->pdata->platform_init(pdev, pdata);
+
+	if (adapter->pdata && adapter->pdata->phy_reset_gpio) {
+		pch_gbe_phy_set_reset(&adapter->hw, 1);
+		usleep_range(1250, 1500);
+		pch_gbe_phy_set_reset(&adapter->hw, 0);
+		usleep_range(1250, 1500);
+	}

 	adapter->ptp_pdev = pci_get_bus_and_slot(adapter->pdev->bus->number,
 						 PCI_DEVFN(12, 4));
@@ -2715,7 +2732,8 @@ err_free_netdev:
 /* The AR803X PHY on the MinnowBoard requires a physical pin to be toggled to
  * ensure it is awake for probe and init. Request the line and reset the PHY.
  */
-static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
+static int pch_gbe_minnow_platform_init(struct pci_dev *pdev,
+					struct pch_gbe_privdata *pdata)
 {
 	unsigned long flags = GPIOF_DIR_OUT | GPIOF_INIT_LOW |
 			      GPIOF_EXPORT | GPIOF_ACTIVE_LOW;
@@ -2724,16 +2742,11 @@ static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)

 	ret = devm_gpio_request_one(&pdev->dev, gpio, flags,
 				    "minnow_phy_reset");
-	if (ret) {
+	if (!ret)
+		pdata->phy_reset_gpio = gpio_to_desc(gpio);
+	else
 		dev_err(&pdev->dev,
 			"ERR: Can't request PHY reset GPIO line '%d'\n", gpio);
-		return ret;
-	}
-
-	gpio_set_value(gpio, 1);
-	usleep_range(1250, 1500);
-	gpio_set_value(gpio, 0);
-	usleep_range(1250, 1500);

 	return ret;
 }
--
2.6.2
[PATCH 19/28] net: pch_gbe: allow build on MIPS platforms
Allow the pch_gbe driver to be built on MIPS platforms, in preparation
for its use on the MIPS Boston board.

Signed-off-by: Paul Burton
---
 drivers/net/ethernet/oki-semi/pch_gbe/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig b/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig
index 5f7a352..4d3809a 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig
@@ -4,7 +4,7 @@
 
 config PCH_GBE
 	tristate "OKI SEMICONDUCTOR IOH(ML7223/ML7831) GbE"
-	depends on PCI && (X86_32 || COMPILE_TEST)
+	depends on PCI && (X86_32 || MIPS || COMPILE_TEST)
 	select MII
 	select PTP_1588_CLOCK_PCH
 	select NET_PTP_CLASSIFY
-- 
2.6.2
[PATCH 23/28] net: pch_gbe: always reset PHY along with MAC
On the MIPS Boston development board, the EG20T MAC does not report
receiving the RX clock from the (RGMII) RTL8211E PHY unless the PHY is
reset at the same time as the MAC. Since the pch_gbe driver resets the
MAC a number of times - twice during probe, and when taking down the
network interface - we need to reset the PHY at all the same times. Do
that from pch_gbe_mac_reset_hw which is used to reset the MAC in all
cases.

Signed-off-by: Paul Burton
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 23d28f0..824ff9e 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -378,10 +378,13 @@ static void pch_gbe_mac_reset_hw(struct pch_gbe_hw *hw)
 {
 	/* Read the MAC address. and store to the private data */
 	pch_gbe_mac_read_mac_addr(hw);
+	pch_gbe_phy_set_reset(hw, 1);
 	iowrite32(PCH_GBE_ALL_RST, &hw->reg->RESET);
 #ifdef PCH_GBE_MAC_IFOP_RGMII
 	iowrite32(PCH_GBE_MODE_GMII_ETHER, &hw->reg->MODE);
 #endif
+	pch_gbe_phy_set_reset(hw, 0);
+	usleep_range(1250, 1500);
 	pch_gbe_wait_clr_bit(&hw->reg->RESET, PCH_GBE_ALL_RST);
 	/* Setup the receive addresses */
 	pch_gbe_mac_mar_set(hw, hw->mac.addr, 0);
-- 
2.6.2
[PATCH 24/28] net: pch_gbe: add device tree support
Introduce support for retrieving the PHY reset GPIO from device tree, which will be used on the MIPS Boston development board. This requires support for probe deferral in order to work correctly, since the order of device probe is not guaranteed & typically the EG20T GPIO controller device will be probed after the ethernet MAC. Signed-off-by: Paul Burton--- .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 33 +- 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c index 824ff9e..f2a9a38 100644 --- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c +++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c @@ -23,6 +23,8 @@ #include #include #include +#include +#include #define DRV_VERSION "1.01" const char pch_driver_version[] = DRV_VERSION; @@ -2594,13 +2596,41 @@ static void pch_gbe_remove(struct pci_dev *pdev) free_netdev(netdev); } +static int pch_gbe_parse_dt(struct pci_dev *pdev, + struct pch_gbe_privdata **pdata) +{ + struct device_node *np = pdev->dev.of_node; + struct gpio_desc *gpio; + + if (!config_enabled(CONFIG_OF) || !np) + return 0; + + if (!*pdata) + *pdata = devm_kzalloc(>dev, sizeof(**pdata), GFP_KERNEL); + if (!*pdata) + return -ENOMEM; + + gpio = devm_gpiod_get(>dev, "phy-reset", GPIOD_ASIS); + if (IS_ERR(gpio)) + return PTR_ERR(gpio); + + (*pdata)->phy_reset_gpio = gpio; + return 0; +} + static int pch_gbe_probe(struct pci_dev *pdev, const struct pci_device_id *pci_id) { struct net_device *netdev; struct pch_gbe_adapter *adapter; + struct pch_gbe_privdata *pdata; int ret; + pdata = (struct pch_gbe_privdata *)pci_id->driver_data; + ret = pch_gbe_parse_dt(pdev, ); + if (ret) + goto err_out; + ret = pcim_enable_device(pdev); if (ret) return ret; @@ -2638,7 +2668,7 @@ static int pch_gbe_probe(struct pci_dev *pdev, adapter->pdev = pdev; adapter->hw.back = adapter; adapter->hw.reg = pcim_iomap_table(pdev)[PCH_GBE_PCI_BAR]; - 
adapter->pdata = (struct pch_gbe_privdata *)pci_id->driver_data; + adapter->pdata = pdata; if (adapter->pdata && adapter->pdata->platform_init) adapter->pdata->platform_init(pdev, pdata); @@ -2729,6 +2759,7 @@ err_free_adapter: pch_gbe_hal_phy_hw_reset(>hw); err_free_netdev: free_netdev(netdev); +err_out: return ret; } -- 2.6.2 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 18/28] ptp: pch: allow build on MIPS platforms
Allow the ptp_pch driver to be built on MIPS platforms in preparation
for use on the MIPS Boston board.

Signed-off-by: Paul Burton
---
 drivers/ptp/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index ee3de34..ee43549 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -74,7 +74,7 @@ config DP83640_PHY
 
 config PTP_1588_CLOCK_PCH
 	tristate "Intel PCH EG20T as PTP clock"
-	depends on X86_32 || COMPILE_TEST
+	depends on X86_32 || MIPS || COMPILE_TEST
 	depends on HAS_IOMEM && NET
 	select PTP_1588_CLOCK
 	help
-- 
2.6.2
Re: [PATCH 00/13] mvneta Buffer Management and enhancements
From: Marcin Wojtas
Date: Mon, 30 Nov 2015 15:13:22 +0100

> What kind of abstraction and helpers do you mean? Some kind of API
> (e.g. bm_alloc_buffer, bm_initialize_ring, bm_put_buffer,
> bm_get_buffer), which would be used by platform drivers (and specific
> applications if one wants to develop on top of the kernel)?
>
> In general, what is your top-view of such solution and its cooperation
> with the drivers?

The tricky parts involved have to do with allocating pages for the
buffer pools and minimizing the number of atomic refcounting
operations on those pages for the puts and gets, particularly
around buffer replenish runs.

For example, if you're allocating a page for a buffer pool the device
will chop into N (for any N < PAGE_SIZE) byte pieces, you can
eliminate many atomic operations.
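One possible reading of the page-chopping remark, as a userspace sketch
only: when a page is carved into several buffers, the references for all
of them can be taken with a single atomic add instead of one atomic
increment per buffer. Everything here (struct fake_page, PAGE_BYTES,
carve_naive, carve_batched) is made up for illustration and is not taken
from mvneta or any kernel API; buf_size is assumed to be nonzero.

```c
#include <stdatomic.h>

#define PAGE_BYTES 4096u	/* assumed page size for this sketch */

/* Hypothetical page descriptor; the kernel's struct page is different. */
struct fake_page {
	atomic_uint refcount;
	unsigned int atomic_ops;	/* counts atomic updates, for illustration */
};

/* Naive: one atomic increment per buffer carved out of the page. */
static unsigned int carve_naive(struct fake_page *pg, unsigned int buf_size)
{
	unsigned int n = PAGE_BYTES / buf_size;

	for (unsigned int i = 0; i < n; i++) {
		atomic_fetch_add(&pg->refcount, 1);
		pg->atomic_ops++;
	}
	return n;
}

/* Batched: take all n references with a single atomic add. */
static unsigned int carve_batched(struct fake_page *pg, unsigned int buf_size)
{
	unsigned int n = PAGE_BYTES / buf_size;

	atomic_fetch_add(&pg->refcount, n);
	pg->atomic_ops++;
	return n;
}
```

Both variants leave the page with the same reference count; the batched
one gets there with one atomic operation per page rather than one per
buffer, which is the kind of saving that matters in a replenish run.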
Re: [PATCH 00/13] mvneta Buffer Management and enhancements
Hi Marcin, On dim., nov. 22 2015, Marcin Wojtaswrote: > Hi, > > Hereby I submit a patchset that introduces various fixes and support > for new features and enhancements to the mvneta driver: > > 1. First three patches are minimal fixes, stable-CC'ed. > > 2. Suspend to ram ('s2ram') support. Due to some stability problems > Thomas Petazzoni's patches did not get merged yet, but I used them for > verification. Contrary to wfi mode ('standby' - linux does not > differentiate between them, so same routines are used) all registers' > contents are lost due to power down, so the configuration has to be > fully reconstructed during resume. > > 3. Optimisations - concatenating TX descriptors' flush, basing on > xmit_more support and combined approach for finalizing egress processing. > Thanks to HR timer buffers can be released with small latency, which is > good for low transfer and small queues. Along with the timer, coalescing > irqs are used, whose threshold could be increased back to 15. > > 4. Buffer manager (BM) support with two preparatory commits. As it is a > separate block, common for all network ports, a new driver is introduced, > which configures it and exposes API to the main network driver. It is > throughly described in binding documentation and commit log. Please note, > that enabling per-port BM usage is done using phandle and the data passed > in mvneta_bm_probe. It is designed for usage of on-demand device probe > and dev_set/get_drvdata, however it's awaiting merge to linux-next. > Therefore, deferring probe is not used - if something goes wrong (same > in case of errors during changing MTU or suspend/resume cycle) mvneta > driver falls back to software buffer management and works in a regular way. 
> > Known issues: > - problems with obtaining all mapped buffers from internal SRAM, when > destroying the buffer pointer pool > - problems with unmapping chunk of SRAM during driver removal > Above do not have an impact on the operation, as they are called during > driver removal or in error path. > > 5. Enable BM on Armada XP and 38X development boards - those ones and > A370 I could check on my own. In all cases they survived night-long > linerate iperf. Also tests were performed with A388 SoC working as a > network bridge between two packet generators. They showed increase of > maximum processed 64B packets by ~20k (~555k packets with BM enabled > vs ~535 packets without BM). Also when pushing 1500B-packets with a > line rate achieved, CPU load decreased from around 25% without BM vs > 18-20% with BM. I was trying to test the BM part of tour series on the Armada XP GP board. However it failed very quickly during the pool allocation. After a first debug I found that the size of the cs used in the mvebu_mbus_dram_info struct was 0. I have applied your series on a v4.4-rc1 kernel. At this stage I don't know if it is a regression in the mbus driver, a misconfiguration on my side or something else. Does it ring a bell for you? How do you test test it exactly? Especially on which kernel and with which U-Boot? Thanks, Gregory > > I'm looking forward to any remarks and comments. 
> > Best regards, > Marcin Wojtas > > Marcin Wojtas (12): > net: mvneta: add configuration for MBUS windows access protection > net: mvneta: enable IP checksum with jumbo frames for Armada 38x on > Port0 > net: mvneta: fix bit assignment in MVNETA_RXQ_CONFIG_REG > net: mvneta: enable suspend/resume support > net: mvneta: enable mixed egress processing using HR timer > bus: mvebu-mbus: provide api for obtaining IO and DRAM window > information > ARM: mvebu: enable SRAM support in mvebu_v7_defconfig > net: mvneta: bm: add support for hardware buffer management > ARM: mvebu: add buffer manager nodes to armada-38x.dtsi > ARM: mvebu: enable buffer manager support on Armada 38x boards > ARM: mvebu: add buffer manager nodes to armada-xp.dtsi > ARM: mvebu: enable buffer manager support on Armada XP boards > > Simon Guinot (1): > net: mvneta: add xmit_more support > > .../bindings/net/marvell-armada-370-neta.txt | 19 +- > .../devicetree/bindings/net/marvell-neta-bm.txt| 49 ++ > arch/arm/boot/dts/armada-385-db-ap.dts | 20 +- > arch/arm/boot/dts/armada-388-db.dts| 17 +- > arch/arm/boot/dts/armada-388-gp.dts| 17 +- > arch/arm/boot/dts/armada-38x.dtsi | 20 +- > arch/arm/boot/dts/armada-xp-db.dts | 19 +- > arch/arm/boot/dts/armada-xp-gp.dts | 19 +- > arch/arm/boot/dts/armada-xp.dtsi | 18 + > arch/arm/configs/mvebu_v7_defconfig| 1 + > drivers/bus/mvebu-mbus.c | 51 ++ > drivers/net/ethernet/marvell/Kconfig | 14 + > drivers/net/ethernet/marvell/Makefile | 1 + > drivers/net/ethernet/marvell/mvneta.c | 660 > +++-- >
Re: What now when we're [almost] out of ADVERTISED bits?
yes, I will update+repost.

On Sun, Nov 29, 2015 at 10:11 PM, Yuval Mintz wrote:
>>>> there was a work by David Decotigny that should have solved the out
>>>> of bits problem here [1]. Maybe it should be revived.
>>>>
>>>> [1] https://lkml.org/lkml/2015/1/26/882
>>>
>>> Yes, it should.
>>
>> A repost would strongly facilitate that.
>>
>> Just if anyone ever thinks something is being ignored, just don't even use your
>> brain, simply repost it again.
>
> David, are you going to re-post? Or do you want me to take over this one?
>
> Thanks,
> Yuval
Re: [PATCH net] ipv4: igmp: Allow removing groups from a removed interface
On Mon, Nov 30, 2015 at 11:01:48AM -0500, David Miller wrote:
> From: Andrew Lunn
> Date: Wed, 25 Nov 2015 21:15:36 +0100
>
> > @@ -2126,7 +2126,7 @@ int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr)
> >  	ASSERT_RTNL();
> >
> >  	in_dev = ip_mc_find_dev(net, imr);
> > -	if (!in_dev) {
> > +	if (!imr->imr_ifindex && !imr->imr_address.s_addr && !in_dev) {
> >  		ret = -ENODEV;
> >  		goto out;
> >  	}
>
> Now, ip_mc_dec_group() below can take a NULL pointer dereference. One example
> is if imr_ifindex is specified and the lookup returns NULL in
> ip_mc_find_dev().

Agreed. Earlier code had an if (in_dev) before the call to
ip_mc_dec_group(). It got removed along the way and now needs adding
back. A v2 patch will follow soon.

> This is so ridiculously complicated, just looking at this code breaks
> something.

Yep. I think part of the problem comes from the code being designed
before interfaces were hot-pluggable.
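For readers following along, the guard being discussed can be sketched
in userspace roughly as below. This is not the v2 patch: struct
in_device here is a stand-in with a single made-up field, and
dec_group() and leave_group() are hypothetical names that only mimic the
shape of ip_mc_dec_group() and ip_mc_leave_group().

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel type; not the real definition. */
struct in_device {
	int mc_users;
};

/* Stand-in for ip_mc_dec_group(): dereferences in_dev unconditionally. */
static void dec_group(struct in_device *in_dev)
{
	in_dev->mc_users--;
}

/*
 * Leave a group. in_dev may be NULL when the interface has already been
 * removed; the socket-side bookkeeping (*sock_refs) must still be undone.
 * Returns 0 on success.
 */
static int leave_group(struct in_device *in_dev, int *sock_refs)
{
	if (in_dev)		/* the guard that needs to come back */
		dec_group(in_dev);

	(*sock_refs)--;		/* socket membership is dropped either way */
	return 0;
}
```

With the guard in place, a leave on a vanished interface still releases
the socket's membership but never touches the missing device.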
Re: [PATCH 12/13] mm: memcontrol: account socket memory in unified hierarchy memory controller
On Mon, Nov 30, 2015 at 10:26:38AM -0500, Johannes Weiner wrote: > On Mon, Nov 30, 2015 at 01:54:21PM +0300, Vladimir Davydov wrote: > > On Tue, Nov 24, 2015 at 04:58:44PM -0500, Johannes Weiner wrote: > > ... > > > @@ -5520,15 +5557,30 @@ void sock_release_memcg(struct sock *sk) > > > */ > > > bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int > > > nr_pages) > > > { > > > - struct page_counter *counter; > > > + gfp_t gfp_mask = GFP_KERNEL; > > > > > > - if (page_counter_try_charge(>tcp_mem.memory_allocated, > > > - nr_pages, )) { > > > - memcg->tcp_mem.memory_pressure = 0; > > > - return true; > > > +#ifdef CONFIG_MEMCG_KMEM > > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) { > > > + struct page_counter *counter; > > > + > > > + if (page_counter_try_charge(>tcp_mem.memory_allocated, > > > + nr_pages, )) { > > > + memcg->tcp_mem.memory_pressure = 0; > > > + return true; > > > + } > > > + page_counter_charge(>tcp_mem.memory_allocated, nr_pages); > > > + memcg->tcp_mem.memory_pressure = 1; > > > + return false; > > > } > > > - page_counter_charge(>tcp_mem.memory_allocated, nr_pages); > > > - memcg->tcp_mem.memory_pressure = 1; > > > +#endif > > > + /* Don't block in the packet receive path */ > > > + if (in_softirq()) > > > + gfp_mask = GFP_NOWAIT; > > > + > > > + if (try_charge(memcg, gfp_mask, nr_pages) == 0) > > > + return true; > > > + > > > + try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages); > > > > We won't trigger high reclaim if we get here, because try_charge does > > not check high threshold if failing or forcing charge. I think this > > should be fixed regardless of this patch. The fix is attached below. > > We kind of assume that max is either set above high, or not at > all. That means when max is hit the high limit has already failed and > it's of limited use to schedule background reclaim. Yeah, you're right. No point scheduling the work here - it must be already running. 
> > > Also, I don't like calling try_charge twice: the second time will go > > through all the try_charge steps for nothing. What about checking > > page_counter value after calling try_charge instead: > > > > try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages); > > return page_counter_read(>memory) <= memcg->memory.limit; > > > > or adding an out parameter to try_charge that would inform us if charge > > was forced? > > That's a complete cold path where we are going to drop the packet in > all but a few cases. It's not worth the trouble. Right > > > > @@ -5539,10 +5591,32 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup > > > *memcg, unsigned int nr_pages) > > > */ > > > void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int > > > nr_pages) > > > { > > > - page_counter_uncharge(>tcp_mem.memory_allocated, nr_pages); > > > +#ifdef CONFIG_MEMCG_KMEM > > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) { > > > + page_counter_uncharge(>tcp_mem.memory_allocated, > > > + nr_pages); > > > + return; > > > + } > > > +#endif > > > + page_counter_uncharge(>memory, nr_pages); > > > + css_put_many(>css, nr_pages); > > > > cancel_charge(memcg, nr_pages); > > It does the same, but it's a weird name for regular uncharging. Right > > > From: Vladimir Davydov> > Subject: [PATCH] memcg: check high threshold if forcing allocation > > > > try_charge() does not result in checking high threshold if it forces > > charge. This is incorrect, because we could have failed to reclaim > > memory due to the current context, so we do need to check high threshold > > and try to compensate for the excess once we are in the safe context. 
> > > > Signed-off-by: Vladimir Davydov > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index 79a29d564bff..e922965b572b 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -2112,13 +2112,14 @@ static int try_charge(struct mem_cgroup *memcg, > > gfp_t gfp_mask, > > page_counter_charge(>memsw, nr_pages); > > css_get_many(>css, nr_pages); > > > > - return 0; > > + goto check_high; > > > > done_restock: > > css_get_many(>css, batch); > > if (batch > nr_pages) > > refill_stock(memcg, batch - nr_pages); > > > > +check_high: > > /* > > * If the hierarchy is above the normal consumption range, schedule > > * reclaim on returning to userland. We can perform reclaim here > > One problem is that OOM victims force their charges so they can exit > quickly. It'd be contradictory to then task them with high reclaim. > Yeah, scratch that patch. It isn't necessary anyway, because, as you pointed out, we don't really need to schedule high reclaim when we fail hard in mem_cgroup_charge_skmem. No more
[P.A. Semi] Does the ethernet interface work on your Electra, Chitra, Nemo, and Athena board?
Hi All,

I have tested the PA Semi Ethernet with the kernels 4.2.3 and 4.3.0
today. With the kernel 4.2.3 it works but with the kernel 4.3.0 final
it doesn't work. After that I tested some git kernels and release
candidates of 4.3.

Kernel 4.3 git from Tue Sep 01, 2015 -> PA Semi Ethernet works
Kernel 4.3 git from Wed Sep 02, 2015 -> PA Semi Ethernet works
Kernel 4.3 git from Thu Sep 03, 2015 -> PA Semi Ethernet works
Kernel 4.3 git from Fri Sep 04, 2015 -> PA Semi Ethernet doesn't work
(Merge tag 'powerpc-4.3-1': https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ff474e8ca8547d09cb82ebab56d4c96f9eea01ce)
Kernel 4.3 git from Sat Sep 05, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 git from Mon Sep 07, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 git from Wed Sep 09, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 git from Fri Sep 11, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 RC1 from Sun Sep 13, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 RC2 from Mon Sep 21, 2015 -> PA Semi Ethernet doesn't work

The problematic commit must be between Thu Sep 03, 2015 at 09:37 AM
(UTC +2) and Fri Sep 04, 2015 at 7:38 PM (UTC +2) in the linux git.

Linux git: Between
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/?ofs=15500
and
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/?ofs=15200.
Maybe
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ff474e8ca8547d09cb82ebab56d4c96f9eea01ce.

Cheers,
Christian

On 30 November 2015 at 10:48 AM, Christian Zigotzky wrote:

Hi Denis,

Thank you for your answer. Sorry because of my description. Yes, the
driver probe function finds the device.

With kernel 4.4-rc3:

dmesg | grep -i eth0
[ 2.297473] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00

dhclient eth0
RTNETLINK answers: Cannot allocate memory

With kernel 4.1.13:

dmesg | grep -i eth0
[ 2.328115] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00
[ 37.130466] eth0: Link is up at 100 Mbps, full duplex.

Cheers,
Christian

On 30 November 2015 at 09:37 AM, Denis Kirjanov wrote:

On 11/29/15, Christian Zigotzky wrote:

Hi All,

Does the ethernet interface on your Electra, Chitra, Nemo, and Athena
board work with the release candidates of the kernel 4.4?
Unfortunately the P.A. Semi ethernet doesn't work on our Nemo boards
with the release candidates of the kernel 4.4.

We have set the following entries in the kernel config:

CONFIG_NET_VENDOR_PASEMI=y
CONFIG_PASEMI_MAC=y

Could you please test the P.A. Semi ethernet on your P.A. Semi boards?

It's not clear from your descriptions what is not working.
Does the driver probe function find a device? Does the interface show
up in the kernel? Can it send/receive packets?

Also please CC netdev.

Thanks.

Thanks in advance,
Christian
Re: [PATCH 13/13] mm: memcontrol: hook up vmpressure to socket pressure
On Mon, Nov 30, 2015 at 10:58:38AM -0500, Johannes Weiner wrote:
> On Mon, Nov 30, 2015 at 02:36:28PM +0300, Vladimir Davydov wrote:
> > Suppose we have the following cgroup configuration.
> >
> > A __ B
> >   \_ C
> >
> > A is empty (which is natural for the unified hierarchy AFAIU). B has
> > some workload running in it, and C generates socket pressure. Due to the
> > socket pressure coming from C we start reclaim in A, which results in
> > thrashing of B, but we might not put sockets under pressure in A or C,
> > because vmpressure does not account pages scanned/reclaimed in B when
> > generating a vmpressure event for A or C. This might result in
> > aggressive reclaim and thrashing in B w/o generating a signal for C to
> > stop growing socket buffers.
> >
> > Do you think such a situation is possible? If so, would it make sense to
> > switch to post-order walk in shrink_zone and pass sub-tree
> > scanned/reclaimed stats to vmpressure for each scanned memcg?
>
> In that case the LRU pages in C would experience pressure as well,
> which would then rein in the sockets in C. There must be some LRU
> pages in there, otherwise who is creating socket pressure?
>
> The same applies to shrinkers. All secondary reclaim is driven by LRU
> reclaim results.
>
> I can see that there is some unfairness in distributing memcg reclaim
> pressure purely based on LRU size, because there are scenarios where
> the auxiliary objects (incl. sockets, but mostly shrinker pools)
> amount to a significant portion of the group's memory footprint. But
> substitute group for NUMA node and we've had this behavior for
> years. I'm not sure it's actually a problem in practice.
>

Fair enough. Let's wait until we hit this problem in real world then.

The patch looks good to me.

Reviewed-by: Vladimir Davydov

Thanks,
Vladimir
[PATCH 25/28] net: pch_gbe: allow longer for resets
Resets of the EG20T MAC on the MIPS Boston development board take longer
than the 1000 loops that pch_gbe_wait_clr_bit was performing. Bump up
the number of loops.

Signed-off-by: Paul Burton
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index f2a9a38..f650f45 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -321,7 +321,7 @@ static void pch_gbe_wait_clr_bit(void *reg, u32 bit)
 	u32 tmp;
 
 	/* wait busy */
-	tmp = 1000;
+	tmp = 1;
 	while ((ioread32(reg) & bit) && --tmp)
 		cpu_relax();
 	if (!tmp)
-- 
2.6.2
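To make the effect of the loop bound concrete, here is a small userspace
model of the same bounded-poll pattern. The register is faked: struct
fake_reg and fake_read() are invented for this sketch, and
wait_clr_bit() only mirrors the shape of pch_gbe_wait_clr_bit(), with
the loop count passed in as a parameter (assumed to be at least 1)
rather than hard-coded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register model: the value clears after a number of reads. */
struct fake_reg {
	uint32_t value;
	unsigned int reads_until_clear;
};

static uint32_t fake_read(struct fake_reg *reg)
{
	if (reg->reads_until_clear == 0)
		reg->value = 0;
	else
		reg->reads_until_clear--;
	return reg->value;
}

/*
 * Poll until `bit` clears or `max_loops` iterations are used up.
 * max_loops must be at least 1.
 * Returns true if the bit cleared in time, false on timeout.
 */
static bool wait_clr_bit(struct fake_reg *reg, uint32_t bit,
			 unsigned int max_loops)
{
	unsigned int tmp = max_loops;

	while ((fake_read(reg) & bit) && --tmp)
		;	/* the driver calls cpu_relax() here */

	return tmp != 0;
}
```

A device that needs more reads than the bound allows is reported as a
timeout even though it would have finished shortly afterwards, which is
the situation the commit message describes on the Boston board.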
[PATCH v2 net-next] tcp: suppress too verbose messages in tcp_send_ack()
From: Eric DumazetIf tcp_send_ack() can not allocate skb, we properly handle this and setup a timer to try later. Use __GFP_NOWARN to avoid polluting syslog in the case host is under memory pressure, so that pertinent messages are not lost under a flood of useless information. sk_gfp_atomic() can use its gfp_mask argument (all callers currently were using GFP_ATOMIC before this patch) We rename sk_gfp_atomic() to sk_gfp_mask() to clearly express this function now takes into account its second argument (gfp_mask) Note that when tcp_transmit_skb() is called with clone_it set to false, we do not attempt memory allocations, so can pass a 0 gfp_mask, which most compilers can emit faster than a non zero or constant value. Signed-off-by: Eric Dumazet --- v2: rename sk_gfp_atomic() to sk_gfp_mask() include/net/sock.h|4 ++-- net/ipv4/tcp_output.c | 14 -- net/ipv6/tcp_ipv6.c |6 +++--- 3 files changed, 13 insertions(+), 11 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 7f89e4ba18d1..89073bda77df 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -774,9 +774,9 @@ static inline int sk_memalloc_socks(void) #endif -static inline gfp_t sk_gfp_atomic(const struct sock *sk, gfp_t gfp_mask) +static inline gfp_t sk_gfp_mask(const struct sock *sk, gfp_t gfp_mask) { - return GFP_ATOMIC | (sk->sk_allocation & __GFP_MEMALLOC); + return gfp_mask | (sk->sk_allocation & __GFP_MEMALLOC); } static inline void sk_acceptq_removed(struct sock *sk) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index cb7ca569052c..a800cee88035 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2296,7 +2296,7 @@ void __tcp_push_pending_frames(struct sock *sk, unsigned int cur_mss, return; if (tcp_write_xmit(sk, cur_mss, nonagle, 0, - sk_gfp_atomic(sk, GFP_ATOMIC))) + sk_gfp_mask(sk, GFP_ATOMIC))) tcp_check_probe_timer(sk); } @@ -3352,8 +3352,9 @@ void tcp_send_ack(struct sock *sk) * tcp_transmit_skb() will set the ownership to this * sock. 
*/ - buff = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC)); - if (!buff) { + buff = alloc_skb(MAX_TCP_HEADER, +sk_gfp_mask(sk, GFP_ATOMIC | __GFP_NOWARN)); + if (unlikely(!buff)) { inet_csk_schedule_ack(sk); inet_csk(sk)->icsk_ack.ato = TCP_ATO_MIN; inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK, @@ -3375,7 +3376,7 @@ void tcp_send_ack(struct sock *sk) /* Send it off, this clears delayed acks for us. */ skb_mstamp_get(>skb_mstamp); - tcp_transmit_skb(sk, buff, 0, sk_gfp_atomic(sk, GFP_ATOMIC)); + tcp_transmit_skb(sk, buff, 0, (__force gfp_t)0); } EXPORT_SYMBOL_GPL(tcp_send_ack); @@ -3396,7 +3397,8 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent, int mib) struct sk_buff *skb; /* We don't queue it, tcp_transmit_skb() sets ownership. */ - skb = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC)); + skb = alloc_skb(MAX_TCP_HEADER, + sk_gfp_mask(sk, GFP_ATOMIC | __GFP_NOWARN)); if (!skb) return -1; @@ -3409,7 +3411,7 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent, int mib) tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK); skb_mstamp_get(>skb_mstamp); NET_INC_STATS(sock_net(sk), mib); - return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC); + return tcp_transmit_skb(sk, skb, 0, (__force gfp_t)0); } void tcp_send_window_probe(struct sock *sk) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index c5429a636f1a..41bcd59a2ac7 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1130,7 +1130,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * */ tcp_md5_do_add(newsk, (union tcp_md5_addr *)>sk_v6_daddr, AF_INET6, key->key, key->keylen, - sk_gfp_atomic(sk, GFP_ATOMIC)); + sk_gfp_mask(sk, GFP_ATOMIC)); } #endif @@ -1146,7 +1146,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * /* Clone pktoptions received with SYN, if we own the req */ if (ireq->pktopts) { newnp->pktoptions = skb_clone(ireq->pktopts, - sk_gfp_atomic(sk, GFP_ATOMIC)); + 
sk_gfp_mask(sk, GFP_ATOMIC)); consume_skb(ireq->pktopts); ireq->pktopts = NULL; if (newnp->pktoptions) @@ -1212,7 +1212,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct
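The behavioural change in the helper rename can be seen in isolation
with a userspace sketch. The flag values below are illustrative only and
do not match the kernel's; struct fake_sock and the X_-prefixed macros
are assumptions made for this example.

```c
#include <stdint.h>

typedef uint32_t gfp_t;

/* Illustrative bit values only; the kernel's real flag values differ. */
#define X_GFP_ATOMIC	0x01u
#define X_GFP_NOWARN	0x02u
#define X_GFP_MEMALLOC	0x04u

/* Hypothetical stand-in for struct sock, reduced to the one field used. */
struct fake_sock {
	gfp_t sk_allocation;
};

/* Before the patch: the gfp_mask argument was ignored. */
static gfp_t old_sk_gfp_atomic(const struct fake_sock *sk, gfp_t gfp_mask)
{
	(void)gfp_mask;
	return X_GFP_ATOMIC | (sk->sk_allocation & X_GFP_MEMALLOC);
}

/* After the patch: the caller's mask is honoured. */
static gfp_t new_sk_gfp_mask(const struct fake_sock *sk, gfp_t gfp_mask)
{
	return gfp_mask | (sk->sk_allocation & X_GFP_MEMALLOC);
}
```

With the old helper a caller asking for the no-warn flag would silently
lose it; with the new one the flag survives, and the socket's memalloc
bit is still merged in.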
[PATCH net] ipv6: kill sk_dst_lock
From: Eric DumazetWhile testing the np->opt RCU conversion, I found that UDP/IPv6 was using a mixture of xchg() and sk_dst_lock to protect concurrent changes to sk->sk_dst_cache, leading to possible corruptions and crashes. ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest way to fix the mess is to remove sk_dst_lock completely, as we did for IPv4. __ip6_dst_store() and ip6_dst_store() share same implementation. sk_setup_caps() being called with socket lock being held or not, we have to use sk_dst_set() instead of __sk_dst_set() Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov --- include/net/ip6_route.h | 17 - include/net/sock.h |3 +-- net/core/sock.c |4 +--- net/dccp/ipv6.c |4 ++-- net/ipv6/af_inet6.c |2 +- net/ipv6/icmp.c | 14 -- net/ipv6/inet6_connection_sock.c | 10 +- net/ipv6/tcp_ipv6.c |4 ++-- 8 files changed, 12 insertions(+), 46 deletions(-) diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 2bfb2ad2fab1..877f682989b8 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -133,27 +133,18 @@ void rt6_clean_tohost(struct net *net, struct in6_addr *gateway); /* * Store a destination cache entry in a socket */ -static inline void __ip6_dst_store(struct sock *sk, struct dst_entry *dst, - const struct in6_addr *daddr, - const struct in6_addr *saddr) +static inline void ip6_dst_store(struct sock *sk, struct dst_entry *dst, +const struct in6_addr *daddr, +const struct in6_addr *saddr) { struct ipv6_pinfo *np = inet6_sk(sk); - struct rt6_info *rt = (struct rt6_info *) dst; + np->dst_cookie = rt6_get_cookie((struct rt6_info *)dst); sk_setup_caps(sk, dst); np->daddr_cache = daddr; #ifdef CONFIG_IPV6_SUBTREES np->saddr_cache = saddr; #endif - np->dst_cookie = rt6_get_cookie(rt); -} - -static inline void ip6_dst_store(struct sock *sk, struct dst_entry *dst, -struct in6_addr *daddr, struct in6_addr *saddr) -{ - spin_lock(>sk_dst_lock); - __ip6_dst_store(sk, dst, daddr, saddr); - spin_unlock(>sk_dst_lock); } 
static inline bool ipv6_unicast_destination(const struct sk_buff *skb) diff --git a/include/net/sock.h b/include/net/sock.h index 7f89e4ba18d1..27f1d03e7a73 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -254,7 +254,6 @@ struct cg_proto; *@sk_wq: sock wait queue and async head *@sk_rx_dst: receive input route used by early demux *@sk_dst_cache: destination cache - *@sk_dst_lock: destination cache lock *@sk_policy: flow policy *@sk_receive_queue: incoming packets *@sk_wmem_alloc: transmit queue bytes committed @@ -391,7 +390,7 @@ struct sock { #endif struct dst_entry*sk_rx_dst; struct dst_entry __rcu *sk_dst_cache; - spinlock_t sk_dst_lock; + /* Note: 32bit hole on 64bit arches */ atomic_tsk_wmem_alloc; atomic_tsk_omem_alloc; int sk_sndbuf; diff --git a/net/core/sock.c b/net/core/sock.c index 1e4dd54bfb5a..81cdeacfc5ce 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1530,7 +1530,6 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority) skb_queue_head_init(>sk_receive_queue); skb_queue_head_init(>sk_write_queue); - spin_lock_init(>sk_dst_lock); rwlock_init(>sk_callback_lock); lockdep_set_class_and_name(>sk_callback_lock, af_callback_keys + newsk->sk_family, @@ -1607,7 +1606,7 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst) { u32 max_segs = 1; - __sk_dst_set(sk, dst); + sk_dst_set(sk, dst); sk->sk_route_caps = dst->dev->features; if (sk->sk_route_caps & NETIF_F_GSO) sk->sk_route_caps |= NETIF_F_GSO_SOFTWARE; @@ -2388,7 +2387,6 @@ void sock_init_data(struct socket *sock, struct sock *sk) } else sk->sk_wq = NULL; - spin_lock_init(>sk_dst_lock); rwlock_init(>sk_callback_lock); lockdep_set_class_and_name(>sk_callback_lock, af_callback_keys + sk->sk_family, diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index db5fc2440a23..9ba3b69afea2 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -453,7 +453,7 @@ static struct sock *dccp_v6_request_recv_sock(const struct sock *sk, * comment in that function for the gory details. 
-acme */ - __ip6_dst_store(newsk, dst, NULL, NULL); +
[PATCH] sctp: use GFP_USER for user-controlled kmalloc
Dmitry Vyukov reported that the user could trigger a kernel warning by using a large len value for getsockopt SCTP_GET_LOCAL_ADDRS, as that value directly affects the value used as a kmalloc() parameter. This patch thus switches the allocation flags for all user-controllable kmalloc sizes to GFP_USER, to put some more restrictions on them, and also disables the warning, as it is not necessary. Signed-off-by: Marcelo Ricardo Leitner Acked-by: Daniel Borkmann --- net/sctp/socket.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 897c01c029cab3d5805cc56b0964c70e06f4143a..676b3bb092e16848fd1c822e1c999af4a2ef198d 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -972,7 +972,7 @@ static int sctp_setsockopt_bindx(struct sock *sk, return -EFAULT; /* Alloc space for the address array in kernel memory. */ - kaddrs = kmalloc(addrs_size, GFP_KERNEL); + kaddrs = kmalloc(addrs_size, GFP_USER | __GFP_NOWARN); if (unlikely(!kaddrs)) return -ENOMEM; @@ -4928,7 +4928,7 @@ static int sctp_getsockopt_local_addrs(struct sock *sk, int len, to = optval + offsetof(struct sctp_getaddrs, addrs); space_left = len - offsetof(struct sctp_getaddrs, addrs); - addrs = kmalloc(space_left, GFP_KERNEL); + addrs = kmalloc(space_left, GFP_USER | __GFP_NOWARN); if (!addrs) return -ENOMEM; -- 2.5.0
Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload
On Mon, Nov 23, 2015 at 1:02 PM, Anjali Singhai Jain wrote: > Replace add/del ndo ops for vxlan_port with tunnel_port so that all UDP > based tunnels can use the same ndo op. Add a parameter to pass tunnel > type to the ndo_op. > Please consider using RX ntuple filters for this instead of a new ndo op. The vxlan ndo op essentially implements a limited filter with a rule to match a destination UDP port and the action of processing the packet as vxlan. ntuple filters generalize that so that the filtering becomes arbitrary. We'll need the ability to filter on the 4-tuple when we implement tunnels that go through firewalls, or for offloading other UDP protocols such as SPUD or QUIC. Tom > Change all drivers to use the generalized udp tunnel offload > > Patch was compile tested with x86_64_defconfig. > > Signed-off-by: Kiran Patil > Signed-off-by: Anjali Singhai Jain > --- > drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 15 ++--- > drivers/net/ethernet/broadcom/bnxt/bnxt.c| 13 +--- > drivers/net/ethernet/emulex/benet/be_main.c | 14 +--- > drivers/net/ethernet/intel/fm10k/fm10k_netdev.c | 27 > drivers/net/ethernet/intel/i40e/i40e_main.c | 41 > +--- > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c| 17 +++--- > drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 21 > drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 17 +++--- > drivers/net/vxlan.c | 23 +++-- > include/linux/netdevice.h| 34 ++-- > include/net/udp_tunnel.h | 6 > 11 files changed, 157 insertions(+), 71 deletions(-) > > diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c > b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c > index 2273576..ad2782f 100644 > --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c > +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c > @@ -47,6 +47,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -10124,11 +10125,14 @@ static void __bnx2x_add_vxlan_port(struct bnx2x > *bp, u16 port) > } > > static void bnx2x_add_vxlan_port(struct 
net_device *netdev, > -sa_family_t sa_family, __be16 port) > +sa_family_t sa_family, __be16 port, > +u32 type) > { > struct bnx2x *bp = netdev_priv(netdev); > u16 t_port = ntohs(port); > > + if (type != UDP_TUNNEL_VXLAN) > + return; > __bnx2x_add_vxlan_port(bp, t_port); > } > > @@ -10152,11 +10156,14 @@ static void __bnx2x_del_vxlan_port(struct bnx2x > *bp, u16 port) > } > > static void bnx2x_del_vxlan_port(struct net_device *netdev, > -sa_family_t sa_family, __be16 port) > +sa_family_t sa_family, __be16 port, > +u32 type) > { > struct bnx2x *bp = netdev_priv(netdev); > u16 t_port = ntohs(port); > > + if (type != UDP_TUNNEL_VXLAN) > + return; > __bnx2x_del_vxlan_port(bp, t_port); > } > #endif > @@ -13008,8 +13015,8 @@ static const struct net_device_ops bnx2x_netdev_ops = > { > .ndo_set_vf_link_state = bnx2x_set_vf_link_state, > .ndo_features_check = bnx2x_features_check, > #ifdef CONFIG_BNX2X_VXLAN > - .ndo_add_vxlan_port = bnx2x_add_vxlan_port, > - .ndo_del_vxlan_port = bnx2x_del_vxlan_port, > + .ndo_add_udp_tunnel_port= bnx2x_add_vxlan_port, > + .ndo_del_udp_tunnel_port= bnx2x_del_vxlan_port, > #endif > }; > > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c > b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > index f2d0dc9..5b96ddf 100644 > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c > @@ -5421,7 +5421,7 @@ static void bnxt_cfg_ntp_filters(struct bnxt *bp) > #endif /* CONFIG_RFS_ACCEL */ > > static void bnxt_add_vxlan_port(struct net_device *dev, sa_family_t > sa_family, > - __be16 port) > + __be16 port, u32 type) > { > struct bnxt *bp = netdev_priv(dev); > > @@ -5431,6 +5431,9 @@ static void bnxt_add_vxlan_port(struct net_device *dev, > sa_family_t sa_family, > if (sa_family != AF_INET6 && sa_family != AF_INET) > return; > > + if (type != UDP_TUNNEL_VXLAN) > + return; > + > if (bp->vxlan_port_cnt && bp->vxlan_port != port) > return; > > @@ -5443,7 +5446,7 @@ static void bnxt_add_vxlan_port(struct net_device 
*dev, > sa_family_t sa_family, > } > > static void bnxt_del_vxlan_port(struct net_device *dev, sa_family_t > sa_family, > - __be16 port) > +
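Tom's point that a destination-port rule is a special case of an n-tuple filter can be illustrated with a small userspace sketch. Everything here (the `flow4` and `ntuple_rule_sketch` types, the action constant, the helper names, the sample addresses and ports) is hypothetical and chosen for illustration only; it is not the ethtool ntuple interface nor any driver's implementation, and fields are kept in host byte order for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical IPv4/UDP 4-tuple, host byte order for simplicity. */
struct flow4 {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Hypothetical masked rule: a field takes part in the match only
 * where the corresponding mask bits are set. */
struct ntuple_rule_sketch {
	struct flow4 value;
	struct flow4 mask;
	int action;
};

#define ACTION_SKETCH_VXLAN 1 /* illustrative action id */

/* A flow matches when every masked field equals the rule's value. */
static bool rule_matches(const struct ntuple_rule_sketch *r,
			 const struct flow4 *f)
{
	return ((f->saddr ^ r->value.saddr) & r->mask.saddr) == 0 &&
	       ((f->daddr ^ r->value.daddr) & r->mask.daddr) == 0 &&
	       ((f->sport ^ r->value.sport) & r->mask.sport) == 0 &&
	       ((f->dport ^ r->value.dport) & r->mask.dport) == 0;
}

/* The "port -> tunnel type" case: only the destination port is masked in. */
static bool port_only_rule_matches(uint16_t rule_dport, uint32_t saddr,
				   uint16_t dport)
{
	struct ntuple_rule_sketch r = {
		.value = { .dport = rule_dport },
		.mask = { .dport = 0xffff },
		.action = ACTION_SKETCH_VXLAN,
	};
	struct flow4 f = {
		.saddr = saddr, .daddr = 0x0a000002,
		.sport = 40000, .dport = dport,
	};

	return rule_matches(&r, &f);
}

/* The full 4-tuple case: every field must match exactly. */
static bool four_tuple_rule_matches(uint32_t rule_saddr, uint32_t saddr)
{
	struct ntuple_rule_sketch r = {
		.value = { .saddr = rule_saddr, .daddr = 0x0a000002,
			   .sport = 40000, .dport = 4789 },
		.mask = { .saddr = 0xffffffff, .daddr = 0xffffffff,
			  .sport = 0xffff, .dport = 0xffff },
		.action = ACTION_SKETCH_VXLAN,
	};
	struct flow4 f = {
		.saddr = saddr, .daddr = 0x0a000002,
		.sport = 40000, .dport = 4789,
	};

	return rule_matches(&r, &f);
}
```

With only the destination-port mask set, the rule behaves like the per-port callback; setting the remaining masks narrows it to a single flow without changing the matching code.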
Re: user-controllable kmalloc size in sctp_getsockopt_local_addrs
On Sat, Nov 28, 2015 at 01:40:08PM +0100, Dmitry Vyukov wrote: > Hello, > > The following program triggers WARNING in kmalloc: > I messed up with the in-reply-to, put an extra c on it, but I just posted a patch for this, subject: [PATCH] sctp: use GFP_USER for user-controlled kmalloc Thanks, Marcelo
Re: [PATCH net-next 3/3] net: mvneta: Add naive RSS support
Hi Marcin, On Sat, Nov 28 2015, Marcin Wojtas wrote: > Hi Gregory, > >> + >> + /* update unicast mapping */ >> + mvneta_set_rx_mode(pp->dev); > > I know it may be an ultimate level of nitpicking, but can you start a > comment with capital letter?:) If I get other review comments, then I can fix it in the next version. But if you look at the other comments, not all of them start with a capital letter. Thanks, Gregory > > Best regards, > Marcin -- Gregory Clement, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com
Re: [iproute PATCH RFC] libnetlink: introduce DECLARE_NLREQ
On Mon, 30 Nov 2015 16:47:25 +0100 Phil Sutter wrote: > libmnl looks nice and simple (unlike libnl I was initially looking at by > accident). Now how to pull this off: > > I don't think mandatorily depending on libmnl will be acceptable, do > you? So I can imagine two ways to do this: Having libmnl be mandatory is fine, but please put it in net-next. Every distro has libmnl, and as long as it is documented it is not a big deal. > A) Have a libmnl version of lib/libnetlink.c which is used instead of >the old one if libmnl is present. > > B) Pull a copy of libmnl into iproute2 sources so it's always available >(as fallback) and make it replace lib/libnetlink.c. This sounds worse >than it is, using git-subtree allows to do this without imposing user >knowledge about it (like git-submodule does). Just incrementally change code to use libmnl instead of libnetlink. Start with simple stuff.
Re: [iproute PATCH RFC] libnetlink: introduce DECLARE_NLREQ
On Sun, Nov 29, 2015 at 12:07:52PM -0800, Stephen Hemminger wrote: > On Thu, 26 Nov 2015 14:26:05 +0100 > Phil Sutter wrote: > > > This macro aims to simplify most netlink users' pattern to prepare a > > request, which is to create an unnamed struct and initialize it: > > > > | struct { > > | struct nlmsghdr n; > > | struct whatever foo; > > | char buf[arbitrary number]; > > | } req; > > | > > | memset(&req, 0, sizeof(req)); > > | req.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct whatever)); > > | req.n.nlmsg_flags = NLM_F_REQUEST; > > > > Having this patch applied, the above can be replaced by a static > > initializer like so: > > > > | DECLARE_NLREQ(req, n, struct whatever foo, arbitrary number); > > > > There is an added benefit, as well: Due to explicit alignment, the > > requested tailroom is really as big as requested no matter what size > > struct whatever really is. > > > > Signed-off-by: Phil Sutter > > --- > > This patch is RFC because I want to wait for peer review and upstream > > acceptance before sending in the big refactoring patch itself. > > --- > > include/libnetlink.h | 11 +++ > > 1 file changed, 11 insertions(+) > > I am not a fan of complex macros. But netlink seems to get lots of them. > You need to add more parens round arguments (like name). > > Really longterm would rather iproute2 switched to a cleaner library like > libmnl libmnl looks nice and simple (unlike libnl I was initially looking at by accident). Now how to pull this off: I don't think mandatorily depending on libmnl will be acceptable, do you? So I can imagine two ways to do this: A) Have a libmnl version of lib/libnetlink.c which is used instead of the old one if libmnl is present. B) Pull a copy of libmnl into iproute2 sources so it's always available (as fallback) and make it replace lib/libnetlink.c. This sounds worse than it is, using git-subtree allows to do this without imposing user knowledge about it (like git-submodule does). What do you think? 
Cheers, Phil
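For readers following the DECLARE_NLREQ discussion, the declare-and-initialize pattern can be sketched as below. This is not the macro from the patch, whose definition is not quoted in this thread: `NLREQ_SKETCH`, its parameter order and the helper functions are hypothetical, and the aligned tailroom is only one guess at what "explicit alignment" could look like. It assumes Linux UAPI headers and a GCC or Clang compiler (for `__attribute__((aligned))`).

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical stand-in for the proposed macro: declares a request
 * struct and initializes it statically, so no memset() is needed.
 * Members not named in the initializer are zero-initialized. */
#define NLREQ_SKETCH(name, hdr, payload_type, payload, tailroom)	\
	struct {							\
		struct nlmsghdr hdr;					\
		payload_type payload;					\
		char buf[tailroom]					\
			__attribute__((aligned(NLMSG_ALIGNTO)));	\
	} name = {							\
		.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(payload_type)),	\
		.hdr.nlmsg_flags = NLM_F_REQUEST,			\
	}

/* Returns the message length stored in a freshly declared request. */
static unsigned int sketch_request_len(void)
{
	NLREQ_SKETCH(req, n, struct ifinfomsg, ifm, 256);

	return req.n.nlmsg_len;
}

/* Checks that flags are set and everything else starts out zeroed. */
static int sketch_request_ok(void)
{
	NLREQ_SKETCH(req, n, struct ifinfomsg, ifm, 256);

	return req.n.nlmsg_flags == NLM_F_REQUEST &&
	       req.ifm.ifi_index == 0 &&
	       req.buf[0] == 0 &&
	       sizeof(req.buf) == 256;
}
```

Stephen's remark about parenthesizing arguments applies to a macro like this as well: `tailroom` in particular would be safer written as `(tailroom)` if callers pass expressions.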
Re: [PATCH 06/13] net: mvneta: enable mixed egress processing using HR timer
Hi Simon, 2015-11-26 17:45 GMT+01:00 Simon Guinot: > Hi Marcin, > > On Sun, Nov 22, 2015 at 08:53:52AM +0100, Marcin Wojtas wrote: >> Mixed approach allows using higher interrupt threshold (increased back to >> 15 packets), useful in high throughput. In case of small amount of data >> or very short TX queues HR timer ensures releasing buffers with small >> latency. >> >> Along with existing tx_done processing by coalescing interrupts this >> commit enables triggering HR timer each time the packets are sent. >> Time threshold can also be configured, using ethtool. >> >> Signed-off-by: Marcin Wojtas >> Signed-off-by: Simon Guinot >> --- >> drivers/net/ethernet/marvell/mvneta.c | 89 >> +-- >> 1 file changed, 85 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/net/ethernet/marvell/mvneta.c >> b/drivers/net/ethernet/marvell/mvneta.c >> index 9c9e858..f5acaf6 100644 >> --- a/drivers/net/ethernet/marvell/mvneta.c >> +++ b/drivers/net/ethernet/marvell/mvneta.c >> @@ -21,6 +21,8 @@ >> #include >> #include >> #include >> +#include >> +#include > > ktime.h is already included by hrtimer.h. > >> #include >> #include >> #include >> @@ -226,7 +228,8 @@ >> /* Various constants */ >> >> /* Coalescing */ >> -#define MVNETA_TXDONE_COAL_PKTS 1 >> +#define MVNETA_TXDONE_COAL_PKTS 15 >> +#define MVNETA_TXDONE_COAL_USEC 100 > > Maybe we should keep the default configuration and let the user choose > to enable (or not) this feature ? I think that this feature should be enabled by default, same as in RX (which is enabled by HW in ingress). It satisfies all kinds of traffic or queues sizes. I'd prefer a situation that if someone really wants to disable it (even if I don't know the possible justification), then let him use ethtool for this purpose. 
> >> #define MVNETA_RX_COAL_PKTS 32 >> #define MVNETA_RX_COAL_USEC 100 >> >> @@ -356,6 +359,11 @@ struct mvneta_port { >> struct net_device *dev; >> struct notifier_block cpu_notifier; >> >> + /* Egress finalization */ >> + struct tasklet_struct tx_done_tasklet; >> + struct hrtimer tx_done_timer; >> + bool timer_scheduled; > > I think we could use hrtimer_is_queued() instead of introducing a new > variable. > Good point, I'll try that. Best regards, Marcin
Re: [PATCH v2] vhost: replace % with & on data path
From: "Michael S. Tsirkin" Date: Mon, 30 Nov 2015 11:15:23 +0200 > We know vring num is a power of 2, so use & > to mask the high bits. > > Signed-off-by: Michael S. Tsirkin Acked-by: David S. Miller
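The identity behind this patch is that, for a power-of-two ring size, `idx % num` equals `idx & (num - 1)`. A minimal userspace sketch follows; the function names are illustrative and are not taken from the vhost code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n is a nonzero power of two (exactly one bit set). */
static bool is_power_of_two(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Wrap a free-running index into a ring of 'num' entries.
 * Only valid when num is a power of two: num - 1 is then a mask
 * of the low bits, so the AND discards the high bits exactly as
 * the modulo would. */
static uint32_t ring_index(uint32_t idx, uint32_t num)
{
	return idx & (num - 1);
}
```

The mask form avoids a division on the data path, which is the motivation given in the patch subject; it silently gives wrong results if the ring size is not a power of two, so that precondition has to hold.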
Re: Kernel 4.1.12 crash
On Mon, Nov 30, 2015 at 12:05:13AM +0200, Andrew wrote: > 26.11.2015 18:44, Guillaume Nault wrote: > >On Wed, Nov 25, 2015 at 04:58:54PM +0200, Andrew wrote: > >>25.11.2015 16:10, Guillaume Nault wrote: > >>>On Wed, Nov 25, 2015 at 12:59:52AM +0200, Andrew wrote: > Hi. > > I tried to reproduce errors in virtual environment (some VMs on my > notebook). > > I've tried to create 1000 client PPPoE sessions from this box via script: > for i in `seq 1 1000`; do pppd plugin rp-pppoe.so user test password test > nodefaultroute maxfail 0 persist nodefaultroute holdoff 1 noauth eth0; > done > > >>>I've tried to reproduce the bug with your script, but couldn't get > >>>anything to crash (VM is Debian Jessie i386 running on KVM with upstream > >>>kernel 4.1.12). Does the crash happen before all sessions get > >>>established? > >>Yes, crash happens even before all daemon instances are started. Sessions > >>don't get established because BRAS configured to reject sessions (so a lot > >>of concurrent connection retries happens) - I still didn't created account > >>for test user on it. > >> > >Ok, I got the crash too. In fact I had misunderstood your previous > >message, crash happens when PPP sessions don't get established > >(authentication failures in my case). > > > >I'll investigate on that and let you know. > > It seems like bug appears on mass ppp devices removing (I planned to use > this test environment to reproduce BRAS periodical crashes, but suddenly > I've got crashes on test client). > > I've checked it with some kernels - it's present in 4.3.0, but it isn't > present in 3.10.57. I'll try to build 3.14/3.18 kernels to look how they > will work in this case. Yes, it most likely was introduced by 287f3a943fef ("pppoe: Use workqueue to die properly when a PADT is received"). I still have to figure out why. 
Re: [PATCH 13/13] mm: memcontrol: hook up vmpressure to socket pressure
On Mon, Nov 30, 2015 at 02:36:28PM +0300, Vladimir Davydov wrote: > Suppose we have the following cgroup configuration. > > A __ B > \_ C > > A is empty (which is natural for the unified hierarchy AFAIU). B has > some workload running in it, and C generates socket pressure. Due to the > socket pressure coming from C we start reclaim in A, which results in > thrashing of B, but we might not put sockets under pressure in A or C, > because vmpressure does not account pages scanned/reclaimed in B when > generating a vmpressure event for A or C. This might result in > aggressive reclaim and thrashing in B w/o generating a signal for C to > stop growing socket buffers. > > Do you think such a situation is possible? If so, would it make sense to > switch to post-order walk in shrink_zone and pass sub-tree > scanned/reclaimed stats to vmpressure for each scanned memcg? In that case the LRU pages in C would experience pressure as well, which would then rein in the sockets in C. There must be some LRU pages in there, otherwise who is creating socket pressure? The same applies to shrinkers. All secondary reclaim is driven by LRU reclaim results. I can see that there is some unfairness in distributing memcg reclaim pressure purely based on LRU size, because there are scenarios where the auxiliary objects (incl. sockets, but mostly shrinker pools) amount to a significant portion of the group's memory footprint. But substitute group for NUMA node and we've had this behavior for years. I'm not sure it's actually a problem in practice.
Re: [PATCH net] ipv4: igmp: Allow removing groups from a removed interface
From: Andrew Lunn Date: Wed, 25 Nov 2015 21:15:36 +0100 > @@ -2126,7 +2126,7 @@ int ip_mc_leave_group(struct sock *sk, struct ip_mreqn > *imr) > ASSERT_RTNL(); > > in_dev = ip_mc_find_dev(net, imr); > - if (!in_dev) { > + if (!imr->imr_ifindex && !imr->imr_address.s_addr && !in_dev) { > ret = -ENODEV; > goto out; > } Now, ip_mc_dec_group() below can take a NULL pointer dereference. One example is if imr_ifindex is specified and the lookup returns NULL in ip_mc_find_dev(). This is so ridiculously complicated, just looking at this code breaks something.
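The objection can be checked by evaluating the patched condition in isolation. The sketch below is a userspace model only: `mreqn_sketch` and both helpers are hypothetical stand-ins, not kernel code, and the address is flattened to a plain integer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two ip_mreqn fields the check reads. */
struct mreqn_sketch {
	uint32_t imr_address; /* models imr_address.s_addr */
	int imr_ifindex;
};

/* Mirrors the patched condition: bail out with -ENODEV only when
 * there is no ifindex, no address and no device. */
static bool bails_out(const struct mreqn_sketch *imr, const void *in_dev)
{
	return !imr->imr_ifindex && !imr->imr_address && !in_dev;
}

/* True when a NULL in_dev gets past the check, i.e. later code
 * would run with a NULL device pointer. */
static bool null_dev_slips_through(int ifindex, uint32_t address)
{
	struct mreqn_sketch imr = {
		.imr_address = address,
		.imr_ifindex = ifindex,
	};

	return !bails_out(&imr, NULL);
}
```

As soon as either the ifindex or the address is nonzero, a failed device lookup no longer triggers the early exit, which is the case David describes.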
Re: [PATCH net-next] tcp: suppress too verbose messages in tcp_send_ack()
From: Eric Dumazet Date: Wed, 25 Nov 2015 13:50:50 -0800 > diff --git a/include/net/sock.h b/include/net/sock.h > index 7f89e4ba18d1..ead514332ae8 100644 > --- a/include/net/sock.h > +++ b/include/net/sock.h > @@ -776,7 +776,7 @@ static inline int sk_memalloc_socks(void) > > static inline gfp_t sk_gfp_atomic(const struct sock *sk, gfp_t gfp_mask) > { > - return GFP_ATOMIC | (sk->sk_allocation & __GFP_MEMALLOC); > + return gfp_mask | (sk->sk_allocation & __GFP_MEMALLOC); > } > > static inline void sk_acceptq_removed(struct sock *sk) Eric, please rename this to "sk_gfp_mask()" or "sk_gfp_flags()" or something like that since it doesn't unconditionally use GFP_ATOMIC any more. Otherwise I'm 100% fine with this change. Thank you.
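To see why the old name no longer fits, here is a userspace model of the helper after the change. The flag values are stand-ins invented for this sketch and are NOT the kernel's GFP bit values; the type and function names are likewise hypothetical.

```c
#include <stdint.h>

typedef uint32_t gfp_sketch_t;

/* Stand-in bit values chosen for this sketch only; NOT the kernel's. */
#define SK_GFP_ATOMIC   0x01u
#define SK_GFP_KERNEL   0x02u
#define SK_GFP_NOWARN   0x10u
#define SK_GFP_MEMALLOC 0x20u

struct sock_sketch {
	gfp_sketch_t sk_allocation;
};

/* Models the helper after the patch: the caller's mask is passed
 * through, and only the MEMALLOC bit is inherited from the socket. */
static gfp_sketch_t sk_gfp_mask_sketch(const struct sock_sketch *sk,
				       gfp_sketch_t gfp_mask)
{
	return gfp_mask | (sk->sk_allocation & SK_GFP_MEMALLOC);
}

/* Convenience wrapper so the behavior can be checked with plain values. */
static gfp_sketch_t demo_mask(gfp_sketch_t sk_allocation,
			      gfp_sketch_t gfp_mask)
{
	struct sock_sketch sk = { .sk_allocation = sk_allocation };

	return sk_gfp_mask_sketch(&sk, gfp_mask);
}
```

Because the result now depends entirely on what the caller passes in, a caller can add a no-warn bit or use a non-atomic mask, so a name without "atomic" in it describes the helper more accurately.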
Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for SRIOV NIC
On Sun, Nov 29, 2015 at 10:53 PM, Lan, Tianyu wrote: > On 11/26/2015 11:56 AM, Alexander Duyck wrote: >> >> > I am not saying you cannot modify the drivers, however what you are >> doing is far too invasive. Do you seriously plan on modifying all of >> the PCI device drivers out there in order to allow any device that >> might be direct assigned to a port to support migration? I certainly >> hope not. That is why I have said that this solution will not scale. > > > Current drivers are not migration friendly. If the driver wants to > support migration, it's necessary to be changed. Modifying all of the drivers directly will not solve the issue though. This is why I have suggested looking at possibly implementing something like dma_mark_clean() which is used for ia64 architectures to mark pages that were DMAed in as clean. In your case though you would want to mark such pages as dirty so that the page migration will notice them and move them over. > RFC PATCH V1 presented our ideas about how to deal with MMIO, ring and > DMA tracking during migration. These are common for most drivers and > they maybe problematic in the previous version but can be corrected later. They can only be corrected if the underlying assumptions are correct and they aren't. Your solution would have never worked correctly. The problem is you assume you can keep the device running when you are migrating and you simply cannot. At some point you will always have to stop the device in order to complete the migration, and you cannot stop it before you have stopped your page tracking mechanism. So unless the platform has an IOMMU that is somehow taking part in the dirty page tracking you will not be able to stop the guest and then the device, it will have to be the device and then the guest. > Doing suspend and resume() may help to do migration easily but some > devices requires low service down time. 
Especially network and I got > that some cloud company promised less than 500ms network service downtime. Honestly focusing on the downtime is getting the cart ahead of the horse. First you need to be able to do this without corrupting system memory and regardless of the state of the device. You haven't even gotten to that state yet. Last I knew the device had to be up in order for your migration to even work. Many devices are very state driven. As such you cannot just freeze them and restore them like you would regular device memory. That is where something like suspend/resume comes in because it already takes care of getting the device ready for halt, and then resume. Keep in mind that those functions were meant to function on a device doing something like a suspend to RAM or disk. This is not too far off from what a migration is doing since you need to halt the guest before you move it. As such the first step is to make it so that we can do the current bonding approach with one change. Specifically we want to leave the device in the guest until the last portion of the migration instead of having to remove it first. To that end I would suggest focusing on solving the DMA problem via something like a dma_mark_clean() type solution as that would be one issue resolved and we all would see an immediate gain instead of just those users of the ixgbevf driver. > So I think performance effect also should be taken into account when we > design the framework. What you are proposing I would call premature optimization. You need to actually solve the problem before you can start optimizing things and I don't see anything actually solved yet since your solution is too unstable. >> >> What I am counter proposing seems like a very simple proposition. It >> can be implemented in two steps. >> >> 1. Look at modifying dma_mark_clean(). It is a function called in >> the sync and unmap paths of the lib/swiotlb.c. 
If you could somehow >> modify it to take care of marking the pages you unmap for Rx as being >> dirty it will get you a good way towards your goal as it will allow >> you to continue to do DMA while you are migrating the VM. >> >> 2. Look at making use of the existing PCI suspend/resume calls that >> are there to support PCI power management. They have everything >> needed to allow you to pause and resume DMA for the device before and >> after the migration while retaining the driver state. If you can >> implement something that allows you to trigger these calls from the >> PCI subsystem such as hot-plug then you would have a generic solution >> that can be easily reproduced for multiple drivers beyond those >> supported by ixgbevf. > > > Glanced at PCI hotplug code. The hotplug events are triggered by PCI hotplug > controller and these event are defined in the controller spec. > It's hard to extend more events. Otherwise, we also need to add some > specific codes in the PCI hotplug core since it's only add and remove > PCI device when it gets events. It's also a challenge to modify Windows >
[v8, 0/6] Freescale DPAA FMan
From: Igal Liberman The Freescale Data Path Acceleration Architecture (DPAA) is a set of hardware components on specific QorIQ multicore processors. This architecture provides the infrastructure to support simplified sharing of networking interfaces and accelerators by multiple CPU cores and the accelerators. One of the DPAA accelerators is the Frame Manager (FMan) which contains a series of hardware blocks: ports, Ethernet MACs, a multi user RAM (MURAM) and Storage Profile (SP). This patch set introduces the FMan drivers. Each driver configures and initializes the corresponding FMan hardware module (described above). The MAC driver offers support for three different types of MACs (eTSEC, TGEC, MEMAC). v7 --> v8: - Addressed feedback from David Miller - Support for ARM: - Device tree parsing - IO Accessors - Addressed compilation issue on non-PPC targets v6 --> v7: - Addressed compilation issue on non-PPC targets - Removed B4860 rev 1 support v5 --> v6: - Addressed feedback from Scott: - Moved kernel doc to source files - Removed a series of configurable settings - Miscellaneous code updates v4 --> v5: - Addressed feedback from David Miller: - Removed driver layering - Reduce namespace pollution - Reduce code complexity and size v3 --> v4: - Remove device_initcall call in driver registration (redundant) - Remove hot/cold labels - Minor update in FMan Clock read from device-tree - Update fixed-link support - Addressed feedback from Stephen Hemminger - Remove bogus blank line v2 --> v3: - Addressed feedback from Scott: - Remove typedefs - Remove unnecessary memory barriers - Remove unnecessary casting - Remove KConfig options - Remove early_params - Remove Hungarian notation - Remove __packed__ attribute and padding from structures - Remove unlikely attribute (where it's not needed) - Use proper error codes and remove unnecessary prints - Use proper values for sleep routines - Replace complex Macros with functions - Improve device tree processing code - Use symbolic 
defines - Add time-out in busy-wait loops - Removed exit code (loadable module support will be added later) - Fixed "fixed-link" issue raised by Joakim Tjernlund v1 --> v2: - Addressed feedback from Paul Bolle: - General feedback of FMan Driver layer - Remove Errata defines - Aligned comments to Kernel Doc - Remove Loadable Module support (not yet supported) - Removed not needed KConfig dependencies - Addressed feedback from Scott Wood - Use Kernel ioread/iowrite services - Squash FLIB source and header patches together This submission is based on the prior Freescale DPAA FMan V3,RFC submission. Several issues are addressed in this submission: - Reduced MAC layering and complexity - Reduced code base - T1024/T2080 10G best effort support Igal Liberman (6): fsl/fman: Add FMan MURAM support fsl/fman: Add FMan support fsl/fman: Add FMan MAC support fsl/fman: Add FMan SP support fsl/fman: Add FMan Port Support fsl/fman: Add FMan MAC driver drivers/net/ethernet/freescale/Kconfig |1 + drivers/net/ethernet/freescale/Makefile|2 + drivers/net/ethernet/freescale/fman/Kconfig|8 + drivers/net/ethernet/freescale/fman/Makefile |7 + .../net/ethernet/freescale/fman/crc_mac_addr_ext.h | 314 +++ drivers/net/ethernet/freescale/fman/fman.c | 2872 drivers/net/ethernet/freescale/fman/fman.h | 325 +++ drivers/net/ethernet/freescale/fman/fman_dtsec.c | 1608 +++ drivers/net/ethernet/freescale/fman/fman_dtsec.h | 59 + drivers/net/ethernet/freescale/fman/fman_mac.h | 276 ++ drivers/net/ethernet/freescale/fman/fman_memac.c | 1306 + drivers/net/ethernet/freescale/fman/fman_memac.h | 60 + drivers/net/ethernet/freescale/fman/fman_muram.c | 159 ++ drivers/net/ethernet/freescale/fman/fman_muram.h | 51 + drivers/net/ethernet/freescale/fman/fman_port.c| 1779 drivers/net/ethernet/freescale/fman/fman_port.h| 151 + drivers/net/ethernet/freescale/fman/fman_sp.c | 167 ++ drivers/net/ethernet/freescale/fman/fman_sp.h | 103 + drivers/net/ethernet/freescale/fman/fman_tgec.c| 798 ++ 
drivers/net/ethernet/freescale/fman/fman_tgec.h| 55 + drivers/net/ethernet/freescale/fman/mac.c | 988
[v8, 1/6] fsl/fman: Add FMan MURAM support
From: Igal Liberman Add Frame Manager Multi-User RAM support. This internal FMan memory block is used by the FMan hardware modules, the management being made through the generic allocator. The FMan Internal memory, for example, is used for allocating transmit and receive FIFOs. Signed-off-by: Igal Liberman --- drivers/net/ethernet/freescale/Kconfig |1 + drivers/net/ethernet/freescale/Makefile |2 + drivers/net/ethernet/freescale/fman/Kconfig |8 ++ drivers/net/ethernet/freescale/fman/Makefile |5 + drivers/net/ethernet/freescale/fman/fman_muram.c | 159 ++ drivers/net/ethernet/freescale/fman/fman_muram.h | 51 +++ 6 files changed, 226 insertions(+) create mode 100644 drivers/net/ethernet/freescale/fman/Kconfig create mode 100644 drivers/net/ethernet/freescale/fman/Makefile create mode 100644 drivers/net/ethernet/freescale/fman/fman_muram.c create mode 100644 drivers/net/ethernet/freescale/fman/fman_muram.h diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig index ff76d4e..f3f89cc 100644 --- a/drivers/net/ethernet/freescale/Kconfig +++ b/drivers/net/ethernet/freescale/Kconfig @@ -53,6 +53,7 @@ config FEC_MPC52xx_MDIO If compiled as module, it will be called fec_mpc52xx_phy. 
source "drivers/net/ethernet/freescale/fs_enet/Kconfig" +source "drivers/net/ethernet/freescale/fman/Kconfig" config FSL_PQ_MDIO tristate "Freescale PQ MDIO" diff --git a/drivers/net/ethernet/freescale/Makefile b/drivers/net/ethernet/freescale/Makefile index 71debd1..4097c58 100644 --- a/drivers/net/ethernet/freescale/Makefile +++ b/drivers/net/ethernet/freescale/Makefile @@ -17,3 +17,5 @@ gianfar_driver-objs := gianfar.o \ gianfar_ethtool.o obj-$(CONFIG_UCC_GETH) += ucc_geth_driver.o ucc_geth_driver-objs := ucc_geth.o ucc_geth_ethtool.o + +obj-$(CONFIG_FSL_FMAN) += fman/ diff --git a/drivers/net/ethernet/freescale/fman/Kconfig b/drivers/net/ethernet/freescale/fman/Kconfig new file mode 100644 index 000..66b7296 --- /dev/null +++ b/drivers/net/ethernet/freescale/fman/Kconfig @@ -0,0 +1,8 @@ +config FSL_FMAN + bool "FMan support" + depends on FSL_SOC || COMPILE_TEST + select GENERIC_ALLOCATOR + default n + help + Freescale Data-Path Acceleration Architecture Frame Manager + (FMan) support diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile new file mode 100644 index 000..fc2e194 --- /dev/null +++ b/drivers/net/ethernet/freescale/fman/Makefile @@ -0,0 +1,5 @@ +subdir-ccflags-y += -I$(srctree)/drivers/net/ethernet/freescale/fman + +obj-y += fsl_fman.o + +fsl_fman-objs := fman_muram.o diff --git a/drivers/net/ethernet/freescale/fman/fman_muram.c b/drivers/net/ethernet/freescale/fman/fman_muram.c new file mode 100644 index 000..35d4a50 --- /dev/null +++ b/drivers/net/ethernet/freescale/fman/fman_muram.c @@ -0,0 +1,159 @@ +/* + * Copyright 2008-2015 Freescale Semiconductor Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "fman_muram.h" + +#include +#include +#include
Re: [PATCH 12/13] mm: memcontrol: account socket memory in unified hierarchy memory controller
On Mon, Nov 30, 2015 at 01:54:21PM +0300, Vladimir Davydov wrote:
> On Tue, Nov 24, 2015 at 04:58:44PM -0500, Johannes Weiner wrote:
> ...
> > @@ -5520,15 +5557,30 @@ void sock_release_memcg(struct sock *sk)
> >   */
> >  bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  {
> > -	struct page_counter *counter;
> > +	gfp_t gfp_mask = GFP_KERNEL;
> >
> > -	if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > -				    nr_pages, &counter)) {
> > -		memcg->tcp_mem.memory_pressure = 0;
> > -		return true;
> > +#ifdef CONFIG_MEMCG_KMEM
> > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > +		struct page_counter *counter;
> > +
> > +		if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > +					    nr_pages, &counter)) {
> > +			memcg->tcp_mem.memory_pressure = 0;
> > +			return true;
> > +		}
> > +		page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > +		memcg->tcp_mem.memory_pressure = 1;
> > +		return false;
> >  	}
> > -	page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > -	memcg->tcp_mem.memory_pressure = 1;
> > +#endif
> > +	/* Don't block in the packet receive path */
> > +	if (in_softirq())
> > +		gfp_mask = GFP_NOWAIT;
> > +
> > +	if (try_charge(memcg, gfp_mask, nr_pages) == 0)
> > +		return true;
> > +
> > +	try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
>
> We won't trigger high reclaim if we get here, because try_charge does
> not check high threshold if failing or forcing charge. I think this
> should be fixed regardless of this patch. The fix is attached below.

We kind of assume that max is either set above high, or not at all.
That means when max is hit the high limit has already failed and it's
of limited use to schedule background reclaim.

> Also, I don't like calling try_charge twice: the second time will go
> through all the try_charge steps for nothing. What about checking
> page_counter value after calling try_charge instead:
>
> 	try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
> 	return page_counter_read(&memcg->memory) <= memcg->memory.limit;
>
> or adding an out parameter to try_charge that would inform us if charge
> was forced?

That's a complete cold path where we are going to drop the packet in
all but a few cases. It's not worth the trouble.

> > @@ -5539,10 +5591,32 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> >   */
> >  void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  {
> > -	page_counter_uncharge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > +#ifdef CONFIG_MEMCG_KMEM
> > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > +		page_counter_uncharge(&memcg->tcp_mem.memory_allocated,
> > +				      nr_pages);
> > +		return;
> > +	}
> > +#endif
> > +	page_counter_uncharge(&memcg->memory, nr_pages);
> > +	css_put_many(&memcg->css, nr_pages);
>
> cancel_charge(memcg, nr_pages);

It does the same, but it's a weird name for regular uncharging.

> From: Vladimir Davydov
> Subject: [PATCH] memcg: check high threshold if forcing allocation
>
> try_charge() does not result in checking high threshold if it forces
> charge. This is incorrect, because we could have failed to reclaim
> memory due to the current context, so we do need to check high threshold
> and try to compensate for the excess once we are in the safe context.
>
> Signed-off-by: Vladimir Davydov
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 79a29d564bff..e922965b572b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2112,13 +2112,14 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  		page_counter_charge(&memcg->memsw, nr_pages);
>  	css_get_many(&memcg->css, nr_pages);
>
> -	return 0;
> +	goto check_high;
>
>  done_restock:
>  	css_get_many(&memcg->css, batch);
>  	if (batch > nr_pages)
>  		refill_stock(memcg, batch - nr_pages);
>
> +check_high:
>  	/*
>  	 * If the hierarchy is above the normal consumption range, schedule
>  	 * reclaim on returning to userland. We can perform reclaim here

One problem is that OOM victims force their charges so they can exit
quickly. It'd be contradictory to then task them with high reclaim.
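For readers following the legacy (non-default-hierarchy) branch quoted above, here is a minimal user-space sketch of its control flow: try to charge within the limit, and if that fails, force the charge anyway and flag memory pressure. Everything prefixed toy_ is a hypothetical stand-in written for this illustration; it is not the kernel's page_counter or memcg API, and it ignores hierarchy, locking and reclaim.

```c
#include <stdbool.h>

/* Toy stand-in for a page counter: usage may exceed limit after a forced charge. */
struct toy_counter {
	unsigned long usage;	/* pages currently charged */
	unsigned long limit;	/* pages allowed before charges start failing */
};

/* Toy stand-in for the per-memcg TCP accounting state. */
struct toy_memcg {
	struct toy_counter tcp_mem;
	int memory_pressure;
};

/* Charge only if the counter stays within its limit. */
static bool toy_try_charge(struct toy_counter *c, unsigned long nr_pages)
{
	if (c->usage > c->limit || nr_pages > c->limit - c->usage)
		return false;
	c->usage += nr_pages;
	return true;
}

/* Charge unconditionally, even past the limit. */
static void toy_force_charge(struct toy_counter *c, unsigned long nr_pages)
{
	c->usage += nr_pages;
}

/*
 * Mirrors the shape of the quoted legacy path: on failure the pages are
 * still accounted, and the false return only tells the caller that the
 * group is now under pressure.
 */
static bool toy_charge_skmem(struct toy_memcg *memcg, unsigned long nr_pages)
{
	if (toy_try_charge(&memcg->tcp_mem, nr_pages)) {
		memcg->memory_pressure = 0;
		return true;
	}
	toy_force_charge(&memcg->tcp_mem, nr_pages);
	memcg->memory_pressure = 1;
	return false;
}
```

The point the sketch tries to make visible is that a false return does not mean "nothing was charged", so the caller still has to uncharge the pages when it drops the packet.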
[v8, 6/6] fsl/fman: Add FMan MAC driver
From: Igal Liberman

This patch adds the Ethernet MAC driver supporting the three
different types of MACs: dTSEC, tGEC and mEMAC.

Signed-off-by: Igal Liberman
---
 drivers/net/ethernet/freescale/fman/Makefile |   3 +-
 drivers/net/ethernet/freescale/fman/mac.c    | 988 ++
 drivers/net/ethernet/freescale/fman/mac.h    |  97 +++
 3 files changed, 1087 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/mac.c
 create mode 100644 drivers/net/ethernet/freescale/fman/mac.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index 2eb0b9b..51fd2e6 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -1,6 +1,7 @@
 subdir-ccflags-y += -I$(srctree)/drivers/net/ethernet/freescale/fman

-obj-y += fsl_fman.o fsl_fman_mac.o
+obj-y += fsl_fman.o fsl_fman_mac.o fsl_mac.o

 fsl_fman-objs := fman_muram.o fman.o fman_sp.o fman_port.o
 fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o
+fsl_mac-objs += mac.o
diff --git a/drivers/net/ethernet/freescale/fman/mac.c b/drivers/net/ethernet/freescale/fman/mac.c
new file mode 100644
index 0000000..174ecea
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/mac.c
@@ -0,0 +1,988 @@
+/* Copyright 2008-2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in the
+ *       documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "mac.h" +#include "fman_mac.h" +#include "fman_dtsec.h" +#include "fman_tgec.h" +#include "fman_memac.h" + +#define MAC_DESCRIPTION "FSL FMan MAC API based driver" + +MODULE_LICENSE("Dual BSD/GPL"); + +MODULE_AUTHOR("Emil Medve "); + +MODULE_DESCRIPTION(MAC_DESCRIPTION); + +struct mac_priv_s { + struct device *dev; + void __iomem*vaddr; + u8 cell_index; + phy_interface_t phy_if; + struct fman *fman; + struct device_node *phy_node; + /* List of multicast addresses */ + struct list_headmc_addr_list; + struct platform_device *eth_dev; + struct fixed_phy_status *fixed_link; + u16 speed; + u16 max_speed; + + int (*enable)(struct fman_mac *mac_dev, enum comm_mode mode); + int (*disable)(struct fman_mac *mac_dev, enum comm_mode mode); +}; + +struct mac_address { + u8 addr[ETH_ALEN]; + struct list_head list; +}; + +static void mac_exception(void *_mac_dev, enum fman_mac_exceptions ex) +{ + struct mac_device *mac_dev; + struct mac_priv_s *priv; + + mac_dev = (struct mac_device *)_mac_dev; + priv = mac_dev->priv; + + if (ex == FM_MAC_EX_10G_RX_FIFO_OVFL) { + /* don't flag RX FIFO after the first */ + mac_dev->set_exception(mac_dev->fman_mac, +
[v8, 4/6] fsl/fman: Add FMan SP support
From: Igal Liberman

The Storage Profiles contain parameters that are used by the FMan
for frame reception and transmission.

Signed-off-by: Igal Liberman
---
 drivers/net/ethernet/freescale/fman/Makefile  |   2 +-
 drivers/net/ethernet/freescale/fman/fman_sp.c | 167 +
 drivers/net/ethernet/freescale/fman/fman_sp.h | 103 +++
 3 files changed, 271 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_sp.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_sp.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index 43360d70..5141532 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -2,5 +2,5 @@ subdir-ccflags-y += -I$(srctree)/drivers/net/ethernet/freescale/fman

 obj-y += fsl_fman.o fsl_fman_mac.o

-fsl_fman-objs := fman_muram.o fman.o
+fsl_fman-objs := fman_muram.o fman.o fman_sp.o
 fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o
diff --git a/drivers/net/ethernet/freescale/fman/fman_sp.c b/drivers/net/ethernet/freescale/fman/fman_sp.c
new file mode 100644
index 0000000..f36c622
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/fman_sp.c
@@ -0,0 +1,167 @@
+/*
+ * Copyright 2008 - 2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in the
+ *       documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "fman_sp.h" +#include "fman.h" + +void fman_sp_set_buf_pools_in_asc_order_of_buf_sizes(struct fman_ext_pools +*fm_ext_pools, +u8 *ordered_array, +u16 *sizes_array) +{ + u16 buf_size = 0; + int i = 0, j = 0, k = 0; + + /* First we copy the external buffers pools information +* to an ordered local array +*/ + for (i = 0; i < fm_ext_pools->num_of_pools_used; i++) { + /* get pool size */ + buf_size = fm_ext_pools->ext_buf_pool[i].size; + + /* keep sizes in an array according to poolId +* for direct access +*/ + sizes_array[fm_ext_pools->ext_buf_pool[i].id] = buf_size; + + /* save poolId in an ordered array according to size */ + for (j = 0; j <= i; j++) { + /* this is the next free place in the array */ + if (j == i) + ordered_array[i] = + fm_ext_pools->ext_buf_pool[i].id; + else { + /* find the right place for this poolId */ + if (buf_size < sizes_array[ordered_array[j]]) { + /* move the pool_ids one place ahead +* to make room for this poolId +*/ + for (k = i; k > j; k--) + ordered_array[k] = +
Re: [PATCH net-next v4 2/2] net: add driver for Netronome NFP4000/NFP6000 NIC VFs
From: Jakub Kicinski
Date: Wed, 25 Nov 2015 15:39:04 +0000

> +config NFP_NET_DEBUG
> +	bool "Debug support for Netronome(R) NFP3200/NFP6000 NIC drivers"
> +	depends on NFP_NET || NFP_NETVF
> +	---help---
> +	  Enable extra sanity checks and debugfs support in
> +	  Netronome(R) NFP3200/NFP6000 NIC PF and VF drivers.
> +	  Note: selecting this option may adversely impact
> +	  performance.
 ...
> +#ifdef CONFIG_NFP_NET_DEBUG
> +#define nn_assert(cond, fmt, args...)                                   \
> +	do {                                                            \
> +		if (unlikely(!(cond))) {                                \
> +			pr_err("assertion %s failed\n", #cond);         \
> +			pr_err(fmt, ## args);                           \
> +			BUG();                                          \
> +		}                                                       \
> +	} while (0)
> +#else
> +#define nn_assert(cond, fmt, args...)	do { } while (0)
> +#endif

This is really not appropriate.

Use WARN_ON() et al. as appropriate to assert things, and in particular
_AVOID_ BUG() in pretty much all cases and attempt to continue running
somehow with error handling paths etc.

Use of BUG() is discouraged in all except the most extreme cases where
the kernel cannot continue to execute at all.

Thanks.
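As an illustration of the pattern being asked for, the following user-space sketch warns and returns an error instead of halting. TOY_WARN_ON() and toy_ring_set_size() are hypothetical names invented for this example; the only property borrowed from the kernel's WARN_ON() is that it evaluates to the truth value of its condition, so it can sit directly inside an if. This is a sketch under those assumptions, not the nfp driver's code.

```c
#include <errno.h>
#include <stdio.h>

/* Counts how many warnings fired; only here so the sketch is observable. */
static int toy_warn_count;

/* Report a failed check and hand the condition back to the caller. */
static int toy_warn_on(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		toy_warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}

/* User-space stand-in for the kernel's WARN_ON(): warn, then keep running. */
#define TOY_WARN_ON(cond) toy_warn_on(!!(cond), #cond, __FILE__, __LINE__)

/*
 * Hypothetical setter: a ring size must be a non-zero power of two.
 * A bad value triggers a warning and an error return; the old size is kept.
 */
static int toy_ring_set_size(unsigned int *ring_size, unsigned int requested)
{
	if (TOY_WARN_ON(requested == 0 || (requested & (requested - 1)) != 0))
		return -EINVAL;

	*ring_size = requested;
	return 0;
}
```

Compared with an assert macro that ends in BUG(), the failed check here leaves the state untouched and pushes the decision to the caller's error path.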
[v8, 3/6] fsl/fman: Add FMan MAC support
From: Igal Liberman

Add the Data Path Acceleration Architecture Frame Manager MAC support.
This patch adds the FMan MAC configuration, initialization and
runtime control routines.
This patch contains support for these types of MACs:
	- dTSEC: Three speed Ethernet controller (10/100/1000 Mbps)
	- tGEC: 10G Ethernet controller (10 Gbps)
	- mEMAC: Multi-rate Ethernet MAC (10/100/1000/10000 Mbps)
Different FMan revisions have different type and number of MACs.

Signed-off-by: Igal Liberman
---
 drivers/net/ethernet/freescale/fman/Makefile       |    3 +-
 .../net/ethernet/freescale/fman/crc_mac_addr_ext.h |  314
 drivers/net/ethernet/freescale/fman/fman_dtsec.c   | 1608
 drivers/net/ethernet/freescale/fman/fman_dtsec.h   |   59 +
 drivers/net/ethernet/freescale/fman/fman_mac.h     |  276
 drivers/net/ethernet/freescale/fman/fman_memac.c   | 1306
 drivers/net/ethernet/freescale/fman/fman_memac.h   |   60 +
 drivers/net/ethernet/freescale/fman/fman_tgec.c    |  798 ++
 drivers/net/ethernet/freescale/fman/fman_tgec.h    |   55 +
 9 files changed, 4478 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_dtsec.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_dtsec.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_mac.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_memac.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_memac.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_tgec.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_tgec.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index fb5a7f0..43360d70 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -1,5 +1,6 @@
 subdir-ccflags-y += -I$(srctree)/drivers/net/ethernet/freescale/fman

-obj-y += fsl_fman.o
+obj-y += fsl_fman.o fsl_fman_mac.o

 fsl_fman-objs := fman_muram.o fman.o
+fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o diff --git a/drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h b/drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h new file mode 100644 index 000..92f2e87 --- /dev/null +++ b/drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h @@ -0,0 +1,314 @@ +/* + * Copyright 2008-2015 Freescale Semiconductor Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. 
IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* Define a macro that calculate the crc value of an Ethernet MAC address + * (48 bitd address) + */ + +#ifndef __crc_mac_addr_ext_h +#define __crc_mac_addr_ext_h + +#include + +static u32 crc_table[256] = { + 0x, + 0x77073096, + 0xee0e612c, + 0x990951ba, + 0x076dc419, + 0x706af48f, + 0xe963a535, + 0x9e6495a3, + 0x0edb8832, + 0x79dcb8a4, + 0xe0d5e91e, + 0x97d2d988, + 0x09b64c2b, + 0x7eb17cbd, + 0xe7b82d07, + 0x90bf1d91, + 0x1db71064, + 0x6ab020f2, + 0xf3b97148, +
[v8, 2/6] fsl/fman: Add FMan support
From: Igal Liberman

Add the Data Path Acceleration Architecture Frame Manager Driver.
The FMan embeds a series of hardware blocks that implement a group of
Ethernet interfaces. This patch adds the FMan configuration,
initialization and runtime control routines.

The FMan driver supports several hardware versions
differentiated by things like:
	- Different type of MACs
	- Number of MAC and ports
	- Available resources
	- Different hardware errata

Signed-off-by: Igal Liberman
---
 drivers/net/ethernet/freescale/fman/Makefile |    2 +-
 drivers/net/ethernet/freescale/fman/fman.c   | 2872 ++
 drivers/net/ethernet/freescale/fman/fman.h   |  325 +++
 3 files changed, 3198 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/fman.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index fc2e194..fb5a7f0 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -2,4 +2,4 @@ subdir-ccflags-y += -I$(srctree)/drivers/net/ethernet/freescale/fman

 obj-y += fsl_fman.o

-fsl_fman-objs := fman_muram.o
+fsl_fman-objs := fman_muram.o fman.o
diff --git a/drivers/net/ethernet/freescale/fman/fman.c b/drivers/net/ethernet/freescale/fman/fman.c
new file mode 100644
index 0000000..98bae37
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -0,0 +1,2872 @@
+/*
+ * Copyright 2008-2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include "fman.h" +#include "fman_muram.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* General defines */ +#define FMAN_LIODN_TBL 64 /* size of LIODN table */ +#define MAX_NUM_OF_MACS10 +#define FM_NUM_OF_FMAN_CTRL_EVENT_REGS 4 +#define BASE_RX_PORTID 0x08 +#define BASE_TX_PORTID 0x28 + +/* Modules registers offsets */ +#define BMI_OFFSET 0x0008 +#define QMI_OFFSET 0x00080400 +#define DMA_OFFSET 0x000C2000 +#define FPM_OFFSET 0x000C3000 +#define IMEM_OFFSET0x000C4000 +#define CGP_OFFSET 0x000DB000 + +/* Exceptions bit map */ +#define EX_DMA_BUS_ERROR 0x8000 +#define EX_DMA_READ_ECC0x4000 +#define EX_DMA_SYSTEM_WRITE_ECC0x2000 +#define EX_DMA_FM_WRITE_ECC0x1000 +#define EX_FPM_STALL_ON_TASKS 0x0800 +#define EX_FPM_SINGLE_ECC 0x0400 +#define EX_FPM_DOUBLE_ECC 0x0200 +#define EX_QMI_SINGLE_ECC 0x0100 +#define EX_QMI_DEQ_FROM_UNKNOWN_PORTID 0x0080 +#define EX_QMI_DOUBLE_ECC 0x0040 +#define EX_BMI_LIST_RAM_ECC0x0020 +#define EX_BMI_STORAGE_PROFILE_ECC 0x0010 +#define EX_BMI_STATISTICS_RAM_ECC 0x0008 +#define EX_IRAM_ECC0x0004 +#define EX_MURAM_ECC
[v8, 5/6] fsl/fman: Add FMan Port Support
From: Igal Liberman

Add the Data Path Acceleration Architecture Frame Manager Port Driver.
The FMan driver uses a module called "Port" to represent the physical
TX and RX ports.
Each FMan version has a different number of physical ports.
This patch adds the FMan Port configuration, initialization and
runtime control routines for both TX and RX.

Signed-off-by: Igal Liberman
---
 drivers/net/ethernet/freescale/fman/Makefile    |    2 +-
 drivers/net/ethernet/freescale/fman/fman_port.c | 1779 +++
 drivers/net/ethernet/freescale/fman/fman_port.h |  151 ++
 3 files changed, 1931 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_port.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_port.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index 5141532..2eb0b9b 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -2,5 +2,5 @@ subdir-ccflags-y += -I$(srctree)/drivers/net/ethernet/freescale/fman

 obj-y += fsl_fman.o fsl_fman_mac.o

-fsl_fman-objs := fman_muram.o fman.o fman_sp.o
+fsl_fman-objs := fman_muram.o fman.o fman_sp.o fman_port.o
 fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o
diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c b/drivers/net/ethernet/freescale/fman/fman_port.c
new file mode 100644
index 0000000..562d524
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/fman_port.c
@@ -0,0 +1,1779 @@
+/*
+ * Copyright 2008 - 2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Freescale Semiconductor nor the + * names of its contributors may be used to endorse or promote products + * derived from this software without specific prior written permission. + * + * + * ALTERNATIVELY, this software may be distributed under the terms of the + * GNU General Public License ("GPL") as published by the Free Software + * Foundation, either version 2 of that License or (at your option) any + * later version. + * + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include "fman_port.h" +#include "fman.h" +#include "fman_sp.h" + +#include +#include +#include +#include +#include +#include +#include +#include + +/* Queue ID */ +#define DFLT_FQ_ID 0x00FF + +/* General defines */ +#define PORT_BMI_FIFO_UNITS0x100 + +#define MAX_PORT_FIFO_SIZE(bmi_max_fifo_size) \ + min((u32)bmi_max_fifo_size, (u32)1024 * FMAN_BMI_FIFO_UNITS) + +#define PORT_CG_MAP_NUM8 +#define PORT_PRS_RESULT_WORDS_NUM 8 +#define PORT_IC_OFFSET_UNITS 0x10 + +#define MIN_EXT_BUF_SIZE 64 + +#define BMI_PORT_REGS_OFFSET 0 +#define QMI_PORT_REGS_OFFSET 0x400 + +/* Default values */ +#define DFLT_PORT_BUFFER_PREFIX_CONTEXT_DATA_ALIGN \ + DFLT_FM_SP_BUFFER_PREFIX_CONTEXT_DATA_ALIGN + +#define DFLT_PORT_CUT_BYTES_FROM_END 4 + +#define DFLT_PORT_ERRORS_TO_DISCARDFM_PORT_FRM_ERR_CLS_DISCARD +#define DFLT_PORT_MAX_FRAME_LENGTH 9600 + +#define DFLT_PORT_RX_FIFO_PRI_ELEVATION_LEV(bmi_max_fifo_size) \ + MAX_PORT_FIFO_SIZE(bmi_max_fifo_size) + +#define DFLT_PORT_RX_FIFO_THRESHOLD(major, bmi_max_fifo_size) \ + (major == 6 ? \ + MAX_PORT_FIFO_SIZE(bmi_max_fifo_size) : \ + (MAX_PORT_FIFO_SIZE(bmi_max_fifo_size) * 3 / 4))\ + +#define DFLT_PORT_EXTRA_NUM_OF_FIFO_BUFS 0 + +/* QMI defines */
Re: [PATCH] vhost: replace % with & on data path
From: "Michael S. Tsirkin"
Date: Mon, 30 Nov 2015 10:34:07 +0200

> We know vring num is a power of 2, so use &
> to mask the high bits.
>
> Signed-off-by: Michael S. Tsirkin
> ---
>  drivers/vhost/vhost.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 080422f..85f0f0a 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1366,10 +1366,12 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>  	/* Only get avail ring entries after they have been exposed by guest. */
>  	smp_rmb();
>
> +	}
> +

!!!
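The identity the commit message relies on can be checked in isolation. The sketch below is plain user-space C with hypothetical helper names (toy_ring_index_mod, toy_ring_index_mask, toy_is_power_of_2); it is not the vhost code. It shows that idx % num and idx & (num - 1) agree when num is a power of two, and that they diverge otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* True for 1, 2, 4, 8, ...; zero is not a power of two. */
static bool toy_is_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Ring slot computed with a division-based remainder. */
static uint16_t toy_ring_index_mod(uint16_t idx, uint16_t num)
{
	return (uint16_t)(idx % num);
}

/*
 * Ring slot computed by masking off the high bits.
 * Only equivalent to the modulo form when num is a power of two,
 * because only then is (num - 1) a contiguous run of low one-bits.
 */
static uint16_t toy_ring_index_mask(uint16_t idx, uint16_t num)
{
	return (uint16_t)(idx & (num - 1));
}
```

For example, with num = 256 both helpers map index 257 to slot 1, while with num = 6 the modulo form gives 1 for index 7 and the mask form gives 5, which is why the substitution depends on the power-of-two guarantee stated in the commit message.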
Re: [ovs-dev] [PATCH net-next v3 1/8] netfilter: Remove IP_CT_NEW_REPLY definition.
> On Nov 25, 2015, at 21:41, Simon Horman wrote:
>
>> On Wed, Nov 25, 2015 at 04:08:14PM -0800, Jarno Rajahalme wrote:
>> Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
>> not make sense. This allows the definition of IP_CT_NUMBER to be
>> simplified as well.
>>
>> Signed-off-by: Jarno Rajahalme
>
> I hate to be the bearer of bad news but its not clear
> to me that this change doesn't break user-space.
>

There should be no change for userspace, unless __KERNEL__ is defined. Also, this is only a minor clean-up, so I have no problem dropping this patch, if need be.

  Jarno

>> ---
>> include/uapi/linux/netfilter/nf_conntrack_common.h | 12 +---
>> net/openvswitch/conntrack.c | 2 --
>> 2 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
>> index 319f471..2f067cf 100644
>> --- a/include/uapi/linux/netfilter/nf_conntrack_common.h
>> +++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
>> @@ -20,9 +20,15 @@ enum ip_conntrack_info {
>>
>> IP_CT_ESTABLISHED_REPLY = IP_CT_ESTABLISHED + IP_CT_IS_REPLY,
>> IP_CT_RELATED_REPLY = IP_CT_RELATED + IP_CT_IS_REPLY,
>> -IP_CT_NEW_REPLY = IP_CT_NEW + IP_CT_IS_REPLY,
>> -/* Number of distinct IP_CT types (no NEW in reply dirn). */
>> -IP_CT_NUMBER = IP_CT_IS_REPLY * 2 - 1
>> +/* No NEW in reply direction. */
>> +
>> +/* Number of distinct IP_CT types. */
>> +IP_CT_NUMBER
>> +
>> +/* only for userspace compatibility */
>> +#ifndef __KERNEL__
>> +IP_CT_NEW_REPLY = IP_CT_NUMBER;
>> +#endif
>> };
>>
>> #define NF_CT_STATE_INVALID_BIT (1 << 0)
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index c2cc111..a28a819 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
>> @@ -73,7 +73,6 @@ static u8 ovs_ct_get_state(enum ip_conntrack_info ctinfo)
>> switch (ctinfo) {
>> case IP_CT_ESTABLISHED_REPLY:
>> case IP_CT_RELATED_REPLY:
>> -case IP_CT_NEW_REPLY:
>> ct_state |= OVS_CS_F_REPLY_DIR;
>> break;
>> default:
>> @@ -90,7 +89,6 @@ static u8 ovs_ct_get_state(enum ip_conntrack_info ctinfo)
>> ct_state |= OVS_CS_F_RELATED;
>> break;
>> case IP_CT_NEW:
>> -case IP_CT_NEW_REPLY:
>> ct_state |= OVS_CS_F_NEW;
>> break;
>> default:
>> --
>> 2.1.4
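The claim that userspace sees no change can be checked with a small stand-alone sketch. The two enums below are hypothetical copies with renamed members (`OLD_`/`NEW_` prefixes), not the real header; the first four members are assumed to be declared in the order ESTABLISHED, RELATED, NEW, IS_REPLY, which the quoted hunk does not show. The sketch also writes commas where the quoted hunk has none (after `IP_CT_NUMBER`) or a semicolon (after `IP_CT_NEW_REPLY`), since it would not compile otherwise.

```c
#include <assert.h>

/* Layout before the patch: NEW_REPLY and NUMBER are computed explicitly. */
enum ct_info_old {
    OLD_CT_ESTABLISHED,
    OLD_CT_RELATED,
    OLD_CT_NEW,
    OLD_CT_IS_REPLY,
    OLD_CT_ESTABLISHED_REPLY = OLD_CT_ESTABLISHED + OLD_CT_IS_REPLY,
    OLD_CT_RELATED_REPLY = OLD_CT_RELATED + OLD_CT_IS_REPLY,
    OLD_CT_NEW_REPLY = OLD_CT_NEW + OLD_CT_IS_REPLY,
    OLD_CT_NUMBER = OLD_CT_IS_REPLY * 2 - 1
};

/* Layout after the patch: NUMBER follows RELATED_REPLY implicitly. */
enum ct_info_new {
    NEW_CT_ESTABLISHED,
    NEW_CT_RELATED,
    NEW_CT_NEW,
    NEW_CT_IS_REPLY,
    NEW_CT_ESTABLISHED_REPLY = NEW_CT_ESTABLISHED + NEW_CT_IS_REPLY,
    NEW_CT_RELATED_REPLY = NEW_CT_RELATED + NEW_CT_IS_REPLY,
    /* No NEW in reply direction. */
    NEW_CT_NUMBER,
#ifndef __KERNEL__
    /* Kept only so existing userspace code still compiles. */
    NEW_CT_NEW_REPLY = NEW_CT_NUMBER,
#endif
};

/* Compile-time check that the count did not move. */
static_assert((int)NEW_CT_NUMBER == (int)OLD_CT_NUMBER,
              "number of conntrack info values must not change");

/* Returns 1 when the userspace-visible values are identical. */
static int ct_values_unchanged(void)
{
    return (int)NEW_CT_NEW_REPLY == (int)OLD_CT_NEW_REPLY &&
           (int)NEW_CT_NUMBER == (int)OLD_CT_NUMBER;
}
```

Under these assumptions both layouts give 5 for the NEW_REPLY and NUMBER members, so a userspace build (without `__KERNEL__`) would see the same numeric values as before.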
Re: [PATCH] ravb: add R8A7791 support
Hello.

On 11/30/2015 03:42 AM, Simon Horman wrote:

>>>> Add support for yet another ARM member of the R-Car family, R-Car M2, also
>>>
>>> R-Car M2-W?
>>
>> Right, forgot about the postfixes.
>>
>>>> known as R8A7791.
>>>
>>> There's also R-Car M2-N, aka R8A7793, but you probably know that ;-)
>>
>> Will fix.
>
> I would prefer if we added generic gen2 and gen3 compat strings to
> the driver and only documented new soc-specific compat strings.

That's a new policy it seems. Previously you preferred the SoC-specific strings to be used, didn't you?

> Actually by chance I was planning to up patches to do that and add
> compat strings for the missing Gen2 boards. But I won't complain if
> you beat me to it.

No, I'm pretty busy as is. :-)

MBR, Sergei
Re: [PATCH] Improve Atheros ethernet driver not to do order 4 GFP_ATOMIC allocation
On Sat, 2015-11-28 at 15:51 +0100, Pavel Machek wrote:
> atl1c driver is doing order-4 allocation with GFP_ATOMIC
> priority. That often breaks networking after resume. Switch to
> GFP_KERNEL. Still not ideal, but should be significantly better.
>
> Signed-off-by: Pavel Machek
>
> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 2795d6d..afb71e0 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -1016,10 +1016,10 @@ static int atl1c_setup_ring_resources(struct atl1c_adapter *adapter)
> sizeof(struct atl1c_recv_ret_status) * rx_desc_count +
> 8 * 4;
>
> - ring_header->desc = pci_alloc_consistent(pdev, ring_header->size,
> - &ring_header->dma);
> + ring_header->desc = dma_alloc_coherent(&pdev->dev, ring_header->size,
> + &ring_header->dma, GFP_KERNEL);
> if (unlikely(!ring_header->desc)) {
> - dev_err(&pdev->dev, "pci_alloc_consistend failed\n");
> + dev_err(&pdev->dev, "could not get memmory for DMA buffer\n");
> goto err_nomem;
> }
> memset(ring_header->desc, 0, ring_header->size);
>

It seems there is a missed opportunity to get rid of the memset() here, by adding __GFP_ZERO to the dma_alloc_coherent() GFP_KERNEL mask, or simply using dma_zalloc_coherent().
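The suggestion is the usual "allocate already zeroed instead of allocate then clear" simplification. Kernel DMA calls cannot run outside the kernel, so the sketch below is only a userspace analogy under that assumption: `malloc()` plus `memset()` stands in for the patched code, and `calloc()` stands in for a zeroing allocator. All three function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Two-step pattern, analogous to allocating and then calling memset(). */
static void *alloc_then_clear(size_t size)
{
    void *p = malloc(size);

    if (p)
        memset(p, 0, size);
    return p;
}

/* One-step pattern, analogous to asking the allocator for zeroed memory. */
static void *alloc_zeroed(size_t size)
{
    return calloc(1, size);
}

/* Returns 1 when every byte of the buffer is zero. */
static int all_zero(const void *buf, size_t size)
{
    const unsigned char *p = buf;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < size; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

Both helpers hand back a buffer for which `all_zero()` returns 1; the second one simply leaves the clearing to the allocator, which is the effect the reply is asking for.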
Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow
On Mon, Nov 30, 2015 at 03:34:35PM +0100, Daniel Borkmann wrote:
> >>diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> >>index 3f4c99e06c6b..b1e53b79c586 100644
> >>--- a/kernel/bpf/arraymap.c
> >>+++ b/kernel/bpf/arraymap.c
> >>@@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
> >> attr->value_size == 0)
> >> return ERR_PTR(-EINVAL);
> >>
> >>+if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
> >>+/* if value_size is bigger, the user space won't be able to
> >>+ * access the elements.
> >>+ */
> >>+return ERR_PTR(-E2BIG);
> >>+
> >
> >Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN already
> >and if that fails, we fall back to vzalloc(), it shouldn't trigger memory allocation
> >warnings here ...

not quite, the above check is for kmalloc-s in syscall.c

> Ok, I see. The check and comment is related to the fact that when we do bpf(2)
> syscall to lookup an element:
>
> We call map_lookup_elem(), which does kmalloc() on the value_size.
>
> So an individual entry lookup could fail with kmalloc() there, unrelated to an
> individual map implementation.

The kmalloc "order >= MAX_ORDER" warning can be seen in the syscall path for update/lookup commands regardless of map implementation. So maps with "value_size >= 1 << (KMALLOC_SHIFT_MAX - 1)" were not accessible from user space anyway. This check in arraymap.c fixes the warning and prevents creation of such maps in the first place, as the comment right below it says. The similar check in hashtab.c fixes the warning, prevents abnormal map creation and fixes an integer overflow, which is the most dangerous of them all.

The check in arraymap.c

-attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
+attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)

fixes a potential integer overflow in the map.pages computation, and the similar check in hashtab.c:

(u64) htab->elem_size * htab->map.max_entries >= U32_MAX - PAGE_SIZE

fixes an integer overflow in map.pages as well.

The 'value_size >= (1 << (KMALLOC_SHIFT_MAX - 1)) - MAX_BPF_STACK - sizeof(struct htab_elem)' check in hashtab.c fixes an integer overflow in elem_size and makes elem_size kmalloc-able later in htab_map_update_elem(). Since it wasn't obvious that this one 'if' addresses these multiple issues, I've added a comment there.

The addition of __GFP_NOWARN only fixes the OOM warning, as the commit log says.

> Hmm, seems this patch fixes many things at once, maybe makes sense to split
> it?

hmm, I don't see the point of changing the same single line over multiple patches. The split won't help backporting, but rather makes for more patches to deal with.
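The arraymap.c bound described above can be sketched as a stand-alone C check. This is an illustration of the overflow-avoidance pattern only, not the kernel code: `SKETCH_PAGE_SIZE`, `array_size_fits` and `array_pages` are hypothetical names, the 4096-byte page size is an assumption, and `header_size` stands in for `sizeof(*array)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Same shape as the quoted check: compare max_entries against a divided
 * bound instead of multiplying first, so nothing can wrap around.
 * Leaving SKETCH_PAGE_SIZE of headroom keeps the later round-up to whole
 * pages inside 32 bits as well.
 */
static bool array_size_fits(uint32_t max_entries, uint32_t elem_size,
                            uint32_t header_size)
{
    if (elem_size == 0 || header_size > UINT32_MAX - SKETCH_PAGE_SIZE)
        return false;
    return max_entries <=
           (UINT32_MAX - SKETCH_PAGE_SIZE - header_size) / elem_size;
}

/* Number of pages needed, or 0 when the size would not fit in 32 bits. */
static uint32_t array_pages(uint32_t max_entries, uint32_t elem_size,
                            uint32_t header_size)
{
    uint32_t total;

    if (!array_size_fits(max_entries, elem_size, header_size))
        return 0;

    total = header_size + max_entries * elem_size;
    assert(total <= UINT32_MAX - SKETCH_PAGE_SIZE);
    return (total + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

With a 64-byte header, 1024 entries of 8 bytes need 3 pages, while 0x20000000 entries of 8 bytes are rejected: the product alone would be 2^32 and wrap to 0 in 32-bit arithmetic, which is the kind of overflow the check is meant to stop.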
Re: [PATCH net] ipv6: kill sk_dst_lock
On Mon, 2015-11-30 at 08:35 -0800, Eric Dumazet wrote:
> ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
> way to fix the mess is to remove sk_dst_lock completely, as we did for
> IPv4.

Probably I'm missing something here, but why don't we need to sync the update of sk_dst_cache and of dst_cookie (i.e. put them under the same lock)? Can't we end up with inconsistent values after concurrent udp sendmsg()?

Cheers,

Paolo