Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Tom Herbert
On Mon, Nov 30, 2015 at 1:42 PM, Singhai, Anjali wrote:
>
>
> -----Original Message-----
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Sunday, November 29, 2015 7:23 PM
> To: t...@herbertland.com
> Cc: Brandeburg, Jesse ; Singhai, Anjali 
> ; je...@kernel.org; netdev@vger.kernel.org; Patil, 
> Kiran 
> Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload
>
> From: Tom Herbert 
> Date: Tue, 24 Nov 2015 09:32:11 -0800
>
>>>
>>> FWIW, I've brought the issue to the attention of the architects here,
>>> and we will likely be able to make changes in this space.  Intel
>>> hardware (as demonstrated by your patches) already is able to deal
>>> with this de-ossification on transmit.  Receive is a whole different beast.
>>>
>> Please provide the specifics on why "Receive is a whole different
>> beast.". Generic receive checksum is already a subset of the
>> functionality that you must have implemented to support the protocol
>> specific offloads. All the hardware needs to do is calculate the 1's
>> complement checksum of the packet and return the value to the
>> host with that packet. That's it. No parsing of headers, no worrying
>> about the pseudo header, no dealing with any encapsulation. Just do
>> the calculation, return the result to the host and the driver converts
>> this to CHECKSUM_COMPLETE. I find it very hard to believe that this is
>> any harder than specific support for the next protocol du jour.
>
> Receive differs from transmit because on the TX side the driver can give
> the HW per-packet metadata: where the checksum field is and what length
> needs to be checksummed. On RX the HW parser has to parse the packet to
> identify the tunnel type and, based on that, figure out the checksum
> locations and lengths in the packet. So the HW definitely has to parse
> the packet, and it can parse only based on next-header type information
> or, in the case of UDP tunnels, on a UDP port mapped to a particular
> protocol. I am not sure why you say it doesn't need to parse the packet;
> maybe I am misunderstanding something. It is not difficult to reduce
> protocol ossification on the RX side, but it is certainly different, and
> for UDP tunnels in particular it needs the port-to-protocol mapping.
>
Please look at how the CHECKSUM_COMPLETE interface works. A description is
in sk_buff.h or
http://people.netfilter.org/pablo/netdev0.1/papers/UDP-Encapsulation-in-Linux.pdf.
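
To make the CHECKSUM_COMPLETE model concrete, here is a minimal userspace
sketch, not kernel code. It assumes the device reports the 1's complement
sum over the whole packet and that the host later removes the bytes of any
header it pulls. The names csum_fold16(), csum_bytes(), csum_sub16() and
csum_pull() are hypothetical and only loosely mirror what the kernel
helpers do.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator down to a 16-bit ones' complement sum. */
uint16_t csum_fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Ones' complement sum of len bytes, read as big-endian 16-bit words.
 * An odd trailing byte is padded with zero. This models the value a
 * device would hand to the driver; no header parsing is involved.
 */
uint16_t csum_bytes(const uint8_t *p, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;
	return csum_fold16(sum);
}

/* Ones' complement subtraction: a - b is a + ~b. */
uint16_t csum_sub16(uint16_t a, uint16_t b)
{
	return csum_fold16((uint64_t)a + (uint16_t)~b);
}

/*
 * Host side (assumption: hdr_len is even): given the device's sum over
 * the whole packet, drop the contribution of the first hdr_len bytes.
 */
uint16_t csum_pull(uint16_t full, const uint8_t *pkt, size_t hdr_len)
{
	return csum_sub16(full, csum_bytes(pkt, hdr_len));
}
```

Nothing in the sketch depends on which encapsulation, if any, sits in the
pulled bytes; the host only needs to know how many bytes it consumed.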

Thanks,
Tom
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next v3 17/17] net: mlx4: use new ETHTOOL_G/SSETTINGS API

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 323 
 drivers/net/ethernet/mellanox/mlx4/en_main.c|   1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h|   1 +
 3 files changed, 157 insertions(+), 168 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index dd84cab..0ccdc84 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -501,34 +501,30 @@ static u32 mlx4_en_autoneg_get(struct net_device *dev)
return autoneg;
 }
 
-static u32 ptys_get_supported_port(struct mlx4_ptys_reg *ptys_reg)
+static void ptys2ethtool_update_supported_port(ethtool_link_mode_mask_t *mask,
+  struct mlx4_ptys_reg *ptys_reg)
 {
u32 eth_proto = be32_to_cpu(ptys_reg->eth_proto_cap);
 
if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_T)
 | MLX4_PROT_MASK(MLX4_1000BASE_T)
 | MLX4_PROT_MASK(MLX4_100BASE_TX))) {
-   return SUPPORTED_TP;
-   }
-
-   if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_CR)
+   ethtool_add_link_modes(mask, ETHTOOL_LINK_MODE_TP_BIT);
+   } else if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_CR)
 | MLX4_PROT_MASK(MLX4_10GBASE_SR)
 | MLX4_PROT_MASK(MLX4_56GBASE_SR4)
 | MLX4_PROT_MASK(MLX4_40GBASE_CR4)
 | MLX4_PROT_MASK(MLX4_40GBASE_SR4)
 | MLX4_PROT_MASK(MLX4_1000BASE_CX_SGMII))) {
-   return SUPPORTED_FIBRE;
-   }
-
-   if (eth_proto & (MLX4_PROT_MASK(MLX4_56GBASE_KR4)
+   ethtool_add_link_modes(mask, ETHTOOL_LINK_MODE_FIBRE_BIT);
+   } else if (eth_proto & (MLX4_PROT_MASK(MLX4_56GBASE_KR4)
 | MLX4_PROT_MASK(MLX4_40GBASE_KR4)
 | MLX4_PROT_MASK(MLX4_20GBASE_KR2)
 | MLX4_PROT_MASK(MLX4_10GBASE_KR)
 | MLX4_PROT_MASK(MLX4_10GBASE_KX4)
 | MLX4_PROT_MASK(MLX4_1000BASE_KX))) {
-   return SUPPORTED_Backplane;
+   ethtool_add_link_modes(mask, ETHTOOL_LINK_MODE_Backplane_BIT);
}
-   return 0;
 }
 
 static u32 ptys_get_active_port(struct mlx4_ptys_reg *ptys_reg)
@@ -574,122 +570,91 @@ static u32 ptys_get_active_port(struct mlx4_ptys_reg 
*ptys_reg)
 enum ethtool_report {
SUPPORTED = 0,
ADVERTISED = 1,
-   SPEED = 2
 };
 
+struct ptys2ethtool_config {
+   ethtool_link_mode_mask_t link_modes[2];  /* SUPPORTED/ADVERTISED */
+   u32 speed;
+};
+
+#define MLX4_BUILD_PTYS2ETHTOOL_CONFIG(reg_, speed_, ...)  \
+   ({  \
+   struct ptys2ethtool_config *cfg;\
+   cfg = &ptys2ethtool_map[reg_];  \
+   cfg->speed = speed_;\
+   ethtool_build_link_mode(&cfg->link_modes[SUPPORTED],\
+   __VA_ARGS__);   \
+   ethtool_build_link_mode(&cfg->link_modes[ADVERTISED],   \
+   __VA_ARGS__);   \
+   })
+
 /* Translates mlx4 link mode to equivalent ethtool Link modes/speed */
-static u32 ptys2ethtool_map[MLX4_LINK_MODES_SZ][3] = {
-   [MLX4_100BASE_TX] = {
-   SUPPORTED_100baseT_Full,
-   ADVERTISED_100baseT_Full,
-   SPEED_100
-   },
-
-   [MLX4_1000BASE_T] = {
-   SUPPORTED_1000baseT_Full,
-   ADVERTISED_1000baseT_Full,
-   SPEED_1000
-   },
-   [MLX4_1000BASE_CX_SGMII] = {
-   SUPPORTED_1000baseKX_Full,
-   ADVERTISED_1000baseKX_Full,
-   SPEED_1000
-   },
-   [MLX4_1000BASE_KX] = {
-   SUPPORTED_1000baseKX_Full,
-   ADVERTISED_1000baseKX_Full,
-   SPEED_1000
-   },
-
-   [MLX4_10GBASE_T] = {
-   SUPPORTED_10000baseT_Full,
-   ADVERTISED_10000baseT_Full,
-   SPEED_10000
-   },
-   [MLX4_10GBASE_CX4] = {
-   SUPPORTED_10000baseKX4_Full,
-   ADVERTISED_10000baseKX4_Full,
-   SPEED_10000
-   },
-   [MLX4_10GBASE_KX4] = {
-   SUPPORTED_10000baseKX4_Full,
-   ADVERTISED_10000baseKX4_Full,
-   SPEED_10000
-   },
-   [MLX4_10GBASE_KR] = {
-   SUPPORTED_10000baseKR_Full,
-   ADVERTISED_10000baseKR_Full,
-   SPEED_10000
-   },
-   [MLX4_10GBASE_CR] = {
-   

[PATCH net-next v3 04/17] tx4939: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 arch/mips/txx9/generic/setup_tx4939.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/mips/txx9/generic/setup_tx4939.c 
b/arch/mips/txx9/generic/setup_tx4939.c
index e3733cd..4a3ebf6 100644
--- a/arch/mips/txx9/generic/setup_tx4939.c
+++ b/arch/mips/txx9/generic/setup_tx4939.c
@@ -320,11 +320,12 @@ void __init tx4939_sio_init(unsigned int sclk, unsigned 
int cts_mask)
 #if IS_ENABLED(CONFIG_TC35815)
 static u32 tx4939_get_eth_speed(struct net_device *dev)
 {
-   struct ethtool_cmd cmd;
-   if (__ethtool_get_settings(dev, &cmd))
+   struct ethtool_ksettings cmd;
+
+   if (__ethtool_get_ksettings(dev, &cmd))
return 100; /* default 100Mbps */
 
-   return ethtool_cmd_speed(&cmd);
+   return cmd.parent.speed;
 }
 
 static int tx4939_netdev_event(struct notifier_block *this,
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH] net: smc911x: convert pxa dma to dmaengine

2015-11-30 Thread Robert Jarzmik
Convert the dma transfers to be dmaengine based, now that pxa has a
dmaengine slave driver. This makes this driver a bit more PXA agnostic.

The driver was only compile tested. The risk is quite small as no
current PXA platform I'm aware of is using the smc911x driver.

Signed-off-by: Robert Jarzmik 
---
 drivers/net/ethernet/smsc/smc911x.c | 85 -
 drivers/net/ethernet/smsc/smc911x.h | 63 ---
 2 files changed, 82 insertions(+), 66 deletions(-)

diff --git a/drivers/net/ethernet/smsc/smc911x.c 
b/drivers/net/ethernet/smsc/smc911x.c
index bd64eb982e52..3f5711061432 100644
--- a/drivers/net/ethernet/smsc/smc911x.c
+++ b/drivers/net/ethernet/smsc/smc911x.c
@@ -73,6 +73,9 @@ static const char version[] =
 #include 
 #include 
 
+#include 
+#include 
+
 #include 
 
 #include "smc911x.h"
@@ -1174,18 +1177,16 @@ static irqreturn_t smc911x_interrupt(int irq, void 
*dev_id)
 
 #ifdef SMC_USE_DMA
 static void
-smc911x_tx_dma_irq(int dma, void *data)
+smc911x_tx_dma_irq(void *data)
 {
-   struct net_device *dev = (struct net_device *)data;
-   struct smc911x_local *lp = netdev_priv(dev);
+   struct smc911x_local *lp = data;
+   struct net_device *dev = lp->netdev;
struct sk_buff *skb = lp->current_tx_skb;
unsigned long flags;
 
DBG(SMC_DEBUG_FUNC, dev, "--> %s\n", __func__);
 
DBG(SMC_DEBUG_TX | SMC_DEBUG_DMA, dev, "TX DMA irq handler\n");
-   /* Clear the DMA interrupt sources */
-   SMC_DMA_ACK_IRQ(dev, dma);
BUG_ON(skb == NULL);
dma_unmap_single(NULL, tx_dmabuf, tx_dmalen, DMA_TO_DEVICE);
dev->trans_start = jiffies;
@@ -1208,18 +1209,16 @@ smc911x_tx_dma_irq(int dma, void *data)
"TX DMA irq completed\n");
 }
 static void
-smc911x_rx_dma_irq(int dma, void *data)
+smc911x_rx_dma_irq(void *data)
 {
-   struct net_device *dev = (struct net_device *)data;
-   struct smc911x_local *lp = netdev_priv(dev);
+   struct smc911x_local *lp = data;
+   struct net_device *dev = lp->netdev;
struct sk_buff *skb = lp->current_rx_skb;
unsigned long flags;
unsigned int pkts;
 
DBG(SMC_DEBUG_FUNC, dev, "--> %s\n", __func__);
DBG(SMC_DEBUG_RX | SMC_DEBUG_DMA, dev, "RX DMA irq handler\n");
-   /* Clear the DMA interrupt sources */
-   SMC_DMA_ACK_IRQ(dev, dma);
dma_unmap_single(NULL, rx_dmabuf, rx_dmalen, DMA_FROM_DEVICE);
BUG_ON(skb == NULL);
lp->current_rx_skb = NULL;
@@ -1792,6 +1791,9 @@ static int smc911x_probe(struct net_device *dev)
unsigned int val, chip_id, revision;
const char *version_string;
unsigned long irq_flags;
+   struct dma_slave_config config;
+   dma_cap_mask_t mask;
+   struct pxad_param param;
 
DBG(SMC_DEBUG_FUNC, dev, "--> %s\n", __func__);
 
@@ -1963,11 +1965,40 @@ static int smc911x_probe(struct net_device *dev)
goto err_out;
 
 #ifdef SMC_USE_DMA
-   lp->rxdma = SMC_DMA_REQUEST(dev, smc911x_rx_dma_irq);
-   lp->txdma = SMC_DMA_REQUEST(dev, smc911x_tx_dma_irq);
+
+   dma_cap_zero(mask);
+   dma_cap_set(DMA_SLAVE, mask);
+   param.prio = PXAD_PRIO_LOWEST;
+   param.drcmr = -1UL;
+
+   lp->rxdma =
+   dma_request_slave_channel_compat(mask, pxad_filter_fn,
+&param, &dev->dev, "rx");
+   lp->txdma =
+   dma_request_slave_channel_compat(mask, pxad_filter_fn,
+&param, &dev->dev, "tx");
lp->rxdma_active = 0;
lp->txdma_active = 0;
-   dev->dma = lp->rxdma;
+
+   memset(&config, 0, sizeof(config));
+   config.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+   config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+   config.src_addr = lp->physaddr + RX_DATA_FIFO;
+   config.dst_addr = lp->physaddr + TX_DATA_FIFO;
+   config.src_maxburst = 32;
+   config.dst_maxburst = 32;
+   retval = dmaengine_slave_config(lp->rxdma, &config);
+   if (retval) {
+   dev_err(lp->dev, "dma rx channel configuration failed: %d\n",
+   retval);
+   goto err_out;
+   }
+   retval = dmaengine_slave_config(lp->txdma, &config);
+   if (retval) {
+   dev_err(lp->dev, "dma tx channel configuration failed: %d\n",
+   retval);
+   goto err_out;
+   }
 #endif
 
retval = register_netdev(dev);
@@ -1978,11 +2009,11 @@ static int smc911x_probe(struct net_device *dev)
dev->base_addr, dev->irq);
 
 #ifdef SMC_USE_DMA
-   if (lp->rxdma != -1)
-   pr_cont(" RXDMA %d", lp->rxdma);
+   if (lp->rxdma)
+   pr_cont(" RXDMA %p", lp->rxdma);
 
-   if (lp->txdma != -1)
-   pr_cont(" TXDMA %d", lp->txdma);
+   if (lp->txdma)
+   

[PATCH v2] ravb: add R8A7791 support

2015-11-30 Thread Sergei Shtylyov
Add support for yet another ARM member of the R-Car family, R-Car M2-W,
also known as R8A7791.

Signed-off-by: Sergei Shtylyov 

---
The patch is against DaveM's 'net-next.git' repo but I wouldn't mind if it's
applied to 'net.git' instead. :-)

Changes in version 2:
- fixed the SoC name in the changelog.

 Documentation/devicetree/bindings/net/renesas,ravb.txt |1 +
 drivers/net/ethernet/renesas/ravb_main.c   |1 +
 2 files changed, 2 insertions(+)

Index: net-next/Documentation/devicetree/bindings/net/renesas,ravb.txt
===
--- net-next.orig/Documentation/devicetree/bindings/net/renesas,ravb.txt
+++ net-next/Documentation/devicetree/bindings/net/renesas,ravb.txt
@@ -5,6 +5,7 @@ interface contains.
 
 Required properties:
 - compatible: "renesas,etheravb-r8a7790" if the device is a part of R8A7790 
SoC.
+ "renesas,etheravb-r8a7791" if the device is a part of R8A7791 SoC.
  "renesas,etheravb-r8a7794" if the device is a part of R8A7794 SoC.
  "renesas,etheravb-r8a7795" if the device is a part of R8A7795 SoC.
 - reg: offset and length of (1) the register block and (2) the stream buffer.
Index: net-next/drivers/net/ethernet/renesas/ravb_main.c
===
--- net-next.orig/drivers/net/ethernet/renesas/ravb_main.c
+++ net-next/drivers/net/ethernet/renesas/ravb_main.c
@@ -1655,6 +1655,7 @@ static int ravb_mdio_release(struct ravb
 
 static const struct of_device_id ravb_match_table[] = {
{ .compatible = "renesas,etheravb-r8a7790", .data = (void *)RCAR_GEN2 },
+   { .compatible = "renesas,etheravb-r8a7791", .data = (void *)RCAR_GEN2 },
{ .compatible = "renesas,etheravb-r8a7794", .data = (void *)RCAR_GEN2 },
{ .compatible = "renesas,etheravb-r8a7795", .data = (void *)RCAR_GEN3 },
{ }



[PATCH net-next v3 06/17] net: bonding: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/net/bonding/bond_main.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 9e0f8a7..67d724d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -374,22 +374,20 @@ down:
 static void bond_update_speed_duplex(struct slave *slave)
 {
struct net_device *slave_dev = slave->dev;
-   struct ethtool_cmd ecmd;
-   u32 slave_speed;
+   struct ethtool_ksettings ecmd;
int res;
 
slave->speed = SPEED_UNKNOWN;
slave->duplex = DUPLEX_UNKNOWN;
 
-   res = __ethtool_get_settings(slave_dev, &ecmd);
+   res = __ethtool_get_ksettings(slave_dev, &ecmd);
if (res < 0)
return;
 
-   slave_speed = ethtool_cmd_speed(&ecmd);
-   if (slave_speed == 0 || slave_speed == ((__u32) -1))
+   if (ecmd.parent.speed == 0 || ecmd.parent.speed == ((__u32)-1))
return;
 
-   switch (ecmd.duplex) {
+   switch (ecmd.parent.duplex) {
case DUPLEX_FULL:
case DUPLEX_HALF:
break;
@@ -397,8 +395,8 @@ static void bond_update_speed_duplex(struct slave *slave)
return;
}
 
-   slave->speed = slave_speed;
-   slave->duplex = ecmd.duplex;
+   slave->speed = ecmd.parent.speed;
+   slave->duplex = ecmd.parent.duplex;
 
return;
 }
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 03/17] net: ethtool: add new ETHTOOL_GSETTINGS/SSETTINGS API

2015-11-30 Thread David Decotigny
From: David Decotigny 

This patch defines a new ETHTOOL_GSETTINGS/SSETTINGS API, handled by
the new get_ksettings/set_ksettings callbacks. This API provides
support for most legacy ethtool_cmd fields, adds support for larger
link mode masks (up to 4064 bits, variable length), and removes
ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt).

This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides
the following backward compatibility properties:
 - legacy ethtool with legacy drivers: no change, still using the
   get_settings/set_settings callbacks.
 - legacy ethtool with new get/set_ksettings drivers: the new driver
   callbacks are used, data internally converted to legacy
   ethtool_cmd. ETHTOOL_GSET will return only the first 32 bits of each
   link mode mask. ETHTOOL_SSET will fail if the user tries to set the
   ethtool_cmd deprecated fields to non-0
   (transceiver/maxrxpkt/maxtxpkt). A kernel warning is printed if the
   driver exports higher bits or if the user requests changes in the
   deprecated fields mentioned earlier.
 - future ethtool with legacy drivers: no change, still using the
   get_settings/set_settings callbacks, internally converted to new
   data structure. Note that the "future" ethtool tool will not allow
   changes to deprecated fields (transceiver/maxrxpkt/maxtxpkt), as
   they cannot be expressed to the kernel.
 - future ethtool with new drivers: direct call to the new callbacks.

By "future" ethtool, what is meant is:
 - query: first try ETHTOOL_GSETTINGS, and fall back to ETHTOOL_GSET if
   that fails
 - set: query first and remember which of ETHTOOL_GSETTINGS or
   ETHTOOL_GSET was successful
   - if ETHTOOL_GSETTINGS was successful, then change config with
 ETHTOOL_SSETTINGS. A failure there is final (do not try ETHTOOL_SSET).
   - otherwise ETHTOOL_GSET was successful, change config with
 ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SSETTINGS).
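
The "future" ethtool behaviour listed above can be sketched roughly as
follows. This is a userspace illustration only, not the ethtool source: the
four callbacks in struct settings_ops are hypothetical stand-ins for issuing
the corresponding ioctl commands (0 means success), and stub_ok()/stub_fail()
exist only so the sketch can be exercised.

```c
/* Hypothetical stand-ins for the four ioctl commands; 0 means success. */
struct settings_ops {
	int (*gsettings)(void);	/* ETHTOOL_GSETTINGS */
	int (*ssettings)(void);	/* ETHTOOL_SSETTINGS */
	int (*gset)(void);	/* ETHTOOL_GSET */
	int (*sset)(void);	/* ETHTOOL_SSET */
};

enum settings_api { API_NONE, API_KSETTINGS, API_LEGACY };

/* Query: try ETHTOOL_GSETTINGS first, fall back to ETHTOOL_GSET. */
enum settings_api query_settings(const struct settings_ops *ops)
{
	if (ops->gsettings() == 0)
		return API_KSETTINGS;
	if (ops->gset() == 0)
		return API_LEGACY;
	return API_NONE;
}

/*
 * Set: query first, then use only the setter that matches the getter
 * that worked. A failure of that setter is final; the other API is
 * never tried.
 */
int apply_settings(const struct settings_ops *ops)
{
	switch (query_settings(ops)) {
	case API_KSETTINGS:
		return ops->ssettings();
	case API_LEGACY:
		return ops->sset();
	default:
		return -1;
	}
}

/* Trivial stubs, only for exercising the sketch. */
int stub_ok(void)   { return 0; }
int stub_fail(void) { return -1; }
```

With { stub_ok, stub_fail, stub_ok, stub_ok } the set fails even though the
legacy setter would have succeeded, which is the "failure there is final"
rule.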

User/kernel interaction via the new API requires a small
ETHTOOL_GSETTINGS handshake first to agree on the length of the link
mode bitmaps. If the kernel doesn't agree with the user, it returns the
bitmap length it expects from the user as a negative length (and the cmd
field is 0). When kernel and user agree, the kernel returns valid info in
all fields (i.e. link mode length > 0 and cmd is ETHTOOL_GSETTINGS).
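
As a rough illustration of that handshake, here is a userspace mock, not the
patch's code. struct settings_hdr, CMD_GSETTINGS, KERNEL_NWORDS and
mock_kernel_gsettings() are all hypothetical; only the field name
link_mode_masks_nwords comes from the patch, and its type, the command value,
and the user's initial proposal of 0 are assumptions.

```c
#include <stdint.h>

#define CMD_GSETTINGS 0x4cu	/* placeholder command value (assumption) */
#define KERNEL_NWORDS 3		/* length the mock kernel expects (made up) */

struct settings_hdr {
	uint32_t cmd;
	int8_t link_mode_masks_nwords;
};

/* Mock kernel side: reject a wrong length by returning the expected
 * length negated and cmd set to 0; otherwise report valid fields. */
int mock_kernel_gsettings(struct settings_hdr *s)
{
	if (s->link_mode_masks_nwords != KERNEL_NWORDS) {
		s->cmd = 0;
		s->link_mode_masks_nwords = -KERNEL_NWORDS;
		return 0;
	}
	s->cmd = CMD_GSETTINGS;
	return 0;
}

/* User side: returns the agreed number of words (> 0), or -1. */
int negotiate_nwords(void)
{
	struct settings_hdr s = { CMD_GSETTINGS, 0 };

	if (mock_kernel_gsettings(&s) != 0)
		return -1;
	if (s.cmd == CMD_GSETTINGS && s.link_mode_masks_nwords > 0)
		return s.link_mode_masks_nwords;	/* agreed at once */
	if (s.cmd != 0 || s.link_mode_masks_nwords >= 0)
		return -1;				/* unexpected reply */

	/* Second round: retry with the length the kernel asked for. */
	s.cmd = CMD_GSETTINGS;
	s.link_mode_masks_nwords = (int8_t)-s.link_mode_masks_nwords;
	if (mock_kernel_gsettings(&s) != 0)
		return -1;
	if (s.cmd != CMD_GSETTINGS || s.link_mode_masks_nwords <= 0)
		return -1;
	return s.link_mode_masks_nwords;
}
```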

The data structure crossing the user/kernel boundary is 32/64-bit
agnostic. It is converted internally to a legal kernel bitmap.

The internal __ethtool_get_settings kernel helper will gradually be
replaced by __ethtool_get_ksettings by the time the first ksettings
drivers start to appear. So this patch doesn't change it; it will be
removed before it needs to be changed.

Signed-off-by: David Decotigny 
---
 include/linux/ethtool.h  | 101 -
 include/uapi/linux/ethtool.h | 323 ++--
 net/core/ethtool.c   | 489 ++-
 3 files changed, 833 insertions(+), 80 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 653dc9c..6de122d 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -12,6 +12,7 @@
 #ifndef _LINUX_ETHTOOL_H
 #define _LINUX_ETHTOOL_H
 
+#include 
 #include 
 #include 
 
@@ -40,9 +41,6 @@ struct compat_ethtool_rxnfc {
 
 #include 
 
-extern int __ethtool_get_settings(struct net_device *dev,
- struct ethtool_cmd *cmd);
-
 /**
  * enum ethtool_phys_id_state - indicator state for physical identification
  * @ETHTOOL_ID_INACTIVE: Physical ID indicator should be deactivated
@@ -97,13 +95,85 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 
n_rx_rings)
return index % n_rx_rings;
 }
 
+#define __ETHTOOL_LINK_MODE_IS_VALID_BIT(indice)   \
+   ((indice) >= 0 && (indice) <= __ETHTOOL_LINK_MODE_LAST)
+
+/* number of link mode bits handled internally by kernel */
+#define __ETHTOOL_LINK_MODE_MASK_NBITS (__ETHTOOL_LINK_MODE_LAST+1)
+
+typedef struct {
+   unsigned long mask[BITS_TO_LONGS(__ETHTOOL_LINK_MODE_MASK_NBITS)];
+} ethtool_link_mode_mask_t;
+
+/* drivers must ignore parent.cmd and parent.link_mode_masks_nwords
+ * fields, but they are allowed to overwrite them (will be ignored).
+ */
+struct ethtool_ksettings {
+   struct ethtool_settings parent;
+   struct {
+   ethtool_link_mode_mask_t supported;
+   ethtool_link_mode_mask_t advertising;
+   ethtool_link_mode_mask_t lp_advertising;
+   } link_modes;
+};
+
+/* helper function for ethtool_build_link_mode and ethtool_add_link_modes */
+static inline int
+__ethtool_add_link_modes(ethtool_link_mode_mask_t *dst,
+unsigned nindices,
+const enum ethtool_link_mode_bit_indices *indices) {
+   unsigned i;
+   int rv = 0;
+
+   for (i = 0 ; i < nindices ; ++i) {
+   if (__ETHTOOL_LINK_MODE_IS_VALID_BIT(indices[i]))
+   

[PATCH net-next v3 01/17] net: usnic: remove unused call to ethtool_ops::get_settings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c 
b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index f8e3211..5b60579 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -269,7 +269,6 @@ int usnic_ib_query_device(struct ib_device *ibdev,
struct usnic_ib_dev *us_ibdev = to_usdev(ibdev);
union ib_gid gid;
struct ethtool_drvinfo info;
-   struct ethtool_cmd cmd;
int qp_per_vf;
 
usnic_dbg("\n");
@@ -278,7 +277,6 @@ int usnic_ib_query_device(struct ib_device *ibdev,
 
mutex_lock(&us_ibdev->usdev_lock);
us_ibdev->netdev->ethtool_ops->get_drvinfo(us_ibdev->netdev, &info);
-   us_ibdev->netdev->ethtool_ops->get_settings(us_ibdev->netdev, &cmd);
memset(props, 0, sizeof(*props));
usnic_mac_ip_to_gid(us_ibdev->ufdev->mac, us_ibdev->ufdev->inaddr,
[0]);
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 05/17] net: usnic: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c 
b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index e082170..e0d12d4 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -324,12 +324,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props)
 {
struct usnic_ib_dev *us_ibdev = to_usdev(ibdev);
-   struct ethtool_cmd cmd;
+   struct ethtool_ksettings cmd;
 
usnic_dbg("\n");
 
mutex_lock(&us_ibdev->usdev_lock);
-   __ethtool_get_settings(us_ibdev->netdev, &cmd);
+   __ethtool_get_ksettings(us_ibdev->netdev, &cmd);
memset(props, 0, sizeof(*props));
 
props->lid = 0;
@@ -353,8 +353,8 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
props->pkey_tbl_len = 1;
props->bad_pkey_cntr = 0;
props->qkey_viol_cntr = 0;
-   eth_speed_to_ib_speed(cmd.speed, &props->active_speed,
-   &props->active_width);
+   eth_speed_to_ib_speed(cmd.parent.speed, &props->active_speed,
+ &props->active_width);
props->max_mtu = IB_MTU_4096;
props->active_mtu = iboe_get_mtu(us_ibdev->ufdev->mtu);
/* Userspace will adjust for hdrs */
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 02/17] net: usnic: use __ethtool_get_settings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c 
b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 5b60579..e082170 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -329,7 +329,7 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
usnic_dbg("\n");
 
mutex_lock(&us_ibdev->usdev_lock);
-   us_ibdev->netdev->ethtool_ops->get_settings(us_ibdev->netdev, &cmd);
+   __ethtool_get_settings(us_ibdev->netdev, &cmd);
memset(props, 0, sizeof(*props));
 
props->lid = 0;
-- 
2.6.0.rc2.230.g3dd15c0



Re: gigaset: freeing an active object

2015-11-30 Thread Paul Bolle
On ma, 2015-11-30 at 19:30 +0100, Tilman Schmidt wrote:
> I wonder how that will behave if someone attaches two of the devices to
> different serial ports. Not likely, but not forbidden either.

I see.

Perhaps I should respin and use a pointer to a struct platform_device
in struct ser_cardstate, use the two step approach of
platform_device_alloc() and friends, etc. Only slightly more
complicated.

How would attaching two devices work with GIGASET_MINORS hardcoded to 1?
Because I haven't yet stumbled on the mechanism with which ttyGS1 (and
up) would then be created.

(I do have a second M105 in a box somewhere, so I could check myself
what happens when a second USB device is added, for what that's worth.)

Thanks,


Paul Bolle


Re: [PATCH 09/13] mm: memcontrol: generalize the socket accounting jump label

2015-11-30 Thread Jason Baron
Hi,

On 11/24/2015 04:52 PM, Johannes Weiner wrote:
> The unified hierarchy memory controller is going to use this jump
> label as well to control the networking callbacks. Move it to the
> memory controller code and give it a more generic name.
> 
> Signed-off-by: Johannes Weiner 
> Acked-by: Michal Hocko 
> Reviewed-by: Vladimir Davydov 
> ---
>  include/linux/memcontrol.h | 4 
>  include/net/sock.h | 7 ---
>  mm/memcontrol.c| 3 +++
>  net/core/sock.c| 5 -
>  net/ipv4/tcp_memcontrol.c  | 4 ++--
>  5 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index d99fefe..dad56ef 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -681,6 +681,8 @@ static inline void mem_cgroup_wb_stats(struct 
> bdi_writeback *wb,
>  
>  #if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
>  struct sock;
> +extern struct static_key memcg_sockets_enabled_key;
> +#define mem_cgroup_sockets_enabled 
> static_key_false(&memcg_sockets_enabled_key)


We're trying to move to the updated API, so this should be:
static_branch_unlikely(&memcg_sockets_enabled_key)

See include/linux/jump_label.h for details.


>  void sock_update_memcg(struct sock *sk);
>  void sock_release_memcg(struct sock *sk);
>  bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int 
> nr_pages);
> @@ -689,6 +691,8 @@ static inline bool 
> mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
>  {
>   return memcg->tcp_mem.memory_pressure;
>  }
> +#else
> +#define mem_cgroup_sockets_enabled 0
>  #endif /* CONFIG_INET && CONFIG_MEMCG_KMEM */
>  
>  #ifdef CONFIG_MEMCG_KMEM
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 1a94b85..fcc9442 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1065,13 +1065,6 @@ static inline void sk_refcnt_debug_release(const 
> struct sock *sk)
>  #define sk_refcnt_debug_release(sk) do { } while (0)
>  #endif /* SOCK_REFCNT_DEBUG */
>  
> -#if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_NET)
> -extern struct static_key memcg_socket_limit_enabled;
> -#define mem_cgroup_sockets_enabled 
> static_key_false(&memcg_socket_limit_enabled)
> -#else
> -#define mem_cgroup_sockets_enabled 0
> -#endif
> -
>  static inline bool sk_stream_memory_free(const struct sock *sk)
>  {
>   if (sk->sk_wmem_queued >= sk->sk_sndbuf)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 68d67fc..0602bee 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -291,6 +291,9 @@ static inline struct mem_cgroup 
> *mem_cgroup_from_id(unsigned short id)
>  /* Writing them here to avoid exposing memcg's inner layout */
>  #if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
>  
> +struct static_key memcg_sockets_enabled_key;


And this would be:

static DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);


>  void sock_update_memcg(struct sock *sk)
>  {
>   struct mem_cgroup *memcg;
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 6486b0d..c5435b5 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -201,11 +201,6 @@ EXPORT_SYMBOL(sk_net_capable);
>  static struct lock_class_key af_family_keys[AF_MAX];
>  static struct lock_class_key af_family_slock_keys[AF_MAX];
>  
> -#if defined(CONFIG_MEMCG_KMEM)
> -struct static_key memcg_socket_limit_enabled;
> -EXPORT_SYMBOL(memcg_socket_limit_enabled);
> -#endif
> -
>  /*
>   * Make lock validator output more readable. (we pre-construct these
>   * strings build-time, so that runtime initialization of socket
> diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
> index e507825..9a22e2d 100644
> --- a/net/ipv4/tcp_memcontrol.c
> +++ b/net/ipv4/tcp_memcontrol.c
> @@ -34,7 +34,7 @@ void tcp_destroy_cgroup(struct mem_cgroup *memcg)
>   return;
>  
>   if (memcg->tcp_mem.active)
> - static_key_slow_dec(&memcg_socket_limit_enabled);
> + static_key_slow_dec(&memcg_sockets_enabled_key);
>  

static_branch_dec(&memcg_sockets_enabled_key);

}
>  
>  static int tcp_update_limit(struct mem_cgroup *memcg, unsigned long nr_pages)
> @@ -65,7 +65,7 @@ static int tcp_update_limit(struct mem_cgroup *memcg, 
> unsigned long nr_pages)
>* because when this value change, the code to process it is not
>* patched in yet.
>*/
> - static_key_slow_inc(&memcg_socket_limit_enabled);
> + static_key_slow_inc(&memcg_sockets_enabled_key);
>   memcg->tcp_mem.active = true;
>   }
>  
> 

static_branch_inc(&memcg_sockets_enabled_key);

Thanks,

-Jason



RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Singhai, Anjali


-----Original Message-----
From: David Miller [mailto:da...@davemloft.net] 
Sent: Sunday, November 29, 2015 7:22 PM
To: t...@herbertland.com
Cc: Singhai, Anjali ; netdev@vger.kernel.org; 
je...@kernel.org; Patil, Kiran 
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

From: Tom Herbert 
Date: Mon, 23 Nov 2015 13:53:44 -0800

> The bad effect of this model is that it encourages HW vendors to
> continue implementing HW protocol specific support for encapsulations; we
> get so much more benefit if they implement protocol generic
> mechanisms.
Dave, at least Intel parts have a protocol generic model for tunneled packet
offloads, and hence we are able to extend our support to newer tunnel types. We
do not have protocol specific support in the HW, but since the UDP based
tunnels do not have a packet type for the tunnel header, the HW needs to know
which UDP port should be mapped to which specific encapsulation. Other
encapsulation types like NVGRE we can identify through the packet type and
program the HW to account for the header. The newer patches certainly reduce
protocol ossification, since they communalize all the different tunnels into
one interface, so that supporting a newer UDP tunnel type requires just a type
definition and, if the driver/HW can support it, minor driver changes to set
the right bits for the HW. No interface change is needed. I think that is
definitely a step in the right direction.
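
To illustrate the port-to-encapsulation mapping described above, here is a
minimal sketch of the kind of table a driver might program into the HW. It
is an illustration under assumptions, not code from the patches: the type
and function names (udp_tunnel_table, tunnel_port_add, tunnel_port_lookup)
and the table size are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum tunnel_type { TUNNEL_NONE, TUNNEL_VXLAN, TUNNEL_GENEVE };

#define MAX_TUNNEL_PORTS 8	/* hypothetical table size */

struct udp_tunnel_port {
	uint16_t port;		/* UDP destination port, host byte order */
	enum tunnel_type type;
};

struct udp_tunnel_table {
	struct udp_tunnel_port entry[MAX_TUNNEL_PORTS];
	size_t count;
};

/* Register a port for a tunnel type; returns -1 if the table is full. */
int tunnel_port_add(struct udp_tunnel_table *t, uint16_t port,
		    enum tunnel_type type)
{
	if (t->count >= MAX_TUNNEL_PORTS)
		return -1;
	t->entry[t->count].port = port;
	t->entry[t->count].type = type;
	t->count++;
	return 0;
}

/* What the parser needs on receive: destination port -> encapsulation. */
enum tunnel_type tunnel_port_lookup(const struct udp_tunnel_table *t,
				    uint16_t dport)
{
	size_t i;

	for (i = 0; i < t->count; i++)
		if (t->entry[i].port == dport)
			return t->entry[i].type;
	return TUNNEL_NONE;
}
```

Adding a new UDP tunnel type in this model means one more enum value and one
more tunnel_port_add() call, for example 4789 for VXLAN or 6081 for Geneve
(the IANA-assigned ports).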

+1


RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Singhai, Anjali


-----Original Message-----
From: David Miller [mailto:da...@davemloft.net] 
Sent: Sunday, November 29, 2015 7:23 PM
To: t...@herbertland.com
Cc: Brandeburg, Jesse ; Singhai, Anjali 
; je...@kernel.org; netdev@vger.kernel.org; Patil, 
Kiran 
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

From: Tom Herbert 
Date: Tue, 24 Nov 2015 09:32:11 -0800

>>
>> FWIW, I've brought the issue to the attention of the architects here, 
>> and we will likely be able to make changes in this space.  Intel 
>> hardware (as demonstrated by your patches) already is able to deal 
>> with this de-ossification on transmit.  Receive is a whole different beast.
>>
> Please provide the specifics on why "Receive is a whole different 
> beast.". Generic receive checksum is already a subset of the 
> functionality that you must have implemented to support the protocol 
> specific offloads. All the hardware needs to do is calculate the 1's 
> complement checksum of the packet and return the value to the 
> host with that packet. That's it. No parsing of headers, no worrying 
> about the pseudo header, no dealing with any encapsulation. Just do 
> the calculation, return the result to the host and the driver converts 
> this to CHECKSUM_COMPLETE. I find it very hard to believe that this is 
> any harder than specific support for the next protocol du jour.

The reason receive is different from transmit is that on the TX side the driver 
can provide the metadata, on a per-packet basis, for where the checksum field is 
and what length needs to be checksummed by the HW. On RX the HW 
parser has to parse the packet to identify the tunnel type and, based on that, 
figure out the checksum locations and lengths in the packet. So the HW 
definitely has to parse the packet, and it can parse only based on next-header-type 
information or, in the case of UDP tunnels, based on a UDP port mapping to a 
particular protocol. I am not sure why you say it doesn't need to parse the packet; 
maybe I am misunderstanding something. It is not difficult to reduce 
protocol ossification on the RX side, but it is certainly different, and 
particularly in the case of UDP tunnels it needs the port-to-protocol mapping.
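
As a point of reference for the "generic receive checksum" being argued about,
here is a minimal userspace sketch of a ones' complement sum over an opaque
buffer, in the style of RFC 1071. The function name is hypothetical and this is
not the kernel's csum code; it only shows that the computation itself needs no
header parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of a byte buffer read as big-endian 16-bit words.
 * The buffer is treated as opaque: no headers are inspected. */
uint16_t ones_complement_sum(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)                 /* odd trailing byte, padded with zero */
		sum += (uint32_t)buf[i] << 8;
	while (sum >> 16)            /* fold the carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

In the model Tom describes, the device would hand a value of this kind to the
host alongside the packet, and the stack would then validate whichever
checksums the packet actually contains.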

+1


Re: [PATCH 09/13] mm: memcontrol: generalize the socket accounting jump label

2015-11-30 Thread Johannes Weiner
On Mon, Nov 30, 2015 at 04:08:18PM -0500, Jason Baron wrote:
> We're trying to move to the updated API, so this should be:
> static_branch_unlikely(&memcg_sockets_enabled_key)
> 
> see: include/linux/jump_label.h for details.

Good point. There is another struct static_key in there as well. How
about the following on top of this series?

---
From b784aa0323628d43272e13a67ead2a2ce0e93ea6 Mon Sep 17 00:00:00 2001
From: Johannes Weiner 
Date: Mon, 30 Nov 2015 16:41:38 -0500
Subject: [PATCH] mm: memcontrol: switch to the updated jump-label API

According to <linux/jump_label.h>, the direct use of struct static_key
is deprecated. Update the socket and slab accounting code accordingly.

Reported-by: Jason Baron 
Signed-off-by: Johannes Weiner 
---
 include/linux/memcontrol.h |  8 ++++----
 mm/memcontrol.c            | 12 ++++++------
 net/ipv4/tcp_memcontrol.c  |  4 ++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a8df46c..9a19590 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -704,8 +704,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
 
 #ifdef CONFIG_INET
 struct sock;
-extern struct static_key memcg_sockets_enabled_key;
-#define mem_cgroup_sockets_enabled static_key_false(&memcg_sockets_enabled_key)
+extern struct static_key_false memcg_sockets_enabled_key;
+#define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
 void sock_update_memcg(struct sock *sk);
 void sock_release_memcg(struct sock *sk);
 bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
@@ -727,7 +727,7 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 #endif /* CONFIG_INET */
 
 #ifdef CONFIG_MEMCG_KMEM
-extern struct static_key memcg_kmem_enabled_key;
+extern struct static_key_false memcg_kmem_enabled_key;
 
 extern int memcg_nr_cache_ids;
 void memcg_get_cache_ids(void);
@@ -743,7 +743,7 @@ void memcg_put_cache_ids(void);
 
 static inline bool memcg_kmem_enabled(void)
 {
-   return static_key_false(&memcg_kmem_enabled_key);
+   return static_branch_unlikely(&memcg_kmem_enabled_key);
 }
 
 static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a0da91f..5fe45d68 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -346,7 +346,7 @@ void memcg_put_cache_ids(void)
  * conditional to this static branch, we'll have to allow modules that does
  * kmem_cache_alloc and the such to see this symbol as well
  */
-struct static_key memcg_kmem_enabled_key;
+DEFINE_STATIC_KEY_FALSE(memcg_kmem_enabled_key);
 EXPORT_SYMBOL(memcg_kmem_enabled_key);
 
 #endif /* CONFIG_MEMCG_KMEM */
@@ -2883,7 +2883,7 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg,
err = page_counter_limit(&memcg->kmem, nr_pages);
VM_BUG_ON(err);
 
-   static_key_slow_inc(&memcg_kmem_enabled_key);
+   static_branch_inc(&memcg_kmem_enabled_key);
/*
 * A memory cgroup is considered kmem-active as soon as it gets
 * kmemcg_id. Setting the id after enabling static branching will
@@ -3622,7 +3622,7 @@ static void memcg_destroy_kmem(struct mem_cgroup *memcg)
 {
if (memcg->kmem_acct_activated) {
memcg_destroy_kmem_caches(memcg);
-   static_key_slow_dec(&memcg_kmem_enabled_key);
+   static_branch_dec(&memcg_kmem_enabled_key);
WARN_ON(page_counter_read(&memcg->kmem));
}
tcp_destroy_cgroup(memcg);
@@ -4258,7 +4258,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
 
 #ifdef CONFIG_INET
if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
-   static_key_slow_inc(&memcg_sockets_enabled_key);
+   static_branch_inc(&memcg_sockets_enabled_key);
 #endif
 
/*
@@ -4302,7 +4302,7 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
memcg_destroy_kmem(memcg);
 #ifdef CONFIG_INET
if (cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_nosocket)
-   static_key_slow_dec(&memcg_sockets_enabled_key);
+   static_branch_dec(&memcg_sockets_enabled_key);
 #endif
__mem_cgroup_free(memcg);
 }
@@ -5494,7 +5494,7 @@ void mem_cgroup_replace_page(struct page *oldpage, struct page *newpage)
 
 #ifdef CONFIG_INET
 
-struct static_key memcg_sockets_enabled_key;
+DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);
 EXPORT_SYMBOL(memcg_sockets_enabled_key);
 
 void sock_update_memcg(struct sock *sk)
diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
index 9a22e2d..18bc7f7 100644
--- a/net/ipv4/tcp_memcontrol.c
+++ b/net/ipv4/tcp_memcontrol.c
@@ -34,7 +34,7 @@ void tcp_destroy_cgroup(struct mem_cgroup *memcg)
return;
 
if (memcg->tcp_mem.active)
-   static_key_slow_dec(&memcg_sockets_enabled_key);
+   static_branch_dec(&memcg_sockets_enabled_key);
 }
 
 static int 

[3.19.y-ckt stable] Patch "can: Use correct type in sizeof() in nla_put()" has been added to staging queue

2015-11-30 Thread Kamal Mostafa
This is a note to let you know that I have just added a patch titled

can: Use correct type in sizeof() in nla_put()

to the linux-3.19.y-queue branch of the 3.19.y-ckt extended stable tree 
which can be found at:

http://kernel.ubuntu.com/git/ubuntu/linux.git/log/?h=linux-3.19.y-queue

This patch is scheduled to be released in version 3.19.8-ckt11.

If you, or anyone else, feels it should not be added to this tree, please 
reply to this email.

For more information about the 3.19.y-ckt tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Kamal

--

From 6226f4073973c9a2945e3c68bf04b0f8cc0c0793 Mon Sep 17 00:00:00 2001
From: Marek Vasut 
Date: Fri, 30 Oct 2015 13:48:19 +0100
Subject: can: Use correct type in sizeof() in nla_put()

commit 562b103a21974c2f9cd67514d110f918bb3e1796 upstream.

The sizeof() is invoked on an incorrect variable, likely due to some
copy-paste error, and this might result in memory corruption. Fix this.

Signed-off-by: Marek Vasut 
Cc: Wolfgang Grandegger 
Cc: netdev@vger.kernel.org
Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/can/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 62ca0e8..8202ab3 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -912,7 +912,7 @@ static int can_fill_info(struct sk_buff *skb, const struct net_device *dev)
 nla_put(skb, IFLA_CAN_BITTIMING_CONST,
 sizeof(*priv->bittiming_const), priv->bittiming_const)) ||

-   nla_put(skb, IFLA_CAN_CLOCK, sizeof(cm), &priv->clock) ||
+   nla_put(skb, IFLA_CAN_CLOCK, sizeof(priv->clock), &priv->clock) ||
nla_put_u32(skb, IFLA_CAN_STATE, state) ||
nla_put(skb, IFLA_CAN_CTRLMODE, sizeof(cm), &cm) ||
nla_put_u32(skb, IFLA_CAN_RESTART_MS, priv->restart_ms) ||
--
1.9.1
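
The class of bug fixed here, taking sizeof() from a different object than the
one being copied, can be shown with a small userspace sketch. The struct
layouts and function names below are hypothetical stand-ins chosen so the two
sizes differ; they are not the CAN driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the only point is that their sizes differ. */
struct ctrl_mode { uint32_t mask; uint32_t flags; };
struct clock_info { uint32_t freq; };

/* Copy 'len' bytes of 'data' into 'dst', as an attribute writer would.
 * Returns the number of bytes written. */
size_t put_attr(uint8_t *dst, size_t dst_len, const void *data, size_t len)
{
	assert(len <= dst_len);
	memcpy(dst, data, len);
	return len;
}

/* Correct pattern: the length comes from the object actually copied,
 * so it cannot drift out of sync with the data pointer. */
size_t put_clock(uint8_t *dst, size_t dst_len, const struct clock_info *clk)
{
	return put_attr(dst, dst_len, clk, sizeof(*clk));
}
```

Had put_clock() passed sizeof(struct ctrl_mode) instead, put_attr() would read
past the end of the clock_info object, which is the kind of over-read the
commit message warns about.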



[PATCH net-next v3 12/17] net: 8021q: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 net/8021q/vlan_dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index fded865..e607fee 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -620,12 +620,12 @@ static netdev_features_t vlan_dev_fix_features(struct net_device *dev,
return features;
 }
 
-static int vlan_ethtool_get_settings(struct net_device *dev,
-struct ethtool_cmd *cmd)
+static int vlan_ethtool_get_ksettings(struct net_device *dev,
+ struct ethtool_ksettings *cmd)
 {
const struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
 
-   return __ethtool_get_settings(vlan->real_dev, cmd);
+   return __ethtool_get_ksettings(vlan->real_dev, cmd);
 }
 
 static void vlan_ethtool_get_drvinfo(struct net_device *dev,
@@ -740,7 +740,7 @@ static int vlan_dev_get_iflink(const struct net_device *dev)
 }
 
 static const struct ethtool_ops vlan_ethtool_ops = {
-   .get_settings   = vlan_ethtool_get_settings,
+   .get_ksettings  = vlan_ethtool_get_ksettings,
.get_drvinfo= vlan_ethtool_get_drvinfo,
.get_link   = ethtool_op_get_link,
.get_ts_info= vlan_ethtool_get_ts_info,
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 07/17] net: ipvlan: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/net/ipvlan/ipvlan_main.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index a9268db..63b3aa5 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -346,12 +346,12 @@ static const struct header_ops ipvlan_header_ops = {
.cache_update   = eth_header_cache_update,
 };
 
-static int ipvlan_ethtool_get_settings(struct net_device *dev,
-  struct ethtool_cmd *cmd)
+static int ipvlan_ethtool_get_ksettings(struct net_device *dev,
+   struct ethtool_ksettings *cmd)
 {
const struct ipvl_dev *ipvlan = netdev_priv(dev);
 
-   return __ethtool_get_settings(ipvlan->phy_dev, cmd);
+   return __ethtool_get_ksettings(ipvlan->phy_dev, cmd);
 }
 
 static void ipvlan_ethtool_get_drvinfo(struct net_device *dev,
@@ -377,7 +377,7 @@ static void ipvlan_ethtool_set_msglevel(struct net_device *dev, u32 value)
 
 static const struct ethtool_ops ipvlan_ethtool_ops = {
.get_link   = ethtool_op_get_link,
-   .get_settings   = ipvlan_ethtool_get_settings,
+   .get_ksettings  = ipvlan_ethtool_get_ksettings,
.get_drvinfo= ipvlan_ethtool_get_drvinfo,
.get_msglevel   = ipvlan_ethtool_get_msglevel,
.set_msglevel   = ipvlan_ethtool_set_msglevel,
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH] ip neigh: device is optional for proxy entries

2015-11-30 Thread Konstantin Khlebnikov
Though dumping such entries crashes present kernels.

Signed-off-by: Konstantin Khlebnikov 
---
 ip/ipneigh.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/ip/ipneigh.c b/ip/ipneigh.c
index 54655842ed38..92b7cd6f2a75 100644
--- a/ip/ipneigh.c
+++ b/ip/ipneigh.c
@@ -100,8 +100,9 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
struct ndmsg ndm;
char buf[256];
} req;
-   char  *d = NULL;
+   char  *dev = NULL;
int dst_ok = 0;
+   int dev_ok = 0;
int lladdr_ok = 0;
char * lla = NULL;
inet_prefix dst;
@@ -135,10 +136,12 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
duparg("address", *argv);
get_addr(&dst, *argv, preferred_family);
dst_ok = 1;
+   dev_ok = 1;
req.ndm.ndm_flags |= NTF_PROXY;
} else if (strcmp(*argv, "dev") == 0) {
NEXT_ARG();
-   d = *argv;
+   dev = *argv;
+   dev_ok = 1;
} else {
if (strcmp(*argv, "to") == 0) {
NEXT_ARG();
@@ -153,7 +156,7 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
}
argc--; argv++;
}
-   if (d == NULL || !dst_ok || dst.family == AF_UNSPEC) {
+   if (!dev_ok || !dst_ok || dst.family == AF_UNSPEC) {
fprintf(stderr, "Device and destination are required arguments.\n");
exit(-1);
}
@@ -175,8 +178,8 @@ static int ipneigh_modify(int cmd, int flags, int argc, char **argv)
 
ll_init_map();
 
-   if ((req.ndm.ndm_ifindex = ll_name_to_index(d)) == 0) {
-   fprintf(stderr, "Cannot find device \"%s\"\n", d);
+   if (dev && (req.ndm.ndm_ifindex = ll_name_to_index(dev)) == 0) {
+   fprintf(stderr, "Cannot find device \"%s\"\n", dev);
return -1;
}
 



[PATCH net-next v3 00/17] RFC: new ETHTOOL_GSETTINGS/SSETTINGS API

2015-11-30 Thread David Decotigny
From: David Decotigny 


History:
 v3
 - rebased v2 on top of latest net-next, minor checkpatch/printf %*pb
   updates
 v2
 - keep return 0 in get_settings when successful, instead of
   propagating positive result from driver's get_settings callback.
 v1
 - original submission


The main goal of this series is to support ethtool link mode masks
larger than 32 bits. It implements a new ioctl pair
(ETHTOOL_GSETTINGS/SSETTINGS), its associated callbacks
(get/set_settings) and a new struct ethtool_settings, which should
eventually replace legacy ethtool_cmd. Internally, the kernel uses
fixed length link mode masks defined at compilation time in ethtool.h
(for now: 31 bits), that can be increased by changing
__ETHTOOL_LINK_MODE_LAST in ethtool.h (absolute max is 4064 bits,
checked at compile time), and the user/kernel interface allows this
length to be arbitrary within 1..4064. This should allow some
flexibility without using too much malloc/stack space, at the cost of
a small kernel/user handshake for the user to determine the sizes of
those bitmaps.
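
To make the "link mode masks larger than 32 bits" idea concrete, here is a
minimal userspace sketch of a fixed-width mask stored in 32-bit words. The
names and the 96-bit width are assumptions for illustration; they are not the
definitions from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINK_MODE_NBITS  96u   /* hypothetical compile-time width */
#define LINK_MODE_NWORDS ((LINK_MODE_NBITS + 31u) / 32u)

struct link_mode_mask {
	uint32_t mask[LINK_MODE_NWORDS];
};

/* Clear every bit of the mask. */
void link_mode_zero(struct link_mode_mask *m)
{
	memset(m, 0, sizeof(*m));
}

/* Set bit 'bit'; the word index is bit / 32, the position is bit % 32. */
void link_mode_set(struct link_mode_mask *m, unsigned int bit)
{
	assert(bit < LINK_MODE_NBITS);
	m->mask[bit / 32u] |= UINT32_C(1) << (bit % 32u);
}

/* Return true if bit 'bit' is set. */
bool link_mode_test(const struct link_mode_mask *m, unsigned int bit)
{
	assert(bit < LINK_MODE_NBITS);
	return (m->mask[bit / 32u] >> (bit % 32u)) & 1u;
}
```

Raising LINK_MODE_NBITS grows the array at compile time without changing any
caller, which is the property the cover letter relies on when it says the
width can be increased in ethtool.h.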

Along the way, I chose to drop in the new structure the 3 ethtool_cmd
fields marked "deprecated" (transceiver/maxrxpkt/maxtxpkt). They are
still available for old drivers via the old ETHTOOL_GSET/SSET API, but
are not available to drivers that switch to new API. Of those 3
fields, ethtool_cmd::transceiver seems to be still actively used by
several drivers, maybe we should not consider this field deprecated?
The 2 other fields are basically not used. This transition requires
some care in the way old and new ethtool talk to the kernel.

More technical details provided in the description for main patch. In
particular details about backward compatibility properties.

Some questions for people more expert than me:
 - the kernel/interface multiplexes the "tell me the bitmap length"
   handshake and the "give me the settings" inside the new
   ETHTOOL_GSETTINGS cmd. I was thinking of making this into 2
   separate cmds: 1 cmd ETHTOOL_GKERNELPROPERTIES which would be
   kernel-wide rather than device-specific, would return properties
   like "length of the link mode bitmaps", and possibly others. And
   ETHTOOL_GSETTINGS would expect the proper bitmaps
 - the link mode bitmaps are piggybacked at tail of the new struct
   ethtool_settings. Since its user-visible definition does not assume
   specific bitmap width, I am using a 0-length array as the publicly
   visible placeholder. But then, the kernel needs to specialize it
   (struct ethtool_ksettings) to specify its current link mode
   masks. This means that kernel code is "littered" with
   "ksettings->parent.field" to access "field" inside
   ethtool_settings:
   + I don't like the field name "parent", any suggestion welcome
   + and/or: I could use ethtool_settings everywhere (instead of a new
 ethtool_ksettings) and an accessor to retrieve the link mode
 masks?
   + or: we could decide to make the link mode masks statically
 bounded again, ie. make their width public, but larger than
 current 32, and unchangeable forever. This would make everything
 straightforward, but we might hit limits later, or have an
 unneeded memory/stack usage for unused bits.
   any preference?
 - crossing user/kernel boundary requires conversion of the kernel
   bitmaps (unsigned long[]) to something more strict (in my case:
   u32) to accommodate 32/64 compat. Maybe I should add a
   copy_bitmap_from_user/copy_bitmap_to_user API inside bitmap.h
   instead of defining my own in ethtool.c?
 - I am using a typedef struct (ethtool_link_mode_mask_t) to build and
   hold the new masks. Makes it handy to use in the drivers (see mlx4
   for an example). Not very nice.
 - I foresee bugs where people use the legacy/deprecated SUPPORTED_x
   macros instead of the new ETHTOOL_LINK_MODE_x_BIT enums in the new
   get/set_ksettings callbacks. Not sure how to prevent problems with
   this.
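
The unsigned long[] to u32 conversion raised in the questions above could look
roughly like the following userspace sketch. The function name is hypothetical
and this is not the helper from the series; it copies bit by bit so that it
behaves the same whether unsigned long is 32 or 64 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Copy the low 'nbits' bits of an unsigned long bitmap into u32 words:
 * bit i of the source becomes bit (i % 32) of dst[i / 32]. */
void bitmap_to_u32(uint32_t *dst, const unsigned long *src, size_t nbits)
{
	size_t nwords = (nbits + 31) / 32;

	assert(dst != NULL && src != NULL);
	for (size_t w = 0; w < nwords; w++)
		dst[w] = 0;
	for (size_t i = 0; i < nbits; i++) {
		unsigned long word = src[i / BITS_PER_ULONG];

		if ((word >> (i % BITS_PER_ULONG)) & 1ul)
			dst[i / 32] |= UINT32_C(1) << (i % 32);
	}
}
```

A word-at-a-time copy would be faster; the per-bit loop is used here only
because it makes the 32/64-bit independence easy to see.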

The only driver which was converted for now is mlx4. I am not
considering fcoe as fully converted, but I updated it minimally to be
able to remove __ethtool_get_settings, now known as
__ethtool_get_ksettings.

Tested with legacy and "future" ethtool on 64b x86 kernel and 32+64b
ethtool, and on a 32b x86 kernel + 32b ethtool.


# Patch Set Summary:

David Decotigny (17):
  net: usnic: remove unused call to ethtool_ops::get_settings
  net: usnic: use __ethtool_get_settings
  net: ethtool: add new ETHTOOL_GSETTINGS/SSETTINGS API
  tx4939: use __ethtool_get_ksettings
  net: usnic: use __ethtool_get_ksettings
  net: bonding: use __ethtool_get_ksettings
  net: ipvlan: use __ethtool_get_ksettings
  net: macvlan: use __ethtool_get_ksettings
  net: team: use __ethtool_get_ksettings
  net: fcoe: use __ethtool_get_ksettings
  net: rdma: use __ethtool_get_ksettings
  net: 8021q: use __ethtool_get_ksettings
  net: bridge: use __ethtool_get_ksettings
  net: core: use __ethtool_get_ksettings
  

RE: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Singhai, Anjali


-Original Message-
From: Tom Herbert [mailto:t...@herbertland.com] 
Sent: Monday, November 30, 2015 8:36 AM
To: Singhai, Anjali 
Cc: Linux Kernel Network Developers ; Jesse Gross 
; Patil, Kiran 
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

On Mon, Nov 23, 2015 at 1:02 PM, Anjali Singhai Jain  
wrote:
> Replace add/del ndo ops for vxlan_port with tunnel_port so that all 
> UDP based tunnels can use the same ndo op. Add a parameter to pass 
> tunnel type to the ndo_op.
>
Please consider using RX ntuple filters for this instead of a new ndo op. The 
vxlan ndo op essentially implements a limited filter with a rule to match a 
destination UDP port and the action of processing the packet as vxlan. 
ntuple filters generalize that so that the filtering becomes arbitrary. We'll 
need the ability to filter on the 4-tuple when we implement tunnels to go through 
firewalls or for offloading other UDP protocols such as SPUD or QUIC.

Tom

- Tom, I am not sure I agree with this suggestion. The easiest way to let the 
hardware know about the port-to-protocol mapping for UDP-based tunnels is 
when we add UDP offloads (GRO etc.) for the ports in the stack. This way the 
user gets the benefit of tunnel offloads from HW that supports them without 
having to do any extra filter setup from ethtool. Just as with IP/TCP/UDP 
checksum and TSO support, the user does not have to turn this on specifically 
if they plan to use those protocols (of course they can turn it off). Besides, 
these are not true filters in that sense: they are not used to steer packets to 
any particular destination in this case, but rather to identify packets for 
checksum and TSO purposes.
I agree with your patch series that reduces protocol ossification of the 
stack and driver interface. My point is that this set of patches helps with that 
goal and does not really hurt, because any new tunnel support would mean no 
change in the interface, just a new type in the enum; the drivers can then 
decide to do the setup in the HW based on this new type without ever 
having to touch the interface. So please explain to me why this causes 
protocol ossification, because I don't believe it does. I think the ntuple 
interface should remain for filters that are used to route 
packets or drop them, not for packet identification and checksum offload support.

> Change all drivers to use the generalized udp tunnel offload
>
> Patch was compile tested with x86_64_defconfig.
>
> Signed-off-by: Kiran Patil 
> Signed-off-by: Anjali Singhai Jain 
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 15 ++---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c| 13 +---
>  drivers/net/ethernet/emulex/benet/be_main.c  | 14 +---
>  drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 27 
>  drivers/net/ethernet/intel/i40e/i40e_main.c  | 41 
> +---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c| 17 +++---
>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c   | 21 
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 17 +++---
>  drivers/net/vxlan.c  | 23 +++--
>  include/linux/netdevice.h| 34 ++--
>  include/net/udp_tunnel.h |  6 
>  11 files changed, 157 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index 2273576..ad2782f 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -47,6 +47,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -10124,11 +10125,14 @@ static void __bnx2x_add_vxlan_port(struct 
> bnx2x *bp, u16 port)  }
>
>  static void bnx2x_add_vxlan_port(struct net_device *netdev,
> -sa_family_t sa_family, __be16 port)
> +sa_family_t sa_family, __be16 port,
> +u32 type)
>  {
> struct bnx2x *bp = netdev_priv(netdev);
> u16 t_port = ntohs(port);
>
> +   if (type != UDP_TUNNEL_VXLAN)
> +   return;
> __bnx2x_add_vxlan_port(bp, t_port);  }
>
> @@ -10152,11 +10156,14 @@ static void __bnx2x_del_vxlan_port(struct 
> bnx2x *bp, u16 port)  }
>
>  static void bnx2x_del_vxlan_port(struct net_device *netdev,
> -sa_family_t sa_family, __be16 port)
> +sa_family_t sa_family, __be16 port,
> +u32 type)
>  {
> struct bnx2x *bp = netdev_priv(netdev);
> u16 t_port = ntohs(port);
>
> +   if (type != 

[PATCH] net/neighbour: fix crash at dumping device-agnostic proxy entries

2015-11-30 Thread Konstantin Khlebnikov
Proxy entries could have null pointer to net-device.

Signed-off-by: Konstantin Khlebnikov 
Fixes: 84920c1420e2 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
Cc:  # v3.4
---
 net/core/neighbour.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index e6af42da28d9..f18ae91b652e 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2215,7 +2215,7 @@ static int pneigh_fill_info(struct sk_buff *skb, struct pneigh_entry *pn,
ndm->ndm_pad2= 0;
ndm->ndm_flags   = pn->flags | NTF_PROXY;
ndm->ndm_type= RTN_UNICAST;
-   ndm->ndm_ifindex = pn->dev->ifindex;
+   ndm->ndm_ifindex = pn->dev ? pn->dev->ifindex : 0;
ndm->ndm_state   = NUD_NONE;
 
if (nla_put(skb, NDA_DST, tbl->key_len, pn->key))
@@ -2333,7 +2333,7 @@ static int pneigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
if (h > s_h)
s_idx = 0;
for (n = tbl->phash_buckets[h], idx = 0; n; n = n->next) {
-   if (dev_net(n->dev) != net)
+   if (pneigh_net(n) != net)
continue;
if (idx < s_idx)
goto next;



[PATCH net-next v3 11/17] net: rdma: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 include/rdma/ib_addr.h | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h
index 1152859..1820f26 100644
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -254,24 +254,22 @@ static inline enum ib_mtu iboe_get_mtu(int mtu)
 
 static inline int iboe_get_rate(struct net_device *dev)
 {
-   struct ethtool_cmd cmd;
-   u32 speed;
+   struct ethtool_ksettings cmd;
int err;
 
rtnl_lock();
-   err = __ethtool_get_settings(dev, &cmd);
+   err = __ethtool_get_ksettings(dev, &cmd);
rtnl_unlock();
if (err)
return IB_RATE_PORT_CURRENT;
 
-   speed = ethtool_cmd_speed(&cmd);
-   if (speed >= 40000)
+   if (cmd.parent.speed >= 40000)
return IB_RATE_40_GBPS;
-   else if (speed >= 30000)
+   else if (cmd.parent.speed >= 30000)
return IB_RATE_30_GBPS;
-   else if (speed >= 20000)
+   else if (cmd.parent.speed >= 20000)
return IB_RATE_20_GBPS;
-   else if (speed >= 10000)
+   else if (cmd.parent.speed >= 10000)
return IB_RATE_10_GBPS;
else
return IB_RATE_PORT_CURRENT;
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 09/17] net: team: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/net/team/team.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 651d35e..288ca01 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -2776,12 +2776,12 @@ static void __team_port_change_send(struct team_port *port, bool linkup)
port->state.linkup = linkup;
team_refresh_port_linkup(port);
if (linkup) {
-   struct ethtool_cmd ecmd;
+   struct ethtool_ksettings ecmd;
 
-   err = __ethtool_get_settings(port->dev, &ecmd);
+   err = __ethtool_get_ksettings(port->dev, &ecmd);
if (!err) {
-   port->state.speed = ethtool_cmd_speed(&ecmd);
-   port->state.duplex = ecmd.duplex;
+   port->state.speed = ecmd.parent.speed;
+   port->state.duplex = ecmd.parent.duplex;
goto send_event;
}
}
-- 
2.6.0.rc2.230.g3dd15c0



Re: [PATCH net] bridge: Only call /sbin/bridge-stp for the initial network namespace

2015-11-30 Thread Stephen Hemminger
On Mon, 30 Nov 2015 15:38:15 -0600
ebied...@xmission.com (Eric W. Biederman) wrote:

> 
> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
> 
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the same
> network device could cause problems.
> 
> Signed-off-by: "Eric W. Biederman" 
> ---
>  net/bridge/br_stp_if.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 5396ff08af32..742fa89528ab 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -142,7 +142,9 @@ static void br_stp_start(struct net_bridge *br)
>   char *envp[] = { NULL };
>   struct net_bridge_port *p;
>  
> - r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
> + r = -ENOENT;
> + if (dev_net(br->dev) == &init_net)
> + r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);

I don't think this will cause loud screams.
But it might break people that use containers to run virtual networks for 
testing.

One coding nit:
Why are you afraid of using an else?
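
For comparison, the if/else shape hinted at above might look like this
userspace sketch. The helper and its call counter are hypothetical stand-ins
for call_usermodehelper(); the sketch only models the control flow, not the
bridge code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the usermode helper; counts invocations. */
static int helper_calls;

static int run_stp_helper(void)
{
	helper_calls++;
	return 0;
}

/* Returns the helper's result in the initial namespace, -ENOENT elsewhere. */
int stp_start_result(bool in_initial_netns)
{
	int r;

	if (in_initial_netns)
		r = run_stp_helper();
	else
		r = -ENOENT;
	return r;
}
```

Both shapes produce the same result; the else form just keeps the two outcomes
side by side instead of pre-assigning the error value.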



[PATCH net] bridge: Only call /sbin/bridge-stp for the initial network namespace

2015-11-30 Thread Eric W. Biederman

There is no defined mechanism to pass network namespace information
into /sbin/bridge-stp therefore don't even try to invoke it except
for bridge devices in the initial network namespace.

It is possible for unprivileged users to cause /sbin/bridge-stp to be
invoked for any network device name which if /sbin/bridge-stp does not
guard against unreasonable arguments or being invoked twice on the same
network device could cause problems.

Signed-off-by: "Eric W. Biederman" 
---
 net/bridge/br_stp_if.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 5396ff08af32..742fa89528ab 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -142,7 +142,9 @@ static void br_stp_start(struct net_bridge *br)
char *envp[] = { NULL };
struct net_bridge_port *p;
 
-   r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
+   r = -ENOENT;
+   if (dev_net(br->dev) == &init_net)
+   r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
 
spin_lock_bh(&br->lock);
 
-- 
2.2.1



[PATCH net-next v3 08/17] net: macvlan: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/net/macvlan.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 06c8bfe..a95b793 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -940,12 +940,12 @@ static void macvlan_ethtool_get_drvinfo(struct net_device *dev,
strlcpy(drvinfo->version, "0.1", sizeof(drvinfo->version));
 }
 
-static int macvlan_ethtool_get_settings(struct net_device *dev,
-   struct ethtool_cmd *cmd)
+static int macvlan_ethtool_get_ksettings(struct net_device *dev,
+struct ethtool_ksettings *cmd)
 {
const struct macvlan_dev *vlan = netdev_priv(dev);
 
-   return __ethtool_get_settings(vlan->lowerdev, cmd);
+   return __ethtool_get_ksettings(vlan->lowerdev, cmd);
 }
 
 static netdev_features_t macvlan_fix_features(struct net_device *dev,
@@ -1020,7 +1020,7 @@ static int macvlan_dev_get_iflink(const struct net_device *dev)
 
 static const struct ethtool_ops macvlan_ethtool_ops = {
.get_link   = ethtool_op_get_link,
-   .get_settings   = macvlan_ethtool_get_settings,
+   .get_ksettings  = macvlan_ethtool_get_ksettings,
.get_drvinfo= macvlan_ethtool_get_drvinfo,
 };
 
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 10/17] net: fcoe: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/scsi/fcoe/fcoe_transport.c | 36 
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/fcoe/fcoe_transport.c b/drivers/scsi/fcoe/fcoe_transport.c
index d7597c0..9049197 100644
--- a/drivers/scsi/fcoe/fcoe_transport.c
+++ b/drivers/scsi/fcoe/fcoe_transport.c
@@ -93,36 +93,40 @@ static struct notifier_block libfcoe_notifier = {
 int fcoe_link_speed_update(struct fc_lport *lport)
 {
struct net_device *netdev = fcoe_get_netdev(lport);
-   struct ethtool_cmd ecmd;
+   struct ethtool_ksettings ecmd;
 
-   if (!__ethtool_get_settings(netdev, &ecmd)) {
+   if (!__ethtool_get_ksettings(netdev, &ecmd)) {
lport->link_supported_speeds &= ~(FC_PORTSPEED_1GBIT  |
  FC_PORTSPEED_10GBIT |
  FC_PORTSPEED_20GBIT |
  FC_PORTSPEED_40GBIT);
 
-   if (ecmd.supported & (SUPPORTED_1000baseT_Half |
- SUPPORTED_1000baseT_Full |
- SUPPORTED_1000baseKX_Full))
+   if (ecmd.link_modes.supported.mask[0] & (
+   SUPPORTED_1000baseT_Half |
+   SUPPORTED_1000baseT_Full |
+   SUPPORTED_1000baseKX_Full))
lport->link_supported_speeds |= FC_PORTSPEED_1GBIT;
 
-   if (ecmd.supported & (SUPPORTED_10000baseT_Full   |
- SUPPORTED_10000baseKX4_Full |
- SUPPORTED_10000baseKR_Full  |
- SUPPORTED_10000baseR_FEC))
+   if (ecmd.link_modes.supported.mask[0] & (
+   SUPPORTED_10000baseT_Full   |
+   SUPPORTED_10000baseKX4_Full |
+   SUPPORTED_10000baseKR_Full  |
+   SUPPORTED_10000baseR_FEC))
lport->link_supported_speeds |= FC_PORTSPEED_10GBIT;
 
-   if (ecmd.supported & (SUPPORTED_20000baseMLD2_Full |
- SUPPORTED_20000baseKR2_Full))
+   if (ecmd.link_modes.supported.mask[0] & (
+   SUPPORTED_20000baseMLD2_Full |
+   SUPPORTED_20000baseKR2_Full))
lport->link_supported_speeds |= FC_PORTSPEED_20GBIT;
 
-   if (ecmd.supported & (SUPPORTED_40000baseKR4_Full |
- SUPPORTED_40000baseCR4_Full |
- SUPPORTED_40000baseSR4_Full |
- SUPPORTED_40000baseLR4_Full))
+   if (ecmd.link_modes.supported.mask[0] & (
+   SUPPORTED_40000baseKR4_Full |
+   SUPPORTED_40000baseCR4_Full |
+   SUPPORTED_40000baseSR4_Full |
+   SUPPORTED_40000baseLR4_Full))
lport->link_supported_speeds |= FC_PORTSPEED_40GBIT;
 
-   switch (ethtool_cmd_speed(&ecmd)) {
+   switch (ecmd.parent.speed) {
case SPEED_1000:
lport->link_speed = FC_PORTSPEED_1GBIT;
break;
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next] sfc: use ALIGN macro for aligning frame sizes

2015-11-30 Thread Jarod Wilson
Don't open-code it.

CC: Solarflare linux maintainers 
CC: Shradha Shah 
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson 
---
 drivers/net/ethernet/sfc/net_driver.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index a8ddd12..746d591 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -1502,8 +1502,9 @@ static inline struct efx_rx_buffer *efx_rx_buffer(struct efx_rx_queue *rx_queue,
  * same cycle, the XMAC can miss the IPG altogether.  We work around
  * this by adding a further 16 bytes.
  */
+#define EFX_FRAME_PAD  16
 #define EFX_MAX_FRAME_LEN(mtu) \
-   ((((mtu) + ETH_HLEN + VLAN_HLEN + 4/* FCS */ + 7) & ~7) + 16)
+   (ALIGN(((mtu) + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN + EFX_FRAME_PAD), 8))
 
 static inline bool efx_xmit_with_hwtstamp(struct sk_buff *skb)
 {
-- 
1.8.3.1
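
A quick sanity check of the macro change above, as a sketch only. The old form rounds up to a multiple of 8 and then adds 16, while the new form adds the 16 bytes of padding first and rounds afterwards. Because 16 is itself a multiple of 8, both should give the same result. ALIGN_UP below is a local stand-in for the kernel's ALIGN(), the header lengths are written out as plain numbers (assumed: ETH_HLEN 14, VLAN_HLEN 4, ETH_FCS_LEN 4), and the function names are made up for illustration; none of this is the driver's code.

```c
#include <assert.h>

/* Stand-ins for kernel definitions (assumed values); ALIGN_UP mimics
 * the kernel's ALIGN() for a power-of-two alignment. */
#define ETH_HLEN 14
#define VLAN_HLEN 4
#define ETH_FCS_LEN 4
#define EFX_FRAME_PAD 16
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((unsigned int)(a) - 1))

/* Open-coded form removed by the patch: round up to 8, then add 16. */
static unsigned int max_frame_len_old(unsigned int mtu)
{
	return ((mtu + ETH_HLEN + VLAN_HLEN + 4 /* FCS */ + 7) & ~7u) + 16;
}

/* ALIGN-based form added by the patch: add the pad, then round up to 8. */
static unsigned int max_frame_len_new(unsigned int mtu)
{
	return ALIGN_UP(mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN + EFX_FRAME_PAD, 8);
}

/* Returns 1 if both forms agree for every MTU from 0 to max_mtu. */
static int frame_len_forms_agree(unsigned int max_mtu)
{
	unsigned int mtu;

	for (mtu = 0; mtu <= max_mtu; mtu++)
		if (max_frame_len_old(mtu) != max_frame_len_new(mtu))
			return 0;
	return 1;
}
```

For an MTU of 1500 both forms come out at 1544 bytes.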



[PATCH net-next v3 15/17] net: ethtool: remove unused __ethtool_get_settings

2015-11-30 Thread David Decotigny
From: David Decotigny 

replaced by __ethtool_get_ksettings.

Signed-off-by: David Decotigny 
---
 include/linux/ethtool.h |  4 
 net/core/ethtool.c  | 49 ++---
 2 files changed, 14 insertions(+), 39 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 6de122d..7de2dc7 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -161,10 +161,6 @@ __ethtool_add_link_modes(ethtool_link_mode_mask_t *dst,
 extern int __ethtool_get_ksettings(struct net_device *dev,
   struct ethtool_ksettings *ksettings);
 
-/* DEPRECATED, use __ethtool_get_ksettings */
-extern int __ethtool_get_settings(struct net_device *dev,
- struct ethtool_cmd *cmd);
-
 /**
  * struct ethtool_ops - optional netdev operations
  * @get_settings: DEPRECATED, use %get_ksettings/%set_ksettings
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 4563f95..b67f079 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -499,15 +499,16 @@ int __ethtool_get_ksettings(struct net_device *dev,
return dev->ethtool_ops->get_ksettings(dev, ksettings);
}
 
-   /* TODO: remove what follows when ethtool_ops::get_settings
-* disappears internally
-*/
-
/* driver doesn't support %ethtool_ksettings API. revert to
 * legacy %ethtool_cmd API, unless it's not supported either.
 * TODO: remove when ethtool_ops::get_settings disappears internally
 */
-   err = __ethtool_get_settings(dev, &cmd);
+   if (!dev->ethtool_ops->get_settings)
+   return -EOPNOTSUPP;
+
+   memset(&cmd, 0, sizeof(cmd));
+   cmd.cmd = ETHTOOL_GSET;
+   err = dev->ethtool_ops->get_settings(dev, &cmd);
if (err < 0)
return err;
 
@@ -723,30 +724,6 @@ static int ethtool_set_ksettings(struct net_device *dev, void __user *useraddr)
return dev->ethtool_ops->set_ksettings(dev, );
 }
 
-/* Internal kernel helper to query a device ethtool_cmd settings.
- *
- * Note about transition to ethtool_settings API: We do not need (or
- * want) this function to support "dev" instances that implement the
- * ethtool_settings API as we will update the drivers calling this
- * function to call __ethtool_get_ksettings instead, before the first
- * drivers implement ethtool_ops::get_ksettings.
- *
- * TODO 1: at least make this function static when no driver is using it
- * TODO 2: remove when ethtool_ops::get_settings disappears internally
- */
-int __ethtool_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
-{
-   ASSERT_RTNL();
-
-   if (!dev->ethtool_ops->get_settings)
-   return -EOPNOTSUPP;
-
-   memset(cmd, 0, sizeof(struct ethtool_cmd));
-   cmd->cmd = ETHTOOL_GSET;
-   return dev->ethtool_ops->get_settings(dev, cmd);
-}
-EXPORT_SYMBOL(__ethtool_get_settings);
-
 /* Query device for its ethtool_cmd settings.
  *
  * Backward compatibility note: for compatibility with legacy ethtool,
@@ -788,16 +765,18 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
/* send a sensible cmd tag back to user */
cmd.cmd = ETHTOOL_GSET;
} else {
-   int err;
-   /* TODO: return -EOPNOTSUPP when
-* ethtool_ops::get_settings disappears internally
-*/
-
/* driver doesn't support %ethtool_ksettings
 * API. revert to legacy %ethtool_cmd API, unless it's
 * not supported either.
 */
-   err = __ethtool_get_settings(dev, &cmd);
+   int err;
+
+   if (!dev->ethtool_ops->get_settings)
+   return -EOPNOTSUPP;
+
+   memset(&cmd, 0, sizeof(cmd));
+   cmd.cmd = ETHTOOL_GSET;
+   err = dev->ethtool_ops->get_settings(dev, &cmd);
if (err < 0)
return err;
}
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 16/17] net: mlx4: convenience predicate for debug messages

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 35de7d2..b04054d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -740,9 +740,11 @@ __printf(3, 4)
 void en_print(const char *level, const struct mlx4_en_priv *priv,
  const char *format, ...);
 
+#define en_dbg_enabled(mlevel, priv)   \
+   (NETIF_MSG_##mlevel & (priv)->msg_enable)
 #define en_dbg(mlevel, priv, format, ...)  \
 do {   \
-   if (NETIF_MSG_##mlevel & (priv)->msg_enable)\
+   if (en_dbg_enabled(mlevel, priv))   \
en_print(KERN_DEBUG, priv, format, ##__VA_ARGS__);  \
 } while (0)
 #define en_warn(priv, format, ...) \
-- 
2.6.0.rc2.230.g3dd15c0
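
The patch above carries no description, so the following is only a guess at the intent: splitting the level test out of en_dbg() lets a caller check whether a debug level is on before doing work that only matters for debug output. A minimal sketch of the same token-pasting pattern, with made-up DEMO_* flags and a made-up struct standing in for NETIF_MSG_* and struct mlx4_en_priv:

```c
#include <assert.h>

/* Hypothetical flag values and struct; the real NETIF_MSG_* bits and
 * struct mlx4_en_priv live in kernel headers. */
#define DEMO_MSG_DRV  0x0001
#define DEMO_MSG_LINK 0x0004

struct demo_priv {
	unsigned int msg_enable;
};

/* Same shape as en_dbg_enabled(): paste the level name onto a prefix
 * and test that bit in the per-device mask. */
#define demo_dbg_enabled(mlevel, priv) \
	(DEMO_MSG_##mlevel & (priv)->msg_enable)

/* A caller can skip debug-only work when the level is off. */
static int demo_count_link_debug(const struct demo_priv *priv)
{
	int emitted = 0;

	if (demo_dbg_enabled(LINK, priv))
		emitted++;	/* stand-in for building and printing a message */
	return emitted;
}
```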



[PATCH net-next v3 13/17] net: bridge: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 net/bridge/br_if.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index ec02f58..e6de008 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -36,10 +36,10 @@
  */
 static int port_cost(struct net_device *dev)
 {
-   struct ethtool_cmd ecmd;
+   struct ethtool_ksettings ecmd;
 
-   if (!__ethtool_get_settings(dev, &ecmd)) {
-   switch (ethtool_cmd_speed(&ecmd)) {
+   if (!__ethtool_get_ksettings(dev, &ecmd)) {
+   switch (ecmd.parent.speed) {
case SPEED_10000:
return 2;
case SPEED_1000:
-- 
2.6.0.rc2.230.g3dd15c0



[PATCH net-next v3 14/17] net: core: use __ethtool_get_ksettings

2015-11-30 Thread David Decotigny
From: David Decotigny 

Signed-off-by: David Decotigny 
---
 net/core/net-sysfs.c   | 15 +--
 net/packet/af_packet.c | 11 +--
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index f88a62a..3dd4bb1 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -199,9 +199,10 @@ static ssize_t speed_show(struct device *dev,
return restart_syscall();
 
if (netif_running(netdev)) {
-   struct ethtool_cmd cmd;
-   if (!__ethtool_get_settings(netdev, &cmd))
-   ret = sprintf(buf, fmt_dec, ethtool_cmd_speed(&cmd));
+   struct ethtool_ksettings cmd;
+
+   if (!__ethtool_get_ksettings(netdev, &cmd))
+   ret = sprintf(buf, fmt_dec, cmd.parent.speed);
}
rtnl_unlock();
return ret;
@@ -218,10 +219,12 @@ static ssize_t duplex_show(struct device *dev,
return restart_syscall();
 
if (netif_running(netdev)) {
-   struct ethtool_cmd cmd;
-   if (!__ethtool_get_settings(netdev, &cmd)) {
+   struct ethtool_ksettings cmd;
+
+   if (!__ethtool_get_ksettings(netdev, &cmd)) {
const char *duplex;
-   switch (cmd.duplex) {
+
+   switch (cmd.parent.duplex) {
case DUPLEX_HALF:
duplex = "half";
break;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 1cf928f..8847dad 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -557,9 +557,8 @@ static int prb_calc_retire_blk_tmo(struct packet_sock *po,
 {
struct net_device *dev;
unsigned int mbits = 0, msec = 0, div = 0, tmo = 0;
-   struct ethtool_cmd ecmd;
+   struct ethtool_ksettings ecmd;
int err;
-   u32 speed;
 
rtnl_lock();
dev = __dev_get_by_index(sock_net(&po->sk), po->ifindex);
@@ -567,19 +566,19 @@ static int prb_calc_retire_blk_tmo(struct packet_sock *po,
rtnl_unlock();
return DEFAULT_PRB_RETIRE_TOV;
}
-   err = __ethtool_get_settings(dev, &ecmd);
-   speed = ethtool_cmd_speed(&ecmd);
+   err = __ethtool_get_ksettings(dev, &ecmd);
rtnl_unlock();
if (!err) {
/*
 * If the link speed is so slow you don't really
 * need to worry about perf anyways
 */
-   if (speed < SPEED_1000 || speed == SPEED_UNKNOWN) {
+   if (ecmd.parent.speed < SPEED_1000 ||
+   ecmd.parent.speed == SPEED_UNKNOWN) {
return DEFAULT_PRB_RETIRE_TOV;
} else {
msec = 1;
-   div = speed / 1000;
+   div = ecmd.parent.speed / 1000;
}
}
 
-- 
2.6.0.rc2.230.g3dd15c0



Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

2015-11-30 Thread Daniel Borkmann

On 11/30/2015 07:13 PM, Alexei Starovoitov wrote:

On Mon, Nov 30, 2015 at 03:34:35PM +0100, Daniel Borkmann wrote:

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 3f4c99e06c6b..b1e53b79c586 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
  attr->value_size == 0)
  return ERR_PTR(-EINVAL);

+if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
+/* if value_size is bigger, the user space won't be able to
+ * access the elements.
+ */
+return ERR_PTR(-E2BIG);
+


Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN
already and if that fails, we fall back to vzalloc(), it shouldn't trigger
memory allocation warnings here ...


not quite, the above check is for kmalloc-s in syscall.c


Ok, I see. The check and comment is related to the fact that when we do bpf(2)
syscall to lookup an element:

We call map_lookup_elem(), which does kmalloc() on the value_size.

So an individual entry lookup could fail with kmalloc() there, unrelated to an
individual map implementation.


kmalloc with order >= MAX_ORDER warning can be seen in syscall for update/lookup
commands regardless of map implementation.
So the maps with "value_size >= 1 << (KMALLOC_SHIFT_MAX - 1)" were not
accessible from user space anyway.
This check in arraymap.c fixes the warning and prevents creation of such
maps in the first place as the comment right below it says.


Yeah, right. Noticed that later on. It was a bit confusing at first as I didn't
parse that clearly from the commit message itself.


Similar check in hashtab.c fixes warning, prevents abnormal map creation and
fixes integer overflow which is the most dangerous of them all.

The check in arraymap.c
-attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
+attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)
  fixes potential integer overflow in map.pages computation.

and similar check in hashtab.c:
(u64) htab->elem_size * htab->map.max_entries >= U32_MAX - PAGE_SIZE
fixes integer overflow in map.pages as well.
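
To make the shape of that bound concrete, here is a minimal user-space sketch, not the kernel code. It assumes 32-bit size arithmetic as in the checks quoted above; DEMO_PAGE_SIZE stands in for PAGE_SIZE, hdr_size stands in for sizeof(*array), and both function names are invented for the example. The point is that the test divides rather than multiplies, so the test itself cannot wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for PAGE_SIZE; the kernel uses its own definition. */
#define DEMO_PAGE_SIZE 4096u

/* Returns 1 if hdr_size + max_entries * elem_size, rounded up to a page,
 * still fits in 32 bits. Assumes hdr_size is far below UINT32_MAX. */
static int array_size_fits_u32(uint32_t max_entries, uint32_t elem_size,
			       uint32_t hdr_size)
{
	if (elem_size == 0)
		return 0;
	if (max_entries > (UINT32_MAX - DEMO_PAGE_SIZE - hdr_size) / elem_size)
		return 0;
	return 1;
}

/* Page count for the map; only meaningful after the check above passed,
 * because otherwise the 32-bit sum below could wrap around. */
static uint32_t array_pages(uint32_t max_entries, uint32_t elem_size,
			    uint32_t hdr_size)
{
	uint32_t bytes = hdr_size + max_entries * elem_size;

	return (bytes + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}
```

Without the PAGE_SIZE term in the bound, the byte count could pass the check and the round-up to whole pages could still wrap.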


Yep, got that part.


the 'value_size >= (1 << (KMALLOC_SHIFT_MAX - 1)) - MAX_BPF_STACK -
sizeof(struct htab_elem)'
check in hashtab.c fixes integer overflow in elem_size and
makes elem_size kmalloc-able later in htab_map_update_elem().
Since it wasn't obvious that this one 'if' addresses these multiple issues,
I've added a comment there.


... and the MAX_BPF_STACK stands for the maximum key part here, okay.

So, when creating a sufficiently large map where map->key_size + map->value_size
would be > MAX_BPF_STACK (but map->key_size still <= MAX_BPF_STACK), we can only
read the map from an eBPF program, but not update it. In such cases, updates
could only happen from user space application.


Addition of __GFP_NOWARN only fixes OOM warning as commit log says.


That's obvious, too.


Hmm, seems this patch fixes many things at once, maybe makes sense to split it?


hmm I don't see a point of changing the same single line over multiple patches.
The split won't help backporting, but rather makes for more patches to deal
with.



Re: [PATCH v2] ath6kl: Use vmalloc for loading firmware using api1 method and use kvfree

2015-11-30 Thread Kalle Valo
Kalle Valo  writes:

> Brent Taylor  writes:
>
>> Signed-off-by: Brent Taylor 
>>
>> ath6kl: Use vmalloc for loading firmware using api1 method and free using 
>> kvfree
>> ---
>> Changes v1 -> v2:
>>- simplify memory allocation
>>- use kvfree
>
> Why? The commit log should _always_ answer that. Are you fixing a bug
> (what bug exactly?), is this just cleanup or what?
>
> And the commit log is wrongly formatted anyway, the Signed-off-by line
> should be the last and there should be no "ath6kl:" string in the commit
> log (just in the title). Use 'git log' to find examples.

Fixing netdev address (kenrel -> kernel)

-- 
Kalle Valo


[PATCH v2 net-next 0/2] Basic support for Solarflare 8000 series NICs

2015-11-30 Thread Bert Kenward
The upcoming Solarflare 8000 series 10G/40G network card supports a 
similar interface to the current 7000 series cards. This patch series 
provides basic support for these cards, making no use of any new 
functionality.

v2: fix indenting in ef10.c in patch 1/2.

Bert Kenward (2):
  sfc: make TSO version a per-queue parameter
  sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC

 drivers/net/ethernet/sfc/ef10.c   | 13 ++---
 drivers/net/ethernet/sfc/efx.c|  6 ++
 drivers/net/ethernet/sfc/net_driver.h |  2 ++
 drivers/net/ethernet/sfc/tx.c |  8 ++--
 4 files changed, 20 insertions(+), 9 deletions(-)

-- 
2.4.3



[PATCH v2 net-next 2/2] sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC

2015-11-30 Thread Bert Kenward
Also add support for 7000 series 40G NIC VF.

Signed-off-by: Bert Kenward 
---
 drivers/net/ethernet/sfc/efx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 4e82bcf..b405349 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -2784,6 +2784,12 @@ static const struct pci_device_id efx_pci_table[] = {
 .driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x0923),  /* SFC9140 PF */
 .driver_data = (unsigned long) &efx_hunt_a0_nic_type},
+   {PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x1923),  /* SFC9140 VF */
+.driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
+   {PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x0a03),  /* SFC9220 PF */
+.driver_data = (unsigned long) &efx_hunt_a0_nic_type},
+   {PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x1a03),  /* SFC9220 VF */
+.driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
{0} /* end of list */
 };
 
-- 
2.4.3



[PATCH v2 net-next 1/2] sfc: make TSO version a per-queue parameter

2015-11-30 Thread Bert Kenward
The Solarflare 8000 series NIC will use a new TSO scheme. The current
driver refuses to load if the current TSO scheme is not found. Remove
that check and instead make the TSO version a per-queue parameter.

Signed-off-by: Bert Kenward 
---
 drivers/net/ethernet/sfc/ef10.c   | 13 ++---
 drivers/net/ethernet/sfc/net_driver.h |  2 ++
 drivers/net/ethernet/sfc/tx.c |  8 ++--
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index bc6d21b..425df3d 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -181,13 +181,6 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
MCDI_WORD(outbuf, GET_CAPABILITIES_OUT_TX_DPCPU_FW_ID);
 
if (!(nic_data->datapath_caps &
- (1 << MC_CMD_GET_CAPABILITIES_OUT_TX_TSO_LBN))) {
-   netif_err(efx, drv, efx->net_dev,
- "current firmware does not support TSO\n");
-   return -ENODEV;
-   }
-
-   if (!(nic_data->datapath_caps &
  (1 << MC_CMD_GET_CAPABILITIES_OUT_RX_PREFIX_LEN_14_LBN))) {
netif_err(efx, probe, efx->net_dev,
  "current firmware does not support an RX prefix\n");
@@ -1797,6 +1790,12 @@ static void efx_ef10_tx_init(struct efx_tx_queue *tx_queue)
 ESF_DZ_TX_OPTION_UDP_TCP_CSUM, csum_offload,
 ESF_DZ_TX_OPTION_IP_CSUM, csum_offload);
tx_queue->write_count = 1;
+
+   if (nic_data->datapath_caps &
+   (1 << MC_CMD_GET_CAPABILITIES_OUT_TX_TSO_LBN)) {
+   tx_queue->tso_version = 1;
+   }
+
wmb();
efx_ef10_push_tx_desc(tx_queue, txd);
 
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index a8ddd12..5c0d0ba 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -182,6 +182,7 @@ struct efx_tx_buffer {
  *
  * @efx: The associated Efx NIC
  * @queue: DMA queue number
+ * @tso_version: Version of TSO in use for this queue.
  * @channel: The associated channel
  * @core_txq: The networking core TX queue structure
  * @buffer: The software buffer ring
@@ -228,6 +229,7 @@ struct efx_tx_queue {
/* Members which don't change on the fast path */
struct efx_nic *efx ____cacheline_aligned_in_smp;
unsigned queue;
+   unsigned int tso_version;
struct efx_channel *channel;
struct netdev_queue *core_txq;
struct efx_tx_buffer *buffer;
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 67f6afa..f7a0ec1 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -1010,13 +1010,17 @@ static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue,
 
 /* Parse the SKB header and initialise state. */
 static int tso_start(struct tso_state *st, struct efx_nic *efx,
+struct efx_tx_queue *tx_queue,
 const struct sk_buff *skb)
 {
-   bool use_opt_desc = efx_nic_rev(efx) >= EFX_REV_HUNT_A0;
struct device *dma_dev = &efx->pci_dev->dev;
unsigned int header_len, in_len;
+   bool use_opt_desc = false;
dma_addr_t dma_addr;
 
+   if (tx_queue->tso_version == 1)
+   use_opt_desc = true;
+
st->ip_off = skb_network_header(skb) - skb->data;
st->tcp_off = skb_transport_header(skb) - skb->data;
header_len = st->tcp_off + (tcp_hdr(skb)->doff << 2u);
@@ -1271,7 +1275,7 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
/* Find the packet protocol and sanity-check it */
state.protocol = efx_tso_check_protocol(skb);
 
-   rc = tso_start(&state, efx, skb);
+   rc = tso_start(&state, efx, tx_queue, skb);
if (rc)
goto mem_err;
 
-- 
2.4.3




Re: [PATCH] vhost: replace % with & on data path

2015-11-30 Thread kbuild test robot
Hi Michael,

[auto build test ERROR on: v4.4-rc3]
[also build test ERROR on: next-20151127]

url:
https://github.com/0day-ci/linux/commits/Michael-S-Tsirkin/vhost-replace-with-on-data-path/20151130-163704
config: i386-randconfig-s1-201548 (attached as .config)
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All error/warnings (new ones prefixed by >>):

   drivers/vhost/vhost.c: In function 'vhost_get_vq_desc':
>> drivers/vhost/vhost.c:1345:6: warning: unused variable 'ret' 
>> [-Wunused-variable]
 int ret;
 ^
>> drivers/vhost/vhost.c:1344:13: warning: unused variable 'ring_head' 
>> [-Wunused-variable]
 __virtio16 ring_head;
^
>> drivers/vhost/vhost.c:1341:24: warning: unused variable 'found' 
>> [-Wunused-variable]
 unsigned int i, head, found = 0;
   ^
>> drivers/vhost/vhost.c:1341:18: warning: unused variable 'head' 
>> [-Wunused-variable]
 unsigned int i, head, found = 0;
 ^
>> drivers/vhost/vhost.c:1341:15: warning: unused variable 'i' 
>> [-Wunused-variable]
 unsigned int i, head, found = 0;
  ^
>> drivers/vhost/vhost.c:1340:20: warning: unused variable 'desc' 
>> [-Wunused-variable]
 struct vring_desc desc;
   ^
   drivers/vhost/vhost.c: At top level:
>> drivers/vhost/vhost.c:1373:2: error: expected identifier or '(' before 'if'
 if (unlikely(__get_user(ring_head,
 ^
   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/uapi/asm-generic/fcntl.h:4,
from arch/x86/include/uapi/asm/fcntl.h:1,
from include/uapi/linux/fcntl.h:4,
from include/linux/fcntl.h:4,
from include/linux/eventfd.h:11,
from drivers/vhost/vhost.c:14:
>> arch/x86/include/asm/uaccess.h:414:2: error: expected identifier or '(' 
>> before ')' token
})
 ^
   include/linux/compiler.h:137:45: note: in definition of macro 'unlikely'
#  define unlikely(x) (__builtin_constant_p(x) ? !!(x) : 
__branch_check__(x, 0))
^
   arch/x86/include/asm/uaccess.h:479:2: note: in expansion of macro 
'__get_user_nocheck'
 __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
 ^
>> drivers/vhost/vhost.c:1373:15: note: in expansion of macro '__get_user'
 if (unlikely(__get_user(ring_head,
  ^
>> arch/x86/include/asm/uaccess.h:414:2: error: expected identifier or '(' 
>> before ')' token
})
 ^
   include/linux/compiler.h:137:53: note: in definition of macro 'unlikely'
#  define unlikely(x) (__builtin_constant_p(x) ? !!(x) : 
__branch_check__(x, 0))
^
   arch/x86/include/asm/uaccess.h:479:2: note: in expansion of macro 
'__get_user_nocheck'
 __get_user_nocheck((x), (ptr), sizeof(*(ptr)))
 ^
>> drivers/vhost/vhost.c:1373:15: note: in expansion of macro '__get_user'
 if (unlikely(__get_user(ring_head,
  ^
>> include/linux/compiler.h:126:4: error: expected identifier or '(' before ')' 
>> token
  })
   ^
   include/linux/compiler.h:137:58: note: in expansion of macro 
'__branch_check__'
#  define unlikely(x) (__builtin_constant_p(x) ? !!(x) : 
__branch_check__(x, 0))
 ^
>> drivers/vhost/vhost.c:1373:6: note: in expansion of macro 'unlikely'
 if (unlikely(__get_user(ring_head,
 ^
>> drivers/vhost/vhost.c:1381:2: warning: data definition has no type or 
>> storage class
 head = vhost16_to_cpu(vq, ring_head);
 ^
>> drivers/vhost/vhost.c:1381:2: error: type defaults to 'int' in declaration 
>> of 'head' [-Werror=implicit-int]
>> drivers/vhost/vhost.c:1381:24: error: 'vq' undeclared here (not in a 
>> function)
 head = vhost16_to_cpu(vq, ring_head);
   ^
>> drivers/vhost/vhost.c:1381:28: error: 'ring_head' undeclared here (not in a 
>> function)
 head = vhost16_to_cpu(vq, ring_head);
   ^
   drivers/vhost/vhost.c:1384:2: error: expected identifier or '(' before 'if'
 if (unlikely(head >= vq->num)) {
 ^
   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/u

Re: [PATCH] vhost: replace % with & on data path

2015-11-30 Thread Michael S. Tsirkin
On Mon, Nov 30, 2015 at 10:34:07AM +0200, Michael S. Tsirkin wrote:
> We know vring num is a power of 2, so use &
> to mask the high bits.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/vhost/vhost.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 080422f..85f0f0a 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1366,10 +1366,12 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>   /* Only get avail ring entries after they have been exposed by guest. */
>   smp_rmb();
>  
> + }
> +

Oops. This sneaked in from an unrelated patch.
Pls ignore, will repost.

>   /* Grab the next descriptor number they're advertising, and increment
>* the index we've seen. */
>   if (unlikely(__get_user(ring_head,
> - &vq->avail->ring[last_avail_idx % vq->num]))) {
> + &vq->avail->ring[last_avail_idx & (vq->num - 1)]))) {
>   vq_err(vq, "Failed to read head: idx %d address %p\n",
>  last_avail_idx,
>  &vq->avail->ring[last_avail_idx % vq->num]);
> @@ -1489,7 +1491,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
>   u16 old, new;
>   int start;
>  
> - start = vq->last_used_idx % vq->num;
> + start = vq->last_used_idx & (vq->num - 1);
>   used = vq->used->ring + start;
>   if (count == 1) {
>   if (__put_user(heads[0].id, &used->id)) {
> @@ -1531,7 +1533,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
>  {
>   int start, n, r;
>  
> - start = vq->last_used_idx % vq->num;
> + start = vq->last_used_idx & (vq->num - 1);
>   n = vq->num - start;
>   if (n < count) {
>   r = __vhost_add_used_n(vq, heads, n);
> -- 
> MST


[PATCH RFC v2] virtio: skip avail/used index reads

2015-11-30 Thread Michael S. Tsirkin
This adds a new vring feature bit: when enabled, host and guest poll the
available/used ring directly instead of looking at the index field
first.

To guarantee it is possible to detect updates, the high bits (above
vring.num - 1) in the ring head ID value are modified to match the index
bits - these change on each wrap-around.  Writer also XORs this with
0x8000 such that rings can be zero-initialized.

Reader is modified to ignore these high bits when looking
up descriptors.

The point is to reduce the number of cacheline misses
for both reads and writes.
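
One possible reading of the encoding described above, as a user-space sketch. This is not the code from the patch: the helper names are invented, the ring size is assumed to be a power of two no larger than 0x8000, and the exact bit layout is inferred from the description only.

```c
#include <assert.h>
#include <stdint.h>

/* Writer side: keep the descriptor head in the low bits, copy the
 * wrap-around bits of the ring index into the high bits, then flip
 * bit 15 so an all-zero (never written) entry does not look valid
 * for index 0. num must be a power of two, at most 0x8000. */
static uint16_t poll_encode(uint16_t head, uint16_t idx, uint16_t num)
{
	uint16_t low = num - 1;

	return ((head & low) | (idx & (uint16_t)~low)) ^ 0x8000;
}

/* Reader side: an entry is new if its high bits match the high bits of
 * the index the reader expects next. */
static int poll_entry_valid(uint16_t entry, uint16_t idx, uint16_t num)
{
	uint16_t high = (uint16_t)~(num - 1);

	return ((entry ^ 0x8000) & high) == (idx & high);
}

/* Reader side: drop the high bits to recover the descriptor head. */
static uint16_t poll_decode_head(uint16_t entry, uint16_t num)
{
	return entry & (num - 1);
}
```

With a 256-entry ring, a slot written at index 3 stops looking valid once the reader expects index 259, which is the property that lets the reader poll the ring entry without reading the index field.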

I see a performance improvement of about 20% on multithreaded benchmarks
(e.g. virtio-test), but regression of about 2% on vring_bench.
I think this has to do with the fact that complete_multi_user
is implemented suboptimally.

TODO:
investigate single-threaded regression
look at more aggressive ring layout changes
better name for a feature flag
split the patch to make it easier to review

This is on top of the following patches in my tree:
virtio_ring: Shadow available ring flags & index
vhost: replace % with & on data path
tools/virtio: fix byteswap logic
tools/virtio: move list macro stubs

Signed-off-by: Michael S. Tsirkin 
---

Changes from v1:
add a missing chunk in vhost_get_vq_desc

 drivers/vhost/vhost.h|   3 +-
 include/linux/vringh.h   |   3 +
 include/uapi/linux/virtio_ring.h |   3 +
 drivers/vhost/vhost.c| 104 ++
 drivers/vhost/vringh.c   | 153 +--
 drivers/virtio/virtio_ring.c |  40 --
 tools/virtio/virtio_test.c   |  14 +++-
 7 files changed, 256 insertions(+), 64 deletions(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index d3f7674..aeeb15d 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -175,7 +175,8 @@ enum {
 (1ULL << VIRTIO_RING_F_EVENT_IDX) |
 (1ULL << VHOST_F_LOG_ALL) |
 (1ULL << VIRTIO_F_ANY_LAYOUT) |
-(1ULL << VIRTIO_F_VERSION_1)
+(1ULL << VIRTIO_F_VERSION_1) |
+(1ULL << VIRTIO_RING_F_POLL)
 };
 
 static inline bool vhost_has_feature(struct vhost_virtqueue *vq, int bit)
diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index bc6c28d..13a9e3e 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -40,6 +40,9 @@ struct vringh {
/* Can we get away with weak barriers? */
bool weak_barriers;
 
+   /* Poll ring directly */
+   bool poll;
+
/* Last available index we saw (ie. where we're up to). */
u16 last_avail_idx;
 
diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
index c072959..bf3ca1d 100644
--- a/include/uapi/linux/virtio_ring.h
+++ b/include/uapi/linux/virtio_ring.h
@@ -62,6 +62,9 @@
  * at the end of the used ring. Guest should ignore the used->flags field. */
 #define VIRTIO_RING_F_EVENT_IDX 29
 
+/* Support ring polling */
+#define VIRTIO_RING_F_POLL 33
+
 /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
 struct vring_desc {
/* Address (guest-physical). */
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index ad2146a..cdbabf5 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1346,25 +1346,27 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 
/* Check it isn't doing very strange things with descriptor numbers. */
last_avail_idx = vq->last_avail_idx;
-   if (unlikely(__get_user(avail_idx, >avail->idx))) {
-   vq_err(vq, "Failed to access avail idx at %p\n",
-  >avail->idx);
-   return -EFAULT;
-   }
-   vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
+   if (!vhost_has_feature(vq, VIRTIO_RING_F_POLL)) {
+   if (unlikely(__get_user(avail_idx, &vq->avail->idx))) {
+   vq_err(vq, "Failed to access avail idx at %p\n",
+  &vq->avail->idx);
+   return -EFAULT;
+   }
+   vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
 
-   if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
-   vq_err(vq, "Guest moved used index from %u to %u",
-  last_avail_idx, vq->avail_idx);
-   return -EFAULT;
-   }
+   if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
+   vq_err(vq, "Guest moved used index from %u to %u",
+  last_avail_idx, vq->avail_idx);
+   return -EFAULT;
+   }
 
-   /* If there's nothing new since last we looked, return invalid. */
-   if (vq->avail_idx == last_avail_idx)
-   return vq->num;
+   /* If there's 

Re: [PATCH] vhost: replace % with & on data path

2015-11-30 Thread Michael S. Tsirkin
On Mon, Nov 30, 2015 at 12:42:49AM -0800, Joe Perches wrote:
> On Mon, 2015-11-30 at 10:34 +0200, Michael S. Tsirkin wrote:
> > We know vring num is a power of 2, so use &
> > to mask the high bits.
> []
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> []
> > @@ -1366,10 +1366,12 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> > /* Only get avail ring entries after they have been exposed by guest. */
> > smp_rmb();
> >  
> > +   }
> 
> ?

Yes, I noticed this - I moved this chunk from the next patch
in my tree by mistake.

Will fix, thanks!
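
For reference, the equivalence the patch relies on can be sketched in plain userspace C. This is only an illustration, not code from the patch: is_pow2() and ring_slot() are hypothetical helper names, and the sketch assumes a 16-bit free-running index as used by the vring.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n is a non-zero power of two. */
static bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Reduce a free-running 16-bit ring index to a slot number.
 * Only valid when num is a power of two: in that case
 * idx & (num - 1) equals idx % num, without a division.
 */
static uint16_t ring_slot(uint16_t idx, uint16_t num)
{
	return idx & (uint16_t)(num - 1);
}
```

With num = 256, ring_slot(idx, 256) matches idx % 256 for every 16-bit idx. For a size such as 48 the mask would give wrong slots, which is why the power-of-2 guarantee matters.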

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] netfilter: ipvs: avoid unused variable warning

2015-11-30 Thread Arnd Bergmann
When CONFIG_PROC_FS is disabled, the local 'net' variable in
ip_vs_app_net_init becomes unused, as gcc warns:

net/netfilter/ipvs/ip_vs_app.c: In function 'ip_vs_app_net_init':
net/netfilter/ipvs/ip_vs_app.c:608:14: warning: unused variable 'net' [-Wunused-variable]

This removes the line by moving the pointer dereference into the
user of the variable.
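
A minimal userspace sketch of the pattern, assuming the disabled-config stub is a macro that discards its arguments. All names here (SKETCH_HAVE_PROC, remove_entry, cleanup and the structs) are hypothetical stand-ins, not the kernel definitions:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct proc_dir { int entries; };
struct net_ns   { struct proc_dir *proc; };
struct ipvs_ns  { struct net_ns *net; };

#ifdef SKETCH_HAVE_PROC
static void remove_entry(const char *name, struct proc_dir *dir)
{
	(void)name;
	dir->entries--;
}
#else
/* Stub: the arguments are dropped by the preprocessor. */
#define remove_entry(name, dir) do { } while (0)
#endif

static int cleanup(struct ipvs_ns *ipvs)
{
	/*
	 * No local 'net' variable: the dereference lives inside the call,
	 * so it disappears together with the stubbed-out macro and there
	 * is nothing left over for the compiler to warn about.
	 */
	remove_entry("ip_vs_app", ipvs->net->proc);
	return 0;
}
```

With the stub active, the expression ipvs->net->proc is never evaluated at all, so no variable is left unused.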

Signed-off-by: Arnd Bergmann 

diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 0328f7250693..e5422d3db501 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -614,8 +614,6 @@ int __net_init ip_vs_app_net_init(struct netns_ipvs *ipvs)
 
 void __net_exit ip_vs_app_net_cleanup(struct netns_ipvs *ipvs)
 {
-   struct net *net = ipvs->net;
-
unregister_ip_vs_app(ipvs, NULL /* all */);
-   remove_proc_entry("ip_vs_app", net->proc_net);
+   remove_proc_entry("ip_vs_app", ipvs->net->proc_net);
 }


Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test

2015-11-30 Thread Phil Sutter
On Mon, Nov 30, 2015 at 05:37:55PM +0800, Herbert Xu wrote:
> Phil Sutter  wrote:
> > The following series aims to improve lib/test_rhashtable in different
> > situations:
> > 
> > Patch 1 allows the kernel to reschedule so the test does not block too
> >long on slow systems.
> > Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
> >error case (-EBUSY).
> > Patch 3 auto-adjusts the upper table size limit according to the number
> >of threads (in concurrency test). In fact, the current default is
> >already too small.
> > Patch 4 makes it possible to retry inserts even in supposedly permanent
> >error case (-ENOMEM) to expose rhashtable's remaining problem of
> >-ENOMEM being not as permanent as it is expected to be.
> 
> I'm sorry but this patch series is simply bogus.

The whole series?!

> If rhashtable is indeed returning such errors under normal
> conditions then rhashtable is broken and we must fix it instead
> of working around it in the test code!

You're stating the obvious. Remember, the reason I prepared patch 4 was
because you wanted to fix just that bug in rhashtable in the first
place.

Just to make this clear: Patches 1-3 are reasonable on their own, the
only connection to the bug is that patch 2 makes it visible (at least on
my system it wasn't before).
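
To make the idea behind patch 2 concrete, here is a rough sketch of retrying on a transient -EBUSY. It is not the code from the series: insert_fn, insert_retry() and flaky_insert() are made-up names, and the retry limit is an assumption for illustration only.

```c
#include <errno.h>

/* Hypothetical insert callback: returns 0 or a negative errno. */
typedef int (*insert_fn)(void *table, int key);

/*
 * Retry an insert while it reports the transient error -EBUSY,
 * giving up after max_retries extra attempts. Any other result
 * (success or a different error) is returned unchanged.
 */
static int insert_retry(insert_fn insert, void *table, int key,
			int max_retries)
{
	int attempt = 0;
	int err;

	for (;;) {
		err = insert(table, key);
		if (err != -EBUSY || attempt == max_retries)
			return err;
		attempt++;	/* a kernel test could reschedule here */
	}
}

/* Stand-in table: fails with -EBUSY until the counter reaches 0. */
static int flaky_insert(void *table, int key)
{
	int *busy_left = table;

	(void)key;
	if (*busy_left > 0) {
		(*busy_left)--;
		return -EBUSY;
	}
	return 0;
}
```

Only -EBUSY is retried here; any other error is passed straight back to the caller.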

> FWIW I still haven't been able to reproduce this problem, perhaps
> because my machines have too few CPUs?

Did you try with my bogus patch series applied? How many CPUs does your
test system actually have?

> So can someone please help me reproduce this? Because just loading
> test_rhashtable isn't doing it.

As said, maybe you need to increase the number of spawned threads
(tcount=50 or so).

Cheers, Phil


Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test

2015-11-30 Thread Herbert Xu
On Mon, Nov 30, 2015 at 11:14:01AM +0100, Phil Sutter wrote:
> On Mon, Nov 30, 2015 at 05:37:55PM +0800, Herbert Xu wrote:
> > Phil Sutter  wrote:
> > > The following series aims to improve lib/test_rhashtable in different
> > > situations:
> > > 
> > > Patch 1 allows the kernel to reschedule so the test does not block too
> > >long on slow systems.
> > > Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
> > >error case (-EBUSY).
> > > Patch 3 auto-adjusts the upper table size limit according to the number
> > >of threads (in concurrency test). In fact, the current default is
> > >already too small.
> > > Patch 4 makes it possible to retry inserts even in supposedly permanent
> > >error case (-ENOMEM) to expose rhashtable's remaining problem of
> > >-ENOMEM being not as permanent as it is expected to be.
> > 
> > I'm sorry but this patch series is simply bogus.
> 
> The whole series?!

Well at least patch two and four seem clearly wrong because no
rhashtable user should need to retry insertions.

> Did you try with my bogus patch series applied? How many CPUs does your
> test system actually have?
> 
> > So can someone please help me reproduce this? Because just loading
> > test_rhashtable isn't doing it.
> 
> As said, maybe you need to increase the number of spawned threads
> (tcount=50 or so).

OK that's better.  I think I see the problem.  The test in
rhashtable_insert_rehash is racy and if two threads both try
to grow the table one of them may be tricked into doing a rehash
instead.

I'm working on a fix.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH (net-next.git)] stmmac: support Reg_9 to get HW level information

2015-11-30 Thread Giuseppe Cavallaro
For GMAC newer than 3.40a there is a new register (Reg_9) that provides the
status of all modules of the transmit and receive paths and FIFO status.
These can be exposed via ethtool.
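
As a rough illustration of how such a status register is decoded, the sketch below extracts a few fields using the bit positions from the defines in this patch. BIT() and GENMASK() are redefined locally so it builds in userspace, and the debug_*() helpers are hypothetical names, not functions from the driver.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel BIT()/GENMASK() helpers. */
#define BIT(n)		(UINT32_C(1) << (n))
#define GENMASK(h, l)	(((~UINT32_C(0)) >> (31 - (h))) & ((~UINT32_C(0)) << (l)))

/* Field layout taken from the defines added by the patch. */
#define GMAC_DEBUG_TXSTSFSTS	BIT(25)
#define GMAC_DEBUG_TRCSTS_MASK	GENMASK(21, 20)
#define GMAC_DEBUG_TRCSTS_SHIFT	20
#define GMAC_DEBUG_RXFSTS_MASK	GENMASK(9, 8)
#define GMAC_DEBUG_RXFSTS_SHIFT	8

/* MTL Tx FIFO read controller state (0 idle, 1 read, 2 wait, 3 write). */
static unsigned int debug_trcsts(uint32_t value)
{
	return (value & GMAC_DEBUG_TRCSTS_MASK) >> GMAC_DEBUG_TRCSTS_SHIFT;
}

/* MTL Rx FIFO fill level (0 empty ... 3 full). */
static unsigned int debug_rxfsts(uint32_t value)
{
	return (value & GMAC_DEBUG_RXFSTS_MASK) >> GMAC_DEBUG_RXFSTS_SHIFT;
}

/* Non-zero when the MTL TxStatus FIFO is reported full. */
static int debug_tx_status_fifo_full(uint32_t value)
{
	return !!(value & GMAC_DEBUG_TXSTSFSTS);
}
```

Each multi-bit field is masked first and then shifted down, so the result can be compared directly against the small per-field constants (0..3).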

Signed-off-by: Giuseppe Cavallaro 
---
 drivers/net/ethernet/stmicro/stmmac/common.h   |   26 +++
 drivers/net/ethernet/stmicro/stmmac/dwmac1000.h|   42 +++
 .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   |   75 
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |   30 
 4 files changed, 173 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 623c6ed..f4518bc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -137,6 +137,31 @@ struct stmmac_extra_stats {
unsigned long pcs_link;
unsigned long pcs_duplex;
unsigned long pcs_speed;
+   /* debug register */
+   unsigned long mtl_tx_status_fifo_full;
+   unsigned long mtl_tx_fifo_not_empty;
+   unsigned long mmtl_fifo_ctrl;
+   unsigned long mtl_tx_fifo_read_ctrl_write;
+   unsigned long mtl_tx_fifo_read_ctrl_wait;
+   unsigned long mtl_tx_fifo_read_ctrl_read;
+   unsigned long mtl_tx_fifo_read_ctrl_idle;
+   unsigned long mac_tx_in_pause;
+   unsigned long mac_tx_frame_ctrl_xfer;
+   unsigned long mac_tx_frame_ctrl_idle;
+   unsigned long mac_tx_frame_ctrl_wait;
+   unsigned long mac_tx_frame_ctrl_pause;
+   unsigned long mac_gmii_tx_proto_engine;
+   unsigned long mtl_rx_fifo_fill_level_full;
+   unsigned long mtl_rx_fifo_fill_above_thresh;
+   unsigned long mtl_rx_fifo_fill_below_thresh;
+   unsigned long mtl_rx_fifo_fill_level_empty;
+   unsigned long mtl_rx_fifo_read_ctrl_flush;
+   unsigned long mtl_rx_fifo_read_ctrl_read_data;
+   unsigned long mtl_rx_fifo_read_ctrl_status;
+   unsigned long mtl_rx_fifo_read_ctrl_idle;
+   unsigned long mtl_rx_fifo_ctrl_active;
+   unsigned long mac_rx_frame_ctrl_fifo;
+   unsigned long mac_gmii_rx_proto_engine;
 };
 
 /* CSR Frequency Access Defines*/
@@ -408,6 +433,7 @@ struct stmmac_ops {
void (*set_eee_pls)(struct mac_device_info *hw, int link);
void (*ctrl_ane)(struct mac_device_info *hw, bool restart);
void (*get_adv)(struct mac_device_info *hw, struct rgmii_adv *adv);
+   void (*debug)(void __iomem *ioaddr, struct stmmac_extra_stats *x);
 };
 
 /* PTP and HW Timer helpers */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index b3fe057..8831a05 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -34,6 +34,7 @@
 #define GMAC_FLOW_CTRL 0x0018  /* Flow Control */
 #define GMAC_VLAN_TAG  0x001c  /* VLAN Tag */
 #define GMAC_VERSION   0x0020  /* GMAC CORE Version */
+#define GMAC_DEBUG 0x0024  /* GMAC debug register */
 #define GMAC_WAKEUP_FILTER 0x0028  /* Wake-up Frame Filter */
 
 #define GMAC_INT_STATUS 0x0038  /* interrupt status register */
@@ -177,6 +178,47 @@ enum inter_frame_gap {
 #define GMAC_FLOW_CTRL_TFE 0x0002  /* Tx Flow Control Enable */
 #define GMAC_FLOW_CTRL_FCB_BPA 0x0001  /* Flow Control Busy ... */
 
+/* DEBUG Register defines */
+/* MTL TxStatus FIFO */
+#define GMAC_DEBUG_TXSTSFSTS   BIT(25) /* MTL TxStatus FIFO Full Status */
+#define GMAC_DEBUG_TXFSTS  BIT(24) /* MTL Tx FIFO Not Empty Status */
+#define GMAC_DEBUG_TWCSTS  BIT(22) /* MTL Tx FIFO Write Controller */
+/* MTL Tx FIFO Read Controller Status */
+#define GMAC_DEBUG_TRCSTS_MASK GENMASK(21, 20)
+#define GMAC_DEBUG_TRCSTS_SHIFT 20
+#define GMAC_DEBUG_TRCSTS_IDLE 0
+#define GMAC_DEBUG_TRCSTS_READ 1
+#define GMAC_DEBUG_TRCSTS_TXW  2
+#define GMAC_DEBUG_TRCSTS_WRITE 3
+#define GMAC_DEBUG_TXPAUSED BIT(19) /* MAC Transmitter in PAUSE */
+/* MAC Transmit Frame Controller Status */
+#define GMAC_DEBUG_TFCSTS_MASK GENMASK(18, 17)
+#define GMAC_DEBUG_TFCSTS_SHIFT 17
+#define GMAC_DEBUG_TFCSTS_IDLE 0
+#define GMAC_DEBUG_TFCSTS_WAIT 1
+#define GMAC_DEBUG_TFCSTS_GEN_PAUSE 2
+#define GMAC_DEBUG_TFCSTS_XFER 3
+/* MAC GMII or MII Transmit Protocol Engine Status */
+#define GMAC_DEBUG_TPESTS  BIT(16)
+#define GMAC_DEBUG_RXFSTS_MASK GENMASK(9, 8) /* MTL Rx FIFO Fill-level */
+#define GMAC_DEBUG_RXFSTS_SHIFT 8
+#define GMAC_DEBUG_RXFSTS_EMPTY 0
+#define GMAC_DEBUG_RXFSTS_BT   1
+#define GMAC_DEBUG_RXFSTS_AT   2
+#define GMAC_DEBUG_RXFSTS_FULL 3
+#define GMAC_DEBUG_RRCSTS_MASK GENMASK(6, 5) /* MTL Rx FIFO Read Controller */
+#define GMAC_DEBUG_RRCSTS_SHIFT 5
+#define GMAC_DEBUG_RRCSTS_IDLE 0
+#define GMAC_DEBUG_RRCSTS_RDATA 1
+#define GMAC_DEBUG_RRCSTS_RSTAT 2
+#define 

[P.A. Semi] Does the ethernet interface work on your Electra, Chitra, Nemo, and Athena board?

2015-11-30 Thread Christian Zigotzky

FYI

On 30 November 2015 at 10:48 AM, Christian Zigotzky wrote:

Hi Denis,

Thank you for your answer. Sorry for my unclear description.

Yes, the driver probe function finds the device.

With kernel 4.4-rc3:

dmesg | grep -i eth0

[ 2.297473] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00

dhclient eth0

RTNETLINK answers: Cannot allocate memory

With kernel 4.1.13:

dmesg | grep -i eth0

[ 2.328115] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00
[ 37.130466] eth0: Link is up at 100 Mbps, full duplex.

Cheers,

Christian

On 30 November 2015 at 09:37 AM, Denis Kirjanov wrote:

On 11/29/15, Christian Zigotzky  wrote:

Hi All,

Does the ethernet interface on your Electra, Chitra, Nemo, and Athena
board work with the release candidates of the kernel 4.4? Unfortunately
the P.A. Semi ethernet doesn't work on our Nemo boards with the release
candidates of the kernel 4.4. We have set the following entries in the
kernel config:

CONFIG_NET_VENDOR_PASEMI=y
CONFIG_PASEMI_MAC=y

Could you please test the P.A. Semi ethernet on your P.A. Semi boards?

It's not clear from your descriptions what is not working. Does the
driver probe function find  a device? Does the interface show up in
the kernel? Can it send/receive packets?

Also please CC netdev.

Thanks.


Thanks in advance,

Christian
___
Linuxppc-dev mailing list
linuxppc-...@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev






[PATCH net-next 5/6] qede: Add support for nway_reset

2015-11-30 Thread Yuval Mintz
From: Sudarsana Kalluru 

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index b90d880..9b0bf12 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -322,6 +322,30 @@ static void qede_set_msglevel(struct net_device *ndev, u32 level)
 dp_module, dp_level);
 }
 
+static int qede_nway_reset(struct net_device *dev)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+   struct qed_link_output current_link;
+   struct qed_link_params link_params;
+
+   if (!netif_running(dev))
+   return 0;
+
+   memset(&current_link, 0, sizeof(current_link));
+   edev->ops->common->get_link(edev->cdev, &current_link);
+   if (!current_link.link_up)
+   return 0;
+
+   /* Toggle the link */
+   memset(&link_params, 0, sizeof(link_params));
+   link_params.link_up = false;
+   edev->ops->common->set_link(edev->cdev, &link_params);
+   link_params.link_up = true;
+   edev->ops->common->set_link(edev->cdev, &link_params);
+
+   return 0;
+}
+
 static u32 qede_get_link(struct net_device *dev)
 {
struct qede_dev *edev = netdev_priv(dev);
@@ -493,6 +517,7 @@ static const struct ethtool_ops qede_ethtool_ops = {
.get_drvinfo = qede_get_drvinfo,
.get_msglevel = qede_get_msglevel,
.set_msglevel = qede_set_msglevel,
+   .nway_reset = qede_nway_reset,
.get_link = qede_get_link,
.get_ringparam = qede_get_ringparam,
.set_ringparam = qede_set_ringparam,
-- 
1.9.3



[PATCH net-next 4/6] qede: Add support for set_phys_id

2015-11-30 Thread Yuval Mintz
From: Sudarsana Kalluru 

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 10d80ba..b90d880 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -459,6 +459,34 @@ static int qede_set_channels(struct net_device *dev,
return 0;
 }
 
+static int qede_set_phys_id(struct net_device *dev,
+   enum ethtool_phys_id_state state)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+   u8 led_state = 0;
+
+   switch (state) {
+   case ETHTOOL_ID_ACTIVE:
+   return 1;   /* cycle on/off once per second */
+
+   case ETHTOOL_ID_ON:
+   led_state = QED_LED_MODE_ON;
+   break;
+
+   case ETHTOOL_ID_OFF:
+   led_state = QED_LED_MODE_OFF;
+   break;
+
+   case ETHTOOL_ID_INACTIVE:
+   led_state = QED_LED_MODE_RESTORE;
+   break;
+   }
+
+   edev->ops->common->set_led(edev->cdev, led_state);
+
+   return 0;
+}
+
 static const struct ethtool_ops qede_ethtool_ops = {
.get_settings = qede_get_settings,
.set_settings = qede_set_settings,
@@ -469,6 +497,7 @@ static const struct ethtool_ops qede_ethtool_ops = {
.get_ringparam = qede_get_ringparam,
.set_ringparam = qede_set_ringparam,
.get_strings = qede_get_strings,
+   .set_phys_id = qede_set_phys_id,
.get_ethtool_stats = qede_get_ethtool_stats,
.get_sset_count = qede_get_sset_count,
 
-- 
1.9.3



[PATCH net-next 3/6] qed: Add support for changing LED state

2015-11-30 Thread Yuval Mintz
From: Sudarsana Kalluru 

Physical LEDs are being controlled by the management FW.
This adds the qed functionality required to request management FW to
change the LED configuration, as well as the necessary APIs for this
functionality to later be used by the protocol drivers.
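
The translation from the API-level LED mode to the mailbox parameter can be sketched on its own as below. The three parameter values are the ones added to qed_hsi.h by this patch. The enum, its ordering and the helper name led_mode_to_param() are assumptions made for the sake of a self-contained example.

```c
#include <errno.h>
#include <stdint.h>

/* Mailbox parameter values as added to qed_hsi.h by this patch. */
#define DRV_MB_PARAM_SET_LED_MODE_OPER	0x0
#define DRV_MB_PARAM_SET_LED_MODE_ON	0x1
#define DRV_MB_PARAM_SET_LED_MODE_OFF	0x2

/* Assumed API-level modes; the real enum ordering is not shown here. */
enum led_mode {
	LED_MODE_OFF,
	LED_MODE_ON,
	LED_MODE_RESTORE,
};

/*
 * Translate an API-level LED mode into the value sent to the
 * management FW. Returns 0 and fills *param on success, or
 * -EINVAL (leaving *param untouched) for an unknown mode.
 */
static int led_mode_to_param(int mode, uint32_t *param)
{
	switch (mode) {
	case LED_MODE_ON:
		*param = DRV_MB_PARAM_SET_LED_MODE_ON;
		return 0;
	case LED_MODE_OFF:
		*param = DRV_MB_PARAM_SET_LED_MODE_OFF;
		return 0;
	case LED_MODE_RESTORE:
		*param = DRV_MB_PARAM_SET_LED_MODE_OPER;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Restoring the LED maps to the "operational" value, i.e. control is handed back to the management FW rather than forcing a state.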

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_hsi.h  |  6 ++
 drivers/net/ethernet/qlogic/qed/qed_main.c | 18 ++
 drivers/net/ethernet/qlogic/qed/qed_mcp.c  | 27 +++
 drivers/net/ethernet/qlogic/qed/qed_mcp.h  | 13 +
 include/linux/qed/qed_if.h | 17 +
 5 files changed, 81 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_hsi.h b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
index b2f8e85..264e954 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_hsi.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
@@ -3993,6 +3993,8 @@ struct public_drv_mb {
 #define DRV_MSG_CODE_PHY_CORE_WRITE 0x000e
 #define DRV_MSG_CODE_SET_VERSION0x000f
 
+#define DRV_MSG_CODE_SET_LED_MODE   0x0020
+
 #define DRV_MSG_SEQ_NUMBER_MASK 0x
 
u32 drv_mb_param;
@@ -4044,6 +4046,10 @@ struct public_drv_mb {
 #define DRV_MB_PARAM_CFG_VF_MSIX_SB_NUM_SHIFT   8
 #define DRV_MB_PARAM_CFG_VF_MSIX_SB_NUM_MASK0xFF00
 
+#define DRV_MB_PARAM_SET_LED_MODE_OPER  0x0
+#define DRV_MB_PARAM_SET_LED_MODE_ON0x1
+#define DRV_MB_PARAM_SET_LED_MODE_OFF   0x2
+
u32 fw_mb_header;
 #define FW_MSG_CODE_MASK0x
 #define FW_MSG_CODE_DRV_LOAD_ENGINE 0x1010
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 947c7af..6b02e11 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -1135,6 +1135,23 @@ static int qed_drain(struct qed_dev *cdev)
return 0;
 }
 
+static int qed_set_led(struct qed_dev *cdev, enum qed_led_mode mode)
+{
+   struct qed_hwfn *hwfn = QED_LEADING_HWFN(cdev);
+   struct qed_ptt *ptt;
+   int status = 0;
+
+   ptt = qed_ptt_acquire(hwfn);
+   if (!ptt)
+   return -EAGAIN;
+
+   status = qed_mcp_set_led(hwfn, ptt, mode);
+
+   qed_ptt_release(hwfn, ptt);
+
+   return status;
+}
+
 const struct qed_common_ops qed_common_ops_pass = {
.probe = &qed_probe,
.remove = &qed_remove,
@@ -1155,6 +1172,7 @@ const struct qed_common_ops qed_common_ops_pass = {
.update_msglvl = &qed_init_dp,
.chain_alloc = &qed_chain_alloc,
.chain_free = &qed_chain_free,
+   .set_led = &qed_set_led,
 };
 
 u32 qed_get_protocol_version(enum qed_protocol protocol)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 20d048c..ba1b1f1 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -858,3 +858,30 @@ qed_mcp_send_drv_version(struct qed_hwfn *p_hwfn,
 
return 0;
 }
+
+int qed_mcp_set_led(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt,
+   enum qed_led_mode mode)
+{
+   u32 resp = 0, param = 0, drv_mb_param;
+   int rc;
+
+   switch (mode) {
+   case QED_LED_MODE_ON:
+   drv_mb_param = DRV_MB_PARAM_SET_LED_MODE_ON;
+   break;
+   case QED_LED_MODE_OFF:
+   drv_mb_param = DRV_MB_PARAM_SET_LED_MODE_OFF;
+   break;
+   case QED_LED_MODE_RESTORE:
+   drv_mb_param = DRV_MB_PARAM_SET_LED_MODE_OPER;
+   break;
+   default:
+   DP_NOTICE(p_hwfn, "Invalid LED mode %d\n", mode);
+   return -EINVAL;
+   }
+
+   rc = qed_mcp_cmd(p_hwfn, p_ptt, DRV_MSG_CODE_SET_LED_MODE,
+drv_mb_param, &resp, &param);
+
+   return rc;
+}
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.h b/drivers/net/ethernet/qlogic/qed/qed_mcp.h
index dbaae58..506197d 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.h
@@ -224,6 +224,19 @@ qed_mcp_send_drv_version(struct qed_hwfn *p_hwfn,
 struct qed_ptt *p_ptt,
 struct qed_mcp_drv_version *p_ver);
 
+/**
+ * @brief Set LED status
+ *
+ *  @param p_hwfn
+ *  @param p_ptt
+ *  @param mode - LED mode
+ *
+ * @return int - 0 - operation was successful.
+ */
+int qed_mcp_set_led(struct qed_hwfn *p_hwfn,
+   struct qed_ptt *p_ptt,
+   enum qed_led_mode mode);
+
 /* Using hwfn number (and not pf_num) is required since in CMT mode,
  * same pf_num may be used by two different hwfn
  * TODO - this shouldn't really be in .h file, but until all fields
diff --git a/include/linux/qed/qed_if.h b/include/linux/qed/qed_if.h
index 

[PATCH net-next 6/6] qede: Add support for {get, set}_pauseparam

2015-11-30 Thread Yuval Mintz
From: Sudarsana Kalluru 

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 60 +
 1 file changed, 60 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 9b0bf12..e442b85 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -399,6 +399,64 @@ static int qede_set_ringparam(struct net_device *dev,
return 0;
 }
 
+static void qede_get_pauseparam(struct net_device *dev,
+   struct ethtool_pauseparam *epause)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+   struct qed_link_output current_link;
+
+   memset(&current_link, 0, sizeof(current_link));
+   edev->ops->common->get_link(edev->cdev, &current_link);
+
+   if (current_link.pause_config & QED_LINK_PAUSE_AUTONEG_ENABLE)
+   epause->autoneg = true;
+   if (current_link.pause_config & QED_LINK_PAUSE_RX_ENABLE)
+   epause->rx_pause = true;
+   if (current_link.pause_config & QED_LINK_PAUSE_TX_ENABLE)
+   epause->tx_pause = true;
+
+   DP_VERBOSE(edev, QED_MSG_DEBUG,
+  "ethtool_pauseparam: cmd %d  autoneg %d  rx_pause %d  tx_pause %d\n",
+  epause->cmd, epause->autoneg, epause->rx_pause,
+  epause->tx_pause);
+}
+
+static int qede_set_pauseparam(struct net_device *dev,
+  struct ethtool_pauseparam *epause)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+   struct qed_link_params params;
+   struct qed_link_output current_link;
+
+   if (!edev->dev_info.common.is_mf) {
+   DP_INFO(edev,
+   "Pause parameters can not be updated in non-default mode\n");
+   return -EOPNOTSUPP;
+   }
+
+   memset(&current_link, 0, sizeof(current_link));
+   edev->ops->common->get_link(edev->cdev, &current_link);
+
+   memset(&params, 0, sizeof(params));
+   params.override_flags |= QED_LINK_OVERRIDE_PAUSE_CONFIG;
+   if (epause->autoneg) {
+   if (!(current_link.supported_caps & SUPPORTED_Autoneg)) {
+   DP_INFO(edev, "autoneg not supported\n");
+   return -EINVAL;
+   }
+   params.pause_config |= QED_LINK_PAUSE_AUTONEG_ENABLE;
+   }
+   if (epause->rx_pause)
+   params.pause_config |= QED_LINK_PAUSE_RX_ENABLE;
+   if (epause->tx_pause)
+   params.pause_config |= QED_LINK_PAUSE_TX_ENABLE;
+
+   params.link_up = true;
+   edev->ops->common->set_link(edev->cdev, &params);
+
+   return 0;
+}
+
 static void qede_update_mtu(struct qede_dev *edev, union qede_reload_args *args)
 {
edev->ndev->mtu = args->mtu;
@@ -521,6 +579,8 @@ static const struct ethtool_ops qede_ethtool_ops = {
.get_link = qede_get_link,
.get_ringparam = qede_get_ringparam,
.set_ringparam = qede_set_ringparam,
+   .get_pauseparam = qede_get_pauseparam,
+   .set_pauseparam = qede_set_pauseparam,
.get_strings = qede_get_strings,
.set_phys_id = qede_set_phys_id,
.get_ethtool_stats = qede_get_ethtool_stats,
-- 
1.9.3



[PATCH net-next 1/6] qede: Add support for {get, set}_channels

2015-11-30 Thread Yuval Mintz
From: Sudarsana Kalluru 

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede.h |  1 +
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 53 +
 drivers/net/ethernet/qlogic/qede/qede_main.c|  7 +++-
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index ea00d5f..a65a9b2 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -116,6 +116,7 @@ struct qede_dev {
 (edev)->dev_info.num_tc)
 
struct qede_fastpath*fp_array;
+   u16 req_rss;
u16 num_rss;
u8  num_tc;
 #define QEDE_RSS_CNT(edev) ((edev)->num_rss)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 3a36247..ea2fda8 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -366,6 +366,57 @@ int qede_change_mtu(struct net_device *ndev, int new_mtu)
return 0;
 }
 
+static void qede_get_channels(struct net_device *dev,
+ struct ethtool_channels *channels)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+
+   channels->max_combined = QEDE_MAX_RSS_CNT(edev);
+   channels->combined_count = QEDE_RSS_CNT(edev);
+}
+
+static int qede_set_channels(struct net_device *dev,
+struct ethtool_channels *channels)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+
+   DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+  "set-channels command parameters: rx = %d, tx = %d, other = %d, combined = %d\n",
+  channels->rx_count, channels->tx_count,
+  channels->other_count, channels->combined_count);
+
+   /* We don't support separate rx / tx, nor `other' channels. */
+   if (channels->rx_count || channels->tx_count ||
+   channels->other_count || (channels->combined_count == 0) ||
+   (channels->combined_count > QEDE_MAX_RSS_CNT(edev))) {
+   DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+  "command parameters not supported\n");
+   return -EINVAL;
+   }
+
+   /* Check if there was a change in the active parameters */
+   if (channels->combined_count == QEDE_RSS_CNT(edev)) {
+   DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+  "No change in active parameters\n");
+   return 0;
+   }
+
+   /* We need the number of queues to be divisible between the hwfns */
+   if (channels->combined_count % edev->dev_info.common.num_hwfns) {
+   DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+  "Number of channels must be divisable by %04x\n",
+  edev->dev_info.common.num_hwfns);
+   return -EINVAL;
+   }
+
+   /* Set number of queues and reload if necessary */
+   edev->req_rss = channels->combined_count;
+   if (netif_running(dev))
+   qede_reload(edev, NULL, NULL);
+
+   return 0;
+}
+
 static const struct ethtool_ops qede_ethtool_ops = {
.get_settings = qede_get_settings,
.set_settings = qede_set_settings,
@@ -377,6 +428,8 @@ static const struct ethtool_ops qede_ethtool_ops = {
.get_ethtool_stats = qede_get_ethtool_stats,
.get_sset_count = qede_get_sset_count,
 
+   .get_channels = qede_get_channels,
+   .set_channels = qede_set_channels,
 };
 
 void qede_set_ethtool_ops(struct net_device *dev)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index f4657a2..6237f10 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1502,8 +1502,11 @@ static int qede_set_num_queues(struct qede_dev *edev)
u16 rss_num;
 
/* Setup queues according to possible resources*/
-   rss_num = netif_get_num_default_rss_queues() *
- edev->dev_info.common.num_hwfns;
+   if (edev->req_rss)
+   rss_num = edev->req_rss;
+   else
+   rss_num = netif_get_num_default_rss_queues() *
+ edev->dev_info.common.num_hwfns;
 
rss_num = min_t(u16, QEDE_MAX_RSS_CNT(edev), rss_num);
 
-- 
1.9.3



[PATCH net-next 2/6] qede: Add support for {get, set}_ringparam

2015-11-30 Thread Yuval Mintz
From: Sudarsana Kalluru 

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede.h |  4 +--
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 44 +
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index a65a9b2..7c6caf7 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -270,13 +270,13 @@ int qede_change_mtu(struct net_device *dev, int new_mtu);
 void qede_fill_by_demand_stats(struct qede_dev *edev);
 
 #define RX_RING_SIZE_POW   13
-#define RX_RING_SIZE   BIT(RX_RING_SIZE_POW)
+#define RX_RING_SIZE   ((u16)BIT(RX_RING_SIZE_POW))
 #define NUM_RX_BDS_MAX (RX_RING_SIZE - 1)
 #define NUM_RX_BDS_MIN 128
 #define NUM_RX_BDS_DEF NUM_RX_BDS_MAX
 
 #define TX_RING_SIZE_POW   13
-#define TX_RING_SIZE   BIT(TX_RING_SIZE_POW)
+#define TX_RING_SIZE   ((u16)BIT(TX_RING_SIZE_POW))
 #define NUM_TX_BDS_MAX (TX_RING_SIZE - 1)
 #define NUM_TX_BDS_MIN 128
 #define NUM_TX_BDS_DEF NUM_TX_BDS_MAX
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index ea2fda8..10d80ba 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -333,6 +333,48 @@ static u32 qede_get_link(struct net_device *dev)
return current_link.link_up;
 }
 
+static void qede_get_ringparam(struct net_device *dev,
+  struct ethtool_ringparam *ering)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+
+   ering->rx_max_pending = NUM_RX_BDS_MAX;
+   ering->rx_pending = edev->q_num_rx_buffers;
+   ering->tx_max_pending = NUM_TX_BDS_MAX;
+   ering->tx_pending = edev->q_num_tx_buffers;
+}
+
+static int qede_set_ringparam(struct net_device *dev,
+ struct ethtool_ringparam *ering)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+
+   DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+  "Set ring params command parameters: rx_pending = %d, tx_pending = %d\n",
+  ering->rx_pending, ering->tx_pending);
+
+   /* Validate legality of configuration */
+   if (ering->rx_pending > NUM_RX_BDS_MAX ||
+   ering->rx_pending < NUM_RX_BDS_MIN ||
+   ering->tx_pending > NUM_TX_BDS_MAX ||
+   ering->tx_pending < NUM_TX_BDS_MIN) {
+   DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
+  "Can only support Rx Buffer size [0x%08x,...,0x%08x] and Tx Buffer size [0x%08x,...,0x%08x]\n",
+  NUM_RX_BDS_MIN, NUM_RX_BDS_MAX,
+  NUM_TX_BDS_MIN, NUM_TX_BDS_MAX);
+   return -EINVAL;
+   }
+
+   /* Change ring size and re-load */
+   edev->q_num_rx_buffers = ering->rx_pending;
+   edev->q_num_tx_buffers = ering->tx_pending;
+
+   if (netif_running(edev->ndev))
+   qede_reload(edev, NULL, NULL);
+
+   return 0;
+}
+
 static void qede_update_mtu(struct qede_dev *edev, union qede_reload_args *args)
 {
edev->ndev->mtu = args->mtu;
@@ -424,6 +466,8 @@ static const struct ethtool_ops qede_ethtool_ops = {
.get_msglevel = qede_get_msglevel,
.set_msglevel = qede_set_msglevel,
.get_link = qede_get_link,
+   .get_ringparam = qede_get_ringparam,
+   .set_ringparam = qede_set_ringparam,
.get_strings = qede_get_strings,
.get_ethtool_stats = qede_get_ethtool_stats,
.get_sset_count = qede_get_sset_count,
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
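
The validate-then-commit pattern used by qede_set_ringparam() above can be
sketched outside the kernel as follows. This is only an illustration: the
struct, the function names and the RING_MIN/RING_MAX bounds are assumptions
made for the example, not the driver's NUM_RX_BDS_MIN/MAX values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds; the real driver uses NUM_{RX,TX}_BDS_{MIN,MAX}. */
#define RING_MIN 128u
#define RING_MAX 32768u

/* Hypothetical stand-in for the ring sizes kept in the device struct. */
struct ring_cfg {
	uint32_t rx_pending;
	uint32_t tx_pending;
};

/* Return nonzero when v lies in [RING_MIN, RING_MAX]. */
static int ring_size_ok(uint32_t v)
{
	return v >= RING_MIN && v <= RING_MAX;
}

/*
 * Validate the request first and only then commit it, so that a
 * rejected request leaves the current configuration untouched.
 */
static int ring_cfg_set(struct ring_cfg *cur, uint32_t rx, uint32_t tx)
{
	assert(cur != NULL);

	if (!ring_size_ok(rx) || !ring_size_ok(tx))
		return -EINVAL;

	cur->rx_pending = rx;
	cur->tx_pending = tx;
	return 0;
}
```

In the patch the commit step is followed by a reload when the interface is
running; the sketch leaves that part out.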


[PATCH net-next 0/6] qede/qed: Implement various ethtool operations

2015-11-30 Thread Yuval Mintz
This series adds several new ethtool operations to qede:
  - {get, set}_channels
  - {get, set}_ringparam
  - set_phys_id
  - nway_reset
  - {get, set}_pauseparam
As well as extending the qed APIs to support these commands.

Dave, please consider applying this series to `net-next'.

Thanks,
Yuval



[PATCH 21/28] net: pch_gbe: mark Minnow PHY reset GPIO active low

2015-11-30 Thread Paul Burton
The Minnow PHY reset GPIO is set to 0 to enter reset & 1 to leave reset
- that is, it is an active low GPIO. In order to allow for the code to
be made more generic by further patches, indicate to the GPIO subsystem
that the GPIO is active low & invert the values it is set to such that
they reflect logically whether the device is being reset or not.

Signed-off-by: Paul Burton 
---

 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 3b98b263b..fde4c11 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -2717,7 +2717,8 @@ err_free_netdev:
  */
 static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
 {
-   unsigned long flags = GPIOF_DIR_OUT | GPIOF_INIT_HIGH | GPIOF_EXPORT;
+   unsigned long flags = GPIOF_DIR_OUT | GPIOF_INIT_LOW |
+   GPIOF_EXPORT | GPIOF_ACTIVE_LOW;
unsigned gpio = MINNOW_PHY_RESET_GPIO;
int ret;
 
@@ -2729,10 +2730,10 @@ static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
return ret;
}
 
-   gpio_set_value(gpio, 0);
-   usleep_range(1250, 1500);
gpio_set_value(gpio, 1);
usleep_range(1250, 1500);
+   gpio_set_value(gpio, 0);
+   usleep_range(1250, 1500);
 
return ret;
 }
-- 
2.6.2
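
The distinction the commit message draws between the logical value ("in
reset" or not) and the physical pin level can be modelled with a small
sketch. The struct and helpers below are hypothetical and do not claim to
reproduce the behaviour of any kernel GPIO call; they only show polarity
being applied in one place so that callers think in logical terms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a GPIO line; not the kernel's struct gpio_desc. */
struct line {
	bool active_low;	/* true: logical 1 drives the pin to 0 */
	int raw;		/* level currently driven on the pin */
};

/* Set the logical value; polarity is applied here, once. */
static void line_set(struct line *l, int logical)
{
	assert(l != NULL);
	l->raw = l->active_low ? !logical : !!logical;
}

/* Assert reset, then release it, expressed in logical terms. */
static void phy_reset_pulse(struct line *reset)
{
	line_set(reset, 1);	/* device held in reset */
	/* a real driver would sleep here (the patch uses usleep_range) */
	line_set(reset, 0);	/* device released from reset */
}
```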



[PATCH 22/28] net: pch_gbe: pull PHY GPIO handling out of Minnow code

2015-11-30 Thread Paul Burton
The MIPS Boston development board uses the Intel EG20T Platform
Controller Hub, including its gigabit ethernet controller, and requires
that its RTL8211E PHY be reset much like the Minnow platform. Pull the
PHY reset GPIO handling out of Minnow-specific code such that it can be
shared by later patches.

Signed-off-by: Paul Burton 
---

 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h|  4 ++-
 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   | 33 +++---
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
index 2a55d6d..884f90b 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
@@ -582,15 +582,17 @@ struct pch_gbe_hw_stats {
 
 /**
  * struct pch_gbe_privdata - PCI Device ID driver data
+ * @phy_reset_gpio:PHY reset GPIO descriptor.
  * @phy_tx_clk_delay:  Bool, configure the PHY TX delay in software
  * @phy_disable_hibernate: Bool, disable PHY hibernation
  * @platform_init: Platform initialization callback, called from
  * probe, prior to PHY initialization.
  */
 struct pch_gbe_privdata {
+   struct gpio_desc *phy_reset_gpio;
bool phy_tx_clk_delay;
bool phy_disable_hibernate;
-   int (*platform_init)(struct pci_dev *pdev);
+   int (*platform_init)(struct pci_dev *, struct pch_gbe_privdata *);
 };
 
 /**
diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index fde4c11..23d28f0 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -360,6 +360,16 @@ static void pch_gbe_mac_mar_set(struct pch_gbe_hw *hw, u8 * addr, u32 index)
pch_gbe_wait_clr_bit(&hw->reg->ADDR_MASK, PCH_GBE_BUSY);
 }
 
+static void pch_gbe_phy_set_reset(struct pch_gbe_hw *hw, int value)
+{
+   struct pch_gbe_adapter *adapter = pch_gbe_hw_to_adapter(hw);
+
+   if (!adapter->pdata || !adapter->pdata->phy_reset_gpio)
+   return;
+
+   gpiod_set_value(adapter->pdata->phy_reset_gpio, value);
+}
+
 /**
  * pch_gbe_mac_reset_hw - Reset hardware
  * @hw:Pointer to the HW structure
@@ -2627,7 +2637,14 @@ static int pch_gbe_probe(struct pci_dev *pdev,
adapter->hw.reg = pcim_iomap_table(pdev)[PCH_GBE_PCI_BAR];
adapter->pdata = (struct pch_gbe_privdata *)pci_id->driver_data;
if (adapter->pdata && adapter->pdata->platform_init)
-   adapter->pdata->platform_init(pdev);
+   adapter->pdata->platform_init(pdev, pdata);
+
+   if (adapter->pdata && adapter->pdata->phy_reset_gpio) {
+   pch_gbe_phy_set_reset(&adapter->hw, 1);
+   usleep_range(1250, 1500);
+   pch_gbe_phy_set_reset(&adapter->hw, 0);
+   usleep_range(1250, 1500);
+   }
 
adapter->ptp_pdev = pci_get_bus_and_slot(adapter->pdev->bus->number,
   PCI_DEVFN(12, 4));
@@ -2715,7 +2732,8 @@ err_free_netdev:
 /* The AR803X PHY on the MinnowBoard requires a physical pin to be toggled to
  * ensure it is awake for probe and init. Request the line and reset the PHY.
  */
-static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
+static int pch_gbe_minnow_platform_init(struct pci_dev *pdev,
+   struct pch_gbe_privdata *pdata)
 {
unsigned long flags = GPIOF_DIR_OUT | GPIOF_INIT_LOW |
GPIOF_EXPORT | GPIOF_ACTIVE_LOW;
@@ -2724,16 +2742,11 @@ static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
 
ret = devm_gpio_request_one(&pdev->dev, gpio, flags,
"minnow_phy_reset");
-   if (ret) {
+   if (!ret)
+   pdata->phy_reset_gpio = gpio_to_desc(gpio);
+   else
dev_err(&pdev->dev,
"ERR: Can't request PHY reset GPIO line '%d'\n", gpio);
-   return ret;
-   }
-
-   gpio_set_value(gpio, 1);
-   usleep_range(1250, 1500);
-   gpio_set_value(gpio, 0);
-   usleep_range(1250, 1500);
 
return ret;
 }
-- 
2.6.2



[PATCH 19/28] net: pch_gbe: allow build on MIPS platforms

2015-11-30 Thread Paul Burton
Allow the pch_gbe driver to be built on MIPS platforms, in preparation
for its use on the MIPS Boston board.

Signed-off-by: Paul Burton 
---

 drivers/net/ethernet/oki-semi/pch_gbe/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig b/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig
index 5f7a352..4d3809a 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/Kconfig
@@ -4,7 +4,7 @@
 
 config PCH_GBE
tristate "OKI SEMICONDUCTOR IOH(ML7223/ML7831) GbE"
-   depends on PCI && (X86_32 || COMPILE_TEST)
+   depends on PCI && (X86_32 || MIPS || COMPILE_TEST)
select MII
select PTP_1588_CLOCK_PCH
select NET_PTP_CLASSIFY
-- 
2.6.2



[PATCH 23/28] net: pch_gbe: always reset PHY along with MAC

2015-11-30 Thread Paul Burton
On the MIPS Boston development board, the EG20T MAC does not report
receiving the RX clock from the (RGMII) RTL8211E PHY unless the PHY is
reset at the same time as the MAC. Since the pch_gbe driver resets the
MAC a number of times - twice during probe, and when taking down the
network interface - we need to reset the PHY at all the same times. Do
that from pch_gbe_mac_reset_hw which is used to reset the MAC in all
cases.

Signed-off-by: Paul Burton 
---

 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 23d28f0..824ff9e 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -378,10 +378,13 @@ static void pch_gbe_mac_reset_hw(struct pch_gbe_hw *hw)
 {
/* Read the MAC address. and store to the private data */
pch_gbe_mac_read_mac_addr(hw);
+   pch_gbe_phy_set_reset(hw, 1);
iowrite32(PCH_GBE_ALL_RST, &hw->reg->RESET);
 #ifdef PCH_GBE_MAC_IFOP_RGMII
iowrite32(PCH_GBE_MODE_GMII_ETHER, &hw->reg->MODE);
 #endif
+   pch_gbe_phy_set_reset(hw, 0);
+   usleep_range(1250, 1500);
pch_gbe_wait_clr_bit(&hw->reg->RESET, PCH_GBE_ALL_RST);
/* Setup the receive addresses */
pch_gbe_mac_mar_set(hw, hw->mac.addr, 0);
-- 
2.6.2



[PATCH 24/28] net: pch_gbe: add device tree support

2015-11-30 Thread Paul Burton
Introduce support for retrieving the PHY reset GPIO from device tree,
which will be used on the MIPS Boston development board. This requires
support for probe deferral in order to work correctly, since the order
of device probe is not guaranteed & typically the EG20T GPIO controller
device will be probed after the ethernet MAC.

Signed-off-by: Paul Burton 
---

 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   | 33 +-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 824ff9e..f2a9a38 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #define DRV_VERSION "1.01"
 const char pch_driver_version[] = DRV_VERSION;
@@ -2594,13 +2596,41 @@ static void pch_gbe_remove(struct pci_dev *pdev)
free_netdev(netdev);
 }
 
+static int pch_gbe_parse_dt(struct pci_dev *pdev,
+   struct pch_gbe_privdata **pdata)
+{
+   struct device_node *np = pdev->dev.of_node;
+   struct gpio_desc *gpio;
+
+   if (!config_enabled(CONFIG_OF) || !np)
+   return 0;
+
+   if (!*pdata)
+   *pdata = devm_kzalloc(&pdev->dev, sizeof(**pdata), GFP_KERNEL);
+   if (!*pdata)
+   return -ENOMEM;
+
+   gpio = devm_gpiod_get(&pdev->dev, "phy-reset", GPIOD_ASIS);
+   if (IS_ERR(gpio))
+   return PTR_ERR(gpio);
+
+   (*pdata)->phy_reset_gpio = gpio;
+   return 0;
+}
+
 static int pch_gbe_probe(struct pci_dev *pdev,
  const struct pci_device_id *pci_id)
 {
struct net_device *netdev;
struct pch_gbe_adapter *adapter;
+   struct pch_gbe_privdata *pdata;
int ret;
 
+   pdata = (struct pch_gbe_privdata *)pci_id->driver_data;
+   ret = pch_gbe_parse_dt(pdev, &pdata);
+   if (ret)
+   goto err_out;
+
ret = pcim_enable_device(pdev);
if (ret)
return ret;
@@ -2638,7 +2668,7 @@ static int pch_gbe_probe(struct pci_dev *pdev,
adapter->pdev = pdev;
adapter->hw.back = adapter;
adapter->hw.reg = pcim_iomap_table(pdev)[PCH_GBE_PCI_BAR];
-   adapter->pdata = (struct pch_gbe_privdata *)pci_id->driver_data;
+   adapter->pdata = pdata;
if (adapter->pdata && adapter->pdata->platform_init)
adapter->pdata->platform_init(pdev, pdata);
 
@@ -2729,6 +2759,7 @@ err_free_adapter:
pch_gbe_hal_phy_hw_reset(&adapter->hw);
 err_free_netdev:
free_netdev(netdev);
+err_out:
return ret;
 }
 
-- 
2.6.2
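
Probe deferral, as relied on by the commit message above, works by letting
a resource lookup return an error encoded in the pointer and having probe
pass that error up so the core can retry later. The following userspace
sketch models that flow under stated assumptions: err_ptr(), is_err(),
ptr_err(), gpio_lookup() and probe() are illustrative stand-ins written for
this example, not the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() or any driver
function, and 517 is used as the value of EPROBE_DEFER.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO		4095
#define SKETCH_EPROBE_DEFER	517	/* assumed value of EPROBE_DEFER */

/* Encode a negative errno in a pointer, and decode it again. */
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical GPIO handle and provider state. */
struct gpio { int id; };
static struct gpio reset_gpio = { 7 };
static int provider_ready;

/* Lookup fails with a deferral code until the provider has probed. */
static struct gpio *gpio_lookup(void)
{
	if (!provider_ready)
		return err_ptr(-SKETCH_EPROBE_DEFER);
	return &reset_gpio;
}

/* Probe propagates the lookup error unchanged so it can be retried. */
static int probe(struct gpio **out)
{
	struct gpio *g = gpio_lookup();

	if (is_err(g))
		return (int)ptr_err(g);
	*out = g;
	return 0;
}
```

Doing the lookup before any other resource is acquired, as the patch does,
keeps the deferral path free of cleanup work.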



[PATCH 18/28] ptp: pch: allow build on MIPS platforms

2015-11-30 Thread Paul Burton
Allow the ptp_pch driver to be built on MIPS platforms in preparation
for use on the MIPS Boston board.

Signed-off-by: Paul Burton 
---

 drivers/ptp/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index ee3de34..ee43549 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -74,7 +74,7 @@ config DP83640_PHY
 
 config PTP_1588_CLOCK_PCH
tristate "Intel PCH EG20T as PTP clock"
-   depends on X86_32 || COMPILE_TEST
+   depends on X86_32 || MIPS || COMPILE_TEST
depends on HAS_IOMEM && NET
select PTP_1588_CLOCK
help
-- 
2.6.2



Re: [PATCH 00/13] mvneta Buffer Management and enhancements

2015-11-30 Thread David Miller
From: Marcin Wojtas 
Date: Mon, 30 Nov 2015 15:13:22 +0100

> What kind of abstraction and helpers do you mean? Some kind of API
> (e.g. bm_alloc_buffer, bm_initialize_ring bm_put_buffer,
> bm_get_buffer), which would be used by platform drivers (and specific
> applications if one wants to develop on top of the kernel)?
> 
> In general, what is your top-view of such solution and its cooperation
> with the drivers?

The tricky parts involved have to do with allocating pages for the
buffer pools and minimizing the number of atomic refcounting
operations on those pages for the puts and gets, particularly
around buffer replenish runs.

For example, if you're allocating a page for a buffer pool the device
will chop into N (for any N < PAGE_SIZE) byte pieces, you can
eliminate many atomic operations.
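
One possible reading of the point about chopping a page into N pieces is
sketched below: all N references are taken with a single atomic add rather
than one increment per buffer. This is a userspace illustration using C11
atomics; struct pool_page, page_chop() and PAGE_SZ are hypothetical names
and not a proposed kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define PAGE_SZ 4096u	/* illustrative; stands in for PAGE_SIZE */

/* Hypothetical page with a reference count; not struct page. */
struct pool_page {
	atomic_uint refs;
	unsigned char data[PAGE_SZ];
};

/*
 * Split a page into PAGE_SZ / chunk buffers and take all references
 * with a single atomic add instead of one increment per buffer.
 * Returns the number of buffers written to bufs (at most max).
 */
static size_t page_chop(struct pool_page *pg, size_t chunk,
			unsigned char **bufs, size_t max)
{
	size_t n, i;

	assert(pg != NULL && bufs != NULL);
	assert(chunk > 0 && chunk <= PAGE_SZ);

	n = PAGE_SZ / chunk;
	if (n > max)
		n = max;

	for (i = 0; i < n; i++)
		bufs[i] = pg->data + i * chunk;

	atomic_fetch_add(&pg->refs, (unsigned int)n);	/* one atomic op */
	return n;
}
```

The same batching idea would apply on the release side, dropping several
references at once when a replenish run returns buffers of the same page.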


Re: [PATCH 00/13] mvneta Buffer Management and enhancements

2015-11-30 Thread Gregory CLEMENT
Hi Marcin,
 
 On Sun, Nov 22 2015, Marcin Wojtas  wrote:

> Hi,
>
> Hereby I submit a patchset that introduces various fixes and support
> for new features and enhancements to the mvneta driver:
>
> 1. First three patches are minimal fixes, stable-CC'ed.
>
> 2. Suspend to ram ('s2ram') support. Due to some stability problems
> Thomas Petazzoni's patches did not get merged yet, but I used them for
> verification. Contrary to wfi mode ('standby' - linux does not
> differentiate between them, so same routines are used) all registers'
> contents are lost due to power down, so the configuration has to be
> fully reconstructed during resume.
>
> 3. Optimisations - concatenating TX descriptors' flush, basing on
> xmit_more support and combined approach for finalizing egress processing.
> Thanks to HR timer buffers can be released with small latency, which is
> good for low transfer and small queues. Along with the timer, coalescing
> irqs are used, whose threshold could be increased back to 15.
>
> 4. Buffer manager (BM) support with two preparatory commits. As it is a
> separate block, common for all network ports, a new driver is introduced,
> which configures it and exposes API to the main network driver. It is
> thoroughly described in binding documentation and commit log. Please note,
> that enabling per-port BM usage is done using phandle and the data passed
> in mvneta_bm_probe. It is designed for usage of on-demand device probe
> and dev_set/get_drvdata, however it's awaiting merge to linux-next.
> Therefore, deferring probe is not used - if something goes wrong (same
> in case of errors during changing MTU or suspend/resume cycle) mvneta
> driver falls back to software buffer management and works in a regular way.
>
> Known issues:
> - problems with obtaining all mapped buffers from internal SRAM, when
> destroying the buffer pointer pool
> - problems with unmapping chunk of SRAM during driver removal
> Above do not have an impact on the operation, as they are called during
> driver removal or in error path.
>
> 5. Enable BM on Armada XP and 38X development boards - those ones and
> A370 I could check on my own. In all cases they survived night-long
> linerate iperf. Also tests were performed with A388 SoC working as a
> network bridge between two packet generators. They showed increase of
> maximum processed 64B packets by ~20k (~555k packets with BM enabled
> vs ~535k packets without BM). Also when pushing 1500B-packets with a
> line rate achieved, CPU load decreased from around 25% without BM vs
> 18-20% with BM.

I was trying to test the BM part of your series on the Armada XP GP
board. However it failed very quickly during the pool allocation. After
a first debug I found that the size of the cs used in the
mvebu_mbus_dram_info struct was 0. I have applied your series on a
v4.4-rc1 kernel. At this stage I don't know if it is a regression in the
mbus driver, a misconfiguration on my side or something else.

Does it ring a bell for you?

How do you test it exactly?
Especially on which kernel and with which U-Boot?

Thanks,

Gregory


>
> I'm looking forward to any remarks and comments.
>
> Best regards,
> Marcin Wojtas
>
> Marcin Wojtas (12):
>   net: mvneta: add configuration for MBUS windows access protection
>   net: mvneta: enable IP checksum with jumbo frames for Armada 38x on
> Port0
>   net: mvneta: fix bit assignment in MVNETA_RXQ_CONFIG_REG
>   net: mvneta: enable suspend/resume support
>   net: mvneta: enable mixed egress processing using HR timer
>   bus: mvebu-mbus: provide api for obtaining IO and DRAM window
> information
>   ARM: mvebu: enable SRAM support in mvebu_v7_defconfig
>   net: mvneta: bm: add support for hardware buffer management
>   ARM: mvebu: add buffer manager nodes to armada-38x.dtsi
>   ARM: mvebu: enable buffer manager support on Armada 38x boards
>   ARM: mvebu: add buffer manager nodes to armada-xp.dtsi
>   ARM: mvebu: enable buffer manager support on Armada XP boards
>
> Simon Guinot (1):
>   net: mvneta: add xmit_more support
>
>  .../bindings/net/marvell-armada-370-neta.txt   |  19 +-
>  .../devicetree/bindings/net/marvell-neta-bm.txt|  49 ++
>  arch/arm/boot/dts/armada-385-db-ap.dts |  20 +-
>  arch/arm/boot/dts/armada-388-db.dts|  17 +-
>  arch/arm/boot/dts/armada-388-gp.dts|  17 +-
>  arch/arm/boot/dts/armada-38x.dtsi  |  20 +-
>  arch/arm/boot/dts/armada-xp-db.dts |  19 +-
>  arch/arm/boot/dts/armada-xp-gp.dts |  19 +-
>  arch/arm/boot/dts/armada-xp.dtsi   |  18 +
>  arch/arm/configs/mvebu_v7_defconfig|   1 +
>  drivers/bus/mvebu-mbus.c   |  51 ++
>  drivers/net/ethernet/marvell/Kconfig   |  14 +
>  drivers/net/ethernet/marvell/Makefile  |   1 +
>  drivers/net/ethernet/marvell/mvneta.c  | 660 
> +++--
>  

Re: What now when we're [almost] out of ADVERTISED bits?

2015-11-30 Thread David Decotigny
yes, I will update+repost.

On Sun, Nov 29, 2015 at 10:11 PM, Yuval Mintz  wrote:
>>>> there was a work by David Decotigny that should have solved the out
>>>> of bits problem here [1]. Maybe it should be revived.
>>>>
>>>> [1] https://lkml.org/lkml/2015/1/26/882
>>>
>>> Yes, it should.
>>
>> A repost would strongly facilitate that.
>>
>> Just if anyone ever thinks something is being ignored, just don't even use 
>> your
>> brain, simply repost it again.
>
> David, are you going to re-post? Or do you want me to take over this one?
>
> Thanks,
> Yuval


Re: [PATCH net] ipv4: igmp: Allow removing groups from a removed interface

2015-11-30 Thread Andrew Lunn
On Mon, Nov 30, 2015 at 11:01:48AM -0500, David Miller wrote:
> From: Andrew Lunn 
> Date: Wed, 25 Nov 2015 21:15:36 +0100
> 
> > @@ -2126,7 +2126,7 @@ int ip_mc_leave_group(struct sock *sk, struct 
> > ip_mreqn *imr)
> > ASSERT_RTNL();
> >  
> > in_dev = ip_mc_find_dev(net, imr);
> > -   if (!in_dev) {
> > +   if (!imr->imr_ifindex && !imr->imr_address.s_addr && !in_dev) {
> > ret = -ENODEV;
> > goto out;
> > }
> 
> Now, ip_mc_dec_group() below can take a NULL pointer dereference.  One example
> is if imr_ifindex is specified and the lookup returns NULL in 
> ip_mc_find_dev().

Agreed. Earlier code had an if (in_dev) before the call to
ip_mc_dec_group(). It got removed along the way and now needs adding
back. A v2 patch will follow soon.
 
> This is so ridiculously complicated, just looking at this code breaks 
> something.

Yep. I think part of the problem comes from the code being designed
before interfaces were hot-pluggable.


Re: [PATCH 12/13] mm: memcontrol: account socket memory in unified hierarchy memory controller

2015-11-30 Thread Vladimir Davydov
On Mon, Nov 30, 2015 at 10:26:38AM -0500, Johannes Weiner wrote:
> On Mon, Nov 30, 2015 at 01:54:21PM +0300, Vladimir Davydov wrote:
> > On Tue, Nov 24, 2015 at 04:58:44PM -0500, Johannes Weiner wrote:
> > ...
> > > @@ -5520,15 +5557,30 @@ void sock_release_memcg(struct sock *sk)
> > >   */
> > >  bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int 
> > > nr_pages)
> > >  {
> > > - struct page_counter *counter;
> > > + gfp_t gfp_mask = GFP_KERNEL;
> > >  
> > > > - if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > > > - nr_pages, &counter)) {
> > > - memcg->tcp_mem.memory_pressure = 0;
> > > - return true;
> > > +#ifdef CONFIG_MEMCG_KMEM
> > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > > + struct page_counter *counter;
> > > +
> > > > + if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > > > + nr_pages, &counter)) {
> > > + memcg->tcp_mem.memory_pressure = 0;
> > > + return true;
> > > + }
> > > > + page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > > + memcg->tcp_mem.memory_pressure = 1;
> > > + return false;
> > >   }
> > > > - page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > > - memcg->tcp_mem.memory_pressure = 1;
> > > +#endif
> > > + /* Don't block in the packet receive path */
> > > + if (in_softirq())
> > > + gfp_mask = GFP_NOWAIT;
> > > +
> > > + if (try_charge(memcg, gfp_mask, nr_pages) == 0)
> > > + return true;
> > > +
> > > + try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
> > 
> > We won't trigger high reclaim if we get here, because try_charge does
> > not check high threshold if failing or forcing charge. I think this
> > should be fixed regardless of this patch. The fix is attached below.
> 
> We kind of assume that max is either set above high, or not at
> all. That means when max is hit the high limit has already failed and
> it's of limited use to schedule background reclaim.

Yeah, you're right. No point scheduling the work here - it must be
already running.

> 
> > Also, I don't like calling try_charge twice: the second time will go
> > through all the try_charge steps for nothing. What about checking
> > page_counter value after calling try_charge instead:
> > 
> > try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
> > > return page_counter_read(&memcg->memory) <= memcg->memory.limit;
> > 
> > or adding an out parameter to try_charge that would inform us if charge
> > was forced?
> 
> That's a complete cold path where we are going to drop the packet in
> all but a few cases. It's not worth the trouble.

Right

> 
> > > @@ -5539,10 +5591,32 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup 
> > > *memcg, unsigned int nr_pages)
> > >   */
> > >  void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int 
> > > nr_pages)
> > >  {
> > > > - page_counter_uncharge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > > +#ifdef CONFIG_MEMCG_KMEM
> > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > > > + page_counter_uncharge(&memcg->tcp_mem.memory_allocated,
> > > +   nr_pages);
> > > + return;
> > > + }
> > > +#endif
> > > > + page_counter_uncharge(&memcg->memory, nr_pages);
> > > > + css_put_many(&memcg->css, nr_pages);
> > 
> > cancel_charge(memcg, nr_pages);
> 
> It does the same, but it's a weird name for regular uncharging.

Right

> 
> > From: Vladimir Davydov 
> > Subject: [PATCH] memcg: check high threshold if forcing allocation
> > 
> > try_charge() does not result in checking high threshold if it forces
> > charge. This is incorrect, because we could have failed to reclaim
> > memory due to the current context, so we do need to check high threshold
> > and try to compensate for the excess once we are in the safe context.
> > 
> > Signed-off-by: Vladimir Davydov 
> > 
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 79a29d564bff..e922965b572b 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2112,13 +2112,14 @@ static int try_charge(struct mem_cgroup *memcg, 
> > gfp_t gfp_mask,
> > > page_counter_charge(&memcg->memsw, nr_pages);
> > > css_get_many(&memcg->css, nr_pages);
> >  
> > -   return 0;
> > +   goto check_high;
> >  
> >  done_restock:
> > > css_get_many(&memcg->css, batch);
> > if (batch > nr_pages)
> > refill_stock(memcg, batch - nr_pages);
> >  
> > +check_high:
> > /*
> >  * If the hierarchy is above the normal consumption range, schedule
> >  * reclaim on returning to userland.  We can perform reclaim here
> 
> One problem is that OOM victims force their charges so they can exit
> quickly. It'd be contradictory to then task them with high reclaim.
> 

Yeah, scratch that patch. It isn't necessary anyway, because, as you
pointed out, we don't really need to schedule high reclaim when we fail
hard in mem_cgroup_charge_skmem.

No more 
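
The charge logic discussed in this thread (try to charge within the limit,
and on failure charge anyway while reporting pressure to the caller) can be
modelled in a few lines. This is a single-threaded sketch under stated
assumptions: struct counter and the three helpers are hypothetical
simplifications, not the kernel's page_counter or try_charge().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counter; a simplified stand-in for struct page_counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
};

/* Charge only if the result stays within the limit. */
static bool counter_try_charge(struct counter *c, unsigned long nr)
{
	assert(c != NULL);
	if (nr > c->limit || c->usage > c->limit - nr)
		return false;
	c->usage += nr;
	return true;
}

/*
 * Try first; on failure charge anyway and report false, so that the
 * caller can signal memory pressure to the socket layer.
 */
static bool charge_skmem(struct counter *c, unsigned long nr)
{
	if (counter_try_charge(c, nr))
		return true;
	c->usage += nr;		/* forced charge, may exceed the limit */
	return false;
}

/* Give back nr pages that were charged earlier. */
static void uncharge_skmem(struct counter *c, unsigned long nr)
{
	assert(c != NULL && c->usage >= nr);
	c->usage -= nr;
}
```

The forced branch is the cold path mentioned above: the caller is expected
to drop the packet in most cases, so its cost matters little.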

[P.A. Semi] Does the ethernet interface work on your Electra, Chitra, Nemo, and Athena board?

2015-11-30 Thread Christian Zigotzky

Hi All,

I have tested the PA Semi Ethernet with the kernels 4.2.3 and 4.3.0 
today. With the kernel 4.2.3 it works but with the kernel 4.3.0 final it 
doesn't work.


After that I tested some git kernels and release candidates of 4.3.

Kernel 4.3 git from Tue Sep 01, 2015 -> PA Semi Ethernet works
Kernel 4.3 git from Wed Sep 02, 2015 -> PA Semi Ethernet works
Kernel 4.3 git from Thu Sep 03, 2015 -> PA Semi Ethernet works
Kernel 4.3 git from Fri Sep 04, 2015 -> PA Semi Ethernet doesn't work 
(Merge tag 'powerpc-4.3-1': 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ff474e8ca8547d09cb82ebab56d4c96f9eea01ce)

Kernel 4.3 git from Sat Sep 05, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 git from Mon Sep 07, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 git from Wed Sep 09, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 git from Fri Sep 11, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 RC1 from Sun Sep 13, 2015 -> PA Semi Ethernet doesn't work
Kernel 4.3 RC2 from Mon Sep 21, 2015 -> PA Semi Ethernet doesn't work

The problematic commit must be between Thu Sep 03, 2015 at 09:37 AM (UTC 
+2) and Fri Sep 04, 2015 at 7:38 PM (UTC +2) in the linux git.


Linux git: Between 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/?ofs=15500 
and 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/?ofs=15200.


Maybe 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ff474e8ca8547d09cb82ebab56d4c96f9eea01ce.


Cheers,

Christian

On 30 November 2015 at 10:48 AM, Christian Zigotzky wrote:

Hi Denis,

Thank you for your answer. Sorry about my unclear description.

Yes, the driver probe function finds the device.

With kernel 4.4-rc3:

dmesg | grep -i eth0

[ 2.297473] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00

dhclient eth0

RTNETLINK answers: Cannot allocate memory

With kernel 4.1.13:

dmesg | grep -i eth0

[ 2.328115] eth0: PA Semi GMAC: intf 5, hw addr 02:00:e0:0a:30:00
[ 37.130466] eth0: Link is up at 100 Mbps, full duplex.

Cheers,

Christian

On 30 November 2015 at 09:37 AM, Denis Kirjanov wrote:

On 11/29/15, Christian Zigotzky  wrote:

Hi All,

Does the ethernet interface on your Electra, Chitra, Nemo, and Athena
board work with the release candidates of the kernel 4.4? Unfortunately
the P.A. Semi ethernet doesn't work on our Nemo boards with the release
candidates of the kernel 4.4. We have set the following entries in the
kernel config:

CONFIG_NET_VENDOR_PASEMI=y
CONFIG_PASEMI_MAC=y

Could you please test the P.A. Semi ethernet on your P.A. Semi boards?

It's not clear from your descriptions what is not working. Does the
driver probe function find  a device? Does the interface show up in
the kernel? Can it send/receive packets?

Also please CC netdev.

Thanks.


Thanks in advance,

Christian
___
Linuxppc-dev mailing list
linuxppc-...@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev






Re: [PATCH 13/13] mm: memcontrol: hook up vmpressure to socket pressure

2015-11-30 Thread Vladimir Davydov
On Mon, Nov 30, 2015 at 10:58:38AM -0500, Johannes Weiner wrote:
> On Mon, Nov 30, 2015 at 02:36:28PM +0300, Vladimir Davydov wrote:
> > Suppose we have the following cgroup configuration.
> > 
> > A __ B
> >   \_ C
> > 
> > A is empty (which is natural for the unified hierarchy AFAIU). B has
> > some workload running in it, and C generates socket pressure. Due to the
> > socket pressure coming from C we start reclaim in A, which results in
> > thrashing of B, but we might not put sockets under pressure in A or C,
> > because vmpressure does not account pages scanned/reclaimed in B when
> > generating a vmpressure event for A or C. This might result in
> > aggressive reclaim and thrashing in B w/o generating a signal for C to
> > stop growing socket buffers.
> > 
> > Do you think such a situation is possible? If so, would it make sense to
> > switch to post-order walk in shrink_zone and pass sub-tree
> > scanned/reclaimed stats to vmpressure for each scanned memcg?
> 
> In that case the LRU pages in C would experience pressure as well,
> which would then rein in the sockets in C. There must be some LRU
> pages in there, otherwise who is creating socket pressure?
> 
> The same applies to shrinkers. All secondary reclaim is driven by LRU
> reclaim results.
> 
> I can see that there is some unfairness in distributing memcg reclaim
> pressure purely based on LRU size, because there are scenarios where
> the auxiliary objects (incl. sockets, but mostly shrinker pools)
> amount to a significant portion of the group's memory footprint. But
> substitute group for NUMA node and we've had this behavior for
> years. I'm not sure it's actually a problem in practice.
> 

Fair enough. Let's wait until we hit this problem in the real world then.

The patch looks good to me.

Reviewed-by: Vladimir Davydov 

Thanks,
Vladimir


[PATCH 25/28] net: pch_gbe: allow longer for resets

2015-11-30 Thread Paul Burton
Resets of the EG20T MAC on the MIPS Boston development board take longer
than the 1000 loops that pch_gbe_wait_clr_bit was performing. Bump up
the number of loops.

Signed-off-by: Paul Burton 
---

 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index f2a9a38..f650f45 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -321,7 +321,7 @@ static void pch_gbe_wait_clr_bit(void *reg, u32 bit)
u32 tmp;
 
/* wait busy */
-   tmp = 1000;
+   tmp = 1;
while ((ioread32(reg) & bit) && --tmp)
cpu_relax();
if (!tmp)
-- 
2.6.2
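
The bounded polling loop that this patch tunes can be sketched as below.
The register reader is passed in as a callback so the example runs in
userspace; wait_clr_bit(), read_reg_fn and fake_reg() are hypothetical
names for illustration, and the loop bound is a parameter rather than any
value taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register reader; a driver would use ioread32() here. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until bit clears or max_loops reads have been made.
 * Returns true if the bit cleared, false on timeout.
 */
static bool wait_clr_bit(read_reg_fn read_reg, void *ctx,
			 uint32_t bit, unsigned int max_loops)
{
	assert(read_reg != NULL);

	while (max_loops--) {
		if (!(read_reg(ctx) & bit))
			return true;
	}
	return false;
}

/* Fake register for demonstration: stays busy for *ctx more reads. */
static uint32_t fake_reg(void *ctx)
{
	unsigned int *reads_left = ctx;

	if (*reads_left == 0)
		return 0;
	(*reads_left)--;
	return 0x1;
}
```

A loop count bounds the number of reads, not elapsed time, so the same
count can mean different timeouts on different platforms. A deadline based
on a clock would be one alternative, at the cost of extra reads of the
time source.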



[PATCH v2 net-next] tcp: suppress too verbose messages in tcp_send_ack()

2015-11-30 Thread Eric Dumazet
From: Eric Dumazet 

If tcp_send_ack() cannot allocate an skb, we properly handle this
and set up a timer to try later.

Use __GFP_NOWARN to avoid polluting syslog in case the host is
under memory pressure, so that pertinent messages are not lost under
a flood of useless information.

sk_gfp_atomic() can use its gfp_mask argument (all callers currently
were using GFP_ATOMIC before this patch)

We rename sk_gfp_atomic() to sk_gfp_mask() to clearly express this
function now takes into account its second argument (gfp_mask)

Note that when tcp_transmit_skb() is called with clone_it set to false,
we do not attempt memory allocations, so can pass a 0 gfp_mask, which
most compilers can emit faster than a non zero or constant value.

Signed-off-by: Eric Dumazet 
---
v2: rename sk_gfp_atomic() to sk_gfp_mask()

 include/net/sock.h|4 ++--
 net/ipv4/tcp_output.c |   14 --
 net/ipv6/tcp_ipv6.c   |6 +++---
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 7f89e4ba18d1..89073bda77df 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -774,9 +774,9 @@ static inline int sk_memalloc_socks(void)
 
 #endif
 
-static inline gfp_t sk_gfp_atomic(const struct sock *sk, gfp_t gfp_mask)
+static inline gfp_t sk_gfp_mask(const struct sock *sk, gfp_t gfp_mask)
 {
-   return GFP_ATOMIC | (sk->sk_allocation & __GFP_MEMALLOC);
+   return gfp_mask | (sk->sk_allocation & __GFP_MEMALLOC);
 }
 
 static inline void sk_acceptq_removed(struct sock *sk)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index cb7ca569052c..a800cee88035 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2296,7 +2296,7 @@ void __tcp_push_pending_frames(struct sock *sk, unsigned 
int cur_mss,
return;
 
if (tcp_write_xmit(sk, cur_mss, nonagle, 0,
-  sk_gfp_atomic(sk, GFP_ATOMIC)))
+  sk_gfp_mask(sk, GFP_ATOMIC)))
tcp_check_probe_timer(sk);
 }
 
@@ -3352,8 +3352,9 @@ void tcp_send_ack(struct sock *sk)
 * tcp_transmit_skb() will set the ownership to this
 * sock.
 */
-   buff = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC));
-   if (!buff) {
+   buff = alloc_skb(MAX_TCP_HEADER,
+sk_gfp_mask(sk, GFP_ATOMIC | __GFP_NOWARN));
+   if (unlikely(!buff)) {
inet_csk_schedule_ack(sk);
inet_csk(sk)->icsk_ack.ato = TCP_ATO_MIN;
inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
@@ -3375,7 +3376,7 @@ void tcp_send_ack(struct sock *sk)
 
/* Send it off, this clears delayed acks for us. */
skb_mstamp_get(&buff->skb_mstamp);
-   tcp_transmit_skb(sk, buff, 0, sk_gfp_atomic(sk, GFP_ATOMIC));
+   tcp_transmit_skb(sk, buff, 0, (__force gfp_t)0);
 }
 EXPORT_SYMBOL_GPL(tcp_send_ack);
 
@@ -3396,7 +3397,8 @@ static int tcp_xmit_probe_skb(struct sock *sk, int 
urgent, int mib)
struct sk_buff *skb;
 
/* We don't queue it, tcp_transmit_skb() sets ownership. */
-   skb = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC));
+   skb = alloc_skb(MAX_TCP_HEADER,
+   sk_gfp_mask(sk, GFP_ATOMIC | __GFP_NOWARN));
if (!skb)
return -1;
 
@@ -3409,7 +3411,7 @@ static int tcp_xmit_probe_skb(struct sock *sk, int 
urgent, int mib)
tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPHDR_ACK);
skb_mstamp_get(&skb->skb_mstamp);
NET_INC_STATS(sock_net(sk), mib);
-   return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
+   return tcp_transmit_skb(sk, skb, 0, (__force gfp_t)0);
 }
 
 void tcp_send_window_probe(struct sock *sk)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index c5429a636f1a..41bcd59a2ac7 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1130,7 +1130,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct 
sock *sk, struct sk_buff *
 */
tcp_md5_do_add(newsk, (union tcp_md5_addr *)&newsk->sk_v6_daddr,
   AF_INET6, key->key, key->keylen,
-  sk_gfp_atomic(sk, GFP_ATOMIC));
+  sk_gfp_mask(sk, GFP_ATOMIC));
}
 #endif
 
@@ -1146,7 +1146,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct 
sock *sk, struct sk_buff *
/* Clone pktoptions received with SYN, if we own the req */
if (ireq->pktopts) {
newnp->pktoptions = skb_clone(ireq->pktopts,
- sk_gfp_atomic(sk, 
GFP_ATOMIC));
+ sk_gfp_mask(sk, 
GFP_ATOMIC));
consume_skb(ireq->pktopts);
ireq->pktopts = NULL;
if (newnp->pktoptions)
@@ -1212,7 +1212,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct 

[PATCH net] ipv6: kill sk_dst_lock

2015-11-30 Thread Eric Dumazet
From: Eric Dumazet 

While testing the np->opt RCU conversion, I found that UDP/IPv6 was
using a mixture of xchg() and sk_dst_lock to protect concurrent changes
to sk->sk_dst_cache, leading to possible corruptions and crashes.

ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
way to fix the mess is to remove sk_dst_lock completely, as we did for
IPv4.

__ip6_dst_store() and ip6_dst_store() now share the same implementation.

Since sk_setup_caps() can be called with or without the socket lock held,
we have to use sk_dst_set() instead of __sk_dst_set().

Signed-off-by: Eric Dumazet 
Reported-by: Dmitry Vyukov 
---
 include/net/ip6_route.h  |   17 -
 include/net/sock.h   |3 +--
 net/core/sock.c  |4 +---
 net/dccp/ipv6.c  |4 ++--
 net/ipv6/af_inet6.c  |2 +-
 net/ipv6/icmp.c  |   14 --
 net/ipv6/inet6_connection_sock.c |   10 +-
 net/ipv6/tcp_ipv6.c  |4 ++--
 8 files changed, 12 insertions(+), 46 deletions(-)

diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 2bfb2ad2fab1..877f682989b8 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -133,27 +133,18 @@ void rt6_clean_tohost(struct net *net, struct in6_addr 
*gateway);
 /*
  * Store a destination cache entry in a socket
  */
-static inline void __ip6_dst_store(struct sock *sk, struct dst_entry *dst,
-  const struct in6_addr *daddr,
-  const struct in6_addr *saddr)
+static inline void ip6_dst_store(struct sock *sk, struct dst_entry *dst,
+const struct in6_addr *daddr,
+const struct in6_addr *saddr)
 {
struct ipv6_pinfo *np = inet6_sk(sk);
-   struct rt6_info *rt = (struct rt6_info *) dst;
 
+   np->dst_cookie = rt6_get_cookie((struct rt6_info *)dst);
sk_setup_caps(sk, dst);
np->daddr_cache = daddr;
 #ifdef CONFIG_IPV6_SUBTREES
np->saddr_cache = saddr;
 #endif
-   np->dst_cookie = rt6_get_cookie(rt);
-}
-
-static inline void ip6_dst_store(struct sock *sk, struct dst_entry *dst,
-struct in6_addr *daddr, struct in6_addr *saddr)
-{
-   spin_lock(&sk->sk_dst_lock);
-   __ip6_dst_store(sk, dst, daddr, saddr);
-   spin_unlock(&sk->sk_dst_lock);
 }
 
 static inline bool ipv6_unicast_destination(const struct sk_buff *skb)
diff --git a/include/net/sock.h b/include/net/sock.h
index 7f89e4ba18d1..27f1d03e7a73 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -254,7 +254,6 @@ struct cg_proto;
   *@sk_wq: sock wait queue and async head
   *@sk_rx_dst: receive input route used by early demux
   *@sk_dst_cache: destination cache
-  *@sk_dst_lock: destination cache lock
   *@sk_policy: flow policy
   *@sk_receive_queue: incoming packets
   *@sk_wmem_alloc: transmit queue bytes committed
@@ -391,7 +390,7 @@ struct sock {
 #endif
struct dst_entry*sk_rx_dst;
struct dst_entry __rcu  *sk_dst_cache;
-   spinlock_t  sk_dst_lock;
+   /* Note: 32bit hole on 64bit arches */
atomic_tsk_wmem_alloc;
atomic_tsk_omem_alloc;
int sk_sndbuf;
diff --git a/net/core/sock.c b/net/core/sock.c
index 1e4dd54bfb5a..81cdeacfc5ce 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1530,7 +1530,6 @@ struct sock *sk_clone_lock(const struct sock *sk, const 
gfp_t priority)
skb_queue_head_init(&newsk->sk_receive_queue);
skb_queue_head_init(&newsk->sk_write_queue);
 
-   spin_lock_init(&newsk->sk_dst_lock);
rwlock_init(&newsk->sk_callback_lock);
lockdep_set_class_and_name(&newsk->sk_callback_lock,
af_callback_keys + newsk->sk_family,
@@ -1607,7 +1606,7 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
 {
u32 max_segs = 1;
 
-   __sk_dst_set(sk, dst);
+   sk_dst_set(sk, dst);
sk->sk_route_caps = dst->dev->features;
if (sk->sk_route_caps & NETIF_F_GSO)
sk->sk_route_caps |= NETIF_F_GSO_SOFTWARE;
@@ -2388,7 +2387,6 @@ void sock_init_data(struct socket *sock, struct sock *sk)
} else
sk->sk_wq   =   NULL;
 
-   spin_lock_init(&sk->sk_dst_lock);
rwlock_init(&sk->sk_callback_lock);
lockdep_set_class_and_name(&sk->sk_callback_lock,
af_callback_keys + sk->sk_family,
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index db5fc2440a23..9ba3b69afea2 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -453,7 +453,7 @@ static struct sock *dccp_v6_request_recv_sock(const struct 
sock *sk,
 * comment in that function for the gory details. -acme
 */
 
-   __ip6_dst_store(newsk, dst, NULL, NULL);
+   

[PATCH] sctp: use GFP_USER for user-controlled kmalloc

2015-11-30 Thread Marcelo Ricardo Leitner
Dmitry Vyukov reported that the user could trigger a kernel warning by
using a large len value for getsockopt SCTP_GET_LOCAL_ADDRS, as that
value directly affects the value used as a kmalloc() parameter.

This patch thus switches the allocation flags for all user-controllable
kmalloc sizes to GFP_USER to put some more restrictions on them, and also
disables the warning, as it is not necessary.

Signed-off-by: Marcelo Ricardo Leitner 
Acked-by: Daniel Borkmann 
---
 net/sctp/socket.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 897c01c029cab3d5805cc56b0964c70e06f4143a..676b3bb092e16848fd1c822e1c999af4a2ef198d 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -972,7 +972,7 @@ static int sctp_setsockopt_bindx(struct sock *sk,
return -EFAULT;
 
/* Alloc space for the address array in kernel memory.  */
-   kaddrs = kmalloc(addrs_size, GFP_KERNEL);
+   kaddrs = kmalloc(addrs_size, GFP_USER | __GFP_NOWARN);
if (unlikely(!kaddrs))
return -ENOMEM;
 
@@ -4928,7 +4928,7 @@ static int sctp_getsockopt_local_addrs(struct sock *sk, 
int len,
to = optval + offsetof(struct sctp_getaddrs, addrs);
space_left = len - offsetof(struct sctp_getaddrs, addrs);
 
-   addrs = kmalloc(space_left, GFP_KERNEL);
+   addrs = kmalloc(space_left, GFP_USER | __GFP_NOWARN);
if (!addrs)
return -ENOMEM;
 
-- 
2.5.0



Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload

2015-11-30 Thread Tom Herbert
On Mon, Nov 23, 2015 at 1:02 PM, Anjali Singhai Jain
 wrote:
> Replace add/del ndo ops for vxlan_port with tunnel_port so that all UDP
> based tunnels can use the same ndo op. Add a parameter to pass tunnel
> type to the ndo_op.
>
Please consider using RX ntuple filters for this instead of a new ndo
op. The vxlan ndo op essentially implements a limited filter with a
rule to match a destination UDP port and the action of processing
the packet as vxlan. ntuple filters generalize that so that the
filtering becomes arbitrary. We'll need the ability to filter on
4-tuple when we implement tunnels to go through firewalls or for
offloading other UDP protocols such as SPUD or QUIC.

Tom
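
To make the contrast concrete, here is a rough userspace sketch of a masked
n-tuple match, with the per-port hook expressed as the special case where only
the destination port is masked in. Everything here (flow_key, ntuple_rule,
rule_matches(), port_only_rule(), the action field) is hypothetical and is not
the ethtool or driver interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flow key; host byte order is assumed to keep the sketch short. */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* A rule matches when every field agrees under its mask; a zero mask is a wildcard. */
struct ntuple_rule {
    struct flow_key value;
    struct flow_key mask;
    int action;               /* hypothetical action id, e.g. "parse as VXLAN" */
};

static bool rule_matches(const struct ntuple_rule *r, const struct flow_key *k)
{
    return ((k->saddr ^ r->value.saddr) & r->mask.saddr) == 0 &&
           ((k->daddr ^ r->value.daddr) & r->mask.daddr) == 0 &&
           ((k->sport ^ r->value.sport) & r->mask.sport) == 0 &&
           ((k->dport ^ r->value.dport) & r->mask.dport) == 0;
}

/* A destination-port-only rule: what a per-port add/del hook can express. */
static struct ntuple_rule port_only_rule(uint16_t dport, int action)
{
    struct ntuple_rule r = {
        .value  = { .dport = dport },
        .mask   = { .dport = 0xffff },
        .action = action,
    };

    assert(dport != 0);       /* a zero port would not identify a tunnel */
    return r;
}
```

Narrowing the same rule to a full 4-tuple is then only a matter of filling in
more value/mask pairs, without adding another hook per protocol.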

> Change all drivers to use the generalized udp tunnel offload
>
> Patch was compile tested with x86_64_defconfig.
>
> Signed-off-by: Kiran Patil 
> Signed-off-by: Anjali Singhai Jain 
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 15 ++---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c| 13 +---
>  drivers/net/ethernet/emulex/benet/be_main.c  | 14 +---
>  drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 27 
>  drivers/net/ethernet/intel/i40e/i40e_main.c  | 41 
> +---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c| 17 +++---
>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c   | 21 
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 17 +++---
>  drivers/net/vxlan.c  | 23 +++--
>  include/linux/netdevice.h| 34 ++--
>  include/net/udp_tunnel.h |  6 
>  11 files changed, 157 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index 2273576..ad2782f 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -47,6 +47,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -10124,11 +10125,14 @@ static void __bnx2x_add_vxlan_port(struct bnx2x 
> *bp, u16 port)
>  }
>
>  static void bnx2x_add_vxlan_port(struct net_device *netdev,
> -sa_family_t sa_family, __be16 port)
> +sa_family_t sa_family, __be16 port,
> +u32 type)
>  {
> struct bnx2x *bp = netdev_priv(netdev);
> u16 t_port = ntohs(port);
>
> +   if (type != UDP_TUNNEL_VXLAN)
> +   return;
> __bnx2x_add_vxlan_port(bp, t_port);
>  }
>
> @@ -10152,11 +10156,14 @@ static void __bnx2x_del_vxlan_port(struct bnx2x 
> *bp, u16 port)
>  }
>
>  static void bnx2x_del_vxlan_port(struct net_device *netdev,
> -sa_family_t sa_family, __be16 port)
> +sa_family_t sa_family, __be16 port,
> +u32 type)
>  {
> struct bnx2x *bp = netdev_priv(netdev);
> u16 t_port = ntohs(port);
>
> +   if (type != UDP_TUNNEL_VXLAN)
> +   return;
> __bnx2x_del_vxlan_port(bp, t_port);
>  }
>  #endif
> @@ -13008,8 +13015,8 @@ static const struct net_device_ops bnx2x_netdev_ops = 
> {
> .ndo_set_vf_link_state  = bnx2x_set_vf_link_state,
> .ndo_features_check = bnx2x_features_check,
>  #ifdef CONFIG_BNX2X_VXLAN
> -   .ndo_add_vxlan_port = bnx2x_add_vxlan_port,
> -   .ndo_del_vxlan_port = bnx2x_del_vxlan_port,
> +   .ndo_add_udp_tunnel_port= bnx2x_add_vxlan_port,
> +   .ndo_del_udp_tunnel_port= bnx2x_del_vxlan_port,
>  #endif
>  };
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
> b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index f2d0dc9..5b96ddf 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -5421,7 +5421,7 @@ static void bnxt_cfg_ntp_filters(struct bnxt *bp)
>  #endif /* CONFIG_RFS_ACCEL */
>
>  static void bnxt_add_vxlan_port(struct net_device *dev, sa_family_t 
> sa_family,
> -   __be16 port)
> +   __be16 port, u32 type)
>  {
> struct bnxt *bp = netdev_priv(dev);
>
> @@ -5431,6 +5431,9 @@ static void bnxt_add_vxlan_port(struct net_device *dev, 
> sa_family_t sa_family,
> if (sa_family != AF_INET6 && sa_family != AF_INET)
> return;
>
> +   if (type != UDP_TUNNEL_VXLAN)
> +   return;
> +
> if (bp->vxlan_port_cnt && bp->vxlan_port != port)
> return;
>
> @@ -5443,7 +5446,7 @@ static void bnxt_add_vxlan_port(struct net_device *dev, 
> sa_family_t sa_family,
>  }
>
>  static void bnxt_del_vxlan_port(struct net_device *dev, sa_family_t 
> sa_family,
> -   __be16 port)
> +  

Re: user-controllable kmalloc size in sctp_getsockopt_local_addrs

2015-11-30 Thread Marcelo Ricardo Leitner
On Sat, Nov 28, 2015 at 01:40:08PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> The following program triggers WARNING in kmalloc:
> 

I messed up with the in-reply-to, put an extra c on it, but I just
posted a patch for this, subject:
[PATCH] sctp: use GFP_USER for user-controlled kmalloc

Thanks,
Marcelo



Re: [PATCH net-next 3/3] net: mvneta: Add naive RSS support

2015-11-30 Thread Gregory CLEMENT
Hi Marcin,
 
 On Sat., Nov. 28 2015, Marcin Wojtas  wrote:

> Hi Gregory,
>
>> +
>> +   /* update unicast mapping */
>> +   mvneta_set_rx_mode(pp->dev);
>
> I know it may be an ultimate level of nitpicking, but can you start a
> comment with capital letter?:)

If I get other review comments, then I can fix it in the next version. But if
you have a look at the other comments, not all of them start with a capital
letter.

Thanks,

Gregory

>
> Best regards,
> Marcin

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com


Re: [iproute PATCH RFC] libnetlink: introduce DECLARE_NLREQ

2015-11-30 Thread Stephen Hemminger
On Mon, 30 Nov 2015 16:47:25 +0100
Phil Sutter  wrote:

> libmnl looks nice and simple (unlike libnl I was initially looking at by
> accident). Now how to pull this off:
> 
> I don't think mandatorily depending on libmnl will be acceptable, do
> you? So I can imagine two ways to do this:

Having libmnl be mandatory is fine, but please put it in net-next.
Every distro has libmnl, and as long as it is documented it is not a big deal.

> A) Have a libmnl version of lib/libnetlink.c which is used instead of
>the old one if libmnl is present.
> 
> B) Pull a copy of libmnl into iproute2 sources so it's always available
>(as fallback) and make it replace lib/libnetlink.c. This sounds worse
>than it is, using git-subtree allows to do this without imposing user
>knowledge about it (like git-submodule does).

Just incrementally change code to use libmnl instead of libnetlink.
Start with simple stuff.


Re: [iproute PATCH RFC] libnetlink: introduce DECLARE_NLREQ

2015-11-30 Thread Phil Sutter
On Sun, Nov 29, 2015 at 12:07:52PM -0800, Stephen Hemminger wrote:
> On Thu, 26 Nov 2015 14:26:05 +0100
> Phil Sutter  wrote:
> 
> > This macro aims to simplify most netlink users' pattern to prepare a
> > request, which is to create an unnamed struct and initialize it:
> > 
> > | struct {
> > |   struct nlmsghdr n;
> > |   struct whatever foo;
> > |   char buf[arbitrary number];
> > | } req;
> > |
> > | memset(&req, 0, sizeof(req));
> > | req.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct whatever));
> > | req.n.nlmsg_flags = NLM_F_REQUEST;
> > 
> > Having this patch applied, the above can be replaced by a static
> > initializer like so:
> > 
> > | DECLARE_NLREQ(req, n, struct whatever foo, arbitrary number);
> > 
> > There is an added benefit, as well: Due to explicit alignment, the
> > requested tailroom is really as big as requested no matter what size
> > struct whatever really is.
> > 
> > Signed-off-by: Phil Sutter 
> > ---
> > This patch is RFC because I want to wait for peer review and upstream
> > acceptance before sending in the big refactoring patch itself.
> > ---
> >  include/libnetlink.h | 11 +++
> >  1 file changed, 11 insertions(+)
> 
> I am not a fan of complex macros. But netlink seems to get lots of them.
> You need to add more parens round arguments (like name).
> 
> Really longterm would rather iproute2 switched to a cleaner library like 
> libmnl

libmnl looks nice and simple (unlike libnl I was initially looking at by
accident). Now how to pull this off:

I don't think mandatorily depending on libmnl will be acceptable, do
you? So I can imagine two ways to do this:

A) Have a libmnl version of lib/libnetlink.c which is used instead of
   the old one if libmnl is present.

B) Pull a copy of libmnl into iproute2 sources so it's always available
   (as fallback) and make it replace lib/libnetlink.c. This sounds worse
   than it is, using git-subtree allows to do this without imposing user
   knowledge about it (like git-submodule does).

What do you think?

Cheers, Phil


Re: [PATCH 06/13] net: mvneta: enable mixed egress processing using HR timer

2015-11-30 Thread Marcin Wojtas
Hi Simon,

2015-11-26 17:45 GMT+01:00 Simon Guinot :
> Hi Marcin,
>
> On Sun, Nov 22, 2015 at 08:53:52AM +0100, Marcin Wojtas wrote:
>> Mixed approach allows using higher interrupt threshold (increased back to
>> 15 packets), useful in high throughput. In case of small amount of data
>> or very short TX queues HR timer ensures releasing buffers with small
>> latency.
>>
>> Along with existing tx_done processing by coalescing interrupts this
>> commit enables triggering HR timer each time the packets are sent.
>> Time threshold can also be configured, using ethtool.
>>
>> Signed-off-by: Marcin Wojtas 
>> Signed-off-by: Simon Guinot 
>> ---
>>  drivers/net/ethernet/marvell/mvneta.c | 89 
>> +--
>>  1 file changed, 85 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/marvell/mvneta.c 
>> b/drivers/net/ethernet/marvell/mvneta.c
>> index 9c9e858..f5acaf6 100644
>> --- a/drivers/net/ethernet/marvell/mvneta.c
>> +++ b/drivers/net/ethernet/marvell/mvneta.c
>> @@ -21,6 +21,8 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>
> ktime.h is already included by hrtimer.h.
>
>>  #include 
>>  #include 
>>  #include 
>> @@ -226,7 +228,8 @@
>>  /* Various constants */
>>
>>  /* Coalescing */
>> -#define MVNETA_TXDONE_COAL_PKTS  1
>> +#define MVNETA_TXDONE_COAL_PKTS  15
>> +#define MVNETA_TXDONE_COAL_USEC  100
>
> Maybe we should keep the default configuration and let the user choose
> to enable (or not) this feature ?

I think that this feature should be enabled by default, same as in RX
(which is enabled by HW in ingress). It suits all kinds of traffic
and queue sizes. If someone really wants to disable it (even if I don't
know the possible justification), I'd prefer that they use ethtool
for this purpose.

>
>>  #define MVNETA_RX_COAL_PKTS  32
>>  #define MVNETA_RX_COAL_USEC  100
>>
>> @@ -356,6 +359,11 @@ struct mvneta_port {
>>   struct net_device *dev;
>>   struct notifier_block cpu_notifier;
>>
>> + /* Egress finalization */
>> + struct tasklet_struct tx_done_tasklet;
>> + struct hrtimer tx_done_timer;
>> + bool timer_scheduled;
>
> I think we could use hrtimer_is_queued() instead of introducing a new
> variable.
>

Good point, I'll try that.

Best regards,
Marcin


Re: [PATCH v2] vhost: replace % with & on data path

2015-11-30 Thread David Miller
From: "Michael S. Tsirkin" 
Date: Mon, 30 Nov 2015 11:15:23 +0200

> We know vring num is a power of 2, so use &
> to mask the high bits.
> 
> Signed-off-by: Michael S. Tsirkin 

Acked-by: David S. Miller 
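
The identity the patch relies on is that, for a power-of-two ring size,
masking with (num - 1) gives the same slot as the modulo. A small sketch, with
ring_slot() and is_power_of_two() as illustrative helper names that are not
taken from the vhost code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Wrap a free-running index into a ring of `num` entries (num a power of 2). */
static uint32_t ring_slot(uint32_t idx, uint32_t num)
{
    assert(is_power_of_two(num));
    return idx & (num - 1);   /* same result as idx % num, without a division */
}
```

The equivalence only holds when num is a power of two, which is why the
sketch asserts it rather than silently masking.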


Re: Kernel 4.1.12 crash

2015-11-30 Thread Guillaume Nault
On Mon, Nov 30, 2015 at 12:05:13AM +0200, Andrew wrote:
> 26.11.2015 18:44, Guillaume Nault wrote:
> >On Wed, Nov 25, 2015 at 04:58:54PM +0200, Andrew wrote:
> >>25.11.2015 16:10, Guillaume Nault wrote:
> >>>On Wed, Nov 25, 2015 at 12:59:52AM +0200, Andrew wrote:
> Hi.
> 
> I tried to reproduce errors in virtual environment (some VMs on my
> notebook).
> 
> I've tried to create 1000 client PPPoE sessions from this box via script:
> for i in `seq 1 1000`; do pppd plugin rp-pppoe.so user test password test
> nodefaultroute maxfail 0 persist nodefaultroute holdoff 1 noauth eth0; 
> done
> 
> >>>I've tried to reproduce the bug with your script, but couldn't get
> >>>anything to crash (VM is Debian Jessie i386 running on KVM with upstream
> >>>kernel 4.1.12). Does the crash happen before all sessions get
> >>>established?
> >>Yes, crash happens even before all daemon instances are started. Sessions
> >>don't get established because BRAS configured to reject sessions (so a lot
> >>of concurrent connection retries happens) - I still didn't created account
> >>for test user on it.
> >>
> >Ok, I got the crash too. In fact I had misunderstood your previous
> >message, crash happens when PPP sessions don't get established
> >(authentication failures in my case).
> >
> >I'll investigate on that and let you know.
> 
> It seems like bug appears on mass ppp devices removing (I planned to use
> this test environment to reproduce BRAS periodical crashes, but suddenly
> I've got crashes on test client).
> 
> I've checked it with some kernels - it's present in 4.3.0, but it isn't
> present in 3.10.57. I'll try to build 3.14/3.18 kernels to look how they
> will work in this case.

Yes, it most likely was introduced by 287f3a943fef ("pppoe: Use
workqueue to die properly when a PADT is received"). I still have to
figure out why.


Re: [PATCH 13/13] mm: memcontrol: hook up vmpressure to socket pressure

2015-11-30 Thread Johannes Weiner
On Mon, Nov 30, 2015 at 02:36:28PM +0300, Vladimir Davydov wrote:
> Suppose we have the following cgroup configuration.
> 
> A __ B
>   \_ C
> 
> A is empty (which is natural for the unified hierarchy AFAIU). B has
> some workload running in it, and C generates socket pressure. Due to the
> socket pressure coming from C we start reclaim in A, which results in
> thrashing of B, but we might not put sockets under pressure in A or C,
> because vmpressure does not account pages scanned/reclaimed in B when
> generating a vmpressure event for A or C. This might result in
> aggressive reclaim and thrashing in B w/o generating a signal for C to
> stop growing socket buffers.
> 
> Do you think such a situation is possible? If so, would it make sense to
> switch to post-order walk in shrink_zone and pass sub-tree
> scanned/reclaimed stats to vmpressure for each scanned memcg?

In that case the LRU pages in C would experience pressure as well,
which would then rein in the sockets in C. There must be some LRU
pages in there, otherwise who is creating socket pressure?

The same applies to shrinkers. All secondary reclaim is driven by LRU
reclaim results.

I can see that there is some unfairness in distributing memcg reclaim
pressure purely based on LRU size, because there are scenarios where
the auxiliary objects (incl. sockets, but mostly shrinker pools)
amount to a significant portion of the group's memory footprint. But
substitute group for NUMA node and we've had this behavior for
years. I'm not sure it's actually a problem in practice.


Re: [PATCH net] ipv4: igmp: Allow removing groups from a removed interface

2015-11-30 Thread David Miller
From: Andrew Lunn 
Date: Wed, 25 Nov 2015 21:15:36 +0100

> @@ -2126,7 +2126,7 @@ int ip_mc_leave_group(struct sock *sk, struct ip_mreqn 
> *imr)
>   ASSERT_RTNL();
>  
>   in_dev = ip_mc_find_dev(net, imr);
> - if (!in_dev) {
> + if (!imr->imr_ifindex && !imr->imr_address.s_addr && !in_dev) {
>   ret = -ENODEV;
>   goto out;
>   }

Now, ip_mc_dec_group() below can take a NULL pointer dereference.  One example
is if imr_ifindex is specified and the lookup returns NULL in ip_mc_find_dev().

This is so ridiculously complicated, just looking at this code breaks something.
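
The hole can be seen by reducing the two versions of the check to boolean
predicates. In the sketch below, struct mreq and struct dev are hypothetical,
cut-down stand-ins for struct ip_mreqn and the in_device, and the helper names
are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the request and the device. */
struct mreq { int ifindex; uint32_t addr; };
struct dev  { int id; };

/* Original check: bail out whenever the device lookup found nothing. */
static bool bails_out_before(const struct mreq *imr, const struct dev *in_dev)
{
    assert(imr != NULL);
    return in_dev == NULL;
}

/* Check as modified in the quoted hunk. */
static bool bails_out_after(const struct mreq *imr, const struct dev *in_dev)
{
    assert(imr != NULL);
    return !imr->ifindex && !imr->addr && in_dev == NULL;
}

/* True when execution continues past the check while the device is NULL. */
static bool continues_with_null_dev(bool bailed, const struct dev *in_dev)
{
    return !bailed && in_dev == NULL;
}
```

For a request that names an interface index whose lookup fails, the original
check bails out, while the modified one lets a NULL device through to the code
that follows.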


Re: [PATCH net-next] tcp: suppress too verbose messages in tcp_send_ack()

2015-11-30 Thread David Miller
From: Eric Dumazet 
Date: Wed, 25 Nov 2015 13:50:50 -0800

> diff --git a/include/net/sock.h b/include/net/sock.h
> index 7f89e4ba18d1..ead514332ae8 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -776,7 +776,7 @@ static inline int sk_memalloc_socks(void)
>  
>  static inline gfp_t sk_gfp_atomic(const struct sock *sk, gfp_t gfp_mask)
>  {
> - return GFP_ATOMIC | (sk->sk_allocation & __GFP_MEMALLOC);
> + return gfp_mask | (sk->sk_allocation & __GFP_MEMALLOC);
>  }
>  
>  static inline void sk_acceptq_removed(struct sock *sk)

Eric, please rename this to "sk_gfp_mask()" or "sk_gfp_flags()" or
something like that since it doesn't unconditionally use GFP_ATOMIC
any more.

Otherwise I'm 100% fine with this change.

Thank you.
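
The reason for the rename is easiest to see side by side: the old helper
discards the caller's mask, so a flag such as __GFP_NOWARN never reaches the
allocator, while the new body passes it through. The sketch below is userspace
only; the DEMO_* bit values, demo_sock and both function names are invented
for illustration and are not the kernel's flag encodings.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int gfp_t;

/* Illustrative bit values only; NOT the kernel's real flag encodings. */
#define DEMO_GFP_ATOMIC    0x01u
#define DEMO_GFP_NOWARN    0x02u
#define DEMO_GFP_MEMALLOC  0x04u

struct demo_sock { gfp_t sk_allocation; };

/* Old behaviour: the caller's mask is ignored, GFP_ATOMIC is forced. */
static gfp_t demo_gfp_atomic(const struct demo_sock *sk, gfp_t gfp_mask)
{
    assert(sk != NULL);
    (void)gfp_mask;
    return DEMO_GFP_ATOMIC | (sk->sk_allocation & DEMO_GFP_MEMALLOC);
}

/* New behaviour: the caller's mask is kept, plus the socket's MEMALLOC bit. */
static gfp_t demo_gfp_mask(const struct demo_sock *sk, gfp_t gfp_mask)
{
    assert(sk != NULL);
    return gfp_mask | (sk->sk_allocation & DEMO_GFP_MEMALLOC);
}
```

Calling both with DEMO_GFP_ATOMIC | DEMO_GFP_NOWARN shows the difference: only
the second keeps the no-warn bit, so a name containing "atomic" no longer
describes what the function returns.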


Re: [RFC PATCH V2 0/3] IXGBE/VFIO: Add live migration support for SRIOV NIC

2015-11-30 Thread Alexander Duyck
On Sun, Nov 29, 2015 at 10:53 PM, Lan, Tianyu  wrote:
> On 11/26/2015 11:56 AM, Alexander Duyck wrote:
>>
>> I am not saying you cannot modify the drivers, however what you are
>> doing is far too invasive.  Do you seriously plan on modifying all of
>> the PCI device drivers out there in order to allow any device that
>> might be direct assigned to a port to support migration?  I certainly
>> hope not.  That is why I have said that this solution will not scale.
>
>
> Current drivers are not migration friendly. If the driver wants to
> support migration, it's necessary to be changed.

Modifying all of the drivers directly will not solve the issue though.
This is why I have suggested looking at possibly implementing
something like dma_mark_clean() which is used for ia64 architectures
to mark pages that were DMAed in as clean.  In your case though you
would want to mark such pages as dirty so that the page migration will
notice them and move them over.
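
As a rough illustration of the idea (not of dma_mark_clean() itself or of any
existing migration code), marking DMA-written pages dirty amounts to setting
one bit per touched page in a log that the migration pass later scans. The
names dirty_log, mark_dma_dirty() and page_is_dirty(), the 4 KiB page size and
the toy guest size are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12u      /* assume 4 KiB pages */
#define DEMO_NPAGES     1024u    /* size of the toy guest, in pages */

struct dirty_log {
    uint8_t bits[DEMO_NPAGES / 8];
};

/* Mark every page touched by a DMA write of `len` bytes at `addr` as dirty. */
static void mark_dma_dirty(struct dirty_log *log, uint64_t addr, size_t len)
{
    if (len == 0)
        return;

    uint64_t first = addr >> DEMO_PAGE_SHIFT;
    uint64_t last  = (addr + len - 1) >> DEMO_PAGE_SHIFT;

    assert(last < DEMO_NPAGES);
    for (uint64_t pfn = first; pfn <= last; pfn++)
        log->bits[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

/* What the migration pass would consult to decide whether to resend a page. */
static int page_is_dirty(const struct dirty_log *log, uint64_t pfn)
{
    return (log->bits[pfn / 8] >> (pfn % 8)) & 1;
}
```

A buffer that straddles a page boundary dirties both pages, which is the case
a hook in the sync/unmap path would need to get right.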

> RFC PATCH V1 presented our ideas about how to deal with MMIO, ring and
> DMA tracking during migration. These are common for most drivers and
> they may be problematic in the previous version but can be corrected later.

They can only be corrected if the underlying assumptions are correct
and they aren't.  Your solution would have never worked correctly.
The problem is you assume you can keep the device running when you are
migrating and you simply cannot.  At some point you will always have
to stop the device in order to complete the migration, and you cannot
stop it before you have stopped your page tracking mechanism.  So
unless the platform has an IOMMU that is somehow taking part in the
dirty page tracking you will not be able to stop the guest and then
the device, it will have to be the device and then the guest.

> Doing suspend and resume() may help to do migration easily but some
> devices require low service downtime. Especially network, and I heard
> that some cloud companies promise less than 500ms network service downtime.

Honestly focusing on the downtime is getting the cart ahead of the
horse.  First you need to be able to do this without corrupting system
memory and regardless of the state of the device.  You haven't even
gotten to that state yet.  Last I knew the device had to be up in
order for your migration to even work.

Many devices are very state driven.  As such you cannot just freeze
them and restore them like you would regular device memory.  That is
where something like suspend/resume comes in because it already takes
care of getting the device ready for halt, and then resume.  Keep in
mind that those functions were meant to function on a device doing
something like a suspend to RAM or disk.  This is not too far off from
what a migration is doing since you need to halt the guest before you
move it.

As such the first step is to make it so that we can do the current
bonding approach with one change.  Specifically we want to leave the
device in the guest until the last portion of the migration instead of
having to remove it first.  To that end I would suggest focusing on
solving the DMA problem via something like a dma_mark_clean() type
solution as that would be one issue resolved and we all would see an
immediate gain instead of just those users of the ixgbevf driver.
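
To make the dma_mark_clean() idea concrete, here is a minimal user-space
sketch, not kernel code.  It assumes a toy dirty bitmap with one bit per
4 KiB page; dma_mark_dirty(), mark_page_dirty() and page_is_dirty() are
hypothetical names invented for illustration and are not existing
swiotlb or KVM interfaces.  The point is only that every page touched by
a device-to-memory transfer gets flagged when the buffer is synced or
unmapped, so the migration code knows to copy it again.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define NUM_PAGES  1024 /* size of the toy guest memory, in pages */

enum dma_dir { DMA_TO_DEVICE, DMA_FROM_DEVICE };

/* One bit per guest page; a set bit means "copy this page again". */
static uint8_t dirty_bitmap[NUM_PAGES / 8];

static void mark_page_dirty(size_t pfn)
{
	if (pfn < NUM_PAGES)
		dirty_bitmap[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

static int page_is_dirty(size_t pfn)
{
	return pfn < NUM_PAGES && ((dirty_bitmap[pfn / 8] >> (pfn % 8)) & 1);
}

/*
 * Hypothetical hook for the sync/unmap path: flag every page covered by
 * a buffer the device may have written.  Transmit buffers are skipped
 * because the device only reads them.
 */
static void dma_mark_dirty(uint64_t addr, size_t len, enum dma_dir dir)
{
	size_t first, last, pfn;

	if (dir != DMA_FROM_DEVICE || len == 0)
		return;

	first = (size_t)(addr >> PAGE_SHIFT);
	last = (size_t)((addr + len - 1) >> PAGE_SHIFT);
	for (pfn = first; pfn <= last; pfn++)
		mark_page_dirty(pfn);
}
```

A buffer that straddles a page boundary marks both pages, which is why
the loop runs from the first to the last page frame rather than marking
only the page that contains the start address.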

> So I think performance effect also should be taken into account when we
> design the framework.

What you are proposing I would call premature optimization.  You need
to actually solve the problem before you can start optimizing things
and I don't see anything actually solved yet since your solution is
too unstable.

>>
>> What I am counter proposing seems like a very simple proposition.  It
>> can be implemented in two steps.
>>
>> 1.  Look at modifying dma_mark_clean().  It is a function called in
>> the sync and unmap paths of the lib/swiotlb.c.  If you could somehow
>> modify it to take care of marking the pages you unmap for Rx as being
>> dirty it will get you a good way towards your goal as it will allow
>> you to continue to do DMA while you are migrating the VM.
>>
>> 2.  Look at making use of the existing PCI suspend/resume calls that
>> are there to support PCI power management.  They have everything
>> needed to allow you to pause and resume DMA for the device before and
>> after the migration while retaining the driver state.  If you can
>> implement something that allows you to trigger these calls from the
>> PCI subsystem such as hot-plug then you would have a generic solution
>> that can be easily reproduced for multiple drivers beyond those
>> supported by ixgbevf.
>
>
> Glanced at PCI hotplug code. The hotplug events are triggered by PCI hotplug
> controller and these event are defined in the controller spec.
> It's hard to extend more events. Otherwise, we also need to add some
> specific codes in the PCI hotplug core since it's only add and remove
> PCI device when it gets events. It's also a challenge to modify Windows
> 

[v8, 0/6] Freescale DPAA FMan

2015-11-30 Thread igal.liberman
From: Igal Liberman 

The Freescale Data Path Acceleration Architecture (DPAA) is a set
of hardware components on specific QorIQ multicore processors.
This architecture provides the infrastructure to support
simplified sharing of networking interfaces and accelerators
by multiple CPU cores and the accelerators.

One of the DPAA accelerators is the Frame Manager (FMan)
which contains a series of hardware blocks: ports, Ethernet MACs,
a multi user RAM (MURAM) and Storage Profile (SP).

This patch set introduces the FMan drivers.
Each driver configures and initializes the corresponding
FMan hardware module (described above).
The MAC driver offers support for three different
types of MACs (dTSEC, tGEC, mEMAC).

v7 --> v8:
- Addressed feedback from David Miller
- Support for ARM:
  - Device tree parsing
  - IO Accessors
- Addressed compilation issue on non-PPC targets

v6 --> v7:
- Addressed compilation issue on non-PPC targets
- Removed B4860 rev 1 support

v5 --> v6:
- Addressed feedback from Scott:
  - Moved kernel doc to source files
  - Removed a series of configurable settings
  - Miscellaneous code updates

v4 --> v5:
- Addressed feedback from David Miller:
  - Removed driver layering
  - Reduce namespace pollution
  - Reduce code complexity and size

v3 --> v4:
- Remove device_initcall call in driver registration (redundant)
- Remove hot/cold labels
- Minor update in FMan Clock read from device-tree
- Update fixed-link support
- Addressed feedback from Stephen Hemminger
- Remove bogus blank line

v2 --> v3:
- Addressed feedback from Scott:
- Remove typedefs
- Remove unnecessary memory barriers
- Remove unnecessary casting
- Remove KConfig options
- Remove early_params
- Remove Hungarian notation
- Remove __packed__  attribute and padding from structures
- Remove unlikely attribute (where it's not needed)
- Use proper error codes and remove unnecessary prints
- Use proper values for sleep routines
- Replace complex Macros with functions
- Improve device tree processing code
- Use symbolic defines
- Add time-out in busy-wait loops
- Removed exit code (loadable module support will be added later)
- Fixed "fixed-link" issue raised by Joakim Tjernlund

v1 --> v2:
- Addressed feedback from Paul Bolle:
- General feedback of FMan Driver layer
- Remove Errata defines
- Aligned comments to Kernel Doc
- Remove Loadable Module support (not yet supported)
- Removed not needed KConfig dependencies
- Addressed feedback from Scott Wood
- Use Kernel ioread/iowrite services
- Squash FLIB source and header patches together

This submission is based on the prior Freescale DPAA FMan V3,RFC submission.
Several issues are addressed in this submission:
- Reduced MAC layering and complexity
- Reduced code base
- T1024/T2080 10G best effort support


Igal Liberman (6):
  fsl/fman: Add FMan MURAM support
  fsl/fman: Add FMan support
  fsl/fman: Add FMan MAC support
  fsl/fman: Add FMan SP support
  fsl/fman: Add FMan Port Support
  fsl/fman: Add FMan MAC driver

 drivers/net/ethernet/freescale/Kconfig |1 +
 drivers/net/ethernet/freescale/Makefile|2 +
 drivers/net/ethernet/freescale/fman/Kconfig|8 +
 drivers/net/ethernet/freescale/fman/Makefile   |7 +
 .../net/ethernet/freescale/fman/crc_mac_addr_ext.h |  314 +++
 drivers/net/ethernet/freescale/fman/fman.c | 2872 
 drivers/net/ethernet/freescale/fman/fman.h |  325 +++
 drivers/net/ethernet/freescale/fman/fman_dtsec.c   | 1608 +++
 drivers/net/ethernet/freescale/fman/fman_dtsec.h   |   59 +
 drivers/net/ethernet/freescale/fman/fman_mac.h |  276 ++
 drivers/net/ethernet/freescale/fman/fman_memac.c   | 1306 +
 drivers/net/ethernet/freescale/fman/fman_memac.h   |   60 +
 drivers/net/ethernet/freescale/fman/fman_muram.c   |  159 ++
 drivers/net/ethernet/freescale/fman/fman_muram.h   |   51 +
 drivers/net/ethernet/freescale/fman/fman_port.c| 1779 
 drivers/net/ethernet/freescale/fman/fman_port.h|  151 +
 drivers/net/ethernet/freescale/fman/fman_sp.c  |  167 ++
 drivers/net/ethernet/freescale/fman/fman_sp.h  |  103 +
 drivers/net/ethernet/freescale/fman/fman_tgec.c|  798 ++
 drivers/net/ethernet/freescale/fman/fman_tgec.h|   55 +
 drivers/net/ethernet/freescale/fman/mac.c  |  988 

[v8, 1/6] fsl/fman: Add FMan MURAM support

2015-11-30 Thread igal.liberman
From: Igal Liberman 

Add Frame Manager Multi-User RAM support.
This internal FMan memory block is used by the
FMan hardware modules, the management being made
through the generic allocator.

The FMan Internal memory, for example, is used for
allocating transmit and receive FIFOs.

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/Kconfig   |1 +
 drivers/net/ethernet/freescale/Makefile  |2 +
 drivers/net/ethernet/freescale/fman/Kconfig  |8 ++
 drivers/net/ethernet/freescale/fman/Makefile |5 +
 drivers/net/ethernet/freescale/fman/fman_muram.c |  159 ++
 drivers/net/ethernet/freescale/fman/fman_muram.h |   51 +++
 6 files changed, 226 insertions(+)
 create mode 100644 drivers/net/ethernet/freescale/fman/Kconfig
 create mode 100644 drivers/net/ethernet/freescale/fman/Makefile
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_muram.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_muram.h

diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig
index ff76d4e..f3f89cc 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -53,6 +53,7 @@ config FEC_MPC52xx_MDIO
  If compiled as module, it will be called fec_mpc52xx_phy.
 
 source "drivers/net/ethernet/freescale/fs_enet/Kconfig"
+source "drivers/net/ethernet/freescale/fman/Kconfig"
 
 config FSL_PQ_MDIO
tristate "Freescale PQ MDIO"
diff --git a/drivers/net/ethernet/freescale/Makefile b/drivers/net/ethernet/freescale/Makefile
index 71debd1..4097c58 100644
--- a/drivers/net/ethernet/freescale/Makefile
+++ b/drivers/net/ethernet/freescale/Makefile
@@ -17,3 +17,5 @@ gianfar_driver-objs := gianfar.o \
gianfar_ethtool.o
 obj-$(CONFIG_UCC_GETH) += ucc_geth_driver.o
 ucc_geth_driver-objs := ucc_geth.o ucc_geth_ethtool.o
+
+obj-$(CONFIG_FSL_FMAN) += fman/
diff --git a/drivers/net/ethernet/freescale/fman/Kconfig b/drivers/net/ethernet/freescale/fman/Kconfig
new file mode 100644
index 0000000..66b7296
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/Kconfig
@@ -0,0 +1,8 @@
+config FSL_FMAN
+   bool "FMan support"
+   depends on FSL_SOC || COMPILE_TEST
+   select GENERIC_ALLOCATOR
+   default n
+   help
+   Freescale Data-Path Acceleration Architecture Frame Manager
+   (FMan) support
diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
new file mode 100644
index 0000000..fc2e194
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -0,0 +1,5 @@
+subdir-ccflags-y +=  -I$(srctree)/drivers/net/ethernet/freescale/fman
+
+obj-y  += fsl_fman.o
+
+fsl_fman-objs  := fman_muram.o
diff --git a/drivers/net/ethernet/freescale/fman/fman_muram.c b/drivers/net/ethernet/freescale/fman/fman_muram.c
new file mode 100644
index 0000000..35d4a50
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/fman_muram.c
@@ -0,0 +1,159 @@
+/*
+ * Copyright 2008-2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in the
+ *   documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *   names of its contributors may be used to endorse or promote products
+ *   derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "fman_muram.h"
+
+#include 
+#include 
+#include 

Re: [PATCH 12/13] mm: memcontrol: account socket memory in unified hierarchy memory controller

2015-11-30 Thread Johannes Weiner
On Mon, Nov 30, 2015 at 01:54:21PM +0300, Vladimir Davydov wrote:
> On Tue, Nov 24, 2015 at 04:58:44PM -0500, Johannes Weiner wrote:
> ...
> > @@ -5520,15 +5557,30 @@ void sock_release_memcg(struct sock *sk)
> >   */
> >  bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  {
> > -   struct page_counter *counter;
> > +   gfp_t gfp_mask = GFP_KERNEL;
> >  
> > -   if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > -   nr_pages, &counter)) {
> > -   memcg->tcp_mem.memory_pressure = 0;
> > -   return true;
> > +#ifdef CONFIG_MEMCG_KMEM
> > +   if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > +   struct page_counter *counter;
> > +
> > +   if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > +   nr_pages, &counter)) {
> > +   memcg->tcp_mem.memory_pressure = 0;
> > +   return true;
> > +   }
> > +   page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > +   memcg->tcp_mem.memory_pressure = 1;
> > +   return false;
> > }
> > -   page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > -   memcg->tcp_mem.memory_pressure = 1;
> > +#endif
> > +   /* Don't block in the packet receive path */
> > +   if (in_softirq())
> > +   gfp_mask = GFP_NOWAIT;
> > +
> > +   if (try_charge(memcg, gfp_mask, nr_pages) == 0)
> > +   return true;
> > +
> > +   try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
> 
> We won't trigger high reclaim if we get here, because try_charge does
> not check high threshold if failing or forcing charge. I think this
> should be fixed regardless of this patch. The fix is attached below.

We kind of assume that max is either set above high, or not at
all. That means when max is hit the high limit has already failed and
it's of limited use to schedule background reclaim.

> Also, I don't like calling try_charge twice: the second time will go
> through all the try_charge steps for nothing. What about checking
> page_counter value after calling try_charge instead:
> 
>   try_charge(memcg, gfp_mask|__GFP_NOFAIL, nr_pages);
>   return page_counter_read(&memcg->memory) <= memcg->memory.limit;
> 
> or adding an out parameter to try_charge that would inform us if charge
> was forced?

That's a complete cold path where we are going to drop the packet in
all but a few cases. It's not worth the trouble.
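
For readers following the two variants being compared, here is a toy
user-space model, assuming a plain usage/limit counter.  The
toy_counter type and the toy_* helpers are hypothetical and are not the
memcg API; the real try_charge() also reclaims, batches charges and
takes css references, none of which is modelled here.  The first helper
mirrors the patch (try, then force on failure and report pressure), the
second mirrors the suggestion to force the charge once and compare the
counter against the limit afterwards.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page counter: current usage and a limit. */
struct toy_counter {
	unsigned long usage;
	unsigned long limit;
};

/* Charge only if the result stays within the limit. */
static bool toy_try_charge(struct toy_counter *c, unsigned long nr_pages)
{
	if (nr_pages > c->limit || c->usage > c->limit - nr_pages)
		return false;
	c->usage += nr_pages;
	return true;
}

/*
 * Variant from the patch: try first; on failure force the charge so the
 * pages are still accounted, and return false to signal pressure.
 */
static bool toy_charge_skmem(struct toy_counter *c, unsigned long nr_pages)
{
	if (toy_try_charge(c, nr_pages))
		return true;
	c->usage += nr_pages; /* forced charge, may exceed the limit */
	return false;
}

/*
 * Variant from the review comment: always charge once, then report
 * whether the counter ended up within its limit.
 */
static bool toy_charge_skmem_once(struct toy_counter *c, unsigned long nr_pages)
{
	c->usage += nr_pages;
	return c->usage <= c->limit;
}

static void toy_uncharge_skmem(struct toy_counter *c, unsigned long nr_pages)
{
	c->usage -= nr_pages;
}
```

In this simplified model both variants leave the counter in the same
state and return the same result; the difference under discussion is
only how much work the second pass through the charge path costs, which
the reply above argues does not matter on a cold path.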

> > @@ -5539,10 +5591,32 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> >   */
> >  void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  {
> > -   page_counter_uncharge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > +#ifdef CONFIG_MEMCG_KMEM
> > +   if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > +   page_counter_uncharge(&memcg->tcp_mem.memory_allocated,
> > + nr_pages);
> > +   return;
> > +   }
> > +#endif
> > +   page_counter_uncharge(&memcg->memory, nr_pages);
> > +   css_put_many(&memcg->css, nr_pages);
> 
> cancel_charge(memcg, nr_pages);

It does the same, but it's a weird name for regular uncharging.

> From: Vladimir Davydov 
> Subject: [PATCH] memcg: check high threshold if forcing allocation
> 
> try_charge() does not result in checking high threshold if it forces
> charge. This is incorrect, because we could have failed to reclaim
> memory due to the current context, so we do need to check high threshold
> and try to compensate for the excess once we are in the safe context.
> 
> Signed-off-by: Vladimir Davydov 
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 79a29d564bff..e922965b572b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2112,13 +2112,14 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>   page_counter_charge(&memcg->memsw, nr_pages);
>   css_get_many(&memcg->css, nr_pages);
>  
> - return 0;
> + goto check_high;
>  
>  done_restock:
>   css_get_many(&memcg->css, batch);
>   if (batch > nr_pages)
>   refill_stock(memcg, batch - nr_pages);
>  
> +check_high:
>   /*
>* If the hierarchy is above the normal consumption range, schedule
>* reclaim on returning to userland.  We can perform reclaim here

One problem is that OOM victims force their charges so they can exit
quickly. It'd be contradictory to then task them with high reclaim.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[v8, 6/6] fsl/fman: Add FMan MAC driver

2015-11-30 Thread igal.liberman
From: Igal Liberman 

This patch adds the Ethernet MAC driver supporting the three
different types of MACs: dTSEC, tGEC and mEMAC.

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/Makefile |3 +-
 drivers/net/ethernet/freescale/fman/mac.c|  988 ++
 drivers/net/ethernet/freescale/fman/mac.h|   97 +++
 3 files changed, 1087 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/mac.c
 create mode 100644 drivers/net/ethernet/freescale/fman/mac.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index 2eb0b9b..51fd2e6 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -1,6 +1,7 @@
 subdir-ccflags-y +=  -I$(srctree)/drivers/net/ethernet/freescale/fman
 
-obj-y  += fsl_fman.o fsl_fman_mac.o
+obj-y  += fsl_fman.o fsl_fman_mac.o fsl_mac.o
 
 fsl_fman-objs  := fman_muram.o fman.o fman_sp.o fman_port.o
 fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o
+fsl_mac-objs += mac.o
diff --git a/drivers/net/ethernet/freescale/fman/mac.c b/drivers/net/ethernet/freescale/fman/mac.c
new file mode 100644
index 0000000..174ecea
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/mac.c
@@ -0,0 +1,988 @@
+/* Copyright 2008-2015 Freescale Semiconductor, Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mac.h"
+#include "fman_mac.h"
+#include "fman_dtsec.h"
+#include "fman_tgec.h"
+#include "fman_memac.h"
+
+#define MAC_DESCRIPTION "FSL FMan MAC API based driver"
+
+MODULE_LICENSE("Dual BSD/GPL");
+
+MODULE_AUTHOR("Emil Medve ");
+
+MODULE_DESCRIPTION(MAC_DESCRIPTION);
+
+struct mac_priv_s {
+   struct device   *dev;
+   void __iomem*vaddr;
+   u8  cell_index;
+   phy_interface_t phy_if;
+   struct fman *fman;
+   struct device_node  *phy_node;
+   /* List of multicast addresses */
+   struct list_headmc_addr_list;
+   struct platform_device  *eth_dev;
+   struct fixed_phy_status *fixed_link;
+   u16 speed;
+   u16 max_speed;
+
+   int (*enable)(struct fman_mac *mac_dev, enum comm_mode mode);
+   int (*disable)(struct fman_mac *mac_dev, enum comm_mode mode);
+};
+
+struct mac_address {
+   u8 addr[ETH_ALEN];
+   struct list_head list;
+};
+
+static void mac_exception(void *_mac_dev, enum fman_mac_exceptions ex)
+{
+   struct mac_device   *mac_dev;
+   struct mac_priv_s   *priv;
+
+   mac_dev = (struct mac_device *)_mac_dev;
+   priv = mac_dev->priv;
+
+   if (ex == FM_MAC_EX_10G_RX_FIFO_OVFL) {
+   /* don't flag RX FIFO after the first */
+   mac_dev->set_exception(mac_dev->fman_mac,
+ 

[v8, 4/6] fsl/fman: Add FMan SP support

2015-11-30 Thread igal.liberman
From: Igal Liberman 

The Storage Profiles contain parameters that are used
by the FMan for frame reception and transmission.

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/Makefile  |2 +-
 drivers/net/ethernet/freescale/fman/fman_sp.c |  167 +
 drivers/net/ethernet/freescale/fman/fman_sp.h |  103 +++
 3 files changed, 271 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_sp.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_sp.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index 43360d70..5141532 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -2,5 +2,5 @@ subdir-ccflags-y +=  -I$(srctree)/drivers/net/ethernet/freescale/fman
 
 obj-y  += fsl_fman.o fsl_fman_mac.o
 
-fsl_fman-objs  := fman_muram.o fman.o
+fsl_fman-objs  := fman_muram.o fman.o fman_sp.o
 fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o
diff --git a/drivers/net/ethernet/freescale/fman/fman_sp.c b/drivers/net/ethernet/freescale/fman/fman_sp.c
new file mode 100644
index 0000000..f36c622
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/fman_sp.c
@@ -0,0 +1,167 @@
+/*
+ * Copyright 2008 - 2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in the
+ *   documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *   names of its contributors may be used to endorse or promote products
+ *   derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "fman_sp.h"
+#include "fman.h"
+
+void fman_sp_set_buf_pools_in_asc_order_of_buf_sizes(struct fman_ext_pools
+*fm_ext_pools,
+u8 *ordered_array,
+u16 *sizes_array)
+{
+   u16 buf_size = 0;
+   int i = 0, j = 0, k = 0;
+
+   /* First we copy the external buffers pools information
+* to an ordered local array
+*/
+   for (i = 0; i < fm_ext_pools->num_of_pools_used; i++) {
+   /* get pool size */
+   buf_size = fm_ext_pools->ext_buf_pool[i].size;
+
+   /* keep sizes in an array according to poolId
+* for direct access
+*/
+   sizes_array[fm_ext_pools->ext_buf_pool[i].id] = buf_size;
+
+   /* save poolId in an ordered array according to size */
+   for (j = 0; j <= i; j++) {
+   /* this is the next free place in the array */
+   if (j == i)
+   ordered_array[i] =
+   fm_ext_pools->ext_buf_pool[i].id;
+   else {
+   /* find the right place for this poolId */
+   if (buf_size < sizes_array[ordered_array[j]]) {
+   /* move the pool_ids one place ahead
+* to make room for this poolId
+*/
+   for (k = i; k > j; k--)
+   ordered_array[k] =
+   

Re: [PATCH net-next v4 2/2] net: add driver for Netronome NFP4000/NFP6000 NIC VFs

2015-11-30 Thread David Miller
From: Jakub Kicinski 
Date: Wed, 25 Nov 2015 15:39:04 +

> +config NFP_NET_DEBUG
> + bool "Debug support for Netronome(R) NFP3200/NFP6000 NIC drivers"
> + depends on NFP_NET || NFP_NETVF
> + ---help---
> +   Enable extra sanity checks and debugfs support in
> +   Netronome(R) NFP3200/NFP6000 NIC PF and VF drivers.
> +   Note: selecting this option may adversely impact
> + performance.
 ...
> +#ifdef CONFIG_NFP_NET_DEBUG
> +#define nn_assert(cond, fmt, args...)
> \
> + do {\
> + if (unlikely(!(cond))) {\
> + pr_err("assertion %s failed\n", #cond); \
> + pr_err(fmt, ## args);   \
> + BUG();  \
> + }   \
> + } while (0)
> +#else
> +#define nn_assert(cond, fmt, args...)do { } while (0)
> +#endif

This is really not appropriate.

Use WARN_ON() et al. as appropriate to assert things, and in particular
_AVOID_ BUG() in pretty much all cases and attempt to continue running
somehow with error handling paths etc.

Use of BUG() is discouraged in all except the most extreme cases where
the kernel cannot continue to execute at all.
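
As a rough illustration of the pattern being asked for, here is a
user-space analogue, not driver code.  TOY_WARN_ON() is a hypothetical
stand-in for the kernel's WARN_ON(): it logs the failed condition and
evaluates to true so the caller can take an error path instead of
halting.  toy_ring_put() and toy_warn_count are likewise invented for
the example.

```c
#include <errno.h>
#include <stdio.h>

/* Number of warnings emitted so far (only for the example). */
static unsigned int toy_warn_count;

/*
 * User-space analogue of WARN_ON(): report the condition, count it, and
 * evaluate to 1 when the condition holds so it can be used inside if().
 */
#define TOY_WARN_ON(cond)                                              \
	((cond) ? (fprintf(stderr, "warning: %s at %s:%d\n",           \
			   #cond, __FILE__, __LINE__),                 \
		   toy_warn_count++, 1)                                \
		: 0)

/*
 * Instead of asserting and crashing on a bad index, warn and return an
 * error so the caller can recover.
 */
static int toy_ring_put(unsigned int *ring, unsigned int size,
			unsigned int idx, unsigned int val)
{
	if (TOY_WARN_ON(idx >= size))
		return -EINVAL;

	ring[idx] = val;
	return 0;
}
```

The caller sees -EINVAL for an out-of-range index and keeps running,
which is the behaviour the review asks for in place of BUG().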

Thanks.


[v8, 3/6] fsl/fman: Add FMan MAC support

2015-11-30 Thread igal.liberman
From: Igal Liberman 

Add the Data Path Acceleration Architecture Frame Manager MAC support.
This patch adds the FMan MAC configuration, initialization and
runtime control routines.
This patch contains support for these types of MACs:
- dTSEC: Three speed Ethernet controller (10/100/1000 Mbps)
- tGEC: 10G Ethernet controller (10 Gbps)
- mEMAC: Multi-rate Ethernet MAC (10/100/1000/10000 Mbps)
Different FMan revisions have different type and number of MACs.

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/Makefile   |3 +-
 .../net/ethernet/freescale/fman/crc_mac_addr_ext.h |  314 
 drivers/net/ethernet/freescale/fman/fman_dtsec.c   | 1608 
 drivers/net/ethernet/freescale/fman/fman_dtsec.h   |   59 +
 drivers/net/ethernet/freescale/fman/fman_mac.h |  276 
 drivers/net/ethernet/freescale/fman/fman_memac.c   | 1306 
 drivers/net/ethernet/freescale/fman/fman_memac.h   |   60 +
 drivers/net/ethernet/freescale/fman/fman_tgec.c|  798 ++
 drivers/net/ethernet/freescale/fman/fman_tgec.h|   55 +
 9 files changed, 4478 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_dtsec.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_dtsec.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_mac.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_memac.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_memac.h
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_tgec.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_tgec.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile b/drivers/net/ethernet/freescale/fman/Makefile
index fb5a7f0..43360d70 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -1,5 +1,6 @@
 subdir-ccflags-y +=  -I$(srctree)/drivers/net/ethernet/freescale/fman
 
-obj-y  += fsl_fman.o
+obj-y  += fsl_fman.o fsl_fman_mac.o
 
 fsl_fman-objs  := fman_muram.o fman.o
+fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o
diff --git a/drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h b/drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h
new file mode 100644
index 0000000..92f2e87
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/crc_mac_addr_ext.h
@@ -0,0 +1,314 @@
+/*
+ * Copyright 2008-2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in the
+ *   documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *   names of its contributors may be used to endorse or promote products
+ *   derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Define a macro that calculates the CRC value of an Ethernet MAC address
+ * (48-bit address)
+ */
+
+#ifndef __crc_mac_addr_ext_h
+#define __crc_mac_addr_ext_h
+
+#include 
+
+static u32 crc_table[256] = {
+   0x00000000,
+   0x77073096,
+   0xee0e612c,
+   0x990951ba,
+   0x076dc419,
+   0x706af48f,
+   0xe963a535,
+   0x9e6495a3,
+   0x0edb8832,
+   0x79dcb8a4,
+   0xe0d5e91e,
+   0x97d2d988,
+   0x09b64c2b,
+   0x7eb17cbd,
+   0xe7b82d07,
+   0x90bf1d91,
+   0x1db71064,
+   0x6ab020f2,
+   0xf3b97148,
+   
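
The visible entries of crc_table[] (0x77073096, 0xee0e612c, 0x990951ba,
...) are consistent with the standard reflected CRC-32 lookup table.
Assuming that is what the table is (the rest of the patch is cut off
here, so this cannot be confirmed from the message alone), each entry
can be regenerated with the short routine below.  crc32_table_entry()
is a hypothetical helper written for this note and is not part of the
patch.

```c
#include <stdint.h>

/*
 * Compute one entry of the reflected CRC-32 table (polynomial
 * 0xEDB88320): shift the index right eight times, XORing in the
 * polynomial whenever the bit shifted out is set.
 */
static uint32_t crc32_table_entry(uint8_t index)
{
	uint32_t crc = index;
	int bit;

	for (bit = 0; bit < 8; bit++)
		crc = (crc & 1) ? (crc >> 1) ^ 0xedb88320u : crc >> 1;

	return crc;
}
```

Generating the table at init time would trade a few hundred lines of
constants for a small loop; keeping it static, as the patch does, avoids
any startup cost.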

[v8, 2/6] fsl/fman: Add FMan support

2015-11-30 Thread igal.liberman
From: Igal Liberman 

Add the Data Path Acceleration Architecture Frame Manager Driver.
The FMan embeds a series of hardware blocks that implement a group
of Ethernet interfaces. This patch adds the FMan configuration,
initialization and runtime control routines.

The FMan driver supports several hardware versions
differentiated by things like:
- Different type of MACs
- Number of MAC and ports
- Available resources
- Different hardware errata

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/Makefile |2 +-
 drivers/net/ethernet/freescale/fman/fman.c   | 2872 ++
 drivers/net/ethernet/freescale/fman/fman.h   |  325 +++
 3 files changed, 3198 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/fman.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile 
b/drivers/net/ethernet/freescale/fman/Makefile
index fc2e194..fb5a7f0 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -2,4 +2,4 @@ subdir-ccflags-y +=  
-I$(srctree)/drivers/net/ethernet/freescale/fman
 
 obj-y  += fsl_fman.o
 
-fsl_fman-objs  := fman_muram.o
+fsl_fman-objs  := fman_muram.o fman.o
diff --git a/drivers/net/ethernet/freescale/fman/fman.c 
b/drivers/net/ethernet/freescale/fman/fman.c
new file mode 100644
index 000..98bae37
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -0,0 +1,2872 @@
+/*
+ * Copyright 2008-2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in the
+ *   documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *   names of its contributors may be used to endorse or promote products
+ *   derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include "fman.h"
+#include "fman_muram.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* General defines */
+#define FMAN_LIODN_TBL 64  /* size of LIODN table */
+#define MAX_NUM_OF_MACS 10
+#define FM_NUM_OF_FMAN_CTRL_EVENT_REGS 4
+#define BASE_RX_PORTID 0x08
+#define BASE_TX_PORTID 0x28
+
+/* Modules registers offsets */
+#define BMI_OFFSET 0x0008
+#define QMI_OFFSET 0x00080400
+#define DMA_OFFSET 0x000C2000
+#define FPM_OFFSET 0x000C3000
+#define IMEM_OFFSET 0x000C4000
+#define CGP_OFFSET 0x000DB000
+
+/* Exceptions bit map */
+#define EX_DMA_BUS_ERROR   0x8000
+#define EX_DMA_READ_ECC 0x4000
+#define EX_DMA_SYSTEM_WRITE_ECC 0x2000
+#define EX_DMA_FM_WRITE_ECC 0x1000
+#define EX_FPM_STALL_ON_TASKS  0x0800
+#define EX_FPM_SINGLE_ECC  0x0400
+#define EX_FPM_DOUBLE_ECC  0x0200
+#define EX_QMI_SINGLE_ECC  0x0100
+#define EX_QMI_DEQ_FROM_UNKNOWN_PORTID 0x0080
+#define EX_QMI_DOUBLE_ECC  0x0040
+#define EX_BMI_LIST_RAM_ECC 0x0020
+#define EX_BMI_STORAGE_PROFILE_ECC 0x0010
+#define EX_BMI_STATISTICS_RAM_ECC  0x0008
+#define EX_IRAM_ECC 0x0004
+#define EX_MURAM_ECC  

[v8, 5/6] fsl/fman: Add FMan Port Support

2015-11-30 Thread igal.liberman
From: Igal Liberman 

Add the Data Path Acceleration Architecture Frame Manager Port Driver.
The FMan driver uses a module called "Port" to represent the physical
TX and RX ports.
Each FMan version has a different number of physical ports.
This patch adds the FMan Port configuration, initialization and
runtime control routines for both TX and RX.

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/Makefile|2 +-
 drivers/net/ethernet/freescale/fman/fman_port.c | 1779 +++
 drivers/net/ethernet/freescale/fman/fman_port.h |  151 ++
 3 files changed, 1931 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_port.c
 create mode 100644 drivers/net/ethernet/freescale/fman/fman_port.h

diff --git a/drivers/net/ethernet/freescale/fman/Makefile 
b/drivers/net/ethernet/freescale/fman/Makefile
index 5141532..2eb0b9b 100644
--- a/drivers/net/ethernet/freescale/fman/Makefile
+++ b/drivers/net/ethernet/freescale/fman/Makefile
@@ -2,5 +2,5 @@ subdir-ccflags-y +=  
-I$(srctree)/drivers/net/ethernet/freescale/fman
 
 obj-y  += fsl_fman.o fsl_fman_mac.o
 
-fsl_fman-objs  := fman_muram.o fman.o fman_sp.o
+fsl_fman-objs  := fman_muram.o fman.o fman_sp.o fman_port.o
 fsl_fman_mac-objs := fman_dtsec.o fman_memac.o fman_tgec.o
diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c 
b/drivers/net/ethernet/freescale/fman/fman_port.c
new file mode 100644
index 000..562d524
--- /dev/null
+++ b/drivers/net/ethernet/freescale/fman/fman_port.c
@@ -0,0 +1,1779 @@
+/*
+ * Copyright 2008 - 2015 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in the
+ *   documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *   names of its contributors may be used to endorse or promote products
+ *   derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include "fman_port.h"
+#include "fman.h"
+#include "fman_sp.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Queue ID */
+#define DFLT_FQ_ID 0x00FF
+
+/* General defines */
+#define PORT_BMI_FIFO_UNITS 0x100
+
+#define MAX_PORT_FIFO_SIZE(bmi_max_fifo_size)  \
+   min((u32)bmi_max_fifo_size, (u32)1024 * FMAN_BMI_FIFO_UNITS)
+
+#define PORT_CG_MAP_NUM 8
+#define PORT_PRS_RESULT_WORDS_NUM  8
+#define PORT_IC_OFFSET_UNITS   0x10
+
+#define MIN_EXT_BUF_SIZE   64
+
+#define BMI_PORT_REGS_OFFSET   0
+#define QMI_PORT_REGS_OFFSET   0x400
+
+/* Default values */
+#define DFLT_PORT_BUFFER_PREFIX_CONTEXT_DATA_ALIGN \
+   DFLT_FM_SP_BUFFER_PREFIX_CONTEXT_DATA_ALIGN
+
+#define DFLT_PORT_CUT_BYTES_FROM_END   4
+
+#define DFLT_PORT_ERRORS_TO_DISCARD FM_PORT_FRM_ERR_CLS_DISCARD
+#define DFLT_PORT_MAX_FRAME_LENGTH 9600
+
+#define DFLT_PORT_RX_FIFO_PRI_ELEVATION_LEV(bmi_max_fifo_size) \
+   MAX_PORT_FIFO_SIZE(bmi_max_fifo_size)
+
+#define DFLT_PORT_RX_FIFO_THRESHOLD(major, bmi_max_fifo_size)  \
+   (major == 6 ?   \
+   MAX_PORT_FIFO_SIZE(bmi_max_fifo_size) : \
+   (MAX_PORT_FIFO_SIZE(bmi_max_fifo_size) * 3 / 4))\
+
+#define DFLT_PORT_EXTRA_NUM_OF_FIFO_BUFS   0
+
+/* QMI defines */

Re: [PATCH] vhost: replace % with & on data path

2015-11-30 Thread David Miller
From: "Michael S. Tsirkin" 
Date: Mon, 30 Nov 2015 10:34:07 +0200

> We know vring num is a power of 2, so use &
> to mask the high bits.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/vhost/vhost.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 080422f..85f0f0a 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1366,10 +1366,12 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>   /* Only get avail ring entries after they have been exposed by guest. */
>   smp_rmb();
>  
> + }
> +

!!!
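
For reference, the identity the quoted commit message relies on can be checked in plain userspace C. This is only a sketch: the helper names below (is_pow2, ring_index_mask, ring_index_mod) are hypothetical and are not taken from drivers/vhost/vhost.c. For a power-of-two ring size num, idx % num and idx & (num - 1) select the same slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when n is a nonzero power of two (exactly one bit set). */
static bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Ring slot via mask; only valid when num is a power of two. */
static uint32_t ring_index_mask(uint32_t idx, uint32_t num)
{
	assert(is_pow2(num));
	return idx & (num - 1);
}

/* Reference ring slot via modulo; valid for any nonzero num. */
static uint32_t ring_index_mod(uint32_t idx, uint32_t num)
{
	assert(num != 0);
	return idx % num;
}
```

The mask form avoids a division on the data path, but it silently gives wrong slots if num is not a power of two, which is why the sketch asserts that precondition.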
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ovs-dev] [PATCH net-next v3 1/8] netfilter: Remove IP_CT_NEW_REPLY definition.

2015-11-30 Thread Jarno Rajahalme

> On Nov 25, 2015, at 21:41, Simon Horman  wrote:
> 
>> On Wed, Nov 25, 2015 at 04:08:14PM -0800, Jarno Rajahalme wrote:
>> Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
>> not make sense.  This allows the definition of IP_CT_NUMBER to be
>> simplified as well.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> I hate to be the bearer of bad news but it's not clear
> to me that this change doesn't break user-space.
> 

There should be no change for userspace, unless __KERNEL__ is defined.

Also, this is a minor clean-up only, so I have no problem dropping this patch
if need be.

  Jarno

>> ---
>> include/uapi/linux/netfilter/nf_conntrack_common.h | 12 +---
>> net/openvswitch/conntrack.c|  2 --
>> 2 files changed, 9 insertions(+), 5 deletions(-)
>> 
>> diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h 
>> b/include/uapi/linux/netfilter/nf_conntrack_common.h
>> index 319f471..2f067cf 100644
>> --- a/include/uapi/linux/netfilter/nf_conntrack_common.h
>> +++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
>> @@ -20,9 +20,15 @@ enum ip_conntrack_info {
>> 
>>IP_CT_ESTABLISHED_REPLY = IP_CT_ESTABLISHED + IP_CT_IS_REPLY,
>>IP_CT_RELATED_REPLY = IP_CT_RELATED + IP_CT_IS_REPLY,
>> -IP_CT_NEW_REPLY = IP_CT_NEW + IP_CT_IS_REPLY,
>> -/* Number of distinct IP_CT types (no NEW in reply dirn). */
>> -IP_CT_NUMBER = IP_CT_IS_REPLY * 2 - 1
>> +/* No NEW in reply direction. */
>> +
>> +/* Number of distinct IP_CT types. */
>> +IP_CT_NUMBER,
>> +
>> +/* only for userspace compatibility */
>> +#ifndef __KERNEL__
>> +IP_CT_NEW_REPLY = IP_CT_NUMBER,
>> +#endif
>> };
>> 
>> #define NF_CT_STATE_INVALID_BIT(1 << 0)
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index c2cc111..a28a819 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
>> @@ -73,7 +73,6 @@ static u8 ovs_ct_get_state(enum ip_conntrack_info ctinfo)
>>switch (ctinfo) {
>>case IP_CT_ESTABLISHED_REPLY:
>>case IP_CT_RELATED_REPLY:
>> -case IP_CT_NEW_REPLY:
>>ct_state |= OVS_CS_F_REPLY_DIR;
>>break;
>>default:
>> @@ -90,7 +89,6 @@ static u8 ovs_ct_get_state(enum ip_conntrack_info ctinfo)
>>ct_state |= OVS_CS_F_RELATED;
>>break;
>>case IP_CT_NEW:
>> -case IP_CT_NEW_REPLY:
>>ct_state |= OVS_CS_F_NEW;
>>break;
>>default:
>> -- 
>> 2.1.4
>> 
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
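
The claim that userspace sees no change comes down to enum arithmetic, which a small standalone program can check. This is a sketch under assumptions: the first four enumerators are not visible in the quoted hunk and are assumed here to be ESTABLISHED, RELATED, NEW and IS_REPLY in that order, and the OLD_/NEW_ prefixes exist only so both layouts fit in one file.

```c
#include <assert.h>

/* Layout before the patch (leading enumerators assumed, see above). */
enum old_ct_info {
	OLD_ESTABLISHED,
	OLD_RELATED,
	OLD_NEW,
	OLD_IS_REPLY,
	OLD_ESTABLISHED_REPLY = OLD_ESTABLISHED + OLD_IS_REPLY,
	OLD_RELATED_REPLY = OLD_RELATED + OLD_IS_REPLY,
	OLD_NEW_REPLY = OLD_NEW + OLD_IS_REPLY,
	OLD_NUMBER = OLD_IS_REPLY * 2 - 1
};

/* Layout after the patch, as userspace (no __KERNEL__) would see it. */
enum new_ct_info {
	NEW_ESTABLISHED,
	NEW_RELATED,
	NEW_NEW,
	NEW_IS_REPLY,
	NEW_ESTABLISHED_REPLY = NEW_ESTABLISHED + NEW_IS_REPLY,
	NEW_RELATED_REPLY = NEW_RELATED + NEW_IS_REPLY,
	NEW_NUMBER,			/* implicit: previous value + 1 */
	NEW_NEW_REPLY = NEW_NUMBER	/* userspace-only alias */
};

/* Aborts if either name changed value between the two layouts. */
static void check_ct_layout(void)
{
	assert((int)OLD_NUMBER == (int)NEW_NUMBER);
	assert((int)OLD_NEW_REPLY == (int)NEW_NEW_REPLY);
}
```

Under those assumptions both IP_CT_NUMBER and IP_CT_NEW_REPLY keep the value 5, so already-compiled userspace binaries and sources using either name would behave as before.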


Re: [PATCH] ravb: add R8A7791 support

2015-11-30 Thread Sergei Shtylyov

Hello.

On 11/30/2015 03:42 AM, Simon Horman wrote:


Add support for yet another ARM member of the R-Car family, R-Car M2, also



R-Car M2-W?


Right, forgot about the postfixes.


known as R8A7791.


There's also R-Car M2-N, aka R8A7793, but you probably know that ;-)


Will fix.


I would prefer if we added generic gen2 and gen3 compat strings to the driver
and only documented new soc-specific compat strings.


   That's a new policy, it seems. Previously you preferred the SoC-specific
strings to be used, didn't you?



Actually by chance I was planning to up patches to do that and add compat
strings for the missing Gen2 boards. But I won't complain if you beat me to
it.


   No, I'm pretty busy as is. :-)

MBR, Sergei



Re: [PATCH] Improve Atheros ethernet driver not to do order 4 GFP_ATOMIC allocation

2015-11-30 Thread Eric Dumazet
On Sat, 2015-11-28 at 15:51 +0100, Pavel Machek wrote:
> atl1c driver is doing order-4 allocation with GFP_ATOMIC
> priority. That often breaks  networking after resume. Switch to
> GFP_KERNEL. Still not ideal, but should be significantly better.
> 
> Signed-off-by: Pavel Machek 
> 
> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c 
> b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 2795d6d..afb71e0 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -1016,10 +1016,10 @@ static int atl1c_setup_ring_resources(struct 
> atl1c_adapter *adapter)
>   sizeof(struct atl1c_recv_ret_status) * rx_desc_count +
>   8 * 4;
>  
> - ring_header->desc = pci_alloc_consistent(pdev, ring_header->size,
> - &ring_header->dma);
> + ring_header->desc = dma_alloc_coherent(&pdev->dev, ring_header->size,
> +&ring_header->dma, GFP_KERNEL);
>   if (unlikely(!ring_header->desc)) {
> - dev_err(&pdev->dev, "pci_alloc_consistend failed\n");
> + dev_err(&pdev->dev, "could not get memory for DMA buffer\n");
>   goto err_nomem;
>   }
>   memset(ring_header->desc, 0, ring_header->size);
> 

It seems there is a missed opportunity to get rid of the memset() here,
by adding __GFP_ZERO to the dma_alloc_coherent() GFP_KERNEL mask,
or simply using dma_zalloc_coherent()







Re: [PATCH net] bpf: fix allocation warnings in bpf maps and integer overflow

2015-11-30 Thread Alexei Starovoitov
On Mon, Nov 30, 2015 at 03:34:35PM +0100, Daniel Borkmann wrote:
> >>diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> >>index 3f4c99e06c6b..b1e53b79c586 100644
> >>--- a/kernel/bpf/arraymap.c
> >>+++ b/kernel/bpf/arraymap.c
> >>@@ -28,11 +28,17 @@ static struct bpf_map *array_map_alloc(union bpf_attr 
> >>*attr)
> >>  attr->value_size == 0)
> >>  return ERR_PTR(-EINVAL);
> >>
> >>+if (attr->value_size >= 1 << (KMALLOC_SHIFT_MAX - 1))
> >>+/* if value_size is bigger, the user space won't be able to
> >>+ * access the elements.
> >>+ */
> >>+return ERR_PTR(-E2BIG);
> >>+
> >
> >Bit confused, given that in array map, we try kzalloc() with __GFP_NOWARN 
> >already
> >and if that fails, we fall back to vzalloc(), it shouldn't trigger memory 
> >allocation
> >warnings here ...

not quite, the above check is for kmalloc-s in syscall.c

> Ok, I see. The check and comment is related to the fact that when we do bpf(2)
> syscall to lookup an element:
> 
> We call map_lookup_elem(), which does kmalloc() on the value_size.
> 
> So an individual entry lookup could fail with kmalloc() there, unrelated to an
> individual map implementation.

kmalloc with order >= MAX_ORDER warning can be seen in syscall for update/lookup
commands regardless of map implementation.
So the maps with "value_size >= 1 << (KMALLOC_SHIFT_MAX - 1)" were not accessible
from user space anyway.
This check in arraymap.c fixes the warning and prevents creation of such
maps in the first place as the comment right below it says.
Similar check in hashtab.c fixes warning, prevents abnormal map creation and fixes
integer overflow which is the most dangerous of them all.

The check in arraymap.c
-attr->max_entries > (U32_MAX - sizeof(*array)) / elem_size)
+attr->max_entries > (U32_MAX - PAGE_SIZE - sizeof(*array)) / elem_size)
 fixes potential integer overflow in map.pages computation.

and similar check in hashtab.c:
(u64) htab->elem_size * htab->map.max_entries >= U32_MAX - PAGE_SIZE
fixes integer overflow in map.pages as well.
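
To make the map.pages overflow concrete, here is a userspace sketch of the arithmetic. It assumes the page count is derived by rounding a 32-bit size up to a page boundary; SKETCH_PAGE_SIZE and SKETCH_HDR_SIZE are hypothetical stand-ins for PAGE_SIZE and sizeof(*array), and none of the function names come from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_HDR_SIZE  64u	/* hypothetical stand-in for sizeof(*array) */

/* Old bound: the total size fits in 32 bits, but page rounding may wrap. */
static bool old_check_ok(uint32_t max_entries, uint32_t elem_size)
{
	assert(elem_size != 0);
	return max_entries <= (UINT32_MAX - SKETCH_HDR_SIZE) / elem_size;
}

/* New bound: leaves one page of headroom for the round-up. */
static bool new_check_ok(uint32_t max_entries, uint32_t elem_size)
{
	assert(elem_size != 0);
	return max_entries <=
	       (UINT32_MAX - SKETCH_PAGE_SIZE - SKETCH_HDR_SIZE) / elem_size;
}

/* Page count computed entirely in 32-bit arithmetic. */
static uint32_t pages_u32(uint32_t max_entries, uint32_t elem_size)
{
	uint32_t size = SKETCH_HDR_SIZE + max_entries * elem_size;
	uint32_t rounded = (size + SKETCH_PAGE_SIZE - 1) &
			   ~(SKETCH_PAGE_SIZE - 1);

	return rounded / SKETCH_PAGE_SIZE;
}
```

With elem_size 8 and max_entries 536870903, the old bound accepts the map, yet the rounded size wraps to 0 and the page count comes out as 0; the new bound rejects that request.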

the 'value_size >= (1 << (KMALLOC_SHIFT_MAX - 1)) - MAX_BPF_STACK - sizeof(struct htab_elem)'
check in hashtab.c fixes integer overflow in elem_size and
makes elem_size kmalloc-able later in htab_map_update_elem().
Since it wasn't obvious that this one 'if' addresses these multiple issues,
I've added a comment there.

Addition of __GFP_NOWARN only fixes OOM warning as commit log says.

> Hmm, seems this patch fixes many things at once, maybe makes sense to split it?

hmm I don't see a point of changing the same single line over multiple patches.
The split won't help backporting, but rather makes for more patches to deal with.



Re: [PATCH net] ipv6: kill sk_dst_lock

2015-11-30 Thread Paolo Abeni
On Mon, 2015-11-30 at 08:35 -0800, Eric Dumazet wrote:
> ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
> way to fix the mess is to remove sk_dst_lock completely, as we did for
> IPv4.

Probably I'm missing something here, but why don't we need to sync the
update of sk_dst_cache and of dst_cookie (i.e. put them under the same
lock)?

Can't we end up with inconsistent values after concurrent udp
sendmsg()?

Cheers,

Paolo


