[PATCH v3] netlink: Fix autobind race condition that leads to zero port ID

2015-09-17 Thread Herbert Xu
On Thu, Sep 17, 2015 at 07:30:34AM -0400, Tejun Heo wrote:
>
> Maybe add that this led to a deadlock and add a Link tag to this
> thread?

I'll add a note about the deadlock but I don't like Link tags
because websites die and you can always just google the patch
subject.

> > +   nlk_sk(sk)->bound = !!portid;
> 
> !! isn't necessary and this creates ordering between two stores.

!! was necessary because we're going from a u32 to a bool.

> ->bound must be visible only after ->portid is visible, so this should
> be smp_store_release().

But yes there are ordering issues here so I've decided to make
rhashtable use a new field for its hash instead.

Note that I've dropped the acks as this patch is now substantially
different.

---8<---
The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink:
Reset portid after netlink_insert failure") introduced a race
condition where if two threads try to autobind the same socket
one of them may end up with a zero port ID.  This led to kernel
deadlocks that were observed by multiple people.

This patch reverts that commit and instead fixes it by introducing
a separate rhash_portid variable so that the real portid is only set
after the socket has been successfully hashed.

Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure")
Reported-by: Tejun Heo 
Reported-by: Linus Torvalds 
Signed-off-by: Herbert Xu 

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index dea9253..7157bad 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -988,7 +988,7 @@ static inline int netlink_compare(struct 
rhashtable_compare_arg *arg,
const struct netlink_compare_arg *x = arg->key;
const struct netlink_sock *nlk = ptr;
 
-   return nlk->portid != x->portid ||
+   return nlk->rhash_portid != x->portid ||
   !net_eq(sock_net(&nlk->sk), read_pnet(&x->pnet));
 }
 
@@ -1014,7 +1014,7 @@ static int __netlink_insert(struct netlink_table *table, 
struct sock *sk)
 {
struct netlink_compare_arg arg;
 
-   netlink_compare_arg_init(&arg, sock_net(sk), nlk_sk(sk)->portid);
+   netlink_compare_arg_init(&arg, sock_net(sk), nlk_sk(sk)->rhash_portid);
return rhashtable_lookup_insert_key(&table->hash, &arg,
&nlk_sk(sk)->node,
netlink_rhashtable_params);
@@ -1076,17 +1076,19 @@ static int netlink_insert(struct sock *sk, u32 portid)
unlikely(atomic_read(&table->hash.nelems) >= UINT_MAX))
goto err;
 
-   nlk_sk(sk)->portid = portid;
+   nlk_sk(sk)->rhash_portid = portid;
sock_hold(sk);
 
err = __netlink_insert(table, sk);
if (err) {
if (err == -EEXIST)
err = -EADDRINUSE;
-   nlk_sk(sk)->portid = 0;
sock_put(sk);
+   goto err;
}
 
+   nlk_sk(sk)->portid = portid;
+
 err:
release_sock(sk);
return err;
@@ -3197,7 +3199,7 @@ static inline u32 netlink_hash(const void *data, u32 len, 
u32 seed)
const struct netlink_sock *nlk = data;
struct netlink_compare_arg arg;
 
-   netlink_compare_arg_init(&arg, sock_net(&nlk->sk), nlk->portid);
+   netlink_compare_arg_init(&arg, sock_net(&nlk->sk), nlk->rhash_portid);
return jhash2((u32 *)&arg, netlink_compare_arg_len / sizeof(u32), seed);
 }
 
diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
index 8900840..c96dfa3 100644
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -25,6 +25,7 @@ struct netlink_ring {
 struct netlink_sock {
/* struct sock has to be the first member of netlink_sock */
struct sock sk;
+   u32 rhash_portid;
u32 portid;
u32 dst_portid;
u32 dst_group;
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
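
For readers following the netlink thread, the ordering the patch relies on can
be sketched in plain userspace C. This is only an illustration under stated
assumptions: demo_sock, demo_table, demo_hash_insert() and demo_insert() are
hypothetical stand-ins, not kernel APIs, and the sketch is single-threaded, so
it shows the store ordering but not the race itself. The point is that the key
used for hashing is written first, and the published portid is written only
after the insert has succeeded, so a failed insert never touches portid.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 8

/* Hypothetical stand-in for struct netlink_sock: only the two IDs matter. */
struct demo_sock {
    uint32_t rhash_portid;  /* key used by the hash table only */
    uint32_t portid;        /* published ID; 0 means "not bound" */
};

/* Hypothetical tiny table standing in for the rhashtable. */
struct demo_table {
    uint32_t keys[DEMO_SLOTS];
    size_t n;
};

/* Insert a key; duplicates are rejected like a lookup-insert would. */
static int demo_hash_insert(struct demo_table *t, uint32_t key)
{
    for (size_t i = 0; i < t->n; i++)
        if (t->keys[i] == key)
            return -EEXIST;
    if (t->n == DEMO_SLOTS)
        return -ENOMEM;
    t->keys[t->n++] = key;
    return 0;
}

/* Same ordering as the patch: hash key first, real portid last. */
int demo_insert(struct demo_table *t, struct demo_sock *s, uint32_t portid)
{
    int err;

    assert(t != NULL && s != NULL);

    s->rhash_portid = portid;
    err = demo_hash_insert(t, s->rhash_portid);
    if (err)
        return err == -EEXIST ? -EADDRINUSE : err;

    s->portid = portid;     /* only reached after a successful insert */
    return 0;
}
```

With this shape, a caller whose insert fails sees an error and a portid that is
still whatever it was before the call, rather than a value reset to zero.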


Re: ATL1E 0000:02:00.0: swiotlb buffer is full (sz: 529461 bytes)

2015-09-17 Thread Markus Trippelsdorf
On 2015.09.12 at 08:51 +0200, Markus Trippelsdorf wrote:
> With the current Linus git tree I get an occasional swiotlb allocation
> error during network setup at boot-time:
> 
>  ATL1E 0000:02:00.0: swiotlb buffer is full (sz: 529461 bytes)
>  swiotlb: coherent allocation failed for device 0000:02:00.0 size=529461
>  CPU: 0 PID: 200 Comm: ifconfig Not tainted 4.2.0-11400-gdfb22fc5c0eb-dirty 
> #113
>  Hardware name: System manufacturer System Product Name/M4A78T-E, BIOS 3503   
>  04/13/2011
>    812b8d33  812da574
>   0008 880216827000 880216827740 880216af
>   00081435 81c163c0 880216b1b800 814e3b1d
>  Call Trace:
>   [] ? dump_stack+0x40/0x6d
>   [] ? swiotlb_alloc_coherent+0x134/0x160
>   [] ? atl1e_open+0xfd/0x460
>   [] ? __dev_open+0x82/0x100
>   [] ? __dev_change_flags+0x91/0x160
>   [] ? dev_change_flags+0x1e/0x60
>   [] ? dev_get_by_name_rcu+0x54/0x80
>   [] ? devinet_ioctl+0x5d6/0x6a0
>   [] ? sock_ioctl+0xed/0x1e0
>   [] ? do_vfs_ioctl+0x291/0x480
>   [] ? __do_page_fault+0x140/0x380
>   [] ? SyS_ioctl+0x36/0x80
>   [] ? entry_SYSCALL_64_fastpath+0x12/0x66
>  ATL1E 0000:02:00.0 eth0: pci_alloc_consistent failed, size = D529461
> 
> This happens every fourth or fifth boot, so the issue is not bisectable.

Adding CC.

-- 
Markus


[PATCH/RFC net-next v3 3/4] ravb: Document binding for r8a7795 SoC

2015-09-17 Thread Simon Horman
From: Kazuya Mizuguchi 

This patch updates the ravb binding to support the r8a7795 SoC by:
- Adding a compat string for the new hardware
- Adding 25 named interrupts to binding for the new SoC;
  older SoCs continue to use a single multiplexed interrupt

The example is also updated to reflect the r8a7795 as this is the
more complex case.

Based on work by Kazuya Mizuguchi and others.

Signed-off-by: Simon Horman 

---
v2
* First post; broken out of a driver update patch
* As discussed with Geert Uytterhoeven and Sergei Shtylyov
  - Binding: Make all interrupts mandatory as named-interrupts of
the form ch%u

v3
* As suggested by Geert Uytterhoeven
  - Reword description of interrupts and interrupt-names to
make things clearer. It is now based to some extent on
spi-rspi.txt and renesas,usb-dmac.txt.
* As suggested by Sergei Shtylyov
  - Drop phy-reset-gpio from example
* Added power-domains to example
---
 .../devicetree/bindings/net/renesas,ravb.txt   | 69 +++---
 1 file changed, 62 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/renesas,ravb.txt 
b/Documentation/devicetree/bindings/net/renesas,ravb.txt
index 1fd8831437bf..ae454d9df136 100644
--- a/Documentation/devicetree/bindings/net/renesas,ravb.txt
+++ b/Documentation/devicetree/bindings/net/renesas,ravb.txt
@@ -6,8 +6,12 @@ interface contains.
 Required properties:
 - compatible: "renesas,etheravb-r8a7790" if the device is a part of R8A7790 
SoC.
  "renesas,etheravb-r8a7794" if the device is a part of R8A7794 SoC.
+ "renesas,etheravb-r8a7795" if the device is a part of R8A7795 SoC.
 - reg: offset and length of (1) the register block and (2) the stream buffer.
-- interrupts: interrupt specifier for the sole interrupt.
+- interrupts: A list of interrupt-specifiers, one for each entry in
+ interrupt-names.
+ If interrupt-names is not present, an interrupt specifier
+ for a single muxed interrupt.
 - phy-mode: see ethernet.txt file in the same directory.
 - phy-handle: see ethernet.txt file in the same directory.
 - #address-cells: number of address cells for the MDIO bus, must be equal to 1.
@@ -18,6 +22,12 @@ Required properties:
 Optional properties:
 - interrupt-parent: the phandle for the interrupt controller that services
interrupts for this device.
+- interrupt-names: A list of interrupt names.
+  For the R8A7795 SoC this property is mandatory;
+  it should include one entry per channel, named "ch%u",
+  where %u is the channel number ranging from 0 to 24.
+  For other SoCs this property is optional; if present
+  it should contain "mux" for a single muxed interrupt.
 - pinctrl-names: pin configuration state name ("default").
 - renesas,no-ether-link: boolean, specify when a board does not provide a 
proper
 AVB_LINK signal.
@@ -27,13 +37,46 @@ Optional properties:
 Example:
 
ethernet@e680 {
-   compatible = "renesas,etheravb-r8a7790";
-   reg = <0 0xe680 0 0x800>, <0 0xee0e8000 0 0x4000>;
+   compatible = "renesas,etheravb-r8a7795";
+   reg = <0 0xe680 0 0x800>, <0 0xe6a0 0 0x1>;
interrupt-parent = <&gic>;
-   interrupts = <0 163 IRQ_TYPE_LEVEL_HIGH>;
-   clocks = <&mstp8_clks R8A7790_CLK_ETHERAVB>;
-   phy-mode = "rmii";
+   interrupts = <...>;  /* 25 specifiers elided, one per interrupt-names entry */
+   interrupt-names = "ch0", "ch1", "ch2", "ch3",
+ "ch4", "ch5", "ch6", "ch7",
+ "ch8", "ch9", "ch10", "ch11",
+ "ch12", "ch13", "ch14", "ch15",
+ "ch16", "ch17", "ch18", "ch19",
+ "ch20", "ch21", "ch22", "ch23",
+ "ch24";
+   clocks = <&mstp8_clks R8A7795_CLK_ETHERAVB>;
+   power-domains = <&cpg_clocks>;
+   phy-mode = "rgmii-id";
phy-handle = <&phy0>;
+
pinctrl-0 = <&ether_pins>;
pinctrl-names = "default";
  

[PATCH/RFC net-next v3 2/4] ravb: Provide dev parameter to DMA API

2015-09-17 Thread Simon Horman
From: Kazuya Mizuguchi 

This patch is in preparation for using this driver on arm64 where the
implementation of __dma_alloc_coherent fails if a device parameter is not
provided.

Signed-off-by: Kazuya Mizuguchi 
Signed-off-by: Yoshihiro Shimoda 
Signed-off-by: Masaru Nagai 
[horms: squashed into a single patch]
Signed-off-by: Simon Horman 

---
* [Simon Horman]
  I have only tested this on arm64.
  It should be tested for regressions on arm hardware.

v0 [Kazuya Mizuguchi, Yoshihiro Shimoda, Masaru Nagai]

v1 [Simon Horman]
* Squashed into a single patch

v2 [Simon Horman]
* No change

v3 [Simon Horman]
* No change
---
 drivers/net/ethernet/renesas/ravb_main.c | 38 
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c 
b/drivers/net/ethernet/renesas/ravb_main.c
index 450899e9cea2..4ca093d033f8 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -201,7 +201,7 @@ static void ravb_ring_free(struct net_device *ndev, int q)
if (priv->rx_ring[q]) {
ring_size = sizeof(struct ravb_ex_rx_desc) *
(priv->num_rx_ring[q] + 1);
-   dma_free_coherent(NULL, ring_size, priv->rx_ring[q],
+   dma_free_coherent(ndev->dev.parent, ring_size, priv->rx_ring[q],
  priv->rx_desc_dma[q]);
priv->rx_ring[q] = NULL;
}
@@ -209,7 +209,7 @@ static void ravb_ring_free(struct net_device *ndev, int q)
if (priv->tx_ring[q]) {
ring_size = sizeof(struct ravb_tx_desc) *
(priv->num_tx_ring[q] * NUM_TX_DESC + 1);
-   dma_free_coherent(NULL, ring_size, priv->tx_ring[q],
+   dma_free_coherent(ndev->dev.parent, ring_size, priv->tx_ring[q],
  priv->tx_desc_dma[q]);
priv->tx_ring[q] = NULL;
}
@@ -240,13 +240,13 @@ static void ravb_ring_format(struct net_device *ndev, int 
q)
rx_desc = &priv->rx_ring[q][i];
/* The size of the buffer should be on 16-byte boundary. */
rx_desc->ds_cc = cpu_to_le16(ALIGN(PKT_BUF_SZ, 16));
-   dma_addr = dma_map_single(&ndev->dev, priv->rx_skb[q][i]->data,
+   dma_addr = dma_map_single(ndev->dev.parent, 
priv->rx_skb[q][i]->data,
  ALIGN(PKT_BUF_SZ, 16),
  DMA_FROM_DEVICE);
/* We just set the data size to 0 for a failed mapping which
 * should prevent DMA from happening...
 */
-   if (dma_mapping_error(&ndev->dev, dma_addr))
+   if (dma_mapping_error(ndev->dev.parent, dma_addr))
rx_desc->ds_cc = cpu_to_le16(0);
rx_desc->dptr = cpu_to_le32(dma_addr);
rx_desc->die_dt = DT_FEMPTY;
@@ -309,7 +309,7 @@ static int ravb_ring_init(struct net_device *ndev, int q)
 
/* Allocate all RX descriptors. */
ring_size = sizeof(struct ravb_ex_rx_desc) * (priv->num_rx_ring[q] + 1);
-   priv->rx_ring[q] = dma_alloc_coherent(NULL, ring_size,
+   priv->rx_ring[q] = dma_alloc_coherent(ndev->dev.parent, ring_size,
  &priv->rx_desc_dma[q],
  GFP_KERNEL);
if (!priv->rx_ring[q])
@@ -320,7 +320,7 @@ static int ravb_ring_init(struct net_device *ndev, int q)
/* Allocate all TX descriptors. */
ring_size = sizeof(struct ravb_tx_desc) *
(priv->num_tx_ring[q] * NUM_TX_DESC + 1);
-   priv->tx_ring[q] = dma_alloc_coherent(NULL, ring_size,
+   priv->tx_ring[q] = dma_alloc_coherent(ndev->dev.parent, ring_size,
  &priv->tx_desc_dma[q],
  GFP_KERNEL);
if (!priv->tx_ring[q])
@@ -443,7 +443,7 @@ static int ravb_tx_free(struct net_device *ndev, int q)
size = le16_to_cpu(desc->ds_tagl) & TX_DS;
/* Free the original skb. */
if (priv->tx_skb[q][entry / NUM_TX_DESC]) {
-   dma_unmap_single(&ndev->dev, le32_to_cpu(desc->dptr),
+   dma_unmap_single(ndev->dev.parent, 
le32_to_cpu(desc->dptr),
 size, DMA_TO_DEVICE);
/* Last packet descriptor? */
if (entry % NUM_TX_DESC == NUM_TX_DESC - 1) {
@@ -546,7 +546,7 @@ static bool ravb_rx(struct net_device *ndev, int *quota, 
int q)
 
skb = priv->rx_skb[q][entry];
priv->rx_skb[q][entry] = NULL;
-   dma_unmap_single(&ndev->dev, le32_to_cpu(desc->dptr),
+   dma_unmap_single(ndev->dev.parent, 
le32_to_cpu(desc->dptr),
 

[PATCH/RFC net-next v3 4/4] ravb: Add support for r8a7795 SoC

2015-09-17 Thread Simon Horman
From: Kazuya Mizuguchi 

This patch supports the r8a7795 SoC by:
- Using two interrupts
  + One for E-MAC
  + One for everything else
  + Both can be handled by the existing common interrupt handler, which
affords a simpler update to support the new SoC. In future some
consideration may be given to implementing multiple interrupt handlers
- Limiting the phy speed to 100Mbit/s for the new SoC;
  at this time it is not clear how this restriction may be lifted
  but I hope it will be possible as more information comes to light

Signed-off-by: Kazuya Mizuguchi 
[horms: reworked]
Signed-off-by: Simon Horman 

---
v0 [Kazuya Mizuguchi]

v1 [Simon Horman]
* Updated patch subject

v2 [Simon Horman]
* Reworked based on extensive feedback from
  Geert Uytterhoeven and Sergei Shtylyov.
* Broke binding update out into separate patch

v3 [Simon Horman]
* Check new return value of phy_set_max_speed()
---
 drivers/net/ethernet/renesas/ravb.h  |  7 
 drivers/net/ethernet/renesas/ravb_main.c | 63 
 2 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb.h 
b/drivers/net/ethernet/renesas/ravb.h
index a157ff6a..0623fff932e4 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -766,6 +766,11 @@ struct ravb_ptp {
struct ravb_ptp_perout perout[N_PER_OUT];
 };
 
+enum ravb_chip_id {
+   RCAR_GEN2,
+   RCAR_GEN3,
+};
+
 struct ravb_private {
struct net_device *ndev;
struct platform_device *pdev;
@@ -806,6 +811,8 @@ struct ravb_private {
int msg_enable;
int speed;
int duplex;
+   int emac_irq;
+   enum ravb_chip_id chip_id;
 
unsigned no_avb_link:1;
unsigned avb_link_active_low:1;
diff --git a/drivers/net/ethernet/renesas/ravb_main.c 
b/drivers/net/ethernet/renesas/ravb_main.c
index 4ca093d033f8..8cc5ec5ed19a 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -889,6 +889,22 @@ static int ravb_phy_init(struct net_device *ndev)
return -ENOENT;
}
 
+   /* This driver only supports 10/100Mbit speeds on Gen3
+* at this time.
+*/
+   if (priv->chip_id == RCAR_GEN3) {
+   int err;
+
+   err = phy_set_max_speed(phydev, SPEED_100);
+   if (err) {
+   netdev_err(ndev, "failed to limit PHY to 100Mbit/s\n");
+   phy_disconnect(phydev);
+   return err;
+   }
+
+   netdev_info(ndev, "limited PHY to 100Mbit/s\n");
+   }
+
netdev_info(ndev, "attached PHY %d (IRQ %d) to driver %s\n",
phydev->addr, phydev->irq, phydev->drv->name);
 
@@ -1197,6 +1213,15 @@ static int ravb_open(struct net_device *ndev)
goto out_napi_off;
}
 
+   if (priv->chip_id == RCAR_GEN3) {
+   error = request_irq(priv->emac_irq, ravb_interrupt,
+   IRQF_SHARED, ndev->name, ndev);
+   if (error) {
+   netdev_err(ndev, "cannot request IRQ\n");
+   goto out_free_irq;
+   }
+   }
+
/* Device init */
error = ravb_dmac_init(ndev);
if (error)
@@ -1220,6 +1245,7 @@ out_ptp_stop:
ravb_ptp_stop(ndev);
 out_free_irq:
free_irq(ndev->irq, ndev);
+   free_irq(priv->emac_irq, ndev);
 out_napi_off:
napi_disable(&priv->napi[RAVB_NC]);
napi_disable(&priv->napi[RAVB_BE]);
@@ -1625,10 +1651,20 @@ static int ravb_mdio_release(struct ravb_private *priv)
return 0;
 }
 
+static const struct of_device_id ravb_match_table[] = {
+   { .compatible = "renesas,etheravb-r8a7790", .data = (void *)RCAR_GEN2 },
+   { .compatible = "renesas,etheravb-r8a7794", .data = (void *)RCAR_GEN2 },
+   { .compatible = "renesas,etheravb-r8a7795", .data = (void *)RCAR_GEN3 },
+   { }
+};
+MODULE_DEVICE_TABLE(of, ravb_match_table);
+
 static int ravb_probe(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
+   const struct of_device_id *match;
struct ravb_private *priv;
+   enum ravb_chip_id chip_id;
struct net_device *ndev;
int error, irq, q;
struct resource *res;
@@ -1657,7 +1693,14 @@ static int ravb_probe(struct platform_device *pdev)
/* The Ether-specific entries in the device structure. */
ndev->base_addr = res->start;
ndev->dma = -1;
-   irq = platform_get_irq(pdev, 0);
+
+   match = of_match_device(of_match_ptr(ravb_match_table), &pdev->dev);
+   chip_id = (enum ravb_chip_id)match->data;
+
+   if (chip_id == RCAR_GEN3)
+   irq = platform_get_irq_byname(pdev, "ch22");
+   else
+   irq = platform_get_irq(pdev, 0);
if (irq < 0) {
error = irq;
goto out_release;
@@ -

[PATCH/RFC net-next v3 1/4] phylib: Add phy_set_max_speed helper

2015-09-17 Thread Simon Horman
Add a helper to allow ethernet drivers to limit the speed of a phy
(that they are attached to).

This mainly involves factoring out the business-end of
of_set_phy_supported() and exporting a new symbol.

This code seems to be open coded in several places, in several different
variants.

It is envisaged that this will be used in situations where setting the
"max-speed" property in DT is not appropriate, e.g. because the maximum
speed is not a property of the phy hardware.

Signed-off-by: Simon Horman 

---
v2
* First post

v3
* As suggested by Florian Fainelli
  - Do not check for !IS_ENABLED(CONFIG_OF_MDIO) in __set_phy_supported().
This is already done in of_set_phy_supported() and is not relevant to
phy_set_max_speed()
  - Return -ENOTSUPP if 'max_speed' is not a known value
* As suggested by Sergei Shtylyov
  - White-space and comment enhancements.
---
 drivers/net/phy/phy_device.c | 59 ++--
 include/linux/phy.h  |  1 +
 2 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index c0f211127274..a4d40c5d984f 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1205,6 +1205,44 @@ static int gen10g_resume(struct phy_device *phydev)
return 0;
 }
 
+static int __set_phy_supported(struct phy_device *phydev, u32 max_speed)
+{
+   /* The default values for phydev->supported are provided by the PHY
+* driver "features" member, we want to reset to sane defaults first
+* before supporting higher speeds.
+*/
+   phydev->supported &= PHY_DEFAULT_FEATURES;
+
+   switch (max_speed) {
+   default:
+   return -ENOTSUPP;
+   case SPEED_1000:
+   phydev->supported |= PHY_1000BT_FEATURES;
+   /* fall through */
+   case SPEED_100:
+   phydev->supported |= PHY_100BT_FEATURES;
+   /* fall through */
+   case SPEED_10:
+   phydev->supported |= PHY_10BT_FEATURES;
+   }
+
+   return 0;
+}
+
+int phy_set_max_speed(struct phy_device *phydev, u32 max_speed)
+{
+   int err;
+
+   err = __set_phy_supported(phydev, max_speed);
+   if (err)
+   return err;
+
+   phydev->advertising = phydev->supported;
+
+   return 0;
+}
+EXPORT_SYMBOL(phy_set_max_speed);
+
 static void of_set_phy_supported(struct phy_device *phydev)
 {
struct device_node *node = phydev->dev.of_node;
@@ -1216,25 +1254,8 @@ static void of_set_phy_supported(struct phy_device 
*phydev)
if (!node)
return;
 
-   if (!of_property_read_u32(node, "max-speed", &max_speed)) {
-   /* The default values for phydev->supported are provided by the 
PHY
-* driver "features" member, we want to reset to sane defaults 
fist
-* before supporting higher speeds.
-*/
-   phydev->supported &= PHY_DEFAULT_FEATURES;
-
-   switch (max_speed) {
-   default:
-   return;
-
-   case SPEED_1000:
-   phydev->supported |= PHY_1000BT_FEATURES;
-   case SPEED_100:
-   phydev->supported |= PHY_100BT_FEATURES;
-   case SPEED_10:
-   phydev->supported |= PHY_10BT_FEATURES;
-   }
-   }
+   if (!of_property_read_u32(node, "max-speed", &max_speed))
+   __set_phy_supported(phydev, max_speed);
 }
 
 /**
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 962387a192f1..66f8b718396a 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -794,6 +794,7 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq 
*ifr, int cmd);
 int phy_start_interrupts(struct phy_device *phydev);
 void phy_print_status(struct phy_device *phydev);
 void phy_device_free(struct phy_device *phydev);
+int phy_set_max_speed(struct phy_device *phydev, u32 max_speed);
 
 int phy_register_fixup(const char *bus_id, u32 phy_uid, u32 phy_uid_mask,
   int (*run)(struct phy_device *));
-- 
2.1.4
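
As a rough illustration of the helper described above, here is a standalone
userspace sketch of the fall-through switch. Everything in it is an
assumption made for the example: the DEMO_* bits stand in for the kernel's
PHY_*_FEATURES masks, struct demo_phy stands in for struct phy_device, and
-EINVAL is used because ENOTSUPP is a kernel-internal error code. Unlike the
patch, the sketch leaves the mask untouched when the speed is not recognised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; the kernel uses PHY_*_FEATURES masks instead. */
#define DEMO_10BT     (1u << 0)
#define DEMO_100BT    (1u << 1)
#define DEMO_1000BT   (1u << 2)
#define DEMO_DEFAULT  (1u << 3)   /* stands in for PHY_DEFAULT_FEATURES */

/* Hypothetical stand-in for the two struct phy_device fields involved. */
struct demo_phy {
    uint32_t supported;
    uint32_t advertising;
};

/*
 * Limit the PHY to max_speed (in Mbit/s). Each case falls through to the
 * slower ones, so allowing 1000 also allows 100 and 10.
 */
int demo_set_max_speed(struct demo_phy *phy, unsigned int max_speed)
{
    uint32_t supported;

    assert(phy != NULL);

    /* Start from the default features, dropping all speed bits. */
    supported = phy->supported & DEMO_DEFAULT;

    switch (max_speed) {
    default:
        return -EINVAL;           /* unknown speed: change nothing */
    case 1000:
        supported |= DEMO_1000BT;
        /* fall through */
    case 100:
        supported |= DEMO_100BT;
        /* fall through */
    case 10:
        supported |= DEMO_10BT;
    }

    phy->supported = supported;
    phy->advertising = supported; /* advertise exactly what is supported */
    return 0;
}
```

A driver-like caller would check the return value, as the ravb patch in this
series does, and bail out if the limit could not be applied.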



[PATCH/RFC net-next v3 0/4] ravb: Add support for r8a7795 SoC

2015-09-17 Thread Simon Horman
This series enhances the ravb driver to support the r8a7795 SoC.

Changes:

* Details in changelog of individual patches

Base:

* net-next/master

Availability:

To aid review of this and other EtherAVB patches, the following branches are
available in my renesas tree on kernel.org.

* me/r8a7795-ravb-driver-v3: this series
* me/r8a7795-ravb-pfc-v2: r8a7795 sh-pfc update for EthernetAVB
* me/r8a7795-ravb-integration-v3: enable EthernetAVB on r8a7795
* me/r8a7795-ravb-driver-and-integration-v3.runtime:
  the above three branches with their runtime dependencies


Kazuya Mizuguchi (3):
  ravb: Provide dev parameter to DMA API
  ravb: Document binding for r8a7795 SoC
  ravb: Add support for r8a7795 SoC

Simon Horman (1):
  phylib: Add phy_set_max_speed helper

 .../devicetree/bindings/net/renesas,ravb.txt   |  69 --
 drivers/net/ethernet/renesas/ravb.h|   7 ++
 drivers/net/ethernet/renesas/ravb_main.c   | 101 +++--
 drivers/net/phy/phy_device.c   |  59 
 include/linux/phy.h|   1 +
 5 files changed, 184 insertions(+), 53 deletions(-)

-- 
2.1.4



Re: [PATCH][RESEND] MAINTAINERS: add arcnet and take maintainership

2015-09-17 Thread Uwe Kleine-König
Hello,

On Thu, Sep 17, 2015 at 03:26:16PM +0200, Michael Grzeschik wrote:
> +ARCNET NETWORK LAYER
> +M:   Michael Grzeschik 
> +L:   netdev@vger.kernel.org
> +S:   Maintained
> +F:   drivers/net/arcnet/
> +F:   include/uapi/linux/if_arcnet.h
> +

What about

Documentation/networking/arcnet-hardware.txt
Documentation/networking/arcnet.txt

?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |


Re: [PATCH next 0/30] Passing net through the netfilter hooks

2015-09-17 Thread David Miller
From: David Miller 
Date: Thu, 17 Sep 2015 17:19:04 -0700 (PDT)

> From: ebied...@xmission.com (Eric W. Biederman)
> Date: Tue, 15 Sep 2015 19:59:49 -0500
> 
>> Pablo, Dave I don't know whose tree this makes more sense to go
>> through.  I am assuming at least initially Pablo's as netfilter is
>> involved.  From what I have seen there will be a lot of back and forth
>> between the netfilter code paths and the routing code paths.
> 
> I think it might reduce conflicts actually if it went via my net-next
> tree.
> 
> Pablo, any objections?

I actually decided to just push it out to my tree, if there are any
problems with that I will revert.

Thanks.


Re: [PATCH net-next] sch_dsmark: improve memory locality

2015-09-17 Thread David Miller
From: Eric Dumazet 
Date: Thu, 17 Sep 2015 16:37:13 -0700

> From: Eric Dumazet 
> 
> Memory placement in sch_dsmark is silly : Better place mask/value
> in the same cache line.
> 
> Also, we can embed small arrays in the first cache line and
> remove a potential cache miss.
> 
> Signed-off-by: Eric Dumazet 

Applied, thanks.


Re: [PATCH net] tcp_cubic: do not set epoch_start in the future

2015-09-17 Thread David Miller
From: Eric Dumazet 
Date: Thu, 17 Sep 2015 08:38:00 -0700

> From: Eric Dumazet 
> 
> Tracking idle time in bictcp_cwnd_event() is imprecise, as epoch_start
> is normally set at ACK processing time, not at send time.
> 
> Doing a proper fix would need to add an additional state variable,
> and does not seem worth the trouble, given CUBIC bug has been there
> forever before Jana noticed it.
> 
> Let's simply not set epoch_start in the future, otherwise
> bictcp_update() could overflow and CUBIC would again
> grow cwnd too fast.
> 
> This was detected thanks to a packetdrill test Neal wrote that was flaky
> before applying this fix.
> 
> Fixes: 30927520dbae ("tcp_cubic: better follow cubic curve after idle period")
> Signed-off-by: Eric Dumazet 
> Signed-off-by: Neal Cardwell 
> Signed-off-by: Yuchung Cheng 
> Cc: Jana Iyengar 

Applied, thanks.
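
The idea in the quoted description (shift epoch_start by the idle time, but
never past the current time) can be sketched in userspace C as follows. This
is a hedged illustration, not the applied patch: demo_after() and
demo_shift_epoch() are hypothetical names, and the timestamps are assumed to
be 32-bit tick counters that may wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Wrap-safe "a is later than b" for 32-bit tick counters. */
static int demo_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Shift epoch_start forward by the idle time (now - last_send), then clamp
 * it so it never lies in the future. An epoch_start of 0 means "not set"
 * and is left alone.
 */
uint32_t demo_shift_epoch(uint32_t epoch_start, uint32_t now,
                          uint32_t last_send)
{
    int32_t delta = (int32_t)(now - last_send);

    if (epoch_start != 0 && delta > 0) {
        epoch_start += (uint32_t)delta;
        if (demo_after(epoch_start, now))
            epoch_start = now;
        /* Postcondition: the epoch is not in the future. */
        assert(!demo_after(epoch_start, now));
    }
    return epoch_start;
}
```

Without the clamp, a later "now - epoch_start" computation would see a huge
unsigned value, which is the overflow the description warns about.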


Re: [PATCH net] fjes: fix off-by-one error at fjes_hw_update_zone_task()

2015-09-17 Thread David Miller
From: Taku Izumi 
Date: Thu, 17 Sep 2015 23:21:21 +0900

> Dan Carpenter reported off-by-one error of fjes at
> http://www.mail-archive.com/netdev@vger.kernel.org/msg77520.html
> 
> Actually this is a bug.
> ep_shm_info[epidx].{es_status, zone} should be updated
> inside the for loop.
> 
> This patch fixes this bug.
> 
> Reported-by: Dan Carpenter 
> Signed-off-by: Taku Izumi 

Applied.


Re: [PATCH net] MAINTAINERS: remove bouncing email address for qlcnic

2015-09-17 Thread David Miller
From: Jiri Benc 
Date: Thu, 17 Sep 2015 16:28:31 +0200

> I got this automated message from  when submitting
> a qlcnic patch:
> 
>> Shahed Shaikh is no longer with QLogic. If you require assistance please
>> contact Ariel Elior ariel.el...@qlogic.com
> 
> There's no point in having a bouncing address in MAINTAINERS.
> 
> CC: dept-gelinuxnic...@qlogic.com
> CC: Ariel Elior 
> Signed-off-by: Jiri Benc 

Applied, thanks.


Re: [PATCH net 0/5] vxlan fixes

2015-09-17 Thread David Miller
From: Jiri Benc 
Date: Thu, 17 Sep 2015 16:11:09 +0200

> This fixes various issues with vxlan related to IPv6.

Series applied, thanks Jiri.


Re: [PATCH][RESEND] MAINTAINERS: add arcnet and take maintainership

2015-09-17 Thread David Miller
From: Michael Grzeschik 
Date: Thu, 17 Sep 2015 15:26:16 +0200

> Add entry for arcnet to MAINTAINERS file and add myself as the
> maintainer of the subsystem.
> 
> Signed-off-by: Michael Grzeschik 

Applied.


Re: [PATCH][RESEND] ARCNET: fix hard_header_len limit

2015-09-17 Thread David Miller
From: Michael Grzeschik 
Date: Thu, 17 Sep 2015 15:18:34 +0200

> For arcnet the bare minimum header only contains the 4 bytes to
> specify source, dest and offset (1, 1 and 2 bytes respectively).
> The corresponding struct is struct arc_hardware.
> 
> The struct archdr contains additionally a union of possible soft
> headers. When doing $insertusecasehere packets might well
> include short (or even no?) soft headers.
> 
> For this reason only use arc_hardware instead of archdr to
> determine the hard_header_len for an arcnet device.
> 
> Signed-off-by: Michael Grzeschik 

Applied.
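
To make the size argument in the quoted description concrete, here is a small
userspace sketch. The demo_* types are hypothetical: the hard header follows
the 1 + 1 + 2 byte layout given above, while the soft-header union members
are invented sizes chosen only to show that the combined struct is larger
than the bare hardware header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bare hardware header: source, dest and a 2-byte offset (4 bytes). */
struct demo_arc_hardware {
    uint8_t source;
    uint8_t dest;
    uint8_t offset[2];
};

/* Hypothetical soft headers of different lengths. */
union demo_arc_soft {
    uint8_t short_hdr[4];
    uint8_t long_hdr[8];
};

/* Hard header plus the largest possible soft header. */
struct demo_archdr {
    struct demo_arc_hardware hard;
    union demo_arc_soft soft;
};

static_assert(sizeof(struct demo_arc_hardware) == 4,
              "hardware header is expected to be 4 bytes");

/* Minimum header every packet carries. */
size_t demo_hard_header_len(void)
{
    return sizeof(struct demo_arc_hardware);
}

/* Upper bound that includes the largest soft header. */
size_t demo_full_header_len(void)
{
    return sizeof(struct demo_archdr);
}
```

Using the larger value as the minimum would reject packets that carry a short
soft header, or none at all, which is the case the patch is concerned with.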


Re: pull request: bluetooth 2015-09-17

2015-09-17 Thread David Miller
From: Johan Hedberg 
Date: Thu, 17 Sep 2015 15:27:17 +0300

> Here's one important patch for the 4.3-rc series that fixes an issue
> with Bluetooth LE encryption failing because of a too early check for
> the SMP context.
> 
> Please let me know if there are any issues pulling. Thanks.

Pulled, thanks.


Re: [PATCH net-next 0/2] net: bcmgenet: Interrupt coalescing

2015-09-17 Thread David Miller
From: Florian Fainelli 
Date: Wed, 16 Sep 2015 16:47:38 -0700

> This patch series adds support for interrupt coalescing for GENET
> adapters.

Series applied, thanks.


Re: [PATCH net 1/6] lan78xx: Check device ready bit (PMT_CTL_READY_) after reset the PHY

2015-09-17 Thread David Miller

All 6 patches applied, thanks.

But in the future, please provide an initial "[PATCH net 0/6]" posting which
gives a top-level overview of what your patch series is doing.


Re: [PATCH v2] atm: deal with setting entry before mkip was called

2015-09-17 Thread David Miller
From: Sasha Levin 
Date: Wed, 16 Sep 2015 15:30:21 -0400

> If we didn't call ATMARP_MKIP before ATMARP_ENCAP the VCC descriptor is
> nonexistent and we'll end up dereferencing a NULL ptr:
 ...
> Signed-off-by: Sasha Levin 

Applied, thanks.


Re: xfrm4_garbage_collect reaching limit

2015-09-17 Thread Dan Streetman
On Wed, Sep 16, 2015 at 4:45 AM, Steffen Klassert
 wrote:
> On Mon, Sep 14, 2015 at 11:14:59PM -0400, Dan Streetman wrote:
>> On Fri, Sep 11, 2015 at 5:48 AM, Steffen Klassert
>>  wrote:
>> >
>> >> Possibly the
>> >> default value of xfrm4_gc_thresh could be set proportional to
>> >> num_online_cpus(), but that doesn't help when cpus are onlined after
>> >> boot.
>> >
>> > This could be an option, we could change the xfrm4_gc_thresh value with
>> > a cpu notifier callback if more cpus come up after boot.
>>
>> the issue there is, if the value is changed by the user, does a cpu
>> hotplug reset it back to default...
>
> What about the patch below? With this we are independent of the number
> of cpus. It should cover most, if not all usecases.

yep that works, thanks!  I'll give it a test also, but I don't see how
it would fail.

>
> While we are at it, we could think about increasing the flowcache
> percpu limit. This value was chosen back in 2003, so maybe we could
> have more than 4k cache entries per cpu these days.
>
>
> Subject: [PATCH RFC] xfrm: Let the flowcache handle its size by default.
>
> The xfrm flowcache size is limited by the flowcache limit
> (4096 * number of online cpus) and the xfrm garbage collector
> threshold (2 * 32768), whatever is reached first. This means
> that we can hit the garbage collector limit only on systems
> with more than 16 cpus. On such systems we simply refuse
> new allocations if we reach the limit, so new flows are dropped.
> On systems with 16 or fewer cpus, we hit the flowcache limit.
> In this case, we shrink the flow cache instead of refusing new
> flows.
>
> We increase the xfrm garbage collector threshold to INT_MAX
> to get the same behaviour, independent of the number of cpus.
>
> The xfrm garbage collector threshold can still be set below
> the flowcache limit to reduce the memory usage of the flowcache.
>
> Signed-off-by: Steffen Klassert 
> ---
>  Documentation/networking/ip-sysctl.txt | 6 --
>  net/ipv4/xfrm4_policy.c| 2 +-
>  net/ipv6/xfrm6_policy.c| 2 +-
>  3 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index ebe94f2..260f30b 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -1199,7 +1199,8 @@ tag - INTEGER
>  xfrm4_gc_thresh - INTEGER
> The threshold at which we will start garbage collecting for IPv4
> destination cache entries.  At twice this value the system will
> -   refuse new allocations.
> +   refuse new allocations. The value must be set below the flowcache
> +   limit (4096 * number of online cpus) to take effect.
>
>  igmp_link_local_mcast_reports - BOOLEAN
> Enable IGMP reports for link local multicast groups in the
> @@ -1645,7 +1646,8 @@ ratelimit - INTEGER
>  xfrm6_gc_thresh - INTEGER
> The threshold at which we will start garbage collecting for IPv6
> destination cache entries.  At twice this value the system will
> -   refuse new allocations.
> +   refuse new allocations. The value must be set below the flowcache
> +   limit (4096 * number of online cpus) to take effect.
>
>
>  IPv6 Update by:
> diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
> index 1e06c4f..3dffc73 100644
> --- a/net/ipv4/xfrm4_policy.c
> +++ b/net/ipv4/xfrm4_policy.c
> @@ -248,7 +248,7 @@ static struct dst_ops xfrm4_dst_ops = {
> .destroy =  xfrm4_dst_destroy,
> .ifdown =   xfrm4_dst_ifdown,
> .local_out =__ip_local_out,
> -   .gc_thresh =32768,
> +   .gc_thresh =INT_MAX,
>  };
>
>  static struct xfrm_policy_afinfo xfrm4_policy_afinfo = {
> diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
> index f10b940..e9af39a 100644
> --- a/net/ipv6/xfrm6_policy.c
> +++ b/net/ipv6/xfrm6_policy.c
> @@ -289,7 +289,7 @@ static struct dst_ops xfrm6_dst_ops = {
> .destroy =  xfrm6_dst_destroy,
> .ifdown =   xfrm6_dst_ifdown,
> .local_out =__ip6_local_out,
> -   .gc_thresh =32768,
> +   .gc_thresh =INT_MAX,
>  };
>
>  static struct xfrm_policy_afinfo xfrm6_policy_afinfo = {
> --
> 1.9.1
>


Re: xfrm4_garbage_collect reaching limit

2015-09-17 Thread Steffen Klassert
On Thu, Sep 17, 2015 at 09:23:35PM -0700, David Miller wrote:
> From: Steffen Klassert 
> Date: Wed, 16 Sep 2015 10:45:41 +0200
> 
> > index 1e06c4f..3dffc73 100644
> > --- a/net/ipv4/xfrm4_policy.c
> > +++ b/net/ipv4/xfrm4_policy.c
> > @@ -248,7 +248,7 @@ static struct dst_ops xfrm4_dst_ops = {
> > .destroy =  xfrm4_dst_destroy,
> > .ifdown =   xfrm4_dst_ifdown,
> > .local_out =__ip_local_out,
> > -   .gc_thresh =32768,
> > +   .gc_thresh =INT_MAX,
> >  };
> >  
> >  static struct xfrm_policy_afinfo xfrm4_policy_afinfo = {
> 
> This means the dst_ops->gc() for xfrm will never be invoked.
> 
> Is that intentional?

Yes. This is already the case on systems with less than 8 cpus
because the flowcache is limited to 4096 entries per cpu. The
percpu flowcache shrinks itself to 'low_watermark' entries if
it hits the percpu limit.
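
The two cpu counts quoted in this thread (8 here, 16 in the RFC commit message) both follow from the same numbers. A minimal sketch of that arithmetic in plain userspace C, assuming the pre-patch defaults of 4096 flowcache entries per cpu and gc_thresh = 32768, and assuming a threshold only matters once the entry count strictly exceeds it. The function and macro names are made up for illustration and are not kernel APIs:

```c
#include <assert.h>

/* Values quoted in the thread; the names are illustrative only. */
#define FLOWCACHE_PER_CPU 4096L   /* flowcache limit per online cpu */
#define OLD_GC_THRESH     32768L  /* pre-patch xfrm gc_thresh default */

/* Total flowcache limit for a given number of online cpus. */
static long flowcache_limit(long cpus)
{
	assert(cpus > 0);
	return FLOWCACHE_PER_CPU * cpus;
}

/* Can the entry count get past gc_thresh, so that garbage collection starts? */
static int gc_can_start(long cpus)
{
	return flowcache_limit(cpus) > OLD_GC_THRESH;
}

/* Can the entry count get past 2 * gc_thresh, where allocations are refused? */
static int can_refuse_allocations(long cpus)
{
	return flowcache_limit(cpus) > 2 * OLD_GC_THRESH;
}
```

Under these assumptions the flowcache limit is reached first on small machines, and the "refuse new allocations" point only becomes reachable above 16 cpus, which is the asymmetry the RFC patch removes by raising gc_thresh to INT_MAX.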


Re: [PATCH net] ipv6: ip6_fragment: fix headroom tests and skb leak

2015-09-17 Thread David Miller
From: Florian Westphal 
Date: Wed, 16 Sep 2015 17:26:14 +0200

> David Woodhouse reports skb_under_panic when we try to push ethernet
> header to fragmented ipv6 skbs:
> 
>  skbuff: skb_under_panic: text:c1277f1e len:1294 put:14 head:dec98000
>  data:dec97ffc tail:0xdec9850a end:0xdec98f40 dev:br-lan
> [..]
> ip6_finish_output2+0x196/0x4da
> 
> David further debugged this:
>   [..] offending fragments were arriving here with skb_headroom(skb)==10.
>   Which is reasonable, being the Solos ADSL card's header of 8 bytes
>   followed by 2 bytes of PPP frame type.
> 
> The problem is that if netfilter ipv6 defragmentation is used, skb_cow()
> in ip6_forward will only see reassembled skb.
> 
> Therefore, headroom is overestimated by 8 bytes (we pulled fragment
> header) and we don't check the skbs in the frag_list either.
> 
> We can't do these checks in netfilter defrag since outdev isn't known yet.
> 
> Furthermore, existing tests in ip6_fragment did not consider the fragment
> or ipv6 header size when checking headroom of the fraglist skbs.
> 
> While at it, also fix a skb leak on memory allocation -- ip6_fragment
> must consume the skb.
> 
> I tested this with an e1000 driver hacked to not allocate additional headroom
> (we end up in slowpath, since LL_RESERVED_SPACE is 16).
> 
> If 2 bytes of headroom are allocated, fastpath is taken (14 byte
> ethernet header was pulled, so 16 byte headroom available in all
> fragments).
> 
> Reported-by: David Woodhouse 
> Diagnosed-by: David Woodhouse 
> Signed-off-by: Florian Westphal 

Applied, thanks.
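
The numbers in the quoted panic line are consistent with the diagnosis: with 10 bytes of headroom and a 14 byte ethernet header pushed, the data pointer ends up 4 bytes below head. A minimal userspace sketch of that pointer arithmetic, using only the values from the report; the helper names are hypothetical and this is not the kernel's skb_push():

```c
#include <assert.h>
#include <stdint.h>

/* Data pointer after pushing `len` bytes of header in front of a buffer
 * that starts at `head` and currently has `headroom` bytes free. */
static uint32_t data_after_push(uint32_t head, uint32_t headroom, uint32_t len)
{
	return head + headroom - len;
}

/* Nonzero when the push would move data below head (the under-panic case). */
static int push_underflows(uint32_t headroom, uint32_t len)
{
	return len > headroom;
}

/* Reproduces the report: head:dec98000, headroom 10, put:14, data:dec97ffc. */
static void report_example(void)
{
	assert(push_underflows(10, 14));
	assert(data_after_push(0xdec98000u, 10, 14) == 0xdec97ffcu);
}
```

With the 16 bytes of headroom mentioned for the fastpath test, the same push stays inside the buffer.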


Re: [PATCH net-next v2] net: Initialize table in fib result

2015-09-17 Thread David Miller
From: David Ahern 
Date: Wed, 16 Sep 2015 10:16:39 -0600

> Sergey, Richard and Fabio reported an oops in ip_route_input_noref. e.g., 
> from Richard:
 ...
> The root cause is use of res.table uninitialized.
> 
> Thanks to Nikolay for noticing the uninitialized use amongst the maze of
> gotos.
> 
> As Nikolay pointed out the second initialization is not required to fix
> the oops, but rather to fix a related problem where a valid lookup should
> be invalidated before creating the rth entry.
> 
> Fixes: b7503e0cdb5d ("net: Add FIB table id to rtable")
> Reported-by: Sergey Senozhatsky 
> Reported-by: Richard Alpe 
> Reported-by: Fabio Estevam 
> Tested-by: Fabio Estevam 
> Signed-off-by: David Ahern 
> ---
> v2:
> - clarification in the commit message regarding the second initialization

Applied, thanks.


Re: [PATCH v2] solos-pci: Increase headroom on received packets

2015-09-17 Thread David Miller
From: David Woodhouse 
Date: Wed, 16 Sep 2015 12:35:00 +0100

> A comment in include/linux/skbuff.h says that:
> 
>  * Various parts of the networking layer expect at least 32 bytes of
>  * headroom, you should not reduce this.
> 
> This was demonstrated by a panic when handling fragmented IPv6 packets:
> http://marc.info/?l=linux-netdev&m=144236093519172&w=2
> 
> It's not entirely clear if that comment is still valid ― and if it is,
> perhaps netif_rx() ought to be enforcing it with a warning.
> 
> But either way, it is rather stupid from a performance point of view
> for us to be receiving packets into a buffer which doesn't have enough
> room to prepend an Ethernet header ― it means that *every* incoming
> packet is going to need to be reallocated. So let's fix that.
> 
> Signed-off-by: David Woodhouse 

Applied, thanks David.


Re: [RESEND PATCH] net: ks8851: Export OF module alias information

2015-09-17 Thread David Miller
From: Javier Martinez Canillas 
Date: Wed, 16 Sep 2015 11:11:22 +0200

> Drivers need to export the OF id table, and this needs to be built into
> the module, or udev won't have the necessary information to autoload
> the driver module when the device is registered via OF.
> 
> Signed-off-by: Javier Martinez Canillas 

Applied.


Re: xfrm4_garbage_collect reaching limit

2015-09-17 Thread David Miller
From: Steffen Klassert 
Date: Wed, 16 Sep 2015 10:45:41 +0200

> index 1e06c4f..3dffc73 100644
> --- a/net/ipv4/xfrm4_policy.c
> +++ b/net/ipv4/xfrm4_policy.c
> @@ -248,7 +248,7 @@ static struct dst_ops xfrm4_dst_ops = {
>   .destroy =  xfrm4_dst_destroy,
>   .ifdown =   xfrm4_dst_ifdown,
>   .local_out =__ip_local_out,
> - .gc_thresh =32768,
> + .gc_thresh =INT_MAX,
>  };
>  
>  static struct xfrm_policy_afinfo xfrm4_policy_afinfo = {

This means the dst_ops->gc() for xfrm will never be invoked.

Is that intentional?


Re: [PATCH] ip: find correct route for socket which is not bound to a device

2015-09-17 Thread David Miller
From: Wengang Wang 
Date: Wed, 16 Sep 2015 14:34:15 +0800

> For multicast, we should also find a valid route (and thus get a meaningful
> pmtu) for a packet on a socket which is not bound to a device
> (sk_bound_dev_if being 0).

Your patch breaks exactly the situation explained in full detail
in the huge comment about the first change you are making:

/* Special hack: user can direct multicasts
   and limited broadcast via necessary interface
   without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
   This hack is not just for fun, it allows
   vic,vat and friends to work.
   They bind socket to loopback, set ttl to zero
   and expect that it will work.
   From the viewpoint of routing cache they are broken,
   because we are not allowed to build multicast path
   with loopback source addr (look, routing cache
   cannot know, that ttl is zero, so that packet
   will not leave this host and route is valid).
   Luckily, this hack is good workaround.
 */

That situation will now fail after your patch.

So I cannot apply this patch, sorry.

I know what you want, you want to end up with a cached route that
will track the PMTU.  But this hack will break other things at
the same time so is not acceptable.


Re: [PATCH v2 net-next 0/2] bpf: performance improvements

2015-09-17 Thread David Miller
From: Alexei Starovoitov 
Date: Tue, 15 Sep 2015 23:05:41 -0700

> v1->v2: dropped redundant iff_up check in patch 2
> 
> At plumbers we discussed different options on how to get rid of skb_clone
> from bpf_clone_redirect(), the patch 2 implements the best option.
> Patch 1 adds 'integrated exts' to cls_bpf to improve performance by
> combining simple actions into bpf classifier.

Series applied, thanks.


Re: [PATCH v2 net] net/mlx4_en: really allow to change RSS key

2015-09-17 Thread David Miller
From: Or Gerlitz 
Date: Wed, 16 Sep 2015 14:05:25 +0300

> On Wed, Sep 16, 2015 at 4:29 AM, Eric Dumazet  wrote:
>> From: Eric Dumazet 
>>
>> When changing rss key, we do not want to overwrite user provided key
>> by the one provided by netdev_rss_key_fill(), which is the host random
>> key generated at boot time.
>>
>> Fixes: 947cbb0ac242 ("net/mlx4_en: Support for configurable RSS hash function")
>> Signed-off-by: Eric Dumazet 
>> Cc: Eyal Perry 
>> CC: Amir Vadai 
> 
> Acked-by: Or Gerlitz 
> 
> Dave, can you please push it to -stable of >= 3.19 ?

Applied and queued up for -stable, thanks.


Re: [PATCH V2 net-next] net: only check perm protocol when register proto

2015-09-17 Thread David Miller
From: martinbj2...@gmail.com
Date: Fri, 18 Sep 2015 00:00:05 -0400

> From: Junwei Zhang 
> 
> The permanent protocol nodes are at the head of the list,
> so we only need to check those nodes.
> 
> Whether or not the new node is permanent,
> insert it after the last permanent protocol node.
> 
> If the new node conflicts with an existing permanent node,
> return an error.
> 
> Signed-off-by: Martin Zhang 
> ---
> V2: Fix indentation
> rewrite statement

Applied, thanks.


Re: [PATCH net-next 1/2] tcp: provide skb->hash to synack packets

2015-09-17 Thread David Miller
From: Eric Dumazet 
Date: Tue, 15 Sep 2015 15:24:20 -0700

> From: Eric Dumazet 
> 
> In commit b73c3d0e4f0e ("net: Save TX flow hash in sock and set in skbuf
> on xmit"), Tom provided a l4 hash to most outgoing TCP packets.
> 
> We'd like to provide one as well for SYNACK packets, so that all packets
> of a given flow share same txhash, to later enable bonding driver to
> also use skb->hash to perform slave selection.
> 
> Note that a SYNACK retransmit shuffles the tx hash, as Tom did
> in commit 265f94ff54d62 ("net: Recompute sk_txhash on negative routing
> advice") for established sockets.
> 
> This has nice effect making TCP flows resilient to some kind of black
> holes, even at connection establish phase.
> 
> Signed-off-by: Eric Dumazet 

Applied.


[PATCH V2 net-next] net: only check perm protocol when register proto

2015-09-17 Thread martinbj2008
From: Junwei Zhang 

The permanent protocol nodes are at the head of the list,
so we only need to check those nodes.

Whether or not the new node is permanent,
insert it after the last permanent protocol node.

If the new node conflicts with an existing permanent node,
return an error.

Signed-off-by: Martin Zhang 
---
V2: Fix indentation
rewrite statement
 
 net/ipv4/af_inet.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 1d0c3ad..8a55664 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1043,22 +1043,16 @@ void inet_register_protosw(struct inet_protosw *p)
goto out_illegal;
 
/* If we are trying to override a permanent protocol, bail. */
-   answer = NULL;
last_perm = &inetsw[p->type];
list_for_each(lh, &inetsw[p->type]) {
answer = list_entry(lh, struct inet_protosw, list);
-
/* Check only the non-wild match. */
-   if (INET_PROTOSW_PERMANENT & answer->flags) {
-   if (protocol == answer->protocol)
-   break;
-   last_perm = lh;
-   }
-
-   answer = NULL;
+   if ((INET_PROTOSW_PERMANENT & answer->flags) == 0)
+   break;
+   if (protocol == answer->protocol)
+   goto out_permanent;
+   last_perm = lh;
}
-   if (answer)
-   goto out_permanent;
 
/* Add the new entry after the last permanent entry if any, so that
 * the new entry does not override a permanent entry when matched with
-- 
2.1.4
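
The loop in the patch relies on one invariant: permanent entries always sit at the head of the list, so the scan can stop at the first non-permanent entry. A minimal userspace sketch of that idea on a singly linked list, assuming nothing about the kernel's list_head or RCU helpers; struct entry, PERMANENT and register_entry() are hypothetical stand-ins, not the code in af_inet.c:

```c
#include <assert.h>
#include <stddef.h>

#define PERMANENT 0x1  /* stand-in for INET_PROTOSW_PERMANENT */

struct entry {
	int protocol;
	unsigned int flags;
	struct entry *next;
};

/* Insert `e` after the last permanent entry of the list at *head.
 * Returns 0 on success, -1 if `e` collides with a permanent entry. */
static int register_entry(struct entry **head, struct entry *e)
{
	struct entry **link = head;  /* where the new entry gets linked in */
	struct entry *cur;

	assert(head && e);
	for (cur = *head; cur; cur = cur->next) {
		if (!(cur->flags & PERMANENT))
			break;       /* permanent entries are all at the head */
		if (cur->protocol == e->protocol)
			return -1;   /* would override a permanent protocol */
		link = &cur->next;
	}
	e->next = *link;
	*link = e;
	return 0;
}
```

Because every insertion lands directly after the last permanent entry, a newly added permanent entry still precedes all non-permanent ones, which keeps the invariant that the early break depends on.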



Re: [PATCH net-next 2/2] bonding: use l4 hash if available

2015-09-17 Thread David Miller
From: Eric Dumazet 
Date: Tue, 15 Sep 2015 15:24:28 -0700

> + if (bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP34 &&
> + skb->l4_hash)
> + return skb->hash;

Applied, with the indentation of the return statement fixed up.
:-)



Re: [PATCH net 4/5] be2net: allow offloading with the same port for IPv4 and IPv6

2015-09-17 Thread Sathya Perla
On Thu, Sep 17, 2015 at 7:41 PM, Jiri Benc  wrote:
> The callback for adding vxlan port can be called with the same port for both
> IPv4 and IPv6. Do not disable the offloading if this occurs.
>
> Signed-off-by: Jiri Benc 
> ---
>  drivers/net/ethernet/emulex/benet/be.h  |  1 +
>  drivers/net/ethernet/emulex/benet/be_main.c | 10 ++
>  2 files changed, 11 insertions(+)

Acked-by: Sathya Perla 

Thanks!


Re: [PATCH net] netlink: make sure -EBUSY won't escape from netlink_insert

2015-09-17 Thread Herbert Xu
On Thu, Sep 17, 2015 at 11:47:12AM -0700, Linus Torvalds wrote:
> On Wed, Sep 16, 2015 at 10:41 PM, Christoph Paasch
>  wrote:
> >
> > can this patch get queued up for 4.1 as well?
> > It seems to fix a similar issue in 4.1.6.
> 
> I think Herbert has an additional patch for this issue. But yes, I
> think should be scheduled for stable. Herbert?

Yes this should be safe for stable, even though the real cause of
the problem is probably the one that Tejun spotted.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared

2015-09-17 Thread David Woodhouse
On Fri, 2015-09-18 at 01:44 +0200, Francois Romieu wrote:
> The TxDmaOkLowDesc register may tell if the Tx dma part is still 
> making any progress. I have added a TxPoll request. See below.

I've just added that into the original TX timeout handler, since that
doesn't seem to be crashing the box for me as long as I avoid the IRQ
storm.

Not sure what we learn from it ('desc 6550' printed as hex)... I've
also made it dump the TX descriptor ring (skb, addr, opts1, opts2):

[ 1733.027156] 8139cp :00:0b.0 eth1: Transmit timeout, status  c   2b head 25 tail 22 desc 65500 80ff
[ 1733.036819] TX ring 00:   (null) 1dd68d8c 3000cb67 0 
[ 1733.037040] TX ring 01:   (null) 1dd8774c 3000cb67 0 
[ 1733.037040] TX ring 02:   (null) 1dd86e7c 3a08 0 
[ 1733.037040] TX ring 03:   (null) 1dd865ac 3a08 0 
[ 1733.037040] TX ring 04:   (null) 1dd8540c 3a08 0 
[ 1733.037040] TX ring 05:   (null) 1dd85cdc 3a08 0 
[ 1733.037040] TX ring 06:   (null) 1dd84b3c 3a08 0 
[ 1733.037040] TX ring 07:   (null) 1dd8399c 3a08 0 
[ 1733.037040] TX ring 08:   (null) 1dd8426c 3a08 0 
[ 1733.037040] TX ring 09:   (null) 1dd830cc 3a08 0 
[ 1733.037040] TX ring 10:   (null) 1dd81f2c 3a08 0 
[ 1733.037040] TX ring 11:   (null) 1dd8165c 3a08 0 
[ 1733.037040] TX ring 12:   (null) 1dd827fc 3a08 0 
[ 1733.037040] TX ring 13:   (null) 1dd80d8c 3a08 0 
[ 1733.037040] TX ring 14:   (null) 1dd804bc 3a08 0 
[ 1733.037040] TX ring 15:   (null) 1dd6ee7c 3a08 0 
[ 1733.037040] TX ring 16:   (null) 1dd6f74c 3a08 0 
[ 1733.037040] TX ring 17:   (null) 1dd6e5ac 3a08 0 
[ 1733.037040] TX ring 18:   (null) 1dd6dcdc 3a08 0 
[ 1733.037040] TX ring 19:   (null) 1dd6cb3c 3000f804 0 
[ 1733.037040] TX ring 20:   (null) 1dd8de02 300083da 0 
[ 1733.037040] TX ring 21:   (null) 1dd8da02 3000 0 
[ 1733.037040] TX ring 22: defcc240 1dd6d40c b5ea 0 
[ 1733.037040] TX ring 23: decdb900 1dd6b99c b046 0 
[ 1733.037040] TX ring 24: ddd27c00 1dd6b0cc b5ea 0 
[ 1733.037040] TX ring 25:   (null) 1dd6e5ac 3000ca24 0 
[ 1733.037040] TX ring 26:   (null) 1dd6dcdc 3000ca24 0 
[ 1733.037040] TX ring 27:   (null) 1dd6d40c 3000ca24 0 
[ 1733.037040] TX ring 28:   (null) 1dd6cb3c 3000ca24 0 
[ 1733.037040] TX ring 29:   (null) 1dd6c26c 3000ca24 0 
[ 1733.037040] TX ring 30:   (null) 1dd6b99c 3000ca24 0 
[ 1733.037040] TX ring 31:   (null) 1dd6b0cc 3000ca24 0 
[ 1733.037040] TX ring 32:   (null) 1dd6a7fc 3000ca24 0 
[ 1733.037040] TX ring 33:   (null) 1dd69f2c 3000ca24 0 
[ 1733.037040] TX ring 34:   (null) 1dd68d8c 3000ca24 0 
[ 1733.037040] TX ring 35:   (null) 1dd6965c 3000cb67 0 
[ 1733.037040] TX ring 36:   (null) 1dd684bc 3000cb67 0 
[ 1733.037040] TX ring 37:   (null) 1dd86e7c 3000cb67 0 
[ 1733.037040] TX ring 38:   (null) 1dd8774c 3000cb67 0 
[ 1733.037040] TX ring 39:   (null) 1dd865ac 3000cb67 0 
[ 1733.037040] TX ring 40:   (null) 1dd85cdc 3000cb67 0 
[ 1733.037040] TX ring 41:   (null) 1dd8540c 3000cb67 0 
[ 1733.037040] TX ring 42:   (null) 1dd84b3c 3000cb67 0 
[ 1733.037040] TX ring 43:   (null) 1dd8426c 3000cb67 0 
[ 1733.037040] TX ring 44:   (null) 1dd8399c 3000cb67 0 
[ 1733.037040] TX ring 45:   (null) 1dd830cc 3000cb67 0 
[ 1733.037040] TX ring 46:   (null) 1dd81f2c 3000cb67 0 
[ 1733.037040] TX ring 47:   (null) 1dd827fc 3000cb67 0 
[ 1733.037040] TX ring 48:   (null) 1dd8165c 3000cb67 0 
[ 1733.037040] TX ring 49:   (null) 1dd80d8c 3000cb67 0 
[ 1733.037040] TX ring 50:   (null) 1dd804bc 3000cb67 0 
[ 1733.037040] TX ring 51:   (null) 1dd6f74c 3000cb67 0 
[ 1733.037040] TX ring 52:   (null) 1dd6ee7c 3000cb67 0 
[ 1733.037040] TX ring 53:   (null) 1dd6dcdc 3000cb67 0 
[ 1733.037040] TX ring 54:   (null) 1

Re: [net-next 0/8][pull request] Intel Wired LAN Driver Updates 2015-09-17

2015-09-17 Thread Jeff Kirsher
On Thu, 2015-09-17 at 17:17 -0700, Jeff Kirsher wrote:
> This series contains updates to i40e and i40evf.
> 
> Shannon provides updates to i40e and i40evf to resolve an issue with
> the
> nvmupdate utility.  First renames a variable name to reduce confusion
> and
> to differentiate it from the actual user variable.  Then added the
> ability
> to save the admin queue write back descriptor if a caller supplies a
> buffer for it to be saved into.  Added a new GetStatus command so
> that
> the NVM update tool can query the current status instead of doing
> fake
> write requests to probe for readiness.  Added wait states to the NVM
> update state machine to signify when waiting for an update operation
> to
> finish, whether we are in the middle of a set of write operations, or
> we
> are now idle but waiting.  Then added a facility to run admin queue
> commands through the NVM update utility in order to allow the update
> tools to interact with the firmware and do special commands needed
> for
> updates and configuration changes.  Also added a facility to recover
> the
> result of a previously run admin queue command.
> 
> ---
> Waiting for Dave to update his net-next tree before pushing this
> series
> to my next-queue tree.
> 
> The following are changes since commit
> 9adbac599a71bc25a2617850ffcaa4388dc5c20d:
>   fm10k: fix iov_msg_mac_vlan_pf VID checks
> and are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
> master

This series has been pushed up to my tree now.



Re: [PATCH] geneve: restore vlan bits in xmit path

2015-09-17 Thread Jesse Gross
On Thu, Sep 17, 2015 at 1:15 PM, John W. Linville
 wrote:
> On Thu, Sep 17, 2015 at 12:48:56PM -0700, Jesse Gross wrote:
>> On Thu, Sep 17, 2015 at 12:25 PM, John W. Linville
>>  wrote:
>> > On Thu, Sep 17, 2015 at 11:45:58AM -0700, Pravin Shelar wrote:
>> >> On Thu, Sep 17, 2015 at 10:18 AM, John W. Linville
>> >>  wrote:
>> >> > These seem to have been accidentally dropped in commit 371bd1061d29
>> >> > ("geneve: Consolidate Geneve functionality in single module.").
>> >> >
>> >> Geneve should not export vxlan feature. So that it never sees vxlan
>> >> tagged packets. Can you turn off the vlan feature?
>> >
>> > I'm not sure I understand...?  This is vlan, not vxlan.
>>
>> I think he just means vlan. If you remove the line where
>> dev->vlan_features are set then the core stack will handle this and we
>> don't need to do anything special here.
>
> Is that preferable to this patch?  Tunneling vlan-tagged frames
> seems weird, but I would hate to disallow it if some crazy person
> wanted to do that...

This doesn't actually prevent sending VLAN tagged frames inside the
tunnel. It just means that the stack won't send VLAN accelerated
frames to the driver and will put them directly in the frame instead.
That's the same as what this code here is emulating so there's no
difference, just code cleanliness.


Re: [PATCH next 0/30] Passing net through the netfilter hooks

2015-09-17 Thread David Miller
From: ebied...@xmission.com (Eric W. Biederman)
Date: Tue, 15 Sep 2015 19:59:49 -0500

> Pablo, Dave I don't know whose tree this makes more sense to go
> through.  I am assuming at least initially Pablos as netfilter is
> involved.  From what I have seen there will be a lot of back and forth
> between the netfilter code paths and the routing code paths.

I think it might reduce conflicts actually if it went via my net-next
tree.

Pablo, any objections?


[net-next 5/8] i40e/i40evf: add wait states to NVM state machine

2015-09-17 Thread Jeff Kirsher
From: Shannon Nelson 

This adds wait states to the NVM update state machine to signify when
waiting for an update operation to finish, whether we're in the middle
of a set of Write operations, or we're now idle but waiting.

Change-ID: Iabe91d6579ef6a2ea560647e374035656211ab43
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_adminq.c | 13 
 drivers/net/ethernet/intel/i40e/i40e_nvm.c| 48 ---
 drivers/net/ethernet/intel/i40e/i40e_type.h   |  4 ++-
 drivers/net/ethernet/intel/i40evf/i40e_type.h |  4 ++-
 4 files changed, 55 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.c b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
index 8a77f59..ea1e930 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
@@ -1018,6 +1018,19 @@ clean_arq_element_out:
i40e_release_nvm(hw);
hw->aq.nvm_release_on_done = false;
}
+
+   switch (hw->nvmupd_state) {
+   case I40E_NVMUPD_STATE_INIT_WAIT:
+   hw->nvmupd_state = I40E_NVMUPD_STATE_INIT;
+   break;
+
+   case I40E_NVMUPD_STATE_WRITE_WAIT:
+   hw->nvmupd_state = I40E_NVMUPD_STATE_WRITING;
+   break;
+
+   default:
+   break;
+   }
}
 
return ret_code;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index 7ff3099..7bcbe59 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -696,6 +696,12 @@ i40e_status i40e_nvmupd_command(struct i40e_hw *hw,
status = i40e_nvmupd_state_writing(hw, cmd, bytes, perrno);
break;
 
+   case I40E_NVMUPD_STATE_INIT_WAIT:
+   case I40E_NVMUPD_STATE_WRITE_WAIT:
+   status = I40E_ERR_NOT_READY;
+   *perrno = -EBUSY;
+   break;
+
default:
/* invalid state, should never happen */
i40e_debug(hw, I40E_DEBUG_NVM,
@@ -759,10 +765,12 @@ static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
 hw->aq.asq_last_status);
} else {
status = i40e_nvmupd_nvm_erase(hw, cmd, perrno);
-   if (status)
+   if (status) {
i40e_release_nvm(hw);
-   else
+   } else {
hw->aq.nvm_release_on_done = true;
+   hw->nvmupd_state = I40E_NVMUPD_STATE_INIT_WAIT;
+   }
}
break;
 
@@ -773,10 +781,12 @@ static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
 hw->aq.asq_last_status);
} else {
status = i40e_nvmupd_nvm_write(hw, cmd, bytes, perrno);
-   if (status)
+   if (status) {
i40e_release_nvm(hw);
-   else
+   } else {
hw->aq.nvm_release_on_done = true;
+   hw->nvmupd_state = I40E_NVMUPD_STATE_INIT_WAIT;
+   }
}
break;
 
@@ -790,7 +800,7 @@ static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
if (status)
i40e_release_nvm(hw);
else
-   hw->nvmupd_state = I40E_NVMUPD_STATE_WRITING;
+   hw->nvmupd_state = I40E_NVMUPD_STATE_WRITE_WAIT;
}
break;
 
@@ -809,6 +819,7 @@ static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
i40e_release_nvm(hw);
} else {
hw->aq.nvm_release_on_done = true;
+   hw->nvmupd_state = I40E_NVMUPD_STATE_INIT_WAIT;
}
}
break;
@@ -838,7 +849,7 @@ static i40e_status i40e_nvmupd_state_reading(struct i40e_hw *hw,
 struct i40e_nvm_access *cmd,
 u8 *bytes, int *perrno)
 {
-   i40e_status status;
+   i40e_status status = 0;
enum i40e_nvmupd_cmd upd_cmd;
 
upd_cmd = i40e_nvmupd_validate_command(hw, cmd, perrno);
@@ -880,7 +891,7 @@ static i40e_status i40e_nvmupd_state_writing(struct i40e_hw *hw,
 struct i40e_nvm_access *cmd,
 u8 *by
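
The wait states visible in the hunks above can be summarised as a small state machine: a successful erase or write moves the state into a *_WAIT state, new commands are rejected with -EBUSY while waiting, and the completion event moves the state back. A minimal userspace sketch under those assumptions; the enum and function names are simplified stand-ins for the I40E_NVMUPD_STATE_* values and are not the driver's code:

```c
#include <assert.h>
#include <errno.h>

enum nvmupd_state {          /* mirrors the I40E_NVMUPD_STATE_* names */
	STATE_INIT,
	STATE_WRITING,
	STATE_INIT_WAIT,
	STATE_WRITE_WAIT,
};

/* State after the pending update operation completes. */
static enum nvmupd_state on_complete(enum nvmupd_state s)
{
	switch (s) {
	case STATE_INIT_WAIT:
		return STATE_INIT;
	case STATE_WRITE_WAIT:
		return STATE_WRITING;
	default:
		return s;    /* nothing pending, state is unchanged */
	}
}

/* Returns 0 if a new command may be issued, -EBUSY while waiting. */
static int may_issue_command(enum nvmupd_state s)
{
	return (s == STATE_INIT_WAIT || s == STATE_WRITE_WAIT) ? -EBUSY : 0;
}

/* One write sequence: start, wait for completion, continue writing. */
static void walk_example(void)
{
	enum nvmupd_state s = STATE_WRITE_WAIT;  /* write just started */

	assert(may_issue_command(s) == -EBUSY);
	s = on_complete(s);
	assert(s == STATE_WRITING);
	assert(may_issue_command(s) == 0);
}
```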

[net-next 2/8] i40e/i40evf: save aq writeback for future inspection

2015-09-17 Thread Jeff Kirsher
From: Shannon Nelson 

Add the ability to save the AdminQ write back descriptor if a
caller supplies a buffer for it to be saved into.

Change-ID: I3d1301d26360b39a2d66dc8569e851f54133a3af
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_adminq.c   | 4 
 drivers/net/ethernet/intel/i40e/i40e_adminq.h   | 1 +
 drivers/net/ethernet/intel/i40evf/i40e_adminq.c | 4 
 drivers/net/ethernet/intel/i40evf/i40e_adminq.h | 1 +
 4 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.c b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
index 3e0d200..8a77f59 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
@@ -889,6 +889,10 @@ i40e_status i40e_asq_send_command(struct i40e_hw *hw,
   "AQTX: desc and buffer writeback:\n");
i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc, buff, buff_size);
 
+   /* save writeback aq if requested */
+   if (details->wb_desc)
+   *details->wb_desc = *desc_on_ring;
+
/* update the error if time out occurred */
if ((!cmd_completed) &&
(!details->async && !details->postpone)) {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.h b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
index 28e519a..b67b34c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
@@ -69,6 +69,7 @@ struct i40e_asq_cmd_details {
u16 flags_dis;
bool async;
bool postpone;
+   struct i40e_aq_desc *wb_desc;
 };
 
 #define I40E_ADMINQ_DETAILS(R, i)   \
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq.c b/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
index f08450b..15c8ac8 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
@@ -830,6 +830,10 @@ i40e_status i40evf_asq_send_command(struct i40e_hw *hw,
i40evf_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc, buff,
buff_size);
 
+   /* save writeback aq if requested */
+   if (details->wb_desc)
+   *details->wb_desc = *desc_on_ring;
+
/* update the error if time out occurred */
if ((!cmd_completed) &&
(!details->async && !details->postpone)) {
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq.h b/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
index ef43d68..547b79b 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
@@ -69,6 +69,7 @@ struct i40e_asq_cmd_details {
u16 flags_dis;
bool async;
bool postpone;
+   struct i40e_aq_desc *wb_desc;
 };
 
 #define I40E_ADMINQ_DETAILS(R, i)   \
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[net-next 7/8] i40e/i40evf: add get AQ result command to nvmupdate utility

2015-09-17 Thread Jeff Kirsher
From: Shannon Nelson 

Add a facility to recover the result of a previously run AQ command.

Change-ID: I21afec2c20c1a5e6ba60c7fbfcbedfff78c10e45
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_nvm.c| 79 +++
 drivers/net/ethernet/intel/i40e/i40e_type.h   |  1 +
 drivers/net/ethernet/intel/i40evf/i40e_type.h |  1 +
 3 files changed, 81 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index 50d4aa3..d0288ad 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -618,6 +618,9 @@ static i40e_status i40e_nvmupd_nvm_read(struct i40e_hw *hw,
 static i40e_status i40e_nvmupd_exec_aq(struct i40e_hw *hw,
   struct i40e_nvm_access *cmd,
   u8 *bytes, int *perrno);
+static i40e_status i40e_nvmupd_get_aq_result(struct i40e_hw *hw,
+struct i40e_nvm_access *cmd,
+u8 *bytes, int *perrno);
 static inline u8 i40e_nvmupd_get_module(u32 val)
 {
return (u8)(val & I40E_NVM_MOD_PNT_MASK);
@@ -643,6 +646,7 @@ static char *i40e_nvm_update_state_str[] = {
"I40E_NVMUPD_CSUM_LCB",
"I40E_NVMUPD_STATUS",
"I40E_NVMUPD_EXEC_AQ",
+   "I40E_NVMUPD_GET_AQ_RESULT",
 };
 
 /**
@@ -832,6 +836,10 @@ static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
status = i40e_nvmupd_exec_aq(hw, cmd, bytes, perrno);
break;
 
+   case I40E_NVMUPD_GET_AQ_RESULT:
+   status = i40e_nvmupd_get_aq_result(hw, cmd, bytes, perrno);
+   break;
+
default:
i40e_debug(hw, I40E_DEBUG_NVM,
   "NVMUPD: bad cmd %s in init state\n",
@@ -1047,6 +1055,8 @@ static enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
case I40E_NVM_EXEC:
if (module == 0xf)
upd_cmd = I40E_NVMUPD_STATUS;
+   else if (module == 0)
+   upd_cmd = I40E_NVMUPD_GET_AQ_RESULT;
break;
}
break;
@@ -1160,6 +1170,75 @@ static i40e_status i40e_nvmupd_exec_aq(struct i40e_hw *hw,
 }
 
 /**
+ * i40e_nvmupd_get_aq_result - Get the results from the previous exec_aq
+ * @hw: pointer to hardware structure
+ * @cmd: pointer to nvm update command buffer
+ * @bytes: pointer to the data buffer
+ * @perrno: pointer to return error code
+ *
+ * cmd structure contains identifiers and data buffer
+ **/
+static i40e_status i40e_nvmupd_get_aq_result(struct i40e_hw *hw,
+struct i40e_nvm_access *cmd,
+u8 *bytes, int *perrno)
+{
+   u32 aq_total_len;
+   u32 aq_desc_len;
+   int remainder;
+   u8 *buff;
+
+   i40e_debug(hw, I40E_DEBUG_NVM, "NVMUPD: %s\n", __func__);
+
+   aq_desc_len = sizeof(struct i40e_aq_desc);
+   aq_total_len = aq_desc_len + le16_to_cpu(hw->nvm_wb_desc.datalen);
+
+   /* check offset range */
+   if (cmd->offset > aq_total_len) {
+   i40e_debug(hw, I40E_DEBUG_NVM, "%s: offset too big %d > %d\n",
+  __func__, cmd->offset, aq_total_len);
+   *perrno = -EINVAL;
+   return I40E_ERR_PARAM;
+   }
+
+   /* check copylength range */
+   if (cmd->data_size > (aq_total_len - cmd->offset)) {
+   int new_len = aq_total_len - cmd->offset;
+
+   i40e_debug(hw, I40E_DEBUG_NVM, "%s: copy length %d too big, trimming to %d\n",
+  __func__, cmd->data_size, new_len);
+   cmd->data_size = new_len;
+   }
+
+   remainder = cmd->data_size;
+   if (cmd->offset < aq_desc_len) {
+   u32 len = aq_desc_len - cmd->offset;
+
+   len = min(len, cmd->data_size);
+   i40e_debug(hw, I40E_DEBUG_NVM, "%s: aq_desc bytes %d to %d\n",
+  __func__, cmd->offset, cmd->offset + len);
+
+   buff = ((u8 *)&hw->nvm_wb_desc) + cmd->offset;
+   memcpy(bytes, buff, len);
+
+   bytes += len;
+   remainder -= len;
+   buff = hw->nvm_buff.va;
+   } else {
+   buff = hw->nvm_buff.va + (cmd->offset - aq_desc_len);
+   }
+
+   if (remainder > 0) {
+   int start_byte = buff - (u8 *)hw->nvm_buff.va;
+
+   i40e_debug(hw, I40E_DEBUG_NVM, "%s: databuf bytes %d to %d\n",
+  __func__, start_byte, start_byte + remainder);
+   memcpy(bytes, buff, remainder);
+   }
+
+   return 0;
+}
+
+/**
  * i40e_nvmupd_nvm_read - Read NVM
  * @hw: pointer to hardware structure
  * @cmd: poi

[net-next 8/8] i40e/i40evf: Bump i40e to 1.3.21 and i40evf to 1.3.13

2015-09-17 Thread Jeff Kirsher
From: Catherine Sullivan 

Bump.

Change-ID: If7ce84218361defa209142d1d8c6f69d48c2d7ad
Signed-off-by: Catherine Sullivan 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 851c1a1..530d8b6 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -39,7 +39,7 @@ static const char i40e_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 3
-#define DRV_VERSION_BUILD 9
+#define DRV_VERSION_BUILD 21
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 __stringify(DRV_VERSION_MINOR) "." \
 __stringify(DRV_VERSION_BUILD)DRV_KERN
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index e85849b..5fc8204 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -34,7 +34,7 @@ char i40evf_driver_name[] = "i40evf";
 static const char i40evf_driver_string[] =
"Intel(R) XL710/X710 Virtual Function Network Driver";
 
-#define DRV_VERSION "1.3.5"
+#define DRV_VERSION "1.3.13"
 const char i40evf_driver_version[] = DRV_VERSION;
 static const char i40evf_copyright[] =
"Copyright (c) 2013 - 2015 Intel Corporation.";
-- 
2.4.3



[net-next 6/8] i40e/i40evf: add exec_aq command to nvmupdate utility

2015-09-17 Thread Jeff Kirsher
From: Shannon Nelson 

Add a facility to run AQ commands through the nvmupdate utility in order
to allow the update tools to interact with the FW and do special
commands needed for updates and configuration changes.

Change-ID: I5c41523e4055b37f8e4ee479f7a0574368f4a588
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_adminq.c   |  3 +
 drivers/net/ethernet/intel/i40e/i40e_nvm.c  | 83 +
 drivers/net/ethernet/intel/i40e/i40e_type.h |  2 +
 drivers/net/ethernet/intel/i40evf/i40e_adminq.c |  3 +
 drivers/net/ethernet/intel/i40evf/i40e_type.h   |  2 +
 5 files changed, 93 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.c b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
index ea1e930..199275d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
@@ -657,6 +657,9 @@ i40e_status i40e_shutdown_adminq(struct i40e_hw *hw)
 
/* destroy the locks */
 
+   if (hw->nvm_buff.va)
+   i40e_free_virt_mem(hw, &hw->nvm_buff);
+
return ret_code;
 }
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index 7bcbe59..50d4aa3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -615,6 +615,9 @@ static i40e_status i40e_nvmupd_nvm_write(struct i40e_hw *hw,
 static i40e_status i40e_nvmupd_nvm_read(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
u8 *bytes, int *perrno);
+static i40e_status i40e_nvmupd_exec_aq(struct i40e_hw *hw,
+  struct i40e_nvm_access *cmd,
+  u8 *bytes, int *perrno);
 static inline u8 i40e_nvmupd_get_module(u32 val)
 {
return (u8)(val & I40E_NVM_MOD_PNT_MASK);
@@ -639,6 +642,7 @@ static char *i40e_nvm_update_state_str[] = {
"I40E_NVMUPD_CSUM_SA",
"I40E_NVMUPD_CSUM_LCB",
"I40E_NVMUPD_STATUS",
+   "I40E_NVMUPD_EXEC_AQ",
 };
 
 /**
@@ -824,6 +828,10 @@ static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
}
break;
 
+   case I40E_NVMUPD_EXEC_AQ:
+   status = i40e_nvmupd_exec_aq(hw, cmd, bytes, perrno);
+   break;
+
default:
i40e_debug(hw, I40E_DEBUG_NVM,
   "NVMUPD: bad cmd %s in init state\n",
@@ -1069,6 +1077,10 @@ static enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
case (I40E_NVM_CSUM|I40E_NVM_LCB):
upd_cmd = I40E_NVMUPD_CSUM_LCB;
break;
+   case I40E_NVM_EXEC:
+   if (module == 0)
+   upd_cmd = I40E_NVMUPD_EXEC_AQ;
+   break;
}
break;
}
@@ -1077,6 +1089,77 @@ static enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
 }
 
 /**
+ * i40e_nvmupd_exec_aq - Run an AQ command
+ * @hw: pointer to hardware structure
+ * @cmd: pointer to nvm update command buffer
+ * @bytes: pointer to the data buffer
+ * @perrno: pointer to return error code
+ *
+ * cmd structure contains identifiers and data buffer
+ **/
+static i40e_status i40e_nvmupd_exec_aq(struct i40e_hw *hw,
+  struct i40e_nvm_access *cmd,
+  u8 *bytes, int *perrno)
+{
+   struct i40e_asq_cmd_details cmd_details;
+   i40e_status status;
+   struct i40e_aq_desc *aq_desc;
+   u32 buff_size = 0;
+   u8 *buff = NULL;
+   u32 aq_desc_len;
+   u32 aq_data_len;
+
+   i40e_debug(hw, I40E_DEBUG_NVM, "NVMUPD: %s\n", __func__);
+   memset(&cmd_details, 0, sizeof(cmd_details));
+   cmd_details.wb_desc = &hw->nvm_wb_desc;
+
+   aq_desc_len = sizeof(struct i40e_aq_desc);
+   memset(&hw->nvm_wb_desc, 0, aq_desc_len);
+
+   /* get the aq descriptor */
+   if (cmd->data_size < aq_desc_len) {
+   i40e_debug(hw, I40E_DEBUG_NVM,
+  "NVMUPD: not enough aq desc bytes for exec, size %d < %d\n",
+  cmd->data_size, aq_desc_len);
+   *perrno = -EINVAL;
+   return I40E_ERR_PARAM;
+   }
+   aq_desc = (struct i40e_aq_desc *)bytes;
+
+   /* if data buffer needed, make sure it's ready */
+   aq_data_len = cmd->data_size - aq_desc_len;
+   buff_size = max_t(u32, aq_data_len, le16_to_cpu(aq_desc->datalen));
+   if (buff_size) {
+   if (!hw->nvm_buff.va) {
+   status = i40e_allocate_virt_mem(hw, &hw->nvm_buff,
+   hw->aq.asq_buf_size);
+   if (status)
+   i40e_debug(hw, I40E_DEBUG_NVM,
+ 

[net-next 0/8][pull request] Intel Wired LAN Driver Updates 2015-09-17

2015-09-17 Thread Jeff Kirsher
This series contains updates to i40e and i40evf.

Shannon provides updates to i40e and i40evf to resolve an issue with the
nvmupdate utility.  The first patch renames a variable to reduce confusion
and to differentiate it from the actual user errno variable.  The next adds
the ability to save the admin queue write back descriptor if a caller
supplies a buffer for it to be saved into.  A new GetStatus command lets
the NVM update tool query the current status instead of doing fake write
requests to probe for readiness.  Wait states are added to the NVM update
state machine to signify when waiting for an update operation to finish,
whether we are in the middle of a set of write operations, or we are now
idle but waiting.  A new facility runs admin queue commands through the
NVM update utility in order to allow the update tools to interact with the
firmware and do special commands needed for updates and configuration
changes.  Finally, a facility is added to recover the result of a
previously run admin queue command.
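
The result-recovery facility can be pictured as reading from one logical
range made of the saved writeback descriptor followed by the data buffer.
The sketch below is a user-space model of the offset check, length trimming
and two-part copy, loosely following i40e_nvmupd_get_aq_result() in patch
7/8.  The function name, parameters and return values are assumptions made
for illustration only, not the driver's interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model: copy up to *size bytes, starting at logical
 * offset `offset`, out of the concatenation [desc | data] into `out`.
 * *size is trimmed when it runs past the end of the logical range.
 * Returns 0 on success, -1 when the offset lies beyond the range.
 */
int copy_desc_then_data(const uint8_t *desc, size_t desc_len,
			const uint8_t *data, size_t data_len,
			size_t offset, size_t *size, uint8_t *out)
{
	size_t total = desc_len + data_len;
	size_t data_off;
	size_t remainder;

	assert(size != NULL && out != NULL);

	/* check offset range */
	if (offset > total)
		return -1;

	/* trim the copy length to what is actually available */
	if (*size > total - offset)
		*size = total - offset;

	remainder = *size;
	if (offset < desc_len) {
		/* first part comes from the descriptor */
		size_t len = desc_len - offset;

		if (len > remainder)
			len = remainder;
		memcpy(out, desc + offset, len);
		out += len;
		remainder -= len;
		data_off = 0;
	} else {
		/* request starts inside the data buffer */
		data_off = offset - desc_len;
	}

	/* whatever is left comes from the data buffer */
	if (remainder > 0)
		memcpy(out, data + data_off, remainder);

	return 0;
}
```

With a 4-byte descriptor and a 3-byte data buffer, a request for 8 bytes at
offset 2 is trimmed to 5 bytes: 2 from the descriptor and 3 from the data
buffer.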

---
Waiting for Dave to update his net-next tree before pushing this series
to my next-queue tree.

The following are changes since commit 9adbac599a71bc25a2617850ffcaa4388dc5c20d:
  fm10k: fix iov_msg_mac_vlan_pf VID checks
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue master

Catherine Sullivan (1):
  i40e/i40evf: Bump i40e to 1.3.21 and i40evf to 1.3.13

Shannon Nelson (7):
  i40e: rename variable to prevent clash of understanding
  i40e/i40evf: save aq writeback for future inspection
  i40e/i40evf: add handling of writeback descriptor
  i40e/i40evf: add GetStatus command for nvmupdate
  i40e/i40evf: add wait states to NVM state machine
  i40e/i40evf: add exec_aq command to nvmupdate utility
  i40e/i40evf: add get AQ result command to nvmupdate utility

 drivers/net/ethernet/intel/i40e/i40e_adminq.c   |  20 ++
 drivers/net/ethernet/intel/i40e/i40e_adminq.h   |   1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_nvm.c  | 386 +++-
 drivers/net/ethernet/intel/i40e/i40e_type.h |  10 +-
 drivers/net/ethernet/intel/i40evf/i40e_adminq.c |   7 +
 drivers/net/ethernet/intel/i40evf/i40e_adminq.h |   1 +
 drivers/net/ethernet/intel/i40evf/i40e_type.h   |  10 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c |   2 +-
 9 files changed, 354 insertions(+), 85 deletions(-)

-- 
2.4.3



[net-next 4/8] i40e/i40evf: add GetStatus command for nvmupdate

2015-09-17 Thread Jeff Kirsher
From: Shannon Nelson 

This adds a new GetStatus command so that the NVM update tool can query
the current status instead of doing fake write requests to probe for
readiness.

Change-ID: I671ec6ccd4dfc9dbac3a03b964589d693fda5cd8
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_nvm.c| 42 ---
 drivers/net/ethernet/intel/i40e/i40e_type.h   |  2 ++
 drivers/net/ethernet/intel/i40evf/i40e_type.h |  2 ++
 3 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index e71ce23..7ff3099 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -638,6 +638,7 @@ static char *i40e_nvm_update_state_str[] = {
"I40E_NVMUPD_CSUM_CON",
"I40E_NVMUPD_CSUM_SA",
"I40E_NVMUPD_CSUM_LCB",
+   "I40E_NVMUPD_STATUS",
 };
 
 /**
@@ -654,10 +655,34 @@ i40e_status i40e_nvmupd_command(struct i40e_hw *hw,
u8 *bytes, int *perrno)
 {
i40e_status status;
+   enum i40e_nvmupd_cmd upd_cmd;
 
/* assume success */
*perrno = 0;
 
+   /* early check for status command and debug msgs */
+   upd_cmd = i40e_nvmupd_validate_command(hw, cmd, perrno);
+
+   i40e_debug(hw, I40E_DEBUG_NVM, "%s state %d nvm_release_on_hold %d\n",
+  i40e_nvm_update_state_str[upd_cmd],
+  hw->nvmupd_state,
+  hw->aq.nvm_release_on_done);
+
+   if (upd_cmd == I40E_NVMUPD_INVALID) {
+   *perrno = -EFAULT;
+   i40e_debug(hw, I40E_DEBUG_NVM,
+  "i40e_nvmupd_validate_command returns %d errno %d\n",
+  upd_cmd, *perrno);
+   }
+
+   /* a status request returns immediately rather than
+* going into the state machine
+*/
+   if (upd_cmd == I40E_NVMUPD_STATUS) {
+   bytes[0] = hw->nvmupd_state;
+   return 0;
+   }
+
switch (hw->nvmupd_state) {
case I40E_NVMUPD_STATE_INIT:
status = i40e_nvmupd_state_init(hw, cmd, bytes, perrno);
@@ -954,12 +979,13 @@ static enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
 int *perrno)
 {
enum i40e_nvmupd_cmd upd_cmd;
-   u8 transaction;
+   u8 module, transaction;
 
/* anything that doesn't match a recognized case is an error */
upd_cmd = I40E_NVMUPD_INVALID;
 
transaction = i40e_nvmupd_get_transaction(cmd->config);
+   module = i40e_nvmupd_get_module(cmd->config);
 
/* limits on data size */
if ((cmd->data_size < 1) ||
@@ -986,6 +1012,10 @@ static enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
case I40E_NVM_SA:
upd_cmd = I40E_NVMUPD_READ_SA;
break;
+   case I40E_NVM_EXEC:
+   if (module == 0xf)
+   upd_cmd = I40E_NVMUPD_STATUS;
+   break;
}
break;
 
@@ -1018,17 +1048,7 @@ static enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
}
break;
}
-   i40e_debug(hw, I40E_DEBUG_NVM, "%s state %d nvm_release_on_hold %d\n",
-  i40e_nvm_update_state_str[upd_cmd],
-  hw->nvmupd_state,
-  hw->aq.nvm_release_on_done);
 
-   if (upd_cmd == I40E_NVMUPD_INVALID) {
-   *perrno = -EFAULT;
-   i40e_debug(hw, I40E_DEBUG_NVM,
-  "i40e_nvmupd_validate_command returns %d errno %d\n",
-  upd_cmd, *perrno);
-   }
return upd_cmd;
 }
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index a4fec8b..f63f538 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -305,6 +305,7 @@ enum i40e_nvmupd_cmd {
I40E_NVMUPD_CSUM_CON,
I40E_NVMUPD_CSUM_SA,
I40E_NVMUPD_CSUM_LCB,
+   I40E_NVMUPD_STATUS,
 };
 
 enum i40e_nvmupd_state {
@@ -329,6 +330,7 @@ enum i40e_nvmupd_state {
 #define I40E_NVM_SA(I40E_NVM_SNT | I40E_NVM_LCB)
 #define I40E_NVM_ERA   0x4
 #define I40E_NVM_CSUM  0x8
+#define I40E_NVM_EXEC  0xf
 
 #define I40E_NVM_ADAPT_SHIFT   16
 #define I40E_NVM_ADAPT_MASK(0x << I40E_NVM_ADAPT_SHIFT)
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_type.h b/drivers/net/ethernet/intel/i40evf/i40e_type.h
index cf865cc..7f3c05c 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_type.h
@@ -304,6 +304,7 @@ enum i40e_nvmupd_cmd {
I40E_NVMUPD_CSUM_CON,
I40E_NVMUPD_CSUM_SA,

[net-next 1/8] i40e: rename variable to prevent clash of understanding

2015-09-17 Thread Jeff Kirsher
From: Shannon Nelson 

This code returns something that becomes the errno value from ethtool and
passes around a pointer to an errno variable.  This patch changes the name
slightly to differentiate it from the actual user errno variable.

Change-ID: Idaa37845c069e66f4cea072e90f471bb2142454d
Signed-off-by: Shannon Nelson 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_nvm.c | 114 ++---
 1 file changed, 57 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index 9b83abc..3f2fec9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -592,25 +592,25 @@ i40e_validate_nvm_checksum_exit:
 
 static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
  struct i40e_nvm_access *cmd,
- u8 *bytes, int *errno);
+ u8 *bytes, int *perrno);
 static i40e_status i40e_nvmupd_state_reading(struct i40e_hw *hw,
 struct i40e_nvm_access *cmd,
-u8 *bytes, int *errno);
+u8 *bytes, int *perrno);
 static i40e_status i40e_nvmupd_state_writing(struct i40e_hw *hw,
 struct i40e_nvm_access *cmd,
 u8 *bytes, int *errno);
 static enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
-   int *errno);
+   int *perrno);
 static i40e_status i40e_nvmupd_nvm_erase(struct i40e_hw *hw,
 struct i40e_nvm_access *cmd,
-int *errno);
+int *perrno);
 static i40e_status i40e_nvmupd_nvm_write(struct i40e_hw *hw,
 struct i40e_nvm_access *cmd,
-u8 *bytes, int *errno);
+u8 *bytes, int *perrno);
 static i40e_status i40e_nvmupd_nvm_read(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
-   u8 *bytes, int *errno);
+   u8 *bytes, int *perrno);
 static inline u8 i40e_nvmupd_get_module(u32 val)
 {
return (u8)(val & I40E_NVM_MOD_PNT_MASK);
@@ -641,30 +641,30 @@ static char *i40e_nvm_update_state_str[] = {
  * @hw: pointer to hardware structure
  * @cmd: pointer to nvm update command
  * @bytes: pointer to the data buffer
- * @errno: pointer to return error code
+ * @perrno: pointer to return error code
  *
  * Dispatches command depending on what update state is current
  **/
 i40e_status i40e_nvmupd_command(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
-   u8 *bytes, int *errno)
+   u8 *bytes, int *perrno)
 {
i40e_status status;
 
/* assume success */
-   *errno = 0;
+   *perrno = 0;
 
switch (hw->nvmupd_state) {
case I40E_NVMUPD_STATE_INIT:
-   status = i40e_nvmupd_state_init(hw, cmd, bytes, errno);
+   status = i40e_nvmupd_state_init(hw, cmd, bytes, perrno);
break;
 
case I40E_NVMUPD_STATE_READING:
-   status = i40e_nvmupd_state_reading(hw, cmd, bytes, errno);
+   status = i40e_nvmupd_state_reading(hw, cmd, bytes, perrno);
break;
 
case I40E_NVMUPD_STATE_WRITING:
-   status = i40e_nvmupd_state_writing(hw, cmd, bytes, errno);
+   status = i40e_nvmupd_state_writing(hw, cmd, bytes, perrno);
break;
 
default:
@@ -672,7 +672,7 @@ i40e_status i40e_nvmupd_command(struct i40e_hw *hw,
i40e_debug(hw, I40E_DEBUG_NVM,
   "NVMUPD: no such state %d\n", hw->nvmupd_state);
status = I40E_NOT_SUPPORTED;
-   *errno = -ESRCH;
+   *perrno = -ESRCH;
break;
}
return status;
@@ -683,28 +683,28 @@ i40e_status i40e_nvmupd_command(struct i40e_hw *hw,
  * @hw: pointer to hardware structure
  * @cmd: pointer to nvm update command buffer
  * @bytes: pointer to the data buffer
- * @errno: pointer to return error code
+ * @perrno: pointer to return error code
  *
  * Process legitimate commands of the Init state and conditionally set next
  * state. Reject all other commands.
  **/
 static i40e_status i40e_nvmupd_state_init(struct i40e_hw *hw,
  struct i40e_nvm_access *cmd,
- u8 *bytes, int

[net-next 3/8] i40e/i40evf: add handling of writeback descriptor

2015-09-17 Thread Jeff Kirsher
From: Shannon Nelson 

If the writeback descriptor buffer was previously created, this gives it
to the AQ command request to be used to save the results.

Change-ID: I8c8a1af81e6ebed6d0a15ed31697fe1a6c4e3708
Signed-off-by: Shannon Nelson 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_nvm.c| 26 ++
 drivers/net/ethernet/intel/i40e/i40e_type.h   |  1 +
 drivers/net/ethernet/intel/i40evf/i40e_type.h |  1 +
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index 3f2fec9..e71ce23 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -418,6 +418,10 @@ static i40e_status i40e_write_nvm_aq(struct i40e_hw *hw, u8 module_pointer,
 bool last_command)
 {
i40e_status ret_code = I40E_ERR_NVM;
+   struct i40e_asq_cmd_details cmd_details;
+
+   memset(&cmd_details, 0, sizeof(cmd_details));
+   cmd_details.wb_desc = &hw->nvm_wb_desc;
 
/* Here we are checking the SR limit only for the flat memory model.
 * We cannot do it for the module-based model, as we did not acquire
@@ -443,7 +447,7 @@ static i40e_status i40e_write_nvm_aq(struct i40e_hw *hw, u8 module_pointer,
ret_code = i40e_aq_update_nvm(hw, module_pointer,
  2 * offset,  /*bytes*/
  2 * words,   /*bytes*/
- data, last_command, NULL);
+ data, last_command, &cmd_details);
 
return ret_code;
 }
@@ -1041,6 +1045,7 @@ static i40e_status i40e_nvmupd_nvm_read(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
u8 *bytes, int *perrno)
 {
+   struct i40e_asq_cmd_details cmd_details;
i40e_status status;
u8 module, transaction;
bool last;
@@ -1049,8 +1054,11 @@ static i40e_status i40e_nvmupd_nvm_read(struct i40e_hw *hw,
module = i40e_nvmupd_get_module(cmd->config);
last = (transaction == I40E_NVM_LCB) || (transaction == I40E_NVM_SA);
 
+   memset(&cmd_details, 0, sizeof(cmd_details));
+   cmd_details.wb_desc = &hw->nvm_wb_desc;
+
status = i40e_aq_read_nvm(hw, module, cmd->offset, (u16)cmd->data_size,
- bytes, last, NULL);
+ bytes, last, &cmd_details);
if (status) {
i40e_debug(hw, I40E_DEBUG_NVM,
   "i40e_nvmupd_nvm_read mod 0x%x  off 0x%x  len 0x%x\n",
@@ -1077,14 +1085,19 @@ static i40e_status i40e_nvmupd_nvm_erase(struct i40e_hw *hw,
 int *perrno)
 {
i40e_status status = 0;
+   struct i40e_asq_cmd_details cmd_details;
u8 module, transaction;
bool last;
 
transaction = i40e_nvmupd_get_transaction(cmd->config);
module = i40e_nvmupd_get_module(cmd->config);
last = (transaction & I40E_NVM_LCB);
+
+   memset(&cmd_details, 0, sizeof(cmd_details));
+   cmd_details.wb_desc = &hw->nvm_wb_desc;
+
status = i40e_aq_erase_nvm(hw, module, cmd->offset, (u16)cmd->data_size,
-  last, NULL);
+  last, &cmd_details);
if (status) {
i40e_debug(hw, I40E_DEBUG_NVM,
   "i40e_nvmupd_nvm_erase mod 0x%x  off 0x%x len 0x%x\n",
@@ -1112,6 +1125,7 @@ static i40e_status i40e_nvmupd_nvm_write(struct i40e_hw *hw,
 u8 *bytes, int *perrno)
 {
i40e_status status = 0;
+   struct i40e_asq_cmd_details cmd_details;
u8 module, transaction;
bool last;
 
@@ -1119,8 +1133,12 @@ static i40e_status i40e_nvmupd_nvm_write(struct i40e_hw *hw,
module = i40e_nvmupd_get_module(cmd->config);
last = (transaction & I40E_NVM_LCB);
 
+   memset(&cmd_details, 0, sizeof(cmd_details));
+   cmd_details.wb_desc = &hw->nvm_wb_desc;
+
status = i40e_aq_update_nvm(hw, module, cmd->offset,
-   (u16)cmd->data_size, bytes, last, NULL);
+   (u16)cmd->data_size, bytes, last,
+   &cmd_details);
if (status) {
i40e_debug(hw, I40E_DEBUG_NVM,
   "i40e_nvmupd_nvm_write mod 0x%x off 0x%x len 0x%x\n",
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 4842239..a4fec8b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -492,6 +492,7 @@ struct i40e_hw {
 
/* state of nvm update process */
enum i40e_nvmupd_state nvmup

Re: Experiences with slub bulk use-case for network stack

2015-09-17 Thread Christoph Lameter
On Thu, 17 Sep 2015, Jesper Dangaard Brouer wrote:

> What I'm proposing is keeping interrupts on, and then simply cmpxchg
> e.g 2 slab-pages out of the SLUB allocator (which the SLUB code calls
> freelist's). The bulk call now owns these freelists, and returns them
> to the caller.  The API caller gets some helpers/macros to access
> objects, to shield him from the details (of SLUB freelist's).
>
> The pitfall with this API is we don't know how many objects are on a
> SLUB freelist.  And we cannot walk the freelist and count them, because
> then we hit the problem of memory/cache stalls (that we are trying so
> hard to avoid).

If you get a fresh page from the page allocator then you know how many
objects are available in a slab page.

There is also a counter in each slab page for the objects allocated. The
number of free objects is page->objects - page->inuse.

This is only true for a locked cmpxchg. The unlocked cmpxchg used for the
per cpu freelist does not use the counters in the page struct.
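
As a minimal sketch of the free-object arithmetic above, assuming a
simplified stand-in for the two per-page counters (the struct and function
names here are hypothetical, not the kernel's definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the counters kept in a slab page. */
struct slab_page_counters {
	uint16_t objects;	/* total objects the slab page can hold */
	uint16_t inuse;		/* objects currently allocated from it */
};

/* Free objects on a slab page: objects - inuse. */
unsigned int slab_free_objects(const struct slab_page_counters *page)
{
	assert(page->inuse <= page->objects);
	return (unsigned int)(page->objects - page->inuse);
}
```

For a page holding 32 objects with 5 in use this yields 27.  As noted, the
result is only meaningful when the counters are maintained, which is not
the case for the per cpu freelist.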




Re: iproute2 tunnel name parsing

2015-09-17 Thread Wilhelm Wijkander
Thanks for the reply Vadim,

2015-09-17 22:10 GMT+02:00 Vadim Kochan :
> You can use 'name' before 'hel'

Yes, "name" enables me to create the tunnel, but things get trickier
when I try to bring the tunnel device up:

   # ip link set dev hel up
   Usage: ip link add [link DEV] [ name ] NAME
   [snip]
   # ip link set dev name hel up
   Usage: ip link add [link DEV] [ name ] NAME
   [snip]

Adding an address to the tunnel device, on the other hand:

   # ip addr add 2001:db8:a:a::2/64 dev hel

is no issue.

Regards,
Wilhelm


Re: [net-next 00/18][pull request] Intel Wired LAN Driver Updates 2015-09-15

2015-09-17 Thread David Miller
From: Jeff Kirsher 
Date: Tue, 15 Sep 2015 17:36:25 -0700

> This series contains updates to ixgbe and fm10k.

Pulled, thanks Jeff.


Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared

2015-09-17 Thread Francois Romieu
David Woodhouse  :
[...]
> And of course, even if I fix the TX timeout handling, I'd still like to
> know why it's happening in the first place...

So do I.

The TxDmaOkLowDesc register may tell if the Tx dma part is still making
any progress. I have added a TxPoll request. See below.

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index d79e33b..963d2a2 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -129,6 +129,9 @@ MODULE_PARM_DESC (multicast_filter_limit, "8139cp: maximum number of filtered mu
 /* Time in jiffies before concluding the transmitter is hung. */
 #define TX_TIMEOUT (6*HZ)
 
+/* TODO: calibrate. It ought to be related to the PCI bus frequency. */
+#define CP_EARLY_TIMEOUT   (16 * 1024)
+
 /* hardware minimum and maximum for a single frame's data payload */
 #define CP_MIN_MTU 60  /* TODO: allow lower, but pad */
 #define CP_MAX_MTU 4096
@@ -146,9 +149,11 @@ enum {
TxConfig= 0x40, /* Tx configuration */
ChipVersion = 0x43, /* 8-bit chip version, inside TxConfig */
RxConfig= 0x44, /* Rx configuration */
+   TimerCount  = 0x48, /* 32 bit general purpose timer. */
RxMissed= 0x4C, /* 24 bits valid, write clears */
Cfg9346 = 0x50, /* EEPROM select/control; Cfg reg [un]lock */
Config1 = 0x52, /* Config1 */
+   TimerInt= 0x54, /* TimerCount IRQ triggering timeout value */
Config3 = 0x59, /* Config3 */
Config4 = 0x5A, /* Config4 */
MultiIntr   = 0x5C, /* Multiple interrupt select */
@@ -157,6 +162,7 @@ enum {
NWayAdvert  = 0x66, /* MII ADVERTISE */
NWayLPAR= 0x68, /* MII LPA */
NWayExpansion   = 0x6A, /* MII Expansion */
+   TxDmaOkLowDesc  = 0x82, /* Low 16 bit address of a Tx descriptor. */
Config5 = 0xD8, /* Config5 */
TxPoll  = 0xD9, /* Tell chip to check Tx descriptors for work */
RxMaxSize   = 0xDA, /* Max size of an Rx packet (8169 only) */
@@ -283,7 +289,8 @@ enum {
LANWake = (1 << 1),  /* Enable LANWake signal */
PMEStatus   = (1 << 0),  /* PME status can be reset by PCI RST# */
 
-   cp_norx_intr_mask = PciErr | LinkChg | TxOK | TxErr | TxEmpty,
+   cp_norx_intr_mask = PciErr | TimerIntr | LinkChg |
+   TxOK | TxErr | TxEmpty,
cp_rx_intr_mask = RxOK | RxErr | RxEmpty | RxFIFOOvr,
cp_intr_mask = cp_rx_intr_mask | cp_norx_intr_mask,
 };
@@ -608,6 +615,18 @@ static irqreturn_t cp_interrupt (int irq, void *dev_instance)
 
if (status & (TxOK | TxErr | TxEmpty | SWInt))
cp_tx(cp);
+
+   if ((status & TimerIntr) && (cp->tx_head != cp->tx_tail)) {
+   if (net_ratelimit()) {
+   netdev_info(dev, "Timeout head=%08x, tail=%08x desc=%04x\n",
+   cp->tx_head, cp->tx_tail,
+   cpr16(TxDmaOkLowDesc));
+   }
+   cp_tx(cp);
+   if (cp->tx_head != cp->tx_tail)
+   cpw8_f(TxPoll, NormalTxPoll);
+   }
+
if (status & LinkChg)
mii_check_media(&cp->mii_if, netif_msg_link(cp), false);
 
@@ -885,6 +904,8 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 out_unlock:
spin_unlock_irqrestore(&cp->lock, intr_flags);
 
+   cpw32(TimerCount, CP_EARLY_TIMEOUT);
+
cpw8(TxPoll, NormalTxPoll);
 
return NETDEV_TX_OK;
@@ -1064,6 +1085,8 @@ static void cp_init_hw (struct cp_private *cp)
 
cpw16(MultiIntr, 0);
 
+   cpw32(TimerInt, CP_EARLY_TIMEOUT);
+
cpw8_f(Cfg9346, Cfg9346_Lock);
 }
 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next] sch_dsmark: improve memory locality

2015-09-17 Thread Eric Dumazet
From: Eric Dumazet 

Memory placement in sch_dsmark is silly : Better place mask/value
in the same cache line.

Also, we can embed small arrays in the first cache line and
remove a potential cache miss.

Signed-off-by: Eric Dumazet 
---
 net/sched/sch_dsmark.c |   63 ---
 1 file changed, 33 insertions(+), 30 deletions(-)
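
For readers who want to try the layout outside the kernel, here is a
rough userspace sketch of the idea: mask and value for one index sit
side by side, and small tables live inside the parent structure so no
second allocation is touched. The names marker_table, marker_table_init,
marker_table_destroy and marker_apply are made up for this sketch, and
the malloc() fallback for larger tables is an assumption, not the
kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define EMBEDDED_SZ 16	/* mirrors DSMARK_EMBEDDED_SZ in the patch */

/* mask and value for one index are adjacent in memory */
struct mask_value {
	uint8_t mask;
	uint8_t value;
};

/* hypothetical stand-in for struct dsmark_qdisc_data */
struct marker_table {
	struct mask_value *mv;	/* points at embedded[] or a heap array */
	uint16_t indices;
	struct mask_value embedded[EMBEDDED_SZ];
};

/* Returns 0 on success, -1 if the heap allocation fails. */
static int marker_table_init(struct marker_table *t, uint16_t indices)
{
	uint16_t i;

	if (indices <= EMBEDDED_SZ)
		t->mv = t->embedded;	/* small table: no extra allocation */
	else
		t->mv = malloc(indices * sizeof(*t->mv));	/* assumed fallback */
	if (!t->mv)
		return -1;

	t->indices = indices;
	for (i = 0; i < indices; i++) {
		t->mv[i].mask = 0xff;	/* same "unused" marker as dsmark_delete() */
		t->mv[i].value = 0;
	}
	return 0;
}

static void marker_table_destroy(struct marker_table *t)
{
	if (t->mv != t->embedded)
		free(t->mv);
	t->mv = NULL;
}

/* Keep the masked bits of dsfield and OR in the configured value. */
static uint8_t marker_apply(const struct marker_table *t, uint16_t index,
			    uint8_t dsfield)
{
	assert(index < t->indices);
	return (uint8_t)((dsfield & t->mv[index].mask) | t->mv[index].value);
}
```

With the pair stored together, marker_apply() reads both bytes from one
cache line instead of two separate arrays.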

diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c
index c4d45fd8c551..f357f34d02d2 100644
--- a/net/sched/sch_dsmark.c
+++ b/net/sched/sch_dsmark.c
@@ -35,14 +35,20 @@
 
 #define NO_DEFAULT_INDEX   (1 << 16)
 
+struct mask_value {
+   u8  mask;
+   u8  value;
+};
+
 struct dsmark_qdisc_data {
struct Qdisc*q;
struct tcf_proto __rcu  *filter_list;
-   u8  *mask;  /* "owns" the array */
-   u8  *value;
+   struct mask_value   *mv;
u16 indices;
+   u8  set_tc_index;
u32 default_index;  /* index range is 0...0x */
-   int set_tc_index;
+#define DSMARK_EMBEDDED_SZ 16
+   struct mask_value   embedded[DSMARK_EMBEDDED_SZ];
 };
 
 static inline int dsmark_valid_index(struct dsmark_qdisc_data *p, u16 index)
@@ -116,7 +122,6 @@ static int dsmark_change(struct Qdisc *sch, u32 classid, u32 parent,
struct nlattr *opt = tca[TCA_OPTIONS];
struct nlattr *tb[TCA_DSMARK_MAX + 1];
int err = -EINVAL;
-   u8 mask = 0;
 
pr_debug("%s(sch %p,[qdisc %p],classid %x,parent %x), arg 0x%lx\n",
 __func__, sch, p, classid, parent, *arg);
@@ -133,14 +138,11 @@ static int dsmark_change(struct Qdisc *sch, u32 classid, u32 parent,
if (err < 0)
goto errout;
 
-   if (tb[TCA_DSMARK_MASK])
-   mask = nla_get_u8(tb[TCA_DSMARK_MASK]);
-
if (tb[TCA_DSMARK_VALUE])
-   p->value[*arg - 1] = nla_get_u8(tb[TCA_DSMARK_VALUE]);
+   p->mv[*arg - 1].value = nla_get_u8(tb[TCA_DSMARK_VALUE]);
 
if (tb[TCA_DSMARK_MASK])
-   p->mask[*arg - 1] = mask;
+   p->mv[*arg - 1].mask = nla_get_u8(tb[TCA_DSMARK_MASK]);
 
err = 0;
 
@@ -155,8 +157,8 @@ static int dsmark_delete(struct Qdisc *sch, unsigned long arg)
if (!dsmark_valid_index(p, arg))
return -EINVAL;
 
-   p->mask[arg - 1] = 0xff;
-   p->value[arg - 1] = 0;
+   p->mv[arg - 1].mask = 0xff;
+   p->mv[arg - 1].value = 0;
 
return 0;
 }
@@ -173,7 +175,7 @@ static void dsmark_walk(struct Qdisc *sch, struct qdisc_walker *walker)
return;
 
for (i = 0; i < p->indices; i++) {
-   if (p->mask[i] == 0xff && !p->value[i])
+   if (p->mv[i].mask == 0xff && !p->mv[i].value)
goto ignore;
if (walker->count >= walker->skip) {
if (walker->fn(sch, i + 1, walker) < 0) {
@@ -291,12 +293,12 @@ static struct sk_buff *dsmark_dequeue(struct Qdisc *sch)
 
switch (tc_skb_protocol(skb)) {
case htons(ETH_P_IP):
-   ipv4_change_dsfield(ip_hdr(skb), p->mask[index],
-   p->value[index]);
+   ipv4_change_dsfield(ip_hdr(skb), p->mv[index].mask,
+   p->mv[index].value);
break;
case htons(ETH_P_IPV6):
-   ipv6_change_dsfield(ipv6_hdr(skb), p->mask[index],
-   p->value[index]);
+   ipv6_change_dsfield(ipv6_hdr(skb), p->mv[index].mask,
+   p->mv[index].value);
break;
default:
/*
@@ -304,7 +306,7 @@ static struct sk_buff *dsmark_dequeue(struct Qdisc *sch)
 * This way, we can send non-IP traffic through dsmark
 * and don't need yet another qdisc as a bypass.
 */
-   if (p->mask[index] != 0xff || p->value[index])
+   if (p->mv[index].mask != 0xff || p->mv[index].value)
pr_warn("%s: unsupported protocol %d\n",
__func__, ntohs(tc_skb_protocol(skb)));
break;
@@ -346,7 +348,7 @@ static int dsmark_init(struct Qdisc *sch, struct nlattr *opt)
int err = -EINVAL;
u32 default_index = NO_DEFAULT_INDEX;
u16 indices;
-   u8 *mask;
+   int i;
 
pr_debug("%s(sch %p,[qdisc %p],opt %p)\n", __func__, sch, p, opt);
 
@@ -366,18 +368,18 @@ static int dsmark_init(struct Qdisc *sch, struct nlattr *opt)
if (tb[TCA_DSMARK_DEFAULT_INDEX])
default_index = nla_get_u16(tb[TCA_DSMARK_DEFAULT_INDEX]);
 
-   mask = kmalloc(indices * 2, GFP_KERNEL);
-   if (mask == NULL) {
+   if (indices <= DSMARK_EMBEDDED_SZ)
+   p->mv = p->embedded;
+   else

Re: mvneta: SGMII fixed-link not so fixed

2015-09-17 Thread Florian Fainelli
On 17/09/15 16:14, Russell King - ARM Linux wrote:

[snip]

>with _no_ phy node.
> 
> 4. Going back to the SFP problem, the link is only up when the SFP
>module pins indicate that there's no transmitter fault, no loss of
>signal _and_ the PCS layer at the MAC indicates that it has completed
>negotiation.  This pretty much rules out trying to emulate a SFP cage
>as a software-based PHY.  I've code right now doing exactly that, and
>it results in netif_carrier_on() being called far too early.

Andrew recently added support for having the emulated fixed PHY poll
a GPIO to determine link status. If this information arrives differently
in your case (e.g. via MMIO/MAC registers), I guess we could either
extend the fixed PHY to support that scheme, or just open-code the
carrier state change in the mvneta driver.

> 
> What I don't know is how many generations of the mvneta hardware have
> support for both serdes modes, but what I'm basically saying is that
> the solution we now have seems to be somewhat lacking - maybe it should
> have been "auto", "in-band-sgmii" and "in-band-1000base-x" with the
> ability to add additional modes later.

Staas and I had quite some discussions about this topic, and I think I
changed my mind at least once about his initial proposal, but I seem to
recall that I suggested making the auto-negotiation property something
more flexible than it currently is (or maybe Staas did, and my mind
plays tricks on me).

> 
> The other point I'm making above is that I'm forming the opinion that
> the existing PHY layer isn't flexible enough for supporting SFP, and I
> need some way to represent at least part of the autonegotiation at the
> MAC level without involving the PHY level - especially when considering
> that a real PHY might be inside the SFP cage which can be talked to
> over I2C.

This is a fair conclusion. The PHY library is really designed to support
10/100/1000 PHYs well; anything else, including 10Gbit PHYs, is not that
well supported. Anything that is not MDIO is not that well supported
either, and the fixed PHY has its limitations.

It is unclear to me how much of the PHY library (state machine, link
management, device abstraction, and ethtool interface) is going to be
useful if you have an SFP instead of an MDIO-connected PHY.

> 
> This is the problem I'm presently grappling with, and it's taking lots
> of thought right now.  I'm aware of other drivers in the kernel which
> support SFP, each using their own implementations to support that.
> 
> 
> Lastly, while looking at this, I've a small stack of patches for the PHY
> code resolving some of the issues I've mentioned above, and fixing broken
> reference counting and mdio bus module removal issues:
> 
>  phy: fixed-phy: properly validate phy in fixed_phy_update_state()
>  net: fix phy refcounting in a bunch of drivers
>  of_mdio: fix MDIO phy device refcounting
>  phy: add proper phy struct device refcounting
>  phy: fix mdiobus module safety
>  phy: fix of_mdio_find_bus() device refcount leak
> 
> I hope to be able to send those out in the next few days - they have
> nothing to do with SFP itself but are the results of looking through the
> PHY code.
> 
-- 
Florian


Re: [PATCH] net: Fix vti use case with oif in dst lookups

2015-09-17 Thread David Miller
From: David Ahern 
Date: Tue, 15 Sep 2015 15:10:50 -0700

> Steffen reported that the recent change to add oif to dst lookups breaks
> the VTI use case. The problem is that with the oif set in the flow struct
> the comparison to the nh_oif is triggered. Fix by splitting the
> FLOWI_FLAG_VRFSRC into 2 flags -- one that triggers the vrf device cache
> bypass (FLOWI_FLAG_VRFSRC) and another telling the lookup to not compare
> nh oif (FLOWI_FLAG_SKIP_NH_OIF).
> 
> Fixes: 42a7b32b73d6 ("xfrm: Add oif to dst lookups")
> 
> Signed-off-by: David Ahern 

Applied.


Re: [PATCH v2] net-sysfs: get_netdev_queue_index() cleanup

2015-09-17 Thread David Miller
From: Thadeu Lima de Souza Cascardo 
Date: Tue, 15 Sep 2015 18:28:00 -0300

> Redo commit ed1acc8cd8c22efa919da8d300bab646e01c2dce.
> 
> Commit 822b3b2ebfff8e9b3d006086c527738a7ca00cd0 ("net: Add max rate tx queue
> attribute") moved get_netdev_queue_index around, but kept the old version.
> Probably because of a reuse of the original patch from before Eric's change to
> that function.
> 
> Remove one inline keyword, and no need for a loop to find
> an index into a table.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo 
> Fixes: 822b3b2ebfff ("net: Add max rate tx queue attribute")
> Acked-by:  Or Gerlitz 
> Acked-by: John Fastabend 

Applied.
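
As a side note on the "no need for a loop" remark in the quoted commit
message, the sketch below contrasts the two ways of finding an entry's
index in a contiguous table. It is a userspace illustration only:
demo_queue, demo_dev, queue_index_loop and queue_index_direct are
hypothetical names, not the net-sysfs code.

```c
#include <assert.h>
#include <stddef.h>

struct demo_dev;

/* Hypothetical, simplified stand-ins for a queue and its device. */
struct demo_queue {
	struct demo_dev *dev;
};

struct demo_dev {
	struct demo_queue *tx;		/* contiguous array of queues */
	unsigned int num_tx_queues;
};

/* Loop version: walk the table until the matching entry is found. */
static unsigned int queue_index_loop(const struct demo_queue *queue)
{
	const struct demo_dev *dev = queue->dev;
	unsigned int i;

	for (i = 0; i < dev->num_tx_queues; i++)
		if (queue == &dev->tx[i])
			break;

	assert(i < dev->num_tx_queues);
	return i;
}

/* Direct version: the index is simply a pointer difference. */
static unsigned int queue_index_direct(const struct demo_queue *queue)
{
	const struct demo_dev *dev = queue->dev;
	ptrdiff_t i = queue - dev->tx;

	assert(i >= 0 && (size_t)i < dev->num_tx_queues);
	return (unsigned int)i;
}
```

Both functions return the same index for any queue that belongs to the
table; the direct version does it in constant time.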


Re: mvneta: SGMII fixed-link not so fixed

2015-09-17 Thread David Miller
From: Florian Fainelli 
Date: Thu, 17 Sep 2015 16:02:41 -0700

> On 17/09/15 15:12, David Miller wrote:
>> I can queue up the whole series for -stable if you want.
> 
> I think this would be a good thing, mvneta-based platforms are fairly
> popular.

Done.


Re: [PATCH for-next] cxgb4: add device ID for few T5 adapters

2015-09-17 Thread David Miller
From: Hariprasad Shenai 
Date: Tue, 15 Sep 2015 17:20:09 +0530

> Signed-off-by: Hariprasad Shenai 

Applied to 'net', thanks.


Re: [PATCH net-next 4/5] qeth: add layer 2 RX/TX checksum offloading

2015-09-17 Thread David Miller
From: Ursula Braun 
Date: Tue, 15 Sep 2015 12:32:17 +0200

> +int qeth_send_setassparms(struct qeth_card *, struct qeth_cmd_buffer *, __u16,
> + long,
> + int (*reply_cb)(struct qeth_card *, struct qeth_reply *, unsigned long),
> + void *);

Function declarations and definitions that span multiple lines must begin
the second and subsequent lines precisely at the column right after the
opening parenthesis of the declaration/definition.

Indenting those lines just by a plain TAB character is not correct.
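
To make the rule concrete, here is a small standalone sketch of the
expected layout. The function and type names (demo_reply_cb,
demo_send_cmd, demo_echo_cb) are invented for illustration and are not
the qeth functions; spaces are used for the continuation lines here so
the alignment survives any tab width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; only the line layout matters here. */
typedef int (*demo_reply_cb)(void *card, void *reply, unsigned long data);

/* Declaration: continuation lines start right after the opening '('. */
static int demo_send_cmd(void *card, void *iob, unsigned short len,
                         long data, demo_reply_cb reply_cb,
                         void *reply_param);

static int demo_echo_cb(void *card, void *reply, unsigned long data)
{
	(void)card;
	(void)reply;
	return (int)data;
}

/* Definition: same alignment rule as the declaration. */
static int demo_send_cmd(void *card, void *iob, unsigned short len,
                         long data, demo_reply_cb reply_cb,
                         void *reply_param)
{
	(void)iob;
	(void)len;
	assert(reply_cb != NULL);
	return reply_cb(card, reply_param, (unsigned long)data);
}
```

The first argument on each continuation line sits directly under the
first argument of the opening line.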

> +static int qeth_setassparms_cb(struct qeth_card *card,
> + struct qeth_reply *reply, unsigned long data)

Likewise.

> +static struct qeth_cmd_buffer *qeth_get_setassparms_cmd(
> + struct qeth_card *card, enum qeth_ipa_funcs ipa_func, __u16 cmd_code,
> + __u16 len, enum qeth_prot_versions prot)

Likewise.

> +int qeth_send_setassparms(struct qeth_card *card,
> + struct qeth_cmd_buffer *iob, __u16 len, long data,
> + int (*reply_cb)(struct qeth_card *, struct qeth_reply *,
> + unsigned long),
> + void *reply_param)

Likewise.

And so on and so forth.


[PATCH 2/2] 8139cp: Call __cp_set_rx_mode() from cp_tx_timeout()

2015-09-17 Thread David Woodhouse
Unless we reset the RX config, on real hardware I don't seem to receive
any packets after a TX timeout.

Signed-off-by: David Woodhouse 
---
Now it does actually recover from the TX timeout, lots of the time.
Sometimes it still hits that IRQ storm, which probably explains the
apparent lockup right after the 'popf'... although I thought we handled
it more gracefully than that these days.

That's probably a race with the RX handling code, or something. I'll
try harder to reproduce the TX timeout with the debugging enabled.
Which might shed some light on this, and also on the reason why it
happens in the first place. If we're lucky.

 drivers/net/ethernet/realtek/8139cp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 52a5334..ba3dab7 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -1261,6 +1261,7 @@ static void cp_tx_timeout(struct net_device *dev)
cp_clean_rings(cp);
rc = cp_init_rings(cp);
cp_start_hw(cp);
+   __cp_set_rx_mode(dev);
cp_enable_irq(cp);
 
netif_wake_queue(dev);
-- 
2.4.3


-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation





Re: [PATCH V1 net-next] net: only check perm protocol when register proto

2015-09-17 Thread David Miller
From: martinbj2...@gmail.com
Date: Tue, 15 Sep 2015 08:14:05 +0800

> @@ -1043,22 +1043,16 @@ void inet_register_protosw(struct inet_protosw *p)
>   goto out_illegal;
>  
>   /* If we are trying to override a permanent protocol, bail. */
> - answer = NULL;
>   last_perm = &inetsw[p->type];
>   list_for_each(lh, &inetsw[p->type]) {
>   answer = list_entry(lh, struct inet_protosw, list);
> -
>   /* Check only the non-wild match. */
> - if (INET_PROTOSW_PERMANENT & answer->flags) {
> - if (protocol == answer->protocol)
> + if ((INET_PROTOSW_PERMANENT & answer->flags) == 0)
>   break;
> - last_perm = lh;

Well, if you're going to do this, you need to fix up the indentation
of that "break;" statement.


[PATCH 1/2] 8139cp: Use dev_kfree_skb_any() instead of dev_kfree_skb() in cp_clean_rings()

2015-09-17 Thread David Woodhouse
This can be called from cp_tx_timeout() with interrupts disabled.
Spotted by Francois Romieu 

Signed-off-by: David Woodhouse 
---
 drivers/net/ethernet/realtek/8139cp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
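
For context, the difference between the two free helpers can be modelled
in a few lines of userspace C. This is a toy model under stated
assumptions: every name below (toy_buf, toy_free_now, toy_free_deferred,
toy_free_any, toy_run_deferred) is hypothetical, and the single flag
stands in for the kernel's "in hard IRQ or IRQs disabled" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of this is kernel code. */
struct toy_buf {
	struct toy_buf *next;
	bool freed;
};

static bool toy_irqs_disabled;		/* stands in for irqs_disabled() */
static struct toy_buf *toy_deferred;	/* stands in for a deferred-free queue */

/* Immediate free: must not be used while interrupts are disabled. */
static void toy_free_now(struct toy_buf *b)
{
	assert(!toy_irqs_disabled);
	b->freed = true;
}

/* Deferred free: safe in any context, the real work happens later. */
static void toy_free_deferred(struct toy_buf *b)
{
	b->next = toy_deferred;
	toy_deferred = b;
}

/* The "_any" variant picks the right path for the current context. */
static void toy_free_any(struct toy_buf *b)
{
	if (toy_irqs_disabled)
		toy_free_deferred(b);
	else
		toy_free_now(b);
}

/* Run later, once interrupts are enabled again. */
static void toy_run_deferred(void)
{
	while (toy_deferred) {
		struct toy_buf *b = toy_deferred;

		toy_deferred = b->next;
		toy_free_now(b);
	}
}
```

Calling toy_free_now() with the flag set trips the assertion, which is
the analogue of calling dev_kfree_skb() from cp_tx_timeout() with
interrupts disabled.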

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index d79e33b..52a5334 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -1151,7 +1151,7 @@ static void cp_clean_rings (struct cp_private *cp)
desc = cp->rx_ring + i;
dma_unmap_single(&cp->pdev->dev,le64_to_cpu(desc->addr),
 cp->rx_buf_sz, PCI_DMA_FROMDEVICE);
-   dev_kfree_skb(cp->rx_skb[i]);
+   dev_kfree_skb_any(cp->rx_skb[i]);
}
}
 
@@ -1164,7 +1164,7 @@ static void cp_clean_rings (struct cp_private *cp)
 le32_to_cpu(desc->opts1) & 0x,
 PCI_DMA_TODEVICE);
if (le32_to_cpu(desc->opts1) & LastFrag)
-   dev_kfree_skb(skb);
+   dev_kfree_skb_any(skb);
cp->dev->stats.tx_dropped++;
}
}
-- 
2.4.3

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation





Re: mvneta: SGMII fixed-link not so fixed

2015-09-17 Thread Russell King - ARM Linux
On Thu, Sep 17, 2015 at 03:12:47PM -0700, David Miller wrote:
> From: Russell King - ARM Linux 
> Date: Mon, 14 Sep 2015 12:42:09 +0100
> 
> > Thanks, I think that will solve it.  I have to wonder why that patch
> > (f8af8e6eb9509 in mainline) didn't make it into v4.2 though, as it's
> > billed as a regression that occurred in the previous merge window, and
> > given that it was sent in July, and we're now in September.  As it
> > wasn't in v4.2, it looks like it should be a stable candidate.
> 
> The series had a whole bunch of non bug fixes in it and we were in
> the final phases of 4.2, in which case I defer to applying patches
> to net-next only unless I'm told otherwise.
> 
> It's up to the patch/series author to let me know that an important
> regression fix is hidden in there, but they should have submitted
> it separately from the rest in that kind of situation anyways.
> 
> > David, any objections to having the stable guys pick this regression
> > fix up, if not already done so?
> 
> More than this patch is needed, the one before it (3/4) instantiates
> the necessary property in the DT, for example.
> 
> I can queue up the whole series for -stable if you want.

Sorry in advance for this rambling reply...

I'm not entirely certain that'd be a good idea at the moment, for a
number of reasons, which are coming up because I'm looking at getting
a SFP cage supported with mvneta hardware.

1. Serdes gigabit ethernet links have two operating modes for in-band
   "negotiation" - Cisco SGMII format, and 1000base-X format.  Both use
   exactly the same encoding on the wire, the only differences between
   them are the contents of a 16-bit configuration word and how each
   end of the link handles that.  SFP can use either format depending
   on the module hot-plugged in - fiber modules will normally use
   1000base-X, but copper modules which contain a PHY may use either
   SGMII or 1000base-X.  (Fiber modules for 100baseFX will probably
   use SGMII though.)

   The issue there is two-fold: that the new DT property just says it's
   "in-band" or "auto" but there's no way to specify the format of the
   in-band configuration.

2. With Serdes, the PCS layer of the PHY, which does the autonegotiation,
   is moved to the MAC.  When connected to a SGMII PHY, the PHY may report
   over the Serdes connection the Cisco SGMII configuration word which
   instructs the MAC how to configure itself.  It's not "negotiation" by
   any means, but a "phy telling the MAC how to configure itself" word.

   Having "in-band" enabled pretty much requires the use of the "fixed-link"
   property, which seems to be a total hack around the PCS layer being in
   the MAC - the "fixed-link" phy is no longer fixed, but is used as a
   means to convey the negotiated results from the MAC side PCS to the
   software-emulated PHY, only to have them pop back out into the MAC
   driver.

   If you specify "in-band" without a "fixed-link" but have other MACs
   making use of the fixed-link support, all hell breaks loose, because
   mvneta will call the fixed-link update function with the real phy
   with the in-band results, and this can hit a fixed-link PHY for some
   other network adapter.  The fixed-link PHY code makes no attempt to
   validate that the phy_device passed in really is a fixed-link phy
   and not a MDIO phy.

3. Having DT specify a fixed-link with parameters along with in-band
   negotiation results in the fixed-link parameters being ignored.
   This means if a fixed-link DT declaration specifies a speed, that
   declaration will be ignored.  What I'm basically saying is that:

phy-mode = "sgmii";
fixed-link {
speed = <1000>;
};

   specifies a fixed-speed serdes link at 1000mbps, but:

phy-mode = "sgmii";
managed = "in-band-status";
fixed-link {
speed = <1000>;
};

   does not fix the speed at all.  _But_ using the in-band status
   property fundamentally requires this for mvneta to behave correctly:

phy-mode = "sgmii";
managed = "in-band-status";
fixed-link {
};

   with _no_ phy node.

4. Going back to the SFP problem, the link is only up when the SFP
   module pins indicate that there's no transmitter fault, no loss of
   signal _and_ the PCS layer at the MAC indicates that it has completed
   negotiation.  This pretty much rules out trying to emulate a SFP cage
   as a software-based PHY.  I've code right now doing exactly that, and
   it results in netif_carrier_on() being called far too early.
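
As a rough illustration of the missing validation mentioned in point 2,
the userspace model below refuses to update a phy that was not
registered on the emulated fixed-link bus. This is one plausible form of
such a check, not the actual fixed-phy code; model_bus, model_phy,
model_fixed_bus, model_mdio_bus and model_fixed_phy_update are all
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; they do not match the kernel structures. */
struct model_bus {
	const char *name;
};

struct model_phy {
	const struct model_bus *bus;	/* bus this phy was registered on */
	int speed;
};

static const struct model_bus model_fixed_bus = { "fixed" };
static const struct model_bus model_mdio_bus = { "mdio" };

/* Returns 0 on success, -1 if phy is not an emulated fixed-link phy. */
static int model_fixed_phy_update(struct model_phy *phy, int speed)
{
	assert(phy != NULL);

	if (phy->bus != &model_fixed_bus)
		return -1;	/* a real MDIO phy: leave it alone */

	phy->speed = speed;
	return 0;
}
```

With a check of this kind, passing a real MDIO phy returns an error
instead of silently modifying the state of some other adapter's
fixed-link phy.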

What I don't know is how many generations of the mvneta hardware have
support for both serdes modes, but what I'm basically saying is that
the solution we now have seems to be somewhat lacking - maybe it should
have been "auto", "in-band-sgmii" and "in-band-1000base-x" with the
ability to add additional mod

Re: [PATCH] net: qdisc: enhance default_qdisc documentation

2015-09-17 Thread David Miller
From: Phil Sutter 
Date: Tue, 15 Sep 2015 10:33:07 +0200

> Aside from some lingual cleanup, point out which interfaces are not or
> partly covered by this setting.
> 
> Signed-off-by: Phil Sutter 

Applied.


Re: [PATCH] net: Add documentation for VRF device

2015-09-17 Thread David Miller
From: David Ahern 
Date: Tue, 15 Sep 2015 10:50:14 -0600

> Signed-off-by: David Ahern 

Applied, thanks.


Re: mvneta: SGMII fixed-link not so fixed

2015-09-17 Thread Florian Fainelli
On 17/09/15 15:12, David Miller wrote:
> From: Russell King - ARM Linux 
> Date: Mon, 14 Sep 2015 12:42:09 +0100
> 
>> Thanks, I think that will solve it.  I have to wonder why that patch
>> (f8af8e6eb9509 in mainline) didn't make it into v4.2 though, as it's
>> billed as a regression that occurred in the previous merge window, and
>> given that it was sent in July, and we're now in September.  As it
>> wasn't in v4.2, it looks like it should be a stable candidate.
> 
> The series had a whole bunch of non bug fixes in it and we were in
> the final phases of 4.2, in which case I defer to applying patches
> to net-next only unless I'm told otherwise.

In your defense, Staas and I kept arguing for a while, slowing the
entire process down until we agreed on a proper solution. The submission
was targeting your 'net' tree, but I did not realize until now that
these got applied to 'net-next'.

> 
> It's up to the patch/series author to let me know that an important
> regression fix is hidden in there, but they should have submitted
> it separately from the rest in that kind of situation anyways.
> 
>> David, any objections to having the stable guys pick this regression
>> fix up, if not already done so?
> 
> More than this patch is needed, the one before it (3/4) instantiates
> the necessary property in the DT, for example.
> 
> I can queue up the whole series for -stable if you want.

I think this would be a good thing, mvneta-based platforms are fairly
popular.

Thank you!
-- 
Florian


Re: [PATCH net] openvswitch: Fix IPv6 exthdr handling with ct helpers.

2015-09-17 Thread David Miller
From: Joe Stringer 
Date: Mon, 14 Sep 2015 11:14:50 -0700

> Static code analysis reveals the following bug:
> 
> net/openvswitch/conntrack.c:281 ovs_ct_helper()
> warn: unsigned 'protoff' is never less than zero.
> 
> This signedness bug breaks error handling for IPv6 extension headers when
> using conntrack helpers. Fix the error by using a local signed variable.
> 
> Fixes:  cae3a2627520: "openvswitch: Allow attaching helpers to ct
> action"
> Reported-by: Dan Carpenter 
> Signed-off-by: Joe Stringer 

Applied, thanks.
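
The class of bug described in the quoted commit message can be
reproduced in a few lines of standalone C. The sketch below is an
illustration under assumptions: find_payload_offset, parse_buggy and
parse_fixed are hypothetical helpers, not the openvswitch code, and some
compilers will warn about the always-false comparison, which is the
point of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical header parser: negative return value means error. */
static int find_payload_offset(bool malformed)
{
	return malformed ? -1 : 40;
}

/* Buggy pattern: the result goes straight into an unsigned variable. */
static bool parse_buggy(bool malformed, unsigned int *protoff)
{
	assert(protoff != NULL);

	*protoff = find_payload_offset(malformed);
	if (*protoff < 0)	/* never true: unsigned cannot be negative */
		return false;
	return true;
}

/* Fixed pattern: check a signed local first, then store the offset. */
static bool parse_fixed(bool malformed, unsigned int *protoff)
{
	int ofs;

	assert(protoff != NULL);

	ofs = find_payload_offset(malformed);
	if (ofs < 0)
		return false;	/* the error path is reachable again */

	*protoff = (unsigned int)ofs;
	return true;
}
```

parse_buggy() reports success even for malformed input, while
parse_fixed() rejects it and leaves the caller's offset untouched.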


Re: [PATCH next 0/30] Passing net through the netfilter hooks

2015-09-17 Thread Eric W. Biederman
Nicolas Dichtel  writes:

> Le 16/09/2015 02:59, Eric W. Biederman a écrit :
>>
>> My primary goal with this patchset and its follow-ups is to clean up the
>> network routing paths so that we do not look at the output device to
>> derive the network namespace.  My plan is to pass the network namespace
>> of the transmitting socket through the output path, to replace code that
>> looks at the output network device today.  Once that is done we can have
>> routes with output devices outside of the current network namespace.
>> Which should allow reception and transmission of packets in network
>> namespaces to be as fast as normal packet reception and transmission
>> with early demux disabled, because it will be the same code path.
>>
>> Once skb_dst(skb)->dev is a little better under control I think it will
>> also be possible to use rcu to cleanup the ancient hack that sets
>> dst->dev to loopback_dev when a network device is removed.
>>
>> The work to get there is a series of code cleanups.  I am starting with
>> passing net into the netfilter hooks and into the functions that are
>> called after the netfilter hooks.  This removes from netfilter the
>> need to guess which network namespace it is working on.
>>
>> To get there I perform a series of minor prep patches so the big changes
>> at the end are possible to audit without getting lost in the noise.  In
>> particular I have a lot of patches computing net into a local variable
>> and then using it throughout the function.
>>
>> So this patchset encompasses removing dead code, sorting out the _sk
>> functions that were added last time someone pushed a prototype change
>> through the post netfilter functions.  Cleaning up individual functions
>> use of the network namespace.  Passing net into the netfilter hooks.
>> Passing net into the post netfilter functions.  Using state->net in
>> the netfilter code where it is available and trivially usable.
> LGTM (except some minor comments).
>
> Acked-by: Nicolas Dichtel 

Thanks for the review.  I have added an extra patch for the blank
lines that are still missing after the entire series.  As they affect
neither code correctness nor bisectability I don't think there is any
point respinning the individual patches.

Eric


[PATCH next 31/30] netfilter: Add blank lines in callers of netfilter hooks

2015-09-17 Thread Eric W. Biederman

In code review it was noticed that I had failed to add some blank lines
in places where they are customarily used.  Taking a second look at the
code I have to agree blank lines would be nice so I have added them
here.

Reported-by:  Nicolas Dichtel 
Signed-off-by: "Eric W. Biederman" 
---
 net/ipv4/xfrm4_output.c | 1 +
 net/ipv6/ip6_output.c   | 1 +
 net/ipv6/xfrm6_output.c | 1 +
 net/xfrm/xfrm_output.c  | 1 +
 4 files changed, 4 insertions(+)

diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 28ae2048b93a..cd6be736e19f 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -97,6 +97,7 @@ static int __xfrm4_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 int xfrm4_output(struct sock *sk, struct sk_buff *skb)
 {
struct net *net = dev_net(skb_dst(skb)->dev);
+
return NF_HOOK_COND(NFPROTO_IPV4, NF_INET_POST_ROUTING,
net, sk, skb, NULL, skb_dst(skb)->dev,
__xfrm4_output,
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index d8d68e81d123..291a07be5dfb 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -136,6 +136,7 @@ int ip6_output(struct sock *sk, struct sk_buff *skb)
struct net_device *dev = skb_dst(skb)->dev;
struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
struct net *net = dev_net(dev);
+
if (unlikely(idev->cnf.disable_ipv6)) {
IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
kfree_skb(skb);
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 68a996f8a044..0c3e9ffcf231 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -169,6 +169,7 @@ static int __xfrm6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 int xfrm6_output(struct sock *sk, struct sk_buff *skb)
 {
struct net *net = dev_net(skb_dst(skb)->dev);
+
return NF_HOOK_COND(NFPROTO_IPV6, NF_INET_POST_ROUTING,
net, sk, skb,  NULL, skb_dst(skb)->dev,
__xfrm6_output,
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 61ba99f61dc8..c48a4b8582bb 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -132,6 +132,7 @@ out:
 int xfrm_output_resume(struct sk_buff *skb, int err)
 {
struct net *net = xs_net(skb_dst(skb)->xfrm);
+
while (likely((err = xfrm_output_one(skb, err)) == 0)) {
nf_reset(skb);
 
-- 
2.2.1



Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared

2015-09-17 Thread David Woodhouse
On Thu, 2015-09-17 at 22:44 +0200, Francois Romieu wrote:
> David Woodhouse  :
> > On Thu, 2015-09-17 at 12:36 +0100, David Woodhouse wrote:
> > > 
> > > Thanks; I'll try that. In fact since updating to 4.2 the problem has
> > > got worse — now the whole machine dies:
> > 
> > There is something very strange going on here. I've found two ways to
> > make it stop crashing when cp_tx_timeout() hits the 'popf' when
> > unlocking the spinlock.
> 
> cp_tx_timeout takes lock, disables irq, calls cp_clean_rings, thus
> plain dev_kfree_skb if a skb is still referenced in one of the
> rx/tx ring. You may replace it with dev_kfree_skb_any.

Well spotted; I've made that change locally. Although I don't think it
explains the symptoms. Not that I'm sure what *could*.

I've also found that adding a call to __cp_set_rx_mode() seems to fix
the RX after reset, in some tests. Especially the simulated one via the
hack in cp_set_wol(). I think that's necessary, if not sufficient — at
least on real hardware. I didn't see the problem at all when running in
qemu.

Sometimes, though, it still dies in an interrupt storm after re-enabling
IRQs:

[  900.004214] 8139cp :00:0b.0 eth1: Transmit timeout, status  c   2b0 80ff
[  900.011725] will lock...
[  900.014273] Handling tx timeout, flags 200296
[  900.018774] Will wake queue...
[  900.021645] Will unlock... flags 200296
[  900.021645] 8139cp :00:0b.0 eth1: intr, status 0001 enable 80ff cmd 0c cpcmd 002b
[  900.021645] 8139cp :00:0b.0 eth1: intr, status 0001 enable 80ff cmd 0c cpcmd 002b
...
[  901.628439] 8139cp :00:0b.0 eth1: intr, status 0001 enable 80ff cmd 0c cpcmd 002b
[  901.636291] 8139cp :00:0b.0 eth1: intr, status 0011 enable 80ff cmd 0c cpcmd 002b
...
[  901.966243] 8139cp :00:0b.0 eth1: intr, status 0011 enable 80ff cmd 0c cpcmd 002b
[  901.968353] 8139cp :00:0b.0 eth1: intr, status 0051 enable 80ff cmd 0c cpcmd 002b
... forever...

And of course, even if I fix the TX timeout handling, I'd still like to
know why it's happening in the first place...

-- 
dwmw2





Re: mvneta: SGMII fixed-link not so fixed

2015-09-17 Thread David Miller
From: Russell King - ARM Linux 
Date: Mon, 14 Sep 2015 12:42:09 +0100

> Thanks, I think that will solve it.  I have to wonder why that patch
> (f8af8e6eb9509 in mainline) didn't make it into v4.2 though, as it's
> billed as a regression that occurred in the previous merge window, and
> given that it was sent in July, and we're now in September.  As it
> wasn't in v4.2, it looks like it should be a stable candidate.

The series had a whole bunch of non bug fixes in it and we were in
the final phases of 4.2, in which case I defer to applying patches
to net-next only unless I'm told otherwise.

It's up to the patch/series author to let me know that an important
regression fix is hidden in there, but they should have submitted
it separately from the rest in that kind of situation anyways.

> David, any objections to having the stable guys pick this regression
> fix up, if not already done so?

More than this patch is needed, the one before it (3/4) instantiates
the necessary property in the DT, for example.

I can queue up the whole series for -stable if you want.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NFS/TCP/IPv6 acting strangely in 4.2

2015-09-17 Thread Trond Myklebust
On Thu, Sep 17, 2015 at 12:27 PM, Benjamin Coddington
 wrote:
> On Thu, 17 Sep 2015, Trond Myklebust wrote:
>
>> Hi Russell,
>>
>> On Thu, 2015-09-17 at 14:57 +0100, Russell King - ARM Linux wrote:
>> > On Fri, Sep 11, 2015 at 05:49:38PM +0100, Russell King - ARM Linux
>> > wrote:
>> > > Following that idea, I just tried the patch below, and it seems to
>> > > work.
>> > > I don't know whether it handles all cases after a call to
>> > > kernel_connect(),
>> > > but it stops the multiple connection attempts:
>> > >
>> > >   1   0.00 armada388 -> n2100 TCP 1009→nfs [SYN] Seq=3794066539
>> > > Win=28560 Len=0 MSS=1440 SACK_PERM=1 TSval=15712 TSecr=870317691
>> > > WS=128
>> > >   2   0.000414 n2100 -> armada388 TCP nfs→1009 [SYN, ACK]
>> > > Seq=1884476522 Ack=3794066540 Win=28560 Len=0 MSS=1440 SACK_PERM=1
>> > > TSval=870318939 TSecr=15712 WS=64
>> > >   3   0.000787 armada388 -> n2100 TCP 1009→nfs [ACK] Seq=3794066540
>> > > Ack=1884476523 Win=28672 Len=0 TSval=15712 TSecr=870318939
>> > >   4   0.001304 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0x905379cc, [Check: RD LU MD XT DL]
>> > >   5   0.001566 n2100 -> armada388 TCP nfs→1009 [ACK] Seq=1884476523
>> > > Ack=379400 Win=28608 Len=0 TSval=870318939 TSecr=15712
>> > >   6   0.001640 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0x905379cc, [Check: RD LU MD XT DL]
>> > >   7   0.001866 n2100 -> armada388 TCP nfs→1009 [ACK] Seq=1884476523
>> > > Ack=3794066780 Win=28608 Len=0 TSval=870318939 TSecr=15712
>> > >   8   0.003070 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 4),
>> > > [Allowed: RD LU MD XT DL]
>> > >   9   0.003415 armada388 -> n2100 TCP 1009→nfs [ACK] Seq=3794066780
>> > > Ack=1884476647 Win=28672 Len=0 TSval=15712 TSecr=870318939
>> > >  10   0.003592 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0xe15fc9c9, [Check: RD LU MD XT DL]
>> > >  11   0.004354 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 6),
>> > > [Allowed: RD LU MD XT DL]
>> > >  12   0.004682 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0xe15fc9c9, [Check: RD LU MD XT DL]
>> > >  13   0.005365 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 10),
>> > > [Allowed: RD LU MD XT DL]
>> > >  14   0.005701 armada388 -> n2100 NFS V3 GETATTR Call, FH:
>> > > 0xe15fc9c9
>> > > ...
>> >
>> > NFS people - any comments on this patch?  Is it the correct way to
>> > solve
>> > this problem (please see the first message in this thread for the
>> > problem.)
>> > Without this patch, NFS is unusable as it tries to launch multiple
>> > new
>> > connections from the same port to the NFS server without giving the
>> > NFS
>> > server time to respond and establish the TCP connection.
>>
>> I agree that it addresses a real problem here, however there are a
>> couple of issues with the patch itself:
>>
>> AFAICS, the 2 possible next states for SYN_SENT are TCP_ESTABLISHED and
>> TCP_CLOSE, so if the connection attempt fails, this patch leaves the
>> XPRT_CONNECTING flag set.
>> There is also the issue that clearing XPRT_CONNECTING in TCP_FIN_WAIT1,
>> TCP_CLOSE_WAIT and TCP_CLOSING could interfere with another connection
>> attempt by canceling the XPRT_CONNECTING state.
>>
>> How about the following? It is based on your patch, but adds a check to
>> ensure that xs_tcp_state_change() doesn't clear the 'connecting' state
>> more than once (which could otherwise still happen in the TCP_CLOSE
>> case).
>>
>> 8<---
>> From 4dbfdebbc09982a9248866f8256549456e2b2efd Mon Sep 17 00:00:00 2001
>> From: Trond Myklebust 
>> Date: Wed, 16 Sep 2015 23:43:17 -0400
>> Subject: [PATCH] SUNRPC: Ensure that we wait for connections to complete
>>  before retrying
>>
>> Commit 718ba5b87343, moved the responsibility for unlocking the socket to
>> xs_tcp_setup_socket, meaning that the socket will be unlocked before we
>> know that it has finished trying to connect. The following patch is based on
>> an initial patch by Russell King to ensure that we delay clearing the
>> XPRT_SOCK_CONNECTING flag until we either know that we failed to initiate
>> a connection attempt, or the connection attempt itself failed.
>>
>> Fixes: 718ba5b87343 ("SUNRPC: Add helpers to prevent socket create from racing")
>> Reported-by: Russell King 
>> Signed-off-by: Trond Myklebust 
>
> This fixes up my network segmentation problem, tested on top of your "Fix
> races between socket connection and destroy code".
>
> Tested-by: Benjamin Coddington 
>

Thanks Ben!


Re: NFS/TCP/IPv6 acting strangely in 4.2

2015-09-17 Thread Trond Myklebust
On Thu, Sep 17, 2015 at 5:47 PM, Russell King - ARM Linux
 wrote:
> On Thu, Sep 17, 2015 at 10:18:29AM -0400, Trond Myklebust wrote:
>> Hi Russell,
>>
>> On Thu, 2015-09-17 at 14:57 +0100, Russell King - ARM Linux wrote:
>> > On Fri, Sep 11, 2015 at 05:49:38PM +0100, Russell King - ARM Linux
>> > wrote:
>> > > Following that idea, I just tried the patch below, and it seems to
>> > > work.
>> > > I don't know whether it handles all cases after a call to
>> > > kernel_connect(),
>> > > but it stops the multiple connection attempts:
>> > >
>> > >   1   0.00 armada388 -> n2100 TCP 1009→nfs [SYN] Seq=3794066539
>> > > Win=28560 Len=0 MSS=1440 SACK_PERM=1 TSval=15712 TSecr=870317691
>> > > WS=128
>> > >   2   0.000414 n2100 -> armada388 TCP nfs→1009 [SYN, ACK]
>> > > Seq=1884476522 Ack=3794066540 Win=28560 Len=0 MSS=1440 SACK_PERM=1
>> > > TSval=870318939 TSecr=15712 WS=64
>> > >   3   0.000787 armada388 -> n2100 TCP 1009→nfs [ACK] Seq=3794066540
>> > > Ack=1884476523 Win=28672 Len=0 TSval=15712 TSecr=870318939
>> > >   4   0.001304 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0x905379cc, [Check: RD LU MD XT DL]
>> > >   5   0.001566 n2100 -> armada388 TCP nfs→1009 [ACK] Seq=1884476523
>> > > Ack=379400 Win=28608 Len=0 TSval=870318939 TSecr=15712
>> > >   6   0.001640 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0x905379cc, [Check: RD LU MD XT DL]
>> > >   7   0.001866 n2100 -> armada388 TCP nfs→1009 [ACK] Seq=1884476523
>> > > Ack=3794066780 Win=28608 Len=0 TSval=870318939 TSecr=15712
>> > >   8   0.003070 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 4),
>> > > [Allowed: RD LU MD XT DL]
>> > >   9   0.003415 armada388 -> n2100 TCP 1009→nfs [ACK] Seq=3794066780
>> > > Ack=1884476647 Win=28672 Len=0 TSval=15712 TSecr=870318939
>> > >  10   0.003592 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0xe15fc9c9, [Check: RD LU MD XT DL]
>> > >  11   0.004354 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 6),
>> > > [Allowed: RD LU MD XT DL]
>> > >  12   0.004682 armada388 -> n2100 NFS V3 ACCESS Call, FH:
>> > > 0xe15fc9c9, [Check: RD LU MD XT DL]
>> > >  13   0.005365 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 10),
>> > > [Allowed: RD LU MD XT DL]
>> > >  14   0.005701 armada388 -> n2100 NFS V3 GETATTR Call, FH:
>> > > 0xe15fc9c9
>> > > ...
>> >
>> > NFS people - any comments on this patch?  Is it the correct way to
>> > solve
>> > this problem (please see the first message in this thread for the
>> > problem.)
>> > Without this patch, NFS is unusable as it tries to launch multiple
>> > new
>> > connections from the same port to the NFS server without giving the
>> > NFS
>> > server time to respond and establish the TCP connection.
>>
>> I agree that it addresses a real problem here, however there are a
>> couple of issues with the patch itself:
>>
>> AFAICS, the 2 possible next states for SYN_SENT are TCP_ESTABLISHED and
>> TCP_CLOSE, so if the connection attempt fails, this patch leaves the
>> XPRT_CONNECTING flag set.
>> There is also the issue that clearing XPRT_CONNECTING in TCP_FIN_WAIT1,
>> TCP_CLOSE_WAIT and TCP_CLOSING could interfere with another connection
>> attempt by canceling the XPRT_CONNECTING state.
>>
>> How about the following? It is based on your patch, but adds a check to
>> ensure that xs_tcp_state_change() doesn't clear the 'connecting' state
>> more than once (which could otherwise still happen in the TCP_CLOSE
>> case).
>
> This patch also seems to fix the problem I've been seeing.
>
> Yes, I wasn't sure about my patch - I didn't spend much time properly
> reading and understanding the sunrpc code, beyond analysing what was
> going on to cause the problem and deciding on a way to stop it happening.
> I really wasn't sure that clearing the connecting flag everywhere I did
> was the right thing, which is why I didn't send the patch properly
> dressed up.
>
>> 8<---
>> From 4dbfdebbc09982a9248866f8256549456e2b2efd Mon Sep 17 00:00:00 2001
>> From: Trond Myklebust 
>> Date: Wed, 16 Sep 2015 23:43:17 -0400
>> Subject: [PATCH] SUNRPC: Ensure that we wait for connections to complete
>>  before retrying
>>
>> Commit 718ba5b87343, moved the responsibility for unlocking the socket to
>> xs_tcp_setup_socket, meaning that the socket will be unlocked before we
>> know that it has finished trying to connect. The following patch is based on
>> an initial patch by Russell King to ensure that we delay clearing the
>> XPRT_SOCK_CONNECTING flag until we either know that we failed to initiate
>> a connection attempt, or the connection attempt itself failed.
>>
>> Fixes: 718ba5b87343 ("SUNRPC: Add helpers to prevent socket create from racing")
>> Reported-by: Russell King 
>
> Reported-by: Russell King 
> Tested-by: Russell King 
>

Thanks Russell!

Re: [PATCH] geneve: restore vlan bits in xmit path

2015-09-17 Thread Pravin Shelar
On Thu, Sep 17, 2015 at 1:15 PM, John W. Linville
 wrote:
> On Thu, Sep 17, 2015 at 12:48:56PM -0700, Jesse Gross wrote:
>> On Thu, Sep 17, 2015 at 12:25 PM, John W. Linville
>>  wrote:
>> > On Thu, Sep 17, 2015 at 11:45:58AM -0700, Pravin Shelar wrote:
>> >> On Thu, Sep 17, 2015 at 10:18 AM, John W. Linville
>> >>  wrote:
>> >> > These seem to have been accidentally dropped in commit 371bd1061d29
>> >> > ("geneve: Consolidate Geneve functionality in single module.").
>> >> >
>> >> Geneve should not export vxlan feature. So that it never sees vxlan
>> >> tagged packets. Can you turn off the vlan feature?
>> >
>> > I'm not sure I understand...?  This is vlan, not vxlan.
>>
>> I think he just meant vlan. If you remove the line where
>> dev->vlan_features are set then the core stack will handle this and we
>> don't need to do anything special here.
>
> Is that preferable to this patch?  Tunneling vlan-tagged frames
> seems weird, but I would hate to disallow it if some crazy person
> wanted to do that...
>

To support vlan-offload in geneve we end up doing the same thing the
networking stack does in its software fallback for vlan-offload. So by not
setting the feature, we can avoid the duplicate code in the geneve module.

> I guess the other way would slightly improve performance, and this
> could be added back later.  What about the VLAN-related bits in
> dev->features and ->hw_features?  Should they go as well?
>
No need to set the vlan feature in any of the device features.


Re: [PATCH net] ipv6: include NLM_F_REPLACE in route replace notifications

2015-09-17 Thread David Miller
From: Roopa Prabhu 
Date: Sun, 13 Sep 2015 10:18:33 -0700

> From: Roopa Prabhu 
> 
> This patch adds NLM_F_REPLACE flag to ipv6 route replace notifications.
> This makes nlm_flags in ipv6 replace notifications consistent
> with ipv4.
> 
> Signed-off-by: Roopa Prabhu 
> ---
> Submitting this to net since it complements the other ipv6 replace fixes
> in net

Applied, thanks Roopa.


e1000_hw_c_checkpatch_coding_style_errors_remove

2015-09-17 Thread Janusz Wolak
From 4ac9fd87e092f58eb7a6ed898360dfd83c5c10f5 Mon Sep 17 00:00:00 2001
From: Janusz Wolak 
Date: Thu, 17 Sep 2015 23:34:29 +0200
Subject: [PATCH] Remove checkpatch coding style errors.

Signed-off-by: Janusz Wolak 
---
 drivers/net/ethernet/intel/e1000/e1000_hw.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_hw.c b/drivers/net/ethernet/intel/e1000/e1000_hw.c
index 45c8c864..cc1fe40 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_hw.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_hw.c
@@ -683,7 +683,7 @@ static s32 e1000_adjust_serdes_amplitude(struct e1000_hw *hw)
 	}
 
 	ret_val = e1000_read_eeprom(hw, EEPROM_SERDES_AMPLITUDE, 1,
-	&eeprom_data);
+&eeprom_data);
 	if (ret_val) {
 		return ret_val;
 	}
@@ -1652,7 +1652,7 @@ s32 e1000_phy_setup_autoneg(struct e1000_hw *hw)
 		mii_1000t_ctrl_reg = 0;
 	} else {
 		ret_val = e1000_write_phy_reg(hw, PHY_1000T_CTRL,
-		  mii_1000t_ctrl_reg);
+	  mii_1000t_ctrl_reg);
 		if (ret_val)
 			return ret_val;
 	}
@@ -2193,8 +2193,7 @@ static s32 e1000_config_fc_after_link_up(struct e1000_hw *hw)
 			else if (!(mii_nway_adv_reg & NWAY_AR_PAUSE) &&
  (mii_nway_adv_reg & NWAY_AR_ASM_DIR) &&
  (mii_nway_lp_ability_reg & NWAY_LPAR_PAUSE) &&
- (mii_nway_lp_ability_reg & NWAY_LPAR_ASM_DIR))
-			{
+ (mii_nway_lp_ability_reg & NWAY_LPAR_ASM_DIR)) {
 hw->fc = E1000_FC_TX_PAUSE;
 e_dbg
 ("Flow Control = TX PAUSE frames only.\n");
@@ -2210,8 +2209,7 @@ static s32 e1000_config_fc_after_link_up(struct e1000_hw *hw)
 			else if ((mii_nway_adv_reg & NWAY_AR_PAUSE) &&
  (mii_nway_adv_reg & NWAY_AR_ASM_DIR) &&
  !(mii_nway_lp_ability_reg & NWAY_LPAR_PAUSE) &&
- (mii_nway_lp_ability_reg & NWAY_LPAR_ASM_DIR))
-			{
+ (mii_nway_lp_ability_reg & NWAY_LPAR_ASM_DIR)) {
 hw->fc = E1000_FC_RX_PAUSE;
 e_dbg
 ("Flow Control = RX PAUSE frames only.\n");
@@ -3449,7 +3447,7 @@ s32 e1000_phy_get_info(struct e1000_hw *hw, struct e1000_phy_info *phy_info)
 	if (hw->phy_type == e1000_phy_igp)
 		return e1000_phy_igp_get_info(hw, phy_info);
 	else if ((hw->phy_type == e1000_phy_8211) ||
-	 (hw->phy_type == e1000_phy_8201))
+		 (hw->phy_type == e1000_phy_8201))
 		return E1000_SUCCESS;
 	else
 		return e1000_phy_m88_get_info(hw, phy_info);
@@ -3896,7 +3894,7 @@ static s32 e1000_do_read_eeprom(struct e1000_hw *hw, u16 offset, u16 words,
 
 	if (hw->mac_type == e1000_ce4100) {
 		GBE_CONFIG_FLASH_READ(GBE_CONFIG_BASE_VIRT, offset, words,
-		  data);
+  data);
 		return E1000_SUCCESS;
 	}
 
@@ -4070,7 +4068,7 @@ static s32 e1000_do_write_eeprom(struct e1000_hw *hw, u16 offset, u16 words,
 
 	if (hw->mac_type == e1000_ce4100) {
 		GBE_CONFIG_FLASH_WRITE(GBE_CONFIG_BASE_VIRT, offset, words,
-		   data);
+   data);
 		return E1000_SUCCESS;
 	}
 
-- 
1.9.1



Re: [PATCH v2 1/3] net: irda: pxaficp_ir: use sched_clock() for time management

2015-09-17 Thread David Miller
From: Robert Jarzmik 
Date: Wed, 16 Sep 2015 11:34:01 +0200

> David Miller  writes:
> 
>> From: Robert Jarzmik 
>> Date: Sat, 12 Sep 2015 13:45:22 +0200
>>
>>> Instead of using directly the OS timer through direct register access,
>>> use the standard sched_clock(), which will end up in OSCR reading
>>> anyway.
>>> 
>>> This is a first step for direct access register removal and machine
>>> specific code removal from this driver.
>>> 
>>> Signed-off-by: Robert Jarzmik 
>>
>> What is the granularity of the OSCR register?
> It's 307ns (ie. 3.25MHz clock).
> 
>> If it is not nanoseconds, then you need to adjust calculations
>> such as this one:
> Tell me if the 307ns requires something I should adjust.
> 
> My understanding is that the flow will be :
>  sched_clock()
>    rd->read_sched_clock() (cyc_to_ns() transformed for return)
>      pxa_read_sched_clock()
>        readl_relaxed(OSCR)
> 
> I didn't see any timing issues, as the flow looks equivalent to the
> readl(OSCR), but I might have overlooked something.

Of course it's different, because sched_clock() converts the value read
from OSCR into nanoseconds, which is obviously different from using the
OSCR register value directly.

You're therefore feeding different values into this IRDA code.
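
To make the unit difference concrete, here is a minimal sketch of the
arithmetic, assuming the 3.25MHz OSCR rate quoted above. The helper names
are made up for illustration; this is not the kernel's cyc_to_ns() code,
only the same conversion written out in plain integer math.

```c
#include <stdint.h>

/* Assumed OSCR tick rate, taken from the 3.25MHz figure quoted above. */
#define OSCR_HZ      3250000ULL
#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: convert a raw OSCR tick delta to nanoseconds. */
static uint64_t oscr_ticks_to_ns(uint64_t ticks)
{
	return ticks * NSEC_PER_SEC / OSCR_HZ;
}

/* Hypothetical helper: convert a nanosecond delta back to OSCR ticks. */
static uint64_t ns_to_oscr_ticks(uint64_t ns)
{
	return ns * OSCR_HZ / NSEC_PER_SEC;
}
```

Under that assumption a 1ms interval is 3250 raw ticks but 1000000 when
measured with sched_clock(), so any constant or comparison that was written
in OSCR ticks would need rescaling.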


Re: NFS/TCP/IPv6 acting strangely in 4.2

2015-09-17 Thread Russell King - ARM Linux
On Thu, Sep 17, 2015 at 10:18:29AM -0400, Trond Myklebust wrote:
> Hi Russell,
> 
> On Thu, 2015-09-17 at 14:57 +0100, Russell King - ARM Linux wrote:
> > On Fri, Sep 11, 2015 at 05:49:38PM +0100, Russell King - ARM Linux
> > wrote:
> > > Following that idea, I just tried the patch below, and it seems to
> > > work.
> > > I don't know whether it handles all cases after a call to
> > > kernel_connect(),
> > > but it stops the multiple connection attempts:
> > > 
> > >   1   0.00 armada388 -> n2100 TCP 1009→nfs [SYN] Seq=3794066539
> > > Win=28560 Len=0 MSS=1440 SACK_PERM=1 TSval=15712 TSecr=870317691
> > > WS=128
> > >   2   0.000414 n2100 -> armada388 TCP nfs→1009 [SYN, ACK]
> > > Seq=1884476522 Ack=3794066540 Win=28560 Len=0 MSS=1440 SACK_PERM=1
> > > TSval=870318939 TSecr=15712 WS=64
> > >   3   0.000787 armada388 -> n2100 TCP 1009→nfs [ACK] Seq=3794066540
> > > Ack=1884476523 Win=28672 Len=0 TSval=15712 TSecr=870318939
> > >   4   0.001304 armada388 -> n2100 NFS V3 ACCESS Call, FH:
> > > 0x905379cc, [Check: RD LU MD XT DL]
> > >   5   0.001566 n2100 -> armada388 TCP nfs→1009 [ACK] Seq=1884476523
> > > Ack=379400 Win=28608 Len=0 TSval=870318939 TSecr=15712
> > >   6   0.001640 armada388 -> n2100 NFS V3 ACCESS Call, FH:
> > > 0x905379cc, [Check: RD LU MD XT DL]
> > >   7   0.001866 n2100 -> armada388 TCP nfs→1009 [ACK] Seq=1884476523
> > > Ack=3794066780 Win=28608 Len=0 TSval=870318939 TSecr=15712
> > >   8   0.003070 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 4),
> > > [Allowed: RD LU MD XT DL]
> > >   9   0.003415 armada388 -> n2100 TCP 1009→nfs [ACK] Seq=3794066780
> > > Ack=1884476647 Win=28672 Len=0 TSval=15712 TSecr=870318939
> > >  10   0.003592 armada388 -> n2100 NFS V3 ACCESS Call, FH:
> > > 0xe15fc9c9, [Check: RD LU MD XT DL]
> > >  11   0.004354 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 6),
> > > [Allowed: RD LU MD XT DL]
> > >  12   0.004682 armada388 -> n2100 NFS V3 ACCESS Call, FH:
> > > 0xe15fc9c9, [Check: RD LU MD XT DL]
> > >  13   0.005365 n2100 -> armada388 NFS V3 ACCESS Reply (Call In 10),
> > > [Allowed: RD LU MD XT DL]
> > >  14   0.005701 armada388 -> n2100 NFS V3 GETATTR Call, FH:
> > > 0xe15fc9c9
> > > ...
> > 
> > NFS people - any comments on this patch?  Is it the correct way to
> > solve
> > this problem (please see the first message in this thread for the
> > problem.)
> > Without this patch, NFS is unusable as it tries to launch multiple
> > new
> > connections from the same port to the NFS server without giving the
> > NFS
> > server time to respond and establish the TCP connection.
> 
> I agree that it addresses a real problem here, however there are a
> couple of issues with the patch itself:
> 
> AFAICS, the 2 possible next states for SYN_SENT are TCP_ESTABLISHED and
> TCP_CLOSE, so if the connection attempt fails, this patch leaves the
> XPRT_CONNECTING flag set.
> There is also the issue that clearing XPRT_CONNECTING in TCP_FIN_WAIT1,
> TCP_CLOSE_WAIT and TCP_CLOSING could interfere with another connection
> attempt by canceling the XPRT_CONNECTING state.
> 
> How about the following? It is based on your patch, but adds a check to
> ensure that xs_tcp_state_change() doesn't clear the 'connecting' state
> more than once (which could otherwise still happen in the TCP_CLOSE
> case).

This patch also seems to fix the problem I've been seeing.

Yes, I wasn't sure about my patch - I didn't spend much time properly
reading and understanding the sunrpc code, beyond analysing what was
going on to cause the problem and deciding on a way to stop it happening.
I really wasn't sure that clearing the connecting flag everywhere I did
was the right thing, which is why I didn't send the patch properly
dressed up.

> 8<---
> From 4dbfdebbc09982a9248866f8256549456e2b2efd Mon Sep 17 00:00:00 2001
> From: Trond Myklebust 
> Date: Wed, 16 Sep 2015 23:43:17 -0400
> Subject: [PATCH] SUNRPC: Ensure that we wait for connections to complete
>  before retrying
> 
> Commit 718ba5b87343, moved the responsibility for unlocking the socket to
> xs_tcp_setup_socket, meaning that the socket will be unlocked before we
> know that it has finished trying to connect. The following patch is based on
> an initial patch by Russell King to ensure that we delay clearing the
> XPRT_SOCK_CONNECTING flag until we either know that we failed to initiate
> a connection attempt, or the connection attempt itself failed.
> 
> Fixes: 718ba5b87343 ("SUNRPC: Add helpers to prevent socket create from racing")
> Reported-by: Russell King 

Reported-by: Russell King 
Tested-by: Russell King 

Thanks.

> Signed-off-by: Trond Myklebust 
> ---
>  include/linux/sunrpc/xprtsock.h |  3 +++
>  net/sunrpc/xprtsock.c   | 11 ---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/sunrpc/xprtsock.h b/include/linux/sunrpc/xprtsock.h
> index 7591788e9fbf..357e44c1a46b 100644
> --- a/include/linux/sunrpc/xpr

RE: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-V

2015-09-17 Thread KY Srinivasan


> -Original Message-
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Thursday, September 17, 2015 1:11 PM
> To: KY Srinivasan 
> Cc: david.lai...@aculab.com; alexander.du...@gmail.com; Haiyang Zhang
> ; vkuzn...@redhat.com; netdev@vger.kernel.org;
> linux-ker...@vger.kernel.org; jasow...@redhat.com
> Subject: Re: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-V
> 
> From: KY Srinivasan 
> Date: Thu, 17 Sep 2015 19:52:01 +
> 
> >
> >
> >> -Original Message-
> >> Have a pre-cooked ring of buffers for these descriptors that you can
> >> point the chip at.  No per-packet allocation is necessary at all.
> >
> > Even if I had a ring of buffers, I would still need to manage the life cycle
> > of these buffers - selecting an unused one on the transmit path and marking
> > it used (atomically).
> 
> Have one per TX ring entry, then the lifetime matches the lifetime of the
> TX entry itself and therefore you need do nothing.

Yes, I understand. Unfortunately, the ring buffer used on Hyper-V to send
packets to the host is not managed as a traditional ring of TX entries: it is
not fixed size and a given packet can wrap around. Lastly, I think the
management of space on the ring buffer is not tied to the act of completing
the send operation. That is why we have an explicit "send complete" message.

I am working on moving the model to more closely match the hardware model, but
it will take some time. For now, I will implement a very lightweight mechanism
for managing the additional memory needed.

Regards,

K. Y
> 
> That's the whole idea.
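
For reference, the "one buffer per TX ring entry" scheme suggested in the
quoted text can be sketched as below. This is only an illustration of a
conventional fixed-size ring; the struct, the sizes and the function names
are all assumptions, and it does not describe the Hyper-V ring buffer, which
as noted above is not fixed size.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE 256   /* assumed ring size */
#define HDR_BUF_SIZE 128   /* assumed per-packet header space */

/* Hypothetical fixed-size TX ring with one pre-allocated buffer per entry. */
struct tx_ring {
	uint8_t hdr_buf[TX_RING_SIZE][HDR_BUF_SIZE];
	unsigned int head;  /* next entry to hand to the device */
	unsigned int tail;  /* oldest entry not yet completed */
};

/* Claim the next TX entry; returns its header buffer, or NULL if full. */
static uint8_t *tx_ring_claim(struct tx_ring *r)
{
	if (r->head - r->tail == TX_RING_SIZE)
		return NULL;
	return r->hdr_buf[r->head++ % TX_RING_SIZE];
}

/* Completing the oldest entry implicitly releases its buffer. */
static void tx_ring_complete(struct tx_ring *r)
{
	if (r->head != r->tail)
		r->tail++;
}
```

The buffer is selected by the ring index alone, so there is no separate
"in use" flag to update atomically; the buffer is busy exactly as long as
its TX entry is.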



pull-request: can-next 2015-09-17

2015-09-17 Thread Marc Kleine-Budde
Hello David,

this is a pull request of two patches for net-next/master.

Gerhard Bertelsmann adds support for the CAN controller found on the
Allwinner A10/A20 SoC.

Marc

---

The following changes since commit 37d2dbcdcca88e392009d7cbe8617d5af0ebcb32:

  net: fix cdc-phonet.c dependency and build error (2015-09-16 11:51:19 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git tags/linux-can-next-for-4.4-20150917

for you to fetch changes up to 0738eff14d817a02ab082c392c96a1613006f158:

  can: Allwinner A10/A20 CAN Controller support - Kernel module (2015-09-17 22:39:08 +0200)


linux-can-next-for-4.4-20150917


Gerhard Bertelsmann (2):
  can: Allwinner A10/A20 CAN Controller support - Devicetree bindings
  can: Allwinner A10/A20 CAN Controller support - Kernel module

 .../devicetree/bindings/net/can/sun4i_can.txt  |  36 +
 drivers/net/can/Kconfig                        |  10 +
 drivers/net/can/Makefile                       |   1 +
 drivers/net/can/sun4i_can.c                    | 857 +
 4 files changed, 904 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/can/sun4i_can.txt
 create mode 100644 drivers/net/can/sun4i_can.c

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |





Re: [PATCH] net: smc91x: convert pxa dma to dmaengine

2015-09-17 Thread Russell King - ARM Linux
On Thu, Sep 17, 2015 at 01:37:22PM -0700, David Miller wrote:
> From: Robert Jarzmik 
> Date: Wed, 16 Sep 2015 11:41:54 +0200
> 
> > David Miller  writes:
> > 
> >> From: Robert Jarzmik 
> >> Date: Thu, 10 Sep 2015 21:26:04 +0200
> >>
> >>> Convert the dma transfers to be dmaengine based, now pxa has a dmaengine
> >>> slave driver. This makes this driver a bit more PXA agnostic.
> >>> 
> >>> The driver was tested on pxa27x (mainstone) and pxa310 (zylonite),
> >>> ie. only pxa platforms.
> >>> 
> >>> Signed-off-by: Robert Jarzmik 
> >>> Cc: Russell King 
> >>> Cc: Arnd Bergmann 
> >>> ---
> >>> This has potential to break other platform such as Neponset, Idp,
> >>> halibut and qsd8x50, so I added Russell and Arnd as they were discussing
> >>> smc91x support last February.
> >>
> > 
> >> Is someone testing whether such platforms break or not?  I'm waiting for
> >> that before I consider applying this patch.
> > 
> > My understanding is that Russell is the only one left testing them, or at
> > least he was the only one complaining about a breakage lately on neponset.
> > 
> > I can wait several weeks for Russell to have a bit of time to try : I know
> > it will compile correctly at least for neponset, and I know almost all the
> > code is under #ifdef CONFIG_ARCH_PXA. And still I would feel far more
> > comfortable if it was tested, just as you.
> 
> Oh well, I've waited long enough; patch applied, thanks.

Well, I'm unlikely to get around to testing on the neponset any time
in the next month, and I don't think there's much that would go wrong
as a result of these changes: neponset doesn't use any of this DMA
code.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v8 0/4] can: Allwinner A10/A20 CAN Controller support - Summary

2015-09-17 Thread Marc Kleine-Budde
On 09/16/2015 01:21 PM, Gerhard Bertelsmann wrote:
> Hi,
> 
> please find attached the next version of my patch set. I have 
> taken all remarks from Maxime Ripard into the new version
> 
> Please review, test and report bugs if exists.
> 
> The patchset applies to all recent Kernel versions (4.x, next etc.).
> 
> [PATCH v8 1/4] Device Tree Binding Documentation
> [PATCH v8 2/4] Defconfig multi_v7
> [PATCH v8 3/4] Defconfig sunxi
> [PATCH v8 4/4] Kernel Module

Applied 1 and 4 with Maxime's Ack to linux-can-next.

Thanks,
Marc

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |





Re: [PATCH] iplink_geneve: add UDP destination port configuration at link creation

2015-09-17 Thread Eric Dumazet
On Thu, 2015-09-17 at 15:27 -0400, John W. Linville wrote:
> Signed-off-by: John W. Linville 
> ---

>  }
>  
> @@ -150,6 +159,10 @@ static void geneve_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
>   else
>   fprintf(f, "tos %#x ", tos);
>   }
> +
> + if (tb[IFLA_GENEVE_PORT])
> + fprintf(f, "dstport %u ",
> + ntohs(rta_getattr_u16(tb[IFLA_GENEVE_PORT])));

This looks strange.

Kernel does :

if (nla_put_u16(skb, IFLA_GENEVE_PORT, ntohs(geneve->dst_port)))
goto nla_put_failure;
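
The mismatch can be sketched in userspace as follows. This is a minimal
model, not iproute2 or kernel code: the three function names are invented
for illustration, and only ntohs()/htons() are real library calls.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Kernel side, as quoted above: the big-endian port is converted to host
 * order before it is placed in the netlink attribute. */
static uint16_t kernel_fill_attr(uint16_t dst_port_be)
{
	return ntohs(dst_port_be);
}

/* Userspace as posted: a second ntohs() on a value already in host order. */
static uint16_t user_print_posted(uint16_t attr)
{
	return ntohs(attr);
}

/* Userspace matching the kernel: use the attribute value directly. */
static uint16_t user_print_direct(uint16_t attr)
{
	return attr;
}
```

On a little-endian host the posted variant byte-swaps the value a second
time, so port 6081 (0x17c1) would be printed as 49431 (0xc117). On a
big-endian host both variants agree, which is how this kind of bug tends to
go unnoticed.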





[PATCH v2] geneve: remove use of internal IP header when calling IP_ECN_decapsulate

2015-09-17 Thread John W. Linville
This seems to have been a "thinko".  IP_ECN_decapsulate needs info
from both internal and external headers.

Signed-off-by: John W. Linville 
---
v2 -- ensure the collect_md path still calls IP_ECN_decapsulate

 drivers/net/geneve.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index da3259ce7c8d..549febac0579 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -126,6 +126,8 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
__be32 addr;
int err;
 
+   iph = ip_hdr(skb); /* outer IP header... */
+
if (gs->collect_md) {
static u8 zero_vni[3];
 
@@ -133,7 +135,6 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
addr = 0;
} else {
vni = gnvh->vni;
-   iph = ip_hdr(skb); /* Still outer IP header... */
addr = iph->saddr;
}
 
@@ -178,7 +179,6 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
 
skb_reset_network_header(skb);
 
-   iph = ip_hdr(skb); /* Now inner IP header... */
err = IP_ECN_decapsulate(iph, skb);
 
if (unlikely(err)) {
-- 
2.4.3
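
As background on why both headers are needed, the decision made at
decapsulation time can be modelled roughly as below. This is a simplified
userspace sketch in the spirit of RFC 6040, not the source of
IP_ECN_decapsulate(); the function name and the ECN_* macros are
illustrative stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

/* ECN codepoints: the low two bits of the IPv4 TOS byte (RFC 3168). */
#define ECN_NOT_ECT 0x0
#define ECN_ECT_1   0x1
#define ECN_ECT_0   0x2
#define ECN_CE      0x3
#define ECN_MASK    0x3

/*
 * Simplified model of tunnel ECN decapsulation.
 * Returns 0 to accept, 1 to accept but flag as suspicious, 2 to drop.
 * May rewrite *inner_tos to carry a congestion mark from the outer header.
 */
static int model_ecn_decapsulate(uint8_t outer_tos, uint8_t *inner_tos)
{
	uint8_t outer = outer_tos & ECN_MASK;
	uint8_t inner = *inner_tos & ECN_MASK;

	if (inner == ECN_NOT_ECT) {
		/* Inner flow cannot carry a mark: CE on the outer means drop. */
		if (outer == ECN_CE)
			return 2;
		return outer == ECN_NOT_ECT ? 0 : 1;
	}
	/* Inner flow is ECN-capable: propagate congestion from the outer. */
	if (outer == ECN_CE)
		*inner_tos |= ECN_CE;
	return 0;
}
```

If the "outer" argument is actually read from the inner header, as happened
before this patch, the outer CE mark is never seen and congestion signalled
inside the tunnel is lost.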



Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared

2015-09-17 Thread Francois Romieu
David Woodhouse  :
> On Thu, 2015-09-17 at 12:36 +0100, David Woodhouse wrote:
> > 
> > Thanks; I'll try that. In fact since updating to 4.2 the problem has
> > got worse — now the whole machine dies:
> 
> There is something very strange going on here. I've found two ways to
> make it stop crashing when cp_tx_timeout() hits the 'popf' when
> unlocking the spinlock.

cp_tx_timeout() takes the lock, disables IRQs and calls cp_clean_rings(),
which uses plain dev_kfree_skb() if an skb is still referenced in one of
the rx/tx rings. You may replace it with dev_kfree_skb_any().
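
For readers unfamiliar with the distinction, the choice that
dev_kfree_skb_any() makes can be modelled in userspace roughly as follows.
This is only a sketch: the struct and the model_* names are invented for
illustration, and the real helper inspects the execution context itself
rather than fields passed in by the caller.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the execution context the kernel inspects. */
struct model_ctx {
	bool in_hardirq;     /* running in a hard interrupt handler */
	bool irqs_disabled;  /* local interrupts are off, e.g. under an irqsave lock */
};

enum model_free_path {
	MODEL_FREE_NOW,      /* what plain dev_kfree_skb() does */
	MODEL_FREE_DEFERRED, /* what dev_kfree_skb_irq() does: queue for softirq */
};

/* Models the decision dev_kfree_skb_any() makes on behalf of the caller. */
static enum model_free_path model_kfree_skb_any(const struct model_ctx *ctx)
{
	if (ctx->in_hardirq || ctx->irqs_disabled)
		return MODEL_FREE_DEFERRED;
	return MODEL_FREE_NOW;
}
```

In the tx timeout path described above the lock is held with IRQs disabled,
so the model takes the deferred path, whereas plain dev_kfree_skb() would
try to free the skb immediately in that context.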

-- 
Ueimor


Re: [PATCH] net: smc91x: convert pxa dma to dmaengine

2015-09-17 Thread David Miller
From: Robert Jarzmik 
Date: Wed, 16 Sep 2015 11:41:54 +0200

> David Miller  writes:
> 
>> From: Robert Jarzmik 
>> Date: Thu, 10 Sep 2015 21:26:04 +0200
>>
>>> Convert the dma transfers to be dmaengine based, now pxa has a dmaengine
>>> slave driver. This makes this driver a bit more PXA agnostic.
>>> 
>>> The driver was tested on pxa27x (mainstone) and pxa310 (zylonite),
>>> ie. only pxa platforms.
>>> 
>>> Signed-off-by: Robert Jarzmik 
>>> Cc: Russell King 
>>> Cc: Arnd Bergmann 
>>> ---
>>> This has potential to break other platform such as Neponset, Idp,
>>> halibut and qsd8x50, so I added Russell and Arnd as they were discussing
>>> smc91x support last February.
>>
> 
>> Is someone testing whether such platforms break or not?  I'm waiting for
>> that before I consider applying this patch.
> 
> My understanding is that Russell is the only one left testing them, or at
> least he was the only one complaining about a breakage lately on neponset.
> 
> I can wait several weeks for Russell to have a bit of time to try : I know it
> will compile correctly at least for neponset, and I know almost all the code
> is under #ifdef CONFIG_ARCH_PXA. And still I would feel far more comfortable
> if it was tested, just as you.

Oh well, I've waited long enough; patch applied, thanks.


Re: [PATCH] geneve: remove use of internal IP header when calling IP_ECN_decapsulate

2015-09-17 Thread John W. Linville
On Thu, Sep 17, 2015 at 12:46:48PM -0700, Jesse Gross wrote:
> On Thu, Sep 17, 2015 at 10:17 AM, John W. Linville
>  wrote:
> > diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> > index da3259ce7c8d..a917ae1cfbf3 100644
> > --- a/drivers/net/geneve.c
> > +++ b/drivers/net/geneve.c
> > @@ -178,13 +178,15 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
> >
> > skb_reset_network_header(skb);
> >
> > -   iph = ip_hdr(skb); /* Now inner IP header... */
> > -   err = IP_ECN_decapsulate(iph, skb);
> > +   if (iph)
> > +   err = IP_ECN_decapsulate(iph, skb);
> 
> It looks like this is now conditional based on !collect_md. I'm not
> sure that we want to have a difference in behavior between the two.

Sure, I can move the iph assignment higher up and keep the other bits
unconditional.

John
-- 
John W. Linville                Someday the world will need a hero, and you
linvi...@tuxdriver.com          might be all we have.  Be ready.


Re: [PATCH] geneve: restore vlan bits in xmit path

2015-09-17 Thread John W. Linville
On Thu, Sep 17, 2015 at 12:48:56PM -0700, Jesse Gross wrote:
> On Thu, Sep 17, 2015 at 12:25 PM, John W. Linville
>  wrote:
> > On Thu, Sep 17, 2015 at 11:45:58AM -0700, Pravin Shelar wrote:
> >> On Thu, Sep 17, 2015 at 10:18 AM, John W. Linville
> >>  wrote:
> >> > These seem to have been accidentally dropped in commit 371bd1061d29
> >> > ("geneve: Consolidate Geneve functionality in single module.").
> >> >
> >> Geneve should not export vxlan feature. So that it never sees vxlan
> >> tagged packets. Can you turn off the vlan feature?
> >
> > I'm not sure I understand...?  This is vlan, not vxlan.
> 
> I think he just means vlan. If you remove the line where
> dev->vlan_features are set then the core stack will handle this and we
> don't need to do anything special here.

Is that preferable to this patch?  Tunneling vlan-tagged frames
seems weird, but I would hate to disallow it if some crazy person
wanted to do that...

I guess the other way would slightly improve performance, and this
could be added back later.  What about the VLAN-related bits in
dev->features and ->hw_features?  Should they go as well?

John
-- 
John W. Linville                Someday the world will need a hero, and you
linvi...@tuxdriver.com          might be all we have.  Be ready.


Re: Experiences with slub bulk use-case for network stack

2015-09-17 Thread Jesper Dangaard Brouer
On Wed, 16 Sep 2015 10:13:25 -0500 (CDT)
Christoph Lameter  wrote:

> On Wed, 16 Sep 2015, Jesper Dangaard Brouer wrote:
> 
> >
> > Hint, this leads up to discussing if current bulk *ALLOC* API need to
> > be changed...
> >
> > Alex and I have been working hard on practical use-case for SLAB
> > bulking (mostly slUb), in the network stack.  Here is a summary of
> > what we have learned so far.
> 
> SLAB refers to the SLAB allocator which is one slab allocator and SLUB is
> another slab allocator.
> 
> Please keep that consistent otherwise things get confusing

This naming scheme is really confusing.  I'll try to be more
consistent.  So, you want capital letters SLAB and SLUB when talking
about a specific slab allocator implementation.


> > Bulk free'ing SKBs during TX completion is a big and easy win.
> >
> > Specifically for slUb, normal path for freeing these objects (which
> > are not on c->freelist) require a locked double_cmpxchg per object.
> > The bulk free (via detached freelist patch) allow to free all objects
> > belonging to the same slab-page, to be free'ed with a single locked
> > double_cmpxchg. Thus, the bulk free speedup is quite an improvement.
> 
> Yep.
> 
> > Alex and I had the idea of bulk alloc returns an "allocator specific
> > cache" data-structure (and we add some helpers to access this).
> 
> Maybe add some Macros to handle this?

Yes, helpers will likely turn out to be macros.


> > In the slUb case, the freelist is a single linked pointer list.  In
> > the network stack the skb objects have a skb->next pointer, which is
> > located at the same position as freelist pointer.  Thus, simply
> > returning the freelist directly, could be interpreted as a skb-list.
> > The helper API would then do the prefetching, when pulling out
> > objects.
> 
> The problem with the SLUB case is that the objects must be on the same
> slab page.

Yes, I'm aware of that; it is exactly what we are trying to take advantage of.


> > For the slUb case, we would simply cmpxchg either c->freelist or
> > page->freelist with a NULL ptr, and then own all objects on the
> > freelist. This also reduce the time we keep IRQs disabled.
> 
> You dont need to disable interrupts for the cmpxchges. There is
> additional state in the page struct though so the updates must be
> done carefully.

Yes, I'm aware that cmpxchg does not need interrupts disabled.  And I
plan to take advantage of this in the new approach for bulk alloc.

Our current bulk alloc disables interrupts for the full period (of
collecting the requested number of objects).

What I'm proposing is keeping interrupts on, and then simply cmpxchg
e.g. 2 slab-pages out of the SLUB allocator (which the SLUB code calls
freelists). The bulk call now owns these freelists, and returns them
to the caller.  The API caller gets some helpers/macros to access
objects, to shield him from the details (of SLUB freelists).

The pitfall with this API is we don't know how many objects are on a
SLUB freelist.  And we cannot walk the freelist and count them, because
then we hit the problem of memory/cache stalls (that we are trying so
hard to avoid).
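
To make the proposal above concrete, here is a minimal userspace sketch
of "detach the whole freelist with one atomic operation, then hand it to
the caller together with an access helper".  It is only a model, written
with C11 atomics: struct obj, struct freelist and all function names are
hypothetical, and the real SLUB code additionally has to keep the extra
per-page state (counters, frozen bit) consistent, which this sketch
ignores.  __builtin_prefetch is a GCC/Clang builtin.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object: the first word doubles as the freelist link,
 * mirroring how skb->next overlays the SLUB freelist pointer. */
struct obj {
	struct obj *next;
	int payload;
};

/* Userspace stand-in for a per-page freelist head. */
struct freelist {
	_Atomic(struct obj *) head;
};

static void freelist_init(struct freelist *fl)
{
	atomic_init(&fl->head, (struct obj *)NULL);
}

/* Free one object: lock-free push with a compare-and-swap loop. */
static void freelist_push(struct freelist *fl, struct obj *o)
{
	struct obj *old = atomic_load(&fl->head);

	do {
		o->next = old;
	} while (!atomic_compare_exchange_weak(&fl->head, &old, o));
}

/* Bulk "alloc": swap the head with NULL in one atomic step.  The caller
 * then owns every object on the returned chain; no IRQ disabling and no
 * per-object atomic operation is involved. */
static struct obj *freelist_detach_all(struct freelist *fl)
{
	return atomic_exchange(&fl->head, (struct obj *)NULL);
}

/* Access helper handed to the API user: take one object off a detached
 * chain and prefetch the following one to hide the cache miss. */
static struct obj *bulk_pop(struct obj **list)
{
	struct obj *o = *list;

	if (!o)
		return NULL;
	*list = o->next;
	if (*list)
		__builtin_prefetch(*list);
	o->next = NULL;
	return o;
}
```

Note that the sketch shows the pitfall as well: freelist_detach_all()
cannot tell the caller how many objects it got.  The count only becomes
known by popping until bulk_pop() returns NULL.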

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


Re: iproute2 tunnel name parsing

2015-09-17 Thread Vadim Kochan
On Thu, Sep 17, 2015 at 09:55:29PM +0200, Wilhelm Wijkander wrote:
> Hi,
> 
> I'm trying to create a sit tunnel called "hel": ip tun add hel mode
> sit remote 10.200.0.2 local 10.200.1.2 ttl 255, however it seems like
> this is interpreted as the help argument and I get the usage text. Is
> there a way to escape names that I've missed, or is this an error
> somewhere in argv parsing?
> 
> (I'm not subscribed, so a cc would be appreciated)
> Thanks,
> Wilhelm

Hi Wilhelm,

You can use 'name' before 'hel' like:

$ ip tun add name hel mode sit remote 10.200.0.2 local 10.200.1.2 ttl 255

and it should work, actually I just tried and it works.

Regards,
Vadim Kochan
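
For readers wondering why "hel" is taken as "help": iproute2 accepts
abbreviated keywords by checking whether the argument is a prefix of the
keyword.  The sketch below models that behaviour in plain C.  It is a
simplified illustration, not the iproute2 source: prefix_matches() is
modelled on the prefix-matching helper iproute2 uses, parse_tunnel_name()
is a hypothetical stand-in for the tunnel argument parser, and it only
handles the "name" and "help" cases.

```c
#include <string.h>

/* Return 0 when arg is a (possibly complete) prefix of keyword,
 * non-zero otherwise. */
static int prefix_matches(const char *arg, const char *keyword)
{
	size_t len = strlen(arg);

	if (len > strlen(keyword))
		return -1;
	return memcmp(keyword, arg, len);
}

/* Hypothetical, reduced model of the tunnel argument parser.
 * Returns the tunnel name, or NULL when usage/help would be printed. */
static const char *parse_tunnel_name(int argc, const char **argv)
{
	const char *name = NULL;
	int i;

	for (i = 0; i < argc; i++) {
		if (strcmp(argv[i], "name") == 0) {
			/* An explicit "name" keyword: the next argument is
			 * taken literally, no prefix matching is applied. */
			if (++i >= argc)
				return NULL;
			name = argv[i];
		} else if (prefix_matches(argv[i], "help") == 0) {
			/* "h", "he", "hel" and "help" all end up here. */
			return NULL;
		} else if (!name) {
			/* A bare first argument is treated as the name. */
			name = argv[i];
		}
	}
	return name;
}
```

With this model, the argument list { "hel", "mode", "sit" } yields NULL
(usage text), while { "name", "hel", "mode", "sit" } yields "hel", which
matches the behaviour described in this thread.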


Re: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-V

2015-09-17 Thread David Miller
From: KY Srinivasan 
Date: Thu, 17 Sep 2015 19:52:01 +

> 
> 
>> -----Original Message-----
>> Have a pre-cooked ring of buffers for these descriptors that you can
>> point the chip at.  No per-packet allocation is necessary at all.
> 
> Even if I had a ring of buffers, I would still need to manage the life cycle
> of these buffers - selecting an unused one on the transmit path and marking
> it used (atomically).

Have one per TX ring entry, then the lifetime matches the lifetime of the
TX entry itself and therefore you need do nothing.

That's the whole idea.
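
A minimal userspace sketch of that idea follows, assuming a single
producer, a single consumer and in-order completion.  Every name in it
(struct tx_ring, tx_ring_queue(), the sizes) is hypothetical and it is
not taken from the Hyper-V driver.  The point it illustrates is that the
descriptor buffer is selected purely by the ring slot index, so there is
no separate used/free bookkeeping for the buffers.

```c
#include <stdint.h>
#include <string.h>

#define TX_RING_SIZE  8		/* hypothetical; must be a power of two */
#define DESC_BUF_SIZE 64	/* hypothetical per-packet descriptor size */

struct tx_entry {
	const void *pkt;	/* packet queued in this slot, NULL if idle */
};

struct tx_ring {
	struct tx_entry entry[TX_RING_SIZE];
	/* One pre-allocated descriptor buffer per ring entry. */
	uint8_t desc[TX_RING_SIZE][DESC_BUF_SIZE];
	unsigned int head;	/* next slot to fill */
	unsigned int tail;	/* next slot to complete */
};

static int tx_ring_full(const struct tx_ring *r)
{
	return r->head - r->tail == TX_RING_SIZE;
}

/* Queue a packet.  Returns the descriptor buffer that belongs to the
 * chosen slot, or NULL when the ring is full.  No allocation happens. */
static uint8_t *tx_ring_queue(struct tx_ring *r, const void *pkt)
{
	unsigned int slot;

	if (tx_ring_full(r))
		return NULL;
	slot = r->head++ & (TX_RING_SIZE - 1);
	r->entry[slot].pkt = pkt;
	memset(r->desc[slot], 0, DESC_BUF_SIZE);
	return r->desc[slot];
}

/* Complete the oldest packet.  Retiring the slot is what makes its
 * descriptor buffer reusable; nothing is freed explicitly. */
static const void *tx_ring_complete(struct tx_ring *r)
{
	unsigned int slot;
	const void *pkt;

	if (r->head == r->tail)
		return NULL;
	slot = r->tail++ & (TX_RING_SIZE - 1);
	pkt = r->entry[slot].pkt;
	r->entry[slot].pkt = NULL;
	return pkt;
}
```

If completions can arrive out of order, the completion would have to
carry the slot index instead of relying on the tail counter; the
one-buffer-per-entry mapping itself stays the same.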


Re: [PATCH] geneve: restore vlan bits in xmit path

2015-09-17 Thread Pravin Shelar
On Thu, Sep 17, 2015 at 12:48 PM, Jesse Gross  wrote:
> On Thu, Sep 17, 2015 at 12:25 PM, John W. Linville
>  wrote:
>> On Thu, Sep 17, 2015 at 11:45:58AM -0700, Pravin Shelar wrote:
>>> On Thu, Sep 17, 2015 at 10:18 AM, John W. Linville
>>>  wrote:
>>> > These seem to have been accidentally dropped in commit 371bd1061d29
>>> > ("geneve: Consolidate Geneve functionality in single module.").
>>> >
>>> Geneve should not export vxlan feature. So that it never sees vxlan
>>> tagged packets. Can you turn off the vlan feature?
>>
>> I'm not sure I understand...?  This is vlan, not vxlan.
>
> I think he just means vlan. If you remove the line where
> dev->vlan_features are set then the core stack will handle this and we
> don't need to do anything special here.

Yes, I meant vlan, sorry for confusion.


iproute2 tunnel name parsing

2015-09-17 Thread Wilhelm Wijkander
Hi,

I'm trying to create a sit tunnel called "hel": ip tun add hel mode
sit remote 10.200.0.2 local 10.200.1.2 ttl 255, however it seems like
this is interpreted as the help argument and I get the usage text. Is
there a way to escape names that I've missed, or is this an error
somewhere in argv parsing?

(I'm not subscribed, so a cc would be appreciated)
Thanks,
Wilhelm


RE: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-V

2015-09-17 Thread KY Srinivasan


> -----Original Message-----
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Thursday, September 17, 2015 11:52 AM
> To: KY Srinivasan 
> Cc: david.lai...@aculab.com; alexander.du...@gmail.com; Haiyang Zhang
> ; vkuzn...@redhat.com; netdev@vger.kernel.org;
> linux-ker...@vger.kernel.org; jasow...@redhat.com
> Subject: Re: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-V
> 
> From: KY Srinivasan 
> Date: Thu, 17 Sep 2015 15:14:05 +
> 
> > I think I can achieve my original goal of not having any allocation
> > in the send path by carefully using the memory available in the skb:
> 
> Please stop flat-out ignoring David L.'s suggestion.

I am sorry; I did not mean to convey that impression.

> 
> Have a pre-cooked ring of buffers for these descriptors that you can
> point the chip at.  No per-packet allocation is necessary at all.

Even if I had a ring of buffers, I would still need to manage the life cycle
of these buffers - selecting an unused one on the transmit path and marking
it used (atomically). Once the transmit completes (as indicated by the
transmit complete callback) this buffer needs to be marked free. I can
certainly make these operations efficient and lock-free, but they are still
at some level an allocation/free operation, albeit potentially more efficient
than having the kernel allocate the memory.
 
> 
> If you play games with SKBs you will get burned.

I will implement Dave L's suggestion. However, I am curious as to why you
would consider my proposed usage of the skb headroom and the control buffer
area in skb as non-standard usage.

Regards,

K. Y 


  


Re: [PATCH] geneve: restore vlan bits in xmit path

2015-09-17 Thread Jesse Gross
On Thu, Sep 17, 2015 at 12:25 PM, John W. Linville
 wrote:
> On Thu, Sep 17, 2015 at 11:45:58AM -0700, Pravin Shelar wrote:
>> On Thu, Sep 17, 2015 at 10:18 AM, John W. Linville
>>  wrote:
>> > These seem to have been accidentally dropped in commit 371bd1061d29
>> > ("geneve: Consolidate Geneve functionality in single module.").
>> >
>> Geneve should not export vxlan feature. So that it never sees vxlan
>> tagged packets. Can you turn off the vlan feature?
>
> I'm not sure I understand...?  This is vlan, not vxlan.

I think he just means vlan. If you remove the line where
dev->vlan_features are set then the core stack will handle this and we
don't need to do anything special here.


Re: [PATCH] geneve: remove use of internal IP header when calling IP_ECN_decapsulate

2015-09-17 Thread Jesse Gross
On Thu, Sep 17, 2015 at 10:17 AM, John W. Linville
 wrote:
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index da3259ce7c8d..a917ae1cfbf3 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -178,13 +178,15 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
>
> skb_reset_network_header(skb);
>
> -   iph = ip_hdr(skb); /* Now inner IP header... */
> -   err = IP_ECN_decapsulate(iph, skb);
> +   if (iph)
> +   err = IP_ECN_decapsulate(iph, skb);

It looks like this is now conditional based on !collect_md. I'm not
sure that we want to have a difference in behavior between the two.


[ANNOUNCE] nftables 0.5 release

2015-09-17 Thread Pablo Neira Ayuso
Hi!

The Netfilter project proudly presents:

nftables 0.5

This release contains bug fixes and new features contained up to the
4.2 kernel release.

New features
============

* Concatenations: You can combine two or more selectors to build a
  tuple, then use it to look up a match in sets, e.g.

  % nft add rule ip filter input ip saddr . tcp dport { \
        1.1.1.1 . 22 , \
        1.1.1.1 . 80 \
    } counter accept

  So nft will check whether the source IP address AND the TCP destination
  port match an entry in the literal set above; if so, it will
  update the rule counter and accept the packet.

  You can also combine concatenations with verdict maps:

  % nft add rule ip filter input ether saddr . ip saddr . meta iif vmap { \
        3c:71:0e:39:bb:20 . 192.168.1.120 . "wlan0" : accept, \
        3c:77:e0:39:aa:21 . 192.168.1.204 . "wlan0" : drop }

  Instead of a literal map, you can declare a named map keyed by a
  concatenation, so that its content can be updated dynamically:

  % nft add map filter accesslist { \
        type ether_addr . ipv4_addr . iface_index : verdict \; }
  % nft add rule filter input ether saddr . ip saddr . meta iif vmap @accesslist

  Then, add elements to the map:

  % nft add element filter accesslist { \
        3c:71:0e:39:bb:20 . 192.168.1.120 . wlan0 : accept }

  On a different front, you can also combine concatenations with maps:

  % nft add rule ip nat prerouting dnat ip saddr . tcp dport map { \
        192.168.1.120 . 80 : 1.2.3.4, \
        192.168.1.204 . 22 : 4.3.2.1 }

  In the example above, the destination address that is used in DNAT depends
  on the source IP address and the destination port of the packet.

  This concatenation feature requires a Linux kernel >= 4.1 and, of course,
  nftables 0.5.
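
  As a rough illustration of the lookup semantics described above (not
  of how nftables implements sets internally), a concatenation can be
  thought of as a compound key in which every field has to match.  The
  C sketch below models "ip saddr . tcp dport" that way; struct
  concat_key, the IPV4() macro and concat_set_contains() are
  hypothetical names, and the linear search is only for clarity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compound key modelling the concatenation "ip saddr . tcp dport". */
struct concat_key {
	uint32_t saddr;		/* IPv4 source address, host byte order */
	uint16_t dport;		/* TCP destination port */
};

/* Build an IPv4 address from its four dotted-quad components. */
#define IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* A packet matches only if ALL fields of some element are equal. */
static bool concat_set_contains(const struct concat_key *set, size_t n,
				struct concat_key pkt)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (set[i].saddr == pkt.saddr && set[i].dport == pkt.dport)
			return true;
	return false;
}
```

  With the set { 1.1.1.1 . 22, 1.1.1.1 . 80 } from the first example, a
  packet from 1.1.1.1 to port 80 matches, while 1.1.1.1 to port 443 or
  2.2.2.2 to port 22 does not.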

* Add timeout support for sets: You can specify a lifetime for elements in your
  set declarations, eg.

  % nft add set filter whitelist { type ipv4_addr\; timeout 1h\; }
  % nft add element filter whitelist { 192.168.1.234 }
  % nft list ruleset
  table ip filter {
        set whitelist {
                type ipv4_addr
                timeout 1h
                elements = { 192.168.1.234 expires 59m56s }
        }
  }

  You can also create the set with no specific timeout:

  % nft add set filter whitelist { type ipv4_addr\; flags timeout\; }

  So you can indicate the timeout when adding the element:

  % nft add element filter whitelist { 192.168.2.123 timeout 1h }

  You can still mix this with elements that reside permanently:

  % nft add element filter whitelist { 192.168.2.180 }

* Add comments per set element, eg.

  % nft add element filter whitelist { 192.168.0.1 comment \"some host\" }

* Support for mini-gmp: If you're running nft from embedded devices,
  you may want to skip the libgmp dependency via:

  % ./configure --with-mini-gmp

  This compiles nft using the minimal gmp implementation that comes in
  the nftables tarball. Note that your nft binary avoids the libgmp
  dependency at the cost of getting a slightly larger binary.

* Dormant tables: You can disable the entire ruleset that is contained in a
  table by setting the dormant flag:

  % nft add table filter { flags dormant\; }

  You can reenable it by typing:

  % nft add table filter

* Default chain policy: You can specify the default chain policy when you
  create the chain:

  % nft add chain filter input { \
        type filter hook input priority 0\; policy drop\; }

  You can also change it for an existing chain anytime by updating it via:

  % nft add chain filter input { policy accept\; }

Bug fixes
=========

* Command per line ruleset representation: According to what I can find on the
  Internet, it seems some people like to maintain their ruleset in scripts so
  they can add comments and annotate things there. However, this is a problem
  for two reasons: there is no atomic update, since rules are published to the
  packet path one after another, and it significantly increases the time that
  nft takes to reload your ruleset.

  So, the solution to this problem consists of keeping your ruleset like this:

  % cat my-ruleset-file
  flush ruleset
  add table filter
  add set filter whitelist { type ipv4_addr; }
  add chain filter input { type filter hook input priority 0; }
  add rule filter input iif lo accept
  add rule filter input ct state established,related counter accept
  add rule filter input tcp dport { 22, 80 } counter accept
  add rule filter input ip saddr @whitelist counter accept
  add element filter whitelist { 192.168.1.120 }
  add element filter whitelist { 192.168.1.121 }
  add element filter whitelist { 192.168.1.204 }

  You can also insert comments in the file through '#'.

  Then, you can atomically restore it via:

  % nft -f my-ruleset-file

  You can also use this command per line representation to apply
  incremental ruleset updates atomically:

  % cat incremental-ruleset-update

Re: [PATCH] geneve: restore vlan bits in xmit path

2015-09-17 Thread John W. Linville
On Thu, Sep 17, 2015 at 11:45:58AM -0700, Pravin Shelar wrote:
> On Thu, Sep 17, 2015 at 10:18 AM, John W. Linville
>  wrote:
> > These seem to have been accidentally dropped in commit 371bd1061d29
> > ("geneve: Consolidate Geneve functionality in single module.").
> >
> Geneve should not export vxlan feature. So that it never sees vxlan
> tagged packets. Can you turn off the vlan feature?

I'm not sure I understand...?  This is vlan, not vxlan.

John
-- 
John W. Linville                Someday the world will need a hero, and you
linvi...@tuxdriver.com          might be all we have.  Be ready.

