Re: [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers

2016-03-10 Thread Jesper Dangaard Brouer

On Thu, 10 Mar 2016 20:21:55 +0300
Sergei Shtylyov wrote:

> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -801,9 +801,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
> > if (unlikely(!skb))
> > return;
> >
> > -   /* if budget is 0 assume netpoll w/ IRQs disabled */
> > +   /* Zero budget indicate none-NAPI context called us, like netpoll */  
> 
> Non-NAPI?

Okay, I'll send a V4.  Hope there are no more nitpicking changes...
I'll also adjust the subj none-NAPI -> non-NAPI, and hope that does not
disturb patchwork.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
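
The comment under review describes a simple decision: a zero budget means the caller is not running in NAPI context (netpoll, for example). The following is only a user-space sketch of that decision, not the kernel function itself; the enum and the function name pick_free_path are hypothetical and exist purely for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ways a buffer can be released. */
enum free_path {
	FREE_NOTHING,     /* NULL buffer, nothing to do */
	FREE_ANY_CONTEXT, /* safe from any context, e.g. netpoll */
	FREE_NAPI_BULK    /* deferred free, valid only inside a NAPI poll */
};

/* Models the decision only; buf stands in for the skb pointer. */
static enum free_path pick_free_path(const void *buf, int budget)
{
	if (buf == NULL)
		return FREE_NOTHING;
	/* A zero budget signals a non-NAPI caller, such as netpoll. */
	if (budget == 0)
		return FREE_ANY_CONTEXT;
	return FREE_NAPI_BULK;
}
```

A driver cleanup routine would pass its poll budget through unchanged, so that a netpoll invocation with budget 0 automatically takes the context-safe path.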


[PATCHv2 (net.git) 1/2] Revert "stmmac: Fix 'eth0: No PHY found' regression"

2016-03-10 Thread Giuseppe Cavallaro
This reverts commit 88f8b1bb41c6208f81b6a480244533ded7b59493
due to problems on the GeekBox and Banana Pi M1 boards when
connected to a real transceiver instead of a switch via
fixed-link.

Signed-off-by: Giuseppe Cavallaro 
Cc: Gabriel Fernandez 
Cc: Andreas Färber 
Cc: Frank Schäfer 
Cc: Dinh Nguyen 
Cc: David S. Miller 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c  |   11 ++-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |9 +
 include/linux/stmmac.h |1 -
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
index efb54f3..0faf163 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
@@ -199,12 +199,21 @@ int stmmac_mdio_register(struct net_device *ndev)
struct stmmac_priv *priv = netdev_priv(ndev);
struct stmmac_mdio_bus_data *mdio_bus_data = priv->plat->mdio_bus_data;
int addr, found;
-   struct device_node *mdio_node = priv->plat->mdio_node;
+   struct device_node *mdio_node = NULL;
+   struct device_node *child_node = NULL;
 
if (!mdio_bus_data)
return 0;
 
if (IS_ENABLED(CONFIG_OF)) {
+   for_each_child_of_node(priv->device->of_node, child_node) {
+   if (of_device_is_compatible(child_node,
+   "snps,dwmac-mdio")) {
+   mdio_node = child_node;
+   break;
+   }
+   }
+
if (mdio_node) {
netdev_dbg(ndev, "FOUND MDIO subnode\n");
} else {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 4514ba7..6a52fa1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -110,7 +110,6 @@ stmmac_probe_config_dt(struct platform_device *pdev, const 
char **mac)
struct device_node *np = pdev->dev.of_node;
struct plat_stmmacenet_data *plat;
struct stmmac_dma_cfg *dma_cfg;
-   struct device_node *child_node = NULL;
 
plat = devm_kzalloc(&pdev->dev, sizeof(*plat), GFP_KERNEL);
if (!plat)
@@ -141,19 +140,13 @@ stmmac_probe_config_dt(struct platform_device *pdev, 
const char **mac)
plat->phy_node = of_node_get(np);
}
 
-   for_each_child_of_node(np, child_node)
-   if (of_device_is_compatible(child_node, "snps,dwmac-mdio")) {
-   plat->mdio_node = child_node;
-   break;
-   }
-
/* "snps,phy-addr" is not a standard property. Mark it as deprecated
 * and warn of its use. Remove this when phy node support is added.
 */
if (of_property_read_u32(np, "snps,phy-addr", &plat->phy_addr) == 0)
dev_warn(&pdev->dev, "snps,phy-addr property is deprecated\n");
 
-   if ((plat->phy_node && !of_phy_is_fixed_link(np)) || !plat->mdio_node)
+   if ((plat->phy_node && !of_phy_is_fixed_link(np)) || plat->phy_bus_name)
plat->mdio_bus_data = NULL;
else
plat->mdio_bus_data =
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index 881a79d..eead8ab 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -100,7 +100,6 @@ struct plat_stmmacenet_data {
int interface;
struct stmmac_mdio_bus_data *mdio_bus_data;
struct device_node *phy_node;
-   struct device_node *mdio_node;
struct stmmac_dma_cfg *dma_cfg;
int clk_csr;
int has_gmac;
-- 
1.7.4.4



[PATCHv2 (net.git) 2/2] stmmac: fix MDIO settings

2016-03-10 Thread Giuseppe Cavallaro
Initially the phy_bus_name was added to manipulate the
driver name, but recently it has only been used to manage the
fixed-link and to take some decisions at run-time
inside the main driver code (for example, to skip EEE).
So the patch uses is_pseudo_fixed_link and removes
the phy_bus_name variable, which is no longer necessary.

The driver can manage the MDIO registration by using phy-handle,
dwmac-mdio and its own parameters, e.g. snps,phy-addr.
This patch takes care of all these possible configurations
and fixes the MDIO registration when there is a real
transceiver or a switch (which needs to be managed by using
fixed-link).

Signed-off-by: Giuseppe Cavallaro 
Reviewed-by: Andreas Färber 
Tested-by: Frank Schäfer 
Cc: Gabriel Fernandez 
Cc: Dinh Nguyen 
Cc: David S. Miller 
---

V2: use is_pseudo_fixed_link

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |   11 +++
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |   14 +-
 include/linux/stmmac.h |1 -
 3 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c21015b..389d7d0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -271,7 +271,6 @@ static void stmmac_eee_ctrl_timer(unsigned long arg)
  */
 bool stmmac_eee_init(struct stmmac_priv *priv)
 {
-   char *phy_bus_name = priv->plat->phy_bus_name;
unsigned long flags;
bool ret = false;
 
@@ -283,7 +282,7 @@ bool stmmac_eee_init(struct stmmac_priv *priv)
goto out;
 
/* Never init EEE in case of a switch is attached */
-   if (phy_bus_name && (!strcmp(phy_bus_name, "fixed")))
+   if (priv->phydev->is_pseudo_fixed_link)
goto out;
 
/* MAC core supports the EEE feature. */
@@ -820,12 +819,8 @@ static int stmmac_init_phy(struct net_device *dev)
phydev = of_phy_connect(dev, priv->plat->phy_node,
_adjust_link, 0, interface);
} else {
-   if (priv->plat->phy_bus_name)
-   snprintf(bus_id, MII_BUS_ID_SIZE, "%s-%x",
-priv->plat->phy_bus_name, priv->plat->bus_id);
-   else
-   snprintf(bus_id, MII_BUS_ID_SIZE, "stmmac-%x",
-priv->plat->bus_id);
+   snprintf(bus_id, MII_BUS_ID_SIZE, "stmmac-%x",
+priv->plat->bus_id);
 
snprintf(phy_id_fmt, MII_BUS_ID_SIZE + 3, PHY_ID_FMT, bus_id,
 priv->plat->phy_addr);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 6a52fa1..ed33920 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -138,7 +138,11 @@ stmmac_probe_config_dt(struct platform_device *pdev, const 
char **mac)
return ERR_PTR(-ENODEV);
 
plat->phy_node = of_node_get(np);
-   }
+   } else
+   plat->mdio_bus_data =
+   devm_kzalloc(&pdev->dev,
+sizeof(struct stmmac_mdio_bus_data),
+GFP_KERNEL);
 
/* "snps,phy-addr" is not a standard property. Mark it as deprecated
 * and warn of its use. Remove this when phy node support is added.
@@ -146,14 +150,6 @@ stmmac_probe_config_dt(struct platform_device *pdev, const 
char **mac)
if (of_property_read_u32(np, "snps,phy-addr", &plat->phy_addr) == 0)
dev_warn(&pdev->dev, "snps,phy-addr property is deprecated\n");
 
-   if ((plat->phy_node && !of_phy_is_fixed_link(np)) || plat->phy_bus_name)
-   plat->mdio_bus_data = NULL;
-   else
-   plat->mdio_bus_data =
-   devm_kzalloc(&pdev->dev,
-sizeof(struct stmmac_mdio_bus_data),
-GFP_KERNEL);
-
of_property_read_u32(np, "tx-fifo-depth", &plat->tx_fifo_size);
 
of_property_read_u32(np, "rx-fifo-depth", &plat->rx_fifo_size);
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index eead8ab..1b4884c 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -94,7 +94,6 @@ struct stmmac_dma_cfg {
 };
 
 struct plat_stmmacenet_data {
-   char *phy_bus_name;
int bus_id;
int phy_addr;
int interface;
-- 
1.7.4.4



[PATCHv2 (net.git) 0/2] stmmac: MDIO fixes

2016-03-10 Thread Giuseppe Cavallaro
These two patches fix the recent regressions seen
when testing the stmmac on some platforms, caused by broken MDIO
management.

V2: use is_pseudo_fixed_link

Giuseppe Cavallaro (2):
  Revert "stmmac: Fix 'eth0: No PHY found' regression"
  stmmac: fix MDIO settings

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |   11 ++---
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c  |   11 +-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |   21 ---
 include/linux/stmmac.h |2 -
 4 files changed, 18 insertions(+), 27 deletions(-)

-- 
1.7.4.4



RE: [PATCH v2] can: rcar_canfd: Add Renesas R-Car CAN FD driver

2016-03-10 Thread Ramesh Shanmugasundaram
Hi Oliver, Marc,

> On 03/08/2016 01:48 PM, Ramesh Shanmugasundaram wrote:
> 
> >> In fact you provided a CAN driver which is "CAN-FD-only".
> >
> > Yes. That's the status of current submission.

(...)

> >
> > I did try this option earlier but there are two problems with this
> method.
> >
> > 1) Below configuration is not possible
> >
> > ip link set can0 up type can bitrate 100 dbitrate 100 fd on
> >
> > "fd on" -> This is not allowed because CAN_CTRLMODE_FD bit is not set in
> ctrlmode_supported.
> >
> > 2) If I ignore "fd on", my interface MTU stays as CAN_MTU only. If I
> have to change the MTU alone to CANFD_MTU using another netlink message,
> it again checks ctrlmode_supported where it would fail. I have the option
> of providing my own change_mtu function & ignore this check but two
> configuration messages are required for my driver alone :-(.
> >
> > Both these anomalies are addressed with the current check I have.
> 
> Oh - you are right with complaining about this inconsistency.
> 
> Can you check my RFC patch for Linux stable I just sent on the mailing
> list?
> http://marc.info/?l=linux-can&m=145745724917976&w=2

As we are fixing this issue in CAN dev.c, I'll remove this check in ndo_open 
and set CAN_CTRLMODE_FD flag in ctrlmode & remove the flag in 
ctrlmode_supported in the next v3 version of the patch.

Are there any further comments on v2 patch please?

Thanks,
Ramesh


RE: [PATCH net v2 0/2] qlcnic fixes

2016-03-10 Thread Rajesh Borundia
>-Original Message-
>From: David Miller [mailto:da...@davemloft.net]
>Sent: Friday, March 11, 2016 2:47 AM
>To: Rajesh Borundia 
>Cc: netdev ; Dept-GE Linux NIC Dev gelinuxnic...@qlogic.com>
>Subject: Re: [PATCH net v2 0/2] qlcnic fixes
>
>From: Rajesh Borundia 
>Date: Tue, 8 Mar 2016 02:39:56 -0500
>
>> This series adds following fixes.
>>
>> o While processing mailbox if driver gets a spurious mailbox
>>   interrupt it leads into premature completion of a next
>>   mailbox request. Added a guard against this by checking current
>>   state of mailbox and ignored spurious interrupt.
>>   Added a stats counter to record this condition.
>>
>> v2:
>>
>> o Added patch that removes usage of atomic_t as we are not implementing
>>   atomicity by using atomic_t value.
>>
>> Please apply these fixes to net.
>
>As explained in other list postings, 'net' is basically closed for this 
>release cycle,
>so I applied this series to 'net-next'.
>

Thanks.

>Let me know if you'd like me to therefore queue these changes up for -stable.
>
Please queue the changes for stable.

>Thanks.


[PATCH] kcm: fix variable type

2016-03-10 Thread Andrzej Hajda
The function skb_splice_bits can return negative values, so its result
should be assigned to a signed variable to allow correct error checking.

The problem was detected using the Coccinelle semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci.

Signed-off-by: Andrzej Hajda 
---
 net/kcm/kcmsock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 40662d73..0b68ba7 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1483,7 +1483,7 @@ static ssize_t kcm_splice_read(struct socket *sock, 
loff_t *ppos,
long timeo;
struct kcm_rx_msg *rxm;
int err = 0;
-   size_t copied;
+   ssize_t copied;
struct sk_buff *skb;
 
/* Only support splice for SOCKSEQPACKET */
-- 
1.9.1
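
The reason the variable type matters can be shown in user space. This is a minimal sketch, assuming a hypothetical helper fake_splice that stands in for a call returning a negative errno on failure; none of these names come from the kcm code.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a call that fails with a negative errno. */
static ssize_t fake_splice(int fail)
{
	return fail ? -11 : 128; /* -11 stands in for -EAGAIN */
}

/* Signed variable: the error check works as intended. */
static int splice_failed_signed(int fail)
{
	ssize_t copied = fake_splice(fail);

	return copied < 0;
}

/*
 * Unsigned variable: the negative value wraps to a huge count, and a
 * "copied < 0" test on it could never be true.
 */
static size_t as_unsigned_count(int fail)
{
	size_t copied = (size_t)fake_splice(fail);

	return copied;
}
```

With the unsigned variable, the failure is indistinguishable from a very large successful transfer, which is the class of bug the semantic patch looks for.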



[PATCH net v2] r8169: Remove unnecessary phy reset for pcie nic when setting link speed.

2016-03-10 Thread Chunhao Lin
For a PCIe NIC, after setting the link speed while there is no link, the driver
does not need to reset the PHY until the link comes up.

For some PCIe NICs, doing so also resets the PHY speed-down counter and
prevents the PHY from automatically reducing its speed.

This patch fixes the issue reported at the following link.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1547151

Signed-off-by: Chunhao Lin 
---
 drivers/net/ethernet/realtek/r8169.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index dd2cf37..94f08f1 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1999,7 +1999,8 @@ static int rtl8169_set_speed(struct net_device *dev,
goto out;
 
if (netif_running(dev) && (autoneg == AUTONEG_ENABLE) &&
-   (advertising & ADVERTISED_1000baseT_Full)) {
+   (advertising & ADVERTISED_1000baseT_Full) &&
+   !pci_is_pcie(tp->pci_dev)) {
mod_timer(&tp->timer, jiffies + RTL8169_PHY_TIMEOUT);
}
 out:
-- 
1.9.1



[PATCH v3 net-next 1/2] net: hns: fix return value of the function about rss

2016-03-10 Thread Kejian Yan
Both .get_rxfh and .set_rxfh always return 0; they should return the result
from the hardware when getting or setting RSS. The RSS functions should also
return the correct data type.

Signed-off-by: Kejian Yan 
---
change log:
PATCH v3:
 - This patch removes unused variable 'ret' to fix the build warning

PATCH v2:
 - This patch fixes the comments provided by Andy Shevchenko 

 Link: https://lkml.org/lkml/2016/3/10/266

PATCH v1:
 - first submit

 Link: https://lkml.org/lkml/2016/3/9/978
---
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c  | 14 --
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
index d4f92ed..d07db1f 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
@@ -799,7 +799,7 @@ static int hns_ae_set_rss(struct hnae_handle *handle, const 
u32 *indir,
 
/* set the RSS Hash Key if specififed by the user */
if (key)
-   hns_ppe_set_rss_key(ppe_cb, (int *)key);
+   hns_ppe_set_rss_key(ppe_cb, (u32 *)key);
 
/* update the shadow RSS table with user specified qids */
memcpy(ppe_cb->rss_indir_table, indir, HNS_PPEV2_RSS_IND_TBL_SIZE);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
index f302ef9..811ef35 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
@@ -27,7 +27,7 @@ void hns_ppe_set_tso_enable(struct hns_ppe_cb *ppe_cb, u32 
value)
 void hns_ppe_set_rss_key(struct hns_ppe_cb *ppe_cb,
 const u32 rss_key[HNS_PPEV2_RSS_KEY_NUM])
 {
-   int key_item = 0;
+   u32 key_item = 0;
 
for (key_item = 0; key_item < HNS_PPEV2_RSS_KEY_NUM; key_item++)
dsaf_write_dev(ppe_cb, PPEV2_RSS_KEY_REG + key_item * 0x4,
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
index 3c4a3bc..01b65eb 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
@@ -1178,7 +1178,7 @@ hns_get_rss_key_size(struct net_device *netdev)
if (AE_IS_VER1(priv->enet_ver)) {
netdev_err(netdev,
   "RSS feature is not supported on this hardware\n");
-   return -EOPNOTSUPP;
+   return (u32)-EOPNOTSUPP;
}
 
ops = priv->ae_handle->dev->ops;
@@ -1197,7 +1197,7 @@ hns_get_rss_indir_size(struct net_device *netdev)
if (AE_IS_VER1(priv->enet_ver)) {
netdev_err(netdev,
   "RSS feature is not supported on this hardware\n");
-   return -EOPNOTSUPP;
+   return (u32)-EOPNOTSUPP;
}
 
ops = priv->ae_handle->dev->ops;
@@ -1211,7 +1211,6 @@ hns_get_rss(struct net_device *netdev, u32 *indir, u8 
*key, u8 *hfunc)
 {
struct hns_nic_priv *priv = netdev_priv(netdev);
struct hnae_ae_ops *ops;
-   int ret;
 
if (AE_IS_VER1(priv->enet_ver)) {
netdev_err(netdev,
@@ -1224,9 +1223,7 @@ hns_get_rss(struct net_device *netdev, u32 *indir, u8 
*key, u8 *hfunc)
if (!indir)
return 0;
 
-   ret = ops->get_rss(priv->ae_handle, indir, key, hfunc);
-
-   return 0;
+   return ops->get_rss(priv->ae_handle, indir, key, hfunc);
 }
 
 static int
@@ -1235,7 +1232,6 @@ hns_set_rss(struct net_device *netdev, const u32 *indir, 
const u8 *key,
 {
struct hns_nic_priv *priv = netdev_priv(netdev);
struct hnae_ae_ops *ops;
-   int ret;
 
if (AE_IS_VER1(priv->enet_ver)) {
netdev_err(netdev,
@@ -1252,9 +1248,7 @@ hns_set_rss(struct net_device *netdev, const u32 *indir, 
const u8 *key,
if (!indir)
return 0;
 
-   ret = ops->set_rss(priv->ae_handle, indir, key, hfunc);
-
-   return 0;
+   return ops->set_rss(priv->ae_handle, indir, key, hfunc);
 }
 
 static struct ethtool_ops hns_ethtool_ops = {
-- 
1.9.1



[PATCH v3 net-next 0/2] net: hns: get and set RSS indirection table by using ethtool

2016-03-10 Thread Kejian Yan
When we use ethtool to retrieve or configure the receive flow hash
indirection table, ethtool needs to call .get_rxnfc to get the ring number,
so this patchset implements .get_rxnfc and fixes the bug that we cannot
get the whole table each time.

---
change log:
PATCH v3:
 - This patchset fixes the building warning and error

PATCH v2:
 - This patchset fixes the comments provided by Andy Shevchenko 

PATCH v1:
 - first submit

Kejian Yan (2):
  net: hns: fix return value of the function about rss
  net: hns: fixes a bug of RSS

 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c |  8 ---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c  | 28 ---
 3 files changed, 26 insertions(+), 12 deletions(-)

-- 
1.9.1



[PATCH v3 net-next 2/2] net: hns: fixes a bug of RSS

2016-03-10 Thread Kejian Yan
To get the receive flow hash indirection table, ethtool needs
to call .get_rxnfc to get the ring number first, so this patch implements
.get_rxnfc for ethtool. Also, the data type of rss_indir_table is u32, so the
entry count has to be multiplied by the width of the data type when using memcpy.

Signed-off-by: Kejian Yan 
---
change log:
PATCH v3:
 - This patch modifies the return value of .get_rxnfc to fix building error

PATCH v2:
 - This patch fixes the comments provided by Andy Shevchenko 

 Link: https://lkml.org/lkml/2016/3/10/267

PATCH v1:
 - first submit

 Link: https://lkml.org/lkml/2016/3/9/981
---
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c |  6 --
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c  | 18 ++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
index d07db1f..7b06e9b 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
@@ -787,7 +787,8 @@ static int hns_ae_get_rss(struct hnae_handle *handle, u32 
*indir, u8 *key,
memcpy(key, ppe_cb->rss_key, HNS_PPEV2_RSS_KEY_SIZE);
 
/* update the current hash->queue mappings from the shadow RSS table */
-   memcpy(indir, ppe_cb->rss_indir_table, HNS_PPEV2_RSS_IND_TBL_SIZE);
+   memcpy(indir, ppe_cb->rss_indir_table,
+  HNS_PPEV2_RSS_IND_TBL_SIZE * sizeof(*indir));
 
return 0;
 }
@@ -802,7 +803,8 @@ static int hns_ae_set_rss(struct hnae_handle *handle, const 
u32 *indir,
hns_ppe_set_rss_key(ppe_cb, (u32 *)key);
 
/* update the shadow RSS table with user specified qids */
-   memcpy(ppe_cb->rss_indir_table, indir, HNS_PPEV2_RSS_IND_TBL_SIZE);
+   memcpy(ppe_cb->rss_indir_table, indir,
+  HNS_PPEV2_RSS_IND_TBL_SIZE * sizeof(*indir));
 
/* now update the hardware */
hns_ppe_set_indir_table(ppe_cb, ppe_cb->rss_indir_table);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
index 01b65eb..46379ce 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
@@ -1251,6 +1251,23 @@ hns_set_rss(struct net_device *netdev, const u32 *indir, 
const u8 *key,
return ops->set_rss(priv->ae_handle, indir, key, hfunc);
 }
 
+static int hns_get_rxnfc(struct net_device *netdev,
+struct ethtool_rxnfc *cmd,
+u32 *rule_locs)
+{
+   struct hns_nic_priv *priv = netdev_priv(netdev);
+
+   switch (cmd->cmd) {
+   case ETHTOOL_GRXRINGS:
+   cmd->data = priv->ae_handle->q_num;
+   break;
+   default:
+   return -EOPNOTSUPP;
+   }
+
+   return 0;
+}
+
 static struct ethtool_ops hns_ethtool_ops = {
.get_drvinfo = hns_nic_get_drvinfo,
.get_link  = hns_nic_get_link,
@@ -1274,6 +1291,7 @@ static struct ethtool_ops hns_ethtool_ops = {
.get_rxfh_indir_size = hns_get_rss_indir_size,
.get_rxfh = hns_get_rss,
.set_rxfh = hns_set_rss,
+   .get_rxnfc = hns_get_rxnfc,
 };
 
 void hns_ethtool_set_ops(struct net_device *ndev)
-- 
1.9.1
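
The memcpy part of this fix comes down to bytes versus entries. The sketch below is a user-space illustration only; TBL_ENTRIES and both function names are hypothetical, and the table is far smaller than the real one.

```c
#include <stdint.h>
#include <string.h>

#define TBL_ENTRIES 8 /* hypothetical table size, counted in entries */

/* Copies the whole table: entry count times the width of one entry. */
static void copy_table(uint32_t *dst, const uint32_t *src)
{
	memcpy(dst, src, TBL_ENTRIES * sizeof(*dst));
}

/* Buggy variant: passes the entry count where a byte count is expected. */
static void copy_table_short(uint32_t *dst, const uint32_t *src)
{
	memcpy(dst, src, TBL_ENTRIES);
}
```

With 4-byte entries the short variant copies only the first two of eight entries, which matches the symptom described above of never getting the whole table.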



Re: [V9fs-developer] [PATCH] net/9p: convert to new CQ API

2016-03-10 Thread Doug Ledford
On 03/08/2016 09:38 AM, Dominique Martinet wrote:
> Christoph Hellwig wrote on Thu, Mar 03, 2016:
>> New version with the nits fixed below.  Now that checkpatch started
>> a stupid warning about not using tabs for indentation which I've
>> ignored here and will take up in my usual fights against Joe's
>> idiotic opinions separately..
> 
> Thanks for the nitpicks, I can confirm it works as expected as well so
> all good with me.
> I like the new CQ interface :)
> 
> (if someone adds an Acked-by please use dominique.marti...@cea.fr for my
> mail; sorry for the split personality)
> 

Since I haven't heard anyone else say they are picking this up, I've
grabbed it for 4.6.  Thanks.

-- 
Doug Ledford 
  GPG KeyID: 0E572FDD






Re: [ovs-dev] [PATCH v2 net-next] ovs: allow nl 'flow set' to use ufid without flow key

2016-03-10 Thread pravin shelar
On Thu, Mar 10, 2016 at 8:14 AM, Samuel Gauthier
 wrote:
> When we want to change a flow using netlink, we have to identify it to
> be able to perform a lookup. Both the flow key and unique flow ID
> (ufid) are valid identifiers, but we always have to specify the flow
> key in the netlink message. When both attributes are there, the ufid
> is used. The flow key is used to validate the actions provided by
> the userland.
>
> This commit allows to use the ufid without having to provide the flow
> key, as it is already done in the netlink 'flow get' and 'flow del'
> path. The flow key remains mandatory when an action is provided.
>
> Signed-off-by: Samuel Gauthier 
> ---
> v2:
>  - Restore mask init and parsing
>  - Keep the flow key mandatory when an action is provided
>
Looks good.

Acked-by: Pravin B Shelar 


[net-next:master 1158/1168] net/sched/cls_flower.c:222:28: warning: cast from pointer to integer of different size

2016-03-10 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
master
head:   e8ab563f4b2e51849a16d962c6235b81e429c0d7
commit: 5b33f48842fa1e13e9c0ea8cc59c1d0df19042db [1158/1168] net/flower: 
Introduce hardware offload support
config: i386-randconfig-r0-201610 (attached as .config)
reproduce:
git checkout 5b33f48842fa1e13e9c0ea8cc59c1d0df19042db
# save the attached .config to linux build tree
make ARCH=i386 

All warnings (new ones prefixed by >>):

   net/sched/cls_flower.c: In function 'fl_destroy':
>> net/sched/cls_flower.c:222:28: warning: cast from pointer to integer of 
>> different size [-Wpointer-to-int-cast]
  fl_hw_destroy_filter(tp, (u64)f);
   ^
   net/sched/cls_flower.c: In function 'fl_change':
   net/sched/cls_flower.c:557:9: warning: cast from pointer to integer of 
different size [-Wpointer-to-int-cast]
(u64)fnew,
^
   net/sched/cls_flower.c:563:28: warning: cast from pointer to integer of 
different size [-Wpointer-to-int-cast]
  fl_hw_destroy_filter(tp, (u64)fold);
   ^
   net/sched/cls_flower.c: In function 'fl_delete':
   net/sched/cls_flower.c:591:27: warning: cast from pointer to integer of 
different size [-Wpointer-to-int-cast]
 fl_hw_destroy_filter(tp, (u64)f);
  ^

vim +222 net/sched/cls_flower.c

   206  
   207  tc.type = TC_SETUP_CLSFLOWER;
   208  tc.cls_flower = 
   209  
   210  dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol, &tc);
   211  }
   212  
   213  static bool fl_destroy(struct tcf_proto *tp, bool force)
   214  {
   215  struct cls_fl_head *head = rtnl_dereference(tp->root);
   216  struct cls_fl_filter *f, *next;
   217  
   218  if (!force && !list_empty(&head->filters))
   219  return false;
   220  
   221  list_for_each_entry_safe(f, next, &head->filters, list) {
 > 222  fl_hw_destroy_filter(tp, (u64)f);
   223  list_del_rcu(&f->list);
   224  call_rcu(&f->rcu, fl_destroy_filter);
   225  }
   226  RCU_INIT_POINTER(tp->root, NULL);
   227  if (head->mask_assigned)
   228  rhashtable_destroy(&head->ht);
   229  kfree_rcu(head, rcu);
   230  return true;

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation
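
The warning appears on i386 because a pointer is 32 bits wide there while u64 is 64 bits. One common way to avoid it is to cast through an integer type of pointer width first. The following is a user-space sketch using uintptr_t (kernel code would typically use unsigned long); the function names are hypothetical, and this is not presented as the fix that was eventually applied.

```c
#include <stdint.h>

/* Widen a pointer to a 64-bit cookie without a size-mismatch warning. */
static uint64_t ptr_to_cookie(const void *p)
{
	/* uintptr_t has pointer width, so the first cast never changes size. */
	return (uint64_t)(uintptr_t)p;
}

/* Reverse conversion, again going through the pointer-width type. */
static void *cookie_to_ptr(uint64_t cookie)
{
	return (void *)(uintptr_t)cookie;
}
```

The round trip is lossless on both 32-bit and 64-bit builds, since the cookie is at least as wide as the pointer.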




Re: [PATCH net 4/6] net: hns: adds uc match for debug port

2016-03-10 Thread Daode Huang



On 2016/3/4 21:39, Sergei Shtylyov wrote:
> On 3/4/2016 4:09 AM, Daode Huang wrote:
>
>> This patch adds uc match for debug port by:
>> 1)Enables uc match of debug port when initializing gmac
>> 2)Enables uc match of mac address register2
>>
>> Signed-off-by: Daode Huang 
>> Signed-off-by: lipeng 

Lipeng is his full name. I will change it to another style (Peng Li).

>    True/full name is required here.
>
>> ---
>>   drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c | 18 +-
>>   drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h  |  2 ++
>>   2 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
>> index b8517b0..2591a51 100644
>> --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
>> +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
> [...]
>> @@ -407,8 +419,12 @@ static void hns_gmac_set_mac_addr(void *mac_drv, char *mac_addr)
>>   u32 low_val = mac_addr[5] | (mac_addr[4] << 8)
>>   | (mac_addr[3] << 16) | (mac_addr[2] << 24);
>> +
>> +u32 val = dsaf_read_dev(drv, GMAC_STATION_ADDR_HIGH_2_REG);
>> +u32 sta_addr_en = dsaf_get_bit(val, GMAC_ADDR_EN_B);
>
>    Empty line needed after declarations.

Agreed, thanks.
Daode.

>>   dsaf_write_dev(drv, GMAC_STATION_ADDR_LOW_2_REG, low_val);
>> -dsaf_write_dev(drv, GMAC_STATION_ADDR_HIGH_2_REG, high_val);
>> +dsaf_write_dev(drv, GMAC_STATION_ADDR_HIGH_2_REG,
>> +   high_val | (sta_addr_en << GMAC_ADDR_EN_B));
>>   }
>>   }
>
> [...]
>
> MBR, Sergei






Re: [PATCH net 3/6] net: hns: fixed portid bug in sending manage pkt

2016-03-10 Thread Daode Huang



On 2016/3/4 21:37, Sergei Shtylyov wrote:
> Hello.
>
> On 3/4/2016 4:09 AM, Daode Huang wrote:
>
>> In V2 chip, when sending management packets, the driver should
>> config the port id to BD descs.
>>
>> Signed-off-by: Daode Huang 
>> Signed-off-by: Lisheng 
>> ---
>>   drivers/net/ethernet/hisilicon/hns/hnae.h | 3 +++
>>   drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c | 1 +
>>   drivers/net/ethernet/hisilicon/hns/hns_enet.c | 4 
>>   3 files changed, 8 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h b/drivers/net/ethernet/hisilicon/hns/hnae.h
>> index 1cbcb9f..11a3f97 100644
>> --- a/drivers/net/ethernet/hisilicon/hns/hnae.h
>> +++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
> [...]
>> @@ -516,6 +518,7 @@ struct hnae_handle {
>>   int q_num;
>>   int vf_id;
>>   u32 eport_id;
>> +u32 dport_id;/*v2 tx bd should fill the dport_id*/
>
>    Please add spaces after /* and before */ (like it's done in other
> places in this driver).

Hi MBR, Sergei,
Thanks for your comments,
I will change it in the next version.
Daode.

> [...]
>> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
>> index 6250a42..b45dcc2 100644
>> --- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
>> +++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
>> @@ -69,6 +69,10 @@ static void fill_v2_desc(struct hnae_ring *ring, void *priv,
>>   hnae_set_bit(rrcfv, HNSV2_TXD_VLD_B, 1);
>>   hnae_set_field(bn_pid, HNSV2_TXD_BUFNUM_M, 0, buf_num - 1);
>>
>> +/*fill port_id in the tx bd for sending management pkts*/
>
>    Likewise.
>
>> +hnae_set_field(bn_pid, HNSV2_TXD_PORTID_M,
>> +   HNSV2_TXD_PORTID_S, ring->q->handle->dport_id);
>> +
>>   if (type == DESC_TYPE_SKB) {
>>   skb = (struct sk_buff *)priv;
>
> MBR, Sergei






Re: [PATCH v2 net-next] ovs: allow nl 'flow set' to use ufid without flow key

2016-03-10 Thread Simon Horman
On Thu, Mar 10, 2016 at 05:14:59PM +0100, Samuel Gauthier wrote:
> When we want to change a flow using netlink, we have to identify it to
> be able to perform a lookup. Both the flow key and unique flow ID
> (ufid) are valid identifiers, but we always have to specify the flow
> key in the netlink message. When both attributes are there, the ufid
> is used. The flow key is used to validate the actions provided by
> the userland.
> 
> This commit allows to use the ufid without having to provide the flow
> key, as it is already done in the netlink 'flow get' and 'flow del'
> path. The flow key remains mandatory when an action is provided.
> 
> Signed-off-by: Samuel Gauthier 

Reviewed-by: Simon Horman 


Re: [PATCH net-next v5] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-10 Thread Eric Dumazet
On Thu, Mar 10, 2016 at 10:46 AM, Martin KaFai Lau  wrote:
> Per RFC4898, they count segments sent/received
> containing a positive length data segment (that includes
> retransmission segments carrying data).  Unlike
> tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
> carrying no data (e.g. pure ack).

Acked-by: Eric Dumazet 

Thanks.
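
The distinction between the two counters in the quoted description can be sketched as follows. This is a simplified per-segment model in user space, assuming a hypothetical struct seg_counters and function account_segment; it ignores details such as segmentation offload and is not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters, named after the fields in the description. */
struct seg_counters {
	uint32_t segs_out;      /* every segment sent */
	uint32_t data_segs_out; /* only segments with a non-empty payload */
};

static void account_segment(struct seg_counters *c, size_t payload_len)
{
	c->segs_out++;
	/* Pure ACKs carry no data; retransmissions with data still count. */
	if (payload_len > 0)
		c->data_segs_out++;
}
```

Sending two data segments and one pure ACK would leave segs_out at 3 and data_segs_out at 2.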


Re: [PATCH nf-next v10 8/8] openvswitch: Interface with NAT.

2016-03-10 Thread Joe Stringer
On 11 March 2016 at 07:54, Jarno Rajahalme  wrote:
> Extend OVS conntrack interface to cover NAT.  New nested
> OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
> A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
> If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
> attributes, new (non-committed/non-confirmed) connections are mangled
> according to the rest of the nested attributes.
>
> The corresponding OVS userspace patch series includes test cases (in
> tests/system-traffic.at) that also serve as example uses.
>
> This work extends on a branch by Thomas Graf at
> https://github.com/tgraf/ovs/tree/nat.
>
> Signed-off-by: Jarno Rajahalme 
> Acked-by: Thomas Graf 

Acked-by: Joe Stringer 


Re: Micrel Phy - Is there a way to configure the Phy not to do 802.3x flow control?

2016-03-10 Thread Murali Karicheri
On 03/10/2016 02:38 PM, Murali Karicheri wrote:
> On 03/10/2016 01:05 PM, Florian Fainelli wrote:
>> On 10/03/16 08:48, Murali Karicheri wrote:
>>> On 03/03/2016 07:16 PM, Florian Fainelli wrote:
>>>> On 03/03/16 14:18, Murali Karicheri wrote:
>>>>> Hi,
>>>>>
>>>>> We are using Micrel Phy in one of our board and wondering if we can force
>>>>> the Phy to disable flow control at start. I have a 1G ethernet switch
>>>>> connected to Phy and the phy always enable flow control. I would like to
>>>>> configure the phy not to flow control. Is that possible and if yes, what
>>>>> should I do in the my Ethernet driver to tell the Phy not to enable flow
>>>>> control?
>>>>
>>>> The PHY is not doing flow control per-se, your pseudo Ethernet MAC in
>>>> the switch is doing, along with the link partner advertising support for
>>>> it. You would want to make sure that your PHY device interface (provided
>>>> that you are using the PHY library) is not starting with Pause
>>>> advertised, but it could be supported.
>>>
>>> Understood that Phy is just advertise FC. The Micrel phy for 9031 advertise
>>> by default FC supported. After negotiation, I see that Phylib provide the
>>> link status with parameter pause = 1, asym_pause = 1. How do I tell the Phy
>>> not to advertise?
>>>
>>> I call following sequence in the Ethernet driver.
>>>
>>> of_phy_connect(x,y,hndlr,a,z);
>>
>> Here you should be able to change phydev->advertising and
>> phydev->supported to mask the ADVERTISED_Pause | ADVERTISED_AsymPause
>> bits and have phy_start() restart with that which should disable pause
>> and asym_pause as seen by your adjust_link handler.
>>
> Ok. Good point. I will try this. Thanks for your suggestion.
> 

I made the following changes. The phylib still reports flow control enabled to
the driver. Is there some bug in the phylib/phydev?

+
+       printk("slave->phy->supported %x, slave->phy->advertising %x\n",
+              slave->phy->supported, slave->phy->advertising);
+       slave->phy->supported &=
+               ~(SUPPORTED_Pause | SUPPORTED_Asym_Pause);
+       slave->phy->advertising = slave->phy->supported;
+       printk("slave->phy->supported %x, slave->phy->advertising %x\n",
+              slave->phy->supported, slave->phy->advertising);
        phy_start(slave->phy);
+       printk("slave->phy->supported %x, slave->phy->advertising %x\n",
+              slave->phy->supported, slave->phy->advertising);
        phy_read_status(slave->phy);


[   10.757001] slave->phy->supported 22ff, slave->phy->advertising 22ff
[   10.763354] slave->phy->supported 2ff, slave->phy->advertising 2ff
[   10.769552] slave->phy->supported 2ff, slave->phy->advertising 2ff
[   10.776045] netcp-1.0 2620110.netcp eth0: Link is Down
udhcpc (v1.23.1) started
Sending discover...
Sending discover...
[   14.757280] netcp-1.0 2620110.netcp eth0: Link is Up - 1Gbps/Full - flow control rx/tx
Sending discover...
Sending select for 158.218.103.170...
Lease of 158.218.103.170 obtained, lease time 28800
/etc/udhcpc.d/50default: Adding DNS 192.0.2.2
/etc/udhcpc.d/50default: Adding DNS 192.0.2.3
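
The bit arithmetic in the log above can be checked in isolation. What follows is a minimal user-space sketch, assuming the SUPPORTED_Pause and SUPPORTED_Asym_Pause values from include/uapi/linux/ethtool.h (bits 13 and 14). mask_pause_bits() is a hypothetical helper for illustration, not a phylib function.

```c
#include <stdint.h>

/* Assumed to match include/uapi/linux/ethtool.h */
#define SUPPORTED_Pause       (1u << 13)
#define SUPPORTED_Asym_Pause  (1u << 14)

/* Hypothetical helper: clear both pause bits from a link-mode mask. */
static uint32_t mask_pause_bits(uint32_t modes)
{
    return modes & ~(SUPPORTED_Pause | SUPPORTED_Asym_Pause);
}
```

With 0x22ff as input the result is 0x2ff, which matches the second printk line, so the local masks appear to have been updated as intended. This does not by itself explain why the link still comes up with flow control reported.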


> Murali
>>> phy_start()
>>>
>>> Now in hndlr() I have pause = 1, asym_pause = 1, in phy_device ptr. How can 
>>> I tell the phy not to advertise initially?
> 
> 


-- 
Murali Karicheri
Linux Kernel, Keystone


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cyrill Gorcunov
On Thu, Mar 10, 2016 at 05:36:30PM -0500, David Miller wrote:
> > 
> > Works like a charm! So David, what are the next steps then?
> > Mind to gather all your patches into one (maybe)?
> 
> I'll re-review all of the changes tomorrow and also look into ipv6
> masq, to see if it needs the same treatment, as well.
> 
> Thanks for all of your help and testing so far.

Thanks a lot, David!


Re: [PATCH 0/2] sh_eth: fix couple of bugs in sh_eth_ring_format()

2016-03-10 Thread David Miller
From: Sergei Shtylyov 
Date: Fri, 11 Mar 2016 01:01:22 +0300

> On 03/11/2016 12:07 AM, David Miller wrote:
> 
>>> Here's a set of 2 patches against DaveM's 'net.git' repo fixing two bugs
>>> in sh_eth_ring_format()...
>>>
>>> [1/2] sh_eth: fix NULL pointer dereference in sh_eth_ring_format()
>>> [2/2] sh_eth: advance 'rxdesc' later in sh_eth_ring_format()
>>
>> Since Linus is likely to release today or otherwise very soon I'm not
>> putting things into 'net'.
>>
>> So I've applied this series to 'net-next', let me know if I should
>> queue it up for stable.
> 
>    If you generally queue the error path fixes, then queue these two
>    please.

Done.


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread David Miller
From: Cyrill Gorcunov 
Date: Fri, 11 Mar 2016 00:59:59 +0300

> On Fri, Mar 11, 2016 at 12:19:45AM +0300, Cyrill Gorcunov wrote:
>> > 
>> > Oh yes they do, from masq's non-inet notifier.  masq registers two
>> > notifiers, one for generic netdev and one for inetdev.
>> 
>> Thanks a huge David! I'll test it just to be sure.
> 
> Works like a charm! So David, what are the next steps then?
> Mind to gather all your patches into one (maybe)?

I'll re-review all of the changes tomorrow and also look into ipv6
masq, to see if it needs the same treatment, as well.

Thanks for all of your help and testing so far.


Re: [PATCH nf-next v10 7/8] openvswitch: Delay conntrack helper call for new connections.

2016-03-10 Thread Jarno Rajahalme
Thanks for the reviews, Joe!

Now we have acks for the patches 3-8, but not for 1 and 2 that touch netfilter 
proper. Who could review those?

  Jarno

> On Mar 10, 2016, at 2:01 PM, Joe Stringer  wrote:
> 
> On 11 March 2016 at 07:54, Jarno Rajahalme  wrote:
>> There is no need to help connections that are not confirmed, so we can
>> delay helping new connections to the time when they are confirmed.
>> This change is needed for NAT support, and having this as a separate
>> patch will make the following NAT patch a bit easier to review.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> Acked-by: Joe Stringer 



[patch net-next] mlxsw: pci: Implement reset done check

2016-03-10 Thread Jiri Pirko
From: Jiri Pirko 

Firmware now tells us that the reset is done by passing a magic value
via register. Use it to shorten the wait in case this is supported.
With old firmware, we still wait until the timeout is reached.

Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/pci.c | 15 +++
 drivers/net/ethernet/mellanox/mlxsw/pci.h |  3 +++
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 7992c55..7f4173c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -1681,11 +1681,18 @@ static const struct mlxsw_bus mlxsw_pci_bus = {
 
 static int mlxsw_pci_sw_reset(struct mlxsw_pci *mlxsw_pci)
 {
+   unsigned long end;
+
mlxsw_pci_write32(mlxsw_pci, SW_RESET, MLXSW_PCI_SW_RESET_RST_BIT);
-   /* Current firware does not let us know when the reset is done.
-* So we just wait here for constant time and hope for the best.
-*/
-   msleep(MLXSW_PCI_SW_RESET_TIMEOUT_MSECS);
+   wmb(); /* reset needs to be written before we read control register */
+   end = jiffies + msecs_to_jiffies(MLXSW_PCI_SW_RESET_TIMEOUT_MSECS);
+   do {
+   u32 val = mlxsw_pci_read32(mlxsw_pci, FW_READY);
+
+   if ((val & MLXSW_PCI_FW_READY_MASK) == MLXSW_PCI_FW_READY_MAGIC)
+   break;
+   cond_resched();
+   } while (time_before(jiffies, end));
return 0;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.h b/drivers/net/ethernet/mellanox/mlxsw/pci.h
index 9121060..d942a3e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.h
@@ -61,6 +61,9 @@
 #define MLXSW_PCI_SW_RESET 0xF0010
 #define MLXSW_PCI_SW_RESET_RST_BIT BIT(0)
 #define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS   5000
+#define MLXSW_PCI_FW_READY 0xA1844
+#define MLXSW_PCI_FW_READY_MASK    0xFF
+#define MLXSW_PCI_FW_READY_MAGIC   0x5E
 
 #define MLXSW_PCI_DOORBELL_SDQ_OFFSET  0x000
 #define MLXSW_PCI_DOORBELL_RDQ_OFFSET  0x200
-- 
2.5.0
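
The poll-until-ready-or-timeout pattern in the patch above can be exercised outside the kernel. The following is a minimal sketch under stated assumptions: struct fake_dev, fake_read_fw_ready() and wait_fw_ready() are hypothetical stand-ins for the device, mlxsw_pci_read32() and the loop in mlxsw_pci_sw_reset(), and time is simulated (one millisecond per register read) instead of using jiffies. Only the mask and magic values are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define FW_READY_MASK   0xFFu   /* from the patch */
#define FW_READY_MAGIC  0x5Eu   /* from the patch */

/* Hypothetical simulated device with its own clock. */
struct fake_dev {
    uint64_t now_ms;       /* simulated time */
    uint64_t ready_at_ms;  /* when the register starts reporting the magic */
};

/* Each read advances simulated time by 1 ms. */
static uint32_t fake_read_fw_ready(struct fake_dev *dev)
{
    dev->now_ms += 1;
    return dev->now_ms >= dev->ready_at_ms ? 0xABCD005Eu : 0u;
}

/* Returns true if the magic value was seen before the deadline. */
static bool wait_fw_ready(struct fake_dev *dev, uint64_t timeout_ms)
{
    uint64_t end = dev->now_ms + timeout_ms;

    do {
        uint32_t val = fake_read_fw_ready(dev);

        if ((val & FW_READY_MASK) == FW_READY_MAGIC)
            return true;
    } while (dev->now_ms < end);

    return false;  /* old firmware: waited for the full timeout */
}
```

Unlike this sketch, the patch returns 0 in both cases, so firmware that never reports the magic value simply costs the full timeout, as the commit message describes.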



Re: [PATCH nf-next v10 3/8] openvswitch: Add commentary to conntrack.c

2016-03-10 Thread Joe Stringer
On 11 March 2016 at 07:54, Jarno Rajahalme  wrote:
> This makes the code easier to understand and the following patches
> more focused.
>
> Signed-off-by: Jarno Rajahalme 

Acked-by: Joe Stringer 


Re: [PATCH 0/2] sh_eth: fix couple of bugs in sh_eth_ring_format()

2016-03-10 Thread Sergei Shtylyov

On 03/11/2016 12:07 AM, David Miller wrote:

>> Here's a set of 2 patches against DaveM's 'net.git' repo fixing two bugs
>> in sh_eth_ring_format()...
>>
>> [1/2] sh_eth: fix NULL pointer dereference in sh_eth_ring_format()
>> [2/2] sh_eth: advance 'rxdesc' later in sh_eth_ring_format()
>
> Since Linus is likely to release today or otherwise very soon I'm not
> putting things into 'net'.
>
> So I've applied this series to 'net-next', let me know if I should
> queue it up for stable.

   If you generally queue the error path fixes, then queue these two please.

> Thanks.

   My pleasure. :-)

MBR, Sergei



Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cyrill Gorcunov
On Fri, Mar 11, 2016 at 12:19:45AM +0300, Cyrill Gorcunov wrote:
> > 
> > Oh yes they do, from masq's non-inet notifier.  masq registers two
> > notifiers, one for generic netdev and one for inetdev.
> 
> Thanks a huge David! I'll test it just to be sure.

Works like a charm! So David, what are the next steps then?
Mind to gather all your patches into one (maybe)?


Re: [PATCH nf-next v10 7/8] openvswitch: Delay conntrack helper call for new connections.

2016-03-10 Thread Joe Stringer
On 11 March 2016 at 07:54, Jarno Rajahalme  wrote:
> There is no need to help connections that are not confirmed, so we can
> delay helping new connections to the time when they are confirmed.
> This change is needed for NAT support, and having this as a separate
> patch will make the following NAT patch a bit easier to review.
>
> Signed-off-by: Jarno Rajahalme 

Acked-by: Joe Stringer 


Re: [PATCH 1/3] dm9601: enable EP3 interrupt

2016-03-10 Thread Peter Korsgaard
> "Joseph" == Joseph CHANG  writes:

 > Enable chip's EP3 interrupt to get the link-up notify soon
 > immediately.

Sorry, what do you mean about 'soon immediately'?

 > +
 > +/* Always return 8-bytes data to host per interrupt-interval */
 > +dm_write_reg(dev, DM_USB_CTRL, USB_CTRL_EP3ACK);

Why would we want to do that instead of the current setup that afaik
only returns data when the link status changes?

-- 
Bye, Peter Korsgaard


[PATCH] sctp: allow sctp_transmit_packet and others to use gfp

2016-03-10 Thread Marcelo Ricardo Leitner
Currently sctp_sendmsg() triggers some calls that will allocate memory
with GFP_ATOMIC even when not necessary. In the case of
sctp_packet_transmit it will allocate a linear skb that will be used to
construct the packet and this may cause sends to fail due to ENOMEM more
often than anticipated, especially with big MTUs.

This patch thus allows it to inherit gfp flags from upper calls so that
it can use GFP_KERNEL if it was triggered by a sctp_sendmsg call or
similar. All others, like retransmits or flushes started from BH, are
still allocated using GFP_ATOMIC.

In netperf tests this didn't result in any performance drawbacks when
memory is not too fragmented and made it trigger ENOMEM way less often.

Signed-off-by: Marcelo Ricardo Leitner 
---
 include/net/sctp/sm.h  |  2 +-
 include/net/sctp/structs.h | 10 +++---
 net/sctp/associola.c   |  2 +-
 net/sctp/chunk.c   |  6 ++--
 net/sctp/input.c   |  2 +-
 net/sctp/output.c  |  6 ++--
 net/sctp/outqueue.c| 30 -
 net/sctp/sm_make_chunk.c   | 80 +++---
 net/sctp/sm_sideeffect.c   | 23 ++---
 9 files changed, 89 insertions(+), 72 deletions(-)

diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 487ef34bbd63ff1cfe511c7ee8b1501593a14de3..efc01743b9d641bf6b16a37780ee0df34b4ec698 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -201,7 +201,7 @@ struct sctp_chunk *sctp_make_cwr(const struct sctp_association *,
 struct sctp_chunk * sctp_make_datafrag_empty(struct sctp_association *,
const struct sctp_sndrcvinfo *sinfo,
int len, const __u8 flags,
-   __u16 ssn);
+   __u16 ssn, gfp_t gfp);
 struct sctp_chunk *sctp_make_ecne(const struct sctp_association *,
  const __u32);
 struct sctp_chunk *sctp_make_sack(const struct sctp_association *);
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 205630bb5010b8ac76b84651b302e488fc1c76ff..0b65c16bbc2a837b2fd2aca4aa8cee5686feaf33 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -656,7 +656,7 @@ void sctp_chunk_free(struct sctp_chunk *);
 void  *sctp_addto_chunk(struct sctp_chunk *, int len, const void *data);
 struct sctp_chunk *sctp_chunkify(struct sk_buff *,
 const struct sctp_association *,
-struct sock *);
+struct sock *, gfp_t gfp);
 void sctp_init_addrs(struct sctp_chunk *, union sctp_addr *,
 union sctp_addr *);
 const union sctp_addr *sctp_source(const struct sctp_chunk *chunk);
@@ -718,10 +718,10 @@ struct sctp_packet *sctp_packet_init(struct sctp_packet *,
 __u16 sport, __u16 dport);
 struct sctp_packet *sctp_packet_config(struct sctp_packet *, __u32 vtag, int);
 sctp_xmit_t sctp_packet_transmit_chunk(struct sctp_packet *,
-   struct sctp_chunk *, int);
+  struct sctp_chunk *, int, gfp_t);
 sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *,
  struct sctp_chunk *);
-int sctp_packet_transmit(struct sctp_packet *);
+int sctp_packet_transmit(struct sctp_packet *, gfp_t);
 void sctp_packet_free(struct sctp_packet *);
 
 static inline int sctp_packet_empty(struct sctp_packet *packet)
@@ -1054,7 +1054,7 @@ struct sctp_outq {
 void sctp_outq_init(struct sctp_association *, struct sctp_outq *);
 void sctp_outq_teardown(struct sctp_outq *);
 void sctp_outq_free(struct sctp_outq*);
-int sctp_outq_tail(struct sctp_outq *, struct sctp_chunk *chunk);
+int sctp_outq_tail(struct sctp_outq *, struct sctp_chunk *chunk, gfp_t);
 int sctp_outq_sack(struct sctp_outq *, struct sctp_chunk *);
 int sctp_outq_is_empty(const struct sctp_outq *);
 void sctp_outq_restart(struct sctp_outq *);
@@ -1062,7 +1062,7 @@ void sctp_outq_restart(struct sctp_outq *);
 void sctp_retransmit(struct sctp_outq *, struct sctp_transport *,
 sctp_retransmit_reason_t);
 void sctp_retransmit_mark(struct sctp_outq *, struct sctp_transport *, __u8);
-int sctp_outq_uncork(struct sctp_outq *);
+int sctp_outq_uncork(struct sctp_outq *, gfp_t gfp);
 /* Uncork and flush an outqueue.  */
 static inline void sctp_outq_cork(struct sctp_outq *q)
 {
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index 2bf8ec92dde482ed6ab59275aad492d5abc5385e..24d2f6fffbc52bedbcd4efec82eaf834f0c75613 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -1493,7 +1493,7 @@ void sctp_assoc_rwnd_increase(struct sctp_association *asoc, unsigned int len)
 
asoc->peer.sack_needed = 0;
 
-   sctp_outq_tail(&asoc->outqueue, sack);
+   sctp_outq_tail(&asoc->outqueue, sack, 

Re: [PATCH next v2 0/7] Introduce l3_dev pointer for L3 processing

2016-03-10 Thread Cong Wang
On Thu, Mar 10, 2016 at 1:47 AM, Nicolas Dichtel wrote:
> Le 09/03/2016 22:49, Mahesh Bandewar a écrit :
>>
>> From: Mahesh Bandewar 
>>
>> One of the major requests (for enhancement) that I have received
>> from various users of IPvlan in L3 mode is its inability to handle
>> IPtables.
>>
>> While looking at the code and how we handle ingress, the problem
>> can be attributed to the asymmetry in the way packets get processed
>> for IPvlan devices configured in L3 mode. L3 mode is supposed to
>> be restrictive and all the L3 decisions need to be taken for the
>> traffic in master's ns. This does happen as expected for egress
>> traffic; however, on ingress traffic, the IPvlan packet-handler
>> changes the skb->dev and this forces the packet to be processed with
>> the IPvlan slave and its associated ns. This causes the above-mentioned
>> problem and a few others which are not yet reported / attempted, e.g.
>> IPsec with L3 mode or even ingress routing.
>>
>> This could have been solved if we had a way to handover packet to
>> slave and associated ns after completing the L3 phase. This is a
>> non-trivial issue to fix especially looking at IPsec code.
>>
>> This patch series attempts to solve this problem by introducing the
>> device pointer l3_dev which resides in net_device structure in the
>> RX cache line. We initialize the l3_dev to self. This would mean
>> there is no complex logic to when-and-how-to initialize it. Now
>> the stack will use this dev pointer during the L3 phase. This should
>> not alter any existing properties / behavior and also there should
>> not be any additional penalties since it resides in the same RX
>> cache line.
>
> If I understand correctly (and as Cong already said), information is
> leaking between netns during the input phase. On the tx side,
> skb_scrub_packet() is called, but not on the rx side. I think it's wrong.
> There should be an explicit boundary.

That is not what I am complaining about.

I dislike the trick of switching skb->dev pointer with skb->dev->l3_dev.
This is not how we switch netns, nor how netns works.

Look at veth pair or dev_change_net_namespace(): each time we switch
netns, we need to do a full reregistration or a full reentrance; we never
just switch some pointers to switch netns. This is why I said it breaks
isolation.

Also, it is ugly to hide such an ipvlan-specific pointer for half of the RX
code path.


Re: [PATCH net-next V3 00/10] cls_flower hardware offload support

2016-03-10 Thread David Miller
From: Amir Vadai 
Date: Tue,  8 Mar 2016 12:42:28 +0200

> Please see changes from V2 at the bottom.
> 
> This patchset introduces cls_flower hardware offload support over ConnectX-4
> driver, more hardware vendors are welcome to use it too.
 ...

Series applied, thanks for retaining detailed change history in this series
header posting.


Re: [PATCH V5 0/4] net-next: mediatek: add ethernet driver

2016-03-10 Thread David Miller
From: John Crispin 
Date: Tue,  8 Mar 2016 11:29:53 +0100

> This series adds support for the Mediatek ethernet core found on current ARM
> based SoCs. The driver works on MT2701 and MT7623 SoCs
> 
> Instead of trying to upstream everything at once I decided to concentrate on
> the important parts required to make current generation silicon work. The V3
> series only includes the code required to make dual MAC setups work and only
> supports the newer QDMA engine.
 ...

Series applied, thanks.


Re: [PATCH] net: dsa: Fix cleanup resources upon module removal

2016-03-10 Thread David Miller
From: Neil Armstrong 
Date: Tue,  8 Mar 2016 10:36:20 +0100

> The initial commit badly merged into the dsa_resume method instead
> of the dsa_remove_dst method.
> As consequence, the dst->master_netdev->dsa_ptr is not set to NULL on
> removal and re-bind of the dsa device fails with error -17.
> 
> Fixes: b0dc635d923c ("net: dsa: cleanup resources upon module removal ")
> Signed-off-by: Neil Armstrong 
> ---
>  net/dsa/dsa.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> David, Florian, Andrew,
> 
> This fix is quite urgent since it breaks all the removal cleanup.

Since 'net' is closed, I've applied this to 'net-next' and queued it up for
-stable.

Thanks.


Re: [PATCH net-next] net: dsa: mv88e6xxx: rework port state setter

2016-03-10 Thread David Miller
From: Vivien Didelot 
Date: Mon,  7 Mar 2016 18:24:17 -0500

> Apply a few non-functional changes on the port state setter:
> 
>   * add a dynamic debug message with state names to track changes
>   * explicit states checking instead of assuming their numeric values
>   * lock mutex only once when changing several port states
>   * use bitmap macros to declare and access port_state_update_mask
> 
> Signed-off-by: Vivien Didelot 

Applied.


Re: [PATCH net-next] net: dsa: mv88e6xxx: avoid writing the same mode

2016-03-10 Thread David Miller
From: Vivien Didelot 
Date: Mon,  7 Mar 2016 18:24:52 -0500

> There is no need to change the 802.1Q port mode for the same value.
> Thus avoid such message:
> 
> [  401.954836] dsa dsa@0 lan0: 802.1Q Mode: Disabled (was Disabled)
> 
> Signed-off-by: Vivien Didelot 

Applied.


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cyrill Gorcunov
On Thu, Mar 10, 2016 at 04:05:21PM -0500, David Miller wrote:
> > 
> > and nobody calls for nf_ct_iterate_cleanup, no?
> 
> Oh yes they do, from masq's non-inet notifier.  masq registers two
> notifiers, one for generic netdev and one for inetdev.

Thanks a huge David! I'll test it just to be sure.

Cyrill


Re: [PATCH net-next 1/1] qede: Fix net-next "make ARCH=x86_64"

2016-03-10 Thread David Miller
From: Manish Chopra 
Date: Tue, 8 Mar 2016 04:09:44 -0500

> 'commit 55482edc25f0606851de42e73618f813f310d009
> ("qede: Add slowpath/fastpath support and enable hardware GRO")'
> introduces below error when compiling net-next with "make ARCH=x86_64"
> 
> drivers/built-in.o: In function `qede_rx_int':
> qede_main.c:(.text+0x6101a0): undefined reference to `tcp_gro_complete'
> 
> Signed-off-by: Manish Chopra 

Applied, thank you.


Re: [PATCH net] r8169:Remove unnecessary phy reset for pcie nic when setting link spped.

2016-03-10 Thread David Miller
From: Chunhao Lin 
Date: Tue, 8 Mar 2016 16:51:05 +0800

> For pcie nic, after setting link speed and thers is no link  driver does not need
> to do phy reset untill link up.

"there's", "until"

> For some pcie nics, to do this will also reset phy speed down counter and prevent
> phy from auto speed down.

Please fix these typos and resubmit, thanks.


Re: [net-next] arp: add macro to get drop_gratuitous_arp setting

2016-03-10 Thread David Miller
From: Zhang Shengju 
Date: Tue,  8 Mar 2016 07:53:50 +

> Add macro IN_DEV_DROP_GRATUITOUS_ARP to facilitate getting
> drop_gratuitous_arp value.
> 
> Signed-off-by: Zhang Shengju 

As it's used in one location, I see zero value in this, sorry.

I'm not applying this patch.


Re: [PATCH net v2 0/2] qlcnic fixes

2016-03-10 Thread David Miller
From: Rajesh Borundia 
Date: Tue, 8 Mar 2016 02:39:56 -0500

> This series adds following fixes.
> 
> o While processing mailbox if driver gets a spurious mailbox
>   interrupt it leads into premature completion of a next
>   mailbox request. Added a guard against this by checking current
>   state of mailbox and ignored spurious interrupt.
>   Added a stats counter to record this condition.
> 
> v2:
> 
> o Added patch that removes usage of atomic_t as we are not implementing
>   atomicity by using atomic_t value.
> 
> Please apply these fixes to net.

As explained in other list postings, 'net' is basically closed for this
release cycle, so I applied this series to 'net-next'.

Let me know if you'd like me to therefore queue these changes up for
-stable.

Thanks.
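
The guard described in the cover letter (check the mailbox state, ignore and count interrupts that arrive when no request is pending) can be modelled in a few lines. This is a minimal sketch, not the qlcnic code: struct mbx, mbx_post_request() and mbx_interrupt() are hypothetical names, and locking is omitted.

```c
/* Hypothetical mailbox model; names are not taken from the driver. */
enum mbx_state { MBX_IDLE, MBX_WAITING };

struct mbx {
    enum mbx_state state;
    unsigned long completed;  /* requests completed by an interrupt */
    unsigned long spurious;   /* interrupts seen with no request pending */
};

/* A request has been posted and a completion interrupt is now expected. */
static void mbx_post_request(struct mbx *m)
{
    m->state = MBX_WAITING;
}

/* Interrupt handler: complete only if a request is actually pending. */
static void mbx_interrupt(struct mbx *m)
{
    if (m->state != MBX_WAITING) {
        m->spurious++;  /* record and ignore the spurious interrupt */
        return;
    }
    m->state = MBX_IDLE;
    m->completed++;
}
```

Without the state check, an interrupt arriving between two requests would be credited to the next request and complete it prematurely, which is the failure the cover letter describes.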


Re: [PATCH] include/net/inet_connection_sock.h: Use pr_devel() instead of pr_debug()

2016-03-10 Thread David Miller
From: Nick Wang 
Date: Tue,  8 Mar 2016 13:52:28 +0800

> File "inet_connection_sock.h" is a common share header that not can 
> be use for one module, so use pr_devel instead of pr_debug is OK.

Not really, we want these printks to do anything only when debug
printk's are enabled.

We don't want the overhead otherwise.

You'll need to find another fix for this, sorry.


Re: [PATCH net-next 0/4] cxgb4vf: Interrupt and queue configuration changes

2016-03-10 Thread David Miller
From: Hariprasad Shenai 
Date: Tue,  8 Mar 2016 10:50:16 +0530

> This series fixes some issues and some changes in the queue and interrupt
> configuration for cxgb4vf driver. We need to enable interrupts before we
> register our network device, so that we don't lose link up interrupts.
> Allocate rx queues based on interrupt type. Set number of tx/rx queues in
> probe function only. Also adds check for some invalid configurations.
> 
> This patch series has been created against net-next tree and includes
> patches on cxgb4vf driver.
> 
> We have included all the maintainers of respective drivers. Kindly review
> the change and let us know in case of any review comments.

Series applied, thanks.


Re: [PATCH net-next] net: dsa: mv88e6xxx: read then write PVID

2016-03-10 Thread David Miller
From: Vivien Didelot 
Date: Mon,  7 Mar 2016 18:24:39 -0500

> The port register 0x07 contains more options than just the default VID,
> even though they are not used yet. So prefer a read then write operation
> over a direct write.
> 
> This also allows to keep track of the change through dynamic debug.
> 
> Signed-off-by: Vivien Didelot 

Applied.
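
The read-then-write approach described in the commit message is a read-modify-write on a register that shares bits with other options. A minimal sketch, assuming the default VID occupies the low 12 bits of the register; PVID_MASK and set_pvid() are hypothetical names used only for illustration:

```c
#include <stdint.h>

/* Assumption: the default VID lives in bits 11:0 of the port register. */
#define PVID_MASK 0x0fffu

/* Return the new register value with only the VID field replaced. */
static uint16_t set_pvid(uint16_t reg, uint16_t pvid)
{
    return (uint16_t)((reg & ~PVID_MASK) | (pvid & PVID_MASK));
}
```

A direct write of the VID alone would zero the upper bits; going through the old register value keeps them, and also makes the previous VID available for a debug message.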


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cong Wang
On Thu, Mar 10, 2016 at 11:55 AM, David Miller  wrote:
> Indeed, good catch.  Therefore:
>
> 1) Keep the masq netdev notifier.  That will flush the conntrack table
>for the inetdev_destroy event.
>
> 2) Make the inetdev notifier only do something if inetdev->dead is
>false.  (ie. we are flushing an individual address)
>
> And then we don't need the NETDEV_UNREGISTER thing at all:


This makes sense to me. I guess a similar thing needs to be done for IPv6 masq too.

Thanks.


Re: [PATCH 0/2] sh_eth: fix couple of bugs in sh_eth_ring_format()

2016-03-10 Thread David Miller
From: Sergei Shtylyov 
Date: Tue, 08 Mar 2016 01:33:38 +0300

>    Here's a set of 2 patches against DaveM's 'net.git' repo fixing two bugs
> in sh_eth_ring_format()...
> 
> [1/2] sh_eth: fix NULL pointer dereference in sh_eth_ring_format()
> [2/2] sh_eth: advance 'rxdesc' later in sh_eth_ring_format()

Since Linus is likely to release today or otherwise very soon I'm not
putting things into 'net'.

So I've applied this series to 'net-next', let me know if I should
queue it up for stable.

Thanks.


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread David Miller
From: Cyrill Gorcunov 
Date: Thu, 10 Mar 2016 23:13:51 +0300

> On Thu, Mar 10, 2016 at 03:03:11PM -0500, David Miller wrote:
>> From: Cyrill Gorcunov 
>> Date: Thu, 10 Mar 2016 23:01:34 +0300
>> 
>> > On Thu, Mar 10, 2016 at 02:55:43PM -0500, David Miller wrote:
>> >> > 
>> >> > Hmm, but inetdev_destroy() is only called when NETDEV_UNREGISTER
>> >> > is happening and masq already registers a netdev notifier...
>> >> 
>> >> Indeed, good catch.  Therefore:
>> >> 
>> >> 1) Keep the masq netdev notifier.  That will flush the conntrack table
>> >>for the inetdev_destroy event.
>> >> 
>> >> 2) Make the inetdev notifier only do something if inetdev->dead is
>> >>false.  (ie. we are flushing an individual address)
>> >> 
>> >> And then we don't need the NETDEV_UNREGISTER thing at all:
>> >> 
>> >> diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
>> >> index c6eb421..f71841a 100644
>> >> --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
>> >> +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
>> >> @@ -108,10 +108,20 @@ static int masq_inet_event(struct notifier_block *this,
>> >>  unsigned long event,
>> >>  void *ptr)
>> >>  {
>> >> - struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev;
>> >>   struct netdev_notifier_info info;
>> >> + struct in_ifaddr *ifa = ptr;
>> >> + struct in_device *idev;
>> >>  
>> >> - netdev_notifier_info_init(&info, dev);
>> >> + /* The masq_dev_notifier will catch the case of the device going
>> >> +  * down.  So if the inetdev is dead and being destroyed we have
>> >> +  * no work to do.  Otherwise this is an individual address removal
>> >> +  * and we have to perform the flush.
>> >> +  */
>> >> + idev = ifa->ifa_dev;
>> >> + if (idev->dead)
>> >> + return NOTIFY_DONE;
>> >> +
>> >> + netdev_notifier_info_init(&info, idev->dev);
>> >>   return masq_device_event(this, event, &info);
>> >>  }
>> > 
>> > Guys, I'm lost. Currently masq_device_event calls for conntrack
>> > cleanup with device index, so that once device is going down, the
>> > appropriate conntracks gonna be dropped off. Now if device is dead
>> > nobody will cleanup the conntracks?
>> 
>> Both notifiers are run in the inetdev_destroy() case.
>> 
>> Maybe that's what you are missing.
> 
> No :) Look, here is what I mean. Previously with your two patches
> we've been calling nf-cleanup for every address, so we had to make
> code call for cleanup for one time only. Now with the patch above
> the code flow is the following
> 
> inetdev_destroy
>   in_dev->dead = 1;
>   ...
>   inet_del_ifa
>   ...
>   blocking_notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa1);
>   ...
>   masq_inet_event
>...
> masq_device_event
>   if (idev->dead)
>   return NOTIFY_DONE;
> 
> and nobody calls for nf_ct_iterate_cleanup, no?

Oh yes they do, from masq's non-inet notifier.  masq registers two
notifiers, one for generic netdev and one for inetdev.
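
The interplay of the two notifiers can be modelled in user space. This is a simplified sketch, not kernel code: struct sim_inetdev and the sim_* functions are hypothetical, and the ordering inside sim_destroy_device() is an assumption of the model. It only illustrates that with the idev->dead check a device teardown flushes once, however many addresses it had, while a single address removal still flushes.

```c
#include <stdbool.h>

/* Hypothetical model of an inet device and its conntrack flush count. */
struct sim_inetdev {
    bool dead;    /* set once the device is being destroyed */
    int naddrs;   /* addresses still configured */
    int flushes;  /* how many times conntrack cleanup ran */
};

static void sim_flush(struct sim_inetdev *idev)
{
    idev->flushes++;
}

/* Models the inetaddr notifier: runs once per removed address. */
static void sim_masq_inet_event(struct sim_inetdev *idev)
{
    if (idev->dead)
        return;  /* the netdev notifier handles the whole device */
    sim_flush(idev);
}

/* Models the generic netdev notifier: runs once per device event. */
static void sim_masq_device_event(struct sim_inetdev *idev)
{
    sim_flush(idev);
}

/* Removing a single address from a live device. */
static void sim_remove_one_address(struct sim_inetdev *idev)
{
    idev->naddrs--;
    sim_masq_inet_event(idev);
}

/* Tearing the device down: one device event, then every address goes. */
static void sim_destroy_device(struct sim_inetdev *idev)
{
    sim_masq_device_event(idev);
    idev->dead = true;
    while (idev->naddrs > 0) {
        idev->naddrs--;
        sim_masq_inet_event(idev);
    }
}
```

In this model a device with 1000 addresses costs one flush on teardown instead of 1000, which is the effect the thread is after.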


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cyrill Gorcunov
On Thu, Mar 10, 2016 at 11:13:51PM +0300, Cyrill Gorcunov wrote:
> > 
> > Both notifiers are run in the inetdev_destroy() case.
> > 
> > Maybe that's what you are missing.
> 
> No :) Look, here is what I mean. Previously with your two patches
> we've been calling nf-cleanup for every address, so we had to make
> code call for cleanup for one time only. Now with the patch above
> the code flow is the following

Ah, I'm idiot, drop the question.


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cyrill Gorcunov
On Thu, Mar 10, 2016 at 03:03:11PM -0500, David Miller wrote:
> From: Cyrill Gorcunov 
> Date: Thu, 10 Mar 2016 23:01:34 +0300
> 
> > On Thu, Mar 10, 2016 at 02:55:43PM -0500, David Miller wrote:
> >> > 
> >> > Hmm, but inetdev_destroy() is only called when NETDEV_UNREGISTER
> >> > is happening and masq already registers a netdev notifier...
> >> 
> >> Indeed, good catch.  Therefore:
> >> 
> >> 1) Keep the masq netdev notifier.  That will flush the conntrack table
> >>for the inetdev_destroy event.
> >> 
> >> 2) Make the inetdev notifier only do something if inetdev->dead is
> >>false.  (ie. we are flushing an individual address)
> >> 
> >> And then we don't need the NETDEV_UNREGISTER thing at all:
> >> 
> >> diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> >> index c6eb421..f71841a 100644
> >> --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> >> +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> >> @@ -108,10 +108,20 @@ static int masq_inet_event(struct notifier_block *this,
> >>   unsigned long event,
> >>   void *ptr)
> >>  {
> >> -  struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev;
> >>struct netdev_notifier_info info;
> >> +  struct in_ifaddr *ifa = ptr;
> >> +  struct in_device *idev;
> >>  
> >> -  netdev_notifier_info_init(&info, dev);
> >> +  /* The masq_dev_notifier will catch the case of the device going
> >> +   * down.  So if the inetdev is dead and being destroyed we have
> >> +   * no work to do.  Otherwise this is an individual address removal
> >> +   * and we have to perform the flush.
> >> +   */
> >> +  idev = ifa->ifa_dev;
> >> +  if (idev->dead)
> >> +  return NOTIFY_DONE;
> >> +
> >> +  netdev_notifier_info_init(&info, idev->dev);
> >>    return masq_device_event(this, event, &info);
> >>  }
> > 
> > Guys, I'm lost. Currently masq_device_event calls for conntrack
> > cleanup with device index, so that once device is going down, the
> > appropriate conntracks gonna be dropped off. Now if device is dead
> > nobody will cleanup the conntracks?
> 
> Both notifiers are run in the inetdev_destroy() case.
> 
> Maybe that's what you are missing.

No :) Look, here is what I mean. Previously with your two patches
we've been calling nf-cleanup for every address, so we had to make
code call for cleanup for one time only. Now with the patch above
the code flow is the following

inetdev_destroy
in_dev->dead = 1;
...
inet_del_ifa
...
blocking_notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa1);
...
masq_inet_event
 ...
  masq_device_event
if (idev->dead)
return NOTIFY_DONE;

and nobody calls for nf_ct_iterate_cleanup, no?


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread David Miller
From: Cyrill Gorcunov 
Date: Thu, 10 Mar 2016 23:01:34 +0300

> On Thu, Mar 10, 2016 at 02:55:43PM -0500, David Miller wrote:
>> > 
>> > Hmm, but inetdev_destroy() is only called when NETDEV_UNREGISTER
>> > is happening and masq already registers a netdev notifier...
>> 
>> Indeed, good catch.  Therefore:
>> 
>> 1) Keep the masq netdev notifier.  That will flush the conntrack table
>>for the inetdev_destroy event.
>> 
>> 2) Make the inetdev notifier only do something if inetdev->dead is
>>false.  (ie. we are flushing an individual address)
>> 
>> And then we don't need the NETDEV_UNREGISTER thing at all:
>> 
>> diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
>> index c6eb421..f71841a 100644
>> --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
>> +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
>> @@ -108,10 +108,20 @@ static int masq_inet_event(struct notifier_block *this,
>> unsigned long event,
>> void *ptr)
>>  {
>> -struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev;
>>  struct netdev_notifier_info info;
>> +struct in_ifaddr *ifa = ptr;
>> +struct in_device *idev;
>>  
>> -netdev_notifier_info_init(&info, dev);
>> +/* The masq_dev_notifier will catch the case of the device going
>> + * down.  So if the inetdev is dead and being destroyed we have
>> + * no work to do.  Otherwise this is an individual address removal
>> + * and we have to perform the flush.
>> + */
>> +idev = ifa->ifa_dev;
>> +if (idev->dead)
>> +return NOTIFY_DONE;
>> +
>> +netdev_notifier_info_init(&info, idev->dev);
>>  return masq_device_event(this, event, &info);
>>  }
> 
> Guys, I'm lost. Currently masq_device_event calls for conntrack
> cleanup with device index, so that once device is going down, the
> appropriate conntracks gonna be dropped off. Now if device is dead
> nobody will cleanup the conntracks?

Both notifiers are run in the inetdev_destroy() case.

Maybe that's what you are missing.
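
For readers following the thread, the split of responsibilities discussed above can be sketched in plain userspace C. This is only a model under stated assumptions, not kernel code: every `*_model` name, the `EV_DOWN` constant and the `flush_count` counter are hypothetical, and the sketch assumes the device-level notifier sees one DOWN event before the per-address events arrive with `dead` already set.

```c
#include <stdbool.h>

enum { EV_DOWN = 1 };            /* hypothetical stand-in for NETDEV_DOWN */
enum { MODEL_NOTIFY_DONE = 0 };  /* hypothetical stand-in for NOTIFY_DONE */

struct in_device_model {
	bool dead;       /* set once the whole inetdev is being destroyed */
	int ifindex;
};

struct ifaddr_model {
	struct in_device_model *idev;
};

/* Counts how many times the (modelled) conntrack flush runs. */
int flush_count;

/* Models the device-level handler: flush on a DOWN event. */
int masq_device_event_model(int event, int ifindex)
{
	(void)ifindex;
	if (event == EV_DOWN)
		flush_count++;
	return MODEL_NOTIFY_DONE;
}

/* Models the per-address handler with the proposed "dead" check. */
int masq_inet_event_model(int event, struct ifaddr_model *ifa)
{
	if (ifa->idev->dead)
		return MODEL_NOTIFY_DONE;  /* device notifier already did the flush */
	return masq_device_event_model(event, ifa->idev->ifindex);
}

/* Whole-device teardown: one device event, then naddrs address events. */
int flushes_on_device_teardown(int naddrs)
{
	struct in_device_model idev = { .dead = false, .ifindex = 2 };
	struct ifaddr_model ifa = { .idev = &idev };

	flush_count = 0;
	masq_device_event_model(EV_DOWN, idev.ifindex);
	idev.dead = true;
	for (int i = 0; i < naddrs; i++)
		masq_inet_event_model(EV_DOWN, &ifa);
	return flush_count;
}

/* Single address removal on a live device: the address event flushes. */
int flushes_on_single_address_removal(void)
{
	struct in_device_model idev = { .dead = false, .ifindex = 2 };
	struct ifaddr_model ifa = { .idev = &idev };

	flush_count = 0;
	masq_inet_event_model(EV_DOWN, &ifa);
	return flush_count;
}
```

In this model the flush runs once per device teardown regardless of how many addresses the device carries, and still runs when a single address is removed from a live device.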



Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cyrill Gorcunov
On Thu, Mar 10, 2016 at 02:55:43PM -0500, David Miller wrote:
> > 
> > Hmm, but inetdev_destroy() is only called when NETDEV_UNREGISTER
> > is happening and masq already registers a netdev notifier...
> 
> Indeed, good catch.  Therefore:
> 
> 1) Keep the masq netdev notifier.  That will flush the conntrack table
>for the inetdev_destroy event.
> 
> 2) Make the inetdev notifier only do something if inetdev->dead is
>false.  (ie. we are flushing an individual address)
> 
> And then we don't need the NETDEV_UNREGISTER thing at all:
> 
> diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c 
> b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> index c6eb421..f71841a 100644
> --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> @@ -108,10 +108,20 @@ static int masq_inet_event(struct notifier_block *this,
>  unsigned long event,
>  void *ptr)
>  {
> - struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev;
>   struct netdev_notifier_info info;
> + struct in_ifaddr *ifa = ptr;
> + struct in_device *idev;
>  
> - netdev_notifier_info_init(&info, dev);
> + /* The masq_dev_notifier will catch the case of the device going
> +  * down.  So if the inetdev is dead and being destroyed we have
> +  * no work to do.  Otherwise this is an individual address removal
> +  * and we have to perform the flush.
> +  */
> + idev = ifa->ifa_dev;
> + if (idev->dead)
> + return NOTIFY_DONE;
> +
> + netdev_notifier_info_init(&info, idev->dev);
>   return masq_device_event(this, event, &info);
>  }

Guys, I'm lost. Currently masq_device_event calls for conntrack
cleanup with device index, so that once device is going down, the
appropriate conntracks gonna be dropped off. Now if device is dead
nobody will cleanup the conntracks?

Cyrill


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread David Miller
From: Cong Wang 
Date: Thu, 10 Mar 2016 11:02:28 -0800

> On Thu, Mar 10, 2016 at 10:01 AM, David Miller  wrote:
>> I'm tempted to say that we should provide these notifier handlers with
>> the information they need, explicitly, to handle this case.
>>
>> Most intdev notifiers actually want to know the individual addresses
>> that get removed, one by one.  That's handled by the existing
>> NETDEV_DOWN event and the ifa we pass to that.
>>
>> But some, like this netfilter masq case, would be satisfied with a
>> single event that tells them the whole inetdev instance is being torn
>> down.  Which is the case we care about here.
>>
>> We currently don't use NETDEV_UNREGISTER for inetdev notifiers, so
>> maybe we could use that.
>>
>> And that is consistent with the core netdev notifier that triggers
>> this call chain in the first place.
>>
>> Roughly, something like this:
>>
>> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
>> index 8c3df2c..6eee5cb 100644
>> --- a/net/ipv4/devinet.c
>> +++ b/net/ipv4/devinet.c
>> @@ -292,6 +292,11 @@ static void inetdev_destroy(struct in_device *in_dev)
>>
>> in_dev->dead = 1;
>>
>> +   if (in_dev->ifa_list)
>> +   blocking_notifier_call_chain(&inetaddr_chain,
>> +NETDEV_UNREGISTER,
>> +in_dev->ifa_list);
>> +
>> ip_mc_destroy_dev(in_dev);
> 
> 
> Hmm, but inetdev_destroy() is only called when NETDEV_UNREGISTER
> is happening and masq already registers a netdev notifier...

Indeed, good catch.  Therefore:

1) Keep the masq netdev notifier.  That will flush the conntrack table
   for the inetdev_destroy event.

2) Make the inetdev notifier only do something if inetdev->dead is
   false.  (ie. we are flushing an individual address)

And then we don't need the NETDEV_UNREGISTER thing at all:

diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c 
b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
index c6eb421..f71841a 100644
--- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
@@ -108,10 +108,20 @@ static int masq_inet_event(struct notifier_block *this,
   unsigned long event,
   void *ptr)
 {
-   struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev;
struct netdev_notifier_info info;
+   struct in_ifaddr *ifa = ptr;
+   struct in_device *idev;
 
-   netdev_notifier_info_init(&info, dev);
+   /* The masq_dev_notifier will catch the case of the device going
+* down.  So if the inetdev is dead and being destroyed we have
+* no work to do.  Otherwise this is an individual address removal
+* and we have to perform the flush.
+*/
+   idev = ifa->ifa_dev;
+   if (idev->dead)
+   return NOTIFY_DONE;
+
+   netdev_notifier_info_init(&info, idev->dev);
return masq_device_event(this, event, &info);
 }
 


Re: [PATCH] kcm: mark helper functions inline

2016-03-10 Thread David Miller
From: Arnd Bergmann 
Date: Thu, 10 Mar 2016 19:31:12 +0100

> The stub helper functions for the newly added kcm_proc_init/exit interfaces
> are defined as 'static' in a header file, which leads to build warnings for
> each file that includes them without calling them:
> 
> include/net/kcm.h:183:12: error: 'kcm_proc_init' defined but not used 
> [-Werror=unused-function]
> include/net/kcm.h:184:13: error: 'kcm_proc_exit' defined but not used 
> [-Werror=unused-function]
> 
> This marks the two functions as 'static inline' instead, which avoids the
> warnings and is obviously what was meant here.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: cd6e111bf5be ("kcm: Add statistics and proc interfaces")

Applied, thanks Arnd.
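
As an illustration of the point in the quoted commit message, a header-style stub might look like the sketch below. The `DEMO_PROC_FS` symbol and the `demo_proc_*` names are hypothetical and are not taken from include/net/kcm.h; the sketch only shows the general pattern of `static inline` no-op stubs for a compiled-out feature.

```c
/* Hypothetical header fragment: stubs used when the feature is disabled. */
#ifdef DEMO_PROC_FS
int demo_proc_init(void);
void demo_proc_exit(void);
#else
/*
 * 'static inline' rather than plain 'static': a translation unit that
 * includes this header without calling the stubs does not trigger
 * -Wunused-function, because unused inline functions are not reported.
 */
static inline int demo_proc_init(void)
{
	return 0;	/* nothing to set up, report success */
}

static inline void demo_proc_exit(void)
{
}
#endif
```

Callers can then invoke `demo_proc_init()` unconditionally, with no `#ifdef` at the call site.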


Re: [PATCH 1/2] net: thunderx: Set recevie buffer page usage count in bulk

2016-03-10 Thread David Miller
From: Sunil Kovvuri 
Date: Thu, 10 Mar 2016 23:57:48 +0530

> Difference between NIU driver and this patch is there it's
> calculate split count, increment page count and then divide page into
> buffers. Here it's divide page into buffers, have a counter which increments
> at every split and then at the end do a atomic increment of page->_count.
> 
> Any issue with this approach ?

I guess not.
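
The approach described in the quoted text (count the splits locally, then do one atomic update at the end) can be sketched in userspace C11 as follows. This is a hedged model, not the thunderx driver code: `struct page_model`, `carve_buffers()` and the use of `<stdatomic.h>` in place of the kernel's page reference counting are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a page with a reference count. */
struct page_model {
	atomic_int refcount;
};

/*
 * Carve a page of 'page_size' bytes into buffers of 'buf_size' bytes.
 * Each split bumps a plain local counter; the shared refcount is then
 * updated with a single atomic add instead of one atomic op per buffer.
 * Returns the number of buffers carved.
 */
size_t carve_buffers(struct page_model *page, size_t page_size, size_t buf_size)
{
	size_t splits = 0;

	if (buf_size == 0)
		return 0;

	for (size_t off = 0; off + buf_size <= page_size; off += buf_size)
		splits++;	/* a real driver would hand out the buffer at 'off' here */

	if (splits)
		atomic_fetch_add(&page->refcount, (int)splits);

	return splits;
}
```

The trade-off is that the reference count is briefly lower than the number of buffers already carved, so the sketch only makes sense when nothing can release a buffer before the final add.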


Re: Micrel Phy - Is there a way to configure the Phy not to do 802.3x flow control?

2016-03-10 Thread Murali Karicheri
On 03/10/2016 01:05 PM, Florian Fainelli wrote:
> On 10/03/16 08:48, Murali Karicheri wrote:
>> On 03/03/2016 07:16 PM, Florian Fainelli wrote:
>>> On 03/03/16 14:18, Murali Karicheri wrote:
>>>> Hi,
>>>>
>>>> We are using Micrel Phy in one of our board and wondering if we can force the
>>>> Phy to disable flow control at start. I have a 1G ethernet switch connected
>>>> to Phy and the phy always enable flow control. I would like to configure the
>>>> phy not to flow control. Is that possible and if yes, what should I do in the
>>>> my Ethernet driver to tell the Phy not to enable flow control?
>>>
>>> The PHY is not doing flow control per-se, your pseudo Ethernet MAC in
>>> the switch is doing, along with the link partner advertising support for
>>> it. You would want to make sure that your PHY device interface (provided
>>> that you are using the PHY library) is not starting with Pause
>>> advertised, but it could be supported.
>>
>> Understood that Phy is just advertise FC. The Micrel phy for 9031 advertise
>> by default FC supported. After negotiation, I see that Phylib provide the 
>> link status with parameter pause = 1, asym_pause = 1. How do I tell the Phy 
>> not
>> to advertise?
>>
>> I call following sequence in the Ethernet driver.
>>
>> of_phy_connect(x,y,hndlr,a,z);
> 
> Here you should be able to change phydev->advertising and
> phydev->supported to mask the ADVERTISED_Pause | ADVERTISED_AsymPause
> bits and have phy_start() restart with that which should disable pause
> and asym_pause as seen by your adjust_link handler.
> 
Ok. Good point. I will try this. Thanks for your suggestion.

Murali
>> phy_start()
>>
>> Now in hndlr() I have pause = 1, asym_pause = 1, in phy_device ptr. How can 
>> I tell the phy not to advertise initially?


-- 
Murali Karicheri
Linux Kernel, Keystone
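
The masking step suggested in this thread can be sketched as plain bit manipulation. The sketch below is a userspace model, not PHY library code: `struct phy_model`, `phy_mask_pause()` and the `DEMO_ADV_*` constants are hypothetical, and the bit positions are illustrative values that should be checked against the ethtool header of the kernel in use.

```c
#include <stdint.h>

/* Illustrative bit positions; assumptions, not authoritative values. */
#define DEMO_ADV_PAUSE       (1u << 13)
#define DEMO_ADV_ASYM_PAUSE  (1u << 14)

/* Hypothetical stand-in for the two link-mode masks of a PHY device. */
struct phy_model {
	uint32_t supported;
	uint32_t advertising;
};

/*
 * Clear both pause bits from the supported and advertised masks.
 * In a driver this would be done after connecting to the PHY and
 * before starting it, so that autonegotiation never offers pause.
 */
void phy_mask_pause(struct phy_model *phy)
{
	const uint32_t pause_bits = DEMO_ADV_PAUSE | DEMO_ADV_ASYM_PAUSE;

	phy->supported &= ~pause_bits;
	phy->advertising &= ~pause_bits;
}
```

All other link-mode bits are left untouched, so speed and duplex advertisement are unaffected.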


Re: net: use-after-free in recvmmsg

2016-03-10 Thread Arnaldo Carvalho de Melo
Em Thu, Mar 10, 2016 at 07:35:57PM +0100, Dmitry Vyukov escreveu:
> On Tue, Jan 26, 2016 at 8:30 PM, Arnaldo Carvalho de Melo
>  wrote:
> > Em Tue, Jan 26, 2016 at 08:27:48PM +0100, Dmitry Vyukov escreveu:
> >> On Fri, Jan 22, 2016 at 10:16 PM, Arnaldo Carvalho de Melo 
> >>  wrote:
> >> > Em Fri, Jan 22, 2016 at 09:39:53PM +0100, Dmitry Vyukov escreveu:
> >> >> I am on commit 30f05309bde49295e02e45c7e615f73aa4e0ccc2 (Jan 20).
> >> >> Seems to be added in commit a2e2725541fad72416326798c2d7fa4dafb7d337
> >> >> (Oct 2009).
> >> >
> >> > Maybe this helps? Compile testing now...
> >>
> >>
> >> I don't have a reliable reproducer, so can't test it per se.
> >> I will integrate this patch tomorrow and restart fuzzer with it.
> >
> > Thanks a lot!
> 
> Hi Arnaldo,
> 
> I am running with that patch since then, and did not see the bug.
> Please mail it as a proper patch.

Thanks, and I'll add a:

Reported-and-Tested-by: Dmitry Vyukov 

Ok?

- Arnaldo


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cong Wang
On Thu, Mar 10, 2016 at 10:01 AM, David Miller  wrote:
> I'm tempted to say that we should provide these notifier handlers with
> the information they need, explicitly, to handle this case.
>
> Most intdev notifiers actually want to know the individual addresses
> that get removed, one by one.  That's handled by the existing
> NETDEV_DOWN event and the ifa we pass to that.
>
> But some, like this netfilter masq case, would be satisfied with a
> single event that tells them the whole inetdev instance is being torn
> down.  Which is the case we care about here.
>
> We currently don't use NETDEV_UNREGISTER for inetdev notifiers, so
> maybe we could use that.
>
> And that is consistent with the core netdev notifier that triggers
> this call chain in the first place.
>
> Roughly, something like this:
>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index 8c3df2c..6eee5cb 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -292,6 +292,11 @@ static void inetdev_destroy(struct in_device *in_dev)
>
> in_dev->dead = 1;
>
> +   if (in_dev->ifa_list)
> +   blocking_notifier_call_chain(&inetaddr_chain,
> +NETDEV_UNREGISTER,
> +in_dev->ifa_list);
> +
> ip_mc_destroy_dev(in_dev);


Hmm, but inetdev_destroy() is only called when NETDEV_UNREGISTER
is happening and masq already registers a netdev notifier...



>
> while ((ifa = in_dev->ifa_list) != NULL) {
> diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c 
> b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> index c6eb421..1bb8026 100644
> --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
> @@ -111,6 +111,10 @@ static int masq_inet_event(struct notifier_block *this,
> struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev;
> struct netdev_notifier_info info;
>
> +   if (event != NETDEV_UNREGISTER)
> +   return NOTIFY_DONE;
> +   event = NETDEV_DOWN;
> +
> netdev_notifier_info_init(&info, dev);
> return masq_device_event(this, event, &info);
>  }

If masq really doesn't care about inetdev destroy or inetaddr removal,
we should just remove its inetaddr notifier.


Re: [RFC/RFT] mac80211: implement fq_codel for software queuing

2016-03-10 Thread Dave Taht
>> regular fq_codel uses 1024 and there has not been much reason to
>> change it. In the case of an AP which has more limited memory, 256 or
>> 1024 would be a good setting, per station. I'd stick to 1024 for now.
>
> Do note that the 4096 is shared _across_ station-tid queues. It is not
> per-station. If you have 10 stations you still have 4096 flows
> (actually 4096 + 16*10, because each tid - and there are 16 - has it's
> own fallback flow in case of hash collision on the global flowmap to
> maintain per-sta-tid queuing).

I have to admit I didn't parse this well - still haven't, I think I
need to draw. (got a picture?)

Where is this part happening in the code (or firmware?)

" because each tid - and there are 16 - has it's
 own fallback flow in case of hash collision on the global flowmap to
 maintain per-sta-tid queuing"

"fallback flow - hash collision on global flowmap" - huh?

> With that in mind do you still think 1024 is enough?

Can't answer that question without understanding what you said above.

I assembled a few of the patches to date (your fq_codel patch, avery's
and tims ath9k stuff) and tested them, to no measurable effect,
against linus's tree a day or two back. I also acquired an ath10k card
- would one of these suit?

http://www.amazon.com/gp/product/B011SIMFR8?psc=1=true_=oh_aui_detailpage_o08_s00


Re: [RFC -next 2/2] virtio_net: Read and use the advised MTU

2016-03-10 Thread Sergei Shtylyov

Hello.

On 03/10/2016 05:28 PM, Aaron Conole wrote:


This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it
exists, read the advised MTU and use it.

No proper error handling is provided for the case where a user changes the
negotiated MTU. A future commit will add proper error handling. Instead, a
warning is emitted if the guest changes the device MTU after previously being
given advice.

Signed-off-by: Aaron Conole 
---
  drivers/net/virtio_net.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 767ab11..7175563 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c

[...]

@@ -1390,8 +1391,12 @@ static const struct ethtool_ops virtnet_ethtool_ops = {

  static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
  {
+   struct virtnet_info *vi = netdev_priv(dev);
if (new_mtu < MIN_MTU || new_mtu > MAX_MTU)
return -EINVAL;
+   if (vi->negotiated_mtu == true) {
+   pr_warn("changing mtu from negotiated mtu.");
+   }


   {} not needed, see Documentation/CodingStyle.

[...]

MBR, Sergei
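
For reference, the shape of the check under review, with the single-statement braces and the `== true` comparison dropped, could look like the userspace sketch below. It is not the virtio_net code: `struct nic_model`, `demo_change_mtu()`, the `warnings` counter (standing in for a log message) and the `DEMO_MIN_MTU`/`DEMO_MAX_MTU` limits are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical limits; the real driver defines its own MIN_MTU/MAX_MTU. */
#define DEMO_MIN_MTU 68
#define DEMO_MAX_MTU 65535

/* Hypothetical stand-in for the driver's private state. */
struct nic_model {
	int mtu;
	bool negotiated_mtu;	/* true if the device advised an MTU */
	int warnings;		/* counts warnings instead of printing them */
};

int demo_change_mtu(struct nic_model *nic, int new_mtu)
{
	if (new_mtu < DEMO_MIN_MTU || new_mtu > DEMO_MAX_MTU)
		return -EINVAL;
	if (nic->negotiated_mtu)	/* single statement: no braces needed */
		nic->warnings++;
	nic->mtu = new_mtu;
	return 0;
}
```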



Re: [PATCH nf-next v9 8/8] openvswitch: Interface with NAT.

2016-03-10 Thread Jarno Rajahalme
Thanks for the reviews Joe! Comments below.

> On Mar 9, 2016, at 7:47 PM, Joe Stringer  wrote:
> 
> Hi Jarno,
> 
> Thanks for working on this. Mostly just a few style things around #ifdefs 
> below.
> 
> On 9 March 2016 at 15:10, Jarno Rajahalme  wrote:
>> Extend OVS conntrack interface to cover NAT.  New nested
>> OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
>> A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
>> If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
>> attributes, new (non-committed/non-confirmed) connections are mangled
>> according to the rest of the nested attributes.
>> 
>> The corresponding OVS userspace patch series includes test cases (in
>> tests/system-traffic.at) that also serve as example uses.
>> 
>> This work extends on a branch by Thomas Graf at
>> https://github.com/tgraf/ovs/tree/nat.
> 
> Thomas, I guess there was not signoff in these patches so Jarno does
> not have your signoff in this patch.
> 
>> Signed-off-by: Jarno Rajahalme 
>> ---
>> v9: Fixed module dependencies.
>> 
>> include/uapi/linux/openvswitch.h |  49 
>> net/openvswitch/Kconfig  |   3 +-
>> net/openvswitch/conntrack.c  | 523 
>> +--
>> net/openvswitch/conntrack.h  |   3 +-
>> 4 files changed, 551 insertions(+), 27 deletions(-)
> 
> 
> 
>> diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
>> index cd5fd9d..23471a4 100644
>> --- a/net/openvswitch/Kconfig
>> +++ b/net/openvswitch/Kconfig
>> @@ -6,7 +6,8 @@ config OPENVSWITCH
>>tristate "Open vSwitch"
>>depends on INET
>>depends on !NF_CONNTRACK || \
>> -  (NF_CONNTRACK && (!NF_DEFRAG_IPV6 || NF_DEFRAG_IPV6))
>> +  (NF_CONNTRACK && ((!NF_DEFRAG_IPV6 || NF_DEFRAG_IPV6) && \
>> +(!NF_NAT || NF_NAT)))
> 
> Whitespace.
> 

Fixed.

>>select LIBCRC32C
>>select MPLS
>>select NET_MPLS_GSO
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index 5711f80..6455237 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
> 
> 
> 
>> struct ovs_ct_len_tbl {
>> -   size_t maxlen;
>> -   size_t minlen;
>> +   int maxlen;
>> +   int minlen;
>> };
> 
> Are these changed for a specific reason, or just to use INT_MAX rather
> than SIZE_MAX in ovs_ct_len_tbl?
> 

‘maxlen’ and ‘minlen’ are compared against the values returned by nla_len(), 
which returns an int:

net/netlink.h:static inline int nla_len(const struct nlattr *nla)

so I figured it is better to have these as ints, too.
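
One way to see why matching the signedness matters is the small sketch below. It is only an illustration with hypothetical helper names and is not claimed to be the reason behind the patch: when an int length is compared against size_t bounds, a negative length is converted to a very large unsigned value before the comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bounds kept as int, matching an int-returning length accessor. */
bool len_in_range_int(int len, int minlen, int maxlen)
{
	return len >= minlen && len <= maxlen;
}

/*
 * Bounds kept as size_t: the int length has to be converted first.
 * The cast is written out here; without it the compiler performs the
 * same conversion implicitly (and may warn with -Wsign-compare).
 */
bool len_in_range_size(int len, size_t minlen, size_t maxlen)
{
	return (size_t)len >= minlen && (size_t)len <= maxlen;
}
```

With int bounds a negative length is rejected by the lower bound; with size_t bounds it wraps around and is only caught if the upper bound is small enough.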

>> /* Metadata mark for masked write to conntrack mark */
>> @@ -42,15 +52,29 @@ struct md_labels {
>>struct ovs_key_ct_labels mask;
>> };
>> 
>> +#ifdef CONFIG_NF_NAT_NEEDED
>> +enum ovs_ct_nat {
>> +   OVS_CT_NAT = 1 << 0, /* NAT for committed connections only. */
>> +   OVS_CT_SRC_NAT = 1 << 1, /* Source NAT for NEW connections. */
>> +   OVS_CT_DST_NAT = 1 << 2, /* Destination NAT for NEW connections. */
>> +};
>> +#endif
> 
> Here...
> 
>> /* Conntrack action context for execution. */
>> struct ovs_conntrack_info {
>>struct nf_conntrack_helper *helper;
>>struct nf_conntrack_zone zone;
>>struct nf_conn *ct;
>>u8 commit : 1;
>> +#ifdef CONFIG_NF_NAT_NEEDED
>> +   u8 nat : 3; /* enum ovs_ct_nat */
>> +#endif
> 
> and here.. I wonder if we can trim more of these #ifdefs, for
> readability and more compiler coverage if the feature is disabled.
> 

Trimmed this and other #ifdefs as you suggested, and it still compiles when NAT 
is disabled.

Just posted the v10, which I hope will be the final version :-)

  Jarno



[PATCH nf-next v10 6/8] openvswitch: Handle NF_REPEAT in conntrack action.

2016-03-10 Thread Jarno Rajahalme
Repeat the nf_conntrack_in() call when it returns NF_REPEAT.  This
avoids dropping a SYN packet re-opening an existing TCP connection.

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 net/openvswitch/conntrack.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index ae36fe2..85256b3 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -485,6 +485,7 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
 */
if (!skb_nfct_cached(net, key, info, skb)) {
struct nf_conn *tmpl = info->ct;
+   int err;
 
/* Associate skb with specified zone. */
if (tmpl) {
@@ -495,8 +496,13 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
skb->nfctinfo = IP_CT_NEW;
}
 
-   if (nf_conntrack_in(net, info->family, NF_INET_PRE_ROUTING,
-   skb) != NF_ACCEPT)
+   /* Repeat if requested, see nf_iterate(). */
+   do {
+   err = nf_conntrack_in(net, info->family,
+ NF_INET_PRE_ROUTING, skb);
+   } while (err == NF_REPEAT);
+
+   if (err != NF_ACCEPT)
return -ENOENT;
 
ovs_ct_update_key(skb, info, key, true);
-- 
2.1.4
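
The retry pattern in this patch can be modelled outside the kernel as follows. The verdict constants, `run_until_settled()` and `reopen_hook()` are hypothetical names for this sketch; only the do/while shape mirrors the diff above.

```c
#include <errno.h>

/* Hypothetical verdicts standing in for NF_DROP/NF_ACCEPT/NF_REPEAT. */
enum demo_verdict {
	DEMO_DROP = 0,
	DEMO_ACCEPT = 1,
	DEMO_REPEAT = 4,
};

typedef int (*demo_hook_fn)(void *state);

/*
 * Call the hook again for as long as it asks to be repeated, then map
 * the final verdict to 0 (accepted) or -ENOENT (anything else).
 */
int run_until_settled(demo_hook_fn hook, void *state)
{
	int verdict;

	do {
		verdict = hook(state);
	} while (verdict == DEMO_REPEAT);

	return verdict == DEMO_ACCEPT ? 0 : -ENOENT;
}

/*
 * Example hook: asks for one repeat on the first call (as when an old
 * connection entry has to be torn down first), then accepts.
 * 'state' points to an int call counter.
 */
int reopen_hook(void *state)
{
	int *calls = state;

	return (*calls)++ == 0 ? DEMO_REPEAT : DEMO_ACCEPT;
}
```

Without the loop, the first `DEMO_REPEAT` would be treated like any other non-accept verdict and the packet would be rejected.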



[PATCH nf-next v10 7/8] openvswitch: Delay conntrack helper call for new connections.

2016-03-10 Thread Jarno Rajahalme
There is no need to help connections that are not confirmed, so we can
delay helping new connections to the time when they are confirmed.
This change is needed for NAT support, and having this as a separate
patch will make the following NAT patch a bit easier to review.

Signed-off-by: Jarno Rajahalme 
---
 net/openvswitch/conntrack.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 85256b3..f718b72 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -483,7 +483,11 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
 * actually run the packet through conntrack twice unless it's for a
 * different zone.
 */
-   if (!skb_nfct_cached(net, key, info, skb)) {
+   bool cached = skb_nfct_cached(net, key, info, skb);
+   enum ip_conntrack_info ctinfo;
+   struct nf_conn *ct;
+
+   if (!cached) {
struct nf_conn *tmpl = info->ct;
int err;
 
@@ -506,11 +510,18 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
return -ENOENT;
 
ovs_ct_update_key(skb, info, key, true);
+   }
 
-   if (ovs_ct_helper(skb, info->family) != NF_ACCEPT) {
-   WARN_ONCE(1, "helper rejected packet");
-   return -EINVAL;
-   }
+   /* Call the helper only if:
+* - nf_conntrack_in() was executed above ("!cached") for a confirmed
+*   connection, or
+* - When committing an unconfirmed connection.
+*/
+   ct = nf_ct_get(skb, &ctinfo);
+   if (ct && (nf_ct_is_confirmed(ct) ? !cached : info->commit) &&
+   ovs_ct_helper(skb, info->family) != NF_ACCEPT) {
+   WARN_ONCE(1, "helper rejected packet");
+   return -EINVAL;
}
 
return 0;
-- 
2.1.4
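
The condition added by this patch is compact, so here it is spelled out as a standalone predicate. `should_call_helper()` and its boolean parameters are hypothetical names for this sketch; the logic follows the expression `ct && (nf_ct_is_confirmed(ct) ? !cached : info->commit)` from the diff above.

```c
#include <stdbool.h>

/*
 * have_ct   - a conntrack entry is attached to the packet
 * confirmed - that entry is already confirmed
 * cached    - the lookup was skipped because the result was cached
 * commit    - the action is committing the connection
 */
bool should_call_helper(bool have_ct, bool confirmed, bool cached, bool commit)
{
	if (!have_ct)
		return false;
	if (confirmed)
		return !cached;	/* only when conntrack actually ran just now */
	return commit;		/* unconfirmed: only at commit time */
}
```

Read as a table: confirmed connections are helped once per conntrack pass, and new connections are helped only when they are being committed.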



[PATCH nf-next v10 2/8] netfilter: Allow calling into nat helper without skb_dst.

2016-03-10 Thread Jarno Rajahalme
NAT checksum recalculation code assumes existence of skb_dst, which
becomes a problem for a later patch in the series ("openvswitch:
Interface with NAT.").  Simplify this by removing the check on
skb_dst, as the checksum will be dealt with later in the stack.

Suggested-by: Pravin Shelar 
Signed-off-by: Jarno Rajahalme 
---
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c | 30 --
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c | 30 --
 2 files changed, 16 insertions(+), 44 deletions(-)

diff --git a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
index 61c7cc2..f8aad03 100644
--- a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
@@ -127,29 +127,15 @@ static void nf_nat_ipv4_csum_recalc(struct sk_buff *skb,
u8 proto, void *data, __sum16 *check,
int datalen, int oldlen)
 {
-   const struct iphdr *iph = ip_hdr(skb);
-   struct rtable *rt = skb_rtable(skb);
-
if (skb->ip_summed != CHECKSUM_PARTIAL) {
-   if (!(rt->rt_flags & RTCF_LOCAL) &&
-   (!skb->dev || skb->dev->features &
-(NETIF_F_IP_CSUM | NETIF_F_HW_CSUM))) {
-   skb->ip_summed = CHECKSUM_PARTIAL;
-   skb->csum_start = skb_headroom(skb) +
- skb_network_offset(skb) +
- ip_hdrlen(skb);
-   skb->csum_offset = (void *)check - data;
-   *check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
-   datalen, proto, 0);
-   } else {
-   *check = 0;
-   *check = csum_tcpudp_magic(iph->saddr, iph->daddr,
-  datalen, proto,
-  csum_partial(data, datalen,
-   0));
-   if (proto == IPPROTO_UDP && !*check)
-   *check = CSUM_MANGLED_0;
-   }
+   const struct iphdr *iph = ip_hdr(skb);
+
+   skb->ip_summed = CHECKSUM_PARTIAL;
+   skb->csum_start = skb_headroom(skb) + skb_network_offset(skb) +
+   ip_hdrlen(skb);
+   skb->csum_offset = (void *)check - data;
+   *check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, datalen,
+   proto, 0);
} else
inet_proto_csum_replace2(check, skb,
 htons(oldlen), htons(datalen), true);
diff --git a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c 
b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
index 6ce3099..e0be97e 100644
--- a/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_l3proto_ipv6.c
@@ -131,29 +131,15 @@ static void nf_nat_ipv6_csum_recalc(struct sk_buff *skb,
u8 proto, void *data, __sum16 *check,
int datalen, int oldlen)
 {
-   const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
-   struct rt6_info *rt = (struct rt6_info *)skb_dst(skb);
-
if (skb->ip_summed != CHECKSUM_PARTIAL) {
-   if (!(rt->rt6i_flags & RTF_LOCAL) &&
-   (!skb->dev || skb->dev->features &
-(NETIF_F_IPV6_CSUM | NETIF_F_HW_CSUM))) {
-   skb->ip_summed = CHECKSUM_PARTIAL;
-   skb->csum_start = skb_headroom(skb) +
- skb_network_offset(skb) +
- (data - (void *)skb->data);
-   skb->csum_offset = (void *)check - data;
-   *check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
- datalen, proto, 0);
-   } else {
-   *check = 0;
-   *check = csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
-datalen, proto,
-csum_partial(data, datalen,
- 0));
-   if (proto == IPPROTO_UDP && !*check)
-   *check = CSUM_MANGLED_0;
-   }
+   const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+
+   skb->ip_summed = CHECKSUM_PARTIAL;
+   skb->csum_start = skb_headroom(skb) + skb_network_offset(skb) +
+   (data - (void *)skb->data);
+   skb->csum_offset = (void *)check - data;
+   *check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
+ datalen, proto, 0);
} else

[PATCH nf-next v10 3/8] openvswitch: Add commentary to conntrack.c

2016-03-10 Thread Jarno Rajahalme
This makes the code easier to understand and the following patches
more focused.

Signed-off-by: Jarno Rajahalme 
---
 net/openvswitch/conntrack.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 3045290..2c2bf07 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -152,8 +152,12 @@ static void ovs_ct_update_key(const struct sk_buff *skb,
ct = nf_ct_get(skb, &ctinfo);
if (ct) {
state = ovs_ct_get_state(ctinfo);
+   /* All unconfirmed entries are NEW connections. */
if (!nf_ct_is_confirmed(ct))
state |= OVS_CS_F_NEW;
+   /* OVS persists the related flag for the duration of the
+* connection.
+*/
if (ct->master)
state |= OVS_CS_F_RELATED;
zone = nf_ct_zone(ct);
@@ -165,6 +169,9 @@ static void ovs_ct_update_key(const struct sk_buff *skb,
__ovs_ct_update_key(key, state, zone, ct);
 }
 
+/* This is called to initialize CT key fields possibly coming in from the local
+ * stack.
+ */
 void ovs_ct_fill_key(const struct sk_buff *skb, struct sw_flow_key *key)
 {
ovs_ct_update_key(skb, NULL, key, false);
@@ -199,7 +206,6 @@ static int ovs_ct_set_mark(struct sk_buff *skb, struct 
sw_flow_key *key,
struct nf_conn *ct;
u32 new_mark;
 
-
/* The connection could be invalid, in which case set_mark is no-op. */
ct = nf_ct_get(skb, &ctinfo);
if (!ct)
@@ -375,6 +381,11 @@ static bool skb_nfct_cached(const struct net *net, const 
struct sk_buff *skb,
return true;
 }
 
+/* Pass 'skb' through conntrack in 'net', using zone configured in 'info', if
+ * not done already.  Update key with new CT state.
+ * Note that if the packet is deemed invalid by conntrack, skb->nfct will be
+ * set to NULL and 0 will be returned.
+ */
 static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
   const struct ovs_conntrack_info *info,
   struct sk_buff *skb)
@@ -418,6 +429,13 @@ static int ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
 {
struct nf_conntrack_expect *exp;
 
+   /* If we pass an expected packet through nf_conntrack_in() the
+* expectation is typically removed, but the packet could still be
+* lost in upcall processing.  To prevent this from happening we
+* perform an explicit expectation lookup.  Expected connections are
+* always new, and will be passed through conntrack only when they are
+* committed, as it is OK to remove the expectation at that time.
+*/
exp = ovs_ct_expect_find(net, &info->zone, info->family, skb);
if (exp) {
u8 state;
@@ -455,6 +473,7 @@ static int ovs_ct_commit(struct net *net, struct 
sw_flow_key *key,
err = __ovs_ct_lookup(net, key, info, skb);
if (err)
return err;
+   /* This is a no-op if the connection has already been confirmed. */
if (nf_conntrack_confirm(skb) != NF_ACCEPT)
return -EINVAL;
 
-- 
2.1.4



[PATCH nf-next v10 4/8] openvswitch: Update the CT state key only after nf_conntrack_in().

2016-03-10 Thread Jarno Rajahalme
Only a successful nf_conntrack_in() call can effect a connection state
change, so it suffices to update the key only after the
nf_conntrack_in() returns.

This change is needed for the later NAT patches.

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 net/openvswitch/conntrack.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 2c2bf07..a487bb3 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -382,7 +382,8 @@ static bool skb_nfct_cached(const struct net *net, const 
struct sk_buff *skb,
 }
 
 /* Pass 'skb' through conntrack in 'net', using zone configured in 'info', if
- * not done already.  Update key with new CT state.
+ * not done already.  Update key with new CT state after passing the packet
+ * through conntrack.
  * Note that if the packet is deemed invalid by conntrack, skb->nfct will be
  * set to NULL and 0 will be returned.
  */
@@ -411,14 +412,14 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
skb) != NF_ACCEPT)
return -ENOENT;
 
+   ovs_ct_update_key(skb, info, key, true);
+
if (ovs_ct_helper(skb, info->family) != NF_ACCEPT) {
WARN_ONCE(1, "helper rejected packet");
return -EINVAL;
}
}
 
-   ovs_ct_update_key(skb, info, key, true);
-
return 0;
 }
 
-- 
2.1.4



[PATCH nf-next v10 8/8] openvswitch: Interface with NAT.

2016-03-10 Thread Jarno Rajahalme
Extend OVS conntrack interface to cover NAT.  New nested
OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
attributes, new (non-committed/non-confirmed) connections are mangled
according to the rest of the nested attributes.

The corresponding OVS userspace patch series includes test cases (in
tests/system-traffic.at) that also serve as example uses.

This work extends on a branch by Thomas Graf at
https://github.com/tgraf/ovs/tree/nat.

Signed-off-by: Jarno Rajahalme 
Acked-by: Thomas Graf 
---
 include/uapi/linux/openvswitch.h |  49 
 net/openvswitch/Kconfig  |   3 +-
 net/openvswitch/conntrack.c  | 524 +--
 net/openvswitch/conntrack.h  |   3 +-
 4 files changed, 551 insertions(+), 28 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index a27222d..616d047 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -454,6 +454,14 @@ struct ovs_key_ct_labels {
 #define OVS_CS_F_REPLY_DIR 0x08 /* Flow is in the reply direction. */
 #define OVS_CS_F_INVALID   0x10 /* Could not track connection. */
 #define OVS_CS_F_TRACKED   0x20 /* Conntrack has occurred. */
+#define OVS_CS_F_SRC_NAT   0x40 /* Packet's source address/port was
+* mangled by NAT.
+*/
+#define OVS_CS_F_DST_NAT   0x80 /* Packet's destination address/port
+* was mangled by NAT.
+*/
+
+#define OVS_CS_F_NAT_MASK (OVS_CS_F_SRC_NAT | OVS_CS_F_DST_NAT)
 
 /**
  * enum ovs_flow_attr - attributes for %OVS_FLOW_* commands.
@@ -632,6 +640,8 @@ struct ovs_action_hash {
  * mask. For each bit set in the mask, the corresponding bit in the value is
  * copied to the connection tracking label field in the connection.
  * @OVS_CT_ATTR_HELPER: variable length string defining conntrack ALG.
+ * @OVS_CT_ATTR_NAT: Nested OVS_NAT_ATTR_* for performing L3 network address
+ * translation (NAT) on the packet.
  */
 enum ovs_ct_attr {
OVS_CT_ATTR_UNSPEC,
@@ -641,12 +651,51 @@ enum ovs_ct_attr {
OVS_CT_ATTR_LABELS, /* labels to associate with this connection. */
OVS_CT_ATTR_HELPER, /* netlink helper to assist detection of
   related connections. */
+   OVS_CT_ATTR_NAT,/* Nested OVS_NAT_ATTR_* */
__OVS_CT_ATTR_MAX
 };
 
 #define OVS_CT_ATTR_MAX (__OVS_CT_ATTR_MAX - 1)
 
 /**
+ * enum ovs_nat_attr - Attributes for %OVS_CT_ATTR_NAT.
+ *
+ * @OVS_NAT_ATTR_SRC: Flag for Source NAT (mangle source address/port).
+ * @OVS_NAT_ATTR_DST: Flag for Destination NAT (mangle destination
+ * address/port).  Only one of (@OVS_NAT_ATTR_SRC, @OVS_NAT_ATTR_DST) may be
+ * specified.  Effective only for packets for ct_state NEW connections.
+ * Packets of committed connections are mangled by the NAT action according to
+ * the committed NAT type regardless of the flags specified.  As a corollary, a
+ * NAT action without a NAT type flag will only mangle packets of committed
+ * connections.  The following NAT attributes only apply for NEW
+ * (non-committed) connections, and they may be included only when the CT
+ * action has the @OVS_CT_ATTR_COMMIT flag and either @OVS_NAT_ATTR_SRC or
+ * @OVS_NAT_ATTR_DST is also included.
+ * @OVS_NAT_ATTR_IP_MIN: struct in_addr or struct in6_addr
+ * @OVS_NAT_ATTR_IP_MAX: struct in_addr or struct in6_addr
+ * @OVS_NAT_ATTR_PROTO_MIN: u16 L4 protocol specific lower boundary (port)
+ * @OVS_NAT_ATTR_PROTO_MAX: u16 L4 protocol specific upper boundary (port)
+ * @OVS_NAT_ATTR_PERSISTENT: Flag for persistent IP mapping across reboots
+ * @OVS_NAT_ATTR_PROTO_HASH: Flag for pseudo random L4 port mapping (MD5)
+ * @OVS_NAT_ATTR_PROTO_RANDOM: Flag for fully randomized L4 port mapping
+ */
+enum ovs_nat_attr {
+   OVS_NAT_ATTR_UNSPEC,
+   OVS_NAT_ATTR_SRC,
+   OVS_NAT_ATTR_DST,
+   OVS_NAT_ATTR_IP_MIN,
+   OVS_NAT_ATTR_IP_MAX,
+   OVS_NAT_ATTR_PROTO_MIN,
+   OVS_NAT_ATTR_PROTO_MAX,
+   OVS_NAT_ATTR_PERSISTENT,
+   OVS_NAT_ATTR_PROTO_HASH,
+   OVS_NAT_ATTR_PROTO_RANDOM,
+   __OVS_NAT_ATTR_MAX,
+};
+
+#define OVS_NAT_ATTR_MAX (__OVS_NAT_ATTR_MAX - 1)
+
+/**
  * enum ovs_action_attr - Action types.
  *
  * @OVS_ACTION_ATTR_OUTPUT: Output packet to port.
diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
index cd5fd9d..234a733 100644
--- a/net/openvswitch/Kconfig
+++ b/net/openvswitch/Kconfig
@@ -6,7 +6,8 @@ config OPENVSWITCH
tristate "Open vSwitch"
depends on INET
depends on !NF_CONNTRACK || \
-  (NF_CONNTRACK && (!NF_DEFRAG_IPV6 || NF_DEFRAG_IPV6))
+  

[PATCH nf-next v10 0/8] openvswitch: NAT support

2016-03-10 Thread Jarno Rajahalme
This series adds NAT support to openvswitch kernel module.  A few
changes are needed to the netfilter code to facilitate this (patches
1-2/8).  Patches 3-7 make the openvswitch kernel module ready for the
patch 8 that adds the NAT support by calling into netfilter NAT code
from the openvswitch conntrack action.

This version fixes spelling errors in comments and eliminates many of
the #ifdefs in the final patch that were not strictly necessary.  This
makes the code more readable and improves compile time coverage even
when NAT feature is not configured.

The OVS master now has the corresponding OVS userspace support to use
and test the NAT features.  Below is a walk-through of a simple use
case.

In this case ports 1 and 2 are in different namespaces.  The OpenFlow
table below only allows IPv4 connections initiated from port 1, and
applies source NAT to those connections:

  in_port=1,ip,action=ct(commit,zone=1,nat(src=10.1.1.240-10.1.1.255)),2
  in_port=2,ct_state=-trk,ip,action=ct(table=0,zone=1,nat)
  in_port=2,ct_state=+est,ct_zone=1,ip,action=1

This flow table matches all IPv4 traffic from port 1, runs them
through conntrack in zone 1 and NATs them.  The NAT is initialized to
do source IP mapping to the given range for the first packet of each
connection, after which the new connection is committed (confirmed).
For further packets of already tracked connections NAT is done
according to the connection state and the commit is a no-op.  Each
packet that is not flagged as a drop by the CT action is forwarded to
port 2.  The CT action does an implicit fragmentation reassembly, so
that only complete packets are run through conntrack.  Reassembled
packets are re-fragmented on output.

The IPv4 traffic coming from port 2 is first matched for the
non-tracked state (-trk), which means that the packet has not been
through a CT action yet.  Such traffic is run through the conntrack in
zone 1 and all packets associated with a NATted connection are NATted
also in the return direction.  After the packet has been through
conntrack it is recirculated back to OpenFlow table 0 (which is the
default table, so all the rules above are in table 0).  The CT action
changes the 'trk' flag to being set, so the packets after
recirculation no longer match the second rule.  The third rule then
matches the recirculated packets that were marked as established by
conntrack (+est), and the packet is output on port 1.  Matching on
ct_zone is not strictly needed, but in this test case it verifies that
the ct_zone key attribute is properly set by the conntrack action.

A full test case requires rules for ARP handling not shown here.

The flow table above is an OpenFlow table, and the rules therein
are translated to kernel flow entries on-demand by ovs-vswitchd.

Jarno Rajahalme (8):
  netfilter: Remove IP_CT_NEW_REPLY definition.
  netfilter: Allow calling into nat helper without skb_dst.
  openvswitch: Add commentary to conntrack.c
  openvswitch: Update the CT state key only after nf_conntrack_in().
  openvswitch: Find existing conntrack entry after upcall.
  openvswitch: Handle NF_REPEAT in conntrack action.
  openvswitch: Delay conntrack helper call for new connections.
  openvswitch: Interface with NAT.

 include/uapi/linux/netfilter/nf_conntrack_common.h |  12 +-
 include/uapi/linux/openvswitch.h   |  49 ++
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c   |  30 +-
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c   |  30 +-
 net/openvswitch/Kconfig|   3 +-
 net/openvswitch/conntrack.c| 660 +++--
 net/openvswitch/conntrack.h|   3 +-
 7 files changed, 700 insertions(+), 87 deletions(-)

-- 
2.1.4



[PATCH nf-next v10 5/8] openvswitch: Find existing conntrack entry after upcall.

2016-03-10 Thread Jarno Rajahalme
Add a new function ovs_ct_find_existing() to find the existing
conntrack entry that this packet was already run through.  This is
only to be called when there is evidence that the packet was already
tracked and committed, but we lost the ct reference due to a
userspace upcall.

ovs_ct_find_existing() is called from skb_nfct_cached(), which can now
hide the fact that the ct reference may have been lost due to an
upcall.  This allows ovs_ct_commit() to be simplified.

This patch is needed by later "openvswitch: Interface with NAT" patch,
as we need to be able to pass the packet through NAT using the
original ct reference also after the reference is lost after an
upcall.

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 net/openvswitch/conntrack.c | 103 ++--
 1 file changed, 90 insertions(+), 13 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index a487bb3..ae36fe2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -356,14 +356,101 @@ ovs_ct_expect_find(struct net *net, const struct nf_conntrack_zone *zone,
return __nf_ct_expect_find(net, zone, &tuple);
 }
 
+/* This replicates logic from nf_conntrack_core.c that is not exported. */
+static enum ip_conntrack_info
+ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
+{
+   const struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+   if (NF_CT_DIRECTION(h) == IP_CT_DIR_REPLY)
+   return IP_CT_ESTABLISHED_REPLY;
+   /* Once we've had two way comms, always ESTABLISHED. */
+   if (test_bit(IPS_SEEN_REPLY_BIT, &ct->status))
+   return IP_CT_ESTABLISHED;
+   if (test_bit(IPS_EXPECTED_BIT, &ct->status))
+   return IP_CT_RELATED;
+   return IP_CT_NEW;
+}
+
+/* Find an existing connection which this packet belongs to without
+ * re-attributing statistics or modifying the connection state.  This allows an
+ * skb->nfct lost due to an upcall to be recovered during actions execution.
+ *
+ * Must be called with rcu_read_lock.
+ *
+ * On success, populates skb->nfct and skb->nfctinfo, and returns the
+ * connection.  Returns NULL if there is no existing entry.
+ */
+static struct nf_conn *
+ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
+u8 l3num, struct sk_buff *skb)
+{
+   struct nf_conntrack_l3proto *l3proto;
+   struct nf_conntrack_l4proto *l4proto;
+   struct nf_conntrack_tuple tuple;
+   struct nf_conntrack_tuple_hash *h;
+   enum ip_conntrack_info ctinfo;
+   struct nf_conn *ct;
+   unsigned int dataoff;
+   u8 protonum;
+
+   l3proto = __nf_ct_l3proto_find(l3num);
+   if (!l3proto) {
+   pr_debug("ovs_ct_find_existing: Can't get l3proto\n");
+   return NULL;
+   }
+   if (l3proto->get_l4proto(skb, skb_network_offset(skb), &dataoff,
+&protonum) <= 0) {
+   pr_debug("ovs_ct_find_existing: Can't get protonum\n");
+   return NULL;
+   }
+   l4proto = __nf_ct_l4proto_find(l3num, protonum);
+   if (!l4proto) {
+   pr_debug("ovs_ct_find_existing: Can't get l4proto\n");
+   return NULL;
+   }
+   if (!nf_ct_get_tuple(skb, skb_network_offset(skb), dataoff, l3num,
+protonum, net, &tuple, l3proto, l4proto)) {
+   pr_debug("ovs_ct_find_existing: Can't get tuple\n");
+   return NULL;
+   }
+
+   /* look for tuple match */
+   h = nf_conntrack_find_get(net, zone, &tuple);
+   if (!h)
+   return NULL;   /* Not found. */
+
+   ct = nf_ct_tuplehash_to_ctrack(h);
+
+   ctinfo = ovs_ct_get_info(h);
+   if (ctinfo == IP_CT_NEW) {
+   /* This should not happen. */
+   WARN_ONCE(1, "ovs_ct_find_existing: new packet for %p\n", ct);
+   }
+   skb->nfct = &ct->ct_general;
+   skb->nfctinfo = ctinfo;
+   return ct;
+}
+
 /* Determine whether skb->nfct is equal to the result of conntrack lookup. */
-static bool skb_nfct_cached(const struct net *net, const struct sk_buff *skb,
-   const struct ovs_conntrack_info *info)
+static bool skb_nfct_cached(struct net *net,
+   const struct sw_flow_key *key,
+   const struct ovs_conntrack_info *info,
+   struct sk_buff *skb)
 {
enum ip_conntrack_info ctinfo;
struct nf_conn *ct;
 
ct = nf_ct_get(skb, &ctinfo);
+   /* If no ct, check if we have evidence that an existing conntrack entry
+* might be found for this skb.  This happens when we lose a skb->nfct
+* due to an upcall.  If the connection was not confirmed, it is not
+* cached and needs to be run through conntrack again.
+*/
+   if (!ct && key->ct.state & OVS_CS_F_TRACKED &&
+   !(key->ct.state & OVS_CS_F_INVALID) &&
+ 

[PATCH nf-next v10 1/8] netfilter: Remove IP_CT_NEW_REPLY definition.

2016-03-10 Thread Jarno Rajahalme
Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
not make sense.  This allows the definition of IP_CT_NUMBER to be
simplified as well.

Signed-off-by: Jarno Rajahalme 
---
 include/uapi/linux/netfilter/nf_conntrack_common.h | 12 +---
 net/openvswitch/conntrack.c|  2 --
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index 319f471..6d074d1 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -20,9 +20,15 @@ enum ip_conntrack_info {
 
IP_CT_ESTABLISHED_REPLY = IP_CT_ESTABLISHED + IP_CT_IS_REPLY,
IP_CT_RELATED_REPLY = IP_CT_RELATED + IP_CT_IS_REPLY,
-   IP_CT_NEW_REPLY = IP_CT_NEW + IP_CT_IS_REPLY,   
-   /* Number of distinct IP_CT types (no NEW in reply dirn). */
-   IP_CT_NUMBER = IP_CT_IS_REPLY * 2 - 1
+   /* No NEW in reply direction. */
+
+   /* Number of distinct IP_CT types. */
+   IP_CT_NUMBER,
+
+   /* only for userspace compatibility */
+#ifndef __KERNEL__
+   IP_CT_NEW_REPLY = IP_CT_NUMBER,
+#endif
 };
 
 #define NF_CT_STATE_INVALID_BIT(1 << 0)
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index ee6ff8f..3045290 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -75,7 +75,6 @@ static u8 ovs_ct_get_state(enum ip_conntrack_info ctinfo)
switch (ctinfo) {
case IP_CT_ESTABLISHED_REPLY:
case IP_CT_RELATED_REPLY:
-   case IP_CT_NEW_REPLY:
ct_state |= OVS_CS_F_REPLY_DIR;
break;
default:
@@ -92,7 +91,6 @@ static u8 ovs_ct_get_state(enum ip_conntrack_info ctinfo)
ct_state |= OVS_CS_F_RELATED;
break;
case IP_CT_NEW:
-   case IP_CT_NEW_REPLY:
ct_state |= OVS_CS_F_NEW;
break;
default:
-- 
2.1.4
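
As a sanity check on the enum change above, the old and new layouts can
be compared in userspace.  The sketch below copies both layouts under
hypothetical OLD_ and NEW_ prefixes (the first four enumerators are
from the existing header, not from this hunk) and checks that
IP_CT_NUMBER and the userspace-only IP_CT_NEW_REPLY keep the value 5.

```c
#include <assert.h>

/* Layout before the patch. */
enum old_info {
	OLD_ESTABLISHED,
	OLD_RELATED,
	OLD_NEW,
	OLD_IS_REPLY,
	OLD_ESTABLISHED_REPLY = OLD_ESTABLISHED + OLD_IS_REPLY,
	OLD_RELATED_REPLY = OLD_RELATED + OLD_IS_REPLY,
	OLD_NEW_REPLY = OLD_NEW + OLD_IS_REPLY,
	OLD_NUMBER = OLD_IS_REPLY * 2 - 1
};

/* Layout after the patch, as userspace (no __KERNEL__) sees it. */
enum new_info {
	NEW_ESTABLISHED,
	NEW_RELATED,
	NEW_NEW,
	NEW_IS_REPLY,
	NEW_ESTABLISHED_REPLY = NEW_ESTABLISHED + NEW_IS_REPLY,
	NEW_RELATED_REPLY = NEW_RELATED + NEW_IS_REPLY,
	/* No NEW in reply direction. */
	NEW_NUMBER,
	NEW_NEW_REPLY = NEW_NUMBER,
};

/* Compile-time checks: the count of distinct types stays at 5. */
static_assert(OLD_NUMBER == 5, "old IP_CT_NUMBER");
static_assert(NEW_NUMBER == 5, "new IP_CT_NUMBER");

/* Returns 1 if every value visible to userspace is unchanged. */
int ct_info_values_unchanged(void)
{
	return (int)OLD_NUMBER == (int)NEW_NUMBER &&
	       (int)OLD_NEW_REPLY == (int)NEW_NEW_REPLY &&
	       (int)OLD_RELATED_REPLY == (int)NEW_RELATED_REPLY &&
	       (int)OLD_ESTABLISHED_REPLY == (int)NEW_ESTABLISHED_REPLY;
}
```

Note that IP_CT_NEW_REPLY and IP_CT_NUMBER already shared the value 5
before the patch, which is why the two duplicate case labels in
ovs_ct_get_state() could simply be dropped.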



Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread Cyrill Gorcunov
On Thu, Mar 10, 2016 at 01:01:38PM -0500, David Miller wrote:
> From: Cyrill Gorcunov 
> Date: Thu, 10 Mar 2016 18:09:20 +0300
> 
> > On Thu, Mar 10, 2016 at 02:03:24PM +0300, Cyrill Gorcunov wrote:
> >> On Thu, Mar 10, 2016 at 01:20:18PM +0300, Cyrill Gorcunov wrote:
> >> > On Thu, Mar 10, 2016 at 12:16:29AM +0300, Cyrill Gorcunov wrote:
> >> > > 
> >> > > Thanks for explanation, Dave! I'll continue on this task tomorrow
> >> > > trying to implement optimization you proposed.
> >> > 
> >> > OK, here are the results for the preliminary patch with conntrack running
> >> ...
> >> >  net/ipv4/devinet.c |   13 -
> >> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >> > 
> >> > Index: linux-ml.git/net/ipv4/devinet.c
> >> > ===
> >> > --- linux-ml.git.orig/net/ipv4/devinet.c
> >> > +++ linux-ml.git/net/ipv4/devinet.c
> >> > @@ -403,7 +403,18 @@ no_promotions:
> >> > So that, this order is correct.
> >> >   */
> >> 
> >> This patch is wrong, so drop it please. I'll do another.
> > 
> > Here I think is a better variant. The results are good
> > enough -- 1 sec for cleanup. Does the patch look sane?
> 
> I'm tempted to say that we should provide these notifier handlers with
> the information they need, explicitly, to handle this case.
> 
> Most inetdev notifiers actually want to know the individual addresses
> that get removed, one by one.  That's handled by the existing
> NETDEV_DOWN event and the ifa we pass to that.
> 
> But some, like this netfilter masq case, would be satisfied with a
> single event that tells them the whole inetdev instance is being torn
> down.  Which is the case we care about here.
> 
> We currently don't use NETDEV_UNREGISTER for inetdev notifiers, so
> maybe we could use that.
> 
> And that is consistent with the core netdev notifier that triggers
> this call chain in the first place.
> 
> Roughly, something like this:

I see. Dave, gimme some time to test but I'm sure it'll work.
I don't have some strong opinion here, so your patch looks
pretty fine to me. But maybe people from netdev camp have
some other ideas.



[PATCH net-next v5] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-10 Thread Martin KaFai Lau
Per RFC4898, they count segments sent/received
containing a positive length data segment (that includes
retransmission segments carrying data).  Unlike
tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
carrying no data (e.g. pure ack).

The patch also updates the segs_in in tcp_fastopen_add_skb()
so that segs_in >= data_segs_in property is kept.

Together with retransmission data, tcpi_data_segs_out
gives a better signal on the rxmit rate.

v5: Eric pointed out that checking skb->len is still needed in
tcp_fastopen_add_skb() because skb can carry a FIN without data.
Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in()
helper is used.  Comment is added to the fastopen case to explain why
segs_in has to be reset and tcp_segs_in() has to be called before
__skb_pull().

v4: Add comment to the changes in tcp_fastopen_add_skb()
and also add remark on this case in the commit message.

v3: Add const modifier to the skb parameter in tcp_segs_in()

v2: Rework based on recent fix by Eric:
commit a9d99ce28ed3 ("tcp: fix tcpi_segs_in after connection establishment")

Signed-off-by: Martin KaFai Lau 
Cc: Chris Rapier 
Cc: Eric Dumazet 
Cc: Marcelo Ricardo Leitner 
Cc: Neal Cardwell 
Cc: Yuchung Cheng 
---
 include/linux/tcp.h  |  6 ++
 include/net/tcp.h| 10 ++
 include/uapi/linux/tcp.h |  2 ++
 net/ipv4/tcp.c   |  2 ++
 net/ipv4/tcp_fastopen.c  |  8 
 net/ipv4/tcp_ipv4.c  |  2 +-
 net/ipv4/tcp_minisocks.c |  2 +-
 net/ipv4/tcp_output.c|  4 +++-
 net/ipv6/tcp_ipv6.c  |  2 +-
 9 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index bcbf51d..7be9b12 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -158,6 +158,9 @@ struct tcp_sock {
u32 segs_in;/* RFC4898 tcpEStatsPerfSegsIn
 * total number of segments in.
 */
+   u32 data_segs_in;   /* RFC4898 tcpEStatsPerfDataSegsIn
+* total number of data segments in.
+*/
u32 rcv_nxt;/* What we want to receive next */
u32 copied_seq; /* Head of yet unread data  */
u32 rcv_wup;/* rcv_nxt on last window update sent   */
@@ -165,6 +168,9 @@ struct tcp_sock {
u32 segs_out;   /* RFC4898 tcpEStatsPerfSegsOut
 * The total number of segments sent.
 */
+   u32 data_segs_out;  /* RFC4898 tcpEStatsPerfDataSegsOut
+* total number of data segments sent.
+*/
u64 bytes_acked;/* RFC4898 tcpEStatsAppHCThruOctetsAcked
 * sum(delta(snd_una)), or how many bytes
 * were acked.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index e90db85..24557a8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1816,4 +1816,14 @@ static inline void skb_set_tcp_pure_ack(struct sk_buff *skb)
skb->truesize = 2;
 }
 
+static inline void tcp_segs_in(struct tcp_sock *tp, const struct sk_buff *skb)
+{
+   u16 segs_in;
+
+   segs_in = max_t(u16, 1, skb_shinfo(skb)->gso_segs);
+   tp->segs_in += segs_in;
+   if (skb->len > tcp_hdrlen(skb))
+   tp->data_segs_in += segs_in;
+}
+
 #endif /* _TCP_H */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index fe95446..53e8e3f 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -199,6 +199,8 @@ struct tcp_info {
 
__u32   tcpi_notsent_bytes;
__u32   tcpi_min_rtt;
+   __u32   tcpi_data_segs_in;  /* RFC4898 tcpEStatsDataSegsIn */
+   __u32   tcpi_data_segs_out; /* RFC4898 tcpEStatsDataSegsOut */
 };
 
 /* for TCP_MD5SIG socket option */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f9faadb..6b01b48 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2728,6 +2728,8 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
info->tcpi_notsent_bytes = max(0, notsent_bytes);
 
info->tcpi_min_rtt = tcp_min_rtt(tp);
+   info->tcpi_data_segs_in = tp->data_segs_in;
+   info->tcpi_data_segs_out = tp->data_segs_out;
 }
 EXPORT_SYMBOL_GPL(tcp_get_info);
 
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index fdb286d..4fc0061 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -140,6 +140,14 @@ void tcp_fastopen_add_skb(struct sock *sk, struct sk_buff *skb)
return;
 
skb_dst_drop(skb);
+   /* segs_in has been initialized to 1 in tcp_create_openreq_child().
+* Hence, reset segs_in to 0 before calling tcp_segs_in()
+* to avoid double counting.  Also, tcp_segs_in() expects
+  

Re: [PATCH nf-next v9 8/8] openvswitch: Interface with NAT.

2016-03-10 Thread Jarno Rajahalme

> On Mar 10, 2016, at 4:00 AM, Thomas Graf  wrote:
> 
> On 03/09/16 at 07:47pm, Joe Stringer wrote:
>> On 9 March 2016 at 15:10, Jarno Rajahalme  wrote:
>>> Extend OVS conntrack interface to cover NAT.  New nested
>>> OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
>>> A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
>>> If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
>>> attributes, new (non-committed/non-confirmed) connections are mangled
>>> according to the rest of the nested attributes.
>>> 
>>> The corresponding OVS userspace patch series includes test cases (in
>>> tests/system-traffic.at) that also serve as example uses.
>>> 
>>> This work extends on a branch by Thomas Graf at
>>> https://github.com/tgraf/ovs/tree/nat.
>> 
>> Thomas, I guess there was not signoff in these patches so Jarno does
>> not have your signoff in this patch.
> 
> That's fine. The code has evolved a lot since. I don't see anything
> further than what Joe spotted so feel free to add my
> 
> Acked-by: Thomas Graf 

Thanks!
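
For reference, the attribute rules from the quoted commit message and
from the kernel-doc in the patch can be written down as a small
userspace check.  This is a sketch of the documented rules only; the
toy_nat_attrs_valid() helper and the bitmask representation are
hypothetical, and the kernel parser in the patch may enforce more than
this.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the enum added to include/uapi/linux/openvswitch.h. */
enum ovs_nat_attr {
	OVS_NAT_ATTR_UNSPEC,
	OVS_NAT_ATTR_SRC,
	OVS_NAT_ATTR_DST,
	OVS_NAT_ATTR_IP_MIN,
	OVS_NAT_ATTR_IP_MAX,
	OVS_NAT_ATTR_PROTO_MIN,
	OVS_NAT_ATTR_PROTO_MAX,
	OVS_NAT_ATTR_PERSISTENT,
	OVS_NAT_ATTR_PROTO_HASH,
	OVS_NAT_ATTR_PROTO_RANDOM,
	__OVS_NAT_ATTR_MAX,
};

#define TOY_ATTR_BIT(a) (1u << (a))

/* 'present' is a bitmask of the nested attributes found in
 * OVS_CT_ATTR_NAT; 'commit' says whether OVS_CT_ATTR_COMMIT is set.
 */
bool toy_nat_attrs_valid(unsigned int present, bool commit)
{
	bool src = (present & TOY_ATTR_BIT(OVS_NAT_ATTR_SRC)) != 0;
	bool dst = (present & TOY_ATTR_BIT(OVS_NAT_ATTR_DST)) != 0;
	unsigned int others = present & ~(TOY_ATTR_BIT(OVS_NAT_ATTR_SRC) |
					  TOY_ATTR_BIT(OVS_NAT_ATTR_DST));

	assert(present < TOY_ATTR_BIT(__OVS_NAT_ATTR_MAX));

	/* Only one of SRC and DST may be specified. */
	if (src && dst)
		return false;
	/* Range and flag attributes need COMMIT plus a NAT type. */
	if (others && !(commit && (src || dst)))
		return false;
	/* A bare NAT attribute (present == 0) is valid: it only
	 * mangles packets of already committed connections.
	 */
	return true;
}
```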



Re: [PATCH nf-next v8 1/8] netfilter: Remove IP_CT_NEW_REPLY definition.

2016-03-10 Thread Jarno Rajahalme
Thanks for pointing this out.  v10, which I hope is the final version, will
have the cover letter back.

  Jarno

> On Mar 10, 2016, at 1:16 AM, Or Gerlitz  wrote:
> 
> On Wed, Mar 9, 2016 at 2:24 AM, Jarno Rajahalme  wrote:
>> Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
>> not make sense.  This allows the definition of IP_CT_NUMBER to be
>> simplified as well.
> 
> I just realized that after V7 you stopped sending cover letter (patch 0/N)
> with this series.
> 
> Maybe you sent it and it missed the list? We need to be able to see
> differences from earlier versions.
> 
> Or.



Re: [PATCH 2/3] dm9601: manage eeprom to assure the chip for correct operation

2016-03-10 Thread Peter Korsgaard
> "Joseph" == Joseph CHANG  writes:

 > Add to maintain variant eeprom adapters which may have not right
 > dm962x's format.

 > Signed-off-by: Joseph CHANG 

> +static void dm_render_begin(struct usbnet *dev)
 > +{
 > +/* Render eeprom if need, WORD3 render, set D[15:14] 01b */
 > +dm_eeprom_render(dev, 3, 0x4000, 0xc000);
 > +/* Render eeprom if need, WORD7 render, clear D[10] */
 > +dm_eeprom_render(dev, 7, 0x, 0x0400);
 > +/* Render eeprom if need, WORD11 render, need 0x005a */
 > +dm_eeprom_render(dev, 11, 0x005a, 0x);
 > +/* Render eeprom if need, WORD12 render, need 0x0007 */
 > +dm_eeprom_render(dev, 12, DM_EP3I_VAL, 0x);

With render I guess you mean something like fixup? I'm not sure we want
to do this automatically without an explicit action from the user.

How common are these adapters without valid eeprom? What happens if the
eeprom content isn't fixed?

Do we need to reset the device once the eeprom is updated?

-- 
Bye, Peter Korsgaard


Re: net: use-after-free in recvmmsg

2016-03-10 Thread Dmitry Vyukov
On Tue, Jan 26, 2016 at 8:30 PM, Arnaldo Carvalho de Melo
 wrote:
> Em Tue, Jan 26, 2016 at 08:27:48PM +0100, Dmitry Vyukov escreveu:
>> On Fri, Jan 22, 2016 at 10:16 PM, Arnaldo Carvalho de Melo  
>> wrote:
>> > Em Fri, Jan 22, 2016 at 09:39:53PM +0100, Dmitry Vyukov escreveu:
>> >> I am on commit 30f05309bde49295e02e45c7e615f73aa4e0ccc2 (Jan 20).
>> >> Seems to be added in commit a2e2725541fad72416326798c2d7fa4dafb7d337
>> >> (Oct 2009).
>> >
>> > Maybe this helps? Compile testing now...
>>
>>
>> I don't have a reliable reproducer, so can't test it per se.
>> I will integrate this patch tomorrow and restart fuzzer with it.
>
> Thanks a lot!


Hi Arnaldo,

I am running with that patch since then, and did not see the bug.
Please mail it as a proper patch.


[PATCH] kcm: mark helper functions inline

2016-03-10 Thread Arnd Bergmann
The stub helper functions for the newly added kcm_proc_init/exit interfaces
are defined as 'static' in a header file, which leads to build warnings for
each file that includes them without calling them:

include/net/kcm.h:183:12: error: 'kcm_proc_init' defined but not used [-Werror=unused-function]
include/net/kcm.h:184:13: error: 'kcm_proc_exit' defined but not used [-Werror=unused-function]

This marks the two functions as 'static inline' instead, which avoids the
warnings and is obviously what was meant here.

Signed-off-by: Arnd Bergmann 
Fixes: cd6e111bf5be ("kcm: Add statistics and proc interfaces")
---
 include/net/kcm.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/kcm.h b/include/net/kcm.h
index 95c425ca97b6..2840b5825dcc 100644
--- a/include/net/kcm.h
+++ b/include/net/kcm.h
@@ -180,8 +180,8 @@ struct kcm_mux {
 int kcm_proc_init(void);
 void kcm_proc_exit(void);
 #else
-static int kcm_proc_init(void) { return 0; }
-static void kcm_proc_exit(void) { }
+static inline int kcm_proc_init(void) { return 0; }
+static inline void kcm_proc_exit(void) { }
 #endif
 
 static inline void aggregate_psock_stats(struct kcm_psock_stats *stats,
-- 
2.7.0
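
The same stub pattern, reduced to a standalone sketch.  TOY_PROC_FS and
the toy_* names are hypothetical stand-ins for the real config symbol
and helpers; the point is only that 'static inline' stubs in a header
do not trigger -Wunused-function in files that include the header
without calling them.

```c
/* TOY_PROC_FS is a hypothetical stand-in for the real config symbol. */
#ifdef TOY_PROC_FS
int toy_proc_init(void);
void toy_proc_exit(void);
#else
/* Stubs: 'static inline' keeps includers that never call these
 * free of "defined but not used" diagnostics.
 */
static inline int toy_proc_init(void) { return 0; }
static inline void toy_proc_exit(void) { }
#endif

/* A user of the header that calls only one of the two helpers. */
int toy_module_init(void)
{
	return toy_proc_init();
}
```

Building this with 'gcc -Wall -Werror -c' should succeed even though
toy_proc_exit() is never called; with plain 'static' stubs the same
build would be expected to fail as in the log above.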



Re: [PATCH 1/2] net: thunderx: Set recevie buffer page usage count in bulk

2016-03-10 Thread Sunil Kovvuri
>
> So calculate the modulus on the page split count and optimize the
> increment ahead of time when possible, and for the sub page split
> pieces do it one at a time.
>
The patch does almost the same, with the negligible overhead of a counter
so that page->_count is incremented at a later time, but still before HW
starts using the buffers.
The difference between the NIU driver and this patch is that NIU calculates
the split count, increments the page count and then divides the page into
buffers.  Here the page is divided into buffers, a counter is incremented
at every split, and at the end page->_count is incremented atomically once.

Any issue with this approach?

Thanks,
Sunil.


Re: Micrel Phy - Is there a way to configure the Phy not to do 802.3x flow control?

2016-03-10 Thread Florian Fainelli
On 10/03/16 08:48, Murali Karicheri wrote:
> On 03/03/2016 07:16 PM, Florian Fainelli wrote:
>> On 03/03/16 14:18, Murali Karicheri wrote:
>>> Hi,
>>>
>>> We are using a Micrel PHY on one of our boards and wondering if we can
>>> force the PHY to disable flow control at start. I have a 1G Ethernet switch
>>> connected to the PHY and the PHY always enables flow control. I would like
>>> to configure the PHY not to do flow control. Is that possible and, if yes,
>>> what should I do in my Ethernet driver to tell the PHY not to enable flow
>>> control?
>>
>> The PHY is not doing flow control per-se, your pseudo Ethernet MAC in
>> the switch is doing, along with the link partner advertising support for
>> it. You would want to make sure that your PHY device interface (provided
>> that you are using the PHY library) is not starting with Pause
>> advertised, but it could be supported.
> 
> Understood that the PHY just advertises FC. The Micrel PHY for the 9031
> advertises FC as supported by default. After negotiation, I see that phylib
> provides the link status with pause = 1, asym_pause = 1. How do I tell the
> PHY not to advertise it?
> 
> I call following sequence in the Ethernet driver.
> 
> of_phy_connect(x,y,hndlr,a,z);

Here you should be able to change phydev->advertising and
phydev->supported to mask the ADVERTISED_Pause | ADVERTISED_Asym_Pause
bits and have phy_start() restart with that, which should disable pause
and asym_pause as seen by your adjust_link handler.

> phy_start()
> 
> Now in hndlr() I have pause = 1, asym_pause = 1, in phy_device ptr. How
> can I tell the phy not to advertise initially?
-- 
Florian


Re: [RFC] net: ipv4 -- Introduce ifa limit per net

2016-03-10 Thread David Miller
From: Cyrill Gorcunov 
Date: Thu, 10 Mar 2016 18:09:20 +0300

> On Thu, Mar 10, 2016 at 02:03:24PM +0300, Cyrill Gorcunov wrote:
>> On Thu, Mar 10, 2016 at 01:20:18PM +0300, Cyrill Gorcunov wrote:
>> > On Thu, Mar 10, 2016 at 12:16:29AM +0300, Cyrill Gorcunov wrote:
>> > > 
>> > > Thanks for explanation, Dave! I'll continue on this task tomorrow
>> > > trying to implement optimization you proposed.
>> > 
>> > OK, here are the results for the preliminary patch with conntrack running
>> ...
>> >  net/ipv4/devinet.c |   13 -
>> >  1 file changed, 12 insertions(+), 1 deletion(-)
>> > 
>> > Index: linux-ml.git/net/ipv4/devinet.c
>> > ===
>> > --- linux-ml.git.orig/net/ipv4/devinet.c
>> > +++ linux-ml.git/net/ipv4/devinet.c
>> > @@ -403,7 +403,18 @@ no_promotions:
>> >   So that, this order is correct.
>> > */
>> 
>> This patch is wrong, so drop it please. I'll do another.
> 
> Here I think is a better variant. The results are good
> enough -- 1 sec for cleanup. Does the patch look sane?

I'm tempted to say that we should provide these notifier handlers with
the information they need, explicitly, to handle this case.

Most inetdev notifiers actually want to know the individual addresses
that get removed, one by one.  That's handled by the existing
NETDEV_DOWN event and the ifa we pass to that.

But some, like this netfilter masq case, would be satisfied with a
single event that tells them the whole inetdev instance is being torn
down.  Which is the case we care about here.

We currently don't use NETDEV_UNREGISTER for inetdev notifiers, so
maybe we could use that.

And that is consistent with the core netdev notifier that triggers
this call chain in the first place.

Roughly, something like this:

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 8c3df2c..6eee5cb 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -292,6 +292,11 @@ static void inetdev_destroy(struct in_device *in_dev)
 
in_dev->dead = 1;
 
+   if (in_dev->ifa_list)
+   blocking_notifier_call_chain(&inetaddr_chain,
+NETDEV_UNREGISTER,
+in_dev->ifa_list);
+
ip_mc_destroy_dev(in_dev);
 
while ((ifa = in_dev->ifa_list) != NULL) {
diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
index c6eb421..1bb8026 100644
--- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
@@ -111,6 +111,10 @@ static int masq_inet_event(struct notifier_block *this,
struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev;
struct netdev_notifier_info info;
 
+   if (event != NETDEV_UNREGISTER)
+   return NOTIFY_DONE;
+   event = NETDEV_DOWN;
+
netdev_notifier_info_init(&info, dev);
return masq_device_event(this, event, &info);
 }
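
The effect of the rough patch above can be illustrated with a userspace
model that counts how often the expensive masquerade cleanup would run
when a device with many addresses goes away.  All toy_* names are
hypothetical; the event and return values mirror the kernel's
NETDEV_DOWN, NETDEV_UNREGISTER and NOTIFY_DONE.

```c
#include <assert.h>

#define NETDEV_DOWN        0x0002
#define NETDEV_UNREGISTER  0x0006
#define NOTIFY_DONE        0x0000

/* Counts how often the conntrack table walk would be triggered. */
int toy_cleanup_runs;

int toy_masq_device_event(unsigned long event)
{
	if (event == NETDEV_DOWN)
		toy_cleanup_runs++;  /* stands in for the table walk */
	return NOTIFY_DONE;
}

/* inetaddr notifier after the change: react to UNREGISTER only and
 * treat it as a device-wide DOWN.
 */
int toy_masq_inet_event(unsigned long event)
{
	if (event != NETDEV_UNREGISTER)
		return NOTIFY_DONE;
	return toy_masq_device_event(NETDEV_DOWN);
}

/* Model of inetdev_destroy(): one UNREGISTER for the whole inetdev,
 * then the existing per-address DOWN events.
 */
void toy_inetdev_destroy(int nr_addrs)
{
	int i;

	assert(nr_addrs >= 0);
	if (nr_addrs > 0)
		toy_masq_inet_event(NETDEV_UNREGISTER);
	for (i = 0; i < nr_addrs; i++)
		toy_masq_inet_event(NETDEV_DOWN);
}
```

In this model, tearing down a device with 1000 addresses runs the
cleanup once rather than 1000 times, which is the saving the proposal
is after.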






Re: [PATCH net-next v4] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-10 Thread Martin KaFai Lau
On Thu, Mar 10, 2016 at 09:43:18AM -0800, Eric Dumazet wrote:
> On Thu, Mar 10, 2016 at 9:39 AM, Eric Dumazet  wrote:
> > On Thu, Mar 10, 2016 at 9:29 AM, Martin KaFai Lau  wrote:
> >> Per RFC4898, they count segments sent/received
> >> containing a positive length data segment (that includes
> >> retransmission segments carrying data).  Unlike
> >> tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
> >> carrying no data (e.g. pure ack).
> >>
> >> The patch also updates the segs_in in tcp_fastopen_add_skb()
> >> so that segs_in >= data_segs_in property is kept.  If
> >> tcp_segs_in() helper is used in this fastopen case, tp->segs_in
> >> has to be 0 reset first to avoid double counting.  Also, it has
> >> to be done before __skb_pull(skb, tcp_hdrlen(skb)) while
> >> there is no need to check skb->len since skb has already
> >> been confirmed carrying data.  I found it more confusing
> >> and chose to directly set segs_in and data_segs_in in
> >> this special case.
> >
> > Note that on my TODO list after commit 
> > e11ecddf5128011c936cc5360780190cbc901fdc
> > I had the project of pulling TCP headers much earlier in input path
> > so that we do not have all these special cases.
> >
> > Acked-by: Eric Dumazet 
>
> Actually, tcp_fastopen_add_skb() can queue a packet with a FIN only,
> but no data.
Thanks for pointing it out.  Didn't know it is allowed and
the above end_seq check could also be +1 by the FIN.

>
> I believe you need to test skb->len before setting tp->data_segs_in
In that case, I will try to 0 reset segs_in with comment explanation and call
tcp_segs_in() before the skb_pull.  I will spin another version.
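
A userspace sketch of the plan described above, assuming the helper
keeps the shape it has in v4: count first, pull the header afterwards.
The toy_* structures are hypothetical stand-ins for tcp_sock and
sk_buff; toy_skb.tcp_hdrlen models tcp_hdrlen(skb), which does not
change when the header is pulled.

```c
#include <assert.h>
#include <stdint.h>

struct toy_skb {
	uint32_t len;         /* skb->len */
	uint32_t tcp_hdrlen;  /* tcp_hdrlen(skb) */
	uint16_t gso_segs;    /* skb_shinfo(skb)->gso_segs */
};

struct toy_tp {
	uint32_t segs_in;
	uint32_t data_segs_in;
};

void toy_tcp_segs_in(struct toy_tp *tp, const struct toy_skb *skb)
{
	/* max_t(u16, 1, gso_segs): a non-GSO skb counts as one segment. */
	uint16_t segs = skb->gso_segs ? skb->gso_segs : 1;

	tp->segs_in += segs;
	/* Only valid while the TCP header is still included in len. */
	if (skb->len > skb->tcp_hdrlen)
		tp->data_segs_in += segs;
}

/* Models __skb_pull(skb, tcp_hdrlen(skb)). */
void toy_pull_tcp_header(struct toy_skb *skb)
{
	assert(skb->len >= skb->tcp_hdrlen);
	skb->len -= skb->tcp_hdrlen;
}

void toy_fastopen_add_skb(struct toy_tp *tp, struct toy_skb *skb)
{
	tp->segs_in = 0;           /* child started with segs_in == 1 */
	toy_tcp_segs_in(tp, skb);  /* must run before the pull */
	toy_pull_tcp_header(skb);
}
```

A FIN-only skb has len equal to the header length, so it bumps segs_in
but not data_segs_in, which is the case Eric pointed out.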


Re: [PATCH] b43: fix memory leak

2016-03-10 Thread Sudip Mukherjee

On Thursday 10 March 2016 11:13 PM, Michael Büsch wrote:
> On Fri, 19 Feb 2016 20:37:18 +0530
> Sudip Mukherjee  wrote:
>
>>> https://patchwork.kernel.org/patch/8049041/
>>
>> I have an old laptop running on 800Mhz CPU. It has "Broadcom BCM4311
>> [14e4:4311] (rev 01)".
>> I will try to test it on this weekend.
>
> Any news on this one?

No. Sorry. I was trying to install ubuntu 14.04 in it, but for some
reason the usb stick is not moving past the boot screen. Give me two
more days and I will let you all know by this Saturday.


regards
sudip


Re: [PATCH v2] phy: remove documentation of removed members of phy_device structure

2016-03-10 Thread Florian Fainelli
On 10/03/16 04:58, LABBE Corentin wrote:
> Commit e5a03bfd873c ("phy: Add an mdio_device structure") removed the addr,
> bus and dev members of the phy_device structure.
> This patch removes the documentation about those members.
> 
> Signed-off-by: LABBE Corentin 

Acked-by: Florian Fainelli 
-- 
Florian


Re: [PATCH 1/2] net: thunderx: Set recevie buffer page usage count in bulk

2016-03-10 Thread David Miller
From: Sunil Kovvuri 
Date: Thu, 10 Mar 2016 16:13:28 +0530

> Hi David,
> 
> 
>>> So if you know ahead of time how the page will be split up, just
>>> calculate that when you get the page and increment the page count
>>> appropriately.
>>>
>>> That's what we do in the NIU driver.
>>
>> Thanks for the suggestion, will check and get back.
>>
> 
> I looked at the NIU driver and in fn() niu_rbr_refill()
> static void niu_rbr_refill(struct niu *np, struct rx_ring_info *rp, gfp_t 
> mask)
> {
> int index = rp->rbr_index;
> 
> rp->rbr_pending++;
> if ((rp->rbr_pending % rp->rbr_blocks_per_page) == 0) {
> 
> Here it checks whether rbr_pending is an exact multiple of the page
> split count.
> Hence updating the page count based on a fixed calculation is right.
> 
> On my platform the driver receives an interrupt when the free buffer count
> falls below a threshold,
> and by the time SW reads the count of buffers to be refilled it can be any
> number, i.e. it
> may or may not be an exact multiple of the page split count.

So calculate the modulus on the page split count and optimize the
increment ahead of time when possible, and for the sub page split
pieces do it one at a time.

I don't understand what the problem is.
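
A minimal sketch of the arithmetic suggested above, not code from the NIU or thunderx drivers. The names refill_plan, plan_refill and refcount_updates are hypothetical, and the model assumes each refill starts on a page boundary: whole pages get one bulk increment of the page count, and the leftover pieces are incremented one at a time.

```c
#include <assert.h>

/* Hypothetical model of a refill request split into bulk and single updates. */
struct refill_plan {
	unsigned int full_pages;  /* pages whose count is raised once, by 'split' */
	unsigned int single_bufs; /* leftover buffers, raised one at a time */
};

/* Split 'nbufs' buffers into whole pages of 'split' buffers plus a remainder. */
static struct refill_plan plan_refill(unsigned int nbufs, unsigned int split)
{
	struct refill_plan plan;

	assert(split > 0);
	plan.full_pages = nbufs / split;
	plan.single_bufs = nbufs % split;
	return plan;
}

/* Number of page-count updates needed for the plan. */
static unsigned int refcount_updates(struct refill_plan plan)
{
	return plan.full_pages + plan.single_bufs;
}
```

With 10 buffers and a split count of 4, this model needs 4 page-count updates (2 bulk, 2 single) instead of 10.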


Re: [PATCH] b43: fix memory leak

2016-03-10 Thread Michael Büsch
On Fri, 19 Feb 2016 20:37:18 +0530
Sudip Mukherjee  wrote:

> > https://patchwork.kernel.org/patch/8049041/  
> 
> I have an old laptop running on 800Mhz CPU. It has "Broadcom BCM4311 
> [14e4:4311] (rev 01)".
> I will try to test it on this weekend.

Any news on this one?


-- 
Michael




Re: [PATCH net-next v4] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-10 Thread Eric Dumazet
On Thu, Mar 10, 2016 at 9:39 AM, Eric Dumazet  wrote:
> On Thu, Mar 10, 2016 at 9:29 AM, Martin KaFai Lau  wrote:
>> Per RFC4898, they count segments sent/received
>> containing a positive length data segment (that includes
>> retransmission segments carrying data).  Unlike
>> tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
>> carrying no data (e.g. pure ack).
>>
>> The patch also updates the segs_in in tcp_fastopen_add_skb()
>> so that segs_in >= data_segs_in property is kept.  If
>> tcp_segs_in() helper is used in this fastopen case, tp->segs_in
>> has to be 0 reset first to avoid double counting.  Also, it has
>> to be done before __skb_pull(skb, tcp_hdrlen(skb)) while
>> there is no need to check skb->len since skb has already
>> been confirmed carrying data.  I found it more confusing
>> and chose to directly set segs_in and data_segs_in in
>> this special case.
>
> Note that on my TODO list after commit 
> e11ecddf5128011c936cc5360780190cbc901fdc
> I had the project of pulling TCP headers much earlier in input path
> so that we do not have all these special cases.
>
> Acked-by: Eric Dumazet 

Actually, tcp_fastopen_add_skb() can queue a packet with a FIN only,
but no data.

I believe you need to test skb->len before setting tp->data_segs_in
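
A minimal userspace sketch of the point above, assuming plain integers in place of the skb fields. The names seg_counters and count_segs_in are hypothetical; the logic mirrors the tcp_segs_in() helper from the patch, where a segment only counts as a data segment when its length exceeds the TCP header length. A FIN-only packet therefore raises segs_in but not data_segs_in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two counters added to struct tcp_sock. */
struct seg_counters {
	uint32_t segs_in;
	uint32_t data_segs_in;
};

/*
 * Count one received packet. 'skb_len' includes the TCP header,
 * 'gso_segs' is 0 for a packet that was not aggregated.
 */
static void count_segs_in(struct seg_counters *c, unsigned int skb_len,
			  unsigned int hdr_len, uint16_t gso_segs)
{
	uint16_t segs = gso_segs ? gso_segs : 1;

	assert(skb_len >= hdr_len);
	c->segs_in += segs;
	if (skb_len > hdr_len)	/* payload present, so it is a data segment */
		c->data_segs_in += segs;
}
```

In this model the segs_in >= data_segs_in property holds for any sequence of calls, including the FIN-only case.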


Re: pull-request: can-next 2016-03-10

2016-03-10 Thread David Miller
From: Marc Kleine-Budde 
Date: Thu, 10 Mar 2016 10:33:28 +0100

> this is a pull request of 5 patches for net-next/master.
> 
> Marek Vasut contributes 4 patches for the ifi CAN driver, which makes
> it work on real hardware. There is one patch by Ramesh Shanmugasundaram
> for the rcar_can driver that adds support for the 3rd generation IP
> core.

Pulled, thanks Marc.


Re: [PATCH net-next v4] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-10 Thread Eric Dumazet
On Thu, Mar 10, 2016 at 9:29 AM, Martin KaFai Lau  wrote:
> Per RFC4898, they count segments sent/received
> containing a positive length data segment (that includes
> retransmission segments carrying data).  Unlike
> tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
> carrying no data (e.g. pure ack).
>
> The patch also updates the segs_in in tcp_fastopen_add_skb()
> so that segs_in >= data_segs_in property is kept.  If
> tcp_segs_in() helper is used in this fastopen case, tp->segs_in
> has to be 0 reset first to avoid double counting.  Also, it has
> to be done before __skb_pull(skb, tcp_hdrlen(skb)) while
> there is no need to check skb->len since skb has already
> been confirmed carrying data.  I found it more confusing
> and chose to directly set segs_in and data_segs_in in
> this special case.

Note that on my TODO list after commit e11ecddf5128011c936cc5360780190cbc901fdc
I had the project of pulling TCP headers much earlier in input path
so that we do not have all these special cases.

Acked-by: Eric Dumazet 


Re: [PATCH net-next v4] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-10 Thread Yuchung Cheng
On Thu, Mar 10, 2016 at 9:29 AM, Martin KaFai Lau  wrote:
> Per RFC4898, they count segments sent/received
> containing a positive length data segment (that includes
> retransmission segments carrying data).  Unlike
> tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
> carrying no data (e.g. pure ack).
>
> The patch also updates the segs_in in tcp_fastopen_add_skb()
> so that segs_in >= data_segs_in property is kept.  If
> tcp_segs_in() helper is used in this fastopen case, tp->segs_in
> has to be 0 reset first to avoid double counting.  Also, it has
> to be done before __skb_pull(skb, tcp_hdrlen(skb)) while
> there is no need to check skb->len since skb has already
> been confirmed carrying data.  I found it more confusing
> and chose to directly set segs_in and data_segs_in in
> this special case.
>
> Together with retransmission data, tcpi_data_segs_out
> gives a better signal on the rxmit rate.
>
> v4: Add comment to the changes in tcp_fastopen_add_skb()
> and also add remark on this case in the commit message.
>
> v3: Add const modifier to the skb parameter in tcp_segs_in()
>
> v2: Rework based on recent fix by Eric:
> commit a9d99ce28ed3 ("tcp: fix tcpi_segs_in after connection establishment")
>
> Signed-off-by: Martin KaFai Lau 
> Cc: Chris Rapier 
> Cc: Eric Dumazet 
> Cc: Marcelo Ricardo Leitner 
> Cc: Neal Cardwell 
> Cc: Yuchung Cheng 
> ---
Acked-by: Yuchung Cheng 

Thanks for the clarification.

>  include/linux/tcp.h  |  6 ++
>  include/net/tcp.h| 10 ++
>  include/uapi/linux/tcp.h |  2 ++
>  net/ipv4/tcp.c   |  2 ++
>  net/ipv4/tcp_fastopen.c  | 10 ++
>  net/ipv4/tcp_ipv4.c  |  2 +-
>  net/ipv4/tcp_minisocks.c |  2 +-
>  net/ipv4/tcp_output.c|  4 +++-
>  net/ipv6/tcp_ipv6.c  |  2 +-
>  9 files changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index bcbf51d..7be9b12 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -158,6 +158,9 @@ struct tcp_sock {
> u32 segs_in;/* RFC4898 tcpEStatsPerfSegsIn
>  * total number of segments in.
>  */
> +   u32 data_segs_in;   /* RFC4898 tcpEStatsPerfDataSegsIn
> +* total number of data segments in.
> +*/
> u32 rcv_nxt;/* What we want to receive next */
> u32 copied_seq; /* Head of yet unread data  */
> u32 rcv_wup;/* rcv_nxt on last window update sent   */
> @@ -165,6 +168,9 @@ struct tcp_sock {
> u32 segs_out;   /* RFC4898 tcpEStatsPerfSegsOut
>  * The total number of segments sent.
>  */
> +   u32 data_segs_out;  /* RFC4898 tcpEStatsPerfDataSegsOut
> +* total number of data segments sent.
> +*/
> u64 bytes_acked;/* RFC4898 tcpEStatsAppHCThruOctetsAcked
>  * sum(delta(snd_una)), or how many bytes
>  * were acked.
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index e90db85..24557a8 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1816,4 +1816,14 @@ static inline void skb_set_tcp_pure_ack(struct sk_buff 
> *skb)
> skb->truesize = 2;
>  }
>
> +static inline void tcp_segs_in(struct tcp_sock *tp, const struct sk_buff 
> *skb)
> +{
> +   u16 segs_in;
> +
> +   segs_in = max_t(u16, 1, skb_shinfo(skb)->gso_segs);
> +   tp->segs_in += segs_in;
> +   if (skb->len > tcp_hdrlen(skb))
> +   tp->data_segs_in += segs_in;
> +}
> +
>  #endif /* _TCP_H */
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index fe95446..53e8e3f 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -199,6 +199,8 @@ struct tcp_info {
>
> __u32   tcpi_notsent_bytes;
> __u32   tcpi_min_rtt;
> +   __u32   tcpi_data_segs_in;  /* RFC4898 tcpEStatsDataSegsIn */
> +   __u32   tcpi_data_segs_out; /* RFC4898 tcpEStatsDataSegsOut */
>  };
>
>  /* for TCP_MD5SIG socket option */
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index f9faadb..6b01b48 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2728,6 +2728,8 @@ void tcp_get_info(struct sock *sk, struct tcp_info 
> *info)
> info->tcpi_notsent_bytes = max(0, notsent_bytes);
>
> info->tcpi_min_rtt = tcp_min_rtt(tp);
> +   info->tcpi_data_segs_in = tp->data_segs_in;
> +   info->tcpi_data_segs_out = tp->data_segs_out;
>  }
>  EXPORT_SYMBOL_GPL(tcp_get_info);
>
> diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
> index fdb286d..74068e6 100644
> --- 

[PATCH net-next v4] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-10 Thread Martin KaFai Lau
Per RFC4898, they count segments sent/received
containing a positive length data segment (that includes
retransmission segments carrying data).  Unlike
tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
carrying no data (e.g. pure ack).

The patch also updates the segs_in in tcp_fastopen_add_skb()
so that segs_in >= data_segs_in property is kept.  If
tcp_segs_in() helper is used in this fastopen case, tp->segs_in
has to be 0 reset first to avoid double counting.  Also, it has
to be done before __skb_pull(skb, tcp_hdrlen(skb)) while
there is no need to check skb->len since skb has already
been confirmed carrying data.  I found it more confusing
and chose to directly set segs_in and data_segs_in in
this special case.

Together with retransmission data, tcpi_data_segs_out
gives a better signal on the rxmit rate.

v4: Add comment to the changes in tcp_fastopen_add_skb()
and also add remark on this case in the commit message.

v3: Add const modifier to the skb parameter in tcp_segs_in()

v2: Rework based on recent fix by Eric:
commit a9d99ce28ed3 ("tcp: fix tcpi_segs_in after connection establishment")

Signed-off-by: Martin KaFai Lau 
Cc: Chris Rapier 
Cc: Eric Dumazet 
Cc: Marcelo Ricardo Leitner 
Cc: Neal Cardwell 
Cc: Yuchung Cheng 
---
 include/linux/tcp.h  |  6 ++
 include/net/tcp.h| 10 ++
 include/uapi/linux/tcp.h |  2 ++
 net/ipv4/tcp.c   |  2 ++
 net/ipv4/tcp_fastopen.c  | 10 ++
 net/ipv4/tcp_ipv4.c  |  2 +-
 net/ipv4/tcp_minisocks.c |  2 +-
 net/ipv4/tcp_output.c|  4 +++-
 net/ipv6/tcp_ipv6.c  |  2 +-
 9 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index bcbf51d..7be9b12 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -158,6 +158,9 @@ struct tcp_sock {
u32 segs_in;/* RFC4898 tcpEStatsPerfSegsIn
 * total number of segments in.
 */
+   u32 data_segs_in;   /* RFC4898 tcpEStatsPerfDataSegsIn
+* total number of data segments in.
+*/
u32 rcv_nxt;/* What we want to receive next */
u32 copied_seq; /* Head of yet unread data  */
u32 rcv_wup;/* rcv_nxt on last window update sent   */
@@ -165,6 +168,9 @@ struct tcp_sock {
u32 segs_out;   /* RFC4898 tcpEStatsPerfSegsOut
 * The total number of segments sent.
 */
+   u32 data_segs_out;  /* RFC4898 tcpEStatsPerfDataSegsOut
+* total number of data segments sent.
+*/
u64 bytes_acked;/* RFC4898 tcpEStatsAppHCThruOctetsAcked
 * sum(delta(snd_una)), or how many bytes
 * were acked.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index e90db85..24557a8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1816,4 +1816,14 @@ static inline void skb_set_tcp_pure_ack(struct sk_buff 
*skb)
skb->truesize = 2;
 }
 
+static inline void tcp_segs_in(struct tcp_sock *tp, const struct sk_buff *skb)
+{
+   u16 segs_in;
+
+   segs_in = max_t(u16, 1, skb_shinfo(skb)->gso_segs);
+   tp->segs_in += segs_in;
+   if (skb->len > tcp_hdrlen(skb))
+   tp->data_segs_in += segs_in;
+}
+
 #endif /* _TCP_H */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index fe95446..53e8e3f 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -199,6 +199,8 @@ struct tcp_info {
 
__u32   tcpi_notsent_bytes;
__u32   tcpi_min_rtt;
+   __u32   tcpi_data_segs_in;  /* RFC4898 tcpEStatsDataSegsIn */
+   __u32   tcpi_data_segs_out; /* RFC4898 tcpEStatsDataSegsOut */
 };
 
 /* for TCP_MD5SIG socket option */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f9faadb..6b01b48 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2728,6 +2728,8 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
info->tcpi_notsent_bytes = max(0, notsent_bytes);
 
info->tcpi_min_rtt = tcp_min_rtt(tp);
+   info->tcpi_data_segs_in = tp->data_segs_in;
+   info->tcpi_data_segs_out = tp->data_segs_out;
 }
 EXPORT_SYMBOL_GPL(tcp_get_info);
 
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index fdb286d..74068e6 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -131,6 +131,7 @@ static bool tcp_fastopen_cookie_gen(struct request_sock 
*req,
 void tcp_fastopen_add_skb(struct sock *sk, struct sk_buff *skb)
 {
struct tcp_sock *tp = tcp_sk(sk);
+   u16 segs_in;
 
if (TCP_SKB_CB(skb)->end_seq == tp->rcv_nxt)
return;
@@ -154,6 +155,15 @@ void 

Re: [PATCH net-next 1/3] xen-netback: re-import canonical netif header

2016-03-10 Thread Wei Liu
On Thu, Mar 10, 2016 at 12:30:26PM +, Paul Durrant wrote:
> The canonical netif header (in the Xen source repo) and the Linux variant
> have diverged significantly. Recently much documentation has been added to
> the canonical header which is highly useful for developers making
> modifications to either xen-netfront or xen-netback. This patch therefore
> re-imports the canonical header in its entirety.
> 
> To maintain compatibility and some style consistency with the old Linux
> variant, the header was stripped of its emacs boilerplate, and
> post-processed and copied into place with the following commands:
> 
> ed -s netif.h << EOF
> H
> ,s/NETTXF_/XEN_NETTXF_/g
> ,s/NETRXF_/XEN_NETRXF_/g
> ,s/NETIF_/XEN_NETIF_/g
> ,s/XEN_XEN_/XEN_/g
> ,s/netif/xen_netif/g
> ,s/xen_xen_/xen_/g
> ,s/^typedef.*$//g
> ,s/^/${TAB}/g
> w
> $
> w
> EOF
> 
> indent --line-length 80 --linux-style netif.h \
> -o include/xen/interface/io/netif.h
> 
> Signed-off-by: Paul Durrant 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Boris Ostrovsky 
> Cc: David Vrabel 
> Cc: Wei Liu 

Acked-by: Wei Liu 



Re: [PATCH net-next 2/3] xen-netback: support multiple extra info fragments passed from frontend

2016-03-10 Thread Wei Liu
On Thu, Mar 10, 2016 at 12:30:27PM +, Paul Durrant wrote:
> The code does not currently support a frontend passing multiple extra info
> fragments to the backend in a tx request. The xenvif_get_extras() function
> handles multiple extra_info fragments but make_tx_response() assumes there
> is only ever a single extra info fragment.
> 
> This patch modifies xenvif_get_extras() to pass back a count of extra
> info fragments, which is then passed to make_tx_response() (after
> possibly being stashed in pending_tx_info for deferred responses).
> 
> Signed-off-by: Paul Durrant 
> Cc: Wei Liu 

Acked-by: Wei Liu 


Re: [PATCH net-next 3/3] xen-netback: reduce log spam

2016-03-10 Thread Wei Liu
On Thu, Mar 10, 2016 at 12:30:28PM +, Paul Durrant wrote:
> Remove the "prepare for reconnect" pr_info in xenbus.c. It's largely
> uninteresting and the states of the frontend and backend can easily be
> observed by watching the (o)xenstored log.
> 
> Signed-off-by: Paul Durrant 
> Cc: Wei Liu 

Acked-by: Wei Liu 

> ---
>  drivers/net/xen-netback/xenbus.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/xenbus.c 
> b/drivers/net/xen-netback/xenbus.c
> index 39a303d..bd182cd 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -511,8 +511,6 @@ static void set_backend_state(struct backend_info *be,
>   switch (state) {
>   case XenbusStateInitWait:
>   case XenbusStateConnected:
> - pr_info("%s: prepare for reconnect\n",
> - be->dev->nodename);
>   backend_switch_state(be, XenbusStateInitWait);
>   break;
>   case XenbusStateClosing:
> -- 
> 2.1.4
> 


Re: [net-next PATCH V3 1/3] net: adjust napi_consume_skb to handle none-NAPI callers

2016-03-10 Thread Sergei Shtylyov

Hello.

On 03/10/2016 05:59 PM, Jesper Dangaard Brouer wrote:

> Some drivers reuse/share code paths that free SKBs between NAPI
> and none-NAPI calls. Adjust napi_consume_skb to handle this
> use-case.
>
> Before, calls from netpoll (w/ IRQs disabled) was handled and
> indicated with a budget zero indication.  Use the same zero
> indication to handle calls not originating from NAPI/softirq.
> Simply handled by using dev_consume_skb_any().
>
> This adds an extra branch+call for the netpoll case (checking
> in_irq() + irqs_disabled()), but that is okay as this is a slowpath.
>
> Suggested-by: Alexander Duyck 
> Signed-off-by: Jesper Dangaard Brouer 
> ---
>   net/core/skbuff.c |4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7af7ec635d90..bc62baa54ceb 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -801,9 +801,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
> if (unlikely(!skb))
> return;
>
> -   /* if budget is 0 assume netpoll w/ IRQs disabled */
> +   /* Zero budget indicate none-NAPI context called us, like netpoll */

   Non-NAPI?

[...]

MBR, Sergei
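
A minimal userspace sketch of the budget convention described in the commit message above, not the kernel implementation. The enum free_path and the function consume_path are hypothetical names, and the two booleans stand in for the in_irq() and irqs_disabled() checks that the message says the zero-budget path performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the ways an skb could be released. */
enum free_path {
	FREE_NOTHING,	/* NULL skb: nothing to do */
	FREE_IRQ_SAFE,	/* zero budget, hard IRQ or IRQs off: defer the free */
	FREE_DIRECT,	/* zero budget, other non-NAPI context */
	FREE_NAPI_BULK,	/* non-zero budget: called from NAPI/softirq */
};

/* Pick a release path from the budget and the (modelled) IRQ state. */
static enum free_path consume_path(const void *skb, int budget,
				   bool in_irq, bool irqs_disabled)
{
	if (skb == NULL)
		return FREE_NOTHING;

	assert(budget >= 0);
	if (budget == 0)	/* non-NAPI caller, e.g. netpoll */
		return (in_irq || irqs_disabled) ? FREE_IRQ_SAFE : FREE_DIRECT;

	return FREE_NAPI_BULK;
}
```

The extra branch on the IRQ state is only reached when the budget is zero, which matches the remark that the added cost lands on a slowpath.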



Re: Micrel Phy - Is there a way to configure the Phy not to do 802.3x flow control?

2016-03-10 Thread Murali Karicheri
On 03/03/2016 07:16 PM, Florian Fainelli wrote:
> On 03/03/16 14:18, Murali Karicheri wrote:
>> Hi,
>>
>> We are using Micrel Phy in one of our board and wondering if we can force the
>> Phy to disable flow control at start. I have a 1G ethernet switch connected
>> to Phy and the phy always enable flow control. I would like to configure the
>> phy not to do flow control. Is that possible and, if yes, what should I do in
>> my Ethernet driver to tell the Phy not to enable flow control?
> 
> The PHY is not doing flow control per-se, your pseudo Ethernet MAC in
> the switch is doing, along with the link partner advertising support for
> it. You would want to make sure that your PHY device interface (provided
> that you are using the PHY library) is not starting with Pause
> advertised, but it could be supported.

Understood that the Phy only advertises flow control (FC). The Micrel phy for
the 9031 advertises FC support by default. After negotiation, I see that Phylib
reports the link status with pause = 1, asym_pause = 1. How do I tell the Phy
not to advertise it?

I call the following sequence in the Ethernet driver.

of_phy_connect(x,y,hndlr,a,z);
phy_start()

Now in hndlr() I have pause = 1, asym_pause = 1, in phy_device ptr. How can 
I tell the phy not to advertise initially?

Murali
> 
> As Andrew indicated, the proper way to do this is to use ethtool if
> you need to do this dynamically.
> 


-- 
Murali Karicheri
Linux Kernel, Keystone


Re: [PATCH v3 0/8] arm64: rockchip: Initial GeekBox enablement

2016-03-10 Thread Dinh Nguyen
On Thu, Mar 10, 2016 at 3:13 AM, Giuseppe CAVALLARO
 wrote:
> On 3/9/2016 5:31 PM, Dinh Nguyen wrote:
>>
>> On Wed, Mar 9, 2016 at 8:53 AM, Giuseppe CAVALLARO
>>  wrote:
>>>
>>> Hi Tomeu, Dinh, Andreas
>>>
>>> I need a summary and help from you to go ahead on the
>>> tx timeout.
>>>
>>> The "stmmac: MDIO fixes" seems to be the candidate to
>>> fix the phy connection and I will send the V2 asap (Andreas' comment).
>>>
>>> So, supposing the probe is ok and phy is connected,
>>> I need your input ...
>>>
>>>   Tomeu: after reverting the 0e80bdc9a72d (stmmac: first frame
>>>  prep at the end of xmit routine) the network is
>>>  not stable and there is a timeout after a while.
>>>  The box has 3.50 with normal desc settings.
>>>
>>>   Dinh: the network is ok, I wonder if you can share a boot
>>> log just to understand if the normal or enhanced
>>> descriptors are used.
>>>
>>
>> Here it is:
>
> ...
>>
>> [0.850523] stmmac - user ID: 0x10, Synopsys ID: 0x37
>> [0.855570]  Ring mode enabled
>> [0.858611]  DMA HW capability register supported
>> [0.863128]  Enhanced/Alternate descriptors
>> [0.867482]  Enabled extended descriptors
>> [0.871482]  RX Checksum Offload Engine supported (type 2)
>> [0.876948]  TX Checksum insertion supported
>> [0.881204]  Enable RX Mitigation via HW Watchdog Timer
>> [0.886863] socfpga-dwmac ff702000.ethernet eth0: No MDIO subnode found
>> [0.899090] libphy: stmmac: probed
>> [0.902484] eth0: PHY ID 00221611 at 4 IRQ POLL (stmmac-0:04) active
>
>
> Thx Dinh, so you are using the Enhanced/Alternate descriptors.
> I am debugging on my side on a setup with normal descriptors; I will let you
> know.
>

Doesn't the printout "Enhanced/Alternate descriptors"  mean that I'm using
Enhanced/Alternate descriptors?

Dinh


Re: [PATCH 2/2] isdn: i4l: move active-isdn drivers to staging

2016-03-10 Thread isdn
Am 10.03.2016 um 13:58 schrieb Paul Bolle:
> Hi Karsten,
> 
> On do, 2016-03-10 at 11:53 +0100, i...@linux-pingi.de wrote:
>> mISDN with CAPI support works just fine with pppd and pppdcapiplugin
>> and the CAPI works for all mISDN HW.
> 
> In the mainline tree the mISDN and CAPI stacks are effectively separate.
> Do you perhaps refer to a mISDN + Asterisk + chan-capi setup? (That's
> the closest to mISDN with CAPI support that I could find. Did I miss
> something?)

http://listserv.isdn4linux.de/pipermail/isdn4linux/2012-January/005580.html

Since 2012 mISDN has a CAPI20 interface, purely in userspace.
Everything is in the capi20 subdirectory of mISDNuser.
The capi20 support needs to be enabled with ./configure.

It has nothing to do with Asterisk, but for FAX it is using the same DSP
library, spandsp.

Best
Karsten Keil



Re: [PATCH] mrf24j40: fix security-enabled processing on inbound frames

2016-03-10 Thread Stefan Schmidt

Hello.

On 29/02/16 20:49, Alan Ott wrote:
> On 02/18/2016 01:34 PM, zopieux wrote:
>> Fix the MRF24J40 handling of security-enabled frames so it does not
>> block upon receiving such frames.
>>
>> Signed-off-by: Alexander Aring 
>> Reported-by: Alexandre Macabies 
>> Tested-by: Alexandre Macabies 
>> ---
>> When receiving a security-enabled IEEE 802.15.4 frame, the MRF24J40
>> triggers a SECIF interrupt that needs to be handled for RX processing
>> to keep functioning properly.
>>
>> This patch enables the SECIF interrupt and makes the MRF ignore all
>> hardware processing of security-enabled frames, which is handled by the
>> ieee802154 stack instead.
>> ---
>
> The "From" field of the email needs to have your real name in it. This
> will be where the "Author" field in git comes from.
>
> It looks like there are a few separate things happening in this patch.
> Maybe they should be broken out into separate patches. I see:
>
> 1. The ieee802154.h part,
> 2. The TX part,
> 3. The RX part.
>
> The patch description only really describes the RX part.

zopieux, could you split the patch as Alan suggested and resubmit
the series?


regards
Stefan Schmidt


Re: [GIT PULL v2 0/4] IPVS Fixes for v4.5

2016-03-10 Thread Pablo Neira Ayuso
On Mon, Mar 07, 2016 at 12:03:30PM +0900, Simon Horman wrote:
> Hi Pablo,
> 
> please consider these IPVS fixes for v4.5 or
> if it is too late please consider them for v4.6.

Pulled into nf-next, thanks Simon!

