[PATCH] net/phy: Add Vitesse 8641 phy ID

2015-06-25 Thread shh.xie
From: Shaohui Xie 

Vitesse VSC8641 is compatible with Vitesse 82xx

Signed-off-by: Shaohui Xie 
---
 drivers/net/phy/vitesse.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/phy/vitesse.c b/drivers/net/phy/vitesse.c
index 76cad71..17cad18 100644
--- a/drivers/net/phy/vitesse.c
+++ b/drivers/net/phy/vitesse.c
@@ -66,6 +66,7 @@
 #define PHY_ID_VSC8244 0x000fc6c0
 #define PHY_ID_VSC8514 0x00070670
 #define PHY_ID_VSC8574 0x000704a0
+#define PHY_ID_VSC8641 0x00070431
 #define PHY_ID_VSC8662 0x00070660
 #define PHY_ID_VSC8221 0x000fc550
 #define PHY_ID_VSC8211 0x000fc4b0
@@ -272,6 +273,18 @@ static struct phy_driver vsc82xx_driver[] = {
.config_intr= &vsc82xx_config_intr,
.driver = { .owner = THIS_MODULE,},
 }, {
+   .phy_id = PHY_ID_VSC8641,
+   .name   = "Vitesse VSC8641",
+   .phy_id_mask= 0x0000,
+   .features   = PHY_GBIT_FEATURES,
+   .flags  = PHY_HAS_INTERRUPT,
+   .config_init= &vsc824x_config_init,
+   .config_aneg= &vsc82x4_config_aneg,
+   .read_status= &genphy_read_status,
+   .ack_interrupt  = &vsc824x_ack_interrupt,
+   .config_intr= &vsc82xx_config_intr,
+   .driver = { .owner = THIS_MODULE,},
+}, {
.phy_id = PHY_ID_VSC8662,
.name   = "Vitesse VSC8662",
.phy_id_mask= 0x0000,
@@ -318,6 +331,7 @@ static struct mdio_device_id __maybe_unused vitesse_tbl[] = 
{
{ PHY_ID_VSC8244, 0x000fffc0 },
{ PHY_ID_VSC8514, 0x0000 },
{ PHY_ID_VSC8574, 0x0000 },
+   { PHY_ID_VSC8641, 0x0000 },
{ PHY_ID_VSC8662, 0x0000 },
{ PHY_ID_VSC8221, 0x0000 },
{ PHY_ID_VSC8211, 0x0000 },
-- 
1.8.4.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH iproute2] ss: Fix allocation of cong control alg name

2015-06-25 Thread Daniel Borkmann

On 06/25/2015 05:31 AM, Stephen Hemminger wrote:

On Fri, 29 May 2015 18:48:42 +0200
Daniel Borkmann  wrote:


On 05/29/2015 06:17 PM, Guzman Mosqueda, Jose R wrote:

Hi Daniel and Vadim

Thanks for your prompt response and for the patch.

Also, what about the other one? Do you think it is an issue or not?

" File: tc/tc_util.c
Function: void print_rate(char *buf, int len, __u64 rate)
Line: ~264

In the case that the user inputs a high value for rate, the "for" loop exits only when its
condition fails, meaning that variable "i" gets the value 5, which is an invalid index for
the "units" array because that array has only 5 elements."

I know a very high value is invalid, but in the case that it comes directly from the
user, it could cause an issue. What do you think?


Hm, this prints just the netlink dump from kernel side, but perhaps
we should just change it ...

diff --git a/tc/tc_util.c b/tc/tc_util.c
index dc2b70f..aa6de24 100644
--- a/tc/tc_util.c
+++ b/tc/tc_util.c
@@ -250,18 +250,19 @@ void print_rate(char *buf, int len, __u64 rate)
extern int use_iec;
unsigned long kilo = use_iec ? 1024 : 1000;
const char *str = use_iec ? "i" : "";
-   int i = 0;
static char *units[5] = {"", "K", "M", "G", "T"};
+   int i;

rate <<= 3; /* bytes/sec -> bits/sec */

-   for (i = 0; i < ARRAY_SIZE(units); i++)  {
+   for (i = 0; i < ARRAY_SIZE(units) - 1; i++)  {
if (rate < kilo)
break;
if (((rate % kilo) != 0) && rate < 1000*kilo)
break;
rate /= kilo;
}
+
snprintf(buf, len, "%.0f%s%sbit", (double)rate, units[i], str);
   }
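
The bound in the patched loop above can be checked in isolation. The following is a minimal userspace sketch, not the iproute2 source: format_rate() is a hypothetical stand-in for print_rate() that takes use_iec as a parameter instead of reading the global, and it assumes the same five-entry units table. Stopping the loop at ARRAY_SIZE(units) - 1 means i can never exceed 4, so the largest inputs are reported with the "T" suffix.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical standalone variant of print_rate(): rate is in bytes/sec. */
void format_rate(char *buf, size_t len, uint64_t rate, int use_iec)
{
	static const char *units[] = {"", "K", "M", "G", "T"};
	uint64_t kilo = use_iec ? 1024 : 1000;
	const char *str = use_iec ? "i" : "";
	size_t i;

	rate <<= 3; /* bytes/sec -> bits/sec */

	/* Stop one short of the table size so units[i] stays in bounds. */
	for (i = 0; i < ARRAY_SIZE(units) - 1; i++) {
		if (rate < kilo)
			break;
		if ((rate % kilo) != 0 && rate < 1000 * kilo)
			break;
		rate /= kilo;
	}

	snprintf(buf, len, "%.0f%s%sbit", (double)rate, units[i], str);
}
```

With the old bound, a rate near the top of the 64-bit range would be divided five times and leave i == 5. With the new bound, format_rate(buf, sizeof(buf), UINT64_MAX >> 3, 0) stops at i == 4 and yields "18446744Tbit".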


I don't know what thread you meant to hijack for this, but it wasn't the
one about ss: cong name.


Jose did top-post on the first reported issue asking about the 2nd one, I
think that's how we ended up here. ;)


Re: [PATCH v2 0/4] net: dsa: mv88e6352: add support for VLAN Table Unit

2015-06-25 Thread David Miller
From: Vivien Didelot 
Date: Wed, 24 Jun 2015 14:50:55 -0400

> This patchset brings full support for hardware VLANs in DSA, and the Marvell
> 88E6352 and compatible switch chips.

As I clearly announced on net-next yesterday, the net-next tree is now
closed.  Please resubmit this series when the net-next tree opens up
again.


[PATCH net V1 2/4] net/mlx4_en: Wake TX queues only when there's enough room

2015-06-25 Thread Or Gerlitz
From: Ido Shamay 

Indication of a single completed packet, marked by txbbs_skipped
being bigger than zero, is not enough to wake up a stopped TX queue.
The completed packet may contain a single TXBB, while the next packet
to be sent (after the wake up) may have multiple TXBBs (LSO/TSO packets
for example), causing overflow in the queue followed by WQE corruption
and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.

Also created a helper routine - mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.

Signed-off-by: Ido Shamay 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 19 +++
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  1 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 0ab298f..c10d98f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -66,6 +66,7 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
ring->size = size;
ring->size_mask = size - 1;
ring->stride = stride;
+   ring->full_size = ring->size - HEADROOM - MAX_DESC_TXBBS;
 
tmp = size * sizeof(struct mlx4_en_tx_info);
ring->tx_info = kmalloc_node(tmp, GFP_KERNEL | __GFP_NOWARN, node);
@@ -232,6 +233,11 @@ void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv,
   MLX4_QP_STATE_RST, NULL, 0, 0, &ring->qp);
 }
 
+static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring)
+{
+   return ring->prod - ring->cons > ring->full_size;
+}
+
 static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
  struct mlx4_en_tx_ring *ring, int index,
  u8 owner)
@@ -474,11 +480,10 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 
netdev_tx_completed_queue(ring->tx_queue, packets, bytes);
 
-   /*
-* Wakeup Tx queue if this stopped, and at least 1 packet
-* was completed
+   /* Wakeup Tx queue if this stopped, and ring is not full.
 */
-   if (netif_tx_queue_stopped(ring->tx_queue) && txbbs_skipped > 0) {
+   if (netif_tx_queue_stopped(ring->tx_queue) &&
+   !mlx4_en_is_tx_ring_full(ring)) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
@@ -922,8 +927,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
skb_tx_timestamp(skb);
 
/* Check available TXBBs And 2K spare for prefetch */
-   stop_queue = (int)(ring->prod - ring_cons) >
- ring->size - HEADROOM - MAX_DESC_TXBBS;
+   stop_queue = mlx4_en_is_tx_ring_full(ring);
if (unlikely(stop_queue)) {
netif_tx_stop_queue(ring->tx_queue);
ring->queue_stopped++;
@@ -992,8 +996,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
smp_rmb();
 
ring_cons = ACCESS_ONCE(ring->cons);
-   if (unlikely(((int)(ring->prod - ring_cons)) <=
-ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+   if (unlikely(!mlx4_en_is_tx_ring_full(ring))) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 32134bd..666d166 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -279,6 +279,7 @@ struct mlx4_en_tx_ring {
u32 size; /* number of TXBBs */
u32 size_mask;
u16 stride;
+   u32 full_size;
u16 cqn;/* index of port CQ associated with this ring */
u32 buf_size;
__be32  doorbell_qpn;
-- 
2.3.7
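
The wake-up condition described in this patch can be sketched in userspace as follows. This is a minimal illustration under stated assumptions, not the driver code: struct tx_ring and the tx_ring_*() names are hypothetical, and the headroom and maximum descriptor size are passed as parameters because the values of HEADROOM and MAX_DESC_TXBBS are not shown in this thread. The point of the sketch is that the same "is full" test gates both stopping and waking the queue, and that unsigned subtraction keeps the occupancy correct when the producer counter wraps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a TX ring: free-running counters in TXBBs. */
struct tx_ring {
	uint32_t prod;      /* producer counter */
	uint32_t cons;      /* consumer counter */
	uint32_t full_size; /* occupancy above which the ring counts as full */
};

/* Precompute the threshold once, as the patch does at ring creation. */
void tx_ring_init(struct tx_ring *ring, uint32_t size,
		  uint32_t headroom, uint32_t max_desc_txbbs)
{
	ring->prod = 0;
	ring->cons = 0;
	ring->full_size = size - headroom - max_desc_txbbs;
}

/* Full when there is no room left for a worst-case descriptor. */
bool tx_ring_is_full(const struct tx_ring *ring)
{
	return (uint32_t)(ring->prod - ring->cons) > ring->full_size;
}

/* Wake a stopped queue only if a worst-case packet would now fit. */
bool tx_ring_should_wake(const struct tx_ring *ring, bool queue_stopped)
{
	return queue_stopped && !tx_ring_is_full(ring);
}
```

Waking on "at least one completion" instead would let a multi-TXBB packet be queued into a ring that only has room for one TXBB, which is the overflow the commit message describes.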



[PATCH net V1 0/4] mlx4 driver fixes, June 24, 2015

2015-06-25 Thread Or Gerlitz
Hi Dave,

Some fixes that we made recently, all need to go into stable.

patch #1 "net/mlx4_en: Release TX QP when destroying TX ring" and patch #3 
"Fix wrong csum complete report when rxvlan offload is disabled" to >= 3.19

patch #2 "Wake TX queues only when there's enough room" addressing a bug 
which is there from day one... should go to whatever kernels it's still 
applicable

patch #4 "mlx4: Disable HA for SRIOV PF RoCE devices" to >= 4.0

The patches are marked with net but are made against net-next,
as the net tree still doesn't contain all the net-next bits.

thanks,

Or.

Changes from V0:
 - addressed feedback from Eric D. on the checksum complete patch


Eran Ben Elisha (1):
  net/mlx4_en: Release TX QP when destroying TX ring

Ido Shamay (2):
  net/mlx4_en: Wake TX queues only when there's enough room
  net/mlx4_en: Fix wrong csum complete report when rxvlan offload is
disabled

Or Gerlitz (1):
  mlx4: Disable HA for SRIOV PF RoCE devices

 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  4 
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 17 ++---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 20 
 drivers/net/ethernet/mellanox/mlx4/intf.c  |  8 +++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  2 +-
 5 files changed, 26 insertions(+), 25 deletions(-)

-- 
2.3.7



[PATCH net V1 4/4] mlx4: Disable HA for SRIOV PF RoCE devices

2015-06-25 Thread Or Gerlitz
When in HA mode, the driver exposes an IB (RoCE) device instance with only
one port. Under SRIOV, the existing implementation doesn't go well with
the PF RoCE driver's role of Special QPs Para-Virtualization, etc.

As such, disable HA for the mlx4 PF RoCE device in SRIOV mode.

Fixes: a57500903093 ('IB/mlx4: Add port aggregation support')
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/intf.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/intf.c b/drivers/net/ethernet/mellanox/mlx4/intf.c
index 6fce587..0d80aed 100644
--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -93,8 +93,14 @@ int mlx4_register_interface(struct mlx4_interface *intf)
mutex_lock(&intf_mutex);
 
list_add_tail(&intf->list, &intf_list);
-   list_for_each_entry(priv, &dev_list, dev_list)
+   list_for_each_entry(priv, &dev_list, dev_list) {
+   if (mlx4_is_mfunc(&priv->dev) && (intf->flags & MLX4_INTFF_BONDING)) {
+   mlx4_dbg(&priv->dev,
+"SRIOV, disabling HA mode for intf proto %d\n", intf->protocol);
+   intf->flags &= ~MLX4_INTFF_BONDING;
+   }
mlx4_add_device(intf, priv);
+   }
 
mutex_unlock(&intf_mutex);
 
-- 
2.3.7



[PATCH net V1 3/4] net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled

2015-06-25 Thread Or Gerlitz
From: Ido Shamay 

The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan
offload is disabled. This is wrong since rxvlan offload can be switched
on/off regardless of hwtstamp_rx_filter.

Also moved check_csum to query CQE information to identify VLAN packets
and removed the check of IP packets, since it has been validated before.

Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Ido Shamay 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 35f726c..7a4f20b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -718,7 +718,7 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
 }
 #endif
 static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
- int hwtstamp_rx_filter)
+ netdev_features_t dev_features)
 {
__wsum hw_checksum = 0;
 
@@ -726,14 +726,8 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
 
hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
 
-   if (((struct ethhdr *)va)->h_proto == htons(ETH_P_8021Q) &&
-   hwtstamp_rx_filter != HWTSTAMP_FILTER_NONE) {
-   /* next protocol non IPv4 or IPv6 */
-   if (((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-   != htons(ETH_P_IP) &&
-   ((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-   != htons(ETH_P_IPV6))
-   return -1;
+   if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_VLAN_PRESENT_MASK) &&
+   !(dev_features & NETIF_F_HW_VLAN_CTAG_RX)) {
hw_checksum = get_fixed_vlan_csum(hw_checksum, hdr);
hdr += sizeof(struct vlan_hdr);
}
@@ -896,7 +890,8 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 
if (ip_summed == CHECKSUM_COMPLETE) {
void *va = skb_frag_address(skb_shinfo(gro_skb)->frags);
-   if (check_csum(cqe, gro_skb, va, ring->hwtstamp_rx_filter)) {
+   if (check_csum(cqe, gro_skb, va,
+  dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_none++;
ring->csum_complete--;
@@ -951,7 +946,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
}
 
if (ip_summed == CHECKSUM_COMPLETE) {
-   if (check_csum(cqe, skb, skb->data, ring->hwtstamp_rx_filter)) {
+   if (check_csum(cqe, skb, skb->data, dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_complete--;
ring->csum_none++;
-- 
2.3.7
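
The new condition in check_csum() can be read as a two-input predicate. The sketch below is an illustration only: the SKETCH_* bit values are placeholders rather than the real MLX4_CQE_VLAN_PRESENT_MASK and NETIF_F_HW_VLAN_CTAG_RX definitions, needs_vlan_csum_fixup() is a hypothetical name, and the byte-order conversion done in the driver is left out. As the hunk reads, the hardware checksum is adjusted for the VLAN header only when the CQE reports a VLAN tag and RX VLAN offload is off in the device features; the timestamp filter no longer plays any part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define SKETCH_CQE_VLAN_PRESENT  (1u << 0)
#define SKETCH_F_HW_VLAN_CTAG_RX (1u << 1)

/*
 * True when the completion reports a VLAN tag and RX VLAN offload is
 * disabled, which is the case in which the patch applies the VLAN
 * checksum adjustment.
 */
bool needs_vlan_csum_fixup(uint32_t cqe_flags, uint32_t dev_features)
{
	return (cqe_flags & SKETCH_CQE_VLAN_PRESENT) &&
	       !(dev_features & SKETCH_F_HW_VLAN_CTAG_RX);
}
```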



[PATCH net V1 1/4] net/mlx4_en: Release TX QP when destroying TX ring

2015-06-25 Thread Or Gerlitz
From: Eran Ben Elisha 

TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code
used the deprecated base_tx_qpn field. Move TX QP release to
mlx4_en_destroy_tx_ring and remove the base_tx_qpn field.

Fixes: ddae0349fdb7 ('net/mlx4: Change QP allocation scheme')
Signed-off-by: Eran Ben Elisha 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   | 1 -
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 77179d7..e0de2fd 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1977,10 +1977,6 @@ void mlx4_en_free_resources(struct mlx4_en_priv *priv)
mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
}
 
-   if (priv->base_tx_qpn) {
-   mlx4_qp_release_range(priv->mdev->dev, priv->base_tx_qpn, priv->tx_ring_num);
-   priv->base_tx_qpn = 0;
-   }
 }
 
 int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 7bed3a8..0ab298f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -180,6 +180,7 @@ void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
mlx4_bf_free(mdev->dev, &ring->bf);
mlx4_qp_remove(mdev->dev, &ring->qp);
mlx4_qp_free(mdev->dev, &ring->qp);
+   mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1);
mlx4_en_unmap_buffer(&ring->wqres.buf);
mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
kfree(ring->bounce_buf);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d5f9adb..32134bd 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -580,7 +580,6 @@ struct mlx4_en_priv {
int vids[128];
bool wol;
struct device *ddev;
-   int base_tx_qpn;
struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE];
struct hwtstamp_config hwtstamp_config;
u32 counter_index;
-- 
2.3.7



Re: [PATCH net V1 0/4] mlx4 driver fixes, June 24, 2015

2015-06-25 Thread David Miller
From: Or Gerlitz 
Date: Thu, 25 Jun 2015 11:29:40 +0300

> Some fixes that we made recently, all need to go into stable.
> 
> patch #1 "net/mlx4_en: Release TX QP when destroying TX ring" and patch #3 
> "Fix wrong csum complete report when rxvlan offload is disabled" to >= 3.19
> 
> patch #2 "Wake TX queues only when there's enough room" addressing a bug 
> which is there from day one... should go to whatever kernels it's still 
> applicable
> 
> patch #4 "mlx4: Disable HA for SRIOV PF RoCE devices" to >= 4.0
> 
> The patches are marked with net but are made against net-next,
> as the net tree still doesn't contain all the net-next bits.

Series applied, thanks.


Re: [patch -next] renesas: missing unlock on error path

2015-06-25 Thread David Miller
From: Dan Carpenter 
Date: Wed, 24 Jun 2015 17:32:54 +0300

> We need to unlock before returning here.
> 
> Fixes: a0d2f20650e8 ('Renesas Ethernet AVB PTP clock driver')
> Signed-off-by: Dan Carpenter 

Applied.


Re: [patch -next] cavium/liquidio: fix some error handling in lio_set_phys_id()

2015-06-25 Thread David Miller
From: Dan Carpenter 
Date: Wed, 24 Jun 2015 17:47:02 +0300

> There was a missing assignment so the "if (ret)" on the next line is
> never true.
> 
> Fixes: f21fb3ed364b ('Add support of Cavium Liquidio ethernet adapters')
> Signed-off-by: Dan Carpenter 

Applied.


Re: [PATCH] net/phy: Add Vitesse 8641 phy ID

2015-06-25 Thread David Miller
From: 
Date: Thu, 25 Jun 2015 13:34:27 +0800

> From: Shaohui Xie 
> 
> Vitesse VSC8641 is compatible with Vitesse 82xx
> 
> Signed-off-by: Shaohui Xie 

Applied.


Re: [PATCH] net/fsl: remove dependency FSL_SOC for Gianfar

2015-06-25 Thread David Miller
From: Alison Wang 
Date: Thu, 25 Jun 2015 11:34:38 +0800

> CONFIG_GIANFAR does not depend on FSL_SOC; it
> can be built on non-PPC platforms.
> 
> Signed-off-by: Alison Wang 

Applied.


[PATCH 2/2] iproute2: misc/ss.c - fix run_ssfilter af_packet when protocol == 0

2015-06-25 Thread Maciej Żenczykowski
From: Maciej Żenczykowski 

s->local.data is a pointer to a field of a non-NULL struct, and hence
cannot be NULL, thus comparing it to 0 is always false, and thus the
return is always false.

Presumably this was meant to be a check whether s->local.data[0] (which
I believe stores the af_packet protocol) is 0, i.e. ANY.

Change-Id: Ia232f5b06ce081e3b2fb6338f1a709cd94e03ae5
Fixes:
  ss.c:1018:37: error: comparison of array 's->local.data' equal to a null pointer is always false [-Werror,-Wtautological-pointer-compare]
return s->lport == 0 && s->local.data == 0;
~^~~~~
  1 error generated.
---
 misc/ss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/misc/ss.c b/misc/ss.c
index dba0901791c7..36b0efdfd32f 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -1090,7 +1090,7 @@ static int run_ssfilter(struct ssfilter *f, struct sockstat *s)
 strspn(p+1, "0123456789abcdef") == 5);
}
if (s->local.family == AF_PACKET)
-   return s->lport == 0 && s->local.data == 0;
+   return s->lport == 0 && s->local.data[0] == 0;
if (s->local.family == AF_NETLINK)
return s->lport < 0;
 
-- 
2.4.3.573.g4eafbef
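
The difference between the two comparisons can be shown with a reduced example. This is a minimal sketch with hypothetical types (struct sketch_addr and struct sketch_sock are not the iproute2 definitions, and the array length is arbitrary). An array member used in an expression decays to the address of its first element, which cannot be null for a valid object, so comparing the array itself to 0 never succeeds. Comparing the first element is what actually tests the stored value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced types; not the definitions used in misc/ss.c. */
struct sketch_addr {
	uint16_t family;
	uint32_t data[8]; /* an array member, never a null pointer */
};

struct sketch_sock {
	int lport;
	struct sketch_addr local;
};

/* Matches a packet socket with no port and protocol 0 (assumed to mean ANY). */
bool packet_sock_is_any(const struct sketch_sock *s)
{
	/* Test the first element, not the address of the array. */
	return s->lport == 0 && s->local.data[0] == 0;
}
```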



[PATCH 1/2] iproute2: tc/m_pedit.c - remove dead code

2015-06-25 Thread Maciej Żenczykowski
From: Maciej Żenczykowski 

The initializers are simply not needed.

These if-blocks are outright dead code: 'tkey->off' is unsigned, so
'0 > tkey->off' is always false and only the else clause can run. Whichever
clause runs only updates 'ind', which is unconditionally overwritten
before it is used anyway.

Otherwise we get errors from clang:

  m_pedit.c:166:8: error: comparison of 0 > unsigned expression is always false [-Werror,-Wtautological-compare]
if (0 > tkey->off) {
~ ^ ~
  m_pedit.c:209:8: error: comparison of 0 > unsigned expression is always false [-Werror,-Wtautological-compare]
if (0 > tkey->off) {
~ ^ ~
  2 errors generated.

Change-Id: I3c9e9092915088fc56f992e5df736851541a4458
---
 tc/m_pedit.c | 32 
 1 file changed, 8 insertions(+), 24 deletions(-)

diff --git a/tc/m_pedit.c b/tc/m_pedit.c
index dfe9b2ebd6e0..4fdd189d7d9c 100644
--- a/tc/m_pedit.c
+++ b/tc/m_pedit.c
@@ -160,17 +160,9 @@ pack_key32(__u32 retain,struct tc_pedit_sel *sel,struct tc_pedit_key *tkey)
 int
 pack_key16(__u32 retain,struct tc_pedit_sel *sel,struct tc_pedit_key *tkey)
 {
-   int ind = 0, stride = 0;
+   int ind, stride;
__u32 m[4] = {0x,0xFFFF,0x};
 
-   if (0 > tkey->off) {
-   ind = tkey->off + 1;
-   if (0 > ind)
-   ind = -1*ind;
-   } else {
-   ind = tkey->off;
-   }
-
if (tkey->val > 0x || tkey->mask > 0x) {
fprintf(stderr, "pack_key16 bad value\n");
return -1;
@@ -178,18 +170,16 @@ pack_key16(__u32 retain,struct tc_pedit_sel *sel,struct tc_pedit_key *tkey)
 
ind = tkey->off & 3;
 
-   if (0 > ind || 2 < ind) {
+   if (ind == 3) {
fprintf(stderr, "pack_key16 bad index value %d\n",ind);
return -1;
}
 
stride = 8 * ind;
tkey->val = htons(tkey->val);
-   if (stride > 0) {
-   tkey->val <<= stride;
-   tkey->mask <<= stride;
-   retain <<= stride;
-   }
+   tkey->val <<= stride;
+   tkey->mask <<= stride;
+   retain <<= stride;
tkey->mask = retain|m[ind];
 
tkey->off &= ~3;
@@ -203,28 +193,22 @@ pack_key16(__u32 retain,struct tc_pedit_sel *sel,struct tc_pedit_key *tkey)
 int
 pack_key8(__u32 retain,struct tc_pedit_sel *sel,struct tc_pedit_key *tkey)
 {
-   int ind = 0, stride = 0;
+   int ind, stride;
__u32 m[4] = {0xFF00,0x00FF,0xFF00,0x00FF};
 
-   if (0 > tkey->off) {
-   ind = tkey->off + 1;
-   if (0 > ind)
-   ind = -1*ind;
-   } else {
-   ind = tkey->off;
-   }
-
if (tkey->val > 0xFF || tkey->mask > 0xFF) {
fprintf(stderr, "pack_key8 bad value (val %x mask %x\n", tkey->val, tkey->mask);
return -1;
}
 
ind = tkey->off & 3;
+
stride = 8 * ind;
tkey->val <<= stride;
tkey->mask <<= stride;
retain <<= stride;
tkey->mask = retain|m[ind];
+
tkey->off &= ~3;
 
if (pedit_debug)
-- 
2.4.3.573.g4eafbef
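
The index and stride arithmetic that remains in pack_key16() after the dead code is removed can be sketched on its own. This is a minimal illustration, not the tc source: key16_shift() and key_word_offset() are hypothetical helper names, and the m[] masks are left out. Because the offset is unsigned, off & 3 is always in the range 0..3, so the only invalid case for a 16-bit key is byte index 3, where the value would straddle two 32-bit words.

```c
#include <stdint.h>

/*
 * Bit shift for a 16-bit key at byte offset off within its 32-bit word,
 * or -1 if the key would straddle two words.
 */
int key16_shift(uint32_t off)
{
	uint32_t ind = off & 3; /* always 0..3 for an unsigned offset */

	if (ind == 3)
		return -1;
	return (int)(8 * ind);
}

/* Offset rounded down to the containing 32-bit word, as in off &= ~3. */
uint32_t key_word_offset(uint32_t off)
{
	return off & ~(uint32_t)3;
}
```

For example, a byte offset of 6 gives a shift of 16 within the word that starts at offset 4.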



Re: [PATCH v2 1/3] net: mvneta: introduce compatible string "marvell,armada-xp-neta"

2015-06-25 Thread Simon Guinot
On Fri, Jun 19, 2015 at 02:32:53PM +0200, Simon Guinot wrote:
> On Wed, Jun 17, 2015 at 05:01:12PM +, Jason Cooper wrote:
> > Hi Gregory,
> > 
> > On Wed, Jun 17, 2015 at 05:15:28PM +0200, Gregory CLEMENT wrote:
> > > On 17/06/2015 17:12, Gregory CLEMENT wrote:
> > > > On 17/06/2015 15:19, Simon Guinot wrote:
> > > >> The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
> > > >> 380 and 385 SoCs. Since at least one more hardware feature is available
> > > >> for the Armada XP SoCs then a way to identify them is needed.
> > > >>
> > > >> This patch introduces a new compatible string "marvell,armada-xp-neta".
> > > > 
> > > > Let's be future proof by going further. I would like to have a
> > > > compatible string for each SoC even if we currently don't use them.
> > 
> > I disagree with this.  We can't predict what inconsistencies we'll discover
> > in the future.  We should only assign new compatible strings based on known IP
> > variations when we discover them.  This seems fraught with demons since we
> > can't predict the scope of affected IP blocks (some steppings of one SoC,
> > three SoCs plus two steppings of a fourth, etc)
> > 
> > imho, the 'future-proofing' lies in being specific as to the naming of the
> > compatible strings against known hardware variations at the time.
> 
> So, should I add more compatible strings or not ?

Hi Gregory and Jason,

How do you want me to handle this ? Did you reach an agreement ?

Thanks,

Simon




Re: [RFC] virtio_net: Adding tx_timeout function.

2015-06-25 Thread Jason Wang


On 06/25/2015 09:31 AM, Julio Faracco wrote:
> 2015-06-24 3:10 GMT-03:00 Michael S. Tsirkin :
>> On Tue, Jun 23, 2015 at 10:44:29PM -0300, Julio Faracco wrote:
>>> The virtio_net paravirtualized driver does not have a tx_timeout() function to
>>> guarantee that the driver will recover properly after receiving a timeout
>>> during a transmission of a packet. This patch adds this feature and throws a
>>> timeout exception after 5 * HZ jiffies (5 seconds). Considering some tests,
>>> this is the best time to use here.
>>>
>>> Signed-off-by: Julio Faracco 
>>> Cc: Jason Wang 
>> Looks like a bunch of locks and flushes are missing in this patch.  IMHO
>> that's just too painful with current hardware.  IMO the right thing to
>> do here is to add ability to reset specific queues to hardware.
>>
> I agree, Michael. This model is the default one resetting the device
> due to transmission timeout.
> To have a better performance, only some queues must be reset.

Resetting the device during a tx hang is much more important. I don't see much
value in caring about performance in this case, and most drivers do not
care either.

And we cannot assume it is just an issue with a specific queue; it
may be a bug in the guest driver, qemu or even the host stack. So resetting
the whole device makes more sense.

Btw, I think it's better to print a warning with a dump of each queue state.

>
>>> ---
>>>  drivers/net/virtio_net.c |   69 
>>> +-
>>>  1 file changed, 68 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 63c7810..75ac45c 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -135,6 +135,9 @@ struct virtnet_info {
>>>   /* Work struct for config space updates */
>>>   struct work_struct config_work;
>>>
>>> + /* Work struct for resetting the virtio-net driver. */
>>> + struct work_struct reset_task;
>>> +
>>>   /* Does the affinity hint is set for virtqueues? */
>>>   bool affinity_hint_set;
>>>
>>> @@ -1394,6 +1397,18 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
>>>   return 0;
>>>  }
>>>
>>> +static void virtnet_tx_timeout(struct net_device *dev)
>>> +{
>>> + struct virtnet_info *vi = netdev_priv(dev);
>>> +
>>> + dev_warn(&dev->dev, "TX Timeout exception with latency: %ld\n",
>>> +  jiffies - dev_trans_start(dev));
>>> +
>>> + schedule_work(&vi->reset_task);
>> What if after this triggers user does something
>> to the device (e.g. attempts to remove it)?
>> Or if a packet is transmitted or used?
> At some point, this work must be canceled.
> Yes, you are right. Specially, when the driver is being removed.
>>> +}
>>> +
>>> +static void virtnet_reset_task(struct work_struct *work);
>>> +
>>>  static const struct net_device_ops virtnet_netdev = {
>>>   .ndo_open= virtnet_open,
>>>   .ndo_stop= virtnet_close,
>>> @@ -1405,6 +1420,7 @@ static const struct net_device_ops virtnet_netdev = {
>>>   .ndo_get_stats64 = virtnet_stats,
>>>   .ndo_vlan_rx_add_vid = virtnet_vlan_rx_add_vid,
>>>   .ndo_vlan_rx_kill_vid = virtnet_vlan_rx_kill_vid,
>>> + .ndo_tx_timeout  = virtnet_tx_timeout,
>>>  #ifdef CONFIG_NET_POLL_CONTROLLER
>>>   .ndo_poll_controller = virtnet_netpoll,
>>>  #endif
>>> @@ -1750,6 +1766,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>>>   dev->netdev_ops = &virtnet_netdev;
>>>   dev->features = NETIF_F_HIGHDMA;
>>>
>>> + dev->watchdog_timeo = 5 * HZ;
>>>   dev->ethtool_ops = &virtnet_ethtool_ops;
>>>   SET_NETDEV_DEV(dev, &vdev->dev);
>>>
>>> @@ -1811,6 +1828,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>>>   }
>>>
>>>   INIT_WORK(&vi->config_work, virtnet_config_changed_work);
>>> + INIT_WORK(&vi->reset_task, virtnet_reset_task);
>>>
>>>   /* If we can receive ANY GSO packets, we must allocate large ones. */
>>>   if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
>>> @@ -1891,7 +1909,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>>>   netif_carrier_on(dev);
>>>   }
>>>
>>> - pr_debug("virtnet: registered device %s with %d RX and TX vq's\n",
>>> + pr_debug("virtio_net: registered device %s with %d RX and TX vq's\n",
>>>dev->name, max_queue_pairs);
>>>
>>>   return 0;
>>> @@ -2001,6 +2019,55 @@ static int virtnet_restore(struct virtio_device *vdev)
>>>  }
>>>  #endif
>>>
>>> +static void virtnet_reset_task(struct work_struct *work)
>>> +{
>>> + struct virtnet_info *vi =
>>> + container_of(work, struct virtnet_info, reset_task);
>>> + struct net_device *dev = vi->dev;
>>> + struct virtio_device *vdev = vi->vdev;
>>> + int err, i;
>>> +
>>> + flush_work(&vi->config_work);
>>> +
>>> + netif_device_detach(vi->dev);
>>> + cancel_delayed_work_sync(&vi->refill);
>>> +
>>> + if (netif_running(vi->dev)) {
>>>

Re: [PATCH RFC] 2/2 huawei_cdc_ncm: introduce new TX ncm stack

2015-06-25 Thread Oliver Neukum
On Tue, 2015-06-23 at 00:32 +0200, Enrico Mioso wrote:
> This patch introduces a new NCM tx engine, able to operate in standard-
> and huawei-style mode.
> In the first case, the NDP is disposed after the initial headers and
> before any datagram.
> 
> What works:
> - is able to communicate with compliant NCM devices:
>   I tested this with a board running the Linux g_ncm gadget driver.
> 
> What doesn't work:
> - After some packets I start getting LOTS of EVENT_RX_MEMORY from usbnet,
>   which fails to allocate an RX SKB in rx_submit(). Don't understand why,
>   any suggestion would be very welcome.
> 
> The tx_fixup function given here, even if actually working, should be
> considered as an example: the NCM manager is used here simulating the
> cdc_ncm.c behaviour.
> 
> Signed-off-by: Enrico Mioso 
> ---
>  drivers/net/usb/huawei_cdc_ncm.c | 187 
> ++-
>  1 file changed, 185 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/usb/huawei_cdc_ncm.c b/drivers/net/usb/huawei_cdc_ncm.c
> index 735f7da..217802a 100644
> --- a/drivers/net/usb/huawei_cdc_ncm.c
> +++ b/drivers/net/usb/huawei_cdc_ncm.c
> @@ -29,6 +29,35 @@
>  #include 
>  #include 
>  
> +/* NCM management operations: */
> +
> +/* NCM_INIT_FRAME: prepare for a new frame.
> + * NTH16 header is written to output SKB, NDP data is reset and last
> + * committed NDP pointer set to NULL.
> + * Now, data may be added to this NCM package.
> + */
> +#define NCM_INIT_FRAME   1
> +
> +/* NCM_UPDATE_NDP: adds data to an NDP structure, hence updating it.
> + * Some checks are performed to be sure data fits in, respecting device and
> + * spec constraints.
> + * Normally the NDP is kept in memory and committed to the SKB only when
> + * requested. However, calling this "method" after NCM_COMMIT_NDP, causes it 
> to
> + * work directly on the already committed SKB copy. this allows for 
> flexibility
> + * in frame ordering.
> + */
> +#define NCM_UPDATE_NDP   2
> +
> +/* Write NDP: commits NDP to output SKB.
> + * This method should be called only once per frame.
> + */
> +#define NCM_COMMIT_NDP   3
> +
> +/* Finalizes NTH16 header: to be called when working in
> + * update-already-committed mode.
> + */
> +#define NCM_FINALIZE_NTH 5
> +
>  /* Driver data */
>  struct huawei_cdc_ncm_state {
>   struct cdc_ncm_ctx *ctx;
> @@ -36,6 +65,16 @@ struct huawei_cdc_ncm_state {
>   struct usb_driver *subdriver;
>   struct usb_interface *control;
>   struct usb_interface *data;
> +
> + /* Keeps track of where data starts and ends in SKBs. */
> + int data_start;
> + int data_len;
> +
> + /* Last committed NDP for post-commit operations. */
> + struct usb_cdc_ncm_ndp16 *skb_ndp;
> +
> + /* Non-committed NDP */
> + struct usb_cdc_ncm_ndp16 *ndp;
>  };
>  
>  static int huawei_cdc_ncm_manage_power(struct usbnet *usbnet_dev, int on)
> @@ -53,6 +92,149 @@ static int huawei_cdc_ncm_manage_power(struct usbnet 
> *usbnet_dev, int on)
>   return 0;
>  }
>  
> +/* huawei_ncm_mgmt: flexible TX NCM manager.
> + *
> + * Once a non-zero status value is returned, current frame should be discarded
> + * and operations restarted from scratch.
> + */

Is there any advantage in keeping this in a single function?

> +int
> +huawei_ncm_mgmt(struct usbnet *dev,
> + struct huawei_cdc_ncm_state *drvstate, struct sk_buff *skb_out, 
> int mode) {
> + struct usb_cdc_ncm_nth16 *nth16 = (struct usb_cdc_ncm_nth16 
> *)skb_out->data;
> + struct cdc_ncm_ctx *ctx = drvstate->ctx;
> + struct usb_cdc_ncm_ndp16 *ndp16 = NULL;
> + int ret = -EINVAL;
> + u16 ndplen, index;
> +
> + switch (mode) {
> + case NCM_INIT_FRAME:
> +
> + /* Write a new NTH16 header */
> + nth16 = (struct usb_cdc_ncm_nth16 *)memset(skb_put(skb_out, 
> sizeof(struct usb_cdc_ncm_nth16)), 0, sizeof(struct usb_cdc_ncm_nth16));
> + if (!nth16) {
> + ret = -EINVAL;
> + goto error;
> + }
> +
> + /* NTH16 signature and header length are known a-priori. */
> + nth16->dwSignature = cpu_to_le32(USB_CDC_NCM_NTH16_SIGN);
> + nth16->wHeaderLength = cpu_to_le16(sizeof(struct 
> usb_cdc_ncm_nth16));
> +
> + /* TX sequence numbering */
> + nth16->wSequence = cpu_to_le16(ctx->tx_seq++);
> +
> + /* Forget about previous SKB NDP */
> + drvstate->skb_ndp = NULL;

This is probably better done after you know you cannot fail.

> +
> + /* Allocate a new NDP */
> + ndp16 = kzalloc(ctx->max_ndp_size, GFP_NOIO);

Where is this freed?

> + if (!ndp16)
> + return ret;
> +
> + /* Prepare a new NDP to add data on subsequent calls. */
> + drvstate->ndp = memset(ndp16, 0, ctx->max_ndp_size);

Either kzalloc() or memset(). Using both never ma

Re: [PATCH RFC] 2/2 huawei_cdc_ncm: introduce new TX ncm stack

2015-06-25 Thread Oliver Neukum
On Tue, 2015-06-23 at 00:32 +0200, Enrico Mioso wrote:

> +/* XXX rewrite, not multipacket */

Can you explain what you want to do here?
> +struct sk_buff *
> +huawei_ncm_tx_fixup(struct usbnet *dev, struct sk_buff *skb_in, gfp_t flags) 
> {
> + struct huawei_cdc_ncm_state *drvstate = (void *)&dev->data;
> + struct cdc_ncm_ctx *ctx = drvstate->ctx;
> + struct sk_buff *skb_out;
> + int status;
> +
> + skb_out = alloc_skb(ctx->tx_max, GFP_ATOMIC);

You must test this for NULL

Regards
Oliver



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next v2] enic: use atomic_t instead of spin_lock in busy poll

2015-06-25 Thread Govindarajulu Varadarajan
We use spinlock to access a single flag. We can avoid spin_locks by using
atomic variable and atomic_cmpxchg(). Use atomic_cmpxchg to set the flag
for idle to poll. And a simple atomic_set to unlock (set idle from poll).

In napi poll, if gro is enabled, we call napi_gro_receive() to deliver the
packets. Before we call napi_complete(), i.e while re-polling, if low
latency busy poll is called, we use netif_receive_skb() to deliver the packets.
At this point if there are some skb's held in GRO, busy poll could deliver the
packets out of order. So we call napi_gro_flush() to flush skbs before we
move the napi poll to idle.

Signed-off-by: Govindarajulu Varadarajan <_gov...@gmx.com>
---
v2: Add more details about why gro flush is required while unlocking napi poll.
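
For readers following along, the locking scheme described above can be
sketched in userspace C11. This is only an illustration under stated
assumptions: the struct, the function names and the use of <stdatomic.h>
are hypothetical stand-ins, not the driver's code (the driver uses
atomic_t, atomic_cmpxchg() and atomic_set()). The gro flush that the
patch performs before unlocking is not modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the three states in the patch. */
enum poll_state { POLL_STATE_IDLE, POLL_STATE_NAPI, POLL_STATE_POLL };

struct rq_sketch {
	atomic_int bpoll_state;
};

static void rq_init(struct rq_sketch *rq)
{
	atomic_store(&rq->bpoll_state, POLL_STATE_IDLE);
}

/* Try to move IDLE -> owner. Succeeds only if nobody owns the queue. */
static bool rq_try_lock(struct rq_sketch *rq, enum poll_state owner)
{
	int expected = POLL_STATE_IDLE;

	return atomic_compare_exchange_strong(&rq->bpoll_state, &expected,
					      owner);
}

/* Release ownership with a plain store back to IDLE. */
static void rq_unlock(struct rq_sketch *rq)
{
	/* Illustrative sanity check only: unlock without lock is a bug. */
	assert(atomic_load(&rq->bpoll_state) != POLL_STATE_IDLE);
	atomic_store(&rq->bpoll_state, POLL_STATE_IDLE);
}
```

The point of the sketch is that a failed compare-and-exchange simply
reports "someone else owns the queue", so no yield bookkeeping is needed.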

 drivers/net/ethernet/cisco/enic/enic_main.c |  4 +-
 drivers/net/ethernet/cisco/enic/vnic_rq.h   | 91 +
 2 files changed, 29 insertions(+), 66 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c 
b/drivers/net/ethernet/cisco/enic/enic_main.c
index eadae1b..da2004e 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1208,7 +1208,7 @@ static int enic_poll(struct napi_struct *napi, int budget)
napi_complete(napi);
vnic_intr_unmask(&enic->intr[intr]);
}
-   enic_poll_unlock_napi(&enic->rq[cq_rq]);
+   enic_poll_unlock_napi(&enic->rq[cq_rq], napi);
 
return rq_work_done;
 }
@@ -1414,7 +1414,7 @@ static int enic_poll_msix_rq(struct napi_struct *napi, 
int budget)
 */
enic_calc_int_moderation(enic, &enic->rq[rq]);
 
-   enic_poll_unlock_napi(&enic->rq[rq]);
+   enic_poll_unlock_napi(&enic->rq[rq], napi);
if (work_done < work_to_do) {
 
/* Some work done, but not enough to stay in polling,
diff --git a/drivers/net/ethernet/cisco/enic/vnic_rq.h 
b/drivers/net/ethernet/cisco/enic/vnic_rq.h
index 8111d52..b9c82f1 100644
--- a/drivers/net/ethernet/cisco/enic/vnic_rq.h
+++ b/drivers/net/ethernet/cisco/enic/vnic_rq.h
@@ -21,6 +21,7 @@
 #define _VNIC_RQ_H_
 
 #include 
+#include 
 
 #include "vnic_dev.h"
 #include "vnic_cq.h"
@@ -75,6 +76,12 @@ struct vnic_rq_buf {
uint64_t wr_id;
 };
 
+enum enic_poll_state {
+   ENIC_POLL_STATE_IDLE,
+   ENIC_POLL_STATE_NAPI,
+   ENIC_POLL_STATE_POLL
+};
+
 struct vnic_rq {
unsigned int index;
struct vnic_dev *vdev;
@@ -86,19 +93,7 @@ struct vnic_rq {
void *os_buf_head;
unsigned int pkts_outstanding;
 #ifdef CONFIG_NET_RX_BUSY_POLL
-#define ENIC_POLL_STATE_IDLE   0
-#define ENIC_POLL_STATE_NAPI   (1 << 0) /* NAPI owns this poll */
-#define ENIC_POLL_STATE_POLL   (1 << 1) /* poll owns this poll */
-#define ENIC_POLL_STATE_NAPI_YIELD (1 << 2) /* NAPI yielded this poll */
-#define ENIC_POLL_STATE_POLL_YIELD (1 << 3) /* poll yielded this poll */
-#define ENIC_POLL_YIELD(ENIC_POLL_STATE_NAPI_YIELD |   
\
-ENIC_POLL_STATE_POLL_YIELD)
-#define ENIC_POLL_LOCKED   (ENIC_POLL_STATE_NAPI | \
-ENIC_POLL_STATE_POLL)
-#define ENIC_POLL_USER_PEND(ENIC_POLL_STATE_POLL | \
-ENIC_POLL_STATE_POLL_YIELD)
-   unsigned int bpoll_state;
-   spinlock_t bpoll_lock;
+   atomic_t bpoll_state;
 #endif /* CONFIG_NET_RX_BUSY_POLL */
 };
 
@@ -215,76 +210,43 @@ static inline int vnic_rq_fill(struct vnic_rq *rq,
 #ifdef CONFIG_NET_RX_BUSY_POLL
 static inline void enic_busy_poll_init_lock(struct vnic_rq *rq)
 {
-   spin_lock_init(&rq->bpoll_lock);
-   rq->bpoll_state = ENIC_POLL_STATE_IDLE;
+   atomic_set(&rq->bpoll_state, ENIC_POLL_STATE_IDLE);
 }
 
 static inline bool enic_poll_lock_napi(struct vnic_rq *rq)
 {
-   bool rc = true;
-
-   spin_lock(&rq->bpoll_lock);
-   if (rq->bpoll_state & ENIC_POLL_LOCKED) {
-   WARN_ON(rq->bpoll_state & ENIC_POLL_STATE_NAPI);
-   rq->bpoll_state |= ENIC_POLL_STATE_NAPI_YIELD;
-   rc = false;
-   } else {
-   rq->bpoll_state = ENIC_POLL_STATE_NAPI;
-   }
-   spin_unlock(&rq->bpoll_lock);
+   int rc = atomic_cmpxchg(&rq->bpoll_state, ENIC_POLL_STATE_IDLE,
+   ENIC_POLL_STATE_NAPI);
 
-   return rc;
+   return (rc == ENIC_POLL_STATE_IDLE);
 }
 
-static inline bool enic_poll_unlock_napi(struct vnic_rq *rq)
+static inline void enic_poll_unlock_napi(struct vnic_rq *rq,
+struct napi_struct *napi)
 {
-   bool rc = false;
-
-   spin_lock(&rq->bpoll_lock);
-   WARN_ON(rq->bpoll_state &
-   (ENIC_POLL_STATE_POLL | ENIC_POLL_STATE_NAPI_YIELD));
-   if (rq->bpoll_state & ENIC_POLL_STATE_POLL_YIELD)
-   rc = true;
-   rq->bpoll_state = 

[net-next PATCH 1/1] net: sched: flower fix typo

2015-06-25 Thread Jamal Hadi Salim
From: Jamal Hadi Salim 

Fix typo in the validation rules for flower's attributes

Signed-off-by: Jamal Hadi Salim 
---
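
A small standalone sketch of why the typo matters, for illustration
only. The enum values, struct and helper below are hypothetical
miniatures, not the kernel's definitions. Repeating the TCP designators
just re-initialises the same two slots, so the net effect is the same as
never listing the UDP attributes: their slots stay zeroed, which, as far
as I understand nla_policy, means no type is declared for them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the attribute space. */
enum {
	KEY_UNSPEC,
	KEY_TCP_SRC,
	KEY_TCP_DST,
	KEY_UDP_SRC,
	KEY_UDP_DST,
	KEY_MAX = KEY_UDP_DST
};

enum { TYPE_UNSPEC, TYPE_U16 };

struct policy_sketch {
	int type;
};

/* Net effect of the table before the fix: only the TCP slots are set. */
static const struct policy_sketch policy_before[KEY_MAX + 1] = {
	[KEY_TCP_SRC] = { .type = TYPE_U16 },
	[KEY_TCP_DST] = { .type = TYPE_U16 },
};

/* After the fix every port attribute has a declared type. */
static const struct policy_sketch policy_after[KEY_MAX + 1] = {
	[KEY_TCP_SRC] = { .type = TYPE_U16 },
	[KEY_TCP_DST] = { .type = TYPE_U16 },
	[KEY_UDP_SRC] = { .type = TYPE_U16 },
	[KEY_UDP_DST] = { .type = TYPE_U16 },
};

/* Count attributes (slot 0 excluded) that have no declared type. */
static size_t count_untyped(const struct policy_sketch *p)
{
	size_t n = 0;

	assert(p != NULL);
	for (int i = 1; i <= KEY_MAX; i++)
		if (p[i].type == TYPE_UNSPEC)
			n++;
	return n;
}
```

With these tables, count_untyped() reports two untyped attributes before
the fix and none after it.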
 net/sched/cls_flower.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index b92d3f4..9d37ccd 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -216,8 +216,8 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 
1] = {
[TCA_FLOWER_KEY_IPV6_DST_MASK]  = { .len = sizeof(struct in6_addr) },
[TCA_FLOWER_KEY_TCP_SRC]= { .type = NLA_U16 },
[TCA_FLOWER_KEY_TCP_DST]= { .type = NLA_U16 },
-   [TCA_FLOWER_KEY_TCP_SRC]= { .type = NLA_U16 },
-   [TCA_FLOWER_KEY_TCP_DST]= { .type = NLA_U16 },
+   [TCA_FLOWER_KEY_UDP_SRC]= { .type = NLA_U16 },
+   [TCA_FLOWER_KEY_UDP_DST]= { .type = NLA_U16 },
 };
 
 static void fl_set_key_val(struct nlattr **tb,
-- 
1.7.9.5



netdev broken?

2015-06-25 Thread Jamal Hadi Salim


Trying to catch up with email and I am noticing my last
received email was on the 21st. Anyone else having problems
(feel like I am asking the question "if you can't hear me
please raise your hand" ;->).

cheers,
jamal


Re: netdev broken?

2015-06-25 Thread Phil Sutter
On Thu, Jun 25, 2015 at 07:12:07AM -0400, Jamal Hadi Salim wrote:
> Trying to catch up with email and I am noticing my last
> received email was on the 21st. Anyone else having problems
> (feel like I am asking the question "if you can't hear me
> please raise your hand" ;->).

I received your mail. If you didn't, you probably won't receive mine
either. :)

Cheers, Phil


[EDT] [PATCH] ax88179_178a: add reset function in reset_resume

2015-06-25 Thread Vivek Kumar Bhagat

Hello David,

Without reset functionality in reset_resume, an iperf connection cannot be
established after suspend/resume, although ping works at the same time.

Adding the reset function to reset_resume solves the above bug. We have
verified it on ASIX-based ST Lab and Cadyce dongles.

As my email client converts tabs into spaces,
please find the patch attached herewith.

Thanks,
Vivek

0001-ax88179_178a-add-reset-function-in-reset_resume.patch
Description: Binary data


Re: [PATCH RFC] 2/2 huawei_cdc_ncm: introduce new TX ncm stack

2015-06-25 Thread Enrico Mioso

Hi Oliver.
Thank you for your patience and review. I appreciated it very much.

On Thu, 25 Jun 2015, Oliver Neukum wrote:


Date: Thu, 25 Jun 2015 11:49:29
From: Oliver Neukum 
To: Enrico Mioso 
Cc: linux-...@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH RFC] 2/2 huawei_cdc_ncm: introduce new TX ncm stack

On Tue, 2015-06-23 at 00:32 +0200, Enrico Mioso wrote:

This patch introduces a new NCM tx engine, able to operate in standard-
and huawei-style mode.
In the first case, the NDP is disposed after the initial headers and
before any datagram.

What works:
- is able to communicate with compliant NCM devices:
I tested this with a board running the Linux g_ncm gadget driver.

What doesn't work:
- After some packets I start getting LOTS of EVENT_RX_MEMORY from usbnet,
which fails to allocate an RX SKB in rx_submit(). Don't understand why,
any suggestion would be very welcome.

The tx_fixup function given here, even if actually working, should be
considered as an example: the NCM manager is used here simulating the
cdc_ncm.c behaviour.

Signed-off-by: Enrico Mioso 
---
 drivers/net/usb/huawei_cdc_ncm.c | 187 ++-
 1 file changed, 185 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/huawei_cdc_ncm.c b/drivers/net/usb/huawei_cdc_ncm.c
index 735f7da..217802a 100644
--- a/drivers/net/usb/huawei_cdc_ncm.c
+++ b/drivers/net/usb/huawei_cdc_ncm.c
@@ -29,6 +29,35 @@
 #include 
 #include 

+/* NCM management operations: */
+
+/* NCM_INIT_FRAME: prepare for a new frame.
+ * NTH16 header is written to output SKB, NDP data is reset and last
+ * committed NDP pointer set to NULL.
+ * Now, data may be added to this NCM package.
+ */
+#define NCM_INIT_FRAME 1
+
+/* NCM_UPDATE_NDP: adds data to an NDP structure, hence updating it.
+ * Some checks are performed to be sure data fits in, respecting device and
+ * spec constraints.
+ * Normally the NDP is kept in memory and committed to the SKB only when
+ * requested. However, calling this "method" after NCM_COMMIT_NDP, causes it to
+ * work directly on the already committed SKB copy. this allows for flexibility
+ * in frame ordering.
+ */
+#define NCM_UPDATE_NDP 2
+
+/* Write NDP: commits NDP to output SKB.
+ * This method should be called only once per frame.
+ */
+#define NCM_COMMIT_NDP 3
+
+/* Finalizes NTH16 header: to be called when working in
+ * update-already-committed mode.
+ */
+#define NCM_FINALIZE_NTH   5
+
 /* Driver data */
 struct huawei_cdc_ncm_state {
struct cdc_ncm_ctx *ctx;
@@ -36,6 +65,16 @@ struct huawei_cdc_ncm_state {
struct usb_driver *subdriver;
struct usb_interface *control;
struct usb_interface *data;
+
+   /* Keeps track of where data starts and ends in SKBs. */
+   int data_start;
+   int data_len;
+
+   /* Last committed NDP for post-commit operations. */
+   struct usb_cdc_ncm_ndp16 *skb_ndp;
+
+   /* Non-committed NDP */
+   struct usb_cdc_ncm_ndp16 *ndp;
 };

 static int huawei_cdc_ncm_manage_power(struct usbnet *usbnet_dev, int on)
@@ -53,6 +92,149 @@ static int huawei_cdc_ncm_manage_power(struct usbnet 
*usbnet_dev, int on)
return 0;
 }

+/* huawei_ncm_mgmt: flexible TX NCM manager.
+ *
+ * Once a non-zero status value is returned, current frame should be discarded
+ * and operations restarted from scratch.
+ */


Is there any advantage in keeping this in a single function?

I made this choice because I expect the tx_fixup function to become more
complex than it is now once it aggregates frames.
To keep things in one place, I am also answering your other message here: my
intention for the tx_fixup function is to:

- aggregate frames
- send them out when:
- a timer expires
OR
- we have enough data in the aggregate, and cannot add more.

This is something done in cdc_ncm.c for example.
But here I have a question: judging by the comment in
drivers/net/usb/rndis_host.c at line 572, there seem to be different opinions
on this matter.

What should I do?


+int
+huawei_ncm_mgmt(struct usbnet *dev,
+   struct huawei_cdc_ncm_state *drvstate, struct sk_buff *skb_out, 
int mode) {
+   struct usb_cdc_ncm_nth16 *nth16 = (struct usb_cdc_ncm_nth16 
*)skb_out->data;
+   struct cdc_ncm_ctx *ctx = drvstate->ctx;
+   struct usb_cdc_ncm_ndp16 *ndp16 = NULL;
+   int ret = -EINVAL;
+   u16 ndplen, index;
+
+   switch (mode) {
+   case NCM_INIT_FRAME:
+
+   /* Write a new NTH16 header */
+   nth16 = (struct usb_cdc_ncm_nth16 *)memset(skb_put(skb_out, 
sizeof(struct usb_cdc_ncm_nth16)), 0, sizeof(struct usb_cdc_ncm_nth16));
+   if (!nth16) {
+   ret = -EINVAL;
+   goto error;
+   }
+
+   /* NTH16 signature and header length are known a-priori. */
+   nth16->dwSignature = cpu_to_le32(USB_CDC_NCM_NTH16

Re: netdev broken?

2015-06-25 Thread David Miller
From: Jamal Hadi Salim 
Date: Thu, 25 Jun 2015 07:12:07 -0400

> Trying to catch up with email and I am noticing my last
> received email was on the 21st. Anyone else having problems
> (feel like I am asking the question "if you can't hear me
> please raise your hand" ;->).

I'm pretty sure you're just not subscribed to the list.


Re: [EDT] [PATCH] ax88179_178a: add reset function in reset_resume

2015-06-25 Thread David Miller
From: Vivek Kumar Bhagat 
Date: Thu, 25 Jun 2015 11:23:10 + (GMT)

> As my email client converts tabs into spaces,
> please find the patch attached herewith.

Sorry, this is not acceptable.

Please read Documentation/email-clients.txt for how to properly
set up your email client to not corrupt patches you send.

If you don't do this correctly, your patch doesn't get queued
properly into the patchwork database and therefore it's a lot
more work for me.  Also, non-inlined patches cannot be properly
quoted by people who review your work and want to provide
feedback.


Re: [PATCH net-next v2] enic: use atomic_t instead of spin_lock in busy poll

2015-06-25 Thread David Miller
From: Govindarajulu Varadarajan <_gov...@gmx.com>
Date: Thu, 25 Jun 2015 16:02:04 +0530

> We use spinlock to access a single flag. We can avoid spin_locks by using
> atomic variable and atomic_cmpxchg(). Use atomic_cmpxchg to set the flag
> for idle to poll. And a simple atomic_set to unlock (set idle from poll).
> 
> In napi poll, if gro is enabled, we call napi_gro_receive() to deliver the
> packets. Before we call napi_complete(), i.e while re-polling, if low
> latency busy poll is called, we use netif_receive_skb() to deliver the 
> packets.
> At this point if there are some skb's held in GRO, busy poll could deliver the
> packets out of order. So we call napi_gro_flush() to flush skbs before we
> move the napi poll to idle.
> 
> Signed-off-by: Govindarajulu Varadarajan <_gov...@gmx.com>

Applied, thanks.


Re: [net-next PATCH 1/1] net: sched: flower fix typo

2015-06-25 Thread David Miller
From: Jamal Hadi Salim 
Date: Thu, 25 Jun 2015 06:55:27 -0400

> From: Jamal Hadi Salim 
> 
> Fix typo in the validation rules for flower's attributes
> 
> Signed-off-by: Jamal Hadi Salim 

Oops, good catch, applied.


[PATCH net 4/9] bnx2x: Don't notify about scratchpad parities

2015-06-25 Thread Yuval Mintz
From: Manish Chopra 

The scratchpad is a shared block between all functions of a given device.
Due to HW limitations, we can't properly close its parity notifications
to all functions on legal flows.
E.g., it's possible that while taking a register dump from one function
a parity error would be triggered on other functions.

Today the driver doesn't consider this parity a 'real' parity unless it's
accompanied by additional indications [which would happen in a real
parity scenario]; but it does print notifications for such events in the
system logs.

This eliminates such prints - in case of real parities the driver would have
additional indications; but if this is the only signal, the user will not even
see a parity being logged in the system.

Signed-off-by: Manish Chopra 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 

---
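
The print decision described above can be sketched as a small
standalone predicate. This is an illustration under assumptions: the
bit values and the helper name are hypothetical, and the real masks are
the AEU_INPUTS_ATTN_BITS_* definitions used by the driver. The idea is
that a SCPAD parity on its own never justifies a log line, while any
other asserted parity bit does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for the group-3 parity sources. */
#define SET3_ROM	(1u << 0)
#define SET3_UMP_RX	(1u << 1)
#define SET3_UMP_TX	(1u << 2)
#define SET3_SCPAD	(1u << 3)

#define SET3_WITHOUT_SCPAD	(SET3_ROM | SET3_UMP_RX | SET3_UMP_TX)

/*
 * other_sets: true if any of the remaining signal groups has an asserted
 * parity bit. sig3: the group-3 signal word.
 * Returns true when the parity attention should be logged.
 */
static bool should_print_parity(bool other_sets, uint32_t sig3)
{
	/* The two masks must not overlap, or SCPAD would leak through. */
	assert((SET3_WITHOUT_SCPAD & SET3_SCPAD) == 0);

	return other_sets || (sig3 & SET3_WITHOUT_SCPAD) != 0;
}
```

A lone SET3_SCPAD yields false, while SET3_SCPAD together with any other
source yields true, which matches the behaviour the commit message asks
for.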
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h  | 11 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 20 ++--
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 1f82a04..8466c6c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -2408,10 +2408,13 @@ void bnx2x_igu_clear_sb_gen(struct bnx2x *bp, u8 func, 
u8 idu_sb_id,
 AEU_INPUTS_ATTN_BITS_IGU_PARITY_ERROR | \
 AEU_INPUTS_ATTN_BITS_MISC_PARITY_ERROR)
 
-#define HW_PRTY_ASSERT_SET_3 (AEU_INPUTS_ATTN_BITS_MCP_LATCHED_ROM_PARITY | \
-   AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_RX_PARITY | \
-   AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY | \
-   AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY)
+#define HW_PRTY_ASSERT_SET_3_WITHOUT_SCPAD \
+   (AEU_INPUTS_ATTN_BITS_MCP_LATCHED_ROM_PARITY | \
+AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_RX_PARITY | \
+AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY)
+
+#define HW_PRTY_ASSERT_SET_3 (HW_PRTY_ASSERT_SET_3_WITHOUT_SCPAD | \
+ AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY)
 
 #define HW_PRTY_ASSERT_SET_4 (AEU_INPUTS_ATTN_BITS_PGLUE_PARITY_ERROR | \
  AEU_INPUTS_ATTN_BITS_ATC_PARITY_ERROR)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index bf5f5df..9628595 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -4863,9 +4863,7 @@ static bool bnx2x_check_blocks_with_parity3(struct bnx2x 
*bp, u32 sig,
res = true;
break;
case AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY:
-   if (print)
-   _print_next_block((*par_num)++,
- "MCP SCPAD");
+   (*par_num)++;
/* clear latched SCPAD PATIRY from MCP */
REG_WR(bp, MISC_REG_AEU_CLR_LATCH_SIGNAL,
   1UL << 10);
@@ -4927,6 +4925,7 @@ static bool bnx2x_parity_attn(struct bnx2x *bp, bool 
*global, bool print,
(sig[3] & HW_PRTY_ASSERT_SET_3) ||
(sig[4] & HW_PRTY_ASSERT_SET_4)) {
int par_num = 0;
+
DP(NETIF_MSG_HW, "Was parity error: HW block parity 
attention:\n"
 "[0]:0x%08x [1]:0x%08x [2]:0x%08x [3]:0x%08x 
[4]:0x%08x\n",
  sig[0] & HW_PRTY_ASSERT_SET_0,
@@ -4934,9 +4933,18 @@ static bool bnx2x_parity_attn(struct bnx2x *bp, bool 
*global, bool print,
  sig[2] & HW_PRTY_ASSERT_SET_2,
  sig[3] & HW_PRTY_ASSERT_SET_3,
  sig[4] & HW_PRTY_ASSERT_SET_4);
-   if (print)
-   netdev_err(bp->dev,
-  "Parity errors detected in blocks: ");
+   if (print) {
+   if (((sig[0] & HW_PRTY_ASSERT_SET_0) ||
+(sig[1] & HW_PRTY_ASSERT_SET_1) ||
+(sig[2] & HW_PRTY_ASSERT_SET_2) ||
+(sig[4] & HW_PRTY_ASSERT_SET_4)) ||
+(sig[3] & HW_PRTY_ASSERT_SET_3_WITHOUT_SCPAD)) {
+   netdev_err(bp->dev,
+  "Parity errors detected in blocks: 
");
+   } else {
+   print = false;
+   }
+   }
res |= bnx2x_check_blocks_with_parity0(bp,
sig[0] & HW_PRTY_ASSERT_SET_0, &par_num, print);
res |= bnx2x_check_blocks_with_parity1(bp,
-- 
1.9.3


[PATCH net 2/9] bnx2x: Correct speed from baseT into KR.

2015-06-25 Thread Yuval Mintz
ethtool incorrectly shows KR supported/advertised speeds as baseT
in cases where the board is in fact KR-based.

Signed-off-by: Yaniv Rosner 
Signed-off-by: Yuval Mintz 
---
 .../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c| 55 --
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c   | 10 ++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c   | 13 +
 3 files changed, 59 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index 48ed005..733b0fc 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -257,14 +257,15 @@ static int bnx2x_get_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
 {
struct bnx2x *bp = netdev_priv(dev);
int cfg_idx = bnx2x_get_link_cfg_idx(bp);
+   u32 media_type;
 
/* Dual Media boards present all available port types */
cmd->supported = bp->port.supported[cfg_idx] |
(bp->port.supported[cfg_idx ^ 1] &
 (SUPPORTED_TP | SUPPORTED_FIBRE));
cmd->advertising = bp->port.advertising[cfg_idx];
-   if (bp->link_params.phy[bnx2x_get_cur_phy_idx(bp)].media_type ==
-   ETH_PHY_SFP_1G_FIBER) {
+   media_type = bp->link_params.phy[bnx2x_get_cur_phy_idx(bp)].media_type;
+   if (media_type == ETH_PHY_SFP_1G_FIBER) {
cmd->supported &= ~(SUPPORTED_10000baseT_Full);
cmd->advertising &= ~(ADVERTISED_10000baseT_Full);
}
@@ -312,12 +313,26 @@ static int bnx2x_get_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
cmd->lp_advertising |= ADVERTISED_100baseT_Full;
if (status & LINK_STATUS_LINK_PARTNER_1000THD_CAPABLE)
cmd->lp_advertising |= ADVERTISED_1000baseT_Half;
-   if (status & LINK_STATUS_LINK_PARTNER_1000TFD_CAPABLE)
-   cmd->lp_advertising |= ADVERTISED_1000baseT_Full;
+   if (status & LINK_STATUS_LINK_PARTNER_1000TFD_CAPABLE) {
+   if (media_type == ETH_PHY_KR) {
+   cmd->lp_advertising |=
+   ADVERTISED_1000baseKX_Full;
+   } else {
+   cmd->lp_advertising |=
+   ADVERTISED_1000baseT_Full;
+   }
+   }
if (status & LINK_STATUS_LINK_PARTNER_2500XFD_CAPABLE)
cmd->lp_advertising |= ADVERTISED_2500baseX_Full;
-   if (status & LINK_STATUS_LINK_PARTNER_10GXFD_CAPABLE)
-   cmd->lp_advertising |= ADVERTISED_10000baseT_Full;
+   if (status & LINK_STATUS_LINK_PARTNER_10GXFD_CAPABLE) {
+   if (media_type == ETH_PHY_KR) {
+   cmd->lp_advertising |=
+   ADVERTISED_10000baseKR_Full;
+   } else {
+   cmd->lp_advertising |=
+   ADVERTISED_10000baseT_Full;
+   }
+   }
if (status & LINK_STATUS_LINK_PARTNER_20GXFD_CAPABLE)
cmd->lp_advertising |= ADVERTISED_20000baseKR2_Full;
}
@@ -564,15 +579,20 @@ static int bnx2x_set_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
return -EINVAL;
}
 
-   if (!(bp->port.supported[cfg_idx] &
- SUPPORTED_1000baseT_Full)) {
+   if (bp->port.supported[cfg_idx] &
+SUPPORTED_1000baseT_Full) {
+   advertising = (ADVERTISED_1000baseT_Full |
+  ADVERTISED_TP);
+
+   } else if (bp->port.supported[cfg_idx] &
+  SUPPORTED_1000baseKX_Full) {
+   advertising = ADVERTISED_1000baseKX_Full;
+   } else {
DP(BNX2X_MSG_ETHTOOL,
   "1G full not supported\n");
return -EINVAL;
}
 
-   advertising = (ADVERTISED_1000baseT_Full |
-  ADVERTISED_TP);
break;
 
case SPEED_2500:
@@ -600,17 +620,22 @@ static int bnx2x_set_settings(struct net_device *dev, 
struct ethtool_cmd *cmd)
return -EINVAL;
}
phy_idx = bnx2x_get_cur_phy_idx(bp);
-   if (!(bp->port.supported[cfg_idx]
-   & SUPPORTED_10000baseT_Full) ||
-   (bp->link_params.phy[phy_idx].media_type ==
+   

[PATCH net 1/9] bnx2x: Correct asymmetric flow-control

2015-06-25 Thread Yuval Mintz
This fixes several issues relating to asymmetric configuration:
 1. When user requests to disable TX, the local-device needs to
advertise both PAUSE and ASM_DIR, but to avoid transmitting pause
frames. In the 578xx, it would ignore the TX disable.

 2. When user advertises RX-only, ASM_DIR was advertised instead of
PAUSE/ASM_DIR.

 3. When changing mode, the advertised PAUSE/ASM_DIR was not cleared
before setting new one, so disabling RX or TX had no impact on the
'advertised' as appeared in the 'ethtool -a' output.

Signed-off-by: Yaniv Rosner 
Signed-off-by: Yuval Mintz 
---
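
To make the resolution change easier to follow, here is a standalone
sketch of the pause-resolve logic as it reads in the hunks of this
patch. It is an illustration only: the enum and function names are
hypothetical, and only the case values visible in the quoted hunks are
included, so the driver may well handle additional values that this
sketch sends to the default branch. The 4-bit input is laid out as
local ASYM, local PAUSE, partner ASYM, partner PAUSE.

```c
#include <assert.h>

/* Hypothetical stand-ins for the BNX2X_FLOW_CTRL_* values. */
enum fc { FC_NONE, FC_TX, FC_RX, FC_BOTH };

/*
 * req_fc_auto_adv: what the user asked to advertise.
 * pause_result: LD ASYM, LD PAUSE, LP ASYM, LP PAUSE (bit 3 .. bit 0).
 */
static enum fc pause_resolve_sketch(enum fc req_fc_auto_adv,
				    unsigned int pause_result)
{
	assert(pause_result <= 0xf);

	switch (pause_result) {
	case 0xb:	/* 1 0 1 1 */
		return FC_TX;
	case 0xe:	/* 1 1 1 0 */
		return FC_RX;
	case 0x7:	/* 0 1 1 1 */
	case 0xd:	/* 1 1 0 1 */
	case 0xf:	/* 1 1 1 1 */
		/* RX-only requests advertise both but must enable RX only. */
		return req_fc_auto_adv == FC_BOTH ? FC_BOTH : FC_RX;
	default:
		return FC_NONE;
	}
}
```

For example, a user who requested RX only and whose partner advertises
everything (0xf) ends up with FC_RX rather than FC_BOTH.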
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c | 33 +++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 10 +++
 2 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index 21a0d6a..b287fc8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -3392,9 +3392,9 @@ static void bnx2x_calc_ieee_aneg_adv(struct bnx2x_phy 
*phy,
case BNX2X_FLOW_CTRL_AUTO:
switch (params->req_fc_auto_adv) {
case BNX2X_FLOW_CTRL_BOTH:
+   case BNX2X_FLOW_CTRL_RX:
*ieee_fc |= MDIO_COMBO_IEEE0_AUTO_NEG_ADV_PAUSE_BOTH;
break;
-   case BNX2X_FLOW_CTRL_RX:
case BNX2X_FLOW_CTRL_TX:
*ieee_fc |=
MDIO_COMBO_IEEE0_AUTO_NEG_ADV_PAUSE_ASYMMETRIC;
@@ -3488,14 +3488,21 @@ static void bnx2x_ext_phy_set_pause(struct link_params 
*params,
bnx2x_cl45_write(bp, phy, MDIO_AN_DEVAD, MDIO_AN_REG_ADV_PAUSE, val);
 }
 
-static void bnx2x_pause_resolve(struct link_vars *vars, u32 pause_result)
-{  /*  LD  LP   */
+static void bnx2x_pause_resolve(struct bnx2x_phy *phy,
+   struct link_params *params,
+   struct link_vars *vars,
+   u32 pause_result)
+{
+   struct bnx2x *bp = params->bp;
+   /*  LD  LP   */
switch (pause_result) { /* ASYM P ASYM P */
case 0xb:   /*   1  0   1  1 */
+   DP(NETIF_MSG_LINK, "Flow Control: TX only\n");
vars->flow_ctrl = BNX2X_FLOW_CTRL_TX;
break;
 
case 0xe:   /*   1  1   1  0 */
+   DP(NETIF_MSG_LINK, "Flow Control: RX only\n");
vars->flow_ctrl = BNX2X_FLOW_CTRL_RX;
break;
 
@@ -3503,10 +3510,22 @@ static void bnx2x_pause_resolve(struct link_vars *vars, 
u32 pause_result)
case 0x7:   /*   0  1   1  1 */
case 0xd:   /*   1  1   0  1 */
case 0xf:   /*   1  1   1  1 */
-   vars->flow_ctrl = BNX2X_FLOW_CTRL_BOTH;
+   /* If the user selected to advertise RX ONLY,
+* although we advertised both, need to enable
+* RX only.
+*/
+   if (params->req_fc_auto_adv == BNX2X_FLOW_CTRL_BOTH) {
+   DP(NETIF_MSG_LINK, "Flow Control: RX & TX\n");
+   vars->flow_ctrl = BNX2X_FLOW_CTRL_BOTH;
+   } else {
+   DP(NETIF_MSG_LINK, "Flow Control: RX only\n");
+   vars->flow_ctrl = BNX2X_FLOW_CTRL_RX;
+   }
break;
 
default:
+   DP(NETIF_MSG_LINK, "Flow Control: None\n");
+   vars->flow_ctrl = BNX2X_FLOW_CTRL_NONE;
break;
}
if (pause_result & (1<<0))
@@ -3567,7 +3586,7 @@ static void bnx2x_ext_phy_update_adv_fc(struct bnx2x_phy 
*phy,
pause_result |= (lp_pause &
 MDIO_AN_REG_ADV_PAUSE_MASK) >> 10;
DP(NETIF_MSG_LINK, "Ext PHY pause result 0x%x\n", pause_result);
-   bnx2x_pause_resolve(vars, pause_result);
+   bnx2x_pause_resolve(phy, params, vars, pause_result);
 
 }
 
@@ -5396,7 +5415,7 @@ static void bnx2x_update_adv_fc(struct bnx2x_phy *phy,
 MDIO_COMBO_IEEE0_AUTO_NEG_ADV_PAUSE_MASK)>>7;
DP(NETIF_MSG_LINK, "pause_result CL37 0x%x\n", pause_result);
}
-   bnx2x_pause_resolve(vars, pause_result);
+   bnx2x_pause_resolve(phy, params, vars, pause_result);
 
 }
 
@@ -7129,7 +7148,7 @@ static void bnx2x_8073_resolve_fc(struct bnx2x_phy *phy,
pause_result |= (lp_pause &
 MDIO_COMBO_IEEE0_AUTO_NEG_ADV_PAUSE_BOTH) >> 7;
 
-   bnx2x_pause_resolve(vars, pause_result);
+   bnx2x_pause_resolve(phy, params, vars, pause_result);
DP(NETIF_MSG_LINK, "Ext PH

[PATCH net 0/9] bnx2x: various fixes

2015-06-25 Thread Yuval Mintz
This patch series contains several small fixes [with the possible
exception of the first 2 link fixes] for various driver flows.

Dave,

Sorry for the backlog - looks like at least some of these could have been
sent some time ago on shorter series.
Please consider applying this series to `net'.

Thanks,
Yuval
-- 
1.8.3.1



[PATCH net 3/9] bnx2x: Prevent false warning when accessing MACs

2015-06-25 Thread Yuval Mintz
Each time a flow finishes reading from the classification shadow
configuration in the driver, that flow checks for pending commands
and passes them to FW if possible.
If there is already a command awaiting completion, i.e., a ramrod
that has been sent to the FW and is yet to be completed, while said flow
tries to configure the pending command, we would get a false error message
in the logs [and a panic if SOE was used for driver compilation], since the
command could not have been completed.

This prevents said print [and panic]; the pending command will be sent
when the completion of the currently sent command arrives.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
index 07cdf9b..4ad415a 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
@@ -424,7 +424,7 @@ static void __bnx2x_vlan_mac_h_exec_pending(struct bnx2x *bp,
o->head_exe_request = false;
o->saved_ramrod_flags = 0;
rc = bnx2x_exe_queue_step(bp, &o->exe_queue, &ramrod_flags);
-   if (rc != 0) {
+   if ((rc != 0) && (rc != 1)) {
BNX2X_ERR("execution of pending commands failed with rc %d\n",
  rc);
 #ifdef BNX2X_STOP_ON_ERROR
-- 
1.9.3
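
The check changed above can be pictured with a minimal sketch. It assumes
the queue step helper uses a three-way return convention (negative for a
real failure, 0 when done, 1 when an earlier command is still awaiting
completion). The names STEP_DONE, STEP_PENDING and step_result_is_error()
are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>

/* Hypothetical return convention for a queue "step" helper:
 *   < 0  real failure
 *     0  all queued commands were handed to the firmware
 *     1  an earlier command is still awaiting completion; retry later
 */
enum { STEP_DONE = 0, STEP_PENDING = 1 };

/* Only values outside {STEP_DONE, STEP_PENDING} deserve an error print. */
static bool step_result_is_error(int rc)
{
	return rc != STEP_DONE && rc != STEP_PENDING;
}
```

Under that assumption, a "pending" result is simply left for the next
completion to pick up instead of being reported as a failure.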



[PATCH net 9/9] bnx2x: Fix linearization for encapsulated packets

2015-06-25 Thread Yuval Mintz
Due to FW constraints, driver must make sure that transmitted SKBs will
not be too fragmented, or in the case that they are - that each 'window'
of fragments passed to the FW would contain at least an mss worth of data.

For encapsulated packets the calculation is wrong, since it ignores the
inner headers when computing the headers' length.
This could lead to a FW assertion in case of a too-fragmented encapsulated
packet.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index ec56a9b..d336356 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3400,8 +3400,13 @@ static int bnx2x_pkt_req_lin(struct bnx2x *bp, struct sk_buff *skb,
u32 wnd_sum = 0;
 
/* Headers length */
-   hlen = (int)(skb_transport_header(skb) - skb->data) +
-   tcp_hdrlen(skb);
+   if (xmit_type & XMIT_GSO_ENC)
+   hlen = (int)(skb_inner_transport_header(skb) -
+skb->data) +
+inner_tcp_hdrlen(skb);
+   else
+   hlen = (int)(skb_transport_header(skb) -
+skb->data) + tcp_hdrlen(skb);
 
/* Amount of data (w/o headers) on linear part of SKB*/
first_bd_sz = skb_headlen(skb) - hlen;
-- 
1.9.3
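
As a rough illustration of the header-length computation described above,
here is a self-contained sketch. The struct pkt_layout type and the
headers_len() helper are hypothetical stand-ins for the offsets an sk_buff
records; they are not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the offsets an sk_buff records. */
struct pkt_layout {
	size_t transport_off;        /* offset of the outer L4 header */
	size_t tcp_hdr_len;          /* outer TCP header length */
	size_t inner_transport_off;  /* offset of the inner TCP header */
	size_t inner_tcp_hdr_len;    /* inner TCP header length */
};

/* Number of header bytes that precede the TCP payload. */
static size_t headers_len(const struct pkt_layout *p, bool encapsulated)
{
	if (encapsulated)
		return p->inner_transport_off + p->inner_tcp_hdr_len;
	return p->transport_off + p->tcp_hdr_len;
}
```

For an encapsulated packet the inner offsets must be used; otherwise the
tunnel and inner L2/L3 headers would be counted as payload.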



[PATCH net 5/9] bnx2x: Fix VF MAC removal

2015-06-25 Thread Yuval Mintz
From: Shahed Shaikh 

There's a bug in today's driver where VF requests to add/remove MAC filters
always reach the Hypervisor as add requests.
This prevents the VF from changing its MAC address, as it cannot remove the
previously configured MAC and runs out of MAC credits.

Signed-off-by: Shahed Shaikh 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 9628595..c1033a5 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -8435,7 +8435,7 @@ int bnx2x_set_eth_mac(struct bnx2x *bp, bool set)
 BNX2X_ETH_MAC, &ramrod_flags);
} else { /* vf */
return bnx2x_vfpf_config_mac(bp, bp->dev->dev_addr,
-bp->fp->index, true);
+bp->fp->index, set);
}
 }
 
-- 
1.9.3



[PATCH net 6/9] bnx2x: Fix self-test for 20g devices

2015-06-25 Thread Yuval Mintz
20g-capable devices are not configured properly for self-test; they use
10g as their speed, which causes the link indication to remain down and
the internal loopback test to fail.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index c1033a5..3df03bb 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -2347,12 +2347,16 @@ int bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode)
if (load_mode == LOAD_DIAG) {
struct link_params *lp = &bp->link_params;
lp->loopback_mode = LOOPBACK_XGXS;
-   /* do PHY loopback at 10G speed, if possible */
-   if (lp->req_line_speed[cfx_idx] < SPEED_10000) {
+   /* Prefer doing PHY loopback at highest speed */
+   if (lp->req_line_speed[cfx_idx] < SPEED_20000) {
if (lp->speed_cap_mask[cfx_idx] &
-   PORT_HW_CFG_SPEED_CAPABILITY_D0_10G)
+   PORT_HW_CFG_SPEED_CAPABILITY_D0_20G)
lp->req_line_speed[cfx_idx] =
-   SPEED_10000;
+   SPEED_20000;
+   else if (lp->speed_cap_mask[cfx_idx] &
+   PORT_HW_CFG_SPEED_CAPABILITY_D0_10G)
+   lp->req_line_speed[cfx_idx] =
+   SPEED_10000;
else
lp->req_line_speed[cfx_idx] =
SPEED_1000;
-- 
1.9.3
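
The selection order introduced by this patch can be sketched on its own.
The capability bits CAP_10G/CAP_20G and the pick_loopback_speed() helper
below are hypothetical and only mirror the "prefer the highest supported
speed" logic; speeds are given in Mb/s.

```c
#include <stdint.h>

/* Hypothetical capability bits; not the driver's actual values. */
#define CAP_10G (1u << 0)
#define CAP_20G (1u << 1)

/* Pick the loopback speed: keep a request that is already >= 20G,
 * otherwise prefer 20G, then 10G, then fall back to 1G. */
static uint32_t pick_loopback_speed(uint32_t requested, uint32_t caps)
{
	if (requested >= 20000)
		return requested;
	if (caps & CAP_20G)
		return 20000;
	if (caps & CAP_10G)
		return 10000;
	return 1000;
}
```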



[PATCH net 8/9] bnx2x: Release nvram lock on error flow

2015-06-25 Thread Yuval Mintz
During an error flow when trying to access the nvram the driver doesn't
release the hw lock it acquired.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index caf6b31..76b9052 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -1230,6 +1230,7 @@ static int bnx2x_acquire_nvram_lock(struct bnx2x *bp)
if (!(val & (MCPR_NVM_SW_ARB_ARB_ARB1 << port))) {
DP(BNX2X_MSG_ETHTOOL | BNX2X_MSG_NVM,
   "cannot get access to nvram interface\n");
+   bnx2x_release_hw_lock(bp, HW_LOCK_RESOURCE_NVRAM);
return -EBUSY;
}
 
-- 
1.9.3
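
The error-path rule applied here (whatever was acquired before the failure
must be released before returning) can be shown with a small sketch. The
struct dev_state type and the hw_lock()/hw_unlock()/acquire_nvram() helpers
are hypothetical stand-ins, not the driver's functions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the HW lock and the NVRAM arbitration bit. */
struct dev_state {
	bool hw_lock_held;
	bool arb_granted;   /* whether the arbitration request succeeds */
};

static void hw_lock(struct dev_state *d)   { d->hw_lock_held = true; }
static void hw_unlock(struct dev_state *d) { d->hw_lock_held = false; }

/* Take the HW lock, then request arbitration; on failure, drop the HW
 * lock again so the error path does not leak it. */
static int acquire_nvram(struct dev_state *d)
{
	hw_lock(d);
	if (!d->arb_granted) {
		hw_unlock(d);
		return -EBUSY;
	}
	return 0;
}
```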



[PATCH net 7/9] bnx2x: Fix statistics gathering on link change

2015-06-25 Thread Yuval Mintz
From: Ariel Elior 

Since the driver's statistics flow accesses the MACs, and those might be
reset during link re-configuration, we have to make sure that statistics
are not operational when we're about to change link properties.
Statistics are re-enabled [i.e., gathering of statistics re-commences]
once physical link is achieved again.

Since the driver employs a link-flap avoidance (LFA) scheme, there are
scenarios where the driver receives no indication that the new link is
up, and as a result the statistics are not re-enabled.

Preventing LFA from working in such cases guarantees that we'll
always receive such indications and thus fixes statistics gathering.

Signed-off-by: Ariel Elior 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index 733b0fc..caf6b31 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -658,6 +658,7 @@ static int bnx2x_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
bp->link_params.multi_phy_config = new_multi_phy_config;
if (netif_running(dev)) {
bnx2x_stats_handle(bp, STATS_EVENT_STOP);
+   bnx2x_force_link_reset(bp);
bnx2x_link_set(bp);
}
 
@@ -1969,6 +1970,7 @@ static int bnx2x_set_pauseparam(struct net_device *dev,
 
if (netif_running(dev)) {
bnx2x_stats_handle(bp, STATS_EVENT_STOP);
+   bnx2x_force_link_reset(bp);
bnx2x_link_set(bp);
}
 
-- 
1.9.3



[EDT][PATCH] ax88179_178a: add reset function in reset_resume

2015-06-25 Thread Vivek Kumar Bhagat
Without a reset in reset_resume, an iperf connection cannot be
established after suspend/resume, although ping still works at
the same time.

Calling the reset function inside reset_resume fixes this bug. We have
verified it on ASIX-based ST Lab and Cadyce dongles.

Signed-off-by: Vivek Kumar Bhagat 
Signed-off-by: Praveen Kumar 
---
 drivers/net/usb/ax88179_178a.c |   14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index e6338c1..00928c0 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1630,6 +1630,18 @@ static int ax88179_stop(struct usbnet *dev)
return 0;
 }

+static int ax88179_reset_resume(struct usb_interface *intf)
+{
+   struct usbnet *dev = usb_get_intfdata(intf);
+   int ret;
+
+   ret = ax88179_reset(dev);
+   if (ret < 0)
+   return ret;
+
+   return  ax88179_resume(intf);
+}
+
 static const struct driver_info ax88179_info = {
.description = "ASIX AX88179 USB 3.0 Gigabit Ethernet",
.bind = ax88179_bind,
@@ -1744,7 +1756,7 @@ static struct usb_driver ax88179_178a_driver = {
.probe =usbnet_probe,
.suspend =  ax88179_suspend,
.resume =   ax88179_resume,
-   .reset_resume = ax88179_resume,
+   .reset_resume = ax88179_reset_resume,
.disconnect =   usbnet_disconnect,
.supports_autosuspend = 1,
.disable_hub_initiated_lpm = 1,
--
1.7.9.5

Re: [PATCH iproute2 resend] Fix changing tunnel remote and local address to any

2015-06-25 Thread Stephen Hemminger
On Mon, 08 Jun 2015 10:51:51 +0200
Nicolas Dichtel  wrote:

> Le 04/06/2015 14:01, Thadeu Lima de Souza Cascardo a écrit :
> > If a tunnel is created with a local address, you can't change it to any.
> >
> >   # ip tunnel add tunl1 mode ipip remote 10.16.42.37 local 10.16.42.214 ttl 64
> >   # ip tunnel show tunl1
> >   tunl1: ip/ip  remote 10.16.42.37  local 10.16.42.214  ttl 64
> >   # ip tunnel change tunl1 local any
> >   # echo $?
> >   0
> >   # ip tunnel show tunl1
> >   tunl1: ip/ip  remote 10.16.42.37  local 10.16.42.214  ttl 64
> >
> > It happens that parse_args zeroes ip_tunnel_parm, and when creating the
> > tunnel, it is OK to leave it as is if the address is any. However, when
> > changing the tunnel, the current parameters will be read from
> > ip_tunnel_parm, and local and remote address won't be zeroes anymore, so
> > it needs to be explicitly set to any.
> >
> > Signed-off-by: Thadeu Lima de Souza Cascardo 
> Acked-by: Nicolas Dichtel 

Applied.


Re: [PATCH iproute2] mroute: "ip mroute show" not working when "to" and/or "from" is given

2015-06-25 Thread Stephen Hemminger
On Thu, 11 Jun 2015 21:37:36 +0530
Mazhar Rana  wrote:

> The command "ip mroute show" is not showing routes when "to" and/or "from"
> filter is applied.
> 
> root@mazhar:~# ip mroute show
> (10.202.30.101, 235.1.2.3)   Iif: eth0   Oifs: eth1
> 
> But When I applied filter, it does not show anything.
> 
> root@mazhar:~# ip mroute show 235.1.2.3 from 10.202.30.101
> root@mazhar:~#
> 
> Signed-off-by: Mazhar Rana 

Looks good. Applied


Re: [PATCH 1/2] iproute2: tc/m_pedit.c - remove dead code

2015-06-25 Thread Stephen Hemminger
On Thu, 25 Jun 2015 02:03:02 -0700
Maciej Żenczykowski  wrote:

> From: Maciej Żenczykowski 
> 
> The initializers are simply not needed.
> 
> These if-blocks are outright dead code, because '0 > unsigned' is always
> false, so only else clause triggers and regardless of which clause triggers
> it only updates 'ind' which is later unconditionally written to before
> being used anyway.
> 
> Otherwise we get errors from clang:
> 
>   m_pedit.c:166:8: error: comparison of 0 > unsigned expression is always false [-Werror,-Wtautological-compare]
> if (0 > tkey->off) {
> ~ ^ ~
>   m_pedit.c:209:8: error: comparison of 0 > unsigned expression is always false [-Werror,-Wtautological-compare]
> if (0 > tkey->off) {
> ~ ^ ~
>   2 errors generated.
> 
> Change-Id: I3c9e9092915088fc56f992e5df736851541a4458
> ---
>  tc/m_pedit.c | 32 
>  1 file changed, 8 insertions(+), 24 deletions(-)

Both applied
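
For readers wondering why the removed branches could never run, here is a
small sketch. It only illustrates C's conversion rules; store_offset() and
offset_is_negative() are hypothetical helpers and are not part of
m_pedit.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Converting a negative value to an unsigned type wraps it modulo 2^32,
 * so the stored value is never below zero. */
static uint32_t store_offset(int32_t off)
{
	return (uint32_t)off;
}

/* A sign check is only meaningful on the signed value, before it is
 * stored in an unsigned field. */
static bool offset_is_negative(int32_t off)
{
	return off < 0;
}
```

Once a value lives in an unsigned field, a "0 > field" test is always
false, which is what the clang diagnostic quoted above reports.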


Re: [Q] sk->sk_protinfo leftovers

2015-06-25 Thread David Miller
From: Denis Kirjanov 
Date: Wed, 17 Jun 2015 14:58:15 +0300

> I've found the old thread about removing sk_protinfo member [0]. Back
> in 2005 Ralf mentioned that the ax25 case is more complicated. Have
> the things changed in the ax25 code since  that time?

It's a trivial 15 minute hack to remove, I'll post the two patches that do it.


[PATCH 0/2] Get rid of sock->sk_protinfo.

2015-06-25 Thread David Miller

These two patches get rid of the last remaining user of sk_protinfo
(ax25) and then actually get rid of the struct member.

Signed-off-by: David S. Miller 


[PATCH 1/2] ax25: Stop using sock->sk_protinfo.

2015-06-25 Thread David Miller

Just make an ax25_sock structure that provides the ax25_cb pointer.
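
The idea can be sketched outside the kernel as follows. The types
base_sock, proto_cb and proto_sock and the helpers proto_sk() and
sk_to_cb() are hypothetical stand-ins for struct sock, ax25_cb, struct
ax25_sock, ax25_sk() and sk_to_ax25(); this is an illustration of the
embedding pattern, not the patch itself.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and ax25_cb. */
struct base_sock { int state; };
struct proto_cb  { int window; };

/* The protocol socket embeds the generic socket as its FIRST member,
 * so a pointer to the generic part also points at the wrapper. */
struct proto_sock {
	struct base_sock sk;
	struct proto_cb *cb;
};

static struct proto_sock *proto_sk(struct base_sock *sk)
{
	return (struct proto_sock *)sk;
}

static struct proto_cb *sk_to_cb(struct base_sock *sk)
{
	return proto_sk(sk)->cb;
}
```

The cast is only valid when the object was allocated with the size of the
wrapper structure, which is why the patch also changes .obj_size.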

Signed-off-by: David S. Miller 
---
 include/net/ax25.h |   16 +++-
 net/ax25/af_ax25.c |   30 +++---
 net/ax25/ax25_in.c |2 +-
 3 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/include/net/ax25.h b/include/net/ax25.h
index 16a923a..e602f81 100644
--- a/include/net/ax25.h
+++ b/include/net/ax25.h
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define AX25_T1CLAMPLO  1
 #define AX25_T1CLAMPHI  (30 * HZ)
@@ -246,7 +247,20 @@ typedef struct ax25_cb {
atomic_t refcount;
 } ax25_cb;
 
-#define ax25_sk(__sk) ((ax25_cb *)(__sk)->sk_protinfo)
+struct ax25_sock {
+   struct sock sk;
+   struct ax25_cb  *cb;
+};
+
+static inline struct ax25_sock *ax25_sk(const struct sock *sk)
+{
+   return (struct ax25_sock *) sk;
+}
+
+static inline struct ax25_cb *sk_to_ax25(const struct sock *sk)
+{
+   return ax25_sk(sk)->cb;
+}
 
 #define ax25_for_each(__ax25, list) \
hlist_for_each_entry(__ax25, list, ax25_node)
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 9c891d0..ae3a47f 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -57,7 +57,7 @@ static const struct proto_ops ax25_proto_ops;
 
 static void ax25_free_sock(struct sock *sk)
 {
-   ax25_cb_put(ax25_sk(sk));
+   ax25_cb_put(sk_to_ax25(sk));
 }
 
 /*
@@ -306,7 +306,7 @@ void ax25_destroy_socket(ax25_cb *ax25)
while ((skb = skb_dequeue(&ax25->sk->sk_receive_queue)) != NULL) {
if (skb->sk != ax25->sk) {
/* A pending connection */
-   ax25_cb *sax25 = ax25_sk(skb->sk);
+   ax25_cb *sax25 = sk_to_ax25(skb->sk);
 
/* Queue the unaccepted socket for death */
sock_orphan(skb->sk);
@@ -551,7 +551,7 @@ static int ax25_setsockopt(struct socket *sock, int level, int optname,
return -EFAULT;
 
lock_sock(sk);
-   ax25 = ax25_sk(sk);
+   ax25 = sk_to_ax25(sk);
 
switch (optname) {
case AX25_WINDOW:
@@ -697,7 +697,7 @@ static int ax25_getsockopt(struct socket *sock, int level, int optname,
length = min_t(unsigned int, maxlen, sizeof(int));
 
lock_sock(sk);
-   ax25 = ax25_sk(sk);
+   ax25 = sk_to_ax25(sk);
 
switch (optname) {
case AX25_WINDOW:
@@ -796,7 +796,7 @@ out:
 static struct proto ax25_proto = {
.name = "AX25",
.owner= THIS_MODULE,
-   .obj_size = sizeof(struct sock),
+   .obj_size = sizeof(struct ax25_sock),
 };
 
 static int ax25_create(struct net *net, struct socket *sock, int protocol,
@@ -858,7 +858,7 @@ static int ax25_create(struct net *net, struct socket *sock, int protocol,
if (sk == NULL)
return -ENOMEM;
 
-   ax25 = sk->sk_protinfo = ax25_create_cb();
+   ax25 = ax25_sk(sk)->cb = ax25_create_cb();
if (!ax25) {
sk_free(sk);
return -ENOMEM;
@@ -910,7 +910,7 @@ struct sock *ax25_make_new(struct sock *osk, struct ax25_dev *ax25_dev)
sk->sk_state= TCP_ESTABLISHED;
sock_copy_flags(sk, osk);
 
-   oax25 = ax25_sk(osk);
+   oax25 = sk_to_ax25(osk);
 
ax25->modulus = oax25->modulus;
ax25->backoff = oax25->backoff;
@@ -938,7 +938,7 @@ struct sock *ax25_make_new(struct sock *osk, struct ax25_dev *ax25_dev)
}
}
 
-   sk->sk_protinfo = ax25;
+   ax25_sk(sk)->cb = ax25;
sk->sk_destruct = ax25_free_sock;
ax25->sk= sk;
 
@@ -956,7 +956,7 @@ static int ax25_release(struct socket *sock)
sock_hold(sk);
sock_orphan(sk);
lock_sock(sk);
-   ax25 = ax25_sk(sk);
+   ax25 = sk_to_ax25(sk);
 
if (sk->sk_type == SOCK_SEQPACKET) {
switch (ax25->state) {
@@ -1066,7 +1066,7 @@ static int ax25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 
lock_sock(sk);
 
-   ax25 = ax25_sk(sk);
+   ax25 = sk_to_ax25(sk);
if (!sock_flag(sk, SOCK_ZAPPED)) {
err = -EINVAL;
goto out;
@@ -1113,7 +1113,7 @@ static int __must_check ax25_connect(struct socket *sock,
struct sockaddr *uaddr, int addr_len, int flags)
 {
struct sock *sk = sock->sk;
-   ax25_cb *ax25 = ax25_sk(sk), *ax25t;
+   ax25_cb *ax25 = sk_to_ax25(sk), *ax25t;
struct full_sockaddr_ax25 *fsa = (struct full_sockaddr_ax25 *)uaddr;
ax25_digi *digi = NULL;
int ct = 0, err = 0;
@@ -1394,7 +1394,7 @@ static int ax25_getname(struct socket *sock, struct sockaddr *uaddr,
 
memset(fsa, 0, sizeof(*fsa));
lock_sock(sk);
-   ax25 = ax25_sk(sk);
+   ax25 = sk_to_ax25(sk);
 
if (peer != 0) {
   

[PATCH 2/2] net: Kill sock->sk_protinfo

2015-06-25 Thread David Miller

No more users, so it can now be removed.

Signed-off-by: David S. Miller 
---
 include/net/sock.h |2 --
 net/core/sock.c|1 -
 net/sctp/socket.c  |6 --
 3 files changed, 9 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 14d539c..05a8c1a 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -277,7 +277,6 @@ struct cg_proto;
   *@sk_incoming_cpu: record cpu processing incoming packets
   *@sk_txhash: computed flow hash for use on transmit
   *@sk_filter: socket filtering instructions
-  *@sk_protinfo: private area, net family specific, when not using slab
   *@sk_timer: sock cleanup timer
   *@sk_stamp: time stamp of last packet received
   *@sk_tsflags: SO_TIMESTAMPING socket options
@@ -416,7 +415,6 @@ struct sock {
const struct cred   *sk_peer_cred;
long sk_rcvtimeo;
long sk_sndtimeo;
-   void *sk_protinfo;
struct timer_list   sk_timer;
ktime_t sk_stamp;
u16 sk_tsflags;
diff --git a/net/core/sock.c b/net/core/sock.c
index 1e1fe9a..e4be66f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2269,7 +2269,6 @@ static void sock_def_write_space(struct sock *sk)
 
 static void sock_def_destruct(struct sock *sk)
 {
-   kfree(sk->sk_protinfo);
 }
 
 void sk_send_sigurg(struct sock *sk)
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 5f6c4e6..1425ec2 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2121,12 +2121,6 @@ static int sctp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
if (sp->subscribe.sctp_data_io_event)
sctp_ulpevent_read_sndrcvinfo(event, msg);
 
-#if 0
-   /* FIXME: we should be calling IP/IPv6 layers.  */
-   if (sk->sk_protinfo.af_inet.cmsg_flags)
-   ip_cmsg_recv(msg, skb);
-#endif
-
err = copied;
 
/* If skb's length exceeds the user's buffer, update the skb and
-- 
1.7.10.4



[PATCH] flow_dissector: Pre-initialize ip_proto in __skb_flow_dissect()

2015-06-25 Thread Geert Uytterhoeven
net/core/flow_dissector.c: In function ‘__skb_flow_dissect’:
net/core/flow_dissector.c:132: warning: ‘ip_proto’ may be used uninitialized in this function

Signed-off-by: Geert Uytterhoeven 
---
This may be a false positive, but the state machine in
__skb_flow_dissect() is a bit hard to follow.
As I believe it is controlled by a packet received from the network, the
only safe thing to do is to pre-initialize ip_proto.
---
 net/core/flow_dissector.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 476e5dda59e19822..2a834c6179b9973e 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -129,7 +129,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
struct flow_dissector_key_ports *key_ports;
struct flow_dissector_key_tags *key_tags;
struct flow_dissector_key_keyid *key_keyid;
-   u8 ip_proto;
+   u8 ip_proto = 0;
 
if (!data) {
data = skb->data;
-- 
1.9.1



[PATCH] drivers: net: xgene: Pre-initialize ret in xgene_enet_get_resources()

2015-06-25 Thread Geert Uytterhoeven
If CONFIG_ACPI=n:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c: In function ‘xgene_enet_get_resources’:
drivers/net/ethernet/apm/xgene/xgene_enet_main.c:951: warning: ‘ret’ may be used uninitialized in this function

If the driver is bound to a legacy platform device, ret will contain
arbitrary data. If it is non-zero, it will be returned to the caller as
an error code.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 95153b234c7158c6..299eb4315fe647ba 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -948,7 +948,7 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
struct resource *res;
void __iomem *base_addr;
u32 offset;
-   int ret;
+   int ret = 0;
 
pdev = pdata->pdev;
dev = &pdev->dev;
-- 
1.9.1
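
A reduced sketch of the situation described above, assuming the function
sets its return code only inside optional branches. The get_resources()
helper and its parameters are hypothetical and do not reproduce the
driver's code.

```c
/* Hypothetical helper: 'use_dt' and 'use_acpi' select optional paths
 * that may each produce an error code. */
static int get_resources(int use_dt, int use_acpi, int dt_rc, int acpi_rc)
{
	int ret = 0;	/* without this, the "neither path" case returns garbage */

	if (use_dt)
		ret = dt_rc;
	else if (use_acpi)
		ret = acpi_rc;

	return ret;
}
```

When neither branch is taken, the initializer is the only thing that
defines the returned value.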



Re: [PATCH net 0/9] bnx2x: various fixes

2015-06-25 Thread David Miller
From: Yuval Mintz 
Date: Thu, 25 Jun 2015 15:19:20 +0300

> This patch series contains several small fixes [with the possible
> exception of the first 2 link fixes] for various driver flows.

Series applied, thanks.


Re: [PATCH v2 1/3] net: mvneta: introduce compatible string "marvell, armada-xp-neta"

2015-06-25 Thread Jason Cooper
On Thu, Jun 25, 2015 at 11:13:23AM +0200, Simon Guinot wrote:
> On Fri, Jun 19, 2015 at 02:32:53PM +0200, Simon Guinot wrote:
> > On Wed, Jun 17, 2015 at 05:01:12PM +, Jason Cooper wrote:
> > > On Wed, Jun 17, 2015 at 05:15:28PM +0200, Gregory CLEMENT wrote:
> > > > On 17/06/2015 17:12, Gregory CLEMENT wrote:
> > > > > On 17/06/2015 15:19, Simon Guinot wrote:
> > > > >> The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
> > > > >> 380 and 385 SoCs. Since at least one more hardware feature is available
> > > > >> for the Armada XP SoCs then a way to identify them is needed.
> > > > >>
> > > > >> This patch introduces a new compatible string "marvell,armada-xp-neta".
> > > > > 
> > > > > Let's be future proof by going further. I would like to have a compatible
> > > > > string for each SoC even if we currently don't use them.
> > > 
> > > I disagree with this.  We can't predict what inconsistencies we'll discover in
> > > the future.  We should only assign new compatible strings based on known IP
> > > variations when we discover them.  This seems fraught with demons since we
> > > can't predict the scope of affected IP blocks (some steppings of one SoC, three
> > > SoCs plus two steppings of a fourth, etc)
> > > 
> > > imho, the 'future-proofing' lies in being specific as to the naming of the
> > > compatible strings against known hardware variations at the time.
> > 
> > So, should I add more compatible strings or not ?
> 
> How do you want me to handle this ? Did you reach an agreement ?

Sorry, this slipped off my radar.  Probably EBKAC.  :)

I'm still of the opinion that future-proofing equates to guessing.
It has the advantage of, if we guess correctly, things are easier down
the road when we discover differences between similar IP blocks.
However, if we guess incorrectly, then we have a mess on our hands.
iow, this proposal fails poorly.

I've no problem breaking DT compatibility when it's determined that we
made a mistake (or mistakes) in the past.  See the irqchip rework that
Marc did a few cycles ago.

The difference here is that we know better.  We *know* that dtbs are
upgraded with the kernel.  We *know* that no one is shipping products
with dtbs in ROMs.  So what are we really trying to protect against?

thx,

Jason.


Re: netdev broken?

2015-06-25 Thread Jamal Hadi Salim

On 06/25/15 08:03, David Miller wrote:

> I'm pretty sure you're just not subscribed to the list.

I never unsubscribed. I've resubscribed.

cheers,
jamal


Re: [PATCH] flow_dissector: Pre-initialize ip_proto in __skb_flow_dissect()

2015-06-25 Thread David Miller
From: Geert Uytterhoeven 
Date: Thu, 25 Jun 2015 15:10:32 +0200

> net/core/flow_dissector.c: In function ‘__skb_flow_dissect’:
> net/core/flow_dissector.c:132: warning: ‘ip_proto’ may be used uninitialized in this function
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
> This may be a false positive, but the state machine in
> __skb_flow_dissect() is a bit hard to follow.
> As I believe it is controlled by a packet received from the network, the
> only safe thing to do is to pre-initialize ip_proto.

Actually I think this is a real bug, because for the ETH_P_MPLS_* cases I cannot
see what will always set ip_proto before it gets used as an input.


Re: [PATCH RFC] 2/2 huawei_cdc_ncm: introduce new TX ncm stack

2015-06-25 Thread Oliver Neukum
On Thu, 2015-06-25 at 13:44 +0200, Enrico Mioso wrote:

> On Thu, 25 Jun 2015, Oliver Neukum wrote:

> > Is there any advantage in keeping this in a single function?
> >
> I made this choice because I think the tx_fixup function will
> become more complex than it is now, when aggregating frames.

Yes, but that is a reason to split the helpers up not the opposite.

> I answer here your other message to make it more convenient to read: my 
> intention for the tx_fixup function would be to:
> - aggregate frames
> - send them out when:
>   - a timer expires

How would you do that in tx_fixup()? If a timer is required then you
need a separate function.

>   OR
>   - we have enough data in the aggregate, and cannot add more.

Yes.

You need a third case:
- the interface is taken down.

But in general the logic for that is already there. So can you explain
what additional goals you have?

> This is something done in cdc_ncm.c for example.
> But here I have a question: by reading the comment in file 
> drivers/net/usb/rndis_host.c at line 572, there seem to be different opinions 
> in this matter.

That is a very old comment written for much slower devices.
rndis_host doesn't get much love nowadays.

> What to do ?
> 
> >> +int
> >> +huawei_ncm_mgmt(struct usbnet *dev,
> >> +  struct huawei_cdc_ncm_state *drvstate, struct sk_buff *skb_out, int mode) {
> >> +  struct usb_cdc_ncm_nth16 *nth16 = (struct usb_cdc_ncm_nth16 *)skb_out->data;
> >> +  struct cdc_ncm_ctx *ctx = drvstate->ctx;
> >> +  struct usb_cdc_ncm_ndp16 *ndp16 = NULL;
> >> +  int ret = -EINVAL;
> >> +  u16 ndplen, index;
> >> +
> >> +  switch (mode) {
> >> +  case NCM_INIT_FRAME:
> >> +
> >> +  /* Write a new NTH16 header */
> >> +  nth16 = (struct usb_cdc_ncm_nth16 *)memset(skb_put(skb_out, sizeof(struct usb_cdc_ncm_nth16)), 0, sizeof(struct usb_cdc_ncm_nth16));
> >> +  if (!nth16) {
> >> +  ret = -EINVAL;
> >> +  goto error;
> >> +  }
> >> +
> >> +  /* NTH16 signature and header length are known a-priori. */
> >> +  nth16->dwSignature = cpu_to_le32(USB_CDC_NCM_NTH16_SIGN);
> >> +  nth16->wHeaderLength = cpu_to_le16(sizeof(struct usb_cdc_ncm_nth16));
> >> +
> >> +  /* TX sequence numbering */
> >> +  nth16->wSequence = cpu_to_le16(ctx->tx_seq++);
> >> +
> >> +  /* Forget about previous SKB NDP */
> >> +  drvstate->skb_ndp = NULL;
> >
> > This is probably better done after you know you cannot fail.
> Sure. Thank you.
> >
> >> +
> >> +  /* Allocate a new NDP */
> >> +  ndp16 = kzalloc(ctx->max_ndp_size, GFP_NOIO);
> >
> > Where is this freed?
> The intention was to free it in the NCM_COMMIT_NDP case.
> In fact, after allocating the pointer, I make a copy of it in the driver state
> (drvstate) variable, and get back to it later.
> Is this wrong?

Well, no, but it supposes a matched commit phase. Can you guarantee
that? I was under the impression that in that phase you want to actually
give a frame over to the hardware.





Re: [PATCH net 6/9] bnx2x: Fix self-test for 20g devices

2015-06-25 Thread Sergei Shtylyov

Hello.

On 06/25/2015 03:19 PM, Yuval Mintz wrote:


20g-capable devices are not configured properly for self-test; they use
10g as their speed, which causes the link indication to remain down and
the internal loopback test to fail.



Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 12 
  1 file changed, 8 insertions(+), 4 deletions(-)



diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index c1033a5..3df03bb 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -2347,12 +2347,16 @@ int bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode)
if (load_mode == LOAD_DIAG) {
struct link_params *lp = &bp->link_params;
lp->loopback_mode = LOOPBACK_XGXS;
-   /* do PHY loopback at 10G speed, if possible */
-   if (lp->req_line_speed[cfx_idx] < SPEED_10000) {
+   /* Prefer doing PHY loopback at highest speed */
+   if (lp->req_line_speed[cfx_idx] < SPEED_20000) {
if (lp->speed_cap_mask[cfx_idx] &
-   PORT_HW_CFG_SPEED_CAPABILITY_D0_10G)
+   PORT_HW_CFG_SPEED_CAPABILITY_D0_20G)
lp->req_line_speed[cfx_idx] =
-   SPEED_10000;
+   SPEED_20000;


   I would have added one more tab here.


+   else if (lp->speed_cap_mask[cfx_idx] &
+   PORT_HW_CFG_SPEED_CAPABILITY_D0_10G)


   This line should have been started right under 'lp'.


+   lp->req_line_speed[cfx_idx] =


   This line is indented too much.


+   SPEED_10000;


   This line seems alright.

WBR, Sergei
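
The selection order the patch describes can be sketched in isolation. This is
a minimal userspace sketch, not driver code: the DEMO_* names and bit values
are hypothetical stand-ins for the PORT_HW_CFG_SPEED_CAPABILITY_D0_* flags and
the SPEED_* constants, and the final 1G fallback is an assumption about the
part of the function that the hunk does not show.

```c
#include <stdint.h>

/* Hypothetical capability bits; the real driver uses
 * PORT_HW_CFG_SPEED_CAPABILITY_D0_10G / _20G with different values. */
#define DEMO_CAP_10G (1u << 0)
#define DEMO_CAP_20G (1u << 1)

/* Speeds in Mb/s, following the SPEED_* naming convention. */
enum {
	DEMO_SPEED_1000  = 1000,
	DEMO_SPEED_10000 = 10000,
	DEMO_SPEED_20000 = 20000,
};

/* Pick the loopback speed for a diagnostic load: keep a request that is
 * already 20G or more, otherwise prefer 20G, then 10G, then (assumed) 1G. */
static int demo_pick_loopback_speed(int requested, uint32_t cap_mask)
{
	if (requested >= DEMO_SPEED_20000)
		return requested;
	if (cap_mask & DEMO_CAP_20G)
		return DEMO_SPEED_20000;
	if (cap_mask & DEMO_CAP_10G)
		return DEMO_SPEED_10000;
	return DEMO_SPEED_1000;
}
```

With only the 10G check, as before the patch, a 20G-capable port would be
asked for 10G, which is the misconfiguration the commit message describes.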



Re: [PATCH] flow_dissector: Pre-initialize ip_proto in __skb_flow_dissect()

2015-06-25 Thread Jiri Pirko
Thu, Jun 25, 2015 at 03:33:31PM CEST, da...@davemloft.net wrote:
>From: Geert Uytterhoeven 
>Date: Thu, 25 Jun 2015 15:10:32 +0200
>
>> net/core/flow_dissector.c: In function ‘__skb_flow_dissect’:
>> net/core/flow_dissector.c:132: warning: ‘ip_proto’ may be used uninitialized 
>> in this function
>> 
>> Signed-off-by: Geert Uytterhoeven 
>> ---
>> This may be a false positive, but the state machine in
>> __skb_flow_dissect() is a bit hard to follow.
>> As I believe it is controlled by a packet received from the network, the
>> only safe thing to do is to pre-initialize ip_proto.
>
>Actually I think this is a real bug, because for the ETH_P_MPLS_* cases I 
>cannot
>see what will always set ip_proto before it gets used as an input.

I think that MPLS cases are ok. In this case, return is always hit.
I believe this is a false positive.




[PATCH net] dsa: fix promiscuity leak on slave dev open error

2015-06-25 Thread Gilad Ben-Yossef
DSA master netdev promiscuity counter was not being properly
decremented on slave device open error path.

Signed-off-by: Gilad Ben-Yossef 
CC: Gilad Ben-Yossef 
CC: David S. Miller 
CC: Florian Fainelli 
CC: Guenter Roeck 
CC: Andrew Lunn 
CC: Scott Feldman 
---
 net/dsa/slave.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 04ffad3..0917123 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -112,7 +112,7 @@ static int dsa_slave_open(struct net_device *dev)
 
 clear_promisc:
if (dev->flags & IFF_PROMISC)
-   dev_set_promiscuity(master, 0);
+   dev_set_promiscuity(master, -1);
 clear_allmulti:
if (dev->flags & IFF_ALLMULTI)
dev_set_allmulti(master, -1);
-- 
1.7.1
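
Why the one-character change matters can be sketched in userspace.
dev_set_promiscuity() adjusts a counter by the increment it is given, so
passing 0 on the error path leaves the earlier +1 in place. The demo_* names
below are hypothetical and the open path is reduced to the bare minimum; this
is an illustration under those assumptions, not the DSA code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the master net_device: only the counter. */
struct demo_netdev {
	int promiscuity;
};

/* Mirrors the counter semantics: the argument is an increment. */
static void demo_set_promiscuity(struct demo_netdev *dev, int inc)
{
	dev->promiscuity += inc;
}

/* Simplified open path: take a promiscuity reference, then optionally
 * fail and unwind with the given increment (0 before the fix, -1 after). */
static int demo_slave_open(struct demo_netdev *master, bool fail, int undo_inc)
{
	demo_set_promiscuity(master, 1);
	if (fail) {
		demo_set_promiscuity(master, undo_inc);
		return -1;
	}
	return 0;
}
```

After a failed open with undo_inc == 0 the counter stays at 1, so the master
would remain promiscuous; with -1 it returns to 0.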





Re: Performance loss due to commit 37c3185 ([NET]: Added GSO toggle)

2015-06-25 Thread Herbert Xu
On Sat, Jun 20, 2015 at 11:55:03AM +0200, christophe leroy wrote:
> Hello Herbert,
> 
> In commit "[NET]: Added GSO toggle"
> 37c3185a02d4b85fbe134bf5204535405dd2c957,
> you force NETIF_F_HW_CSUM if GSO feature is selected.
> By default, SW GSO is active as soon as a network board has NETIF_F_SG
> feature.
> This means that function sk_setup_caps() forces NETIF_F_HW_CSUM for any
> board having NETIF_F_SG
> 
> For boards having no HW checksum capability, this results in performance
> loss due to data copy being done in skb_do_copy_data_nocache() with
> copy_from_user() then checksum being done later with csum_partial()
> instead of getting both done at the same time using
> csum_and_copy_from_user()
> 
> Is there a reason for forcing NETIF_F_HW_CSUM   ?

Well GSO requires hardware checksum for all the protocols that
it supports.  I guess we should also check that the hardware can
checksum both IPv4 and IPv6 before enabling it.

However, the benefit of GSO should cancel out the cost of copying
so I was hoping to just enable GSO unconditionally at some point.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


RESEND: Performance loss due to commit 37c3185 ([NET]: Added GSO toggle)

2015-06-25 Thread leroy christophe

Hello Herbert,

In commit "[NET]: Added GSO toggle" 
37c3185a02d4b85fbe134bf5204535405dd2c957,

you force NETIF_F_HW_CSUM if GSO feature is selected.
By default, SW GSO is active as soon as a network board has NETIF_F_SG 
feature.
This means that function sk_setup_caps() forces NETIF_F_HW_CSUM for any 
board having NETIF_F_SG


For boards having no HW checksum capability, this results in performance 
loss due to data copy being done in skb_do_copy_data_nocache() with 
copy_from_user() then checksum being done later with csum_partial() 
instead of getting both done at the same time using 
csum_and_copy_from_user()


What is the reason for forcing NETIF_F_HW_CSUM   ?

Christophe
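
The cost difference described above can be sketched in userspace: one variant
copies and then walks the data a second time to checksum it, the other does
both in a single pass. This is a simplified illustration, not the kernel
routines; the demo_* names are hypothetical, the sum is a plain 16-bit
one's-complement sum over big-endian words, and lengths are assumed small
enough that the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t demo_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Two passes over the data: copy first, checksum the copy afterwards. */
static uint16_t demo_copy_then_csum(uint8_t *dst, const uint8_t *src, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	memcpy(dst, src, len);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)dst[i] << 8) | dst[i + 1];
	if (len & 1)
		sum += (uint32_t)dst[len - 1] << 8; /* pad odd byte with zero */
	return demo_fold(sum);
}

/* One pass: every byte is read once, copied and summed together. */
static uint16_t demo_csum_and_copy(uint8_t *dst, const uint8_t *src, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		dst[i] = src[i];
		dst[i + 1] = src[i + 1];
		sum += ((uint32_t)src[i] << 8) | src[i + 1];
	}
	if (len & 1) {
		dst[len - 1] = src[len - 1];
		sum += (uint32_t)src[len - 1] << 8; /* pad odd byte with zero */
	}
	return demo_fold(sum);
}
```

Both functions produce the same copy and the same sum; the second touches the
source data only once, which is the saving that is lost when the checksum is
deferred.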


Re: [GIT] Networking

2015-06-25 Thread Or Gerlitz
On Thu, Jun 25, 2015 at 2:38 AM, Linus Torvalds
 wrote:
>
> On Wed, Jun 24, 2015 at 6:39 AM, David Miller  wrote:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
>
> Just going through the conflicts, I see commit 7193a141eb74 ("IB/mlx4:
> Set VF to read from QP counters"), and wonder...
>
> Is that code really supposed to fall through to the
> infiniband-over-ethernet case when the link layer is
> IB_LINK_LAYER_INFINIBAND but it's a slave?
>
> The commit message is not in the least helpful.
>

And this is a bug indeed.

Under IB links, we should use the infiniband-over-ethernet flow only for one
specific case (reading link performance counters by SRIOV VFs) and nothing else.

I sent a fix, https://patchwork.kernel.org/patch/6675921/

Or.


Re: [PATCH RFC] 2/2 huawei_cdc_ncm: introduce new TX ncm stack

2015-06-25 Thread Enrico Mioso

Hi Oliver! And thank you again.

I like / recommend the usage of open messaging standards: my preferred XMPP ID
(JID) is: mrk...@jit.si.

On Thu, 25 Jun 2015, Oliver Neukum wrote:


Date: Thu, 25 Jun 2015 15:38:46
From: Oliver Neukum 
To: Enrico Mioso 
Cc: linux-...@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH RFC] 2/2 huawei_cdc_ncm: introduce new TX ncm stack

On Thu, 2015-06-25 at 13:44 +0200, Enrico Mioso wrote:


On Thu, 25 Jun 2015, Oliver Neukum wrote:



Is there any advantage in keeping this in a single function?


I made this choice because I expect the tx_fixup function to become more
complex than it is now once it aggregates frames.


Yes, but that is a reason to split the helpers up not the opposite.

Right - I only now understand your observation.

The only reason to write the manager that way was that it shouldn't become very 
complex - it should simply do things to frames, helping out in building valid 
NCM packages.





I answer here your other message to make it more convenient to read: my
intention for the tx_fixup function would be to:
- aggregate frames
- send them out when:
- a timer expires


How would you do that in tx_fixup()? If a timer is required then you
need a separate function.

Sure. I meant: I will adapt it if needed, and I expect the code to become 
a little more convoluted.

OR
- we have enough data in the aggregate, and cannot add more.


Yes.

You need a third case:
- the interface is taken down.

But in general the logic for that is already there. So can you explain
what additional goals you have?
Regarding the "timer logic": I saw it in cdc_ncm.c, but I should study it in 
more detail to understand it and implement it here if needed.

And sure, the "interface goes down" case is important: I hadn't thought about it.
Thanks for the point.

The only other goal is to use the manager in such a way that the NDP 
will be placed after the frames.


I think that this split between NCM management and tx_fixup makes things more 
flexible in general: this is the reason for re-writing it.



This is something done in cdc_ncm.c for example.
But here I have a question: by reading the comment in file
drivers/net/usb/rndis_host.c at line 572, there seem to be different opinions
in this matter.


That is a very old comment written for much slower devices.
rndis_host doesn't get much love nowadays.

Fine.



What to do ?


+int
+huawei_ncm_mgmt(struct usbnet *dev,
+   struct huawei_cdc_ncm_state *drvstate, struct sk_buff *skb_out, 
int mode) {
+   struct usb_cdc_ncm_nth16 *nth16 = (struct usb_cdc_ncm_nth16 
*)skb_out->data;
+   struct cdc_ncm_ctx *ctx = drvstate->ctx;
+   struct usb_cdc_ncm_ndp16 *ndp16 = NULL;
+   int ret = -EINVAL;
+   u16 ndplen, index;
+
+   switch (mode) {
+   case NCM_INIT_FRAME:
+
+   /* Write a new NTH16 header */
+   nth16 = (struct usb_cdc_ncm_nth16 *)memset(skb_put(skb_out, 
sizeof(struct usb_cdc_ncm_nth16)), 0, sizeof(struct usb_cdc_ncm_nth16));
+   if (!nth16) {
+   ret = -EINVAL;
+   goto error;
+   }
+
+   /* NTH16 signature and header length are known a-priori. */
+   nth16->dwSignature = cpu_to_le32(USB_CDC_NCM_NTH16_SIGN);
+   nth16->wHeaderLength = cpu_to_le16(sizeof(struct 
usb_cdc_ncm_nth16));
+
+   /* TX sequence numbering */
+   nth16->wSequence = cpu_to_le16(ctx->tx_seq++);
+
+   /* Forget about previous SKB NDP */
+   drvstate->skb_ndp = NULL;


This is probably better done after you know you cannot fail.

Sure. Thank you.



+
+   /* Allocate a new NDP */
+   ndp16 = kzalloc(ctx->max_ndp_size, GFP_NOIO);


Where is this freed?

The intention was to free it in the NCM_COMMIT_NDP case.
In fact, after allocating the pointer, I make a copy of it in the driver state
(drvstate) variable, and get back to it later.
Is this wrong?


Well, no, but it supposes a matched commit phase. Can you guarantee
that? I was under the impression that in that phase you want to actually
give a frame over to the hardware.

No. When I talk about committing, I mean "writing things to the out skb".
Another reason why I found it comfortable to write the code this way was to 
maintain a kind of statefulness in a cleaner way.
But I understand this is arguable, and I can break it up if needed.


Regarding the commit phase - I am not sure I understand your comment, sorry
about that.
However, my intention would be to allow the caller to do calls in sequences 
like:

- init frame
- ncm update
- ncm update
- ncm update
...
- ncm update
- ncm commit

(to work in "huawei" mode)

OR

- ncm init frame
- ncm commit
- ncm update
- ncm update
- ncm update
- ncm update
- finalize nth

(to work in "cdc_ncm.c"-mode)..


But to prevent usbnet from submit

Re: [PATCH] flow_dissector: Pre-initialize ip_proto in __skb_flow_dissect()

2015-06-25 Thread Tom Herbert
On Thu, Jun 25, 2015 at 6:10 AM, Geert Uytterhoeven
 wrote:
> net/core/flow_dissector.c: In function ‘__skb_flow_dissect’:
> net/core/flow_dissector.c:132: warning: ‘ip_proto’ may be used uninitialized 
> in this function
>
> Signed-off-by: Geert Uytterhoeven 
> ---
> This may be a false positive, but the state machine in
> __skb_flow_dissect() is a bit hard to follow.
> As I believe it is controlled by a packet received from the network, the
> only safe thing to do is to pre-initialize ip_proto.
> ---
>  net/core/flow_dissector.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 476e5dda59e19822..2a834c6179b9973e 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -129,7 +129,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
> struct flow_dissector_key_ports *key_ports;
> struct flow_dissector_key_tags *key_tags;
> struct flow_dissector_key_keyid *key_keyid;
> -   u8 ip_proto;
> +   u8 ip_proto = 0;
>
> if (!data) {
> data = skb->data;
> --
> 1.9.1
>

Acked-by: Tom Herbert 


Re: [GIT] Networking

2015-06-25 Thread Paul Gortmaker
On Wed, Jun 24, 2015 at 7:38 PM, Linus Torvalds
 wrote:

[...]

>
> I'm getting *real* tired of that BUG_ON() shit. I realize that
> infiniband is a niche market, and those "commercial grade" niche
> markets are more-than-used-to crap code and horrible hacks, but this
> is still the kernel. We don't add random machine-killing debug checks
> when it is *so* simple to just do
>
> if (WARN_ON_ONCE(..))
> return -EINVAL;
>
> instead.
>
> Killing the machine for idiotic things like that is truly offensive,
> and truly horrible horrible code. Why do I keep on having to tell
> people off for doing these things? Why do people keep thinking that
> debugging-by-killing-the-machine is a good idea?

Ingo figured this was an educational battle that we'd never win.

https://lkml.org/lkml/2014/5/21/490

I tend to agree, as unfortunate as that is.

Paul.
--

>
> Either that BUG_ON() cannot possibly happen, in which case it should
> damn well not exist in the first place. Or it's a valuable debug aid,
> in which case it should damn well not be a BUG_ON. You can't have it
> both ways.
>
> The next pointless BUG_ON() I see, I will start getting _really_
> unpleasant about.
>
> Doug, get rid of those things asap.
>
> Linus
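
The pattern recommended in the quoted text, warn once and return an error
instead of halting, can be sketched in userspace. This is a hedged
illustration: demo_warn_on_once() is a hypothetical stand-in for the kernel's
WARN_ON_ONCE() macro (the caller supplies the "already warned" flag here), and
demo_set_counter_index() is an invented caller, not code from the commit under
discussion.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings actually printed, kept only to make the sketch observable. */
static int demo_warn_count;

/* Report the condition the first time it is true, and always return it so
 * the caller can bail out with an error code instead of stopping the machine. */
static bool demo_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		demo_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical caller: rejects a bad index with -EINVAL rather than
 * treating it as fatal. */
static int demo_set_counter_index(int index, int max)
{
	static bool warned;

	if (demo_warn_on_once(index < 0 || index >= max, &warned,
			      "counter index out of range"))
		return -EINVAL;
	return 0;
}
```

Repeated bad calls keep returning -EINVAL but log only once, so the condition
stays visible as a debug aid without taking the system down.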


[net PATCH 1/1] net: phy: fix phy link up when limiting speed via device tree

2015-06-25 Thread Mugunthan V N
When limiting phy link speed using "max-speed" to 100mbps or less on a
giga bit phy, phy never completes auto negotiation and phy state
machine is held in PHY_AN. Fixing this issue by comparing the giga
bit advertise though phydev->supported doesn't have it but phy has
BMSR_ESTATEN set. So that auto negotiation is restarted as old and
new advertise are different and link comes up fine.

Signed-off-by: Mugunthan V N 
---
 drivers/net/phy/phy_device.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index bdfe51f..d551df6 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -796,10 +796,11 @@ static int genphy_config_advert(struct phy_device *phydev)
if (phydev->supported & (SUPPORTED_1000baseT_Half |
 SUPPORTED_1000baseT_Full)) {
adv |= ethtool_adv_to_mii_ctrl1000_t(advertise);
-   if (adv != oldadv)
-   changed = 1;
}
 
+   if (adv != oldadv)
+   changed = 1;
+
err = phy_write(phydev, MII_CTRL1000, adv);
if (err < 0)
return err;
-- 
2.4.4.409.g5b1d901
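
The effect of moving the comparison out of the "if" can be sketched in
userspace. This is a simplified model, not the phylib function: the demo_*
names are hypothetical, the two bit values match ADVERTISE_1000HALF and
ADVERTISE_1000FULL from linux/mii.h, and it is assumed that the gigabit bits
are cleared from the register value before the capability check.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ADVERTISE_1000HALF 0x0100
#define DEMO_ADVERTISE_1000FULL 0x0200
#define DEMO_ADVERTISE_1000_ALL (DEMO_ADVERTISE_1000HALF | DEMO_ADVERTISE_1000FULL)

/* Before the patch: the comparison only happens when gigabit is supported. */
static bool demo_changed_old(uint16_t reg, bool supports_gbit, uint16_t wanted)
{
	uint16_t oldadv = reg;
	uint16_t adv = reg & (uint16_t)~DEMO_ADVERTISE_1000_ALL;
	bool changed = false;

	if (supports_gbit) {
		adv |= wanted;
		if (adv != oldadv)
			changed = true;
	}
	return changed;
}

/* After the patch: the comparison always happens, so clearing gigabit
 * bits that the PHY still advertises counts as a change. */
static bool demo_changed_new(uint16_t reg, bool supports_gbit, uint16_t wanted)
{
	uint16_t oldadv = reg;
	uint16_t adv = reg & (uint16_t)~DEMO_ADVERTISE_1000_ALL;

	if (supports_gbit)
		adv |= wanted;
	return adv != oldadv;
}
```

With a register that still advertises both gigabit modes (0x0300) and a speed
limit that removes gigabit from the supported set, the old logic reports no
change and the new logic reports one, which is what lets autonegotiation be
restarted.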



Re: [GIT] Networking

2015-06-25 Thread Joe Perches
On Thu, 2015-06-25 at 12:24 -0400, Paul Gortmaker wrote:
> On Wed, Jun 24, 2015 at 7:38 PM, Linus Torvalds 
>  wrote:
> > I'm getting *real* tired of that BUG_ON() shit.
[]
> > Killing the machine for idiotic things like that is truly offensive,
> > and truly horrible horrible code. Why do I keep on having to tell
> > people off for doing these things? Why do people keep thinking that
> > debugging-by-killing-the-machine is a good idea?
> 
> Ingo figured this was an educational battle that we'd never win.
> 
> https://lkml.org/lkml/2014/5/21/490

That should still be applied but for the selfie/OED bit as
selfie is in the OED now...




Re: [PATCH RFC] vhost: add ioctl to query nregions upper limit

2015-06-25 Thread Igor Mammedov
On Wed, 24 Jun 2015 17:08:56 +0200
"Michael S. Tsirkin"  wrote:

> On Wed, Jun 24, 2015 at 04:52:29PM +0200, Igor Mammedov wrote:
> > On Wed, 24 Jun 2015 16:17:46 +0200
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Wed, Jun 24, 2015 at 04:07:27PM +0200, Igor Mammedov wrote:
> > > > On Wed, 24 Jun 2015 15:49:27 +0200
> > > > "Michael S. Tsirkin"  wrote:
> > > > 
> > > > > Userspace currently simply tries to give vhost as many regions
> > > > > as it happens to have, but you only have the mem table
> > > > > when you have initialized a large part of VM, so graceful
> > > > > failure is very hard to support.
> > > > > 
> > > > > The result is that userspace tends to fail catastrophically.
> > > > > 
> > > > > Instead, add a new ioctl so userspace can find out how much
> > > > > kernel supports, up front. This returns a positive value that
> > > > > we commit to.
> > > > > 
> > > > > Also, document our contract with legacy userspace: when
> > > > > running on an old kernel, you get -1 and you can assume at
> > > > > least 64 slots.  Since 0 value's left unused, let's make that
> > > > > mean that the current userspace behaviour (trial and error)
> > > > > is required, just in case we want it back.
> > > > > 
> > > > > Signed-off-by: Michael S. Tsirkin 
> > > > > Cc: Igor Mammedov 
> > > > > Cc: Paolo Bonzini 
> > > > > ---
> > > > >  include/uapi/linux/vhost.h | 17 -
> > > > >  drivers/vhost/vhost.c  |  5 +
> > > > >  2 files changed, 21 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/include/uapi/linux/vhost.h
> > > > > b/include/uapi/linux/vhost.h index ab373191..f71fa6d 100644
> > > > > --- a/include/uapi/linux/vhost.h
> > > > > +++ b/include/uapi/linux/vhost.h
> > > > > @@ -80,7 +80,7 @@ struct vhost_memory {
> > > > >   * Allows subsequent call to VHOST_OWNER_SET to succeed. */
> > > > >  #define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
> > > > >  
> > > > > -/* Set up/modify memory layout */
> > > > > +/* Set up/modify memory layout: see also
> > > > > VHOST_GET_MEM_MAX_NREGIONS below. */ #define
> > > > > VHOST_SET_MEM_TABLE   _IOW(VHOST_VIRTIO, 0x03, struct
> > > > > vhost_memory) /* Write logging setup. */
> > > > > @@ -127,6 +127,21 @@ struct vhost_memory {
> > > > >  /* Set eventfd to signal an error */
> > > > >  #define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct
> > > > > vhost_vring_file) 
> > > > > +/* Query upper limit on nregions in VHOST_SET_MEM_TABLE
> > > > > arguments.
> > > > > + * Returns:
> > > > > + *   0 < value <= MAX_INT - gives the upper limit,
> > > > > higher values will fail
> > > > > + *   0 - there's no static limit: try and see if it
> > > > > works
> > > > > + *   -1 - on failure
> > > > > + */
> > > > > +#define VHOST_GET_MEM_MAX_NREGIONS   _IO(VHOST_VIRTIO, 0x23)
> > > > > +
> > > > > +/* Returned by VHOST_GET_MEM_MAX_NREGIONS to mean there's no
> > > > > static limit:
> > > > > + * try and it'll work if you are lucky. */
> > > > > +#define VHOST_MEM_MAX_NREGIONS_NONE 0
> > > > is it needed? we always have a limit,
> > > > or don't have IOCTL => -1 => old try and see way
> > > > 
> > > > > +/* We support at least as many nregions in
> > > > > VHOST_SET_MEM_TABLE:
> > > > > + * for use on legacy kernels without
> > > > > VHOST_GET_MEM_MAX_NREGIONS support. */ +#define
> > > > > VHOST_MEM_MAX_NREGIONS_DEFAULT 64
> > > > ^^^ not used below,
> > > > if it's for legacy then perhaps s/DEFAULT/LEGACY/ 
> > > 
> > > The assumption was that userspace detecting old kernels will just
> > > use 64, this means we do want a flag to get the old way.
> > > 
> > > OTOH if you won't think it's useful, let me know.
> > this header will be synced into QEMU's tree so that we could use
> > this define there, isn't it? IMHO then _LEGACY is more exact
> > description of macro.
> > 
> > As for 0 return value, -1 is just fine for detecting old kernels
> > (i.e. try and see if it works), so 0 looks unnecessary but it
> > doesn't in any way hurt either. For me limit or -1 is enough to try
> > fix userspace.
> 
> OK.
> Do you want to try now before I do v2?

I've just tried; the idea of checking the limit is unusable in this case.
Here is a link to a patch that implements it:
https://github.com/imammedo/qemu/commits/vhost_slot_limit_check

The slot count changes dynamically depending on the devices in use and,
more importantly, the guest OS can change the slot count at runtime:
while managing devices it can trigger repartitioning of the current
memory table as device memory regions are mapped into the address space.

That leads to 2 different values of used slots at guest startup
time and after guest booted or after hotplug.

In my case the guest could be started with max 58 DIMMs coldplugged,
but after boot 3 more slots are freed and it's possible to hotadd
3 more DIMMs. That, however, leads to a guest that cannot be migrated,
since by QEMU design all hotplugged devices should be present
at the target's startup time, i.e. 60 DIMMs total, and that obviously
goes above vhost limit
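
The return-value contract proposed in the quoted RFC patch can be sketched as
a small userspace helper. This is only an illustration of the convention
described there (positive value is the limit, 0 means no static limit, -1
means a legacy kernel where at least 64 regions may be assumed); the demo_*
names are hypothetical, and the sketch does not issue the ioctl itself.

```c
#include <stdbool.h>

/* Mirrors VHOST_MEM_MAX_NREGIONS_DEFAULT from the quoted patch. */
#define DEMO_MAX_NREGIONS_DEFAULT 64

/* Result of interpreting the query: either a usable bound, or "unknown",
 * in which case userspace has to fall back to trial and error. */
struct demo_nregions_limit {
	bool known;
	int max;
};

/* Interpret the value returned by the (proposed) query ioctl. */
static struct demo_nregions_limit demo_interpret_max_nregions(int ret)
{
	struct demo_nregions_limit limit = { false, 0 };

	if (ret > 0) {
		/* Kernel commits to this upper limit. */
		limit.known = true;
		limit.max = ret;
	} else if (ret < 0) {
		/* Old kernel without the ioctl: assume the legacy minimum. */
		limit.known = true;
		limit.max = DEMO_MAX_NREGIONS_DEFAULT;
	}
	/* ret == 0: no static limit, leave it unknown. */
	return limit;
}
```

As the reply above points out, knowing the bound up front does not help on its
own when the number of used slots changes after guest startup.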

Re: [net PATCH 1/1] net: phy: fix phy link up when limiting speed via device tree

2015-06-25 Thread Florian Fainelli
2015-06-25 9:51 GMT-07:00 Mugunthan V N :
> When limiting phy link speed using "max-speed" to 100mbps or less on a
> giga bit phy, phy never completes auto negotiation and phy state
> machine is held in PHY_AN. Fixing this issue by comparing the giga
> bit advertise though phydev->supported doesn't have it but phy has
> BMSR_ESTATEN set. So that auto negotiation is restarted as old and
> new advertise are different and link comes up fine.

This looks valid, however, I am curious why this part of the code:

oldadv = adv;
adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4 | ADVERTISE_PAUSE_CAP |
 ADVERTISE_PAUSE_ASYM);
adv |= ethtool_adv_to_mii_adv_t(advertise);

if (adv != oldadv) {
err = phy_write(phydev, MII_ADVERTISE, adv);

if (err < 0)
return err;
changed = 1;
}

Is not flagging this as a change already?

>
> Signed-off-by: Mugunthan V N 
> ---
>  drivers/net/phy/phy_device.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index bdfe51f..d551df6 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -796,10 +796,11 @@ static int genphy_config_advert(struct phy_device 
> *phydev)
> if (phydev->supported & (SUPPORTED_1000baseT_Half |
>  SUPPORTED_1000baseT_Full)) {
> adv |= ethtool_adv_to_mii_ctrl1000_t(advertise);
> -   if (adv != oldadv)
> -   changed = 1;
> }
>
> +   if (adv != oldadv)
> +   changed = 1;
> +
> err = phy_write(phydev, MII_CTRL1000, adv);
> if (err < 0)
> return err;
> --
> 2.4.4.409.g5b1d901
>



-- 
Florian


Re: [net PATCH 1/1] net: phy: fix phy link up when limiting speed via device tree

2015-06-25 Thread Mugunthan V N
On Thursday 25 June 2015 11:01 PM, Florian Fainelli wrote:
> 2015-06-25 9:51 GMT-07:00 Mugunthan V N :
>> > When limiting phy link speed using "max-speed" to 100mbps or less on a
>> > giga bit phy, phy never completes auto negotiation and phy state
>> > machine is held in PHY_AN. Fixing this issue by comparing the giga
>> > bit advertise though phydev->supported doesn't have it but phy has
>> > BMSR_ESTATEN set. So that auto negotiation is restarted as old and
>> > new advertise are different and link comes up fine.
> This looks valid, however, I am curious why this part of the code:
> 
> oldadv = adv;
> adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4 | ADVERTISE_PAUSE_CAP |
>  ADVERTISE_PAUSE_ASYM);
> adv |= ethtool_adv_to_mii_adv_t(advertise);
> 
> if (adv != oldadv) {
> err = phy_write(phydev, MII_ADVERTISE, adv);
> 
> if (err < 0)
> return err;
> changed = 1;
> }
> 
> Is not flagging this as a change already?
> 

This can flag a change when the selected limit is 10mbps; if 100mbps is
selected, oldadv and adv are the same at this point, so no change is
flagged here.

Regards
Mugunthan V N


net-next headers break user source compatibility

2015-06-25 Thread Stephen Hemminger
Current net-next headers break build of iproute2. Looks like in.h needs the
same libc-compat.h dance as was done in in6.h
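
The kind of guard meant by the "libc-compat.h dance" can be sketched in a
single file. This is a hedged illustration of the technique only: all DEMO_*
macro and enumerator names are hypothetical, and the three sections stand in
for the libc header, a compat header, and the kernel uapi header respectively.

```c
/* --- stand-in for the libc header: defines the enum and says so --- */
#define DEMO_LIBC_HAS_IPPROTO 1
enum {
	DEMO_IPPROTO_IP = 0,
	DEMO_IPPROTO_ICMP = 1,
	DEMO_IPPROTO_TCP = 6,
};

/* --- stand-in for a libc-compat style header --- */
#ifdef DEMO_LIBC_HAS_IPPROTO
#define DEMO_UAPI_DEF_IPPROTO 0	/* libc got there first: suppress ours */
#else
#define DEMO_UAPI_DEF_IPPROTO 1	/* nobody defined it yet: we do */
#endif

/* --- stand-in for the kernel uapi header --- */
#if DEMO_UAPI_DEF_IPPROTO
enum {
	DEMO_IPPROTO_IP = 0,
	DEMO_IPPROTO_ICMP = 1,
	DEMO_IPPROTO_TCP = 6,
};
#endif

/* Either way, exactly one definition is visible to the program. */
static int demo_tcp_proto(void)
{
	return DEMO_IPPROTO_TCP;
}
```

Without the guard, the second enum would trigger the same "redeclaration of
enumerator" errors as in the build output that follows.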



$ gcc -Wall -Wstrict-prototypes  -Wmissing-prototypes -Wmissing-declarations 
-Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES 
-DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  -DHAVE_SETNS 
-DHAVE_ELF -DCONFIG_GACT -DCONFIG_GACT_PROB -DIPT_LIB_DIR=\"/lib/xtables\" 
-DYY_NO_INPUT $(pkg-config xtables --cflags)   -c -o em_ipset.o em_ipset.c
In file included from /usr/include/netdb.h:27:0,
 from em_ipset.c:20:
../include/linux/in.h:26:3: error: redeclaration of enumerator ‘IPPROTO_IP’
   IPPROTO_IP = 0,  /* Dummy protocol for TCP  */
   ^
/usr/include/netinet/in.h:42:5: note: previous definition of ‘IPPROTO_IP’ was 
here
 IPPROTO_IP = 0,/* Dummy protocol for TCP.  */
 ^
../include/linux/in.h:28:3: error: redeclaration of enumerator ‘IPPROTO_ICMP’
   IPPROTO_ICMP = 1,  /* Internet Control Message Protocol */
   ^
/usr/include/netinet/in.h:44:5: note: previous definition of ‘IPPROTO_ICMP’ was 
here
 IPPROTO_ICMP = 1,/* Internet Control Message Protocol.  */
 ^
../include/linux/in.h:30:3: error: redeclaration of enumerator ‘IPPROTO_IGMP’
   IPPROTO_IGMP = 2,  /* Internet Group Management Protocol */
   ^
/usr/include/netinet/in.h:46:5: note: previous definition of ‘IPPROTO_IGMP’ was 
here
 IPPROTO_IGMP = 2,/* Internet Group Management Protocol. */
 ^
../include/linux/in.h:32:3: error: redeclaration of enumerator ‘IPPROTO_IPIP’
   IPPROTO_IPIP = 4,  /* IPIP tunnels (older KA9Q tunnels use 94) */
   ^
/usr/include/netinet/in.h:48:5: note: previous definition of ‘IPPROTO_IPIP’ was 
here
 IPPROTO_IPIP = 4,/* IPIP tunnels (older KA9Q tunnels use 94).  */
 ^
../include/linux/in.h:34:3: error: redeclaration of enumerator ‘IPPROTO_TCP’
   IPPROTO_TCP = 6,  /* Transmission Control Protocol */
   ^
/usr/include/netinet/in.h:50:5: note: previous definition of ‘IPPROTO_TCP’ was 
here
 IPPROTO_TCP = 6,/* Transmission Control Protocol.  */
 ^
../include/linux/in.h:36:3: error: redeclaration of enumerator ‘IPPROTO_EGP’
   IPPROTO_EGP = 8,  /* Exterior Gateway Protocol  */
   ^
/usr/include/netinet/in.h:52:5: note: previous definition of ‘IPPROTO_EGP’ was 
here
 IPPROTO_EGP = 8,/* Exterior Gateway Protocol.  */
 ^
../include/linux/in.h:38:3: error: redeclaration of enumerator ‘IPPROTO_PUP’
   IPPROTO_PUP = 12,  /* PUP protocol*/
   ^
/usr/include/netinet/in.h:54:5: note: previous definition of ‘IPPROTO_PUP’ was 
here
 IPPROTO_PUP = 12,/* PUP protocol.  */
 ^
../include/linux/in.h:40:3: error: redeclaration of enumerator ‘IPPROTO_UDP’
   IPPROTO_UDP = 17,  /* User Datagram Protocol  */
   ^
/usr/include/netinet/in.h:56:5: note: previous definition of ‘IPPROTO_UDP’ was 
here
 IPPROTO_UDP = 17,/* User Datagram Protocol.  */
 ^
../include/linux/in.h:42:3: error: redeclaration of enumerator ‘IPPROTO_IDP’
   IPPROTO_IDP = 22,  /* XNS IDP protocol   */
   ^
/usr/include/netinet/in.h:58:5: note: previous definition of ‘IPPROTO_IDP’ was 
here
 IPPROTO_IDP = 22,/* XNS IDP protocol.  */
 ^
../include/linux/in.h:44:3: error: redeclaration of enumerator ‘IPPROTO_TP’
   IPPROTO_TP = 29,  /* SO Transport Protocol Class 4 */
   ^
/usr/include/netinet/in.h:60:5: note: previous definition of ‘IPPROTO_TP’ was 
here
 IPPROTO_TP = 29,/* SO Transport Protocol Class 4.  */
 ^
../include/linux/in.h:46:3: error: redeclaration of enumerator ‘IPPROTO_DCCP’
   IPPROTO_DCCP = 33,  /* Datagram Congestion Control Protocol */
   ^
/usr/include/netinet/in.h:62:5: note: previous definition of ‘IPPROTO_DCCP’ was 
here
 IPPROTO_DCCP = 33,/* Datagram Congestion Control Protocol.  */
 ^
../include/linux/in.h:48:3: error: redeclaration of enumerator ‘IPPROTO_IPV6’
   IPPROTO_IPV6 = 41,  /* IPv6-in-IPv4 tunnelling  */
   ^
/usr/include/netinet/in.h:64:5: note: previous definition of ‘IPPROTO_IPV6’ was 
here
 IPPROTO_IPV6 = 41, /* IPv6 header.  */
 ^
../include/linux/in.h:50:3: error: redeclaration of enumerator ‘IPPROTO_RSVP’
   IPPROTO_RSVP = 46,  /* RSVP Protocol   */
   ^
/usr/include/netinet/in.h:66:5: note: previous definition of ‘IPPROTO_RSVP’ was 
here
 IPPROTO_RSVP = 46,/* Reservation Protocol.  */
 ^
../include/linux/in.h:52:3: error: redeclaration of enumerator ‘IPPROTO_GRE’
   IPPROTO_GRE = 47,  /* Cisco GRE tunnels (rfc 1701,1702) */
   ^
/usr/include/netinet/in.h:68:5: note: previous definition of ‘IPPROTO_GRE’ was 
here
 IPPROTO_GRE = 47,/* General Routing Encapsulation.  */
 ^
../include/linux/in.h:54:3: error: redeclaration of enumerator ‘IPPROTO_ESP’
   IPPROTO_ESP = 50,  /* Encapsulation Security Payload protocol */
   ^
/usr/include/netinet/in.h:70:5: note: previous definition of ‘IPPROTO_ESP’ was 
here
 IPPROTO_ESP = 50,  /* enca

RE: netdev broken?

2015-06-25 Thread Lukas Tribus
> On 06/25/15 08:03, David Miller wrote:
>
>>
>> I'm pretty sure you're just not subscribed to the list.
>>
>
> I never unsubscribed. Ive resubscribed.

Same thing happened to me at least 3 times in the last
4 - 5 years. Suddenly it appears I'm unsubscribed. I have
no idea how that happens or why. It is however rare.


Lukas



Re: [v2,9/9] fsl/fman: Add FMan MAC driver

2015-06-25 Thread Paul Bolle
(Evolution 3.16 is basically unbearable for replying to patches. Anyone
else running into this?) 

On Wed, 2015-06-24 at 22:37 +0300, igal.liber...@freescale.com wrote:
> 
> --- /dev/null
> +++ b/drivers/net/ethernet/freescale/fman/mac/mac-api.c
> +int set_mac_active_pause(struct mac_device *mac_dev, bool rx, bool tx)
> +{
> + [...]
> +}
> +EXPORT_SYMBOL(set_mac_active_pause);

Which module is using this function?

> +void get_pause_cfg(struct mac_device *mac_dev, bool *rx_pause, bool 
> *tx_pause)
> +{
> + [...]
> +}
> +EXPORT_SYMBOL(get_pause_cfg);

This exports a function that is only used in this file. Why? 

Thanks,


Paul Bolle


Re: [v2,8/9] fsl/fman: Add FMan Port Support

2015-06-25 Thread Paul Bolle
On Wed, 2015-06-24 at 22:37 +0300, igal.liber...@freescale.com wrote:
> --- a/drivers/net/ethernet/freescale/fman/fm_drv.c
> +++ b/drivers/net/ethernet/freescale/fman/fm_drv.c

> +struct fm_port_t *fm_port_drv_handle(const struct fm_port_drv_t *port)
> +{
> + return port->fm_port;
> +}
> +EXPORT_SYMBOL(fm_port_drv_handle);

I couldn't find any users of this function.

> +void fm_port_get_buff_layout_ext_params(struct fm_port_drv_t *port,
> +>>   >   >   >   struct fm_port_params *params)

(Evolution 3.16 is a piece of ...).

> +{
> + params->data_align = 0;
> +}
> +EXPORT_SYMBOL(fm_port_get_buff_layout_ext_params);

Ditto.

> +int fm_get_tx_port_channel(struct fm_port_drv_t *port)
> +{
> + return port->tx_ch;
> +}
> +EXPORT_SYMBOL(fm_get_tx_port_channel);

Ditto.

> --- /dev/null
> +++ b/drivers/net/ethernet/freescale/fman/fm_port_drv.c

> +void fm_set_rx_port_params(struct fm_port_drv_t *port,
> +struct fm_port_params *params)
> +{
> + [...]
> +}
> +EXPORT_SYMBOL(fm_set_rx_port_params);

Ditto.

(If you hear about my arrest for randomly attacking innocent people:
blame evolution 3.16!)

> +void fm_set_tx_port_params(struct fm_port_drv_t *port,
> +struct fm_port_params *params)
> +{
> + [...]
> +}
> +EXPORT_SYMBOL(fm_set_tx_port_params);

Ditto.

> --- /dev/null
> +++ b/drivers/net/ethernet/freescale/fman/port/fm_port.c

> +u64 *fm_port_get_buffer_time_stamp(const struct fm_port_t *p_fm_port,
> +char *p_data)
> +{
> + [...]
> +}
> +EXPORT_SYMBOL(fm_port_get_buffer_time_stamp);

Ditto.

> +int fm_port_disable(struct fm_port_t *p_fm_port)
> +{
> + [...]
> +}
> +EXPORT_SYMBOL(fm_port_disable);

This exports a function that I think is only used inside this file.

> +int fm_port_enable(struct fm_port_t *p_fm_port)
> +{
> + [...]
> +}
> +EXPORT_SYMBOL(fm_port_enable);

And here I could again find no users of this function.

Thanks,


Paul Bolle


[PATCH] Added additional callback to ptp_clock_info:

2015-06-25 Thread Christopher Hall
* getsynctime64()

This takes 2 arguments referring to system and device time

With this callback drivers may provide both system time and device time
to ensure precise correlation

Modified PTP_SYS_OFFSET ioctl in PTP clock driver to use the above
callback if it's available

Signed-off-by: Christopher Hall 
---
 drivers/ptp/ptp_chardev.c| 30 +-
 include/linux/ptp_clock_kernel.h |  3 +++
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index da7bae9..e91f98e 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -124,7 +124,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, 
unsigned long arg)
struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
struct ptp_clock_info *ops = ptp->info;
struct ptp_clock_time *pct;
-   struct timespec64 ts;
+   struct timespec64 ts, systs;
int enable, err = 0;
unsigned int i, pin_index;
 
@@ -196,19 +196,31 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, 
unsigned long arg)
break;
}
pct = &sysoff->ts[0];
-   for (i = 0; i < sysoff->n_samples; i++) {
-   getnstimeofday64(&ts);
+   if (ptp->info->getsynctime64 && sysoff->n_samples == 1) {
+   ptp->info->getsynctime64(ptp->info, &ts, &systs);
+   pct->sec = systs.tv_sec;
+   pct->nsec = systs.tv_nsec;
+   ++pct;
pct->sec = ts.tv_sec;
pct->nsec = ts.tv_nsec;
-   pct++;
-   ptp->info->gettime64(ptp->info, &ts);
+   ++pct;
+   pct->sec = systs.tv_sec;
+   pct->nsec = systs.tv_nsec;
+   } else {
+   for (i = 0; i < sysoff->n_samples; i++) {
+   getnstimeofday64(&ts);
+   pct->sec = ts.tv_sec;
+   pct->nsec = ts.tv_nsec;
+   pct++;
+   ptp->info->gettime64(ptp->info, &ts);
+   pct->sec = ts.tv_sec;
+   pct->nsec = ts.tv_nsec;
+   pct++;
+   }
+   getnstimeofday64(&ts);
pct->sec = ts.tv_sec;
pct->nsec = ts.tv_nsec;
-   pct++;
}
-   getnstimeofday64(&ts);
-   pct->sec = ts.tv_sec;
-   pct->nsec = ts.tv_nsec;
if (copy_to_user((void __user *)arg, sysoff, sizeof(*sysoff)))
err = -EFAULT;
break;
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index b8b7306..edff9a5 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -105,6 +105,9 @@ struct ptp_clock_info {
int (*adjfreq)(struct ptp_clock_info *ptp, s32 delta);
int (*adjtime)(struct ptp_clock_info *ptp, s64 delta);
int (*gettime64)(struct ptp_clock_info *ptp, struct timespec64 *ts);
+   int (*getsynctime64)
+   (struct ptp_clock_info *ptp, struct timespec64 *dev,
+struct timespec64 *sys);
int (*settime64)(struct ptp_clock_info *p, const struct timespec64 *ts);
int (*enable)(struct ptp_clock_info *ptp,
  struct ptp_clock_request *request, int on);
-- 
1.9.1



[PATCH] Add PTP cross-timestamp to the PTP driver interface

2015-06-25 Thread Christopher Hall
This patch allows the driver to capture system time and device time together
(a "cross-timestamp"). Currently, the timestamping is performed in the
PTP_SYS_OFFSET ioctl, which reads system time with getnstimeofday64() and
device time with the gettime64() callback provided by the driver. This
cross-timestamp is best effort: it ignores the latency between the capture
of system time and the capture of device time.

This patch adds a callback, getsynctime64(), for drivers that are able to
perform a more accurate, implementation-specific cross-timestamp. The ioctl
code uses the callback when it is available.

The callback is only called when n_samples == 1, because the driver returns
a single cross-timestamp and multiple samples cannot be chained together.
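
For illustration only, the sketch below shows how a consumer of the
PTP_SYS_OFFSET samples might estimate the device-to-system offset from one
system/device/system triple. The names (struct sample_time,
estimate_offset_ns, sample_window_ns) are hypothetical and are not part of
this patch or of the kernel API; taking the midpoint of the two system
readings is one common approach and is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sample type for this sketch; not the kernel's struct. */
struct sample_time {
	int64_t sec;
	int32_t nsec;
};

static int64_t to_ns(struct sample_time t)
{
	return t.sec * 1000000000LL + t.nsec;
}

/*
 * Estimate (device - system) in nanoseconds from one sys/dev/sys triple,
 * taking the midpoint of the two system readings as the system time
 * that corresponds to the device reading.
 */
int64_t estimate_offset_ns(struct sample_time sys_before,
			   struct sample_time dev,
			   struct sample_time sys_after)
{
	int64_t t1 = to_ns(sys_before);
	int64_t t2 = to_ns(sys_after);

	assert(t2 >= t1);
	return to_ns(dev) - (t1 + (t2 - t1) / 2);
}

/* Width of the window in which the device reading may have been taken. */
int64_t sample_window_ns(struct sample_time sys_before,
			 struct sample_time sys_after)
{
	return to_ns(sys_after) - to_ns(sys_before);
}
```

When the ioctl fills the triple from getsynctime64(), the same system time
is stored before and after the device time, so the window is zero and the
midpoint estimate carries no read-latency uncertainty.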


Christopher Hall (1):
  Added additional callback to ptp_clock_info:

 drivers/ptp/ptp_chardev.c| 30 +-
 include/linux/ptp_clock_kernel.h |  3 +++
 2 files changed, 24 insertions(+), 9 deletions(-)

-- 
1.9.1



Re: [v2,5/9] fsl/fman: Add Frame Manager support

2015-06-25 Thread Paul Bolle
On Wed, 2015-06-24 at 22:35 +0300, igal.liber...@freescale.com wrote:
> --- /dev/null
> +++ b/drivers/net/ethernet/freescale/fman/fm_drv.c

> +u16 fm_get_max_frm(void)
> +{
> + return fsl_fm_max_frm;
> +}
> +EXPORT_SYMBOL(fm_get_max_frm);

Which module is using this export? (And what does this function
actually do?)

> +int fm_get_rx_extra_headroom(void)
> +{
> + return ALIGN(fsl_fm_rx_extra_headroom, 16);
> +}
> +EXPORT_SYMBOL(fm_get_rx_extra_headroom);

This exports an unused function.

I don't know how to, well, review a series that adds almost 20K lines.
So I decided to pick one subject: exports. I think I had something to
comment on all eight of them.

I'm not sure if I'll try another scan with a different subject.

Thanks,


Paul Bolle


[PATCH] - vxlan: gro not effective for intel 82599

2015-06-25 Thread Ramu Ramamurthy

Problem:
---

GRO is enabled on the interfaces in the following test,
but GRO does not take effect for vxlan-encapsulated tcp streams. The root
cause of why GRO does not take effect is described below.

VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|
VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|

Because gro is not effective, the throughput for vxlan-encapsulated
tcp-stream is around 3 Gbps.

With the proposed patch, gro takes effect for vxlan-encapsulated tcp
streams, and performance in the same test is around 8.6 Gbps.


Root Cause:
--


At entry to udp4_gro_receive(), the gro parameters are set as follows:

skb->ip_summed  == 0 (CHECKSUM_NONE)
NAPI_GRO_CB(skb)->csum_cnt == 0
NAPI_GRO_CB(skb)->csum_valid == 0

The UDP header checksum is 0.

static struct sk_buff **udp4_gro_receive(struct sk_buff **head,
 struct sk_buff *skb)
{

 

if (skb_gro_checksum_validate_zero_check(skb, IPPROTO_UDP, uh->check,
 inet_gro_compute_pseudo))


This calls __skb_incr_checksum_unnecessary which sets
skb->ip_summed to  CHECKSUM_UNNECESSARY



goto flush;
else if (uh->check)
skb_gro_checksum_try_convert(skb, IPPROTO_UDP, uh->check,
 inet_gro_compute_pseudo);
skip:
NAPI_GRO_CB(skb)->is_ipv6 = 0;
return udp_gro_receive(head, skb, uh);

}

struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
 struct udphdr *uh)
{
struct udp_offload_priv *uo_priv;
struct sk_buff *p, **pp = NULL;
struct udphdr *uh2;
unsigned int off = skb_gro_offset(skb);
int flush = 1;

if (NAPI_GRO_CB(skb)->udp_mark ||
(skb->ip_summed != CHECKSUM_PARTIAL &&
 NAPI_GRO_CB(skb)->csum_cnt == 0 &&
 !NAPI_GRO_CB(skb)->csum_valid))
goto out;


 vxlan GRO gets skipped due to the above condition, because here:
 skb->ip_summed == CHECKSUM_UNNECESSARY
 NAPI_GRO_CB(skb)->csum_cnt == 0
 NAPI_GRO_CB(skb)->csum_valid == 0


There is no reason for skipping vxlan gro in the above combination of
conditions, because tcp4_gro_receive() validates the inner tcp checksum
anyway!
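
As background on what validating a checksum in software involves, here is a
minimal userspace sketch of the ones'-complement Internet checksum
(RFC 1071). It is not the kernel's implementation, and it leaves out the
pseudo-header that a real TCP checksum also covers; the function name
inet_checksum is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Ones'-complement Internet checksum (RFC 1071) over a byte buffer.
 * Bytes are combined as big-endian 16-bit words; an odd trailing byte
 * is padded with a zero low byte.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

A receiver can run the same routine over the data with the transmitted
checksum included; a result of zero means the data passed the check. The
cost is one pass over every byte, which is the work under discussion here.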


Patch:
--

Signed-off-by: Ramu Ramamurthy 
---
 net/ipv4/udp_offload.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index f938616..17fc12b 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -301,6 +301,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff 
**head, struct sk_buff *skb,


if (NAPI_GRO_CB(skb)->udp_mark ||
(skb->ip_summed != CHECKSUM_PARTIAL &&
+skb->ip_summed != CHECKSUM_UNNECESSARY &&
 NAPI_GRO_CB(skb)->csum_cnt == 0 &&
 !NAPI_GRO_CB(skb)->csum_valid))
goto out;
--
1.7.1





Notes:
---

The above gro fix applies to all udp-encapsulation protocols (vxlan, geneve)







Re: [v2,5/9] fsl/fman: Add Frame Manager support

2015-06-25 Thread Paul Bolle
On Fri, 2015-06-26 at 01:53 +0200, Paul Bolle wrote:
> So I decided to pick one subject: exports. I think I had something to
> comment on all eight of them.

s/eight/twelve/


Paul Bolle


Re: [PATCH] - vxlan: gro not effective for intel 82599

2015-06-25 Thread Tom Herbert
On Thu, Jun 25, 2015 at 5:03 PM, Ramu Ramamurthy
 wrote:
> Problem:
> ---
>
> GRO is enabled on the interfaces in the following test,
> but GRO does not take effect for vxlan-encapsulated tcp streams. The root
> cause of why GRO does not take effect is described below.
>
> VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|
> VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|
>
> Because gro is not effective, the throughput for vxlan-encapsulated
> tcp-stream is around 3 Gbps.
>
> With the proposed patch, gro takes effect for vxlan-encapsulated tcp
> streams,
> and performance in the same test is around 8.6 Gbps.
>
>
> Root Cause:
> --
>
>
> At entry to udp4_gro_receive(), the gro parameters are set as follows:
>
> skb->ip_summed  == 0 (CHECKSUM_NONE)
> NAPI_GRO_CB(skb)->csum_cnt == 0
> NAPI_GRO_CB(skb)->csum_valid == 0
>
> The UDP header checksum is 0.
>
> static struct sk_buff **udp4_gro_receive(struct sk_buff **head,
>  struct sk_buff *skb)
> {
>
>  
>
> if (skb_gro_checksum_validate_zero_check(skb, IPPROTO_UDP,
> uh->check,
>  inet_gro_compute_pseudo))
>
>  This calls __skb_incr_checksum_unnecessary which sets
>  skb->ip_summed to  CHECKSUM_UNNECESSARY

>
> goto flush;
> else if (uh->check)
> skb_gro_checksum_try_convert(skb, IPPROTO_UDP, uh->check,
>  inet_gro_compute_pseudo);
> skip:
> NAPI_GRO_CB(skb)->is_ipv6 = 0;
> return udp_gro_receive(head, skb, uh);
>
> }
>
> struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>  struct udphdr *uh)
> {
> struct udp_offload_priv *uo_priv;
> struct sk_buff *p, **pp = NULL;
> struct udphdr *uh2;
> unsigned int off = skb_gro_offset(skb);
> int flush = 1;
>
> if (NAPI_GRO_CB(skb)->udp_mark ||
> (skb->ip_summed != CHECKSUM_PARTIAL &&
>  NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>  !NAPI_GRO_CB(skb)->csum_valid))
> goto out;


>   vxlan GRO gets skipped due to the above condition, because here:
>   skb->ip_summed == CHECKSUM_UNNECESSARY
>   NAPI_GRO_CB(skb)->csum_cnt == 0
>   NAPI_GRO_CB(skb)->csum_valid == 0
>
>
> There is no reason for skipping vxlan gro in the above combination of
> conditions,
> because, tcp4_gro_receive() validates the inner tcp checksum anyway !
>
>
> Patch:
> --
>
> Signed-off-by: Ramu Ramamurthy 
> ---
>  net/ipv4/udp_offload.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index f938616..17fc12b 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -301,6 +301,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head,
> struct sk_buff *skb,
>
> if (NAPI_GRO_CB(skb)->udp_mark ||
> (skb->ip_summed != CHECKSUM_PARTIAL &&
> +skb->ip_summed != CHECKSUM_UNNECESSARY &&
>  NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>  !NAPI_GRO_CB(skb)->csum_valid))
> goto out;
> --

This isn't right. The CHECKSUM_UNNECESSARY only refers to the outer
checksum which is zero in this case so it is trivially unnecessary.
The inner checksum still needs to be computed on the host. By
convention, we do not do GRO if it is required to compute the inner
checksum (csum_cnt == 0 checks that). If we want to allow checksum
calculation to occur in the GRO path, meaning we understand the
ramifications and can show this is better for performance, then all
the checks about checksum here should be removed.
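
To make the two positions concrete, the following userspace sketch models
the check quoted above, with and without the proposed extra clause. The
struct and function names are hypothetical stand-ins for the skb and
NAPI_GRO_CB(skb) fields, and the enum values are assumed to mirror the
kernel's CHECKSUM_* constants (only CHECKSUM_NONE == 0 is stated in this
thread).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's CHECKSUM_* values. */
enum csum_state {
	CSUM_NONE = 0,
	CSUM_UNNECESSARY = 1,
	CSUM_COMPLETE = 2,
	CSUM_PARTIAL = 3,
};

/* Hypothetical stand-in for the fields the check reads. */
struct gro_state {
	bool udp_mark;
	enum csum_state ip_summed;
	unsigned int csum_cnt;
	bool csum_valid;
};

/* The condition as quoted from udp_gro_receive(): true means "goto out". */
bool gro_skipped_current(const struct gro_state *s)
{
	return s->udp_mark ||
	       (s->ip_summed != CSUM_PARTIAL &&
		s->csum_cnt == 0 &&
		!s->csum_valid);
}

/* The same condition with the extra clause from the proposed patch. */
bool gro_skipped_patched(const struct gro_state *s)
{
	return s->udp_mark ||
	       (s->ip_summed != CSUM_PARTIAL &&
		s->ip_summed != CSUM_UNNECESSARY &&
		s->csum_cnt == 0 &&
		!s->csum_valid);
}
```

For the reported state (CHECKSUM_UNNECESSARY, csum_cnt == 0,
csum_valid == 0) the first predicate skips GRO and the second does not,
which is exactly the case where the inner checksum has not been verified by
the device.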

> 1.7.1
>
>
>
>
>
> Notes:
> ---
>
> The above gro fix applies to all udp-encapsulation protocols (vxlan, geneve)
>
>
>
>


Re: [v2,9/9] fsl/fman: Add FMan MAC driver

2015-06-25 Thread Scott Wood
On Fri, 2015-06-26 at 01:06 +0200, Paul Bolle wrote:
> (Evolution 3.16 is basically unbearable for replying to patches. 
> Anyone
> else running into this?) 

If you mean the crazy lag when selecting moderate-to-large amounts of 
text (for snipping), yes.

-Scott



Re: [PATCH] - vxlan: gro not effective for intel 82599

2015-06-25 Thread Ramu Ramamurthy

On 2015-06-25 17:20, Tom Herbert wrote:
> On Thu, Jun 25, 2015 at 5:03 PM, Ramu Ramamurthy
>  wrote:
>> Problem:
>> ---
>>
>> GRO is enabled on the interfaces in the following test,
>> but GRO does not take effect for vxlan-encapsulated tcp streams. The root
>> cause of why GRO does not take effect is described below.
>>
>> VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|
>> VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|
>>
>> Because gro is not effective, the throughput for vxlan-encapsulated
>> tcp-stream is around 3 Gbps.
>>
>> With the proposed patch, gro takes effect for vxlan-encapsulated tcp
>> streams,
>> and performance in the same test is around 8.6 Gbps.
>>
>>
>> Root Cause:
>> --
>>
>>
>> At entry to udp4_gro_receive(), the gro parameters are set as follows:
>>
>> skb->ip_summed  == 0 (CHECKSUM_NONE)
>> NAPI_GRO_CB(skb)->csum_cnt == 0
>> NAPI_GRO_CB(skb)->csum_valid == 0
>>
>> The UDP header checksum is 0.
>>
>> static struct sk_buff **udp4_gro_receive(struct sk_buff **head,
>>  struct sk_buff *skb)
>> {
>>
>>
>>
>> if (skb_gro_checksum_validate_zero_check(skb, IPPROTO_UDP,
>> uh->check,
>>  inet_gro_compute_pseudo))
>>
>>  This calls __skb_incr_checksum_unnecessary which sets
>>  skb->ip_summed to  CHECKSUM_UNNECESSARY
>>
>> goto flush;
>> else if (uh->check)
>> skb_gro_checksum_try_convert(skb, IPPROTO_UDP, uh->check,
>>  inet_gro_compute_pseudo);
>> skip:
>> NAPI_GRO_CB(skb)->is_ipv6 = 0;
>> return udp_gro_receive(head, skb, uh);
>>
>> }
>>
>> struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb,
>>  struct udphdr *uh)
>> {
>> struct udp_offload_priv *uo_priv;
>> struct sk_buff *p, **pp = NULL;
>> struct udphdr *uh2;
>> unsigned int off = skb_gro_offset(skb);
>> int flush = 1;
>>
>> if (NAPI_GRO_CB(skb)->udp_mark ||
>> (skb->ip_summed != CHECKSUM_PARTIAL &&
>>  NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>  !NAPI_GRO_CB(skb)->csum_valid))
>> goto out;
>>
>>  vxlan GRO gets skipped due to the above condition, because here:
>>  skb->ip_summed == CHECKSUM_UNNECESSARY
>>  NAPI_GRO_CB(skb)->csum_cnt == 0
>>  NAPI_GRO_CB(skb)->csum_valid == 0
>>
>>
>> There is no reason for skipping vxlan gro in the above combination of
>> conditions,
>> because, tcp4_gro_receive() validates the inner tcp checksum anyway !
>>
>>
>> Patch:
>> --
>>
>> Signed-off-by: Ramu Ramamurthy 
>> ---
>>  net/ipv4/udp_offload.c |1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>> index f938616..17fc12b 100644
>> --- a/net/ipv4/udp_offload.c
>> +++ b/net/ipv4/udp_offload.c
>> @@ -301,6 +301,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff **head,
>> struct sk_buff *skb,
>>
>> if (NAPI_GRO_CB(skb)->udp_mark ||
>> (skb->ip_summed != CHECKSUM_PARTIAL &&
>> +skb->ip_summed != CHECKSUM_UNNECESSARY &&
>>  NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>  !NAPI_GRO_CB(skb)->csum_valid))
>> goto out;
>> --
>
> This isn't right. The CHECKSUM_UNNECESSARY only refers to the outer
> checksum which is zero in this case so it is trivially unnecessary.
> The inner checksum still needs to be computed on the host. By
> convention, we do not do GRO if it is required to compute the inner
> checksum (csum_cnt == 0 checks that). If we want to allow checksum
> calculation to occur in the GRO path, meaning we understand the
> ramifications and can show this is better for performance, then all
> the checks about checksum here should be removed.



Isn't the inner checksum computed on the gro path from tcp4_gro_receive(),
as follows?

This trace is from my testbed.

In my tests, I consistently get 8.5-9 Gbps with vxlan gro (in spite of
the added sw inner checksumming), whereas without vxlan GRO the
performance drops down to 3 Gbps or so. So, a significant performance
benefit can be gained on intel 10G nics, which are widely deployed. Hence
the interest in pursuing this or a modified patch.


 vxlan_gro_receive <-udp4_gro_receive
 ksoftirqd/1-94[001] ..s. 11421.420280: __pskb_pull_tail 
<-vxlan_gro_receive
 ksoftirqd/1-94[001] ..s. 11421.420280: skb_copy_bits 
<-__pskb_pull_tail
 ksoftirqd/1-94[001] ..s. 11421.420280: __pskb_pull_tail 
<-vxlan_gro_receive
 ksoftirqd/1-94[001] ..s. 11421.420281: skb_copy_bits 
<-__pskb_pull_tail
 ksoftirqd/1-94[001] ..s. 11421.420281: gro_find_receive_by_type 
<-vxlan_gro_receive
 ksoftirqd/1-94[001] ..s. 11421.420281: inet_gro_receive 
<-vxlan_gro_receive
 ksoftirqd/1-94[001] ..s. 11421.420281: __pskb_pull_tail 
<-inet_gro_receive
 ksoftirqd/1-94[001] ..s. 11421.420281: skb_copy_bits 
<-__pskb_pull_tail
 ksoftirqd/1-94[001] ..s. 11421.420281: tcp4_gro_receive 
<-inet_gro_receive
 ksoftirqd/1-94

Re: Issue with LACP mode support in Linux bonding driver

2015-06-25 Thread Ajith Adapa
Hi,

How can I find the current maintainer of the Linux bonding driver?

Regards,
Ajith

codingfreak.blogspot.com


On 24 June 2015 at 15:25, Ajith Adapa  wrote:
> Hi,
>
> I am using Centos7 with v3.10 kernel.
>
> My issue is related to multiaggregation in LACP.
>
> In my setup topology I connected linux server and a L2 switch back to
> back with 2 interfaces eth0 and eth1 (on both sides).
>
> On Switch I have mapped eth0 to po1 and eth1 to po2. On linux server I
> have created a single bond interface with both interfaces eth0 and
> eth1.
>
> On L2 switch both po1 and po2 have the same system-id but the Actor key is
> different, i.e. on PO1 it is 16385 and on PO2 it is 32768. As per the
> information available about bond0 on linux server, which is given below,
> Active aggregator ID is 1, which is mapped to eth0 but not eth1.
>
> But we have observed that eth1 on linux server is also sending LACPDUs
> with the Collecting/Distributing bit set to 1. This results in the single
> bond interface on the Linux server being split into multiple port-channels
> on the Switch, causing duplication of frames on the linux server.
>
> As per IEEE 802.3ad 2003 standard portchannel should not support
> multi-aggregation.
>
>
> #  cat /proc/net/bonding/bond0
> Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
>
> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
> Transmit Hash Policy: layer2 (0)
> MII Status: up
> MII Polling Interval (ms): 100
> Up Delay (ms): 0
> Down Delay (ms): 0
>
> 802.3ad info
> LACP rate: fast
> Min links: 0
> Aggregator selection policy (ad_select): stable
> Active Aggregator Info:
> Aggregator ID: 1
> Number of ports: 1
> Actor Key: 17
> Partner Key: 16385
> Partner Mac Address: 11:11:22:22:33:33
>
> Slave Interface: eth0
> MII Status: up
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: 08:00:27:18:ae:4b
> Aggregator ID: 1
> Slave queue ID: 0
>
> Slave Interface: eth1
> MII Status: up
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: 08:00:27:76:35:a2
> Aggregator ID: 3
> Slave queue ID: 0
>
>
> Regards,
> Ajith


Re: [v2,1/9] fsl/fman: Add the FMan FLIB

2015-06-25 Thread Scott Wood
On Wed, 2015-06-24 at 22:33 +0300, igal.liber...@freescale.com wrote:
> From: Igal Liberman 
> 
> The FMan FLib provides the basic API used by the FMan drivers to
> configure and control the FMan hardware.
> 
> Signed-off-by: Igal Liberman 

Again, what is an FLib?  What determines whether content should go in 
the "flib" directory?

The patch title says "Add the FMan FLIB", but there's more code added 
outside the "flib" directory than inside.
 
"FMan drivers"?  There's more than one?  What is 
"drivers/net/ethernet/freescale/fman/fman.c" if not "the FMan driver"? 
 What is "the FMan driver" if not the code "to configure and control 
the FMan hardware"?  If this is a public API, where's the 
documentation?

> ---
>  drivers/net/ethernet/freescale/Kconfig |1 +
>  drivers/net/ethernet/freescale/Makefile|2 +
>  drivers/net/ethernet/freescale/fman/Kconfig|7 +
>  drivers/net/ethernet/freescale/fman/Makefile   |5 +
>  .../net/ethernet/freescale/fman/flib/fsl_fman.h|  608 
>  drivers/net/ethernet/freescale/fman/fman.c |  975 
>  6 files changed, 1598 insertions(+)
>  create mode 100644 drivers/net/ethernet/freescale/fman/Kconfig
>  create mode 100644 drivers/net/ethernet/freescale/fman/Makefile
>  create mode 100644 
> drivers/net/ethernet/freescale/fman/flib/fsl_fman.h
>  create mode 100644 drivers/net/ethernet/freescale/fman/fman.c
> 
> diff --git a/drivers/net/ethernet/freescale/Kconfig 
> b/drivers/net/ethernet/freescale/Kconfig
> index 25e3425..24e938d 100644
> --- a/drivers/net/ethernet/freescale/Kconfig
> +++ b/drivers/net/ethernet/freescale/Kconfig
> @@ -55,6 +55,7 @@ config FEC_MPC52xx_MDIO
> If compiled as module, it will be called fec_mpc52xx_phy.
>  
>  source "drivers/net/ethernet/freescale/fs_enet/Kconfig"
> +source "drivers/net/ethernet/freescale/fman/Kconfig"
>  
>  config FSL_PQ_MDIO
>   tristate "Freescale PQ MDIO"
> diff --git a/drivers/net/ethernet/freescale/Makefile 
> b/drivers/net/ethernet/freescale/Makefile
> index 71debd1..4097c58 100644
> --- a/drivers/net/ethernet/freescale/Makefile
> +++ b/drivers/net/ethernet/freescale/Makefile
> @@ -17,3 +17,5 @@ gianfar_driver-objs := gianfar.o \
>   gianfar_ethtool.o
>  obj-$(CONFIG_UCC_GETH) += ucc_geth_driver.o
>  ucc_geth_driver-objs := ucc_geth.o ucc_geth_ethtool.o
> +
> +obj-$(CONFIG_FSL_FMAN) += fman/
> diff --git a/drivers/net/ethernet/freescale/fman/Kconfig 
> b/drivers/net/ethernet/freescale/fman/Kconfig
> new file mode 100644
> index 000..8aeae29
> --- /dev/null
> +++ b/drivers/net/ethernet/freescale/fman/Kconfig
> @@ -0,0 +1,7 @@
> +config FSL_FMAN
> + bool "FMan support"
> + depends on FSL_SOC || COMPILE_TEST
> + default n
> + help
> + Freescale Data-Path Acceleration Architecture Frame Manager
> + (FMan) support

"default n" is a no-op.

What does enabling this option actually do, in terms of user-visible 
features?

> +typedef struct fm_prs_result_t fm_prs_result;
> +typedef enum e_enet_mode enet_mode_t;

This use of typedef is contrary to kernel coding style.

-Scott


Re: [v2,3/9] fsl/fman: Add the FMan MAC FLIB

2015-06-25 Thread Scott Wood
On Wed, 2015-06-24 at 22:34 +0300, igal.liber...@freescale.com wrote:
> From: Igal Liberman 
> 
> The FMan MAC FLib provides basic API used by the drivers to
> configure and control the FMan MAC hardware.
> 
> Signed-off-by: Igal Liberman 
...
> +int fman_dtsec_mii_write_reg(struct dtsec_mii_reg __iomem *regs, uint8_t addr,
> +  uint8_t reg, uint16_t data, uint16_t dtsec_freq)
> +{
> + uint32_t tmp;
> +
> + /* Setup the MII Mgmt clock speed */
> + iowrite32be((uint32_t)dtsec_mii_get_div(dtsec_freq), &regs->miimcfg);
> + /* Memory barrier */
> + wmb();
> +
> + /* Stop the MII management read cycle */
> + iowrite32be(0, &regs->miimcom);
> + /* Dummy read to make sure MIIMCOM is written */
> + tmp = ioread32be(&regs->miimcom);
> + /* Memory barrier */
> + wmb();
> +
> + /* Setting up MII Management Address Register */
> + tmp = (uint32_t)((addr << MIIMADD_PHY_ADDR_SHIFT) | reg);
> + iowrite32be(tmp, &regs->miimadd);
> + /* Memory barrier */
> + wmb();
> +
> + /* Setting up MII Management Control Register with data */
> + iowrite32be((uint32_t)data, &regs->miimcon);
> + /* Dummy read to make sure MIIMCON is written */
> + tmp = ioread32be(&regs->miimcon);
> + /* Memory barrier */
> + wmb();

iowrite32be() should already contain a memory barrier.

> +
> + /* Wait until MII management write is complete */
> + /* todo: a timeout could be useful here */
> + while ((ioread32be(&regs->miimind)) & MIIMIND_BUSY)
> + ; /* busy wait */
> +
> + return 0;
> +}

Please add the timeout.

> + /* Read MII management status  */
> + *data = (uint16_t)ioread32be(&regs->miimstat);

Unnecessary cast (please check for these throughout the patchset).

There are also casts in this patchset that are only needed because a variable 
was unnecessarily defined with a smaller-than-32-bit data type.
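
A small userspace sketch of both points, under stated assumptions:
fake_read32() is a hypothetical stand-in for a 32-bit register read such as
ioread32be(), and PHY_ADDR_SHIFT is an illustrative value, not the driver's
MIIMADD_PHY_ADDR_SHIFT.

```c
#include <assert.h>
#include <stdint.h>

#define PHY_ADDR_SHIFT 8	/* illustrative value for this sketch only */

/* Hypothetical stand-in for a 32-bit register read. */
static uint32_t fake_read32(void)
{
	return 0x0001abcdu;
}

void read_status(uint16_t *data)
{
	/* The assignment already converts to 16 bits; no cast is needed. */
	*data = fake_read32();
}

/* With 32-bit parameters the expression needs no cast at all. */
uint32_t mii_address(uint32_t phy_addr, uint32_t reg)
{
	return (phy_addr << PHY_ADDR_SHIFT) | reg;
}
```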


> +void fman_memac_reset(struct memac_regs __iomem *regs)
> +{
> + uint32_t tmp;
> +
> + tmp = ioread32be(&regs->command_config);
> +
> + tmp |= CMD_CFG_SW_RESET;
> +
> + iowrite32be(tmp, &regs->command_config);
> +
> + while (ioread32be(&regs->command_config) & CMD_CFG_SW_RESET)
> + ;
> +}

Timeout, here and in all such loops.
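
One possible shape for such a bounded wait, as a userspace sketch rather
than driver code. BUSY_BIT, wait_not_busy() and fake_busy_reg() are
hypothetical names for this example; a driver would read the real register
with ioread32be() and delay between reads.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define BUSY_BIT 0x1u	/* hypothetical busy flag for this sketch */

/*
 * Poll until BUSY_BIT clears or max_polls reads have been made.
 * Returns 0 on success and -ETIMEDOUT if the bit never cleared.
 */
int wait_not_busy(uint32_t (*read_reg)(void *), void *ctx,
		  unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (!(read_reg(ctx) & BUSY_BIT))
			return 0;
		/* A driver would delay here between reads. */
	}
	return -ETIMEDOUT;
}

/* Simulated device: reports busy for the first *remaining reads. */
uint32_t fake_busy_reg(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return BUSY_BIT;
	}
	return 0;
}
```

The caller can then propagate the error instead of hanging if the hardware
never clears the bit.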

-Scott



Re: [v2,4/9] fsl/fman: Add FMan MURAM support

2015-06-25 Thread Scott Wood
On Wed, 2015-06-24 at 22:34 +0300, igal.liber...@freescale.com wrote:
> + struct muram_info *p_muram;

No Hungarian notation.

> +void fm_muram_free(struct muram_info *p_muram)
> +{
> + /* Destroy pool */
> + gen_pool_destroy(p_muram->pool);
> + /* Unmap memory */
> + iounmap(p_muram->vbase);
> + /* Free pointer */
> + kfree(p_muram);
> +}

This type of commenting is not useful.

> + memset_io((void __iomem *)vaddr, 0, (int)size);

Unnecessary cast of size.

-Scott



Re: Issue with LACP mode in linux bonding driver

2015-06-25 Thread Jay Vosburgh
Ajith Adapa  wrote:

>Hi,
>
>Sorry for direct mail. Since the question is more specific about
>supporting LACP standard I decided to communicate directly with the
>MAINTAINERS. My issue is related to multiaggregation support in LACP.

I saw your message this morning, but didn't have an opportunity
to look into it today.

>Linux Flavour: Centos7.
>Setup topology: Back to back connected Linux server and a L2 switch
>with 2 interfaces eth0 and eth1 (on both sides).
>
>On Switch I have mapped eth0 to po1 and eth1 to po2. On Linux server I
>have created a single bond interface with both interfaces eth0 and
>eth1.
>
>On switch both po1 and po2 have the same system-id but the Actor key is
>different, i.e. on PO1 it is 16385 and on PO2 it is 32768. As per the
>information available regarding bond0 on Linux server, which is given below,
>Active aggregator ID is 1, which is mapped to eth0.
>
>But we have observed that eth1 on Linux server is also sending LACPDUs
>with the Collecting/Distributing bit set to 1. This results in the single
>bond interface on the Linux server being split into multiple port-channels
>on the Switch, causing duplication of frames on the Linux server.

I'd suggest enabling the dynamic_debug for the bonding driver
and observe the state machine activity within the 802.3ad code.  This is
described in the Documentation/dynamic-debug-howto.txt that is part of
the kernel source; off the top of my head, I think you'll need something
like:

echo 'module bonding =p' > /sys/kernel/debug/dynamic_debug/control

This should put the bonding LACP state machine activity into the
system log.

If a port on a non-active aggregator is actually in collecting /
distributing state, that is probably bad, as I'd only expect that to be
true for ports in the active aggregator.
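
As a rough illustration of that check, the sketch below flags a port that
sits outside the active aggregator yet still advertises collecting or
distributing. The struct and function are hypothetical and are not taken
from the bonding driver; the bit values follow the LACPDU actor state
layout in IEEE 802.3ad.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Actor state bits as laid out in the LACPDU (IEEE 802.3ad). */
#define LACP_STATE_COLLECTING   0x10
#define LACP_STATE_DISTRIBUTING 0x20

/* Hypothetical per-port summary for this sketch. */
struct port_view {
	int aggregator_id;
	uint8_t actor_state;
};

/*
 * True when a port outside the active aggregator still advertises
 * collecting or distributing.
 */
bool port_state_suspicious(const struct port_view *p, int active_agg_id)
{
	uint8_t cd = LACP_STATE_COLLECTING | LACP_STATE_DISTRIBUTING;

	return p->aggregator_id != active_agg_id &&
	       (p->actor_state & cd) != 0;
}
```

With the aggregator IDs from the report (eth0 in aggregator 1, eth1 in
aggregator 3, active aggregator 1), eth1 would be flagged if its LACPDUs
carry either bit.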

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com


Re: [v2,9/9] fsl/fman: Add FMan MAC driver

2015-06-25 Thread Michael Ellerman
On Thu, 2015-06-25 at 19:59 -0500, Scott Wood wrote:
> On Fri, 2015-06-26 at 01:06 +0200, Paul Bolle wrote:
> > (Evolution 3.16 is basically unbearable for replying to patches. 
> > Anyone
> > else running into this?) 
> 
> If you mean the crazy lag when selecting moderate-to-large amounts of 
> text (for snipping), yes.

I recommend the external editor plugin with vim.

cheers




Re: [v2,9/9] fsl/fman: Add FMan MAC driver

2015-06-25 Thread Scott Wood
On Fri, 2015-06-26 at 12:21 +1000, Michael Ellerman wrote:
> On Thu, 2015-06-25 at 19:59 -0500, Scott Wood wrote:
> > On Fri, 2015-06-26 at 01:06 +0200, Paul Bolle wrote:
> > > (Evolution 3.16 is basically unbearable for replying to patches. 
> > > Anyone
> > > else running into this?) 
> > 
> > If you mean the crazy lag when selecting moderate-to-large amounts of 
> > text (for snipping), yes.
> 
> I recommend the external editor plugin with vim.

I tried the external editor plugin (not with vim) and it failed to bring the 
externally made edits back into the evolution compose window.  It then 
started spastically respawning the external editor without my doing anything.

-Scott




Re: [PATCH] - vxlan: gro not effective for intel 82599

2015-06-25 Thread Tom Herbert
On Thu, Jun 25, 2015 at 6:06 PM, Ramu Ramamurthy
 wrote:
> On 2015-06-25 17:20, Tom Herbert wrote:
>>
>> On Thu, Jun 25, 2015 at 5:03 PM, Ramu Ramamurthy
>>  wrote:
>>>
>>> Problem:
>>> ---
>>>
>>> GRO is enabled on the interfaces in the following test,
>>> but GRO does not take effect for vxlan-encapsulated tcp streams. The root
>>> cause of why GRO does not take effect is described below.
>>>
>>> VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|
>>> VM nic (mtu 1450)---bridge---vxlan10Gb nic (intel 82599ES)-|
>>>
>>> Because gro is not effective, the throughput for vxlan-encapsulated
>>> tcp-stream is around 3 Gbps.
>>>
>>> With the proposed patch, gro takes effect for vxlan-encapsulated tcp
>>> streams,
>>> and performance in the same test is around 8.6 Gbps.
>>>
>>>
>>> Root Cause:
>>> --
>>>
>>>
>>> At entry to udp4_gro_receive(), the gro parameters are set as follows:
>>>
>>> skb->ip_summed  == 0 (CHECKSUM_NONE)
>>> NAPI_GRO_CB(skb)->csum_cnt == 0
>>> NAPI_GRO_CB(skb)->csum_valid == 0
>>>
>>> UDP header checksum is 0.
>>>
>>> static struct sk_buff **udp4_gro_receive(struct sk_buff **head,
>>>  struct sk_buff *skb)
>>> {
>>>
>>>  
>>>
>>> if (skb_gro_checksum_validate_zero_check(skb, IPPROTO_UDP,
>>> uh->check,
>>>
>>> inet_gro_compute_pseudo))
>>>
>> This calls __skb_incr_checksum_unnecessary which sets
>> skb->ip_summed to  CHECKSUM_UNNECESSARY
>>
>>>
>>> goto flush;
>>> else if (uh->check)
>>> skb_gro_checksum_try_convert(skb, IPPROTO_UDP, uh->check,
>>>  inet_gro_compute_pseudo);
>>> skip:
>>> NAPI_GRO_CB(skb)->is_ipv6 = 0;
>>> return udp_gro_receive(head, skb, uh);
>>>
>>> }
>>>
>>> struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff
>>> *skb,
>>>  struct udphdr *uh)
>>> {
>>> struct udp_offload_priv *uo_priv;
>>> struct sk_buff *p, **pp = NULL;
>>> struct udphdr *uh2;
>>> unsigned int off = skb_gro_offset(skb);
>>> int flush = 1;
>>>
>>> if (NAPI_GRO_CB(skb)->udp_mark ||
>>> (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>  NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>  !NAPI_GRO_CB(skb)->csum_valid))
>>> goto out;
>>
>>
>>
>>  vxlan GRO gets skipped due to the above condition because here:
>>  skb->ip_summed == CHECKSUM_UNNECESSARY
>>  NAPI_GRO_CB(skb)->csum_cnt == 0
>>  NAPI_GRO_CB(skb)->csum_valid == 0
>>>
>>>
>>>
>>> There is no reason for skipping vxlan gro in the above combination of
>>> conditions,
>>> because, tcp4_gro_receive() validates the inner tcp checksum anyway !
>>>
>>>
>>> Patch:
>>> --
>>>
>>> Signed-off-by: Ramu Ramamurthy 
>>> ---
>>>  net/ipv4/udp_offload.c |1 +
>>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>>> index f938616..17fc12b 100644
>>> --- a/net/ipv4/udp_offload.c
>>> +++ b/net/ipv4/udp_offload.c
>>> @@ -301,6 +301,7 @@ struct sk_buff **udp_gro_receive(struct sk_buff
>>> **head,
>>> struct sk_buff *skb,
>>>
>>> if (NAPI_GRO_CB(skb)->udp_mark ||
>>> (skb->ip_summed != CHECKSUM_PARTIAL &&
>>> +skb->ip_summed != CHECKSUM_UNNECESSARY &&
>>>  NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>  !NAPI_GRO_CB(skb)->csum_valid))
>>> goto out;
>>> --
>>
>>
>> This isn't right. The CHECKSUM_UNNECESSARY only refers to the outer
>> checksum which is zero in this case so it is trivially unnecessary.
>> The inner checksum still needs to be computed on the host. By
>> convention, we do not do GRO if it is required to compute the inner
>> checksum (csum_cnt == 0 checks that). If we want to allow checksum
>> calculation to occur in the GRO path, meaning we understand the
>> ramifications and can show this is better for performance, then all
>> the checks about checksum here should be removed.
>>
>
> Isn't the inner checksum computed on the GRO path from tcp4_gro_receive() as
> follows?
> This trace is from my testbed.
>
> In my tests, I consistently get 8.5-9 Gbps with vxlan GRO (in spite of
> the added software inner checksumming), whereas without vxlan GRO the
> performance drops to 3 Gbps or so. So a significant performance benefit
> can be gained on Intel 10G NICs, which are widely deployed. Hence the
> interest in pursuing this or a modified patch.
>
That may be, but this change would affect all uses of GRO with UDP
encapsulation not just for intel 10G NICs. For instance, pushing a lot
of checksum calculation into the napi for a single queue device could
overwhelm the corresponding CPU-- this is the motivation for the
restriction in the first place. We need to do a little more diligence
here.
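
To make the condition under discussion concrete, here is a small standalone model of the test at the top of udp_gro_receive() as quoted above. It is only a sketch: the struct is a made-up flattened view of the skb and NAPI_GRO_CB fields, and apart from CHECKSUM_NONE == 0 (stated in the report) the enum values are placeholders.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's checksum states. Only NONE == 0 is
 * taken from the thread; the remaining values are arbitrary here. */
enum csum_state { CSUM_NONE = 0, CSUM_UNNECESSARY, CSUM_COMPLETE, CSUM_PARTIAL };

/* Hypothetical flattened view of the fields the check looks at. */
struct gro_view {
    bool            udp_mark;
    enum csum_state ip_summed;
    int             csum_cnt;
    bool            csum_valid;
};

/* Mirrors the unpatched condition: GRO is skipped unless the checksum is
 * CHECKSUM_PARTIAL, or a verified checksum is available for the inner
 * packet (csum_cnt != 0 or csum_valid). */
bool udp_gro_is_skipped(const struct gro_view *v)
{
    return v->udp_mark ||
           (v->ip_summed != CSUM_PARTIAL &&
            v->csum_cnt == 0 &&
            !v->csum_valid);
}
```

For the vxlan case in the report (CHECKSUM_UNNECESSARY from the zero outer checksum, csum_cnt == 0, csum_valid == 0) the function returns true, so GRO is skipped.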

C

Re: [net PATCH 1/1] net: phy: fix phy link up when limiting speed via device tree

2015-06-25 Thread Florian Fainelli
Le 06/25/15 09:51, Mugunthan V N a écrit :
> When limiting the phy link speed to 100 Mbps or less via "max-speed" on a
> gigabit phy, the phy never completes auto-negotiation and the phy state
> machine is held in PHY_AN. Fix this by comparing the gigabit
> advertisement even when phydev->supported lacks the gigabit modes but
> the phy has BMSR_ESTATEN set, so that auto-negotiation is restarted when
> the old and new advertisements differ and the link comes up fine.
> 
> Signed-off-by: Mugunthan V N 

Reviewed-by: Florian Fainelli 

> ---
>  drivers/net/phy/phy_device.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index bdfe51f..d551df6 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -796,10 +796,11 @@ static int genphy_config_advert(struct phy_device 
> *phydev)
>   if (phydev->supported & (SUPPORTED_1000baseT_Half |
>SUPPORTED_1000baseT_Full)) {
>   adv |= ethtool_adv_to_mii_ctrl1000_t(advertise);
> - if (adv != oldadv)
> - changed = 1;
>   }
>  
> + if (adv != oldadv)
> + changed = 1;
> +
>   err = phy_write(phydev, MII_CTRL1000, adv);
>   if (err < 0)
>   return err;
> 
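
A standalone sketch of what moving the comparison changes. The two bit values are the MII_CTRL1000 advertisement bits from linux/mii.h; everything else is an assumption for illustration, in particular that the gigabit bits are cleared from the register value before the supported check, which the quoted hunk does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* MII_CTRL1000 advertisement bits, as defined in linux/mii.h. */
#define ADVERTISE_1000HALF 0x0100
#define ADVERTISE_1000FULL 0x0200

/* Before the patch: the comparison only runs when the supported mask
 * still contains a 1000BASE-T mode. */
bool advert_changed_old(bool supports_gbit, uint16_t oldadv, uint16_t wanted)
{
    uint16_t adv = oldadv & ~(ADVERTISE_1000HALF | ADVERTISE_1000FULL);
    bool changed = false;

    if (supports_gbit) {
        adv |= wanted;
        if (adv != oldadv)
            changed = true;
    }
    return changed;
}

/* After the patch: the comparison runs unconditionally, so stale gigabit
 * bits left in the register are noticed even when gigabit is masked out. */
bool advert_changed_new(bool supports_gbit, uint16_t oldadv, uint16_t wanted)
{
    uint16_t adv = oldadv & ~(ADVERTISE_1000HALF | ADVERTISE_1000FULL);

    if (supports_gbit)
        adv |= wanted;
    return adv != oldadv;
}
```

With max-speed limited to 100 Mbps (supports_gbit false) and a register that still advertises 1000BASE-T, the old variant reports no change while the new one does, which is what restarts auto-negotiation.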


-- 
Florian


Re: Issue with LACP mode support in Linux bonding driver

2015-06-25 Thread Mahesh Bandewar
On Wed, Jun 24, 2015 at 2:55 AM, Ajith Adapa  wrote:
> Hi,
>
> I am using Centos7 with v3.10 kernel.
>
You are using a very old kernel. Have you tried it with the latest kernel?

> My issue is related to multiaggregation in LACP.
>
> In my setup topology I connected linux server and a L2 switch back to
> back with 2 interfaces eth0 and eth1 (on both sides).
>
> On Switch I have mapped eth0 to po1 and eth1 to po2. On linux server I
> have created a single bond interface with both interfaces eth0 and
> eth1.
>
> On the L2 switch, po1 and po2 have the same system-id but different
> Actor keys: 16385 on po1 and 32768 on po2. According to the bond0
> information from the Linux server, given below, the active aggregator
> ID is 1, which is mapped to eth0 but not eth1.
>
> However, we have observed that eth1 on the Linux server is also sending
> LACPDUs with the Collecting/Distributing bits set to 1. As a result, the
> single bond interface on the Linux server is split across multiple
> port-channels on the switch, causing duplicate frames on the Linux server.
>
what happens if you do link-down/up on the nic that is not part of the bond?

> As per IEEE 802.3ad 2003 standard portchannel should not support
> multi-aggregation.
>
>
> #  cat /proc/net/bonding/bond0
> Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
>
> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
> Transmit Hash Policy: layer2 (0)
> MII Status: up
> MII Polling Interval (ms): 100
> Up Delay (ms): 0
> Down Delay (ms): 0
>
> 802.3ad info
> LACP rate: fast
> Min links: 0
> Aggregator selection policy (ad_select): stable
> Active Aggregator Info:
> Aggregator ID: 1
> Number of ports: 1
> Actor Key: 17
> Partner Key: 16385
> Partner Mac Address: 11:11:22:22:33:33
>
> Slave Interface: eth0
> MII Status: up
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: 08:00:27:18:ae:4b
> Aggregator ID: 1
> Slave queue ID: 0
>
> Slave Interface: eth1
> MII Status: up
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: 08:00:27:76:35:a2
> Aggregator ID: 3
> Slave queue ID: 0
>
>
> Regards,
> Ajith


[PATCH net] uapi: fix compatability of linux/in.h with netinet/in.h

2015-06-25 Thread Stephen Hemminger
This fixes breakage to iproute2 build with recent kernel headers
caused by:
   commit a263653ed798216c0069922d7b5237ca49436007
   Author: Pablo Neira Ayuso 
   Date:   Wed Jun 17 10:28:27 2015 -0500

   netfilter: don't pull include/linux/netfilter.h from netns headers

The issue is that definitions in linux/in.h overlap with those
in netinet/in.h. This patch solves this by introducing the same
mechanism as was used to solve the same problem with linux/in6.h

Signed-off-by: Stephen Hemminger 

--- a/include/uapi/linux/in.h   2015-06-25 10:55:18.142933079 -0400
+++ b/include/uapi/linux/in.h   2015-06-25 22:29:58.035452504 -0400
@@ -19,8 +19,10 @@
 #define _UAPI_LINUX_IN_H
 
 #include <linux/types.h>
+#include <linux/libc-compat.h>
 #include <linux/socket.h>
 
+#if __UAPI_DEF_IN_IPPROTO
 /* Standard well-defined IP protocols.  */
 enum {
   IPPROTO_IP = 0,  /* Dummy protocol for TCP   */
@@ -75,12 +77,14 @@ enum {
 #define IPPROTO_RAW            IPPROTO_RAW
   IPPROTO_MAX
 };
+#endif
 
-
+#if __UAPI_DEF_IN_ADDR
 /* Internet address. */
 struct in_addr {
__be32  s_addr;
 };
+#endif
 
 #define IP_TOS 1
 #define IP_TTL 2
@@ -158,6 +162,7 @@ struct in_addr {
 
 /* Request struct for multicast socket ops */
 
+#if __UAPI_DEF_IP_MREQ
 struct ip_mreq  {
struct in_addr imr_multiaddr;   /* IP multicast address of group */
struct in_addr imr_interface;   /* local IP address of interface */
@@ -209,14 +214,18 @@ struct group_filter {
 #define GROUP_FILTER_SIZE(numsrc) \
        (sizeof(struct group_filter) - sizeof(struct __kernel_sockaddr_storage) \
        + (numsrc) * sizeof(struct __kernel_sockaddr_storage))
+#endif
 
+#if __UAPI_DEF_IN_PKTINFO
 struct in_pktinfo {
int ipi_ifindex;
struct in_addr  ipi_spec_dst;
struct in_addr  ipi_addr;
 };
+#endif
 
 /* Structure describing an Internet (IP) socket address. */
+#if  __UAPI_DEF_SOCKADDR_IN
 #define __SOCK_SIZE__  16  /* sizeof(struct sockaddr)  */
 struct sockaddr_in {
   __kernel_sa_family_t sin_family; /* Address family   */
@@ -228,8 +237,9 @@ struct sockaddr_in {
sizeof(unsigned short int) - sizeof(struct in_addr)];
 };
 #define sin_zero   __pad   /* for BSD UNIX comp. -FvK  */
+#endif
 
-
+#if __UAPI_DEF_IN_CLASS
 /*
  * Definitions of the bits in an Internet address integer.
  * On subnets, host and network parts are found according
@@ -280,7 +290,7 @@ struct sockaddr_in {
 #define INADDR_ALLHOSTS_GROUP   0xe0000001U /* 224.0.0.1   */
 #define INADDR_ALLRTRS_GROUP    0xe0000002U /* 224.0.0.2   */
 #define INADDR_MAX_LOCAL_GROUP  0xe00000ffU /* 224.0.0.255 */
-
+#endif
 
 /* <asm/byteorder.h> contains the htonl type stuff.. */
 #include <asm/byteorder.h>
--- a/include/uapi/linux/libc-compat.h  2015-01-27 06:16:51.364032627 -0500
+++ b/include/uapi/linux/libc-compat.h  2015-06-25 22:30:23.871453196 -0400
@@ -56,6 +56,13 @@
 
 /* GLIBC headers included first so don't define anything
  * that would already be defined. */
+#define __UAPI_DEF_IN_ADDR 0
+#define __UAPI_DEF_IN_IPPROTO  0
+#define __UAPI_DEF_IN_PKTINFO  0
+#define __UAPI_DEF_IP_MREQ 0
+#define __UAPI_DEF_SOCKADDR_IN 0
+#define __UAPI_DEF_IN_CLASS    0
+
 #define __UAPI_DEF_IN6_ADDR    0
 /* The exception is the in6_addr macros which must be defined
  * if the glibc code didn't define them. This guard matches
@@ -78,6 +85,13 @@
 /* Linux headers included first, and we must define everything
  * we need. The expectation is that glibc will check the
  * __UAPI_DEF_* defines and adjust appropriately. */
+#define __UAPI_DEF_IN_ADDR 1
+#define __UAPI_DEF_IN_IPPROTO  1
+#define __UAPI_DEF_IN_PKTINFO  1
+#define __UAPI_DEF_IP_MREQ 1
+#define __UAPI_DEF_SOCKADDR_IN 1
+#define __UAPI_DEF_IN_CLASS    1
+
 #define __UAPI_DEF_IN6_ADDR    1
 /* We unconditionally define the in6_addr macros and glibc must
  * coordinate. */
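
For readers unfamiliar with the libc-compat mechanism, the guard pattern can be modelled in a few lines. This is a toy, not the real headers: every DEMO_ name below is invented, and DEMO_LIBC_IN_H_SEEN merely stands in for "the libc header was included first".

```c
/* Stand-in for the libc header having been included first. In the real
 * headers this role is played by a libc include guard. Hypothetical name. */
#define DEMO_LIBC_IN_H_SEEN 1

#if DEMO_LIBC_IN_H_SEEN
/* Pretend this definition came from the libc header. */
struct demo_in_addr { unsigned int addr; };
/* libc already provided the type, so the uapi copy must stay out. */
#define DEMO_UAPI_DEF_IN_ADDR 0
#else
/* Kernel header first: it has to provide everything itself. */
#define DEMO_UAPI_DEF_IN_ADDR 1
#endif

/* The uapi header wraps each overlapping definition in its guard, so the
 * struct is defined exactly once whichever header came first. */
#if DEMO_UAPI_DEF_IN_ADDR
struct demo_in_addr { unsigned int addr; };
#endif

/* Uses the type to show that exactly one definition is visible. */
unsigned int demo_addr_value(void)
{
    struct demo_in_addr a = { 0x7f000001u };

    return a.addr;
}
```

Setting DEMO_LIBC_IN_H_SEEN to 0 flips which branch supplies the struct; in both cases the file compiles with a single definition.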


Re: [PATCH iproute2] ss: add support for segs_in and segs_out

2015-06-25 Thread Stephen Hemminger
On Tue, 26 May 2015 14:54:41 -0400
Craig Gallek  wrote:

> Two new tcp_info fields: tcpi_segs_in and tcpi_segs_out.
> (2efd055c53c06b7e89c167c98069bab9afce7e59)
> 
> ~: ss -ti src :22
>cubic wscale:7,6 rto:201 rtt:0.244/0.012 ato:40 mss:1418 cwnd:21 
> bytes_acked:80605 bytes_received:20491 segs_out:414 segs_in:600 send 
> 976.3Mbps lastsnd:23 lastrcv:23 lastack:22 pacing_rate 1952.7Mbps rcv_rtt:98 
> rcv_space:28960
> 
> Signed-off-by: Craig Gallek 

Applied to net-next branch


Re: [v2,5/9] fsl/fman: Add Frame Manager support

2015-06-25 Thread Scott Wood
On Wed, 2015-06-24 at 22:35 +0300, igal.liber...@freescale.com wrote:
> From: Igal Liberman 
> 
> Add Frame Manager Driver support.
> This patch adds the FMan configuration, initialization and
> runtime control routines.
> 
> Signed-off-by: Igal Liberman 
> ---
>  drivers/net/ethernet/freescale/fman/Kconfig|   35 +
>  drivers/net/ethernet/freescale/fman/Makefile   |2 +-
>  drivers/net/ethernet/freescale/fman/fm.c   | 1406 
> 
>  drivers/net/ethernet/freescale/fman/fm.h   |  394 ++
>  drivers/net/ethernet/freescale/fman/fm_common.h|  142 ++
>  drivers/net/ethernet/freescale/fman/fm_drv.c   |  701 ++
>  drivers/net/ethernet/freescale/fman/fm_drv.h   |  116 ++
>  drivers/net/ethernet/freescale/fman/inc/enet_ext.h |  199 +++
>  drivers/net/ethernet/freescale/fman/inc/fm_ext.h   |  488 +++
>  .../net/ethernet/freescale/fman/inc/fsl_fman_drv.h |   99 ++
>  drivers/net/ethernet/freescale/fman/inc/service.h  |   55 +
>  11 files changed, 3636 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/freescale/fman/fm.c
>  create mode 100644 drivers/net/ethernet/freescale/fman/fm.h
>  create mode 100644 drivers/net/ethernet/freescale/fman/fm_common.h
>  create mode 100644 drivers/net/ethernet/freescale/fman/fm_drv.c
>  create mode 100644 drivers/net/ethernet/freescale/fman/fm_drv.h
>  create mode 100644 drivers/net/ethernet/freescale/fman/inc/enet_ext.h
>  create mode 100644 drivers/net/ethernet/freescale/fman/inc/fm_ext.h
>  create mode 100644 drivers/net/ethernet/freescale/fman/inc/fsl_fman_drv.h
>  create mode 100644 drivers/net/ethernet/freescale/fman/inc/service.h

Again, please start with something pared down, without extraneous features, 
but *with* enough functionality to actually pass packets around.  Getting 
this thing into decent shape is going to be hard enough without carrying 
around the excess baggage.

> diff --git a/drivers/net/ethernet/freescale/fman/Kconfig 
> b/drivers/net/ethernet/freescale/fman/Kconfig
> index 825a0d5..12c75bfd 100644
> --- a/drivers/net/ethernet/freescale/fman/Kconfig
> +++ b/drivers/net/ethernet/freescale/fman/Kconfig
> @@ -7,3 +7,38 @@ config FSL_FMAN
>   Freescale Data-Path Acceleration Architecture Frame Manager
>   (FMan) support
>  
> +if FSL_FMAN
> +
> +config FSL_FM_MAX_FRAME_SIZE
> + int "Maximum L2 frame size"
> + range 64 9600
> + default "1522"
> + help
> + Configure this in relation to the maximum possible MTU of your
> + network configuration. In particular, one would need to
> + increase this value in order to use jumbo frames.
> + FSL_FM_MAX_FRAME_SIZE must accommodate the Ethernet FCS
> + (4 bytes) and one ETH+VLAN header (18 bytes), to a total of
> + 22 bytes in excess of the desired L3 MTU.
> +
> + Note that having too large a FSL_FM_MAX_FRAME_SIZE (much larger
> + than the actual MTU) may lead to buffer exhaustion, especially
> + in the case of badly fragmented datagrams on the Rx path.
> + Conversely, having a FSL_FM_MAX_FRAME_SIZE smaller than the
> + actual MTU will lead to frames being dropped.

Scatter gather can't be used for jumbo frames?

Why is this a compile-time option?
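
For reference, the arithmetic in the help text above works out as follows. This is only a sketch of the stated rule (L3 MTU plus 18 bytes of ETH+VLAN header plus 4 bytes of FCS, within the Kconfig range 64..9600); the function and macro names are invented.

```c
#include <stdbool.h>

#define DEMO_ETH_VLAN_HDR_LEN 18   /* Ethernet header plus one VLAN tag */
#define DEMO_ETH_FCS_LEN       4   /* frame check sequence */
#define DEMO_FRAME_SIZE_MIN   64   /* lower bound of the Kconfig range */
#define DEMO_FRAME_SIZE_MAX 9600   /* upper bound of the Kconfig range */

/* Frame size needed to carry a given L3 MTU, per the help text. */
int demo_frame_size_for_mtu(int l3_mtu)
{
    return l3_mtu + DEMO_ETH_VLAN_HDR_LEN + DEMO_ETH_FCS_LEN;
}

/* Whether that frame size is accepted by the Kconfig range. */
bool demo_frame_size_in_range(int frame_size)
{
    return frame_size >= DEMO_FRAME_SIZE_MIN &&
           frame_size <= DEMO_FRAME_SIZE_MAX;
}
```

A standard 1500-byte MTU gives the default of 1522, and a 9000-byte jumbo MTU needs 9022.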

> +
> +config FSL_FM_RX_EXTRA_HEADROOM
> + int "Add extra headroom at beginning of data buffers"
> + range 16 384
> + default "64"
> + help
> + Configure this to tell the Frame Manager to reserve some extra
> + space at the beginning of a data buffer on the receive path,
> + before Internal Context fields are copied. This is in addition
> + to the private data area already reserved for driver internal
> + use. The provided value must be a multiple of 16.
> +
> + This option does not affect in any way the layout of
> + transmitted buffers.

There's nothing here to indicate when a user would want to do this.

Why is this a compile-time option?
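
The constraints in that help text (range 16..384, multiple of 16) are simple enough to check at run time; a minimal sketch with an invented function name, not taken from the driver:

```c
#include <stdbool.h>

/* True when the value satisfies the constraints stated in the Kconfig
 * help text: between 16 and 384 inclusive, and a multiple of 16. */
bool demo_rx_headroom_is_valid(int headroom)
{
    if (headroom < 16 || headroom > 384)
        return false;
    return headroom % 16 == 0;
}
```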

> + /* FManV3H */
> + else if (minor == 0 || minor == 2 || minor == 3) {
> + intg->fm_muram_size = 384 * 1024;
> + intg->fm_iram_size  = 64 * 1024;
> + intg->fm_num_of_ctrl= 4;
> +
> + intg->bmi_max_num_of_tasks  = 128;
> + intg->bmi_max_num_of_dmas   = 84;
> +
> + intg->num_of_rx_ports   = 8;
> + } else {
> + pr_err("Unsupported FManv3 version\n");
> + kfree(intg);
> + return NULL;
> + }
> +
> + break;
> + default:
> + pr_err("Unsupported FMan version\n");
> + kfree(intg);
> + return NULL;
> + }

Don't duplicate error paths.  Use goto like the rest of

Re: [PATCH iproute2] Add displaying VF traffic statistics

2015-06-25 Thread Stephen Hemminger
On Tue, 16 Jun 2015 12:13:16 +0300
Or Gerlitz  wrote:

> From: Eran Ben Elisha 
> 
> Enable reading and displaying SRIOV VFs traffic statistics through 
> the host PF netdevice using the nested IFLA_VF_STATS attribute.
> 
> Signed-off-by: Eran Ben Elisha 
> Signed-off-by: Hadar Hen Zion 
> Signed-off-by: Or Gerlitz 

Applied to net-next.
Headers have already been updated by other changes.


Re: [PATCH iproute2 net-next 0/3] iplink_bridge: add support for ageing_time, priority and stp_state

2015-06-25 Thread Stephen Hemminger
On Tue, 16 Jun 2015 13:38:46 +0300
Nikolay Aleksandrov  wrote:

> Hi,
> Three months ago support was added to be able to set ageing_time,
> priority and stp_state via netlink by commit:
> af615762e972 ("bridge: add ageing_time, stp_state, priority over netlink")
> This patch-set adds support for iproute2 to be able to use them.
> Examples:
> ip link set dev br0 type bridge priority 10
> ip link set dev br0 type bridge ageing_time 10
> ip link set dev br0 type bridge stp_state 1
> 
> 
> Best regards,
>  Nikolay Aleksandrov
> 
> Nikolay Aleksandrov (3):
>   iplink_bridge: add support for ageing_time
>   iplink_bridge: add support for stp_state
>   iplink_bridge: add support for priority
> 
>  ip/iplink_bridge.c | 26 ++
>  1 file changed, 26 insertions(+)
> 

All these are staged in net-next branch.


Re: [PATCH iproute2] ss: print value of IPV6_V6ONLY socket option if set

2015-06-25 Thread Stephen Hemminger
On Wed, 24 Jun 2015 11:07:20 +
Phil Sutter  wrote:

> If available and set, print 'v6only:1' for AF_INET6 sockets upon request
> of extended information. For IPv6 sockets bound to in6addr_any, this is
> the only way to determine if they will also accept IPv4 requests or not.
> 
> Signed-off-by: Phil Sutter 
> ---
> Depends on unapplied patch 'ss: Include -E option for socket destroy
> events'
> 
> Kernel support patch: 204621551b2a0060a013b92f7add4d5c452fa7cb


Applied, currently staged on local net-next branch.
Will merge after 4.1 iproute2 release
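
For context, the option that ss now reports is the one an application sets or queries as below. This is a plain sockets sketch, not code from the ss patch; the helper name is made up.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sets IPV6_V6ONLY to `want` on a fresh AF_INET6 TCP socket and returns
 * the value read back, or -1 if the socket or either option call fails
 * (for example on a host without IPv6). */
int demo_v6only_roundtrip(int want)
{
    int fd = socket(AF_INET6, SOCK_STREAM, 0);
    int got = -1;
    socklen_t len = sizeof(got);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &want, sizeof(want)) < 0 ||
        getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &got, &len) < 0)
        got = -1;
    close(fd);
    return got;
}
```

A socket bound to in6addr_any with the option set to 1 accepts IPv6 only; with 0 it can also accept IPv4 connections, which appear as IPv4-mapped addresses.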


Re: [PATCH net-next 3/3 v7] iproute2: add support to print 'linkdown' nexthop flag

2015-06-25 Thread Stephen Hemminger
On Tue, 23 Jun 2015 13:45:38 -0400
Andy Gospodarek  wrote:

> Signed-off-by: Andy Gospodarek 
> Signed-off-by: Dinesh Dutt 
> Acked-by: Scott Feldman 

Applied, currently sitting on local net-next branch until after merge.  



Re: [PATCH iproute2 v2] ss: Include -E option for socket destroy events

2015-06-25 Thread Stephen Hemminger
On Wed, 17 Jun 2015 11:14:48 -0400
Craig Gallek  wrote:

> Use the IPv4/IPv6/TCP/UDP multicast groups of NETLINK_SOCK_DIAG
> to filter and display socket statistics as they are destroyed.
> 
> Kernel support patch series: 24029a3603cfa633e8bc2b3fb3e48e76c497831d
> 
> Signed-off-by: Craig Gallek 

Accepted. Currently sitting on local net-next branch until after merge.


4.1+ use after free in netlink_broadcast_filtered

2015-06-25 Thread Dave Jones
I taught Trinity about NETLINK_LISTEN_ALL_NSID and NETLINK_LIST_MEMBERSHIPS
yesterday, and this evening, this fell out..

general protection fault:  [#1] PREEMPT SMP DEBUG_PAGEALLOC 
CPU: 1 PID: 9130 Comm: kworker/1:1 Not tainted 4.1.0-gelk-debug+ #1
Workqueue: sock_diag_events sock_diag_broadcast_destroy_work
task: 8800b94e4c40 ti: 8800352ec000 task.ti: 8800352ec000
RIP: 0010:[]  [] 
netlink_broadcast_filtered+0x24/0x3b0
RSP: :8800352efd08  EFLAGS: 00010292
RAX: 8800ab903d80 RBX: 0003 RCX: 0003
RDX:  RSI: 00d0 RDI: 8800b9c586c0
RBP: 8800352efd78 R08: 00d0 R09: 
R10:  R11: 0220 R12: 
R13: 6b6b6b6b6b6b6b6b R14: 0003 R15: 
FS:  () GS:8800bf70() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 02121ff8 CR3: 30169000 CR4: 07e0
DR0: 7fe1f0454000 DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0600
Stack:
 8800b9c586c0 8800b9c586c0 8800ac4692c0 8800936d4a90
 8800352efd38 8469a93e 8800352efd98 c09b9b90
 8800352efd78 8800ac4692c0 8800b9c586c0 8800831b6ab8
Call Trace:
 [] ? mutex_unlock+0xe/0x10
 [] ? inet_diag_handler_get_info+0x110/0x1fb [inet_diag]
 [] netlink_broadcast+0x1d/0x20
 [] ? mutex_unlock+0xe/0x10
 [] sock_diag_broadcast_destroy_work+0xd5/0x160
 [] process_one_work+0x147/0x420
 [] worker_thread+0x69/0x470
 [] ? preempt_count_sub+0xa3/0xf0
 [] ? rescuer_thread+0x320/0x320
 [] kthread+0x107/0x120
 [] ? kthread_create_on_node+0x1b0/0x1b0
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_create_on_node+0x1b0/0x1b0
Code: 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 49 89 
fd 48 89 f7 44 89 c6 41 54 41 89 d4 53 89 cb 48 83 ec 48 <49> 8b 45 30 44 89 45 
a4 4c 89 4d 98 48 89 45 c0 e8 07 f6 ff ff 
RIP  [] netlink_broadcast_filtered+0x24/0x3b0
 RSP 
---[ end trace e2d8a07893775a9e ]---
 

r13 looks like slab poison, and the decoded instruction shows..


int netlink_broadcast_filtered(struct sock *ssk, struct sk_buff *skb, u32 portid,
        u32 group, gfp_t allocation,
        int (*filter)(struct sock *dsk, struct sk_buff *skb, void *data),
        void *filter_data)
{
1b70:   e8 00 00 00 00    callq  1b75
1b75:   55                push   %rbp
1b76:   48 89 e5          mov    %rsp,%rbp
1b79:   41 57             push   %r15
1b7b:   41 56             push   %r14
1b7d:   41 55             push   %r13
1b7f:   49 89 fd          mov    %rdi,%r13
1b82:   48 89 f7          mov    %rsi,%rdi
1b85:   44 89 c6          mov    %r8d,%esi
1b88:   41 54             push   %r12
1b8a:   41 89 d4          mov    %edx,%r12d
1b8d:   53                push   %rbx
1b8e:   89 cb             mov    %ecx,%ebx
1b90:   48 83 ec 48       sub    $0x48,%rsp
1b94:   49 8b 45 30       mov    0x30(%r13),%rax    <-- trapping instruction
1b98:   44 89 45 a4       mov    %r8d,-0x5c(%rbp)
1b9c:   4c 89 4d 98       mov    %r9,-0x68(%rbp)
1ba0:   48 89 45 c0       mov    %rax,-0x40(%rbp)
struct net *net = sock_net(ssk);


So it looks like the ssk we passed in was already freed.
I'll dig into this some more next week, and try to find a better
reproducer.
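
For anyone reading oopses like this one: the "slab poison" call on r13 comes from the register holding eight 0x6b bytes, the POISON_FREE pattern from linux/poison.h that slab debugging writes over freed objects. A throwaway helper for spotting it (the function name is made up):

```c
#include <stdbool.h>
#include <stdint.h>

/* POISON_FREE from linux/poison.h: the byte written over freed slab
 * objects when slab debugging is enabled. */
#define DEMO_POISON_FREE 0x6b

/* True when every byte of a 64-bit register value is the poison byte,
 * i.e. the value was most likely loaded from freed memory. */
bool demo_looks_like_slab_poison(uint64_t reg)
{
    for (int i = 0; i < 8; i++) {
        if (((reg >> (8 * i)) & 0xff) != DEMO_POISON_FREE)
            return false;
    }
    return true;
}
```

Here r13 holds ssk (copied from %rdi at 1b7f), so the faulting load at 0x30(%r13) is a dereference of a poisoned pointer.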

Dave


