date:20151027

[GIT] Networking

2015-10-27 Thread David Miller


This may look a bit scary this late in the release cycle, but as is typically
the case it's predominantly small driver fixes all over the place.

1) Fix two regressions in ipv6 route lookups, particularly wrt. output
   interface specifications in the lookup key.  From David Ahern.

2) Fix checks in ipv6 IPSEC tunnel pre-encap fragmentation, from
   Herbert Xu.

3) Fix mis-advertisement of 1000BASE-T on bcm63xx_enet, from Simon
   Arlott.

4) Some smsc phys misbehave with energy detect mode enabled, so add a
   DT property and disable it on such switches.  From Heiko Schocher.

5) Fix TSO corruption on TX in mv643xx_eth, from Philipp Kirchhofer.

6) Fix regression added by removal of openvswitch vport stats, from
   James Morse.

7) Vendor Kconfig options should be bool, not tristate, from Andreas
   Schwab.

8) Use non-_BH() net stats bump in tcp_xmit_probe_skb(), otherwise
   we barf during TCP REPAIR operations.

9) Fix various bugs in openvswitch conntrack support, from Joe
   Stringer.

10) Fix NETLINK_LIST_MEMBERSHIPS locking, from David Herrmann.

11) Don't have VSOCK do sock_put() in interrupt context, from Jorgen
Hansen.

12) Fix skb_realloc_headroom() failures properly in ISDN, from Karsten
Keil.

13) Add some device IDs to qmi_wwan, from Bjorn Mork.

14) Fix ovs egress tunnel information when using lwtunnel devices,
from Pravin B Shelar.

15) Add missing NETIF_F_FRAGLIST to macvtap feature list, from Jason
Wang.

16) Fix incorrect handling of throw routes when the result of the
throw cannot find a match, from Xin Long.

17) Protect ipv6 MTU calculations from wrap-around, from Hannes
Frederic Sowa.

18) Fix failed autonegotiation on KSZ9031 micrel PHYs, from Nathan
Sullivan.

19) Add missing memory barriers in descriptor accesses of xgbe driver,
from Thomas Lendacky.

20) Fix release condition test in pppoe_release(), from Guillaume Nault.

21) Fix gianfar bugs wrt. filter configuration, from Claudiu Manoil.

22) Fix violations of RX buffer alignment in sh_eth driver, from Sergei
Shtylyov.

23) Fix missing of_node_put() calls in various places around the
networking, from Julia Lawall.

24) Fix incorrect leaf walking in ipv4 routing tree, from Alexander
Duyck.

25) RDS doesn't check pskb_pull()/pskb_trim() return values, from
Sowmini Varadhan.

26) Fix VLAN configuration in mlx4 driver, from Jack Morgenstein.

Please pull, thanks a lot.

The following changes since commit 1099f86044111e9a7807f09523e42d4c9d0fb781:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2015-10-19 09:55:40 -0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to e18f6ac30d31433d8cd9ccf693d3cdd5d2e66ef9:

  Merge branch 'mlx4-fixes' (2015-10-27 20:27:45 -0700)



Alexander Duyck (1):
  fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key

Andreas Schwab (1):
  net: cavium: change NET_VENDOR_CAVIUM to bool

Andrew F. Davis (1):
  net: phy: dp83848: Add TI DP83848 Ethernet PHY

Andrew Shewmaker (1):
  tcp: allow dctcp alpha to drop to zero

Bjørn Mork (1):
  qmi_wwan: add Sierra Wireless MC74xx/EM74xx

Carol L Soto (1):
  net/mlx4: Copy/set only sizeof struct mlx4_eqe bytes

Claudiu Manoil (4):
  gianfar: Remove duplicated argument to bitwise OR
  gianfar: Don't enable the Filer w/o the Parser
  gianfar: Fix Rx BSY error handling
  MAINTAINERS: Add entry for gianfar ethernet driver

Dan Carpenter (1):
  irda: precedence bug in irlmp_seq_hb_idx()

David Ahern (2):
  net: Really fix vti6 with oif in dst lookups
  net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set

David Daney (1):
  net: thunderx: Rewrite silicon revision tests.

David Herrmann (1):
  netlink: fix locking around NETLINK_LIST_MEMBERSHIPS

David S. Miller (12):
  Merge branch 'smsc-energy-detect'
  Merge branch 'mv643xx-fixes'
  Merge git://git.kernel.org/.../pablo/nf
  Merge branch 'isdn-null-deref'
  Merge branch 'master' of git://git.kernel.org/.../klassert/ipsec
  Merge branch 'master' of git://git.kernel.org/.../jkirsher/net-queue
  Merge branch 'ipv6-overflow-arith'
  Merge branch 'thunderx-fixes'
  Merge branch 'gianfar-fixes'
  Merge branch 'sh_eth-fixes'
  Merge branch 'net_of_node_put'
  Merge branch 'mlx4-fixes'

Eric Dumazet (1):
  ipv6: gre: support SIT encapsulation

Florian Westphal (1):
  netfilter: sync with packet rx also after removing queue entries

Gao feng (1):
  vsock: fix missing cleanup when misc_register failed

Guillaume Nault (1):
  ppp: fix pppoe_dev deletion condition in pppoe_release()

Hannes Frederic Sowa (2):
  overflow-arith: begin to add support for overflow builtin functions
  ipv6: protect mtu calculation of wrap-around and infinite loop by 
r

pull request: bluetooth-next 2015-10-28

2015-10-27 Thread Johan Hedberg

Hi Dave,

Here are some more Bluetooth patches for 4.4 which were collected during
the past week. The most important ones are from Kuba Pawlak for fixing
locking issues with SCO sockets. There's also a fix from Alexander Aring
for 6lowpan, a memleak fix from Julia Lawall for the btmrvl driver and
some cleanup patches from Marcel.

Please let me know if there are any issues pulling. Thanks.

Johan

---
The following changes since commit 13972adc3240ea8b18b44906b819c622941a64b6:

  Bluetooth: Increase minor version of core module (2015-10-22 13:37:26 +0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git for-upstream

for you to fetch changes up to 324e786ee39c70ffbdc280c34b7d2b6da5c87879:

  bluetooth: 6lowpan: fix NOHZ: local_softirq_pending (2015-10-27 09:53:36 +0100)


Alexander Aring (1):
  bluetooth: 6lowpan: fix NOHZ: local_softirq_pending

Julia Lawall (1):
  Bluetooth: btmrvl: add missing of_node_put

Kuba Pawlak (4):
  Bluetooth: Fix crash on SCO disconnect
  Bluetooth: Fix locking issue on SCO disconnection
  Bluetooth: Fix locking issue during fast SCO reconnection.
  Bluetooth: Fix crash on fast disconnect of SCO

Marcel Holtmann (4):
  Bluetooth: Remove unneeded parenthesis around MSG_OOB
  Bluetooth: Rename bt_cb()->req into bt_cb()->hci
  Bluetooth: Replace hci_notify with hci_sock_dev_event
  Bluetooth: Fix some obvious coding style issues in the SCO module

 drivers/bluetooth/btmrvl_main.c   |  5 -
 include/net/bluetooth/bluetooth.h | 14 ++--
 net/bluetooth/6lowpan.c   |  2 +-
 net/bluetooth/af_bluetooth.c  |  2 +-
 net/bluetooth/hci_core.c  | 43 -
 net/bluetooth/hci_event.c |  4 ++--
 net/bluetooth/hci_request.c   | 10 -
 net/bluetooth/hci_sock.c  |  4 ++--
 net/bluetooth/sco.c   | 44 --
 9 files changed, 73 insertions(+), 55 deletions(-)



[PATCH v2 1/3] virtio_net: Stop doing DMA from the stack

2015-10-27 Thread Andy Lutomirski

From: Andy Lutomirski 

Once virtio starts using the DMA API, we won't be able to safely DMA
from the stack.  virtio-net does a couple of config DMA requests
from small stack buffers -- switch to using dynamically-allocated
memory.

This should have no effect on any performance-critical code paths.

Cc: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin" 
Cc: virtualizat...@lists.linux-foundation.org
Reviewed-by: Joerg Roedel 
Signed-off-by: Andy Lutomirski 
---

Hi Michael and DaveM-

This is a prerequisite for the virtio DMA fixing project.  It works
as a standalone patch, though.  Would it make sense to apply it to
an appropriate networking tree now?

 drivers/net/virtio_net.c | 53 
 1 file changed, 36 insertions(+), 17 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d8838dedb7a4..4f10f8a58811 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -976,31 +976,43 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
 struct scatterlist *out)
 {
struct scatterlist *sgs[4], hdr, stat;
-   struct virtio_net_ctrl_hdr ctrl;
-   virtio_net_ctrl_ack status = ~0;
+
+   struct {
+   struct virtio_net_ctrl_hdr ctrl;
+   virtio_net_ctrl_ack status;
+   } *buf;
+
unsigned out_num = 0, tmp;
+   bool ret;
 
/* Caller should know better */
BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
 
-   ctrl.class = class;
-   ctrl.cmd = cmd;
+   buf = kmalloc(sizeof(*buf), GFP_ATOMIC);
+   if (!buf)
+   return false;
+   buf->status = ~0;
+
+   buf->ctrl.class = class;
+   buf->ctrl.cmd = cmd;
/* Add header */
-   sg_init_one(&hdr, &ctrl, sizeof(ctrl));
+   sg_init_one(&hdr, &buf->ctrl, sizeof(buf->ctrl));
sgs[out_num++] = &hdr;
 
if (out)
sgs[out_num++] = out;
 
/* Add return status. */
-   sg_init_one(&stat, &status, sizeof(status));
+   sg_init_one(&stat, &buf->status, sizeof(buf->status));
sgs[out_num] = &stat;
 
BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, vi, GFP_ATOMIC);
 
-   if (unlikely(!virtqueue_kick(vi->cvq)))
-   return status == VIRTIO_NET_OK;
+   if (unlikely(!virtqueue_kick(vi->cvq))) {
+   ret = (buf->status == VIRTIO_NET_OK);
+   goto out;
+   }
 
/* Spin for a response, the kick causes an ioport write, trapping
 * into the hypervisor, so the request should be handled immediately.
@@ -1009,7 +1021,11 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
   !virtqueue_is_broken(vi->cvq))
cpu_relax();
 
-   return status == VIRTIO_NET_OK;
+   ret = (buf->status == VIRTIO_NET_OK);
+
+out:
+   kfree(buf);
+   return ret;
 }
 
 static int virtnet_set_mac_address(struct net_device *dev, void *p)
@@ -1151,7 +1167,7 @@ static void virtnet_set_rx_mode(struct net_device *dev)
 {
struct virtnet_info *vi = netdev_priv(dev);
struct scatterlist sg[2];
-   u8 promisc, allmulti;
+   u8 *cmdbyte;
struct virtio_net_ctrl_mac *mac_data;
struct netdev_hw_addr *ha;
int uc_count;
@@ -1163,22 +1179,25 @@ static void virtnet_set_rx_mode(struct net_device *dev)
if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
return;
 
-   promisc = ((dev->flags & IFF_PROMISC) != 0);
-   allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
+   cmdbyte = kmalloc(sizeof(*cmdbyte), GFP_ATOMIC);
+   if (!cmdbyte)
+   return;
 
-   sg_init_one(sg, &promisc, sizeof(promisc));
+   sg_init_one(sg, cmdbyte, sizeof(*cmdbyte));
 
+   *cmdbyte = ((dev->flags & IFF_PROMISC) != 0);
if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX,
  VIRTIO_NET_CTRL_RX_PROMISC, sg))
dev_warn(&dev->dev, "Failed to %sable promisc mode.\n",
-promisc ? "en" : "dis");
-
-   sg_init_one(sg, &allmulti, sizeof(allmulti));
+*cmdbyte ? "en" : "dis");
 
+   *cmdbyte = ((dev->flags & IFF_ALLMULTI) != 0);
if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX,
  VIRTIO_NET_CTRL_RX_ALLMULTI, sg))
dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n",
-allmulti ? "en" : "dis");
+*cmdbyte ? "en" : "dis");
+
+   kfree(cmdbyte);
 
uc_count = netdev_uc_count(dev);
mc_count = netdev_mc_count(dev);
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
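
As a rough userspace illustration of the pattern in the patch above (not the
driver code itself): the command header and the status byte share one heap
allocation, and every exit path frees it. `struct ctrl_hdr`, `CTRL_OK`,
`device_fn` and the `fake_device_*` helpers are hypothetical stand-ins, and
`malloc()`/`free()` take the place of `kmalloc()`/`kfree()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the control header sent to the device. */
struct ctrl_hdr {
	uint8_t class;
	uint8_t cmd;
};

#define CTRL_OK 0

/* Hypothetical device model: reads the header, may write the status byte. */
typedef void (*device_fn)(const struct ctrl_hdr *hdr, uint8_t *status);

static void fake_device_ok(const struct ctrl_hdr *hdr, uint8_t *status)
{
	(void)hdr;
	*status = CTRL_OK;
}

static void fake_device_silent(const struct ctrl_hdr *hdr, uint8_t *status)
{
	(void)hdr;
	(void)status;	/* never answers, so status keeps its initial value */
}

static bool send_command(device_fn dev, uint8_t class, uint8_t cmd)
{
	/* Header and status live in one heap buffer, never on the stack. */
	struct {
		struct ctrl_hdr ctrl;
		uint8_t status;
	} *buf;
	bool ret;

	assert(dev != NULL);

	buf = malloc(sizeof(*buf));
	if (!buf)
		return false;

	buf->status = (uint8_t)~0;	/* anything but CTRL_OK until answered */
	buf->ctrl.class = class;
	buf->ctrl.cmd = cmd;

	dev(&buf->ctrl, &buf->status);

	/* Read the result before freeing; single exit keeps cleanup in one place. */
	ret = (buf->status == CTRL_OK);
	free(buf);
	return ret;
}
```

Calling `send_command(fake_device_ok, 1, 0)` succeeds, while the silent device
leaves the status at its initial value and the call reports failure.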

Re: [PATCH net-next v3] bridge: set is_local and is_static before fdb entry is added to the fdb hashtable

2015-10-27 Thread Stephen Hemminger

On Tue, 27 Oct 2015 07:52:56 -0700
Roopa Prabhu  wrote:

> From: Roopa Prabhu 
> 
> Problem Description:
> We can add fdbs pointing to the bridge with NULL ->dst but that has a
> few race conditions because br_fdb_insert() is used which first creates
> the fdb and then, after the fdb has been published/linked, sets
> "is_local" to 1 and in that time frame if a packet arrives for that fdb
> it may see it as non-local and either do a NULL ptr dereference in
> br_forward() or attach the fdb to the port where it arrived, and later
> br_fdb_insert() will make it local thus getting a wrong fdb entry.
> Call chain br_handle_frame_finish() -> br_forward():
> But in br_handle_frame_finish() in order to call br_forward() the dst
> should not be local i.e. skb != NULL, whenever the dst is
> found to be local skb is set to NULL so we can't forward it,
> and here comes the problem since it's running only
> with RCU when forwarding packets it can see the entry before "is_local"
> is set to 1 and actually try to dereference NULL.
> The main issue is that if someone sends a packet to the switch while
> it's adding the entry which points to the bridge device, it may
> dereference NULL ptr. This is needed now after we can add fdbs
> pointing to the bridge.  This poses a problem for
> br_fdb_update() as well, while someone's adding a bridge fdb, but
> before it has is_local == 1, it might get moved to a port if it comes
> as a source mac and then it may get its "is_local" set to 1
> 
> This patch changes fdb_create to take is_local and is_static as
> arguments to set these values in the fdb entry before it is added to the
> hash. Also adds null check for port in br_forward.
> 
> Fixes: 3741873b4f73 ("bridge: allow adding of fdb entries pointing to the bridge device")
> Reported-by: Nikolay Aleksandrov 
> Signed-off-by: Roopa Prabhu 
> Reviewed-by: Nikolay Aleksandrov 

Acked-by: Stephen Hemminger 
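
The ordering that the quoted patch description asks for can be sketched in
userspace C. This is only an illustration under assumptions: C11
release/acquire atomics stand in for the kernel's RCU publication, and
`fdb_slot`, `fdb_publish()` and `may_forward()` are hypothetical names, not
bridge code. The point is that all flags are written before the pointer
becomes visible to readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct fdb_entry {
	bool is_local;
	bool is_static;
	int port;	/* -1 stands in for a NULL ->dst */
};

/* One-slot "hash table": readers only ever load this pointer. */
static _Atomic(struct fdb_entry *) fdb_slot;

/* Fill in every field first, then publish the entry. */
static struct fdb_entry *fdb_publish(int port, bool is_local, bool is_static)
{
	struct fdb_entry *fdb = malloc(sizeof(*fdb));

	if (!fdb)
		return NULL;

	fdb->port = port;
	fdb->is_local = is_local;
	fdb->is_static = is_static;

	/* Release store: the field writes above are visible before the pointer. */
	atomic_store_explicit(&fdb_slot, fdb, memory_order_release);
	return fdb;
}

/* Reader side: never forwards to a local entry or to a missing port. */
static bool may_forward(void)
{
	struct fdb_entry *fdb =
		atomic_load_explicit(&fdb_slot, memory_order_acquire);

	if (!fdb)
		return false;
	assert(fdb->port >= -1);
	return !fdb->is_local && fdb->port >= 0;
}
```

A reader that observes the published pointer also observes `is_local`, so the
window in which a local entry looks non-local does not exist in this sketch.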

Re: [PATCH v2 net-next 0/4] Automatic adjustment of max frame size

2015-10-27 Thread Stephen Hemminger

On Mon, 26 Oct 2015 12:40:55 +0900
Toshiaki Makita  wrote:

> This patch set tries to resolve packet drop by oversize error on receiving
> double tagged packets and possibly other encapsulated packets.
> 
> Problem description:
> Currently most NICs have 4 bytes room of receive buffer for vlan header and
> can receive 1522 bytes frame at maximum.
> This is, however, not sufficient once double tagged vlan is used.
> As MEF [1] says, maximum frame size of double tagged packets need to be at
> least 1526 to provide transparent ethernet VPN, and along the same line,
> HW switches send 1526 bytes double tagged packets.
> Thus, double tagged packets are dropped by default in most cases by
> oversize error. NICs need to accept 1526 bytes packets in this situation.
> 
> Approaches:
> To satisfy this requirement, this patch set introduces a way to indicate
> needed extra buffer space to drivers.
> This way can be re-used by other protocols than vlan, like mpls, vxlan, etc.
> 
> Other possible solutions:
> 
> - To adjust mtu automatically when stacked vlan device is created.
>   This is suboptimal because lower device is not necessarily used for only
>   vlan. Sometimes tagged and untagged traffic are both used at the same time.
>   Also, there are devices that already reserve 8 bytes room, in which case mtu
>   adjustment is unnecessary.
> 
> - To reserve more room by default.
>   This is also suboptimal because there are devices that change behavior
>   when max acceptable frame size gets larger. For example, e1000e enters
>   jumbo frame mode which has some additional restrictions than normal.
>   Also, this is vlan-specific solution and not reusable by other encapsulation
>   protocols.
> 
> This patch set introduces .ndo_enc_hdr_len() and I chose e1000e as the first
> implementation. Patch 3 makes vlan driver utilize this API and automatically
> expand max frame size of the real device. Patch 4 makes bridge use the API
> in similar way as vlan.
> 
> Challenges:
> - Restore/shrink extra header room after vlan devices are deleted.
>   This will need some additional memory storage.
> - Manual modification of extra buffer size (by iproute2).
> 
> Note:
> - This problem was once discussed in Netdev 0.1 [2].
>   This patch set is based on the conclusion of the discussion.
> 
> Changes:
>  v2: Fixed checkpatch warnings
> 
> [1] https://wiki.mef.net/display/CESG/ENNI+Frame
> [2] https://www.netdev01.org/docs/netdev01_bof_8021ad_makita_150212.pdf
> 
> Toshiaki Makita (4):
>   net: Add ndo_enc_hdr_len to notify extra header room for encapsulated
> frames
>   e1000e: Add ndo_enc_hdr_len
>   vlan: Notify real device of encap header length
>   bridge: Notify port device of encap header length
> 
>  drivers/net/ethernet/intel/e1000e/netdev.c | 82 
> ++
>  include/linux/netdevice.h  |  9 
>  net/8021q/vlan.c   | 16 +-
>  net/8021q/vlan_dev.c   | 48 +++--
>  net/bridge/br_vlan.c   | 18 +++
>  net/core/dev.c | 36 +
>  6 files changed, 180 insertions(+), 29 deletions(-)
> 

The problem is that you require changing network device drivers
and device specific knowledge about what will work or not. Because
of that the modification can't be automated.

Also, this affects even more layered devices like tunnels etc.
The problem is quite large, and this patch only begins to address it.

It seems to me that just having the vlan driver do a sane
auto default is the best solution. It might cause a smaller MTU
than ideal, but at least it will still work. Then the user can
manually set a larger MTU if they know their hardware will work.
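
The frame sizes discussed in this thread follow from simple arithmetic, shown
here as a small sketch. The constants are the usual Ethernet values (14-byte
header, 4-byte FCS, 4 bytes per VLAN tag); the enum and function names are
made up for illustration and are not kernel API.

```c
#include <assert.h>

enum {
	SKETCH_ETH_HEADER_LEN = 14,	/* dst MAC + src MAC + ethertype */
	SKETCH_ETH_FCS_LEN = 4,		/* frame check sequence */
	SKETCH_VLAN_TAG_LEN = 4,	/* one 802.1Q/802.1ad tag */
};

/* Largest frame on the wire for a given MTU and number of VLAN tags. */
static int max_frame_size(int mtu, int vlan_tags)
{
	assert(mtu > 0 && vlan_tags >= 0);
	return mtu + SKETCH_ETH_HEADER_LEN +
	       vlan_tags * SKETCH_VLAN_TAG_LEN + SKETCH_ETH_FCS_LEN;
}

/* Largest MTU that still fits a fixed hardware frame limit. */
static int mtu_for_frame_limit(int frame_limit, int vlan_tags)
{
	return frame_limit - SKETCH_ETH_HEADER_LEN -
	       vlan_tags * SKETCH_VLAN_TAG_LEN - SKETCH_ETH_FCS_LEN;
}
```

With an MTU of 1500, one tag gives 1522 bytes and two tags give 1526, which
are the two figures in the quoted cover letter. Going the other way, a NIC
limited to 1522-byte frames would need an MTU of 1496 to carry double tagged
traffic, which is the "smaller MTU than ideal" trade-off mentioned above.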

Re: [PATCH v1 1/3] virtio-net: Using single MSIX IRQ for TX/RX Q pair

2015-10-27 Thread Jason Wang



On 10/27/2015 04:38 PM, Michael S. Tsirkin wrote:
> On Mon, Oct 26, 2015 at 10:52:47AM -0700, Ravi Kerur wrote:
>> Ported earlier patch from Jason Wang (dated 12/26/2014).
>>
>> This patch tries to reduce the number of MSIX irqs required for
>> virtio-net by sharing a MSIX irq for each TX/RX queue pair through
>> channels. If transport support channel, about half of the MSIX irqs
>> were reduced.
>>
>> Signed-off-by: Ravi Kerur 
> Why bother BTW? 

The reason is we want to reduce the number of interrupt vectors used.
Booting a guest with 256 queues with the current driver will result in all
tx/rx queues sharing a single vector. This is suboptimal. With this
series, half could be saved. And a more complex policy could be applied on
top (e.g. limiting the number of vectors used by the driver).

> Looks like this is adding a bunch of overhead
> on data path - to what end?

I agree some benchmark is needed for this.

> Maybe you have a huge number of these devices ... but in that case, how
> about sharing the config interrupt instead?
> That's only possible if host supports VIRTIO_1
> (so we can detect config interrupt by reading the ISR).
>
>
>
>> ---
>>  drivers/net/virtio_net.c | 29 -
>>  1 file changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index d8838ded..d705cce 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -72,6 +72,9 @@ struct send_queue {
>>  
>>  /* Name of the send queue: output.$index */
>>  char name[40];
>> +
>> +/* Name of the channel, shared with irq. */
>> +char channel_name[40];
>>  };
>>  
>>  /* Internal representation of a receive virtqueue */
>> @@ -1529,6 +1532,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  int ret = -ENOMEM;
>>  int i, total_vqs;
>>  const char **names;
>> +const char **channel_names;
>> +unsigned *channels;
>>  
>>  /* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by
>>   * possible N-1 RX/TX queue pairs used in multiqueue mode, followed by
>> @@ -1548,6 +1553,17 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  if (!names)
>>  goto err_names;
>>  
>> +channel_names = kmalloc_array(vi->max_queue_pairs,
>> +  sizeof(*channel_names),
>> +  GFP_KERNEL);
>> +if (!channel_names)
>> +goto err_channel_names;
>> +
>> +channels = kmalloc_array(total_vqs, sizeof(*channels),
>> + GFP_KERNEL);
>> +if (!channels)
>> +goto err_channels;
>> +
>>  /* Parameters for control virtqueue, if any */
>>  if (vi->has_cvq) {
>>  callbacks[total_vqs - 1] = NULL;
>> @@ -1562,10 +1578,15 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  sprintf(vi->sq[i].name, "output.%d", i);
>>  names[rxq2vq(i)] = vi->rq[i].name;
>>  names[txq2vq(i)] = vi->sq[i].name;
>> +sprintf(vi->sq[i].channel_name, "txrx.%d", i);
>> +channel_names[i] = vi->sq[i].channel_name;
>> +channels[rxq2vq(i)] = i;
>> +channels[txq2vq(i)] = i;
>>  }
>>  
>>  ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
>> - names);
>> + names, channels, channel_names,
>> + vi->max_queue_pairs);
>>  if (ret)
>>  goto err_find;
>>  
>> @@ -1580,6 +1601,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  vi->sq[i].vq = vqs[txq2vq(i)];
>>  }
>>  
>> +kfree(channels);
>> +kfree(channel_names);
>>  kfree(names);
>>  kfree(callbacks);
>>  kfree(vqs);
>> @@ -1587,6 +1610,10 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  return 0;
>>  
>>  err_find:
>> +kfree(channels);
>> +err_channels:
>> +kfree(channel_names);
>> +err_channel_names:
>>  kfree(names);
>>  err_names:
>>  kfree(callbacks);
>> -- 
>> 1.9.1
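
The queue-to-channel mapping built in the quoted patch can be sketched on its
own. This assumes the RX/TX interleaving described in the patch's comment
(one RX virtqueue followed by one TX virtqueue per pair), so `rxq2vq()` and
`txq2vq()` below are written to match that layout rather than copied from the
driver; `build_channels()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: vq 0 = rx0, vq 1 = tx0, vq 2 = rx1, vq 3 = tx1, ... */
static int rxq2vq(int rxq)
{
	return rxq * 2;
}

static int txq2vq(int txq)
{
	return txq * 2 + 1;
}

/*
 * Returns a malloc'd array with one entry per virtqueue; each entry is the
 * index of the channel (shared interrupt vector) that virtqueue should use.
 * The caller frees the array.
 */
static unsigned *build_channels(int queue_pairs)
{
	unsigned *channels;
	int i;

	assert(queue_pairs > 0);

	channels = malloc(2 * (size_t)queue_pairs * sizeof(*channels));
	if (!channels)
		return NULL;

	for (i = 0; i < queue_pairs; i++) {
		/* RX and TX of the same pair share channel i. */
		channels[rxq2vq(i)] = (unsigned)i;
		channels[txq2vq(i)] = (unsigned)i;
	}
	return channels;
}
```

For N queue pairs this yields N distinct channel indices for 2N virtqueues,
which is where the "about half" saving in the patch description comes from.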


Re: [PATCH net 0/2] Mellanox mlx4 driver fixes for 4.3-rc7

2015-10-27 Thread David Miller

From: Or Gerlitz 
Date: Tue, 27 Oct 2015 17:36:18 +0200

> Jack's fix is for a regression introduced in 4.3-rc1
> 
> Carol's fix addresses an issue which exists for while and 
> turns to beat us hard on PPC, please queue for -stable. 

Series applied, and patch #2 queued up for -stable.

Thanks.

Re: [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy

2015-10-27 Thread Johannes Weiner

On Tue, Oct 27, 2015 at 05:45:32PM -0700, David Miller wrote:
> From: Johannes Weiner 
> Date: Tue, 27 Oct 2015 09:42:27 -0700
> 
> > On Tue, Oct 27, 2015 at 05:15:54PM +0100, Michal Hocko wrote:
> >> > For now, something like this as a boot commandline?
> >> > 
> >> > cgroup.memory=nosocket
> >> 
> >> That would work for me.
> > 
> > Okay, then I'll go that route for the socket stuff.
> > 
> > Dave is that cool with you?
> 
> Depends upon the default.
> 
> Until the user configures something explicitly into the memory
> controller, the networking bits should all evaluate to nothing.

Yep, I'll stick them behind a default-off jump label again.

This bootflag is only to override an active memory controller
configuration and force-off that jump label permanently.

Re: [PATCH RESEND v2 net-next] net: hisilicon: updates HNS config and documents

2015-10-27 Thread David Miller

From: huangdaode 
Date: Tue, 27 Oct 2015 19:16:34 +0800

> From: yankejian 
> 
> updates the bindings documents and dtsi file according to the review
> comments[https://lkml.org/lkml/2015/9/21/670] from Rob Herring 
> 
> 
> Acked-by: Rob Herring 
> Signed-off-by: yankejian 
> Signed-off-by: huangdaode 

Applied, thanks.

Re: [PATCH] vhost: fix performance on LE hosts

2015-10-27 Thread David Miller

From: "Michael S. Tsirkin" 
Date: Tue, 27 Oct 2015 11:37:39 +0200

> commit 2751c9882b947292fcfb084c4f604e01724af804 ("vhost: cross-endian
> support for legacy devices") introduced a minor regression: even with
> cross-endian disabled, and even on LE host, vhost_is_little_endian is
> checking is_le flag so there's always a branch.
> 
> To fix, simply check virtio_legacy_is_little_endian first.
> 
> Cc: Greg Kurz 
> Signed-off-by: Michael S. Tsirkin 

Applied.

Re: pull-request: wireless-drivers-next 2015-10-27

2015-10-27 Thread David Miller

From: Kalle Valo 
Date: Tue, 27 Oct 2015 10:46:38 +0200

> here's a bigger pull request for 4.4. The diffstat looks scary as we
> created a new directory realtek for all realtek drivers. In the future
> I'm planning to create similar directories for all vendors, currently we
> just have ath, mediatek and realtek. This change has been in linux-next
> for a couple of weeks so it should be safe, but of course you never
> know.
> 
> There's also a new driver rtl8xxxu for few realtek USB devices. This
> just made it to the last linux-next build.
> 
> Otherwise there's nothing really special, more info below. If time
> permits, and it's ok for you, I'm hoping to send you a one more pull
> request this week.

Pulled, thanks.

Re: [PATCH] net: hns: fixes the bug tested XGE by ethtool -p

2015-10-27 Thread David Miller

From: yankejian 
Date: Tue, 27 Oct 2015 17:17:40 +0800

> From: Li Peng 
> 
> delete action of ETHTOOL_ID_ON/ETHTOOL_ID_OFF in XGE ethtool -p,
> so Hardware control the LED state instead of software.
> 
> Signed-off-by: Li Peng 
> Signed-off-by: Yisen Zhuang 
> Signed-off-by: yankejian 

Applied.

Re: [PATCH net-next] gianfar: Increase TX_TIMEOUT to 5HZ

2015-10-27 Thread David Miller

From: Abhimanyu 
Date: Tue, 27 Oct 2015 14:17:43 +0530

> Increased TX_TIMEOUT to 5HZ to accommodate worst case situation
> for traffic and CPU intensive use cases
> 
> Signed-off-by: Priyanka Jain 
> Signed-off-by: Abhimanyu 

Applied, thanks.

Re: [PATCH net] macvtap: unbreak receiving of gro skb with frag list

2015-10-27 Thread Jason Wang



On 10/27/2015 05:05 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 27, 2015 at 10:58:25AM +0800, Jason Wang wrote:
>>
>> On 10/26/2015 04:30 PM, Michael S. Tsirkin wrote:
>>> On Mon, Oct 26, 2015 at 02:53:38PM +0800, Jason Wang wrote:
>>>> On 10/26/2015 02:09 PM, Michael S. Tsirkin wrote:
>>>>> On Mon, Oct 26, 2015 at 11:15:57AM +0800, Jason Wang wrote:
>>>>>> On 10/23/2015 09:37 PM, Michael S. Tsirkin wrote:
>>>>>>> On Fri, Oct 23, 2015 at 12:57:05AM -0400, Jason Wang wrote:
>>>>>>>> We don't have fraglist support in TAP_FEATURES. This will lead
>>>>>>>> software segmentation of gro skb with frag list. Fixes by having
>>>>>>>> frag list support in TAP_FEATURES.
>>>>>>>>
>>>>>>>> With this patch single session of netperf receiving were restored from
>>>>>>>> about 5Gb/s to about 12Gb/s on mlx4.
>>>>>>>>
>>>>>>>> Fixes a567dd6252 ("macvtap: simplify usage of tap_features")
>>>>>>>> Cc: Vlad Yasevich 
>>>>>>>> Cc: Michael S. Tsirkin 
>>>>>>>> Signed-off-by: Jason Wang 
>>>>>>> Thanks!
>>>>>>> Does this mean we should look at re-adding NETIF_F_FRAGLIST
>>>>>>> to virtio-net as well?
>>>>>> Not sure I get the point, but probably not. This is for receiving and
>>>>>> skb_copy_datagram_iter() can deal with frag list.
>>>>> Point is:
>>>>> - bridge within guest
>>>>> - assigned device creating gro skbs with frag list bridged to virtio
>>>> I see, but this problem looks not specific to virtio. Most cards does
>>>> not support frag list.
>>> These will be slower when used with a bridge then, won't they?
>> For forwarding, not sure. GRO has latency and cpu overhead anyway.
> Right but that's up to the user. You aren't disabling GRO
> on source, you are just splitting it up.
>
>> Anyway I can try to add the support for this.
> Which reminds me: on modern devices there are commands to control
> offloads, so for these, we should support turning offloads on/off using
> ethtool.
>

Trying to implement frag list support, but I see a problem. It looks like the
driver needs to scan the possible number of io vectors (since vhost supports
at most UIO_MAXIOV io vectors). There seems to be no clarification on
this in the spec (which only limits the length of a descriptor chain to
the queue size).

[PATCH 02/25] IB/mthca, net/mlx4: remove counting semaphores

2015-10-27 Thread Arnd Bergmann

The mthca and mlx4 device drivers use the same method
to switch between polling and event-driven command mode,
abusing two semaphores to create a mutual exclusion between
one polled command or multiple concurrent event driven
commands.

Since we want to make counting semaphores go away, this
patch replaces the semaphore counting the event-driven
commands with an open-coded wait-queue, which should
be an equivalent transformation of the code, although
it does not make it any nicer.

As far as I can tell, there is a preexisting race condition
regarding the cmd->use_events flag, which is not protected
by any lock. When this flag is toggled while another command
is being started, that command gets stuck until the mode is
toggled back.

A better solution that would solve the race condition and
at the same time improve the code readability would create
a new locking primitive that replaces both semaphores, like

static int __mlx4_use_events(struct mlx4_cmd *cmd)
{
	int ret = -EAGAIN;

	spin_lock(&cmd->lock);
	if (cmd->use_events && cmd->commands < cmd->max_commands) {
		cmd->commands++;
		ret = 1;
	} else if (!cmd->use_events && cmd->commands == 0) {
		cmd->commands = 1;
		ret = 0;
	}
	spin_unlock(&cmd->lock);

	return ret;
}

static bool mlx4_use_events(struct mlx4_cmd *cmd)
{
	int ret;

	wait_event(cmd->events_wq, (ret = __mlx4_use_events(cmd)) >= 0);

	return ret;
}

Cc: Roland Dreier 
Cc: Eli Cohen 
Cc: Yevgeny Petrilin 
Cc: netdev@vger.kernel.org
Cc: linux-r...@vger.kernel.org
Signed-off-by: Arnd Bergmann 

Conflicts:

drivers/net/mlx4/cmd.c
drivers/net/mlx4/mlx4.h
---
 drivers/infiniband/hw/mthca/mthca_cmd.c   | 12 
 drivers/infiniband/hw/mthca/mthca_dev.h   |  3 ++-
 drivers/net/ethernet/mellanox/mlx4/cmd.c  | 12 
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |  3 ++-
 4 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 9d3e5c1ac60e..aad1852e8e10 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -417,7 +417,8 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
int err = 0;
struct mthca_cmd_context *context;
 
-   down(&dev->cmd.event_sem);
+   wait_event(dev->cmd.event_wait,
+  atomic_add_unless(&dev->cmd.commands, -1, 0));
 
spin_lock(&dev->cmd.context_lock);
BUG_ON(dev->cmd.free_head < 0);
@@ -459,7 +460,8 @@ out:
dev->cmd.free_head = context - dev->cmd.context;
spin_unlock(&dev->cmd.context_lock);
 
-   up(&dev->cmd.event_sem);
+   atomic_inc(&dev->cmd.commands);
+   wake_up(&dev->cmd.event_wait);
return err;
 }
 
@@ -571,7 +573,8 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
dev->cmd.context[dev->cmd.max_cmds - 1].next = -1;
dev->cmd.free_head = 0;
 
-   sema_init(&dev->cmd.event_sem, dev->cmd.max_cmds);
+   init_waitqueue_head(&dev->cmd.event_wait);
+   atomic_set(&dev->cmd.commands, dev->cmd.max_cmds);
spin_lock_init(&dev->cmd.context_lock);
 
for (dev->cmd.token_mask = 1;
@@ -597,7 +600,8 @@ void mthca_cmd_use_polling(struct mthca_dev *dev)
dev->cmd.flags &= ~MTHCA_CMD_USE_EVENTS;
 
for (i = 0; i < dev->cmd.max_cmds; ++i)
-   down(&dev->cmd.event_sem);
+   wait_event(dev->cmd.event_wait,
+  atomic_add_unless(&dev->cmd.commands, -1, 0));
 
kfree(dev->cmd.context);
 
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 7e6a6d64ad4e..3055f5c12ac8 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -121,7 +121,8 @@ struct mthca_cmd {
struct pci_pool  *pool;
struct mutex  hcr_mutex;
struct semaphore  poll_sem;
-   struct semaphore  event_sem;
+   wait_queue_head_t event_wait;
+   atomic_t  commands;
int   max_cmds;
spinlock_t context_lock;
int   free_head;
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 78f5a1a0b8c8..60134a4245ef 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -273,7 +273,8 @@ static int mlx4_cmd_wait(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
struct mlx4_cmd_context *context;
int err = 0;
 
-   down(&cmd->event_sem);
+   wait_event(cmd->event_wait,
+  atomic_add_unless(&cmd->commands, -1, 0));
 
spin_lock(&cmd->context_lock);
BUG_ON(cmd->free_head < 0);
@@ -305,7 +306,8 @@ out:
cmd->free_head = context - cmd->context;
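
The hunks above replace a counting semaphore with a wait queue plus an atomic counter that is decremented only while it is non-zero. As a minimal userspace sketch of that counter logic, assuming C11 atomics stand in for the kernel's atomic_t: add_unless() below imitates atomic_add_unless(), while try_get_slot() and put_slot() are hypothetical names, not functions from the patch. The sleeping part (wait_event()/wake_up()) is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's atomic_add_unless(v, a, u):
 * add a to *v unless *v == u; return true if the add happened. */
static bool add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Hypothetical helper: try to claim one command slot without sleeping. */
static bool try_get_slot(atomic_int *slots)
{
	assert(atomic_load(slots) >= 0);
	return add_unless(slots, -1, 0);
}

/* Hypothetical helper: return a slot. The kernel code follows the
 * increment with wake_up() so a sleeping waiter re-checks the counter. */
static void put_slot(atomic_int *slots)
{
	atomic_fetch_add(slots, 1);
}
```

In the patch, the try_get_slot() expression is what wait_event() evaluates as its condition, so a caller sleeps until the decrement succeeds.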

Re: [PATCH v7 10/10] ss: activate json_writer excluded logic

2015-10-27 Thread Stephen Hemminger

On Tue, 27 Oct 2015 14:21:03 +0100
Phil Sutter  wrote:

> On Thu, Sep 10, 2015 at 09:35:08PM +0200, Matthias Tafelmeier wrote:
> > This small patch extends the lib json_writer module for formerly
> > deactivated functionality.  
> 
> Why was it deactivated in the first place?

The code came from another project that wasn't using this
function.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH net-next] seccomp, ptrace: add support for dumping seccomp filters

2015-10-27 Thread David Miller

From: Tycho Andersen 
Date: Tue, 27 Oct 2015 09:23:59 +0900

> This patch adds support for dumping a process' (classic BPF) seccomp
> filters via ptrace.

Applied, thanks.

Re: [PATCH net-next 0/2] mpls: mulipath improvements

2015-10-27 Thread David Miller

From: Robert Shearman 
Date: Tue, 27 Oct 2015 00:37:34 +

> Two improvements to the recently added mpls multipath support. The
> first is a fix for missing initialisation of the nexthop address length
> for the v4 and v6 explicit null label routes, and the second is to
> reduce the amount of memory used by mpls routes by changing the way
> the via addresses are stored.

Series applied, thanks.

Re: [PATCH] bpf: sample: define aarch64 specific registers

2015-10-27 Thread David Miller

From: Yang Shi 
Date: Mon, 26 Oct 2015 17:02:19 -0700

> Define aarch64 specific registers for building bpf samples correctly.
> 
> Signed-off-by: Yang Shi 

Applied.

Re: [PATCH net] amd-xgbe: Fix race between access of desc and desc index

2015-10-27 Thread David Miller

From: Tom Lendacky 
Date: Mon, 26 Oct 2015 17:13:54 -0500

> During Tx cleanup it's still possible for the descriptor data to be
> read ahead of the descriptor index. A memory barrier is required between
> the read of the descriptor index and the start of the Tx cleanup loop.
> This allows a change to a lighter-weight barrier in the Tx transmit
> routine just before updating the current descriptor index.
> 
> Since the memory barrier does result in extra overhead on arm64, keep
> the previous change to not chase the current descriptor value. This
> prevents the execution of the barrier for each loop performed.
> 
> Suggested-by: Alexander Duyck 
> Signed-off-by: Tom Lendacky 

Applied, thanks.
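
The ordering described in the quoted changelog can be sketched in userspace with C11 acquire/release atomics, under the assumption that they approximate the driver's barriers. The ring layout, RING_SIZE, and both function names are hypothetical and only illustrate the idea: the producer publishes the index after writing the descriptor, and the cleanup path reads the index once, with acquire ordering, before touching any descriptor data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical ring size, power of two */

struct ring {
	uint32_t desc[RING_SIZE];	/* descriptor data */
	atomic_uint cur;		/* index of the next descriptor to fill */
};

/* Producer: write the descriptor, then publish the new index.
 * The release store plays the role of the lighter-weight barrier. */
static void ring_publish(struct ring *r, uint32_t data)
{
	unsigned int cur = atomic_load_explicit(&r->cur, memory_order_relaxed);

	r->desc[cur % RING_SIZE] = data;
	atomic_store_explicit(&r->cur, cur + 1, memory_order_release);
}

/* Consumer: read the index once with acquire semantics, then walk the
 * descriptors up to that snapshot without re-reading the index. */
static uint32_t ring_clean(struct ring *r, unsigned int *dirty)
{
	unsigned int cur = atomic_load_explicit(&r->cur, memory_order_acquire);
	uint32_t sum = 0;

	assert(cur - *dirty <= RING_SIZE);
	while (*dirty != cur) {
		sum += r->desc[*dirty % RING_SIZE];
		(*dirty)++;
	}
	return sum;
}
```

Taking a single snapshot of the index keeps the acquire cost out of the loop body, which matches the changelog's point about not chasing the current descriptor value.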

Re: [PATCH net] RDS-TCP: Recover correctly from pskb_pull()/pskb_trim() failure in rds_tcp_data_recv

2015-10-27 Thread David Miller

From: Sowmini Varadhan 
Date: Mon, 26 Oct 2015 12:46:37 -0400

> Either of pskb_pull() or pskb_trim() may fail under low memory conditions.
> If rds_tcp_data_recv() ignores such failures, the application will
> receive corrupted data because the skb has not been correctly
> carved to the RDS datagram size.
> 
> Avoid this by handling pskb_pull/pskb_trim failure in the same
> manner as the skb_clone failure: bail out of rds_tcp_data_recv(), and
> retry via the deferred call to rds_send_worker() that gets set up on
> ENOMEM from rds_tcp_read_sock()
> 
> Signed-off-by: Sowmini Varadhan 

Applied and queued up for -stable, thanks.

Re: [PATCH] forcedeth: fix unilateral interrupt disabling in netpoll path

2015-10-27 Thread David Miller

From: Neil Horman 
Date: Mon, 26 Oct 2015 12:24:22 -0400

> Forcedeth currently uses disable_irq_lockdep and enable_irq_lockdep, which in
> some configurations simply call local_irq_disable.  This causes errant warnings
> in the netpoll path as in netpoll_send_skb_on_dev, where we disable irqs using
> local_irq_save, leading to the following warning:
 ...
> Fix it by modifying the forcedeth code to use
> disable_irq_nosync_lockdep_irqsave instead,
> which saves and restores irq state properly.  This also saves us a little code
> in the process.
> 
> Tested by the reporter, with successful results.
> 
> Patch applies to the head of the net tree
> 
> Signed-off-by: Neil Horman 
> CC: "David S. Miller" 
> Reported-by: Vasily Averin 

Applied, thanks.
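
Why an unconditional disable/enable pair is a problem inside a region that already runs with interrupts off can be shown with a small simulation. This is only a sketch: a plain boolean stands in for the CPU interrupt state, and every name below is hypothetical rather than a kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated per-CPU interrupt-enable flag (stand-in for real CPU state). */
static bool irqs_enabled = true;

static void sim_irq_disable(void) { irqs_enabled = false; }
static void sim_irq_enable(void)  { irqs_enabled = true; }

/* Save the current state, then disable. */
static bool sim_irq_save(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Restore exactly the state that was saved. */
static void sim_irq_restore(bool flags) { irqs_enabled = flags; }

/* Inner section using the unconditional pair; returns the IRQ state
 * the caller sees afterwards. */
static bool inner_unconditional(void)
{
	sim_irq_disable();
	sim_irq_enable();	/* re-enables even if the caller had IRQs off */
	return irqs_enabled;
}

/* Inner section using save/restore; the caller's state survives. */
static bool inner_save_restore(void)
{
	bool flags = sim_irq_save();

	assert(!irqs_enabled);
	sim_irq_restore(flags);
	return irqs_enabled;
}
```

Called from a context that has already disabled interrupts, inner_unconditional() leaves them enabled on return, while inner_save_restore() leaves them off, which is the behaviour the netpoll caller relies on.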

Re: [PATCH net-next v2 1/1] sfc: replace spinlocks with bit ops for busy poll locking

2015-10-27 Thread David Miller

From: Shradha Shah 
Date: Mon, 26 Oct 2015 14:23:42 +

> From: Bert Kenward 
> 
> This patch reduces the overhead of locking for busy poll.
> Previously the state was protected by a lock, whereas now
> it's manipulated solely with atomic operations.
> 
> Signed-off-by: Shradha Shah 

Applied, thanks.
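
A lock-free ownership state of the kind the changelog describes might look like the following sketch. It assumes C11 atomics in place of the kernel's bit operations, and the state bits and function names are illustrative, not the driver's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits; the names are illustrative, not the driver's. */
enum {
	CH_STATE_NAPI = 1u << 0,	/* owned by the NAPI poll loop */
	CH_STATE_POLL = 1u << 1,	/* owned by a busy-polling socket */
};

/* Try to take ownership for `owner`; succeeds only if nobody holds it. */
static bool channel_try_lock(atomic_uint *state, unsigned int owner)
{
	unsigned int idle = 0;

	assert(owner == CH_STATE_NAPI || owner == CH_STATE_POLL);
	return atomic_compare_exchange_strong(state, &idle, owner);
}

/* Drop ownership by clearing the owner's bit. */
static void channel_unlock(atomic_uint *state, unsigned int owner)
{
	atomic_fetch_and(state, ~owner);
}
```

Each transition is a single atomic read-modify-write, so neither path has to take and release a spinlock around the state check.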

Re: [PATCH net-next] sock: don't enable netstamp for af_unix sockets

2015-10-27 Thread David Miller

From: Hannes Frederic Sowa 
Date: Mon, 26 Oct 2015 13:51:37 +0100

> netstamp_needed is toggled for all socket families if they request
> timestamping. But some protocols don't need the lower-layer timestamping
> code at all. This patch starts disabling it for af-unix.
> 
> E.g. systemd enables timestamping during boot-up on the journald af-unix
> sockets, thus causing the system to globally enable timestamping in the
> lower networking stack. Still, it is very probable that timestamping
> gets activated, by e.g. dhclient or various NTP implementations.
> 
> Reported-by: Jesper Dangaard Brouer 
> Signed-off-by: Hannes Frederic Sowa 

Applied.

Re: [PATCH net-next v8 00/10] Add new drivers: qed & qede

2015-10-27 Thread David Miller

From: Yuval Mintz 
Date: Mon, 26 Oct 2015 11:02:24 +0200

> This series implements the driver set for Qlogic's new QL4xxx series.
> These are 10/20/25/40/50/100 Gig capable converged nics, supporting
> ethernet (obviously), iscsi, fcoe, roce and iwarp protocols.

Series applied.

Re: [PATCHv3 net 1/3] openvswitch: Fix double-free on ip_defrag() errors

2015-10-27 Thread David Miller

From: Joe Stringer 
Date: Sun, 25 Oct 2015 20:21:48 -0700

> If ip_defrag() returns an error other than -EINPROGRESS, then the skb is
> freed. When handle_fragments() passes this back up to
> do_execute_actions(), it will be freed again. Prevent this double free
> by never freeing the skb in do_execute_actions() for errors returned by
> ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead.
> 
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Reported-by: Florian Westphal 
> Signed-off-by: Joe Stringer 

Applied.

Re: [PATCHv3 net 2/3] ipv6: Export nf_ct_frag6_consume_orig()

2015-10-27 Thread David Miller

From: Joe Stringer 
Date: Sun, 25 Oct 2015 20:21:49 -0700

> This is needed in openvswitch to fix an skb leak in the next patch.
> 
> Signed-off-by: Joe Stringer 

Applied.

Re: [PATCHv3 net 3/3] openvswitch: Fix skb leak using IPv6 defrag

2015-10-27 Thread David Miller

From: Joe Stringer 
Date: Sun, 25 Oct 2015 20:21:50 -0700

> nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the
> reassembly is successful, expects the caller to free all of the original
> skbs using nf_ct_frag6_consume_orig(). This call was previously missing,
> meaning that the original fragments were never freed (with the exception
> of the last fragment to arrive).
> 
> Fix this by ensuring that all original fragments except for the last
> fragment are freed via nf_ct_frag6_consume_orig(). The last fragment
> will be morphed into the head, so it must not be freed yet. Furthermore,
> retain the ->next pointer for the head after skb_morph().
> 
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Reported-by: Florian Westphal 
> Signed-off-by: Joe Stringer 

Applied.

Re: [PATCH v2 net-next 0/4] Automatic adjustment of max frame size

2015-10-27 Thread David Miller

From: Toshiaki Makita 
Date: Mon, 26 Oct 2015 12:40:55 +0900

> This patch set tries to resolve packet drop by oversize error on
> receiving double tagged packets and possibly other encapsulated
> packets.

Nobody is reviewing this patch series, therefore I am not applying
it.

Re: new coverity defect in ipv6 route

2015-10-27 Thread Hannes Frederic Sowa

Hi Stephen,

On Wed, Oct 28, 2015, at 01:43, Stephen Hemminger wrote:

> *** CID 1328821:  Memory - corruptions  (ARRAY_VS_SINGLETON)
> /net/ipv6/route.c: 320 in rt6_info_init()
> 314 #endif
> 315 
> 316 static void rt6_info_init(struct rt6_info *rt)
> 317 {
> 318 struct dst_entry *dst = &rt->dst;
> 319 
> >>> CID 1328821:  Memory - corruptions  (ARRAY_VS_SINGLETON)
> >>> Using "dst" as an array.  This might corrupt or misinterpret adjacent memory locations.
> 320 memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
> 321 INIT_LIST_HEAD(&rt->rt6i_siblings);
> 322 INIT_LIST_HEAD(&rt->rt6i_uncached);
> 323 }
> 324 
> 325 /* allocate dst with ip6_dst_ops */

I already marked this as a false positive in Coverity.

Thanks,
Hannes
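
The pattern Coverity flags, zeroing everything that follows an embedded first member, can be sketched with simplified stand-in structures. The types and field names below are hypothetical and far smaller than the real rt6_info and dst_entry; the point is only that "base + 1" is the first byte past the embedded member and that the length keeps the memset inside the enclosing object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins; the real structures have many more fields. */
struct base {
	int refcnt;
	void *ops;
};

struct derived {
	struct base b;		/* must be the first member */
	int flags;
	unsigned char addr[16];
};

/* Zero every byte of *d that follows the embedded base, leaving the
 * base itself untouched. */
static void derived_init(struct derived *d)
{
	unsigned char *tail = (unsigned char *)d + sizeof(d->b);

	assert(offsetof(struct derived, b) == 0);
	/* Same address as the "&d->b + 1" form used in the quoted code. */
	assert(tail == (unsigned char *)(&d->b + 1));
	memset(tail, 0, sizeof(*d) - sizeof(d->b));
}
```

Because the base is the first member, sizeof(*d) - sizeof(d->b) is exactly the number of bytes between the end of the base and the end of the enclosing structure.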

Re: [PATCH net-next 2/2] mpls: reduce memory usage of routes

2015-10-27 Thread roopa

On 10/26/15, 5:37 PM, Robert Shearman wrote:
> Nexthops for MPLS routes have a via address field sized for the
> largest via address that is expected, which is 32 bytes. This means
> that in the most common case of having ipv4 via addresses, 28 bytes of
> memory more than required are used per nexthop. In the other common
> case of an ipv6 nexthop then 16 bytes more than required are
> used. With large numbers of MPLS routes this extra memory usage could
> start to become significant.
>
> To avoid allocating memory for a maximum length via address when not
> all of it is required and to allow for ease of iterating over
> nexthops, then the via addresses are changed to be stored in the same
> memory block as the route and nexthops, but in an array after the end
> of the array of nexthops. New accessors are provided to retrieve a
> pointer to the via address.
>
> To allow for O(1) access without having to store a pointer or offset
> per nh, the via address for each nexthop is sized according to the
> maximum via address for any nexthop in the route, which is stored in a
> new route field, rt_max_alen, but this is in an existing hole in
> struct mpls_route so it doesn't increase the size of the
> structure. Each via address is ensured to be aligned to VIA_ALEN_ALIGN
> to account for architectures that don't allow unaligned accesses.
>
> Signed-off-by: Robert Shearman 
> ---
Nice way to handle it! I was going to submit a patch to make MAX_VIA_ALEN =
16 bytes, as suggested by Eric.
In that case, your patch will only help the case where all nexthops are ipv4.
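
The layout described in the quoted changelog (nexthops first, then one fixed-stride via slot per nexthop in the same allocation) can be sketched as follows. The structures are heavily simplified, and VIA_ALIGN, route_alloc() and route_via() are hypothetical names chosen for this illustration, not the patch's identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define VIA_ALIGN 4u	/* hypothetical alignment for via addresses */
#define ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))

struct nexthop {
	uint8_t via_alen;	/* actual length of this nexthop's address */
};

struct route {
	uint8_t nhn;		/* number of nexthops */
	uint8_t max_alen;	/* largest via length, rounded up to VIA_ALIGN */
	struct nexthop nh[];	/* nexthops, followed by the via array */
};

/* Byte offset of the via array: end of the nexthop array, aligned. */
static size_t via_offset(uint8_t nhn)
{
	return ALIGN_UP(sizeof(struct route) + nhn * sizeof(struct nexthop),
			VIA_ALIGN);
}

/* Allocate the route, its nexthops and all via slots as one block. */
static struct route *route_alloc(uint8_t nhn, uint8_t max_alen)
{
	uint8_t slot = (uint8_t)ALIGN_UP(max_alen, VIA_ALIGN);
	struct route *rt = calloc(1, via_offset(nhn) + (size_t)nhn * slot);

	if (rt) {
		rt->nhn = nhn;
		rt->max_alen = slot;
	}
	return rt;
}

/* O(1) accessor: slot i lives at a fixed stride of max_alen bytes. */
static uint8_t *route_via(struct route *rt, unsigned int i)
{
	assert(i < rt->nhn);
	return (uint8_t *)rt + via_offset(rt->nhn) + (size_t)i * rt->max_alen;
}
```

With this layout a route whose nexthops are all IPv4 pays 4 bytes per via slot, while a single longer address raises the stride for every nexthop in that route.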

Re: [PATCH net-next 1/2] mpls: fix forwarding using v4/v6 explicit null

2015-10-27 Thread roopa

On 10/26/15, 5:37 PM, Robert Shearman wrote:
> Fill in the via address length for the predefined IPv4 and IPv6
> explicit-null label routes.
>
> Fixes: f8efb73c97e2 ("mpls: multipath route support")
> Signed-off-by: Robert Shearman 
> ---
>  net/mpls/af_mpls.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
> index cc972e30355b..1c58662db4b2 100644
> --- a/net/mpls/af_mpls.c
> +++ b/net/mpls/af_mpls.c
> @@ -1345,6 +1345,7 @@ static int resize_platform_label_table(struct net *net, size_t limit)
>   rt0->rt_protocol = RTPROT_KERNEL;
>   rt0->rt_payload_type = MPT_IPV4;
>   rt0->rt_nh->nh_via_table = NEIGH_LINK_TABLE;
> + rt0->rt_nh->nh_via_alen = lo->addr_len;
>   memcpy(rt0->rt_nh->nh_via, lo->dev_addr, lo->addr_len);
>   }
>   if (limit > MPLS_LABEL_IPV6NULL) {
> @@ -1356,6 +1357,7 @@ static int resize_platform_label_table(struct net *net, size_t limit)
>   rt2->rt_protocol = RTPROT_KERNEL;
>   rt2->rt_payload_type = MPT_IPV6;
>   rt2->rt_nh->nh_via_table = NEIGH_LINK_TABLE;
> + rt2->rt_nh->nh_via_alen = lo->addr_len;
>   memcpy(rt2->rt_nh->nh_via, lo->dev_addr, lo->addr_len);
>   }
>  
Acked-by: Roopa Prabhu 


Re: [net PATCH] fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key

2015-10-27 Thread David Miller

From: Alexander Duyck 
Date: Tue, 27 Oct 2015 15:06:45 -0700

> We were computing the child index in cases where the key value we were
> looking for was actually less than the base key of the tnode.  As a result
> we were getting incorrect index values that would cause us to skip over
> some children.
> 
> To fix this I have added a test that will force us to use child index 0 if
> the key we are looking for is less than the key of the current tnode.
> 
> Fixes: 8be33e955cb9 ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
> Reported-by: Brian Rak 
> Signed-off-by: Alexander Duyck 

Applied and queued up for -stable, thanks.
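
The index problem described above can be reproduced with a simplified model. This is a sketch only: struct tnode, raw_index() and walk_index() are hypothetical stand-ins that follow the XOR-and-shift style of get_index() but omit everything else in fib_trie.c.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified tnode: the child index is taken from key bits
 * [pos, pos + bits); the bits above that must match the node's key. */
struct tnode {
	uint32_t key;	/* base key of the node */
	uint8_t pos;	/* position of the lowest index bit */
	uint8_t bits;	/* number of index bits (2^bits children) */
};

/* XOR against the base key and shift, in the style of get_index(). */
static uint32_t raw_index(uint32_t key, const struct tnode *tn)
{
	assert(tn->pos < 32);
	return (key ^ tn->key) >> tn->pos;
}

/* Child index used when looking for the first leaf with key >= `key`:
 * a key below the node's base key must start at child 0. */
static uint32_t walk_index(uint32_t key, const struct tnode *tn)
{
	return key > tn->key ? raw_index(key, tn) : 0;
}
```

For a node with base key 0x0A000000, pos 8 and bits 8, looking up 0x09000500 with raw_index() yields an index that does not fit in 8 bits, so a walk using it would treat the node as exhausted; walk_index() starts at child 0 instead.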

new coverity defect in ipv6 route

2015-10-27 Thread Stephen Hemminger



Begin forwarded message:

Date: Tue, 27 Oct 2015 08:43:53 -0700
From: scan-ad...@coverity.com
To: step...@networkplumber.org
Subject: New Defects reported by Coverity Scan for Linux



Hi,

Please find the latest report on new defect(s) introduced to Linux found with Coverity Scan.

2 new defect(s) introduced to Linux found with Coverity Scan.
12 defect(s), reported by Coverity Scan earlier, were marked fixed in the recent build analyzed by Coverity Scan.

New defect(s) Reported-by: Coverity Scan
Showing 2 of 2 defect(s)


** CID 1328821:  Memory - corruptions  (ARRAY_VS_SINGLETON)
/net/ipv6/route.c: 320 in rt6_info_init()



*** CID 1328821:  Memory - corruptions  (ARRAY_VS_SINGLETON)
/net/ipv6/route.c: 320 in rt6_info_init()
314 #endif
315 
316 static void rt6_info_init(struct rt6_info *rt)
317 {
318 struct dst_entry *dst = &rt->dst;
319 
>>> CID 1328821:  Memory - corruptions  (ARRAY_VS_SINGLETON)
>>> Using "dst" as an array.  This might corrupt or misinterpret adjacent memory locations.
320 memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
321 INIT_LIST_HEAD(&rt->rt6i_siblings);
322 INIT_LIST_HEAD(&rt->rt6i_uncached);
323 }
324 
325 /* allocate dst with ip6_dst_ops */

** CID 1328822:  Incorrect expression  (UNUSED_VALUE)
/drivers/net/wireless/rtlwifi/rtl8821ae/sw.c: 170 in rtl8821ae_init_sw_vars()



*** CID 1328822:  Incorrect expression  (UNUSED_VALUE)
/drivers/net/wireless/rtlwifi/rtl8821ae/sw.c: 170 in rtl8821ae_init_sw_vars()
164 /* for debug level */
165 rtlpriv->dbg.global_debuglevel = rtlpriv->cfg->mod_params->debug;
166 /* for LPS & IPS */
167 rtlpriv->psc.inactiveps = rtlpriv->cfg->mod_params->inactiveps;
168 rtlpriv->psc.swctrl_lps = rtlpriv->cfg->mod_params->swctrl_lps;
169 rtlpriv->psc.fwctrl_lps = rtlpriv->cfg->mod_params->fwctrl_lps;
>>> CID 1328822:  Incorrect expression  (UNUSED_VALUE)
>>> Assigning value from "rtlpriv->cfg->mod_params->msi_support" to "rtlpci->msi_support" here, but that stored value is overwritten before it can be used.
170 rtlpci->msi_support = rtlpriv->cfg->mod_params->msi_support;
171 rtlpci->msi_support = rtlpriv->cfg->mod_params->int_clear;
172 if (rtlpriv->cfg->mod_params->disable_watchdog)
173 pr_info("watchdog disabled\n");
174 rtlpriv->psc.reg_fwctrl_lps = 3;
175 rtlpriv->psc.reg_max_lps_awakeintvl = 5;



To view the defects in Coverity Scan visit, 
https://scan.coverity.com/projects/linux?tab=overview

To manage Coverity Scan email notifications for "step...@networkplumber.org", 
click 
https://scan.coverity.com/subscriptions/edit?email=stephen%40networkplumber.org&token=41b352b884ef3fc73426635eebc294c3


Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Tom Herbert

On Tue, Oct 27, 2015 at 5:12 PM, Hannes Frederic Sowa
 wrote:
> On Tue, Oct 27, 2015, at 23:03, Tom Herbert wrote:
>> On Tue, Oct 27, 2015 at 2:42 PM, Hannes Frederic Sowa
>>  wrote:
>> > I posted v3 just now. I would like to let David consider it for net
>> > inclusion. We can work on how to lift this limitation then in net-next,
>> > okay? I am currently in favor of a new netdev-feature. What do you
>> > think? Your RFC series could help here, too.
>> >
>> I really do not like the feature flag, it's just a bandaid over the
>> real problem-- in fact my goal is to eliminate NETIF_F_IP{V6}_CSUM and
>> just have NETIF_F_HW_CSUM. I will repost the helper patches, but we
>> really do need to start fixing this stuff in the drivers instead of
>> more hacking in the stack.
>
> It would be great if this is doable but I doubt so. There might be a lot
> of unresponsive driver maintainers and I don't see that we should simply
> eliminate IPv4 csum offloading for those drivers, too. Sometimes it is
> hard to patch drivers without documentation.
>
> I am against lifting restrictions which will have unforeseeable
> consequences for some people (as in partial communication errors) or
> having huge performance drawbacks (as in disabling ipv4 csum offloading,
> too).
>
> I could even imagine this needs to be more configurable as in how many
> extension headers some hardware can process, I fear. One extension
> header might be okay (jumping over a fragmentation header), but two... I
> simply don't know, yet. Maybe there is no problem with hardware at all.
>
Hardware that implements NETIF_F_HW_CSUM (i.e. calculates the csum based on
start and offset) should have no problem with extension headers. The
plea in skbuff.h for HW vendors to implement that generic algorithm
has been around a long time, but unfortunately a lot of new HW
still does protocol-specific algorithms. Realistically, I can't deploy
extension headers at scale without checksum offload anyway, so this
just degenerates into another instance where poor HW design decisions
limit the protocols and features we're able to deploy in the data center.
Oh well...
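
The "start and offset" algorithm mentioned above is protocol-agnostic: sum everything from a start position to the end of the packet and store the folded, complemented result at a given offset from that start. A minimal sketch, assuming a flat byte buffer and hypothetical function names (this is not the skb API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement sum over buf[0..len), big-endian words. */
static uint16_t csum_fold_sum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] << 8 | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8;	/* pad odd trailing byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Protocol-agnostic completion: checksum from `start` to the end of the
 * packet and store the result `offset` bytes after `start`. The stack is
 * assumed to have seeded that field (e.g. with a pseudo-header sum). */
static void csum_complete(uint8_t *pkt, size_t len, size_t start, size_t offset)
{
	uint16_t csum;

	assert(start + offset + 2 <= len);
	csum = (uint16_t)~csum_fold_sum(pkt + start, len - start);
	pkt[start + offset] = (uint8_t)(csum >> 8);
	pkt[start + offset + 1] = (uint8_t)(csum & 0xff);
}
```

Nothing in csum_complete() parses headers, which is why extension headers in front of the start position make no difference to a device working this way.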

> I don't really see this series as a hack. ;)
>
> Unluckily it seems we don't get feedback from the hardware about not
> being able to construct a proper checksum, so we cannot even close the
> loop and add code which warns us about misbehaving drivers.
>
> Bye,
> Hannes

Re: [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy

2015-10-27 Thread David Miller

From: Johannes Weiner 
Date: Tue, 27 Oct 2015 09:42:27 -0700

> On Tue, Oct 27, 2015 at 05:15:54PM +0100, Michal Hocko wrote:
>> > For now, something like this as a boot commandline?
>> > 
>> > cgroup.memory=nosocket
>> 
>> That would work for me.
> 
> Okay, then I'll go that route for the socket stuff.
> 
> Dave is that cool with you?

Depends upon the default.

Until the user configures something explicitly into the memory
controller, the networking bits should all evaluate to nothing.

Re: [net PATCH] fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key

2015-10-27 Thread Brian Rak




On 10/27/2015 6:06 PM, Alexander Duyck wrote:

We were computing the child index in cases where the key value we were
looking for was actually less than the base key of the tnode.  As a result
we were getting incorrect index values that would cause us to skip over
some children.

To fix this I have added a test that will force us to use child index 0 if
the key we are looking for is less than the key of the current tnode.

Fixes: 8be33e955cb9 ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
Reported-by: Brian Rak 
Signed-off-by: Alexander Duyck 
---

This will need to be queued up for stable as well.  This applies to 4.1 and
4.2 kernels as well.

  net/ipv4/fib_trie.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 6c2af797f2f9..744e5936c10d 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1569,7 +1569,7 @@ static struct key_vector *leaf_walk_rcu(struct key_vector **tn, t_key key)
do {
/* record parent and next child index */
pn = n;
-   cindex = key ? get_index(key, pn) : 0;
+   cindex = (key > pn->key) ? get_index(key, pn) : 0;

if (cindex >> pn->bits)
break;

Just built 4.2.5 with this patch, and everything works fine.  Thanks for 
your help!


Re: [Bug 106241] New: shutdown(3)/close(3) behaviour is incorrect for sockets in accept(3)

2015-10-27 Thread Eric Dumazet

On Tue, 2015-10-27 at 23:17 +, Al Viro wrote:

>   * [Linux-specific aside] our __alloc_fd() can degrade quite badly
> with some use patterns.  The cacheline pingpong in the bitmap is probably
> inevitable, unless we accept considerably heavier memory footprint,
> but we also have a case when alloc_fd() takes O(n) and it's _not_ hard
> to trigger - close(3);open(...); will have the next open() after that
> scanning the entire in-use bitmap.  I think I see a way to improve it
> without slowing the normal case down, but I'll need to experiment a
> bit before I post patches.  Anybody with examples of real-world loads
> that make our descriptor allocator to degrade is very welcome to post
> the reproducers...

Well, I do have real-world loads, but they are quite hard to set up in a lab :(

Note that we also hit the 'struct cred'->usage refcount for every
open()/close()/sock_alloc(), and simply moving uid/gid out of the first
cache line really helps, as current_fsuid() and current_fsgid() no
longer force a pingpong.

I moved seldom-used fields onto the first cache line, so that overall
memory usage did not change (192 bytes on 64-bit arches).


diff --git a/include/linux/cred.h b/include/linux/cred.h
index 8d70e1361ecd..460efae83522 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -124,7 +124,17 @@ struct cred {
 #define CRED_MAGIC 0x43736564
 #define CRED_MAGIC_DEAD 0x44656144
 #endif
-   kuid_t  uid;    /* real UID of the task */
+   struct rcu_head rcu;    /* RCU deletion hook */
+
+   kernel_cap_t cap_inheritable; /* caps our children can inherit */
+   kernel_cap_t cap_permitted;  /* caps we're permitted */
+   kernel_cap_t cap_effective;  /* caps we can actually use */
+   kernel_cap_t cap_bset;   /* capability bounding set */
+   kernel_cap_t cap_ambient;    /* Ambient capability set */
+
+   kuid_t  uid ____cacheline_aligned_in_smp;
+   /* real UID of the task */
+
kgid_t  gid;    /* real GID of the task */
kuid_t  suid;   /* saved UID of the task */
kgid_t  sgid;   /* saved GID of the task */
@@ -133,11 +143,6 @@ struct cred {
kuid_t  fsuid;  /* UID for VFS ops */
kgid_t  fsgid;  /* GID for VFS ops */
unsigned securebits; /* SUID-less security management */
-   kernel_cap_t cap_inheritable; /* caps our children can inherit */
-   kernel_cap_t cap_permitted;  /* caps we're permitted */
-   kernel_cap_t cap_effective;  /* caps we can actually use */
-   kernel_cap_t cap_bset;   /* capability bounding set */
-   kernel_cap_t cap_ambient;    /* Ambient capability set */
 #ifdef CONFIG_KEYS
unsigned char   jit_keyring;    /* default keyring to attach requested
 * keys to */
@@ -152,7 +157,6 @@ struct cred {
struct user_struct *user;   /* real user ID subscription */
struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
struct group_info *group_info;  /* supplementary groups for euid/fsgid */
-   struct rcu_head rcu;    /* RCU deletion hook */
 };
 
 extern void __put_cred(struct cred *);
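
The effect of the reordering can be checked with offsetof() on a reduced model. This is a sketch under stated assumptions: struct cred_sketch is a hypothetical, much smaller stand-in for struct cred, a 64-byte cache line is assumed, and C11 alignas plays the role of the kernel's alignment annotation.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumed cache line size */

/* Heavily simplified stand-in for a credentials structure. */
struct cred_sketch {
	/* First cache line: the hot refcount plus rarely read fields. */
	int usage;			/* written on every get/put */
	uint64_t caps[5];		/* stands in for the capability sets */

	/* Second cache line: read-mostly identity fields. */
	alignas(CACHELINE) uint32_t uid;
	uint32_t gid;
	uint32_t fsuid;
	uint32_t fsgid;
};

static_assert(offsetof(struct cred_sketch, uid) % CACHELINE == 0,
	      "uid must start a cache line");

/* True if two byte offsets fall into the same cache line. */
static int same_cacheline(size_t a, size_t b)
{
	return a / CACHELINE == b / CACHELINE;
}
```

With the refcount and the identity fields on different lines, readers of fsuid and fsgid no longer share a line with the counter that every get and put writes.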




Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Hannes Frederic Sowa

On Tue, Oct 27, 2015, at 23:03, Tom Herbert wrote:
> On Tue, Oct 27, 2015 at 2:42 PM, Hannes Frederic Sowa
>  wrote:
> > I posted v3 just now. I would like to let David consider it for net
> > inclusion. We can work on how to lift this limitation then in net-next,
> > okay? I am currently in favor of a new netdev-feature. What do you
> > think? Your RFC series could help here, too.
> >
> I really do not like the feature flag, it's just a bandaid over the
> real problem-- in fact my goal is to eliminate NETIF_F_IP{V6}_CSUM and
> just have NETIF_F_HW_CSUM. I will repost the helper patches, but we
> really do need to start fixing this stuff in the drivers instead of
> more hacking in the stack.

It would be great if this is doable but I doubt so. There might be a lot
of unresponsive driver maintainers and I don't see that we should simply
eliminate IPv4 csum offloading for those drivers, too. Sometimes it is
hard to patch drivers without documentation.

I am against lifting restrictions which will have unforeseeable 
consequences for some people (as in partial communication errors) or
having huge performance drawbacks (as in disabling ipv4 csum offloading,
too).

I could even imagine this needs to be more configurable as in how many
extension headers some hardware can process, I fear. One extension
header might be okay (jumping over a fragmentation header), but two... I
simply don't know, yet. Maybe there is no problem with hardware at all.

I don't really see this series as a hack. ;)

Unluckily it seems we don't get feedback from the hardware about not
being able to construct a proper checksum, so we cannot even close the
loop and add code which warns us about misbehaving drivers.

Bye,
Hannes

Re: [PATCH net-next v3 1/4] ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Hannes Frederic Sowa

On Tue, Oct 27, 2015, at 23:22, Tom Herbert wrote:
> On Tue, Oct 27, 2015 at 2:40 PM, Hannes Frederic Sowa
>  wrote:
> > We cannot reliably calculate packet size on MSG_MORE corked sockets
> > and thus cannot decide if they are going to be fragmented later on,
> > so better not use CHECKSUM_PARTIAL in the first place.
> >
> > Cc: Eric Dumazet 
> > Cc: Vlad Yasevich 
> > Cc: Benjamin Coddington 
> > Cc: Tom Herbert 
> > Signed-off-by: Hannes Frederic Sowa 
> > ---
> >  net/ipv4/ip_output.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> > index 50e2973..0b02417 100644
> > --- a/net/ipv4/ip_output.c
> > +++ b/net/ipv4/ip_output.c
> > @@ -911,6 +911,7 @@ static int __ip_append_data(struct sock *sk,
> > if (transhdrlen &&
> > length + fragheaderlen <= mtu &&
> > rt->dst.dev->features & NETIF_F_V4_CSUM &&
> > +   !(flags & MSG_MORE) &&
> 
> I still don't understand this. It seems like the effect is to disable
> checksum offload for all UDP messages sent with MSG_MORE flag set.

Exactly.

MSG_MORE/UDP_CORK is a method to append data to the *same* UDP packet.
The probability that this packet exceeds the MTU is rather large, as it
is mostly used to prepare a header and later send data via the
sendpage/sendfile syscalls (IPv6 UDP has no sendpage, so it falls back to
the normal udpv6_sendmsg path). sendpage is mostly used to send rather large
amounts of data, because for small amounts regular copying might be faster
(IMHO). So the probability that we exceed the MTU is quite high. This is the
case for NFSv4, which uses this flag over UDP, sending the xdr header and
later the filesystem data directly from the page cache. You will
still have CHECKSUM_PARTIAL capability with sendmsg and multiple iovecs!
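
For anyone who has not used the flag from user space, here is a minimal
sketch of the corking pattern described above: a small header is written
with MSG_MORE and the payload follows in a second send(), so both end up
in one UDP datagram. The helper name corked_udp_roundtrip() and the use of
loopback are only assumptions for illustration (Linux only); this is not
what NFS does in the kernel.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send hdr and body as ONE UDP datagram over
 * loopback using MSG_MORE, then read the datagram back.
 * Returns the size of the received datagram, or -1 on error.
 */
static ssize_t corked_udp_roundtrip(const void *hdr, size_t hdr_len,
                                    const void *body, size_t body_len,
                                    void *out, size_t out_len)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct timeval tv = { 1, 0 };   /* do not block forever in recv() */
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the kernel pick a free port */

    if (setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0 ||
        bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &alen) < 0 ||
        connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* First write: MSG_MORE corks the socket, nothing goes out yet. */
    if (send(tx, hdr, hdr_len, MSG_MORE) < 0)
        goto done;

    /* Second write without MSG_MORE: data is appended and the
     * datagram is pushed out as a whole. */
    if (send(tx, body, body_len, 0) < 0)
        goto done;

    n = recv(rx, out, out_len, 0);
done:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

At the time of the first send() the stack cannot know how large the final
datagram will be, which is exactly the problem for deciding on
CHECKSUM_PARTIAL up front.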

Because we cannot simply switch back to CHECKSUM_NONE in the second
write (the first write would not yet have been checksummed), I decided to
exclude MSG_MORE sends from setting up a CHECKSUM_PARTIAL skb.

Because there would be at least a few more syscalls between the first
write and the second write (not in the NFS example, which runs directly in
the kernel, but in normal user space usage), the data would already be cold
in the caches. So it makes sense to me to checksum the data during copy-in
to trash the CPU caches only once.

The ip6_fragment logic will now catch this case and fragment anyway, but
as I wrote, this is only a last resort.

Hope that makes it more clear.

Bye,
Hannes

Re: [Bug 106241] New: shutdown(3)/close(3) behaviour is incorrect for sockets in accept(3)

2015-10-27 Thread Al Viro

On Tue, Oct 27, 2015 at 10:52:46AM +, Alan Burlison wrote:
> Unfortunately Hadoop isn't the only thing that pulls the shutdown()
> trick, so I don't think there's a simple fix for this, as discussed
> earlier in the thread. Having said that, if close() on Linux also
> did an implicit shutdown() it would mean that well-written
> applications that handled the scoping, sharing and reuse of FDs
> properly could just call close() and have it work the same way
> across *NIX platforms.

... except for all Linux, FreeBSD and OpenBSD versions out there, but
hey, who's counting those, right?  Not to mention the OSX behaviour -
I really have no idea what it does; the FreeBSD ancestry in its kernel
is distant enough for a lot of changes to have happened in that area.

So...  Which Unices other than Solaris and NetBSD actually behave that
way?  I.e. have close(fd) cancel accept(fd) another thread is sitting
in.  Note that NetBSD implementation has known races.  Linux, FreeBSD
and OpenBSD don't do that at all.

Frankly, as far as I'm concerned, the bottom line is
* there are two variants of semantics in that area and there's not
much that could be done about that.
* POSIX is vague enough for both variants to comply with it (it's
also very badly written in the area in question).
* I don't see any way to implement something similar to Solaris
behaviour without a huge increase of memory footprint or massive cacheline
pingpong.  Solaris appears to go for memory footprint from hell - cacheline
per descriptor (instead of a pointer per descriptor).
* the benefits of Solaris-style behaviour are not obvious - all things
equal it would be interesting, but the things are very much not equal.  What's
more, if your userland code is such that accept() argument could be closed by
another thread, the caller *cannot* do anything with said argument after
accept() returns, no matter which variant of semantics is used.
* [Linux-specific aside] our __alloc_fd() can degrade quite badly
with some use patterns.  The cacheline pingpong in the bitmap is probably
inevitable, unless we accept considerably heavier memory footprint,
but we also have a case when alloc_fd() takes O(n) and it's _not_ hard
to trigger - close(3);open(...); will have the next open() after that
scanning the entire in-use bitmap.  I think I see a way to improve it
without slowing the normal case down, but I'll need to experiment a
bit before I post patches.  Anybody with examples of real-world loads
that make our descriptor allocator degrade is very welcome to post
the reproducers...
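
To make the O(n) case in the last point concrete, here is a toy user-space
model of a lowest-free-descriptor allocator with a "next free" hint. It is
only a sketch under simplifying assumptions: the names (fd_table, fd_alloc,
fd_free, scanned) are made up for illustration, and the real allocator
scans a bitmap a word at a time rather than one slot at a time.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FDS 4096

/* Toy model, not kernel code. */
struct fd_table {
    bool in_use[MAX_FDS];   /* stand-in for the in-use bitmap */
    size_t next_fd;         /* lowest index that might be free */
    size_t scanned;         /* in-use slots skipped by the last fd_alloc() */
};

/* Return the lowest free descriptor at or above the hint, or -1. */
static int fd_alloc(struct fd_table *t)
{
    size_t fd = t->next_fd;

    t->scanned = 0;
    while (fd < MAX_FDS && t->in_use[fd]) {
        fd++;
        t->scanned++;
    }
    if (fd == MAX_FDS)
        return -1;

    t->in_use[fd] = true;
    t->next_fd = fd + 1;
    return (int)fd;
}

/* Release a descriptor and pull the hint back if needed. */
static void fd_free(struct fd_table *t, int fd)
{
    t->in_use[fd] = false;
    if ((size_t)fd < t->next_fd)
        t->next_fd = (size_t)fd;
}
```

With 1000 descriptors open, fd_free(3) followed by fd_alloc() hands out 3
immediately, but it also leaves the hint at 4, so the allocation after that
has to walk past every in-use slot from 4 up to 999 before it finds 1000.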

Re: [Intel-wired-lan] [PATCHv2] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-27 Thread Keller, Jacob E

On Tue, 2015-10-27 at 18:45 -0400, Peter Hurley wrote:
> On 10/27/2015 02:35 PM, ND Linux CI Server wrote:
> > Greetings,
> > 
> > This email is automatically generated by ND's Linux Patch Testing
> > framework
> > based on aiaiai. I have performed some automatic testing of a patch
> > (series)
> > you submitted to intel-wired-...@lists.osuosl.org
> > 
> > The following contains output of any tests which failed to pass,
> > and might be
> > the result of developer error. The tests performed include but may
> > not be
> > limited to checkpatch.pl, bisection testing, compilation on a
> > default kernel
> > config, coccinelle scripts, cppcheck, and smatch.
> > 
> > If you have received this email in error, or believe that aiaiai
> > has detected a
> > false positive, please email Jacob Keller.
> 
> False positive.
> 
> As long as the delay is at least 1ms (which is guaranteed), slightly
> longer
> delays (relative to the existing reset delay of 100ms) are not
> harmful.
> 
> Use of usleep_range() would be unnecessary overkill for the purpose.
> 
> Regards,
> Peter Hurley


Feel free to ignore this then.

Regards,
Jake

Re: [Intel-wired-lan] [PATCHv2] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-27 Thread Peter Hurley

On 10/27/2015 02:35 PM, ND Linux CI Server wrote:
> Greetings,
> 
> This email is automatically generated by ND's Linux Patch Testing framework
> based on aiaiai. I have performed some automatic testing of a patch (series)
> you submitted to intel-wired-...@lists.osuosl.org
> 
> The following contains output of any tests which failed to pass, and might be
> the result of developer error. The tests performed include but may not be
> limited to checkpatch.pl, bisection testing, compilation on a default kernel
> config, coccinelle scripts, cppcheck, and smatch.
> 
> If you have received this email in error, or believe that aiaiai has detected a
> false positive, please email Jacob Keller.

False positive.

As long as the delay is at least 1ms (which is guaranteed), slightly longer
delays (relative to the existing reset delay of 100ms) are not harmful.

Use of usleep_range() would be unnecessary overkill for the purpose.
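
For reference, the "up to 20ms" in the checkpatch warning can be reproduced
with a little arithmetic. The sketch below assumes that msleep() sleeps for
msecs_to_jiffies(ms) + 1 timer ticks and that HZ divides 1000 evenly; both
are my understanding of the implementation rather than something stated in
this thread, and the model_* names are hypothetical.

```c
/* Round milliseconds up to whole ticks (model of msecs_to_jiffies()). */
static unsigned long model_msecs_to_jiffies(unsigned int ms, unsigned int hz)
{
    return ((unsigned long)ms * hz + 999) / 1000;
}

/* Upper bound, in ms, on how long msleep(ms) may sleep in this model. */
static unsigned long model_msleep_max_ms(unsigned int ms, unsigned int hz)
{
    unsigned long ticks = model_msecs_to_jiffies(ms, hz) + 1;

    return ticks * 1000 / hz;
}
```

Under these assumptions msleep(1) is bounded by 20ms at HZ=100, 8ms at
HZ=250 and 2ms at HZ=1000, all of which are small next to the existing
100ms reset delay.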

Regards,
Peter Hurley


> ---
> 
> I have tested your changes
> 
> [Intel-wired-lan] [PATCHv2] ixgbe: Wait for 1ms, not 1us, after RST
> 
> Project: net (net-current development queue)
> 
> Configurations: intel_defconfig,x86
> 
> Tested the patch(es) on top of the following commits:
> 505b857 ixgbe: Reset interface after enabling SR-IOV
> ce9d9b8 net: sysctl: fix a kmemleak warning
> 1acea4f ppp: fix pppoe_dev deletion condition in pppoe_release()
> f6b8dec9 af_key: fix two typos
> 
> 
> 
> Successfully built configuration "intel_defconfig,x86", no issues.
> 
> 
> 
> checkpatch.pl has some complaints:
> 
> 
> 
> checkpatch.pl results for patch "[PATCH] ixgbe: Wait for 1ms, not 1us, after 
> RST"
> 
> WARNING:MSLEEP: msleep < 20ms can sleep for up to 20ms; see 
> Documentation/timers/timers-howto.txt
> #29: FILE: drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c:119:
> + msleep(1);
> 
> total: 0 errors, 1 warnings, 0 checks, 13 lines checked
> 
> 



Re: [PATCH net-next] hyperv: Add handler for RNDIS_STATUS_NETWORK_CHANGE event

2015-10-27 Thread Richard Weinberger

On Mon, Jun 23, 2014 at 10:10 PM, David Miller  wrote:
> From: Haiyang Zhang 
> Date: Mon, 23 Jun 2014 16:09:59 +
>
>> So, what's the equivalent or similar command to "network restart" on SLES12? 
>> Could
>> you update the command line for the usermodehelper when porting this patch 
>> to SLES
>> 12?
>
> No, you are not going to keep the usermodehelper invocation in your driver
> please remove it.  It is absolutely inappropriate, and I strictly do not want
> to keep it in there because other people will copy it and then we'll have a
> real mess on our hands.

Sorry for digging up this old thread.
While talking with some guys about usermodehelper abuses I came across this gem.
Mainline still contains that "/etc/init.d/network restart" code.
Haiyang, care to clean up?

-- 
Thanks,
//richard

Re: [PATCH net-next v3 1/4] ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Tom Herbert

On Tue, Oct 27, 2015 at 2:40 PM, Hannes Frederic Sowa
 wrote:
> We cannot reliably calculate packet size on MSG_MORE corked sockets
> and thus cannot decide if they are going to be fragmented later on,
> so better not use CHECKSUM_PARTIAL in the first place.
>
> Cc: Eric Dumazet 
> Cc: Vlad Yasevich 
> Cc: Benjamin Coddington 
> Cc: Tom Herbert 
> Signed-off-by: Hannes Frederic Sowa 
> ---
>  net/ipv4/ip_output.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 50e2973..0b02417 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -911,6 +911,7 @@ static int __ip_append_data(struct sock *sk,
> if (transhdrlen &&
> length + fragheaderlen <= mtu &&
> rt->dst.dev->features & NETIF_F_V4_CSUM &&
> +   !(flags & MSG_MORE) &&

I still don't understand this. It seems like the effect is to disable
checksum offload for all UDP messages sent with MSG_MORE flag set.

> !exthdrlen)
> csummode = CHECKSUM_PARTIAL;
>
> --
> 2.5.0
>

Re: [PATCH v1 1/3] virtio-net: Using single MSIX IRQ for TX/RX Q pair

2015-10-27 Thread Ravi Kerur



On 10/27/2015 1:38 AM, Michael S. Tsirkin wrote:
> On Mon, Oct 26, 2015 at 10:52:47AM -0700, Ravi Kerur wrote:
>> Ported earlier patch from Jason Wang (dated 12/26/2014).
>>
>> This patch tries to reduce the number of MSIX irqs required for
>> virtio-net by sharing a MSIX irq for each TX/RX queue pair through
>> channels. If transport support channel, about half of the MSIX irqs
>> were reduced.
>>
>> Signed-off-by: Ravi Kerur 
> 
> Why bother BTW? Looks like this is adding a bunch of overhead
> on data path - to what end?
> 
> Maybe you have a huge number of these devices ... but in that case, how
> about sharing the config interrupt instead?
> That's only possible if host supports VIRTIO_1
> (so we can detect config interrupt by reading the ISR).

For my clarification, are you suggesting this as an additional change for
config interrupts, or a rework of the existing patch?
> 
> 
> 
>> ---
>>  drivers/net/virtio_net.c | 29 -
>>  1 file changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index d8838ded..d705cce 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -72,6 +72,9 @@ struct send_queue {
>>  
>>  /* Name of the send queue: output.$index */
>>  char name[40];
>> +
>> +/* Name of the channel, shared with irq. */
>> +char channel_name[40];
>>  };
>>  
>>  /* Internal representation of a receive virtqueue */
>> @@ -1529,6 +1532,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  int ret = -ENOMEM;
>>  int i, total_vqs;
>>  const char **names;
>> +const char **channel_names;
>> +unsigned *channels;
>>  
>>  /* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by
>>   * possible N-1 RX/TX queue pairs used in multiqueue mode, followed by
>> @@ -1548,6 +1553,17 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  if (!names)
>>  goto err_names;
>>  
>> +channel_names = kmalloc_array(vi->max_queue_pairs,
>> +  sizeof(*channel_names),
>> +  GFP_KERNEL);
>> +if (!channel_names)
>> +goto err_channel_names;
>> +
>> +channels = kmalloc_array(total_vqs, sizeof(*channels),
>> + GFP_KERNEL);
>> +if (!channels)
>> +goto err_channels;
>> +
>>  /* Parameters for control virtqueue, if any */
>>  if (vi->has_cvq) {
>>  callbacks[total_vqs - 1] = NULL;
>> @@ -1562,10 +1578,15 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  sprintf(vi->sq[i].name, "output.%d", i);
>>  names[rxq2vq(i)] = vi->rq[i].name;
>>  names[txq2vq(i)] = vi->sq[i].name;
>> +sprintf(vi->sq[i].channel_name, "txrx.%d", i);
>> +channel_names[i] = vi->sq[i].channel_name;
>> +channels[rxq2vq(i)] = i;
>> +channels[txq2vq(i)] = i;
>>  }
>>  
>>  ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
>> - names);
>> + names, channels, channel_names,
>> + vi->max_queue_pairs);
>>  if (ret)
>>  goto err_find;
>>  
>> @@ -1580,6 +1601,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  vi->sq[i].vq = vqs[txq2vq(i)];
>>  }
>>  
>> +kfree(channels);
>> +kfree(channel_names);
>>  kfree(names);
>>  kfree(callbacks);
>>  kfree(vqs);
>> @@ -1587,6 +1610,10 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  return 0;
>>  
>>  err_find:
>> +kfree(channels);
>> +err_channels:
>> +kfree(channel_names);
>> +err_channel_names:
>>  kfree(names);
>>  err_names:
>>  kfree(callbacks);
>> -- 
>> 1.9.1

Re: [PATCH v1 1/3] virtio-net: Using single MSIX IRQ for TX/RX Q pair

2015-10-27 Thread Ravi Kerur



On 10/26/2015 10:11 PM, Jason Wang wrote:
> 
> 
> On 10/27/2015 01:52 AM, Ravi Kerur wrote:
>> Ported earlier patch from Jason Wang (dated 12/26/2014).
>>
>> This patch tries to reduce the number of MSIX irqs required for
>> virtio-net by sharing a MSIX irq for each TX/RX queue pair through
>> channels. If transport support channel, about half of the MSIX irqs
>> were reduced.
>>
>> Signed-off-by: Ravi Kerur 
>> ---
>>  drivers/net/virtio_net.c | 29 -
>>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> Thanks for the patches. Some minor comments:
> 
> - If there's no big changes of the code, better keep my sign-offs :)

Sorry for that. Will fix it in 'v2'

> - Rusty does not like the name "channels", so better rename it to
> "virtqueue groups"
> - Build bot reports some compile issues, these need to be fixed in the next
> version.

I saw the build failure email; it was reported against the 2nd patch. All 3
patches need to be applied for a successful build. Will look into it and fix
any issues.

> - The order of patches in this series is reversed, patch 1/3 should be
> 3/3. And better to have a cover letter to describe the motivation and
> changes since last series. (You can do this through git format-patch
> --cover)
> - Michael's comment about unnecessary wakeup of tx queue needs to be
> addressed, otherwise, we may get unnecessary tx interrupts.

I am working on it, will take care of above comments as well in 'v2'

> - Some benchmarks are needed to make sure there's no performance regression.
> 
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index d8838ded..d705cce 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -72,6 +72,9 @@ struct send_queue {
>>  
>>  /* Name of the send queue: output.$index */
>>  char name[40];
>> +
>> +/* Name of the channel, shared with irq. */
>> +char channel_name[40];
>>  };
>>  
>>  /* Internal representation of a receive virtqueue */
>> @@ -1529,6 +1532,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  int ret = -ENOMEM;
>>  int i, total_vqs;
>>  const char **names;
>> +const char **channel_names;
>> +unsigned *channels;
>>  
>>  /* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by
>>   * possible N-1 RX/TX queue pairs used in multiqueue mode, followed by
>> @@ -1548,6 +1553,17 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  if (!names)
>>  goto err_names;
>>  
>> +channel_names = kmalloc_array(vi->max_queue_pairs,
>> +  sizeof(*channel_names),
>> +  GFP_KERNEL);
>> +if (!channel_names)
>> +goto err_channel_names;
>> +
>> +channels = kmalloc_array(total_vqs, sizeof(*channels),
>> + GFP_KERNEL);
>> +if (!channels)
>> +goto err_channels;
>> +
>>  /* Parameters for control virtqueue, if any */
>>  if (vi->has_cvq) {
>>  callbacks[total_vqs - 1] = NULL;
>> @@ -1562,10 +1578,15 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  sprintf(vi->sq[i].name, "output.%d", i);
>>  names[rxq2vq(i)] = vi->rq[i].name;
>>  names[txq2vq(i)] = vi->sq[i].name;
>> +sprintf(vi->sq[i].channel_name, "txrx.%d", i);
>> +channel_names[i] = vi->sq[i].channel_name;
>> +channels[rxq2vq(i)] = i;
>> +channels[txq2vq(i)] = i;
>>  }
>>  
>>  ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
>> - names);
>> + names, channels, channel_names,
>> + vi->max_queue_pairs);
>>  if (ret)
>>  goto err_find;
>>  
>> @@ -1580,6 +1601,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  vi->sq[i].vq = vqs[txq2vq(i)];
>>  }
>>  
>> +kfree(channels);
>> +kfree(channel_names);
>>  kfree(names);
>>  kfree(callbacks);
>>  kfree(vqs);
>> @@ -1587,6 +1610,10 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>>  return 0;
>>  
>>  err_find:
>> +kfree(channels);
>> +err_channels:
>> +kfree(channel_names);
>> +err_channel_names:
>>  kfree(names);
>>  err_names:
>>  kfree(callbacks);
> 

[net PATCH] fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key

2015-10-27 Thread Alexander Duyck

We were computing the child index in cases where the key value we were
looking for was actually less than the base key of the tnode.  As a result
we were getting incorrect index values that would cause us to skip over
some children.

To fix this I have added a test that will force us to use child index 0 if
the key we are looking for is less than the key of the current tnode.
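
A small user-space model may help show what goes wrong. It assumes the
child index is computed as (key ^ tnode key) >> pos, which is how I read
get_index(); the struct and function names here (tnode_model, index_of,
cindex_old, cindex_new) are hypothetical and only mirror the two variants
of the expression touched by this patch.

```c
#include <stdint.h>

typedef uint32_t t_key;

/* Toy stand-in for a tnode: only the fields the index math needs. */
struct tnode_model {
    t_key key;           /* base key, low (pos + bits) bits are zero */
    unsigned char pos;   /* bit position where the child index starts */
    unsigned char bits;  /* log2 of the number of children */
};

/* Assumed shape of get_index(): XOR against the base key, then shift. */
static unsigned long index_of(t_key key, const struct tnode_model *tn)
{
    return (unsigned long)(key ^ tn->key) >> tn->pos;
}

/* Before the fix: any non-zero key goes through the index math. */
static unsigned long cindex_old(t_key key, const struct tnode_model *tn)
{
    return key ? index_of(key, tn) : 0;
}

/* After the fix: keys at or below the base key start at child 0. */
static unsigned long cindex_new(t_key key, const struct tnode_model *tn)
{
    return (key > tn->key) ? index_of(key, tn) : 0;
}
```

Take a tnode with base key 0x0A000000, pos 16 and bits 8, and look up
0x09000000. The old expression yields 0x300, which is out of range for an
8-bit index, so the walk gives up on this tnode even though every child in
it is a valid "next" candidate. The new expression yields 0, and keys
inside the tnode's range are unaffected.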

Fixes: 8be33e955cb9 ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
Reported-by: Brian Rak 
Signed-off-by: Alexander Duyck 
---

This will need to be queued up for stable as well.  This applies to 4.1 and
4.2 kernels as well.

 net/ipv4/fib_trie.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 6c2af797f2f9..744e5936c10d 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1569,7 +1569,7 @@ static struct key_vector *leaf_walk_rcu(struct key_vector **tn, t_key key)
do {
/* record parent and next child index */
pn = n;
-   cindex = key ? get_index(key, pn) : 0;
+   cindex = (key > pn->key) ? get_index(key, pn) : 0;
 
if (cindex >> pn->bits)
break;


Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Tom Herbert

On Tue, Oct 27, 2015 at 2:42 PM, Hannes Frederic Sowa
 wrote:
> Hi Tom,
>
> On Tue, Oct 27, 2015, at 20:19, Hannes Frederic Sowa wrote:
>> On Tue, Oct 27, 2015, at 19:37, Tom Herbert wrote:
>> > On Tue, Oct 27, 2015 at 11:29 AM, Hannes Frederic Sowa
>> >  wrote:
>> > > On Tue, Oct 27, 2015, at 18:32, Tom Herbert wrote:
>> > >> On Tue, Oct 27, 2015 at 9:44 AM, Hannes Frederic Sowa
>> > >>  wrote:
>> > >> >
>> > >> >
>> > >> > On Tue, Oct 27, 2015, at 17:36, Tom Herbert wrote:
>> > >> >> > -   if (cork->length + length > maxnonfragsize - headersize) {
>> > >> >> > +   if (cork->length + length > maxnonfragsize - headersize) {
>> > >> >> >  emsgsize:
>> > >> >> > -   ipv6_local_error(sk, EMSGSIZE, fl6,
>> > >> >> > -mtu - headersize +
>> > >> >> > -sizeof(struct ipv6hdr));
>> > >> >> > -   return -EMSGSIZE;
>> > >> >> > -   }
>> > >> >> > +   ipv6_local_error(sk, EMSGSIZE, fl6,
>> > >> >> > +mtu - headersize +
>> > >> >> > +sizeof(struct ipv6hdr));
>> > >> >> > +   return -EMSGSIZE;
>> > >> >> > }
>> > >> >> >
>> > >> >> > +   /* CHECKSUM_PARTIAL only with no extension headers and when
>> > >> >>
>> > >> >> No, please don't do this. CHECKSUM_PARTIAL should work with extension
>> > >> >> headers as defined, so this is just disabling otherwise valid and
>> > >> >> useful functionality. If (some) drivers have problems with this they
>> > >> >> need to be identified and fixed.
>> > >> >
>> > >> > I don't understand. The old code already didn't allow the use of
>> > >> > opt_flen with CHECKSUM_PARTIAL.
>> > >> >
>> > >> Then that's a problem with the old code :-). Is there any other reason
>> > >> that we can't use CHECKSUM_PARTIAL with extension headers other than
>> > >> lack of correct driver support?
>> > >
>> > > The lack of correct driver support is a big bumper, but as I wrote, I
>> > > don't see a reason to not lift this restriction in net-next. I proposed
>> > > a new feature flag, or by looking at your series, we could probably use
>> > > the extension header okay field for that.
>> > >
>> > Okay, but why bother doing this for net? This problem has obviously
>> > existed for a while, and even if the restriction is maintained here
>> > there are still other paths that don't go through ip_append_data that
>> > could trip the bug. Also, drivers are welcome to fix their issues in
>> > net I believe.
>>
>> I even don't know if it could be a hardware issue. Also I don't want to
>> break people's communication with a patch.
>> IMHO without the WARN_ON_ONCEs, which I agreed to remove, I currently
>> don't see any problem for net.
>>
>> You don't agree on a netdev-feature flag, indicating the driver is okay
>> with hardware checksumming and extension headers? We could add this to
>> net-next pretty fast, I think. It does not require people to revert this
>> patch in case their driver misbehaves and we don't get a fix for it,
>> soon. Also what should we do if the driver simply does not support
>> extension headers + checksum offloading? Completely kill checksum
>> offloading for IPv6?
>
> I posted v3 just now. I would like to let David consider it for net
> inclusion. We can work on how to lift this limitation then in net-next,
> okay? I am currently in favor of a new netdev-feature. What do you
> think? Your RFC series could help here, too.
>
I really do not like the feature flag, it's just a bandaid over the
real problem-- in fact my goal is to eliminate NETIF_F_IP{V6}_CSUM and
just have NETIF_F_HW_CSUM. I will repost the helper patches, but we
really do need to start fixing this stuff in the drivers instead of
more hacking in the stack.

Tom

> Thanks,
> Hannes
>

Fw: [Bug 106711] New: VXLAN: RTNL assertion failed at net/core/net_namespace.c:187

2015-10-27 Thread Stephen Hemminger



Begin forwarded message:

Date: Tue, 27 Oct 2015 14:05:08 +
From: "bugzilla-dae...@bugzilla.kernel.org" 

To: "shemmin...@linux-foundation.org" 
Subject: [Bug 106711] New: VXLAN: RTNL assertion failed at 
net/core/net_namespace.c:187


https://bugzilla.kernel.org/show_bug.cgi?id=106711

Bug ID: 106711
   Summary: VXLAN: RTNL assertion failed at
net/core/net_namespace.c:187
   Product: Networking
   Version: 2.5
Kernel Version: 4.1.10
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: IPV4
  Assignee: shemmin...@linux-foundation.org
  Reporter: tuomo.turu...@nokia.com
Regression: No

VXLAN packet receiving triggers ASSERT_RTNL() assertion failure if VXLAN
transport interface is in different network namespace than the VXLAN interface
itself:

[   38.891092] RTNL: assertion failed at
/build/distro/work/shared/linux-stable-30bb3a6af25f17c356252ac6cfbfd3ec04ae1a56/net/core/net_namespace.c
(187)
[   38.892738] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted
4.1.10-pc64-distro.git-v1.14 #1
[   38.893720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.7.5-20140531_083030-gandalf 04/01/2014
[   38.894933]  8800ba372280 88013ab3f808 8078a62a
88013fc110f8
[   38.896059]   88013ab3f838 8068b7b5
88013a7a3600
[   38.897172]  8800bbbc2c00 8801381262a0 880138126300
88013ab3f848
[   38.898354] Call Trace:
[   38.898763]  [] dump_stack+0x45/0x57
[   38.899436]  [] __peernet2id+0xa5/0xb0
[   38.900099]  [] peernet2id+0x18/0x30
[   38.900756]  [] vxlan_fdb_info+0xfa/0x360 [vxlan]
[   38.901526]  [] ? __alloc_skb+0x97/0x1e0
[   38.902216]  [] vxlan_fdb_notify+0x72/0x100 [vxlan]
[   38.902994]  [] vxlan_fdb_create+0x136/0x370 [vxlan]
[   38.903797]  [] vxlan_snoop+0x1c5/0x1d0 [vxlan]
[   38.904551]  [] vxlan_rcv+0x35a/0x5c0 [vxlan]
[   38.905393]  [] ? dst_alloc+0x4f/0x180
[   38.906165]  [] vxlan_udp_encap_recv+0x108/0x390 [vxlan]
[   38.907111]  [] ? vxlan_encap_bypass.isra.37+0x160/0x160
[vxlan]
[   38.908278]  [] udp_queue_rcv_skb+0x35b/0x450
[   38.909119]  [] __udp4_lib_rcv+0x126/0x750
[   38.909915]  [] udp_rcv+0x1a/0x20
[   38.910640]  [] ip_local_deliver_finish+0xae/0x230
[   38.911529]  [] ip_local_deliver+0x9a/0xb0
[   38.912341]  [] ip_rcv_finish+0x88/0x370
[   38.913277]  [] ip_rcv+0x2df/0x3c0
[   38.914000]  [] ? load_balance+0x233/0xa20
[   38.914807]  [] __netif_receive_skb_core+0x6e3/0xa20
[   38.915714]  [] ? update_rq_clock.part.81+0x1c/0x40
[   38.916613]  [] __netif_receive_skb+0x1d/0x70
[   38.917461]  [] process_backlog+0xc2/0x170
[   38.918291]  [] net_rx_action+0x20a/0x340
[   38.919123]  [] __do_softirq+0xef/0x320
[   38.919959]  [] run_ksoftirqd+0x25/0x60
[   38.920764]  [] smpboot_thread_fn+0x12f/0x190
[   38.921605]  [] ? sort_range+0x30/0x30
[   38.922384]  [] kthread+0xc9/0xe0
[   38.923104]  [] ? kthread_create_on_node+0x180/0x180
[   38.923994]  [] ret_from_fork+0x42/0x70
[   38.924772]  [] ? kthread_create_on_node+0x180/0x180


It seems to me that either peernet2id() should not be called while receiving
packets, or peernet2id() should not require the rtnl lock.

The issue can be reproduced with the following configuration + ping (one host
is enough, a real network is not needed):

ip netns add ns0
ip netns exec ns0 ip link set lo up
ip netns add ns1
ip netns exec ns1 ip link set lo up
ip netns add ns2
ip netns exec ns2 ip link set lo up
ip netns add ns3
ip netns exec ns3 ip link set lo up
ip link add type veth
ip link set veth0 netns ns0
ip netns exec ns0 ip link set veth0 up
ip netns exec ns0 ip addr add 10.0.0.1/24 dev veth0
ip link set veth1 netns ns1
ip netns exec ns1 ip link set veth1 up
ip netns exec ns1 ip addr add 10.0.0.2/24 dev veth1
ip netns exec ns0 ip link add name vxlan0 type vxlan id 1000 group 224.0.0.1 local 10.0.0.1 dev veth0 learning
ip netns exec ns0 ip link set vxlan0 netns ns2
ip netns exec ns2 ip link set vxlan0 up
ip netns exec ns2 ip addr add 20.0.0.1/24 dev vxlan0
ip netns exec ns1 ip link add name vxlan1 type vxlan id 1000 group 224.0.0.1 local 10.0.0.2 dev veth1 learning
ip netns exec ns1 ip link set vxlan1 netns ns3
ip netns exec ns3 ip link set vxlan1 up
ip netns exec ns3 ip addr add 20.0.0.2/24 dev vxlan1
ip netns exec ns2 ping 20.0.0.1

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Hannes Frederic Sowa

Hi Tom,

On Tue, Oct 27, 2015, at 20:19, Hannes Frederic Sowa wrote:
> On Tue, Oct 27, 2015, at 19:37, Tom Herbert wrote:
> > On Tue, Oct 27, 2015 at 11:29 AM, Hannes Frederic Sowa
> >  wrote:
> > > On Tue, Oct 27, 2015, at 18:32, Tom Herbert wrote:
> > >> On Tue, Oct 27, 2015 at 9:44 AM, Hannes Frederic Sowa
> > >>  wrote:
> > >> >
> > >> >
> > >> > On Tue, Oct 27, 2015, at 17:36, Tom Herbert wrote:
> > >> >> > -   if (cork->length + length > maxnonfragsize - headersize) {
> > >> >> > +   if (cork->length + length > maxnonfragsize - headersize) {
> > >> >> >  emsgsize:
> > >> >> > -   ipv6_local_error(sk, EMSGSIZE, fl6,
> > >> >> > -mtu - headersize +
> > >> >> > -sizeof(struct ipv6hdr));
> > >> >> > -   return -EMSGSIZE;
> > >> >> > -   }
> > >> >> > +   ipv6_local_error(sk, EMSGSIZE, fl6,
> > >> >> > +mtu - headersize +
> > >> >> > +sizeof(struct ipv6hdr));
> > >> >> > +   return -EMSGSIZE;
> > >> >> > }
> > >> >> >
> > >> >> > +   /* CHECKSUM_PARTIAL only with no extension headers and when
> > >> >>
> > >> >> No, please don't do this. CHECKSUM_PARTIAL should work with extension
> > >> >> headers as defined, so this is just disabling otherwise valid and
> > >> >> useful functionality. If (some) drivers have problems with this they
> > >> >> need to be identified and fixed.
> > >> >
> > >> > I don't understand. The old code already didn't allow the use of
> > >> > opt_flen with CHECKSUM_PARTIAL.
> > >> >
> > >> Then that's a problem with the old code :-). Is there any other reason
> > >> that we can't use CHECKSUM_PARTIAL with extension headers other than
> > >> lack of correct driver support?
> > >
> > > The lack of correct driver support is a big bumper, but as I wrote, I
> > > don't see a reason to not lift this restriction in net-next. I proposed
> > > a new feature flag, or by looking at your series, we could probably use
> > > the extension header okay field for that.
> > >
> > Okay, but why bother doing this for net? This problem has obviously
> > existed for a while, and even if the restriction is maintained here
> > there are still other paths that don't go through ip_append_data that
> > could trip the bug. Also, drivers are welcome to fix their issues in
> > net I believe.
> 
> I even don't know if it could be a hardware issue. Also I don't want to
> break people's communication with a patch.
> IMHO without the WARN_ON_ONCEs, which I agreed to remove, I currently
> don't see any problem for net.
> 
> You don't agree on a netdev-feature flag, indicating the driver is okay
> with hardware checksumming and extension headers? We could add this to
> net-next pretty fast, I think. It does not require people to revert this
> patch in case their driver misbehaves and we don't get a fix for it,
> soon. Also what should we do if the driver simply does not support
> extension headers + checksum offloading? Completely kill checksum
> offloading for IPv6?

I posted v3 just now. I would like to let David consider it for net
inclusion. We can work on how to lift this limitation then in net-next,
okay? I am currently in favor of a new netdev-feature. What do you
think? Your RFC series could help here, too.

Thanks,
Hannes


[PATCH net-next v3 4/4] ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment

2015-10-27 Thread Hannes Frederic Sowa

CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
of those, warn about them once and handle them gracefully by recalculating
the checksum.
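
As a rough illustration of what "recalculating the checksum" amounts to,
here is a user-space model of completing a partial checksum in software.
It assumes the usual CHECKSUM_PARTIAL convention that the checksum field
already holds the pseudo-header seed and that the final value is the
folded, complemented ones-complement sum from csum_start to the end of the
packet. The function names are hypothetical; in the kernel this job is
done by skb_checksum_help(), and this sketch is not its implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 style 16-bit ones-complement checksum over len bytes. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                    /* odd trailing byte, zero padded */
        sum += (uint32_t)data[i] << 8;

    while (sum >> 16)               /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/*
 * Hypothetical model of finishing a partial checksum: sum everything
 * from csum_start to the end (including the seed stored in the checksum
 * field) and write the result at csum_start + csum_offset in network
 * byte order. Returns 0 on success, -1 if the field is out of bounds.
 */
static int finish_partial_checksum(uint8_t *pkt, size_t len,
                                   size_t csum_start, size_t csum_offset)
{
    uint16_t csum;

    if (csum_start + csum_offset + 2 > len)
        return -1;

    csum = inet_checksum(pkt + csum_start, len - csum_start);
    pkt[csum_start + csum_offset] = (uint8_t)(csum >> 8);
    pkt[csum_start + csum_offset + 1] = (uint8_t)(csum & 0xff);
    return 0;
}
```

Once the field is filled in, summing the same range again gives zero, which
is the property a receiver checks. Doing this before the packet is split
matters because each fragment only carries part of the range.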

Fixes: commit 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb09d45 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet 
Cc: Vlad Yasevich 
Cc: Benjamin Coddington 
Cc: Tom Herbert 
Signed-off-by: Hannes Frederic Sowa 
---
 net/ipv6/ip6_output.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 9828a71..fa0e8ae 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -605,6 +605,10 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
frag_id = ipv6_select_ident(net, &ipv6_hdr(skb)->daddr,
&ipv6_hdr(skb)->saddr);
 
+   if (skb->ip_summed == CHECKSUM_PARTIAL &&
+   (err = skb_checksum_help(skb)))
+   goto fail;
+
hroom = LL_RESERVED_SPACE(rt->dst.dev);
if (skb_has_frag_list(skb)) {
int first_len = skb_pagelen(skb);
@@ -733,10 +737,6 @@ slow_path_clean:
}
 
 slow_path:
-   if ((skb->ip_summed == CHECKSUM_PARTIAL) &&
-   skb_checksum_help(skb))
-   goto fail;
-
left = skb->len - hlen; /* Space per frame */
ptr = hlen; /* Where to start from */
 
-- 
2.5.0


[PATCH net-next v3 0/4] net: clean up interactions of CHECKSUM_PARTIAL and fragmentation

2015-10-27 Thread Hannes Frederic Sowa

This series fixes wrong checksums on the wire for IPv4 and IPv6. Large
send buffers and especially NFS lead to wrong checksums in both IPv4
and IPv6.

CHECKSUM_PARTIAL skbs should not reach the respective fragmentation
functions, so we add defensive checks to those functions that recalculate
the checksum if one arrives anyway.

Thanks!

Changelog:
v2: added v4 checks
v3: removed WARN_ON_ONCES (advice by Tom Herbert)

Hannes Frederic Sowa (4):
  ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked sockets
  ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment
  ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets
  ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment

 net/ipv4/ip_output.c  |  9 --
 net/ipv6/ip6_output.c | 78 ---
 2 files changed, 43 insertions(+), 44 deletions(-)

-- 
2.5.0


[PATCH net-next v3 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Hannes Frederic Sowa

We cannot reliably calculate the packet size on MSG_MORE corked sockets
and thus cannot decide whether they are going to be fragmented later on,
so better not use CHECKSUM_PARTIAL in the first place.

The IPv6 code was also intended to avoid CHECKSUM_PARTIAL in the
presence of IPv6 extension headers, but the condition was wrong. Fix
it up, too. The condition that checks whether the packet fits into
one fragment was also wrong and has been corrected.
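
The combined condition can be read as a single predicate. The sketch below restates the check from the diff in this mail as a standalone function. It is only an illustration: struct toy_send and toy_use_csum_partial() are hypothetical names, and the boolean fields stand in for the socket protocol, the MSG_MORE flag and the NETIF_F_V6_CSUM device feature.

```c
#include <stdbool.h>

#define TOY_IPV6_HDR_LEN 40u   /* sizeof(struct ipv6hdr) */

/* Hypothetical summary of the inputs the kernel check looks at. */
struct toy_send {
	unsigned int transhdrlen;  /* non-zero only for the first chunk of a datagram */
	bool is_udp;               /* sk->sk_protocol == IPPROTO_UDP */
	unsigned int headersize;   /* IPv6 header plus any extension headers */
	unsigned int length;       /* bytes passed in this call */
	unsigned int mtu;
	bool msg_more;             /* caller set MSG_MORE, more data may follow */
	bool dev_has_v6_csum;      /* device advertises IPv6 checksum offload */
};

/*
 * Offload the checksum only when nothing can force fragmentation later:
 * no extension headers, the data fits into one packet, and the socket
 * is not corked with MSG_MORE.
 */
static bool toy_use_csum_partial(const struct toy_send *s)
{
	return s->transhdrlen &&
	       s->is_udp &&
	       s->headersize == TOY_IPV6_HDR_LEN &&
	       s->length < s->mtu - s->headersize &&
	       !s->msg_more &&
	       s->dev_has_v6_csum;
}
```

With MSG_MORE set, the final size of the datagram is unknown at this point, which is why the predicate refuses the offload regardless of the current length.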

Fixes: commit 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets")
See-also: commit 72e843bb09d45 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL")
Cc: Eric Dumazet 
Cc: Vlad Yasevich 
Cc: Benjamin Coddington 
Cc: Tom Herbert 
Signed-off-by: Hannes Frederic Sowa 
---
 net/ipv6/ip6_output.c | 70 ---
 1 file changed, 33 insertions(+), 37 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index c265068..9828a71 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1272,6 +1272,7 @@ static int __ip6_append_data(struct sock *sk,
struct rt6_info *rt = (struct rt6_info *)cork->dst;
struct ipv6_txoptions *opt = v6_cork->opt;
int csummode = CHECKSUM_NONE;
+   unsigned int maxnonfragsize, headersize;
 
skb = skb_peek_tail(queue);
if (!skb) {
@@ -1289,38 +1290,43 @@ static int __ip6_append_data(struct sock *sk,
maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
 sizeof(struct frag_hdr);
 
-   if (mtu <= sizeof(struct ipv6hdr) + IPV6_MAXPLEN) {
-   unsigned int maxnonfragsize, headersize;
-
-   headersize = sizeof(struct ipv6hdr) +
-(opt ? opt->opt_flen + opt->opt_nflen : 0) +
-(dst_allfrag(&rt->dst) ?
- sizeof(struct frag_hdr) : 0) +
-rt->rt6i_nfheader_len;
-
-   if (ip6_sk_ignore_df(sk))
-   maxnonfragsize = sizeof(struct ipv6hdr) + IPV6_MAXPLEN;
-   else
-   maxnonfragsize = mtu;
+   headersize = sizeof(struct ipv6hdr) +
+(opt ? opt->opt_flen + opt->opt_nflen : 0) +
+(dst_allfrag(&rt->dst) ?
+ sizeof(struct frag_hdr) : 0) +
+rt->rt6i_nfheader_len;
+
+   if (cork->length + length > mtu - headersize && dontfrag &&
+   (sk->sk_protocol == IPPROTO_UDP ||
+sk->sk_protocol == IPPROTO_RAW)) {
+   ipv6_local_rxpmtu(sk, fl6, mtu - headersize +
+   sizeof(struct ipv6hdr));
+   goto emsgsize;
+   }
 
-   /* dontfrag active */
-   if ((cork->length + length > mtu - headersize) && dontfrag &&
-   (sk->sk_protocol == IPPROTO_UDP ||
-sk->sk_protocol == IPPROTO_RAW)) {
-   ipv6_local_rxpmtu(sk, fl6, mtu - headersize +
-  sizeof(struct ipv6hdr));
-   goto emsgsize;
-   }
+   if (ip6_sk_ignore_df(sk))
+   maxnonfragsize = sizeof(struct ipv6hdr) + IPV6_MAXPLEN;
+   else
+   maxnonfragsize = mtu;
 
-   if (cork->length + length > maxnonfragsize - headersize) {
+   if (cork->length + length > maxnonfragsize - headersize) {
 emsgsize:
-   ipv6_local_error(sk, EMSGSIZE, fl6,
-mtu - headersize +
-sizeof(struct ipv6hdr));
-   return -EMSGSIZE;
-   }
+   ipv6_local_error(sk, EMSGSIZE, fl6,
+mtu - headersize +
+sizeof(struct ipv6hdr));
+   return -EMSGSIZE;
}
 
+   /* CHECKSUM_PARTIAL only with no extension headers and when
+* we are not going to fragment
+*/
+   if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
+   headersize == sizeof(struct ipv6hdr) &&
+   length < mtu - headersize &&
+   !(flags & MSG_MORE) &&
+   rt->dst.dev->features & NETIF_F_V6_CSUM)
+   csummode = CHECKSUM_PARTIAL;
+
if (sk->sk_type == SOCK_DGRAM || sk->sk_type == SOCK_RAW) {
sock_tx_timestamp(sk, &tx_flags);
if (tx_flags & SKBTX_ANY_SW_TSTAMP &&
@@ -1328,16 +1334,6 @@ emsgsize:
tskey = sk->sk_tskey++;
}
 
-   /* If this is the first and only packet and device
-* supports checksum offloading, let's use it.
-* Use transhdrlen, same as IPv4, because partial
-* sums only work when transhdrlen is set.
-*/
-   if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
-   length + fragheaderlen < mtu &&
-   rt->dst.dev->features & NETIF_F_V6_CSUM &&
-   !

[PATCH net-next v3 1/4] ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Hannes Frederic Sowa

We cannot reliably calculate the packet size on MSG_MORE corked sockets
and thus cannot decide whether they are going to be fragmented later on,
so better not use CHECKSUM_PARTIAL in the first place.

Cc: Eric Dumazet 
Cc: Vlad Yasevich 
Cc: Benjamin Coddington 
Cc: Tom Herbert 
Signed-off-by: Hannes Frederic Sowa 
---
 net/ipv4/ip_output.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 50e2973..0b02417 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -911,6 +911,7 @@ static int __ip_append_data(struct sock *sk,
if (transhdrlen &&
length + fragheaderlen <= mtu &&
rt->dst.dev->features & NETIF_F_V4_CSUM &&
+   !(flags & MSG_MORE) &&
!exthdrlen)
csummode = CHECKSUM_PARTIAL;
 
-- 
2.5.0


[PATCH net-next v3 2/4] ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment

2015-10-27 Thread Hannes Frederic Sowa

CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
of those, handle it gracefully by recalculating
the checksum.

Cc: Eric Dumazet 
Cc: Vlad Yasevich 
Cc: Benjamin Coddington 
Cc: Tom Herbert 
Signed-off-by: Hannes Frederic Sowa 
---
 net/ipv4/ip_output.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 0b02417..4233cbe 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -533,6 +533,11 @@ int ip_do_fragment(struct net *net, struct sock *sk, 
struct sk_buff *skb,
 
dev = rt->dst.dev;
 
+   /* for offloaded checksums cleanup checksum before fragmentation */
+   if (skb->ip_summed == CHECKSUM_PARTIAL &&
+   (err = skb_checksum_help(skb)))
+   goto fail;
+
/*
 *  Point into the IP datagram header.
 */
@@ -657,9 +662,6 @@ slow_path_clean:
}
 
 slow_path:
-   /* for offloaded checksums cleanup checksum before fragmentation */
-   if ((skb->ip_summed == CHECKSUM_PARTIAL) && skb_checksum_help(skb))
-   goto fail;
iph = ip_hdr(skb);
 
left = skb->len - hlen; /* Space per frame */
-- 
2.5.0


Re: [PATCHv3 net 2/3] ipv6: Export nf_ct_frag6_consume_orig()

2015-10-27 Thread Pravin Shelar

On Sun, Oct 25, 2015 at 8:21 PM, Joe Stringer  wrote:
> This is needed in openvswitch to fix an skb leak in the next patch.
>
> Signed-off-by: Joe Stringer 
Acked-by: Pravin B Shelar 

Re: [PATCHv3 net 3/3] openvswitch: Fix skb leak using IPv6 defrag

2015-10-27 Thread Pravin Shelar

On Sun, Oct 25, 2015 at 8:21 PM, Joe Stringer  wrote:
> nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the
> reassembly is successful, expects the caller to free all of the original
> skbs using nf_ct_frag6_consume_orig(). This call was previously missing,
> meaning that the original fragments were never freed (with the exception
> of the last fragment to arrive).
>
> Fix this by ensuring that all original fragments except for the last
> fragment are freed via nf_ct_frag6_consume_orig(). The last fragment
> will be morphed into the head, so it must not be freed yet. Furthermore,
> retain the ->next pointer for the head after skb_morph().
>
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Reported-by: Florian Westphal 
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 

Re: [PATCHv3 net 1/3] openvswitch: Fix double-free on ip_defrag() errors

2015-10-27 Thread Pravin Shelar

On Sun, Oct 25, 2015 at 8:21 PM, Joe Stringer  wrote:
> If ip_defrag() returns an error other than -EINPROGRESS, then the skb is
> freed. When handle_fragments() passes this back up to
> do_execute_actions(), it will be freed again. Prevent this double free
> by never freeing the skb in do_execute_actions() for errors returned by
> ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead.
>
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Reported-by: Florian Westphal 
> Signed-off-by: Joe Stringer 

Acked-by: Pravin B Shelar 

Re: [PATCH 6/6] net: phy: Stop 'phy-state-machine' and 'phy_change' work on remove

2015-10-27 Thread Andrew Lunn

On Tue, Oct 27, 2015 at 08:57:58AM -0700, Florian Fainelli wrote:
> (don't top post please)
> 
> On 27/10/15 08:53, Frode Isaksen wrote:
> > What will you need in the oops? I presume you don't want everything, or?
> > 
> > The PHY state machine is not stopped with a PHY disconnect.
> 
> It is stopped with a phy_disconnect():
> 
> /**
>  * phy_disconnect - disable interrupts, stop state machine, and detach a PHY
>  *  device
>  * @phydev: target phy_device struct
>  */
> void phy_disconnect(struct phy_device *phydev)
> {
> if (phydev->irq > 0)
> phy_stop_interrupts(phydev);
> 
> phy_stop_machine(phydev);
> 
> phydev->adjust_link = NULL;
> 
> phy_detach(phydev);
> }

And this does not yet get called. It probably needs to be in
dsa_switch_destroy() just before unregister_netdev() of the slave
devices.

However, the ordering in dsa_switch_destroy() looks wrong. The fixed
phys are destroyed before the slave devices. They should probably be
destroyed after the slave devices, or at least after the
phy_disconnect() is called.

 Andrew

Re: Missing IPv4 routes

2015-10-27 Thread Alexander Duyck


On 10/27/2015 01:01 PM, Brian Rak wrote:

> (Existing email got kinda messy, starting over again):
>
> So, I'm having an issue with the kernel where if I add a bunch of
> routes, I see some of them go "missing".  They don't show up in the
> 'ip -4 route' list, but they do show up if I do 'ip -4 route get X'.
>
> I managed to come up with a simple set of reproduction commands:
>
> ip link add veth0 type veth peer name veth1
> ip link set veth0 up
> ip link set veth1 up
>
> ip route add 108.61.171.119/32 dev veth0  scope link
> ip route add 108.61.171.141/32 dev veth1  scope link
> ip route add 108.61.171.223/32 dev veth1  scope link
> ip route add 108.61.171.250/32 dev veth1  scope link
> ip route add 108.61.171.247/32 dev veth1  scope link
>
> ip route show
>
> In the route show, you'll see 108.61.171.250/32 and 108.61.171.247/32
> missing completely.
>
> I did a lot of bisecting, and traced it down to this commit:
>
> commit 8be33e955cb959dabc1a6eef0b7356fe8cf73fa6
> Author: Alexander Duyck
> Date:   Wed Mar 4 14:59:19 2015 -0800
>
> fib_trie: Fib walk rcu should take a tnode and key instead of a
> trie and a leaf
>
> The commit immediately prior to this one
> (7289e6ddb633aaee6ccea2bd2e410654c47b29a6) works fine.
>
> I tried the off-by-one fix from
> e55ffaf457bcc8ec4e9d9f56f955971f834d65b3, however this doesn't appear
> to help at all.   This code is a little above my head, so I don't
> really understand what exactly is broken here.


I'll take a look at it and see if I can come up with a fix by this 
afternoon.


Thanks.

- Alex

Re: Missing IPv4 routes

2015-10-27 Thread Brian Rak


(Existing email got kinda messy, starting over again):

So, I'm having an issue with the kernel where if I add a bunch of 
routes, I see some of them go "missing".  They don't show up in the 'ip 
-4 route' list, but they do show up if I do 'ip -4 route get X'.


I managed to come up with a simple set of reproduction commands:

ip link add veth0 type veth peer name veth1
ip link set veth0 up
ip link set veth1 up

ip route add 108.61.171.119/32 dev veth0  scope link
ip route add 108.61.171.141/32 dev veth1  scope link
ip route add 108.61.171.223/32 dev veth1  scope link
ip route add 108.61.171.250/32 dev veth1  scope link
ip route add 108.61.171.247/32 dev veth1  scope link

ip route show

In the route show, you'll see 108.61.171.250/32 and 108.61.171.247/32 
missing completely.


I did a lot of bisecting, and traced it down to this commit:

commit 8be33e955cb959dabc1a6eef0b7356fe8cf73fa6
Author: Alexander Duyck 
Date:   Wed Mar 4 14:59:19 2015 -0800

fib_trie: Fib walk rcu should take a tnode and key instead of a 
trie and a leaf


The commit immediately prior to this one 
(7289e6ddb633aaee6ccea2bd2e410654c47b29a6) works fine.


I tried the off-by-one fix from 
e55ffaf457bcc8ec4e9d9f56f955971f834d65b3, however this doesn't appear to 
help at all.   This code is a little above my head, so I don't really 
understand what exactly is broken here.


Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Hannes Frederic Sowa

On Tue, Oct 27, 2015, at 19:37, Tom Herbert wrote:
> On Tue, Oct 27, 2015 at 11:29 AM, Hannes Frederic Sowa
>  wrote:
> > On Tue, Oct 27, 2015, at 18:32, Tom Herbert wrote:
> >> On Tue, Oct 27, 2015 at 9:44 AM, Hannes Frederic Sowa
> >>  wrote:
> >> >
> >> >
> >> > On Tue, Oct 27, 2015, at 17:36, Tom Herbert wrote:
> >> >> > -   if (cork->length + length > maxnonfragsize - headersize) {
> >> >> > +   if (cork->length + length > maxnonfragsize - headersize) {
> >> >> >  emsgsize:
> >> >> > -   ipv6_local_error(sk, EMSGSIZE, fl6,
> >> >> > -mtu - headersize +
> >> >> > -sizeof(struct ipv6hdr));
> >> >> > -   return -EMSGSIZE;
> >> >> > -   }
> >> >> > +   ipv6_local_error(sk, EMSGSIZE, fl6,
> >> >> > +mtu - headersize +
> >> >> > +sizeof(struct ipv6hdr));
> >> >> > +   return -EMSGSIZE;
> >> >> > }
> >> >> >
> >> >> > +   /* CHECKSUM_PARTIAL only with no extension headers and when
> >> >>
> >> >> No, please don't do this. CHECKSUM_PARTIAL should work with extension
> >> >> headers as defined, so this is just disabling otherwise valid and
> >> >> useful functionality. If (some) drivers have problems with this they
> >> >> need to be identified and fixed.
> >> >
> >> > I don't understand. The old code already didn't allow the use of
> >> > opt_flen with CHECKSUM_PARTIAL.
> >> >
> >> Then that's a problem with the old code :-). Is there any other reason
> >> that we can't use CHECKSUM_PARTIAL with extension headers other than
> >> lack of correct driver support?
> >
> > The lack of correct driver support is a big bumper, but as I wrote, I
> > don't see a reason to not lift this restriction in net-next. I proposed
> > a new feature flag, or by looking at your series, we could probably use
> > the extension header okay field for that.
> >
> Okay, but why bother doing this for net? This problem has obviously
> existed for a while, and even if the restriction is maintained here
> there are still other paths that don't go through ip_append_data that
> could trip the bug. Also, drivers are welcome to fix their issues in
> net I believe.

I don't even know if it could be a hardware issue. Also I don't want to
break people's communication with a patch.
IMHO without the WARN_ON_ONCEs, which I agreed to remove, I currently
don't see any problem for net.

You don't agree on a netdev-feature flag, indicating the driver is okay
with hardware checksumming and extension headers? We could add this to
net-next pretty fast, I think. It does not require people to revert this
patch in case their driver misbehaves and we don't get a fix for it,
soon. Also what should we do if the driver simply does not support
extension headers + checksum offloading? Completely kill checksum
offloading for IPv6?

Bye,
Hannes

Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-27 Thread Jonas Markussen

On 26 Oct 2015, at 22:58, Yuchung Cheng  wrote:
> but would RDB be voided if this developer turns on RDB then turns on
> Nagle later?

The short answer is "kind of".

My understanding is that Nagle will delay segments until they're
either MSS-sized or until segments "down the pipe" are acknowledged.

As RDB isn't able to bundle if the payload is more than MSS/2, only
an application that sends data less frequently than once per RTT would
still theoretically benefit from RDB even if Nagle is on.

However, in my opinion this is a scenario where Nagle itself is void:

If you transmit less frequently than once per RTT, enabling Nagle makes
no difference.

If you transmit more frequently than once per RTT, enabling Nagle makes
RDB void.
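
The argument above can be written down as a toy model. This is not the RDB implementation, only a restatement of the reasoning in this mail: struct toy_flow and the three helper functions are hypothetical, and Nagle is reduced to "a sub-MSS write is held back while an earlier write is still unacknowledged", approximated here by comparing the write interval with the RTT.

```c
#include <stdbool.h>

/* Hypothetical description of a thin application flow. */
struct toy_flow {
	unsigned int payload;      /* bytes per application write */
	unsigned int mss;          /* maximum segment size */
	unsigned int interval_us;  /* time between application writes */
	unsigned int rtt_us;       /* round-trip time */
	bool nagle;                /* Nagle's algorithm enabled */
};

/* RDB can only bundle if old and new payload fit into one segment. */
static bool toy_rdb_can_bundle(const struct toy_flow *f)
{
	return f->payload <= f->mss / 2;
}

/* Simplified Nagle: small writes wait while earlier data is unacknowledged. */
static bool toy_nagle_delays(const struct toy_flow *f)
{
	return f->nagle && f->payload < f->mss && f->interval_us < f->rtt_us;
}

/* RDB helps only when it can bundle and Nagle does not hold segments back. */
static bool toy_rdb_effective(const struct toy_flow *f)
{
	return toy_rdb_can_bundle(f) && !toy_nagle_delays(f);
}
```

In this model the only case where both features are on and RDB still helps is a write interval longer than the RTT, which is also the case where Nagle never delays anything.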

-Jonas

Re: [PATCH net v2 2/4] ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment

2015-10-27 Thread Hannes Frederic Sowa

Hi Sergei,

On Tue, Oct 27, 2015, at 20:01, Sergei Shtylyov wrote:
> On 10/27/2015 06:02 PM, Hannes Frederic Sowa wrote:
> 
> > CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
> > of those warn about them once and handle them gracefully by recalculating
> > the checksum.
> >
> > Cc: Eric Dumazet 
> > Cc: Vlad Yasevich 
> > Cc: Benjamin Coddington 
> > Cc: Tom Herbert 
> > Signed-off-by: Hannes Frederic Sowa 
> > ---
> >   net/ipv4/ip_output.c | 8 +---
> >   1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> > index 0b02417..3f94a3b 100644
> > --- a/net/ipv4/ip_output.c
> > +++ b/net/ipv4/ip_output.c
> > @@ -533,6 +533,11 @@ int ip_do_fragment(struct net *net, struct sock *sk, 
> > struct sk_buff *skb,
> >
> > dev = rt->dst.dev;
> >
> > +   /* for offloaded checksums cleanup checksum before fragmentation */
> > +   if (WARN_ON_ONCE(skb->ip_summed == CHECKSUM_PARTIAL) &&
> > +   (err = skb_checksum_help(skb)))
> 
> scripts/checkpatch.pl should have complained about using = in the
> *if* expression.

I know, and I ignored it deliberately because I found it nicer this way.
I made sure gcc does not complain by using extra parentheses around the
assignment.
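
For readers unfamiliar with the idiom, here is a small standalone sketch of an assignment inside an *if* condition. The names toy_step(), toy_run() and toy_calls are made up for the example. As far as I know, gcc's -Wparentheses accepts an assignment used as a truth value when it is wrapped in its own parentheses, which is the form used in the patch.

```c
/* Counts how often toy_step() ran, to make the short-circuit visible. */
static int toy_calls;

/* Hypothetical helper: returns 0 on success or a negative error code. */
static int toy_step(int fail)
{
	toy_calls++;
	return fail ? -5 : 0;
}

/*
 * The helper only runs when guard is true (&& short-circuits), and its
 * return value is captured in err and tested in the same expression.
 */
static int toy_run(int guard, int fail)
{
	int err;

	if (guard && (err = toy_step(fail)))
		return err;
	return 0;
}
```

The alternative that checkpatch prefers is to assign on a separate line inside an outer *if* (guard) block, which costs one more level of nesting.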

Bye,
Hannes

Re: [PATCH v4 14/15] net: wireless: ath: Remove unneeded variable ret returning 0

2015-10-27 Thread punit vara

On Tue, Oct 27, 2015 at 1:42 PM, Kalle Valo  wrote:
> punit vara  writes:
>
>> Will my other patches which are already correct be added to wireless
>> tree ? or I have to resend everything ?
>
> Yes, please resend the whole patchset. I don't apply patches
> individually from a patchset, it's just too time consuming and error
> prone.
>
> Also, as you seem to be new here, I don't recommend sending big
> patchsets in the beginning. Start slow, send just a patch or two at
> a time, and once you gain more experience send bigger patchsets.
>
> --
> Kalle Valo
Next time I will send only 2-3 patches. This time I have resent
all the patches that I created before. Thank you for the suggestion.

[RESEND PATCH 03/10] net: wireless: rtwifi: Remove duplicated arguments to |

2015-10-27 Thread Punit Vara

Remove unnecessary repeated arguments COMP_EFUSE, COMP_REGD and COMP_CHAN
from the bitwise OR (|) expression.

This is a patch to the debug.c file that removes the following warning
reported by coccicheck:

-duplicated argument to & or |
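
Why the duplicates are harmless but redundant: bitwise OR is idempotent, so naming a flag twice yields the same mask. A minimal sketch, with made-up flag values (the TOY_COMP_* constants are hypothetical and do not match the driver's real definitions):

```c
#include <stdint.h>

/* Hypothetical flag bits, chosen only for the example. */
#define TOY_COMP_EFUSE (1u << 0)
#define TOY_COMP_QOS   (1u << 1)
#define TOY_COMP_REGD  (1u << 2)
#define TOY_COMP_CHAN  (1u << 3)

/* Mask written with repeated arguments, as coccicheck would flag it. */
static uint32_t toy_mask_with_dups(void)
{
	return TOY_COMP_EFUSE | TOY_COMP_QOS | TOY_COMP_REGD | TOY_COMP_CHAN |
	       TOY_COMP_EFUSE | TOY_COMP_REGD | TOY_COMP_CHAN;
}

/* Same mask with every flag named exactly once. */
static uint32_t toy_mask_dedup(void)
{
	return TOY_COMP_EFUSE | TOY_COMP_QOS | TOY_COMP_REGD | TOY_COMP_CHAN;
}
```

Both functions return the same value, so removing a repeated flag does not change behaviour as long as each flag still appears at least once.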

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/rtlwifi/debug.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/rtlwifi/debug.c 
b/drivers/net/wireless/rtlwifi/debug.c
index fd25aba..b8f5540 100644
--- a/drivers/net/wireless/rtlwifi/debug.c
+++ b/drivers/net/wireless/rtlwifi/debug.c
@@ -37,9 +37,9 @@ void rtl_dbgp_flag_init(struct ieee80211_hw *hw)
COMP_BEACON | COMP_RATE | COMP_RXDESC | COMP_DIG | COMP_TXAGC |
COMP_POWER | COMP_POWER_TRACKING | COMP_BB_POWERSAVING | COMP_SWAS |
COMP_RF | COMP_TURBO | COMP_RATR | COMP_CMD |
-   COMP_EFUSE | COMP_QOS | COMP_MAC80211 | COMP_REGD | COMP_CHAN |
-   COMP_EASY_CONCURRENT | COMP_EFUSE | COMP_QOS | COMP_MAC80211 |
-   COMP_REGD | COMP_CHAN | COMP_BT_COEXIST;
+   COMP_EFUSE | COMP_QOS | COMP_MAC80211 | COMP_CHAN |
+   COMP_EASY_CONCURRENT | COMP_QOS | COMP_MAC80211 |
+   COMP_REGD | COMP_BT_COEXIST;
 
 
for (i = 0; i < DBGP_TYPE_MAX; i++)
-- 
2.5.3


[RESEND PATCH 01/10] net: wireless: ath: Remove unnecessary semicolon

2015-10-27 Thread Punit Vara

This patch to htt_rx.c removes an unneeded semicolon reported by
coccicheck.

Here the semicolon just creates an empty statement, so please remove it.

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/ath/ath10k/htt_rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c 
b/drivers/net/wireless/ath/ath10k/htt_rx.c
index 1b7a043..002a633 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -2077,7 +2077,7 @@ void ath10k_htt_t2h_msg_handler(struct ath10k *ar, struct 
sk_buff *skb)
ath10k_dbg_dump(ar, ATH10K_DBG_HTT_DUMP, NULL, "htt event: ",
skb->data, skb->len);
break;
-   };
+   }
 
/* Free the indication buffer */
dev_kfree_skb_any(skb);
-- 
2.5.3


[RESEND PATCH 04/10] net: wireless: brcm80211: Remove duplicated arguments to |

2015-10-27 Thread Punit Vara

Remove unnecessary repeated arguments to the bitwise OR (|).

This is a patch to the brcmsmac/channel.c file that removes the following
warning reported by coccicheck:

-duplicated argument to & or |

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/brcm80211/brcmsmac/channel.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/brcm80211/brcmsmac/channel.c 
b/drivers/net/wireless/brcm80211/brcmsmac/channel.c
index 635ae03..d56fa03 100644
--- a/drivers/net/wireless/brcm80211/brcmsmac/channel.c
+++ b/drivers/net/wireless/brcm80211/brcmsmac/channel.c
@@ -652,7 +652,6 @@ static void brcms_reg_apply_radar_flags(struct wiphy *wiphy)
 */
if (!(ch->flags & IEEE80211_CHAN_DISABLED))
ch->flags |= IEEE80211_CHAN_RADAR |
-IEEE80211_CHAN_NO_IR |
 IEEE80211_CHAN_NO_IR;
}
 }
-- 
2.5.3


[RESEND PATCH 06/10] net: wireless: ath: simplify return flow for carl9170_regwrite_result()

2015-10-27 Thread Punit Vara

This patch to the carl9170/phy.c file fixes a warning reported by
coccicheck:

 WARNING: end returns can be simplified

Remove the unnecessary variable declaration and simplify the return flow
by returning the result of carl9170_regwrite_result() directly.
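
The two forms are equivalent because "if (err) return err; return 0;" returns err in both branches. A standalone sketch, where toy_ok(), toy_fail() and the two wrapper functions are hypothetical names used only for the comparison:

```c
/* Hypothetical operations: one succeeds, one fails with a negative code. */
static int toy_ok(void)
{
	return 0;
}

static int toy_fail(void)
{
	return -22;
}

/* The pattern coccicheck flags: the local variable adds nothing. */
static int toy_result_verbose(int (*op)(void))
{
	int err;

	err = op();
	if (err)
		return err;

	return 0;
}

/* Simplified form: hand the callee's return value straight back. */
static int toy_result_direct(int (*op)(void))
{
	return op();
}
```

The rewrite is only safe when nothing else happens between the call and the return, which is the case in the function touched here.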

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/ath/carl9170/phy.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/carl9170/phy.c 
b/drivers/net/wireless/ath/carl9170/phy.c
index dca6df1..f3b5434 100644
--- a/drivers/net/wireless/ath/carl9170/phy.c
+++ b/drivers/net/wireless/ath/carl9170/phy.c
@@ -966,7 +966,6 @@ static const struct carl9170_phy_freq_entry 
carl9170_phy_freq_params[] = {
 static int carl9170_init_rf_bank4_pwr(struct ar9170 *ar, bool band5ghz,
  u32 freq, enum carl9170_bw bw)
 {
-   int err;
u32 d0, d1, td0, td1, fd0, fd1;
u8 chansel;
u8 refsel0 = 1, refsel1 = 0;
@@ -1024,11 +1023,7 @@ static int carl9170_init_rf_bank4_pwr(struct ar9170 *ar, 
bool band5ghz,
carl9170_regwrite(0x1c58e8, fd1);
 
carl9170_regwrite_finish();
-   err = carl9170_regwrite_result();
-   if (err)
-   return err;
-
-   return 0;
+   return carl9170_regwrite_result();
 }
 
 static const struct carl9170_phy_freq_params *
-- 
2.5.3


[RESEND PATCH 05/10] net: wireless: simplify return flow for zd1201_setconfig16

2015-10-27 Thread Punit Vara

This patch to the zd1201.c file fixes a warning reported by coccicheck:

WARNING: end returns can be simplified and declaration on line 1658 can
be dropped

Return the value directly instead of writing 2-3 more statements.

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/zd1201.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/wireless/zd1201.c b/drivers/net/wireless/zd1201.c
index 6f5c793..d9e67d9 100644
--- a/drivers/net/wireless/zd1201.c
+++ b/drivers/net/wireless/zd1201.c
@@ -1655,15 +1655,11 @@ static int zd1201_set_maxassoc(struct net_device *dev,
 struct iw_request_info *info, struct iw_param *rrq, char *extra)
 {
struct zd1201 *zd = netdev_priv(dev);
-   int err;
 
if (!zd->ap)
return -EOPNOTSUPP;
 
-   err = zd1201_setconfig16(zd, ZD1201_RID_CNFMAXASSOCSTATIONS, 
rrq->value);
-   if (err)
-   return err;
-   return 0;
+   return zd1201_setconfig16(zd, ZD1201_RID_CNFMAXASSOCSTATIONS, 
rrq->value);
 }
 
 static int zd1201_get_maxassoc(struct net_device *dev,
-- 
2.5.3


[RESEND PATCH 08/10] net: wireless: brcm80211: Remove unneeded variable which return 0

2015-10-27 Thread Punit Vara

This is a patch to brcmsmac/main.c that removes an unnecessary variable
which was declared only to return zero.

This patch fixes a warning reported by coccicheck:
-Unneeded variable: "err". Return "0" on line 3788

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/brcm80211/brcmsmac/main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/brcm80211/brcmsmac/main.c 
b/drivers/net/wireless/brcm80211/brcmsmac/main.c
index 9728be0..9d717b6 100644
--- a/drivers/net/wireless/brcm80211/brcmsmac/main.c
+++ b/drivers/net/wireless/brcm80211/brcmsmac/main.c
@@ -3777,7 +3777,6 @@ static void brcms_c_set_ps_ctrl(struct brcms_c_info *wlc)
  */
 static int brcms_c_set_mac(struct brcms_bss_cfg *bsscfg)
 {
-   int err = 0;
struct brcms_c_info *wlc = bsscfg->wlc;
 
/* enter the MAC addr into the RXE match registers */
@@ -3785,7 +3784,7 @@ static int brcms_c_set_mac(struct brcms_bss_cfg *bsscfg)
 
brcms_c_ampdu_macaddr_upd(wlc);
 
-   return err;
+   return 0;
 }
 
 /* Write the BSS config's BSSID address to core (set_bssid in d11procs.tcl).
-- 
2.5.3


[RESEND PATCH 10/10] net: wireless: ath: Remove unneeded variable ret returning 0

2015-10-27 Thread Punit Vara

This patch to ath5k/eeprom.c fixes a warning caught by
coccicheck:

-Unneeded variable: "ret". Return "0" on line 1733

Remove the unnecessary variable ret, which was created only to return zero.

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/ath/ath5k/eeprom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath5k/eeprom.c 
b/drivers/net/wireless/ath/ath5k/eeprom.c
index 94d34ee..673ab8d 100644
--- a/drivers/net/wireless/ath/ath5k/eeprom.c
+++ b/drivers/net/wireless/ath/ath5k/eeprom.c
@@ -1707,7 +1707,7 @@ ath5k_eeprom_read_spur_chans(struct ath5k_hw *ah)
struct ath5k_eeprom_info *ee = &ah->ah_capabilities.cap_eeprom;
u32 offset;
u16 val;
-   int ret = 0, i;
+   int i;
 
offset = AR5K_EEPROM_CTL(ee->ee_version) +
AR5K_EEPROM_N_CTLS(ee->ee_version);
@@ -1730,7 +1730,7 @@ ath5k_eeprom_read_spur_chans(struct ath5k_hw *ah)
}
}
 
-   return ret;
+   return 0;
 }
 
 
-- 
2.5.3


[RESEND PATCH 09/10] net: wireless: brcm80211: Remove unneeded variable ret_code returning 0

2015-10-27 Thread Punit Vara

This patch to brcmsmac/stf.c fixes a warning caught by
coccicheck:

-Unneeded variable: "ret_code". Return "0" on line 328

Remove the unnecessary variable ret_code, which was created only to return zero.

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/brcm80211/brcmsmac/stf.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/brcm80211/brcmsmac/stf.c 
b/drivers/net/wireless/brcm80211/brcmsmac/stf.c
index dd91627..71ddf42 100644
--- a/drivers/net/wireless/brcm80211/brcmsmac/stf.c
+++ b/drivers/net/wireless/brcm80211/brcmsmac/stf.c
@@ -306,7 +306,6 @@ int brcms_c_stf_txchain_set(struct brcms_c_info *wlc, s32 
int_val, bool force)
  */
 int brcms_c_stf_ss_update(struct brcms_c_info *wlc, struct brcms_band *band)
 {
-   int ret_code = 0;
u8 prev_stf_ss;
u8 upd_stf_ss;
 
@@ -325,7 +324,7 @@ int brcms_c_stf_ss_update(struct brcms_c_info *wlc, struct 
brcms_band *band)
PHY_TXC1_MODE_SISO : PHY_TXC1_MODE_CDD;
} else {
if (wlc->band != band)
-   return ret_code;
+   return 0;
upd_stf_ss = (wlc->stf->txstreams == 1) ?
PHY_TXC1_MODE_SISO : band->band_stf_ss_mode;
}
@@ -334,7 +333,7 @@ int brcms_c_stf_ss_update(struct brcms_c_info *wlc, struct 
brcms_band *band)
brcms_b_band_stf_ss_set(wlc->hw, upd_stf_ss);
}
 
-   return ret_code;
+   return 0;
 }
 
 int brcms_c_stf_attach(struct brcms_c_info *wlc)
-- 
2.5.3


[RESEND PATCH 07/10] net: wireless: iwlegacy: Remove unneeded variable ret

2015-10-27 Thread Punit Vara

This patch fixes the following warning in 3945-mac.c, reported by
coccicheck:

drivers/net/wireless/iwlegacy/3945-mac.c:247:5-8: Unneeded variable:
"ret". Return "- EOPNOTSUPP" on line 249

Return -EOPNOTSUPP directly instead of returning it through ret.

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/iwlegacy/3945-mac.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/wireless/iwlegacy/3945-mac.c 
b/drivers/net/wireless/iwlegacy/3945-mac.c
index af1b3e6..ff4dc44 100644
--- a/drivers/net/wireless/iwlegacy/3945-mac.c
+++ b/drivers/net/wireless/iwlegacy/3945-mac.c
@@ -244,9 +244,7 @@ il3945_set_dynamic_key(struct il_priv *il, struct 
ieee80211_key_conf *keyconf,
 static int
 il3945_remove_static_key(struct il_priv *il)
 {
-   int ret = -EOPNOTSUPP;
-
-   return ret;
+   return -EOPNOTSUPP;
 }
 
 static int
@@ -529,7 +527,6 @@ il3945_tx_skb(struct il_priv *il,
if (unlikely(tid >= MAX_TID_COUNT))
goto drop;
}
-
/* Descriptor for chosen Tx queue */
txq = &il->txq[txq_id];
q = &txq->q;
-- 
2.5.3


[RESEND PATCH 02/10] net: wireless: ath: Remove unnecessary semicolon

2015-10-27 Thread Punit Vara

This patch removes an unneeded semicolon in ath10k/wmi.h, as reported
by coccicheck.

The semicolon only creates an empty statement, so remove it.

Signed-off-by: Punit Vara 
---
 drivers/net/wireless/ath/ath10k/wmi.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/wmi.h 
b/drivers/net/wireless/ath/ath10k/wmi.h
index 52d3503..21d5b6b 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.h
+++ b/drivers/net/wireless/ath/ath10k/wmi.h
@@ -1675,7 +1675,7 @@ static inline const char *ath10k_wmi_phymode_str(enum 
wmi_phy_mode mode)
 
/* no default handler to allow compiler to check that the
 * enum is fully handled */
-   };
+   }
 
return "";
 }
-- 
2.5.3
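
The code touched here relies on a switch over an enum with no default
label, so that the compiler can warn about unhandled values. A minimal
sketch of that idiom, using a hypothetical enum in place of enum
wmi_phy_mode and made-up strings:

```c
/* Hypothetical enum standing in for enum wmi_phy_mode. */
enum demo_phy_mode { DEMO_MODE_11A, DEMO_MODE_11G, DEMO_MODE_11B };

const char *demo_phymode_str(enum demo_phy_mode mode)
{
	switch (mode) {
	case DEMO_MODE_11A:
		return "11a";
	case DEMO_MODE_11G:
		return "11g";
	case DEMO_MODE_11B:
		return "11b";

	/* no default label: with -Wswitch the compiler can report
	 * an enum value that has no case */
	}	/* no ';' here, it would only add an empty statement */

	return "<unknown>";
}
```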


Re: [PATCH net v2 2/4] ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment

2015-10-27 Thread Sergei Shtylyov


Hello.

On 10/27/2015 06:02 PM, Hannes Frederic Sowa wrote:


> CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
> of those warn about them once and handle them gracefully by recalculating
> the checksum.
>
> Cc: Eric Dumazet 
> Cc: Vlad Yasevich 
> Cc: Benjamin Coddington 
> Cc: Tom Herbert 
> Signed-off-by: Hannes Frederic Sowa 
> ---
>   net/ipv4/ip_output.c | 8 +---
>   1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 0b02417..3f94a3b 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -533,6 +533,11 @@ int ip_do_fragment(struct net *net, struct sock *sk, 
> struct sk_buff *skb,
>
> dev = rt->dst.dev;
>
> +   /* for offloaded checksums cleanup checksum before fragmentation */
> +   if (WARN_ON_ONCE(skb->ip_summed == CHECKSUM_PARTIAL) &&
> +   (err = skb_checksum_help(skb)))

   scripts/checkpatch.pl should have complained about using = in the *if* 
expression.

> +   goto fail;
> +
> /*
>  *  Point into the IP datagram header.
>  */

[...]

MBR, Sergei
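
The style point raised above can be sketched in standalone C. This is only
an illustration under stated assumptions: demo_checksum_help() is a
hypothetical stand-in for skb_checksum_help() that returns 0 on success and
a negative value on failure, and the two wrappers are made-up names.

```c
/* Hypothetical stand-in for skb_checksum_help(). */
static int demo_checksum_help(int fail)
{
	return fail ? -22 : 0;
}

/* Style used in the patch: assignment inside the if condition. */
int demo_prepare_inline(int partial, int fail)
{
	int err = 0;

	if (partial && (err = demo_checksum_help(fail)))
		return err;
	return 0;
}

/* Equivalent form with the assignment on its own line. */
int demo_prepare_split(int partial, int fail)
{
	int err;

	if (partial) {
		err = demo_checksum_help(fail);
		if (err)
			return err;
	}
	return 0;
}
```

Both functions return the same value for every input; the second form keeps
the condition free of side effects.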


Re: [PATCH net v2 2/4] ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment

2015-10-27 Thread Tom Herbert

On Tue, Oct 27, 2015 at 8:02 AM, Hannes Frederic Sowa
 wrote:
> CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
> of those warn about them once and handle them gracefully by recalculating
> the checksum.
>
I believe a UDP sender within the kernel (like an encapsulation) that
happens to send using a frag list that exceeds MTU is quite possible
and would be a problem with current code.

> Cc: Eric Dumazet 
> Cc: Vlad Yasevich 
> Cc: Benjamin Coddington 
> Cc: Tom Herbert 
> Signed-off-by: Hannes Frederic Sowa 
> ---
>  net/ipv4/ip_output.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 0b02417..3f94a3b 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -533,6 +533,11 @@ int ip_do_fragment(struct net *net, struct sock *sk, 
> struct sk_buff *skb,
>
> dev = rt->dst.dev;
>
> +   /* for offloaded checksums cleanup checksum before fragmentation */
> +   if (WARN_ON_ONCE(skb->ip_summed == CHECKSUM_PARTIAL) &&
> +   (err = skb_checksum_help(skb)))
> +   goto fail;
> +
> /*
>  *  Point into the IP datagram header.
>  */
> @@ -657,9 +662,6 @@ slow_path_clean:
> }
>
>  slow_path:
> -   /* for offloaded checksums cleanup checksum before fragmentation */
> -   if ((skb->ip_summed == CHECKSUM_PARTIAL) && skb_checksum_help(skb))
> -   goto fail;
> iph = ip_hdr(skb);
>
> left = skb->len - hlen; /* Space per frame */
> --
> 2.5.0
>

SALLAM ALEIKUM

2015-10-27 Thread MAYA

Dear friend,

   I feel that you are mature enough to keep secrets and to handle a
sensitive matter confidentially. After our brief chat I felt that I should
discuss my proposal with you because of our special level of understanding.
That is why I want to confine this confidential deal to you. You must keep
this conversation top secret between the two of us.
 
Sir, I am from a small state called Texas in the United States, and I am a
single mother. My daughter is seven years old and her name is Daisy. I am a
petroleum engineer with Shell Petroleum. The US government contracted me a
few years ago, and I was posted from the Gulf of Mexico to Iraq; this is my
second year here in Baghdad, Iraq. I am the engineer who tests and verifies
the oil supplied to the White House by the Iraqi government. My signature
and reports play a vital role between the United States and the Iraqi
government in the field of crude oil.
 
Some American diplomats and some private oil traders from the United States
came to Iraq to make a huge crude oil purchase worth hundreds of millions of
dollars. My role is needed to seal the deal by approving the quality of the
oil, and without my signature the purchase cannot succeed. I was given a
total of 10 million euros as my share of the purchase deal. This money has
already been moved out of Iraq through a secure diplomatic courier company.
 
This 10 million euros has been moved out of Iraq. US government rules do not
allow us to enter into business deals, which is why I want you to partner
with me and stand in to receive these funds on my behalf in your country.
The company that moved the money out of Iraq will deliver it to the
destination you choose.
 
Sir, I will offer you a total of 20% of the total amount for receiving the
box and keeping it with you until I arrive to meet you and discuss other
possible business.
 
If you accept this proposal, I will need only the following:
(1) Your name
(2) Your current address
(3) The name of the company you work for
(4) Your position at work
(5) Your phone number
(6) ID card, driver's license, or international passport
 Once I receive all this information, I will prepare a document and send it
to you, and I will also send a copy to the company introducing you as my only
recognized partner, stating that they should transfer my funds to you for
further investment in your country. The whole process is simple and
risk-free, but we must keep a low profile and obey the rules of
confidentiality.

Best regards,
Your friend and partner,
Engr. Gina Kent
Baghdad, Iraq

Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Tom Herbert

On Tue, Oct 27, 2015 at 11:29 AM, Hannes Frederic Sowa
 wrote:
> On Tue, Oct 27, 2015, at 18:32, Tom Herbert wrote:
>> On Tue, Oct 27, 2015 at 9:44 AM, Hannes Frederic Sowa
>>  wrote:
>> >
>> >
>> > On Tue, Oct 27, 2015, at 17:36, Tom Herbert wrote:
>> >> > -   if (cork->length + length > maxnonfragsize - headersize) {
>> >> > +   if (cork->length + length > maxnonfragsize - headersize) {
>> >> >  emsgsize:
>> >> > -   ipv6_local_error(sk, EMSGSIZE, fl6,
>> >> > -mtu - headersize +
>> >> > -sizeof(struct ipv6hdr));
>> >> > -   return -EMSGSIZE;
>> >> > -   }
>> >> > +   ipv6_local_error(sk, EMSGSIZE, fl6,
>> >> > +mtu - headersize +
>> >> > +sizeof(struct ipv6hdr));
>> >> > +   return -EMSGSIZE;
>> >> > }
>> >> >
>> >> > +   /* CHECKSUM_PARTIAL only with no extension headers and when
>> >>
>> >> No, please don't do this. CHECKSUM_PARTIAL should work with extension
>> >> headers as defined, so this is just disabling otherwise valid and
>> >> useful functionality. If (some) drivers have problems with this they
>> >> need to be identified and fixed.
>> >
>> > I don't understand. The old code already didn't allow the use of
>> > opt_flen with CHECKSUM_PARTIAL.
>> >
>> Then that's a problem with the old code :-). Is there any other reason
>> that we can't use CHECKSUM_PARTIAL with extension headers other than
>> lack of correct driver support?
>
> The lack of correct driver support is a big bumper, but as I wrote, I
> don't see a reason to not lift this restriction in net-next. I proposed
> a new feature flag, or by looking at your series, we could probably use
> the extension header okay field for that.
>
Okay, but why bother doing this for net? This problem has obviously
existed for a while, and even if the restriction is maintained here
there are still other paths that don't go through ip_append_data that
could trip the bug. Also, drivers are welcome to fix their issues in
net I believe.

> I would be conservative in net though.
>
> Bye,
> Hannes

Re: [PATCHv2 net 2/2] ipv4: update RTNH_F_LINKDOWN flag on UP event

2015-10-27 Thread Andy Gospodarek

On Tue, Oct 27, 2015 at 09:42:25AM +0200, Julian Anastasov wrote:
> 
>   Hello,
> 
> On Tue, 27 Oct 2015, Andy Gospodarek wrote:
> 
> > I tested this patch and I now see that your reported problem is a result
> > of dummy never taking carrier down.  There was a presumption that
> > carrier notification would go down when hardware went down (or when the
> > logical device backing the hardware went down), but this is clearly not
> > always the case.
> 
>   It seems not all devices play with the carrier
> after IFF_UP is set and we can not rely on NETDEV_CHANGE
> to update the flag.
Agreed.

> 
> > > + if (nh_flags & RTNH_F_DEAD) {
> > > + unsigned int flags = dev_get_flags(dev);
> > > +
> > > + if (flags & (IFF_RUNNING | IFF_LOWER_UP))
> > > + nh_flags |= RTNH_F_LINKDOWN;
> > > + }
> > > +
> > >   prev_fi = NULL;
> > >   hash = fib_devindex_hashfn(dev->ifindex);
> > >   head = &fib_info_devhash[hash];
> > 
> > Logically this patch makes sense, but I feel as though there may be a
> > slightly better option.  Possibly this:
> > 
> > diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> > index 42778d9..7eb7c40 100644
> > --- a/net/ipv4/fib_semantics.c
> > +++ b/net/ipv4/fib_semantics.c
> > @@ -1376,7 +1376,8 @@ int fib_sync_down_dev(struct net_device *dev, 
> > unsigned long event)
> > nexthop_nh->nh_flags |= RTNH_F_DEAD;
> > /* fall through */
> > case NETDEV_CHANGE:
> > -   nexthop_nh->nh_flags |= RTNH_F_LINKDOWN;
> > +   if (!netif_carrier_ok(dev))
> > +   nexthop_nh->nh_flags |= 
> > RTNH_F_LINKDOWN;
> 
>   There is a problem with this approach. Once
> the RTNH_F_DEAD flag is set, eg. when last address is removed,
> any NETDEV_CHANGE events are ignored in this function.
> As result, we may miss the link-down event if we first
> remove the addresses, so we will not set RTNH_F_LINKDOWN.
Yes, I see that now.  I verified with dummy by setting carrier after a
flush.  Thanks for pointing that out.

>   Also, when device link goes UP we (FIB) can not guess
> just based on events what is the actual carrier state
> because the NETDEV_CHANGE notification comes only when
> IFF_UP is set. So, this check.
> 
>   I also attempted to fully recalculate the flag
> in fib_sync_up, i.e. with the option not just to clear it
> but also to add nexthop_nh->nh_flags |= nh_flags_set logic
> but it complicates the code. So, while we always set
> RTNH_F_LINKDOWN when DEAD is set, the logic to conditionally
> clear RTNH_F_LINKDOWN in fib_sync_up looks the cheapest one.
> 
>   Of course, we have a semantic problem when setting
> RTNH_F_LINKDOWN on last address removal, i.e. this event
> has nothing to do with the link state. But it works because
> RTNH_F_LINKDOWN is valid for lookups only when DEAD flag
> is not set, so that is why my patch looks this way.
The problem you describe here was a concern of mine as well.  I would
really like the output of 'ip route show' to properly reflect the link
state and fix the problem you describe, but it seems like it will not in
this case with your current patch.  I'll do a bit more testing and let
you know.

> 
> > break;
> > }
> > dead++;
> > @@ -1396,7 +1397,8 @@ int fib_sync_down_dev(struct net_device *dev, 
> > unsigned long event)
> > fi->fib_flags |= RTNH_F_DEAD;
> > /* fall through */
> > case NETDEV_CHANGE:
> > -   fi->fib_flags |= RTNH_F_LINKDOWN;
> > +   if (!netif_carrier_ok(dev))
> > +   fi->fib_flags |= RTNH_F_LINKDOWN;
> > break;
> > }
> > ret++;
> 
>   I think, we even do not need the RTNH_F_LINKDOWN flag
> in fib_flags, currently it is set but never used.
> 
> Regards
> 
> --
> Julian Anastasov 
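
The flag logic discussed in this thread can be sketched in isolation.
This assumes, as the discussion above suggests, that nh_flags in
fib_sync_up is the set of nexthop flags to clear on an "up" event. The
DEMO_* constants and function names are hypothetical, and the bit values
are for illustration only, not the kernel's.

```c
/* Hypothetical bit values, for illustration only. */
#define DEMO_F_DEAD       0x1u
#define DEMO_F_LINKDOWN   0x2u
#define DEMO_IFF_RUNNING  0x1u
#define DEMO_IFF_LOWER_UP 0x2u

/*
 * Decide which nexthop flags an "up" event may clear: when DEAD is being
 * cleared and the device reports carrier, LINKDOWN is cleared as well.
 */
unsigned int demo_flags_to_clear(unsigned int nh_flags, unsigned int dev_flags)
{
	if (nh_flags & DEMO_F_DEAD) {
		if (dev_flags & (DEMO_IFF_RUNNING | DEMO_IFF_LOWER_UP))
			nh_flags |= DEMO_F_LINKDOWN;
	}
	return nh_flags;
}

/* Apply the clear mask to the current nexthop flags. */
unsigned int demo_sync_up(unsigned int cur, unsigned int nh_flags,
			  unsigned int dev_flags)
{
	return cur & ~demo_flags_to_clear(nh_flags, dev_flags);
}
```

With this shape, a nexthop that was marked DEAD and LINKDOWN on address
removal keeps LINKDOWN after the address comes back only if the device
still has no carrier.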

Re: [PATCH net v2 2/4] ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment

2015-10-27 Thread Hannes Frederic Sowa

On Tue, Oct 27, 2015, at 17:06, Tom Herbert wrote:
> On Tue, Oct 27, 2015 at 8:02 AM, Hannes Frederic Sowa
>  wrote:
> > CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one
> > of those warn about them once and handle them gracefully by recalculating
> > the checksum.
> >
> > Cc: Eric Dumazet 
> > Cc: Vlad Yasevich 
> > Cc: Benjamin Coddington 
> > Cc: Tom Herbert 
> > Signed-off-by: Hannes Frederic Sowa 
> > ---
> >  net/ipv4/ip_output.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> > index 0b02417..3f94a3b 100644
> > --- a/net/ipv4/ip_output.c
> > +++ b/net/ipv4/ip_output.c
> > @@ -533,6 +533,11 @@ int ip_do_fragment(struct net *net, struct sock *sk, 
> > struct sk_buff *skb,
> >
> > dev = rt->dst.dev;
> >
> > +   /* for offloaded checksums cleanup checksum before fragmentation */
> > +   if (WARN_ON_ONCE(skb->ip_summed == CHECKSUM_PARTIAL) &&
> > +   (err = skb_checksum_help(skb)))
> > +   goto fail;
> > +
> Why the WARN_ON_ONCE? Is there a prior check somewhere that avoid this
> condition?

While I am pretty sure we should not hit the condition in IPv6 anymore,
I think this could frighten people in IPv4 land. I will repost without
the WARN_ON_ONCE. Maybe it makes sense to use the IFF_DEBUG interface
flags again? :)

I will repost without those WARN_ON_ONCEs.

Bye,
Hannes
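
For context, "recalculating the checksum" in software means computing the
Internet checksum over the data instead of leaving it to the NIC. The
following is a minimal userspace sketch of that checksum as described in
RFC 1071. It is not skb_checksum_help(), which operates on an skb;
demo_inet_checksum is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Ones' complement Internet checksum (RFC 1071) over a byte buffer,
 * treating the data as big-endian 16-bit words. The 32-bit accumulator
 * is sufficient for buffers up to 64 KiB.
 */
uint16_t demo_inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is padded with a zero byte on the right. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

For the example bytes used in RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the folded
sum is 0xddf2, so the function returns its complement, 0x220d.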

Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Hannes Frederic Sowa

On Tue, Oct 27, 2015, at 18:32, Tom Herbert wrote:
> On Tue, Oct 27, 2015 at 9:44 AM, Hannes Frederic Sowa
>  wrote:
> >
> >
> > On Tue, Oct 27, 2015, at 17:36, Tom Herbert wrote:
> >> > -   if (cork->length + length > maxnonfragsize - headersize) {
> >> > +   if (cork->length + length > maxnonfragsize - headersize) {
> >> >  emsgsize:
> >> > -   ipv6_local_error(sk, EMSGSIZE, fl6,
> >> > -mtu - headersize +
> >> > -sizeof(struct ipv6hdr));
> >> > -   return -EMSGSIZE;
> >> > -   }
> >> > +   ipv6_local_error(sk, EMSGSIZE, fl6,
> >> > +mtu - headersize +
> >> > +sizeof(struct ipv6hdr));
> >> > +   return -EMSGSIZE;
> >> > }
> >> >
> >> > +   /* CHECKSUM_PARTIAL only with no extension headers and when
> >>
> >> No, please don't do this. CHECKSUM_PARTIAL should work with extension
> >> headers as defined, so this is just disabling otherwise valid and
> >> useful functionality. If (some) drivers have problems with this they
> >> need to be identified and fixed.
> >
> > I don't understand. The old code already didn't allow the use of
> > opt_flen with CHECKSUM_PARTIAL.
> >
> Then that's a problem with the old code :-). Is there any other reason
> that we can't use CHECKSUM_PARTIAL with extension headers other than
> lack of correct driver support?

The lack of correct driver support is a big bumper, but as I wrote, I
don't see a reason to not lift this restriction in net-next. I proposed
a new feature flag, or by looking at your series, we could probably use
the extension header okay field for that.

I would be conservative in net though.

Bye,
Hannes

[PATCHv2] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-27 Thread Dan Streetman

The driver currently waits 1us after issuing a RST, but the spec
requires it to wait 1ms.  This adds a msleep(1) before polling the
reset bit.

Signed-off-by: Dan Streetman 
Signed-off-by: Dan Streetman 
---
changes since v1:
 use msleep(1) instead of mdelay(1), per Peter Hurley
 move msleep(1) out of for loop - only msleep once, leave udelay(1)
   inside for loop
 use spec sec title instead of number, per Don Skidmore

 drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
index 4e75843..02cfa1e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
@@ -111,6 +111,13 @@ mac_reset_top:
IXGBE_WRITE_REG(hw, IXGBE_CTRL, ctrl);
IXGBE_WRITE_FLUSH(hw);
 
+   /* From the spec "General Control Registers - Device Control Register":
+* "...programmers must wait approximately 1 ms after setting before
+*  attempting to check if the bit has cleared or to access (read
+*  or write) any other device register."
+*/
+   msleep(1);
+
/* Poll for reset bit to self-clear indicating reset is complete */
for (i = 0; i < 10; i++) {
udelay(1);
-- 
2.5.0
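
The timing change can be illustrated with a small simulation: wait once for
about 1 ms after setting the reset bit, then poll with short delays. This is
a sketch, not the ixgbe code. The demo_* names are hypothetical, and the
delays advance a simulated clock instead of calling msleep() or udelay().

```c
#define DEMO_CTRL_RST 0x1u

/* Hypothetical simulated device; all fields are illustrative. */
struct demo_hw {
	unsigned int ctrl;           /* simulated control register */
	unsigned long now_us;        /* simulated clock, microseconds */
	unsigned long ready_at_us;   /* time at which RST self-clears */
	unsigned long first_read_us; /* time of the first register read */
	int reads;                   /* number of register reads so far */
};

/* Advance the simulated clock (stands in for msleep()/udelay()). */
static void demo_delay_us(struct demo_hw *hw, unsigned long us)
{
	hw->now_us += us;
}

static unsigned int demo_read_ctrl(struct demo_hw *hw)
{
	if (hw->reads++ == 0)
		hw->first_read_us = hw->now_us;
	if (hw->now_us >= hw->ready_at_us)
		hw->ctrl &= ~DEMO_CTRL_RST;
	return hw->ctrl;
}

int demo_reset(struct demo_hw *hw)
{
	int i;

	hw->ctrl |= DEMO_CTRL_RST;
	demo_delay_us(hw, 1000);	/* one ~1 ms wait before any read */

	/* Poll for the reset bit to self-clear. */
	for (i = 0; i < 10; i++) {
		demo_delay_us(hw, 1);
		if (!(demo_read_ctrl(hw) & DEMO_CTRL_RST))
			return 0;
	}
	return -1;
}
```

The point of the layout is that no register is read during the first
millisecond after reset, while the polling loop itself stays short.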


[PATCHv2 1/2] pci_ids: add Netronome Systems vendor

2015-10-27 Thread Jakub Kicinski

Add PCI vendor id for Netronome Systems.

Signed-off-by: Jakub Kicinski 
Signed-off-by: Rolf Neugebauer 
---
 include/linux/pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index d9ba49cedc5d..1acbefc4bbda 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2495,6 +2495,8 @@
 #define PCI_DEVICE_ID_KORENIX_JETCARDF2  0x1700
 #define PCI_DEVICE_ID_KORENIX_JETCARDF3  0x17ff
 
+#define PCI_VENDOR_ID_NETRONOME  0x19ee
+
 #define PCI_VENDOR_ID_QMI  0x1a32
 
 #define PCI_VENDOR_ID_AZWAVE   0x1a3b
-- 
1.9.1


[PATCHv2 0/2] Netronome NFP4000/NFP6000 NIC VF driver

2015-10-27 Thread Jakub Kicinski

This patchset adds support for VFs of Netronome's NFP-4000 and NFP-6000
based NICs. We are currently also preparing the submission for the PF
driver, but it is not quite ready yet. The PF driver can be found on
GitHub:

https://github.com/Netronome/nfp-drv-kmods

changes since v1:
 - reorganize struct nfp_net_r_vector

Jakub Kicinski (2):
  pci_ids: add Netronome Systems vendor
  net: add driver for Netronome NFP4000/NFP6000 NIC VFs

 MAINTAINERS                                        |    7 +
 drivers/net/ethernet/Kconfig                       |    1 +
 drivers/net/ethernet/Makefile                      |    1 +
 drivers/net/ethernet/netronome/Kconfig             |   33 +
 drivers/net/ethernet/netronome/Makefile            |    5 +
 drivers/net/ethernet/netronome/nfp/Makefile        |    8 +
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |  755 ++
 .../net/ethernet/netronome/nfp/nfp_net_common.c    | 2523 
 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h  |  318 +++
 .../net/ethernet/netronome/nfp/nfp_net_debugfs.c   |  235 ++
 .../net/ethernet/netronome/nfp/nfp_net_ethtool.c   |  704 ++
 .../net/ethernet/netronome/nfp/nfp_netvf_main.c    |  404 
 include/linux/pci_ids.h                            |    2 +
 13 files changed, 4996 insertions(+)
 create mode 100644 drivers/net/ethernet/netronome/Kconfig
 create mode 100644 drivers/net/ethernet/netronome/Makefile
 create mode 100644 drivers/net/ethernet/netronome/nfp/Makefile
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net.h
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_common.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_debugfs.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_netvf_main.c

-- 
1.9.1


RE: [V5, 2/6] fsl/fman: Add FMan support

2015-10-27 Thread Liberman Igal



Regards,
Igal Liberman

> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Saturday, September 26, 2015 2:02 AM
> To: Liberman Igal-B31950 
> Cc: netdev@vger.kernel.org; linuxppc-...@lists.ozlabs.org; linux-
> ker...@vger.kernel.org; Bucur Madalin-Cristian-B32716
> 
> Subject: Re: [V5, 2/6] fsl/fman: Add FMan support
> 
> On Mon, Sep 21, 2015 at 02:52:34PM +0300, Igal.Liberman wrote:
> > diff --git a/drivers/net/ethernet/freescale/fman/fman.c
> > b/drivers/net/ethernet/freescale/fman/fman.c
> > new file mode 100644
> > index 000..924685f
> > --- /dev/null
> > +++ b/drivers/net/ethernet/freescale/fman/fman.c
> > @@ -0,0 +1,2738 @@
> > +/*
> > + * Copyright 2008-2015 Freescale Semiconductor Inc.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions are
> met:
> > + * * Redistributions of source code must retain the above copyright
> > + *   notice, this list of conditions and the following disclaimer.
> > + * * Redistributions in binary form must reproduce the above copyright
> > + *   notice, this list of conditions and the following disclaimer in 
> > the
> > + *   documentation and/or other materials provided with the
> distribution.
> > + * * Neither the name of Freescale Semiconductor nor the
> > + *   names of its contributors may be used to endorse or promote
> products
> > + *   derived from this software without specific prior written 
> > permission.
> > + *
> > +//  *
> > + * ALTERNATIVELY, this software may be distributed under the terms of
> > +the
> > + * GNU General Public License ("GPL") as published by the Free
> > +Software
> > + * Foundation, either version 2 of that License or (at your option)
> > +any
> > + * later version.
> 
> What is that // doing there?

Removed.

> 
> > +/* Exceptions bit map */
> > +#define EX_DMA_BUS_ERROR               0x8000
> > +#define EX_DMA_READ_ECC                0x4000
> > +#define EX_DMA_SYSTEM_WRITE_ECC        0x2000
> > +#define EX_DMA_FM_WRITE_ECC            0x1000
> > +#define EX_FPM_STALL_ON_TASKS          0x0800
> > +#define EX_FPM_SINGLE_ECC              0x0400
> > +#define EX_FPM_DOUBLE_ECC              0x0200
> > +#define EX_QMI_SINGLE_ECC              0x0100
> > +#define EX_QMI_DEQ_FROM_UNKNOWN_PORTID 0x0080
> > +#define EX_QMI_DOUBLE_ECC              0x0040
> > +#define EX_BMI_LIST_RAM_ECC            0x0020
> > +#define EX_BMI_STORAGE_PROFILE_ECC     0x0010
> > +#define EX_BMI_STATISTICS_RAM_ECC      0x0008
> > +#define EX_IRAM_ECC                    0x0004
> > +#define EX_MURAM_ECC                   0x0002
> > +#define EX_BMI_DISPATCH_RAM_ECC        0x0001
> > +#define EX_DMA_SINGLE_PORT_ECC         0x8000
> > +
> > +#define DFLT_EXCEPTIONS \
> > +((EX_DMA_BUS_ERROR)               | \
> > + (EX_DMA_READ_ECC)                | \
> > + (EX_DMA_SYSTEM_WRITE_ECC)        | \
> > + (EX_DMA_FM_WRITE_ECC)            | \
> > + (EX_FPM_STALL_ON_TASKS)          | \
> > + (EX_FPM_SINGLE_ECC)              | \
> > + (EX_FPM_DOUBLE_ECC)              | \
> > + (EX_QMI_DEQ_FROM_UNKNOWN_PORTID) | \
> > + (EX_BMI_LIST_RAM_ECC)            | \
> > + (EX_BMI_STORAGE_PROFILE_ECC)     | \
> > + (EX_BMI_STATISTICS_RAM_ECC)      | \
> > + (EX_MURAM_ECC)                   | \
> > + (EX_BMI_DISPATCH_RAM_ECC)        | \
> > + (EX_QMI_DOUBLE_ECC)              | \
> > + (EX_QMI_SINGLE_ECC))
> 
> You don't need parentheses around each symbol.
> 

Removed the parentheses (here and in other places)

> This is only used in one place -- why put the list here rather than in the 
> place
> where it's used?
> 

Moved this define.
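
A minimal sketch of the parenthesization point, assuming hypothetical
DEMO_EX_* bits with made-up values (the real definitions are the ones
quoted above): parentheses around the whole OR expression are still useful,
while parentheses around each plain identifier add nothing.

```c
/* Hypothetical exception bits; values are for illustration only. */
#define DEMO_EX_A 0x4u
#define DEMO_EX_B 0x2u
#define DEMO_EX_C 0x1u

/*
 * The outer parentheses protect against operator precedence at the use
 * site (for example "pending & MASK"); the individual symbols are plain
 * identifiers and need none.
 */
#define DEMO_DFLT_EXCEPTIONS (DEMO_EX_A | DEMO_EX_B | DEMO_EX_C)

/* Keep only the exception bits that are enabled by default. */
unsigned int demo_masked(unsigned int pending)
{
	return pending & DEMO_DFLT_EXCEPTIONS;
}
```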

> > +struct fman_state_struct {
> > +   u8 fm_id;
> > +   u16 fm_clk_freq;
> > +   struct fman_rev_info rev_info;
> > +   bool enabled_time_stamp;
> > +   u8 count1_micro_bit;
> > +   u8 total_num_of_tasks;
> > +   u8 accumulated_num_of_tasks;
> > +   u32 accumulated_fifo_size;
> > +   u8 accumulated_num_of_open_dmas;
> > +   u8 accumulated_num_of_deq_tnums;
> > +   bool low_end_restriction;
> > +   u32 exceptions;
> > +   u32 extra_fifo_pool_size;
> > +   u8 extra_tasks_pool_size;
> > +   u8 extra_open_dmas_pool_size;
> > +   u16 port_mfl[MAX_NUM_OF_MACS];
> > +   u16 mac_mfl[MAX_NUM_OF_MACS];
> > +
> > +   /* SOC specific */
> > +   u32 fm_iram_size;
> > +   /* DMA */
> > +   u32 dma_thresh_max_commq;
> > +   u32 dma_thresh_max_buf;
> > +   u32 max_num_of_open_dmas;
> > +   /* QMI */
> > +   u32 qmi_max_num_of_tnums;
> > +   u32 qmi_def_tnums_thresh;
> > +   /* BMI */
> > +   u32 bmi_max_num_of_tasks;
> > +   u32 bmi_max_fifo_size;
> > +   /* General */
> > +   u32 fm_port_num_of_cg;
> > +   u32 num_of_rx_ports;
> > +   u32 total_fifo_size;
> > +
> > +   u32 qman_channel_base;
> > +   u32 num_of_qman_channels;
> > +
> > +   struct resource *res;
> > +};
> > +
> > +struct fman_cfg {
> > +   u8 disp_limit_t

[PATCH net-next] enic: assign affinity hint to interrupts

2015-10-27 Thread Govindarajulu Varadarajan

The affinity hint lets the driver indicate a preferred CPU mask for its
irqs to the user space daemon irqbalance. This patch sets the irq affinity
hint to cores on the local numa node first; when those are exhausted, it
falls back to non-local numa cores.

Introduce the enic module global variable enic_numa_count[] to store the
number of affinity hints set. If there is more than one enic interface,
we do not want them to share the same affinity hint cpus, so we keep the
history of affinity hint assignments of all interfaces in enic_numa_count.
Introduce the enic_affinity_hint spinlock to protect access to
enic_numa_count.

Also set tx xps cpus mask based on affinity hint.

Signed-off-by: Govindarajulu Varadarajan <_gov...@gmx.com>
---
 drivers/net/ethernet/cisco/enic/enic.h  | 27 ++
 drivers/net/ethernet/cisco/enic/enic_main.c | 81 +
 2 files changed, 108 insertions(+)

diff --git a/drivers/net/ethernet/cisco/enic/enic.h 
b/drivers/net/ethernet/cisco/enic/enic.h
index 6401ba99..1671fa3 100644
--- a/drivers/net/ethernet/cisco/enic/enic.h
+++ b/drivers/net/ethernet/cisco/enic/enic.h
@@ -50,6 +50,7 @@ struct enic_msix_entry {
char devname[IFNAMSIZ];
irqreturn_t (*isr)(int, void *);
void *devid;
+   cpumask_var_t affinity_mask;
 };
 
 /* Store only the lower range.  Higher range is given by fw. */
@@ -263,6 +264,32 @@ static inline unsigned int enic_msix_notify_intr(struct 
enic *enic)
return enic->rq_count + enic->wq_count + 1;
 }
 
+static inline bool enic_is_err_intr(struct enic *enic, int intr)
+{
+   switch (vnic_dev_get_intr_mode(enic->vdev)) {
+   case VNIC_DEV_INTR_MODE_INTX:
+   return intr == enic_legacy_err_intr();
+   case VNIC_DEV_INTR_MODE_MSIX:
+   return intr == enic_msix_err_intr(enic);
+   case VNIC_DEV_INTR_MODE_MSI:
+   default:
+   return false;
+   }
+}
+
+static inline bool enic_is_notify_intr(struct enic *enic, int intr)
+{
+   switch (vnic_dev_get_intr_mode(enic->vdev)) {
+   case VNIC_DEV_INTR_MODE_INTX:
+   return intr == enic_legacy_notify_intr();
+   case VNIC_DEV_INTR_MODE_MSIX:
+   return intr == enic_msix_notify_intr(enic);
+   case VNIC_DEV_INTR_MODE_MSI:
+   default:
+   return false;
+   }
+}
+
 static inline int enic_dma_map_check(struct enic *enic, dma_addr_t dma_addr)
 {
if (unlikely(pci_dma_mapping_error(enic->pdev, dma_addr))) {
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c 
b/drivers/net/ethernet/cisco/enic/enic_main.c
index 0c22fd0..c39e495 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef CONFIG_RFS_ACCEL
 #include 
 #endif
@@ -69,6 +70,9 @@
 
 #define RX_COPYBREAK_DEFAULT   256
 
+static int enic_numa_count[MAX_NUMNODES];
+DEFINE_SPINLOCK(enic_affinity_hint);
+
 /* Supported devices */
 static const struct pci_device_id enic_id_table[] = {
{ PCI_VDEVICE(CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET) },
@@ -112,6 +116,77 @@ static struct enic_intr_mod_range 
mod_range[ENIC_MAX_LINK_SPEEDS] = {
{3,  6}, /* 10 - 40 Gbps */
 };
 
+static void enic_init_affinity_hint(struct enic *enic)
+{
+   int numa_node = dev_to_node(&enic->pdev->dev);
+   int numa_idx = numa_node == -1 ? 0 : numa_node;
+   int index;
+   int i;
+
+   spin_lock(&enic_affinity_hint);
+   for (i = 0; i < enic->intr_count; i++) {
+   if (enic_is_err_intr(enic, i) || enic_is_notify_intr(enic, i) ||
+   (enic->msix[i].affinity_mask &&
+!cpumask_empty(enic->msix[i].affinity_mask)))
+   continue;
+   if (zalloc_cpumask_var(&enic->msix[i].affinity_mask,
+  GFP_KERNEL)) {
+   index = enic_numa_count[numa_idx]++;
+   cpumask_set_cpu(cpumask_local_spread(index, numa_node),
+   enic->msix[i].affinity_mask);
+   }
+   }
+   spin_unlock(&enic_affinity_hint);
+}
+
+static void enic_free_affinity_hint(struct enic *enic)
+{
+   int i;
+
+   for (i = 0; i < enic->intr_count; i++) {
+   if (enic_is_err_intr(enic, i) || enic_is_notify_intr(enic, i))
+   continue;
+   free_cpumask_var(enic->msix[i].affinity_mask);
+   }
+}
+
+static void enic_set_affinity_hint(struct enic *enic)
+{
+   int i;
+   int err;
+
+   for (i = 0; i < enic->intr_count; i++) {
+   if (enic_is_err_intr(enic, i)   ||
+   enic_is_notify_intr(enic, i)||
+   !enic->msix[i].affinity_mask||
+   cpumask_empty(enic->msix[i].affinity_mask))
+   continue;
+   err = irq_set_affinity_hint(enic->msix_entry[i].vector,
+

[linux-review:Neil-Armstrong/net-dsa-cleanup-dsa-driver/20151028-003842] 943beed83806cdd6149e6cf69db108f4c0b6fe31 BUILD DONE

2015-10-27 Thread kbuild test robot

https://github.com/0day-ci/linux  Neil-Armstrong/net-dsa-cleanup-dsa-driver/20151028-003842
943beed83806cdd6149e6cf69db108f4c0b6fe31  net: dsa: make usage of mv88e6xxx common remove function

drivers/net/dsa/bcm_sf2.c:1077:14: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
drivers/net/dsa/bcm_sf2.c:1077:2: error: unknown field 'remove' specified in initializer
drivers/net/dsa/mv88e6123_61_65.c:125:14: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
drivers/net/dsa/mv88e6123_61_65.c:125:2: error: unknown field 'remove' specified in initializer
drivers/net/dsa/mv88e6123_61_65.c:125:2: warning: initialization from incompatible pointer type
drivers/net/dsa/mv88e6123_61_65.c:125:2: warning: initialization from incompatible pointer type [enabled by default]
drivers/net/dsa/mv88e6131.c:185:14: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
drivers/net/dsa/mv88e6131.c:185:2: error: unknown field 'remove' specified in initializer
drivers/net/dsa/mv88e6171.c:104:14: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
drivers/net/dsa/mv88e6171.c:104:2: error: unknown field 'remove' specified in initializer
drivers/net/dsa/mv88e6352.c:324:14: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
drivers/net/dsa/mv88e6352.c:324:2: error: unknown field 'remove' specified in initializer
drivers/net/dsa/mv88e6352.c:324:2: warning: initialization from incompatible pointer type
drivers/net/dsa/mv88e6352.c:324:2: warning: initialization from incompatible pointer type [enabled by default]

Error ids grouped by kconfigs:

recent_errors
├── alpha-allyesconfig
│   ├── drivers-net-dsa-mv88e6123_61_65.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6123_61_65.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6352.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-net-dsa-mv88e6352.c:warning:initialization-from-incompatible-pointer-type
├── blackfin-allyesconfig
│   ├── drivers-net-dsa-mv88e6123_61_65.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6123_61_65.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6352.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-net-dsa-mv88e6352.c:warning:initialization-from-incompatible-pointer-type
├── cris-allyesconfig
│   ├── drivers-net-dsa-mv88e6123_61_65.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6123_61_65.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6352.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-net-dsa-mv88e6352.c:warning:initialization-from-incompatible-pointer-type
├── i386-allmodconfig
│   ├── drivers-net-dsa-bcm_sf2.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-bcm_sf2.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6123_61_65.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6123_61_65.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6171.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6171.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6352.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-net-dsa-mv88e6352.c:warning:initialization-from-incompatible-pointer-type
├── i386-randconfig-sb0-1028
│   ├── drivers-net-dsa-mv88e6131.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-net-dsa-mv88e6131.c:warning:initialization-from-incompatible-pointer-type
├── ia64-allyesconfig
│   ├── drivers-net-dsa-mv88e6123_61_65.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6123_61_65.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6352.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-net-dsa-mv88e6352.c:warning:initialization-from-incompatible-pointer-type
├── m68k-allyesconfig
│   ├── drivers-net-dsa-mv88e6123_61_65.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6123_61_65.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6352.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-net-dsa-mv88e6352.c:warning:initialization-from-incompatible-pointer-type
├── mips-allyesconfig
│   ├── drivers-net-dsa-mv88e6123_61_65.c:error:unknown-field-remove-specified-in-initializer
│   ├── drivers-net-dsa-mv88e6123_61_65.c:warning:initialization-from-incompatible-pointer-type
│   ├── drivers-net-dsa-mv88e6352.c:error:unknown-field-remove-specified-in-initializer
│   └── drivers-ne

Re: [PATCH] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-27 Thread Peter Hurley

Hi Dan,

On 10/26/2015 08:16 PM, dan.street...@canonical.com wrote:
> From: Dan Streetman 
> 
> The driver currently waits 1us after issuing a RST, but the spec
> requires it to wait 1ms.
> 
> Signed-off-by: Dan Streetman 
> Signed-off-by: Dan Streetman 
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c 
> b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> index 4e75843..147bc65 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> @@ -113,7 +113,12 @@ mac_reset_top:
>  
>   /* Poll for reset bit to self-clear indicating reset is complete */
>   for (i = 0; i < 10; i++) {
> - udelay(1);
> + /* sec 8.2.4.1.1 :
> +  * programmers must wait approximately 1 ms after setting before
> +  * attempting to check if the bit has cleared or to access (read
> +  * or write) any other device register.
> +  */
> + mdelay(1);

Since ixgbe_reset_hw_x540() goes on to msleep(100) immediately after this
busy-wait loop, this should instead be:

msleep(1);

Regards,
Peter Hurley


>   ctrl = IXGBE_READ_REG(hw, IXGBE_CTRL);
>   if (!(ctrl & IXGBE_CTRL_RST_MASK))
>   break;
> 
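
The suggestion above (sleep between polls rather than busy-wait, since the surrounding code is already allowed to sleep) can be illustrated with a small user-space sketch. This is not the ixgbe code: `read_ctrl()`, `sleep_ms()`, `CTRL_RST_MASK` and the fake register are hypothetical stand-ins, and `nanosleep()` merely plays the role that `msleep(1)` would play in the kernel.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical reset bit; the real mask lives in the driver headers. */
#define CTRL_RST_MASK 0x00000008u

/* Fake device: the reset bit self-clears on the third register read. */
static uint32_t fake_ctrl = CTRL_RST_MASK;
static int reads_until_clear = 3;

static uint32_t read_ctrl(void)
{
    if (reads_until_clear > 0 && --reads_until_clear == 0)
        fake_ctrl &= ~CTRL_RST_MASK;
    return fake_ctrl;
}

/* Sleeping delay: yields the CPU instead of spinning. */
static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Wait about 1 ms before each check, as the quoted spec text asks.
 * Returns the number of polls used, or -1 if the bit never cleared. */
static int wait_for_reset_clear(int max_polls)
{
    assert(max_polls > 0);
    for (int i = 0; i < max_polls; i++) {
        sleep_ms(1);
        if (!(read_ctrl() & CTRL_RST_MASK))
            return i + 1;
    }
    return -1;
}
```

The only design point is the placement of the delay: it comes before every register read, so the device is never touched sooner than the wait interval after the reset was issued.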

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-27 Thread Dan Streetman

On Tue, Oct 27, 2015 at 1:03 PM, Skidmore, Donald C
 wrote:
>
>
>> -Original Message-
>> From: dan.street...@canonical.com
>> [mailto:dan.street...@canonical.com]
>> Sent: Monday, October 26, 2015 5:16 PM
>> To: Kirsher, Jeffrey T
>> Cc: Brandeburg, Jesse; Nelson, Shannon; Wyborny, Carolyn; Skidmore,
>> Donald C; Vick, Matthew; Ronciak, John; Williams, Mitch A; intel-wired-
>> l...@lists.osuosl.org; netdev@vger.kernel.org; linux-ker...@vger.kernel.org;
>> Dan Streetman; Dan Streetman
>> Subject: [PATCH] ixgbe: Wait for 1ms, not 1us, after RST
>>
>> From: Dan Streetman 
>>
>> The driver currently waits 1us after issuing a RST, but the spec requires it 
>> to
>> wait 1ms.
>>
>> Signed-off-by: Dan Streetman 
>> Signed-off-by: Dan Streetman 
>> ---
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
>> index 4e75843..147bc65 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
>> @@ -113,7 +113,12 @@ mac_reset_top:
>>
>>   /* Poll for reset bit to self-clear indicating reset is complete */
>>   for (i = 0; i < 10; i++) {
>> - udelay(1);
>> + /* sec 8.2.4.1.1 :
>> +  * programmers must wait approximately 1 ms after setting
>> before
>> +  * attempting to check if the bit has cleared or to access
>> (read
>> +  * or write) any other device register.
>> +  */
>> + mdelay(1);
>>   ctrl = IXGBE_READ_REG(hw, IXGBE_CTRL);
>>   if (!(ctrl & IXGBE_CTRL_RST_MASK))
>>   break;
>> --
>> 2.5.0
>
> While the Data Sheet does mention that this should take ~ 1ms, we are in a 
> busy wait state so it probably isn't that big of a deal to check more 
> frequently for our exit condition.  That said there are plenty of other 
> delays later on in the reset path so keeping the udelay really isn't speeding 
> things up much. :)

I don't know the hw details of course, I was just going on the spec's
use of "must" when stating how long the driver should wait before
talking to the hw.  If the hw doesn't actually care, then no need for
this patch (although the spec should probably be changed to not use
"must").

Thanks!

>
> Also, it normally isn't a good idea to reference a section number in the data
> sheet, as they do seem to change with updates.  We are most likely a bit
> safer here, as it is one of the first in a list of register descriptions and
> thus less likely to move.

Re: [PATCH net v2 3/4] ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets

2015-10-27 Thread Tom Herbert

On Tue, Oct 27, 2015 at 9:44 AM, Hannes Frederic Sowa
 wrote:
>
>
> On Tue, Oct 27, 2015, at 17:36, Tom Herbert wrote:
>> > -   if (cork->length + length > maxnonfragsize - headersize) {
>> > +   if (cork->length + length > maxnonfragsize - headersize) {
>> >  emsgsize:
>> > -   ipv6_local_error(sk, EMSGSIZE, fl6,
>> > -mtu - headersize +
>> > -sizeof(struct ipv6hdr));
>> > -   return -EMSGSIZE;
>> > -   }
>> > +   ipv6_local_error(sk, EMSGSIZE, fl6,
>> > +mtu - headersize +
>> > +sizeof(struct ipv6hdr));
>> > +   return -EMSGSIZE;
>> > }
>> >
>> > +   /* CHECKSUM_PARTIAL only with no extension headers and when
>>
>> No, please don't do this. CHECKSUM_PARTIAL should work with extension
>> headers as defined, so this is just disabling otherwise valid and
>> useful functionality. If (some) drivers have problems with this they
>> need to be identified and fixed.
>
> I don't understand. The old code already didn't allow the use of
> opt_flen with CHECKSUM_PARTIAL.
>
Then that's a problem with the old code :-). Is there any other reason
that we can't use CHECKSUM_PARTIAL with extension headers other than
lack of correct driver support?

> The MSG_MORE check has nothing to do with that but only with corking.
>
> Bye,
> Hannes

Re: [PATCH] ixgbe: check Master Disable bit after setting

2015-10-27 Thread Dan Streetman

On Tue, Oct 27, 2015 at 1:14 PM, Skidmore, Donald C
 wrote:
>
>
>> -Original Message-
>> From: dan.street...@canonical.com
>> [mailto:dan.street...@canonical.com]
>> Sent: Monday, October 26, 2015 5:20 PM
>> To: Kirsher, Jeffrey T
>> Cc: Brandeburg, Jesse; Nelson, Shannon; Wyborny, Carolyn; Skidmore,
>> Donald C; Vick, Matthew; Ronciak, John; Williams, Mitch A; intel-wired-
>> l...@lists.osuosl.org; netdev@vger.kernel.org; linux-ker...@vger.kernel.org;
>> Dan Streetman; Dan Streetman
>> Subject: [PATCH] ixgbe: check Master Disable bit after setting
>>
>> From: Dan Streetman 
>>
>> Spec section 8.2.4.1.1 notes that after setting the PCIe Master Disable bit, 
>> it
>> must be read to verify it was set before polling the Master Enable status 
>> bit.
>>
>> This adds the check to verify the Master Disable bit was set.
>>
>> This also corrects the spec section number reference - the Master Disable
>> section is 5.2.4.3.2, not 5.2.5.3.2.
>>
>> Signed-off-by: Dan Streetman 
>> Signed-off-by: Dan Streetman 
>> ---
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 13 -
>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
>> index 3f56a80..abfada7 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
>> @@ -2453,6 +2453,16 @@ static s32 ixgbe_disable_pcie_master(struct
>> ixgbe_hw *hw)
>>   /* Always set this bit to ensure any future transactions are blocked */
>>   IXGBE_WRITE_REG(hw, IXGBE_CTRL, IXGBE_CTRL_GIO_DIS);
>>
>> + /* Spec sec 8.2.4.1.1, Master Disable bit :
>> +  * "After doing any change to this bit the host must read that
>> +  *  the bit has been modified as expected before reading
>> +  *  STATUS.PCIe Master Enable Status bit."
>> +  */
>> + if (!(IXGBE_READ_REG(hw, IXGBE_CTRL) & IXGBE_CTRL_GIO_DIS)) {
>> + hw_err(hw, "GIO Master Disable bit didn't set\n");
>> + goto gio_dis_fail;
>> + }
>> +
>>   /* Exit if master requests are blocked */
>>   if (!(IXGBE_READ_REG(hw, IXGBE_STATUS) & IXGBE_STATUS_GIO) ||
>>   ixgbe_removed(hw->hw_addr))
>> @@ -2467,13 +2477,14 @@ static s32 ixgbe_disable_pcie_master(struct
>> ixgbe_hw *hw)
>>
>>   /*
>>* Two consecutive resets are required via CTRL.RST per datasheet
>> -  * 5.2.5.3.2 Master Disable.  We set a flag to inform the reset routine
>> +  * 5.2.4.3.2 Master Disable.  We set a flag to inform the reset
>> +routine
>>* of this need.  The first reset prevents new master requests from
>>* being issued by our device.  We then must wait 1usec or more for
>> any
>>* remaining completions from the PCIe bus to trickle in, and then
>> reset
>>* again to clear out any effects they may have had on our device.
>>*/
>>   hw_dbg(hw, "GIO Master Disable bit didn't clear - requesting
>> resets\n");
>> +gio_dis_fail:
>>   hw->mac.flags |= IXGBE_FLAGS_DOUBLE_RESET_REQUIRED;
>>
>>   /*
>> --
>> 2.5.0
>
> Is this patch correcting some issue you're running into?

I had a report of the hw hanging:
http://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20150928/002081.html

However I have no idea if this or the other patch are related at all,
I just reviewed the spec and driver and noticed these two differences;
basically a shot in the dark.  The person I got the report of hw hang
from has changed their config and can't reproduce the problem anymore
- although before the change, they reproduced it several times (over a
week or so).

Do you have any ideas on how the hw would get into that state and/or
how to recover?  I can send more data from the logs if it would help.

> I ask as I don't believe this check is necessary, since right below setting
> CTRL.PCIE_MASTER_DISABLE we loop waiting for STATUS.PCIE_MASTER_ENABLE_STATUS
> to clear.  This should have the same effect as verifying that the write to
> CTRL.PCIE_MASTER_DISABLE took effect.

Well, this patch doesn't do the same check - it tries to verify that
master disable was actually set, and errors out if it wasn't, instead
of continuing on to poll enable status (which will not change, since
master disable wasn't set).  That's what the spec says the driver
should do, at least.

>
> Thanks,
> Don Skidmore 
>

Re: [RFC PATCH 1/3] net: dsa: bcm_sf2: cleanup resources in remove callback

2015-10-27 Thread kbuild test robot

Hi Neil,

[auto build test ERROR on net/master -- if it's an inappropriate base, please
suggest rules for selecting a more suitable base]

url:
https://github.com/0day-ci/linux/commits/Neil-Armstrong/net-dsa-cleanup-dsa-driver/20151028-003842
config: x86_64-allmodconfig (attached as .config)
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

>> drivers/net/dsa/bcm_sf2.c:1077:2: error: unknown field 'remove' specified in initializer
 .remove   = bcm_sf2_sw_remove,
 ^
>> drivers/net/dsa/bcm_sf2.c:1077:14: warning: initialization from incompatible pointer type [-Wincompatible-pointer-types]
 .remove   = bcm_sf2_sw_remove,
 ^
   drivers/net/dsa/bcm_sf2.c:1077:14: note: (near initialization for 'bcm_sf2_switch_driver.setup')

vim +/remove +1077 drivers/net/dsa/bcm_sf2.c

  1071  }
  1072  
  1073  static struct dsa_switch_driver bcm_sf2_switch_driver = {
  1074  .tag_protocol   = DSA_TAG_PROTO_BRCM,
  1075  .priv_size  = sizeof(struct bcm_sf2_priv),
  1076  .probe  = bcm_sf2_sw_probe,
> 1077  .remove = bcm_sf2_sw_remove,
  1078  .setup  = bcm_sf2_sw_setup,
  1079  .set_addr   = bcm_sf2_sw_set_addr,
  1080  .get_phy_flags  = bcm_sf2_sw_get_phy_flags,
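
GCC reports "unknown field" when a designated initializer names a member that the struct type does not declare, so the error above suggests that struct dsa_switch_driver in the tested tree has no `remove` member. A minimal sketch of a driver-ops struct that does declare one follows; the `demo_*` names are hypothetical and are not the DSA API.

```c
#include <assert.h>
#include <stddef.h>

struct demo_switch {
    int resources;  /* nonzero while the switch holds resources */
};

struct demo_switch_driver {
    int  (*probe)(struct demo_switch *sw);
    /* Without this member, ".remove = ..." below would not compile. */
    void (*remove)(struct demo_switch *sw);
    int  (*setup)(struct demo_switch *sw);
};

static int demo_probe(struct demo_switch *sw)
{
    sw->resources = 1;
    return 0;
}

static void demo_remove(struct demo_switch *sw)
{
    assert(sw != NULL);
    sw->resources = 0;  /* release whatever probe acquired */
}

static int demo_setup(struct demo_switch *sw)
{
    return sw->resources ? 0 : -1;
}

static const struct demo_switch_driver demo_driver = {
    .probe  = demo_probe,
    .remove = demo_remove,
    .setup  = demo_setup,
};
```

The accompanying "near initialization for ... .setup" note is consistent with the compiler falling back to the next member after the unknown designator, which would also explain the incompatible-pointer warning on the same line.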

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data

RE: [Intel-wired-lan] [PATCH] fm10k:Fix error handling in the function fm10k_setup_tc

2015-10-27 Thread Singh, Krishneil K

-Original Message-
From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On 
Behalf Of Nicholas Krause
Sent: Tuesday, October 20, 2015 2:05 PM
To: Kirsher, Jeffrey T 
Cc: linux-ker...@vger.kernel.org; intel-wired-...@lists.osuosl.org; 
netdev@vger.kernel.org
Subject: [Intel-wired-lan] [PATCH] fm10k:Fix error handling in the function 
fm10k_setup_tc

This fixes error handling in the function fm10k_setup_tc to properly check
whether the call to fm10k_open has failed by returning an error and, if so,
return immediately to the caller of fm10k_setup_tc to signal this
non-recoverable failure.

Signed-off-by: Nicholas Krause 
---

Tested-by: Krishneil Singh 


Re: [PATCH net-next V18 3/3] 802.1AD: Flow handling, actions, vlan parsing and netlink attributes

2015-10-27 Thread Pravin Shelar

On Tue, Oct 27, 2015 at 9:45 AM, Thomas F Herbert
 wrote:
> On 10/26/15 10:10 PM, Pravin Shelar wrote:
> Thanks for the review.
>>
>> On Sun, Oct 25, 2015 at 5:11 PM, Thomas F Herbert
>>  wrote:
>>>
>>> Add support for 802.1ad including the ability to push and pop double
>>> tagged vlans. Add support for 802.1ad to netlink parsing and flow
>>> conversion. Uses double nested encap attributes to represent double
>>> tagged vlan. Inner TPID encoded along with ctci in nested attributes.
>>> Outer
>>> TPID is also encoded in the flow key.
>>>
>>> Signed-off-by: Thomas F Herbert 
>>
>> This patch does not apply on current master due to conflicts related
>> to the net-branch merge.
>
> OK, I will rebase.
>
>>
>>> ---
>>>   net/openvswitch/actions.c  |   6 +-
>>>   net/openvswitch/flow.c |  76 
>>>   net/openvswitch/flow.h |   8 +-
>>>   net/openvswitch/flow_netlink.c | 199
>>> +
>>>   net/openvswitch/vport-netdev.c |   4 +-
>>>   5 files changed, 252 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
>>> index c8db44a..ed19e2b 100644
>>> --- a/net/openvswitch/flow.c
>>> +++ b/net/openvswitch/flow.c
>>> @@ -302,24 +302,68 @@ static bool icmp6hdr_ok(struct sk_buff *skb)
>>>sizeof(struct icmp6hdr));
>>>   }
>>>
>>> -static int parse_vlan(struct sk_buff *skb, struct sw_flow_key *key)
>>> +/* Parse vlan tag from vlan header.
>>> + * Returns ERROR on memory error.
>>> + * Returns 0 if it encounters a non-vlan or incomplete packet.
>>> + * Returns 1 after successfully parsing vlan tag.
>>> + */
>>> +
>>> +static int parse_vlan_tag(struct sk_buff *skb, struct vlan_head *vlan)
>>>   {
>>> -   struct qtag_prefix {
>>> -   __be16 eth_type; /* ETH_P_8021Q */
>>> -   __be16 tci;
>>> -   };
>>> -   struct qtag_prefix *qp;
>>> +   struct vlan_head *qp = (struct vlan_head *)skb->data;
>>> +
>>> +   if (likely(!eth_type_vlan(qp->tpid)))
>>> +   return 0;
>>>
>>> -   if (unlikely(skb->len < sizeof(struct qtag_prefix) +
>>> sizeof(__be16)))
>>> +   if (unlikely(skb->len < sizeof(struct vlan_head) +
>>> sizeof(__be16)))
>>>  return 0;
>>
>> Why do we need extra sizeof(__be16) bytes here?
>
> I don't have an answer to your question. I didn't write this code and have
> wondered why the extra two bytes were reserved. I don't know why they
> should be necessary for inner or outer vlans, or for the HW accelerated or
> non-accelerated case. If no reviewer can state a case for it, I will
> remove it with the next version of this patch.
>
Looks like it is an optimization for parsing the ethertype, so let's keep it.
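
One way to read that length check, sketched in user space: requiring two bytes beyond the 4-byte tag means the ethertype that follows the tag can be read without a second bounds check. The function below is a hypothetical stand-in, not the openvswitch code; it works on a plain byte buffer and stores fields in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

struct vlan_head {
    uint16_t tpid;  /* host byte order in this sketch */
    uint16_t tci;
};

static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 1 after parsing a tag, 0 for a non-vlan or incomplete packet. */
static int parse_vlan_tag(const uint8_t *data, size_t len,
                          struct vlan_head *out, uint16_t *next_type)
{
    assert(data != NULL && out != NULL && next_type != NULL);

    if (len < sizeof(uint16_t))
        return 0;

    uint16_t tpid = load_be16(data);
    if (tpid != ETH_P_8021Q && tpid != ETH_P_8021AD)
        return 0;

    /* 4-byte tag plus the 2-byte ethertype that follows it. */
    if (len < sizeof(struct vlan_head) + sizeof(uint16_t))
        return 0;

    out->tpid = tpid;
    out->tci = load_be16(data + 2);
    *next_type = load_be16(data + 4);  /* safe: covered by the check above */
    return 1;
}
```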

>>>
>>>  } else if (!tci) {
>>>  /* Corner case for truncated 802.1Q header. */
>>>  if (nla_len(encap)) {
>>> @@ -1169,7 +1312,7 @@ int ovs_nla_get_match(struct net *net, struct
>>> sw_flow_match *match,
>>>  goto free_newmask;
>>>
>>>  /* Always match on tci. */
>>> -   SW_FLOW_KEY_PUT(match, eth.tci, htons(0x), true);
>>> +   SW_FLOW_KEY_PUT(match, eth.vlan.tci, htons(0x),
>>> true);
>>
>> Also need to exact match on inner tci.
>
> This code sets a match on tci even if no vlan is present. Is this for the
> case where there is no explicit mask specified in the netlink encoded flow?
> If that is correct, then it does need to be done for the inner vlan too.

Yes, by default it needs to be matched. Userspace can override it
with a different wildcard.

RE: [PATCH] ixgbe: check Master Disable bit after setting

2015-10-27 Thread Skidmore, Donald C



> -Original Message-
> From: dan.street...@canonical.com
> [mailto:dan.street...@canonical.com]
> Sent: Monday, October 26, 2015 5:20 PM
> To: Kirsher, Jeffrey T
> Cc: Brandeburg, Jesse; Nelson, Shannon; Wyborny, Carolyn; Skidmore,
> Donald C; Vick, Matthew; Ronciak, John; Williams, Mitch A; intel-wired-
> l...@lists.osuosl.org; netdev@vger.kernel.org; linux-ker...@vger.kernel.org;
> Dan Streetman; Dan Streetman
> Subject: [PATCH] ixgbe: check Master Disable bit after setting
> 
> From: Dan Streetman 
> 
> Spec section 8.2.4.1.1 notes that after setting the PCIe Master Disable bit, 
> it
> must be read to verify it was set before polling the Master Enable status bit.
> 
> This adds the check to verify the Master Disable bit was set.
> 
> This also corrects the spec section number reference - the Master Disable
> section is 5.2.4.3.2, not 5.2.5.3.2.
> 
> Signed-off-by: Dan Streetman 
> Signed-off-by: Dan Streetman 
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
> b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
> index 3f56a80..abfada7 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
> @@ -2453,6 +2453,16 @@ static s32 ixgbe_disable_pcie_master(struct
> ixgbe_hw *hw)
>   /* Always set this bit to ensure any future transactions are blocked */
>   IXGBE_WRITE_REG(hw, IXGBE_CTRL, IXGBE_CTRL_GIO_DIS);
> 
> + /* Spec sec 8.2.4.1.1, Master Disable bit :
> +  * "After doing any change to this bit the host must read that
> +  *  the bit has been modified as expected before reading
> +  *  STATUS.PCIe Master Enable Status bit."
> +  */
> + if (!(IXGBE_READ_REG(hw, IXGBE_CTRL) & IXGBE_CTRL_GIO_DIS)) {
> + hw_err(hw, "GIO Master Disable bit didn't set\n");
> + goto gio_dis_fail;
> + }
> +
>   /* Exit if master requests are blocked */
>   if (!(IXGBE_READ_REG(hw, IXGBE_STATUS) & IXGBE_STATUS_GIO) ||
>   ixgbe_removed(hw->hw_addr))
> @@ -2467,13 +2477,14 @@ static s32 ixgbe_disable_pcie_master(struct
> ixgbe_hw *hw)
> 
>   /*
>* Two consecutive resets are required via CTRL.RST per datasheet
> -  * 5.2.5.3.2 Master Disable.  We set a flag to inform the reset routine
> +  * 5.2.4.3.2 Master Disable.  We set a flag to inform the reset
> +routine
>* of this need.  The first reset prevents new master requests from
>* being issued by our device.  We then must wait 1usec or more for
> any
>* remaining completions from the PCIe bus to trickle in, and then
> reset
>* again to clear out any effects they may have had on our device.
>*/
>   hw_dbg(hw, "GIO Master Disable bit didn't clear - requesting
> resets\n");
> +gio_dis_fail:
>   hw->mac.flags |= IXGBE_FLAGS_DOUBLE_RESET_REQUIRED;
> 
>   /*
> --
> 2.5.0

Is this patch correcting some issue you're running into?
I ask as I don't believe this check is necessary, since right below setting
CTRL.PCIE_MASTER_DISABLE we loop waiting for STATUS.PCIE_MASTER_ENABLE_STATUS
to clear.  This should have the same effect as verifying that the write to
CTRL.PCIE_MASTER_DISABLE took effect.

Thanks,
Don Skidmore 


RE: [Intel-wired-lan] [PATCH] fm10k:Fix error handling in the function fm10k_setup_tc for certain function calls

2015-10-27 Thread Singh, Krishneil K

-Original Message-
From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On 
Behalf Of Nicholas Krause
Sent: Friday, October 9, 2015 8:53 AM
To: Kirsher, Jeffrey T 
Cc: linux-ker...@vger.kernel.org; intel-wired-...@lists.osuosl.org; 
netdev@vger.kernel.org
Subject: [Intel-wired-lan] [PATCH] fm10k:Fix error handling in the function 
fm10k_setup_tc for certain function calls

This fixes the function fm10k_setup_tc to properly check whether the call to
either fm10k_init_queueing_scheme or fm10k_mbx_request_irq has failed by
returning an error code. If so, fm10k_setup_tc exits immediately and returns
the error code from the failed call, signalling to the caller that setting up
the tc on the device has failed and that the caller needs to handle this
failed setup.

Signed-off-by: Nicholas Krause 
---

Tested-by: Krishneil Singh 
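
The error-propagation pattern described in the commit message can be sketched as follows. This is a user-space illustration under stated assumptions, not the fm10k code: `struct demo_dev` and the `demo_*` helpers are hypothetical, and a real driver would likely also have to unwind the first step when the second one fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_dev {
    int fail_queueing;  /* force the first step to fail */
    int fail_irq;       /* force the second step to fail */
    int configured;     /* set only when every step succeeded */
};

static int demo_init_queueing(struct demo_dev *d)
{
    return d->fail_queueing ? -ENOMEM : 0;
}

static int demo_request_irq(struct demo_dev *d)
{
    return d->fail_irq ? -EBUSY : 0;
}

/* Check each step and hand the first error code back to the caller. */
static int demo_setup_tc(struct demo_dev *d)
{
    int err;

    assert(d != NULL);
    d->configured = 0;

    err = demo_init_queueing(d);
    if (err)
        return err;

    err = demo_request_irq(d);
    if (err)
        return err;

    d->configured = 1;
    return 0;
}
```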


RE: [Intel-wired-lan] [PATCH 3/3] fm10k: use napi_schedule_irqoff()

2015-10-27 Thread Singh, Krishneil K



-Original Message-
From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On 
Behalf Of Alexander Duyck
Sent: Tuesday, September 29, 2015 3:20 PM
To: netdev@vger.kernel.org; intel-wired-...@lists.osuosl.org
Subject: [Intel-wired-lan] [PATCH 3/3] fm10k: use napi_schedule_irqoff()

The fm10k_msix_clean_rings function runs from hard interrupt context or with 
interrupts already disabled in netpoll.

It can use napi_schedule_irqoff() instead of napi_schedule()

Signed-off-by: Alexander Duyck 
---

Tested-by: Krishneil Singh 


