Re: pull-request: bpf 2018-09-02

2018-09-02 Thread David Miller
From: Daniel Borkmann 
Date: Sun,  2 Sep 2018 23:20:31 +0200

> The following pull-request contains BPF updates for your *net* tree.
> 
> The main changes are:
> 
> 1) Fix one remaining buggy offset override in sockmap's bpf_msg_pull_data()
>    when linearizing multiple scatterlist elements, from Tushar.
> 
> 2) Fix BPF sockmap's misuse of ULP when a collision with another ULP is
>    found on map update where it would release existing ULP. syzbot found and
>    triggered this a couple of times now, fix from John.
> 
> 3) Add missing xskmap type to bpftool so it will properly show the type
>    on map dump, from Prashant.
> 
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Pulled, thanks Daniel.


Re: [PATCH net] net/ipv6: Only update MTU metric if it set

2018-09-02 Thread David Miller
From: dsah...@kernel.org
Date: Thu, 30 Aug 2018 14:15:43 -0700

> From: David Ahern 
> 
> Jan reported a regression after an update to 4.18.5. In this case ipv6
> default route is setup by systemd-networkd based on data from an RA. The
> RA contains an MTU of 1492 which is used when the route is first inserted
> but then systemd-networkd pushes down updates to the default route
> without the mtu set.
> 
> Prior to the change to fib6_info, metrics such as MTU were held in the
> dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if
> any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not,
> the value from the metrics is used if it is set and finally falling
> back to the idev value.
> 
> After the fib6_info change metrics are contained in the fib6_info struct
> and there is no equivalent to rt6i_pmtu. To maintain consistency with
> the old behavior the new code should only reset the MTU in the metrics
> if the route update has it set.
> 
> Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
> Reported-by: Jan Janssen 
> Signed-off-by: David Ahern 

Applied and queued up for -stable, thanks David.
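
The lookup order and the fix described above can be sketched in plain C.
This is only an illustration under stated assumptions: the struct and
function names are hypothetical stand-ins, not the kernel's fib6_info or
ip6_mtu() code, and a value of 0 is assumed to mean "not set".

```c
#include <assert.h>

/* Hypothetical stand-in for per-route state; 0 means "not set". */
struct route_sketch {
    unsigned int pmtu;       /* learned path MTU (the old rt6i_pmtu role) */
    unsigned int metric_mtu; /* MTU stored in the route metrics */
};

/* Old lookup order: pmtu first, then the metric, then the device value. */
static unsigned int effective_mtu(const struct route_sketch *rt,
                                  unsigned int idev_mtu)
{
    assert(idev_mtu != 0); /* assumption: the device always has an MTU */
    if (rt->pmtu)
        return rt->pmtu;
    if (rt->metric_mtu)
        return rt->metric_mtu;
    return idev_mtu;
}

/* Behaviour the patch asks for: only overwrite the metric when the
 * route update actually carries an MTU. */
static void apply_route_update(struct route_sketch *rt,
                               unsigned int update_mtu)
{
    if (update_mtu)
        rt->metric_mtu = update_mtu;
}
```

In this model, an update that carries no MTU leaves the 1492 learned from
the RA in place instead of falling back to the device value.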


Re: [PATCH net-next] net/sched: fix type of htb statistics

2018-09-02 Thread David Miller
From: Florent Fourcot 
Date: Thu, 30 Aug 2018 16:39:23 +0200

> tokens and ctokens are defined as s64 in htb_class structure,
> and clamped to 32bits value during netlink dumps:
> 
> cl->xstats.tokens = clamp_t(s64, PSCHED_NS2TICKS(cl->tokens),
> INT_MIN, INT_MAX);
> 
> Defining it as u32 is working since userspace (tc) is printing it as
> signed int, but a correct definition from the beginning is probably
> better.
> 
> At the same time, the 'giants' structure member has been unused for
> years, so update the comment to mark it unused.
> 
> Signed-off-by: Florent Fourcot 

Looks good, applied.
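
A small userspace sketch of why the u32 definition happened to work. The
helper names are hypothetical, and clamp_to_int_range() only imitates what
clamp_t(s64, ..., INT_MIN, INT_MAX) does in the quoted line. Converting the
u32 back to a signed int is implementation-defined in older C standards,
although common compilers wrap as shown.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Userspace imitation of clamp_t(s64, v, INT_MIN, INT_MAX). */
static int64_t clamp_to_int_range(int64_t v)
{
    if (v < INT_MIN)
        return INT_MIN;
    if (v > INT_MAX)
        return INT_MAX;
    return v;
}

/* Old definition: the clamped value is stored in an unsigned field. */
static uint32_t tokens_as_u32(int64_t tokens)
{
    return (uint32_t)clamp_to_int_range(tokens);
}

/* New definition: the field is signed, so no reinterpretation is needed. */
static int32_t tokens_as_s32(int64_t tokens)
{
    int64_t c = clamp_to_int_range(tokens);

    assert(c >= INT_MIN && c <= INT_MAX); /* always fits after clamping */
    return (int32_t)c;
}
```

A negative token count stored as u32 looks like a huge positive number
until the reader casts it back to a signed int, which is what tc does when
printing.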


Re: [PATCH 1/2] dt-bindings: net: cpsw: Document cpsw-phy-sel usage but prefer phandle

2018-09-02 Thread David Miller
From: Tony Lindgren 
Date: Wed, 29 Aug 2018 08:00:23 -0700

> The current cpsw usage for cpsw-phy-sel is undocumented but is used for
> all the boards using cpsw. And cpsw-phy-sel is not really a child of
> the cpsw device, it lives in the system control module instead.
> 
> Let's document the existing usage, and improve it a bit where we prefer
> to use a phandle instead of a child device for it. That way we can
> properly describe the hardware in dts files for things like genpd.
> 
> Signed-off-by: Tony Lindgren 

Applied.


Re: [PATCH 2/2] net: ethernet: cpsw-phy-sel: prefer phandle for phy sel

2018-09-02 Thread David Miller
From: Tony Lindgren 
Date: Wed, 29 Aug 2018 08:00:24 -0700

> The cpsw-phy-sel device is not a child of the cpsw interconnect target
> module. It lives in the system control module.
> 
> Let's fix this issue by trying to use cpsw-phy-sel phandle first if it
> exists and if not fall back to current usage of trying to find the
> cpsw-phy-sel child. That way the phy sel driver can be a child of the
> system control module where it belongs in the device tree.
> 
> Without this fix, we cannot have a proper interconnect target module
> hierarchy in device tree for things like genpd.
> 
> Note that deferred probe is mostly not supported by cpsw and this patch
> does not attempt to fix that. In case deferred probe support is needed,
> this could be added to cpsw_slave_open() and phy_connect() so they start
> handling and returning errors.
> 
> For documenting it, looks like the cpsw-phy-sel is used for all cpsw device
> tree nodes. It's missing the related binding documentation, so let's also
> update the binding documentation accordingly.
> 
> Signed-off-by: Tony Lindgren 

Applied.


Re: [PATCH net 0/2] igmp: fix two incorrect unsolicit report count issues

2018-09-02 Thread David Miller
From: Hangbin Liu 
Date: Wed, 29 Aug 2018 18:06:07 +0800

> Just like the subject, fix two minor igmp unsolicit report count issues.

Series applied, thanks.


Re: [PATCH RFC net-next 00/18] net: Improve route scalability via support for nexthop objects

2018-09-02 Thread David Miller
From: dsah...@kernel.org
Date: Fri, 31 Aug 2018 17:49:35 -0700

> Examples
> 1. Single path
> $ ip nexthop add id 1 via 10.99.1.2 dev veth1
> $ ip route add 10.1.1.0/24 nhid 1
> 
> $ ip next ls
> id 1 via 10.99.1.2 src 10.99.1.1 dev veth1 scope link
> 
> $ ip ro ls
> 10.1.1.0/24 nhid 1 scope link
> ...

First of all, this whole idea is awesome!  But, you knew that already. :)

However, I worry what happens in a mixed environment where we have routing
daemons and tools inserting nexthop based routes, and some doing things
the old way using and expecting inline nexthop information in the routes.

That mixed environment situation has to function correctly.  Older
apps have to see the per-route nexthop info in the format and layout
they expect (gw/dev pairs).  They cannot be expected to just suddenly
understand the nexthop ID etc.

Otherwise the concept and ideas are fine, so as long as you can resolve
the mixed environment situation I fully support this work and look forward
to it being in a state where I can integrate it :-)
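
One way to picture the mixed-environment requirement is a dump path that
expands the nexthop ID back into an inline gw/dev pair for older tools.
This is a sketch only: the structs, the lookup and the output format are
hypothetical and are not taken from the RFC series or its netlink layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical nexthop object, shared by any number of routes. */
struct nexthop_sketch {
    unsigned int id;
    const char *gw;
    const char *dev;
};

/* Hypothetical route that only stores a reference to a nexthop. */
struct route_sketch {
    const char *prefix;
    unsigned int nhid;
};

static const struct nexthop_sketch *
nh_lookup(const struct nexthop_sketch *tbl, size_t n, unsigned int id)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].id == id)
            return &tbl[i];
    return NULL;
}

/* Render the route the way a legacy tool expects it: inline gw/dev.
 * Returns the snprintf() length, or -1 if the nexthop ID is unknown. */
static int format_legacy_route(char *buf, size_t len,
                               const struct route_sketch *rt,
                               const struct nexthop_sketch *tbl, size_t n)
{
    const struct nexthop_sketch *nh = nh_lookup(tbl, n, rt->nhid);

    assert(buf != NULL && len > 0);
    if (!nh)
        return -1;
    return snprintf(buf, len, "%s via %s dev %s",
                    rt->prefix, nh->gw, nh->dev);
}
```

With the values from the quoted example, the route that was added with
"nhid 1" would be shown to an older tool as
"10.1.1.0/24 via 10.99.1.2 dev veth1".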


Re: [PATCH] cxgb4: fix abort_req_rss6 struct

2018-09-01 Thread David Miller
From: Steve Wise 
Date: Fri, 31 Aug 2018 11:52:00 -0700

> Remove the incorrect WR_HDR field which can cause a misinterpretation
> of this CPL by ULDs.
> 
> Fixes: a3cdaa69e4ae ("cxgb4: Adds CPL support for Shared Receive Queues")
> Signed-off-by: Steve Wise 
> ---
> 
> Dave, Doug, and Jason,
> 
> I request this merge through the rdma repo since the only user of this
> structure is iw_cxgb4.

No objections from me.


Re: [PATCH net-next] net: dsa: b53: Provide sensible defaults

2018-09-01 Thread David Miller
From: Florian Fainelli 
Date: Fri, 31 Aug 2018 12:29:49 -0700

> The SRAB driver is the default way to communicate with the integrated
> switch on iProc platforms and the MMAP driver is the way to communicate
> with the integrated switch on DSL BCM63xx and CM BCM33xx.
> 
> Signed-off-by: Florian Fainelli 

Applied.


Re: [PATCH net-next] cxgb4: collect hardware queue descriptors

2018-09-01 Thread David Miller
From: Rahul Lakkireddy 
Date: Fri, 31 Aug 2018 18:16:34 +0530

> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c 
> b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
> index d97e0d7e541a..02fc350f81c9 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
 ...
> +static inline u32 cudbg_uld_txq_to_qtype(u32 uld)

Do not use inline in foo.c files, let the compiler decide.


Re: [PATCH v2 net-next] liquidio: remove set but not used variable 'irh'

2018-09-01 Thread David Miller
From: YueHaibing 
Date: Fri, 31 Aug 2018 12:03:56 +

> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> drivers/net/ethernet/cavium/liquidio/request_manager.c: In function 
> 'lio_process_iq_request_list':
> drivers/net/ethernet/cavium/liquidio/request_manager.c:383:27: warning:
>  variable 'irh' set but not used [-Wunused-but-set-variable]
> 
> Signed-off-by: YueHaibing 
> ---
> v2: fix patch description,remove 'cHECK-'

Applied, thanks.


Re: [PATCH net-next 1/1] qed: Lower the severity of a dcbx log message.

2018-09-01 Thread David Miller
From: Sudarsana Reddy Kalluru 
Date: Fri, 31 Aug 2018 04:10:17 -0700

> Driver displays an error message for each unrecognized dcbx TLV that's
> received from the peer or configured on the device. It is observed that
> syslog will be flooded with such messages in certain scenarios e.g.,
> frequent link-flaps/lldp-transactions. Changing the severity of this
> message to verbose level as it's not an error scenario/message.
> 
> Signed-off-by: Sudarsana Reddy Kalluru 

Applied.


Re: [PATCH net-next v2] net/tls: Add support for async decryption of tls records

2018-09-01 Thread David Miller
From: Vakul Garg 
Date: Sun, 2 Sep 2018 02:28:00 +

> I do not find this patch in tree yet. 
> Can you please check? Thanks and Regards.

The perils of working on two different machines :-)

It should be there now, sorry about that.


Re: [PATCH net-next] net: remove duplicated include from net_failover.c

2018-09-01 Thread David Miller
From: YueHaibing 
Date: Fri, 31 Aug 2018 03:44:27 +

> Remove duplicated include.
> 
> Signed-off-by: YueHaibing 

Applied, thanks.


Re: [PATCH net] ipv6: don't get lwtstate twice in ip6_rt_copy_init()

2018-09-01 Thread David Miller
From: Alexey Kodanev 
Date: Thu, 30 Aug 2018 19:11:24 +0300

> Commit 80f1a0f4e0cd ("net/ipv6: Put lwtstate when destroying fib6_info")
> partially fixed the kmemleak [1], lwtstate can be copied from fib6_info,
> with ip6_rt_copy_init(), and it should be done only once there.
> 
> rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function
> ip6_rt_copy_init(), so there is no need to get it again at the end.
> 
> With this patch, lwtstate also isn't copied from RTF_REJECT routes.
 ...
> Fixes: 6edb3c96a5f0 ("net/ipv6: Defer initialization of dst to data path")
> Signed-off-by: Alexey Kodanev 

Applied and queued up for -stable, thanks.


Re: [PATCH net-next] net: stmmac: Add CBS support in XGMAC2

2018-09-01 Thread David Miller
From: Jose Abreu 
Date: Thu, 30 Aug 2018 15:09:48 +0100

> XGMAC2 uses the same CBS mechanism as GMAC5, only the register offsets
> change. Let's use the same TC callbacks and implement the .config_cbs
> callback in XGMAC2 core.
> 
> Signed-off-by: Jose Abreu 

Applied.


Re: [PATCH net-next 1/2] PCI: hv: support reporting serial number as slot information

2018-09-01 Thread David Miller
From: Stephen Hemminger 
Date: Wed, 29 Aug 2018 09:24:51 -0700

> + spin_lock_irqsave(&hbus->device_list_lock, flags);
> + list_for_each_entry(hpdev, &hbus->children, list_entry) {
> + if (hpdev->pci_slot)
> + continue;
> +
> + slot_nr = PCI_SLOT(wslot_to_devfn(hpdev->desc.win_slot.slot));
> + snprintf(name, SLOT_NAME_SIZE, "%u", hpdev->desc.ser);
> + hpdev->pci_slot = pci_create_slot(hbus->pci_bus, slot_nr,
> +   name, NULL);

pci_create_slot() takes a mutex, therefore you can't hold a spinlock or
disable interrupts here.
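
The rule can be modelled in userspace with a flag standing in for "spinlock
held, interrupts off". Everything below is a hypothetical toy, not the
pci-hyperv code and not necessarily how the patch was later reworked. It
only shows one common shape of a fix: collect the work under the lock, then
make the sleeping calls after dropping it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: this flag stands in for "spinlock held, IRQs disabled". */
static bool atomic_context;

static void fake_spin_lock(void)   { atomic_context = true; }
static void fake_spin_unlock(void) { atomic_context = false; }

/* Stand-in for a call that may sleep, e.g. one that takes a mutex. */
static int may_sleep_create_slot(unsigned int slot_nr)
{
    assert(!atomic_context); /* sleeping in atomic context is the bug */
    return (int)slot_nr;
}

#define MAX_DEVS 8

/* Phase 1 gathers the slot numbers under the lock; phase 2 makes the
 * sleeping calls only after the lock has been released. */
static size_t create_slots(const unsigned int *slots, size_t n, int *out)
{
    unsigned int pending[MAX_DEVS];
    size_t count = 0;

    fake_spin_lock();
    for (size_t i = 0; i < n && count < MAX_DEVS; i++)
        pending[count++] = slots[i];
    fake_spin_unlock();

    for (size_t i = 0; i < count; i++)
        out[i] = may_sleep_create_slot(pending[i]);
    return count;
}
```

In real code the list can change between the two phases, so the collected
entries would need references or some other protection. The toy leaves
that out.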


Re: [PATCH] neighbour: confirm neigh entries when ARP packet is received

2018-09-01 Thread David Miller
From: Vasily Khoruzhick 
Date: Tue, 28 Aug 2018 19:48:25 -0700

> Update 'confirmed' timestamp when ARP packet is received. It shouldn't
> affect locktime logic and anyway entry can be confirmed by any higher-layer
> protocol. Thus it makes no sense not to confirm it when ARP packet is
> received.
> 
> Fixes: 77d7123342 ("neighbour: update neigh timestamps iff update is
> effective")
> 
> Signed-off-by: Vasily Khoruzhick 

I'm not so sure.

The comment above the code you are moving explains that the current
behavior is intentional, and it explains why too.

Even if your change is correct, you're now making that comment
inaccurate, so you'd have to update it to match the new code.

But I still think the current code is intentionally behaving that
way, and for good reason.


Re: [PATCH net] ibmvnic: Include missing return code checks in reset function

2018-09-01 Thread David Miller
From: Thomas Falcon 
Date: Thu, 30 Aug 2018 13:19:53 -0500

> Check the return codes of these functions and halt reset
> in case of failure. The driver will remain in a dormant state
> until the next reset event, when device initialization will be
> re-attempted.
> 
> Signed-off-by: Thomas Falcon 

Applied.


Re: [PATCH net 2/2] selftests: pmtu: detect correct binary to ping ipv6 addresses

2018-09-01 Thread David Miller
From: Sabrina Dubroca 
Date: Thu, 30 Aug 2018 16:01:18 +0200

> Some systems don't have the ping6 binary anymore, and use ping for
> everything. Detect the absence of ping6 and try to use ping instead.
> 
> Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test")
> Signed-off-by: Sabrina Dubroca 
> Acked-by: Stefano Brivio 

Applied.


Re: [PATCH net 1/2] selftests: pmtu: maximum MTU for vti4 is 2^16-1-20

2018-09-01 Thread David Miller
From: Sabrina Dubroca 
Date: Thu, 30 Aug 2018 16:01:17 +0200

> Since commit 82612de1c98e ("ip_tunnel: restore binding to ifaces with a
> large mtu"), the maximum MTU for vti4 is based on IP_MAX_MTU instead of
> the mysterious constant 0xFFF8.  This makes this selftest fail.
> 
> Fixes: 82612de1c98e ("ip_tunnel: restore binding to ifaces with a large mtu")
> Signed-off-by: Sabrina Dubroca 
> Acked-by: Stefano Brivio 

Applied.
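
The arithmetic behind the subject line, as a sketch. The macro names are
hypothetical, and it is an assumption here that the 20-byte IPv4 header is
subtracted from whichever base constant is in use.

```c
#include <assert.h>

#define IP_MAX_MTU_SKETCH 0xFFFFu /* 2^16 - 1, assumed to mirror IP_MAX_MTU */
#define OLD_BASE_SKETCH   0xFFF8u /* the constant named in the description */
#define IPV4_HDR_LEN      20u     /* IPv4 header without options */

/* Largest tunnel MTU once the outer header is accounted for. */
static unsigned int max_tunnel_mtu(unsigned int base, unsigned int hdr_len)
{
    assert(hdr_len < base);
    return base - hdr_len;
}
```

Under that assumption the limit moves from 65528 - 20 = 65508 to
65535 - 20 = 65515, so a test that expects 65509 to be rejected no longer
passes.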


Re: [PATCH net] tcp: do not restart timewait timer on rst reception

2018-09-01 Thread David Miller
From: Florian Westphal 
Date: Thu, 30 Aug 2018 14:24:29 +0200

> RFC 1337 says:
>  ''Ignore RST segments in TIME-WAIT state.
>If the 2 minute MSL is enforced, this fix avoids all three hazards.''
> 
> So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
> expire rather than removing it instantly when a reset is received.
> 
> However, Linux will also re-start the TIME-WAIT timer.
> 
> This causes connect to fail when trying to re-use ports or very long
> delays (until syn retry interval exceeds MSL).
...
> Without this patch, 'ss' shows restarts of tw timer and last packet is
> thus just another pure ack, more than one minute later.
> 
> This restores the original code from commit 283fd6cf0be690a83
> ("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .
> 
> For some reason the else branch was removed/lost in 1f28b683339f7
> ("Merge in TCP/UDP optimizations and [..]") and timer restart became
> unconditional.
> 
> Reported-by: Michal Tesar 
> Signed-off-by: Florian Westphal 

Applied and thanks for the packet drill test case :-)
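
The decision described in the patch text can be written as a small table
in C. This is a simplified model with hypothetical names. It covers only
the branch discussed here (a segment that is otherwise acceptable in
TIME-WAIT) and is not the kernel's tcp_timewait_state_process().

```c
#include <assert.h>
#include <stdbool.h>

enum tw_action {
    TW_KILL,          /* remove the TIME-WAIT socket immediately */
    TW_KEEP_TIMER,    /* ignore the segment, let the timer run out */
    TW_RESTART_TIMER  /* re-arm the TIME-WAIT timer */
};

/* tcp_rfc1337 models the net.ipv4.tcp_rfc1337 sysctl (0 or 1). */
static enum tw_action tw_segment_action(bool is_rst, int tcp_rfc1337)
{
    assert(tcp_rfc1337 == 0 || tcp_rfc1337 == 1);
    if (is_rst)
        return tcp_rfc1337 ? TW_KEEP_TIMER : TW_KILL;
    /* The restored else branch: only non-RST segments restart it. */
    return TW_RESTART_TIMER;
}
```

Before the fix, the RST case with tcp_rfc1337=1 would have ended in the
equivalent of TW_RESTART_TIMER, which is what kept the socket alive.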


Re: [PATCH net-next] net: dsa: mv88e6xxx: Share main switch IRQ

2018-09-01 Thread David Miller
From: Marek Behún 
Date: Thu, 30 Aug 2018 02:13:50 +0200

> On some boards the interrupt can be shared between multiple devices.
> For example on Turris Mox the interrupt is shared between all switches.
> 
> Signed-off-by: Marek Behun 

Applied.


Re: [PATCH net-next] net/ipv6: Do not reset nl_net in ip6_route_info_create

2018-09-01 Thread David Miller
From: dsah...@kernel.org
Date: Wed, 29 Aug 2018 16:54:01 -0700

> From: David Ahern 
> 
> nl_net is set on entry to ip6_route_info_create. Only devices
> within that namespace are considered so no need to reset it
> before returning.
> 
> Signed-off-by: David Ahern 

Applied.


Re: [PATCH net-next] net/ipv4: Add extack message that dev is required for ONLINK

2018-09-01 Thread David Miller
From: dsah...@kernel.org
Date: Wed, 29 Aug 2018 16:53:27 -0700

> From: David Ahern 
> 
> Make IPv4 consistent with IPv6 and return an extack message that the
> ONLINK flag requires a nexthop device.
> 
> Signed-off-by: David Ahern 

Applied.


Re: [PATCH net] nfp: wait for posted reconfigs when disabling the device

2018-09-01 Thread David Miller
From: Jakub Kicinski 
Date: Wed, 29 Aug 2018 12:46:08 -0700

> To avoid leaking a running timer we need to wait for the
> posted reconfigs after netdev is unregistered.  In common
> case the process of deinitializing the device will perform
> synchronous reconfigs which wait for posted requests, but
> especially with VXLAN ports being actively added and removed
> there can be a race condition leaving a timer running after
> adapter structure is freed leading to a crash.
> 
> Add an explicit flush after deregistering and for a good
> measure a warning to check if timer is running just before
> structures are freed.
> 
> Fixes: 3d780b926a12 ("nfp: add async reconfiguration mechanism")
> Signed-off-by: Jakub Kicinski 
> Reviewed-by: Dirk van der Merwe 

Applied and queued up for -stable.


Re: [PATCH net-next] tcp: change IPv6 flow-label upon receiving spurious retransmission

2018-09-01 Thread David Miller
From: Yuchung Cheng 
Date: Wed, 29 Aug 2018 14:53:56 -0700

> Currently a Linux IPv6 TCP sender will change the flow label upon
> timeouts to potentially steer away from a data path that has gone
> bad. However this does not help if the problem is on the ACK path
> and the data path is healthy. In this case the receiver is likely
> to receive repeated spurious retransmission because the sender
> couldn't get the ACKs in time and has recurring timeouts.
> 
> This patch adds another feature to mitigate this problem. It
> leverages the DSACK states in the receiver to change the flow
> label of the ACKs to speculatively re-route the ACK packets.
> In order to allow triggering on the second consecutive spurious
> RTO, the receiver changes the flow label upon sending a second
> consecutive DSACK for a sequence number below RCV.NXT.
> 
> Signed-off-by: Yuchung Cheng 
> Signed-off-by: Neal Cardwell 
> Signed-off-by: Eric Dumazet 

Applied.


Re: [PATCH net] Revert "packet: switch kvzalloc to allocate memory"

2018-09-01 Thread David Miller
From: Eric Dumazet 
Date: Wed, 29 Aug 2018 11:50:12 -0700

> This reverts commit 71e41286203c017d24f041a7cd71abea7ca7b1e0.
> 
> mmap()/munmap() can not be backed by kmalloced pages :
> 
> We fault in :
> 
> VM_BUG_ON_PAGE(PageSlab(page), page);
> 
> unmap_single_vma+0x8a/0x110
> unmap_vmas+0x4b/0x90
> unmap_region+0xc9/0x140
> do_munmap+0x274/0x360
> vm_munmap+0x81/0xc0
> SyS_munmap+0x2b/0x40
> do_syscall_64+0x13e/0x1c0
> entry_SYSCALL_64_after_hwframe+0x42/0xb7
> 
> Fixes: 71e41286203c ("packet: switch kvzalloc to allocate memory")
> Signed-off-by: Eric Dumazet 
> Reported-by: John Sperbeck 
> Bisected-by: John Sperbeck 

Oops, applied, thanks Eric.


Re: [Patch net-nnext] net_sched: add missing tcf_lock for act_connmark

2018-08-31 Thread David Miller
From: Cong Wang 
Date: Wed, 29 Aug 2018 10:15:36 -0700

> According to the new locking rule, we have to take tcf_lock
> for both ->init() and ->dump(), as RTNL will be removed.
> However, it is missing for act_connmark.
> 
> Cc: Vlad Buslov 
> Signed-off-by: Cong Wang 

Applied.


Re: [Patch net-nnext] Revert "net: sched: act: add extack for lookup callback"

2018-08-31 Thread David Miller
From: Cong Wang 
Date: Wed, 29 Aug 2018 10:15:35 -0700

> This reverts commit 331a9295de23 ("net: sched: act: add extack for lookup 
> callback").
> 
> This extack is never used after 6 months... In fact, it can be just
> set in the caller, right after ->lookup().
> 
> Cc: Alexander Aring 
> Signed-off-by: Cong Wang 

Applied.


Re: pull-request: bpf-next 2018-09-01

2018-08-31 Thread David Miller
From: Daniel Borkmann 
Date: Sat,  1 Sep 2018 02:05:06 +0200

> The following pull-request contains BPF updates for your *net-next* tree.
> 
> The main changes are:
> 
> 1) Add AF_XDP zero-copy support for i40e driver (!), from Björn and Magnus.

W00t!

> 2) BPF verifier improvements by giving each register its own liveness
>    chain which allows simplifying and getting rid of skip_callee() logic,
>    from Edward.
> 
> 3) Add bpf fs pretty print support for percpu arraymap, percpu hashmap
>    and percpu lru hashmap. Also add generic percpu formatted print on
>    bpftool so the same can be dumped there, from Yonghong.
> 
> 4) Add bpf_{set,get}sockopt() helper support for TCP_SAVE_SYN and
>    TCP_SAVED_SYN options to allow reflection of tos/tclass from received
>    SYN packet, from Nikita.
> 
> 5) Misc improvements to the BPF sockmap test cases in terms of cgroup v2
>    interaction and removal of incorrect shutdown() calls, from John.
> 
> 6) Few cleanups in xdp_umem_assign_dev() and xdpsock samples, from Prashant.

Pulled, thanks Daniel!


Re: [PATCH net-next v1] selftests/tls: Add test for recv(PEEK) spanning across multiple records

2018-08-31 Thread David Miller
From: Vakul Garg 
Date: Wed, 29 Aug 2018 15:30:14 +0530

> Added test case to receive multiple records with a single recvmsg()
> operation with a MSG_PEEK set.

Applied.


Re: [PATCH net-next v2] net/tls: Add support for async decryption of tls records

2018-08-31 Thread David Miller
From: Vakul Garg 
Date: Wed, 29 Aug 2018 15:26:55 +0530

> When tls records are decrypted using asynchronous accelerators such as
> NXP CAAM engine, the crypto apis return -EINPROGRESS. Presently, on
> getting -EINPROGRESS, the tls record processing stops till the time the
> crypto accelerator finishes off and returns the result. This incurs a
> context switch and is not an efficient way of accessing the crypto
> accelerators. Crypto accelerators work efficiently when they are queued
> with multiple crypto jobs without having to wait for the previous ones
> to complete.
> 
> The patch submits multiple crypto requests without having to wait for
> previous ones to complete. This has been implemented for records
> which are decrypted in zero-copy mode. At the end of recvmsg(), we wait
> for all the asynchronous decryption requests to complete.
> 
> The references to records which have been sent for async decryption are
> dropped. For cases where record decryption is not possible in zero-copy
> mode, asynchronous decryption is not used and we wait for decryption
> crypto api to complete.
> 
> For crypto requests executing in async fashion, the memory for
> aead_request, sglists and skb etc is freed from the decryption
> completion handler. The decryption completion handler wakes up the
> sleeping user context when recvmsg() flags that it has done sending
> all the decryption requests and there are no more decryption requests
> pending to be completed.
> 
> Signed-off-by: Vakul Garg 
> Reviewed-by: Dave Watson 
> ---
> 
> Changes since v1:
>   - Simplified recvmsg() so to drop reference to skb in case it
>     was submitted for async decryption.
>   - Modified tls_sw_advance_skb() to handle case when input skb is
>     NULL.

Applied.
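
The bookkeeping described above (submit several requests, then wait once
at the end of recvmsg()) can be sketched with a pending counter. This is a
single-threaded toy with hypothetical names. The real code would need
atomic operations and a proper wait primitive, and none of the names below
come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-socket state for in-flight async decryptions. */
struct async_ctx {
    int pending;      /* requests submitted but not yet completed */
    bool submit_done; /* recvmsg() has finished submitting */
    bool woken;       /* the sleeping caller may continue */
};

/* recvmsg() side: queue one more request without waiting for it. */
static void submit_request(struct async_ctx *c)
{
    assert(!c->submit_done);
    c->pending++;
}

/* Completion handler: wake the caller only after it has flagged that
 * it is done submitting and nothing is left in flight. */
static void request_complete(struct async_ctx *c)
{
    assert(c->pending > 0);
    c->pending--;
    if (c->submit_done && c->pending == 0)
        c->woken = true;
}

/* End of recvmsg(): returns true if the caller still has to wait. */
static bool finish_submitting(struct async_ctx *c)
{
    c->submit_done = true;
    if (c->pending == 0) {
        c->woken = true;
        return false;
    }
    return true;
}
```

The point of the design is that the caller sleeps at most once per
recvmsg() call, however many records were handed to the accelerator.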


Re: [net-next v2 00/15][pull request] 40GbE Intel Wired LAN Driver Updates 2018-08-30

2018-08-31 Thread David Miller
From: Jeff Kirsher 
Date: Thu, 30 Aug 2018 14:11:32 -0700

> This series contains updates to i40e, i40evf and virtchnl.

Pulled, thanks Jeff.


Re: [PATCH net v2 0/2] net_sched: reject unknown tcfa_action values

2018-08-29 Thread David Miller
From: Paolo Abeni 
Date: Wed, 29 Aug 2018 10:22:32 +0200

> As agreed some time ago, this changeset rejects unknown tcfa_action values,
> instead of changing such values under the hood.
> 
> A tdc test is included to verify the new behavior.
> 
> v1 -> v2:
>  - helper is now static and renamed according to act_* convention
>  - updated extack message, according to the new behavior

Series applied, thank you.


Re: [PATCH v2] net: mvpp2: initialize port of_node pointer

2018-08-29 Thread David Miller
From: Baruch Siach 
Date: Wed, 29 Aug 2018 09:44:39 +0300

> Without a valid of_node in struct device we can't find the mvpp2 port
> device by its DT node. Specifically, this breaks
> of_find_net_device_by_node().
> 
> For example, the Armada 8040 based Clearfog GT-8K uses Marvell 88E6141
> switch connected to the _eth2 port:
> 
> _mdio {
>   ...
> 
>   switch0: switch0@4 {
>   compatible = "marvell,mv88e6085";
>   ...
> 
>   ports {
>   ...
> 
>   port@5 {
>   reg = <5>;
>   label = "cpu";
>   ethernet = <_eth2>;
>   };
>   };
>   };
> };
> 
> Without this patch, dsa_register_switch() returns -EPROBE_DEFER because
> of_find_net_device_by_node() can't find the device_node of the _eth2
> device.
> 
> Reviewed-by: Andrew Lunn 
> Signed-off-by: Baruch Siach 

Applied.


Re: [PATCH][net-next] vxlan: reduce dirty cache line in vxlan_find_mac

2018-08-29 Thread David Miller
From: Li RongQing 
Date: Wed, 29 Aug 2018 11:52:10 +0800

> vxlan_find_mac() unconditionally sets f->used for every packet,
> this causes a cache miss for every packet, since remote, hlist
> and used of vxlan_fdb share the same cache line, which is
> accessed when sending every packet.
> 
> so f->used is set only if not equal to jiffies, to reduce dirty
> cache line times, this gives 3% speed-up with small packets.
> 
> Signed-off-by: Zhang Yu 
> Signed-off-by: Li RongQing 

Applied.


Re: [PATCH net-next 0/4] liquidio: improve soft command/response handling

2018-08-29 Thread David Miller
From: Felix Manlunas 
Date: Tue, 28 Aug 2018 18:50:58 -0700

> From: Weilin Chang 
> 
> Change soft command handling to fix the possible race condition when the
> process handles a response of a soft command that was already freed by an
> application which got timeout for this request.

Series applied, thank you.


Re: [net-next 02/15] i40e: move ethtool stats boiler plate code to i40e_ethtool_stats.h

2018-08-29 Thread David Miller
From: Jeff Kirsher 
Date: Wed, 29 Aug 2018 15:48:21 -0700

> diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h 
> b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
> new file mode 100644
> index ..0290ade7494b
> --- /dev/null
> +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
> @@ -0,0 +1,221 @@
...
> +/**
> + * __i40e_add_stat_strings - copy stat strings into ethtool buffer
> + * @p: ethtool supplied buffer
> + * @stats: stat definitions array
> + * @size: size of the stats array
> + *
> + * Format and copy the strings described by stats into the buffer pointed at
> + * by p.
> + **/
> +static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
> + const unsigned int size, ...)

Needs to be marked inline.


Re: [PATCH net-next,v5] net/tls: Calculate nsg for zerocopy path without skb_cow_data.

2018-08-29 Thread David Miller
From: Doron Roberts-Kedes 
Date: Tue, 28 Aug 2018 16:33:57 -0700

> decrypt_skb fails if the number of sg elements required to map it
> is greater than MAX_SKB_FRAGS. nsg must always be calculated, but
> skb_cow_data adds unnecessary memcpy's for the zerocopy case.
> 
> The new function skb_nsg calculates the number of scatterlist elements
> required to map the skb without the extra overhead of skb_cow_data.
> This patch reduces memcpy by 50% on my encrypted NBD benchmarks.
> 
> Reported-by: Vakul Garg 
> Reviewed-by: Vakul Garg 
> Tested-by: Vakul Garg 
> Signed-off-by: Doron Roberts-Kedes 

Applied, thank you.


Re: [PATCH net-next 1/2] ip: fail fast on IP defrag errors

2018-08-29 Thread David Miller
From: Peter Oskolkov 
Date: Tue, 28 Aug 2018 11:36:19 -0700

> The current behavior of IP defragmentation is inconsistent:
> - some overlapping/wrong length fragments are dropped without
>   affecting the queue;
> - most overlapping fragments cause the whole frag queue to be dropped.
> 
> This patch brings consistency: if a bad fragment is detected,
> the whole frag queue is dropped. Two major benefits:
> - fail fast: corrupted frag queues are cleared immediately, instead of
>   by timeout;
> - testing of overlapping fragments is now much easier: any kind of
>   random fragment length mutation now leads to the frag queue being
>   discarded (IP packet dropped); before this patch, some overlaps were
>   "corrected", with tests not seeing expected packet drops.
> 
> Note that in one case (see "if (end&7)" conditional) the current
> behavior is preserved as there are concerns that this could be
> legitimate padding.
> 
> Signed-off-by: Peter Oskolkov 
> Reviewed-by: Eric Dumazet 
> Reviewed-by: Willem de Bruijn 

Applied.
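
A toy model of the fail-fast policy: one bad fragment discards the whole
queue. The types and limits are hypothetical. The sketch also treats an
exact duplicate as an overlap and kills the queue when its small array is
full, both of which are simplifications and may differ from what the
kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FRAGS 16

struct frag {
    unsigned int off; /* first byte covered */
    unsigned int end; /* one past the last byte covered */
};

struct frag_queue {
    struct frag frags[MAX_FRAGS];
    size_t n;
    bool dead; /* set once the queue has been discarded */
};

static bool overlaps(const struct frag *a, const struct frag *b)
{
    return a->off < b->end && b->off < a->end;
}

/* Returns true if the fragment was queued. Any bad fragment drops the
 * entire queue instead of just the offending fragment. */
static bool frag_queue_add(struct frag_queue *q, struct frag f)
{
    assert(q != NULL);
    if (q->dead)
        return false;
    if (f.off >= f.end || q->n == MAX_FRAGS)
        goto kill;
    for (size_t i = 0; i < q->n; i++)
        if (overlaps(&q->frags[i], &f))
            goto kill;
    q->frags[q->n++] = f;
    return true;
kill:
    q->n = 0;
    q->dead = true;
    return false;
}
```

This is the property that makes the behaviour easy to test: any overlap
leads to the same observable result, a dropped packet.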


Re: [PATCH net-next 2/2] selftests/net: add ip_defrag selftest

2018-08-29 Thread David Miller
From: Peter Oskolkov 
Date: Tue, 28 Aug 2018 11:36:20 -0700

> This test creates a raw IPv4 socket, fragments a largish UDP
> datagram and sends the fragments out of order.
> 
> Then repeats in a loop with different message and fragment lengths.
> 
> Then does the same with overlapping fragments (with overlapping
> fragments the expectation is that the recv times out).
> 
> Tested:
> 
> root@# time ./ip_defrag.sh
> ipv4 defrag
> PASS
> ipv4 defrag with overlaps
> PASS
> 
> real    1m7.679s
> user    0m0.628s
> sys     0m2.242s
> 
> A similar test for IPv6 is to follow.
> 
> Signed-off-by: Peter Oskolkov 
> Reviewed-by: Willem de Bruijn 

Applied.


Re: [PATCH net-next] liquidio: fix race condition in instruction completion processing

2018-08-29 Thread David Miller
From: Felix Manlunas 
Date: Tue, 28 Aug 2018 11:32:55 -0700

> From: Rick Farrington 
> 
> In lio_enable_irq, the pkt_in_done count register was being cleared to
> zero.  However, there could be some completed instructions which were not
> yet processed due to budget and limit constraints.
> So, only write this register with the number of actual completions
> that were processed.
> 
> Signed-off-by: Rick Farrington 
> Signed-off-by: Felix Manlunas 

Applied.


Re: [PATCH net-next] liquidio: remove unnecessary delay when processing IQ responses

2018-08-29 Thread David Miller
From: Felix Manlunas 
Date: Tue, 28 Aug 2018 11:19:54 -0700

> From: Rick Farrington 
> 
> Signed-off-by: Rick Farrington 
> Signed-off-by: Felix Manlunas 

Applied.


Re: [PATCH net-next] net: thunderbolt: Convert to use SPDX identifier

2018-08-29 Thread David Miller
From: Mika Westerberg 
Date: Tue, 28 Aug 2018 19:58:43 +0300

> This gets rid of the licence boilerplate in favor of SPDX identifier
> which only takes a single line comment.
> 
> Signed-off-by: Mika Westerberg 

Applied.


Re: [PATCH net 0/3] ipv6: fix error path of inet6_init()

2018-08-29 Thread David Miller
From: Sabrina Dubroca 
Date: Tue, 28 Aug 2018 13:40:50 +0200

> The error path of inet6_init() can trigger multiple kernel panics,
> mostly due to wrong ordering of cleanups. This series fixes those
> issues.

Series applied, thank you.


Re: [PATCH net] net/sched: act_pedit: fix dump of extended layered op

2018-08-29 Thread David Miller
From: Davide Caratti 
Date: Mon, 27 Aug 2018 22:56:22 +0200

> in the (rare) case of failure in nla_nest_start(), missing NULL checks in
> tcf_pedit_key_ex_dump() can make the following command
> 
>  # tc action add action pedit ex munge ip ttl set 64
> 
> dereference a NULL pointer:
 ...
> Like it's done for other TC actions, give up dumping pedit rules and return
> an error if nla_nest_start() returns NULL.
> 
> Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the 
> conventional network headers")
> Signed-off-by: Davide Caratti 

Applied and queued up for -stable, thanks.


Re: [PATCH] r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices

2018-08-29 Thread David Miller
From: Azat Khuzhin 
Date: Sun, 26 Aug 2018 17:03:09 +0300

> I have two Ethernet adapters:
>   r8169 :03:01.0 eth0: RTL8169sb/8110sb, 00:14:d1:14:2d:49, XID 1000, 
> IRQ 18
>   r8169 :01:00.0 eth0: RTL8168e/8111e, 64:66:b3:11:14:5d, XID 2c20, 
> IRQ 30
> And after upgrading from linux 4.15 [1] to linux 4.18+ [2] RTL8169sb failed to
> receive any packets. tcpdump shows a lot of checksum mismatch.
> 
>   [1]: a0f79386a4968b4925da6db2d1daffd0605a4402
>   [2]: 0519359784328bfa92bf0931bf0cff3b58c16932 (4.19 merge window opened)
> 
> I started bisecting and then found that [3] breaks it. According to [4]:
>   "For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig
>   needs to be set after the tx/rx is enabled."
> So I moved rtl_init_rxcfg() after enabling tx/rx and now my adapter works
> (RTL8168e works too).
> 
>   [3]: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace
>   [4]: e542a2269f232d61270ceddd42b73a4348dee2bb ("r8169: adjust the RxConfig
> settings.")
> 
> Also drop "rx" from rtl_set_rx_tx_config_registers(), since it does nothing
> with it already.
> 
> Fixes: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace ("r8169: simplify
> rtl_hw_start_8169")
> 
> Cc: Heiner Kallweit 
> Cc: David S. Miller 
> Cc: netdev@vger.kernel.org
> Cc: Realtek linux nic maintainers 
> Signed-off-by: Azat Khuzhin 

Applied and queued up for -stable.


Re: [Patch net] tipc: fix a missing rhashtable_walk_exit()

2018-08-29 Thread David Miller
From: Cong Wang 
Date: Thu, 23 Aug 2018 16:19:44 -0700

> rhashtable_walk_exit() must be paired with rhashtable_walk_enter().
> 
> Fixes: 40f9f4397060 ("tipc: Fix tipc_sk_reinit race conditions")
> Cc: Herbert Xu 
> Cc: Ying Xue 
> Signed-off-by: Cong Wang 

Applied and queued up for -stable, thanks Cong.


Re: [PATCH net] vti6: remove !skb->ignore_df check from vti6_xmit()

2018-08-29 Thread David Miller
From: Alexey Kodanev 
Date: Thu, 23 Aug 2018 19:49:54 +0300

> Before the commit d6990976af7c ("vti6: fix PMTU caching and reporting
> on xmit") '!skb->ignore_df' check was always true because the function
> skb_scrub_packet() was called before it, resetting ignore_df to zero.
> 
> In the commit, skb_scrub_packet() was moved below, and now this check
> can be false for the packet, e.g. when sending it in the two fragments,
> this prevents successful PMTU updates in such case. The next attempts
> to send the packet lead to the same tx error. Moreover, vti6 initial
> MTU value relies on PMTU adjustments.
> 
> This issue can be reproduced with the following LTP test script:
> udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000
> 
> Fixes: ccd740cbc6e0 ("vti6: Add pmtu handling to vti6_xmit.")
> Signed-off-by: Alexey Kodanev 

Applied and queued up for -stable, thank you.


Re: pull-request: bpf 2018-08-29

2018-08-29 Thread David Miller
From: Daniel Borkmann 
Date: Wed, 29 Aug 2018 21:07:24 +0200

> The following pull-request contains BPF updates for your *net* tree.
> 
> The main changes are:
> 
> 1) Fix a build error in sk_reuseport_convert_ctx_access() when
>    compiling with clang which cannot resolve hweight_long() at
>    build time inside the BUILD_BUG_ON() assertion, from Stefan.
> 
> 2) Several fixes for BPF sockmap, four of them in getting the
>    bpf_msg_pull_data() helper to work, one use after free case
>    in bpf_tcp_close() and one refcount leak in bpf_tcp_recvmsg(),
>    from Daniel.
> 
> 3) Another fix for BPF sockmap where we misaccount sk_mem_uncharge()
>    in the socket redirect error case from unwinding scatterlist
>    twice, from John.
> 
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Pulled, thanks Daniel.


Re: [net-next 00/13][pull request] 10GbE Intel Wired LAN Driver Updates 2018-08-28

2018-08-28 Thread David Miller
From: Jeff Kirsher 
Date: Tue, 28 Aug 2018 14:35:44 -0700

> This series contains updates to ixgbe and ixgbevf only.
 ...

Pulled.


Re: [PATCH net-next 00/15] nfp: add NFP5000 support

2018-08-28 Thread David Miller
From: Jakub Kicinski 
Date: Tue, 28 Aug 2018 13:20:32 -0700

> This series broadly speaking adds support for NFP5000 and
> related products.
 ...

Series applied, thanks Jakub.


Re: [net-next 00/15][pull request] 100GbE Intel Wired LAN Driver Updates 2018-08-28

2018-08-28 Thread David Miller
From: Jeff Kirsher 
Date: Tue, 28 Aug 2018 12:03:58 -0700

> This series contains new features and implementation updates for the
> ice driver.
 ...

Pulled.


net-next is OPEN...

2018-08-28 Thread David Miller


You know the drill...

http://vger.kernel.org/~davem/net-next.html


Re: [PATCH 1/1] net/rds: Use rdma_read_gids to get connection SGID/DGID in IPv6

2018-08-27 Thread David Miller
From: Zhu Yanjun 
Date: Sat, 25 Aug 2018 15:19:05 +0800

> In IPv4, the newly introduced rdma_read_gids is used to read the SGID/DGID
> for the connection which returns GID correctly for RoCE transport as well.
> 
> In IPv6, rdma_read_gids is also used. The following are why rdma_read_gids
> is introduced.
> 
> rdma_addr_get_dgid() for RoCE for client side connections returns MAC
> address, instead of DGID.
> rdma_addr_get_sgid() for RoCE doesn't return correct SGID for IPv6 and
> when more than one IP address is assigned to the netdevice.
> 
> So the transport agnostic rdma_read_gids() API is provided by rdma_cm
> module.
> 
> Signed-off-by: Zhu Yanjun 

Applied.


Re: [PATCH] r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices

2018-08-27 Thread David Miller
From: Azat Khuzhin 
Date: Sun, 26 Aug 2018 17:03:09 +0300

> I have two Ethernet adapters:
>   r8169 :03:01.0 eth0: RTL8169sb/8110sb, 00:14:d1:14:2d:49, XID 1000, 
> IRQ 18
>   r8169 :01:00.0 eth0: RTL8168e/8111e, 64:66:b3:11:14:5d, XID 2c20, 
> IRQ 30
> And after upgrading from linux 4.15 [1] to linux 4.18+ [2] RTL8169sb failed to
> receive any packets. tcpdump shows a lot of checksum mismatch.
> 
>   [1]: a0f79386a4968b4925da6db2d1daffd0605a4402
>   [2]: 0519359784328bfa92bf0931bf0cff3b58c16932 (4.19 merge window opened)
> 
> I started bisecting and then found that [3] breaks it. According to [4]:
>   "For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig
>   needs to be set after the tx/rx is enabled."
> So I moved rtl_init_rxcfg() after enabling tx/rx and now my adapter works
> (RTL8168e works too).
> 
>   [3]: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace
>   [4]: e542a2269f232d61270ceddd42b73a4348dee2bb ("r8169: adjust the RxConfig
> settings.")
> 
> Also drop "rx" from rtl_set_rx_tx_config_registers(), since it does nothing
> with it already.
> 
> Fixes: 3559d81e76bfe3803e89f2e04cf6ef7ab4f3aace ("r8169: simplify
> rtl_hw_start_8169")
> 
> Cc: Heiner Kallweit 
> Cc: David S. Miller 
> Cc: netdev@vger.kernel.org
> Cc: Realtek linux nic maintainers 
> Signed-off-by: Azat Khuzhin 
> ---
> It looks like calling rtl_init_rxcfg() the second time is fine, but I
> can move it into rtl_hw_start_8169())

Heiner, please review.


Re: [PATCH] net: dsa: Drop GPIO includes

2018-08-27 Thread David Miller
From: Linus Walleij 
Date: Mon, 27 Aug 2018 00:20:11 +0200

> Commit 52638f71fcff ("dsa: Move gpio reset into switch driver")
> moved the GPIO handling into the switch drivers but forgot
> to remove the GPIO header includes.
> 
> Signed-off-by: Linus Walleij 

Applied.


Re: [patch net 0/2] net: sched: couple of small fixes

2018-08-27 Thread David Miller
From: Cong Wang 
Date: Mon, 27 Aug 2018 13:44:56 -0700

> On Mon, Aug 27, 2018 at 11:58 AM Jiri Pirko  wrote:
>>
>> From: Jiri Pirko 
>>
>> Jiri Pirko (2):
>>   net: sched: fix extack error message when chain is failed to be
>> created
>>   net: sched: return -ENOENT when trying to remove filter from
>> non-existent chain
> 
> Acked-by: Cong Wang 

Series applied.


Re: [PATCH net] sctp: hold transport before accessing its asoc in sctp_transport_get_next

2018-08-27 Thread David Miller
From: Xin Long 
Date: Mon, 27 Aug 2018 18:38:31 +0800

> As Marcelo noticed, in sctp_transport_get_next, it is iterating over
> transports but then also accessing the association directly, without
> checking any refcnts before that, which can cause a use-after-free
> read.
> 
> So fix it by holding transport before accessing the association. With
> that, sctp_transport_hold calls can be removed in the later places.
> 
> Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and 
> reuse some for proc")
> Reported-by: syzbot+fe62a0c9aa6a85c6d...@syzkaller.appspotmail.com
> Signed-off-by: Xin Long 

Applied and queued up for -stable.


Re: [PATCH net] erspan: set erspan_ver to 1 by default when adding an erspan dev

2018-08-27 Thread David Miller
From: Xin Long 
Date: Mon, 27 Aug 2018 18:41:32 +0800

> After erspan_ver is introduced, if erspan_ver is not set in iproute, its
> value will be left 0 by default. Since Commit 02f99df1875c ("erspan: fix
> invalid erspan version."), it has broken the traffic due to the version
> check in erspan_xmit if users are not aware of 'erspan_ver' param, like
> using an old version of iproute.
> 
> To fix this compatibility problem, it sets erspan_ver to 1 by default
> when adding an erspan dev in erspan_setup. Note that we can't do it in
> ipgre_netlink_parms, as this function is also used by ipgre_changelink.
> 
> Fixes: 02f99df1875c ("erspan: fix invalid erspan version.")
> Reported-by: Jianlin Shi 
> Signed-off-by: Xin Long 

Applied and queued up for -stable.


Re: [PATCH net] sctp: remove useless start_fail from sctp_ht_iter in proc

2018-08-27 Thread David Miller
From: Xin Long 
Date: Mon, 27 Aug 2018 18:40:18 +0800

> After changing rhashtable_walk_start to return void, start_fail would
> never be set to any value other than 0, and the checking for start_fail
> is pointless, so remove it.
> 
> Fixes: 97a6ec4ac021 ("rhashtable: Change rhashtable_walk_start to return 
> void")
> Signed-off-by: Xin Long 

Applied and queued up for -stable.


Re: any reason for "!!netif_carrier_ok" and "!!netif_dormant" in net-sysfs.c?

2018-08-27 Thread David Miller
From: "Robert P. J. Day" 
Date: Mon, 27 Aug 2018 04:55:29 -0400 (EDT)

>   another pedantic oddity -- is there a reason for these two double
> negations in net/core/net-sysfs.c?

It turns an arbitrary integer into a boolean, this is a common
construct across the kernel tree so I'm surprised you've never seen
it before.

Although, I don't know how much more hand holding we're willing to
tolerate continuing to give to you at this point.

Thanks.
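
For readers who have not met the idiom, here is a minimal userspace
sketch of what the double negation does.  The helper name is
hypothetical and is not taken from net-sysfs.c; it only illustrates
that the first '!' maps any nonzero value to 0 and zero to 1, and the
second '!' flips that back, so the result is always exactly 0 or 1.

```c
/*
 * Hypothetical helper, for illustration only: collapse an arbitrary
 * integer "truth value" to exactly 0 or 1 using double negation.
 */
static int normalize_flag(int value)
{
	return !!value;
}
```

This matters when the value is printed or compared against 1, since a
flag test may hand back any nonzero bit pattern rather than 1.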


Re: [PATCH net 1/1] qlge: Fix netdev features configuration.

2018-08-25 Thread David Miller
From: Manish Chopra 
Date: Thu, 23 Aug 2018 13:20:52 -0700

> qlge_fix_features() is not supposed to modify hardware or
> driver state, rather it is supposed to only fix requested
> features bits. Currently qlge_fix_features() also goes for
> interface down and up unnecessarily if there is not even
> any change in features set.
> 
> This patch changes/fixes following -
> 
> 1) Move reload of interface or device re-config from
>    qlge_fix_features() to qlge_set_features().
> 2) Reload of interface in qlge_set_features() only if
>    relevant feature bit (NETIF_F_HW_VLAN_CTAG_RX) is changed.
> 3) Get rid of qlge_fix_features() since driver is not really
>    required to fix any features bit.
> 
> Signed-off-by: Manish 
> Reviewed-by: Benjamin Poirier 

Applied and queued up for -stable.

Please provide a proper Fixes: tag next time.

Thanks.


Re: [PATCH v2] net: macb: do not disable MDIO bus at open/close time

2018-08-25 Thread David Miller
From: Anssi Hannula 
Date: Thu, 23 Aug 2018 10:45:22 +0300

> macb_reset_hw() is called from macb_close() and indirectly from
> macb_open(). macb_reset_hw() zeroes the NCR register, including the MPE
> (Management Port Enable) bit.
> 
> This will prevent accessing any other PHYs for other Ethernet MACs on
> the MDIO bus, which remains registered at macb_reset_hw() time, until
> macb_init_hw() is called from macb_open() which sets the MPE bit again.
> 
> I.e. currently the MDIO bus has a short disruption at open time and is
> disabled at close time until the interface is opened again.
> 
> Fix that by only touching the RE and TE bits when enabling and disabling
> RX/TX.
> 
> v2: Make macb_init_hw() NCR write a single statement.
> 
> Fixes: 6c36a7074436 ("macb: Use generic PHY layer")
> Signed-off-by: Anssi Hannula 

Applied and queued up for -stable.
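
The approach described in the patch (touch only RE and TE, leave MPE
alone) can be sketched roughly as below.  This is a userspace
illustration, not the driver code: the NCR_* names and the bit
positions are assumptions chosen for the example, and the register is
modelled as a plain integer value.

```c
#include <stdint.h>

/* Illustrative bit positions, assumed for this sketch only. */
#define NCR_RE  (1u << 2)	/* receive enable */
#define NCR_TE  (1u << 3)	/* transmit enable */
#define NCR_MPE (1u << 4)	/* management port enable */

/* Disable RX/TX while preserving every other bit, including MPE. */
static uint32_t ncr_disable_rxtx(uint32_t ncr)
{
	return ncr & ~(NCR_RE | NCR_TE);
}

/* Enable RX/TX while preserving every other bit. */
static uint32_t ncr_enable_rxtx(uint32_t ncr)
{
	return ncr | NCR_RE | NCR_TE;
}
```

Writing zero to the whole register would clear MPE as a side effect,
which is the MDIO disruption the commit message describes.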


Re: [PATCH v3 net 1/1] net: macb: Fix regression breaking non-MDIO fixed-link PHYs

2018-08-25 Thread David Miller
From: Ahmad Fatoum 
Date: Tue, 21 Aug 2018 17:35:48 +0200

> commit 739de9a1563a ("net: macb: Reorganize macb_mii bringup") broke
> initializing macb on the EVB-KSZ9477 eval board.
> There, of_mdiobus_register was called even for the fixed-link representing
> the RGMII-link to the switch with the result that the driver attempts to
> enumerate PHYs on a non-existent MDIO bus:
> 
>   libphy: MACB_mii_bus: probed
>   mdio_bus f0028000.ethernet-: fixed-link has invalid PHY address
>   mdio_bus f0028000.ethernet-: scan phy fixed-link at address 0
> [snip]
>   mdio_bus f0028000.ethernet-: scan phy fixed-link at address 31
> 
> The "MDIO" bus registration succeeds regardless, having claimed the reset 
> GPIO,
> and calling of_phy_register_fixed_link later on fails because it tries
> to claim the same GPIO:
> 
>   macb f0028000.ethernet: broken fixed-link specification
> 
> Fix this by registering the fixed-link before calling mdiobus_register.
> 
> Fixes: 739de9a1563a ("net: macb: Reorganize macb_mii bringup")
> Signed-off-by: Ahmad Fatoum 

Applied and queued up for -stable, thanks.


Re: [PATCH net] mlxsw: spectrum_switchdev: Do not leak RIFs when removing bridge

2018-08-25 Thread David Miller
From: Ido Schimmel 
Date: Fri, 24 Aug 2018 15:41:35 +0300

> When a bridge device is removed, the VLANs are flushed from each
> configured port. This causes the ports to decrement the reference count
> on the associated FIDs (filtering identifier). If the reference count of
> a FID is 1 and it has a RIF (router interface), then this RIF is
> destroyed.
> 
> However, if no port is member in the VLAN for which a RIF exists, then
> the RIF will continue to exist after the removal of the bridge. To
> reproduce:
> 
> # ip link add name br0 type bridge vlan_filtering 1
> # ip link set dev swp1 master br0
> # ip link add link br0 name br0.10 type vlan id 10
> # ip address add 192.0.2.0/24 dev br0.10
> # ip link del dev br0
> 
> The RIF associated with br0.10 continues to exist.
> 
> Fix this by iterating over all the bridge device uppers when it is
> destroyed and take care of destroying their RIFs.
> 
> Fixes: 99f44bb3527b ("mlxsw: spectrum: Enable L3 interfaces on top of bridge 
> devices")
> Signed-off-by: Ido Schimmel 
> Reviewed-by: Petr Machata 

Applied and queued up for -stable, thanks.


Re: [net 00/11][pull request] Intel Wired LAN Driver Updates 2018-08-24

2018-08-25 Thread David Miller
From: Jeff Kirsher 
Date: Fri, 24 Aug 2018 11:47:24 -0700

> This series contains fixes to e1000, igb, ixgb, ixgbe and i40e.

Pulled, thanks Jeff.


Re: pull-request: bpf 2018-08-24

2018-08-23 Thread David Miller
From: Daniel Borkmann 
Date: Fri, 24 Aug 2018 01:09:29 +0200

> The following pull-request contains BPF updates for your *net* tree.

Pulled, thanks Daniel.


Re: [net 00/13][pull request] Intel Wired LAN Driver Fixes 2018-08-23

2018-08-23 Thread David Miller
From: Jeff Kirsher 
Date: Thu, 23 Aug 2018 12:14:50 -0700

> This series contains bug fixes to the ice driver.

Pulled, thanks Jeff.


Re: pull request: bluetooth 2018-08-23

2018-08-22 Thread David Miller
From: Johan Hedberg 
Date: Thu, 23 Aug 2018 08:34:40 +0300

> Here are two important Bluetooth fixes for the MediaTek and RealTek HCI
> drivers.
> 
> Please let me know if there are any issues pulling, thanks.

Pulled, thank you.


Re: [Patch net 0/2] net: hns3: bug fix & optimization for HNS3 driver

2018-08-22 Thread David Miller
From: Huazhong Tan 
Date: Thu, 23 Aug 2018 11:37:14 +0800

> This patchset presents a bug fix found when CONFIG_ARM64_64K_PAGES
> is enabled and an optimization for HNS3 driver.

Series applied, thank you.


Re: [PATCH] net/ipv6: init ip6 anycast rt->dst.input as ip6_input

2018-08-22 Thread David Miller
From: Hangbin Liu 
Date: Thu, 23 Aug 2018 11:31:37 +0800

> Commit 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path")
> forgot to handle anycast route and init anycast rt->dst.input to ip6_forward.
> Fix it by setting anycast rt->dst.input back to ip6_input.
> 
> Fixes: 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path")
> Signed-off-by: Hangbin Liu 

Applied and queued up for -stable, thanks.


Re: [Patch net 0/4] net: hns: bug fixes & optimization for HNS driver

2018-08-22 Thread David Miller
From: Huazhong Tan 
Date: Thu, 23 Aug 2018 11:10:09 +0800

> This patchset presents some bug fixes found when
> CONFIG_ARM64_64K_PAGES is enabled and an optimization for HNS driver.

Series applied, thank you.


Re: [PATCH net 0/3] tcp_bbr: PROBE_RTT minor bug fixes

2018-08-22 Thread David Miller
From: Kevin Yang 
Date: Wed, 22 Aug 2018 17:43:13 -0400

> From: "Kevin(Yudong) Yang" 
> 
> This series includes two minor bug fixes for the TCP BBR PROBE_RTT
> mechanism, and one preparatory patch:
> 
> (1) A preparatory patch to reorganize the PROBE_RTT logic by refactoring
> (into its own function) the code to exit PROBE_RTT, since the next
> patch will be using that code in a new context.
> 
> (2) Fix: When BBR restarts from idle and if BBR is in PROBE_RTT mode,
> BBR should check if it's time to exit PROBE_RTT. If yes, then BBR
> should exit PROBE_RTT mode and restore the cwnd to its full value.
> 
> (3) Fix: Apply the PROBE_RTT cwnd cap even if the count of fully-ACKed
> packets is 0.

Series applied, thank you.


Re: [PATCH net] ipv4: tcp: send zero IPID for RST and ACK sent in SYN-RECV and TIME-WAIT state

2018-08-22 Thread David Miller
From: Eric Dumazet 
Date: Wed, 22 Aug 2018 13:30:45 -0700

> tcp uses per-cpu (and per namespace) sockets (net->ipv4.tcp_sk) internally
> to send some control packets.
> 
> 1) RST packets, through tcp_v4_send_reset()
> 2) ACK packets in SYN-RECV and TIME-WAIT state, through tcp_v4_send_ack()
> 
> These packets assert IP_DF, and also use the hashed IP ident generator
> to provide an IPv4 ID number.
> 
> Geoff Alexander reported this could be used to build off-path attacks.
> 
> These packets should not be fragmented, since their size is smaller than
> IPV4_MIN_MTU. Only some tunneled paths could eventually have to fragment,
> regardless of inner IPID.
> 
> We really can use zero IPID, to address the flaw, and as a bonus,
> avoid a couple of atomic operations in ip_idents_reserve()
> 
> Signed-off-by: Eric Dumazet 
> Reported-by: Geoff Alexander 
> Tested-by: Geoff Alexander 

Applied and queued up for -stable.


Re: [Patch net] addrconf: reduce unnecessary atomic allocations

2018-08-22 Thread David Miller
From: Cong Wang 
Date: Wed, 22 Aug 2018 12:58:34 -0700

> All the 3 callers of addrconf_add_mroute() assert RTNL
> lock, they don't take any additional lock either, so
> it is safe to convert it to GFP_KERNEL.
> 
> Same for sit_add_v4_addrs().
> 
> Cc: David Ahern 
> Signed-off-by: Cong Wang 

Applied.


Re: Experimental fix for MSI-X issue on r8169

2018-08-21 Thread David Miller
From: Jian-Hong Pan 
Date: Wed, 22 Aug 2018 11:01:02 +0800

 ...
> [   56.462464] r8169 :02:00.0: MSI-X entry: context resume:
>    
 ...
> uh!  The MSI-X entry seems missed after resume on this laptop!

Yeah, having all of the MSI-X entry values be all-1's is not a good
sign.

But this is quite a curious set of debugging traces we now have.

In the working case, the vector number in the DATA field seems
to change, which suggests that something is assigning new values
and programming them into these fields at resume time.

But in the failing cases, all of the values are garbage.

I would expect, given what the working trace looks like, that in the
failing case some values would be wrong and the DATA value would have
some new yet valid value.  But that is not what we are seeing here.

Weird.



Re: [PATCH v1 3/3] net: WireGuard secure network tunnel

2018-08-21 Thread David Miller
From: "Jason A. Donenfeld" 
Date: Tue, 21 Aug 2018 16:41:50 -0700

> Is 100 in fact acceptable for new code? 120? 180?  What's the
> generally accepted limit these days?

Please keep it as close to 80 columns as possible.

Line breaks are not ugly, please embrace them :)


Re: [PATCH] datapath.c: fix missing return value check of nla_nest_start()

2018-08-21 Thread David Miller
From: Pravin Shelar 
Date: Tue, 21 Aug 2018 15:38:28 -0700

> On Fri, Aug 17, 2018 at 1:15 AM Jiecheng Wu  wrote:
>>
>> Function queue_userspace_packet() defined in net/openvswitch/datapath.c 
>> calls nla_nest_start() to allocate memory for struct nlattr which is 
>> dereferenced immediately. As nla_nest_start() may return NULL on failure, 
>> this code piece may cause NULL pointer dereference bug.
>> ---
>>  net/openvswitch/datapath.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
>> index 0f5ce77..ff4457d 100644
>> --- a/net/openvswitch/datapath.c
>> +++ b/net/openvswitch/datapath.c
>> @@ -460,6 +460,8 @@ static int queue_userspace_packet(struct datapath *dp, 
>> struct sk_buff *skb,
>>
>> if (upcall_info->egress_tun_info) {
>> nla = nla_nest_start(user_skb, 
>> OVS_PACKET_ATTR_EGRESS_TUN_KEY);
>> +   if (!nla)
>> +   return -EMSGSIZE;
> It is not possible, since user_skb is allocated to accommodate all
> netlink attributes.

Pravin, common practice is to always check nla_*() return values even if the
SKB is allocated with "enough space".

Those calculations can have bugs, and these checks are therefore helpful to
avoid crashes and memory corruption in such cases.

Thank you.
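
A rough userspace sketch of the defensive pattern being asked for.
Nothing here is the real netlink API: struct msg_buf, nest_start() and
fill_nested() are hypothetical stand-ins for an skb with limited
tailroom, nla_nest_start(), and a caller such as
queue_userspace_packet().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a message buffer with limited room. */
struct msg_buf {
	unsigned char data[16];
	size_t len;
};

/*
 * Hypothetical analogue of nla_nest_start(): reserve a header and
 * return a pointer to it, or NULL when it does not fit.
 */
static unsigned char *nest_start(struct msg_buf *m, size_t hdrlen)
{
	unsigned char *start;

	if (m->len + hdrlen > sizeof(m->data))
		return NULL;
	start = m->data + m->len;
	m->len += hdrlen;
	return start;
}

/* Caller checks the result even if the buffer "should" be big enough. */
static int fill_nested(struct msg_buf *m)
{
	unsigned char *nest = nest_start(m, 4);

	if (!nest)
		return -EMSGSIZE;
	nest[0] = 1;	/* safe: nest is known to be non-NULL here */
	return 0;
}
```

If the size calculation done at allocation time is ever wrong, the
check turns a NULL dereference into an ordinary error return.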


Re: Experimental fix for MSI-X issue on r8169

2018-08-21 Thread David Miller
From: Heiner Kallweit 
Date: Tue, 21 Aug 2018 23:19:04 +0200

> That's what I get on my system (RTL8168E-VL). In your case you'll come
> only till the first suspend.
> 
> [3.743404] r8169 :03:00.0: MSI-X entry: context probe: fee01004 0 
> 40ef 1

On probe, MSI-X is masked (ie. disabled) and is configured to use:

address: 0xfee01004
data:0x40ef

> [   29.539250] r8169 :03:00.0: MSI-X entry: context suspend: fee02004 0 
> 4028 0

At suspend time, MSI-X is unmasked (ie. enabled) and is configured to use:

address: 0xfee01004
data:0x4028

> [   29.837457] r8169 :03:00.0: MSI-X entry: context resume: fee01004 0 
> 402b 0

At resume time, MSI-X is unmasked (ie. enabled) and is configured to use:

address: 0xfee01004
data:0x402b

> [   36.921370] r8169 :03:00.0: MSI-X entry: context suspend: fee01004 0 
> 402b 0

Second suspend:

address: 0xfee01004
data:0x402b

> [   37.239407] r8169 :03:00.0: MSI-X entry: context resume: fee01004 0 
> 402b 0

Second resume:

address: 0xfee01004
data:0x402b

And this all looks normal.  The data field is changing when you first up the
device and interrupts are enabled.  This is where the request_irq happens, the
MSI vector is allocated, and that vector number is written to the data field of
the MSI-X entry.

It looks like this (re-)allocation of MSI vectors happens on every resume as
well.

And that's why the data field changes each resume.
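
The reading of the trace lines above can be sketched as a small
decoder.  It assumes the four hex words printed by the debug patch are,
in order, the lower address, upper address, message data and vector
control words of one MSI-X table entry; that ordering is inferred from
the interpretation above, not confirmed from the debug patch itself.
The struct and function names are hypothetical.  Bit 0 of the vector
control word is the per-vector mask bit.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for one dumped MSI-X table entry. */
struct msix_entry_dump {
	uint32_t addr_lo;
	uint32_t addr_hi;
	uint32_t data;
	uint32_t ctrl;
};

/* Parse "fee01004 0 40ef 1" style text; returns 0 on success. */
static int parse_msix_dump(const char *s, struct msix_entry_dump *e)
{
	unsigned int a, b, c, d;

	if (sscanf(s, "%x %x %x %x", &a, &b, &c, &d) != 4)
		return -1;
	e->addr_lo = a;
	e->addr_hi = b;
	e->data = c;
	e->ctrl = d;
	return 0;
}

/* Bit 0 of vector control set means the vector is masked. */
static int msix_masked(const struct msix_entry_dump *e)
{
	return e->ctrl & 1u;
}
```

Applied to the probe line, this gives address 0xfee01004, data 0x40ef
and a masked vector; the later lines decode as unmasked.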


Re: [Patch net 0/9] net_sched: pending clean up and bug fixes

2018-08-21 Thread David Miller
From: Cong Wang 
Date: Sun, 19 Aug 2018 12:22:04 -0700

> This patchset aims to clean up and fixes some bugs in current
> merge window, this is why it is targeting -net.
> 
> Patch 1-5 are clean up Vlad's patches merged in current merge
> window, patch 6 is just a trivial cleanup.
> 
> Patch 7 reverts a lockdep warning fix and patch 8 provides a better
> fix for it.
> 
> Patch 9 fixes a potential deadlock found by me during code review.
> 
> Please see each patch for details.
> 
> Cc: Jamal Hadi Salim 
> Signed-off-by: Cong Wang 

Series applied and patches #8 and #9 queued up for -stable.


Re: [Patch net 8/9] act_ife: move tcfa_lock down to where necessary

2018-08-21 Thread David Miller
From: Cong Wang 
Date: Mon, 20 Aug 2018 16:57:46 -0700

> Passing 'exists' as 'atomic' is prior to my change. With my change,
> they are separated as two parameters:

I mis-read the patch, thanks for explaining :)


Re: [PATCH net] hv_netvsc: ignore devices that are not PCI

2018-08-21 Thread David Miller
From: Stephen Hemminger 
Date: Tue, 21 Aug 2018 10:40:38 -0700

> Registering another device with same MAC address (such as TAP, VPN or
> DPDK KNI) will confuse the VF autobinding logic.  Restrict the search
> to only run if the device is known to be a PCI attached VF.
> 
> Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
> Signed-off-by: Stephen Hemminger 

Applied and queued up for -stable.


Re: [bpf-next RFC 0/3] Introduce eBPF flow dissector

2018-08-20 Thread David Miller
From: Alexei Starovoitov 
Date: Mon, 20 Aug 2018 13:52:07 -0700

> I don't think copy-paste avoids the issue of uapi.
> Anything used by BPF program is uapi.
> The only exception is offsets of kernel internal structures
> passed into bpf_probe_read().
> So we have several options:
> 1. be honest and say 'struct flow_dissect_key*' is now uapi
> 2. wrap all of them into 'struct bpf_flow_dissect_key*' and do rewrites
>   when/if 'struct flow_dissect_key*' changes
> 3. wait for BTF to solve it for tracing use case and for this one too.
 ...
> The idea is that kernel internal structs can be defined in bpf prog
> and since they will be described precisely in BTF that comes with the prog
> the kernel can validate that prog's BTF matches what kernel thinks it has.
> imo that's the most flexible, but BTF for all of vmlinux won't be ready
> tomorrow and looks like this patch set is ready to go, so I would go with 1 
> or 2.

I would definitely prefer #2 or #3.

I personally would like to see us avoid preventing interesting
optimizations of the flow key layout and/or accesses in the future.


Re: [PATCH] rhashtable: remove duplicated include from rhashtable.c

2018-08-20 Thread David Miller
From: Yue Haibing 
Date: Tue, 21 Aug 2018 01:41:56 +

> Remove duplicated include.
> 
> Signed-off-by: Yue Haibing 

Applied, thank you.


Re: [PATCH net] net/ipv6: Put lwtstate when destroying fib6_info

2018-08-20 Thread David Miller
From: dsah...@kernel.org
Date: Mon, 20 Aug 2018 13:02:41 -0700

> From: David Ahern 
> 
> Prior to the introduction of fib6_info lwtstate was managed by the dst
> code. With fib6_info releasing lwtstate needs to be done when the struct
> is freed.
> 
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
> based routes")
> Signed-off-by: David Ahern 

Applied and queued up for -stable, thanks David.


Re: [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.

2018-08-20 Thread David Miller
From: Doron Roberts-Kedes 
Date: Mon, 20 Aug 2018 17:27:23 -0700

> Given that frag_lists are not unlikely in this case, I believe the only
> remaining feedback on the original patch was the recursive
> implementation. If you'd like, I can re-submit with an iterative
> implementation, but I noticed that goes against the existing recursive
> pattern in functions like skb_release_data -> kfree_skb_list -> kfree_skb 
> -> __kfree_skb -> skb_release_all -> skb_release_data, as well as
> skb_to_sgvec. Let me know whether an iterative implementation is
> preferred here, or whether I can simply rebase and resubmit a patch
> similar to the original (modulo some variable renaming improvements). 

Ok, I guess staying with the recursive implementation is fine.

It's a real shame that frag lists are so common in this code path,
especially nested ones :-/

In the long term, perhaps we can do something about that.

In the short term, I guess this means your original change is OK.

Please resubmit when the net-next tree opens back up, thanks.


Re: [PATCH net v2 0/4] qed: Misc fixes in the interface with the MFW

2018-08-20 Thread David Miller
From: Tomer Tayar 
Date: Mon, 20 Aug 2018 00:01:41 +0300

> This patch series fixes several issues in the driver's interface with the
> management FW (MFW).
> 
> v1->v2:
> - Fix loop counter decrement to be pre instead of post.

Series applied, thank you.


Re: [Patch net 8/9] act_ife: move tcfa_lock down to where necessary

2018-08-20 Thread David Miller
From: Cong Wang 
Date: Sun, 19 Aug 2018 12:22:12 -0700

> The only time we need to take tcfa_lock is when adding
> a new metainfo to an existing ife->metalist. We don't need
> to take tcfa_lock so early and so broadly in tcf_ife_init().
> 
> This means we can always take ife_mod_lock first, avoid the
> reverse locking ordering warning as reported by Vlad.
> 
> Reported-by: Vlad Buslov 
> Tested-by: Vlad Buslov 
> Cc: Vlad Buslov 
> Cc: Jamal Hadi Salim 
> Signed-off-by: Cong Wang 

After this change we no longer call populate_metalist() in an atomic
context via tcf_ife_init(), and populate_metalist passes 'exists'
down to add_metainfo() as an 'atomic' indicator.  It doesn't have this
meaning if you aren't holding the tcfa_lock in the callers with BH
disabled.

Therefore, add_metainfo()'s 'atomic' indication is inaccurate in this
call chain and will use GFP_ATOMIC unnecessarily.

Probably the thing to do is just pass 'false' down to add_metainfo()
in populate_metalist().


Re: [PATCH][net-next] vxlan: reduce dirty cache line in vxlan_find_mac

2018-08-19 Thread David Miller
From: Li RongQing 
Date: Sun, 19 Aug 2018 11:36:08 +0800

> vxlan_find_mac() unconditionally sets f->used for every packet,
> this causes a cache miss for every packet, since remote, hlist
> and used of vxlan_fdb share the same cacheline.
> 
> With this change f->used is set only if not equal to jiffies.
> This gives up to 5% speed-up with small packets.
> 
> Signed-off-by: Zhang Yu 
> Signed-off-by: Li RongQing 

Please resubmit this when the net-next tree opens back up.

Thanks.
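
The idea in the patch description, writing the timestamp only when it
would actually change, can be sketched as follows.  This is a userspace
illustration with hypothetical names; 'now' stands in for jiffies, and
the return value exists only so the example can show whether a store
happened.

```c
/* Hypothetical stand-in for the forwarding entry's timestamp field. */
struct fdb_entry {
	unsigned long used;
};

/*
 * Update the "last used" stamp only when it differs from 'now'.
 * Returns 1 if a store was performed, 0 if the line was left clean.
 */
static int touch_entry(struct fdb_entry *f, unsigned long now)
{
	if (f->used != now) {
		f->used = now;
		return 1;
	}
	return 0;
}
```

With many packets arriving within the same tick, most lookups then only
read the field, so the cache line is not dirtied on every packet.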


Re: [PATCH net 1/4] qed: Wait for ready indication before rereading the shmem

2018-08-19 Thread David Miller
From: Tomer Tayar 
Date: Sun, 19 Aug 2018 20:58:04 +0300

> + while (!p_info->mfw_mb_length && cnt--) {
> + msleep(msec);
> + p_info->mfw_mb_length =
> + (u16)qed_rd(p_hwfn, p_ptt,
> + p_info->mfw_mb_addr +
> + offsetof(struct public_mfw_mb, sup_msgs));
> + }
> +
> + if (!cnt) {

Because you use postdecrement on 'cnt', the loop will timeout with
'cnt' equal to '-1' not zero.

You need to fix this.
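
A minimal userspace sketch of the problem, with the sleep and register
read left out and the "ready" condition assumed never to become true.
The function names are hypothetical.

```c
/* Post-decrement: on timeout the counter ends at -1, not 0. */
static int poll_postdec(int cnt)
{
	int ready = 0;	/* assumed never to become true in this sketch */

	while (!ready && cnt--)
		;	/* msleep() and the register read would go here */
	return cnt;
}

/* Pre-decrement: on timeout the counter ends at exactly 0. */
static int poll_predec(int cnt)
{
	int ready = 0;	/* assumed never to become true in this sketch */

	while (!ready && --cnt)
		;	/* msleep() and the register read would go here */
	return cnt;
}
```

With the post-decrement form a later "if (!cnt)" test never sees the
timeout.  Note that the pre-decrement form runs the loop body one time
fewer for the same starting count.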


Re: [PATCH 1/1] tap: RCU usage and comment fixes

2018-08-19 Thread David Miller
From: Wang Jian 
Date: Fri, 17 Aug 2018 08:22:53 +

> The tap_queue and the 'tap_dev' are loosely coupled, not 'macvlan_dev'.

There is another reference to macvlan_dev in that comment, which is therefore
also similarly inaccurate.  You should add an appropriate Fixes: line for
where this inaccuracy was introduced, which is:

Fixes: 6fe3faf86757 ("tap: Abstract type of virtual interface from tap 
implementation")

> Taking rcu_read_lock a little later seems can slightly reduce rcu read 
> critical section.

This is a separate change from fixing up a comment.


Re: [PATCH] net: lan743x_ptp: convert to ktime_get_clocktai_ts64

2018-08-19 Thread David Miller
From: Arnd Bergmann 
Date: Wed, 15 Aug 2018 19:49:49 +0200

> timekeeping_clocktai64() has been renamed to ktime_get_clocktai_ts64()
> for consistency with the other ktime_get_* access functions.
> 
> Rename the new caller that has come up as well.
> 
> Question: this is the only ptp driver that sets the hardware time
> to the current system time in TAI. Why does it do that?
> 
> Signed-off-by: Arnd Bergmann 

Deciding whether PTP drivers should set the hardware time at boot
to the current system time is a separate discussion from using
the new name for the timekeeping_clocktai64() interface, I'm applying
this.

Thanks Arnd.


Re: [PATCH net-next] net: sched: always disable bh when taking tcf_lock

2018-08-19 Thread David Miller
From: Vlad Buslov 
Date: Tue, 14 Aug 2018 21:46:16 +0300

> Recently, ops->init() and ops->dump() of all actions were modified to
> always obtain tcf_lock when accessing private action state. Actions that
> don't depend on tcf_lock for synchronization with their data path use
> non-bh locking API. However, tcf_lock is also used to protect rate
> estimator stats in softirq context by timer callback.
> 
> Change ops->init() and ops->dump() of all actions to disable bh when using
> tcf_lock to prevent deadlock reported by following lockdep warning:
 ...
> Taking tcf_lock in sample action with bh disabled causes lockdep to issue a
> warning regarding possible irq lock inversion dependency between tcf_lock,
> and psample_groups_lock that is taken when holding tcf_lock in sample init:
 ...
> In order to prevent potential lock inversion dependency between tcf_lock
> and psample_groups_lock, extract call to psample_group_get() from tcf_lock
> protected section in sample action init function.
> 
> Fixes: 4e232818bd32 ("net: sched: act_mirred: remove dependency on rtnl lock")
> Fixes: 764e9a24480f ("net: sched: act_vlan: remove dependency on rtnl lock")
> Fixes: 729e01260989 ("net: sched: act_tunnel_key: remove dependency on rtnl lock")
> Fixes: d77284956656 ("net: sched: act_sample: remove dependency on rtnl lock")
> Fixes: e8917f437006 ("net: sched: act_gact: remove dependency on rtnl lock")
> Fixes: b6a2b971c0b0 ("net: sched: act_csum: remove dependency on rtnl lock")
> Fixes: 2142236b4584 ("net: sched: act_bpf: remove dependency on rtnl lock")
> Signed-off-by: Vlad Buslov 

Applied, thanks Vlad.


Re: [PATCH]ipv6: multicast: In mld_send_cr function moving read lock to second for loop

2018-08-18 Thread David Miller
From: Guruswamy Basavaiah 
Date: Fri, 17 Aug 2018 18:01:41 +0530

> @@ -1860,7 +1860,6 @@ static void mld_send_cr(struct inet6_dev *idev)
>  struct sk_buff *skb = NULL;
>  int type, dtype;
> 
> -read_lock_bh(&idev->lock);
>  spin_lock(&idev->mc_lock);
> 
>  /* deleted MCA's */

This will lead to deadlocks; idev->mc_lock must be taken with _bh().

I have zero confidence in this change. Did you do any stress testing
with lockdep enabled?  It would have caught this quickly.


Re: [PATCH] net: nixge: Add support for 64-bit platforms

2018-08-18 Thread David Miller
From: Moritz Fischer 
Date: Thu, 16 Aug 2018 12:07:06 -0700

> Add support for 64-bit platforms to driver.
> 
> The hardware only supports 32-bit register accesses
> so the accesses need to be split up into two writes
> when setting the current and tail descriptor values.
> 
> Cc: Florian Fainelli 
> Signed-off-by: Moritz Fischer 

Please resubmit when the net-next tree opens back up.

Thank you.
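
For readers following the thread, the splitting described in the patch text can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the nixge driver's code: the register names, the fake_regs array, and the helpers reg_write32(), lower_32(), upper_32() and write_desc_addr() are all hypothetical, and writing the low half before the high half is an assumption as well.

```c
#include <stdint.h>

/* Hypothetical register file standing in for the device's
 * memory-mapped registers; the names are made up for this sketch. */
enum { REG_CURDESC_LO, REG_CURDESC_HI, REG_TAILDESC_LO, REG_TAILDESC_HI, REG_COUNT };

static uint32_t fake_regs[REG_COUNT];

/* The hardware only accepts 32-bit register accesses. */
static void reg_write32(unsigned int reg, uint32_t val)
{
	fake_regs[reg] = val;
}

/* Low and high halves of a 64-bit value. */
static uint32_t lower_32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

static uint32_t upper_32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Program a 64-bit descriptor address as two 32-bit writes.
 * The write order (low, then high) is an assumption. */
static void write_desc_addr(unsigned int lo_reg, unsigned int hi_reg, uint64_t addr)
{
	reg_write32(lo_reg, lower_32(addr));
	reg_write32(hi_reg, upper_32(addr));
}
```

Calling write_desc_addr(REG_CURDESC_LO, REG_CURDESC_HI, addr) with a 64-bit DMA address leaves the low 32 bits in the first register and the high 32 bits in the second. On a 32-bit platform the high half is simply zero.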


Re: pull-request: bpf 2018-08-18

2018-08-18 Thread David Miller
From: Daniel Borkmann 
Date: Sat, 18 Aug 2018 01:29:20 +0200

> The following pull-request contains BPF updates for your *net* tree.
> 
> The main changes are:
 ...
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Pulled, thanks.


Re: [PATCH] sunhme: convert printk to pr_cont

2018-08-17 Thread David Miller
From: Mikulas Patocka 
Date: Fri, 17 Aug 2018 16:08:49 -0400 (EDT)

> I'm not an expert on networking code - you can change it if it is more 
> appropriate this way.

What Stephen is asking of you doesn't require networking expertise
and he even gave you an example of how to do it.  All you would need
to do is test his suggestion and make sure it works properly.


Re: [PATCH] sunhme: convert printk to pr_cont

2018-08-17 Thread David Miller
From: Mikulas Patocka 
Date: Fri, 17 Aug 2018 15:12:22 -0400 (EDT)

> The kernel adds newlines automatically unless pr_cont is used. This patch
> converts sunhme to use pr_cont, so that the messages are not broken to
> multiple lines.
> 
> The patch also adds "\n" to a few strings that were missing it.
> 
> Signed-off-by: Mikulas Patocka 
> Cc: sta...@vger.kernel.org

"stable", are you sure?  What crash or memory corruption do these
added newlines in the kernel log cause?

I don't think this is appropriate for -stable, sorry.

At best this is net-next material, and that tree is closed right now.

Please resubmit this when the net-next tree opens back up again,
thanks.

