Re: [PATCH net] VSOCK: check sk state before receive

2018-09-21 Thread Hangbin Liu
On Fri, Sep 21, 2018 at 07:48:25AM +, Jorgen S. Hansen wrote: > Hi Hangbin, > > I finally got to the bottom of this - the issue was indeed in the VMCI driver. > The patch is posted here: > > https://lkml.org/lkml/2018/9/21/326 > > I used your reproduce.log to test the fix. Thanks for discove

Re: [PATCH rdma-next 0/4] mlx5 vport loopback

2018-09-21 Thread Doug Ledford
On Mon, 2018-09-17 at 13:30 +0300, Leon Romanovsky wrote: > From: Leon Romanovsky > > Hi, > > This is short series from Mark which extends handling of loopback > traffic. Originally mlx5 IB dynamically enabled/disabled both unicast > and multicast based on number of users. However RAW ethernet Q

Re: [PATCH bpf-next] samples/bpf: fix compilation failure

2018-09-21 Thread Daniel Borkmann
On 09/20/2018 09:52 AM, Prashant Bhole wrote: > following commit: > commit d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") > added struct bpf_flow_keys which conflicts with the struct with > same name in sockex2_kern.c and sockex3_kern.c > > similar to commit: > commit 534e0e52

Re: [bpf PATCH v4 0/3] bpf, sockmap ESTABLISHED state only

2018-09-21 Thread Daniel Borkmann
On 09/18/2018 06:01 PM, John Fastabend wrote: > Eric noted that using the close callback is not sufficient > to catch all transitions from ESTABLISHED state to a LISTEN > state. So this series does two things. First, only allow > adding socks in ESTABLISH state and second use unhash callback > to c

Re: [PATCH rdma-next 0/4] mlx5 vport loopback

2018-09-21 Thread Doug Ledford
On Sat, 2018-09-22 at 00:40 +0300, Leon Romanovsky wrote: > On Fri, Sep 21, 2018 at 04:05:53PM -0400, Doug Ledford wrote: > > On Fri, 2018-09-21 at 22:33 +0300, Leon Romanovsky wrote: > > > Hope it makes it clear now. > > > > Clear enough. Between yours and Jason's explanation I think it's well >

[PATCH net-next 2/2] net: dsa: b53: Also include SGMII for mac_config and mac_link_state

2018-09-21 Thread Florian Fainelli
In both 802.3z and SGMII modes we need to configure the MAC accordingly to flip between Fiber and SGMII modes, and we need to read the MAC status from the SGMII in-band control word. Fixes: 0e01491de646 ("net: dsa: b53: Add SerDes support") Signed-off-by: Florian Fainelli --- drivers/net/dsa/b53

[PATCH net-next 1/2] net: dsa: b53: Fix B53_SERDES_DIGITAL_CONTROL offset

2018-09-21 Thread Florian Fainelli
Maths went wrong, to get 0x20, we need to do 0x1e + (x) * 2, not 0x18, fix that offset so we access the correct registers. This would make us not access the correct SerDes Digital control words, status would be fine and so we would not be correctly flipping between Fiber and SGMII modes resulting i
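
The offset arithmetic described in this patch can be checked with a small sketch. The function names below are hypothetical and exist only for illustration; the two base values (0x18 and 0x1e) and the target register 0x20 come from the commit message, and x = 1 is the index that satisfies 0x1e + x * 2 = 0x20.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the offset formula the message describes as wrong. */
uint8_t sketch_digital_control_old(unsigned int x)
{
    return (uint8_t)(0x18 + x * 2);
}

/* Hypothetical helper: the corrected formula from the message. */
uint8_t sketch_digital_control_fixed(unsigned int x)
{
    return (uint8_t)(0x1e + x * 2);
}

/* Self-check: index 1 must land on 0x20 with the fixed base,
 * whereas the old base lands on 0x1a instead. */
void sketch_check_digital_control(void)
{
    assert(sketch_digital_control_fixed(1) == 0x20);
    assert(sketch_digital_control_old(1) == 0x1a);
}
```

With the old base, every indexed access is 6 bytes too low, which matches the message's point that the wrong control words were being accessed.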

[PATCH net-next 0/2] net: dsa: b53: SGMII modes fixes

2018-09-21 Thread Florian Fainelli
Hi David, Here are two additional fixes that are required in order for SGMII to work correctly. This was discovered with using a copper SFP which would make us use SGMII mode, we would actually leave the HW configured in its default mode: Fiber. Florian Fainelli (2): net: dsa: b53: Fix B53_SERD

[PATCH net-next] net: dsa: b53: Don't assign autonegotiation enabled

2018-09-21 Thread Florian Fainelli
PHYLINK takes care of filling the right information into state->an_enabled, get rid of the read from the SerDes's BMCR register. Fixes: 0e01491de646 ("net: dsa: b53: Add SerDes support") Signed-off-by: Florian Fainelli --- drivers/net/dsa/b53/b53_serdes.c | 5 + 1 file changed, 1 insertion(+)

Re: [PATCH net 14/15] nfp: remove ndo_poll_controller

2018-09-21 Thread Jakub Kicinski
On Fri, 21 Sep 2018 15:27:51 -0700, Eric Dumazet wrote: > As diagnosed by Song Liu, ndo_poll_controller() can > be very dangerous on loaded hosts, since the cpu > calling ndo_poll_controller() might steal all NAPI > contexts (for all RX/TX queues of the NIC). This capture > can last for unlimited a

[PATCH net] net: phy: fix WoL handling when suspending the PHY

2018-09-21 Thread Heiner Kallweit
Actually there's nothing wrong with the two changes, they just revealed a problem which has been existing before. Core of the problem is that phy_suspend() suspends the PHY when it should not because of WoL. phy_suspend() checks for WoL already, but this works only if the PHY driver handles WoL (w

Re: [PATCH perf 3/3] tools/perf: recognize and process RECORD_MMAP events for bpf progs

2018-09-21 Thread Song Liu
On Wed, Sep 19, 2018 at 3:51 PM Alexei Starovoitov wrote: > > Recognize JITed bpf prog load/unload events. > Add/remove kernel symbols accordingly. > > Signed-off-by: Alexei Starovoitov Acked-by: Song Liu > --- > tools/perf/util/machine.c | 27 +++ > tools/perf/util/sy

[PATCH net 13/15] bnxt: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 14/15] nfp: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

Re: kernel 4.18.5 Realtek 8111G network adapter stops responding under high system load

2018-09-21 Thread Maciej S. Szmigiero
On 19.09.2018 18:34, David Arendt wrote: > Hi, > > the networking problem did not occur for 12 hours now, so I think this > patch resolved the problem. @Heiner: It seems that the regression was introduced by your commit 4fd48c4ac0a0 ("r8169: move common initializations to tp->hw_start"). Will yo

[PATCH net 15/15] tun: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 00/15] netpoll: avoid capture effects for NAPI drivers

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture, showing one ksoftirqd eating all cycles can last for unlimited amount of time, since one

[PATCH net 08/15] ice: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 09/15] i40evf: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 01/15] netpoll: make ndo_poll_controller() optional

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 12/15] bnx2x: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 10/15] mlx4: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 11/15] mlx5: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 04/15] ixgbevf: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 03/15] ixgbe: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 06/15] ixgb: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

[PATCH net 02/15] bonding: use netpoll_poll_dev() helper

2018-09-21 Thread Eric Dumazet
We want to allow NAPI drivers to no longer provide ndo_poll_controller() method, as it has been proven problematic. The bonding driver must not look at its presence, but instead call netpoll_poll_dev() which factorizes the needed actions. Signed-off-by: Eric Dumazet Cc: Jay Vosburgh Cc: Veaceslav Falic

[PATCH net 05/15] fm10k: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture lasts for unlimited amount of time, since one cpu is generally not able to drain all the q

[PATCH net 07/15] igb: remove ndo_poll_controller

2018-09-21 Thread Eric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all th

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-21 Thread Alexei Starovoitov
On Fri, Sep 21, 2018 at 09:25:00AM -0300, Arnaldo Carvalho de Melo wrote: > > > I have considered adding MUNMAP to match existing MMAP, but went > > without it because I didn't want to introduce new bit in perf_event_attr > > and emit these new events in a misbalanced conditional way for prog >

Re: [PATCH rdma-next 0/4] mlx5 vport loopback

2018-09-21 Thread Leon Romanovsky
On Fri, Sep 21, 2018 at 04:05:53PM -0400, Doug Ledford wrote: > On Fri, 2018-09-21 at 22:33 +0300, Leon Romanovsky wrote: > > Hope it makes it clear now. > > Clear enough. Between yours and Jason's explanation I think it's well > covered. > > > Are you ok with me/Saeed taking first patch to our br

[PATCH v2 net] mpls: allow routes on ip6gre devices

2018-09-21 Thread Saif Hasan
Summary: This appears to be a necessary and sufficient change to enable `MPLS` on `ip6gre` tunnels (RFC4023). This diff allows IP6GRE devices to be recognized by the MPLS kernel module, so the user can configure an interface to accept packets with mpls headers as well as set up mpls routes on them. Test Pl

[PATCH] mpls: allow routes on ip6gre devices

2018-09-21 Thread Saif Hasan
Summary: This appears to be a necessary and sufficient change to enable `MPLS` on `ip6gre` tunnels (RFC4023). This diff allows IP6GRE devices to be recognized by the MPLS kernel module, so the user can configure an interface to accept packets with mpls headers as well as set up mpls routes on them. Test Pl

[PATCH iproute2 1/1] Makefile: Add check target

2018-09-21 Thread Petr Vorel
Signed-off-by: Petr Vorel --- Makefile | 4 1 file changed, 4 insertions(+) diff --git a/Makefile b/Makefile index 25de3893..b7488add 100644 --- a/Makefile +++ b/Makefile @@ -77,6 +77,7 @@ help: @echo " clean - remove products of build" @echo " distclean

Re: [PATCH iproute2] Makefile: add help target

2018-09-21 Thread David Ahern
On 9/21/18 9:16 AM, Stephen Hemminger wrote: > Add help target to Makefile. > > Signed-off-by: Stephen Hemminger > --- > Makefile | 12 > 1 file changed, 12 insertions(+) > Acked-by: David Ahern

Re: [PATCH rdma-next 0/4] mlx5 vport loopback

2018-09-21 Thread Doug Ledford
On Fri, 2018-09-21 at 22:33 +0300, Leon Romanovsky wrote: > Hope it makes it clear now. Clear enough. Between yours and Jason's explanation I think it's well covered. > Are you ok with me/Saeed taking first patch to our branch so you will be > able to take the rest? Yep. Let me know a tag when

Re: [PATCH rdma-next 0/4] mlx5 vport loopback

2018-09-21 Thread Jason Gunthorpe
On Fri, Sep 21, 2018 at 03:14:36PM -0400, Doug Ledford wrote: > On Mon, 2018-09-17 at 13:30 +0300, Leon Romanovsky wrote: > > From: Leon Romanovsky > > > > Hi, > > > > This is short series from Mark which extends handling of loopback > > traffic. Originally mlx5 IB dynamically enabled/disabled b

Re: [PATCH net-next v2 08/10] net: sched: protect block idr with spinlock

2018-09-21 Thread Cong Wang
On Thu, Sep 20, 2018 at 12:36 AM Vlad Buslov wrote: > > > On Wed 19 Sep 2018 at 22:09, Cong Wang wrote: > > On Mon, Sep 17, 2018 at 12:19 AM Vlad Buslov wrote: > >> @@ -482,16 +483,25 @@ static int tcf_block_insert(struct tcf_block *block, > >> struct net *net, > >>

Re: [PATCH rdma-next 0/4] mlx5 vport loopback

2018-09-21 Thread Leon Romanovsky
On Fri, Sep 21, 2018 at 03:14:36PM -0400, Doug Ledford wrote: > On Mon, 2018-09-17 at 13:30 +0300, Leon Romanovsky wrote: > > From: Leon Romanovsky > > > > Hi, > > > > This is short series from Mark which extends handling of loopback > > traffic. Originally mlx5 IB dynamically enabled/disabled bot

Re: [PATCH net-next v2 05/10] net: sched: use Qdisc rcu API instead of relying on rtnl lock

2018-09-21 Thread Cong Wang
On Thu, Sep 20, 2018 at 12:21 AM Vlad Buslov wrote: > > > On Wed 19 Sep 2018 at 22:04, Cong Wang wrote: > > On Mon, Sep 17, 2018 at 12:19 AM Vlad Buslov wrote: > >> +static void tcf_qdisc_put(struct Qdisc *q, bool rtnl_held) > >> +{ > >> + if (!q) > >> + return; > >> + > >> +

Re: [PATCH rdma-next 0/4] mlx5 vport loopback

2018-09-21 Thread Doug Ledford
On Mon, 2018-09-17 at 13:30 +0300, Leon Romanovsky wrote: > From: Leon Romanovsky > > Hi, > > This is short series from Mark which extends handling of loopback > traffic. Originally mlx5 IB dynamically enabled/disabled both unicast > and multicast based on number of users. However RAW ethernet Q

Re: [PATCH][next-next][v2] netlink: avoid to allocate full skb when sending to many devices

2018-09-21 Thread Cong Wang
On Thu, Sep 20, 2018 at 1:58 AM Li RongQing wrote: > > if skb->head is vmalloc address, when this skb is delivered, full > allocation for this skb is required, if there are many devices, > the full allocation will be called for every devices So why do you in practice need many netlink tap devices

Re: [PATCH net] ip6_tunnel: be careful when accessing the inner header

2018-09-21 Thread Cong Wang
On Wed, Sep 19, 2018 at 6:04 AM Paolo Abeni wrote: > diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c > index 419960b0ba16..a0b6932c3afd 100644 > --- a/net/ipv6/ip6_tunnel.c > +++ b/net/ipv6/ip6_tunnel.c > @@ -1234,7 +1234,7 @@ static inline int > ip4ip6_tnl_xmit(struct sk_buff *skb, st

Re: [PATCH net] af_key: free SKBs under RCU protection

2018-09-21 Thread stranche
On 2018-09-21 11:40, Eric Dumazet wrote: On 09/21/2018 10:09 AM, stran...@codeaurora.org wrote: I also tried reverting 7f6b9dbd5afb ("af_key: locking change") and running the test there and I still see the crash, so it doesn't seem to be an RCU specific issue. Is there anything else that cou

[PATCH net-next 2/3] net/ipfrag: let ip[6]frag_high_thresh in ns be higher than in init_net

2018-09-21 Thread Peter Oskolkov
Currently, ip[6]frag_high_thresh sysctl values in new namespaces are hard-limited to those of the root/init ns. There are at least two use cases when it would be desirable to set the high_thresh values higher in a child namespace vs the global hard limit: - a security/ddos protection policy may l

[PATCH net-next 1/3] ipv6: discard IP frag queue on more errors

2018-09-21 Thread Peter Oskolkov
This is similar to how ipv4 now behaves: commit 0ff89efb5246 ("ip: fail fast on IP defrag errors"). Signed-off-by: Peter Oskolkov --- net/ipv6/reassembly.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index f1b1ff30

Re: [Intel-wired-lan] [PATCH v2 1/4] i40e: clean zero-copy XDP Tx ring on shutdown/reset

2018-09-21 Thread Jeff Kirsher
On Fri, 2018-09-21 at 09:35 +0200, Björn Töpel wrote: > > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > @@ -830,3 +830,33 @@ int i40e_xsk_async_xmit(struct net_device > > *dev, u32 queue_id) > > > > return 0; > > } > > + > > +/

[PATCH net-next 3/3] selftests/net: add ipv6 tests to ip_defrag selftest

2018-09-21 Thread Peter Oskolkov
This patch adds ipv6 defragmentation tests to ip_defrag selftest, to complement existing ipv4 tests. Signed-off-by: Peter Oskolkov --- tools/testing/selftests/net/ip_defrag.c | 249 +++ tools/testing/selftests/net/ip_defrag.sh | 39 ++-- 2 files changed, 190 insertions(+),

Re: [PATCH net] net/ipv4: avoid compile error in fib_info_nh_uses_dev

2018-09-21 Thread David Ahern
On 9/21/18 10:58 AM, Eric Dumazet wrote: > net/ipv4/fib_frontend.c: In function 'fib_info_nh_uses_dev': > net/ipv4/fib_frontend.c:322:6: error: unused variable 'ret' > [-Werror=unused-variable] > cc1: all warnings being treated as errors > > Fixes: 78f2756c5fc0 ("net/ipv4: Move device validation

Re: [PATCH net] net/ipv4: avoid compile error in fib_info_nh_uses_dev

2018-09-21 Thread Eric Dumazet
On 09/21/2018 10:58 AM, Eric Dumazet wrote: > net/ipv4/fib_frontend.c: In function 'fib_info_nh_uses_dev': > net/ipv4/fib_frontend.c:322:6: error: unused variable 'ret' > [-Werror=unused-variable] > cc1: all warnings being treated as errors > > Fixes: 78f2756c5fc0 ("net/ipv4: Move device valid

[PATCH net] net/ipv4: avoid compile error in fib_info_nh_uses_dev

2018-09-21 Thread Eric Dumazet
net/ipv4/fib_frontend.c: In function 'fib_info_nh_uses_dev': net/ipv4/fib_frontend.c:322:6: error: unused variable 'ret' [-Werror=unused-variable] cc1: all warnings being treated as errors Fixes: 78f2756c5fc0 ("net/ipv4: Move device validation to helper") Signed-off-by: Eric Dumazet Cc: David Ah

Re: [PATCH net] af_key: free SKBs under RCU protection

2018-09-21 Thread Eric Dumazet
On 09/21/2018 10:09 AM, stran...@codeaurora.org wrote: > I also tried reverting 7f6b9dbd5afb ("af_key: locking change") and running the > test there and I still see the crash, so it doesn't seem to be an RCU specific > issue. > > Is there anything else that could be causing this? What about y

Re: [PATCH net-next] ravb: Disable Pause Advertisement

2018-09-21 Thread Sergei Shtylyov
Hello! You forgot to CC me. :-/ On 09/21/2018 04:52 PM, Andrew Lunn wrote: > The previous commit to ravb had the side effect of making the PHY > advertise Pause and Asym Pause, which previously did not happen. By > default, phydev->supported has both forms of pause enabled, but > phydev->adv

[PATCHv2 bpf-next 11/11] Documentation: Describe bpf reference tracking

2018-09-21 Thread Joe Stringer
Document the new pointer types in the verifier and how the pointer ID tracking works to ensure that references which are taken are later released. Signed-off-by: Joe Stringer Acked-by: Alexei Starovoitov --- Documentation/networking/filter.txt | 64 + 1 file changed,

[PATCHv2 bpf-next 10/11] selftests/bpf: Add C tests for reference tracking

2018-09-21 Thread Joe Stringer
Add some tests that demonstrate and test the balanced lookup/free nature of socket lookup. Section names that start with "fail" represent programs that are expected to fail verification; all others should succeed. Signed-off-by: Joe Stringer Acked-by: Alexei Starovoitov --- tools/testing/selfte

[PATCHv2 bpf-next 08/11] selftests/bpf: Add tests for reference tracking

2018-09-21 Thread Joe Stringer
reference tracking: leak potential reference reference tracking: leak potential reference on stack reference tracking: leak potential reference on stack 2 reference tracking: zero potential reference reference tracking: copy and zero potential references reference tracking: release reference withou

[PATCHv2 bpf-next 09/11] libbpf: Support loading individual progs

2018-09-21 Thread Joe Stringer
Allow the individual program load to be invoked. This will help with testing, where a single ELF may contain several sections, some of which denote subprograms that are expected to fail verification, along with some which are expected to pass verification. By allowing programs to be iterated and in

[PATCHv2 bpf-next 04/11] bpf: Add PTR_TO_SOCKET verifier type

2018-09-21 Thread Joe Stringer
Teach the verifier a little bit about a new type of pointer, a PTR_TO_SOCKET. This pointer type is accessed from BPF through the 'struct bpf_sock' structure. Signed-off-by: Joe Stringer --- v2: Reuse reg_type_mismatch() in more places Reduce the number of passes at convert_ctx_access() ---

[PATCHv2 bpf-next 05/11] bpf: Macrofy stack state copy

2018-09-21 Thread Joe Stringer
An upcoming commit will need very similar copy/realloc boilerplate, so refactor the existing stack copy/realloc functions into macros to simplify it. Signed-off-by: Joe Stringer Acked-by: Alexei Starovoitov --- kernel/bpf/verifier.c | 106 -- 1 file chang

[PATCHv2 bpf-next 07/11] bpf: Add helper to retrieve socket in BPF

2018-09-21 Thread Joe Stringer
This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and bpf_sk_lookup_udp() which allows BPF programs to find out if there is a socket listening on this host, and returns a socket pointer which the BPF program can then access to determine, for instance, whether to forward or drop traffic.

[PATCHv2 bpf-next 02/11] bpf: Simplify ptr_min_max_vals adjustment

2018-09-21 Thread Joe Stringer
An upcoming commit will add another two pointer types that need very similar behaviour, so generalise this function now. Signed-off-by: Joe Stringer Acked-by: Alexei Starovoitov --- kernel/bpf/verifier.c | 22 ++--- tools/testing/selftests/bpf/test_verifier

[PATCHv2 bpf-next 06/11] bpf: Add reference tracking to verifier

2018-09-21 Thread Joe Stringer
Allow helper functions to acquire a reference and return it into a register. Specific pointer types such as the PTR_TO_SOCKET will implicitly represent such a reference. The verifier must ensure that these references are released exactly once in each path through the program. To achieve this, this

[PATCHv2 bpf-next 03/11] bpf: Generalize ptr_or_null regs check

2018-09-21 Thread Joe Stringer
This check will be reused by an upcoming commit for conditional jump checks for sockets. Refactor it a bit to simplify the later commit. Signed-off-by: Joe Stringer Acked-by: Alexei Starovoitov --- kernel/bpf/verifier.c | 43 +-- 1 file changed, 25 insert

[PATCHv2 bpf-next 01/11] bpf: Add iterator for spilled registers

2018-09-21 Thread Joe Stringer
Add this iterator for spilled registers, it concentrates the details of how to get the current frame's spilled registers into a single macro while clarifying the intention of the code which is calling the macro. Signed-off-by: Joe Stringer Acked-by: Alexei Starovoitov --- include/linux/bpf_veri

[PATCHv2 bpf-next 00/11] Add socket lookup support

2018-09-21 Thread Joe Stringer
This series proposes a new helper for the BPF API which allows BPF programs to perform lookups for sockets in a network namespace. This would allow programs to determine early on in processing whether the stack is expecting to receive the packet, and perform some action (eg drop, forward somewhere)

Re: [PATCH net] af_key: free SKBs under RCU protection

2018-09-21 Thread stranche
As long as one skb has sock_rfree as its destructor, the socket attached to this skb can not be released. There is no race here. Note that skb_clone() does not propagate the destructor. The issue here is that in the rcu lookup, we can find a socket that has been dismantled, with a 0 refcou

Re: [PATCH iproute2] iplink_vxlan: take into account preferred_family creating vxlan device

2018-09-21 Thread Lorenzo Bianconi
> On Fri, 21 Sep 2018 15:34:25 +0200 > Lorenzo Bianconi wrote: > > > Take into account the configured preferred_family if neither saddr or > > daddr are provided since otherwise vxlan kernel module will use IPv4 as > > default remote inet family neglecting the one provided by userspace. > > This

Re: [PATCH iproute2 v2 0/3] testsuite: make alltests fixes

2018-09-21 Thread Stephen Hemminger
On Thu, 20 Sep 2018 01:36:21 +0200 Petr Vorel wrote: > Hi, > > here are simply fixes to restore 'make alltests'. > Currently it does not run. > > Kind regards, > Petr > > Petr Vorel (3): > testsuite: Fix missing generate_nlmsg > testsuite: Generate generate_nlmsg when needed > testsuite:

[PATCH iproute2] Makefile: add help target

2018-09-21 Thread Stephen Hemminger
Add help target to Makefile. Signed-off-by: Stephen Hemminger --- Makefile | 12 1 file changed, 12 insertions(+) diff --git a/Makefile b/Makefile index ea2f797c933f..25de3893cae4 100644 --- a/Makefile +++ b/Makefile @@ -71,6 +71,18 @@ all: config.mk for i in $(SUBDIRS); \

Re: [PATCH ipsec] net: xfrm: pass constant family to nf_hook function

2018-09-21 Thread Florian Westphal
David Ahern wrote: > On 9/21/18 8:55 AM, Florian Westphal wrote: > > David Ahern wrote: > >>> David, i hope this will silence the warning, would be nice > >>> if you could test it. > >> > >> I still see the warning. > > > > Wait. Do you see this warning everywhere or just in xfrm? > > > > just

Re: [PATCH iproute2] iplink_vxlan: take into account preferred_family creating vxlan device

2018-09-21 Thread Stephen Hemminger
On Fri, 21 Sep 2018 15:34:25 +0200 Lorenzo Bianconi wrote: > Take into account the configured preferred_family if neither saddr or > daddr are provided since otherwise vxlan kernel module will use IPv4 as > default remote inet family neglecting the one provided by userspace. > This behaviour was

Re: [PATCH ipsec] net: xfrm: pass constant family to nf_hook function

2018-09-21 Thread David Ahern
On 9/21/18 8:55 AM, Florian Westphal wrote: > David Ahern wrote: >>> David, i hope this will silence the warning, would be nice >>> if you could test it. >> >> I still see the warning. > > Wait. Do you see this warning everywhere or just in xfrm? > just the one file.

Re: [PATCH ipsec] net: xfrm: pass constant family to nf_hook function

2018-09-21 Thread Florian Westphal
David Ahern wrote: > > David, i hope this will silence the warning, would be nice > > if you could test it. > > I still see the warning. Wait. Do you see this warning everywhere or just in xfrm?

Re: [PATCH ipsec] net: xfrm: pass constant family to nf_hook function

2018-09-21 Thread David Ahern
On 9/21/18 3:35 AM, Florian Westphal wrote: > Unfortunately some versions of gcc emit following warning: > linux/compiler.h:252:20: warning: array subscript is above array bounds > [-Warray-bounds] > hook_head = rcu_dereference(net->nf.hooks_arp[hook]); > ^~~~

[PATCH net-next 6/9] tcp: switch internal pacing timer to CLOCK_TAI

2018-09-21 Thread Eric Dumazet
Next patch will use tcp_wstamp_ns to feed internal TCP pacing timer, so switch to CLOCK_TAI to share same base. Signed-off-by: Eric Dumazet --- net/ipv4/tcp_output.c | 2 +- net/ipv4/tcp_timer.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_output.c b/net/i

[PATCH net-next 5/9] tcp: provide earliest departure time in skb->tstamp

2018-09-21 Thread Eric Dumazet
Switch internal TCP skb->skb_mstamp to skb->skb_mstamp_ns, from usec units to nsec units. Do not clear skb->tstamp before entering IP stacks in TX, so that qdisc or devices can implement pacing based on the earliest departure time instead of socket sk->sk_pacing_rate Packets are fed with tcp_wsta

[PATCH net-next 9/9] net_sched: sch_fq: remove dead code dealing with retransmits

2018-09-21 Thread Eric Dumazet
With the earliest departure time model, we no longer plan special casing TCP retransmits. We therefore remove dead code (since most compilers understood skb_is_retransmit() was false) Signed-off-by: Eric Dumazet --- net/sched/sch_fq.c | 58 -- 1 file c

[PATCH net-next 7/9] tcp: switch tcp and sch_fq to new earliest departure time model

2018-09-21 Thread Eric Dumazet
TCP keeps track of tcp_wstamp_ns by itself, meaning sch_fq no longer has to do it. Thanks to this model, TCP can get more accurate RTT samples, since pacing no longer inflates them. This has the nice effect of removing some delays caused by FQ quantum mechanism, causing inflated max/P99 latencies

[PATCH net-next 8/9] tcp: switch tcp_internal_pacing() to tcp_wstamp_ns

2018-09-21 Thread Eric Dumazet
Now that TCP keeps track of tcp_wstamp_ns, recording the earliest departure time of the next packet, we can remove duplicate code from tcp_internal_pacing(). This removes one ktime_get_tai_ns() call, and a divide. Signed-off-by: Eric Dumazet --- net/ipv4/tcp_output.c | 17 - 1 file change
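
The idea of recording the earliest departure time of the next packet can be sketched in user-space C. This is an illustrative model only, not the kernel code: the struct, the function name, and the choice to express the pacing rate in bytes per second are all assumptions made for the example.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-connection state: when the next packet may leave. */
struct sketch_pacer {
    uint64_t wstamp_ns;
};

/* Returns the departure time assigned to this packet and advances the
 * stored timestamp by the time the packet occupies at the pacing rate. */
uint64_t sketch_pace(struct sketch_pacer *p, uint64_t now_ns,
                     uint32_t len_bytes, uint64_t rate_bytes_per_sec)
{
    /* A packet can never leave earlier than "now". */
    uint64_t departure = p->wstamp_ns > now_ns ? p->wstamp_ns : now_ns;

    p->wstamp_ns = departure;
    if (rate_bytes_per_sec != 0)
        p->wstamp_ns += (uint64_t)len_bytes * SKETCH_NSEC_PER_SEC /
                        rate_bytes_per_sec;
    return departure;
}
```

Because the departure time of the next packet is already stored, a timer that needs to wait for it can reuse the stored value instead of reading the clock and dividing again, which is the kind of duplication the patch description mentions removing.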

[PATCH net-next 0/9] tcp: switch to Early Departure Time model

2018-09-21 Thread Eric Dumazet
In the early days, pacing has been implemented in sch_fq (FQ) in a generic way : - SO_MAX_PACING_RATE could be used by any sockets. - TCP would vary effective pacing rate based on CWND*MSS/SRTT - FQ would ensure delays between packets based on current sk->sk_pacing_rate, but with some quantum

[PATCH net-next 3/9] net_sched: sch_fq: switch to CLOCK_TAI

2018-09-21 Thread Eric Dumazet
TCP will soon provide per skb->tstamp with earliest departure time, so that sch_fq does not have to determine departure time by looking at socket sk_pacing_rate. We chose in linux-4.19 CLOCK_TAI as the clock base for transports, qdiscs, and NIC offloads. Signed-off-by: Eric Dumazet --- net/sche

[PATCH net-next 4/9] tcp: add tcp_wstamp_ns socket field

2018-09-21 Thread Eric Dumazet
TCP will soon provide earliest departure time on TX skbs. It needs to track this in a new variable. tcp_mstamp_refresh() needs to update this variable, and became too big to stay an inline. Signed-off-by: Eric Dumazet --- include/linux/tcp.h | 2 ++ include/net/tcp.h | 12 +--- n

[PATCH net-next 2/9] tcp: introduce tcp_skb_timestamp_us() helper

2018-09-21 Thread Eric Dumazet
There are few places where TCP reads skb->skb_mstamp expecting a value in usec unit. skb->tstamp (aka skb->skb_mstamp) will soon store CLOCK_TAI nsec value. Add tcp_skb_timestamp_us() to provide proper conversion when needed. Signed-off-by: Eric Dumazet --- include/net/tcp.h | 8 +++
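
The conversion this helper is described as providing amounts to scaling a nanosecond timestamp down to microseconds. A minimal user-space sketch, assuming a plain integer division and using a hypothetical function name rather than the kernel helper itself:

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_USEC 1000ULL

/* Hypothetical stand-in: convert a nanosecond timestamp to microseconds,
 * truncating any sub-microsecond remainder. */
uint64_t sketch_timestamp_us(uint64_t tstamp_ns)
{
    return tstamp_ns / SKETCH_NSEC_PER_USEC;
}
```

Callers that still expect usec units would read the timestamp through such a helper, so the stored value can move to nsec units without changing their arithmetic.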

[PATCH net-next 1/9] tcp: switch tcp_clock_ns() to CLOCK_TAI base

2018-09-21 Thread Eric Dumazet
TCP pacing is either implemented in sch_fq or internally. We have the goal of being able to offload pacing on the NICs. TCP will soon provide per skb skb->tstamp as early departure time. Like ETF in commit 25db26a91364 ("net/sched: Introduce the ETF Qdisc") we chose CLOCK_TAI as the clock base, so

Re: [PATCH net-next] cxgb4vf: Add ethtool private flags for changing force_link_up

2018-09-21 Thread Jakub Kicinski
On Fri, 21 Sep 2018 16:16:31 +0530, Arjun Vynipadath wrote: > On Tuesday, September 09/18/18, 2018 at 11:39:14 -0700, Jakub Kicinski wrote: > > On Tue, 18 Sep 2018 18:37:23 +0530, Arjun Vynipadath wrote: > > > Forcing link up of virtual interfaces even when physical link is down > > > causes pack

Re: [PATCH net] ixgbe: check return value of napi_complete_done()

2018-09-21 Thread Eric Dumazet
On 09/21/2018 08:14 AM, Alexei Starovoitov wrote: > On 9/21/18 7:59 AM, Eric Dumazet wrote: >> >> >> On 09/21/2018 07:55 AM, Alexei Starovoitov wrote: >> >>> >>> should we remove ndo_poll_controller then? >>> My understanding that the patch helps by not letting >>> drivers do napi_schedule() for

Re: [PATCH net] ixgbe: check return value of napi_complete_done()

2018-09-21 Thread Alexei Starovoitov
On 9/21/18 7:59 AM, Eric Dumazet wrote: > > > On 09/21/2018 07:55 AM, Alexei Starovoitov wrote: > >> >> should we remove ndo_poll_controller then? >> My understanding that the patch helps by not letting >> drivers do napi_schedule() for all queues into this_cpu, right? >> But most of the drivers do

Re: [PATCH net] ixgbe: check return value of napi_complete_done()

2018-09-21 Thread Eric Dumazet
On 09/21/2018 07:55 AM, Alexei Starovoitov wrote: > > should we remove ndo_poll_controller then? > My understanding that the patch helps by not letting > drivers do napi_schedule() for all queues into this_cpu, right? > But most of the drivers do exactly that in their ndo_poll_controller > imp

Re: [PATCH net] ixgbe: check return value of napi_complete_done()

2018-09-21 Thread Alexei Starovoitov
On 9/21/18 6:33 AM, Eric Dumazet wrote: On 09/21/2018 12:17 AM, Song Liu wrote: On Sep 20, 2018, at 4:49 PM, Eric Dumazet wrote: On 09/20/2018 04:43 PM, Song Liu wrote: I tried to totally skip ndo_poll_controller() here. It did avoid hitting the issue. However, netpoll will drop (f

Re: Reply: [PATCH][next-next][v2] netlink: avoid to allocate full skb when sending to many devices

2018-09-21 Thread Eric Dumazet
On 09/20/2018 08:27 PM, Li,Rongqing wrote: > > The below change seems simple, but it increase skb allocation and > free one time, Seems fine to me. An extra skb_clone() for vmalloc-skb-users is absolute noise, compared to vmalloc()/vfree() cost. Thanks.

Re: [PATCH net] net: diag: Fix swapped src/dst in udp_dump_one.

2018-09-21 Thread David Miller
From: Lorenzo Colitti Date: Fri, 21 Sep 2018 18:46:25 +0800 > Would you take a patch to add a one-line comment saying that this is > the way it is for backwards compatibility? If that comment were there > anyone else who finds this will not spend time debugging it and > immediately know what's go

Re: [PATCH net] sctp: update dst pmtu with the correct daddr

2018-09-21 Thread David Miller
From: Xin Long Date: Fri, 21 Sep 2018 15:55:34 +0800 > It's under the protection of the sock lock, I think any other places > that want to access the address also need to acquire this sock lock > first. Hash table lookups don't even have a socket context yet, so can't hold the sock lock, but loo

Re: [PATCH net-next 0/5] vrf: allow simultaneous service instances in default and other VRFs

2018-09-21 Thread David Miller
From: David Ahern Date: Thu, 20 Sep 2018 21:28:43 -0700 > I need some time to review and more importantly test this patch set > before it is committed. I am traveling tomorrow afternoon through Sunday > evening, so I need a few days into next week to get to this. Sure, no problem.

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-21 Thread Peter Zijlstra
On Fri, Sep 21, 2018 at 09:25:00AM -0300, Arnaldo Carvalho de Melo wrote: > There is another longstanding TODO list entry: PERF_RECORD_MMAP records > should include a build-id I thought the problem was that the kernel doesn't have the build-id in the first place. So it cannot hand them out.

Re: [PATCH bpf-next 2/3] bpf: emit RECORD_MMAP events for bpf prog load/unload

2018-09-21 Thread Peter Zijlstra
On Fri, Sep 21, 2018 at 09:25:00AM -0300, Arnaldo Carvalho de Melo wrote: > > I consider synthetic perf events to be non-ABI. Meaning they're > > emitted by perf user space into perf.data and there is a convention > > on names, but it's not a kernel abi. Like RECORD_MMAP with > > event.filename ==

Re: stmmac: Race in coalesce timer and NAPI

2018-09-21 Thread Eric Dumazet
On 09/21/2018 02:19 AM, Jose Abreu wrote: > Hello, > > I'm getting a race in stmmac coalesce timer and the > napi_schedule() interrupt and I'm asking for advice. Currently, > we are scheduling NAPI in coalesce timer but this leads to > stmmac_tx_clean() deadlock because this function tries to a

[PATCH net-next] ravb: Disable Pause Advertisement

2018-09-21 Thread Andrew Lunn
The previous commit to ravb had the side effect of making the PHY advertise Pause and Asym Pause, which previously did not happen. By default, phydev->supported has both forms of pause enabled, but phydev->advertising does not. The new phy_remove_link_mode() copies phydev->supported to phydev->adv

Re: [RFC PATCH net-next v1 00/14] rename and shrink i40evf

2018-09-21 Thread Or Gerlitz
On Fri, Sep 14, 2018 at 10:17 PM, Jesse Brandeburg wrote: > On Fri, 14 Sep 2018 12:10:45 +0300 Or wrote: >> On Fri, Sep 14, 2018 at 1:31 AM, Jesse Brandeburg >> wrote: >> on what HW ring format do you standardize? do i40e/Fortville and >> ice/what's-the-intel-code-name? HWs can/use the same post

[PATCH iproute2] iplink_vxlan: take into account preferred_family creating vxlan device

2018-09-21 Thread Lorenzo Bianconi
Take into account the configured preferred_family if neither saddr nor daddr are provided since otherwise vxlan kernel module will use IPv4 as default remote inet family neglecting the one provided by userspace. This behaviour was originally in commit 97d564b90ccb ("vxlan: use preferred address fami

Re: [PATCH net] ixgbe: check return value of napi_complete_done()

2018-09-21 Thread Eric Dumazet
On 09/21/2018 12:17 AM, Song Liu wrote: > > >> On Sep 20, 2018, at 4:49 PM, Eric Dumazet wrote: >> >> >> >> On 09/20/2018 04:43 PM, Song Liu wrote: >>> >> >>> I tried to totally skip ndo_poll_controller() here. It did avoid hitting >>> the issue. However, netpoll will drop (fail to send) more
