Re: [PATCH RFC net-next] net: ipvs: Adjust gso_size for IPPROTO_TCP

2018-05-02 Thread Julian Anastasov
Hello, On Wed, 2 May 2018, Martin KaFai Lau wrote: > On Wed, May 02, 2018 at 09:38:43AM +0300, Julian Anastasov wrote: > > > > - initial traffic for port 21 does not use GSO. But after > > every packet IPVS calls maybe_update_pmtu (rt->dst.ops->update_pmtu) > > to report the reduced MTU

Re: [bpf PATCH 3/3] bpf: sockmap, fix error handling in redirect failures

2018-05-02 Thread Alexei Starovoitov
On Wed, May 02, 2018 at 10:47:37AM -0700, John Fastabend wrote: > When a redirect failure happens we release the buffers in-flight > without calling a sk_mem_uncharge(), the uncharge is called before > dropping the sock lock for the redirecte, however we missed updating > the ring start index. When

Re: [PATCH bpf 0/2] Two x86 BPF JIT fixes

2018-05-02 Thread Alexei Starovoitov
On Wed, May 02, 2018 at 08:12:21PM +0200, Daniel Borkmann wrote: > Fix two memory leaks in x86 JIT. For details, please see > individual patches in this series. Thanks! Applied to bpf tree, Thanks Daniel.

[PATCH net-next 01/10] r8169: remove unneeded check in r8168_pll_power_down

2018-05-02 Thread Heiner Kallweit
RTL_GIGA_MAC_VER_23/24 are configured by rtl_hw_start_8168cp_2() and rtl_hw_start_8168cp_3() respectively which both apply CPCMD_QUIRK_MASK, thus clearing bit ASF. Bit ASF isn't set at any other place in the driver, therefore this check can be removed. Signed-off-by: Heiner Kallweit --- drivers

[PATCH net-next 06/10] r8169: replace longer if statements with switch statements

2018-05-02 Thread Heiner Kallweit
Some longer if statements can be simplified by using switch statements instead. Signed-off-by: Heiner Kallweit --- drivers/net/ethernet/realtek/r8169.c | 54 +--- 1 file changed, 16 insertions(+), 38 deletions(-) diff --git a/drivers/net/ethernet/realtek/r8169.c b/drive

[PATCH net-next 10/10] r8169: replace get_protocol with vlan_get_protocol

2018-05-02 Thread Heiner Kallweit
This patch is basically the same as 6e74d1749a33 ("r8152: replace get_protocol with vlan_get_protocol"). Use vlan_get_protocol instead of duplicating the functionality. Signed-off-by: Heiner Kallweit --- drivers/net/ethernet/realtek/r8169.c | 16 ++-- 1 file changed, 2 insertions(+),

[PATCH net-next 05/10] r8169: simplify code by using ranges in switch clauses

2018-05-02 Thread Heiner Kallweit
Several switch statements can be significantly simplified by using case ranges. Signed-off-by: Heiner Kallweit --- drivers/net/ethernet/realtek/r8169.c | 193 +++ 1 file changed, 19 insertions(+), 174 deletions(-) diff --git a/drivers/net/ethernet/realtek/r8169.c b/driv

[PATCH net-next 07/10] r8169: drop rtl_generic_op

2018-05-02 Thread Heiner Kallweit
Only two places are left where rtl_generic_op() is used, so we can inline it and simplify the code a little. This change also avoids the overhead of unlocking/locking in case the respective operation isn't set. Signed-off-by: Heiner Kallweit --- drivers/net/ethernet/realtek/r8169.c | 23

[PATCH net-next 09/10] r8169: avoid potentially misaligned access when getting mac address

2018-05-02 Thread Heiner Kallweit
Interpreting a member of a u16 array as u32 may result in a misaligned access. Also, it's not really intuitive to define a MAC address variable as an array of three u16 words. Therefore use an array of six bytes that is properly aligned for 32-bit access. Signed-off-by: Heiner Kallweit --- drivers/
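For illustration only, a minimal user-space sketch of the idea described in this patch, not the driver's actual code: the six address bytes live in a byte array that is explicitly aligned to 4, so reading the first four bytes as one 32-bit word and the last two as one 16-bit word never touches a misaligned address. The names struct ex_mac, ex_mac_low32() and ex_mac_high16() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container: six MAC bytes, aligned for 32-bit access. */
struct ex_mac {
    _Alignas(4) uint8_t addr[6];
};

/* Read bytes 0..3 as one 32-bit word (host byte order). */
static uint32_t ex_mac_low32(const struct ex_mac *m)
{
    uint32_t v;

    memcpy(&v, m->addr, sizeof(v));
    return v;
}

/* Read bytes 4..5 as one 16-bit word (host byte order). */
static uint16_t ex_mac_high16(const struct ex_mac *m)
{
    uint16_t v;

    memcpy(&v, m->addr + 4, sizeof(v));
    return v;
}
```

The sketch uses memcpy() for the loads so that it stays valid C regardless of aliasing rules; the alignment attribute is what guarantees the 32-bit read is naturally aligned.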

[PATCH net-next 08/10] r8169: improve PCI config space access

2018-05-02 Thread Heiner Kallweit
Some chips have a non-zero function id, however instead of hardcoding the id's (CSIAR_FUNC_NIC and CSIAR_FUNC_NIC2) we can get them dynamically via PCI_FUNC(pci_dev->devfn). This way we can get rid of the csi_ops. In general csi is just a fallback mechanism for PCI config space access in case no n
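As a rough sketch of what "get them dynamically via PCI_FUNC(pci_dev->devfn)" means: a PCI devfn byte packs a 5-bit slot number and a 3-bit function number, and the function id is simply the low three bits. The EX_ macros below are local copies written to mirror the kernel's devfn helpers so the example builds in user space, and ex_csi_func_id() is a hypothetical name, not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel's devfn helpers: devfn = slot:5 | func:3. */
#define EX_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define EX_PCI_FUNC(devfn)       ((devfn) & 0x07)

/*
 * Hypothetical helper: derive the function id from the device's devfn
 * instead of hardcoding one constant per chip variant.
 */
static unsigned int ex_csi_func_id(uint8_t devfn)
{
    return EX_PCI_FUNC(devfn);
}
```

With the id derived this way, a chip that sits at function 1 or 2 needs no dedicated constant or per-chip callback.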

[PATCH net-next 04/10] r8169: drop member pll_power_ops from struct rtl8169_private

2018-05-02 Thread Heiner Kallweit
After merging r810x_pll_power_down/up and r8168_pll_power_down/up we don't need member pll_power_ops any longer and can drop it, thus simplifying the code. Signed-off-by: Heiner Kallweit --- drivers/net/ethernet/realtek/r8169.c | 76 1 file changed, 10 insertions(+),

[PATCH net-next 02/10] r8169: remove 810x_phy_power_up/down

2018-05-02 Thread Heiner Kallweit
The functionality of 810x_phy_power_up/down is covered by the default clause in 8168_phy_power_up/down. Therefore we don't need these functions. Signed-off-by: Heiner Kallweit --- drivers/net/ethernet/realtek/r8169.c | 98 1 file changed, 43 insertions(+), 55 deletio

[PATCH net-next 03/10] r8169: merge r810x_pll_power_down/up into r8168_pll_power_down/up

2018-05-02 Thread Heiner Kallweit
r810x_pll_power_down/up and r8168_pll_power_down/up have a lot in common, so we can simplify the code by merging the former into the latter. Signed-off-by: Heiner Kallweit --- drivers/net/ethernet/realtek/r8169.c | 61 1 file changed, 16 insertions(+), 45 deletions(-

[PATCH net] ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"

2018-05-02 Thread Ido Schimmel
This reverts commit edd7ceb78296 ("ipv6: Allow non-gateway ECMP for IPv6"). Eric reported a division by zero in rt6_multipath_rebalance() which is caused by above commit that considers identical local routes to be siblings. The division by zero happens because a nexthop weight is not set for local

Re: [PATCH net] ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"

2018-05-02 Thread Eric Dumazet
On 05/02/2018 12:41 PM, Ido Schimmel wrote: > This reverts commit edd7ceb78296 ("ipv6: Allow non-gateway ECMP for > IPv6"). > > Eric reported a division by zero in rt6_multipath_rebalance() which is > caused by above commit that considers identical local routes to be > siblings. The division by

[PATCH 0/2] sh_eth: complain on access to unimplemented TSU registers

2018-05-02 Thread Sergei Shtylyov
Hello! Here's a set of 2 patches against DaveM's 'net-next.git' repo. The 1st patch routes TSU_POST register accesses thru sh_eth_tsu_{read|write}() and the 2nd adds WARN_ON() for unimplemented registers to those functions. I'm going to deal with TSU_ADR{H|L} registers in a later series... [1/2] sh_

[PATCH 1/2] sh_eth: use TSU register accessors for TSU_POST

2018-05-02 Thread Sergei Shtylyov
There's no particularly good reason TSU_POST registers get accessed circumventing sh_eth_tsu_{read|write}() -- start using those, removing (badly named) sh_eth_tsu_get_post_reg_offset(), while at it... Signed-off-by: Sergei Shtylyov --- drivers/net/ethernet/renesas/sh_eth.c | 20 ++--

[PATCH 2/2] sh_eth: WARN_ON() access to unimplemented TSU register

2018-05-02 Thread Sergei Shtylyov
Commit 3365711df024 ("sh_eth: WARN on access to a register not implemented in a particular chip") added WARN_ON() to sh_eth_{read|write}() but not to sh_eth_tsu_{read|write}(). Now that we've routed almost all TSU register accesses (except TSU_ADR{H|L} -- which are special) thru the latter pair o

[PATCH v2 bpf-next 1/3] bpf: unify main prog and subprog

2018-05-02 Thread Jiong Wang
Currently, the verifier treats the main prog and subprogs differently. All detected subprogs are kept in env->subprog_starts while the main prog is not kept there. Instead, the main prog is implicitly defined as the prog starting at 0. There is actually no difference between the main prog and a subprog, so it is better to unif

[PATCH v2 bpf-next 3/3] bpf: add faked "ending" subprog

2018-05-02 Thread Jiong Wang
There are quite a few code snippets like the following in the verifier: subprog_start = 0; if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else subprog_end = env->subprog_info[cur_subprog + 1].start; The reason is there is no marker i
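A minimal sketch of the pattern this description is about, under stated assumptions and not taken from the verifier itself: if the table of subprog start offsets carries one extra, faked entry whose start equals insn_cnt, then the end of any subprog is just the start of the next entry and the special case for the last subprog goes away. The struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-subprog record. */
struct ex_subprog_info {
    int start; /* index of the subprog's first instruction */
};

/* Without a sentinel: the last subprog needs a special case. */
static int ex_end_special_case(const struct ex_subprog_info *info,
                               size_t subprog_cnt, size_t cur,
                               int insn_cnt)
{
    if (subprog_cnt == cur + 1)
        return insn_cnt;
    return info[cur + 1].start;
}

/*
 * With a faked "ending" entry at info[subprog_cnt] whose start is
 * insn_cnt, the lookup is the same for every subprog.
 */
static int ex_end_with_sentinel(const struct ex_subprog_info *info,
                                size_t cur)
{
    return info[cur + 1].start;
}
```

Both functions return the same value for every real subprog as long as the caller keeps the sentinel entry in sync with insn_cnt; the trade-off is one extra array slot in exchange for dropping the branch at each call site.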

[PATCH v2 bpf-next 0/3] bpf: cleanups on managing subprog information

2018-05-02 Thread Jiong Wang
This patch set cleans up some code logic related to managing subprog information. Parts of the set are inspired by Edwin's code in his RFC: "bpf/verifier: subprog/func_call simplifications" but with clearer separation so it could be easier to review. - Patch 1 unifies main prog and subprogs.

[PATCH v2 bpf-next 2/3] bpf: centre subprog information fields

2018-05-02 Thread Jiong Wang
It is better to centre all subprog information fields into one structure. This structure could later serve as function node in call graph. Signed-off-by: Jiong Wang --- include/linux/bpf_verifier.h | 9 --- kernel/bpf/verifier.c| 62 +++- 2 fi

DSA switch

2018-05-02 Thread Ran Shalit
Hello, Is it possible to use switch just like external real switch, connecting all ports to the same subnet ? In our architecture, we prefer that all IPs connected to board shall be in the same subnet. Yet, I am not sure if it is possible with dsa switch, because in dsa the ports are seen in lin

Re: [PATCH 2/3] mlx4: Don't bother using skb_tx_hash in mlx4_en_select_queue

2018-05-02 Thread Alexander Duyck
On Wed, May 2, 2018 at 11:09 AM, Ruhl, Michael J wrote: >>-Original Message- >>From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- >>ow...@vger.kernel.org] On Behalf Of Alexander Duyck >>Sent: Friday, April 27, 2018 2:07 PM >>To: netdev@vger.kernel.org; da...@davemloft.net >>Cc: lin

[PATCH net,stable] qmi_wwan: do not steal interfaces from class drivers

2018-05-02 Thread Bjørn Mork
The USB_DEVICE_INTERFACE_NUMBER matching macro assumes that the { vendorid, productid, interfacenumber } set uniquely identifies one specific function. This has proven to fail for some configurable devices. One example is the Quectel EM06/EP06 where the same interface number can be either QMI or M

Re: [PATCH net-next 00/10] r8169: series with further improvements

2018-05-02 Thread David Miller
From: Heiner Kallweit Date: Wed, 2 May 2018 21:28:10 +0200 > I thought I'm more or less done with the basic refactoring. But again > I stumbled across things that can be improved / simplified. Looks good, series applied, thanks Heiner.

[RFC iproute2-next 4/5] ss: don't look for skbuff_head_cache

2018-05-02 Thread Stephen Hemminger
From: Stephen Hemminger Not used in current code. Signed-off-by: Stephen Hemminger --- misc/ss.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/misc/ss.c b/misc/ss.c index 4f76999c0fee..97304cd8abfc 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -734,7 +734,6 @@ struct slabstat { int

[RFC iproute2-next 5/5] ss: use correct slab statistics

2018-05-02 Thread Stephen Hemminger
From: Stephen Hemminger The slabinfo names changed years ago, and ss statistics were broken. This changes to use current slab names and handle TCP IPv6. Signed-off-by: Stephen Hemminger --- misc/ss.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/

[RFC iproute2-next 0/5] ss statistics fixes

2018-05-02 Thread Stephen Hemminger
From: Stephen Hemminger The output of the ss -s command has been broken for a long time because of kernel changes (i.e. since 2.6). This is an attempt to resolve most of the issues. Still don't like the way it is using slabinfo to get the data but some of this information would be expensive for ke

[RFC iproute2-next 2/5] ss: make tcp_mem long

2018-05-02 Thread Stephen Hemminger
The tcp_memory field in /proc/net/sockstat is formatted as a long value by kernel. Change ss to keep this as full value. Signed-off-by: Stephen Hemminger --- misc/ss.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/misc/ss.c b/misc/ss.c index 22c76e34f83b..c88a25581755 1

[RFC iproute2-next 3/5] ss: use sockstat to get TCP bind ports

2018-05-02 Thread Stephen Hemminger
From: Stephen Hemminger Using slabinfo to try and get the number of bind_buckets no longer works because of slab cache merging. Instead use the proposed enhancement of /proc/net/sockstat to get the same data. Signed-off-by: Stephen Hemminger --- misc/ss.c | 10 -- 1 file changed, 4 insertio

[RFC iproute2-next 1/5] ss: make args to get_snmp_int const

2018-05-02 Thread Stephen Hemminger
These are keys for lookup and should be const. Signed-off-by: Stephen Hemminger --- misc/ss.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/misc/ss.c b/misc/ss.c index 3ed7e66962f3..22c76e34f83b 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -4539,7 +4539,7 @@ static int handle_

Re: [PATCH net-next v9 2/4] net: Introduce generic failover module

2018-05-02 Thread Michael S. Tsirkin
On Wed, May 02, 2018 at 10:51:12AM -0700, Samudrala, Sridhar wrote: > > > On 5/2/2018 9:15 AM, Jiri Pirko wrote: > > Sat, Apr 28, 2018 at 11:06:01AM CEST, j...@resnulli.us wrote: > > > Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudr...@intel.com wrote: > > [...] > > > > > > > > + > > > > +

Re: [PATCH net] ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"

2018-05-02 Thread David Miller
From: Ido Schimmel Date: Wed, 2 May 2018 22:41:56 +0300 > This reverts commit edd7ceb78296 ("ipv6: Allow non-gateway ECMP for > IPv6"). > > Eric reported a division by zero in rt6_multipath_rebalance() which is > caused by above commit that considers identical local routes to be > siblings. The

Re: [PATCH net] net_sched: fq: take care of throttled flows before reuse

2018-05-02 Thread David Miller
From: Eric Dumazet Date: Wed, 2 May 2018 10:03:30 -0700 > Normally, a socket can not be freed/reused unless all its TX packets > left qdisc and were TX-completed. However connect(AF_UNSPEC) allows > this to happen. > > With commit fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for > reuse

Re: [PATCH 0/2] sh_eth: complain on access to unimplemented TSU registers

2018-05-02 Thread David Miller
From: Sergei Shtylyov Date: Wed, 2 May 2018 22:53:23 +0300 > Here's a set of 2 patches against DaveM's 'net-next.git' repo. The 1st patch > routes TSU_POST register accesses thru sh_eth_tsu_{read|write}() and the > 2nd > added WARN_ON() unimplemented register to those functions. I'm going to dea

Re: [PATCH net-next 1/4] ipv6: Calculate hash thresholds for IPv6 nexthops

2018-05-02 Thread Thomas Winter
> On Wed, May 02, 2018 at 12:58:56PM -0600, David Ahern wrote: > > On 5/2/18 12:53 PM, Ido Schimmel wrote: > > > > > > So this fixes the issue for me. To reproduce: > > > > > > # ip -6 address add 2001:db8::1/64 dev dummy0 > > > # ip -6 address add 2001:db8::1/64 dev dummy1 > > > > > > This repr

[bpf PATCH v2 0/3] sockmap error path fixes

2018-05-02 Thread John Fastabend
When I added the test_sockmap to selftests I mistakenly changed the test logic a bit. The result of this was on redirect cases we ended up choosing the wrong sock from the BPF program and ended up sending to a socket that had no receive handler. The result was the actual receive handler, running on

[bpf PATCH v2 1/3] bpf: sockmap, fix scatterlist update on error path in send with apply

2018-05-02 Thread John Fastabend
When the call to do_tcp_sendpage() fails to send the complete block requested we either retry if only a partial send was completed or abort if we receive an error less than or equal to zero. Before returning though we must update the scatterlist length/offset to account for any partial send complete

[bpf PATCH v2 2/3] bpf: sockmap, zero sg_size on error when buffer is released

2018-05-02 Thread John Fastabend
When an error occurs during a redirect we have two cases that need to be handled (i) we have a cork'ed buffer (ii) we have a normal sendmsg buffer. In the cork'ed buffer case we don't currently support recovering from errors in a redirect action. So the buffer is released and the error should _not

[bpf PATCH v2 3/3] bpf: sockmap, fix error handling in redirect failures

2018-05-02 Thread John Fastabend
When a redirect failure happens we release the buffers in-flight without calling a sk_mem_uncharge(), the uncharge is called before dropping the sock lock for the redirecte, however we missed updating the ring start index. When no apply actions are in progress this is OK because we uncharge the ent

Re: [PATCH bpf-next] bpf/verifier: enable ctx + const + 0.

2018-05-02 Thread Jakub Kicinski
On Wed, 2 May 2018 10:54:56 -0700, William Tu wrote: > On Wed, May 2, 2018 at 1:29 AM, Daniel Borkmann wrote: > > On 05/02/2018 06:52 AM, Alexei Starovoitov wrote: > >> On Tue, May 01, 2018 at 09:35:29PM -0700, William Tu wrote: > >> Please test it with real program and you'll see crashes and

Re: DSA switch

2018-05-02 Thread Andrew Lunn
On Wed, May 02, 2018 at 11:20:05PM +0300, Ran Shalit wrote: > Hello, > > Is it possible to use switch just like external real switch, > connecting all ports to the same subnet ? Yes. Just bridge all ports/interfaces together and put your host IP address on the bridge. Andrew

Re: [PATCH net-next 1/4] ipv6: Calculate hash thresholds for IPv6 nexthops

2018-05-02 Thread David Ahern
On 5/2/18 2:48 PM, Thomas Winter wrote: > Should I look at reworking this? It would be great to have these ECMP routes > for other purposes. Looking at my IPv6 bug list this change is on it -- allowing ECMP routes to have a device only hop. Let me take a look at it at the same time as a few othe

Re: [PATCH bpf-next v3 15/15] samples/bpf: sample application and documentation for AF_XDP sockets

2018-05-02 Thread Jesper Dangaard Brouer
On Wed, 2 May 2018 13:01:36 +0200 Björn Töpel wrote: > +static void rx_drop(struct xdpsock *xsk) > +{ > + struct xdp_desc descs[BATCH_SIZE]; > + unsigned int rcvd, i; > + > + rcvd = xq_deq(&xsk->rx, descs, BATCH_SIZE); > + if (!rcvd) > + return; > + > + for (i =

Re: [RFC iproute2-next 2/5] ss: make tcp_mem long

2018-05-02 Thread Eric Dumazet
On 05/02/2018 01:27 PM, Stephen Hemminger wrote: > The tcp_memory field in /proc/net/sockstat is formatted as > a long value by kernel. Change ss to keep this as full value. > > Signed-off-by: Stephen Hemminger > --- > misc/ss.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > >

Re: [RFC iproute2-next 2/5] ss: make tcp_mem long

2018-05-02 Thread Stephen Hemminger
On Wed, 2 May 2018 14:08:53 -0700 Eric Dumazet wrote: > On 05/02/2018 01:27 PM, Stephen Hemminger wrote: > > The tcp_memory field in /proc/net/sockstat is formatted as > > a long value by kernel. Change ss to keep this as full value. > > > > Signed-off-by: Stephen Hemminger > > --- > > misc/ss

Re: [PATCH net-next v9 2/4] net: Introduce generic failover module

2018-05-02 Thread Samudrala, Sridhar
On 5/2/2018 1:30 PM, Michael S. Tsirkin wrote: On Wed, May 02, 2018 at 10:51:12AM -0700, Samudrala, Sridhar wrote: On 5/2/2018 9:15 AM, Jiri Pirko wrote: Sat, Apr 28, 2018 at 11:06:01AM CEST, j...@resnulli.us wrote: Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudr...@intel.com wrote: [..

Re: [PATCH net-next v9 2/4] net: Introduce generic failover module

2018-05-02 Thread Jiri Pirko
Wed, May 02, 2018 at 07:51:12PM CEST, sridhar.samudr...@intel.com wrote: > > >On 5/2/2018 9:15 AM, Jiri Pirko wrote: >> Sat, Apr 28, 2018 at 11:06:01AM CEST, j...@resnulli.us wrote: >> > Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudr...@intel.com wrote: >> [...] >> >> >> > > + >> > > +

Re: [PATCH net-next v9 2/4] net: Introduce generic failover module

2018-05-02 Thread Jiri Pirko
Wed, May 02, 2018 at 07:51:12PM CEST, sridhar.samudr...@intel.com wrote: > > >On 5/2/2018 9:15 AM, Jiri Pirko wrote: >> Sat, Apr 28, 2018 at 11:06:01AM CEST, j...@resnulli.us wrote: >> > Fri, Apr 27, 2018 at 07:06:58PM CEST, sridhar.samudr...@intel.com wrote: >> [...] >> >> >> > > + >> > > +

RE: Performance regressions in TCP_STREAM tests in Linux 4.15 (and later)

2018-05-02 Thread Michael Wenig
After applying Eric's proposed change (see below) to a 4.17 RC3 kernel, the regressions that we had observed in our TCP_STREAM small message tests with TCP_NODELAY enabled are now drastically reduced. Instead of the original 3x thruput and cpu cost regressions, the regression depth is now < 10%

[PATCH net] rds: do not leak kernel memory to user land

2018-05-02 Thread Eric Dumazet
syzbot/KMSAN reported an uninit-value in put_cmsg(), originating from rds_cmsg_recv(). Simply clear the structure, since we have holes there, or since rx_traces might be smaller than RDS_MSG_RX_DGRAM_TRACE_MAX. BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline] BUG: KMS

Re: [PATCH V2 net-next 0/6] virtio-net: Add SCTP checksum offload support

2018-05-02 Thread Marcelo Ricardo Leitner
On Tue, May 01, 2018 at 10:07:33PM -0400, Vladislav Yasevich wrote: > Now that we have SCTP offload capabilities in the kernel, we can add > them to virtio as well. First step is SCTP checksum. SCTP-wise, LGTM: Acked-by: Marcelo Ricardo Leitner

[PATCH net-next] inet: add bound ports statistic

2018-05-02 Thread Stephen Hemminger
This adds a count of bound ports, which fixes the socket summary command. The ss -s output has been broken since changes to slab info, and this is one way to recover the missing value, by adding a field to /proc/net/sockstat. Since this is an informational value only, there is no need for locking. Overhead

[PATCH] sctp: fix a potential missing-check bug

2018-05-02 Thread Wenwen Wang
In sctp_setsockopt_maxseg(), the integer 'val' is compared against min_len and max_len to check whether it is in the appropriate range. If it is not, an error code -EINVAL will be returned. This is enforced by a security check. But, this check is only executed when 'val' is not 0. In fact, if 'val'

Re: [PATCH v5] bpf, x86_32: add eBPF JIT compiler for ia32

2018-05-02 Thread Daniel Borkmann
Hi Wang, On 04/29/2018 02:37 PM, Wang YanQing wrote: > The JIT compiler emits ia32 bit instructions. Currently, It supports eBPF > only. Classic BPF is supported because of the conversion by BPF core. > > Almost all instructions from eBPF ISA supported except the following: > BPF_ALU64 | BPF_DIV

Re: [bpf PATCH v2 0/3] sockmap error path fixes

2018-05-02 Thread Alexei Starovoitov
On Wed, May 02, 2018 at 01:50:14PM -0700, John Fastabend wrote: > When I added the test_sockmap to selftests I mistakenly changed the > test logic a bit. The result of this was on redirect cases we ended up > choosing the wrong sock from the BPF program and ended up sending to a > socket that had n

Re: Performance regressions in TCP_STREAM tests in Linux 4.15 (and later)

2018-05-02 Thread Eric Dumazet
On 05/02/2018 02:47 PM, Michael Wenig wrote: > After applying Eric's proposed change (see below) to a 4.17 RC3 kernel, the > regressions that we had observed in our TCP_STREAM small message tests with > TCP_NODELAY enabled are now drastically reduced. Instead of the original 3x > thruput and c

Re: [PATCH net] ipv4: fix fnhe usage by non-cached routes

2018-05-02 Thread David Ahern
On 5/2/18 12:41 AM, Julian Anastasov wrote: > Allow some non-cached routes to use non-expired fnhe: > > 1. ip_del_fnhe: moved above and now called by find_exception. > The 4.5+ commit deed49df7390 expires fnhe only when caching > routes. Change that to: > > 1.1. use fnhe for non-cached local outp

[PATCH v2 bpf-next 0/2] bpf: enable stackmap with build_id in nmi

2018-05-02 Thread Song Liu
Changes v1 -> v2: 1. Rename some variables to (hopefully) reduce confusion; 2. Check irq_work status with IRQ_WORK_BUSY (instead of work->sem); 3. In Kconfig, let BPF_SYSCALL select IRQ_WORK; 4. Add static to DEFINE_PER_CPU(); 5. Remove pr_info() in stack_map_init(). Song Liu (2): bpf:

[PATCH v2 bpf-next 2/2] bpf: add selftest for stackmap with build_id in NMI context

2018-05-02 Thread Song Liu
This new test captures stackmap with build_id with hardware event PERF_COUNT_HW_CPU_CYCLES. Because we only support one ips-to-build_id lookup per cpu in NMI context, stack_amap will not be able to do the lookup in this test. Therefore, we didn't do compare_stack_ips(), as it will always fail. ur

[PATCH v2 bpf-next 1/2] bpf: enable stackmap with build_id in nmi context

2018-05-02 Thread Song Liu
Currently, we cannot parse build_id in nmi context because of up_read(&current->mm->mmap_sem), this makes stackmap with build_id less useful. This patch enables parsing build_id in nmi by putting the up_read() call in irq_work. To avoid memory allocation in nmi context, we use per cpu variable for the ir

Re: [PATCH] sctp: fix a potential missing-check bug

2018-05-02 Thread Marcelo Ricardo Leitner
Hi Wenwen, On Wed, May 02, 2018 at 05:12:45PM -0500, Wenwen Wang wrote: > In sctp_setsockopt_maxseg(), the integer 'val' is compared against min_len > and max_len to check whether it is in the appropriate range. If it is not, > an error code -EINVAL will be returned. This is enforced by a security

Re: Silently dropped UDP packets on kernel 4.14

2018-05-02 Thread Kristian Evensen
Hello, On Wed, May 2, 2018 at 12:42 AM, Kristian Evensen wrote: > My knowledge of the conntrack/nat subsystem is not that great, and I > don't know the implications of what I am about to suggest. However, > considering that the two packets represent the same flow, wouldn't it > be possible to app

Re: [v2 PATCH 1/1] tg3: fix meaningless hw_stats reading after tg3_halt memset 0 hw_stats

2018-05-02 Thread Zumeng Chen
On 2018年05月03日 01:32, Michael Chan wrote: On Wed, May 2, 2018 at 3:27 AM, Zumeng Chen wrote: On 2018年05月02日 13:12, Michael Chan wrote: On Tue, May 1, 2018 at 5:42 PM, Zumeng Chen wrote: diff --git a/drivers/net/ethernet/broadcom/tg3.h b/drivers/net/ethernet/broadcom/tg3.h index 3b5e98e..c61

pull-request: bpf 2018-05-03

2018-05-02 Thread Daniel Borkmann
Hi David, The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Several BPF sockmap fixes mostly related to bugs in error path handling, that is, a bug in updating the scatterlist length / offset accounting, a missing sk_mem_uncharge() in redirect

[PATCH bpf-next 04/12] bpf: add skb_load_bytes_relative helper

2018-05-02 Thread Daniel Borkmann
This adds a small BPF helper similar to bpf_skb_load_bytes() that is able to load relative to mac/net header offset from the skb's linear data. Compared to bpf_skb_load_bytes(), it takes a fifth argument namely start_header, which is either BPF_HDR_START_MAC or BPF_HDR_START_NET. This allows for a m

[PATCH bpf-next 01/12] bpf: prefix cbpf internal helpers with bpf_

2018-05-02 Thread Daniel Borkmann
No change in functionality, just remove the '__' prefix and replace it with a 'bpf_' prefix instead. We later on add a couple of more helpers for cBPF and keeping the scheme with '__' is suboptimal there. Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov --- net/core/filter.c | 18 +++

[PATCH bpf-next 10/12] bpf, ppc64: remove ld_abs/ld_ind

2018-05-02 Thread Daniel Borkmann
Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from ppc64 JIT. Signed-off-by: Daniel Borkmann Acked-by: Naveen N. Rao Acked-by: Alexei Starovoitov Tested-

[PATCH bpf-next 02/12] bpf: migrate ebpf ld_abs/ld_ind tests to test_verifier

2018-05-02 Thread Daniel Borkmann
Remove all eBPF tests involving LD_ABS/LD_IND from test_bpf.ko. Reason is that the eBPF tests from test_bpf module do not go via BPF verifier and therefore any instruction rewrites from verifier cannot take place. Therefore, move them into test_verifier which runs out of user space, so that verifier

[PATCH bpf-next 09/12] bpf, mips64: remove ld_abs/ld_ind

2018-05-02 Thread Daniel Borkmann
Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from mips64 JIT. Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov --- arch/mips/net/ebpf_jit.c |

[PATCH bpf-next 12/12] bpf: sync tools bpf.h uapi header

2018-05-02 Thread Daniel Borkmann
Only sync the header from include/uapi/linux/bpf.h. Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov --- tools/include/uapi/linux/bpf.h | 33 - 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include

[PATCH bpf-next 11/12] bpf, s390x: remove ld_abs/ld_ind

2018-05-02 Thread Daniel Borkmann
Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from s390x JIT. Tested on s390x instance on LinuxONE. Signed-off-by: Daniel Borkmann Cc: Michael Holzheu Ack

[PATCH bpf-next 08/12] bpf, arm32: remove ld_abs/ld_ind

2018-05-02 Thread Daniel Borkmann
Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from arm32 JIT. Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov --- arch/arm/net/bpf_jit_32.c |

[PATCH bpf-next 07/12] bpf, sparc64: remove ld_abs/ld_ind

2018-05-02 Thread Daniel Borkmann
Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from sparc64 JIT. Signed-off-by: Daniel Borkmann Cc: David S. Miller Acked-by: Alexei Starovoitov --- arch

[PATCH bpf-next 03/12] bpf: implement ld_abs/ld_ind in native bpf

2018-05-02 Thread Daniel Borkmann
The main part of this work is to finally allow removal of LD_ABS and LD_IND from the BPF core by reimplementing them through native eBPF instead. Both LD_ABS/LD_IND were carried over from cBPF and keeping them around in native eBPF caused way more trouble than actually worth it. To just list some o

[PATCH bpf-next 06/12] bpf, arm64: remove ld_abs/ld_ind

2018-05-02 Thread Daniel Borkmann
Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from arm64 JIT. Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov --- arch/arm64/net/bpf_jit_comp

[PATCH bpf-next 00/12] Move ld_abs/ld_ind to native BPF

2018-05-02 Thread Daniel Borkmann
This set simplifies BPF JITs significantly by moving ld_abs/ld_ind to native BPF, for details see individual patches. Main rationale is in patch 'implement ld_abs/ld_ind in native bpf'. Thanks! Daniel Borkmann (12): bpf: prefix cbpf internal helpers with bpf_ bpf: migrate ebpf ld_abs/ld_ind te

[PATCH bpf-next 05/12] bpf, x64: remove ld_abs/ld_ind

2018-05-02 Thread Daniel Borkmann
Since LD_ABS/LD_IND instructions are now removed from the core and reimplemented through a combination of inlined BPF instructions and a slow-path helper, we can get rid of the complexity from x64 JIT. Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov --- arch/x86/net/Makefile |

Re: [PATCH] sctp: fix a potential missing-check bug

2018-05-02 Thread Wenwen Wang
Hi Marcelo, I guess I worked on an old version of the kernel. I will re-submit the patch. Sorry :( Wenwen On Wed, May 2, 2018 at 6:23 PM, Marcelo Ricardo Leitner wrote: > Hi Wenwen, > > On Wed, May 02, 2018 at 05:12:45PM -0500, Wenwen Wang wrote: >> In sctp_setsockopt_maxseg(), the integer 'val

Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring

2018-05-02 Thread Tiwei Bie
On Wed, May 02, 2018 at 06:42:57PM +0300, Michael S. Tsirkin wrote: > On Wed, May 02, 2018 at 11:12:55PM +0800, Tiwei Bie wrote: > > On Wed, May 02, 2018 at 04:51:01PM +0300, Michael S. Tsirkin wrote: > > > On Wed, May 02, 2018 at 03:28:19PM +0800, Tiwei Bie wrote: > > > > On Wed, May 02, 2018 at 1

[PATCH] sctp: fix a potential missing-check bug

2018-05-02 Thread Wenwen Wang
In sctp_setsockopt_maxseg(), the integer 'val' is compared against min_len and max_len to check whether it is in the appropriate range. If it is not, an error code -EINVAL will be returned. This is enforced by a security check. But, this check is only executed when 'val' is not 0. In fact, if 'val'

Re: [PATCH] NET/netlink: optimize output of seq_puts in af_netlink.c

2018-05-02 Thread YU Bo
Hi, On Wed, May 02, 2018 at 10:19:43AM -0400, David Miller wrote: From: Bo YU Date: Wed, 2 May 2018 05:54:24 -0400 Optimization of command output: `cat /proc/net/netlink` After the patch, we will get: https://clbin.com/lnu4L Signed-off-by: Bo YU --- net/netlink/af_netlink.c | 6 +++--- 1

Re: [PATCH] sctp: fix a potential missing-check bug

2018-05-02 Thread Marcelo Ricardo Leitner
On Wed, May 02, 2018 at 08:15:45PM -0500, Wenwen Wang wrote: > In sctp_setsockopt_maxseg(), the integer 'val' is compared against min_len > and max_len to check whether it is in the appropriate range. If it is not, > an error code -EINVAL will be returned. This is enforced by a security > check. Bu

Re: [PATCH] sctp: fix a potential missing-check bug

2018-05-02 Thread Wenwen Wang
On Wed, May 2, 2018 at 8:24 PM, Marcelo Ricardo Leitner wrote: > On Wed, May 02, 2018 at 08:15:45PM -0500, Wenwen Wang wrote: >> In sctp_setsockopt_maxseg(), the integer 'val' is compared against min_len >> and max_len to check whether it is in the appropriate range. If it is not, >> an error code

[PATCH net-next] ip6_gre: correct the function name in ip6gre_tnl_addr_conflict() comment

2018-05-02 Thread Sun Lianwen
The function name is wrong in the ip6gre_tnl_addr_conflict() comment, which uses ip6_tnl_addr_conflict instead of ip6gre_tnl_addr_conflict. Signed-off-by: Sun Lianwen --- net/ipv6/ip6_gre.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c ind

Re: [PATCH bpf-next 07/12] bpf, sparc64: remove ld_abs/ld_ind

2018-05-02 Thread David Miller
From: Daniel Borkmann Date: Thu, 3 May 2018 03:05:31 +0200 > Since LD_ABS/LD_IND instructions are now removed from the core and > reimplemented through a combination of inlined BPF instructions and > a slow-path helper, we can get rid of the complexity from sparc64 JIT. > > Signed-off-by: Danie

Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring

2018-05-02 Thread Michael S. Tsirkin
On Thu, May 03, 2018 at 09:11:16AM +0800, Tiwei Bie wrote: > On Wed, May 02, 2018 at 06:42:57PM +0300, Michael S. Tsirkin wrote: > > On Wed, May 02, 2018 at 11:12:55PM +0800, Tiwei Bie wrote: > > > On Wed, May 02, 2018 at 04:51:01PM +0300, Michael S. Tsirkin wrote: > > > > On Wed, May 02, 2018 at 0

Re: [PATCH] sctp: fix a potential missing-check bug

2018-05-02 Thread Marcelo Ricardo Leitner
On Wed, May 02, 2018 at 08:27:05PM -0500, Wenwen Wang wrote: > On Wed, May 2, 2018 at 8:24 PM, Marcelo Ricardo Leitner > wrote: > > On Wed, May 02, 2018 at 08:15:45PM -0500, Wenwen Wang wrote: > >> In sctp_setsockopt_maxseg(), the integer 'val' is compared against min_len > >> and max_len to check

Re: [RFC v3 4/5] virtio_ring: add event idx support in packed ring

2018-05-02 Thread Tiwei Bie
On Thu, May 03, 2018 at 04:44:39AM +0300, Michael S. Tsirkin wrote: > On Thu, May 03, 2018 at 09:11:16AM +0800, Tiwei Bie wrote: > > On Wed, May 02, 2018 at 06:42:57PM +0300, Michael S. Tsirkin wrote: > > > On Wed, May 02, 2018 at 11:12:55PM +0800, Tiwei Bie wrote: > > > > On Wed, May 02, 2018 at 0

Re: pull-request: bpf 2018-05-03

2018-05-02 Thread David Miller
From: Daniel Borkmann Date: Thu, 3 May 2018 02:37:12 +0200 > The following pull-request contains BPF updates for your *net* tree. > > The main changes are: ... > Please consider pulling these changes from: > > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git Pulled, thanks Daniel.

Re: [PATCH net] ipv4: fix fnhe usage by non-cached routes

2018-05-02 Thread David Miller
From: Julian Anastasov Date: Wed, 2 May 2018 09:41:19 +0300 > Allow some non-cached routes to use non-expired fnhe: > > 1. ip_del_fnhe: moved above and now called by find_exception. > The 4.5+ commit deed49df7390 expires fnhe only when caching > routes. Change that to: > > 1.1. use fnhe for no

[PATCH net] tcp: restore autocorking

2018-05-02 Thread Eric Dumazet
When adding rb-tree for TCP retransmit queue, we inadvertently broke TCP autocorking. tcp_should_autocork() should really check if the rtx queue is not empty. Tested: Before the fix : $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -m 500;nstat | grep AutoCork MIGRATED TCP STREAM TEST from 0.0.0.0 (

[bpf-next v1 1/9] net/ipv6: Rename fib6_lookup to fib6_node_lookup

2018-05-02 Thread David Ahern
Rename fib6_lookup to fib6_node_lookup to better reflect what it returns. The fib6_lookup name will be used in a later patch for an IPv6 equivalent to IPv4's fib_lookup. Signed-off-by: David Ahern --- include/net/ip6_fib.h | 6 +++--- net/ipv6/ip6_fib.c| 14 -- net/ipv6/route.c

[bpf-next v1 4/9] net/ipv6: Refactor fib6_rule_action

2018-05-02 Thread David Ahern
Move source address lookup from fib6_rule_action to a helper. It will be used in a later patch by a second variant for fib6_rule_action. Signed-off-by: David Ahern --- net/ipv6/fib6_rules.c | 52 ++- 1 file changed, 31 insertions(+), 21 deletions(-

[bpf-next v1 3/9] net/ipv6: Extract table lookup from ip6_pol_route

2018-05-02 Thread David Ahern
ip6_pol_route is used for ingress and egress FIB lookups. Refactor it moving the table lookup into a separate fib6_table_lookup that can be invoked separately and export the new function. ip6_pol_route now calls fib6_table_lookup and uses the result to generate a dst based rt6_info. Signed-off-by

[bpf-next v1 8/9] bpf: Provide helper to do lookups in kernel FIB table

2018-05-02 Thread David Ahern
Provide a helper for doing a FIB and neighbor lookup in the kernel tables from an XDP program. The helper provides a fastpath for forwarding packets. If the packet is a local delivery or for any reason is not a simple lookup and forward, the packet continues up the stack. If it is to be forwarded,

[bpf-next v1 6/9] net/ipv6: Update fib6 tracepoint to take fib6_info

2018-05-02 Thread David Ahern
Similar to IPv4, IPv6 should use the FIB lookup result in the tracepoint. Signed-off-by: David Ahern --- include/trace/events/fib6.h | 14 +++--- net/ipv6/route.c| 14 ++ 2 files changed, 13 insertions(+), 15 deletions(-) diff --git a/include/trace/events/fib6.h

[bpf-next v1 2/9] net/ipv6: Rename rt6_multipath_select

2018-05-02 Thread David Ahern
Rename rt6_multipath_select to fib6_multipath_select and export it. A later patch wants access to it similar to IPv4's fib_select_path. Signed-off-by: David Ahern --- include/net/ip6_fib.h | 5 + net/ipv6/route.c | 17 + 2 files changed, 14 insertions(+), 8 deletions(-)

[bpf-next v1 0/9] bpf: Add helper to do FIB lookups

2018-05-02 Thread David Ahern
Provide a helper for doing a FIB and neighbor lookup in the kernel tables from an XDP program. The helper provides a fastpath for forwarding packets. If the packet is a local delivery or for any reason is not a simple lookup and forward, the packet is expected to continue up the stack for full proc
