Re: [PATCH] net: tun: set tun->dev->addr_len during TUNSETLINK processing

2021-04-06 Thread Eric Dumazet
d uninit-value bug reported by syzbot at: > https://syzkaller.appspot.com/bug?id=0766d38c656abeace60621896d705743aeefed51 > > Reported-by: syzbot+001516d86dbe88862...@syzkaller.appspotmail.com > Signed-off-by: Phillip Potter > --- Please give credits to people who helped. You could h

Re: [tcp] 4ecc1baf36: ltp.proc01.fail

2021-04-06 Thread Eric Dumazet
On Tue, Apr 6, 2021 at 8:41 AM kernel test robot wrote: > > > > Greeting, > > FYI, we noticed the following commit (built with gcc-9): > > commit: 4ecc1baf362c5df2dcabe242511e38ee28486545 ("tcp: convert elligible > sysctls to u8") >

Re: [PATCH] net: initialize local variables in net/ipv6/mcast.c and net/ipv4/igmp.c

2021-04-02 Thread Eric Dumazet
On 4/2/21 10:53 PM, Eric Dumazet wrote: > > > On 4/2/21 8:10 PM, Phillip Potter wrote: >> On Fri, Apr 02, 2021 at 07:49:44PM +0200, Eric Dumazet wrote: >>> >>> >>> On 4/2/21 7:36 PM, Phillip Potter wrote: >>>> Use memset to initialize t

Re: [PATCH] net: initialize local variables in net/ipv6/mcast.c and net/ipv4/igmp.c

2021-04-02 Thread Eric Dumazet
On 4/2/21 8:10 PM, Phillip Potter wrote: > On Fri, Apr 02, 2021 at 07:49:44PM +0200, Eric Dumazet wrote: >> >> >> On 4/2/21 7:36 PM, Phillip Potter wrote: >>> Use memset to initialize two local buffers in net/ipv6/mcast.c, >>> and another in net/ipv4/i

Re: [PATCH net v2] atl1c: move tx cleanup processing out of interrupt

2021-04-02 Thread Eric Dumazet
On 4/2/21 7:20 PM, Gatis Peisenieks wrote: > Tx queue cleanup happens in interrupt handler on same core as rx queue > processing. > Both can take considerable amount of processing in high packet-per-second > scenarios. > ... > @@ -2504,6 +2537,7 @@ static int atl1c_init_netdev(struct

Re: [PATCH net v2] atl1c: move tx cleanup processing out of interrupt

2021-04-02 Thread Eric Dumazet
On 4/2/21 7:20 PM, Gatis Peisenieks wrote: > Tx queue cleanup happens in interrupt handler on same core as rx queue > processing. > Both can take considerable amount of processing in high packet-per-second > scenarios. > > Sending big amounts of packets can stall the rx processing which is

Re: [PATCH] net: initialize local variables in net/ipv6/mcast.c and net/ipv4/igmp.c

2021-04-02 Thread Eric Dumazet
On 4/2/21 7:36 PM, Phillip Potter wrote: > Use memset to initialize two local buffers in net/ipv6/mcast.c, > and another in net/ipv4/igmp.c. Fixes a KMSAN found uninit-value > bug reported by syzbot at: > https://syzkaller.appspot.com/bug?id=0766d38c656abeace60621896d705743aeefed51 According

Re: [PATCH net] atl1c: move tx cleanup processing out of interrupt

2021-04-01 Thread Eric Dumazet
On 4/1/21 7:32 PM, Gatis Peisenieks wrote: > Tx queue cleanup happens in interrupt handler on same core as rx queue > processing. > Both can take considerable amount of processing in high packet-per-second > scenarios. > > Sending big amounts of packets can stall the rx processing which is

Re: [PATCH net-next] net: document a side effect of ip_local_reserved_ports

2021-04-01 Thread Eric Dumazet
On Thu, Apr 1, 2021 at 5:58 PM Otto Hollmann wrote: > > If there is overlap between ip_local_port_range and > ip_local_reserved_ports with a huge reserved block, it will affect > probability of selecting ephemeral ports, see file > net/ipv4/inet_hashtables.c:723 > > int

Re: [PATCH AUTOSEL 5.11 10/38] net: correct sk_acceptq_is_full()

2021-03-31 Thread Eric Dumazet
On 3/30/21 12:21 AM, Sasha Levin wrote: > From: liuyacan > > [ Upstream commit f211ac154577ec9ccf07c15f18a6abf0d9bdb4ab ] > > The "backlog" argument in listen() specifies > the maximum length of pending connections, > so the accept queue should be considered full > if there are exactly

Re: BUG: use-after-free in macvlan_broadcast

2021-03-30 Thread Eric Dumazet
On 3/30/21 12:11 PM, Hao Sun wrote: > Hi > > When using Healer(https://github.com/SunHao-0/healer/tree/dev) to fuzz > the Linux kernel, I found a use-after-free vulnerability in > macvlan_broadcast. > Hope the report can help you locate the problem. > > Details: > commit: 5695e5161 Linux

Re: [PATCH v2] wireless/nl80211.c: fix uninitialized variable

2021-03-30 Thread Eric Dumazet
On 3/30/21 7:22 PM, Alaa Emad wrote: > This change fix KMSAN uninit-value in net/wireless/nl80211.c:225 , That > because of `fixedlen` variable uninitialized,So I initialized it by zero. > > Reported-by: syzbot+72b99dcf4607e8c77...@syzkaller.appspotmail.com > Signed-off-by: Alaa Emad > --- >

Re: [syzbot] WARNING in xfrm_alloc_compat (2)

2021-03-29 Thread Eric Dumazet
On 3/29/21 9:57 PM, Dmitry Safonov wrote: > Hi, > > On 3/29/21 8:04 PM, syzbot wrote: >> Hello, >> >> syzbot found the following issue on: >> >> HEAD commit:6c996e19 net: change netdev_unregister_timeout_secs min va.. >> git tree: net-next >> console output:

Re: [PATCH net-next v2] net: change netdev_unregister_timeout_secs min value to 1

2021-03-25 Thread Eric Dumazet
e introduced by > "net: make unregister netdev warning timeout configurable": > it changed "refcnt != 1" to "refcnt". > > Signed-off-by: Dmitry Vyukov > Suggested-by: Eric Dumazet > Fixes: 5aa3afe107d9 ("net: make unregister netdev warning timeout

Re: [PATCH] net: change netdev_unregister_timeout_secs min value to 1

2021-03-25 Thread Eric Dumazet
On 3/25/21 3:38 PM, Dmitry Vyukov wrote: > On Thu, Mar 25, 2021 at 3:34 PM Eric Dumazet wrote: >> On 3/25/21 11:31 AM, Dmitry Vyukov wrote: >>> netdev_unregister_timeout_secs=0 can lead to printing the >>> "waiting for dev to become free" message

Re: [PATCH] net: change netdev_unregister_timeout_secs min value to 1

2021-03-25 Thread Eric Dumazet
On 3/25/21 11:31 AM, Dmitry Vyukov wrote: > netdev_unregister_timeout_secs=0 can lead to printing the > "waiting for dev to become free" message every jiffy. > This is too frequent and unnecessary. > Set the min value to 1 second. > > Signed-off-by: Dmitry Vyukov
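
The patch description above amounts to enforcing a lower bound on a tunable. A minimal userspace sketch of that idea follows; the macro and function names are hypothetical and are not taken from the patch, and the kernel presumably enforces the bound through its sysctl handling rather than a helper like this.

```c
#include <assert.h>

/* Hypothetical lower bound in seconds, mirroring the "min value to 1" in the thread title. */
#define DEMO_UNREGISTER_TIMEOUT_MIN_SECS 1

/*
 * Hypothetical helper: return the requested timeout, raised to the minimum
 * when the caller asks for less. With a floor of 1 second, a value of 0 can
 * no longer turn the "waiting for dev to become free" message into a
 * once-per-jiffy print.
 */
int demo_clamp_unregister_timeout(int requested_secs)
{
	if (requested_secs < DEMO_UNREGISTER_TIMEOUT_MIN_SECS)
		return DEMO_UNREGISTER_TIMEOUT_MIN_SECS;
	return requested_secs;
}
```

Rejecting an out-of-range write with an error instead of silently clamping it would be an equally valid design; the sketch only illustrates the bound itself.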

Re: [PATCH v2] net: make unregister netdev warning timeout configurable

2021-03-25 Thread Eric Dumazet
On Thu, Mar 25, 2021 at 8:39 AM Dmitry Vyukov wrote: > > On Wed, Mar 24, 2021 at 10:40 AM Eric Dumazet wrote: > > > > On Tue, Mar 23, 2021 at 7:49 AM Dmitry Vyukov wrote: > > > > > > netdev_wait_allrefs() issues a warning if refcount does not drop to 0 >

Re: [PATCH v2] net: make unregister netdev warning timeout configurable

2021-03-24 Thread Eric Dumazet
On Tue, Mar 23, 2021 at 7:49 AM Dmitry Vyukov wrote: > > netdev_wait_allrefs() issues a warning if refcount does not drop to 0 > after 10 seconds. While 10 second wait generally should not happen > under normal workload in normal environment, it seems to fire falsely > very often during fuzzing

Re: [PATCH 4.19 012/120] tcp: annotate tp->write_seq lockless reads

2021-03-16 Thread Eric Dumazet
On Tue, Mar 16, 2021 at 10:50 AM Pavel Machek wrote: > > > --- a/net/ipv4/tcp_minisocks.c > > +++ b/net/ipv4/tcp_minisocks.c > > @@ -510,7 +510,7 @@ struct sock *tcp_create_openreq_child(co > > newtp->app_limited = ~0U; > > > > tcp_init_xmit_timers(newsk); > > - newtp->write_seq

Re: [PATCH 4.19 012/120] tcp: annotate tp->write_seq lockless reads

2021-03-16 Thread Eric Dumazet
On Tue, Mar 16, 2021 at 10:50 AM Pavel Machek wrote: > > Hi! > > > From: Greg Kroah-Hartman > > > > From: Eric Dumazet > > Dup. > > > > We need to add READ_ONCE() annotations, and also make > > sure write sides use corresponding WRITE_ONCE() t

Re: [PATCH 4.19 011/120] tcp: annotate tp->copied_seq lockless reads

2021-03-16 Thread Eric Dumazet
On Tue, Mar 16, 2021 at 10:41 AM Pavel Machek wrote: > > Hi! > > > From: Greg Kroah-Hartman > > > > From: Eric Dumazet > > Two From: fields here. > > > [ Upstream commit 7db48e983930285b765743ebd665aecf9850582b ] > > > > There are few pla

Re: [RFC v2] net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc

2021-03-16 Thread Eric Dumazet
On Tue, Mar 16, 2021 at 1:35 AM Yunsheng Lin wrote: > > On 2021/3/16 2:53, Jakub Kicinski wrote: > > On Mon, 15 Mar 2021 11:10:18 +0800 Yunsheng Lin wrote: > >> @@ -606,6 +623,11 @@ static const u8 prio2band[TC_PRIO_MAX + 1] = { > >> */ > >> struct pfifo_fast_priv { > >> struct skb_array

Re: [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive()

2021-03-15 Thread Eric Dumazet
v_gro_receive() (Eric); > - reverse the order of patches to avoid changes superseding. > > [0] https://lore.kernel.org/netdev/20210312162127.239795-1-aloba...@pm.me > SGTM, thanks. Reviewed-by: Eric Dumazet

Re: [PATCH net-next 2/4] gro: don't dereference napi->gro_hash[x] multiple times in dev_gro_receive()

2021-03-12 Thread Eric Dumazet
On Fri, Mar 12, 2021 at 5:22 PM Alexander Lobakin wrote: > > GRO bucket index doesn't change through the entire function. > Store a pointer to the corresponding bucket on stack once and use > it later instead of dereferencing again and again. > > Signed-off-by: Alexander Lobakin > --- >

Re: [PATCH net-next 4/4] gro: improve flow distribution across GRO buckets in dev_gro_receive()

2021-03-12 Thread Eric Dumazet
On Fri, Mar 12, 2021 at 5:22 PM Alexander Lobakin wrote: > > Most of the functions that "convert" hash value into an index > (when RPS is configured / XPS is not configured / etc.) set > reciprocal_scale() on it. Its logics is simple, but fair enough and > accounts the entire input value. > On

Re: [PATCH 1/2] net: core: datagram.c: Fix use of assignment in if condition

2021-03-11 Thread Eric Dumazet
On Thu, Mar 11, 2021 at 11:34 AM Shubhankar Kuranagatti wrote: > > The assignment inside the if condition has been changed to > initialising outside the if condition. > > Signed-off-by: Shubhankar Kuranagatti > --- > net/core/datagram.c | 31 --- > 1 file changed, 20

Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue

2021-03-10 Thread Eric Dumazet
On 3/10/21 7:55 PM, Maxim Mikityanskiy wrote: > On 2021-03-10 19:03, Eric Dumazet wrote: >> >> >> On 3/10/21 3:54 PM, Maxim Mikityanskiy wrote: >>> On 2021-03-09 17:20, Eric Dumazet wrote: >>>> >>>> >>>> On 3/9/21 4:13 PM, syzbo

Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue

2021-03-10 Thread Eric Dumazet
On 3/10/21 3:54 PM, Maxim Mikityanskiy wrote: > On 2021-03-09 17:20, Eric Dumazet wrote: >> >> >> On 3/9/21 4:13 PM, syzbot wrote: >>> Hello, >>> >>> syzbot found the following issue on: >>> >>> HEAD commit:    38b5133a

Re: [PATCH] net: bonding: fix error return code of bond_neigh_init()

2021-03-10 Thread Eric Dumazet
On 3/10/21 10:24 AM, Roi Dayan wrote: > > > On 2021-03-08 5:11 AM, Jia-Ju Bai wrote: >> When slave is NULL or slave_ops->ndo_neigh_setup is NULL, no error >> return code of bond_neigh_init() is assigned. >> To fix this bug, ret is assigned with -EINVAL in these cases. >> >> Fixes:

Re: [PATCH] net: add net namespace inode for all net_dev events

2021-03-09 Thread Eric Dumazet
On 3/9/21 5:43 AM, Tony Lu wrote: > There are lots of net namespaces on the host runs containers like k8s. > It is very common to see the same interface names among different net > namespaces, such as eth0. It is not possible to distinguish them without > net namespace inode. > > This adds net

Re: [syzbot] BUG: unable to handle kernel NULL pointer dereference in htb_select_queue

2021-03-09 Thread Eric Dumazet
On 3/9/21 4:13 PM, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit:38b5133a octeontx2-pf: Fix otx2_get_fecparam() > git tree: net-next > console output: https://syzkaller.appspot.com/x/log.txt?x=166288a8d0 > kernel config:

Re: [PATCH] net/core/skbuff.c: __netdev_alloc_skb fix when len is greater than KMALLOC_MAX_SIZE

2021-03-01 Thread Eric Dumazet
On 2/26/21 8:11 PM, Pavel Skripkin wrote: > syzbot found WARNING in __alloc_pages_nodemask()[1] when order >= MAX_ORDER. > It was caused by __netdev_alloc_skb(), which doesn't check len value after > adding NET_SKB_PAD. > Order will be >= MAX_ORDER and passed to __alloc_pages_nodemask() if

Re: [PATCH] inetpeer: use else if instead of if to reduce judgment

2021-02-26 Thread Eric Dumazet
On 2/26/21 11:57 AM, Yejune Deng wrote: > In inet_initpeers(), if si.totalram <= (8192*1024)/PAGE_SIZE, it will > be judged three times. Use else if instead of if, it only needs to be > judged once. > > Signed-off-by: Yejune Deng > --- > net/ipv4/inetpeer.c | 10 +- > 1 file changed,

Re: [tcp] 9d9b1ee0b2: packetdrill.packetdrill/gtests/net/tcp/user_timeout/user-timeout-probe_ipv4-mapped-v6.fail

2021-02-25 Thread Eric Dumazet
On Thu, Feb 25, 2021 at 9:06 AM Oliver Sang wrote: > > Hi, Neal, > > On Wed, Feb 24, 2021 at 10:13:02PM +0800, Oliver Sang wrote: > > Hi, Neal, > > > > On Fri, Feb 19, 2021 at 09:52:04AM -0500, Neal Cardwell wrote: > > > On Thu, Feb 18, 2021 at 8:33 PM kernel test robot > > > wrote: > > > > > >

Re: [PATCH net v3 1/1] can: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership

2021-02-24 Thread Eric Dumazet
000 r8: r7:82b44000 > r6:82ab1f00 r5:834e5600 r4:83f27400 > | [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534) > > To fix this problem, only set skb ownership to sockets which have still > a ref count > 0. > > Cc: Oliver Hart

Re: [PATCH net v2 2/2] can: fix ref count warning if socket was closed before skb was cloned

2021-02-23 Thread Eric Dumazet
:82b44000 > r6:82ab1f00 r5:834e5600 r4:83f27400 > | [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534) > > To fix this problem, we have to take into account, that the socket > technically still there but should not used (by any new skbs) any more. >

Re: [PATCH] net/qrtr: restrict length in qrtr_tun_write_iter()

2021-02-22 Thread Eric Dumazet
On 2/21/21 1:39 PM, Sabyrzhan Tasbolatov wrote: >> Do we really expect to accept huge lengths here ? > > Sorry for late response but I couldnt find any reference to the max > length of incoming data for qrtr TUN interface. > >> qrtr_endpoint_post() will later attempt a netdev_alloc_skb()

Re: KASAN: use-after-free Read in nbd_genl_connect

2021-02-22 Thread Eric Dumazet
On 2/22/21 9:25 AM, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit:f40ddce8 Linux 5.11 > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=179e8d22d0 > kernel config:

Re: [PATCH] arp: Remove the arp_hh_ops structure

2021-02-22 Thread Eric Dumazet
On 2/22/21 4:15 AM, Yejune Deng wrote: > The arp_hh_ops structure is similar to the arp_generic_ops structure. > but the latter is more general,so remove the arp_hh_ops structure. > > Fix when took out the neigh->ops assignment: > 8.973653] #PF: supervisor read access in kernel mode > [

Re: [PATCH] net/qrtr: restrict user-controlled length in qrtr_tun_write_iter()

2021-02-12 Thread Eric Dumazet
On 2/2/21 10:20 AM, Sabyrzhan Tasbolatov wrote: > syzbot found WARNING in qrtr_tun_write_iter [1] when write_iter length > exceeds KMALLOC_MAX_SIZE causing order >= MAX_ORDER condition. > > Additionally, there is no check for 0 length write. > > [1] > WARNING: mm/page_alloc.c:5011 > [..] >

Re: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

2021-02-10 Thread Eric Dumazet
On 11/13/20 5:08 PM, Yonghong Song wrote: > > > On 11/12/20 9:37 PM, Matt Mullins wrote: >> On Wed, Nov 11, 2020 at 03:57:50PM +0100, Dmitry Vyukov wrote: >>> On Mon, Nov 2, 2020 at 12:54 PM syzbot >>> wrote: Hello, syzbot found the following issue on: HEAD

Re: [PATCH bpf 1/4] net: add SO_NETNS_COOKIE socket option

2021-02-10 Thread Eric Dumazet
On 2/10/21 1:04 PM, Lorenz Bauer wrote: > We need to distinguish which network namespace a socket belongs to. > BPF has the useful bpf_get_netns_cookie helper for this, but accessing > it from user space isn't possible. Add a read-only socket option that > returns the netns cookie, similar to

[PATCH linux-next] vpda: correctly size vdpa_nl_policy

2021-02-10 Thread Eric Dumazet
From: Eric Dumazet We need to ensure last entry of vdpa_nl_policy[] is zero, otherwise out-of-bounds access is hurting us. BUG: KASAN: global-out-of-bounds in netlink_policy_dump_add_policy+0x3b6/0x440 net/netlink/policy.c:160 Read of size 1 at addr 89cc61d0 by task syz-executor181
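
As a rough illustration of the sizing concern in this patch description, the sketch below declares a policy-style table with an explicit length of "highest attribute + 1", so that every valid index is in bounds and unlisted entries are zero-initialized. All names here (`demo_attr`, `demo_policy`, and so on) are hypothetical stand-ins and are not the vdpa or netlink definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute ids; DEMO_ATTR_MAX is the highest valid id. */
enum demo_attr {
	DEMO_ATTR_UNSPEC,
	DEMO_ATTR_NAME,
	DEMO_ATTR_ID,
	DEMO_ATTR_MAX = DEMO_ATTR_ID
};

/* Hypothetical, much simplified policy entry. */
struct demo_policy {
	unsigned char type;
};

/*
 * The explicit [DEMO_ATTR_MAX + 1] size is the point of the sketch: with an
 * unsized array, the length would stop at the highest designated index
 * (DEMO_ATTR_NAME here), and a lookup for DEMO_ATTR_ID would read past the end.
 */
const struct demo_policy demo_policy_table[DEMO_ATTR_MAX + 1] = {
	[DEMO_ATTR_NAME] = { .type = 1 },
};

/* Number of entries in the table. */
size_t demo_policy_len(void)
{
	return sizeof(demo_policy_table) / sizeof(demo_policy_table[0]);
}

/* Bounds-checked lookup; unknown or unlisted attributes yield type 0. */
unsigned char demo_policy_type(unsigned int attr)
{
	if (attr > DEMO_ATTR_MAX)
		return 0;
	return demo_policy_table[attr].type;
}
```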

Re: [PATCH net] net: gro: do not keep too many GRO packets in napi->rx_list

2021-02-05 Thread Eric Dumazet
On Fri, Feb 5, 2021 at 2:03 PM Alexander Lobakin wrote: > > > It's strange why mailmap didn't pick up my active email at pm.me. I took the signatures from c80794323e82, I CCed all people involved in this recent patch. It is very rare I use scripts/get_maintainer.pl since it tends to be noisy.

Re: [PATCH net-next] net: fix up truesize of cloned skb in skb_prepare_for_shift()

2021-02-02 Thread Eric Dumazet
ill_queues()). > > Link: https://lkml.kernel.org/r/X9JR/j6dmmoy1...@elver.google.com > Reported-by: syzbot+7b99aafdcc2eedea6...@syzkaller.appspotmail.com > Suggested-by: Eric Dumazet > Signed-off-by: Marco Elver Signed-off-by: Eric Dumazet

Re: [PATCH net 0/4] Fix W=1 compilation warnings in net/* folder

2021-02-02 Thread Eric Dumazet
On Tue, Feb 2, 2021 at 3:57 PM Leon Romanovsky wrote: > > On Tue, Feb 02, 2021 at 03:34:37PM +0100, Eric Dumazet wrote: > > On Tue, Feb 2, 2021 at 2:55 PM Leon Romanovsky wrote: > > > > > > From: Leon Romanovsky > > > > > > Hi, > > > >

Re: [PATCH net 3/4] net/core: move ipv6 gro function declarations to net/ipv6

2021-02-02 Thread Eric Dumazet
On Tue, Feb 2, 2021 at 2:56 PM Leon Romanovsky wrote: > > From: Leon Romanovsky > > Fix the following compilation warnings: > 1031 | INDIRECT_CALLABLE_SCOPE void udp_v6_early_demux(struct sk_buff *skb) > > net/ipv6/ip6_offload.c:182:41: warning: no previous prototype for > ‘ipv6_gro_receive’

Re: [PATCH net 0/4] Fix W=1 compilation warnings in net/* folder

2021-02-02 Thread Eric Dumazet
On Tue, Feb 2, 2021 at 2:55 PM Leon Romanovsky wrote: > > From: Leon Romanovsky > > Hi, > > This short series fixes W=1 compilation warnings which I experienced > when tried to compile net/* folder. > Ok, but we never had a strong requirement about W=1, so adding Fixes: tag is adding

Re: [PATCH net-next] net: fix up truesize of cloned skb in skb_prepare_for_shift()

2021-02-01 Thread Eric Dumazet
On Mon, Feb 1, 2021 at 6:34 PM Marco Elver wrote: > > On Mon, 1 Feb 2021 at 17:50, Christoph Paasch > > just a few days ago we found out that this also fixes a syzkaller > > issue on MPTCP (https://github.com/multipath-tcp/mptcp_net-next/issues/136). > > I confirmed that this patch fixes the

Re: [PATCH] netdevsim: init u64 stats for 32bit hardware

2021-01-28 Thread Eric Dumazet
On 1/28/21 8:23 AM, Dmitry Vyukov wrote: > On Thu, Jan 28, 2021 at 3:43 AM Hillf Danton wrote: >> >> Init the u64 stats in order to avoid the lockdep prints on the 32bit >> hardware like > > FTR this is not just to avoid lockdep prints, but also to prevent very > real stalls in production.

Re: [PATCH net] net: Remove redundant calls of sk_tx_queue_clear().

2021-01-27 Thread Eric Dumazet
On Wed, Jan 27, 2021 at 6:56 PM Kuniyuki Iwashima wrote: > > From: Eric Dumazet > Date: Wed, 27 Jan 2021 18:34:35 +0100 > > On Wed, Jan 27, 2021 at 6:32 PM Kuniyuki Iwashima > > wrote: > > > > > > From: Eric Dumazet > > > Date: Wed, 2

Re: [PATCH net] net: Remove redundant calls of sk_tx_queue_clear().

2021-01-27 Thread Eric Dumazet
On Wed, Jan 27, 2021 at 6:32 PM Kuniyuki Iwashima wrote: > > From: Eric Dumazet > Date: Wed, 27 Jan 2021 18:05:24 +0100 > > On Wed, Jan 27, 2021 at 5:52 PM Kuniyuki Iwashima > > wrote: > > > > > > From: Eric Dumazet > > > Date: Wed, 2

Re: [PATCH net] net: Remove redundant calls of sk_tx_queue_clear().

2021-01-27 Thread Eric Dumazet
On Wed, Jan 27, 2021 at 5:52 PM Kuniyuki Iwashima wrote: > > From: Eric Dumazet > Date: Wed, 27 Jan 2021 15:54:32 +0100 > > On Wed, Jan 27, 2021 at 1:50 PM Kuniyuki Iwashima > > wrote: > > > > > > The commit 41b14fb8724d ("net: Do not cl

Re: [PATCH net] net: Remove redundant calls of sk_tx_queue_clear().

2021-01-27 Thread Eric Dumazet
On Wed, Jan 27, 2021 at 1:50 PM Kuniyuki Iwashima wrote: > > The commit 41b14fb8724d ("net: Do not clear the sock TX queue in > sk_set_socket()") removes sk_tx_queue_clear() from sk_set_socket() and adds > it instead in sk_alloc() and sk_clone_lock() to fix an issue introduced in > the commit

Re: WARNING in pskb_expand_head

2021-01-25 Thread Eric Dumazet
ducer: https://syzkaller.appspot.com/x/repro.c?x=13856bc750 > > > > The issue was bisected to: > > > > commit 3226b158e67cfaa677fd180152bfb28989cb2fac > > Author: Eric Dumazet > > Date: Wed Jan 13 16:18:19 2021 + > > > > net: avoid 32 x trues

Re: [PATCH net] tcp: make TCP_USER_TIMEOUT accurate for zero window probes

2021-01-22 Thread Eric Dumazet
rto_to_user_timeout() > helper to improve accuracy"). > > Signed-off-by: Enke Chen > Reviewed-by: Neal Cardwell > --- SGTM, thanks ! Signed-off-by: Eric Dumazet

Re: [PATCH net] tcp: Fix potential use-after-free due to double kfree().

2021-01-20 Thread Eric Dumazet
his kind of issue does not happen for IPv6. This is > > because tcp_v6_syn_recv_sock() clones both ipv6_opt and pktopts which > > correspond to ireq_opt in IPv4. > > > > Fixes: 01770a166165 ("tcp: fix race condition when creating child sockets > > from syncookies") > > CC: Ricardo Dias > > Signed-off-by: Kuniyuki Iwashima > > Reviewed-by: Benjamin Herrenschmidt > > Ricardo, Eric, any reason this was written this way? Well, I guess that was a plain bug. IPv4 options are not used often I think. Reviewed-by: Eric Dumazet

Re: [PATCH net] skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too

2021-01-15 Thread Eric Dumazet
On Fri, Jan 15, 2021 at 12:55 AM Alexander Lobakin wrote: > > Commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for > tiny skbs") ensured that skbs with data size lower than 1025 bytes > will be kmalloc'ed to avoid excessive page cache fragmentation and > memory consumption. >

Re: cBPF socket filters failing - inexplicably?

2021-01-15 Thread Eric Dumazet
On Fri, Jan 15, 2021 at 7:52 AM Alexei Starovoitov wrote: > > Adding appropriate mailing list to cc... > > My wild guess is that as soon as socket got created: > socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL)); > the packets were already queued to it. > So later setsockopt() is too late to filter.

Re: [PATCH] tcp: fix TCP_USER_TIMEOUT with zero window

2021-01-13 Thread Eric Dumazet
On Wed, Jan 13, 2021 at 9:12 PM Enke Chen wrote: > > From: Enke Chen > > The TCP session does not terminate with TCP_USER_TIMEOUT when data > remain untransmitted due to zero window. > > The number of unanswered zero-window probes (tcp_probes_out) is > reset to zero with incoming acks

Re: [PATCH net-next 0/5] skbuff: introduce skbuff_heads bulking and reusing

2021-01-13 Thread Eric Dumazet
On Wed, Jan 13, 2021 at 6:03 PM Jakub Kicinski wrote: > > On Wed, 13 Jan 2021 05:46:05 +0100 Eric Dumazet wrote: > > On Wed, Jan 13, 2021 at 2:02 AM Jakub Kicinski wrote: > > > > > > On Tue, 12 Jan 2021 13:23:16 +0100 Eric Dumazet wrote: > > > > On Tue,

Re: [PATCH v2 net-next 2/3] skbuff: (re)use NAPI skb cache on allocation path

2021-01-13 Thread Eric Dumazet
On Wed, Jan 13, 2021 at 2:37 PM Alexander Lobakin wrote: > > Instead of calling kmem_cache_alloc() every time when building a NAPI > skb, (re)use skbuff_heads from napi_alloc_cache.skb_cache. Previously > this cache was only used for bulk-freeing skbuff_heads consumed via > napi_consume_skb() or

Re: [PATCH net-next 0/5] skbuff: introduce skbuff_heads bulking and reusing

2021-01-12 Thread Eric Dumazet
On Wed, Jan 13, 2021 at 2:02 AM Jakub Kicinski wrote: > > On Tue, 12 Jan 2021 13:23:16 +0100 Eric Dumazet wrote: > > On Tue, Jan 12, 2021 at 12:08 PM Alexander Lobakin wrote: > > > > > > From: Edward Cree > > > Date: Tue, 12 Jan 2021 09:54:04 +

Re: [PATCH] tcp: keepalive fixes

2021-01-12 Thread Eric Dumazet
On Tue, Jan 12, 2021 at 11:48 PM Yuchung Cheng wrote: > > On Tue, Jan 12, 2021 at 2:31 PM Enke Chen wrote: > > > > From: Enke Chen > > > > In this patch two issues with TCP keepalives are fixed: > > > > 1) TCP keepalive does not timeout when there are data waiting to be > >delivered and

Re: [PATCH net-next 0/5] skbuff: introduce skbuff_heads bulking and reusing

2021-01-12 Thread Eric Dumazet
On Tue, Jan 12, 2021 at 7:26 PM Alexander Lobakin wrote: > > From: Eric Dumazet > Date: Tue, 12 Jan 2021 13:32:56 +0100 > > > On Tue, Jan 12, 2021 at 11:56 AM Alexander Lobakin wrote: > >> > > > >> > >> Ah, I should've mentioned that I use U

Re: [PATCH net-next 0/5] skbuff: introduce skbuff_heads bulking and reusing

2021-01-12 Thread Eric Dumazet
On Tue, Jan 12, 2021 at 11:56 AM Alexander Lobakin wrote: > > > Ah, I should've mentioned that I use UDP GRO Fraglists, so these > numbers are for GRO. > Right, this suggests UDP GRO fraglist is a pathological case of GRO, not saving memory. Real GRO (TCP in most cases) will consume one skb,

Re: [PATCH net-next 0/5] skbuff: introduce skbuff_heads bulking and reusing

2021-01-12 Thread Eric Dumazet
On Tue, Jan 12, 2021 at 12:08 PM Alexander Lobakin wrote: > > From: Edward Cree > Date: Tue, 12 Jan 2021 09:54:04 + > > > Without wishing to weigh in on whether this caching is a good idea... > > Well, we already have a cache to bulk flush "consumed" skbs, although > kmem_cache_free() is

Re: [PATCH net-next 0/5] skbuff: introduce skbuff_heads bulking and reusing

2021-01-12 Thread Eric Dumazet
On Mon, Jan 11, 2021 at 7:27 PM Alexander Lobakin wrote: > > Inspired by cpu_map_kthread_run() and _kfree_skb_defer() logics. > > Currently, all sorts of skb allocation always do allocate > skbuff_heads one by one via kmem_cache_alloc(). > On the other hand, we have percpu napi_alloc_cache to

Re: [PATCH v3] net: neighbor: fix a crash caused by mod zero

2020-12-22 Thread Eric Dumazet
On 12/22/20 1:38 PM, weichenchen wrote: > pneigh_enqueue() tries to obtain a random delay by mod > NEIGH_VAR(p, PROXY_DELAY). However, NEIGH_VAR(p, PROXY_DELAY) > might be zero at that point because someone could write zero > to /proc/sys/net/ipv4/neigh/[device]/proxy_delay after the > callers
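
The hazard described here is a modulo whose divisor can become zero between an earlier check and its use. A minimal userspace sketch of one way to guard against that is shown below: the divisor is taken once as a by-value snapshot and tested right before the division. The function name is hypothetical, and this is not the code from the patch under review.

```c
#include <assert.h>

/*
 * Hypothetical helper: derive a delay in [0, proxy_delay) from a random value.
 * proxy_delay is passed by value, so it acts as a single snapshot of the
 * tunable; the zero test and the modulo are guaranteed to see the same value.
 */
unsigned long demo_pick_proxy_delay(unsigned long rnd, unsigned long proxy_delay)
{
	if (proxy_delay == 0)
		return 0; /* no delay configured, and no division by zero */
	return rnd % proxy_delay;
}
```

Reading the shared setting twice, once for the check and once for the modulo, would reopen the window that the report describes, which is why the sketch works on a local copy.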

Re: [net-next PATCH v3] tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit

2020-12-14 Thread Eric Dumazet
pared > to in tcp_simple_retransmit to -1 for cases where we are still in the > TCP_SYN_SENT state for a fastopen socket. Doing this we will mark all of > the packets related to the fastopen SYN as lost. > > Signed-off-by: Alexander Duyck > --- > SGTM, thanks ! Signed-off

Re: WARNING in sk_stream_kill_queues (5)

2020-12-14 Thread Eric Dumazet
On Mon, Dec 14, 2020 at 11:09 AM Marco Elver wrote: > > On Thu, 10 Dec 2020 at 20:01, Marco Elver wrote: > > On Thu, 10 Dec 2020 at 18:14, Eric Dumazet wrote: > > > On Thu, Dec 10, 2020 at 5:51 PM Marco Elver wrote: > > [...] > > > > So I started

Re: [net PATCH] tcp: Mark fastopen SYN packet as lost when receiving ICMP_TOOBIG/ICMP_FRAG_NEEDED

2020-12-11 Thread Eric Dumazet
On Fri, Dec 11, 2020 at 6:15 PM Alexander Duyck wrote: > > On Fri, Dec 11, 2020 at 8:22 AM Eric Dumazet wrote: > > > > On Fri, Dec 11, 2020 at 5:03 PM Alexander Duyck > > wrote: > > > > > That's fine. I can target this for net-next. I had just selected ne

Re: [net PATCH] tcp: Mark fastopen SYN packet as lost when receiving ICMP_TOOBIG/ICMP_FRAG_NEEDED

2020-12-11 Thread Eric Dumazet
On Fri, Dec 11, 2020 at 5:03 PM Alexander Duyck wrote: > That's fine. I can target this for net-next. I had just selected net > since I had considered it a fix, but I suppose it could be considered > a behavioral change. We are very late in the 5.10 cycle, and we never handled ICMP in this

Re: [PATCH v4] net/ipv4/inet_fragment: Batch fqdir destroy works

2020-12-11 Thread Eric Dumazet
t, memory pressure occurs. > > Signed-off-by: SeongJae Park > --- > Reviewed-by: Eric Dumazet Jakub or David might change the patch title, no need to resend. Thanks for this nice improvement.

Re: [PATCH v3 1/1] net/ipv4/inet_fragment: Batch fqdir destroy works

2020-12-11 Thread Eric Dumazet
On Fri, Dec 11, 2020 at 11:33 AM SeongJae Park wrote: > > On Fri, 11 Dec 2020 09:43:41 +0100 Eric Dumazet wrote: > > > On Fri, Dec 11, 2020 at 9:21 AM SeongJae Park wrote: > > > > > > From: SeongJae Park > > > > > > For each 'f

Re: [PATCH v3 1/1] net/ipv4/inet_fragment: Batch fqdir destroy works

2020-12-11 Thread Eric Dumazet
On Fri, Dec 11, 2020 at 9:21 AM SeongJae Park wrote: > > From: SeongJae Park > > For each 'fqdir_exit()' call, a work for destroy of the 'fqdir' is > enqueued. The work function, 'fqdir_work_fn()', internally calls > 'rcu_barrier()'. In case of intensive 'fqdir_exit()' (e.g., frequent >

Re: [net PATCH] tcp: Mark fastopen SYN packet as lost when receiving ICMP_TOOBIG/ICMP_FRAG_NEEDED

2020-12-10 Thread Eric Dumazet
On Fri, Dec 11, 2020 at 2:55 AM Alexander Duyck wrote: > > From: Alexander Duyck > > In the case of a fastopen SYN there are cases where it may trigger either a > ICMP_TOOBIG message in the case of IPv6 or a fragmentation request in the > case of IPv4. This results in the socket stalling for a

Re: WARNING in sk_stream_kill_queues (5)

2020-12-10 Thread Eric Dumazet
On Thu, Dec 10, 2020 at 5:51 PM Marco Elver wrote: > > On Wed, Dec 09, 2020 at 01:47PM +0100, Marco Elver wrote: > > On Tue, Dec 08, 2020 at 08:06PM +0100, Marco Elver wrote: > > > On Thu, 3 Dec 2020 at 19:01, Eric Dumazet wrote: > > > > On 1

Re: [PATCH 4.4 15/39] geneve: pull IP header before ECN decapsulation

2020-12-10 Thread Eric Dumazet
On Thu, Dec 10, 2020 at 3:40 PM Greg Kroah-Hartman wrote: > > On Thu, Dec 10, 2020 at 03:38:44PM +0100, Greg Kroah-Hartman wrote: > > On Thu, Dec 10, 2020 at 03:32:12PM +0100, Eric Dumazet wrote: > > > On Thu, Dec 10, 2020 at 3:26 PM Greg Kroah-Hartman > > > wro

Re: [PATCH 4.4 15/39] geneve: pull IP header before ECN decapsulation

2020-12-10 Thread Eric Dumazet
On Thu, Dec 10, 2020 at 3:26 PM Greg Kroah-Hartman wrote: > > From: Eric Dumazet > > IP_ECN_decapsulate() and IP6_ECN_decapsulate() assume > IP header is already pulled. > > geneve does not ensure this yet. > > Fixing this generically in IP_ECN_decapsulate(

Re: [PATCH v2 0/1] net: Reduce rcu_barrier() contentions from 'unshare(CLONE_NEWNET)'

2020-12-10 Thread Eric Dumazet
l free to let me > know. > > > Patch History > - > > Changes from v1 > (https://lore.kernel.org/netdev/20201208094529.23266-1-sjp...@amazon.com/) > - Keep xmas tree variable ordering (Jakub Kicinski) > - Add more numbers (Eric Dumazet) > - Use 'llist_for_each_en

Re: [PATCH] net: core: fix msleep() is not accurate

2020-12-10 Thread Eric Dumazet
On Thu, Dec 10, 2020 at 10:35 AM Yejune Deng wrote: > > See Documentation/timers/timers-howto.rst, msleep() is not > for (1ms - 20ms), There is a more advanced API is used. > > Signed-off-by: Yejune Deng > --- > net/core/dev.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff

Re: [PATCH 1/1] net/ipv4/inet_fragment: Batch fqdir destroy works

2020-12-09 Thread Eric Dumazet
On 12/8/20 10:45 AM, SeongJae Park wrote: > From: SeongJae Park > > In 'fqdir_exit()', a work for destruction of the 'fqdir' is enqueued. > The work function, 'fqdir_work_fn()', calls 'rcu_barrier()'. In case of > intensive 'fqdir_exit()' (e.g., frequent 'unshare(CLONE_NEWNET)' >

Re: [PATCH net v1 1/2] lan743x: improve performance: fix rx_napi_poll/interrupt ping-pong

2020-12-08 Thread Eric Dumazet
On Wed, Dec 9, 2020 at 12:29 AM Jakub Kicinski wrote: > > On Tue, 8 Dec 2020 17:23:08 -0500 Sven Van Asbroeck wrote: > > On Tue, Dec 8, 2020 at 2:50 PM Jakub Kicinski wrote: > > > > > > > > > > > +done: > > > > /* update RX_TAIL */ > > > > lan743x_csr_write(adapter,

Re: WARNING in sk_stream_kill_queues (5)

2020-12-03 Thread Eric Dumazet
On 12/3/20 6:41 PM, Marco Elver wrote: > One more experiment -- simply adding > > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -207,7 +207,21 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t > gfp_mask, >*/ > size = SKB_DATA_ALIGN(size); > size +=

Re: WARNING in sk_stream_kill_queues (5)

2020-12-03 Thread Eric Dumazet
On Thu, Dec 3, 2020 at 5:34 PM Marco Elver wrote: > > On Thu, 3 Dec 2020 at 17:27, Eric Dumazet wrote: > > On Thu, Dec 3, 2020 at 4:58 PM Marco Elver wrote: > > > > > > On Mon, Nov 30, 2020 at 12:40AM -0800, syzbot wrote: > > > > Hello, > &

Re: WARNING in sk_stream_kill_queues (5)

2020-12-03 Thread Eric Dumazet
On Thu, Dec 3, 2020 at 4:58 PM Marco Elver wrote: > > On Mon, Nov 30, 2020 at 12:40AM -0800, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit:6147c83f Add linux-next specific files for 20201126 > > git tree: linux-next > > console output:

Re: [PATCH v1 bpf-next 03/11] tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues.

2020-12-03 Thread Eric Dumazet
On Thu, Dec 3, 2020 at 3:14 PM Kuniyuki Iwashima wrote: > > From: Eric Dumazet > Date: Tue, 1 Dec 2020 16:25:51 +0100 > > On 12/1/20 3:44 PM, Kuniyuki Iwashima wrote: > > > This patch lets reuseport_detach_sock() return a pointer of struct sock, > > >

Re: [PATCH v1 bpf-next 04/11] tcp: Migrate TFO requests causing RST during TCP_SYN_RECV.

2020-12-01 Thread Eric Dumazet
On 12/1/20 3:44 PM, Kuniyuki Iwashima wrote: > A TFO request socket is only freed after BOTH 3WHS has completed (or > aborted) and the child socket has been accepted (or its listener has been > closed). Hence, depending on the order, there can be two kinds of request > sockets in the accept

Re: [PATCH v1 bpf-next 03/11] tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues.

2020-12-01 Thread Eric Dumazet
On 12/1/20 3:44 PM, Kuniyuki Iwashima wrote: > This patch lets reuseport_detach_sock() return a pointer of struct sock, > which is used only by inet_unhash(). If it is not NULL, > inet_csk_reqsk_queue_migrate() migrates TCP_ESTABLISHED/TCP_SYN_RECV > sockets from the closing listener to the

Re: [PATCH v1 bpf-next 05/11] tcp: Migrate TCP_NEW_SYN_RECV requests.

2020-12-01 Thread Eric Dumazet
On 12/1/20 3:44 PM, Kuniyuki Iwashima wrote: > This patch renames reuseport_select_sock() to __reuseport_select_sock() and > adds two wrapper function of it to pass the migration type defined in the > previous commit. > > reuseport_select_sock : BPF_SK_REUSEPORT_MIGRATE_NO >

Re: [PATCH v8] tcp: fix race condition when creating child sockets from syncookies

2020-11-23 Thread Eric Dumazet
> socket exists, we drop the packet and discard the second child socket > to the same client. > > Signed-off-by: Ricardo Dias Ok, lets keep this version, thanks ! Signed-off-by: Eric Dumazet

Re: [PATCH v7] tcp: fix race condition when creating child sockets from syncookies

2020-11-19 Thread Eric Dumazet
On Thu, Nov 19, 2020 at 8:24 PM Ricardo Dias wrote: > > When the TCP stack is in SYN flood mode, the server child socket is > created from the SYN cookie received in a TCP packet with the ACK flag > set. > > The child socket is created when the server receives the first TCP > packet with a valid

Re: [RFC PATCH bpf-next 0/8] Socket migration for SO_REUSEPORT.

2020-11-18 Thread Eric Dumazet
On 11/17/20 10:40 AM, Kuniyuki Iwashima wrote: > The SO_REUSEPORT option allows sockets to listen on the same port and to > accept connections evenly. However, there is a defect in the current > implementation. When a SYN packet is received, the connection is tied to a > listening socket.

Re: [PATCH v6] tcp: fix race condition when creating child sockets from syncookies

2020-11-17 Thread Eric Dumazet
On 11/17/20 8:29 AM, Ricardo Dias wrote: > When the TCP stack is in SYN flood mode, the server child socket is > created from the SYN cookie received in a TCP packet with the ACK flag > set. > ... > > @@ -1374,6 +1381,13 @@ static struct sock *tcp_v6_syn_recv_sock(const struct > sock *sk,

Re: [PATCH v4] tcp: fix race condition when creating child sockets from syncookies

2020-11-16 Thread Eric Dumazet
On 11/13/20 8:09 PM, Ricardo Dias wrote: > When the TCP stack is in SYN flood mode, the server child socket is > created from the SYN cookie received in a TCP packet with the ACK flag > set. > > The child socket is created when the server receives the first TCP > packet with a valid SYN cookie

Re: [PATCH v9 3/3] mm/madvise: introduce process_madvise() syscall: an external memory hinting API

2020-11-16 Thread Eric Dumazet
On 9/21/20 7:55 PM, Minchan Kim wrote: > On Mon, Sep 21, 2020 at 07:56:33AM +0100, Christoph Hellwig wrote: >> On Mon, Aug 31, 2020 at 05:06:33PM -0700, Minchan Kim wrote: >>> There is usecase that System Management Software(SMS) want to give a >>> memory hint like MADV_[COLD|PAGEEOUT] to other

Re: [PATCH v3 1/1] page_frag: Recover from memory pressure

2020-11-16 Thread Eric Dumazet
d since v1: > - change author from Matthew to Dongli > - Add references to all prior discussions > - Add more details to commit message > Changed since v2: > - add unlikely (suggested by Eric Dumazet) > > mm/page_alloc.c | 5 + > 1 file changed, 5 insertions(+)

Re: [PATCH v3] tcp: fix race condition when creating child sockets from syncookies

2020-11-11 Thread Eric Dumazet
On Wed, Nov 11, 2020 at 8:35 AM Ricardo Dias wrote: > > When the TCP stack is in SYN flood mode, the server child socket is > created from the SYN cookie received in a TCP packet with the ACK flag > set. > > The child socket is created when the server receives the first TCP > packet with a valid

Re: [PATCH net v5] net: Update window_clamp if SOCK_RCVBUF is set

2020-11-09 Thread Eric Dumazet
"tcp: allow effective reduction of TCP's rcv-buffer via > setsockopt") > Signed-off-by: Mao Wenan Signed-off-by: Eric Dumazet Thanks !
